Thursday, 2023-04-06

opendevreview Don Kehn proposed openstack/designate-tempest-plugin master: Adds test for the multipool bind9 configuration.  https://review.opendev.org/c/openstack/designate-tempest-plugin/+/871184 01:14
opendevreview Vadym Markov proposed openstack/designate-dashboard master: Fix "Masters IP Address" for Zone update form  https://review.opendev.org/c/openstack/designate-dashboard/+/879737 12:19
ozzzo_work johnsom: I discussed the patch with my team; it doesn't look like that is going to fly. Whatever we do to fix it in the lab needs to be suitable for dev/qa/prod also 13:11
ozzzo_work I tried introducing an extra delay before deletion, and that fixes the problem, so it must be some kind of race condition 13:11
eandersson ozzzo_work: The most common race condition is that the delete notification happens before the create notification has been processed properly. You cannot delete a record that does not properly exist yet. 14:39
eandersson I worked around this myself by only allowing one A record per VM, and during the create portion just deleting any previous A records that have the same name. 14:43
eandersson It wouldn't fix dangling A records, but it would prevent new VMs with the same name from having old data. 14:43
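The workaround eandersson describes (one A record per name, with stale records dropped at create time) can be sketched as below. This is a minimal illustration over plain dictionaries, not Designate code: the function name `replace_a_records` and the record layout (`name`, `type`, `data` keys) are assumptions made for the example.

```python
def replace_a_records(records, name, new_ip):
    """Return (updated_records, stale) for a create event.

    Hypothetical helper: any existing A record with the same name is
    treated as stale and dropped before the new record is added.
    """
    def is_stale(record):
        return record["type"] == "A" and record["name"] == name

    stale = [r for r in records if is_stale(r)]
    kept = [r for r in records if not is_stale(r)]
    kept.append({"name": name, "type": "A", "data": new_ip})
    return kept, stale


existing = [
    {"name": "vm1.example.org.", "type": "A", "data": "10.0.0.5"},
    {"name": "vm2.example.org.", "type": "A", "data": "10.0.0.6"},
]
updated, stale = replace_a_records(existing, "vm1.example.org.", "10.0.0.9")
```

As noted in the discussion, this does not clean up dangling records for names that are never reused; it only keeps a new VM from inheriting old data.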
eandersson A potentially hacky workaround would be to try to find the records a few times with a delay in between each attempt but that isn’t great 14:45
ozzzo_work that's what I'm trying now 14:55
ozzzo_work eandersson: That didn't fix it: https://paste.openstack.org/show/bnj6zSHrVJIJ1nhUrqVl/ 18:35
ozzzo_work it seems like 10 seconds should be plenty of time for a race condition to resolve itself. It looks like the failure is occurring here: https://github.com/openstack/designate/blob/train-eol/designate/objects/base.py#L407 18:38
ozzzo_work When I generated a python error during this condition I got: https://paste.openstack.org/show/bgigVrtMPEK6zmQw6btx/ 18:41
ozzzo_work I don't understand how this is possible. I'm iterating through the list, and seeing the item there, and then I get "x not in list" when I try to remove it 18:42
ozzzo_work if it was a race condition, shouldn't I see the item missing when I iterate through the list? 18:42
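One way this symptom can appear in plain Python, offered as an illustration only and not as a diagnosis of the Designate code: `list.remove` matches items with `==`, so an object that prints the same as a list entry but compares unequal (for example, a separately fetched copy of a class with no `__eq__`) raises `ValueError: list.remove(x): x not in list`. The `Record` class and both helpers below are hypothetical.

```python
class Record:
    """Hypothetical stand-in for a record; no __eq__, so == means identity."""

    def __init__(self, record_id, data):
        self.id = record_id
        self.data = data

    def __repr__(self):
        return f"Record(id={self.id!r}, data={self.data!r})"


def try_remove(records, item):
    """Attempt list.remove and report whether it succeeded."""
    try:
        records.remove(item)
        return True
    except ValueError:
        # Raised when no element compares equal to `item`.
        return False


def remove_by_id(records, record_id):
    """Remove by a stable key instead of object equality; return count removed."""
    kept = [r for r in records if r.id != record_id]
    removed = len(records) - len(kept)
    records[:] = kept
    return removed


records = [Record("abc", "10.0.0.5")]
copy = Record("abc", "10.0.0.5")      # looks identical when printed
same_repr = repr(copy) == repr(records[0])
removed_by_eq = try_remove(records, copy)
removed_by_key = remove_by_id(records, "abc")
```

Here the copy is visible when iterating and printing, yet `remove` fails because the two objects are not equal; filtering on the id sidesteps the comparison.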
eandersson Yea - that is odd 18:51
eandersson I don't think this will make a difference but can you try this patch just in case 18:51
eandersson https://review.opendev.org/c/openstack/designate/+/879305 18:51
eandersson btw what version were you running again ozzzo_work? 18:55
eandersson oh train right? 18:59
eandersson I wrote a lot of tests last night. I can backport them to train and see if it behaves any differently. 18:59
eandersson btw the retry would need to re-fetch the recordset each time 19:01
eandersson Honestly you would want the retry to happen here 19:02
eandersson https://github.com/openstack/designate/blob/master/designate/notification_handler/base.py#L252 19:02
eandersson start with requesting find_records again as a first step in the retry process 19:02
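The retry shape suggested above, where every attempt starts by fetching the records again, might look like the sketch below. It is written under stated assumptions: `find_records` and `delete_record` are caller-supplied callables standing in for whatever the handler actually uses, and their signatures here are not Designate's real API.

```python
import time


def delete_with_retry(find_records, delete_record, criterion,
                      attempts=3, delay=1.0, sleep=time.sleep):
    """Delete matching records, re-fetching them on every attempt.

    Returns the number of records deleted, or 0 if none were found
    after all attempts. All parameter names are hypothetical.
    """
    for attempt in range(attempts):
        # Re-fetch each time; reusing a stale result defeats the retry.
        records = find_records(criterion)
        if records:
            for record in records:
                delete_record(record)
            return len(records)
        if attempt < attempts - 1:
            sleep(delay)
    return 0


# Usage with fakes: the record only becomes visible on the third lookup.
lookups = iter([[], [], ["record-1"]])
deleted = []
sleeps = []
count = delete_with_retry(
    find_records=lambda criterion: next(lookups),
    delete_record=deleted.append,
    criterion={"managed_resource_id": "vm-1"},
    delay=0.5,
    sleep=sleeps.append,
)
```

Injecting `sleep` keeps the sketch testable without real waiting; in a handler the default `time.sleep` would apply.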
ozzzo_work We're running train but we have johnsom's "Fix race condition in the sink when deleting records" patch 20:24
ozzzo_work the train-eol code was last updated in 2020. The version we're running looks like this: https://github.com/openstack/designate/blob/60edc59ff765b406e4b936deb4d200a2d9b411ce/designate/notification_handler/base.py 20:25
johnsom FYI, I just backported that patch 20:26
ozzzo_work I wasn't re-calling find_records so that explains why it didn't work. I'll try your patch without the retry, and if that doesn't fix it I'll do a proper retry 20:30
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline for extended periods between 22:00 and 00:00 UTC for software upgrades and project renames: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/VW2O56AXI4OX34CWDNRNZDCWJDZR3QJP/ 21:04
eandersson btw the more logs you can provide us with the better. Ideally from worker / central / sink 21:37
eandersson But I also understand if that is difficult. 21:47
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline for extended periods over the next two hours for software upgrades and project renames: https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/VW2O56AXI4OX34CWDNRNZDCWJDZR3QJP/ 21:58
opendevreview Merged openstack/designate master: Move to a batch model for incrementing serial  https://review.opendev.org/c/openstack/designate/+/871255 22:51
johnsom Wahooo Nice work eandersson 22:51
eandersson Exciting 22:53
