Wednesday, 2020-05-13

*** hamalq has quit IRC  [00:13]
*** livelace has quit IRC  [01:18]
*** awalende has joined #openstack-dns  [02:07]
*** awalende has quit IRC  [02:12]
*** awalende has joined #openstack-dns  [06:00]
*** awalende has quit IRC  [06:04]
*** njohnston has quit IRC  [06:05]
*** bersace has joined #openstack-dns  [06:15]
*** kaveh has quit IRC  [07:13]
*** kaveh has joined #openstack-dns  [07:19]
*** awalende has joined #openstack-dns  [07:39]
*** livelace has joined #openstack-dns  [08:12]
*** salmankhan has joined #openstack-dns  [08:30]
*** livelace has quit IRC  [08:50]
*** salmankhan1 has joined #openstack-dns  [08:57]
*** livelace has joined #openstack-dns  [08:57]
*** salmankhan has quit IRC  [08:59]
*** salmankhan1 is now known as salmankhan  [08:59]
*** livelace has quit IRC  [09:02]
*** livelace has joined #openstack-dns  [09:08]
*** livelace has quit IRC  [09:13]
*** livelace has joined #openstack-dns  [09:34]
<mindthecap> mugsie: dig works as expected and returns two NS records. designate-mdns shows that it's doing AXFR (version: 1). Serial for the zone is 88, which i can see from the master NS server logs (transfer of xxxx: AXFR started (serial 88... ended).  [10:51]
<mindthecap> After the zone is created in Designate, one of the designate containers (i have three controllers, so three designate containers) creates the zone in the designate backend. The backend then tries to transfer the zone from the designate container but gives an error "zone xxx has no NS records"  [10:55]
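For reference, the usual way to compare what the backend nameserver and designate-mdns are each serving for a zone looks roughly like this (the zone name and addresses are placeholders; designate-mdns answers AXFR on port 5354 by default, adjust if your deployment changed it):

    # Placeholders throughout -- substitute your own zone and addresses.
    # What the authoritative backend nameserver currently serves:
    dig @<backend-ns-ip> example.com. NS +short
    dig @<backend-ns-ip> example.com. SOA +short
    # What designate-mdns hands out when the backend requests the transfer
    # (mdns listens on port 5354 by default):
    dig @<designate-mdns-ip> -p 5354 example.com. AXFR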
*** njohnston has joined #openstack-dns  [11:01]
*** salmankhan1 has joined #openstack-dns  [11:28]
*** salmankhan has quit IRC  [11:28]
*** salmankhan1 is now known as salmankhan  [11:28]
*** livelace has quit IRC  [12:23]
*** livelace has joined #openstack-dns  [12:29]
*** ianychoi has quit IRC  [12:55]
<openstackgerrit> Merged openstack/designate master: Fix slow zone imports.  https://review.opendev.org/721793  [13:48]
*** livelace has quit IRC  [14:45]
*** hamalq has joined #openstack-dns  [16:51]
<hamalq> can i get +1 on https://review.opendev.org/#/c/726214/ (all code review changes done)  [16:58]
*** salmankhan has quit IRC  [17:22]
*** hamalq has quit IRC  [18:33]
*** hamalq has joined #openstack-dns  [18:34]
*** livelace has joined #openstack-dns  [19:09]
*** roukoswarf has joined #openstack-dns  [19:15]
<roukoswarf> anyone know how the notify works in designate? im trying to fix a bug with updating pool ns that gets everything stuck pending update  [19:16]
<openstackgerrit> Andreas Jaeger proposed openstack/designate-tempest-plugin master: Update hacking for Python3  https://review.opendev.org/715689  [19:18]
<johnsom> Is the notification handler "nova_fixed" deprecated? Was that just for nova networking or does it have some purpose in the neutron networking world?  [19:52]
<mugsie> johnsom: no, it is still supported, kind of  [19:53]
<mugsie> it only uses unversioned notifications though  [19:53]
<johnsom> Is it just for nova networking deployments?  [19:53]
<mugsie> no, it just reacts when the VM is created  [19:53]
<mugsie> so it could have the VM name  [19:54]
<mugsie> where neutron may not  [19:54]
<johnsom> Ok, thanks. Yeah, I can see those timing issues.  [19:54]
<mugsie> roukoswarf: https://opendev.org/openstack/designate/src/branch/master/designate/mdns/notify.py  [19:54]
<roukoswarf> so, im having issues with update_pool  [19:55]
<roukoswarf> i fixed it breaking on tenants having their own NS records, that was easy. but it still sticks things in update/pending status  [19:56]
<roukoswarf> i created a _change_ns which swaps the ns and then updates in 1 step without set_delayed_notify, but it still gets stuck in pending  [19:57]
<roukoswarf> if i do a set on the zones, they get out of pending immediately.  [19:57]
<mugsie> ah, that code is weird  [19:58]
<mugsie> one sec  [19:58]
<roukoswarf> s/weird/broken, trying to make it work  [19:58]
<mugsie> if you set zone.delayed_notify = False then do central.update_zone(context, zone) it should work  [19:59]
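A minimal sketch of what mugsie is suggesting, not the actual patch; `central` is assumed to be either the central service itself (i.e. `self` when working inside update_pool) or a designate.central.rpcapi.CentralAPI client, and `zone` has already been loaded for this context:

    # Clear the delayed-notify flag and push the zone back through central so
    # a DNS NOTIFY gets scheduled for it.
    zone.delayed_notify = False
    central.update_zone(context, zone)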
<roukoswarf> updating a recordset with delayed_notify=False doesnt trigger that?  [20:00]
<mugsie> eh, I can't remember the flow off the top of my head - but I *think* that blocks it being sent  [20:01]
<roukoswarf> im setting the NS recordsets, then the parent function calls @notification('dns.pool.update') which doesnt trigger any sync  [20:02]
<mugsie> ah - that is for OpenStack billing notifications  [20:02]
<mugsie> not DNS notifications  [20:02]
<mugsie> (naming sucks, sorry)  [20:02]
<roukoswarf> alright, well at least now i know that was meaningless for my uses, so... i update the NS, then call update_zone to get it to move? or is there a way updating a recordset should be able to notify?  [20:03]
<mugsie> updating a recordset should trigger a notify  [20:04]
<roukoswarf> it does not.  [20:04]
<roukoswarf> at least, not in this context  [20:04]
<mugsie> what function are you calling to update it?  [20:04]
<mugsie> central.update_recordset() ?  [20:04]
<mugsie> https://opendev.org/openstack/designate/src/branch/master/designate/central/service.py#L1411 - that will trigger a DNS Notify  [20:05]
<roukoswarf> i was just following the example of _add_ns, which called self._update_recordset_in_storage(context, zone, ns_recordset) with set_delayed_notify=True  [20:05]
<roukoswarf> i removed the set_delayed_notify, and expected it to work  [20:05]
<mugsie> ah, OK.  [20:06]
<roukoswarf> this code never seems to call update_recordset directly, should i be?  [20:06]
<roukoswarf> im working under update_pool  [20:06]
<mugsie> Oh! - so you are implementing updating the NS records for all zones in a pool after the zones are created?  [20:07]
<mugsie> OK.  [20:07]
<roukoswarf> after a pools.yaml update  [20:07]
<mugsie> yeah  [20:07]
<roukoswarf> which, is currently busted  [20:07]
<roukoswarf> at least in stein, not sure if the code changed in master, doesnt look like it.  [20:08]
<mugsie> Ok. in that case call the update_recordset() function  [20:08]
<roukoswarf> instead or in addition to the update_recordset_in_storage?  [20:08]
<mugsie> the _add_ns() is part of a bigger set of code that will cause a NOTIFY elsewhere  [20:08]
<mugsie> instead of  [20:08]
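In other words, roughly this change inside the pool NS handling (a sketch of the suggestion, not the exact diff; the _update_recordset_in_storage call is the one roukoswarf quotes above):

    # Before: write to storage only and defer the NOTIFY via delayed_notify.
    self._update_recordset_in_storage(context, zone, ns_recordset,
                                      set_delayed_notify=True)

    # After: go through central's public method, which also takes care of
    # bumping the zone serial and triggering the DNS NOTIFY path.
    self.update_recordset(context, ns_recordset)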
<roukoswarf> well, 80% of my test zones are now out of pending  [20:11]
<mugsie> the delayed notify chunks up the work  [20:11]
<mugsie> so it may take a little bit  [20:12]
<roukoswarf> but im not delaying notify, i removed that from my new _change_ns function which bulk updates all the pool ns changes.  [20:12]
<roukoswarf> i do all the recordset changes from the pool change in one pass, then update_recordset it.  [20:12]
<mugsie> then it could just be RMQ backlog, waiting for the workers to chew through the notify tasks  [20:14]
<roukoswarf> http://paste.openstack.org/show/793538/  [20:15]
<roukoswarf> my code, im calling instead of _add_ns and _delete_ns in update_pool  [20:15]
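The paste itself isn't reproduced in this log. Purely as an illustration of the approach roukoswarf describes (swap the pool NS records for each zone in one pass, then push the recordset through update_recordset), it would look something like the sketch below; the method name, lookup criteria and object handling here are guesses, not the pasted code:

    # Illustration only -- not roukoswarf's paste.  Assumes it lives in
    # designate.central.service.Service and that Record is imported from
    # designate.objects.
    def _change_ns(self, context, zone, orig_ns_records, new_ns_records):
        # The NS recordset at the zone apex holds the pool's NS records.
        ns_recordset = self.find_recordset(
            context, criterion={'zone_id': zone.id, 'name': zone.name,
                                'type': 'NS'})
        old = {ns.hostname for ns in orig_ns_records}
        # Drop the old pool NS records (leaving records that aren't the
        # pool's alone) and append the new ones in a single edit.
        ns_recordset.records = [r for r in ns_recordset.records
                                if r.data not in old]
        for ns_record in new_ns_records:
            ns_recordset.records.append(Record(data=ns_record.hostname))
        # update_recordset (not *_in_storage) so the change is propagated
        # and a DNS NOTIFY is sent.
        return self.update_recordset(context, ns_recordset)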
<roukoswarf> my rmq is empty  [20:16]
<roukoswarf> calling openstack zone set <id> instantly unsticks the zone  [20:17]
<mugsie> yeah, that can happen if there is a long term issue.  [20:18]
<roukoswarf> guess ill run a set on every stuck zone and start from a clean active state  [20:18]
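A throwaway loop along those lines, assuming (as reported above) that a bare `openstack zone set <id>` is enough to kick a zone; --all-projects needs an admin-ish role and the exact status values may differ by release:

    # Re-run `zone set` on every zone still stuck in PENDING.
    openstack zone list --all-projects -f value -c id -c status \
      | awk '$2 == "PENDING" {print $1}' \
      | xargs -r -n1 openstack zone set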
<mugsie> there is a periodic task that can be run on a 24-hourly basis to basically send a notify for all zones  [20:18]
<mugsie> but I need to dig that out  [20:19]
<roukoswarf> everything else has worked fine, we just hit a crash in designate on doing a pool update, so im trying to make the process smooth  [20:19]
<roukoswarf> ill have to PR the bugfix, but this sync problem seems to be real too  [20:19]
<roukoswarf> the crash was an easy fix, looked like an oversight.  [20:20]
<mugsie> thats great - more than happy to review + help merge  [20:20]
<roukoswarf> if you make a zone, then a tenant adds NS records to it, then you do a pool update that changes the NS records, it will crash.  [20:20]
<mugsie> crash? yeah - that is an oversight  [20:20]
<roukoswarf> _delete_ns doesnt search for managed, and does a get_recordset, singular, when in this case, there will be multiple  [20:21]
<roukoswarf> add selects it right, so, must have just been missed  [20:21]
<mugsie> yeap.  [20:21]
<mugsie> we don't do functional testing of the pool modifications, so that might be a good addition to make sure this stays fixed :)  [20:22]
<roukoswarf> i thought i saw tests, but they dont test this case, it works perfectly if users never add NS recordsets of their own  [20:22]
<roukoswarf> do you know, by reading the code, why they did it with _add_ns and _delete_ns in 2 steps, and then not notifying?  [20:23]
<roukoswarf> i combined the work into a single pass, but... it must have been that way for a reason  [20:24]
* mugsie goes to git blame to check it wasn't him  [20:24]
<roukoswarf> doing it in 2 steps means you would spam the queue if you did notify, which is probably why this whole notify fiasco is happening to me  [20:25]
<roukoswarf> alright... got all my zones into active/none, lets try with update_recordset again from a clean slate.  [20:26]
<mugsie> no, notify was for a much bigger issue  [20:26]
<roukoswarf> well, why call add and delete separately with delay_notify on both?  [20:27]
<mugsie> in super active installs (HP Cloud / Rackspace etc), the number of changes was causing issues, and it was easier to batch up a zone's changes into a single notify  [20:27]
<roukoswarf> just gets stuck 100% of the time  [20:27]
<roukoswarf> yeah, which is what i did by making each zone a single write to the db.  [20:28]
<roukoswarf> if you do a pool update, do you get out of pending state?  [20:28]
<mugsie> I honestly havent tried that in a long time  [20:29]
<mugsie> changing the NS records  [20:29]
<roukoswarf> i have 4 clusters in multiple versions, and every cluster gets all zones stuck on ns change.  [20:29]
<mugsie> yeah, I am looking at commits from 5 years ago that could have caused this  [20:30]
<mugsie> most people's NS records are fairly stable I guess  [20:30]
<roukoswarf> updated_pool = self.storage.update_pool(context, pool) - should this line be after changing the ns? would that magically fix the notify?  [20:30]
<roukoswarf> currently its before the ns changes  [20:30]
<mugsie> no - that is the DB call  [20:31]
<roukoswarf> well, theres no other call than db calls in the code, update_recordset, update_zone, etc, never called, only the db calls are there.  [20:32]
<mugsie> I think this is an artifact of the old DB schema, where records were a thing on their own  [20:32]
<roukoswarf> so all these _in_storage queries should be replaced?  [20:33]
<mugsie> the set_delayed_notify=True on add_ns should trigger the delayed notify task runner  [20:33]
<mugsie> OH  [20:33]
<roukoswarf> now yer seeing it.  [20:36]
<mugsie> no - the worker may not be reading the delay notify  [20:36]
<roukoswarf> well hey if its an easier fix than i thought thatd be great, ive spent the last 2 days reading through the code to try and pick up the pieces on how designate was written, my head hurts.  [20:38]
<mugsie> yeah. it grew organically  [20:38]
<mugsie> in designate.conf - in the [service:producer] section - do you have an "enabled_tasks" item?  [20:40]
<mugsie> try adding one with "delayed_notify"  [20:41]
<mugsie> as the value  [20:41]
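That is, something like this in designate.conf on the producer nodes (the setting being discussed here; roukoswarf confirms later in the log that it resolved the stuck zones). Note that an explicit list typically limits the producer to the tasks named, so include any other tasks you rely on:

    [service:producer]
    # Run the task that batches up zones flagged with delayed_notify and has
    # the workers send the NOTIFYs.
    enabled_tasks = delayed_notify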
<mugsie> I have to drop off unfortunately, but leave info here, or email the mailing list, and I can look in the morning  [20:42]
<mugsie> roukoswarf: ^  [20:42]
<roukoswarf> heh, would it be the worker?  [20:43]
<roukoswarf> oh, producer, got it  [20:43]
<mugsie> no - so what is supposed to happen is the producer is supposed to look at the db for zones with a "delayed notify" - bundle them up and tell the workers to send the notify on a regular basis  [20:44]
<roukoswarf> enabled_tasks = None, nice. this is a kolla deployment, so maybe thats something i need to point at them for instead in this case.  [20:44]
<mugsie> and shard that out across a few producers and worker processes  [20:44]
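As a purely illustrative sketch of that producer flow (made-up names, not the actual Designate task code):

    # Illustration of the flow described above; not the real implementation.
    def run_delayed_notify_batch(storage, worker_api, context, batch_size=100):
        # Zones that central flagged for a delayed NOTIFY.
        zones = storage.find_zones(
            context, criterion={'delayed_notify': True}, limit=batch_size)
        for zone in zones:
            # A worker sends the DNS NOTIFY and polls the nameservers until
            # the change shows up.
            worker_api.update_zone(context, zone)
            # Clear the flag so the next periodic run skips this zone.
            zone.delayed_notify = False
            storage.update_zone(context, zone)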
<mugsie> I just checked - the docs and sample config are wrong  [20:45]
<mugsie> not kolla's fault :)  [20:45]
<roukoswarf> None is a very pythony value for a config file, ill check if kolla has it as a var somewhere undocumented.  [20:45]
<mugsie> i suspect it came from the config file generation - https://docs.openstack.org/designate/latest/admin/samples/config.html  [20:46]
<mugsie> but yes - that should fix it  [20:46]
<mugsie> o/  [20:46]
<roukoswarf> thanks a bunch, ill go find some contrib guide somewhere for the bugfix, as thats still valid, but at least i know im not crazy on the notify issues.  [20:46]
<mugsie> https://bugs.launchpad.net/designate  [20:55]
<roukoswarf> well, i have the fixed code, figured i could just open a PR  [21:00]
<roukoswarf> or is that not a thing  [21:00]
*** livelace has quit IRC  [21:11]
*** livelace has joined #openstack-dns  [21:17]
*** livelace has quit IRC  [21:18]
<roukoswarf> mugsie: so yes, after setting the task things work perfectly as intended, with the original code. thank you very much. not sure the best way to get kolla setting it, but ill talk to them.  [21:30]
*** awalende has quit IRC  [21:38]
*** awalende has joined #openstack-dns  [21:39]
*** awalende has quit IRC  [21:43]
*** roukoswarf has quit IRC  [22:50]

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!