Tuesday, 2021-02-23

*** ricolin_ has joined #openstack-infra00:20
*** ricolin_ has quit IRC00:22
*** yamamoto has joined #openstack-infra00:23
*** tosky has quit IRC00:38
*** yamamoto has quit IRC00:46
*** yamamoto has joined #openstack-infra00:50
*** yamamoto has quit IRC01:06
*** yamamoto has joined #openstack-infra01:12
*** ysandeep|away is now known as ysandeep|ruck01:14
*** yamamoto has quit IRC01:19
*** yamamoto has joined #openstack-infra01:20
*** yamamoto has quit IRC01:20
*** yamamoto has joined #openstack-infra01:20
*** __ministry has joined #openstack-infra01:21
*** yamamoto has quit IRC01:22
*** hamalq has quit IRC01:26
*** rlandy has quit IRC01:28
*** yamamoto has joined #openstack-infra01:29
*** yamamoto has quit IRC01:29
*** yamamoto has joined #openstack-infra01:29
*** yamamoto has quit IRC01:32
*** yamamoto has joined #openstack-infra01:32
*** yamamoto has quit IRC01:32
*** yamamoto has joined #openstack-infra01:34
*** yamamoto has quit IRC01:34
*** yamamoto has joined #openstack-infra01:35
*** yamamoto has quit IRC01:35
*** yamamoto has joined #openstack-infra01:35
*** yamamoto has quit IRC01:36
*** yamamoto has joined #openstack-infra01:37
*** yamamoto has quit IRC01:38
*** yamamoto has joined #openstack-infra01:39
*** yamamoto has quit IRC01:39
*** yamamoto has joined #openstack-infra01:39
*** yamamoto has quit IRC01:39
*** yamamoto has joined #openstack-infra01:52
*** benj_ has quit IRC01:53
*** benj_ has joined #openstack-infra01:54
*** stevebaker has quit IRC02:16
*** ysandeep|ruck has quit IRC02:18
*** ysandeep has joined #openstack-infra02:19
*** ysandeep is now known as ysandeep|away02:48
*** ricolin has quit IRC03:15
*** ricolin has joined #openstack-infra03:30
*** psachin has joined #openstack-infra03:36
*** stevebaker has joined #openstack-infra03:49
*** zxiiro has quit IRC03:55
*** yamamoto has quit IRC03:55
*** yamamoto has joined #openstack-infra03:55
*** yamamoto has quit IRC03:56
*** yamamoto has joined #openstack-infra04:03
*** yamamoto has quit IRC04:03
*** yamamoto has joined #openstack-infra04:04
*** yamamoto has quit IRC04:06
*** yamamoto has joined #openstack-infra04:07
*** yamamoto has quit IRC04:11
*** yamamoto has joined #openstack-infra04:21
*** yamamoto has quit IRC04:26
*** ysandeep|away is now known as ysandeep|ruck04:27
*** yamamoto has joined #openstack-infra04:29
*** yamamoto has quit IRC04:33
*** ykarel has joined #openstack-infra04:36
*** __ministry has quit IRC04:45
*** __ministry has joined #openstack-infra04:47
*** yamamoto has joined #openstack-infra04:49
*** yamamoto has quit IRC04:53
*** yamamoto has joined #openstack-infra04:54
*** yamamoto has quit IRC04:59
*** rcernin has quit IRC05:15
*** rcernin has joined #openstack-infra05:22
*** gyee has quit IRC06:12
*** yamamoto has joined #openstack-infra06:12
*** yamamoto has quit IRC06:17
*** vishalmanchanda has joined #openstack-infra06:24
*** xek has joined #openstack-infra06:29
*** yamamoto has joined #openstack-infra06:38
*** yamamoto has quit IRC06:43
*** zzzeek has quit IRC06:45
*** zzzeek has joined #openstack-infra06:46
*** slaweq_ has joined #openstack-infra06:50
*** hashar has joined #openstack-infra07:12
*** rcernin has quit IRC07:15
*** ralonsoh has joined #openstack-infra07:21
*** nightmare_unreal has joined #openstack-infra07:27
*** smcginnis has quit IRC07:30
*** smcginnis has joined #openstack-infra07:30
*** piotrowskim has joined #openstack-infra07:35
*** ysandeep|ruck is now known as ysandeep|lunch07:35
*** zzzeek has quit IRC07:39
*** jcapitao has joined #openstack-infra07:40
*** zzzeek has joined #openstack-infra07:40
*** rcernin has joined #openstack-infra07:49
*** gfidente|afk is now known as gfidente07:52
*** eolivare has joined #openstack-infra07:54
*** dklyle has quit IRC07:57
*** rcernin has quit IRC08:06
*** andrewbonney has joined #openstack-infra08:06
*** rcernin has joined #openstack-infra08:09
*** dchen has quit IRC08:11
*** yamamoto has joined #openstack-infra08:15
*** yamamoto has quit IRC08:18
*** rcernin has quit IRC08:23
*** rpittau|afk is now known as rpittau08:28
*** ykarel_ has joined #openstack-infra08:31
*** ykarel has quit IRC08:33
*** ysandeep|lunch is now known as ysandeep|ruck08:39
*** tosky has joined #openstack-infra08:50
*** jpena|off is now known as jpena08:58
*** ociuhandu has joined #openstack-infra08:59
*** yamamoto has joined #openstack-infra09:02
*** yamamoto has quit IRC09:03
*** ociuhandu has quit IRC09:04
*** lucasagomes has joined #openstack-infra09:08
*** psachin has quit IRC09:11
*** ociuhandu has joined #openstack-infra09:12
*** ociuhandu has quit IRC09:24
*** ociuhandu has joined #openstack-infra09:24
*** noonedeadpunk has quit IRC09:26
*** bauzas has quit IRC09:26
*** bauzas has joined #openstack-infra09:27
*** ociuhandu has quit IRC09:28
*** ociuhandu has joined #openstack-infra09:28
*** noonedeadpunk has joined #openstack-infra09:29
*** lpetrut has joined #openstack-infra09:35
*** slaweq_ is now known as slaweq09:40
*** yamamoto has joined #openstack-infra09:42
*** derekh has joined #openstack-infra09:51
*** yamamoto has quit IRC09:59
*** ykarel_ is now known as ykarel10:20
*** yamamoto has joined #openstack-infra10:34
*** ajitha has joined #openstack-infra10:35
*** ociuhandu has quit IRC10:45
*** ociuhandu has joined #openstack-infra10:46
*** ociuhandu has quit IRC10:47
*** ociuhandu has joined #openstack-infra10:47
*** rpittau is now known as rpittau|bbl11:00
*** iurygregory_ has joined #openstack-infra11:00
*** iurygregory has quit IRC11:01
*** dtantsur|afk is now known as dtantsur11:03
*** iurygregory_ is now known as iurygregory11:06
*** jcapitao is now known as jcapitao_lunch11:13
*** smcginnis has quit IRC11:19
*** __ministry has quit IRC11:24
*** smcginnis has joined #openstack-infra11:26
*** nightmare_unreal has quit IRC11:37
openstackgerritMartin Chacon Piza proposed openstack/project-config master: Deprecate monasca-log-api  https://review.opendev.org/c/openstack/project-config/+/777093 11:39
*** ociuhandu has quit IRC11:40
openstackgerritMartin Chacon Piza proposed openstack/project-config master: Deprecate monasca-ceilometer  https://review.opendev.org/c/openstack/project-config/+/777095 11:43
*** ociuhandu has joined #openstack-infra11:56
*** hashar is now known as hasharLunch11:57
*** smcginnis has quit IRC12:01
*** ociuhandu has quit IRC12:01
*** smcginnis has joined #openstack-infra12:07
*** ociuhandu has joined #openstack-infra12:11
*** ociuhandu has quit IRC12:16
*** nightmare_unreal has joined #openstack-infra12:26
*** ociuhandu has joined #openstack-infra12:29
*** rlandy has joined #openstack-infra12:29
*** jpena is now known as jpena|lunch12:30
*** ociuhandu has quit IRC12:42
*** jcapitao_lunch is now known as jcapitao12:50
*** amoralej is now known as amoralej|lunch13:00
*** rpittau|bbl is now known as rpittau13:01
*** ganso has joined #openstack-infra13:32
*** jpena|lunch is now known as jpena13:33
*** nweinber has joined #openstack-infra13:33
*** ociuhandu has joined #openstack-infra13:35
*** ociuhandu has quit IRC13:39
*** ociuhandu has joined #openstack-infra13:40
*** ociuhandu has quit IRC13:40
*** ociuhandu has joined #openstack-infra13:42
*** ociuhandu has quit IRC13:49
*** yamamoto has quit IRC13:52
*** ociuhandu has joined #openstack-infra13:53
*** thiagop has joined #openstack-infra13:54
*** thiagop is now known as outbrito13:55
*** rlandy is now known as rlandy|training13:57
*** ociuhandu has quit IRC13:57
*** ociuhandu has joined #openstack-infra13:57
*** jhouser has joined #openstack-infra14:01
*** amoralej|lunch is now known as amoralej14:05
*** vishalmanchanda has quit IRC14:33
*** yamamoto has joined #openstack-infra14:42
*** ysandeep|ruck is now known as ysandeep|dinner14:51
*** yamamoto has quit IRC14:55
*** zxiiro has joined #openstack-infra14:59
*** ykarel has quit IRC14:59
*** jhouser has quit IRC15:03
*** jhouser has joined #openstack-infra15:03
*** dklyle has joined #openstack-infra15:16
*** ociuhandu has quit IRC15:20
*** ociuhandu has joined #openstack-infra15:21
*** ysandeep|dinner is now known as ysandeep|ruck15:25
*** ociuhandu has quit IRC15:25
*** ociuhandu has joined #openstack-infra15:29
eolivarehi there! looks like the gate queue is blocked due to a hung job from a patch: https://zuul.opendev.org/t/openstack/status#775380 15:53
eolivarecan anyone take a look at this please?15:53
ralonsohfungi, ^^15:53
ralonsohdo you know who can take a look at this?15:53
fungiralonsoh: why do you say it's blocked? it's currently in the integrated gate queue15:54
ralonsohfungi, because it's been there for 16 hours15:55
fungithere are roughly a dozen other changes ahead of it which need to merge or be ejected15:55
eolivareI think the problem is patch 772986,115:55
eolivarefungi, for example, nova patch '776694,1' started 1:30 ago and all the jobs are running, while nova patch '772986,1' started 17 hours ago and most of its jobs are queued15:58
fungithat's not "started" 17 hours ago, it entered the queue 17 hours ago16:00
fungithough yes the completed py27 job for that change did start at 2021-02-22 22:58 so ~16 hours ago16:01
eolivareok, but how are other nova patches that entered the queue 1:30 ago running all their jobs?16:01
fungii agree something is not right about that change16:01
eolivareack16:01
fungii'm late dialling into a meeting but will continue trying to sort this out in parallel16:01
clarkbI think its node requests must've ended up behind the node requests for all those other changes somehow16:01
clarkb(perhaps a zuul bug?)16:01
clarkbif that is the case then I would suspect the top of the queue to get assignments now that the bottom of the queue has started jobs16:02
eolivarefungi, maybe rechecking that patch would help? stephenfin, it's your patch, what do you think? https://review.opendev.org/c/openstack/nova/+/772986/ 16:04
*** sshnaidm is now known as sshnaidm|afk16:04
clarkbinspecting the scheduler logs should help though16:04
clarkbeolivare: at best rechecking would send the change into the check queue too16:04
fungiright, recheck will be ignored, it either needs to finish getting its node requests filled or be dequeued16:05
stephenfineolivare: Please feel free to do whatever's necessary. I haven't actually been paying attention to that since I pushed it, heh /o\16:05
fungii'll hopefully know more once i dig through some service logs16:06
fungiif we dequeue that, everything behind it which has already completed will be automatically rerun, so i don't want to do that if there's an easier fix16:06
eolivarefungi, clarkb, stephenfin, ack, i won't recheck :) thank you16:06
fungiso that's buildset 9b9f72fbc2ba4ba8ac6e08ab196c5c0516:10
fungiaccording to the inventory from one of the completed builds16:10
fungievent id is 040ba7438ff443d58a36863fe3832a3216:12
fungier, no, 7be06742213544c0b70da0c2cad24b1e16:12
*** amoralej is now known as amoralej|off16:13
clarkbit might be good to pick a job (say the pep8 job) and try to trace why it specifically isn't running yet16:13
clarkbbuildset -> job -> build -> noderequest etc16:13
fungiyeah, exactly where i was headed. this was the last mention of that queued pep8 build:16:14
fungi2021-02-23 13:58:11,582 DEBUG zuul.layout: [e: 7be06742213544c0b70da0c2cad24b1e] Pipeline variant <Job openstack-tox-pep8 branches: None source: openstack/openstack-zuul-jobs/zuul.d/project-templates.yaml@master#305> matched <Change 0x7f080aa84b50 openstack/nova 772986,1>16:14
fungia couple hours ago now16:14
clarkbanother possibility is the node requests have gotten held up in some cloud provider that is very slowly failing to build instances16:16
fungiseeing some interesting tracebacks in the scheduler log, probably unrelated, but not sure yet16:19
fungiException: No job nodeset for devstack-plugin-ceph-tempest-py316:19
clarkbthat job isn't listed as one of the queued jobs at least16:20
fungilots of different jobs mentioned if i just grep for 'Exception: No job nodeset for '16:20
fungino clear pattern16:20
fungidifferent jobs, different executors16:22
*** hasharLunch is now known as hashar16:22
fungilooks like it may happen when the scheduler cancels a build which hasn't gotten node assignments16:23
fungianyway, probably not related, unless it's a separate symptom of a shared problem16:26
clarkbthere appear to be ~12 node requests related to that event (7be06742213544c0b70da0c2cad24b1e) complaining frequently that the node request is locked and unable to be revised16:32
*** lpetrut has quit IRC16:32
clarkb200-0012939975 is the node request for the pep8 job16:34
clarkbfrom nl01 2021-02-22 23:02:45,802 INFO nodepool.NodeLauncher: [e: 7be06742213544c0b70da0c2cad24b1e] [node_request: 200-0012939975] [node: 0023140850] Node is ready16:36
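The tracing approach described here (event id, then node request, then node) relies on the bracketed tags that appear in the quoted log lines. A minimal sketch of pulling those tags out of a line follows, assuming the "[key: value]" layout seen in the launcher line just quoted; `TAG_RE`, `parse_tags` and `lines_for_event` are hypothetical helper names, not part of Zuul or Nodepool.

```python
import re

# Assumption: correlation tags appear as "[key: value]", as in the launcher
# log line quoted in the channel. The names below are hypothetical helpers.
TAG_RE = re.compile(r"\[(e|node_request|node): ([^\]]+)\]")


def parse_tags(line):
    """Return the bracketed correlation tags of one log line as a dict."""
    return dict(TAG_RE.findall(line))


def lines_for_event(lines, event_id):
    """Keep only the log lines tagged with the given event id."""
    return [ln for ln in lines if parse_tags(ln).get("e") == event_id]


sample = (
    "2021-02-22 23:02:45,802 INFO nodepool.NodeLauncher: "
    "[e: 7be06742213544c0b70da0c2cad24b1e] "
    "[node_request: 200-0012939975] [node: 0023140850] Node is ready"
)
tags = parse_tags(sample)
print(tags["node_request"], tags["node"])
```

Filtering a log by the event id first and then grouping by `node_request` is one way to see which requests never progressed past a given step.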
clarkbwe have ~18 ready but locked instances that are all about 17 and a half hours old16:39
clarkbthey are also all rax-dfw instances16:39
clarkber no all rax16:39
clarkbwhich means they are all provided by the same provider16:39
fungiso maybe this is fallout from some problem in that region16:44
fungigood catch16:44
clarkb* same launcher not same provider :)16:45
clarkbI'm trying to find in the nodepool code where a normal unlock would happen to try and understand why it didn't in this case16:45
clarkbit looks like the nodepool driver system polls to see when all the nodes for a request are ready, then it updates the noderequest and unlocks all the nodes?16:51
clarkband that poll short circuits if the provider reports launches are not complete16:52
clarkbcorvus: ^ any idea of what might be going on here?16:53
clarkbI half suspect that we can restart the launcher on nl01 and that may unstick things16:53
*** gfidente is now known as gfidente|afk16:54
clarkblaunchesComplete in openstack's handler doesn't do any logging16:55
clarkblooking at the node request directly in zk its state is still pending16:58
clarkb(is there a better way to do that?)16:58
clarkbI think for some reason the handler poll for node request completion is not completing16:59
clarkband that is not unlocking the node and setting the node request state to fulfilled17:00
*** lucasagomes has quit IRC17:00
clarkbthere is a single node in the node request and the single node's state is ready so launchesComplete should detect it as completed17:01
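The behaviour clarkb is describing (poll until all launches are complete, mark the request fulfilled, then unlock the nodes) can be sketched as a toy model. This is only an illustration of the logic as stated in the channel; the classes and functions are hypothetical and do not mirror Nodepool's real handler code or its ZooKeeper records.

```python
from dataclasses import dataclass, field


# Toy model only: hypothetical stand-ins, not Nodepool's actual classes.
@dataclass
class Node:
    state: str = "building"
    locked: bool = True


@dataclass
class NodeRequest:
    nodes: list = field(default_factory=list)
    state: str = "pending"


def launches_complete(request):
    """True once every node attached to the request reports ready."""
    return all(node.state == "ready" for node in request.nodes)


def poll(request):
    """One poll pass; returns True if the request was fulfilled."""
    if not launches_complete(request):
        return False  # short circuit: request stays pending, nodes stay locked
    request.state = "fulfilled"  # mark the request first...
    for node in request.nodes:
        node.locked = False  # ...then release the nodes
    return True


# The case from the channel: one node in the request, already ready.
request = NodeRequest(nodes=[Node(state="ready")])
fulfilled = poll(request)
```

In this model a single ready node is fulfilled on the first poll pass. If the poll never runs for a request, the request stays pending and its node stays locked, which would be consistent with the symptoms discussed here.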
corvusclarkb: you have confirmed that the node requests for those nodes are still unfulfilled?17:02
corvusie, it's nodepool holding the lock, not zuul?17:02
clarkbcorvus: maybe? the node request state is pending which I think means it is still in nodepool's hands17:04
clarkbI'm not sure how to check who owns the lock directly17:04
clarkb(nodepool appears to set it to fulfilled before unlocking the request)17:04
corvusthat's probably good enough (you can check directly by inspecting zk, but i don't think that's necessary based on what you just said)17:05
clarkbcorvus: if I do a get on the lock znode there is no content17:06
clarkboh maybe I have to cd into that znode and look at its child17:07
clarkbthat gives me a uuid which I can't seem to map to anything else17:09
clarkbbut ya seems to be nodepool hanging on to it based on other states17:10
fungilast restart of that launcher was ~4 days ago so doesn't seem to have been triggered by an update17:11
*** d34dh0r53 has quit IRC17:11
*** ysandeep|ruck is now known as ysandeep|away17:21
*** ociuhandu_ has joined #openstack-infra17:24
*** ociuhandu has quit IRC17:28
*** ociuhandu_ has quit IRC17:28
fungiclarkb: should we grab a thread dump of the launcher before restarting it?17:29
fungiand do we want to restart or attempt znode surgery?17:29
clarkbfungi: I did sigusr2 twice already (first starts yappi and that makes things slow so I did a followup to turn it off)17:29
clarkbthe two resulting thread dumps are in the debug log17:30
fungiahh, awesome17:30
fungialso is https://rackspace.service-now.com/system_status/ blank for everyone or just me?17:30
clarkbgrepping for the node id doesn't show anything. Neither does the request number17:30
clarkbthat url gives me content17:30
fungithe last few times i've tried to consult it in recent months, it's always been blank17:31
clarkbyou're probably noscripting it or something17:31
fungisomething. though i don't use noscript17:31
clarkbI think the loop that checks for completed handlers runs in the main top level provider thread though and that seems to be running17:31
fungimaybe it's privacy badger or ddgpe17:32
clarkbmaybe we lost the handlers somehow so we can't poll them17:32
fungicool, an unmodded chromium profile seems to load their status page for me17:32
clarkbanyway, should we restart now?17:33
clarkband see if that makes things happier17:33
fungicloud servers incident logged for the syd region yesterday, otherwise nada17:33
fungiclarkb: yeah, a restart of the launcher seems safe enough17:33
fungiworst case we discard some in progress launches along with the problem locks17:34
fungibut shouldn't be all that wasteful17:34
clarkbok trying that now17:34
fungithanks!17:34
clarkb0023140850 is in use now17:35
clarkband jobs are starting on that change17:35
fungieolivare: ralonsoh: stephenfin: things should be moving again, sorry for the delay and thanks for bringing it to our attention17:38
ralonsohfungi, thanks a lot17:38
eolivarefungi, clarkb, thank you!17:39
clarkbI'm going to try and preserve those thread dumps in my homedir then will need to find food before starting next round of meetings17:40
fungiyep, scarfing down some leftovers for lunch before next meeting17:40
*** rpittau is now known as rpittau|afk17:41
clarkbnl01:~clarkb/ready-nodes-locked-and-stuck.threaddumps has that captured info17:43
*** piotrowskim has quit IRC17:47
*** derekh has quit IRC18:00
*** dtantsur is now known as dtantsur|afk18:00
*** jpena is now known as jpena|off18:02
*** rlandy|training is now known as rlandy18:02
*** yamamoto has joined #openstack-infra18:24
*** eolivare has quit IRC18:29
*** yamamoto has quit IRC18:30
*** lpetrut has joined #openstack-infra18:34
*** lpetrut has quit IRC18:40
*** jcapitao has quit IRC18:48
*** bdodd has quit IRC19:18
*** bdodd has joined #openstack-infra19:19
*** hashar has quit IRC19:22
openstackgerritMerged openstack/project-config master: Deprecate monasca-ceilometer  https://review.opendev.org/c/openstack/project-config/+/777095 19:25
*** auristor has quit IRC19:28
openstackgerritMerged openstack/project-config master: Deprecate monasca-log-api  https://review.opendev.org/c/openstack/project-config/+/777093 19:29
*** andrewbonney has quit IRC19:34
*** gmann is now known as gmann_lunch19:47
*** nightmare_unreal has quit IRC19:47
*** auristor has joined #openstack-infra20:05
*** gmann_lunch is now known as gmann20:07
*** ysirndjuro has joined #openstack-infra20:09
*** gyee has joined #openstack-infra20:14
*** zzzeek has quit IRC20:29
*** hamalq has joined #openstack-infra20:30
*** zzzeek has joined #openstack-infra20:31
*** outbrito has quit IRC20:35
*** outbrito has joined #openstack-infra20:35
*** rcernin has joined #openstack-infra20:44
*** slaweq has quit IRC20:50
*** rajinir has quit IRC21:01
*** rcernin has quit IRC21:01
*** rajinir has joined #openstack-infra21:02
*** knikolla has quit IRC21:11
*** knikolla has joined #openstack-infra21:11
*** zaro has quit IRC21:11
*** zaro has joined #openstack-infra21:13
*** gfidente|afk has quit IRC21:24
*** ajitha has quit IRC21:24
openstackgerritKendall Nelson proposed openstack/project-config master: Add New Repo for StoryBoard-vue  https://review.opendev.org/c/openstack/project-config/+/777244 21:38
*** rcernin has joined #openstack-infra21:46
*** rcernin has quit IRC21:57
*** rcernin has joined #openstack-infra21:57
*** nweinber has quit IRC22:06
*** yamamoto has joined #openstack-infra22:15
*** xek has quit IRC22:26
*** yamamoto has quit IRC22:28
*** yamamoto has joined #openstack-infra22:28
*** ralonsoh has quit IRC22:31
openstackgerritMerged openstack/devstack-gate master: Use stable constraints for Tempest venv for stable/stein testing  https://review.opendev.org/c/openstack/devstack-gate/+/776722 22:40
*** dansmith has quit IRC22:46
*** valleedelisle has quit IRC22:46
*** dansmith has joined #openstack-infra22:48
*** valleedelisle has joined #openstack-infra22:48
*** tkajinam has joined #openstack-infra22:51
*** dchen has joined #openstack-infra23:35
*** tkajinam has quit IRC23:40
*** tkajinam has joined #openstack-infra23:40
spotzfungi: Did you just get a phishing PM that freenode is moving?23:50
fungispotz: yep, i just mentioned it in #opendev23:50
spotzhehe23:51
fungiprobably someone scraping the nicklists for populous channels on freenode23:51
fungithey'll accidentally spam staff soon enough and get klined23:51
spotzYeah23:51
