Monday, 2021-03-29

*** hamalq has joined #zuul00:56
*** hamalq has quit IRC01:01
*** sshnaidm|afk has quit IRC02:06
*** evrardjp has quit IRC02:33
*** evrardjp has joined #zuul02:33
*** hamalq has joined #zuul02:57
*** hamalq has quit IRC03:01
*** ykarel__ has joined #zuul03:40
*** ykarel__ is now known as ykarel04:16
*** paladox has quit IRC04:20
*** ricolin has quit IRC04:35
*** ricolin has joined #zuul04:48
*** hamalq has joined #zuul04:58
*** hamalq has quit IRC05:03
*** jangutter has quit IRC06:04
*** jangutter has joined #zuul06:04
*** jcapitao has joined #zuul06:17
*** hamalq has joined #zuul06:59
*** hamalq has quit IRC07:03
*** hashar has joined #zuul07:03
*** rpittau|afk is now known as rpittau07:15
openstackgerritSimon Westphahl proposed zuul/zuul master: Fix race condition related to out-of-band acks  https://review.opendev.org/c/zuul/zuul/+/78360707:16
*** mhu has joined #zuul07:28
*** tosky has joined #zuul08:28
*** saneax has joined #zuul08:33
*** ykarel is now known as ykarel|lunch08:53
*** hamalq has joined #zuul09:00
*** hamalq has quit IRC09:04
*** ajitha has joined #zuul09:05
openstackgerritSimon Westphahl proposed zuul/zuul master: Move project secrets key loading to key storage  https://review.opendev.org/c/zuul/zuul/+/75893909:16
openstackgerritSimon Westphahl proposed zuul/zuul master: Store secrets keys and SSH keys in Zookeeper  https://review.opendev.org/c/zuul/zuul/+/75894009:16
*** nils has joined #zuul09:17
*** holser has quit IRC09:25
*** ykarel|lunch is now known as ykarel09:27
openstackgerritSimon Westphahl proposed zuul/zuul master: Store secrets keys and SSH keys in Zookeeper  https://review.opendev.org/c/zuul/zuul/+/75894009:27
*** holser has joined #zuul09:27
*** hashar is now known as hasharLunch09:36
*** sshnaidm|afk has joined #zuul10:18
*** jangutter has quit IRC10:24
*** jangutter has joined #zuul10:25
*** jangutter_ has joined #zuul10:47
*** jangutter has quit IRC10:50
*** hamalq has joined #zuul11:00
*** hamalq has quit IRC11:05
*** jcapitao is now known as jcapitao_lunch11:07
*** shanemcd has quit IRC11:19
*** shanemcd has joined #zuul11:19
*** sshnaidm|afk is now known as sshnaidm|off11:45
*** sduthil has joined #zuul12:19
*** jcapitao_lunch is now known as jcapitao12:25
*** paladox has joined #zuul12:32
*** cloudnull has joined #zuul12:41
*** hamalq has joined #zuul13:01
*** hamalq has quit IRC13:06
*** hasharLunch is now known as hashar13:17
*** GomathiselviS has joined #zuul13:36
*** ykarel_ has joined #zuul13:39
*** ykarel has quit IRC13:42
GomathiselviScorvus fungi : looking for merge today - https://review.opendev.org/c/zuul/zuul-jobs/+/77347414:01
*** harrymichal has joined #zuul14:13
corvustristanC: fyi https://zuul.opendev.org/api/tenant/zuul/pipeline/check/project/zuul/zuul/branch/master/freeze-job/zuul-build-image14:21
corvusjhesketh: ^14:22
tristanCcorvus: nice thanks a lot, i'll give it a try with the zuul-runner cli!14:24
avasscorvus: oh nice14:24
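The freeze-job link corvus pasted follows a regular path layout, so it can be assembled for other jobs. The sketch below is only an illustration: the helper names `freeze_job_url` and `fetch_frozen_job` are hypothetical, the path layout is copied from the pasted URL rather than from API documentation, and the response is assumed to be JSON.

```python
import json
from urllib.request import urlopen


def freeze_job_url(base, tenant, pipeline, project, branch, job):
    """Build a freeze-job URL with the same path layout as the pasted link.

    Assumption: the layout is inferred from the single URL in the log.
    """
    return (
        f"{base}/api/tenant/{tenant}/pipeline/{pipeline}"
        f"/project/{project}/branch/{branch}/freeze-job/{job}"
    )


def fetch_frozen_job(url, timeout=30):
    """Fetch the URL and decode the body, assuming it is JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url = freeze_job_url(
        "https://zuul.opendev.org", "zuul", "check",
        "zuul/zuul", "master", "zuul-build-image",
    )
    print(url)
    # Uncomment to perform the actual network request:
    # print(json.dumps(fetch_frozen_job(url), indent=2)[:500])
```

The network call is left commented out so the snippet runs without access to the server.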
*** jangutter has joined #zuul14:39
*** jangutter_ has quit IRC14:40
*** jangutter_ has joined #zuul14:40
*** jangutter has quit IRC14:44
corvusi'm looking at https://grafana.opendev.org/d/5Imot6EMk/zuul-status?orgId=1&from=now-7d&to=now and trying to figure out if the ~100 node requests with available capacity represents some kind of new lag related to zk, or if this is what a monday ramp-up looks like14:44
corvuslast week if we had > 100 node requests for more than an hour we were at over 725 nodes in use14:47
corvusnow we're over 100 requests for several hours with 625 in use14:47
clarkbcorvus: cross check against errors booting instances to make sure there isn't a new launch failure?14:49
avassthe event processing time doesn't seem to match the node request ramp up14:49
corvusclarkb: yeah, simplest explanation is probably a cloud thing14:49
corvusavass: agree; i'm not seeing any zk metrics correlating14:49
avassI suppose the spike around 06:00 does match, I suppose that's a periodic pipeline triggering?14:55
*** ykarel_ is now known as ykarel14:55
avassoh it is :)14:57
corvusokay we need to install 'less' on our nodepool images :)14:58
avassnumber of znodes/ephemeral nodes/watches and data size does seem to increase over time14:59
corvusthat correlates well with nodes in use14:59
tobiashdo you have throughput numbers like jobs/h?15:02
*** hamalq has joined #zuul15:02
tobiashwe found those useful to see if there is unusual behavior (aka at quota throughput is comparable)15:03
tobiashwe often spotted issues in the past when we saw unusually high or low throughput when the system is under load15:04
corvusthere's launched-per-hour; i don't know if we have completed-per-hour handy, but i'm sure we could get it15:06
*** hamalq has quit IRC15:07
tobiashI guess launched-per-hour is similar enough to completed-per-hour15:07
corvusthey should at least correlate15:07
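A per-hour throughput figure like the one tobiash describes can be derived from any list of event timestamps. This is a minimal sketch, assuming ISO-8601 timestamps are available from some source (for example launch or completion records); the function name `per_hour` and the sample timestamps are made up for illustration and are not taken from Zuul or its metrics.

```python
from collections import Counter
from datetime import datetime


def per_hour(timestamps):
    """Count events per clock hour from ISO-8601 timestamp strings."""
    counts = Counter()
    for ts in timestamps:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(ts).replace(minute=0, second=0, microsecond=0)
        counts[hour] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical sample data, not real measurements.
    sample = [
        "2021-03-29T14:05:00",
        "2021-03-29T14:50:00",
        "2021-03-29T15:01:00",
    ]
    for hour, count in per_hour(sample).items():
        print(hour.isoformat(), count)
```

Computing the same series for launches and for completions would allow a check of whether the two actually correlate, as suggested above.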
corvusi see a lot of arm64 requests outstanding15:09
corvus12115:09
avassyeah nothing seems to be starting in check-arm6415:10
corvusour arm64 cloud situation is not as robust; i would not be surprised if there's an operational issue there15:10
clarkb10 days ago it had trouble with finding hypervisors to place the VMs on15:10
corvusyep, the cloud's ssl cert has expired15:11
corvusso this is a false alarm for #zuul; looks like the behavior change is an #opendev ops issue and not related to zk work15:11
fungikevinz fixed the expired cert there for us last time, but i guess it has expired again now15:13
fungiprobably three months ago ;)15:13
tristanCcorvus: with trigger events being stored in zk, shouldn't the ZNodes values be higher than last week?15:30
*** ykarel is now known as ykarel|away15:33
*** hashar has quit IRC15:39
*** ykarel|away has quit IRC15:42
fungiGomathiselviS: corvus: i approved https://review.opendev.org/773474 just now, and will keep tabs on any impact in opendev's deployment15:52
fungithe copy in opendev's build-test was demonstrated properly defaulting to rsa keys and able to run the rest of jobs normally15:54
corvustristanC: the event queue is in zk, and ideally the queue length should stay near zero, so we shouldn't see an increase in storage size or znode count (unless something goes wrong).  we might see it spike up 100 or so on reconfigurations or similar situations where we stop processing event queues.15:54
corvusfungi, GomathiselviS, thanks :)15:55
fungis/build-test/base-test/15:55
corvusfungi: i knew you meant testing thingamajig15:55
GomathiselviSfungi corvus pabelanger : Thanks for the help !15:56
fungiGomathiselviS: thanks for your patience with the complexity of testing lower-level roles like that one15:58
*** rpittau is now known as rpittau|afk16:08
*** saneax has quit IRC16:08
openstackgerritMerged zuul/zuul-jobs master: Create a template for ssh-key and size  https://review.opendev.org/c/zuul/zuul-jobs/+/77347416:10
*** hamalq has joined #zuul16:15
*** nils has quit IRC16:25
*** hamalq has quit IRC16:31
*** hamalq has joined #zuul16:31
*** hamalq has quit IRC16:33
*** hamalq has joined #zuul16:33
*** jcapitao has quit IRC16:43
openstackgerritShturm Svetlana proposed zuul/zuul-jobs master: Fix undefined error for zuul_ssh_key_algorithm  https://review.opendev.org/c/zuul/zuul-jobs/+/78371716:54
clarkbfungi: ^ is that related to the change you helped land?16:57
tristanCclarkb: it seems like it yes, in https://review.opendev.org/c/zuul/zuul-jobs/+/773474 , the remove-build-ssh-key is now using a variable that was not added to the defaults17:00
clarkbtristanC: but it is added as a role var?17:01
clarkbare role vars only accessible within a role?17:02
fungiclarkb: GomathiselviS: corvus: it hasn't been breaking jobs in opendev as far as i can tell17:02
fungii wonder how it's getting used in the breaking environment17:03
tristanCfungi: i guess this happens when using the remove-build-ssh-key role directly17:04
fungiand the vars defined in the role aren't getting used?17:05
clarkboh if only that role is used and not the add-build-sshkey role?17:05
fungiwouldn't using the role also instantiate everything from roles/add-build-sshkey/vars/main.yaml ?17:05
clarkbthat could be17:05
fungiaha, i see what you're saying17:05
fungii missed it was for a different role17:07
fungii suppose caching could influence that behavior17:08
*** jangutter has joined #zuul17:09
*** jangutter_ has quit IRC17:12
openstackgerritMerged zuul/zuul-jobs master: Fix undefined error for zuul_ssh_key_algorithm  https://review.opendev.org/c/zuul/zuul-jobs/+/78371717:18
corvusi'm glad we didn't merge that on friday afternoon17:29
*** harrymichal has quit IRC17:30
*** harrymichal has joined #zuul17:30
fungiyep!17:50
*** harrymichal has left #zuul17:57
openstackgerritJames E. Blair proposed zuul/zuul master: Fix ZK-related race condition in github driver  https://review.opendev.org/c/zuul/zuul/+/78372617:58
corvusswest, tobiash: ^ that's an alternate fix for the race18:02
*** cloudnull has quit IRC18:43
*** Goneri has joined #zuul18:58
*** Goneri has quit IRC18:59
*** cloudnull has joined #zuul19:21
*** harrymichal has joined #zuul19:30
*** ajitha has quit IRC19:48
*** jangutter_ has joined #zuul19:49
*** jangutter has quit IRC19:52
*** nhicher has quit IRC20:56
*** fbo has joined #zuul20:59
*** nhicher has joined #zuul21:00
*** harrymichal has quit IRC21:34
*** GomathiselviS has quit IRC22:05
*** cloudnull has quit IRC22:09
*** y2kenny has joined #zuul22:18
y2kennyHi, I am seeing a lot of "Waiting on logger" in my logs (job-output.txt) even though the command output shows up in job-output.json, what could be the cause of this?  (I already have the 'start-zuul-console' role at the beginning of the play)22:23
clarkby2kenny: it's basically the time between the job starting to do stuff and start-zuul-console successfully starting the remote logger process22:23
clarkby2kenny: there was a semi recent change made to reduce the amount of that output as it was fairly verbose previously (I think now it's cut by 1/10th)22:24
y2kennyclarkb: ok... so sounds like the start-zuul-console role never succeeded on my baremetal node.  What is the requirement for the start-zuul-console?  Is there any good way to debug it?22:25
y2kennyI see this: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/start-zuul-console/tasks/main.yaml but not too sure where to go from there.22:27
clarkboff the top of my head I think it may only run on linux? though some of the windows users can confirm or deny that. It starts a python process that listens on port 19885 which zuul connects to to fetch the data over22:28
clarkbzuul/ansible/base/library/zuul_console.py is the code that runs to do this22:29
y2kennyok... I am using linux on the baremetal node but I wonder if fedora's firewall rules block it by default22:29
fungiy2kenny: they might, we bake an exception for that into our node images: https://opendev.org/openstack/project-config/src/branch/master/nodepool/elements/nodepool-base/install.d/20-iptables#L6022:43
corvustobiash: a couple of questions on https://review.opendev.org/66341322:44
y2kennyfungi: thanks!22:44
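One way to test the firewall theory is to try a plain TCP connection to the console port from the machine that needs to reach the node. This is a minimal sketch: port 19885 comes from clarkb's description, while the helper name `console_reachable` and the host address in the usage example are placeholders. A successful connection only shows that something is listening and reachable, not that it is the zuul console process.

```python
import socket

ZUUL_CONSOLE_PORT = 19885  # port mentioned in the discussion above


def console_reachable(host, port=ZUUL_CONSOLE_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    node = "192.0.2.10"  # placeholder address; substitute the baremetal node
    state = "reachable" if console_reachable(node) else "not reachable"
    print(f"{node}:{ZUUL_CONSOLE_PORT} is {state}")
```

If the port is not reachable while the console process is running on the node, a firewall rule between the two hosts is a likely candidate.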
*** cloudnull has joined #zuul22:52
