Wednesday, 2022-01-12

-@gerrit:opendev.org- Shnaidman Sagi (Sergey) proposed on behalf of Sorin Sbârnea: [zuul/zuul-jobs] 803471: Include podman installation with molecule https://review.opendev.org/c/zuul/zuul-jobs/+/803471 13:36
@jpew:matrix.orgjpew14:20
@y2kenny:matrix.orgcorvus I just want to follow up on https://review.opendev.org/c/zuul/zuul/+/823732 (git_over_ssh.)  Can that go in?16:30
@jim:acmegating.comKenny Ho: yeah, thanks for double checking!17:01
@y2kenny:matrix.orgAwesome! Thanks!17:02
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:17:32
- [zuul/zuul] 823587: Add some ZK debug scripts https://review.opendev.org/c/zuul/zuul/+/823587
- [zuul/zuul] 824077: Add a zk-shell debug script https://review.opendev.org/c/zuul/zuul/+/824077
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 17:33
@tobias.henkel:matrix.orgcorvus: q on ^17:45
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 17:56
@jim:acmegating.comtobiash: thx, that was a silly copypasta17:57
@tobias.henkel:matrix.org+217:57
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 824477: Improve documentation around ZK requirements https://review.opendev.org/c/zuul/zuul/+/824477 18:29
@hanson76:matrix.orgHi, we are running Zuul 4.11.0 and Nodepool 4.3.0 together with the aws driver to launch EC2 instances.18:29
We have recently seen that we get NODE_ERROR on builds from time to time, and it's starting to become annoying to have every fifth build fail.
I've done some digging around and figured out that the aws driver in nodepool is accessing the DescribeInstanceId API too quickly after
the create instance API call has finished.
It takes some time for the AWS backends to propagate information about the newly created instance.
Nodepool ends up receiving a NotFound error from DescribeInstanceId in some cases because of this.
I've added a story in the story board about this (https://storyboard.openstack.org/#!/story/2009781)
My guess is that a simple loop with a sleep around the DescribeInstanceId could fix this problem and make the
aws driver more robust. Is this something that could be fixed? Just noticed that Nodepool 4.4.0 was released yesterday.
@clarkb:matrix.orga retry loop with a sleep up to some timeout seems reasonable. I believe there are other similar cases of code because clouds are weird :)18:32
@clarkb:matrix.orgAnders Hanson: if you have it, a copy of the full traceback would likely be helpful. I wouldn't know what exception to catch myself and don't have aws credentials to test with18:33
@clarkb:matrix.orgAlso if you'd like to write the fix yourself I'd be happy to help with reviews and general process18:34
@hanson76:matrix.orgI'll start with digging up the stacktrace.18:35
@hanson76:matrix.orgI've added the stacktrace to the story.18:41
@clarkb:matrix.orgI see it thanks18:47
@clarkb:matrix.orgAnders Hanson: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/simple.py#L114-L119 is code in another driver that does similar to what you need I think18:49
@clarkb:matrix.orgYou'd need to put that in the aws driver and update it to catch that exception and ignore it until the timeout18:50
@clarkb:matrix.orgIf you'd prefer someone else write the change let us know, but this is probably a good bugfix to get involved if interested :)18:50
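The retry-with-timeout idea discussed above could be sketched roughly as follows. This is not the nodepool code: `InstanceNotFound`, `wait_for_instance`, the `describe_instance` callable, and the timeout and interval defaults are hypothetical stand-ins for whatever the aws driver actually raises and calls.

```python
import time


class InstanceNotFound(Exception):
    """Hypothetical stand-in for the NotFound error seen in the traceback."""


def wait_for_instance(describe_instance, instance_id, timeout=60.0,
                      interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Call describe_instance(instance_id) until it stops raising
    InstanceNotFound, or re-raise once `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        try:
            return describe_instance(instance_id)
        except InstanceNotFound:
            # The cloud may not have propagated the new instance yet.
            if clock() >= deadline:
                raise
            sleep(interval)
```

Passing `sleep` and `clock` as parameters is only a convenience so the loop can be exercised without real delays; any other exception is deliberately left to propagate immediately.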
@hanson76:matrix.orgThanks, I'll take a stab at it tomorrow; I haven't written anything in Python before.18:54
-@gerrit:opendev.org- Zuul merged on behalf of Kenny Ho: [zuul/zuul] 823732: Add git_over_ssh option for Gerrit connection https://review.opendev.org/c/zuul/zuul/+/823732 19:01
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 19:25
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824482: Add "zuul delete-pipeline-state" command https://review.opendev.org/c/zuul/zuul/+/824482 20:57
@clarkb:matrix.orgcorvus: swest for https://review.opendev.org/c/zuul/zuul/+/823782/8/zuul/zk/zkobject.py This will render gets with zkshell pretty much useless. I think zlib isn't directly decompressable with gzip too? I guess we'll have to run a python shell and import zlib and decompress that way?22:57
@clarkb:matrix.orgIt's not the end of the world but I think keeping the database as human readable as possible is a useful thing, particularly since I've had to go diving in the db a non zero number of times22:58
@jim:acmegating.comClark: correct, thus i made https://review.opendev.org/824077 22:58
@clarkb:matrix.orgaha, thanks22:59
@jim:acmegating.comClark: also added a decompress option to the dump script in https://review.opendev.org/823587 22:59
@clarkb:matrix.orgfwiw opendev's db size is well under 500MB which isn't crazy, but I guess other installs are probably quite a bit bigger23:00
@jim:acmegating.comClark: and yes, i totally agree on keeping the db the same.  apparently this will make a huge difference in performance for swest's use case.  both in db size as well as throughput (like, it actually makes the scheduler run faster).  so i think it's worth it, especially since we can mitigate the loss of functionality with those tools.23:01
@jim:acmegating.com(i mean, it should make a 90% improvement for everyone, but that has outsized impact at larger scales)23:04
@clarkb:matrix.orgI guess once the extra tools land opendev should update their zk config to drop port 218123:04
@clarkb:matrix.orgsince we'll want to use these tools anyway and they support the compressed data unlike zkshell23:05
@jim:acmegating.comClark: either way -- that's still firewalled only to the servers, and ssl is optional in the tool i wrote23:05
@clarkb:matrix.orgya I just figure we'll be using the tools since they understand zuul's db better and since they support the certs may as well drop the easy mode23:05
@jim:acmegating.comso we can use the new tool locally on zk04 with no ssl, or remotely on zuul01 with ssl23:05
@jim:acmegating.comClark: also, fyi `zlib-flate` (in the `qpdf` package in debuntu) is an easy way to decompress zlib from shell.  there are other options which involve gzip and the magic header too, but `zlib-flate` is the most straightforward.23:09
@clarkb:matrix.orgTIL23:09
@clarkb:matrix.orgya for gzip you have to prepend a header or something23:10
@clarkb:matrix.orgthen it just works23:10
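For the "run a python shell and import zlib" route mentioned above, a minimal sketch might look like this. The payload and the `decode_znode` helper are made up for illustration; how Zuul actually lays out its stored data is not shown in this log, so the fallback to returning uncompressed bytes unchanged is an assumption.

```python
import zlib


def decode_znode(data: bytes) -> bytes:
    """Return zlib-decompressed bytes, or the input unchanged if it does
    not look like a zlib stream (assumed fallback for uncompressed data)."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return data


# Made-up payload standing in for whatever was read from ZooKeeper.
raw = b'{"hello": "world"}'
stored = zlib.compress(raw)

print(decode_znode(stored).decode("utf-8"))
```

A plain gzip reader rejects this data because a zlib stream carries a different header and trailer than a gzip file, which is why the shell route needs either `zlib-flate` or a prepended gzip header.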
@clarkb:matrix.orgcorvus: TIL about cmd as well23:30
@jim:acmegating.comClark: yeah, i, erm, wrote 50% of what cmd does myself before i stumbled on it and started over :)23:33
@jim:acmegating.comi think there's a bit of room for improvement, but it's good enough23:33
@jim:acmegating.com(like, i'd like to see something that's a combo of cmd+argparse)23:34
@clarkb:matrix.orgcorvus: why is help_get different than the others? Is it because it is multiline docstring?23:39
@clarkb:matrix.orgre argparse you can feed argparse the string provided by args in cmd I think. But ya probably not necessary23:40
@jim:acmegating.comClark: yeah.  that's one of the wonky things.  i wanted to dedent it properly.23:40
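The cmd+argparse combination and the dedented `help_get` discussed above could be sketched like this. It is not the script under review in 824077: `DemoShell`, the `--upper` flag, and the in-memory `store` dict are hypothetical and only illustrate feeding the `args` string from `cmd` into `argparse`.

```python
import argparse
import cmd
import shlex
import textwrap


class DemoShell(cmd.Cmd):
    prompt = "demo> "

    def __init__(self, store, **kwargs):
        super().__init__(**kwargs)
        self.store = store  # hypothetical path -> value mapping

    def do_get(self, args):
        # cmd hands us everything after the command name as one string;
        # split it shell-style and let argparse do the option parsing.
        parser = argparse.ArgumentParser(prog="get", add_help=False)
        parser.add_argument("path")
        parser.add_argument("--upper", action="store_true")
        try:
            opts = parser.parse_args(shlex.split(args))
        except (SystemExit, ValueError):
            # argparse exits on bad input and shlex raises on unbalanced
            # quotes; swallow both so the interactive loop keeps running.
            return
        value = self.store.get(opts.path, "")
        print(value.upper() if opts.upper else value, file=self.stdout)

    def help_get(self):
        # A help_* method gives full control over multi-line help text,
        # including dedenting it properly.
        text = """\
            get [--upper] PATH

            Print the value stored at PATH.
            """
        self.stdout.write(textwrap.dedent(text))

    def do_quit(self, args):
        return True  # a true return value ends cmdloop()
```

Running `DemoShell({"/a": "hello"}).cmdloop()` would start an interactive prompt; `onecmd("get --upper /a")` runs a single command without the loop.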
@clarkb:matrix.orgcorvus: I think I found a bug in https://review.opendev.org/c/zuul/zuul/+/824077 23:43
@clarkb:matrix.orgI left a comment23:43
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824077: Add a zk-shell debug script https://review.opendev.org/c/zuul/zuul/+/824077 23:48
@jim:acmegating.comClark: agree thx23:48
@clarkb:matrix.orgcorvus: for https://review.opendev.org/c/zuul/zuul/+/817626 and children do we want to avoid landing anything until after 4.12.0? Or are we planning on incorporating all of this stuff into 5.0?23:52
@jim:acmegating.comClark: i'm leaning toward: land all the zkobject stuff, 4.12.0, then land that stack and 5.0.23:53
@clarkb:matrix.orgok23:53
@jim:acmegating.com(but i could be talked into wrapping all that into 5.0 -- i'm just thinking that a little more time with this before 5.0 would be best)23:54
@clarkb:matrix.orgno objections from me23:54
