Wednesday, 2021-10-20

-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 814695: ensure-docker: remove Debian Stretch testing https://review.opendev.org/c/zuul/zuul-jobs/+/81469503:55
-@gerrit:opendev.org- Sandeep Yadav proposed: [zuul/zuul-jobs] 814516: multi-node-bridge: repos to install ovs in C9 https://review.opendev.org/c/zuul/zuul-jobs/+/81451605:47
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 814711: UI: Fix build time calculation for empty buildsets https://review.opendev.org/c/zuul/zuul/+/81471106:50
@felixedel:matrix.org^ corvus, mhuin: This should fix a bug on the buildset result page which prevents the page from loading empty buildsets. The bug is related to the newly introduced time/duration calculation.06:54
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 814717: UI: Ignore empty timestamps in build time calculation on buildset page https://review.opendev.org/c/zuul/zuul/+/81471707:24
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 814720: UI: Fix gant chart on buildset page for missing timestamps https://review.opendev.org/c/zuul/zuul/+/81472007:47
@felixedel:matrix.org^ two follow up changes related to missing timestamp information on the buildset result page.07:51
-@gerrit:opendev.org- Zuul merged on behalf of Sandeep Yadav: [zuul/zuul-jobs] 814516: multi-node-bridge: repos to install ovs in C9 https://review.opendev.org/c/zuul/zuul-jobs/+/81451612:45
-@gerrit:opendev.org- Simon Westphahl proposed:12:47
- [zuul/zuul] 809414: Make QueueItem a Zookeeper object https://review.opendev.org/c/zuul/zuul/+/809414
- [zuul/zuul] 810658: Store pipeline state in Zookeeper https://review.opendev.org/c/zuul/zuul/+/810658
- [zuul/zuul] 810920: Store change queues in Zookeeper https://review.opendev.org/c/zuul/zuul/+/810920
- [zuul/zuul] 811422: Save and restore bundle with item in Zookeeper https://review.opendev.org/c/zuul/zuul/+/811422
- [zuul/zuul] 811955: Pass ZK context to deserialize method of ZKObjects https://review.opendev.org/c/zuul/zuul/+/811955
- [zuul/zuul] 812450: Move ZuulMark from configloader to model https://review.opendev.org/c/zuul/zuul/+/812450
- [zuul/zuul] 812451: Recursively delete all sub-nodes of ZKObjects https://review.opendev.org/c/zuul/zuul/+/812451
- [zuul/zuul] 812466: Only retry ZK operations for Kazoo exceptions https://review.opendev.org/c/zuul/zuul/+/812466
- [zuul/zuul] 812452: Store build sets in Zookeeper https://review.opendev.org/c/zuul/zuul/+/812452
- [zuul/zuul] 812467: Add support for sharded ZKObjects https://review.opendev.org/c/zuul/zuul/+/812467
- [zuul/zuul] 812673: Store RepoFiles for a build set in Zookeeper https://review.opendev.org/c/zuul/zuul/+/812673
- [zuul/zuul] 813805: Remove project pipeline config from queue item https://review.opendev.org/c/zuul/zuul/+/813805
- [zuul/zuul] 813809: Lookup event class names from global symbol table https://review.opendev.org/c/zuul/zuul/+/813809
- [zuul/zuul] 813826: Store and resolve queue item's ahead/behind refs https://review.opendev.org/c/zuul/zuul/+/813826
- [zuul/zuul] 814544: Cleanup stale items after refreshing a pipeline https://review.opendev.org/c/zuul/zuul/+/814544
- [zuul/zuul] 814570: Reference active change queues in pipeline state https://review.opendev.org/c/zuul/zuul/+/814570
- [zuul/zuul] 814571: Update pipeline state when modifying attributes https://review.opendev.org/c/zuul/zuul/+/814571
- [zuul/zuul] 814772: Allow passing extra attributes to ZKObject.fromZK https://review.opendev.org/c/zuul/zuul/+/814772
- [zuul/zuul] 814773: wip: move re-enqueue to pipeline processing https://review.opendev.org/c/zuul/zuul/+/814773
-@gerrit:opendev.org- Simon Westphahl proposed on behalf of James E. Blair:12:47
- [zuul/zuul] 812750: Add LocalZKContext for job freezing https://review.opendev.org/c/zuul/zuul/+/812750
- [zuul/zuul] 812760: Add RepoState object https://review.opendev.org/c/zuul/zuul/+/812760
- [zuul/zuul] 813552: Remove Worker class https://review.opendev.org/c/zuul/zuul/+/813552
- [zuul/zuul] 813895: Move job_graph attribute to BuildSet https://review.opendev.org/c/zuul/zuul/+/813895
- [zuul/zuul] 813913: Serialize JobGraph objects to ZK https://review.opendev.org/c/zuul/zuul/+/813913
- [zuul/zuul] 814065: Serialize ProjectMetadata on JobGraph https://review.opendev.org/c/zuul/zuul/+/814065
- [zuul/zuul] 814071: Add test_freeze_noop_job https://review.opendev.org/c/zuul/zuul/+/814071
- [zuul/zuul] 814069: Remove setBase from job freeze API https://review.opendev.org/c/zuul/zuul/+/814069
- [zuul/zuul] 814070: Create Abstract and FrozenJob classes https://review.opendev.org/c/zuul/zuul/+/814070
- [zuul/zuul] 814242: Make FrozenJob.updateParentData a static method https://review.opendev.org/c/zuul/zuul/+/814242
- [zuul/zuul] 814281: Remove toDict from FrozenJob https://review.opendev.org/c/zuul/zuul/+/814281
- [zuul/zuul] 814243: Make FrozenJob a ZKObject https://review.opendev.org/c/zuul/zuul/+/814243
- [zuul/zuul] 814329: Implement frozen job serialization/deserialization https://review.opendev.org/c/zuul/zuul/+/814329
- [zuul/zuul] 814679: Store FrozenJob data in separate znodes https://review.opendev.org/c/zuul/zuul/+/814679
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467613:20
@goneri:matrix.orgHi, we've got a weird situation here where nodepool collects the wrong ssh fingerprints from a host (ssh-ed25519, ssh-ecdsa); the ssh-rsa key is good. The paramiko issue tracker has some reports about bogus keys. This may also be a problem with our VM; the fact that the problem does not happen with other hosts is also confusing. Anyway, does this ring a bell for anyone here?13:38
@nborg:matrix.orgPotentially. We have to use ssh-ed25519 host key. We just work around it by doing that.13:40
@clarkb:matrix.orgGonéri nodepool should collect all host keys.13:47
@clarkb:matrix.orgHave you independently checked with ssh-keyscan?13:47
@goneri:matrix.orgI've just found a case where we've got just one key for a Fedora 34 host in nodepool, whereas the host has 3 host keys (ssh-keyscan).13:48
@goneri:matrix.orgIt's just something I've observed, there is no connection with that and my problem above :-).13:49
@clarkb:matrix.orgGonéri: I wonder if paramiko doesn't do rsa-sha2-* key types (nodepool just iterates through the list in paramiko iirc) and so you get only ed25519 and ecdsa13:50
@clarkb:matrix.organd fedora 34 must've disabled ssh-rsa on the server side now (not just client)?13:50
@clarkb:matrix.orgGonéri: why are the other two keys invalid? if they are present on the host then nodepool and zuul should verify them just fine?13:50
@goneri:matrix.orgFor the Fedora-34 case, the ssh-rsa key was indeed missing, but it's present with my eos appliance (my problem).13:52
@clarkb:matrix.orgGonéri: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/nodeutils.py#L116-L161 is the relevant code. Note lines 124-128. We connect to the sshd in sequence trying each key type supported by paramiko13:56
@goneri:matrix.orgYes, I read this block too. I suspect a race condition.13:58
@clarkb:matrix.orgGonéri: https://github.com/paramiko/paramiko/blob/7714caf79a09dc455a32c6071dd22ba37c399758/paramiko/transport.py#L171-L178 I think that confirms no rsa-sha2, which would explain the lack of rsa keys14:10
@clarkb:matrix.orgI don't think that should be a problem though as you can verify host keys with ecdsa14:10
@goneri:matrix.orgI'm not sure I understand, ssh-rsa is in the list:  https://github.com/paramiko/paramiko/blob/7714caf79a09dc455a32c6071dd22ba37c399758/paramiko/transport.py#L17614:11
@clarkb:matrix.orgGonéri: yes but newer openssh deprecated ssh-rsa by default14:12
@clarkb:matrix.orgyou have to use rsa-sha2-256 or rsa-sha2-512 instead but paramiko doesn't list them14:12
@clarkb:matrix.orgbasically that means it is highly likely that when paramiko connects and says give me your ssh-rsa key fedora 34 says "I don't have one" and then paramiko continues to try the next options. I don't think this is a race14:13
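The behaviour Clark describes above can be illustrated with a minimal sketch. It does not use paramiko or nodepool; `scan_host_keys`, `make_fake_server`, and the key strings are hypothetical names and placeholder data introduced only for illustration. The assumption is that the scanner asks for one key type per connection and that the server offers its RSA key only under the rsa-sha2-* algorithm names.

```python
from typing import Callable, Dict, Optional, Sequence

# Key types the (hypothetical) client asks for. rsa-sha2-* is deliberately
# absent, mirroring the situation described in the chat.
CLIENT_KEY_TYPES = ("ssh-ed25519", "ecdsa-sha2-nistp256", "ssh-rsa")


def scan_host_keys(
    fetch_key: Callable[[str], Optional[str]],
    key_types: Sequence[str] = CLIENT_KEY_TYPES,
) -> Dict[str, str]:
    """Ask the server for one key type at a time and keep what it offers."""
    found = {}
    for key_type in key_types:
        key = fetch_key(key_type)  # assumption: one connection per key type
        if key is None:
            # The server does not offer this algorithm; try the next one.
            continue
        found[key_type] = key
    return found


def make_fake_server(offered: Dict[str, str]) -> Callable[[str], Optional[str]]:
    """Stand-in for an sshd that only negotiates the algorithms in `offered`."""
    return lambda key_type: offered.get(key_type)


# A server with ssh-rsa disabled: the RSA key is offered only as
# rsa-sha2-512, which the client never requests.
fedora_like = make_fake_server({
    "ssh-ed25519": "AAAA-placeholder-ed25519",
    "ecdsa-sha2-nistp256": "AAAA-placeholder-ecdsa",
    "rsa-sha2-512": "AAAA-placeholder-rsa",
})
keys = scan_host_keys(fedora_like)
print(sorted(keys))
```

Under these assumptions the scan is deterministic: the RSA key is skipped on every run, which is why a missing ssh-rsa key alone does not point to a race.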
@goneri:matrix.orgI see. In my case, all my keys but the rsa one are wrong. We initially suspected some old host keys in the image, but that's not the case.14:13
@clarkb:matrix.orgSo the real issue is that the values themselves are wrong?14:13
@goneri:matrix.orgoh yes, this is a pain in the neck. I'm happy we've finally figured out where the problem is.14:14
@clarkb:matrix.orgwell now i'm not sure we have found the problem?14:14
@clarkb:matrix.orgI'm getting confused as to what the problem actually is. It sounded like you wanted an rsa key and weren't getting one. But now I guess the real problem is that you're getting ecdsa and ed25519 host keys that are incorrect?14:15
@goneri:matrix.orgI've got a work around for now (we've disabled everything but ssh-rsa). And, I will keep digging to find a better solution.14:15
@goneri:matrix.orgYes, the missing ssh-rsa key is just something unrelated that I observed this morning.14:16
@clarkb:matrix.orgIf the keys you get back are wrong the first thing I would suspect is an ip address conflict between multiple nodes, so that when the scan is done you're getting back host keys for the wrong host14:20
@goneri:matrix.orgI didn't check that because the problem happens only with a specific type of image. But indeed, I will take a look.14:21
-@gerrit:opendev.org- Douglas Viroel proposed: [zuul/zuul-jobs] 813253: Add FIPS enable multinode job definition https://review.opendev.org/c/zuul/zuul-jobs/+/81325314:22
@nhicher:matrix.orghello, do you know where I can find the Dockerfiles used to build containers available on https://hub.docker.com/u/zuul ?15:02
@clarkb:matrix.orgnhicher: https://opendev.org/zuul/zuul/src/branch/master/Dockerfile and https://opendev.org/zuul/nodepool/src/branch/master/Dockerfile and so on15:12
@nhicher:matrix.orgClark: ok, thanks. I wasn't sure because the image has an entrypoint with 'dumb-init' that I didn't see in the Dockerfile in zuul/zuul.15:17
-@gerrit:opendev.org- Felix Edel proposed:15:17
- [zuul/zuul] 760805: Add /components API endpoint to zuul-web https://review.opendev.org/c/zuul/zuul/+/760805
- [zuul/zuul] 760804: Store version information in component registry https://review.opendev.org/c/zuul/zuul/+/760804
- [zuul/zuul] 760806: UI: Add actions and reducers to retrieve components https://review.opendev.org/c/zuul/zuul/+/760806
- [zuul/zuul] 760807: UI: Add components page https://review.opendev.org/c/zuul/zuul/+/760807
@clarkb:matrix.orgnhicher: dumb init comes from https://opendev.org/opendev/system-config/src/branch/master/docker/python-base/Dockerfile which is the image those docker files build upon15:18
@nhicher:matrix.orgClark: thanks15:19
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467615:33
@jim:acmegating.comClark: https://review.opendev.org/760805 and child are simple changes that would be good to get merged so we can get the ball rolling on restarting with those and testing out the ui changes.  also, i'm really looking forward to a web page that shows the current component statuses/versions.16:05
@clarkb:matrix.orgOk I can take a look after breakfast16:06
@clarkb:matrix.orgcorvus: https://review.opendev.org/c/zuul/zuul/+/760804 will require a full cluster restart. Due to accessing comp.version rather than comp.get("version")16:22
@clarkb:matrix.orgI don't think that is a deal breaker, but calling out16:22
@jim:acmegating.comClark: good point.  we'll have to watch for things like that going forward.  i think we're still generally operating under the assumption that it's best to restart the whole cluster during 4.x development, so i'd be okay with just approving it this time...16:23
@clarkb:matrix.orgOk I'll approve it16:24
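The compatibility concern raised above (`comp.version` versus `comp.get("version")`) can be shown with a small sketch. The `Component` class below is a hypothetical stand-in, not Zuul's actual component registry class, and the hostnames and version string are made-up sample data. It assumes a registry entry backed by a dict, where entries written by older processes lack the `version` field.

```python
class Component:
    """Hypothetical dict-backed registry entry (not Zuul's real class)."""

    def __init__(self, content):
        self.content = content

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.content[name]
        except KeyError:
            raise AttributeError(name)

    def get(self, name, default=None):
        # Tolerant lookup: missing fields yield a default instead of raising.
        return self.content.get(name, default)


# Entry written by an older process: no "version" field.
old = Component({"hostname": "executor-old", "state": "running"})
# Entry written by a process that includes the new field.
new = Component({"hostname": "executor-new", "state": "running",
                 "version": "4.x-sample"})

print(new.version)          # attribute access works when the field exists
print(old.get("version"))   # tolerant access returns None for old entries

try:
    old.version             # strict access fails on an old entry
except AttributeError as exc:
    print("AttributeError:", exc)
```

In a mixed-version cluster, the strict form is what forces every component to be restarted together; the tolerant form would let old and new entries coexist.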
@goneri:matrix.orgClark:  I've got a script here that boots VM and in parallel uses nodepool.nodeutils.nodescan() to retrieve the host keys. And it's really easy to get a missing key. Pretty much every time I call the function I get a different output. I will work on a patch later today.16:40
@clarkb:matrix.orgGonéri: I guess I'm still confused then. Is the problem a missing key? or is the problem that the keys you get have the wrong data?16:44
@goneri:matrix.orgI'm not so sure if there is a connection between the two problems. So I tried to reproduce this one.16:45
@clarkb:matrix.orgGonéri: note that the nodescan process waits for a connection to sshd to succeed first, then disconnects and starts the nodescan process from there. I suppose it is possible that if the initial connection is made early enough with a satisfactory result, the nodescans might hit an unprepared sshd?16:51
@goneri:matrix.orgI see, I will reproduce this during my tests.16:56
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467617:04
-@gerrit:opendev.org- James E. Blair proposed:17:07
- [zuul/zuul] 814684: DNM: Increase unit test job timeout to 2h https://review.opendev.org/c/zuul/zuul/+/814684
- [zuul/zuul] 814685: DNM: Test unit tests on larger nodes https://review.opendev.org/c/zuul/zuul/+/814685
@goneri:matrix.orgI can still reproduce the problem, even if it's harder. =~ 15% of the cases.17:14
@clarkb:matrix.orgGonéri: for the wrong key data is it possible the images you are booting have a set of ssh host keys and we're racing with the startup unit that clears them out and generates new ones? I think DIB is really careful to avoid setting any host keys in the image forcing them to be generated on startup17:18
@jim:acmegating.comthe new containerfile element may be a wildcard there17:20
@clarkb:matrix.orgoh good point17:20
@goneri:matrix.orgClark: The key is different all the time and we removed the host keys from the image. But I suspect something similar, like two scripts internally generating the same host key files. Anyway, if we can bulletproof nodescan(), this may resolve our problem too.17:21
@clarkb:matrix.orgcorvus: the other thing that occurred to me before we do a release is you/me/we might want to test running the zuul delete-key command against the names we already deleted in opendev to ensure it cleans stuff up as expected? The testing for that should be pretty good but that might be sensitive enough to double check?17:32
@clarkb:matrix.orgI figure we can do an export keys first then a delete against some of the names to clean up the old unneeded top level dirs17:32
@fungicide:matrix.organd then export keys again and compare vs the previous one to make sure there were no unintended changes17:38
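The export, delete, export, compare procedure suggested above could be checked with something like the following sketch. It assumes each export has already been loaded into a dict mapping a key path to its data; that layout, the function names, and the sample paths are assumptions made for illustration and are not taken from the Zuul documentation.

```python
def diff_exports(before, after):
    """Summarise how two key exports differ."""
    before_paths, after_paths = set(before), set(after)
    common = before_paths & after_paths
    return {
        "removed": sorted(before_paths - after_paths),
        "added": sorted(after_paths - before_paths),
        "changed": sorted(p for p in common if before[p] != after[p]),
    }


def only_expected_removed(before, after, expected_removed):
    """True if the delete removed exactly the expected paths and nothing else."""
    diff = diff_exports(before, after)
    return (
        diff["removed"] == sorted(expected_removed)
        and not diff["added"]
        and not diff["changed"]
    )


# Hypothetical sample data standing in for two exports.
before = {
    "/keys/old-project": "secret-a",
    "/keys/project-one": "secret-b",
    "/keys/project-two": "secret-c",
}
after = {
    "/keys/project-one": "secret-b",
    "/keys/project-two": "secret-c",
}

print(diff_exports(before, after))
print(only_expected_removed(before, after, ["/keys/old-project"]))
```

A non-empty "added" or "changed" list would indicate that the delete touched more than the intended names.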
@fungicide:matrix.orgtristanC: do you think https://review.opendev.org/795419 is probably safe for sf? since it's not testing that change any longer i wouldn't want to assume, but it doesn't seem like it should alter any default behaviors17:46
@tristanc_:matrix.orgfungi: it is probably safe for sf; this is not something we have control over, e.g. it depends on how the nodesets are defined.18:05
@fungicide:matrix.orgthanks!18:09
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467619:53
@tristanc_:matrix.orgcorvus: could you please check 814676 when you have a moment, I think that should fix the issue about re-using devstack job by not importing its pragma.19:54
@jim:acmegating.comtristanC: will do!19:57
@tristanc_:matrix.orgfungi: by the way, it seems like running setup.py is now deprecated according to: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html#summary20:24
@tristanc_:matrix.orgoh you may already know about it, I just learned about it today :)20:28
-@gerrit:opendev.org- Zuul merged on behalf of Tim Burke: [zuul/zuul-jobs] 795419: build-python-release: Add flag for whether to build a wheel or not https://review.opendev.org/c/zuul/zuul-jobs/+/79541920:32
@clarkb:matrix.orgright but setuptools is sticking around20:34
@clarkb:matrix.orgthe way we use it with pbr should be fine long term I think you just have to stop running python setup.py wheel20:34
@clarkb:matrix.orgetc20:34
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467620:43
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 760805: Add /components API endpoint to zuul-web https://review.opendev.org/c/zuul/zuul/+/76080521:25
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 814676: Support skipping pragma through tenant config include list https://review.opendev.org/c/zuul/zuul/+/81467622:13
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 806312: Update Docker and bindep for Bullseye base images https://review.opendev.org/c/zuul/nodepool/+/80631222:15
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 814830: Switch to Python 3.9 images https://review.opendev.org/c/zuul/nodepool/+/81483022:26
@iwienand:matrix.orgjust looking at https://review.opendev.org/814830 as it's building23:08
@iwienand:matrix.org   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND23:09
                     11406 root      20   0  387172 191572  14508 R  99.7   2.4  10:23.30 pip
@iwienand:matrix.orgpip running under qemu is just flat out; but it's not building anything23:09
@iwienand:matrix.orgso, not sure what we could do to make buildx faster23:10
@iwienand:matrix.orgi don't think separating the builds would make any difference; it's really single cpu performance limiting us here23:14
@fungicide:matrix.org> <@tristanc_:matrix.org> oh you may already know about it, I just learned about it today :)23:15
yep, my personal projects already use https://pypi.org/project/build/ (which is still calling setup.py in the background, but i'm no longer invoking it directly)
@foodster:matrix.orgHello! I am having a hard time scheduling jobs in a node pool..23:16
I have a nodeset defined with group "worker-nodes" which has 2 nodes and in the job's run playbook I am setting hosts as "worker-nodes"..
problem is that the job is being executed on both the nodes whereas I want just one of the nodes to run it..is this the expected behavior? or is there another config that I need to set for such a behavior?
@iwienand:matrix.org@foodster:matrix.org: it kind of sounds like your playbook is written to run on hosts:all maybe?  so it's running on both nodes?23:19
@foodster:matrix.orgno the playbook has "hosts: worker-nodes"23:22
@iwienand:matrix.org@foodster:matrix.org: well that would be both nodes...?23:26
@foodster:matrix.orgyeah worker-nodes group has both the nodes23:27
@iwienand:matrix.orgyou probably want something a little like https://opendev.org/opendev/system-config/src/branch/master/zuul.d/system-config-run.yaml#L5123:28
@foodster:matrix.orgso I have 2 nodes..I want the job to be executed on either of them but not on both of them..is that how groups work? I am basically trying to distribute jobs on those nodes23:31
@iwienand:matrix.orgin a word no; if you have a nodeset with multiple nodes, that is intended for one job that requires multiple nodes.  so when you actually want to set things up on two machines at once in one test, like devstack multi-node tests, or maybe something more generic like where you'd have a db node, a webserver node, an api node, and your test sets them all up and runs it at the same time23:36
@iwienand:matrix.orgin terms of "run on one or the other" -- that's something nodepool will sort out.  it has the pool of resources and picks the node (or, nodes) to run the job on23:37
@foodster:matrix.orggot it..thanks..I think either through nodepool or via some ansible configuration23:43
@iwienand:matrix.orgyeah, your nodepool labels should probably have things that describe the node.  in opendev we're very focused on the underlying distro the node provides so we have labels that correspond to that https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl01.opendev.org.yaml#L1723:49
@iwienand:matrix.organd then your nodeset/node names would be more symbolic, about what's running on that node23:49
-@gerrit:opendev.org- Zuul merged on behalf of Douglas Viroel: [zuul/zuul-jobs] 813253: Add FIPS enable multinode job definition https://review.opendev.org/c/zuul/zuul-jobs/+/81325323:54
