Thursday, 2021-11-11

@spamaps:spamaps.ems.hostHrm. nodepool marks nodes ready about 30s before they can be SSH'd to properly...00:24
@spamaps:spamaps.ems.host(So the ansible setup task times out)00:26
@jim:acmegating.comshould be an ssh check in nodepool unless it's disabled00:26
@spamaps:spamaps.ems.hostIt just checks the port00:26
@spamaps:spamaps.ems.hostthe user can't SSH yet00:26
@jim:acmegating.comit should get the keys00:26
@spamaps:spamaps.ems.hostcloud-init problems00:27
@spamaps:spamaps.ems.hostIt seems much faster on 20.04..00:27
@jim:acmegating.comah, it starts with keys then changes them00:27
@spamaps:spamaps.ems.hostWell it starts with a functional sshd, but no authorized_keys .. and then installs them some time later.00:27
@spamaps:spamaps.ems.hostJust late enough on 18.04 that if the node hasn't been sitting about a minute before the job starts.. it fails.00:28
@spamaps:spamaps.ems.host20.04 seems to get it done much faster.00:28
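(Editorial sketch, not part of the conversation.) The exchange above distinguishes "the SSH port is open" from "the user can actually log in". A minimal way to illustrate that difference, assuming an `ssh` client on the PATH and key-based auth, is to retry a real login instead of probing the port. The function names, retry counts and delays here are hypothetical and are not nodepool's implementation.

```python
import subprocess
import time
from typing import Callable


def ssh_login_works(host: str, user: str, timeout: int = 5) -> bool:
    """Return True only if a real, key-based SSH login succeeds."""
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",              # fail instead of prompting for a password
        "-o", f"ConnectTimeout={timeout}",
        "-o", "StrictHostKeyChecking=no",   # assumption: throwaway test nodes
        f"{user}@{host}",
        "true",                             # trivial remote command
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout + 5)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A caller would use something like `wait_until_ready(lambda: ssh_login_works("10.0.0.5", "zuul"))` (address and user are placeholders). A login probe also tolerates a network that drops for a while, since it simply keeps retrying.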
@jim:acmegating.comspamaps: did you see the nodepool.yaml i have for gerrit?00:28
@spamaps:spamaps.ems.hostNo?00:28
@jim:acmegating.comspamaps: https://gerrit.googlesource.com/zuul/ops/+/refs/heads/master/nodepool/nodepool.yaml  there's some comments in there about it00:29
@jim:acmegating.comspamaps: it doesn't look like that's necessarily a solution to the problem you're seeing.  but it's more data.00:30
@spamaps:spamaps.ems.hostOh this isn't an ssh thing..00:31
@spamaps:spamaps.ems.hostthe network actually seems to go down for a minute.00:31
@spamaps:spamaps.ems.host🤷00:31
@jim:acmegating.comi could see how that would affect connectivity :)00:31
@spamaps:spamaps.ems.hostI'm going to side-step this .. I don't actually want to use this particular image.. In fact I really want to run 90% of things in gke pods anyway.00:31
@spamaps:spamaps.ems.hostI can get my job to try and run on 20.04 but because of the way they've configured networking and DNS our internal ubuntu mirrors don't have 20.04 .. so.. yeah.. n/m ... onward to GKE.00:32
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817479: Fix buildset config_errors https://review.opendev.org/c/zuul/zuul/+/817479 00:34
@spamaps:spamaps.ems.hostHrm, I think I may have to invent a new kubernetes type00:36
@spamaps:spamaps.ems.hostThere's no way I'm getting namespace-creation permissions00:36
@spamaps:spamaps.ems.hostand I don't even want that00:36
@spamaps:spamaps.ems.hostI just want pods... lots of pods.. in one namespace.00:37
@jim:acmegating.comspamaps: is https://zuul-ci.org/docs/nodepool/kubernetes.html#value-providers.[kubernetes].pools.labels.type.pod not sufficient?00:39
@spamaps:spamaps.ems.hostNo, that always creates a namespace00:39
@jim:acmegating.comoh i see, that makes a namespace for the pod.  nm.00:39
@spamaps:spamaps.ems.hostI'm going to see if I can add a namespace: setting to it, and then skip the createNamespace part.00:40
@spamaps:spamaps.ems.hostThat may actually be sufficient.00:40
@jim:acmegating.compretty sure there was a good reason for the way that was made.  so expect some discussion on that.  :)00:40
@spamaps:spamaps.ems.hostSure. For me, I just want to run stuff in a container with a particular image and thus contents. I can't expect admin level controls and don't need separation at the namespace level anyway.00:41
@spamaps:spamaps.ems.hostI can imagine that a reason not to do it this way is that one pod can probably spy on other running pods if the k8s perms aren't set up just so.00:42
@spamaps:spamaps.ems.hostBut I think that can be handled by locking down the particular namespace to deny that.00:43
@jim:acmegating.comyeah.  i'm not sure if we documented the reasons; so we might have to poll some people to figure it out.  i agree what you want is sensible, and that implementing it and pushing up a change for review is a great way to start the discussion.  just wanted to flag that as potentially something that might not sail right through :)00:44
@spamaps:spamaps.ems.hostYeah, if I could get namespace creation access I wouldn't care, I'd just let this happen. :)00:45
@spamaps:spamaps.ems.hostBut.. trying to take advantage of the well-managed k8s clusters we already have if I can. :)00:45
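(Editorial sketch, not part of the conversation.) The idea floated above, an optional `namespace:` setting that skips namespace creation, comes down to one branch. The snippet below is a hypothetical illustration of that branch only. The function name, the `create_namespace` callable and the generated namespace format are assumptions, not nodepool's actual code.

```python
from typing import Callable, Optional


def resolve_pod_namespace(
    node_id: str,
    configured_namespace: Optional[str],
    create_namespace: Callable[[str], None],
) -> str:
    """Pick the namespace a pod node should be launched in."""
    if configured_namespace:
        # Proposed behaviour: reuse a shared, pre-existing namespace,
        # so no namespace-creation permission is needed.
        return configured_namespace
    # Behaviour described in the chat: one fresh namespace per node.
    namespace = f"zuul-ci-{node_id}"  # hypothetical naming scheme
    create_namespace(namespace)
    return namespace
```

The isolation concern raised in the discussion still applies: with a shared namespace, pods are kept apart only by whatever permissions are locked down on that namespace.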
@spamaps:spamaps.ems.hostFor what I'm trying to prove right now.. I can probably run on VMs for a while anyway. We have a big monorepo build that is killing the jenkins nodes we have access to (32GB 16 core vms). So I want to just break that up into 7 or so zuul jobs and have them all run on their own right-sized vms. But.. if I can't use k8s.. I have to figure out how to run the action containers on the vms.00:49
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com:00:53
- [zuul/zuul] 817484: Use a stable hash for ConfigurationErrorKeys https://review.opendev.org/c/zuul/zuul/+/817484
- [zuul/zuul] 817486: Identify the object in ZKObject (de)serialization errors https://review.opendev.org/c/zuul/zuul/+/817486
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817490: Log null change key deference https://review.opendev.org/c/zuul/zuul/+/817490 00:53
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 817343: Fix a bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/817343 04:20
-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 817343: Fix a bug in getting changed files https://review.opendev.org/c/zuul/zuul/+/817343 05:46
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 817518: Add an icon for each type of component to the components page https://review.opendev.org/c/zuul/zuul/+/817518 08:19
-@gerrit:opendev.org- Felix Edel proposed: [zuul/zuul] 817518: Add an icon for each type of component to the components page https://review.opendev.org/c/zuul/zuul/+/817518 08:26
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 814711: UI: Fix build time calculation for empty buildsets https://review.opendev.org/c/zuul/zuul/+/814711 09:04
@mordred:inaugust.com> <@spamaps:spamaps.ems.host> woo I got a test written and it even does some decorating10:20
did you eventually discover that you could run the unit tests on mac directly?
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 814717: UI: Ignore empty timestamps in build time calculation on buildset page https://review.opendev.org/c/zuul/zuul/+/814717 10:32
@avass:vassast.orgcorvus: spamaps regarding https://review.opendev.org/c/zuul/zuul-jobs/+/817291, can we try to make that more generic? I know we talked about the revoke-sudo role previously where we didn't want to revoke sudo on static nodes. The idea there was to configure something like `attribute.revoke_sudo: false` on the label or node in nodepool which then gets passed to the role as `nodepool.revoke_sudo` or `nodepool.attributes.revoke_sudo`. Maybe the sudoers file should be configured in nodepool too?11:22
@emacchi:matrix.orgHey folks, I've recently been using Zuul from openlab (for Gophercloud CI) and we added new CI jobs in .zuul.yaml of the project, but somehow Zuul still runs the old job. If I look at the footer of https://status.openlabtesting.org/status - I can see that the last reconfiguration was more than a month ago. Does Zuul need to be reconfigured when a new job is added?13:34
@fungicide:matrix.orgemacchi: typically, you should expect that timestamp to update any time a configuration change merges to a tracked repository/branch, or when a reconfiguration is requested (e.g. after updating the tenant config to add/remove repos), or when the scheduler was last restarted, whichever happened more recently13:49
@fungicide:matrix.orgif zuul is tracking the repository/branch and configured to load job configs from it, then it should notice when those configs change and reconfigure itself automatically13:50
@fungicide:matrix.orgit's possible this configuration error is preventing the scheduler from being able to complete tenant reconfiguration: https://status.openlabtesting.org/config-errors 13:53
@emacchi:matrix.org> <@fungicide:matrix.org> it's possible this configuration error is preventing the scheduler from being able to complete tenant reconfiguration: https://status.openlabtesting.org/config-errors 14:01
oh so if we fix the error, there is a good chance that zuul scheduler will successfully restart and load the new config.
@fungicide:matrix.orgemacchi: yes, well it should reload its configuration, normally a restart shouldn't be necessary for that14:04
@tobias.henkel:matrix.orgmordred, spamaps nodepool unit tests on mac? That should be possible but requires some hacks14:08
@tobias.henkel:matrix.orgbut nodepool is easier than zuul I think14:09
@tobias.henkel:matrix.orgbut it's been a while since I last ran the nodepool tests locally14:14
@apevec:matrix.orgemacchi: what is openlab and what Zuul do they run?14:43
@jim:acmegating.comapevec: https://openlabtesting.org/ 14:44
@apevec:matrix.orghmm doesn't say much "managed by open source community members"14:45
@emacchi:matrix.org> <@apevec:matrix.org> emacchi: what is openlab and what Zuul do they run?14:46
they run their own infra
@jim:acmegating.comapevec: was started by huawei.  mostly for openstack<->k8s integration testing14:46
@emacchi:matrix.orgwe'll probably leave it btw14:47
@emacchi:matrix.orgwe're discussing it (see openstack-discuss)14:47
@apevec:matrix.orginteresting they've chosen Zuul14:47
@spamaps:spamaps.ems.host> <@mordred:inaugust.com> did you eventually discover that you could run the unit tests on mac directly?15:07
Some, not all.
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 817491: Add repr to FrozenJob https://review.opendev.org/c/zuul/zuul/+/817491 15:23
@fungicide:matrix.org> <@apevec:matrix.org> interesting they've chosen Zuul16:33
you'll find them listed at https://zuul-ci.org/users.html with a link to an article/interview from several years ago
@fungicide:matrix.orgit explains some of their reasons for that choice16:33
-@gerrit:opendev.org- Tobias Henkel proposed: [zuul/zuul] 817604: DNM: Upload build requests async https://review.opendev.org/c/zuul/zuul/+/817604 17:15
@tobias.henkel:matrix.orgthis is an analysis that led to this change17:21
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817626: WIP: Remove RPC client from autohold tests https://review.opendev.org/c/zuul/zuul/+/817626 17:41
@tobias.henkel:matrix.orgcorvus: +2 with comment on 817196 18:58
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817640: Add some pipeline processing stats https://review.opendev.org/c/zuul/zuul/+/817640 19:03
@jim:acmegating.comtobiash, swest: ^ that may give us some interesting performance stats19:04
@jim:acmegating.comtobiash: good point; i think i'm willing to run that in prod on opendev for just a bit to see how bad it is :)19:08
@tobias.henkel:matrix.orgI think the badness of that scales with the number of pipelines and tenants :)19:09
@jim:acmegating.comyep.  but we'll get a baseline... some value that we don't know how to scale ;)19:10
@tobias.henkel:matrix.orgfirst item in gate will fail20:15
@tobias.henkel:matrix.orglooks like again the test_zookeeper_disconnect test case that's the first that failed20:16
@jim:acmegating.comsigh.  i'll try to dig into that more20:17
@jim:acmegating.comor maybe we should delete the test20:19
@jim:acmegating.comokay, the issue with that is the stats thread.  it starts the election after the reconnect, but it doesn't win it, and the cancel doesn't appear to have canceled the election.20:32
@iwienand:matrix.orgjust checking we're not in any state that merging https://review.opendev.org/c/zuul/nodepool/+/817482 to bump dib in nodepool would be a concern?21:23
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 817646: Re-order scheduler shutdown https://review.opendev.org/c/zuul/zuul/+/817646 21:23
@jim:acmegating.comianw: +2 i think it's clear21:24
-@gerrit:opendev.org- Zuul merged on behalf of Felix Edel: [zuul/zuul] 816807: Split up registerScheduler() and onLoad() methods https://review.opendev.org/c/zuul/zuul/+/816807 21:53
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-sphinx] 817650: Add :zuul:path support https://review.opendev.org/c/zuul/zuul-sphinx/+/817650 22:23
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-sphinx] 817650: Add :zuul:path support https://review.opendev.org/c/zuul/zuul-sphinx/+/817650 22:24
@harrymichal:matrix.orgHi folks! I'm wondering about the behaviour of artifacts in jobs. Do the artifacts live across pipeline invocations? E.g., a build job caches some data that speeds up future builds -> possible speed up of future pipeline runs.22:27
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 817652: Prevent duplicate config file entries https://review.opendev.org/c/zuul/zuul/+/817652 22:32
@jim:acmegating.commartymichal: yep, the info is stored in the sql database, so if a required artifact is satisfied by a build that's currently running, it gets the data from memory; if it's satisfied by a previous build, it gets it from the db.22:33
@clarkb:matrix.orgcorvus: ^ I think that change largely addresses the problem I discovered the other day with extra-config-paths potentially doubling up configs. However, the tests don't pass yet because I think they expose another bug where we early load configs in the scheduler before validating them. This means you get a much more cryptic error message than you would from the validator (my test should hopefully illustrate this). You might want to take a look at that to double check it isn't a larger issue22:33
@jim:acmegating.commartymichal: (you may know this, but just to be clear: zuul manages metadata about the artifact, like the storage location; it's up to jobs and roles to implement the actual storage and fetching of it; the zuul-jobs repo has some roles to help with that)22:34
@jim:acmegating.comClark: ack, thx22:34
@harrymichal:matrix.org> martymichal: (you may know this, but just to be clear: zuul manages metadata about the artifact, like the storage location; it's up to jobs and roles to implement the actual storage and fetching of it; the zuul-jobs repo has some roles to help with that)22:36
Didn't know that. So, I'll have to check those out before I start writing the config for the artifacts. Before I do so, does this require the project to acquire some actual storage space?
@harrymichal:matrix.orgAnyway, thanks for the clarification!22:42
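(Editorial sketch, not part of the conversation.) The answer above describes two things: artifact metadata is looked up from running builds first and from the database otherwise, and Zuul tracks only metadata such as the storage location, not the stored bytes. A toy model of that lookup order, with entirely hypothetical class and function names (this is not Zuul's code or schema), might look like this:

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class ArtifactRecord:
    """Metadata only: where the artifact lives, not its contents."""
    name: str
    url: str
    metadata: dict = field(default_factory=dict)


def find_artifact(
    name: str,
    running_builds: Iterable[ArtifactRecord],
    stored_builds: Iterable[ArtifactRecord],
) -> Optional[ArtifactRecord]:
    """Prefer in-memory records from running builds, then fall back
    to records persisted from previous builds."""
    for record in running_builds:
        if record.name == name:
            return record
    for record in stored_builds:
        if record.name == name:
            return record
    return None
```

Whatever `url` points at still has to be uploaded and fetched by the jobs and roles themselves, which is the part the zuul-jobs roles mentioned above are meant to help with.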
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/nodepool] 817482: Bump DIB to 3.15.1 https://review.opendev.org/c/zuul/nodepool/+/817482 22:59
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-sphinx] 817650: Add :zuul:path support https://review.opendev.org/c/zuul/zuul-sphinx/+/817650 23:07
@jim:acmegating.comianw: would you mind taking a quick look at https://review.opendev.org/817650 ?23:37
@jim:acmegating.comhttps://review.opendev.org/809300 is the use case (cc: tristanC)23:38
@iwienand:matrix.orgsure, just let me get into sphinx mode :)23:38
@jim:acmegating.comyeah... it's been a few years :)23:40
