Friday, 2022-09-09

-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 856246: Don't run cleanup playbooks after setup failure https://review.opendev.org/c/zuul/zuul/+/856246 08:52
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 855890: Don't try to report build w/o buildset to DB https://review.opendev.org/c/zuul/zuul/+/855890 08:55
@avass:vassast.orgClark: yeah looks like the governor was unregistering/reregistering a lot yesterday before we added another two executors, which also seem to have improved performance. But shouldn't that show up in statsd as fewer executors accepting work?09:46
@avass:vassast.orghmm maybe they did but the resolution wasn't good enough09:47
@fungicide:matrix.orgAlbin Vass: you can see it on ours, but we seem to have roughly 20-second granularity for the graphs: https://grafana.opendev.org/d/21a6e53ea4/zuul-status?orgId=1&viewPanel=21 11:32
@fungicide:matrix.orgi think most of the time it's the "too many builds starting" governor in our case11:33
@tristanc_:matrix.orgcorvus: swest thanks, that looks great!11:53
@avass:vassast.orgfungi: yeah i was mostly surprised by the difference it made even though we always had at least three executors accepting jobs13:04
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 849887: Add Nodepool in Zuul spec https://review.opendev.org/c/zuul/zuul/+/849887 15:49
@pearcetyler:matrix.orgAny idea what would cause gearman to fail to start? I keep getting timeouts in the web container waiting for gearman to start16:30
@clarkb:matrix.orgIn older zuul (note current zuul doesn't use gearman anymore) the default is that the scheduler forks off a gearman daemon. Maybe the scheduler isn't starting or something is already listening on port 4730 preventing the geard from doing so16:31
@pearcetyler:matrix.orgHmm, the scheduler seems to be starting but is hitting some errors like `2022-09-09 16:32:10,888 ERROR zuul.Scheduler: Exception loading ZKObject <zuul.model.PipelineSummary object at 0x7f649c06ad40> at /zuul/tenant/github-junipersquare/pipeline/gate/status`16:32
@clarkb:matrix.orgI think the geard fork happens well before that16:37
@clarkb:matrix.orgHere's an idea, is your scheduler much newer than your web? it may be that you've upgraded the scheduler to the point where it doesn't run gear anymore because it was removed. But if the web is older it may still be looking for it16:38
@pearcetyler:matrix.orgAh interesting. It just started after a `docker system prune -a --volumes` and pulling everything back down via `sudo -E docker-compose -p zuul up`. I wonder if that caused things to get out of sync?16:39
@clarkb:matrix.orgyou should be able to do a docker image list to see what versions of containers you've got. Or exec into the containers and check the `pip freeze` versions16:41
@gobi_g:matrix.orgHi,18:49
Can anyone explain how pipeline window and window-floor works?
Thanks
@clarkb:matrix.orgkarthi: it is similar to tcp slow start. The idea is as more changes merge the window size increases allowing more changes to run their jobs concurrently. When changes fail to merge the window is reduced. The idea behind it is that when the software is more stable it is safe to speculatively assign more test resources to testing changes. But when things are unstable you want it to be more limited as a gate reset discards every live job behind that failure19:02
@clarkb:matrix.orgianw: left a couple of questions/notes/concerns on https://review.opendev.org/c/zuul/zuul/+/855309 19:06
@gobi_g:matrix.orgThanks for the info. Does the window mean one pipeline?19:22
In the doc it mentioned the default window size is 20. Window floor 3.
But I can't run more than 3 pipelines at a single time.
@clarkb:matrix.org> <@gobi_g:matrix.org> Thanks for the info. Does the window mean one pipeline?19:24
>
> In the doc it mentioned the default window size is 20. Window floor 3.
> But I can't run more than 3 pipelines at a single time.
That implies to me that you've had enough failures that the window size has shrunk to the floor. If you want to force a larger number I would increase the floor. I think we set ours to 20 so it can grow from the default but not shrink smaller.
@gobi_g:matrix.orgBut I was able to run 3 pipelines every time even though my previous 6 pipelines failed. Is it because of the window-floor set to 3?19:28
@gobi_g:matrix.orgIs there a way to check the current available window value? Is there any API or command 19:29
@clarkb:matrix.org> <@gobi_g:matrix.org> Is there a way to check the current available window value? Is there any API or command 19:31
It is in the status.json file and the web status rendering shows it using different colored bubbles on the left side of the changes. The floor is 3 so it started at 20 then shrunk to 3 as the minimum size due to the previous failures. You can configure the floor to better suit your needs
@gobi_g:matrix.orgOkay I got it now. So, the floor is max pipelines you can run at the time of continuous failures or more failures. 19:34
@clarkb:matrix.org> <@gobi_g:matrix.org> Okay I got it now. So, the floor is max pipelines you can run at the time of continuous failures or more failures. 19:37
Yes the window is a dynamic size based on the current state of affairs. With tcp the idea is to maximize bandwidth utilization without overfilling the pipe. With zuul it's to ensure efficient use of resources to merge changes. You don't want to run jobs for 20 changes and have them restart continuously; that wastes a lot of resources
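[Editor's sketch] The window behaviour Clark describes can be modelled roughly as below. This is an illustrative model, not Zuul's own code: it assumes the documented pipeline defaults (window 20, window-floor 3, linear increase by 1 per merge, exponential decrease by a factor of 2 per failure), and the function name `next_window` is hypothetical.

```python
def next_window(window, merged, floor=3, increase=1, decrease_factor=2):
    """Return the window size after one change merges or fails to merge."""
    if merged:
        # Assumed linear growth: each successful merge widens the window.
        return window + increase
    # Assumed exponential shrink on failure, never dropping below the floor.
    return max(floor, window // decrease_factor)


window = 20
history = []
# Six consecutive failures followed by two successful merges.
for merged in [False] * 6 + [True] * 2:
    window = next_window(window, merged)
    history.append(window)
print(history)
```

Under these assumptions the window reaches the floor after the third failure and stays there until changes start merging again, which matches the "stuck at 3" behaviour reported above.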
@gobi_g:matrix.orgThank you. But in status.json I can't find the window details.19:40
@gobi_g:matrix.orgThanks for the detailed explanation 🙂. Is it a way to notify admin when window size decreased to the window-floor. To alert when continuous failures.19:49
@gobi_g:matrix.org* Thanks for the detailed explanation 🙂. Is there a way to notify admin when window size decreased to the window-floor. To alert when continuous failures.19:53
@clarkb:matrix.orgI don't think that is built in currently. You could poll it and trigger an alert off of that.19:54
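[Editor's sketch] One way to do the polling Clark suggests is to scan the parsed status document for queues whose window has reached the floor. The layout assumed here (a top-level `pipelines` list, each with `change_queues` entries carrying a `window` key) is an assumption about the status JSON and should be checked against your own deployment; `queues_at_floor` and the sample data are hypothetical.

```python
import json


def queues_at_floor(status, floor=3):
    """Return (pipeline, queue, window) tuples for queues at or below the floor."""
    hits = []
    for pipeline in status.get("pipelines", []):
        for queue in pipeline.get("change_queues", []):
            window = queue.get("window")
            # A missing or zero window is treated here as "no window configured".
            if window and window <= floor:
                hits.append((pipeline.get("name"), queue.get("name"), window))
    return hits


# Hand-written sample in the assumed layout, not real output.
sample = json.loads("""
{"pipelines": [
  {"name": "gate", "change_queues": [
    {"name": "product1", "window": 20, "heads": []},
    {"name": "product2", "window": 3, "heads": []}]},
  {"name": "check", "change_queues": [
    {"name": "product1", "window": 0, "heads": []}]}
]}
""")
print(queues_at_floor(sample))
```

In practice `status` would be the JSON fetched periodically from the web service, and a non-empty result would be handed to whatever alerting tool is already in use.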
@gobi_g:matrix.orgOkay. During the failures zuul add the comment in Gerrit. Is there an option to send an email? Or integrate other tools to notify in  them?19:59
@jim:acmegating.comthere are several other reporters available, including email, mqtt, or elasticsearch, any of which could be used for notification20:00
@jim:acmegating.comthis doesn't exist, but adding a metric for window size would be a good idea i think.20:01
@gobi_g:matrix.orgCould you please share the reference or doc link reports 😅20:02
@jim:acmegating.comkarthi: https://zuul-ci.org/docs/zuul/latest/drivers/index.html 20:02
@gobi_g:matrix.orgThank you 20:06
@gobi_g:matrix.orgIs it okay to use the same label in different projects pipelines. 20:13
Project1:
Gate1
Project2:
Gate1
The issue was when I trigger a MR for project 1 it shows in both gate1 and gate2 for 2 seconds then it disappears from gate2
@gobi_g:matrix.org* Is it okay to use the same label in different projects pipelines. 20:13
Project1:
Gate1
Project2:
Gate2
The issue was when I trigger a MR for project 1 it shows in both gate1 and gate2 for 2 seconds then it disappears from gate2
@clarkb:matrix.orgI think that is a side effect of how zuul evaluates whether or not a change applies to a pipeline 20:14
@clarkb:matrix.orgBut yes in opendev tenants there is a single gate pipeline defined for everything in the tenant. This is actually important for cross repo gating20:14
@gobi_g:matrix.orgOkay. So pipelines need their own label to prevent this.20:17
@clarkb:matrix.orgNo, I think there may not be a way to prevent it currently 20:17
@clarkb:matrix.orgIt's just rendering that each change is evaluated by each pipeline and when it doesn't apply it gets removed before doing further work20:18
@gobi_g:matrix.org20:20
Project1:
Pipeline: Gate1
Label: Gate1
Project2:
Pipeline: Gate2
Label: Gate2
If I configure like this it won't show the gate2 change in gate1 pipeline right?
@clarkb:matrix.orgIt will still show momentarily while gate1 and gate2 both evaluate if the change applies to them. I would define a single gate pipeline instead of multiple20:22
@gobi_g:matrix.orgOkay. In my case project1 and project2 not related to each other both have different kind of test cases and different test setups.20:23
@clarkb:matrix.orgThat's fine. The pipeline configuration should express when jobs run not what kind of test or relationships between repos20:24
@clarkb:matrix.orgThe relationships between repos are set via queues and the jobs are controlled by individual projects20:24
@gobi_g:matrix.orgYeah got your point. My case is like two different products. If I set it in a single pipeline because product 2 issues product 1 will get affected that's why.20:27
@gobi_g:matrix.org* Yeah got your point. My case is like two different products. If I set it in a single pipeline because of product 2 issues product 1 will get affected that's why.20:28
@fungicide:matrix.orgkarthi: for example, if you look at the gate pipeline shown here at the moment there are related and unrelated projects in a single gate pipeline, but only the related ones are forming sequenced queues, the others merge independently: https://zuul.opendev.org/t/openstack/status 20:28
@clarkb:matrix.org> <@gobi_g:matrix.org> Yeah got your point. My case is like two different products. If I set it in a single pipeline because of product 2 issues product 1 will get affected that's why.20:28
That isn't true unless you want them to have different trigger definitions
@clarkb:matrix.orgYou only need a different pipeline if the trigger conditions are different 20:29
@fungicide:matrix.orgfor example, keystone and swift share a dependent queue called "integrated" but the puppet-neutron change is in its own queue since it has no queue shared with the other projects showing there20:29
@fungicide:matrix.orgthe two python-cinderclient changes have formed a queue together though because they're for the same project, so they implicitly share a queue for that project20:30
@clarkb:matrix.orgI suppose it is the reporter actions that matter too20:31
@clarkb:matrix.orgDifferent pipelines when triggers or reporting actions differ. Otherwise you can share20:31
@fungicide:matrix.organd if the concern is about windows, the window calculations are independent per queue so if one queue in the gate pipeline has a lot of failures its window will shrink, but another queue in the same pipeline may be succeeding a lot and so its window could be growing at the same time20:34
@fungicide:matrix.orgi guess the thing that may not be clear is that queues form within a dependent pipeline, but you can have multiple queues in one pipeline20:35
@gobi_g:matrix.orgOkay. My doubt was 20:36
If I'm triggering 2 MRs for product 1(project 1) always pass
MRs for product 2(project 2) always fail
In this case will it affect future MRs of project 1 because of the window size shrinking
@gobi_g:matrix.orgYou explained it nice 👍20:37
@fungicide:matrix.orgif you've declared that those projects should share a queue then yes, but just using the same pipeline definition doesn't mean they share a queue20:37
@fungicide:matrix.orgif they share a queue they will both affect the window sizing for their shared queue. if they don't share a queue then their window sizes are determined independently. but they can both still use the same pipeline20:38
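[Editor's sketch] fungi's point about per-queue windows can be illustrated with a toy model: the same sequence of results is replayed once with the two projects in separate queues and once with them sharing a queue, both inside a single pipeline. The growth and shrink rules (+1 per merge, halve per failure, floor 3, start 20) are assumed defaults, and `simulate` and the queue names are hypothetical.

```python
from collections import defaultdict


def simulate(results, queue_of, start=20, floor=3):
    """Replay (project, merged) results and return the final window per queue."""
    windows = defaultdict(lambda: start)
    for project, merged in results:
        queue = queue_of[project]
        if merged:
            windows[queue] += 1  # assumed linear increase
        else:
            windows[queue] = max(floor, windows[queue] // 2)  # assumed halving
    return dict(windows)


# project1 always merges, project2 always fails, alternating three times.
results = [("project1", True), ("project2", False)] * 3

separate = simulate(results, {"project1": "q1", "project2": "q2"})
shared = simulate(results, {"project1": "shared", "project2": "shared"})
print(separate)
print(shared)
```

With separate queues, project1's window keeps growing while project2's collapses to the floor; with a shared queue, project2's failures drag the common window down for both.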
@fungicide:matrix.orgif you have projects which are fairly tightly coupled (perhaps you run integration test jobs which install them both and then test they work together) then you'll usually want them to share a queue. if they're fairly independently-developed projects which only maybe consume strictly versioned releases of one another, then that's common for them to have their own separate pipelines20:41
@gobi_g:matrix.orgGot it. From your explanation it looks like we can even have a single pipeline for independently developed projects with proper job configuration and separate queues.20:45
@gobi_g:matrix.orgWhen both projects need the same kind of trigger*20:46
@fungicide:matrix.orgexactly... if they share triggering patterns and reporting, then having the same queue for them is sensible. that openstack tenant i pasted the link to has many hundreds of different repositories, some of which share queues and many of which don't, but we have basic archetypes of pipeline definitions they all use like check, gate, experimental, post, promote, release, deploy, periodic...21:00
@fungicide:matrix.org * exactly... if they share triggering patterns and reporting, then having the same pipeline for them is sensible. that openstack tenant i pasted the link to has many hundreds of different repositories, some of which share queues and many of which don't, but we have basic archetypes of pipeline definitions they all use like check, gate, experimental, post, promote, release, deploy, periodic...21:00
@gobi_g:matrix.org fungi: Clark: Thanks for all the detailed explanations 🎉21:11
@fungicide:matrix.organy time!21:11
