Tuesday, 2022-05-10

@y2kenny:matrix.orgHi, is there a way to silence Zuul on branches where it should not do anything?  For some reason Zuul keeps reporting "This change depends on a change that failed to merge", scoring Verified-1 and blocking a change on a branch that it was not configured for.  (The branch is only configured with a submit job.)14:30
@avass:vassast.orgKenny Ho: you mean branches where zuul shouldn't run anything at all? :)14:32
@avass:vassast.orgIf so I started on this to configure zuul to ignore certain branches completely: https://review.opendev.org/c/zuul/zuul/+/83755914:32
@y2kenny:matrix.orgAlbin Vass: that would also be useful in other instances (I have other situations where Zuul is commenting/scoring when there is no configuration.)  For the current case, the branch is actually configured with some submit jobs and I don't know why Zuul is doing anything before submission.14:34
@y2kenny:matrix.orgthe issue here is that I am getting complaints about Zuul getting in the way of other people's work, and I can't explain the issue or stop it.14:35
@avass:vassast.orgKenny Ho: There's this for silencing merge-failures: https://www.zuul-ci.org/docs/zuul/latest/config/pipeline.html#attr-pipeline.merge-conflict but I don't think it's possible to configure that for specific branches14:36
@y2kenny:matrix.orgAlbin Vass: Oh that one should be useful for my other cases.  But in this particular case it's kind of weird.  I have only seen "This change depends on a change that failed to merge" on the gate pipeline (i.e. when Zuul tries to auto submit things)14:38
@y2kenny:matrix.orgThere is a gate pipeline configured for the repository but there is a branch filter associated with the jobs.  So I have no clue why Zuul even tries anything there.14:39
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/nodepool] 839226: Do not reset quota cache timestamp when invalid https://review.opendev.org/c/zuul/nodepool/+/83922614:42
@avass:vassast.orgKenny Ho: I think zuul reports something like that from merge failures when it tries to load config from a branch14:42
@avass:vassast.orgbut eh, maybe not "depends on a change that failed to merge" 14:42
@y2kenny:matrix.orgI think that's a good hypothesis for my other issues as well.  But along that line of thought, the project is already included in the tenant config with "include: []"14:44
@y2kenny:matrix.orgso I am not sure why Zuul would try to load any config there either14:45
@clarkb:matrix.orgKenny Ho: you are trying to make the parent change be ignored by Zuul? but you get reports on the child change that zuul cannot merge the parent?14:45
@y2kenny:matrix.orgI am trying to make that branch ignored by Zuul entirely.  Zuul shouldn't be doing anything with the parent commit or the child commit pre-submit.14:47
@y2kenny:matrix.orgbut on the child commit it is reporting Verified-1 and "This change depends on a change that failed to merge".  The parent commit is a merge commit but I am not sure if that means anything.14:48
@clarkb:matrix.orgyou should remove the pipeline configs for the branch14:48
@clarkb:matrix.orgthen zuul should ignore it14:48
@y2kenny:matrix.orgThe only other time I have seen this is on branches with a gate pipeline when Zuul tries to auto submit a patch that is not properly rebased.14:49
@clarkb:matrix.orginclude: [] applies at an entire repo level not per branch iirc. If you are wanting zuul to apply to some branches but not others you need to remove the configs for the specific branch14:49
@y2kenny:matrix.orgok I should be more clear (but maybe this is where my confusion is.)  The include is applied to the repo in question.  The repo does not contain zuul configs.  The zuul config for this repo is stored in a separate repo.  (Does this nullify the include: []?)14:51
@clarkb:matrix.orgThe include: [] talks about where to load configs from. Not what to apply them to14:52
@clarkb:matrix.orgIt says "do not load any zuul.yaml configs from this repository"14:52
@clarkb:matrix.orgbut if you define configuration for that repository in other repos it can be loaded from those repos instead14:52
@clarkb:matrix.orgyou should remove that configuration if you do not want zuul to operate on the repo14:52
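Clark's distinction between where config is loaded from and what it applies to can be sketched as a toy model in Python. This is an illustration only, not Zuul's code: the `Repo` class, the `projects_configured` function, and the repo names are all hypothetical. Only the behaviour comes from the discussion above: a repo with `include: []` contributes no config of its own, but it can still be the target of config loaded from another repo.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    name: str
    # Config classes that may be loaded FROM this repo; [] mirrors "include: []".
    include: list = field(default_factory=lambda: ["project", "job"])
    # Project stanzas stored in this repo, keyed by the repo they apply TO.
    project_stanzas: dict = field(default_factory=dict)


def projects_configured(repos):
    """Return {target_repo: [pipelines]} from every repo allowed to supply config."""
    configured = {}
    for repo in repos:
        if "project" not in repo.include:
            continue  # nothing is loaded from this repo
        for target, pipelines in repo.project_stanzas.items():
            configured.setdefault(target, []).extend(pipelines)
    return configured


repo_a = Repo("repo-a", include=[])                              # loads nothing itself
repo_b = Repo("repo-b", project_stanzas={"repo-a": ["gate"]})    # configures repo-a
result = projects_configured([repo_a, repo_b])
print(result)
```

In this model repo-a still ends up with a gate pipeline, because the stanza lives in repo-b; removing that stanza is what makes repo-a unconfigured.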
@y2kenny:matrix.orgum... I think I am confusing things by talking about multiple issues at once.   I have repo A that is an open source project that has not adopted Zuul.  It is included in the Zuul tenant with include: [].  I have zuul config in repo B that defines jobs and triggers.  In repo B, various pipelines and triggers are defined for some branches of repo A but for some reason, Zuul is acting on branches and events that are not defined by the zuul configs in repo B.15:06
@clarkb:matrix.orgKenny Ho: Ok that helps. Zuul should only run jobs on the branches that you have specified to run the jobs on. However, in order to make that determination I believe it does a minimal amount of processing for all events on all branches. And that includes merge checking15:08
@clarkb:matrix.orgIn this case it seems merging is failing for one reason or another and it is reporting that information?15:09
@y2kenny:matrix.orgthis seems like a second-order effect of that.  I have certainly seen Zuul attempting to merge things and reporting merge issues all over the place.  But this seems to be happening to the child commit of the parent commit that can't merge.15:10
@y2kenny:matrix.orgI am not sure if it's the same kind of things or something unrelated15:11
@clarkb:matrix.orgI think it may be the same issue if the parent can't merge because merging a child implies also merging the parent15:11
@y2kenny:matrix.orgI have seen valid "This change depends on a change that failed to merge" message in other context, but it feels weird in this context15:11
@y2kenny:matrix.orgbut it could be just a mis-routed exception pointing to a confusing error message15:11
@clarkb:matrix.orgBut I'd need to look at logs to confirm that. You should be able to find the event logs for the child event and see what decisions were made15:11
@y2kenny:matrix.orgok I will watch out for that15:12
@y2kenny:matrix.org> <@clarkb:matrix.org> Kenny Ho: Ok that helps. Zuul should only run jobs on the branches that you have specified to run the jobs on. However, in order to make that determination I believe it does a minimal amount of processing for all events on all branches. And that includes merge checking17:26
Clark: when you said "minimal amount of processing for all events on all branches", what kind of processing are they? Can you point me to a particular section of the code base for me to read?
@clarkb:matrix.orgKenny Ho: I think https://opendev.org/zuul/zuul/src/branch/master/zuul/scheduler.py#L2261-L2287 is it. Notice the very end of that function is what checks if the matchers match but there is stuff done prior to that17:52
@y2kenny:matrix.orgOn the issue of the scheduler getting overwhelmed, I have turned on the debug log but I am not sure there is much more info I can get out of it.  Without debug, I see repeated logs of "Adding change <Change > to queue <ChangeQueue... > in pipeline" (lots and lots of them.)18:26
@ecsantos:matrix.orgHello folks, just a quick question: how can I configure pipeline.success and pipeline.failure so that Zuul comments on changes but doesn't leave a label value (e.g. Verified: +1)? My team is configuring a third-party CI but we don't want to change the Verified label on changes18:26
@y2kenny:matrix.orgwith debug log on, I see additional logs like "Checking for changes needed by <Change>"18:26
@y2kenny:matrix.orgChange < ...003> needs change <...002>: Needed change is already ahead in the queue18:27
@y2kenny:matrix.orgOther logs I noticed just now... "Running Tarjan's algorithm on current dependencies..."18:30
@y2kenny:matrix.orgall of these are on a commit series that will end up being noop18:30
@clarkb:matrix.orgecsantos: the verified vote is distinct. Compare https://opendev.org/openstack/project-config/src/branch/master/zuul.d/pipelines.yaml#L45-L57 to https://opendev.org/openstack/project-config/src/branch/master/zuul.d/pipelines.yaml#L388-L391 Note that the success and failure messages may not be a thing anymore; the expectation is that you always get the normal message now? I could be wrong about that though18:32
@clarkb:matrix.orgKenny Ho: you should get event ids on the debug logs. If you grep for the event id across the service logs you'll get a pretty good picture of what each individual event ends up running. From that you can work backwards to see how to shut off extra activity if possible18:33
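The grep-by-event-id approach can be sketched in Python for cases where the lines from several services should be collected together. This is a minimal sketch under assumptions: the function name `lines_for_event`, the sample log lines, and the event ids are made up for illustration, and no particular Zuul log format is implied beyond the event id appearing somewhere in the line.

```python
def lines_for_event(named_logs, event_id):
    """Return (log name, line) pairs whose text mentions event_id."""
    matches = []
    for name, lines in named_logs:
        for line in lines:
            if event_id in line:
                matches.append((name, line.rstrip("\n")))
    return matches


# Made-up sample lines; a real run would pass open file handles instead,
# e.g. ("scheduler", open("/path/to/scheduler-debug.log")).
sample_logs = [
    ("scheduler", ["DEBUG [e: abc123] Checking for changes needed by <Change>\n",
                   "DEBUG [e: def456] unrelated event\n"]),
    ("merger", ["DEBUG [e: abc123] merging\n"]),
]
found = lines_for_event(sample_logs, "abc123")
for name, line in found:
    print(f"{name}: {line}")
```

Keeping the service name next to each line makes it easier to see which component did the work for a given event.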
@ecsantos:matrix.orgClark: Interesting, gonna try the connection name with the empty dicts, that looks like it'd work18:34
@y2kenny:matrix.orgClark: I am grepping for the event id but there's just so much data... almost feels like something recursive is running.  i.e. let's say someone pushed a patchset of commit 1->2->3->4 (with commit 1 being the oldest)... I see repeated logs of "Checking for changes needed by change 1" and then for change 2, there's the same thing for change 2 but I will also have Change 1 needs Change 1 and "Needed change is already ahead in the queue"18:39
@y2kenny:matrix.org(but instead of a patchset of 4 commits, it's a patchset of 30 commits)18:40
@clarkb:matrix.orgyes if you push a stack it will look that way as zuul handles those events as discrete items18:40
@clarkb:matrix.orggerrit emits 4 events for patchset created when you push a stack. It isn't a single event18:40
@y2kenny:matrix.org4 events per pipeline?18:41
@clarkb:matrix.orgit's one event per change. And if you have multiple changes then multiple events. Then for each event each pipeline considers if it applies to itself18:42
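The fan-out Clark describes (one event per change, then every pipeline considering every event) is simple arithmetic, sketched here in Python. The function name `pipeline_evaluations` and the change names are hypothetical; the pipeline names check and gate are the ones used as examples in this conversation.

```python
from itertools import product


def pipeline_evaluations(changes, pipelines):
    """One event per change; each pipeline then considers each event."""
    return list(product(changes, pipelines))


stack = [f"change-{n}" for n in range(1, 5)]   # a stack of 4 changes
pipelines = ["check", "gate"]                  # example pipeline names
evaluations = pipeline_evaluations(stack, pipelines)
print(len(stack), "events,", len(evaluations), "pipeline evaluations")
```

With a 30-change stack and the same two pipelines the count becomes 30 events and 60 evaluations, which is consistent with the volume of log lines seen for a single push.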
@y2kenny:matrix.orgdoes that also multiply across scheduler if I have multiple scheduler? is the division of labour between scheduler before or after this?18:43
@y2kenny:matrix.orgbetween schedulers*18:44
@clarkb:matrix.orgthe event can be handled by multiple schedulers. But only one scheduler will handle the event for a specific pipeline18:45
@clarkb:matrix.orgthat means scheduler1 can process the event for pipeline check and scheduler2 can process the event for pipeline gate18:46
@y2kenny:matrix.orgum... ok...18:49
@y2kenny:matrix.orgwhat is the meaning of "locked pipeline"?  That's another log item that I've noticed.18:50
@clarkb:matrix.orgthat is what ensures multiple schedulers don't process a pipeline at the same time. So in my example scheduler 1 would lock the check pipeline and process it preventing scheduler2 from processing it so scheduler 2 would look at the next pipeline (gate) and lock it then process it18:53
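The locking behaviour in Clark's example can be sketched as a toy model in Python. It is not Zuul's implementation: a real multi-scheduler deployment needs a lock shared across processes, and the in-process `threading.Lock` here only stands in for that. The function name `claim_next_pipeline` is hypothetical.

```python
import threading


def claim_next_pipeline(locks, order):
    """Return the first pipeline whose lock can be taken without waiting, else None."""
    for name in order:
        if locks[name].acquire(blocking=False):
            return name
    return None


locks = {"check": threading.Lock(), "gate": threading.Lock()}
order = ["check", "gate"]

first = claim_next_pipeline(locks, order)   # scheduler 1
second = claim_next_pipeline(locks, order)  # scheduler 2, while scheduler 1 still holds its lock
third = claim_next_pipeline(locks, order)   # a third scheduler finds nothing free
print(first, second, third)

locks[first].release()  # scheduler 1 finishes processing its pipeline
```

The non-blocking acquire is the point of the sketch: a scheduler that finds a pipeline locked skips it and moves on instead of waiting.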
@y2kenny:matrix.orgok so that's probably normal18:54
@y2kenny:matrix.orgdoes the scheduler contact gerrit directly or does it do so via the executor?18:55
@clarkb:matrix.orgthe scheduler does it directly18:56
@y2kenny:matrix.orgok, so if there's connection issue I should see it in the scheduler log18:56
@clarkb:matrix.orgyes18:58
@clarkb:matrix.orgFor github the callbacks are sent to zuul web and web adds them to zookeeper then the scheduler processes them18:58
@clarkb:matrix.orgbut for gerrit it listens to the event stream over ssh directly from the scheduler18:58
@y2kenny:matrix.orgIs there a way to inspect the "ChangeQueue" of a pipeline?  I am wondering what is causing all the log message "Needed change is already ahead in the queue"19:25
@y2kenny:matrix.orgThis is the kind of things that is filling up my log:19:31
https://paste.openstack.org/show/bVwB9QDF3tmYx9qUNjSA/
@y2kenny:matrix.orgI am guessing zuul is not getting what it needs from Gerrit in a timely fashion but I am wondering if there's something I can configure to get zuul to back off a bit19:32
@clarkb:matrix.org> <@y2kenny:matrix.org> Is there a way to inspect the "ChangeQueue" of a pipeline?  I am wondering what is causing all the log message "Needed change is already ahead in the queue"19:33
That is just zuul recording that it needs that change ahead and it is already ahead so it can continue on. Otherwise you'd get log messages about enqueuing the change ahead
@clarkb:matrix.orgmaybe noisy but nothing to be concerned about19:33
@y2kenny:matrix.orgoh ok.19:33
@y2kenny:matrix.orgI am out of ideas...  this feels like zuul is basically ddos'ed by someone pushing a 30 commit patchset19:42
@y2kenny:matrix.orgI guess it's not entirely out of service... some of the new jobs are showing up about 30minutes later19:44
@y2kenny:matrix.orgbut then the 30-commit patchset is still ahead of the rest of the jobs waiting to be resolved into noop19:45
@y2kenny:matrix.orgoh and now all of a sudden all of the backlog in one of the pipelines disappeared...19:48
@y2kenny:matrix.orgI am very confused....19:48
@y2kenny:matrix.orgoh... here's another question.  When the scheduler talks to Gerrit, is it a git operation or is it just an ssh or REST api query?19:56
@y2kenny:matrix.orgI get that streamevent is an ssh thing, but not too sure about the processing afterward20:00
@clarkb:matrix.orgit does all of the above20:01
@clarkb:matrix.orgdepending on how you configure it. If you configure it with a rest token it will use that for some things that cannot be done over ssh. It will use ssh for event streams at least since those don't have an http analogue20:02
@clarkb:matrix.organd finally the mergers will fetch from gerrit using git protocols over ssh (and maybe http if you configure it to use http instead)20:02
@y2kenny:matrix.orgAh... I forgot about the merger20:03
@y2kenny:matrix.orgright20:03
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/nodepool] 839226: Do not reset quota cache timestamp when invalid https://review.opendev.org/c/zuul/nodepool/+/83922620:04
@y2kenny:matrix.orgAfter 2 hrs, all the backlog seems to have resolved on its own but I don't think I am any closer to preventing this from happening again... my gut feeling is that I have too many mergers and schedulers on an already heavily loaded Gerrit but I can't really tell for sure.  If I want to configure Zuul to talk to a Gerrit replica and master at the same time, is that possible with Zuul?  Do I just define multiple Gerrit connection?20:06
@y2kenny:matrix.orgconnections*20:06
@y2kenny:matrix.orgI am not sure how different events are coordinated though.  I think Jenkins can handle the replication-complete event but I don't recall coming across a similar thing in Zuul20:07
@clarkb:matrix.orgmultiple gerrit connections represent multiple logical gerrit installations20:09
@clarkb:matrix.orgI think zuul may currently expect you to address that with load balancing that is transparent to zuul20:09
@clarkb:matrix.orgit is possible that expectation is flawed20:09
@y2kenny:matrix.orgok....20:09
@y2kenny:matrix.orgI think it's a reasonable expectation but unfortunately in my current context, the gerrit deployment is outside of my control20:10
@y2kenny:matrix.orgI also have a strong feeling that a lot of the issues I am seeing are due to the gerrit connection backing up but there's not much I can do about it20:11
@y2kenny:matrix.organother question... is there any 'smartness' in merger usage?  For example, let's say I have 5 mergers, one of the mergers just finished processing an event from the linux repo20:13
@y2kenny:matrix.orgis zuul smart enough to go back to that merger for next event from linux repo?20:14
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:20:16
- [zuul/zuul] 804177: Add include- and exclude-branches tenant config options https://review.opendev.org/c/zuul/zuul/+/804177
- [zuul/zuul] 841336: Add always-dynamic-branches option https://review.opendev.org/c/zuul/zuul/+/841336
@y2kenny:matrix.orgalternatively, if there is a burst of 10 events, would the scheduler immediately try to spread the workload across all 5 mergers or would it try to process most of the events in the merger with the "warm cache"? 20:17
@avass:vassast.orgcorvus: nice :)20:18
@avass:vassast.orgcorvus: small comment on that patch20:20
@jim:acmegating.comAlbin Vass: thanks20:21
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:20:22
- [zuul/zuul] 804177: Add include- and exclude-branches tenant config options https://review.opendev.org/c/zuul/zuul/+/804177
- [zuul/zuul] 841336: Add always-dynamic-branches option https://review.opendev.org/c/zuul/zuul/+/841336
@y2kenny:matrix.orgJust to balance out my negativity a bit... here's a positive observation.  I just had a patchset (single commit) that spawned 176 jobs and Zuul is handling it like a champ.  (Jobs are completing as nodes are available, etc., etc.)22:02
@clarkb:matrix.orgwow I think our large job count is like 20 something23:40
@jim:acmegating.comi've worked with folks with job counts in the hundreds.23:46
@jim:acmegating.com * i've worked with folks with single-item job counts in the hundreds.23:47
