Wednesday, 2023-06-28

-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123  00:49
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 878039: Add implied-branch-matchers to tenant config https://review.opendev.org/c/zuul/zuul/+/878039  07:09
-@gerrit:opendev.org- Zuul merged on behalf of Benedikt Löffler: [zuul/zuul-jobs] 886186: Use zuul_workspace_root for prepare/mirror workspace test roles https://review.opendev.org/c/zuul/zuul-jobs/+/886186  18:36
@jpew:matrix.org: Zuul gets really unhappy if a gerrit server goes down.... it's sitting here refusing to start running _any_ jobs; and the scheduler has no logs  20:00
@jpew:matrix.org: Sorry, the logs part is not true  20:01
@jpew:matrix.org: I was looking at the wrong logs :)  20:01
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 887229: DNM: Test base-test https://review.opendev.org/c/zuul/zuul-jobs/+/887229  20:11
@clarkb:matrix.org: we definitely turn off our gerrit and zuul handles it just fine  20:13
@clarkb:matrix.org: But I don't know that we are checking that github keeps triggering jobs  20:13
@clarkb:matrix.org: but in general if the source of repo content isn't available I wouldn't expect jobs relying on that repo content to be able to run  20:13
@jpew:matrix.org: Clark: It looks like the queue processes really slowly when gerrit is down  20:20
@jpew:matrix.org: And we have a periodic-hourly pipeline that enqueues 100's of items..... and I think it's enqueuing them faster than the scheduler can remove them  20:21
@jpew:matrix.org: They don't actually do anything, it's just a lot of branches to check  20:21
@jpew:matrix.org: .. ya it's only removing about 5 items a minute from the queue.... maybe because each one has to time out?  20:24
@clarkb:matrix.org: I'd have to look at logs. I would expect mergers to fail more quickly but maybe they are all waiting for network connections to start  20:29
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123  20:31
@jpew:matrix.org: Ah, it sleeps 30 seconds between each try  20:35
@jpew:matrix.org: And it tries 3 times before failing (looks like?)  20:35
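
A minimal sketch of the retry behavior being described here, inferred from the chat rather than from Zuul's source: roughly three attempts with a 30-second sleep between them, which would explain why each stuck item takes on the order of a minute to fail while Gerrit is unreachable.

    import time

    def fetch_with_retries(operation, attempts=3, delay=30):
        # The attempt count and delay come from jpew's observation above,
        # not from Zuul's code. With a dead server, each item burns about
        # (attempts - 1) * delay seconds before the merger gives up on it.
        for attempt in range(1, attempts + 1):
            try:
                return operation()
            except ConnectionError:
                if attempt == attempts:
                    raise
                time.sleep(delay)
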
@jim:acmegating.com: if you need to clear the queue you can use the api via the client  20:36
@jpew:matrix.org: Ah right  20:36
@jim:acmegating.com: jpew: you could build a client with https://review.opendev.org/835319 and let us know how it goes :)  20:37
@jpew:matrix.org: Ya, I was just going to say "I don't see that option"  20:38
@jim:acmegating.com: (or you could mimic it with one of the queue dump scripts and then script running a single dequeue repeatedly)  20:39
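
A rough sketch of the workaround jim suggests: dump the pipeline contents from the tenant status endpoint, then run one zuul-client dequeue per item. The URL, tenant, and pipeline names below are placeholders, the status JSON layout is assumed from Zuul's web status API, and zuul-client needs an admin auth token configured (e.g. in zuul.conf) for dequeue to be accepted.

    import json
    import subprocess
    import urllib.request

    # Placeholders; substitute your deployment's values.
    ZUUL_URL = "https://zuul.example.com"
    TENANT = "example-tenant"
    PIPELINE = "periodic-hourly"

    # The tenant status endpoint returns the same JSON the web UI renders.
    with urllib.request.urlopen(f"{ZUUL_URL}/api/tenant/{TENANT}/status") as resp:
        status = json.load(resp)

    for pipeline in status["pipelines"]:
        if pipeline["name"] != PIPELINE:
            continue
        for queue in pipeline["change_queues"]:
            for head in queue["heads"]:
                for item in head:
                    # Periodic items are refs, not changes, so dequeue by ref.
                    subprocess.run(
                        ["zuul-client", "--zuul-url", ZUUL_URL, "dequeue",
                         "--tenant", TENANT, "--pipeline", PIPELINE,
                         "--project", item["project"], "--ref", item["ref"]],
                        check=True,
                    )
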
@jpew:matrix.org: I built the client; it worked  20:46
@jpew:matrix.org: Left a +1  20:46
@jpew:matrix.org: Hmm, OK. That dequeued the changes, but there are still 100's of changes that are "Waiting for merger" and it did not clear those out  20:55
@jpew:matrix.org: And I think that's the bottleneck  20:55
@jpew:matrix.org: Is there a way to cancel jobs that are waiting for a merger?  21:08
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 883318: Report early failure from Ansible task failures https://review.opendev.org/c/zuul/zuul/+/883318  21:18
@jim:acmegating.com: jpew: should still be able to dequeue the items, but you can't alter the individual jobs in the items  21:19
@jpew:matrix.org: It's weird though because there aren't any jobs; all the items we process are for repos and branches that don't have any pipeline defined  21:21
@jpew:matrix.org: We process 100's of repo/branches, but only actually run 1 or 2 jobs  21:21
@jpew:matrix.org: But the processing of the repo/branches is what seems to be waiting on the merger, and going _really_ slow (when gerrit is not running)  21:21
@jpew:matrix.org: hmmm  21:24
@jpew:matrix.org: Or maybe our periodic-hourly pipeline is a red herring and the actual culprit is our hourly smart-reconfigure?  21:25
@jpew:matrix.org: Still would be nice to cancel the pending ones though :(  21:26
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl:  22:04
- [zuul/nodepool] 885960: Copy cached objects in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/885960
- [zuul/nodepool] 886083: Copy cached paths in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/886083
@jpew:matrix.org: Nope, it's not the reconfigure that's the problem. It's our periodic pipeline :(  22:08
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123  22:20
@jpew:matrix.org: Is it possible for zuul-client dequeue to somehow remove that merger job?  22:29
@jpew:matrix.org: We had to lobotomize our tenant until Gerrit is back, which is not ideal  22:30
@jpew:matrix.org: Or maybe if a clone for a repo fails it could also fail all of the other merger jobs for that same repo branch instead of letting them stack up?  22:32
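
A purely hypothetical sketch of the fail-fast idea jpew floats here, not anything Zuul actually implements: a merger could remember recent connection failures per repo/branch and reject queued requests for the same target immediately, instead of letting every item retry and time out on its own.

    import time

    # Hypothetical helper, not Zuul code: remember recent clone failures
    # so later merge requests for the same repo/branch fail immediately.
    FAIL_TTL = 300  # seconds to treat a repo/branch as unreachable
    _recent_failures: dict[tuple[str, str], float] = {}

    def record_failure(project: str, branch: str) -> None:
        _recent_failures[(project, branch)] = time.monotonic() + FAIL_TTL

    def should_fail_fast(project: str, branch: str) -> bool:
        deadline = _recent_failures.get((project, branch))
        return deadline is not None and time.monotonic() < deadline
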
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 883940: Parallelize static startup more https://review.opendev.org/c/zuul/nodepool/+/883940  22:51
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl:  23:28
- [zuul/nodepool] 885960: Copy cached objects in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/885960
- [zuul/nodepool] 886083: Copy cached paths in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/886083
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 885088: Fix error with static node reuse https://review.opendev.org/c/zuul/nodepool/+/885088  23:28
-@gerrit:opendev.org- Zuul merged on behalf of Tobias Henkel: [zuul/nodepool] 883253: Fix race condition with node reuse and multiple providers https://review.opendev.org/c/zuul/nodepool/+/883253  23:56
