-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123 | 00:49 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 878039: Add implied-branch-matchers to tenant config https://review.opendev.org/c/zuul/zuul/+/878039 | 07:09 | |
-@gerrit:opendev.org- Zuul merged on behalf of Benedikt Löffler: [zuul/zuul-jobs] 886186: Use zuul_workspace_root for prepare/mirror workspace test roles https://review.opendev.org/c/zuul/zuul-jobs/+/886186 | 18:36 | |
@jpew:matrix.org | Zuul gets really unhappy if a gerrit server goes down.... it's sitting here refusing to start running _any_ jobs; and the scheduler has no logs | 20:00 |
@jpew:matrix.org | Sorry, the logs part is not true | 20:01 |
@jpew:matrix.org | I was looking at the wrong logs :) | 20:01 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 887229: DNM: Test base-test https://review.opendev.org/c/zuul/zuul-jobs/+/887229 | 20:11 | |
@clarkb:matrix.org | we definitely turn off our gerrit and zuul handles it just fine | 20:13 |
@clarkb:matrix.org | But I don't know that we are checking that github keeps triggering jobs | 20:13 |
@clarkb:matrix.org | but in general if the source of repo content isn't available I wouldn't expect jobs relying on that repo content to be able to run | 20:13 |
@jpew:matrix.org | Clark: It looks like the queue processes really slowly when gerrit is down | 20:20 |
@jpew:matrix.org | And we have a periodic-hourly pipeline that enqueues 100's of items... and I think it's enqueuing them faster than the scheduler can remove them | 20:21 |
@jpew:matrix.org | They don't actually do anything, it's just a lot of branches to check | 20:21 |
@jpew:matrix.org | .. ya it's only removing about 5 items a minute from the queue... maybe because each one has to time out? | 20:24 |
@clarkb:matrix.org | I'd have to look at logs. I would expect mergers to fail more quickly but maybe they are all waiting for network connections to start | 20:29 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123 | 20:31 | |
@jpew:matrix.org | Ah, it sleeps 30 seconds between each try | 20:35 |
@jpew:matrix.org | And it tries 3 times before failing (looks like?) | 20:35 |
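A drain rate of roughly 5 items a minute is consistent with each merge job burning through a fixed retry loop before it gives up. Below is a minimal sketch of that pattern, using the 3 attempts and 30-second sleep mentioned above as assumptions; it is an illustration, not Zuul's actual merger code.

```python
import time

def fetch_with_retries(fetch, attempts=3, delay=30):
    """Illustrative retry loop: when the remote Gerrit is unreachable,
    each call spends roughly attempts * delay seconds (plus connection
    timeouts) before the merge job is finally reported as failed, which
    caps how fast the queue can drain."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == attempts:
                raise
            time.sleep(delay)
```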
@jim:acmegating.com | if you need to clear the queue you can use the api via the client | 20:36 |
@jpew:matrix.org | Ah right | 20:36 |
@jim:acmegating.com | jpew: you could build a client with https://review.opendev.org/835319 and let us know how it goes :) | 20:37 |
@jpew:matrix.org | Ya, I was just going to say "I don't see that option" | 20:38 |
@jim:acmegating.com | (or you could mimic it with one of the queue dump scripts and then script running a single dequeue repeatedly) | 20:39 |
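One way to mimic that is sketched below: pull the pipeline's items from the tenant status API and run `zuul-client dequeue` once per item. The base URL, tenant name, and the exact JSON field names are assumptions to check against your deployment, and zuul-client is assumed to already be configured with admin credentials.

```python
#!/usr/bin/env python3
# Hypothetical bulk-dequeue helper: dump a pipeline's items from the
# tenant status API and dequeue them one at a time with zuul-client.
import json
import subprocess
import urllib.request

ZUUL_URL = "https://zuul.example.com"   # assumed base URL
TENANT = "example-tenant"               # assumed tenant name
PIPELINE = "periodic-hourly"            # pipeline discussed above

with urllib.request.urlopen(f"{ZUUL_URL}/api/tenant/{TENANT}/status") as resp:
    status = json.load(resp)

for pipeline in status["pipelines"]:
    if pipeline["name"] != PIPELINE:
        continue
    for queue in pipeline.get("change_queues", []):
        for head in queue.get("heads", []):
            for item in head:
                cmd = ["zuul-client", "dequeue",
                       "--tenant", TENANT,
                       "--pipeline", PIPELINE,
                       "--project", item["project"]]
                # Change items carry an id; periodic items are enqueued by ref.
                if item.get("id"):
                    cmd += ["--change", item["id"]]
                else:
                    cmd += ["--ref", item["ref"]]
                subprocess.run(cmd, check=False)
```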
@jpew:matrix.org | I built the client; it worked | 20:46 |
@jpew:matrix.org | Left a +1 | 20:46 |
@jpew:matrix.org | Hmm, OK. That dequeued the changes, but there are still 100's of changes that are "Waiting for merger" and it did not clear those out | 20:55 |
@jpew:matrix.org | And I think that's the bottleneck | 20:55 |
@jpew:matrix.org | Is there a way to cancel jobs that are waiting for a merger? | 21:08 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 883318: Report early failure from Ansible task failures https://review.opendev.org/c/zuul/zuul/+/883318 | 21:18 | |
@jim:acmegating.com | jpew: should still be able to dequeue the items, but you can't alter the individual jobs in the items | 21:19 |
@jpew:matrix.org | It's weird though because there aren't any jobs; all the items we process are for repos and branches that don't have any pipeline defined | 21:21 |
@jpew:matrix.org | We process 100's of repo/branches, but only actually run 1 or 2 jobs | 21:21 |
@jpew:matrix.org | But the processing of the repo/branches is what seems to be waiting on the merger, and going _really_ slow (when gerrit is not running) | 21:21 |
@jpew:matrix.org | hmmm | 21:24 |
@jpew:matrix.org | Or maybe our periodic-hourly pipeline is a red herring and the actual culprit is our hourly smart-reconfigure? | 21:25 |
@jpew:matrix.org | Still would be nice to cancel the pending ones though :( | 21:26 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: | 22:04 | |
- [zuul/nodepool] 885960: Copy cached objects in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/885960 | ||
- [zuul/nodepool] 886083: Copy cached paths in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/886083 | ||
@jpew:matrix.org | Nope, it's not the reconfigure that's the problem. It's our periodic pipeline :( | 22:08 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 887123: List process ids in bwrap namespace https://review.opendev.org/c/zuul/zuul/+/887123 | 22:20 | |
@jpew:matrix.org | Is it possible for zuul-client dequeue to somehow remove that merger job? | 22:29 |
@jpew:matrix.org | We had to lobotomize our tenant until Gerrit is back, which is not ideal | 22:30 |
@jpew:matrix.org | Or maybe, if a clone for a repo fails, it could also fail all of the other merger jobs for that same repo/branch instead of letting them stack up? | 22:32 |
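The idea being floated could look roughly like a negative cache keyed on (connection, project): once a clone has failed, later merge jobs for the same repo are short-circuited for a while instead of each one repeating the full retry cycle. This is only a sketch of the proposal, not how Zuul's merger is actually implemented.

```python
import time

FAILED_CLONES = {}   # (connection, project) -> time of the last clone failure
FAILURE_TTL = 300    # seconds to keep short-circuiting; arbitrary choice

def should_skip(connection, project):
    """Return True if a recent clone failure means this merge job
    should fail fast instead of retrying against a dead remote."""
    failed_at = FAILED_CLONES.get((connection, project))
    return failed_at is not None and time.time() - failed_at < FAILURE_TTL

def record_clone_failure(connection, project):
    FAILED_CLONES[(connection, project)] = time.time()
```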
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 883940: Parallelize static startup more https://review.opendev.org/c/zuul/nodepool/+/883940 | 22:51 | |
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: | 23:28 | |
- [zuul/nodepool] 885960: Copy cached objects in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/885960 | ||
- [zuul/nodepool] 886083: Copy cached paths in tree cache before iterating https://review.opendev.org/c/zuul/nodepool/+/886083 | ||
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 885088: Fix error with static node reuse https://review.opendev.org/c/zuul/nodepool/+/885088 | 23:28 | |
-@gerrit:opendev.org- Zuul merged on behalf of Tobias Henkel: [zuul/nodepool] 883253: Fix race condition with node reuse and multiple providers https://review.opendev.org/c/zuul/nodepool/+/883253 | 23:56 |