Thursday, 2022-03-03

-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 826549: Annotate variable interpolation in the tutorial zuul.conf https://review.opendev.org/c/zuul/zuul/+/82654900:15
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-registry] 831620: Stamp mitmproxy log https://review.opendev.org/c/zuul/zuul-registry/+/83162000:31
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-registry] 831620: Stamp mitmproxy log https://review.opendev.org/c/zuul/zuul-registry/+/83162001:51
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 831085: Calculate elapsed and remaining times in javascript https://review.opendev.org/c/zuul/zuul/+/83108508:30
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 831661: Ignore errors in merge request cleanup https://review.opendev.org/c/zuul/zuul/+/83166110:46
@iselor:matrix.orgHi, another question. I have an issue with 5.0 that when one pipeline job has config error, for example zuul.yaml is incorrect, then whole queue is frozen. Is there any solution for that? Currently I have to remove jobs with invalid config using zuul-client.10:55
@gobi_g:matrix.orgHi,11:16
How to setup https://zuul-ci.org/docs/zuul/latest/monitoring.html#monitoring
Is there any sample mapping config available?
-@gerrit:opendev.org- Albin Vass proposed: [zuul/zuul] 830840: Add feature to fail without retry in pre-run https://review.opendev.org/c/zuul/zuul/+/83084011:57
-@gerrit:opendev.org- Albin Vass proposed: [zuul/zuul] 830840: Make it possible to configure job retries with zuul_return https://review.opendev.org/c/zuul/zuul/+/83084011:58
@avass:vassast.org> <@gobi_g:matrix.org> Albin Vass: any comments on this?11:59
I guess so, I don't think we have it setup yet.
-@gerrit:opendev.org- Albin Vass proposed: [zuul/zuul] 831737: Simplify reportBuildEnd call https://review.opendev.org/c/zuul/zuul/+/83173712:09
@avass:vassast.orgMaybe I'm missing something but I found that ^ while looking through the code12:09
-@gerrit:opendev.org- Albin Vass proposed: [zuul/zuul] 830840: Make it possible to configure job retries with zuul_return https://review.opendev.org/c/zuul/zuul/+/83084012:36
-@gerrit:opendev.org- Matthieu Huin https://matrix.to/#/@mhuin:matrix.org proposed: [zuul/zuul-client] 819118: Support for "basic" authentication https://review.opendev.org/c/zuul/zuul-client/+/81911814:10
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 831609: Add a tenant reconfiguration metric https://review.opendev.org/c/zuul/zuul/+/83160915:32
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 830896: Use kazoo.retry in zkobject https://review.opendev.org/c/zuul/zuul/+/83089615:47
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed:15:49
- [zuul/zuul] 830707: Use a transaction for BuildCompletedEvent https://review.opendev.org/c/zuul/zuul/+/830707
- [zuul/zuul] 830896: Use kazoo.retry in zkobject https://review.opendev.org/c/zuul/zuul/+/830896
@clarkb:matrix.org> <@iselor:matrix.org> Hi, another question. I have an issue with 5.0 that when one pipeline job has config error, for example zuul.yaml is incorrect, then whole queue is frozen. Is there any solution for that? Currently I have to remove jobs with invalid config using zuul-client.16:24
Generally Zuul should prevent you from merging invalid configs. However, there are some corner cases where this can happen, but in those instances the behavior I've seen is jobs not enqueuing at all. Not jobs getting stuck
@clarkb:matrix.orgI would definitely start by ensuring you have zuul gating the changes that update its configs16:24
@fungicide:matrix.orgnot sure if anyone else has seen it yet, but https://review.opendev.org/828125 has an interesting proposal for an alternative to the files and (more importantly) irrelevant-files job attributes, allowing them to be combined into a single flexible matcher16:51
@clarkb:matrix.orgcorvus: looks like the zuul-registry fixup has enough +2s now. Is now/today a good time to land that and check it?17:23
@jim:acmegating.comClark: wfm17:26
@clarkb:matrix.orgok I guess I'll approve it then. Falling back to 1.1.0 if we are impacted in our ability to revert should still be a workable plan17:27
-@gerrit:opendev.org- Zuul merged on behalf of Szymon Datko: [zuul/zuul-jobs] 831423: [ensure-python] Improve check for CentOS/RHEL 9 packages https://review.opendev.org/c/zuul/zuul-jobs/+/83142317:37
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan:17:51
- [zuul/zuul-registry] 831235: Perform atomic upload updates v2 https://review.opendev.org/c/zuul/zuul-registry/+/831235
- [zuul/zuul-registry] 831274: Add more robust testing of the registry https://review.opendev.org/c/zuul/zuul-registry/+/831274
@clarkb:matrix.orgtristanC: do you know if it is possible for matrix users to opt into getting notifications for the bot messages? Or are they forced to never notify?18:07
@clarkb:matrix.org * tristanC: do you know if it is possible for matrix users to opt into getting notifications for the gerritbot messages? Or are they forced to never notify?18:07
@tristanc_:matrix.orgClark: the gerritbot is using m.notice event type, so i guess they can be filtered on the client side.18:08
@clarkb:matrix.orgaha element has a "messages sent by bot" option. I wonder if the notice type is considered a bot message?18:10
@clarkb:matrix.orgI've toggled that from off to on and will see if that helps.18:10
@clarkb:matrix.orgit is too bad that I can't filter by user/source like with IRC clients though18:10
@tristanc_:matrix.orgClark: at the API level, when syncing you can use a complex event filter that should let you do that, though I don't know if end user clients enable custom filters18:14
@fungicide:matrix.orgi wonder if weechat's filters can be coerced into doing that with the matrix plugin18:25
@fungicide:matrix.orgi expect so18:25
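The API-level filter tristanC mentions can be sketched as follows. This is a minimal illustration, assuming the Matrix client-server `/sync` endpoint's `filter` query parameter and the `not_senders` field of a room timeline filter; the helper names `build_filter` and `build_sync_query` are hypothetical, and whether a given end-user client exposes this remains an open question, as noted above.

```python
import json
from urllib.parse import urlencode


def build_filter(ignored_senders):
    """Return a sync filter that drops timeline events from the given senders."""
    return {"room": {"timeline": {"not_senders": list(ignored_senders)}}}


def build_sync_query(ignored_senders):
    """Encode the filter as the `filter` query parameter for a /sync request."""
    encoded = json.dumps(build_filter(ignored_senders), separators=(",", ":"))
    return urlencode({"filter": encoded})


# Hypothetical use: hide the gerritbot account from the synced timeline.
query = build_sync_query(["@gerrit:opendev.org"])
```

Filtering by sender rather than by event type sidesteps the question of how a client classifies `m.notice` messages.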
@jpew:matrix.orgDoes `zuul_return` work for the `gerrit` driver?19:20
@jim:acmegating.comjpew: it has very little interaction with drivers19:20
@jpew:matrix.orgSorry, specifically `file_comments`19:21
@jim:acmegating.comjpew: yes, make sure you're using an http connection for reporting19:21
@jpew:matrix.orgAh, that's why. Thanks19:21
-@gerrit:opendev.org- Ian Wienand proposed:19:47
- [zuul/zuul-registry] 831319: podman buildset testing: dump image list https://review.opendev.org/c/zuul/zuul-registry/+/831319
- [zuul/zuul-registry] 831339: podman: make sure we remove the pulled image https://review.opendev.org/c/zuul/zuul-registry/+/831339
- [zuul/zuul-registry] 831440: Fix and/or matching for image pre-conditions https://review.opendev.org/c/zuul/zuul-registry/+/831440
- [zuul/zuul-registry] 831135: Update testing to Ubuntu Focal https://review.opendev.org/c/zuul/zuul-registry/+/831135
- [zuul/zuul-registry] 831480: tox-py38 : don't run on Fedora https://review.opendev.org/c/zuul/zuul-registry/+/831480
- [zuul/zuul-registry] 831131: Enable mitmproxy between docker/podman and tesitng image https://review.opendev.org/c/zuul/zuul-registry/+/831131
- [zuul/zuul-registry] 831620: Stamp mitmproxy log https://review.opendev.org/c/zuul/zuul-registry/+/831620
@blaisep-sureify:matrix.org(N00b alert) When I run the zuul quickstart with docker-compose, I notice a lot of console exceptions like:21:03
```zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: Exception in CleanupWorker (empty node cleanup)
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: Traceback (most recent call last):
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: File "/usr/local/lib/python3.9/site-packages/nodepool/launcher.py", line 667, in _run
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: task()
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: File "/usr/local/lib/python3.9/site-packages/nodepool/launcher.py", line 654, in _cleanupEmptyNodes
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: for node_id in zk_conn.getNodes():
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: File "/usr/local/lib/python3.9/site-packages/nodepool/zk.py", line 2041, in getNodes
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: return self.client.get_children(self.NODE_ROOT)
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: File "/usr/local/lib/python3.9/site-packages/kazoo/client.py", line 1218, in get_children
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: return self.get_children_async(path, watch=watch,
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: File "/usr/local/lib/python3.9/site-packages/kazoo/handlers/utils.py", line 69, in get
zuul-tutorial-launcher-1 | 2022-03-03 19:02:53,014 ERROR nodepool.CleanupWorker: raise self._exception
```
@blaisep-sureify:matrix.orgIs that expected?21:03
@blaisep-sureify:matrix.org```21:04
commit 9525e33cfe46481ea016b8fa657e838be1d5f026 (HEAD -> master, origin/master, origin/HEAD, feature/k8s)
Merge: 44f91bf2 c75b849b
Author: Zuul <zuul@review.opendev.org>
Date: Thu Feb 24 18:40:18 2022 +0000
Merge "Fix multi-scheduler test races in waitUntilSettled"
```
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-registry] 831846: testing: add DEBUG flag to testing container https://review.opendev.org/c/zuul/zuul-registry/+/83184621:11
@iwienand:matrix.orgBlaise Pabon: that looks a little like dropping out of ZK, can you post a bit more context around the exception @ paste.opendev.org to show all the message?21:13
@blaisep-sureify:matrix.orghttps://paste.opendev.org/show/b9Wpi5sN6vukoOINcsyT/21:17
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-registry] 831846: testing: add DEBUG flag to testing container https://review.opendev.org/c/zuul/zuul-registry/+/83184621:18
@blaisep-sureify:matrix.org> https://paste.opendev.org/show/b9Wpi5sN6vukoOINcsyT/21:19
```
Client:
Cloud integration: v1.0.22
Version: 20.10.12
API version: 1.41
Go version: go1.16.12
Git commit: e91ed57
Built: Mon Dec 13 11:46:56 2021
OS/Arch: darwin/amd64
Context: default
Experimental: true
Server: Docker Desktop 4.5.0 (74594)
Engine:
Version: 20.10.12
API version: 1.41 (minimum version 1.12)
Go version: go1.16.12
Git commit: 459d0df
Built: Mon Dec 13 11:43:56 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.12
GitCommit: 7b11cfaabd73bb80907dd23182b9347b4245eb5d
runc:
Version: 1.0.2
GitCommit: v1.0.2-0-g52b36a2
docker-init:
Version: 0.19.0
GitCommit: de40ad0
```
@blaisep-sureify:matrix.orgI can try it on my Fedora server downstairs....21:20
@clarkb:matrix.orgBlaise Pabon: `zuul-tutorial-zk-1            | 2022-03-03 19:02:52,679 [myid:1] - INFO  [SessionTracker:ZooKeeperServer@628] - Expiring session 0x1000008583e0002, timeout of 10000ms exceeded` seems to be what started it21:23
@clarkb:matrix.orgsomething caused the connection between nodepool and zookeeper to not ping pong for over 10 seconds and it killed the connection. Zuul and nodepool should recover from that if the network connection can be reestablished, but it can result in test nodes being deleted and jobs restarting. This means if it happens frequently enough no jobs can complete so definitely something to dig into21:24
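To check how often this happens, one option is to scan the ZooKeeper container log for session expirations. The sketch below assumes the log line format quoted above; the function name `find_expired_sessions` is hypothetical and not part of Zuul, nodepool, or ZooKeeper.

```python
import re

# Matches the ZooKeeper server message quoted in the discussion above.
EXPIRY_RE = re.compile(
    r"Expiring session (0x[0-9a-fA-F]+), timeout of (\d+)ms exceeded"
)


def find_expired_sessions(lines):
    """Return (session_id, timeout_ms) for every expiry message found."""
    found = []
    for line in lines:
        match = EXPIRY_RE.search(line)
        if match:
            found.append((match.group(1), int(match.group(2))))
    return found
```

Feeding it the output of `docker-compose logs` for the ZooKeeper service would show whether expirations are a one-off or recurring.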
@blaisep-sureify:matrix.orgThank you, I'm happy to troubleshoot. Right now I'm running the `docker-compose` on my fedora server which is quite well endowed.21:28
@tobias.henkel:matrix.orgBlaise Pabon: there is a one hour gap in the logs of your paste, was that due to suspend or a time jump in the docker vm maybe?21:53
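A gap like the one tobiash spotted can be found mechanically. This is a rough sketch, assuming each log line carries a timestamp in the `YYYY-MM-DD HH:MM:SS,mmm` form seen in the excerpts above; `find_gaps` and its default threshold are hypothetical choices for illustration.

```python
import re
from datetime import datetime, timedelta

# Timestamp format used by the nodepool and ZooKeeper lines quoted above.
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (previous, current) timestamp pairs that are further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        match = TS_RE.search(line)
        if not match:
            continue  # lines without a timestamp (e.g. wrapped output) are skipped
        current = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S,%f")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current))
        previous = current
    return gaps
```

A large gap alone does not say whether the host was suspended or the VM clock jumped; it only points at where to look.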
@blaisep-sureify:matrix.orgOh! I wonder if I messed up the copy... I must say, it seems to be running a lot smoother on my fedora host.22:00
@clarkb:matrix.orgtobiash: it's late for you so not urgent but https://review.opendev.org/c/zuul/zuul/+/831102 and changes on that stack were reviewed by you previously if you have time to rereview (they were rebased to address a conflict iirc)22:00
@blaisep-sureify:matrix.orgThank you for the guidance tobiash , I have nothing urgent. I'm just learning my way around22:01
@tobias.henkel:matrix.orgClark: I'll have a look22:03
@tobias.henkel:matrix.orgClark: there is a further metrics stack starting at 831246 which would help us further optimize the pipeline processing22:04
@jim:acmegating.commoar grafs22:04
@clarkb:matrix.orgthe context manager thing you've done in https://review.opendev.org/c/zuul/zuul/+/831609/3/zuul/scheduler.py is something I hadn't seen before22:20
@clarkb:matrix.orgcorvus: any idea what the order is? Does tlock wrap timer_ctx or is it the other way around?22:21
@clarkb:matrix.orgJust thinking depending on how we want to measure the numbers the behavior there may matter22:21
@clarkb:matrix.orgOk the docs say that the lock wraps the timer. Which means we won't measure the amount of time spent acquiring the lock. I suspect that is correct for this use case22:23
@clarkb:matrix.orgwe want to know how much time it takes to do things once we have the lock22:24
@jim:acmegating.comClark: yeah that was the intent; and i think that's what we want22:25
@jim:acmegating.comClark: also, while that syntax is supported in all python versions we support, whether or not it can be wrapped in () differs, so threading the needle between what python and flake8 allow when writing that was interesting (see first patchset)22:26
@jim:acmegating.com3.9 allows `with (foo, bar)` but 3.8 only accepts `with foo, bar`22:27
@clarkb:matrix.orgfun22:28
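The ordering discussed above can be demonstrated with a small stand-alone example. It is a sketch only: `timer` and `measure` are hypothetical stand-ins, not the `timer_ctx` or lock objects used in zuul/scheduler.py. Context managers in a single `with` statement are entered left to right, so listing the lock first keeps lock acquisition time out of the measurement.

```python
import threading
import time
from contextlib import contextmanager


@contextmanager
def timer(results):
    """Record the elapsed time of the enclosed block in results['elapsed']."""
    start = time.monotonic()
    try:
        yield
    finally:
        results["elapsed"] = time.monotonic() - start


def measure(timer_first, lock_delay=0.2):
    """Time a short block while the lock is initially held elsewhere."""
    results = {}
    tlock = threading.Lock()
    tlock.acquire()
    # Simulate contention: another thread releases the lock after lock_delay.
    threading.Timer(lock_delay, tlock.release).start()
    if timer_first:
        # Timer wraps the lock: waiting for the lock is included.
        with timer(results), tlock:
            time.sleep(0.01)
    else:
        # Lock wraps the timer: only work done while holding the lock is timed.
        with tlock, timer(results):
            time.sleep(0.01)
    return results["elapsed"]
```

The unparenthesized `with a, b:` form is used here because, as noted above, the parenthesized form is not accepted by Python 3.8.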
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 831102: Add duration and start time to buildset table https://review.opendev.org/c/zuul/zuul/+/83110223:06
@jim:acmegating.comClark noted in #opendev a job whose start was delayed; it appears to be related to the optimization of only processing pipelines with events.  in this case, the job was waiting on a semaphore which was held by a job in a different pipeline.  the completion of the job in the other pipeline did not trigger pipeline processing for the one that was waiting.23:12
@jim:acmegating.comi think to solve this, we need to either look for semaphore changes in our skip-processing check, or find a way for semaphore releases to generate pipeline events23:13
@jim:acmegating.comneither one seems easy or convenient23:13
@jim:acmegating.comthey both seem to require knowledge of who's waiting for semaphores; unless we were to just broadcast an event to every pipeline.23:14
@clarkb:matrix.orgbroadcasting seems straightforward but might undermine some of the optimizations that have been made if a lot of locks are used23:14
@clarkb:matrix.org * broadcasting seems straightforward but might undermine some of the optimizations that have been made if a lot of semaphores are used23:15
@jim:acmegating.comthat's not a terrible idea, except that it starts to approximate the old behavior if there are a lot of semaphores involved (like, if you used one for every job, then you basically get the old behavior)23:15
@jim:acmegating.comyeah exactly... probably okay, but not elegant :)23:15
@jim:acmegating.commaybe we can start appending semaphore waiters under the semaphores... the accounting wouldn't even need to be exact, just enough to know which pipelines to send events to23:16
@jim:acmegating.com(if we send too many events because some item was dequeued, it's not important)23:17
@jim:acmegating.comi'm leaning toward broadcasting for now, and then optimizing for only waiting pipelines if we want to narrow it down more23:19
@jim:acmegating.comi'll work on that in a bit23:20
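The broadcast idea can be illustrated with a toy model. Everything below is hypothetical and greatly simplified: the `Pipeline` and `SemaphoreHandler` classes, their attributes, and the event tuple are invented for illustration and do not reflect Zuul's actual classes or the change that was eventually written.

```python
class Pipeline:
    """Toy pipeline that is only processed when it has pending events."""

    def __init__(self, name):
        self.name = name
        self.pending_events = []

    def should_process(self):
        # Mirrors the optimization discussed above: skip pipelines without events.
        return bool(self.pending_events)


class SemaphoreHandler:
    """Toy semaphore registry that broadcasts an event on every release."""

    def __init__(self, pipelines):
        self.pipelines = pipelines
        self.holders = {}

    def acquire(self, semaphore, holder):
        if semaphore in self.holders:
            return False
        self.holders[semaphore] = holder
        return True

    def release(self, semaphore):
        self.holders.pop(semaphore, None)
        # Broadcast: wake every pipeline, since we do not track who is waiting.
        for pipeline in self.pipelines:
            pipeline.pending_events.append(("semaphore-released", semaphore))
```

With many semaphores in use this wakes most pipelines on most releases, which is the trade-off raised above; tracking waiters per semaphore would narrow the set of pipelines notified.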
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 831873: Use JobData for Build result data https://review.opendev.org/c/zuul/zuul/+/83187323:30
-@gerrit:opendev.org- Ian Wienand proposed:23:38
- [zuul/zuul-registry] 831846: testing: add DEBUG flag to testing container https://review.opendev.org/c/zuul/zuul-registry/+/831846
- [zuul/zuul-registry] 831875: Always save testing zuul-registry container logs after each test https://review.opendev.org/c/zuul/zuul-registry/+/831875
