Wednesday, 2022-02-09

-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl:00:00
- [zuul/zuul] 828224: Fix leaked semaphore cleanup https://review.opendev.org/c/zuul/zuul/+/828224
- [zuul/zuul] 828292: Avoid data races when dequeueing superceded items https://review.opendev.org/c/zuul/zuul/+/828292
@jim:acmegating.comClark: replied00:00
@jim:acmegating.comClark: i think that lock thing is hard to test because the conditional is only true if the tenant doesn't exist already.  and a tenant that is just being added to the system doesn't have any objects in zk that need manipulating.00:03
@jim:acmegating.com(so therefore the context never gets used)00:03
@jim:acmegating.comi think i've convinced myself it's not practical to test it.  we may be able to split it out into its own change, or we could just leave it in there and add a pgraph to the commit message.00:05
@clarkb:matrix.orgcorvus: thanks that helped00:10
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 828441: ensure-sphinx: upgrade pip https://review.opendev.org/c/zuul/zuul-jobs/+/82844100:15
@clarkb:matrix.orgcorvus: I +2'd https://review.opendev.org/c/zuul/zuul/+/826400 but did leave a couple of comments you might want to check00:16
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: [zuul/zuul] 827841: Don't lock pipelines during local layout update https://review.opendev.org/c/zuul/zuul/+/82784100:25
@jim:acmegating.comClark: will do; ^ passed my local smoketest, i'm running the full test suite on it too, but i think it's gtg.00:26
@clarkb:matrix.orgcorvus: hrm did we not need to update the test_semaphore test too?00:29
@jim:acmegating.comClark: nope, the semaphore fix that just merged fixes it00:29
@clarkb:matrix.orgoh right00:29
@jim:acmegating.com(so i'm really glad i unparented it so i could learn about that)00:29
@jim:acmegating.comtests.unit.test_scheduler.TestScheduler.test_initial_pipeline_gauges fails ... naturally... maybe we need to just move that stanza to the prime method where we expect no data00:34
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: [zuul/zuul] 827841: Don't lock pipelines during local layout update https://review.opendev.org/c/zuul/zuul/+/82784100:43
@jim:acmegating.comClark: okay, that should be better%00:44
@jim:acmegating.com * Clark: okay, that should be better ^00:44
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 828441: ensure-sphinx: upgrade pip https://review.opendev.org/c/zuul/zuul-jobs/+/82844100:44
@jim:acmegating.comswest: tobiash when you're back, please take a look at the revised https://review.opendev.org/827841 and feel free to approve if you like it.01:08
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 828441: ensure-sphinx: upgrade pip https://review.opendev.org/c/zuul/zuul-jobs/+/82844101:22
@iwienand:matrix.orghrm, i tried to replay https://zuul.opendev.org/t/openstack/build/d83594eb56204d6690d1c3c2f40e1765/logs with 02:31
@iwienand:matrix.orgzuul-client enqueue-ref --tenant openstack --pipeline post --project opendev.org/openstack/nova --newrev f7fa3bf5fcfc39f8ac9dcfa126747e376d801eb5 --oldrev b6fe7521afa8d42febc68f5f79782f7bcc3b568f --ref refs/heads/master02:31
@iwienand:matrix.orgit does not seem like it has found anything to do02:33
@iwienand:matrix.orglast thing is02:36
@iwienand:matrix.org2022-02-09 02:23:33,613 DEBUG zuul.MergerApi: Submitting job request to ZooKeeper <MergeRequest fe8150dec19f4b54b0feafd53bff64a8, job_type=refstate, state=requested, path=/zuul/merger/requests/fe8150dec19f4b54b0feafd53bff64a8>02:36
@iwienand:matrix.orgthe merger queue is very high.  i guess i just happened to submit at the same time as the periodic jobs02:38
@westphahl:matrix.orgcorvus: 827841 lgtm, thanks for pushing this along!08:20
@avass:vassast.orgWould it be possible to add a `revision` attribute on `job.roles` so a job can pin a commit instead of always using the latest version of a role? :)08:39
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 828505: dnm: Filter model API 2+3 version checks by scheduler https://review.opendev.org/c/zuul/zuul/+/82850509:24
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 827841: Don't lock pipelines during local layout update https://review.opendev.org/c/zuul/zuul/+/82784111:02
@sts_:matrix.orgHey. I want to try out zuul, so I'm trying to set up the zuul based on the example under /doc/source/examples. From what I can see from the logs all nodes startup correctly, without any complaints about the configuration.15:58
When I create a new change or patchset, however, I don't see any reply from zuul on the change.
The scheduler seems to correctly detect that a new patchset was uploaded. It prints the following in its log:
```
zuul-scheduler | 2022-02-09 12:49:11,314 INFO zuul.GerritConnection: [e: 5f8b54ab90f6441c8fc254bae58a72e9] Updating <Change 0x7f280c193550 None 1381,9>
zuul-scheduler | 2022-02-09 12:49:11,804 INFO zuul.GerritConnection: [e: 0602244776434d9dbe75936e341cc24f] Updating <Change 0x7f280c193550 testproj 1381,9>
```
Where 1381,9 corresponds to the change id and patchset that was uploaded.
Based on my (very limited) understanding of zuul, I would expect the executor to start executing jobs on the job node next. The executor doesn't show anything in its logs after startup. I'm not sure if this is because it did not receive any triggers to do anything with the new patchset, or because it doesn't log anything when it receives a trigger.
I'm a bit at a loss here. What is the best way to debug this? Is there some way to configure zuul to be more verbose about what it's doing/trying to do?
@fungicide:matrix.orgsts_: so you have a check pipeline like this defined? https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/pipelines/gerrit-reference-pipelines.yaml#L1-L2416:13
@fungicide:matrix.organd a zuul.yaml like this in testproj (or in your 1381 change)? https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/test1/zuul.yaml16:14
@sts_:matrix.org> <@fungicide:matrix.org> sts_: so you have a check pipeline like this defined? https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/pipelines/gerrit-reference-pipelines.yaml#L1-L2416:15
I have a zuul-config project with exactly that file under zuul.d/pipelines.yaml in the main branch
@sts_:matrix.org> <@fungicide:matrix.org> and a zuul.yaml like this in testproj (or in your 1381 change)? https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/test1/zuul.yaml16:16
I called it .zuul.yaml, but other than that it is exactly that file
@fungicide:matrix.organd you've set up a main.yaml for the scheduler with testproj in one of the projects lists, like this? https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/etc_zuul/main.yaml16:17
@sts_:matrix.orgYes. It looks like this:16:18
```
- tenant:
name: test-tenant
source:
gerrit:
config-projects:
- zuul-config
untrusted-projects:
- testproj
opendev.org:
untrusted-projects:
- zuul/zuul-jobs:
include:
- job
```
@fungicide:matrix.orgi'll assume indentation is correct in the file and matrix has simply mangled it, the scheduler would probably fail to start otherwise16:19
@fungicide:matrix.orgokay, yep that looks right16:20
@fungicide:matrix.orgdo you see any config errors (bell icon in the top-right corner) on the status page for that tenant in the zuul web dashboard?16:21
@sts_:matrix.orgNope, none16:21
@fungicide:matrix.orgthose ``[e: xxxxx]`` uuids you see in the log are event ids, you should be able to filter the log file by one of them and see all the log lines associated with that event16:23
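The event ID filtering described above can be sketched in a few lines of Python. This is only an illustration, not part of Zuul: `lines_for_event` is a hypothetical helper name, and the sample lines are the two scheduler log lines quoted earlier in this conversation. It assumes the event ID always appears in the log line as `[e: <id>]`, as it does in those two lines.

```python
def lines_for_event(log_lines, event_id):
    """Return only the log lines tagged with the given event ID.

    Assumes the ID is rendered as "[e: <id>]" somewhere in the line.
    """
    marker = f"[e: {event_id}]"
    return [line for line in log_lines if marker in line]


# The two scheduler lines quoted earlier in the conversation.
sample_log = [
    "zuul-scheduler | 2022-02-09 12:49:11,314 INFO zuul.GerritConnection: "
    "[e: 5f8b54ab90f6441c8fc254bae58a72e9] Updating <Change 0x7f280c193550 None 1381,9>",
    "zuul-scheduler | 2022-02-09 12:49:11,804 INFO zuul.GerritConnection: "
    "[e: 0602244776434d9dbe75936e341cc24f] Updating <Change 0x7f280c193550 testproj 1381,9>",
]

matches = lines_for_event(sample_log, "5f8b54ab90f6441c8fc254bae58a72e9")
for line in matches:
    print(line)
```

In practice the same helper could be fed the lines of a real log file, for example `lines_for_event(open(path), event_id)`, where `path` points at whichever service log is being inspected.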
@fungicide:matrix.orgyou can also run the scheduler in "debug" mode and get much more verbose logs if you really want to trace every last thing that happened, but normally the info level logging and up should be sufficient for diagnosing why something isn't happening16:26
@sts_:matrix.orgI only have those two lines, none of the other nodes prints any message with that number (or even something that looks like [e: xxxxx])16:26
@sts_:matrix.org> <@fungicide:matrix.org> you can also run the scheduler in "debug" mode and get much more verbose logs if you really want to trace every last thing that happened, but normally the info level logging and up should be sufficient for diagnosing why something isn't happening16:32
How do I run zuul in debug mode? Can I select this in zuul's config file?
@fungicide:matrix.orgsts_: sorry, i was going to refer you to the documentation, but it's looking like we don't actually have documentation of the command-line switches for the services, at least not that i can find. there's -d for debug-level logging, and -f for logging to the tty with the process in the foreground: https://opendev.org/zuul/zuul/src/branch/master/zuul/cmd/__init__.py#L172-L17516:47
@fungicide:matrix.org``zuul-scheduler --help`` will give you context help for those16:48
@clarkb:matrix.orgconfirmed the sphinx program-output directive doesn't seem to be used with the daemon commands16:50
@sts_:matrix.org> <@fungicide:matrix.org> sts_: sorry, i was going to refer you to the documentation, but it's looking like we don't actually have documentation of the command-line switches for the services, at least not that i can find. there's -d for debug-level logging, and -f for logging to the tty with the process in the foreground: https://opendev.org/zuul/zuul/src/branch/master/zuul/cmd/__init__.py#L172-L17516:53
Thanks a lot for all the help! Will try adding these flags when I get home
@fungicide:matrix.orgClark: the plugin's probably overkill given there are only two options (plus help), but we could probably stand to mention them in either or both of the service administration and troubleshooting sections, do you think? i'm happy to write something up for that16:56
@clarkb:matrix.orgfungi: ya. I think those flags are the same for all the commands so you could mention all the zuul services accept the three flags16:57
@fungicide:matrix.orgaha! we sort of already have -f documented at the very top of https://zuul-ci.org/docs/zuul/latest/operation.html17:14
@fungicide:matrix.orgi'll word it up some and also weave a bit about -d in there17:15
@fungicide:matrix.organd then mention that section from the troubleshooting doc17:15
@fungicide:matrix.orgoof, that'll teach me to ``tox -e docs`` in a zuul checkout... it used up the 3.5gb i had available on that filesystem and subsequently broke17:35
@fungicide:matrix.orgi'll just push it up and let zuul tell me if my sphinx/rst is terribad17:36
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [zuul/zuul] 828596: Better document service command-line switches https://review.opendev.org/c/zuul/zuul/+/82859617:36
@clarkb:matrix.orgfungi:  in that change where is the service debug logs <operation> ref?17:41
@fungicide:matrix.orgtop of doc/source/operation.rst17:45
@fungicide:matrix.orgpretty sure ``:ref:`` allows use of implicit (heading) references17:47
@clarkb:matrix.orgoh I guess case doesn't matter?19:20
@clarkb:matrix.orgMy grep failed for that reason I think19:20
@fungicide:matrix.orgyes, refs are case-insensitive as far as i know19:22
@nhicher:matrix.orghello, prior to zuul 5.0.0, we've tested the scheduler was started by waiting for gearman port with an ansible wait_for task. What could be the strategy with 5.0.0 to be sure the scheduler is up and running? Thanks20:15
@clarkb:matrix.orgnhicher: there is a status api now iirc20:16
@tristanc_:matrix.orgnhicher: perhaps using the components endpoint, e.g. by looking for scheduler.state in https://zuul.opendev.org/api/components20:16
@clarkb:matrix.orgya that20:16
@clarkb:matrix.orgyou probably want to wait for one of scheduler, merger, executor, web to be up before considering the whole thing active20:16
@tristanc_:matrix.orgClark: does the scheduler wait for the configuration to be fully loaded before switching to running?20:17
@nhicher:matrix.orgtristanC and Clark thanks, I will have a look20:17
@clarkb:matrix.orgtristanC: I think it may have three states. Let me see20:17
@clarkb:matrix.orgyes when it is starting up it will report initializing then switch to running when configs are loaded and actually ready to process work20:20
@clarkb:matrix.orgfb3d3f7471f3f03a7edfae9706d85db644275d60 that commit if you want to see details20:20
@tristanc_:matrix.orgClark: excellent, thank you very much for confirming20:20
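A rough sketch of the readiness check discussed above, polling the components endpoint until a scheduler reports `running` rather than `initializing`. The function names are hypothetical, and the payload shape is an assumption: the code expects a JSON object keyed by component kind, where each value is a list of entries carrying a `state` field. Verify this against the actual response from your own deployment before relying on it.

```python
import json
import time
import urllib.request


def scheduler_is_running(components):
    """True if at least one scheduler entry reports state 'running'.

    Assumed payload shape: {"scheduler": [{"state": "running", ...}], ...}
    """
    schedulers = components.get("scheduler", [])
    return any(entry.get("state") == "running" for entry in schedulers)


def wait_for_scheduler(url, timeout=300, interval=5):
    """Poll the components endpoint until a scheduler is running or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if scheduler_is_running(json.load(resp)):
                    return True
        except (OSError, ValueError):
            # Web service not reachable yet, or the body was not valid JSON.
            pass
        time.sleep(interval)
    return False
```

A deployment script could then call something like `wait_for_scheduler("https://zuul.example.org/api/components")`, with the hostname replaced by the real one, in place of the old wait on the gearman port.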
@jim:acmegating.comtristanC: there's also a liveness and ready probe prom endpoint: https://zuul-ci.org/docs/zuul/latest/monitoring.html#prometheus-liveness20:25
@jim:acmegating.comnhicher: ^20:25
@tristanc_:matrix.orgcorvus: wow, I completely forgot about that, thanks!20:25
@jim:acmegating.comwe just added the ready one in 4.11 i think20:27
@nhicher:matrix.orgcorvus:  ok, nice, I will test with components first20:27
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [zuul/zuul] 828596: Better document service command-line switches https://review.opendev.org/c/zuul/zuul/+/82859620:49
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 826400: Add zuul-scheduler tenant-reconfigure https://review.opendev.org/c/zuul/zuul/+/82640021:01
-@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 828356: Serialize update of changes in timer driver https://review.opendev.org/c/zuul/zuul/+/82835621:01
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 828429: Make a global component registry https://review.opendev.org/c/zuul/zuul/+/82842921:47
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 828614: Correct exit routine in web, merger, fingergw https://review.opendev.org/c/zuul/zuul/+/82861422:31
@jim:acmegating.comClark, fungi: ^ based on recent adventures in opendev, that seems to make everything exit in all the ways i can think of trying locally22:32
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 828615: [dnm] testing https://review.opendev.org/c/opendev/base-jobs/+/828440 https://review.opendev.org/c/zuul/zuul-jobs/+/82861522:39
@clarkb:matrix.orgcorvus: left a question on that one23:39
@jim:acmegating.comClark: ++ replied23:48
@clarkb:matrix.orgcorvus: do you want to do that in a followup or fix in that change?23:52
@jim:acmegating.comClark: it looks like it passes tests, so i'll go ahead and revise it23:57
