Thursday, 2022-09-22

@q:fricklercloud.deanyone care to review zuul-sphinx some time? https://review.opendev.org/c/zuul/zuul-sphinx/+/84036607:23
also still looking for github users for https://review.opendev.org/c/zuul/zuul/+/834671
@jim:acmegating.comtristanC: would you like to re-review https://review.opendev.org/854458 and https://review.opendev.org/855096 ? (tracing)15:37
@jim:acmegating.comi think we're ready to start merging the first few changes there, and opendev has a jaeger server up now: https://tracing.opendev.org/search15:42
-@gerrit:opendev.org- Zuul merged on behalf of Dr. Jens Harbott: [zuul/zuul-sphinx] 840366: Avoid infinite loop in find_zuul_yaml() https://review.opendev.org/c/zuul/zuul-sphinx/+/84036615:51
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 858960: Detect and handle auth proxy redirects https://review.opendev.org/c/zuul/zuul/+/85896016:52
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/85896117:25
@jim:acmegating.comClark: ^ you talk about an ansible module there, but if you're thinking of doing an ansible module that internally runs shell commands, then you'll be duplicating the command module and will need to handle streaming output manually, etc....17:28
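To illustrate the "handle streaming output manually" point above: a module that wraps shell commands itself has to read the child's output as it is produced rather than waiting for the process to exit. The following is only a minimal sketch in plain Python, not how Ansible's command module or Zuul's log streaming is implemented; `run_streaming` is a hypothetical helper name.

```python
import subprocess
import sys


def run_streaming(cmd):
    """Run cmd, echo each output line as it arrives, return (rc, lines).

    Hypothetical helper for illustration only.
    """
    lines = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered reads in text mode
    )
    # Iterating over the pipe yields lines as the child writes them.
    for line in proc.stdout:
        line = line.rstrip("\n")
        lines.append(line)
        sys.stdout.write(line + "\n")
        sys.stdout.flush()
    rc = proc.wait()
    return rc, lines
```

Even this small version has to decide how to merge stderr, how to buffer, and how to report the return code, which is the duplication of effort being warned about.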
@clarkb:matrix.orgoh thats a good point17:29
@clarkb:matrix.orgwe could use gitpython, but maybe adding deps like that is not a good idea17:29
@clarkb:matrix.orgI think in my mind the best thing to do if possible is to remove these loops entirely. but if that is not practical (due to needing a lot of code and error checking) then simplifying to a single loop like this is probably best. I can update the commit message if you like17:30
@jim:acmegating.comClark: i agree with that; up to you whether you want to update the commit msg; i'll just leave a note either way.17:31
@jim:acmegating.comClark: left one more q on that change, but structurally lgtm17:33
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/85896117:37
@clarkb:matrix.orgI went ahead and added -x. I didn't have it because it wasn't there before but I think that is a good improvement and will help us debug if anything goes wrong in the rewrite17:37
@jim:acmegating.comyeah, i guess it's a judgement call when a script gets complex enough to need -x?17:38
@clarkb:matrix.orgcorvus: in this case I think we can probably drop the -x after the update has been in production for a bit17:39
@clarkb:matrix.orgmostly I think it is a really good idea to have while we're making the change in case something behaves unexpectedly. But once we are confident in it we can trim back the logging verbosity17:39
@jim:acmegating.comok.  as long as we're thinking about it (because any task we remove, we also lose debug context information)17:40
@clarkb:matrix.orgif it doesn't end up being too verbose we can just keep it too17:41
@clarkb:matrix.orgcompared to the git output it might be relatively low volume17:42
@clarkb:matrix.orgThis change might also deserve a base-test iteration17:43
@jim:acmegating.com++17:44
@clarkb:matrix.orgoh yup there is already a test version of this role. Let me add a second change to modify that test role. We can land that first then have opendev base-test use the test role17:44
-@gerrit:opendev.org- Clark Boylan proposed:17:48
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963
@clarkb:matrix.orghttps://review.opendev.org/c/opendev/base-jobs/+/858964 is the base-test update17:50
@clarkb:matrix.orghrm there are a couple of comments I need to carry over too17:53
-@gerrit:opendev.org- Clark Boylan proposed:17:56
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961
-@gerrit:opendev.org- Clark Boylan proposed:18:10
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961
@clarkb:matrix.orgSometimes being your own critic is the best/worst thing. Anyway I think ^ is in a mergeable state now if we want to review and land the changes to the test role18:10
@clarkb:matrix.orgA followup change might move the config update to allow pushes into a non bare repository from mirror-workspace-git-repos into prepare-workspace-git, but I'd need to think about the compatibility ramifications of that18:12
@caiquemello:matrix.orgHi guys, I need some help. I'm facing some weird behavior in my Zuul env.18:57
@caiquemello:matrix.orgMy pipelines are "stuck" right before the Zuul executor runs the job. Zuul.nodepool builds and assigns the node to a job, but when zuul.ExecutorClient executes the job, nothing seems to happen.18:57
@caiquemello:matrix.orgNothing is showing in the console, only "End of stream". The job duration is showing "unknown" instead of the estimated job time. The status bar for each job is in a "loading" state.18:57
@caiquemello:matrix.orgAny idea where I can take a look? The Zuul scheduler and executor don't show any errors in their logs.19:00
@clarkb:matrix.org> <@caiquemello:matrix.org> Any idea where I can take a look? The Zuul scheduler and executor don't show any errors in their logs.20:31
I would check that your executor is accepting jobs. It is possible it may have tripped a governor and stopped running jobs
@clarkb:matrix.orgOtherwise grep the event ID associated with the change and see where it stops processing things and work backward from there20:32
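The "grep the event ID and see where it stops" approach can be sketched in a few lines of Python. This is an assumption-laden illustration: the function names are hypothetical, and it assumes only that the event ID appears as a literal substring in each relevant log line, not any particular Zuul log format.

```python
def lines_for_event(log_lines, event_id):
    """Return, in order, the log lines that mention event_id."""
    return [line for line in log_lines if event_id in line]


def last_seen(log_lines, event_id):
    """Return the last line mentioning event_id, or None if there is none.

    The last match is the point to start working backward from.
    """
    matches = lines_for_event(log_lines, event_id)
    return matches[-1] if matches else None
```

Reading the scheduler and executor logs into a list and calling `last_seen(lines, "<event id>")` gives the last thing recorded for that event before processing stalled.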
@hanson76:matrix.orgWe see strange behavior when upgrading Zuul from 6.3.0 to 6.4.020:37
Our tests fail: we start a Java application on the "testnode" (an AWS node) and then try to access it from the same testnode.
It does not work and we get a "Connection refused: testnode/1.2.3.4:10250" error (IP replaced for this question, it's not 1.2.3.4).
The same test works as soon as we do a rollback to Zuul 6.3.0.
We have tried multiple times.
Does this have something to do with the switch to Ansible 5, and has anyone seen it before?
@clarkb:matrix.orgAnsible 5 broke some connection specific configuration settings. Do you rely on any connection settings to connect to the node?20:38
@clarkb:matrix.orgAlso can you test with 6.4.0 using Ansible 2.9?20:39
@clarkb:matrix.orgThat may quickly rule out some possibilities 20:39
@hanson76:matrix.orgWe do rely on the 'multi-node-hosts-file' role to populate the hosts files and then refer to the hostname and not ip address when we connect.20:46
Will check with 2.9 now
@clarkb:matrix.orgcorvus: fungi if you are around maybe we can review and land https://review.opendev.org/c/zuul/zuul-jobs/+/858963 to get a start on testing that change?20:50
@jim:acmegating.comClark: i think i can single-core approve that one so i did20:51
@clarkb:matrix.orgthanks20:52
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/85896321:00
@hanson76:matrix.orgGetting the same error when running 6.4.0 and specifying Ansible 2.9.21:02
Verified that it ran with 2.9, checked inventory.yaml
@clarkb:matrix.orgok so not likely related to the ansible 5 connection configuration bug I mentioned21:16
@clarkb:matrix.orgI would probably hold the node and inspect it to see what may be going on21:17
@clarkb:matrix.orgI can't think of any other likely things related to the zuul update (though I haven't trolled through the git log for culprits)21:17
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 858987: WIP: Rename admin-rule to authorization-rule https://review.opendev.org/c/zuul/zuul/+/85898721:19
@jim:acmegating.comif anyone wants to bikeshed over that ^ before i go too far down that rabbit hole, feel free21:21
@clarkb:matrix.orgcorvus: I haven't reviewed the change but read the commit message and it makes sense to update that way22:04
@jim:acmegating.comthanks :)22:07
@hanson76:matrix.orgDid some more troubleshooting, ran 'ss -ltn' on the testnode after the application is started, the strange thing is that it does not list the port that the application is supposed to listen on.22:13
@clarkb:matrix.org> <@hanson76:matrix.org> Did some more troubleshooting, ran 'ss -ltn' on the testnode after the application is started, the strange thing is that it does not list the port that the application is supposed to listen on.22:27
And this is some service that your ansible tasks should be starting? Can you check the service logs?
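As a complement to the `ss -ltn` check mentioned above, a job could probe the port directly and fail early with a clear message. A minimal sketch using only the Python standard library; `is_listening` is a hypothetical helper, and the host and port are whatever the test expects (10250 in the error quoted earlier).

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        # (for example ECONNREFUSED when nothing is listening).
        return s.connect_ex((host, port)) == 0
```

A "Connection refused" result here matches what `ss -ltn` shows when the port is absent: nothing is bound to it, which points at the application not running rather than at name resolution or firewalling.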
@hanson76:matrix.orgYes, using ansible.builtin.command to start it, it becomes a daemon. 22:30
The logs look the same with 6.3.0 and 6.4.0; I have to enable a bit more debug logging to see more.
Will continue troubleshooting tomorrow.
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 854458: Add support for configuring and testing tracing https://review.opendev.org/c/zuul/zuul/+/85445822:36
@hanson76:matrix.orgDid a last test, it looks like the application is killed for some reason after the ansible.builtin.command when running with 6.4.0.22:40
Ran ps -deaf in a task right after the command task and the application is not listed with 6.4.0 but is with 6.3.0.
The only difference between the runs is the Zuul version: I updated all Zuul components via kubectl apply and changed nothing else.
I then trigger the same change, which fails in 6.4.0.
@clarkb:matrix.orgI'm surprised that ansible 2.9 failed in that case. I don't think anything in zuul would affect that. I suppose something in the console logging or command hooks could though?22:47
@clarkb:matrix.orgHowever, if the daemon isn't daemonizing then you'd expect the process to die when its parent goes away.22:48
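The point about daemonizing can be made concrete. A process that stays in its parent's session and keeps the parent's stdio can be taken down when the parent goes away; one common way to avoid that on POSIX systems is to start it in a new session with its stdio detached. The sketch below shows that general technique in Python. It is not a diagnosis of the failure discussed here, and `start_detached` and `in_own_session` are hypothetical helper names.

```python
import os
import subprocess


def start_detached(cmd):
    """Start cmd in its own session with stdio detached (POSIX only)."""
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # the child calls setsid() before exec
    )


def in_own_session(proc):
    """Return True if the child is the leader of its own session."""
    return os.getsid(proc.pid) == proc.pid
```

If a service launched from a task does not do the equivalent of this itself, checking its session and parent process right after the task runs can show whether it is still tied to the task that launched it.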
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858995: DNM reparenting multinode to base-test to test devstack https://review.opendev.org/c/zuul/zuul-jobs/+/85899523:03
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 858726: WIP: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/85872623:57
