@q:fricklercloud.de | anyone care to review zuul-sphinx some time? https://review.opendev.org/c/zuul/zuul-sphinx/+/840366 | 07:23 |
---|---|---|
also still looking for github users for https://review.opendev.org/c/zuul/zuul/+/834671 | ||
@jim:acmegating.com | tristanC: would you like to re-review https://review.opendev.org/854458 and https://review.opendev.org/855096 ? (tracing) | 15:37 |
@jim:acmegating.com | i think we're ready to start merging the first few changes there, and opendev has a jaeger server up now: https://tracing.opendev.org/search | 15:42 |
-@gerrit:opendev.org- Zuul merged on behalf of Dr. Jens Harbott: [zuul/zuul-sphinx] 840366: Avoid infinite loop in find_zuul_yaml() https://review.opendev.org/c/zuul/zuul-sphinx/+/840366 | 15:51 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 858960: Detect and handle auth proxy redirects https://review.opendev.org/c/zuul/zuul/+/858960 | 16:52 | |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:25 | |
@jim:acmegating.com | Clark: ^ you talk about an ansible module there, but if you're thinking of doing an ansible module that internally runs shell commands, then you'll be duplicating the command module and will need to handle streaming output manually, etc.... | 17:28 |
@clarkb:matrix.org | oh thats a good point | 17:29 |
@clarkb:matrix.org | we could use gitpython, but maybe adding deps like that is not a good idea | 17:29 |
@clarkb:matrix.org | I think in my mind the best thing to do if possible is to remove these loops entirely. but if that is not practical (due to needing a lot of code and error checking) then simplifying to a single loop like this is probably best. I can update the commit message if you like | 17:30 |
@jim:acmegating.com | Clark: i agree with that; up to you whether you want to update the commit msg; i'll just leave a note either way. | 17:31 |
@jim:acmegating.com | Clark: left one more q on that change, but structurally lgtm | 17:33 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | 17:37 | |
@clarkb:matrix.org | I went ahead and added -x. I didn't have it because it wasn't there before but I think that is a good improvement and will help us debug if anything goes wrong in the rewrite | 17:37 |
@jim:acmegating.com | yeah, i guess it's a judgement call when a script gets complex enough to need -x? | 17:38 |
@clarkb:matrix.org | corvus: in this case I think we can probably drop the -x after the update has been in production for a bit | 17:39 |
@clarkb:matrix.org | mostly I think it is a really good idea to have while we're making the change in case something behaves unexpectedly. But once we are confident in it we can trim back the logging verbosity | 17:39 |
@jim:acmegating.com | ok. as long as we're thinking about it (because any task we remove, we also lose debug context information) | 17:40 |
@clarkb:matrix.org | if it doesn't end up being too verbose we can just keep it too | 17:41 |
@clarkb:matrix.org | compared to the git output it might be relatively low volume | 17:42 |
@clarkb:matrix.org | This change might also deserve a base-test iteration | 17:43 |
@jim:acmegating.com | ++ | 17:44 |
@clarkb:matrix.org | oh yup there is already a test version of this role. Let me add a second change to modify that test role. We can land that first then have opendev base-test use the test role | 17:44 |
-@gerrit:opendev.org- Clark Boylan proposed: | 17:48 | |
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | ||
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | ||
@clarkb:matrix.org | https://review.opendev.org/c/opendev/base-jobs/+/858964 is the base-test update | 17:50 |
@clarkb:matrix.org | hrm there are a couple of comments I need to carry over too | 17:53 |
-@gerrit:opendev.org- Clark Boylan proposed: | 17:56 | |
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | ||
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | ||
-@gerrit:opendev.org- Clark Boylan proposed: | 18:10 | |
- [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | ||
- [zuul/zuul-jobs] 858961: Reduce the number of loops in prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858961 | ||
@clarkb:matrix.org | Sometimes being your own critic is the best/worst thing. Anyway I think ^ is in a mergeable state now if we want to review and land the changes to the test role | 18:10 |
@clarkb:matrix.org | A followup change might move the config update to allow pushes into a non bare repository from mirror-workspace-git-repos into prepare-workspace-git, but I'd need to think about the compatibility ramifications of that | 18:12 |
@caiquemello:matrix.org | Hi guys, I need some help. I'm facing some weird behavior in my Zuul env. | 18:57 |
@caiquemello:matrix.org | My pipelines are "stuck" right before the executor runs the job. Zuul.nodepool builds and assigns the node to a job, but when zuul.ExecutorClient executes the job, nothing seems to happen. | 18:57 |
@caiquemello:matrix.org | Nothing is showing in the console, only "End of stream". The job duration is showing "unknown" instead of the estimated job time. The status bar for each job is in a "loading" state. | 18:57 |
@caiquemello:matrix.org | Any idea where I can take a look? Zuul scheduler and executor don't show any error log. | 19:00 |
@clarkb:matrix.org | > <@caiquemello:matrix.org> Any idea where I can take a look? Zuul scheduler and executor don't show any error log. | 20:31 |
I would check that your executor is accepting jobs. It is possible it may have tripped a governor and stopped running jobs | ||
@clarkb:matrix.org | Otherwise grep the event ID associated with the change and see where it stops processing things and work backward from there | 20:32 |
@hanson76:matrix.org | We see strange behavior when upgrading Zuul from 6.3.0 to 6.4.0 | 20:37 |
Our tests fail: we start a Java application on the "testnode" (an AWS node) and then try to access it from the same testnode.
It does not work and we get a "Connection refused: testnode/1.2.3.4:10250" error (ip replaced for this question, it's not 1.2.3.4) | ||
The same test works as soon as we do a rollback to Zuul 6.3.0. | ||
Have tried multiple times. | ||
Does this have something to do with the switch to Ansible 5, and has anyone seen it before?
@clarkb:matrix.org | Ansible 5 broke some connection specific configuration settings. Do you rely on any connection settings to connect to the node? | 20:38 |
@clarkb:matrix.org | Also can you test with 6.4.0 using Ansible 2.9? | 20:39 |
@clarkb:matrix.org | That may quickly rule out some possibilities | 20:39 |
@hanson76:matrix.org | We do rely on the 'multi-node-hosts-file' role to populate the hosts files and then refer to the hostname and not ip address when we connect. | 20:46 |
Will check with 2.9 now | ||
@clarkb:matrix.org | corvus: fungi if you are around maybe we can review and land https://review.opendev.org/c/zuul/zuul-jobs/+/858963 to get a start on testing that change? | 20:50 |
@jim:acmegating.com | Clark: i think i can single-core approve that one so i did | 20:51 |
@clarkb:matrix.org | thanks | 20:52 |
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [zuul/zuul-jobs] 858963: Test speedup change to prepare-workspace-git https://review.opendev.org/c/zuul/zuul-jobs/+/858963 | 21:00 | |
@hanson76:matrix.org | Getting the same error when running 6.4.0 and specifying Ansible 2.9. | 21:02 |
Verified that it ran with 2.9, checked inventory.yaml | ||
@clarkb:matrix.org | ok so not likely related to the ansible 5 connection configuration bug I mentioned | 21:16 |
@clarkb:matrix.org | I would probably hold the node and inspect it to see what may be going on | 21:17 |
@clarkb:matrix.org | I can't think of any other likely things related to the zuul update (though I haven't trawled through the git log for culprits) | 21:17 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 858987: WIP: Rename admin-rule to authorization-rule https://review.opendev.org/c/zuul/zuul/+/858987 | 21:19 | |
@jim:acmegating.com | if anyone wants to bikeshed over that ^ before i go too far down that rabbit hole, feel free | 21:21 |
@clarkb:matrix.org | corvus: I haven't reviewed the change but read the commit message and it makes sense to update that way | 22:04 |
@jim:acmegating.com | thanks :) | 22:07 |
@hanson76:matrix.org | Did some more troubleshooting: ran 'ss -ltn' on the testnode after the application is started. The strange thing is that it does not list the port that the application is supposed to listen on. | 22:13 |
@clarkb:matrix.org | > <@hanson76:matrix.org> Did some more troubleshooting: ran 'ss -ltn' on the testnode after the application is started. The strange thing is that it does not list the port that the application is supposed to listen on. | 22:27 |
And this is some service that your ansible tasks should be starting? Can you check the service logs? | ||
@hanson76:matrix.org | Yes, using ansible.builtin.command to start it, it becomes a daemon. | 22:30 |
The logs look the same with 6.3.0 and 6.4.0; I have to enable a bit more debug logging to see more.
Will continue troubleshooting tomorrow. | ||
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 854458: Add support for configuring and testing tracing https://review.opendev.org/c/zuul/zuul/+/854458 | 22:36 | |
@hanson76:matrix.org | Did a last test, it looks like the application is killed for some reason after the ansible.builtin.command when running with 6.4.0. | 22:40 |
Ran ps -deaf in a task right after the command task and the application is not listed with 6.4.0 but is with 6.3.0. | ||
The only difference between the runs is the Zuul version, which I updated for all parts of Zuul via kubectl apply.
I'm then triggering the same change that fails in 6.4.0. | ||
@clarkb:matrix.org | I'm surprised that ansible 2.9 failed in that case. I don't think anything in zuul would affect that. I suppose something in the console logging or command hooks could though? | 22:47 |
@clarkb:matrix.org | However, if the daemon isn't daemonizing then you'd expect the process to die when its parent goes away. | 22:48 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 858995: DNM reparenting multinode to base-test to test devstack https://review.opendev.org/c/zuul/zuul-jobs/+/858995 | 23:03 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 858726: WIP: Fix CORS and endpoint in AWS log upload https://review.opendev.org/c/zuul/zuul-jobs/+/858726 | 23:57 |
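
Note on the 22:48 remark about daemonizing: the log does not show how the Java application detaches, and the cause of the 6.4.0 behavior is not established here. As a rough illustration only, one common ingredient of daemonizing is starting the process in its own session so it is not tied to the launcher's session. The sketch below assumes a POSIX system; the helper name `spawn_detached` is hypothetical and is not part of Zuul or Ansible.

```python
import subprocess


def spawn_detached(argv):
    """Start argv in a new session (POSIX only); hypothetical helper.

    The child is made a session leader before exec, so it no longer
    belongs to the session or process group of the process that
    launched it.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # do not inherit the launcher's stdin
        stdout=subprocess.DEVNULL,  # a real service would log to a file
        stderr=subprocess.DEVNULL,
        start_new_session=True,     # child calls setsid() before exec
        close_fds=True,
    )
```

For example, `spawn_detached(["java", "-jar", "app.jar"])` (with `app.jar` as a placeholder) returns immediately with a `Popen` handle. Checking `os.getsid(proc.pid)` against `proc.pid` is one way to confirm that the child really is in its own session.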