-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 936999: Fix auth redirect problem when root url is accessed https://review.opendev.org/c/zuul/zuul/+/936999 | 09:56 | |
@hanson76:matrix.org | Have done a test where I use a debug task to write out the complete hostvars and they are identical between 11.1.0 and 11.2.0 runs except dynamic ids etc. | 12:10 |
---|---|---|
I did notice that ansible_version is 8 for 11.1.0 and 9 for 11.2.0 | ||
I have not yet tried 11.2.0 with ansible 8; I tried to set it on the job but failed. | ||
@fungicide:matrix.org | hanson76: what clarkb suggested is overriding the ansible version to 9 in one of your working jobs on zuul 11.1.0 to see if the error is the same | 13:26 |
@hanson76:matrix.org | Did a test with ansible-version set to 9 and zuul 11.1.0, that did not fail, the multi-node-hosts-file role works as expected. | 14:20 |
Did verify that we used 9, the ansible_version var at runtime has full version set to 2.16.11 for 9 and 2.15.12 for 8. | ||
Running 11.2.0 with ansible 8 does not fix the problem either. | ||
Out of ideas what to try now to figure this out. | ||
@fungicide:matrix.org | well, seems like you've probably ruled out the default ansible version difference at least | 14:22 |
@fungicide:matrix.org | looking at the erroring task, i wonder if the hostvars array has different/additional entries under zuul 11.2.0 for some reason | 14:29 |
@dfajfer:fsfe.org | is there a way to check my playbooks if they're good for a new ansible version? | 14:31 |
@dfajfer:fsfe.org | it might be a silly question but except for updating them whenever I saw any deprecation warnings or reading something and updating them I don't know if all of them are up to date | 14:32 |
@fungicide:matrix.org | dfajfer: zuul adds support for new ansible versions before it increases the default version, so you should be able to propose a job configuration change for an untrusted repo and have the jobs in question run speculatively with the ansible version overridden to what you're curious about | 14:33 |
@fungicide:matrix.org | hanson76: i've checked our zuul deployment, and we do seem to be successfully using the same role you're seeing errors for: https://zuul.opendev.org/t/openstack/build/22b4eddba24447d0bd65dbce3463a09d/console#1/0/4/compute1 | 14:34 |
@fungicide:matrix.org | dfajfer: here's an example of overriding ansible-version: https://opendev.org/zuul/zuul-jobs/src/branch/master/zuul-tests.d/general-roles-jobs.yaml#L309 | 14:39 |
@fungicide:matrix.org | and the corresponding documentation on that option: https://zuul-ci.org/docs/zuul/latest/config/job.html#attr-job.ansible-version | 14:40 |
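For reference, a minimal sketch of the kind of override being discussed, assuming you already have a job to inherit from. The job names are hypothetical; only the `ansible-version` attribute comes from the documentation linked above.

```yaml
# Hypothetical job variant; "my-multinode-job" stands in for one of your own jobs.
- job:
    name: my-multinode-job-ansible9
    parent: my-multinode-job
    # Run this job's playbooks with Ansible 9 instead of the default version.
    ansible-version: 9
```

Proposing a change like this to an untrusted repo should let the variant run speculatively before anything merges.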
@dfajfer:fsfe.org | yeah I did that, the thing is im gonna jump a few versions - not sure if there's an easy way to do this | 14:40 |
@dfajfer:fsfe.org | from the looks of it - nope, just some legwork for us to do then | 14:41 |
@fungicide:matrix.org | are you skipping major versions of zuul? support for an ansible version is only dropped at major version increases of zuul, and the default never moves to a version that was unsupported in the previous major zuul revision, so as long as you pause your upgrading and run the last patch release of each major version you should have the option to incrementally switch jobs over from old ansible versions to newer ones before you reach a zuul version where support is gone | 14:43 |
@fungicide:matrix.org | the release notes are also good about mentioning when support for a new version of ansible is introduced, when the default version changes, and when support for an old ansible version is dropped | 14:44 |
@dfajfer:fsfe.org | not skipping major versions, not even trying to, I'm in the middle of that process already got to know release notes/changelogs. I'm just wondering about custom playbooks I wrote, thought there's some static analysis that could tell me of deprecation | 14:52 |
@fungicide:matrix.org | unless the ansible community has a tool for that i'm unaware of, not really no | 14:53 |
@fungicide:matrix.org | at least for the content of the upstream zuul-jobs standard library we have jobs which exercise the roles as much as possible so we catch that before increasing the default in zuul itself | 14:54 |
@fungicide:matrix.org | and in the opendev collaboratory we pin and test increasing the ansible default at the tenant level so we can find issues early and, if necessary, easily do a temporary rollback | 14:55 |
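A rough sketch of what pinning the default at the tenant level might look like, assuming the `default-ansible-version` tenant attribute from the Zuul tenant configuration docs. The tenant name is hypothetical and the rest of the tenant definition is omitted.

```yaml
# Tenant configuration fragment; "example-tenant" is a placeholder name.
- tenant:
    name: example-tenant
    # Jobs that do not set ansible-version themselves use this version.
    # Change it to test a newer Ansible; change it back to roll back.
    default-ansible-version: 9
```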
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 937035: Remove implicit smart reconfig during startup https://review.opendev.org/c/zuul/zuul/+/937035 | 15:16 | |
@clarkb:matrix.org | We also typically run specific canary jobs that are more complex under specific ansible versions before bumping it tenant wide | 15:57 |
@clarkb:matrix.org | typically we find one or two issues with the canary jobs and fixing those makes the update tenant wide pretty safe | 15:57 |
@clarkb:matrix.org | (we discover issues with the canary jobs then search for those specific problems tenant wide and fix them tenant wide if possible before bumping) | 15:57 |
@clarkb:matrix.org | corvus: in rereading my prep for OpenDev's upcoming Gerrit 3.10 upgrade I noticed https://gerrit-review.googlesource.com/c/gerrit/+/404717 this change only deprecates review commands without a project argument so I don't think it is a problem for our 3.10 upgrade but it may be for 3.11 if zuul relies on that. I feel like we discussed this previously but my attempt at searching matrix is failing to find it. Anyway not an issue for OpenDev's upgrade but wanted to call it out for potential future impacts. | 18:15 |
@jim:acmegating.com | here's the earlier convo: https://matrix.to/#/!yuuvjJSOEGSfTzxOjK:opendev.org/$zTb_ubhZXFNNLmfoXDkZDnwwg5PwT6rmtg4ZKPV29ZQ?via=opendev.org&via=matrix.org&via=fricklercloud.de | 18:48 |
@clarkb:matrix.org | huh I searched by change number but maybe needed to use the full url to get whatever indexer matrix is using to find it. Thanks for digging that up | 18:49 |
@jim:acmegating.com | yeah, change no did not work for me, but "3.10" did :| | 19:23 |
@hanson76:matrix.org | I now think I have found what causes multi-node-hosts-file to not work with zuul 11.2.0 for me, but I have no clue why. | 20:38 |
I have set include-vars.use-ref: false on the job that uses multi-node-hosts-file and it now works with 11.2.0. | ||
I did see that the hostvars include "include_vars": [] in 11.2.0 but not in 11.1.0, no clue why that matters or can cause issues. | ||
We do not use include-vars in any of our projects. | ||
@fungicide:matrix.org | includevars is new in 11.2.0, yes. i suppose where that task is iterating through hostvars it's assuming every entry is a host, which includevars is not. still no idea why opendev hasn't been hitting the problem | 20:45 |
@clarkb:matrix.org | That's a new feature do any existing library jobs or roles use it? | 20:52 |
@jim:acmegating.com | could be related to https://nvd.nist.gov/vuln/detail/CVE-2024-11079 in https://github.com/ansible/ansible/blob/v2.16.14/changelogs/CHANGELOG-v2.16.rst#v2-16-14 | 20:54 |
@jim:acmegating.com | it's another ansible behavior change cve in a micro release | 20:54 |
@jim:acmegating.com | opendev is one micro behind due to the release timing | 20:55 |
@jim:acmegating.com | it would probably be a good idea to confirm that with a simple playbook using a newer image and fix it in zuul-jobs if confirmed. if the guess above is right, then opendev may start seeing it on friday/saturday. | 21:01 |
@clarkb:matrix.org | Using nested Ansible? | 21:02 |
@jim:acmegating.com | ya, probably just invoking /usr/local/lib/zuul/ansible/9/bin/ansible-playbook with a simple playbook and inventory would be enough. but getting the actual inventory with the correct !unsafe tags may be tricky | 21:03 |
@jim:acmegating.com | i've got one handy, i'll do that check real quick | 21:05 |
@clarkb:matrix.org | thanks. I need to finish up my train of thought on this gerrit openid thing and get it pushed upstream today while it's fresh in my mind so not in a good spot to debug ansible | 21:08 |
@jim:acmegating.com | yep that reproduces | 21:11 |
@fungicide:matrix.org | so it's coming from ansible 2.16.14 specifically? and i guess we're running with 2.16.13 in opendev? | 21:12 |
@jim:acmegating.com | yep | 21:13 |
@clarkb:matrix.org | reading the red hat notice about it, it unfortunately doesn't say much about the behavior change in the code, just that you the user should change your behavior | 21:45 |
@jim:acmegating.com | okay the reason is very strange and a little concerning. it appears that `localhost` now has an entry in hostvars. so every time we iterate over the hosts in hostvars, we get localhost even though it's not explicitly listed in inventory. | 21:54 |
@jim:acmegating.com | the behavior change with adding localhost doesn't jibe with the cve; non-zero chance this is unintended fallout. | 21:55 |
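A small playbook sketch that could be used to check for the behavior described above. It is an illustration only and is not taken from the zuul-jobs change. It assumes that `groups['all']` lists only hosts explicitly present in the inventory, while `hostvars` may now also contain the implicit `localhost`.

```yaml
- hosts: all
  gather_facts: false
  tasks:
    # On affected Ansible releases this list may include "localhost"
    # even when the inventory does not define it.
    - name: Show every key currently present in hostvars
      debug:
        msg: "{{ hostvars.keys() | list }}"
      run_once: true

    # Looping over the inventory group instead of hostvars itself
    # avoids picking up entries that are not real inventory hosts.
    - name: Iterate over inventory hosts only
      debug:
        msg: "{{ item }} -> {{ hostvars[item].ansible_host | default('n/a') }}"
      loop: "{{ groups['all'] }}"
      run_once: true
```

Comparing the first task's output between the two Ansible micro releases should show whether the extra `localhost` entry is present.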
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 937071: Protect hostvars iterations from implicit localhost https://review.opendev.org/c/zuul/zuul-jobs/+/937071 | 21:59 | |
@jim:acmegating.com | i'm a little worried the blast radius could be larger, but that's what i found via grep. | 22:00 |
@jim:acmegating.com | https://github.com/ansible/ansible/commit/70e83e72b43e05e57eb42a6d52d01a4d9768f510#diff-b36fa2cf8e062dc662300f9e707a65fcdd9d6c3a51927a589643b67d3352e4f0L97 | 22:12 |
@jim:acmegating.com | yeah they really did change that as part of the cve fix with no explanation why | 22:12 |
@fungicide:matrix.org | weird | 22:15 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 937071: Protect hostvars iterations from implicit localhost https://review.opendev.org/c/zuul/zuul-jobs/+/937071 | 23:02 | |
@clarkb:matrix.org | Hanson: ^ I guess try again and see if that is happier now? | 23:10 |