Wednesday, 2024-12-04

-@gerrit:opendev.org- Dong Zhang proposed: [zuul/zuul] 936999: Fix auth redirect problem when root url is accessed https://review.opendev.org/c/zuul/zuul/+/936999 09:56
@hanson76:matrix.orgHave done a test where I use a debug task to write out the complete hostvars and they are identical between 11.1.0 and 11.2.0 runs except dynamic ids etc.12:10
I did notice that ansible_version is 8 for 11.1.0 and 9 for 11.2.0
I have not yet tried 11.2.0 with ansible 8; I tried to set it on the job but failed.
@fungicide:matrix.orghanson76: what clarkb suggested is overriding the ansible version to 9 in one of your working jobs on zuul 11.1.0 to see if the error is the same13:26
@hanson76:matrix.orgDid a test with ansible-version set to 9 and zuul 11.1.0, that did not fail, the multi-node-hosts-file role works as expected.14:20
Did verify that we used 9, the ansible_version var at runtime has full version set to 2.16.11 for 9 and 2.15.12 for 8.
Running 11.2.0 with ansible 8 does not fix the problem either.
Out of ideas what to try now to figure this out.
@fungicide:matrix.orgwell, seems like you've probably ruled out the default ansible version difference at least14:22
@fungicide:matrix.orglooking at the erroring task, i wonder if the hostvars array has different/additional entries under zuul 11.2.0 for some reason14:29
@dfajfer:fsfe.orgis there a way to check my playbooks if they're good for a new ansible version?14:31
@dfajfer:fsfe.orgit might be a silly question, but apart from updating them whenever I saw deprecation warnings or read about a change, I don't know if all of them are up to date14:32
@fungicide:matrix.orgdfajfer: zuul adds support for new ansible versions before it increases the default version, so you should be able to propose a job configuration change for an untrusted repo and have the jobs in question run speculatively with the ansible version overridden to what you're curious about14:33
@fungicide:matrix.orghanson76: i've checked our zuul deployment, and we do seem to be successfully using the same role you're seeing errors for: https://zuul.opendev.org/t/openstack/build/22b4eddba24447d0bd65dbce3463a09d/console#1/0/4/compute1 14:34
@fungicide:matrix.orgdfajfer: here's an example of overriding ansible-version: https://opendev.org/zuul/zuul-jobs/src/branch/master/zuul-tests.d/general-roles-jobs.yaml#L309 14:39
@fungicide:matrix.organd the corresponding documentation on that option: https://zuul-ci.org/docs/zuul/latest/config/job.html#attr-job.ansible-version 14:40
@dfajfer:fsfe.orgyeah I did that, the thing is im gonna jump a few versions - not sure if there's an easy way to do this14:40
@dfajfer:fsfe.orgfrom the looks of it - nope, just some legwork for us to do then14:41
@fungicide:matrix.orgare you skipping major versions of zuul? support for an ansible version is only dropped at major version increases of zuul, and the default never moves to a version that was unsupported in the previous major zuul revision, so as long as you pause your upgrading and run the last patch release of each major version you should have the option to incrementally switch jobs over from old ansible versions to newer ones before you reach a zuul version where support is gone14:43
@fungicide:matrix.orgthe release notes are also good about mentioning when support for a new version of ansible is introduced, when the default version changes, and when support for an old ansible version is dropped14:44
@dfajfer:fsfe.orgnot skipping major versions, not even trying to. I'm in the middle of that process already and have gotten to know the release notes/changelogs. I'm just wondering about custom playbooks I wrote, thought there might be some static analysis that could tell me about deprecations14:52
@fungicide:matrix.orgunless the ansible community has a tool for that i'm unaware of, not really no14:53
@fungicide:matrix.orgat least for the content of the upstream zuul-jobs standard library we have jobs which exercise the roles as much as possible so we catch that before increasing the default in zuul itself14:54
@fungicide:matrix.organd in the opendev collaboratory we pin and test increasing the ansible default at the tenant level so we can find issues early and, if necessary, easily do a temporary rollback14:55
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 937035: Remove implicit smart reconfig during startup https://review.opendev.org/c/zuul/zuul/+/937035 15:16
@clarkb:matrix.orgWe also typically run specific canary jobs that are more complex under specific ansible versions before bumping it tenant wide15:57
@clarkb:matrix.orgtypically we find one or two issues with the canary jobs and fixing those makes the update tenant wide pretty safe15:57
@clarkb:matrix.org(we discover issues with the canary jobs then search for those specific problems tenant wide and fix them tenant wide if possible before bumping)15:57
@clarkb:matrix.orgcorvus: in rereading my prep for OpenDev's upcoming Gerrit 3.10 upgrade I noticed https://gerrit-review.googlesource.com/c/gerrit/+/404717 this change only deprecates review commands without a project argument so I don't think it is a problem for our 3.10 upgrade but it may be for 3.11 if zuul relies on that. I feel like we discussed this previously but my attempt at searching matrix is failing to find it. Anyway not an issue for OpenDev's upgrade but wanted to call it out for potential future impacts.18:15
@jim:acmegating.comhere's the earlier convo: https://matrix.to/#/!yuuvjJSOEGSfTzxOjK:opendev.org/$zTb_ubhZXFNNLmfoXDkZDnwwg5PwT6rmtg4ZKPV29ZQ?via=opendev.org&via=matrix.org&via=fricklercloud.de 18:48
@clarkb:matrix.orghuh I searched by change number but maybe needed to use the full url to get whatever indexer matrix is using to find it. Thanks for digging that up18:49
@jim:acmegating.comyeah, change no did not work for me, but "3.10" did :|19:23
@hanson76:matrix.orgI now think I have found what causes multi-node-hosts-file to not work with zuul 11.2.0 for me, but I have no clue why.20:38
I have set include-vars.use-ref: false on the job that uses multi-node-hosts-file and they now work with 11.2.0.
I did see that the hostvars include "include_vars": [] in 11.2.0 but not in 11.1.0; no clue why that matters or how it can cause issues.
We do not use include-vars in any of our projects.
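A difference like the extra "include_vars" entry can be surfaced mechanically rather than by eye. The following is only a sketch in Python: it assumes each run's variables were dumped to a dict (for example from the debug task output mentioned earlier), and the `diff_vars` helper, the `build_id` key, and all sample values are hypothetical rather than taken from the real builds.

```python
def diff_vars(old, new, ignore=()):
    """Compare two variable dicts.

    Returns (only_in_old, only_in_new, changed) as sorted lists of keys,
    skipping any key named in `ignore` (e.g. per-build dynamic ids).
    """
    skip = set(ignore)
    old_keys = set(old) - skip
    new_keys = set(new) - skip
    # Keys present in both dumps whose values are not equal.
    changed = sorted(k for k in old_keys & new_keys if old[k] != new[k])
    return sorted(old_keys - new_keys), sorted(new_keys - old_keys), changed


# Hypothetical dumps of one host's variables from the two runs.
vars_11_1 = {"ansible_host": "10.0.0.2", "build_id": "aaa111"}
vars_11_2 = {"ansible_host": "10.0.0.2", "build_id": "bbb222", "include_vars": []}

only_old, only_new, changed = diff_vars(vars_11_1, vars_11_2, ignore=["build_id"])
print("only in 11.1.0:", only_old)
print("only in 11.2.0:", only_new)
print("changed:", changed)
```

Listing the dynamic ids in `ignore` keeps them from drowning out the keys that actually differ between versions.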
@fungicide:matrix.orgincludevars is new in 11.2.0, yes. i suppose where that task is iterating through hostvars it's assuming every entry is a host, which includevars is not. still no idea why opendev hasn't been hitting the problem20:45
@clarkb:matrix.orgThat's a new feature; do any existing library jobs or roles use it?20:52
@jim:acmegating.comcould be related to https://nvd.nist.gov/vuln/detail/CVE-2024-11079 in https://github.com/ansible/ansible/blob/v2.16.14/changelogs/CHANGELOG-v2.16.rst#v2-16-14 20:54
@jim:acmegating.comit's another ansible behavior change cve in a micro release20:54
@jim:acmegating.comopendev is one micro behind due to the release timing20:55
@jim:acmegating.comit would probably be a good idea to confirm that with a simple playbook using a newer image and fix it in zuul-jobs if confirmed.  if the guess above is right, then opendev may start seeing it on friday/saturday.21:01
@clarkb:matrix.orgUsing nested Ansible?21:02
@jim:acmegating.comya, probably just invoking /usr/local/lib/zuul/ansible/9/bin/ansible-playbook with a simple playbook and inventory would be enough.  but getting the actual inventory with the correct !unsafe tags may be tricky21:03
@jim:acmegating.comi've got one handy, i'll do that check real quick21:05
@clarkb:matrix.orgthanks. I need to finish up my train of thought on this gerrit openid thing and get it pushed upstream today while it's fresh in my mind so not in a good spot to debug ansible21:08
@jim:acmegating.comyep that reproduces21:11
@fungicide:matrix.orgso it's coming from ansible 2.16.14 specifically? and i guess we're running with 2.16.13 in opendev?21:12
@jim:acmegating.comyep21:13
@clarkb:matrix.orgreading the red hat notice about it, it unfortunately doesn't say much about the behavior change in the code, just that you the user should change your behavior21:45
@jim:acmegating.comokay the reason is very strange and a little concerning.  it appears that `localhost` now has an entry in hostvars.  so every time we iterate over the hosts in hostvars, we get localhost even though it's not explicitly listed in inventory.21:54
@jim:acmegating.comthe behavior change with adding localhost doesn't jibe with the cve; non-zero chance this is unintended fallout.21:55
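The failure mode described here can be illustrated with a small Python sketch. It is not the zuul-jobs role and not necessarily how the fix works; the `hosts_file_lines` helper and the sample data are hypothetical, and the `nodepool.private_ipv4` lookup merely stands in for whatever per-host variable a task reads. The point is that iterating over every key in hostvars picks up the implicit `localhost` entry, while iterating over the hosts explicitly listed in inventory does not.

```python
def hosts_file_lines(hostvars, inventory_hosts=None):
    """Build "<ip> <name>" lines for a hosts file.

    With inventory_hosts=None every key in hostvars is treated as a host
    (the fragile pattern). Otherwise only hosts explicitly listed in the
    inventory are used.
    """
    if inventory_hosts is None:
        names = list(hostvars)
    else:
        names = [h for h in inventory_hosts if h in hostvars]
    return [f"{hostvars[n]['nodepool']['private_ipv4']} {n}" for n in names]


# Hypothetical data: two real nodes plus an implicit localhost entry
# that carries none of the per-node variables.
hostvars = {
    "controller": {"nodepool": {"private_ipv4": "10.0.0.1"}},
    "compute1": {"nodepool": {"private_ipv4": "10.0.0.2"}},
    "localhost": {},
}
inventory_hosts = ["controller", "compute1"]

try:
    hosts_file_lines(hostvars)
except KeyError as exc:
    print("naive iteration failed on a missing key:", exc)

print(hosts_file_lines(hostvars, inventory_hosts))
```

In a playbook, the list of inventory hosts would typically come from something like `groups['all']`, which, as far as I know, does not contain the implicit localhost.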
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 937071: Protect hostvars iterations from implicit localhost https://review.opendev.org/c/zuul/zuul-jobs/+/937071 21:59
@jim:acmegating.comi'm a little worried the blast radius could be larger, but that's what i found via grep.22:00
@jim:acmegating.comhttps://github.com/ansible/ansible/commit/70e83e72b43e05e57eb42a6d52d01a4d9768f510#diff-b36fa2cf8e062dc662300f9e707a65fcdd9d6c3a51927a589643b67d3352e4f0L97 22:12
@jim:acmegating.comyeah they really did change that as part of the cve fix with no explanation why22:12
@fungicide:matrix.orgweird22:15
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 937071: Protect hostvars iterations from implicit localhost https://review.opendev.org/c/zuul/zuul-jobs/+/937071 23:02
@clarkb:matrix.orgHanson: ^ I guess try again and see if that is happier now?23:10
