@fungicide:matrix.org | during today's weekly rolling zuul upgrade in opendev, we observed a transient backward-incompatibility. it appears that when executors were updated to include https://review.opendev.org/862978 and then tried to start builds which were enqueued by a scheduler running code older than that, we saw jobs immediately fail/retry with KeyError exceptions like this: https://paste.opendev.org/show/b8bvT5s5FUkOnE3lZiII/ | 15:13 |
---|---|---|
@fungicide:matrix.org | once the schedulers were also upgraded some hours later, the problem subsided | 15:14 |
@fungicide:matrix.org | it may make sense to have the executors backward-compatibly handle playbooks without semaphores, at least until after the next release is tagged, in order to make live upgrades smoother | 15:15 |
@fungicide:matrix.org | also if we see issues like this frequently (i don't think we do now, but that could certainly change), i wonder if it would make sense to do a special unit test job which runs the schedulers deployed from the most recent git tag instead of the proposed change, or from the parent of the change | 15:17 |
@fungicide:matrix.org | for the immediate fix, would it be as simple as changing zuul/executor/server.py:2044 to something like `jobdir_playbook.semaphores = playbook.get('semaphores', [])`? | 15:36 |
@fungicide:matrix.org | i guess i'll push a change with that and we can discuss there. no idea if there's value in adding a regression test for something transient like that | 15:37 |
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [zuul/zuul] 864335: Smooth out upgrades to playbook semaphores https://review.opendev.org/c/zuul/zuul/+/864335 | 15:44 | |
@jim:acmegating.com | fungi: we can catch that in the unit test job, we just need to remember to test for it. i think we may be able to make a catch-all unit test job that does what you suggest (basically, start with model_api=n-1) | 15:53 |
@jim:acmegating.com | and we should also almost always use .get() when we're adding new zk attrs. i think we just haven't completely retrained ourselves yet :) | 15:54 |
@jim:acmegating.com | fungi: thanks for the fix, and sorry for the trouble :) | 15:55 |
@jim:acmegating.com | * oops, we should also almost always use .get() when we're adding new zk attrs. i think we just haven't completely retrained ourselves yet :) | 16:14 |
-@gerrit:opendev.org- Zuul merged on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [zuul/zuul] 864335: Smooth out upgrades to playbook semaphores https://review.opendev.org/c/zuul/zuul/+/864335 | 18:25 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!