Saturday, 2022-11-12

@fungicide:matrix.orgduring today's weekly rolling zuul upgrade in opendev, we observed a transient backward-incompatibility. it appears that when executors were updated to include https://review.opendev.org/862978 and then tried to start builds which were enqueued by a scheduler running code older than that, we saw jobs immediately fail/retry with KeyError exceptions like this: https://paste.opendev.org/show/b8bvT5s5FUkOnE3lZiII/15:13
@fungicide:matrix.orgonce the schedulers were also upgraded some hours later, the problem subsided15:14
@fungicide:matrix.orgit may make sense to have the executors backward-compatibly handle playbooks without semaphores, at least until after the next release is tagged, in order to make live upgrades smoother15:15
@fungicide:matrix.orgalso if we see issues like this frequently (i don't think we do now, but that could certainly change), i wonder if it would make sense to do a special unit test job which runs the schedulers deployed from the most recent git tag instead of the proposed change, or from the parent of the change15:17
@fungicide:matrix.orgfor the immediate fix, would it be as simple as changing zuul/executor/server.py:2044 to something like `jobdir_playbook.semaphores = playbook.get('semaphores', [])`?15:36
@fungicide:matrix.orgi guess i'll push a change with that and we can discuss there. no idea if there's value in adding a regression test for something transient like that15:37
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [zuul/zuul] 864335: Smooth out upgrades to playbook semaphores https://review.opendev.org/c/zuul/zuul/+/86433515:44
@jim:acmegating.comfungi: we can catch that in the unit test job, we just need to remember to test for it.  i think we may be able to make a catch-all unit test job that does what you suggest (basically, start with model_api=n-1)15:53
@jim:acmegating.comand we should also almost always use .get() when we're adding new zk attrs.  i think we just haven't completely retrained ourselves yet :)15:54
@jim:acmegating.comfungi: thanks for the fix, and sorry for the trouble :)15:55
@jim:acmegating.com * oops, we should also almost always use .get() when we're adding new zk attrs.  i think we just haven't completely retrained ourselves yet :)16:14
-@gerrit:opendev.org- Zuul merged on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [zuul/zuul] 864335: Smooth out upgrades to playbook semaphores https://review.opendev.org/c/zuul/zuul/+/86433518:25

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!