mordred | clarkb, corvus: 541939 is green on build-sphinx-docs - should we add a depends-on or rebase the corvus stack on that patch? | 00:04 |
---|---|---|
*** elyezer has quit IRC | 00:17 | |
*** elyezer has joined #zuul | 00:17 | |
*** elyezer has quit IRC | 00:31 | |
openstackgerrit | Merged openstack-infra/nodepool master: Update tox docs environment to match build-sphinx-docs https://review.openstack.org/541939 | 00:31 |
*** elyezer has joined #zuul | 00:35 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Infer python2 vs. python3 for sphinx jobs https://review.openstack.org/541952 | 00:39 |
jhesketh | corvus: do you have a moment to chat about the update_queue (nothing urgent) | 00:45 |
corvus | jhesketh: yep | 00:46 |
jhesketh | cancelling the queue on 'stop' (from my reading) shouldn't have any bad effects as the jobs are all being stopped (it might require a bit of extra error handling, but nothing difficult). | 00:47 |
jhesketh | the challenge is with the updates requested as part of the merger that's included with the executor | 00:47 |
jhesketh | given that they only enqueue when there is no other work, it's a bit of an edge case | 00:48 |
jhesketh | but if we do shut down in the middle of an update, that work may give the scheduler/configure an error | 00:49 |
jhesketh | one way to handle it would be to just shut it down and have the merger client understand how to recognise a request has failed because of a disconnect rather than being something such as unable to merge | 00:49 |
jhesketh | this could be helpful as what we're trying to smooth out is network errors. This would mean the merger client (and hence scheduler) can handle disconnects from any of the merger processes | 00:50 |
jhesketh | so should we need to restart a merger server we can as well | 00:50 |
corvus | yeah, i think that would be ideal for general robustness anyway (and in the future, we may want to retry merge jobs in response to some errors, so that will give us a framework for that) | 00:51 |
corvus | that's probably a bit more work though | 00:52 |
jhesketh | right, that was my thinking | 00:52 |
jhesketh | corvus: so I'm not sure it is too bad. We can modify MergeGearmanClient.handleDisconnect to re-submit the job X times | 00:53 |
jhesketh | actually handleDisconnect is for when the gearman client can't talk to the gearman server, so that won't solve this issue, but will still help | 00:53 |
jhesketh | handleWorkFail is probably what we need | 00:54 |
corvus | jhesketh: yeah | 00:54 |
jhesketh | and maybe handleWorkException, but if a worker is causing exceptions because of a malformed job we don't really want to send that job elsewhere | 00:54 |
corvus | jhesketh: if you want to start down that path and see if it heads down a rabbit hole, that seems like a good idea -- i think it's the ideal solution | 00:54 |
jhesketh | yep, will do :-) | 00:55 |
corvus | jhesketh: and if it's too complicated, we can work out how to compromise in the executor for now | 00:55 |
corvus | jhesketh: maybe handle exception the same way? could be out of disk space or something...? | 00:55 |
jhesketh | corvus: can you confirm for me (if you know) that on merge failure etc the gearman worker still sends a WorkComplete with the error described in the result? | 00:55 |
corvus | jhesketh: i can't confirm off the top of my head, but if not, i think they should. i think we've been moving things to that model (it makes a lot more sense because of things like this) | 00:56 |
jhesketh | corvus: yeah, so long as we're limiting retries we should be okay... also want to avoid the case where a malformed job takes the worker offline completely (eg segfault) which will then slowly take out other workers | 00:56 |
corvus | jhesketh: ++ | 00:56 |
jhesketh | corvus: yep, agreed. I think I made that mistake in early turbo-hipster versions where I thought WORK_FAIL was for failures in the job, but really it should be failures in the protocol | 00:57 |
jhesketh | so if I come across that we can look at fixing that too | 00:57 |
corvus | jhesketh: a *really* quick look suggests that the merger behaves the way we want, but it's worth keeping an eye out | 00:57 |
jhesketh | yep, that was my observation too | 00:58 |
corvus | jhesketh: (ie, a grep for workFail returns only one obvious result) | 00:58 |
jhesketh | cool, I'll get on with that then. Thanks for your time :-) | 00:58 |
corvus | jhesketh: thank you! | 00:58 |
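
The retry idea from the exchange above (resubmit on WORK_FAIL, treat merge errors reported in a WORK_COMPLETE result as final, and cap the number of attempts) can be sketched roughly as follows. This is only an illustration under stated assumptions: `MergeJob`, `RetryingMergeClient`, the `merged`/`error` result keys and the default of three attempts are hypothetical names and values, not the actual Zuul or gear API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MergeJob:
    """Hypothetical stand-in for a merge job handed to a gearman client."""
    name: str
    attempts: int = 0
    result: Optional[dict] = None


class RetryingMergeClient:
    """Resubmit a job on WORK_FAIL, up to max_attempts submissions in total."""

    def __init__(self, submit: Callable[[MergeJob], None], max_attempts: int = 3):
        # 'submit' is whatever actually hands the job to a worker (assumed).
        self._submit = submit
        self.max_attempts = max_attempts

    def submit(self, job: MergeJob) -> None:
        job.attempts += 1
        self._submit(job)

    def on_work_complete(self, job: MergeJob, result: dict) -> None:
        # Job-level errors (e.g. unable to merge) arrive here, described in
        # the result, so they are recorded and never retried.
        job.result = result

    def on_work_fail(self, job: MergeJob) -> bool:
        # WORK_FAIL is treated as a protocol-level failure (e.g. the worker
        # went away), so the job is resubmitted until the limit is reached.
        if job.attempts < self.max_attempts:
            self.submit(job)
            return True
        job.result = {
            "merged": False,
            "error": "gave up after %d attempts" % job.attempts,
        }
        return False
```

Capping the attempts is what keeps a malformed job that crashes its worker from being handed to every other worker in turn.
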
*** dkranz has quit IRC | 01:25 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Add /label-list to the webapp https://review.openstack.org/535563 | 01:43 |
*** elyezer has quit IRC | 02:57 | |
*** elyezer has joined #zuul | 03:02 | |
*** bhavik1 has joined #zuul | 04:12 | |
*** bhavik1 has quit IRC | 04:39 | |
*** elyezer has quit IRC | 05:22 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: zk: use kazoo retry facilities https://review.openstack.org/535537 | 05:24 |
*** elyezer has joined #zuul | 05:25 | |
tristanC | Shrews: last PS of 535537 should correctly handle sequence creation error, and there is a test for it | 05:27 |
*** threestrands has quit IRC | 05:57 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Allow Ansible 2.4 https://review.openstack.org/535781 | 06:28 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Disable action and lookup plugins from 2.4 https://review.openstack.org/535839 | 06:28 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 06:28 |
*** elyezer has quit IRC | 06:29 | |
*** elyezer has joined #zuul | 06:31 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 06:41 |
*** sshnaidm|afk has quit IRC | 07:00 | |
*** sshnaidm|afk has joined #zuul | 07:04 | |
*** xinliang has joined #zuul | 07:16 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 07:21 |
*** elyezer has quit IRC | 07:26 | |
*** elyezer has joined #zuul | 07:28 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 07:36 |
*** elyezer has quit IRC | 07:37 | |
*** elyezer has joined #zuul | 07:40 | |
*** sshnaidm|afk has quit IRC | 08:03 | |
*** jpena|off is now known as jpena | 08:11 | |
*** maxamillion has quit IRC | 08:17 | |
*** sshnaidm|afk has joined #zuul | 08:17 | |
*** mnaser has quit IRC | 08:17 | |
*** kmalloc has quit IRC | 08:17 | |
*** maxamillion has joined #zuul | 08:17 | |
*** mnaser has joined #zuul | 08:18 | |
*** kmalloc has joined #zuul | 08:18 | |
*** robcresswell has quit IRC | 08:18 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 08:18 |
*** sshnaidm|afk is now known as sshnaidm | 08:26 | |
*** elyezer has quit IRC | 08:35 | |
*** elyezer has joined #zuul | 08:41 | |
*** robcresswell has joined #zuul | 08:49 | |
*** elyezer has quit IRC | 08:58 | |
*** elyezer has joined #zuul | 09:08 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 09:18 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 09:29 |
*** hashar has joined #zuul | 09:44 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 09:51 |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 09:57 |
*** hashar has quit IRC | 09:57 | |
*** hashar has joined #zuul | 09:59 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 10:19 |
*** jpena is now known as jpena|away | 10:59 | |
openstackgerrit | Andrea Frittoli proposed openstack-infra/zuul-jobs master: Change the list of extensions to a dict https://review.openstack.org/540485 | 11:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 11:12 |
*** hashar has quit IRC | 11:17 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/nodepool master: Clean held nodes automatically after configurable timeout https://review.openstack.org/536295 | 11:17 |
*** hashar has joined #zuul | 11:22 | |
openstackgerrit | Matthieu Huin proposed openstack-infra/zuul master: zuul autohold: allow operator to specify nodes TTL https://review.openstack.org/539596 | 11:22 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: test functional stream https://review.openstack.org/542138 | 11:23 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 11:29 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 11:39 |
*** elyezer has quit IRC | 11:45 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: test functional stream https://review.openstack.org/542138 | 11:49 |
*** elyezer has joined #zuul | 11:51 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 12:15 |
*** elyezer has quit IRC | 12:40 | |
*** elyezer has joined #zuul | 12:49 | |
*** jpena|away is now known as jpena|off | 12:52 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 12:57 |
openstackgerrit | Markus Hosch proposed openstack-infra/zuul master: Extend stackdump to display the daemonize status https://review.openstack.org/542158 | 12:57 |
openstackgerrit | Markus Hosch proposed openstack-infra/zuul master: Cleanly shutdown zuul scheduler if startup fails https://review.openstack.org/542159 | 12:58 |
openstackgerrit | Markus Hosch proposed openstack-infra/zuul master: Add --check-config option to zuul scheduler https://review.openstack.org/542160 | 12:58 |
*** jpena|off is now known as jpena | 13:06 | |
*** maho has joined #zuul | 13:09 | |
*** Wei_Liu has quit IRC | 13:15 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 13:21 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 13:41 |
*** Wei_Liu has joined #zuul | 13:47 | |
tobiash | sorry for spamming, trying to get that thing to run ^ | 13:49 |
*** sshnaidm is now known as sshnaidm|rover | 13:50 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 14:04 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Run zuul-stream-functional on every change to ansible https://review.openstack.org/542203 | 14:07 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Run zuul-stream-functional on every change to ansible https://review.openstack.org/542203 | 14:10 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Allow Ansible 2.4 https://review.openstack.org/535781 | 14:18 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Disable action and lookup plugins from 2.4 https://review.openstack.org/535839 | 14:18 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 14:18 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 14:18 |
*** maxamillion has quit IRC | 14:21 | |
*** maxamillion has joined #zuul | 14:21 | |
kklimonda | SpamapS: let me see if I can find some time today and tomorrow to fix those last comments - I've been running with this patch for some time and that obviously has made me less anxious to get it into mainline :/ | 14:34 |
SpamapS | kklimonda: yeah I am trying not to cherry-pick too much and run mostly off mainline (except the slack reporter which I made ;-) | 14:36 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Infer python2 vs. python3 for sphinx jobs https://review.openstack.org/541952 | 14:37 |
kklimonda | sigh, I'm 100 commits behind mainline I think | 14:37 |
SpamapS | Yeah I haven't rebased in a while. | 14:38 |
*** dkranz has joined #zuul | 14:38 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix builds queued forever after failure to get node request https://review.openstack.org/537335 | 14:44 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Port in changes from ansible 2.4 command module https://review.openstack.org/535840 | 14:53 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 14:53 |
tobiash | mordred: I think this works now except the log of a debug task with var ^ | 14:55 |
tobiash | currently a bit stuck there | 14:55 |
tobiash | mordred: you handle the debug task there right? | 14:56 |
tobiash | http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/callback/zuul_stream.py#n377 | 14:56 |
mordred | tobiash: yes- and we use debug tasks in jobs so we know they work | 14:57 |
tobiash | with ansible 2.4 the debug task with var has no msg anymore (did it have one in the past?) in the result dict | 14:57 |
tobiash | mordred: that's the result dict for debug with var and ansible 2.4: http://logs.openstack.org/82/542182/2/check/zuul-stream-functional/5c1325e/job-output.txt.gz#_2018-02-08_14_15_08_051172 | 14:59 |
tobiash | not that nice to filter | 14:59 |
mordred | tobiash: OH - I get what you're saying now | 15:00 |
tobiash | hrm, that should take care of dumping everything right? | 15:02 |
tobiash | http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/callback/zuul_stream.py#n387 | 15:02 |
mordred | tobiash: so - the logic for debug with a var is (or needs to be), I think - remove any keys that start with _ansible - remove 'changed' - anything left is a debug var to print - but I'm not sure how we detect that - so maybe it has to be a final condition? | 15:03 |
tobiash | mordred: maybe with result._task.action not in ('debug') ? | 15:03 |
tobiash | will try that | 15:03 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 15:07 |
tobiash | yeah, I think just the detection if a debug task fails | 15:07 |
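
The filtering rule mordred describes above (drop keys starting with `_ansible`, drop `changed`, and treat whatever remains as the debug variable to print) could look something like this. It is a minimal sketch, not the code in zuul_stream.py; the function name and the sample result dict are made up for illustration.

```python
def extract_debug_vars(result_dict):
    """Return the entries of a debug-task result that look like user variables.

    Keys starting with '_ansible' and the 'changed' key are bookkeeping and
    are dropped; anything left is assumed to be a variable to print.
    """
    return {
        key: value
        for key, value in result_dict.items()
        if not key.startswith("_ansible") and key != "changed"
    }


# Illustrative result dict for a debug task with 'var' (contents assumed).
sample = {
    "_ansible_verbose_always": True,
    "_ansible_no_log": False,
    "changed": False,
    "my_var": "hello",
}
remaining = extract_debug_vars(sample)
```

As noted in the discussion, the open question is how to detect that a result belongs to a debug task in the first place, so a filter like this would probably have to run as a final condition.
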
*** Wei_Liu has quit IRC | 15:08 | |
dmsimard | corvus: I suppose the executors need to be restarted to pick up the new statsd counters ? Don't see them in graphite yet. | 15:21 |
corvus | dmsimard: yep | 15:22 |
dmsimard | okay, there's no rush so I'll just wait until we need to restart them for something. Thanks. | 15:23 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542182 | 15:23 |
*** maho has quit IRC | 15:31 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Handle debug tasks more robust in zuul_stream https://review.openstack.org/542182 | 15:34 |
*** elyezer has quit IRC | 15:35 | |
*** elyezer has joined #zuul | 15:36 | |
*** hashar is now known as hasharAway | 15:38 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Exit launchers and builders immediately https://review.openstack.org/541930 | 16:02 |
*** xinliang has quit IRC | 16:29 | |
corvus | SpamapS, clarkb: can you give https://review.openstack.org/540489 a look? | 16:45 |
*** openstackgerrit has quit IRC | 16:48 | |
*** jimi|ansible has quit IRC | 16:51 | |
*** openstackgerrit has joined #zuul | 16:58 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Exercise pidfile before daemonizing https://review.openstack.org/542315 | 16:58 |
*** markush1 has joined #zuul | 17:08 | |
*** markush1 has quit IRC | 17:10 | |
*** myoung is now known as myoung|biaf | 17:22 | |
SpamapS | corvus: looking | 17:42 |
SpamapS | corvus: +1. :) | 17:43 |
*** myoung|biaf is now known as myoung | 17:46 | |
corvus | Shrews, mhu: it looks like the nodepool webapp is still built in to the launcher -- should we split it out into its own process? | 17:47 |
corvus | tristanC: ^ | 17:47 |
corvus | also, 'if not no_webapp' takes me back to grammar school. :) | 17:48 |
pabelanger | ha | 17:48 |
SpamapS | you aint webapp. You waint nothin. | 17:52 |
mordred | corvus, mhu, Shrews, tristanC: I was also thinking yesterday from looking at the nodepool webapp patches that perhaps we should migrate nodepool's web stuff from webob to aiohttp so that the two codebases are more similar | 17:57 |
corvus | mordred: worth considering... though we also were talking about maybe just folding into zuul entirely | 17:58 |
mordred | corvus: that would also work | 18:00 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: WIP Rework log streaming to use python logging https://review.openstack.org/541434 | 18:00 |
corvus | i brought it up now mostly in case we think we need to do something urgently before the v3.0 release | 18:00 |
corvus | but as i think about it -- launchers running webapps is actually probably fine | 18:01 |
corvus | like, it's a little weird, but there's probably nothing structurally wrong about it that we need to deal with before release... i think? | 18:01 |
corvus | we don't seem to be exposing them in openstack though, so it's hard for me to say for sure | 18:02 |
mordred | corvus: I dunno - webapps being on launchers will make it harder to incorporate node information into a dashboard - since something would have to somehow be aware of the launchers to be able to make the queries | 18:02 |
*** myoung is now known as myoung|food | 18:02 | |
corvus | mordred: oh yes absolutely -- i'm just trying to find the right thing for the 3.0 release here -- i don't think we have to, or want to, commit to this long-term, because you're right, it's hard to design around. but it's also probably not completely broken. | 18:03 |
corvus | maybe we should just put a note in the docs indicating that the current nodepool webapp is likely to change in subsequent releases. | 18:04 |
mordred | yah | 18:04 |
mordred | corvus, Shrews, tobiash: the https://review.openstack.org/541434 above shifts the stream receiver into the callback plugin - the plumbing all works to get the port created by ssh and passing the info to the command module - but I'm not getting any hits on the receiver's handler method | 18:06 |
mordred | oh! nope. I lied - I did something and the port forward isn't getting applied ... | 18:07 |
tobiash | corvus: +1 for a note and later making it zuul-web like | 18:09 |
*** elyezer has quit IRC | 18:09 | |
*** elyezer has joined #zuul | 18:10 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Resolve paths before demonization https://review.openstack.org/542353 | 18:11 |
*** jpena is now known as jpena|off | 18:13 | |
*** jimi|ansible has joined #zuul | 18:15 | |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Exit launchers and builders immediately https://review.openstack.org/541930 | 18:15 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Exercise pidfile before daemonizing https://review.openstack.org/542315 | 18:15 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Resolve paths before demonization https://review.openstack.org/542353 | 18:15 |
corvus | oh right, Shrews is afk starting today | 18:16 |
tobiash | corvus: I'll be on vacation next week | 18:19 |
corvus | tobiash: thanks, enjoy! | 18:19 |
tobiash | :) | 18:20 |
openstackgerrit | Krzysztof Klimonda proposed openstack-infra/zuul master: Support autoholding nodes for specific changes/refs https://review.openstack.org/540035 | 18:21 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 18:25 |
openstackgerrit | Krzysztof Klimonda proposed openstack-infra/zuul master: Support autoholding nodes for specific changes/refs https://review.openstack.org/540035 | 18:26 |
tobiash | mordred: looks like you're also fighting hard with log streaming | 18:31 |
mordred | tobiash: yah - although hopefully in a different part | 18:32 |
tobiash | checking | 18:33 |
tobiash | mordred: there will be some conflicts in zuul/ansible/library/command.py but I think they'll be easy to resolve | 18:34 |
mordred | WOOT! I got it working!!!! | 18:36 |
mordred | turns out the issue was my continual inability to understand ssh -R port:host:port syntax | 18:36 |
tobiash | that's a thing I have to look up in the manpage every time I use it ;) | 18:38 |
mordred | me too - and I STILL get it wrong half the time :) | 18:39 |
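
For reference, in `ssh -R port:host:hostport` the first port is the one opened on the remote (server) side, and `host:hostport` is where connections are delivered as seen from the local side. A small helper that names the three parts makes the order harder to get wrong. The helper name and the port numbers below are hypothetical.

```python
def remote_forward_args(remote_port, local_host, local_port):
    """Build the '-R' argument pair for an ssh command line.

    remote_port: port that ssh listens on at the remote (server) end.
    local_host/local_port: destination, resolved from the local (client) end.
    """
    return ["-R", "%d:%s:%d" % (remote_port, local_host, local_port)]


# Example: remote connections to port 12345 reach localhost:40000 on the
# machine that ran ssh (both port numbers are arbitrary).
args = remote_forward_args(12345, "localhost", 40000)
```
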
tobiash | hrm, the round trip times are getting longer now | 18:39 |
*** myoung|food is now known as myoung | 18:43 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix builds queued forever after failure to get node request https://review.openstack.org/537335 | 18:44 |
openstackgerrit | Andrea Frittoli proposed openstack-infra/zuul-jobs master: Change the list of extensions to a dict https://review.openstack.org/540485 | 18:52 |
*** sshnaidm|rover is now known as sshnaidm|off | 18:52 | |
tobiash | mordred: yay, https://review.openstack.org/#/c/542182/ fixes the debug task for ansible 2.4 while being still compatible to 2.3 :) | 18:56 |
*** sshnaidm|off has quit IRC | 18:57 | |
tobiash | so one step further, now binary output is broken with 2.4... | 18:59 |
mordred | tobiash: woot! | 19:15 |
*** elyezer has quit IRC | 19:19 | |
*** elyezer has joined #zuul | 19:24 | |
*** zaro_ has quit IRC | 19:26 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 19:28 |
*** zaro_ has joined #zuul | 19:29 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 19:43 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 19:47 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Store build logs automatically https://review.openstack.org/542386 | 19:55 |
rcarrillocruz | corvus, mordred : sorry, just reading scrollback. So for 3.0, one webapp per launcher, but intent is in the future to decouple it so we can reach out just one webapp which potentially gives aggregate info from nodepool? | 19:59 |
corvus | rcarrillocruz: i *think* currently each launcher-webapp should give you all the info. | 20:00 |
corvus | but otherwise, yes, i think that's probably what will happen, in some form. | 20:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 20:00 |
rcarrillocruz | ah, so webapp is just a passthru to zookeeper, ergo doesn't matter which webapp endpoint you reach | 20:00 |
rcarrillocruz | is that accurate? | 20:00 |
rcarrillocruz | from info you can grab i mean | 20:01 |
corvus | that's my understanding. i can't confirm that since i don't think we currently expose that in openstack-infra | 20:01 |
rcarrillocruz | k thx | 20:01 |
tobiash | corvus: I can confirm that it just iterates over zk | 20:02 |
tobiash | it's primitive compared to zuul-web but it works :) | 20:02 |
mordred | corvus: you know - we could just keep it as one-webapp-per-launcher and just deal with it being a usable endpoint with a load balancer - since each launcher is a stateless http thing | 20:04 |
corvus | mordred: that's an option too | 20:06 |
rcarrillocruz | so folks, was thinking the other day, we have a vendor where they use openstack for their CI. So I thought 'oh nice, let's suggest him using zuul/nodepool'. The guy said that he knows nodepool, but he prefers using heat for doing on test trigger provisioning, as their IT team is not happy having nodepool keeping a $min-nodes quota used. Is it possible to set up nodepool to create nodes on demand, by having 0 min-ready? I think not, but you know better than me | 20:06 |
mordred | rcarrillocruz: yes | 20:07 |
mordred | rcarrillocruz: it's very possible | 20:07 |
rcarrillocruz | i thought setting min-ready 0 meant disabling that node type ? | 20:07 |
mordred | rcarrillocruz: that's -1 | 20:08 |
tobiash | rcarrillocruz: min-ready was a problem in v2 world, with nodepool/zuulv3 that's absolutely no problem | 20:08 |
mordred | rcarrillocruz: https://docs.openstack.org/infra/nodepool/configuration.html#labels | 20:08 |
rcarrillocruz | LOL | 20:08 |
rcarrillocruz | o-k | 20:08 |
rcarrillocruz | that's it, i had that assumption from the old days | 20:08 |
rcarrillocruz | geez | 20:08 |
rcarrillocruz | so this week i learned: | 20:08 |
rcarrillocruz | you can set up nodepool to be completely private, by setting auto-floating-ips: False in provider plus default-interface: False in clouds.yaml | 20:09 |
rcarrillocruz | and nodepool being able to create nodes on demand, by having 0 nodes per type | 20:09 |
rcarrillocruz | love this community | 20:09 |
mordred | \o/ | 20:09 |
rcarrillocruz | THANKS! | 20:09 |
* rcarrillocruz goes hack | 20:09 | |
tobiash | and with easy crontab scripting you even can change min-ready depending on office hours | 20:10 |
rcarrillocruz | tobiash: yeah, that's kind of the thing i have in place (we have just-nodepool + DCI for our current CI), but the fact we CAN have zero nodes per label...OMG! | 20:10 |
corvus | i think we should get rid of all the -1 things | 20:11 |
corvus | they shouldn't be necessary anymore, and they are confusing | 20:11 |
corvus | (i don't tell anyone to use them anymore) | 20:12 |
corvus | if you want to remove a label, just remove the label now | 20:12 |
rcarrillocruz | aye, cos you have this dual meaning of -1 plus simply removing the label to 'disable this label' | 20:12 |
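
The cron-driven idea tobiash mentions (a non-zero min-ready during office hours, zero otherwise) boils down to a small decision function like the one below. This is a sketch under assumptions: the office hours, the weekday rule and the default of two ready nodes are invented for illustration, and writing the value back into the nodepool configuration is deliberately left out.

```python
from datetime import datetime


def min_ready_for(now, office_min_ready=2, start_hour=8, end_hour=18):
    """Pick a min-ready value for a label based on the time of day.

    Returns office_min_ready on weekdays between start_hour (inclusive) and
    end_hour (exclusive); returns 0 otherwise so nodes are only created on
    demand outside office hours.
    """
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_office_hours = start_hour <= now.hour < end_hour
    return office_min_ready if (is_weekday and in_office_hours) else 0


# A cron job could call this periodically and update the label config.
value_now = min_ready_for(datetime.now())
```
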
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Default min-ready to 0 https://review.openstack.org/542410 | 20:13 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: DNM: fix streaming tests https://review.openstack.org/542360 | 20:24 |
pabelanger | speaking of min-ready and nodepool, https://review.openstack.org/540916/ would be good to get a few eyes on. Adds tests for multiple launchers and min-ready nodes | 20:26 |
*** jimi|ansible has quit IRC | 20:30 | |
tobiash | pabelanger: was that meant for going in now or as a base for starting to work on multi launcher races? | 20:32 |
pabelanger | tobiash: yah, mostly want to make sure everybody is aware how it works, before starting rework. However, I am still trying to come up with solution to this. | 20:35 |
pabelanger | actually, let me test something | 20:37 |
tobiash | mordred: geez, I broke the binary data test with my additional debugging output... | 20:37 |
pabelanger | tobiash: another workaround, is to remove min-ready from all but 1 nodepool-launcher.yaml file. It then will ensure specific providers / labels will be launched. Not the best, but a way to deal with it until we figure out how to share config between launchers | 20:44 |
tobiash | yeah, that's a possible workaround | 20:46 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Update to Ansible 2.4 https://review.openstack.org/535781 | 20:47 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Disable action and lookup plugins from 2.4 https://review.openstack.org/535839 | 20:47 |
openstackgerrit | Paul Belanger proposed openstack-infra/nodepool master: Add unit test for multiple launchers https://review.openstack.org/540916 | 20:49 |
pabelanger | tobiash: ^ | 20:49 |
tobiash | pabelanger: so this is two launchers, two clouds, one label? | 20:52 |
pabelanger | tobiash: right | 20:52 |
tobiash | next difficulty level would be two launchers, one cloud, but that would also need coordinated quota handling | 20:53 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Store build logs automatically https://review.openstack.org/542386 | 20:54 |
openstackgerrit | James E. Blair proposed openstack-infra/nodepool master: Default min-ready to 0 https://review.openstack.org/542410 | 20:54 |
pabelanger | tobiash: yah, I'd love us to support that, but maybe 4.0? :) | 20:55 |
tobiash | probably | 20:56 |
*** jimi|ansible has joined #zuul | 20:57 | |
*** ChanServ has quit IRC | 20:57 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Update to Ansible 2.4 https://review.openstack.org/535781 | 21:06 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Disable action and lookup plugins from 2.4 https://review.openstack.org/535839 | 21:06 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Add doc for executor state_dir https://review.openstack.org/542430 | 21:08 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Make min_starting_builds a heuristic https://review.openstack.org/542431 | 21:08 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Rework log streaming to use python logging https://review.openstack.org/541434 | 21:18 |
mordred | corvus, tobiash: ^^ ok. that works now locally in synthetic tests | 21:19 |
mordred | it also uses json instead of pickle | 21:20 |
mordred | there is one more engineering challenge to solve - which is how to deal with controlpersist | 21:20 |
mordred | (we have one controlpersist per remote host, but we have more than one ansible-playbook invocation - so we need to make sure subsequent listeners use the same local port ...) | 21:21 |
corvus | mordred: how about unix domain sockets? does that solve that? | 21:23 |
corvus | put the socket in the build dir, call it done? | 21:23 |
mordred | corvus: you know what - it sure does | 21:26 |
mordred | corvus: cause I was already thinking of essentially writing out a json map with remote-hostname+port in it | 21:26 |
corvus | \o/ turns out you warn't just whistlin' dixie thar | 21:26 |
mordred | but if we just name the unix socket predictably with the hostname ... | 21:27 |
mordred | I also think we MIGHT be able to get away with just one listener rather than one per remote host - now that I've got this all plumbed through | 21:27 |
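
The unix-domain-socket suggestion above (a socket file in the build dir, named predictably from the hostname, carrying JSON rather than pickle) might be sketched like this. It is an assumption-laden illustration, not the code in the patch under review: the `log-stream-<hostname>.sock` naming scheme, the newline-delimited framing and all function names are hypothetical.

```python
import json
import os
import socket


def socket_path(build_dir, hostname):
    # Predictable name, so every playbook run for the same host can find
    # the same socket without any shared port map (naming scheme assumed).
    return os.path.join(build_dir, "log-stream-%s.sock" % hostname)


def open_listener(path):
    """Bind and listen on a unix domain socket at the given path."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server


def send_record(path, record):
    """Send one log record as a single line of JSON."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        client.sendall(json.dumps(record).encode("utf-8") + b"\n")


def receive_record(server):
    """Accept one connection and decode the JSON line it sends."""
    conn, _ = server.accept()
    with conn:
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return json.loads(data.decode("utf-8"))
```

Because the path is derived from the build dir and hostname alone, later listeners do not need to agree on a local port. Note that unix socket paths have a fairly short maximum length on most systems, so a deeply nested build dir could be a problem.
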
tobiash | \o/ the ansible 2.4 stack is working now https://review.openstack.org/#/q/topic:zuul-ansible-2.4+status:open | 21:28 |
tobiash | :) | 21:28 |
mordred | tobiash: woohoo! | 21:30 |
corvus | tobiash: cool... i'm thinking maybe we should land that shortly after the v3.0 release? | 21:30 |
*** ChanServ has joined #zuul | 21:30 | |
*** barjavel.freenode.net sets mode: +o ChanServ | 21:30 | |
tobiash | corvus: do we already have a target for a release date? | 21:31 |
corvus | i'm still aiming for around the ptg, ideally shortly before, less ideally shortly after | 21:32 |
*** dkranz has quit IRC | 21:32 | |
tobiash | I have to switch to 2.4 asap as the windows modules are all broken in 2.3 and py3 | 21:32 |
tobiash | but I'm ok having this in my staging branch for a few weeks | 21:32 |
*** sshnaidm has joined #zuul | 21:32 | |
corvus | ah. my suggestion was predicated on the assumption that 2.3 is working with zuul "well enough" for us to make a release, since we've stabilized a bunch of stuff over many months; i'm open to merging it sooner if folks think that would be better. | 21:33 |
mordred | corvus: I think the only thing we were waiting on wrt 2.4 was the openstack queens release being done ... | 21:33 |
dmsimard | 2.4.3 has a few fixes we're interested in | 21:33 |
mordred | corvus: so I'd support landing 2.4 support and updating openstack to 2.4 once queens is done | 21:34 |
mordred | but I don't have strong feelings in either direction - mostly whatever works for other people | 21:34 |
corvus | mordred: that's the same timeframe as ptg... so maybe the plan to release on 2.3 then immediately update is still best? | 21:34 |
tobiash | corvus: that's ok for me | 21:34 |
tobiash | how about windows support? | 21:35 |
tobiash | is that for 3.0 or shortly after? | 21:35 |
tobiash | there are some small patches for that lingering around | 21:35 |
mordred | dmsimard: you might find https://review.openstack.org/541434 interesting | 21:35 |
corvus | tobiash: i don't think it's on our 3.0 roadmap, so i don't think we'd block on windows support, but can merge things as time permits | 21:36 |
tobiash | ok | 21:37 |
dmsimard | mordred: very nice. I'll have to check it out. | 21:41 |
dmsimard | mordred: btw the nodepool static driver landed so I might try my hand at the nested zuul idea we talked about a while back -- to test stuff like that. | 21:42 |
SpamapS | corvus: https://github.com/facebook/pyre2/commit/35e71e99ba8fb20d1cec14493115137db6362ed8 | 21:52 |
corvus | SpamapS: i'm shivering with antici | 21:55 |
corvus | pation | 21:55 |
SpamapS | nice | 21:56 |
*** hasharAway has quit IRC | 22:06 | |
mordred | corvus: ok. unix domain sockets work - but don't actually solve the control persist issue | 22:21 |
mordred | corvus: the issue being that the first control-persist connection creates the connection and forwards the socket - but with unix domain sockets once the listening process (the first playbook) exits, subsequent playbooks can't re-open the same socket because there's no SO_REUSEADDR support in unix sockets | 22:26 |
corvus | doh | 22:26 |
mordred | corvus: so it's almost more like we need to spawn up a thing similar to how we do ssh agents | 22:26 |
mordred | corvus: although I'm pretty confident i can get it down to a single server listening on a single local socket for all of the remote connections | 22:27 |
corvus | mordred: so a process that takes input on a socket and writes that to a file? | 22:27 |
mordred | yah | 22:27 |
mordred | or, more specifically, a process that takes input on a socket, decodes logging messages and emits them to the location for the logger as defined in the logging config file | 22:28 |
mordred | but yet | 22:28 |
mordred | but yes | 22:28 |
corvus | *nod* | 22:28 |
corvus | mordred: even after it exits, it can't be reused? | 22:31 |
mordred | corvus: not with it having been forwarded over the persistent ssh connection | 22:31 |
corvus | oh i see | 22:32 |
mordred | corvus: hrm. also - when i said 'unix domain sockets work' earlier - I may not have been accurate | 22:37 |
mordred | nevermind. inverted port forwarding arguments. AGAIN | 22:39 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Use unix sockets instead of TCP sockets https://review.openstack.org/542469 | 22:39 |
mordred | corvus: so - that totally works for one playbook - and will work again once control persist dies | 22:41 |
mordred | corvus: I'm going to step away from the computer for a second, but I'll figure out job-lifecycled version next | 22:42 |
*** elyezer has quit IRC | 22:43 | |
*** elyezer has joined #zuul | 22:44 | |
corvus | keystoneauth1.exceptions.http.BadRequest: Expecting to find domain in project. The server could not comply with the request since it is either malformed or otherwise incorrect. The client is assumed to be in error. (HTTP 400) (Request-ID: req-0b54d669-dbc1-4e32-b81b-e8cad33d3486) | 22:58 |
corvus | mordred: ^ what does that mean? | 22:58 |
dmsimard | Sounds like keystone v2 vs v3 config mismatch | 23:19 |
corvus | i tried using the v2 config too, but that wasn't working | 23:20 |
corvus | i'm stumped | 23:21 |
*** elyezer has quit IRC | 23:43 | |
*** elyezer has joined #zuul | 23:44 |
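
The idea mordred and corvus converge on above (a single long-lived process that takes input on a local unix socket, decodes JSON logging messages, and emits them through the configured logger) can be sketched roughly as follows. This is a minimal illustration, not the code from the reviews linked in the log: the names `JSONLogHandler`, `start_listener`, and `send_record`, and the newline-delimited JSON framing, are all assumptions made for the example.

```python
import json
import logging
import os
import socket
import socketserver
import threading


class JSONLogHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: one JSON-encoded log record per line."""

    def handle(self):
        for raw in self.rfile:
            raw = raw.strip()
            if not raw:
                continue
            try:
                fields = json.loads(raw.decode("utf-8"))
            except ValueError:
                continue  # ignore malformed lines rather than crash
            if not isinstance(fields, dict):
                continue
            # Rebuild a LogRecord and hand it to whichever logger the
            # local logging configuration defines for that name.
            record = logging.makeLogRecord(fields)
            logging.getLogger(record.name).handle(record)


def start_listener(socket_path):
    """Start one listener that outlives any individual playbook run."""
    if os.path.exists(socket_path):
        os.unlink(socket_path)  # clear a stale socket file before binding
    server = socketserver.ThreadingUnixStreamServer(socket_path, JSONLogHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_record(socket_path, name, level, message):
    """Client side: send a single record as one line of JSON."""
    payload = {
        "name": name,
        "levelno": level,
        "levelname": logging.getLevelName(level),
        "msg": message,
    }
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(socket_path)
        client.sendall(json.dumps(payload).encode("utf-8") + b"\n")
```

Because the listener is started once per job rather than once per playbook, later playbooks only ever connect to the socket and never need to re-bind it, which is the part the first-playbook-as-listener approach could not do. Using JSON rather than pickle also means the receiver only ever parses plain data.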