*** jamesmcarthur has joined #zuul | 00:11 | |
*** jamesmcarthur has quit IRC | 00:16 | |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests https://review.opendev.org/728194 | 00:23 |
---|---|---|
*** jamesmcarthur has joined #zuul | 00:23 | |
*** jamesmcarthur has quit IRC | 00:34 | |
openstackgerrit | melanie witt proposed zuul/zuul-jobs master: Run sphinx-build in parallel for releasenotes https://review.opendev.org/727473 | 00:35 |
*** rlandy has quit IRC | 00:40 | |
*** jamesmcarthur has joined #zuul | 01:08 | |
tristanC | corvus: i left a reply to your comment on https://review.opendev.org/728151 | 01:09 |
*** ysandeep|sleep is now known as ysandeep | 01:34 | |
*** jamesmcarthur has quit IRC | 01:44 | |
*** swest has quit IRC | 01:44 | |
*** jamesmcarthur has joined #zuul | 01:45 | |
*** jamesmcarthur has quit IRC | 01:49 | |
*** swest has joined #zuul | 02:00 | |
*** rlandy has joined #zuul | 02:11 | |
*** jamesmcarthur has joined #zuul | 02:12 | |
*** jamesmcarthur has quit IRC | 02:14 | |
*** jamesmcarthur has joined #zuul | 02:14 | |
*** jamesmcarthur_ has joined #zuul | 02:17 | |
*** jamesmcarthur has quit IRC | 02:18 | |
*** jamesmcarthur has joined #zuul | 02:20 | |
*** jamesmca_ has joined #zuul | 02:21 | |
*** jamesmcarthur_ has quit IRC | 02:22 | |
*** jamesmcarthur_ has joined #zuul | 02:23 | |
*** jamesmcarthur has quit IRC | 02:24 | |
*** jamesmca_ has quit IRC | 02:25 | |
*** jamesmcarthur has joined #zuul | 02:26 | |
*** jamesmca_ has joined #zuul | 02:27 | |
*** jamesmc__ has joined #zuul | 02:28 | |
*** jamesmcarthur_ has quit IRC | 02:28 | |
*** jamesmcarthur has quit IRC | 02:30 | |
*** jamesmca_ has quit IRC | 02:31 | |
*** jamesmc__ has quit IRC | 02:32 | |
*** bhavikdbavishi has joined #zuul | 03:14 | |
*** jamesmcarthur has joined #zuul | 03:14 | |
*** bhavikdbavishi1 has joined #zuul | 03:17 | |
*** jamesmcarthur has quit IRC | 03:17 | |
*** jamesmcarthur has joined #zuul | 03:17 | |
*** bhavikdbavishi has quit IRC | 03:18 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:18 | |
*** cloudnull has quit IRC | 03:55 | |
*** jamesmcarthur has quit IRC | 03:57 | |
*** jamesmcarthur has joined #zuul | 03:57 | |
*** jamesmcarthur has quit IRC | 04:08 | |
*** jamesmcarthur has joined #zuul | 04:09 | |
*** jamesmcarthur has quit IRC | 04:10 | |
*** jamesmcarthur has joined #zuul | 04:10 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #zuul | 04:33 | |
*** bhavikdbavishi has quit IRC | 04:47 | |
*** ysandeep is now known as ysandeep|afk | 04:51 | |
*** bhavikdbavishi has joined #zuul | 05:09 | |
*** felixedel has joined #zuul | 05:24 | |
*** felixedel has quit IRC | 05:45 | |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7 https://review.opendev.org/727373 | 05:48 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Update images to use python 3.8 https://review.opendev.org/727374 | 05:48 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7 https://review.opendev.org/727373 | 05:50 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Update images to use python 3.8 https://review.opendev.org/727374 | 05:55 |
*** dpawlik has joined #zuul | 05:58 | |
*** y2kenny has quit IRC | 06:00 | |
*** sgw has quit IRC | 06:00 | |
*** ysandeep|afk is now known as ysandeep | 06:03 | |
*** zxiiro has quit IRC | 06:04 | |
*** saneax has joined #zuul | 06:12 | |
*** jamesmcarthur has quit IRC | 06:22 | |
openstackgerrit | Merged zuul/zuul master: Fix loading_errors bug https://review.opendev.org/728286 | 06:24 |
*** jamesmcarthur has joined #zuul | 06:56 | |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Stop jobs on gearman disconnect https://review.opendev.org/714722 | 07:09 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests https://review.opendev.org/728194 | 07:11 |
*** rpittau|afk is now known as rpittau | 07:12 | |
*** guillaumec has joined #zuul | 07:19 | |
*** jcapitao has joined #zuul | 07:22 | |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Default to Ansible 2.9 https://review.opendev.org/727345 | 07:22 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Drop support for ansible 2.6 https://review.opendev.org/727157 | 07:22 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7 https://review.opendev.org/727373 | 07:22 |
*** bhavikdbavishi has quit IRC | 07:30 | |
*** yolanda has joined #zuul | 07:32 | |
*** tosky has joined #zuul | 07:35 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: Add simple testing for Zuul CLI & REST API https://review.opendev.org/728098 | 07:54 |
*** bhavikdbavishi has joined #zuul | 07:55 | |
*** nils has joined #zuul | 08:04 | |
*** fbo|off is now known as fbo|afk | 08:18 | |
*** piotrowskim has joined #zuul | 08:22 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id https://review.opendev.org/728118 | 08:24 |
piotrowskim | Hello, https://zuul.opendev.org/t/openstack/build/5c716fd6fbfe42548b9b58e1a2e49545, could anyone help me with this issue? I don't see the mentioned line in file, and I am not sure if I have lint error in my project or it's something else | 08:25 |
AJaeger | piotrowskim: that's from the 12th, isn't it? Please recheck. We fixed a few problems in that area | 08:27 |
piotrowskim | I think 10 | 08:27 |
piotrowskim | you ask about nodejs version? | 08:27 |
AJaeger | piotrowskim: 12th of May | 08:37 |
piotrowskim | ok | 08:38 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: DNM: Debug sibling install https://review.opendev.org/728384 | 08:44 |
AJaeger | piotrowskim: did the recheck help? | 08:58 |
*** ysandeep is now known as ysandeep|lunch | 09:01 | |
piotrowskim | i think so, thanks | 09:01 |
AJaeger | great | 09:02 |
avass | AJaeger: what issue are you checking for tox_siblings? | 09:11 |
*** jamesmcarthur has quit IRC | 09:13 | |
AJaeger | nova is setting in tox.ini envdir and thus sibling install fails since we expect that envdir contains the envlist (pdf-docs) and not another one, see https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/tox/library/tox_install_sibling_packages.py#L190-L194 | 09:17 |
avass | AJaeger: you mean the default envlist? not sure I follow | 09:18 |
AJaeger | avass: I was figuring out why tox_siblings works for docs but not pdf-docs. the envdir=docs in nova's tox.ini breaks it. So, that was my debugging. Now question: how to fix? Remove envdir from nova/tox.ini or teach siblings to parse tox.ini and check for envdir? | 09:18 |
AJaeger | that's a discussion for another time today - I now know what's broken | 09:19 |
avass | AJaeger: it's probably better to parse tox.ini, that shouldn't be too hard | 09:19 |
avass | AJaeger: since someone else could be doing that too | 09:19 |
avass | AJaeger: I have a change that updates tox_siblings to take a list of testenvs, I could stack another change on top of that where I parse tox.ini as well | 09:21 |
AJaeger | avass, wow, that would be really great! | 09:36 |
*** ysandeep|lunch is now known as ysandeep | 09:39 | |
*** bhavikdbavishi has quit IRC | 09:43 | |
*** bhavikdbavishi has joined #zuul | 09:44 | |
*** jamesmcarthur has joined #zuul | 09:44 | |
*** bhavikdbavishi has quit IRC | 09:48 | |
*** jamesmcarthur has quit IRC | 09:55 | |
zbr | is there an easy way to post a message to irc when a job fails? -- controlling this from the job definition? | 10:09 |
*** rpittau is now known as rpittau|bbl | 10:09 | |
AJaeger | zbr: gerritbot allows that | 10:11 |
zbr | AJaeger: that is not for gerrit changes, and not even our own zuul. | 10:11 |
zbr | i wonder if there is an ansible module that can be used to do this | 10:12 |
AJaeger | gerritbot reports various events, we have failure reporting not enabled. x-vrif-minus-2 is enabled for e.g. ironic. But that reacts to -2 only. So, fail of a single job: Nothing in gerritbot. | 10:13 |
AJaeger | zbr: for the generic question you have, I cannot help. | 10:13 |
zbr | in fact i think it would still require a webservice acting as a broker because you do not want to reconnect every time | 10:14 |
AJaeger | zbr, could you answer ianw's comment on https://review.opendev.org/#/c/727561/ , please? | 10:14 |
AJaeger | zbr: I would be happy to merge that change to move us one step further ^ | 10:15 |
zbr | https://github.com/ansible/ansible/blob/stable-2.9/changelogs/CHANGELOG-v2.9.rst#bugfixes which mentions https://github.com/ansible/ansible/issues/52275 | 10:16 |
zbr | I'm 100% sure I tried to use "python -m xxx" on virtualenv_command and it did not work | 10:16 |
zbr | so relying on it, would be a bad idea. | 10:16 |
zbr | that is regardless venv | 10:17 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST https://review.opendev.org/728410 | 10:17 |
zbr | same applies to virtualenv | 10:17 |
*** jamesmcarthur has joined #zuul | 10:23 | |
zbr | i hate so much that I cannot select text in git review comments.... | 10:24 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id https://review.opendev.org/728118 | 10:25 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST https://review.opendev.org/728410 | 10:25 |
*** dpawlik has quit IRC | 10:27 | |
*** dpawlik has joined #zuul | 10:28 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: Add missing virtualenv and fixed repo install https://review.opendev.org/693637 | 10:31 |
*** jcapitao is now known as jcapitao_lunch | 10:44 | |
*** fbo|afk is now known as fbo | 11:06 | |
*** bhavikdbavishi has joined #zuul | 11:07 | |
AJaeger | zbr: ianw addressed your concerns with https://review.opendev.org/#/c/726715/4/roles/ensure-pip/tasks/main.yaml. My understanding is that there is a bug that will not manifest the way ianw wrote this up. | 11:16 |
avass | hmm, I've noticed that some of our static nodes get locked in zookeeper for some reason | 11:19 |
avass | they get stuck and nodepool is waiting in a pending state forever for the node: http://paste.openstack.org/show/793653/ | 11:22 |
avass | and I don't know how to fix it other than manually deleting the request-lock in zookeeper | 11:22 |
avass | actually no, not the request node, but deleting the node in zookeeper | 11:26 |
avass | since it's stuck in a ready but locked state | 11:26 |
avass | deleting in nodepool didn't seem to work | 11:26 |
*** jamesmcarthur has quit IRC | 11:27 | |
tobiash | avass: ready+locked state is something we see as well in some conditions (the scheduler holds the lock for a never-to-be-started build) | 11:41 |
tobiash | avass: I tried to fix that in https://review.opendev.org/714852 but that caused a memleak in zuul | 11:41 |
tobiash | so far we regularly delete the locks of nodes that are in ready+locked for longer periods of time as a workaround (this is unsupported and might cause other side effects tm) | 11:42 |
avass | tobiash: yeah, it only ever happens for one specific tenant for us | 11:43 |
*** threestrands has quit IRC | 11:46 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id https://review.opendev.org/728118 | 11:49 |
*** jamesmcarthur has joined #zuul | 11:55 | |
*** jamesmcarthur has quit IRC | 12:04 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST https://review.opendev.org/728410 | 12:06 |
*** rpittau|bbl is now known as rpittau | 12:06 | |
*** sshnaidm|afk is now known as sshnaidm|off | 12:07 | |
mordred | zbr: bindep output is ordered? | 12:08 |
AJaeger | mordred: it's not so far - that's a new feature that zbr wants to introduce. That's why I ask to split his change into three so that we can discuss it. | 12:09 |
*** jcapitao_lunch is now known as jcapitao | 12:12 | |
mordred | AJaeger: ah - gotcha | 12:12 |
mordred | I mean - the epel-release example makes a good amount of sense and at least for some things solves an issue I've had with how to deal with bindep and packages needing external repos | 12:13 |
mordred | but good to know | 12:13 |
*** ysandeep is now known as ysandeep|afk | 12:16 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey https://review.opendev.org/680712 | 12:24 |
mordred | avass: I think you're looking at the "parse tox.ini for envdir" thing. you might want to consider using tox --showconfig (which can be filtered for a given env with -e{env}) to expand any macros and whatnot | 12:28 |
*** asaleh_ has joined #zuul | 12:29 | |
*** ysandeep|afk is now known as ysandeep | 12:29 | |
avass | mordred: ah, that looks like, thanks! | 12:29 |
avass | looks nice* | 12:29 |
*** cloudnull has joined #zuul | 12:29 | |
*** panda|out is now known as panda | 12:30 | |
tristanC | zuul-maint : https://review.opendev.org/680712 is quite important for kubectl node user, could you please have a look | 12:31 |
AJaeger | mordred: cool, "tox --showconfig -e pdf-docs |grep ^envdir" is what we need instead of setting envdir manually in that line | 12:34 |
mordred | AJaeger: yah | 12:36 |
cloudnull | mornings | 12:37 |
avass | I was planning on passing it to configparser, but I guess that works too :) | 12:37 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: WIP: Import user tutorials from Software Factory project blog https://review.opendev.org/728193 | 12:38 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests https://review.opendev.org/728194 | 12:38 |
avass | cloudnull: good day! | 12:38 |
cloudnull | o/ | 12:39 |
avass | mordred, AJaeger: I think configparser would be easier to extend, with tox --showconfig we can be sure that the logdir is correct too | 12:41 |
AJaeger | try configparser on nova and see whether it does the right thing ;) | 12:42 |
avass | oh, what happens? | 12:43 |
mordred | avass: yeah - use that command, feed the output into configparser - should be very solid | 12:46 |
avass | oh.. I think our tox-siblings.yaml test-playbook is broken | 12:52 |
*** bhavikdbavishi has quit IRC | 12:55 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test https://review.opendev.org/728438 | 12:55 |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Update to new javascript jobs https://review.opendev.org/726554 | 13:00 |
*** jamesmcarthur has joined #zuul | 13:01 | |
*** sgw has joined #zuul | 13:05 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-website master: Add blog to website https://review.opendev.org/724648 | 13:07 |
*** jamesmcarthur has quit IRC | 13:11 | |
openstackgerrit | Monty Taylor proposed zuul/nodepool master: Add podman and config to the nodepool-builder image https://review.opendev.org/726477 | 13:12 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: tox: allow tox to be upgraded https://review.opendev.org/690057 | 13:13 |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST https://review.opendev.org/728410 | 13:23 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test https://review.opendev.org/728438 | 13:32 |
openstackgerrit | Merged zuul/zuul-jobs master: Add remove-zuul-sshkey https://review.opendev.org/680712 | 13:33 |
openstackgerrit | Guillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests https://review.opendev.org/728194 | 13:41 |
*** brendangalloway has joined #zuul | 13:46 | |
brendangalloway | Hello, our nodepool-launcher seems to have entered some sort of error state. Is this the correct place to ask for assistance? | 13:47 |
avass | brendangalloway: absolutely :) | 13:48 |
*** zxiiro has joined #zuul | 13:48 | |
brendangalloway | According to the logs, earlier today the various poolworkers received a stop signal. I'm not sure why. After that, all the requests in the requests list have not been responded to. I've restarted all the zuul and nodepool services, but the requests are still not being filled | 13:49 |
brendangalloway | Is there some way I can query the state of specific pool workers? | 13:50 |
brendangalloway | I suspect that restarting the launcher did not start them again, but I can't find any way to interact with them | 13:51 |
fungi | brendangalloway: try explicitly stopping them before starting them | 13:52 |
fungi | also after stopping, check for stale pidfiles which might have been left behind when they were killed earlier | 13:52 |
brendangalloway | the nodepool launcher and zuul services? | 13:53 |
fungi | probably just the launchers, since it sounds like that's what you had trouble with | 13:54 |
*** felixedel has joined #zuul | 13:54 | |
fungi | if they're being started and stopped normally with initscripts using systemd's sysvinit compat layer, then systemd can have an inconsistent view of the service state, and think they're still running if they got stopped via some other mechanism like a direct kill signal | 13:55 |
fungi | so then when you tell systemd to start them, it just does nothing because it assumes they were already running | 13:55 |
fungi | but telling systemd to explicitly stop them first will get its internal state synced up with reality | 13:55 |
*** Goneri has joined #zuul | 13:56 | |
avass | AJaeger, mordred: wanna +2 https://review.opendev.org/#/c/728438/2 to fix the tox-siblings job? | 13:56 |
brendangalloway | fungi: The nodepool log did indicate it was restarting | 13:57 |
brendangalloway | but to be 100% sure, how would I check for stale PIDs? | 13:57 |
fungi | stop the launchers, then look in their rundir for a something.pid file... on our launchers it was in /var/run/nodepool-launcher/ until a little over a week ago when we switched to using docker containers | 14:00 |
fungi | on some installations it may have been in /var/run/nodepool/ | 14:01 |
brendangalloway | fungi: I'm using softwarefactory 3.4 - I don't see anything nodepool related in /var/run with the service stopped | 14:04 |
fungi | brendangalloway: i don't know much about how sf has the services arranged, but maybe tristanC can provide more precise guidance. if the launcher process doesn't exist/persist in the process list after you start it, and doesn't log anything, then you may need to try invoking it directly in the foreground | 14:07 |
mhu | brendangalloway, with SF nodepool stuff should be in /usr/bin/nodepool & /etc/nodepool | 14:07 |
fungi | mhu: is that where it writes its pidfile too? | 14:07 |
mhu | and logs in /var/log/nodepool/ | 14:07 |
tristanC | fungi: mhu: we don't use pidfile as the services are managed with systemd | 14:08 |
*** jamesmcarthur has joined #zuul | 14:08 | |
tristanC | brendangalloway: does the `nodepool request-list` output look correct? | 14:08 |
fungi | got it. so maybe explicitly stopping it before trying to start it again is good enough | 14:08 |
brendangalloway | tristanC: it has the requests that the current jobs are waiting for | 14:09 |
fungi | though if systemd is working as a direct parent of the nodepool processes in that scenario, it shouldn't get out of sync and think the service is running when it's not, so the problem starting it may be elsewhere | 14:09 |
tristanC | brendangalloway: iirc SF sets a long minSessionTime in zookeeper, and it may take some time before a new launcher service processes the request | 14:10 |
fungi | any idea how long? | 14:10 |
fungi | brendangalloway: after you've started the launcher, is there a nodepool-launcher process in the process table at least? | 14:11 |
tristanC | by default it should be 10 minutes, up to 30 minutes | 14:11 |
brendangalloway | fungi: yes, nodepool is in ps | 14:11 |
fungi | okay, so it *is* starting | 14:12 |
fungi | it's just not processing the backlog? does it process new requests? | 14:12 |
brendangalloway | fungi: not that I can see. I cleared all the jobs and issued a recheck on one that was previously queued | 14:13 |
brendangalloway | request-list is generated, but never fulfilled | 14:14 |
fungi | tristanC: does the minSessionTime prevent a launcher from accepting any requests until it's had a session established for at least that long? | 14:14 |
brendangalloway | 1 node of the requested type is present in nodepool list | 14:14 |
tristanC | fungi: it should not, perhaps there is another issue | 14:14 |
tristanC | brendangalloway: what about /etc/nodepool/nodepool.yaml, is there provider listed? | 14:15 |
brendangalloway | tristanC: I think you might have it - the provider is [] | 14:16 |
*** avass has quit IRC | 14:16 | |
tristanC | brendangalloway: arg, so there is a surprising bug in ansible fact caching, if you look at `grep ansible_hostname /var/lib/software-factory/ansible/facts/*` then you should see an incorrect hostname defined for your nodepool-launcher host | 14:17 |
*** jamesmcarthur has quit IRC | 14:17 | |
brendangalloway | tristanc: yes, I have hit this before. I had put in the fix you recommended previously so I did not look there again | 14:18 |
tristanC | brendangalloway: we haven't figured out the root cause yet. Best is to remove the facts and re-run the configuration: `rm -f /var/lib/software-factory/ansible/facts/* && sfconfig --skip-install` | 14:18 |
openstackgerrit | Merged zuul/zuul-jobs master: Fix broken tox-siblings.yaml test https://review.opendev.org/728438 | 14:19 |
brendangalloway | tristanc: trying that | 14:23 |
brendangalloway | tristanc: sfconfig has completed again, but the provider list is still empty | 14:34 |
tristanC | brendangalloway: arg, that's unfortunate, and is the ansible_hostname fact correct? | 14:35 |
brendangalloway | no, it is incorrect again | 14:35 |
fbo | brendangalloway: is the ansible version on your deployment 2.6.19 ? | 14:41 |
brendangalloway | fbo: yes | 14:42 |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Update node to v14 and update to new jobs https://review.opendev.org/726553 | 14:50 |
openstackgerrit | Felix Edel proposed zuul/zuul master: WIP: Link to previous buildset results when reporting a check to Github https://review.opendev.org/728463 | 14:52 |
fbo | brendangalloway: does this return the wrong hostname: ansible all -m setup -a "gather_subset=all" | grep hostname | 14:56 |
brendangalloway | fbo: it doesn't seem so - I don't see two duplicates at least | 14:58 |
*** harrymichal has joined #zuul | 15:03 | |
*** felixedel has quit IRC | 15:08 | |
*** jamesmcarthur has joined #zuul | 15:15 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST https://review.opendev.org/728410 | 15:17 |
*** avass has joined #zuul | 15:17 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id https://review.opendev.org/728118 | 15:25 |
*** ysandeep is now known as ysandeep|away | 15:36 | |
*** dpawlik has quit IRC | 15:41 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist https://review.opendev.org/726829 | 15:44 |
*** jcapitao has quit IRC | 15:51 | |
openstackgerrit | Matthieu Huin proposed zuul/zuul master: REST API: add promote endpoint https://review.opendev.org/728489 | 16:04 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist https://review.opendev.org/726829 | 16:12 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Remove unecessary environment from tox_siblings test https://review.opendev.org/728494 | 16:21 |
avass | mordred, AJaeger: actually it wasn't broken, I'm just tired and stupid. But it did add missing files: that it should track so instead of reverting I'll just remove the environment ^ :) | 16:22 |
avass | I'm going to take a break now | 16:22 |
AJaeger | avass: take a break and relax! Thanks a lot! | 16:22 |
*** nils has quit IRC | 16:28 | |
*** evrardjp has quit IRC | 16:33 | |
*** evrardjp has joined #zuul | 16:33 | |
*** brendangalloway has quit IRC | 16:33 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role https://review.opendev.org/728503 | 16:52 |
*** fbo is now known as fbo|off | 16:53 | |
*** jamesmcarthur has quit IRC | 16:55 | |
*** rpittau is now known as rpittau|afk | 17:11 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role https://review.opendev.org/728503 | 17:11 |
openstackgerrit | Graham Hayes proposed zuul/nodepool master: Implement an Azure driver https://review.opendev.org/554432 | 17:19 |
Open10K8S | Hi team | 17:24 |
Open10K8S | Zuul is not reachable | 17:24 |
Open10K8S | 503 service unavailable | 17:24 |
*** jamesmcarthur has joined #zuul | 17:30 | |
clarkb | as noted in #opendev this was us taking advantage of a good time to do a complete restart of our zuul on a friday. We had a few things queued up behind openstack completing its release (which happened earlier this week) | 17:32 |
AJaeger | corvus, mordred, could you review https://review.opendev.org/#/c/727561/1 , please? I suggest to move forward since it addresses a real bug. | 17:53 |
corvus | AJaeger: agree; +2; will wait for mordred to +3 or we can +w if he's busy | 17:58 |
AJaeger | thanks, corvus | 17:59 |
AJaeger | mordred: time for a review on 727561 or shall I +A? | 17:59 |
*** jamesmcarthur has quit IRC | 18:05 | |
*** jamesmcarthur has joined #zuul | 18:05 | |
AJaeger | thanks, mordred | 18:06 |
AJaeger | zuul-jobs maintainer, three smaller reviews, please: https://review.opendev.org/728494 and https://review.opendev.org/725030 https://review.opendev.org/#/c/727929/ | 18:11 |
AJaeger | and two longer ones: https://review.opendev.org/727135 and https://review.opendev.org/679306 , please | 18:12 |
openstackgerrit | Merged zuul/zuul-jobs master: bindep: use virtualenv_command from ensure-pip https://review.opendev.org/727561 | 18:16 |
avass | I think we're not cleaning ssh-keys well enough before and after jobs | 18:28 |
avass | I realized that a user could theoretically add their own ssh-key to a node otherwise this wouldn't work: https://review.opendev.org/#/c/679306/ | 18:28 |
avass | and that's not a problem for dynamic nodes | 18:28 |
avass | but for static nodes that ssh-key wouldn't be removed normally | 18:29 |
avass | actually, for that specific role it would, but we don't stop a user from installing their own ssh-key during a job | 18:30 |
*** iurygregory has quit IRC | 18:35 | |
*** iurygregory has joined #zuul | 18:38 | |
fungi | avass: also there are probably plenty of other ways untrusted code could install backdoors on persistent nodes, even reverse tunnels you couldn't block without some seriously restrictive and fragile egress filtering | 18:40 |
avass | fungi: yeah, but even with that you could reach other static nodes in the network by rechecking over and over again | 18:41 |
fungi | if you confined all the processes descended from the job to an ephemeral cgroup, and plugged a number of escape hatches to keep them from forking outside it, then you could probably forcibly terminate any processes left running | 18:42 |
fungi | after which it's a matter of making sure ssh keys and any other remote access normally initialized outside the cgroup got reset | 18:42 |
AJaeger | fungi: question is as well: Do we make it easier with a role like https://review.opendev.org/679306 ? | 18:43 |
avass | fungi, AJaeger: you would need to isolate each node and not allow any multi-node jobs | 18:44 |
*** iurygregory has quit IRC | 18:45 | |
openstackgerrit | Albin Vass proposed zuul/zuul-base-jobs master: Make sure authorized_keys is not altered during a job https://review.opendev.org/728551 | 18:46 |
avass | AJaeger, fungi: wouldn't something like that ^ stop the user from leaving a public key installed on the node though? | 18:46 |
fungi | toctou race, the user can ssh in while the job is running after it has altered the authorized keys file before zuul sets it back | 18:47 |
fungi | but again, if they really want in, they could just set up some other backdoor not reliant on ssh | 18:48 |
fungi | there are so many other (far worse) ways people could accomplish that, might as well not start down the road of making someone think they can secure against it like that | 18:49 |
avass | I'm not trying to stop the user from ssh-ing into the node they got during the job | 18:49 |
fungi | right, but how do you make sure they're kicked off when the node gets reused for the next job? | 18:50 |
avass | but stop them from leaving an ssh-key so they can reach nodes they didn't request | 18:50 |
fungi | i guess you could reboot between jobs | 18:50 |
avass | otherwise they could intercept a different job using secrets | 18:50 |
fungi | how many quick "leave my backdoor" jobs would they need to trigger to be fairly certain they've got a persistent backdoor into every one of your persistent nodes? | 18:51 |
avass | well, one, they could request every node to install it :) | 18:53 |
fungi | assuming you don't limit the nodeset size, yep | 18:53 |
avass | yeah | 18:53 |
fungi | back in the time before nodepool, when we used a lot of persistent nodes in what would become opendev, we accepted that was a risk and segregated jobs to different nodes (and different jenkins masters, because jenkins slaves got basically unrestricted shell access to masters in those days) based on whether they ran untrusted code, and didn't allow credentials to be used by jobs which ran on those unclean | 18:53 |
fungi | persistent nodes | 18:53 |
avass | that's probably a solution | 18:54 |
fungi | so basically if a job needed sensitive data, it was not allowed to run any arbitrary code, and it was isolated to special-use persistent nodes, often nodes which only ran that one job or a closely-related class of jobs | 18:55 |
avass | fungi: still even if you limit the nodeset size, they could just increase the number of jobs. couldn't they? | 18:55 |
fungi | yep, that's why i asked how many jobs they'd need to run. they could still probably do it with a single event triggered for a single change though | 18:55 |
avass | ah | 18:56 |
fungi | where m copies of the job times n nodes in the nodeset = your pool size | 18:56 |
avass | I'm glad we're moving towards cloud | 18:57 |
fungi | of course this assumes you don't catch them at it, but still, it's a possible avenue | 18:57 |
*** asaleh_ has quit IRC | 18:59 | |
fungi | we have what are effectively some persistent nodes in opendev today, but the only jobs they run are deployments triggered by changes merging in repositories where the reviewers with approval rights also have root access on the "nodes" (our production servers) | 19:00 |
avass | yeah we do that too for some infrastructure nodes | 19:02 |
avass | but for some reason the team that is hosting our PyPi repos requires us to use credentials for read access to those repos, and I've been trying to figure out a way to allow the users to install packages during the jobs without revealing credentials. | 19:03 |
avass | and it seems impossible :) | 19:03 |
fungi | does their authentication mechanism support ephemeral tokens? | 19:04 |
avass | they still need to be coupled to a user, so we would need one user per job and they still wouldn't be revoked automatically | 19:05 |
fungi | you could have the executor create and authorize a build-specific access credential and stick that in a variable or push it into a file on the node, then deauthorize it at the end of the build | 19:05 |
avass | I'm thinking about setting up mirrors instead | 19:05 |
fungi | and yeah, probably having some built-in expiration for those credentials would also be useful, in case the deauthorization failed to fire or encountered an error | 19:06 |
avass | but yes, that would have been a solution if we were able to do that, but adding a new user every time we add a new node doesn't seem like a good solution | 19:07 |
avass | since we also can't do that ourselves but have to submit a ticket to another team.... and so on :) | 19:08 |
*** jamesmcarthur has quit IRC | 19:12 | |
avass | fungi: hmm, looks like we are supposed to be able to do that, I guess that team just doesn't know about it | 19:14 |
avass | fungi: thanks for the tip! | 19:14 |
*** jamesmcarthur has joined #zuul | 19:21 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0 https://review.opendev.org/702679 | 19:39 |
avass | fungi: now I feel stupid for not checking that earlier | 19:39 |
*** noonedeadpunk has quit IRC | 19:42 | |
*** noonedeadpunk has joined #zuul | 19:42 | |
fungi | avass: glad my inane ramblings are helpful to someone! ;) | 19:50 |
avass | :) | 19:50 |
*** rlandy_ has joined #zuul | 20:02 | |
*** rlandy has quit IRC | 20:05 | |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role https://review.opendev.org/728503 | 20:08 |
*** rlandy_ is now known as rlandy | 20:08 | |
avass | zbr: re 702679, I posted some questions | 20:18 |
openstackgerrit | Oleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role https://review.opendev.org/728503 | 20:20 |
zbr | avass: as I said, disabled only to allow the bumping, not because we do not want them. | 20:22 |
zbr | otherwise the size of the change would be too big | 20:22 |
zbr | ahh, correct, i swapped two lines by mistake. | 20:23 |
avass | zbr: didn't we have 206 before the bump though? | 20:23 |
*** harrymichal has quit IRC | 20:25 | |
zbr | i'm double checking them, this review is very old, some may be leftovers | 20:25 |
*** tumble has joined #zuul | 20:26 | |
zbr | avass: super, i can remove all, there are only ~20 fixes to do. | 20:29 |
avass | zbr: I'm quitting for tonight, I'll check in tomorrow :) | 20:34 |
zbr | ouch... i see a regression, inline noqa no longer seems to be working. | 20:34 |
zbr | sure, it's not urgent, i'd better do the same | 20:35 |
zbr | avass: narrowed it down as https://github.com/ansible/ansible-lint/issues/786 so not a big deal. | 20:41 |
*** iurygregory has joined #zuul | 20:42 | |
*** rfolco|rover has quit IRC | 20:42 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0 https://review.opendev.org/702679 | 21:06 |
*** paladox has quit IRC | 21:28 | |
*** paladox has joined #zuul | 21:30 | |
*** rlandy is now known as rlandy|brb | 22:12 | |
*** sanjayu_ has joined #zuul | 22:18 | |
*** saneax has quit IRC | 22:20 | |
*** jamesmcarthur_ has joined #zuul | 22:25 | |
*** jamesmcarthur has quit IRC | 22:29 | |
*** rlandy|brb is now known as rlandy | 22:32 | |
*** rlandy has quit IRC | 22:52 | |
*** ysandeep|away is now known as ysandeep | 23:04 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-website master: Switch website to Gatsby https://review.opendev.org/717371 | 23:08 |
openstackgerrit | Monty Taylor proposed zuul/zuul-website master: Add blog to website https://review.opendev.org/724648 | 23:09 |
*** ysandeep is now known as ysandeep|weekend | 23:09 | |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Update node to v14 and update to new jobs https://review.opendev.org/726553 | 23:10 |
*** sanjayu_ has quit IRC | 23:17 | |
*** jamesmcarthur_ has quit IRC | 23:17 | |
*** jamesmcarthur has joined #zuul | 23:21 | |
*** jamesmcarthur has quit IRC | 23:26 | |
*** jamesmcarthur has joined #zuul | 23:28 | |
*** jamesmcarthur has quit IRC | 23:34 | |
*** jamesmcarthur has joined #zuul | 23:37 | |
*** tosky has quit IRC | 23:39 | |
*** guillaumec has quit IRC | 23:41 | |
*** jamesmcarthur has quit IRC | 23:42 | |
*** jamesmcarthur has joined #zuul | 23:43 | |
*** jamesmcarthur has quit IRC | 23:45 |
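
The build-scoped credential idea fungi describes at 19:05 and 19:06 (authorize a token for one build, deauthorize it when the build ends, and give it a built-in expiry in case deauthorization fails) can be sketched roughly as below. This is a minimal illustration, not Zuul's or any package index's actual API: `TokenStore`, `build_credential`, and the `ttl_seconds` parameter are hypothetical names, and the in-memory store stands in for whatever token service the real index exposes.

```python
import secrets
import time
from contextlib import contextmanager


class TokenStore:
    """Hypothetical in-memory stand-in for a package index's token service."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._expiry = {}    # token -> time after which it is no longer valid

    def issue(self, ttl_seconds):
        """Create a random token that expires on its own after ttl_seconds."""
        token = secrets.token_urlsafe(32)
        self._expiry[token] = self._clock() + ttl_seconds
        return token

    def revoke(self, token):
        """Deauthorize a token; revoking an unknown token is a no-op."""
        self._expiry.pop(token, None)

    def is_valid(self, token):
        """A token is valid only if it was issued, not revoked, and not expired."""
        expiry = self._expiry.get(token)
        return expiry is not None and self._clock() < expiry


@contextmanager
def build_credential(store, ttl_seconds=3600):
    """Issue a token for one build and revoke it when the build finishes.

    The TTL is the safety net: if the revoke step never runs (for example,
    the process dies), the token still stops working after ttl_seconds.
    """
    token = store.issue(ttl_seconds)
    try:
        yield token
    finally:
        store.revoke(token)


if __name__ == "__main__":
    store = TokenStore()
    with build_credential(store, ttl_seconds=600) as token:
        # In a real job the token would be handed to the node here,
        # e.g. via a variable or a file, as suggested in the discussion.
        print("valid during build:", store.is_valid(token))
    print("valid after build:", store.is_valid(token))
```

The context manager keeps issue and revoke paired even when the build raises, while the expiry covers the case where cleanup cannot run at all.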