Friday, 2020-05-15

*** jamesmcarthur has joined #zuul00:11
*** jamesmcarthur has quit IRC00:16
openstackgerritGuillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests  https://review.opendev.org/72819400:23
*** jamesmcarthur has joined #zuul00:23
*** jamesmcarthur has quit IRC00:34
openstackgerritmelanie witt proposed zuul/zuul-jobs master: Run sphinx-build in parallel for releasenotes  https://review.opendev.org/72747300:35
*** rlandy has quit IRC00:40
*** jamesmcarthur has joined #zuul01:08
tristanCcorvus: i left a reply to your comment on https://review.opendev.org/72815101:09
*** ysandeep|sleep is now known as ysandeep01:34
*** jamesmcarthur has quit IRC01:44
*** swest has quit IRC01:44
*** jamesmcarthur has joined #zuul01:45
*** jamesmcarthur has quit IRC01:49
*** swest has joined #zuul02:00
*** rlandy has joined #zuul02:11
*** jamesmcarthur has joined #zuul02:12
*** jamesmcarthur has quit IRC02:14
*** jamesmcarthur has joined #zuul02:14
*** jamesmcarthur_ has joined #zuul02:17
*** jamesmcarthur has quit IRC02:18
*** jamesmcarthur has joined #zuul02:20
*** jamesmca_ has joined #zuul02:21
*** jamesmcarthur_ has quit IRC02:22
*** jamesmcarthur_ has joined #zuul02:23
*** jamesmcarthur has quit IRC02:24
*** jamesmca_ has quit IRC02:25
*** jamesmcarthur has joined #zuul02:26
*** jamesmca_ has joined #zuul02:27
*** jamesmc__ has joined #zuul02:28
*** jamesmcarthur_ has quit IRC02:28
*** jamesmcarthur has quit IRC02:30
*** jamesmca_ has quit IRC02:31
*** jamesmc__ has quit IRC02:32
*** bhavikdbavishi has joined #zuul03:14
*** jamesmcarthur has joined #zuul03:14
*** bhavikdbavishi1 has joined #zuul03:17
*** jamesmcarthur has quit IRC03:17
*** jamesmcarthur has joined #zuul03:17
*** bhavikdbavishi has quit IRC03:18
*** bhavikdbavishi1 is now known as bhavikdbavishi03:18
*** cloudnull has quit IRC03:55
*** jamesmcarthur has quit IRC03:57
*** jamesmcarthur has joined #zuul03:57
*** jamesmcarthur has quit IRC04:08
*** jamesmcarthur has joined #zuul04:09
*** jamesmcarthur has quit IRC04:10
*** jamesmcarthur has joined #zuul04:10
*** evrardjp has quit IRC04:33
*** evrardjp has joined #zuul04:33
*** bhavikdbavishi has quit IRC04:47
*** ysandeep is now known as ysandeep|afk04:51
*** bhavikdbavishi has joined #zuul05:09
*** felixedel has joined #zuul05:24
*** felixedel has quit IRC05:45
openstackgerritTobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7  https://review.opendev.org/72737305:48
openstackgerritTobias Henkel proposed zuul/zuul master: Update images to use python 3.8  https://review.opendev.org/72737405:48
openstackgerritTobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7  https://review.opendev.org/72737305:50
openstackgerritTobias Henkel proposed zuul/zuul master: Update images to use python 3.8  https://review.opendev.org/72737405:55
*** dpawlik has joined #zuul05:58
*** y2kenny has quit IRC06:00
*** sgw has quit IRC06:00
*** ysandeep|afk is now known as ysandeep06:03
*** zxiiro has quit IRC06:04
*** saneax has joined #zuul06:12
*** jamesmcarthur has quit IRC06:22
openstackgerritMerged zuul/zuul master: Fix loading_errors bug  https://review.opendev.org/72828606:24
*** jamesmcarthur has joined #zuul06:56
openstackgerritTobias Henkel proposed zuul/zuul master: Stop jobs on gearman disconnect  https://review.opendev.org/71472207:09
openstackgerritGuillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests  https://review.opendev.org/72819407:11
*** rpittau|afk is now known as rpittau07:12
*** guillaumec has joined #zuul07:19
*** jcapitao has joined #zuul07:22
openstackgerritTobias Henkel proposed zuul/zuul master: Default to Ansible 2.9  https://review.opendev.org/72734507:22
openstackgerritTobias Henkel proposed zuul/zuul master: Drop support for ansible 2.6  https://review.opendev.org/72715707:22
openstackgerritTobias Henkel proposed zuul/zuul master: Drop support for ansible 2.7  https://review.opendev.org/72737307:22
*** bhavikdbavishi has quit IRC07:30
*** yolanda has joined #zuul07:32
*** tosky has joined #zuul07:35
openstackgerritMatthieu Huin proposed zuul/zuul master: Add simple testing for Zuul CLI & REST API  https://review.opendev.org/72809807:54
*** bhavikdbavishi has joined #zuul07:55
*** nils has joined #zuul08:04
*** fbo|off is now known as fbo|afk08:18
*** piotrowskim has joined #zuul08:22
openstackgerritMatthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id  https://review.opendev.org/72811808:24
piotrowskimHello, https://zuul.opendev.org/t/openstack/build/5c716fd6fbfe42548b9b58e1a2e49545, could anyone help me with this issue? I don't see the mentioned line in the file, and I am not sure if I have a lint error in my project or if it's something else08:25
AJaegerpiotrowskim: that's from the 12th, isn't it? Please recheck. We fixed a few problems in that area08:27
piotrowskimI think 1008:27
piotrowskimare you asking about the nodejs version?08:27
AJaegerpiotrowskim: 12th of May08:37
piotrowskimok08:38
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: DNM: Debug sibling install  https://review.opendev.org/72838408:44
AJaegerpiotrowskim: did the recheck help?08:58
*** ysandeep is now known as ysandeep|lunch09:01
piotrowskimi think so, thanks09:01
AJaegergreat09:02
avassAJaeger: what issue are you checking for tox_siblings?09:11
*** jamesmcarthur has quit IRC09:13
AJaegernova sets envdir in its tox.ini, and thus sibling install fails since we expect that envdir matches the envlist (pdf-docs) and not another one, see https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/tox/library/tox_install_sibling_packages.py#L190-L19409:17
avassAJaeger: you mean the default envlist? not sure I follow09:18
AJaegeravass: I was figuring out why tox_siblings works for docs but not pdf-docs. the envdir=docs in nova's tox.ini breaks it. So, that was my debugging. Now question: how to fix? Remove envdir from nova/tox.ini or teach siblings to parse tox.ini and check for envdir?09:18
AJaegerthat's a discussion for another time today - I now know what's broken09:19
avassAJaeger: it's probably better to parse tox.ini, that shouldn't be too hard09:19
avassAJaeger: since someone else could be doing that too09:19
avassAJaeger: I have a change that updates tox_siblings to take a list of testenvs, I could stack another change on top of that where I parse tox.ini as well09:21
AJaegeravass, wow, that would be really great!09:36
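For context, the breakage AJaeger describes comes from one testenv reusing another env's directory. Nova's tox.ini at the time looked roughly like the fragment below (paraphrased, not an exact copy); the shared envdir means .tox/pdf-docs never exists, which is what trips up tox_install_sibling_packages:

    [testenv:docs]
    deps = -r{toxinidir}/doc/requirements.txt
    commands = sphinx-build -W -b html doc/source doc/build/html

    [testenv:pdf-docs]
    # reuse the docs environment instead of creating .tox/pdf-docs,
    # so the sibling-install code cannot find the env dir it expects
    envdir = {toxworkdir}/docs
    deps = {[testenv:docs]deps}
    commands = sphinx-build -W -b latex doc/source doc/build/pdf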
*** ysandeep|lunch is now known as ysandeep09:39
*** bhavikdbavishi has quit IRC09:43
*** bhavikdbavishi has joined #zuul09:44
*** jamesmcarthur has joined #zuul09:44
*** bhavikdbavishi has quit IRC09:48
*** jamesmcarthur has quit IRC09:55
zbris there an easy way to post a message to irc when a job fails? -- controlling this from the job definition?10:09
*** rpittau is now known as rpittau|bbl10:09
AJaegerzbr: gerritbot allows that10:11
zbrAJaeger: that is not for gerrit changes, and not even our own zuul.10:11
zbri wonder if there is an ansible module that can be used to do this10:12
AJaegergerritbot reports various events; we do not have failure reporting enabled. x-vrif-minus-2 is enabled for e.g. ironic, but that reacts to -2 only. So, for a single failing job: nothing in gerritbot.10:13
AJaegerzbr: for the generic question you have, I cannot help.10:13
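For reference, the per-channel gerritbot configuration AJaeger is describing looks roughly like the ironic entry below; the exact keys are recalled from opendev's channel config and should be treated as an approximation:

    openstack-ironic:
      events:
        - patchset-created
        - change-merged
        - x-vrif-minus-2
      projects:
        - openstack/ironic
      branches:
        - master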
zbrin fact i think it would still require a webservice acting as a broker because you do not want to reconnect every time10:14
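On zbr's Ansible question: the irc module shipped with Ansible 2.9 (later community.general.irc) can post a one-off message, for example from a base job's post-run playbook gated on zuul_success. It connects and disconnects per task, which is exactly the reconnect cost zbr wants to avoid for anything high-volume, so this sketch (server, channel and message are placeholders) only suits occasional notifications:

    - name: Notify IRC when the job failed
      irc:
        server: irc.example.org
        port: 6667
        channel: "#mychannel"
        nick: zuul-notify
        msg: "Job {{ zuul.job }} failed: {{ zuul.change_url | default('') }}"
      when: not (zuul_success | bool)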
AJaegerzbr, could you answer ianw's comment on https://review.opendev.org/#/c/727561/ , please?10:14
AJaegerzbr: I would be happy to merge that change to move us one step further ^10:15
zbrhttps://github.com/ansible/ansible/blob/stable-2.9/changelogs/CHANGELOG-v2.9.rst#bugfixes which mentions https://github.com/ansible/ansible/issues/5227510:16
zbri'm 100% sure i tried to use "python -m xxx" on virtualenv_command and it did not work10:16
zbrso relying on it would be a bad idea.10:16
zbrthat is regardless of venv10:17
openstackgerritMatthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST  https://review.opendev.org/72841010:17
zbrsame applies to virtualenv10:17
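The knob being discussed is the pip module's virtualenv_command parameter. A minimal sketch of the "-m venv" form follows; per zbr's experience and the linked issue 52275, assume it only behaves on Ansible releases that include that bugfix:

    - name: Install bindep into a venv created with the stdlib venv module
      pip:
        name: bindep
        virtualenv: "{{ ansible_user_dir }}/.bindep-venv"
        virtualenv_command: "{{ ansible_python.executable }} -m venv"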
*** jamesmcarthur has joined #zuul10:23
zbri hate so much that I cannot select text in git review comments....10:24
openstackgerritMatthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id  https://review.opendev.org/72811810:25
openstackgerritMatthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST  https://review.opendev.org/72841010:25
*** dpawlik has quit IRC10:27
*** dpawlik has joined #zuul10:28
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: Add missing virtualenv and fixed repo install  https://review.opendev.org/69363710:31
*** jcapitao is now known as jcapitao_lunch10:44
*** fbo|afk is now known as fbo11:06
*** bhavikdbavishi has joined #zuul11:07
AJaegerzbr: ianw addressed your concerns with https://review.opendev.org/#/c/726715/4/roles/ensure-pip/tasks/main.yaml. My understanding is that there is a bug that will not manifest the way ianw wrote this up.11:16
avasshmm, I've noticed that some of our static nodes get locked in zookeeper for some reason11:19
avassthey get stuck and nodepool is waiting in a pending state forever for the node: http://paste.openstack.org/show/793653/11:22
avassand I don't know how to fix it other than manually deleting the request-lock in zookeeper11:22
avassactually no, not the request node, but deleting the node in zookeeper11:26
avasssince it's stuck in a ready but locked state11:26
avassdeleting in nodepool didn't seem to work11:26
*** jamesmcarthur has quit IRC11:27
tobiashavass: ready+locked state is something we see as well in some conditions (the scheduler holds the lock for a never-to-be-started build)11:41
tobiashavass: I tried to fix that in https://review.opendev.org/714852 but that caused a memleak in zuul11:41
tobiashso far we regularly delete the locks of nodes that are in ready+locked for longer periods of time as a workaround (this is unsupported and might cause other side effects tm)11:42
avasstobiash: yeah, it only ever happens for one specific tenant for us11:43
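A rough sketch of the manual lock cleanup tobiash describes, using kazoo directly. The /nodepool/nodes/&lt;id&gt;/lock layout is an assumption about nodepool's ZooKeeper tree, and as he says the whole approach is unsupported, so verify the paths against your own cluster and only delete locks you have confirmed are stale:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='zk01.example.org:2181')  # placeholder host
    zk.start()
    try:
        for node_id in zk.get_children('/nodepool/nodes'):
            lock_path = '/nodepool/nodes/%s/lock' % node_id
            if zk.exists(lock_path) and zk.get_children(lock_path):
                # node currently holds a lock; decide manually whether it is
                # a stale ready+locked node before removing anything
                print('locked node:', node_id)
                # zk.delete(lock_path, recursive=True)
    finally:
        zk.stop()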
*** threestrands has quit IRC11:46
openstackgerritMatthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id  https://review.opendev.org/72811811:49
*** jamesmcarthur has joined #zuul11:55
*** jamesmcarthur has quit IRC12:04
openstackgerritMatthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST  https://review.opendev.org/72841012:06
*** rpittau|bbl is now known as rpittau12:06
*** sshnaidm|afk is now known as sshnaidm|off12:07
mordredzbr: bindep output is ordered?12:08
AJaegermordred: it's not so far - that's a new feature that zbr wants to introduce. That's why I ask to split his change into three so that we can discuss it.12:09
*** jcapitao_lunch is now known as jcapitao12:12
mordredAJaeger: ah - gotcha12:12
mordredI mean - the epel-release example makes a good amount of sense and at least for some things solves an issue I've had with how to deal with bindep and packages needing external repos12:13
mordredbut good to know12:13
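The epel-release case mordred mentions would look something like the bindep.txt fragment below; note it only works as intended if entries are installed in file order, which is the ordering zbr is proposing rather than something bindep guarantees today:

    # enable the EPEL repo first so later entries can resolve from it
    epel-release [platform:rpm]
    # example of a package that lives in EPEL on CentOS 7
    libsodium-devel [platform:rpm]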
*** ysandeep is now known as ysandeep|afk12:16
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071212:24
mordredavass: I think you're looking at the "parse tox.ini for envdir" thing. you might want to consider using tox --showconfig (which can be filtered for a given env with -e{env}) to expand any macros and whatnot12:28
*** asaleh_ has joined #zuul12:29
*** ysandeep|afk is now known as ysandeep12:29
avassmordred: ah, that looks like, thanks!12:29
avasslooks nice*12:29
*** cloudnull has joined #zuul12:29
*** panda|out is now known as panda12:30
tristanCzuul-maint : https://review.opendev.org/680712 is quite important for kubectl node user, could you please have a look12:31
AJaegermordred: cool, "tox --showconfig -e pdf-docs |grep ^envdir" is what we need instead of setting envdir manually in that line12:34
mordredAJaeger: yah12:36
cloudnullmornings12:37
avassI was planning on passing it to configparser, but I guess that works too :)12:37
openstackgerritGuillaume Chauvel proposed zuul/zuul master: WIP: Import user tutorials from Software Factory project blog  https://review.opendev.org/72819312:38
openstackgerritGuillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests  https://review.opendev.org/72819412:38
avasscloudnull: good day!12:38
cloudnullo/12:39
avassmordred, AJaeger: I think configparser would be easier to extend, with tox --showconfig we can be sure that the logdir is correct too12:41
AJaegertry configparser on nova and see whether it does the right thing ;)12:42
avassoh, what happens?12:43
mordredavass: yeah - use that command, feed the output into configparser - should be very solid12:46
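Putting mordred's suggestion together, a minimal sketch of reading the effective envdir for a testenv; the preamble-skipping and disabled interpolation are defensive, since tox may print informational lines before the INI sections and tox values often contain characters configparser would otherwise try to interpolate:

    import configparser
    import subprocess

    def tox_envdir(envname, project_dir='.'):
        """Return the expanded envdir for a testenv via `tox --showconfig`."""
        out = subprocess.run(
            ['tox', '--showconfig', '-e', envname],
            cwd=project_dir, capture_output=True, text=True, check=True,
        ).stdout
        # keep only the INI-looking part of the output
        ini = out[out.index('['):]
        cfg = configparser.ConfigParser(interpolation=None)
        cfg.read_string(ini)
        return cfg.get('testenv:%s' % envname, 'envdir')

    # e.g. tox_envdir('pdf-docs') returns the shared .tox/docs path in nova's case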
avassoh.. I think our tox-siblings.yaml test-playbook is broken12:52
*** bhavikdbavishi has quit IRC12:55
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843812:55
openstackgerritMonty Taylor proposed zuul/zuul master: Update to new javascript jobs  https://review.opendev.org/72655413:00
*** jamesmcarthur has joined #zuul13:01
*** sgw has joined #zuul13:05
openstackgerritMonty Taylor proposed zuul/zuul-website master: Add blog to website  https://review.opendev.org/72464813:07
*** jamesmcarthur has quit IRC13:11
openstackgerritMonty Taylor proposed zuul/nodepool master: Add podman and config to the nodepool-builder image  https://review.opendev.org/72647713:12
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005713:13
openstackgerritMatthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST  https://review.opendev.org/72841013:23
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843813:32
openstackgerritMerged zuul/zuul-jobs master: Add remove-zuul-sshkey  https://review.opendev.org/68071213:33
openstackgerritGuillaume Chauvel proposed zuul/zuul master: WIP: Add tutorial tests  https://review.opendev.org/72819413:41
*** brendangalloway has joined #zuul13:46
brendangallowayHello, our nodepool-launcher seems to have entered some sort of error state.  Is this the correct place to ask for assistance?13:47
avassbrendangalloway: absolutely :)13:48
*** zxiiro has joined #zuul13:48
brendangallowayAccording to the logs, earlier today the various poolworkers received a stop signal. I'm not sure why. After that, none of the requests in the request list have been responded to. I've restarted all the zuul and nodepool services, but the requests are still not being filled13:49
brendangallowayIs there some way I can query the state of specific pool workers?13:50
brendangallowayI suspect that restarting the launcher did not start them again, but I can't find any way to interact with them13:51
fungibrendangalloway: try explicitly stopping them before starting them13:52
fungialso after stopping, check for stale pidfiles which might have been left behind when they were killed earlier13:52
brendangallowaythe nodepool launcher and zuul services?13:53
fungiprobably just the launchers, since it sounds like that's what you had trouble with13:54
*** felixedel has joined #zuul13:54
fungiif they're being started and stopped normally with initscripts using systemd's sysvinit compat layer, then systemd can have an inconsistent view of the service state, and think they're still running if they got stopped via some other mechanism like a direct kill signal13:55
fungiso then when you tell systemd to start them, it just does nothing because it assumes they were already running13:55
fungibut telling systemd to explicitly stop them first will get its internal state synced up with reality13:55
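The sequence fungi describes, roughly; the service name and pidfile locations vary by installation (the two paths below are the ones mentioned later in this conversation):

    # explicitly stop first so systemd's view matches reality
    systemctl stop nodepool-launcher

    # look for stale pidfiles left behind by an unclean shutdown and
    # remove any that refer to processes which are no longer running
    ls /var/run/nodepool-launcher/*.pid /var/run/nodepool/*.pid 2>/dev/null

    systemctl start nodepool-launcher
    systemctl status nodepool-launcher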
*** Goneri has joined #zuul13:56
avassAJaeger, mordred: wanna +2 https://review.opendev.org/#/c/728438/2 to fix the tox-siblings job?13:56
brendangallowayfungi: The nodepool log did indicate it was restarting13:57
brendangallowaybut to be 100% sure, how would I check for stale PIDs?13:57
fungistop the launchers, then look in their rundir for a something.pid file... on our launchers it was in /var/run/nodepool-launcher/ until a little over a week ago when we switched to using docker containers14:00
fungion some installations it may have been in /var/run/nodepool/14:01
brendangallowayfungi: I'm using softwarefactory 3.4 - I don't see anything nodepool related in /var/run with the service stopped14:04
fungibrendangalloway: i don't know much about how sf has the services arranged, but maybe tristanC can provide more precise guidance. if the launcher process doesn't exist/persist in the process list after you start it, and doesn't log anything, then you may need to try invoking it directly in the foreground14:07
mhubrendangalloway, with SF nodepool stuff should be in /usr/bin/nodepool & /etc/nodepool14:07
fungimhu: is that where it writes its pidfile too?14:07
mhuand logs in /var/log/nodepool/14:07
tristanCfungi: mhu: we don't use pidfile as the services are managed with systemd14:08
*** jamesmcarthur has joined #zuul14:08
tristanCbrendangalloway: does the `nodepool request-list` output look correct?14:08
fungigot it. so maybe explicitly stopping it before trying to start it again is good enough14:08
brendangallowaytristanC: it has the requests that the current jobs are waiting for14:09
fungithough if systemd is working as a direct parent of the nodepool processes in that scenario, it shouldn't get out of sync and think the service is running when it's not, so the problem starting it may be elsewhere14:09
tristanCbrendangalloway: iirc SF sets a long minSessionTime in zookeeper, and it may take some time before a new launcher service processes the request14:10
fungiany idea how long?14:10
fungibrendangalloway: after you've started the launcher, is there a nodepool-launcher process in the process table at least?14:11
tristanCby default it should be 10 minutes, up to 30 minutes14:11
brendangallowayfungi: yes, nodepool is in ps14:11
fungiokay, so it *is* starting14:12
fungiit's just not processing the backlog? does it process new requests?14:12
brendangallowayfungi: not that I can see.  I cleared all the jobs and issued a recheck on one that was previously queued14:13
brendangallowayrequest-list is generated, but never fulfilled14:14
fungitristanC: does the minSessionTime prevent a launcher from accepting any requests until it's had a session established for at least that long?14:14
brendangalloway1 node of the requested type is present in nodepool list14:14
tristanCfungi: it should not, perhaps there is another issue14:14
tristanCbrendangalloway: what about /etc/nodepool/nodepool.yaml, is there provider listed?14:15
brendangallowaytristanC:  I think you might have it - the provider is []14:16
*** avass has quit IRC14:16
tristanCbrendangalloway: arg, so there is a surprising bug in ansible fact caching, if you look at `grep ansible_hostname /var/lib/software-factory/ansible/facts/*` then you should see an incorrect hostname defined for your nodepool-launcher host14:17
*** jamesmcarthur has quit IRC14:17
brendangallowaytristanc: yes, I have hit this before.  I had put in the fix you recommended previously so I did not look there again14:18
tristanCbrendangalloway: we haven't figured out the root cause yet. Best is to remove the fact and re-run the configuration: `rm -f /var/lib/software-factory/ansible/facts/* && sfconfig --skip-install`14:18
openstackgerritMerged zuul/zuul-jobs master: Fix broken tox-siblings.yaml test  https://review.opendev.org/72843814:19
brendangallowaytristanc: trying that14:23
brendangallowaytristanc: sfconfig has completed again, but the provider list is still empty14:34
tristanCbrendangalloway: arg, that's unfortunate, and is the ansible_hostname fact correct?14:35
brendangallowayno, it is incorrect again14:35
fbobrendangalloway: is the ansible version on your deployment 2.6.19 ?14:41
brendangallowayfbo: yes14:42
openstackgerritMonty Taylor proposed zuul/zuul master: Update node to v14 and update to new jobs  https://review.opendev.org/72655314:50
openstackgerritFelix Edel proposed zuul/zuul master: WIP: Link to previous buildset results when reporting a check to Github  https://review.opendev.org/72846314:52
fbobrendangalloway: does this return the wrong hostname: ansible all -m setup -a "gather_subset=all" | grep hostname14:56
brendangallowayfbo: it doesn't seem so - I don't see two duplicates at least14:58
*** harrymichal has joined #zuul15:03
*** felixedel has quit IRC15:08
*** jamesmcarthur has joined #zuul15:15
openstackgerritMatthieu Huin proposed zuul/zuul master: CLI: add autohold-info, autohold-delete via REST  https://review.opendev.org/72841015:17
*** avass has joined #zuul15:17
openstackgerritMatthieu Huin proposed zuul/zuul master: REST API: remove useless tenant when doing autohold query by id  https://review.opendev.org/72811815:25
*** ysandeep is now known as ysandeep|away15:36
*** dpawlik has quit IRC15:41
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist  https://review.opendev.org/72682915:44
*** jcapitao has quit IRC15:51
openstackgerritMatthieu Huin proposed zuul/zuul master: REST API: add promote endpoint  https://review.opendev.org/72848916:04
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Don't require tox_envlist  https://review.opendev.org/72682916:12
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Remove unecessary environment from tox_siblings test  https://review.opendev.org/72849416:21
avassmordred, AJaeger: actually it wasn't broken, I'm just tired and stupid. But it did add missing files: that it should track, so instead of reverting I'll just remove the environment ^ :)16:22
avassI'm going to take a break now16:22
AJaegeravass: take a break and relax! Thanks a lot!16:22
*** nils has quit IRC16:28
*** evrardjp has quit IRC16:33
*** evrardjp has joined #zuul16:33
*** brendangalloway has quit IRC16:33
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850316:52
*** fbo is now known as fbo|off16:53
*** jamesmcarthur has quit IRC16:55
*** rpittau is now known as rpittau|afk17:11
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850317:11
openstackgerritGraham Hayes proposed zuul/nodepool master: Implement an Azure driver  https://review.opendev.org/55443217:19
Open10K8SHi team17:24
Open10K8SZuul is not reachable17:24
Open10K8S503 service unavailable17:24
*** jamesmcarthur has joined #zuul17:30
clarkbas noted in #opendev this was us taking advantage of a good time to do a complete restart of our zuul on a friday. We had a few things queued up behind openstack completing its release (which happened earlier this week)17:32
AJaegercorvus, mordred, could you review https://review.opendev.org/#/c/727561/1 , please? I suggest to move forward since it addresses a real bug.17:53
corvusAJaeger: agree; +2; will wait for mordred to +3 or we can +w if he's busy17:58
AJaegerthanks, corvus17:59
AJaegermordred: time for a review on 727561 or shall I +A?17:59
*** jamesmcarthur has quit IRC18:05
*** jamesmcarthur has joined #zuul18:05
AJaegerthanks, mordred18:06
AJaegerzuul-jobs maintainer, three smaller reviews, please: https://review.opendev.org/728494 and https://review.opendev.org/725030 https://review.opendev.org/#/c/727929/18:11
AJaegerand two longer ones: https://review.opendev.org/727135 and https://review.opendev.org/679306 , please18:12
openstackgerritMerged zuul/zuul-jobs master: bindep: use virtualenv_command from ensure-pip  https://review.opendev.org/72756118:16
avassI think we're not cleaning ssh-keys well enough before and after jobs18:28
avassI realized that a user could theoretically add their own ssh-key to a node otherwise this wouldn't work: https://review.opendev.org/#/c/679306/18:28
avassand that's not a problem for dynamic nodes18:28
avassbut for static nodes that ssh-key wouldn't be removed normally18:29
avassactually, for that specific role it would, but we don't stop a user from installing their own ssh-key during a job18:30
*** iurygregory has quit IRC18:35
*** iurygregory has joined #zuul18:38
fungiavass: also there are probably plenty of other ways untrusted code could install backdoors on persistent nodes, even reverse tunnels you couldn't block without some seriously restrictive and fragile egress filtering18:40
avassfungi: yeah, but even with that you could reach other static nodes in the network by rechecking over and over again18:41
fungiif you confined all the processes descended from the job to an ephemeral cgroup, and plugged a number of escape hatches to keep them from forking outside it, then you could probably forcibly terminate any processes left running18:42
fungiafter which it's a matter of making sure ssh keys and any other remote access normally initialized outside the cgroup got reset18:42
AJaegerfungi: question is as well: Do we make it easier with a role like https://review.opendev.org/679306 ?18:43
avassfungi, AJaeger: you would need to isolate each node and not allow any multi-node jobs18:44
*** iurygregory has quit IRC18:45
openstackgerritAlbin Vass proposed zuul/zuul-base-jobs master: Make sure authorized_keys is not altered during a job  https://review.opendev.org/72855118:46
avassAJaeger, fungi: wouldn't something like that ^ stop the user from leaving a public key installed on the node though?18:46
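The snapshot/restore idea behind that change would look something like the tasks below in a base job's pre-run and post-run playbooks; this is a sketch of the concept, not the actual 728551 implementation, and as fungi points out next it narrows the window without closing it:

    # pre-run: snapshot authorized_keys as provisioned for the job
    - name: Save authorized_keys before running untrusted job content
      copy:
        src: "{{ ansible_user_dir }}/.ssh/authorized_keys"
        dest: "{{ ansible_user_dir }}/.ssh/authorized_keys.zuul-orig"
        remote_src: true
        mode: preserve

    # post-run / cleanup: put the original back regardless of job result
    - name: Restore authorized_keys after the job
      copy:
        src: "{{ ansible_user_dir }}/.ssh/authorized_keys.zuul-orig"
        dest: "{{ ansible_user_dir }}/.ssh/authorized_keys"
        remote_src: true
        mode: preserve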
fungitoctou race: the user can ssh in while the job is running, after it has altered the authorized keys file and before zuul sets it back18:47
fungibut again, if they really want in, they could just set up some other backdoor not reliant on ssh18:48
fungithere are so many other (far worse) ways people could accomplish that, might as well not start down the road of making someone think they can secure against it like that18:49
avassI'm not trying to stop the user from sshing into the node they got during the job18:49
fungiright, but how do you make sure they're kicked off when the node gets reused for the next job?18:50
avassbut stop them from leaving an ssh-key so they can reach nodes they didn't request18:50
fungii guess you could reboot between jobs18:50
avassotherwise they could intercept a different job using secrets18:50
fungihow many quick "leave my backdoor" jobs would they need to trigger to be fairly certain they've got a persistent backdoor into every one of your persistent nodes?18:51
avasswell, one, they could request every node to install it :)18:53
fungiassuming you don't limit the nodeset size, yep18:53
avassyeah18:53
fungiback in the time before nodepool, when we used a lot of persistent nodes in what would become opendev, we accepted that was a risk and segregated jobs to different nodes (and different jenkins masters, because jenkins slaves got basically unrestricted shell access to masters in those days) based on whether they ran untrusted code, and didn't allow credentials to be used by jobs which ran on those unclean18:53
fungipersistent nodes18:53
avassthat's probably a solution18:54
fungiso basically if a job needed sensitive data, it was not allowed to run any arbitrary code, and it was isolated to special-use persistent nodes, often nodes which only ran that one job or a closely-related class of jobs18:55
avassfungi: still even if you limit the nodeset size, they could just increase the number of jobs. couldn't they?18:55
fungiyep, that's why i asked how many jobs they'd need to run. they could still probably do it with a single event triggered for a single change though18:55
avassah18:56
fungiwhere m copies of the job times n nodes in the nodeset = your pool size18:56
avassI'm glad we're moving towards cloud18:57
fungiof course this assumes you don't catch them at it, but still, it's a possible avenue18:57
*** asaleh_ has quit IRC18:59
fungiwe have what are effectively some persistent nodes in opendev today, but the only jobs they run are deployments triggered by changes merging in repositories where the reviewers with approval rights also have root access on the "nodes" (our production servers)19:00
avassyeah we do that too for some infrastructure nodes19:02
avassbut for some reason the team that is hosting our PyPi repos requires us to use credentials for read access to those repos, and I've been trying to figure out a way to allow the users to install packages during the jobs without revealing credentials.19:03
avassand it seems impossible :)19:03
fungidoes their authentication mechanism support ephemeral tokens?19:04
avassthey still need to be coupled to a user, so we would need one user per job and they still wouldn't be revoked automatically19:05
fungiyou could have the executor create and authorize a build-specific access credential and stick that in a variable or push it into a file on the node, then deauthorize it at the end of the build19:05
avassI'm thinking about setting up mirrors instead19:05
fungiand yeah, probably having some built-in expiration for those credentials would also be useful, in case the deauthorization failed to fire or encountered an error19:06
avassbut yes, that would have been a solution if we were able to do that, but adding a new user every time we add a new node doesn't seem like a good solution19:07
avasssince we also can't do that ourselves but have to submit a ticket to another team.... and so on :)19:08
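The flow fungi sketches, as rough Python; create_build_token and revoke_token are hypothetical stand-ins for whatever API the repository service actually exposes, and the expiry is the belt-and-braces he mentions in case revocation never fires:

    from datetime import timedelta

    def run_with_build_token(repo_api, build_id, run_job):
        """Mint a build-scoped credential, run the job, then revoke it.

        repo_api and its methods are hypothetical; run_job is whatever hands
        the token to the build (a job variable, a pip.conf on the node).
        """
        token = repo_api.create_build_token(
            scope='read:pypi-mirror',
            comment='zuul build %s' % build_id,
            expires_in=timedelta(hours=3),  # hard expiry in case revoke fails
        )
        try:
            run_job(token)
        finally:
            # deauthorize at the end of the build, success or failure
            repo_api.revoke_token(token)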
*** jamesmcarthur has quit IRC19:12
avassfungi: hmm, looks like we are supposed to be able to do that, I guess that team just doesn't know about it19:14
avassfungi: thanks for the tip!19:14
*** jamesmcarthur has joined #zuul19:21
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0  https://review.opendev.org/70267919:39
avassfungi: now I feel stupid for not checking that earlier19:39
*** noonedeadpunk has quit IRC19:42
*** noonedeadpunk has joined #zuul19:42
fungiavass: glad my inane ramblings are helpful to someone! ;)19:50
avass:)19:50
*** rlandy_ has joined #zuul20:02
*** rlandy has quit IRC20:05
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850320:08
*** rlandy_ is now known as rlandy20:08
avasszbr: re 702679, I posted some questions20:18
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add DaemonSet check for wait-for-pods role  https://review.opendev.org/72850320:20
zbravass: as I said, disabled only to allow the bumping, not because we do not want them.20:22
zbrotherwise the size of the change would be too big20:22
zbrahh, correct, i swapped two lines by mistake.20:23
avasszbr: didn't we have 206 before the bump though?20:23
*** harrymichal has quit IRC20:25
zbri double checking them, this review is very old, some may be leftovers20:25
*** tumble has joined #zuul20:26
zbravass: super, i can remove all, there are only ~20 fixes to do.20:29
avasszbr: I'm quitting for tonight, I'll check in tomorrow :)20:34
zbrouch... i see a regression, inline noqa no longer seems to be working.20:34
zbrsure, it's not urgent, i'd better do the same20:35
zbravass: narrowed it down to https://github.com/ansible/ansible-lint/issues/786 so not a big deal.20:41
*** iurygregory has joined #zuul20:42
*** rfolco|rover has quit IRC20:42
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Bump ansible-lint to 4.3.0  https://review.opendev.org/70267921:06
*** paladox has quit IRC21:28
*** paladox has joined #zuul21:30
*** rlandy is now known as rlandy|brb22:12
*** sanjayu_ has joined #zuul22:18
*** saneax has quit IRC22:20
*** jamesmcarthur_ has joined #zuul22:25
*** jamesmcarthur has quit IRC22:29
*** rlandy|brb is now known as rlandy22:32
*** rlandy has quit IRC22:52
*** ysandeep|away is now known as ysandeep23:04
openstackgerritMonty Taylor proposed zuul/zuul-website master: Switch website to Gatsby  https://review.opendev.org/71737123:08
openstackgerritMonty Taylor proposed zuul/zuul-website master: Add blog to website  https://review.opendev.org/72464823:09
*** ysandeep is now known as ysandeep|weekend23:09
openstackgerritMonty Taylor proposed zuul/zuul master: Update node to v14 and update to new jobs  https://review.opendev.org/72655323:10
*** sanjayu_ has quit IRC23:17
*** jamesmcarthur_ has quit IRC23:17
*** jamesmcarthur has joined #zuul23:21
*** jamesmcarthur has quit IRC23:26
*** jamesmcarthur has joined #zuul23:28
*** jamesmcarthur has quit IRC23:34
*** jamesmcarthur has joined #zuul23:37
*** tosky has quit IRC23:39
*** guillaumec has quit IRC23:41
*** jamesmcarthur has quit IRC23:42
*** jamesmcarthur has joined #zuul23:43
*** jamesmcarthur has quit IRC23:45
