Tuesday, 2020-03-10

*** jamesmcarthur has joined #zuul00:11
*** Goneri has quit IRC00:14
openstackgerritTristan Cacqueray proposed zuul/zuul master: Implement zookeeper-auth  https://review.opendev.org/61915600:15
*** jamesmcarthur has quit IRC00:19
*** jamesmcarthur has joined #zuul00:21
*** jamesmcarthur has quit IRC00:26
*** y2kenny has joined #zuul00:34
y2kennyI have been trying the tutorial and have been adapting/modifying it to my existing environment.  When I try the testjob step, I run into a RETRY_LIMIT error (according to the dashboard), with a duration of 6 seconds00:37
fungizuul sees failures in "pre" phase playbooks and also problems like inability to connect to defined job nodes as a reason to automatically retry a build00:39
fungiby default, three retries in a row for such conditions bails out with a RETRY_LIMIT build result00:40
y2kennyI see.  I was looking at the log.  I see scheduler logging "Execute job testjob ... on nodes... for change ... with dependent changes... "00:40
fungiyou might check the executor log for evidence of some systemic connectivity problem or early pre phase error00:41
y2kennyok.  Thanks for the tips.  I see executor logging Started SSH agent, added key, beginning job test job, updating repo, skipping updating local repo, checking out ... branch...00:43
y2kennymaybe I messed with the keys too much.00:44
*** jamesmcarthur has joined #zuul00:44
y2kennyI will try running the plain unmodified tutorial and compare.00:45
fungiy2kenny: are you archiving logs anywhere? you could check the build results and see if ansible logged any errors with a pre playbook00:56
fungithe build logs i mean, not the service logs00:57
fungieven a retry-limit result should usually log the build console for the final (third by default) attempt00:59
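The retry behaviour fungi describes can be sketched roughly as follows. This is a minimal illustration, not Zuul's actual implementation: the function name `run_with_retries`, the `RETRY` outcome label, and the `run_attempt` callback are all hypothetical. Only the `RETRY_LIMIT` result and the default of three attempts come from the discussion above.

```python
from typing import Callable

# Outcome labels for this sketch. Only RETRY_LIMIT is taken from the chat;
# the others are hypothetical stand-ins.
SUCCESS = "SUCCESS"
FAILURE = "FAILURE"
RETRY = "RETRY"  # e.g. a "pre" playbook failed or the node was unreachable
RETRY_LIMIT = "RETRY_LIMIT"


def run_with_retries(run_attempt: Callable[[int], str], max_attempts: int = 3) -> str:
    """Run a build until it yields a final result or uses up its attempts.

    run_attempt receives the 1-based attempt number and returns an outcome.
    """
    for attempt in range(1, max_attempts + 1):
        outcome = run_attempt(attempt)
        if outcome != RETRY:
            # A real result (pass or fail) ends the loop immediately.
            return outcome
    # Every attempt hit a retryable condition, so give up.
    return RETRY_LIMIT
```

A build whose node is never reachable asks for a retry on every attempt and so ends as `RETRY_LIMIT` after the third one, which would match a very short duration on the dashboard.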
*** jamesmcarthur has quit IRC01:08
y2kennyoddly enough the build log is not getting posted even though I added the post-logs.yaml.  I might have messed around with the config a bit too much (I wanted to connect zuul to my production gerrit instead of the sample gerrit instance.)  A lot of things are working (like the noop check and gate, recheck from comment, etc.)  Auto submit also works.01:11
*** jamesmcarthur has joined #zuul01:12
openstackgerritTristan Cacqueray proposed zuul/zuul master: Implement zookeeper-auth  https://review.opendev.org/61915601:18
fungianother option might be to try to view the console log stream for a build if you can catch it while it's still running01:20
*** jamesmcarthur has quit IRC01:24
*** jamesmcarthur has joined #zuul01:24
*** rlandy|bbl is now known as rlandy01:53
*** jamesmcarthur has quit IRC02:06
*** jamesmcarthur has joined #zuul02:07
*** swest has quit IRC02:16
*** swest has joined #zuul02:30
*** bhavikdbavishi has joined #zuul03:18
*** jamesmcarthur has quit IRC03:21
*** rlandy has quit IRC03:36
*** bhavikdbavishi has quit IRC04:05
*** ianychoi has quit IRC04:39
*** ianychoi has joined #zuul04:40
*** evrardjp has quit IRC05:35
*** evrardjp has joined #zuul05:35
y2kennyUm... I was able to get the console log but for some reason I got an "ANSIBLE PARSE ERROR"06:16
y2kennyalthough, later on it said "ubuntu-bionic | UNREACHABLE!"06:16
y2kennyin between there are a few skipping plugin ara_read and  ara_record06:17
*** AJaeger has joined #zuul07:03
*** y2kenny has quit IRC07:05
*** dpawlik has joined #zuul07:21
*** AJaeger has quit IRC07:49
*** AJaeger has joined #zuul08:00
*** tosky has joined #zuul08:18
*** avass has quit IRC08:20
*** avass has joined #zuul08:20
*** jcapitao has joined #zuul08:22
*** Defolos has joined #zuul08:25
*** decimuscorvinus has quit IRC08:45
*** decimuscorvinus has joined #zuul08:45
*** jpena|off is now known as jpena08:50
*** bhavikdbavishi has joined #zuul08:52
*** Defolos has quit IRC09:12
*** Defolos has joined #zuul09:37
openstackgerritTobias Henkel proposed zuul/zuul master: Evaluate CODEOWNERS settings during canMerge check  https://review.opendev.org/64455709:41
*** bhavikdbavishi has quit IRC09:42
openstackgerritTobias Henkel proposed zuul/zuul master: Add optional support for circular dependencies  https://review.opendev.org/68535409:45
openstackgerritTobias Henkel proposed zuul/zuul master: Optionally allow zoned executors to process unzoned jobs  https://review.opendev.org/67384009:49
openstackgerritTobias Henkel proposed zuul/zuul master: Use implied branch matcher for implied branches  https://review.opendev.org/64027210:01
*** sshnaidm|afk is now known as sshnaidm10:01
*** jcapitao has quit IRC10:13
*** rishabhhpe has joined #zuul10:14
rishabhhpeHi All , is there any way in which i can restrict zuul to execute only one job at a single time ?10:15
*** jcapitao has joined #zuul10:15
openstackgerritBenjamin Schanzel proposed zuul/nodepool master: Kubernetes/OpenShift Provider: Don't Require Bash in Container Images  https://review.opendev.org/71203410:17
*** mhu has joined #zuul10:26
AJaegerrishabhhpe: you can create a semaphore in your job config and use it everywhere, the semaphore can limit number of jobs.10:26
AJaegerrishabhhpe:check the Zuul manual at zuul-ci.org for semaphore. But I wonder why you want this. What is the reason you ask this? What are you trying to achieve?10:27
rishabhhpeAjaeger: just for information i am asking .. not trying anything with this .10:27
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: DNM: test triggers  https://review.opendev.org/71203710:33
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: DNM: test triggers  https://review.opendev.org/71203710:33
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: DNM: test triggers  https://review.opendev.org/71203710:42
*** bschanzel has joined #zuul10:43
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: DNM: test triggers  https://review.opendev.org/71203710:43
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Improve ensure-tox role  https://review.opendev.org/70864210:51
openstackgerritMerged zuul/zuul-jobs master: Tests bindep role on all-platforms  https://review.opendev.org/70870411:06
*** jcapitao is now known as jcapitao_lunch11:11
*** ianychoi has quit IRC11:16
*** ianychoi has joined #zuul11:16
*** jamesmcarthur has joined #zuul11:19
*** jamesmcarthur has quit IRC11:24
openstackgerritTobias Henkel proposed zuul/zuul master: Optionally allow zoned executors to process unzoned jobs  https://review.opendev.org/67384011:30
openstackgerritTobias Henkel proposed zuul/zuul master: Add spec for enhanced regional executor distribution  https://review.opendev.org/66341311:34
openstackgerritTobias Henkel proposed zuul/zuul master: Move fingergw config to fingergw  https://review.opendev.org/66494911:36
openstackgerritTobias Henkel proposed zuul/zuul master: WIP: Route streams to different zones via finger gateway  https://review.opendev.org/66496511:36
openstackgerritTobias Henkel proposed zuul/zuul master: Support ssl encrypted fingergw  https://review.opendev.org/66495011:36
*** bolg has quit IRC11:37
openstackgerritTobias Henkel proposed zuul/zuul master: WIP: Route streams to different zones via finger gateway  https://review.opendev.org/66496511:47
openstackgerritTobias Henkel proposed zuul/zuul master: Support ssl encrypted fingergw  https://review.opendev.org/66495011:47
*** Goneri has joined #zuul11:49
*** bhavikdbavishi has joined #zuul12:01
*** Goneri has quit IRC12:12
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: bindep: Add missing virtualenv and fixed repo install  https://review.opendev.org/69363712:17
*** jamesmcarthur has joined #zuul12:21
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: WIP: Make ensure-tox pass cross-platform  https://review.opendev.org/70743912:24
*** jpena is now known as jpena|lunch12:25
openstackgerritTristan Cacqueray proposed zuul/zuul master: Implement zookeeper-auth  https://review.opendev.org/61915612:25
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: ensure-tox: use failed_when  https://review.opendev.org/71206212:28
tristanCcorvus: Shrews: ^ is now failing on test_zookeeper_disconnect and test_node_request_disconnect , but it passes test_zookeeper_disconnect2 . would you understand why? (in PS20 the three test use the same connection restart process)12:28
*** jamesmcarthur has quit IRC12:36
*** Goneri has joined #zuul12:43
*** ianychoi has quit IRC12:49
*** ianychoi has joined #zuul12:50
*** jcapitao_lunch is now known as jcapitao12:57
*** jpena|lunch is now known as jpena13:04
*** harrymichal_ has joined #zuul13:07
*** zxiiro has joined #zuul13:19
*** bhavikdbavishi has quit IRC13:22
*** harrymichal_ has quit IRC14:04
*** ianychoi has quit IRC14:10
*** ianychoi has joined #zuul14:11
*** harrymichal has joined #zuul14:11
ShrewstristanC: hrm, not sure. If I had to guess, might have something to do with the failing tests doing self.zk.client.stop()/self.zk.client.start() while the disconnect2 test calls self.zk.connect() which re-establishes the auth stuff14:17
ShrewsI didn't actually look at PS20 to see if there were changes around that, though14:19
ShrewsI looked at the latest14:19
tristanCShrews: in PS20 i used another restart process (stop() then connect()), and that made the test failure localized to `test_zookeeper_disconnect` and `test_node_request_disconnect`14:21
tristanCShrews: in PS21 i revert back to (stop() start()), but that is causing unrelated test failure14:22
ShrewstristanC: give me a bit to deal with some payroll headaches then I'll look a little closer at those patch sets14:25
tristanCShrews: also, using (stop() then connect()) somehow fixed failure in `test_zookeeper_disconnect2`14:25
tristanCShrews: thanks!14:26
*** jamesmcarthur has joined #zuul14:39
bschanzelHi tristanC, I've got a change proposal for syncing git repos to k8s pod nodes (related to https://review.opendev.org/#/c/535557/ and https://review.opendev.org/#/c/570667). See https://review.opendev.org/#/c/711920. I'd be glad about your opinion there.14:45
*** jamesmcarthur has quit IRC14:46
bschanzelAlso, currently the k8s and OpenShift drivers require bash on the build nodes/pods. Therefore proposed https://review.opendev.org/#/c/712034. What do you think?14:46
*** bschanzel has quit IRC14:49
*** bschanzel has joined #zuul14:51
*** jamesmcarthur has joined #zuul14:54
*** jamesmcarthur has quit IRC14:56
*** jamesmcarthur has joined #zuul14:56
tristanCbschanzel: LGTM, thanks!15:01
bschanzeltristanC cool, thanks for your review!15:03
*** mattw4 has joined #zuul15:14
*** rishabhhpe has quit IRC15:16
*** bschanzel has quit IRC15:31
corvusi'm back and slowly catching up; i'll look at the zk auth changes shortly15:32
*** harrymichal has quit IRC15:47
*** erbarr has joined #zuul15:52
*** mhu has quit IRC15:58
*** mhu has joined #zuul15:58
ShrewstristanC: did you try adding a client.add_auth() call in between the client.stop()/client.start() calls in the test?16:01
ShrewsI'm trying to find where that's done in KazooClient but haven't found it yet  :/16:06
ShrewsWow, I totally do not get how the kazoo auth system does the auth registration.16:11
corvusShrews: it may not send add_auth16:11
*** ianychoi has quit IRC16:11
corvusShrews: when i used zk-shell, i was unable to use add_auth; i'm starting to suspect that add_auth only works for the old digest auth, and sasl is different and only happens on connect16:12
*** ianychoi has joined #zuul16:12
Shrewshrmmm16:12
clarkbsasl should happen on connection creation16:12
clarkb(as a general rule of thumb for sasl)16:12
corvusShrews: there's some stuff in kazoo/protocol/connection.py16:13
Shrewsah, indeed. my grepping fails me16:14
corvusShrews: an initial skim of that file makes it look to me like stop()/start() should do everything needed to reconnect with sasl (but i'm not positive -- i have only skimmed it)16:16
Shrewsmaybe we need a call to close()  in between the stop/start16:20
*** harrymichal has joined #zuul16:21
Shrewstotal guess on that though16:21
corvusyeah, that's what tristanC did in the other test16:23
corvusShrews, tristanC: oh, i think i may see it -- the kazoo client connection object has a .sasl_cli attribute, and that has a .completed attribute -- it may be that it isn't cleared by .stop() so it thinks it has already completed the sasl request on the second connection16:25
*** jcapitao has quit IRC16:25
tristanCcorvus: that sounds like a good explanation of the failure16:27
*** harrymichal has quit IRC16:27
*** harrymichal has joined #zuul16:28
corvustristanC, Shrews: yeah, if i insert "self.zk.client._connection.sasl_cli = None" between stop() and start() in test_zookeeper_disconnect it works16:28
corvusi'm not sure that's what we should do in the test, but i think it helps narrow down the problem :)16:28
Shrewscorvus: that raises the question of what happens in the real world using sasl when we lose our zk connection....16:32
*** harrymichal has quit IRC16:32
*** harrymichal_ has joined #zuul16:33
corvusShrews: indeed...16:33
corvusShrews, tristanC: this is probably worth a test script that authenticates and performs a 'get' every few seconds.  run that and then restart zk.16:34
tristanCShrews: could we force a full reconnection in the connection listener?16:34
*** jcapitao has joined #zuul16:35
ShrewsConnectionHandler only sets sasl_cli to None in __init__, and KazooClient only gets a connection handler once in its __init__16:35
corvuswell, kazoo is supposed to do that for us16:35
ShrewsI'm guessing this isn't a well tested path for kazoo16:35
corvusShrews: based on that, i suspect that this is a bug in kazoo and it will fail in real life, but i think we'll need to do something like the test i proposed to verify it (i don't think the zuul unit test is an adequate simulation)16:36
corvus(well, i mean, the unit test in zuul might be a *great* simulation, but we need to confirm reality matches :)16:37
Shrewsagreed16:37
*** harrymichal_ has quit IRC16:38
corvusShrews, tristanC: i think i know the answer to the other half of tristanC's question from earlier: the stop(), close(), connect() sequence that he used in test_zookeeper_disconnect2 appears to work, so why did test_zookeeper_disconnect fail when it was used there?16:39
corvusShrews, tristanC: i think the answer there is that test_zookeeper_disconnect relies on watches which are only called after the client reconnects.  the stop/close/connect sequence completely kills the kazoo client, so all the watches disappear.16:40
*** harrymichal has joined #zuul16:40
corvusShrews, tristanC: so i think that even though that sequence appears to correctly reconnect, we should not use it because we rely on maintaining state information like watches in the kazoo client16:41
corvusShrews, tristanC: i just confirmed our suspicion with a test16:44
corvusShrews, tristanC: (i just modified a unit test to do the get in an infinite loop, then restarted zookeeper from under it -- easier than writing a dedicated script)16:44
Shrewssasl failed i guess??16:45
corvusyeah, got the noautherror16:45
corvusShrews: what's the diff between lost and suspended?16:45
Shrewscorvus: i think suspended was the connection to zk was lost but the session might still be valid???16:46
corvusah16:46
Shrewslemme find the doc16:46
Shrewscorvus: https://kazoo.readthedocs.io/en/latest/api/protocol/states.html#kazoo.protocol.states.KazooState16:47
*** ianychoi has quit IRC16:47
*** ianychoi has joined #zuul16:48
corvusShrews, tristanC: proposal: we set "self.client._connection.sasl_cli = None" in our connection listener for when it's suspended.  and we open an issue on kazoo.  then we remove the workaround when it's fixed.16:48
Shrewsin suspended or lost? or both?16:49
corvusShrews: my reading is that suspended happens before lost, but maybe both if i'm wrong?16:49
corvusShrews: oh, the 'valid state transitions' in that doc covers it16:49
corvusthere is a possible connected-> lost, but only if the creds don't work...16:50
corvusand suspended -> lost can happen if the connection resumes....16:50
corvusShrews: so maybe both.  :)16:50
Shrewsyeah, i think both is safest16:50
corvusi just tried a test with both, and it seems to work16:51
corvusso both at least seems good for the "zk server restarts" case16:51
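The workaround corvus proposes could look something like the sketch below. It is an illustration under assumptions, not the code from the actual patch: `make_sasl_reset_listener` is a hypothetical helper name, and the state names are written as plain strings on the assumption that kazoo's `KazooState` constants carry these values. The private `_connection.sasl_cli` attribute is the one named in the discussion and may change between kazoo releases.

```python
# State names as plain strings. Assumption: kazoo's KazooState.SUSPENDED and
# KazooState.LOST compare equal to these values.
SUSPENDED = "SUSPENDED"
LOST = "LOST"
CONNECTED = "CONNECTED"


def make_sasl_reset_listener(client):
    """Return a connection listener that clears the cached SASL client.

    `client` is expected to look like a KazooClient, i.e. to expose the
    private `_connection.sasl_cli` attribute mentioned in the discussion.
    """

    def listener(state):
        if state in (SUSPENDED, LOST):
            # Forget the completed SASL exchange so the next connection
            # authenticates again instead of assuming it already has.
            client._connection.sasl_cli = None

    return listener
```

With a real client this would presumably be registered with `client.add_listener(make_sasl_reset_listener(client))`. Resetting on both states follows the "both is safest" conclusion above, and the listener only touches the cached SASL state, so watches held by the client are left alone.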
corvusShrews, tristanC: i've got most of this typed up, let me just amend tristanC's patch with it16:52
openstackgerritJames E. Blair proposed zuul/zuul master: Implement zookeeper-auth  https://review.opendev.org/61915616:54
corvusShrews, tristanC: ^16:54
corvusShrews, tristanC: note also i left a -1 comment on the zk auth fixup script16:54
Shrewscan haz same change in nodepool?16:54
corvusShrews, tristanC: i'll go open an issue on kazoo now16:54
Shrewscorvus: i just searched open PRs and didn't see any similar16:55
corvuscool16:55
corvusShrews: you want to copy that over to the nodepool change while i open the pr?16:55
Shrewssure16:55
openstackgerritDavid Shrewsbury proposed zuul/nodepool master: Implement zookeeper-auth  https://review.opendev.org/61915516:57
Shrewsoh, they added a sasl test suite last month according to commit log17:02
corvusShrews, tristanC: https://github.com/python-zk/kazoo/issues/59417:03
Shrewscorvus: what kazoo version did your test use?17:06
corvus2.6.117:06
corvuslooks like that's the latest, yeah?17:06
Shrewsyeah17:07
Shrewshttps://github.com/python-zk/kazoo/blob/master/CHANGES.md17:07
* Shrews lunches17:08
*** Defolos has quit IRC17:09
*** jamesmcarthur has quit IRC17:20
*** ianychoi has quit IRC17:26
*** mattw4 has quit IRC17:26
*** mattw4 has joined #zuul17:27
*** jamesmcarthur has joined #zuul17:27
*** ianychoi has joined #zuul17:27
*** sshnaidm is now known as sshnaidm|afk17:31
*** evrardjp has quit IRC17:35
*** evrardjp has joined #zuul17:35
*** harrymichal has quit IRC17:40
*** harrymichal has joined #zuul17:41
*** jpena is now known as jpena|off17:43
*** jamesmcarthur has quit IRC17:43
*** jcapitao is now known as jcapitao_off17:44
*** y2kenny has joined #zuul17:48
*** harrymichal_ has joined #zuul17:51
*** harrymichal has quit IRC17:55
*** harrymichal_ is now known as harrymichal17:55
*** jamesmcarthur has joined #zuul17:56
*** ianychoi has quit IRC18:10
*** ianychoi has joined #zuul18:17
*** harrymichal has quit IRC18:28
*** harrymichal has joined #zuul18:33
*** harrymichal has quit IRC18:41
*** harrymichal has joined #zuul18:42
*** Defolos has joined #zuul18:43
*** harrymichal has quit IRC18:52
y2kennyHi, is it possible to have a job that neither succeeds nor fails?  (for example, a job that stops the pipeline from proceeding.)  I see that there is a no_jobs reporter but I am not sure if that's relevant or how a job can trigger that condition.18:56
fungiy2kenny: not sure what you mean by "stop the pipeline from proceeding" but not adding any jobs for a particular project-pipeline will cause no jobs to be run for the project in the corresponding pipeline and so no results to be reported for it18:58
y2kennyfungi: I am thinking something like a conditional noop18:59
y2kennylike, a job will run but it may proceed or not depending on some condition18:59
corvusy2kenny: fbo was just asking about something similar on the mailing list18:59
y2kennyOh18:59
y2kennyI will go check18:59
corvusy2kenny: see this message: http://lists.zuul-ci.org/pipermail/zuul-discuss/2020-March/001171.html18:59
y2kennythanks18:59
*** igordc has joined #zuul19:09
*** armstrongs has joined #zuul19:11
openstackgerritTristan Cacqueray proposed zuul/zuul master: spec: add a zuul-runner cli  https://review.opendev.org/68127719:22
*** mugsie has quit IRC19:23
tristanCcorvus: Shrews: thank you for taking care of zk-auth, the changes lgtm19:23
*** armstrongs has quit IRC19:24
corvustristanC: great -- i left a -1 comment for you on the zuul change19:25
*** mugsie has joined #zuul19:26
tristanCcorvus: alright, i'll have a look tomorrow19:26
*** jamesmcarthur has quit IRC19:30
y2kennyI am still a bit stumped on the connection between executor, nodepool and my node.  I am modifying the tutorial.  The only things I have changed are: connecting to my own production gerrit and mounting in pregenerated ssh keys.  When I try doing the testjob, I am getting node unreachable Permission denied:19:31
y2kenny ubuntu-bionic | UNREACHABLE! => {     "changed": false,     "msg": "Data could not be sent to remote host \"node\". Make sure this host can be reached over ssh: root@node: Permission denied (publickey,password).\r\n",     "unreachable": true }19:32
clarkby2kenny: the tutorial may use the same key for gerrit as it does for connecting to the node (so if you've changed them may need to modify the test node)19:32
*** openstackgerrit has quit IRC19:32
y2kennyI see the nodepool and zuul keypair19:33
y2kennyand I mount the keys into the various  container similar to the tutorial19:33
y2kennysame location, etc.  I have added the zuul.pub to my gerrit account as well.19:34
y2kennythe connection to gerrit is fine.  I am able to pull in the config and so on.19:34
y2kennyand in the service log, I see a bunch of event that make sense19:34
y2kennyfor example, I get the scheduler "adding change to queue in pipeline check (I added the testjob to check)19:35
clarkbya that shows that zuul is able to connect to gerrit correctly so that half of the change is working. But zuul also needs to be able to ssh into the test node and I'm not sure if that is the same key or a different key (in a meeting now but can help look more closely later)19:36
y2kennyscheduler also report "Submitted node request"19:36
funginode requests are coordinated between the zuul scheduler and nodepool launchers through zookeeper19:37
y2kennyand launcher reports assigning node request and then scheduler accepting and completed node request.19:37
y2kennythen scheduler report nodepool setting node set in use, executor started ssh agent, added ssh key /var/ssh/nodepool19:38
y2kennythen executor report Beginning job testjob for ref refs/changes/53/329253/119:39
y2kenny updating repo (I assume this is happening on the executor and not the node?)19:39
y2kennyexecutor also does a few checkout19:41
y2kennybut then scheduler report back Build complete, result None, no warnings and returning nodeset19:42
fungiyes, there are initial checkouts in the workspace on the executor19:42
fungiand then there's an optional role to rsync those to the nodes19:42
y2kennydoes executor talk to the nodes directly or via nodepool?19:43
y2kennyor does scheduler launch a node via nodepool and executor talk to node once the node come up?19:43
y2kennyand what does "launching a node" mean in the context of the tutorial? My understanding is that the tutorial starts a single node container.  But the node id from the web ui seems to increase with each retry19:45
clarkby2kenny: "scheduler launch a node via nodepool and executor talk to node once the node come up" that is what happens19:45
y2kennyOk.19:45
clarkblaunching a node means that nodepool has managed to procure the resources from $somewhere19:45
clarkbin the quick start case it's a static node so it doesn't actually do much19:45
clarkbbut it could mean creating a VM in a cloud or a pod in k8s19:46
y2kennyok.19:46
y2kennyso the problem I am having seems to be communication between the executor and node container19:46
y2kennyI have exec into the executor container and tried to ssh node and it's reachable from the networking perspective19:47
fungimake sure the ssh keys available in ~zuul/.ssh/ are usable to authenticate to the zuul account on the node19:48
y2kennyls19:49
y2kennyoops... sorry... wrong window19:50
Shrewsy2kenny: what nodepool driver are you using for your nodes?19:50
y2kennyjust static right now19:50
y2kennythe node root/.ssh/ does have the authorized_keys19:51
y2kennywhich is the nodepool.pub19:51
Shrewsis the username set correctly? https://zuul-ci.org/docs/nodepool/configuration.html#attr-providers.[static].pools.nodes.username19:53
Shrewslooks like you're assuming 'root' but i'm certain that our tutorials use root19:54
Shrewsi'm NOT certain19:54
y2kennythe username is root19:54
y2kennyit's set in nodepool.yaml19:54
y2kennyand default_username for the executor19:54
y2kennyin zuul.conf19:55
y2kennythe tutorial works fully19:55
y2kennyI am just not sure what I changed that made things stop working.19:55
y2kennyI plan to connect zuul to a k8s cluster for production anyway but I am not confident to move forward to that if I can't debug a static node19:56
y2kennyalthough the container networking may have complicated things?  I am not sure19:57
y2kennyseems like the issue is between the executor and the node but I am not sure why19:57
y2kennyIs there a way for me to manually/interactively pretend to be an executor?19:57
y2kennyI am guessing the executor tries to connect via some python ssh session instead of doing so over the shell?19:58
Shrewsy2kenny: the executor just runs Ansible which in turn uses ssh to contact the node19:59
y2kennyum... so I should be able to exec into the executor container and try to run ansible?20:00
fungialso it's running ansible inside a lightweight bubblewrap container, so the ssh keys have to be inside the filesystem (i think usually via a bindmount?)20:00
*** y2kenny55 has joined #zuul20:02
*** y2kenny55 has left #zuul20:03
*** y2kenny79 has joined #zuul20:03
y2kenny79um...20:03
y2kenny79is this working?20:04
y2kenny79I am not sure what happened with my connection20:04
clarkby2kenny79: you seem to be here again :)20:04
*** y2kenny has quit IRC20:04
y2kenny79ok... looks like my original name has timeout20:05
y2kenny79let me get back to that one20:05
*** y2kenny79 has left #zuul20:05
*** y2kenny has joined #zuul20:06
fungiyou can just `/nick y2kenny` (if your nick is registered you can even ask nickserv to kick your ghosted connection)20:06
y2kennyok.  (I am new to irc also... :))20:07
clarkbfungi: note this channel requires registration20:07
y2kennyfungi: so when you said lightweight bubblewrap container, is that a regular docker container or other kind of container?20:08
corvusclarkb: ftr, the quickstart uses separate keys for gerrit and the worker node (to avoid exactly the kind of confusion you were worried about)20:09
*** jcapitao_off has quit IRC20:09
corvusbefore we go too far down the rabbit hole, let's see if we can come up with a simple command to test20:09
corvusy2kenny: what happens if you run "ssh -i /var/node/id_rsa -l root <ip address of test node>" on the executor?20:11
corvuscorrection:20:11
y2kennyvar/ssh/nodepool ?20:12
corvusy2kenny: yep :)20:13
corvus"ssh -i /var/ssh/nodepool/id_rsa -l root <ip address of test node>"20:13
y2kennyok that is the key difference between the tutorial and my setup20:14
y2kennyI have a parallel setup with just plain tutorial and the command works20:14
y2kennybut in my messed-up setup a password is prompted20:14
y2kennythat tells me something is wrong with the node's authorized_keys20:14
y2kennywhich is unexpected20:15
corvusy2kenny: yeah, sounds like it20:15
y2kennylet me double check... probably I have a typo some where20:15
fungiy2kenny: and no, it's not really a docker container, more like a chroot with cgroups/process isolation20:15
corvusy2kenny: now you've got a command that should be roughly equivalent to what zuul (via ansible and bubblewrap) will do to help test20:15
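The manual check corvus suggests can be wrapped in a small script so it fails cleanly instead of prompting for a password. This is a sketch under assumptions: `build_ssh_check` and `key_login_works` are hypothetical helper names, and the `BatchMode` and `ConnectTimeout` options are additions that are not part of the command given in the chat. The key path and `root` user are the ones mentioned above.

```python
import subprocess
from typing import List


def build_ssh_check(host: str,
                    key_path: str = "/var/ssh/nodepool/id_rsa",
                    user: str = "root") -> List[str]:
    """Build an ssh command that tests key-based login without prompting."""
    return [
        "ssh",
        "-i", key_path,
        "-l", user,
        # Added for this sketch: fail instead of asking for a password.
        "-o", "BatchMode=yes",
        # Added for this sketch: do not hang on an unreachable host.
        "-o", "ConnectTimeout=10",
        host,
        "true",  # trivial remote command; only the exit status matters
    ]


def key_login_works(host: str) -> bool:
    """Return True if the key is accepted and the remote command runs."""
    result = subprocess.run(build_ssh_check(host), capture_output=True, text=True)
    return result.returncode == 0
```

Run from inside the executor container, `key_login_works("node")` returning `False` would point at the key or the node's `authorized_keys`. In batch mode an unknown host key also makes ssh fail, so a `False` result is a hint rather than proof.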
y2kennyyes.  Thanks for the tips.20:16
y2kennythis give me something to poke around20:16
*** jamesmcarthur has joined #zuul20:17
y2kennyok... I think this may be the .ssh directory's permission20:17
*** ianychoi has quit IRC20:17
*** ianychoi has joined #zuul20:18
y2kennyum... I spoke too soon... there's a bunch of other differences... I will be back20:19
*** openstackstatus has joined #zuul20:20
*** ChanServ sets mode: +v openstackstatus20:20
y2kennyOOOOOOOOOOOk.... I think this may have fixed it.  The ssh authorized keys is mounted in with wrong owner and permission (1000 instead of root) on my setup20:28
y2kennyyup... now I got further.  Thanks folks, you guys are awesome!20:30
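The ownership problem y2kenny found can be checked for directly. The sketch below is a hedged example with a hypothetical helper name, `authorized_keys_problems`. It assumes a POSIX system and that the SSH server refuses key files owned by the wrong user or writable by group or others, which is typical OpenSSH behaviour with strict mode checking enabled.

```python
import os
import stat
from typing import List


def authorized_keys_problems(path: str, expected_uid: int = 0) -> List[str]:
    """List ownership and permission problems for an authorized_keys file.

    expected_uid defaults to 0 (root), the login user in the chat above.
    """
    st = os.stat(path)
    problems = []
    if st.st_uid != expected_uid:
        problems.append(f"owner uid is {st.st_uid}, expected {expected_uid}")
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        mode = stat.S_IMODE(st.st_mode)
        problems.append(f"mode {mode:o} is writable by group or others")
    return problems
```

Running `authorized_keys_problems("/root/.ssh/authorized_keys")` inside the node container would have reported an owner uid of 1000 in the broken setup. The same check applies to the `.ssh` directory itself.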
corvusy2kenny: yay!  good luck, let us know how it goes :)20:33
*** mattw4 has quit IRC20:40
*** openstackgerrit has joined #zuul20:43
openstackgerritClark Boylan proposed zuul/nodepool master: Install zypper on the nodepool-builder image  https://review.opendev.org/71217720:43
clarkbianw: mordred ^ fyi I think that will get us zypper in the nodepool-builder image20:43
*** plaurin has joined #zuul21:15
plaurinhello irc people!!21:15
clarkbhello21:16
plaurinQuite late in the day for me, however I wanted to talk about the kubernetes log streaming. I installed zuul 3.18.0 and nodepool 3.12.0, but still no luck. HOWEVER I now see a bunch of "Zuul log stream did not terminate" here and there21:16
clarkbplaurin: that is actually caused by it not starting properly. corvus added a release note update to cover that case (which may fix your problem) let me find a link21:17
clarkbplaurin: https://zuul-ci.org/docs/zuul/reference/releasenotes.html#upgrade-notes are socat and kubectl installed on the executor?21:18
*** armstrongs has joined #zuul21:19
plaurinthx, yeah I installed socat indeed before updating21:22
plaurinchecking the upgrade note21:22
plaurinha I might be missing the start-zuul-console for some reason checking that21:23
clarkboh that may actually be what was added later21:27
corvusplaurin: yeah, we realized rather late that start-zuul-console was required21:27
corvusi think we realized it after the change merged, but right before we actually cut the release21:27
corvusplaurin: so if my memory serves, that would be right after you went off to start testing it, sorry21:27
*** jamesmcarthur has quit IRC21:28
plaurinno problem, I'm excited to see this log streaming working, some people are going to be quite happy21:28
plaurinI'm really grateful21:28
*** jamesmcarthur has joined #zuul21:28
corvusplaurin: no problem, thanks for testing!21:28
plaurintesting in prod, like everyone should lol21:29
*** armstrongs has quit IRC21:29
fungias long as you're also *testing* prod, that sounds ideal!21:31
plaurinyep, .. outside of work hours at least21:33
*** avass has quit IRC21:43
*** jamesmcarthur has quit IRC21:44
*** jamesmcarthur has joined #zuul21:45
*** jamesmcarthur has quit IRC21:53
plaurinYES21:58
plaurinThank you, sreaming is working now21:58
plaurinI guess this can be resolved or closed https://storyboard.openstack.org/#!/story/200732121:58
*** sreejithp has joined #zuul21:59
fungiyay for working screaming! ;)22:00
*** jcapitao_off has joined #zuul22:01
*** sreejithp has quit IRC22:02
*** jamesmcarthur has joined #zuul22:05
*** marvs has quit IRC22:05
*** jamesmcarthur has quit IRC22:10
plaurinI am streaming of joy22:15
plaurinsed -i 's/screaming/streaming/g'22:16
*** evrardjp has quit IRC22:17
*** evrardjp has joined #zuul22:18
*** ianychoi has quit IRC22:19
*** ianychoi has joined #zuul22:20
*** plaurin has quit IRC22:27
*** evrardjp has quit IRC22:31
*** evrardjp has joined #zuul22:33
*** dpawlik has quit IRC22:40
*** ianychoi has quit IRC22:41
y2kennyok, I am back... now that the ssh issue is fixed, I am running into a problem of roles not found.  These are zuul/zuul-jobs roles (such as add-build-sshkey or upload-logs.)  From the conversation earlier, these zuul roles are supposed to be rsynced to the node by the executor?22:41
*** ianychoi has joined #zuul22:42
fungido you have a connection for opendev.org configured and are you including the zuul/zuul-jobs repository in your tenant configuration?22:47
y2kennyOooo... ok... I think I assumed a bit too much magic :)22:48
fungipretty sure add-build-sshkey and upload-logs run in the workspace on the executor anyway22:48
y2kennyI removed the opendev connection22:48
fungiyou can remove the opendev connection if you want to carry a local fork of the zuul-jobs repo22:49
y2kennyright.22:49
fungiand i think a number of sites do that22:49
*** jcapitao_off has quit IRC22:49
fungibut we designed it so you can reuse the public copy as a standard library22:49
fungiand we treat everything in it as an api contract, with deprecation announcements for behavior changes and the like22:50
y2kennyin my mind I thought it works like dockerhub or something where docker (in this case zuul) would fetch the role.22:50
y2kennyindependent of the opendev connection22:51
y2kennyI treated that connection as part of the tutorial/example22:51
fungiit does fetch the role, but that connection is how it knows where you want to fetch it from, and it's extensible so you can treat any repository anywhere reachable as a source of job configuration22:51
y2kennygot it.  Thanks!22:51
fungiyeah, the opendev.org connection and zuul/zuul-jobs repository are not in any way "special", they're just another public source of job configuration22:52
fungiwhich you're free to add/remove/use/ignore/fork/reimplement/whatever suits your needs22:53
y2kennyso the bit that tells zuul to fetch the role is:22:53
y2kennyroles:      - zuul: zuul/zuul-jobs22:53
y2kennyunder jobs.yaml?22:54
fungiyep22:54
fungibut that doesn't tell it where to find the roles22:55
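For reference, the roles line quoted above would typically sit inside a job definition, roughly as sketched here. This is only an illustration: the job name and playbook paths are assumptions loosely based on the tutorial layout, not taken from y2kenny's actual config. As fungi notes, this entry only names the repository; where that repository comes from is decided by the tenant config and the connections.

```yaml
# Hypothetical base job; the name and playbook paths are illustrative only.
- job:
    name: base
    parent: null
    # Makes the roles in the zuul/zuul-jobs repo available to this job's
    # playbooks. This names the repo but does not say where to fetch it from.
    roles:
      - zuul: zuul/zuul-jobs
    pre-run: playbooks/base/pre.yaml
    post-run: playbooks/base/post-logs.yaml
```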
corvusclarkb, Shrews, tristanC: i believe i have a local zk cluster of 3 nodes using server-side (quorum) tls!22:56
clarkbnice22:56
corvusapparently all the java keystore stuff requires passwords (and some things break without them), so the hardest part is i had to type "foobar" a lot.22:57
y2kennyok so now I have another question... for things in pipeline like trigger, reporter, etc., I would specify a source.  But looks like I don't do that for projects.  What if I have zuul/zuul-jobs in both the opendev and my internal gerrit server?22:57
corvusclarkb, Shrews, tristanC: tomorrow, i'll work on client tls config, then i'll see about setting up our tests that way and documenting this22:58
y2kennyOh... main.yaml/tenant config22:58
fungiy2kenny: https://zuul-ci.org/docs/zuul/reference/tenants.html#tenant has an example of the tenant config, with zuul/zuul-jobs included as an untrusted repository, and below that you can see an example of filtering what kinds of configuration you want to allow it to consume from specific repositories as well22:58
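A minimal sketch of the kind of tenant config being described, assuming a connection named "gerrit" for the internal server and one named "opendev.org" for the public copy of zuul/zuul-jobs. The tenant and project names are placeholders; the keys under "source" have to match connection names defined in zuul.conf.

```yaml
# main.yaml (tenant config); all names below are placeholders.
- tenant:
    name: example-tenant
    source:
      # Connection to the internal Gerrit, as named in zuul.conf.
      gerrit:
        config-projects:
          - my-zuul-config
        untrusted-projects:
          - my-project
      # Connection to opendev.org, used only to pull in the shared jobs repo.
      opendev.org:
        untrusted-projects:
          - zuul/zuul-jobs:
              # Only load job definitions from this repo.
              include:
                - job
```

Dropping the "opendev.org" block and listing a local fork of zuul-jobs under the "gerrit" source instead would be the local-fork variant fungi mentions earlier.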
*** igordc has quit IRC22:58
y2kennyok, yes... I think I am connecting the dots now.22:58
y2kennythanks22:58
fungiy2kenny: and then here's the bit on defining connections in your zuul.conf: https://zuul-ci.org/docs/zuul/reference/connections.html22:58
corvusy2kenny: if there is a name collision (ie, zuul/zuul-jobs) you can supply the fully qualified name for a repo (eg opendev.org/zuul/zuul-jobs).  we use the canonical name there rather than the connection name so that it can be the same across different zuul installations even if they have different connection names22:59
y2kennygreat.  Yes, it's taking me a bit of time to connect all the concepts together.  Coming from the jenkins world, this is really awesome.23:00
corvusy2kenny: (that's generally true any place in zuul where a repo name is supplied outside of the context of a source)23:00
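As a rough illustration of what corvus describes, a job that needs to disambiguate between two repos both named zuul/zuul-jobs could use the fully qualified canonical name. The job name here is hypothetical, and the example assumes the public copy's canonical hostname is opendev.org.

```yaml
# Hypothetical job; only the repo references matter here.
- job:
    name: example-job
    roles:
      # Canonical name (hostname/project), not the connection name.
      - zuul: opendev.org/zuul/zuul-jobs
    required-projects:
      - name: opendev.org/zuul/zuul-jobs
```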
fungiglad to hear someone coming from the jenkins world thinks zuul is awesome, rather than uselessly confusing! ;)23:00
y2kennycorvus: that's good to know.  I actually had some question in my head when I was setting up the gerrit connection.  I feel like I need to name the server in 3 different places.23:01
corvusy2kenny: yeah, i think one of those is superfluous and we should remove it from the docs, but i don't know which :)23:01
y2kennyfungi: zuul gives me all the stuff I wish jenkins would come with out of the box.  Jenkins is fine for simple projects but, at least for my setup, the complexity of reality just made it grow into a monster23:03
y2kennya lot of these zuul concepts are essentially what we have implemented in Jenkins, custom, in groovy... Jenkins' groovy.  It got pretty ugly.23:04
corvusy2kenny: yeah, zuul is the result of running jenkins at scale for several years :)23:05
fungiy2kenny: us too!23:05
fungithat's basically how zuul evolved23:05
fungiwe wrote so much glue and orchestration around jenkins, that there was a lot more of it than there was jenkins23:06
fungiso then we "just" swapped the jenkins masters out for ansible/executors23:07
y2kennycorvus: it definitely shows.  I was reading the stuff and I was like... these all make sense!  I was so unhappy with Jenkins I was about to cook up something on my own.  Good thing I saw the talk at Gerrit User Summit in december.23:07
y2kennyand I was like... awesome, someone has already done this for me :D23:07
fungiy2kenny: if you're intrigued by the jenkins history of zuul, there's an article here which recounts the highlights of what we went through to end up here: https://opensource.com/article/20/2/zuul23:08
corvusy2kenny: oh great!  did you see we're making progress on using zuul for gerrit's gerrit?  https://ci.gerritcodereview.com/tenants23:08
corvusy2kenny, fungi: yeah, that's a great article on the subject -- it might help zuul users coming from jenkinsland23:09
y2kennyfungi: that's going to be useful when someone challenges my decision not to stick with jenkins.23:15
*** zxiiro has quit IRC23:15
y2kennycorvus: I did see the discussion on repo-discuss but didn't know you guys got it up and running already.  This is great.23:16
corvusy2kenny: heh, it's, er, speculatively running :)  there are about 8 required patches that haven't merged yet, but since none of them are in config repos, we can actually run jobs and see the result before they land.  so we know it works, it's just wrangling reviews now :)23:17
fungiyeah, it's almost mind-bending at times that you're able to basically test a complex ci deployment without even merging most of the configuration for it23:19
