Friday, 2018-04-27

clarkbssh happens in a forked process so I don't think paramiko updating would affect that00:00
SpamapSAFAIK we don't use paramiko00:00
clarkbwe do just at the beginning for host key handling00:01
SpamapSyeah just confirmed, ansible is using ssh00:01
SpamapSah dunno about that00:01
SpamapSbut I see what you're saying00:01
clarkbbut ya ansible is forked and uses openssh by default00:01
SpamapSnodepool might have torched the node00:01
SpamapSbecause my deploy job is just a job that runs on a bastion00:01
SpamapSthe bastion being a regular node00:01
SpamapShrm.. hard to find the node's real hostname anywhere00:04
SpamapSsince it was a post_failure00:04
SpamapSno logs were saved00:04
SpamapSwhich kinda sucks.. probably bad form on my post playbook part00:04
clarkbconnectivity problems do make this difficult00:04
clarkbI say as I need to address my derp home networking. Warm weather seems to have made my office's wireless bridge device unhappy00:05
SpamapSnodepool did delete the node out from under the job00:07
SpamapSI wonder if I *am* restarting zookeeper or something00:07
*** rlandy has quit IRC00:08
SpamapS2018-04-26 15:45:23,507 DEBUG zuul.AnsibleJob: [build: f287dae10e8d4b98a70a97b21f7f021c] Ansible output: b'RUNNING HANDLER [zookeeper : Restart zookeeper] ********************************'00:09
SpamapSyep00:09
SpamapSrestarted it, which presumably caused the lock to be lost00:09
SpamapSwell now at least I know00:09
SpamapS- meta: flush_handlers00:13
SpamapS>:|00:13
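[A minimal reconstruction of the failure mode, with hypothetical task names: a notified handler would normally run at the end of the play, but "meta: flush_handlers" forces it to run immediately, restarting zookeeper underneath the running job.]

    # hedged sketch -- task and handler names are illustrative
    - name: Write zookeeper config
      template:
        src: zoo.cfg.j2
        dest: /etc/zookeeper/zoo.cfg
      notify: Restart zookeeper

    # without this, the restart would wait until the end of the play
    - meta: flush_handlers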
tristanCfdegir: zuul.rpm only contains the cli and the module... the doc, webui and services are sub packages. you can get them all using "yum install rh-python35-zuul-*"00:25
tristanCclarkb: the lock should survive a zookeeper restart if the client reconnects before the session timeout00:39
clarkbtristanC: it may actually happen because nodepool sees all the nodes as aliens if zk isn't responding?01:16
*** harlowja has quit IRC01:23
tristanCclarkb: can't find that behavior in the launcher code, maybe this happens if a zk call is executed when the service is down01:35
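[A short kazoo sketch of the behavior tristanC describes; the ZooKeeper path and client identifier are hypothetical. A SUSPENDED client that reconnects within the session timeout keeps its lock; a LOST session means the server has already released it.]

    from kazoo.client import KazooClient, KazooState

    zk = KazooClient(hosts='127.0.0.1:2181', timeout=10.0)

    def listener(state):
        if state == KazooState.SUSPENDED:
            # disconnected; the session (and the lock) may still survive
            print('suspended: waiting for reconnect')
        elif state == KazooState.LOST:
            # session timed out; the server released any held locks
            print('lost: lock is gone')

    zk.add_listener(listener)
    zk.start()
    lock = zk.Lock('/nodepool/nodes/0000000001/lock', 'launcher-1')
    lock.acquire()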
clarkbcorvus: so I don't forget: your changes to config loading probably deserve a release note02:16
SpamapShttp://paste.openstack.org/show/719983/03:41
SpamapSBeen getting these a lot03:41
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul master: Sometimes GitHub doesn't return repo permissions  https://review.openstack.org/56466603:54
SpamapS^^ looks like a simple case of assuming the latest version of an API that isn't stable.03:54
SpamapSHeh in fact, looks like GHE 2.13 doesn't even have /collaborators03:56
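[A hedged sketch of the defensive pattern the change implies -- this is illustrative Python over decoded API JSON, not Zuul's actual code: older GitHub Enterprise releases may omit the permissions data (or the whole /collaborators endpoint), so missing fields should mean "no access" rather than a crash.]

    def repo_permission(collaborators, login):
        # collaborators: list of dicts decoded from the GitHub API
        for user in collaborators:
            if user.get('login') == login:
                permissions = user.get('permissions') or {}
                for level in ('admin', 'push', 'pull'):
                    if permissions.get(level):
                        return level
        return None  # unknown user or no permission data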
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: mqtt: add basic reporter  https://review.openstack.org/53554304:13
SpamapShrm04:36
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: mqtt: add basic reporter  https://review.openstack.org/53554304:40
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Make revoke-sudo work on base cloud-init images  https://review.openstack.org/56467404:45
SpamapS^^ FYI, I want this for our internal cloud tests here at GD, because I want to run things like tox/flake8/etc. with the exact image that most of our users use..04:46
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: mqtt: add basic reporter  https://review.openstack.org/53554305:06
*** swest has joined #zuul05:12
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: web: add OpenAPI documentation  https://review.openstack.org/53554105:52
SpamapShrm, how does ensure-tox work exactly? it installs tox with --user ... but .local/bin is only added to PATH on login shells.. which you don't get with the command: module.05:57
SpamapSGuessing I need to start installing tox without --user05:57
tristanCSpamapS: .local/bin could be added to the environment, like so: https://review.openstack.org/#/c/532083/7/roles/ansible-lint/tasks/main.yaml06:08
SpamapStristanC: yeah, it could. But it's not yet. ;)06:33
SpamapSand I believe this works fine because tox is pre-installed on custom images06:33
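[A sketch of tristanC's suggestion using ansible's standard environment keyword; the task itself is hypothetical. It lets a --user-installed tox be found even though command: tasks don't run through a login shell.]

    - name: Run tox
      command: tox -e linters
      environment:
        PATH: "{{ ansible_env.HOME }}/.local/bin:{{ ansible_env.PATH }}"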
*** yolanda__ is now known as yolanda06:59
openstackgerritTristan Cacqueray proposed openstack-infra/nodepool master: builder: support setting diskimage env-vars in secure configuration  https://review.openstack.org/56468707:13
*** xinliang has quit IRC07:14
*** xinliang has joined #zuul07:15
*** ssbarnea_ has joined #zuul07:45
*** hashar has joined #zuul07:47
*** jamesblonde has joined #zuul07:50
*** jpena|off is now known as jpena07:52
jamesblondehello :) are there some people around to answer my questions?07:54
tobiashjamesblonde: just post your question, but note that most people here are located in us timezones07:59
jamesblondethat's why I asked, so I will try to stay tuned. My question is: how is nodepool connected to jenkins?08:18
openstackgerritMatthieu Huin proposed openstack-infra/nodepool master: Add separate modules for management commands  https://review.openstack.org/53630308:28
openstackgerritMatthieu Huin proposed openstack-infra/nodepool master: Add separate modules for management commands  https://review.openstack.org/53630308:37
jamesblondeand what is the difference between Zuul launcher + Zuul trigger (v2) and Zuul executor (v3)? were both replaced by it?08:49
tobiashjamesblonde: nodepool v2 or v3?09:05
tobiashv3 has no connection to jenkins (as there is no jenkins with zuul v3)09:06
tobiashjamesblonde: zuul launcher (v2) was replaced by zuul executor (v3)09:07
tobiashjamesblonde: not sure what you mean with zuul trigger (v2)09:08
*** jamesblonde has quit IRC09:10
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900410:23
*** CrayZee has joined #zuul10:27
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900410:30
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900410:45
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900411:19
*** jpena is now known as jpena|lunch11:56
*** ssbarnea_ has quit IRC12:06
*** ssbarnea_ has joined #zuul12:07
*** ssbarnea_ has quit IRC12:09
*** ssbarnea_ has joined #zuul12:13
mordredtobiash: perhaps we need a FAQ entry for v2 -> v3 migrations - I think this is the second day a similar question has been asked about launchers/executors - might be nice if we had a short page "so you're already running a zuul v2 and looking to upgrade"12:18
tobiashmordred: good idea12:18
mordredbecause they're certainly fair questions12:18
tobiashmordred: so you're back from traveling hell?12:18
tobiash;)12:19
*** ssbarnea_ has quit IRC12:19
mordredtobiash: yes!12:19
mordredtobiash: my couch at home is much more comfortable than places that are not my couch at home12:20
*** jamesblonde has joined #zuul12:20
tobiashmordred: I can imagine that12:20
jamesblondethe nodepool that comes with zuul v3 ?12:21
tobiashjamesblonde: the nodepool that comes with v3 has no linkage to jenkins12:21
*** ssbarnea_ has joined #zuul12:21
tobiashjamesblonde: as jenkins is replaced in v3 by zuul-executor12:21
jamesblondeok got it, and what if I want to keep using jenkins with the gearman plugin? should I keep executors?12:22
mordredjamesblonde: you should not upgrade to zuul v3 at the moment if you want to keep using jenkins. however, there are a few people - electrofelix is one - who have been working on zuul v3 + jenkins12:23
jamesblondeThat's my thought right now. Is there a particular reason? Not tested yet?12:24
tobiashjamesblonde: the data-flow architecture has been changed12:24
tobiashjamesblonde: the sources are now pushed to the nodes by the executor12:25
tobiashjamesblonde: the merger doesn't serve any repos anymore12:25
pabelangermordred: jamesblonde: tobiash: I was recently pointed to https://github.com/jenkinsci/nodepool-agents-plugin for nodepoolv3 and jenkins12:25
pabelangerI believe it is coming out of rackspace12:25
mordredpabelanger: neat! yah - that's hughsaunders and odyssey4me12:26
pabelangeryar12:26
jamesblondeYes, instead of pulling them with jenkins... I am asking because we don't want to use an openstack cloud but want to migrate to zuul v3.12:26
mordredjamesblonde: oh - well, you don't have to use an openstack cloud with v312:27
odyssey4meyep, I didn't do the development - that's down to hughsaunders and some other team members... I'm just a tester :)12:27
jamesblondeThanks for your recommendation, I'll check the repo and keep electrofelix in mind12:27
mordredjamesblonde: zuulv3 has direct support for pre-defined static nodes12:27
mordredas well as a growing number of non-openstack node providers12:27
mordredso if that's the reason you wanted to keep your jenkins - we've got you covered :)12:27
mordredI should say - nodepool v3 has direct support for static nodes as well as a growing number of non-openstack dynamic node providers12:28
mordredzuul v3 has support for whatever nodepool gives it :)12:28
jamesblondeSo that would be the best for us12:28
mordred\o/12:28
mordredodyssey4me: 'just a tester'12:29
jamesblondeAnd in this case zuul executor is not needed, like in v2? Should I keep my zuul launcher & trigger?12:30
pabelangerright, zuul-executor will only work with zuulv312:30
mordredwait - I think y'all just said different things12:30
mordredin v3 you need a zuul-scheduler and at least one executor12:31
mordredand a nodepool12:31
jamesblondeThat's what I want, use zuul v3 but using another node pool manager12:31
pabelangerah, yes. I was only focusing on zuul-launcher / zuul-executor part12:31
mordredyah - so that's not really a thing with zuul v312:31
mordredzuul v3 gets nodes from nodepool12:32
mordredif you want to use zuul v3 and the nodes are somewhere, the best bet would be to write a plugin for nodepool to get nodes from whatever is managing them12:32
jamesblondeok so zuul v3 is only pre-configured to work with nodepool12:32
tobiashjamesblonde: yes, so essentially nodepool is actually a mandatory part of zuul now12:33
mordredyes12:33
mordredbut - it itself is pluggable - so nodepool should be able to get nodes from whatever system - be it static or openstack or ec2 or something homegrown12:34
tobiashjamesblonde: but as mordred said nodepool also can manage a pool of e.g. statically defined nodes12:34
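[A rough sketch of what such a static-driver nodepool.yaml could look like; hostnames and labels are made up, and exact field names may differ between nodepool versions.]

    providers:
      - name: static-provider
        driver: static
        pools:
          - name: main
            nodes:
              - name: bastion01.example.com
                labels: centos-7
                username: zuul
                ssh-port: 22
                host-key: "<ssh public host key>"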
electrofelixjamesblonde: from my testing so far, the upgrade path will be to migrate to nodepool v3 with jenkins; hughsaunders is the person to chat to about the plugin. once we've had a chance to migrate locally ourselves, we're hoping to help him with that plugin and subsequently a zuul-trigger plugin to allow zuul v3 -> jenkins communication12:35
jamesblondeok i am going to think about writing such a plugin12:35
tobiashjamesblonde: do you use a system for dynamic node provisioning?12:36
electrofelixjamesblonde: I'd get nodepoolv3 integrated with jenkins as a first pass12:37
mordredelectrofelix: I think the issue is that the reason they were having jenkins in the mix was to avoid nodepool since they have jenkins getting nodes from somewhere else12:37
jamesblondewe use the AWS cloud only, and mostly VM instances12:37
mordredjamesblonde: there is an ec2 driver up for review already actually12:37
mordredjamesblonde: https://review.openstack.org/#/c/535558/12:38
pabelangeryah, best to talk with tristanC about nodepool drivers, he writes them in his sleep :)12:38
jamesblondemy goal is to use a Zuul v2-like behavior with a set of dynamic nodes to manage ephemeral resources (because today we have 5 full-time jenkins masters running)12:38
tobiashjamesblonde: in this case I think you want v3 without jenkins and with nodepool and https://review.openstack.org/53555812:39
*** rlandy has joined #zuul12:39
jamesblondeI think that's exactly what we are looking for12:40
mordredsweet12:41
jamesblondeso I was wrong to think that nodepool was made for OpenStack-based clouds => as you can read here https://docs.openstack.org/infra/nodepool/ "It is designed to work with any OpenStack based cloud,"12:41
mordredoh. heh. good call!12:41
jamesblonde(i am french, so that sentence made me think I had to use an openstack-based cloud, and not standalone physical or virtual machines)12:42
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Clarify in doc introduction that OpenStack is not required  https://review.openstack.org/56474612:44
mordredjamesblonde: ^^ maybe that will prevent such confusion next time12:44
jamesblondeoh indeed12:45
jamesblondeyes i did not see that one12:45
jamesblondegood review btw ;)12:45
mordred\o/12:46
jamesblondeIt is clearer in my brain now ^^ i am going to test it using the aws driver12:46
mordredjamesblonde: sweet. I think tristanC has used that for some things, so it should work, but it's also new, so please let us know if you have any issues with it12:47
jamesblondeof course I'll come back and write some doc too12:48
jamesblonde(if I plan to use it)12:48
Shrewsi think the proposed aws driver is fairly limited12:48
Shrewsmore of a WIP12:49
mordredShrews: you're more of a WIP12:50
jamesblondei will let you know but could be a good idea to contribute on it12:50
mordred++ that would be very welcome12:50
Shrews/ignore mordred --reason "just cause"12:50
*** jpena|lunch is now known as jpena12:54
openstackgerritMerged openstack-infra/zuul master: Fix zuul home directory in zuul from scratch document  https://review.openstack.org/56438612:57
openstackgerritMerged openstack-infra/nodepool master: Clarify in doc introduction that OpenStack is not required  https://review.openstack.org/56474612:57
*** dkranz has joined #zuul13:14
SpamapSEven if it is limited..13:49
SpamapSIt's needed.13:49
SpamapSAnd you have to start somewhere.13:49
SpamapSWhat's blocking it currently?13:49
* SpamapS going through the review slowly13:49
SpamapSalso, if we do want people to write drivers, https://review.openstack.org/#/c/535555/ is critical13:50
SpamapS(and has two +2's.. so...)13:50
SpamapSMake that 313:53
SpamapSShrews: was there some unstated reason we haven't landed 535555?13:53
SpamapSActually it just looks like it's been sitting ready to ship for a few days, so, +3'd13:54
ShrewsSpamapS: nope13:55
SpamapSwerd14:02
Shrewsi sort of want to make the drivers pluggable (except for openstack and maybe static) so others don't need to wait on nodepool releases to get the latest and greatest driver14:04
Shrewsi wonder how others feel about that though14:04
Shrewsalso, i don't really want to review AWS changes  :)14:04
openstackgerritMerged openstack-infra/nodepool master: Refactor NodeLauncher to be generic  https://review.openstack.org/53555514:05
Shrewsor VMWare changes14:07
Shrewsor Azure changes14:07
Shrewsetc14:07
mordredShrews: well - I'd agree, except the horizon/neutron plugin testing mess makes me think twice about that14:08
Shrewsmordred: i am not aware of the details there14:08
mordredShrews: it's solvable - but basically the out of tree driver needs the thing it's a driver for in order to test - it's probably fine for us since we release frequently14:09
openstackgerritPaul Belanger proposed openstack-infra/nodepool master: Add fedora-28 to nodepool dsvm  https://review.openstack.org/55921114:09
Shrewsmordred: well, i mean, the shifting of responsibility to make sure it works with nodepool is the main reason i'd like it pluggable14:11
Shrewsbecause we can't test with any other provider ourselves14:11
Shrewsso we'd just be guessing14:11
Shrewsbut if a driver author has those resources to test... great14:11
mordrednod. yeah - it's a topic we should certainly consider how to deal with14:12
Shrewsi just imagine someone coming to #zuul and saying "hey, the aws driver doesn't do this thing"14:12
Shrewshow can we (other than tristanC) test and fix?14:13
mordredShrews: that said, I bet markatwood would give us quota to test the ec2 driver14:13
Shrewsmordred: there's still vmware, azure, kubernetes, some-other-latest-greatest-thing14:13
Shrewsi guess i've already decided which way *I* lean on this  :)14:14
mordredvmware is the only one f those that seems problematic though - since we'd have to install vmware and that would suck14:14
mordredShrews: hehe14:14
pabelangerit would be great for driver authors to set up third-party CI on their nodepool driver somehow14:14
pabelangerand report results14:14
rcarrillocruzproblem is on clouds/products that are not free14:15
pabelangerunless openstack-infra gets credentials to azure / aws14:15
Shrewsit's not just access... it's a working knowledge of the thing14:15
rcarrillocruzlike, tristan developed his aws driver by using the free-tier account14:15
rcarrillocruzbut that goes away after a year, I think14:15
pabelangeryah14:15
mordredI think my concern is that I don't want to wind up with key things in a contrib ghetto14:15
Shrewse.g., i don't have any desire to learn vmware14:15
rcarrillocruzlol14:15
mordredhowever, whatever we can do to make sure that they're in good shape and reasonable for people to use, I'm in favor of14:16
pabelangerShrews: easy, nova vmware driver14:16
mordredpabelanger: ;)14:16
pabelangermonies please14:16
rcarrillocruzthis is what ansible folks use to test vmware modules, https://github.com/vmware/govmomi/tree/master/vcsim , but yeah, i hear what Shrews says about 'knowing everything about all drivers to review them'14:16
pabelanger:D14:16
mordredyah. the ansible community's choice to empower driver authors to care for their own driver is more scalable than everyone having to learn all of the drivers14:17
mordredso it might be more a matter of figuring out where the line is: which drivers we think are important enough that we should collectively learn something about them14:17
mordredand also have a mechanism for people who want to care for and feed a driver that we can not care about14:18
Shrewsi think this warrants a ML discussion. i can start that up14:18
mordredcoolio14:19
mordredcause I think the major cloud providers (other than openstack of course) - ec2, gce and azure - are ones we should have out of the box support for - just like having out of the box support for github for zuul14:20
mordrednow - the others - the digital oceans and mac stadiums - the line gets much more blurry for me14:20
Shrewsi think the line needs to be drawn on what we actively test14:21
Shrewsnot on popularity14:22
Shrewsbut i'll put that in the initial email14:22
tobiashShrews, mordred: the pluggable driver interface was discussed a few months ago and the decision at that time was that we want such a thing but need time to stabilize the driver api first before making that public14:25
mordredtobiash: ++14:26
mordredShrews: oh totally - but I think we should actively test ec2, azure and gce in addition to openstack14:26
tobiashI think corvus wanted to land a few more drivers before making that step to get real experiences14:27
mordred(assuming, of course, we can get donated quota to do such a thing)14:27
tobiashso maybe we want to wait until some of tristanC's drivers landed to validate that the internal api works and can be published14:27
Shrewsmordred: if we can actively test them, i'm more ameniable to having them in-tree14:28
Shrewsamenable14:29
Shrewswords are hard14:29
* mordred hands Shrews a box of ameniable rhinocerouses14:29
Shrewsmmm, yummy14:29
dmsimardmordred: I think I found a bug in the zuul UI ? If I go here: http://zuul.openstack.org/jobs.html and then ctrl+f our oddly specific job "legacy-grenade-dsvm-cinder-mn-sub-volbak", clicking on the "builds" link changes the link in the address bar to http://zuul.openstack.org/builds.html?job_name=legacy-grenade-dsvm-cinder-mn-sub-volbak but it doesn't actually refresh the page to go to the builds for that job.14:32
rcarrillocruzwe could team up with ansible/ansible to see if they could donate us 'some' quota for those providers14:33
rcarrillocruzhint hint14:33
mordreddmsimard: that doesn't seem awesome14:36
mordreddmsimard: although I need to finish the angular5 patch (one more thing outstanding) - so let's check it against that (tracking it down in the current code is likely not going to be the world's most fun thing)14:37
dmsimardmordred: np14:38
mordreddmsimard: http://logs.openstack.org/89/551989/31/check/zuul-build-dashboard/f6d6097/npm/html/builds.html?job_name=legacy-grenade-dsvm-cinder-mn-sub-volbak <-- worked on top of the angular5 patch14:39
mordreddmsimard: so - I think I've fixed your bug in an upcoming patch14:39
dmsimardmordred: going to that URL directly works14:39
dmsimardmordred: it's clicking on the builds link from the jobs page that doesn't, let me try14:39
mordreddmsimard: ya - but I got to that by following your process14:39
mordredhttp://logs.openstack.org/89/551989/31/check/zuul-build-dashboard/f6d6097/npm/html/jobs.html14:40
dmsimardah, ++14:40
dmsimardmordred: you're so good you fix problems you didn't even know you had :)14:40
mordredunfortunately I have a half-done fix for a different problem sitting on that patch locally - but I haven't touched it in a week so I don't remember what the problem was anymore14:40
*** jimi|ansible has quit IRC14:42
corvusmordred, tobiash, Shrews: i very much think that a reasonable set of popular drivers should be in-tree in order to be useful for users.  and yes, they should be tested, though i'm not sure they always need to tested against live systems -- betamax/mocks/fakes may be enough in some circumstances.  as core reviewers we don't need to know everything about them.  we need to make a good api interface so that14:50
corvuspeople who do know about them can maintain them.14:50
Shrewsi don't understand the reasoning that having them in-tree makes it more useful14:51
Shrewsit may make it simpler14:52
Shrewsemail just sent, btw14:52
*** acozine1 has joined #zuul14:53
corvusShrews: yes, simpler is useful14:53
Shrewsand i fear the "drive-by" driver contribution. we accept a new driver, but the author then disappears and doesn't maintain it14:54
corvusShrews: having them in or out of tree has no impact on that.  if there's no one to maintain it, it's dead either way.14:55
corvusShrews: we need to be responsible for nodepool being usable and functional; it's too important for us to outsource that.14:56
corvusShrews: i'm not saying *all* drivers need to be in-tree14:56
corvusShrews: but most of the ones on the table so far should be, because they're all pretty major players.14:57
corvus(i'm fine with creating an out-of-tree driver interface after we have openstack/aws/k8s/... in tree)14:58
*** gtema has joined #zuul15:09
*** jimi|ansible has joined #zuul15:12
*** jimi|ansible has joined #zuul15:12
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900415:29
*** myoung is now known as myoung|email-unl15:32
*** myoung|email-unl is now known as myoung|emailplz15:32
corvushow about we merge my config changes?15:34
corvusi hit w+1 on the ones that lacked it15:34
corvusmaybe we can restart openstack-infra with them today and see how they perform15:35
clarkbcorvus: and watch cacti memory graphs15:37
corvusclarkb: oh, also, i agree about a release note15:38
jimi|ansiblemordred / Shrews : does zuul support restarting sub-jobs yet? having a discussion about our current CI and figured if zuul doesn't do this yet we should probably start pestering you for it now :)15:40
clarkbjimi|ansible: you are describing a feature that would let you tell zuul it is ok for a job to run up to N times before succeeding, and if it eventually succeeds, treat it as a success?15:43
clarkbor is this restart in another context?15:43
mordredclarkb: I think this is "recheck specific-job"15:46
clarkbah15:46
jimi|ansibleyeah just restart a job due to transient network/etc. failures15:48
jimi|ansiblefor example, in ansible we do integration tests across all the distros, and quite often we'll see failures on ubuntu or fedora for example due to failures in the apt/yum/dnf/whatever tests because the remote resource had an issue15:49
jimi|ansibleso rather than re-run the entire test suite just restart that sub-job15:49
openstackgerritMerged openstack-infra/zuul master: Don't store references to secret objects from jobs  https://review.openstack.org/55359615:50
gtemasorry for the stupid question. When I install a fresh nodepool and configure a static pool with 1 host, should 'nodepool list' show this node? I'm trying to install an on-premise zuul but am struggling here. I see that nodepool tries to log in to the host upon service restart, but it fails and no proper log information is available15:55
*** jamesblonde has quit IRC15:55
*** hashar is now known as hasharAway15:56
clarkbgtema: reading the code it looks like static node info isn't written into zookeeper until first use, and zookeeper's node records are where `nodepool list` output comes from16:01
corvusgtema: i believe it should not appear in the list.  personally, i think it should, but the driver isn't implemented that way.16:02
corvusi'd like us to revisit that.16:02
clarkbcorvus: ++ it appears that when the static nodes are launch()ed their records are written; we could probably just write records for all of them on startup, then launch will update the status?16:02
corvusyeah, it seems like it should be possible16:03
tobiashcorvus: maybe you want to rebase the timeout fix to the start of your stack to minimize rechecks ;)16:03
corvusi think we talked about it in review; i'm not sure why it didn't work out16:03
*** hasharAway is now known as hashar16:03
corvustobiash: yeah, now that i've incurred the cost, i'll do that :)16:03
clarkbgtema: as for debugging the ssh, I don't know that nodepool actually tries to log in, but it will ask the remote node for its ssh hostkey16:04
clarkbgtema: what logs do you have?16:04
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Allow extra time for some ansible tests  https://review.openstack.org/56457216:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Fix race in test_bubblewrap_leak  https://review.openstack.org/56464016:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Perform late validation of secrets  https://review.openstack.org/55304116:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Perform late validation of nodesets  https://review.openstack.org/55308816:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Late bind projects  https://review.openstack.org/55361816:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Make config objects freezable  https://review.openstack.org/56281616:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Remove layout from ParseContext  https://review.openstack.org/56369516:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Remove 'base' from UnparsedAbideConfig  https://review.openstack.org/56375716:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Change TestMaxTimeout to not run ansible  https://review.openstack.org/56456216:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Store source context on all config objects  https://review.openstack.org/56456316:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Cache configuration objects in addition to YAML dicts  https://review.openstack.org/56406116:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Stop deep-copying job variables  https://review.openstack.org/56456416:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Remove source_context argument to Pipeline  https://review.openstack.org/56464216:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Simplify UnparsedConfig.copy  https://review.openstack.org/56464716:05
corvusall rebase ^16:05
clarkbcorvus: I'm assuming that gerrit will reapply our +2 on most (all) changes since the thing changing was two lines in the tests16:05
clarkbcorvus: let me know if I need to rereview something16:06
gtemaonly after switching nodepool to debug by manually changing the 'normal' handler in logconfig.py to DEBUG16:06
corvus(we need to make that a command line argument) ^16:06
clarkbgtema: can you share those logs with a paste service so that we can see what it is doing?16:06
gtemaclarkb: and on the target host failed attempts from audit.log16:06
gtemaclarkb: https://pastebin.com/H9CdA3Vi - nodepool.log16:09
gtemaclarkb: immediately after restart in the /var/log/messages: https://pastebin.com/S05KiRCf16:12
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Add regex support to project stanzas  https://review.openstack.org/53571316:12
tobiashclarkb, corvus: rebased ^ to match your stack16:13
corvustobiash: thanks and sorry16:13
tobiashI just hit the rebase button ;)16:14
tobiashvotes are retained16:14
clarkbgtema: my initial reading of the nodepool logs is that nothing is wrong, nodepool creates the records it needs then is waiting for node requests from zuul16:14
tobiashcorvus: shall we +w this too or do you want a further review on that?16:14
corvustobiash: let me sanity check it in the new context16:14
clarkband I don't see whee nodepool would be logging into the remote host, it definitely does a keyscan though16:14
tobiashok16:14
clarkboh wait it's gonna do the ready check isn't it /me digs more16:15
tobiashclarkb: it does a keyscan during reconfig16:15
gtemaclarkb: ok, thanks. I was confused that nodes are not listed. Will continue zuul setup. But would those nodes be listed only while tasks are executing there, or permanently after the first task was executed?16:15
corvusgtema: only when tasks are executed, i believe16:16
gtemaclarkb: ok, thanks16:16
clarkbtobiash: ya I see the keyscanning. gtema best guess is that the keyscan implementation attempts to do a login to get the key(s)?16:16
tobiashclarkb: no, it just does a keyscan16:17
tobiashso it can hand them over to zuul16:17
clarkbthere is definitely a paramiko.start_client then client.get_remote_server_key16:17
clarkbunsure if the start_client will attempt a login?16:17
clarkbor at least appear that way from audit.log's perspective16:17
corvusclarkb: maybe any ssh connection that doesn't end with a login is a "login failed" ?16:18
corvusfrom sshd's pov16:18
clarkbcorvus: ya16:18
clarkbalso no account info in that logged entry16:18
clarkbwhich lines up with it just doing a keyscan16:18
corvusgtema: so best guess is that everything's working okay, and if you continue with zuul setup so it requests a static node, it should (hopefully) work16:19
gtemaok, thanks16:19
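[A minimal paramiko sketch of the keyscan being discussed: the SSH transport handshake fetches the host key without ever authenticating, which sshd's audit log can record as a failed or aborted login.]

    import socket
    import paramiko

    def keyscan(host, port=22, timeout=10.0):
        sock = socket.create_connection((host, port), timeout)
        transport = paramiko.Transport(sock)
        transport.start_client(timeout=timeout)   # handshake only
        key = transport.get_remote_server_key()   # no auth attempted
        transport.close()
        return '%s %s' % (key.get_name(), key.get_base64())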
tobiashcorvus: I think I'll rebase the regex change on top of your complete stack, currently it's somewhere in the middle16:22
corvustobiash: sounds good16:22
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add allowed-triggers and allowed-reporters tenant settings  https://review.openstack.org/55408216:31
openstackgerritMerged openstack-infra/zuul master: Allow extra time for some ansible tests  https://review.openstack.org/56457216:44
openstackgerritMerged openstack-infra/zuul master: Fix race in test_bubblewrap_leak  https://review.openstack.org/56464016:45
tobiashhrm, everything broken, I think I have to restructure the regex change16:47
clarkboh ya types move and stuff16:47
corvustobiash: yeah, that's what i was worried about; let me know if you need help.16:47
tobiashcorvus: just some guidance about the way to choose, in UnparsedConfig.copy16:48
tobiashfirst choice is to keep the regex projects grouped by regex, but then it would be something extra during copy16:49
tobiashor making that a list and group them by regex in tenantparser._addlayoutitem16:50
tobiashI'm leaning towards option 2 even if that may incur a slight performance cost16:50
corvustobiash: hrm, i'm not sure i understand completely.  in both options, where would you separate out the regex projects from the regular ones?16:53
corvustobiash: also, while i'm thinking about it, my guess is that your main loop should go in Layout.getProjectPipelineConfig now16:53
corvustobiash: maybe the thing to do is to just keep them in the project list with all the others in UnparsedConfig, but then separate them out into their own list or dict in parseConfig.16:55
corvus(so UnparsedConfig only has "projects" and ParsedConfig has "projects" and "projects_by_regex")16:56
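[A toy sketch of the split corvus proposes; parse_config is a hypothetical stand-in, keying regex stanzas on a leading '^' the way the actual change does.]

    def parse_config(unparsed_projects):
        # UnparsedConfig keeps one flat list of project stanzas;
        # the parsed form separates literal names from regexes.
        projects = {}
        projects_by_regex = {}
        for conf in unparsed_projects:
            name = conf.get('name', '')
            if name.startswith('^'):
                projects_by_regex.setdefault(name, []).append(conf)
            else:
                projects.setdefault(name, []).append(conf)
        return projects, projects_by_regex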
openstackgerritMerged openstack-infra/zuul master: Perform late validation of secrets  https://review.openstack.org/55304116:58
openstackgerritMerged openstack-infra/zuul master: Perform late validation of nodesets  https://review.openstack.org/55308816:58
openstackgerritMerged openstack-infra/zuul master: Late bind projects  https://review.openstack.org/55361816:58
tobiashright, the unparsed config should not know about regex16:58
tobiashI'll try that16:58
clarkbI like that separation as the UnparsedConfig is just raw data structures16:59
corvusfriendly reminder, today is a fine day to update https://etherpad.openstack.org/p/zuul-update-email17:02
corvusmordred: clarkb and i were just having a chat in etherpad about the fact that we probably should have added a release note about the new re2 dependency17:07
*** jpena is now known as jpena|off17:07
corvusmordred: do you know if we can retroactively add a note?17:08
corvus(i mean, obviously we can add it to the next release, but i mean is there a way to get it categorized under the previous one?)17:08
*** gtema has quit IRC17:09
corvusi'll ask in #openstack-release17:09
*** jimi|ansible has quit IRC17:13
corvusmordred: there's a mypy error in http://logs.openstack.org/28/564628/2/check/tox-pep8/0c5268b/job-output.txt.gz17:17
corvusmordred: oh, i think it's correct :)17:20
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Report git sha in status page version  https://review.openstack.org/56462817:21
*** kmalloc has joined #zuul17:22
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add release note about re2  https://review.openstack.org/56484717:27
corvusclarkb: ^ apparently we can just do that :)17:28
corvusit's probably worth thinking about whether we want to add release notes for dependency additions though.  one could argue that openstack-infra is just broken because we don't run bindep on our install.  :)17:28
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: zuul web: add admin endpoint, enqueue & autohold commands  https://review.openstack.org/53900417:34
JosefWellsHey, zuul masters, I was wondering if any other CI systems have a similar nearest-non-failing algorithm for starting test runs, etc18:00
clarkbJosefWells: the only one that comes to mind is chef's thing, oh what is it called. It's not open source but is zuul-inspired18:06
JosefWellsI've seen similar systems in semiconductor companies, but nothing open source till zuul18:07
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add debug info to test_slow_start  https://review.openstack.org/56485718:11
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Add regex support to project stanzas  https://review.openstack.org/53571318:14
tobiashcorvus: had to reimplement half of this but got it running now ^18:15
JosefWellsThanks clarkb!  I'm off to play with zuul!18:16
tobiashcorvus: 564847 results in a strange ordering of the release notes: http://logs.openstack.org/47/564847/1/check/build-sphinx-docs/a6f6b7e/html/releasenotes.html18:21
SpamapSJosefWells: I believe the Prow folks are thinking of doing it.18:23
SpamapShttps://github.com/kubernetes/test-infra/tree/master/prow18:23
SpamapSbut for now IIRC it uses a simpler "1+n" window algorithm where they try 1, and then 1+n, and that way they have a chance at landing 1 or 1+n changes.18:24
clarkbtobiash: looks like the tests didn't have to change though, that's good18:28
*** electrofelix has quit IRC18:31
mordredcorvus: wow - mypy caught an actual thing? neat18:31
SpamapS#winning18:31
tobiashclarkb: yeah, had to reimplement almost everything except the tests ;)18:31
SpamapSHm.. feature idea.. let trusted playbooks request holds.18:35
SpamapSIt would be cool to basically be able to say "If you find XYZ in the logs, and the author doesn't have any other holds active, hold these nodes"18:36
clarkbSpamapS: you could implement it as a playbook/role with a secret (to talk to nodepool)18:38
SpamapSyeah, I also just want nodepool to have a rest API18:39
SpamapSso I can do exactly that18:39
SpamapSI need a non CLI non-shared-box UI for nodepool18:39
*** jimi|ansible has joined #zuul18:39
*** jimi|ansible has joined #zuul18:39
*** elyezer has quit IRC18:39
SpamapSright now I have people logging in and sudo'ing to nodepool/zuul to make holds and clean them up18:39
openstackgerritAndreas Jaeger proposed openstack-infra/zuul master: Fix description for DependentPipelineManager  https://review.openstack.org/56486218:40
clarkbtobiash: left a couple comments but they don't appear to be regressions so didn't -118:43
openstackgerritAndreas Jaeger proposed openstack-infra/zuul master: Fix some code description  https://review.openstack.org/56486218:44
ShrewsSpamapS: you've seen https://review.openstack.org/539004 ?18:45
clarkbSpamapS: I'm sure your copying the entire journald data contents is related, but is it common to not be able to debug based on logs in your env?18:47
clarkb(it's one of the big things I push back on with openstack teams: if you can't debug it from the logs, then your ops can't either)18:47
SpamapSclarkb: people use it as a dev-on-demand service18:49
SpamapSwrite the patch, throw at wall, log in and fix wrong assumptions, repeat18:49
SpamapSworks pretty well18:50
SpamapSwould like this to be a first class paradigm in zuul eventually18:50
openstackgerritMerged openstack-infra/zuul master: Make config objects freezable  https://review.openstack.org/56281618:51
openstackgerritMerged openstack-infra/zuul master: Remove layout from ParseContext  https://review.openstack.org/56369518:51
clarkbah so assumption is that initial pass will fail and dev will jump on to iterate18:51
clarkbinteresting18:51
SpamapSsometimes18:52
SpamapSnot always18:52
SpamapSjust a common thing, like "I need to fiddle with it some"18:52
SpamapSand rather than having a parallel vagrant path..18:52
SpamapSjust zuul for all18:52
corvusSpamapS: i don't see a problem with this in principle, but i think we'll want to explore the ux around it a bit.  how about the idea of, rather than requesting it in a playbook, simply auto-holding every failed job, perhaps up to a per-author or per-tenant limit or something?  could even be a limit of 1 -- so the last failed job for $author is auto-held for 24 hours.18:54
corvus(to be clear, i'm just brainstorming)18:54
SpamapSYeah I've been wondering that too.18:54
SpamapSHave had similar thoughts18:55
SpamapSAnother thought I've had is to dump an SSH key into a recheck comment.18:55
SpamapSLike "I'm a trusted person and I want to be able to get into the nodes if this fails"18:55
SpamapSrecheck-with-hold18:56
SpamapSsomething like that18:56
SpamapSanyway.. just something I'm thinking about18:56
SpamapStoo many ideas to get done18:56
SpamapSFor a team of about 10 users, the current method is working fine.18:57
SpamapSBut I can see it failing to scale quickly.18:57
corvusSpamapS: that's a promising idea too -- it sounds like it could have a good level of delegation there (presumably could be enabled per-pipeline)18:57
openstackgerritMerged openstack-infra/zuul master: Remove 'base' from UnparsedAbideConfig  https://review.openstack.org/56375718:58
openstackgerritMerged openstack-infra/zuul master: Change TestMaxTimeout to not run ansible  https://review.openstack.org/56456218:58
openstackgerritMerged openstack-infra/zuul master: Store source context on all config objects  https://review.openstack.org/56456318:58
openstackgerritMerged openstack-infra/zuul master: Cache configuration objects in addition to YAML dicts  https://review.openstack.org/56406118:58
openstackgerritMerged openstack-infra/zuul master: Stop deep-copying job variables  https://review.openstack.org/56456418:58
openstackgerritMerged openstack-infra/zuul master: Remove source_context argument to Pipeline  https://review.openstack.org/56464218:58
openstackgerritMerged openstack-infra/zuul master: Simplify UnparsedConfig.copy  https://review.openstack.org/56464718:58
corvuswelp, that's that landed!18:58
*** elyezer has joined #zuul19:00
*** spsurya has quit IRC19:01
openstackgerritMerged openstack-infra/zuul master: Report git sha in status page version  https://review.openstack.org/56462819:15
*** myoung|emailplz is now known as myoung19:18
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Add regex support to project stanzas  https://review.openstack.org/53571319:35
tobiashclarkb: adapted to your comments ^19:36
openstackgerritFatih Degirmenci proposed openstack-infra/nodepool master: Add nodepool service file for CentOS7  https://review.openstack.org/56487219:49
corvustobiash: dhellman says it's a bug in reno and is unrelated to that patch.  the sections can end up in any order, and in fact, i think we're seeing it in action right now with them changing on the website.19:54
corvustobiash: https://storyboard.openstack.org/#!/story/200193419:54
corvusso it's not related to the change to add the re2 releasenote, that should be safe to land19:54
mordredcorvus: we've exercised reno a bit recently haven't we?19:55
corvusayup19:56
tobiashAh ok19:56
pabelangerfdegir: left a suggestion on 56487220:01
*** CrayZee has quit IRC20:06
fdegirpabelanger: just looking at it20:11
fdegirpabelanger: i didn't get this part of the comment: it will combined both files and use the proper path for centos.20:12
fdegirpabelanger: when you say "both files", which files do you mean?20:12
pabelangeryou'd install the existing nodepool-launcher.service and new nodepool-launcher.d/centos.config20:14
fdegirpabelanger: now i got it20:14
pabelangercentos.conf20:14
fdegirright when you responded20:14
pabelanger:)20:14
fdegirbut...20:15
fdegirpabelanger: even though systemd seems to be happy, sudo systemctl start nodepool-launcher hangs20:15
pabelangerdoes nodepool-launcher -d work?20:16
fdegirthat works20:16
pabelangerif so, you might have permissions issues20:16
fdegirI think we need the start command updated20:16
pabelangerno, I suspect you cannot create the pid file20:16
fdegirand modified to be like the one I have in the centos one, with -d20:16
fdegirpid file is there20:16
fdegiri was looking at the service file from softwarefactory20:17
pabelangershould be in /var/run/nodepool, which systemd creates with RuntimeDirectory=nodepool20:17
pabelangeris the nodepool-launcher process running? maybe strace it20:17
fdegirJob for nodepool-launcher.service failed because a timeout was exceeded. See "systemctl status nodepool-launcher.service" and "journalctl -xe" for details.20:17
fdegirit is running20:17
pabelangerforking?20:17
fdegiryes20:17
fdegirif I use the one from sf with Type=simple and /usr/bin/nodepool-launcher -d20:18
fdegirit works20:18
pabelangermight need guessmainpid=no20:18
pabelangerand pidfile set20:18
fdegirlet me try that one20:18
pabelangerI stopped testing with centos, but you're likely hitting some issues with systemd and python-daemon.20:19
pabelangeryou can also enable systemd debugs to get more info on why it is failing20:19
pabelangerI guess nothing in journalctl -u nodepool-launcher.service20:20
fdegirpabelanger: nodepool-launcher.service start operation timed out. Terminating.20:20
fdegirpabelanger: if you look at this one20:20
fdegirhttps://review.openstack.org/#/c/564872/1/etc/centos7/nodepool-launcher.service20:20
fdegirthe 3 main differences are the Type, ExecStart, and PIDFile20:21
fdegirand that one works with no issues20:21
fdegirbut since I don't have a fedora system, I am not sure if the one I sent for centos works on fedora as well20:22
pabelangerright, you can use nodepool-launcher -d and type=simple but we don't want that to be the default20:22
corvusheads up that current master may be broken (we apparently have a hole in our testing)20:22
pabelangeryou should be able to use type=forking, pidfile, execstart20:22
pabelangerbut likely need more flags on centos20:22
fdegirok20:22
pabelangermaybe guessmainpid=no20:23
pabelangerI think that will read the PIDfile for the process to watch20:23
fdegirtried guessmainpid and it timed out as well20:23
pabelangerI'd enable debugging in systemd and see what is happening20:24
pabelangerfdegir: but I do use nodepool-launcher -d myself and it works20:24
pabelangerwe just want zfs to use type=forking20:24
*** acozine1 has quit IRC20:26
fdegirpabelanger: yes, if i run manually things work20:26
fdegirpabelanger: but not as a service20:26
pabelangerthat to mean sounds like permission issue for selinux issue20:27
pabelangermight want to check audit logs20:27
pabelangeror set selinux to permissive for now20:27
fdegirsorry, didn't help20:28
fdegirthe thing is20:28
fdegirwhen i issue systemctl start, i see the process20:28
fdegirthe pid is in pidfile20:28
fdegirnodepool is reporting 2018-04-27 20:27:33,673 INFO nodepool.NodePool: Starting PoolWorker.static-vms-main20:28
fdegirso everything seems to be working but the systemctl start doesn't seem to proceed further, keeps waiting and finally timing out20:29
pabelangerdoes process die too?20:29
fdegiryes20:30
pabelangeryah, likely python-daemon cannot start properly. Check permissions on all folders, eg: /var/log/nodepool, etc20:30
pabelanger/etc/nodepool20:31
pabelangerif you sudo su nodepool20:31
pabelangerthen run nodepool-launcher20:31
pabelangerit also likely fails20:31
pabelangerwhich common cause is permissions issue20:31
pabelangerand because python-daemon has stderr=none, you don't see the failure20:31
clarkb(because proper unix daemonization says you should close all open fds)20:36
pabelangeryah, wonder if we need a --noop / --dry-run, or a script to validate proper permissions on folders so the daemon can properly start. Pretty hard for a new nodepool user to understand what is happening when not using -d20:37
corvuspabelanger: https://review.openstack.org/54788920:38
pabelangeryay!20:39
corvusif ianw is busy, maybe someone else can port that to zuul20:39
*** ssbarnea_ has quit IRC20:43
fdegirpabelanger: this is what i get with systemd debugging20:50
fdegirpabelanger: https://hastebin.com/ofidunewiw.sql20:50
clarkbfdegir: pabelanger I think that is telling us we set the type to forking but the fork parent never exited (we know it did fork though because the child is mentioned in the log)20:53
fdegiragain, all the permissions are right20:57
fdegiri can start things manually20:57
fdegirwith systemctl start, i see20:57
fdegircat /var/run/nodepool/nodepool.pid20:57
fdegir2073220:57
fdegirnodepool 20732     1  4 20:56 ?        00:00:01 /usr/bin/python3.5 /usr/bin/nodepool-launcher20:57
fdegirwhile systemctl start is waiting20:58
clarkbya rereading docs the parent isn't exiting20:58
fdegirand then the stuff you see in log happens20:58
fungicould it be blocking on additional (higher-numbered) file descriptors inherited from the shell or something? i've never looked to see whether that daemon library is smart enough to iterate over all bound fds21:00
fdegira few weeks ago when I tried it on fedora, it worked21:00
fdegirso this seems to be a centos thingy21:00
fungisome naive daemonization routines just assume closing stdin, stdout and stderr is sufficient21:00
fdegirand seeing sf using simple made me think they had a reason to use simple21:01
fdegirthey might have faced similar issue21:01
clarkbfungi: systemd says it waits for parent to exit21:02
clarkbhttps://pagure.io/python-daemon/blob/master/f/daemon/daemon.py#_812 is how the library decides to detach or not by default21:02
clarkbso oddly I think that means we don't want type = forking or we want to set detach process flag to true21:03
clarkbthis feels like an optimization for systemd21:03
clarkbpabelanger: ^ does forking work for you? I think you said you had tested it on fedora at least21:04
clarkbfdegir: try it without the -d and type simple21:05
pabelangerclarkb: I can test quickly21:05
pabelangerI haven't yet21:05
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Coerce MappingProxyTypes in job vars to dicts  https://review.openstack.org/56488621:05
*** harlowja has joined #zuul21:07
pabelangerokay, I don't think we tested this on fedora, it is also hanging for me21:08
pabelangerlet me try something21:08
fdegirclarkb: it works21:08
fdegirclarkb: i mean without -d and type simple21:08
pabelangerclarkb: fdegir: that was the fix, detach_process=True21:19
fdegirpabelanger: so forking didn't work on fedora either?21:20
pabelangeronly after I patched nodepool/cmd/__init__.py21:21
pabelangerI've been using simple and -d myself21:22
pabelangerso, if we want to support forking, we'll need to patch nodepool / zuul21:22
pabelangerhowever, having issue with pidfile21:22
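[A hedged sketch of the fix pabelanger describes, based on python-daemon's documented API: by default DaemonContext guesses it is "already detached" when started by init (as under systemd), so supporting Type=forking means forcing the fork.]

    import daemon
    from daemon import pidfile

    # detach_process=True makes python-daemon fork so the parent can
    # exit -- which is what systemd's Type=forking waits for.
    with daemon.DaemonContext(
            pidfile=pidfile.TimeoutPIDLockFile('/var/run/nodepool/nodepool.pid'),
            detach_process=True):
        run_launcher()  # hypothetical entry point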
corvuspabelanger: the zfs docs should work on fedora, are you saying they don't?21:53
pabelangercorvus: I was testing with nodepool-builder, let me try nodepool-launcher21:55
fdegiri just tried again now and it didn't work21:56
fdegiron fedora2721:56
fdegirsame timeout occurs there too21:56
fdegirApr 27 21:44:06 fedora.localdomain systemd[1]: nodepool-launcher.service: Start operation timed out. Terminating.21:56
corvusare the service files that ended up in the repo the same ones from the earlier version of the docs?21:56
fdegiri used the one from nodepool repo21:57
fdegiroh21:57
fdegircorvus: i just looked at leifmadsen's gist21:57
fdegirclarkb: and that gist has simple there so the one in nodepool repo doesn't match to that21:58
fdegircorvus: ^21:58
pabelangeryah, nodepool-launcher and forking isn't working. I'm not sure anybody actually tested it21:58
corvusgist?21:58
fdegirhttps://gist.github.com/leifmadsen/93b9283d10dfddba096e32fb172cf56921:58
pabelangerit is failing on fedora for me21:58
corvusfdegir: oh, that's ... rather out of date :)21:58
fdegirbecause i was 100% sure it worked on fedora for me when he was working with the first version21:58
fdegirbut the service file contains simple21:58
corvusfdegir: this is the most up to date thing, which is derived from that: https://zuul-ci.org/docs/zuul/admin/zuul-from-scratch.html21:59
fdegirso if that part of nodepool hasn't changed then the service file that ended up in nodepool wasn't the correct one21:59
fdegircorvus: yes21:59
*** elyezer has quit IRC21:59
fdegircorvus: the "official" one points to service files from nodepool repo21:59
fdegircorvus: and that's what i've been working on for centos docs21:59
corvusokay, that should work for fedora22:00
fdegircorvus: when you said if the right service files ended up in repo then i checked gist22:00
fdegircorvus: it doesn't22:00
corvusfdegir: oh, i meant the ones from a previous version of the docs, but later than the gist22:00
fdegirthe official one doesn't work22:00
corvusfdegir: to be clear: you're saying if i follow the instructions in https://zuul-ci.org/docs/zuul/admin/zuul-from-scratch.html it won't work?22:00
fdegiryes, that's what i am saying22:01
fdegirthe service file the doc tells user to copy from nodepool repo is the problem22:01
fdegirhttps://zuul-ci.org/docs/zuul/admin/nodepool_install.html22:01
fdegirsudo cp etc/nodepool-launcher.service /etc/systemd/system/nodepool-launcher.service22:01
fdegirthis service file has forking in it22:01
corvusokay, that's a problem for which i will drop everything and run through the instructions again22:01
clarkbcorvus: the issue is https://pagure.io/python-daemon/blob/master/f/daemon/daemon.py#_81222:02
fdegiri think the easiest fix is to switch to simple instead22:02
fdegiruntil nodepool/zuul is patched according to what clarkb just pasted22:02
clarkbcorvus: fdegir we can either decide to use simple and allow default behavior from ^ or override the default behavior and fork twice22:02
pabelangerokay22:02
pabelangerthe issue is type=forking22:02
clarkbsort of22:03
pabelangerswitching back to type=simple, the pidfile is created properly22:03
pabelangerand systemd starts properly22:03
pabelangerhowever, I don't think that is the way systemd wants the process to work22:03
pabelangerwe'd need the setting clarkb said above for type=forking I think22:03
*** rlandy has quit IRC22:04
clarkbright forking is fine if you fork. and simple is fine if you don't fork. Just have to decide which we want22:04
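[The two self-consistent combinations clarkb describes, as unit-file sketches; paths are illustrative.]

    # Option 1: stay in the foreground
    [Service]
    Type=simple
    ExecStart=/usr/bin/nodepool-launcher -d

    # Option 2: daemonize, and tell systemd which child to track
    [Service]
    Type=forking
    PIDFile=/var/run/nodepool/nodepool.pid
    ExecStart=/usr/bin/nodepool-launcher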
corvushttp://git.zuul-ci.org/cgit/zuul/commit/doc/source/admin/zuul-from-scratch.rst?id=28d99222a6cb82aaf7698571359363be6416b38f22:04
openstackgerritMerged openstack-infra/zuul master: Coerce MappingProxyTypes in job vars to dicts  https://review.openstack.org/56488622:04
fdegirsame problem probably exists for zuul-{scheduler, executor} as well since those service files use type=forking too22:04
corvusthe service file that was added to nodepool was *not* the one that was in the docs22:04
pabelangernope, I lied: type=simple doesn't work22:04
pabelangerit was killed after x seconds22:05
corvusShrews: ^22:05
clarkbpabelanger: ok that at least makes me think we didn't do something completely wrong in investigating the forking option22:05
corvuspabelanger, fdegir: have you tried the version in http://git.zuul-ci.org/cgit/zuul/commit/doc/source/admin/zuul-from-scratch.rst?id=28d99222a6cb82aaf7698571359363be6416b38f ?22:06
corvuspabelanger, fdegir: specifically at http://git.zuul-ci.org/cgit/zuul/tree/doc/source/admin/zuul-from-scratch.rst?id=38b26de3b398e1ee1fa2bcbed0a6bc5105589f67#n25422:06
pabelangerclarkb: yah, enabling detach_process=True is what gets type=forking working22:06
pabelangercorvus: testing22:06
*** elyezer has joined #zuul22:08
clarkbwhat is odd about fdegir's log is that it seems to indicate there is a child22:09
clarkbbut the only os.fork happens if detach_process=True22:09
fdegircorvus: that seems to work22:11
pabelangerconfirmed22:11
fdegircorvus: it's still alive22:11
corvuspabelanger: can you please propose that as a patch.  can you also please verify that the zuul service files are the same ones from that version of the documentation?22:12
pabelangerbut, I don't think systemd will ever use the pid file we are creating as PIDfile is only used with forking22:12
pabelangercorvus: sure22:12
fdegirpabelanger: can you add me to those changes as reviewer so i can continue with centos instructions based on those?22:13
corvuspabelanger, Shrews, tobiash: i'd like us to be very careful with the zuul-from-scratch document.  when we make changes, we need someone to actually do the process manually and verify that it works.22:13
corvuswhat happened here is that after i spent several days running through the document and verified everything in it, we made changes based on things that people thought "should work".  let's not do that again.22:14
corvusso please at least get a review comment from someone -- the author or a reviewer -- that says "i tested this and it works"22:14
pabelangeryah, I left a +2 saying I have not tested; I should have really done a +122:15
openstackgerritPaul Belanger proposed openstack-infra/nodepool master: Fix nodepool-launcher systemd file  https://review.openstack.org/56490122:18
pabelangercorvus: fdegir: clarkb: ^ that is working systemd file22:18
pabelangerfor nodepool22:19
pabelangerI'll test zuul over the weekend22:19
fdegirtried it and it works22:20
fdegirthanks all for the help22:22
fdegirnow i can go back to where i left things22:22
corvusfdegir: it looks like the same problem exists for the zuul service files22:22
corvusfdegir: you can get a good version of those from the doc i linked earlier until we fix it22:23
corvuspabelanger: it looks like you asked Shrews to make the same erroneous changes to the service files in zuul, can we go ahead and fix that now?22:23
fdegircorvus: will look for those patches as well and base the work on it22:23
pabelangercorvus: yes, I won't be able to test them until later however22:24
corvuspabelanger: as long as they match the version i confirmed was working earlier, i'm happy.  they're certainly broken now.22:24
corvusclarkb: can you approve https://review.openstack.org/564901 ?22:26
mordredcorvus: I +2d - want me to wait on clarkb or just +A?22:26
corvusmordred: +a22:27
mordredcorvus: done22:27
fdegirwould you like me to send those? with the new nodepool service files I am now moving on to the zuul steps22:27
clarkbsorry finally getting to lunch now22:27
fdegirand can verify those service files from the earlier version of the doc and send the change22:27
corvusfdegir: i think pabelanger is about to do that in just a few mins22:28
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Fix zuul systemd files  https://review.openstack.org/56490322:28
fdegirok22:28
corvusseconds even22:28
fdegir:)22:28
pabelangerrevert, but untested22:28
pabelanger(by me)22:28
corvusthey match the ones i tested22:29
*** hashar has quit IRC23:19
openstackgerritFatih Degirmenci proposed openstack-infra/zuul master: Add CentOS 7 environment setup instructions  https://review.openstack.org/56494823:24
openstackgerritFatih Degirmenci proposed openstack-infra/zuul master: Add CentOS 7 environment setup instructions  https://review.openstack.org/56494823:26
openstackgerritFatih Degirmenci proposed openstack-infra/nodepool master: Add systemd drop-in file for CentOS 7  https://review.openstack.org/56487223:41
openstackgerritFatih Degirmenci proposed openstack-infra/zuul master: Add steps to use systemd drop-in for Nodepool on CentOS 7  https://review.openstack.org/56495023:47
