Monday, 2017-03-13

*** jamielennox is now known as jamielennox|away00:20
*** jamielennox|away is now known as jamielennox00:27
openstackgerritJamie Lennox proposed openstack-infra/nodepool feature/zuulv3: Add a failure message to zookeeper  https://review.openstack.org/44467300:40
mordredShrews: you're so close on all the ectomies!03:14
mordredShrews: +2 on the stack03:31
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Configure regional mirrors for our jobs  https://review.openstack.org/43915206:05
*** bhavik1 has joined #zuul06:07
*** isaacb has joined #zuul07:24
*** Cibo_ has joined #zuul07:31
*** Cibo_ has quit IRC08:00
*** openstackgerrit has quit IRC08:18
*** hashar has joined #zuul08:38
*** hashar has quit IRC08:58
*** bhavik1 has quit IRC09:00
*** hashar has joined #zuul09:08
*** bhavik1 has joined #zuul09:32
*** isaacb has quit IRC09:38
*** isaacb has joined #zuul09:48
*** hashar has quit IRC09:58
*** hashar has joined #zuul10:01
*** bhavik1 has quit IRC10:04
*** isaacb has quit IRC11:36
*** isaacb has joined #zuul11:44
*** Cibo_ has joined #zuul12:50
Shrewspabelanger: oops, looks like i've duplicated your 436027 change in 444647 during my hackfest yesterday13:05
*** Cibo_ has quit IRC13:22
*** hashar is now known as hasharfood13:27
*** isaacb has quit IRC13:47
*** isaacb has joined #zuul14:03
*** openstackgerrit has joined #zuul14:06
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove MySQL  https://review.openstack.org/44491914:06
Shrewsmordred: that ^^^ was the best ectomy14:06
*** hasharfood is now known as hashar14:16
ShrewsSpamapS: Who is Cullen Taylor and have they started on the nodepool side of https://storyboard.openstack.org/#!/story/2000897 ? If not, I'd like to knock that one out today since it completes what is needed from nodepool.14:17
Shrewseggshell: hi! see above  :)14:19
*** isaacb has quit IRC14:22
pabelangerShrews: np, I can abandon14:35
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove MySQL  https://review.openstack.org/44491914:56
*** bhavik1 has joined #zuul15:11
*** jianghuaw has quit IRC15:14
pabelangerclarkb: jeblair: mordred: https://review.openstack.org/#/c/438281/ seems to be the preferred way of defining our generic tox playbooks. Is that something we want to land today? This will allow me to keep iterating forward on establishing our base jobs for zuul in zuulv3-dev.o.o15:32
jeblairyep15:33
pabelangerdanke15:34
pabelangerjeblair: also, are we at a point we can restart nl01.o.o? We have a few jobs stuck in queue on zuulv3-dev.o.o because of a lack of nodes15:35
pabelangerthis is related to the locked ready nodes from last week15:36
jeblairpabelanger: we want to land https://review.openstack.org/444462 first15:37
jeblairpabelanger: i'll high-priority review its parents15:38
pabelangergreat, thanks15:38
jeblairShrews, mordred, pabelanger: in my reading of https://review.openstack.org/410812 it looks like we don't "consume both forms" as the commit message states.  i'm fine with that, but wanted to double check the discrepancy...15:41
jeblairi think maybe that's left over from the version of the change to master?15:42
pabelangerShrews: left a comment on 444919, curious what could use secure.conf in the future15:42
Shrewspabelanger: zookeeper creds15:42
pabelangerya, that was what I suspected15:42
pabelangerk15:42
Shrewspabelanger: jeblair: we also want to land the leaked node fix before restarting nl0115:43
pabelangerShrews: you have a pep8 failure, why I didn't +215:43
pabelangerShrews: ++15:43
Shrewspabelanger: yeah, adding 2 reviews before that one to fix the pep815:43
jeblairmordred: can you take a look at 442114 and 442124 please?15:53
*** Cibo_ has joined #zuul15:55
clarkbjeblair: mordred re consume both forms, we may want that in v3 in order to transition cleanly without deleting all nodes and rebuilding them all?16:00
jeblairclarkb: well, everything else needs to change in v3, so i'm not too worried about that changing too.  i *do* think it's required if we land the equivalent patch in v0.16:01
jeblairShrews, mordred: ^ but i would still like to clarify the intent since the patch doesn't seem to do what the commit message says.  :)16:04
Shrewswe sort of HAVE to delete all nodes for v3 because otherwise we have no record of them in zk. so both forms not required for v316:07
clarkbShrews: oh right db change anyways16:07
Shrewsbut for v0, yeah, probably need both16:08
jeblairclarkb, fungi: who should we get to weigh in on https://review.openstack.org/443985 ?16:09
clarkbI'm trying to remember who really wanted that feature. I think amrith with trove jobs, but there were others I am not remembering right now16:10
fungiright, amrith is the main one coming to mind since he approached us at the ptg about it16:11
jeblairyes, i think most requestors have been external to openstack16:11
fungiand in his case, it was really more of a "oh, jobs support dependencies? well i have these three jobs and when they succeed i want to run these two other jobs..."16:12
pabelangerour wheel build jobs could use it16:13
pabelangersince we only have to afs release one time, but build multiple things16:14
pabelangerbut, we've since removed those jobs16:14
*** Cibo_ has quit IRC16:15
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove MySQL  https://review.openstack.org/44491916:20
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove test_*_cleanup_on_start tests  https://review.openstack.org/44497316:20
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_delete_failure  https://review.openstack.org/44497416:20
Shrewspabelanger: those two new reviews should fix the pep8 fail in the mysql one16:20
Shrewsand that's ALL of the nodepool tests enabled \o/16:20
jeblairSpamapS: 444495 security spec looks great!  mordred, clarkb, fungi: <-- you may be interested in that.16:25
fungithanks!16:26
pabelangerShrews: awesome16:27
jeblairpabelanger, dmsimard: commented on 44408816:43
dmsimardjeblair: will look ty16:44
Shrewsugh, just realized that the zuul meeting is now 6pm my time16:46
jeblairi'm open to rescheduling it.16:47
SpamapSjeblair: glad you like it. I'm hoping to add some more details on a back-of-napkin plan for how to add some abstraction so we don't get too married to whatever we choose to use.16:53
*** bhavik1 has quit IRC16:55
clarkbok have reviewed security spec16:56
jeblairi set the topic of all the zuulv3-related specs to zuulv3 so they should show up in people's dashboards17:03
clarkbjeblair: I commented on https://review.openstack.org/#/c/443985/1 too17:03
mordredjeblair: the commit message on commit-both-forms is just out of date17:03
mordredjeblair: I can update it if you like?17:04
jeblairmordred: i'm okay just landing it; i think we've had the necessary productive discussion :)17:04
* SpamapS is terrible at using gerrit topics17:04
jeblairmordred: (it's at the bottom of a pile; i don't think it's worth the churn)17:05
jeblairclarkb: thanks, will propose that in a followup17:06
* Shrews apologizes for creating the pile17:07
mordredjeblair: yah - I didn't want to upset the pile17:07
jeblairShrews: never apologize for writing patches!  :)17:07
jeblairokay, maybe not 'never'.... :)17:08
jeblairclarkb, pabelanger: https://review.openstack.org/444974  is the last change in the nodepool stack without a +W17:10
*** hashar has quit IRC17:14
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Stop json-encoding the nodepool metadata  https://review.openstack.org/41081217:14
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Use node ID for instance leak detection  https://review.openstack.org/44450817:15
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Fix failure of node assignment at quota  https://review.openstack.org/44446217:15
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Store a pointer to the paused node request handler  https://review.openstack.org/44452017:15
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Re-enable test_disabled_label  https://review.openstack.org/44464617:15
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Re-enable test_node_az  https://review.openstack.org/44464717:15
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Fix provider-label association  https://review.openstack.org/44465017:16
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Re-enable test_node_ipv6  https://review.openstack.org/44465117:16
mordredso much patches17:16
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Remove test_nodepool.test_job_* tests  https://review.openstack.org/44465217:16
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Remove Jenkins  https://review.openstack.org/44465417:16
mordredthat last patch is fun :)17:16
*** openstackgerrit has quit IRC17:18
clarkbjeblair: Shrews in https://review.openstack.org/#/c/444974/1/nodepool/nodepool.py why is the unlock happening before the delete? Shouldn't you hold the lock while "writing"?17:18
Shrewsclarkb: the lock is part of the root node path, so if you delete first, you delete the lock17:19
clarkbShrews: wouldn't that be more "correct"17:20
Shrewsclarkb: the "unlock" would throw an exception17:20
clarkbShrews: right don't unlock if the previous statement is effectively an unlock17:20
clarkbbut let the unlock happen as part of the delete to reduce the race (and perhaps remove it?)17:21
Shrewsclarkb: it's actually relatively safe, even with the race.17:23
Shrewsit's a pattern we use in the builder, too, iirc. Might be worth investigating what happens in your scenario and handling that more gracefully throughout, as an improvement.17:25
jeblairShrews: i forget, can we issue a recursive delete?  or do we have to delete the children first?17:26
Shrewsrecursive=True17:26
Shrewsto the delete call17:26
jeblairShrews: hrm, then it seems like that should be safe in this case?  that may not hold for the cases in the builder, but maybe this is worth a try?17:27
clarkbI left a comment for historical purposes17:27
jeblair(i say it's safe because we know there aren't going to be any other children showing up under this node)17:27
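The two orderings discussed above (unlock and then delete, versus a single recursive delete while the lock is still held) can be sketched with a toy in-memory tree. This is only an illustration under stated assumptions: `ToyZK`, the two helper functions, and the `/nodes/0001` paths are hypothetical names and are not the kazoo or nodepool APIs. The one detail taken from the discussion is that the lock lives under the node's own path.

```python
class ToyZK:
    """In-memory stand-in for a ZooKeeper tree (hypothetical, not kazoo)."""

    def __init__(self):
        self.paths = set()

    def create(self, path):
        self.paths.add(path)

    def children(self, path):
        # Every stored path that lives underneath `path`.
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self.paths if p.startswith(prefix))

    def delete(self, path, recursive=False):
        kids = self.children(path)
        if kids and not recursive:
            # A non-recursive delete refuses to remove a node with children.
            raise RuntimeError("node has children: %s" % kids)
        for p in kids:
            self.paths.discard(p)
        self.paths.discard(path)


def delete_unlock_first(zk, node, lock):
    # Ordering in the change under review: release the lock, then delete.
    # Another worker could take the lock between these two calls.
    zk.delete(lock)
    zk.delete(node)


def delete_with_lock_held(zk, node):
    # Ordering suggested in review: one recursive delete removes the node
    # and the lock beneath it, so the lock is never released separately.
    zk.delete(node, recursive=True)
```

Both helpers leave the tree empty; the difference is only whether there is a window in which the node still exists but is unlocked.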
jeblairclarkb: i added https://review.openstack.org/445022 to implement your doc suggestion17:31
jeblairoh tab17:31
Shrewsheh, looks like the unlock wouldn't actually throw an exception17:36
clarkbShrews: new bug?17:40
Shrewsclarkb: no. happy feature?17:40
*** Cibo_ has joined #zuul17:44
*** openstackgerrit has joined #zuul17:46
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove MySQL  https://review.openstack.org/44491917:46
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_delete_failure  https://review.openstack.org/44497417:46
harlowjajeblair if u get a sec https://github.com/python-zk/kazoo/pull/420/17:47
harlowjathat should fix the gevent stuff17:47
Shrewsclarkb: there ya go ^^^17:47
clarkbhanks17:47
clarkb*thanks17:48
jeblairharlowja: yay!  what should i do?17:49
harlowjamerge it, lol17:49
jeblairharlowja: git-merge it into my pr?17:49
harlowjaactually i'm not sure where the rest of the kazoo folks are :-P17:49
harlowjamaybe its just me left, lol17:50
harlowjajeblair nah, i'll just merge the fix17:50
harlowjathen recheck yours17:50
harlowjai think that should work17:50
jeblairharlowja: cool17:50
jeblairharlowja: how do you tell travis to recheck?17:51
harlowjamagic recheck button that i think only i have17:51
harlowjalol17:52
mordredharlowja: I always say that decisions are made by those who show up - you exist, therefore you are the kazoo developers17:52
harlowja:(17:52
harlowjalol17:52
harlowjajeblair http://imgur.com/a/jP83z (i get a restart job button)17:52
harlowjaafaik that's admin to the repo only or something17:53
jeblairharlowja: ah thanks17:53
harlowjau may have to rebase, not sure, let's see17:53
*** Cibo_ has quit IRC17:53
*** jamielennox is now known as jamielennox|away17:53
clarkbthat makes me feel better about "recheck" in general :P17:54
harlowja:-P17:54
harlowjajenkins also has a similar button17:54
harlowjalol17:54
jeblairyeah, if travis still just checks out my pr branch tip (vs merging it into the target branch), then i think it will still fail.  in that case, i guess i would update my pr with a merge commit from master.17:55
harlowjalet's see, ha17:55
jeblairbut maybe they merge into the target branch for tests now.  i dunno.  this is a nice test case.  :)17:56
mordredjeblair: you can also rebase and force push and the PR will update17:56
jeblairmordred: will github people yell at me for that?17:56
harlowjamordred is SpamapS now with u at redhat?17:56
harlowjai can't keep track anymore17:56
harlowjalol17:56
mordrednope. it's required for some projects17:57
mordredjeblair: ^^17:57
clarkbmordred: jeblair the problem with that is it breaks reviews iirc17:58
SpamapSI'm still blue17:58
clarkbso if you have review history that is relevant and important going forward you want to be careful doing that17:58
SpamapSwell I mean, I was light blue17:58
harlowjalol17:59
SpamapSand then for like, 5 minutes, green rectangle17:59
SpamapSbut now deep blue17:59
clarkbSpamapS: golang-client is older than gophercloud fwiw17:59
mordredclarkb: yah - if there are inline reviews17:59
*** Cibo_ has joined #zuul17:59
SpamapSclarkb: older < adopted17:59
clarkbSpamapS: sure, but your criticism goes both ways17:59
mordredclarkb: older than the rax version? or the non-rax version?17:59
clarkbSpamapS: why didn't gophercloud devs work with existing community instead?17:59
clarkbetc etc17:59
clarkbmordred: whatever is in github18:00
clarkbinitial commit is 2 weeks newer than golang-client18:00
SpamapSclarkb: I may be totally insane on this, but it's a REST API client lib, and we shouldn't expect people to come into OpenStack just to make convenience libs for our REST API's. Or maybe we should. #JustSayin18:00
mordredyah18:01
clarkbSpamapS: I don't think they need to, but they literally did the thing you are complaining about18:01
clarkband got ahead doing it and now you want people to use that18:01
clarkbtldr collaborating is hard18:01
SpamapSI don't know who the they's are18:02
mordredwell - it was ORIGINALLY a rackspace cloud client not a general openstack client18:02
mordredit grew into a general openstack client18:02
clarkbSpamapS: gophercloud neglected an existing project18:02
clarkbSpamapS: so rather than engage existing community and tooling they went and made their own thing (this appears to be your complaint today)18:02
mordredwhen gophercloud started, their intent was not being an openstack client library18:02
mordredwhen it started, it was very specifically being a rackspace cloud client - complete with built-in support for the silly rackspace api-key thing18:02
SpamapSI don't want to punish people for not doing the things we want them to do.. I want us to look at the situation on the ground today and make good choices about how to help OpenStack adoption in that context.18:03
mordredso it would have been inappropriate, at that time, in openstack18:03
clarkbmordred: oh interesting that makes the whole thing even more fun18:03
mordredyah - today gophercloud has its own non-rax org and they stripped the rax specific stuff18:03
clarkbSpamapS: I'm not suggesting we punish people. I'm just pointing out collaborating is hard18:03
clarkbSpamapS: and it's unfair today IMO to complain at dims when gophercloud did/does literally the same thing18:04
clarkbinstead the focus should be how to collaborate better18:04
mordredit is now quite intentionally a general openstack lib - and I believe they would be happy accepting a patch to add clouds.yaml support and whatnot18:04
clarkbmordred: cool18:04
SpamapSclarkb: agreed! And IMO, we have failed at it again by resurrecting a library that competes with the library the small OpenStack-using Go community has adopted as the one to use.18:04
harlowjapunish all the people18:04
clarkbSpamapS: well we haven't really resurrected it18:04
clarkbSpamapS: it's always been there and always worked aiui18:04
clarkbSpamapS: it's just had less attention18:04
SpamapSthat's fair18:05
mordredyah - that _does_ highlight my frustration with rax behavior in this regard over the years - focusing on "Rackspace Cloud" and sending devs to go add support to ecosystem things for "Rackspace Cloud" and not for "OpenStack" with docs mentioning that rackspace is an OpenStack public cloud so works18:06
harlowjamordred deep breath18:06
mordredbut -that's an annoyance at rax business leader folks, not the devs18:06
harlowjahow is rackspace and openstack working out anyway, i didn't hear so well :-/18:08
clarkbharlowja: oh while I am nitpicking email threads, there are plenty of openstack public clouds out there today :P18:08
harlowja1 billion clouds18:09
harlowjalol18:09
SpamapSberzillion is the technical term18:10
mordredclarkb: did harlowja say something about there not being?18:13
clarkbmordred: yes he said openstack is dead in public clouds so we should give up18:13
harlowjalol18:13
clarkband use cloudstack and euca?18:13
harlowjaeuca only18:13
harlowjalol18:13
* harlowja didn't think rax was doing so well (at least from what i hear from the back channels with projects that move things to AWS and such) 18:14
SpamapSbeing acquired by private equity isn't usually a good sign.18:15
mordredharlowja: jesus man18:15
SpamapSbut sometimes those things get chopped up and some of it survives. We don't know really.18:16
dmsimardjeblair: so you know, I think that sort of makes sense -- maybe we should redraft the spec to be a bit more generic, something along the lines of providing users an easy way to configure callbacks ?18:16
clarkbharlowja: I was thinking of ovh and internap and citycloud and datacentred and vexxhost and entercloud and everyone else at https://docs.openstack.org/developer/os-client-config/vendor-support.html18:16
Shrewseggshell: around to discuss https://storyboard.openstack.org/#!/story/2000897 ?18:16
mordredharlowja: you're as bad as the corrupt silicon valley tech press - OpenStack is doing very well in public cloud18:16
mordredharlowja: unless you have a completely US-centric world view driven only by VC exits18:16
SpamapSclarkb: https://www.openstack.org/marketplace/public-clouds/ is another good list.18:16
dmsimardjeblair: i.e, I'm definitely okay with ARA being sort of a "JJB builder" where it'd install ara and configure the callback18:16
* mordred throws wet cat at harlowja18:16
harlowjalol18:17
SpamapSThough the o-c-c one is more useful :)18:17
eggshellShrews: sure. Haven't had the chance to dive too deep into it yet.18:17
dmsimardjeblair: versus what would end up being a tight coupling -- it's okay to keep zuul as lean and as simple as possible18:17
SpamapSharlowja: what's dead is speculation. :)18:17
Shrewseggshell: mind if i code the nodepool part of that then? I'd like to put that up today if possible.18:17
Shrewseggshell: i'll leave you the zuul part  :)18:17
SpamapSthe speculation was that if you just setup an API for running IaaS, the dumptrucks of money would divert into your datacenter instead of AWS/GCE/Azure's18:17
* SpamapS will take this discussion to his inside-voice now though18:18
eggshellShrews: go ahead :)18:18
harlowjaSpamapS ya, fair point18:18
Shrewseggshell: great, thx18:19
harlowjaclarkb how many of those clouds are contributing to openstack (the codebase/s), any idea?18:20
dmsimardmordred: TBH (disclaimer, $oldjob=Internap) public cloud with real customers on OpenStack is kind of hard18:20
clarkbharlowja: a good chunk of them contribute the test resources we run our testing on18:21
clarkbharlowja: and ovh does a lot of work with swift aiui18:21
mordreddmsimard: internap made a bunch of left-field choices in its deployments18:21
rbergeronoh boy humans talking lots18:21
harlowjalol18:21
dmsimardThe market is saturated -- if they're not in AWS/GCE/Azure, what they're looking for is probably just cheap VPSes and it's a race to the bottom for that18:21
clarkbharlowja: apparently they have been doing the EC work recently? something like that18:21
mordreddmsimard: again - that assumes the market is US18:22
dmsimardmordred: I know all about the deployment choices, I was a part of them :D18:22
mordreddmsimard: there are more countries than the US18:22
mordreddmsimard: :)18:22
harlowjanooooooo18:22
harlowjathere is only US18:22
harlowjalol18:22
mordredthere might even be places where they do not want to put their data into the control of a company based in Seattle18:22
mordredeven if that company has a datacenter in their country18:22
harlowjaguess i should move to europe based on https://www.openstack.org/marketplace/public-clouds/18:23
dmsimardright, we had some customers reaching out to us just because of the Canada datacenter locations18:23
mordrednow - if literally ANY of the US OpenStack cloud providers had worked _together_ instead of competing with each other18:23
dmsimardmordred: the inter-cloud story is bad18:23
mordredthe US story also might be better18:23
dmsimardlike federation or that cisco intercloud thing18:23
mordreddmsimard: you don't need those things18:24
harlowjamordred isn't tha a tough sell when u are racing to the bottom18:24
harlowja*that18:24
mordredharlowja: what, working with each other?18:24
harlowjaya18:24
mordredthat's literally the ONLY WAY ANYONE WAS EVER GOING TO BE SUCCESSFUL18:24
dmsimardYeah .. interop is a must18:25
harlowjai get that point, what was missing in the equation to make that happen?18:25
harlowja(in your view)18:25
mordredlack of arrogance18:25
jeblairoh, hi18:25
jeblaircan we take the openstack pontificating to another channel?18:25
harlowjalol18:25
* clarkb apologizes for pushing the snowball down the hill18:26
harlowja:)18:26
* mordred blames clarkb18:26
clarkband points at #openstack-dev as good alternate venue18:26
* mordred blames himself for being an easy troll target18:26
harlowja:)18:28
jeblairdmsimard: cool -- i think it's okay to have the spec call out ara as a first-class use-case still.  i think a goal here is "make it easy to use ara".18:29
dmsimardjeblair: I'm okay with that18:29
pabelangerare we ready to restart nl01.o.o? I see we've merged some nodepool changes18:29
jeblairpabelanger: yes18:30
jeblairharlowja: 419 checks passed18:31
pabelangerjeblair: great, did you want to do it, or should I?18:33
jeblairpabelanger: can you please?18:34
pabelangerrestarted18:36
pabelangerjeblair: should I expect zuul to start clearing out its pipeline or should I do something else? EG: manually delete ready nodes18:38
*** bhavik1 has joined #zuul18:40
dmsimardjeblair: fyi I'm a bit all over the place right now but I'll rework the draft sometime this week if I have a chance18:44
*** bhavik1 has quit IRC18:50
jeblairpabelanger: should be automatic; if not, we may have more bugs18:57
jeblairpabelanger, Shrews: i think we should add a 'nodepool request-list' command18:57
pabelangerjeblair: okay, I think we have more bugs18:58
jeblairit looks like the launcher is throwing some exceptions18:58
jeblairand it seems stuck again18:59
SpamapSclarkb: jeblair do you two think it might be a good idea to have a separate spec for disk space monitoring, or just flesh out the disk space bits a bit more in the launcher security spec?19:00
SpamapSI'm leaning toward the latter19:01
clarkbSpamapS: depending on the preferred security tooling they could be tightly coupled19:01
clarkbeg docker with its fs mounting19:02
SpamapSdocker does overlayfs by default IIRC19:02
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505519:03
Shrewspabelanger: zuul launcher or nodepool launcher?19:04
SpamapSah no, aufs is default.. but they likely have mostly the same features19:04
pabelangerShrews: both are not doing anything currently. I believe the issue is on the nodepool side, but jeblair might have a better handle on it19:06
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505519:06
clarkbSpamapS: in new docker its btrfs I think because aufs is going by wayside in kernel? something like that19:07
clarkbSpamapS: I remember kolla wanting xenial test nodes for this19:07
Shrewspabelanger: jeblair: based on the np exceptions, i'd say the request is still locked somewhere19:09
SpamapSclarkb: looks like most people actually end up using overlay219:09
SpamapSbecause aufs has been gone from ubuntu kernels since 14.0419:09
SpamapSor could be 16.0419:09
jeblairSpamapS: re spec: let's keep them together; the du thread idea is so dirt-simple we might go ahead and implement it before the spec lands, but if we keep it around it should co-evolve with the rest of what's in there; and if it's not necessary, we can remove it.19:11
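A minimal sketch of what a "du thread" could look like, assuming it only needs to poll one job directory and fire a callback when usage passes a limit. The names `dir_usage_bytes` and `DiskUsageMonitor`, the polling interval, and the callback signature are all hypothetical and are not taken from the spec or from zuul.

```python
import os
import threading


def dir_usage_bytes(root):
    """Sum the apparent sizes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished between listing and stat; skip it.
                pass
    return total


class DiskUsageMonitor(threading.Thread):
    """Poll root and call on_exceeded once if usage passes limit_bytes."""

    def __init__(self, root, limit_bytes, on_exceeded, interval=1.0):
        super().__init__(daemon=True)
        self.root = root
        self.limit_bytes = limit_bytes
        self.on_exceeded = on_exceeded
        self.interval = interval
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            used = dir_usage_bytes(self.root)
            if used > self.limit_bytes:
                self.on_exceeded(self.root, used)
                return
            # Sleep, but wake immediately if stop() is called.
            self._stop_event.wait(self.interval)

    def stop(self):
        self._stop_event.set()
```

In this sketch the callback decides what to do with an over-limit job (for example aborting it), which keeps the polling loop independent of whatever enforcement the spec ends up choosing.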
Shrewsthough it seems to have cleared up (the locking)19:13
SpamapSclarkb: I hadn't really looked closely at the docker storage drivers.. but honestly.. they have a lot of what we want built in. Hm.19:13
SpamapSespecially the device-mapper one19:14
pabelangerShrews: left a question on 44505519:15
Shrewspabelanger: not sure how to answer it. i sort of copied what the zuul code was doing. is there an advantage of using paramiko over ssh-keyscan?19:16
Shrewsand why the heck is https://review.openstack.org/444973 not gating?19:16
Shrewspabelanger: oh, for the dependency19:18
clarkbShrews: because when you already have a verified +1 and zuul revotes verified +1 that second +1 isn't actually emitted in the event19:18
clarkbShrews: so zuul doesn't see the change as gateable. You can reapprove19:18
pabelangerShrews: operating system dependency vs python dependency? About all I can think of19:18
pabelangerShrews: either works for me honestly19:19
Shrewsclarkb: neat19:19
SpamapSoh hm19:19
Shrewspabelanger: i'll go with whatever jeblair suggests  :)19:19
SpamapSdocker's device-mapper driver actually _does not_ do what we want.19:19
SpamapSIt has one pool, for all the containers' write space19:19
pabelangerShrews: WFM!19:19
SpamapSclarkb: ^19:19
Shrewspabelanger: he originally suggested the zuul code, so ... :)19:20
SpamapSso one container can fill the pool19:20
pabelanger++19:20
Shrewspabelanger: can i get your eyes on https://review.openstack.org/444974 ?19:21
SpamapSand same for overlay19:21
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Remove test_*_cleanup_on_start tests  https://review.openstack.org/44497319:22
pabelangerShrews: +319:25
Shrews\o/19:25
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Re-enable test_node_delete_failure  https://review.openstack.org/44497419:31
Shrewswait for it...19:31
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Remove MySQL  https://review.openstack.org/44491919:31
Shrewsmmmmmm.... that merge makes me tingley all over19:31
pabelanger\o/19:31
* Shrews takes a smoke break19:32
pabelangerSpamapS: are you a DM or DD for debian?19:32
SpamapSpabelanger: DD, need something sponsored?19:34
SpamapS(We should ITP zuul ;)19:34
pabelangerSpamapS: actually! I have some old packaging for nodepool / zuul kicking around, now that we are close to zuulv3, was going to push that into debian. But ya, need sponsored :)19:35
pabelangerwon't happen today, but might send some packages your way19:35
clarkbSpamapS: re docker that seems poorly designed19:35
clarkbSpamapS: but guessing much easier to implement19:35
jeblairclarkb, fungi, mordred, SpamapS: when you have a moment, thoughts on the question in https://review.openstack.org/445055 would be appreciated.19:36
clarkbwow I derped that response19:38
clarkbit quoted the top level comment and not any of the inline comments, I fail19:38
clarkbbut I like using paramiko there19:38
fungijeblair: the question being whether to also use paramiko for keyscan instead of ssh subprocess?19:38
jeblairfungi: yes.  i would strike 'also' from that though, since i expect us to remove all other uses of ssh from nodepool (ready scripts should move to zuul pre-playbooks)19:39
jeblairfungi: so basically, we have a dependency either on paramiko or ssh-keyscan for a pretty simple task.19:39
fungioh, i see...19:39
fungiso i guess it boils down to whether it continues to carry a dependency on paramiko or on command-line ssh client19:40
jeblairya19:40
jeblairi mean, the client is *probably* already there.  almost certainly.19:40
jeblairbut still19:41
fungithe latter is almost guaranteed to be present, so it's really a choice between a subprocess call or an additional python dep for one single function19:41
clarkbfungi: jeblair well zuul on windows or zuul in containers likely won't have an openssh toolchain19:41
clarkb(there are other barriers to zuul on windows, but zuul in containers is doable today)19:41
pabelangerparamiko means pip install nodepool would just work out of box, single command19:41
fungiyep19:42
jeblairi don't think we have made any promises regarding the first thing.  the second, otoh, is good to keep in mind.19:42
fungiparamiko _does_ claim windows support too, for what that's worth https://github.com/paramiko/paramiko/blob/master/setup.py#L5919:43
fungihrm, though it does itself depend on cryptography and pyasn119:45
ShrewsI'm about to put up the paramiko alternative, fwiw19:45
clarkbfungi: for reading key files I think19:46
fungipyasn1 is less of a concern, cryptography is not pure-python (which makes paramiko impure as well), but if we're already using cryptography for something else then that's nbd19:46
fungijeblair: rcarrillocruz: we were talking about using the pyca/cryptography library for encrypting secrets with zuul though, right? if so, we're eating that cost in part of the toolchain anyway19:49
jeblairfungi: yeah, i think that's still the plan there19:51
Shrewsjeblair: pabelanger: paramiko seems to return only the "ssh-rsa" key, whereas ssh-keyscan returns other types (at least when i test against localhost). Is that ok?19:55
pabelangerOh, interesting19:55
pabelangerI don't think we have rsa keys on our workers19:55
pabelangerlet me check19:55
clarkbwe do19:55
pabelangercool19:55
Shrewsget_remote_server_key() doesn't take any options to control that http://docs.paramiko.org/en/2.1/api/transport.html#paramiko.transport.Transport.get_remote_server_key19:56
Shrewsso, not sure what the difference is there19:56
*** Cibo_ has quit IRC19:56
clarkbdoes paramiko support ecdsa at all?19:57
Shrewsbut at least locally, ssh-keyscan will return ssh-ed25519 and ecdsa-sha2-nistp25619:57
Shrewsclarkb: *shrug*19:57
pabelangerhttps://github.com/paramiko/paramiko/issues/79419:58
pabelangerI think we'd hit that19:58
pabelangerso, looks like ecdsa is in the pipeline. I see some open pull requests for it20:00
fungirsa is "good enough" here in my opinion, and if we want to be able to default to other host key types we can try to pitch in on paramiko fixes i guess20:00
*** Cibo_ has joined #zuul20:02
jeblairbut is it the case that if a nodepool user didn't use rsa keys at all, we would fail to get a fingerprint?20:03
SpamapSclarkb: I was wrong actually20:04
clarkbjeblair: I think openssh at least will request the one it knows and warn about the others but not force you to type yes for them?20:04
SpamapSclarkb: they added a size option https://github.com/docker/docker/issues/380420:04
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505520:05
clarkbI'd have to test that behavior I want to say it will work fine with openssh as long as remote host has rsa host key20:05
Shrews^^^ paramiko version20:05
fungijeblair: yeah, if they configured sshd not to have an rsa host key, this would probably break (though our previous use of paramiko for ready scripts would similarly have run afoul of that, so doesn't seem like much of a regression to me if any?)20:06
jeblairgood point.  also https://github.com/paramiko/paramiko/pull/911 looks promising20:07
pabelangerya, I think these are known issues upstream (paramiko). Which is good to see20:08
pabelanger911 is on track for the 2.2 release too20:08
Shrewsfunny that 911 doesn't adjust the API for get_remote_server_key20:12
Shrewsor add a new one20:13
*** hashar has joined #zuul20:32
SpamapSclarkb: I'm playing with docker storage drivers now... looks like not all of them take the size= opt20:35
SpamapS"For the overlay2 storage driver, the size option is only available if the backing fs is xfs and mounted with the pquota mount option. Under these conditions, user can pass any size less then the backing fs size."20:35
SpamapSmmmmm XFS20:36
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add request-list nodepool command  https://review.openstack.org/44516920:36
Shrewsjeblair: ask and ye shall receive ^^^^20:36
Shrewssometimes, at least20:37
*** jamielennox|away is now known as jamielennox20:37
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove AllocatorTestCase and RoundRobinTestCase  https://review.openstack.org/44517520:45
*** Cibo_ has quit IRC20:52
jheskethMorning20:57
jeblairShrews: nice!20:57
Shrewsi think the paramiko host keys change is not working. the dsvm jobs seem to be taking quite a while20:59
jeblairunpossible21:01
Shrewsif the paramiko call doesn't return anything, we'd be stuck repeatedly building nodes21:02
Shrewswhich is what i suspect is happening21:02
SpamapSmeeting?21:02
jeblairSpamapS: in one hour21:02
SpamapSOH21:03
SpamapSDST yay21:03
jeblairyeah, that :|21:03
SpamapSit's actually good because I was double booked last week21:04
Shrewsjeblair: should we consider it an error if we can't get any host keys? that's what the code does now21:05
jeblairShrews: i think so; it is in our env.  if that's inconvenient for other users, a config option makes sense.21:06
* Shrews impatiently waits for logs21:07
jeblairthis is where the 'finger random log i ask for' thing is really going to come in handy21:10
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505521:11
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add request-list nodepool command  https://review.openstack.org/44516921:12
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove AllocatorTestCase and RoundRobinTestCase  https://review.openstack.org/44517521:12
Shrewssilly "%s" with an int21:13
jeblairShrews: ?21:14
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505521:14
Shrewsnm. silly , instead of %   :)21:14
jeblairah, that makes more sense :)21:15
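For readers following the exchange above about `,` versus `%`: a minimal Python sketch of that class of slip. The message text and the `node_id` variable are hypothetical and are not taken from change 445055.

```python
node_id = 42  # hypothetical integer node id

# Using "," instead of "%" builds a tuple; no formatting happens.
wrong = ("Node %s has no host keys", node_id)

# Using "%" formats the string; "%s" is fine with an int.
right = "Node %s has no host keys" % node_id

print(type(wrong).__name__)
print(right)
```

Note that "%s" accepts an integer without complaint, which fits the correction in the chat: the format specifier was not the problem, the operator was.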
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add request-list nodepool command  https://review.openstack.org/44516921:15
jeblairShrews: one more time though21:15
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove AllocatorTestCase and RoundRobinTestCase  https://review.openstack.org/44517521:15
jeblairShrews: inline comment21:15
Shrewsgah21:15
jeblairquotes optional21:16
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505521:16
* Shrews cries21:16
jeblairShrews: you're getting very fast at that21:16
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add request-list nodepool command  https://review.openstack.org/44516921:16
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Remove AllocatorTestCase and RoundRobinTestCase  https://review.openstack.org/44517521:17
Shrews*phew* seems to be passing now21:53
pabelangernice21:54
jeblairit's zuul meeting time in #openstack-meeting-alt22:00
pabelangerclarkb: fungi: want to review / approve 445055 ? Since you commented on it22:02
Shrewsb 1722:03
*** jamielennox is now known as jamielennox|away22:03
Shrewsgah22:03
fungipabelanger: maybe post-meeting22:03
*** jamielennox|away is now known as jamielennox22:07
*** jamielennox has left #zuul22:39
*** jamielennox has joined #zuul22:39
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add generic tox job (multiple playbooks)  https://review.openstack.org/43828123:02
pabelanger\o/23:03
pabelangerI'll rebase tomorrow the rest of the stack23:03
*** hashar has quit IRC23:05
jheskethpabelanger: did you abandon the other approach?23:06
Shrewspabelanger: so, those instances without "nodepool_provider_name" in infracloud-chocolate (from the launcher log) will need to be manually deleted because we never put that info into those instances23:07
Shrewspabelanger: they're leaked nodes (no ZK entry)23:07
pabelangerjhesketh: not yet23:08
pabelangerShrews: actually! Those are nodes from nodepool.o.o (production)23:08
pabelangerso, things are working as expected23:08
pabelangeraside from the log spam23:08
Shrewspabelanger: oh, good23:08
pabelangerwill loop back, need to help with family23:09
Shrewspabelanger: i think there's a problem with the pause-at-quota algorithm now. after 7pm here so i really need to go make dinner. can take a look tomorrow if we can wait that long23:09
Shrewsjeblair: fyi ^^^23:10
* Shrews aways for food23:11
jlkSpamapS: reading through the security spec, what about rkt?23:11
jlkdoesn't even appear to require a daemon23:11
jamielennoxjlk: daemonless is good, but i don't think which container format changes any of the actual security principles23:14
jlkright, I meant to address the complexity of Docker23:14
SpamapSjlk: I've heard of it, but don't really know how it works.23:15
jlkI'm reading up on it23:15
jlkseems to approach security differently23:15
jlkhttps://coreos.com/rkt/docs/latest/devel/architecture.html23:15
SpamapSand yes, daemonless and simple would be nice.23:16
jamielennoxcomplexity of docker generally refers to networking though right?23:16
SpamapSno the whole thing is getting big23:16
SpamapSstorage drivers23:17
SpamapSnetworks23:17
SpamapSdockerhub23:17
jlkhttps://coreos.com/rkt/docs/latest/rkt-vs-other-projects.html might be of help too23:17
SpamapSjust lots of moving pieces to exploit23:17
jamielennoxoh, right, i understand they've jammed a whole bunch of crap in there23:17
jlkit's multiple daemons now23:17
jlkdocker daemon to take API calls, and containerd to actually run the containers23:17
jamielennoxi was just thinking of it from a usability and config perspective23:17
jamielennoxShrews: any chance there is an upgrade doc for nodepool -> v3?23:19
SpamapSjlk: heh, I should probably also just mention systemd-nspawn23:20
jlkIf we want to go really low level, runC is an option too23:21
jlksince we basically just want shell in a box23:21
jamielennoxSpamapS: i know you meant that as a joke, but for OS level containment/cgroups etc, systemd-nspawn is probably a reasonable choice23:22
jamielennoxbikeshed = orange23:22
clarkbSpamapS: I was actually going to look into it and then probably suggest it (the systemd thing)23:22
clarkbSpamapS: since it does userspace containering now too iirc23:22
jamielennoxSpamapS: embrace the systemd23:23
fungiShrews: on 445055 i guess get_remote_server_key() returns a binary blob rather than the typical ascii formats ssh-keyscan emits?23:24
clarkband apparently can be used without having systemd proper on the system?23:25
SpamapSThe thing is23:26
SpamapSwe need mixed readonly/immutable/writable dirs23:27
SpamapSjamielennox: NEVER23:27
SpamapSthat's impossible23:27
SpamapS;)23:27
SpamapSso the reason docker/lxc/bubblewrap are nice is they have the facilities to help manage the dirs underneath the playbook23:27
SpamapSbecause you need git trees (readonly but dynamic), ansible+dependencies (readonly, immutable and reusable), and scratch space for transfers/artifacts/etc (writable)23:28
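The three classes of directory described above could be expressed as bind mounts. Below is a rough sketch, assuming bubblewrap's `--ro-bind` and `--bind` options; the function name, the in-sandbox paths, and the example host paths are all hypothetical, and a real invocation would need more (system directories, namespace options) than shown here.

```python
from typing import List, Sequence


def build_bwrap_argv(git_dir: str, ansible_dir: str, scratch_dir: str,
                     command: Sequence[str]) -> List[str]:
    """Build a bwrap command line for one playbook run (illustrative only)."""
    argv = [
        "bwrap",
        # git trees: read-only inside the sandbox, prepared fresh per job
        "--ro-bind", git_dir, "/work/src",
        # ansible and its dependencies: read-only and shared across jobs
        "--ro-bind", ansible_dir, "/opt/ansible",
        # scratch space for transfers and artifacts: writable
        "--bind", scratch_dir, "/work/scratch",
    ]
    argv.extend(command)
    return argv


# Hypothetical host paths, for illustration.
example = build_bwrap_argv(
    "/var/lib/zuul/jobs/1234/src",
    "/usr/lib/zuul/ansible",
    "/var/lib/zuul/jobs/1234/scratch",
    ["ansible-playbook", "playbook.yaml"],
)
print(" ".join(example))
```

This only builds the argument list; it does not run anything, so it says nothing about whether the resulting sandbox is sufficient for untrusted playbooks.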
clarkbdo we need to enforce that though assuming the container itself is divested from the host sufficiently?23:29
jlkI was going to ask the same23:29
fungiattention citizen: trust the systemd; the systemd is your friend; your current clearance level is high programmer23:29
SpamapSyou going to build a whole chroot for every playbook execution?23:29
jlkthe "immutable" part is that you restart each time from the base container23:29
clarkbSpamapS: why not? its incredibly cheap with eg btrfs23:29
jlkrather than continuing to use a scratch space23:29
jamielennoxSpamapS: yes, the container build part is basically instant23:29
SpamapSclarkb: yes, but that's what docker does for you already.23:29
clarkbSpamapS: nspawn does it too23:30
SpamapSthe programming time and maintenance of chroot builders is not free23:30
jlkbuild the userland once, start it again and again and again23:30
clarkbSpamapS: if paired with btrfs23:30
clarkbbut yes its non zero overhead that has to be paid somewhere23:30
clarkbfwiw I thought that at PTG we decided that the git repos would be in the scratch space23:30
clarkband we wouldn't put effort into multiple classes of data there23:31
fungis/btrfs/lvm2/ for that matter23:31
SpamapSsystemd-nspawn --template interesting23:31
SpamapSclarkb: Oh, I was not present for that but it's fine.23:31
SpamapSI'd prefer that they stay readonly but that's only for clean implementation.. if somebody wants to trash the git trees in trying to break out I don't care.23:32
clarkbfungi: well nspawn specifically supports btrfs snapshots with the template flag SpamapS noted above23:32
SpamapSmy main point was just that they're not there to be written to23:32
clarkbso it's a really cheap way to have a thing23:32
fungiclarkb: oh, interesting that systemd would tie a feature to one specific fs23:32
jlkin my head, seems "easiest" to let them run amok inside the container, and not worry about read-onlying things inside the container, Unless we need to bind mount something from the host in23:32
SpamapSclarkb: --image= can make use of lvm's23:33
clarkbSpamapS: ah cool23:33
jamielennoxif executors are used once only i don't care if people trash their own git repos23:33
clarkbjamielennox: ya I think that was what we ended up deciding in the room23:34
clarkbbasically the only thing that will suffer if you do that is you so meh23:34
SpamapSSo yeah that's the other option... make zuul-executor a single-job thing23:34
clarkb:)23:34
jamielennoxclarkb: not tied, it says it can use a regular directory as a template, but in which case you involve a full copy of the directory, if btrfs it does COW23:34
jlkpop an executor, clone the code from within it, run the ansible?23:34
SpamapSbut making zuul-executor a single-use thing just punts the setup/cleaning to something else23:34
SpamapS(like kubernetes)23:34
clarkbjlk: right snapshots are supported if using btrfs23:35
clarkber jamielennox ^23:35
jlkwell..23:35
clarkbbtrfs is not required23:35
SpamapSI keep hearing btrfs gets slower and slower the longer you use it.23:35
jamielennoxstrongly recommended23:36
jlkI thought more that zuul-executord would run on the host, and it would in turn launch/remove the container processes for doing the ansible work23:36
SpamapSjlk: that's how I think it should work yes23:36
fungiso in theory you could leverage lvm2 copy-on-write snapshots with --image23:36
SpamapSfungi: correct23:36
clarkbSpamapS: that was an issue back with cephfs' implementation but I want to say that they have sorted that out and is a big reason why cephfs is no longer beta23:36
SpamapSclarkb: Oh they're back on btrfs?23:37
pabelangerjlk: yes, thats how I last heard it23:37
SpamapSThey were giving up on it 2 years ago.23:37
jlkdoesn't look like systemd-nspawn can limit cpu/memory, etc.. is that true?23:37
pabelangerbut lots of things being discussed now23:37
jamielennoxI would agree with a executord model that you request executors from23:37
SpamapSjlk: that would be weird because cgroups are about limiting cpu/memory :-P23:37
jamielennoxjlk: it starts a systemd process so i would expect it to go through that23:37
fungijlk: apply cgroups liberally and lather?23:37
clarkbSpamapS: I think their future is bluestore which is write to disk directly (but pretty sure there is intermediate use btrfs but maybe it's xfs and I am just behind the times)23:37
jlkI was just reading https://coreos.com/rkt/docs/latest/rkt-vs-other-projects.html#rkt-vs-systemd-nspawn23:38
jeblairSpamapS, clarkb: (late response for something like 10 minutes ago: yes, the git repos are in the scratch area, and i believe that's the current intent.  we can change it but i don't think we have to; they are definitely per-job and there's no compelling reason to make them readonly afaik.  if zuul pushes merges, we'll have something else do that.)23:38
Shrewsjamielennox: no23:38
Shrewsfungi: yes23:38
clarkbjlk: you can limit cpu/memory etc with nspawn23:39
clarkbjlk: but you apply the cgroup rules post container start looks like23:39
clarkbjlk: via systemctl23:39
fungiShrews: okay, cool (if somewhat strange). thanks!23:39
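On the binary-blob point confirmed above: an SSH public key blob begins with a length-prefixed key type string, so it can be turned into an ssh-keyscan-style line ("host keytype base64") with the standard library alone. The sketch below is an illustration, not the code in change 445055; the `keyscan_line` helper and the synthetic blob are made up for the example. With paramiko, the key object's `get_name()` and `get_base64()` methods should give the same two fields directly.

```python
import base64
import struct


def keyscan_line(host: str, blob: bytes) -> str:
    """Render an SSH public key blob as 'host keytype base64'."""
    if len(blob) < 4:
        raise ValueError("key blob too short")
    # The blob starts with a 4-byte big-endian length, then the key type name.
    (name_len,) = struct.unpack(">I", blob[:4])
    if len(blob) < 4 + name_len:
        raise ValueError("key blob truncated")
    key_type = blob[4:4 + name_len].decode("ascii")
    encoded = base64.b64encode(blob).decode("ascii")
    return "%s %s %s" % (host, key_type, encoded)


def ssh_string(data: bytes) -> bytes:
    """Encode bytes as an SSH wire-format string (length prefix + data)."""
    return struct.pack(">I", len(data)) + data


# Synthetic, non-functional blob for illustration only; not a real key.
fake_blob = ssh_string(b"ssh-rsa") + ssh_string(b"\x01\x00\x01") + ssh_string(b"\x00\xc0\xff\xee")
print(keyscan_line("localhost", fake_blob))
```

Storing the type and base64 text rather than the raw bytes keeps the ZooKeeper record directly usable as a known_hosts entry, though whether the change stores it that way is not stated in the log.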
jlkclarkb: ah, eww.23:39
jlkor whatever.23:39
SpamapSclarkb: yeah I recall bluestore as their path to happiness.23:39
SpamapSwind out of sails23:40
SpamapSsystemd-nspawn just not quite all the things23:41
clarkbjlk: I mean it makes sense, there is already a tool to edit cgroup rules just reuse it23:41
jamielennoxShrews: how would you feel about me splitting nodepool-webapp into an app rather than something you have to --no-webapp23:41
* SpamapS needs a thing that is all the things23:41
clarkbjlk: rewriting tools to rewrite them is silly and hard on users :)23:41
jlkclarkb: sure, it's the typical unix problem. "All these tools exist, just use them better", and then we get things like Docker, which use them better, but then gets seen as "too complex doing too many things".23:42
jlkrinse, repeat :)23:42
Shrewsjamielennox: i don't have an issue with that, but should check with jeblair and pabelanger23:42
* Shrews aways again23:42
pabelangerShrews: jeblair: jamielennox: no issue here23:42
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Record SSH public keys for new nodes in ZK  https://review.openstack.org/44505523:43
openstackgerritJamie Lennox proposed openstack-infra/nodepool feature/zuulv3: Remove the --no-delete option from nodepool  https://review.openstack.org/44524123:43
jeblairjamielennox: if you're talking v3: yeah, i think that makes sense -- with the launchers fully distributed, it doesn't make sense for it to be combined with them anymore.23:46
jamielennoxjeblair: yep, i'm talking v323:47
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Encapsulate determining the event purpose  https://review.openstack.org/44524223:47
SpamapSrealistically this rkt stage1 with systemd-nspawn has all the cgroupy/namespacey things.. and has a really nice simple interface23:49
SpamapSI like that rkt's API is mostly writing text files to disk and then executing something23:49
SpamapSI remember now meeting the CoreOS engineers early on and noting that they were _very_ excited about systemd.23:52
clarkbrcarrillocruz: jeblair I have reviewed the ansiblification of overlay networking in d-g23:54
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP Add per-repo public and private keys  https://review.openstack.org/40638223:54
clarkbSpamapS: you can also do things like have a unit that runs in a container iirc23:54
clarkbSpamapS: so it's possible that zuul-executor-container@.service is a thing that zuul-executor just starts with some args and done, check exit codes, and make barbeque23:56
clarkb(I'm hand waving)23:56
SpamapSclarkb: if that includes a prep statement that slaps together a chroot, and a post that cleans it up.. sure. :)23:57
jlkwait23:57
jlkwait wait23:57
jlk"slaps together a chroot"23:57
jlkwhy isn't that a pre-baked image?23:57
SpamapSit starts with a copy of that23:58
jlkthat's baked once a day/week/executord-restart ?23:58
SpamapSand then adds scratch space and dumps git trees in23:58
jlkokay, so yeah, I guess I'm used to higher level tooling that does this23:58
jlkwhere you say "launch this image and this process inside of it"23:58
jlkand it does that23:58
SpamapSdocker23:58
SpamapShonestly23:58
SpamapSIt's a whole big hot mess23:58
jlkrkt can do that too23:58
clarkbI mean thats what nspawn does too23:58
SpamapSnot really23:59
SpamapSnspawn is giving me the hand-waves on cgroups23:59
clarkbSpamapS: it doesn't build the image for you, but you can give it one23:59
SpamapSI get that there's a way23:59
pabelangerclarkb: ideally, nodepool-builder can build it, and some how pass it back to zuul-executor23:59
clarkbyou basically say "use this thing over here that might be one of these specific 'image' types"23:59
