Monday, 2017-07-24

*** dkranz has quit IRC00:16
clarkbyou could dual license but that would only make things more complicated I think02:00
dmsimardclarkb: (huge coincidence that I saw you reply just now) yeah, that's sort of why I almost don't want to bother with ARA02:02
dmsimardconsidering there hasn't been a lot of contributors (yet), it's not under CLA, it's not under openstack foundation governance, etc.02:02
dmsimardIf you dual license, it sort of becomes ambiguous, confusing and you have to be careful about what you import where so you don't taint in the wrong direction..02:04
dmsimardmordred: btw thanks for http://lists.openstack.org/pipermail/openstack-dev/2017-April/115013.html02:06
* dmsimard totally not switching from uuid primary keys to ids right now02:06
*** jesusaurum has quit IRC03:41
mordreddmsimard: you're welcome! and totes -although I _do_ recommend switching at some point - it'll make you happier with larger installs03:57
mordreddmsimard: and yah - there's no reason for you to not just make ARA GPL if you have the agreement from all of the people who have contributed patches ( just make sure you actually have agreement from their employers, since most people don't individually have the legal authority to agree)03:58
mordreddmsimard: only matters for copyrightable patches - https://review.openstack.org/#/c/414381/1/ara/webapp.py, for instance, I don't think you need to worry about03:59
mordreddmsimard: from looking at stackalytics, it looks like you have 17 commits you need to look at, determine if they are completely trivial and if not contact the author for permission. it would be 'best' to make the patch to switch to GPL and then get each author whose permission you need to +1 that commit04:02
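For context on the primary-key switch mordred recommends above, a minimal SQLAlchemy sketch of the shape of that change (model and column names are illustrative, not ARA's actual schema):

```python
import uuid

from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Playbook(Base):
    __tablename__ = 'playbooks'

    # Integer surrogate primary key: new rows append to the end of
    # InnoDB's clustered index instead of splitting pages randomly.
    id = Column(Integer, primary_key=True, autoincrement=True)
    # The uuid stays as a unique secondary column so external
    # references (URLs, API clients) keep working.
    uuid = Column(String(36), unique=True, nullable=False,
                  default=lambda: str(uuid.uuid4()))
```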
*** bhavik1 has joined #zuul04:37
*** bhavik1 has quit IRC05:57
*** isaacb has joined #zuul06:17
*** hashar has joined #zuul06:29
*** amoralej|off is now known as amoralej06:45
*** yolanda_ has joined #zuul07:05
*** yolanda_ has quit IRC07:06
*** 7ITABD5MB has joined #zuul07:06
*** 07IAALFJ9 has joined #zuul07:06
*** 07IAALFJ9 has quit IRC07:07
*** 7ITABD5MB has quit IRC07:08
*** yolanda_ has joined #zuul07:08
*** yolanda_ is now known as yolanda07:11
*** isaacb has quit IRC07:15
*** lennyb has quit IRC07:19
*** isaacb has joined #zuul07:23
*** lennyb has joined #zuul07:32
jamielennoxhey is there a zuul logo/mascot i can put in a slide?07:37
jamielennoxi feel like i've seen one before07:40
jamielennoxmordred: as you're in this tz and might be here ^07:40
mordredjamielennox: I'm not sure we've produced one of those yet07:51
*** isaacb has quit IRC09:15
*** isaacb has joined #zuul09:16
*** amoralej is now known as amoralej|brb10:08
*** jkilpatr has quit IRC10:45
*** jkilpatr has joined #zuul11:02
*** hashar is now known as hasharLunch11:13
*** amoralej|brb is now known as amoralej11:34
*** dkranz has joined #zuul11:50
*** hasharLunch is now known as hashar13:01
*** isaacb has quit IRC13:58
*** isaacb has joined #zuul14:12
dmsimardFor Zuul v3 secrets ( https://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html#secrets )14:30
dmsimardHow would you pass the equivalent of a credentials-binding for a file rather than a string ? Encrypt the base64 or something ?14:30
pabelangerdmsimard: just confirming, you want to encrypt the whole file?14:38
dmsimardpabelanger: currently jenkins allows you to encrypt a text (string) or a file14:39
dmsimardand then at runtime it sends that file to the slave, decrypts it and makes it available as an env var14:39
pabelangerYa, I don't think we support files ATM. But you should be able to store file_contents as encrypted blob then template it14:40
pabelangerthat's what we'd plan to do with SSH private keys14:41
jeblaironly up to 4096 bits14:57
jeblairor, actually, i think a bit less than that14:57
jeblair4096, according to the docs: https://docs.openstack.org/infra/zuul/feature/zuulv3/user/encryption.html14:58
jeblairdmsimard: ^14:58
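To make the file-as-secret pattern above concrete, a sketch under these assumptions: the secret, variable, and path names are invented, the ciphertext comes from zuul's secret-encryption tooling, and the plaintext (after base64-encoding) has to fit within the 4096-bit RSA limit jeblair mentions:

```yaml
# In the job config: the file content is base64-encoded first, then
# encrypted into a secret blob (all names here are illustrative).
- secret:
    name: site_ssh_key
    data:
      private_key_b64: !encrypted/pkcs1-oaep |
        <ciphertext produced by zuul's secret-encryption tooling>

# In the job's playbook: decode the blob and write the file back out.
- name: Install the decrypted key file
  copy:
    content: "{{ site_ssh_key.private_key_b64 | b64decode }}"
    dest: "{{ ansible_user_dir }}/.ssh/id_rsa"
    mode: 0600
  no_log: true
```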
*** isaacb has quit IRC15:08
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove ZUUL_PROJECT  https://review.openstack.org/48625115:19
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove ZUUL_UUID  https://review.openstack.org/48625215:19
pabelangerjeblair: mordred: 485824 should be a straightforward review for zuul-jobs15:36
jeblairpabelanger: +3.  anything else i should look at?15:37
pabelangerjeblair: just that for now, thanks. Working on more refactor patches this morning15:38
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: testing  https://review.openstack.org/48666515:38
openstackgerritMerged openstack-infra/zuul-jobs master: Remove nodepool DIB specific logic  https://review.openstack.org/48582415:40
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Remove .txt suffix from tox logs  https://review.openstack.org/48666515:45
*** hashar is now known as hasharMeeting15:52
leifmadsenjust to confirm, master branch == zuul v2.5 and all v3 work still on feature/zuulv3 ?16:01
pabelangeryes16:03
leifmadsenthx16:05
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: do not merge  https://review.openstack.org/48667916:06
*** bhavik1 has joined #zuul16:10
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: do not merge  https://review.openstack.org/48667916:14
*** bhavik1 has quit IRC16:20
openstackgerritMerged openstack-infra/zuul-jobs master: Remove .txt suffix from tox logs  https://review.openstack.org/48666516:22
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Allow loading additional variables file for site config  https://review.openstack.org/44773416:30
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: do not merge  https://review.openstack.org/48667916:44
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: Implement autohold  https://review.openstack.org/48669216:45
pabelangerjeblair: I'd like to restart zuulv3 to pick up latest logging improvements16:47
jeblairpabelanger: all yours16:54
Shrewsjeblair: so, before 692 ^^^ starts getting into the actual meat of the change, curious as to how you see the in-memory autohold list being managed. Like, do we delete the project/job from the list after the first hold?16:55
Shrewsjeblair: also, do we need to specify tenant?16:57
pabelangerzuulv3 restarted16:58
jeblairShrews: lookin16:59
Shrewswell, not much to see there yet. it's just the beginnings of plugging into the scheduler  :)17:00
jeblairShrews: (well, that also includes looking at what i wrote in storyboard so i sound like i know what i'm talking about)17:01
Shrewslol17:01
openstackgerritPaul Belanger proposed openstack-infra/zuul feature/zuulv3: Log an extra blank line to get space after each skip  https://review.openstack.org/48669817:04
jeblairShrews: ah ok!  good questions! :)  in nodepool v0, we tell it how many failed nodes it should accumulate for a given job.  i think we default to 3.  so maybe we should do that here -- add an extra cmdline argument to specify the count.17:04
jeblairShrews: in v0, nodepool puts a note in the 'comment' field in the node table in the db like "auto held for job foo".  it counts those to figure out if it has met the limit17:06
jeblairShrews: we could do something similar in v3, or we could actually add a field to the zk node rec for this purpose.  like "zuul_job" or something.17:06
Shrewsjeblair: ah, ok.17:07
jeblairShrews: i think maybe once we've hit the limit, drop the entry from zuul's in-memory list?  we don't do that in nodepool v0, but i think this might be a better behavior.17:07
jeblair(only fungi is good at remembering to clean up autoheld nodes :)17:08
jeblairShrews: and yes, we need to specify tenant as well17:08
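A rough sketch of the bookkeeping jeblair describes - illustrative only, not Zuul's actual implementation - with requests keyed by tenant/project/job and dropped once the hold count is exhausted:

```python
class AutoholdRegistry:
    """In-memory autohold requests, keyed by (tenant, project, job)."""

    def __init__(self):
        self._requests = {}  # (tenant, project, job) -> holds remaining

    def addRequest(self, tenant, project, job, count=3):
        # count mirrors nodepool v0's default of accumulating 3
        # failed-node sets per job.
        self._requests[(tenant, project, job)] = count

    def shouldHold(self, tenant, project, job):
        """Called when a build fails; True means hold its nodes."""
        key = (tenant, project, job)
        remaining = self._requests.get(key, 0)
        if remaining <= 0:
            return False
        if remaining == 1:
            # Limit reached: drop the entry so held nodes stop piling up.
            del self._requests[key]
        else:
            self._requests[key] = remaining - 1
        return True
```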
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: do not merge  https://review.openstack.org/48667917:09
*** harlowja has joined #zuul17:09
Shrewsjeblair: great. thanks.17:10
jeblairShrews: and the project name should obey the new convention we're establishing -- it should be a fully-qualified canonical project name (ie, git.openstack.org/foo/bar) if that's required to disambiguate it from another similarly named project, or if it's unique, it can just be "foo/bar".  the Tenant.getProject method will take care of all that for you, so you can just treat it as an opaque string and hand it off to that method to get a project back (or an error).17:10
Shrewsah. yeah, i suppose i should use that to validate the input17:12
jeblairShrews: i wouldn't try to do much local input validation -- just pass it over the wire and validate it on the zuul-scheduler side, then return errors from that if there are any.  i think most of the other methods work that way.17:13
Shrews*nod*17:13
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Create tox_environment_defaults variable for tox based jobs  https://review.openstack.org/48667917:17
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Create tox_environment_defaults variable for tox based jobs  https://review.openstack.org/48667917:20
pabelangerjeblair: mordred: okay, so I think we are ready to bike shed on https://review.openstack.org/#/q/topic:tox_environment_defaults17:28
pabelangerthat gives us a way to setup tox defaults, but allows anybody to also override them17:29
* fungi doesn't feel like he does a particularly excellent job of remembering to delete his held or autoheld nodes17:29
pabelangermordred: jeblair: I'll reserve comments until you've had a chance to look17:31
jeblairfungi: then the rest of us are even worse off!17:33
fungiyikes17:34
leifmadsenare there any documentation patches, especially around setting up zuul w/ github (and just generally getting started) that I can review/test?17:37
jeblairleifmadsen: nothing in flight at the moment, but we do have some stuff merged. all docs:  https://docs.openstack.org/infra/zuul/feature/zuulv3/17:39
leifmadsenthanks, reading through now, looks like I'll have to do some code digging17:40
jeblairleifmadsen: the administrators guide has things for someone setting up zuul: https://docs.openstack.org/infra/zuul/feature/zuulv3/admin/index.html17:40
leifmadsenI generated the latest stuff locally17:40
jeblairleifmadsen: there are two big weak spots we know about:17:40
jeblairleifmadsen: a good install HOWTO.  we want to have a playbook to help with that.17:41
leifmadsenjust remember that playbooks are not documentation :)17:41
jeblairleifmadsen: exactly.  we still need everything to be fully documented.  but "i just want to see it run" is never going to be quick and easy with a distributed system, so it'll be nice to have both.  :)17:42
SpamapSleifmadsen: You might be able to glean some info from BonnyCI's deployment ansible, called hoist... which deploys pointed at github17:42
SpamapSleifmadsen: https://github.com/BonnyCI/hoist17:42
jeblairleifmadsen: and we know there's some stuff missing in the github docs about how to actually set up the webhooks/triggers/etc in github's interface itself.17:43
SpamapSThere's still stuff for v2.5 in there but v3 works17:43
leifmadsenyea, mostly interested in v3 with github events as I'm starting a comparison / review between zuulv3 and prow17:43
leifmadsenand just understanding how both work, etc17:43
ShrewsSpamapS: lol @ hoist. i'm sensing a theme17:44
Shrews"mateys-ahoy" ... theme confirmed17:44
leifmadsennautical name theme definitely a k8s style thing :)17:45
SpamapSShrews: click 'Projects' for a hearty flagon of pirate humor.17:45
SpamapSwell, org projects17:46
SpamapShttps://github.com/orgs/BonnyCI/projects/117:46
SpamapSWe don't groom the backlog.. we swab it. ;)17:46
Shrews404'd on that17:46
jeblairleifmadsen: please let us know about any other missing/confusing docs17:47
SpamapSOh I wonder if that's org-only :-P well it's our scrum board and we named it Poop Deck. ;-)17:48
* fungi is _not_ swabbing the poopdeck17:49
jeblairSpamapS: what's the status of bonnyci/charts?17:51
SpamapSjeblair: it was a spike by jamielennox .. not sure how far he got.17:52
jeblairah, thus the "20 days ago"17:52
SpamapSWe're being compelled to move our stuff off our openstack cloud which will be shutdown soon, so we were going to see if we could use that to deploy onto BlueMix k8s17:52
SpamapS(and get nodes from some public cloud vendor)17:53
jeblairgotcha.  it'll be nice to have helm charts too.17:55
SpamapSI agree, it's a good fit I think17:57
SpamapSI was actually also going to play with Habitat17:57
SpamapSbut.. distractions abound17:58
Shrewssquirrel!17:58
jeblairleifmadsen: fyi, right now we're heavily focused on prepping to move openstack to zuul, hopefully in a little over a month.  we're working on a shared job library so that not everyone has to write their own version of a "run $language unit tests" job, and building openstack's installation on top of that.  and of course, fixing any issues that surface as part of that.17:59
leifmadsenwell, I'll just be over here toiling on trying to get it working as a newbie :)17:59
jeblairleifmadsen: cool, just wanted to give you some context18:02
adam_gv2.5 problem, anyone have any tips for debugging an issue where a node sometimes gets re-used for two changes? it looks like zmq msgs are being processed correctly, but i'm watching nodepool happily hand out a USED node after a previous job has completed. it's fairly easy to reproduce in our env w/ a loaded queue and triggers being delivered in quick succession18:02
pabelangerhttp://git.openstack.org/cgit/openstack/ansible-role-zuul should get you most of the way to zuulv3. but I haven't tested it with github integration18:02
jeblairadam_g: are you sending OFFLINE_NODE_WHEN_COMPLETE=1 as a job parameter?18:04
adam_gjeblair: no, not afaics18:04
adam_gshould i be?18:04
pabelangerwas just going to ask that18:05
jeblairadam_g: yes18:05
adam_gi'll give that a shot18:06
jeblairadam_g: remember, the v2.5 launcher is basically emulating jenkins, so nodes normally just stay attached to the "master".18:06
jeblairadam_g: so that's emulating the thing we added to the gearman plugin to take a node offline when the job is done.18:06
*** amoralej is now known as amoralej|off18:07
adam_gjeblair: ok, so it happens to work w/o that setting because the deleter eventually kicks in after DELETE_DELAY ?18:07
jeblairadam_g: yes.  this addresses that race condition.18:08
adam_gjeblair: great18:08
pabelangeryou should be able to reuse our openstack_functions.py python-file and setup the following regex: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul/layout.yaml#n111218:09
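The relevant part of that parameter function is tiny; roughly (a sketch of what openstack_functions.py does - see the links above for the real wiring via layout.yaml):

```python
def set_node_options(item, job, params):
    # Jenkins-style parameter read by the v2.5 launcher: take the node
    # offline as soon as the job completes, so it is never handed out a
    # second time while waiting on the nodepool deleter.
    params['OFFLINE_NODE_WHEN_COMPLETE'] = '1'
```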
adam_gjlk: jamielennox SpamapS ^ look for a hoist patch to apply this, surprised we didn't see this more often w/ our bonny jobs at peak working hours18:13
SpamapSadam_g: "peak" ;-)18:13
SpamapSadam_g: actually it's entirely possible our jobs were happy to run again without breaking maybe18:13
jeblairSpamapS: that's possible, but even so, if nodepool decides to delete the node mid-run, that's also, erm, problematic.18:21
jeblairSpamapS: though, actually, not as much as it could be... because zuul is likely to reschedule the job in that case18:21
jeblairSpamapS: so there's a pretty convincing explanation for how it could go unnoticed.18:21
jeblair"cloud node disappearing out from under me" is something zuul is designed to handle.  even if it's self-inflicted.  :/18:22
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584018:25
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584018:30
SpamapSjeblair: That's a bit schizophrenic, but I like that we have coping strategies. ;)18:38
jeblairSpamapS: "stop hitting yourself!"18:50
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584018:51
*** hasharMeeting is now known as hasharDinner18:56
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584019:00
SpamapSjeblair: perhaps all distributed systems problems can be boiled down to sibling rivalry tropes. Kerberos key exchange problems might be "I know you are but what am I?"19:08
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584019:09
jeblairSpamapS: this is your chance for the big time:  No results found for "i know you are but what am i algorithm".19:12
SpamapSjeblair: It's too generic to patent. :)19:14
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: Implement autohold  https://review.openstack.org/48669219:21
*** hasharDinner has quit IRC19:23
Shrewsjeblair: When you have a moment, looking at https://review.openstack.org/#/c/486692/2/zuul/scheduler.py , I know you said not to do much validation, but am I trying to do too much there? My thinking is that returning False (which I hope will mean job failure????) would be a friendlier way to tell the user "nope, couldn't do the hold".19:25
Shrewswithout those checks, we could just fallback to the less friendly exceptions that might occur b/c of invalid things19:27
leifmadsenis there an example tenant configuration for the github driver somewhere I could peep at?19:30
leifmadsenoh might have just figured it out (of course, right after I ask)19:32
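For reference, a minimal tenant stanza of the kind being asked about might look like this (assuming a connection named "github" is defined in zuul.conf; the org/project names are illustrative):

```yaml
- tenant:
    name: example-tenant
    source:
      github:
        config-projects:
          - example-org/zuul-config
        untrusted-projects:
          - example-org/demo-project
```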
Shrewsjeblair: oh, doesn't look like returning False is enough to signal that. Would have to throw an exception. bummer19:34
Shrewsguess i could just 'raise Exception()' instead19:36
jeblairShrews: yeah, all the current errors are job exceptions.19:36
jeblairShrews: take a look at handle_enqueue in rpclistener19:36
jeblairShrews: it does input validation and returns nice error exceptions that indicate the problem19:37
SpamapSleifmadsen: helps to get things out of your own head :)19:38
Shrewsjeblair: perfect. thx19:38
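The pattern being pointed at, sketched loosely (attribute and method names here approximate Zuul's internals rather than quoting them): do minimal work client-side, resolve and validate on the scheduler side, and surface failures as gearman work exceptions:

```python
import json


def handle_autohold(self, job):
    args = json.loads(job.arguments)
    tenant = self.sched.abide.tenants.get(args.get('tenant'))
    if tenant is None:
        # Errors travel back to the client as work exceptions.
        job.sendWorkException('Invalid tenant: %s' % args.get('tenant'))
        return
    # Tenant.getProject resolves short or fully-qualified canonical names.
    (trusted, project) = tenant.getProject(args.get('project'))
    if project is None:
        job.sendWorkException('Invalid project: %s' % args.get('project'))
        return
    self.sched.autohold(tenant, project, args.get('job'))
    job.sendWorkComplete(json.dumps(True))
```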
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing into fetch-testr-output  https://review.openstack.org/48584020:43
*** dkranz has quit IRC20:44
*** jkilpatr has quit IRC21:04
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing into fetch-testr-output  https://review.openstack.org/48584021:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Allow loading additional variables file for site config  https://review.openstack.org/44773421:50
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove state_dir from setMountsMap  https://review.openstack.org/48676621:50
jeblairtristanC: can you take a look at 486766 and make sure i'm correct about that?21:50
jeblairjamielennox: i picked up your site vars change (447734); can you take a look and let me know if that works for you?21:51
Shrewsanyone else care to review/+A the nodepool uuid change and its parent? https://review.openstack.org/484414  Already two +2's21:54
ShrewsSpamapS or pabelanger? ^^^21:55
jeblairit's zuul meeting time in #openstack-meeting-alt22:01
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul feature/zuulv3: Monitor job root and kill over limit jobs  https://review.openstack.org/48590222:07
SpamapSjeblair: good news.. ^^ now that we're synchronously killing jobs, the tests don't need to whitelist executor-diskaccountant22:07
*** jkilpatr has joined #zuul22:21
jeblairclarkb: we should have an expand-all button if we do that23:02
clarkbjeblair: ++23:02
jamielennoxclarkb: in counterpoint though, in 99% of cases where a test fails (and you're not on the -infra team), it's not the node's fault and all i really care about is the output of my tox23:02
clarkbpabelanger: I left a review on one of your tox playbook changes23:02
jeblaircause, yeah, we need to be able to see everything, but we do also have a problem in that right now, the actual error is usually right in the middle of the log.  with a bunch of ignorable errors below it!  :)23:03
jamielennoxi'm not saying remove it, but debugging for example the pep8 jobs in projects involves skipping 100s of lines of setup to find the actual console output23:03
pabelangerclarkb: thanks, replied23:03
clarkbjamielennox: I ^F error, which breaks in the collapsed style setup23:03
pabelangerclarkb: FWIW: I do not like that patch myself. But need a good way to support all the paths for tox_environment23:03
jamielennoxanyway we can deal with the UX later, this is an awesome start23:04
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Create tox_environment_defaults variable for tox based jobs  https://review.openstack.org/48667923:05
clarkbpabelanger: I'm having a hard time parsing that last message :)23:05
pabelangerclarkb: so, we had a discussion last week about no defined variable is better than a defined empty variable23:06
pabelangerwhen it comes to playbooks23:06
clarkbpabelanger: does environment: {} and environment: omit behave differently?23:06
jamielennoxjeblair: scrolling back re BonnyCI/charts, it largely works - I've definitely got it running jobs, and am currently still struggling with getting the right secrets in place for uploading logs, which is a problem of the non-kubernetes infrastructure23:06
pabelangerclarkb: yes, omit would not pass environment to the task.23:07
pabelangerbut {} would be passed23:07
clarkbpabelanger: right but does that behave differently?23:07
SpamapSjamielennox: at least the tox jobs that have subunit give you the nice HTML breakdown though. ;)23:07
jeblairjamielennox: oh nice.  i mean, not the struggling, but the rest of it.  :)23:07
jamielennoxjeblair: the main concerns are that it is more difficult to debug, and if you get for example the scheduler pod restarting then you end up in a really odd state23:07
SpamapSmaybe we should make pep8 run through subunit23:07
jamielennoxso i sort of stopped when all the option changes happened23:07
clarkbpabelanger: if it does then I would worry that setting vars would not do what we want either because we still want to overlay with the system defaults right? so the three layers would be system defaults, tox defaults, playbook explicit env23:08
jeblairSpamapS: pep8 is on my short list of things to move to line-review-comments once we add that :)23:08
jeblairjamielennox: anything about site-vars we didn't touch on in the meeting?23:08
SpamapSjeblair: mmmmmmmmmmmmmmmmmmmmmmmmmmm23:08
jamielennoxthere's a few problems that really require coordination with putting code into zuul itself - which IMO makes it a post v3 thing23:08
* SpamapS dreams of line review comments23:08
clarkbpabelanger: stuff like LANG and so on we likely want to inherit from the system? (which is current zuulv2.5 behavior iirc)23:09
jamielennoxjeblair: all i've looked at at the moment is the executor/server file and it seems to do the same thing23:09
pabelangerclarkb: I don't know if there is a difference, but today when using the shell command, we don't pass empty environment for tasks.  So, need to test23:09
jamielennoxjeblair: at the moment we're not using it because i got sick of rebasing the patch :P23:09
pabelangerclarkb: right, we don't overwrite them23:09
pabelangerunless somebody passed LANG into tox_environment23:09
SpamapSjamielennox: pod restarting seems like something that k8s should have facilities for doing carefully.23:09
SpamapSisn't there a way to tell k8s "only one of these ever" ?23:10
clarkbpabelanger: right but if you pass environment: {} would that overwrite system default env?23:10
jeblairjamielennox: yep.  i didn't change anything substantial.  but i wrote docs and tests -- i mostly wanted to make sure we knew what the story was with precedence.23:10
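The shape of the site-variables feature, sketched (file path, option name, and variable names are assumptions based on the review, not necessarily what finally merges): the operator points the executor at a yaml file whose variables become available to every job, with the patch's docs and tests pinning down how they interact with job-defined variables:

```yaml
# /etc/zuul/site-variables.yaml (path illustrative), referenced from
# the executor's configuration; every job sees these variables.
site_mirror_fqdn: mirror.regionone.example.org
default_log_retention_days: 30
```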
clarkbpabelanger: if not then omit and {} should be equivalent right? but using {} will reduce playbook complexity?23:10
pabelangerclarkb: well, so do we always want to pass environment for the run tox shell task? or only pass it when a variable is defined23:11
jamielennoxSpamapS: it'll restart just fine, and yes it'll only run 1, but it assumes that it should be able to move pods if it has to, but if you take down the scheduler without coordinating the other components things get weird23:11
clarkbpabelanger: well if you always pass it then you simplify the playbook significantly and assuming the behavior isn't different that seems preferable to me23:11
jamielennoxso i can't say (that i know of) whether, if you restart the scheduler, you should also restart the executors23:11
pabelangerclarkb: we can try passing environment: {}23:11
clarkbpabelanger: because then you can just combine the two dicts and then pass the result in23:11
clarkbpabelanger: you don't even need a special block you just combine them at the environment: statement23:12
jamielennoxso this is the sort of thing that just needs fixes to zuul to better reconnect gearman processes and to store some more state23:12
pabelangerclarkb: we still need logic to check if tox_environment and tox_environment_defaults are defined, but yes23:12
clarkbpabelanger: well you'd define them to default to {} so they would be defined23:12
clarkbbut yes that23:13
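Sketched out, the suggestion is to default both dicts to {} and merge them inline at the environment: keyword, so no conditional blocks are needed and the process environment is still inherited for anything not explicitly set (variable names follow the review under discussion):

```yaml
- name: Run tox
  command: "tox -e {{ tox_envlist }}"
  # Job-supplied tox_environment entries win over the defaults.
  environment: "{{ tox_environment_defaults | default({}) | combine(tox_environment | default({})) }}"
```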
jamielennoxanother thing that's annoying is that you basically need to run the nodepool-builder and the zuul-executor with --privileged for dib and bubblewrap23:13
jeblairi have to go run some errands now23:13
jamielennoxagain i think we could tune that out with a bit of dedicated effort23:13
clarkbjamielennox: dib at least essentially is privileged though23:14
clarkbjamielennox: since it can mount and write filesystems and do all sorts of fun things23:14
jamielennoxclarkb: there should be a way of providing that cap though without giving privileged, right?23:15
jamielennoxbecause we're only mounting things within the container23:15
clarkbjamielennox: aiui the reason mount is part of privileged (it can be separately given out) is that if you can mount you can mount whatever including the host fs?23:16
clarkband once you've done that you own the system23:16
*** artgon has left #zuul23:16
jamielennoxyou would need to have access to the host fs right? or is the implication you can still get that through /dev?23:17
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Create tox_environment_defaults variable for tox based jobs  https://review.openstack.org/48667923:17
jamielennoxso the classic security issue is running as root in the container and mounting directories in23:17
clarkbjamielennox: I think worst case you just create the node in /dev ?23:18
jamielennoxah, ok, didn't realize you could just recreate the node23:18
clarkbwhere worst case is "my host tried to hide it from me"23:18
jamielennoxmknod has always been magic to me23:18
clarkbits been a while since I looked into all this with the iscsi container woes23:18
clarkbbut ya mount is scary in containers23:18
jamielennoxso that'll probably affect bubblewrap as well?23:19
clarkbjamielennox: reading really quickly mknod is a default docker privilege23:19
clarkbjamielennox: so its possible this is just docker being silly too23:20
clarkbjamielennox: so if you add mount to a docker container it already has mknod and thats all you need23:20
jamielennoxyea ok, so in this case nodepool-builder i thought might be fixable, but is reasonably controlled/trusted23:22
jamielennoxrunning zuul-executor with --privileged is a big problem23:22
jamielennoxhaving said that i think part of the reason is the whole bubblewrap setuid thing23:23
jamielennoxi'm not actually sure how it works if i run the executor itself as root23:23
pabelangeryou only need root for finger today, did you change the port to something > 1024 ?23:23
jamielennoxpabelanger: yea i just put the port number up for that23:24
jamielennoxthere's a problem here that i don't fully understand23:24
pabelangerI'd like us to drop root in openstack-infra too, once we have websocket proxy23:24
jamielennoxif you don't run bubblewrap as root you generally give it setuid so it can run23:24
jamielennoxbut there is a problem (to do with user namespaces afaict) with running setuid within the container23:25
jamielennoxanyway, once i gave it --privileged it worked, and i moved on with a note to come back to the problem23:26
pabelangernot sure I understand, I'm running bubblewrap locally as non-root. I don't think I setup anything with setuid23:26
pabelangersomething, something, container?23:26
jamielennoxit's not close enough for a production use yet anyway23:26
jamielennoxpabelanger: i think the .deb puts setuid on the bin right?23:26
pabelangerHmm, need to check. I am using fedora23:27
pabelangerunless rpm did something23:27
clarkbiirc you need setuid on older kernels23:27
clarkbwhere older kernel is like anything not newer than 2 months old23:27
jamielennox-rwsr-xr-x 1 root root 47072 May  2 16:41 /usr/bin/bwrap23:27
clarkbso if using a .deb that implies ubuntu/debian which have old kernels23:27
jamielennoxthat's after install on an up to date xenial23:28
pabelanger-rwxr-xr-x. 1 root root 48904 May 26 02:32 /usr/bin/bwrap23:28
jamielennoxclarkb: yea, my understanding is that there's a kernel fix that still hasn't made it into xenial23:28
jamielennoxthat will fix the bwrap problem in particular23:29
jamielennoxbut i'm not sure why user namespaces and setuid are a problem - it's mentioned in a number of places23:29
pabelangerjamielennox: confirmed, that is how bwrap is setup on xenial23:30
pabelangerhttps://anonscm.debian.org/cgit/collab-maint/bubblewrap.git/tree/debian/rules23:31
clarkbjamielennox: I think it is because the setuid perms in a namespace will setuid to a non privileged user23:31
clarkbjamielennox: if you use the host namespace then setuid is going to use proper root and be happy23:31
jamielennoxinterestingly if it's a kernel problem then i'm not sure what happens if we flip the docker container over to centos or something because the underlying infrastructure might not be on the host23:32
jamielennoxclarkb: that's interesting because at least theoretically for this you only need to be root in that container, you're not writing anything out23:33
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Create tox_environment_defaults variable for tox based jobs  https://review.openstack.org/48667923:33
pabelangerclarkb: okay, updated^23:34
jamielennoxbut again, apparently this is something that is fixed/improved in later kernels23:34
clarkbjamielennox: except that bubblewrap is using kernel capabilities that an unprivileged in-container user won't have aiui23:34
jamielennoxso it's probably something where the practical has not yet caught up with the theoretical23:34
clarkbjamielennox: in newer kernels they made those capabilities more fine grained so that you don't need proper root like caps23:34
pabelangerand EOD for me23:35
jamielennoxclarkb: yep, we can add specific caps to the container fairly easily, which i'm ok with doing, just would prefer not to do the full --privileged23:35
clarkbjamielennox: ya though my understanding is until you have a newer kernel that basically means root, so it's probably six of one, half a dozen of the other until you can rely on newer kernels23:36
clarkbclearly we just need the future here today to solve all the problems23:36
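For completeness, a sketch of the specific-caps alternative to --privileged for the executor pod (the image name and capability list are assumptions; as discussed above, on older kernels CAP_SYS_ADMIN is close to full root anyway):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: zuul-executor
spec:
  containers:
    - name: zuul-executor
      image: bonnyci/zuul-executor   # illustrative image name
      securityContext:
        capabilities:
          add: ["SYS_ADMIN"]   # mount/namespace setup for bubblewrap
```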
jamielennoxclarkb: yea, which is how i've basically got to the point that all this is super interesting but i wouldn't feel comfortable running this in any sort of prod today23:37
jamielennoxregardless of how you lock it down23:37
jamielennoxwhich is a shame because i think having a fairly easy chart you could deploy to something like GKE would be good for adoption, but something we can look at again in future23:38
