Friday, 2020-01-31

*** rfolco has quit IRC00:22
*** rfolco has joined #zuul00:22
*** toabctl has quit IRC00:24
*** rfolco has quit IRC00:24
*** rfolco has joined #zuul00:24
*** rfolco has quit IRC00:25
*** rfolco has joined #zuul00:25
*** toabctl has joined #zuul00:27
*** tosky has quit IRC00:28
*** rfolco has quit IRC00:28
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add OpenShift SCC and functional test  https://review.opendev.org/70275800:28
*** rfolco has joined #zuul00:29
*** rfolco has quit IRC00:30
*** rfolco has joined #zuul00:31
*** rfolco has quit IRC00:33
*** rfolco has joined #zuul00:33
*** fdegir has quit IRC00:35
*** fdegir has joined #zuul00:36
*** rfolco has quit IRC00:38
*** mattw4 has quit IRC00:38
tristanC^ finally passed ci for both minikube and openshift00:57
*** jamesmcarthur has joined #zuul00:57
*** jamesmcarthur has quit IRC01:17
*** igordc has quit IRC01:43
*** jamesmcarthur has joined #zuul03:16
*** jamesmcarthur has quit IRC03:27
*** jamesmcarthur has joined #zuul03:27
*** rlandy has quit IRC04:03
*** jamesmcarthur has quit IRC04:14
*** jamesmcarthur has joined #zuul04:33
*** jamesmcarthur has quit IRC04:41
*** raukadah is now known as chkumar|rover05:17
*** evrardjp has quit IRC05:33
*** evrardjp has joined #zuul05:33
openstackgerritJan Kubovy proposed zuul/zuul master: Add spec for scale out scheduler  https://review.opendev.org/62147905:51
*** bolg has joined #zuul05:53
bolgcorvus, tobiash: just made another iteration on https://review.opendev.org/c/62147905:53
*** swest has joined #zuul06:14
mnasercorvus: +1'd from me, and +2 from tristanC so i think it's probably fine if you merge it06:15
mnasercorvus: perhaps we can setup an acl for zuul-helm-maintainers as a set that includes zuul-maint and those interested in maintaining those charts? (hi!)06:16
*** bolg has quit IRC06:38
*** persia has quit IRC07:06
*** persia has joined #zuul07:08
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Update upload-afs README  https://review.opendev.org/70516607:35
openstackgerritFelix Schmidt proposed zuul/zuul master: Fix github app authentication to work with checks API endpoints  https://review.opendev.org/70516707:37
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement github checks API  https://review.opendev.org/70516807:37
openstackgerritMerged zuul/zuul-helm master: Set ClusterIP to None for executor logs  https://review.opendev.org/70513207:44
openstackgerritSimon Westphahl proposed zuul/zuul master: wip: Add support for tracing using OpenTelemetry  https://review.opendev.org/70517007:47
*** bolg has joined #zuul07:52
*** tosky has joined #zuul08:23
*** avass has joined #zuul08:35
*** hashar has joined #zuul08:45
*** bhavikdbavishi has joined #zuul08:48
*** flaper87 has quit IRC09:19
openstackgerritSorin Sbarnea proposed zuul/zuul master: Add build history link to summary  https://review.opendev.org/70504909:47
*** bhavikdbavishi has quit IRC10:05
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement github checks API  https://review.opendev.org/70516810:17
openstackgerritTobias Henkel proposed zuul/zuul master: Add foreground option  https://review.opendev.org/63564910:21
openstackgerritTobias Henkel proposed zuul/zuul master: Add foreground option  https://review.opendev.org/63564910:26
openstackgerritTobias Henkel proposed zuul/zuul master: Deprecate -d switch for running in foreground  https://review.opendev.org/70518510:34
openstackgerritTobias Henkel proposed zuul/zuul master: Don't enforce forefround with -d switch  https://review.opendev.org/70518910:38
openstackgerritTobias Henkel proposed zuul/zuul master: Don't enforce foreground with -d switch  https://review.opendev.org/70518910:54
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519311:16
*** hashar has quit IRC11:30
openstackgerritFelix Schmidt proposed zuul/zuul master: Make reporting asynchronous  https://review.opendev.org/69125311:44
openstackgerritFelix Schmidt proposed zuul/zuul master: Make direct-push configurable on project-level  https://review.opendev.org/67710911:45
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement push job in merger  https://review.opendev.org/67711011:45
openstackgerritFelix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled  https://review.opendev.org/67711111:45
*** hashar has joined #zuul11:52
*** felixedel has joined #zuul12:06
*** rfolco has joined #zuul12:10
openstackgerritFelix Schmidt proposed zuul/zuul master: Report retried builds in a build set via mqtt.  https://review.opendev.org/63272712:14
openstackgerritFelix Schmidt proposed zuul/zuul master: Report retried builds via sql reporter.  https://review.opendev.org/63350112:14
openstackgerritFelix Schmidt proposed zuul/zuul master: Store information about gate resets in MQTT and SQL reporter  https://review.opendev.org/69667012:14
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519312:32
*** hashar has quit IRC12:40
*** hashar has joined #zuul12:44
*** hashar has quit IRC12:55
*** hashar has joined #zuul12:55
*** rlandy has joined #zuul13:08
*** jamesmcarthur has joined #zuul13:22
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519313:23
*** felixedel has quit IRC13:26
*** bolg has quit IRC13:28
*** jamesmcarthur has quit IRC13:35
*** jamesmcarthur has joined #zuul13:37
*** jamesmcarthur has quit IRC13:42
zbrcorvus: does "build history" look ok on https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_0ff/705049/2/check/zuul-build-dashboard/0ff3ac3/npm/html/build/2e6d36faf2114b749ddcfdc30e35cc05 ?14:06
zbris too close? should I put a long-dash in between maybe?14:06
*** jamesmcarthur has joined #zuul14:06
*** jamesmcarthur has quit IRC14:13
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519314:26
*** zxiiro has joined #zuul14:35
openstackgerritAntoine Musso proposed zuul/zuul master: web: humanize time durations  https://review.opendev.org/70512014:50
hashartristanC: hello, I have updated my patch to format duration on the web pages :]  https://review.opendev.org/#/c/705120/14:52
hasharused https://www.npmjs.com/package/moment-duration-format which does exactly want I wanted  \o/14:52
corvustristanC, mnaser, tobiash: i could use your help in understanding the user situation on the executor container.  as it stands now, i don't understand how anyone has every gotten this working in k8s.  :)  let me explain my confusion:14:52
hashartristanC: no rush though since I am heading out soonish. But I will be back around in two hours14:52
corvustristanC, mnaser, tobiash: it looks like the current zuul-executor images don't specify a user, therefore they run as root.  that means most things conveniently work (everything is writable!).  but if you run "rsync -a" inside bwrap as root, rsync will try to change directory ownership, but bwrap will disallow that.  the fetch-output role does this and therefore fails.14:53
corvustristanC, mnaser, tobiash: but merely setting a runAsUser on the container doesn't work, because the user isn't defined in the container and therefore defaults to a home directory of /, which is not writeable, and therefore ansible won't run at all (among other problems).14:54
tristanCcorvus: isn't that fixed by https://review.opendev.org/650246 ?  I needed that to make zuul works in openshift14:55
corvustristanC, mnaser, tobiash: it looks like in order for z-e to run in k8s at all, we need a proper user created in the container (with a writeable homedir for ansible)14:55
tobiashcorvus: our executor runs as user 1000 but in a privileged container14:55
tobiashbut we haven't migrated to the official images yet14:56
mnasermine.. works.. somehow?14:56
mnaseri def have jobs working with zuul-executor under k8s using cri-o14:56
mnaserand using those helm charts and the upstream images14:56
corvustristanC: yes, i think that would -- except, do we need a writeable home directory too?  i just see 246 creating a passwd entry14:57
corvustobiash: i think i'm in a privileged container too (i believe that's default in gke) -- do you create a user and homedir in your image build?14:57
corvusmnaser: are you perhaps not using the fetch-output role?14:57
tobiashlet me dig up our k8s spec14:57
*** jamesmcarthur has joined #zuul14:58
corvusmnaser: (or even any role that runs synchronize to pull stuff back to the executor?)14:58
mnasercorvus: isn't that pretty much a requirement to be able to push logs up?14:58
mnaserwe use swift to publish logs14:58
corvusmnaser: more or less, yes...14:58
tobiashyes, we create a user during image build14:58
tobiashwith a pre-defined uid14:58
corvustobiash: like with useradd/adduser?  so it gets a normal homedir?14:58
tobiashcorvus: http://paste.openstack.org/show/789009/14:59
mnaserim looking at a build on jan 28 on the zuul we run using those charts with logs on swift15:00
tobiashand boot.sh does some git settings (could probably go to the image build as well)15:00
tristanCcorvus: https://review.opendev.org/#/c/650246/6/tools/uid_entrypoint.sh setups the user with HOME, and the operator set it like so: https://review.opendev.org/#/c/702106/17/conf/zuul/resources.dhall@59115:00
corvusmnaser: https://gerrit-zuul.inaugust.com/t/gerrit/build/7f9e1bcd84bc498d8e61c78da0bccd28/console is the error i get  (and i was able to repro by running the command as root in bwrap)15:01
corvustristanC: what's $HOME in that?15:01
corvusoh i see15:02
corvus/var/lib/zuul15:02
mnaseri can share a build in private that works just fine15:02
corvusok, so 246 does end up with a writeable homedir because of "volume /var/lib/zuul"15:02
tristanCcorvus: exactly15:03
corvustristanC: cool, i think that would address all the problems i'm seeing...15:03
tobiashcorvus: to be more precise, our executor container starts boot.sh as root and privileged and this at the end does this:15:03
corvusnow i'm just really curious why it's working for mnaser15:03
tobiashexec runuser -u zuul -g zuul -- /sbin/tini -g -s -- /usr/bin/eatmydata -- /opt/zuul/bin/zuul-executor -d15:03
tristanCcorvus: note that this is documented in https://docs.okd.io/latest/creating_images/guidelines.html15:04
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: add support for arbritary uid  https://review.opendev.org/65024615:05
mnasercorvus: im happy to help investigate certain things inside the container if you'd like15:06
corvusmnaser: i'm wondering if something about the default security policy is different, but i don't yet know how to investigate that15:07
mnasercorvus: thats my guess, it probably has to do with the k8s that you get from gke vs what we deploy (via kubeadm)15:08
*** hashar has quit IRC15:09
corvus(wow there's some real garbage on the internet about this)15:10
*** jamesmcarthur has quit IRC15:11
*** jamesmcarthur has joined #zuul15:12
*** zbr has quit IRC15:15
*** jamesmcarthur has quit IRC15:17
corvustristanC: looks like a legit quick-start error on that change: https://zuul.opendev.org/t/zuul/build/2560c9f640e0472ebbf663c81bd401ad/log/container_logs/gerritconfig.log15:19
tristanCcorvus: oh thanks for the link, i'll fix that then15:19
corvustristanC: i guess the gerritconfig container needs a volume?15:20
corvustristanC: also left an inline note about this: https://zuul.opendev.org/t/zuul/build/2560c9f640e0472ebbf663c81bd401ad/log/container_logs/gerritconfig.log#115:20
corvusmnaser: i don't know what to investigate.  but i'll let you know if i think of something.15:22
mnasercorvus: cool, happy to run any commands to check permissions of any specific folders/paths/etc15:22
*** zbr has joined #zuul15:23
*** bhavikdbavishi has joined #zuul15:25
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: add support for arbritary uid  https://review.opendev.org/65024615:25
*** jamesmcarthur has joined #zuul15:27
clarkbdo we need rsync -a if the end result is a seift upload?15:28
clarkbwould not running with -a simplify this enough to be happy in a container as is?15:28
corvusclarkb: no, we could turn off the perms settings, but it's easier/better to just avoid running as root i think :)15:29
corvusthat was "no" to the first question and "yes" to the second :)15:29
clarkbalso why is root different than another user? or is the ideathat the ownership would match remote so no chowning with nonroot?15:30
*** pcaruana has quit IRC15:30
*** Goneri has joined #zuul15:32
*** jamesmcarthur has quit IRC15:32
tristanChm, that whoami command doesn't work15:32
tristanCwell it's always failing because of the incorrect redirection, resulting in the user being created...15:33
corvusclarkb: well if it's running as non-root, i don't think it fails on trying to chown15:33
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519315:33
corvustristanC: yes, so it's failing toward working in the usual case.  but would do the wrong thing if the user did exist.15:33
sugaarhi guys15:35
sugaarin the gerrit deployment, did you include ansible-playbook yourselves?15:35
sugaarbecuase it seems like my container can't finde it under the path writen in docker-compose15:35
corvussugaar: which container?  can you share the error?15:36
sugaarcorvus bash-4.2$ /usr/local/lib/zuul/ansible/2.8/bin/ansible-playbook /var/playbooks/setup.yaml15:37
sugaarbash: /usr/local/lib/zuul/ansible/2.8/bin/ansible-playbook: No such file or directory15:37
sugaarthat is inside gerrit15:37
*** ssbarnea has left #zuul15:38
tristanCcorvus: good thing the user does not exist then :)  what do you think of using `[[ `whoami` =~ ^[a-zA-Z]+ ]] || ...` ?15:39
sugaarcorvus and here the file, if you wanna have a look https://gitlab.com/celduin/infrastructure/celduin-infra/blob/39a877851b185f101d0c433e2887db22eec7f65e/zuul-deployment/gerrit.yaml15:40
corvustristanC: whoami works in my testing15:40
tristanCcorvus: you mean it exit(1) if the user is missing?15:41
corvustristanC: yes; 1 sec15:41
tobiashsugaar: is that a fork of the quickstart docker-compose definition?15:41
sugaarsort of yes15:41
sugaarI am triyung to make it in k8s so later on I can tune it to our needs15:41
corvustristanC: http://paste.openstack.org/show/789012/15:41
sugaarbut first I wan to make work a basic version of it15:42
pabelangercorvus: when this issue last came up for sean-k-mooney, https://review.opendev.org/633796/ was the proposed fix to devstack. But long term, was to have same user on remote node / executor to help keep rsync happy15:42
tobiashsugaar: the ansible-playbook path you posted should be like this in the executor image15:42
tristanCcorvus: ha, my bad, i was using an image already patched with the uid_entrypoint15:42
pabelangerhttp://eavesdrop.openstack.org/irclogs/%23zuul/%23zuul.2019-01-29.log.html#t2019-01-29T16:05:03 was last time chown error came up15:43
tobiashsugaar: you've probably taken that from the gerritconfig container of the quick start which uses the zuul-executor image and not the gerrit image15:43
sugaaroh that is tru15:43
pabelangerwell, one of the last times15:43
sugaarmy bad15:43
sugaarthanks tobiash15:43
tobiashno problem15:43
corvuspabelanger, clarkb: hrm, maybe we should start adding user/group: no to sync tasks?15:44
openstackgerritTristan Cacqueray proposed zuul/zuul master: uid_entrypoint: fix redirection error  https://review.opendev.org/70523915:44
*** bhavikdbavishi1 has joined #zuul15:44
*** jamesmcarthur has joined #zuul15:45
tobiashcorvus: I think that's a good idea anyway because setting wrong uid on files might block the deletion of the job dir15:45
corvustristanC: wait, we don't need to share the /var/lib/zuul volumes15:45
*** bhavikdbavishi has quit IRC15:46
*** bhavikdbavishi1 is now known as bhavikdbavishi15:46
corvustristanC: i think we just need to add it to the gerritconfig container15:46
*** ofosos has joined #zuul15:46
corvustristanC: but wait -- gerritconfig is zuul-executor; it should have a volume15:47
openstackgerritTristan Cacqueray proposed zuul/nodepool master: uid_entrypoint: fix redirection error  https://review.opendev.org/70524115:47
tristanCcorvus: yeah, i don't get why docker-compose didn't used the VOLUME defined in the zuul image...15:47
corvustristanC: yeah, i'm not sure i understand the original error anymore15:48
*** pcaruana has joined #zuul15:49
pabelangerIt isn't clear for me where to look at google zuul configuration for zuul / zuul-executors, could somebody please share15:52
*** felixedel has joined #zuul15:53
corvuspabelanger: it's still mostly on my workstation until i can get something that actually works.  https://gerrit-review.googlesource.com/c/zuul/ops/+/252316 is close though.15:53
pabelangerthanks!15:55
corvustristanC: are we sure that a VOLUME entry will actually have the right perms?  maybe it needs to be chowned?15:56
*** jamesmcarthur_ has joined #zuul15:58
pabelangercorvus: if I understand,  zuul-config volume, will contain the zuul.conf file? That isn't generated by some template, but something you'd created (as secret)?15:59
pabelangerwas trying to see what the trusted_rw_dirs / trusted_rw_paths were set too16:00
corvuspabelanger: correct, that's entirely local right now because it has the mysql password in it.  once i have a working system, i can generate it with zuul secrets.16:01
*** jamesmcarthur has quit IRC16:01
*** chkumar|rover is now known as raukadah16:02
pabelangercorvus: ack, np.16:02
corvuspabelanger: the only trusted_ option i have set in trusted_ro_paths=/authdaemon/token16:02
corvuspabelanger: that's just for a special google service account token thing16:02
felixedel@zuul-maintainers We have three patches https://review.opendev.org/#/q/topic:bloeffler-report-reties+(status:open+OR+status:merged)that are pending for a long time now and I would like to ask if you could provide some feedback :)They are all about providing additional information in the MQTT reporter which we are using to get some statistics about16:03
felixedelthe buildsrunning in our Zuul CI instance. The first patch https://review.opendev.org/#/c/632727/16 is IMHO consistent as is and could be integrated.For the second one I did some adaptions to incorporate the requested changes (store the retry_builds in a different table thanthe normal builds), but at second glance this doesn't quite look right - as16:03
felixedelboth contain the same data structure in the end.Please see my comment on patchset 14 of https://review.opendev.org/#/c/633501/. I still kept the change to get your opinion about this.If we decide to go this way and store the retry_builds in a different table, I will adapt the child change accordingly (which isabout storing gate resets in the same16:03
felixedelmanner).16:03
pabelangercorvus: is there the logs are going, some volume in k8s?16:03
corvuspabelanger: no, uploaded to gcs16:03
pabelangercorvus: do you have default_username=root in zuul.conf / executor section?16:11
corvuspabelanger: no; nodepool provides the username of zuul16:12
pabelangercorvus: ack, and zuul-executor is running as root? Or as zuul user16:13
corvuspabelanger: as root (that's the discussion earlier)16:14
pabelangerk,16:14
pabelangerI think we could reproduce this via quickstart16:14
pabelangertoday, quickstart looks to be using root user to log into container jobs16:14
*** felixedel has quit IRC16:16
pabelangerso far, in the past we've had 2 options, keep executor user / nodepool user in sync (uid/gid). Or relax rsync to not preserve permissions16:17
pabelangerI don't believe, at job runtime, we'd be able to create the user in bwrap?16:18
corvuspabelanger: yes we can; we currently create the user that zuul is running as16:18
pabelangerah16:19
pabelangerso, maybe we need to update https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/fetch-output/tasks/main.yaml#L11 to also try to set uid / gid when we create directories.16:19
corvuswe could change that, but i think it could complicate things, especially if you wanted the bwrap user to be able to write to the executor (ie, a trusted job)16:19
corvuspabelanger: that would not affect any subdirs that were copied over16:20
openstackgerritMatthieu Huin proposed zuul/zuul master: Authorization rules: add templating  https://review.opendev.org/70519316:26
corvustristanC: i don't understand how /etc/passwd is writable16:28
corvustristanC: surely uid_entrypoint runs as the container user16:29
tristanCcorvus: https://review.opendev.org/#/c/650246/8/Dockerfile@4516:30
corvustristanC: doesn't that just make the group permission bits of /etc/passwd the same as user?  but both user and group are root?16:31
tristanCcorvus: yes, and iirc k8s gid are set to 016:33
corvustristanC: well, isn't this whole exercise supposed to be about not running as root?16:34
corvusi think running with gid is nearly as bad16:34
tristanCcontainer process gid are set to 0 when running in k8s16:34
corvuswith gid 016:34
corvustristanC: yes, container process uid are also set to 016:34
tristanCcorvus: iiuc, gid 0 doesn't give you root capability, it's mostly for filesystem access16:34
corvusmy understanding is that this exercise was meant to support "runasuser:0" and "runasgroup:0"16:34
corvustristanC: but we've just made */etc/passwd* writeable by a supposedly "unprivileged" user16:35
corvusi mean, that's a pretty short escalation path16:35
corvusyou just write it again and set your uid to 0, then spawn a shell and you're root16:35
tristanCcorvus: that implies you can change uid16:36
tristanCwhich shouldn't be the case in container16:36
corvusthere's a *lot* of if's going on here.  it also assumes that no important files will be gid=0+w16:37
tristanCcorvus: iiuc, allowing root group to write to /etc/passwd doesn't give any more privilege or attack vector, and this is what okd recommend for container images ( https://docs.okd.io/latest/creating_images/guidelines.html )16:38
tristanCeither you are privileged, and you can already do that kind of things (change file mode or change uid), either you are not privileged, then you can't change uid or do privileged actions16:39
tristanCit's just that allowing unprivileged to write to /etc/passwd let you fix issue for tool like git or rsync that except the uid to be a real user16:40
tristanCin other words, once you have one /etc/passwd per context, then it's content is not really relevant security wise16:43
*** jamesmcarthur_ has quit IRC16:47
*** mattw4 has joined #zuul16:56
*** jamesmcarthur has joined #zuul17:00
*** jamesmcarthur has quit IRC17:00
corvustristanC: i don't agree that running a container process as gid=0 is more secure than not.  it is one setgid binary away from a privilege escalation.  it's a loss of defense in depth -- and a very risky change since it reverses the security assumptions that almost every other part of the system is making.17:00
*** jamesmcarthur has joined #zuul17:00
*** hashar has joined #zuul17:01
tristanCcorvus: oh right, but is this something we can change?  (i was only refering to the chmod g=u /etc/passwrd)17:02
openstackgerritMerged zuul/zuul-jobs master: Add a markdownlint job and role  https://review.opendev.org/60769117:05
corvustristanC: found one.  here's a privilege escalation using writeable /etc/passwd: http://paste.openstack.org/show/789014/17:10
corvusi just set the password for the zuul user, and set its uid to 0, then run su.17:11
corvusprobably could have just set root's password too.17:11
tristanCcorvus: if you think that's an issue, then i don't know how to prevent it while supporting arbritary uid17:21
tristanCiiuc, to prevent process from changing identity you drop the cap_setuid17:24
corvustristanC: i definitely think that's an issue.  if we believe that not running as root is valuable (and i think we all do), then clearly this is not meeting our goal.  i mean, that's hardly an exploit -- it's really just using tools as they were meant to be used.  i think at the very least, if you're using that uid_entrypoint script anywhere, you should update it to "chmod g-w /etc/passwd".  that should help17:28
corvusprevent this particular hole, but i feel like this is an example of what can happen when you reverse the assumptions of the unix permissions model, and more creative people than i may still come up with a way to utilize being gid 0.17:28
corvusalso, it's kind of a time bomb for anyone who wants to use the image and overrides the entrypoint.  they may not care about anyuid, it will work fine for them, but they'll have an exploit waiting to happen on the image.  i really don't think it's safe.17:29
tristanCcorvus: that's fair, would you like to propose adding the chmod g-w to the okd guideline?17:31
*** evrardjp has quit IRC17:33
*** evrardjp has joined #zuul17:34
corvustristanC: any idea where that source is?17:34
tristanCcorvus: https://github.com/RHsyseng/container-rhel-examples/blob/master/starter-arbitrary-uid/Dockerfile.centos7#L2817:35
tristanCwhich is the example referenced by the okd guidelines17:35
corvustristanC: oh i thought you meant update the okd docs themselves17:35
tristanCcorvus: imo, in the context of the zuul-executor pod, i think that if a malicious user can execute arbritary action as the zuul user, then there isn't much more to gain by getting root privilege.17:36
tristanCcorvus: for example, you also need to disable the service-account token, otherwise the malicious might as well spawn another pod using the root account directly17:36
tristanCcorvus: and if you start enabling such security setting, then you might as well configure the pod security context to prevent setuid altogether17:38
corvustristanC: indeed.  but that argument is very close to "it's okay to run a container as root".  (it kind-of is okay to do that, which is why we haven't been too worried about this up to this point, but if we're going to agree that it's better not too, then we should really do that)17:38
corvustristanC: remind me again, are we relying on bwrap being setuid, or is it all caps now?17:40
tristanCcorvus: i think we do because k8s doesn't support userns iirc17:43
tristanCbut i don't quite remember what is at really required between seccomp, capabilities(7) and selinux...17:46
tristanCif the uid_entrypoint script enables privilege escalation, then we should fix the script upstream.17:48
*** jamesmcarthur has quit IRC17:48
openstackgerritAntoine Musso proposed zuul/zuul master: web: humanize time durations  https://review.opendev.org/70512017:49
*** jamesmcarthur has joined #zuul17:50
corvustristanC: i have 3 concerns: 1) running gid 0 may create vulnerabilities because it runs counter to assumptions that other tool makers may have made. 2) leaving /etc/passwd writable after the script runs clearly opens an easily-exploitable vulnerability.  this is an example of #1, but perhaps not the only one.  this is easy to fix.  3) leaving /etc/passwd writable in the image itself creates a latent17:51
corvusopportunity for #2 to happen if someone overrides the entrypoint.  no one expects overriding the entrypoint to make a container image exploitable.17:51
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: add support for arbritary uid  https://review.opendev.org/65024617:54
corvusfinally found the source: https://github.com/openshift/openshift-docs/blob/master/modules/images-create-guide-openshift.adoc#support-arbitrary-user-ids17:54
corvustristanC: https://github.com/openshift/openshift-docs/commit/1128e47ee0ea01ac1410cfff42370f4872dbbffd17:55
tristanCcorvus: ha, but the PR comment says that only works with OCPv417:56
corvusoh, the real info is in the pr comments, not the commit message.  i was wondering.17:57
corvusit still doesn't say why it's not applicable.17:57
corvustristanC: any ideas why it is "unnecessary" in 4?17:58
tristanCcorvus: your concerns makes sense to me, i just don't know why the uid_entrypoint is designed like that17:58
tristanCcorvus: perhaps in 4 there is oci hook in place to take care of that? no idea17:59
hasharhi :)18:00
corvustristanC: in the mean time, is it possible to tell openshift to run a container with a fixed uid?  can we just say "runasuser: 1000"?18:01
corvus(we're already not running in the default config because of privileged)18:01
tristanCcorvus: it is, but not with the default scc18:01
corvustristanC: that's what we have to do for the executor for privileged, right?18:02
tristanCcorvus: i kept the uid range, but that can be changed here https://review.opendev.org/#/c/702758/15/deploy/scc.yaml@2218:03
tristanCcorvus: so we could have a18:03
tristanCextra scc for the rest of the zuul service to runasuser:1000, or we grant the zuul-operator to use the existing anyuid scc18:05
corvusone of those would be my vote for how to handle this for all of our images for now.  i'd like to see what's coming in v4 to perhaps allow running as a range.18:06
tristanCcorvus: and thus replacing the uid_entrypoint with an useradd ?18:08
corvustristanC: yes18:08
tristanCcorvus: wfm18:08
tristanCcorvus: i'm waiting to see if last PS of https://review.opendev.org/#/c/650246/ worked (where i moved the VOLUME statement after the USER), then i'll update it accordingly18:10
*** plaurin has joined #zuul18:16
plaurinHi irc people! I wonder if anyone could help me out. Still working trying to make kubernetes work in zuul. I have a couple of shell steps, which all work until one returns the error \"ansible.errors.AnsibleError: Unable to create local directories(/home/ubuntu/.ansible/tmp): [Errno 13] Permission denied: b'/home/ubuntu/.ansible/tmp'\"], 'stderr_lines': []}"18:23
plaurinSeems like a common issue found on google to fix with ansible.cfg but not sure where I should be "editing" anslble.cfg in a zuul installation18:25
plaurinother than passing as variables  in /etc/zuul/variables.yml18:25
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: create and configure a zuul user with uid 1000  https://review.opendev.org/65024618:26
pabelangerplaurin: are you using 'ubuntu' as the user to connect to nodepool nodes?18:26
tristanCcorvus: no luck with the gerritconfig eperm error, let see if that new dockerfile does the trick ^18:27
plaurinyes, it's a docker image built using packer, from the official ubuntu docker image. I pass the "USER ubuntu" to the commited image18:28
plaurinso connecting to the image it works with ubuntu user18:28
plaurinunless zuul/nodepool changes this i'm not sure18:28
pabelangerif you SSH into node, as ubuntu and cannot change permissions on /home/ubuntu. Sounds like something isn't configured properly18:29
pabelangercan you check permissions?18:29
plaurinit's using kubectl exec, not ssh as this is using ansible's kubernetes driver18:29
pabelangerokay, can you log into node and ls -la /home/ubuntu18:30
plaurinsure18:30
plaurinjust a sec18:30
corvustristanC: https://github.com/RHsyseng/container-rhel-examples/issues/10318:30
pabelangerwe then need to see what user ansible is using18:30
plaurinoh well look at that the .ansible belongs to root:root 😑️18:31
tristanCcorvus: thanks!18:31
plaurinI never tough checking that output after building with packer18:32
pabelangerplaurin: yah, I would assume packer ansible, was root and left it18:32
pabelangerthen zuul cannot18:32
plaurinI'm surprised because the ansible provisioner used with packer is configured to use 'ubuntu' user18:33
plaurinheh I guess I need to fix around packer, not zuul then :P18:34
*** armstrongs has joined #zuul18:35
plaurinohh I use "become: yes" in my initial yml "" facepalm ""18:35
pabelanger\o/18:35
*** tosky has quit IRC18:36
plaurinthx I'll fix that out18:36
plaurin(-‸ლ)18:36
plaurinI should come here more often you guys help me fix things much faster18:37
pabelangeranything to get more zuul users :)18:38
*** jamesmcarthur has quit IRC18:38
plaurinI'm doing fun/complex stuff with zuul. Upgrading firmwares and BMCs, and running tests automagically 24/718:39
*** jamesmcarthur has joined #zuul18:40
Shrewsplaurin: sounds worthy of a blog post   ;)18:40
plaurin& making sure you won't break your BIOS if you flash it 30 times in a row and cutting current midway hehe18:41
plaurinIt's nice to create my ansible recipes to do what I want and just "plug" them in zuul with a cron18:42
*** hashar has quit IRC18:43
plaurinokay gotta go back focus on my work, as usual thx for your help!18:43
*** armstrongs has quit IRC18:45
corvusplaurin: have you talked with jamesmcarthur?  he's looking for zuul users18:45
plaurinI haven't18:45
corvusplaurin: maybe you can pm him your email address if you have a minute.  i'm sure he'd appreciate it18:46
plaurinsure, will do18:47
clarkbplaurin: as a user of things that have firmware it makes me happy to hear people care about not bricking my devices :)18:47
corvusplaurin: yeah, i can confidently say we're all rooting for you.  (so to speak)18:47
*** jamesmcarthur has quit IRC18:48
plaurinhehe, yeah I do devops to try and brick stuff in a manner of speaking18:48
corvustristanC, mnaser, tobiash: it looks like you can't set the owner of a mounted secret in k8s, so if we change our uid, i think we also have to drop the defaultMode lines we have in the helm charts18:49
tobiashcorvus: I set this on the pod (not container) spec: http://paste.openstack.org/show/789017/18:50
tobiashwhich afaik should change the group of mounted things18:50
tobiashbut maybe that doesn't work18:52
tobiashlooks like they are still owned by root18:53
*** plaurin has quit IRC18:57
*** igordc has joined #zuul19:03
openstackgerritJames E. Blair proposed zuul/zuul master: Add google-cloud-storage to executor ansible  https://review.opendev.org/70527919:07
*** bhavikdbavishi has quit IRC19:07
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: Add uid/gid 1000 to Dockerfile  https://review.opendev.org/70528019:07
corvustristanC: ^ just so i don't lose track, that's what i did locally to build an image (based on tobiash's paste from earlier)19:08
corvusit's slightly different than what you uploaded (and is missing USER lines).  i don't have an opinion on which is better/correct.  but nuances about volumes and ownership may be important here.19:09
tristanCcorvus: quickstart is now failing because of Saving key "/var/ssh/admin" failed: Permission denied19:17
tristanCtobiash: i confirm k8s secrets are always monted as root, and therefor require using a relaxed defaultMode, e.g. the one by default is 064419:22
tristanCcorvus: it seems like docker-compose volume doesn't respect the container user. perhaps we have to define the `user` attribute in the compose file?19:26
tristanCor according to https://github.com/docker/compose/issues/3270#issuecomment-543603959 , we should mkdir /var/ssh && chown zuul /var/ssh in the Dockerfile19:29
*** pcaruana has quit IRC19:30
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: create and configure a zuul user with uid 1000  https://review.opendev.org/65024619:32
corvushttps://gerrit-zuul.inaugust.com/t/gerrit/build/508e47023cd243c68d5f8c128556a03b20:08
corvusa successful build with an uploaded log20:08
corvustristanC: hrm, i'm not a fan of having quick-start specific volumes in the docker image....20:14
corvus(or rather, quickstart specific paths)20:15
corvustristanC: what if we add a container to docker-compose.yaml with a dockerfile which creates and chowns those mountpoints, and volume mounts for all the quickstart-specific volumes, then have the quickstart services all depends_on that container?20:26
corvusdepends_on waits for a container to "start" -- i'm assuming that means initializing new volumes...20:27
tristanCcorvus: not sure what `volume, when first used by a container, get it's initial content and permission inherited from the container` (from the compose issue) implies. i'm not a fan either, but the cost of such directories in the layers is rather negligable20:28
corvustristanC: i think that means that docker will copy the contents from the container image to the volume the first time a container mounts a newly created docker volume.20:31
corvustristanC: i think it's really messy to have this kind of interaction between the images and quickstart; i'd prefer a straightforward upstream image that people can use downstream.20:32
*** hashar has joined #zuul20:32
clarkbhttps://zuul.opendev.org/t/openstack/build/b4ad5a108af94478b09cd715b4e6a5f4/log/job-output.txt#3140 do we need a become:true in collect-container-logs ?20:34
clarkbmnaser: ^ you added that role. Is the expectation that we do that at a higher level?20:34
corvusclarkb: normally the zuul user is in the docker group20:34
corvusclarkb: perhaps that's not the case for the system-config jobs though20:35
tristanCcorvus: so you prefer an extra container in the docker-compose to chown 1000 the mountpoints ?20:35
corvusclarkb: (because they use the system-config method of bootstrapping docker?)20:35
clarkbya looks like there is a local install-docker20:35
clarkbbecause we install docker in production20:36
tristanCcorvus: also note that when the upstream zuul image setup the USER, then it may break current user. e.g. tripleo folks copied the docker-compose and they will need to also use the extra container to do the chowning20:38
clarkbI've added apply: become: yes to the include_role I expect that will fix it20:39
corvustristanC: yes, it is becoming apparent that adding USER is going to be a breaking change for anyone using images.  i think we will need to announce that.20:40
corvusthat's really separate from quick-start though20:41
corvustristanC: do you think there's any chance that if we mounted the volumes under /var/lib/zuul they would pick up the perms from that (even though /var/lib/zuul is also a volume?20:42
tristanCcorvus: actually we don't have to set USER now, we could just create the user, that should unblock k8s deployment20:42
corvustristanC: agreed (then work on USER later)20:42
corvustristanC: it looks like https://review.opendev.org/705280 is sufficient to work in k8s with runAsUser and does not break quick-start20:43
corvusi have to grab lunch; biab20:43
*** rfolco has quit IRC20:43
openstackgerritTristan Cacqueray proposed zuul/zuul master: Dockerfile: create a zuul user with uid 10001  https://review.opendev.org/65024620:47
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: Add OpenShift SCC and functional test  https://review.opendev.org/70275820:52
*** mattw4 has quit IRC20:54
openstackgerritAntoine Musso proposed zuul/zuul master: executor: ansible does not need cowsay  https://review.opendev.org/70529420:56
*** mattw4 has joined #zuul20:58
corvustristanC, tobiash: how should we reconcile https://review.opendev.org/705280 and https://review.opendev.org/650246 ?  should we have an addgroup?  should we set the homedir to /var/lib/zuul ?21:53
corvus(and what should the userid be?)21:53
corvushow about uid 10001, homedir=/var/lib/zuul, add zuul group ?21:53
corvusthat seems like it gets the highlights from both -- does that work?21:54
corvusoh, also we should do a chmod on /var/lib/zuul since that's a volume we expect docker to create, so that would let it work as non-root in docker21:55
corvuslet me propose that just to get a strawman up21:55
tristanCcorvus: zuul doesn't need a special gid iiuc21:56
*** rlandy has quit IRC21:56
corvus(oh, the -m will create the homedir, so the chown is already there for /var/lib/zuul)21:57
corvustristanC: what does it end up with in your patch?21:57
tristanCcorvus: with 650246 and this change https://review.opendev.org/#/c/702758/15..16 (using runAsUser), both k8s and openshift for some reasone failed with: https://zuul.opendev.org/t/zuul/build/0ed04a02b1254c33949b84c619fd49a6/console#3/0/6/controller21:58
corvustristanC: that error probably means non-writeable homedir21:58
corvusit's probably trying to write to ~/.ansible/tmp  -- at least that's what was happening in my earlier trials21:59
tristanCcorvus: is there a reason why that would only happen with ansible-2.9?22:00
corvustristanC: oh wow, i didn't notice it was only 2.9 that failed... i *thought* earlier i saw all 4 versions fail.  so either i was mistaken in my earlier test, or this is a different error.22:00
tristanCi guess ansible-2.9 does something special on init with the home directory, though i don't get why 702758 is failing with the useradd addition while it was working with the uid_entrypoint script22:02
*** tosky has joined #zuul22:05
corvustristanC: you can run the built image from insecure-ci-registry.opendev.org:5000/zuul/zuul-executor:d1f09d7823254a54bac0ed0f13d23606_latest22:07
corvustristanC: i'm doing that now to inspect it22:07
corvus(ftr, the other question: uid=10001(zuul) gid=10001(zuul) groups=10001(zuul)22:07
corvustristanC: i'm unable to repro in gke, it seems to work fine22:10
tristanCcorvus: podman doesn't seems to be able to run that image: http://paste.openstack.org/show/789025/22:10
tristanCit works with docker though22:12
tristanCcorvus: could you paste the k8s container resources you apply on gke22:13
corvustristanC: this is what i just did to test that http://paste.openstack.org/show/789026/22:13
corvusit seems to have started up without incident and joined the scheduler22:14
*** mattw4 has quit IRC22:17
tristanCcorvus: it seems like one difference is the securityContext.runAsUser being set for the podspec while i put it for the container (along with the privileged attr), i'll update the operator to try do that22:18
corvustristanC: yeah, that was accidental on my part.  i'll retry with everything in the container spec and see if it breaks22:18
corvustristanC: didn't seem to change anything22:19
*** avass has quit IRC22:22
openstackgerritMerged zuul/zuul master: executor: ansible does not need cowsay  https://review.opendev.org/70529422:26
openstackgerritTristan Cacqueray proposed zuul/zuul-operator master: DNM: add PodSpec securityContext attribute  https://review.opendev.org/70530822:26
tristanCcorvus: fwiw, ^ renders http://paste.openstack.org/show/789027/22:28
corvustristanC: (btw it looks like that has the same service bug i proposed a fix to zuul-helm -- the service should have a "clusterIP: None" entry in the spec)22:29
corvustristanC: why the HOME env var?22:30
tristanCcorvus: previous PS worked fine without the clusterIP22:30
*** mattw4 has joined #zuul22:30
tristanCcorvus: the HOME is a leftover from the uid_entrypoint22:31
corvustristanC: you won't notice the clusterip thing until you try to stream a log22:31
tristanCcorvus: i did streamed log without issue, on both k8s and openshift22:32
tristanC(the resources have been tested end-to-end with a job running on kubernetes)22:32
corvustristanC: in the web dashboard?22:33
corvusi must have missed the test for that, but that's pretty cool if we have it.22:33
tristanCi meant manually tested using the web interface streaming this job: https://softwarefactory-project.io/r/#/c/17347/3/conf/sf/applications/SoftwareFactory.dhall@158, we still need a ci job to demonstrate this automatically.22:37
corvustristanC: i'd be interested in knowing what name the scheduler connected to22:39
corvusbecause my understanding is that unless you make a headless service you don't get an entry in dns for the statefulset pod22:40
corvushttps://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#stable-network-id22:44
tristanCcorvus: i have the executor running job, trying to reproduce using curl to console-stream22:46
corvustristanC: in my test, the scheduler was trying to contact zuul-executor-0.zuul-executor.zuul.svc.cluster.local22:47
corvusthat name did not resolve unless i made the service headless22:48
tristanCcorvus: here is the url i got from the status page "finger://sf-executor-0.executor.myproject.svc.cluster.local/30c0dbdf68214ce1aafa25534163dc54"22:49
corvusand that hostname resolves from the scheduler?22:50
corvusregardless, i think a headless service is more correct anyway -- it doesn't make sense to connect to executor.myproject.svc.cluster.local:7900 (executors aren't a load-balanced service).  so while i'm curious about the different behavior, i still think we should add it to the operator.  it's not urgent though, we have enough problems. :)22:52
tristanCcorvus: it does from zuul-web : http://paste.openstack.org/show/789029/22:52
tristanCand the same ip from the scheduler22:53
corvusyou can sudo on zuul-web?22:54
corvusoh, you're doing that from outside k8s22:54
tristanCcorvus: yes, to be able to use `dig`22:55
tristanCcorvus: actually, i may not have tested the zuul-web console-stream on k8s, only the job-output.txt22:56
clarkbhttps://zuul.opendev.org/t/openstack/build/c13a6d5a761e4de9af2be566033c5f51/console#3/1/3/refstack01.openstack.org somehow the container name is becoming container_command23:00
clarkbbetween these two tasks https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/collect-container-logs/tasks/main.yaml#L6-L1623:00
clarkbanyone understand how that could happen?23:01
clarkbmaybe something to do with the loop?23:01
tristanCcorvus: here is a cli test: sudo dnf install -y python3-websocket-client; function zuul-stream { wsdump -v 2 -t '{"uuid": "'$(echo $1 | sed 's/.*stream.\([^\?]*\).*/\1/')'", "logfile": "console.log"}'  -r ws://172.30.45.62:9000/api/tenant/local/console-stream ; }23:03
clarkbit is almost like every variable in the loop is getting replaced with {{ item }}23:04
corvusclarkb: i agree with what you're seeing and it makes no sense to me.23:05
clarkbI'm skimming ansible commit history to see if there is any suspect fixes, but this really looks like an ansible loop bug to me23:08
clarkbbecause that var is not overridden in the previous task and its value is correct there23:08
clarkbthis job ran with 2.8.7, there is a 2.8.823:08
clarkbnothing jumps out though23:09
clarkbpabelanger: ^ you seem to keep on top of ansible behaviors does that look familiar to you at all?23:09
tristanCclarkb: perhaps setting `loop_control: {loop_var: item}` would yield helpful result?23:11
clarkbtristanC: I can try that, then depends on it to check, changes in a minute23:11
openstackgerritClark Boylan proposed zuul/zuul-jobs master: Debug weird Ansible loop behavior  https://review.opendev.org/70531223:14
hasharclarkb: do you have a way to run the playbook with extra verbosity?  ansible-playbook -vvvvv23:15
hasharor ANSIBLE_DEBUG=123:15
hasharor something similar23:15
clarkbI don't know, it is zuul running it23:15
corvusclarkb: i'm unable to reproduce locally with a similar playbook under 2.8.723:16
corvusclarkb: zuul-executor verbose23:16
corvusyou might need to do that for all the executors before the job starts though23:16
corvusmake sure you turn it back off if you do :)23:16
clarkbya, let me see if we get different behavior with the above change and the ndecide if I need to do that23:17
corvusthat gets you "-vvv"23:17
corvuswhich is not quite as many v's as hashar suggested.  not sure if that's enough23:17
clarkbanother option is to set the jobs ansible version to 2.923:21
clarkbto try and narrow down the behavior23:21
corvustristanC: i'm not convinced that 702758 used the zuul image builds from 650246  (though i don't understand why not)23:27
hasharclarkb: not sure if that can help, but 'loop' lets you register the results which would have the value of 'item'  (well it is known but still.. that might help maybe)23:29
hasharhttps://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html#registering-variables-with-a-loop23:29
clarkbin this case it is the input of the loop commands that is the problem not the output23:30
hashareek :(23:30
clarkbthe verbose run idea is a good one, but I'll avoid that until I have no othe roptions because I have to update all the executors to do it23:30
corvustristanC: yeah https://zuul.opendev.org/t/zuul/build/12605547681f4171aef450e20060e758/log/zuul-info/inventory.yaml shows 246 in the chain, but no artifacts from it23:31
tristanCcorvus: indeed, could recheck fix that?23:34
corvustristanC: possibly; i went ahead and did that.23:34
corvusi think we need more debug lines in the provides/requires code23:34
openstackgerritJames E. Blair proposed zuul/zuul master: Add some debug lines for provides/requires  https://review.opendev.org/70531323:42
openstackgerritJames E. Blair proposed zuul/zuul master: Add some debug lines for provides/requires  https://review.opendev.org/70531323:43
*** rfolco has joined #zuul23:48
clarkbsetting the loop var did not change anything23:50
*** igordc has quit IRC23:52

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!