Tuesday, 2018-07-24

*** threestrands has quit IRC00:38
*** jiapei has joined #zuul01:20
02:16 <openstackgerrit> Paul Belanger proposed openstack-infra/zuul master: Use mode 0o755 for executor server mkdirs  https://review.openstack.org/585068
04:49 <SpamapS> six is a dependency of like, everything. Probably not reasonable to rely on it not being there. ;)
04:50 <SpamapS> pip and virtualenv are the only things I ever install into system python :-P
05:16 <openstackgerrit> Ian Wienand proposed openstack-infra/zuul master: Rename didAnyJobFail() to hasAnyJobFailed()  https://review.openstack.org/585134
*** nchakrab has joined #zuul05:27
*** quiquell|off is now known as quiquell05:38
05:46 <openstackgerrit> Joshua Hesketh proposed openstack-infra/zuul master: Add instructions on building static web components  https://review.openstack.org/581256
*** nchakrab has quit IRC06:22
*** nchakrab has joined #zuul06:31
*** pcaruana has joined #zuul06:34
*** quiquell is now known as quiquell|bbl06:54
06:56 <openstackgerrit> Merged openstack-infra/zuul-jobs master: Scope all_known_hosts in mult-node-known-hosts  https://review.openstack.org/584993
*** hashar has joined #zuul06:58
*** bbayszczak has joined #zuul07:06
*** bbayszcz_ has joined #zuul07:23
*** bbayszczak has quit IRC07:26
*** nchakrab has quit IRC07:29
*** nchakrab has joined #zuul07:36
*** gchenuet has joined #zuul07:43
*** bbayszcz_ has quit IRC07:43
*** bbayszczak has joined #zuul07:44
*** gchenuet has quit IRC07:47
*** quiquell|bbl is now known as quiquell07:52
*** nchakrab_ has joined #zuul08:29
*** nchakrab has quit IRC08:31
*** sshnaidm|ruck is now known as sshnaidm|afk08:46
*** gouthamr has quit IRC09:02
*** dmellado has quit IRC09:03
*** panda|off is now known as panda09:25
*** goern has joined #zuul10:16
*** gouthamr has joined #zuul10:20
*** dmellado has joined #zuul10:24
*** nchakrab_ has quit IRC10:42
*** hashar has quit IRC10:47
*** quiquell has quit IRC10:48
*** panda is now known as panda|lunch11:10
11:23 <tristanC> pabelanger: i've started an implementation of the "container that behaves like a machine" as specified by the container-build-resources spec, and it seems like it can't support bindep unless we give root capability to the pod
11:24 <tristanC> pabelanger: which would be a regression similar to the one you reported about the runC driver: you need to duplicate the bindep list into the container build recipe...
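For context on why root matters here: bindep resolves a per-repo bindep.txt of distro packages at job runtime, which requires permission to install packages inside the node. Hypothetical bindep.txt contents might look like:

```
# bindep.txt: one distro package per line, with optional selectors
gcc
libffi-dev [platform:dpkg]
libffi-devel [platform:rpm]
python3-devel [platform:rpm test]
```

Installing these at job time is exactly the step that fails in a pod without root.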
*** sshnaidm|afk is now known as sshnaidm|ruck11:29
*** GonZo2000 has joined #zuul11:39
*** GonZo2000 has quit IRC11:39
*** GonZo2000 has joined #zuul11:39
*** jiapei has quit IRC12:10
*** nchakrab has joined #zuul12:12
*** panda|lunch is now known as panda12:36
*** GonZo2000_ has joined #zuul12:38
*** rlandy has joined #zuul12:39
*** GonZo2000 has quit IRC12:40
*** samccann has joined #zuul13:00
13:05 <mordred> tristanC: for the container to behave like a machine I would expect the code running in it to be able to run commands as root - otherwise it's not really a thing that behaves like a machine ...
13:06 <mordred> tristanC: is the issue with bindep just that installing packages at runtime at all would be an issue?
13:06 <mordred> or is there something specific that bindep is doing that requires some capability beyond 'install packages'?
*** ianychoi has quit IRC13:08
*** ianychoi has joined #zuul13:08
13:09 <tristanC> mordred: this is similar to the bwrap issue; iiuc we can't create containers with root capabilities
13:10 <tobiash> mordred: in openshift, containers by default cannot run as root (aka uid 0 inside the container)
13:10 <tobiash> mordred: that is distinct from running with privileges
13:10 <mordred> gotcha. so I'd say that, by default, openshift cannot provide the thing I would consider a "container that behaves like a machine"
13:10 <tobiash> mordred: but both are restricted by default
13:10 <mordred> but it's interesting information, and means we might need to define 2 different things here
13:11 <tristanC> though i don't think this is a regression; users just need to provide an image with the bits they need to run a job
13:11 <mordred> yeah - but you can't test a patch that way
13:11 <mordred> if the patch contains an update to bindep.txt
13:11 <mordred> the testing of the speculative future state would be incorrect
13:11 <mordred> (if the test itself cannot create the environment)
13:12 <mordred> or?
13:12 * mordred might be making statements - but they're all also questions
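As a sketch of the restriction tobiash describes: under OpenShift's default "restricted" SCC, a pod that asks for uid 0 is rejected at admission. A minimal, hypothetical pod spec requesting uid 0 might look like this (the pod name is illustrative):

```yaml
# Hypothetical pod spec; under OpenShift's default "restricted" SCC this
# is rejected because runAsUser: 0 is outside the allowed uid range.
apiVersion: v1
kind: Pod
metadata:
  name: bindep-test
spec:
  containers:
    - name: worker
      image: docker.io/centos/python-36-centos7
      command: ["sleep", "infinity"]
      securityContext:
        runAsUser: 0   # needs the "anyuid" SCC (or equivalent) to be admitted
```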
13:13 <tobiash> mordred: in the container world you may want to build an updated image first
13:13 <mordred> right. but that's container-native
13:14 <tristanC> mordred: and openshift seems designed for container-native...
13:14 <mordred> yes
13:14 <mordred> container-as-machine is explicitly about serving people who are not container native but who maybe don't need full vm isolation
13:14 <mordred> the pep8 job in openstack is a good example
13:15 <mordred> or tox-py35
13:15 <mordred> it does not need to be rethought as a container-native workload - it's running pep8 - just running commands in the container should be fine
13:15 <tobiash> mordred: in that case the operator either must know that root is not a thing or must explicitly allow root in his kubernetes/openshift
13:16 <mordred> tobiash: yes, I believe I'm coming to agree with that
13:16 <tristanC> my current implementation adds a label type to request a pod or project, e.g. "labels: [{name: openshift-project, type: project}, {name: py36-pod, type: pod, image: docker.io/centos/python-36-centos7}]"
13:18 <tristanC> could we have a step between container-as-machine and container-native, e.g. container-image?
13:18 <mordred> maybe - I think this is worthy of further discussion, as I think "containers can't run things as root" is a piece of information that is new
13:19 <tristanC> i mean a step in the container-build-resources spec
13:20 <mordred> tristanC: I'm not sure I follow what you mean?
13:21 <tristanC> mordred: container-as-machine, e.g. an un-restricted env like k8s or docker can provide; container-image, e.g. a regular pod in openshift; container-native, e.g. a project to build speculative pod(s)
13:22 <rcarrillocruz> are you folks aware of kubevirt? i've used it; it's about making virtual machines first class citizens in pods
13:22 <mordred> rcarrillocruz: yes, but I don't think it solves anything here
13:23 <mordred> this isn't about being able to use k8s to get vms (we can already get vms) - it's about getting containers that are containers
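tristanC's label-type idea, expanded into a hypothetical nodepool provider snippet (the names and exact structure are illustrative, not the merged driver syntax):

```yaml
# Hypothetical nodepool openshift-driver config sketch
providers:
  - name: openshift
    driver: openshift
    pools:
      - name: main
        labels:
          - name: openshift-project
            type: project          # grants the job a whole project/namespace
          - name: py36-pod
            type: pod              # grants a single pre-built pod
            image: docker.io/centos/python-36-centos7
```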
13:24 <tristanC> mordred: because atm, I can't implement the container-build-resources spec on openshift for the simple use-case, unless we create a distinction for containers without root access
13:24 <rcarrillocruz> right, i guess my point is, if you want to do root things in k8s you have to set a privileged pod
13:24 <rcarrillocruz> be it a container
13:24 <rcarrillocruz> or a kubevirt vm
13:24 <mordred> I don't want to do root things in k8s
13:25 <mordred> I want to do root things in $something that behaves like an actual machine rather than like a PaaS
13:25 <rcarrillocruz> root in the pod i mean, for the context of a job
13:25 <mordred> right. so it's possible the answer here is "the thing people have been saying will make some things easier is in fact impossible and so the use case is worthless"
13:26 <rcarrillocruz> heh
13:26 <rcarrillocruz> yeah, it's retrofitting, not a good use case
13:26 <mordred> because the idea here is to be able to use a container in place of a VM for jobs where the container vs. vm distinction is not useful
13:26 <mordred> again, I'll reference the tox-py35 job
13:26 <mordred> there is nothing container native or vm native about that job - it's a DEAD SIMPLE thing that runs, installs software and runs some unit tests
13:27 <mordred> so the argument has been made for the last 5 years or so that we could have a major win if we just grabbed ubuntu containers instead of spinning up ubuntu vms
13:27 <mordred> it's starting to look like all of the people who made that argument so far were ignoring the fact that the things one gets containers from don't let you do basic things like install packages
13:28 <rcarrillocruz> right, it's the same argument about 'hey, i can do whatever in docker you do in vms', 'oh really, can you do this thing that is not a web app and requires root', 'oh ofc, i run it in privileged mode'
13:28 <rcarrillocruz> which means you are running something in a less secure way
13:28 <mordred> yup
13:28 <rcarrillocruz> just for the sake of 'i use containers'
13:28 <mordred> yes - so that's what I'm learning so far this morning
13:28 <mordred> yet again "containers make this easier/better" is a marketing falsehood
13:29 <rcarrillocruz> it's moving complexity from config management to image building, apart from being less secure... at least that's how i see it
13:29 <mordred> well - it's also missing the step between the source and the workload
13:29 <mordred> because it seems to be quite common, as was mentioned before, that someone "provides an image"
13:29 <tristanC> it seems like just a separation of privilege between build and runtime
13:30 <mordred> zuul, however, is a system that is designed to work on source code, so if a user had to build an image and upload it somewhere before they could run a test ...
13:30 <tobiash> just to note, you *can* have root containers without privileges in openshift, you just need to allow that
13:30 <mordred> that would be like the old-school pre-gating CI where people push patches to a branch and then tests run to tell you whether it's red or green
13:31 <mordred> tobiash: awesome. so it may just be that container-as-machine will only work if you've allowed that?
13:31 <mordred> and that we may need to document that requirement for that use case
13:31 <tobiash> mordred: if your workload needs to install packages, yes
13:31 <tobiash> mordred: however I don't know what's the default in kubernetes
13:32 <mordred> I don't either
13:32 <tobiash> maybe there it's more relaxed
13:33 <mordred> I feel like I should write some long-form words about this ... but one of my goals is to be able to do the thing we think of as gating without any humans ever having to do the equiv of docker push
13:33 <mordred> because docker push is just like git push ... if a human is doing it, something has gone wrong
13:33 <tristanC> tobiash: don't you need cluster-admin access to enable any uid?
*** swest1 has quit IRC13:33
13:34 <tobiash> mordred: there are different sccs (security context constraints) in openshift; one is privileged, one is any-uid, and the default is restricted (which starts the pod with a random uid)
13:34 <tobiash> tristanC: the cluster admin can allow specific sccs to specific users or groups
13:35 <tobiash> then that user can make use of that scc
13:35 * mordred has learned many things this morning
13:35 <tobiash> mordred: the any-uid scc enables you to choose the uid inside the pod (including 0) without giving you privileges
13:36 <tobiash> mordred: it's just an additional default safety mechanism in addition to dropping privileges
13:36 <mordred> cool. that sounds like the thing that would be needed for container-as-machine to work as intended. and when I say "as intended" - what I mean is that I expect jobs in zuul-jobs to work in containers provided by container-as-machine
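Granting the anyuid SCC as tobiash describes is a cluster-admin operation; a sketch of the commands involved (the "zuul-ci" project and "zuul-worker" service account names are hypothetical):

```
# Run as a cluster admin; substitute your own project and service account.
oc adm policy add-scc-to-user anyuid system:serviceaccount:zuul-ci:zuul-worker

# Check which SCC a running pod was admitted under:
oc get pod bindep-test -o jsonpath='{.metadata.annotations.openshift\.io/scc}'
```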
13:37 <mordred> it's possible we might want to be more explicit about that and maybe name that somehow
13:38 <mordred> then I think there is another but potentially similar thing, perhaps what tristanC is referring to as container-image, which is like container-as-machine but does not need any-uid
13:39 <rcarrillocruz> https://blog.zhaw.ch/icclab/the-intricacies-of-running-containers-on-openshift/
13:39 <rcarrillocruz> so yeah, the uid thing is 'special' in openshift
13:39 <tristanC> mordred: on a shared kernel, that seems like a safe thing to do for untrusted ci jobs
13:39 <rcarrillocruz> i hit that trying to get the zuul control plane to work in openshift online
13:39 <rcarrillocruz> which is why i asked tobiash how he did it, and because he was admin he could just mangle the sccs
13:40 <mordred> however, I imagine this is going to be a similar thing we're going to need to look at for container-native anyway, since zuul is going to need to be able to build containers
13:40 <tristanC> mordred: with a build-config object, s2i can build the image for you, as a regular user
13:40 <mordred> so a zuul install is going to need _some_ level of elevated privs to interact with container systems whether we're doing container-as-machine or container-native
13:40 <mordred> tristanC: nope
13:41 <tobiash> mordred: in openshift you by default have several service accounts, one of which is 'builder'
13:41 <tobiash> mordred: I think we might be able to use that service account to create 'builder' pods that build images
13:41 <tristanC> mordred: well yes, see https://review.openstack.org/#/c/570669/3/playbooks/openshift/build-project.yaml
13:42 <mordred> so - I think part of the problem here is that Zuul is a service that wants to _provide_ the same services as openshift buildConfig does
13:43 <tristanC> tobiash: yes, for container-native that is fine, but for the simpler use-case of running a linters job on a pre-existing image, it would be nice to have a distinction between environments that allow root and the ones that don't
13:44 <mordred> there are almost zero jobs defined in openstack's zuul today that would be able to be run in containers without root
13:45 <mordred> I suppose the javascript jobs probably could be defined to run in javascript base image containers
13:45 <mordred> but not even the python unittest jobs can
13:46 <mordred> so maybe instead of using linters as the example "simple" case, we need to use "python unittests" as an example "simple" use case as we think through this?
13:47 <tristanC> mordred: hum, iiuc the container-build-resources spec, the image building step is part of the job, not the zuul service per se
13:47 <mordred> but it's also possible that we need to take a step back and take another stab at defining the use cases and what we're trying to accomplish with them - because today's conversation has brought a giant piece of new information
13:47 <tristanC> mordred: so it should be up to the base job owner to use or not use the build config service
*** dkranz has joined #zuul13:47
13:48 <mordred> tristanC: that would depend on the base job owner knowing that they were running in openshift
13:48 <mordred> and would mean that **zuul** would not be providing the appropriate primitives to allow someone to write a job
13:48 <mordred> maybe that's ok
13:48 <mordred> I don't know
13:48 <mordred> it's just a ton of new information
13:49 <mordred> I know it does not sound like how a system I would like to use would work - so it might take me a while to wrap my head around it
13:49 <tristanC> mordred: well the zuul-base-job could be part of zuul too, and the primitives could be there too
13:49 <mordred> maybe so
13:50 <mordred> I've just been considering zuul as a replacement for the openshift-provided buildConfig system, since AIUI that uses jenkins on the backend to do its work, no?
13:51 <tristanC> mordred: buildconfig doesn't need jenkins afaik
13:52 <mordred> ok
13:52 <mordred> well - I feel like I've learned many things - so thanks ... I definitely need to digest for a bit
13:53 <tobiash> mordred: openshift buildconfig uses build containers spawned in pods using the builder service account
13:53 <mordred> ok
13:53 <tobiash> mordred: and openshift also allows custom build containers
13:54 <mordred> but does it allow for speculatively building build containers based on the git repo state of the repo that the build container would come from?
13:54 <tobiash> mordred: it just supplies a few default builder containers (various ones for s2i and one for building using docker files)
13:54 <mordred> because that's the thing we need to do to actually make that whole stack work
13:54 <mordred> right?
13:54 <tristanC> mordred: yes it can, but you need to give access to a git ref
13:54 <tobiash> mordred: for that we might have to look at whether we can create builder containers without a build config
13:55 <mordred> tristanC: right, which means you have to do the "I'm going to serve git refs over http" thing, right? which is going to be hideously inefficient at scale
13:56 <tristanC> mordred: well that's how the openshift-base job ( https://review.openstack.org/570669 ) works atm. what's the difference between pulling over http and pushing over an exec connection?
13:58 <mordred> I don't know. maybe nothing
*** maeca has joined #zuul13:58
13:58 <mordred> it seems like it would make it very hard to keep a git repo cache mechanism that would obviate the need to push the entire contents every time
13:59 <mordred> if we had zuul know how to build images itself natively, then it could also manage a git repo cache environment so that it cuts down on the overall traffic
14:00 <mordred> which is why I've been expecting that zuul would not use the existing builder service inside of the COE but would have something it does itself
14:00 <mordred> but again, that may just be a terrible assumption on my part
14:03 <tristanC> mordred: iiuc, openshift users are already using the builder service...
14:03 <pabelanger> tristanC: sorry, dealing with issues downstream, but it looks like you and mordred are already talking about solutions :)
14:04 <tristanC> mordred: anyway, i don't think this discussion changes the spec too much. container-native already enables any building system, as it is part of the job, not zuul
14:04 <mordred> tristanC: nod. yeah - it's possible the builder service is the right thing to do for that part of the spec
14:05 <mordred> tristanC: and it's possible that the thing we've been calling container-as-machine has some caveats and restrictions
14:05 <mordred> that just need to be documented
14:05 <tristanC> mordred: ideally we should just add a mention that container-as-machine may not have root access (since that's not available to a regular openshift user)
14:06 <tristanC> or create a new category, container-as-restricted-machine?
14:06 <mordred> yeah. so - my concern is that I REALLY don't want to fall into the trap openstack did where different zuuls behave massively differently because of deployment details of the operator
14:06 <mordred> I've been fixing that hell personally for many years now
14:08 <mordred> I don't want some of the jobs in zuul-jobs to not work on some zuul installations because the operator decided to use containers for build resources and didn't have access to any-uid
14:08 <tristanC> mordred: well, runc, lxc, docker, k8s and openshift all have a different feature set, so isn't that bound to happen?
14:08 <mordred> tristanC: to me - all of those are ways to get access to linux
14:08 <tristanC> mordred: zuul-jobs already doesn't work if you can't prepare-workspace, and afaik this role only works through ssh or docker atm
14:09 <mordred> right - but the spec for container-as-machine is written so that we could update prepare-workspace to work on them
14:09 <tristanC> or more generally, any task using synchronize
14:10 <mordred> right - that's why all of that work went in to fleshing out ways to do synchronize in container-as-machine
14:10 <mordred> if container-as-machine only works with prepare-workspace on _some_ zuul installs, I think that's very confusing
14:12 <tobiash> mordred: in that case I think the best we can do is to clearly state that anyuid is needed if using containers (and maybe even refuse to work if that's not satisfied)
14:13 <mordred> tobiash: yeah. that's my gut feeling right now (and we need special things for the executors anyway currently) ... but I also think my thoughts may change the more we think and talk about this
14:13 <tristanC> tobiash: which excludes openshift users from using a simple container image... could we please add a new container-as-restricted-machine to support users that do not need root access?
14:13 <mordred> which is to say - thanks to both of you ... this has been *very* helpful for my brain :)
14:14 <tobiash> tristanC: we might leave that option open but clearly state it as unsupported because of default jobs that won't work
14:14 <mordred> tristanC: maybe? it's literally a **completely** new idea I've never even thought of before this morning so I don't really have much in the way of opinions on what the zuul user experience is for such a thing yet
14:15 <tristanC> i mean, 90% of our jobs run fine on a common image that has a few devel packages baked in
14:15 <mordred> maybe 1% of our jobs do
14:18 <mordred> tristanC: but I hear what you're saying and I could see how it could be a thing that people would want - I just don't know how to think about it yet
14:20 <tristanC> s/90%/most/ (sorry, that number doesn't make sense; i meant images just need bindep installed, and new bindeps could be installed preemptively)
14:22 <tristanC> mordred: sure, i just hope this doesn't delay the container spec too much :)
14:22 <mordred> tristanC: that just still sounds like container-native to me - because installing the bindeps preemptively would mean the base job would need to submit something to buildConfig to get a base image with the bindep depends installed, and then would need to boot another container based on that new base image to run the rest of the job
14:23 <mordred> tristanC: because it's a key part of the jobs that they install the bindep depends as part of the job
14:24 <mordred> also - for things like openstack, we have 2k repos and all of them have different bindep depends lists - so we'd either have to maintain base images with pre-installed deps for every repo+branch+distro combo ... and lose the ability to fail a job if someone proposes an update to the bindep list
14:24 <tristanC> mordred: well maybe it's just a matter of defining labels that support dynamic bindep (e.g. root access required or container-native) versus ones that don't (e.g. a regular openshift user pod)
14:24 <mordred> or we need to have the base job able to build a base image that's only for that job
14:25 <mordred> indeed - that might be a possibility ... "this label provides root access"
14:25 <mordred> but I think we'd need to think about how we want to express that or where
14:25 <mordred> do you want to put a flag on a job that says "this job requires root"
14:26 <mordred> and have zuul/nodepool be able to provide a useful error message if someone tries to run a job-needs-root on a label that doesn't provide it?
14:26 <tristanC> mordred: it's more the project that should indicate that, because one tox job would need to run bindep, but another would not
14:27 <mordred> that sounds confusing to me - the tox job in zuul-jobs runs bindep, so the tox job in zuul-jobs needs root
14:27 <mordred> that would be the first time that we'd have a job in zuul-jobs that requires a specific capability in a nodepool label
14:28 <mordred> it's written to be cross-distro because it's not tied to qualities of the label
14:28 <mordred> also - users of zuul may have *no idea* about the details of the underlying COE - and they may not know that the base job they are using needs root
14:28 <mordred> because the way tox runs bindep may not be a thing they are aware of
14:29 <mordred> running on an openshift without privs is a zuul operator choice and a thing a zuul operator could be reasonably expected to understand
14:29 <mordred> but it doesn't seem like a piece of information that a zuul user should need to know
14:30 <tristanC> mordred: we should support users that don't need root access to run jobs
14:30 <tristanC> mordred: otherwise, regular openshift users won't be able to use zuul
14:30 <mordred> regular openshift users already can't use zuul
14:31 <tristanC> well they can't deploy it atm, but they can use a vm for that...
14:31 <mordred> :)
14:32 <tristanC> (i'll eod soon)
14:32 <mordred> tristanC: seriously - thanks for the nice long in-depth conversation - I know it's super helpful for my understanding - and I imagine the scrollback will be very interesting and useful to people as they wake up
14:33 <rcarrillocruz> (it is)
14:33 <mordred> \o/
14:34 <tristanC> hope that helps :-)
14:36 <corvus> good morning!  thanks for the conversation in scrollback :)  i've read it, and will digest as well.
14:36 <corvus> in other news... over in #openstack-infra we found a nodepool regression
14:37 <corvus> it looks like the launchers may not be reporting API times
14:37 <corvus> i wonder if that's related to the openstacksdk switch?
14:37 <corvus> (reporting -> reporting to statsd)
14:37 <tobiash> corvus: sounds most likely
14:37 <mordred> corvus: oh good. yeah - it certainly could be
14:38 <mordred> corvus: I shall start looking into that right now
14:39 <tobiash> corvus: re job pause, should we make inventory merging into the child inventory optional?
14:40 <corvus> tobiash: inventory merging?  i don't follow.
14:40 <tobiash> corvus: the idea of buildset resources is to make the parent job resources available in the child jobs
14:41 <Shrews> corvus: pabelanger: mordred: so a regression from the shade to sdk move?
14:41 <tobiash> I think I'd like that to be optional
14:41 <Shrews> is it the rate limiting?
14:41 <pabelanger> Shrews: not sure, still looking
14:42 <mordred> Shrews, corvus, pabelanger: in the logs, are we seeing (queue: ) in the "Manager %s running task" lines?
14:42 <corvus> tobiash: i did not think that the idea was to do that *directly*, but rather, to allow services running on the parent jobs to keep running and allow the child jobs to indirectly use those services.  so a parent could run, say, an image server, and then return the address of that image server using zuul_return, and the child job could access the image server through the network.  not that the child job would get that node in its inventory.
14:42 <tobiash> that would support a use case like the parent spins up a service (containing possibly secrets) that can be used by the child, but the child job should not be able to ssh-access the parent
14:42 <corvus> Shrews, pabelanger, mordred: i have tcpdumped on a host and see this: E..c..@.@./.....h........O.0nodepool.task.packethost-us-west-1.compute.DELETE.servers:779.000000|ms
14:42 <corvus> that sure looks like a task time
14:43 <tobiash> corvus: ah, so this would need to be set up by the parent
14:43 <corvus> tobiash: yeah... do you have any use case that requires the parent nodes in the child inventory?
14:43 <tobiash> like create an ssh key and supply it via zuul_return too, to make rsync available
14:43 <mordred> Shrews, corvus, pabelanger: ok, we are. that means we are properly running the nodepool TaskManager at least (first panic)
14:43 <tobiash> corvus: no, that was just my first idea to make it available to the synchronize task
14:44 <tobiash> without adding it to the inventory we cannot use synchronize and need to call rsync directly (which is not a problem for me)
14:44 <corvus> tobiash: ah, good, then i think we're mostly imagining the same thing.  yeah, i think that may be the best approach for now
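The zuul_return pattern corvus sketches, written out as a hypothetical parent-job task (the group name and variable names are illustrative):

```yaml
# Hypothetical playbook snippet for a parent job: publish the address of a
# service it started so that child jobs can reach it over the network.
- hosts: localhost
  tasks:
    - name: Return image server address to child jobs
      zuul_return:
        data:
          image_server_url: "http://{{ hostvars[groups['image-server'][0]].ansible_host }}:8080/"
```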
14:44 <mordred> Shrews, pabelanger, corvus: did the key names somehow change?
14:44 <tobiash> corvus: ok, so that probably was a bad idea, thanks
14:45 <pabelanger> mordred: I don't see "Manager %s ran task" in the logs
14:45 <pabelanger> mordred: which is from post_run_task
14:45 <corvus> mordred: yes
14:45 <corvus> nodepool.task.$provider.ComputeDeleteServers.mean
14:45 <corvus> that's what grafana is expecting
14:46 <openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: WIP: Support parent job pause  https://review.openstack.org/585389
14:46 <corvus> mordred: er, also, the tasks all seem to be "GET" "DELETE" "POST"
14:46 <tobiash> corvus: that's a first proof of concept that passes a simple test case and doesn't break other things
14:47 <corvus> tobiash: heh, that makes it sound ready to merge ;)
14:47 <mordred> corvus: well crap
14:48 <mordred> corvus: yes, that is where the regression is - lemme see if I can figure out a good way to fix it
14:48 <corvus> mordred: ack, thx
14:48 <tobiash> corvus: yeah, it needs to be polished a bit, but I think I kept the event flow changes very small
14:49 <mordred> corvus: sdk produces the name "compute.DELETE.servers" instead of ComputeDeleteServers
14:49 <mordred> corvus: and I totally forgot about that when working on the sdk patch
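A sketch of the kind of conversion the fix needs (a hypothetical helper, not the exact code in 585397): turn sdk-style dotted task names back into the shade-style CamelCase keys that grafana expects:

```python
def shade_style_key(task_name):
    """Convert an openstacksdk task name like 'compute.DELETE.servers'
    into the shade-style statsd key 'ComputeDeleteServers'."""
    # str.capitalize() upper-cases the first letter and lower-cases the
    # rest, which also normalizes the all-caps HTTP method segment.
    return ''.join(part.capitalize() for part in task_name.split('.'))

print(shade_style_key('compute.DELETE.servers'))      # ComputeDeleteServers
print(shade_style_key('compute.GET.servers.detail'))  # ComputeGetServersDetail
```

This also covers the ComputeGetServersDetail case corvus asks about below.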
14:53 <openstackgerrit> Monty Taylor proposed openstack-infra/nodepool master: Make statsd key look like the keys from shade  https://review.openstack.org/585397
14:53 <mordred> corvus, Shrews, pabelanger: ^^ that should fix the current regression
*** nchakrab has quit IRC14:54
14:55 <corvus> mordred: that will produce ComputeGetServersDetail for example?
14:55 <corvus> (along with ComputeGetServers)
14:56 <mordred> corvus: yes - although I'm verifying the detail one real quick
14:56 <tobiash> mordred: that regression sounds like we have a gap in the tests
14:56 <mordred> also ...
14:57 <mordred> pabelanger: we're not logging 'ran task' anymore
14:57 <corvus> tobiash: indeed, i was thinking we should port over zuul's statsd fixture
14:57 <mordred> ++
14:57 <tobiash> ++
14:57 <mordred> corvus: yes. ComputeGetServersDetail and ComputeGetServers
14:59 <corvus> erm.  we do have a statsd fixture in nodepool.
14:59 <corvus> test_launcher.py:        self.assertReportedStat('nodepool.nodes.ready', '1|g')
14:59 <corvus> test_launcher.py:        self.assertReportedStat('nodepool.nodes.building', '0|g')
15:00 <corvus> that's the only assertion we do.  we probably just need to add some more.
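For readers unfamiliar with such fixtures, here is a sketch of what an assertion like assertReportedStat does under the hood (a hypothetical helper, not nodepool's actual implementation): statsd datagrams are plain 'name:value|type' strings, so checking a reported stat is just string parsing over whatever a fake UDP listener captured:

```python
def assert_reported_stat(datagrams, key, value=None):
    """Check that a statsd metric was reported.

    datagrams: list of raw statsd payload strings (e.g. captured by a
    fake UDP listener); each payload looks like 'name:value|type'.
    """
    for dgram in datagrams:
        name, _, rest = dgram.partition(':')
        # Match on the metric name, and on 'value|type' if one was given.
        if name == key and (value is None or rest == value):
            return True
    raise AssertionError('stat %s=%s not reported' % (key, value))

captured = ['nodepool.nodes.ready:1|g', 'nodepool.nodes.building:0|g']
assert_reported_stat(captured, 'nodepool.nodes.ready', '1|g')
assert_reported_stat(captured, 'nodepool.nodes.building')
```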
15:00 <mordred> pabelanger: oh - hrm. that gets logged in the sdk taskmanager - so it should happen in the super call - but the sdk taskmanager is using self._log and nodepool's is using self.log - so we're also sending that log line to the wrong logger
15:00 * mordred works on more patches
15:08 <corvus> hrm, since the taskmanager is inside openstacksdk, the normal test path doesn't use that.  so we're not exercising the task manager in the unit tests.
15:09 <corvus> (and therefore, not generating these metrics)
15:09 <mordred> corvus: oh, hrm
15:09 <openstackgerrit> Monty Taylor proposed openstack-infra/nodepool master: Emit 'Manager ran task' log lines  https://review.openstack.org/585401
15:10 <mordred> that'll fix the other regression
15:13 <clarkb> mordred: is the logging all that the parent class post_run_task does?
15:13 <mordred> clarkb: yup
15:14 <mordred> clarkb: if it ever grew something else it would be statsd-emitting
15:14 <mordred> corvus: I think we'd have to pull in requests-mock and then mock out openstack things at the http layer instead of the FakeOpenStackCloud to be able to get a test of that specific thing - which would be a non-trivial change in the way we're testing things that I'm not sure is worth it ...
15:15 <corvus> yeah.  we could look at adding statsd to the functional tests
15:15 <mordred> corvus: perhaps we should just make sure there are tests in sdk that test what task.name is and make sure it's clear that that's part of the API there
15:16 <mordred> and also functional tests
15:17 <clarkb> mordred: question on the child change, but feel free to approve if you don't want to update that
15:18 <mordred> clarkb: I could make a followup for that - or respin that patch
15:18 <mordred> corvus: thoughts?
15:19 <mordred> corvus: (re clarkb's suggestion in https://review.openstack.org/#/c/585401/1/nodepool/task_manager.py@81)
15:19 <corvus> oh, hrm, yeah i think that might be best
15:19 <mordred> respin then?
15:20 <corvus> (i'm not too worried about backwards compat in log lines, it'd be more to keep nodepool internally consistent)
15:20 <corvus> yeah i think so
15:22 <mordred> ok. how about I squash both patches and update it with that then
15:22 <corvus> wfm
15:23 <openstackgerrit> Monty Taylor proposed openstack-infra/nodepool master: Make statsd key look like the keys from shade  https://review.openstack.org/585397
15:24 <mordred> corvus, clarkb: ^^
*** nchakrab has joined #zuul15:30
*** nchakrab has quit IRC15:32
*** pcaruana has quit IRC15:32
15:36 <Shrews> yay patches
15:36 <Shrews> thx mordred
15:38 <Shrews> mordred: found an issue
15:39 <Shrews> after I +W'd. slow brain
15:41 <mordred> Shrews: yay!
*** rlandy is now known as rlandy|brb15:42
15:43 <mordred> Shrews: ooh - thanks!
15:44 <Shrews> mordred: also, should that be ".".join()?
15:44 <clarkb> Shrews: no, it is converting foo.Bar.baz to FooBarBaz
15:44 <Shrews> ah
15:44 <openstackgerrit> Monty Taylor proposed openstack-infra/nodepool master: Make statsd key look like the keys from shade  https://review.openstack.org/585397
15:45 <mordred> Shrews: I updated the comment to make the second thing clearer for reading
15:45 <Shrews> danke
15:45 <Shrews> +2
*** nchakrab has joined #zuul16:13
*** nchakrab has quit IRC16:15
*** panda is now known as panda|off16:18
*** rlandy|brb is now known as rlandy16:28
16:32 <clarkb> should we enqueue 585397 directly to the gate so that we can restart the nodepool launchers before Shrews calls it a day?
*** sshnaidm|ruck is now known as sshnaidm|bbl16:37
16:40 <mordred> clarkb: yah - maybe so
16:41 <Shrews> clarkb: i really shouldn't need to be around for the restart. the only other things going in are very minor
16:41 <clarkb> ok
16:41 <Shrews> logging changes and mordred's statsd fix
*** jlk has quit IRC16:48
*** jlk has joined #zuul16:48
*** jlk has quit IRC16:48
*** jlk has joined #zuul16:48
*** rlandy is now known as rlandy|afk16:49
*** myoung is now known as myoung|lunch16:58
*** bbayszcz_ has joined #zuul17:05
*** bbayszczak has quit IRC17:09
*** bbayszcz_ has quit IRC17:09
jtannermordred: http://bullshitipsum.com/ for your enjoyment17:18
mordredjtanner: ++17:27
rcarrillocruzfolks, can't remember if there was a talk about serving the nodepool images in some central location18:10
rcarrillocruzi.e. dib images18:10
mordredrcarrillocruz: we've talked about it before, but we haven't done it at all yet18:19
rcarrillocruzok18:19
rcarrillocruzre: 'i want to repro this job that failed in the gate locally'18:19
rcarrillocruzi.e. there should be a way to mimic that by using the same image18:19
mordredyah'18:19
rcarrillocruz(providing the image allows ssh key injection)18:20
mordredso - part of the issue is that our current images are HUGE because they are also serving as git repo caches18:20
rcarrillocruzin some of ours it will not (yay ! )18:20
mordredthere were some thoughts that came out of the container spec about some other ways to do that18:20
clarkbrcarrillocruz: mordred openstack infra is serving the images now18:24
clarkbhttps://nb01.openstack.org/images for example18:24
clarkbwe just put apache in front of the dir that dib builds them in and filter out everything but qcow2 (because qcow2 are smallest)18:25
rcarrillocruzah sweet18:25
*** myoung|lunch is now known as myoung18:37
clarkbmordred: http://logs.openstack.org/97/585397/3/check/tox-pep8/093d4e5/job-output.txt.gz#_2018-07-24_17_45_16_774059 fyi18:48
mordredclarkb: sigh18:55
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Make statsd key look like the keys from shade  https://review.openstack.org/58539718:57
*** rlandy|afk is now known as rlandy19:03
*** sshnaidm|bbl is now known as sshnaidm|ruck20:11
*** GZ2k has joined #zuul20:15
*** GonZo2000_ has quit IRC20:18
*** samccann has quit IRC20:35
*** dkranz has quit IRC20:36
openstackgerritMerged openstack-infra/nodepool master: Make statsd key look like the keys from shade  https://review.openstack.org/58539720:44
openstackgerritIan Wienand proposed openstack-infra/zuul master: Add variables to project  https://review.openstack.org/58423020:48
openstackgerritIan Wienand proposed openstack-infra/zuul master: Use definition list for job status  https://review.openstack.org/58478020:48
corvusmordred: http://paste.openstack.org/show/726561/22:00
mordredcorvus: well then!22:08
mordredcorvus: that is ludicrously fantastic22:08
corvusit's pretty groovy!  i'm tidying up the boilerplate; should be done soon22:09
clarkbianw: re https://review.openstack.org/#/c/547889/ did the desired zuul change get pushed?22:14
*** GZ2k has quit IRC22:21
*** rlandy has quit IRC22:37
ianwclarkb: i haven't got back to that yet22:40
clarkbok not a rush, just noticed it when doing a pass over nodepool changes to see if there was anything else worth trying to get into the next release22:41
clarkbspeaking of I think based on current nodepool behavior mordred's last commit that merged is taggable22:49
corvuscool.  let's tag it tomorrow?22:50
corvusi'm happy with zuul's memory usage in infra.  should we tag that commit tomorrow as well?22:50
clarkbwfm22:50
mordredclarkb, clarkb: ++22:55
mordredgah22:55
mordredcorvus: ++22:55
clarkbpabelanger: did you see my comment on https://review.openstack.org/#/c/584978/ ?22:55
clarkbpabelanger: ianw has a good comment on https://review.openstack.org/#/c/585068/ too22:56
pabelangerclarkb: haven't looked yet22:57
clarkbianw: is the change in header levels in https://review.openstack.org/#/c/584779/1/doc/source/user/jobs.rst intentional? Change items et al were under Zuul Variables but are now at the same level?23:01
clarkbhttp://logs.openstack.org/79/584779/1/check/build-sphinx-docs/7c39d99/html/user/jobs.html#change-items for example23:02
clarkbotherwise that organizational change makes a lot of sense to me, but I think Change Items et al. should be under Zuul Variables as they extend the base set defined in Zuul Variables23:03
ianwclarkb: hrm, i thought i just moved the zuul.* jobs into a section, which would have "raised" it a level23:08
clarkbianw: Change Items and friends have the same underline string as Zuul Variables so are rendered at the same level now23:09
ianwclarkb: hrm, it should be23:11
ianwzuul23:11
ianw----23:11
ianwchange23:11
ianw~~~~23:11
clarkbianw: you know I may just be blind, I looked at the rendered output too and they looked very similar, but that is probably a style css thing more than rst23:12
ianwyeah, i think if you look at http://logs.openstack.org/79/584779/1/check/build-sphinx-docs/7c39d99/html/user/jobs.html#id123:12
ianwyou can see there "working directory" is *slightly* smaller than the "SSH Keys" below it23:13
corvusTOC in top left can confirm23:13
ianwyes good point, the TOC reflects what i thought i was doing, anyway :)23:14
clarkbyup, I'm just blind23:14
ianwi've never found an emacs mode/magic that can roll up sections.  i'm sure it exists23:15
ianwi even installed atom at one point to see if it did that effectively, but i couldn't deal with it :)23:15
openstackgerritPaul Belanger proposed openstack-infra/zuul master: Fix zuul reporting build failure with only non-voting jobs  https://review.openstack.org/58499023:16
clarkbianw: did it use all the memory?23:16
pabelangercorvus: clarkb: tobiash: we also should include 584990 for zuul release tomorrow23:17
pabelangerif you'd like to review23:17
corvusoh yes, we were blocking on that23:19
pabelangerclarkb: ianw: re: 585068 yah, I can change umask, will look at that for the morning23:20
corvuspabelanger: left a comment on 58499023:21
pabelangercorvus: okay, I'll pick it up first thing in the morning, if that is okay23:22
corvusyep23:23
clarkbianw: left a thought on https://review.openstack.org/#/c/584230/4 I don't know that it is correct but something to consider23:29
ianwclarkb: yeah, i'll take advice if people want to see it differently.  after re-reading this morning a clarification that the variables are merged from templates into the project vars was what i just added in the /4 change23:35
clarkbianw: probably a decision for jeblair, mostly if I was reading the config naively I would assume that those project vars only apply to the jobs defined under them, however it is a "project-template"23:36
ianwnote the job can override them in its vars: section too, so it is "leaky" as it were :)23:37
corvusthat's a really good question.  i'm going to think on it overnight.23:41
*** GonZo2000 has joined #zuul23:45
*** GonZo2000 has quit IRC23:45
*** GonZo2000 has joined #zuul23:45
*** yolanda_ has joined #zuul23:56
*** yolanda has quit IRC23:58

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!