openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Fix traceback when evaluating github changes https://review.openstack.org/470068 | 00:04 |
---|---|---|
*** jamielennox is now known as jamielennox|away | 00:44 | |
SpamapS | pabelanger: certainly seems like landing the change should cause Zuul to just reconfigure, but maybe we're forgetting something. :-P | 01:10 |
pabelanger | SpamapS: I assume it will, but zuulv3-dev.o.o isn't managed by puppet today. One we get zuulv3.o.o online, hopefully in the next day or so. We'll know for certain! | 01:12 |
jeblair | SpamapS: re 453851, i think we should land the whole series as soon as all the tests are green; however, the last 2 changes have failed for some reason i haven't looked into yet | 01:43 |
jeblair | SpamapS, pabelanger, jlk: as soon as a change to a config-repo lands, zuul notices and loads the new configuration. | 01:44 |
jeblair | SpamapS: oh i see the problem on the bwrap series, fixing | 01:45 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained https://review.openstack.org/468208 | 01:47 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test https://review.openstack.org/461881 | 01:47 |
SpamapS | 2017-06-01 23:56:31.011945 | [Zuul] Job timed out, result: FAILURE ? | 01:55 |
*** jamielennox|away is now known as jamielennox | 02:00 | |
jlk | jeblair: interesting, that's not exactly what I was witnessing. | 02:52 |
jlk | (update on change) | 02:52 |
*** jamielennox is now known as jamielennox|away | 03:30 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Case sensitive label matching https://review.openstack.org/469946 | 05:11 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix tests for "Case sensitive label matching" https://review.openstack.org/469985 | 05:11 |
*** mmedvede has quit IRC | 05:28 | |
*** mmedvede has joined #zuul | 05:31 | |
*** rattboi has left #zuul | 07:00 | |
*** hashar has joined #zuul | 07:49 | |
*** jamielennox|away is now known as jamielennox | 08:21 | |
*** jamielennox is now known as jamielennox|away | 08:49 | |
*** persia has quit IRC | 09:38 | |
*** persia has joined #zuul | 09:39 | |
tobiash | mordred: I have a use case for configuring the finger port. | 09:50 |
tobiash | mordred: If running in an unprivileged container it is not allowed to listen on a low port (79). | 09:51 |
*** jkilpatr has quit IRC | 10:30 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243 | 10:38 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Don't copy the __pycache__ folder https://review.openstack.org/470245 | 10:42 |
*** jkilpatr has joined #zuul | 10:53 | |
mordred | tobiash: awesome. | 11:20 |
tobiash | :) | 11:34 |
Shrews | tobiash: wow, you really are running right on the cutting edge of development, aren't you? :) | 11:38 |
tobiash | Shrews: latest (this morning) feature/zuulv3 + some not yet approved patches :) | 11:42 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243 | 12:09 |
pabelanger | Shrews: maybe if you have time today, you can help me understand how to pruge old images from dib-image-list on nodepool.o.o. We have an unknown image in deleting state for 8 days now | 13:52 |
pabelanger | only think I can think of is using zkshell to remove it | 13:53 |
Shrews | pabelanger: looking | 13:59 |
SpamapS | tobiash: you're going to need a privileged container to run bubblewrap FYI | 14:00 |
pabelanger | containers in containers! :D | 14:01 |
SpamapS | tobiash: you can of course opt out by using the nullwrap driver, but then you risk untrusted jobs breaking out and pwn'ing your executor. | 14:01 |
Shrews | pabelanger: those dib files do not exist on the owning builder nb04. how did that happen? | 14:03 |
Shrews | pabelanger: if they were manually removed, then yeah, we'll have to delete the znodes manually | 14:04 |
pabelanger | Shrews: we renamed the image a few days ago from opensuse-42.2 to opensuse-422. The images also failed to build a few times, and seems they hung around | 14:04 |
pabelanger | Shrews: is it possible to check zookeeper to see if the image name is valid, then make just auto purge them? | 14:05 |
pabelanger | or some other nodepool util to clean them up | 14:05 |
SpamapS | I wonder if we could make a wrapper that uses docker/rkt/k8s instead of bubblewrap | 14:05 |
Shrews | pabelanger: how would you decide if an image name is valid? | 14:05 |
pabelanger | Shrews: right, I guess we cannot depend on local nodepool.yaml file | 14:06 |
SpamapS | tobiash: what's your container overlord? Docker? | 14:06 |
pabelanger | since it would be possible for other builder to have a different image now | 14:06 |
pabelanger | name* | 14:06 |
Shrews | pabelanger: what we can probably do is change the way we delete the znode. right now, it's only deleted if the dib is successfully removed from disk. i suppose we could add a check that removes the znode if the on-disk dib is already gone | 14:07 |
pabelanger | Shrews: actually, we know nb04.o.o build the image, but it does not exist in its configuration file, can we not do something about that? | 14:07 |
pabelanger | Shrews: Ya, that's basically the issue now I think | 14:07 |
Shrews | pabelanger: oh, we already have that check | 14:09 |
pabelanger | mordred: this seems new on nb04.o.o: http://paste.openstack.org/show/611300/ | 14:11 |
pabelanger | Shrews: okay, looking in logs to see what is happening | 14:11 |
pabelanger | Shrews: are we bailing out on line 406 for nodepool/builder.py? | 14:14 |
tobiash | a gate job timed out: https://review.openstack.org/#/c/470243/ | 14:14 |
tobiash | ist this a known issue? | 14:14 |
tobiash | http://logs.openstack.org/43/470243/2/gate/gate-zuul-python27-ubuntu-xenial/db8a735/ | 14:15 |
Shrews | i don't know | 14:15 |
tobiash | SpamapS: good to know, thanks | 14:16 |
tobiash | SpamapS: currently using plain docker | 14:16 |
SpamapS | I expect bwrap to land pretty soon so be ready :) | 14:16 |
SpamapS | would be interesting to try and write a docker wrapper though | 14:17 |
tobiash | SpamapS: will try that out after my vacation in two weeks | 14:19 |
tristanC | OA | 14:26 |
tristanC | oops :-) | 14:26 |
Shrews | pabelanger: i *think* what's happening is that the diskimages is not in nodepool.yaml AND there are no local files on disk (for whatever reason), so nb04 doesn't think it's responsible for it (according to https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L402-L405) | 14:27 |
pabelanger | Shrews: yes, that is what I am thinking too. I was trying to see if we could some how use the 'builder' field from zk, since we add it to each build | 14:28 |
Shrews | i'm not entirely certain why that check exists | 14:29 |
pabelanger | I think we want to use ImageBuild.builder | 14:30 |
pabelanger | wasn't it to stop other nodepool-builders from deleting DIBs that didn't exist on their filesystem? | 14:30 |
Shrews | my concern is that we might be doing things behind the builder's back that might be causing this situation | 14:30 |
Shrews | or, we have a bug somewhere | 14:31 |
pabelanger | Ya, we should see about adding a test to expose this. should be straightforward to setup current zk values | 14:32 |
pabelanger | unknown image in building state | 14:32 |
Shrews | pabelanger: why does nb04 have a fully qualified hostname, but the others do not? | 14:33 |
Shrews | others == nb03 | 14:33 |
pabelanger | Shrews: ubuntu xenail image in rackspace did that | 14:36 |
pabelanger | nb03 is in vexxhost | 14:36 |
Shrews | seems obvious NOW that we could just use the builder hostname to see if it is a locally owned image rather than checking on-disk files, but that should have seemed obvious THEN too, so maybe there's a reason i can't remember why we didn't use hostname | 14:37 |
pabelanger | ya, not sure: https://review.openstack.org/#/c/400421/ was the review that seems to add it | 14:39 |
Shrews | jeblair: Can you remember why we wouldn't just compare builder hostnames to determine if we own the build? ^^^ | 14:52 |
jeblair | i think originally we just didn't want to rely on hostnames for something like that. they are finicky. (if you have to change your hostname, how do you convince nodepool to delete those images, and would it even be obvious to a user that the reason their system broke is they changed the hostname) | 14:52 |
Shrews | speak o' da debil | 14:52 |
Shrews | ok, yeah, that makes sense | 14:52 |
Shrews | pabelanger: i think we'll just have to manually delete that image from zk | 14:53 |
jeblair | i'm willing to revisit that (we can probably mitigate those concerns with more work). though i'm curious how we ended up with the image missing from disk | 14:54 |
Shrews | jeblair: yeah, the missing image is also my concern | 14:54 |
pabelanger | I can see about reproducing it, but I think it had to do with stopping nodepool-builder while DIB was building failed dib builds. In this case, I think we were missing the zypper dependency for DIB, and it was just looping failures | 14:58 |
pabelanger | then, when we started up, I don't think the deleting states were ever purged | 14:58 |
pabelanger | we eventually created a successful image, but then renamed it | 14:58 |
pabelanger | leaving the stale data | 14:58 |
jeblair | pabelanger: do you think nodepool was stopped after it had deleted a failed build but before it removed the associated zk node? | 14:59 |
pabelanger | I think there was nothing to delete, because we didn't create a DIB | 15:01 |
*** dkranz has joined #zuul | 15:01 | |
jeblair | pabelanger: did we accumulate a lot of 'deleting' images then? | 15:02 |
jeblair | or did it actually remove the zk nodes for some/most of them? | 15:02 |
pabelanger | I cannot completely remember. Let me check the logs. but I have not done anything directly to zookeeper personally | 15:04 |
Shrews | so maybe znodes were created in BUILDING state, the DIB failed w/o creating anything on disk, then later BUILDING is set to DELETING but we have nothing to delete ondisk | 15:05 |
Shrews | if DIB doesn't even get to the point of creating the dib directory, then yeah, i can see this happening | 15:05 |
jeblair | yeah, that's what i'm wondering; but if that were the case, we'd expect lots of znode entries, since it would reliably never delete | 15:06 |
Shrews | that is also true | 15:06 |
pabelanger | nodepool-builder.log.2017-05-24_15 on nb04 has what happened | 15:06 |
pabelanger | we looped about 401 times trying to build opensuse-42.2 | 15:07 |
pabelanger | but we then changed its configuration under nodepool when we renamed to opensuse-422 | 15:07 |
Shrews | yeah, several failures, but only 3 znodes | 15:08 |
Shrews | the last 3 failures are the znodes still there | 15:09 |
Shrews | trying to remember if there is some logic that would delete the others | 15:10 |
pabelanger | Ya, I cannot remember if we had 400+ deleting or just a few | 15:10 |
Shrews | znode is deleted ONLY when the local files are deleted, so i'm not sure what happened there | 15:12 |
pabelanger | possible they are deleted on start up? | 15:14 |
Shrews | nope, continuously | 15:17 |
Shrews | every minute, by default, unless the CleanupWorker thread is dead | 15:18 |
Shrews | but it is alive | 15:19 |
pabelanger | Oh, IIRC, I did a manual build of opensuse-42.2 | 15:19 |
pabelanger | because we didn't have opensuse-42.2 in images section for pool | 15:19 |
pabelanger | builder-debug.log.2017-05-24_16:2017-05-24 21:04:15,054 INFO nodepool.builder.BuildWorker.0: Manual build request for image opensuse-42.2 | 15:19 |
pabelanger | not sure if that changes anything | 15:20 |
Shrews | shouldn't matter | 15:21 |
jlk | jeblair: when you say that zuul will watch a config-repo for changes, what's the watch mechanism? how does it become informed of changes there? | 15:31 |
jlk | jeblair: is that through a source driver? | 15:31 |
jeblair | jlk: oh, you know what, that's gerrit specific. we need to abstract that. it's in Scheduler.process_event_queue | 15:32 |
jlk | ah hah | 15:33 |
* jlk adds it to the github support task list | 15:33 | |
jlk | although... | 15:34 |
jlk | maybe that should be it's own story, to support a raw git repo as well as as github | 15:34 |
jeblair | jlk: mostly need to make 'event.type == "change.merged"' into an api call | 15:34 |
jeblair | event.isChangeMerged or something :) | 15:34 |
clarkb | pabelanger: Shrews is this another case where we might consider a `nodepool $action --force` command to handle when things get funny? | 15:35 |
jlk | right okay | 15:36 |
jlk | so, that probably wont' work for pure git, since you don't get events in pure git world | 15:36 |
jeblair | jlk: drivers can (now) return their own subclasses of TriggerEvent, so a git driver could return its own subclass as well, which would also implement isChangeMerged(). and yeah, we'd need to synthetically create events for that to work. that's a possibility down the line. | 15:37 |
jlk | and for github, you probably want to tie that to a push event (which happens on merge too, so you can get the event if it's merged via a PR or just directly pushed) | 15:37 |
jeblair | jlk: sounds like it | 15:37 |
jlk | I should be able to test that out. THis is sounding mor egithub specific, so I'll drop a task in the story. | 15:38 |
*** hashar is now known as hasharAway | 15:39 | |
jlk | Can you link directly to a storyboard task? | 15:43 |
jlk | anyway https://storyboard.openstack.org/?#!/story/2000774 task 4664 I think captures the desire in the notes. | 15:43 |
jeblair | ++ | 15:44 |
Shrews | clarkb: any --force options would scare me because unless someone really understands the system, they could end up making things worse | 15:49 |
jeblair | Shrews, pabelanger: the 3 stuck images appear to have been "created" right before a configuration change, based on the 'creating new providermanager" lines in the log. but it doesn't look like the builder was restarted, based on the absence of zk connection lines. | 15:57 |
jeblair | Shrews, pabelanger: i think i can explain it | 16:08 |
jeblair | the key is that the configuration was changed to remove the opensuse-42.2 image while it was in that failed build loop | 16:09 |
jeblair | if you look at _cleanupImage.... | 16:10 |
jeblair | before the configuration change diskimage=True (because it is in the config) and local_builds=[] (because DIB did not create a build directory) | 16:10 |
jeblair | the function decides it *is* responsible for the image (diskimage=True). it calls _deleteLocalBuild, which finds that there is no local build, *but* there is a hostname check in there. if there is no local build, but the hostname of the builder record in zk matches ours, we assume this case -- that dib failed without creating a dir, so we return true | 16:12 |
jeblair | this allows it to then delete the zk node | 16:12 |
jeblair | then *after* the configuration change which removed opensuse42.2 | 16:13 |
jeblair | diskimage=False (because it is not in the config), so we exit _cleanupImage early because we are not responsible. | 16:13 |
jeblair | (since there are still no local builds) | 16:13 |
Shrews | jeblair: ah, that makes sense. a race with the user changing the config | 16:15 |
pabelanger | yes, that seems to make the most sense | 16:16 |
jeblair | combined with an invalid assumption about external state (that we get a dib directory if we invoke dib) | 16:16 |
jeblair | i can think of 3 approaches to correct: a) remove the zk record earlier if there is no dib directory. b) ensure there is always a dib directory, eg, by creating it before invoking dib. c) rely more on hostnames to determine build ownership | 16:18 |
SpamapS | is there a sub-version of (a) which is to mark the zk record as vulnerable to this race while we're starting builds, and then that enables (c) ? | 16:19 |
* SpamapS says this without having looked at the records at all to see if they already have that | 16:20 | |
jeblair | SpamapS: could be. we're actually pretty close to that now with setting it as 'FAILED' then changing it to 'DELETE'; we just don't do anything with that knowledge. | 16:20 |
Shrews | jeblair: what if we change this line to also check for a hostname match? https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L356 | 16:23 |
Shrews | we'll consider it a local build if we have the files OR the hostname matches | 16:24 |
Shrews | that's probably a safe use of hostname | 16:24 |
SpamapS | Another idea is to have each builder make its own permanent uuid that isn't the build, but the builder, and attach that to every build. So instead of "is there a dir for the build" just "if there is a record for my builder ID, I own that build" and forget hostname even exists other than for admin-info. | 16:25 |
jeblair | SpamapS: yep, and store that in a local state dir | 16:25 |
SpamapS | jeblair: GMTA :) | 16:26 |
jeblair | Shrews: yeah, that would work. we would be relying on hostname enough that "bad things" are likely to happen if you have two builders with the same hostname, but not relying on them so much that if you change a hostname, things will automatically break. | 16:28 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool master: Consider hostname in determining local builds https://review.openstack.org/470364 | 16:39 |
Shrews | I did not submit a features/zuulv3 version since I figure the update with master will cover that. | 16:41 |
jeblair | SpamapS, mordred: i think you have convinced me that if we are going to improve the executor merger situation by avoiding repeating the merge ops, we should do so by fetching from the merger over http rather than sending git-packs over gearman. the latter would likely work, but it may inhibit our ability to move to zk for zuulv4. meanwhile, i think the added requirement of a webserver on the mergers probably isn't too onerous as long as it's ... | 17:20 |
jeblair | ... internal-only (ie, only the executors need to be able to reach them). | 17:20 |
mordred | jeblair: ++ | 17:21 |
clarkb | that should alleviate corporate firwall rule issues | 17:21 |
jeblair | SpamapS: the only remaining grey area for me is, if someone is running a merger in a container, will that merger be able to know enough to say "contact me at http://address:port/" to the executor? | 17:21 |
jeblair | mordred: ^ | 17:21 |
jeblair | clarkb: yeah, that's the big one. "also set up apache or whatever" is annoying but tractable. | 17:22 |
jeblair | (besides, you don't actually need to do it if you run zuul all-in-one, so that's nice for small deployments) | 17:22 |
clarkb | re container addressing I think it depends entirely on your container setup. You likely want to be able to configure what that address is in the merger for flexibility | 17:22 |
pabelanger | mordred: would fetch over git+ssh be an option too? | 17:23 |
mordred | SpamapS, jeblair: I like the uuid-in-local-state-dir idea - also referencing container space - running builders in k8s pods with a persistent volume - the volume would be the thing that's the most sticky and least likely to change | 17:23 |
mordred | I agree with clarkb re: addressing | 17:23 |
clarkb | NAT is a tool that pops up over and over :( | 17:23 |
jeblair | mordred, clarkb: so we add a config option to the merger to say "this is your http accessible address"? | 17:23 |
mordred | optional merger config that tells the merger what its hostname and port are is likely important | 17:24 |
mordred | jeblair: yah | 17:24 |
jeblair | sounds workable | 17:24 |
pabelanger | jeblair: sorry, that git+ssh question should have been for you | 17:24 |
SpamapS | Yes, typically apps that run in containers are just like apps that run behind load balancers. You need to tell them where they're reachable, because they won't actually know themselves. | 17:24 |
jeblair | pabelanger: yeah, git+ssh would work; i'm not sure which is the easier thing to set up. | 17:25 |
SpamapS | It's really just a mapping problem. They should probably just say "merge result is at zm03" and zm03's address is mapped by a cluster manager that knows where it put zm03. | 17:26 |
SpamapS | (there's this awesome technology for mapping names to addresses btw...) | 17:26 |
*** harlowja has quit IRC | 17:28 | |
pabelanger | jeblair: git+ssh would be one less external dependency (apache / nginx) but mean setup public / private keys. I think the public / private key approach would be okay, since we do that for nodepool / zuul-executor today | 17:28 |
SpamapS | so... | 17:29 |
pabelanger | might be a fun thing to test in puppet / ansible | 17:29 |
SpamapS | if it's in a container, it's one _more_ dependency | 17:29 |
SpamapS | because containers don't typically run sshd | 17:29 |
pabelanger | agree | 17:29 |
pabelanger | SpamapS: I guess with containers you would share the mount for /var/lib/zuul/git and use your protocol of choice, in another containers? | 17:31 |
SpamapS | pabelanger: you could | 17:32 |
mordred | yah - I think there is an easy a way the apache bit can work with containers (run a pod with a volume a merger and an apache) - so for now just doing the apache route and not sorting out ssh seems easier to me | 17:33 |
mordred | since that's also easy to ansible or puppet for non-container land | 17:33 |
pabelanger | ya, I think git+ssh sounds promosing because we'd likely not need http://git.openstack.org/cgit/openstack-infra/puppet-zuul/tree/templates/zuul.vhost.erb any more. Or does that get smaller with zuulv3? | 17:34 |
clarkb | from the corporate firewall perspective IME opening up http(s) is a significantly more easy than any other protocol | 17:35 |
pabelanger | git-core/git-http-backend was what I was thinking | 17:35 |
pabelanger | clarkb: that is true | 17:35 |
clarkb | and relying on DNS (while my preference) is often a burden in corp envs where things need legal review and so on | 17:36 |
jeblair | though, we're still talking internal access only. this is a system which already requires zookeeper (and currently gearman). i'm not as worried about internal firewalls. | 17:37 |
clarkb | jeblair: IME internal firealls are just as much a problem :) | 17:37 |
clarkb | but if we assume a cloudy env probably not a big deal | 17:37 |
jeblair | i'm not assuming a cloudy env, i'm assuming an environment where multiple systems can talk to each other. :) | 17:37 |
jeblair | i'm saying that we crossed that bridge once we created a distributed system at all. | 17:38 |
clarkb | I agree, its just in my expereince https is the easiest protocol to get opened up in such environments | 17:43 |
clarkb | because its a thing everywhere they are often already rules in place that say you can http rather than going up in front of a security board and having them tell you no | 17:44 |
mordred | yah. I mean, I think we should use http(s) for this ... but I also think that a person is not going to be succcesful if they do not have adequate control of the network flow in-between their zuul nodes | 17:45 |
mordred | since they'll already have needed to go to the security board to get zk and gear opened, not to mention ssh between executors and build nodes | 17:46 |
mordred | but since this is _only_ for intra-cluster communicatoin, it really shouldn't be an issue like it would be if we were talking about needing the build nodes to be able to contact this git service (like from gozer days) | 17:46 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381 | 17:47 |
*** jkilpatr has quit IRC | 17:48 | |
jeblair | clarkb: yeah, sadly, sometimes it is more difficult to open ssh, even internally. i don't think we should weigh that heavily against it; fortunately, i think there are some other reasons to prefer http anyway. | 17:49 |
jeblair | also, we don't have to implement this now -- the "repeat the merge" code is written and good enough to get us out the door. | 17:50 |
*** jkilpatr has joined #zuul | 18:03 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385 | 18:05 |
mordred | jeblair, SpamapS: re: pabelanger's comment on https://review.openstack.org/#/c/453851 ... any reason to not just use bwrap for both branches? | 18:05 |
SpamapS | mordred: I think it's a great idea. | 18:06 |
SpamapS | Prevents mishaps. :) | 18:07 |
jeblair | mordred: well, the intent was that a config repo was entitled to do things outside of a container. like write to afs. | 18:07 |
mordred | jeblair: ah. gotcha. write to afs was the bit I was leaving out of my brainhole when looking at the bwrap invocation | 18:08 |
jeblair | we might be able to bind mount afs in | 18:08 |
jeblair | but that was a prototypical example | 18:08 |
jeblair | so i guess we need to decide how much of a prototype it is. :) | 18:09 |
mordred | jeblair: I htink it's easy enough to swith to always using bwrap | 18:09 |
mordred | switch to | 18:09 |
mordred | so I'm fine with the patch as it is now while we think about that | 18:09 |
jeblair | yeah, so let's chalk that up as something we feel fondly about doing in the future after we work through some things :) | 18:09 |
pabelanger | ++ | 18:09 |
mordred | pabelanger, SpamapS: I think the stack up to https://review.openstack.org/#/c/468554 is ready for y'all whenver | 18:15 |
pabelanger | mordred: k, I can review shortly. Working on zuulv3.o.o atm | 18:17 |
mordred | pabelanger: oh! do that. :) | 18:22 |
pabelanger | we should be in good shape to get ze01.o.o online shortly too | 18:29 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389 | 18:32 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390 | 18:32 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391 | 18:32 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392 | 18:32 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393 | 18:32 |
jeblair | End of Series! | 18:32 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390 | 18:37 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391 | 18:38 |
*** jkilpatr has quit IRC | 18:42 | |
*** jkilpatr has joined #zuul | 18:55 | |
mordred | jeblair: I don't believe you | 19:13 |
mordred | tristanC: I'm sorry I have made it through an entire additional week without starting the threads about your patches - I promise it's on my list! | 19:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Abort jobs if JobDir disk usage exceeds an optional maximum https://review.openstack.org/463022 | 19:31 |
mordred | rebased and fixed up adam_g patch ^^ | 19:31 |
openstackgerrit | K Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Handle job or build is None in formatUrlPattern https://review.openstack.org/470415 | 19:34 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add SSH Agent Primitives and usage https://review.openstack.org/468173 | 19:34 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add support for bwrap https://review.openstack.org/453851 | 19:35 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained https://review.openstack.org/468208 | 19:35 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test https://review.openstack.org/461881 | 19:35 |
jesusaur | jhesketh: i found a minor issue with safe_job and safe_build, can you take a look at 470415 when you're around? | 19:40 |
mordred | SpamapS: woot bubblewrap! | 19:41 |
*** hasharAway is now known as hashar | 19:43 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243 | 20:03 |
*** harlowja has joined #zuul | 20:04 | |
SpamapS | \o/ | 20:24 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_periodic https://review.openstack.org/470053 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390 | 20:26 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391 | 20:26 |
* SpamapS finally getting back to review queue.. :-P | 20:32 | |
* jlk out for the weekend. o/ | 20:47 | |
*** dougbtv has quit IRC | 21:28 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects https://review.openstack.org/470442 | 21:31 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects https://review.openstack.org/470442 | 21:34 |
*** hashar has quit IRC | 21:35 | |
jeblair | i've seen some test failures recently with: | 21:38 |
jeblair | stderr: 'fatal: Unable to create '/tmp/tmp.2kLIXZeUlb/tmp4tvsjM/zuul-test/executor-git/review.example.com/org/project/.git/index.lock': File exists. | 21:38 |
jeblair | i believe i know the problem; i'll work up a patch to fix | 21:39 |
*** dougbtv has joined #zuul | 21:41 | |
*** dougbtv has quit IRC | 21:50 | |
*** dougbtv has joined #zuul | 21:54 | |
*** jkilpatr has quit IRC | 21:58 | |
*** dougbtv has quit IRC | 22:01 | |
*** dougbtv has joined #zuul | 22:14 | |
*** jkilpatr has joined #zuul | 22:16 | |
*** dougbtv_ has joined #zuul | 22:17 | |
openstackgerrit | K Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Add github reporter status-url config option https://review.openstack.org/449794 | 22:20 |
*** dougbtv has quit IRC | 22:20 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571 | 22:34 |
*** cinerama` has joined #zuul | 22:34 | |
jeblair | that should fix the git lock issues ^ | 22:34 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571 | 22:36 |
jeblair | SpamapS, mordred, clarkb: we probably want to land that soonish so we can avoid further spurious test failures | 22:36 |
* SpamapS peeks | 22:36 | |
jeblair | drat, debug lines | 22:37 |
clarkb | kk | 22:38 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571 | 22:38 |
jeblair | that just removes 3 debug lines i forgot to clean up | 22:38 |
mordred | jeblair: lgtm | 22:39 |
*** cinerama has quit IRC | 22:40 | |
clarkb | is it worth documenting that you can't cohabitate those repos (either via nfs or any other mechanism) when using split out processes/workers/ | 22:45 |
*** jesusaur has quit IRC | 22:48 | |
jeblair | clarkb: yep. and also if you run a merger and executor on the same host (for some reason. don't do that though since the executor has a merger). | 22:51 |
jeblair | but we haven't really gotten to the revised operational docs yet | 22:52 |
jeblair | so we'll just have to keep that in mind when we do write those | 22:52 |
*** jesusaur has joined #zuul | 22:54 | |
SpamapS | might be good to start building an outline of sorts.. I've heard a few caveats expressed here and I worry we'll forget some | 22:55 |
clarkb | jeblair: reading the change there are three locations where we can run into a conflict. cat, update (which is called by cat), and merge. These all appear to run in the same thread? I must be missing where update can happen in another thread | 22:55 |
clarkb | oh its updateLoop is in a thread of its own got it | 22:56 |
jeblair | yep | 22:56 |
jeblair | cat puts entries into the update queue, but so does executor:execute | 22:56 |
jeblair | (which is another thread) | 22:57 |
clarkb | SpamapS: do you want me to hold off approving so you can reviwe? | 22:57 |
SpamapS | clarkb: if you don't mind, I'm almost done. :) | 22:58 |
clarkb | sure | 22:58 |
clarkb | I +2'd feel free to approve if you don't find a reason to -1 | 22:58 |
SpamapS | clarkb: Thanks, +3'd :) | 22:59 |
SpamapS | jeblair: I left a little comment but it was more "oh there's another way to do this in case you didn't consider it" not a -1 or even a need for follow-up | 23:00 |
jeblair | SpamapS: ah yeah, that would work too. | 23:03 |
jeblair | SpamapS: i *think* this may be the lowest cost way (since with (un)registering we may end up flapping a lot, and incurring extra network cpu costs), but i'm not certain about that. :) | 23:04 |
jeblair | "extra network *and* cpu costs" | 23:04 |
*** jesusaur has quit IRC | 23:13 | |
*** jesusaur has joined #zuul | 23:16 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!