Friday, 2017-06-02

openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Fix traceback when evaluating github changes  https://review.openstack.org/470068 00:04
*** jamielennox is now known as jamielennox|away00:44
SpamapSpabelanger: certainly seems like landing the change should cause Zuul to just reconfigure, but maybe we're forgetting something. :-P01:10
pabelangerSpamapS: I assume it will, but zuulv3-dev.o.o isn't managed by puppet today. Once we get zuulv3.o.o online, hopefully in the next day or so, we'll know for certain!01:12
jeblairSpamapS: re 453851, i think we should land the whole series as soon as all the tests are green; however, the last 2 changes have failed for some reason i haven't looked into yet01:43
jeblairSpamapS, pabelanger, jlk: as soon as a change to a config-repo lands, zuul notices and loads the new configuration.01:44
jeblairSpamapS: oh i see the problem on the bwrap series, fixing01:45
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained  https://review.openstack.org/468208 01:47
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test  https://review.openstack.org/461881 01:47
SpamapS2017-06-01 23:56:31.011945 | [Zuul] Job timed out, result: FAILURE ?01:55
*** jamielennox|away is now known as jamielennox02:00
jlkjeblair: interesting, that's not exactly what I was witnessing.02:52
jlk(update on change)02:52
*** jamielennox is now known as jamielennox|away03:30
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Case sensitive label matching  https://review.openstack.org/469946 05:11
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix tests for "Case sensitive label matching"  https://review.openstack.org/469985 05:11
*** mmedvede has quit IRC05:28
*** mmedvede has joined #zuul05:31
*** rattboi has left #zuul07:00
*** hashar has joined #zuul07:49
*** jamielennox|away is now known as jamielennox08:21
*** jamielennox is now known as jamielennox|away08:49
*** persia has quit IRC09:38
*** persia has joined #zuul09:39
tobiashmordred: I have a use case for configuring the finger port.09:50
tobiashmordred: If running in an unprivileged container it is not allowed to listen on a low port (79).09:51
*** jkilpatr has quit IRC10:30
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable  https://review.openstack.org/470243 10:38
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Don't copy the __pycache__ folder  https://review.openstack.org/470245 10:42
*** jkilpatr has joined #zuul10:53
mordredtobiash: awesome.11:20
tobiash:)11:34
Shrewstobiash: wow, you really are running right on the cutting edge of development, aren't you?  :)11:38
tobiashShrews: latest (this morning) feature/zuulv3 + some not yet approved patches :)11:42
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable  https://review.openstack.org/470243 12:09
pabelangerShrews: maybe if you have time today, you can help me understand how to purge old images from dib-image-list on nodepool.o.o.  We have an unknown image in deleting state for 8 days now13:52
pabelangeronly thing I can think of is using zkshell to remove it13:53
Shrewspabelanger: looking13:59
SpamapStobiash: you're going to need a privileged container to run bubblewrap FYI14:00
pabelangercontainers in containers! :D14:01
SpamapStobiash: you can of course opt out by using the nullwrap driver, but then you risk untrusted jobs breaking out and pwn'ing your executor.14:01
Shrewspabelanger: those dib files do not exist on the owning builder nb04. how did that happen?14:03
Shrewspabelanger: if they were manually removed, then yeah, we'll have to delete the znodes manually14:04
pabelangerShrews: we renamed the image a few days ago from opensuse-42.2 to opensuse-422. The images also failed to build a few times, and seems they hung around14:04
pabelangerShrews: is it possible to check zookeeper to see if the image name is valid, then maybe just auto purge them?14:05
pabelangeror some other nodepool util to clean them up14:05
SpamapSI wonder if we could make a wrapper that uses docker/rkt/k8s instead of bubblewrap14:05
Shrewspabelanger: how would you decide if an image name is valid?14:05
pabelangerShrews: right, I guess we cannot depend on local nodepool.yaml file14:06
SpamapStobiash: what's your container overlord? Docker?14:06
pabelangersince it would be possible for other builders to have a different image name now14:06
Shrewspabelanger: what we can probably do is change the way we delete the znode. right now, it's only deleted if the dib is successfully removed from disk. i suppose we could add a check that removes the znode if the on-disk dib is already gone14:07
pabelangerShrews: actually, we know nb04.o.o built the image, but it does not exist in its configuration file, can we not do something about that?14:07
pabelangerShrews: Ya, that's basically the issue now I think14:07
Shrewspabelanger: oh, we already have that check14:09
pabelangermordred: this seems new on nb04.o.o: http://paste.openstack.org/show/611300/ 14:11
pabelangerShrews: okay, looking in logs to see what is happening14:11
pabelangerShrews: are we bailing out on line 406 for nodepool/builder.py?14:14
tobiasha gate job timed out: https://review.openstack.org/#/c/470243/ 14:14
tobiashis this a known issue?14:14
tobiashhttp://logs.openstack.org/43/470243/2/gate/gate-zuul-python27-ubuntu-xenial/db8a735/ 14:15
Shrewsi don't know14:15
tobiashSpamapS: good to know, thanks14:16
tobiashSpamapS: currently using plain docker14:16
SpamapSI expect bwrap to land pretty soon so be ready :)14:16
SpamapSwould be interesting to try and write a docker wrapper though14:17
tobiashSpamapS: will try that out after my vacation in two weeks14:19
tristanCOA14:26
tristanCoops :-)14:26
Shrewspabelanger: i *think* what's happening is that the diskimages is not in nodepool.yaml AND there are no local files on disk (for whatever reason), so nb04 doesn't think it's responsible for it (according to https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L402-L405)14:27
pabelangerShrews: yes, that is what I am thinking too. I was trying to see if we could some how use the 'builder' field from zk, since we add it to each build14:28
Shrewsi'm not entirely certain why that check exists14:29
pabelangerI think we want to use ImageBuild.builder14:30
pabelangerwasn't it to stop other nodepool-builders from deleting DIBs that didn't exist on their filesystem?14:30
Shrewsmy concern is that we might be doing things behind the builder's back that might be causing this situation14:30
Shrewsor, we have a bug somewhere14:31
pabelangerYa, we should see about adding a test to expose this. should be straightforward to setup current zk values14:32
pabelangerunknown image in building state14:32
Shrewspabelanger: why does nb04 have a fully qualified hostname, but the others do not?14:33
Shrewsothers == nb0314:33
pabelangerShrews: ubuntu xenial image in rackspace did that14:36
pabelangernb03 is in vexxhost14:36
Shrewsseems obvious NOW that we could just use the builder hostname to see if it is a locally owned image rather than checking on-disk files, but that should have seemed obvious THEN too, so maybe there's a reason i can't remember why we didn't use hostname14:37
pabelangerya, not sure: https://review.openstack.org/#/c/400421/ was the review that seems to add it14:39
Shrewsjeblair: Can you remember why we wouldn't just compare builder hostnames to determine if we own the build? ^^^14:52
jeblairi think originally we just didn't want to rely on hostnames for something like that.  they are finicky.  (if you have to change your hostname, how do you convince nodepool to delete those images, and would it even be obvious to a user that the reason their system broke is they changed the hostname)14:52
Shrewsspeak o' da debil14:52
Shrewsok, yeah, that makes sense14:52
Shrewspabelanger: i think we'll just have to manually delete that image from zk14:53
jeblairi'm willing to revisit that (we can probably mitigate those concerns with more work).  though i'm curious how we ended up with the image missing from disk14:54
Shrewsjeblair: yeah, the missing image is also my concern14:54
pabelangerI can see about reproducing it, but I think it had to do with stopping nodepool-builder while DIB was in that failed-build loop.  In this case, I think we were missing the zypper dependency for DIB, and it was just looping failures14:58
pabelangerthen, when we started up, I don't think the deleting states were ever purged14:58
pabelangerwe eventually created a successful image, but then renamed it14:58
pabelangerleaving the stale data14:58
jeblairpabelanger: do you think nodepool was stopped after it had deleted a failed build but before it removed the associated zk node?14:59
pabelangerI think there was nothing to delete, because we didn't create a DIB15:01
*** dkranz has joined #zuul15:01
jeblairpabelanger: did we accumulate a lot of 'deleting' images then?15:02
jeblairor did it actually remove the zk nodes for some/most of them?15:02
pabelangerI cannot completely remember. Let me check the logs.  but I have not done anything directly to zookeeper personally15:04
Shrewsso maybe znodes were created in BUILDING state, the DIB failed w/o creating anything on disk, then later BUILDING is set to DELETING but we have nothing to delete ondisk15:05
Shrewsif DIB doesn't even get to the point of creating the dib directory, then yeah, i can see this happening15:05
jeblairyeah, that's what i'm wondering; but if that were the case, we'd expect lots of znode entries, since it would reliably never delete15:06
Shrewsthat is also true15:06
pabelangernodepool-builder.log.2017-05-24_15 on nb04 has what happened15:06
pabelangerwe looped about 401 times trying to build opensuse-42.2 15:07
pabelangerbut we then changed its configuration under nodepool when we renamed to opensuse-422 15:07
Shrewsyeah, several failures, but only 3 znodes15:08
Shrewsthe last 3 failures are the znodes still there15:09
Shrewstrying to remember if there is some logic that would delete the others15:10
pabelangerYa, I cannot remember if we had 400+ deleting or just a few15:10
Shrewsznode is deleted ONLY when the local files are deleted, so i'm not sure what happened there15:12
pabelangerpossible they are deleted on start up?15:14
Shrewsnope, continuously15:17
Shrewsevery minute, by default, unless the CleanupWorker thread is dead15:18
Shrewsbut it is alive15:19
pabelangerOh, IIRC, I did a manual build of opensuse-42.215:19
pabelangerbecause we didn't have opensuse-42.2 in images section for pool15:19
pabelangerbuilder-debug.log.2017-05-24_16:2017-05-24 21:04:15,054 INFO nodepool.builder.BuildWorker.0: Manual build request for image opensuse-42.2 15:19
pabelangernot sure if that changes anything15:20
Shrewsshouldn't matter15:21
jlkjeblair: when you say that zuul will watch a config-repo for changes, what's the watch mechanism? how does it become informed of changes there?15:31
jlkjeblair: is that through a source driver?15:31
jeblairjlk: oh, you know what, that's gerrit specific.  we need to abstract that.  it's in Scheduler.process_event_queue15:32
jlkah hah15:33
* jlk adds it to the github support task list15:33
jlkalthough...15:34
jlkmaybe that should be its own story, to support a raw git repo as well as github15:34
jeblairjlk: mostly need to make 'event.type == "change.merged"' into an api call15:34
jeblairevent.isChangeMerged or something :)15:34
clarkbpabelanger: Shrews is this another case where we might consider a `nodepool $action --force` command to handle when things get funny?15:35
jlkright okay15:36
jlkso, that probably won't work for pure git, since you don't get events in pure git world15:36
jeblairjlk: drivers can (now) return their own subclasses of TriggerEvent, so a git driver could return its own subclass as well, which would also implement isChangeMerged().  and yeah, we'd need to synthetically create events for that to work.  that's a possibility down the line.15:37
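The driver-API shape jeblair sketches above could look roughly like this. This is an illustrative Python sketch only; apart from the isChangeMerged idea from the discussion, the class and attribute names are guesses, not the real Zuul code:

```python
# Hedged sketch: replace the gerrit-specific string comparison
# 'event.type == "change.merged"' with a per-driver method, so the
# scheduler stays driver-agnostic. All names here are illustrative.

class TriggerEvent:
    """Base event; drivers subclass this."""
    def isChangeMerged(self):
        return False

class GerritTriggerEvent(TriggerEvent):
    def __init__(self, event_type):
        self.type = event_type

    def isChangeMerged(self):
        return self.type == 'change.merged'

class GithubTriggerEvent(TriggerEvent):
    def __init__(self, event_type):
        self.type = event_type

    def isChangeMerged(self):
        # As noted above, a push event covers both PR merges and
        # direct pushes, so it can stand in for "change merged".
        return self.type == 'push'

# The scheduler's event loop then just calls the method:
assert GerritTriggerEvent('change.merged').isChangeMerged()
assert GithubTriggerEvent('push').isChangeMerged()
assert not GithubTriggerEvent('pull_request').isChangeMerged()
```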
jlkand for github, you probably want to tie that to a push event (which happens on merge too, so you can get the event if it's merged via a PR or just directly pushed)15:37
jeblairjlk: sounds like it15:37
jlkI should be able to test that out. This is sounding more github specific, so I'll drop a task in the story.15:38
*** hashar is now known as hasharAway15:39
jlkCan you link directly to a storyboard task?15:43
jlkanyway https://storyboard.openstack.org/?#!/story/2000774 task 4664 I think captures the desire in the notes.15:43
jeblair++15:44
Shrewsclarkb: any --force options would scare me because unless someone really understands the system, they could end up making things worse15:49
jeblairShrews, pabelanger: the 3 stuck images appear to have been "created" right before a configuration change, based on the 'creating new providermanager" lines in the log.  but it doesn't look like the builder was restarted, based on the absence of zk connection lines.15:57
jeblairShrews, pabelanger: i think i can explain it16:08
jeblairthe key is that the configuration was changed to remove the opensuse-42.2 image while it was in that failed build loop16:09
jeblairif you look at _cleanupImage....16:10
jeblairbefore the configuration change diskimage=True (because it is in the config) and local_builds=[] (because DIB did not create a build directory)16:10
jeblairthe function decides it *is* responsible for the image (diskimage=True).  it calls _deleteLocalBuild, which finds that there is no local build, *but* there is a hostname check in there.  if there is no local build, but the hostname of the builder record in zk matches ours, we assume this case -- that dib failed without creating a dir, so we return true16:12
jeblairthis allows it to then delete the zk node16:12
jeblairthen *after* the configuration change which removed opensuse-42.2 16:13
jeblairdiskimage=False (because it is not in the config), so we exit _cleanupImage early because we are not responsible.16:13
jeblair(since there are still no local builds)16:13
Shrewsjeblair: ah, that makes sense. a race with the user changing the config16:15
pabelangeryes, that seems to make the most sense16:16
jeblaircombined with an invalid assumption about external state (that we get a dib directory if we invoke dib)16:16
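The race jeblair walks through above can be distilled into a toy predicate. This is a hypothetical distillation, not the real nodepool code; the function and argument names are invented:

```python
# Toy model of the _cleanupImage decision described above (all names
# invented for illustration).

def cleanup_image(in_config, local_builds, znode_builder, my_hostname):
    """Return True if this builder would go on to delete the znode."""
    if not in_config:
        # Early exit: not responsible for images absent from
        # nodepool.yaml -- this is the branch that strands the znode.
        return False
    if not local_builds and znode_builder == my_hostname:
        # dib failed before creating a build directory; nothing on
        # disk to remove, so treat the local delete as already done.
        return True
    # Otherwise, responsibility follows the on-disk build files.
    return bool(local_builds)

# Before the rename: opensuse-42.2 is in the config, so the znode
# would eventually be cleaned up even with no local build dir.
assert cleanup_image(True, [], 'nb04.openstack.org', 'nb04.openstack.org')

# After the rename removed it from the config (still no local
# builds): early exit, and the DELETING znode is stranded.
assert not cleanup_image(False, [], 'nb04.openstack.org', 'nb04.openstack.org')
```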
jeblairi can think of 3 approaches to correct: a) remove the zk record earlier if there is no dib directory.  b) ensure there is always a dib directory, eg, by creating it before invoking dib.  c) rely more on hostnames to determine build ownership16:18
SpamapSis there a sub-version of (a) which is to mark the zk record as vulnerable to this race while we're starting builds, and then that enables (c) ?16:19
* SpamapS says this without having looked at the records at all to see if they already have that16:20
jeblairSpamapS: could be.  we're actually pretty close to that now with setting it as 'FAILED' then changing it to 'DELETE'; we just don't do anything with that knowledge.16:20
Shrewsjeblair: what if we change this line to also check for a hostname match? https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L356 16:23
Shrewswe'll consider it a local build if we have the files OR the hostname matches16:24
Shrewsthat's probably a safe use of hostname16:24
SpamapSAnother idea is to have each builder make its own permanent uuid that isn't the build, but the builder, and attach that to every build. So instead of "is there a dir for the build" just "if there is a record for my builder ID, I own that build" and forget hostname even exists other than for admin-info.16:25
jeblairSpamapS: yep, and store that in a local state dir16:25
SpamapSjeblair: GMTA :)16:26
jeblairShrews: yeah, that would work.  we would be relying on hostname enough that "bad things" are likely to happen if you have two builders with the same hostname, but not relying on them so much that if you change a hostname, things will automatically break.16:28
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Consider hostname in determining local builds  https://review.openstack.org/470364 16:39
ShrewsI did not submit a features/zuulv3 version since I figure the update with master will cover that.16:41
jeblairSpamapS, mordred: i think you have convinced me that if we are going to improve the executor merger situation by avoiding repeating the merge ops, we should do so by fetching from the merger over http rather than sending git-packs over gearman.  the latter would likely work, but it may inhibit our ability to move to zk for zuulv4.  meanwhile, i think the added requirement of a webserver on the mergers probably isn't too onerous as long as it's ...17:20
jeblair... internal-only (ie, only the executors need to be able to reach them).17:20
mordredjeblair: ++17:21
clarkbthat should alleviate corporate firewall rule issues17:21
jeblairSpamapS: the only remaining grey area for me is, if someone is running a merger in a container, will that merger be able to know enough to say "contact me at http://address:port/" to the executor?17:21
jeblairmordred: ^17:21
jeblairclarkb: yeah, that's the big one.  "also set up apache or whatever" is annoying but tractable.17:22
jeblair(besides, you don't actually need to do it if you run zuul all-in-one, so that's nice for small deployments)17:22
clarkbre container addressing I think it depends entirely on your container setup. You likely want to be able to configure what that address is in the merger for flexibility17:22
pabelangermordred: would fetch over git+ssh be an option too?17:23
mordredSpamapS, jeblair: I like the uuid-in-local-state-dir idea - also referencing container space - running builders in k8s pods with a persistent volume - the volume would be the thing that's the most sticky and least likely to change17:23
mordredI agree with clarkb re: addressing17:23
clarkbNAT is a tool that pops up over and over :(17:23
jeblairmordred, clarkb: so we add a config option to the merger to say "this is your http accessible address"?17:23
mordredoptional merger config that tells the merger what its hostname and port are is likely important17:24
mordredjeblair: yah17:24
jeblairsounds workable17:24
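A hypothetical zuul.conf fragment for such an option; the option name below is invented for illustration, since the discussion only agrees that some such setting is needed, not what it would be called:

```ini
[merger]
git_dir=/var/lib/zuul/git
; Externally reachable address the executors should fetch from,
; e.g. when the merger runs in a container or behind NAT and can't
; determine its own public address:
public_url=http://zm03.example.com:8080/
```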
pabelangerjeblair: sorry, that git+ssh question should have been for you17:24
SpamapSYes, typically apps that run in containers are just like apps that run behind load balancers. You need to tell them where they're reachable, because they won't actually know themselves.17:24
jeblairpabelanger: yeah, git+ssh would work; i'm not sure which is the easier thing to set up.17:25
SpamapSIt's really just a mapping problem. They should probably just say "merge result is at zm03" and zm03's address is mapped by a cluster manager that knows where it put zm03.17:26
SpamapS(there's this awesome technology for mapping names to addresses btw...)17:26
*** harlowja has quit IRC17:28
pabelangerjeblair: git+ssh would be one less external dependency (apache / nginx) but means setting up public / private keys.  I think the public / private key approach would be okay, since we do that for nodepool / zuul-executor today17:28
SpamapSso...17:29
pabelangermight be a fun thing to test in puppet / ansible17:29
SpamapSif it's in a container, it's one _more_ dependency17:29
SpamapSbecause containers don't typically run sshd17:29
pabelangeragree17:29
pabelangerSpamapS: I guess with containers you would share the mount for /var/lib/zuul/git and use your protocol of choice, in other containers?17:31
SpamapSpabelanger: you could17:32
mordredyah - I think there is an easy way the apache bit can work with containers (run a pod with a volume, a merger and an apache) - so for now just doing the apache route and not sorting out ssh seems easier to me17:33
mordredsince that's also easy to ansible or puppet for non-container land17:33
pabelangerya, I think git+ssh sounds promising because we'd likely not need http://git.openstack.org/cgit/openstack-infra/puppet-zuul/tree/templates/zuul.vhost.erb any more. Or does that get smaller with zuulv3?17:34
clarkbfrom the corporate firewall perspective IME opening up http(s) is significantly easier than any other protocol17:35
pabelangergit-core/git-http-backend was what I was thinking17:35
pabelangerclarkb: that is true17:35
clarkband relying on DNS (while my preference) is often a burden in corp envs where things need legal review and so on17:36
jeblairthough, we're still talking internal access only.  this is a system which already requires zookeeper (and currently gearman).  i'm not as worried about internal firewalls.17:37
clarkbjeblair: IME internal firewalls are just as much a problem :)17:37
clarkbbut if we assume a cloudy env probably not a big deal17:37
jeblairi'm not assuming a cloudy env, i'm assuming an environment where multiple systems can talk to each other.  :)17:37
jeblairi'm saying that we crossed that bridge once we created a distributed system at all.17:38
clarkbI agree, it's just in my experience https is the easiest protocol to get opened up in such environments17:43
clarkbbecause it's a thing everywhere, there are often already rules in place that say you can use http rather than going up in front of a security board and having them tell you no17:44
mordredyah. I mean, I think we should use http(s) for this ... but I also think that a person is not going to be successful if they do not have adequate control of the network flow in-between their zuul nodes17:45
mordredsince they'll already have needed to go to the security board to get zk and gear opened, not to mention ssh between executors and build nodes17:46
mordredbut since this is _only_ for intra-cluster communication, it really shouldn't be an issue like it would be if we were talking about needing the build nodes to be able to contact this git service (like from gozer days)17:46
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout  https://review.openstack.org/470381 17:47
*** jkilpatr has quit IRC17:48
jeblairclarkb: yeah, sadly, sometimes it is more difficult to open ssh, even internally.  i don't think we should weigh that heavily against it; fortunately, i think there are some other reasons to prefer http anyway.17:49
jeblairalso, we don't have to implement this now -- the "repeat the merge" code is written and good enough to get us out the door.17:50
*** jkilpatr has joined #zuul18:03
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout  https://review.openstack.org/470385 18:05
mordredjeblair, SpamapS: re: pabelanger's comment on https://review.openstack.org/#/c/453851 ... any reason to not just use bwrap for both branches?18:05
SpamapSmordred: I think it's a great idea.18:06
SpamapSPrevents mishaps. :)18:07
jeblairmordred: well, the intent was that a config repo was entitled to do things outside of a container.  like write to afs.18:07
mordredjeblair: ah. gotcha. write to afs was the bit I was leaving out of my brainhole when looking at the bwrap invocation18:08
jeblairwe might be able to bind mount afs in18:08
jeblairbut that was a prototypical example18:08
jeblairso i guess we need to decide how much of a prototype it is.  :)18:09
mordredjeblair: I think it's easy enough to switch to always using bwrap18:09
mordredso I'm fine with the patch as it is now while we think about that18:09
jeblairyeah, so let's chalk that up as something we feel fondly about doing in the future after we work through some things :)18:09
pabelanger++18:09
mordredpabelanger, SpamapS: I think the stack up to https://review.openstack.org/#/c/468554 is ready for y'all whenver18:15
pabelangermordred: k, I can review shortly. Working on zuulv3.o.o atm18:17
mordredpabelanger: oh! do that. :)18:22
pabelangerwe should be in good shape to get ze01.o.o online shortly too18:29
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir  https://review.openstack.org/470389 18:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade  https://review.openstack.org/470390 18:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update  https://review.openstack.org/470391 18:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout  https://review.openstack.org/470392 18:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos  https://review.openstack.org/470393 18:32
jeblairEnd of Series!18:32
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout  https://review.openstack.org/470392 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos  https://review.openstack.org/470393 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout  https://review.openstack.org/470381 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout  https://review.openstack.org/470385 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir  https://review.openstack.org/470389 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade  https://review.openstack.org/470390 18:37
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update  https://review.openstack.org/470391 18:38
*** jkilpatr has quit IRC18:42
*** jkilpatr has joined #zuul18:55
mordredjeblair: I don't believe you19:13
mordredtristanC: I'm sorry I have made it through an entire additional week without starting the threads about your patches - I promise it's on my list!19:25
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Abort jobs if JobDir disk usage exceeds an optional maximum  https://review.openstack.org/463022 19:31
mordredrebased and fixed up adam_g patch ^^19:31
openstackgerritK Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Handle job or build is None in formatUrlPattern  https://review.openstack.org/470415 19:34
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add SSH Agent Primitives and usage  https://review.openstack.org/468173 19:34
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add support for bwrap  https://review.openstack.org/453851 19:35
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained  https://review.openstack.org/468208 19:35
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test  https://review.openstack.org/461881 19:35
jesusaurjhesketh: i found a minor issue with safe_job and safe_build, can you take a look at 470415 when you're around?19:40
mordredSpamapS: woot bubblewrap!19:41
*** hasharAway is now known as hashar19:43
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Make finger port configurable  https://review.openstack.org/470243 20:03
*** harlowja has joined #zuul20:04
SpamapS\o/20:24
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout  https://review.openstack.org/470392 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos  https://review.openstack.org/470393 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout  https://review.openstack.org/470381 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout  https://review.openstack.org/470385 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_periodic  https://review.openstack.org/470053 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir  https://review.openstack.org/470389 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade  https://review.openstack.org/470390 20:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update  https://review.openstack.org/470391 20:26
* SpamapS finally getting back to review queue.. :-P20:32
* jlk out for the weekend. o/20:47
*** dougbtv has quit IRC21:28
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects  https://review.openstack.org/470442 21:31
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects  https://review.openstack.org/470442 21:34
*** hashar has quit IRC21:35
jeblairi've seen some test failures recently with:21:38
jeblair  stderr: 'fatal: Unable to create '/tmp/tmp.2kLIXZeUlb/tmp4tvsjM/zuul-test/executor-git/review.example.com/org/project/.git/index.lock': File exists.21:38
jeblairi believe i know the problem; i'll work up a patch to fix21:39
*** dougbtv has joined #zuul21:41
*** dougbtv has quit IRC21:50
*** dougbtv has joined #zuul21:54
*** jkilpatr has quit IRC21:58
*** dougbtv has quit IRC22:01
*** dougbtv has joined #zuul22:14
*** jkilpatr has joined #zuul22:16
*** dougbtv_ has joined #zuul22:17
openstackgerritK Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Add github reporter status-url config option  https://review.openstack.org/449794 22:20
*** dougbtv has quit IRC22:20
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor  https://review.openstack.org/470571 22:34
*** cinerama` has joined #zuul22:34
jeblairthat should fix the git lock issues ^22:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor  https://review.openstack.org/470571 22:36
jeblairSpamapS, mordred, clarkb: we probably want to land that soonish so we can avoid further spurious test failures22:36
* SpamapS peeks22:36
jeblairdrat, debug lines22:37
clarkbkk22:38
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor  https://review.openstack.org/470571 22:38
jeblairthat just removes 3 debug lines i forgot to clean up22:38
mordredjeblair: lgtm22:39
*** cinerama has quit IRC22:40
clarkbis it worth documenting that you can't cohabitate those repos (either via nfs or any other mechanism) when using split out processes/workers?22:45
*** jesusaur has quit IRC22:48
jeblairclarkb: yep.  and also if you run a merger and executor on the same host (for some reason.  don't do that though since the executor has a merger).22:51
jeblairbut we haven't really gotten to the revised operational docs yet22:52
jeblairso we'll just have to keep that in mind when we do write those22:52
*** jesusaur has joined #zuul22:54
SpamapSmight be good to start building an outline of sorts.. I've heard a few caveats expressed here and I worry we'll forget some22:55
clarkbjeblair: reading the change there are three locations where we can run into a conflict. cat, update (which is called by cat), and merge. These all appear to run in the same thread? I must be missing where update can happen in another thread22:55
clarkboh its updateLoop is in a thread of its own got it22:56
jeblairyep22:56
jeblaircat puts entries into the update queue, but so does executor:execute22:56
jeblair(which is another thread)22:57
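The fix in 470571, as described above, amounts to sharing one lock between the update loop and merge operations so concurrent git invocations can't fight over .git/index.lock. A minimal toy version (class and method names invented, not the real executor code):

```python
# Hedged sketch: serialize repo updates (from the update-loop thread)
# and merge ops (from the merger thread) with a shared per-repo lock.
# Names here are illustrative only.
import threading

class RepoState:
    def __init__(self):
        self.lock = threading.Lock()
        self.ops = []  # stands in for "git commands run on this repo"

    def update(self):
        # called from the update-loop thread
        with self.lock:
            self.ops.append('update')

    def merge(self):
        # called from the merger thread (cat jobs also feed the
        # update queue, so everything funnels through the same lock)
        with self.lock:
            self.ops.append('merge')

repo = RepoState()
threads = [threading.Thread(target=repo.update) for _ in range(5)]
threads += [threading.Thread(target=repo.merge) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(repo.ops) == 10  # every op ran; none raced mid-append
```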
clarkbSpamapS: do you want me to hold off approving so you can reviwe?22:57
SpamapSclarkb: if you don't mind, I'm almost done. :)22:58
clarkbsure22:58
clarkbI +2'd feel free to approve if you don't find a reason to -122:58
SpamapSclarkb: Thanks, +3'd :)22:59
SpamapSjeblair: I left a little comment but it was more "oh there's another way to do this in case you didn't consider it" not a -1 or even a need for follow-up23:00
jeblairSpamapS: ah yeah, that would work too.23:03
jeblairSpamapS: i *think* this may be the lowest cost way (since with (un)registering we may end up flapping a lot, and incurring extra network cpu costs), but i'm not certain about that.  :)23:04
jeblair"extra network *and* cpu costs"23:04
*** jesusaur has quit IRC23:13
*** jesusaur has joined #zuul23:16

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!