Friday, 2017-06-02

openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Fix traceback when evaluating github changes https://review.openstack.org/470068	00:04
*** jamielennox is now known as jamielennox\|away		00:44
SpamapS	pabelanger: certainly seems like landing the change should cause Zuul to just reconfigure, but maybe we're forgetting something. :-P	01:10
pabelanger	SpamapS: I assume it will, but zuulv3-dev.o.o isn't managed by puppet today. One we get zuulv3.o.o online, hopefully in the next day or so. We'll know for certain!	01:12
jeblair	SpamapS: re 453851, i think we should land the whole series as soon as all the tests are green; however, the last 2 changes have failed for some reason i haven't looked into yet	01:43
jeblair	SpamapS, pabelanger, jlk: as soon as a change to a config-repo lands, zuul notices and loads the new configuration.	01:44
jeblair	SpamapS: oh i see the problem on the bwrap series, fixing	01:45
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained https://review.openstack.org/468208	01:47
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test https://review.openstack.org/461881	01:47
SpamapS	2017-06-01 23:56:31.011945 \| [Zuul] Job timed out, result: FAILURE ?	01:55
*** jamielennox\|away is now known as jamielennox		02:00
jlk	jeblair: interesting, that's not exactly what I was witnessing.	02:52
jlk	(update on change)	02:52
*** jamielennox is now known as jamielennox\|away		03:30
openstackgerrit	Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Case sensitive label matching https://review.openstack.org/469946	05:11
openstackgerrit	Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Fix tests for "Case sensitive label matching" https://review.openstack.org/469985	05:11
*** mmedvede has quit IRC		05:28
*** mmedvede has joined #zuul		05:31
*** rattboi has left #zuul		07:00
*** hashar has joined #zuul		07:49
*** jamielennox\|away is now known as jamielennox		08:21
*** jamielennox is now known as jamielennox\|away		08:49
*** persia has quit IRC		09:38
*** persia has joined #zuul		09:39
tobiash	mordred: I have a use case for configuring the finger port.	09:50
tobiash	mordred: If running in an unprivileged container it is not allowed to listen on a low port (79).	09:51
*** jkilpatr has quit IRC		10:30
openstackgerrit	Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243	10:38
openstackgerrit	Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Don't copy the __pycache__ folder https://review.openstack.org/470245	10:42
*** jkilpatr has joined #zuul		10:53
mordred	tobiash: awesome.	11:20
tobiash	:)	11:34
Shrews	tobiash: wow, you really are running right on the cutting edge of development, aren't you? :)	11:38
tobiash	Shrews: latest (this morning) feature/zuulv3 + some not yet approved patches :)	11:42
openstackgerrit	Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243	12:09
pabelanger	Shrews: maybe if you have time today, you can help me understand how to pruge old images from dib-image-list on nodepool.o.o. We have an unknown image in deleting state for 8 days now	13:52
pabelanger	only think I can think of is using zkshell to remove it	13:53
Shrews	pabelanger: looking	13:59
SpamapS	tobiash: you're going to need a privileged container to run bubblewrap FYI	14:00
pabelanger	containers in containers! :D	14:01
SpamapS	tobiash: you can of course opt out by using the nullwrap driver, but then you risk untrusted jobs breaking out and pwn'ing your executor.	14:01
Shrews	pabelanger: those dib files do not exist on the owning builder nb04. how did that happen?	14:03
Shrews	pabelanger: if they were manually removed, then yeah, we'll have to delete the znodes manually	14:04
pabelanger	Shrews: we renamed the image a few days ago from opensuse-42.2 to opensuse-422. The images also failed to build a few times, and seems they hung around	14:04
pabelanger	Shrews: is it possible to check zookeeper to see if the image name is valid, then make just auto purge them?	14:05
pabelanger	or some other nodepool util to clean them up	14:05
SpamapS	I wonder if we could make a wrapper that uses docker/rkt/k8s instead of bubblewrap	14:05
Shrews	pabelanger: how would you decide if an image name is valid?	14:05
pabelanger	Shrews: right, I guess we cannot depend on local nodepool.yaml file	14:06
SpamapS	tobiash: what's your container overlord? Docker?	14:06
pabelanger	since it would be possible for other builder to have a different image now	14:06
pabelanger	name*	14:06
Shrews	pabelanger: what we can probably do is change the way we delete the znode. right now, it's only deleted if the dib is successfully removed from disk. i suppose we could add a check that removes the znode if the on-disk dib is already gone	14:07
pabelanger	Shrews: actually, we know nb04.o.o build the image, but it does not exist in its configuration file, can we not do something about that?	14:07
pabelanger	Shrews: Ya, that's basically the issue now I think	14:07
Shrews	pabelanger: oh, we already have that check	14:09
pabelanger	mordred: this seems new on nb04.o.o: http://paste.openstack.org/show/611300/	14:11
pabelanger	Shrews: okay, looking in logs to see what is happening	14:11
pabelanger	Shrews: are we bailing out on line 406 for nodepool/builder.py?	14:14
tobiash	a gate job timed out: https://review.openstack.org/#/c/470243/	14:14
tobiash	ist this a known issue?	14:14
tobiash	http://logs.openstack.org/43/470243/2/gate/gate-zuul-python27-ubuntu-xenial/db8a735/	14:15
Shrews	i don't know	14:15
tobiash	SpamapS: good to know, thanks	14:16
tobiash	SpamapS: currently using plain docker	14:16
SpamapS	I expect bwrap to land pretty soon so be ready :)	14:16
SpamapS	would be interesting to try and write a docker wrapper though	14:17
tobiash	SpamapS: will try that out after my vacation in two weeks	14:19
tristanC	OA	14:26
tristanC	oops :-)	14:26
Shrews	pabelanger: i think what's happening is that the diskimages is not in nodepool.yaml AND there are no local files on disk (for whatever reason), so nb04 doesn't think it's responsible for it (according to https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L402-L405)	14:27
pabelanger	Shrews: yes, that is what I am thinking too. I was trying to see if we could some how use the 'builder' field from zk, since we add it to each build	14:28
Shrews	i'm not entirely certain why that check exists	14:29
pabelanger	I think we want to use ImageBuild.builder	14:30
pabelanger	wasn't it to stop other nodepool-builders from deleting DIBs that didn't exist on their filesystem?	14:30
Shrews	my concern is that we might be doing things behind the builder's back that might be causing this situation	14:30
Shrews	or, we have a bug somewhere	14:31
pabelanger	Ya, we should see about adding a test to expose this. should be straightforward to setup current zk values	14:32
pabelanger	unknown image in building state	14:32
Shrews	pabelanger: why does nb04 have a fully qualified hostname, but the others do not?	14:33
Shrews	others == nb03	14:33
pabelanger	Shrews: ubuntu xenail image in rackspace did that	14:36
pabelanger	nb03 is in vexxhost	14:36
Shrews	seems obvious NOW that we could just use the builder hostname to see if it is a locally owned image rather than checking on-disk files, but that should have seemed obvious THEN too, so maybe there's a reason i can't remember why we didn't use hostname	14:37
pabelanger	ya, not sure: https://review.openstack.org/#/c/400421/ was the review that seems to add it	14:39
Shrews	jeblair: Can you remember why we wouldn't just compare builder hostnames to determine if we own the build? ^^^	14:52
jeblair	i think originally we just didn't want to rely on hostnames for something like that. they are finicky. (if you have to change your hostname, how do you convince nodepool to delete those images, and would it even be obvious to a user that the reason their system broke is they changed the hostname)	14:52
Shrews	speak o' da debil	14:52
Shrews	ok, yeah, that makes sense	14:52
Shrews	pabelanger: i think we'll just have to manually delete that image from zk	14:53
jeblair	i'm willing to revisit that (we can probably mitigate those concerns with more work). though i'm curious how we ended up with the image missing from disk	14:54
Shrews	jeblair: yeah, the missing image is also my concern	14:54
pabelanger	I can see about reproducing it, but I think it had to do with stopping nodepool-builder while DIB was building failed dib builds. In this case, I think we were missing the zypper dependency for DIB, and it was just looping failures	14:58
pabelanger	then, when we started up, I don't think the deleting states were ever purged	14:58
pabelanger	we eventually created a successful image, but then renamed it	14:58
pabelanger	leaving the stale data	14:58
jeblair	pabelanger: do you think nodepool was stopped after it had deleted a failed build but before it removed the associated zk node?	14:59
pabelanger	I think there was nothing to delete, because we didn't create a DIB	15:01
*** dkranz has joined #zuul		15:01
jeblair	pabelanger: did we accumulate a lot of 'deleting' images then?	15:02
jeblair	or did it actually remove the zk nodes for some/most of them?	15:02
pabelanger	I cannot completely remember. Let me check the logs. but I have not done anything directly to zookeeper personally	15:04
Shrews	so maybe znodes were created in BUILDING state, the DIB failed w/o creating anything on disk, then later BUILDING is set to DELETING but we have nothing to delete ondisk	15:05
Shrews	if DIB doesn't even get to the point of creating the dib directory, then yeah, i can see this happening	15:05
jeblair	yeah, that's what i'm wondering; but if that were the case, we'd expect lots of znode entries, since it would reliably never delete	15:06
Shrews	that is also true	15:06
pabelanger	nodepool-builder.log.2017-05-24_15 on nb04 has what happened	15:06
pabelanger	we looped about 401 times trying to build opensuse-42.2	15:07
pabelanger	but we then changed its configuration under nodepool when we renamed to opensuse-422	15:07
Shrews	yeah, several failures, but only 3 znodes	15:08
Shrews	the last 3 failures are the znodes still there	15:09
Shrews	trying to remember if there is some logic that would delete the others	15:10
pabelanger	Ya, I cannot remember if we had 400+ deleting or just a few	15:10
Shrews	znode is deleted ONLY when the local files are deleted, so i'm not sure what happened there	15:12
pabelanger	possible they are deleted on start up?	15:14
Shrews	nope, continuously	15:17
Shrews	every minute, by default, unless the CleanupWorker thread is dead	15:18
Shrews	but it is alive	15:19
pabelanger	Oh, IIRC, I did a manual build of opensuse-42.2	15:19
pabelanger	because we didn't have opensuse-42.2 in images section for pool	15:19
pabelanger	builder-debug.log.2017-05-24_16:2017-05-24 21:04:15,054 INFO nodepool.builder.BuildWorker.0: Manual build request for image opensuse-42.2	15:19
pabelanger	not sure if that changes anything	15:20
Shrews	shouldn't matter	15:21
jlk	jeblair: when you say that zuul will watch a config-repo for changes, what's the watch mechanism? how does it become informed of changes there?	15:31
jlk	jeblair: is that through a source driver?	15:31
jeblair	jlk: oh, you know what, that's gerrit specific. we need to abstract that. it's in Scheduler.process_event_queue	15:32
jlk	ah hah	15:33
* jlk adds it to the github support task list		15:33
jlk	although...	15:34
jlk	maybe that should be it's own story, to support a raw git repo as well as as github	15:34
jeblair	jlk: mostly need to make 'event.type == "change.merged"' into an api call	15:34
jeblair	event.isChangeMerged or something :)	15:34
clarkb	pabelanger: Shrews is this another case where we might consider a `nodepool $action --force` command to handle when things get funny?	15:35
jlk	right okay	15:36
jlk	so, that probably wont' work for pure git, since you don't get events in pure git world	15:36
jeblair	jlk: drivers can (now) return their own subclasses of TriggerEvent, so a git driver could return its own subclass as well, which would also implement isChangeMerged(). and yeah, we'd need to synthetically create events for that to work. that's a possibility down the line.	15:37
jlk	and for github, you probably want to tie that to a push event (which happens on merge too, so you can get the event if it's merged via a PR or just directly pushed)	15:37
jeblair	jlk: sounds like it	15:37
jlk	I should be able to test that out. THis is sounding mor egithub specific, so I'll drop a task in the story.	15:38
*** hashar is now known as hasharAway		15:39
jlk	Can you link directly to a storyboard task?	15:43
jlk	anyway https://storyboard.openstack.org/?#!/story/2000774 task 4664 I think captures the desire in the notes.	15:43
jeblair	++	15:44
Shrews	clarkb: any --force options would scare me because unless someone really understands the system, they could end up making things worse	15:49
jeblair	Shrews, pabelanger: the 3 stuck images appear to have been "created" right before a configuration change, based on the 'creating new providermanager" lines in the log. but it doesn't look like the builder was restarted, based on the absence of zk connection lines.	15:57
jeblair	Shrews, pabelanger: i think i can explain it	16:08
jeblair	the key is that the configuration was changed to remove the opensuse-42.2 image while it was in that failed build loop	16:09
jeblair	if you look at _cleanupImage....	16:10
jeblair	before the configuration change diskimage=True (because it is in the config) and local_builds=[] (because DIB did not create a build directory)	16:10
jeblair	the function decides it is responsible for the image (diskimage=True). it calls _deleteLocalBuild, which finds that there is no local build, but there is a hostname check in there. if there is no local build, but the hostname of the builder record in zk matches ours, we assume this case -- that dib failed without creating a dir, so we return true	16:12
jeblair	this allows it to then delete the zk node	16:12
jeblair	then after the configuration change which removed opensuse42.2	16:13
jeblair	diskimage=False (because it is not in the config), so we exit _cleanupImage early because we are not responsible.	16:13
jeblair	(since there are still no local builds)	16:13
Shrews	jeblair: ah, that makes sense. a race with the user changing the config	16:15
pabelanger	yes, that seems to make the most sense	16:16
jeblair	combined with an invalid assumption about external state (that we get a dib directory if we invoke dib)	16:16
jeblair	i can think of 3 approaches to correct: a) remove the zk record earlier if there is no dib directory. b) ensure there is always a dib directory, eg, by creating it before invoking dib. c) rely more on hostnames to determine build ownership	16:18
SpamapS	is there a sub-version of (a) which is to mark the zk record as vulnerable to this race while we're starting builds, and then that enables (c) ?	16:19
* SpamapS says this without having looked at the records at all to see if they already have that		16:20
jeblair	SpamapS: could be. we're actually pretty close to that now with setting it as 'FAILED' then changing it to 'DELETE'; we just don't do anything with that knowledge.	16:20
Shrews	jeblair: what if we change this line to also check for a hostname match? https://github.com/openstack-infra/nodepool/blob/master/nodepool/builder.py#L356	16:23
Shrews	we'll consider it a local build if we have the files OR the hostname matches	16:24
Shrews	that's probably a safe use of hostname	16:24
SpamapS	Another idea is to have each builder make its own permanent uuid that isn't the build, but the builder, and attach that to every build. So instead of "is there a dir for the build" just "if there is a record for my builder ID, I own that build" and forget hostname even exists other than for admin-info.	16:25
jeblair	SpamapS: yep, and store that in a local state dir	16:25
SpamapS	jeblair: GMTA :)	16:26
jeblair	Shrews: yeah, that would work. we would be relying on hostname enough that "bad things" are likely to happen if you have two builders with the same hostname, but not relying on them so much that if you change a hostname, things will automatically break.	16:28
openstackgerrit	David Shrewsbury proposed openstack-infra/nodepool master: Consider hostname in determining local builds https://review.openstack.org/470364	16:39
Shrews	I did not submit a features/zuulv3 version since I figure the update with master will cover that.	16:41
jeblair	SpamapS, mordred: i think you have convinced me that if we are going to improve the executor merger situation by avoiding repeating the merge ops, we should do so by fetching from the merger over http rather than sending git-packs over gearman. the latter would likely work, but it may inhibit our ability to move to zk for zuulv4. meanwhile, i think the added requirement of a webserver on the mergers probably isn't too onerous as long as it's ...	17:20
jeblair	... internal-only (ie, only the executors need to be able to reach them).	17:20
mordred	jeblair: ++	17:21
clarkb	that should alleviate corporate firwall rule issues	17:21
jeblair	SpamapS: the only remaining grey area for me is, if someone is running a merger in a container, will that merger be able to know enough to say "contact me at http://address:port/" to the executor?	17:21
jeblair	mordred: ^	17:21
jeblair	clarkb: yeah, that's the big one. "also set up apache or whatever" is annoying but tractable.	17:22
jeblair	(besides, you don't actually need to do it if you run zuul all-in-one, so that's nice for small deployments)	17:22
clarkb	re container addressing I think it depends entirely on your container setup. You likely want to be able to configure what that address is in the merger for flexibility	17:22
pabelanger	mordred: would fetch over git+ssh be an option too?	17:23
mordred	SpamapS, jeblair: I like the uuid-in-local-state-dir idea - also referencing container space - running builders in k8s pods with a persistent volume - the volume would be the thing that's the most sticky and least likely to change	17:23
mordred	I agree with clarkb re: addressing	17:23
clarkb	NAT is a tool that pops up over and over :(	17:23
jeblair	mordred, clarkb: so we add a config option to the merger to say "this is your http accessible address"?	17:23
mordred	optional merger config that tells the merger what its hostname and port are is likely important	17:24
mordred	jeblair: yah	17:24
jeblair	sounds workable	17:24
pabelanger	jeblair: sorry, that git+ssh question should have been for you	17:24
SpamapS	Yes, typically apps that run in containers are just like apps that run behind load balancers. You need to tell them where they're reachable, because they won't actually know themselves.	17:24
jeblair	pabelanger: yeah, git+ssh would work; i'm not sure which is the easier thing to set up.	17:25
SpamapS	It's really just a mapping problem. They should probably just say "merge result is at zm03" and zm03's address is mapped by a cluster manager that knows where it put zm03.	17:26
SpamapS	(there's this awesome technology for mapping names to addresses btw...)	17:26
*** harlowja has quit IRC		17:28
pabelanger	jeblair: git+ssh would be one less external dependency (apache / nginx) but mean setup public / private keys. I think the public / private key approach would be okay, since we do that for nodepool / zuul-executor today	17:28
SpamapS	so...	17:29
pabelanger	might be a fun thing to test in puppet / ansible	17:29
SpamapS	if it's in a container, it's one _more_ dependency	17:29
SpamapS	because containers don't typically run sshd	17:29
pabelanger	agree	17:29
pabelanger	SpamapS: I guess with containers you would share the mount for /var/lib/zuul/git and use your protocol of choice, in another containers?	17:31
SpamapS	pabelanger: you could	17:32
mordred	yah - I think there is an easy a way the apache bit can work with containers (run a pod with a volume a merger and an apache) - so for now just doing the apache route and not sorting out ssh seems easier to me	17:33
mordred	since that's also easy to ansible or puppet for non-container land	17:33
pabelanger	ya, I think git+ssh sounds promosing because we'd likely not need http://git.openstack.org/cgit/openstack-infra/puppet-zuul/tree/templates/zuul.vhost.erb any more. Or does that get smaller with zuulv3?	17:34
clarkb	from the corporate firewall perspective IME opening up http(s) is a significantly more easy than any other protocol	17:35
pabelanger	git-core/git-http-backend was what I was thinking	17:35
pabelanger	clarkb: that is true	17:35
clarkb	and relying on DNS (while my preference) is often a burden in corp envs where things need legal review and so on	17:36
jeblair	though, we're still talking internal access only. this is a system which already requires zookeeper (and currently gearman). i'm not as worried about internal firewalls.	17:37
clarkb	jeblair: IME internal firealls are just as much a problem :)	17:37
clarkb	but if we assume a cloudy env probably not a big deal	17:37
jeblair	i'm not assuming a cloudy env, i'm assuming an environment where multiple systems can talk to each other. :)	17:37
jeblair	i'm saying that we crossed that bridge once we created a distributed system at all.	17:38
clarkb	I agree, its just in my expereince https is the easiest protocol to get opened up in such environments	17:43
clarkb	because its a thing everywhere they are often already rules in place that say you can http rather than going up in front of a security board and having them tell you no	17:44
mordred	yah. I mean, I think we should use http(s) for this ... but I also think that a person is not going to be succcesful if they do not have adequate control of the network flow in-between their zuul nodes	17:45
mordred	since they'll already have needed to go to the security board to get zk and gear opened, not to mention ssh between executors and build nodes	17:46
mordred	but since this is _only_ for intra-cluster communicatoin, it really shouldn't be an issue like it would be if we were talking about needing the build nodes to be able to contact this git service (like from gozer days)	17:46
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381	17:47
*** jkilpatr has quit IRC		17:48
jeblair	clarkb: yeah, sadly, sometimes it is more difficult to open ssh, even internally. i don't think we should weigh that heavily against it; fortunately, i think there are some other reasons to prefer http anyway.	17:49
jeblair	also, we don't have to implement this now -- the "repeat the merge" code is written and good enough to get us out the door.	17:50
*** jkilpatr has joined #zuul		18:03
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385	18:05
mordred	jeblair, SpamapS: re: pabelanger's comment on https://review.openstack.org/#/c/453851 ... any reason to not just use bwrap for both branches?	18:05
SpamapS	mordred: I think it's a great idea.	18:06
SpamapS	Prevents mishaps. :)	18:07
jeblair	mordred: well, the intent was that a config repo was entitled to do things outside of a container. like write to afs.	18:07
mordred	jeblair: ah. gotcha. write to afs was the bit I was leaving out of my brainhole when looking at the bwrap invocation	18:08
jeblair	we might be able to bind mount afs in	18:08
jeblair	but that was a prototypical example	18:08
jeblair	so i guess we need to decide how much of a prototype it is. :)	18:09
mordred	jeblair: I htink it's easy enough to swith to always using bwrap	18:09
mordred	switch to	18:09
mordred	so I'm fine with the patch as it is now while we think about that	18:09
jeblair	yeah, so let's chalk that up as something we feel fondly about doing in the future after we work through some things :)	18:09
pabelanger	++	18:09
mordred	pabelanger, SpamapS: I think the stack up to https://review.openstack.org/#/c/468554 is ready for y'all whenver	18:15
pabelanger	mordred: k, I can review shortly. Working on zuulv3.o.o atm	18:17
mordred	pabelanger: oh! do that. :)	18:22
pabelanger	we should be in good shape to get ze01.o.o online shortly too	18:29
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389	18:32
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390	18:32
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391	18:32
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392	18:32
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393	18:32
jeblair	End of Series!	18:32
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390	18:37
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391	18:38
*** jkilpatr has quit IRC		18:42
*** jkilpatr has joined #zuul		18:55
mordred	jeblair: I don't believe you	19:13
mordred	tristanC: I'm sorry I have made it through an entire additional week without starting the threads about your patches - I promise it's on my list!	19:25
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Abort jobs if JobDir disk usage exceeds an optional maximum https://review.openstack.org/463022	19:31
mordred	rebased and fixed up adam_g patch ^^	19:31
openstackgerrit	K Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Handle job or build is None in formatUrlPattern https://review.openstack.org/470415	19:34
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Add SSH Agent Primitives and usage https://review.openstack.org/468173	19:34
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Add support for bwrap https://review.openstack.org/453851	19:35
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Make zuul ansible dir self-contained https://review.openstack.org/468208	19:35
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Add untrusted-projects ansible test https://review.openstack.org/461881	19:35
jesusaur	jhesketh: i found a minor issue with safe_job and safe_build, can you take a look at 470415 when you're around?	19:40
mordred	SpamapS: woot bubblewrap!	19:41
*** hasharAway is now known as hashar		19:43
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Make finger port configurable https://review.openstack.org/470243	20:03
*** harlowja has joined #zuul		20:04
SpamapS	\o/	20:24
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_post_checkout https://review.openstack.org/470392	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename TestCloner TestExecutorRepos https://review.openstack.org/470393	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_checkout https://review.openstack.org/470381	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_post_and_master_checkout https://review.openstack.org/470385	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Enable test_periodic https://review.openstack.org/470053	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_git_cache_dir https://review.openstack.org/470389	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_upgrade https://review.openstack.org/470390	20:26
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Remove test_periodic_update https://review.openstack.org/470391	20:26
* SpamapS finally getting back to review queue.. :-P		20:32
* jlk out for the weekend. o/		20:47
*** dougbtv has quit IRC		21:28
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects https://review.openstack.org/470442	21:31
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Reject some untrusted config objects https://review.openstack.org/470442	21:34
*** hashar has quit IRC		21:35
jeblair	i've seen some test failures recently with:	21:38
jeblair	stderr: 'fatal: Unable to create '/tmp/tmp.2kLIXZeUlb/tmp4tvsjM/zuul-test/executor-git/review.example.com/org/project/.git/index.lock': File exists.	21:38
jeblair	i believe i know the problem; i'll work up a patch to fix	21:39
*** dougbtv has joined #zuul		21:41
*** dougbtv has quit IRC		21:50
*** dougbtv has joined #zuul		21:54
*** jkilpatr has quit IRC		21:58
*** dougbtv has quit IRC		22:01
*** dougbtv has joined #zuul		22:14
*** jkilpatr has joined #zuul		22:16
*** dougbtv_ has joined #zuul		22:17
openstackgerrit	K Jonathan Harker proposed openstack-infra/zuul feature/zuulv3: Add github reporter status-url config option https://review.openstack.org/449794	22:20
*** dougbtv has quit IRC		22:20
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571	22:34
*** cinerama` has joined #zuul		22:34
jeblair	that should fix the git lock issues ^	22:34
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571	22:36
jeblair	SpamapS, mordred, clarkb: we probably want to land that soonish so we can avoid further spurious test failures	22:36
* SpamapS peeks		22:36
jeblair	drat, debug lines	22:37
clarkb	kk	22:38
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571	22:38
jeblair	that just removes 3 debug lines i forgot to clean up	22:38
mordred	jeblair: lgtm	22:39
*** cinerama has quit IRC		22:40
clarkb	is it worth documenting that you can't cohabitate those repos (either via nfs or any other mechanism) when using split out processes/workers/	22:45
*** jesusaur has quit IRC		22:48
jeblair	clarkb: yep. and also if you run a merger and executor on the same host (for some reason. don't do that though since the executor has a merger).	22:51
jeblair	but we haven't really gotten to the revised operational docs yet	22:52
jeblair	so we'll just have to keep that in mind when we do write those	22:52
*** jesusaur has joined #zuul		22:54
SpamapS	might be good to start building an outline of sorts.. I've heard a few caveats expressed here and I worry we'll forget some	22:55
clarkb	jeblair: reading the change there are three locations where we can run into a conflict. cat, update (which is called by cat), and merge. These all appear to run in the same thread? I must be missing where update can happen in another thread	22:55
clarkb	oh its updateLoop is in a thread of its own got it	22:56
jeblair	yep	22:56
jeblair	cat puts entries into the update queue, but so does executor:execute	22:56
jeblair	(which is another thread)	22:57
clarkb	SpamapS: do you want me to hold off approving so you can reviwe?	22:57
SpamapS	clarkb: if you don't mind, I'm almost done. :)	22:58
clarkb	sure	22:58
clarkb	I +2'd feel free to approve if you don't find a reason to -1	22:58
SpamapS	clarkb: Thanks, +3'd :)	22:59
SpamapS	jeblair: I left a little comment but it was more "oh there's another way to do this in case you didn't consider it" not a -1 or even a need for follow-up	23:00
jeblair	SpamapS: ah yeah, that would work too.	23:03
jeblair	SpamapS: i think this may be the lowest cost way (since with (un)registering we may end up flapping a lot, and incurring extra network cpu costs), but i'm not certain about that. :)	23:04
jeblair	"extra network and cpu costs"	23:04
*** jesusaur has quit IRC		23:13
*** jesusaur has joined #zuul		23:16

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!