Monday, 2017-06-05

openstackgerrit	Wang Xuguang proposed openstack-infra/zuul master: Using assertIsNone(xxx) instead of assertEqual(None, xxx) Using assertTrue(xxx) instead of assertEqual(True, xxx) https://review.openstack.org/470796	01:56
*** cinerama` has quit IRC		02:02
*** cinerama has joined #zuul		02:06
openstackgerrit	Tuan Luong-Anh proposed openstack-infra/zuul master: Replace assertEqual(None, *) with assertIsNone in tests https://review.openstack.org/257251	03:40
*** dougbtv_ has quit IRC		06:17
*** dougbtv_ has joined #zuul		06:24
*** cinerama has quit IRC		06:50
*** SpamapS has quit IRC		06:50
*** zaro has quit IRC		06:50
*** mattclay has quit IRC		06:50
*** patrickeast has quit IRC		06:50
*** TheJulia has quit IRC		06:50
*** cinerama` has joined #zuul		06:50
*** mattclay has joined #zuul		06:50
*** patrickeast has joined #zuul		06:51
*** zaro_ has joined #zuul		06:51
*** TheJulia has joined #zuul		06:51
*** MapspaM has joined #zuul		06:51
*** hashar has joined #zuul		10:19
*** jkilpatr has joined #zuul		10:59
openstackgerrit	Merged openstack-infra/nodepool feature/zuulv3: Set socket timeout for SSH keyscan https://review.openstack.org/451470	13:32
*** persia has quit IRC		14:17
*** persia has joined #zuul		14:21
*** MapspaM is now known as SpamapS		15:54
*** SpamapS has quit IRC		15:55
*** SpamapS has joined #zuul		15:55
* SpamapS awakens		15:55
jlk	the sleeper ^	15:56
SpamapS	I HUNGER	15:57
*** hashar has quit IRC		15:58
jlk	I left you some tasty snacks in a slack group PM	15:59
*** hashar has joined #zuul		16:00
*** hashar has quit IRC		16:00
SpamapS	jlk: delicious	16:06
*** harlowja has joined #zuul		16:45
*** jkilpatr_ has joined #zuul		17:03
*** jkilpatr has quit IRC		17:05
SpamapS	It's quiet	17:07
Shrews	jeblair: If I use hold_jobs_in_build, as you suggested, how do I get the build data (like uuid for the python27 build)? getJobFromHistory() is obviously not the way	17:12
* Shrews getting lost in the test framework details		17:13
Shrews	ah, self.builds	17:19
SpamapS	I get lost in those too sometimes. :)	17:20
SpamapS	started to click somewhere around the 15th test re-enablement :)	17:20
jeblair	yeah. builds == running builds; history == completed builds.	17:23
jeblair	so the change i wrote friday, 470571, failed due to a race which i believe is possible now, but that change makes much more likely. it's not just a test race; it's a real production race too. basically -- there's some ambiguity about knowing whether zuul should report certain kinds of failures before the initial merge for a change has completed.	17:26
jeblair	i'm going to take tobiash's change 455711 which also touches that area, and see about building a fix on that, so that maybe we can have some comprehensive behavior there, then restack 470571 on top of it.	17:27
jeblair	(the race is that the mutex around the executor merger makes it much more likely that tests will end up dequeuing changes before their initial merge completes)	17:28
mordred	jeblair: ah - fun	18:02
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Replace equals None with is None https://review.openstack.org/471075	18:08
Shrews	jeblair: so, i'm finding that the actual build directory does not yet exist with hold_jobs_in_build	18:09
mordred	Shrews: whatcha trying to do?	18:11
Shrews	mordred: write a test for the log streamer	18:12
Shrews	so i need, like, a log to stream	18:12
mordred	ah - so you do, don't you	18:12
mordred	Shrews: so - in case it's helpful, I've got this: https://review.openstack.org/#/c/467634/ which needs to look at files in the build dir	18:13
jeblair	Shrews: hrm, it should exist, even regardless of whether ansible is run	18:13
jeblair	yeah _get_build_inventory in that change ^ is a good reference	18:14
Shrews	jeblair: i'm sure i'm doing something silly	18:14
openstackgerrit	David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: add log streaming test https://review.openstack.org/471079	18:15
Shrews	jeblair: ^^^ that's what i'm doing, atm	18:15
jeblair	though the path for the log file is jobdir/work/logs/ansible_log.txt	18:15
mordred	jeblair: speaking of - https://review.openstack.org/#/c/467634/5/tests/base.py doesn't seem to want to find the build by name	18:15
mordred	jeblair: am I doing something stupid in that method?	18:16
jeblair	mordred: lazy test method	18:16
jeblair	^ (joke, in case that didn't read)	18:16
Shrews	jeblair: line 91 fails	18:16
mordred	:)	18:16
jeblair	(/me forgot "lazy" is a thing in programming)	18:16
Shrews	the first listdir() confirms the build uuid subdir does not exist	18:17
jeblair	Shrews: do you have some output of 83 -- 92?	18:18
Shrews	yep. 1 sec	18:18
jeblair	Shrews: also, "KEEP_TEMPDIRS=1 ttrun -epy27 ..." may be helpful here. tests won't clean up any tempdirs and you can cd into them and inspect.	18:19
Shrews	jeblair: http://paste.openstack.org/show/611426/	18:19
jeblair	that's weird, since setting jobdir.log_root is set on the line before makedirs is called	18:22
Shrews	oh, using KEEP_TEMPDIRS=1 makes it work	18:22
Shrews	at least changes the behavior	18:22
jeblair	Shrews: those 2 things suggest that somehow your test is happening after the jobdir is removed (which happens at end of job)	18:23
jeblair	oh!	18:24
jeblair	ansible jobs can't be held in build	18:24
jeblair	so if you actually run ansible (subclass AnsibleZuulTestCase), then you can't use hold_jobs_in_build	18:24
* SpamapS catching up		18:25
jeblair	and in this case, the build is finishing before the path checks	18:25
jeblair	(it's actually racing them)	18:25
Shrews	hrm. how do i do this then?	18:25
jeblair	Shrews: if we need ansible to actually write the log file (probably so), then i think we'll have to make ansible pause. maybe add an ansible task that waits for a semaphore file somewhere in the jobdir, then set that file in the test when you want ansible to proceed.	18:26
SpamapS	wait_for is a thing :)	18:27
Shrews	also, with the kept dirs, i'm not seeing any log files, although the directories exist	18:27
jeblair	SpamapS: they thought of everything!	18:27
* SpamapS especially likes that kitchensink module		18:27
Shrews	so i think we're missing more than that piece	18:27
Shrews	SpamapS: i heard the kitchensink module has terrible leaks	18:28
SpamapS	Shrews: hey-oh	18:28
Shrews	yay Bad Joke Mondays	18:29
Shrews	jeblair: do we bypass actual logging for the ansible stuff?	18:30
jeblair	Shrews: nope, i can't think of why that would be empty	18:31
*** jkilpatr has joined #zuul		18:31
jeblair	i'm loading it up on my xenial machine now	18:31
Shrews	sigh	18:31
*** jkilpatr_ has quit IRC		18:32
jeblair	Shrews: while this is setting up -- we should probably create a new configuration just for this test which doesn't run all of those jobs (there's a bunch of jobs that test ansible behaviors)	18:35
jeblair	i mean, we'll need to do that to have the wait_for task anyway	18:35
jeblair	but we should just have a single job which does that, and not add it to the existing jobs used by the ansible test	18:36
jeblair	Shrews: mordred's change can also serve as a template for that -- he adds a whole new config setup there	18:37
*** hashar has joined #zuul		18:37
jeblair	Shrews: when i run it, i get an ansible_log in the python27 jobdir work/logs	18:39
jeblair	Shrews: one other thought i just had -- if you are using ttrun, you'll need to actually activate the venv	18:39
jeblair	because zuul won't find the ansible-playbook command otherwise	18:40
jeblair	.tox/py35/bin/activate; ttrun -epy35 ...	18:40
jeblair	is needed for tests which actually run ansible	18:40
Shrews	ah, i did not do the activation	18:41
mordred	jeblair: oh - I was supposed to try to fix that in ttrun I think, wasn't I?	18:41
mordred	jeblair: ttrun 1.0.5 _should_ have that fixed	18:43
Shrews	jeblair: yes, i have the log files now	18:43
mordred	Shrews: can you confirm your ttrun version? I'd love to know if the fix I put in is broken	18:44
Shrews	mordred: 1.0.4	18:44
Shrews	just upgraded it	18:44
Shrews	mordred: 1.0.5 does not seem to fix that	18:45
mordred	hrm. thanks	18:46
* Shrews goes back to the drawing board		18:48
Shrews	also, we should document that little testing tidbit, if it is not already	18:50
mordred	Shrews: ++	19:39
jlk	If anybody has cycles, I'd appreciate some reviews on 469297 and 470068	19:40
*** jkilpatr has quit IRC		20:06
mordred	jlk: oh - I agree with SpamapS's reviews - but think the patch is good	20:14
mordred	jlk: on 469297	20:14
SpamapS	aw shucks	20:18
SpamapS	I'm just RFC'ing ;)	20:18
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write inventory as yaml not ini https://review.openstack.org/467634	20:20
*** jkilpatr has joined #zuul		20:22
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Fix traceback when evaluating github changes https://review.openstack.org/470068	20:22
Shrews	I'm trying to find a way to make ansible pause after the run, but before the jobdir cleanup. Adding a wait_for to the playbook itself just for tests seems... icky	20:28
Shrews	the solution, thus far, eludes me	20:29
jeblair	mordred: spotted the problem on 467634	20:31
jeblair	mordred: left comment	20:31
SpamapS	Shrews: that's doesn't seem icky at all to me.	20:31
SpamapS	Shrews: it's the playbook form of monkeypatching, which is how all of our tests work.	20:31
Shrews	SpamapS: i do not know how all of our tests work, so my ickiness remains	20:31
Shrews	but i'll look at where we do similar things	20:32
SpamapS	The alternative is either less meaningful unit testing and end to end "hope that worked like we thought" integration tests.	20:32
SpamapS	s/and/or/	20:32
jeblair	Shrews: you want to pause the test while the job is running, right? it seems very natural to me to put the pause directly in the job.... i mean, the system is, after all, one designed to run arbitrary actions in tests. :)	20:33
SpamapS	Shrews: we haven't yet done anything like that for playbooks, but it's no different than monkey patching in our recording executor and getting in between the methods that do things with a "hold builds" functionality.	20:33
jeblair	(and it's not like this applies globally -- this is only for tests that need this behavior.	20:33
jeblair	SpamapS: i'd even argue that this is better than that. this is unlike monkeypatching in that we are just suppling actual content to the system in the same way a user would.	20:34
SpamapS	Agreed.	20:34
SpamapS	and then pretending to be the user :)	20:34
jeblair	(i kind of wanted to do this to all of the tests before i realized running ansible would be very expensive for most of them for very little gain)	20:34
*** hashar has quit IRC		20:46
pabelanger	ohai!	20:59
pabelanger	been head down in training today, just wating for train back to ottawa now	20:59
openstackgerrit	Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Handle change related reqs on push like events https://review.openstack.org/469297	21:08
jlk	SpamapS: jeblair: updated ^^ accordingly.	21:08
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571	21:51
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't report start on unmanaged projects https://review.openstack.org/455711	21:51
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix reporting on changes dequeued for deps https://review.openstack.org/471143	21:51
jeblair	471143 is a tiny change that took me most of the day to come up with. :)	21:53
SpamapS	jeblair: it's not the size of the patch, but how you test it.	21:54
jeblair	jlk: lgtm! +3	21:55
jeblair	zuul meeting now in #openstack-meeting-alt	22:01
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Don't copy the __pycache__ folder https://review.openstack.org/470245	22:04
openstackgerrit	Merged openstack-infra/zuul feature/zuulv3: Handle change related reqs on push like events https://review.openstack.org/469297	22:06
SpamapS	jlk: after the meeting, can we chat about your containerized zuul? I am poking at maybe pushing it up into bluemix k8s	22:11
SpamapS	(failing building nodepool images at the moment)	22:11
*** jamielennox\|away is now known as jamielennox		22:13
openstackgerrit	Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Write inventory as yaml not ini https://review.openstack.org/467634	22:59
jlk	SpamapS: yes, would love to	23:06
SpamapS	jlk: were you successful in building images with nodepool? I forget	23:14
jlk	I was, once I started using volume mounts for the nodepool working space	23:15
SpamapS	jlk: ah, so something about the aufs/etc. failed?	23:15
jlk	yeah	23:15
SpamapS	weird	23:15
jlk	I have volumes for /opt/nodepool/images and /tmp/ in the docker-compose file	23:15
SpamapS	jlk: also did you ever think about taking it from docker-compose -> helm?	23:16
jlk	pretty sure that's upstream now. That was from hints from tobiash	23:16
jlk	SpamapS: not as of yet. I wanted to get it working simply on my laptop with a single docker daemon first.	23:16
jlk	walk, then run	23:16
SpamapS	https://github.com/j2sol/z8s <-- we're talking about this yes?	23:16
jlk	yeah	23:16
jlk	in gerrit land, updated_at is in reference to which? the last time a patch was changed, or anything was edited?	23:19
jlk	oh n/m	23:19
clarkb	jlk: aiui its any attribute of the change being updated	23:20
clarkb	so comments, topics, votes, etc	23:20
clarkb	jeblair: looks like your fix that you pointed out above failed on unittesting?	23:20
pabelanger	jlk: diskimage-builder is it fedora, you should be able to yum install it over pip install	23:20
pabelanger	in*	23:20
jlk	yeah, I know, but often we want more specific control of what version we install	23:21
jlk	sometimes our own fork	23:21
jlk	sometimes pre-release	23:21
jlk	oh this is tricksy. There is a isUpdateOf attribute to a change, which looks to see if there is an older version of the change already in the queue. For gerrit easy to just look at the patch #s since they increase. For github not so, which is where the "updatedAt" comes in. But that bit is going away in graphql as a generic thing. I wonder if I have to paw through the commits and look at those? eww.	23:24
jeblair	clarkb: yep, looking now.	23:24
jeblair	jlk: could probably keep an internal timestamp if that's necessary?	23:26
jlk	we could, may have to go that route, but still exploring graphql. This is... hard to wrap my head around	23:26
mordred	jlk: but changing EVERYTHING about how people do things is the future	23:27
mordred	jlk: I'm not sure I fully understand the "is there an older versoin of the change already in the queue" bit	23:28
mordred	jlk: (for the github case)	23:28
jlk	open PR A	23:28
jlk	zuul ingests that, but hasn't acted on it yet	23:28
jlk	add commit to PR A	23:28
jlk	new event	23:28
clarkb	couldn't you compare sha1s?	23:28
jlk	Zuul should see that the old event for PR A opening should be tossed.	23:28
jlk	clarkb: they might be two completely different sha1s	23:28
clarkb	yes exactly	23:28
mordred	GOTCHA - so the PR has the same id	23:28
jlk	which one is "newest" ?	23:29
clarkb	jlk: whichever event you received last?	23:29
jlk	if the events come out of order?	23:29
mordred	is that a thing?	23:29
jlk	:shrug:	23:29
mordred	wow, fun	23:29
mordred	jlk: I at least understand your problem now :)	23:29
SpamapS	They order them on the UI	23:29
SpamapS	so one would hope they're ordered somehow	23:29
jlk	order what?	23:29
jlk	the commits?	23:29
jlk	but you can force push an entirely new set of commits	23:30
SpamapS	yeah, commits in the PR view	23:30
SpamapS	oh that	23:30
SpamapS	actualy	23:30
jlk	yeah, that.	23:30
SpamapS	they show that too	23:30
mordred	jlk: I do not have any immediately helpful suggestions if the events can come in out of order	23:30
SpamapS	you'll see "jlk updated the pr"	23:30
*** jamielennox is now known as jamielennox\|away		23:30
SpamapS	and then everything before that starts garbage collecting	23:30
SpamapS	:-P	23:30
jlk	I don't know if they could. We might be safe to just toss out any existing change with the same ID	23:30
jlk	or use the deprecated updatedAt for now	23:31
SpamapS	jlk: my guess is the best you can do is go ask the PR for its current sha.	23:31
SpamapS	as in, the pr source	23:31
jlk	yeah I can get a set of the commits	23:31
mordred	ah - right. and use that to determine which of the N events you have for a given PR is the one that has the shas GH thinks are currently on the PR	23:32
mordred	it would be super neat to know whether or not gh webhooks can deliver events out of sequence or not	23:32
jlk	it's the Internet	23:32
jlk	things can wind up delayed in route for whatever reason	23:33
jlk	so GH may issue them in order, but that doesn't guarantee that we'll receive and process them in order	23:33
mordred	I'm not sure I follow that - the webhook receiver on our side is an http service - the question in my head is will gh submit a second request to our http service before getting a response code from the server for the first one?	23:34
mordred	if it fires them async on their side, then totally	23:34
mordred	(which is probably the more likely thing they're doing, right?)	23:34
SpamapS	It logs them in the sequence they respond in, but there's no telling if they guarantee one webhook per target.	23:35
SpamapS	I'd guess that they are async	23:36
jlk	feels wrong to assume and rely upon order	23:36
mordred	totally. we shoudl not assume order unless we know something about order	23:37
clarkb	I dunno, if events don't carry necessary ordering sematics within themselves then using the order you receive them in seems appropriate	23:37
clarkb	but I guess this is a difference between gerrit and github	23:37
clarkb	gerrit is an actual event stream	23:37
clarkb	which implies order and all that	23:38
clarkb	github is just async "here is a thing YOLO"	23:38
mordred	jlk: you said updatedAt is going away in the graphql ... got a handy link to a doc?	23:39
jlk	sure	23:39
jlk	https://developer.github.com/v4/reference/object/pullrequest/	23:39
jlk	there's a possible replacement, a connection to the pull request timeline, https://developer.github.com/v4/reference/object/pullrequesttimelineconnection	23:40
mordred	jlk: yah - that seems to have ordering info in it	23:40
jlk	but I was looking for a simple refchanged like event	23:40
jlk	and failing	23:40
jlk	https://developer.github.com/v4/reference/union/pullrequesttimelineitem/	23:40
mordred	yah	23:40
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor https://review.openstack.org/470571	23:41
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix reporting on changes dequeued for deps https://review.openstack.org/471143	23:41
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't report start on unmanaged projects https://review.openstack.org/455711	23:41
openstackgerrit	James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix race in gerrit+github test https://review.openstack.org/471161	23:41
jlk	Seems to be missing HeadRefUpdatedEvent	23:41
mordred	jlk: yah. there's a ton of things about types of force push	23:41
mordred	jlk: maybe "commit" is the event for that?	23:42
jlk	that just leads to a commit object, which I'd have to mine for info	23:42
mordred	jlk: but - it would be an entry in the list of timeline events	23:43
jlk	yeah, might be able to take the last one of those	23:43
jlk	but...	23:43
mordred	jlk: so if the timeline property on the PullRequest is an ordered list - then you could walk it from the back for ordering perhaps?	23:43
jlk	oh hrm.	23:43
clarkb	the first doc you linked to says the general updatedat datetime will eventually be replaced, maybe just use it for now and then swtich when they replace it to something else?	23:44
jlk	re-thinking the scenario	23:44
jlk	clarkb: that's what I'm leaning toward	23:44
clarkb	I read that as "use this thing but know you will have to change it once we get around to actually making the new thing"	23:44
jlk	so I have event A leading to a change object A and event B leading to change object B, both are of the same PR.	23:44
jlk	in event A the commit event list has one commit, and in event B it also has one, but different, commit.	23:45
jlk	I think I lack the data necessary to know which of those event/objects is the "newest" one	23:45
*** jamielennox\|away is now known as jamielennox		23:46
jeblair	jlk: do you query the pr when you get an event?	23:47
mordred	jlk: yah - it seems like you're have to query the api for the PR and get the full timeline, then see which of the commits is later in that timeline	23:47
* jeblair should start saying 'we' :)		23:47
jlk	mordred: the crux of the problem is that I need the info of "when this commit appeared on this PR" which is a timestamp that I don't think is tracked in the commit itself, because the commit may be a part of many PRs, or it may have been dormant for a long time until it showed up in the PR.	23:48
jeblair	jlk: oh, when you said "change object" above, did you mean in zuul? if so, that's a misapprehension -- there should only ever be one change object in zuul for a given pr.	23:48
jlk	it's a relationship between the commit and the PR specifically.	23:48
clarkb	jlk: reading the api you'd get two PR events right? each with a commit list. You check the updatedat of the PR objects find the newer one, then look at commits.last or whatever	23:48
mordred	clarkb: right. with updatedat it makes total sense - but without updatedat it seems hard to reason about	23:49
clarkb	mordred: yes, but I think what github is saying is that they will replace updatedat at some point but haven't yet so thats the only reason we don't have that data now	23:49
jlk	yeah updatedAt works perfect for this	23:49
jlk	jeblair: um.	23:49
clarkb	so I would just stick with updatedAt, wait for them to actually write the replacement code, then switch	23:49
mordred	yah. I think we have to use updatedAt - there doesnt' seem to be good replacement for it yet	23:49
mordred	++	23:49
jlk	jeblair: I'm not sure I follow that.	23:50
jlk	jeblair: is that assuming a caching of PR -> change object?	23:50
jlk	(because that doesn't exist in the github driver yet)	23:50
jeblair	jlk: oh, they'll be different because they are different "patchsets", sorry.	23:50
jeblair	jlk: but yes, that caching also needs to happen	23:51
jeblair	jlk: mostly i wanted to make sure we didn't base any behaviors on the lack of caching	23:51
jeblair	but i don't think i added anything useful to the conversation, sorry. :)	23:51
jeblair	fwiw, in gerrit, we didn't go very long before we got to the point where we needed to query the change on each event. so we may end up eating the cost needed to resolve this without updatedAt before it becomes a problem anyway. :)	23:52
mordred	jeblair, jlk: if we get to that point, then as opqaue as I find it right now, the graphql stuff may wind up being useful, as it allows us to submit more than one query at a time	23:54
mordred	I say that may be useful, because I worry about the gh rate limit if we have to do a query on every webhook submission	23:54
mordred	but I'm wildly waving my hands in the air now	23:54
clarkb	would be interesting if they readjusted the rate limits significantly based on graphql change	23:55
clarkb	like even less requests per hour because now you can have richer reuqests	23:55
jamielennox	so it'll be interesting to see what gh does with API request limits for graphql because it's much more expensive server side	23:57
jeblair	"have fewer pull requests"	23:57
jamielennox	currently you get like 5000 api requests/hour but 100 search requests/hour	23:57
jamielennox	and i can't imagine that graphql is much less expensive than search	23:57
jeblair	git operations are unlimited?	23:57
jlk	they did	23:58
jlk	the limits are a lot lower with graphql	23:58
jeblair	many of zuul's questions may be able to be answered via git :)	23:58
jlk	so, to answer an earlier question	23:58
jamielennox	i'm not sure about git operations	23:58
jlk	currently it depends on the type of event as to whether or not we query github for the pullrequest data. Some events we get enough from the event body itself to move on, and the PR isn't queried until the getChange call.	23:58
jlk	however,	23:59
jamielennox	i know i've hit rate limits in the past for unauthenticated ops, but don't remember if that was ssh,git or http access	23:59
jlk	with graphQL, and with a caching inside of zuul, I've assumed that we'll start making a PR query and object for every event (that can be linked back to a PR), created when the event comes in, and updated at the getChange() call time (mirroring what gerrit does).	23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!