jeblair | turning off keep | 00:01 |
---|---|---|
*** harlowja has joined #zuul | 00:03 | |
*** harlowja has quit IRC | 00:09 | |
*** harlowja has joined #zuul | 00:14 | |
ianw | jeblair: doesn't fd.readline() in follow return str ... so under python3 the strings will always be encoded | 00:52 |
ianw | not that i think this is wrong per se, but slightly conflicts with the changelog | 00:53 |
*** harlowja has quit IRC | 00:57 | |
ianw | ah no, you're right, it's only with universal_newlines set that Popen does that | 00:57 |
jeblair | ianw: er, good! i hadn't realized that, but it sounds like it's still accidentally correct. that's part of why i wrote the detailed changelog -- just to make sure it all makes sense to all of us. :) | 01:01 |
*** harlowja has joined #zuul | 01:01 | |
*** jkilpatr has quit IRC | 01:26 | |
clarkb | readline returns str if you set universal newlines to true iirc | 01:34 |
clarkb | oh depends if text mode or binary and text implies universal newlines | 01:35 |
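The Popen behavior ianw and clarkb settle on above can be demonstrated directly. A minimal stand-alone sketch (not zuul code) showing that pipes yield bytes by default under Python 3, and str only in text mode:

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('hello')"]

# Default: pipe contents are bytes under Python 3.
raw = subprocess.Popen(cmd, stdout=subprocess.PIPE).stdout.readline()
print(type(raw))   # a bytes object

# universal_newlines=True switches the pipe to text mode: readline()
# returns str, and newline handling is normalized (text mode implies
# universal newlines, as clarkb notes).
text = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                        universal_newlines=True).stdout.readline()
print(type(text))  # a str object
```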
*** jkilpatr has joined #zuul | 01:37 | |
mordred | jeblair: lovely! | 01:49 |
mordred | clarkb: yah - I agree on the collapsing of the environment defaults for tox | 01:52 |
mordred | clarkb: if you're around, the 2 changes before 229 only have 1 +2 - they're mostly just lead-up to 229 though | 01:54 |
mordred | clarkb: I'm not sure if you want to explicitly review them, or if your +2 on the end results of 229 is good and I can just land the stack | 01:54 |
jamielennox | did we end up implementing a new way to hold the nodes of failing jobs? | 01:55 |
mordred | jamielennox: yes we did! | 01:55 |
jamielennox | \o/ - do you remember it? | 01:55 |
mordred | jamielennox: https://docs.openstack.org/infra/zuul/feature/zuulv3/admin/client.html#autohold | 01:55 |
mordred | jamielennox: (was looking for doc link for you) | 01:56 |
jamielennox | ahh, i was looking at nodepool | 01:56 |
jamielennox | probably does make more sense now for that to be on zuul side | 01:56 |
mordred | yah - it used to be there - but with v3 and the shift to active-requests ... yah | 01:56 |
jamielennox | is there still a use for "nodepool hold" | 01:57 |
jamielennox | ? | 01:57 |
jamielennox | mordred: anyway - thanks! | 01:59 |
mordred | jamielennox: I don't think so? maybe? | 02:00 |
jamielennox | not super urgent for now anyway - it can be a useful way of pulling a node out for your own usage | 02:00 |
jamielennox | just not sure if that'll be common | 02:01 |
mordred | jamielennox: I do know that a thing we don't have but people keep asking for is "nodepool boot" - so that you can ask nodepool to boot you a node of a particular label - like if you need to debug something about one | 02:01 |
mordred | jamielennox: which I think is similar to the use case you're talking about yeah? | 02:01 |
jamielennox | mordred: yea, because zuul will skip things marked HOLD, it basically reserves you a node | 02:02 |
mordred | "as an admin of a zuul/nodepool, I'm having issues that only show up in test and I'd like a node to ssh in to and poke around at to see if I can figure out" | 02:02 |
mordred | jamielennox: ah - ya - hold in nodepool gets you a node to play with - autohold in zuul doesn't delete a node when a job fails | 02:02 |
jamielennox | re: autohold, it'd be useful to not have every parameter required, like if tox-py27 is failing consistently in a tenant i probably don't care which project i capture from? | 02:02 |
jamielennox | which is a feature i should put in storyboard, but i still struggle to know where to put things like that in storyboard | 02:03 |
mordred | yah- I could see that | 02:03 |
mordred | jamielennox: I think we all do | 02:03 |
clarkb | mordred: I think you can go for it. I'm fighting the "my washing machine stopped working and neither water valve for it has a handle" battle now | 02:04 |
mordred | clarkb: oh good | 02:04 |
*** jkilpatr has quit IRC | 02:04 | |
mordred | clarkb: I'll land those in a sec - I'm working on the squash-tox-environment thing right now | 02:04 |
mordred | but I need to prove something to myself first | 02:05 |
clarkb | it turns out when you replace a section of leaky pipe, that kills washing machines | 02:05 |
clarkb | can I blame jaypipes for this? | 02:05 |
mordred | clarkb: yes. he's the right person to blame | 02:07 |
*** jkilpatr has joined #zuul | 02:16 | |
*** xinliang has quit IRC | 02:17 | |
fungi | pabelanger: jamielennox: mordred: i'm still catching up... not installing bindep if there's no bindep.txt ignores the fact that we have a bindep fallback list, doesn't it? or am i misunderstanding the suggestion? | 02:19 |
*** xinliang has joined #zuul | 02:30 | |
*** xinliang has quit IRC | 02:30 | |
*** xinliang has joined #zuul | 02:30 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults https://review.openstack.org/501075 | 02:38 |
mordred | fungi: the code that looks for bindep.txt also looks for the fallback file | 02:39 |
mordred | fungi: so for openstack, it will always find a bindep.txt - AND it will never install bindep because we have it pre-installed | 02:39 |
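The lookup mordred describes (a per-repo bindep.txt with a site-wide fallback file) can be sketched roughly as follows. The helper name and signature here are hypothetical, not the actual role code; only the file-selection logic mirrors the discussion:

```python
import os

def find_bindep_file(repo_dir, fallback_path=None):
    """Return the bindep file to use: the repo's own bindep.txt if present,
    otherwise a configured site-wide fallback file (if any), else None."""
    candidate = os.path.join(repo_dir, "bindep.txt")
    if os.path.exists(candidate):
        return candidate
    if fallback_path and os.path.exists(fallback_path):
        return fallback_path
    # No file at all: skip running (and installing) bindep entirely.
    return None
```

With a fallback configured, as OpenStack has, this always resolves to some file, so bindep would always run there; installing bindep itself is separately skipped when it is pre-installed on the image.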
mordred | clarkb: ^^ https://review.openstack.org/501075 collapses the tox_environment settings like you mentioned AND gets rid of the python module | 02:39 |
mordred | clarkb: so tons of simplification | 02:40 |
clarkb | nice | 02:40 |
mordred | clarkb: can probably squash with the previous patch, but I figured I'd put it up separate for reading purposes | 02:40 |
clarkb | mordred: I've pulled it up for review first thing tomorrow | 02:40 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Don't install bindep if there's no bindep file https://review.openstack.org/501018 | 02:50 |
mordred | jeblair: I hit +A on https://review.openstack.org/#/c/501040 but there's some good comments from ianw in there that are worthy of reading and potentially a followup | 03:07 |
fungi | jhesketh: trying to fix up your 456162 change i'm down to just one unit test failure now... if you get a chance to take a look at why test_crd_gate_unknown is unhappy with it we might be able to get it merged soon | 03:26 |
* fungi needs to get some sleep, but will be getting into more zuulishness tomorrow | 03:26 | |
jhesketh | fungi: sure, I'll take a look | 03:30 |
ianw | mordred: 501040 ... is that a known thing? | 03:39 |
ianw | the -2 i mean | 03:39 |
*** bhavik1 has joined #zuul | 05:02 | |
jamielennox | {"msg": "[Errno 2] No such file or directory" is such a useless message | 05:03 |
jamielennox | why isn't the accessed name in there by default | 05:04 |
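The complaint is fair: Python itself does record the offending path on the exception; the Ansible message jamielennox quotes just doesn't include it. A quick illustration:

```python
try:
    open("/no/such/path")
except FileNotFoundError as e:
    # Errno 2 plus the accessed name are both on the exception object,
    # so an error message can easily include the path.
    print(e.errno)      # 2 (ENOENT)
    print(e.filename)   # /no/such/path
    print(e.strerror)   # No such file or directory
```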
*** bhavik1 has quit IRC | 05:25 | |
*** hashar has joined #zuul | 05:35 | |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary https://review.openstack.org/456162 | 06:06 |
jhesketh | fungi: ^ I think that fixes the problem | 06:07 |
* jhesketh misses working on zuul | 06:08 | |
tobiash | jhesketh: I have a thought on ^ | 07:44 |
tobiash | jeblair: just noticed that maintainCache is never called so we probably don't clear anything from the change caches currently | 07:57 |
tobiash | jeblair: http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/scheduler.py?h=feature/zuulv3#n589 | 07:57 |
tobiash | jeblair: there is still some comment to update maintainConnectionCache for tenants but to me this method looks correct | 07:58 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Enable maintainConnectionCache https://review.openstack.org/501144 | 08:02 |
tobiash | jeblair: wip'ed ^ in case you have any objections for this | 08:03 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Use password supplied from nodepool https://review.openstack.org/500823 | 08:44 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Enable maintainConnectionCache https://review.openstack.org/501144 | 09:06 |
*** openstackgerrit has quit IRC | 09:18 | |
jhesketh | tobiash: ah cool, thanks... I think you're right and your suggestion is good. Should we let this one land and fix it up in a follow up or do it now? | 09:19 |
tobiash | jhesketh: I don't mind, but I also think the data structure change should be its own patch | 09:19 |
jhesketh | tobiash: umm, so you do want them separate? (sorry, I'm confused by your last message) | 09:21 |
tobiash | jhesketh: I think we should have a patch which restructures the cache data structure and the patch which already exists. Possibilities are that the restructure change is either the parent or the child of your change | 09:22 |
jhesketh | oh right, I follow | 09:23 |
*** openstackgerrit has joined #zuul | 09:48 | |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary https://review.openstack.org/456162 | 09:48 |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Connection change cache improvement https://review.openstack.org/501187 | 09:48 |
jhesketh | tobiash: ^ | 09:48 |
*** openstackgerrit has quit IRC | 10:03 | |
tobiash | looking | 10:08 |
*** openstackgerrit has joined #zuul | 10:22 | |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Connection change cache improvement https://review.openstack.org/501187 | 10:22 |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary https://review.openstack.org/456162 | 10:22 |
tobiash | jhesketh: +2 | 10:23 |
*** jkilpatr has quit IRC | 10:42 | |
rcarrillocruz | mordred, jeblair : ok, so I created "Ricky Zuul" GH app https://github.com/apps/ricky-zuul . The perms are a bit of guesswork, put r/w on PR and r/w on repo contents | 10:46 |
rcarrillocruz | oh, and also on commit statuses | 10:46 |
rcarrillocruz | i installed that app on my rcarrillocruz org dummy repo 'zuul-tested-repo' | 10:46 |
rcarrillocruz | now on my way to set up zuul-scheduler with github driver | 10:47 |
rcarrillocruz | pabelanger: ^ | 10:47 |
rcarrillocruz | erm, i guess the commit statuses is not needed | 11:01 |
rcarrillocruz | jeblair: so i guess once we agree the bare minimum perms that are needed for creating a bespoke GH app for zuul usage, I can push a change and document that | 11:02 |
rcarrillocruz | is that expected to change? i remember reading a perms model change in GH, something about graphql, not sure if the GitHub App thing may be in flux ? | 11:03 |
*** jkilpatr has joined #zuul | 11:07 | |
*** jkilpatr has quit IRC | 11:07 | |
*** jkilpatr has joined #zuul | 11:07 | |
*** jkilpatr has quit IRC | 11:15 | |
*** jkilpatr has joined #zuul | 11:28 | |
mordred | rcarrillocruz: my understanding is that the App thing itself is the "new" way for 3rd parties to provide services to github users | 11:31 |
mordred | rcarrillocruz: but you're very right - gh is moving their apis to all be graphql-based | 11:31 |
mordred | rcarrillocruz: so at *some* point we'll need to update the gh driver to use graphql-api instead of rest | 11:32 |
mordred | ianw: hrm. not to me | 11:33 |
mordred | jhesketh: we miss you working on zuul! | 11:33 |
rcarrillocruz | sigh | 11:35 |
rcarrillocruz | what's wrong with rest | 11:35 |
* rcarrillocruz has nightmares, as network vendors instead of adopting rest they are coming back to xml apis | 11:36 | |
rcarrillocruz | does http://paste.openstack.org/show/620512/ ring a bell anyone? | 11:41 |
rcarrillocruz | that from scheduler startup | 11:41 |
rcarrillocruz | if i do on python3 shell | 11:41 |
rcarrillocruz | import github3 | 11:42 |
rcarrillocruz | gh = github3.GitHub() | 11:42 |
rcarrillocruz | it does not have a session either | 11:42 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Connection change cache improvement https://review.openstack.org/501187 | 11:51 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary https://review.openstack.org/456162 | 11:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults https://review.openstack.org/501075 | 12:01 |
*** weshay_PTO is now known as weshay | 12:08 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults https://review.openstack.org/501075 | 12:11 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Delete unused run-cover role https://review.openstack.org/501244 | 12:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Switch to openstack-doc-build for doc build jobs https://review.openstack.org/501246 | 12:18 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Be explicit about byte and encoding in command module https://review.openstack.org/501040 | 12:21 |
mordred | jeblair: SOOOOOO | 12:26 |
mordred | jeblair: issue for you to look at as soon as you are awake | 12:26 |
mordred | jeblair: I just watched shade have a job in the gate pipeline - which is incorrect | 12:27 |
mordred | jeblair: the build uuid is 80e369018a664c5f86c3cd64af7a4640 | 12:27 |
mordred | jeblair: https://review.openstack.org/#/c/494535/ is the change it happened for ... | 12:29 |
mordred | jeblair: there is another change: https://review.openstack.org/#/c/500201/ which added the job to the gate pipeline, which I approved because I'm a moron and didn't register that it had a gate entry | 12:30 |
mordred | jeblair: that change had not landed, but it *WAS* running in the gate when https://review.openstack.org/#/c/494535/ was approved and enqueued | 12:31 |
mordred | rcarrillocruz: zomg network vendors are moving back to XML? they should at least, if they're not going to do REST, do something sane like gRPC | 12:32 |
mordred | rcarrillocruz: did you properly get the version of github3.py from git? | 12:33 |
rcarrillocruz | https://en.wikipedia.org/wiki/NETCONF | 12:33 |
rcarrillocruz | it's been around for a while, but getting more vendors onboard now | 12:33 |
rcarrillocruz | which is a shame, since there's a thing called RESTConf | 12:33 |
mordred | rcarrillocruz: http://git.openstack.org/cgit/openstack-infra/zuul/tree/requirements.txt?h=feature/zuulv3#n5 | 12:33 |
rcarrillocruz | anyway | 12:33 |
mordred | rcarrillocruz: **HEADDESK** | 12:34 |
rcarrillocruz | mordred: yeah, i did | 12:34 |
rcarrillocruz | thing is | 12:34 |
mordred | rcarrillocruz: now is _definitely_ the time to start adopting a protocol written in 2006 | 12:34 |
rcarrillocruz | i don't understand that code | 12:34 |
rcarrillocruz | the github object is supposed to get the session when it logs in | 12:34 |
rcarrillocruz | let me link | 12:34 |
rcarrillocruz | https://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/driver/github/githubconnection.py#L427 | 12:35 |
rcarrillocruz | it fails there | 12:35 |
rcarrillocruz | but, the login is done after that method | 12:35 |
rcarrillocruz | so at that point there's no session | 12:35 |
rcarrillocruz | commenting out those lines the execution goes over fine | 12:38 |
mordred | rcarrillocruz: that's really weird - I don't see that error in our logs | 12:39 |
mordred | rcarrillocruz: I wonder if there is a difference in how we have the auth things configured? | 12:39 |
mordred | rcarrillocruz: http://paste.openstack.org/show/620519/ is our github config snippet (with two values omitted, clearly) | 12:40 |
rcarrillocruz | yeah, i have the same thing | 12:41 |
rcarrillocruz | mordred: http://paste.openstack.org/show/620521/ | 12:42 |
rcarrillocruz | that's pretty much what the driver code does | 12:42 |
rcarrillocruz | in a python3 shell session | 12:42 |
rcarrillocruz | getting a client | 12:42 |
rcarrillocruz | i don't have a session attr | 12:42 |
rcarrillocruz | i'm confused how that works in your side | 12:42 |
mordred | rcarrillocruz: that works for me: <github3.session.GitHubSession object at 0x7efeae5aa9e8> | 12:44 |
rcarrillocruz | hmm | 12:44 |
mordred | http://paste.openstack.org/show/620522/ | 12:45 |
mordred | rcarrillocruz: >>> github3.__version__ | 12:45 |
mordred | '1.0.0a4' | 12:45 |
mordred | although that doesn't really tell what version from git it's installed from | 12:46 |
rcarrillocruz | eugh | 12:49 |
rcarrillocruz | that was it | 12:49 |
rcarrillocruz | it seems i had a github lib floating around | 12:49 |
rcarrillocruz | my messing with pip vs pip3 probably | 12:50 |
rcarrillocruz | thx | 12:50 |
mordred | rcarrillocruz: the pip vs. pip3 thing has bitten us more than once :) | 12:50 |
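The stale-copy problem rcarrillocruz hit is easy to diagnose from a Python shell: a module's `__file__` shows which installation actually got imported. A generic sketch, using a stdlib module as a stand-in for github3 (the helper itself is hypothetical):

```python
import importlib

def describe_module(name):
    """Report where a module was imported from and its version, if any.
    Handy when pip and pip3 have installed different copies of a package."""
    mod = importlib.import_module(name)
    return {
        "file": getattr(mod, "__file__", "<builtin>"),
        "version": getattr(mod, "__version__", "<no __version__>"),
    }

# Stdlib example; for the case above you'd pass "github3" instead.
print(describe_module("json"))
```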
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults https://review.openstack.org/501075 | 12:56 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Delete unused run-cover role https://review.openstack.org/501244 | 12:56 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add UPPER_CONSTRAINTS_FILE file if it exists https://review.openstack.org/500320 | 12:56 |
rcarrillocruz | mordred: the webhook URL path, is by default <zuul_server>/connection/github/payload or you have some reverse proxy redirecting that to other thing ? | 13:00 |
mordred | rcarrillocruz: it is that by default where zuul_server is the zuul-scheduler webapp process | 13:02 |
rcarrillocruz | sweet | 13:03 |
mordred | rcarrillocruz: we also have a reverse proxy in front of that for us, as there are 2 different web apps at the moment (zuul-scheduler and zuul-web) and we want to hide that | 13:03 |
rcarrillocruz | that reminds me i should bring t up now (zuul-web) | 13:03 |
mordred | hopefully it won't be too long after the PTG for us to migrate the rest of the scheduler webapp into zuul-web so we can go back to having one web app | 13:04 |
rcarrillocruz | jebus | 13:09 |
rcarrillocruz | http://paste.openstack.org/show/620528/ | 13:09 |
rcarrillocruz | i'm excited! | 13:09 |
rcarrillocruz | \o/ | 13:09 |
Shrews | morning folks | 13:13 |
mordred | morning Shrews | 13:13 |
mordred | rcarrillocruz: woot! | 13:13 |
mordred | rcarrillocruz: it's kind of amazeballs isn't it? | 13:13 |
rcarrillocruz | for sure :D | 13:14 |
*** dkranz has joined #zuul | 13:21 | |
mordred | jeblair: also still seeing the weird -2 on patches not at the top of a stack over in shade even with yesterday's patch running | 13:23 |
mordred | jeblair: http://paste.openstack.org/show/620530/ is the relevant portion of the log I think | 13:23 |
mordred | jeblair: also, a little further back in the log: 2017-09-06 12:03:16,903 DEBUG zuul.DependentPipelineManager: Scheduling merge for item <QueueItem 0x7f4882408b38 for <Change 0x7f48927d5ba8 500930,2> in gate> (files: ['zuul.yaml', '.zuul.yaml'], dirs: ['zuul.d', '.zuul.d']) | 13:27 |
rcarrillocruz | mordred: does zuul-scheduler log github events? | 13:28 |
rcarrillocruz | like if i do a PR push, should I expect something in that log (in my case, foreground, not logging to its own file yet) | 13:29 |
mordred | yes | 13:29 |
mordred | you should definitely see activity | 13:29 |
rcarrillocruz | sweet, i'll try that out | 13:30 |
mordred | rcarrillocruz: https://github.com/organizations/openstack-infra/settings/apps/openstack-zuul/advanced (or replacing with your url) | 13:30 |
mordred | rcarrillocruz: shows you a list of events github has delivered to your app | 13:30 |
mordred | rcarrillocruz: we should be logging the event id so that if you want you can cross-reference with github's log | 13:31 |
mordred | jlk: ^^ speaking of that ... on that advanced tab gh also shows the response it got | 13:31 |
rcarrillocruz | sweet | 13:31 |
mordred | jlk: I wonder if maybe we should add $something to our response - like a header - that includes $something from zuul | 13:32 |
rcarrillocruz | oh , i get 404 , i guess cos i'm not member of the org | 13:32 |
mordred | jlk: it's possible we don't have anything yet | 13:32 |
rcarrillocruz | but yeah | 13:32 |
rcarrillocruz | i can look on my own org | 13:32 |
mordred | jlk: at that stage of processing | 13:32 |
mordred | jlk: but if we do, maybe returning it in our response headers is a thing that could be useful somehow? | 13:33 |
mordred | jlk: just an idle thought | 13:33 |
*** hashar is now known as hasharAway | 13:38 | |
rcarrillocruz | hmm, mordred i don't see logging on a PR I just pushed. However, I do not have layout.yaml set yet. I wonder if the logging is only when the project is set up on a pipeline with jobs and all. i.e. the webhook raw events are not logged ? | 13:38 |
rcarrillocruz | nah, seems like a config issue | 13:41 |
rcarrillocruz | checking the github app i see undelivered messages | 13:41 |
* rcarrillocruz looks | 13:41 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6 https://review.openstack.org/501266 | 13:50 |
mordred | rcarrillocruz: yah - you should see log entries for every event that happens | 13:51 |
rcarrillocruz | tadaaaa | 14:03 |
rcarrillocruz | Sep 06 14:03:08 zuul sh[16326]: 2017-09-06 14:03:08,619 DEBUG zuul.GithubWebhookListener: Github Webhook Received: 7eda70b6-9308-11e7-8827-64faa1b0f4fd | 14:03 |
rcarrillocruz | got confused between zuul-webapp and zuul-web ports | 14:03 |
rcarrillocruz | put 8001 on the gh app URL and sorted it | 14:03 |
mordred | woot! | 14:12 |
mordred | rcarrillocruz: and yah - I'm looking forward to there only being one web port - the current thing is annoying | 14:12 |
mordred | tobiash: left -1 on https://review.openstack.org/#/c/500799 - but overall I like both sides of that stack! | 14:13 |
tobiash | :) | 14:13 |
rcarrillocruz | in order to have feature parity to what we have with dci (periodic CI jobs), i'll set up a periodic pipeline and set what we have now. After that, check_github and gate_github | 14:15 |
dmsimard | mordred: hey, a bit of a silly question -- how do we make the base job work on either ubuntu-xenial and centos-7, but not both ? | 14:17 |
mordred | rcarrillocruz: \o/ | 14:17 |
dmsimard | right now the base job defaults to one ubuntu-xenial node -- so jobs wanting to run on something else would need to override it I guess ? | 14:18 |
mordred | dmsimard: uhm. I'm not 100% sure what you mean by that - can you say that with different words? | 14:18 |
mordred | dmsimard: yes! that is correct | 14:18 |
mordred | dmsimard: jobs that want not-ubuntu-xenial just add whatever they want in nodes: | 14:18 |
dmsimard | mordred: okay, pabelanger stood up the centos-7 image so I'll run some tests to see if it works ahead of the migration | 14:18 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix missing logconfig when running tests in pycharm https://review.openstack.org/500748 | 14:19 |
mordred | dmsimard: ++ | 14:19 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Close logging config file after write https://review.openstack.org/500754 | 14:19 |
dmsimard | mordred: the JJB translation to shell will handle the node definition as well ? | 14:19 |
dmsimard | some have centos-7, centos-7-2-nodes etc | 14:20 |
mordred | dmsimard: yah | 14:21 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Do not merge: test v3 jobs on the centos-7 image https://review.openstack.org/501281 | 14:21 |
jeblair | mordred: regarding 494535 being in gate -- i'm not actually sure that was incorrect. when a change is enqueued into gate, it's a proposed future state, and each change enqueued after it exists in the context of that proposed future state. so as soon as the change adding the shade job to gate was in the gate pipeline, the next shade change added to the gate pipeline should run that job too, because it's running in a world where shade has a gate job. | 14:33 |
mordred | jeblair: yes - for sure! | 14:33 |
mordred | jeblair: but - when the change adding the gate job does not land, the change behind it should get re-enqueued in a context that doesn't include the gate-adding change | 14:34 |
mordred | jeblair: so while it should run in the gate for a moment in time, the other change should certainly not be merged by the gate pipeline | 14:34 |
mordred | (which is what happened) | 14:35 |
jeblair | mordred: ah yes. i'll make a test. | 14:35 |
mordred | jeblair: here's the sequence that occurred https://etherpad.openstack.org/p/FZhs4Rh86F | 14:39 |
jeblair | mordred: got it | 14:40 |
mordred | jeblair: also - I was going to fix pabelanger's change to remove the gate mention, but was waiting to make sure you didn't need it for any reason | 14:40 |
jeblair | nope | 14:40 |
mordred | jeblair: cool. also added followup with theory on what the followups are with the other changes (that issue is persistent and still happening, fwiw) | 14:45 |
jeblair | mordred: that makes sense too | 14:47 |
mordred | k. cool | 14:47 |
jeblair | we'd certainly want to fix that before we start having zuulv3 chime in on more repos. | 14:47 |
jlk | mordred: a header tossed back, some sort of uuid? | 14:48 |
jlk | mordred: would make sense, particularly if we could carry that ID throughout zuul logging, for tracing. | 14:50 |
mordred | jeblair: agree | 14:51 |
mordred | jlk: and yah - maybe? I mean, on the other hand we already log the GH event id in the zuul logs | 14:51 |
mordred | jlk: so I'm not sure if reporting a zuul id in the gh response is *actually* useful for cross-reference | 14:52 |
jlk | well | 14:52 |
mordred | jlk: since you'd ultimately need to find the gh event id in the zuul log to know which event to open in the gh ui | 14:52 |
jlk | it'd make sense if we did the same elsewhere, and had an ID that could be carried around like with openstack | 14:52 |
mordred | yah - that's a good point "I got an event from something, I created an ID for it, and that ID is gonna carry through the system on the things that event triggers" | 14:53 |
jlk | but yeah, if you were looking from zuul side, you'd eventually trace it back to the incoming event, which would have the GH ID | 14:53 |
mordred | yup | 14:53 |
jlk | so maybe not as useful to toss it back, but certainly that idea spurred other ideas :D | 14:54 |
jlk | spurred? | 14:54 |
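The "carry an ID through the system" idea jlk and mordred land on is commonly done with a logging adapter. A hypothetical sketch, not zuul's actual logging setup, of tagging every log line from one event's processing with its delivery ID:

```python
import logging
import uuid

logging.basicConfig(format="%(levelname)s %(event_id)s %(message)s")
log = logging.getLogger("zuul.GithubEventProcessor")  # name is illustrative

def process_event(payload):
    # One ID per incoming event: use the delivery ID GitHub sent if we
    # have it, else mint one. Every downstream log line carries it, so
    # grepping for the ID reconstructs that event's trail through zuul.
    event_id = payload.get("delivery_id") or uuid.uuid4().hex
    elog = logging.LoggerAdapter(log, {"event_id": event_id})
    elog.warning("received event")
    return event_id
```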
mordred | jlk: oh - I also happened to notice in the log: | 14:54 |
mordred | AttributeError: 'GithubWebhookListener' object has no attribute '_event_pull_request_review_comment' | 14:54 |
mordred | jlk: only 4 of them in the current debug log | 14:54 |
jlk | yeah we aren't listening for those | 14:54 |
jlk | that's when somebody does a "review" and comments on the code/review in the review context | 14:54 |
jlk | separate from just a single comment on the PR | 14:55 |
jlk | (and also different from a single comment on the diff) | 14:55 |
mordred | jlk: gotcha. so we don't care about those because we only care about comments on PRs for recheck, yeah? | 14:55 |
jlk | correct. Those come through as issue comment | 14:55 |
mordred | jlk: and for reviews I'd imagine we'd prefer to respond to review approval rather than text in a review | 14:56 |
jlk | because a PR is an issue, except that it isn't. | 14:56 |
mordred | yah | 14:56 |
jlk | mordred: bingo. We do respond to approval/request changes events | 14:56 |
mordred | jlk: maybe we should make a few explicit no-op handlers for thins we know we're not listening to on purpose | 14:56 |
mordred | jlk: so that we don't log AttributeErrors for a thing that's actually purposeful behavior | 14:57 |
jlk | We could do that. It should be gracefully returning to github if we don't handle the event. | 14:57 |
jlk | but yeah, that's a tad ugly in the code | 14:57 |
jlk | Maybe we could just not emit that error when we don't match an event. | 14:57 |
jlk | GH adds events from time to time, we'd be chasing it if we had a noop for each. | 14:57 |
mordred | jlk: yah - at this point I think we're fairly happy with our event matching - we could make a separate logger for unmatched events that defaults to off but that people could add to their logging config if they wanted to debug something related to event matching | 14:59 |
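The dispatch-with-a-quiet-default idea mordred sketches can look like this. The `_event_<type>` handler naming follows the convention visible in the AttributeError above, but the class itself is hypothetical:

```python
import logging

# A separate, normally-silent logger that operators can enable in their
# logging config if they need to debug event matching.
unmatched_log = logging.getLogger("zuul.GithubUnmatchedEvents")

class WebhookDispatcher:
    def _event_pull_request(self, body):
        return "enqueued"  # stand-in for real handling

    def dispatch(self, event_type, body):
        # Look the handler up by convention; an event type we don't
        # handle is logged at debug level instead of raising
        # AttributeError, so new GitHub event types need no chasing.
        handler = getattr(self, "_event_" + event_type, None)
        if handler is None:
            unmatched_log.debug("no handler for event %s", event_type)
            return None
        return handler(body)
```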
mordred | btw - we're getting "We couldn’t deliver this payload: Service Timeout" from time to time on gh events | 15:01 |
jlk | oh wonderful | 15:02 |
jlk | from gh to zuul or from zuul to gh? | 15:02 |
mordred | so it's possible that we've already hit the scaling point where having more than one webhook listener behind a loadbalancer is needed | 15:02 |
mordred | gh to zuul | 15:02 |
jlk | yeah, I figured that would happen soon | 15:02 |
jlk | I think a big part of that problem is a bunch of processing happens while the sender is connected | 15:03 |
mordred | yah - even though we're not doing anything with them yet, the firehose of gh events from ansible/ansible is actually kind of useful for shaking this sort of stuff out | 15:03 |
jlk | sender connects, zuul chews on the event for a bit, hits the API a bunch, then returns | 15:03 |
jlk | it would probably be much better to take in the event content and return right away. | 15:04 |
mordred | ah. any reason we can't return as soon as we have the json even if we haven't enqueued it yet? or do you think waiting until it's enqueued so that we properly return to gh that we didn't accept it is better? | 15:04 |
jlk | we should be doing minimal processing. | 15:04 |
jlk | I think I broke this a bit | 15:04 |
jlk | it probably was really fast before, but when I moved to caching the PR data, it meant hitting the APIs much earlier. | 15:05 |
mordred | yah. I think returning quickly is likely better here- and we can do work zuul-side to make sure we either don't lose events after returning 200 or that we log ourselves when we do | 15:05 |
jlk | building up the cached object at web event time and carrying it forward. | 15:05 |
jlk | we could still do that, we just have to be careful | 15:05 |
jlk | need a persistent queue | 15:05 |
jlk | btw I'm going to try to participate in the ansible contributor day thing | 15:06 |
mordred | yah - well - luckily the move from webapp to zuul-web will put a gearman in the middle anyway | 15:06 |
jlk | via video/IRC | 15:06 |
mordred | jlk: cool | 15:06 |
jlk | I think the one hard one to easily do a 200 immediately on is when it's a status event | 15:06 |
jlk | actually, no, we could probably post-process that anyway | 15:06 |
jlk | so basically, we'd get an event from GH, we'd ensure it's signed and properly formatted, then return a 200. Maybe we could check to see if it's a project we care about and do a !200 if we don't care about the project yet. | 15:09 |
jlk | doing all the API work in the event thread ties up the event processor. I think that's single threaded, no? | 15:09 |
jeblair | jlk, mordred: that's what we do with gerrit today -- there's a queue object that connects the gerrit listener to the gerrit connection. all of that within the driver. | 15:10 |
jlk | nod | 15:10 |
jeblair | so if we need this before we move the listener into zuul-web, there's a pattern we can copy pretty quickly. it's not much code. | 15:11 |
jlk | yeah I could probably bang that out today | 15:11 |
jlk | see if that gets us out of resource contention | 15:11 |
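The gerrit pattern jeblair refers to (ack the sender fast, process the event later) is the classic producer/consumer split. A minimal stand-alone sketch of that shape, not the actual driver code:

```python
import queue
import threading

event_queue = queue.Queue()
processed = []

def listener(event):
    """Webhook side: validate cheaply, enqueue, return immediately."""
    if "type" not in event:
        return 400          # reject malformed payloads up front
    event_queue.put(event)  # the slow API work happens off-thread
    return 200

def worker():
    """Connection side: drain the queue and do the heavy processing."""
    while True:
        event = event_queue.get()
        if event is None:   # sentinel to stop the worker
            break
        processed.append(event)  # stand-in for API calls / zuul enqueue
        event_queue.task_done()

t = threading.Thread(target=worker)
t.start()
listener({"type": "pull_request"})
event_queue.put(None)
t.join()
```

Because the queue decouples the two sides, the sender's connection is held only for the cheap validation step; persistence of the queue is then a separate concern, as jlk notes.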
jlk | I honestly think I'll need some high bandwidth brain / face time with y'all to sort out the zuul-web move in my head. I read some of the code but it's not exactly clicking yet | 15:12 |
jeblair | jlk: it helps if you write it out on a piece of glass and look at it from the back side, upside down | 15:17 |
jlk | perfect! | 15:19 |
mordred | jlk: :) | 15:20 |
mordred | jlk: I think we can sort out the zuul-web move with high bandwidth brain time pretty quickly - it's actually pretty straightforward given the structure of the github driver - at least in my head | 15:22 |
* jlk drops off to prepare for ansiblefest things | 15:26 | |
rcarrillocruz | folks, i have to say the zuul docs have been vaaastly improved | 15:29 |
rcarrillocruz | kudos everyone | 15:29 |
* rcarrillocruz keeps reading how to define zuul v3 jobs | 15:29 | |
*** hasharAway is now known as hashar | 15:31 | |
mordred | rcarrillocruz: luckily - we've got a TON of content now | 15:33 |
Shrews | more content to go in with some +3's: https://review.openstack.org/500213 | 15:43 |
Shrews | jeblair: left a -1 on https://review.openstack.org/500216 because of a missing _ causing the link to not be clickable | 15:44 |
jeblair | Shrews: okay, i'll update that after i finish writing tests for mordred's problem | 15:45 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Do not merge: test v3 jobs on the centos-7 image https://review.openstack.org/501281 | 15:45 |
jeblair | mordred: with the test i've written, change B *does* incorrectly run the gate job, however, it correctly *does not* report it. | 15:46 |
jeblair | mordred: are you sure there aren't any other gate jobs? maybe some with matchers so they aren't actually run? | 15:48 |
jeblair | mordred: or were there any other changes involved in that sequence? | 15:48 |
mordred | jeblair: no - not to my knowledge to either | 15:58 |
jeblair | oh... there *might* be an interaction with the check pipeline... lemme rejigger the test | 15:59 |
mordred | jeblair: there is no mention of shade in a zuul gate pipeline anywhere other than that one change | 16:00 |
jeblair | mordred: ah there we go -- it's the presence in check that caused that behavior. i think i have the reproduction now. sorry for the red herring. | 16:01 |
pabelanger | mordred: I think we are ready to land https://review.openstack.org/500990 this morning. Do you have time to review? Our new publish-openstack-python-docs-infra job | 16:01 |
pabelanger | and child patches | 16:02 |
mordred | jeblair: woot! | 16:03 |
mordred | pabelanger: looking now | 16:03 |
mordred | jeblair: glad you found a reproduction - it's those sorts of squirrely things that this whole run-it phase should be smoking out | 16:06 |
rcarrillocruz | folks, where is the zuul base default job defined | 16:09 |
rcarrillocruz | is it in tree | 16:09 |
rcarrillocruz | or within zuul-jobs repo | 16:09 |
jlk | I thought it was in zuul-jobs | 16:09 |
rcarrillocruz | asking as i defined a custom job on my test repo | 16:09 |
rcarrillocruz | and got | 16:09 |
jlk | or maybe project-config | 16:09 |
jlk | project-config | 16:09 |
rcarrillocruz | "Job base not defined" | 16:09 |
mordred | rcarrillocruz: it's project-config | 16:10 |
rcarrillocruz | so, that means, it's a requirement to pull that repo in order to have a minimal zuul right? | 16:10 |
mordred | rcarrillocruz: you have to define your own base job | 16:10 |
jlk | playbooks/base | 16:10 |
jlk | you can define your own base, or re-use project-config. | 16:10 |
mordred | rcarrillocruz: however, our base job playbooks are just built on roles that are all in zuul-jobs | 16:10 |
rcarrillocruz | k, thought we had some sort of empty 'base' in the code | 16:10 |
rcarrillocruz | so we didn't have to define it | 16:10 |
mordred | rcarrillocruz: it's on our todo list to do that - there are a few things that need to be sorted out first, so for now you need a deployment-specific base job | 16:11 |
rcarrillocruz | dummy question: as there's going to be a super handy library of base jobs, how do you plan to distribute that? as part of pip or will it always be a thing on git openstack | 16:12 |
mordred | rcarrillocruz: I recommend copying the base job + playbooks from project-config, then defining a secret that holds credential for whereever you want to upload logs | 16:12 |
mordred | rcarrillocruz: git openstack | 16:12 |
mordred | rcarrillocruz: because you can just put openstack-infra/zuul-jobs directly in your zuul/main.yaml | 16:12 |
mordred | rcarrillocruz: and zuul will update it for you magically | 16:12 |
mordred | rcarrillocruz: I'm sure at some point someone is going to think they want a frozen pip/rpm/deb installable set of jobs, but I'm going to argue with them as strongly as I can that they don't really want that :) | 16:13 |
mordred | pabelanger: I'm +2 on that whole stack | 16:13 |
rcarrillocruz | ah ofc, zuul-jobs is *in* github too | 16:14 |
rcarrillocruz | so in my case, a GitHub-driver-only installation, it would pull it as well | 16:14 |
rcarrillocruz | ++ | 16:14 |
jlk | yeah you can point to github | 16:14 |
jlk | we do at Bonny | 16:14 |
mordred | rcarrillocruz: yup | 16:14 |
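A minimal sketch of what mordred describes above — listing zuul-jobs directly in the tenant config so zuul keeps it updated "magically". The tenant name, connection name, and project-config repo name here are assumptions, not the real deployment's values:

```yaml
# zuul/main.yaml (illustrative; "gerrit" could equally be a github
# connection for a GitHub-only installation like rcarrillocruz's)
- tenant:
    name: example-tenant
    source:
      gerrit:
        config-projects:
          - my-org/project-config       # holds the deployment-specific base job
        untrusted-projects:
          - openstack-infra/zuul-jobs   # shared roles, updated by zuul itself
```

The distinction matters for the later discussion: jobs in config-projects run in the trusted context, while zuul-jobs content is untrusted.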
dmsimard | mordred, jeblair: is it possible to prevent the base role from running ? | 16:15 |
rcarrillocruz | jlk: periodic also work on github right? | 16:15 |
rcarrillocruz | i'm doing a POC | 16:15 |
jlk | periodic driver? | 16:15 |
jlk | I haven't tried... | 16:15 |
rcarrillocruz | a periodic pipeline | 16:15 |
rcarrillocruz | with github source | 16:15 |
jlk | I mean, it should? | 16:15 |
mordred | dmsimard: you can say "parent: none" to make a job that doesn't use the base job | 16:15 |
dmsimard | mordred: perfect! thanks. | 16:15 |
mordred | dmsimard: although if you did that on openstack's zuul you'd be very sad | 16:15 |
mordred | dmsimard: since you don't get logging without our base job :) | 16:15 |
dmsimard | mordred: right, but the purpose is to test the base playbook (and the roles it contains) | 16:16 |
dmsimard | so it's kind of inconvenient if the trusted role runs first, and then we re-run the (modified) role on top | 16:16 |
mordred | oh - well - we'll never run a proposed version of the base job in a job | 16:16 |
dmsimard | mordred: what is preventing me from adding a required-projects: project-config and then running that playbook with the checked out roles from a review ? | 16:17 |
mordred | dmsimard: zuul is | 16:17 |
mordred | oh - hrm. | 16:18 |
mordred | dmsimard: yah - ok, you could construct something - it would still be synthetic, as it wouldn't have access to the secrets needed for the base job to work | 16:18 |
dmsimard | mordred: the tl;dr is that I want to make sure the base playbook works for all distros -- right now it only works for ubuntu. This is for centos: https://review.openstack.org/#/c/501281/ but I'll also add the debian, fedora and opensuse image to nodepool v3 | 16:19 |
dmsimard | and not getting configure-mirror "self-tested" in the gate will make this suck, a lot | 16:20 |
dmsimard | thus, I was planning on spawning a multi-node job and running ansible from a controller node, to the node where the base role would be applied -- a bit like how you showed me with zuul stream | 16:20 |
mordred | dmsimard: yah - well, the base-job content is purposely not self-testing - which is why we need to make some synthetic tests | 16:20 |
mordred | dmsimard: oh - but yes, that's exactly right | 16:20 |
dmsimard | mordred: so, can I do that then ? | 16:20 |
mordred | dmsimard: doing that is, I think, what we need to do to verify base content - but then it's not really about being able to run a job without its base job... | 16:21 |
mordred | or, rather... | 16:21 |
mordred | dmsimard: the synthetic job will still need to deal with providing the job running from the controller to the other node with variables that the job can use when it runs the roles in the base job's playbooks | 16:22 |
dmsimard | sure, I can figure what to pass and provide "mock" data as necessary | 16:23 |
mordred | dmsimard: so - that's a thing you could have your synthetic job create - like making sure there is a key installed on one of the nodes, then passing it in | 16:24 |
dmsimard | the purpose is to test that it works without failing horribly | 16:24 |
dmsimard | right | 16:24 |
pabelanger | mordred: thanks, jeblair clarkb fungi: are you interested in reviewing https://review.openstack.org/500990 for our publish-openstack-python-docs-infra jobs | 16:24 |
mordred | dmsimard: we still likely want to make a base job in project-config like "base-post-only" or something that you could use that would only run the post-logs playbook - and maybe that would only run the pre-playbooks against the controller node instead of against hosts: all | 16:25 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure https://review.openstack.org/501345 | 16:25 |
rbergeron | jlk, pabelanger: jeblair / mordred / shrews aren't here with us today at the ansible contributor event -- but since you're here (paul in person, jesse remotely) if you feel like there's anything to hail them about, feel free to do so -- | 16:25 |
mordred | dmsimard: so that you can use the REAL base job to set up your interactions with controller and to get logs published, etc | 16:25 |
rbergeron | the agenda and bluejeans stuff is posted here: https://public.etherpad-mozilla.org/p/ansible-summit-september-2017-core including the bluejeans video stuff | 16:25 |
rbergeron | which i said twice in one sentence | 16:26 |
* mordred waves to rbergeron | 16:26 | |
rbergeron | anyway. | 16:26 |
rbergeron | shrews: i have your hoodie also :) | 16:26 |
fungi | pabelanger: i'm interested in reviewing anything and everything zuulv3, working my way through the infra-manual patches right now but i can look at those next | 16:26 |
* rbergeron waves to mordred | 16:26 | |
rbergeron | (and we're in #ansible-meeting on irc). sorry for the noise. also if anyone else wants to pop in for whatever reason you are welcome to :) | 16:26 |
jeblair | rbergeron: thanks! | 16:27 |
pabelanger | fungi: great! I think we're ready to start testing afs publishing again for infra jobs | 16:27 |
jeblair | mordred: https://review.openstack.org/501345 fixes the first thing (the A+B changes) | 16:28 |
mordred | jeblair: awesome. reading. also, I added post-merge review comments to https://review.openstack.org/#/c/500213 | 16:31 |
dmsimard | mordred: I'm not sure that matters, 'controller' is a subset of 'all' anyway -- so when running the playbook in the job, it will target a specific node as appropriate | 16:31 |
mordred | dmsimard: I believe I see where you're going, but I do not believe it's going to work quite like you want - you can get the proposed project-config change onto a build node, but you cannot get the existing zuul to execute playbooks from the proposed change no matter what you do because of the way config repos work | 16:39 |
clarkb | pabelanger: mordred please see comment on 990 | 16:39 |
mordred | dmsimard: so by synthetic, I mean you're going to have to command: ansible-playbook something on the controller against one of the other nodes in the multinode job | 16:39 |
dmsimard | mordred: yes, exactly -- I'll be running ansible-playbook | 16:39 |
mordred | clarkb: will do - could you look at https://review.openstack.org/501345 ? | 16:39 |
dmsimard | it's not perfect but it's the best we've got | 16:39 |
mordred | dmsimard: ++ | 16:39 |
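A sketch of the "synthetic" approach mordred outlines: the playbook zuul runs targets only the controller, and the controller invokes ansible-playbook itself against the other node. All paths, the inventory file name, and the playbook location are assumptions for illustration:

```yaml
# run.yaml of the hypothetical integration job
- hosts: controller
  tasks:
    - name: Run the proposed base playbook against the worker node
      command: >
        ansible-playbook -i ~/test-inventory
        ~/src/git.openstack.org/openstack-infra/project-config/playbooks/base/pre.yaml
```

Because the inner ansible-playbook runs on a build node rather than the executor, it can exercise unmerged project-config content (checked out via required-projects) without ever touching the trusted execution context — which is exactly the property jeblair insists on later in the discussion.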
mordred | clarkb: we found a really fun edge case with gating this morning :) | 16:40 |
mordred | clarkb: responded. I agree with your comment, but I think pabelanger can make the mv docs/post.yaml docs/infra-post.yaml when he adds the real post playbook for the openstack job | 16:42 |
clarkb | mordred: re parent: none from above, the change you just linked uses parent: null. Is that just a convention of pointing to undefined name or is null actually needed? | 16:47 |
dmsimard | jeblair: Can parameters from a job also be applied to a node ? If, for example, you'd want to run a playbook or a role against only one node. | 16:52 |
mordred | dmsimard: you define that in the playbook | 16:52 |
mordred | dmsimard: you can define groups for your nodes in the nodeset definition | 16:52 |
mordred | dmsimard: so you can put some of them into a group and then write the playbook to target that group | 16:52 |
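What mordred describes might look like the following; the nodeset name, node labels, and group name are assumptions:

```yaml
# nodeset definition in a zuul config file
- nodeset:
    name: controller-and-node
    nodes:
      - name: controller
        label: ubuntu-xenial
      - name: node
        label: centos-7
    groups:
      - name: targets          # playbooks can address this group by name
        nodes:
          - node

# playbooks/example/run.yaml — restricts itself to the group:
# - hosts: targets
#   tasks:
#     - name: Only runs on the grouped node
#       debug:
#         msg: "{{ inventory_hostname }}"
```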
mordred | clarkb: parent: null is required for the base job - since by definition it's the root of the inheritance hierarchy | 16:53 |
clarkb | mordred: and that is distinct from parent: none? | 16:53 |
mordred | nope. I just mistyped earlier | 16:53 |
clarkb | ah ok | 16:53 |
mordred | null is the yaml for None iirc | 16:54 |
dmsimard | mordred: so here's my next awesome problem | 16:54 |
clarkb | like false == False and so on | 16:54 |
mordred | main thing is - for a base job you must tell zuul explicitly it doesn't have a parent | 16:54 |
mordred | becase omitting parent: means parent: base | 16:54 |
mordred | clarkb: yup | 16:54 |
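Put together, the rule mordred states looks like this (playbook paths are illustrative):

```yaml
- job:
    name: base
    parent: null          # explicit: this job is the root of the inheritance tree
    pre-run: playbooks/base/pre.yaml
    post-run: playbooks/base/post.yaml

- job:
    name: tox-py35        # no "parent:" line, so parent defaults to base
    run: playbooks/tox/run.yaml
```

And since `null` is YAML's spelling of Python's `None` (as clarkb and mordred note), `parent: none` would instead be the string "none" — hence the required spelling.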
pabelanger | mordred: clarkb: replied. Yes, publish-openstack-python-docs job needs to be updated now, which I am working on locally. But want to make sure that python-docs-infra is now working, since we can build on top of that for unified docs | 16:54 |
dmsimard | mordred: I'd like the *real real* base job to actually run on the controller node (so it can, like, upload logs for real and stuff) but not on the node I'm going to fake-run base on | 16:54 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline https://review.openstack.org/501353 | 16:55 |
dmsimard | I hope that makes sense | 16:55 |
jeblair | mordred: ^ turns out that change fixed the second thing too, so that's just a test add | 16:55 |
mordred | dmsimard: right- that's why I was thinking we needed to make a special base job for this purpose in project-config that is like the base job but has playbooks that target a specific group instead of "all" | 16:55 |
* jeblair reads scrollback | 16:56 | |
mordred | jeblair: woot. btw - the first patch failed tests in gate | 16:56 |
dmsimard | mordred: making another playbook is easy, I mean, I can just add it as a "fixture" in zuul-jobs -- but then it can get out of sync from the "real" base playbook | 16:56 |
dmsimard | mordred: oh, wait, I'm confusing myself I see what you mean now | 16:57 |
dmsimard | ok, sure, let's do that | 16:57 |
mordred | dmsimard: although - now that I think about it - hosts: lines can have variables - so we COULD consider making a variable on the base job that is like "zuul_default_target: all" - and then defining some of our base playbooks to use hosts: "{{ zuul_default_target }}" - which would let people override that variable and run base playbooks against a subset of nodes | 16:57 |
dmsimard | mordred: what I was thinking about is more along the lines of --limit from the CLI | 16:58 |
dmsimard | mordred: your playbook has 'all' but you're passing a --limit <node name> so that you'd only be running against a specific node or group | 16:58 |
mordred | I could see times in which that would be beneficial to other people - especially with a controller/nodes pattern - like if someone wanted 3 nodes for a puppet integration test but only was ever going to run their zuul playbooks against controller since they want puppet to talk to and manage their other nodes | 16:58 |
pabelanger | mordred: jeblair: re: wheel builders we'll need more openstack-infra projects to zuulv3, are we okay to do that or hold off until mass import? | 16:58 |
mordred | dmsimard: I think both are things we should consider - but for now let's just do a second base job with a limited hardcoded set | 16:59 |
mordred | dmsimard: and hash out a plan for the general usecase next week | 16:59 |
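The variable-hosts idea mordred floats above might be sketched like this; `zuul_default_target` is a hypothetical variable name from this discussion, not an existing Zuul feature:

```yaml
# base playbook sketch
- hosts: "{{ zuul_default_target | default('all') }}"
  tasks:
    - name: Base-job setup that a child job could constrain to a subset
      debug:
        msg: "running on {{ inventory_hostname }}"
```

A child job would then set `zuul_default_target: controller` in its variables, approximating the `ansible-playbook --limit` behavior dmsimard describes, but expressed in job configuration rather than on the CLI.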
mordred | pabelanger: it's only 3 more projects | 16:59 |
dmsimard | mordred: right, I think adding support for a parameter which gets passed to --limit makes sense, wonder if I should write it down somewhere | 16:59 |
pabelanger | mordred: yes, I can propose it now. just wanted to confirm first | 16:59 |
mordred | https://review.openstack.org/#/c/500626/2/zuul.yaml | 16:59 |
mordred | pabelanger: I have the whole wheel-builder stack :) | 17:00 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add zuul-cloner shim https://review.openstack.org/500922 | 17:00 |
jeblair | dmsimard: why not just write your playbook to act on the nodes you want it to? | 17:00 |
Shrews | rbergeron: \o/ | 17:01 |
mordred | jeblair: because what he wants to do is limit the nodes the base playbook operates on | 17:01 |
jeblair | mordred: if that's the case then we have too much stuff in the base playbook | 17:01 |
mordred | jeblair: so that content he runs on the 'controller' node is what's responsible for doing the things to the other node that base would normally do | 17:01 |
dmsimard | it's probably confusing to explain in writing, I'll write it down and explain in person :D | 17:01 |
pabelanger | mordred: Oh, now I see | 17:02 |
mordred | jeblair: not really - zuul_stream test, for instance, has a zuul_console on the node it runs against because the zuul base job sets that up - so we don't actually test in that change that running zuul_console on the remote node does what we expect | 17:02 |
pabelanger | doh | 17:02 |
mordred | jeblair: but I think we can shelve this as a general topic until next week | 17:02 |
dmsimard | mordred: could/should I re-purpose base-test for that purpose ? | 17:02 |
pabelanger | mordred: we need to land zuul/main.yaml first. I'll make change | 17:02 |
mordred | jeblair: and for now do exactly what you said - which is "write a playbook which explicitly lists the hosts desired" | 17:02 |
jeblair | mordred: i have a really high bar for adding features to zuul for the express purpose of being able to test zuul | 17:02 |
mordred | jeblair: yes. I do not think it's needed | 17:03 |
mordred | jeblair: to add any features | 17:03 |
jeblair | so the fact that zuul_stream is hard to test doesn't bother me. that's *our* problem, we don't need to inflict that on users of the general case :) | 17:03 |
mordred | jeblair: right. I'm not saying we need to add any features - I think there are some things we can do simply right now, and some things we can do that might be more complex and general that we can talk about next week | 17:03 |
jeblair | (we can write a playbook to kill it; job done :) | 17:03 |
dmsimard | jeblair: What I'm talking about is not for the purpose of testing Zuul, it's a generic feature of Ansible to be able to limit what hosts a playbook will run against -- your playbook could have 'hosts: all' but if you do ansible-playbook playbook.yml --limit controller, it would only run against your controller node. | 17:04 |
jlk | Do we have any pending upstream Ansible features? | 17:05 |
*** jkilpatr has quit IRC | 17:06 | |
mordred | jlk: the upstreaming of log streaming that we discussed with abadger1999 and jimi-c - but I believe that's at "write a spec" stage | 17:06 |
jlk | okay | 17:06 |
jeblair | dmsimard: right, but i worry that in the zuul context that gets a little confusing when compared to playbooks authored to run against node lists. it seems like if you have a playbook that runs against 'foobar' nodes, and you don't want it to run on foobar nodes, don't add any foobar nodes to the job? | 17:07 |
mordred | jlk: other than that, I think we're pretty solid at the moment | 17:07 |
jlk | okay | 17:07 |
*** harlowja has quit IRC | 17:07 | |
*** harlowja has joined #zuul | 17:07 | |
mordred | for now, how about we add a base-integration - or even base-zuul-integration - that has the same content as base but with playbooks that target controller instead of all - it's a purely job-content solution for being able to build a zuul job to test the base job | 17:09 |
mordred | which, as jeblair points out, is a fairly unique and specific problem | 17:10 |
Shrews | jeblair: cloner shim is ready for review. I thought it would be less error prone to just flat out copy the ClonerMapper class into the shim. Also, using 'cp' for the hard linking since that seemed less error prone than a pure python solution. | 17:10 |
jeblair | yeah, i think part of the context that we're missing here is that we just spent about 3 weeks ensuring that the base content ran everywhere as part of our security posture | 17:10 |
jeblair | mordred, dmsimard: so anything where we back that out is something that we need to do very carefully | 17:11 |
jeblair | mordred, dmsimard: for instance, even in the reduced base job that mordred is talking about, we need to run the ssh key thing | 17:12 |
jeblair | mordred, dmsimard: and there can be no ability for a job to opt-out of that, otherwise we have created a vulnerability | 17:13 |
dmsimard | jeblair: ok let's take a step back | 17:13 |
jeblair | mordred, dmsimard: (to be clear, i'm in favor of mordred's reduced base job, as long as it still contains the ssh key roles running on all hosts) | 17:14 |
dmsimard | jeblair: instead of telling you what I think I want, let me tell you what's my problem and let's see if we're on the same wavelength | 17:14 |
dmsimard | jeblair: I'm trying to iterate on this: https://review.openstack.org/#/c/501281/ | 17:15 |
dmsimard | jeblair: my problem: how do I test that this doesn't break the base playbook on different distros ? | 17:16 |
pabelanger | Shrews: so if we wanted to bring nl02.o.o online, is it better to run both nl01 and nl02 at the same time or stop nl01.o.o and start nl02.o.o. Do you have a preference? | 17:16 |
Shrews | pabelanger: i can't think of any reason why you'd need to stop nl01 | 17:17 |
jeblair | dmsimard: thanks. having that example helps. | 17:17 |
dmsimard | jeblair: what I think I need: create a 2 node job, 'controller' and 'node', run the *real* base playbook on the controller, and get 'controller' to run ansible-playbook pre/post/etcbase.yaml on 'node' | 17:17 |
pabelanger | Shrews: eventually we want to stop / delete / rebuild nl01, since it is trusty | 17:17 |
dmsimard | but if the *real* base playbooks run on 'node', then it kind of sucks because I'm running on top of what already ran. | 17:18 |
clarkb | mordred: is log collecting in v3 expected to left align everything? http://logs.openstack.org/45/501345/1/check/tox-py35/4f47527/job-output.txt.gz#_2017-09-06_16_36_23_660139 | 17:18 |
pabelanger | I think we might want to start bringing online another zuulv3 merger, ze01.o.o is currently processing a large nova change | 17:19 |
jeblair | dmsimard: yeah, so i think mordred's suggestion of the reduced base job which only does minimal things (ssh keys, zuul_stream, logs) is the way to go; ssh keys and zuul_stream are the only things that will run on the remote node, and ssh keys are the only thing that might have an operating system interaction | 17:19 |
mordred | clarkb: nope. that's a bug | 17:19 |
*** jkilpatr has joined #zuul | 17:19 | |
jeblair | pabelanger: ze01 -- ze04 exist; you can make sure the others are up to date and bring them online | 17:19 |
jeblair | dmsimard: however | 17:20 |
pabelanger | jeblair: ah, right. I'll check that now | 17:20 |
jeblair | dmsimard: note that once you have the reduced base job, you don't actually need to implement this as a multinode job, you can run the additional roles on the main node | 17:20 |
mordred | clarkb: although that's content from inside of the output from tox from testr from zuul's tests - so I don't believe we're processing that exception text zuul side | 17:20 |
dmsimard | jeblair: I guess | 17:21 |
clarkb | jeblair: mordred: looking at test failures for 501345 I think that the fix has basically caught test fixtures that were/are broken and now we dequeue and don't report but assert we should report | 17:22 |
jeblair | dmsimard: i think the key thing here is the ssh keys -- regardless of the technical capabilities of the system, we must as a matter of policy in openstack-infra, at the very least run the ssh keys role on every node. | 17:22 |
mordred | jeblair: can you though? you won't be able to get zuul to put the proposed versions of the project-config roles in place on the executor | 17:22 |
mordred | jeblair: or I guess I'm wrong - the job playbook that declares role will get those ... so yah | 17:22 |
jeblair | mordred: ya that second thing, so we can have a test job in zuul-jobs that exercises the roles | 17:23 |
clarkb | mordred: unless that is a new behavior in testr I'm pretty sure it won't left align like that | 17:23 |
mordred | jeblair: yah- main thing will be that we won't be able to get zuul to run *playbooks* from project-config | 17:23 |
pabelanger | jeblair: mordred: clarkb: fungi: I'm going to start ze02.o.o now, any objections? It is up to date | 17:23 |
jeblair | clarkb: ah thanks, looks like i have a bit of cleanup to do | 17:23 |
mordred | jeblair: but since those playbooks are simple anyway, that shouldn't be a problem | 17:23 |
dmsimard | mordred: could you not do required-projects: project-config and then have a playbook that includes project-config/playbooks/something.yaml ? | 17:24 |
mordred | clarkb: yah - I'll look at the zuul_stream stack and see if I can reproduce | 17:24 |
mordred | dmsimard: nope | 17:24 |
dmsimard | mordred: or ansible-playbook project-config/playbooks/something.yaml | 17:24 |
dmsimard | why ? | 17:24 |
mordred | dmsimard: you can't execute commands on the executor | 17:24 |
jeblair | (that sounds ironic) | 17:24 |
mordred | dmsimard: the only way for playbooks to be executed from the executor is for zuul to execute them | 17:24 |
clarkb | mordred: if you don't mind I'd like to poke at that for a bit first just to gain more familiarity with the streaming setup | 17:25 |
mordred | so we can put a playbook in zuul-jobs that runs the same roles as the base job - and _that_ playbook is one that zuul can execute | 17:25 |
mordred | clarkb: awesome! I have a helper tool in tree to help get set up with local testing, fwiw | 17:25 |
dmsimard | mordred: I'm confused, let me put up a gist to express what I'm trying to say | 17:26 |
clarkb | mordred: ya I see it test() :) | 17:26 |
mordred | clarkb: https://review.openstack.org/#/c/500161/ | 17:26 |
mordred | dmsimard: ++ | 17:26 |
fungi | pabelanger: starting ze02 sounds good to me | 17:27 |
pabelanger | okay, ze02.o.o started | 17:30 |
dmsimard | mordred: https://gist.github.com/dmsimard/1fc6b22a40009298713c7432d9368a37 | 17:34 |
fungi | pabelanger: is your comment in 500990 implying that you have another patchset coming to address clarkb's concern, or a followup change? | 17:34 |
dmsimard | mordred: I guess in this example, we'd also need to set up the ansible roles path to seek from the checked out zuul-jobs | 17:35 |
pabelanger | fungi: yes, that is what I am writing now. I hope to push up the publish-openstack-python-docs changes in the next hour | 17:35 |
dmsimard | mordred: edited the gist to add the roles_path | 17:36 |
dmsimard | jeblair: https://gist.github.com/dmsimard/1fc6b22a40009298713c7432d9368a37 ? | 17:36 |
dmsimard | er, that ansible.cfg would not be effective | 17:38 |
dmsimard | unless we'd run from a multi-node setup and run ansible manually from a controller to a node | 17:39 |
fungi | pabelanger: awesome, but my question was about whether it's a new patchset for that change, or a new change entirely | 17:39 |
jeblair | dmsimard, mordred: theoretically, i think the include approach could work -- you could probably use include to get zuul to execute the un-merged project-config code (if you merged that job which did that) | 17:39 |
jeblair | dmsimard, mordred: however, that's the reason we should not merge such a change, as it allows arbitrary code execution on the executor | 17:39 |
pabelanger | fungi: sorry, it will be a follow up because we need a new role in openstack-zuul-jobs. Basically, we can delete publish-openstack-python-docs job right now, if we want to avoid projects using it, currently that is shade | 17:40 |
jeblair | dmsimard: (ftr that path would actually be "{{ zuul.executor.src_dir }}/git.openstack.org/project-config/..." but that's a minor detail) | 17:41 |
dmsimard | jeblair: ah that's the one I was looking for actually, I couldn't find it | 17:42 |
jeblair | dmsimard: put another way: that change would let you tell the executor to run un-vetted code. it would also let people pwn the executor. so we can't merge it. | 17:42 |
fungi | pabelanger: okay, i'll put 500990 on the back burner for a bit and review it in the context of your coming changes | 17:42 |
dmsimard | jeblair: but $enduser from $project could merge something like that | 17:43 |
jeblair | dmsimard: no, that's a base job, and they can only go into project-config | 17:43 |
dmsimard | jeblair: do trusted jobs run outside the bubblewrap ? | 17:43 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Switch to publish-openstack-python-docs-infra https://review.openstack.org/501362 | 17:44 |
jeblair | dmsimard: no, they have their own bubblewrap (with potentially more access) | 17:44 |
pabelanger | fungi: so, we'd need to land ^, then we can remove publish-openstack-python-docs until new code is ready | 17:44 |
dmsimard | jeblair: I guess that's part of what I missed, okay. | 17:44 |
jlk | mordred: et al: There was code added to the github driver to handle a ping event, if it's from a repo we aren't configured to listen to. We're apparently being nice and responding back to github with a 404. If I move over to an ingest, queue, process model, we wouldn't be able to immediately (or at all really) return that 404. How important is this nicety of the 404? | 17:47 |
*** jkilpatr has quit IRC | 17:47 | |
pabelanger | fungi: 501363 removes it for now, until I push up new role | 17:47 |
jeblair | jlk: i guess the 404 just says to anyone looking at the github webhook logs that zuul is ignoring it? | 17:48 |
clarkb | mordred: where does the hostname come from in the log? I see we do timestamp | log_line but no hostname | 17:48 |
jlk | jeblair: yeah | 17:48 |
pabelanger | fungi: but, I think we should start testing 500990 sooner to confirm it works as expected | 17:48 |
*** jkilpatr has joined #zuul | 17:48 | |
jeblair | jlk: i feel like 200 is okay. like "message received!" the fact that it was subsequently ignored is a detail that a zuul admin can inspect. | 17:48 |
jlk | jeblair: it's odd that we do it specifically for the ping event, which is when somebody installs a webhook . | 17:49 |
jlk | different than an app install I think. | 17:49 |
mordred | jeblair, dmsimard: I don't think it would open the door to arbitrary code execution - it would just not run because the execution context is still untrusted | 17:49 |
jlk | I'll drop a TODO in here to validate that the project we got an event for is a project we care about. | 17:49 |
jlk | because we've talked about doing that anyway across the board, not just on ping events. | 17:49 |
mordred | clarkb: http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/callback/zuul_stream.py?h=feature/zuulv3#n277 | 17:50 |
jeblair | mordred: in dmsimard's gist, there is a playbook defined in a project-config job, so it runs in the trusted context. the content of that playbook is to ansible-include other playbooks from the checkout of unmerged code on the executor. that means that a change to an untrusted-project which used that base job and depended on an un-merged change to project-config would run the un-merged project-config content in the trusted context. | 17:52 |
mordred | jlk: I think it's fine to do 200 | 17:52 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Ignore errors from ara generate https://review.openstack.org/500645 | 17:52 |
clarkb | mordred: thanks, I also see the bug now. But have more questions about the relationship between zuul/ansible/library/ and zuul/ansible/callback | 17:53 |
mordred | jeblair: oh - right - sorry - in my brain that playbook was a playbook in zuul-jobs | 17:53 |
mordred | clarkb: sweet! questions are good | 17:53 |
clarkb | mordred: so I was mostly looking in library. and command.py there writes to /tmp/uuid.log and zuul_console.py in library reads that and streams it on 19885 | 17:54 |
clarkb | mordred: so I guess the question is where does the callback fit in if we are already writing to the file and streaming it | 17:55 |
mordred | jeblair, dmsimard: in any case, I agree that that's not how we should do this - I think the limited base job in project-config that just does ssh keys, stream and log collection, then a job in zuul-jobs or anywhere else can use that safely | 17:55 |
dmsimard | mordred: yeah I'm sending a patch for exactly that in a moment | 17:55 |
mordred | clarkb: yah - so - command.py in library is ACTUALLY the thing you want to be thinking about from library | 17:55 |
mordred | clarkb: that runs on the remote node when command or shell tasks are run | 17:56 |
mordred | clarkb: and yes, it logs to a local file | 17:56 |
jeblair | dmsimard: note (this is based on your gist) that the base job doesn't need to be node-specific. you can specify a default, or even omit the nodes section entirely from it. either way, the job or jobs you use to iterate on this in the zuul-jobs repo can specify a nodes section with the labels you want | 17:56 |
mordred | clarkb: at the top of the pre-playbook in the base job we run zuul_console which forks off a daemon that reads the files that the command tasks write to disk | 17:56 |
mordred | clarkb: that daemon also listens on port 19885 for incoming connections | 17:57 |
clarkb | right and that is in library/ as well | 17:57 |
mordred | clarkb: zuul_stream in callbacks runs as part of the ansible-playbook process on the executor | 17:58 |
mordred | clarkb: one instance of it is created per ansible-playbook invocation, and ansible-playbook calls its methods as things happen on the executor | 17:58 |
dmsimard | jeblair: I was actually wondering -- what I'm doing amounts to integration test the base playbooks against all distros. Should I do one job with 5 nodes? One of each distro? | 17:58 |
clarkb | mordred: and the callbacks are the aggregation point? | 17:59 |
mordred | clarkb: yes | 18:00 |
mordred | clarkb: as tasks start and stop the callbacks get methods called - and then in zuul_stream if we notice that it's a command or shell task, we spin up a thread to connect to the port of the daemon on the remote node and read the log stream from it | 18:00 |
mordred | clarkb: as we collect that content it is written to local disk on the executor in job-output.txt - which is what the finger daemon reads from when you hit it | 18:01 |
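The streaming path mordred describes — zuul_stream spinning up a per-task thread that connects to the console daemon on the node and appends what it reads to job-output.txt on the executor — can be sketched roughly like this. This is a simplified assumption of the mechanism, not Zuul's actual protocol: the port number comes from the discussion, but the handshake and file naming are invented for illustration.

```python
import socket
import threading

ZUUL_CONSOLE_PORT = 19885  # port the zuul_console daemon listens on (from the discussion)

def stream_task_log(host, log_id, output_path, port=ZUUL_CONSOLE_PORT):
    """Connect to the console daemon on a remote node and append its
    log stream to the local job-output.txt (simplified sketch)."""
    sock = socket.create_connection((host, port))
    try:
        # Hypothetical handshake: ask the daemon for a specific task's log.
        sock.sendall(f"{log_id}\n".encode("utf-8"))
        with open(output_path, "ab") as out:
            while True:
                chunk = sock.recv(4096)
                if not chunk:  # daemon closed the connection; task is done
                    break
                out.write(chunk)
    finally:
        sock.close()

def start_streamer(host, log_id, output_path):
    # zuul_stream starts one such thread per command/shell task it notices.
    t = threading.Thread(target=stream_task_log,
                         args=(host, log_id, output_path), daemon=True)
    t.start()
    return t
```

The finger daemon then only ever has to read the single aggregated job-output.txt on the executor, never talk to the nodes directly.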
jeblair | dmsimard: you can; though 5 jobs each on one node is both easier for nodepool to supply and easier for humans to parse test results. | 18:02 |
*** harlowja has quit IRC | 18:02 | |
dmsimard | jeblair: in zuul could I set up a job-template and then expand the template with the node types ? | 18:03 |
dmsimard | I see in the docs there's a notion of job template but it's not very fleshed out | 18:03 |
jeblair | dmsimard: there's no job-template in zuul v3. | 18:03 |
dmsimard | jeblair: ah, er, I mistook for project template I guess | 18:03 |
jeblair | dmsimard: instead, make a job definition, then make 5 jobs that inherit from it, each with a different node type | 18:04 |
dmsimard | jeblair: | 18:04 |
dmsimard | jeblair: that's what I was doing but it felt a bit verbose | 18:04 |
jeblair | dmsimard: something like https://etherpad.openstack.org/p/Gdqs1NlMIQ | 18:06 |
jeblair | dmsimard: also, we should probably put these jobs in openstack-zuul-jobs rather than the zuul-jobs repo | 18:06 |
jeblair | dmsimard: the actual labels we're testing against are a little openstack-specific | 18:06 |
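A layout along the lines jeblair sketches — one parent job plus per-distro children, each overriding only the node label — might look like this (job and label names here are illustrative, not taken from the etherpad):

```yaml
# Hypothetical zuul.yaml fragment: one base job, single-node children per distro.
- job:
    name: base-integration
    run: playbooks/integration.yaml

- job:
    name: base-integration-centos-7
    parent: base-integration
    nodes:
      - name: primary
        label: centos-7

- job:
    name: base-integration-ubuntu-xenial
    parent: base-integration
    nodes:
      - name: primary
        label: ubuntu-xenial
```

One single-node job per distro is easier for nodepool to supply than one five-node job, and a failure report names the distro directly.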
clarkb | mordred: and is _log_message() there to record non command/shell logs? | 18:07 |
clarkb | mordred: there is both _log and _log_message in the callback and trying to figure out why we need both | 18:07 |
jeblair | dmsimard: (but if it's faster to iterate against zuul-jobs for now, we can do that, and just avoid landing the changes for the moment) | 18:08 |
pabelanger | ze02.o.o looks to be working | 18:12 |
dmsimard | jeblair: I didn't even know that openstack-zuul-jobs was a thing | 18:12 |
pabelanger | mordred: https://review.openstack.org/500201/ so do we want openstack-doc-builds for shade and zuul? Or will it be tox-docs ? | 18:14 |
clarkb | mordred: actually in v2_runner_on_skipped we use both | 18:15 |
mordred | clarkb: _log is lower level - _log_message is a convenience wrapper | 18:18 |
dmsimard | Do we need to use the new depends-on syntax for v3 ? | 18:20 |
dmsimard | Or can we still just use the gerrit changeid ? | 18:20 |
dmsimard | docs seem to suggest it's just gerrit changeid but I recall a certain thread mentioning it would change (to support gerrit and github side by side for example?) | 18:23 |
jlk | I know github only supports the new method | 18:27 |
jlk | I think gerrit supports both? | 18:28 |
dmsimard | ah so it'd be driver/backend specific | 18:28 |
* dmsimard digs in code | 18:28 | |
jlk | ah crap. Something broke local logging | 18:28 |
jlk | zuul-scheduler_1 | Error grabbing logs: invalid character '\x00' looking for beginning of value | 18:28 |
jlk | and I'm not getting things logged to console | 18:29 |
dmsimard | jlk: yeah you're right it's driver specific | 18:29 |
mordred | clarkb: also - fwiw, the entire zuul_stream file needs to be refactored - but have been putting that off | 18:31 |
mordred | dmsimard: we also have not yet implemented cross-driver depends-on - that's a post-ptg thing | 18:32 |
mordred | dmsimard: so you can't (yet) depends-on a github change from a gerrit change or vice-versa | 18:32 |
mordred | dmsimard: we _definitely_ want to add that though | 18:32 |
dmsimard | jlk: I think jeblair had a patch to fix some junk whitespace issue | 18:32 |
dmsimard | jlk: https://github.com/openstack-infra/zuul-jobs/commit/a35c2ad35ed4aa5be85194d8bcf419bb0025272f | 18:32 |
dmsimard | not sure if it's related | 18:32 |
jlk | not related, this is well before job running | 18:33 |
dmsimard | mordred: right, I was wondering if inside gerrit we could keep using depends-on: <changeid> which seems to be the case so that's okay | 18:33 |
jlk | mordred: we should add, if we haven't already, the ability for gerrit to USE the new syntax, even if it just refers to itself. | 18:33 |
jlk | so that the same syntax between github and gerrit can be used | 18:34 |
dmsimard | mordred: btw, the minimal playbook thing: https://review.openstack.org/#/c/501368/ | 18:35 |
mordred | jlk: I believe we have | 18:36 |
fungi | clarkb: does pabelanger's followup change address your comment on 500990? | 18:37 |
dmsimard | jlk: ah, yes, I guess that's what I meant -- if it was necessary for gerrit to use the new syntax against itself | 18:38 |
fungi | dmsimard: the proposal i recall was that for the gerrit trigger we would support change-id format as a means of backward-compatibility, but deprecate it and encourage everyone to switch to the new url-based format | 18:42 |
jlk | ansibot is about to be talked about in the ansible contributor thing | 18:44 |
jeblair | fungi, dmsimard, jlk: yes; we haven't had a chance to pull the gerrit syntax forward yet; that'll come with cross-source depends | 18:45 |
jeblair | mordred, jlk: no i don't think gerrit supports the new syntax yet | 18:45 |
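For reference, the two Depends-On footer styles being discussed look like this in a commit message (the change-id and PR number below are made up):

```
# Gerrit change-id form (what the Gerrit driver accepts today):
Depends-On: I0123456789abcdef0123456789abcdef01234567

# URL-based form (what GitHub requires; planned for Gerrit alongside
# cross-source dependencies):
Depends-On: https://github.com/example/project/pull/42
```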
dmsimard | jlk: there's no livestream/hangouts/whatever we can stalk in I guess ? | 18:45 |
jeblair | dmsimard: it's on the etherpad: https://public.etherpad-mozilla.org/p/ansible-summit-september-2017-core | 18:46 |
dmsimard | oohhh | 18:46 |
jlk | dmsimard: yup, bluejeans, IRC | 18:47 |
fungi | jlk: i _so_ hope ansibot is a bot for making ansi-escape-based art and animations | 18:47 |
dmsimard | fungi: it's what helps the ansible maintainers keep their sanity with the github workflow of issues and pull requests :) | 18:48 |
pabelanger | jeblair: mordred: fungi: do we have syntax for zuul client to enqueue-ref on a periodic pipeline? | 18:48 |
jeblair | pabelanger: "zuul --help"? | 18:49 |
fungi | pabelanger: i want to say last time i looked, enqueue-ref didn't work with periodic? or maybe i just haven't tried recently | 18:49 |
fungi | especially since there is no ref for periodic jobs | 18:49 |
jeblair | there is in zuul v3 | 18:49 |
fungi | ooh | 18:49 |
jeblair | so try it and if it doesn't work fix it :) | 18:49 |
pabelanger | k, will see if I can figure it out | 18:50 |
jeblair | fungi: thus ending the requirement that periodic jobs bake in their branch; you can just use a regular gate job in periodic and it gets a branch just like any other | 18:50 |
fungi | i missed that innovation | 18:51 |
fungi | that'll be quite handy | 18:51 |
jeblair | you may have been on vacation :) | 18:51 |
fungi | i may have. i seem to do that a lot | 18:51 |
pabelanger | okay, I think https://review.openstack.org/500626/ is ready for final review, if everybody is okay, I can +A up to 500626 | 18:52 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure https://review.openstack.org/501345 | 19:01 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline https://review.openstack.org/501353 | 19:01 |
jlk | Well, I have a thread started for reading things from the queue, and it read at least one from the queue. Not sure why it's not reading more from the queue. | 19:03 |
jeblair | pabelanger: wfm | 19:04 |
jeblair | Shrews: i'll look at cloner after lunch | 19:04 |
pabelanger | jeblair: mordred: Shrews: any objections on bringing online nl02.o.o? Currently waiting on some code reviews, so can shift to standing up infra stuff | 19:05 |
jeblair | pabelanger: go for it | 19:05 |
mordred | pabelanger: do it | 19:05 |
pabelanger | k | 19:05 |
Shrews | dooo eeet | 19:12 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 19:28 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 19:29 |
dmsimard | bah, doing a depends-on a review that hasn't merged in project-config doesn't work :) | 19:33 |
dmsimard | (in v3) | 19:33 |
dmsimard | which I guess is okay, once again security wins | 19:33 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing https://review.openstack.org/501390 | 19:33 |
jlk | mordred: jeblair: ^^ that introduces the eat and queue model for github events. Sending events is significantly faster to get a 200 back | 19:35 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs https://review.openstack.org/501394 | 19:36 |
clarkb | mordred: ^ something like that I think | 19:36 |
jlk | mordred: jeblair: so from cold start, new method is 1.2 total seconds to get to a 200, and then after that it's .4 on repeated events. Old method was 4.35 on cold, and then 1.2 to 2.9 to whatever on repeat events. | 19:39 |
* jlk lunches | 19:40 | |
*** olaph has quit IRC | 19:50 | |
*** olaph has joined #zuul | 19:51 | |
mordred | clarkb: reading | 19:53 |
mordred | jlk: reading | 19:53 |
* mordred reads in parallel | 19:53 | |
*** olaph1 has joined #zuul | 19:55 | |
*** olaph has quit IRC | 19:56 | |
mordred | clarkb: one comment - otherwise looks great | 19:58 |
*** olaph1 is now known as olaph | 20:00 | |
mordred | clarkb: unfortunately ansible does not log issues in the callback plugins particularly well | 20:00 |
mordred | clarkb: I just had an idea of a thing we can do about that in this context though... | 20:01 |
pabelanger | okay, I am stopping nl01 | 20:02 |
clarkb | mordred: the test failure appears to be in syncing the job-output.json file though. Not sure it's related to my change | 20:02 |
clarkb | broken pipe (32) from rsync | 20:02 |
pabelanger | and nl02.o.o started | 20:03 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs https://review.openstack.org/501394 | 20:04 |
pabelanger | cool, nl02.o.o is responding to requests | 20:04 |
jeblair | Shrews: clone mapper looks good to me. we may want to shift that over and merge it into zuul-jobs instead of zuul. stick that in as a templated file and make a role that copies it over top of the zuul-cloner that's baked into our images. | 20:05 |
jeblair | mordred, clarkb: ^ one of you want to look that over (https://review.openstack.org/500922) | 20:05 |
pabelanger | Hmm, I've just noticed nl02.o.o doesn't have swap setup | 20:05 |
jeblair | jlk: cool! though zuul seems to be expressing some displeasure with the unit tests with your patch. | 20:07 |
jlk | ruh roh | 20:07 |
mordred | jeblair, jlk: is queue.Queue inherently threadsafe? | 20:07 |
jlk | dunno, it's what Gerrit driver uses | 20:07 |
clarkb | mordred: ya | 20:07 |
mordred | cool | 20:07 |
mordred | it was just putting and getting from a multi thread context without explicit locks, so I figured I'd ask :) | 20:08 |
clarkb | that said if using asyncio I think there are special queue objects for it (that are not thread safe because no proper threads) | 20:08 |
mordred | clarkb: yah - that'll be a whole other thing for later | 20:08 |
jeblair | mordred: yes | 20:10 |
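As jeblair and clarkb confirm, `queue.Queue` is thread-safe: its `put`/`get` are internally locked, which is why the driver can share one between the webhook thread and the consumer thread without explicit locks. A minimal illustration:

```python
import queue
import threading

events = queue.Queue()  # internally synchronized; safe across threads

def producer(n):
    for i in range(n):
        events.put(i)

def consumer(n, out):
    for _ in range(n):
        out.append(events.get())  # blocks until an item is available
        events.task_done()

results = []
threads = [threading.Thread(target=producer, args=(100,)),
           threading.Thread(target=consumer, args=(100, results))]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Every event arrives exactly once, with no explicit locking.
```

Note clarkb's caveat: `asyncio.Queue` is a different animal — it coordinates coroutines within one event loop and is not safe to share between OS threads.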
clarkb | jeblair: looking at 501345, curious how there were tests that failed outside of that change, but you got it to pass only be modifying tests in that change | 20:10 |
clarkb | jeblair: are the test's side effecting each other? | 20:11 |
jeblair | clarkb: no the addition of check: jobs: [] fixed those | 20:11 |
mordred | clarkb: although my first hunch is that the zuul-web webhook will do what this one is doing except instead of self.connection.addEvent() it'll do "gearman.submitJob('addGithubEvent', background=True)" or something | 20:11 |
jeblair | clarkb: basically, the fix tightened up when we report on things not in pipelines; some of those tests then needed their project to be in a pipeline. so i added them to the check pipeline with no jobs. | 20:11 |
jeblair | that's a thing in zuul v3, more or less exactly for this. :) | 20:12 |
pabelanger | jeblair: clarkb: fungi: mordred: okay, nl02.o.o is running, but without swap. I am thinking of leaving it for now, rebuild nl01.o.o under xenial, validate swap is working, swap back to nl01.o.o then fix nl02.o.o. any objections? | 20:12 |
pabelanger | otherwise, I can roll back to nl01.o.o first, and fix swap on nl02.o.o | 20:12 |
jeblair | pabelanger: wfm | 20:12 |
clarkb | jeblair: oh I see other tests not touched in that change are also using the in-repo fixture | 20:13 |
jeblair | mordred: agreed; gearman should be the queue in next iteration | 20:13 |
mordred | jeblair: yah | 20:13 |
jeblair | clarkb: yep | 20:13 |
fungi | pabelanger: sounds fine. nl02 doesn't seem to be under any memory pressure | 20:13 |
jeblair | mordred: want to re +2/+3 501345 and child? | 20:14 |
mordred | jeblair: I do! | 20:14 |
jeblair | mordred, pabelanger: we have not restarted since the command.py (utf8 log streaming) patch landed, correct? | 20:15 |
clarkb | I've got another log streaming change https://review.openstack.org/501394 that would be nice to get in for better formatted logs (though utf8 fix is definitely higher priority) | 20:16 |
jeblair | clarkb: ack | 20:16 |
* fungi reviews | 20:16 | |
pabelanger | jeblair: I am not sure | 20:16 |
mordred | jeblair: I restarted zuul first thing this morning iirc - but not since then | 20:17 |
jeblair | mordred: do you know if the stream change had landed? i think you approved it first thing this morning as well, so unsure which first thing was first :) | 20:18 |
fungi | clarkb: did you mean rstrip where you used rsplit? | 20:18 |
clarkb | fungi: yes I most certainly did | 20:18 |
* clarkb fixes | 20:18 | |
* fungi suddenly feels useful | 20:18 | |
jeblair | there *is* an rsplit | 20:18 |
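The distinction behind 501394 (and behind the rstrip/rsplit slip fungi caught): `strip()` would eat the leading indentation that makes multi-line command output readable, so only the trailing whitespace should go, and `rsplit()` does something else entirely. Using a line like the one clarkb quotes later:

```python
line = "    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n"

# strip() loses the indentation along with the trailing newline:
print(repr(line.strip()))
# 'link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00'

# rstrip() keeps the leading whitespace and drops only the trailing bits:
print(repr(line.rstrip()))
# '    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00'

# rsplit(), the accidental substitution, splits on whitespace from the right:
print(line.rsplit())
# ['link/loopback', '00:00:00:00:00:00', 'brd', '00:00:00:00:00:00']
```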
mordred | jeblair: I do not remember | 20:19 |
jeblair | mordred: okay, let's just land clarkb's thing and restart anyway | 20:19 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs https://review.openstack.org/501394 | 20:19 |
mordred | jeblair: ++ | 20:19 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure https://review.openstack.org/501345 | 20:23 |
jlk | hrm I think something isn't closing the thread. | 20:26 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline https://review.openstack.org/501353 | 20:28 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add zuul-cloner shim https://review.openstack.org/500922 | 20:30 |
pabelanger | fungi: clarkb: do you mind reviewing https://review.openstack.org/500990/ again, it has related patches needed for zuul and publishing afs docs | 20:30 |
pabelanger | that will stop overwriting http://docs.openstack.org/infra/zuul/ with zuulv3 docs | 20:31 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6 https://review.openstack.org/501266 | 20:32 |
Shrews | jeblair: ack. will do the env var thing, then shift it to zuul-jobs | 20:32 |
clarkb | pabelanger: done | 20:32 |
mordred | clarkb, pabelanger: https://review.openstack.org/#/c/501381/ while we're at it | 20:33 |
jeblair | jlk: ah i think i see the issue | 20:34 |
jlk | oh good! I haven't found it yet | 20:34 |
jeblair | jlk: tests/base.py line 2137 | 20:35 |
jeblair | jlk: that makes sure that the tests wait for the gerrit connector event queue to empty before deciding that the system is stable (in waitUntilSettled) | 20:35 |
dmsimard | ah, eh, ew ? | 20:35 |
jeblair | jlk: we need to add the github event queue to that list as wel, a few lines down. | 20:35 |
dmsimard | ansible_distribution returns "openSUSE Leap" with an actual space in it | 20:35 |
jeblair | Shrews: or maybe if it ends up being a templated script, you could just template that in instead of env-varring | 20:36 |
* dmsimard uses os_family for suse | 20:36 | |
jlk | I think that line number is off? | 20:36 |
jeblair | jlk: maybe; it's in def getGerritConnection(driver, name, config): | 20:36 |
jeblair | jlk: self.event_queues.append(con.event_queue) | 20:37 |
jlk | oooh I see | 20:37 |
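The pattern jeblair is pointing at — the test harness keeps a list of every connection's event queue and won't consider the system settled until all of them have drained — can be sketched generically like this. This is an illustrative reconstruction, not Zuul's actual `tests/base.py` code:

```python
import queue
import time

class Harness:
    def __init__(self):
        # Every driver's event queue gets registered here; forgetting one
        # (e.g. the GitHub connection's) makes waitUntilSettled return early.
        self.event_queues = []

    def register_connection(self, connection):
        # Equivalent of tests/base.py appending con.event_queue for Gerrit.
        self.event_queues.append(connection.event_queue)

    def wait_until_settled(self, timeout=10.0):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if all(q.empty() for q in self.event_queues):
                return True
            time.sleep(0.01)
        return False
```

With the GitHub queue missing from the list, tests race ahead while events are still in flight — which matches the hang jlk was chasing.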
jeblair | i should really make an emacs macro to open the current line in cgit | 20:37 |
clarkb | dmsimard: and I think for tumbleweed it may be just "tumbleweed" ? family sounds like a good idea | 20:38 |
dmsimard | clarkb: facts: http://logs.openstack.org/67/499467/8/check/gate-tempest-dsvm-neutron-full-opensuse-423-nv/e9b97c3/logs/ara/host/0962bf05-4d54-41b1-8c1d-49bc318e9f33/ | 20:38 |
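The workaround dmsimard lands on — since `ansible_distribution` comes back as "openSUSE Leap" (with the space) while the family fact is a single clean token — is to branch on `ansible_os_family` instead. A hedged sketch (the task content and package name are illustrative; "Suse" is the family value Ansible reports for openSUSE variants):

```yaml
# Illustrative task: match on os_family rather than the
# space-containing distribution name.
- name: Install mirror configuration (SUSE)
  zypper:
    name: some-package   # hypothetical package
  when: ansible_os_family == "Suse"
```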
jeblair | clarkb: mordred: stream change https://review.openstack.org/501394 failed | 20:41 |
clarkb | hrm this time it actually failed and wasn't post sync failing | 20:42 |
clarkb | aha I see why | 20:43 |
jeblair | clarkb: change worked, test needs updating | 20:43 |
clarkb | 2017-09-06 20:33:09.738370 | node1 | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 <- yup | 20:44 |
jlk | oh haha, I have to remove the ping event handler too | 20:45 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs https://review.openstack.org/501394 | 20:46 |
mordred | clarkb: nice | 20:46 |
clarkb | I skimmed the other greps too and ^ appeared to be the only two that needed indentation | 20:46 |
mordred | clarkb: also - thanks for fixing that - my eyes hadn't realized what was going on - the update looks great | 20:46 |
jeblair | i went ahead and +3d it; if folks see issues in http://logs.openstack.org/94/501394/3/check/zuul-stream-functional/f889e56/stream-files/stream-job-output.txt we can block it before it merges | 20:47 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 20:49 |
pabelanger | clarkb: dmsimard: do you happen to have an idea how to automate that process? I'm happy to write the code, but I ended up just modifying make_swap.sh manually to create it on xenial server. Is this an issue for devstack-gate, if so, maybe we just update launch-node to use that role now | 20:51 |
clarkb | pabelanger: make_swap.sh in system-config is independent of devstack-gate completely iirc | 20:51 |
pabelanger | clarkb: dmsimard: sorry, this should have been in #openstack-infra for swap issue | 20:51 |
*** jkilpatr has quit IRC | 20:54 | |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing https://review.openstack.org/501390 | 20:59 |
jlk | mordred: jeblair: fixed! | 20:59 |
jeblair | jlk: lgtm; let's see what zuul thinks! :) | 21:00 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Actually use fetch-stestr-output in unittests base job https://review.openstack.org/501441 | 21:03 |
*** jkilpatr has joined #zuul | 21:08 | |
mordred | jeblair: have we restarted zuul yet with your dependent fix? or are we waiting for clarkb's? | 21:15 |
mordred | clarkb: btw - that failed again | 21:15 |
clarkb | mordred: wat | 21:16 |
clarkb | mordred: host key verification failed | 21:17 |
clarkb | dont think I touched that | 21:17 |
mordred | oh - that's not great | 21:17 |
mordred | you didn't | 21:17 |
fungi | mordred: jeblair: looks like puppet updated the zuul install on zuulv3.o.o 8 minutes ago... time to restart? | 21:17 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing https://review.openstack.org/501390 | 21:18 |
fungi | oh, we don't have the rstrip console change in yet | 21:18 |
mordred | clarkb: so - it did the hostkey role - http://logs.openstack.org/94/501394/4/check/zuul-stream-functional/868fc52/job-output.txt.gz#_2017-09-06_20_56_20_714355 | 21:19 |
jeblair | mordred, clarkb, fungi: it looks like my stream fix merged before the last restart, so i've resumed poking at devstack while waiting for clark's to merge | 21:20 |
mordred | jeblair: awesome | 21:21 |
mordred | clarkb: 91.106.198.111 floating ips | 21:21 |
mordred | clarkb: these nodes have http://logs.openstack.org/94/501394/4/check/zuul-stream-functional/868fc52/zuul-info/inventory.yaml | 21:21 |
mordred | floating ips (see interface_ip in the inventory) | 21:21 |
fungi | jeblair: yeah, the stream encoding fix is installed on the server already, i did see it on there | 21:22 |
mordred | but multi-node-known-hosts doesn't seem to be adding those | 21:22 |
pabelanger | okay, I've rolled back to nl01.o.o, which is xenial | 21:23 |
fungi | oh weird... multinode over fip bug? | 21:23 |
mordred | well - bug in our role that sets it up so that all hte nodes can ssh to each other | 21:24 |
mordred | I see a fix - one sec | 21:24 |
jeblair | clarkb: can you help me out with a suggestion regarding https://review.openstack.org/451492 | 21:25 |
jeblair | clarkb: the devstack job is running into this: http://logs.openstack.org/02/500202/18/check/devstack/9eb2549/job-output.txt.gz#_2017-09-06_21_08_47_095758 | 21:25 |
clarkb | oh hrm | 21:26 |
*** yolanda has quit IRC | 21:26 | |
jeblair | what creates the mirror_info.sh file? | 21:26 |
*** yolanda has joined #zuul | 21:26 | |
clarkb | nodepool ready script iirc | 21:26 |
clarkb | which we dont have in v4 | 21:27 |
clarkb | *3 | 21:27 |
clarkb | so maybe we just need a pre task that drops that in? I think mordred was looking at that? | 21:27 |
jeblair | yeah, in theory we could do it in the configure_mirrors role | 21:29 |
jeblair | (cc dmsimard, pabelanger ^) | 21:29 |
pabelanger | okay, and nl02.o.o is back online too. So we are running 2 nodepool-launchers right now, are we good with that? | 21:29 |
jeblair | though it's also worth asking: is that the way we want to handle this? | 21:29 |
jeblair | clarkb: why isn't that in devstack-gate? | 21:30 |
jlk | Oooh, hopefully next restart of zuul includes the github change I just made, so we can see if it reduces the timeouts. | 21:30 |
dmsimard | We'll have to keep the file for the time being for backwards compat | 21:30 |
jeblair | clarkb: (that == the add-apt-repo) | 21:30 |
clarkb | jeblair: because devstack needs it to function | 21:30 |
dmsimard | And yes, we can likely set it up through config mirror | 21:30 |
clarkb | old libvirt just isnt reliable | 21:30 |
jeblair | clarkb: yeah, but there's a big "if running in gate" block there... | 21:31 |
jeblair | clarkb: so why not do the "if running in gate" block in devstack-gate, and then... hopefully add-apt-repo noops in devstack? :) | 21:31 |
clarkb | ya we could have devstack skip if some uca repo exists | 21:32 |
jeblair | so to make this a really high-level question: is /etc/ci/mirror_info.sh an API that we want to support for openstack | 21:33 |
jeblair | er | 21:33 |
jeblair | for jobs in openstack-infra | 21:33 |
pabelanger | Oh, /etc/ci/mirror_info.sh. Ya, we'll have to create that file today. But we could write them as facts on disk moving forward | 21:33 |
jeblair | or is there something more ansiblish/v3 we could do. | 21:33 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Handle floating ips in multi-node-known-hosts https://review.openstack.org/501459 | 21:33 |
jeblair | like what pabelanger just suggested | 21:33 |
mordred | clarkb, jeblair, pabelanger: ^^ that should fix the ssh hostkey task for our floating ip clouds | 21:33 |
pabelanger | ya, we could write them into /etc/ansible/facts, but haven't thought what that would look like | 21:34 |
jeblair | okay, so maybe we should add /etc/ci/mirror_info.sh to configure_mirrors role for now, and think about alternatives later | 21:35 |
mordred | yah - so - I think sort-term we _definitely_ have to write that file out, because a metric truckload of people consume mirror_info.sh aiui | 21:35 |
jeblair | really? | 21:35 |
mordred | yah - people use it in things like inside-of-docker-images-in-kolla | 21:35 |
clarkb | ya and dib builds and such | 21:35 |
clarkb | it's how we communicate "this is where you find things" | 21:36 |
jeblair | mordred: well 14 projects do :) | 21:36 |
clarkb | particularly useful if say building ubuntu image on centos | 21:36 |
*** olaph1 has joined #zuul | 21:36 | |
mordred | yah - at least "some" | 21:36 |
mordred | maybe not metric truckload - but maybe an imperial one | 21:36 |
jeblair | who wants to add that? | 21:36 |
jeblair | dmsimard: does it make sense for you to work that into your current effort? | 21:36 |
dmsimard | Sure | 21:37 |
*** olaph has quit IRC | 21:37 | |
fungi | it seems worth changing soonish after cut-over, and we ought to be able to find consumers of it pretty easily with git grep (or codesearch.o.o in case they're calling into it from scripts in their repos) | 21:37 |
mordred | it can almost certainly be a fairly easy cut/paste from the existing configure_mirror.sh just replacing the here-doc with a template | 21:37 |
jeblair | dmsimard: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/scripts/configure_mirror.sh#n73 | 21:37 |
dmsimard | I'm familiar with that file, yes, we hack it in review.rdo :) | 21:38 |
jeblair | dmsimard: cool thanks :) | 21:38 |
pabelanger | ya, adding to configure-mirror role +1 | 21:38 |
dmsimard | Need +3 on https://review.openstack.org/#/c/501368/ to unblock me though | 21:38 |
jeblair | i will do a hacky thing to devstack to get past that now | 21:38 |
jeblair | mordred: ^ that's you | 21:39 |
fungi | i definitely don't think a sourced shell snippet setting some relatively ad-hoc envvars is an api we want to support in the long term if we value our sanity | 21:39 |
jeblair | mordred: (the +3 on 501368) | 21:39 |
jeblair | fungi: yeah, there must be a better way. i don't know it right now, but we'll find it :) | 21:40 |
* dmsimard Raymond H. voice "there has to be a better way" | 21:40 | |
mordred | dmsimard: +A | 21:41 |
pabelanger | okay, nodepool-launcher looks happy. I'm moving back to testing afs publishing | 21:41 |
clarkb | the biggest problem has been discovering complete list regardless of platform | 21:42 |
clarkb | because there are cases where ubuntu based jobs need centos repos | 21:42 |
mordred | dmsimard, jeblair: for the configure mirrors thing - it probably ALSO needs to write out that list of files after the here doc | 21:42 |
mordred | honestly - I think for now we're likely better off actually just copying that entire file and then running it with NODEPOOL_MIRROR_HOST set properly in an env var | 21:43 |
mordred | cause all of the putting sources.list.available.d and whatnot at the end | 21:44 |
mordred | and it sets up unbound at the top | 21:44 |
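The here-doc-to-template conversion mordred mentions amounts to rendering the same shell exports from job parameters instead of generating them on the node. A minimal sketch, with heavy assumptions: `NODEPOOL_MIRROR_HOST` is from the discussion, but the other variable names and the template text are abbreviated stand-ins for the real configure_mirror.sh here-doc:

```python
import string

# Abbreviated stand-in for the /etc/ci/mirror_info.sh here-doc,
# rendered from parameters rather than hard-coded on the node.
MIRROR_INFO_TEMPLATE = string.Template("""\
export NODEPOOL_MIRROR_HOST=$mirror_host
export NODEPOOL_PYPI_MIRROR=http://$mirror_host/pypi/simple
export NODEPOOL_WHEEL_MIRROR=http://$mirror_host/wheel/$wheel_slug
""")

def render_mirror_info(mirror_host, wheel_slug):
    return MIRROR_INFO_TEMPLATE.substitute(
        mirror_host=mirror_host, wheel_slug=wheel_slug)

print(render_mirror_info("mirror.dfw.rax.openstack.org", "ubuntu-16.04-x86-64"))
```

Consumers that source /etc/ci/mirror_info.sh keep working unchanged, which is the backward-compat point dmsimard raises.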
*** harlowja has joined #zuul | 21:45 | |
pabelanger | ++ | 21:45 |
pabelanger | then we can iterate on it | 21:45 |
pabelanger | mordred: jeblair: clarkb: fungi: https://review.openstack.org/501362 would like a review, switches zuul to publish-openstack-python-docs-infra job | 21:47 |
*** olaph1 is now known as olaph | 21:51 | |
*** hashar has quit IRC | 21:54 | |
mordred | pabelanger: lgtm +A - also added follow up https://review.openstack.org/501475 which just caught my eye in the review | 21:54 |
SpamapS | is there a way to tell zuul/nodepool that a certain job should _always_ hold its nodes? | 21:55 |
mordred | SpamapS: not to my knowledge, no | 21:57 |
SpamapS | I wonder how well it would work to just have jobs that don't complete for a few days. | 21:57 |
mordred | SpamapS: there is a count argument to autohold though - so I imagine implementing support for that as a count=-1 or something similar wouldn't be terribly difficult | 21:57 |
mordred | SpamapS: it should work as well as gearman works :) | 21:57 |
mordred | SpamapS: oh - I mean, holding a node happens after the job completes though | 21:58 |
mordred | SpamapS: so the only jobs-don't-complete portion would be if you're starving your available nodes by holding all of them | 21:58 |
mordred | SpamapS: while you're here ... any chance you have a sec to look at / review https://review.openstack.org/#/c/501459/ ? | 21:59 |
mordred | SpamapS: (since you wrote that originally) | 21:59 |
mordred | clarkb: could I get an amen on https://review.openstack.org/#/c/501441 ? | 21:59 |
SpamapS | mordred: I'll look in a few. | 22:00 |
mordred | SpamapS: thanks! | 22:01 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 22:01 |
pabelanger | mordred: left -1 with comment | 22:01 |
mordred | SpamapS: (as you well know the multi-node-known-hosts code is dense) | 22:01 |
dmsimard | jeblair, mordred: you have a hint on this "unknown configuration error" error here ? https://review.openstack.org/#/c/501281/6 | 22:02 |
jeblair | dmsimard: btw, you don't need to define those nodesets, you can just inline them in the jobs under "nodes:" (nodes: is either a nodeset name, or a nodeset definition) | 22:03 |
jeblair | dmsimard: i will go spelunking in logs | 22:03 |
mordred | pabelanger: good call - I responded - but tl;dr - that says to me I'd like to refactor something, but I think we can refactor it later | 22:03 |
dmsimard | jeblair: I know about the node definition but I was actually wondering if these should be defined by default in project-config to be made less redundant | 22:04 |
dmsimard | jeblair: otherwise we end up re-declaring these same nodes over and over | 22:04 |
dmsimard | It probably looks too verbose because it's just one job but would end up saving lines for a dozen different jobs | 22:06 |
dmsimard | I don't have a strong opinion on this one. | 22:06 |
jeblair | nor do i | 22:06 |
dmsimard | I was ready to do one job with 5 nodes to keep zuul.yaml clean :D | 22:06 |
mordred | I mean - there's not much value in a nodeset called "ubuntu-trusty" that has a single node called "ubuntu-trusty" on label "ubuntu-trusty" - it seems that if you're adding a node to a job you should just be able to say "nodes: - ubuntu-trusty" | 22:06 |
mordred | dmsimard: well, you can totally do that you know - ansible will let you :) | 22:07 |
jeblair | mordred: yeah, we can make the name optional and default it to the label | 22:07 |
dmsimard | jeblair: +1 | 22:07 |
dmsimard | that would solve the problem | 22:07 |
jeblair | mostly, i just don't want to establish the idea that you have to define a nodeset | 22:07 |
jeblair | to be honest, i'd rather we be *slightly* less clever at first if it means we avoid showing people the wrong way to do things :) | 22:08 |
mordred | yah | 22:08 |
jeblair | Exception: Configuration item dictionaries must have a single key | 22:09 |
mordred | jeblair: I can't define two different variants with different nodes can I? | 22:09 |
jeblair | mordred: sure you can | 22:09 |
mordred | jeblair: cool | 22:09 |
jeblair | mordred: assuming they match different things | 22:10 |
jeblair | that's sort of the primary use case for variants | 22:10 |
mordred | oh - no - matching the same thing | 22:10 |
jeblair | "stable runs on trusty; master runs on xenial" | 22:10 |
jeblair | mordred: that's not a variant, that's a job | 22:10 |
mordred | nod | 22:10 |
jeblair | dmsimard: all those nodeset definitions need more indentation | 22:11 |
jeblair | i'll see if i can't make that into a nice error | 22:12 |
dmsimard | jeblair: I'm submitting a patchset without nodesets anyway | 22:12 |
jeblair | k | 22:12 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 22:12 |
dmsimard | ^ | 22:13 |
dmsimard | that one worked | 22:13 |
dmsimard | the jobs are scheduled | 22:13 |
dmsimard | so I guess it was junk out of the nodeset config | 22:13 |
jeblair | dmsimard: it was the indentation | 22:13 |
jeblair | maybe it wasn't clear, but that exception and my indentation suggestion were the result of checking the zuul log for the actual error | 22:14 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro https://review.openstack.org/501281 | 22:14 |
dmsimard | yeah, I think almost anything except "unknown configuration error" could be a good pointer | 22:14 |
jeblair | that's a specific enough error we can actually say "dude, hit tab" | 22:15 |
dmsimard | lol | 22:15 |
dmsimard | or space space space space | 22:15 |
dmsimard | afk for a bit, I'll hack on the base roles tonight now that it's unblocked \o/ | 22:15 |
dmsimard | oops, looks like something might be wrong with the minimal job http://logs.openstack.org/81/501281/8/check/base-integration-ubuntu-xenial/cc1ffbb/job-output.txt.gz#_2017-09-06_22_14_56_415036 | 22:16 |
dmsimard | I'll look later /me afk | 22:16 |
pabelanger | mordred: ack | 22:17 |
SpamapS | mordred: +A'd | 22:17 |
mordred | SpamapS: thanks! | 22:18 |
fungi | dmsimard: wow! look at all those job failures ;) | 22:18 |
mordred | pabelanger: oh - also - https://review.openstack.org/#/c/501246 goes along with your other one | 22:19 |
SpamapS | mordred: so the use case I have is that I want to have zuul and nodepool spin up test nodes in a number of scenarios, and one of those is more of the "developer wants test nodes deployed with the latest code to test XXX" ... | 22:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6 https://review.openstack.org/501266 | 22:20 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Actually use fetch-stestr-output in unittests base job https://review.openstack.org/501441 | 22:20 |
fungi | dmsimard: "ERROR: Executing local code is prohibited" | 22:20 |
fungi | for when you get back | 22:20 |
fungi | oh, i see you already found that error | 22:22 |
* fungi should read scrollback more carefully | 22:22 | |
pabelanger | mordred: +3 | 22:22 |
mordred | SpamapS: nod. I can understand that desire. I think we should chat about the best way to expose that | 22:22 |
mordred | SpamapS: for now, you could TOTALLY fake it by adding an autohold to a job with a count of like 99999 or something | 22:22 |
SpamapS | mordred: One way I was thinking to do it is to just have a job that doesn't complete until the user is done playing with the nodes. | 22:23 |
SpamapS | How does one return held nodes? | 22:23 |
mordred | SpamapS: yah - that's another thing you could do - you'd have to either disable or put a REALLY long timeout on it | 22:23 |
* SpamapS has, oddly enough, never done that | 22:23 | |
mordred | SpamapS: you tell nodepool to delete the node | 22:23 |
fungi | SpamapS: an administrator sets the node state to something else (generally delete) | 22:23 |
fungi | so it's not really self-service | 22:24 |
SpamapS | Hm | 22:24 |
SpamapS | I wonder if a better thing to do would be to just emulate zuul+nodepool with a manual provisioning playbook or something. | 22:24 |
mordred | SpamapS: oh - hah. I've got an idea ... | 22:24 |
SpamapS | but that gets into pushing.. blah blah | 22:24 |
SpamapS | The reason I want this is that the job we run to run would have 5 nodes | 22:25 |
mordred | SpamapS: have your job that doesn't complete until the user is done ... just create a stamp file and then wait until the file goes away | 22:25 |
SpamapS | s/run to run/want to run/ | 22:25 |
mordred | SpamapS: so that the dev can just delete the stamp file when they're done | 22:25 |
SpamapS | mordred: that's exactly what I was thinking too | 22:25 |
mordred | SpamapS: and that could be just on one node | 22:25 |
SpamapS | Yeah just have like, a 72 hour timeout on the job and test for a stamp file at the end of the job playbook. | 22:26 |
mordred | yah | 22:26 |
mordred | and the 72 hour timeout is your safety net for people forgetting about it | 22:26 |
SpamapS | exactly | 22:26 |
pabelanger | jeblair: mordred: interesting failure | 22:40 |
pabelanger | http://logs.openstack.org/62/501362/1/gate/tox-py35/040f590/job-output.txt.gz | 22:40 |
pabelanger | possible related to ipv6? | 22:41 |
pabelanger | that ran on vexxhost | 22:41 |
mordred | hrm | 22:43 |
pabelanger | oh | 22:44 |
pabelanger | 2017-09-06 22:40:47,211 DEBUG zuul.AnsibleJob: [build: 72137254d1804768873127b65f5006f3] Ansible output: b'ERROR! A worker was found in a dead state' | 22:44 |
pabelanger | that ran on ze02 | 22:44 |
pabelanger | I don't think we are running the right python there | 22:44 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs https://review.openstack.org/501394 | 22:45 |
pabelanger | mordred: jeblair: I am going to stop ze02.o.o because of dead state | 22:45 |
jamielennox | pabelanger: gah: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/fetch-testr-output/tasks/process.yaml#n8 | 22:45 |
pabelanger | jamielennox: ya, we need to fix that | 22:46 |
fungi | looks like clarkb's rstrip fix landed, so watching puppet apply logs now | 22:46 |
jamielennox | pabelanger: so does that always make sense in a base job like that or something that should (somehow) be performed by the client's tests? | 22:47 |
jamielennox | particularly in a non-openstack case, is that something that the client should perform and drop into the logs folder? | 22:47 |
pabelanger | jamielennox: might want to check with mordred. But, we should likely have ensure-testr role, or something like that | 22:49 |
jamielennox | well testr is within tox right? so that should/will be installed each time | 22:50 |
jamielennox | i guess the question is does post-processing always occur in the base jobs or is it something that (somehow) the individual repo should generate? | 22:51 |
pabelanger | right now we run testr in its own virtualenv on DIB | 22:51 |
pabelanger | but, it could be installed in test-requirements.txt | 22:51 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Handle some common yaml syntax errors https://review.openstack.org/501486 | 22:51 |
pabelanger | but, we likely want the job to ensure testr is installed and if missing, install it some place | 22:51 |
jeblair | dmsimard: https://review.openstack.org/501486 handles the syntax error you found and some more | 22:52 |
mordred | jamielennox: well - for now, the general idea is that if your job can produce subunit then we should be able to get that into other forms - or alternately you can just make whatever output you want | 22:52 |
pabelanger | mordred: how did you add python PPA to ze01.o.o? | 22:53 |
pabelanger | I thought that was in system-config | 22:53 |
fungi | i'm not sure what to make of the unit test failure on 501459 | 22:53 |
mordred | pabelanger: it's in the puppet | 22:53 |
fungi | has anyone seen that yet? | 22:53 |
mordred | jamielennox: but it's an area we need to work out amongst ourselves - cause there's a too-much case and a not-enough case | 22:53 |
pabelanger | mordred: k, I see it | 22:53 |
jeblair | pabelanger, mordred: do we need to do something to get mordred's patched python3 on ze02-ze04? | 22:54 |
jeblair | pabelanger: oh you're on that, sorry | 22:54 |
mordred | jamielennox: I call out subunit specifically because at some point in the future when we get around to it we want to be able to have the executor snoop the output stream as it's happening and if contains subunit to notice if any tests failed so we can report that tests are _going_ to fail without waiting until the end | 22:54 |
mordred | pabelanger: did I put it in a bad place? | 22:54 |
clarkb | fungi: http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_680199 | 22:54 |
jamielennox | patched python3 ? | 22:55 |
jeblair | jamielennox: lemme dig up links | 22:55 |
mordred | jamielennox: yah - there's a crashing bug in the version in xenial - we've submitted a backport upstream | 22:55 |
pabelanger | mordred: no, we just need to run apt-get upgrade for some reason | 22:55 |
clarkb | jamielennox: turns out that python rewrote a good chunk of dict which broke things in ubuntu's version of python | 22:55 |
pabelanger | on ze02.o.o | 22:55 |
mordred | https://launchpad.net/~openstack-ci-core/+archive/ubuntu/python-bpo-27945-backport | 22:56 |
jamielennox | oh god | 22:56 |
mordred | jamielennox, jeblair: ^^ | 22:56 |
clarkb | it's the second lts release of ubuntu with broken python3 :) | 22:56 |
clarkb | hard to blame ubuntu as both were bugs upstream but still painful | 22:56 |
pabelanger | mordred: ya, so we have newer versions of python that need to be installed via apt. Puppet isn't upgrading from the PPA, because we don't have python3-dev latest any place | 22:57 |
mordred | GAH | 22:57 |
pabelanger | mordred: I can manually do it, but we should puppet it | 22:57 |
mordred | SpamapS: "Chris Halse Rogers (raof) wrote 20 hours ago: Proposed package upload rejected" | 22:57 |
mordred | pabelanger: yes. we should | 22:57 |
fungi | clarkb: yeah, i found the traceback, just trying to figure out how it lost the console log file | 22:57 |
mordred | SpamapS: from https://bugs.launchpad.net/ubuntu/+source/python3.5/+bug/1711724 | 22:58 |
openstack | Launchpad bug 1711724 in python3.5 (Ubuntu Xenial) "Segfaults with dict" [High,In progress] - Assigned to Clint Byrum (clint-fewbar) | 22:58 |
pabelanger | mordred: so, if we have puppet manage that, it might break zuul-executor, since we need to uninstall python | 22:58 |
mordred | pabelanger: why do we need to uninstall python? | 22:59 |
pabelanger | mordred: apt does it | 23:00 |
pabelanger | oh wait | 23:00 |
pabelanger | mordred: ignore me | 23:00 |
clarkb | fungi: happened at http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_599560 too which is a different path | 23:01 |
clarkb | fungi: perhaps that tmpdir was removed? | 23:01 |
SpamapS | mordred: DAMNIT | 23:03 |
pabelanger | SpamapS: mordred: didn't the patch add a unit test? | 23:04 |
mordred | pabelanger: I believe they rejected the upload because of a different bug that also requested an SRU | 23:04 |
SpamapS | Yeah | 23:04 |
SpamapS | doko piggy backed on ours | 23:04 |
*** dkranz has quit IRC | 23:05 | |
SpamapS | and then failed to follow the process | 23:05 |
clarkb | https://bugs.launchpad.net/ubuntu/+source/python3.5/+bug/1682934 that one | 23:05 |
openstack | Launchpad bug 1682934 in python2.7 (Ubuntu) "python3 in /usr/local/bin can cause python3 packages to fail to install" [Undecided,Confirmed] | 23:05 |
SpamapS | (our bug also explained why a zesty upload wasn't needed) | 23:05 |
mordred | thanks doko | 23:05 |
jeblair | SpamapS: you explained artful, but not zesty? | 23:05 |
SpamapS | Oh maybe. Hrm | 23:06 |
SpamapS | looks like doko is dropping that one | 23:06 |
SpamapS | so I can just re-upload the one I previously produced | 23:06 |
clarkb | I don't see where/how those bugs were associated | 23:07 |
clarkb | other than the comment saying no have a nice day | 23:07 |
SpamapS | clarkb: they were only associated by an upload that was caught in a manual-approval queue | 23:07 |
SpamapS | doko downloaded my upload, added his fix, then re-uploaded | 23:08 |
SpamapS | which I knew.. | 23:08 |
SpamapS | and is not uncommon | 23:08 |
SpamapS | wtf.. zesty eol'd 4/13 | 23:09 |
jeblair | SpamapS: oh. maybe raof needs to update a rejectoscript. | 23:10 |
clarkb | jeblair: or use some zuul gate pipeline to properly evict children :) | 23:10 |
SpamapS | jeblair: I think that's manually typed in | 23:10 |
SpamapS | No I'm dumnb | 23:11 |
SpamapS | dumb | 23:11 |
clarkb | fungi: looking more, it's a logging handler called jobfile that wants to write to that job-output.txt location and fails to open it. Maybe a race in test setup? | 23:11 |
SpamapS | Zesty was RELEASED 4/13 | 23:11 |
clarkb | so eol is ~11/13 | 23:12 |
clarkb | er that math is wrong | 23:12 |
clarkb | 01/13 ? | 23:12 |
clarkb | I can add 9 to 4 and mod by 12 honest | 23:12 |
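[ed: clarkb's month arithmetic, spelled out — a trivial sketch, assuming the standard 9-month support window for Ubuntu non-LTS releases:]

```python
# Zesty was released in month 4 (April 2017); 9 months of support
# puts EOL in month (4 + 9) % 12 == 1, i.e. January 2018.
release_month = 4
support_months = 9
eol_month = (release_month + support_months) % 12
print(eol_month)  # 1
```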
clarkb | fungi: ya reading logs there are playbooks under that tmpdir that appear to be read fine | 23:13 |
fungi | huh. vexing | 23:14 |
jeblair | clarkb, fungi: do i need to look into something? i haven't been following. | 23:15 |
fungi | jeblair: unit test failure on 501459 looks like a race on a console log file | 23:15 |
clarkb | http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_599560 and http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_680199 | 23:16 |
fungi | not sure whether it's just a racy test or finding a bug in zuul | 23:16 |
clarkb | ok I don't think it's a race in the log anymore | 23:17 |
clarkb | we attempt to cat the job-output.txt file when a test fails | 23:17 |
clarkb | but depending on where that assertion happens it may be completely valid to not have that file on disk | 23:17 |
fungi | aha, as in perhaps too early | 23:17 |
clarkb | ya | 23:18 |
clarkb | http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_616595 appears to be a valid fail that is being caught there | 23:19 |
clarkb | I'll push up a patch to not traceback if the file doesn't exist (and log the case instead) | 23:19 |
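[ed: a minimal sketch of that kind of guard — hypothetical names, not the actual zuul test helper:]

```python
import logging
import os

log = logging.getLogger("zuul.test")


def cat_job_output(log_path):
    """Dump job-output.txt into the test log if it exists.

    A failing assertion can fire before Ansible has created the file,
    so a missing file is logged instead of raising a traceback.
    """
    path = os.path.join(log_path, "job-output.txt")
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        log.info("No job-output.txt found at %s", path)
        return None
```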
jeblair | clarkb: ++ | 23:19 |
fungi | good eye | 23:20 |
clarkb | jeblair: http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_611455 I think that is what caused it to post failure | 23:21 |
clarkb | and then there it post failures http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_615440 ? | 23:22 |
clarkb | jeblair: but I'm not sure if the nonodeerror is expected | 23:22 |
jeblair | clarkb: the nonodeerror should be fine | 23:22 |
jeblair | that's probably a periodic poll | 23:22 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Handle debug messages cleanly https://review.openstack.org/501490 | 23:23 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Switch to publish-openstack-python-docs-infra https://review.openstack.org/501362 | 23:23 |
clarkb | ok if not that then I don't see any other sadness, it is running the command then later it returns exit code 2 | 23:24 |
jeblair | clarkb: hrm; the answer should be in job-output.txt. if it got that far, it should be there | 23:24 |
clarkb | [build: 30249a4eec9e429294f5cfeff0ccfd3e] Ansible output: b"<localhost> EXEC /bin/sh -c '/usr/bin/python2 && sleep 0'" then [build: 30249a4eec9e429294f5cfeff0ccfd3e] Ansible output terminated | 23:24 |
clarkb | the python2 invocation there is weird to me, is it piping python into it to execute? | 23:27 |
clarkb | ya that appears to be ansible's localhost connection logging that it is running python2 | 23:31 |
pabelanger | woot | 23:32 |
pabelanger | http://logs.openstack.org/9f/ffee8582c2d8013b89ae5f9c82c4bec9fdd5b59f/post/publish-openstack-python-docs-infra/a389335/job-output.txt.gz | 23:32 |
clarkb | and that hello-world playbook is copying "hello world" into a file in that tmpdir so maybe I'm back to something being off with the tmpdir | 23:32 |
pabelanger | afs-docs job worked after our refactors | 23:32 |
fungi | looks like the streamer indentation fix is installed on zuulv3.o.o now if anyone's up for a restart | 23:32 |
fungi | oh, though i guess it's actually the executors we care about there? | 23:33 |
jeblair | fungi: yep, executors | 23:33 |
pabelanger | ze01 and ze02 please | 23:33 |
pabelanger | we're running both now | 23:33 |
clarkb | dest: "{{zuul.executor.log_root}}/hello-world.txt" so could just be the log dir | 23:33 |
pabelanger | we likely should update our ansible-playbook in system-config to support zuul-executors | 23:34 |
fungi | and was there a bug which is currently causing jobs not to get reenqueued when an executor is restarted? | 23:34 |
fungi | any special handling i need there? | 23:34 |
pabelanger | Ya, I think we'll have to see why aborted jobs are not getting requeued | 23:35 |
pabelanger | I can likely look into that in the morning | 23:35 |
jeblair | fungi: nah, just recheck if you care :) | 23:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add change_url to zuul dict passed into inventory https://review.openstack.org/501492 | 23:37 |
mordred | pabelanger: oh good re: afs-docs! | 23:37 |
pabelanger | mordred: Yup, I'll finish up publish-openstack-python-docs and get it up for review, but should be able to close that out for tomorrow | 23:38 |
fungi | is zuul not actually installed system-wide on the executors? | 23:38 |
fungi | oh, it's installed into a chroot, right? | 23:38 |
jeblair | fungi: it's installed in the normal manner | 23:38 |
pabelanger | zuul-executor is the service | 23:39 |
fungi | weird, pbr freeze doesn't seem to think zuul is installed | 23:40 |
mordred | fungi: it's python3 | 23:40 |
pabelanger | pip3 :) | 23:40 |
mordred | fungi: you're probably getting pbr from python2 | 23:40 |
* mordred needs to make "python3 -m pbr freeze" work ... | 23:40 | |
fungi | mordred: gah, you're correct | 23:41 |
fungi | for some reason on zuulv3.o.o that wasn't happening | 23:41 |
mordred | fungi: well, to be fair it should also not be happening on ze01 - but we have, from time to time, accidentally done pip install . instead of pip3 install . | 23:42 |
mordred | fungi: which means some things, like the pbr bin script, may have last been installed with pip2 | 23:42 |
mordred | fungi: since command line entrypoints are last-installed-wins | 23:43 |
fungi | got it | 23:44 |
fungi | well, anyway, i confirmed the fixed version is present on both ze01 and ze02 and restarted zuul-executor on them | 23:44 |
fungi | mordred: anyway, `python3 /usr/local/bin/pbr freeze` does work even if python3 -m doesn't there yet | 23:46 |
fungi | so good enough for me once you reminded me zuul's not installed under the default python any longer | 23:46 |
clarkb | jeblair: I think streams may be getting crossed between ansible builds (and maybe tests) http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_600305 notice that the build: uuid doesn't match the uuid for the work dir | 23:55 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Split the log_path creation into its own role https://review.openstack.org/501494 | 23:58 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add a role to emit an informative header for logs https://review.openstack.org/501495 | 23:58 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!