Wednesday, 2017-09-06

jeblairturning off keep00:01
*** harlowja has joined #zuul00:03
*** harlowja has quit IRC00:09
*** harlowja has joined #zuul00:14
ianwjeblair: doesn't fd.readline() in follow return str ... so under python3 the strings will always be encoded00:52
ianwnot that i think this is wrong per se, but slightly conflicts with the changelog00:53
*** harlowja has quit IRC00:57
ianwah no, you're right, it's only with universal_newlines set that Popen does that00:57
jeblairianw: er, good!  i hadn't realized that, but it sounds like it's still accidentally correct.  that's part of why i wrote the detailed changelog -- just to make sure it all makes sense to all of us.  :)01:01
*** harlowja has joined #zuul01:01
*** jkilpatr has quit IRC01:26
clarkbreadline returns str if you set universal newlines to true iirc01:34
clarkboh, depends on whether it's text mode or binary, and text mode implies universal newlines01:35
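(For reference, a minimal python3 sketch of the behaviour being discussed: Popen pipes yield bytes unless universal_newlines/text mode is requested, and plain files behave the same way. Assumes a Linux host; this is illustrative, not taken from the zuul code.)

    import subprocess

    # Without universal_newlines the pipe is binary and readline() yields bytes.
    p = subprocess.Popen(['echo', 'hello'], stdout=subprocess.PIPE)
    print(type(p.stdout.readline()))  # <class 'bytes'>
    p.wait()

    # With universal_newlines=True the pipe is opened in text mode and yields str.
    p = subprocess.Popen(['echo', 'hello'], stdout=subprocess.PIPE,
                         universal_newlines=True)
    print(type(p.stdout.readline()))  # <class 'str'>
    p.wait()

    # Plain files are the same: text mode decodes (and implies universal newline
    # handling), binary mode returns bytes.
    with open('/etc/hostname') as f:
        print(type(f.readline()))      # <class 'str'>
    with open('/etc/hostname', 'rb') as f:
        print(type(f.readline()))      # <class 'bytes'>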
*** jkilpatr has joined #zuul01:37
mordredjeblair: lovely!01:49
mordredclarkb: yah - I agree on the collapsing of the environment defaults for tox01:52
mordredclarkb: if you're around, the 2 changes before 229 only have 1 +2 - they're mostly just lead-up to 229 though01:54
mordredclarkb: I'm not sure if you want to explicitly review them, or if your +2 on the end results of 229 is good and I can just land the stack01:54
jamielennoxdid we end up implementing a new way to hold the nodes of failing jobs?01:55
mordredjamielennox: yes we did!01:55
jamielennox\o/ - do you remember it?01:55
mordredjamielennox: https://docs.openstack.org/infra/zuul/feature/zuulv3/admin/client.html#autohold01:55
mordredjamielennox: (was looking for doc link for you)01:56
jamielennoxahh, i was looking at nodepool01:56
jamielennoxprobably does make more sense now for that to be on zuul side01:56
mordredyah - it used to be there - but with v3 and the shift to active-requests ... yah01:56
jamielennoxis there still a use for "nodepool hold"01:57
jamielennox?01:57
jamielennoxmordred: anyway - thanks!01:59
mordredjamielennox: I don't think so? maybe?02:00
jamielennoxnot super urgent for now anyway - it can be a useful way of pulling a node out for your own usage02:00
jamielennoxjust not sure if that'll be common02:01
mordredjamielennox: I do know that a thing we don't have but people keep asking for is "nodepool boot" - so that you can ask nodepool to boot you a node of a particular label - like if you need to debug something about one02:01
mordredjamielennox: which I think is similar to the use case you're talking about yeah?02:01
jamielennoxmordred: yea, because zuul will skip things marked HOLD, it basically reserves you a node02:02
mordred"as an admin of a zuul/nodepool, I'm having issues that only show up in test and I'd like a node to ssh in to and poke around at to see if I can figure out"02:02
mordredjamielennox: ah - ya - hold in nodepool gets you a node to play with - autohold in zuul doesn't delete a node when a job fails02:02
jamielennoxre: autohold, it'd be useful to not have every parameter required, like if tox-py27 is failing consistently in a tenant i probably don't care which project i capture from?02:02
jamielennoxwhich is a feature i should put in storyboard, but i still struggle to know where to put things like that in storyboard02:03
mordredyah- I could see that02:03
mordredjamielennox: I think we all do02:03
clarkbmordred: I think you can go for it. I'm fighting the "my washing machine stopped working and neither water valve for it has a handle" battle now02:04
mordredclarkb: oh good02:04
*** jkilpatr has quit IRC02:04
mordredclarkb: I'll land those in a sec - I'm working on the squash-tox-environment thing right now02:04
mordredbut I need to prove something to myself first02:05
clarkbit turns out when you replace a section of leaky pipe that kills washing machines02:05
clarkbcan I blame jaypipes for this?02:05
mordredclarkb: yes. he's the right person to blame02:07
*** jkilpatr has joined #zuul02:16
*** xinliang has quit IRC02:17
fungipabelanger: jamielennox: mordred: i'm still catching up... not installing bindep if there's no bindep.txt ignores the fact that we have a bindep fallback list, doesn't it? or am i misunderstanding the suggestion?02:19
*** xinliang has joined #zuul02:30
*** xinliang has quit IRC02:30
*** xinliang has joined #zuul02:30
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults  https://review.openstack.org/50107502:38
mordredfungi: the code that looks for bindep.txt also looks for the fallback file02:39
mordredfungi: so for openstack, it will always find a bindep.txt - AND it will never install bindep because we have it pre-installed02:39
mordredclarkb: ^^ https://review.openstack.org/501075 collapses the tox_environment settings like you mentioned AND gets rid of the python module02:39
mordredclarkb: so tons of simplification02:40
clarkbnice02:40
mordredclarkb: can probably squash with the previous patch, but I figured I'd put it up separate for reading purposes02:40
clarkbmordred: I've pulled it up for review first thing tomorrow02:40
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Don't install bindep if there's no bindep file  https://review.openstack.org/50101802:50
mordredjeblair: I hit +A on https://review.openstack.org/#/c/501040 but there's some good comments from ianw in there that are worthy of reading and potentially a followup03:07
fungijhesketh: trying to fix up your 456162 change i'm down to just one unit test failure now... if you get a chance to take a look at why test_crd_gate_unknown is unhappy with it we might be able to get it merged soon03:26
* fungi needs to get some sleep, but will be getting into more zuulishness tomorrow03:26
jheskethfungi: sure, I'll take a look03:30
ianwmordred: 501040 ... is that a known thing?03:39
ianwthe -2 i mean03:39
*** bhavik1 has joined #zuul05:02
jamielennox{"msg": "[Errno 2] No such file or directory"     is such a useless message05:03
jamielennoxwhy isn't the accessed name in there by default05:04
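(Side note: in python3 the filename is carried on the exception itself; a bare "[Errno 2]" message usually means only errno/strerror were formatted into the string. A small illustration, not tied to any particular ansible module:)

    import errno
    import os

    try:
        open('/does/not/exist')
    except FileNotFoundError as e:
        print(e)                    # [Errno 2] No such file or directory: '/does/not/exist'
        print(e.errno, e.filename)  # 2 /does/not/exist

    # Building the OSError without the filename argument is where the bare
    # message comes from:
    print(OSError(errno.ENOENT, os.strerror(errno.ENOENT)))  # [Errno 2] No such file or directory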
*** bhavik1 has quit IRC05:25
*** hashar has joined #zuul05:35
openstackgerritJoshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary  https://review.openstack.org/45616206:06
jheskethfungi: ^ I think that fixes the problem06:07
* jhesketh misses working on zuul06:08
tobiashjhesketh: I have a thought on ^07:44
tobiashjeblair: just noticed that maintainCache is never called so we probably don't clear anything from the change caches currently07:57
tobiashjeblair: http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/scheduler.py?h=feature/zuulv3#n58907:57
tobiashjeblair: there is still a comment about updating maintainConnectionCache for tenants but to me this method looks correct07:58
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Enable maintainConnectionCache  https://review.openstack.org/50114408:02
tobiashjeblair: wip'ed ^ in case you have any objections for this08:03
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Use password supplied from nodepool  https://review.openstack.org/50082308:44
openstackgerritTobias Henkel proposed openstack-infra/zuul feature/zuulv3: Enable maintainConnectionCache  https://review.openstack.org/50114409:06
*** openstackgerrit has quit IRC09:18
jheskethtobiash: ah cool, thanks... I think you're right and your suggestion is good. Should we let this one land and fix it up in a follow up or do it now?09:19
tobiashjhesketh: I don't mind, but I also think the data structure change should be its own patch09:19
jheskethtobiash: umm, so you do want them separate? (sorry, I'm confused by your last message)09:21
tobiashjhesketh: I think we should have a patch which restructures the cache data structure and the patch which already exists. Possibilities are that the restructure change is either the parent or the child of your change09:22
jheskethoh right, I follow09:23
*** openstackgerrit has joined #zuul09:48
openstackgerritJoshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary  https://review.openstack.org/45616209:48
openstackgerritJoshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Connection change cache improvement  https://review.openstack.org/50118709:48
jheskethtobiash: ^09:48
*** openstackgerrit has quit IRC10:03
tobiashlooking10:08
*** openstackgerrit has joined #zuul10:22
openstackgerritJoshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Connection change cache improvement  https://review.openstack.org/50118710:22
openstackgerritJoshua Hesketh proposed openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary  https://review.openstack.org/45616210:22
tobiashjhesketh: +210:23
*** jkilpatr has quit IRC10:42
rcarrillocruzmordred, jeblair : ok, so I created "Ricky Zuul" GH app https://github.com/apps/ricky-zuul . The perms are a bit of guesswork, put r/w on PR and r/w on repo contents10:46
rcarrillocruzoh, and also on commit statuses10:46
rcarrillocruzi installed that app on my rcarrillocruz org dummy repo 'zuul-tested-repo'10:46
rcarrillocruznow on my way to set up zuul-scheduler with github driver10:47
rcarrillocruzpabelanger: ^10:47
rcarrillocruzerm, i guess commit statuses are not needed11:01
rcarrillocruzjeblair: so i guess once we agree on the bare minimum perms that are needed for creating a bespoke GH app for zuul usage, I can push a change and document that11:02
rcarrillocruzis that expected to change? i remember reading a perms model change in GH, something about graphql , not sure if the GitHub App thing may be in flux ?11:03
*** jkilpatr has joined #zuul11:07
*** jkilpatr has quit IRC11:07
*** jkilpatr has joined #zuul11:07
*** jkilpatr has quit IRC11:15
*** jkilpatr has joined #zuul11:28
mordredrcarrillocruz: my understanding is that the App thing itself is the "new" way for 3rd parties to provide services to github users11:31
mordredrcarrillocruz: but you're very right - gh is moving their apis to all be graphql-based11:31
mordredrcarrillocruz: so at *some* point we'll need to update the gh driver to use graphql-api instead of rest11:32
mordredianw: hrm. not to me11:33
mordredjhesketh: we miss you working on zuul!11:33
rcarrillocruzsigh11:35
rcarrillocruzwhat's wrong with rest11:35
* rcarrillocruz has nightmares, as network vendors instead of adopting rest they are coming back to xml apis11:36
rcarrillocruzdoes http://paste.openstack.org/show/620512/ ring a bell anyone?11:41
rcarrillocruzthat from scheduler startup11:41
rcarrillocruzif i do on python3 shell11:41
rcarrillocruzimport github311:42
rcarrillocruzgh = github3.GitHub()11:42
rcarrillocruzit does not have a session either11:42
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Connection change cache improvement  https://review.openstack.org/50118711:51
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Only grab the gerrit change if necessary  https://review.openstack.org/45616211:51
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults  https://review.openstack.org/50107512:01
*** weshay_PTO is now known as weshay12:08
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults  https://review.openstack.org/50107512:11
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Delete unused run-cover role  https://review.openstack.org/50124412:12
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Switch to openstack-doc-build for doc build jobs  https://review.openstack.org/50124612:18
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Be explicit about byte and encoding in command module  https://review.openstack.org/50104012:21
mordredjeblair: SOOOOOO12:26
mordredjeblair: issue for you to look at as soon as you are awake12:26
mordredjeblair: I just watched shade have a job in the gate pipeline - which is incorrect12:27
mordredjeblair: the build uuid is 80e369018a664c5f86c3cd64af7a464012:27
mordredjeblair: https://review.openstack.org/#/c/494535/ is the change it happened for ...12:29
mordredjeblair: there is another change: https://review.openstack.org/#/c/500201/ which added the job to the gate pipeline, which I approved because I'm a moron and didn't register that it had a gate entry12:30
mordredjeblair: that change had not landed, but it *WAS* running in the gate when https://review.openstack.org/#/c/494535/ was approved and enqueued12:31
mordredrcarrillocruz: zomg network vendors are moving back to XML? they should at least, if they're not going to do REST, do something sane like gRPC12:32
mordredrcarrillocruz: did you properly get the version of github3.py from git?12:33
rcarrillocruzhttps://en.wikipedia.org/wiki/NETCONF12:33
rcarrillocruzit's been around for a while, but getting more vendors onboard now12:33
rcarrillocruzwhich is a shame, since there's a thing called RESTConf12:33
mordredrcarrillocruz: http://git.openstack.org/cgit/openstack-infra/zuul/tree/requirements.txt?h=feature/zuulv3#n512:33
rcarrillocruzanyay12:33
mordredrcarrillocruz: **HEADDESK**12:34
rcarrillocruzmordred: yeah, i did12:34
rcarrillocruzthing is12:34
mordredrcarrillocruz: now is _definitely_ the time to start adopting a protocol written in 200612:34
rcarrillocruzi don't understand that code12:34
rcarrillocruzthe github object is supposed to get the session when it logs in12:34
rcarrillocruzlet me link12:34
rcarrillocruzhttps://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/driver/github/githubconnection.py#L42712:35
rcarrillocruzit fails there12:35
rcarrillocruzbut, the login is done after that method12:35
rcarrillocruzso at that point there's no session12:35
rcarrillocruzcommenting out those lines, the execution goes fine12:38
mordredrcarrillocruz: that's really weird - I don't see that error in our logs12:39
mordredrcarrillocruz: I wonder if there is a difference in how we have the auth things configured?12:39
mordredrcarrillocruz: http://paste.openstack.org/show/620519/ is our github config snippet (with two values omitted, clearly)12:40
rcarrillocruzyeah, i have the same thing12:41
rcarrillocruzmordred: http://paste.openstack.org/show/620521/12:42
rcarrillocruzthat's pretty much what the driver code does12:42
rcarrillocruzin a python3 shell session12:42
rcarrillocruzgetting a client12:42
rcarrillocruzi don't have a session attr12:42
rcarrillocruzi'm confused how that works in your side12:42
mordredrcarrillocruz: that works for me: <github3.session.GitHubSession object at 0x7efeae5aa9e8>12:44
rcarrillocruzhmm12:44
mordredhttp://paste.openstack.org/show/620522/12:45
mordredrcarrillocruz: >>> github3.__version__12:45
mordred'1.0.0a4'12:45
mordredalthough that doesn't really tell what version from git it's installed from12:46
rcarrillocruzeugh12:49
rcarrillocruzthat was it12:49
rcarrillocruzit seems i had a github lib floating around12:49
rcarrillocruzmy messing with pip vs pip3 probably12:50
rcarrillocruzthx12:50
mordredrcarrillocruz: the pip vs. pip3 thing has bitten us more than once :)12:50
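(A quick sanity check for this kind of pip/pip3 mixup, assuming the git checkout of github3.py that zuul's requirements.txt points at is what should win; the prints are only illustrative:)

    import github3

    print(github3.__file__)                      # shows which installed copy python3 actually imports
    print(getattr(github3, '__version__', '?'))  # e.g. '1.0.0a4' per the log above

    gh = github3.GitHub()
    print(hasattr(gh, 'session'))                # True on the expected version; False suggests a stale copy is shadowing it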
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Collapse tox_environment and tox_environment_defaults  https://review.openstack.org/50107512:56
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Delete unused run-cover role  https://review.openstack.org/50124412:56
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add UPPER_CONSTRAINTS_FILE file if it exists  https://review.openstack.org/50032012:56
rcarrillocruzmordred: the webhook URL path, is by default <zuul_server>/connection/github/payload or you have some reverse proxy redirecting that to other thing ?13:00
mordredrcarrillocruz: it is that by default where zuul_server is the zuul-scheduler webapp process13:02
rcarrillocruzsweet13:03
mordredrcarrillocruz: we also have a reverse proxy in front of that for us, as there are 2 different web apps at the moment (zuul-scheduler and zuul-web) and we want to hide that13:03
rcarrillocruzthat reminds me i should bring it up now (zuul-web)13:03
mordredhopefully it won't be too long after the PTG for us to migrate the rest of the scheduler webapp into zuul-web so we can go back to having one web app13:04
rcarrillocruzjebus13:09
rcarrillocruzhttp://paste.openstack.org/show/620528/13:09
rcarrillocruzi'm excited!13:09
rcarrillocruz\o/13:09
Shrewsmorning folks13:13
mordredmorning Shrews13:13
mordredrcarrillocruz: woot!13:13
mordredrcarrillocruz: it's kind of amazeballs isn't it?13:13
rcarrillocruzfor sure :D13:14
*** dkranz has joined #zuul13:21
mordredjeblair: also still seeing the weird -2 on patches not at the top of a stack over in shade even with yesterday's patch running13:23
mordredjeblair: http://paste.openstack.org/show/620530/ is the relevant portion of the log I think13:23
mordredjeblair: also, a little further back in the log: 2017-09-06 12:03:16,903 DEBUG zuul.DependentPipelineManager: Scheduling merge for item <QueueItem 0x7f4882408b38 for <Change 0x7f48927d5ba8 500930,2> in gate> (files: ['zuul.yaml', '.zuul.yaml'], dirs: ['zuul.d', '.zuul.d'])13:27
rcarrillocruzmordred: does zuul-scheduler log github events?13:28
rcarrillocruzlike if i do a PR push, should I expect something in that log (in my case, foreground, not logging to its own file yet)13:29
mordredyes13:29
mordredyou should definitely see activity13:29
rcarrillocruzsweet, i'll try that out13:30
mordredrcarrillocruz: https://github.com/organizations/openstack-infra/settings/apps/openstack-zuul/advanced (or replacing with your url)13:30
mordredrcarrillocruz: shows you a list of events github has delivered to your app13:30
mordredrcarrillocruz: we should be logging the event id so that if you want you can cross-reference with github's log13:31
mordredjlk: ^^ speaking of that ... on that advanced tab gh also shows the response it got13:31
rcarrillocruzsweet13:31
mordredjlk: I wonder if maybe we should add $something to our response - like a header - that includes $something from zuul13:32
rcarrillocruzoh , i get 404 , i guess cos i'm not member of the org13:32
mordredjlk: it's possible we don't have anything yet13:32
rcarrillocruzbut yeah13:32
rcarrillocruzi can look on my own org13:32
mordredjlk: at that stage of processing13:32
mordredjlk: but if we do, maybe returning it in our response headers is a thing that could be useful somehow?13:33
mordredjlk: just an idle thought13:33
*** hashar is now known as hasharAway13:38
rcarrillocruzhmm, mordred i don't see logging on a PR I just pushed. However, I do not have layout.yaml set yet. I wonder if the logging is only when the project is set up on a pipeline with jobs and all, i.e. the webhook raw events are not logged ?13:38
rcarrillocruznah, seems like a config issue13:41
rcarrillocruzchecking the github app i see undelivered messages13:41
* rcarrillocruz looks13:41
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6  https://review.openstack.org/50126613:50
mordredrcarrillocruz: yah - you should see log entries for every event that happens13:51
rcarrillocruztadaaaa14:03
rcarrillocruzSep 06 14:03:08 zuul sh[16326]: 2017-09-06 14:03:08,619 DEBUG zuul.GithubWebhookListener: Github Webhook Received: 7eda70b6-9308-11e7-8827-64faa1b0f4fd14:03
rcarrillocruzgot confused between zuul-webapp and zuul-web ports14:03
rcarrillocruzput 8001 on the gh app URL and sorted it14:03
mordredwoot!14:12
mordredrcarrillocruz: and yah - I'm looking forward to there only being one web port - the current thing is annoying14:12
mordredtobiash: left -1 on https://review.openstack.org/#/c/500799 - but overall I like both sides of that stack!14:13
tobiash:)14:13
rcarrillocruzin order to have feature parity to what we have with dci (periodic CI jobs), i'll set up a periodic pipeline and set what we have now. After that, check_github and gate_github14:15
dmsimardmordred: hey, a bit of a silly question -- how do we make the base job work on either ubuntu-xenial or centos-7, but not both?14:17
mordredrcarrillocruz: \o/14:17
dmsimardright now the base job defaults to one ubuntu-xenial node -- so jobs wanting to run on something else would need to override it I guess ?14:18
mordreddmsimard: uhm. I'm not 100% sure what you mean by that - can you say that with different words?14:18
mordreddmsimard: yes! that is correct14:18
mordreddmsimard: jobs that want not-ubuntu-xenial just add whatever they want in nodes:14:18
dmsimardmordred: okay, pabelanger stood up the centos-7 image so I'll run some tests to see if it works ahead of the migration14:18
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix missing logconfig when running tests in pycharm  https://review.openstack.org/50074814:19
mordreddmsimard: ++14:19
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Close logging config file after write  https://review.openstack.org/50075414:19
dmsimardmordred: the JJB translation to shell will handle the node definition as well ?14:19
dmsimardsome have centos-7, centos-7-2-nodes etc14:20
mordreddmsimard: yah14:21
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: Do not merge: test v3 jobs on the centos-7 image  https://review.openstack.org/50128114:21
jeblairmordred: regarding 494535 being in gate -- i'm not actually sure that was incorrect.  when a change is enqueued into gate, it's a proposed future state, and each change enqueued after it exists in the context of that proposed future state.  so as soon as the change adding the shade job to gate was in the gate pipeline, the next shade change added to the gate pipeline should run that job too, because it's running in a world where shade has a gate ...14:33
jeblair... job.14:33
mordredjeblair: yes - for sure!14:33
mordredjeblair: but - when the change adding the gate job does not land, the change behind it should get re-enqueued in a context that doesn't include the gate-adding change14:34
mordredjeblair: so while it should run in the gate for a moment in time, the other change should certainly not be merged by the gate pipeline14:34
mordred(which is what happened)14:35
jeblairmordred: ah yes.  i'll make a test.14:35
mordredjeblair: here's the sequence that occurred https://etherpad.openstack.org/p/FZhs4Rh86F14:39
jeblairmordred: got it14:40
mordredjeblair: also - I was going to fix pabelanger's change to remove the gate mention, but was waiting to make sure you didn't need it for any reason14:40
jeblairnope14:40
mordredjeblair: cool. also added followup with theory on what the followups are with the other changes (that issue is persistent and still happening, fwiw)14:45
jeblairmordred: that makes sense too14:47
mordredk. cool14:47
jeblairwe'd certainly want to fix that before we start having zuulv3 chime in on more repos.14:47
jlkmordred: a header tossed back, some sort of uuid?14:48
jlkmordred: would make sense, particularly if we could carry that ID throughout zuul logging, for tracing.14:50
mordredjeblair: agree14:51
mordredjlk: and yah - maybe? I mean, on the other hand we already log the GH event id in the zuul logs14:51
mordredjlk: so I'm not sure if reporting a zuul id in the gh response is *actually* useful for cross-reference14:52
jlkwell14:52
mordredjlk: since you'd ultimately need to find the gh event id in the zuul log to know which event to open in the gh ui14:52
jlkit'd make sense if we did the same elsewhere, and had an ID that could be carried around like with openstack14:52
mordredyah - that's a good point "I got an event from something, I created an ID for it, and that ID is gonna carry through the system on the things that event triggers"14:53
jlkbut yeah, if you were looking from zuul side, you'd eventually trace it back to the incoming event, which would have the GH ID14:53
mordredyup14:53
jlkso maybe not as useful to toss it back, but certainly that idea spurred other ideas :D14:54
jlkspurred?14:54
mordredjlk: oh - I also happened to notice in the log:14:54
mordredAttributeError: 'GithubWebhookListener' object has no attribute '_event_pull_request_review_comment'14:54
mordredjlk: only 4 of them in the current debug log14:54
jlkyeah we aren't listening for those14:54
jlkthat's when somebody does a "review" and comments on the code/review in the review context14:54
jlkseparate from just a single comment on the PR14:55
jlk(and also different from a single comment on the diff)14:55
mordredjlk: gotcha. so we don't care about those because we only care about comments on PRs for recheck, yeah?14:55
jlkcorrect. Those come through as issue comment14:55
mordredjlk: and for reviews I'd imagine we'd prefer to respond to review approval rather than text in a review14:56
jlkbecause a PR is an issue, except that it isn't.14:56
mordredyah14:56
jlkmordred: bingo. We do respond to approval/request changes events14:56
mordredjlk: maybe we should make a few explicit no-op handlers for things we know we're not listening to on purpose14:56
mordredjlk: so that we don't log AttributeErrors for a thing that's actually purposeful behavior14:57
jlkWe could do that. It should be gracefully returning to github if we don't handle the event.14:57
jlkbut yeah, that's a tad ugly in the code14:57
jlkMaybe we could just not emit that error when we don't match an event.14:57
jlkGH adds events from time to time, we'd be chasing it if we had a noop for each.14:57
mordredjlk: yah - at this point I think we're fairly happy with our event matching - we could make a separate logger for unmatched events that defaults to off but that people could add to their logging config if they wanted to debug something related to event matching14:59
mordredbtw - we're getting "We couldn’t deliver this payload: Service Timeout" from time to time on gh events15:01
jlkoh wonderful15:02
jlkfrom gh to zuul or from zuul to gh?15:02
mordredso it's possible that we've already hit the scaling point where having more than one webhook listener behind a loadbalancer is needed15:02
mordredgh to zuul15:02
jlkyeah, I figured that would happen soon15:02
jlkI think a big part of that problem is a bunch of processing happens while the sender is connected15:03
mordredyah - even though we're not doing anything with them yet, the firehose of gh events from ansible/ansible is actually kind of useful for shaking this sort of stuff out15:03
jlksender connects, zuul chews on the event for a bit, hits the API a bunch, then returns15:03
jlkit would probably be much better to take in the event content and return right away.15:04
mordredah. any reason we can't return as soon as we have the json even if we haven't enqueued it yet? or do you think waiting until it's enqueued so that we properly return to gh that we didn't accept it is better?15:04
jlkwe should be doing minimal processing.15:04
jlkI think I broke this a bit15:04
jlkit probably was really fast before, but when I moved to caching the PR data, it meant hitting the APIs much earlier.15:05
mordredyah. I think returning quickly is likely better here- and we can do work zuul-side to make sure we either don't lose events after returning 200 or that we log ourselves when we do15:05
jlkbuilding up the cached object at web event time and carrying it forward.15:05
jlkwe could still do that, we just have to be careful15:05
jlkneed a persistent queue15:05
jlkbtw I'm going to try to participate in the ansible contributor day thing15:06
mordredyah - well - luckily the move from webapp to zuul-web will put a gearman in the middle anyway15:06
jlkvia video/IRC15:06
mordredjlk: cool15:06
jlkI think the one hard one to easily do a 200 immediately on is when it's a status event15:06
jlkactually, no, we could probably post-process that anyway15:06
jlkso basically, we'd get an event from GH, we'd ensure it's signed and properly formatted, then return a 200. Maybe we could check to see if it's a project we care about and do a !200 if we don't care about the project yet.15:09
jlkdoing all the API work in the event thread ties up the event processor. I think that's single threaded, no?15:09
jeblairjlk, mordred: that's what we do with gerrit today -- there's a queue object that connects the gerrit listener to the gerrit connection.  all of that within the driver.15:10
jlknod15:10
jeblairso if we need this before we move the listener into zuul-web, there's a pattern we can copy pretty quickly.  it's not much code.15:11
jlkyeah I could probably bang that out today15:11
jlksee if that gets us out of resource contention15:11
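(A rough sketch of the ingest-then-process split being discussed, with a queue between the webhook handler and the API-heavy work, in the spirit of the gerrit listener pattern jeblair mentions. This is not the zuul code; handle_webhook, process_event and the token value are illustrative placeholders:)

    import hashlib
    import hmac
    import json
    import queue
    import threading

    WEBHOOK_TOKEN = b'secret'              # placeholder for the configured webhook_token

    def verify_signature(headers, body):
        # github signs the payload with HMAC-SHA1 in the X-Hub-Signature header.
        expected = 'sha1=' + hmac.new(WEBHOOK_TOKEN, body, hashlib.sha1).hexdigest()
        return hmac.compare_digest(expected, headers.get('X-Hub-Signature', ''))

    event_queue = queue.Queue()

    def handle_webhook(headers, body):
        # Only cheap work happens while github is waiting on the response.
        if not verify_signature(headers, body):
            return 401
        event_queue.put((headers.get('X-GitHub-Event'), json.loads(body.decode('utf-8'))))
        return 200                          # acknowledge delivery immediately

    def process_event(name, payload):
        pass                                # the slow, API-hitting part lives here

    def worker():
        while True:
            name, payload = event_queue.get()
            try:
                process_event(name, payload)
            finally:
                event_queue.task_done()

    threading.Thread(target=worker, daemon=True).start()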
jlkI honestly think I'll need some high bandwidth brain / face time with y'all to sort out the zuul-web move in my head. I read some of the code but it's not exactly clicking yet15:12
jeblairjlk: it helps if you write it out on a piece of glass and look at it from the back side, upside down15:17
jlkperfect!15:19
mordredjlk: :)15:20
mordredjlk: I think we can sort out the zuul-web move with high bandwidth brain time pretty quickly - it's actually pretty straightforward given the structure of the github driver - at least in my head15:22
* jlk drops off to prepare for ansiblefest things15:26
rcarrillocruzfolks, i have to say the zuul docs have been vaaastly improved15:29
rcarrillocruzkudos everyone15:29
* rcarrillocruz keeps reading how to define zuul v3 jobs15:29
*** hasharAway is now known as hashar15:31
mordredrcarrillocruz: luckily - we've got a TON of content now15:33
Shrewsmore content to go in with some +3's: https://review.openstack.org/50021315:43
Shrewsjeblair: left a -1 on https://review.openstack.org/500216 because of a missing _ causing the link to not be clickable15:44
jeblairShrews: okay, i'll update that after i finish writing tests for mordred's problem15:45
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: Do not merge: test v3 jobs on the centos-7 image  https://review.openstack.org/50128115:45
jeblairmordred: with the test i've written, change B *does* incorrectly run the gate job, however, it correctly *does not* report it.15:46
jeblairmordred: are you sure there aren't any other gate jobs?  maybe some with matchers so they aren't actually run?15:48
jeblairmordred: or were there any other changes involved in that sequence?15:48
mordredjeblair: no - not to my knowledge, to either15:58
jeblairoh... there *might* be an interaction with the check pipeline... lemme rejigger the test15:59
mordredjeblair: there is no mention of shade in a zuul gate pipeline anywhere other than that one change16:00
jeblairmordred: ah there we go -- it's the presence in check that caused that behavior.  i think i have the reproduction now.  sorry for the red herring.16:01
pabelangermordred: I think we are ready to land https://review.openstack.org/500990 this morning.  Do you have time to review? Our new publish-openstack-python-docs-infra job16:01
pabelangerand child patches16:02
mordredjeblair: woot!16:03
mordredpabelanger: looking now16:03
mordredjeblair: glad you found a reproduction - it's those sorts of squirrely things that this whole run-it phase should be smoking out16:06
rcarrillocruzfolks, where is the zuul base default job defined16:09
rcarrillocruzis it in tree16:09
rcarrillocruzor within zuul-jobs repo16:09
jlkI thought it was in zuul-jobs16:09
rcarrillocruzasking as i defined a custom job on my test repo16:09
rcarrillocruzand got16:09
jlkor maybe project-config16:09
jlkproject-config16:09
rcarrillocruz "Job base not defined"16:09
mordredrcarrillocruz: it's project-config16:10
rcarrillocruzso, that means, it's a requirement to pull that repo in order to have a minimal zuul right?16:10
mordredrcarrillocruz: you have to define your own base job16:10
jlkplaybooks/base16:10
jlkyou can define your own base, or re-use project-config.16:10
mordredrcarrillocruz: however, our base job playbooks are just built on roles that are all in zuul-jobs16:10
rcarrillocruzk, thought we had some sort of empty 'base' in the code16:10
rcarrillocruzso we didn't have to define it16:10
mordredrcarrillocruz: it's on our todo list to do that - there are a few things that need to be sorted out first, so for now you need a deployment-specific base job16:11
rcarrillocruzdummy question: as there's going to be a super handy library of base jobs, how do you plan to distribute that? as part of pip or will it always be a thing on git openstack16:12
mordredrcarrillocruz: I recommend copying the base job + playbooks from project-config, then defining a secret that holds credentials for wherever you want to upload logs16:12
mordredrcarrillocruz: git openstack16:12
mordredrcarrillocruz: because you can just put openstack-infra/zuul-jobs directly in your zuul/main.yaml16:12
mordredrcarrillocruz: and zuul will update it for you magically16:12
mordredrcarrillocruz: I'm sure at some point someone is going to think they want a frozen pip/rpm/deb installable set of jobs, but I'm going to argue with them as strongly as I can that they don't really want that :)16:13
mordredpabelanger: I'm +2 on that whole stack16:13
rcarrillocruzah ofc, zuul-jobs is *in* github too16:14
rcarrillocruzso in my case, a github-only driver installation, it would pull it as well16:14
rcarrillocruz++16:14
jlkyeah you can point to github16:14
jlkwe do at Bonny16:14
mordredrcarrillocruz: yup16:14
dmsimardmordred, jeblair: is it possible to prevent the base role from running ?16:15
rcarrillocruzjlk: periodic also work on github right?16:15
rcarrillocruzi'm doing a POC16:15
jlkperiodic driver?16:15
jlkI haven't tried...16:15
rcarrillocruza periodic pipeline16:15
rcarrillocruzwith github source16:15
jlkI mean, it should?16:15
mordreddmsimard: you can say "parent: none" to make a job that doesn't use the base job16:15
dmsimardmordred: perfect! thanks.16:15
mordreddmsimard: although if you did that on openstack's zuul you'd be very sad16:15
mordreddmsimard: since you don't get logging without our base job :)16:15
dmsimardmordred: right, but the purpose is to test the base playbook (and the roles it contains)16:16
dmsimardso it's kind of inconvenient if the trusted role runs first, and then we re-run the (modified) role on top16:16
mordredoh - well - we'll never run a proposed version of the base job in a job16:16
dmsimardmordred: what is preventing me from adding a required-projects: project-config and then running that playbook with the checked out roles from a review ?16:17
mordreddmsimard: zuul is16:17
mordredoh - hrm.16:18
mordreddmsimard: yah - ok, you could construct something - it would still be synthetic, as it wouldn't have access to the secrets needed for the base job to work16:18
dmsimardmordred: the tl;dr is that I want to make sure the base playbook works for all distros -- right now it only works for ubuntu. This is for centos: https://review.openstack.org/#/c/501281/ but I'll also add the debian, fedora and opensuse image to nodepool v316:19
dmsimardand not getting configure-mirror "self-tested" in the gate will make this suck, a lot16:20
dmsimardthus, I was planning on spawning a multi-node job and running ansible from a controller node, to the node where the base role would be applied -- a bit like how you showed me with zuul stream16:20
mordreddmsimard: yah - well, the base-job content is purposely not self-testing - which is why we need to make some synthetic tests16:20
mordreddmsimard: oh - but yes, that's exactly right16:20
dmsimardmordred: so, can I do that then ?16:20
mordreddmsimard: doing that is, I think, what we need to do to verify base content - but then it's not really about being able to run a job without its base job...16:21
mordredor, rather...16:21
mordreddmsimard: the synthetic job will still need to deal with providing the job running from the controller to the other node with variables that the job can use when it runs the roles in the base job's playbooks16:22
dmsimardsure, I can figure what to pass and provide "mock" data as necessary16:23
mordreddmsimard: so - that's a thing you could have your synthetic job create - like making sure there is a key installed on one of the nodes, then passing it in16:24
dmsimardthe purpose is to test that it works without failing horribly16:24
dmsimardright16:24
pabelangermordred: thanks, jeblair clarkb fungi: are you interested in reviewing https://review.openstack.org/500990 for our publish-openstack-python-docs-infra jobs16:24
mordreddmsimard: we still likely want to make a base job in project-config like "base-post-only" or something that you could use that would only run the post-logs playbook - and maybe that would only run the pre-playbooks against the controller node instead of against hosts: all16:25
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure  https://review.openstack.org/50134516:25
rbergeronjlk, pabelanger: jeblair / mordred / shrews aren't here with us today at the ansible contributor event -- but since you're here (paul in person, jesse remotely) if you feel like there's anything to hail them about, feel free to do so --16:25
mordreddmsimard: so that you can use the REAL base job to set up your interactions with controller and to get logs published, etc16:25
rbergeronthe agenda and bluejeans stuff is posted here: https://public.etherpad-mozilla.org/p/ansible-summit-september-2017-core including the bluejeans video stuff16:25
rbergeronwhich i said twice in one sentence16:26
* mordred waves to rbergeron16:26
rbergeronanyway.16:26
rbergeronshrews: i have your hoodie also :)16:26
fungipabelanger: i'm interested in reviewing anything and everything zuulv3, working my way through the infra-manual patches right now but i can look at those next16:26
* rbergeron waves to mordred16:26
rbergeron(and we're in #ansible-meeting on irc). sorry for the noise. also if anyone else wants to pop in for whatever reason you are welcome to :)16:26
jeblairrbergeron: thanks!16:27
pabelangerfungi: great! I think we're ready to start testing afs publishing again for infra jobs16:27
jeblairmordred:   https://review.openstack.org/501345  fixes the first thing (the A+B changes)16:28
mordredjeblair: awesome. reading. also, I added post-merge review comments to https://review.openstack.org/#/c/50021316:31
dmsimardmordred: I'm not sure that matters, 'controller' is a subset of 'all' anyway -- so when running the playbook in the job, it will target a specific node as appropriate16:31
mordreddmsimard: I believe I see where you're going, but I do not believe it's going to work quite like you want - you can get the proposed project-config change onto a build node, but you cannot get the existing zuul to execute playbooks from the proposed change no matter what you do because of the way config repos work16:39
clarkbpabelanger: mordred please see comment on 99016:39
mordreddmsimard: so by synthetic, I mean you're going to have to command: ansible-playbook something on the controller against one of the other nodes in the multinode job16:39
dmsimardmordred: yes, exactly -- I'll be running ansible-playbook16:39
mordredclarkb: will do - could you look at https://review.openstack.org/501345 ?16:39
dmsimardit's not perfect but it's the best we've got16:39
mordreddmsimard: ++16:39
mordredclarkb: we found a really fun edge case with gating this morning :)16:40
mordredclarkb: responded. I agree with your comment, but I think pabelanger can make the mv docs/post.yaml docs/infra-post.yaml when he adds the real post playbook for the openstack job16:42
clarkbmordred: re parent: none from above, the change you just linked uses parent: null. Is that just a convention of pointing to undefined name or is null actually needed?16:47
dmsimardjeblair: Can parameters from a job also be applied to a node ? If, for example, you'd want to run a playbook or a role against only one node.16:52
mordreddmsimard: you define that in the playbook16:52
mordreddmsimard: you can define groups for your nodes in the nodeset definition16:52
mordreddmsimard: so you can put some of them into a group and then write the playbook to target that group16:52
mordredclarkb: parent: null is required for the base job - since by definition it's the root of the inheritance hierarchy16:53
clarkbmordred: and that is distinct from parent: none?16:53
mordrednope. I just mistyped earlier16:53
clarkbah ok16:53
mordrednull is the yaml for None iirc16:54
dmsimardmordred: so here's my next awesome problem16:54
clarkblike false == False and so on16:54
mordredmain thing is - for a base job you must tell zuul explicitly it doesn't have a parent16:54
mordredbecause omitting parent: means parent: base16:54
mordredclarkb: yup16:54
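(A quick way to see the distinction, assuming PyYAML semantics for loading the config files:)

    import yaml

    print(yaml.safe_load('parent: null'))   # {'parent': None}
    print(yaml.safe_load('parent: ~'))      # {'parent': None}
    print(yaml.safe_load('parent:'))        # {'parent': None}  -- an empty value is also null
    print(yaml.safe_load('parent: none'))   # {'parent': 'none'} -- just a string, not None
    print(yaml.safe_load('flag: false') == yaml.safe_load('flag: False'))  # True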
pabelangermordred: clarkb: replied. Yes, the publish-openstack-python-docs job needs to be updated now, which is what I am working on locally. But want to make sure that python-docs-infra is now working, since we can build on top of that for unified docs16:54
dmsimardmordred: I'd like the *real real* base job to actually run on the controller node (so it can, like, upload logs for real and stuff) but not on the node I'm going to fake-run base on16:54
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline  https://review.openstack.org/50135316:55
dmsimardI hope that makes sense16:55
jeblairmordred: ^ turns out that change fixed the second thing too, so that's just a test add16:55
mordreddmsimard: right- that's why I was thinking we needed to make a special base job for this purpose in project-config that is like the base job but has playbooks that target a specific group instead of "all"16:55
* jeblair reads scrollback16:56
mordredjeblair: woot. btw - the first patch failed tests in gate16:56
dmsimardmordred: making another playbook is easy, I mean, I can just add it as a "fixture" in zuul-jobs -- but then it can get out of sync from the "real" base playbook16:56
dmsimardmordred: oh, wait, I'm confusing myself I see what you mean now16:57
dmsimardok, sure, let's do that16:57
mordreddmsimard: although - now that I think about it - hosts: lines can have variables - so we COULD consider making a variable on the base job that is like "zuul_default_target: all" - and then defining some of our base playbooks to use hosts: "{{ zuul_default_target }}" - which would let people override that variable and run base playbooks against a subset of nodes16:57
dmsimardmordred: what I was thinking about is more along the lines of --limit from the CLI16:58
dmsimardmordred: your playbook has 'all' but you're passing a --limit <node name> so that you'd only be running against a specific node or group16:58
mordredI could see times in which that would be beneficial to other people - especially with a controller/nodes pattern - like if someone wanted 3 nodes for a puppet integration test but only was ever going to run their zuul playbooks against controller since they want puppet to talk to and manage their other nodes16:58
pabelangermordred: jeblair: re: wheel builders we'll need more openstack-infra projects to zuulv3, are we okay to do that or hold off until mass import?16:58
mordreddmsimard: I think both are things we should consider - but for now let's just do a second base job with a limited hardcoded set16:59
mordreddmsimard: and hash out a plan for the general usecase next week16:59
mordredpabelanger: it's only 3 more projects16:59
dmsimardmordred: right, I think adding support for a parameter which gets passed to --limit makes sense, wonder if I should write it down somewhere16:59
pabelangermordred: yes, I can propose it now. just wanted to confirm first16:59
mordredhttps://review.openstack.org/#/c/500626/2/zuul.yaml16:59
mordredpabelanger: I have the whole wheel-builder stack :)17:00
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Add zuul-cloner shim  https://review.openstack.org/50092217:00
jeblairdmsimard: why not just write your playbook to act on the nodes you want it to?17:00
Shrewsrbergeron: \o/17:01
mordredjeblair: because what he wants to do is limit the nodes the base playbook operates on17:01
jeblairmordred: if that's the case then we have too much stuff in the base playbook17:01
mordredjeblair: so that content he runs on the 'controller' node is what's responsible for doing the things to the other node that base would normally do17:01
dmsimardit's probably confusing to explain in writing, I'll write it down and explain in person :D17:01
pabelangermordred: Oh, now I see17:02
mordredjeblair: not really - zuul_stream test, for instance, has a zuul_console on the node it runs against because the zuul base job sets that up - so we don't actually test in that change that running zuul_console on the remote node does what we expect17:02
pabelangerdoh17:02
mordredjeblair: but I think we can shelve this as a general topic until next week17:02
dmsimardmordred: could/should I re-purpose base-test for that purpose ?17:02
pabelangermordred: we need to land zuul/main.yaml first. I'll make change17:02
mordredjeblair: and for now do exactly what you said - which is "write a playbook which explicitly lists the hosts desired"17:02
jeblairmordred: i have a really high bar for adding features to zuul for the express purpose of being able to test zuul17:02
mordredjeblair: yes. I do not think it's needed17:03
mordredjeblair: to add any features17:03
jeblairso the fact that zuul_stream is hard to test doesn't bother me.  that's *our* problem, we don't need to inflict that on users of the general case :)17:03
mordredjeblair: right. I'm not saying we need to add any features - I think there are some things we can do simply right now, and some things we can do that might be more complex and general that we can talk about next week17:03
jeblair(we can write a playbook to kill it; job done :)17:03
dmsimardjeblair: What I'm talking about is not for the purpose of testing Zuul, it's a generic feature of Ansible to be able to limit what hosts a playbook will run against -- your playbook could have 'hosts: all' but if you do ansible-playbook playbook.yml --limit controller, it would only run against your controller node.17:04
jlkDo we have any pending upstream Ansible features?17:05
*** jkilpatr has quit IRC17:06
mordredjlk: the upstreaming of log streaming that we discussed with abadger1999 and jimi-c - but I believe that's at "write a spec" stage17:06
jlkokay17:06
jeblairdmsimard: right, but i worry that in the zuul context that gets a little confusing when compared to playbooks authored to run against node lists.  it seems like if you have a playbook that runs against 'foobar' nodes, and you don't want it to run on foobar nodes, don't add any foobar nodes to the job?17:07
mordredjlk: other than that, I think we're pretty solid at the moment17:07
jlkokay17:07
*** harlowja has quit IRC17:07
*** harlowja has joined #zuul17:07
mordredfor now how about we add a base-integration - or even base-zuul-integration - that has the same content as base but with playbooks that target controller instead of all - it's a purely job-content solution for being able to build a zuul job to test the base job17:09
mordredwhich, as jeblair points out, is a fairly unique and specific problem17:10
Shrewsjeblair: cloner shim is ready for review. I thought it would be less error prone to just flat out copy the ClonerMapper class into the shim. Also, using 'cp' for the hard linking since that seemed less error prone than a pure python solution.17:10
jeblairyeah, i think part of the context that we're missing here is that we just spent about 3 weeks ensuring that the base content ran everywhere as part of our security posture17:10
jeblairmordred, dmsimard: so anything where we back that out is something that we need to do very carefully17:11
jeblairmordred, dmsimard: for instance, even in the reduced base job that mordred is talking about, we need to run the ssh key thing17:12
jeblairmordred, dmsimard: and there can be no ability for a job to opt-out of that, otherwise we have created a vulnerability17:13
dmsimardjeblair: ok let's take a step back17:13
jeblairmordred, dmsimard: (to be clear, i'm in favor of mordred's reduced base job, as long as it still contains the ssh key roles running on all hosts)17:14
dmsimardjeblair: instead of telling you what I think I want, let me tell you what's my problem and let's see if we're on the same wavelength17:14
dmsimardjeblair: I'm trying to iterate on this: https://review.openstack.org/#/c/501281/17:15
dmsimardjeblair: my problem: how do I test that this doesn't break the base playbook on different distros ?17:16
pabelangerShrews: so if we wanted to bring nl02.o.o online, is it better to run both nl01 and nl02 at the same time or stop nl01.o.o and start nl02.o.o. Do you have a preference?17:16
Shrewspabelanger: i can't think of any reason why you'd need to stop nl0117:17
jeblairdmsimard: thanks. having that example helps.17:17
dmsimardjeblair: what I think I need: create a 2 node job, 'controller' and 'node', run the *real* base playbook on the controller, and get 'controller' to run ansible-playbook pre/post/etcbase.yaml on 'node'17:17
pabelangerShrews: eventually we want to stop / delete / rebuild nl01, since it is trusty17:17
dmsimardbut if the *real* base playbooks run on 'node', then it kind of sucks because I'm running on top of what already ran.17:18
clarkbmordred: is log collecting in v3 expected to left align everything? http://logs.openstack.org/45/501345/1/check/tox-py35/4f47527/job-output.txt.gz#_2017-09-06_16_36_23_66013917:18
pabelangerI think we might want to start bringing online another zuulv3 merger, ze01.o.o is currently processing a large nova change17:19
jeblairdmsimard: yeah, so i think mordred's suggestion of the reduced base job which only does minimal things (ssh keys, zuul_stream, logs) is the way to go;  ssh keys and zuul_stream are the only things that will run on the remote node, and ssh keys are the only thing that might have an operating system interaction17:19
mordredclarkb: nope. that's a bug17:19
*** jkilpatr has joined #zuul17:19
jeblairpabelanger: ze01 -- ze04 exist; you can make sure the others are up to date and bring them online17:19
jeblairdmsimard: however17:20
pabelangerjeblair: ah, right. I'll check that now17:20
jeblairdmsimard: note that once you have the reduced base job, you don't actually need to implement this as a multinode job, you can run the additional roles on the main node17:20
mordredclarkb: although that's content from inside of the output from tox from testr from zuul's tests - so I don't believe we're processing that exception text zuul side17:20
dmsimardjeblair: I guess17:21
clarkbjeblair: mordred: looking at test failures for 501345 I think that the fix has basically caught test fixtures that were/are broken and now we dequeue and don't report but assert we should report17:22
jeblairdmsimard: i think the key thing here is the ssh keys -- regardless of the technical capabilities of the system, we must as a matter of policy in openstack-infra, at the very least run the ssh keys role on every node.17:22
mordredjeblair: can you though? you won't be able to get zuul to put the proposed versions of the project-config roles in place on the executor17:22
mordredjeblair: or I guess I'm wrong - the job playbook that declares roles will get those ... so yah17:22
jeblairmordred: ya that second thing, so we can have a test job in zuul-jobs that exercises the roles17:23
clarkbmordred: unless that is a new behavior in testr I'm pretty sure it won't left align like that17:23
mordredjeblair: yah- main thing will be that we won't be able to get zuul to run *playbooks* from project-config17:23
pabelangerjeblair: mordred: clarkb: fungi: I'm going to start ze02.o.o now, any objections? It is up to date17:23
jeblairclarkb: ah thanks, looks like i have a bit of cleanup to do17:23
mordredjeblair: but since those playbooks are simple anyway, that shouldn't be a problem17:23
dmsimardmordred: could you not do required-projects: project-config and then have a playbook that includes project-config/playbooks/something.yaml ?17:24
mordredclarkb: yah - I'll look at the zuul_stream stack and see if I can reproduce17:24
mordreddmsimard: nope17:24
dmsimardmordred: or ansible-playbook project-config/playbooks/something.yaml17:24
dmsimardwhy ?17:24
mordreddmsimard: you can't execute commands on the executor17:24
jeblair(that sounds ironic)17:24
mordreddmsimard: the only way for playbooks to be executed from the executor is for zuul to execute them17:24
clarkbmordred: if you don't mind I'd like to poke at that for a bit first just to gain more familiarity with the streaming setup17:25
mordredso we can put a playbook in zuul-jobs that runs the same roles as the base job - and _that_ playbook is one that zuul can execute17:25
mordredclarkb: awesome! I have a helper tool in tree to help get set up with local testing, fwiw17:25
dmsimardmordred: I'm confused, let me put up a gist to express what I'm trying to say17:26
clarkbmordred: ya I see it test() :)17:26
mordredclarkb: https://review.openstack.org/#/c/500161/17:26
mordreddmsimard: ++17:26
fungipabelanger: starting ze02 sounds good to me17:27
pabelangerokay, ze02.o.o started17:30
dmsimardmordred: https://gist.github.com/dmsimard/1fc6b22a40009298713c7432d9368a3717:34
fungipabelanger: is your comment in 500990 implying that you have another patchset coming to address clarkb's concern, or a followup change?17:34
dmsimardmordred: I guess in this example, we'd also need to set up the ansible roles path to seek from the checked out zuul-jobs17:35
pabelangerfungi: yes, that is what I am writing now. I hope to push up the publish-openstack-python-docs changes in the next hour17:35
dmsimardmordred: edited the gist to add the roles_path17:36
dmsimardjeblair: https://gist.github.com/dmsimard/1fc6b22a40009298713c7432d9368a37 ?17:36
dmsimarder, that ansible.cfg would not be effective17:38
dmsimardunless we'd run from a multi-node setup and run ansible manually from a controller to a node17:39
fungipabelanger: awesome, but my question was about whether it's a new patchset for that change, or a new change entirely17:39
jeblairdmsimard, mordred: theoretically, i think the include approach could work -- you could probably use include to get zuul to execute the un-merged project-config code (if you merged that job which did that)17:39
jeblairdmsimard, mordred: however, that's the reason we should not merge such a change, as it allows arbitrary code execution on the executor17:39
pabelangerfungi: sorry, it will be a follow up because we need a new role in openstack-zuul-jobs. Basically, we can delete publish-openstack-python-docs job right now, if we want to avoid projects using it, currently that is shade17:40
jeblairdmsimard: (ftr that path would actually be "{{ zuul.executor.src_dir }}/git.openstack.org/project-config/..." but that's a minor detail)17:41
dmsimardjeblair: ah that's the one I was looking for actually, I couldn't find it17:42
jeblairdmsimard: put another way: that change would let you tell the executor to run un-vetted code.  it would also let people pwn the executor.  so we can't merge it.17:42
fungipabelanger: okay, i'll put 500990 on the back burner for a bit and review it in the context of your coming changes17:42
dmsimardjeblair: but $enduser from $project could merge something like that17:43
jeblairdmsimard: no, that's a base job, and they can only go into project-config17:43
dmsimardjeblair: do trusted jobs run outside the bubblewrap ?17:43
openstackgerritPaul Belanger proposed openstack-infra/zuul feature/zuulv3: Switch to publish-openstack-python-docs-infra  https://review.openstack.org/50136217:44
jeblairdmsimard: no, they have their own bubblewrap (with potentially more access)17:44
pabelangerfungi: so, we'd need to land ^, then we can remove publish-openstack-python-docs until new code is ready17:44
dmsimardjeblair: I guess that's part of what I missed, okay.17:44
jlkmordred: et al: There was code added to the github driver to handle a ping event, if it's from a repo we aren't configured to listen to. We're apparently being nice and responding back to github with a 404. If I move over to an ingest, queue, process model, we wouldn't be able to immediately (or at all really) return that 404. How important is this nicety of the 404?17:47
*** jkilpatr has quit IRC17:47
pabelangerfungi: 501363 removes it for now, until I push up new role17:47
jeblairjlk: i guess the 404 just says to anyone looking at the github webhook logs that zuul is ignoring it?17:48
clarkbmordred: where does the hostname come from in the log? I see we do timestamp | log_line  but no hostname17:48
jlkjeblair: yeah17:48
pabelangerfungi: but, I think we should start testing 500990 sooner to confirm it works as expected17:48
*** jkilpatr has joined #zuul17:48
jeblairjlk: i feel like 200 is okay.  like "message received!"  the fact that it was subsequently ignored is a detail that a zuul admin can inspect.17:48
jlkjeblair: it's odd that we do it specifically for the ping event, which is when somebody installs a webhook .17:49
jlkdifferent than an app install I think.17:49
mordredjeblair, dmsimard: I don't think it would open the door to arbitrary code execution - it would just not run because the execution context is still untrusted17:49
jlkI'll drop a TODO in here to validate that the project we got an event for is a project we care about.17:49
jlkbecause we've talked about doing that anyway across the board, not just on ping events.17:49
mordredclarkb: http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/callback/zuul_stream.py?h=feature/zuulv3#n27717:50
mordredclarkb: and http://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/ansible/callback/zuul_stream.py?h=feature/zuulv3#n27717:50
jeblairmordred: in dmsimard's gist, there is a playbook defined in a project-config job, so it runs in the trusted context.  the content of that playbook is to ansible-include other playbooks from the checkout of unmerged code on the executor.  that means that a change to an untrusted-project which used that base job and depended on an un-merged change to project-config would run the un-merged project-config content in the trusted context.17:52
mordredjlk: I think it's fine to do 20017:52
openstackgerritMerged openstack-infra/zuul-jobs master: Ignore errors from ara generate  https://review.openstack.org/50064517:52
clarkbmordred: thanks, I also see the bug now. But have more questions about the relationship between zuul/ansible/library/ and zuul/ansible/callback17:53
mordredjeblair: oh - right - sorry - in my brain that playbook was a playbook in zuul-jobs17:53
mordredclarkb: sweet! questions are good17:53
clarkbmordred: so I was mostly looking in library. and command.py there writes to /tmp/uuid.log and zuul_console.py in library reads that and streams it on 1988517:54
clarkbmordred: so I guess the question is where does the callback fit in if we are already writing to the file and streaming it17:55
mordredjeblair, dmsimard: in any case, I agree that that's not how we should do this - I think the limited base job in project-config that just does ssh keys, stream and log collection, then a job in zuul-jobs or anywhere else can use that safely17:55
dmsimardmordred: yeah I'm sending a patch for exactly that in a moment17:55
mordredclarkb: yah - so - command.py in library is ACTUALLY the thing you want to be thinking about from library17:55
mordredclarkb: that runs on the remote node when command or shell tasks are run17:56
mordredclarkb: and yes, it logs to a local file17:56
jeblairdmsimard: note (this is based on your gist) that the base job doesn't need to be node-specific.  you can specify a default, or even omit the nodes section entirely from it.  either way, the job or jobs you use to iterate on this in the zuul-jobs repo can specify a nodes section with the labels you want17:56
mordredclarkb: at the top of the pre-playbook in the base job we run zuul_console which forks off a daemon that reads the files that the command tasks write to disk17:56
mordredclarkb: that daemon also listens on port 19885 for incoming connections17:57
clarkbright and that is in library/ as well17:57
mordredclarkb: zuul_stream in callbacks runs as part of the ansible-playbook process on the executor17:58
mordredclarkb: one instance of it is created per ansible-playbook invocation, and ansible-playbook calls its methods as things happen on the executor17:58
dmsimardjeblair: I was actually wondering -- what I'm doing amounts to integration testing the base playbooks against all distros. Should I do one job with 5 nodes? One of each distro?17:58
clarkbmordred: and the callbacks are the aggregation point?17:59
mordredclarkb: yes18:00
mordredclarkb: as tasks start and stop the callbacks get methods called - and then in zuul_stream if we notice that it's a command or shell task, we spin up a thread to connect to the port of the daemon on the remote node and read the log stream from it18:00
mordredclarkb: as we collect that content it is written to local disk on the executor in job-output.txt - which is what the finger daemon reads from when you hit it18:01
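For readers following along, the streaming path mordred describes can be sketched roughly as follows. This is a simplified illustration, not the actual zuul_stream/zuul_console code: it assumes a bare-bones protocol in which the client sends a log identifier and the daemon streams lines back, whereas the real implementation also handles per-task threads, reconnects and timestamps.

    import socket

    def stream_console_log(host, log_id, port=19885):
        """Yield log lines from a zuul_console-style daemon on a test node.

        Sketch only: assumes the daemon accepts an identifier plus newline
        and then streams the matching console file until the task finishes.
        """
        with socket.create_connection((host, port)) as sock:
            sock.sendall((log_id + '\n').encode('utf-8'))
            buf = b''
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                buf += chunk
                while b'\n' in buf:
                    line, buf = buf.split(b'\n', 1)
                    yield line.decode('utf-8', errors='replace')

    # On the executor side, the callback would prefix each yielded line with a
    # timestamp and node name and append it to job-output.txt, which is what
    # the finger daemon serves.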
jeblairdmsimard: you can; though 5 jobs each on one node is both easier for nodepool to supply and easier for humans to parse test results.18:02
*** harlowja has quit IRC18:02
dmsimardjeblair: in zuul could I set up a job-template and then expand the template with the node types ?18:03
dmsimardI see in the docs there's a notion of job template but it's not very fleshed out18:03
jeblairdmsimard: there's no job-template in zuul v3.18:03
dmsimardjeblair: ah, er, I mistook for project template I guess18:03
jeblairdmsimard: instead, make a job definition, then make 5 jobs that inherit from it, each with a different node type18:04
dmsimardjeblair:18:04
dmsimardjeblair: that's what I was doing but it felt a bit verbose18:04
jeblairdmsimard: something like https://etherpad.openstack.org/p/Gdqs1NlMIQ18:06
jeblairdmsimard: also, we should probably put these jobs in openstack-zuul-jobs rather than the zuul-jobs repo18:06
jeblairdmsimard: the actual labels we're testing against are a little openstack-specific18:06
clarkbmordred: and is _log_message() there to record non command/shell logs?18:07
clarkbmordred: there is both _log and _log_message in the callback and trying to figure out why we need both18:07
jeblairdmsimard: (but if it's faster to iterate against zuul-jobs for now, we can do that, and just avoid landing the changes for the moment)18:08
pabelangerze02.o.o looks to be working18:12
dmsimardjeblair: I didn't even know that openstack-zuul-jobs was a thing18:12
pabelangermordred: https://review.openstack.org/500201/ so do we want openstack-doc-builds for shade and zuul? Or will it be tox-docs ?18:14
clarkbmordred: actually in v2_runner_on_skipped we use both18:15
mordredclarkb: _log is lower level - _log_message is a convenience wrapper18:18
dmsimardDo we need to use the new depends-on syntax for v3 ?18:20
dmsimardOr can we still just use the gerrit changeid ?18:20
dmsimarddocs seem to suggest it's just gerrit changeid but I recall a certain thread mentioning it would change (to support gerrit and github side by side for example?)18:23
jlkI know github only supports the new method18:27
jlkI think gerrit supports both?18:28
dmsimardah so it'd be driver/backend specific18:28
* dmsimard digs in code18:28
jlkah crap. Something broke local logging18:28
jlkzuul-scheduler_1     | Error grabbing logs: invalid character '\x00' looking for beginning of value18:28
jlkand I'm not getting things logged to console18:29
dmsimardjlk: yeah you're right it's driver specific18:29
mordredclarkb: also - fwiw, the entire zuul_stream file needs to be refactored - but have been putting that off18:31
mordreddmsimard: we also have not yet implemented cross-driver depends-on - that's a post-ptg thing18:32
mordreddmsimard: so you can't (yet) depends-on a github change from a gerrit change or vice-versa18:32
mordreddmsimard: we _definitely_ want to add that though18:32
dmsimardjlk: I think jeblair had a patch to fix some junk whitespace issue18:32
dmsimardjlk: https://github.com/openstack-infra/zuul-jobs/commit/a35c2ad35ed4aa5be85194d8bcf419bb0025272f18:32
dmsimardnot sure if it's related18:32
jlknot related, this is well before job running18:33
dmsimardmordred: right, I was wondering if inside gerrit we could keep using depends-on: <changeid> which seems to be the case so that's okay18:33
jlkmordred: we should add, if we haven't already, the ability for gerrit to USE the new syntax, even if it just refers to itself.18:33
jlkso that the same syntax between github and gerrit can be used18:34
dmsimardmordred: btw, the minimal playbook thing: https://review.openstack.org/#/c/501368/18:35
mordredjlk: I believe we have18:36
fungiclarkb: does pabelanger's followup change address your comment on 500990?18:37
dmsimardjlk: ah, yes, I guess that's what I meant -- if it was necessary for gerrit to use the new syntax against itself18:38
fungidmsimard: the proposal i recall was that for the gerrit trigger we would support change-id format as a means of backward-compatibility, but deprecate it and encourage everyone to switch to the new url-based format18:42
jlkansibot is about to be talked about in the ansible contributor thing18:44
jeblairfungi, dmsimard, jlk: yes; we haven't had a chance to pull the gerrit syntax forward yet; that'll come with cross-source depends18:45
jeblairmordred, jlk: no i don't think gerrit supports the new syntax yet18:45
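To summarize the state described above: the gerrit driver still keys off the Change-Id footer, the github driver only understands the URL form, and the plan is for the URL form to work everywhere once cross-source dependencies land. The values below are placeholders, only meant to show the two footer shapes:

    Depends-On: Ideadbeefdeadbeefdeadbeefdeadbeefdeadbeef      (Change-Id form, gerrit driver)
    Depends-On: https://github.com/example/project/pull/42     (URL form, github driver)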
dmsimardjlk: there's no livestream/hangouts/whatever we can stalk in I guess ?18:45
jeblairdmsimard: it's on the etherpad: https://public.etherpad-mozilla.org/p/ansible-summit-september-2017-core18:46
dmsimardoohhh18:46
jlkdmsimard: yup, bluejeans, IRC18:47
fungijlk: i _so_ hope ansibot is a bot for making ansi-escape-based art and animations18:47
dmsimardfungi: it's what helps the ansible maintainers keep their sanity with the github workflow of issues and pull requests :)18:48
pabelangerjeblair: mordred: fungi: do we have syntax for zuul client to enqueue-ref on a periodic pipeline?18:48
jeblairpabelanger: "zuul --help"?18:49
fungipabelanger: i want to say last time i looked, enqueue-ref didn't work with periodic? or maybe i just haven't tried recently18:49
fungiespecially since there is no ref for periodic jobs18:49
jeblairthere is in zuul v318:49
fungiooh18:49
jeblairso try it and if it doesn't work fix it :)18:49
pabelangerk, will see if I can figure it out18:50
jeblairfungi: thus ending the requirement that periodic jobs bake in their branch; you can just use a regular gate job in periodic and it gets a branch just like any other18:50
fungii missed that innovation18:51
fungithat'll be quite handy18:51
jeblairyou may have been on vacation :)18:51
fungii may have. i seem to do that a lot18:51
pabelangerokay, I think https://review.openstack.org/500626/ is ready for final review, if everybody is okay, I can +A up to 50062618:52
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure  https://review.openstack.org/50134519:01
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline  https://review.openstack.org/50135319:01
jlkWell, I have a thread started for reading things from the queue, and it read at least one from the queue. Not sure why it's not reading more from the queue.19:03
jeblairpabelanger: wfm19:04
jeblairShrews: i'll look at cloner after lunch19:04
pabelangerjeblair: mordred: Shrews: any objections on bringing online nl02.o.o? Currently waiting on some code reviews, so can shift to standing up infra stuff19:05
jeblairpabelanger: go for it19:05
mordredpabelanger: do it19:05
pabelangerk19:05
Shrewsdooo eeet19:12
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128119:28
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128119:29
dmsimardbah, doing a depends-on a review that hasn't merged in project-config doesn't work :)19:33
dmsimard(in v3)19:33
dmsimardwhich I guess is okay, once again security wins19:33
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing  https://review.openstack.org/50139019:33
jlkmordred: jeblair: ^^ that introduces the eat and queue model for github events. Sending events now gets a 200 back significantly faster19:35
openstackgerritClark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs  https://review.openstack.org/50139419:36
clarkbmordred: ^ something like that I think19:36
jlkmordred: jeblair: so from cold start, new method is 1.2 total seconds to get to a 200, and then after that it's .4 on repeated events.   Old method was 4.35 on cold, and then 1.2 to 2.9 to whatever on repeat events.19:39
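The split jlk describes amounts to doing as little as possible in the webhook handler and deferring validation and processing to a worker thread. A rough sketch of that shape, using hypothetical names rather than the actual driver code:

    import json
    import queue
    import threading

    event_queue = queue.Queue()  # thread-safe handoff between web and worker

    def dispatch(event):
        # placeholder for handing the event to the scheduler / driver proper
        print('would dispatch', event.get('type'))

    def handle_webhook(headers, body):
        """Ingest: parse just enough to enqueue, then return 200 right away."""
        event = {'type': headers.get('x-github-event'),
                 'payload': json.loads(body)}
        event_queue.put(event)
        return 200

    def process_events():
        """Worker: validate, drop events for unknown projects, then dispatch."""
        while True:
            event = event_queue.get()
            try:
                dispatch(event)
            finally:
                event_queue.task_done()

    threading.Thread(target=process_events, daemon=True).start()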
* jlk lunches19:40
*** olaph has quit IRC19:50
*** olaph has joined #zuul19:51
mordredclarkb: reading19:53
mordredjlk: reading19:53
* mordred reads in parallel19:53
*** olaph1 has joined #zuul19:55
*** olaph has quit IRC19:56
mordredclarkb: one comment - otherwise looks great19:58
*** olaph1 is now known as olaph20:00
mordredclarkb: unfortunately ansible does not log issues in the callback plugins particularly well20:00
mordredclarkb: I just had an idea of a thing we can do about that in this context though...20:01
pabelangerokay, I am stopping nl0120:02
clarkbmordred: the test failures appears to be in syncing the job-output.json file though. Not sure its related to my change20:02
clarkbbroken pipe (32) from rsync20:02
pabelangerand nl02.o.o started20:03
openstackgerritClark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs  https://review.openstack.org/50139420:04
pabelangercool, nl02.o.o is responding to requests20:04
jeblairShrews: clone mapper looks good to me.  we may want to shift that over and merge it into zuul-jobs instead of zuul.  stick that in as a templated file and make a role that copies it over top of the zuul-cloner that's baked into our images.20:05
jeblairmordred, clarkb: ^ one of you want to look that over (https://review.openstack.org/500922)20:05
pabelangerHmm, I've just noticed nl02.o.o doesn't have swap setup20:05
jeblairjlk: cool!  though zuul seems to be expressing some displeasure with the unit tests with your patch.20:07
jlkruh roh20:07
mordredjeblair, jlk: is queue.Queue inherently threadsafe?20:07
jlkdunno, it's what Gerrit driver uses20:07
clarkbmordred: ya20:07
mordredcool20:07
mordredit was just putting and getting from a multi thread context without explicit locks, so I figured I'd ask :)20:08
clarkbthat said if using asyncio I think there are special queue objects for it (that are not thread safe because no proper threads)20:08
mordredclarkb: yah - that'll be a whole other thing for later20:08
jeblairmordred: yes20:10
clarkbjeblair: looking at 501345, curious how there were tests that failed outside of that change, but you got it to pass only by modifying tests in that change20:10
clarkbjeblair: are the test's side effecting each other?20:11
jeblairclarkb: no the addition of check: jobs: [] fixed those20:11
mordredclarkb: although my first hunch is that the zuul-web webhook will do what this one is doing except instead of self.connection.addEvent() it'll do "gearman.submitJob('addGithubEvent', background=True)" or something20:11
jeblairclarkb: basically, the fix tightened up when we report on things not in pipelines; some of those tests then needed their project to be in a pipeline.  so i added them to the check pipeline with no jobs.20:11
jeblairthat's a thing in zuul v3, more or less exactly for this.  :)20:12
pabelangerjeblair: clarkb: fungi: mordred: okay, nl02.o.o is running, but without swap. I am thinking of leaving it for now, rebuild nl01.o.o under xenial, validate swap is working, swap back to nl01.o.o then fix nl02.o.o. any objections?20:12
pabelangerotherwise, I can roll back to nl01.o.o first, and fix swap on nl02.o.o20:12
jeblairpabelanger: wfm20:12
clarkbjeblair: oh I see other tests not touched in that change are also using the in-repo fixture20:13
jeblairmordred: agreed; gearman should be the queue in next iteration20:13
mordredjeblair: yah20:13
jeblairclarkb: yep20:13
fungipabelanger: sounds fine. nl02 doesn't seem to be under any memory pressure20:13
jeblairmordred: want to re +2/+3 501345 and child?20:14
mordredjeblair: I do!20:14
jeblairmordred, pabelanger: we have not restarted since the command.py (utf8 log streaming) patch landed, correct?20:15
clarkbI've got another log streaming change https://review.openstack.org/501394 that would be nice to get in for better formatted logs (though utf8 fix is definitely higher priority)20:16
jeblairclarkb: ack20:16
* fungi reviews20:16
pabelangerjeblair: I am not sure20:16
mordredjeblair: I restarted zuul first thing this morning iirc - but not since then20:17
jeblairmordred: do you know if the stream change had landed?  i think you approved it first thing this morning as well, so unsure which first thing was first :)20:18
fungiclarkb: did you mean rstrip where you used rsplit?20:18
clarkbfungi: yes I most certainly did20:18
* clarkb fixes20:18
* fungi suddenly feels useful20:18
jeblairthere *is* an rsplit20:18
mordredjeblair: I do not remember20:19
jeblairmordred: okay, let's just land clarkb's thing and restart anyway20:19
openstackgerritClark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs  https://review.openstack.org/50139420:19
mordredjeblair: ++20:19
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix dynamic dependent pipeline failure  https://review.openstack.org/50134520:23
jlkhrm I think something isn't closing the thread.20:26
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add test for dependent changes not in a pipeline  https://review.openstack.org/50135320:28
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add zuul-cloner shim  https://review.openstack.org/50092220:30
pabelangerfungi: clarkb: do you mind reviewing https://review.openstack.org/500990/ again, it has related patches needed for zuul and publishing afs docs20:30
pabelangerthat will stop overwriting http://docs.openstack.org/infra/zuul/ with zuulv3 docs20:31
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6  https://review.openstack.org/50126620:32
Shrewsjeblair: ack. will do the env var thing, then shift it to zuul-jobs20:32
clarkbpabelanger: done20:32
mordredclarkb, pabelanger: https://review.openstack.org/#/c/501381/ while we're at it20:33
jeblairjlk: ah i think i see the issue20:34
jlkoh good! I haven't found it yet20:34
jeblairjlk: tests/base.py line 213720:35
jeblairjlk: that makes sure that the tests wait for the gerrit connector event queue to empty before deciding that the system is stable (in waitUntilSettled)20:35
dmsimardah, eh, ew ?20:35
jeblairjlk: we need to add the github event queue to that list as wel, a few lines down.20:35
dmsimardansible_distribution returns "openSUSE Leap" with an actual space in it20:35
jeblairShrews: or maybe if it ends up being a templated script, you could just template that in instead of env-varring20:36
* dmsimard uses os_family for suse20:36
jlkI think that line number is off?20:36
jeblairjlk: maybe; it's in         def getGerritConnection(driver, name, config):20:36
jeblairjlk:             self.event_queues.append(con.event_queue)20:37
jlkoooh I see20:37
jeblairi should really make an emacs macro to open the current line in cgit20:37
clarkbdmsimard: and I think for tumbleweed it may be just "tumbleweed" ? family sounds like a good idea20:38
dmsimardclarkb: facts: http://logs.openstack.org/67/499467/8/check/gate-tempest-dsvm-neutron-full-opensuse-423-nv/e9b97c3/logs/ara/host/0962bf05-4d54-41b1-8c1d-49bc318e9f33/20:38
jeblairclarkb: mordred: stream change https://review.openstack.org/501394 failed20:41
clarkbhrm this time it actually failed and wasn't post sync failing20:42
clarkbaha I see why20:43
jeblairclarkb: change worked, test needs updating20:43
clarkb2017-09-06 20:33:09.738370 | node1 |     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 <- yup20:44
jlkoh haha, I have to remove the ping event handler too20:45
openstackgerritClark Boylan proposed openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs  https://review.openstack.org/50139420:46
mordredclarkb: nice20:46
clarkbI skimmed the other greps too and ^ appeared to be the only two that needed indentation20:46
mordredclarkb: also - thanks for fixing that - my eyes hadn't realized what was going on - the update looks great20:46
jeblairi went ahead and +3d it; if folks see issues in http://logs.openstack.org/94/501394/3/check/zuul-stream-functional/f889e56/stream-files/stream-job-output.txt we can block it before it merges20:47
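The behaviour clarkb's change preserves is easy to demonstrate: str.strip() eats the leading indentation that console output such as ip link listings relies on, while str.rstrip() only trims the trailing whitespace and newline. A quick check:

    line = '    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n'
    assert line.strip() == 'link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00'
    assert line.rstrip() == '    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00'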
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128120:49
pabelangerclarkb: dmsimard: do you happen to have an idea how to automate that process? I'm happy to write the code, but I ended up just modifying make_swap.sh manually to create it on xenial server. Is this an issue for devstack-gate, if so, maybe we just update launch-node to use that role now20:51
clarkbpabelanger: make_swap.sh in system-config is independent of devstack-gate completely iirc20:51
pabelangerclarkb: dmsimard: sorry, this should have been in #openstack-infra for swap issue20:51
*** jkilpatr has quit IRC20:54
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing  https://review.openstack.org/50139020:59
jlkmordred: jeblair: fixed!20:59
jeblairjlk: lgtm; let's see what zuul thinks!  :)21:00
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Actually use fetch-stestr-output in unittests base job  https://review.openstack.org/50144121:03
*** jkilpatr has joined #zuul21:08
mordredjeblair: have we restarted zuul yet with your dependent fix? or are we waiting for clarkb's?21:15
mordredclarkb: btw - that failed again21:15
clarkbmordred: wat21:16
clarkbmordred: host key verification failed21:17
clarkbdont think I touched that21:17
mordredoh - that's not great21:17
mordredyou didn't21:17
fungimordred: jeblair: looks like puppet updated the zuul install on zuulv3.o.o 8 minutes ago... time to restart?21:17
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Split github hook ingest and processing  https://review.openstack.org/50139021:18
fungioh, we don't have the rstrip console change in yet21:18
mordredclarkb: so - it did the hostkey role - http://logs.openstack.org/94/501394/4/check/zuul-stream-functional/868fc52/job-output.txt.gz#_2017-09-06_20_56_20_71435521:19
jeblairmordred, clarkb, fungi: it looks like my stream fix merged before the last restart, so i've resumed poking at devstack while waiting for clark's to merge21:20
mordredjeblair: awesome21:21
mordredclarkb: 91.106.198.111 floating ips21:21
mordredclarkb: these nodes have http://logs.openstack.org/94/501394/4/check/zuul-stream-functional/868fc52/zuul-info/inventory.yaml21:21
mordredfloating ips (see interface_ip in the inventory)21:21
fungijeblair: yeah, the stream encoding fix is installed on the server already, i did see it on there21:22
mordredbut multi-node-known-hosts doesn't seem to be adding those21:22
pabelangerokay, I've rolled back to nl01.o.o, which is xenial21:23
fungioh weird... multinode over fip bug?21:23
mordredwell - bug in our role that sets it up so that all the nodes can ssh to each other21:24
mordredI see a fix - one sec21:24
jeblairclarkb: can you help me out with a suggestion regarding https://review.openstack.org/45149221:25
jeblairclarkb: the devstack job is running into this: http://logs.openstack.org/02/500202/18/check/devstack/9eb2549/job-output.txt.gz#_2017-09-06_21_08_47_09575821:25
clarkboh hrm21:26
*** yolanda has quit IRC21:26
jeblairwhat creates the mirror_info.sh file?21:26
*** yolanda has joined #zuul21:26
clarkbnodepool ready script iirc21:26
clarkbwhich we dont have in v421:27
clarkb*321:27
clarkbso maybe we just need a pre task that drops that in? I think mordred was looking at that?21:27
jeblairyeah, in theory we could do it in the configure_mirrors role21:29
jeblair(cc dmsimard, pabelanger ^)21:29
pabelangerokay, and nl02.o.o is backonline too. So we are running 2 nodepool-launchers right now, are we good with that?21:29
jeblairthough it's also worth asking: is that the way we want to handle this?21:29
jeblairclarkb: why isn't that in devstack-gate?21:30
jlkOooh, hopefully next restart of zuul includes the github change I just made, so we can see if it reduces the timeouts.21:30
dmsimardWe'll have to keep the file for the time being for backwards compat21:30
jeblairclarkb: (that == the add-apt-repo)21:30
clarkbjeblair: because devstack needs it to function21:30
dmsimardAnd yes, we can likely set it up through config mirror21:30
clarkbold libvirt just isnt reliable21:30
jeblairclarkb: yeah, but there's a big "if running in gate" block there...21:31
jeblairclarkb: so why not do the "if running in gate" block in devstack-gate, and then... hopefully add-apt-repo noops in devstack? :)21:31
clarkbya we could have devstack skip if some uca repo exists21:32
jeblairso to make this a really high-level question: is /etc/ci/mirror_info.sh an API that we want to support for openstack21:33
jeblairer21:33
jeblairfor jobs in openstack-infra21:33
pabelangerOh, /etc/ci/mirror_info.sh. Ya, we'll have to create that file today. But we could write them as facts on disk moving forward21:33
jeblairor is there something more ansiblish/v3 we could do.21:33
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Handle floating ips in multi-node-known-hosts  https://review.openstack.org/50145921:33
jeblairlike what pabelanger just suggested21:33
mordredclarkb, jeblair, pabelanger: ^^ thatshould fix the ssh hostkey task for our floating ip clouds21:33
pabelangerya, we could write them into /etc/ansible/facts, but haven't thought what that would look like21:34
jeblairokay, so maybe we should add /etc/ci/mirror_info.sh to configure_mirrors role for now, and think about alternatives later21:35
mordredyah - so - I think short-term we _definitely_ have to write that file out, because a metric truckload of people consume mirror_info.sh aiui21:35
jeblairreally?21:35
mordredyah - people use it in things like inside-of-docker-images-in-kolla21:35
clarkbya and dib builds and such21:35
clarkbit's how we communicate "this is where you find things"21:36
jeblairmordred: well 14 projects do :)21:36
clarkbparticularly useful if say building ubuntu image on centos21:36
*** olaph1 has joined #zuul21:36
mordredyah - at least "some"21:36
mordredmaybe not metric truckload - but  maybe an imperial one21:36
jeblairwho wants to add that?21:36
jeblairdmsimard: does it make sense for you to work that into your current effort?21:36
dmsimardSure21:37
*** olaph has quit IRC21:37
fungiit seems worth changing soonish after cut-over, and we ought to be able to find consumers of it pretty easily with git grep (or codesearch.o.o in case they're calling into it from scripts in their repos)21:37
mordredit can almost certainly be a fairly easy cut/paste from the existing configure_mirror.sh just replacing the here-doc with a template21:37
jeblairdmsimard: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/scripts/configure_mirror.sh#n7321:37
dmsimardI'm familiar with that file, yes, we hack it in review.rdo :)21:38
jeblairdmsimard: cool thanks :)21:38
pabelangerya, adding to configure-mirror role +121:38
dmsimardNeed +3 on https://review.openstack.org/#/c/501368/ to unblock me though21:38
jeblairi will do a hacky thing to devstack to get past that now21:38
jeblairmordred: ^ that's you21:39
fungii definitely don't think a sourced shell snippet setting some relatively ad-hoc envvars is an api we want to support in the long term if we value our sanity21:39
jeblairmordred: (the +3 on 501368)21:39
jeblairfungi: yeah, there must be a better way.  i don't know it right now, but we'll find it :)21:40
* dmsimard Raymond H. voice "there has to be a better way"21:40
mordreddmsimard: +A21:41
pabelangerokay, nodepool-launcher looks happy. I'm moving back to testing afs publishing21:41
clarkbthe biggest problem has been discovering complete list regardless of platform21:42
clarkbbecause there are cases where ubuntu based jobs need centos repos21:42
mordreddmsimard, jeblair: for the configure mirrors thing - it probably ALSO needs to write out that list of files after the here doc21:42
mordredhonestly - I think for now we're likely better off actually just copying that entire file and then running it with NODEPOOL_MIRROR_HOST set properly in an env var21:43
mordredcause all of the putting sources.list.available.d and whatnot at the end21:44
mordredand it sets up unbound at the top21:44
*** harlowja has joined #zuul21:45
pabelanger++21:45
pabelangerthen we can itterate on it21:45
pabelangeriterate*21:45
pabelangermordred: jeblair: clarkb: fungi: https://review.openstack.org/501362 would like a review, switches zuul to publish-openstack-python-docs-infra job21:47
*** olaph1 is now known as olaph21:51
*** hashar has quit IRC21:54
mordredpabelanger: lgtm +A - also added follow up https://review.openstack.org/501475 which just caught my eye in the review21:54
SpamapSis there a way to tell zuul/nodepool that a certain job should _always_ hold its nodes?21:55
mordredSpamapS: not to my knowledge, no21:57
SpamapSI wonder how well it would work to just have jobs that don't complete for a few days.21:57
mordredSpamapS: there is a count argument to autohold though - so I imagine implementing support for that as a count=-1 or something similar wouldn't be terribly difficult21:57
mordredSpamapS: it should work as well as gearman works :)21:57
mordredSpamapS: oh - I mean, holding a node happens after the job completes though21:58
mordredSpamapS: so the only jobs-don't-complete portion would be if you're starving your available nodes by holding all of them21:58
mordredSpamapS: while you're here ... any chance you have a sec to look at / review https://review.openstack.org/#/c/501459/ ?21:59
mordredSpamapS: (since you wrote that originally)21:59
mordredclarkb: could I get an amen on https://review.openstack.org/#/c/501441 ?21:59
SpamapSmordred: I'll look in a few.22:00
mordredSpamapS: thanks!22:01
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128122:01
pabelangermordred: left -1 with comment22:01
mordredSpamapS: (as you well know the multi-host-host-keys is dense code)22:01
dmsimardjeblair, mordred: you have a hint on this "unknown configuration error" error here ? https://review.openstack.org/#/c/501281/622:02
jeblairdmsimard: btw, you don't need to define those nodesets, you can just inline them in the jobs under "nodes:" (nodes: is either a nodeset name, or a nodeset definition)22:03
jeblairdmsimard: i will go spelunking in logs22:03
mordredpabelanger: good call - I responded - but tl;dr - that says to me I'd like to refactor something, but I think we can refactor it later22:03
dmsimardjeblair: I know about the node definition but I was actually wondering if these should be defined by default in project-config to be made less redundant22:04
dmsimardjeblair: otherwise we end up re-declaring these same nodes over and over22:04
dmsimardIt probably looks too verbose because it's just one job but would end up saving lines for a dozen different jobs22:06
dmsimardI don't have a strong opinion on this one.22:06
jeblairnor do i22:06
dmsimardI was ready to do one job with 5 nodes to keep zuul.yaml clean :D22:06
mordredI mean - there's not much value in a nodeset called "ubuntu-trusty" that has a single node called "ubuntu-trusty" on label "ubuntu-trusty" - it seems that if you're adding a node to a job you should just be able to say "nodes: -ubuntu-trusty"22:06
mordreddmsimard: well, you can totally do that you know - ansible will let you :)22:07
jeblairmordred: yeah, we can make the name optional and default it to the label22:07
dmsimardjeblair: +122:07
dmsimardthat would solve the problem22:07
jeblairmostly, i just don't want to establish the idea that you have to define a nodeset22:07
jeblairto be honest, i'd rather we be *slightly* less clever at first if it means we avoid showing people the wrong way to do things :)22:08
mordredyah22:08
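What jeblair is suggesting amounts to a small normalization step when node entries are parsed, roughly along these lines (a sketch of the idea, not the eventual implementation):

    def normalize_node(node):
        """Allow a bare label; default the node name to the label."""
        if isinstance(node, str):
            node = {'label': node}
        node.setdefault('name', node['label'])
        return node

    # e.g. normalize_node('ubuntu-trusty')
    # -> {'label': 'ubuntu-trusty', 'name': 'ubuntu-trusty'}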
jeblairException: Configuration item dictionaries must have a single key22:09
mordredjeblair: I can't define two different variants with different nodes can I?22:09
jeblairmordred: sure you can22:09
mordredjeblair: cool22:09
jeblairmordred: assuming they match different things22:10
jeblairthat's sort of the primary use case for variants22:10
mordredoh - no - matching the same thing22:10
jeblair"stable runs on trusty; master runs on xenial"22:10
jeblairmordred: that's not a variant, that's a job22:10
mordrednod22:10
jeblairdmsimard: all those nodeset definitions need more indentation22:11
jeblairi'll see if i can't make that into a nice error22:12
dmsimardjeblair: I'm submitting a patchset without nodesets anyway22:12
jeblairk22:12
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128122:12
dmsimard^22:13
dmsimardthat one worked22:13
dmsimardthe jobs are scheduled22:13
dmsimardso I guess it was junk out of the nodeset config22:13
jeblairdmsimard: it was the indentation22:13
jeblairmaybe it wasn't clear, but that exception and my indentation suggestion were the result of checking the zuul log for the actual error22:14
openstackgerritDavid Moreau Simard proposed openstack-infra/zuul-jobs master: WIP: Make the base playbooks/roles work for every supported distro  https://review.openstack.org/50128122:14
dmsimardyeah, I think almost anything except "unknown configuration error" could be a good pointer22:14
jeblairthat's a specific enough error we can actually say "dude, hit tab"22:15
dmsimardlol22:15
dmsimardor space space space space22:15
dmsimardafk for a bit, I'll hack on the base roles tonight now that it's unblocked \o/22:15
dmsimardoops, looks like something might be wrong with the minimal job http://logs.openstack.org/81/501281/8/check/base-integration-ubuntu-xenial/cc1ffbb/job-output.txt.gz#_2017-09-06_22_14_56_41503622:16
dmsimardI'll look later /me afk22:16
pabelangermordred: ack22:17
SpamapSmordred: +A'd22:17
mordredSpamapS: thanks!22:18
fungidmsimard: wow! look at all those job failures ;)22:18
mordredpabelanger: oh - also - https://review.openstack.org/#/c/501246 goes along with your other one22:19
SpamapSmordred: so the use case I have is that I want to have zuul and nodepool spin up test nodes in a number of scenarios, and one of those is more of the "developer wants test nodes deployed with the latest code to test XXX" ...22:19
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Update tests to use AF_INET6  https://review.openstack.org/50126622:20
openstackgerritMerged openstack-infra/zuul-jobs master: Actually use fetch-stestr-output in unittests base job  https://review.openstack.org/50144122:20
fungidmsimard: "ERROR: Executing local code is prohibited"22:20
fungifor when you get back22:20
fungioh, i see you already found that error22:22
* fungi should read scrollback more carefully22:22
pabelangermordred: +322:22
mordredSpamapS: nod. I can understand that desire. I think we should chat about the best way to expose that22:22
mordredSpamapS: for now, you could TOTALLY fake it by adding an autohold to a job with a count of like 99999 or something22:22
SpamapSmordred: One way I was thinking to do it is to just have a job that doesn't complete until the user is done playing with the nodes.22:23
SpamapSHow does one return held nodes?22:23
mordredSpamapS: yah - that's another thing you could do - you'd have to either disable or put a REALLY long timeout on it22:23
* SpamapS has, oddly enough, never done that22:23
mordredSpamapS: you tell nodepool to delete the node22:23
fungiSpamapS: an administrator sets the node state to something else (generally delete)22:23
fungiso it's not really self-service22:24
SpamapSHm22:24
SpamapSI wonder if a better thing to do would be to just emulate zuul+nodepool with a manual provisioning playbook or something.22:24
mordredSpamapS: oh - hah. I've got an idea ...22:24
SpamapSbut that gets into pushing.. blah blah22:24
SpamapSThe reason I want this is that the job we run to run would have 5 nodes22:25
mordredSpamapS: have your job that doesn't complete until the user is done ... just create a stamp file and then wait until the file goes away22:25
SpamapSs/run to run/want to run/22:25
mordredSpamapS: so that the dev can just delete the stamp file when they're done22:25
SpamapSmordred: that's exactly what I was thinking too22:25
mordredSpamapS: and that could be just on one node22:25
SpamapSYeah just have like, a 72 hour timeout on the job and test for a stamp file at the end of the job playbook.22:26
mordredyah22:26
mordredand the 72 hour timeout is your safety net for people forgetting about it22:26
SpamapSexactly22:26
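A minimal sketch of that hold-open step, assuming it runs as the final task of the job (a shell or command task wrapping something like this), with the job timeout still acting as the real safety net:

    import os
    import time

    def wait_for_release(stamp='/tmp/hold-this-node',
                         max_hours=72, poll_seconds=60):
        """Create a stamp file and block until a developer deletes it."""
        open(stamp, 'a').close()
        deadline = time.time() + max_hours * 3600
        while os.path.exists(stamp) and time.time() < deadline:
            time.sleep(poll_seconds)

    # A developer who is done with the node simply removes /tmp/hold-this-node
    # and the job finishes normally, letting nodepool reclaim the node.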
pabelangerjeblair: mordred: interesting failure22:40
pabelangerhttp://logs.openstack.org/62/501362/1/gate/tox-py35/040f590/job-output.txt.gz22:40
pabelangerpossible related to ipv6?22:41
pabelangerthat ran on vexxhost22:41
mordredhrm22:43
pabelangeroh22:44
pabelanger2017-09-06 22:40:47,211 DEBUG zuul.AnsibleJob: [build: 72137254d1804768873127b65f5006f3] Ansible output: b'ERROR! A worker was found in a dead state'22:44
pabelangerthat ran on ze0222:44
pabelangerI don't think we are running right python there22:44
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Only strip trailing whitespace from console logs  https://review.openstack.org/50139422:45
pabelangermordred: jeblair: I am going to stop ze02.o.o because of dead state22:45
jamielennoxpabelanger: gah: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/fetch-testr-output/tasks/process.yaml#n822:45
pabelangerjamielennox: ya, we need to fix that22:46
fungilooks like clarkb's rstrip fix landed, so watching puppet apply logs now22:46
jamielennoxpabelanger: so does that always make sense in a base job like that or something that should (somehow) be performed by the client's tests?22:47
jamielennoxparticularly in a non-openstack case, is that something that the client should perform and drop into the logs folder?22:47
pabelangerjamielennox: might want to check with mordred. But, we should likely have ensure-testr role, or something like that22:49
jamielennoxwell testr is within tox right? so that should/will be installed each time22:50
jamielennoxi guess the question is does post-processing always occur in the base jobs or is it something that (somehow) the individual repo should generate?22:51
pabelangerright now we run testr in its own virtualenv on DIB22:51
pabelangerbut, it could be installed in test-requirements.txt22:51
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Handle some common yaml syntax errors  https://review.openstack.org/50148622:51
pabelangerbut, we likely want the job to ensure testr is installed and if missing, install it some place22:51
jeblairdmsimard:  https://review.openstack.org/501486  handles the syntax error you found and some more22:52
mordredjamielennox: well - for now, the general idea is that if your job can produce subunit then we should be able to get that into other forms - or alternately you can just make whatever output you want22:52
pabelangermordred: how did you add python PPA to ze01.o.o?22:53
pabelangerI thought that was in system-config22:53
fungii'm not sure what to make of the unit test failure on 50145922:53
mordredpabelanger: it's in the puppet22:53
fungihas anyone seen that yet?22:53
mordredjamielennox: but it's an area we need to work out amongst ourselves - cause there's a too-much case and a not-enough case22:53
pabelangermordred: k, I see it22:53
jeblairpabelanger, mordred: do we need to do something to get mordred's patched python3 on ze02-ze04?22:54
jeblairpabelanger: oh you're on that, sorry22:54
mordredjamielennox: I call out subunit specifically because at some point in the future when we get around to it we want to be able to have the executor snoop the output stream as it's happening and if contains subunit to notice if any tests failed so we can report that tests are _going_ to fail without waiting until the end22:54
mordredpabelanger: did I put it in a bad place?22:54
clarkbfungi: http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_68019922:54
jamielennoxpatched python3 ?22:55
jeblairjamielennox: lemme dig up links22:55
mordredjamielennox: yah - there's a crashing bug in the version in xenial - we've submitted a backport upstream22:55
pabelangermordred: no, we just need to run apt-get upgrade for some reason22:55
clarkbjamielennox: turns out that python rewrote a good chunk of dict which broke things in ubuntu's version of python22:55
pabelangeron ze02.o.o22:55
mordredhttps://launchpad.net/~openstack-ci-core/+archive/ubuntu/python-bpo-27945-backport22:56
jamielennoxoh god22:56
mordredjamielennox, jeblair: ^^22:56
clarkbits second lts release of ubuntu with broken python3 :)22:56
clarkbhard to blame ubuntu as both were bugs upstream but still painful22:56
pabelangermordred: ya, so we have newer versions of python that need to be installed via apt. Puppet isn't upgrading the PPA, because we don't have python3-dev latest any place22:57
mordredGAH22:57
pabelangermordred: I can manually do it, but we should puppet it22:57
mordredSpamapS: "Chris Halse Rogers (raof) wrote 20 hours ago: Proposed package upload rejected"22:57
mordredpabelanger: yes. we should22:57
fungiclarkb: yeah, i found the traceback, just trying to figure out how it lost the console log file22:57
mordredSpamapS: from https://bugs.launchpad.net/ubuntu/+source/python3.5/+bug/171172422:58
openstackLaunchpad bug 1711724 in python3.5 (Ubuntu Xenial) "Segfaults with dict" [High,In progress] - Assigned to Clint Byrum (clint-fewbar)22:58
pabelangermordred: so, if we have puppet manage that, it might break zuul-executor, since we need to uninstall python22:58
mordredpabelanger: why do we need to uninstall python?22:59
pabelangermordred: apt does it23:00
pabelangeroh wait23:00
pabelangermordred: ignore me23:00
clarkbfungi: happened at http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_599560 too which is a different path23:01
clarkbfungi: perhaps that tmpdir was removed?23:01
SpamapSmordred: DAMNIT23:03
pabelangerSpamapS: mordred: didn't the patch add a unit test?23:04
mordredpabelanger: I believe they rejected the upload because of a different bug that also requested an SRU23:04
SpamapSYeah23:04
SpamapSdoko piggy backed on ours23:04
*** dkranz has quit IRC23:05
SpamapSand then failed to follow the process23:05
clarkbhttps://bugs.launchpad.net/ubuntu/+source/python3.5/+bug/1682934 that one23:05
openstackLaunchpad bug 1682934 in python2.7 (Ubuntu) "python3 in /usr/local/bin can cause python3 packages to fail to install" [Undecided,Confirmed]23:05
SpamapS(our bug also explained why a zesty upload wasn't needed)23:05
mordredthanks doko23:05
jeblairSpamapS: you explained artful, but not zesty?23:05
SpamapSOh maybe. Hrm23:06
SpamapSlooks like doko is dropping that one23:06
SpamapSso I can just re-upload the one I previously produced23:06
clarkbI don't see where/how those bugs were associated23:07
clarkbother than the comment saying no have a nice day23:07
SpamapSclarkb: they were only associated by an upload that was caught in a manual-approval queue23:07
SpamapSdoko downloaded my upload, added his fix, then re-uploaded23:08
SpamapSwhich I knew..23:08
SpamapSand is not uncommon23:08
SpamapSwtf.. zesty eol'd 4/1323:09
jeblairSpamapS: oh.  maybe raof needs to update a rejectoscript.23:10
clarkbjeblair: or use some zuul gate pipeline to properly evict children :)23:10
SpamapSjeblair: I think that's manually typed in23:10
SpamapSNo I'm dumnb23:11
SpamapSdumb23:11
clarkbfungi: looking more its a logging handler called jobfile that wants to write to that job-output.txt location and fails to open that. Maybe a race in test setup?23:11
SpamapSZesty was RELEASED 4/1323:11
clarkbso eol is ~11/1323:12
clarkber that math is wrong23:12
clarkb01/13 ?23:12
clarkbI can add 9 to 4 and mod by 12 honest23:12
clarkbfungi: ya reading logs there are playbooks under that tmpdir that appear to be read fine23:13
fungihuh. vexing23:14
jeblairclarkb, fungi: do i need to look into something?  i haven't been following.23:15
fungijeblair: unit test failure on 501459 looks like a race on a console log file23:15
clarkbhttp://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_599560 and http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_68019923:16
funginot sure whether it's just a racy test or finding a bug in zuul23:16
clarkbok I don't think its a race in the log anymore23:17
clarkbwe attempt to cat the job-output.txt file when a test fails23:17
clarkbbut depending on where that assertion happens it may be completely valid to not have that file on disk23:17
fungiaha, as in perhaps too early23:17
clarkbya23:18
clarkbhttp://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_616595 appears to be a valid fail that is being caught there23:19
clarkbI'll push up a patch to not traceback if the file doesn't exist (and log the case instead)23:19
jeblairclarkb: ++23:19
fungigood eye23:20
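The shape of that fix is simple enough: guard the debugging dump of job-output.txt so a missing file gets logged instead of raising. A sketch, not the actual test helper:

    import logging
    import os

    log = logging.getLogger(__name__)

    def dump_console_log(path):
        """On test failure, print the job console log if it was ever written."""
        if not os.path.exists(path):
            log.debug('No console log at %s; the job may have failed '
                      'before writing it', path)
            return
        with open(path, encoding='utf-8') as f:
            print(f.read())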
clarkbjeblair: http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_611455 I think that is what caused it to post failure23:21
clarkband then there it post failures http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_615440 ?23:22
clarkbjeblair: but I'm not sure if the nonodeerror is expected23:22
jeblairclarkb: the nonodeerror should be fine23:22
jeblairthat's probably a periodic poll23:22
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Handle debug messages cleanly  https://review.openstack.org/50149023:23
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Switch to publish-openstack-python-docs-infra  https://review.openstack.org/50136223:23
clarkbok if not that then I don't see any other sadness, it is running the command then later it returns exit code 223:24
jeblairclarkb: hrm; the answer should be in job-output.txt.  if it got that far, it should be there23:24
clarkb[build: 30249a4eec9e429294f5cfeff0ccfd3e] Ansible output: b"<localhost> EXEC /bin/sh -c '/usr/bin/python2 && sleep 0'" then [build: 30249a4eec9e429294f5cfeff0ccfd3e] Ansible output terminated23:24
clarkbthe python2 invocation there is weird to me, is it piping python into it to execute?23:27
clarkbya that appears to be ansible's localhost connection logging that it is running python223:31
pabelangerwoot23:32
pabelangerhttp://logs.openstack.org/9f/ffee8582c2d8013b89ae5f9c82c4bec9fdd5b59f/post/publish-openstack-python-docs-infra/a389335/job-output.txt.gz23:32
clarkband that hello-world playbook is copying "hello world" into a file in that tmpdir so maybe I am back to something is off with the tmpdir23:32
pabelangerafs-docs job worked after our refactors23:32
fungilooks like the streamer indentation fix is installed on zuulv3.o.o now if anyone's up for a restart23:32
fungioh, though i guess it's actually the executors we care about there?23:33
jeblairfungi: yep, executors23:33
pabelangerze01 and ze02 please23:33
pabelangerwe're running both now23:33
clarkbdest: "{{zuul.executor.log_root}}/hello-world.txt" so could just be the log dir23:33
pabelangerwe likely should update our ansible-playbook in system-config to support zuul-executors23:34
fungiand was there a bug which is currently causing jobs not to get reenqueued when an executor is restarted?23:34
fungiany special handling i need there?23:34
pabelangerYa, I think we'll have to see why aborted jobs are not getting requeued23:35
pabelangerI can likely look into that in the morning23:35
jeblairfungi: nah, just recheck if you care :)23:35
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Add change_url to zuul dict passed into inventory  https://review.openstack.org/50149223:37
mordredpabelanger: oh good re: afs-docs!23:37
pabelangermordred: Yup, I'll finish up publish-openstack-python-docs and get it up for review, but should be able to close that out for tomorrow23:38
fungiis zuul not actually installed system-wide on the executors?23:38
fungioh, it's installed into a chroot, right?23:38
jeblairfungi: it's installed in the normal manner23:38
pabelangerzuul-executor is the service23:39
fungiweird, pbr freeze doesn't seem to think zuul is installed23:40
mordredfungi: it's python323:40
pabelangerpip3 :)23:40
mordredfungi: you're probably getting pbr from python223:40
* mordred needs to make "python3 -m pbr freeze" work ...23:40
fungimordred: gah, you're correct23:41
fungifor some reason on zuulv3.o.o that wasn't happening23:41
mordredfungi: well, to be fair it should also not be happening on ze01 - but we have, from time to time, accidentally done pip install . instead of pip3 install .23:42
mordredfungi: which means some things, like the pbr bin script, may have last been installed with pip223:42
mordredfungi: since command line entrypoints are last-installed-wins23:43
fungigot it23:44
fungiwell, anyway, i confirmed the fixed version is present on both ze01 and ze02 and restarted zuul-executor on them23:44
fungimordred: anyway, `python3 /usr/local/bin/pbr freeze` does work even if python3 -m doesn't there yet23:46
fungiso good enough for me once you reminded me zuul's not installed under the default python any longer23:46
clarkbjeblair: I think streams may be getting crossed between ansible builds (and maybe tests) http://logs.openstack.org/59/501459/1/gate/tox-py35-on-zuul/143c361/job-output.txt.gz#_2017-09-06_22_25_24_600305 notice that the build: uuid doesn't match the uuid for the work dir23:55
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Split the log_path creation into its own role  https://review.openstack.org/50149423:58
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add a role to emit an informative header for logs  https://review.openstack.org/50149523:58
