Thursday, 2017-07-20

mordredpabelanger: for zuul-jobs since the consumption is intended to be via git rather than releases I'm not sure if reno would be a win - however, I like reno a lot - so maybe it still would be?00:00
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix change history cycle detection  https://review.openstack.org/48536800:02
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add project info to JobDirPlaybook  https://review.openstack.org/48527300:02
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Make playbook run meta info less fragile  https://review.openstack.org/48528400:02
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add spacer after playbook stats rather than before playbook  https://review.openstack.org/48528500:03
pabelangermordred: ya, I like the idea of reno too00:03
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Switch from tox-linters to tox-pep8  https://review.openstack.org/48537900:04
mordredpabelanger: ^^ :)00:04
pabelanger+200:05
mordredpabelanger: I'm going to restart the executor to pick up the logging format changes00:09
mordredpabelanger: also - if you have a sec, could you look at https://review.openstack.org/#/c/485345/ ?00:10
mordredpabelanger: jeblair and Shrews +2'd it earlier, but I had to fix a puppet apply test issue00:10
pabelangermordred: +2, feel free to +3 if you get nobody else00:11
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Gather facts for tox/run playbook  https://review.openstack.org/48538000:16
pabelangerremote:   https://review.openstack.org/483987 Create openstack-py35 job with upper-constraints00:17
pabelangermordred: so those 2 are interesting ^. and I think we need to gather_facts for all our playbooks in zuul-jobs by default00:17
mordredpabelanger: whyfore? what are we missing?00:21
* mordred looks at patch00:21
pabelangermordred: i'm trying to pass ansible_user_dir into an environment variable00:21
mordredpabelanger: and we don't have that one without gather_facts?00:22
pabelangerright, facts are disabled by default00:22
pabelangerin ansible.cfg00:22
pabelangerso we always have to opt into them00:22
mordredyup00:23
mordredpabelanger: I just plopped in a comment about gather_subset ...00:23
pabelangerhttp://logs.openstack.org/27/484027/5/check/openstack-py35/a18c5ba/job-output.txt passes00:23
mordredpabelanger: actually - we should have it there00:24
pabelangerlooking00:24
mordredpabelanger: because unittests ... oh, nevermind00:24
mordredit's unittests _pre_ that has gather_facts00:24
jeblairpabelanger: you're losing me on that.  would you mind explaining the whole problem in the commit message please?00:24
mordredhttp://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/playbooks/unittests/pre.yaml00:24
pabelangerjeblair: sure I can go into more detail00:24
mordredjeblair: https://review.openstack.org/#/c/483987/9/zuul.yaml for context00:27
jeblairokay, so that's fallout from dropping zuul_workspace_dir00:28
mordredyah00:28
pabelangerright, ansible_user_dir would be the same thing00:29
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Gather facts for tox/run playbook  https://review.openstack.org/48538000:30
pabelangerokay, some more details00:30
jeblairto be fair, hardcoding /home/zuul there, or in a job variable would be about as valid.  but i feel like the problem may be likely to occur in a less openstack-specific way in the future.00:30
jeblairis it worth considering adding zuul_workspace_dir back?  or should we focus on making the facts approach better?00:31
jeblaircan we cache facts across playbook runs?00:31
mordredyah. we could switch gathering from off in ansible.cfg to on, but with gather_subset: !all00:31
jeblairmordred: seems like if the bulk of our jobs are doing that anyway (and are now about to do it twice), that may be a win.00:31
mordredgathering = smart00:32
pabelangerIt is actually an interesting question, aside from the performance issues (I am not sure how bad it is) what is the downside of having ansible variables available to jobs by default?00:32
mordredfact_caching = jsonfile00:32
jeblair(and who knows, the devstack jobs may end up doing it too, we just haven't gotten there yet)00:32
mordredfact_caching_connection = /tmp/facts_cache00:32
jeblairpabelanger: i think the reason we turned facts off was performance issues?00:33
mordredpabelanger: yah, it's an extra hit on each play - but in the zuul context I think we could turn on jsonfile fact_caching and it'll likely be pretty fine00:33
pabelangerya, I am not sure why we disabled it actually. But caching of facts does seem like a nice idea00:33
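The ansible.cfg settings mordred pastes piecemeal above (gathering, fact_caching, fact_caching_connection) would combine roughly as follows — a sketch of the discussed configuration, with the cache path being the example value from the conversation rather than a production choice:

```ini
# Sketch of the ansible.cfg settings discussed above.
[defaults]
# Only gather facts for hosts not already present in the fact cache.
gathering = smart
# Persist gathered facts as one JSON file per host between playbook runs.
fact_caching = jsonfile
# Directory holding the per-host JSON fact files.
fact_caching_connection = /tmp/facts_cache
```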
jeblairmordred: any concern with having the fact cache be in the untrusted writable work dir?00:34
jeblairi guess if a trusted post-playbook relied on a fact, an untrusted main playbook could have overwritten it beforehand...00:36
jeblairso caching may be tricky00:36
mordredjeblair: we should see what things it caches00:36
* mordred tests00:36
pabelangerya, doing the same00:37
mordredit does not cache facts set with set_fact00:38
mordredso it's only a cache of facts that are gathered by the system00:38
pabelangerya, that make sense00:38
jeblairmordred: but someone could still directly write the file00:38
pabelangerfact_caching_connection is a local filesystem path to a writeable directory, so we'd have it outside the src directory on executor?00:40
jeblairmordred: oh, i guess we can stick it in jobroot/ansible and that should work00:40
pabelangercan bwrap write to jobroot/ansible?00:41
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Gather facts smartly and cache them  https://review.openstack.org/48538400:42
mordredjeblair, pabelanger: I was thinking like that ^^00:42
jeblairpabelanger: yeah, all of jobroot is writable inside of bwrap.  the only protection for inside of jobroot but outside of jobroot/work is the ansible module path checks00:42
pabelangerjeblair: cool, had to look it up to see00:43
jeblairso if you escape ansible, you can't hose the executor, but you can alter the code that's about to run in trusted post playbooks.00:43
pabelangermordred: think you have a typo00:43
pabelanger%/.ansible00:43
mordredprobably00:43
pabelanger%s/.ansible00:43
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Gather facts smartly and cache them  https://review.openstack.org/48538400:44
jeblairmordred: do you mean to use ".ansible/" rather than "ansible/"?00:44
mordredjeblair: yah - I figured sticking it in the same place as local_tmp and remote_tmp since it's ultimately ephemeral data00:45
mordredalthough I guess I could make it be facts_tmp to match those dir names ...00:45
jeblairmordred: i don't think facts_tmp is necessary00:46
mordredkk00:46
jeblair(the other options are literally "remote_tmp")00:47
pabelanger+200:47
mordredjeblair: while I've got you here: https://review.openstack.org/#/c/485345/ and https://review.openstack.org/#/c/485379/ could both use a quick +A00:48
jeblairdone00:48
jeblairi'm going to eod now00:49
pabelangermordred: have I also mentioned this is pretty fun00:51
mordredpabelanger: ++00:52
mordredpabelanger: also - https://review.openstack.org/#/c/262597/ is the reno patch I made for zuul forever ago00:52
pabelangercool, I'll look into it in the morning00:53
pabelangerEOD now00:53
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Switch from tox-linters to tox-pep8  https://review.openstack.org/48537900:56
*** jkilpatr has quit IRC01:11
mordredhttp://logs.openstack.org/27/484027/5/check/openstack-py35/8f780a0/job-output.txt latest logging updates applied01:18
*** harlowja has quit IRC01:32
clarkbhttps://blog.sileht.net/automate-rebasing-and-merging-of-your-pr-with-pastamaker.html may be of interest to the channel02:08
mordredindeed02:10
*** harlowja has joined #zuul02:52
mordredclarkb: the promise of github seems to be allowing every dev team to write their own PR bot03:22
clarkbright03:23
SpamapSIt's the yak every dev team is dying to shave.03:44
*** harlowja has quit IRC04:18
*** harlowja has joined #zuul04:30
openstackgerritMerged openstack-infra/zuul-jobs master: Rework tox -e linters  https://review.openstack.org/48483605:14
*** isaacb has joined #zuul05:16
tobiashsounds like they actually would need BonnyCI...05:24
*** harlowja has quit IRC05:33
*** isaacb has quit IRC06:13
*** isaacb has joined #zuul06:24
*** bhavik1 has joined #zuul06:30
*** amoralej|off is now known as amoralej07:26
*** hashar has joined #zuul07:41
*** yolanda has quit IRC07:58
*** clarkb has quit IRC08:04
*** clarkb has joined #zuul08:05
*** isaacb has quit IRC08:31
*** isaacb has joined #zuul08:32
*** yolanda has joined #zuul08:53
*** yolanda has quit IRC08:54
*** yolanda has joined #zuul08:55
*** smyers has quit IRC09:04
*** smyers has joined #zuul09:18
*** bhavik1 has quit IRC10:27
*** xinliang has quit IRC10:29
*** xinliang has joined #zuul10:41
*** xinliang has quit IRC10:41
*** xinliang has joined #zuul10:41
*** jkilpatr has joined #zuul11:02
*** amoralej is now known as amoralej|lunch11:04
*** amoralej|lunch is now known as amoralej12:24
*** dkranz has joined #zuul13:18
pabelangermordred: nice work on console logs! One request: for LOOP [Collect tox logs] in your job-output.txt above, can we pretty-print the dict item to make the output easier to read?13:38
dmsimardpabelanger, clarkb, mordred: did you guys just flip the switch when you moved from jenkins to zuul 2.5 ? I forget. Was there a weird period where both of them were running side by side ? We're wondering how to make sure Zuul and Nodepool don't get confused on which target the job is meant to run13:39
dmsimardtristanC: ^13:39
pabelangerdmsimard: no, we stopped jenkins and started launchers IIRC. Don't think we ever had both running at same time13:40
pabelangerdmsimard: are you seeing an issue?13:41
pabelangermordred: also, it looks like we are missing ok between some tasks:13:41
pabelanger2017-07-20 01:04:44.850458 | TASK [bindep : Look for other-requirements.txt]13:41
pabelanger2017-07-20 01:04:44.871528 | TASK [bindep : Define bindep_file fact]13:41
pabelangerwould have expected to see ok or skipped there13:41
dmsimardpabelanger: we intended to run jenkins and zuul-launcher side by side to allow for a transition period, but it looks like when both targets are enabled, zuul and nodepool get confused about which node is requested and allocated13:42
dmsimardpabelanger: I can forward you an email since I haven't personally witnessed this13:43
dmsimardsent13:43
pabelangerright, different image names would be one way to fix that13:43
dmsimardpabelanger: that's what I thought too, see internal irc sf channel backlog13:45
pabelangerbut ya, I wouldn't expect both running at the same time to be good.  Going back into irc logs to see what we did.  But, I seem to remember we did the roll out dance a few times13:45
pabelangerdmsimard: where is your nodepool.yaml file?13:45
dmsimardpabelanger: mixture of https://softwarefactory-project.io/r/gitweb?p=config.git;a=blob;f=nodepool/nodepool.yaml;h=ee8c998c4c80b9de8ddf3e5891671ed16f880c9b;hb=HEAD13:46
dmsimardand https://softwarefactory-project.io/r/gitweb?p=software-factory/sf-config.git;a=blob;f=ansible/roles/sf-nodepool/templates/_nodepool.yaml.j2;hb=HEAD13:46
dmsimard(it's mashed by software factory when running the nodepool config update job)13:46
dmsimardthis is what nodepool list is giving: https://softwarefactory-project.io/paste/show/xLmGA1UyA1bfi4fGUT2h/13:47
pabelangerdmsimard: Ya, I'd just pin your images to either jenkins or launcher for now13:48
pabelangernot across both13:48
dmsimardpabelanger: tristanC says that wouldn't work, see #softwarefactory13:48
pabelangerthen setup a new cloud provider13:49
pabelangerand have jenkins and zuulv2.5 only access one or other13:49
dmsimardpabelanger: hmm, how would we go about that ?13:49
pabelangerboth could use the same project, just different names in nodepool.yaml. Like we do for osic-cloud113:50
dmsimardlike, I know how to set up additional providers, but how do I get only jenkins to consume this cloud and zuul the other?13:50
pabelangerdmsimard: look at osic-cloud1-s3500 and osic-cloud1-s3700 in nodepool.yaml, both are using the same cloud, project13:50
pabelangerOh, hmm13:51
tristanCpabelanger: iiuc the issue is because when there are node ready, nodepool doesn't check if they are correctly assigned to the requesting target13:51
tristanCpabelanger: even if in the logs it seems to correctly create servers and assign them to the right target when requested, we end up with lots of nodes ready in the wrong target13:52
pabelangerya, guess you'd need a 2nd nodepool too13:53
pabelangerwhich didn't know about jenkins13:53
pabelangermaybe jeblair or clarkb have an idea on how to work around that with gearman13:53
tristanCpabelanger: yes, that would work too13:54
tristanCdmsimard: iirc, openstack-infra was running jenkins and zuul-launcher with the same set of jobs13:54
pabelangerhttps://review.openstack.org/#/c/325992/13:59
pabelangertristanC: ya, you are right, we did run both in production14:02
pabelangerI'm remembering now14:02
pabelangerhowever, we wanted jobs to run on both jenkins and zuul-launcher14:02
pabelangeras not to interrupt the gate14:02
pabelangerI don't think we ever pinned a job just to launcher or jenkins14:03
pabelangerbecause we wanted to ensure the migration between launcher and jenkins worked as expected14:03
pabelangersince both were JJB, they both should run on both14:03
pabelangerdmsimard: so looping back, why do you want a job to only run on launcher?14:04
dmsimardpabelanger: I'm a scaredy cat of flipping the switch 100% from jenkins to launcher without the opportunity to test things first14:05
pabelangerdmsimard: right, so you can use both jenkins and launcher per tristanC comment, and make sure everything is 100% across both14:06
dmsimardpabelanger: and we're in a bit of a time crunch from a lot of angles so if we don't get that transition period where both are running side by side, we will need to delay the upgrade14:06
dmsimardpabelanger: well, running side them side by side was the plan, but this whole discussion is about that not exactly working as intended14:07
pabelangerdmsimard: move JJB jobs into another SF instance and test them there first?14:07
pabelangerbut ya, no easy migration path for this14:08
dmsimardpabelanger: requires time and it's a precious resource right now14:08
dmsimardespecially with folks on PTO left and right14:08
pabelangerit took us about 4 weeks to migrate to zuul-launchers and all hands were on deck14:09
pabelangerdon't want to rush it14:09
dmsimardyeah, that is what I am thinking as well -- RDO's deployment is not nearly openstack-infra scale but we still have a considerable amount of jobs, some that are fairly complicated14:10
dmsimardso I'm not confident in just flipping the switch14:10
dmsimardWe'll discuss our options, thanks14:10
*** isaacb has quit IRC14:30
jeblairyeah, we added zuul into the mix, so with 8 jenkins masters and one zuul launcher, zuul ran 1/9 of the jobs.14:44
jeblairif we saw failures, we turned off the zuul launcher, fixed them and repeated.14:44
jeblairthen as things got gradually better, we added more zuul launchers and reduced the jenkins masters.14:45
pabelangermordred: jeblair: so smart facts and bubblewrap are broken at the moment. Working on a fix; this is because the 'zuul' user doesn't exist on my local system when tox is run14:58
pabelangerand the ansible connection for localhost is defaulting to 'zuul' user14:58
pabelangerhttp://paste.openstack.org/show/616030/ is the issue15:01
pabelangerthis also means we'd be gathering facts on executors15:02
jeblairwhy does it default to 'zuul' for localhost?15:04
pabelangerjeblair: not sure yet. trying to figure that out15:05
jeblairpabelanger: i think this could cause problems for folks running zuul-executor as a different username than, well, whatever thing is causing it to use 'zuul'.  :)15:05
pabelangeragree, think we exposed a bug15:05
jeblairpabelanger: localhost facts probably aren't important; any way to turn it off?15:05
pabelangerjeblair: not sure yet, will find out15:06
Shrewsforgive my ignorance, what are "smart facts"15:06
Shrews?15:06
pabelangeransible will scan host, if missing from fact cache15:07
pabelangerhttp://docs.ansible.com/ansible/intro_configuration.html#gathering15:07
pabelangerthat is, if 'host' is missing from the fact cache15:07
Shrewsah15:08
*** isaacb has joined #zuul15:13
*** dmsimard is now known as dmsimard|cave15:14
pabelangerjeblair: SpamapS: so, if I try to run the bwrap command from our tox jobs myself, I get the following error: Can't write data to file /etc/passwd: Bad file descriptor15:41
pabelangerany idea why?15:41
jeblairpabelanger: zuul creates a new password file as a pipe and hands the fd to bubblewrap.  i don't know what could cause a problem with that.  can you paste your command and error?15:46
pabelangersure15:48
pabelangerhttp://paste.openstack.org/show/616042/15:48
pabelangererror is above is only thing I get back15:48
jeblairpabelanger: oh, you're copy/pasting a previously run bwrap command15:49
jeblairpabelanger: so you're telling bwrap to use a file descriptor that doesn't exist15:50
pabelangerjeblair: ya, I'm trying to reproduce the environment from tox15:50
jeblair--file 1000 /etc/passwd"15:50
pabelangerah15:50
jeblairpabelanger: try running the bwrap module as main.  it has a main method so you can do testing like that15:50
jeblairpython zuul/drivers/bubblewrap/__init__.py15:51
jeblairthat should take care of creating the passwd/group files and run the command you give it in bwrap15:51
pabelangerjeblair: ya, I did that and smart facts work15:53
pabelangerso I was thinking we might be doing something different somehow15:54
jeblairpabelanger: i ran tests with vvv to get the full tb: http://paste.openstack.org/show/616046/15:57
pabelangerjeblair: yup, that is the error I am seeing too from tox15:57
pabelangerit looks like zuul isn't in /etc/passwd file15:58
pabelangerbut trying to confirm15:58
pabelanger2017-07-20 12:00:51,174 zuul.AnsibleJob                  DEBUG    [build: 19831828f32042a7bfddba05a2059dd9] Ansible output: b'    "stdout": "pabelanger:x:1000:1000:pabelanger:/tmp/tmpaapfgaf3/zuul-test/19831828f32042a7bfddba05a2059dd9:/bin/bash",'16:01
pabelangerso ya, it is using my uid16:01
pabelangernot zuul16:01
jeblairwhy does ansible want to look up zuul?16:02
pabelangerI think we are running ansible-playbook as zuul user some how16:02
jeblairpabelanger: as you pointed out, you don't have a zuul user on your system, and there isn't one in the bwrap passwd file either.16:04
pabelangerjeblair: Ya, that is what is driving me crazy. I don't know where it is coming from atm16:05
jeblairright, that was my question -- what, specifically, causes ansible to do a pwnam lookup on zuul?16:07
jeblairthe code in ansible is https://github.com/ansible/ansible/blob/stable-2.3/lib/ansible/module_utils/facts.py#L55116:07
jeblairso it's getting the name to look up from getpass.getuser()16:07
pabelangerOh16:09
pabelangerI see it now16:09
pabelangerhttp://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/executor/server.py?h=feature/zuulv3#n131816:10
jeblairpabelanger: yep.  python getpass looks at LOGNAME first16:10
*** isaacb has quit IRC16:11
jeblairthe only reason i know of we hardcode logname to user is so that when we used the default ansible logging module to create the job log, it put "zuul" as the username.16:11
jeblairpabelanger: we can probably drop that now.16:11
pabelangerjeblair: cool, let me try that16:11
jeblairsince i think that has entirely been superseded by zuul_stream and friends16:12
jeblairpabelanger: the hello-world job is failing the playbook test because we're gathering facts locally, and executing local code is prohibited16:23
pabelangerya, that is what I am debuging now16:23
jeblairpabelanger, mordred: that error did not make it into the text console log, only the json one.  i think that's a bug.16:23
jeblairi'll file a story for that16:24
pabelangerso, should we allow facts to be gathered on localhost?16:25
jeblairno16:25
pabelangerk, I'll see how we can stop smart from doing that16:26
pabelangermaybe prime the fact cache with something16:26
jeblairmordred: for whenever your next cup of coffee is:  https://storyboard.openstack.org/#!/story/200112916:29
clarkbpabelanger: tristanC I do not know why you'd have nodes assigned to the wrong target. I can imagine that targets "forgetting" nodes so they leak on the nodepool side is possible though16:40
*** bhavik1 has joined #zuul16:50
pabelangerjeblair: mordred: okay, so if we prime the 'localhost' json file in fact-cache with {"module_setup": true} ansible will not run setup task17:02
pabelangerthat will stop untrusted playbooks from trying to run it on the executor17:02
*** hashar has quit IRC17:03
pabelangerand trusted playbooks will not have ansible facts I think17:03
pabelangerunless we use a different cache for trusted / untrusted?17:03
jeblairpabelanger: i don't think we want anything different between trusted/untrusted17:04
openstackgerritPaul Belanger proposed openstack-infra/zuul feature/zuulv3: Remove hardcoded LOGNAME for ansible-playbook  https://review.openstack.org/48574917:05
pabelangerjeblair: k, any thoughts about accessing ansible variables on the executor with a trusted playbook?17:07
pabelangerI think if the first playbook that runs today is trusted, localhost fact gets created and other playbooks will use it17:08
pabelangerhowever, if untrusted is first, localhost fact cache will fail17:08
pabelangerthe other interesting thing is, from an untrusted playbook, it does create the localhost json file in fact-cache, even though the job fails17:18
SpamapSjeblair: FYI, you don't need to find the bwrap driver to run its main method. it has a CLI entrypoint called zuul-bwrap :)17:21
jeblairSpamapS: oh right!  i should put a comment in there to remind myself.  :)17:24
jeblairpabelanger: i thought you were looking at pre-populating the cache with something so it doesn't run at all, and we would have the same behavior in both places.17:25
pabelangerjeblair: right, we can do that. but it means we'd never be able to use ansible variables for a playbook on the executor17:26
pabelangerwant to make sure we are okay with that17:27
jeblairpabelanger: until someone comes up with a use case for it, i'm okay with it.17:27
pabelangerok17:28
* SpamapS sets out to write a disk monitor thing17:32
*** artgon has joined #zuul17:43
openstackgerritPaul Belanger proposed openstack-infra/zuul feature/zuulv3: Gather facts smartly and cache them  https://review.openstack.org/48538417:49
pabelangermordred: jeblair: ^ should be the fix17:49
*** amoralej is now known as amoralej|off17:59
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Check out implicit branch in timer jobs  https://review.openstack.org/48532918:00
jeblairpabelanger: lgtm, thanks.  let's see what mordred says when he awakens.18:02
*** harlowja has joined #zuul18:06
*** bhavik1 has quit IRC18:35
*** amoralej|off is now known as amoralej18:59
ShrewsHas anyone started poking at the zuul auto-hold stuff yet? I might start looking into it if not.19:05
*** isaacb has joined #zuul19:20
jeblairShrews: not that i'm aware of.  there's a story for it with a plan i think should work.  and i think if you assign it to yourself, you win.  :)19:28
*** dmsimard|cave is now known as dmsimard19:28
pabelanger++ to auto-hold, would make debugging new jobs easier too19:29
Shrewsbeen skirting the boundaries of zuul code thus far, so it would be my first adventure with it. hopefully it won't end up being over my head19:33
jeblairShrews: i'm happy to help with any questions!19:39
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Check out implicit branch in timer jobs  https://review.openstack.org/48532919:49
Shrewsi <3 watching streaming logs19:53
Shrewsmuch more readable these days, too19:53
jeblairo hrm.  the url in the status page isn't switching to the log url when the job is finished.  it's supposed to do that.19:57
*** openstack has joined #zuul20:01
pabelangercan we also display the finger URL on the status page?20:01
pabelangerbut streaming is cool20:02
jeblairyes we can+should20:04
*** jkilpatr has quit IRC20:05
tobiashjeblair: I also noticed that, but didn't have time yet to look into that20:06
pabelangertobiash: feedback for html stream: have autoscroll happen if you are at the bottom of the browser, but don't autoscroll if you move up a few lines20:07
pabelangerI have no idea how you'd do that with JS20:07
* tobiash too20:07
jeblairpabelanger, mordred, clarkb, SpamapS: i'm working on tidying up things related to non-change jobs.  periodic, post, tag, etc.  can you take a look at the commit message and the docs for https://review.openstack.org/485329 and let me know if that direction looks good?20:08
tobiashpabelanger: I'm a js and html noob so this was poor mans first try on console streaming ;)20:08
pabelangertobiash: well done! much better then I could have done :D20:09
jeblairpabelanger, mordred, clarkb, SpamapS: (i know we're going to have jobs where we want to keep using the old shell variables during a transition period.  i figure we can have some ansible to do the translations for us and add that to those jobs.)20:10
*** amoralej is now known as amoralej|off20:10
mordredmorning all20:10
jeblairmordred: morning!20:10
pabelangerjeblair: sure, looking now20:10
pabelangerlook, a mordred20:10
jeblairpabelanger, mordred, clarkb, SpamapS: also, the changes to model.py there -- i've got distinct objects for ref, branch, tag, and change now.20:12
*** isaacb has quit IRC20:12
*** openstackgerrit has quit IRC20:17
*** isaacb has joined #zuul20:19
*** dkranz has quit IRC20:22
*** isaacb_ has joined #zuul20:22
*** isaacb has quit IRC20:25
mordredjeblair: change looks good in general - but I'm also still waking up20:32
jeblairmordred: no takebacks!20:34
*** dkranz has joined #zuul20:35
SpamapSjeblair: I think we'll all go a little less insane if we can change as little as possible about what gets run by jobs.. so yeah, using the old envvars, perhaps for a long time (till zuul4?) is a good idea.20:39
jeblairSpamapS: i agree, though my time horizon is perhaps a bit shorter.  the bulk of our jobs won't use the old zuul env variables out of the gate (none of the zuulv3 jobs currently do).  but yeah, i don't want to be under time pressure to get the long tail cleaned up before ptg.20:43
*** openstackgerrit has joined #zuul20:45
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: gzip console log before uploading it  https://review.openstack.org/48361120:45
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Remove hardcoded LOGNAME for ansible-playbook  https://review.openstack.org/48574920:45
SpamapSjeblair: the ones that use them are also going to be the stickiest nastiest ones that do everything we told them not to do ;)20:46
jeblairSpamapS: indeed!20:47
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Gather facts smartly and cache them  https://review.openstack.org/48538420:47
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Simplify bindep logic removing fallback support  https://review.openstack.org/48265020:48
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Simplify bindep logic removing fallback support  https://review.openstack.org/48265020:49
pabelangerYay, new facts merged20:50
mordredpabelanger: I abandoned your gather-facts change for that job20:51
pabelangermordred: ++20:52
mordredjeblair: if you get a sec: https://review.openstack.org/#/c/48537720:52
pabelangermordred: caching is fast too20:52
pabelangertried it a few times20:52
mordredthere's only 3 open zuul-jobs changes: https://review.openstack.org/#/q/project:openstack-infra/zuul-jobs+status:open20:52
pabelangerYup, I can start back up tox role now20:53
openstackgerritMerged openstack-infra/zuul-jobs master: Remove OOM check from tox role  https://review.openstack.org/48537720:56
pabelangermordred: jeblair: I'm thinking we should also remove the 'zero tests run' check. Or is that something we care about for zuul-jobs? Maybe we still want it in openstack?20:59
jeblairpabelanger: that one may be good to keep and generally useful.21:01
pabelangerjeblair: okay, let me see what the logic would look like21:02
jeblairpabelanger: i assume it will have to branch based on testr/nose/etc21:03
mordredpabelanger: I think we're probably going to want to have a "did I find a .testrepository? if so, include: testrepository.yaml" check21:03
pabelangermordred: ya, that's what I am thinking too21:04
mordredpabelanger: you know what - that might want to be in its own role too - so that if someone is writing a job that does things but doesn't use tox (like a job to just run testr without tox or something), they could add the "check-zero-tests" role to their playbook21:05
mordred(or something)21:05
pabelangerAgree21:06
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Remove nodepool DIB specific logic  https://review.openstack.org/48582421:07
pabelangermordred: jeblair: I also wouldn't object if we wanted to restart zuulv3 to pick up smart facts too21:08
mordred++21:09
jeblairpabelanger: wfm21:10
jeblairmordred: speaking of restart... did you see this from earlier?  https://storyboard.openstack.org/#!/story/200112921:10
jeblairmordred: i had to read a .json file to figure out what went wrong with a job :(21:10
jeblairmordred: (the first play failed but did not log its failure to the job log.  the second play *also* failed -- don't be distracted by that.  :)21:12
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Check out implicit branch in timer jobs  https://review.openstack.org/48532921:13
mordredjeblair: oh goodie21:19
*** dkranz has quit IRC21:20
mordredjeblair: cool. I should be able to reproduce that locally pretty easily21:21
pabelangerjeblair: should graceful restart work?21:21
pabelangerI haven't run the command yet on ze0121:21
mordredjeblair: so - unfortunately I actually CAN'T reproduce that locally21:25
mordredoh - wait21:26
mordredyup. there it is (I had fact caching turned on - so it wasn't running)21:27
mordredjeblair: turns out the fact gathering is the relevant thing21:27
mordredalthough there is another thing that's ugly ...21:28
jeblairpabelanger: i think so?21:35
jeblairpabelanger: but it might run into the pidfile issue on restart21:36
pabelangerk, I'll stop when no jobs are running21:41
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Improve display of simple error messages  https://review.openstack.org/48583321:41
mordredjeblair, pabelanger: that'll fix the story (gah, I should reference it in the commit message - one sec)21:42
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Improve display of simple error messages  https://review.openstack.org/48583321:42
mordredthat should fix the story - and also fix error messages that are just an error message from getting json-displayed21:43
jeblairmordred: merge conflicts21:46
mordredgah21:47
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Improve display of simple error messages  https://review.openstack.org/48583321:47
mordredpabelanger: your item/pprint request is ... harder21:47
mordredpabelanger: the problem is that what we're dealing with is that the dict on the ITEM line is the loop key21:49
mordredin this case, we're looping over the results from the previous task21:49
mordredso there is no "good" thing to display there21:49
pabelangerhmm21:50
pabelangerk, +3 lets not hold up the patch21:52
pabelangermordred: also, using zuul_stream in tox jobs makes it pretty hard to see what ansible is doing. I had to actually disable it in ansible.cfg today21:53
pabelangerI am not sure if we have a better story for that21:53
pabelangereg: tox -epy35 on zuul21:54
jeblairpabelanger: i turn on KEEP_TEMPDIRS and go look at the job log.21:55
jeblairpabelanger: if we want to make that more automatic, we could attach the job log as a subunit attachment21:55
mordredpabelanger: can you expand on that? like - what is it that you were missing? (also, I expect that to be better once I get the json formatting thing done)21:55
pabelangerjeblair: ya, I did have keep_tempdirs eventually. But attachment might be nice too.21:55
mordredactually - lemme do a couple of things ...21:56
pabelangermordred: it was around the gather facts stuff today. So part of it was an ansible traceback, -vvv helped, but the other part was that shell output is no longer displayed in zuul_stream. So, shell: cat /etc/passwd wouldn't return any data for me21:57
pabelangerso, I had to stop using zuul_stream and fall back to normal ansible output21:57
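The kind of ansible.cfg override pabelanger describes might look like this sketch (the zuul_stream callback is normally wired in by Zuul's executor; the plugin path shown is illustrative, not the real install location):

```ini
[defaults]
# While debugging, fall back to Ansible's stock output
# instead of the zuul_stream streaming callback:
stdout_callback = default
# callback_plugins = /path/to/zuul/ansible/callback   # commented out for the test run
```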
mordredpabelanger: wait - that's a bug21:57
mordredshell output should ABSOLUTELY be displayed in zuul_stream21:58
pabelangerAh, I thought we disabled it for some reason21:58
mordredthat's like the entire reason it exists21:58
mordredis to live-stream shell output :)21:58
pabelangerlet me test it again here locally with your latest changes21:58
jeblairi worry that we're not all on the same page here21:59
jeblairso at the risk of stating things people may already know:21:59
mordredcool - please do - and yes, any time that you have to disable something, or hold a temp dir or look in the json file to figure out what went wrong with a job, please point that out21:59
jeblairi think pabelanger is concerned with running tests on zuul itself.  so not a running production instance of zuul, but rather what's happening inside of a unit test.22:00
mordredAH. gotcha22:00
mordredthank you22:00
mordredwe were not on the same page22:00
pabelangerjeblair: yes, thanks22:01
pabelangerze01 has been restarted22:04
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Remove gather_facts: true  https://review.openstack.org/48583422:06
*** jkilpatr has joined #zuul22:07
pabelangermordred: we likely can drop the validate-host gather-facts logic too, and maybe just use the fact_cache files instead?22:08
mordredpabelanger: yah - you mean the gather_facts at the top of playbooks/unittests/pre.yaml ?22:10
mordredoh - you mean the setup call22:10
pabelangerya: http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/validate-host/tasks/main.yaml#n1622:11
pabelangerthat _should_ be the same as cached facts right?22:11
pabelangerwe could have a post job to add it into logs folder22:11
mordredpabelanger: yup. I agree22:11
mordredpabelanger: we can ALSO remove the gather_facts: and gather_subset: lines22:12
mordredpabelanger: (we're also doing validate-host in both base and unittests atm)22:12
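The approach pabelanger and mordred are converging on can be sketched as a playbook that skips its own fact gathering and relies on facts cached by an earlier play. The cache location and the variable used are illustrative, not the actual zuul-jobs setup (though `ansible_user_dir` is the fact pabelanger mentioned wanting earlier):

```yaml
# ansible.cfg would enable a JSON-file fact cache along these lines:
#   [defaults]
#   gathering = smart
#   fact_caching = jsonfile
#   fact_caching_connection = /var/cache/ansible/facts
- hosts: all
  gather_facts: false        # skip the per-playbook setup call
  tasks:
    - name: Use a fact populated by an earlier gather
      debug:
        msg: "User home is {{ ansible_user_dir }}"
```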
pabelangermordred: ya, I see that now too22:12
pabelangerbug?22:12
pabelangerour logs are looking real good now too22:13
pabelangermuch nicer to debug with22:13
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Improve display of simple error messages  https://review.openstack.org/48583322:20
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Remove validate-host from unittests/pre.yaml  https://review.openstack.org/48583622:23
pabelangermordred: openstack-py35 now passing on shade: http://logs.openstack.org/27/484027/5/check/openstack-py35/1b33d30/job-output.txt22:23
pabelangermordred: still need to rework the job based on your feedback still22:24
jeblairoO22:24
pabelangeralso, still22:24
mordredwoot!22:24
*** dkranz has joined #zuul22:25
pabelangermordred: I would have expected TASK [Gathering Facts] to return ubuntu-xenial | ok in that log too. But holding off until ze01 is restarted again with the latest changes22:25
pabelangerjeblair: did you mention you were working on a fix for the logs.o.o link on the zuulv3.o.o status page after a job finishes?22:28
pabelangeralso, https://review.openstack.org/#/c/483611/ could use a review too. gzip our job-output.txt file22:29
jeblairpabelanger: no fix, just observed the issue22:29
openstackgerritMerged openstack-infra/zuul-jobs master: gzip console log before uploading it  https://review.openstack.org/48361122:33
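The effect of the just-merged change could be sketched as an upload-phase task along these lines (the filename comes from the log above; the `log_dir` variable is hypothetical, not the actual role variable):

```yaml
- name: gzip console log before uploading it
  command: gzip job-output.txt
  args:
    chdir: "{{ log_dir }}"   # hypothetical: wherever job-output.txt was written
```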
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Check out implicit branch in timer jobs  https://review.openstack.org/48532922:37
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584022:49
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: WIP: Check out implicit branch in timer jobs  https://review.openstack.org/48532922:55
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584022:56
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584023:03
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Check out implicit branch in timer jobs  https://review.openstack.org/48532923:03
jeblairpabelanger, mordred, SpamapS: ^ that's in final form for review23:03
jeblairalso, it's 50% docs.  :)23:04
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename tags job variable jobtags  https://review.openstack.org/48584523:06
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584023:07
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: WIP: Move subunit processing  https://review.openstack.org/48584023:24
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Rename uuid to build  https://review.openstack.org/48585123:32
* SpamapS getting into nitty gritty of config file changes for a disk monitor23:39
SpamapSThis feels like something that would possibly be unique to each executor.23:40
SpamapSThough that may also not really be true in practice, since it would be annoying if sometimes your job works, and sometimes it fails, because sometimes it hits the big executor and sometimes the small one.23:41
pabelangerjeblair: mordred: going to restart ze01 again to pick up latest logging changes23:41
mordredcool23:41
jeblairSpamapS: i think we also had the idea of a jobs-per-executor limit.  so fine tuning might involve considering those values together.23:43
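Purely as illustration of the tuning knobs SpamapS and jeblair are weighing — a per-executor disk threshold plus a job-count cap — a config sketch might look like this. The option names here are invented for the discussion, not real Zuul settings:

```ini
[executor]
# Refuse new jobs when free disk on the job-dir filesystem drops below this:
min_avail_disk_mb = 2048
# Hard cap on simultaneous jobs on this executor:
max_concurrent_jobs = 20
```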
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Add zuul.items to job vars  https://review.openstack.org/48585323:44
mordredjeblair: your stack looks good - but there's a missing word in the docs in the first patch23:48
SpamapSjeblair: Right, what would limit an executor from overloading itself now? Nothing?23:54
jeblairmordred: ugh.  maybe a followup at this point? :)23:58
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Log items in loops better  https://review.openstack.org/48586423:58
mordredjeblair: yah. that's what I was thinking23:59
mordredpabelanger: ^^ that should make your item/loop thing from earlier better23:59
jeblairSpamapS: indeed.  we spun up zl08 recently because we think we overloaded our v2 launchers when running at full capacity (we were at 213 simultaneous jobs, now down to 187)23:59
pabelangerokay, ze01 restarted23:59
