Monday, 2017-06-05

openstackgerritWang Xuguang proposed openstack-infra/zuul master: Using assertIsNone(xxx) instead of assertEqual(None, xxx) Using assertTrue(xxx) instead of assertEqual(True, xxx)  https://review.openstack.org/47079601:56
*** cinerama` has quit IRC02:02
*** cinerama has joined #zuul02:06
openstackgerritTuan Luong-Anh proposed openstack-infra/zuul master: Replace assertEqual(None, *) with assertIsNone in tests  https://review.openstack.org/25725103:40
*** dougbtv_ has quit IRC06:17
*** dougbtv_ has joined #zuul06:24
*** cinerama has quit IRC06:50
*** SpamapS has quit IRC06:50
*** zaro has quit IRC06:50
*** mattclay has quit IRC06:50
*** patrickeast has quit IRC06:50
*** TheJulia has quit IRC06:50
*** cinerama` has joined #zuul06:50
*** mattclay has joined #zuul06:50
*** patrickeast has joined #zuul06:51
*** zaro_ has joined #zuul06:51
*** TheJulia has joined #zuul06:51
*** MapspaM has joined #zuul06:51
*** hashar has joined #zuul10:19
*** jkilpatr has joined #zuul10:59
openstackgerritMerged openstack-infra/nodepool feature/zuulv3: Set socket timeout for SSH keyscan  https://review.openstack.org/45147013:32
*** persia has quit IRC14:17
*** persia has joined #zuul14:21
*** MapspaM is now known as SpamapS15:54
*** SpamapS has quit IRC15:55
*** SpamapS has joined #zuul15:55
* SpamapS awakens15:55
jlkthe sleeper ^15:56
SpamapSI HUNGER15:57
*** hashar has quit IRC15:58
jlkI left you some tasty snacks in a slack group PM15:59
*** hashar has joined #zuul16:00
*** hashar has quit IRC16:00
SpamapSjlk: delicious16:06
*** harlowja has joined #zuul16:45
*** jkilpatr_ has joined #zuul17:03
*** jkilpatr has quit IRC17:05
SpamapSIt's quiet17:07
Shrewsjeblair: If I use hold_jobs_in_build, as you suggested, how do I get the build data (like uuid for the python27 build)? getJobFromHistory() is obviously not the way17:12
* Shrews getting lost in the test framework details17:13
Shrewsah, self.builds17:19
SpamapSI get lost in those too sometimes. :)17:20
SpamapSstarted to click somewhere around the 15th test re-enablement :)17:20
jeblairyeah.  builds == running builds; history == completed builds.17:23
jeblairso the change i wrote friday, 470571, failed due to a race which i believe is possible now, but that change makes much more likely.  it's not just a test race; it's a real production race too.  basically -- there's some ambiguity about knowing whether zuul should report certain kinds of failures before the initial merge for a change has completed.17:26
jeblairi'm going to take tobiash's change 455711 which also touches that area, and see about building a fix on that, so that maybe we can have some comprehensive behavior there, then restack 470571 on top of it.17:27
jeblair(the race is that the mutex around the executor merger makes it much more likely that tests will end up dequeuing changes before their initial merge completes)17:28
mordredjeblair: ah - fun18:02
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Replace equals None with is None  https://review.openstack.org/47107518:08
Shrewsjeblair: so, i'm finding that the actual build directory does not yet exist with hold_jobs_in_build18:09
mordredShrews: whatcha trying to do?18:11
Shrewsmordred: write a test for the log streamer18:12
Shrewsso i need, like, a log to stream18:12
mordredah - so you do, don't you18:12
mordredShrews: so - in case it's helpful, I've got this: https://review.openstack.org/#/c/467634/ which needs to look at files in the build dir18:13
jeblairShrews: hrm, it should exist, even regardless of whether ansible is run18:13
jeblairyeah _get_build_inventory in that change ^ is a good reference18:14
Shrewsjeblair: i'm sure i'm doing something silly18:14
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: add log streaming test  https://review.openstack.org/47107918:15
Shrewsjeblair: ^^^ that's what i'm doing, atm18:15
jeblairthough the path for the log file is jobdir/work/logs/ansible_log.txt18:15
mordredjeblair: speaking of - https://review.openstack.org/#/c/467634/5/tests/base.py doesn't seem to want to find the build by name18:15
mordredjeblair: am I doing something stupid in that method?18:16
jeblairmordred: lazy test method18:16
jeblair^ (joke, in case that didn't read)18:16
Shrewsjeblair: line 91 fails18:16
mordred:)18:16
jeblair(/me forgot "lazy" is a thing in programming)18:16
Shrewsthe first listdir() confirms the build uuid subdir does not exist18:17
jeblairShrews: do you have some output of 83 -- 92?18:18
Shrewsyep. 1 sec18:18
jeblairShrews: also, "KEEP_TEMPDIRS=1 ttrun -epy27 ..." may be helpful here.  tests won't clean up any tempdirs and you can cd into them and inspect.18:19
Shrewsjeblair: http://paste.openstack.org/show/611426/18:19
jeblairthat's weird, since setting jobdir.log_root is set on the line before makedirs is called18:22
Shrewsoh, using KEEP_TEMPDIRS=1 makes it work18:22
Shrewsat least changes the behavior18:22
jeblairShrews: those 2 things suggest that somehow your test is happening after the jobdir is removed (which happens at end of job)18:23
jeblairoh!18:24
jeblairansible jobs can't be held in build18:24
jeblairso if you actually run ansible (subclass AnsibleZuulTestCase), then you can't use hold_jobs_in_build18:24
* SpamapS catching up18:25
jeblairand in this case, the build is finishing before the path checks18:25
jeblair(it's actually racing them)18:25
Shrewshrm. how do i do this then?18:25
jeblairShrews: if we need ansible to actually write the log file (probably so), then i think we'll have to make ansible pause.  maybe add an ansible task that waits for a semaphore file somewhere in the jobdir, then set that file in the test when you want ansible to proceed.18:26
SpamapSwait_for is a thing :)18:27
Shrewsalso, with the kept dirs, i'm not seeing any log files, although the directories exist18:27
jeblairSpamapS: they thought of everything!18:27
* SpamapS especially likes that kitchensink module18:27
Shrewsso i think we're missing more than that piece18:27
ShrewsSpamapS: i heard the kitchensink module has terrible leaks18:28
SpamapSShrews: hey-oh18:28
Shrewsyay Bad Joke Mondays18:29
Shrewsjeblair: do we bypass actual logging for the ansible stuff?18:30
jeblairShrews: nope, i can't think of why that would be empty18:31
*** jkilpatr has joined #zuul18:31
jeblairi'm loading it up on my xenial machine now18:31
Shrews*sigh*18:31
*** jkilpatr_ has quit IRC18:32
jeblairShrews: while this is setting up -- we should probably create a new configuration just for this test which doesn't run all of those jobs (there's a bunch of jobs that test ansible behaviors)18:35
jeblairi mean, we'll need to do that to have the wait_for task anyway18:35
jeblairbut we should just have a single job which does that, and not add it to the existing jobs used by the ansible test18:36
jeblairShrews: mordred's change can also serve as a template for that -- he adds a whole new config setup there18:37
*** hashar has joined #zuul18:37
jeblairShrews: when i run it, i get an ansible_log in the python27 jobdir work/logs18:39
jeblairShrews: one other thought i just had -- if you are using ttrun, you'll need to actually activate the venv18:39
jeblairbecause zuul won't find the ansible-playbook command otherwise18:40
jeblair.tox/py35/bin/activate; ttrun -epy35 ...18:40
jeblairis needed for tests which actually run ansible18:40
Shrewsah, i did not do the activation18:41
mordredjeblair: oh - I was supposed to try to fix that in ttrun I think, wasn't I?18:41
mordredjeblair: ttrun 1.0.5 _should_ have that fixed18:43
Shrewsjeblair: yes, i have the log files now18:43
mordredShrews: can you confirm your ttrun version? I'd love to know if the fix I put in is broken18:44
Shrewsmordred: 1.0.418:44
Shrewsjust upgraded it18:44
Shrewsmordred: 1.0.5 does not seem to fix that18:45
mordredhrm. thanks18:46
* Shrews goes back to the drawing board18:48
Shrewsalso, we should document that little testing tidbit, if it is not already18:50
mordredShrews: ++19:39
jlkIf anybody has cycles, I'd appreciate some reviews on 469297 and 47006819:40
*** jkilpatr has quit IRC20:06
mordredjlk: oh - I agree with SpamapS's reviews - but think the patch is good20:14
mordredjlk: on 46929720:14
SpamapSaw shucks20:18
SpamapSI'm just RFC'ing ;)20:18
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Write inventory as yaml not ini  https://review.openstack.org/46763420:20
*** jkilpatr has joined #zuul20:22
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix traceback when evaluating github changes  https://review.openstack.org/47006820:22
ShrewsI'm trying to find a way to make ansible pause after the run, but before the jobdir cleanup. Adding a wait_for to the playbook itself just for tests seems... icky20:28
Shrewsthe solution, thus far, eludes me20:29
jeblairmordred: spotted the problem on 46763420:31
jeblairmordred: left comment20:31
SpamapSShrews: that's doesn't seem icky at all to me.20:31
SpamapSShrews: it's the playbook form of monkeypatching, which is how all of our tests work.20:31
ShrewsSpamapS: i do not know how all of our tests work, so my ickiness remains20:31
Shrewsbut i'll look at where we do similar things20:32
SpamapSThe alternative is either less meaningful unit testing and end to end "hope that worked like we thought" integration tests.20:32
SpamapSs/and/or/20:32
jeblairShrews: you want to pause the test while the job is running, right?  it seems very natural to me to put the pause directly in the job....  i mean, the system is, after all, one designed to run arbitrary actions in tests.  :)20:33
SpamapSShrews: we haven't yet done anything like that for playbooks, but it's no different than monkey patching in our recording executor and getting in between the methods that do things with a "hold builds" functionality.20:33
jeblair(and it's not like this applies globally -- this is only for tests that need this behavior.20:33
jeblairSpamapS: i'd even argue that this is better than that.  this is unlike monkeypatching in that we are just suppling actual content to the system in the same way a user would.20:34
SpamapSAgreed.20:34
SpamapSand then pretending to be the user :)20:34
jeblair(i kind of wanted to do this to *all* of the tests before i realized running ansible would be very expensive for most of them for very little gain)20:34
*** hashar has quit IRC20:46
pabelangerohai!20:59
pabelangerbeen head down in training today, just wating for train back to ottawa now20:59
openstackgerritJesse Keating proposed openstack-infra/zuul feature/zuulv3: Handle change related reqs on push like events  https://review.openstack.org/46929721:08
jlkSpamapS: jeblair: updated ^^ accordingly.21:08
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor  https://review.openstack.org/47057121:51
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't report start on unmanaged projects  https://review.openstack.org/45571121:51
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix reporting on changes dequeued for deps  https://review.openstack.org/47114321:51
jeblair471143 is a tiny change that took me most of the day to come up with.  :)21:53
SpamapSjeblair: it's not the size of the patch, but how you test it.21:54
jeblairjlk: lgtm! +321:55
jeblairzuul meeting now in #openstack-meeting-alt22:01
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Don't copy the __pycache__ folder  https://review.openstack.org/47024522:04
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Handle change related reqs on push like events  https://review.openstack.org/46929722:06
SpamapSjlk: after the meeting, can we chat about your containerized zuul? I am poking at maybe pushing it up into bluemix k8s22:11
SpamapS(failing building nodepool images at the moment)22:11
*** jamielennox|away is now known as jamielennox22:13
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Write inventory as yaml not ini  https://review.openstack.org/46763422:59
jlkSpamapS: yes, would love to23:06
SpamapSjlk: were you successful in building images with nodepool? I forget23:14
jlkI was, once I started using volume mounts for the nodepool working space23:15
SpamapSjlk: ah, so something about the aufs/etc. failed?23:15
jlkyeah23:15
SpamapSweird23:15
jlkI have volumes for /opt/nodepool/images and /tmp/  in the docker-compose file23:15
SpamapSjlk: also did you ever think about taking it from docker-compose -> helm?23:16
jlkpretty sure that's upstream now. That was from hints from tobiash23:16
jlkSpamapS: not as of yet. I wanted to get it working simply on my laptop with a single docker daemon first.23:16
jlkwalk, then run23:16
SpamapShttps://github.com/j2sol/z8s <-- we're talking about this yes?23:16
jlkyeah23:16
jlkin gerrit land, updated_at is in reference to which? the last time a patch was changed, or anything was edited?23:19
jlkoh n/m23:19
clarkbjlk: aiui its any attribute of the change being updated23:20
clarkbso comments, topics, votes, etc23:20
clarkbjeblair: looks like your fix that you pointed out above failed on unittesting?23:20
pabelangerjlk: diskimage-builder is it fedora, you should be able to yum install it over pip install23:20
pabelangerin*23:20
jlkyeah, I know, but often we want more specific control of what version we install23:21
jlksometimes our own fork23:21
jlksometimes pre-release23:21
jlkoh this is tricksy. There is a isUpdateOf attribute to a change, which looks to see if there is an older version of the change already in the queue. For gerrit easy to just look at the patch #s since they increase. For github not so, which is where the "updatedAt" comes in. But that bit is going away in graphql as a generic thing. I wonder if I have to paw through the commits and look at those? eww.23:24
jeblairclarkb: yep, looking now.23:24
jeblairjlk: could probably keep an internal timestamp if that's necessary?23:26
jlkwe could, may have to go that route, but still exploring graphql. This is... hard to wrap my head around23:26
mordredjlk: but changing EVERYTHING about how people do things is the future23:27
mordredjlk: I'm not sure I fully understand the "is there an older versoin of the change already in the queue" bit23:28
mordredjlk: (for the github case)23:28
jlkopen PR A23:28
jlkzuul ingests that, but hasn't acted on it yet23:28
jlkadd commit to PR A23:28
jlknew event23:28
clarkbcouldn't you compare sha1s?23:28
jlkZuul should see that the old event for PR A opening should be tossed.23:28
jlkclarkb: they might be two completely different sha1s23:28
clarkbyes exactly23:28
mordredGOTCHA - so the PR has the same id23:28
jlkwhich one is "newest" ?23:29
clarkbjlk: whichever event you received last?23:29
jlkif the events come out of order?23:29
mordredis that a thing?23:29
jlk:shrug:23:29
mordredwow, fun23:29
mordredjlk: I at least understand your problem now :)23:29
SpamapSThey order them on the UI23:29
SpamapSso one would hope they're ordered somehow23:29
jlkorder what?23:29
jlkthe commits?23:29
jlkbut you can force push an entirely new set of commits23:30
SpamapSyeah, commits in the PR view23:30
SpamapSoh that23:30
SpamapSactualy23:30
jlkyeah, that.23:30
SpamapSthey show that too23:30
mordredjlk: I do not have any immediately helpful suggestions if the events can come in out of order23:30
SpamapSyou'll see "jlk updated the pr"23:30
*** jamielennox is now known as jamielennox|away23:30
SpamapSand then everything before that starts garbage collecting23:30
SpamapS:-P23:30
jlkI don't know if they could. We might be safe to just toss out any existing change with the same ID23:30
jlkor use the deprecated updatedAt for now23:31
SpamapSjlk: my guess is the best you can do is go ask the PR for its current sha.23:31
SpamapSas in, the pr source23:31
jlkyeah I can get a set of the commits23:31
mordredah - right. and use that to determine which of the N events you have for a given PR is the one that has the shas GH thinks are currently on the PR23:32
mordredit would be super neat to know whether or not gh webhooks can deliver events out of sequence or not23:32
jlkit's the Internet23:32
jlkthings can wind up delayed in route for whatever reason23:33
jlkso GH may issue them in order, but that doesn't guarantee that we'll receive and process them in order23:33
mordredI'm not sure I follow that - the webhook receiver on our side is an http service - the question in my head is will gh submit a second request to our http service before getting a response code from the server for the first one?23:34
mordredif it fires them async on their side, then totally23:34
mordred(which is probably the more likely thing they're doing, right?)23:34
SpamapSIt logs them in the sequence they respond in, but there's no telling if they guarantee one webhook per target.23:35
SpamapSI'd guess that they are async23:36
jlkfeels wrong to assume and rely upon order23:36
mordredtotally. we shoudl not assume order unless we know something about order23:37
clarkbI dunno, if events don't carry necessary ordering sematics within themselves then using the order you receive them in seems appropriate23:37
clarkbbut I guess this is a difference between gerrit and github23:37
clarkbgerrit is an actual event stream23:37
clarkbwhich implies order and all that23:38
clarkbgithub is just async "here is a thing YOLO"23:38
mordredjlk: you said updatedAt is going away in the graphql ... got a handy link to a doc?23:39
jlksure23:39
jlkhttps://developer.github.com/v4/reference/object/pullrequest/23:39
jlkthere's a possible replacement, a connection to the pull request timeline, https://developer.github.com/v4/reference/object/pullrequesttimelineconnection23:40
mordredjlk: yah - that seems to have ordering info in it23:40
jlkbut I was looking for a simple refchanged like event23:40
jlkand failing23:40
jlkhttps://developer.github.com/v4/reference/union/pullrequesttimelineitem/23:40
mordredyah23:40
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Mutex repo updates and merge ops in executor  https://review.openstack.org/47057123:41
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix reporting on changes dequeued for deps  https://review.openstack.org/47114323:41
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Don't report start on unmanaged projects  https://review.openstack.org/45571123:41
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix race in gerrit+github test  https://review.openstack.org/47116123:41
jlkSeems to be missing HeadRefUpdatedEvent23:41
mordredjlk: yah. there's a ton of things about types of force push23:41
mordredjlk: maybe "commit" is the event for that?23:42
jlkthat just leads to a commit object, which I'd have to mine for info23:42
mordredjlk: but - it would be an entry in the list of timeline events23:43
jlkyeah, might be able to take the last one of those23:43
jlkbut...23:43
mordredjlk: so if the timeline property on the PullRequest is an ordered list - then you could walk it from the back for ordering perhaps?23:43
jlkoh hrm.23:43
clarkbthe first doc you linked to says the general updatedat datetime will eventually be replaced, maybe just use it for now and then swtich when they replace it to something else?23:44
jlkre-thinking the scenario23:44
jlkclarkb: that's what I'm leaning toward23:44
clarkbI read that as "use this thing but know you will have to change it once we get around to actually making the new thing"23:44
jlkso I have event A leading to a change object A and event B leading to change object B, both are of the same PR.23:44
jlkin event A the commit event list has one commit, and in event B it also has one, but different, commit.23:45
jlkI think I lack the data necessary to know which of those event/objects is the "newest" one23:45
*** jamielennox|away is now known as jamielennox23:46
jeblairjlk: do you query the pr when you get an event?23:47
mordredjlk: yah - it seems like you're have to query the api for the PR and get the full timeline, then see which of the commits is later in that timeline23:47
* jeblair should start saying 'we' :)23:47
jlkmordred: the crux of the problem is that I need the info of "when this commit appeared on this PR" which is a timestamp that I don't think is tracked in the commit itself, because the commit may be a part of many PRs, or it may have been dormant for a long time until it showed up in the PR.23:48
jeblairjlk: oh, when you said "change object" above, did you mean in zuul?  if so, that's a misapprehension -- there should only ever be one change object in zuul for a given pr.23:48
jlkit's a relationship between the commit and the PR specifically.23:48
clarkbjlk: reading the api you'd get two PR events right? each with a commit list. You check the updatedat of the PR objects find the newer one, then look at commits.last or whatever23:48
mordredclarkb: right. with updatedat it makes total sense - but without updatedat it seems hard to reason about23:49
clarkbmordred: yes, but I think what github is saying is that they will replace updatedat at some point but haven't yet so thats the only reason we don't have that data now23:49
jlkyeah updatedAt works perfect for this23:49
jlkjeblair: um.23:49
clarkbso I would just stick with updatedAt, wait for them to actually write the replacement code, then switch23:49
mordredyah. I think we have to use updatedAt - there doesnt' seem to be good replacement for it yet23:49
mordred++23:49
jlkjeblair: I'm not sure I follow that.23:50
jlkjeblair: is that assuming a caching of PR -> change object?23:50
jlk(because that doesn't exist in the github driver yet)23:50
jeblairjlk: oh, they'll be different because they are different "patchsets", sorry.23:50
jeblairjlk: but yes, that caching also needs to happen23:51
jeblairjlk: mostly i wanted to make sure we didn't base any behaviors on the lack of caching23:51
jeblairbut i don't think i added anything useful to the conversation, sorry.  :)23:51
jeblairfwiw, in gerrit, we didn't go very long before we got to the point where we needed to query the change on each event.  so we may end up eating the cost needed to resolve this without updatedAt before it becomes a problem anyway.  :)23:52
mordredjeblair, jlk: if we get to that point, then as opqaue as I find it right now, the graphql stuff may wind up being useful, as it allows us to submit more than one query at a time23:54
mordredI say that may be useful, because I worry about the gh rate limit if we have to do a query on every webhook submission23:54
mordredbut I'm wildly waving my hands in the air now23:54
clarkbwould be interesting if they readjusted the rate limits significantly based on graphql change23:55
clarkblike even less requests per hour because now you can have richer reuqests23:55
jamielennoxso it'll be interesting to see what gh does with API request limits for graphql because it's much more expensive server side23:57
jeblair"have fewer pull requests"23:57
jamielennoxcurrently you get like 5000 api requests/hour but 100 search requests/hour23:57
jamielennoxand i can't imagine that graphql is much less expensive than search23:57
jeblairgit operations are unlimited?23:57
jlkthey did23:58
jlkthe limits are a lot lower with graphql23:58
jeblairmany of zuul's questions may be able to be answered via git :)23:58
jlkso, to answer an earlier question23:58
jamielennoxi'm not sure about git operations23:58
jlkcurrently it depends on the type of event as to whether or not we query github for the pullrequest data. Some events we get enough from the event body itself to move on, and the PR isn't queried until the getChange call.23:58
jlkhowever,23:59
jamielennoxi know i've hit rate limits in the past for unauthenticated ops, but don't remember if that was ssh,git or http access23:59
jlkwith graphQL, and with a caching inside of zuul, I've assumed that we'll start making a PR query and object for every event (that can be linked back to a PR), created when the event comes in, and updated at the getChange() call time (mirroring what gerrit does).23:59

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!