Tuesday, 2016-11-22

openstackgerritJoshua Hesketh proposed openstack-infra/zuul: Merge branch 'master' into workingv3  https://review.openstack.org/38947000:18
*** saneax-_-|AFK is now known as saneax00:24
openstackgerritAdam Gandelman proposed openstack-infra/zuul: Re-enable test_success_pattern as test_success_url  https://review.openstack.org/40045500:46
openstackgerritJames E. Blair proposed openstack-infra/zuul: Correct logic problem with job trees  https://review.openstack.org/40045600:53
jeblairSpamapS: ^ maybe we can take that and apply your additional tests to it00:54
clarkboh sorry for mic drop,larissa hauled me away01:21
SpamapSjeblair: I'll look soon. Was sick this morning and now been driving the last 3 hours. :-p01:46
jeblairclarkb: no it was perfect :)03:07
jeblairSpamapS: no rush; get well.03:08
SpamapSjeblair: seems to have been a 24hr bug that finally let go this afternoon. :-P03:09
*** saneax is now known as saneax-_-|AFK05:54
*** abregman has joined #zuul06:23
openstackgerritJoshua Hesketh proposed openstack-infra/nodepool: Merge branch 'master' into feature/zuulv3  https://review.openstack.org/40053606:46
*** willthames has quit IRC06:50
*** hashar has joined #zuul07:12
openstackgerritAndreas Jaeger proposed openstack-infra/nodepool: Add tools/test-setup.sh  https://review.openstack.org/39907907:52
openstackgerritAndreas Jaeger proposed openstack-infra/nodepool: Add tools/test-setup.sh  https://review.openstack.org/39917707:56
*** hogepodge has quit IRC08:45
*** rcarrillocruz has quit IRC09:29
*** ricky1 has joined #zuul09:33
*** ricky1 is now known as rcarrillocruz09:33
*** yolanda has quit IRC10:35
*** yolanda has joined #zuul10:38
*** rcarrillocruz has quit IRC11:46
*** rcarrillocruz has joined #zuul11:46
*** abregman is now known as abregman|mtg12:57
*** yolanda has quit IRC12:58
*** abregman|mtg has quit IRC13:02
*** abregman has joined #zuul14:12
pabelangermorning14:13
pabelangerlooking at http://logs.openstack.org/34/399734/1/gate/gate-nodepool-python27-db-ubuntu-xenial/a9db8e9/console.html14:13
pabelangerseems we are uploading the same build twice14:14
pabelangernot sure if we recently fixed that or not14:14
pabelangerwill recheck now14:14
*** yolanda has joined #zuul14:17
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_hold test  https://review.openstack.org/39975814:17
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enable test_delete / test_delete_now tests  https://review.openstack.org/39973414:17
Shrewspabelanger: that was probably (hopefully?) fixed with my two bugs squashed yesterday14:21
Shrewsnot sure if that review needs a rebase to get those or not14:23
*** yolanda has quit IRC14:36
*** yolanda has joined #zuul14:43
openstackgerritMerged openstack-infra/nodepool: Add tools/test-setup.sh  https://review.openstack.org/39907915:02
Shrewsumm, anyone know where getSnapshotImageByExternalID gets defined in nodepool? is that autogenerated from the db layer or something?15:03
Shrewsoh, it was removed in a previous commit. *phew*15:08
Shrewsjeblair: so, the alien-image-list command seemed to operate on snapshot images. since support for that was removed, can that command (and associated tests) be removed?15:10
Shrewsif not, what should it be doing instead at this point?15:11
jeblairShrews: i think it should be comparing the images it finds in the clouds to uploads15:11
jeblairShrews: (that was probably another case of us overloading the term snapshot to also mean 'dib we uploaded' -- since dib was an add-on)15:11
*** yolanda has quit IRC15:12
Shrewsjeblair: ok. that's what i *assumed* it should be trying to do, but the whole snapshot thing threw me off15:12
*** yolanda has joined #zuul15:52
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable alien-image-list command and tests  https://review.openstack.org/40083616:00
Shrewsjeblair: pabelanger: see my NOTE note in that review ^^^16:01
jeblairShrews: ack, i'll take a look at that later16:04
*** yolanda has quit IRC16:04
jeblair(i mean, i'll take a look at seeing if there's a way to add the positive test case)16:07
*** yolanda has joined #zuul16:29
*** abregman has quit IRC16:33
*** abregman has joined #zuul16:33
*** yolanda has quit IRC16:42
*** yolanda has joined #zuul16:51
*** hashar is now known as hasharAway16:57
*** yolanda has quit IRC17:40
*** abregman is now known as abregman|afk17:59
*** yolanda has joined #zuul18:20
phschwartzjeblair: ok, so if you have some time. I am open right now. I basically want to throw out what I have been attempting because all feel really really bad and get your thoughts based on that original review that did part of the work of trying to turn it into a graph18:22
*** yolanda has quit IRC18:23
*** yolanda has joined #zuul18:23
jeblairphschwartz: sure thing18:24
clarkbShrews: jeblair re alient list yup. Basically snapshot has/had two meanings 1) actual snapshot images and 2) dib uploads18:26
phschwartzso from reading the commit message and the code, the point of the change was to fix the issue where 2 jobs had the same child but the child would run from which ever finished first instead of when both were done18:28
phschwartzdoes that sound correct?18:28
jeblairphschwartz: yes, (i'm not sure whether it was the one that finished first vs the one that was defined first -- but regardless, it is to express that we want A, B+C, then D to run when B and C are finished.18:30
phschwartzSo the previous review tried to move the jobTree to a jobGraph with some cyclic dependency protection, but only changed about 1/6th the code that uses jobTree's and in not an easy maner to replace.18:32
phschwartzDo you think it is cleaner to continue down that process and replace all the jobTree usage with a graph like object that can be used expressively to list a job chain/order?18:33
phschwartzI was attempting to fix it in the job tree itself and without adding extra objects to try and track cross dependent children it turned into a nightmare18:34
*** yolanda has quit IRC18:34
jeblairphschwartz: i'm not sure how best to answer your question.  i think this is a fundamental alteration of jobtrees -- we are, in fact, changing them to a graph structure.  so i think whatever investment in altering the data model to accomodate that clearly is worth it (i don't think we should try to shoehorn this into the existing jobtree structure if it doesn't fit).  however, we do still need something that lays out the order in which jobs are ...18:37
jeblair... to be run, and can be used to easily answer the question "these jobs have completed, what jobs can be run now?"18:37
phschwartzjeblair: ok, that gives me a clearer picture of exactly what you want. Instead of a normal open graph like the review that was presented, I am going to take a shot at moving the jobTree to a weighted or dependency graph. I think that will give us the best function for solving this. Do you have an issue with the format of the pipeline defn changing a bit to allow clearer reading while still using a yaml dict18:40
phschwartzformat?18:40
jeblairphschwartz: indeed, i think even more specifically a DAG is what we want, yeah?18:42
phschwartzjeblair: That is what I was leaning towards18:43
jeblairphschwartz: and yeah, we can consider format changes -- maybe make some mockups in an etherpad and run them by me before you get too far into it?18:44
phschwartzthat was my thoughts on it18:45
*** yolanda has joined #zuul18:55
*** yolanda has quit IRC18:59
mordredjob DAG will make many people happy19:12
phschwartzmordred: I think it will.19:14
*** yolanda has joined #zuul19:16
*** hasharAway is now known as hashar19:22
*** yolanda has quit IRC19:32
*** yolanda has joined #zuul19:42
*** yolanda has quit IRC19:54
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_list_nodes  https://review.openstack.org/40095520:32
Shrewsmissed that one ^^^20:33
*** abregman|afk has quit IRC20:40
Shrewsthere is also the test_image_upload_fail test, but i'm not quite sure how to fix that one yet20:45
pabelanger+2 on test_list_nodes20:48
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable TestWebApp tests  https://review.openstack.org/39971621:01
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove waitForBuiltImages() / JobTracker() from nodepool.py  https://review.openstack.org/39972721:01
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Add test to validate when a node build is disabled  https://review.openstack.org/39964221:01
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_hold test  https://review.openstack.org/39975821:04
pabelangerShrews: regarding: https://review.openstack.org/#/c/399974/ where is the build_number generated for builds today? If 0000000001 isn't guaranteed, is there no safe way to force it to be the default?21:12
Shrewspabelanger: it's a sequence znode managed by zookeeper. we have no control over it21:13
pabelangerHmm21:15
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: [WIP] Re-enable test: test_image_upload_fail  https://review.openstack.org/40097021:16
Shrewspabelanger: clarkb: jeblair: my testing for ^^^ shows that the fakeuploadfailcloud is not getting swapped in for that test (thus it fails). not sure what the issue there is but it must have worked before, yeah?21:18
pabelangerShrews: Ya, we make the assumption in a few places already with fake-image-0000000001, for example.  So, we'll also need to clean that up too. test_dib_image_delete and test_image_removal21:19
Shrewspabelanger: erm, didn't notice that21:19
Shrewspabelanger: dib-image-delete looks like an easy fix. just use build.id there21:20
clarkbShrews: on line 111 make sure that object path is still the same path used now?21:21
Shrewserr, builds[0].id21:21
Shrewsclarkb: yup, the same. though i think the _get_one_cloud() path is different?21:22
jheskethMorning21:23
jeblairShrews: we haven't seen a test fail because the sequence started at 0 yet, have we?21:28
Shrewsjeblair: i saw it while developing zk.py21:28
Shrewsso, not a test failure, no21:28
jeblairShrews: maybe there was something special about the conditions when you saw that21:29
Shrewsjeblair: i didn't bother digging into the zookeeper code (java, bleh!) to figure the conditions21:29
jeblairi confess, it bothers me that something that is guaranteed to be a monotonically increasing number would have an undefined behavior like that.  and i like that we can use it to assert that the image we built is the *first* image we built.  if we read the value, then all we can do is assert that the second image was the one that was built after the first.21:30
Shrewsjeblair: ZK makes no guarantees as far as the docs (https://zookeeper.apache.org/doc/r3.2.1/zookeeperProgrammers.html#Sequence+Nodes+--+Unique+Naming) and harlowja have informed me21:31
harlowjaya, i think its really based on the modifications to the parent znode21:31
harlowjaafaik that's where the number comes from21:32
harlowjadelete the parent and maybe it will start at 000000000121:32
harlowjai forget though21:32
harlowjaha21:32
harlowjacheck out the java code :-P21:32
Shrewsno, that's why i summoned you!  lol21:34
* Shrews is allergic to java21:34
harlowjalol21:34
pabelangerjeblair: agreed, it would be good to know what the first build ID would be for a new image. It would make writing tests easier :)21:36
pabelangerthat's actually was I was trying to do, some how prime a build number with storeBuild(), then delete that data. We'd know what the current build_id would be21:37
pabelangerdocs to reference 000000000121:40
pabelangerand not 000000000021:40
harlowjasomewhere in https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java i think21:40
Shrewspabelanger: as an example, yes21:40
ShrewsEven if we knew the conditions, we're not guaranteed that the zk devs wouldn't change them at some point21:43
harlowjaya21:43
harlowjasomewhere in there, lol21:43
pabelangerShrews: jeblair: and was it the downside, to not having zookeeper generate the id?21:45
Shrewsi think that's even why i made zk.storeBuild() return the build number21:45
pabelangeror upside, depending on how we look at it21:45
Shrewspabelanger: it started out NOT using zk for the sequence numbers21:46
pabelangerokay, so we switched to it. Was that discussion logged any place?21:47
pabelangermostly curious on the history of using it21:47
Shrewspabelanger: see fac06108552b7482086d840dcfe5b2eaa1d11b3621:47
pabelangerk21:47
*** hashar has quit IRC21:49
Shrewsthe getMaxBuildID() used to be necessary to query ZK to calculate the _next_ ID. pretty inefficient21:49
Shrewsfor one21:49
harlowjau guys going to make your own sequence number stuff (transcationally?)21:50
Shrewsand i think that involved more locks21:50
harlowjai wouldn't recomend it :-P21:50
jeblairzk does have a test where they hardcod 0000000001 -- https://github.com/apache/zookeeper/blob/ec20c5434cc8a334b3fd25e27d26dccf4793c8f3/src/java/test/org/apache/zookeeper/test/AsyncOps.java#L80621:50
harlowjaya, more locks needed21:50
Shrewsnah, i don't want to create a new GUID21:50
pabelangerjeblair: interesting21:51
jeblair(well, separately from this discussion, we also have the potential desire to make global unique build and upload ids, which might involve a global sequence oracle, or a uuid; neither of those things solve this question/problem though)21:51
Shrewswell that's all well and good. doesn't mean 0000000000 could never be generated21:51
Shrewsif we want to assume 1 for the tests, we might get away with that for a long time. i'm just relating my experience. i really don't think non-test code should EVER depend on that though21:52
jeblairShrews: i agree with that.21:53
jeblairShrews: i'm just tending to think that it might be okay for tests (until it's not), and we get the benefit of it being easier to write/understand, plus a little extra assertiveness on the part of the test.21:53
jeblairi love that i've spent 20 minutes trying to find the place in zk where it sets or increments the seqno and have not succeeded yet.21:54
harlowjame too21:54
harlowjalol21:54
harlowja10 minutes though21:54
harlowjai have a feeling https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/util/ZxidUtils.java#L25-L27 might be used?21:55
harlowjabut not sure, ha21:55
harlowjalet me know if u find it :-P21:55
jeblairi have exceeded the number of windows my wm will let me open21:56
harlowjaha21:56
jeblairi did not know there was a max21:56
jeblairit's apparently 144 per desktop21:56
*** yolanda has joined #zuul21:57
pabelangerwow, impressive21:57
Shrewsapparently stored using a 12x12 matrix21:58
clarkbjeblair: now you want me to see how many xmonad will do21:59
clarkber21:59
clarkbyou have made me want to see21:59
jeblairclarkb: yes.  also, i double dog dare you.21:59
pabelangerhttps://review.openstack.org/#/q/status:open+project:openstack-infra/nodepool+branch:feature/zuulv3+topic:v3-tests is 2 tests that are ready for review / merge. Minimal changes22:04
pabelanger5 now22:05
pabelangertopic update22:05
harlowjajeblair https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java#L671-L67522:06
harlowjahttps://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java#L672-L67522:06
openstackgerritAdam Gandelman proposed openstack-infra/zuul: Re-enable test_success_pattern as test_success_url  https://review.openstack.org/40045522:14
jeblairharlowja, Shrews, pabelanger: so it looks like it is the version number of the parent (for us, a node which does nothing other than contain the builds/uploads).  so whether it starts at 0, 1, or something else is likely related to how exactly we end up creating that node (and whether we do anything else with it in the future).  in other words, we're more likely to make that unstable than zookeeper.  having said that, i'm now leaning toward ...22:18
jeblair... agreeing with Shrews that we should try to avoid it.22:18
harlowjaor make new parent nodes for testing22:18
*** willthames has joined #zuul22:19
harlowjaand then u'll probably get 00000122:19
harlowjaotherwise prob not, lol22:19
harlowjaprob more/less test what u actually care about (that its incrementing?)22:19
openstackgerritAdam Gandelman proposed openstack-infra/zuul: Re-enable test_success_pattern as test_success_url  https://review.openstack.org/40045522:22
openstackgerritAdam Gandelman proposed openstack-infra/zuul: Re-enable test_success_pattern as test_success_url  https://review.openstack.org/40045522:23
adam_gjeez. third times a charm22:23
pabelangerjeblair: harlowja: Shrews: okay, let me think about another way to waitForBuild without using build_id.  Maybe we need to stop using builder.DibImageFile.from_image_id()22:33
jeblairpabelanger: i'd say let's do that, but not let it block us for now.22:35
pabelangerk22:36
*** yolanda has quit IRC22:58
Shrewspabelanger: waitForImage implies the build. just have it return the image if you need the build id23:02
*** jamielennox is now known as jamielennox|away23:24
*** jamielennox|away is now known as jamielennox23:24
jamielennoxhey, is there anything about the test_requirements tests that means they've been left so late?23:42
jamielennoxanything major changing there re v3, or not suitable for beginners?23:42

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!