Wednesday, 2016-11-16

jeblairSpamapS: let me grab a flashlight...00:04
SpamapSjeblair: "I'll submit what I have now, as it's failing reliably somewhat deep into the test00:04
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul: Refactor test_zuul_refs  https://review.openstack.org/39799400:06
SpamapSjeblair: note that ^^ has something I'm not sure is right. There are 10 builds, but I'm only checking the first 4 that match the ZUUL_CHANGE that I want.. given the nature of the test, I thought this was probably right.. but not sure.00:08
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable TestNodepool.test_node  https://review.openstack.org/39794400:12
pabelangerjeblair: Shrews: okay, I think that is a little more cleaner^00:12
pabelangerI'll hold off on any more tests until that one is reviews, since it add zookeeper into nodepool.py00:13
jeblairSpamapS: yeah, those changes should trigger more than 4 jobs (10 seems strange, i wonder why it's not 12?) and we're presuming they're all getting the same information, so we only check one build from each.00:16
SpamapSjeblair: it's possible I observed the length of self.builds mid-launch00:17
jeblairapparently there will be 14 builds for that by the time it's done00:18
SpamapS(Pdb) p self.builds00:19
SpamapS[<FakeBuild project-test1 3,1 [waiting]>, <FakeBuild project-test2 3,1 [waiting]>, <FakeBuild project1-project2-integration 3,1 [waiting]>, <FakeBuild project-test1 3,1 4,1 [waiting]>, <FakeBuild project-test2 3,1 4,1 [waiting]>, <FakeBuild project1-project2-integration 3,1 4,1 [waiting]>, <FakeBuild project-test1 3,1 4,1 5,1 [waiting]>, <FakeBuild project-test2 3,1 4,1 5,1 [waiting]>, <FakeBuild project-test100:19
SpamapS3,1 4,1 5,1 6,1 [waiting]>, <FakeBuild project-test2 3,1 4,1 5,1 6,1 [waiting]>]00:19
SpamapSy00:19
jeblairoh right those are running builds00:19
jeblairso that's correct00:20
jeblairthe 4 -merge builds have completed (so they are accessible in self.history and are removed from self.builds)00:20
jeblairsince this is the last message from hasChanges (and seems to match the call in 2116), i assume this is the error:00:23
jeblair2016-11-16 00:08:26,033 zuul.test                        DEBUG    Checking if build <FakeBuild project-test2 3,1 4,1 5,1 [waiting]> has changes; commit_messages ['A-1']; repo_messages [u"Merge commit 'refs/changes/1/5/1' of /tmp/tmp.PQac7IgBC1/tmp37U9tW/zuul-test/upstream/org/project2 into HEAD", u'M2-1', u'C-1', u'add content from fixture', u'initial commit']00:23
SpamapSRight, ZUUL_CHANGE == 5 doesn't seem to be the right change anymore00:25
jeblairit's looking for 'A-1' in that list of commit message, but not seeing it.  it sees C-1, but the A and B commits are not there00:25
SpamapSI'm still learning the fixtures, so I can't even tell how ZUUL_CHANGE gets created yet00:25
SpamapScan't even find where self.fake_gerrit is created00:26
jeblairSpamapS: ZUUL_CHANGE is essentially the gerrit change id, it starts at 1 and monotonically increases each time we addFakeChange in the test00:26
jeblairso in that test, M1=1, M2=2, A=3, B=4, C=5, D=600:26
SpamapSso guessing there are merger things stubbed out or just broken right now00:28
jeblairprobably broken; this shouldn't be hitting stub code00:29
SpamapScool, worth digging deeper then. :)00:31
* SpamapS packs some cliff bars and an extra 200m of rope.00:31
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove image building responsibility from nodepool.py  https://review.openstack.org/39796500:33
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable TestNodepool.test_node  https://review.openstack.org/39794400:33
pabelangerjhesketh: do you mind reviewing: https://review.openstack.org/#/q/status:open+project:openstack-infra/zuul+branch:feature/zuulv3+topic:enable-tests today?00:36
jheskethof course00:37
* jhesketh jumps on it now before he forgots00:38
jhesketh*forgets00:38
jheskethpabelanger: awesome stuff with taking all these on btw :-)00:38
jeblairSpamapS: i think i see the problem --00:38
jeblairSpamapS: that's the correct git history for project2 (the project that change 5/C is in).  but we're actually looking for change A which is in project100:40
pabelangerjhesketh: not an issue, happy to help where I can00:41
jheskethjeblair: any particular reason you didn't +3 some of pabelanger's tests as you mentioned for me yesterday?00:41
jhesketh:-)00:41
jeblairSpamapS: ref_has_change looks at the git repo for the project corresponding to the change you're asking for, but build.hasChanges only looks at ZUUL_PROJECT.  it *probably* needs to similarly check the project corresponding to each change you're asking about00:42
jeblairjhesketh: i +3d the ones you reviewed.  i +2d the rest and left them for you to review?00:43
jheskethjeblair: sure... maybe I misunderstood yesterday, but did you want me to +3 'trivial'00:44
jheskethpatches with no other +2's00:44
jheskethon the v3 branch00:44
jeblairare there any?00:44
jheskethnot sure, just working through them at the moment.. .but in general when reviewing for v300:44
jheskethand in particular for the test-enable ones00:45
jeblairlet's not -- i've -1d a couple of trivial ones that turn out not to be trivial00:45
SpamapSjeblair: ah yes, test fixture issue. I wasn't finding much in the other end.00:45
jheskethokay, sure00:45
jheskethmy apologies for misunderstanding yesterday then00:45
jeblairjhesketh: np -- your reviews on them are appreciated so i want to make sure both of us see as many as possible :)00:46
jheskethack00:46
jeblairpabelanger: 944 lgtm, but i wonder if we're at a point where we should just change node.yaml to be like node_dib.yaml -- that might result in less test churn?00:49
jeblair(specifically, i'm thinking if we don't have to change all occurences of 'fake-image' to 'fake-dib-image' it would be nice)00:50
pabelangerjeblair: ya, I was thinking about that too. I think those files can be merged now00:51
jeblaircool, worth a shot i think00:51
pabelangershould I do that in 944 or the following patch?00:52
jeblairpabelanger: i think either would be fine (i don't mind a following patch changing that back)00:53
pabelangerokay, I can do that first thing in the morning00:53
openstackgerritMerged openstack-infra/zuul: Re-enable TestMergerRepo() class for testing  https://review.openstack.org/39718900:57
openstackgerritMerged openstack-infra/zuul: Re-enable test_crd_gate / test_crd_multiline / test_crd_gate_reverse  https://review.openstack.org/39727700:58
openstackgerritMerged openstack-infra/zuul: Re-enable test_crd_check job  https://review.openstack.org/39731700:58
openstackgerritMerged openstack-infra/zuul: Re-enable test_crd_branch test  https://review.openstack.org/39733700:59
openstackgerritMerged openstack-infra/zuul: Re-enable test_crd_cycle_join test  https://review.openstack.org/39734000:59
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul: Refactor test_zuul_refs and FakeBuild.hasChanges  https://review.openstack.org/39799401:03
SpamapSjeblair: good call^01:04
SpamapStest passes now. Did not test to see if there was fallout from the changes... have to run out to fetch kids now01:04
jeblairSpamapS: cool, the rest of the tests will tell us that :)01:05
*** harlowja has quit IRC05:55
*** bhavik1 has joined #zuul07:01
openstackgerritIan Wienand proposed openstack-infra/nodepool: Catch all upload exceptions  https://review.openstack.org/39814807:17
*** abregman has joined #zuul07:22
openstackgerritIan Wienand proposed openstack-infra/nodepool: Catch all upload exceptions  https://review.openstack.org/39814807:49
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul: Refactor test_zuul_refs and FakeBuild.hasChanges  https://review.openstack.org/39799408:18
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/zuul: Refactor test_zuul_refs and FakeBuild.hasChanges  https://review.openstack.org/39799408:20
*** saneax-_-|AFK is now known as saneax08:36
*** hashar has joined #zuul08:43
*** saneax is now known as saneax-_-|AFK08:55
*** saneax-_-|AFK is now known as saneax09:00
*** saneax is now known as saneax-_-|AFK09:13
*** saneax-_-|AFK is now known as saneax09:38
*** saneax is now known as saneax-_-|AFK09:59
*** bhavik1 has quit IRC10:50
*** abregman is now known as abregman|afk10:51
*** abregman|afk is now known as abregman11:29
*** _ari_ has quit IRC12:34
*** _ari_ has joined #zuul12:36
*** jamielennox is now known as jamielennox|away13:25
*** abregman is now known as abregman|mtg13:32
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Replace node_dib.yaml with node.yaml for tests  https://review.openstack.org/39835713:40
*** abregman_ has joined #zuul13:56
*** abregman|mtg has quit IRC13:59
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Replace node_dib.yaml with node.yaml for tests  https://review.openstack.org/39835713:59
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable TestNodepool.test_node  https://review.openstack.org/39794413:59
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enalbe test_node_net_name test  https://review.openstack.org/39836814:03
*** abregman_ has quit IRC14:04
*** abregman has joined #zuul14:05
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enable test_node_vhd_image test  https://review.openstack.org/39836914:10
pabelangermordred: do you mind helping +3 jeblair stack of changes for nodepool? https://review.openstack.org/#/c/397952 Would save me a rebase14:12
*** Cibo has joined #zuul14:18
pabelangeractually, I can rebase for now14:22
mordredpabelanger: looking14:25
mordredpabelanger: done14:26
pabelangermordred: danke14:28
pabelangerShrews: going to take a stab at getting devstack job updated with zookeeper14:29
openstackgerritMerged openstack-infra/nodepool: Fix race condition in build cleanup  https://review.openstack.org/39797514:30
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837914:31
mordredpabelanger: that looks very straightforward ^^ :)14:33
openstackgerritMerged openstack-infra/nodepool: Remove Zookeeper per-test fixture  https://review.openstack.org/39794814:33
openstackgerritMerged openstack-infra/nodepool: Reduce kazoo logging in tests  https://review.openstack.org/39795214:33
openstackgerritMerged openstack-infra/nodepool: Remove discover from test-requirements  https://review.openstack.org/34580114:33
pabelangermordred: ya, untested, lets see what the job does14:33
pabelangersadly no centos package yet14:33
openstackgerritMerged openstack-infra/nodepool: Remove image building responsibility from nodepool.py  https://review.openstack.org/39796514:33
*** abregman has quit IRC14:54
*** abregman has joined #zuul15:16
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837915:17
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Replace node_dib.yaml with node.yaml for tests  https://review.openstack.org/39835715:31
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enable test_node_vhd_image test  https://review.openstack.org/39836915:31
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable TestNodepool.test_node  https://review.openstack.org/39794415:31
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_node_net_name test  https://review.openstack.org/39836815:31
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_node_vhd_and_qcow2 test  https://review.openstack.org/39841615:31
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_dib_image_delete  https://review.openstack.org/39841815:41
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_dib_upload_fail test  https://review.openstack.org/39842615:45
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_config_validate  https://review.openstack.org/39842715:45
*** abregman has quit IRC15:55
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_image_build  https://review.openstack.org/39843715:57
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_subnodes test  https://review.openstack.org/39843815:59
Shrewspabelanger: what does the job-create nodepool command supposed to do?15:59
Shrewsyes, i know, create a job... thanks mordred :-P15:59
Shrewsbut, what is the job?15:59
pabelangerShrews: we use that to setup auto-holds in nodepool16:00
pabelangerso, when a job files, nodepool will not delete the node, but flag it as hold16:00
*** abregman has joined #zuul16:01
Shrewshrm, that seems very un-zookeeper related16:01
* Shrews tries re-enabling16:01
pabelangeragreed16:01
mordredShrews: :)16:01
Shrewscool, test_job_create and test_job_delete still work16:02
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable tests: test_job_create, test_job_delete  https://review.openstack.org/39844016:02
pabelangerYay16:03
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enable test_node_az test  https://review.openstack.org/39844216:05
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Remove test: test_snapshot_image_update  https://review.openstack.org/39844516:11
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_node_ipv6 test  https://review.openstack.org/39844616:12
* Shrews wonders how much of the infra compute resources is being used by pabelanger and Shrews16:13
pabelanger:)16:13
pabelanger698 nodes in use right now16:13
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_node_delete_success test  https://review.openstack.org/39844816:14
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_node_delete_failure test  https://review.openstack.org/39845016:17
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_db test  https://review.openstack.org/39845116:18
pabelangerI don't want to jinx this, but it is going pretty smoothly the re-enabling of tests for test_nodepool.py16:18
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_leaked_node test  https://review.openstack.org/39845816:21
jeblairShrews: https://review.openstack.org/397944 is an important one i want to make sure you review16:22
Shrewspabelanger: comment for you on 39794416:24
pabelangerchecking16:24
Shrewsjeblair: yeah, just looked at it16:24
pabelangerShrews: sure, I can change that if not correct16:25
jeblairShrews: also be aware of https://review.openstack.org/398357 which may affect test re-enablement16:25
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Use the name parameter from Task instead of class name  https://review.openstack.org/39846116:25
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Remove need for findNetwork  https://review.openstack.org/39846216:25
jeblairShrews, pabelanger: sounds like a good follow-up patch16:26
pabelangerack16:26
Shrewspabelanger: yeah, let's not rebase the entire chain16:26
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_job_start_event test  https://review.openstack.org/39846316:26
mordredShrews, pabelanger: http://logs.openstack.org/79/398379/2/check/gate-dsvm-nodepool/5acebe8/logs/screen-nodepool-builder.txt.gz#_2016-11-16_15_33_44_88416:26
mordredwhy is there an "osic-cloud1" in the image string? I can't find ... oh wait a sec16:27
* jeblair watches firefox thrash16:27
Shrewsmordred: File Not Found16:27
Shrewsfun16:27
mordredok - I will ask the question for real now - I don't see osic anywhere in the repo16:28
Shrewsoh, found now16:28
pabelangerit was run on osic16:29
jeblairwell, i still can't see that file -- but the builder does store the hostname as a field in zk to indicate what host it was built on... but i don't know why it would be in the image name16:29
jeblairmordred: oh!16:29
jeblairmordred: still just guessing here -- but i think the devstack test does some awk/grep stuff on nodepool command output.  i bet we need to update the field order16:29
Shrewsyeah, the builder hostname is definitely NOT used in the image name16:30
jeblairlike, it's looking for an image name, but it's pulling the build host out of the output table instead of the image name16:30
pabelangerStill downloading log file :/16:30
mordredI'm dumb. yah. that's the hostname16:30
mordredubuntu-trusty-osic-cloud1-disk-540239016:30
clarkbalso figuring out why that logfile is so large now wouldbe good16:31
mordredthat's a kazooclient log message - I think it' sjust saying it's talking to the zookeeper on ubuntu-trusty-osic-cloud1-disk-540239016:31
pabelangerkazoo.client I suspect16:31
pabelangerI'll update the logging.conf file to INFO for it16:31
jeblairpabelanger: ++thx16:31
clarkb12mbcompressed16:31
* jeblair kills ff16:32
jeblairSpamapS, mordred: the only other thought i have on 397994 is that we aren't really going to use zuul_ref for its intended purpose anymore (which is to be the ref that zuul-cloner uses to fetch things from the merger -- all of that is gone in v3).  we could start removing it now, however, there's a possibility we might want to keep it, or something like it, to uniquely identify build configurations (though we would probably call it something ...16:39
jeblair... else).  (it's one of the things we can use to allow jobs in the same build set to stash and fetch artifacts with a shared key, for instance).  so i'm inclined to say maybe we leave that for the moment and see if we think we need that later.  we could log this as a backlog item: "remove or update zuul_ref env variable".16:39
mordredjeblair: I agree with keep it for now16:40
*** abregman has quit IRC16:40
SpamapSjeblair: yeah I had wondered about that too. Still ramping up in v3's changes so I wasn't sure enough to say.16:41
SpamapSjeblair: other than inform zuul-cloner of how to find merged things, ZUUL_REF doesn't really do anything, right?16:41
SpamapSthe way I worded that was sort of like Homer haggling with Professor Frink... "Well it _ONLY_ transports _MATTER_"16:42
openstackgerritMerged openstack-infra/zuul: Refactor test_zuul_refs and FakeBuild.hasChanges  https://review.openstack.org/39799416:42
jeblairSpamapS: yep :)16:43
* SpamapS updates workboard16:44
timrcStoryboard fail:16:45
clarkbjeblair: will there not be a ref I can fetch locally to reproduce a job on my laptop?16:45
timrc"400: GET /api/v1/users/preferences: Invalid input for field/attribute user_id. Value: 'preferences'. unable to convert to int "16:45
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837916:46
SotKtimrc: Sorry about that, its a known bug that happens when you've not accessed storyboard for a while. Refreshing the page should fix it.16:47
timrcSpamapS: There's a few worklist/boards with the word "zuul" in them but the one I should be paying attention to is: https://storyboard.openstack.org/#!/board/41?16:47
timrcSotK: Ah, should have tried that first. I just logged out and back in.16:47
jeblairclarkb: nope :(  parts of reproducing get easier (ansible!), but parts are harder (we push rather than pull git repos).  so i think we'll have to come up with something to fill that gap.16:48
jeblairclarkb: perhaps the merger can record what it does, and we can somehow include that (as a series of git commands?) in the build artifacts16:48
clarkbjeblair: I guess I had assumed (incorrectly) that pushing wouldnt remove ref hosting16:48
clarkband instead just be an optimization for jobs running16:49
clarkbso make refs as today, and then also push them16:49
pabelangerwould it be much work to support both?16:50
jeblairclarkb: we *could* implement it that way, but the way i did it has some advantages: first, the merger operations can happen in parallel, since each build gets its own copies of the git repos.  second, we don't have zuul refs to clean up.  (currently, we need to delete all our zuul refs every month or so to get reasonable performance from the mergers).  third, we don't have to tell users to run an external facing service (apache).16:52
clarkbI think the first thing could still happen if refs were no longer unique per buildset (can still fetch a ref for particular jobs to reproduce them)16:54
jeblair(keep in mind that in many cases, the "reproduce a job" use case is actually "i already have repos on my workstation, i want to run the job".  that gets *easier* with this setup.  clarkb's example is different, but still valid, and we should support it, of course.)16:54
SpamapStimrc: yes, 4116:54
jeblairclarkb: i'm not sure i follow16:55
clarkbjeblair: make a zuul ref for each job and do them in parallel16:55
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_job_end_event test  https://review.openstack.org/39849316:56
clarkbthe other two issues are harder to solve because git16:56
jeblairclarkb: oh, sorry, the reason they can be parallel is that they are literally happening in different git trees.  the merge operation dirties the working tree, so you can only do it one at a time.  current v3 implementation clones all necessary git repos into the jobdir and performs the speculative merges there.16:56
mordredI like the idea of recording the merge operations in some manner - and having a thing someone can run that will consume that output and produce the same git repo state16:57
SpamapSwell...16:57
mordredit seems like a thing that would be consumable in a thing like reproduce.sh - or as a standalone16:57
SpamapSyou could just have it push things to some archival "all the merges recently done" spot16:57
jeblairSpamapS: we did that 3 years ago for several months and killed gerrit :(16:57
SpamapSlike, optionally, as a thing that is used for shops that want to be able to grab them.16:57
clarkbjeblair: I see so you'd need separate hosting of every job dir16:58
jeblairSpamapS: so yes, possible, but a significant system administration overhead16:58
mordredjeblair: yah - it might have to be a separate thing that's only used for keeping recent historical merges16:58
SpamapSjeblair: I was just thinking a dumb dir somewhere16:58
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_job_auto_hold_success / test_job_auto_hold_failure  https://review.openstack.org/39849516:58
mordredrather than pushing back to gerrit16:58
mordredor the existing git farm16:58
SpamapSsomething that has an hourly cron that removes everything > 24 hours old16:58
jeblairmordred: sure, i'm just saying that kind of git repo management is *hard*.  it is *harder than we are willing to do*.16:58
mordredtotes16:59
jeblairi would not want to inflict that on our users :)16:59
jeblairlet me suggest this:16:59
SpamapSSeems like a decent optional add-on to support if shops find they aren't comfortable with pseudo-reproductions.16:59
jeblairif a user wants that feature, they can write a role (and put in in their pre-run playbook) that pushes the job's repo state to their big git history machine16:59
jeblairSpamapS: ^ ya16:59
persiaSpamapS: Rather than >24 hours, could it remove speculative merges known unreliable, speculative merges already accomplished, and speculative merges superceded?  The problems with time include holidays, timezones, etc.16:59
SpamapSI personally have never pulled a ref to reproduce something... I usually just git review -d.16:59
SpamapSand then rebase17:00
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_job_auto_hold_* tests  https://review.openstack.org/39849517:00
mordredSpamapS: same here17:00
clarkbI pull all the time...17:00
clarkbbut I deal with "I camt reproduce this locally" as a job hazard17:00
mordredjeblair: but I agree - an ansible role that can17:00
jeblairmordred, clarkb, SpamapS: meanwhile, we can also do the 'record what the merger does' idea and stuff it in some ansible variables so that something like reproduce.sh can output the git commands17:00
mordredgah17:00
SpamapSjeblair: I like the idea of a standard library of things that can be pulled in according to needs.17:00
jeblairSpamapS: yep17:01
mordredSpamapS: ++17:01
jeblairwe can even write that role for folks, which means we can put a nice warning label on it.  :)17:01
SpamapSPlease somebody write this down in a story.17:01
clarkbthe use case is also met if there is a simple way to produce a ref locally17:02
SpamapSIt sounds like something that may end up being a regression for clarkb17:02
jeblairyes, i do not think we should run the big git machine for storing refs in infra.  i think for clarkb's case, we should do the 'output git commands' thing.17:02
mordredand then see how angry that makes clarkb17:02
mordredand fix if the anger level exceeds the anger level that etherpad produces :)17:03
clarkbugh dont remind me17:03
clarkbnpm ehy?!17:03
* mordred hands clarkb a nice fluffy bunny17:03
SpamapSSomebody write a paper to define the etherpad anger scale.17:03
jeblairSpamapS: i will storify.17:04
jeblairbut would have loved it if someone else volunteered.17:04
SpamapS0 being "I don't have to run etherpad". 10 being "I run etherpad for OpenStack summits"17:04
timrcI think SpamapS inadvertently highlighted a potential issue... thinking of workflow in terms of git review. If zuulv3 is going to cater to things like GH, we shoul be thinking about how developers on that platform work with changes.17:05
jeblairtimrc: for this purpose, there is no difference.  git commits are fetched regardless.  they are called different things.17:07
SpamapStimrc: that's a good point. There's no git review -d there. But there should be sha's that one can recover from GH.17:07
SpamapSI think timrc's point is that there's no record of the sha's in PR's17:07
jeblairto be clear, there is no 'git review -d' in infra's zuul either.17:07
SpamapSeh, gerrit?17:08
jeblairwe don't use tha sha's for gerrit17:08
jeblairzuul fetches named refs from gerrit to test changes.  it will also fetch named refs from github.17:08
mordredyah - pulls have named refs associated with them17:08
SpamapSI think users might have a hard time determining the named refs.17:09
SpamapSSince PR sources get push -f'd over17:09
mordredgit fetch origin pull/{ID}/head17:09
SpamapSthat {ID}, is that the PR #?17:09
mordredyes17:09
jeblairyeah, in gerrit they are named refs/changes/34/1234.  in github they are named /pull/1234/head17:10
SpamapSso you're screweed when somebody has pushed over it.17:10
mordredyup17:10
SpamapSbut that sha is still there17:10
mordredbut that's just one of teh limitations of github17:10
SpamapSand could be recovered17:10
mordredgithub actually garbage collects fairly aggressively17:10
mordredso it might not be there17:10
SpamapSof course it does17:10
SpamapSyay GH17:10
timrcWell we'll ahve a reflog in the cloner right :)?17:10
jeblairi do not want to try to improve github.  we should use/expose the service they provide17:10
mordredyah17:11
timrcjeblair: No no fair point, it just got me thinking.17:11
mordred++17:11
Zara(+1 from this lurker)17:11
SotK+1 from this one too17:11
jeblairSpamapS: most of the time people add commits to prs though.  that has a correspondence with gerrit patchsets and maintains reproducibility.17:13
jeblairwell, i'll say "many times".  i do not know if it is most of the time.17:13
clarkbjeblair: a lot of projects enforce a rebase and squash workflow to keep things "clean" (which is somewhat funny bceause thats like the biggest reasion people hate gerrit except that gerrit actually does preserve all that unclean history too whereas github doesn't)17:14
jeblairclarkb: thus my self-correction.17:14
jeblairat any rate, we'll be as reproducible as the underlying system17:15
jeblairand we can record the shas in our merger-recorder so that if something changes, it is detectable.17:16
pabelangerShrews: any ideas? I am trying to use zk.deleteBuild() but getting an kazoo.exception: http://paste.openstack.org/show/589484/17:16
pabelangermaybe I am not using the right build_number17:17
pabelangerbut I don't think so17:17
SpamapShttps://www.dropbox.com/s/d3bhhwcmevgwj1n/GerritVSGithub.jpg?dl=017:19
* SpamapS felt a need to express how he feels about these two things17:19
pabelangerShrews: ah, but using self.client.delete(path, recursive=True) for deleteBuild() the exception goes away17:19
jeblairSpamapS, clarkb, mordred: https://storyboard.openstack.org/#!/story/2000801  those tasks have notes -- click the arrow to see them17:21
SpamapSjeblair: I think a gating workflow will cure people of their "add commits, squash later" behavior relatively quickly. ;)17:21
jeblairSpamapS: :)17:21
ZaraSpamapS: :D17:21
* SotK hopes so17:22
jeblairpabelanger: are you working on the same thing as in https://review.openstack.org/398418 ?17:23
pabelangerjeblair: I am but for test_nodepool17:24
pabelangerI think I did the same thing17:24
pabelangerlet me push up the patch17:24
jeblairpabelanger, Shrews: see comment in https://review.openstack.org/39841817:25
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Re-enable test_image_delete test  https://review.openstack.org/39850217:25
pabelangerjeblair: Shrews: okay, ^ could use some eyes. That's the first test I've had to rewrite to not use the database17:26
pabelangerI am a little concerned we are actually testing the image was deleted17:26
jeblairpabelanger: i wonder if that test should go away17:26
jeblairpabelanger: i suspect it has been superceded by a test in test_builder (where that functionality actually resides now)17:27
pabelangersure, let me check it quickly17:27
jeblairpabelanger: cool thx17:27
clarkbpabelanger: I too have comments on https://review.openstack.org/#/c/397944/6 happy to have them be follow up items, but wanted you to see them both before I approve that change17:27
clarkb(and I intend on approving the change just as soon as we can have a quick glance and agree that nothing needs to be fixed in that patchset)17:27
pabelangerjeblair: looks some are still disabled in test_builder, but I don't see one specific to deleting an image.  I can remove it from test_nodepool now, but we'll likely need to increase code coverage for test_builder17:30
clarkbjeblair: pabelanger I agree that that test should live in test_builder.py now17:31
clarkbso if there isn't such a test maybe we move it as part of removing it from test_nodepool.py?17:31
pabelangerclarkb: re: 397944. I think you are correct, we shouldn't only compare the 0th element17:35
clarkbpabelanger: but its ok to fix in a followup because our zk config lists only have a single elemtn currently right?17:36
clarkbpabelanger: or does it need to be addressed in that base change?17:36
pabelangerclarkb: today, we only have a single element17:36
pabelangerI'll do a follow up then17:37
clarkbkk17:37
*** hashar has quit IRC17:38
pabelangerclarkb: jeblair: So, I think we can delete or move the remaining tests in test_nodepool.  test_building_image_cleanup_on_start / test_handle_dib_build_gear_disconnect17:38
pabelangergear for sure can be deleted17:38
pabelangerclean up maybe into test_builder17:38
jeblairpabelanger: the cleanup may not be necessary either -- there's a janitor thread running that should delete any failed image builds, so we don't need to do that on start any more.17:42
clarkbthough having a test that things do get cleaned up by that thread is probably a good idea17:42
clarkbso would change to build enough images that one gets superceded, wait 2*cleanup interval, check image is gone17:42
openstackgerritMerged openstack-infra/nodepool: Re-enable TestNodepool.test_node  https://review.openstack.org/39794417:43
openstackgerritMerged openstack-infra/nodepool: Replace node_dib.yaml with node.yaml for tests  https://review.openstack.org/39835717:43
openstackgerritMerged openstack-infra/nodepool: Re-enable test_node_net_name test  https://review.openstack.org/39836817:43
clarkbpabelanger: ^ I am approving things, followup to fix the zk startup would be great17:43
pabelangerclarkb: sure, I can do that now17:43
pabelangerHmm, looks like a race: http://logs.openstack.org/46/398446/1/check/gate-nodepool-python27-db-ubuntu-xenial/b9bb5c5/console.html17:44
pabelangerneed to dig into that17:44
jeblairclarkb: yes, though an analog to this case would be "start builder, kill builder during image build, start builder and verify previous 'building' image was deleted"17:44
clarkboh right the state there is what is important17:45
clarkbjeblair: pabelanger also while people are reviewing nodepool things https://review.openstack.org/#/c/370455/ would be nice to fix on master17:46
jeblairaprvd17:47
clarkbty17:47
clarkb(especially since we have to live with subnodes for a little while longer aiui)17:47
openstackgerritMerged openstack-infra/nodepool: Enable test_node_vhd_image test  https://review.openstack.org/39836917:48
openstackgerritMerged openstack-infra/nodepool: Re-enable test_node_vhd_and_qcow2 test  https://review.openstack.org/39841617:48
openstackgerritMerged openstack-infra/nodepool: Re-enable test_dib_upload_fail test  https://review.openstack.org/39842617:49
clarkbpabelanger: http://logs.openstack.org/46/398446/1/check/gate-nodepool-python27-db-ubuntu-xenial/b9bb5c5/console.html maybe a race in the new enabled jobs? I think I will hold off on approving things until we can sort that out17:50
clarkb*more things17:50
pabelangerclarkb: agreed, trying to figure out why now17:51
openstackgerritMerged openstack-infra/nodepool: Fix subnode deletion  https://review.openstack.org/37045517:51
pabelangerclarkb: Shrews: It looks like we are upload the same image multiple times: http://logs.openstack.org/46/398446/1/check/gate-nodepool-python27-db-ubuntu-xenial/b9bb5c5/console.html#_2016-11-16_16_18_45_22010617:52
clarkbpabelanger: yes thats the image that fails to upload17:53
pabelangerAh17:53
clarkbpabelanger: so the builder retries it, what should happen is the other image is uploaded successfully and nodepool boots nodes in that provider instead17:53
clarkbpabelanger: basically this is testing that if one provider can't get images that we successfully use other providers that can17:53
pabelangerlooks like we don't attempt the upload of the fake-provider217:54
clarkbpabelanger: interesting, but that doesn't happen always because https://review.openstack.org/398426 passed testing enough times to merge17:55
openstackgerritMerged openstack-infra/nodepool: Re-enable test_subnodes test  https://review.openstack.org/39843817:57
clarkbpabelanger: but I agree http://logs.openstack.org/46/398446/1/check/gate-nodepool-python27-db-ubuntu-xenial/b9bb5c5/console.html#_2016-11-16_16_18_45_28175917:57
openstackgerritMerged openstack-infra/nodepool: Enable test_node_az test  https://review.openstack.org/39844217:57
clarkbpabelanger: shows that we never get the second image which should succeed there17:57
pabelangerYa17:58
pabelangerthink we have an issue in our imageupload loop17:58
clarkbya this looks like a real bug reading the logs17:58
pabelangerinteresting17:58
pabelangerhttp://paste.openstack.org/show/589491/17:59
pabelangersome lock exceptions when I run it locally again17:59
clarkbat least if you grep out the builder logs only the provider that should fail has upload attempts17:59
clarkbif I were guessing the builder doesn't round robin its looping? I was just reading that code but I don't remember off the top of my head18:01
clarkbso this test will succeed if provider2 happesn to go first for some reason18:01
clarkband fail if 01 goes first? (this is theorizing)18:01
pabelangerclarkb: I think you are right18:02
pabelangerall my local working tests, fake-provide1 is first18:02
pabelangererr18:02
pabelangerfake-provider2 is first18:03
clarkbShrews: ^ you may have input18:03
pabelangerlet me see if I can switch it18:03
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837918:08
jeblairclarkb, Shrews, pabelanger: yeah, i think we need some more exception handling there.  i will write a patch.18:13
*** harlowja has joined #zuul18:14
clarkboh right this maybe the case I mentioned yesterday in my review18:15
clarkbbceause its going to raise an exception during upload18:15
jeblairclarkb: yep :)18:17
pabelangerneat, we know the issue18:21
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image uploads separable  https://review.openstack.org/39852418:22
jeblairclarkb, pabelanger, Shrews: ^18:22
jeblairi'm going to build on that a little more in a subsequent change as well18:22
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove DibImage from the database  https://review.openstack.org/39852518:24
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for UploadWorker  https://review.openstack.org/39852618:27
pabelanger+2, want clarkb to also see18:27
clarkbyup about to pull it up18:27
pabelanger2016-11-16 18:27:15,781 INFO nodepool.image.build.ubuntu-dib: * Failed to connect to bootstrap.pypa.io port 443: Connection timed out18:30
pabelangerguess we should update nodepool devstack test to use the cached version :)18:30
mordredpabelanger: :)18:31
clarkbor juist fix neutron18:31
clarkbbecause really we should have a working neutron18:31
pabelangeris that why we are failing to download it?18:32
clarkbyes18:33
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for CleanupWorker  https://review.openstack.org/39852718:33
clarkbwell I am 95% sure at least18:33
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for CleanupWorker  https://review.openstack.org/39852718:33
clarkbguessing the job ran on osic and nodepool uses neutron and neutron breaks ipv4 internet connectivity on osic18:33
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for BuildWorker  https://review.openstack.org/39852818:35
pabelangerclarkb: Ya, osic18:35
mordredclarkb: have you looked at kevinbenton's devstack patch yet?18:36
clarkbmordred: not yet18:36
clarkbit was being iterated on heavily last night and I had to get dinner and stuff18:36
mordredclarkb: https://review.openstack.org/#/c/398012/ seems to be no longer heavily iterated upon18:37
mordredwhich is the new patch18:37
clarkbcool will take a look after reviewing these nodepool fixes18:37
*** abregman has joined #zuul18:37
*** abregman is now known as abregman|afk18:37
mordredI, on the other hand, am going to the dentist18:38
mordredso you enjoy reviewing those patches :)18:38
morgan_mordred: reconfiguring my dev env, hopefullt will have ksa stuff cleaned up soo18:38
morgan_mordred: enjoy the dentist. =/18:38
jeblairer, i'm going to rejigger 52418:39
jeblairi don't think we need the stopprocessingexception18:39
clarkbjeblair: ok if you do that maybe remove the new newline before the run() docstring?18:40
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for BuildWorker  https://review.openstack.org/39852818:41
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image uploads separable  https://review.openstack.org/39852418:41
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for CleanupWorker  https://review.openstack.org/39852718:41
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Add a fallback exception handler for UploadWorker  https://review.openstack.org/39852618:41
jeblairclarkb: sorry18:41
clarkbits still valid python so not a huge deal18:42
clarkbjeblair: pabelanger I want to read the test logs for 524 before posting a review (to double check we ran both of them and just didn't get lucky running provider2 first again, but otherwise looks good)18:45
pabelangerclarkb: I can rebase our current failure at top of that stack18:46
clarkbpabelanger: I don't think you need to18:46
clarkbwe will just approve jeblair's first18:46
pabelangerk18:46
clarkboh because the test isn't enabled right now...18:46
pabelangerright18:46
clarkbok ya lets just approve it then we can recheck your change and double check the logs there18:46
pabelangerack18:46
Shrewsjeblair: may i ask for a bit more explanation on your comment on 398418?18:49
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image cleanup separable  https://review.openstack.org/39853518:49
jeblairShrews: certainly!  unless i'm misunderstanding, i think that change removes a cli command which i think is still necessary.18:50
jeblairShrews: oh i did misunderstand18:50
jeblairShrews: i see my mistake now :)18:50
jeblairShrews: sorry about that.  +2.  :)18:51
Shrewsjeblair: ok. i think that failure on that change might be exposing some testing overlap. getting 2 results vs. the 1 we expect. will need to investigate18:51
Shrews(assuming i grok the failure correctly)18:52
jeblairShrews: which failure?18:53
Shrewsjeblair: oh, a different review. sorry18:54
Shrews(too much stuff happened while i was at lunch)18:54
jeblairyay stuff!18:56
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_dib_image_delete  https://review.openstack.org/39841818:58
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_config_validate  https://review.openstack.org/39842718:59
Shrewsjeblair: this is review that had the failure i was referring to: https://review.openstack.org/39843718:59
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image building separable  https://review.openstack.org/39854219:02
Shrewsoh, i see what it is19:03
Shrewsduh19:03
*** Shuo has joined #zuul19:05
jeblairShrews: btw, you can probably just do waitForImage there (the upload should happen automatically; and since we're removing the image-upload command, that's the expected workflow anyway (ie, i think that's maybe what the test should test))19:06
clarkbpabelanger: can you rereview https://review.openstack.org/#/c/398524/2 and approve if it looks good to you? then we can recheck your change19:11
Shrewsjeblair: ack. i don't understand why, though, when i run the test_image_build test alone, it succeeds, but when i run the entire test_commands.py suite, an additional fake image exists19:11
jeblairShrews: i. um.19:12
Shrewsexactly19:12
pabelangerclarkb: ack19:12
jeblairShrews: i get that too.19:13
Shrewsjeblair: it may be a race... the first one created from _useBuilder just isn't ready yet?19:14
Shrewswhen run alone19:14
jeblairthat sounds plausible19:15
Shrewsmm, nah, i don't think that's it19:15
jeblairShrews: oh19:16
jeblairShrews: it's both a scheduled and manual build19:16
Shrewsright. so the scheduled one just hasn't been hit by the scheduler yet19:17
Shrews??19:17
jeblairyep19:17
Shrewshah19:17
jeblairi think this worked before because the builder didn't schedule its own builds19:18
Shrewsso i should waitForImage() before doing the manual build request, then ignore the scheduled build19:18
jeblairwe don't have a way to convince it not to do that.  so we may need to alter this test to expect two builds.19:19
jeblairShrews: yep19:19
Shrewsexciting19:19
jeblairstart; wait for first; run manual; wait for second (will probably need to change waitForBuild to do this); verify 2 builds19:19
Shrewsjeblair: didn't you just suggest to remove waitForBuild?19:20
jeblairShrews: yeah, sorry, meant to type waitForImage there19:22
Shrewshrm, this is where the missing 'count' param for getMostRecentImageUpload() would have been useful19:22
*** jamielennox|away is now known as jamielennox19:26
Shrewsi guess i can give it a list of IDs to ignore19:26
SpamapSbleh..19:37
SpamapSjeblair: fighting a bit with the gear text friendly things19:38
SpamapSinheritance really just makes it pervasive up and down all the classes. :-P19:38
SpamapSevery attribute ends up having to be managed by getters/setters .. and data, in particular, (the response from worker->client) is complicated because it's an array that is mutated a lot by reference.19:39
jeblairSpamapS: can you just operate on 'name' using @property?19:40
openstackgerritMerged openstack-infra/nodepool: Make image uploads separable  https://review.openstack.org/39852419:40
SpamapSjeblair: name being assumed to be utf-8 is a relatively easy change.19:40
SpamapSit's all the other things that resist. :-P19:41
SpamapSlet me submit the name change by itself.19:41
Shrewsjeblair: LOL/CRY... so, getMostRecentImageUpload() in waitForImage() is always getting the scheduled upload in that case (since it happens after the manual build request).19:41
openstackgerritMerged openstack-infra/nodepool: Add a fallback exception handler for UploadWorker  https://review.openstack.org/39852619:41
jeblairoh, i misunderstood what you were saying (i thought you meant "i can't just @property name, i have to __setattr__ the whole class")19:41
Shrewsi'm obviously doing something wrong19:42
Shrews:)19:42
jeblairShrews: that should be good enough to wait for the first (scheduled) upload to finish, then if you submit a manual request, we do still need something to wait for that (either a modification to that function, or just a loop over getmostrecentupload until it returns something different, maybe)19:43
SpamapSjeblair: What I'm having to do is make a getter/setter for everything that wants to be different in Text(Client/Worker/Job/WorkerJob)19:46
SpamapSjeblair: and then also make a binary accessor for times when you actually want the binary no matter what.19:47
Shrewsjeblair: actually, i'm not entirely convinced that getMostRecentImageUpload is working properly19:48
openstackgerritClark Boylan proposed openstack-infra/zuul: Don't retry when using synchronize module  https://review.openstack.org/39855819:49
clarkbjeblair: ^ there we go19:49
jeblairSpamapS: hrm, i thought with an inheritance sequence where Text things override Binary things we would just not have to do anything tricky for the binary bits.19:52
jeblairShrews: how so?19:52
SpamapSjeblair: the problem is that we pass around jobs internally and make packets out of them19:52
SpamapSand that _always_ has to be binary19:52
*** hasharAway is now known as hashar19:52
morgan_SpamapS: this is largely what I ran across in my previous attempts at this19:53
SpamapSmorgan_: yep19:53
SpamapSI'm not saying it can't be done19:53
Shrewsjeblair: the ZK data looks correct, but it doesn't seem to be returning the most recent upload across builds (for whatever reason)19:53
SpamapSjust that it's a whole lot of new code for convenience sake.19:53
Shrewsjeblair: putting together a unit test for it now19:53
morgan_SpamapS: it's an issue with python3 and the technology we're using, it's doable, it's just unfun and what you just said. alot of code for working around base features of what we're using19:54
morgan_SpamapS: for some minor convienence19:54
morgan_but i agree it is doable. just not sure if it is worth it19:54
SpamapSWell I made one that didn't use inheritance19:54
SpamapSso the facade was only on the user side19:54
SpamapSinternally gear just kept using Job and WorkerJob19:55
SpamapSthat was a lot less code, but... maybe a bit confusing code ;)19:55
morgan_ah yeah19:56
morgan_i could see that19:56
SpamapShttps://review.openstack.org/39856019:56
SpamapS^^ this is just to make name always utf-8 assumed (so you always get strings back from job.name)19:56
SpamapSwhich is technically an API break19:56
SpamapSmorgan_: if you want to look at the facade I made.. it's here for now: https://review.openstack.org/39354419:57
morgan_i think it's an ok API break because it is unlikely that anyone has seriously used it in py3 until very recently at best19:57
morgan_and in py2... again not really something anyone dealt with19:57
SpamapSYeah it doesn't break py2 at all19:58
SpamapSand the existing tests are unchanged19:58
SpamapSso it's a really, really subtle API break19:58
morgan_yeah19:58
SpamapSI should actually add a test for the name19:58
SpamapSin string19:58
openstackgerritMerged openstack-infra/nodepool: Add a fallback exception handler for CleanupWorker  https://review.openstack.org/39852720:01
morgan_I personally prefer the application using gear managing the binary conversion/stringifying vs making gear smarter.20:01
SpamapSmorgan_: well I think the idea is to give the user a thing that does the simplest thing well.20:02
morgan_but that doesn't mean my view is correct for this case20:02
morgan_just personal preferance.20:02
clarkbjeblair: starting at https://review.openstack.org/#/c/398535/1 that exception handling stack has failed pep820:03
jeblairi hope the idea that gear providing an interface that lets this simple thing work for many use-cases (including the one gear was created for) without reducing our ability to interface at a low level is a compromise we can agree on.  :)20:04
SpamapSLike, in the facade version I put up, if you're using TextWorker, you get a string arguments, so your worker can just be like 'job.sendWorkComplete(process_data(json.loads(job.arguments)))'20:04
jeblairclarkb: on it20:05
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image cleanup separable  https://review.openstack.org/39853520:06
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Make image building separable  https://review.openstack.org/39854220:06
SpamapSjeblair: so, when you do get time to look at the one I just pushed up that just does name.. to do a full text client/worker/job set.. every attribute needs three methods.. 2 for access by the user, and 1 for library intra-class access to the binary attributes (we could perhaps just make the binary ones named, so not a method actually)20:06
clarkbI am going to grab lunch but then will be back to keep reviewing those nodepool changes20:07
clarkbpabelanger: don't forget fixing the zk connection stuff in nodepool.py (I can help write that change too if that helps you)20:07
pabelangerclarkb: ya, working on that now20:08
jeblairSpamapS: right -- though name (and unique?) are simple.  data, as you point out, is slightly more complicated.  but those are the only things we have to deal with, right?20:08
clarkbpabelanger: awesome20:08
pabelangerclarkb: just finished rechecking some failed, patches20:08
pabelangerclarkb: I think you can continue your revires20:08
openstackgerritMerged openstack-infra/nodepool: Add a fallback exception handler for BuildWorker  https://review.openstack.org/39852820:08
pabelangerreviews*20:08
jeblairSpamapS: i'll take a closer look after lunch20:09
SpamapSjeblair: arguments needs to be textified too20:09
clarkbpabelanger: ya I can, but need food20:10
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Add test for getMostRecentImageUpload  https://review.openstack.org/39856520:10
Shrewsjeblair: nm, it appears to work according to ^^^. issue lies elsewhere20:10
Shrewsoh! but i realize where the problem is, thanks to creating that test20:13
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Rename zookeeper_client to zk for nodepool.py  https://review.openstack.org/39856620:13
pabelangerShrews: clarkb: ^20:14
pabelangerthat should address your previous comments20:14
* mordred announces that he has a very numb half face20:17
pabelangerquick get the trout20:17
mordred:)20:20
pabelangernever see this before: http://logs.openstack.org/66/398566/1/check/gate-nodepool-python27-db-ubuntu-xenial/6bdcc1d/console.html#_2016-11-16_20_17_19_09990220:21
pabelangerResource temporarily unavailable (src/signaler.cpp:301)20:21
mordredseems to be a zmq thing20:25
mordredhttps://github.com/zeromq/libzmq/issues/158320:25
morgan_mordred: i welcome our new, half-numb-faced, overlords20:27
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Add build_id attribute to ImageUpload object  https://review.openstack.org/39856820:29
mordredmorgan_: :)20:31
mordredmorgan_: half-numb-face is eversomuchfun20:31
morgan_mordred: i did whole-numb-face last dentist visit20:32
morgan_mordred: ok so i'm reinstalling my dev machine now and will be working on fixing betamax in ksa then onto task interface20:32
morgan_mordred: i have some ideas on testing :)20:32
morgan_mordred: hopefully have this wrapped up today/tomorrow/friday :)20:33
mordredwoot20:41
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_config_validate  https://review.openstack.org/39842720:41
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_dib_image_delete  https://review.openstack.org/39841820:41
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_image_build  https://review.openstack.org/39843720:41
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable tests: test_job_create, test_job_delete  https://review.openstack.org/39844020:43
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Remove test: test_snapshot_image_update  https://review.openstack.org/39844520:43
Shrewspabelanger: sorry, i had to be heads down for a bit to get some of this stuff rolled out. did you still need my help with something?20:44
pabelangerShrews: I did have a question about: https://review.openstack.org/#/c/398502/1/nodepool/zk.py20:45
pabelangerI had to add recursive to fix an exception that kazoo was raising20:45
pabelangernot sure if I was using the function properly or not20:45
Shrewshrm, let me think about that a bit. not sure if we really want recursive there20:46
pabelangerack20:46
pabelangerlike I said, it is possible I am using it incorrectly20:46
pabelangerwas trying to delete an image20:47
Shrewspabelanger: we shouldn't delete a build unless the uploads have been deleted first... which may be the reason i didn't have recursive there20:47
Shrewsgotta revisit the logic a bit20:47
mordredShrews: I believe you have an oops in https://review.openstack.org/#/c/398568 ?20:47
Shrewsmordred: ugh20:48
pabelangeryay20:48
pabelangerhttp://logs.openstack.org/79/398379/4/check/gate-dsvm-nodepool/a7fc84b/logs/nodepool-image-list.txt.gz20:48
mordredShrews: have I mentioned the dislike for positional parameters I've developed?20:48
pabelangernow to see why the job failed20:48
mordredpabelanger: it's the name20:49
mordredpabelanger: the test is doing a grep for 'trusty-server'20:49
pabelangermordred: Ah20:50
pabelangerthanks20:50
mordredpabelanger: that's SUPER EXCITING20:50
pabelangerhttp://logs.openstack.org/79/398379/4/check/gate-dsvm-nodepool/a7fc84b/logs/screen-nodepool.txt.gz#_2016-11-16_19_50_02_30920:50
pabelangerNODE ONLINE20:50
pabelangerwee20:50
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Add build_id attribute to ImageUpload object  https://review.openstack.org/39856820:50
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool: Re-enable test: test_dib_image_delete  https://review.openstack.org/39841820:51
mordredpabelanger: I love it when we work really hard and the end result is that we have booted a node20:52
mordredpabelanger: and we're legitimately excited about having done so20:52
pabelanger++20:52
pabelangerit is exciting for sure20:52
mordredpabelanger: I remember being similarly excited when we replaced the nodepool calls to openstack with shade20:53
mordredand the end result of much pain was a thing that worked exactly the same as before20:53
timrcIt has more potential now, though :)20:54
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837920:54
pabelangermordred: okay, that should do it^. Was looking for trusty-server, like you said, which was a snaphot build20:54
pabelangerwe can remove it now20:54
Shrewspabelanger: so...20:56
Shrewsabout that test... it seems redundant (see https://review.openstack.org/398418), and adding recursive there is not what we want20:56
pabelangerShrews: Ya, I think we are going to delete it now, since it lives in test_nodepool.py, but I wanted to see if recursive was needed or not20:57
clarkbthe command test is not redundant right? its trsting the cli works20:58
clarkbmaybe I am talking about different test20:58
Shrewspabelanger: what should happen is you mark the build as 'deleted' in ZK, and then the builder janitor thread will do the actual deleting of things (from provider and disk)20:59
pabelangerya20:59
Shrewspabelanger: so calling zk.deleteBuild() there is not the right way20:59
pabelangerclarkb: I think we agreed in backscroll to add a new test to make sure nodepool-builder actually did the deletes from disk21:00
pabelangerShrews: k, when do we use it?21:00
clarkbpabelanger: correct but you still need a test that the cli worls too21:00
clarkbypu need both21:00
pabelangeryup21:00
Shrewspabelanger: after you call deleteUpload() for all of the uploads of that build.21:01
Shrewspabelanger: if we recursively delete from the build node downward, then we'd be missing the provider upload information21:02
pabelangerright21:03
Shrewsso i guess it's sort of a safety mechanism to not recursively delete21:03
pabelangerSo, that means we cannot delete a image on disk without deleting it from the provider uploads too?21:03
Shrewspabelanger: you can delete on disk stuff, just gotta mark the build node as 'deleted' when you do21:04
clarkbthat seema like an anyi feature21:04
clarkbah ok21:04
pabelangerYa, trying to think how it works today21:05
clarkb(sometimes you run out of disk)21:05
pabelangerright :)21:05
Shrewspabelanger: this code is pretty much what i just described: https://github.com/openstack-infra/nodepool/blob/feature/zuulv3/nodepool/builder.py#L316-L32121:05
openstackgerritMerged openstack-infra/nodepool: Add test for getMostRecentImageUpload  https://review.openstack.org/39856521:10
clarkbpabelanger: ok back to review that stack21:11
pabelangeryay21:11
jeblairclarkb: if that becomes a problem, we could probably delete the actual file on disk before we delete the znode (which should still only happen after the uploads are deleted)21:11
Shrewspabelanger: btw, we should just be setting state on zk nodes to trigger things happening. so you shouldn't have to try to model that same logic elsewhere21:12
pabelangerShrews: okay, I'll do that moving forward21:12
clarkbjeblair: I guess I don't think we need the strict enforcement that an image must not exist in a cloud for it to be removed locally, I think in an ideal world thats the case but we run out of disk all the time21:12
jeblairclarkb: i don't think we disagree?21:14
clarkbI had to reread your comment a few times but now I think I get it21:14
clarkbbasically you are saying its ok to have a znode in deleted state and no content on disk21:15
jeblairyeah.  that's not how it works now.  but i think we can try it for a little bit, and if we need to reorder it, we can.  it should be a simple change and does not substantially change the deletion algorithm.21:15
clarkbso zk still has a record of the thing but its not consuming any real bytes on your filesystem (outside of a smal record in zk)21:15
jeblairclarkb: correct21:15
clarkbgotcha ya that should work too21:15
jeblairShrews: a suggestion on 398437, but i think the logic looks right.21:23
Shrewsjeblair: can totally just pass the object in. good idea21:25
Shrewsi like to complicate things21:25
Shrewsand then have you fix me21:25
Shrews:)21:25
jeblairi like to simplify things.  we're a team.21:25
clarkbjeblair: Shrews https://review.openstack.org/#/c/398535/2 commenst on that that I think is related21:29
clarkbcan you tell me if I am wrong on those two things (I +2'd because its not a regression, but suggestions for simplification)21:29
openstackgerritMerged openstack-infra/nodepool: Re-enable test_node_ipv6 test  https://review.openstack.org/39844621:29
jeblairclarkb leaves comment requiring brain thinking21:30
clarkbtldr I think we are hitting zk way more than necessary :)21:31
pabelangerYay!21:31
pabelangerdevstack job worked21:31
pabelangerhttp://logs.openstack.org/79/398379/5/check/gate-dsvm-nodepool-src-shade/dfc3449/console.html21:31
pabelangerExcellent work Shrews and jeblair!21:32
jeblairpabelanger: yays!21:33
*** abregman|afk has quit IRC21:35
jeblairclarkb: a build that is not ready, building, or deleted doesn't leave much...  that pretty much covers all of the current states...21:38
clarkbjeblair: well we already exclude ready, building, and deleted there21:39
clarkbwe just do it in a really round about way21:39
clarkboh we aren't excluding deleted21:39
jeblairi don't think so... i'll reply inline21:39
clarkbjust building and ready21:39
openstackgerritMerged openstack-infra/nodepool: Re-enable test_node_delete_success test  https://review.openstack.org/39844821:39
clarkbbut still I think we nca more easily, make list excluding building and ready once (cuts down on zk queries) then continue21:39
openstackgerritMerged openstack-infra/nodepool: Re-enable test_node_delete_failure test  https://review.openstack.org/39845021:39
openstackgerritMerged openstack-infra/nodepool: Re-enable test_db test  https://review.openstack.org/39845121:40
openstackgerritMerged openstack-infra/nodepool: Re-enable test_leaked_node test  https://review.openstack.org/39845821:40
jeblairclarkb: basically, we want to delete any build that is 'delete', ignore any 'building' that's actually building, delete any 'building' that isn't building, and delete any ready older than the 2 most recent21:40
jeblairso it's not super straightforward :)21:40
clarkbah tahts what the 2 there means ok. So I think the case we don't need to handle is buidling that isn't building21:41
clarkbwe should just make sure that that always updates to delete?21:41
jeblairclarkb: it's to handle the case where someone hits the builder with a sigkill21:41
jeblairor oom or whatever21:41
clarkbI see21:42
clarkbok so maybe we can't easily simplify that portion of the code, I think the second comment is doable though?21:42
clarkbhrm actually because its over all known providers and not just providers with uploads that may not work (depending on how case of delete upload that doesn't exist is handled)21:43
Shrewspart of the thing that complicates that code is it must not assume that it has access to all providers that might be recorded in zookeeper.21:46
Shrewsin that case that one might want to split up provider ownership across different machines21:47
jeblair(which is difficult, but doable with a network file system.  any one will do.)21:47
pabelangerclarkb: I think you can also review https://review.openstack.org/#/c/398379/ now. That brings our dsvm jobs online again21:47
clarkbpabelanger: yup I have pulled that change and a couple others up for review next21:47
clarkbjeblair: Shrews: for the keeping a deleted node around? I agree but I also think we can use error handling to more easily express that in the code and avoid more zk lookups21:48
clarkbjeblair: Shrews: basically if I failed to delete this then keep the deleted nodearound in zk21:48
clarkbratherthan doing a second pass to see what was and wasn't deleted21:48
clarkbShrews: is the problem there that you may not even attempt to delete from a provider bceause known providers is known to me only?21:49
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Log image delete exception as exception  https://review.openstack.org/39859621:49
clarkbpabelanger: is there a snapshot version of https://review.openstack.org/#/c/398525/1 ?21:51
pabelangerclarkb: not yet, but can get started21:51
jeblairclarkb: yeah, imagine that we ran an image builder local to each region we use (but they shared a filesystem, so they used the same image files, but we wanted to be local when uploading).  i don't know if it's a *good* topology, but it would work with this.  :)21:52
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove DibImage from the database  https://review.openstack.org/39852521:53
pabelangerclarkb: sorry, missed some code21:53
clarkbjeblair: Shrews we should probably document that? seems like a very subtle thing that would be easy to break in the future (also maybe test that specific behavior of two different builders with the same "backend" fs)21:54
jeblairclarkb, Shrews: as i think about this, we may have some problems removing a provider.  i think that would look exactly like this case to the builder.21:54
Shrewsjeblair: yes, that could pose a problem with the existing cleanup logic21:55
clarkbpabelanger: http://logs.openstack.org/79/398379/5/check/gate-dsvm-nodepool/82c6e61/console.html failed21:56
clarkbpabelanger: so I don't know that we are quite there yet on making the integration testing work21:56
Shrewsmy brain is fried from all the context switching on things. time to step away for a bit21:57
pabelangerclarkb: I think that is the neutron bug, causing dib to fail to build21:58
* mordred hands Shrews some nice fried pig brians21:58
clarkbpabelanger: ah21:58
clarkbpabelanger: wonderful21:58
pabelangerya21:58
pabelangerhttp://logs.openstack.org/79/398379/5/check/gate-dsvm-nodepool/82c6e61/logs/screen-nodepool-builder.txt.gz21:58
jeblairyeah, last thing there is get-pip21:59
clarkbpabelanger: I also notice that you don't explicitly start zookeeperd, I thought that it didn't auto start on ubuntu? was i mistaken?21:59
jeblairclarkb: i think it does if you install the zookeeperd package (we do)21:59
clarkboh are there different packages with different behavior? interesting21:59
pabelangerYa, zookeeperd has the init.d scripts22:00
jeblairclarkb: there's a separate server package and init scripts22:00
mordredyah. zookeeper has the zookeeper22:00
pabelangerright22:00
clarkbgotcha22:00
mordredit's a great design22:00
mordredI wish more things did it22:00
jeblair"it's a great design" [evaluation: 50% troll probability] "I wish more things did it" [evaluation: 15% troll probability]22:01
clarkband with the kazoo logging fix we no longer get 12MB compressed files of logs22:01
mordredjeblair: :)22:01
jeblairmordred: the trollocop plugin for irssi is indespensible22:02
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove SnapshotImage from the database  https://review.openstack.org/39860222:02
pabelangerand that should be the snapshot bits from the database22:02
mordredjeblair: I like that it serves the half of the people who believe that installing a service should imply starting it AND the half of people who believe the opposite22:02
jeblairclarkb, pabelanger: so i heard some talk about using a cached get-pip?  what's up with that?22:02
mordredthat is, of course, one of the fundamental philosophical divergences between debian and red hat22:02
jeblairmordred: turns out, we *can* all just get along.22:02
clarkbjeblair: we can set dib env vars such that it will use the cached version of that file that we cache for devstack22:03
mordredright?22:03
clarkb(I think)22:03
mordredyah22:03
mordredit's DIB_REPOLOCATION_something_something22:03
mordredgreghaynes: ^^22:03
pabelangerclarkb: jeblair: Ya, we can first check it exists on dist, then add it for the gate22:03
pabelangerdisk*22:03
greghayneshrm?22:03
jeblairhas that been implemented elsewhere so we can copy it?22:03
greghaynesoh, yes, DIB_REPOLOCATION_pip_and_virtualenv=/some/path22:03
greghaynesworst var name ever22:04
clarkbgreghaynes: basically neutron is breaking our ability to internet to fetch get-pip.py when building a dib image for nodepool integration test22:04
clarkbgreghaynes: but we already have a copy on the test instance so maybe we can reuse it22:04
mordredyah - so just set that var to where we store get-pip.py and we should be good22:04
jeblairhttp://git.openstack.org/cgit/openstack-infra/nodepool/tree/devstack/plugin.sh#n25422:04
clarkbgreghaynes: http://logs.openstack.org/79/398379/5/check/gate-dsvm-nodepool/82c6e61/logs/screen-nodepool-builder.txt.gz#_2016-11-16_21_10_43_097 is the logging where the internets don't work22:05
greghaynesawesome, I think trove was hitting that too22:05
mordredgreghaynes: see jeblair's link22:05
clarkbI think its /opt/stack/new/devstack/files/get-pip.py22:05
mordredit seems we ARE setting it22:05
jeblairbut only if NODEPOOL_CACHE_GET_PIP is set, right?22:05
pabelangerlet me check quickly22:05
mordredNODEPOOL_CACHE_GET_PIP=/opt/stack/cache/files/get-pip.py22:05
mordredthat's in the top of the file22:05
jeblairoh i think this is a master/zuulv3 thing22:06
clarkbnot sure if that is the right path22:06
*** hashar has quit IRC22:06
greghaynesim not sure the file:// is what you want to have set either22:06
jeblairlike it may not be added to v3 branch22:06
clarkbjeblair: oh we need to merge that into v322:06
pabelangerha22:06
mordred:)22:06
mordredwhat? we solved this before?22:06
clarkbjeblair: http://git.openstack.org/cgit/openstack-infra/nodepool/tree/devstack/plugin.sh?h=feature/zuulv322:06
clarkbjeblair: so ya think thats the case22:06
mordredI agree22:06
pabelangerlooks like it22:06
clarkbapparentl;y we don'y use the default devstack files path either22:07
clarkbmaybe thats because we want to move things around before devstack is cloned22:07
pabelangerpath is correct for get-pip.py22:07
openstackgerritMerged openstack-infra/nodepool: Enabled zookeeper for devstack jobs  https://review.openstack.org/39837922:07
clarkbmordred: it turns out that neutron has been broken since august22:08
clarkbmordred: and attempts to fix it have been lacking so everyone and their uncle has worked around it22:08
mordredclarkb: yah22:08
mordredwell - there is a devstack patch outstanding ...22:09
clarkbyup and its on my list of things to review today :)22:09
mordredhttps://review.openstack.org/398012 ... so far no devstack cores have reviewed22:09
mordredhey jeblair ^^ I hear you have +2 access ...22:09
clarkbianw and sc68cl too iirc22:10
mordredclarkb: ooh. I'll bug them22:10
greghaynesI also think the line should be DIB_REPOLOCATION_pip_and_virtualenv: $NODEPOOL_CACHE_GET_PIP BTW22:10
greghaynesor, is that tested somewhere?22:10
mordredgreghaynes: it's how the master branch nodepool jobs run22:10
greghayneshrm22:10
mordredlemme find log22:10
greghayneshow on earth does that work22:11
mordredgreghaynes: http://logs.openstack.org/50/397750/1/check/gate-dsvm-nodepool-src-shade/41ba133/22:11
ianwmordred: ok ... that's a lota changelog.  i was probably scared when i last went through devstack queue :)22:11
clarkbmordred: we need a devstack-gate change to set that var too22:11
mordredhttp://logs.openstack.org/50/397750/1/check/gate-dsvm-nodepool-src-shade/41ba133/logs/screen-nodepool-builder.txt.gz22:11
clarkbmordred: do you know if that exists yet?22:11
*** Shuo_ has joined #zuul22:11
mordredclarkb: I will make one right now22:11
clarkbmordred: kk22:11
greghaynesmordred: TIL that curl supports curl file:///some/path22:12
*** Shuo has quit IRC22:12
greghaynesmordred: which is how that works22:12
clarkbcurl is magical22:12
greghaynesso, if you didnt have file:// dib would notice its a path and not shell out to curl22:13
greghaynesbut yea, not an issue apparently because curl is awesome22:13
openstackgerritMerged openstack-infra/nodepool: Rename zookeeper_client to zk for nodepool.py  https://review.openstack.org/39856622:14
mordredclarkb: https://review.openstack.org/39861122:14
mordredgreghaynes: neat!22:14
clarkbmordred: what do you think about supporting a more easy to understand devstack_gate var too?22:15
clarkbmordred: basically support the old thing for compat but also support a new var22:15
clarkbso DEVSTACK_GATE_IPv4_ADDRS_SAFE_TO_USE defaults to DEVSTACK_GATE_IPv4_FIXED_RANGE22:15
clarkboh wait22:15
clarkbthats what it does already? do we not have this in the -wrap script?22:16
mordredclarkb: sorry - I talked about that in infra channel because it started feeling like this was less zuul related22:16
clarkbwow ya we don't have it in -wrap22:16
mordredyah - I think we should just drop the D_G var22:16
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove stray print statement  https://review.openstack.org/39862322:18
clarkbis anyone working on cherry picking the plugin fix on master to v3 yet?22:20
clarkbI can do that as soon as I finish reviewing these devstack thigns22:20
mordredclarkb: I can do it - if nobody else is22:21
mordredpabelanger: https://review.openstack.org/#/c/398596 ... was it the , you were concerned about?22:21
clarkbmordred: I am done with devstack reviews, I will go ahead and cherry pick now22:23
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Install get-pip.py from local cached location  https://review.openstack.org/39862622:23
mordredclarkb: oh - sorry - was already doing it22:24
clarkboh no biggy22:24
clarkbnow I just have to review it22:24
mordredclarkb: I'm going to follow up with a patch merging the rest of the devstack plugin changes22:25
clarkbmordred: the indentation around that heredoc stuff just melted my brain22:26
mordredyah22:26
mordredclarkb: I'm missing an fi I think22:28
clarkbin deed you are22:29
clarkbsee how melted my brain is?22:29
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Install get-pip.py from local cached location  https://review.openstack.org/39862622:30
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Sync devstack plugin from master  https://review.openstack.org/39863222:30
mordredclarkb: ok - there's the fi and also the rest of the changes to the devstack/plugin.sh from master22:31
mordredwhich we might as well sync because we'll need to sync them when we merge back anyway :)22:31
Shuo_gerrit seems to built for working with git. Just wondering, has anyone made gerrit work with perforce (at least I know google's internal code review system was built along with perforce back then :-) )22:32
mordredShuo_: not to my knowledge - but gerrit is VERY deeply tied to git22:32
mordredShuo_: as is zuul22:32
mordredShuo_: like, gerrit is based on a jgit implementation - it would be a substantial rewrite of most of it to accomodate non-git things22:33
jeblairmordred: one thing on your plugin sync change22:34
mordredwoot22:35
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Sync devstack plugin from master  https://review.openstack.org/39863222:35
mordredjeblair: thanks - that was a line I was unsure on22:35
pabelangermordred: Oh, maybe it was line 257 that is the issue22:37
pabelangershows up red in gerrit ui22:38
mordredpabelanger: which patch?22:38
pabelanger398596, sorry for not saying22:39
mordredno worries!22:39
mordredpabelanger: ah! yes - that does seem to be unhappy line22:39
clarkbmordred: one more thing on that change22:39
clarkb632 that is22:39
openstackgerritMonty Taylor proposed openstack-infra/nodepool: Sync devstack plugin from master  https://review.openstack.org/39863222:41
mordredclarkb: done. thank you22:41
openstackgerritJames E. Blair proposed openstack-infra/nodepool: Log image delete exception as exception  https://review.openstack.org/39859622:41
openstackgerritMerged openstack-infra/nodepool: Make image cleanup separable  https://review.openstack.org/39853522:50
openstackgerritMerged openstack-infra/nodepool: Make image building separable  https://review.openstack.org/39854222:51
openstackgerritMerged openstack-infra/nodepool: Add build_id attribute to ImageUpload object  https://review.openstack.org/39856822:51
openstackgerritMerged openstack-infra/nodepool: Re-enable test: test_dib_image_delete  https://review.openstack.org/39841822:51
ianwmordred: FIXED_RANGE and subnet pools are mutually exclusive right?  you can't just setup your network on fixed range, you've got to ask for a subnet pool?22:52
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove SnapshotImage from the database  https://review.openstack.org/39860222:53
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Remove DibImage from the database  https://review.openstack.org/39852522:53
openstackgerritPaul Belanger proposed openstack-infra/nodepool: Replace snap_image with cloud_image  https://review.openstack.org/39863822:53
pabelangerclarkb: I went with cloud_image ^22:55
mordredianw: yes22:55
mordredianw: they are different things - which is why the complexity22:56
mordredianw: SUBNETPOOL_PREFIX may describe a larger network from which many smaller subnets are created22:56
openstackgerritMerged openstack-infra/nodepool: Install get-pip.py from local cached location  https://review.openstack.org/39862622:56
mordredianw: FIXED_RANGE describes a single network to be used directly without subdivision22:56
mordredianw: AIUI22:56
ianwmordred: ok, cool.  yes that's AIUI too so glad we agree :)22:58
mordred\o/22:58
mordredit's almost like I've learned things22:58
SpamapSjeblair: FYI, I just sent a message to openstack-infra and asked people to bother you. So.. prepare to be bothered. ;)22:59
jeblairSpamapS: wait people are only *just now* going to start bothering me?!23:00
jeblairSpamapS: ++23:03
mordredjeblair: yes. up until now the feeling you've experienced is "joy"23:03
openstackgerritMerged openstack-infra/nodepool: Re-enable test_job_start_event test  https://review.openstack.org/39846323:03
mordredjeblair: so just try to remember what it felt like23:03
jeblairmordred: am i about to be inundated with "love"?23:03
mordredjeblair: sure. that's definitely the right word23:06
mordredSpamapS: good email23:07
*** Cibo has quit IRC23:08
openstackgerritMerged openstack-infra/nodepool: Sync devstack plugin from master  https://review.openstack.org/39863223:09
pabelangerjeblair: Shrews: So, I only have 3 more tests in test_nodepool.py to do, they are related to images. I think we agree to delete them and move them into test_builder23:17
pabelangerAside from that, dsvm job is now working too23:17
pabelangerthink we made some great process today23:17
pabelangerand zookeeper.py works great!23:17
pabelangerjeblair: one thing I did notice with our testing, our configs are still setup to use jenkins_manager.py.  Should we make the switch to assign-via-gearman for our targets?23:20
jeblairpabelanger: let's leave that alone for now.  it's time will come when we do the next phase.  :)23:21
openstackgerritMerged openstack-infra/nodepool: Remove stray print statement  https://review.openstack.org/39862323:21
pabelangerjeblair: cool23:23
pabelangerhttp://logs.openstack.org/93/398493/1/check/nodepool-coverage-db-ubuntu-xenial/0da901c/console.html23:27
pabelangernodepool.tests.test_nodepool.TestNodepool.test_subnodes failed there, did a recheck23:27
pabelangerbut going to see try and reproduce it23:27
clarkbpabelanger: reading the logs almost looks like gearman server wasn't available/23:28
pabelangerclarkb: ya, I see that too23:29
pabelangerbut looks like after the schedule was shutdown23:29
clarkbthe scheduler shouldn't stop until the job ends and that job didn't end properly23:31
clarkboh!23:31
clarkbpabelanger: checkout the traceback, this is a long standing known issue23:31
clarkbpabelanger: this is the case where we try to drop the db but it doesn't drop23:31
pabelangerAh23:31
ianwmordred: ok, i think i got my head around it; LGTM but i think follow-ons could help as mentioned too23:31
pabelangerclarkb: first time I've seen it23:32
clarkbpabelanger: so its timing out in the job cleanup which is trying to drop the db23:32
pabelangerya, I see that now23:32
clarkbpabelanger: I had a few changes up to try and debug it by outputing mysql debugging info but that didn't really get anywhere23:32
pabelangerack23:32
clarkbpabelanger: I would probably ignore it unless you see it happen frequently, in which case we can resurrect those debugging changes and try again23:33
clarkblooks like Shrews' stack of changes got into a merge conflict23:35
mordredboo23:35
pabelangerYa, that is likely my fault :(23:36
pabelangerwhen we change node_dib.yaml to node.yaml today23:36
clarkbpabelanger: jeblair mordred can I get reviews on https://review.openstack.org/#/c/398558/ before too long? that way we can restart the launchers soonish hopefully and get better error logging23:49
pabelangerchecking23:50
pabelangerclarkb: +223:51
clarkbthanks23:51
mordredclarkb: +A23:52
openstackgerritMerged openstack-infra/nodepool: Remove DibImage from the database  https://review.openstack.org/39852523:52
pabelangerclarkb: https://review.openstack.org/#/c/398638 is my last stack of the day23:53
jeblairabadger suggested a fix for that if someone wants to do some ansible hacking23:53
jeblairdescribed in https://github.com/ansible/ansible/issues/1828123:53
pabelangerthat removes our database tables for snaphots23:53
openstackgerritMerged openstack-infra/zuul: Don't retry when using synchronize module  https://review.openstack.org/39855823:55
clarkbI have promised a new zanata dev server this week so probably need to start that tomorrow, maybe if that goes quickly I can figure out the ansible fixes23:56

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!