Tuesday, 2013-11-19

*** mrodden has quit IRC00:00
jog0clarkb: moving over to infra00:01
jog0got a cool graph about the swift bug00:01
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: update doc and add new JJB unit tests  https://review.openstack.org/5671500:01
jog0http://logstash.openstack.org/#eyJzZWFyY2giOiJcIkdvdCBlcnJvciBmcm9tIFN3aWZ0OiBwdXRfb2JqZWN0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzODQ4MTkyMDI2ODd900:01
*** MarkAtwood has joined #openstack-infra00:01
*** dcramer_ has quit IRC00:02
clarkbso it happened a bunch last week and is more sparse this week00:02
jog0clarkb: yup after the fix00:02
jog0but it is still an issue00:02
clarkbfungi: you can start rax on jenkins01 now00:03
fungiahh, yup, see that now00:03
fungithx00:03
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: fix jjb job template documentation  https://review.openstack.org/5706200:03
*** rnirmal has quit IRC00:05
*** sarob has joined #openstack-infra00:05
*** loq_mac has quit IRC00:07
*** mriedem has joined #openstack-infra00:07
*** mriedem has quit IRC00:08
*** loq_mac has joined #openstack-infra00:08
*** herndon_ has quit IRC00:09
*** mriedem has joined #openstack-infra00:09
*** julim has quit IRC00:10
mikalIs there a known issue with the gate at the moment?00:11
mikal49305,17 seems to have been at the top of the gate for a long time00:11
anteayayeah, well in as much as jenkins has had a hard day00:11
mikalWell, that review has had py26 running for over 12 hours00:12
*** oubiwann has quit IRC00:12
anteayaah that is a long time00:12
mikalQuite00:12
anteayaclarkb fungi00:12
anteayathe gate might need a shove00:12
*** jamesmcarthur has joined #openstack-infra00:13
clarkbya those are just hosed00:14
clarkbwe are waiting for everything in the gate to get unhosed so that we can restart zuul00:14
clarkbor you can push a new patchset to gerrit, that may or may not unstick it00:14
anteayahow about reverify?00:14
mikalclarkb: if I -2 that review, won't that cancel its zuul run?00:14
anteayawould reverify work?00:15
*** jamesmcarthur has quit IRC00:15
*** pmathews has quit IRC00:15
*** matsuhashi has joined #openstack-infra00:15
fungia new patchset might jostle it out, but more likely it will have to wait for the zuul restart here in a little while00:16
openstackgerritJoe Gordon proposed a change to openstack-infra/elastic-recheck: Add query for bug 1252514  https://review.openstack.org/5707000:16
uvirtbotLaunchpad bug 1252514 in swift "glance doesn't recover if Swift returns an error" [Undecided,New] https://launchpad.net/bugs/125251400:16
*** jcooley_ has joined #openstack-infra00:17
fungioh, and clarkb has already stated those things00:17
clarkbmikal: no zuul doesn't react on -2 yet00:17
fungithe -2 simply prevents it from merging00:17
clarkbfungi: there appear to be 3 jenkins01 hpcloud stragglers00:17
fungiclarkb: yup, they were probably in the process of being created when the original list ran, or else they failed to delete on the first try. as soon as the rax nodes finish clearing in the next minute or two i'll do a cleanup pass00:18
clarkbk00:18
*** danger_fo_away is now known as dangers00:24
yjiang5Hi all, I noticed from http://ci.openstack.org/devstack-gate.html that devstack starts from a virtual machine. Where can I find more detailed information on how the VM is created, and is it an HVM VM? Asking because I'm considering whether it's possible to add a PCI device to that VM to test PCI device assignment in openstack.00:26
fungiyjiang5: https://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/README.rst#n10000:28
*** markwash has quit IRC00:29
fungiyjiang5: in short, those virtual machines are currently whatever rackspace and hpcloud provide to us00:29
yjiang5fungi: thanks, I just focused on checking the three scripts in the repo. Sorry.00:29
yjiang5fungi: I will have a look at it in detail.00:29
jog0http://paste.openstack.org/show/53572/00:31
clarkbfungi: down to 5 nodes on jenkins0100:33
*** loq_mac has quit IRC00:34
*** jcooley_ has quit IRC00:35
*** adalbas has quit IRC00:35
fungiyeah, manually deleting the last three now which escaped the first passes00:35
zaromgagne: what do you think? https://review.openstack.org/#/c/5706800:36
*** rnirmal has joined #openstack-infra00:38
fungiokay, all jenkins01 nodes are gone from the nodepool list00:39
NobodyCamhumm is this job stuck?00:39
NobodyCamironic/conductor/manager.py00:39
fungistarting jenkins01 back up now00:39
NobodyCamhttps://jenkins01.openstack.org/job/gate-ironic-docs/566/00:39
NobodyCam:)00:39
NobodyCamty fungi00:39
clarkbfungi: woot00:39
fungilet's see where this gets us00:39
mgagnezaro: the idea is nice. Are tests bundled when shipping JJB?00:40
*** nati_uen_ has quit IRC00:40
NobodyCamstill think this job is stuck00:40
NobodyCamhttps://jenkins01.openstack.org/job/gate-ironic-docs/566/00:40
*** ryanpetrello has joined #openstack-infra00:40
mgagnezaro: generating docs would require tests to be there. Is it a reasonable assumption?00:40
clarkbNobodyCam: it is00:41
NobodyCam:)00:41
clarkbyou can attempt pushing a new patchset which zuul should treat as an indication to kick the existing change out of the queues00:41
zaromgagne: tests do not need to be bundled.00:41
clarkbor you can wait for things to settle down allowing us to restart zuul00:41
NobodyCamI'll wait00:41
NobodyCamalmost 500:41
zaromgagne: docs are generated by jenkins, which has the source and tests, so once the doc is generated i believe it has everything.00:41
*** loq_mac has joined #openstack-infra00:42
clarkbfungi: nodepool appears to slowly be deleting hpcloud nodes attached to jenkins0200:42
clarkbso I think this was an event stream derp00:42
NobodyCamyou guys are awesome00:42
NobodyCamjust fyi00:43
clarkbNobodyCam: our software isn't :/ it really went sideways today00:43
NobodyCam:)00:44
*** emagana has quit IRC00:44
fungiclarkb: agreed, that's the best fit for what we saw00:44
fungiclarkb: aside from the original original issue, puppet agent eating itself. we still need to discuss options for safely rolling that out00:45
*** Ryan_Lane has quit IRC00:45
fungiprobably need to rubify the certname in the config00:46
clarkbfungi: or have a killswitch where the config is only managed on the first run00:46
clarkb(not sure how that would work though)00:46
*** Ryan_Lane has joined #openstack-infra00:47
clarkbjenkins02 has masses of slaves backed up now too00:47
fungibut yeah, right now that's a time bomb waiting to strike any of our systems00:47
clarkbI guess I/we should just be patient and see if it picks those off slowly00:47
fungiit did last time around00:47
clarkbthe graphs look a lot better now than they did before (lots of hosts in deleting instead of ready)00:47
*** senk has joined #openstack-infra00:48
clarkbfungi: right the difference I think is that nodepool actually knows to delete them rather than treating them as ready00:48
*** gyee has quit IRC00:48
fungiyup00:48
clarkbyou can see that in the zuul graph as the green vs purple00:48
*** MarkAtwood has quit IRC00:48
fungiwell, i meant it was successfully whittling away at the delete list on jenkins02 earlier when we swamped it00:48
*** yamahata_ has joined #openstack-infra00:48
*** dcramer_ has joined #openstack-infra00:49
clarkbah00:50
*** jergerber has quit IRC00:51
*** ryanpetrello has quit IRC00:56
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: use jjb tests as the examples  https://review.openstack.org/5706800:57
mgagnezaro: example now includes "# vim: sw=4 ts=4 et" ^^'00:58
clarkbmgagne: zaro: we should probably just remove the modelines unless people really find them useful00:59
openstackgerritJoe Gordon proposed a change to openstack-infra/elastic-recheck: Add query for bug 1252514  https://review.openstack.org/5707000:59
clarkb(I don't and I use vim :) )00:59
uvirtbotLaunchpad bug 1252514 in swift "glance doesn't recover if Swift returns an error" [Undecided,New] https://launchpad.net/bugs/125251400:59
*** wenlock has quit IRC00:59
*** herndon has joined #openstack-infra00:59
*** MarkAtwood has joined #openstack-infra01:00
jog0why is a file from a week ago not on logs.o01:00
jog0http://logs.openstack.org/69/51169/3/gate/gate-tempest-devstack-vm-postgres-full/d8a7b6f/console.html01:00
jog0ohh .gz01:00
*** matsuhashi has quit IRC01:01
*** matsuhashi has joined #openstack-infra01:02
*** matsuhashi has quit IRC01:02
*** matsuhashi has joined #openstack-infra01:02
fungii do use vim, and i still am no fan of modelines01:02
fungibest suggestion i saw in the modelines thread was that people who care strongly about editor magic ought to write openstack-specific editor style plugins01:03
fungiand maintain them as projects for use by the rest of the development community01:03
*** ryanpetrello has joined #openstack-infra01:04
fungii could imagine them fitting in openstack-dev/.* if we determined that we wanted to continue to put things in that container01:04
*** senk has quit IRC01:05
*** ^demon|busy has quit IRC01:05
fungior even in a contrib subdirectory of hacking, maybe01:08
fungithough jog0 might not agree ;)01:09
jheskethfungi: If you have a moment to approve this change that'd be great please: https://review.openstack.org/#/c/56857/01:09
fungijhesketh: i'm only one cider into the evening, so still safe i think01:10
clarkb:)01:10
jheskethexcellent :-)01:10
jog0fungi: keep your modelines out of my hacking01:11
jog0;)01:11
reedthe TOPIC police!01:11
fungireed: wave that baton fiercely01:12
mikalHey... How hard is it to pin a ubuntu package on your workers?01:13
mikaljog0 has noted that the console log failure coincides with going from libvirt0_0.9.8-2ubuntu17.13_amd64.deb to libvirt0_0.9.8-2ubuntu17.15_amd64.deb01:14
jog0mikal:  you don't want to use that word around here01:14
jog0pin01:14
mikalpins are bad?01:14
mikalWe have voodoo dolls or something?01:14
fungiour test slaves are voodoo dolls01:14
fungiso, pinning a deb on those assumes that it hasn't been upgraded yet, that we can safely downgrade it, and that the deb in question is still in a package repository somewhere01:15
*** oubiwann has joined #openstack-infra01:15
fungibetter would be to pester zul about libvirt0_0.9.8-2ubuntu17.15_amd64.deb introducing some sort of issue not present in libvirt0_0.9.8-2ubuntu17.13_amd64.deb01:16
mikalThis is the only thing in the change log for .15:01:16
mikalupdate fix-for-parallel-port-passthrough-for-qemu: the xml for the new01:16
mikal     testcase was too modern and caused the test to fail by including new01:16
mikal     keywords.01:16
anteayathat is just funny01:18
anteayawho decides the definition of "too modern"01:19
mikal.14 has two very boring changes as well01:19
mikal * qemu-delete-usb-devices-on-stop and01:19
mikal    qemu-build-activeusbhostdevs-on-reconnect: ensure that we can re-use01:19
mikal    a usb device after another domain using the device has shut01:19
mikal    down.  (LP: #1190387)  Backported from upstream git.01:19
mikal  * cherrypick fix-for-parallel-port-passthrough-for-qemu from upstream01:19
mikal    (LP: #1203620)01:19
*** dcramer_ has quit IRC01:19
*** jhesketh has quit IRC01:21
*** nosnos has joined #openstack-infra01:21
*** jhesketh has joined #openstack-infra01:22
*** sjing has joined #openstack-infra01:25
*** nati_ueno has joined #openstack-infra01:26
*** bingbu has joined #openstack-infra01:27
fungijenkins01 vs jenkins02 devstack node numbers are approaching equilibrium, the graph looks more like what i would expect (lots of used, a bunch building, very little deleting or available) and the gate seems to be moving01:27
clarkbfungi: woot that is my reading as well01:28
fungigate and check pipeline totals have started to drop as well01:28
*** sjing has quit IRC01:29
*** sjing has joined #openstack-infra01:30
*** rnirmal has quit IRC01:30
*** dcramer_ has joined #openstack-infra01:33
*** vipul has quit IRC01:38
*** michchap has quit IRC01:38
*** vipul has joined #openstack-infra01:38
*** michchap has joined #openstack-infra01:38
*** zjdriver has quit IRC01:45
*** rnirmal has joined #openstack-infra01:45
*** dstanek has quit IRC01:47
*** dstanek has joined #openstack-infra01:47
clarkbfungi: I updated that review with some distilled questions01:51
clarkbfungi: I think I covered the important bits, let us see if mordred has answers (or anyone else)01:51
openstackgerritA change was merged to openstack-infra/config: Install graphviz on jenkins slaves for docs  https://review.openstack.org/5685701:51
fungiokay, great01:51
clarkbtl;dr I believe the patch will work as is but only because the availability_zone kwarg passed to that __init__ is ignored01:53
clarkband nova will just give us nodes on whichever az01:53
clarkbmikal: fungi: since we create new d-g images once a day we can pin it then new images will use the old version01:54
*** jcooley_ has joined #openstack-infra01:54
clarkbmikal: fungi: I don't think we should do that if we can figure out what the actual problem is, but it may give us temporary sanity01:54
anteayais there any other kind of sanity?01:55
clarkbanteaya: I like to think of my insanity as temporary :)01:55
anteayanice01:55
anteayaI will support that fantasy you have01:55
* fungi has plenty of false sanity01:55
anteayaha ha ha01:55
*** jcooley_ has quit IRC01:55
clarkbanteaya: thank you, it really helps to have everyone else playing along01:56
mikalclarkb: I'd hold off. We can't find a change in the package which justifies it as the problem.01:56
anteayaI try to do my bit01:56
clarkb:)01:56
clarkbmikal: ok01:56
clarkbmikal: is it possible that there was an undocumented change?01:56
*** dcramer_ has quit IRC01:56
mikalclarkb: that is possible01:57
mikalWe're trying to push in some better debugging now to see if that points the finger at something01:57
clarkbsounds good01:57
*** jcooley_ has joined #openstack-infra01:57
clarkbfungi: I have just been summoned for dinner, I will be back on IRC in a bit but jenkins seems mostly happy for the moment01:58
clarkbfungi: I will restart zuul later tonight if the queues go away01:59
fungiawesome. pleasant eats02:00
*** hashar has quit IRC02:03
clarkbfungi is it too late to look at havana jobs?02:09
anteayacheck is down to 2502:09
*** dcramer_ has joined #openstack-infra02:10
fungii can look... there are changes, yes?02:10
clarkbyes ai have one and dprince wrote one02:11
*** changbl has quit IRC02:12
fungii'll find them and take a peek02:12
*** ryanpetrello has quit IRC02:16
*** Ryan_Lane has quit IRC02:17
*** senk has joined #openstack-infra02:18
*** senk has quit IRC02:22
openstackgerritMatt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for bug 1252170  https://review.openstack.org/5708102:23
uvirtbotLaunchpad bug 1252170 in tempest "tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm[compute] failed" [Undecided,New] https://launchpad.net/bugs/125217002:23
*** b3nt_pin has quit IRC02:23
*** amotoki has joined #openstack-infra02:27
*** che-arne has quit IRC02:27
*** sarob has quit IRC02:27
*** ljjjustin has joined #openstack-infra02:33
*** reed has quit IRC02:39
*** MarkAtwood has quit IRC02:43
*** yamahata_ has quit IRC02:44
*** senk has joined #openstack-infra02:46
*** nati_ueno has quit IRC02:47
fungiclarkb: 56720 xml comparison job says two periodic grizzly jobs went missing. those got renamed, right02:50
fungi?02:50
*** changbl has joined #openstack-infra02:51
*** jcooley_ has quit IRC02:52
*** jcooley_ has joined #openstack-infra02:52
clarkbyup02:54
clarkbto be consistent02:54
*** jcooley_ has quit IRC02:54
*** jcooley_ has joined #openstack-infra02:56
fungik, lgtm then. okay to merge?02:58
*** wenlock has joined #openstack-infra02:59
*** masayukig has joined #openstack-infra03:00
fungiclarkb: ^03:00
clarkbfungi: I think so03:00
clarkbfeel like the gate is going nowhere fast (too many resets)03:00
*** xeyed4good has quit IRC03:01
*** yaguang has joined #openstack-infra03:03
*** sjing has quit IRC03:05
fungiclarkb: 57066 can't pass grenade, do you agree it's going to require forcing in?03:06
*** sjing has joined #openstack-infra03:06
clarkbis that dprinces?03:07
fungiyah03:08
clarkbfungi: I don't think that one should be forced in03:09
clarkball of those d-g changes that sdague wrote were part of the retooling. I think that failure may be one of those 10%ers03:09
fungii suspect it may instabreak the gate until havana upgrades are pounded back into shape03:10
*** dcramer_ has quit IRC03:10
clarkbthat may be the case as well03:10
fungilooking at grenade logs on my current device is not a quick task03:11
clarkbfungi: several tempest failures at the end of the run03:12
clarkbwhich means the first bits were ok03:12
clarkbI think this may be "flaky" tests03:12
fungigrrr03:12
*** salv-orlando has quit IRC03:13
fungiheatclient maybe03:13
*** pcrews has quit IRC03:13
clarkbFAIL: tempest.api.compute.v3.servers.test_server_actions.ServerActionsV3TestXML.test_resize_server_confirm[gate,smoke] things like that03:14
clarkbwhich may or may not be related to the change03:14
clarkbfungi: I think we should get sdague to look at it since he orchestrated the whole thing03:14
clarkbI am not very hopeful that the gate queue will be empty before I go to bed03:15
*** guohliu has joined #openstack-infra03:16
*** wenlock has quit IRC03:17
openstackgerritA change was merged to openstack-infra/config: Add stable/havana jobs.  https://review.openstack.org/5672003:21
*** Hunner_irssi has quit IRC03:27
*** Hunner_irssi has joined #openstack-infra03:27
*** Hunner_irssi has joined #openstack-infra03:27
*** herndon has quit IRC03:27
*** matsuhashi has quit IRC03:29
fungiwell, we can restart zuul prior to that if we want and it should mostly resume where it left off, yeah?03:29
*** matsuhashi has joined #openstack-infra03:30
fungias for sdague, i think he didn't expect to be around again until next week, right?03:30
*** changbl has quit IRC03:31
clarkbI am not sure when sdague planned on being back03:31
clarkbfungi: I don't think we can start a graceful restart because those jobs that got "stuck" will prevent zuul from stopping03:32
fungioh, very good point. we'd have to wait until those are all that's left during a graceful quiescence and then dump the state and feed that list back into rechecks/reverifies03:34
funginontrivial03:34
*** matsuhashi has quit IRC03:35
clarkbya :/03:35
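A rough sketch of the "dump the state" step fungi describes: zuul publishes its queue contents as JSON for the status page, so the changes sitting in each pipeline could be listed and later re-enqueued with recheck/reverify comments. The endpoint URL and JSON field names below are assumptions about zuul's status feed of this era, not something confirmed in the log.

```python
# Sketch: list the changes currently queued in zuul so they could be
# re-enqueued (via recheck/reverify comments) after a restart.
# The URL and JSON layout are assumptions based on zuul's status page feed.
import requests

STATUS_URL = 'http://zuul.openstack.org/status.json'  # assumed endpoint


def queued_changes(status):
    for pipeline in status.get('pipelines', []):
        for queue in pipeline.get('change_queues', []):
            for head in queue.get('heads', []):
                for item in head:
                    # ids are assumed to look like "57070,1"
                    # (change number, patchset)
                    yield pipeline['name'], item.get('id')


if __name__ == '__main__':
    status = requests.get(STATUS_URL).json()
    for pipeline, change in queued_changes(status):
        print('%s %s' % (pipeline, change))
```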
*** krast has joined #openstack-infra03:40
*** krast has quit IRC03:41
*** nati_ueno has joined #openstack-infra03:41
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: use jjb tests as the examples  https://review.openstack.org/5706803:42
fungiinteresting oscillation emerging in the nodepool graph03:43
fungidprince is rechecking his change, but not hanging out in irc. poo03:47
*** dkliban has joined #openstack-infra03:47
*** rnirmal has quit IRC03:50
clarkbya we basically use all the nodes then reset03:52
fungilather, rinse, reset03:53
*** melwitt has quit IRC03:58
mikalIs there some way I can get access to devstack-precise-check-rax-ord-676412 ?04:00
clarkbdid the test finish already04:01
mikalYes04:01
clarkbif so no04:01
mikalBugger04:01
fungimikal: unlikely. by the time tests finished running it probably survived no more than a few minutes04:01
mikalBecause we don't have a very big window between tempest failing and the test ending04:02
mikalGiven tempest is the long thing04:02
fungiwe run very tight on quotas there04:02
mikalIs there a simple way to tell in log stash which cluster a test ran in?04:02
clarkbmikal: no, unfortunately; looking at the first few lines of the console log is the easiest way04:03
mikalSo I have http://logstash.openstack.org/#eyJzZWFyY2giOiJmaWxlbmFtZTpjb25zb2xlLmh0bWwgQU5EIG1lc3NhZ2U6XCJhc3NlcnRpb25lcnJvcjogY29uc29sZSBvdXRwdXQgd2FzIGVtcHR5XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjEzODQ2NDEwNzIxODl904:03
mikalIt would be interesting to turn that into a list of clusters04:03
clarkbyou can query the string that says running on node foo04:03
mikalPerhaps just one cluster hates us04:03
mikalclarkb: I don't think I can because I am already matching on a line from the log?04:03
mikalUnless I can assert I want files with two different lines in them?04:03
clarkbya you can use OR and () to group04:04
clarkbso blah and blah (message:foo OR message:bar)04:04
*** Ryan_Lane has joined #openstack-infra04:04
mikalI would want (message:foo and message:bar)04:05
mikalWould that work?04:05
mikalThere's no machine name attribute on the log files?04:05
clarkbno because jenkins doesnt tell us that info04:05
mikalOk04:05
fungiEJENKINS04:05
clarkbhmm yeah you want and. I think you need two queries04:05
* mikal feels a script coming on04:05
*** nati_ueno has quit IRC04:06
*** nati_ueno has joined #openstack-infra04:06
mikalYeah, I think I can do something horrible with the RSS feed04:06
fungimikal: they make a pill for that04:06
mikalPlease hold while I flail around04:06
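One way mikal's script could look, as a sketch: collect the console.html URLs of the failing runs (from the logstash results or their RSS feed) and pull the Jenkins "Building remotely on <node>" line near the top of each log to tally providers by node name. The console line format, the node-naming scheme, and the helper below are assumptions for illustration.

```python
# Sketch: given console.html URLs for failed runs (e.g. collected from the
# logstash results or RSS feed), grab the Jenkins "Building remotely on"
# line near the top of each log and tally the provider/region encoded in
# the node name. Line format and naming scheme are assumptions.
import collections
import re

import requests

NODE_RE = re.compile(r'Building remotely on (\S+)')


def provider_of(node_name):
    # e.g. devstack-precise-check-rax-ord-676412 -> rax-ord (assumed scheme)
    parts = node_name.split('-')
    return '-'.join(parts[3:5]) if len(parts) >= 6 else node_name


def tally(console_urls):
    counts = collections.Counter()
    for url in console_urls:
        head = requests.get(url).text[:4000]  # node line appears near the top
        match = NODE_RE.search(head)
        if match:
            counts[provider_of(match.group(1))] += 1
    return counts


if __name__ == '__main__':
    urls = [
        # hypothetical input; in practice this list would come from logstash
        'http://logs.openstack.org/69/51169/3/gate/'
        'gate-tempest-devstack-vm-postgres-full/d8a7b6f/console.html',
    ]
    for provider, count in tally(urls).most_common():
        print('%s %d' % (provider, count))
```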
mikalIs there some way to get more results from the rss feed version of the results?04:08
clarkbI dont know04:10
mikalOk04:11
mikalWell, lets see if these 20 results show something interesting first04:11
*** wenlock has joined #openstack-infra04:14
mikalSo, there are failures on non-rax nodes at least04:14
*** ArxCruz has quit IRC04:22
*** ArxCruz has joined #openstack-infra04:23
*** DennyZhang has joined #openstack-infra04:24
*** jcooley_ has quit IRC04:29
openstackgerritEdward Raigosa proposed a change to openstack-infra/config: Make pip install from upstream better  https://review.openstack.org/5142504:30
* fungi is nodding off... until we meet on the morrow04:31
*** markwash has joined #openstack-infra04:33
clarkbgnight04:33
clarkbI am going to keep one eye on zuul just in case it starts moving04:33
clarkbfungi: if you get up early and it is quiet I say go for the restart (will note here if I have restarted it)04:34
openstackgerritKei YAMAZAKI proposed a change to openstack-infra/jenkins-job-builder: Added support for Emotional Jenkins  https://review.openstack.org/5677904:35
*** UtahDave has joined #openstack-infra04:35
*** mriedem has quit IRC04:38
*** senk has quit IRC04:42
*** matsuhashi has joined #openstack-infra04:46
*** balar has quit IRC04:55
*** pcrews has joined #openstack-infra04:58
*** SergeyLukjanov has joined #openstack-infra05:00
*** nati_uen_ has joined #openstack-infra05:00
*** nati_ueno has quit IRC05:03
*** nati_uen_ is now known as nati_ueno05:04
*** rakhmerov has joined #openstack-infra05:08
*** UtahDave has quit IRC05:12
*** jcooley_ has joined #openstack-infra05:13
*** sarob has joined #openstack-infra05:15
*** boris-42 has joined #openstack-infra05:18
*** nosnos has quit IRC05:23
*** nosnos has joined #openstack-infra05:23
*** senk has joined #openstack-infra05:23
*** jcooley_ has quit IRC05:27
*** jcooley_ has joined #openstack-infra05:28
*** jcooley_ has quit IRC05:32
*** sdake_ has joined #openstack-infra05:35
*** changbl has joined #openstack-infra05:38
*** dangers is now known as danger_fo_away05:40
*** senk has quit IRC05:41
*** DennyZhang has quit IRC05:43
*** sdake_ has quit IRC05:43
*** sdake_ has joined #openstack-infra05:43
*** rcleere has quit IRC05:51
*** loq_mac has joined #openstack-infra05:51
*** jcooley_ has joined #openstack-infra05:54
*** masayukig has quit IRC05:55
*** guohliu has quit IRC05:57
*** guohliu has joined #openstack-infra06:00
*** xeyed4good has joined #openstack-infra06:01
*** SergeyLukjanov has quit IRC06:02
*** michchap has quit IRC06:02
*** oubiwann has quit IRC06:02
*** michchap has joined #openstack-infra06:04
*** pcrews has quit IRC06:05
openstackgerritRussell Bryant proposed a change to openstack-infra/config: Add gate-solum-devstack job  https://review.openstack.org/5709806:05
*** xeyed4good has quit IRC06:05
*** sarob has quit IRC06:07
*** dstanek has quit IRC06:21
*** wenlock has quit IRC06:25
openstackgerritFei Long Wang proposed a change to openstack/requirements: Get better format for long lines with PrettyTable  https://review.openstack.org/5710406:39
*** DennyZhang has joined #openstack-infra06:39
*** matsuhashi has quit IRC06:41
*** matsuhashi has joined #openstack-infra06:42
*** DennyZhang has quit IRC06:44
*** matsuhashi has quit IRC06:46
*** matsuhashi has joined #openstack-infra06:48
*** fifieldt has quit IRC06:50
*** dizquierdo has joined #openstack-infra06:51
*** jhesketh has quit IRC06:55
*** denis_makogon has joined #openstack-infra07:02
*** boris-42 has quit IRC07:02
*** michchap has quit IRC07:03
*** michchap has joined #openstack-infra07:04
*** nosnos_ has joined #openstack-infra07:05
*** nosnos has quit IRC07:05
*** sarob has joined #openstack-infra07:08
*** matsuhashi has quit IRC07:09
*** matsuhashi has joined #openstack-infra07:10
*** yolanda has joined #openstack-infra07:12
*** matsuhas_ has joined #openstack-infra07:13
*** sarob has quit IRC07:13
*** matsuhashi has quit IRC07:14
*** arata has joined #openstack-infra07:17
*** davidhadas has quit IRC07:23
*** SergeyLukjanov has joined #openstack-infra07:24
*** SergeyLukjanov has quit IRC07:30
*** dizquierdo has quit IRC07:36
*** sjing has quit IRC07:36
*** DinaBelova has joined #openstack-infra07:37
*** SergeyLukjanov has joined #openstack-infra07:38
*** sjing has joined #openstack-infra07:38
*** nsaje has joined #openstack-infra07:38
*** DennyZhang has joined #openstack-infra07:40
*** DennyZhang has quit IRC07:44
*** fbo is now known as fbo_away07:57
*** nsaje has quit IRC08:03
*** nsaje has joined #openstack-infra08:03
*** jcoufal has joined #openstack-infra08:04
*** marun has quit IRC08:06
*** dizquierdo has joined #openstack-infra08:06
*** nsaje has quit IRC08:08
*** sarob has joined #openstack-infra08:09
*** osanchez has joined #openstack-infra08:11
*** sarob has quit IRC08:13
*** nsaje has joined #openstack-infra08:14
*** sarob has joined #openstack-infra08:15
*** mestery_ has joined #openstack-infra08:15
*** jcooley_ has quit IRC08:18
*** mestery has quit IRC08:19
*** sarob has quit IRC08:19
*** katyafervent has quit IRC08:20
ttxmarkwash, clarkb, fungi: I thought the whole plan around client releases is that you have a single release channel, which means no stable branches and no backward-incompatible changes. When mordred pushed it I remember that we said the model would break in several places if we suddenly introduced branches in it08:23
ttxmarkwash, clarkb, fungi: that would make a good topic for release meeting today if mordred can make it08:23
*** marun has joined #openstack-infra08:24
*** masayukig has joined #openstack-infra08:25
*** denis_makogon has quit IRC08:28
*** jcooley_ has joined #openstack-infra08:30
*** davidhadas has joined #openstack-infra08:32
*** afazekas_ has joined #openstack-infra08:34
*** hashar has joined #openstack-infra08:34
*** flaper87|afk is now known as flaper8708:37
*** arata has left #openstack-infra08:39
*** boris-42 has joined #openstack-infra08:41
*** jhesketh_ has joined #openstack-infra08:41
*** jhesketh__ has joined #openstack-infra08:41
*** rakhmerov has quit IRC08:42
*** boris-42_ has joined #openstack-infra08:45
*** boris-42 has quit IRC08:45
*** jcooley_ has quit IRC08:49
openstackgerritSascha Peilicke proposed a change to openstack-dev/pbr: Support building wheels (PEP-427)  https://review.openstack.org/5711708:52
*** dachary has quit IRC08:53
*** ilyashakhat has quit IRC08:53
*** DinaBelova has quit IRC08:54
*** ilyashakhat has joined #openstack-infra08:54
*** DinaBelova has joined #openstack-infra08:54
*** dachary has joined #openstack-infra08:55
*** yassine has joined #openstack-infra08:56
*** mkerrin has quit IRC08:56
*** nati_ueno has quit IRC08:59
*** davidhadas_ has joined #openstack-infra09:00
*** yamahata_ has joined #openstack-infra09:01
*** jhesketh__ has quit IRC09:02
*** jhesketh_ has quit IRC09:02
*** davidhadas has quit IRC09:03
*** salv-orlando has joined #openstack-infra09:05
*** osanchez has quit IRC09:08
*** sarob has joined #openstack-infra09:08
*** jpich has joined #openstack-infra09:09
*** nsaje has quit IRC09:11
*** derekh has joined #openstack-infra09:11
*** nsaje has joined #openstack-infra09:12
*** sarob has quit IRC09:12
*** jhesketh__ has joined #openstack-infra09:14
*** jhesketh_ has joined #openstack-infra09:15
*** nsaje has quit IRC09:15
*** nsaje has joined #openstack-infra09:16
*** jcooley_ has joined #openstack-infra09:20
*** locke105 has quit IRC09:21
*** Ryan_Lane has quit IRC09:21
*** fbo_away is now known as fbo09:23
*** michchap has quit IRC09:24
*** matsuhas_ has quit IRC09:25
openstackgerritPeter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added support for Jenkins plugin Blame upstream committers  https://review.openstack.org/5408509:25
*** matsuhashi has joined #openstack-infra09:25
*** jcooley_ has quit IRC09:25
*** SergeyLukjanov has quit IRC09:26
*** michchap has joined #openstack-infra09:26
*** bingbu has quit IRC09:29
*** rakhmerov has joined #openstack-infra09:29
*** yamahata_ has quit IRC09:30
*** matsuhashi has quit IRC09:30
*** matsuhashi has joined #openstack-infra09:30
*** nosnos_ has quit IRC09:33
*** nosnos has joined #openstack-infra09:34
*** BobBallAway is now known as BobBall09:44
*** alexpilotti has joined #openstack-infra09:51
*** odyssey4me has joined #openstack-infra10:02
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Add story dialog improvement  https://review.openstack.org/5718010:03
*** guohliu has quit IRC10:03
*** sjing has quit IRC10:04
*** nsaje has quit IRC10:04
*** davidhadas has joined #openstack-infra10:05
*** ilyashakhat has quit IRC10:06
*** ilyashakhat has joined #openstack-infra10:07
*** arata has joined #openstack-infra10:07
*** davidhadas_ has quit IRC10:08
*** sarob has joined #openstack-infra10:08
*** nati_ueno has joined #openstack-infra10:09
*** nati_ueno has quit IRC10:11
*** sgran has joined #openstack-infra10:11
sgranmorning10:11
sgranis something broken with check-tempest-devstack-vm-postgres-full ?10:12
sgranapologies if others have brought this up this morning10:12
*** sarob has quit IRC10:13
*** SergeyLukjanov has joined #openstack-infra10:13
*** SergeyLukjanov has quit IRC10:15
*** SergeyLukjanov has joined #openstack-infra10:16
ogelbukhI recall some bot in infra which reported new bugs created in launchpad, is it for real or just my imagination?10:19
*** jcooley_ has joined #openstack-infra10:22
*** adalbas has joined #openstack-infra10:23
*** nsaje has joined #openstack-infra10:23
openstackgerritKei YAMAZAKI proposed a change to openstack-infra/jenkins-job-builder: Added support for Emotional Jenkins  https://review.openstack.org/5677910:26
*** jcooley_ has quit IRC10:28
*** lcestari has joined #openstack-infra10:32
*** nsaje has quit IRC10:36
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Added task ordering  https://review.openstack.org/5602610:36
*** nsaje has joined #openstack-infra10:37
*** nsaje has quit IRC10:37
*** nsaje has joined #openstack-infra10:38
*** yamahata_ has joined #openstack-infra10:38
*** lcestari has quit IRC10:39
*** nsaje has quit IRC10:42
*** nsaje has joined #openstack-infra10:46
*** lcestari has joined #openstack-infra10:46
*** marun has joined #openstack-infra10:49
*** hashar has quit IRC10:51
*** davidhadas has quit IRC10:52
*** davidhadas has joined #openstack-infra10:53
*** vladan_ has joined #openstack-infra10:56
*** boris-42_ is now known as boris-4210:56
*** matsuhashi has quit IRC10:58
*** matsuhashi has joined #openstack-infra10:59
*** vladan has quit IRC10:59
*** vladan_ is now known as vladan10:59
*** matsuhashi has quit IRC11:04
*** rfolco has joined #openstack-infra11:08
*** sarob has joined #openstack-infra11:08
*** dizquierdo has quit IRC11:12
*** sarob has quit IRC11:12
*** plomakin has quit IRC11:19
*** yaguang has quit IRC11:20
*** CaptTofu has quit IRC11:21
*** CaptTofu has joined #openstack-infra11:22
*** jcooley_ has joined #openstack-infra11:25
*** CaptTofu has quit IRC11:27
*** SergeyLukjanov has quit IRC11:30
*** marun has joined #openstack-infra11:31
*** SergeyLukjanov has joined #openstack-infra11:32
*** pcm_ has joined #openstack-infra11:34
*** pcm_ has quit IRC11:34
*** pcm_ has joined #openstack-infra11:35
*** mugsie has quit IRC11:40
*** ogelbukh1 has quit IRC11:41
*** plomakin has joined #openstack-infra11:46
*** ilyashakhat has quit IRC11:47
*** ilyashakhat has joined #openstack-infra11:48
*** ben_duyujie has joined #openstack-infra11:48
*** SergeyLukjanov has quit IRC11:51
*** arata has left #openstack-infra11:57
*** jcooley_ has quit IRC11:59
*** davidhadas_ has joined #openstack-infra12:02
*** amotoki has quit IRC12:03
*** davidhadas has quit IRC12:05
*** ruhe has joined #openstack-infra12:08
*** sarob has joined #openstack-infra12:08
*** davidhadas has joined #openstack-infra12:09
*** mugsie has joined #openstack-infra12:09
*** davidhadas_ has quit IRC12:12
*** sarob has quit IRC12:14
*** osanchez has joined #openstack-infra12:20
*** markmc has joined #openstack-infra12:23
*** nati_ueno has joined #openstack-infra12:24
*** odyssey4me has quit IRC12:26
*** ruhe has quit IRC12:32
*** salv-orlando has quit IRC12:34
openstackgerritZane Bitter proposed a change to openstack-infra/reviewstats: Add randall-burt to heat-core  https://review.openstack.org/5721112:34
*** davidhadas_ has joined #openstack-infra12:35
*** davidhadas has quit IRC12:38
*** salv-orlando has joined #openstack-infra12:43
sorenI'm looking at why this failed: https://review.openstack.org/#/c/55519/  Does the name in zuul's layout.yaml's jobs section need to start with ^ and end with $ for it to be recognised as a regex or did I just screw up the regex?12:44
*** jhesketh_ has quit IRC12:45
*** masayukig has quit IRC12:48
sorenYup, it seems it needs to start with a ^.12:48
* soren fixes12:48
*** markmc has quit IRC12:49
openstackgerritSoren Hansen proposed a change to openstack-infra/config: Add BasicDB to stackforge  https://review.openstack.org/5551912:50
*** jcooley_ has joined #openstack-infra12:55
*** davidhadas has joined #openstack-infra12:57
*** davidhadas_ has quit IRC13:00
*** dkranz has joined #openstack-infra13:00
*** jcooley_ has quit IRC13:00
*** dizquierdo has joined #openstack-infra13:01
*** afazekas_ has quit IRC13:02
*** sarob has joined #openstack-infra13:08
*** sarob has quit IRC13:13
*** afazekas_ has joined #openstack-infra13:14
*** salv-orlando has quit IRC13:17
*** xeyed4good has joined #openstack-infra13:17
*** dprince has joined #openstack-infra13:18
*** marun has quit IRC13:20
*** thomasem has joined #openstack-infra13:34
*** nosnos has quit IRC13:35
*** matel has joined #openstack-infra13:36
yassineHi all,13:42
*** nsaje has quit IRC13:42
yassinecould someone please validate https://review.openstack.org/#/c/5692713:42
*** nsaje has joined #openstack-infra13:43
matelDo we have any stats about the number of the nova check jobs?13:43
anteayathe gate appears to be starved for nodes13:44
anteayacheck jobs are running though13:45
anteayamatel do you mean the number of nova check jobs running right now?13:45
*** weshay has joined #openstack-infra13:45
anteayaor what time frame13:45
anteayahi yassine when a core wanders by, they can have a peek13:46
*** sandywalsh_ has joined #openstack-infra13:46
*** nsaje has quit IRC13:47
yassineanteaya, thank you Anita ;)13:47
*** yamahata_ has quit IRC13:48
*** dstanek has joined #openstack-infra13:48
anteayanp :D13:48
matelanteaya: I would like to know how many jobs do I need to run if I do the third party testing on nova.13:48
*** yamahata_ has joined #openstack-infra13:49
matelto know how much compute capacity is needed.13:49
*** marun has joined #openstack-infra13:49
*** wenlock has joined #openstack-infra13:50
*** wenlock has quit IRC13:50
anteayamatel ummm, are you asking what compute capacity is needed by infrastructure or on your own local system?13:50
matelyep.13:51
anteayaI haven't caught up to the context of your question13:51
anteayacompute capacity for infrastructure would be a question for fungi when he awakes13:51
matelI am looking at: http://ci.openstack.org/third_party.html13:51
anteayaI'd give him about another 40 minutes to arrive13:51
anteayaah okay13:52
anteayaknow that right now our infrastructure is in a very unhappy place13:52
matelSo if I were doing the 3rd party testing in my own datacenter, roughly how many machines would I need.13:52
anteayaand fungi will probably be attending to that for at least the first 30 minutes upon his arrival13:52
anteayamatel: okay, catching up with you now13:52
anteayaso you are just testing nova?13:53
matelYes, I am only interested in nova atm.13:53
anteayaokay ArxCruz might have some suggestions, as he has set up his own infra that listens to our gerrit stream13:53
dprincesdague: ping13:54
anteayaso that is an option, but he isn't online right now13:54
matelDo you have any stats on the check jobs? How many jobs did you run on each day, or something like this?13:54
anteayamorning dprince13:54
dprinceanteaya: morning!13:54
anteayamatel: have you seen our status page? http://status.openstack.org/zuul/13:54
anteayadprince: sdague is day 2 on $new_job and is in San Jose this week13:54
*** arata has joined #openstack-infra13:55
*** SergeyLukjanov has joined #openstack-infra13:55
anteayadprince: didn't see him online at all yesterday13:55
*** SergeyLukjanov is now known as _SergeyLukjanov13:55
anteayanot sure what to expect today13:55
dprinceanteaya: maybe he's in training on the new job13:55
anteayadprince: that is my guess as well13:55
*** _SergeyLukjanov is now known as SergeyLukjanov13:56
*** SergeyLukjanov is now known as _SergeyLukjanov13:56
anteayadprince: hang out with us today, jenkins is sick and at least having you know that helps to curb the recheck13:56
*** jcooley_ has joined #openstack-infra13:57
*** wenlock has joined #openstack-infra13:57
anteayaand the gate has some nodes now13:57
matelanteaya: I am looking for a number of checks per day for project nova, and I cannot find it at zuul's page, could you help me?13:57
anteayamatel: I am giving you what I have13:57
* dprince hangs out13:57
anteayait is possible we don't keep that stat13:57
anteayadprince: thanks13:57
*** hashar has joined #openstack-infra13:58
mateldprince: do you have any stats regarding to the number of nova check jobs per day?13:58
*** sandywalsh__ has joined #openstack-infra13:58
anteayaalso there may be other -infra folks who happen by who might be better help than I13:58
anteayaso hang around13:58
matelyep, trying dprince atm.13:58
*** davidhadas_ has joined #openstack-infra13:58
matelmaybe ss has some stats.13:58
anteayayou don't have to ask people, they will read the backscroll13:59
anteayaand respond to you if they have an answer13:59
anteayajust hang out and welcome to the channel13:59
dprincematel: between 200-400 a day I think is reasonable for nova13:59
mateldprince: thanks.13:59
dprincematel: actually I think that is total (not just nova)13:59
anteayawhile you are waiting you are welcome to read some logs: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/13:59
*** herndon_ has joined #openstack-infra14:00
dprincematel: but nova runs the bulk of them for sure14:00
*** sandywalsh_ has quit IRC14:00
*** _SergeyLukjanov has quit IRC14:00
*** sandywalsh__ has quit IRC14:01
*** davidhadas has quit IRC14:01
*** SergeyLukjanov has joined #openstack-infra14:01
*** johnthetubaguy has joined #openstack-infra14:01
fungiwhat was the capacity question?14:01
*** dkranz has quit IRC14:02
*** afazekas_ has quit IRC14:02
fungioh, asking "why does the gate go so slowly?"14:02
fungiit's that thread posted to the -dev ml last week (i think by jog0?) about "gate math" wherein it was explained that the current deciding factor in gating throughput is the percentage of nondeterministic failures in integration tests14:03
*** yaguang has joined #openstack-infra14:04
anteayamorning fungi14:04
*** arata has left #openstack-infra14:04
anteayamatel needs to do third party testing on nova and wants to know how large his/her data centre should be14:04
*** ruhe has joined #openstack-infra14:04
fungisince every change which encounters a test failure restarts tests for all the changes behind it. when a change averages half a dozen devstack-gate nodes and the queue is 75 changes deep, a reset near the head of the queue throws away 450 (75*6) virtual machines. our combined quota across all providers is right around 400 at the moment14:05
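fungi's gate math, worked through with the numbers from this conversation:

```python
# Back-of-the-envelope gate math from the discussion above: a failure near
# the head of the gate queue restarts jobs for every change behind it.
queue_depth = 75        # changes deep in the gate
nodes_per_change = 6    # devstack-gate nodes an average change uses
provider_quota = 400    # combined node quota across providers

nodes_thrown_away = queue_depth * nodes_per_change
print('a reset near the head throws away ~%d virtual machines '
      '(quota is ~%d)' % (nodes_thrown_away, provider_quota))
# -> a reset near the head throws away ~450 virtual machines (quota is ~400)
```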
fungiahh, so for third-party testing, chances are it's one job per change being checked14:06
anteayafungi: yes, so the question was do we have stats on how many nova check jobs run in a 24 hour period14:06
*** sandywalsh_ has joined #openstack-infra14:07
*** nsaje has joined #openstack-infra14:07
anteayaif we have this stat, I don't know where it would be14:07
*** dkliban has quit IRC14:07
fungii don't have exact numbers for nova (might be able to dig them out of graphite.openstack.org if we have per-project stats there?), but if you look at the patchset created graph at the bottom of status.openstack.org/zuul (the green line on the gerrit events chart) we seem to be floating around 50 new patchsets an hour project-wide14:08
*** sarob has joined #openstack-infra14:08
fungii expect nova to be less than half that14:08
anteayaawesome thanks fungi14:09
anteayamatel: did you catch that?14:09
anteayarough guess is about 20-25 new patchsets per hour14:10
fungii would call that a safe overestimate14:11
*** julim has joined #openstack-infra14:12
*** sarob has quit IRC14:13
*** mriedem has joined #openstack-infra14:13
*** dkranz has joined #openstack-infra14:14
*** ben_duyujie has quit IRC14:14
*** jergerber has joined #openstack-infra14:16
*** ryanpetrello has joined #openstack-infra14:16
*** DinaBelova has quit IRC14:17
*** yamahata_ has quit IRC14:18
anteayaI'd like to have a conversation about resource usage for devstack/tests14:20
anteayaright now there is no policy14:20
anteayaso tests that use more resources just consume more14:21
*** dolphm has joined #openstack-infra14:21
*** alcabrera has joined #openstack-infra14:21
anteayaI'd like us to have a discussion about having some form of policy about it so that folks using local development as well as scaling for more tests (infra) at least have some structure regarding resource usage expectations14:22
*** sgran has left #openstack-infra14:23
*** dolphm has quit IRC14:23
*** yamahata_ has joined #openstack-infra14:23
fungia start would be to push back on people wanting to increase timeouts on some jobs, and consider any job which comes within 25% of its allotted run time to need "fixing" (more parallelization? split types of tests between an increased number of jobs?)14:24
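A sketch of that "within 25% of its allotted run time" rule as a check; the job names and minutes below are hypothetical, just to show the idea.

```python
# Flag jobs whose typical runtime is within 25% of the configured timeout.
# The names and numbers here are made up for illustration.
jobs = {
    # name: (typical runtime in minutes, configured timeout in minutes)
    'gate-tempest-devstack-vm-full': (55, 65),
    'gate-nova-python27': (12, 40),
}

for name, (runtime, timeout) in sorted(jobs.items()):
    if runtime >= 0.75 * timeout:
        print('%s needs fixing: uses %d of its %d minute timeout'
              % (name, runtime, timeout))
```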
*** mestery_ is now known as mestery14:24
*** changbl has quit IRC14:25
*** DinaBelova has joined #openstack-infra14:25
*** dolphm has joined #openstack-infra14:28
anteayawould creating more jobs reduce resource consumption overall?14:28
anteayathinking of the splitting-tests suggestion14:29
*** jcooley_ has quit IRC14:30
*** mfer has joined #openstack-infra14:30
fungiit could at least reduce the time changes take to get through the gate since it would trivially increase parallelization14:31
fungithe tradeoff is that it would utilize more jenkins slaves (so we'd want more compute resources donated, maybe from additional providers)14:32
anteayafair enough14:32
anteayadid we get any indication at the summit there might be any additional providers in the future?14:33
*** mrodden has joined #openstack-infra14:34
fungittx: while you seem to be around, thoughts on tagging folsom-eol to the tip of stable/folsom on each of the projects listed in http://lists.openstack.org/pipermail/openstack-stable-maint/2013-November/001707.html14:34
fungittx: is that something you want to do (looks like you did diablo and essex), or a task for adam_g/apevec?14:34
fungi(or should i just do it so we can get on to deleting the branches?)14:35
*** xeyed4good has quit IRC14:35
*** afazekas_ has joined #openstack-infra14:35
anteayayou know the model that has individuals running scientific computational processes on their laptops during downtime to help large science projects?14:36
anteayaI wonder if we could look at that as a model for our testing somehow14:36
*** oubiwann has joined #openstack-infra14:36
anteayasince even getting new cloud providers will again reach a limit14:36
anteayasimple tests like the pep8 tests14:37
*** locke105 has joined #openstack-infra14:38
fungisomething distributed-grid-ish like boinc? it's well designed for basic computations, but emulating an entire operating system under that model would quickly cease to be viable i think14:39
fungii used to manage high-performance distributed compute clusters, and even with existing homogeneity (achieved by having all cluster nodes consist of identical hardware with identical software and configuration), there's a distinct lower bound on how finely you can separate tasks14:40
*** markmc has joined #openstack-infra14:41
fungia distributed scheduler like mosix is about the closest i ever saw, and even that depended on the underlying operating systems and hardware to be identical14:41
fungias far as running os-level tasks distributed throughout a cluster i mean14:42
fungipvm/mpi got better performance, but there you were writing specifically cluster-aware applications to take advantage of your message passing interfaces14:42
anteayaokay14:43
anteayadidn't think it was viable14:43
anteayabut the idea arrived so I thought I would give it some air time14:43
fungibasically, with cloud providers giving us mostly-identical virtual machines, we're sort of already doing something like that14:44
anteayayes14:44
anteayaam just trying to think of options for scaling out14:45
anteayawith the rate of growth, I can see us spending a lot of time this cycle hitting limits14:45
*** jlk has quit IRC14:45
*** jamesmcarthur has joined #openstack-infra14:45
anteayaso trying to float ideas for new models, rather than just making the current model bigger14:45
*** jlk has joined #openstack-infra14:46
anteayaI have a lot of ideas that don't work14:46
fungibut even if we conquered the virtual-machine-on-a-loose-grid challenge, massive distributed projects like seti, folding and so on rely on multiple systems redundantly solving the same tasks so that they can deal with nodes misreporting or disappearing. worse, there's no real guarantee on how long a task will take to complete under that model, so i think our throughput would likely suffer14:46
*** thedodd has joined #openstack-infra14:46
*** zul has quit IRC14:47
*** zul has joined #openstack-infra14:47
anteayayes it would14:47
anteayawas thinking the same14:47
dstanekis there a way to open a closed review?  there are a couple of reviews that i'm interested in that have been abandoned because the author never came back after a bad review14:47
anteayanodes going down due to power outages, or folks shutting down14:47
anteayadstanek: do you have urls?14:48
*** davidhadas_ has quit IRC14:48
anteayainfra core can open as well as the original author14:48
anteayathose are the only 2 choices I know of14:48
fungidstanek: if you're not the author, you need a gerrit admin to restore it. give me the review numbers and i can take care of it14:48
dstanekanteaya, fungi: is there a permission to allow core reviewers to reopen - so i don't have to bug you guys all the time?14:49
anteayafungi: is gerrit admin different from infra-core?14:49
dstaneka simple example is review.openstack.org/54632 - that came across in my email today14:49
fungidstanek: nope. if there were, we wouldn't be having this conversation ;)14:49
dstanekanteaya: i mean a core reviewer of the project14:50
dstanekfungi: :)14:50
fungii agree it's silly that it isn't a gerrit acl permission14:50
fungianteaya: at the moment there is a 1:1:1 correspondence between gerrit admins and infra core reviewers and infra root sysadmins14:52
fungii was simply being more specific14:52
anteayaokay, thanks14:53
anteayaI'm losing the distinction in my mind since it is all the same group14:53
anteayabut gerrit admins it is14:53
anteayaI count on your specificity14:53
*** dolphm has quit IRC14:54
mriedemi'm not subscribed to the openstack-infra mailing list, is it pretty active, or does openstack-dev with the [infra] tag get just as much traffic?14:57
*** sarob has joined #openstack-infra14:59
anteayamriedem: fairly quiet15:00
anteayaa thread or two a week and pleia2 meeting reminder and the link to logs15:00
anteayamriedem: http://lists.openstack.org/pipermail/openstack-infra/15:01
*** dolphm has joined #openstack-infra15:02
*** salv-orlando has joined #openstack-infra15:02
*** sarob has quit IRC15:03
mriedemanteaya: ah, ok, was looking for this: http://lists.openstack.org/pipermail/openstack-infra/2013-November/000432.html15:03
*** sarob has joined #openstack-infra15:04
mriedemwe have a team in china getting CI setup for DB2 and they are relatively new to community15:04
*** dkliban has joined #openstack-infra15:04
mriedemwas trying to monitor what they are doing but wasn't on the infra ML15:04
anteayamriedem: ah okay, great15:05
anteayawell feel free to join or track the archives, your call15:05
*** rnirmal has joined #openstack-infra15:05
anteayahttp://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra15:05
anteayaif you want to sign up15:06
matelanteaya: Thanks for the info (I re-read the logs)15:07
anteayamatel: great15:08
anteayafeel free to idle, read logs, ask questions and help if you can15:08
anteayaas you set up your 3rd party tests15:08
anteayaand afterward15:09
mriedemanteaya: do you know if there are some numbers for how many patches go through the check queue in a month for a given project? not seeing anything obvious on zuul, but i have to believe that infra tracks that15:12
mriedemgetting an idea for load15:12
anteayamriedem: seems to be a popular question today15:12
anteayafungi did give an explanation about how one might calculate that15:13
anteayaand no we don't track check load based on project15:13
fungiwell, what i said was we might have per-project stats on graphite.openstack.org but i don't remember15:13
anteayaah yes15:13
* mriedem looks15:14
*** xeyed4good has joined #openstack-infra15:14
fungiwe do have per-job stats, so could probably base it off a graph for the nova python 2.7 unit tests15:15
dolphmanyone- is it possible to grant Restore privileges to *-core groups?15:15
anteayadolphm: do we have a use case?15:15
mriedemfungi: bingo, looking at gate-sqlalchemy-migrate-python27 in graphite15:16
dolphmanteaya: during release time and summit time, a lot of reviews get abandoned because *-core is busy15:17
dolphmanteaya: it's really *-core's fault, and we should have the power to fix the problem15:17
fungidolphm: we wish, but no it's a change-owner-only feature with no corresponding acl permission15:18
dolphmanteaya: rather than depending on the author to Restore (they're generally just as busy anyway)15:18
fungidolphm: only the change owner and gerrit administrators can restore an abandoned change15:18
fungidolphm: it's one of the oversights i hope to be corrected in newer gerrit releases15:19
dolphmfungi: boo, thanks15:20
dolphmfungi: if you know of anything tracking that upstream, i'd appreciate it :)15:20
fungidolphm: i'll try to remember to ask zaro if he's tried it on gerrit 2.7, since he's been running the effort to eventually be able to upgrade us15:21
anteayahere is the patchset created graph for the last 24 hours, matel, mriedem: http://graphite.openstack.org/graphlot/?width=1228&height=630&_salt=1384874559.636&target=stats_counts.gerrit.event.patchset-created15:23
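That same counter can be totalled programmatically through graphite's JSON render API; a sketch, taking the target from the URL above and assuming null datapoints can be treated as zero:

```python
# Sketch: total the patchset-created counter via graphite's render API.
# The target name comes from the graphlot URL above; the render endpoint
# parameters are standard graphite, but verify against the live server.
import requests

URL = ('http://graphite.openstack.org/render/'
       '?target=stats_counts.gerrit.event.patchset-created'
       '&from=-24hours&format=json')

series = requests.get(URL).json()
total = sum(value or 0
            for target in series
            for value, timestamp in target['datapoints'])
print('patchsets created in the last 24h (all projects): ~%d' % total)
```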
*** nati_uen_ has joined #openstack-infra15:25
*** dolphm has quit IRC15:26
*** jcooley_ has joined #openstack-infra15:27
*** nati_ueno has quit IRC15:28
matelanteaya: Thanks, that's great!15:28
anteayamatel: glad to help15:28
*** dcramer_ has joined #openstack-infra15:28
*** dolphm_ has joined #openstack-infra15:29
funginote that that's patchsets created for all projects15:29
mordredGAH scrollback15:31
mordredfungi: morning fungi - have I missed anything interesting?15:32
*** datsun180b has joined #openstack-infra15:32
*** jcooley_ has quit IRC15:33
mordredttx: yes. that is exactly correct. it basically comes down to an end user downloading *client15:33
mordredttx: if they want to talk to a nova, they do not necessarily know, nor should they need to, what version of nova is running15:33
mordredso the decision was that we'd make sure that client libs were always backwards compatible15:34
mordrednow we just need to land the jobs that test that :)15:34
*** senk has joined #openstack-infra15:34
ttxmordred: see backlog, someone suggested multiple client branches15:35
mordredttx: I scanned and couldn't find the mention15:35
mordredttx: do you know what they were trying to achieve?15:35
*** atiwari has joined #openstack-infra15:36
ttxmordred: <markwash> 18:09:48> can I use gerrit to track two major-release series of python-glanceclient? Like, suppose I want to create a 1.0 branch and start making backwards incompatible changes. . .15:37
mordredah. ok. that's slightly different15:37
mordredand that can  theoretically be done -- the intent there is not "I want a stable/grizzly glanceclient" but instead "I want to rev the major version"15:38
mordredthere are a BUNCH of things we'd need to think through, but mechanically it's not impossible15:38
mordredthat's what jog0 does with hacking15:38
dolphm_is this "official"? https://twitter.com/osjenkins15:40
*** dolphm_ is now known as dolphm15:40
anteayadolphm: never seen it before, don't know who put it up15:44
fungidolphm: i do not believe so, but not sure whose it is15:45
*** Loquacity has quit IRC15:45
anteayasince it is using OpenStack as part of its title, would the foundation be interested in this account, I wonder?15:45
anteayawhat do you think ttx?15:45
fungimordred: but i think determining when to perform integration tests with which client branch (would we duplicate every test involving that client?) might get complicated15:46
* ttx looks15:47
dolphmi assume one of the people that retweeted the only tweet created the account15:47
*** Loquacity has joined #openstack-infra15:47
ttxanteaya: maybe?15:48
*** SpamapS_ has joined #openstack-infra15:48
*** Vivek_ has joined #openstack-infra15:48
* ttx enjoys a 10-min break15:48
anteayafair enough15:48
anteaya:D15:48
zarofungi: huh, what about gerrit?15:48
*** pcrews has joined #openstack-infra15:48
ttxThese 10 15-min status syncs over the day are a bit of a crazier idea than I thought15:49
anteayadolphm: the three retweets are all eNovance folks15:49
anteayasounds like an eNovance thing15:49
fungizaro: any idea if acls in 2.7 allow us to set groups who are allowed to restore (unabandon) changes?15:49
zaroyes, i believe that's available.15:49
*** salv-orlando has quit IRC15:50
*** afazekas_ has quit IRC15:50
dolphmanteaya: well there you go15:50
fungidolphm: ^ sounds like maybe it'll be fixed once we upgrade then15:50
dolphmfungi: zaro: ooh, thanks!15:50
* dolphm waits patiently for 2.715:51
*** klrmn1 has joined #openstack-infra15:51
anteayadolphm: yup, thanks for the link15:51
*** ArxCruz has quit IRC15:51
*** ArxCruz has joined #openstack-infra15:51
*** samalba_ has joined #openstack-infra15:51
*** koohead17_ has joined #openstack-infra15:52
*** juice_ has joined #openstack-infra15:52
*** Vivek has quit IRC15:52
*** ianw has quit IRC15:52
*** samalba has quit IRC15:52
*** SpamapS has quit IRC15:52
*** mordred has quit IRC15:52
*** juice has quit IRC15:52
*** klrmn has quit IRC15:52
*** koolhead17 has quit IRC15:52
*** klrmn1 is now known as klrmn15:52
*** juice_ is now known as juice15:52
fungittx: thoughts on tagging folsom-eol to the tip of stable/folsom on each of the projects listed in http://lists.openstack.org/pipermail/openstack-stable-maint/2013-November/001707.html (you did the previous ones, but do you want adam_g/apevec to do it this time, or should i just do it so i can go ahead deleting the branches)?15:53
*** atiwari has quit IRC15:53
*** jcooley_ has joined #openstack-infra15:54
*** samalba_ is now known as samalba15:54
*** atiwari_ has joined #openstack-infra15:54
*** atiwari_ has quit IRC15:54
*** atiwari has joined #openstack-infra15:54
*** adam_g has left #openstack-infra15:54
*** adam_g has joined #openstack-infra15:54
adam_gfungi, hey! sorry, im on holiday.  this is my first hour online since HK. im fine with you tagging it.15:55
*** jcooley_ has quit IRC15:55
*** jcooley_ has joined #openstack-infra15:56
adam_gfungi, about to be offline again but back the 25th15:56
fungiadam_g: no worries. have a good holiday15:57
*** salv-orlando has joined #openstack-infra15:58
*** ianw has joined #openstack-infra15:59
ttxfungi: you can go ahead15:59
*** mordred has joined #openstack-infra15:59
fungittx: okay, i'll sign them and emulate whatever you did with tag message for diablo/essex15:59
*** primemin1sterp is now known as primeministerp16:01
*** acabrera has joined #openstack-infra16:02
*** sandywalsh_ has quit IRC16:03
*** reed has joined #openstack-infra16:03
*** adalbas has quit IRC16:04
*** nsaje has quit IRC16:05
*** adam_g is now known as adam_g_afk16:05
*** alcabrera has quit IRC16:05
*** nsaje has joined #openstack-infra16:05
openstackgerritAlex Gaynor proposed a change to openstack-dev/pbr: Bump the development status classifier.  https://review.openstack.org/5727216:06
*** sarob has quit IRC16:07
*** sarob has joined #openstack-infra16:07
*** adalbas has joined #openstack-infra16:07
*** xeyed4good has quit IRC16:07
*** koohead17_ is now known as koolhead1716:08
*** rcleere has joined #openstack-infra16:09
*** nsaje has quit IRC16:10
*** danger_fo_away is now known as dangers16:11
*** sarob has quit IRC16:11
*** katyafervent has joined #openstack-infra16:13
*** rakhmerov has quit IRC16:14
*** rakhmerov has joined #openstack-infra16:14
*** rakhmerov has quit IRC16:14
*** mrodden has quit IRC16:15
*** ruhe has quit IRC16:15
*** dkranz has quit IRC16:19
mordrednetsplit ftw16:20
mordredfungi: Alex_Gaynor says we have some issues in the gate16:20
Alex_Gaynormordred: technically in the check I suppose :)16:21
*** ruhe has joined #openstack-infra16:22
*** senk has quit IRC16:22
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Create and upload wheels  https://review.openstack.org/5676016:22
*** senk has joined #openstack-infra16:22
Alex_Gaynormordred: Any idea how hard it would be to make the bot use different messages for "new patch" vs. "updated a patch"?16:23
*** nati_uen_ has quit IRC16:24
mordredAlex_Gaynor: hrm. I have no idea. the event is always patchset-created ... I'm not sure if a ,2 is in the payload anywhere16:24
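(A minimal sketch of the distinction Alex_Gaynor is after, assuming the gerrit stream-events JSON payload exposes patchSet.number and change.url as stock Gerrit does; describe_patchset_event is a hypothetical helper, not part of the existing bot:)

    import json

    def describe_patchset_event(raw_event):
        """Turn a gerrit stream-events line into a bot-friendly message."""
        event = json.loads(raw_event)
        if event.get('type') != 'patchset-created':
            return None
        # patchset-created fires for every upload; the patchset number in the
        # payload tells a first upload (,1) apart from a revision (,2 and up).
        number = int(event['patchSet']['number'])
        url = event['change']['url']
        if number == 1:
            return 'proposed a change: %s' % url
        return 'updated a change (patchset %d): %s' % (number, url)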
*** nsaje has joined #openstack-infra16:24
EmilienMmordred: i've been trying to get more reviews on https://review.openstack.org/#/c/48042/ for a long time without success, could you help with this?16:24
mordredEmilienM: I'm not core on devstack16:25
EmilienMmordred: oops, ok16:25
mriedemEmilienM: https://review.openstack.org/#/admin/groups/50,members16:25
EmilienMmriedem: thx16:26
mordredEmilienM: sorry. I'd totally help you otherwise16:26
*** senk has quit IRC16:26
EmilienMmordred: appreciate16:26
*** noorul has joined #openstack-infra16:27
noorulHello16:28
*** cody-somerville has quit IRC16:28
noorulhttps://review.openstack.org/#/c/57095/16:28
jeblairhello everybody16:28
noorulIn the review commit log "blueprint api" is mentioned16:28
*** senk has joined #openstack-infra16:28
noorulIt gets translated to https://blueprints.launchpad.net/openstack/?searchtext=api16:28
noorulInstead of https://blueprints.launchpad.net/solum/+spec/api16:29
fungimordred: Alex_Gaynor: what issues exactly? we have a few hung changes waiting for a zuul restart, but the gate was too deep overnight and this morning to consider it yet16:29
noorulIs this expected behavior?16:29
jeblairmordred, fungi, ttx, clarkb (when you get online): i'm making a back-from-vacation todo list; anything you need me to put on it?16:29
mordredjeblair: yay!16:29
mordredjeblair: you're back!16:29
Alex_Gaynorfungi: yeah it's the hung ones I think. They are 404s in Jenkins at this point16:29
mordredjeblair: I would really like to get everything in the wheels-in-mirror topic landed16:29
fungiAlex_Gaynor: yeah, those are left over from when puppet agent ate jenkins01 yesterday16:30
mordredjeblair: AND - when you have a minute, I'd like to talk to you about a logic add to nodepool16:30
mordredjeblair: related to adding the new hp cloud region16:30
Alex_Gaynorfungi: heh. Okey doke16:30
mordredjeblair: I wanted to talk to you before I did anything in that direction16:30
mordredalthough landing the new hp cloud region into nodepool now should be fine as a trial balloon16:31
*** senk has quit IRC16:31
mordredjeblair: other than that - welcome back!16:31
*** ruhe has quit IRC16:31
jeblairmordred: ack, thx.16:32
fungimordred: i hadn't looked back at it after clarkb and i discussed the hpcloud beta region addition, but does that nodepool change deal okay with providers with no region-name specified (getting set to none in novaclient)?16:32
mordredjeblair: (some of the wheels-in-mirror patches have -1 vrfy because of gate races, some of them are waiting on https://review.openstack.org/#/c/56920/ to merge)16:33
*** senk has joined #openstack-infra16:33
mordredfungi: you mean availability-zone? I believe it should - I think it should only add availability-zone if it's in the file16:34
fungimordred: er, right. az16:34
*** hashar has quit IRC16:34
*** dkranz has joined #openstack-infra16:35
fungiit was getting late and we thought the code was setting it to none in the connection if not specified, but i'll review with that in mind16:35
mordredjeblair: tl;dr - new hp cloud has multi-az in single region, which means we still have multiple azs to balance across, but they share a quota16:35
mordredjeblair: so we need to tell the nova api which az to put a thing in16:36
funginoorul: gerrit can't differentiate between projects when hyperlinking blueprints to lp16:36
mordredjeblair: alternately, jog0 says that if we leave az out, nova will just pick one - so it's possible that just ignoring the az altogether might be fine16:36
funginoorul: if you use globally unique bp names and make solum "part of" the openstack project group (do other stackforge projects do that?) the bp search link may work as expected16:37
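(For context, the search-style link noorul sees comes from a project-wide commentlink rule in Gerrit's gerrit.config, roughly along these lines; the regex here is illustrative and the exact rule in openstack-infra/config may differ. Gerrit applies one rule to every project, which is why it cannot build a per-project +spec URL:)

    [commentlink "blueprint"]
      match = "(\\b[Bb]lue[Pp]rint\\b[ \\t:]*)([A-Za-z0-9-]+)"
      link = "https://blueprints.launchpad.net/openstack/?searchtext=$2"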
*** hashar has joined #openstack-infra16:38
*** ruhe has joined #openstack-infra16:38
*** changbl has joined #openstack-infra16:38
noorulfungi: I see16:38
jeblairmordred: that last sounds ideal, maybe lets try it first?16:39
*** yaguang has quit IRC16:40
*** cody-somerville has joined #openstack-infra16:40
*** dkranz has quit IRC16:42
fungijeblair: for the o'reilly press conference call scheduling, see e-mails from holly bauer in thread with subject: openstack workflow meeting part 216:46
*** sdake_ has quit IRC16:46
*** jcooley_ has quit IRC16:46
*** sdake_ has joined #openstack-infra16:47
*** sdake_ has quit IRC16:47
*** sdake_ has joined #openstack-infra16:47
*** dprince has quit IRC16:50
openstackgerritA change was merged to openstack-infra/reviewstats: Add randall-burt to heat-core  https://review.openstack.org/5721116:50
openstackgerritA change was merged to openstack-infra/storyboard: Added task ordering  https://review.openstack.org/5602616:50
*** markmc has quit IRC16:52
*** markmc has joined #openstack-infra16:52
openstackgerritA change was merged to openstack-infra/config: Update PyPI mirror for sqlalchemy-migrate releases  https://review.openstack.org/5669316:54
*** DennyZhang has joined #openstack-infra16:55
*** dkranz has joined #openstack-infra16:55
yolandahi, i'm just taking a look at bug https://bugs.launchpad.net/openstack-ci/+bug/95040716:57
uvirtbotLaunchpad bug 950407 in openstack-ci "jenkins should run licensecheck on all projects" [Low,Triaged]16:57
*** dangers is now known as danger_fo_away16:57
yolandaand wanted to have a bit more info about it16:57
yolandashould that involve create new jobs, like for example the *-pep8 ones?16:57
jeblairyolanda: yay! :)  i love it when people pick up low-hanging-fruit bugs!16:57
jeblairyolanda: we could create new ones... or maybe we should just have the pep8 job run it... (it's really the "style check job" at this point, it's already more than pep8)16:58
jeblairmordred, jog0: ^16:58
*** sarob has joined #openstack-infra16:58
*** chandankumar has quit IRC16:59
mordredjeblair: I'm fine with the pep8 job running it16:59
yolandaso the bugfix would consist of updating the job configuration in jjb for pep8?17:00
jeblairyolanda: that sounds like a good end-state goal.  though there's one other consideration -- all our projects may not "pass" licensecheck right now...17:02
*** hashar is now known as hasharCall17:02
yolandajeblair, so that licensecheck should be run for some specific ones?17:03
jeblairyolanda: so maybe we should start with a new job to run licensecheck, put it in the experimental queue (and then silent, and later check), and when we're happy it's working, add it to pep817:03
mgagnejeblair: better add a non-voting job until it gets fixed? otherwise gates will be blocked until it's fixed.17:03
*** nibalizer has quit IRC17:03
mgagneforgot about experimental queue ^^'17:03
jeblairmgagne: yeah, same idea, basically17:04
yolandajeblair, so you have a jenkins that runs only the experimental jobs, right? what is it called?17:04
*** markwash has quit IRC17:04
*** gyee has joined #openstack-infra17:05
*** changbl has quit IRC17:05
jeblairyolanda: zuul has an experimental pipeline, so if you add the new job just to that pipeline, you can run it on request17:06
jeblairyolanda: see https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/zuul/layout.yaml  for some examples (gate-devstack-vm-cells is one)17:06
yolandaok17:06
jeblairyolanda: you leave a comment with "check experimental" and it will run those jobs on that change17:06
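(A minimal layout.yaml sketch of what jeblair describes, with a hypothetical job name; the real change would also need the job itself defined in JJB:)

    projects:
      - name: openstack/nova
        experimental:
          - gate-nova-licensecheck

With that in place, leaving a "check experimental" comment on a nova change runs the job on demand without affecting check or gate results.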
*** nibalizer has joined #openstack-infra17:07
yolandaok, i see it17:07
mordredneat!17:08
mordredmordred@camelot:~/src/openstack/nova$ licensecheck nova . | grep -v Apache17:08
mordred./run_tests.sh: *No copyright* UNKNOWN17:08
mordredthat's not bad17:08
*** afazekas_ has joined #openstack-infra17:08
yolandai'll do some trials first on my test jenkins and then try to push a job17:09
jeblairyolanda: awesome, thanks!17:09
*** sdake_ has quit IRC17:09
*** alexpilotti has quit IRC17:10
*** jcooley_ has joined #openstack-infra17:10
pleia2welcome back, jeblair :)17:12
*** pycabrera has joined #openstack-infra17:13
*** pycabrera is now known as alcabrera17:13
jeblairpleia2: thanks :)17:14
hub_capjeblair: how do i get u the wads of unmarked bills i owe u17:14
*** salv-orlando has quit IRC17:14
hub_capthey have successfully been laundered as per your wishes17:14
*** locke1051 has joined #openstack-infra17:16
*** acabrera has quit IRC17:16
*** locke105 has quit IRC17:17
jeblairhub_cap: i'll email you shortly (accounts receivable is #3 on my todo list [which i think is shockingly low])17:19
Alex_Gaynorfungi: fwiw, in addition to the stopped jobs, zuul's queue is also growing (643 results right now)17:19
fungiAlex_Gaynor: yikes, that is high. it wasn't like that earlier i don't think17:22
*** salv-orlando has joined #openstack-infra17:22
SpamapS_Anyone know why this automated "update from global requirements" patch isn't being.. well.. updated? https://review.openstack.org/#/c/5616117:23
fungiAlex_Gaynor: though gate resets with the pipeline as deep as it is probably would account for spikes that high17:23
hub_capjeblair: roger17:23
Alex_Gaynorfungi: hmm, good point17:23
jeblairAlex_Gaynor, fungi: yeah, that's likely a gate reset in progress17:23
fungiSpamapS_: we disabled the job temporarily because it wasn't properly branch aware. there is a change in review to fix it, but i think it still needs work17:23
*** fbo is now known as fbo_away17:24
jeblairAlex_Gaynor, fungi: which just completed17:24
*** mrodden has joined #openstack-infra17:24
*** changbl has joined #openstack-infra17:26
*** flaper87 is now known as flaper87|afk17:27
*** alcabrera is now known as alcabrera|afk17:27
Alex_Gaynorindeed, looks like it's starting to come down now17:27
ttxlooks like we could issue some openstackstatus notice to account for current gate slowness17:28
ttx(or could have issued)17:28
* ttx goes for dinner17:28
openstackgerritRoman Prykhodchenko proposed a change to openstack-infra/devstack-gate: Support Ironic in devstack gate  https://review.openstack.org/5389917:28
*** pcrews has quit IRC17:29
jeblairttx: you can do that, you know.  :)17:29
*** SergeyLukjanov has quit IRC17:30
anteayafungi Alex_Gaynor correct at one point the zuul queue was 017:30
anteayaabout an hour or so ago17:30
*** reed has quit IRC17:30
anteayawelcome back jeblair17:31
SpamapS_fungi: ahh thanks!17:31
*** SpamapS_ is now known as SpamapS17:31
*** DinaBelova has quit IRC17:32
annegentlejeblair: welcome back! Do you have any availability at all today for a call with O'Reilly on workflow? Otherwise has to be put off til next week.17:32
*** datsun180b has quit IRC17:34
clarkbjeblair the zuul nnfi bug17:35
fungiannegentle: oh, they weren't going to be available at all the rest of the week after today? i misread and thought it was next week they were out17:35
jeblairannegentle: today?  hrm, fungi told me they were looking at this week...17:35
clarkbthe one mtreinish found. i submitted it against zuul and it is a "high" bug iirc17:36
annegentlejeblair: Holly's last email said "If not, we may need to try for a week from today due to travel schedules.  "17:36
fungiahh, yes, i just reread it now. i guess so17:36
clarkbfungi: I need to kill old etherpad this morning but then I wanted to add new hpcloud if possible17:37
annegentlejeblair: it'd be great to get it done before t-giving is all, punt to next week on the email thread as needed. My afternoon is already sunk anyway17:37
clarkbmordred did you see my comment on the az nodepool change?17:37
fungiclarkb: it sounds like we may want to rework the hpcloud addition17:37
fungiclarkb: in particular, there was a suggestion (i think per jog0?) that hpcloud's nova will just pick an az for us if we don't specify one17:38
clarkbfungi: yes that is the case17:38
*** salv-orlando has quit IRC17:39
clarkband image snapshots should cross AZs17:39
clarkbso we should be able to treat that region as a single "AZ"17:39
fungiso we could in theory try just adding that region and forego any concerns of cross-az quota sharing support in nodepool for now17:39
clarkb++ simplifies things17:39
clarkbdid floating ip quota come up?17:39
fungioh, right, forgot that was also a thing17:40
*** moted has joined #openstack-infra17:40
clarkbjeblair: https://bugs.launchpad.net/zuul/+bug/124683817:40
uvirtbotLaunchpad bug 1246838 in zuul "NNFI doesn't always reattach child of failing change to nearest non failing item." [High,Triaged]17:40
fungimordred: we worry that a 15 floating ip address limit may become an issue, so are you able to request a large enough fip quota to match our server quota in that region?17:40
*** dprince has joined #openstack-infra17:41
fungibut anyway, at least having nodepool not care which az it's using in the new region would also make it so that we don't immediately need to worry about implementing sub-region az support in nodepool either17:42
Alex_GaynorWhen did the gate get so slow again :/ It used to be like 25-30 minutes17:43
clarkbAlex_Gaynor: between havana release and now17:43
clarkbAlex_Gaynor: it is kind of funny because changes to havana fly through testing with few problems :)17:43
clarkbwe have regressed quickly17:44
Alex_GaynorYeah, was going to say, that didn't take long17:44
clarkbmordred: east nodes don't get a public IP you must floating ip them, so we either have proxy nodes or a giant floating ip quota17:44
*** boris-42 has quit IRC17:44
esmuteHey guys, can you guys take a look at https://review.openstack.org/#/c/52137/? It is currently failing random tempest tests. This is currently blocking our trove integration in horizon and heat17:45
Alex_Gaynoresmute: Small thing, not everyone here is a guy.17:45
clarkbesmute: there isn't much to say other than everyone is in the same boat and it sucks. we had a long discussion over the weekend about how we could tackle the problem but we didn't come away from that with anything concrete17:45
esmuteAlex_Gaynor: The term 'guys' refers to both male and female.17:47
*** salv-orlando has joined #openstack-infra17:47
esmuteAlex_Gaynor:  I apologize if that is not the case.17:47
clarkbwe need folks to be jumping on grenades and sorting out these bugs (and some people are yay jog0 and mikal) but so far it has been a losing battle (bugs come in quicker than we can fix them :/)17:47
jeblairannegentle: my morning is mostly infra and tc meeting prep, could probably call this afternoon (>2100 utc; after tc meeting), but if your afternoon is sunk, maybe we should punt.17:48
*** markwash has joined #openstack-infra17:48
*** jerryz has joined #openstack-infra17:48
clarkboh meeeting is earlier now (relative to me but not fungi) /me runs to office so that old etherpad can be killed17:48
*** hasharCall has quit IRC17:49
esmuteclarkb: Thanks. The tests fail in a different test every time. So i can't get jenkins to +1.17:49
clarkbesmute: right, these are real bugs in openstack that are causing the failures and there is more than one of them17:50
*** hogepodge has joined #openstack-infra17:50
clarkbesmute: http://status.openstack.org/elastic-recheck/ is tracking some of the ones we know about17:50
clarkb1251920 is particularly bad17:51
*** derekh has quit IRC17:51
*** dizquierdo has quit IRC17:51
jeblairclarkb: are they neutron related?  would running duplicate jobs on neutron help?17:52
clarkbjeblair: some are but 1251920 isn't17:52
clarkbjeblair: and we already have extra neutron jobs which seems to have helped17:52
clarkbhttps://bugs.launchpad.net/nova/+bug/1251920 mikal thought it may be a libvirt issue but hasn't been able to track that down yet17:53
uvirtbotLaunchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [High,Confirmed]17:53
fungiwell, except that extra neutron jobs mean neutron changes are even less likely to make it through the gate, so the roughly 1/3 changes in the gate right now which belong to neutron are a big reason our gate throughput has ground slower and slower17:53
clarkbalso checked the cirros issues17:53
annegentlejeblair: it would be fine for you to meet without me, it's just that I already have to miss the first half of the TC meeting. I could meet after the TC meeting though17:53
annegentlejeblair: fungi should we try for 4:00 CST/ 5:00 EST?17:54
jeblairclarkb: we do?  i don't see any duplicates (i thought the proposal was to run vm-neutron like 4 times or something)17:54
jeblairclarkb: i do see there are some extra variants though17:54
clarkbjeblair: well we have a new postgres job17:54
fungijeblair: there are additional job variants which run only on neutron17:55
fungiright17:55
clarkbthat17:55
annegentlejeblair: saw the email, thanks!17:55
clarkbwe could do more with those special jobs17:55
clarkbbefore I forget and I want to talk about this in the meeting, I would like all new d-g/tempest jobs to come with a periodic variant17:56
clarkbkeep getting distracted by things I need to do before leaving17:56
fungineutron also gets an isolation job which we don't run elsewhere yet17:56
fungiat least last i checked17:56
clarkbfungi: I think there are two isolation and two non isolation one for each DB17:56
clarkbthen there is the large ops job which isn't voting yet but could be17:57
fungiyeah, also soon the parallel isolation job, which is only experimental for the moment17:57
jeblairannegentle, fungi: 2200 utc wfm17:57
clarkbthat works for me too if having extra folks listen in isn't too horrible17:58
*** SergeyLukjanov has joined #openstack-infra17:58
fungijeblair: annegentle: clarkb: i'm fine with 2200z today too17:59
*** sdake_ has joined #openstack-infra18:02
NobodyCamclarkb: that stuck ironic job is still showing in zuul? any thing I can do to clear it out?18:03
*** melwitt has joined #openstack-infra18:04
*** DinaBelova has joined #openstack-infra18:05
clarkbNobodyCam push a new patchset or continue waiting. zuul is mega busy though :(18:06
NobodyCamlol ack :)18:06
*** jcooley_ has quit IRC18:08
*** DennyZhang has quit IRC18:09
*** shardy is now known as shardy_afk18:10
*** afazekas_ has quit IRC18:12
*** osanchez has quit IRC18:12
fungiclarkb: yeah, when i woke up this morning the gate was around 75 changes deep, so i didn't think adding a zuul restart into the mix was such a great idea after all18:14
*** ruhe has quit IRC18:17
openstackgerritRoman Prykhodchenko proposed a change to openstack-infra/config: Adds devstack-gate tests for Ironic  https://review.openstack.org/5391718:17
*** hogepodge has quit IRC18:18
zarofungi: yeh.  about the gerrit unabandon thing.18:18
*** flaper87|afk is now known as flaper8718:18
zarofungi: i think any user that has abandon privileges can restore.18:18
*** shardy_afk is now known as shardy18:18
*** moted has quit IRC18:19
zarofungi: so i think we can do that now. right?18:19
fungizaro: in 2.4? i didn't think that was an available option18:19
openstackgerritA change was merged to openstack-infra/jenkins-job-builder: Added support for Emotional Jenkins  https://review.openstack.org/5677918:19
* fungi checks18:19
*** moted has joined #openstack-infra18:20
*** UtahDave has joined #openstack-infra18:20
zarofungi: http://gerrit-documentation.googlecode.com/svn/Documentation/2.4.2/access-control.html18:21
zarofungi: yup. you're right, i don't see abandon control.18:21
fungii was going to check, but the zuul status page is eating my browser alive18:22
*** pcrews has joined #openstack-infra18:22
fungiswap thrashfest18:22
*** nsaje has quit IRC18:23
*** nsaje has joined #openstack-infra18:23
Shrewsnom nom nom... browser cookies... nom nom nom18:24
fungiwow... firefox consuming all 16g of ram on my desktop18:26
clarkbfungi: nice18:26
fungii should probably not leave a window displaying that page 24x7 for weeks18:27
*** johnthetubaguy has quit IRC18:28
*** johnthetubaguy has joined #openstack-infra18:28
*** nsaje has quit IRC18:29
openstackgerritlifeless proposed a change to openstack-infra/reviewstats: Add jenkins-job-builder as a broken out project  https://review.openstack.org/5660318:29
fungivery leaky. after a restart, ~700mb18:29
clarkbfungi: does iceweasel keep up with upstream updates?18:30
anteayattx 3.5 hours remaining in the horizon election?18:30
clarkbyou might try firefox proper; I don't have that problem keeping zuul status open for long periods of time18:30
clarkbI am going to kill old etherpad-dev now18:30
clarkbthen old etherpad.o.o18:31
*** reed has joined #openstack-infra18:31
*** david-lyle is now known as david-lyle_lunch18:33
*** sarob has quit IRC18:35
*** sarob has joined #openstack-infra18:35
*** UtahDave has quit IRC18:35
*** hogepodge has joined #openstack-infra18:36
lifelessclarkb: care to abandon your 'test change do not merge' from d-g ? It's up in reviewers' faces ;)18:37
openstackgerritRoman Prykhodchenko proposed a change to openstack-infra/config: Adds devstack-gate tests for Ironic  https://review.openstack.org/5391718:37
clarkblifeless: sure18:37
clarkbdone18:38
lifelessthanks!18:40
*** sarob has quit IRC18:40
lifelesszuul still looks kindof unhappy?18:40
*** johnthetubaguy has quit IRC18:41
clarkbyup, gate fails have it near worst case so gate pipeline has backed up18:41
clarkblast night I was seeing the head of the pipeline fail most of the time (we did merge three changes once)18:41
clarkbso not quite near worst case but not very happy either18:41
*** Ryan_Lane has joined #openstack-infra18:42
lifelesswheeee18:42
zulwhere are the instructions for deploying openstack-ci for 3rd party testing?18:45
*** blamar has quit IRC18:45
clarkbzul: http://ci.openstack.org/third-party-testing.html iirc18:46
lifelesszul:18:46
lifelesshttp://ci.openstack.org/third_party.html18:46
lifelessbah18:46
lifeless:)18:46
*** nsaje has joined #openstack-infra18:46
fungizul: though keep in mind that's not "installing openstack ci" (we assume people have their own ci of some variety they want to hook up to ours)18:46
fungibuilding a mock openstack infra deployment to use for third-party testing is probably overkill, and also a much more involved topic18:47
zulfungi:  cool thanks18:47
*** mrmartin has joined #openstack-infra18:47
*** blamar has joined #openstack-infra18:48
*** datsun180b has joined #openstack-infra18:49
clarkbetherpad-dev is dead18:50
clarkbprobably not going to get etherpad.o.o before the meeting as I want to double check a few things18:50
anteayaclarkb: well done18:50
*** senk has quit IRC18:54
*** rnirmal has quit IRC18:58
jog0jeblair: hacking has an apache 2 check thanks to sdague19:00
jog0and -1 to overloading the pep8 job any more than we have19:00
*** alcabrera|afk is now known as alcabrera19:01
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: update doc and add new JJB unit tests  https://review.openstack.org/5671519:01
*** sarob has joined #openstack-infra19:01
fungimeetin time!19:01
jeblairjog0: does licensecheck == apache 2 check?  not so sure about that.19:01
jeblairjog0: what would running that outside of the pep8 ("style check") job get us?19:02
jog0jeblair: someone can run 'tox -epep8' on their own box and get the same as the check19:02
clarkbI think that is always the goal so tox would capture it at the very least19:03
jog0clarkb: the bug listed this as an apt-get install19:03
*** ruhe has joined #openstack-infra19:03
*** jcoufal has quit IRC19:04
jog0so I assume its a no go for tox19:05
jeblairjog0: the bug is _very_ old19:05
jeblairjog0: if it won't work in the pep8 job, then yes, it would need to be a different one19:05
jeblairyolanda: ^19:06
*** blamar has quit IRC19:06
*** jcoufal has joined #openstack-infra19:06
jog0jeblair: IMHO the bug is done, we have an apache2 test19:06
jog0jeblair: ohh and welcome back19:07
anteayajog0: joining us for -infra meeting?19:07
jeblairjog0: apparently it fails on nova because run_tests.sh does not have a license header19:07
jeblairjog0: thanks! :)19:07
jog0jeblair: run_tests.sh isn't deliverable code, I am less worried about missing apache2 headers there19:07
clarkbI don't think license check == apache 219:08
jog0anteaya: in the tripleo meeting, but ping me if anything comes up19:08
clarkbI always interpreted that bug as look for potential license conflicts19:08
anteayajog0: very good19:08
*** david-lyle_lunch is now known as david-lyle19:08
jog0clarkb: hmm19:08
yolandajeblair, so no need for that bugfix? or does it still make sense to run it as an extra check?19:09
pleia2yolanda: I think you mean jog0 :)19:09
jeblairyolanda: let's see if we can get some more agreement from jog0 and mordred about that.19:10
yolandaboth :)19:10
yolandaok19:10
*** jamesmcarthur has quit IRC19:10
jog0- licensecheck: given a list of source files, attempt to determine which license (or combination of licenses) each file is placed under.19:10
jog0if thats what it is, then we don't need it AFAIK19:11
*** jamesmcarthur has joined #openstack-infra19:11
jog0its easy for us19:11
jog0apache2 FTW19:11
jog0yolanda:19:11
*** blamar has joined #openstack-infra19:11
yolandaso yes, licensecheck reports every file without a valid license; is the reason for that licensecheck job only Apache?19:12
jog0yolanda: we only have apache219:12
clarkbjog0: that isn't 100% true :) pbr is an offender19:13
jog0clarkb: WAT19:13
clarkbpbr bundles d2to1 which isn't apache2 iirc19:14
clarkbit is bsd or something19:14
jog0ahh anyway thats a special case19:14
jog0this bug doesn't seem valid anymore IMHO19:14
*** sandywalsh_ has joined #openstack-infra19:15
*** mrmartin has quit IRC19:16
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: fix jjb job template documentation  https://review.openstack.org/5706219:19
*** shardy is now known as shardy_afk19:20
*** adalbas has quit IRC19:21
* sdague finally getting back to having a functional laptop19:24
jog0jeblair: want to see a fun graph http://paste.openstack.org/show/53572/19:24
clarkbsdague: \o/19:24
*** whoops has joined #openstack-infra19:24
clarkbsdague: if you look at the bottom of the elastic-recheck page you will see mikal's current nemesis19:25
*** WarrenUsui has joined #openstack-infra19:25
sdagueok, well, I'm just in the barely getting back stage :) so will need to continue on setup much of the day19:25
sdagueI was however racing to get irc working again by TC meeting19:25
clarkbsdague: as much as it hurts to say this no rush19:26
clarkbwe have been dealing with this for the last few days19:26
sdagueheh19:26
sdagueyeh, I noticed the spike gate pattern this morning19:26
openstackgerritA change was merged to openstack-infra/reviewstats: Add jenkins-job-builder as a broken out project  https://review.openstack.org/5660319:27
ttxanteaya: sounds about right19:28
*** aardvark has quit IRC19:28
anteayak19:30
*** sarob_ has joined #openstack-infra19:30
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: fix jjb configuration documentation  https://review.openstack.org/5706219:30
*** sarob has quit IRC19:33
*** sarob_ has quit IRC19:34
*** sarob has joined #openstack-infra19:34
*** rnirmal has joined #openstack-infra19:35
jeblairmordred: can you weigh in on the necessity of licensecheck?19:35
jeblairmordred: jog0 thinks it's redundant (see above ^)19:36
jeblairmordred: and i'd hate for yolanda to waste effort19:36
*** alexpilotti has joined #openstack-infra19:37
yolandajeblair, ATM only wasted like 10 minutes looking at the bug, no worries19:37
portantefolks, regarding the 1251920 related failures, should we be rechecking/reverifying them?19:37
Hunnerclarkb: I have a puppetboard demo set up. Want to G+ screenshare some time today to see if it's something you'd be interested in to replace your dashboard use cases?19:38
*** jerryz has quit IRC19:39
*** marun has quit IRC19:39
*** sarob has quit IRC19:39
clarkbHunner: maybe? I am still kind of swamped after the summit. anteaya and pleia2 might be interested too19:39
*** sarob_ has joined #openstack-infra19:39
*** jerryz has joined #openstack-infra19:40
pleia2Hunner: oh cool, if you and anteaya are free in an hour I'd love to have a look19:40
anteayalet's try it19:41
anteayaI can never get sound to work on this laptop but I can look19:41
*** markmc has quit IRC19:41
HunnerI have a meeting in 49 minutes, but will be free in 1919:41
anteayaor use my personal laptop19:42
anteaya19 minutes works for me19:42
*** markmc has joined #openstack-infra19:42
pleia2haha, I have a meeting in 1919:42
*** sarob_ has quit IRC19:42
Hunnerpleia2: anteaya: 22:00UTC?19:42
pleia2sure19:43
anteayak19:43
Hunner:D19:43
*** sarob has joined #openstack-infra19:44
Hunneranteaya: pleia2: Actually sorry, I have a 22:00-22:30utc. So how about 22:30-23:00?19:45
anteayasure19:45
pleia2ok19:46
*** sarob has quit IRC19:47
*** sarob has joined #openstack-infra19:47
*** chuck__ has joined #openstack-infra19:48
*** sarob has quit IRC19:48
*** sarob has joined #openstack-infra19:49
*** rockyg has joined #openstack-infra19:49
*** davidhadas has joined #openstack-infra19:51
openstackgerritPeter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added support for Jenkins plugin Blame upstream committers  https://review.openstack.org/5408519:53
*** alcabrera is now known as alcabrera|afk19:55
*** dolphm has quit IRC19:55
*** sarob has quit IRC19:57
*** sarob_ has joined #openstack-infra19:57
*** sarob_ has quit IRC19:59
*** yassine has quit IRC19:59
*** sarob has joined #openstack-infra19:59
*** mihgen has joined #openstack-infra19:59
jeblairclarkb: would you like me to tag jjb 0.6.0 ?20:01
clarkbjeblair: I am not sure. zaro mgagne were there any outstanding changes you feel really need to get in a release?20:01
jog0jeblair: so after lunch want to talk about the future of rechecks for a few minutes?20:02
*** sarob has quit IRC20:02
clarkbI did a quick review of what was there and we merged a few things. Not sure if there are others that are really important20:02
*** sarob has joined #openstack-infra20:02
jeblairjog0: sure20:02
zaroclarkb: let me take a look.20:02
*** zul has quit IRC20:03
zaroclarkb: this would be the only one i feel strongly about https://review.openstack.org/#/c/5671520:04
*** lcestari has quit IRC20:04
*** ryanpetrello_ has joined #openstack-infra20:05
*** sarob has quit IRC20:08
*** sparkycollier has joined #openstack-infra20:08
*** ryanpetrello has quit IRC20:09
*** ryanpetrello_ is now known as ryanpetrello20:09
*** yolanda has quit IRC20:09
*** sarob has joined #openstack-infra20:09
*** yolanda has joined #openstack-infra20:10
clarkbreviewed20:11
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Add new HP Cloud region  https://review.openstack.org/5626020:12
mordredclarkb: ^^20:12
openstackgerritSergey Lukjanov proposed a change to openstack-infra/config: Setup devstack-gate tests for Savanna  https://review.openstack.org/5731720:12
clarkbmordred: reviewed, couple things noted20:13
clarkbalso ty20:13
*** alcabrera|afk is now known as alcabrera20:14
jeblairmordred: did you see the floating ip thing earlier?20:14
*** ace05_ has quit IRC20:15
*** markmc has quit IRC20:16
mordredjeblair: just learning about it now20:16
*** japplewhite has joined #openstack-infra20:17
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: update doc and add new JJB unit tests  https://review.openstack.org/5671520:19
*** ace05__ has joined #openstack-infra20:20
*** vipul is now known as vipul-away20:21
*** vipul-away is now known as vipul20:21
*** mrmartin has joined #openstack-infra20:22
sparkycolliermordred: does the new hp trunk zone include Heat?20:23
mordredsparkycollier: no20:23
*** mrmartin has quit IRC20:26
pabelangerAsked before but didn't see a reply, but why don't we include tox -e venv -- python setup.py build_sphinx by default as a tox.ini env?20:27
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: update doc and add new JJB unit tests  https://review.openstack.org/5671520:28
clarkbpabelanger: I don't know20:29
clarkbpabelanger: though it may be reasonable to do that especially since some projects are tracking different requirements for doc builds. I think the biggest hurdle is we currently use tox -evenv -- python setup.py build_sphinx in the doc build jobs, so we would need to update all projects then switch the doc job script20:30
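(A sketch of the two approaches side by side in tox.ini terms; the docs env shown is what pabelanger is suggesting, assuming sphinx is available via the project's requirements, while the venv env is what the current doc jobs drive:)

    [testenv:venv]
    commands = {posargs}

    [testenv:docs]
    commands = python setup.py build_sphinx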
dprincesdague: you mentioned a grenade fix related to https://review.openstack.org/#/c/57066/?20:30
pabelangerclarkb, ya, I suspect that might be the issue20:31
sdaguedprince: yeh, maurosr said he'd get the patch posted today20:31
sdaguelater in that meeting20:31
*** yolanda has quit IRC20:32
dprincesdague: ah, cool. I missed that. (too many concurrent meetings)20:32
japplewhitequestion all: is this bug https://bugs.launchpad.net/devstack/+bug/1248923 causing any problems in the CI environment? If not can someone explain why not? (I would like for our test env to not be affected too!)20:33
uvirtbotLaunchpad bug 1248923 in devstack "Devstack install is failing:" [Undecided,Confirmed]20:33
sdaguedprince: I hear you20:35
*** vipul is now known as vipul-away20:36
*** Hefeweizen has quit IRC20:37
openstackgerritSergey Lukjanov proposed a change to openstack-infra/config: Setup devstack-gate tests for Savanna  https://review.openstack.org/5731720:43
clarkbwe don't enable novnc so we don't hit that problem20:46
japplewhiteclarkb: thanks - is that considered deprecated then?20:47
clarkbI don't think so. more of an "always problematic so we ignore it" issue20:48
*** dolphm has joined #openstack-infra20:48
clarkbthere is history there that I don't have enough familiarity with20:49
*** DinaBelova has quit IRC20:49
fungialso i have doubts we'd be able to exercise novnc properly in an unattended/headless test20:49
*** Hefeweizen has joined #openstack-infra20:49
fungiat least not without a lot of work to emulate client interaction with the interface20:50
*** pcrews has quit IRC20:50
fungianyone want to guess what happened here? https://jenkins01.openstack.org/job/gate-python-glanceclient-pep8/191/console20:55
clarkbfungi: looking20:55
fungii'm suspecting a broken git repo in that workspace. confirming now20:55
japplewhiteclarkb - devstack is working on Ubuntu 13.04 - just tested it…seems limited to 12.0420:56
anteayafungi yeah, the git disappeared20:56
anteayajapplewhite: can you add that to the bug report?20:57
fungiclarkb: yeah, http://paste.openstack.org/show/53619/20:58
fungiclearing20:58
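(A hedged sketch of the sort of manual cleanup "clearing" presumably amounts to here; the workspace path is illustrative only:)

    # on the affected jenkins slave (path is illustrative)
    cd /home/jenkins/workspace/gate-python-glanceclient-pep8
    git fsck --full    # reports missing or corrupt objects if the repo is broken
    cd .. && rm -rf gate-python-glanceclient-pep8    # let the next run start from a fresh clone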
*** datsun180b_ has joined #openstack-infra20:59
fungiunfortunately that issue caused at least one gate reset21:00
*** Ryan_Lane1 has joined #openstack-infra21:01
*** Ryan_Lane has quit IRC21:01
openstackgerritSergey Lukjanov proposed a change to openstack-infra/devstack-gate: Add Savanna testing support  https://review.openstack.org/5732521:02
*** datsun180b has quit IRC21:02
*** datsun180b_ is now known as datsun180b21:02
*** pcrews has joined #openstack-infra21:03
sdaguemordred: I actually know a guy that knows some prolog21:05
sdagueif you had interest in solving it that way21:05
anteayasdague: zaro is always looking for folks with more prolog21:06
*** japplewhite has quit IRC21:08
anteayasdague: how goes the new job?21:08
anteayafinding the coffee maker okay?21:09
*** Ryan_Lane1 is now known as Ryan_Lane21:09
*** Ryan_Lane has joined #openstack-infra21:09
sdagueum... still setting up new laptop, will be slow for a couple days still21:09
anteayasdague: happy?21:10
sdagueso far so good :)21:10
anteayayay21:11
anteayaglad for you21:11
*** rahmu has quit IRC21:12
*** yamahata_ has quit IRC21:12
*** nsaje has quit IRC21:13
*** ruhe has quit IRC21:13
*** nsaje has joined #openstack-infra21:14
*** rahmu has joined #openstack-infra21:14
*** ericw has joined #openstack-infra21:16
*** alcabrera has quit IRC21:17
mriedemmikal: hey, so regarding the direct emails from shaomei ji, sorry about that, i gave those guys a lecture on a call this morning21:18
mikalHeh, thanks21:22
mikalI'm sure he means well21:22
mikalI think he's just very confused about who I am21:22
mikalAnd how big I am in infra21:22
mriedemyeah, they are new21:22
mriedemthat's why i had to start getting on calls with them b/c they were not making progress21:23
mriedemso when they were like 'well we emailed this guy and blah blah blah' the sirens went off21:23
mikalI have learned never to reply to requests and say "root is a bad account name"21:23
mikalI am being punished for not making clarkb do that21:24
mriedemha21:24
clarkbmikal: but we really appreciated your helpfulness :) would request mikal help again21:24
mikalLOL21:24
*** davidhadas_ has joined #openstack-infra21:24
* mikal creates bogus LP bugs to test changes because clarkb is a hater21:25
*** vipul-away is now known as vipul21:26
*** davidhadas has quit IRC21:26
*** MarkAtwood has joined #openstack-infra21:26
clarkbmikal: 1251920 is the biggest pain ever21:27
clarkbI am going to kill old etherpad.o.o now but when that is over I am half tempted to become your apprentice until 1251920 is fixed21:27
clarkbs/apprentice/sidekick/ because superheroes are cooler than magicians21:28
mikalclarkb: yeah, my problem with that bug is I am running out of ideas21:28
mikalclarkb: the only one I have left is that perhaps we should just increase the timeout and see if that fixes it21:28
mikalAlthough I can't see it helping for the "no console logs at all" case21:29
mikalIt would be super nice if we could poke around a failed instance21:29
clarkbmikal: ok, let me see what I can do about holding nodes. we can try holding a handful then hope one of them runs into the failure21:29
clarkbbut first must kill etherpad.o.o21:29
mriedemmikal: i don't think increasing the timeouts would help21:29
mriedemin some cases the mismatch is 9 != 10, and you'll see a lot of responses with the 9 split output21:29
clarkbfungi: jeblair: do you have any better ideas than grabbing a few tests slaves and crossing fingers?21:30
*** jamesmcarthur has quit IRC21:30
mriedemmy feeling is that's not going to change regardless of the timeout21:30
mikalclarkb: the failure rate is pretty high21:30
mikalI don't think you'd need to cross your fingers all that much...21:30
*** jhesketh__ has quit IRC21:30
*** rockyg has quit IRC21:30
clarkbya21:30
*** jhesketh_ has joined #openstack-infra21:30
clarkbfungi: jeblair: the oldest local DB backup on new and old servers is from october 21st so they overlap. I think I am good to delete the old server21:31
clarkbfungi: jeblair: is there anything you would like me to check in addition to DB backups?21:31
openstackgerritMichael Still proposed a change to openstack-infra/jeepyb: Allow automatic subscription to DocImpact bugs  https://review.openstack.org/5615821:31
*** esker has joined #openstack-infra21:31
*** che-arne has joined #openstack-infra21:34
*** yamahata_ has joined #openstack-infra21:34
*** jamesmcarthur has joined #openstack-infra21:36
*** atiwari has quit IRC21:36
*** jamesmcarthur has quit IRC21:36
mikalclarkb: so... this config file I am pushing with puppet21:37
clarkbya21:37
mikalclarkb: where are the other config files for gerrit? Where should I put it on the machines?21:37
mikal /home/gerrit2>?21:38
clarkbmikal: I think it can live where the shell wrapper scripts for jeepyb scripts live21:38
mikalclarkb: ok21:38
clarkbmikal: eg the patchset-created or change-merged scripts21:38
clarkbthat call out to jeepyb21:38
*** japplewhite has joined #openstack-infra21:39
jeblairclarkb: hold and cross fingers is best idea21:40
jeblairclarkb: can't think of anything else to check21:40
mriedemwas wondering if maybe danpb could help with the console problems, i saw he committed a change to libvirt within the last week regarding consoles21:41
mriedemhttp://libvirt.org/git/?p=libvirt.git;a=commit;h=5087a5a0092853702eb5e0c0297937a7859bcab321:42
notmynamejeblair: russellb: idea for gerrit to get more people to do reviews: add a column that shows how many reviews the author of the submitted patch has done in the past 30 days21:42
mriedemthat wouldn't be related to what we're seeing b/c danpb's patch was for pty but we're seeing file consoles in the logs21:42
*** japplewhite has quit IRC21:43
mriedemnotmyname: but is the new contributor implicitly punished then?21:43
mikalmriedem: including danpb is a good idea21:43
mikalmriedem: he's in the EU though, so might be hard to catch now21:43
mriedemmikayeah21:43
mriedemmikal: yeah21:43
russellbnotmyname: i've started tracking that in my stats21:44
notmynamerussellb: ya, I saw that today. it's similar21:44
notmynamerussellb: I've got the same issue in swift that you have in nova: an ever-growing patch queue and a static number of hours in the day21:45
notmynamerussellb: and I'd love to have an answer on how to keep it manageable21:46
clarkbjeblair: fungi: if there is nothing else to check I am deleting etherpad.o.o old now21:46
mriedemnotmyname: this might be a good example of why i think it might not work https://review.openstack.org/#/c/56381/21:46
russellbwe have some great reviewers, and then a long tail of infrequent contributors that don't really review (and may not be good candidates to anyway).  it's tough21:46
jeblairclarkb: +121:46
openstackgerritMichael Still proposed a change to openstack-infra/config: Add an initial subscriber map for notify_impact  https://review.openstack.org/5733221:46
russellbjust need to continue grooming regulars to help with the load21:46
notmynamerussellb: yup21:46
mriedemi triaged that bug this weekend, the guy pushed a patch but when i assigned the bug to him in launchpad, it was his first one21:46
mikalclarkb: wanna re-review https://review.openstack.org/#/c/56158/ and then https://review.openstack.org/5733221:46
mikal?21:46
russellbnotmyname: but at least having numbers helps my sanity ... it was just constant feeling of drowning before21:47
mriedemnotmyname: so it's good that he's reporting the bug and pushing the patch, but if people ignore his patch b/c it's his first contribution, that's probably bad21:47
russellbi can see 8 days, vs feeling like it must be 92340234 days based on some attitudes21:47
russellb:)21:47
russellbgood times.21:47
notmynamemriedem: ya, it may be a horrible idea (if it is, I'll take credit. if not, I'll tell you who came up with it)21:47
mriedemnotmyname: i'm not actually sure it's an original idea - seems like that came up in the ML thread about the same topic back when hyperv was threatening to go out of tree for lack of review love21:48
notmynamemriedem: ah. could be21:48
openstackgerritKhai Do proposed a change to openstack-infra/config: add nodepool-dev to jenkins-dev server  https://review.openstack.org/5733321:49
clarkbjeblair: fungi: nova delete executed now we wait21:49
mriedemdoes gerrit have a limit on showing the number of patches i'm currently reviewing?21:49
clarkband it is gone21:49
mriedemseems i can't always find the most recent ones21:49
clarkbmikal: sure21:50
*** denis_makogon_ has joined #openstack-infra21:51
mikalclarkb: ta21:51
openstackgerritKhai Do proposed a change to openstack-infra/config: add nodepool to jenkins-dev server  https://review.openstack.org/5733321:51
*** sparkycollier has quit IRC21:53
jog0jeblair: ping21:53
*** nati_ueno has joined #openstack-infra21:53
jeblairjog0: pong21:54
clarkbjeblair: https://jenkins01.openstack.org/job/tempest-docs/245/console zuul triggered a bunch of jobs with null zuul refs, any idea of why that may happen?21:54
jeblairclarkb: branch deletion21:54
openstackgerritKhai Do proposed a change to openstack-infra/config: add nodepool to jenkins-dev server  https://review.openstack.org/5733321:54
clarkbjeblair: gotcha thanks21:55
fungiclarkb: sorry, was away cooking dinner (still am sort of, back shortly) but +++ on etherpad.o.o deletion21:56
clarkbfungi: its gone21:57
clarkbso \o/21:57
jog0so according to my most likely slightly wrong graph in http://paste.openstack.org/show/53572/21:58
jog0jeblair: our recheck policy is resulting in the gate not working21:58
*** gyee has quit IRC21:58
*** denis_makogon_ is now known as denis_makogon21:59
clarkbooh we get to continue this weekends discussion22:00
fungiyes, that seems to be a common conclusion22:00
jeblairjog0: studying graph22:01
notmynamejog0: to help me understand, that graphs the percentage of jobs that result in FAILURE?22:01
jog0notmyname: yes22:02
jog0at least that is the intention of it22:02
jog0failure / (failure+pass)22:02
notmynamejog0: which means that gate-tempest-devstack-vm-neutron is always passing, right?22:03
jog0notmyname: it appears to be the case22:03
notmynameok, thanks22:03
jog0but check is not22:03
notmynameya, I was just trying to learn to graphite ;-)22:03
*** hogepodge has quit IRC22:03
jog0notmyname: I'm new to it too22:04
fungii personally think that nondeterministic failures breed more nondeterministic failures, because people are so used to having to reverify their patches to get them to merge that they are doing so even when it's their patch which is introducing a nondeterministic bug22:04
jog0and have a strong hunch I am doing something wrong22:04
jog0fungi: I agree with you nondeterministically :)22:04
jog0jeblair: I think fungi summed it up pretty well22:04
clarkbfungi: ++22:04
*** hogepodge has joined #openstack-infra22:04
fungiand i avoided saying "flaky" so as to not raise sdague's ire22:05
clarkbmikal: ok changes reviewed, I am going to hold a few nodes across rackspace and hp now22:05
* fungi puts a dollar in the "flaky" jar22:05
clarkbfungi: we need a bot to track flaky jar and quantum jar use; then at the next summit we have it buy dev lounge goodies or something22:05
jeblairfungi: i don't think "flaky" is the objection but "$FLAKY test"22:05
*** japplewhite has joined #openstack-infra22:06
*** dcramer_ has quit IRC22:06
fungiflaky pastry22:06
jeblairfungi: as in, it's not usually the _test_ that's flaky, these days.22:06
jeblairanyway22:06
* fungi nods22:06
jog0so the check queue is failing around 50% of the time22:06
jeblairi wound tend to agree with this sentiment, jog0, fungi22:06
* fungi feels very wounded now22:07
jeblairfungi: ?22:07
clarkbmikal: does 1251920 affect all devstack jobs? neutron postgres and normal full? or just postgres and full? (making sure I hold nodes running the correct tests)22:08
fungisorry, making fun of your typo. i should just assume jetlag fingers22:08
*** dkliban has quit IRC22:08
openstackgerritClay Gerrard proposed a change to openstack-dev/hacking: Add noqa support for H201 (bare except)  https://review.openstack.org/5733422:08
jeblairfungi: aha.  :)22:08
openstackgerritMichael Still proposed a change to openstack-infra/jeepyb: Allow automatic subscription to DocImpact bugs  https://review.openstack.org/5615822:09
jeblairjog0: so we've talked about removing the recheck commands...22:09
jog0getting the real failure rate for check:22:09
jeblairi think this is something we should probably only do when things are working well, as otherwise it would just grind the gate to a halt and make more work for everyone22:09
*** nati_ueno has quit IRC22:10
mikalclarkb: I've just uploaded a new version of notify_impact with fixed whitespace22:10
jog0with postgres22:10
clarkbjeblair: jog0: I was actually thinking keeping recheck but removing reverify would help22:10
clarkbthat allows people to sort out problems pre merge but not during gating22:10
jog0http://paste.openstack.org/show/53626/22:10
clarkbthen we would rely on cores to do assessments and triage before reapproving22:10
jog0clarkb: so removing recheck isn't enough IMHO22:11
jog0there is a deeper issue22:11
jeblairclarkb: reasonable22:11
clarkbmordred pointed out we have far too many cores to make that practical though22:11
clarkbbut I think we can try it22:11
clarkbjog0: do go on22:11
jog0ok so current check failure rate:22:11
*** japplewhite has left #openstack-infra22:11
*** pcm_ has quit IRC22:12
*** yamahata_ has quit IRC22:12
jog01-(0.7)*(0.7)*(0.88)=56%22:12
jog0so check fails 56% of time22:12
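(Spelling out the arithmetic jog0 is doing: the figures are presumably the rounded per-job pass rates from the paste above, so the exact result lands a point or so off his 56%:)

    # probability that at least one of three independent check jobs fails,
    # given rounded per-job pass rates of 0.70, 0.70 and 0.88
    pass_rates = [0.70, 0.70, 0.88]
    combined_pass = 1.0
    for rate in pass_rates:
        combined_pass *= rate
    print(1 - combined_pass)  # 0.5688, i.e. a bit over half of check runs fail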
jog0taking a step back the goal of gate is to keep trunk working22:14
clarkbmikal: in addition to tests were you seeing a higher incidence on hpcloud or rax or is it a crapshoot?22:14
jog0I don't think we are doing that today22:14
clarkbjog0: I agree22:14
*** davidhadas_ has quit IRC22:14
mikalclarkb: I had a theory briefly yesterday that it only happened on rax, but I couldn't find data to support that22:14
jog0so this is what I have:22:14
jog0top goal: keep gate stable and green22:14
jeblairjog0: i agree, though not because check fails 56% of the time, but rather because gate fails so often.22:14
fungiright, i think the technical measures we've put in place presently guarantee that a change will be merged so long as it works "some of the time"22:14
clarkbmikal: ok, and which jobs trigger it? is it tempest*full and tempest*postgres-full?22:15
mikalclarkb: yes22:15
clarkbok going to hold a few nodes now22:15
jog0subgoals:22:15
jog0* when bugs get past the gate, squash quickly22:15
jog0* make it harder for bugs to get past the gate22:15
fungiclarkb: i'm curious to know how your node holding turns out. i can't remember whether that's expected to work now22:15
jeblairfungi: why wouldn't it?22:16
jog0I think we need solutions to both of those issues22:16
*** anteaya_ has joined #openstack-infra22:16
fungijeblair: at one point nodepool hold worked but the nodepool status change code didn't check for that status or did so incorrectly. i think that got fixed though--i just haven't had a chance to try it since22:16
jog0and addressing the nature of 'recheck' is only a subset of those22:16
clarkbfungi: pretty sure it works now I did it semi recently for a different bug22:17
fungioh, awesome22:17
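(For reference, a usage sketch of what holding looks like on the nodepool side; the subcommand syntax is assumed from current nodepool and the node id is made up:)

    nodepool list         # find the id of the node running the suspect job
    nodepool hold 1234    # mark it held so it is not deleted when the job ends
    # ...inspect the node, then clean it up with: nodepool delete 1234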
*** ftcjeff has joined #openstack-infra22:17
anteaya_Hunner: you wanted to do something on G+ is that correct?22:17
*** sandywalsh_ has quit IRC22:18
mikalclarkb: of 38 fails, 28 are rax, 10 are not22:18
jog0jeblair: so as fungi pointed out we have gotten accustomed to using recheck which is bad22:18
jeblairjog0: removing/reducing recheck/reverify makes it harder for a change to land if it fails tests sometimes22:18
clarkbmikal: ok I held 6 rax 2 hp22:18
jeblairjog0: that addresses point 222:18
jog0jeblair: partially yes, I think we need to run the tests multiple times too22:19
*** mriedem has quit IRC22:19
fungijog0: so anyway, one solution i was noodling around was whether we could integrate the e-r matching so that changes could only be rechecked/reverified if e-r was already successfully classifying it. and then have strict rules in place not to add matches in e-r unless more than one change encountered that issue already22:19
funginot sure whether reasonable, but it was a jumping off point anyway22:19
jog0fungi: I think that's a good idea22:19
*** ljjjusti1 has joined #openstack-infra22:19
clarkbmikal: the nodes I held are running these jobs https://jenkins01.openstack.org/job/check-tempest-devstack-vm-postgres-full/6677/ https://jenkins01.openstack.org/job/check-tempest-devstack-vm-full/7009/ https://jenkins01.openstack.org/job/check-tempest-devstack-vm-postgres-full/6678/ https://jenkins01.openstack.org/job/check-tempest-devstack-vm-full/7011/22:19
jeblairjog0: it might address point 1, if the lack of ability to easily merge a change that doesn't fix an outstanding bug is seen as a motivator for people to squash bugs.  i don't think it will be, so i don't think it addresses point 1.22:19
clarkbhttps://jenkins02.openstack.org/job/check-tempest-devstack-vm-full/6138/ https://jenkins02.openstack.org/job/check-tempest-devstack-vm-full/6139/ https://jenkins02.openstack.org/job/check-tempest-devstack-vm-postgres-full/6370/ https://jenkins02.openstack.org/job/check-tempest-devstack-vm-postgres-full/6373/22:19
jog0just to do a full brain dump I had a few other ideas (mostly orthogonal)22:19
lifelessfungi: so I like the way you're thinking, but stats wise22:20
*** nati_ueno has joined #openstack-infra22:20
lifelessI'm not sure it makes sense22:20
fungiyeah, it very well may be idiotic22:20
jeblairjog0: we've had the ability to run jobs multiple times now, but no one seems to have taken advantage of that...22:20
clarkbfungi: I like the idea simply because it forces us to track the problems22:20
fungimainly just trying to stab at how we keep people from rechecking a new nondeterministic failure into trunk22:20
clarkbbut it may add significant overhead for folks like sdague and jog022:20
jeblairjog0: sdague had a change up to run neutron 4 times, which prompted me to write the code to make that change work, but he abandoned it, and i haven't seen anything since.22:20
jog0on a high level we need to put more pressure on devs when the gate isn't stable22:21
jog0what fungi proposed helps with that22:21
jeblairjog0: to be fair, i haven't actually looked at reviews in 3 weeks.22:21
jog0we are getting better at identifying the issues at hand22:21
clarkbmikal: I will try keeping an eye on those, but please let me know if you have ruled any of them out allowing them to be released22:21
jog0but we can do better at finding bugs22:21
jeblairjog0: but i'm definitely okay and the system is ready to run jobs multiple times22:21
jog0I think part of the issue is no one reads the log files22:21
jog0jeblair: cool22:22
sdaguejeblair: so I fixed the neutron needs to run more jobs thing by blasting out a different feature matrix22:22
jog0jeblair: can we just start running everything 2x ?22:22
*** dprince has quit IRC22:22
jog0clarkb: I am fine with the overhead of only rechecking if in e-r22:22
jog0sounds reasonable22:22
jog0and if we make it harder for bugs to get in the burden will go down over time22:22
*** thomasem has quit IRC22:23
sdaguejog0: I think until we get at least the start of the dashboard, we're going to drive more breaks, but not more clarity. I won't really be able to get focus time on that until next week.22:23
*** ljjjustin has quit IRC22:23
jeblairjog0: it is technically possible to run everything 2x.  atm that means adding a lot of lines to zuul's layout.yaml (at least, until one of us gets around to fixing the templating support to make that easier)22:23
*** gyee has joined #openstack-infra22:23
*** xeyed4good has joined #openstack-infra22:24
jeblairjog0, fungi: i'm not certain about the practicality of directly integrating zuul and e-r, however, we can update the recheck regexes to match the list of bugs in e-r22:24
jeblairit's ugly and annoying, but will work immediately22:24
jog0sdague: I think the dashboard will help a lot, but more for when gate is more stable and we have more subtle bugs22:24
jog0(not that it won't help now too)22:24
lifelessfungi: the problem is that if something occurs (say) 10 % of the time22:25
jog0so there is another aspect to this22:25
jog0fixing the bugs22:25
lifelessfungi: it will still get into trunk22:25
lifelessfungi: so your thing would help identify what's broken more rigorously, but wouldn't stop things breaking22:25
clarkbI just noticed my sampling is iad and hpaz1 not very diverse... hopefully we don't need to do more than a couple passes at holding nodes :)22:25
lifelessfungi: perhaps that is something we need22:25
lifelessfungi: but it should be looked at that way, not as a prevention tool22:26
*** svarnau has joined #openstack-infra22:26
jog0with regard to actually getting bugs fixed, I don't think we have the right culture around fixing these gate hitting bugs22:27
*** hashar has joined #openstack-infra22:27
jog0IMHO they should be marked as critical and all hands on deck for the related teams22:27
jog0especially for the big ones22:27
clarkbmikal: I think I got one !22:27
*** hashar has quit IRC22:28
mikalclarkb: http://logs.openstack.org/33/54833/8/check/check-tempest-devstack-vm-full/fb77190/console.html is a winner on one of your held machines22:28
*** ArxCruz has quit IRC22:28
clarkbmikal: yup thats the one, have a public key I can throw on the host?22:29
mikalSure, one sec22:29
jog0jeblair: so I wanted to talk to you, (and sdague and other TC people) because I think we need a few drastic steps ASAP22:29
*** hashar has joined #openstack-infra22:30
jeblairjog0: how should getting all-hands on deck work?22:30
jog0jeblair: not sure22:30
jeblairjog0: should the qa team have the ability to triage those bugs and set them as critical?22:30
jog0I think so22:30
jog0across all projects22:30
jog0and they shouldn't have to fix 'em22:30
jog0I think part of that is blocking the gate22:30
jog0not exactly block but something along those lines22:31
jog0russellb: ^22:31
jog0you may want to lurl22:31
jog0lurk22:31
*** sdake_ has quit IRC22:31
jeblairjog0: and expect that within 24 hours after being triaged, projects should have someone assigned to them?22:31
jog0jeblair: I don't think rules like that are enough22:31
jog0it depends on how bad the bug actually is22:31
jeblairjog0: it's less about _rules_ and more about process22:32
jog0this is where sdague's dashboard is nice22:32
jog0we want to ask, what % of failures are because of bug x22:32
Hunneranteaya: anteaya_: pleia2: Does https://plus.google.com/hangouts/_/7acpiif9bk4ip30vud3e224rdg?hl=en work for you?22:32
*** sdake_ has joined #openstack-infra22:32
jog0if its bad then gather the troops and fix ASAP22:32
jeblairjog0: it sounds like your goal is to get people working on them, so that's the strawman i'm putting out -- a process for ensuring people are working on them.22:32
pleia2Hunner: seems to!22:32
jog0jeblair: so thats part of it22:32
jog0and a good start22:32
pleia2Hunner: haha, doh, when I tried to connect "You're not allowed to join this video call."22:32
jog0but there is the second part22:33
*** SergeyLukjanov has quit IRC22:33
Hunnerpleia2: Okay, just a sec...22:33
jog0do we tweak how recheck is used? do we run all tests twice to make the gate a little more rigorous?22:33
jog0also we need to get buy in from teams22:33
fungisorry, stepped away22:33
jog0so I was thinking an ML post?22:33
jeblairjog0: yes, of course.  i thought you wanted brainstorming.22:33
*** ArxCruz has joined #openstack-infra22:33
jog0I do22:33
jog0I was hoping we could get a few ideas and then take the ones that make the most sense to the ML22:34
jog0I like clarkb's idea about recheck22:34
fungilifeless: agreed, i wasn't expecting any solution to drive out 100% of nondeterministic failures, just discouraging the current behavior of reverifying because of a nondeterministic bug you're introducing in your change just to get it to land22:34
*** whoops has quit IRC22:34
jog0so if we kill 'recheck no bug'22:35
Hunneranteaya: anteaya_: pleia2: https://plus.google.com/hangouts/_/7ecpj1pbmc3jnmao9coklmbp0k?hl=en22:35
jog0that will make sure we do a better job of classifying bugs and hopefully that will make fixing them a higher priority22:35
* jog0 opens up a etherpad22:35
jog0https://etherpad.openstack.org/p/future-of-recheck22:36
jog0jeblair: lets brainstorm in there22:36
anteaya_Hunner: I never use G+ so I have to install the plugins22:37
Hunneranteaya_: Okay22:37
fungiinfra does have a need to be able to readd changes into gate and check pipelines, but maybe we can address that through something other than recheck/reverify comments22:37
*** nati_ueno has quit IRC22:37
jeblairfungi: that's been a goal for a while22:37
*** dkranz has quit IRC22:38
*** nati_ueno has joined #openstack-infra22:38
fungijog0: what was clarkb's idea you're referring to? i seem to have missed it in scrollback22:38
clarkbfungi: removing the ability to reverify?22:38
fungioh, that. okay22:39
hasharHunner: that was antoine, sorry :D22:39
mikalSo, I now have access to a jenkins node which had the problem22:39
mikalAt 22:19:08 we started an instance22:39
fungicombining some ideas, how about running multiple instances of all jobs if it's due to a recheck/reverify? might conserve resources but make it harder to let those introduce new failures22:39
jog0fungi lifeless clarkb: https://etherpad.openstack.org/p/future-of-recheck22:40
clarkbfungi: that lets changes that passed but should've failed slip through22:40
clarkbfungi: we have a higher chance of catching those if we just run more jobs22:40
fungiwell, true. i meant if we can't spare enough resources to double or triple all job counts, then only do it as a deterrent to reverify-induced merging of new issues22:41
*** alchen99 has joined #openstack-infra22:41
fungiagreed that being able to shore up even initial attempts against new nondeterministic bugs is better22:42
clarkbfirefox kept disconnecting me from the etherpad... chromium seems to be fine with it (if anyone else is having trouble)22:43
*** rnirmal has quit IRC22:43
jeblairclarkb: i'm fine in ff22:43
jeblairclarkb: did you get a phone call yesterday?22:44
clarkbjeblair: yes22:44
jeblairclarkb: did you speak to mordred about it?22:44
clarkbI did22:44
jeblairclarkb: because mordred has been VERY supportive of our using quite as many resources as we want in order to test all the things22:44
jeblairclarkb: i had taken that as being somewhat representative of hp's willingness to support this effort22:44
clarkbjeblair: yup, I think there is tension between running the cloud and the resources we want, I punted to mordred and haven't heard from them since. But it is something to consider22:45
jeblairif it is not, then actually, we need to seriously reconsider some things.22:45
fungioh, heh, i missed the etherpad until just now. derp22:45
jeblairclarkb: when you say you punted to mordred... what is your current understanding of the situation?22:45
clarkbjeblair: that this was a temporary situation and one of the reasons we need to start using the new region22:46
jeblairmordred: perhaps you could advise us on (a) whether we can continue to use the resources we have been given, and (b) whether we can expect more?22:46
Alex_Gaynorjog0: when I said it got slower, I mean the time to run the tests themselves has increased, not time-to-land22:46
jeblairclarkb: er, so you're saying we should not consider this as a change, and it is not really relevant to our current discussion?22:46
clarkbI think we have to consider it in the short term until we get the new region going22:47
jog0Alex_Gaynor: oh good point22:47
jeblairclarkb: i thought we were already supposed to be using the new region, and at any rate, should certainly be using it within a week or two.22:47
jeblairclarkb: that sounds like an operational issue and does not need to be considered for a conversation about high-level direction.22:48
clarkbright, I just want to make sure we keep that in mind. let me rephrase what I have on the etherpad22:48
*** mgagne has quit IRC22:51
*** jhesketh__ has joined #openstack-infra22:52
*** mfer has quit IRC22:52
mordredjeblair, clarkb: ola22:53
clarkbmordred: hi there see https://etherpad.openstack.org/p/future-of-recheck and questions of quota22:54
mordredclarkb: blerg. http proxy. can you tl;dr the quota question?22:54
*** denis_makogon has quit IRC22:54
mordredalso - the new region is intended to be the only region that exists in the cloud in the future22:54
clarkbmordred: hp politely asked me yesterday to use less of our quota, is that going to be a long term problem for us?22:55
mordredso it's not that they're going to remove our quota in the current regions, as much as they are going to delete the regions themselves22:55
mordredclarkb: they can shove it22:55
mordredclarkb: and I'll escalate to people above their paygrade if they don't like it22:55
jeblairmordred: that is perfectly understandable, and i expect us to move to the new region asap.22:55
mordredjeblair: I do think we may need to think about the floating-ip thing that clarkb was mentioning22:56
mordredjeblair: in terms of perhaps having a pre-allocated pool of floating-ips that we re-use, rather than creating/destroying each time?22:56
jeblairmordred: do you believe that we can get a quota increase, after we move to the new region?22:56
mordredjeblair: how about I ask around some - but I'm sure we can22:56
jeblairmordred: oh, i thought the problem was just that our floating ip quota was more limited than our machine quota?22:56
*** mihgen has quit IRC22:57
mordredah. I may be misunderstanding then22:57
jeblairmordred: if it's something other than that, where can i find the problem statement?22:57
clarkbfloating ip problem statement is essentially that: by default we get far fewer floating IPs than host quota. How do we fix this? get more quota? use a proxy? and so on22:57
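A rough sketch of the pre-allocated-pool idea mordred raised above, against the python-novaclient API of the time; the credentials, server ID, and the notion that nodepool would call something like this are assumptions:

    from novaclient.v1_1 import client

    # Placeholder credentials/endpoint.
    nova = client.Client("user", "apikey", "tenant", "https://identity.example.com/v2.0/")

    def get_floating_ip(nova):
        # Prefer an already-allocated but unassigned IP; only allocate (and
        # consume quota) when the reusable pool is empty.
        for ip in nova.floating_ips.list():
            if ip.instance_id is None:
                return ip
        return nova.floating_ips.create()

    server = nova.servers.get("11111111-2222-3333-4444-555555555555")  # placeholder ID
    server.add_floating_ip(get_floating_ip(nova).ip)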
mordredcan I get a set of numbers that describe what we want quota-wise to move from old to new? (like, what quota size do we need in the new region for it to be sufficient)22:57
mordredand then I'll run that up the flagpole immediately22:57
mordredlet's start with get more quota22:58
clarkbmordred: currently we have 8GB*96*3 RAM quota which is ~2.3TB of total RAM quota (did I do that math correctly?)22:58
clarkbmordred: and 288 hosts which would each need floating IPs22:58
jog0the etherpad is looking pretty good22:59
jog0we have a bunch of good ideas22:59
clarkbthat is the quota needed to maintain current levels of use22:59
jeblairclarkb: correct22:59
clarkbif we want to increase that we would be bumping that 96 value to something greater (maybe 128 for a start?) which would bump RAM to ~3TB and floating ips to 38423:00
*** flaper87 is now known as flaper87|afk23:00
clarkb8GB and 3 are the constants. 96 is the value we want to increase23:00
fungimath++23:01
*** ryanpetrello_ has joined #openstack-infra23:02
*** ryanpetrello has quit IRC23:02
*** ryanpetrello_ is now known as ryanpetrello23:02
*** mindjive1 has joined #openstack-infra23:02
jeblairfungi: so, maths? :)23:02
clarkb3 comes from 3 AZs (I wasn't clear about that)23:02
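A worked version of clarkb's arithmetic (8 GB per node, 96 nodes per AZ today, 128 proposed, 3 AZs; all figures from the lines above):

    gb_per_node = 8
    azs = 3
    for nodes_per_az in (96, 128):
        nodes = nodes_per_az * azs          # each node also needs a floating IP
        ram_gb = gb_per_node * nodes
        print(nodes_per_az, nodes, ram_gb, round(ram_gb / 1024.0, 2))
    # 96  -> 288 nodes / floating IPs, 2304 GB (~2.3 TB)
    # 128 -> 384 nodes / floating IPs, 3072 GB (~3 TB)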
*** thedodd has quit IRC23:03
*** mindjiver has quit IRC23:03
*** julim has quit IRC23:03
fungijeblair: yes. maths. i learnt the briticism from watching "look around you" which had an entire episode entitled "maths"23:03
*** anteaya_ has quit IRC23:04
fungisuch a marvellous show23:04
fungis/show/program/23:04
*** jpich has quit IRC23:04
*** hashar has quit IRC23:04
pleia2Hunner: https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting is our meeting schedule fyi, added puppetboard to it for next week (excited, so want to do it sooner, but we're all swamped I think, next week is best) cc: anteaya23:04
anteayapleia2: ack23:04
anteayalooks good Hunner, thanks for setting this up23:05
Hunnerpleia2: Yep on the swamped part. Queues work better than interrupts for this sort of thing23:05
pleia2yes, thanks \o/23:05
*** jcoufal has quit IRC23:06
Hunnernp :)23:06
openstackgerritKhai Do proposed a change to openstack-infra/pypi-mirror: add an export option  https://review.openstack.org/5734523:07
*** salv-orlando has quit IRC23:07
*** salv-orlando has joined #openstack-infra23:07
*** jcooley_ has joined #openstack-infra23:08
*** dolphm has quit IRC23:09
*** jcooley_ has quit IRC23:10
*** atiwari has joined #openstack-infra23:11
*** atiwari has quit IRC23:11
jog0so now that we have a awesome list of the ideas23:11
jog0lets look at them23:12
jog0so running tests 2x?23:12
jog0is that feasible with current resources?23:12
jog0clarkb: ^23:14
*** jergerber has quit IRC23:14
clarkbjog0: it is possible, it will just reduce our throughput by 1/2 in theory23:14
fungidepends on how big the gate pipeline gets23:14
*** hogepodge has quit IRC23:15
jog0if it means its harder to get breaking patches in, then sounds worth it23:16
*** jcooley_ has joined #openstack-infra23:16
jog0jeblair mordred lifeless: ^23:17
*** nati_ueno has quit IRC23:17
fungii think once we get things near squeaky-clean, it will reduce our throughput to slightly less than half23:17
fungiuntil then, it will probably be more of a geometric reduction in throughput23:17
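A back-of-envelope way to read those two statements; the pass rate is a made-up parameter, and gate-pipeline resets (which make the flaky case even worse) are ignored:

    def relative_throughput(pass_rate, copies=2):
        # Merges per unit of node time, relative to a single-copy, always-passing
        # gate: each attempt costs `copies` runs, and a change averages
        # 1/pass_rate attempts before it lands.
        return pass_rate / float(copies)

    print(relative_throughput(1.0))   # 0.5   -- exactly half once things are squeaky-clean
    print(relative_throughput(0.85))  # 0.425 -- noticeably worse while the gate is flaky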
lifeless2x will make ~- difference23:18
openstackgerritKhai Do proposed a change to openstack-infra/config: add nodepool to jenkins-dev server  https://review.openstack.org/5733323:18
jog0lifeless: what does '~-'23:19
jog0mean23:19
lifelessabout none23:19
lifeless3/N probability23:20
jog0lifeless: hmm I'm afraid you are right23:20
jog0in that case what about the future-of-recheck proposals23:20
jeblairi don't think it's none...23:20
jeblairfor some code paths 2x means 2x, for others it means 16x23:21
jeblair(that is to say for code paths that are already exercised 8x, or whatever our full complement of gate jobs is)23:21
jeblair(i haven't counted since i got back)23:21
jog0jeblair: at the very least all we lose is some resources23:22
jog0which sounds like a good tradeoff23:22
*** Ryan_Lane has quit IRC23:22
*** Ryan_Lane1 has joined #openstack-infra23:22
*** dkranz has joined #openstack-infra23:23
*** weshay has quit IRC23:23
jog0who is wenlock?23:24
clarkbwenlock: jog023:24
clarkbjog0: wenlock23:24
clarkb:)23:24
wenlockjog0 hi hi23:25
*** datsun180b has quit IRC23:25
jog0wenlock: o/23:25
jog0wenlock: I didn't want to scare you away, was just wondering23:25
wenlockno worries, was following along.... checking it out23:26
*** hogepodge has joined #openstack-infra23:26
fungiwenlock: roll up your sleeves and jump in. more help==better23:26
jog0jeblair: so you say its easy to make things run 2x?23:27
jog0if so lets propose a patch23:27
jog0and lets talk about rechecks23:27
*** dkranz has quit IRC23:27
jog0clarkb: btw https://review.openstack.org/#/c/57070/23:28
jeblairjog0: mechanically easy: double the line count in the gate: section of project definitions in zuul's layout.yaml.  it's likely to be a huge patch, at least until someone makes zuul's templating a bit more flexible23:28
jog0jeblair: https://review.openstack.org/#/c/56118/4/elastic_recheck/elasticRecheck.py23:28
clarkbjog0: approved23:28
fungiright. running additional copies of a job under some circumstances is a trivial patch. running two of every single job is probably better addressed through some mechanism other than roughly doubling the number of lines in layout.yaml23:29
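Until the templating work lands, the mechanical edit jeblair describes could even be generated; a sketch with PyYAML (the layout structure is simplified, and treating a repeated job entry as "run it again" is an assumption based on the discussion, not verified zuul behaviour):

    import yaml

    with open("layout.yaml") as f:
        layout = yaml.safe_load(f)

    # Repeat every job listed under each project's gate pipeline once more.
    for project in layout.get("projects", []):
        gate_jobs = project.get("gate", [])
        project["gate"] = gate_jobs + list(gate_jobs)

    with open("layout-2x.yaml", "w") as f:
        yaml.safe_dump(layout, f, default_flow_style=False)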
lifelessjeblair: so if we take a codepath from 8x to 16x23:30
openstackgerritKhai Do proposed a change to openstack-dev/pbr: show transitive dependencies  https://review.openstack.org/5463923:30
lifelessprobability of a broken path that we don't detect goes from 3/8 to 3/16 or .37 to .1623:31
lifelessjeblair: but AIUI we're looking to avoid failures that occur more than a few percent of the time23:31
jeblairjog0: cool.  however, to keep my sanity, i need to work through some more back-from-vacation todos and then will probably review changes in order23:31
lifelessjeblair: 60x should get us 5% sensitivity, for instance.23:32
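lifeless's figures follow the statistical "rule of three": after N clean runs you can only rule out per-run failure rates above roughly 3/N at ~95% confidence. Reading his numbers that way is an interpretation, but the 8x, 16x and 60x cases line up:

    def detectable_failure_rate(runs):
        # Rule of three: smallest failure rate that N consecutive passes can
        # plausibly exclude.
        return 3.0 / runs

    for runs in (8, 16, 60):
        print(runs, round(detectable_failure_rate(runs), 2))
    # 8 -> 0.38, 16 -> 0.19, 60 -> 0.05 (the ~5% sensitivity mentioned above)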
openstackgerritJoe Gordon proposed a change to openstack-infra/config: Always run check-tempest-devstack-vm-full twice  https://review.openstack.org/5734723:33
jog0jeblair: np23:33
jog0that is a POC patch23:33
jog0feel free to -2 it or w/e23:33
lifelessjeblair: when I say approximately no effect, I mean that the individual failure rates we're seeing are already much lower than the sensitivity we'll get from 8x -> 16x23:33
jog0so rechecks themselves23:34
*** ftcjeff has quit IRC23:34
jog0we have some good ideas here: https://etherpad.openstack.org/p/future-of-recheck23:34
jog0do we want to propose any to the ML or just do them etc?23:34
*** oubiwann has quit IRC23:35
mikaljog0: Ok, so I am pretty much out of ideas on this console log thing23:36
jog0mikal: did you try cursing?23:36
jeblairlifeless: not necessarily; part of the hypothesis is that many bugs may be entering even after failing already, so part of the solution involves making that harder to happen; separately, making the screen that catches those failures finer may be part of a layered solution23:36
mikaljog0: yes, yes I did23:36
mikaljog0: I can see entries in the libvirt log which imply the monitor has crashed23:36
mikaljog0: but the version of qemu-kvm we're running is a couple of weeks older than us having this problem23:37
*** Ryan_Lane1 is now known as Ryan_Lane23:37
*** Ryan_Lane has joined #openstack-infra23:37
jeblairjog0: i think 'run more jobs' and 'be more strict' are things we can just do...23:37
jog0jeblair: I agree with that23:37
jog0buit for recheck23:37
jog0I am not saying we need to go to the ML23:37
jog0I just want to get something changed sooner rather than later23:38
jog0because status quo = mikal crying23:38
jog0mikal: lets chat in nova23:38
mikaljog0: well, I was rather thinking of getting out of my pjs23:38
jeblairjog0: i think we should decide within infra/qa what would be the best approach to take wrt to recheck/reverify and then notify the list about that.23:39
jog0jeblair: ++23:39
fungii think improvements (expected or even experimental) which are unlikely to be contentious should just be implemented and see how they fare23:39
jeblairjog0: it's not something i think we can do right away anyway; i think we have to wait until the gate gets more normal23:39
jog0so what do we think the best approach is?23:39
jeblairjog0: but we should be prepared to implement it when that happens23:39
jog0jeblair: why wait till its more normal?23:39
jog0and can we assume that will happen?23:39
fungiit has been normal before. the sun also may not rise tomorrow23:40
jeblairjog0: we could have done this last week easily, for instance.23:40
*** rcleere has quit IRC23:41
jeblairfungi: "    This may make it harder for people to re-run jobs for purposes of testing the jobs themselves... allow it but don't let zuul update its -1 vote?"23:41
*** vipul is now known as vipul-away23:41
jog0jeblair: I'm not sure we need to wait but thats a moot point23:41
jeblairfungi: i don't follow that23:41
fungijeblair: there have been times where people want to rerun self-gating changes to tests to see if they're deterministic. adding a recheck comment has been how that was accomplished in the past23:42
lifelessjeblair: in that people are retrying?23:42
clarkbI think retrying in the check queue is fine and should be allowed23:42
lifelessjeblair: so the bug is being detected but folk are shoving it through ?23:42
jeblairlifeless: yes, that is jog0's hypothesis23:42
clarkbthe problem is that you can force things through the gate queue with a little luck and persistence23:42
jog0the bug isn't always their fault either23:43
fungijeblair: merely suggesting that if the goal in that section is to completely neuter "recheck no bug" then possibly do that by having zuul not update its vote when leaving further result comments on the same patchset23:43
jog0if a bug is already in gate, you may hit it23:43
*** sdake_ has quit IRC23:43
jog0and just push through23:43
clarkbjog0: right, which is why I think falling back on the core reviewers may be a good idea23:43
*** nsaje has quit IRC23:43
clarkbas they can make large project wide judgements (I hope)23:43
jog0clarkb: agreed23:43
jog0they can run a 'no bug' test23:44
jog0and everyone else can do recheck bug x23:44
mordredyou're assuming they're not part of the problem23:44
*** nsaje has joined #openstack-infra23:44
mordredthe reason we don't let people push code directly is that people who work on it all the time are more likely to override protections23:45
jog0mordred: yes, that is correct.  Not sure how to tell who is the problem yet23:45
jeblairjog0: perhaps we should do a more rigorous analysis...23:45
mordredif the system depends on one set of people behaving better than another set, I believe we're screwed23:45
jog0mordred: ohh you mean core vs non-core23:45
mordredbecause it's essentially re-introducing the idea of a committer23:46
mordredjust doing it weirdly23:46
jog0misunderstood, yeah as a core I am happy to say I am part of the problem23:46
jeblairfor recent nondeterministic bugs that merged, do we have links to the patches that introduced them?23:46
jog0jeblair: its not always clear how they got introduced, but we do have some records23:47
clarkbone example of that would be the swift related bugs that tempest was hitting23:47
jog0I think some of the neutron bugs of late have records like that23:47
jeblairjog0: can we test your hypothesis that way (whether people really are reverifying to get in flaky changes), and also identify the culprits (to know whether they are core, in certain projects, or possibly even just tell personally them to stop)23:47
*** ryanpetrello has quit IRC23:47
clarkbwe were running out of disk space, but we don't know what change made that a problem in tempest23:47
jog0jeblair: with a big bug failing 10% of the time they may never see the bug themselves23:47
clarkband we still have no idea why 1251920 is a thing (still hitting it with a hammer)23:48
jeblairjog0: right, in which case it was not a problem where someone overrode the test results by reverifying23:48
jog0but jeblair let me dig up an example23:48
*** changbl has quit IRC23:48
*** loq_mac has joined #openstack-infra23:48
*** nsaje has quit IRC23:49
*** loq_mac has quit IRC23:49
jog0jeblair: so my concern is this - gate is so flaky now that even if you see your patch fail you don't think it was you23:49
jog0but that is secondary to the bigger issue IMHO,23:49
jog0I push a docs change. The gate fails; I know it couldn't have been my code-free patch that broke it, but e-r doesn't know why it failed23:49
openstackgerritEdward Raigosa proposed a change to openstack-infra/config: Make pip install from upstream better  https://review.openstack.org/5142523:50
jog0I get impatient and say 'recheck no bug'23:50
jog0so the bug exists and is unclassified (we know nothing about it) and the more we do this the more bugs we get23:50
jog0until we notice, hey, why is the gate so bad? turns out it's many bugs that crept in, not one23:50
jog0so the issue to me isn't someone trying to force a bad patch through the gate23:51
jog0its apathy23:51
jeblairjog0: ok, you want to improve the data collection for unknown bugs23:51
fungiand bugs are likely to invite their friends until it becomes a party23:51
jog0jeblair: I do23:51
jog0fungi: ++23:51
jog0jeblair: I was thinking of making a webapp that lists all unclassified failures23:51
jog0and as you add queries to e-r, that list is filtered down23:51
jog0and you get points for how many you remove23:52
fungii rather like your "spot that bug" game idea23:52
jog0fungi: good, you want to write it?23:52
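A toy version of that game, just to make the idea concrete; the failure snippets, fingerprints, and scoring are invented for illustration and bear no relation to e-r's real data model:

    # Console-log snippets from recent failed runs (placeholders).
    failures = [
        "snippet A: some traceback nobody has fingerprinted yet",
        "snippet B: a failure an existing e-r query already matches",
    ]
    # Each "query" pairs a fingerprint with whoever contributed it.
    queries = [{"fingerprint": "snippet B", "added_by": "jog0"}]

    unclassified = [f for f in failures
                    if not any(q["fingerprint"] in f for q in queries)]

    scores = {}
    for q in queries:
        scores[q["added_by"]] = scores.get(q["added_by"], 0) + 1

    print("still unclassified:", unclassified)
    print("leaderboard:", scores)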
jeblairjog0: ok, so i think removing 'no bug' and optionally further restricting reverify bug citations to bugs in e-r helps with that23:52
*** ryanpetrello has joined #openstack-infra23:52
jog0jeblair: agreed23:52
jeblairjog0: but i think we should fully explore the process a dev goes through there23:52
lifelesswhat about infrastructure failures23:53
jeblairif a docs change fails, what does a dev do?23:53
lifelesslike when jenkins slaves are killed23:53
jeblairlifeless: infra isn't exempt from this, even now the instructions say if it's an infra bug, file it in openstack-ci and cite it in recheck23:54
lifelessjeblair: ok; I had the impression that when a slave goes sideways it was considered a fact of life and not something to do root-cause analysis on23:54
lifelessjeblair: I'm glad to be wrong!23:54
fungithe current annoying loophole there being that people file bugs which say, "infra failed my patch. have a bug"23:55
fungibut i suspect the same befalls all projects, not just infra23:55
clarkbfungi: yup, I think the tempest guys deal with it more than we do23:55
jeblairlifeless: depends on how it goes sideways.  we've done quite a bit of root-cause analysis which has led us to the conclusion "we should stop using jenkins"23:56
*** rfolco has quit IRC23:56
jeblairlifeless: if we see repetitions or similar failures, we are likely to say something that sounds more like "it's a fact of life" than "let's do root cause analysis!"; but that's not because we accept failure, it's because we're in the middle of a _very_ long process of fixing it.  :)23:57
clarkbjeblair: though maybe this new jenkins that fixed their thread insanity fixes these problems23:57
clarkbthey went from 3 threads per executor to 1 or some such (which is a massive simplification)23:57
fungibut also, random transient network breakage can impact test results and render false negatives. having a way to classify those would be nice (i guess even those could be viewed as fixable problems, given enough resources to throw at it)23:58
jeblairclarkb: number of threads is not the only problem we have with jenkins.  very far from it, in fact.23:58
clarkbjeblair: definitely (just there is the particular bug of jenkins losing connectivity with a slave without knowing)23:58
lifelessfungi: right, thats indeed another case23:58
jeblairthese instances are VERY rare, partly because we treat them so seriously and have gone to great lengths to eliminate them.23:59
