Saturday, 2014-01-11

sdaguefungi / jeblair: next time there is a reset event - https://review.openstack.org/#/c/45766/  could use to go to the top00:00
jeblairfungi: yeah, rackspace is fairly bad at deleting nodes, so i think nodepool may need to be more aggressive about deleting them00:00
fungialso, immediately following the restart, that band of green "ready lag" which the graph was showing constantly has disappeared00:00
jeblairfungi: they hit nodepool's 10 minute timeout waiting for deletion often, which makes deleting change from a parallel to serial operation (via the cleanup thread)00:01
fungii'll review 45766 and bump it on the next reset00:01
fungisdague: ^00:01
sdaguefungi: thanks00:02
fungisdague: flashgordon: i read "multi threaded conductor" as "go more faster"00:02
sdaguefungi: yeh, basically00:03
fungiokay, hopefully that's a00:03
fungi"good" faster and not a "bad" faster00:03
*** ^demon|away has quit IRC00:03
sdaguerustlebee and damnsmith both believe so00:03
sdaguelook at rustlebee's comment in there00:04
fungiokay, awesomesauce00:05
*** gokrokve has quit IRC00:05
*** gokrokve has joined #openstack-infra00:05
sdagueok, I think I'm done for the night / weekend. Have a good one folks00:06
jeblairsdague: good night/weekend!00:06
sdagueI will look forward to a time when the top of queue isn't 16hrs - http://f.slukjanov.name/w/review.o.o/66056/1/status.html00:06
fungihave a good weekend, sdague00:06
fungijeblair: if you're in the mood for other excitement, we're at about 85% full on the 300+gb root filesystem of the graphite server. basically all whisper files. is there pruning we need to be doing? also, it's got one cpu pegged solid on iowait, which seems to be driven by carbon-cache.py00:07
fungiand it's frequently having trouble serving up images00:07
flashgordonfungi: without the patches00:07
fungiflashgordon: too fun00:09
jeblairfungi: no pruning; they are fixed size00:09
fungiflashgordon: did you try tempest yet?00:09
*** mrodden has joined #openstack-infra00:09
fungijeblair: okay, so it's a whisper file count driving the utilization then?00:09
flashgordonfungi: that failed but not sure if failure was related :/00:09
fungiflashgordon: these days, it's a coin toss on tempest fails00:09
*** gokrokve has quit IRC00:09
jeblairfungi: yes.  oh... when we do things like change job names, we could probably delete old metrics files if we don't care about them.00:10
jeblairfungi: so perhaps find to delete files older than 4 weeks or something.00:10
flashgordonI didn't see any libvirt issues so thinking it was something else00:10
fungithat makes sense. basically if the metric name/bucket changed more than a month ago, we don't care to keep it around00:11
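A minimal sketch of the cleanup discussed above, assuming the whisper files use the usual `.wsp` extension and that a file's modification time reflects its last metric update. The function name, the 28-day default, and the example path in the usage note are illustrative assumptions, not anything deployed on the graphite server. It only lists candidates; deletion is left to the operator.

```python
import os
import time


def find_stale_whisper_files(root, max_age_days=28, now=None):
    """Return sorted paths of .wsp files under root not modified in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".wsp"):
                continue  # skip anything that is not a whisper file
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

Calling `find_stale_whisper_files("/path/to/whisper")` (hypothetical path) and reviewing the result before passing each entry to `os.remove` keeps the destructive step explicit.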
*** banix has quit IRC00:11
flashgordonone of the patches was needed to pass unit tests https://review.openstack.org/#/c/65360/00:11
fungiflashgordon: i can rerun my baseline this weekend and see if i still get the same consistent fails i was seeing before with it00:11
flashgordonfungi: that would be great00:11
jeblairfungi: aggregate graphs with lots of metrics can cause a lot of graphite load, so people need to be careful creating those00:11
flashgordonI'm hoping new libvirt will make gate more stable and fix00:12
flashgordonhttps://bugs.launchpad.net/bugs/125487200:12
fungijeblair: i was wondering whether we might have an uptick in people pulling/reloading those. maybe we need some caching in front of the graphs? squid?00:12
flashgordonwhich would be a big win00:12
jeblairfungi: yeah, we should look into that.  note that graphs on the zuul status page have cache-busting parameters added to them.  we might need to think about caching holistically.00:14
*** vipuls-away is now known as vipuls00:14
*** thuc has joined #openstack-infra00:15
*** thuc has quit IRC00:15
*** thuc has joined #openstack-infra00:16
fungijeblair: the iowait seems to be mostly the kjournald thread though, so these are presumably write operations. i remounted the root filesystem with relatime just on a whim, in case reads were driving atime updates, but that ended up not helping00:16
*** mrodden has quit IRC00:16
*** thuc has quit IRC00:17
*** thuc has joined #openstack-infra00:17
jeblairfungi: (i think that's because if you update a page with js, the browser won't even check whether it should reload an image at the same url)00:17
openstackgerritA change was merged to openstack-infra/elastic-recheck: Fix the e-r query for bug 1258682  https://review.openstack.org/6530300:18
jeblairfungi: wow, that's exciting.  maybe we need to put them on an ssd cinder volume?00:18
fungimaybe00:19
fungii wouldn't think we'd really be generating *that* many statsd updates, though, in the grand scheme of things00:19
fungiwhich is why it strikes me as a little odd00:20
*** denis_makogon has quit IRC00:20
jeblairfungi: i agree00:21
*** sarob has joined #openstack-infra00:23
fungii mean, tens a second maybe, tops. it's also possible that a process there has just worked itself into a wonky state due to some bug, and is going wild with flush calls in a loop00:24
fungidavid-lyle: https://pypi.python.org/pypi/django_openstack_auth00:25
fungidavid-lyle: zuul was just busy, but it got done00:25
*** rnirmal has joined #openstack-infra00:26
*** sarob has quit IRC00:28
*** sarob has joined #openstack-infra00:29
*** ryanpetrello has joined #openstack-infra00:29
openstackgerritAaron Greengrass proposed a change to openstack-infra/config: Remove hardcoded config assumptions, cleanup variables  https://review.openstack.org/6607200:30
*** sarob_ has joined #openstack-infra00:30
*** CaptTofu has joined #openstack-infra00:30
fungii'm promoting 45766,2 because the change at the head of the gate is getting ready to fail anyway according to the console log on its last remaining test00:30
fungiso we're bound for a 100% gate reset either way. maybe that fix will save more changes from a reverify00:31
*** sarob__ has joined #openstack-infra00:32
*** CaptTofu has quit IRC00:32
flashgordonfungi: sweet00:32
flashgordonwhat about the postgres patch?00:33
flashgordonis that already in00:33
fungiflashgordon: yep, all jobs getting restarted in the gate now will get a longer timeout on the postgres-full job00:33
flashgordonexcellent00:33
fungiwhich should help too00:33
*** dstanek has quit IRC00:34
*** sarob has quit IRC00:34
flashgordon/openstack/openstack$ git log --since=1.day --oneline | wc -l00:35
flashgordon7000:35
*** sarob_ has quit IRC00:35
flashgordonnot bad for a slow gate day00:35
fungiflashgordon: so even with all this, we merged 70 changes through the integrated gate? not bad indeed00:35
david-lylefungi: I see it now, thanks for your help!00:35
david-lyleand the education00:36
fungidavid-lyle: no problem--sorry about the wait!00:36
*** hogepodge has quit IRC00:36
fungidavid-lyle: on a "good" day, it takes less than a minute from tag push to pypi availability00:36
*** dstanek has joined #openstack-infra00:36
fungi(though there's still a 30-60 minute wait after that before it makes it into our testing mirror for jobs to make use of)00:37
david-lyleplenty fast enough00:38
fungii'd like it to be faster. maybe eventually00:38
*** ivar-lazzaro has quit IRC00:42
*** mrodden has joined #openstack-infra00:44
*** mrodden has quit IRC00:44
*** rwsu has quit IRC00:46
*** rnirmal_ has joined #openstack-infra00:54
*** rnirmal has quit IRC00:55
*** rnirmal_ is now known as rnirmal00:55
*** prad has quit IRC00:58
sdagueguess I'm not quite gone for the night. So enqueue time gets completely reset when you promote, huh?00:58
sdaguehttp://f.slukjanov.name/w/review.o.o/66056/1/status.html00:59
sdaguethe entire queue is now 20m00:59
*** ryanpetrello has quit IRC01:02
*** gyee_ has joined #openstack-infra01:04
*** pcrews has joined #openstack-infra01:05
*** _ruhe is now known as ruhe01:09
*** thuc has quit IRC01:11
*** thuc has joined #openstack-infra01:11
fungisdague: yeah, i guess since the queue gets reordered, thus not a strict fifo change, add time restarts01:13
*** thuc has quit IRC01:15
sdagueit would be nice to preserve the enqueue time if possible in zuul01:17
*** sarob__ has quit IRC01:24
*** AaronGr is now known as AaronGr_Zzz01:24
*** sarob has joined #openstack-infra01:25
*** yamahata has quit IRC01:25
jeblairsdague: yeah, it's implemented as a complete dequeue and re-queue, basically because the queue ordering logic is complicated, and that's the easiest way to re-use it.01:26
jeblairsdague: it may be possible to save those values to the side and restore them though.01:27
sdagueyeh, that would be cool01:27
sdaguebecause knowing those numbers is extremely useful in understanding where we stand, I think more so than the gate queue length01:28
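The idea of saving the enqueue times to the side and restoring them after a promote can be sketched as follows. This is a toy model under stated assumptions, not zuul's actual code: `Item`, `Queue`, `add_change`, and `promote` are hypothetical names, and the real queue ordering logic is far more involved.

```python
import time


class Item:
    """A queued change plus the time it first entered the queue."""

    def __init__(self, change, enqueue_time=None):
        self.change = change
        # Default to "now" only when no earlier time is supplied.
        self.enqueue_time = time.time() if enqueue_time is None else enqueue_time


class Queue:
    """Hypothetical stand-in for a pipeline queue."""

    def __init__(self):
        self.items = []

    def add_change(self, change, enqueue_time=None):
        self.items.append(Item(change, enqueue_time))

    def promote(self, change):
        # Save the original enqueue times before tearing the queue down.
        saved = {item.change: item.enqueue_time for item in self.items}
        if change not in saved:
            raise ValueError("change is not in the queue")
        order = [change] + [i.change for i in self.items if i.change != change]
        # Full dequeue and re-enqueue, passing each saved time back in.
        self.items = []
        for c in order:
            self.add_change(c, enqueue_time=saved[c])
```

Passing the time in as an optional argument keeps the normal enqueue path unchanged while letting the promote path reuse it.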
jeblairsdague: can you file a zuul bug?  that's a pretty good lhf item01:28
sdaguejeblair: which tracker do you use for that?01:28
*** larrycai has joined #openstack-infra01:28
jeblairsdague: launchpad/zuul; you can also target openstack-ci01:28
sdaguejeblair: where abouts in the code would you implement that?01:29
notmynamesdague: what are those timings on that link?01:29
*** sarob has quit IRC01:29
sdaguenotmyname: time in queue, though they get reset by priority bumping01:30
notmynamesdague: good info, especially if you can get the total time (but you're already talking to jeblair about that ;-) )01:30
jeblairsdague: probably in     def _doPromoteEvent(self, event):01:30
*** banix has joined #openstack-infra01:31
notmynamesdague: "priority bumping" == gate flush?01:31
jeblairsdague: do you know if sergey has that change up for review yet, or is he experimenting locally?01:31
jeblairit's really cool.  :)01:32
* jeblair -> sprint01:34
sdaguejeblair: I have it up for review01:35
sdaguejeblair: https://review.openstack.org/#/c/65993/01:36
sdaguehe just did a test and deploy somewhere public01:36
fungigah, 45766,2 hit "Availability zone 'test_az_-tempest-2024199811' is invalid (HTTP 400)"01:40
fungireenqueue it?01:40
flashgordonfungi:  are we having fun yet?01:40
fungiflashgordon: apparently01:41
*** dstanek has quit IRC01:41
flashgordonand yes to reenqueue01:41
fungidoing01:41
* fungi sighs01:41
*** ruhe is now known as _ruhe01:41
fungithe failure log for that job is https://jenkins04.openstack.org/job/gate-tempest-dsvm-full/1671/consoleText in case someone wants to satisfy themselves it isn't caused by that nova fix01:42
*** larrycai has quit IRC01:45
*** flashgordon is now known as jog001:46
*** CaptTofu has joined #openstack-infra01:50
*** sarob has joined #openstack-infra01:52
*** reed has quit IRC01:52
sdaguejog0: you have an er bug for the az issue yet?01:55
sdaguethat's actually probably a tempest bug01:55
*** CaptTofu has quit IRC01:55
sdagueI bet someone forgot a lock on az manip01:56
*** banix has quit IRC01:57
*** nati_uen_ has quit IRC01:58
openstackgerritSean Dague proposed a change to openstack-infra/zuul: make enqueue_time durable by caching in change  https://review.openstack.org/6609502:03
sdagueok, dinner on the table02:03
*** mriedem has quit IRC02:05
*** starmer_ has joined #openstack-infra02:07
*** ryanpetrello has joined #openstack-infra02:10
*** gokrokve has joined #openstack-infra02:13
*** oubiwann has joined #openstack-infra02:15
*** pcrews has quit IRC02:16
*** weshay has quit IRC02:22
jog0sdague: there is I think02:26
jog0sdague: 126567202:27
*** sarob has quit IRC02:28
*** sarob has joined #openstack-infra02:28
*** sarob has quit IRC02:32
openstackgerritSean Dague proposed a change to openstack-infra/zuul: make enqueue_time passable to addChange  https://review.openstack.org/6609502:37
sdaguejeblair: so how about that approach instead?02:37
*** pcrews has joined #openstack-infra02:42
*** gokrokve has quit IRC02:43
*** gokrokve has joined #openstack-infra02:44
*** banix has joined #openstack-infra02:45
*** ryanpetrello has quit IRC02:47
*** kraman1 has quit IRC02:47
*** gokrokve has quit IRC02:48
jeblairsdague: lovely; should it have a regression test?02:51
jeblairsdague: i think promote has a test; you could probably just check the times in that existing test02:52
sdaguejeblair: possibly, though you might need to walk me through that bit02:52
sdagueok, sure02:52
sdagueself.builds is the queue?02:53
sdaguethat will have to wait until morning02:55
*** rakhmerov has joined #openstack-infra02:58
*** jerryz has quit IRC03:00
jeblairsdague: self.builds is the set of running (jenkins) builds; you'll probably either want to inspect the zuul pipeline directly, or get the status json; there are tests that do both03:01
*** senk has quit IRC03:03
*** wenlock has joined #openstack-infra03:03
*** loq_mac has joined #openstack-infra03:07
openstackgerritDevananda van der Veen proposed a change to openstack-infra/config: Enable tempest/ironic gate tests  https://review.openstack.org/6584503:08
*** coolsvap has quit IRC03:13
*** gokrokve has joined #openstack-infra03:14
*** banix has quit IRC03:22
*** banix has joined #openstack-infra03:24
openstackgerritMichael Still proposed a change to openstack-infra/zuul: Implement a simple mysql reporter.  https://review.openstack.org/6588503:30
*** praneshp has quit IRC03:31
openstackgerritMichael Still proposed a change to openstack-infra/zuul: Implement a simple mysql reporter.  https://review.openstack.org/6588503:33
*** MarkAtwood has joined #openstack-infra03:43
*** pcrews has quit IRC03:45
*** michchap_ has quit IRC04:01
*** michchap has joined #openstack-infra04:01
*** banix has quit IRC04:04
*** praneshp has joined #openstack-infra04:07
*** praneshp_ has joined #openstack-infra04:10
*** praneshp has quit IRC04:11
*** praneshp_ is now known as praneshp04:11
*** julim has quit IRC04:28
*** FallenPegasus has joined #openstack-infra04:35
*** FallenPegasus has quit IRC04:37
*** MarkAtwood has quit IRC04:38
*** rnirmal has quit IRC04:42
*** yamahata has joined #openstack-infra04:46
*** loq_mac has quit IRC04:47
*** loq_mac has joined #openstack-infra04:47
*** rakhmerov has quit IRC04:52
*** oubiwann has quit IRC04:52
*** coolsvap has joined #openstack-infra04:57
*** rakhmerov has joined #openstack-infra05:00
*** oubiwann has joined #openstack-infra05:01
*** loq_mac has quit IRC05:03
*** rakhmerov has quit IRC05:05
*** gokrokve has quit IRC05:20
*** rakhmerov has joined #openstack-infra05:22
*** rakhmerov has quit IRC05:27
*** loq_mac has joined #openstack-infra05:28
*** yamahata has quit IRC05:41
*** yamahata has joined #openstack-infra05:42
*** oubiwann has quit IRC05:45
*** rcarrillocruz1 has quit IRC05:54
*** dpyzhov has joined #openstack-infra05:59
*** gyee_ has quit IRC06:02
*** mattoliverau has joined #openstack-infra06:06
*** rakhmerov has joined #openstack-infra06:08
*** mattoliverau has quit IRC06:09
*** rakhmerov has quit IRC06:12
*** dcramer_ has joined #openstack-infra06:13
*** mattoliverau has joined #openstack-infra06:15
*** mattoliverau has quit IRC06:15
*** mattoliverau has joined #openstack-infra06:20
*** loq_mac has quit IRC06:22
*** loq_mac has joined #openstack-infra06:23
*** SergeyLukjanov has joined #openstack-infra06:28
*** mattoliverau has quit IRC06:29
*** mattoliverau has joined #openstack-infra06:30
*** starmer_ has quit IRC06:31
*** praneshp has quit IRC06:38
*** mattoliverau has quit IRC06:44
openstackgerritMichael Still proposed a change to openstack-infra/zuul: Implement a simple mysql reporter.  https://review.openstack.org/6588506:46
*** mattoliverau has joined #openstack-infra06:48
*** mattoliverau has quit IRC06:49
*** mattoliverau has joined #openstack-infra06:53
*** mattoliv1rau has joined #openstack-infra06:58
*** mattoliverau has quit IRC06:58
*** mattoliv1rau has quit IRC07:07
*** mattoliverau has joined #openstack-infra07:08
*** SergeyLukjanov is now known as _SergeyLukjanov07:08
*** rakhmerov has joined #openstack-infra07:08
*** _SergeyLukjanov has quit IRC07:09
*** harlowja is now known as harlowja_away07:10
*** mattoliverau has quit IRC07:12
*** rakhmerov has quit IRC07:13
*** SergeyLukjanov has joined #openstack-infra07:13
*** starmer has joined #openstack-infra07:16
*** loq_mac has quit IRC07:25
*** loq_mac has joined #openstack-infra07:29
*** a7ndrew has joined #openstack-infra07:32
*** bogdando has quit IRC07:34
*** loq_mac has quit IRC07:34
*** loq_mac has joined #openstack-infra07:35
openstackgerritRuslan Kamaldinov proposed a change to openstack-infra/storyboard: [Do not review] Added tests for DB migrations  https://review.openstack.org/6611607:37
*** denis_makogon has joined #openstack-infra07:47
*** cbkyeoh has joined #openstack-infra07:48
*** larrycai has joined #openstack-infra07:50
*** larrycai has quit IRC07:54
*** dpyzhov has quit IRC07:55
*** coolsvap has quit IRC07:55
*** SergeyLukjanov has quit IRC08:03
*** _ruhe is now known as ruhe08:08
*** loq_mac has quit IRC08:08
*** rakhmerov has joined #openstack-infra08:09
*** rakhmerov has quit IRC08:14
*** dpyzhov has joined #openstack-infra08:14
*** ruhe is now known as _ruhe08:19
*** nicedice has quit IRC08:31
*** nicedice has joined #openstack-infra08:33
*** hashar has joined #openstack-infra08:38
*** yolanda has joined #openstack-infra08:58
*** rakhmerov has joined #openstack-infra09:10
*** rakhmerov1 has joined #openstack-infra09:12
*** rakhmerov has quit IRC09:12
*** SergeyLukjanov has joined #openstack-infra09:12
*** nicedice has quit IRC09:15
*** rakhmerov1 has quit IRC09:16
*** SergeyLukjanov is now known as _SergeyLukjanov09:27
*** _SergeyLukjanov has quit IRC09:28
*** denis_makogon has quit IRC09:29
*** starmer has quit IRC09:30
*** SergeyLukjanov has joined #openstack-infra09:48
*** hashar has quit IRC09:55
*** rakhmerov has joined #openstack-infra10:12
*** yolanda has quit IRC10:13
*** yolanda has joined #openstack-infra10:13
*** rakhmerov has quit IRC10:17
*** dpyzhov has quit IRC10:27
*** michchap has quit IRC10:37
*** michchap has joined #openstack-infra10:37
*** wenlock has quit IRC11:12
*** rakhmerov has joined #openstack-infra11:13
*** enikanorov_ has joined #openstack-infra11:14
*** enikanorov has quit IRC11:16
*** rakhmerov has quit IRC11:17
*** tma996 has joined #openstack-infra11:26
*** boris-42 has quit IRC11:28
*** boris-42 has joined #openstack-infra11:28
*** uvirtbot has joined #openstack-infra11:45
*** hashar has joined #openstack-infra12:09
*** rakhmerov has joined #openstack-infra12:13
*** rakhmerov has quit IRC12:18
*** SergeyLukjanov is now known as _SergeyLukjanov12:21
*** _SergeyLukjanov is now known as SergeyLukjanov12:21
*** hashar has quit IRC12:48
*** denis_makogon has joined #openstack-infra12:51
*** bauzas has joined #openstack-infra12:55
*** yolanda has quit IRC13:08
*** rakhmerov has joined #openstack-infra13:14
*** cbkyeoh has quit IRC13:14
*** cbkyeoh has joined #openstack-infra13:15
*** rakhmerov has quit IRC13:19
*** cbkyeoh has quit IRC13:32
*** cbkyeoh has joined #openstack-infra13:34
*** cyeoh has quit IRC13:39
*** cbkyeoh is now known as cyeoh13:39
*** mozawa has joined #openstack-infra14:04
*** rakhmerov has joined #openstack-infra14:15
*** rakhmerov has quit IRC14:19
*** dcramer_ has quit IRC14:33
*** dstanek has joined #openstack-infra14:38
*** sdague has quit IRC14:44
*** dstanek has quit IRC14:48
*** sdague has joined #openstack-infra15:02
*** sdague has quit IRC15:07
*** sdague has joined #openstack-infra15:08
*** dmsimard has joined #openstack-infra15:14
*** coolsvap has joined #openstack-infra15:14
dmsimardAre there some problems with Jenkins this morning? Getting weird checks again.15:15
*** dcramer_ has joined #openstack-infra15:15
*** rakhmerov has joined #openstack-infra15:15
*** ryanpetrello has joined #openstack-infra15:17
*** tma996 has quit IRC15:18
*** dstanek has joined #openstack-infra15:18
*** ryanpetrello has quit IRC15:19
*** rakhmerov has quit IRC15:20
*** sdake has quit IRC15:21
fungidmsimard: "weird" how?15:22
fungihave an example link?15:23
dmsimardfungi: https://review.openstack.org/#/c/64388/115:23
*** tma996 has joined #openstack-infra15:24
dmsimardOnly have logs for 2 of the 5 checks and then we get errors that shouldn't be happening on gate-puppet-ceph-puppet-unit-3.015:25
*** CaptTofu has joined #openstack-infra15:27
fungiyeah, i suspect another jenkins unit test slave has gone rogue. hunting it now15:29
dmsimardAppreciate it fungi, thanks15:29
dmsimardTough week for you guys ?15:30
fungiyes :/15:30
fungijenkins02 seems to be mostly unresponsive15:30
fungioh, it came up for me on a reload15:30
*** ryanpetrello has joined #openstack-infra15:31
*** gokrokve has joined #openstack-infra15:32
*** gokrokve has quit IRC15:37
fungiyep, precise20 was impacted by bug 1267364 now... https://jenkins02.openstack.org/job/gate-puppet-ceph-puppet-syntax/51/console15:38
uvirtbotLaunchpad bug 1267364 in openstack-ci "Recurrent jenkins slave agent failures" [Critical,In progress] https://launchpad.net/bugs/126736415:38
fungii've taken it offline15:38
fungioh, actually maybe not. that's a different backtrace15:38
fungiyeah, this one failed the same way on precise20 too... https://jenkins02.openstack.org/job/gate-puppet-ceph-puppet-lint/44/console15:41
fungiand i'm getting tons of timeouts/proxy errors from the jenkins02 webui, so i'm going to put it in shutdown, delete the currently disabled slaves so that they don't get reenabled on startup, then give jenkins02 a service restart15:42
fungijenkins01/03/04 are all snappy by comparison15:44
fungiat least as snappy as the horrible jenkins webui ever gets15:44
*** ryanpetrello has quit IRC15:47
fungioh yeah, cpu graph for jenkins02 is fun... http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=823&rra_id=all15:47
dmsimardouch15:47
dmsimardbe right back15:48
fungiand load average http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=822&rra_id=all15:48
*** dmsimard has quit IRC15:48
*** SergeyLukjanov is now known as _SergeyLukjanov15:48
*** dmsimard has joined #openstack-infra15:49
fungihowever, matching the cpu and load average graphs to memory consumption makes it look like maybe it was doing something to reclaim cache memory starting around the same time or just before http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=826&rra_id=all15:49
dmsimardI'm not familiar with the CI infra. The jenkins nodes are just dispatching to the slaves, right ?15:49
dmsimardWhy would jenkins02 be crawling then :(15:49
fungiEJAVA15:50
dmsimardOh, java, of course15:50
fungijenkins is one giant java virtual machine, and it is doing more than just dispatching jobs. it also gets dynamic slaves registered/deregistered, collects job artifacts from them, keeps track of job status and estimated run time, et cetera15:51
fungiand has tons of global locks, so doesn't scale well to our work volume. that's why zuul grew the ability to manage multiple jenkins masters (we have five now)15:52
fungijenkins itself always assumes one and only one master server, so we had to add a layer above it15:53
*** rakhmerov has joined #openstack-infra15:53
fungioh, wow, first time i've seen this message on the jenkins management page... "This Jenkins appears to have crashed. Please check the logs."15:55
*** rakhmerov has quit IRC15:55
*** rakhmerov1 has joined #openstack-infra15:55
fungioh... "Jan 6, 2014 9:07:26 PM  (4 days 18 hr ago)"15:56
fungithat's probably when we were restarting it15:56
fungiand i had to forcibly kill it15:56
fungiokay, i've placed jenkins02 in prepare for shutdown state15:58
dmsimardThink I'll have to issue another recheck comment ?15:58
fungidmsimard: no, hopefully not for this issue anyway15:59
*** dstanek has quit IRC15:59
dmsimardok if it happens again i'll link that bug16:00
fungidmsimard: oh, well, you'll need to issue *a* recheck comment (you entered "reverify" instead)16:00
dmsimardSo it's recheck, not reverify ? I could've sworn I saw a reverify work16:00
fungireverify is for changes which get approved and fail to merge to to test failuews16:01
fungifailures16:01
dmsimardAh16:01
fungirecheck is for patchsets which are still undergoing review16:01
*** gokrokve has joined #openstack-infra16:02
fungidmsimard: they're discussed briefly in the article section jenkins links in its failure comments... https://wiki.openstack.org/wiki/GerritJenkinsGit#Test_Failures16:02
dmsimardK, sent a recheck, let's wait and see16:04
*** oubiwann has joined #openstack-infra16:04
*** dmsimard has quit IRC16:07
*** ryanpetrello has joined #openstack-infra16:10
fungijenkins02 is estimating over an hour for some currently running jobs, so i'm going to step away for a few while it winds down16:12
*** wenlock has joined #openstack-infra16:12
*** _SergeyLukjanov is now known as SergeyLukjanov16:12
*** gokrokve has quit IRC16:16
*** wenlock has quit IRC16:16
*** ryanpetrello has quit IRC16:35
*** tma996 has quit IRC16:37
*** oubiwann has quit IRC16:47
*** gokrokve has joined #openstack-infra16:55
*** krtaylor has quit IRC16:59
fungilast job on jenkins02 is finishing up now17:00
*** gokrokve has quit IRC17:08
*** agordeev has quit IRC17:10
*** agordeev has joined #openstack-infra17:15
fungiokay, i was able to delete all slaves on jenkins02 with hung jobs from several days ago, though it took multiple tries of offlining/disconnecting to be able to delete them... centos6-{2,12,14} and precise{2,22,36}17:26
*** dcramer_ has quit IRC17:27
fungithere was also a hung gate-swift-docs job pending in the queue on jenkins02 for some reason, again dating back a couple days, which i was able to kill17:27
fungii also deleted the slaves which either we or jenkins02 offlined for connectivity problems... precise{4,10,12,18,20,26,34}17:27
fungiand deleted all nodepool nodes for jenkins02 from nodepool except for that one hung devstack-precise-hpcloud-az2-777371 which even novaclient can't seem to delete17:28
fungiand also deleted any lingering nodepool-managed nodes from the jenkins02 webui17:28
fungitried but was unable to stop the jenkins service on jenkins02 cleanly from its initscript (like earlier in the week)17:30
fungisigterm and sighup were being ignored by both the daemon process and the java subprocess. had to take the child down with sigsegv and then separately do the same to the parent since it still didn't terminate on its own (even with lots of waiting between various signals)17:35
fungiuhh, top claims the 10-minute load average is 2195.3017:36
fungithat doesn't even seem possible17:36
fungidropping rapidly, so must have been much higher17:37
fungijenkins02 is completely idle now though, so starting jenkins service on it again17:38
*** dcramer_ has joined #openstack-infra17:38
fungiit's back up, running jobs, and at least one has already succeeded17:40
*** SergeyLukjanov has quit IRC17:40
openstackgerritSean Dague proposed a change to openstack-infra/config: Zuul status: don't toggle on link click  https://review.openstack.org/6471617:43
openstackgerritSean Dague proposed a change to openstack-infra/config: provide time in queue in zuul ui  https://review.openstack.org/6599317:43
openstackgerritSean Dague proposed a change to openstack-infra/config: clean up possible js incompatibilities  https://review.openstack.org/6605717:43
openstackgerritSean Dague proposed a change to openstack-infra/config: make merge conflict changes black  https://review.openstack.org/6605617:43
sdaguefungi: is it possible to inject a periodic event?17:47
*** boris-42 has quit IRC17:48
fungisdague: not currently i don't think, but i'll try17:48
*** dcramer_ has quit IRC17:48
sdagueI was trying to fix a layout issue17:48
sdaguebut it only shows up on periodics17:49
fungiyep, see that--was in the middle of reviewing the new patchset for it17:49
fungilooks like zuul enqueue requires --change which i suspect won't take a git refname17:50
fungitrigger-job.py can take refs, but only has parameters for a few of the pipelines (not the periodic one)17:51
*** boris-42 has joined #openstack-infra17:51
fungiand while i think zuul-dev runs some dummy periodic changes with great frequency (or at least used to), its status.json currently isn't available apparently... http://zuul-dev.openstack.org/17:52
fungisdague: i believe there are some sample files in the zuul repo though, which might be fed in to test the interface?17:53
fungisample json payloads i mean17:53
sdaguefungi: yeh, so I actually think my current layout work around will be fine17:53
sdaguehonestly, I'm about to stop doing this for the day, just one more email :)17:54
fungihttp://git.openstack.org/cgit/openstack-infra/zuul/tree/etc/status/public_html/status-openstack.json-sample17:54
fungioh, looks like the sample is missing the periodic pipeline17:55
sdaguethe fact that we're still 40 deep on Sat is not a good sign18:06
*** starmer has joined #openstack-infra18:06
*** fbo_away is now known as fbo18:07
fungiyeah, i still saw quite a few tempest jobs failing on a variety of issues, most commonly ssh timeouts though18:07
*** mozawa has quit IRC18:09
fungiyep, the tempest change which caused the last reset died on an ssh timeout in the tempest run at the end of its grenade job18:11
*** rakhmerov1 has quit IRC18:23
*** denis_makogon has quit IRC18:26
*** yolanda has joined #openstack-infra18:48
*** boris-42 has quit IRC18:51
*** boris-42 has joined #openstack-infra18:51
*** CaptTofu has quit IRC19:19
*** CaptTofu has joined #openstack-infra19:20
*** coolsvap has quit IRC19:21
*** CaptTofu has quit IRC19:22
fungithe 13 persistent slaves which i deleted out of jenkins02 before the reboot have been halted, hard-rebooted through the rackspace dashboard, readded to jenkins02 and i've watched each of them run and complete at least one job successfully apiece19:58
fungier, before the jenkins02 service restart (wasn't a reboot of the server)19:59
*** CaptTofu has joined #openstack-infra20:04
*** mdenny has quit IRC20:09
*** dstanek has joined #openstack-infra20:28
*** starmer has quit IRC20:30
*** starmer has joined #openstack-infra20:33
*** dstanek has quit IRC20:51
*** senk has joined #openstack-infra20:56
*** oubiwann has joined #openstack-infra21:02
*** erfanian has joined #openstack-infra21:13
*** erfanian has quit IRC21:18
*** rakhmerov has joined #openstack-infra21:20
*** senk has quit IRC21:47
*** senk has joined #openstack-infra21:48
*** fbo is now known as fbo_away21:48
*** dcramer_ has joined #openstack-infra21:49
*** oubiwann has quit IRC21:59
*** dimsum has quit IRC21:59
*** oubiwann has joined #openstack-infra22:04
sdaguefungi: if you are still around today - https://review.openstack.org/#/c/65804/ and pop it to top of queue? Will be interesting to see if that clears everything out.22:16
*** olaph has quit IRC22:17
*** annegentle_ has quit IRC22:20
*** ryanpetrello has joined #openstack-infra22:23
fungigotta run out to dinner, but i'll do that real fast22:24
fungidone22:25
fungiback later22:25
*** senk has quit IRC22:29
*** ryanpetrello has quit IRC22:36
*** rnirmal has joined #openstack-infra22:39
*** salv-orlando has quit IRC22:54
*** ryanpetrello has joined #openstack-infra22:55
*** yolanda has quit IRC23:02
openstackgerritEmilien Macchi proposed a change to openstack-infra/devstack-gate: Enable Neutron metering service plugin  https://review.openstack.org/6614223:14
zarofungi: i'm having trouble making a release.23:14
jeblairsdague: did you see i pasted the json with a periodic event?23:15
jeblairsdague: i was hoping that would help you debug/fix that23:16
jeblairsdague, fungi: also, 65804 wasn't designed to change anything; it just adds the knob, it doesn't turn it.23:17
jeblairsdague, fungi: https://review.openstack.org/#/c/65805/ turns the knob down, at the expense of increasing the runtime to 1.0-1.5 hours23:20
*** bauzas has quit IRC23:21
*** dstanek has joined #openstack-infra23:26
openstackgerritA change was merged to openstack-infra/config: Zuul status: don't toggle on link click  https://review.openstack.org/6471623:34
*** erfanian has joined #openstack-infra23:34
*** rnirmal has quit IRC23:37
*** oubiwann has quit IRC23:39
*** michchap has quit IRC23:46
*** michchap has joined #openstack-infra23:46
*** dstanek has quit IRC23:47
*** erfanian has quit IRC23:56
