Thursday, 2014-06-12

*** sweston has quit IRC00:04
*** lcostantino has quit IRC00:05
*** julim has quit IRC00:05
<openstackgerrit> A change was merged to openstack-dev/hacking: Fixed warning H302 when used with six.moves  https://review.openstack.org/99467 [00:05]
*** mmaglana has quit IRC00:05
*** oomichi_sleeping is now known as oomichi00:09
*** dims__ has quit IRC00:11
*** dims has joined #openstack-infra00:11
*** hemna is now known as hemna_00:12
<openstackgerrit> Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling  https://review.openstack.org/99515 [00:13]
<ianw> sdague: sorry, another go at https://review.openstack.org/#/c/99047/ ... i got reports dbus restart was killing gnome.  i tested and restarting just firewalld is sufficient on rackspace images [00:16]
<sdague> why in gods name are people running gnome on their devstack envs? [00:17]
<ianw> sdague: yeah... maybe you can forgive me for not noticing that failure case :) [00:18]
*** dkehn_ has joined #openstack-infra00:18
*** dkehn_ is now known as dkehnx00:19
<openstackgerrit> Michael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling  https://review.openstack.org/99515 [00:20]
<mordred> sdague: because, you know .. [00:20]
<mordred> sdague: X11aaS [00:20]
<mordred> also known as ... X11 [00:20]
*** dims has quit IRC00:21
*** _nadya_ has quit IRC00:25
*** zhiyan_ is now known as zhiyan00:27
*** homeless has quit IRC00:27
*** matsuhashi has joined #openstack-infra00:27
*** dims has joined #openstack-infra00:27
*** praneshp has quit IRC00:28
*** nosnos has joined #openstack-infra00:32
*** HenryG_ has joined #openstack-infra00:33
*** yamahata has joined #openstack-infra00:33
<SpamapS> sdague: didn't you mean "why in gods name are people running gnome"? [00:33]
*** thuc has joined #openstack-infra00:34
*** thuc has quit IRC00:36
*** thuc has joined #openstack-infra00:37
*** jhesketh has quit IRC00:37
*** thuc has quit IRC00:37
*** thuc has joined #openstack-infra00:38
*** thuc has quit IRC00:38
*** thuc has joined #openstack-infra00:39
*** richm has left #openstack-infra00:39
*** thuc_ has joined #openstack-infra00:40
*** HenryG_ has quit IRC00:42
*** thuc has quit IRC00:44
*** asettle has quit IRC00:45
*** jhesketh has joined #openstack-infra00:46
*** ildikov_ has joined #openstack-infra00:48
<fungi> cde with motif should be enough for anybody [00:49]
*** ildikov has quit IRC00:51
*** mfer has joined #openstack-infra00:52
*** _nadya_ has joined #openstack-infra00:52
*** CaptTofu_ has quit IRC01:01
*** CaptTofu_ has joined #openstack-infra01:02
*** yaguang has joined #openstack-infra01:02
*** chianingwang has quit IRC01:02
<mordred> SpamapS: I'm running gnome:old-stable myself [01:03]
<mordred> aka mate [01:04]
<mordred> clock-applet FTW! [01:04]
*** chianingwang has joined #openstack-infra01:04
<jogo> sdague: re INFO logs, too bad, because that would drop the data in ES way down. [01:04]
<jogo> sdague: re heat test, do you have a good scenario to test? [01:04]
<jogo> stevebaker fungi: can we prioritize https://review.openstack.org/#/c/99517/ in the gate [01:05]
<jogo> mordred: ^ [01:05]
*** CaptTofu_ has quit IRC [01:06]
<jogo> Alex_Gaynor: want hacking 0.9.2 now? or think you can find another bug first? [01:07]
<Alex_Gaynor> jogo: heh, a release would be awesome, I could delete a few crufty noqa's [01:07]
<fungi> jogo: looks like it's got ~20 minutes until it passes check tests, but if it's contributing significantly to gate slowness i can promote it to the front once it finishes check jobs [01:08]
<jogo> fungi: it's the top of http://status.openstack.org/elastic-recheck/ [01:08]
<jogo> that would be great [01:09]
<jogo> Alex_Gaynor: done, it's working its way through the release queue. When it's done, want the honors of sending out an email on the hacking 0.9 thread about the fix? [01:12]
<Alex_Gaynor> jogo: not especially :-) [01:12]
<jogo> Alex_Gaynor: I'll send one out [01:14]
<Alex_Gaynor> jogo: thanks [01:14]
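The "crufty noqa's" Alex_Gaynor wants to delete are inline escape hatches for checks like H302 ("import only modules"), which the just-merged fix stops triggering falsely on six.moves. A minimal illustrative sketch, using a stdlib import as a stand-in for the actual six.moves case:

```python
# hacking's H302 check wants modules, not names, to be imported.
# While the check false-positived, authors silenced the offending line
# with a bare "# noqa" marker, which flake8/hacking honor:
from os.path import join  # noqa  (illustrative stand-in, not six.moves)

# The H302-compliant spelling imports the module and qualifies the call:
from os import path

assert join("tmp", "x") == path.join("tmp", "x")
```

Once the checker stops flagging the line, the `# noqa` can simply be deleted, which is exactly the cleanup a 0.9.2 release enables.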
*** gokrokve has joined #openstack-infra01:14
<sdague> jogo: look at the existing tests that are failing which keep the heat job from voting, that would be a good start [01:17]
*** alkari has quit IRC [01:17]
<sdague> I think we need to be able to consistently create and delete a single stack before going the large-ops route [01:17]
<sdague> jogo: if we didn't log API requests, how would you *ever* figure out what's going on :) [01:18]
<sdague> ianw: +2 [01:18]
*** thuc_ has quit IRC [01:20]
<sdague> mordred: you do realize that the calendar-indicator in Ubuntu does exactly that timezone-switch thing. Though you'd have to be running unity. [01:20]
* sdague ducsk [01:20]
<sdague> or ducks even [01:20]
*** thuc has joined #openstack-infra [01:20]
<morganfainberg> oh.. i just realized... [01:21]
<morganfainberg> exchange or whatever can do UTC meetings... can someone explain to me why gmail doesn't let you set a meeting for the UTC timezone? :P [01:21]
<ianw> sdague: thanks! [01:22]
<tchaypo> morganfainberg: you really want me to launch into my rant about gmail's handling of timezones? [01:22]
<tchaypo> how about I just rant about the fact that it says "Sydney, Melbourne (GMT+10)" [01:23]
<morganfainberg> tchaypo,  lol [01:23]
<tchaypo> but Sydney and Melbourne are GMT+11 for half the year, so does that mean the appointment is in Sydney time or in GMT+10? [01:23]
<mordred> sdague: the calendar-indicator in Ubuntu does not do the same things [01:24]
*** thuc has quit IRC [01:24]
<tchaypo> FWIW they have a timezone that's "(GMT+00:00) GMT (no daylight saving)" [01:24]
<tchaypo> I *think* that's the same thing as UTC, but it makes no sense, as GMT doesn't ever have daylight saving [01:25]
<SpamapS> mordred: you're still clinging to your clock applet? lulz [01:25]
<morganfainberg> tchaypo, they do? huh, i thought the GMT one they had did DST [01:25]
<morganfainberg> tchaypo, oh the iceland one, that's right, GMT no DST, but that's a hack [01:25]
<tchaypo> There's a "GMT+0 London", which will presumably have DST [01:25]
*** sarob has quit IRC01:25
<morganfainberg> right [01:26]
<mordred> SpamapS: clinging to - hell, I just got it BACK after being without it for 2 years because UI got taken over by people who clearly do not actually use linux [01:26]
<morganfainberg> i don't want DST, IRC meetings for OS [01:26]
*** sarob has joined #openstack-infra [01:26]
<tchaypo> In my list I've got Reykjavik, St Helena, GMT (No daylight saving), Dublin, Lisbon [01:26]
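tchaypo's complaint about "Sydney, Melbourne (GMT+10)" is easy to verify: the offset is only +10 for half the year. A quick sketch using the standard `zoneinfo` module (assumes the system timezone database is installed, as on most Linux machines):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

sydney = ZoneInfo("Australia/Sydney")

# January is austral summer: DST is in effect, so the offset is UTC+11.
january = datetime(2014, 1, 15, 12, 0, tzinfo=sydney)
# July is austral winter: standard time, UTC+10.
july = datetime(2014, 7, 15, 12, 0, tzinfo=sydney)

print(january.utcoffset())  # 11:00:00
print(july.utcoffset())     # 10:00:00
```

So a calendar label of "GMT+10" is ambiguous: an appointment pinned to the offset drifts an hour against local Sydney time whenever DST flips, which is the crux of the rant.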
*** esker has joined #openstack-infra01:26
<SpamapS> mordred: I've grown accustomed to the unity clock applet.. which is lacking only the map. :-P [01:26]
<mordred> and the weather/temperature. and the timezone indicators. and the easy timezone switching between favorites [01:27]
*** gyee has quit IRC [01:28]
<mordred> SpamapS: http://imgur.com/5vgSM3P [01:29]
*** gokrokve has quit IRC [01:30]
*** nati_ueno has quit IRC [01:30]
<mordred> SpamapS: there is a lovely and useful pile of information in that display. now, is it all necessarily related to _time_? No. but sometimes purity of interface needs to DIAF in favor of shit that works well [01:30]
*** sarob_ has joined #openstack-infra [01:30]
*** gokrokve has joined #openstack-infra [01:30]
*** sarob has quit IRC [01:30]
*** zehicle_in_sfo has quit IRC [01:30]
<morganfainberg> mordred, lol [01:31]
<mordred> in unity/gnome3 I can also add a weather applet, but then if I change locations, I need to change locations in two places to get it to show me things in the menu bar, which is silly [01:31]
<mordred> also, none of those things give me the quick graphical indicator of "are people asleep in that location" - which is helpful [01:32]
<mordred> (I can go on about this for a while if anyone wants more) [01:32]
<sdague> :) [01:33]
* mordred needs to figure out how to keep the Mate guys funded... [01:34]
<sdague> damn, now you are nearly inspiring me to write it, mostly for the map visualization [01:34]
<openstackgerrit> Joshua Hesketh proposed a change to openstack-infra/config: Fetch graphitejs zuul dependency  https://review.openstack.org/98018 [01:34]
<openstackgerrit> Joshua Hesketh proposed a change to openstack-infra/config: Use the latest jquery on zuul  https://review.openstack.org/98029 [01:34]
*** gokrokve has quit IRC [01:35]
<sdague> night all [01:35]
*** yjiang has joined #openstack-infra [01:36]
<mordred> night sdague [01:36]
<openstackgerrit> Alex Gaynor proposed a change to openstack-dev/hacking: Mark hacking as being a universal wheel  https://review.openstack.org/99528 [01:36]
*** trinaths has joined #openstack-infra [01:38]
<dims> jogo, Didn't you mean "0.9.2 has just been released"? [01:38]
<jogo> dims: I reused the old thread, but yes [01:39]
<dims> ack. thx [01:39]
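The "universal wheel" change above (99528) is a one-line packaging flag; a sketch of what such a patch touches, assuming the conventional setup.cfg layout of that era:

```ini
# setup.cfg -- declare that one built wheel works on both Python 2 and 3
# (pure-Python code only). Newer setuptools spells the section
# [bdist_wheel]; releases contemporary with this log accepted [wheel].
[wheel]
universal = 1
```

With this set, `python setup.py bdist_wheel` produces a single `-py2.py3-none-any` wheel instead of per-interpreter builds.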
*** zns has quit IRC01:39
<fungi> reset second from the front of the gate, so i'll promote 99517,3 to the front as soon as it reports [01:42]
*** rwsu has quit IRC01:42
*** chianingwang_ has joined #openstack-infra01:44
*** Ryan_Lane1 has quit IRC01:44
<jogo> fungi: thanks [01:47]
*** _nadya_ has quit IRC01:50
*** trinaths has quit IRC01:51
<openstackgerrit> A change was merged to openstack-infra/jenkins-job-builder: re-arrange docs for clarity  https://review.openstack.org/98467 [01:52]
*** alexandra_ has joined #openstack-infra01:53
*** alexandra_ is now known as asettle01:53
<tchaypo> so I'm trying to write an email to the tripleo team asking people to spend more time reviewing, as our backlog is getting bigger [01:53]
<tchaypo> someone suggested yesterday that there were other emails to openstack-dev with tips like "if it just needs a rebase, please rebase it", but I can't find them [01:54]
<tchaypo> does anyone here have any memory of such an email? [01:54]
<morganfainberg> tchaypo, well i can tell you keystone has success with that type of stuff [01:55]
*** thomasbiege1 has joined #openstack-infra [01:55]
<morganfainberg> tchaypo, often someone will help with rebases; if there is a minor nit (otherwise good), upload a fix.. mostly "help get things through" [01:55]
<mordred> yah. around here we'll also single-approve trivial things too [01:55]
<mordred> you know - spelling error in a README? single approve [01:55]
<mordred> change in zuul's internal algorithms? double approve and probably lots of prayer [01:55]
<morganfainberg> mordred, oh that is a good idea. [01:56]
<mordred> it's a thing to be careful with - and if there is even the slightest shred of doubt, I default back to double [01:56]
<morganfainberg> mordred, yeah. [01:57]
<morganfainberg> mordred, true - i have a hard enough time convincing people to single-approve translation imports, because the two-approver habit is so deeply ingrained [01:57]
*** thomasbiege has quit IRC [01:58]
<morganfainberg> i'd rather have that issue than the inverse though. [01:58]
<mordred> yup [01:58]
*** sarob_ has quit IRC02:02
*** sarob has joined #openstack-infra02:03
*** sweston has joined #openstack-infra02:03
*** sarob has quit IRC02:07
*** masayukig has quit IRC02:10
*** masayukig has joined #openstack-infra02:12
*** _nadya_ has joined #openstack-infra02:17
*** chianingwang_ has quit IRC02:18
*** cp16net has joined #openstack-infra02:19
*** otter768 has joined #openstack-infra02:21
*** CaptTofu_ has joined #openstack-infra02:23
*** penguinRaider has joined #openstack-infra02:24
*** sarob has joined #openstack-infra02:25
*** CaptTofu_ has quit IRC02:28
*** mmaglana has joined #openstack-infra02:28
*** arnaud__ has quit IRC02:29
*** thuc has joined #openstack-infra02:31
*** markmcclain has quit IRC02:31
*** penguinRaider has quit IRC02:34
*** alkari has joined #openstack-infra02:34
*** mrodden1 has joined #openstack-infra02:35
*** mrodden has quit IRC02:35
*** thuc has quit IRC02:36
*** CaptTofu_ has joined #openstack-infra02:38
*** mfer has quit IRC02:40
*** dims has quit IRC02:40
*** fanhe has joined #openstack-infra02:51
*** sarob has quit IRC02:57
*** chianingwang_ has joined #openstack-infra03:01
*** sarob has joined #openstack-infra03:02
<openstackgerrit> Joshua Harlow proposed a change to openstack/requirements: Bump up to the six 1.7.x series  https://review.openstack.org/99556 [03:03]
*** gokrokve has joined #openstack-infra03:05
*** dims_ has joined #openstack-infra03:07
*** otter768 has quit IRC03:10
*** dims_ has quit IRC03:12
*** praneshp has joined #openstack-infra03:18
*** Longgeek has joined #openstack-infra03:21
*** chianingwang_ has quit IRC03:21
*** sarob has quit IRC03:26
*** sarob has joined #openstack-infra03:26
*** CaptTofu_ has quit IRC03:26
*** CaptTofu_ has joined #openstack-infra03:27
*** chianingwang has quit IRC03:27
*** zhiyan is now known as zhiyan_03:30
<lifeless> clarkb: ping me if/when you want more dib kibitzing [03:30]
<lifeless> clarkb: I'm keen to get you a good answer to the problem [03:30]
*** sarob has quit IRC03:30
*** CaptTofu_ has quit IRC03:31
*** talluri has joined #openstack-infra03:36
*** arnaud__ has joined #openstack-infra03:36
*** praneshp_ has joined #openstack-infra03:39
*** pcrews has quit IRC03:40
*** zhiyan_ is now known as zhiyan03:41
*** praneshp has quit IRC03:41
*** praneshp_ is now known as praneshp03:41
*** nosnos has quit IRC03:44
*** thuc has joined #openstack-infra03:45
*** thuc has quit IRC03:49
*** thuc has joined #openstack-infra03:50
*** zns has joined #openstack-infra03:51
*** thuc_ has joined #openstack-infra03:51
<lifeless> hmm, gertty also needs to line-wrap the text lines, not just comments :( [03:52]
<lifeless> clarkb: nuts, the expand-next-N-lines stuff doesn't show up in the commit messages atm; not sure why :( [03:54]
<lifeless> clarkb: or... it's a bug with end-of-file diffs in gertty. Actually that seems more likely. [03:55]
*** thuc has quit IRC03:55
*** amcrn has joined #openstack-infra03:58
*** zns has quit IRC03:59
*** zns has joined #openstack-infra04:00
*** yfried has quit IRC04:08
*** matsuhashi has quit IRC04:11
*** arnaud__ has quit IRC04:12
<openstackgerrit> lifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files  https://review.openstack.org/99563 [04:16]
*** harlowja is now known as harlowja_away04:16
*** rohitk has joined #openstack-infra04:19
*** thuc_ has quit IRC04:20
*** thuc has joined #openstack-infra04:21
*** arnaud__ has joined #openstack-infra04:21
*** rohitk has quit IRC04:25
*** thuc has quit IRC04:26
*** sarob has joined #openstack-infra04:27
<jogo> so I am submitting a talk on hacking and looking for a good title [04:28]
<jogo> was thinking 'Bikeshedding OpenStack, or why Style Guides Matter' [04:29]
<kashyap> Maybe just leave out the "Bikeshedding" aspect and just stick to the why :-) [04:30]
<jogo> kashyap: that is the why [04:30]
<jogo> well, part of it [04:30]
*** sarob has quit IRC04:31
*** CaptTofu_ has joined #openstack-infra04:31
<jogo> lifeless: I know you always have opinions ^ [04:31]
<lifeless> OPINIONS [04:31]
<lifeless> jogo: Making a project 500 bugs in one small release? [04:32]
<lifeless> jogo: as a subtitle [04:32]
<lifeless> so actually [04:32]
<jogo> haha [04:33]
<lifeless> my opinion here is that style guides are inferior to automation [04:33]
<lifeless> pep8 is automating whinging [04:33]
<mikal> jogo: you never came back to chat to me [04:33]
<mikal> jogo: so lame [04:33]
<jogo> so I don't disagree with that statement, but step one is have a style guide; step 2 is automate fixing it [04:33]
<lifeless> it does some, but fairly little, towards improving product quality and velocity. [I know different folk have different opinions here] [04:33]
* jogo hides from mikal [04:33]
<lifeless> jogo: I think step one is to have an automated but nearly empty style guide, and step 2 is to increase it. [04:34]
<Alex_Gaynor> lifeless: go fmt is a million times better than flake8 (apologies to all) [04:34]
<lifeless> jogo: what we have is an ever-increasing step 1 and no step 2. [04:34]
<lifeless> Alex_Gaynor: exactly my point. [04:34]
<lifeless> Alex_Gaynor: I experimented with one of the pep8 autoformatters on LP [04:34]
<lifeless> Alex_Gaynor: I think it was a couple weeks' work away from being usable. [04:34]
<jogo> Alex_Gaynor: yeah, go fmt is awesome [04:35]
<lifeless> jogo: the more hacking adds, the *higher* the effort to get to 'go fmt'. [04:35]
<Alex_Gaynor> lifeless: the problem is there are still lots of python people who believe "A foolish consistency is the hobgoblin of little minds"; the automation doesn't work if not /every/ file participates [04:35]
<tchaypo> thanks mordred, morganfainberg [04:35]
<StevenK> lifeless: I actually miss utilities/format-new-and-modified-imports from LP [04:35]
*** SumitNaiksatam has quit IRC [04:35]
<jogo> Alex_Gaynor: good point [04:35]
<lifeless> Alex_Gaynor: right. I can't buy into 'make me work harder on trivia'. I can totally buy into 'I am not allowed to care about $aesthetic because the daemon will rewrite it for me' [04:35]
*** CaptTofu_ has quit IRC04:35
<jogo> lifeless: you ever use autopep8? [04:36]
<lifeless> let me dig up my post on this [04:36]
*** SumitNaiksatam has joined #openstack-infra04:36
*** zns has quit IRC04:36
*** sarob has joined #openstack-infra04:39
<jogo> Alex_Gaynor: you make a good point about "a foolish consistency" [04:39]
<Alex_Gaynor> jogo: I believe the opposite, to be clear; I think a lot of people are hung up on that [04:39]
<jogo> Alex_Gaynor: I guess a big part of hacking is we are going for a foolish consistency [04:39]
*** zns has joined #openstack-infra [04:39]
<Alex_Gaynor> +1 :-) [04:39]
<jogo> so maybe the title can be: "Style Guides: Why Foolish Consistency Matters" [04:40]
<fungi> jogo: robotic hobgoblins [04:41]
<jogo> fungi: how would you work that into the title? maybe I can use that in the abstract [04:42]
<fungi> no idea. the image was merely compelling [04:43]
*** yfried has joined #openstack-infra04:43
*** sarob has quit IRC04:43
<lifeless> jogo: https://lists.launchpad.net/launchpad-dev/msg07330.html [04:44]
<lifeless> jogo: https://lists.launchpad.net/launchpad-dev/msg07338.html [04:44]
<lifeless> now I was sure I did an experiment [04:45]
*** matsuhashi has joined #openstack-infra04:48
<lifeless> https://lists.launchpad.net/launchpad-dev/msg07340.html ... [04:48]
* StevenK stabs paste.o.o for giving ISEs [04:48]
<lifeless> jogo: https://lists.launchpad.net/launchpad-dev/msg07681.html - seems to be it [04:48]
<lifeless> jogo: so tl;dr - PythonTidy was decent and hackable [04:48]
*** nosnos has joined #openstack-infra [04:48]
<lifeless> jogo: I would deeply, deeply love to see *that* hacked on, and hacking frozen [04:49]
<lifeless> jogo: because PythonTidy makes everyone's life better, not more painful. [04:49]
*** trinaths has joined #openstack-infra04:50
<jogo> PythonTidy is very old, but good to know [04:50]
*** markmcclain has joined #openstack-infra [04:50]
<jogo> lifeless: FWIW I generally am against any additions to hacking.rst at this point [04:51]
<openstackgerrit> A change was merged to openstack/requirements: Updated taskflow now that 0.3.x is released  https://review.openstack.org/99188 [04:51]
*** gokrokve has quit IRC [04:53]
<lifeless> jogo: hacking the project I meant, not the rst file :) [04:53]
<jogo> lifeless: well, the rst file is in the project, but I know what you mean [04:54]
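lifeless's "automate the fixing, not the whinging" position is easy to sketch. The toy below (not autopep8 or PythonTidy, just an illustration of the shape) mechanically applies one pep8-style fix, stripping trailing whitespace (W291/W293), instead of merely reporting it:

```python
def fix_trailing_whitespace(source: str) -> str:
    """Rewrite source so no line carries trailing blanks.

    This is the shape of an autoformatter rule: a reviewer never has to
    mention the problem, because the tool removes it outright.
    """
    if not source:
        return source
    fixed = "\n".join(line.rstrip() for line in source.splitlines())
    # Preserve a single trailing newline, as pep8's W292 expects.
    return fixed + "\n"

before = "x = 1   \n\ny = 2\t\n"
after = fix_trailing_whitespace(before)
print(repr(after))  # 'x = 1\n\ny = 2\n'
```

A `go fmt`-style workflow runs a whole suite of such rewrites on save or commit, which is why growing a checker like hacking without a matching fixer raises the cost of ever getting there.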
*** esker has quit IRC04:54
*** michchap_ has quit IRC04:55
*** michchap has joined #openstack-infra04:55
<openstackgerrit> Joe Gordon proposed a change to openstack-infra/config: Don't notify nova's IRC room on patch creation  https://review.openstack.org/99566 [04:56]
*** lcheng has joined #openstack-infra04:57
*** mmaglana has quit IRC04:58
*** mmaglana has joined #openstack-infra04:59
*** yfried_ has joined #openstack-infra05:01
*** yfried has quit IRC05:02
<tchaypo> Do we have an openstack link shortener? [05:02]
<tchaypo> using bit.ly makes me feel dirty [05:02]
*** mmaglana has quit IRC [05:03]
<lifeless> jogo: so anyhow, I hope my opinions were useful [05:03]
<lifeless> jogo: I realise I sidetracked you - sorry [05:04]
<jogo> lifeless: they were useful; a little sidetracked, but useful [05:06]
<lifeless> jogo: cool [05:07]
*** lcheng has quit IRC05:11
*** zns has quit IRC05:16
*** zns has joined #openstack-infra05:16
<trinaths> fungi: hello [05:20]
*** Longgeek_ has joined #openstack-infra05:21
*** salv-orlando has joined #openstack-infra05:23
*** Longgeek has quit IRC05:24
*** markwash has quit IRC05:27
*** thuc has joined #openstack-infra05:31
*** markwash has joined #openstack-infra05:31
*** salv-orlando has quit IRC05:33
*** thuc has quit IRC05:36
*** sarob has joined #openstack-infra05:39
*** zehicle_at_dell has joined #openstack-infra05:41
*** rdopieralski has joined #openstack-infra05:42
*** sarob has quit IRC05:43
*** mmaglana has joined #openstack-infra05:49
*** _nadya_ has quit IRC05:51
*** enikanorov has joined #openstack-infra05:58
<lifeless> devananda: could you abandon 97646? alembic is in requirements.txt [05:58]
*** oomichi has quit IRC05:58
*** yfried_ is now known as yfried06:03
*** basha has joined #openstack-infra06:08
*** melwitt has quit IRC06:12
*** thomasbiege1 has left #openstack-infra06:14
*** markmcclain has quit IRC06:15
*** praneshp has quit IRC06:16
*** _nadya_ has joined #openstack-infra06:17
*** markmcclain has joined #openstack-infra06:18
*** markwash has quit IRC06:19
*** CaptTofu_ has joined #openstack-infra06:19
*** ildikov_ has quit IRC06:20
<basha> anybody there? [06:21]
*** denis_makogon has joined #openstack-infra [06:22]
<nibalizer> basha: best to just ask your question [06:22]
*** arnaud__ has quit IRC [06:22]
<nibalizer> if someone knows the answer they will help you [06:22]
<basha> nibalizer: I've a patch which has been failing jenkins since yesterday. [06:22]
<basha> for pretty random reasons. [06:22]
<nibalizer> link? [06:22]
<basha> nibalizer: https://review.openstack.org/#/c/78269/ [06:23]
<basha> just wanted to know if it's just the gate acting up, and whether it's worth retriggering it now? [06:23]
*** CaptTofu_ has quit IRC [06:24]
<clarkb> basha: we started voting on python33 with glanceclient recently [06:25]
<clarkb> that change doesn't appear to be python33 clean [06:25]
<nibalizer> that's beyond my ability to debug [06:25]
<nibalizer> well, if you look, it's failing to find subunit_log.txt; where does that file get created in the pipeline? [06:26]
<basha> clarkb: the python33 log says it's unable to find subunit_log [06:26]
<nibalizer> doesn't really look to me like the code doesn't work under python33 [06:26]
<basha> nibalizer: yes [06:26]
<clarkb> basha: nibalizer: http://logs.openstack.org/69/78269/6/check/gate-python-glanceclient-python33/ded5f8d/console.html#_2014-06-11_14_17_08_220 the error is there [06:26]
<clarkb> no subunit log because that failed [06:26]
<basha> it doesn't seem to be my change [06:26]
<basha> that's pretty random too :( [06:28]
<clarkb> it's not random... [06:28]
<clarkb> we made the change recently to start gating on python 3 [06:28]
<basha> clarkb: so it runs the nosetests on python 3 as well, is that right? [06:29]
<clarkb> and the import errors are in files modified by that change [06:29]
<clarkb> basha: it runs testr under python3, yes [06:29]
<clarkb> basha: if you look in that block I linked there is a fairly hard to read section that says `Ad\x17text/plain;charset=utf8\rimport [06:30]
<basha> clarkb: but I don't see any import [06:30]
<clarkb> errorsA3tests.test_exc\ntests.test_http\ntests.test_progressbar\ntests.te and so on [06:30]
<clarkb> tests.test_exc, tests.test_http, tests.test_progressbar and so forth failed to import under python33 [06:31]
<basha> clarkb: but the thing is, I was able to get jenkins green on the prev patch set. And the only diff is an extra log line [06:31]
<basha> which obviously doesn't do a new import [06:31]
<clarkb> the test is new [06:32]
<clarkb> well, newly voting [06:32]
<clarkb> patchset 1 failed with the same issues http://logs.openstack.org/69/78269/1/check/gate-python-glanceclient-python33/1e198c1/console.html.gz#_2014-03-05_15_55_10_475 [06:33]
<basha> clarkb: oh, you mean python3 started voting just recently? [06:33]
<clarkb> but it wasn't voting at the time [06:33]
<clarkb> basha: yes [06:33]
<basha> ok, gotcha. hmmm [06:34]
<clarkb> glance was able to get it to work on master so we went ahead and made it voting to prevent new regressions [06:34]
<basha> let me try running tests on python3 on my box then [06:34]
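A "failed to import under python33" error of the kind clarkb describes usually comes from python2-only syntax that dies at module load time, so testr reports the whole test module as unimportable rather than listing individual test failures. A minimal illustration (not glanceclient's actual code):

```python
# Python 2's print statement is a SyntaxError under Python 3, so a
# module containing it cannot even be imported -- the test runner then
# flags the whole module, which is what the console log above shows.
py2_only_source = "print 'hello'\n"

try:
    # Compiling stands in for the import machinery parsing the module.
    compile(py2_only_source, "<hypothetical_test_module>", "exec")
except SyntaxError as exc:
    print("import-time failure:", exc.msg)
```

Running the project's test suite under a local python3 interpreter, as basha is about to do, surfaces the same parse errors immediately without a gate round-trip.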
*** lcheng has joined #openstack-infra06:34
*** ildikov has joined #openstack-infra06:39
*** sarob has joined #openstack-infra06:39
*** fanhe has quit IRC06:40
*** sarob_ has joined #openstack-infra06:41
*** alkari1 has joined #openstack-infra06:42
*** alkari has quit IRC06:43
*** sarob has quit IRC06:43
*** sarob_ has quit IRC06:45
*** achuprin_ has quit IRC06:51
*** _nadya_ has quit IRC06:51
*** _nadya_ has joined #openstack-infra06:52
*** zehicle_at_dell has quit IRC06:53
*** sarob has joined #openstack-infra06:53
*** cody-somerville has quit IRC06:53
*** zehicle_at_dell has joined #openstack-infra06:53
*** jlibosva has joined #openstack-infra06:54
*** sarob has quit IRC06:57
*** srenatus has quit IRC06:58
*** doude has joined #openstack-infra06:59
*** srenatus has joined #openstack-infra06:59
*** alkari1 has quit IRC07:01
*** zns_ has joined #openstack-infra07:02
*** zns has quit IRC07:03
*** oomichi has joined #openstack-infra07:04
*** alkari has joined #openstack-infra07:04
*** cody-somerville has joined #openstack-infra07:05
*** _nadya_ has quit IRC07:05
*** ildikov has quit IRC07:07
*** achuprin_ has joined #openstack-infra07:07
*** Clabbe has quit IRC07:09
*** markmcclain has quit IRC07:11
*** pblaho has joined #openstack-infra07:12
*** yfried has quit IRC07:12
*** yfried has joined #openstack-infra07:12
*** jcoufal has joined #openstack-infra07:13
*** achuprin_ has quit IRC07:15
*** yfried_ has joined #openstack-infra07:18
*** yfried has quit IRC07:19
*** yfried_ is now known as yfried07:19
*** trinaths has quit IRC07:19
*** flaper87|afk is now known as flaper8707:22
*** ildikov has joined #openstack-infra07:24
*** e0ne has joined #openstack-infra07:24
*** zehicle_at_dell has quit IRC07:25
*** tkelsey has joined #openstack-infra07:26
*** zehicle_at_dell has joined #openstack-infra07:27
*** achuprin_ has joined #openstack-infra07:27
*** sarob has joined #openstack-infra07:29
*** zns_ has quit IRC07:29
*** talluri has quit IRC07:32
*** talluri has joined #openstack-infra07:32
<mattoliverau> I'm calling it a day, night all. [07:33]
*** jcoufal has quit IRC07:33
*** sarob has quit IRC07:34
*** zehicle_at_dell has quit IRC07:36
*** mugsie has quit IRC07:36
*** cody-somerville has quit IRC07:36
*** talluri has quit IRC07:36
*** _nadya_ has joined #openstack-infra07:38
*** sarob has joined #openstack-infra07:39
*** mrda is now known as mrda-away07:40
*** jcoufal has joined #openstack-infra07:43
*** sarob has quit IRC07:44
*** e0ne has quit IRC07:44
*** e0ne has joined #openstack-infra07:45
*** mmaglana has quit IRC07:48
*** e0ne has quit IRC07:49
*** hashar has joined #openstack-infra07:49
*** StevenK has quit IRC07:50
*** cody-somerville has joined #openstack-infra07:50
<openstackgerrit> lifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files  https://review.openstack.org/99563 [07:50]
<openstackgerrit> lifeless proposed a change to stackforge/gertty: Hide fully reviewed projects by default  https://review.openstack.org/99591 [07:50]
<lifeless> clarkb: ^ I think you'll like this shiny [07:50]
*** dizquierdo has joined #openstack-infra07:51
*** jcoufal has quit IRC07:51
*** StevenK has joined #openstack-infra07:51
*** achuprin_ has quit IRC07:55
*** ihrachyshka has joined #openstack-infra07:55
*** rcarrill` has joined #openstack-infra07:58
*** freyes has joined #openstack-infra07:58
*** rcarrillocruz has quit IRC08:00
*** jcoufal has joined #openstack-infra08:00
*** ihrachyshka has quit IRC08:02
*** jistr has joined #openstack-infra08:02
*** ihrachyshka has joined #openstack-infra08:02
*** _nadya_ has quit IRC08:03
*** oomichi has quit IRC08:05
*** andreykurilin_ has joined #openstack-infra08:06
*** achuprin_ has joined #openstack-infra08:10
*** derekh_ has joined #openstack-infra08:11
*** enikanorov__ has quit IRC08:12
*** fbo_away is now known as fbo08:14
*** jcoufal has quit IRC08:14
*** plars has quit IRC08:15
*** plomakin_ has quit IRC08:16
*** Hal_ has joined #openstack-infra08:16
*** skraynev has quit IRC08:16
*** tnurlygayanov has quit IRC08:17
*** ilyashakhat has quit IRC08:17
*** markmc has joined #openstack-infra08:19
*** locke105 has quit IRC08:19
*** srenatus has quit IRC08:24
*** srenatus has joined #openstack-infra08:25
*** pelix has joined #openstack-infra08:26
*** andreykurilin_ has quit IRC08:28
*** talluri has joined #openstack-infra08:33
*** andreykurilin_ has joined #openstack-infra08:34
<lifeless> SergeyLukjanov: hey [08:35]
<lifeless> SergeyLukjanov: have a look at review 92749 [08:35]
*** talluri has quit IRC [08:35]
<lifeless> (or jhesketh) ^ [08:35]
<lifeless> Brocade OSS CI is commenting with links to status.o.o/zuul - I think they're spinning up a new 3rd-party system, badly. [08:36]
<lifeless> might want to pull their access [08:36]
<lifeless> before they mess everyone up :) [08:36]
<jhesketh> lifeless: might be worth reaching out to them and letting them know before pulling access out from under their feet [08:38]
*** cody-somerville has quit IRC [08:39]
*** sarob has joined #openstack-infra [08:39]
*** afazekas has joined #openstack-infra [08:42]
<lifeless> jhesketh: I have no idea how to do that [08:42]
<jhesketh> the email on the account is DL-GRP-VYATTA-OSS@Brocade.com [08:43]
*** rlandy has joined #openstack-infra [08:43]
*** jcoufal has joined #openstack-infra [08:43]
*** sarob has quit IRC [08:43]
<isviridov> SergeyLukjanov, could you take a look at https://review.openstack.org/#/c/99039/ and https://review.openstack.org/#/c/91050/ [08:44]
*** rcarrillocruz has joined #openstack-infra [08:45]
<lifeless> they've commented on, looks like, hundreds of reviews in the last few hours [08:46]
<lifeless> hmmm, no - gmail fail on screen size [08:46]
<lifeless> 30ish [08:46]
<lifeless> I will mail them, cc infra [08:46]
<jhesketh> okay [08:46]
<jhesketh> thanks [08:46]
<jhesketh> might be worth removing their access, but let's try email first [08:46]
<jhesketh> SergeyLukjanov might disagree and just revoke it anyway ;-) [08:47]
*** rcarrill` has quit IRC08:47
<openstackgerrit> A change was merged to openstack-infra/config: ceilometer: enable gate-grenade-dsvm-forward  https://review.openstack.org/97430 [08:48]
<lifeless> email sent [08:48]
*** dims_ has joined #openstack-infra08:49
*** andreykurilin_ has quit IRC08:51
*** cody-somerville has joined #openstack-infra08:52
<sweston> jhesketh: lifeless: hello [08:52]
<sweston> that is my system, apologies [08:53]
<lifeless> sweston: hi [08:53]
<sweston> I stopped the zuul service [08:53]
*** Alexei_987 has quit IRC [08:53]
<lifeless> thank you [08:53]
<sweston> can you verify that we are no longer posting back? [08:53]
*** e0ne has joined #openstack-infra [08:53]
<sweston> sorry, this is my first attempt at this [08:54]
*** dims_ has quit IRC [08:54]
<sweston> should I just take everything out of my projects.yaml file except for the sandbox and try again? [08:55]
<sweston> guess that would be my layout.yaml file in /etc/zuul [08:56]
<jhesketh> sweston: you should modify your zuul's layout.yaml to not report to gerrit [08:56]
<jhesketh> instead try setting up an smtp reporter [08:56]
<jhesketh> so you can email yourself results while you set it up [08:56]
*** mrmartin has joined #openstack-infra [08:57]
<mrmartin> re [08:57]
* sweston is looking up zuul docs [08:57]
*** ominakov has joined #openstack-infra [09:01]
*** achuprin_ has quit IRC [09:03]
<sweston> jhesketh: so I should change the success and failure parameters in each of the pipeline definitions? [09:03]
<sweston> jhesketh: lifeless: nevermind, I have the correct layout file now.  apologies again for the disturbance to your systems [09:13]
<lifeless> sweston: np, thanks for taking prompt action [09:14]
<sweston> lifeless: sure thing, the least I could do :-) [09:14]
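jhesketh's suggestion translates to a small layout.yaml change. A hypothetical sketch for the zuul of that era — the pipeline name, trigger event, and address are placeholders, not sweston's actual config:

```yaml
# Hypothetical /etc/zuul/layout.yaml fragment: while bringing a
# third-party CI up, report results by SMTP instead of commenting
# on gerrit reviews.
pipelines:
  - name: check
    manager: IndependentPipelineManager
    trigger:
      gerrit:
        - event: patchset-created
    success:
      smtp:
        to: third-party-ci-admin@example.com
    failure:
      smtp:
        to: third-party-ci-admin@example.com
```

Replacing the pipeline's `success`/`failure` reporter sections this way lets the operator see exactly what would have been posted, with no risk of spamming reviews.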
*** habib has joined #openstack-infra09:16
*** achuprin_ has joined #openstack-infra09:16
*** chianingwang has joined #openstack-infra09:18
*** _nadya_ has quit IRC09:19
*** chianingwang has quit IRC09:23
*** zhiyan is now known as zhiyan_09:23
*** amcrn has quit IRC09:24
*** andreykurilin_ has joined #openstack-infra09:24
<yjiang> hi folks, I use the "Gerrit trigger" (not zuul) to trigger OpenStack 3rd-party tests. I've set the Verify values "Successful 1", "Failed -1". My CI test is triggered as normal, but the "Verified +/-1" is not shown on the review page after the CI test is done. Why does this happen? Could anyone help answer my question? Any suggestion is appreciated. Thanks a lot! [09:32]
*** zehicle_at_dell has joined #openstack-infra09:33
*** sarob has joined #openstack-infra09:38
*** andreykurilin_ has quit IRC09:38
*** sarob_ has joined #openstack-infra09:39
<Kiall> yjiang: your 3rd-party testing account needs special permissions to give a Verified vote - that's rarely given out, and when it is, the 3rd-party testing needs a track record of producing useful and accurate tests etc etc [09:41]
*** sarob has quit IRC09:43
*** denis_makogon has quit IRC09:43
*** sarob_ has quit IRC09:44
*** e0ne_ has joined #openstack-infra09:44
*** e0ne has quit IRC09:48
<SergeyLukjanov> lifeless, jhesketh, is brocade ci still sending incorrect links? [09:52]
<SergeyLukjanov> oh, I see the dialog with sweston [09:53]
<sweston> SergeyLukjanov: yes, sorry about that [09:53]
<SergeyLukjanov> sweston, np, it's not like you were spamming all projects ;) [09:54]
*** oomichi has joined #openstack-infra [09:54]
<SergeyLukjanov> fungi, jeblair, clarkb, jhesketh, mordred, there are four holiday days in russia, so I'll have limited availability till Monday [09:54]
<sweston> SergeyLukjanov: yup, understood. Won't be making that mistake again :-) [09:55]
<openstackgerrit> Kiall Mac Innes proposed a change to openstack-infra/config: Move unbound DNS recursor instance to 127.0.2.1  https://review.openstack.org/99611 [09:57]
*** zehicle_at_dell has quit IRC10:02
*** zehicle_at_dell has joined #openstack-infra10:03
*** ominakov has quit IRC10:05
yjiangKiall: OK, thanks for the reply. Another question: if I want to apply for these "Verified vote" rights, what specific targets should I meet? For example, how many bugs my tests have found, or maybe something else.10:05
KiallHonestly, not sure :) I just know it's handed out sparingly after proving reliable :)10:06
yjiangKiall: OK,thx!10:07
*** ominakov has joined #openstack-infra10:10
*** ihrachyshka has quit IRC10:11
openstackgerritlifeless proposed a change to openstack-infra/infra-specs: Make use of IP per slave optional.  https://review.openstack.org/9562510:13
*** skraynev has joined #openstack-infra10:14
*** talluri has joined #openstack-infra10:16
*** talluri has quit IRC10:20
*** matsuhashi has quit IRC10:20
openstackgerritlifeless proposed a change to stackforge/gertty: Don't crash on comments on unchanged files  https://review.openstack.org/9956310:20
*** matsuhashi has joined #openstack-infra10:20
*** jerryz has quit IRC10:21
*** yamahata has quit IRC10:21
*** dims_ has joined #openstack-infra10:23
*** amcrn has joined #openstack-infra10:24
*** matsuhashi has quit IRC10:25
*** Hal_ has quit IRC10:26
*** yaguang has quit IRC10:30
cgoncalvesHi. When re-submitting a patchset to Gerrit can one change the topic? I'm concerned about Gerrit possibly re-generating a new Change-Id.10:33
gilliardIIRC you can't _just_ change the topic because the git commit hash will be the same.10:35
*** e0ne_ has quit IRC10:37
cgoncalvesgilliard: I wouldn't only be changing the topic, so based on what you just said I believe it should be fine changing the topic. thanks10:37
*** talluri has joined #openstack-infra10:37
gilliardBut if you change the commit msg or some code it should keep the same change-id10:37
gilliard(just testing that)10:37
*** e0ne has joined #openstack-infra10:37
gilliardconfirmed/10:37
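The behaviour gilliard just confirmed can be reproduced locally. A hedged, runnable sketch: Gerrit's Change-Id is a footer in the commit message, so amending a commit (new hash, same message) keeps the same Change-Id. The commit-msg hook below is a toy stand-in for Gerrit's real hook, and every path is a throwaway temp directory.

```python
import os, stat, subprocess, tempfile

def git(repo, *args):
    """Run a git command in `repo` and return its stdout."""
    return subprocess.run(["git", *args], cwd=repo, check=True,
                          capture_output=True, text=True).stdout.strip()

repo = tempfile.mkdtemp()
git(repo, "init", "-q")
git(repo, "config", "user.email", "demo@example.com")
git(repo, "config", "user.name", "demo")

# Stand-in hook: append a fixed Change-Id only when the message lacks one.
hook = os.path.join(repo, ".git", "hooks", "commit-msg")
with open(hook, "w") as f:
    f.write("#!/bin/sh\n"
            "grep -q '^Change-Id:' \"$1\" || "
            "echo 'Change-Id: I0123456789abcdef0123456789abcdef01234567' >> \"$1\"\n")
os.chmod(hook, os.stat(hook).st_mode | stat.S_IEXEC)

with open(os.path.join(repo, "file"), "w") as f:
    f.write("one")
git(repo, "add", "file")
git(repo, "commit", "-q", "-m", "demo change")
hash1 = git(repo, "rev-parse", "HEAD")
id1 = [l for l in git(repo, "log", "-1", "--format=%B").splitlines()
       if l.startswith("Change-Id:")]

with open(os.path.join(repo, "file"), "w") as f:
    f.write("two")
git(repo, "add", "file")
git(repo, "commit", "-q", "--amend", "--no-edit")  # re-runs the hook, too
hash2 = git(repo, "rev-parse", "HEAD")
id2 = [l for l in git(repo, "log", "-1", "--format=%B").splitlines()
       if l.startswith("Change-Id:")]

print(hash1 != hash2, id1 == id2)  # the hash changes, the Change-Id does not
```

Pushing the amended commit to the same Change-Id is what makes Gerrit treat it as a new patchset of the existing change rather than a new change.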
cgoncalvesgilliard: much appreciated :-)10:38
*** kmartin has quit IRC10:38
*** sarob has joined #openstack-infra10:39
*** e0ne has quit IRC10:41
*** thomasbiege has joined #openstack-infra10:42
*** sarob has quit IRC10:43
*** rcarrillocruz has quit IRC10:48
*** rcarrillocruz has joined #openstack-infra10:52
*** _nadya_ has joined #openstack-infra10:56
*** srenatus has quit IRC10:59
*** srenatus has joined #openstack-infra10:59
*** atiwari has quit IRC11:04
*** _nadya_ has quit IRC11:05
openstackgerritBob Ball proposed a change to openstack-infra/nodepool: Support install phase with nodepool  https://review.openstack.org/9778711:06
openstackgerritBob Ball proposed a change to openstack-infra/nodepool: Support nodes with launch condition  https://review.openstack.org/9779811:06
*** nosnos has quit IRC11:13
*** e0ne has joined #openstack-infra11:15
*** e0ne has quit IRC11:17
*** e0ne has joined #openstack-infra11:17
* fungi is going to try and get a few openstack things done today, but is getting mired in final moving details and constant interruptions by agents trying to show the current house, so probably not around for a lot of general support11:19
fungiclarkb: mordred: ^11:19
*** ihrachyshka has joined #openstack-infra11:20
sdaguefungi: with that being said, could I get you to promote 99396, as working on readability for grenade (which should help sort some of these bugs) is somewhat limited until that lands11:21
*** trinaths has joined #openstack-infra11:22
*** e0ne has quit IRC11:22
fungisdague: sure11:22
*** srenatus has quit IRC11:23
sdagueI also think we need to kick all of ironic out of the gate11:23
*** andreykurilin_ has joined #openstack-infra11:23
*** srenatus has joined #openstack-infra11:24
*** tkelsey has quit IRC11:24
*** [1]trinaths has joined #openstack-infra11:24
*** basha has quit IRC11:26
sdagueyeh, gate-tempest-dsvm-virtual-ironic has a 33% fail rate in the gate11:26
fungiouch11:26
sdagueI think we need to make that test non voting11:26
sdaguebasically it's a terrible configuration11:26
sdaguerelies massively more on the network than the rest of the world11:27
*** trinaths has quit IRC11:27
*** [1]trinaths is now known as trinaths11:27
*** yjiang has quit IRC11:28
*** plomakin has joined #openstack-infra11:29
sdaguefungi: if I push a config for that, can you fast path it?11:29
sdague30 gate fails in the last 48 hrs11:30
openstackgerritSean Dague proposed a change to openstack-infra/config: disable voting on gate-tempest-dsvm-virtual-ironic  https://review.openstack.org/9963011:34
*** basha has joined #openstack-infra11:38
*** andreykurilin_ has quit IRC11:39
*** sarob has joined #openstack-infra11:39
*** mrmartin has quit IRC11:42
*** basha has quit IRC11:42
*** sarob has quit IRC11:43
*** lcheng has quit IRC11:44
*** CaptTofu_ has joined #openstack-infra11:44
*** e0ne has joined #openstack-infra11:45
*** CaptTofu_ has quit IRC11:45
*** CaptTofu_ has joined #openstack-infra11:45
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: gate-tempest-dsvm-virtual-ironic is in our gate  https://review.openstack.org/9963511:46
sdaguefungi: so... the non-voting?11:50
*** basha has joined #openstack-infra11:53
*** yfried_ has joined #openstack-infra11:53
openstackgerritA change was merged to openstack-infra/elastic-recheck: gate-tempest-dsvm-virtual-ironic is in our gate  https://review.openstack.org/9963511:54
*** thomasbiege has quit IRC11:55
*** yfried has quit IRC11:56
*** mrmartin has joined #openstack-infra11:56
*** lbragstad has quit IRC11:59
*** thomasbiege has joined #openstack-infra12:00
*** dprince has joined #openstack-infra12:03
*** mugsie has joined #openstack-infra12:07
fungisdague: back now, and approved12:08
sdaguefungi: thanks12:08
sdagueI sent an email to the list describing the reasons as well12:08
fungiexcellent. if the ironic devs are worried about that letting breakage slip through... well...12:09
fungii guess the alternative is to go back to not running any shared jobs12:09
fungiso that it can break in its own queue without slowing down the main integrated queue12:10
sdagueyeh, it only turns off the vote in gate12:11
*** yamahata has joined #openstack-infra12:11
fungioh, that too12:11
fungia very good point12:11
sdagueI did try to do the minimum impact thing here :)12:12
openstackgerritA change was merged to openstack-infra/config: disable voting on gate-tempest-dsvm-virtual-ironic  https://review.openstack.org/9963012:13
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js  https://review.openstack.org/9963812:15
*** dkranz has joined #openstack-infra12:19
*** smarcet has joined #openstack-infra12:20
*** adalbas has joined #openstack-infra12:20
*** dims_ has quit IRC12:22
*** dims_ has joined #openstack-infra12:23
*** aysyd has joined #openstack-infra12:23
sdaguefungi: oh ffs, new fails12:23
*** hashar has quit IRC12:24
*** e0ne_ has joined #openstack-infra12:24
*** basha has quit IRC12:24
*** weshay has joined #openstack-infra12:25
*** dims_ is now known as dims12:25
fungiit is feature breeding season after all12:27
*** e0ne has quit IRC12:28
*** ArxCruz has joined #openstack-infra12:30
*** amotoki has quit IRC12:31
*** talluri has quit IRC12:33
*** talluri has joined #openstack-infra12:33
*** dkliban_afk is now known as dkliban12:36
*** fanhe has joined #openstack-infra12:37
*** talluri has quit IRC12:38
openstackgerritDenis M. proposed a change to openstack-infra/config: Added experimental job for trove mongodb functional tests  https://review.openstack.org/9964412:38
*** denis_makogon has joined #openstack-infra12:38
*** rfolco has joined #openstack-infra12:39
*** sarob has joined #openstack-infra12:39
*** maxbit has joined #openstack-infra12:39
*** Hal_ has joined #openstack-infra12:40
*** chandan_kumar has joined #openstack-infra12:41
*** thomasbiege has left #openstack-infra12:43
*** crc32 has quit IRC12:43
*** sarob has quit IRC12:44
*** chandan_kumar has quit IRC12:44
openstackgerritA change was merged to openstack-infra/config: Don't deny visibility of ICLA group to its members  https://review.openstack.org/9922312:45
*** amcrn has quit IRC12:46
*** bradm has quit IRC12:46
*** openstackgerrit has quit IRC12:46
*** yfried_ has quit IRC12:46
*** bradm has joined #openstack-infra12:46
*** openstackgerrit has joined #openstack-infra12:48
*** oomichi has quit IRC12:49
*** eharney has joined #openstack-infra12:50
*** chandankumar has joined #openstack-infra12:52
*** dims_ has joined #openstack-infra12:55
*** prad has joined #openstack-infra12:56
*** dims has quit IRC12:56
*** lbragstad has joined #openstack-infra12:57
andreafclarkb: ping13:00
*** julim has joined #openstack-infra13:01
openstackgerritDenis M. proposed a change to openstack-infra/config: Added experimental job for trove mongodb functional tests  https://review.openstack.org/9964413:02
*** yamahata has quit IRC13:03
*** yamahata has joined #openstack-infra13:03
*** ihrachyshka has quit IRC13:05
*** ihrachyshka has joined #openstack-infra13:07
*** sweston has quit IRC13:09
*** mriedem has joined #openstack-infra13:12
*** hashar has joined #openstack-infra13:12
*** dkranz has quit IRC13:14
*** mrmartin has quit IRC13:14
*** CaptTofu_ has quit IRC13:15
*** CaptTofu_ has joined #openstack-infra13:16
*** thuc_ has joined #openstack-infra13:18
*** mbacchi has joined #openstack-infra13:20
*** CaptTofu_ has quit IRC13:20
*** oomichi has joined #openstack-infra13:22
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js  https://review.openstack.org/9963813:27
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery  https://review.openstack.org/9966013:27
*** mrmartin has joined #openstack-infra13:31
*** jergerber has joined #openstack-infra13:32
*** talluri has joined #openstack-infra13:34
*** CaptTofu_ has joined #openstack-infra13:34
*** tkelsey has joined #openstack-infra13:35
*** zehicle_at_dell has quit IRC13:36
*** mwagner_lap has quit IRC13:36
*** talluri has quit IRC13:38
*** sarob has joined #openstack-infra13:39
*** crc32 has joined #openstack-infra13:40
*** crc32 has quit IRC13:41
*** sileht has quit IRC13:43
*** trinaths has quit IRC13:43
*** sarob has quit IRC13:43
BobBallAnyone know how the yaml python package gets into the VMs?13:44
BobBalldevstack-gate/test-matrix.py needs yaml but it's not installed on the xs CI vms initially, but it _does_ get installed later by something (system package) - but I can't figure out where and it's bugging me!13:44
fungiBobBall: perhaps it's installed by devstack?13:45
BobBallnot that I could see :/13:45
BobBalla bunch of puppet stuff depends on yaml - but the ones that do don't appear to be depended on by a node...13:46
*** crc32 has joined #openstack-infra13:47
*** mfer has joined #openstack-infra13:47
*** bknudson has joined #openstack-infra13:50
fungiyeah, this is perplexing... still looking13:50
*** ihrachyshka has quit IRC13:51
fungilooking at a random devstack-precise node, python-yaml 3.10-2 is installed from a deb13:52
*** homeless has joined #openstack-infra13:53
*** yolanda has joined #openstack-infra13:54
BobBallhmmm...13:54
BobBalllemme check something...13:54
*** beekneemech is now known as bnemec13:54
openstackgerritA change was merged to openstack-infra/storyboard: Removed PostGres from code and documentation  https://review.openstack.org/9887013:54
*** yfried_ has joined #openstack-infra13:55
*** Longgeek has joined #openstack-infra13:55
*** ihrachyshka has joined #openstack-infra13:55
BobBallit's installed as a dependency13:55
BobBallhttp://paste.openstack.org/show/83806/13:56
*** basha has joined #openstack-infra13:56
*** yfried_ has quit IRC13:56
*** reaper has joined #openstack-infra13:56
fungiahh, yep13:56
*** yfried_ has joined #openstack-infra13:56
*** zz_gondoi is now known as gondoi13:57
BobBallnot sure what the dependency path is ... but that's why it's there after devstack has run13:57
*** Longgeek_ has quit IRC13:57
*** jistr has quit IRC14:00
*** timrc-afk is now known as timrc14:01
*** malini1 has joined #openstack-infra14:01
*** sileht has joined #openstack-infra14:02
fungiBobBall: dpkg -l | grep -e $(echo `apt-cache rdepends python-yaml|grep '^  '`|sed 's/ / -e /g')14:02
fungilooks like cloud-init, cloud-utils and python-kombu may be responsible (recurse as needed)14:02
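fungi's one-liner walks dpkg's reverse dependencies by hand. The same question can be sketched in Python over an illustrative dependency map; the package relationships below are toy data mirroring the cloud-init/python-kombu hypothesis, not a real dpkg database.

```python
def reverse_dependents(target, depends):
    """Return every package whose dependency chain reaches `target`."""
    # Invert the "package -> dependencies" map, then walk it breadth-first.
    rdeps = {}
    for pkg, deps in depends.items():
        for dep in deps:
            rdeps.setdefault(dep, set()).add(pkg)
    found, queue = set(), [target]
    while queue:
        for parent in rdeps.get(queue.pop(), ()):
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return found

# Toy data: who (transitively) pulls in python-yaml?
deps = {
    "cloud-init": ["python-yaml", "cloud-utils"],
    "python-kombu": ["python-yaml"],
    "devstack": ["python-kombu"],
}
print(sorted(reverse_dependents("python-yaml", deps)))
# -> ['cloud-init', 'devstack', 'python-kombu']
```

This is the "recurse as needed" fungi mentions: `apt-cache rdepends` only gives one level, so the walk has to be repeated until it closes.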
*** jistr has joined #openstack-infra14:02
malini1hello ! We are running into a heat timeout failure at the gate, which is blocking our patches. I see that there is already a query associated with this one https://github.com/openstack-infra/elastic-recheck/blob/master/queries/1306029.yaml14:03
BobBallgreat - thanks - just what I wanted fungi!14:03
malini1What else do I need to make this part of elastic recheck?14:03
fungimalini1: http://docs.openstack.org/infra/elastic-recheck/readme.html#adding-bug-signatures14:04
*** mugsie has quit IRC14:04
*** mugsie has joined #openstack-infra14:05
malini1fungi: it seems like Step 4 in that link is already in place. There is already a query in the repo https://github.com/openstack-infra/elastic-recheck/blob/master/queries/1306029.yaml14:06
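For reference, an elastic-recheck query file is a small YAML document: the filename carries the Launchpad bug number and the body is a logstash query. The query text below is invented for illustration and is not the real 1306029 signature:

```yaml
# queries/1306029.yaml -- filename is the launchpad bug number
query: >-
  message:"Stack BUILD timed out" AND tags:"console"
```

When a gate job fails, elastic-recheck runs each query against the indexed logs and comments on the review if one matches.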
fungimalini1: also see the logstash job queue graph at the bottom of http://status.openstack.org/zuul/ indicating that we're currently running on a perpetual log processing backlog, so i think right now elastic-recheck is giving up most/all of the time waiting for logs of a failed job to get indexed (it only waits up to 15 minutes before deciding it's been too long)14:07
fungimalini1: i think clarkb was planning to dig back into possible causes/remediation once he's awake14:08
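The 15-minute bound fungi mentions is essentially a deadline-limited poll: keep checking whether the failed job's logs have been indexed, and give up once the deadline passes. A hedged sketch with illustrative names (the real elastic-recheck internals differ); the fake clock makes the example run instantly.

```python
import time

def wait_for_logs(check_indexed, timeout=900, interval=30,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll `check_indexed` until it succeeds or `timeout` seconds pass."""
    deadline = clock() + timeout
    while clock() < deadline:
        if check_indexed():
            return True
        sleep(interval)
    return False  # backlog too long; skip classifying this failure

# Simulated run: logs become visible 120 "seconds" in.
t = [0.0]
result = wait_for_logs(lambda: t[0] >= 120.0, timeout=900, interval=30,
                       clock=lambda: t[0],
                       sleep=lambda s: t.__setitem__(0, t[0] + s))
print(result)  # True: logs appeared well before the 15-minute deadline
```

During a perpetual indexing backlog like the one described above, `check_indexed` never succeeds inside the window, so the classifier silently gives up on most failures.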
malini1Thanks fungi ! I cud buy clarkb some caffeine ;)14:09
malini1fungi: So for now, my only option is to wait -rt?14:09
fungimalini1: and in the meantime look at the failure logs yourself and identify the bug that way, when possible14:10
*** pcrews has joined #openstack-infra14:10
malini1fungi: can I still use the 'recheck bug #', if elastic recheck doesnt find it?14:11
fungimalini1: yes14:11
malini1fungi: cool! I didnt know that14:11
malini1It solves our problem14:11
malini1thanks again14:13
*** shayneburgess has joined #openstack-infra14:15
*** shayneburgess has quit IRC14:15
*** sweston has joined #openstack-infra14:15
*** wenlock_ has joined #openstack-infra14:17
*** atiwari has joined #openstack-infra14:18
*** otherwiseguy has joined #openstack-infra14:19
*** mrmartin has quit IRC14:19
*** basha has quit IRC14:21
*** UtahDave has joined #openstack-infra14:22
*** malini1 has quit IRC14:25
*** malini1 has joined #openstack-infra14:25
*** malini1 has quit IRC14:26
openstackgerritA change was merged to openstack-infra/elastic-recheck: Add query for ceilometer test_notify_alarm bug 1321826  https://review.openstack.org/9814914:27
uvirtbotLaunchpad bug 1321826 in ceilometer "periodic notifier unit test failure" [Medium,In progress] https://launchpad.net/bugs/132182614:27
*** malini1 has joined #openstack-infra14:27
*** trinaths has joined #openstack-infra14:27
*** zehicle_at_dell has joined #openstack-infra14:27
*** jcoufal has quit IRC14:27
*** otherwiseguy has quit IRC14:31
*** basha has joined #openstack-infra14:31
openstackgerritA change was merged to openstack-infra/reviewday: Prettified all HTML files  https://review.openstack.org/9865514:31
*** shayneburgess has joined #openstack-infra14:34
Kiallfungi / clarkb: I put this review together in an attempt to fix that unbound issue mentioned during the Designate meet yesterday, approach seem OK? https://review.openstack.org/#/c/99611/14:34
*** radez_g0n3 is now known as radez14:34
*** otherwiseguy has joined #openstack-infra14:35
*** shayneburgess has quit IRC14:38
*** rdopieralski has quit IRC14:38
*** sarob has joined #openstack-infra14:39
*** timrc is now known as timrc-afk14:40
*** yamahata has quit IRC14:41
phschwartzWhat are the requirements for CLA for the puppet-* acl  configs for gerrit? Currently only one of the 6 in the repo require cla and I currently have a -1 from push back for not having it in my file I am adding.14:42
*** basha has quit IRC14:42
*** sarob has quit IRC14:43
openstackgerritA change was merged to openstack-infra/storyboard-webclient: Refresh token support  https://review.openstack.org/9547814:43
phschwartzfungi: mordred: anteaya: clarkb: jeblair: ^ any of you have a weigh in?14:44
*** basha has joined #openstack-infra14:44
*** mkerrin has quit IRC14:44
mordred_phonephschwartz: I believe we decided that they don't need it14:45
phschwartzawesome, makes that easy. Now to bug you core reviewers for an approval ;)14:46
fungiphschwartz: the status of cla as a requirement for things not distributed as a component of an openstack cloud is currently somewhat muddy. at a recent infra team meeting we resolved for now that if it derives in any way from another project which isn't currently enforcing the cla then it's not necessary14:46
fungiphschwartz: so if you're copy/pasting some bits from openstack-infra/config for example, that repo already doesn't have a cla requrement14:46
fungirequirement14:46
*** shayneburgess has joined #openstack-infra14:47
* fungi digs up a link to use for reference14:47
mordreddstufft: just read an article about DNF which is the new replacement for yum for fedora 2214:48
mordreddstufft: it apparently uses "new dep solver technology which is faster and uses less memory"14:48
mordreddstufft: since it's in python - maybe that dep solver can be re-used?14:48
*** timrc-afk is now known as timrc14:49
fungiphschwartz: http://eavesdrop.openstack.org/meetings/infra/2014/infra.2014-06-03-19.02.log.html#l-275 (you can skip down to around 19:50:15)14:50
mordreddstufft: nevermind. its much less in python and much more in C14:50
*** ArxCruz has quit IRC14:51
*** wenlock_ has quit IRC14:52
*** thedodd has joined #openstack-infra14:52
phschwartzso if any core wants to take a look at https://review.openstack.org/#/c/93953/ it would be appreciated, as the -1 should now be a moot point.14:54
phschwartzjhesketh: ping, you around?14:54
*** kmartin has joined #openstack-infra14:55
phschwartzfungi: ty, that was a good read as to what you guys are thinking and want to use as the current model to follow.14:56
*** jistr has quit IRC14:56
*** jistr has joined #openstack-infra14:56
*** ArxCruz has joined #openstack-infra14:56
fungiphschwartz: jhesketh probably won't be awake for a while... it's pretty dark in au right about now14:56
*** sarob has joined #openstack-infra14:57
phschwartzah, didn't realize he was in AU.14:57
phschwartzlol14:57
openstackgerritNikhil Manchanda proposed a change to openstack-infra/config: Added new experimental job for trove functional tests  https://review.openstack.org/9851714:57
openstackgerritNikhil Manchanda proposed a change to openstack-infra/config: Use job-template for gate-trove-buildimage jobs  https://review.openstack.org/9968014:57
*** james_li has joined #openstack-infra14:58
*** oomichi has quit IRC14:58
*** marun has joined #openstack-infra14:58
*** flaper87 is now known as flaper87|afk15:00
openstackgerrityolanda.robla proposed a change to openstack-infra/storyboard-webclient: Display dates in timeago format  https://review.openstack.org/9671315:00
*** jlibosva has quit IRC15:01
*** sarob has quit IRC15:01
fungiKiall: since designate is officially incubated now, i suppose accommodating its typical configuration with our job workers is reasonable even if it means running the local resolver daemon on a less typical address15:02
*** jlibosva has joined #openstack-infra15:02
*** sarob has joined #openstack-infra15:03
mordredfungi: I agree15:03
yolandamordred, cboylan, doing some tests on nodepool in dib, but i receive this error:15:03
yolandaStderr: "qemu-img: 'image' uses a qcow2 feature which is not supported by this qemu version: QCOW version 3\nqemu-img: Could not open '/var/lib/nova/instances/_base/94d0200a5bb90968e0e40f682f9e187025d84276.part': Operation not supported\n"15:03
KiallWell, I was more aiming to put unbound on an address no sane service/config would normally do - It move it once and be done with it15:03
Kialli.e. move it once*15:03
yolandahave you seen that before? looks as something related with dib15:03
fungiKiall: yes, it seems like a reasonable approach15:04
*** lcostantino has joined #openstack-infra15:04
*** reed has joined #openstack-infra15:05
fungiKiall: out of curiosity, though, why did designate care about the loopback address? wouldn't it be a service running inside an instance rather than in the devstack host context?15:05
*** dprince has quit IRC15:06
fungiKiall: or is it not a service vm in the same vein as, say, trove?15:06
KiallDesignate can run right next to nova etc, or inside nova VMs.. The DevStack gate makes running it alongside nova easier15:06
KiallNot quite the same - but there are plans in that direction for certain use cases15:06
anteayasweston: shivharis was looking for me yesterday and I was away, please let me know if there is anything you need from me15:06
fungiKiall: okay, fair enough15:06
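For context, the change under review moves the job workers' local unbound recursor off 127.0.0.1 so Designate's own resolver can claim the usual address. A hypothetical unbound.conf fragment in that spirit (the target address comes from review 99611; the remaining options are illustrative):

```
server:
    # Bind to a loopback address no service under test is likely to claim:
    interface: 127.0.2.1
    access-control: 127.0.0.0/8 allow
```

The host's /etc/resolv.conf would then point `nameserver` at 127.0.2.1 instead of 127.0.0.1.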
anteayasweston: talking to one person is easier for me than doing cross-purposes with two15:07
swestonanteaya:  hello15:07
*** sarob has quit IRC15:07
anteayasweston: hi15:07
anteayasweston: anything you need from me?15:08
mordredyolanda: I haven't seen that - may want to ask the tripleo folks - i agree, it seems like a dib issue15:08
swestonanteaya: yes, I understand.  We had a meeting yesterday to coordinate everything across our three business units working with OpenStack plugins.15:08
*** vhoward has left #openstack-infra15:09
anteayasweston: yay15:09
anteayasweston: how did it go?15:09
yolandamordred, looks as we need to call qemu-img with --compat=1.0 , at least for the environment i'm using for testing, but there is no option like that in dib15:10
swestonyes, I do need to speak with you, may we continue our conversation out of band?15:10
*** basha has quit IRC15:10
mordredyolanda: yah. that'll definitely be a question for #tripleo15:11
yolandasure, heading there15:11
swestonanteaya: it went very well.15:11
anteayasweston: let's go to -dev then, I want to ensure we are in a logged channel15:11
*** ihrachyshka_ has joined #openstack-infra15:12
swestonanteaya:  ok, we can continue here as well.15:12
*** ihrachyshka_ has quit IRC15:12
anteayaokay15:12
* anteaya listens15:12
swestonanteaya: so, I now have zuul posting back to the openstack-dev sandbox project15:13
anteayasweston: let's back up15:13
anteayahow many gerrit accounts are you tracking with brocade?15:13
*** ramashri has joined #openstack-infra15:13
swestonanteaya: ok, sure.  Right now we have three15:14
anteayawhat are they called?15:15
swestonanteaya:  is that consistent with what you understand?15:15
phschwartzIs there a reason why we don't use rebuild to reuse instances from nodepool, but delete them completely and recreate?15:15
anteayathere is a system called Brocade CI that I don't have on my lists15:15
phschwartzWith some changes on the nova side for us, we could cut the instance creation times by a lot.15:15
*** ihrachyshka has quit IRC15:15
anteayahttps://etherpad.openstack.org/p/automated-gerrit-account-naming-format15:15
*** doug-fish has joined #openstack-infra15:16
anteayaline 24, 25 and 2615:16
*** andreaf has quit IRC15:16
swestonanteaya:  yes, Brocade CI is one of ours, another one should be named Brocade OSS CI, and I need to look up the third15:16
anteayaI don't have a Brocade CI15:16
*** shayneburgess has quit IRC15:16
anteayaI have Brocade ADX CI, Brocade OSS CI, and Brocade Tempest15:17
swestonthat should somehow be linked to Pattabi Ayyasami15:17
*** ramashri has quit IRC15:17
*** _nadya_ has joined #openstack-infra15:17
anteayaI have Brocade ADX CI lined to that name15:17
*** otherwiseguy has quit IRC15:17
swestonanteaya: ok, then Brocade Tempest must be Shiv Haris's?15:18
*** shayneburgess has joined #openstack-infra15:18
anteayafungi: can you take a peek at the gerrit db and see if there is a Brocade CI that isn't a part of the third party gerrit group?15:18
anteayasweston: Brocade CI is the name15:18
swestonanteaya: I suspect that is linked to pattabi somehow15:19
swestonanteaya: let me give him a call and ask15:19
*** denis_makogon has quit IRC15:19
openstackgerritBen Nemec proposed a change to openstack-infra/reviewstats: Oslo project updates  https://review.openstack.org/9918415:19
anteayasweston: thanks15:19
swestonanteaya:  sure, give me a few minutes ...  :-)15:20
*** otherwiseguy has joined #openstack-infra15:21
anteayasweston: also please respond to http://lists.openstack.org/pipermail/openstack-infra/2014-June/001336.html15:21
anteayasweston: from what I can tell this is Brocade OSS CI15:21
mordredphschwartz: what does rebuild do?15:21
swestonanteaya: yes, I followed up in infra irc immediately15:21
*** _nadya_ has quit IRC15:21
mordredphschwartz: do we get an instance that looks like it had just been booted?15:22
anteayaI haven't read backscroll yet15:22
phschwartzmordred: Rebuilds an instance and brings it back to initial startup state15:22
*** markmcclain has joined #openstack-infra15:22
phschwartzmordred: If we did move to that with changes on our nova side we could pre-cache the image which would remove the redownload of the image from glance cutting down on the time it takes.15:22
swestonanteaya:  so that's what it looks like when you run zuul with the layout.yaml file unchanged15:22
mordredphschwartz: k. no, I do not believe there is a conscious reason - except it would potentially make the elastic logic need to be rethought15:22
phschwartzmordred: would only hit the glance download when the image is refreshed.15:23
*** annegent_ has joined #openstack-infra15:23
*** timrc is now known as timrc-afk15:23
*** dkranz has joined #openstack-infra15:23
mordredbecause delete and create are handled independent - so there isn't really anywhere in the system that knows "I'm done with this node, but when I delete it, I'll still need one, so let me re-build instead of delete"15:24
annegent_ttx: governance question when you have a moment (and I should know this but want to ask)15:24
mordredphschwartz: so it could work out - but it might not be the _easiest_ patch to write15:24
*** dprince has joined #openstack-infra15:25
ttxannegent_: ask15:25
mordredmight not be TOO terrible though15:25
*** CaptTofu_ has quit IRC15:25
phschwartzmordred: I have been looking at the code as I work on the throttling on error and I will see what it might take to do it.15:25
annegent_ttx: should the ptl be elected prior to incubation?15:25
annegent_the/a15:25
mordredsince nodepool marks a node for delete in its db15:25
NobodyCamfungi: happen to be around?15:25
mordredthe create code _could_ just say "I need to create a node, are there any nodes in DELETE state in the db, if so, let me nova rebuild them and set their state to BUILDING"15:26
ttxannegent_: http://git.openstack.org/cgit/openstack/governance/tree/reference/new-programs-requirements.rst15:26
mordredphschwartz: ^^15:26
ttx"Team should have a lead, selected by the team contributors"15:26
annegent_ttx: thanks! I was looking at http://git.openstack.org/cgit/openstack/governance/tree/reference/incubation-integration-requirements.rst15:26
swestonanteaya: I cannot reach him at the moment, I will take an action item to find out and get back to you as soon as I can15:26
ttxthat's only if the project requires a new program15:26
phschwartzmordred: That might be the easiest way15:26
annegent_ttx: so that's even prior to the next hop15:26
ttxannegent_: or at least concurrent15:26
*** alexpilotti has joined #openstack-infra15:26
annegent_ttx: got it, thanks. Election isn't a requirement, teams can do their own selection process.15:27
ttxannegent_: note that te wording gives some room for maneuvering15:27
openstackgerritA change was merged to openstack-infra/config: Use WatchedFileHandler to avoid copytruncate.  https://review.openstack.org/9593515:27
ttxyes15:27
mordredphschwartz: one of those times when having deletes be async makes logic _easier_15:27
annegent_ttx: thanks!15:27
annegent_ttx: in the case of a project incubating within a program, is there just one ptl?15:27
fungianteaya: https://review.openstack.org/#/q/owner:%22Brocade+CI+%253Copenstack_gerrit%2540brocade.com%253E%22,n,z15:27
mordredttx: someone said something yesterday that made me think you should be involved in something15:28
*** talluri has joined #openstack-infra15:28
phschwartzmordred: exactly. Even build async might be nice. That way it rechecks a started instance instead of deleting when it first sees an error, in case it corrects itself (yeah, I know, rax issue)15:28
mordredttx: oh!15:28
annegent_ttx: considering the training incubation within docs program15:28
fungianteaya: er, i meant https://review.openstack.org/#/q/reviewer:%22Brocade+CI+%253Copenstack_gerrit%2540brocade.com%253E%22,n,z15:28
annegent_mordred: you crack me up15:28
mordredttx: https://api.launchpad.net/devel.html#specification <--- lifeless pointed out that there _is_ a blueprints API15:29
mordredttx: I still like the 'just f-ing use storyboard' plan though15:29
*** timrc-afk is now known as timrc15:29
mordredphschwartz: well, the same logic would hit there -because if it hits a failure it just marks it with delete state15:29
mordredphschwartz: so we'd see the benefits there from the same patch15:30
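mordred's sketch above ("I need to create a node, are there any nodes in DELETE state, if so rebuild them") translates to roughly the allocation logic below. Everything here, the node-state table shape and the provider API, is a hypothetical stand-in for the real nodepool code, not its actual interfaces.

```python
BUILDING, READY, DELETE = "building", "ready", "delete"

class FakeProvider:
    """Toy cloud provider recording which path each node took."""
    def __init__(self):
        self.rebuilt, self.booted = [], []
    def rebuild(self, node_id):
        # `nova rebuild`: re-images the existing instance, skipping the
        # glance image download and full boot of a fresh create.
        self.rebuilt.append(node_id)
    def boot(self):
        # Full create: image download plus boot from scratch.
        node_id = f"node-{len(self.booted)}"
        self.booted.append(node_id)
        return node_id

def acquire_node(db, provider):
    """Prefer rebuilding a node pending deletion over creating a new one."""
    for node_id, state in db.items():
        if state == DELETE:
            provider.rebuild(node_id)
            db[node_id] = BUILDING
            return node_id
    node_id = provider.boot()
    db[node_id] = BUILDING
    return node_id

db = {"node-a": READY, "node-b": DELETE}
p = FakeProvider()
print(acquire_node(db, p))  # node-b is rebuilt rather than deleted
print(acquire_node(db, p))  # nothing left in DELETE state, so a fresh boot
```

As mordred notes, this works precisely because deletes are already async: a node sitting in DELETE state in the db is an opportunity the create path can claim before the cleanup thread gets to it.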
*** gyee has joined #openstack-infra15:30
ttxmordred: There is one for sure. I use it all the time. I wrote 25% of it. Except it does not let you create a new spec15:30
fungiNobodyCam: only barely around--what's up?15:30
ttxmordred: there is no createSpec method at project level15:30
ttxmordred: but if lifeless knows how to workaround that, I'll take it15:31
mordredttx: that sounds like potentially a launchpadlib thing rather than an API thing?15:31
ttxmordred: no, the API doesn't have the method15:31
mordredttx: still though - seriously - I'm passing it on in the name of completeness - I like the other plan better15:31
anteayafungi: okay thanks, I don't have that account listed as a member of the third party group15:31
mordredttx: wouldn't it just be a PUT with a spec body?15:31
phschwartzmordred: would you mind if I took this patch on along with the changes for the throttling? It would be during the week next week that I get to it as I am away this weekend.15:31
NobodyCamhey hey fungi :) I was looking at this slightly old bug, (https://bugs.launchpad.net/openstack-ci/+bug/1300208) and saw you commented on it. I'm hitting it here: http://logs.openstack.org/02/96902/13/check/check-tempest-dsvm-ironic/27f0c0a/logs/devstack-gate-setup-workspace-new.txt.gz#_2014-06-12_13_35_19_36015:32
uvirtbotLaunchpad bug 1300208 in openstack-ci "ERROR: the main setup script run by this job " [Undecided,New]15:32
anteayafungi: https://etherpad.openstack.org/p/automated-gerrit-account-naming-format line 24, 25, 26 are my Brocade accounts15:32
*** rfolco has quit IRC15:32
mordredphschwartz: please do!15:32
NobodyCamand wanted to check if you thought it was still valid15:32
mordredphschwartz: I'm excited about both patches15:32
fungianteaya: it's entirely likely it was misbehaving back when we didn't have a separate non-voting group and so was simply taken out of the group and never revisited later15:32
anteayafungi: ah okay15:32
anteayacan we wiggle that onto a todo list?15:32
*** thuc_ has quit IRC15:32
anteayafungi: I think One Convergence is in the same boat15:32
*** doude has quit IRC15:33
*** thuc has joined #openstack-infra15:33
ttxmordred: that's not how you create bugs or anything else. it's far from REST15:33
mordredttx: nod15:34
anteayafungi: https://review.openstack.org/#/q/reviewer:oc-neutron-test%2540oneconvergence.com+status:open,n,z15:34
anteayafungi: they aren't on my list either15:34
fungianteaya: likely so15:34
anteayathanks15:34
fungiNobodyCam: looking15:34
ttxmordred: but then maybe I miss something. i'll gladly accept example code that shows me how to create a blueprint in Launchpad from the API.15:34
anteayafungi: I'm still catching up, is there something I can do of high priority to help?15:34
fungiNobodyCam: the log line you linked to is a normal attempt by devstack-gate to discover whether there are any specific git refs calculated by zuul for a given project. git doesn't have a way to test for a remote ref without just trying to retrieve it and handling the error it returns when there isn't one to be had15:36
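fungi's description above — attempt the fetch and treat the error as "no such ref" — can be sketched roughly like this (the function name and argument layout are illustrative, not devstack-gate's actual code):

```python
import subprocess

def remote_ref_exists(workdir, repo_url, ref):
    """Probe a remote ref the only way git allows: try to fetch it
    and treat a non-zero exit status as "no such ref"."""
    result = subprocess.run(
        ["git", "fetch", repo_url, ref],
        cwd=workdir, capture_output=True)
    return result.returncode == 0
```

Fetching a ref zuul never calculated fails with a non-zero exit, which the caller handles instead of treating it as an error worth reporting.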
rainyamordred, was germany right smack dab over my birthday your idea for midcycle?!15:36
fungirainya: bierhaus birthday blowout15:37
NobodyCamfungi: Ack TY15:37
*** thuc has quit IRC15:38
fungianteaya: aside from helping me spackle and sand drywall, probably not ;)15:38
rainyafungi :) fwiw, not sure i'm going to be able to swing getting any of my team out there in person, which causes me much sadness! families for some silly reason expect us to be HOME for summer vacation15:38
fungianteaya: looked like it's been relatively quiet in here today thankfully15:38
* anteaya finds safety glasses15:38
anteayafungi: yay15:39
rainyaphschwartz mentioned remote participation, so that will at least be a consolation prize for those of us that have spouses who would kill us if we changed vacation plans15:39
anteayaokay I'm catching up on email and backscroll15:39
fungianteaya: probably just plugging at the gate job failures mostly, from an importance standpoint15:39
*** sarob has joined #openstack-infra15:39
anteayafungi: do ping if I can spin any plates15:39
anteayafungi: kk, I'll put that on my list after I am caught up15:39
mordredrainya: get less families15:40
rainyamordred, not helpful advice, but thanks15:40
mordredrainya: :)15:41
rainyamordred, was going to do NYC for my birthday this year (by myself without families!)15:41
*** CaptTofu_ has joined #openstack-infra15:41
fungiliving at the beach, i'm going to be glad to have a week away from fighting off the hordes of summer vacationers15:42
*** annegent_ has quit IRC15:42
devanandafungi: hi! have a minute to talk about gate-tempest-dsvm-virtual-ironic and its now non-voting nature?15:42
devanandafungi: that is the main job that ironic uses in our own gate15:43
phschwartzfungi: Where do you live?15:43
anteayafungi: I live in a summer vacation haven15:43
phschwartzI am near the beaches in FL and I never go to them. lol15:43
anteayafungi: Sunday nights are the best15:43
anteayafungi: never never go to the grocery store on the weekend15:43
fungidevananda: sure. sdague says it accesses the network a lot, which is causing it to fail over random network connectivity problems far more often than other jobs, something like 33% of the time now15:43
*** sarob has quit IRC15:43
fungiphschwartz: as of wednesday i'll be living in the north carolina outer banks15:44
devanandafungi: we landed https://review.openstack.org/#/c/98886/ to address that issue15:44
phschwartzAh, very nice. At least it is a place with all 4 seasons instead of 4 versions of the same season15:44
fungiphschwartz: a sandbar miles off shore, water within a few hundred feet in either direction15:45
fungihard to avoid the beach there15:45
devanandafungi: i think sdague is just proposing that we move the caching of u-c-a keyring into nodepool, so it isn't part of the d-g job prep15:45
phschwartzfungi: very hard. Here I at least live about 7 miles from the sand. lol15:45
devanandafungi: which i think is grand (though i'm not sure how to do that yet)15:45
fungidevananda: nodepool caches things which devstack has in its package lists15:45
devanandafungi: can nodepool prerun apt-add-repository?15:46
isviridovHello infra, just 2 patched for magnetodb https://review.openstack.org/#/c/91050/ and https://review.openstack.org/#/c/99039/15:46
fungidevananda: the trick is that we need to un-add it too, because we don't want other tests besides ironic's subjected to uca versions of packages15:46
devanandafungi: ironic's virtual tests can't run on precise w/o backports of certain things. I can dig up the bugs if needed15:47
*** ramashri has joined #openstack-infra15:47
*** timrc is now known as timrc-afk15:47
devanandafungi: or can we pin ironic to 14.04 nodes?15:47
fungidevananda: so possibly nodepool could grow a routine to add the repository, update package lists, retrieve the keyring package, remove the repository, update package lists again15:47
fungidevananda: though i'm sure we can pin ironic to 14.04 nodes once we have some15:48
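The routine fungi outlines maps to a short, ordered command sequence; a sketch of it as argv lists (the pocket name, the keyring package name, and how nodepool would run these are assumptions):

```python
def uca_keyring_cache_commands(pocket="cloud-archive:icehouse"):
    """Enable UCA just long enough to cache the keyring package,
    then remove the repository again so no other job is exposed to
    UCA package versions. Returns the steps as argv lists."""
    return [
        ["add-apt-repository", "-y", pocket],        # add the repo
        ["apt-get", "update"],                       # refresh indexes
        ["apt-get", "-y", "-d", "install",
         "ubuntu-cloud-keyring"],                    # -d: download only
        ["add-apt-repository", "-y", "-r", pocket],  # -r: remove repo
        ["apt-get", "update"],                       # refresh again
    ]
```

The `-d` (download-only) flag matters here: the package lands in the local apt cache without being installed, so a later job that enables UCA gets it without a network round trip.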
*** eharney_ has joined #openstack-infra15:48
*** eharney has quit IRC15:48
*** eharney_ is now known as eharney15:49
*** zns has joined #openstack-infra15:49
ttxdevananda: you're wanted in #openstack-relmgr-office15:49
fungidevananda: another option might be a specific devstack-precise-uca node type, but that would probably be nearly as much work as getting trusty implemented15:49
*** mugsie has quit IRC15:52
fungianteaya: yeah, the local wisdom seems to be dine at restaurants on weekends because vacationers are busy with check-in/check-out, and do your grocery shopping weekdays first thing in the morning or very late at night. and also if it's raining don't go to stores, restaurants, movie theaters, bowling alleys, arcades... they'll be packed like sardine cans15:52
anteayayou got it15:52
anteayathat is _exactly_ what we do15:53
anteayaand wait for October15:53
anteayathey all leave by then15:53
fungiyup15:53
anteayaI did chat with the fire chief and word got around so I don't have fireworks every week anymore15:54
*** gokrokve has joined #openstack-infra15:54
*** habib has quit IRC15:54
anteayajust on holiday weekends which is about 6 weeks out of the summer, so that is okay15:54
swestonanteaya:  ok, I responded to the ML, and I am ready to move on when you are :-)15:55
*** cp16net_ has joined #openstack-infra15:55
*** bogdando has quit IRC15:55
*** bogdando has joined #openstack-infra15:56
*** cp16net_ has quit IRC15:56
*** markmcclain has quit IRC15:57
*** markmcclain has joined #openstack-infra15:57
*** ihrachyshka has joined #openstack-infra15:57
*** markmcclain has quit IRC15:57
*** markmcclain has joined #openstack-infra15:58
*** marcoemorais has quit IRC15:58
anteayasweston: okay so it appears that Brocade CI is not in the third-party CI gerrit group, so when fungi has a moment for housekeeping he will fix that so that it is15:59
anteayasweston: so you actually have 4 Brocade accounts15:59
*** e0ne_ has quit IRC15:59
*** cp16net_ has joined #openstack-infra15:59
*** hashar has quit IRC16:00
*** e0ne has joined #openstack-infra16:00
fungianteaya: added16:00
fungianteaya: there was another you thought should be added too?16:00
Shrewsfungi: hope your move goes smoothly. will suck to not have you nearby now16:00
anteayaOne Convergence16:00
anteayahttps://review.openstack.org/#/q/reviewer:oc-neutron-test%2540oneconvergence.com+status:open,n,z16:00
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js  https://review.openstack.org/9963816:00
fungiShrews: not really--you'll just have to come to the beach for work-beer16:00
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery  https://review.openstack.org/9966016:00
anteayafungi: they show up as a third party ci system16:01
fungianteaya: done16:01
anteayafungi: thank you16:01
*** cp16net_ has quit IRC16:02
*** cp16net_ has joined #openstack-infra16:03
swestonanteaya: ok, I see you updated the etherpad.16:03
*** mestery has quit IRC16:03
*** pblaho has quit IRC16:03
*** mestery has joined #openstack-infra16:04
anteayayeah16:04
*** cp16net has quit IRC16:04
anteayaearly on we didn't have the two groups, voting and non-voting16:04
*** cp16net_ is now known as cp16net16:04
*** e0ne has quit IRC16:04
anteayaso if a system went wild and we removed their voting rights there was no group for them16:04
anteayawe have most corralled back again16:05
anteayasince we have a non-voting group now, which is the default group16:05
*** pblaho has joined #openstack-infra16:05
*** mwagner_lap has joined #openstack-infra16:07
*** yfried_ has quit IRC16:07
*** shayneburgess has quit IRC16:07
swestonanteaya:  I understand.16:08
anteayasweston: great thanks16:09
anteayanow that we have the emergencies addressed16:10
anteayawhat can I do to help?16:10
*** pblaho has quit IRC16:10
*** yfried_ has joined #openstack-infra16:11
*** ociuhandu has joined #openstack-infra16:11
*** chuckC has quit IRC16:11
swestonanteaya:  ok, sorry trying to balance another conversation ..16:11
*** locke105 has joined #openstack-infra16:12
devanandafungi: I'm looking at a recent failure of gate-tempest-dsvm-virtual-ironic and wondering why it didn't seem to run the fix we landed several days ago16:12
*** mrodden1 has quit IRC16:12
devanandahttp://logs.openstack.org/14/96114/4/gate/gate-tempest-dsvm-virtual-ironic/7f8433c/console.html#_2014-06-12_06_42_32_82716:12
devanandavs https://review.openstack.org/#/c/98886/1/modules/openstack_project/files/jenkins_job_builder/config/devstack-gate.yaml16:13
anteayasweston: I hear that16:13
devanandathere's no log of running apt-get update prior to a-a-r uca16:13
*** chuckC has joined #openstack-infra16:13
openstackgerritJulien Vey proposed a change to openstack-infra/config: Add solum-guestagent repo to stackforge  https://review.openstack.org/9970516:13
fungidevananda: seems to be using a devstack-precise node in hpcloud-b3. what was the url for the fix again?16:14
devanandafungi: just pasted it :)16:14
fungidevananda: i'll see if it's a stale nodepool image issue16:14
*** mbacchi has quit IRC16:14
fungioh, that was it16:14
devanandafungi: if that fix isn't fixing the problem, that's one thing. but if stale nodepool is causing all the gate failures that sdague is pointing out, that's another16:15
fungiseems to have merged 2014-06-10 at 18:0816:15
sdaguedevananda: regardless, 65% pass rate is terrible16:15
devanandasdague: i agree16:15
sdaguethe ironic team has to get that up16:15
fungidevananda: and yeah, it was a job config change, so wouldn't be impacted by stale node images...16:15
sdagueand has to be the people monitoring that16:15
sdaguelanding a gate job also incurs the responsibility of ensuring it's at a high success rate, and digging on it when it's not16:16
devanandasdague: and we are16:16
devanandasdague: our gate was completely blocked for ~10 days so we couldn't do much to even see when it would/wouldn't work16:16
devanandathen nova landed the revert to unblock us16:16
devanandawe landed ~15 bug fixes16:16
devanandaincluding one that we thought addressed that very issue16:17
devananda(in infra)16:17
sdaguewell you also need to realize there is no more "our gate"16:17
sdagueyou are in the main integration gate now16:17
fungidevananda: another possibility is that jenkins02's job configuration is stale for some reason... checking that next16:17
devanandasorry -- our check queue was blocked16:17
*** annegent_ has joined #openstack-infra16:17
sdagueso the impact goes way up16:17
devanandasdague: but you're right. that's another problem16:18
swestonanteaya:  I am finding out how to rename the Brocade CI account.16:18
anteayasweston: you can't16:18
devanandasdague: ironic afaik shouldn't be in the integrated gate right now16:18
anteayasweston: we have to16:18
swestonanteaya: yes, i mean, what to rename it to16:18
anteayasweston: all you have to do is tell me what name you want it to be16:18
sdagueyeh, this actually kind of leads back to whether or not we can take the risk on olsotest16:18
anteayasweston: ah yes, that I do need from you16:18
devanandasdague: whether our tempest jobs vote on ironic or not shouldn't impact the integrated gate. I recognize that it does -- how can we detangle that?16:18
sdaguebecause I'm actually kind of concerned of the "join the world" that oslotest is causing16:18
anteayasweston: thanks16:18
swestonanteaya:  yup yup16:18
devanandasdague: without disabling our main test's ability to vote on ironic changes16:18
sdaguedevananda: oslotest16:19
sdagueis dhellmann about16:19
*** jistr has quit IRC16:19
sdaguethis is the problem, but the way the oslotest jobs are set up, everything that uses oslotest is now merged into one flow16:19
sdagueand from a theory perspective, I get why16:20
devanandai really want to see that job continue to vote on ironic -- even failing 33% of the time -- because that forces us to clean things up. making it non-voting doesn't encourage ironic devs to fix it.16:20
sdaguebut from a practice perspective, I think we're kind of boned16:20
sdaguedevananda: it's still voting in check16:20
sdagueand you have to have clean check to get to the gate16:21
*** Hal_ has quit IRC16:22
devanandasdague: maybe i'm missing something in how oslotest tied everything together16:22
devanandasdague: how does a job that votes on ironic's gate cause patches to /other/ projects to fail to merge?16:22
fungidevananda: yeah, so that was it... the jenkins master associated with that particular slave still has the previous version of the job config. digging now to see if/why jjb is failing to update it there16:23
sdaguedevananda: ok. stop using the term "ironic's gate"16:23
devanandasdague: or is the problem that the integrated gate is now serialized, so if there is an ironic change in the merge queue, and it fails, it slows things down?16:23
sdaguebecause no project has their own gate16:23
*** markmcclain has quit IRC16:23
sdaguegates are constructed by computing projects that have overlapping jobs16:24
devanandasorry, i'll rephrase16:24
swestonanteaya:  I am not getting a response right now, I will need to follow up with you when I have more info.16:24
anteayasweston: very good16:24
sdagueand the issue is a gate reset, because of failing in the gate, adds about 1 hr delay to redo all the jobs16:24
anteayasweston: anything else for the moment?16:24
sdagueironic had 30 resets in the last 48 hrs in the gate queue on that one job16:25
swestonanteaya:  ok, so .. moving on.  I have Zuul reporting back to the openstack-dev sandbox now.16:25
devanandasdague: gotcha16:25
sdaguewhich means generating 30 hrs of delay (back of the envelope)16:25
sdaguethat's huge16:25
*** mugsie has joined #openstack-infra16:25
sdagueand a piece of the puzzle for why we have a 24 hour gate pipeline right now16:25
swestonanteaya:  can we verify that is happening correctly?16:25
anteayasweston: may I have a url16:25
devanandasdague: right. now I understand.16:26
devanandasdague: no predictive parallel testing, so one failure means re-testing everything "behind" that patch16:26
sdaguedevananda: we are predictive16:26
sdaguebut it's a speculation16:26
sdagueso if you fail in the gate, we have to unwind and redo our speculation16:27
swestonanteaya: yes https://review.openstack.org/#/c/99656/16:27
*** cgoncalves has quit IRC16:27
*** cgoncalves has joined #openstack-infra16:27
sdagueotherwise you can land code that never was tested in that combination16:27
fungioptimistically predictive to avoid wasting even more resources on jobs than we already do16:27
fungias opposed to pessimistically predicting several failures deep in case a change fails16:27
sdagueyeh, with infinite resources you could assume things would fail and just grind16:27
anteayasweston: okay so third-party ci systems can't post "Starting check jobs." please disable16:28
anteayasweston: it is too much noise16:28
sdaguebut take the current resources and multiply16:28
clarkbo/16:28
sdagueto get what you'd need for that16:28
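The speculation sdague and fungi describe can be shown with a toy model: each change in a dependent queue is tested against the assumption that everything ahead of it merges, and a failure ejects the failed change and forces everything behind it to be retested. This is an illustration of the idea, not zuul's code:

```python
def speculative_states(queue):
    """Each queued change is tested with every change ahead of it
    applied, as if those had already merged."""
    return [queue[:i + 1] for i in range(len(queue))]

def reset_after_failure(queue, failed):
    """A gate failure ejects the failed change; everything behind it
    must be retested against the new, shorter speculation."""
    return speculative_states([c for c in queue if c != failed])
```

This is why one flaky job is so expensive: every reset throws away the in-flight results for all changes behind the failure, hence the back-of-the-envelope "30 resets, ~30 hours of delay" above.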
swestonanteaya:  ok16:28
anteayasweston: and firefox can't find your logs: http://logs.ci.vyatta.net/56/99656/1/check/noop-check-communication/288afb5738384c4ebea7dd6ecdc8372216:28
clarkbanteaya: we actually need to turn that off in our zuul too16:28
swestonanteaya: yes, I need to fix that16:29
*** Longgeek has quit IRC16:29
anteayaclarkb: awesome16:29
anteayasweston: great, that is my feedback16:29
clarkbanteaya: gerrit 2.8 should fix that. I can confirm over on review-dev16:29
fungiclarkb: though in actuality now gerrit adds a comment anyway when zuul un-sets its vote, so...16:29
anteayasweston: go forth, do great things16:29
clarkboh16:29
anteayaclarkb: awesome, thank you16:29
clarkbsilly gerrit16:30
fungiclarkb: it'll just be a comment with "patchset X: -Verify" as the only content16:30
swestonanteaya: ok, so nothing should be reported when the system starts the check job?16:30
anteayasweston: correct16:30
fungiclarkb: at least i think that's the new vote+0 behavior16:30
anteayasweston: with 20 systems reporting on a patch, that is too much activty for too little information16:30
*** mrodden has joined #openstack-infra16:31
devanandasdague: so. non-integrated projects which are now, by virtue of oslotest, in the integreated gate. that includes incubated projects, tripleo projects, etc16:31
sdagueyep16:31
clarkbfungi: ya I think you are correct. I can poke at review-dev to see16:31
devanandasdague: is that going to change? or the new status quo?16:31
sdaguethat's some fallout that was not entirely figured out16:31
clarkbfungi: it may end up being clearer that way anyways?16:31
sdagueI think we need to rethink that16:31
*** _nadya_ has joined #openstack-infra16:32
devananda++16:32
fungiclarkb: it may end up being less code in zuul, at least16:32
sdagueand just take the risk of oslotest breaking16:32
*** gokrokve has quit IRC16:33
sdaguebut I need dhellmann around before I propose that16:33
mordredclarkb: phschwartz and I were talking in the scrollback about a patch to nodepool to use nova rebuild in some cases16:33
clarkbmordred: what does nova rebuild do?16:34
mordredclarkb: it reuses the vm but splats the contents of the original image on it - so the end result is like having booted a new instance, but it's quicker/cheaper16:34
swestonanteaya:  gotcha.  I have a response on our earlier query, could we change Brocade CI to Brocade BNA CI, and change Brocade OSS CI to Brocade Vyatta CI16:34
anteayayes16:35
anteayaI will make a note of those16:35
sdaguemordred: when it works :)16:35
mordredclarkb: the short theory is to have the node launching logic say "I need more nodes, are there any nodes with a matching image in the DELETE state, if so, rebuild, if not, launch a new one"16:35
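mordred's launch-time check, plus the "don't retry a node that keeps failing rebuild" counter he mentions a bit later, could look roughly like this (the node dict schema and the threshold are assumptions, not nodepool's actual data model):

```python
MAX_REBUILD_FAILURES = 3  # assumed threshold

def pick_rebuild_candidate(nodes, image):
    """Prefer rebuilding a node that is queued for deletion and was
    booted from the same image; return None to fall back to booting
    a fresh node."""
    for node in nodes:
        if (node["state"] == "delete"
                and node["image"] == image
                and node.get("rebuild_failures", 0) < MAX_REBUILD_FAILURES):
            return node
    return None
```

The win, as discussed below, is that `nova rebuild` reuses the existing VM slot, so the provider's quota never has to be re-won in a race against other tenants.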
clarkbmordred: lifeless: also the more I think of the dib service issue the more I come back to have puppet install mysql and postgres then manually start postgres and mysql in post-install.d and add the user and db. Then once we are switched delete that code from puppet to prevent branching too hard16:35
anteayawe don't have a time yet for the naming changes to take effect, but when we rename those are the names they will be16:35
swestonanteaya: Yay!! And I have a new ssh key we would like associated with Brocade BNA CI, how should I post that to you?16:35
mordredclarkb: ++16:35
mordredclarkb: I swear I'm going to write you a patch in that direction16:36
clarkbok16:36
mordredclarkb: you know, if I can get the heck out of vegas16:36
clarkbmordred: are you ready for world cup?16:36
clarkbI am hanging out at home today so that I acn watch todays game16:36
mordredsdague, clarkb: for the nodepool patch, I think we'd also want to put a flag or a counter on a node in the db so that if we have rebuild failures, we can mark it as "please don't try to rebuild me"16:36
mordredclarkb: I am - although we're going to be travelling for this game :(16:37
clarkbmordred: phschwartz sdague once a node goes into delete that means we are trying to delete it16:37
mordredclarkb: right16:37
sdaguemordred: it would honestly be nice if nodepool ended image build with trying to run tempest-dsvm-full16:37
sdagueto know if the thing worked16:37
clarkbhow do we preempt that deletion on the nova side?16:37
mordredsdague: yes. I want to get to taht point16:37
anteayasweston: please post the ssh key to the infra email list16:37
clarkbsdague: I want to make that part of the dib cycle16:37
sdagueespecially with the UPDATE_REPOS=False, to know that we cached16:37
mordredclarkb: don't we mark it for delete in the db and then have a reaper thread which goes through and does the deletes? Or am I high?16:37
sdaguebecause, right now we don't16:38
anteayasweston: please check the etherpad to ensure all the Brocade account names are correct16:38
sdaguein a lot of the images16:38
clarkbsdague: create dib image, upload to glance as image-beta/test/derp then have a periodic job that runs once a day that only runs on that image flavor which if it passes triggers a thing to rename the image16:38
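clarkb's build/test/promote flow can be sketched as a small state change: a freshly built image is uploaded under a beta name, and only a passing periodic test renames it to the production name that ordinary jobs boot from (the "-beta" naming convention here is illustrative):

```python
def promote_image(images, beta_name, test_passed):
    """Rename a beta image to its production name only after the
    periodic test job on that image flavor has passed; on failure,
    leave it unpromoted so jobs keep using the previous image."""
    if not test_passed:
        return images
    suffix = "-beta"
    final = (beta_name[:-len(suffix)]
             if beta_name.endswith(suffix) else beta_name)
    return [final if name == beta_name else name for name in images]
```

The point of the rename step is atomicity: jobs always reference the production name, so a bad build can never be handed to them.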
anteayasweston: and note that the usernames will not be automatically renamed, we are going to make a list of accounts that are willing to volunteer for that16:38
*** openstackgerrit has quit IRC16:38
clarkbmordred: yes that is how it works16:38
clarkbmordred: so there is a race between making delete api request and rebuilding16:38
swestonanteaya:  I am going to set up mailing lists for all of the accounts, so that we can add and remove people from the list as necessary, but I do not have the server up yet.  Will that be okay?16:38
phschwartzclarkb: Correct, but if all we have to do is look for an instance that is marked as delete, but not deleted yet. (not sure if the db currently denotes if the instance has been grabbed by the delete thread and is in the process of calling the api to delete.16:39
clarkbmordred: and currently I don't think you can win that race because it's a fifo queue16:39
mordredclarkb: right. so once it's marked in the db, if the delete thread gets to it, great. but if it doesn't and the rebuild thread gets to it, neat16:39
anteayasweston: the timing of that renaming needs to be co-ordinated, since once we change the name you can't use the system until you rename it on your end16:39
mordredclarkb: ah - so it might need a little more logic work then16:39
clarkbyes16:39
*** sarob has joined #openstack-infra16:39
mordredkk.16:39
mordredphschwartz: have fun!16:39
clarkbI am sure you can make it happen, but you will have to be careful to not mark a rebuilt node as ready again16:39
anteayasweston: as long as I can talk to you about anything Brocade, and you can turn systems off if they go wild16:39
clarkbthen have it deleted in 20 minutes16:39
anteayasweston: you do anything you need to do from your end to make it work16:40
sdagueclarkb: cool. though right now, I mostly need a working ES cluster :)16:40
clarkbsdague: yes I am booting this morning16:40
sdaguewoot16:40
devanandafungi: any further info on the stale image?16:40
phschwartzmordred: I think there is a way around it. If we can grab it before it is deleted, then we can change the status so the delete thread never tries to delete it. (drop it from the queue, might have to move from being a fifo for that)16:40
clarkbsdague: I think next step is to start tailing all the logs and make sense of what appears to be a cyclic process16:40
swestonanteaya: ok, awesome :-) what does that mean, then.  we won't be able to use the accounts from when to when?16:41
*** marcoemorais has joined #openstack-infra16:41
devanandasdague, fungi: it looks like, in the last 48 hours, only one failure in gate-tempest-dsvm-virtual-ironic was from something /other/ than UCA failing (which I think is just due to stale images at this point, since we landed a fix on 6-10)16:41
clarkbwait wasn't the fix in the job definitions not the salves?16:41
swestonanteaya: yes, I will be available 24/7 in the event of problems16:41
sdaguedevananda: sure, but until that's actually working and fixed, it's still a problem16:42
*** marcoemorais has quit IRC16:42
devanandasdague: ack16:42
*** dizquierdo has quit IRC16:42
sdagueI don't actually care why it's failing, the fact that it's failing disqualifies that from voting16:42
*** marcoemorais has joined #openstack-infra16:42
phschwartzThere is another benefit of getting the rebuild working also. It removes a race where one of our regions might be out of capacity and when you do the delete it hits a soft-delete while it is waiting for the resources to be released, basically leaving it as an unusable slot for a short period of time. If we do a rebuild instead, the slot never leaves the usage from infra so no competing for a new slot that might not be available at that time.16:42
clarkbphschwartz: yup I think it will be a good thing to try16:43
anteayasweston: if, and only if, you volunteer to have your accounts renamed, you I and fungi pick a time for it to happen, we change gerrit, you change your 4 systems, and we are all good16:43
anteayasweston: if you choose to do it, it should take about 15 minutes start to finish16:43
*** sarob has quit IRC16:43
anteayasweston: the trick is to pick a time when fungi has the 15 minutes16:44
phschwartzclarkb: I am going to start working on it monday morning while I am flying to SAT. I want to get a WIP patch out for you guys to start looking at as soon as I can.16:44
*** markmcclain has joined #openstack-infra16:44
clarkbphschwartz: mordred: or even preempt the entire marking of DELETION if the current needed nodes is non zero16:45
clarkbthis doesn't fix it quite so properly but may be simpler16:45
anteayasweston: I am afk for a bit, I will let you know when I am back16:45
swestonanteaya: that sounds great.  I will let you know when we are ready to proceed with the name changes, send the new key out to the mailing list, and ping you when the log and status servers are up.16:45
*** zehicle_at_dell has quit IRC16:45
sdagueok, going to drop off for a bit, need to get some lunch and relocate back to home.16:45
swestonanteaya: oh ok, looks like our messages crossed paths, I'll wait :-)16:45
phschwartzclarkb: hmm, I think I have an idea based on that logic. Give me a second to think before I respond.16:45
*** isviridov is now known as isviridov|away16:47
*** rfolco has joined #openstack-infra16:47
*** amcrn has joined #openstack-infra16:47
phschwartzclarkb: ok, I know this would complicate the process a bit. But what if, when we are done with an instance or at a time when we would mark it for deletion, we mark it for rebuild instead? We would need one more thread for the rebuilder; if it decides it cannot rebuild an instance, it flags it for deletion. This way there is no need to modify the delete process at all.16:48
phschwartzThe rebuild thread would put it into a state that the create thread can then use.16:49
*** fanhe has quit IRC16:49
devanandafungi: fwiw, all the failures seem to be from jobs started by jenkins0216:49
*** _nadya_ has quit IRC16:50
devanandaexcept for one java.io.IOException from jenkins0416:50
*** sweston has quit IRC16:51
*** maxbit has quit IRC16:51
*** sweston has joined #openstack-infra16:51
*** cp16net has quit IRC16:52
*** chianingwang has joined #openstack-infra16:53
clarkbphschwartz: yeah you could slip something in between like that16:53
clarkbphschwartz: which should make it a bit easier to reason about races as there shouldn't be any16:53
rcarrillocruzhey guys, i'm deploying review.pp in a cloud instance. If I access the server with http, I get "The requested URL / was not found on this server.", if I access it with https I get "SSL Connection error".16:53
phschwartzI am going to think more on it while I head out with the wife to get lunch.16:53
*** trinaths has quit IRC16:53
clarkbdevananda: right I don't think it is a slave thing it is a master thing16:53
rcarrillocruznow, if i edit the vhost and replace the VirtualHost <hostname>:<port> with VirtualHost *:<port> it works16:53
*** mkerrin has joined #openstack-infra16:53
clarkbphschwartz: sure let me know what you come up with (I think any approach is worth doing though)16:54
mordredsdague: ++ to removing postgres, btw16:54
*** harlowja_away is now known as harlowja16:54
clarkbrcarrillocruz: does the name match the name you are hitting it with?16:54
clarkbrcarrillocruz: if you try to hit localhost but the vhost says review.foo it won't work16:54
devanandaclarkb: how do we address it? that staleness is still causing ~30% of our tests to fail16:54
phschwartzsdague: ++ from me also, I am a certified postgres admin and developer and I can't stand using it.16:54
*** markwash has joined #openstack-infra16:54
clarkbdevananda: fungi said he was looking at it16:54
devanandaclarkb: ack16:55
rcarrillocruzwhat i did was to add the host name in my laptop /etc/hosts and in the cloud instance itself16:55
devanandajust trying to be helpful16:55
rcarrillocruz<external ip> gerrit.openstacklocal16:55
clarkbdevananda: ya I think we need to hear back from fungi16:55
clarkbdevananda: it *should* be as simple as kicking JJB to apply the new job16:55
clarkbrcarrillocruz: and gerrit.openstacklocal is what the vhost said before the splat?16:55
rcarrillocruzlemme paste the vhost in paste.openstack.org16:56
mordredsdague, jogo: I know I'm a broken record on this ... but ^^ is the dumbest default behavior ever. I blame both of you because of the nova-core status16:56
mordredKiall: you're going to fix that with designate, right?16:57
*** zehicle_at_dell has joined #openstack-infra16:58
*** skraynev has quit IRC16:58
rcarrillocruzclarkb: http://paste.openstack.org/show/83844/16:58
*** jcoufal has joined #openstack-infra16:58
*** skraynev has joined #openstack-infra16:59
rcarrillocruzand I put the pair <external IP> gerrit.openstacklocal gerrit in /etc/hosts on both the laptop and the gerrit server16:59
rcarrillocruzso it shouldn't resolve to localhost16:59
*** sarob has joined #openstack-infra16:59
*** rwsu has joined #openstack-infra17:00
*** radez is now known as radez_g0n317:00
*** bogdando has quit IRC17:00
*** jlibosva has quit IRC17:01
*** marcoemorais has quit IRC17:02
devanandaclarkb: in response to sdague's email, how do you feel about nodepool precaching UCA? I have yet to look into the nodepool code, but it sounds like he doesn't want to re-enable voting on that job until it no longer pulls anything directly from UCA17:02
*** jlibosva has joined #openstack-infra17:02
mordredwhy don't we just have UCA enabled from the get-go again?17:02
Kiallmordred: was AFK.. What am I fixing? ;)17:03
mordredKiall: nova boot foo.bar.com ; ssh foo.bar.com ; hostname == foo.openstacklocal17:03
KiallYea - We had a chat with Nova/Neutron guys at the summit to talk about fixing that ;)17:04
mordredKiall: I generally assume you can solve all of my problems17:04
clarkbmordred: devananda UCA doesn't work17:04
*** zns_ has joined #openstack-infra17:04
clarkbor hadn't17:05
mordredoh. well, that's a good reason17:05
clarkbwhich is why we avoided it17:05
Kiallmordred: funny that, I thought I caused more problems than I fixed ;)17:05
clarkblibvirt was broken17:05
clarkbmongo was broken17:05
devanandaclarkb: https://review.openstack.org/#/c/98886/117:05
*** dprince has quit IRC17:05
clarkbdevananda: right so in that case it's just you that's affected17:05
mordreddevananda: I believe he means "the software in UCA is broken"17:05
*** ramashri has quit IRC17:05
devanandamordred: ah17:05
mordrednot the mechanism17:05
devanandaso in this case, the issue sdague has is the mechanism17:06
*** andreykurilin_ has joined #openstack-infra17:06
devanandathat we're installing UCA at run time, instead of precaching it17:06
*** nati_ueno has joined #openstack-infra17:06
clarkbsdague: so it looks like some of our logstash-indexers are off derping17:06
mordredyah. hrm. I wonder ...17:06
*** ramashri has joined #openstack-infra17:06
mordreddevananda: I have an idea17:06
devanandamordred: ironic's tempest job is now non-voting17:06
clarkbsdague: I am going to sweep through and see if I can figure out what they are doing but I think the issue is at the logstash indexer level17:07
mordreddevananda: we do the normal apt precaching that we do17:07
devanandamordred: because, essentially, installing UCA fails too often17:07
clarkbsdague: so adding new ones helped until they derped too17:07
*** zns has quit IRC17:07
mordreddevananda: then we a-a-r uca and do another round of apt precaching ...17:07
mordreddevananda: THEN, remove the sources.list.d file17:07
mordredso that "enabling" uca is not running a-a-r, it's adding the sources.list.d file back, and then any additional packages you'd get from uca would also be pre-cached17:08
mordredclarkb: ^^ sanity check me on that17:08
mordredI think that would allow us the mechanism to pre-cache/pre-download the things we need without polluting the box for non-uca runs17:08
mordredwe could even add uca without a-a-r at all17:09
mordredafter all, it's just a sources.list.d file and an apt-key command17:09
clarkbmordred: that should work17:09
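The enable/pre-download/disable dance mordred describes could be sketched in Python (matching cache_devstack.py's language); the pocket name, file path, and helper names here are illustrative assumptions, not the actual implementation:

```python
import os
import subprocess

# Hypothetical sketch of the "enable, pre-download, disable" flow above:
# add the UCA sources.list.d file, fill /var/cache/apt with its packages,
# then remove the file again so non-UCA runs see a clean box.
# The pocket name and path below are assumptions for illustration.
UCA_LIST = '/etc/apt/sources.list.d/cloudarchive.list'
UCA_LINE = ('deb http://ubuntu-cloud.archive.canonical.com/ubuntu '
            'precise-updates/icehouse main\n')


def _run(cmd):
    subprocess.check_call(cmd)


def precache_uca(packages, list_path=UCA_LIST, runner=_run):
    """Download (but don't install) UCA packages, then remove the repo."""
    with open(list_path, 'w') as f:
        f.write(UCA_LINE)
    runner(['apt-get', 'update'])
    # --download-only fills the apt cache without touching installed state
    runner(['apt-get', '--download-only', '-y', 'install'] + list(packages))
    os.remove(list_path)
    # refresh the package index so jobs no longer "see" UCA by default
    runner(['apt-get', 'update'])
```

Re-enabling UCA in a job is then just writing the sources.list.d file back and running apt-get update; the packages it installs come out of the local cache instead of the internet.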
*** zns_ has quit IRC17:09
mordredwe could potentially generalize it - so that the UCA repo is referenced somewhere in devstack as a repo that might get enabled17:10
*** marcoemorais has joined #openstack-infra17:10
*** zehicle_at_dell is now known as zehicle_defcore17:10
mordredand we could have a generalized thing in d-g that pre-caches stuff with any additional repos that devstack lists17:10
*** zns has joined #openstack-infra17:10
mordredbut that would be step two and may never be needed17:10
clarkbwow dnsmasq hates us in syslog17:11
devanandamordred: so, side note, since https://review.openstack.org/#/c/98886/1/modules/openstack_project/files/jenkins_job_builder/config/devstack-gate.yaml landed17:11
*** james_li has quit IRC17:11
devanandamordred: i haven't seen any more of those failures17:11
devanandamordred: except for the possibly-stale nodes that fungi is looking into17:11
*** zzelle has joined #openstack-infra17:12
mordreddevananda: yah - but we do have a hole where we'll be downloading from the internet rather than from pre-cache17:12
devanandamordred: but in principle, I dont see why this really fixes it17:12
devanandaexactly17:12
mordreddevananda: and we've developed pretty good history to know that that WILL break17:12
mordredit's just a matter of time17:12
devanandaright17:12
mordredso the longer version above should fix it the _right_ way17:12
*** markmc has quit IRC17:14
*** derekh_ has quit IRC17:14
*** trinaths has joined #openstack-infra17:15
*** annegent_ has quit IRC17:17
fungidevananda: not so much stale nodes as jenkins masters not getting job configs updated17:17
*** Ryan_Lane has joined #openstack-infra17:17
fungigah, step away for a few minutes and so many pings17:17
fungidevananda: i started the jjb update on jenkins02 before i stepped away, but it's still churning17:19
fungiseems to think it needs to reconfigure lots and lots of jobs17:19
*** trinaths has quit IRC17:19
*** cp16net has joined #openstack-infra17:19
fungiit's creating a bunch it was missing too17:19
devanandafungi: ack17:20
devanandamordred: so i haven't dug into the nodepool precaching code before. a) do you think step1 is needed before gate-tempest-dsvm-virtual-ironic can vote again? b) if so, mind pointing me in the right direction to get started on that?17:21
*** gokrokve has joined #openstack-infra17:22
*** jlibosva has quit IRC17:23
*** esker has joined #openstack-infra17:23
mordreddevananda: ./modules/openstack_project/files/nodepool/scripts/cache_devstack.py in openstack-infra/config does it17:23
*** gyee has quit IRC17:23
mordreddevananda: since we collect the list of apt packages17:24
*** cp16net has quit IRC17:24
mordreddevananda: I think we may just want to put in something around line 14217:24
devanandaoff topic, ironic has a patch up to make the ipmi driver fail to load if ipmitool isn't installed, which seems like a sane thing17:24
*** dprince has joined #openstack-infra17:25
devanandabut that made me realize that we're not installing ipmitool in nodepool17:25
clarkbsdague: :timestamp=>"2014-06-12T17:22:26.537000+0000", :message=>"Failed to flush outgoing items", :outgoing_count=>512, :exception=>java.lang.OutOfMemoryError: Java heap space, may be our culprit17:25
mordreddevananda: hrm. I may need to think about a sane way to implement the above stuff17:25
*** markmcclain has quit IRC17:25
devanandaalso, we're not doing CI with ipmitool anyway17:25
mordreddevananda: that would be a devstack thing. you'd need to add ipmitool to an apts file in devstack17:25
mordreddevananda: and then nodepool will know to pre-cache it17:25
devanandaso a) we add ipmitool to devstack/xx/apts or b) we don't enable the ipmitool driver in devstack, since it's not used in CI testing17:26
mordreddevananda: but devstack's ironic config would want to be the one to actually install it17:26
devanandamordred: right. but we're not actually going to use it for upstream CI17:26
mordreddevananda: nod17:26
mordreddevananda: I could see either thing ... devstack _is_ used for more than the gate17:26
devanandawhich do y'all prefer? a) saner devstack config for folks testing, but an extra (unused) package17:26
devanandaright17:26
mordreddevananda: so if you expect someone using devstack to be able to configure ironic and then have that cloud control things with ipmi ... then I'd go ahead and add it17:27
mordredand the fact that we don't use it in the testing is meh17:27
devanandaack, will do that then17:27
devanandasince it is the recommended / default / reference driver that most folks test with17:27
mordred++17:27
clarkbsdague: but this makes me want to use fluentd more17:28
*** MarkAtwood has joined #openstack-infra17:28
clarkbmordred: ^17:28
mordreddevananda: if you can figure out a sensible way to implement the stuff above in cache_devstack.py go for it - if not, poke me when I get back home and I'll figure it out17:28
mordredclarkb: I support your choices in this area 100%17:28
mordredclarkb: if fluentd would be a better choice, then awesome17:28
*** maxbit has joined #openstack-infra17:29
clarkbmordred: well it isn't a better choice until we have structured data. but definitely something we can work towards17:29
*** ihrachyshka has quit IRC17:29
*** arnaud__ has joined #openstack-infra17:30
mordredclarkb: it's the same choice essentially while we don't though right?17:30
mordredclarkb: so would we be doing a fluentd+elasticsearch cluster instead of a logstash+elasticsearch cluster?17:30
*** rwsu has quit IRC17:30
clarkbya17:31
mordredcool17:31
*** praneshp has joined #openstack-infra17:31
anteayasweston: back; send the key to the infra ML anytime, using both the new and old Full Name for the account17:31
clarkbmordred: fluentd doesn't really do parsing of unstructured data17:31
clarkbmordred: so it is supposed to be able to do much better throughput17:31
clarkbmordred: but you have to start with good data17:31
anteayasweston: I'll let you know when we are ready to change the Full Name of the account, we can address changing the username after that17:31
mordredclarkb: so what do we do until we have structured data?17:31
*** rwsu has joined #openstack-infra17:32
clarkbmordred: limp along on logstash17:32
mordredah - gotcha17:32
anteayasweston: and yes, let me know when you have something new for me to see in the sandbox repo comments17:32
mordredso we need to get the ability to have structured data, then spin up fluentd?17:32
zaromorning17:32
mordredor spin them up side by side?17:32
*** ihrachyshka has joined #openstack-infra17:32
swestonanteaya: awesome! Thank you so much for your time today.17:33
*** markmcclain has joined #openstack-infra17:33
*** markmcclain has quit IRC17:33
anteayasweston: np17:33
anteayasweston: thanks for being the Brocade point person, saves me time17:33
fungimordred: devananda: i think the hard part about caching this in nodepool is going to be that the packages ironic's job needs cached are in ubuntu cloud archive, which means we need to enable it, update package lists, then retrieve the package versions it needs into the cache, then disable it, then update the package index again17:34
swestonanteaya: you bet.  always glad to do what I can to ease the burden for others :-D17:34
*** marcoemorais1 has joined #openstack-infra17:34
mordredfungi: yes. that is what I wrote above17:34
fungimordred: devananda: and so the ironic job is still going to have to re-enable uca and re-update the package list17:34
clarkbmordred: I think we focus on structured data first17:34
mordredfungi: although you summarized it very nicely17:34
*** marcoemorais1 has quit IRC17:34
clarkbmordred: as that is project side and will be potentially problematic17:34
mordredclarkb: ++17:34
clarkbmordred: though if we make oslo.config import python json logging that may be all we need17:34
*** marcoemorais has quit IRC17:34
clarkbmordred: then we can config it to do json17:34
*** marcoemorais has joined #openstack-infra17:35
mordredfungi: but it should still get us much further in that they would not be pulling packages from the internets17:35
*** markmcclain has joined #openstack-infra17:35
*** ihrachyshka has quit IRC17:35
*** SumitNaiksatam has left #openstack-infra17:35
fungimordred: true. also it *might* be possible (though sorta hacky) to save and pivot between package lists17:35
*** SumitNaiksatam has joined #openstack-infra17:36
dhellmannclarkb: I'm probably missing some context, but there's a json logger in the oslo log code17:36
clarkbdhellmann: oh cool17:36
fungirsync /var/cache/apt to a /var/cache/apt.ironic or something and swap back and forth during image creation and within the jobs17:36
clarkbdhellmann: I didn't know so apparently we would just have to flip a switch to make that work17:37
clarkbdhellmann: we are finding that doing post processing of log data to make it structured is expensive and we shouldn't do it17:37
clarkbdhellmann: so starting with json is where we want to go17:37
dhellmannclarkb: https://review.openstack.org/#/c/95929/17:37
fungialso possible we could play tricks with apt pinning and just have a very low preference on the uca repos, then ironic specifically requests the version/suite it needs for a given dependency17:37
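fungi's pinning trick would look roughly like this in an apt preferences file; the origin string for UCA here is an assumption and would need verifying against `apt-cache policy` output:

```text
# /etc/apt/preferences.d/cloudarchive (hypothetical)
# Keep UCA configured but ranked below the main Ubuntu archive, so only
# packages explicitly requested from it (by version/suite) are pulled in.
Package: *
Pin: release o=Ubuntu Cloud Archive
Pin-Priority: 100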
dhellmannclarkb: that makes complete sense, and we may want to make that logger smarter after we move it to oslo.log17:38
clarkbdhellmann: awesome that is great news17:38
mordredwoot!17:38
clarkbdhellmann: do you know if json logging is available in say nova today?17:38
clarkbdhellmann: or any of the projects?17:38
clarkbdhellmann: if not I can work on syncing logging17:38
dhellmannclarkb: the class is there, I don't know if anyone uses it17:38
*** lbragstad has quit IRC17:39
clarkbdhellmann: mordred: ok I will do some digging and see if I can get a d-g run to spit out json logs17:39
devanandafungi, mordred: so that discussion has indeed gone past my ability to track it // rapidly implement any of the things you're suggesting :(17:39
dhellmannthat is, the class is in the incubated version of log.py, but I don't know if nova is up to date and I don't know if any nova users have turned that on17:39
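For a sense of what "starting with json" buys, here is a minimal stdlib-only sketch of a JSON log formatter; this is not the oslo incubator class being discussed, and the field names are arbitrary:

```python
import json
import logging


class JSONFormatter(logging.Formatter):
    """Emit one JSON object per log record instead of freeform text.

    A minimal sketch of the idea only; the real incubator class carries
    more fields (exception info, extra context, etc.).
    """

    def format(self, record):
        return json.dumps({
            'name': record.name,
            'levelname': record.levelname,
            'message': record.getMessage(),
            'created': record.created,
        })


def make_logger(name='demo'):
    # Attach the JSON formatter to an ordinary stream handler.
    handler = logging.StreamHandler()
    handler.setFormatter(JSONFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Because every record is already a self-describing JSON object, an indexer like fluentd can skip the expensive parse/grok stage entirely, which is exactly the post-processing cost clarkb is trying to avoid.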
anteayasweston: :D17:39
mordredclarkb: it's your next project to get you ATC status in all the projects :)17:39
clarkbmordred: :)17:39
clarkbdhellmann: ah ok I may end up trying to do syncs if necessary but this is great news thanks17:39
mordreddhellmann: we're very excited by this17:39
clarkbmordred: fungi: sdague: in the interim we can try going to 8GB perf nodes for logstash workers17:39
*** mrmartin has joined #openstack-infra17:40
clarkbthen double the jvm heap space for logstash17:40
mordredclarkb: gross17:40
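For the record, doubling the worker heap would be roughly a one-line change; the defaults-file path and variable name below are assumptions that vary by logstash package and version:

```text
# /etc/default/logstash (hypothetical path/variable for this package)
# Double the JVM heap available to the logstash indexer process.
LS_HEAP_SIZE="2g"
```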
dhellmannclarkb, mordred : I would love to have some feedback about how useful that class actually is and how to make it better17:40
clarkbdhellmann: noted. will try to provide it17:40
dhellmannsdague: you had something about oslotest and gating you wanted to talk about?17:40
dhellmannclarkb: thanks!17:40
*** chmartinez has joined #openstack-infra17:42
chmartinezhello! Sorry to bother. Could someone tell me what's going on with the gate jobs of this review: https://review.openstack.org/#/c/96582/?17:43
zaroclarkb: is there some secret to allow logging in from one hpcloud vm to another?  i can't seem to get passwordless ssh connection.17:43
*** sweston has quit IRC17:43
chmartinezat zuul, the gate is marked with red :|17:43
zaroclarkb: actually i can't seem to do any type of connection.17:43
clarkbzaro: you have to forward your ssh agent but generally shouldn't17:43
*** esker has quit IRC17:44
*** ramashri has quit IRC17:44
*** ArxCruz has quit IRC17:44
zaroclarkb: forget ssh, just simple login from one vm to another.  does that work for you?17:44
fungimordred: possibly something to keep on your radar, this is why we're presently not publishing wheels for data-only projects https://bitbucket.org/pypa/wheel/issue/116 i've put up a pull request, but you've probably dug a lot deeper into that code than i have so input would be welcome17:45
clarkbzaro: oh you mean any communication? check your security groups17:46
clarkbzaro: I ended up going the infra route and opened my security groups wide open17:46
clarkbthen manage local firewalls17:46
*** afazekas has quit IRC17:47
zaroclarkb: you mean open all ports?17:47
*** mbacchi has joined #openstack-infra17:47
clarkbzaro: ya, that's what I did17:48
clarkbzaro: you really don't need to but I got sick of dealing with it at that level17:48
clarkbmuch easier to modify iptables on a host17:48
clarkbmordred: ^ is that maybe feedback we should give to openstack as a whole?17:49
*** lbragstad has joined #openstack-infra17:49
*** ramashri has joined #openstack-infra17:51
*** talluri has quit IRC17:52
clarkbKiall: if you are around did you get sorted on the unbound thing?17:56
clarkbI see there is a change17:57
mrmartinfungi: hi, if you have some time, may I ask for a review of this patch: https://review.openstack.org/#/c/99481/ it is a larger refactoring of the community portal instance to provide a better deployment / update path, and it is required for deploying some features there17:57
fungimrmartin: i probably won't have time this week or next. i'm fairly busy packing and moving17:58
mrmartinok, no prob.17:58
fungimrmartin: but hopefully some of our other reviewers will have a look in the meantime17:58
clarkbKiall: fungi pretty sure that change will break everything17:59
Kiallclarkb: Code Review in action :D17:59
Kiallwhy?17:59
clarkbit doesn't put unbound on 127.0.2.1 for long lived servers but updates resolv.conf17:59
zaroclarkb: opened all ports but still cannot connect.  you should be able to ssh connect from VM A to VM B with ubuntu account right?17:59
clarkbKiall: so we will end up in a situation where nothing resolves on review.openstack.org for example17:59
clarkbzaro: yes18:00
*** dangers_away is now known as dangers18:00
zaroargg!18:00
clarkbKiall: fungi: I am much more comfortable with the services being tested being treated special18:00
clarkbKiall: fungi: especially for a service like DNS18:00
KiallOh - I thought unbound just went on the single use slaves?18:00
fungiKiall: no, it's on all our servers18:00
*** jerryz has joined #openstack-infra18:00
fungiand yeah, i missed that we didn't factor out the unbound configuration for the nodepool nodes separately from everything else18:01
*** markmcclain has quit IRC18:01
*** markmcclain has joined #openstack-infra18:01
chmartinezhello! Sorry to bother. Could someone tell me what's going on with the gate jobs of this review: https://review.openstack.org/#/c/96582/?18:02
*** markmcclain has quit IRC18:02
zaroclarkb: do i need to do anything to make that happen? keep getting permission denied (public key).18:02
Kiallfungi: Okay, I can rework it to listen only 127.0.0.1:53, rather than *:53, and at least we can work around that easily in devstack18:02
clarkbsdague: well I kicked things to deal with the unhappy OOMers and now things appear to be worse18:02
clarkbsdague: maybe the OOMing is self regulating :/18:02
clarkbzaro: yes you have to forward your key18:03
clarkbzaro: but you shouldn't do that18:03
*** zul has quit IRC18:03
*** pelix has quit IRC18:03
fungichmartinez: it was approved at 02:17 utc, check tests were rerun on it, then it was enqueued into the gate at 03:55 utc and is waiting for its turn18:03
*** zul has joined #openstack-infra18:03
clarkbKiall: does it listen on *:53 by default?18:03
Kiallyep18:03
* clarkb looks18:03
fungichmartinez: you can find its current status by searching for the change number on http://status.openstack.org/zuul/18:04
clarkbKiall: netstat says it doesn't18:04
Kiallhumm - can you paste the output?18:04
clarkbKiall: 127.0.0.1:53 and ::1:53 for tcp and udp18:05
zaroclarkb: i should still be able to login without forwarding key right?  just type in password?  but ssh doesn't even ask me for the password.18:05
clarkbtcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN18:05
*** markmcclain has joined #openstack-infra18:05
clarkbudp        0      0 127.0.0.1:53            0.0.0.0:*18:05
clarkbzaro: password auth is probably disabled18:05
chmartinezfungi: yes, I checked that and I'm seeing this: openstack/ceilometer unknown 14 hr 10 min18:05
clarkbbecause you shouldn't use password auth either18:05
KiallHumm - The documentation suggested it was 0.0.0.0:53; if that's the case, we should be able to work around it18:05
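If unbound does default to all interfaces on some platform, pinning it to loopback is a short server stanza in unbound.conf (a minimal sketch, assuming an otherwise stock config):

```text
# unbound.conf: bind only to loopback instead of all interfaces
server:
    interface: 127.0.0.1
    interface: ::1
```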
*** praneshp_ has joined #openstack-infra18:05
sdagueclarkb: bummer18:06
chmartinezfungi: it's being enqueued for 14hs.. Is that normal? (sorry to ask, I'm new at this)18:06
fungichmartinez: correct, it's presently taking changes ~24 hours to get to the top of the gate given the current rate of random test failures18:06
*** praneshp has quit IRC18:06
*** praneshp_ is now known as praneshp18:06
clarkbsdague: I mean in theory people have hundreds of nodes in these clusters18:06
clarkbsdague: but maybe they have real hardware18:07
chmartinezfungi: OK. Good to know :) Thanks!!18:07
fungichmartinez: the known bugs impeding testing are tracked at http://status.openstack.org/elastic-recheck/ if you're interested18:08
clarkbsdague: oh! kicking things seems to be writing to slightly older indexes. This may be related18:08
sdagueclarkb: yeh, I wonder if we could hit up some provider for real hardware, for this one use case18:09
mtreinishfungi, clarkb: is there an issue with gerritbot? I just pushed a patch and didn't see an irc msg18:10
sdaguedhellmann: you still around?18:10
*** tkelsey has quit IRC18:11
fungimtreinish: openstackgerrit left earlier today on a netsplit and never rejoined. i'll give it a nudge18:11
clarkbsdague: at this point I am curious to see if it reregulates on its own18:12
*** ildikov has quit IRC18:12
clarkbsdague: because that may be an indication of what is happening18:12
clarkbsdague: so I think we have a few issues that we can definitely work on addressing.18:12
sdagueok18:12
clarkbthe OOMing in jvm. the disk space situation18:12
clarkbbut even when they are happy the whole thing seems to be :?18:13
clarkber :/18:13
mtreinishfungi: ok, np. I just was curious18:13
*** openstackgerrit has joined #openstack-infra18:13
chmartinezfungi: OK!18:15
fungimtreinish: openstackgerrit is back now, btw18:17
mtreinishfungi: cool18:17
sdaguefungi: next time we get a promote window - https://review.openstack.org/#/c/99412/18:17
zaroclarkb: password auth was disabled.  turned it on and i'm finally able to connect.  thanks.  that's what i'm gonna use unless you can tell me a better way.  just needed the connection to test jenkins.18:17
sdagueI added it to the etherpad18:17
sdagueI think a chunk of grenade failures are actually that18:17
clarkbzaro: use keys18:17
sdaguebut hidden in a buffering issue18:17
*** jcoufal has quit IRC18:17
*** rfolco has quit IRC18:17
zaroclarkb: yeah ok.  i'll try that next.18:18
clarkbzaro: just don't use your key18:18
clarkbcreate ones specifically for that18:18
zaroyeah, i got that at least  :)18:18
lifelessyolanda: thats not a dib issue18:19
*** james_li has joined #openstack-infra18:19
*** cp16net has joined #openstack-infra18:20
lifelessyolanda: that path is a nova instance path, no?18:20
lifelessttx: let me look18:20
*** e0ne has joined #openstack-infra18:21
lifelessclarkb: sounds reasonable18:21
lifelessyolanda: oh, I think perhaps thats what the nova folk from the cloud you're testing with are reporting? I'd like to know what version of qemu they have18:22
*** markwash_ has joined #openstack-infra18:22
lifelessyolanda: and what version you're building with - we don't use any exotic options18:22
lifelessyolanda: my guess - latest ubuntu (you're running utopic?) has a default that LTS can't handle, or some such18:23
*** zns has quit IRC18:23
*** ominakov has quit IRC18:23
*** sweston has joined #openstack-infra18:23
*** timrc-afk is now known as timrc18:24
*** cp16net has quit IRC18:24
*** markwash has quit IRC18:25
*** markwash_ is now known as markwash18:25
*** YorikSar has joined #openstack-infra18:27
ttxlifeless: AFAICT bugs are created with bugs.createBug(), releases with milestone.createRelease()... but there is nothing like createSpec() or createBlueprint()18:29
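For contrast, the bug-side API ttx mentions is callable from launchpadlib along these lines (the project name and wrapper function are illustrative); there is simply no createSpecification() counterpart to wrap on the blueprint side:

```python
def file_bug(lp, project_name, title, description):
    """Create a bug against a project through the Launchpad API.

    `lp` is a logged-in launchpadlib Launchpad object. This wrapper and
    the project name passed to it are illustrative; the point is that
    bugs.createBug() exists while specifications have no equivalent.
    """
    project = lp.projects[project_name]
    return lp.bugs.createBug(
        target=project, title=title, description=description)
```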
*** cp16net has joined #openstack-infra18:29
*** zns has joined #openstack-infra18:30
*** markmcclain has quit IRC18:31
*** markmcclain has joined #openstack-infra18:31
openstackgerritNikhil Manchanda proposed a change to openstack-infra/config: Added new experimental job for trove functional tests  https://review.openstack.org/9851718:31
openstackgerritNikhil Manchanda proposed a change to openstack-infra/config: Use job-template for gate-trove-buildimage jobs  https://review.openstack.org/9968018:31
clarkbsdague: we are hitting iowait18:33
clarkbI think18:33
sdaguewhat's the storage backends for these?18:33
sdaguelocal ephemeral disks?18:33
clarkbsdague: nope local ephemeral isn't big enough18:33
clarkbit's cinder volumes or the rax equivalent18:34
sdaguesingle volumes, or something raided?18:34
clarkbfungi: care to look at sar -dp 5 5 on the ES nodes and tell me what you see18:34
clarkbsdague: single volumes18:34
lifelessttx: I see yes; I will track down18:34
sdagueapparently the ec2 trick is to allocate 4 volumes and stripe them18:35
sdagueI wonder if that would help here18:35
sdagueor if we're maxed on the network side18:35
fungiclarkb: i picked a random es node and it claims sar isn't installed... should it be?18:35
clarkbfungi: no, you need to install sysstat; I am installing it as I go18:36
fungik18:36
clarkbsdague: it looks like only one node may be affected18:36
ttxlifeless: I can't say i'm surprised -- blueprints never had a full API, it was all added piecemeal by platform team when we needed to scratch itches. Like I said, I probably authored 25% of it.18:36
clarkbsdague: which may be a rax side problem?18:36
sdagueclarkb: and one bad node hurts the others?18:36
clarkbsdague: or that node is being crazy as compared to the others18:36
sdagueis it the node handling api requests?18:37
clarkbsdague: yes because of replicas and searches hitting that disk18:37
clarkbsdague: yes it is that node too18:37
clarkbsdague: but it shouldn't be spooling any of that to disk18:37
clarkbapi requests should be in memory, which is why we went to much bigger nodes for more memory18:37
fungioh yeah, await is spiking up fairly high at points18:37
sdagueclarkb: but that will have no interaction with the local shard?18:38
lifelessttx: yah, I've pinged cprov and he's looking at how hard it would be to get the collection exposed18:38
sdagueI'm just wondering if we're in high load otherwise if it's impacting18:38
lifelessttx: if you look at bugs there is /bugs and there is the bugs type, specs only has the type exposed18:38
clarkbsdague: it may, we may find turning off e-r makes it stop18:38
sdaguewell, also when things are bad is when people are using logstash a lot to discover things18:39
*** markmcclain has quit IRC18:39
clarkbfungi: ya and now look at 0118:39
fungiyep18:39
dhellmannsdague: I'm back18:39
mgagnesdague: regarding the thread about capacity issues in the gate, would throwing more hardware/resources at the problem fix it, or would it just buy us some time until a deeper unknown problem (to me) is fixed?18:39
ttxlifeless: cool, thx. Keep me posted!18:39
fungipew pew lasers on the lvm pv18:39
clarkbsdague: the quick and easy thing to try is to stop apache on logstash.o.o18:39
mesteryfungi: FYI, I just received word from the OpenDaylight folks that their CI is now functioning normally again, if you have time, can you let them vote again? Thanks!18:39
lifelessttx: https://bugs.launchpad.net/launchpad/+bug/132942418:39
clarkbsdague: we may see everything get happy again18:39
uvirtbotLaunchpad bug 1329424 in launchpad "cannot create specification via API" [Undecided,New]18:39
sdagueclarkb: or the cron jobs18:40
ttxlifeless: like I said my current script works around it by spawning a browser window, which is kind of unwieldy :)18:40
clarkbsdague: they hit apache :)18:40
sdagueclarkb: oh, right :)18:40
clarkbit's like a giant valve I can turn off, which is nice18:40
fungimestery: done18:40
sdagueclarkb: sure, want to black it out for 30 minutes18:40
sdagueand see if it impacts things18:40
ttxlifeless: ok, subscribed myself18:40
clarkbsdague: ya lets try that18:40
mesteryfungi: thank you sir!18:40
clarkbI am stopping pupept and apache on logstash.o.o now18:40
sdaguemgagne: more nodes never hurts, but we've got just as bad a people-scaling problem in tracking down the fails18:41
fungisounds like a good test of the theory at any rate18:41
sdaguepeople are less elastic18:41
sdaguedhellmann: ok, so oslotest has had some interesting implications when it comes to zuul18:42
dhellmannsdague: yes?18:42
sdaguebecause in it's current job matrix it has joined the world into a single gate18:42
dhellmannoof18:43
dhellmannwe'll have a similar issue for oslo.config, oslo.i18n, etc.18:43
sdaguebecause it creates a set of transitive dependencies18:43
sdagueright18:43
sdagueso I'd like to propose a risk model here18:43
dhellmannI imagine we'll replace that matrix when we implement https://etherpad.openstack.org/p/juno-infra-library-testing18:44
sdaguein that we only run those jobs in check18:44
dhellmannthat's fair, since they are only unit tests18:44
sdaguethat does mean there is a chance we'll get a wedge across some projects if just the wrong set of things go through the gate18:44
sdaguehowever18:44
sdagueI think that's less pain than the current setup, which puts all the world into the same gate queue18:45
sdagueand made ironics fail issues back up everything else, for instance18:45
dhellmannyeah, let's remove those gate jobs18:47
lifelesssdague: btw I have a little confusion about the namespace thing we discussed with pcrews18:47
lifelesssdague: the current web ui looks like it has namespaces (all pipelines, gate pipeline, uncategorized) already18:48
dhellmannsdague: is the comment on line 52 of https://etherpad.openstack.org/p/juno-infra-library-testing accurate?18:48
*** markwash_ has joined #openstack-infra18:48
sdaguelifeless: we need another dimension18:48
lifelesssdague: ok18:49
lifelesssdague: thanks18:49
sdaguedhellmann: yes18:49
sdaguedhellmann: let me propose this as a config change, then we can discuss18:50
dhellmannsdague: sounds good18:50
*** markwash has quit IRC18:50
*** markwash_ is now known as markwash18:50
openstackgerritA change was merged to openstack-dev/pbr: Register testr as a distutil entry point  https://review.openstack.org/9927718:52
*** mrmartin has quit IRC18:57
*** SumitNaiksatam has quit IRC18:58
*** markmcclain has joined #openstack-infra18:59
*** SumitNaiksatam has joined #openstack-infra18:59
openstackgerritSean Dague proposed a change to openstack-infra/config: do not co-gate oslotest with the projects that include it  https://review.openstack.org/9973619:00
sdaguedhellmann: ok, I tried to be really verbose with that commit message19:00
sdagueif you can take a look19:00
dhellmannsdague: looking19:00
sdagueturning off ironic voting in gate seems to have vastly increased velocity19:01
dhellmannsdague: +119:02
sdaguefungi: hey, so we just got a gate reset19:02
sdaguecan I get a promote on the ceilo grenade fix?19:02
*** mancdaz has quit IRC19:02
*** johnthetubaguy has quit IRC19:02
*** changbl has quit IRC19:02
*** phschwartz has quit IRC19:02
fungisdague: yep, promoting as soon as those two at the top report19:02
*** phschwartz_ has joined #openstack-infra19:03
*** changbl has joined #openstack-infra19:03
sdaguefungi: awesome, thank you sir19:03
*** ramashri has quit IRC19:03
*** ramashri has joined #openstack-infra19:03
fungisdague: is the failure which caused the reset there what 99412 is trying to address?19:04
sdaguefungi: nope19:04
sdaguebut it's something that's in the grenade uncategorized fail list19:05
fungigiven it killed a ceilo change i was sort of wondering19:05
sdagueand I think we've got a buffering problem there which I'm hoping my new output filter solves19:05
sdagueyeh19:05
sdagueno, that's another thing19:05
*** mancdaz has joined #openstack-infra19:05
sdaguethere are 'so many' new bugs here19:05
*** radez_g0n3 is now known as radez19:05
*** johnthetubaguy has joined #openstack-infra19:05
sdagueI've rarely seen my changes get bounced twice for the same fail19:05
*** annegent_ has joined #openstack-infra19:06
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Including UX Feedback on menu and nav.  https://review.openstack.org/9920919:06
sdaguefungi / clarkb / mordred / SergeyLukjanov - https://review.openstack.org/99736 should also relieve some things19:06
sdagueand dhellmann is onboard19:06
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Error message & notification handling  https://review.openstack.org/9951519:07
sdaguethat's a config change19:07
sdaguethat will split the queues out19:07
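In layout.yaml terms the split amounts to listing the cross-project oslotest job under check only; the project and job names below are purely illustrative, not the actual change:

```yaml
# Illustrative zuul layout fragment: a consuming project runs the
# oslotest unit test job in check but not in gate, so approvals no
# longer join every consumer into one shared gate queue.
projects:
  - name: openstack/nova
    check:
      - gate-nova-python27-oslotest   # hypothetical job name
    # oslotest job intentionally absent from the gate pipeline
```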
clarkblooking19:07
fungisdague: ayup, saw the conversation. makes sense19:08
*** annegent_ has quit IRC19:08
sdaguefungi: the swift change at top of gate is failing on grenade19:08
sdagueI would just promote now19:08
sdagueand give those a second go19:08
sdaguebecause that swift change is failing badly and slowly19:08
sdagueit's going to take a long time to report19:08
*** freyes_ has joined #openstack-infra19:09
clarkbsdague: so trying to grok that change, does it do enough to break the transitivity?19:10
clarkbsdague: oh wait I think I grok, those tests are run on the other projects hence the transitive inclusion19:10
sdagueclarkb: right19:10
clarkbbut removing them from the source side breaks that19:10
sdaguein the gate queue19:11
sdagueyep19:11
*** phschwartz_ is now known as phschwartz19:11
clarkbok approved19:11
clarkbfungi: the other thing that we may want to do in the nearish future is restart zuul to pick up that fix for the swift stuff19:11
clarkbfungi: but I am happy to focus on gate fixes19:11
fungisdague: done19:12
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Removed old template  https://review.openstack.org/9973819:12
sdaguefungi: thanks19:12
phschwartzclarkb: So I thought about it more and more. If there are no objections I am going to go down the road of having a new thread for rebuilds and have everything that currently goes to deleted go to rebuild and let the rebuild thread determine if it should be moved to deleted.19:12
clarkbphschwartz: sounds good to me19:12
*** e0ne has quit IRC19:13
*** saper has quit IRC19:13
clarkbso we are still doing an order of magnitude more data on es0119:13
clarkbbut stopping apache has helped19:13
*** e0ne has joined #openstack-infra19:14
*** chmartinez has left #openstack-infra19:14
clarkbI have no idea what makes es01 special at this point. Maybe it is still trying to deal with requests?19:14
sdaguehmmm19:14
fungiclarkb: is that just elasticsearch not sharding with an eye for access volume?19:14
fungimaybe es01 got "lucky" and has the most accessed shards?19:14
clarkbfungi: the shard allocation is random in our setup so we should see even writes across all of them19:15
fungihuh. okay. scratch that idea19:15
clarkbI mean there may be a bug in that19:15
sdagueclarkb: would it be worth trying to do the striped raid of volumes on that node to increase its io throughput?19:15
clarkbsdague: possibly, I think first I need to figure out what makes that node special19:15
sdaguethat's fair19:15
clarkbsdague: the other potential thing to try is doing non data nodes19:16
clarkband let them deal with searches19:16
sdaguesure19:16
phschwartzclarkb: oh, and as to your question yesterday about requesting more volume space, it would have to be discussed with Pvo, I know mordred has been in contact with him so you might want to ask him to pose the question.19:16
clarkbthey essentially act as fat caches to take strain off the indexing nodes19:16
clarkbphschwartz: thanks19:16
*** e0ne has quit IRC19:16
sdagueok19:16
clarkbsdague: but if we add a second volume to each node I definitely think we can try doing the raid approach19:17
sdagueman, our deletes really are backing up as well19:17
sdagueon overall nodes19:17
sdaguelooks like > 50% are currently deleting19:17
*** _nadya_ has joined #openstack-infra19:17
fungilooks like most of rax-dfw is deleting19:19
fungiand most of rax-iad19:19
*** ociuhandu has quit IRC19:19
fungiand about half of ord19:19
phschwartzLet me look at our cloud monitor19:19
fungimany have been attempting to delete for 1-2 hours19:21
clarkbes01 is starting to get more inline with the other 5 nodes now19:22
fungiactually more than half of the nodes nodepool wants to delete in rax regions have been in that state for more than 4 hours19:22
clarkbI want to watch and see if it is consistent that way19:22
*** ArxCruz has joined #openstack-infra19:22
phschwartzfungi: can you get me a list of all of the uuids of the instances stuck in deleting across the regions. I will get ops involved.19:22
jesusaurusdid we upgrade the version of puppet? im seeing a lot of new deprecation warnings in the puppet-apply-precise test19:23
fungiphschwartz: from what i've seen they're not so much "stuck" as getting ignored and then periodically retried19:23
fungiphschwartz: nodepool requests a delete from nova, the client call returns, the node never disappears, nodepool throws it back into the queue and tries again in 10 minutes or so, lather, rinse, repeat19:24
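The lather-rinse-repeat cycle fungi describes can be sketched as a small loop. This is only an illustration of the behavior, not nodepool's actual code; `request_delete` and `still_exists` are hypothetical callables standing in for the nova delete call and the `nova list` check:

```python
import time

def delete_until_gone(request_delete, still_exists, poll=0.01,
                      wait=0.05, max_attempts=5):
    """Issue a delete, watch for the node to disappear, and re-issue
    the delete if it is still listed once the wait period expires."""
    for attempt in range(1, max_attempts + 1):
        request_delete()                  # client call returns OK...
        deadline = time.monotonic() + wait
        while time.monotonic() < deadline:
            if not still_exists():        # ...but did the node vanish?
                return attempt
            time.sleep(poll)
    raise RuntimeError("node never disappeared after %d delete attempts"
                       % max_attempts)
```

In nodepool the retry interval is on the order of 10 minutes; the tiny defaults above just keep the sketch quick to exercise.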
jesusaurusclarkb: did you figure out what was mucking up your es node?19:24
*** ildikov has joined #openstack-infra19:24
*** markmcclain has quit IRC19:25
fungii'm going to do a bulk parallel delete of anything nodepool's been trying to delete for at least an hour19:25
clarkbjesusaurus: seems possibly related to hammering one of the nodes with searches19:25
clarkbjesusaurus: the disk on that node couldn't keep up19:25
sdaguefungi: we should just write a tool that opens a ticket automatically when it takes more than an hour to delete a node :)19:25
clarkbhttp://gibrown.wordpress.com/2014/01/09/scaling-elasticsearch-part-1-overview/ is interesting19:25
openstackgerritAdam Gandelman proposed a change to openstack-infra/config: Pre-cache UCA packages during nodepool img build  https://review.openstack.org/9974019:25
clarkba quick glance shows we are trying to do more with less :/19:26
phschwartzfungi: hold off for a couple of min if you can. I want an admin to look if they can19:26
jesusaurusclarkb: huh19:26
clarkbfungi: sdague: the other thing we can try is ssd volumes19:26
fungiphschwartz: okay, will do19:26
sdagueclarkb: oh... that's an option?19:26
clarkbso I think we have a lot of options. but may need to chat with pvo19:27
clarkbsdague: possibly. They exist :)19:27
*** ArxCruz has quit IRC19:27
*** denis_makogon has joined #openstack-infra19:27
clarkbmordred: ^ is that something you want to do?19:27
clarkbpvo doesn't seem to be resident here anymore19:27
sdagueright, I guess it's hard to know if the bottleneck is the volume backend, or the path to the volume19:27
clarkbright19:27
clarkband before we make big changes would be good to have an understanding from rax19:27
sdaguecan phschwartz help us figure that one out? :)19:28
clarkbhe mentioned talking to pvo19:28
phschwartzI will ping  him, but the request coming to him from mordred would probably be better.19:28
clarkbphschwartz: ok19:29
clarkbthanks for the help19:29
clarkb01 is still higher than 02 though19:30
openstackgerritAntoine Musso proposed a change to openstack-infra/zuul: Make swiftclient an optional dependency  https://review.openstack.org/9793319:30
mordred_phoneclarkb: I can ping him19:31
*** thuc has joined #openstack-infra19:33
*** smarcet has quit IRC19:33
*** hashar has joined #openstack-infra19:35
clarkbthe big difference seems to be es01 is much more heavy on reads than es0219:35
clarkbwhich may still be fallout from being the api endpoint node19:35
*** mrmartin has joined #openstack-infra19:37
jesusaurusclarkb: oh, you arent load-balancing requests across all the nodes?19:37
openstackgerritPhilip Marc Schwartz proposed a change to openstack-infra/config: Creation of vinz project in the openstack-infra scheme.  https://review.openstack.org/9395319:37
*** thuc has quit IRC19:38
phschwartzmordred: fungi: anteaya: ^ that is the rebase due to the merge failure19:38
*** ominakov has joined #openstack-infra19:39
*** andreykurilin_ has quit IRC19:40
*** cp16net has quit IRC19:40
anteayaphschwartz: you are a trooper19:41
*** ominakov has quit IRC19:41
phschwartzOnly thing I hate about long running reviews. The need for multiple rebases. lol19:41
jogowhy are so many of our cloud resources in deleting state?19:43
clarkbjesusaurus: no because it's supposed to do that for me19:43
phschwartzfungi: before attempting to force the delete, can you open a ticket for the issue. I have pinged some people to look and one of our nova devs is looking at one of the instances now.19:43
clarkbbut apparently the chosen node is hit harder than I expected19:43
*** markmcclain has joined #openstack-infra19:45
*** james_li has quit IRC19:46
mrmartinhi. what is the proper way to add compass (http://compass-style.org), a css styling tool required to compile sass files to css, to the jenkins slave? The problem is that the ubuntu 12.04 LTS package is very old, but we could use a "gem install compass" to deploy the latest stable version.19:46
*** smarcet has joined #openstack-infra19:46
*** nati_ueno has quit IRC19:47
openstackgerritAlexandre Viau proposed a change to openstack-infra/config: Added the Surveil project to gerritbot and stackforge config  https://review.openstack.org/9974619:47
mgagnemrmartin: does your project have a Gemfile or use Bundler already?19:47
jogoclarkb: logstash.o.o is giving me errors19:47
*** e0ne has joined #openstack-infra19:48
mrmartinmgagne: not yet. what I want to achieve is to remove the pre-compiled css files from the github repository and clean up styling-related patches.19:48
*** _nadya_ has quit IRC19:49
clarkbjogo yup we killed it19:49
*** ihrachyshka has joined #openstack-infra19:49
mgagnemrmartin: we (puppet modules for openstack) are already relying on Bundler to install specific gem versions of our dependencies. I guess it should be trivial to do the same with yours.19:49
jogoclarkb: ahh19:49
clarkbjogo es issues seem related to search volume19:50
jogoclarkb: I take it, it will be coming back at some point in the not so distant future19:50
mrmartinmgagne: could you show me some example in the current infra repo?19:50
clarkbwe are hitting iowait so bad19:50
jogoclarkb: thanks19:50
jogoouch19:50
mgagnemrmartin: sure, hold on19:50
jogomoar cloud19:50
*** thuc has joined #openstack-infra19:51
*** markmcclain has quit IRC19:51
swestonanteaya:  Hello, again.  I have just been informed by Shiv that he is using the key attached to Brocade CI.19:51
mgagnemrmartin: one of our job: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/jenkins_job_builder/config/puppet-module-jobs.yaml#L8-L1319:52
*** praneshp has quit IRC19:52
mrmartinmgagne: thank you, I'll review this19:52
mgagnemrmartin: afaik, with bundler, you are expected to wrap your call with bundle exec to have access to console scripts provided by gems19:53
swestonanteaya:  I sent a correction to my ml request.  I just want to make sure I updated you with the latest information.19:53
*** cp16net has joined #openstack-infra19:53
*** thuc has quit IRC19:53
*** thuc has joined #openstack-infra19:53
swestonanteaya: never mind, I see you are already on top of it :-)19:54
mrmartinmgagne: I had a talk with this guy today in the same topic, so compass / bundler integration seems to be working here: http://cheppers.com/blog/bundlerize-your-sassy-themes19:54
mgagnemrmartin: might not be what you want if console scripts are executed from tox, a Makefile or whatever you use in your project. Console scripts are expected to be found in the system search path.19:54
anteayasweston: k19:54
*** pvo has joined #openstack-infra19:54
anteayasweston: please confirm my emailed statement is correct, posting below my response is preferred19:55
swestonanteaya: yes, your statement is correct.19:55
anteayasweston: thanks19:56
mrmartinmgagne: I want to add here: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/jenkins_job_builder/config/groups.yaml so I guess it will work19:56
*** nati_ueno has joined #openstack-infra19:56
swestonanteaya:  apologies, the email just left my inbox ... I will post below in future correspondence with the ml.19:56
*** james_li has joined #openstack-infra19:56
anteayasweston: thanks, I appreciate that19:56
*** ArxCruz has joined #openstack-infra19:57
*** markwash has quit IRC19:58
*** e0ne has quit IRC19:59
*** markwash has joined #openstack-infra20:00
*** e0ne has joined #openstack-infra20:00
*** e0ne has quit IRC20:02
fungiphschwartz: i could open a ticket but i'm not sure how to characterize it (nor do i have a good description for how to trivially recreate the condition)20:04
*** ArxCruz has quit IRC20:04
openstackgerritJoe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable branches  https://review.openstack.org/9975020:05
jogosdague: ^20:05
*** dprince has quit IRC20:05
sdaguejogo: I think we only want to exclude havana20:05
fungiphschwartz: i'm mostly afk, but if i had time i would probably dig in the nodepool logs and a thread dump to identify exactly why nodepool is unable to delete from one provider in as timely a manner as another20:05
sdagueI think we should still run it on icehouse20:05
jogosdague: sure20:05
* fungi is currently working from the seat of a car in a grocery store parking lot while people are touring his old residence20:06
*** zns has quit IRC20:07
sdaguefungi: awesome20:07
jogosdague: and it was the wrong job20:07
*** Alexei_987 has joined #openstack-infra20:07
sdaguejogo: ++20:07
*** radez is now known as radez_g0n320:07
*** zns has joined #openstack-infra20:07
*** malini1 has quit IRC20:08
openstackgerritA change was merged to openstack-infra/config: do not co-gate oslotest with the projects that include it  https://review.openstack.org/9973620:08
clarkbfungi: you should be in a bar watching world cup20:09
*** mbacchi has quit IRC20:10
*** ramashri_ has joined #openstack-infra20:10
openstackgerritJoe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable/havana branches  https://review.openstack.org/9975020:10
jogosdague: take two ^20:10
*** ramashri has quit IRC20:11
sdaguejogo: you still have copy/paste errors20:11
sdaguelook at the diff20:11
jogosdague: ahh20:12
*** rlandy has quit IRC20:12
jogosdague: take 320:14
openstackgerritJoe Gordon proposed a change to openstack-infra/config: Don't run large-ops test on stable/havana branches  https://review.openstack.org/9975020:14
jogofungi: for the delete issue in nodepool, is the issue only on rax or both clouds?20:16
jogofungi: I wonder if we can fix something in nova to help this20:16
fungijogo: all our providers lag some on delete calls, but at the moment most of our rax quota is tied up in instances nodepool wanted deleted hours ago20:17
jogofungi: there is a force delete command not sure if its admin only though20:17
*** boris-42 has quit IRC20:18
*** bookwar has quit IRC20:18
*** morganfainberg has quit IRC20:18
fungijogo: that's not so much the issue. retrying the delete has a ~% chance of either working or just being ignored from what we've seen in the past20:18
*** boris-42 has joined #openstack-infra20:18
*** bookwar has joined #openstack-infra20:18
*** morganfainberg has joined #openstack-infra20:18
jogofungi: so it sounds like a nova bug?20:19
fungifrom any provider (parts of hpcloud 1.0 were pretty bad about that too)20:19
fungijogo: maybe. depends on what sort of modifications they have in place compared to vanilla nova, how it's impacted by load/performance issues/whatever on the underlying infrastructure, et cetera20:20
*** mbacchi has joined #openstack-infra20:20
*** ramashri_ has quit IRC20:21
*** a2hill has joined #openstack-infra20:21
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: remove old queries  https://review.openstack.org/9975620:22
fungibasically nodepool issues a nova delete and then if that returns an ok response periodically looks at the nova list to watch for the instance to finally disappear. sometimes it never does, so nodepool retries that periodically until it finally works20:22
*** alkari has quit IRC20:22
sdagueclarkb: I wonder if our query growth impacts things20:22
openstackgerritA change was merged to openstack-infra/config: Creation of vinz project in the openstack-infra scheme.  https://review.openstack.org/9395320:22
clarkbsdague: its possible20:22
clarkbsdague: also possible that we have a bunch of terrible queries20:22
sdagueclarkb: yeh, so are the logs of the batch jobs contained anywhere?20:23
sdagueI could put timers around the queries so that we can determine which ones seem most expensive20:23
clarkbuh probably not. we just run them out of cron right?20:23
sdagueclarkb: yep20:23
clarkbsdague: but we should be able to have them write to /var/log/es-batch-jobs or whatever20:24
jogojaypipes: ^^^20:24
jogofungi: I wonder if we can somehow reproduce that issue20:24
fungiclarkb: sdague: does the e-r web dashboard call to es? is it possible the additional query load coincides with when the change merged to replace the old rechecks page with it?20:24
sdaguefungi: no, it's batch generated20:25
clarkbfungi: no it generates each of those 3 tabs twice and hour20:25
clarkbso 6 jobs per hour20:25
jogofungi: the bot goes way more often20:25
sdagueso at least that's a fixed cost that we control20:25
*** esker has joined #openstack-infra20:25
sdaguejogo: it processes in serial though20:25
fungiokay. i just notice about a 10-second lag between when the empty page with theming comes up and when the graphs appear, so didn't know what was being queried20:25
jogosdague: true20:25
sdaguefungi: yeh, that's the json loading over the network for the graphs20:25
fungik20:26
sdagueas the graphs are client side20:26
fungiunrelated then20:26
sdagueyep20:26
jogofungi: so when I try deleting a rax instance locally its quick20:28
fungidoes the query volume per job failure increase linearly with the number of classification queries e-r has at its disposal?20:28
jogofungi: is there anything special on these instances? volumes etc20:28
*** nati_ueno has quit IRC20:28
sdaguefungi: yes20:28
fungijogo: instance booted from snapshot. it also very well may be related to the volume of instance operations we perform in our tenant20:29
*** esker has quit IRC20:29
jogojaypipes: ^20:29
sdaguejogo: that's 16 queries I think we can drop - https://review.openstack.org/9975620:29
jogoahhh20:29
fungisdague: do we periodically de-cruft the old classified entries we aren't hitting any longer?20:29
sdaguefungi: manually20:29
fungiokay20:29
sdaguethat's what I was just doing20:30
fungiyup, you read my mind, or something ;)20:30
jogosdague: have you updated the related bugs in launchapd?20:30
*** esker has joined #openstack-infra20:30
sdaguejogo: nop20:30
jogosdague: want to do that ?20:30
jogothen +W from me20:30
*** wenlock_ has joined #openstack-infra20:32
sdaguejogo: I'm not sure what that is20:32
*** a2hill has quit IRC20:32
*** blogan has joined #openstack-infra20:32
sdaguejogo: you should do the thing you want to do in launchpad, I always just delete these things20:33
clarkbsdague: I want to see if it will catch up in this state20:33
clarkbsdague: then we can try turning apache back on20:33
sdagueclarkb: yeh, the slope looks good20:33
*** radez_g0n3 is now known as radez20:33
clarkbsdague: it may be that when e-r doesn't have to ask over and over and over for a change that we get better behavior20:33
sdagueclarkb: agreed20:34
phschwartzfungi: ty, do you want the request for the groups created to be an email or here?20:35
*** julim has quit IRC20:35
*** openstackgerrit has quit IRC20:35
*** praneshp has joined #openstack-infra20:35
fungiphschwartz: in here is fine. the groups get automatically created but i have to manually add people to them20:36
*** marcoemorais has quit IRC20:36
*** markmcclain has joined #openstack-infra20:36
fungibut i won't get to it for a bit still20:36
*** marcoemorais has joined #openstack-infra20:36
*** openstackgerrit has joined #openstack-infra20:36
*** wenlock_ has quit IRC20:36
phschwartzfungi: They are vinz-core and vinz-ptl20:38
phschwartzfungi: not a problem, I can wait.20:38
phschwartzhow often does puppetmaster get updated and propagate the changes anyways? I have never asked.20:39
jogosdague: just mark them as invalid etc20:39
sdaguejogo: the bug might not be invalid20:39
sdaguesome of these bugs are still out there, we're just not matching them any more20:39
jogoso the ones that are20:39
sdagueI don't know which ones those are20:39
sdaguethe whole point when we age out is just that this query is no longer matching that bug20:40
sdaguethe reasons for that might be that it's fixed, or it moved around20:40
jogosdague: so for example https://bugs.launchpad.net/swift/+bug/120908620:41
*** bknudson has left #openstack-infra20:41
uvirtbotLaunchpad bug 1209086 in swift "grenade tests fail with error trying to create container" [Medium,Confirmed]20:41
jogoI am just commenting20:41
*** otherwiseguy has quit IRC20:41
jogoto at least give folks some insight20:41
sdaguegotcha20:41
*** blogan has left #openstack-infra20:42
sdagueso long term we should probably do a post job that comments on launchpad bugs when we add or remove queries20:42
lifelesssdague: heh, I suggested that the other day20:42
lifelesssdague: but I suggest you use a bug attachment20:42
*** gyee has joined #openstack-infra20:42
sdaguelifeless: why an attachment?20:42
lifelesssdague: then you can have the current query as a yaml file attached to the bug, and if theres no file there is no attachment20:42
lifelesssdague: so you don't need to read through N comments to figure it out20:43
*** __afazekas is now known as afazekas20:43
sdaguewell given how often lp times out, a comment seems safer, as there won't be 100% consistency20:43
sdaguealso I vaguely know how to do that with lplib :)20:43
lifelesssdague: don't see how comments or attachments are safer, same API servers in use, same notification code20:44
lifelessalso attachments trigger notifications. up to you though20:44
sdagueyeh, but if we are just telling people something has changed in er, then they come back to er for source of truth20:44
lifelessthere's example attachment code on the api page- api.l.n/devel/20:44
*** nati_ueno has joined #openstack-infra20:44
sdagueI worry about pushing out the actual data to lp, because then it feels we need to be more responsible for making sure it's consistent20:44
openstackgerritAlexandre Viau proposed a change to openstack-infra/config: Added the Surveil project to gerritbot and stackforge config  https://review.openstack.org/9974620:46
lifelessmordred_phone: can we please release pbr 0.8.3 to get the testr fixout? setup.py test doesn't accept options, so we do need that fix.20:47
lifelessmordred_phone: I can tag it if you're ok with a release20:47
*** dims_ has quit IRC20:47
openstackgerritlifeless proposed a change to openstack-dev/pbr: Allow examining parsing exceptions.  https://review.openstack.org/8085620:47
openstackgerritlifeless proposed a change to openstack-dev/pbr: Teach pbr VersionInfo about debian versions.  https://review.openstack.org/8107420:47
openstackgerritlifeless proposed a change to openstack-dev/pbr: Teach pbr about post versioned dev versions.  https://review.openstack.org/8044920:48
openstackgerritlifeless proposed a change to openstack-dev/pbr: Use the current pbr for testpackage tests.  https://review.openstack.org/9410720:48
openstackgerritlifeless proposed a change to openstack-dev/pbr: Add a converter to version_tuples.  https://review.openstack.org/8045720:48
openstackgerritlifeless proposed a change to openstack-dev/pbr: Break out a common version object from VersionInfo  https://review.openstack.org/9410820:48
jogosdague: done20:51
*** ramashri has joined #openstack-infra20:51
sdaguejogo: yeh, I was racing with you through that20:53
*** chianingwang has quit IRC20:54
*** mrmartin has quit IRC20:56
sdagueclarkb: so it occurs to me that the bot is actually querying across all indexes for the real time queries20:56
sdaguewhen it probably only needs the most recent one20:57
*** e0ne has joined #openstack-infra20:57
sdagueI wonder if a time boundary there will help20:57
clarkboh yes20:57
clarkbI didn't realize it was hitting all of them20:57
sdaguewell it has no bounds20:58
clarkbah20:58
clarkbya we should change that20:58
sdaguewhich I assume means all20:58
clarkbyup if you don't bound it is all indexes20:58
sdaguewhat's the syntax for "since"20:58
clarkbwell you query a specific index20:58
clarkbeg logstash-todaysdate20:58
sdagueso we have a race across rotation?20:58
clarkbyes there will be20:59
sdagueI thought there was a time range20:59
clarkbsdague: there is too but it will search all indexes for that time range20:59
sdaguecan I get the last 2 indexes?20:59
clarkbsdague: yeah you can comma delimit them20:59
*** thuc_ has joined #openstack-infra20:59
clarkb/index1,index2/query or whatever21:00
openstackgerritA change was merged to openstack/requirements: Bump pep8 from 1.5.6 to 1.5.7  https://review.openstack.org/9794421:00
sdaguewe need to change the query url?21:00
clarkbsdague: yes21:00
clarkbsdague: otherwise it searches all indexes21:00
sdagueso the 15 minute searches via logstash come back pretty fast21:00
sdagueit's going to be easier in the code to do the date range, I'm wondering if that's going to be good enough21:01
clarkbfor a timestamp range you do @timestamp:[2014-06-01T12:12:12Z TO 2014-06-02T12:12:12Z] but that searches all indexes for that range21:01
clarkbsdague: because indexes aren't necessarily date bound, its just how logstash does it21:01
sdagueoh21:01
clarkbsdague: so kibana is being smart when you say give me last 15 minutes21:01
clarkbI think21:02
sdagueah, gotcha21:02
sdaguewe always rotate at UTC 00:0021:02
sdague?21:02
clarkbyes21:02
*** gokrokve_ has joined #openstack-infra21:02
sdagueok, let me figure out if I can put the same smarts into our side21:02
clarkbok21:02
clarkblet me know if you have other questions21:02
clarkbI can dig into the kibana source but there is a config option to tell it when rollover happens beacuse it does this magic too21:03
sdaguewill do, but I probably just need to dive on this for a little bit21:03
*** thuc has quit IRC21:03
jogofungi: to be clear, what is the workflow for creating and deleting an instance for you guys: 'nova boot'; 'nova image-create' to create a custom image; and boot from that image?21:04
clarkbjogo: yes21:04
clarkbnodepool boots off of the "base" image provided by our providers21:04
phschwartzfungi: It looks like the instances stuck in deleting might be an issue on our side and some of them have been stuck for multiple days. (looks like possibly 5 days or more back)21:04
clarkbthen it runs scripts on that to create our image, snapshots that and deletes the node the snapshot was taken from21:04
jogothanks, I am trying to locally (with rax) reproduce the slow deletes21:04
clarkbjogo: then we boot off of that snapshot for all of the slaves21:04
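The image-build flow clarkb lays out (boot the provider's base image, run the prep scripts, snapshot, delete the builder, boot slaves from the snapshot) can be sketched as below. `client` here is a hypothetical object with boot/run/snapshot/delete methods, not a real novaclient interface:

```python
def build_snapshot_image(client, base_image, flavor, setup_scripts):
    """Sketch of the nodepool-style snapshot image build described
    in the conversation above."""
    builder = client.boot(base_image, flavor)
    for script in setup_scripts:
        client.run(builder, script)       # customize the image contents
    snapshot = client.snapshot(builder)   # 'nova image-create'
    client.delete(builder)                # builder node is disposable
    return snapshot                       # all slaves boot from this
```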
*** gokrokve has quit IRC21:05
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Remove boostrap.js  https://review.openstack.org/9963821:06
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Removal of jquery  https://review.openstack.org/9966021:06
*** nati_ueno has quit IRC21:07
*** thuc has joined #openstack-infra21:07
clarkbsdague: also the gerrit comments should come with a timestamp21:09
jogophschwartz: oh?21:09
clarkbsdague: it is probably relatively straightforward to convert that into a N and N-1 index21:09
NobodyCamoh new ironic / DIB queue...21:09
phschwartzjogo: yeah, we are still trying to trace the issue, just was letting fungi know where we stand at the moment.21:09
*** melwitt has joined #openstack-infra21:09
*** thuc_ has quit IRC21:10
NobodyCamis that new queue permanent?21:10
fungiphschwartz: thanks. added you to those groups just now too21:11
*** jerryz_ has joined #openstack-infra21:11
*** markmcclain has quit IRC21:12
*** markmcclain1 has joined #openstack-infra21:12
phschwartzfungi: ty21:12
fungiNobodyCam: is anything here ever permanent? it's a result of taking oslo cross-tests off those projects for the gate pipeline, which caused them to no longer have jobs in common with anything in the main integrated gate queue21:13
NobodyCam:) ahh ok Ty fungi :)21:14
fungiso their job failures no longer impact time to land anything in the larger queue21:14
NobodyCamand vis versa21:14
fungiyup21:14
jogowow booting from a snapshot is super duper slow21:14
*** doug-fish has left #openstack-infra21:15
clarkbjogo: yes its one reason we want to move away from it (but not the most important reason)21:15
openstackgerritA change was merged to openstack-infra/elastic-recheck: remove old queries  https://review.openstack.org/9975621:17
*** mbacchi has quit IRC21:18
*** fifieldt_ has quit IRC21:19
JayFNow up to 5 times I've been bounced from the list :(21:21
openstackgerritCraig Bryant proposed a change to openstack-infra/config: Add the python-monascaclient  https://review.openstack.org/9976721:22
sdagueclarkb: what's the naming convention of the indexes?21:22
sdaguealso I moved to the living room so that sportsball is on21:22
clarkbsdague: yes I have done the same >_>21:22
clarkbsdague: one sec I will get it for you21:22
clarkbsdague: http://logstash.net/docs/1.4.1/outputs/elasticsearch#index we use the default21:23
clarkbso logstash-2014.06.12 for today21:23
*** e0ne has quit IRC21:26
*** e0ne has joined #openstack-infra21:26
*** andreykurilin_ has joined #openstack-infra21:28
*** mmaglana has joined #openstack-infra21:29
phschwartzsdague: for your devstack-vagrant, what format is it looking for, for the password?21:29
sdaguephschwartz: it's the hashed value in /etc/shadow21:30
lifelesssdague: so anyhow - https://review.openstack.org/#/c/92497/ delete the whole start-output ?21:30
sdaguelifeless: yeh, I think so21:30
sdaguephschwartz: what you'd pass usermod -p21:31
zaroclarkb: you know what's up with hpcloud and sudo?  when i use sudo, the command takes about 10 times longer to execute21:31
clarkbzaro: might be doing a name lookup?21:31
*** e0ne has quit IRC21:31
phschwartzsdague: makes sense. I just have to remember what hash alg linux uses for that as I am on a mac. lol21:32
clarkbzaro: those nodes are really slow too21:32
sdaguephschwartz: just spin up a linux node set the password, and snag the value21:32
sdaguethat's what I do :)21:32
zaroclarkb: yes, it's slow, but crawling when sudo-ing.21:33
zaromordred: ^ do you see same issue?21:35
*** fifieldt_ has joined #openstack-infra21:36
lifelesszaro: almost certainly hostname21:36
lifelesszaro: check that hostname is in /etc/hosts21:36
fungizaro: yeah, gethostbyname() calls. you could strace the sudo process to see where it's hanging if you want to confirm21:37
*** mfer has quit IRC21:39
*** smarcet has quit IRC21:40
jogofungi: just reproduced the slow delete, very odd21:41
jogodid it without a snapshot21:42
fungioh! interesting21:42
jogoa second delete helped21:42
*** esker has quit IRC21:44
openstackgerritMaxime Vidori proposed a change to openstack-infra/storyboard-webclient: Documentation improvment  https://review.openstack.org/9977521:45
*** HenryG has quit IRC21:45
*** esker has joined #openstack-infra21:45
fungijogo: yep, same for us. basically nodepool retries them periodically, then sticks them back into its delete queue if they don't disappear within the expected timeframe, and it tries again later21:47
fungiand then eventually they're freed up21:48
*** mriedem has quit IRC21:48
*** esker has quit IRC21:49
*** lbragstad has quit IRC21:50
*** masayuki_ has joined #openstack-infra21:50
*** radez is now known as radez_g0n321:50
*** mrodden has quit IRC21:51
jogovery odd, next step is to reproduce in devstack (doubtful)21:51
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: have realtime engine only search recent indexes  https://review.openstack.org/9977621:52
sdagueclarkb ^^^21:52
sdaguealso jogo, and any other er people21:52
clarkblooking21:52
sdaguegah, I missed a thing21:53
sdagueone second21:53
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: have realtime engine only search recent indexes  https://review.openstack.org/9977621:53
sdagueit helps to actually pass the index param to search21:53
clarkbI was just going to ask about that21:54
clarkbsdague: is that tested?21:54
clarkbsdague: the approach is sound to me21:54
sdagueit is not, I just finished it21:54
*** hashar has quit IRC21:55
sdagueI did only start 50 minutes ago :)21:55
mtreinishsdague: yeah it looks reasonable to me, but we probably should test it :)21:55
phschwartzfungi: Looks like nova refuses all requests to delete while instances are in deleting state, so all the spam of trying to delete after the first is a waste. (the API shouldn't time out, though, as the error is seen in nodepool)21:55
clarkbsdague: also sportsball21:55
sdagueclarkb: yeh21:55
sdaguethough I was kind of rooting against br21:56
*** marcoemorais has quit IRC21:56
*** markmcclain1 has quit IRC21:56
sdaguebecause, that would be funny21:56
jogo++ to it sounds very reasonable to me21:56
*** markmcclain has joined #openstack-infra21:56
*** marcoemorais has joined #openstack-infra21:56
*** lcostantino has quit IRC21:56
*** dims_ has joined #openstack-infra21:57
*** praneshp_ has joined #openstack-infra21:57
*** andreykurilin_ has quit IRC21:57
sdagueclarkb: do we have the api back on yet?21:58
*** jamielennox is now known as jamielennox|away21:58
fungiphschwartz: i'm not entirely sure that's true. we regularly see deletes requested which basically never get processed, but then clear up immediately on a subsequent delete call21:58
sdagueor are we waiting for that to fully burn down ?21:58
clarkbsdague: not yet21:58
clarkbI was waiting for it to burn down but its picking up again21:58
clarkband load on es01 is climbing21:58
sdagueok, any idea what else is going on?21:58
clarkbno according to bigdesk there are no searches21:59
lifelesshuh21:59
*** praneshp has quit IRC21:59
*** praneshp_ is now known as praneshp21:59
lifelesswhy is hudson-openstack closing bugs on merge, rather than fix-committing them ?21:59
*** thuc has quit IRC21:59
lifelesssee https://bugs.launchpad.net/tripleo/+bug/132709021:59
uvirtbotLaunchpad bug 1327090 in tripleo "can't deploy ci-overclouds on Ubuntu - ensure-bridge wipes out /e/n/i" [High,Fix committed]21:59
lifeless(I just put it to fix committed, which is the state it should have)22:00
clarkblifeless: it is a toggleable option22:00
fungilifeless: it depends on how your project is configured in review.projects.yaml22:00
*** thuc has joined #openstack-infra22:00
lifelesshmm22:00
clarkbsdague: iotop shows es doing a lot of reads and sar corroborates that as the slowness22:00
fungilifeless: it sounds like that project is set for direct release, implying it's one which doesn't do real releases and is just used from trunk22:00
*** MarkAtwood has quit IRC22:00
clarkbsdague: so I don't think this is purely related to queries22:00
clarkbI am half tempted to restart es on that node22:01
lifelessis this new? It's wrong.22:01
clarkbI guess I can strace22:01
fungilifeless: no idea. looking to see now which one it is and git-blaming the file for you22:01
*** mrda-away is now known as mrda22:01
sdagueclarkb: yeh, I would say we should take the opportunity to try to diagnose while we've got the query side off22:02
lifelessfungi: 0a9d800b modules/openstack_project/files/review.projects.yaml         (Monty Taylor              2013-12-13 12:12:54 -0500 441)     - direct-release22:02
*** nati_ueno has joined #openstack-infra22:03
fungilifeless: yup22:03
fungilifeless: nix that one line and it will do the default thing which is to set to fix committed on merge22:03
openstackgerritlifeless proposed a change to openstack-infra/config: Unbreak tripleo projects  https://review.openstack.org/9977822:04
fungiheh22:04
lifelessfungi: obviously we're going to want that in quickly :)22:04
phschwartzfungi: all of your current instances in deleting state have tons of requests going through our api's but when they get to the api at the cell level are kicked with this error. 2014-06-10 11:25:34.426 25720 INFO nova.compute.api [req-c314c89d-58ac-46ae-8a60-57f0edbfac54 156185 637776] [instance: 1bc3554b-1f2a-44c7-b9a7-3d8bc956f7cd] Instance is already in deleting state, ignoring this request22:04
*** thuc has quit IRC22:04
clarkboh cells22:04
phschwartzfungi: so if it makes it to deleting state from the nova-api they are ignored after that22:04
fungiphschwartz: sounds like a different class of problem than we're used to seeing in that case22:05
sdagueyeh, probably something new.22:05
fungiphschwartz: just to be clear, are these the handful which are undeletable or the hundreds which seem to delete if i call nova delete again on them22:05
phschwartzfungi: I think it is and I am thinking on how it should be handled from node-pool that way it doesn't keep retrying if they are in a deleting state22:05
sdaguejaypipes did point out a cells delete fix in review, but the fact that cells is basically untested upstream I'm sure doesn't help22:05
sdaguehttps://review.openstack.org/#/c/93860/22:06
fungiphschwartz: oh, yeah if nova shows the state is deleting that's a different class of problem than the one i was thinking of22:06
phschwartzfungi: All the ones that are undeletable because they are stuck in the deleting state. as the delete errored and it looks like nova never reset the vm_state to error.22:06
fungiwe also regularly see many which remain in active state according to nova after a delete is requested22:06
*** amcrn has quit IRC22:06
*** esker has joined #openstack-infra22:06
phschwartzfungi: I have seen that before and if a retry is done it will delete.22:06
fungiyep22:06
phschwartzfungi: this issue is if the delete actually tried to happen22:07
fungithat's what i was thinking we were probably hitting, but yes this sounds different and solvable (to the degree to which we have any real control over it) in nodepool22:07
sdaguephschwartz: does the linked review above look relevant here?22:07
phschwartzsdague: no, but this one does. https://review.openstack.org/#/c/58829/22:08
phschwartzThat removes the wrapper for reverting state on error of delete.22:08
* fungi graciously bows out of the discussion to get back to moving prep22:08
phschwartzsdague: basically that one leaves an instance so it can't be deleted from the api once a delete error happens.22:09
*** resker has joined #openstack-infra22:09
sdaguephschwartz: gotcha22:09
*** pvo has quit IRC22:09
*** resker has quit IRC22:10
*** mrodden has joined #openstack-infra22:10
*** resker has joined #openstack-infra22:10
*** esker has quit IRC22:11
*** mrodden1 has joined #openstack-infra22:11
*** cp16net has quit IRC22:11
*** jergerber has quit IRC22:12
phschwartzI understand why the change was made, but it traded one issue ("oh, this puts the instance into a bad state since you can't revert a delete") for a worse one: there is no way, without making a db change to fix the state, to forcefully delete it22:12
clarkbso strace does show lots of reads; mapping file descriptors to actual throughput is a bit hard22:12
jogophschwartz: AFAIK this has been an issue way before that patch22:13
jogophschwartz: plus I can call nova delete on the same instance multiple times22:14
phschwartzjogo: no, it wouldn't have been as the revert_task would have put it in a pure error instead of leaving it in a vm_state of deleting.22:14
*** dkliban is now known as dkliban_afk22:14
phschwartzNova doesn't allow you to delete a deleting instance22:14
phschwartzso stuck in that state a reissue of the delete does nothing22:15
*** resker has quit IRC22:15
*** mrodden has quit IRC22:15
*** mfer has joined #openstack-infra22:15
jogophschwartz: just to be clear you are saying that patch isn't it?22:16
jogoor it could be22:16
phschwartzNo, that patch is the issue. It removed the reverts_task_state wrapper from the delete action22:16
*** denis_makogon has quit IRC22:16
phschwartzSo the delete that fails causes the instance to stay in deleting.22:17
*** gondoi is now known as zz_gondoi22:17
phschwartzsubsequent delete requests are ignored by nova because of the deleting state basically locking the instance to where you need to force modify the state in the db to delete it by the api, or you have to delete it at the compute level and then hand change the table to deleted.22:18
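A minimal sketch (in Python, not Nova's actual code) of the lockout phschwartz is describing: once task_state reaches "deleting", every later delete request is ignored, so an instance whose delete errored out stays stuck until the db row is edited by hand.

```python
# Illustrative only: state names mirror the discussion, not Nova internals.
def handle_delete(instance):
    # A task_state of "deleting" locks out all further delete requests,
    # even when the earlier delete already failed (vm_state == "error").
    if instance.get("task_state") == "deleting":
        return "ignored: already in deleting state"
    instance["task_state"] = "deleting"
    return "delete started"

# A node stuck exactly as described: errored delete, task_state never reset.
stuck = {"vm_state": "error", "task_state": "deleting"}
result = handle_delete(stuck)
```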
fungiJayF: we should revisit http://wiki.list.org/pages/viewpage.action?pageId=17891458 based on your reminder. would you care to add it to the agenda for tuesday's infra team meeting? i won't probably be around to discuss it but others on the team also have a firm grasp of the issue i think22:19
*** UtahDave has quit IRC22:19
*** mfer has quit IRC22:19
*** andreaf has joined #openstack-infra22:20
clarkbooh there is a slow log and it looks like I can enable it on the fly22:20
*** mmaglana has quit IRC22:21
jogophschwartz: so the nodepool instances that are stuck are in error state?22:21
jogoand are failing a delete?22:21
jogophschwartz: "Failure to power off a VM during delete leads to it going back to Active(None)22:21
jogo"22:21
jogodoesn't sound like this issue22:22
jogodirectly at least22:22
phschwartzjogo: The task_state is deleting on them so all further delete requests are ignored.22:22
jogophschwartz: that is the state the stuck instances are in?22:22
phschwartzbasically with that patch I linked it makes them vm_state=Error, task_state=deleting and22:22
sdaguerealistically we actually hit this issue in the gate as well22:22
jogoI found a different state22:22
phschwartzjogo: yes22:22
sdagueclarkb: oh, coolness22:22
jogophschwartz: do you have access to nodepools 'nova list'22:23
phschwartzjogo: I have better than that. I have all our logging and the nova db22:23
jogophschwartz: who is 'our'? RAX?22:23
*** SumitNaiksatam has quit IRC22:24
*** dkranz has quit IRC22:24
phschwartzjogo: correct, I am a developer from Rax that has become a helpful resource to infra ;)22:24
fungiphschwartz: a VERY helpful resource! you're our new pvo ;)22:24
jogophschwartz: wanna share the relevant logs?22:24
*** dkranz has joined #openstack-infra22:25
jogofungi: do you have a nova list from nodepool?22:25
fungijogo: i can get you one, though it changes by the minute22:25
*** dizquierdo has joined #openstack-infra22:25
fungijust a sec22:25
jogofungi: thanks22:25
phschwartzjogo: Unfortunately that is something I can't do at the moment as they have info for non-infra things in them and would be too hard to scrub.22:25
fungiwe can at least speak in terms of specific uuids though22:26
jogophschwartz: thats what I assumed but figured I would ask anyway22:26
sdaguephschwartz ftw!22:26
*** freyes_ has quit IRC22:26
jogophschwartz: yeah thanks for helping out on this22:26
*** markmcclain has quit IRC22:27
phschwartzThat is what I am here for. I am thinking of a quick fix for node-pool to not hammer our api if it gets stuck in deleting for the edge case. I am leaving now to go to a concert and will have to send an email to infra when I get back if I have come up with something good.22:27
jogophschwartz: why would that help?22:28
jogoless load?22:28
*** weshay has quit IRC22:28
phschwartzIn about 10 days there have been over 55k requests to delete the same instances that are stuck in deleting22:29
phschwartzThat is traffic that is a waste :)22:29
jogophschwartz: waste yes, but not sure how that would help infra22:29
*** dizquierdo has quit IRC22:29
jogojust not yet convinced that would make things better per se.22:30
phschwartzThat is why I have to think on the best way to do it because if something is stuck in deleting for a long time we need it to report, not just try to delete it over and over again22:30
phschwartzwell I am off, I will be back in a bit.22:30
fungijogo: ...22:30
sdaguephschwartz: well that was my tongue in cheek idea of having nodepool open a ticket22:30
fungidfw: http://paste.openstack.org/show/8388022:30
fungiiad: http://paste.openstack.org/show/8388122:30
fungiord: http://paste.openstack.org/show/8388222:30
sdaguebut then you'd get a lot of tickets :)22:30
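The nodepool-side mitigation being floated could look something like exponential backoff plus a reporting threshold. This is a hypothetical sketch (function names, intervals, and the threshold are all invented), not nodepool's actual code:

```python
# Hypothetical sketch: back off on repeated delete attempts instead of
# hammering the provider API, and flag nodes stuck past a threshold
# rather than retrying (or opening a ticket for every attempt).
def next_retry_delay(attempt, base=60, cap=3600):
    """Seconds to wait before delete attempt number `attempt` (0-based)."""
    return min(base * (2 ** attempt), cap)

def should_report(stuck_seconds, threshold=6 * 3600):
    """Escalate once a node has been stuck in 'deleting' too long."""
    return stuck_seconds >= threshold
```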
fungithanks again for the help phschwartz! enjoy the concert22:31
*** gokrokve_ has quit IRC22:32
jogophschwartz: o/22:32
clarkbok I have slowlog set up for both index and search22:32
clarkbthe search slowlog is empty even with trace at 200ms22:32
jogofungi: just as phschwartz said22:32
clarkbso I think we can reasonably confirm that there is no searching22:32
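Enabling the slow logs on the fly could look roughly like this; the host, endpoint, and thresholds are assumptions based on the Elasticsearch dynamic index-settings API of that era, not the exact commands clarkb ran.

```python
import json
import urllib.request

def slowlog_settings(query_trace="200ms", warn="1s"):
    # Dynamic index settings: search and indexing slow-log thresholds,
    # updatable on a live index without a restart.
    return {
        "index.search.slowlog.threshold.query.trace": query_trace,
        "index.search.slowlog.threshold.query.warn": warn,
        "index.indexing.slowlog.threshold.index.warn": warn,
    }

if __name__ == "__main__":
    # Hypothetical host/port; PUT /_settings applies to all indices.
    req = urllib.request.Request(
        "http://localhost:9200/_settings",
        data=json.dumps(slowlog_settings()).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)
```

An empty search slow log with the trace threshold at 200ms is what lets clarkb conclude below that no searches are hitting the cluster.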
sdagueanother gate reset coming, ceilometer unit tests22:33
jogofungi: what does  'nova show $instance in error,deleting' say22:33
*** esker has joined #openstack-infra22:33
jogofungi: hmm I think we have a reset state button somewhere in nova22:33
clarkbthis appears to be an indexing problem too22:33
clarkbyou know I wonder if we just have a cranky volume22:34
clarkband it acts up in intervals22:34
sdaguehttps://jenkins04.openstack.org/job/gate-ceilometer-python26/666/console actually kind of weird22:34
clarkbwhich is why we see it as a cyclic problem22:34
clarkbfungi: ^22:34
sdagueclarkb: like we're getting hit by a qos issue?22:34
clarkbsdague: maybe22:34
clarkbwe hit max number of iops for that period then get throttled22:34
sdagueare the ssd volumes in rax?22:35
clarkbwe are not using ssd volumes iirc but they have them22:35
sdagueso I'd expect they'd be configured for higher iops, no?22:35
clarkbI would expect so22:35
jogofungi: reset-state isn't in rax :/22:35
clarkbI think we really need to get in touch with pvo22:35
clarkbmordred_phone: any luck?22:35
clarkbthe slow log isn't growing at an insane rate so I will leave it in place for a bit and maybe we will see it drop off on the next downward portion of the cycle22:36
*** andreaf has quit IRC22:36
jogofungi: so it looks like we are tickling a bug in RAX that is causing deletes to error out.22:37
jogoperahps when phschwartz gets back he can help tell us what that is22:37
*** dangers is now known as dangers_away22:37
anteayaso we have ERROR deleting and ACTIVE deleting, so ERROR deleting are the stuck nodes?22:38
jogoanteaya: that is my understanding22:39
sdagueclarkb: is the slow log going to give us slow queries?22:39
*** thedodd has quit IRC22:39
anteayaand hopefully ACTIVE deleting turns into a DELETED status22:39
fungijogo: is 'nova show $instance in error,deleting' an actual syntax? i get help/usage output from that22:39
*** crc32 has quit IRC22:40
jogofungi: nova show f160258b-46d6-4122-a217-4294cf6a9bb522:40
*** mrmartin has joined #openstack-infra22:40
jogofor rax-dfw22:40
fungijogo: oh, you mean show output for each of the nodes in one of those two states. i can construct a shell one-liner to get that into a paste. just a sec22:41
jogofungi: I just need one example22:41
jogonot all22:41
jogofungi: all will take too long22:41
jogofungi: as a user I don't think there will be any really useful info but figured it is worth a shot22:42
mattoliverauMorning22:42
*** HenryG has joined #openstack-infra22:42
anteayamorning mattoliverau22:43
jogolifeless: https://review.openstack.org/#/c/99743/22:44
fungijogo: well, too late... http://paste.openstack.org/show/8388522:45
lifelessjogo: yes :)22:45
fungiConnectionFailed' object has no attribute 'status_code'22:46
*** zehicle_defcore has quit IRC22:46
*** gokrokve has joined #openstack-infra22:46
jogolifeless: I want to get rid of that rule as well22:46
clarkbsdague: yes it will give us slow queries too22:46
jogofungi: ohh this was fun22:46
jogo| fault                  | {u'message': u'Connection to neutron failed: Maximum attempts reached', u'code': 500, u'created': u'2014-06-05T05:45:02Z'} |22:46
clarkbsdague: we will need to fine tune the values as right now they are not conservative at all and we will probably fill the logs up real quick22:47
*** jhesketh has quit IRC22:47
jogoits neutron hahahaha22:47
jogosdague: you would like this  http://paste.openstack.org/show/83885/22:47
anteayais connection reset by peer neutron as well?22:47
jogoanteaya: I *think* so but don't quote me on that22:47
jogophschwartz: ^22:48
anteayaand no attribute 'status_code'?22:48
fungijogo: heh22:48
sdagueheh22:48
jogoanteaya: you ask too many good questions22:48
jogonot sure22:48
anteayasorry to spoil the tarring and feathering party22:49
anteayado carry on22:49
jogomestery: ^22:49
sdaguejogo: I wonder if that's a failed conversion from objects in cells22:50
jogosdague: ohh possibly.22:50
sdagueI know it's lagging there22:51
sdagueyou should poke alaski, I think he was chasing some of that22:51
jogoof 16 nodes in deleting22:51
clarkbsdague: I have a few entries in the indexing slow log on the other nodes but nothing like on 0122:51
jogo7 have neutron in the error message22:51
jogoalaski: ^ poke22:51
* jogo wanders off to catch BART22:51
clarkband 01 has the same number of master and replica shards as three other machines22:51
*** sarob has quit IRC22:52
clarkbI am becoming more suspicious of that volume (but that may be the lazy in me)22:52
sdagueclarkb: I support the lazy in you22:53
*** sarob has joined #openstack-infra22:53
clarkbindexing is properly load balanced. the logstash processes join the cluster and talk directly to the node that needs the data22:53
sdagueI do like the idea of trying to get higher performance volumes22:53
clarkbI have mostly confirmed at this point that queries do not cause the issue though they may contribute22:53
clarkbit is present in indexing but only on one node with similar use as compared to other nodes in the cluster22:53
sdaguecan you do some straight out IO testing?22:54
sdaguelike shut down ES and just beat on the volumes?22:54
clarkbsdague: numbers probably won't mean anything without shutting down the cluster22:54
clarkbyeah we could probably do something like that22:54
sdagueright22:54
fungibonnie++ or whatever the kids are using these days22:54
sdagueyeh22:54
sdaguethat should at least give a definitive answer of volume vs. configuration22:54
sdaguealso, some interesting data on the volume throughput22:55
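Short of a full bonnie++ run, a crude sequential-write probe along these lines would give a first read on the suspect volume (the path and sizes are placeholders, not anything actually run):

```python
import os
import time

def write_throughput(path, total_mb=64, chunk_mb=1):
    """Return sequential write throughput in MB/s, fsync included."""
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # count the time to reach the device, not the page cache
    return total_mb / (time.time() - start)
```

Running it against the es01 volume with ES stopped, and against a known-good node for comparison, would separate "bad volume" from "bad configuration".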
clarkbdo we want to shut everything down for that though?22:55
fungiclarkb: it is a cloud. entirely possible one volume is sharing a device crushed under the weight of i/o from another neighboring tenant22:55
sdagueclarkb: well, I don't know22:55
sdagueon the up side, it would give us some answers, maybe22:56
openstackgerritAdam Gandelman proposed a change to openstack-infra/config: Pre-cache UCA packages during nodepool img build  https://review.openstack.org/9974022:56
*** james_li has quit IRC22:56
sdagueand we've not drained the inbound queue in days22:56
sdagueon the down side, our blindspot would get much bigger22:56
fungior sharing an i/o channel choked by a neighbor on the same nova compute node22:56
clarkbfungi: right22:57
*** james_li has joined #openstack-infra22:57
sdagueclarkb: so what if you killed es01 brought up a new compute, and allocated a fresh volume for it?22:57
clarkbsdague: we could do something like that too22:57
clarkbsdague: my only concern there is we are so close to the edge on available disk that the cluster may die22:57
sdaguewell, actually, what about this22:57
fungithough the shard rebuilding would be similarly traumatic to cluster performance22:57
clarkbsdague: we don't currently have enough extra disk on nodes to lose one :/22:58
sdagueclarkb: even after killing the indexes?22:58
clarkbsdague: yeah22:58
bodepdany chance I can get one more +2 on the puppet repos I am waiting for?22:58
sdaguehow deep are our indexes now?22:58
*** eharney has quit IRC22:58
sdaguecan we trim down?22:58
sdaguelike trim to 7 days22:58
bodepdI'm also curious about the best way to proceed about the module decoupling from config22:59
bodepdI assume it's blocked b/c you guys would rather switch to r10k?22:59
sdagueor can we do add then remove?22:59
sdagueso bring in the new node first22:59
sdaguereshard, and drop the old one22:59
sdagueI would say keep es01 around after just to benchmark that volume23:00
sdagueto know if it seems bad23:00
clarkbsdague: http://paste.openstack.org/show/83890/23:00
sdagueok, I'm not sure what I'm looking at :)23:01
clarkbsdague: its teh size as reported by _status for each index23:01
clarkbsdague: that includes the replica23:01
clarkbso divide by two for size without replica23:01
*** zzelle has quit IRC23:01
*** dims_ has quit IRC23:01
sdagueok and what does it need to fit into?23:01
*** james_li has quit IRC23:01
clarkb6TB with enough room that ext4 doesn't hate us23:01
fungiext4 will always hate you, it just does a better job of hiding it than ext3 did23:02
clarkbfungi: what was your process for spinning up nodes and adding the volumes23:03
clarkbfungi: such that ES doesn't come up without a volume and immediately get cranky23:03
clarkbfungi: were you just relying on firewall rules to prevent it from joining the cluster?23:03
*** adalbas has quit IRC23:04
clarkbalso  Ireally wish nova + cinder had a better first boot story23:04
clarkbmikal: jgriffith: it would be amazing if I could spin up a node with a block device preattached and formatted23:05
*** jamielennox|away is now known as jamielennox23:05
fungiclarkb: hummm... i think i manually added them to the cluster one at a time as they were ready, though now i can't remember23:06
clarkbfungi: yeah I think that will work23:06
clarkbfungi: basically first boot and puppet will use normal disk then stop ES, attach volume, format, mount, start ES23:07
jgriffithclarkb: hmmm23:07
clarkband sometime after ES is stopped update firewall rules23:07
jgriffithclarkb: I could probably make that happen23:07
clarkbjgriffith: the use case being on first boot you tend to do all sorts of config and stuff23:07
jgriffithclarkb: I could add an option to "cinder create" that let's you do partitioning and formatting23:08
clarkbjgriffith: and you can either induce a failure the first time if the fs isn't there and deal with it later23:08
jgriffithclarkb: yeah, I get that for sure23:08
clarkbjgriffith: or you run into masking of stuff23:08
clarkbneither of which is great and has different trade offs23:08
jgriffithclarkb: that's why I use BFV for everything :)23:08
jgriffithclarkb: I'll write a spec23:08
clarkbfungi: what do you think?23:09
*** bookwar has quit IRC23:09
clarkbfungi: is it worth going through the rebalance terror to see if we get lucky?23:09
clarkbfungi: or should we maybe poke rax harder?23:09
*** dkranz is now known as dkranz_afk23:09
*** sarob has quit IRC23:09
clarkbfungi: sdague: I think the "luck" portion of this is what bothers me most23:09
clarkbjgriffith: so you would create block device first and format it, then nova boot with it attached?23:10
clarkbI think that would work23:10
jgriffithclarkb: yeah23:10
jgriffithclarkb: so cinder could have options at create to do all that23:10
*** sarob has joined #openstack-infra23:10
jgriffithclarkb: trick is getting nova to mount it23:10
clarkb++ though I have no idea what that means for you guys23:11
*** thuc has joined #openstack-infra23:11
fungiclarkb: thinking back, i *believe* what i did was puppet them without any elasticsearch in the global site manifest, add the volumes, then add es in the manifest one patch at a time23:11
clarkbfungi: so you used a dev env with custom site.pp?23:11
jgriffithclarkb: not sure how to make that work in our current nova without just relying on cloud init or something23:11
fungiclarkb: i think i approved patches for them individually23:11
clarkbfungi: oh right23:11
fungito add and remove cluster members23:12
*** esker has quit IRC23:12
fungiit went on over the course of days because we didn't want to strain the existing overloaded cluster by doing too many nodes at once23:12
clarkbyeah23:13
*** jhesketh has joined #openstack-infra23:13
clarkbnow we have node /^elasticsearch\d+\.openstack\.org$/ so it will puppet as es server23:13
jheskethMorning23:14
jheskethphschwartz: pong23:14
*** thuc_ has joined #openstack-infra23:14
clarkbjhesketh: o/23:14
anteayamorning jhesketh23:14
anteayaphschwartz is at a concert23:14
anteayahe will be back later23:14
clarkbfungi: I am going to modify that regex to give us the option of doing this the other way23:15
*** thuc_ has quit IRC23:15
*** thuc has quit IRC23:15
*** rcarrillocruz has quit IRC23:15
clarkbbut I am still a bit worried it is a lot of work for nothing as its luck of the draw sort of thing23:15
*** thuc has joined #openstack-infra23:16
anteayaanyone have any reason why paste.o.o 500'd on me a minute or so ago?23:16
anteayaand why is it using so much cache memory? http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=14&page=223:16
anteayaI couldn't see anything else funny on paste from cacti23:16
clarkbanteaya: cache memory is a linux thing23:16
clarkbanteaya: linux says memory is there if it isn't used for anything specific I will use it for cache23:16
anteayadoesn't it seem like paste is using quite a bit?23:16
anteayaoh okay23:17
anteayaso back to why did it 500 on me23:17
*** esker has joined #openstack-infra23:17
anteayacacti doesn't point to anything specific23:17
clarkbfor the 500 I don't have a specific answer, but guessing its related to trove23:17
anteayaoh23:17
anteayais that a thing or is it just "oh trove"23:17
anteayalike do we have a bug report for it, or do we want one?23:18
clarkbanteaya: we probably need to properly debug it first23:18
anteayaah okay23:18
*** markmcclain has joined #openstack-infra23:18
anteayaI'm guessing I don't have much debugging power from this side, having no access to the servers23:18
*** rcarrillocruz has joined #openstack-infra23:19
sdaguepaste.o.o 500s quite a bit actually23:19
*** esker has quit IRC23:19
sdagueclarkb: are we recording apache logs for that?23:19
sdagueI'd say 1 time in 20 I get a 50023:19
*** ramashri has quit IRC23:20
clarkbsdague: we should be. the puppet apache module (or maybe just ubuntu  apache) seems to do the right thing23:20
*** thuc has quit IRC23:20
clarkbI can hop on there in a moment23:20
jogofungi: sounds like we have the bug: nova cannot delete an instance in error,deleting23:20
jogo                    1bc3554b-1f2a-44c7-b9a7-3d8bc956f7cd] Instance is already in deleting state, ignoring this request23:21
anteayajogo: where did you find that error message?23:21
fungiclarkb: i too remain unconvinced it's worth replacing the node to test this theory. perhaps move the query api endpoint to another node instead and see if the problem stays behind or follows the action?23:22
jogoanteaya: from phschwartz23:22
anteayajogo: helpful chap23:22
openstackgerritClark Boylan proposed a change to openstack-infra/config: Be specific about which ES nodes are puppetable  https://review.openstack.org/9979423:23
jogob3c9cc504903eccbc68c441a81b0a727a83117fa  I2f97f93bd714e0ea3b6d4fa3ac457ab43eed00e123:23
clarkbfungi: ^ something like that23:23
*** ramashri has joined #openstack-infra23:23
clarkbfungi: we still have queries disabled and it happens when indexing23:23
*** amcrn has joined #openstack-infra23:23
clarkbfungi: I am reasonably confident this isn't induced by the queries23:23
*** ihrachyshka has quit IRC23:23
fungigot it23:24
sdaguefungi: though we haven't been querying it for the last couple of hours23:24
clarkbfungi: but ya the buckshot approach to cloud is :?23:24
clarkber :/23:25
fungiscattershot23:25
fungigit yer scatter gun23:26
fungias we say 'round these parts23:26
*** atiwari has quit IRC23:27
fungicompletely unrelated to anything going on here (other than mentioning the open-source movement and a nod to zero wing), i loved seeing http://www.teslamotors.com/blog/all-our-patent-are-belong-you23:29
jesusaurusclarkb: with an approach like 99749 how would you replace an existing node? if es02 went sideways and a new one needed to be spun up, would the replacement have to be es07 and there would be no es02 afterwards?23:29
clarkbjesusaurus: correct23:29
anteayafungi: I shouted with joy as I read that23:30
clarkbjesusaurus: in a cattle world that doesn't bother me a whole lot23:30
clarkbjesusaurus: but it could potentially be confusing23:30
jogophschwartz fungi: https://review.openstack.org/9979623:30
jogoI think ^ should fix things for us23:30
jogogotta actually test that, etc but thats the idea23:30
jogosdague: ^23:30
clarkbjesusaurus: one potential way around that is to make all of the ES stuff run on internal addresses23:31
jesusaurusclarkb: i would be confused/annoyed by a set of non-sequential host numbers, but that might just be me23:31
fungiclarkb: jesusaurus: gaps in sequential server numbering don't bother me in the least, and i'm probably close to clinically ocd23:31
clarkbjesusaurus: but at one point I had considered multi az clustering23:31
clarkbjesusaurus: that never happened23:31
*** mmaglana has joined #openstack-infra23:31
* fungi has learnt to use his ocd for good, not evil23:32
fungipragmatic choices are an acceptable loss of consistency23:32
anteayajogo: I'm confused23:33
alaskijogo: just saw your poke about object conversion and cells23:33
anteayajogo: I thought the == vm_states.ERROR was the state we are seeing that needs to be addressed23:34
*** praneshp_ has joined #openstack-infra23:34
*** mmaglana has quit IRC23:36
clarkboh boo, lodgeit is being proxied, not mod wsgi'd23:36
clarkband now I can't find logs /me hunts for them23:36
*** esker has joined #openstack-infra23:37
*** praneshp has quit IRC23:37
*** praneshp_ is now known as praneshp23:37
clarkbupstart to the rescue23:37
*** esker has quit IRC23:37
clarkbOperationalError: (OperationalError) (2006, 'MySQL server has gone away')23:38
clarkbanteaya: fungi sdague ^23:38
clarkb110 occurences of that over the last 18 hours or so23:39
anteayawhere does it go?23:39
anteayawhen it goes away23:39
*** gokrokve has quit IRC23:39
anteayanow on the east coast of Canada "away" is Quebec, Ontario or the West23:40
anteayamaybe MySQL went there23:40
anteayaI wonder what it does there23:40
jogoalaski: we are digging into why rax is erroring out on so many deletes http://paste.openstack.org/show/83885/23:41
anteayaprobably swats flies like the rest of us23:41
*** sarob has quit IRC23:41
jogoalaski: and someone thought | fault                  | {u'message': u"'ConnectionFailed' object has no attribute 'status_code'", u'code': 500, u'created': u'2014-06-09T12:41:33Z'} |23:41
*** sarob_ has joined #openstack-infra23:41
jogo could be objects23:41
anteayajogo: so I don't understand your patch23:41
anteayasince I thought the vm_state was ERROR23:41
jogoanteaya: yeah, so wrote that on the train so let me see if it makes any sense23:41
mattoliveraumaybe the max_connections needs to be raised in the trove instance23:41
anteayabut your patch seems to say not ERROR23:41
*** reed has quit IRC23:42
anteayajogo: okay well I don't understand it yet23:42
jogoso the issue right now is this: you try to delete an instance, and previously if that failed it went back to active state23:43
alaskijogo: a very good guess, but cells/objects failures are usally "'dict' object has no attribute..."23:43
jogoalaski: also surprised at how many neutron errors you have ;)23:43
jogoanteaya: so https://review.openstack.org/#/c/58829/ changed that behavior23:44
clarkbmattoliverau: possibly23:44
alaskijogo: yeah :(23:44
jogoanteaya: so now instances go into deleting,error23:45
jogoanteaya: which makes more sense.23:45
jogoanteaya: note I am ignoring *why* the delete fails23:45
jogoanteaya: and there was another patch https://review.openstack.org/#/c/55444/23:45
jogothat says if you send a second delete to an instance deleting to ignore it, as its already in deleting state23:45
jogoanteaya: does that sound right so far? this is helping me sanity check this, so thank you for that23:45
jogoalaski: also https://review.openstack.org/#/c/99796/23:46
jogoalaski: I think that should fix a lot of our issues with having instances stuck in error,deleting23:46
anteayaalways happy to share in a sanity check23:47
*** markwash has quit IRC23:47
anteayaand yes I am following the bouncing ball so far23:47
jogoanteaya: so my patch changes the "ignore the second delete if already deleting" logic to:23:47
jogoignore if already deleting and not in error state23:47
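The condition jogo just spelled out, sketched as a standalone predicate. This is an illustration of the proposed logic in https://review.openstack.org/#/c/99796/, not the actual patch:

```python
# Ignore a repeat delete only while a healthy delete is in flight; an
# instance stuck in (error, deleting) stays deletable.
def should_ignore_delete(vm_state, task_state):
    return task_state == "deleting" and vm_state != "error"
```

Under this rule the stuck rax nodes (vm_state=error, task_state=deleting) would accept a fresh delete instead of ignoring it.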
anteayaoh23:48
anteayaummmm how does that fix what we are seeing then?23:48
fungiclarkb: mattoliverau: that sounds like a very, very plausible explanation for what we're seeing with paste.o.o23:48
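Besides raising max_connections, the usual SQLAlchemy-side fix for "MySQL server has gone away" is to recycle pooled connections before the server's wait_timeout reaps them. A hedged sketch (the interval is a placeholder, and whether lodgeit wires its engine this way is an assumption):

```python
# pool_recycle makes SQLAlchemy reconnect instead of reusing a connection
# the MySQL server has already closed on its side.
ENGINE_KWARGS = {"pool_recycle": 3600}  # seconds; keep below wait_timeout

def make_engine(url, **overrides):
    from sqlalchemy import create_engine  # assumed available to the app
    return create_engine(url, **dict(ENGINE_KWARGS, **overrides))
```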
anteayaor at least any part of what we are seeing?23:49
jogoanteaya: it will fix the part where we keep trying to delete an instance and it doesn't delete23:49
jogoanteaya: if fungi tries to manually delete one of the instances in error,deleting it won't work23:49
anteayaokay that was part of what phschwartz wanted23:49
jogowith my patch it should be deletable23:50
anteayato reduce using resources to no effect23:50
anteayaoh23:50
jogowell he wanted us to give up on those instances23:50
anteayaany way to test it?23:50
anteayahe did23:50
jogoanteaya: unit testing?23:50
anteayabut we still need to get rid of them23:50
jogoanteaya: yup, this patch should do that (if it works)23:50
anteayawell yes, but I was thinking put it in action and get rid of one of those instances23:50
anteayasince we have a few to try to kill23:51
jogoanteaya: ohh well if we could get into rax and change code23:51
jogo(maybe someone can)23:51
anteayaguess that is pvo then23:51
*** thuc_ has joined #openstack-infra23:52
anteayajogo: in the commit message can I get some urls to the two patches you linked me to: https://review.openstack.org/#/c/99796/1//COMMIT_MSG23:53
*** thuc_ has quit IRC23:53
anteayawith a small sample of the chain of events you just described for me?23:53
jogoanteaya: yup, I am transcribing what I told you into it23:54
jogoanteaya: one step ahead of me23:54
anteayaawesome thank you23:54
anteaya:D23:54
*** thuc_ has joined #openstack-infra23:54
anteayacomposing commit messages is harder than just typing stuff into irc23:54
anteayayou don't need punctuation or anything23:54
anteayaunless you _want_ to put a full stop at the end of a sentence.23:55
anteayathen you can, in irc.23:55
anteayalong live the full stop.23:55
fungiit's like typing in telegram.23:55
anteayaha ha ha23:55
clarkbI am still trying to sort out lodgeits concurrency model23:55
fungiclarkb: it *has* a concurrency model?23:56
clarkbfungi: mattoliverau ^ it uses werkzeug script.make_runserver23:56
fungiclarkb: in my head that translates to "someone should write a paste server"23:56
clarkbhttps://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/script.py#L28823:57
clarkbthat line is great. we are using a development server...23:57
clarkbit looks like by default it uses one process without threads23:57
clarkbso we shouldn't have connection issues right?23:58
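What that implies, roughly: werkzeug's development server (what make_runserver wraps) defaults to a single process with no threads, so one slow request stalls every other request, and overload shows up as a rate problem rather than a concurrency problem. A sketch under those assumptions, with a stand-in WSGI app in place of lodgeit:

```python
# Minimal WSGI app standing in for lodgeit; the serving defaults below
# mirror werkzeug's development server (single process, no threads).
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"paste"]

if __name__ == "__main__":
    from werkzeug.serving import run_simple  # the dev server the log refers to
    run_simple("127.0.0.1", 5000, app, threaded=False, processes=1)
```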
*** rcarrillocruz has quit IRC23:58
fungibwahaha23:58
jogocomstud: can you help reset task states for now?23:58
*** rcarrillocruz has joined #openstack-infra23:59
fungiclarkb: yeah, unless it's not a concurrency issue but rather a rate issue?23:59
*** ramashri has quit IRC23:59
clarkbfungi: could be23:59
clarkbpoor server can't deal with it23:59
*** jerryz_ has quit IRC23:59

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!