Sunday, 2013-11-17

*** UtahDave has joined #openstack-infra00:05
*** boris-42 has joined #openstack-infra00:13
*** boris-42 has quit IRC00:17
*** fifieldt has joined #openstack-infra00:36
*** dcramer_ has quit IRC00:39
*** fifieldt has quit IRC00:45
*** blamar has quit IRC01:08
*** CaptTofu has quit IRC01:11
*** CaptTofu has joined #openstack-infra01:12
*** Alex_Gaynor has quit IRC01:35
*** Alex_Gaynor has joined #openstack-infra01:35
*** talluri has joined #openstack-infra01:40
*** adarazs has quit IRC01:41
*** UtahDave has quit IRC01:42
*** talluri has quit IRC01:45
*** markwash has joined #openstack-infra01:46
*** CaptTofu has quit IRC01:54
*** CaptTofu has joined #openstack-infra01:54
*** pcrews has quit IRC02:00
*** CaptTofu has quit IRC02:01
*** CaptTofu has joined #openstack-infra02:01
*** nati_ueno has joined #openstack-infra02:20
*** nati_ueno has quit IRC02:20
*** nati_ueno has joined #openstack-infra02:43
*** senk has joined #openstack-infra02:44
*** wenlock has joined #openstack-infra02:48
*** nati_ueno has quit IRC02:48
*** senk has quit IRC02:53
*** jamesmcarthur has joined #openstack-infra02:55
*** senk has joined #openstack-infra02:56
*** senk has quit IRC03:00
*** senk has joined #openstack-infra03:01
*** senk has quit IRC03:11
*** mriedem has quit IRC03:20
*** reed has joined #openstack-infra03:33
*** reed has quit IRC03:39
*** UtahDave has joined #openstack-infra03:56
*** CaptTofu has quit IRC04:07
*** CaptTofu has joined #openstack-infra04:08
*** marun has joined #openstack-infra04:14
openstackgerritNoorul Islam K M proposed a change to openstack-infra/reviewstats: Add new core reviewers  https://review.openstack.org/5680604:22
*** UtahDave has quit IRC04:36
*** blamar has joined #openstack-infra04:45
*** sdake_ has quit IRC04:48
openstackgerritNoorul Islam K M proposed a change to openstack-infra/reviewstats: Add rubick project tracking  https://review.openstack.org/5680804:52
openstackgerritNoorul Islam K M proposed a change to openstack-infra/reviewstats: Add python-solumclient subproject  https://review.openstack.org/5681004:57
*** jamesmcarthur has quit IRC04:58
*** WarrenUsui has joined #openstack-infra05:06
*** aardvark has quit IRC05:09
*** sdake_ has joined #openstack-infra05:14
*** boris-42 has joined #openstack-infra05:25
*** che-arne has quit IRC05:34
*** che-arne has joined #openstack-infra05:36
*** wenlock has quit IRC05:36
*** SergeyLukjanov has joined #openstack-infra05:57
*** SergeyLukjanov has quit IRC05:59
*** boris-42 has quit IRC06:31
*** boris-42 has joined #openstack-infra06:34
*** davidhadas has joined #openstack-infra06:40
*** UtahDave has joined #openstack-infra06:49
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Serve local mirror using apache  https://review.openstack.org/5675207:08
*** alexpilotti has joined #openstack-infra07:10
*** rwsu-pto has quit IRC07:17
*** rwsu-pto has joined #openstack-infra07:17
*** yolanda has joined #openstack-infra07:32
*** talluri has joined #openstack-infra07:40
*** boris-42 has quit IRC07:41
clarkbmordred: jetlag must really be getting to you :)07:48
*** talluri has quit IRC07:50
*** talluri has joined #openstack-infra07:50
*** UtahDave has quit IRC07:56
mordredclarkb: o m g08:04
mordredclarkb: I can't seem to stop myself from going to sleep at 5pm08:05
*** julim has joined #openstack-infra08:26
*** julim has quit IRC08:34
*** dkliban has joined #openstack-infra08:36
*** markwash has quit IRC08:37
*** noorul has joined #openstack-infra08:58
*** noorul has left #openstack-infra08:58
*** rwsu-pto has quit IRC09:50
*** rwsu-pto has joined #openstack-infra09:50
*** adarazs has joined #openstack-infra09:52
*** harlowja has quit IRC09:54
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Serve local mirror using apache  https://review.openstack.org/5675210:25
openstackgerritA change was merged to openstack/requirements: Update stevedore  https://review.openstack.org/5624010:26
*** Ryan_Lane has quit IRC10:28
openstackgerritMonty Taylor proposed a change to openstack-infra/pypi-mirror: Refactor run-mirror to make it readable  https://review.openstack.org/5678410:44
*** boris-42 has joined #openstack-infra11:04
*** boris-42 has quit IRC11:06
*** SergeyLukjanov has joined #openstack-infra11:16
*** branen_ has quit IRC11:18
*** boris-42 has joined #openstack-infra11:18
*** SergeyLukjanov has quit IRC11:20
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Serve local mirror using apache  https://review.openstack.org/5675211:26
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Run the mirror if we land changes to pypi-mirror  https://review.openstack.org/5677711:29
*** salv-orlando has quit IRC11:30
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Add support for per-distro wheel mirrors  https://review.openstack.org/4126811:46
openstackgerritMonty Taylor proposed a change to openstack-infra/devstack-gate: Use select-mirror in devstack-gate  https://review.openstack.org/4806411:48
*** talluri has quit IRC11:55
*** Loquacity has quit IRC11:58
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Clean up integration script  https://review.openstack.org/5681612:19
*** SergeyLukjanov has joined #openstack-infra12:20
*** jeblair has quit IRC12:21
*** jeblair has joined #openstack-infra12:21
*** marun has quit IRC12:26
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Enable wheel processing in the tests  https://review.openstack.org/5681712:28
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Use wheels for installation  https://review.openstack.org/4880312:28
mordredfor those following along at home, I'm making an easy to understand set of changes around wheels12:30
mordredhttps://review.openstack.org/#/q/status:open+topic:wheels-in-mirror,n,z12:30
*** wayneeseguin has quit IRC12:32
openstackgerritMonty Taylor proposed a change to openstack-infra/devstack-gate: Enable the use of wheels in the gate  https://review.openstack.org/5666012:38
*** wayneeseguin has joined #openstack-infra12:38
*** jamesmcarthur has joined #openstack-infra12:46
*** yolanda has quit IRC12:56
*** yolanda has joined #openstack-infra12:59
*** nati_ueno has joined #openstack-infra13:04
*** SergeyLukjanov has quit IRC13:04
*** adarazs has quit IRC13:18
*** jamesmcarthur has quit IRC13:23
*** jamesmcarthur has joined #openstack-infra13:24
*** yolanda has quit IRC13:27
*** boris-42 has quit IRC13:33
*** boris-42 has joined #openstack-infra13:41
*** CaptTofu has quit IRC13:41
*** CaptTofu has joined #openstack-infra13:42
*** ace05_ has joined #openstack-infra13:42
*** nati_uen_ has joined #openstack-infra13:48
*** nati_ueno has quit IRC13:52
*** SergeyLukjanov has joined #openstack-infra13:57
*** boris-42 has quit IRC13:57
*** boris-42 has joined #openstack-infra13:59
*** nati_uen_ has quit IRC14:00
*** boris-42 has quit IRC14:07
*** boris-42 has joined #openstack-infra14:08
*** yolanda has joined #openstack-infra14:09
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Enable wheel processing in the tests  https://review.openstack.org/5681714:24
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Clean up integration script  https://review.openstack.org/5681614:24
*** yolanda has quit IRC14:24
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Use wheels for installation  https://review.openstack.org/4880314:24
*** mriedem has joined #openstack-infra14:35
*** SergeyLukjanov has quit IRC14:39
*** nati_ueno has joined #openstack-infra14:42
*** dstanek has joined #openstack-infra14:43
*** alexpilotti has quit IRC14:51
anteayamordred I keep going to sleep at 7pm14:55
mordredanteaya: I'm jealous - I keep going to sleep at 514:56
anteayaI went for an extra long walk yesterday and then watched some movie14:56
anteayawas enough to keep me up to 7pm14:56
anteayaAliens is one of my favourite movies, it was on at 8pm - couldn't stay up for it though14:57
anteayamordred: I am rechecking 5675214:57
anteayalet's hope it passes grenade this time14:58
mordredawesome14:58
*** jamesmcarthur has quit IRC14:59
anteaya56816 is failing on integration14:59
anteayaso is 5681715:00
mordredyes. I've uploaded new versions of those15:01
mordredoh. crap. still failing15:01
anteayayeah15:01
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Enable wheel processing in the tests  https://review.openstack.org/5681715:04
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Clean up integration script  https://review.openstack.org/5681615:04
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Use wheels for installation  https://review.openstack.org/4880315:04
mordredthat should do it15:04
anteayalet's find out what jenkins says15:05
anteaya*crosses fingers*15:05
anteayawhere's that rubber chicken?15:05
*** SergeyLukjanov has joined #openstack-infra15:21
*** dstanek has quit IRC15:24
anteaya56752 passed grenade this time15:25
anteayaand failed on devstack vm full15:39
*** jamesmcarthur has joined #openstack-infra15:39
mordredwee15:39
* mordred cries15:39
anteaya:(15:41
anteayawell it is a Sunday so not a large check queue15:42
anteayamordred: do you have a minute to join us in -neutron15:49
anteayaa question about increasing the number of lines for the backtrace in a failed test, run locally15:50
*** jamesmcarthur has quit IRC15:51
*** jamesmcarthur has joined #openstack-infra15:54
*** pcrews has joined #openstack-infra16:00
*** CaptTofu has quit IRC16:08
*** CaptTofu has joined #openstack-infra16:09
*** mriedem has quit IRC16:27
*** CaptTofu has quit IRC16:33
*** CaptTofu has joined #openstack-infra16:33
*** blamar has quit IRC16:41
*** mestery_ has joined #openstack-infra16:42
*** mestery has quit IRC16:45
*** jamesmcarthur has quit IRC16:46
*** salv-orlando has joined #openstack-infra16:48
*** boris-42 has quit IRC16:48
*** mestery_ is now known as mestery16:48
*** nati_ueno has quit IRC16:52
*** nati_ueno has joined #openstack-infra16:53
*** blamar has joined #openstack-infra16:56
*** ilyashakhat has joined #openstack-infra17:04
*** ogelbukh has quit IRC17:07
*** wenlock has joined #openstack-infra17:07
*** ilyashakhat_ has quit IRC17:08
*** metabro has quit IRC17:14
*** metabro has joined #openstack-infra17:16
*** dcramer_ has joined #openstack-infra17:17
*** plomakin has quit IRC17:19
*** plomakin has joined #openstack-infra17:21
*** ogelbukh has joined #openstack-infra17:23
*** dstanek has joined #openstack-infra17:23
*** dcramer_ has quit IRC17:23
*** yolanda has joined #openstack-infra17:23
*** wenlock has quit IRC17:25
*** yolanda has quit IRC17:28
*** crank has quit IRC17:30
*** crank has joined #openstack-infra17:34
*** dcramer_ has joined #openstack-infra17:35
openstackgerritKhai Do proposed a change to openstack-infra/config: proposal to add new jjb release managers group  https://review.openstack.org/5682317:37
*** talluri has joined #openstack-infra17:38
*** wenlock has joined #openstack-infra17:40
*** dstanek has quit IRC17:44
*** dkliban has quit IRC17:49
*** wenlock has quit IRC17:55
*** DennyZhang has joined #openstack-infra18:08
*** dstanek has joined #openstack-infra18:08
*** metabro has quit IRC18:20
*** ilyashakhat has quit IRC18:23
lifelessmordred: have you eyeballed http://code.google.com/p/vitess/ ?18:27
mordrednope18:27
lifelessI ratholed down into that following a language thingy18:27
lifelessbut thought 'Oh, that might interest mordred'18:27
mordredlooks at first light like mysql-proxy18:28
mordredyup. looks pretty much like they re-implemented jan's mysql-proxy18:29
mordredwhich is super helpful to a lot of people in a lot of use cases18:29
mordredso vitess is probably also similarly really useful18:29
lifelessI thought mysql proxy took in mysql ?18:30
mordredit does - but no reason it has to - the other features on the list are where the shared feature list is18:31
mordredand where the interesting bits are18:31
lifelessack18:31
lifelessthanks18:31
mordredthe architectural core concepts in vitess seem to be sound and battle-tested18:32
mordredhttps://github.com/youtube/vitess has more info18:34
mordredanother zookeeper user - should I get over my aversion and learn to like zookeeper at some point?18:34
*** zul has quit IRC18:37
*** SergeyLukjanov has quit IRC18:39
lifelessmordred: I haven't yet.18:40
lifelessmordred: I may if it gets built in SSL18:40
*** salv-orlando has quit IRC18:41
*** dstanek has quit IRC18:44
*** ogelbukh has quit IRC18:46
*** SergeyLukjanov has joined #openstack-infra18:48
*** plomakin has quit IRC18:50
*** dstanek has joined #openstack-infra18:56
*** DennyZha` has joined #openstack-infra18:59
*** EmilienM has quit IRC19:01
*** aardvark has joined #openstack-infra19:02
*** wenlock has joined #openstack-infra19:02
*** EmilienM has joined #openstack-infra19:02
*** DennyZhang has quit IRC19:04
*** WarrenUsui has quit IRC19:05
*** metabro has joined #openstack-infra19:07
*** plomakin has joined #openstack-infra19:07
*** DennyZha` has quit IRC19:07
*** ogelbukh has joined #openstack-infra19:07
*** thedodd has joined #openstack-infra19:14
*** thedodd has quit IRC19:17
*** thedodd has joined #openstack-infra19:18
*** thedodd has quit IRC19:21
*** thedodd has joined #openstack-infra19:21
clarkbmordred: I think the gate trouble for havana is related to jog0's smoke-serial change19:24
clarkbmordred: I have just proposed a backport to stable/havana and am trying to sort out how to make it work with stable/grizzly though with grizzly it should already work I think19:25
mordredclarkb: awesome19:29
clarkbI think grizzly needed a change too19:30
clarkbso I pushed that19:30
clarkbmordred: https://review.openstack.org/#/c/56825/ and https://review.openstack.org/#/c/56826/ if those pass tests or at least change the failure then we are on the right track19:31
*** dcramer_ has quit IRC19:33
*** yolanda has joined #openstack-infra19:33
*** SergeyLukjanov has quit IRC19:33
*** thedodd has quit IRC19:33
*** thedodd has joined #openstack-infra19:34
*** thedodd has quit IRC19:34
*** thedodd has joined #openstack-infra19:35
*** CaptTofu has quit IRC19:36
*** thedodd has quit IRC19:36
*** thedodd has joined #openstack-infra19:37
*** CaptTofu has joined #openstack-infra19:37
zaroclarkb: I noticed that our gerrit projects are not in a hierarchy.19:37
zaroclarkb: do we NOT want to apply acls in hierarchy order?19:37
clarkbzaro: I am not parsing, you are talking about our fork of gerrit? and what hierarchy?19:38
zaroclarkb: no i'm talking about how our gerrit projects are setup in config.19:39
zaroclarkb: there's only 1 container project, 'All-Projects'19:39
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Add version override support from nova  https://review.openstack.org/3651119:39
clarkbwe don't manage All-Projects in config iirc19:40
clarkbjust the individual projects19:40
zaroclarkb: i'm thinking we would want to setup containers for 'openstack-infra', 'openstack-dev', etc..19:40
zaroclarkb: then apply global acls to each container19:40
clarkbzaro: we could do that, not sure how much it helps beyond openstack-infra though. All of the other projects are pretty different19:40
*** dstanek has quit IRC19:41
zaroclarkb: would help to segregate global acls for each project type.19:41
clarkbzaro: but we don't need them to be segregated do we?19:41
clarkbmy concern is that there are almost as many project types as there are projects19:41
clarkbso this won't simplify much19:41
mordredit could help some of them19:42
zaroclarkb: it looks like 'openstack-dev' and 'openstack-infra' have different global acl settings.19:42
mordredoslo, savanna, fuel, manila all have multiple repos19:42
zaroclarkb: would it not make sense to setup global settings for each one?19:43
clarkbmordred: I think for the wrong reasons :)19:43
mordredclarkb: :)19:43
mordredotoh - we don't change them much either19:43
mordredI find the lack of templating in layout.yaml more upsetting regularly19:43
clarkbzaro: I definitely think openstack-infra is one place where it makes sense19:43
mordredyah. openstack-infra could use it19:43
mordredopenstack-dev is all different19:44
zarowell it can't happen unless there is some hierarchy.19:44
zarowhat about the others? stackforge, and openstack19:45
*** senk has joined #openstack-infra19:45
clarkbstackforge and openstack are pretty different (particularly stackforge)19:45
clarkbsince they are lots of disparate projects under a common prefix19:46
clarkbHmm I should've put the bug number in my tempest changes too but I didn't want to change the commit message too much.19:46
zarook.  i see.  so i assume it doesn't make sense to introduce hierarchy for just openstack-infra?19:46
clarkbzaro: that is what I am thinking, but it might (I am not sure what sort of changes need to be made to have that happen)19:47
*** talluri has quit IRC19:47
zaroi think that would involve moving projects, so i'm not sure that's easily or safely done in gerrit.19:47
mordrednah19:48
mordredyou just need an inheritfrom line19:48
mordredlook at19:48
mordredfor instance19:48
mordred./modules/openstack_project/files/gerrit/acls/openstack/identity-api.config19:49
clarkbthats simple19:49
zaroi guess worth considering then.  i think it makes a lot of sense.  i just proposed a patch that sets up a new group.  it would be good if setup with inheritance.19:50
zaroalso i think patch security review gerrit would need changes to global acls, and i'm not sure if that would apply across all projects.19:51
clarkbzaro: I think that is correct, global ACLs would need to be much more restricted by default (eg no read to anonymous users)19:51
*** Alex_Gaynor has quit IRC19:55
*** Alex_Gaynor has joined #openstack-infra19:56
*** mriedem has joined #openstack-infra19:58
*** markwash has joined #openstack-infra20:04
mordredO M G Gate Races20:06
mordredthe gate races on this are killing me: https://review.openstack.org/#/c/56752/20:06
clarkbmordred: we need a support group20:12
mordredclarkb: we need for things to be less broken20:12
clarkbwhere we get together, flip all of the tables in the room, then hack on fixing the problems20:12
mordredyah20:12
mordredclarkb, sdague: so - more thoughts on removing tests20:13
mordredone of the reasons we clean warnings and turn them into errors and whatnot20:13
*** wenlock has quit IRC20:13
mordredis so that people don't ignore it when the system tells them something20:13
*** yolanda has quit IRC20:14
mordredI know the tests are pointing out real problems - but I'm concerned that continual recheck hell is going to undermine (or already has) people's reflex to see a negative response from the gate as a problem to be fixed20:15
mordredso I know we talk about it all the time20:15
mordredbut I want to make sure that when we frustratedly consider removing individual tests, that we consider that as one of the factors20:16
clarkbmordred: I definitely think the ability to force things through compounds the problem20:16
clarkbare you suggesting that we unforce things by removing particular tests?20:16
mordredI am20:16
mordredand then figure out a mechanism to be more strict on adding additional things in to make sure that they work20:17
mordredwhen we talk about 3rd party testing rigs, one of the design points that we bring up is ability to respond consistently without a high number of false negatives20:18
clarkbeducating people on how to do some debugging themselves is what I would like to see20:18
mordredand we think that's a super important feature of the system20:18
clarkbthe submit bug against ci then recheck issue is a symptom of folks not knowing what to do20:18
clarkbthese are not false negatives though20:18
mordredI understand that - but they are false negatives on the patch20:19
mordredpart of the point of the gate is to protect devs against other devs breaking the tree20:20
mordredbut we managed to let tree breakages slip in20:20
clarkbright, and I think the minimal debug then recheck loop is a large contributor to that20:21
mordredso we're no longer sitting in the original design state, which is that errors from a test run should be errors caused by the developer's patch20:21
clarkbif your patch introduces a thing that passes 80% of the time, recheck will allow you to eventually get it in20:21
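clarkb's 80% figure can be made concrete with a small sketch. It assumes each gate run of the flaky change is independent and passes with the same probability; the function name is illustrative and not part of any OpenStack tool.

```python
def prob_lands_within(pass_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent runs passes.

    Assumes every run passes with the same probability `pass_rate`,
    which is a simplification of real gate behaviour.
    """
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # The change fails to land only if every single attempt fails.
    return 1.0 - (1.0 - pass_rate) ** attempts


for attempts in (1, 2, 3):
    chance = prob_lands_within(0.8, attempts)
    print(f"{attempts} attempt(s): {chance:.1%} chance of landing")
```

Under these assumptions a change that fails one run in five is very likely to merge after two or three rechecks, which is the mechanism being described.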
mordredyup. I agree20:21
mordredproblem is - you have to recheck EVERY patch20:22
mordredit's so frequent20:22
mordredand it's so frequently someone else's fault20:22
mordredthat we're actually training people to not debug their own patches20:22
clarkbright, so I am proposing that we teach everyone how to properly deal with rechecks (or maybe just remove them altogether)20:22
clarkbinitially that will be super painful because of the state we are in now20:22
clarkbbut after a couple weeks we should be in a better place assuming we actually fix the bugs20:23
mordredI think that's extremely unfair to most of our devs20:23
clarkbI don't think so20:23
clarkbthis is a project we are all working together on20:23
mordredwell, neutron already knows that they need to do a complete rewrite of their agent code20:23
mordredand that's apparently something that only 1 or 2 people can work on20:24
mordredis what we're saying that we need to do is hold everyone else up from being able to do work on other pieces while we wait on 2 guys to do that?20:24
mordredthat seems baby-with-bathwater20:24
clarkbno, there are many many many other things that need fixing20:24
clarkbmordred: your change for example has nothing to do with neutron failures20:25
mordredmy change has triggered like, 5 different failures at least20:25
jesusaurusare we tracking rechecks? do we have a per-project break-down of frequency or percentage of rechecks? im curious how wide-spread the problem is20:25
mordredwe are20:25
mordredhttp://status.openstack.org/rechecks/20:25
mordredand then there is elastic recheck20:26
clarkbwe also have http://status.openstack.org/elastic-recheck/ tracking what we think are actual numbers for things that have had resources put into diagnosing them20:26
mordredbut I'm growing more worried that what clarkb said above is a greater problem20:26
mordredwhich is that, in a world where recheck is the norm20:26
mikalCan one of you guys rescue me from IBMers who don't know how the internet works?20:26
mordredpeople will recheck their own buggy race condition code into the tree20:26
clarkbmikal: you assume we know how the internet works :)20:27
clarkbmordred: exactly20:27
mikalI replied to the thread "[OpenStack-Infra] Request a a dedicated user account." from an IBMer asking for a gerrit account20:27
mikalAnd now they think I'm the guy setting it up20:27
mikalAnd are emailing me personally20:27
clarkbmikal: oh that one, I was so happy you responded to that :) sure point them back at the list and I can respond to what ends up there20:27
mordredand that will happen with higher velocity than we can fix the current outstanding bugs20:27
mikalMy sin was replying to their request saying I thought "root" was a bad account name to be request20:27
mikaling20:27
mordredmikal: thanks for that, btw20:27
clarkbmikal: it was a bad account name :)20:27
mikalOh, I know I was right20:28
mikalI'm now just being punished20:28
mikalI've told them to go back to the list20:28
mikalI'd ask an IBMer with clue to lean in and help, but I think they've all quit (except cyeoh, but I think he's a different team)20:29
clarkbmtreinish is on vacation20:29
jesusaurusmordred: oh, i see, you're talking about preventing future intermittent bugs20:29
mordredjesusaurus: and dealing with the current ones too20:29
jesusaurusclarkb: is this why you mentioned running tests multiple times the other day?20:29
mordredyah20:29
clarkbjesusaurus: yup20:29
clarkbafter feature freeze a bunch of stuff was fixed20:30
clarkband we got the rates of failure to be pretty low, but since then we are back up to crazy amounts of hurt20:30
clarkband I think it is because we have a mechanism to get half passing code in20:30
clarkbmordred: what if we get rid of reverify20:31
clarkbmordred: force cores to look at the failures and decide if a second +A is appropriate20:31
clarkbthen you can recheck all you want and build up data on a change20:31
mordredhrm20:31
clarkbbut code doesn't go into gate without someone actually thinking about it20:31
mordredI don't think that will have the result you think it will20:32
mordredthere are over 100 cores20:32
clarkbya good point20:32
mordredI'm honestly stumped. I hate all of the options20:33
mordredI do not think us taking a hard-line is working20:33
*** dstanek has joined #openstack-infra20:33
mordredI do not think cancelling the gate will work or be a good thing20:33
lifelessoh hai20:33
mordredI concede that disabling tests is a bad idea20:34
lifelessetoomucscrollback20:34
lifelesshow reliable do we want the gate to be ?20:34
mordredI think us 'educating' people is not just not working, but is also a dangerous position to sit in20:34
mordredlifeless: we want it to be 100% reliable20:34
mordredultimate goal, no failures that are not caused by the dev's patch20:35
clarkband that number depends on perspective right20:35
mordredthat may be a nirvana of course20:35
clarkbthese are not really false negatives so it is reliable in that case20:35
lifelessideally sure, but pragmatically? How many reverifies a day would we accept ? 10? 20 ? 5?20:35
clarkbbut there are a lot of unrelated patches being hurt by the pain20:35
lifelessdiminishing returns and all that20:35
lifelessSecond question, how long does it take to detect a fragile change has landed ?20:35
clarkblifeless: a bit of time as humans need to go back and identify the problem, find when it started and fix it20:36
clarkbits usually a few days20:36
clarkbmikal found one that started midweek yesterday20:36
* mordred cries20:36
mordredI do like the idea of ratcheting up iterations20:36
mordredto increase surface area possibility of catching them20:36
mordredbut that is, of course, expensive20:37
mikalclarkb: I think I need a login on a box which is experiencing that failure to debug further20:37
mikalI'm not sure if its a libvirt problem at this point20:37
mordredand still sits us in the same place that removing rechecks does20:37
clarkbmikal: ok20:37
lifelessmordred: probability of a problem occurring that isn't detected in N runs is 3/N20:37
mikalOr I can add more debugging code to nova I suppose20:38
mikalThe basic problem seems to be very slow or empty console logs20:38
clarkbmikal: that might be an easy first step20:38
mikalSometimes tempest times out with just half of the cirros boot done20:38
mordredwhich is that we are currently in a world of broken, and if we all of a sudden make everything run 10x, we will trigger failure more often, effectively stopping everyone20:38
mikalSometimes nothing has hit the console log at all20:38
clarkbmikal: interesting20:38
lifelessmordred: that result will be needed if you want to ratchet up runs to reduce flakiness: unless we ratchet it up enough to force failures under the current failure rate, we'll gain nothing20:38
mordredyup20:39
mikalclarkb: so I do wonder if there is something environmental which is causing libvirt to be slow on some test nodes20:39
mordredso my suggestion would be20:39
clarkbmikal: could be something related to the underlying cloud20:39
lifelesse.g. 10 runs does not get you a 10% failure rate20:39
lifelessit gets you 30%20:39
mordredratchet it up to trigger point20:39
lifeless30 runs will get you 10%20:39
lifeless60 runs 5%20:39
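The figures lifeless quotes (30% for 10 runs, 10% for 30, 5% for 60) match the statistical "rule of three": if N independent runs all pass, the true failure rate could still be as high as roughly 3/N at 95% confidence. A minimal sketch comparing that approximation with the exact bound follows; independence between runs is an assumption, and the function names are illustrative.

```python
def rule_of_three(runs: int) -> float:
    """Approximate 95% upper bound on the failure rate after `runs` clean runs."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return min(1.0, 3.0 / runs)


def exact_upper_bound(runs: int, confidence: float = 0.95) -> float:
    """Failure rate p that solves (1 - p) ** runs == 1 - confidence.

    Any failure rate above this value would be caught by at least one of
    `runs` independent runs with probability greater than `confidence`.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / runs)


for n in (6, 10, 30, 60):
    print(f"{n:>3} runs: rule of three {rule_of_three(n):.1%}, "
          f"exact {exact_upper_bound(n):.1%}")
```

The exact bound is slightly tighter than 3/N; the approximation works because -ln(0.05) is close to 3.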
*** markwash has quit IRC20:40
mordredthen disable the blergy tests, then work on getting them and fixes for them back in under the higher ratchet20:40
mordredbut - its a terrible idea20:40
lifelessI think we have to expect this to happen and engineer to manage it20:40
lifelesse.g. drive the TTD down very low20:40
mordredTTD=?20:40
clarkbtime to detection?20:43
clarkbI think we help time to detection if we kill easy rechecks20:43
clarkbbecause it will force people to debug20:43
jesusaurushow do you define easy?20:43
mordredI just don't think it will do that20:44
mordredI think it will cause people to file more bugs against us20:44
mordredand I think it will cause more people to block on 'central' people20:44
clarkbjesusaurus: anyone being able to do it20:44
jesusaurusthat would definitely put a lot of pressure on those able to recheck20:46
clarkbI agree and like I said earlier it will suck short term20:46
lifelessonce we have it detected, then perhaps locking the trees into release mode and fixing it, then resuming normal process is sane20:47
clarkbbut should give us breathing room to shake the current problems out without adding heaps more20:47
lifelessor20:47
lifelessrevert the commit20:47
lifelessif we get the TTD low enough20:47
mordredbut what do we do about the current set of fails?20:47
clarkbmordred: basically what we did around havana feature freeze20:48
lifelessif we can track down the commit, proposing a revert and seeing if it passes would be a good start20:49
mordredclarkb: the problem is, that clearly is not a maintainable state, because in a week where a couple of core people were on vacation, the entire project regressed20:49
lifelesswe capture enough data20:49
lifelessthat getting a low TTD should be possible20:49
*** salv-orlando has joined #openstack-infra20:50
clarkbmordred: it isn't sustainable, but hopefully you get enough changes in place that now you can breathe and prevent future breakage of the same sort20:50
clarkbeither by running more tests or no more rechecks or removing tests that are particularly problematic20:50
lifelessclarkb: whats the current aggregate failure rate?20:50
clarkbI just think this is a race we can't get ahead in without drastic measures20:50
mordredor some combo of all of the above20:50
mordredlifeless: right now it's LUDICROUS20:50
clarkblifeless: it is ~50% according to jog0 iirc20:51
mikalclarkb: it occurs to me one question I haven't asked is does that failure happen for anyone else's change?20:52
lifelessok, so running 6 jobs (which might mean just 3 times the current tempest jobs?) should put an effective barrier stopping that getting worse [95% of the time 6 runs will detect anything with an incident rate higher than 50%]20:52
mikalPerhaps my change very subtly breaks libvirt booting in a way I don't expect20:52
lifelessbut again, I think the stats are against us20:52
mikal(Unlikely, but possible)20:52
mordredlifeless: well, it's not a problem with a 50% failure rate - it's multiple problems with an average effective 50% failure rate20:53
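mordred's distinction matters for the arithmetic: several intermittent bugs that are each fairly rare can combine into a high aggregate failure rate. A minimal sketch, assuming the bugs fail independently of one another; the per-bug rates below are hypothetical illustrations, not measurements from the gate.

```python
from math import prod
from typing import Iterable


def aggregate_failure_rate(per_bug_rates: Iterable[float]) -> float:
    """Chance that a run hits at least one of several independent bugs."""
    # A run passes only if it avoids every bug, so multiply the pass chances.
    return 1.0 - prod(1.0 - rate for rate in per_bug_rates)


# Hypothetical mix: five bugs at 10% each and four bugs at 5% each.
hypothetical_rates = [0.10] * 5 + [0.05] * 4
print(f"aggregate failure rate: {aggregate_failure_rate(hypothetical_rates):.1%}")
```

With this made-up mix no single bug is worse than one failure in ten, yet roughly half of all runs fail, which is why a small number of extra runs per change is unlikely to catch any individual bug.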
lifelessIt seems to me that leveraging the vast amount of data we already gather to detect failures fast + revert is better20:53
clarkbmikal: did you find ~90 failures over the last week20:53
lifelessmordred: yes, I know that20:53
clarkbmikal: I thought you had a query for it20:53
lifelessmordred: but you can invert the thing and look for successful runs that shouldn't have succeeded and it becomes easier to think about20:54
mordredlifeless: well, and we have plans to work on all of this20:54
lifelessanyhow20:54
lifelesswhat can I do to help?20:54
fungii'm wondering if elastic-recheck can't get to the point where we simply don't recheck a failed change unless it's already known to and tracked within e-r20:54
clarkbfungi: OOOHHHHH20:55
mordredboth faster detection, and offline analysis of the corpus20:55
clarkbquick someone give that man a beer20:55
fungiand to recheck anything else it has to get researched and added to e-r first20:55
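fungi's proposal can be sketched as a simple policy check. This is only an illustration of the idea: the signature table, bug ids, and function names are hypothetical, and plain substring matching stands in for however elastic-recheck actually identifies failures.

```python
from typing import Dict, List

# Hypothetical table of tracked failures: bug id -> text expected in the log.
KNOWN_SIGNATURES: Dict[str, str] = {
    "bug-A": "Timed out waiting for console output",
    "bug-B": "Connection refused by test node",
}


def matching_bugs(log_text: str, signatures: Dict[str, str]) -> List[str]:
    """Return the ids of tracked bugs whose signature appears in the log."""
    return sorted(bug for bug, needle in signatures.items() if needle in log_text)


def may_recheck(log_text: str, signatures: Dict[str, str]) -> bool:
    """Allow a recheck only when the failure matches a tracked bug."""
    return bool(matching_bugs(log_text, signatures))


tracked = "ERROR: Timed out waiting for console output after 196s"
untracked = "AssertionError: 1 != 2"
print(may_recheck(tracked, KNOWN_SIGNATURES))    # failure is already tracked
print(may_recheck(untracked, KNOWN_SIGNATURES))  # needs research first
```

Under this policy an unrecognised failure has to be investigated and given a signature before anyone can retry past it.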
jesusaurus+120:55
fungiplease note i haven't thought through the implications there, and it is the weekend still here, so it might be a terrible, terrible idea after all20:55
clarkbfungi: I think it is the best one I have heard so far20:56
mordredstarting with patchset 6 this morning: https://review.openstack.org/#/c/56752/20:56
mordred5 failures, all unknown, all unrelated to the patch in question20:56
fungii've been heads down hacking on arm hardware all weekend with hardly a break, so need to reset my brain to the topic anyway20:56
mordredfungi: nice!20:56
mordredone of them had two different failures even20:56
mordredand I'm not saying we have to fix it for me - just merely that I've got a data point we can look at20:57
mikalclarkb: there's a query in the bug I filed20:58
mikalSo yeah, perhaps I'm innocent20:58
mikalWhich would be nice20:58
mikalclarkb: I'm on the road tomorrow. I'll spend some more time digging into the failure I found when I get home tonight.21:01
mikalUnless someone beats me to it21:02
clarkbI probably won't beat you to it, today is my day of being lazy on a couch21:03
mikalFair enough21:03
mikalOne can hope that lifeless will fix it though21:03
mordredlifeless: yeah - if you could just take two days off and fix all the bugs, that would be great21:04
mikalGET ON THAT21:04
lifelessmordred: let me just cellotape C to the ceiling and put on noise cancelling headphones.21:06
mordredlifeless: I'm fine with both of those things21:06
mordredespecially use of the very cute word cellotape21:06
lifelessfungi: I likes.21:06
* mikal goes to the doctor21:09
* fungi goes to find post-hacking beer^Wdinner21:21
*** loq_mac has joined #openstack-infra21:22
*** Loquacity has joined #openstack-infra21:24
*** markwash has joined #openstack-infra21:28
*** mkoderer has quit IRC21:31
*** sdake has quit IRC21:33
* mordred goes ...21:34
* mordred goes nowhere. he's in a plane seat. there is nowhere to go.21:34
*** mkoderer_ has joined #openstack-infra21:34
*** mkoderer_ is now known as mkoderer21:34
openstackgerritMasayuki Igawa proposed a change to openstack-infra/elastic-recheck: Add query for bug 1251999  https://review.openstack.org/5683921:34
*** sdake__ has joined #openstack-infra21:34
*** dstanek has quit IRC21:43
*** dstanek has joined #openstack-infra21:50
*** dstanek has quit IRC21:54
*** DennyZhang has joined #openstack-infra21:58
*** slong has joined #openstack-infra22:01
*** jhesketh has joined #openstack-infra22:02
*** dcramer_ has joined #openstack-infra22:06
*** mriedem has quit IRC22:06
*** cody-somerville has quit IRC22:09
*** markwash has quit IRC22:09
*** thedodd has quit IRC22:23
*** zul has joined #openstack-infra22:23
*** jamesmcarthur has joined #openstack-infra22:24
*** dcramer_ has quit IRC22:25
*** DennyZhang has quit IRC22:27
*** DennyZhang has joined #openstack-infra22:28
*** nati_ueno has quit IRC22:37
*** senk has quit IRC22:38
*** dcramer_ has joined #openstack-infra22:39
*** CaptTofu has quit IRC22:42
*** CaptTofu has joined #openstack-infra22:43
*** dstanek has joined #openstack-infra22:52
*** Ryan_Lane has joined #openstack-infra22:53
*** Ryan_Lane has joined #openstack-infra22:53
*** DennyZhang has quit IRC23:02
*** paul-- has joined #openstack-infra23:03
*** herndon has joined #openstack-infra23:06
*** loq_mac has quit IRC23:06
*** loq_mac has joined #openstack-infra23:07
*** loq_mac has quit IRC23:07
*** salv-orlando has quit IRC23:07
*** salv-orlando has joined #openstack-infra23:07
*** jamesmcarthur has quit IRC23:11
*** wenlock has joined #openstack-infra23:14
*** dripton has quit IRC23:14
*** dripton has joined #openstack-infra23:15
*** michchap has quit IRC23:18
*** dkliban has joined #openstack-infra23:19
*** michchap has joined #openstack-infra23:21
openstackgerritA change was merged to openstack-infra/reviewstats: Add new core reviewers  https://review.openstack.org/5680623:21
openstackgerritA change was merged to openstack-infra/reviewstats: Add rubick project tracking  https://review.openstack.org/5680823:22
*** bknudson has quit IRC23:26
*** michchap has quit IRC23:34
*** michchap has joined #openstack-infra23:35
*** thomasem has joined #openstack-infra23:36
*** dstanek has quit IRC23:41
openstackgerritNoorul Islam K M proposed a change to openstack-infra/reviewstats: Add python-solumclient subproject  https://review.openstack.org/5681023:43
*** jamesmcarthur has joined #openstack-infra23:44
*** michchap has quit IRC23:48
*** dstanek has joined #openstack-infra23:53
*** michchap has joined #openstack-infra23:54
*** talluri has joined #openstack-infra23:56
