morganfainberg | ^ IANA assigns ports up to 49151 | 00:00 |
---|---|---|
*** ^d has quit IRC | 00:00 | |
jog0 | lifeless jeblair: good thing we have some TC folks here to make sure this type of thing gets fixed | 00:00 |
jog0 | sdague: this should be a req for incubation: port number stuff | 00:00 |
lifeless | lol | 00:00 |
lifeless | IANA concludes that this request and the modification request (ticket | 00:00 |
lifeless | #585246) will be administratively resolved without prejudice. | 00:00 |
lifeless | so linux ephemeral range needs fixing | 00:01 |
lifeless | we can fix that in tripleo | 00:01 |
lifeless | and we should file bugs on Ubuntu RedHat and Suse | 00:01 |
*** ftcjeff has quit IRC | 00:01 | |
sdague | jog0: swift has the same issue, it uses X ports | 00:01 |
morganfainberg | it looks like keystone (openstack-id) is the only openstack project with a registered port. | 00:01 |
*** nati_ueno has quit IRC | 00:01 | |
morganfainberg | as far as i see. | 00:02 |
clarkb | morganfainberg: ya I can't find others | 00:02 |
*** yamahata_ has joined #openstack-infra | 00:02 | |
morganfainberg | somewhat interesting (but not super relevant) | 00:02 |
harlowja | mordred yt | 00:02 |
harlowja | a question about pbr if u are around | 00:02 |
notmyname | sdague: is that a thing that's causing issues right now? | 00:02 |
sdague | notmyname: I don't think so | 00:02 |
*** CaptTofu has quit IRC | 00:03 | |
sdague | it was just in reference to jog0's comment above about making sure projects had port number registrations | 00:03 |
clarkb | it is a thing causing problems with keystone | 00:03 |
clarkb | but before I go fixing keystone I need to check the other projects | 00:03 |
jog0 | notmyname: just a tiny issue | 00:03 |
*** CaptTofu has joined #openstack-infra | 00:03 | |
jog0 | very tiny | 00:03 |
morganfainberg | clarkb, if you set the floor to 49152 you won't hit any other registered ports | 00:04 |
morganfainberg | and i am _fairly_ certain no project uses over 49151 as its listener | 00:04 |
mikal | jog0: yeah, that revert still gives us a 20%ish fail rate for 1251920 | 00:04 |
mikal | jog0: so I don't think it's the answer | 00:04 |
jeblair | so, raise the ephemeral port floor, or just configure keystone to use port 5000 like it used to? | 00:05 |
jog0 | mikal: I am taking the nuclear option https://review.openstack.org/#/c/57566/1 | 00:06 |
morganfainberg | jeblair, we use(d) 5000 and 35357 | 00:06 |
mikal | jog0: oh good, that was my next step | 00:06 |
morganfainberg | so i think raising the floor is the correct answer. | 00:06 |
*** jhesketh_ has quit IRC | 00:06 | |
mikal | jog0: cause like totally out of ideas | 00:06 |
jeblair | morganfainberg: oh one is admin, the other user? | 00:06 |
morganfainberg | that was how it was in grizzly | 00:07 |
mikal | jog0: send four more reviews of that... (ie remove the change id and reupload four more times). That way we can have five running in parallel and get more testing done. | 00:07 |
morganfainberg | in havana they are ... the same thing really. | 00:07 |
morganfainberg | (v3) | 00:07 |
jeblair | morganfainberg: is only 35357 used in havana? | 00:07 |
morganfainberg | jeblair, i think we still use both | 00:07 |
jeblair | morganfainberg: oh, but they provide the same service now | 00:07 |
morganfainberg | yeah | 00:07 |
morganfainberg | compatibility | 00:08 |
morganfainberg | hopefully we can migrate to just one port. | 00:08 |
morganfainberg | ideally that would be our IANA number | 00:08 |
morganfainberg | but i can't speak 100% to that plan. | 00:08 |
jeblair | morganfainberg: gotcha | 00:09 |
*** derekh has quit IRC | 00:09 | |
morganfainberg | jeblair (and lurking in this channel pays off again! woo) | 00:09 |
jog0 | mikal: excellent idea | 00:09 |
fungi | clarkb: ephemeral port collision... great eye! | 00:10 |
clarkb | jeblair: I am going to shift the ephemeral port floor to 49152 | 00:10 |
dolphm | morganfainberg: jeblair: as of the j-release, we'll be able to provide a default configuration that uses only one port (pending deprecating and removal of the v2 api, which behaves differently on each port) | 00:10 |
morganfainberg | there we go, the official word. | 00:11 |
*** dolphm is now known as dolphm_afk | 00:12 | |
jog0 | mikal: there | 00:12 |
jog0 | 5 copies running | 00:12 |
jog0 | mikal: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:revert,n,z | 00:13 |
mordred | harlowja: what's up? | 00:13 |
*** amotoki is now known as __amotoki__ | 00:14 | |
*** CaptTofu has quit IRC | 00:14 | |
harlowja | mordred just was wondering if there is any way for pbr to have 2 different sets of requirements (for example eventlet in py2.6/py2.7 and no eventlet in 3.3) or if there is another recommended practice | 00:14 |
*** CaptTofu has joined #openstack-infra | 00:15 | |
*** senk has joined #openstack-infra | 00:15 | |
harlowja | in taskflow i made the eventlet usage optional, but its still a valid requirement in 2.6/2.7 | 00:15 |
harlowja | but since there is only 1 requirements.txt file, that requirements.txt file has to be the one that works with 3.3 (and can't include eventlet) | 00:16 |
*** fbo is now known as fbo_away | 00:16 | |
*** Ryan_Lane has quit IRC | 00:17 | |
mordred | harlowja: well.... | 00:18 |
mordred | harlowja: there is a small and a large answer | 00:18 |
mordred | the answer to both is no - but there are different details | 00:18 |
notmyname | harlowja: you should ask mordred to add a decent python dependency resolver to pbr. :-) | 00:21 |
*** yamahata_ has quit IRC | 00:22 | |
* mordred throws a goat at notmyname | 00:22 | |
*** yamahata_ has joined #openstack-infra | 00:22 | |
notmyname | harlowja: FWIW, I like your ideas and wish to subscribe to your newsletter. I have "issues" and other "feature requests" for managing the requirements file | 00:23 |
harlowja | newsletter, haha | 00:23 |
*** MarkAtwood has quit IRC | 00:24 | |
harlowja | maybe a special issue for u notmyname | 00:24 |
fungi | short answer yes with an if, long answer no with a but | 00:24 |
harlowja | mordred are there any plans on that, especially as we get more and more 3.3 and 2.6/2.7 compat, it becomes a little weird | 00:24 |
mordred | harlowja: yeah, I have some plans | 00:24 |
harlowja | np | 00:25 |
mordred | that are involved with metadata 2.0 | 00:25 |
harlowja | can i subscribe to your newsletter mordred which i can then forward to notmyname as my newsletter (i'll change the newsletter photo), ha | 00:25 |
harlowja | replace with a pict of me, ha | 00:25 |
harlowja | lol | 00:25 |
notmyname | I just saw a patch land in swift. did it sneak through or does that mean stuff is starting to be readded to the gate queue? | 00:26 |
clarkb | notmyname: a few things have snuck onto the gate queue. There were also a couple swift changes that fixed bugs that we put in the gate queue | 00:26 |
*** salv-orlando has quit IRC | 00:26 | |
jog0 | clarkb: we have another patch to add to the VIP list | 00:26 |
notmyname | harlowja: I already get mordred's newsletter. it comes out quarterly, but it normally has booze with it | 00:26 |
jog0 | https://review.openstack.org/#/c/57572/ | 00:26 |
mordred | harlowja, notmyname: part of them involve markerlib | 00:27 |
clarkb | jog0: sounds good to me. now we just need someone to approve that change :) | 00:27 |
clarkb | jeblair: quick question. for shifting the local port range. I wanted to do that in devstack, but would have to do that as one of the first things devstack does for it to have any hope of being useful. Is that ok with you? | 00:28 |
mordred | harlowja, notmyname: http://www.python.org/dev/peps/pep-0345/#version-specifiers | 00:28 |
harlowja | interesting, didn't know about markerlib | 00:28 |
clarkb | jeblair: I also think I will create a puppet change that shifts the range on all of our slaves using /etc/sysctl.d/60-keystone-port-shift | 00:28 |
harlowja | neat, thx mordred | 00:28 |
mordred | or actually, http://www.python.org/dev/peps/pep-0345/#environment-markers | 00:28 |
mordred | harlowja: ^^ | 00:28 |
mordred | the idea | 00:28 |
mordred | Requires-Dist: bar; python_version == '2.4' or python_version == '2.5' | 00:29 |
mordred | is that ^^ | 00:29 |
fungi | jeblair: clarkb: heard back from rackspace... their environment has inherent limitations of 16 xen virtual block devices per domu, so the two built-in plus 14 cinder volumes is the maximum we can expect from them | 00:29 |
harlowja | interesting | 00:29 |
mordred | so, instead of requirements.txt itself, we'll have a list of things potentially in setup.cfg itself | 00:29 |
mordred | which can express using the above format | 00:29 |
mordred | (or both) | 00:29 |
harlowja | gotcha | 00:29 |
harlowja | makes sense | 00:29 |
mordred | but requirements.txt is not really intended to have the flexibility that the above stuff does as part of metadata 2.0 | 00:30 |
harlowja | reminds me of rpm :-/ | 00:30 |
mordred | realistically, I will not get to this until Jan | 00:30 |
mordred | so in the mean time, you can do a local in-tree pbr plugin | 00:30 |
harlowja | its ok, just was wondering the thoughts around it | 00:30 |
jog0 | clarkb: https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 | 00:31 |
jog0 | jeblair fungi: ^ | 00:31 |
jog0 | to keep track of things | 00:31 |
fungi | jog0: thanks--tracking | 00:31 |
jog0 | I may be missing some | 00:31 |
mordred | harlowja: if you look at neutron/hooks.py | 00:31 |
harlowja | cool | 00:31 |
mordred | harlowja: in the neutron tree | 00:31 |
mordred | there is an example | 00:31 |
*** boris-42 has joined #openstack-infra | 00:32 | |
mordred | harlowja: you register it in setup.cfg (so look at that in setup.cfg in neutron too) | 00:32 |
*** atiwari has quit IRC | 00:32 | |
fungi | jog0: those were the three i was aware of plus the two latecomers | 00:32 |
*** matsuhashi has joined #openstack-infra | 00:32 | |
harlowja | nice, i'll check it out mordred | 00:32 |
jog0 | fungi: cool | 00:32 |
jog0 | anteaya: any from neutron we should track? | 00:32 |
*** dolphm_afk is now known as dolphm_for_real_ | 00:35 | |
*** dolphm_for_real_ is now known as dolphm_really_af | 00:35 | |
*** dolphm_really_af is now known as dolphm_reallyafk | 00:36 | |
openstackgerrit | Tim Daly, Jr. proposed a change to openstack-infra/config: Disable python33 check for tomograph for now https://review.openstack.org/57573 | 00:36 |
*** julim has joined #openstack-infra | 00:37 | |
pabelanger | Any other reviewers or feedback for import zuul packaging into -infra? | 00:37 |
pabelanger | https://review.openstack.org/#/c/56107/ | 00:37 |
*** dcramer_ has joined #openstack-infra | 00:38 | |
mordred | pabelanger: I think the team has been heads-down on the gate today | 00:39 |
mordred | pabelanger: however, I spoke with both zul and zigo_ today and both are interested in having larger discussions with you on here | 00:39 |
pabelanger | mordred, no problems. No rush on this | 00:39 |
pabelanger | mordred, Ya, I'm tracking along a debian thread now about it. | 00:39 |
mordred | ossum | 00:40 |
*** arata has joined #openstack-infra | 00:42 | |
anteaya | jog0: 57290 is still looking to merge | 00:42 |
zul | pabelanger: although i don't know how much i can help depending on my workload | 00:42 |
jog0 | anteaya: thanks, on to the list it goes | 00:43 |
*** CaptTofu has quit IRC | 00:43 | |
anteaya | thanks | 00:44 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Shift local port range to avoid IANA conflict. https://review.openstack.org/57575 | 00:44 |
*** sdake_ has quit IRC | 00:44 | |
*** CaptTofu has joined #openstack-infra | 00:44 | |
anteaya | going to be pushing the other bugs mentioned in your gate blocking bugs email, again tomorrow | 00:44 |
clarkb | actually I need to reload the service when that file changes... | 00:44 |
anteaya | have more momentum, so thanks for that | 00:44 |
pabelanger | zul: I think if we get a plan together and people are all on board, I'll be able to commit a fair bit of time to it. I have some code already working locally, but likely need to get a design scope going that everybody is happy with | 00:44 |
clarkb | all of the d-g images are updated now | 00:44 |
*** pcrews has quit IRC | 00:44 | |
zul | pabelanger: cool lemme know | 00:45 |
fungi | clarkb: reload what service? | 00:45 |
fungi | clarkb: reread sysctl.conf? | 00:45 |
fungi | i guess distros do treat that action as a service reload via initscripts | 00:45 |
clarkb | fungi: yeah reread the sysctl.conf and sysctl.d files | 00:46 |
*** julim has quit IRC | 00:46 | |
fungi | it dawned on me that's what you meant. my brain just doesn't connect that with the word "service" for some reason. ignore me | 00:47 |
clarkb | fungi: ubuntu has a procps "service" but centos doesn't appear to have one | 00:47 |
clarkb | I may just use sysctl -p directly | 00:47 |
*** cody-somerville has quit IRC | 00:48 | |
anteaya | when I say pushing I mean haranguing -neutron devs | 00:48 |
anteaya | all things considered this week is going better than last week | 00:48 |
jog0 | anteaya: heh, I just hope we get everything in today (https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013) | 00:49 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Shift local port range to avoid IANA conflict. https://review.openstack.org/57575 | 00:50 |
clarkb | that should be better | 00:50 |
anteaya | well we got no new movement on 1249065, and 1251448 so I will be picking that mission up again tomorrow | 00:50 |
anteaya | already posted that I would be talking about it again tomorrow as a heads up in -neutron tonight | 00:51 |
jog0 | anteaya: thanks | 00:51 |
anteaya | thank you | 00:51 |
anteaya | had a dream about you last night | 00:51 |
anteaya | you were on a bike and I was on a bike behind you | 00:52 |
fungi | clarkb: puppet lint will still hate on you there | 00:52 |
anteaya | pedalled as fast as I could, couldn't keep up | 00:52 |
jog0 | haha | 00:52 |
anteaya | then you took this wicked complex trail and I couldn't follow you | 00:52 |
anteaya | then you disappeared in a pool | 00:52 |
jeblair | clarkb: why shift it in puppet? | 00:53 |
anteaya | I kept getting people to look for you, I was upset you had drowned | 00:53 |
anteaya | in the pool - nodepool | 00:53 |
anteaya | ah dreams | 00:53 |
clarkb | jeblair: because something may grab that port before devstack runs | 00:53 |
clarkb | jeblair: I am going to update devstack too, but devstack itself may not be sufficient | 00:53 |
clarkb | fungi: :/ | 00:53 |
jeblair | clarkb: if not, then i really think we should look into changing the port in the devstack configuration | 00:54 |
clarkb | jeblair: ugh | 00:54 |
clarkb | jeblair: that makes sense sort of. We really should be testing as we expect keystone to be deployed | 00:54 |
clarkb | (granted changing the port is a minor thing) | 00:54 |
jeblair | clarkb: we should have a simple config that will work with "./stack.sh" on our systems as well as a random dev | 00:55 |
*** CaptTofu has quit IRC | 00:55 | |
*** CaptTofu has joined #openstack-infra | 00:55 | |
clarkb | jeblair: should we change the default in devstack itself or have d-g move it for us? | 00:55 |
fungi | i suppose asking iana for a do-over which doesn't conflict with linux's incorrectly-chosen ephemeral ports range isn't going to fly | 00:56 |
clarkb | it seems silly to have an IANA assignment and not use it | 00:56 |
jeblair | clarkb: devstack itself i think. | 00:56 |
jeblair | clarkb: i agree, but it's also silly to have a config that may not work by default... | 00:56 |
*** nati_ueno has joined #openstack-infra | 00:56 | |
jeblair | clarkb: this is a bad situation with no good answers i fear | 00:56 |
clarkb | ya | 00:56 |
*** nati_ueno has quit IRC | 00:57 | |
fungi | so a section in devstack wrapped in a comment which says "really ugly workaround" and then goes in and adjusts your kernel | 00:57 |
clarkb | fungi: the problem with that is you may already have a thing running on your box using the port | 00:58 |
fungi | yep | 00:58 |
clarkb | so jeblair's suggestion is use a port < 32768 | 00:58 |
clarkb | which is an easy change in devstack actually | 00:58 |
mikal | jog0: bad news | 00:58 |
fungi | true. much easier than playing with kernel knobs | 00:58 |
mikal | jog0: that big revert you did still has the console log problem | 00:58 |
mikal | jog0: http://logs.openstack.org/66/57566/1/check/check-tempest-devstack-vm-full/f61db56/console.html | 00:58 |
clarkb | let me whip that up and we can argue over it in gerrit :) | 00:58 |
jeblair | clarkb: ++ | 00:59 |
jog0 | mikal: WAT :( | 00:59 |
jog0 | everything else worked which is pretty amazing | 00:59 |
*** nati_ueno has joined #openstack-infra | 01:00 | |
*** nati_ueno has quit IRC | 01:00 | |
jog0 | mikal: so that means it wasn't a nova patch that did it | 01:00 |
fungi | in better news, our swift friend 57019 is 8 minutes and a neutron py26 unit test run away from merging | 01:00 |
jog0 | at least not on its own | 01:00 |
jog0 | fungi: :) | 01:00 |
jog0 | mikal: I really thought this was going to work :/ | 01:00 |
*** ericw has joined #openstack-infra | 01:01 | |
jog0 | so if its not nova, and we don't *think* its the new tempest | 01:01 |
jog0 | what are the other candidates? | 01:01 |
jog0 | we could revert back another whole week | 01:01 |
jog0 | just to be extra safe | 01:01 |
jog0 | but don't see what that could possibly do | 01:01 |
fungi | jog0: mikal: did new libvirt get ruled out for solid reasons, or just on the merit of its changelog? | 01:01 |
jog0 | fungi: it got ruled out based on timing | 01:01 |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Setup a private gerrit instance for security reviews https://review.openstack.org/47937 | 01:01 |
fungi | ahh, okay | 01:02 |
*** matsuhashi has quit IRC | 01:02 | |
jog0 | not a strong rule out just a, how could this timing line up | 01:02 |
fungi | that's pretty strong, all things considered | 01:02 |
clarkb | jog0: could be the new package | 01:02 |
*** matsuhashi has joined #openstack-infra | 01:02 | |
clarkb | though that seems very far fetched | 01:02 |
*** cody-somerville has joined #openstack-infra | 01:02 | |
jog0 | clarkb: I did a diff of apt-get | 01:02 |
jog0 | and only change was libvirt | 01:02 |
clarkb | ya the new libvirt package, sorry I stopped typing too soon | 01:03 |
fungi | clarkb: you ruled out anything having to do with timing being coincidental to when we started putting wheels in pypi.o.o right? | 01:03 |
fungi | i know you mentioned it as maybe a thing at one point | 01:03 |
clarkb | fungi: yes, we are not consuming the wheels at all | 01:04 |
clarkb | they are just on the mirror for when we want to consume them | 01:04 |
fungi | okay, awesome | 01:04 |
*** nati_ueno has joined #openstack-infra | 01:05 | |
jog0 | mikal: I will try reverting this test | 01:05 |
jog0 | https://review.openstack.org/#/c/54363/ | 01:05 |
jog0 | touches the same tempest file and matches up timing-wise | 01:05 |
jog0 | and yes I am just guessing now | 01:06 |
jog0 | and will abandon my revert of the last week of nova patches | 01:06 |
*** nati_ueno has quit IRC | 01:06 | |
*** nati_ueno has joined #openstack-infra | 01:06 | |
*** jhesketh has joined #openstack-infra | 01:06 | |
*** matsuhashi has quit IRC | 01:07 | |
anteaya | jog0: this patch of dims passed check and I think we should have logging were we don't have it: https://review.openstack.org/#/c/56316/ | 01:07 |
anteaya | *where | 01:07 |
jog0 | anteaya: makes sense to me | 01:07 |
jog0 | if we think it will help | 01:07 |
*** senk has quit IRC | 01:08 | |
jog0 | mikal: latest attempt https://review.openstack.org/#/c/57578/ | 01:09 |
clarkb | jeblair: https://review.openstack.org/#/c/57577/ | 01:09 |
fungi | 57373 isn't going to make it this time around | 01:10 |
sdague | jog0: so... my proposed fix has some other issue | 01:10 |
jog0 | sdague: :( | 01:10 |
sdague | let me be more clever | 01:10 |
*** reed has quit IRC | 01:10 | |
*** zul has quit IRC | 01:11 | |
jog0 | mikal: so https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z https://review.openstack.org/#/c/57578/ | 01:11 |
anteaya | jog0: I don't see more logging hurting anything | 01:11 |
*** gyee has quit IRC | 01:12 | |
anteaya | yay 57290 made it | 01:13 |
clarkb | woot | 01:13 |
*** thomasem has quit IRC | 01:13 | |
*** dcramer_ has quit IRC | 01:13 | |
anteaya | that is all that -neutron has to offer this round | 01:13 |
clarkb | jog0: https://review.openstack.org/#/c/57373/ | 01:14 |
clarkb | jog0: that change is going to fail the gate | 01:14 |
*** matsuhashi has joined #openstack-infra | 01:15 | |
jog0 | clarkb: sigh | 01:15 |
fungi | most of them came through in that pass, marked in the etherpad as merged now | 01:15 |
jog0 | another kick | 01:15 |
jog0 | fungi: WOOT! | 01:15 |
clarkb | ooh 57509 got in though | 01:16 |
jog0 | wow we got most of em | 01:16 |
anteaya | fungi you are fast on the etherpad | 01:16 |
jog0 | very | 01:16 |
* fungi wins at etherpad | 01:16 | |
clarkb | so we are getting there | 01:16 |
clarkb | 1251920 still hates us, but removing the other noise should help a lot I bet | 01:16 |
jog0 | clarkb: yeah | 01:16 |
anteaya | what is the call? | 01:16 |
jog0 | hoping this may work https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z | 01:16 |
anteaya | requeue or still on lockdown? | 01:17 |
jog0 | wait | 01:17 |
anteaya | waiting | 01:17 |
jog0 | 920 is a hard one | 01:17 |
clarkb | yeah we need to sort 920 if at all possible | 01:17 |
*** boris-42 has quit IRC | 01:20 | |
mikal | jog0: it occurs to me that there are two other things we can try | 01:21 |
mikal | jog0: running modern code with the Havana tempest (I know that's sort of what you're doing) | 01:21 |
mikal | jog0: and reverting any deltas in the requirements files since Havana | 01:21 |
mikal | jog0: given that Havana seems to just work for us | 01:22 |
jog0 | mikal: ohh requirements didn't really look at that too much | 01:23 |
jog0 | although did a pip freeze diff | 01:23 |
jog0 | mikal: anyway at this point any idea you have, do it! | 01:23 |
mikal | jog0: dude, that's been my strategy for days... :P | 01:23 |
jog0 | mikal: :) | 01:24 |
*** DennyZhang has joined #openstack-infra | 01:24 | |
jog0 | so clarkb did stable code vs trunk tempest | 01:24 |
jog0 | we can try modern nova vs havana tempest | 01:24 |
mikal | I think that's worth a try | 01:24 |
jog0 | do a squash of all changes since havana and propose to stable | 01:24 |
jog0 | easy enough | 01:24 |
jog0 | and can't hurt | 01:24 |
mikal | I don't see anything obvious in requirements | 01:24 |
mikal | New netaddr and Babel basically | 01:25 |
jog0 | I do like the modern code on havana | 01:26 |
*** dcramer_ has joined #openstack-infra | 01:26 | |
clarkb | jeblair: heh devstack really didn't like that port change | 01:27 |
clarkb | doesn't look like we change the port in the config | 01:30 |
*** nosnos has joined #openstack-infra | 01:30 | |
*** xchu has joined #openstack-infra | 01:31 | |
*** yaguang has joined #openstack-infra | 01:31 | |
lifeless | righto, how is the gate looking? | 01:31 |
clarkb | lifeless: it is a bit better. we got ~4-5 changes in that fix a variety of problems but the console problem (bug 1251920) is still outstanding | 01:32 |
uvirtbot | Launchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [Critical,Fix committed] https://launchpad.net/bugs/1251920 | 01:32 |
sdague | jog0: so I'm lacking a reasonable test env right now, but I know the conceptual fix for this if you are able to code it up? | 01:33 |
jog0 | sdague: sorry I am about to go AWOL for a bit | 01:33 |
jog0 | but lifeless just popped back in | 01:33 |
sdague | no worries | 01:34 |
sdague | so the conceptual fix - https://github.com/openstack-dev/devstack/blob/master/lib/tempest#L145 need to conditionally create those flavors if they don't exist | 01:34 |
sdague | we are failing creating them the 2nd time | 01:34 |
clarkb | jeblair: new patchset on the devstack change should fix the problem (it wasn't being iniset) | 01:34 |
lifeless | because they already exit? | 01:34 |
lifeless | exist? | 01:34 |
sdague | lifeless: yes | 01:35 |
clarkb | sdague: this is for the grenade failures? | 01:35 |
sdague | nova-flavor create fails on second attempt | 01:35 |
sdague | clarkb: yes, after that's done, my other grenade patch will take us through the right part of that conditional | 01:35 |
lifeless | so a flavor-show check should be sufficient | 01:36 |
sdague | lifeless: yes | 01:36 |
sdague | I just don't trust myself with the syntax without functioning test env :) | 01:36 |
*** arata has left #openstack-infra | 01:37 | |
*** sarob has joined #openstack-infra | 01:37 | |
*** mriedem has joined #openstack-infra | 01:37 | |
sdague | there is a better refactoring here as well, but honestly, I won't have the concentration for it until I'm back home | 01:38 |
clarkb | lifeless: were you going to whip up a patch? | 01:38 |
clarkb | I don't have a proper devstack/tempest testbed but have no problems throwing something at jenkins >_> | 01:39 |
jeblair | clarkb: why does 57577,2 reference KEYSTONE_SERVICE_PORT in the else? | 01:41 |
clarkb | jeblair: so that you can override that value too | 01:42 |
clarkb | without it only the auth port can be overridden in the config | 01:43 |
jeblair | clarkb: so it's a minor nice enhancement but not strictly needed for the patch? | 01:43 |
lifeless | clarkb: you should | 01:44 |
lifeless | clarkb: I'm being torn hither and yon | 01:44 |
lifeless | clarkb: and I too would be tossing at jenkins and waiting | 01:44 |
clarkb | jeblair: correct | 01:44 |
clarkb | we really don't make it easy to parse the output of flavor-list | 01:45 |
*** matsuhashi has quit IRC | 01:45 | |
*** matsuhashi has joined #openstack-infra | 01:46 | |
sdague | jenkins tossing is fine | 01:46 |
lifeless | sdague: so how come it is a sporadic fail? | 01:47 |
jog0 | I'm out for the day | 01:48 |
jog0 | thanks everyone | 01:48 |
jog0 | we made decent progress considering | 01:48 |
sdague | because we're using default guest sizes that are too large | 01:48 |
sdague | so if you get too many of them up at the same time, you run out of memory | 01:48 |
*** jerryz has quit IRC | 01:48 | |
sdague | all depending on which come and go in which order | 01:49 |
anteaya | thanks jog0 | 01:49 |
anteaya | beware the pools | 01:49 |
*** pcrews has joined #openstack-infra | 01:49 | |
*** wenlock has quit IRC | 01:50 | |
*** ryanpetrello has quit IRC | 01:52 | |
clarkb | sdague: lifeless: something like https://review.openstack.org/57584 | 01:52 |
* anteaya cheers for 57584 | 01:53 | |
sdague | clarkb: yep | 01:57 |
sdague | if it passes tests, I'll rush it through | 01:58 |
*** sarob has quit IRC | 01:58 | |
jeblair | i double checked some of the shell logic against rackspace output | 01:58 |
clarkb | jeblair: ya I was doing that locally too | 01:59 |
clarkb | jeblair: I can't figure out why the keystone port change is still unhappy though. Looks like installing or starting tempest may be erroring | 01:59 |
clarkb | sdague: how does tempest know what port to use for keystone? | 01:59 |
clarkb | tempest.conf seems to only use port 5000 | 02:00 |
*** nati_uen_ has joined #openstack-infra | 02:00 | |
clarkb | the lights just went out in this room. I am going to head home now | 02:02 |
sdague | clarkb: it uses the uri | 02:02 |
sdague | in the config | 02:02 |
sdague | defaulting to the 5000 one | 02:02 |
sdague | but it's settable | 02:02 |
sdague | and, yeh, need to head out here as well and regroup for dinner | 02:02 |
clarkb | sdague: but I shouldn't need to change it if I change the keystone auth port? | 02:02 |
clarkb | moving from 35357 to 32357 | 02:03 |
clarkb | that particular bug is probably a lower priority since chance says 35357 will be available | 02:03 |
sdague | hmmm... good question, and honestly my brain is a bit shot | 02:04 |
clarkb | just not always | 02:04 |
clarkb | sdague: thats fair, I am going to afk and rest the brain | 02:04 |
jeblair | clarkb: http://logs.openstack.org/77/57577/2/check/check-tempest-devstack-vm-full/21f7af6/console.html | 02:04 |
jeblair | clarkb: er, yeah, i don't understand what is failing there. as in literally -- what command is failing? | 02:04 |
*** nati_ueno has quit IRC | 02:04 | |
*** DennyZhang has quit IRC | 02:05 | |
clarkb | jeblair: even the devstack log ends there | 02:05 |
clarkb | I probably need to run my change locally | 02:05 |
jeblair | clarkb: oh, there are a bunch of "ERROR: Unauthorized" lines in there | 02:05 |
clarkb | and get hands on with it | 02:05 |
clarkb | I bet we are hardcoded to 35357 in other places | 02:05 |
*** bingbu has joined #openstack-infra | 02:09 | |
*** sgran has quit IRC | 02:09 | |
jeblair | clarkb: yeah the nova admin commands are emitting that | 02:09 |
lifeless | clarkb: reviewed; minor quibble | 02:10 |
*** sgran has joined #openstack-infra | 02:10 | |
*** michchap_ has joined #openstack-infra | 02:11 | |
*** moted has quit IRC | 02:11 | |
*** ericw has quit IRC | 02:12 | |
*** michchap has quit IRC | 02:14 | |
*** nati_uen_ has quit IRC | 02:15 | |
*** nati_ueno has joined #openstack-infra | 02:16 | |
*** dolphm_reallyafk is now known as dolphm | 02:17 | |
anteaya | I can't hold -neutron in lockdown much longer, our devs in the other side of the world who aren't on irc are waking up and submitting patches | 02:17 |
anteaya | and most of the other core devs are offline | 02:18 |
*** sarob has joined #openstack-infra | 02:18 | |
*** ArxCruz has quit IRC | 02:18 | |
*** michchap has joined #openstack-infra | 02:21 | |
*** dkliban_ has quit IRC | 02:21 | |
*** moted has joined #openstack-infra | 02:22 | |
*** michchap_ has quit IRC | 02:24 | |
anteaya | lifeless did you want to comment again on 57584, it looks like it might pass check | 02:24 |
*** boris-42 has joined #openstack-infra | 02:24 | |
lifeless | anteaya: hmm? | 02:24 |
anteaya | lifeless: https://review.openstack.org/#/c/57584/1 | 02:25 |
anteaya | jeblair asked a question | 02:26 |
anteaya | I have my +1 handy | 02:26 |
anteaya | would be great if the two of you came to agreement | 02:26 |
lifeless | My comment is a comment, not a -1. | 02:27 |
*** ftcjeff has joined #openstack-infra | 02:27 | |
*** sdake_ has joined #openstack-infra | 02:30 | |
clarkb | I am walking home will check comment from there | 02:32 |
*** loq_mac has quit IRC | 02:32 | |
*** llu has joined #openstack-infra | 02:34 | |
*** masayukig has quit IRC | 02:37 | |
*** llu has left #openstack-infra | 02:37 | |
*** masayukig has joined #openstack-infra | 02:37 | |
*** dolphm has quit IRC | 02:38 | |
*** jhesketh__ has quit IRC | 02:38 | |
*** wenlock has joined #openstack-infra | 02:38 | |
anteaya | lifeless: okay | 02:41 |
*** jhesketh__ has joined #openstack-infra | 02:42 | |
*** yamahata_ has quit IRC | 02:42 | |
clarkb | lifeless: is there less work to do? http GETs are relatively expensive compared to a regex right? | 02:43 |
lifeless | clarkb: it's a tradeoff; in this case - meh -, but when dealing with all neutron ports for instance, individual GETs will be much cheaper. | 02:44 |
lifeless | clarkb: in principle a single GET of a flavor is one digital signature + a one row DB query. | 02:45 |
jeblair | it takes 1.5 seconds against rax prod. | 02:45 |
*** dolphm has joined #openstack-infra | 02:45 | |
lifeless | clarkb: however openstack hasn't put much effort into optimisation yet... | 02:45 |
*** dkranz has joined #openstack-infra | 02:45 | |
lifeless | jeblair: :( | 02:47 |
lifeless | anyhow, I don't think it needs changing, it will work. | 02:48 |
lifeless | it's more a principle thing, avoiding unnecessary work. | 02:48 |
*** guohliu has joined #openstack-infra | 02:51 | |
*** senk has joined #openstack-infra | 02:51 | |
anteaya | go go 57584 | 02:52 |
anteaya | if/after that merges are we still on gate lockdown? | 02:52 |
anteaya | who is keeping track? | 02:52 |
jeblair | anteaya: https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 | 02:53 |
anteaya | yes | 02:53 |
anteaya | I'll add 57584 | 02:53 |
anteaya | if it merges, are we open for business again? | 02:54 |
*** sarob has quit IRC | 02:55 | |
*** sarob has joined #openstack-infra | 02:56 | |
*** sdake_ has quit IRC | 02:56 | |
anteaya | I need to go to sleep soon, but I need to know what to tell -neutron before I go offline | 02:57 |
anteaya | they have been really co-operative and I want to give them good information | 02:57 |
clarkb | anteaya: we need 57584 then the change it depends on | 02:57 |
clarkb | then a revert of another change so ~3 changes to go | 02:57 |
anteaya | okay | 02:57 |
*** wenlock has quit IRC | 02:57 | |
clarkb | 1251920 is still an open issue though we seem to actually be able to merge things now after the various other fixes have gone in | 02:58 |
*** dkliban_ has joined #openstack-infra | 02:58 | |
anteaya | clarkb: 57584 doesn't have any dependencies | 02:58 |
anteaya | what is the change it depends on? | 02:58 |
clarkb | anteaya: a change in a different project, this is noted on the etherpad | 02:58 |
anteaya | sorry yes, now I see it | 02:59 |
*** thomasem has joined #openstack-infra | 02:59 | |
anteaya | clarkb: do we have a url for the revert? | 03:00 |
clarkb | anteaya: I don't think anyone has proposed it yet | 03:00 |
*** sarob has quit IRC | 03:00 | |
anteaya | so do we leave the gate in lockdown all night? | 03:00 |
anteaya | or open it up and then lock it again tomorrow | 03:01 |
anteaya | or are people going to stay up? | 03:01 |
clarkb | anteaya: considering the success of us asking people to stop I am half tempted to keep it this way until the last few really annoying bugs get fixed | 03:01 |
anteaya | okay | 03:01 |
*** thomasem has quit IRC | 03:01 | |
clarkb | but I also think we need to be fair and let other people do normal work | 03:01 |
clarkb | so I honestly don't have a strong opinion either way | 03:01 |
anteaya | I vote for opening it up and going to sleep - for me | 03:02 |
anteaya | and assessing the situation tomorrow | 03:02 |
clarkb | ya worst case we shut it down again and do what we did today | 03:02 |
anteaya | yes | 03:02 |
cyeoh | anteaya: the "dependent" one is https://review.openstack.org/#/c/57572/ | 03:02 |
clarkb | which wasn't too painful | 03:02 |
anteaya | the poor folks who submit to neutron and who are not on irc are going to have a heck of a confusing work day otherwise | 03:03 |
cyeoh | clarkb: is the revert one you are referring to the v3 disable patch? | 03:03 |
anteaya | cyeoh: thanks | 03:03 |
anteaya | I picture the poor man who emailed mikal looking for his account credentials | 03:04 |
*** dolphm has quit IRC | 03:04 | |
clarkb | cyeoh: ya we should unrevert the v3 disable patch if the other grenade things get us going | 03:04 |
anteaya | easier to let them submit and take their chances | 03:04 |
*** sdake_ has joined #openstack-infra | 03:04 | |
anteaya | clarkb: and yes this was successful, for a rare event | 03:04 |
*** matsuhas_ has joined #openstack-infra | 03:05 | |
cyeoh | clarkb: yep, agreed. I'll propose the revert and leave a comment not to approve it until we know the gate is ok. | 03:06 |
clarkb | sounds good | 03:06 |
*** matsuhashi has quit IRC | 03:08 | |
*** nati_uen_ has joined #openstack-infra | 03:10 | |
*** dcramer_ has quit IRC | 03:13 | |
*** nati_ueno has quit IRC | 03:14 | |
*** fifieldt has quit IRC | 03:16 | |
sdague | ok, so what is the magic invocation to use python-novaclient against rax without any of their crazy gorp? | 03:17 |
clarkb | sdague: uh uh uh I think you use passwords | 03:17 |
clarkb | so you must avoid all the api key stuff | 03:17 |
sdague | it didn't like that | 03:17 |
*** fifieldt has joined #openstack-infra | 03:18 | |
sdague | I'm trying to use baked in python-novaclient on 13.10 | 03:18 |
notmyname | anteaya: clarkb: so stuff that was blocked on the gate should be "reverify no bug" right? | 03:19 |
sdague | I carved off a user on my rax account, and tried to use the devstack-gate script | 03:19 |
clarkb | notmyname: yes | 03:19 |
anteaya | clarkb: are we opening the gate again? | 03:19 |
clarkb | mordred: do you have an answer for sdague? I think you have actually tested it | 03:20 |
sdague | ERROR: Invalid OpenStack Nova credentials. | 03:20 |
clarkb | anteaya: meh, I would like those 3 changes to go through but we have been dealing with a bit of leakage all day anyways | 03:20 |
clarkb | notice the many ceilometer changes in the gate... | 03:20 |
anteaya | yes I saw | 03:21 |
*** sdake_ has quit IRC | 03:21 | |
notmyname | anteaya: it's slow now anyway. what's the worst that could happen? ;-) | 03:21 |
anteaya | I have a neutron-core in Japan -2 everything -neutron that has been submitted | 03:21 |
anteaya | well if we are I will communicate as such to -neutron and go to bed | 03:22 |
anteaya | I just want to comply | 03:22 |
anteaya | since that is the message I am bringing to -neutron | 03:22 |
anteaya | and they are getting it | 03:22 |
clarkb | anteaya: I think it is ok to start letting things through | 03:22 |
anteaya | very good | 03:22 |
anteaya | thank you | 03:23 |
clarkb | because most people aren't able to babysit bug fixes anyways | 03:23 |
clarkb | and if we get a giant backlog again we will just kill it and repeat | 03:23 |
sdague | once that devstack change lands, can someone trigger a recheck on - https://review.openstack.org/#/c/57572/ - if anyone is up | 03:23 |
clarkb | sdague: ya I can do that | 03:23 |
*** fifieldt has quit IRC | 03:23 | |
sdague | I'm about to head to dinner, and if that's green when I get back, I'll push it to the gate | 03:23 |
clarkb | k | 03:23 |
anteaya | night all | 03:23 |
sdague | man, what I wouldn't give for that Depends-On: logic :) | 03:24 |
anteaya | thanks for all the hard work today | 03:24 |
clarkb | yup thanks everyone | 03:24 |
*** DennyZhang has joined #openstack-infra | 03:25 | |
*** dkliban_ has quit IRC | 03:25 | |
*** melwitt has quit IRC | 03:30 | |
*** svarnau has quit IRC | 03:30 | |
sdague | maybe I can troll mikal about it | 03:32 |
*** wenlock has joined #openstack-infra | 03:37 | |
*** nati_uen_ has quit IRC | 03:39 | |
*** nati_ueno has joined #openstack-infra | 03:40 | |
notmyname | bad news? https://review.openstack.org/#/c/57582/ | 03:41 |
notmyname | failed because of bug 1251920 | 03:41 |
uvirtbot | Launchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [Critical,Fix committed] https://launchpad.net/bugs/1251920 | 03:41 |
notmyname | check, not gate | 03:41 |
*** pcrews has quit IRC | 03:45 | |
*** fifieldt has joined #openstack-infra | 03:46 | |
*** dkliban_ has joined #openstack-infra | 03:48 | |
*** wenlock has quit IRC | 03:48 | |
clarkb | notmyname: that is going to be the worst outstanding bug that we haven't managed to fix yet | 03:49 |
clarkb | so we will still probably see a relatively high incidence of it. hopefully easier to debug now that a lot of the other problems have been fixed | 03:49 |
notmyname | ok, thanks | 03:49 |
*** loq_mac has joined #openstack-infra | 03:50 | |
*** CaptTofu has quit IRC | 03:54 | |
*** CaptTofu has joined #openstack-infra | 03:55 | |
*** michchap has quit IRC | 03:55 | |
*** boris-42 has quit IRC | 03:57 | |
*** nati_uen_ has joined #openstack-infra | 03:57 | |
*** boris-42 has joined #openstack-infra | 03:57 | |
*** nati_uen_ has quit IRC | 03:58 | |
*** matsuhas_ has quit IRC | 04:00 | |
*** nati_ueno has quit IRC | 04:01 | |
*** matsuhashi has joined #openstack-infra | 04:02 | |
zaro | anybody know if there's a way to search openstack-infra mailing list? | 04:04 |
*** ftcjeff has quit IRC | 04:04 | |
clarkb | zaro: google | 04:04 |
*** mriedem has quit IRC | 04:04 | |
clarkb | site:lists.openstack.org and go from there | 04:05 |
zaro | clarkb: google groups or just plain google? | 04:05 |
clarkb | just plain google, they should be indexing the mail archives | 04:05 |
*** chandankumar has joined #openstack-infra | 04:07 | |
*** sandywalsh has quit IRC | 04:07 | |
zaro | too much crap.. just gonna holla. | 04:08 |
zaro | fungi: do you have a link to that security gerrit question i sent to infra list? the one asking about permissions on groups for security gerrit? | 04:09 |
clarkb | zaro: http://lists.openstack.org/pipermail/openstack-infra/2013-October/000314.html that one? | 04:10 |
*** matsuhashi has quit IRC | 04:14 | |
*** matsuhashi has joined #openstack-infra | 04:19 | |
*** DinaBelova has joined #openstack-infra | 04:22 | |
portante | clarkb: still around? | 04:22 |
clarkb | portante: ya | 04:23 |
*** odyssey4me has joined #openstack-infra | 04:23 | |
portante | so the gate jobs seem odd | 04:23 |
*** DinaBelova has quit IRC | 04:23 | |
portante | looking at the top one, the jenkins output is at "running testr" | 04:24 |
zaro | holla! u da best clarkb | 04:24 |
portante | is that right? | 04:24 |
clarkb | portante: looking | 04:24 |
clarkb | portante: which one can you link? the one I grabbed isn't doing that | 04:25 |
portante | I was looking at https://jenkins02.openstack.org/job/gate-tempest-devstack-vm-large-ops/16508/console | 04:26 |
clarkb | portante: I have noticed in the past that since testr by default is pretty quiet the jenkins console output will appear to hang there but it is just buffering the data and waiting for enough to come back to display it to you | 04:27 |
portante | and it seems I just caught it after 04:18:09 | 04:27 |
clarkb | portante: I think that is what happened in that test | 04:27 |
clarkb | basically testr wasn't writing enough stuff to the console to have jenkins show it to you | 04:27 |
portante | I saw that in a few jobs, so just checking | 04:27 |
*** dkranz has quit IRC | 04:35 | |
*** clayg has left #openstack-infra | 04:35 | |
*** loq_mac has quit IRC | 04:35 | |
*** dkranz has joined #openstack-infra | 04:36 | |
*** senk has quit IRC | 04:37 | |
*** sandywalsh has joined #openstack-infra | 04:37 | |
*** sdake_ has joined #openstack-infra | 04:38 | |
*** sdake_ has joined #openstack-infra | 04:38 | |
clarkb | 1251920 you are my nemesis | 04:41 |
clarkb | sdague: my devstack change hit 1251920, I will reverify, but ya ugh | 04:41 |
portante | clarkb: so we are not out of the woods yet | 04:44 |
clarkb | portante: no not completely, most of the problems other than 1251920 that we had identified were fixed | 04:44 |
clarkb | so today was good, but not enough | 04:45 |
clarkb | problem with 1251920 is I don't think anyone knows why it is happening | 04:45 |
clarkb | and most theories have been debunked | 04:45 |
portante | hmm, if somebody wants to walk me through it, I can take a look tomorrow | 04:45 |
portante | I say that naively, just happy to help | 04:46 |
clarkb | portante: I am pretty naive to it as well. tl;dr is tempest has a test that requests the console output from a qemu VM via nova. For some reason at a relatively high incidence rate nova fails to return that data | 04:47 |
clarkb | mikal has done some good investigating. he may be able to chime in and give more info | 04:47 |
mikal | Last I heard jog0 was trying with the havana tempest | 04:53 |
mikal | Not sure how that went | 04:53 |
*** nati_ueno has joined #openstack-infra | 04:57 | |
*** DennyZhang has quit IRC | 04:59 | |
*** michchap has joined #openstack-infra | 05:01 | |
*** matsuhashi has quit IRC | 05:02 | |
*** odyssey4me has quit IRC | 05:03 | |
portante | clarkb, mikal: so tempest was changed sometime after Havana, and how was it tested to ensure it works? | 05:04 |
*** afazekas has quit IRC | 05:05 | |
clarkb | portante: I rebased tempest master onto tempest havana, squashed that into one commit, then proposed several changes with that code so that icehouse tempest would be run against everything else havana | 05:07 |
portante | oh, and how was that received? | 05:08 |
clarkb | portante: https://review.openstack.org/#/c/57504/ https://review.openstack.org/#/c/57506/ and https://review.openstack.org/#/c/57507/ | 05:08 |
clarkb | the jobs don't pass but no incidence of 1251920 | 05:08 |
portante | that is a good start | 05:08 |
portante | seems like we need to consider more closely what changes to tempest mean | 05:09 |
clarkb | yeah, we might need to go through it with a comb instead of my bruteforce squash everything together method | 05:09 |
portante | i'd be happy to help | 05:09 |
portante | ... tomorrow ... :) | 05:09 |
clarkb | I am about to sign off for the night as well | 05:10 |
portante | ping when you are ready | 05:10 |
clarkb | will do | 05:10 |
portante | 'night | 05:10 |
*** svarnau has joined #openstack-infra | 05:13 | |
*** loq_mac has joined #openstack-infra | 05:17 | |
*** loq_mac has quit IRC | 05:19 | |
*** loq_mac has joined #openstack-infra | 05:20 | |
*** odyssey4me has joined #openstack-infra | 05:20 | |
*** loq_mac has quit IRC | 05:20 | |
*** loq_mac has joined #openstack-infra | 05:21 | |
*** odyssey4me2 has joined #openstack-infra | 05:21 | |
*** wenlock has joined #openstack-infra | 05:21 | |
*** odyssey4me3 has joined #openstack-infra | 05:23 | |
*** odyssey4me has quit IRC | 05:25 | |
*** odyssey4me2 has quit IRC | 05:26 | |
jog0 | mikal: reading scrollback | 05:27 |
*** afazekas has joined #openstack-infra | 05:27 | |
*** afazekas has quit IRC | 05:27 | |
jog0 | mikal: https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z looks promising | 05:28 |
*** dkranz has quit IRC | 05:29 | |
jog0 | mikal: can you babysit that patch for US night | 05:29 |
*** dkranz has joined #openstack-infra | 05:29 | |
jog0 | and if it looks good, propose a patch with a proper commit message | 05:29 |
jog0 | 4 runs and no consolelog == promising | 05:30 |
jog0 | only hitting that grenade bug | 05:30 |
*** svarnau has quit IRC | 05:31 | |
clarkb | jog0: ooh | 05:32 |
* clarkb looks | 05:32 | |
jgriffith | jog0: not core but I can run rechecks if there's a threshold you want to hit on it | 05:32 |
jgriffith | Let me know... I'll be up for a bit | 05:32 |
jog0 | jgriffith: yes! | 05:32 |
clarkb | jog0: any idea why that may help? or was that a stab in the dark? | 05:32 |
jog0 | clarkb: stab in dark | 05:32 |
jog0 | jgriffith: lets say if we get 10 collective runs without console log | 05:33 |
*** loq_mac has quit IRC | 05:33 | |
jog0 | jgriffith: and we have 4 patches to babysit | 05:33 |
jog0 | https://review.openstack.org/#/c/57578/ https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:57578,n,z | 05:33 |
jgriffith | aye... | 05:34 |
jgriffith | well I'm happy to help monitor on and off | 05:34 |
jgriffith | without knowing which is root it's kinda dicey though | 05:34 |
jog0 | btw http://paste.openstack.org/show/53720/ | 05:35 |
jog0 | getting better | 05:35 |
*** mihgen has joined #openstack-infra | 05:35 | |
jog0 | jgriffith: we are looking for https://bugs.launchpad.net/nova/+bug/1251920 | 05:36 |
uvirtbot | Launchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [Critical,Fix committed] | 05:36 |
jog0 | mainly AssertionError: Console output was empty. | 05:36 |
jgriffith | Yep, caught up on that at least | 05:36 |
jog0 | but yeah I don't know the root | 05:36 |
jgriffith | jog0: I'll help out if I can, I can at least be the "recheck" monkey | 05:36 |
jog0 | clarkb: that patch touched the right file at the right time | 05:36 |
jog0 | jgriffith: thanks | 05:36 |
jgriffith | jog0: looking at the test/code it seems reasonable | 05:37 |
*** zul has joined #openstack-infra | 05:37 | |
jgriffith | anyway... | 05:37 |
jog0 | jgriffith: I didn't finish it but I was writing a recheck monkey myself | 05:37 |
clarkb | jog0: gotcha | 05:37 |
jog0 | http://paste.debian.net/66940/ | 05:37 |
jog0 | jgriffith: you can just finish that up instead if you want | 05:37 |
jog0 | gerritlib can comment on patches | 05:37 |
* jgriffith has been replaced by software :) | 05:37 | |
*** sarob has joined #openstack-infra | 05:39 | |
*** sarob has quit IRC | 05:43 | |
*** SergeyLukjanov has joined #openstack-infra | 05:48 | |
jgriffith | jog0: cool... so if I'm following your intent here | 05:53 |
jgriffith | jog0: replace the 57357 with a list of the other 4 patches | 05:54 |
jog0 | yeah | 05:54 |
jgriffith | jog0: add the handler to detect and leave comment on event | 05:54 |
clarkb | jog0: ken'ichi appears to be on the same track (see email) | 05:54 |
clarkb | so this has me very hopeful | 05:54 |
jgriffith | I'll have to wait til I get one to dissect/understand the results it gives back but should be straightforward | 05:55 |
jgriffith | playing in the lib code now to try and get a preview | 05:55 |
jog0 | clarkb: cool | 05:55 |
jog0 | so we have at least 8 successes then | 05:56 |
jog0 | ken'ichi's and ours | 05:56 |
jog0 | another 2 and I say go for it | 05:56 |
clarkb | jog0: should probably do ken'ichi's change as it skips rather than removes the test (though a broken test should maybe be removed?) | 05:56 |
jog0 | clarkb: same difference to me | 05:57 |
clarkb | jog0: oh the skip includes the bug, I like that | 05:59 |
jog0 | clarkb: that sounds better | 05:59 |
clarkb | jog0: oh those tests will use the same server | 06:00 |
clarkb | I bet if they run in the same process the backup test is interfering with the console test | 06:01 |
jog0 | clarkb: ahhh that makes sense so depending on order the server is in a bad state | 06:01 |
jog0 | clarkb: cool | 06:01 |
jog0 | if you can debug that a little more lets merge | 06:01 |
clarkb | but when run in different processes its ok because they should use different servers | 06:01 |
* clarkb looks at subunit logs | 06:01 | |
jog0 | clarkb: also the original patch failed for this bug a few times | 06:05 |
jog0 | which confirms my theory | 06:05 |
jog0 | if gate is flakey people ignore who causes the flakey | 06:05 |
*** jhesketh has quit IRC | 06:05 | |
clarkb | jog0: argh | 06:06 |
clarkb | jog0: so my theory is at least partially wrong because all tempest tests belonging to the same class go in the same test process | 06:06 |
clarkb | jog0: but it may have to do with test order | 06:06 |
clarkb | so looking at that now | 06:06 |
jog0 | clarkb: yeah | 06:06 |
jog0 | thats what I was thinking | 06:06 |
jog0 | if console log is second it will break is my guess | 06:09 |
jog0 | why not sure | 06:09 |
clarkb | so in one failure it is second | 06:09 |
clarkb | going to look at success now | 06:09 |
clarkb | also XML and JSON both did backup then console output but only xml failed | 06:09 |
jog0 | ohh that could be part of the issue | 06:10 |
jog0 | xml vs json tests | 06:10 |
mikal | jog0: yep, I will recheck it a couple of times before bed | 06:10 |
mikal | Unless jgriffith beats me to it | 06:11 |
jgriffith | scripts about done, just need to verify the message in the event | 06:11 |
mikal | Its all good | 06:11 |
clarkb | jog0: a passing test has backup then console for both XML and JSON so it may be more subtle than I was hoping | 06:11 |
jog0 | clarkb: :/ | 06:11 |
jog0 | either way looks like a good candidate and disabling the test is almost 0 risk | 06:12 |
jog0 | and we have 8 passes | 06:12 |
jog0 | clarkb: what's the status of the grenade fix? | 06:12 |
jgriffith | mikal: if nothing else I know have a utility to tell me to go hit recheck :) | 06:12 |
jgriffith | s/konw/now/ | 06:12 |
jog0 | jgriffith: I was going to take it further and have it auto run recheck | 06:13 |
clarkb | jog0: devstack change is in the gate | 06:13 |
clarkb | it hit 1251920 | 06:13 |
jgriffith | jog0: yeah, that's where I'm at | 06:13 |
jog0 | look at the code in elastic-recheck for how to do that | 06:13 |
jgriffith | jog0: just wanted to see a success message example back from gerrit before trying to cod | 06:13 |
jgriffith | ahhhh | 06:13 |
jgriffith | excellet! | 06:13 |
jgriffith | ent | 06:13 |
jog0 | jgriffith: thats why i didn't do it yet either | 06:13 |
jgriffith | LOL | 06:13 |
*** rpodolyaka1 has joined #openstack-infra | 06:16 | |
sdague | clarkb: so I could always just take the risk and jump the grenade change into the gate | 06:18 |
jog0 | clarkb: I think we should do just A+ ken'ichi's patch | 06:19 |
jog0 | clarkb: and see if it merges after sdague jumps the grenade | 06:19 |
jog0 | then we should be in good shape | 06:19 |
sdague | where is ken'ichi's patch? | 06:19 |
jog0 | clarkb: ^ | 06:19 |
clarkb | https://review.openstack.org/#/c/57193/ | 06:20 |
sdague | sorry, just got back from dinner | 06:20 |
clarkb | you might want an updated commit message | 06:20 |
jog0 | clarkb: we can just do that for him | 06:20 |
jog0 | his patch is better than mine | 06:20 |
clarkb | he is around too it looks like | 06:20 |
clarkb | not sure if on irc | 06:20 |
*** odyssey4me3 has quit IRC | 06:21 | |
sdague | yeh, I'm not sure what his nick is | 06:23 |
*** loq_mac has joined #openstack-infra | 06:23 | |
jog0 | https://launchpad.net/~oomichi | 06:24 |
jog0 | don't see him around | 06:24 |
jog0 | ok I'm out for the night | 06:26 |
jog0 | hopefully by the time I am online tomorrow the gate queue will be reloaded | 06:26 |
sdague | so if someone wants to un WIP his patch, I can gate hop it | 06:28 |
clarkb | sdague: k | 06:29 |
clarkb | sdague: pushed | 06:31 |
sdague | clarkb: +A | 06:31 |
*** rpodolyaka1 has quit IRC | 06:33 | |
sdague | ok, that's all the damage I can do tonight. I'll take a look in the morning and see what's merged. | 06:34 |
clarkb | sdague: now we just hope that no other things prevent it from getting in :) | 06:34 |
clarkb | I will try to get the changes in in the morning if they haven't made it in by then | 06:34 |
sdague | yeh, honestly, I'm tempted to snipe out all those swift and ceilometer changes | 06:34 |
clarkb | sdague: the only way to do that other than restarting zuul again is pushing new patchsets | 06:35 |
clarkb | which is far less than ideal | 06:35 |
sdague | yep | 06:35 |
sdague | I've done it before | 06:35 |
clarkb | :) | 06:35 |
sdague | but I'll give it overnight | 06:35 |
*** loquacities has joined #openstack-infra | 06:36 | |
*** loq_mac has quit IRC | 06:36 | |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/reviewstats: Fix disagreement percentage calculation https://review.openstack.org/57606 | 06:37 |
*** loquacities has quit IRC | 06:40 | |
*** loq_mac has joined #openstack-infra | 06:40 | |
*** pcrews has joined #openstack-infra | 06:42 | |
*** michchap has quit IRC | 06:43 | |
*** michchap has joined #openstack-infra | 06:44 | |
*** rpodolyaka1 has joined #openstack-infra | 06:44 | |
*** nosnos has quit IRC | 06:48 | |
*** michchap has quit IRC | 06:48 | |
*** nosnos has joined #openstack-infra | 06:48 | |
*** chandankumar_ has joined #openstack-infra | 06:50 | |
*** chandankumar has quit IRC | 06:51 | |
*** amotoki has joined #openstack-infra | 06:53 | |
*** dstanek has quit IRC | 06:54 | |
*** loq_mac has quit IRC | 06:56 | |
*** masayukig has quit IRC | 06:59 | |
*** SergeyLukjanov has quit IRC | 06:59 | |
*** yolanda has joined #openstack-infra | 07:08 | |
*** jcoufal has joined #openstack-infra | 07:14 | |
*** rpodolyaka1 has quit IRC | 07:17 | |
jgriffith | mikal: jog0 http://paste.openstack.org/show/53726/ | 07:18 |
jgriffith | seems to work ok | 07:19 |
jgriffith | I'm heading offline for a bit | 07:19 |
jgriffith | may make it back on depending on how my other project goes | 07:19 |
jgriffith | rather than leave it running on my machine I figure I'd give it to somebody else in case there's a bug :) | 07:19 |
ogelbukh | is there a blogpost or some article anywhere about how gerrit is scaled in OpenStack infra? | 07:22 |
clarkb | jgriffith: we went ahead and approved one of those changes | 07:26 |
clarkb | ogelbukh: there are the docs at http://ci.openstack.org | 07:27 |
clarkb | gerrit itself isn't hard to scale. one semi beefy machine and a few parameters tuned is all we do | 07:27 |
*** denis_makogon_ has joined #openstack-infra | 07:28 | |
ogelbukh | clarkb: thanks, aligns to my understanding | 07:28 |
*** wenlock has quit IRC | 07:32 | |
mikal | jgriffith: I'm going to keep rechecking jog0's change, so go out with vigour | 07:34 |
*** nati_ueno has quit IRC | 07:40 | |
*** nati_ueno has joined #openstack-infra | 07:43 | |
*** nicedice has quit IRC | 07:43 | |
*** dstanek has joined #openstack-infra | 07:46 | |
*** mihgen has quit IRC | 07:49 | |
*** dstanek has quit IRC | 07:51 | |
*** hdd has joined #openstack-infra | 07:52 | |
*** salv-orlando has joined #openstack-infra | 07:55 | |
clarkb | mikal no need | 07:56 |
clarkb | we made the decision to approve one. though more rechecks give more confidence | 07:57 |
*** fifieldt has quit IRC | 08:12 | |
*** Mithrandir has quit IRC | 08:21 | |
*** SergeyLukjanov has joined #openstack-infra | 08:21 | |
*** flaper87|afk is now known as flaper87 | 08:21 | |
*** nosnos_ has joined #openstack-infra | 08:30 | |
*** mihgen has joined #openstack-infra | 08:30 | |
*** nosnos has quit IRC | 08:33 | |
*** boris-42 has quit IRC | 08:41 | |
*** mkerrin has joined #openstack-infra | 08:42 | |
*** Bada has joined #openstack-infra | 08:43 | |
*** ruhe has joined #openstack-infra | 08:47 | |
*** osanchez has joined #openstack-infra | 08:49 | |
*** yassine has joined #openstack-infra | 08:52 | |
*** derekh has joined #openstack-infra | 08:58 | |
*** Ng has joined #openstack-infra | 09:06 | |
*** plomakin has joined #openstack-infra | 09:06 | |
*** Ng has quit IRC | 09:06 | |
*** Ng has joined #openstack-infra | 09:07 | |
*** masayukig has joined #openstack-infra | 09:14 | |
*** yamahata_ has joined #openstack-infra | 09:14 | |
*** jpich has joined #openstack-infra | 09:18 | |
*** ruhe has quit IRC | 09:22 | |
*** dizquierdo has joined #openstack-infra | 09:23 | |
*** ljjjustin has joined #openstack-infra | 09:24 | |
*** pblaho has joined #openstack-infra | 09:26 | |
*** thomasbiege has joined #openstack-infra | 09:28 | |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Setup devstack-gate tests for Savanna https://review.openstack.org/57317 | 09:28 |
*** pblaho has quit IRC | 09:38 | |
*** pblaho has joined #openstack-infra | 09:38 | |
*** yamahata_ has quit IRC | 09:40 | |
*** fbo_away is now known as fbo | 09:42 | |
*** johnthetubaguy has joined #openstack-infra | 09:44 | |
*** thomasbiege has quit IRC | 09:44 | |
*** boris-42 has joined #openstack-infra | 09:48 | |
*** boris-42 has quit IRC | 09:52 | |
matel | Hi all, where can I get information about when will the infra team use TripleO to test hypervisors? | 09:53 |
mikal | clarkb: oh, cool | 09:54 |
*** plomakin has quit IRC | 09:55 | |
*** mkerrin has quit IRC | 09:55 | |
*** nati_ueno has quit IRC | 09:55 | |
*** plomakin has joined #openstack-infra | 09:55 | |
*** ruhe has joined #openstack-infra | 09:58 | |
ttx | mordred/jeblair: you should chime in on markwash's thread on client major releases | 09:58 |
*** SergeyLukjanov is now known as _SergeyLukjanov | 09:59 | |
*** Bada has quit IRC | 10:07 | |
*** xchu has quit IRC | 10:08 | |
*** jcoufal has quit IRC | 10:09 | |
*** odyssey4me3 has joined #openstack-infra | 10:10 | |
*** afazekas has joined #openstack-infra | 10:12 | |
*** masayukig has quit IRC | 10:17 | |
*** bingbu has quit IRC | 10:19 | |
*** yamahata_ has joined #openstack-infra | 10:21 | |
*** odyssey4me3 has quit IRC | 10:21 | |
BobBall | fungi: What's your usecase? Why do you want more than 14 cinder volumes? Depending on exactly what you want, we might be able to get more. | 10:25 |
BobBall | fungi: I guess I really am asking how important is it :) | 10:25 |
*** Mithrandir has joined #openstack-infra | 10:29 | |
*** ljjjustin has quit IRC | 10:29 | |
openstackgerrit | Julien Danjou proposed a change to openstack/requirements: Bump to using SQLAlchemy migrate 0.8.2. https://review.openstack.org/56662 | 10:30 |
*** arata has joined #openstack-infra | 10:31 | |
*** arata has left #openstack-infra | 10:31 | |
*** odyssey4me3 has joined #openstack-infra | 10:31 | |
*** ruhe has quit IRC | 10:31 | |
*** guohliu has quit IRC | 10:32 | |
*** yamahata_ has quit IRC | 10:33 | |
*** nati_ueno has joined #openstack-infra | 10:37 | |
*** ruhe has joined #openstack-infra | 10:41 | |
*** mihgen has quit IRC | 10:52 | |
openstackgerrit | Roman Prykhodchenko proposed a change to openstack-infra/config: Adds devstack-gate tests for Ironic https://review.openstack.org/53917 | 10:54 |
*** mihgen has joined #openstack-infra | 10:56 | |
*** DinaBelova has joined #openstack-infra | 10:56 | |
*** SergeyLukjanov has joined #openstack-infra | 10:59 | |
*** jhesketh__ has quit IRC | 11:02 | |
*** adalbas has joined #openstack-infra | 11:11 | |
*** odyssey4me3 has quit IRC | 11:11 | |
*** ArxCruz has joined #openstack-infra | 11:12 | |
*** jhesketh__ has joined #openstack-infra | 11:14 | |
openstackgerrit | Yuuichi Fujioka proposed a change to openstack-dev/hacking: Add metaclass for Python3 compatibility https://review.openstack.org/56890 | 11:18 |
*** nosnos_ has quit IRC | 11:20 | |
*** nosnos has joined #openstack-infra | 11:21 | |
*** nicedice has joined #openstack-infra | 11:22 | |
*** nosnos has quit IRC | 11:25 | |
*** syerrapragada has quit IRC | 11:32 | |
*** syerrapragada has joined #openstack-infra | 11:33 | |
*** odyssey4me3 has joined #openstack-infra | 11:33 | |
*** mihgen has quit IRC | 11:37 | |
*** loq_mac has joined #openstack-infra | 11:40 | |
*** mihgen has joined #openstack-infra | 11:42 | |
*** boris-42 has joined #openstack-infra | 11:45 | |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Setup devstack-gate tests for Savanna https://review.openstack.org/57317 | 11:45 |
*** ericw has joined #openstack-infra | 11:47 | |
*** loq_mac has quit IRC | 11:48 | |
*** odyssey4me3 has quit IRC | 11:48 | |
*** rfolco has joined #openstack-infra | 11:49 | |
*** ericw has quit IRC | 11:49 | |
*** yamahata_ has joined #openstack-infra | 11:50 | |
openstackgerrit | Yuuichi Fujioka proposed a change to openstack-dev/hacking: Add metaclass for Python3 compatibility https://review.openstack.org/56890 | 11:52 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add properties testing. https://review.openstack.org/57654 | 11:55 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add batch_tasks support. https://review.openstack.org/57469 | 11:55 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add seealso to batch_tasks from promoted_build. https://review.openstack.org/57473 | 11:55 |
*** yaguang has quit IRC | 11:58 | |
*** weshay has joined #openstack-infra | 11:58 | |
*** Guest97079 is now known as cyril__ | 12:05 | |
*** salv-orlando has quit IRC | 12:08 | |
*** rfolco has quit IRC | 12:08 | |
*** ericw has joined #openstack-infra | 12:13 | |
*** rfolco has joined #openstack-infra | 12:14 | |
*** ruhe has quit IRC | 12:15 | |
*** salv-orlando has joined #openstack-infra | 12:16 | |
*** syerrapragada has quit IRC | 12:17 | |
*** syerrapragada has joined #openstack-infra | 12:17 | |
*** nati_ueno has quit IRC | 12:18 | |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add batch_tasks support. https://review.openstack.org/57469 | 12:18 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add seealso to batch_tasks from promoted_build. https://review.openstack.org/57473 | 12:19 |
*** ruhe has joined #openstack-infra | 12:26 | |
*** michchap has joined #openstack-infra | 12:28 | |
*** amotoki has quit IRC | 12:40 | |
*** pcm_ has joined #openstack-infra | 12:47 | |
*** mihgen has quit IRC | 12:54 | |
*** pblaho has quit IRC | 12:54 | |
*** pblaho has joined #openstack-infra | 12:57 | |
*** sergmelikyan has joined #openstack-infra | 13:07 | |
sergmelikyan | Is a zuul down? | 13:07 |
*** alcabrera has joined #openstack-infra | 13:08 | |
ogelbukh | no, it's not | 13:09 |
ogelbukh | though status page is somewhat slow to render | 13:10 |
*** ruhe has quit IRC | 13:12 | |
anteaya | yes my status page was slow to render as well | 13:14 |
anteaya | if that is the only problem today, we are going to have a great day | 13:14 |
ogelbukh | ) | 13:15 |
*** ruhe has joined #openstack-infra | 13:17 | |
*** mriedem has joined #openstack-infra | 13:18 | |
*** dstanek has joined #openstack-infra | 13:18 | |
*** w_ has joined #openstack-infra | 13:23 | |
*** olaph has quit IRC | 13:25 | |
*** thomasem has joined #openstack-infra | 13:27 | |
anteaya | openstack.org is slow | 13:27 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Add merge-release-tags job to Savanna https://review.openstack.org/57667 | 13:31 |
*** mihgen has joined #openstack-infra | 13:35 | |
*** CaptTofu has quit IRC | 13:39 | |
*** CaptTofu has joined #openstack-infra | 13:39 | |
*** hashar has joined #openstack-infra | 13:40 | |
*** sandywalsh has quit IRC | 13:40 | |
*** chandankumar_ has quit IRC | 13:40 | |
*** dizquierdo has quit IRC | 13:42 | |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Enable #openstack-climate IRC channel logging https://review.openstack.org/57675 | 13:45 |
*** jergerber has joined #openstack-infra | 13:45 | |
*** zul has quit IRC | 13:50 | |
*** zul has joined #openstack-infra | 13:51 | |
*** jergerber has quit IRC | 13:52 | |
*** dprince has joined #openstack-infra | 13:53 | |
*** sandywalsh has joined #openstack-infra | 13:53 | |
*** ilyashakhat has quit IRC | 13:55 | |
*** jergerber has joined #openstack-infra | 13:55 | |
*** yamahata_ has quit IRC | 13:57 | |
mordred | ttx: done | 13:57 |
*** changbl has quit IRC | 13:57 | |
*** yamahata_ has joined #openstack-infra | 13:58 | |
*** bpokorny has quit IRC | 13:59 | |
*** dkranz has quit IRC | 14:03 | |
*** w_ is now known as olaph | 14:05 | |
sdague | ok, so I'm thinking of sniping the stuff out ahead of that grenade change | 14:06 |
*** yamahata_ has quit IRC | 14:07 | |
*** yamahata_ has joined #openstack-infra | 14:08 | |
*** DinaBelova has quit IRC | 14:09 | |
*** mfer has joined #openstack-infra | 14:10 | |
*** herndon_ has joined #openstack-infra | 14:10 | |
*** rongze has joined #openstack-infra | 14:11 | |
*** arosen has joined #openstack-infra | 14:14 | |
*** dolphm has joined #openstack-infra | 14:14 | |
sdague | man, zuul is taking a long time to react to these | 14:15 |
arosen | Hi, I was wondering if others thought it might be useful to have the CI also upload a copy of a mysqldump of the databases at the end of a run. I'm trying to track down one of the gate failures and it seems like that would be somewhat useful/helpful. | 14:15 |
*** yaguang has joined #openstack-infra | 14:16 | |
*** dkliban_ has quit IRC | 14:16 | |
sdague | arosen: I think that would be fine | 14:17 |
sdague | it would be added to devstack-gate | 14:17 |
*** DinaBelova has joined #openstack-infra | 14:20 | |
ttx | mordred: we are making rootwrap standalone and I now have a repo that should make a good base for openstack/oslo.rootwrap... what's the best first stage for this ? Push to github ? | 14:21 |
arosen | this might also be a dumb question but i've been using chrome to view the log files and search there but it's really slow because the log files are so large so i've been downloading them locally and using vi to search them in the gz format but vim doesn't auto load all of the file in the gz format. | 14:21 |
arosen | When I go to extract those files I get an error | 14:22 |
ttx | I'd definitely like my filter-tree and addition of packaging support files to get reviewed by people before we push it anywhere | 14:22 |
arosen | gunzip < screen-n-cpu.txt.gz | tar xvf - | 14:22 |
arosen | gzip: stdin: not in gzip format tar: This does not look like a tar archive tar: Exiting with failure status due to previous errors | 14:22 |
*** markmc has joined #openstack-infra | 14:22 | |
*** julim has joined #openstack-infra | 14:22 | |
openstackgerrit | Dan Prince proposed a change to openstack-infra/config: Drop the saz-gearman module (we don't use it) https://review.openstack.org/57527 | 14:22 |
*** julim has quit IRC | 14:23 | |
arosen | anyone have any work flow tips there? | 14:23 |
fungi | so much scrollback buffer | 14:24 |
*** rongze_ has joined #openstack-infra | 14:24 | |
mordred | ttx: yeah. github is the best bet | 14:24 |
*** wenlock has joined #openstack-infra | 14:26 | |
*** rongze has quit IRC | 14:27 | |
*** lchen has joined #openstack-infra | 14:28 | |
*** rongze has joined #openstack-infra | 14:30 | |
*** yamahata_ has quit IRC | 14:30 | |
*** yamahata_ has joined #openstack-infra | 14:31 | |
*** dolphm has quit IRC | 14:33 | |
*** rongze_ has quit IRC | 14:33 | |
*** xeyed4good has joined #openstack-infra | 14:34 | |
mattymo | fungi, I just wanted to thank you for the help back during Summit for our silly branching issues | 14:34 |
mattymo | it really made a big difference for us | 14:34 |
*** xeyed4good has left #openstack-infra | 14:35 | |
fungi | mattymo: you're welcome--always glad to help | 14:35 |
*** pblaho has quit IRC | 14:36 | |
anteaya | arosen: I haven't run into that problem myself since I just use the browser and ctrl-f for searching | 14:39 |
anteaya | any vim users downloading the gz log format able to give an earnest -neutron bug fixer a hand? | 14:40 |
sdague | man zuul takes some time to reschedule | 14:40 |
*** changbl has joined #openstack-infra | 14:40 | |
*** julim has joined #openstack-infra | 14:40 | |
fungi | BobBall: well, we started out adding 0.5tb cinder devices to a lvm2 vg on a vm, but eventually needed more than we could attach to it. long-term we're looking at finding a way to front-end that data in and out of swift, but near-term options are to migrate the pvs individually from 0.5tb to 1tb to increase available space in the vg or hope there's a way to add more cinder devices to the vm | 14:42 |
*** marun has joined #openstack-infra | 14:43 | |
fungi | arosen: i believe sdague added the server-side filtering cgi on logs.o.o specifically to help make that easier | 14:43 |
fungi | arosen: though to get it to not convert those files to uncompressed text on the fly you need to pass a get variable i think... checking | 14:45 |
*** dcramer_ has joined #openstack-infra | 14:46 | |
sdague | yeh, if you wget it it will come down as real text | 14:46 |
*** ryanpetrello has joined #openstack-infra | 14:46 | |
sdague | it does content negotiation | 14:46 |
*** sergmelikyan has quit IRC | 14:46 | |
arosen | Hrm I've been doing wget of : http://logs.openstack.org/22/55722/8/check/check-tempest-devstack-vm-neutron/eb06ca6/logs/screen-n-cpu.txt.gz maybe i should drop the gz from the end. | 14:47 |
fungi | arosen: add ?content-type=text/plain | 14:47 |
sdague | arosen: it | 14:47 |
anteaya | arosen: running gunzip screen-n-cpu.txt.gz works for me | 14:47 |
anteaya | but the file is html though doesn't render as html | 14:47 |
anteaya | when I open it with a browser it is just a text file with html tags | 14:48 |
sdague | arosen: yeh, pipe it through gzip -d locally | 14:48 |
sdague | it's not a tar, it's just a gzip | 14:48 |
sdague | otherwise it's *really* slow over the network | 14:48 |
sdague | we get about a 15x compression on the logs | 14:49 |
*** yamahata_ has quit IRC | 14:50 | |
fungi | my note on ?content-type=text/plain seems to have been wrong. not helping after all | 14:50 |
*** CaptTofu has quit IRC | 14:50 | |
sdague | sounds like I need to sort some docs for the header | 14:51 |
*** yamahata_ has joined #openstack-infra | 14:52 | |
*** dkranz has joined #openstack-infra | 14:52 | |
fungi | oh, right, there's a separate flag to wget to force specific content negotiation | 14:52 |
sdague | fungi: that param should work | 14:52 |
arosen | I see, if i drop the .gz from the file it works much faster in vim. I guess it's already a txt file the way i'm downloading it, despite what the extension says. nvm.. Thanks | 14:52 |
fungi | sdague: wget -O screen-n-cpu.txt.gz 'http://logs.openstack.org/22/55722/8/check/check-tempest-devstack-vm-neutron/eb06ca6/logs/screen-n-cpu.txt.gz?content-type=text/plain' gets me something which file reports as ASCII text, with very long lines | 14:53 |
sdague | arosen: it actually works for an odd reason of the way the configs work, not why you think. But if it's working for you for now, all good :) | 14:53 |
sdague | oh, yeh, wget will be text/plain anyway | 14:53 |
sdague | you need to 0 out the Accept-Encoding header | 14:53 |
sdague | to drop gz | 14:54 |
sdague | gzip streams | 14:54 |
fungi | that's what i was forgetting. thx | 14:54 |
sdague | so the top 2 patches in the gate should make some of the things better | 14:55 |
*** dkliban_ has joined #openstack-infra | 14:56 | |
anteaya | sdague: \o/ | 14:56 |
sdague | but the first one might fail without the second, so it's a race | 14:56 |
sdague | if the first one does reset, I think we have to save off the zuul queues and just work on getting those through | 14:57 |
anteaya | 57584 and 57572? | 14:57 |
sdague | yeh | 14:57 |
* anteaya cheers for them | 14:57 | |
*** wenlock has quit IRC | 14:57 | |
anteaya | I wonder if we should set up a pipeline for gate fixing patches | 14:58 |
anteaya | so if we have to do this again, we just invoke that pipeline with infra or qa cores identifying which patches go in | 14:58 |
*** markmcclain has quit IRC | 14:58 | |
sdague | yeh, some sort of escape valve has been talked about in the past | 14:59 |
anteaya | and devs can push away but their patches either bounce or get stored in a list | 14:59 |
*** ftcjeff has joined #openstack-infra | 15:00 | |
lchen | Hi, could anyone tell me how the docs gate is done? Any help and hints would be much appreciated | 15:02 |
lchen | I mean where the code resides | 15:04 |
anteaya | lchen: is this what you are looking for? http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/zuul/layout.yaml#n210 | 15:05 |
*** rcleere has joined #openstack-infra | 15:06 | |
lchen | anteaya: Thank you! I think so... | 15:07 |
anteaya | okay great | 15:07 |
*** ftcjeff has quit IRC | 15:08 | |
*** ruhe has quit IRC | 15:09 | |
*** ^d has joined #openstack-infra | 15:11 | |
*** pblaho has joined #openstack-infra | 15:14 | |
sdague | https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/17163/console why the grenade changes are stalling forever is something we need to sort | 15:16 |
*** ruhe has joined #openstack-infra | 15:16 | |
*** DinaBelova has quit IRC | 15:17 | |
*** nati_ueno has joined #openstack-infra | 15:20 | |
*** yamahata_ has quit IRC | 15:21 | |
*** blamar has quit IRC | 15:22 | |
anteaya | 57584 got in \o/ | 15:23 |
*** markmcclain has joined #openstack-infra | 15:25 | |
*** dolphm has joined #openstack-infra | 15:27 | |
fungi | sdague: perhaps if we dumped some date commands into the wrap script around various expensive functions we'd get a little granularity there? | 15:28 |
sdague | yeh, there is a pattern for it | 15:28 |
sdague | just needs to be done | 15:28 |
lchen | anteaya: one more question ;) Could you point me in the direction where I can find how gate-heat-docs is defined? | 15:31 |
*** changbl has quit IRC | 15:33 | |
*** dizquierdo has joined #openstack-infra | 15:33 | |
*** DinaBelova has joined #openstack-infra | 15:35 | |
*** CaptTofu has joined #openstack-infra | 15:35 | |
anteaya | fungi: would gate-heat-docs be in jenkins-job-builder? | 15:36 |
*** datsun180b has joined #openstack-infra | 15:36 | |
*** rongze_ has joined #openstack-infra | 15:37 | |
fungi | anteaya: yeah, it's probably generated from a job template, maybe part of the python-jobs group | 15:37 |
*** wenlock has joined #openstack-infra | 15:37 | |
fungi | i can dig it up if you wind up not being able to find it | 15:37 |
*** rongze has quit IRC | 15:37 | |
anteaya | I'll look | 15:38 |
*** afazekas has quit IRC | 15:39 | |
*** jhesketh__ has quit IRC | 15:40 | |
*** herndon_ has quit IRC | 15:41 | |
lchen | thank you for all the help | 15:41 |
*** __amotoki__ is now known as amotoki | 15:42 | |
*** yassine has quit IRC | 15:43 | |
*** yassine has joined #openstack-infra | 15:44 | |
anteaya | fungi: I can't find anything for a gate-heat-docs definition in config or jenkins-job-builder | 15:45 |
*** hashar has quit IRC | 15:45 | |
*** rpodolyaka has joined #openstack-infra | 15:46 | |
fungi | anteaya: right, it'll probably be a job template so it won't have "heat" in the template name. i'll dig up the details for you | 15:46 |
anteaya | ah | 15:46 |
anteaya | thanks | 15:46 |
*** boris-42 has quit IRC | 15:48 | |
fungi | anteaya: here you can see heat instantiates all jobs and job templates which are members of the python-jobs group https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/projects.yaml#n410 | 15:48 |
*** markmcclain has quit IRC | 15:48 | |
fungi | anteaya: the python-jobs groups includes as a member the 'gate-{name}-docs' job template https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/python-jobs.yaml#n220 | 15:49 |
anteaya | ah ha, I should have been looking in jenkins_job_builder _inside_ of config | 15:49 |
anteaya | silly me | 15:49 |
* anteaya goes to look | 15:49 | |
fungi | anteaya: and you can see that template's details at https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/python-jobs.yaml#n141 | 15:49 |
*** hashar has joined #openstack-infra | 15:50 | |
fungi | in particular, it uses the "docs" builder and passes in the github-org and project parameters to that https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/macros.yaml#n11 | 15:51 |
fungi | anteaya: and the script you see being invoked in that builder can be seen at https://git.openstack.org/cgit/openstack-infra/config/tree/modules/jenkins/files/slave_scripts/run-docs.sh | 15:52 |
fungi | in particular, line #28, which is the meat of that script, runs 'tox -e$venv -- python setup.py build_sphinx' within a checkout of that project's git repo | 15:53 |
*** ruhe has quit IRC | 15:53 | |
fungi | where $venv is 'venv' (set up on line #24) | 15:53 |
*** markmcclain has joined #openstack-infra | 15:54 | |
fungi | lchen: hopefully all the above ^ is of help to you as well | 15:55 |
anteaya | fungi you are the best!! I really appreciate the tour, I have been wondering how this worked for some time | 15:55 |
*** SergeyLukjanov has quit IRC | 15:55 | |
fungi | it seems a bit like spaghetti until you bang your head against it for a while | 15:55 |
anteaya | been banging my head, the tour relieves all the pain | 15:56 |
anteaya | thank you | 15:56 |
lchen | fungi: Sure. That's really helpful. Thanks for the information. Though I may still need some time to understand them, I am new to infra ;) | 15:56 |
*** hashar has quit IRC | 15:56 | |
*** kgriffs_afk is now known as kgriffs | 15:57 | |
fungi | lchen: well, welcome! feel free to ask any other questions you have, and don't take offense if we're busy or missing and don't get you an answer right away (just keep asking if that happens) | 15:57 |
*** afazekas has joined #openstack-infra | 16:00 | |
lchen | fungi: yup. Thanks a lot! | 16:00 |
*** DinaBelova has quit IRC | 16:01 | |
anteaya | yay 57572 merged | 16:01 |
*** atiwari has joined #openstack-infra | 16:03 | |
*** datsun180b_ has joined #openstack-infra | 16:04 | |
*** datsun180b_ has quit IRC | 16:04 | |
*** datsun180b has quit IRC | 16:04 | |
*** datsun180b has joined #openstack-infra | 16:04 | |
*** afazekas is now known as _afazekas | 16:05 | |
*** afazekas has joined #openstack-infra | 16:09 | |
*** kgriffs is now known as kgriffs_afk | 16:10 | |
fungi | i've rechecked 57589 now | 16:10 |
*** blamar has joined #openstack-infra | 16:10 | |
fungi | we'll see whether it fares any better | 16:10 |
* anteaya crosses her fingers | 16:12 | |
*** CaptTofu has quit IRC | 16:17 | |
*** CaptTofu has joined #openstack-infra | 16:17 | |
*** rongze_ has quit IRC | 16:17 | |
*** CaptTofu has quit IRC | 16:18 | |
*** CaptTofu has joined #openstack-infra | 16:19 | |
*** mrodden has joined #openstack-infra | 16:20 | |
*** rongze has joined #openstack-infra | 16:23 | |
*** CaptTofu has quit IRC | 16:23 | |
*** bpokorny has joined #openstack-infra | 16:25 | |
*** mihgen has quit IRC | 16:26 | |
*** CaptTofu has joined #openstack-infra | 16:28 | |
*** zaro0508 has quit IRC | 16:28 | |
*** dkranz has quit IRC | 16:29 | |
*** kgriffs_afk is now known as kgriffs | 16:29 | |
*** markmcclain has quit IRC | 16:29 | |
*** MarkAtwood has joined #openstack-infra | 16:30 | |
*** ruhe has joined #openstack-infra | 16:31 | |
*** ruhe has quit IRC | 16:31 | |
*** amotoki is now known as amotoki_zzz | 16:33 | |
*** branen_ has joined #openstack-infra | 16:34 | |
*** branen_ has quit IRC | 16:35 | |
clarkb | what is the situation on 1251920? I see one fix didn't merge and that fix got -1'd because another fix was merged? | 16:39 |
fungi | clarkb: i think we're still trying to keep the situation up to date in https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 | 16:40 |
*** kgriffs is now known as kgriffs_afk | 16:40 | |
*** pblaho has quit IRC | 16:41 | |
clarkb | 57193 is the fix I am talking about (sorry on phone and not super useful) | 16:43 |
clarkb | sdague ^ | 16:46 |
fungi | clarkb: that one is #4 from the head of the gate now | 16:46 |
fungi | so either it needs a -2cdrv/0aprv/new patchset or it's likely to land shortly | 16:47 |
fungi | d'oh, one ahead of it just reset, so we've got a while still | 16:47 |
*** yassine has quit IRC | 16:47 | |
clarkb | :/ ok once that is in 1251920 should stop failing all the things | 16:48 |
*** yassine has joined #openstack-infra | 16:48 | |
zaro | is gate open for approval yet? | 16:53 |
*** yassine has quit IRC | 16:54 | |
notmyname | where is the elastic recheck page that shows what's been detected? | 16:54 |
clarkb | not quite. couple more things need to get in | 16:54 |
clarkb | notmyname: http://status.openstack.org/elastic-recheck | 16:55 |
notmyname | thanks | 16:55 |
*** svarnau has joined #openstack-infra | 16:58 | |
*** mriedem1 has joined #openstack-infra | 16:58 | |
openstackgerrit | Mathieu Gagné proposed a change to openstack-infra/jenkins-job-builder: Ensure jobparams and group_jobparams are dict https://review.openstack.org/57525 | 16:58 |
*** mrodden1 has joined #openstack-infra | 16:59 | |
*** mriedem has quit IRC | 16:59 | |
*** dkranz has joined #openstack-infra | 17:00 | |
*** changbl has joined #openstack-infra | 17:00 | |
*** mriedem1 has quit IRC | 17:01 | |
*** mriedem has joined #openstack-infra | 17:01 | |
*** mrodden has quit IRC | 17:01 | |
*** mrodden1 has quit IRC | 17:01 | |
*** mrodden has joined #openstack-infra | 17:03 | |
jog0 | clarkb: looks like we are getting close to the end | 17:07 |
jog0 | with the final patch(s) in the queue | 17:07 |
jog0 | do you want to start writing up a report of what happened in etherpad? | 17:07 |
jog0 | I will start doing that in a bit myself (first relocating to Berkeley for the day) | 17:08 |
*** markwash has quit IRC | 17:11 | |
*** reed has joined #openstack-infra | 17:13 | |
jog0 | clarkb: I think we need a email saying how we got here (bug by bug) and the fixes | 17:13 |
*** reed has quit IRC | 17:13 | |
*** reed has joined #openstack-infra | 17:13 | |
openstackgerrit | Mathieu Gagné proposed a change to openstack-infra/jenkins-job-builder: Ensure jobparams and group_jobparams are dict https://review.openstack.org/57525 | 17:14 |
*** boris-42 has joined #openstack-infra | 17:16 | |
*** jpich has quit IRC | 17:18 | |
*** osanchez has quit IRC | 17:18 | |
*** kgriffs_afk is now known as kgriffs | 17:21 | |
*** SergeyLukjanov has joined #openstack-infra | 17:22 | |
clarkb | jog0 ya I can start that | 17:22 |
*** markmcclain has joined #openstack-infra | 17:24 | |
sdague | jog0: I sniped out stuff that was in my way this morning, as I was annoyed the grenade patch was still chugging | 17:25 |
*** ekarlso has quit IRC | 17:30 | |
jog0 | clarkb: cool maybe put it in the same etherpad | 17:30 |
jog0 | I will work on it in a bit | 17:31 |
*** fbo is now known as fbo_away | 17:31 | |
mordred | jog0, clarkb: neat. we're in good shape now? | 17:31 |
clarkb | mordred: almost 57193 needs to merge as well as the grenade cleanup changes | 17:32 |
mordred | woot | 17:32 |
mordred | you guys are awesome | 17:32 |
dprince | sdague: am I still waiting on someone to push a branch for this? https://review.openstack.org/#/c/57066/ who did you say was working on it? | 17:32 |
dprince | sdague: the grenade baseline bump to use Havana instead of Griz | 17:32 |
sdague | maurosr | 17:32 |
*** mriedem1 has joined #openstack-infra | 17:35 | |
*** mrodden1 has joined #openstack-infra | 17:36 | |
*** dolphm is now known as dolphm_afk | 17:37 | |
*** ekarlso has joined #openstack-infra | 17:37 | |
*** mriedem has quit IRC | 17:37 | |
*** Ryan_Lane has joined #openstack-infra | 17:38 | |
*** ruhe has joined #openstack-infra | 17:39 | |
*** mrodden has quit IRC | 17:39 | |
pabelanger | So, where can one go to see the stats for nodepool? I know status.openstack.org/zuul/ has some graphs but is there anyplace else? | 17:40 |
*** mfer has quit IRC | 17:41 | |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add batch_tasks support. https://review.openstack.org/57469 | 17:41 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add seealso to batch_tasks from promoted_build. https://review.openstack.org/57473 | 17:41 |
openstackgerrit | Jaroslav Henner proposed a change to openstack-infra/jenkins-job-builder: Add properties testing. https://review.openstack.org/57654 | 17:41 |
jeblair | pabelanger: there are some more metrics at http://graphite.openstack.org/ under stats.nodepool | 17:42 |
*** mihgen has joined #openstack-infra | 17:42 | |
pabelanger | jeblair, great, thank you | 17:44 |
*** reed has quit IRC | 17:45 | |
yolanda | hi jeblair, did you have any feedback for the licensecheck bug? | 17:45 |
*** markwash has joined #openstack-infra | 17:47 | |
jeblair | mordred: ^ please see question from yolanda | 17:49 |
jeblair | mordred: you filed the bug, i'd like you to decide if it's redundant | 17:50 |
jeblair | mordred: bug 950407 | 17:50 |
uvirtbot | Launchpad bug 950407 in openstack-ci "jenkins should run licensecheck on all projects" [Low,Triaged] https://launchpad.net/bugs/950407 | 17:50 |
*** CaptTofu has quit IRC | 17:52 | |
*** julim has quit IRC | 17:53 | |
*** rpodolyaka1 has joined #openstack-infra | 17:55 | |
clarkb | jog0: anteaya https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 has a first draft of the situation tl;dr. There are a couple spots that I think you guys can help fill in as my familiarity with those particular problems isn't very good | 17:57 |
clarkb | also feel free to edit whatever, I am not too happy with the current state of the tl;dr | 17:57 |
*** herndon has joined #openstack-infra | 17:58 | |
mordred | jeblair, yolanda hrm. I think I agree with jog0 - I think it's an old bug and hacking is handling this now | 17:59 |
mordred | yolanda: sorry - I should have closed that | 17:59 |
yolanda | np | 18:00 |
*** mrmartin has joined #openstack-infra | 18:00 | |
*** sarob has joined #openstack-infra | 18:00 | |
jeblair | clarkb: i'm curious about the bugs that caused things, and how they got in.... | 18:01 |
clarkb | jeblair: yeah going to start filling that data in | 18:02 |
jeblair | clarkb: since there's one change in your etherpad i can track back... | 18:02 |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: Provide default ConfigParser object https://review.openstack.org/48790 | 18:02 |
*** derekh has quit IRC | 18:02 | |
jeblair | https://review.openstack.org/#/c/54363/ looks like the start of 1251920 | 18:02 |
jeblair | and indeed... there are _3_ 'reverify no bug' comments | 18:03 |
jeblair | sdague: ^ i think examining the comment log of that change will be instructive | 18:03 |
jeblair | jog0: ^ | 18:03 |
mordred | ok - just so everyone knows - I just wrote a script to -1 all of the PEP-427 related patches in the system | 18:03 |
yolanda | mordred, jeblair, so is there some low-hanging-fruit or easy bug where i could collaborate? | 18:04 |
mordred | if anyone asks, we do not need to do this: https://review.openstack.org/#/c/57127/ in any way to support wheels | 18:04 |
fungi | yeah, the reverifies on that were pointed out last night. amusing possible confirmation of our suspicions on psychology of nondeterminism in gating | 18:04 |
jeblair | mordred: thank you | 18:04 |
rpodolyaka1 | mordred: in tripleo we already merged those patches... and kind of released projects since then... should I go and revert those patches and do point releases? | 18:05 |
*** arosen has quit IRC | 18:05 | |
*** boris-42 has quit IRC | 18:05 | |
*** melwitt has joined #openstack-infra | 18:05 | |
rpodolyaka1 | mordred: or it's not critical and can be put to the next releases (we are doing them weekly) | 18:06 |
jeblair | yolanda: do any other bugs here look interesting? https://bugs.launchpad.net/openstack-ci/+bugs?field.tag=low-hanging-fruit | 18:06 |
*** sarob has quit IRC | 18:07 | |
clarkb | jeblair: the last paragraph in the wrapup now has more info. That bug appears to be a case of forcing it through the gate multiple times until it passes | 18:07 |
*** sarob has joined #openstack-infra | 18:07 | |
yolanda | jeblair, maybe that one https://bugs.launchpad.net/openstack-ci/+bug/1183716 | 18:08 |
uvirtbot | Launchpad bug 1183716 in openstack-ci "delete old jobs with Jenkins Job Builder" [Medium,Triaged] | 18:08 |
*** gyee has joined #openstack-infra | 18:08 | |
*** markmcclain has quit IRC | 18:09 | |
jeblair | clarkb, jog0, mordred, sdague: after mulling it over, and seeing this evidence, i think "remove 'reverify no bug'" is one of the next steps we should take. | 18:09 |
Mithrandir | yolanda: that one is fixed now. | 18:09 |
mordred | rpodolyaka1: it's not critical, but yes please revert them | 18:09 |
Mithrandir | (there's a --delete-old or something you need to pass, though) | 18:10 |
mordred | rpodolyaka1: well - let me be more specific - | 18:10 |
clarkb | jeblair: note that particular change only triggered 1251920 in one of the three failed gate attempts | 18:10 |
clarkb | jeblair: but still I agree | 18:10 |
mordred | rpodolyaka1: are the things that you merged them on actually py2/py3 compat? | 18:10 |
jeblair | i think it's a good incremental step; and maybe we keep going after that, but it may help with this kind of situation | 18:10 |
rpodolyaka1 | mordred: I don't think so :( | 18:10 |
yolanda | i also looked at that one https://bugs.launchpad.net/openstack-ci/+bug/1193444 , but needed clarification | 18:10 |
uvirtbot | Launchpad bug 1193444 in openstack-ci "jenkins-job-builder doesn't properly invalidate cache" [Low,Triaged] | 18:10 |
mordred | rpodolyaka1: ok. then yes please revert - but don't worry about re-releasing | 18:10 |
mordred | it won't be a problem because we're not cutting wheels yet | 18:10 |
rpodolyaka1 | mordred: cool. thank you for clarifying! | 18:11 |
*** senk1 has joined #openstack-infra | 18:11 | |
jeblair | mordred: Unfortunately, I am not able to sign a CLA, so I can't contribute a patch. This puts me in a lousy position to complain about implementations | 18:11 |
jeblair | mordred: https://bugs.launchpad.net/openstack-ci/+bug/1193444 | 18:12 |
uvirtbot | Launchpad bug 1193444 in openstack-ci "jenkins-job-builder doesn't properly invalidate cache" [Low,Triaged] | 18:12 |
*** sarob has quit IRC | 18:12 | |
clarkb | does JJB require a CLA >_> | 18:12 |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/config: Add gate-solum-devstack job https://review.openstack.org/57098 | 18:12 |
Shrews | fungi, olaph: Where am I drinking away my sorrows tonight? | 18:13 |
Mithrandir | clarkb: did when I wanted to contribute to it | 18:14 |
mordred | jeblair: sigh | 18:14 |
clarkb | mordred: k | 18:14 |
clarkb | er Mithrandir | 18:14 |
Mithrandir | (not a complain, a data point) | 18:15 |
Mithrandir | complaint | 18:15 |
olaph | Shrews: flying burrito / lynnwood grill | 18:15 |
jeblair | Mithrandir: please feel free to complain... | 18:15 |
anteaya | clarkb: I took a stab at it, not because I understand it - I just summarized the commit messages from the bug fix patch and two dependencies | 18:16 |
clarkb | olaph: you guys have burritos that fly in NC? | 18:16 |
clarkb | anteaya: thanks | 18:16 |
anteaya | clarkb: let me know if you think I need more eyes on it | 18:16 |
jeblair | for that matter if you or anyone you know of has problems contributing because of the CLA, please let us or stefano (reed) know; we need data on that point. | 18:16 |
anteaya | np, thanks for organizing the wrap-up, we need a summation | 18:16 |
Mithrandir | jeblair: meh, given I've signed the CLA, I don't really care. I don't see the point in CLAs in general (I believe that you give the recipient about the same rights by submitting a patch as with it), but it's not really the role of random person who rocks up to complain about the procedures of a project | 18:17 |
Mithrandir | IMO, at least. | 18:17 |
olaph | clarkb: you have to have them flown into NC. domestic production is prohibited | 18:17 |
jeblair | Mithrandir: tbh, i'm no longer sure what rights our cla gives. when i read it, it certainly didn't match up with what people say about it. | 18:18 |
*** hogepodge has joined #openstack-infra | 18:18 | |
*** boris-42 has joined #openstack-infra | 18:18 | |
*** boris-42 has quit IRC | 18:19 | |
Mithrandir | jeblair: most people aren't trained in reading legal documents | 18:19 |
Mithrandir | so you might get pretty wild interpretations | 18:19 |
fungi | Mithrandir: i think there are a lot of us who would like to see potential contributors who have legal or philosophical misgivings about agreeing to our cla speak up, and loudly | 18:21 |
jeblair | fungi: ++ we need data -- i'm not asking people to complain for no purpose -- i'm asking people to speak up (like that person did in the bug) so that we know about concerns | 18:22 |
fungi | at the moment we're enforcing a legal agreement for contribution, and one of the primary reasons it persists, i believe, is because there's a perception that it doesn't get in the way of attracting capable contributors to the project | 18:22 |
fungi | so if it really does, then that's something we absolutely need to know | 18:23 |
portante | folks, should we be holding off approvals still? | 18:23 |
clarkb | portante: let me check the gate | 18:23 |
portante | 44 looks like "a lot" | 18:23 |
*** afazekas has quit IRC | 18:23 | |
fungi | portante: yes, we've still got a couple of fixes percolating in the gate | 18:23 |
clarkb | portante: 57193,2 is second in the gate queue and will work around 1251920 | 18:23 |
clarkb | so having things behind that is probably mostly ok | 18:24 |
portante | k, thx | 18:24 |
clarkb | assuming it doesn't get bumped out due to some other failure :/ | 18:24 |
jeblair | mgagne: yolanda wants to work on https://bugs.launchpad.net/openstack-ci/+bug/1193444 can you help determine the current state of that bug? | 18:24 |
uvirtbot | Launchpad bug 1193444 in openstack-ci "jenkins-job-builder doesn't properly invalidate cache" [Low,Triaged] | 18:24 |
fungi | right, which will most likely mean we dump the gate again so we can requeue it at the head | 18:24 |
clarkb | portante: we are trying to summarize the various issues at https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 any chance you can fill in some of the blanks around the swift things? | 18:24 |
clarkb | portante: I started a paragraph for the swift things but it is very incomplete | 18:24 |
Mithrandir | fungi: well, I think it adds a barrier, and I don't think it gives you particularly much in terms of legal shielding so I think it should go, at least for more peripheral projects. | 18:24 |
Mithrandir | it's not a problem _for me_ because I work in small companies that are flexible with those kinds of things. | 18:25 |
Mithrandir | I know really capable people who work for large universities who can't sign CLAs for instance, but I have no idea if they'd be interested in openstack. :-) | 18:26 |
Mithrandir | (Stanford comes to mind) | 18:26 |
fungi | Mithrandir: i *personally* would be fine seeing it go away everywhere. some openstack member companies are a fan of it, so it's there at least until we have real evidence it's a problem | 18:26 |
fungi | Mithrandir: and yes, i'd love rra to say he wants to contribute to one of our projects but is disallowed | 18:26 |
Mithrandir | ooi, where do you know rra from? | 18:27 |
Mithrandir | (and yes, he's one of the people I was thinking of) | 18:27 |
Mithrandir | you'll always have under-reporting of it too, since people who work for such an institution will see the CLA and then just turn away. | 18:27 |
notmyname | clarkb: it would be nice to order (or at least mention) the relative weight of these issues. perhaps using the elastic recheck results to show their relative weight in causing issues. | 18:27 |
fungi | debian. he sponsors stuff into the archive because i'm too lazy^H^H^H^Hbusy to go through nm | 18:27 |
*** rnirmal has joined #openstack-infra | 18:27 | |
clarkb | notmyname: good idea, I will add that info | 18:28 |
jeblair | Mithrandir: everyone knows rra :) | 18:28 |
Mithrandir | fungi: ah, right. | 18:28 |
*** markmcclain has joined #openstack-infra | 18:29 | |
jeblair | Mithrandir: he maintained the gnu project's news server and mail-news gateway... handles afs and kerberos packaging (he made installing afs easy!)... he's super-human. :) | 18:29 |
fungi | yes, he is rather an amazing guy | 18:29 |
jeblair | and yeah, when i worked at UC Berkeley, i would have had trouble contributing. i might have been able to -- with quite a lot of work. | 18:30 |
Mithrandir | jeblair: I know, it's great. | 18:30 |
Mithrandir | (all that rra does, not that you'd have trouble contributing) | 18:30 |
*** mfer has joined #openstack-infra | 18:30 | |
*** julim has joined #openstack-infra | 18:30 | |
jeblair | it's kind of daft of us to say we don't really care for contributions from people at stanford and berkeley and, well, most r1 universities. | 18:30 |
jeblair | heh :) | 18:31 |
jeblair | yolanda: i think bug 1193444 may be heading into a somewhat-unresolvable place... i think there are different ideas of how jjb should be used... | 18:32 |
uvirtbot | Launchpad bug 1193444 in openstack-ci "jenkins-job-builder doesn't properly invalidate cache" [Low,Triaged] https://launchpad.net/bugs/1193444 | 18:32 |
jeblair | yolanda: my personal feeling is that the cache support is probably as good as it can get, and if you don't want it, then you need to turn it off, and there's a flag to do that now. | 18:33 |
jeblair | yolanda: so i think we may want to close that bug too, now that '--ignore-cache' has been added | 18:33 |
yolanda | ok | 18:34 |
jeblair | i'd want another jjb developer who's more familiar with that to weigh in (like mgagne), but i'm starting to think it's probably not a good one. | 18:34 |
jeblair | gee, sorry. :( | 18:34 |
jeblair | i think maybe on our next bug triage day, we should actually pay much closer attention to the low-hanging-fruit bugs, because this isn't a very good experience for a new contributor | 18:35 |
jeblair | (i think stale lhf bugs are worse than stale advanced bugs for this reason) | 18:35 |
mgagne | jeblair: there is already a way to bypass cache. An additional feature would be to allow the user to permanently disable the cache with a config. | 18:36 |
clarkb | ++ also we should schedule our bug days now that we know the release schedule | 18:36 |
jeblair | mgagne: oh, that's a good idea | 18:36 |
fungi | Shrews: olaph: yes, flying burrito is next to the raleigh grande theater on grove barton rd (just off glenwood/us70 and lynn rd). lynnwood grill is right across grove barton rd from flying burrito, so we can head there second if you want. still planning on 6:30 pm est? | 18:37 |
jeblair | yolanda: ^ want to add a config file option that does the same as --ignore-cache and use that to close 1193444? | 18:37 |
jgriffith | Anybody know how to reset a job that's apparently *stuck*: https://review.openstack.org/#/c/55923/ | 18:37 |
jgriffith | started gate on the 19th and never returned | 18:38 |
fungi | jgriffith: is it still showing on status.openstack.org/zuul? if so, push a trivial rebase or similar minor patchset update | 18:38 |
jgriffith | fungi: it's not, and will do thanks | 18:38 |
fungi | that was one of the changes being tested by jenkins01 when it spontaneously died on monday | 18:39 |
jeblair | jgriffith: if it's not showing, just 'recheck no bug' | 18:39 |
clarkb | notmyname: I have filled out some numbers for the things that I have good info on in the etherpad | 18:39 |
notmyname | clarkb: thanks (looking..) | 18:39 |
clarkb | notmyname: still waiting on swift things (I can dig them up if I have to, but need to run and do errands here shortly so won't get to that soon) | 18:39 |
jgriffith | fungi: yeah, thought that might be the case. Anywho... thanks for the tip | 18:39 |
fungi | jgriffith: and yeah, as jeblair said, if it's not showing on the status page then a recheck ought to work fine | 18:40 |
notmyname | clarkb: ya, I was hoping for some numbers on the DBConnection timeout issue. my understanding is that it was a suspect, and may have shown up a couple of times, but perhaps only affected swift patches (not openstack-wide ones) | 18:40 |
fungi | jgriffith: but may take a few minutes to show up, what with the current load | 18:40 |
yolanda | jeblair, not sure if i follow you | 18:40 |
lifeless | how's the gate looking this morning? | 18:40 |
notmyname | clarkb: I didn't see it on the elastic recheck page, so I don't know where to look | 18:40 |
*** rongze has quit IRC | 18:40 | |
* portante runs a kibana search on it | 18:41 | |
clarkb | portante: feel free to fill in info at https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 | 18:41 |
notmyname | portante: ah, thanks | 18:41 |
fungi | lifeless: improving, and the second change from the head of the integrated queue should improve it substantially as well | 18:41 |
portante | http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwibG9ja2VkXCIgQU5EIGZpbGVuYW1lOlwibG9ncy9zeXNsb2cudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6ImN1c3RvbSIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJmcm9tIjoiMjAxMy0wOS0wMVQxNzo0NzoxMCswMDowMCIsInRvIjoiMjAxMy0xMS0yMFQxNzo0NzoxMCswMDowMCIsInVzZXJfaW50ZXJ2YWwiOiIwIn0sInN0YW1wIjoxMzg1MDU5Mjc1OTQwLCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9 | 18:42 |
lifeless | was the bad oslo local in nova an issue, or a false fix? | 18:42 |
jeblair | yolanda: mgagne suggested that it would be useful to have a config file option that did the same thing as the '--ignore-cache' command line option, so you could permanently set it for your system if you didn't want to use a cache. | 18:42 |
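A minimal sketch of the idea discussed above, where a config file option supplies the default for the `--ignore-cache` command line flag. The `[job_builder]` section and `ignore_cache` key are hypothetical names chosen for illustration, not confirmed jenkins-job-builder settings, and `resolve_ignore_cache` is an invented helper.

```python
import argparse
import configparser


def resolve_ignore_cache(argv, config_text):
    """Return True if the cache should be bypassed.

    The [job_builder] section and ignore_cache key are illustrative
    assumptions, not taken from jenkins-job-builder itself.
    """
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # A missing section or option falls back to False (cache enabled).
    config_default = config.getboolean(
        "job_builder", "ignore_cache", fallback=False
    )

    parser = argparse.ArgumentParser()
    # The config value becomes the default; the flag can only turn it on.
    parser.add_argument(
        "--ignore-cache", action="store_true", default=config_default
    )
    return parser.parse_args(argv).ignore_cache


print(resolve_ignore_cache([], "[job_builder]\nignore_cache = true\n"))
```

With this shape the command line flag still works for one-off runs, while a site that never wants the cache sets the option once.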
clarkb | lifeless: I have numbers up at https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 looks like it may have fixed the biggest problem | 18:43 |
yolanda | ok, sounds useful | 18:43 |
yolanda | i'll take a look at it | 18:43 |
* lifeless does the I was useful dance | 18:43 | |
portante | notmyname: there is no record of these happening before November 8th | 18:44 |
lifeless | and can we now stop copying code into projects? | 18:44 |
portante | these being "database is locked" errors from swift | 18:44 |
notmyname | portante: was that when the pypy refactor went in? | 18:44 |
portante | I am not sure | 18:45 |
portante | well | 18:45 |
portante | no, that happened before havana was wrapped up, if I remember right | 18:45 |
*** yaguang has quit IRC | 18:45 | |
clarkb | though it looks like that locked stuff is still happening? | 18:46 |
notmyname | portante: all of those seem to have a build_status of SUCCESS | 18:47 |
clarkb | portante: note we only have 2 weeks of indexed logs in elasticsearch | 18:47 |
notmyname | how do I filter? build_Status:!SUCCESS breaks it | 18:47 |
clarkb | notmyname: NOT build_status:"SUCCESS" | 18:47 |
portante | clarkb: oh | 18:47 |
notmyname | clarkb: thanks | 18:47 |
notmyname | portante: http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwibG9ja2VkXCIgQU5EIGZpbGVuYW1lOlwibG9ncy9zeXNsb2cudHh0XCIgYW5kIE5PVCBidWlsZF9zdGF0dXM6XCJTVUNDRVNTXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6ImN1c3RvbSIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJmcm9tIjoiMjAxMy0xMS0wMVQxNzo0NzoxMCswMDowMCIsInRvIjoiMjAxMy0xMS0yMFQxNzo0NzoxMCswMDowMCIsInVzZXJfaW50ZXJ2YWwiOiIwIn0sInN0YW1wIjoxMzg1MDU5NjcwMTU3LCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9 | 18:48 |
clarkb | portante: I want to extend that more, but I have to be careful because disk | 18:48 |
clarkb | also RAM | 18:48 |
notmyname | hmm...need a URL shortener on those | 18:48 |
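The long logstash links above appear to carry the search as base64-encoded JSON after the `#`. The following is a rough sketch, under that assumption, of building and reading such a fragment with the `NOT build_status:"SUCCESS"` filter clarkb suggested. The helper names are invented here, and only the `search` and `fields` keys are included; the real links carry more keys.

```python
import base64
import json


def encode_fragment(state):
    """Serialize a search state dict to compact JSON, then base64."""
    raw = json.dumps(state, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_fragment(fragment):
    """Reverse encode_fragment, tolerating stripped '=' padding."""
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(padded))


# Query combining the syslog "locked" search with the failure-only filter.
query = (
    'message:"locked" AND filename:"logs/syslog.txt" '
    'AND NOT build_status:"SUCCESS"'
)
fragment = encode_fragment({"search": query, "fields": []})

print(decode_fragment(fragment)["search"])
```

Decoding a pasted link this way is also a quick check of which query someone actually ran.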
*** ryanpetrello has quit IRC | 18:49 | |
notmyname | clarkb: it's like you need a scalable place to put large amounts of unstructured data | 18:49 |
portante | clarkb: understood | 18:49 |
portante | :) | 18:49 |
portante | got get 'em PTL of Swift! | 18:49 |
portante | go | 18:49 |
clarkb | notmyname: I am not sure how inverted indexes on swift would do | 18:49 |
clarkb | really the big problem is the more data you put on disk the more data it wants to load into memory to perform queries | 18:50 |
jog0 | jeblair: I agree we should get rid of reverify no bug and I would like to get rid of recheck no bug as well | 18:50 |
clarkb | eventually you need bigger machines than rax provides | 18:50 |
*** krtaylor has quit IRC | 18:50 | |
notmyname | jog0: counterpoint is that opening the gate was all "reverify no bug" | 18:51 |
*** mrodden1 has quit IRC | 18:51 | |
jog0 | notmyname: fair enough, I think this is a good place to start | 18:51 |
*** mriedem1 has quit IRC | 18:51 | |
clarkb | notmyname: core reviewers and infra can open the gate without reverify no bug | 18:51 |
clarkb | just leave +A votes | 18:51 |
jeblair | notmyname: well, it was a lot of 'reverify bug 1251920' | 18:51 |
uvirtbot | Launchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [Critical,Fix committed] https://launchpad.net/bugs/1251920 | 18:52 |
jeblair | notmyname: breaking the gate was a lot of 'reverify no bug'; see https://review.openstack.org/#/c/54363/ for an example | 18:53 |
dprince | jeblair: I'm not a fan of getting rid of the 'no bug' options. Sure they get overused... but in some cases (for dependency related failures) they are very useful | 18:53 |
notmyname | I'd just hate to see removing functionality as a reaction to a current bad situation. I agree that it should include bug numbers, but it seems similar to "OMG somebody deleted all the data, let's disable deletes" | 18:53 |
jeblair | dprince: dependency related failures? | 18:53 |
clarkb | fwiw I think we completely get rid of reverify | 18:53 |
*** xeyed4good1 has joined #openstack-infra | 18:53 | |
* jog0 reads clarkb's wrapup | 18:53 | |
dprince | jeblair: yes. Like I push two branches across different projects. One is a WIP that depends on the other to land first. | 18:53 |
jog0 | oh clarkb so everything big merged? | 18:54 |
clarkb | jog0: not yet | 18:54 |
jog0 | clarkb: whats left? | 18:54 |
jog0 | 920? | 18:54 |
clarkb | jog0: the fix for 1251920 is still in the gate | 18:54 |
dprince | jeblair: when the one lands I recheck no bug and would expect to see a pass. | 18:54 |
dprince | jeblair: it isn't a bug... rather an expected failure | 18:54 |
clarkb | jog0: and the change to reenable nova v3 changes needs hand holding | 18:54 |
jeblair | dprince: i said 'reverify no bug' -- not 'recheck no bug' :) | 18:54 |
clarkb | er nova v3 tempest tests | 18:54 |
jog0 | clarkb: the v3 test enabling should be done after this is all over | 18:54 |
jeblair | dprince: but i see jog0 brought up 'recheck no bug' | 18:54 |
jog0 | aka when we are back to normal | 18:54 |
clarkb | jog0: ok, just want to make sure it doesn't get forgotten | 18:55 |
fungi | 57193 is very close to merging though | 18:55 |
jog0 | clarkb: agreed | 18:55 |
jeblair | dprince: and indeed, that is one of the reasons i think i'm not ready to get rid of 'recheck no bug'. i also used it today on an old patch. | 18:55 |
dprince | jeblair: Fair enough. I'm concerned we'll take it too far. | 18:55 |
jog0 | I think cyeoh is on it | 18:55 |
fungi | potentially 15 minutes if it and the change ahead of it don't step in something | 18:55 |
jog0 | cyeoh: ^ | 18:55 |
jog0 | jeblair: I don't want to get rid of recheck no bug without having anything else | 18:55 |
jeblair | jog0: k | 18:56 |
jog0 | but recheck thisisn'tabugIswear | 18:56 |
jog0 | or something like that | 18:56 |
fungi | cross my hear and hope to die, stick a revert in my eye | 18:56 |
fungi | heart | 18:56 |
*** pete5 has joined #openstack-infra | 18:56 | |
clarkb | rechecks aren't the issue imo | 18:56 |
jog0 | because recheck no bug when it is a bug | 18:56 |
jog0 | is very bad | 18:56 |
clarkb | rechecks happen prior to merging and the gate | 18:56 |
jog0 | clarkb: let me rephrase | 18:57 |
clarkb | the big problem here is you can force bad changes into the gate if they are bad only 25% of the time | 18:57 |
jog0 | rechecks aren't bad yes, unknown bugs in the gate are bad | 18:57 |
jog0 | so maybe recheck no bug isn't the answer but devs ignoring gate bugs is bad | 18:57 |
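clarkb's 25% point can be made concrete with a little arithmetic. This sketch assumes each test run fails independently with the same probability and that a change merges as soon as any one attempt passes; both are simplifications, and `chance_of_merging` is an invented name.

```python
def chance_of_merging(failure_rate, attempts):
    """Probability that at least one of `attempts` runs passes.

    Assumes independent runs with a fixed failure rate, which is a
    simplification of how the check and gate queues really behave.
    """
    return 1.0 - failure_rate ** attempts


for attempts in (1, 2, 3):
    print(attempts, chance_of_merging(0.25, attempts))
```

Under those assumptions a change that breaks one run in four gets through 75% of the time on the first try, and about 98% of the time within three tries, which is why unexplained retries let intermittent bugs accumulate.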
*** dkliban_ has quit IRC | 18:58 | |
jeblair | clarkb: though the extra test runs from the check queue are useful, and when 'recheck no bug' is used there out of laziness, it's as bad as a 'reverify no bug' in the gate. | 18:58 |
*** mrodden has joined #openstack-infra | 18:58 | |
jeblair | (eg, if the check caught a bug in the patch but it was ignored) | 18:58 |
jeblair | (and the gate subsequently misses it) | 18:58 |
jog0 | jeblair: right | 18:58 |
*** mriedem has joined #openstack-infra | 18:58 | |
clarkb | jeblair: hmm good point | 18:59 |
fungi | if we did keep "no bug" maybe we should allow (require?) comments in the pattern. i know that i always feel like i need to justify why i recheck/reverify no bug and end up doing it in a second gerrit review comment instead | 18:59 |
jog0 | fungi: I can get behind that | 19:00 |
jog0 | fungi: cool 8 minutes away (I hope) from 920 being fixed | 19:01 |
nikhil__ | hi, can someone help me resolve jenkins issues on this MP https://review.openstack.org/#/c/54198/ ? | 19:02 |
nikhil__ | seems like, it's failing consistently due to different kinds of issues | 19:02 |
fungi | nikhil__: welcome to the party ;) | 19:02 |
nikhil__ | was wondering how to do a recheck with many bugs? | 19:02 |
nikhil__ | fungi: heh :) | 19:02 |
fungi | nikhil__: the recheck pattern only allows one bug number | 19:03 |
fungi | for now, feel free to just pick one, though hopefully those should cease being a problem here shortly (i think several already may be solved) | 19:03 |
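One way to picture fungi's suggestion of requiring a justification with "no bug", while still accepting exactly one bug number, is a comment pattern like the one below. This is a hypothetical pattern written for illustration; it is not the expression the gate actually uses, and the `no bug: <reason>` syntax is an assumption.

```python
import re

# Hypothetical grammar: "<recheck|reverify> bug <number>" or
# "<recheck|reverify> no bug: <non-empty reason>".
COMMENT_RE = re.compile(
    r"^(?P<action>recheck|reverify)\s+"
    r"(?:bug\s+(?P<bug>\d+)|no\s+bug\s*:\s*(?P<reason>\S.*))$",
    re.IGNORECASE,
)


def parse_retrigger(comment):
    """Return the parsed fields, or None if the comment is not accepted."""
    match = COMMENT_RE.match(comment.strip())
    if match is None:
        return None
    return {k: v for k, v in match.groupdict().items() if v is not None}


print(parse_retrigger("recheck bug 1251920"))
print(parse_retrigger("recheck no bug: dependency has now landed"))
print(parse_retrigger("reverify no bug"))
```

A bare "no bug" is rejected, so the explanation that people already tend to add in a second review comment would have to travel with the retrigger itself.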
nikhil__ | oh, gotcha. Thanks fungi | 19:04 |
nikhil__ | would you rather want me waiting while the issues in the gating process are resolved? | 19:04 |
nikhil__ | or incoming recheck requests are fine? | 19:04 |
fungi | nikhil__: if you don't mind, that would be great. your chances of getting a passing verify score from jenkins will be going up considerably here shortly, we hope | 19:05 |
openstackgerrit | Douglas Mendizabal proposed a change to openstack-infra/config: Rename Project Barbican channel https://review.openstack.org/57741 | 19:05 |
*** mrmartin has quit IRC | 19:05 | |
sdague | jeblair: +1 to removing reverify no bug | 19:05 |
nikhil__ | sure fungi . I will wait (anything that can be helpful :) ) | 19:05 |
jog0 | FWI http://paste.openstack.org/show/53765/ | 19:05 |
*** sarob has joined #openstack-infra | 19:06 | |
jog0 | things are still pretty shake | 19:06 |
jog0 | bad | 19:06 |
jog0 | grenade is still failing all the time | 19:06 |
*** senk1 has quit IRC | 19:06 | |
dprince | sdague: maurosr is unresponsive. Can you clue me in on what exactly needs to happen w/ grenade. I see no reason to wait around? | 19:06 |
*** senk has joined #openstack-infra | 19:07 | |
*** dolphm_afk is now known as dolphm | 19:07 | |
*** thomasbiege has joined #openstack-infra | 19:07 | |
notmyname | jog0: I think we just had a "no bug" fail in swift, BTW | 19:07 |
*** xeyed4good1 has quit IRC | 19:08 | |
notmyname | so if a gate job failed due to a timeout, is that "no bug"? | 19:09 |
portante | notmyname: are you talking about: https://review.openstack.org/54620 | 19:09 |
rfolco | Hello folks. Is there an easier way to skip multiple Tempest tests than using @skip_because decorator ? | 19:10 |
notmyname | portante: ya | 19:10 |
portante | that was weird, though | 19:10 |
portante | the swift functional tests failed on the check job, for some weird reason | 19:10 |
notmyname | portante: functests failed due to a tcp drop, and grenade was killed after 60 minutes | 19:10 |
notmyname | portante: ya. error talking to git | 19:10 |
portante | but there was a check and a gate job running at the same time | 19:10 |
jog0 | notmyname: if something times out at least file a bug for hey it timed out | 19:11 |
jog0 | notmyname: plus gate isn't fixed yet | 19:11 |
portante | if you look there was a reverify no bug by you, then dfg did a recheck bug 1251920 | 19:11 |
uvirtbot | Launchpad bug 1251920 in nova "Tempest failures due to failure to return console logs from an instance" [Critical,Fix committed] https://launchpad.net/bugs/1251920 | 19:11 |
*** rongze has joined #openstack-infra | 19:11 | |
jog0 | notmyname: we are in bug fixes only in gate now | 19:11 |
portante | that reverify no bug came from last night | 19:12 |
portante | that was not done this morning | 19:12 |
jog0 | ahh | 19:12 |
jog0 | we were in gate freeze last night too | 19:12 |
fungi | gah. 57193 restarted because the change ahead of it failed | 19:12 |
notmyname | portante: ya, that's when it was ok to try some stuff because it was probably ok | 19:12 |
maurosr | dprince: sdague hi sorry, two meetings in the last hours, pushing it | 19:13 |
fungi | MismatchError: 9 != 10 (which one was that again?) | 19:13 |
notmyname | yay, the patch at the front of the gate just failed. does that mean another 60 minutes for the good patch to get in? | 19:13 |
sdague | maurosr: great | 19:13 |
jog0 | clarkb: I am just going to do a brain dump for the wrapup and hope someone can help me turn it into something coherent | 19:13 |
dprince | maurosr: thanks | 19:13 |
*** jgrimm has joined #openstack-infra | 19:14 | |
maurosr | sdague: I still have lots of refactors to avoid duplication, but will do them in a separate commit, they were breaking my patch, so I need to think it through better | 19:14 |
sdague | maurosr: ok, well if you have a rev out there, I can help with it as well | 19:15 |
sdague | it would be good to unblock dprince on this | 19:15 |
*** senk has quit IRC | 19:16 | |
maurosr | sdague: the ones moving each release to a separate file are out, the refactor is just some functions that exist in every single file and could be reused.. I don't think it is a blocker because they are not really necessary | 19:16 |
sdague | ok, cool, lets see how they handle tests | 19:16 |
jog0 | sdague: is gerande worging again? | 19:16 |
sdague | I'll review post lunch | 19:16 |
jog0 | working* | 19:17 |
sdague | jog0: well, my fix landed | 19:18 |
*** hogepodge_ has joined #openstack-infra | 19:18 | |
*** rongze has quit IRC | 19:18 | |
*** hogepodge has quit IRC | 19:19 | |
*** hogepodge_ is now known as hogepodge | 19:19 | |
sdague | the last grenade fail I see is unrelated - http://logs.openstack.org/05/55405/5/check/check-grenade-devstack-vm/7d7de61/console.html | 19:19 |
jog0 | sdague: cool | 19:20 |
anteaya | yay sdague | 19:20 |
jog0 | sdague: when did it land? | 19:20 |
sdague | 2 hours ago | 19:21 |
jog0 | sdague: ahh that makes sense then | 19:21 |
jog0 | too early to tell | 19:21 |
sdague | so the timeout is still 60? | 19:21 |
sdague | the slow node setup time makes me think we should up that regardless | 19:22 |
jog0 | as far as I know | 19:22 |
jog0 | ++ | 19:22 |
jeblair | i'm looking into slow node setup time | 19:22 |
sdague | jeblair: cool | 19:22 |
sdague | that last change I posted, 30 minutes to get to the main run | 19:22 |
zaro | clarkb: would you be able to reply to paul's comment? https://review.openstack.org/#/c/47937/10/modules/openstack_project/manifests/review_security.pp | 19:22 |
jeblair | i think it might be time to remove the zuul repos (and remove their zuul refs) | 19:24 |
jeblair | so if we restart zuul again, we should 'rm -fr' its working directory | 19:24 |
fungi | jeblair: noted | 19:24 |
*** johnthetubaguy has quit IRC | 19:25 | |
*** herndon has quit IRC | 19:25 | |
zaro | fungi: i have reread this.. http://lists.openstack.org/pipermail/openstack-infra/2013-October/000314.html | 19:26 |
fungi | notmyname: 53 minutes now (so sayeth zuul's estimate) | 19:26 |
jeblair | http://cacti.openstack.org/cacti/graph_image.php?action=view&local_graph_id=388&rra_id=3 | 19:26 |
jeblair | is an absurd graph. it's mostly due to git upload packs | 19:26 |
zaro | fungi: i'm not sure what the hold up is, can you refresh my memory? | 19:27 |
fungi | zaro: sure--rereading now so i can refresh mine first ;) | 19:27 |
sdague | jeblair: heh | 19:28 |
jog0 | jeblair: what do you think about a zuul mode that only merges code with a bug associated with it? | 19:28 |
jog0 | telling 200 people not to approve patches .. is hard | 19:29 |
jog0 | clarkb: how long was the gate queue when we flushed it? | 19:30 |
notmyname | jog0: 127 (give or take a few) IIRC | 19:31 |
fungi | zaro: mmm, everything seems covered in that thread if you didn't have any other follow-up there. i think i just need to review that outstanding change again | 19:31 |
jeblair | 145 | 19:31 |
*** ryanpetrello has joined #openstack-infra | 19:32 | |
zaro | fungi: cool, i was just about to git review new patch. it's # 47937 | 19:32 |
fungi | zaro: yep, was just looking at it. seemed to be failing tests and wip | 19:32 |
lifeless | is there some way to stop a meeting someone else started? | 19:33 |
zaro | fungi: ohh, looks like clarkb is not around. would appreciate it if you could reply to paul's comment. | 19:33 |
lifeless | see #openstack-meeting | 19:33 |
morganfainberg | lifeless, #endmeeting ? | 19:33 |
lifeless | morganfainberg: 'see #openstack-meeting' :) | 19:33 |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Setup a private gerrit instance for security reviews https://review.openstack.org/47937 | 19:33 |
*** dkliban_ has joined #openstack-infra | 19:34 | |
zaro | fungi: was in wip because i thought we were in holding pattern. new patch fixes the white space chars. | 19:35 |
jog0 | notmyname jeblair: remember how long it was in hours? | 19:36 |
notmyname | jog0: 18? | 19:36 |
jog0 | notmyname: that sounds about right | 19:36 |
morganfainberg | before i go and reverify/recheck things for keystone, wanted to do a temperature check in here. not sure where we are sitting wrt gate this morning, (not seeing an explicit all clear) | 19:37 |
morganfainberg | you know, not making a bad situation worse. | 19:38 |
fungi | zaro: replied to the inline comment | 19:39 |
*** thomasbiege has quit IRC | 19:39 | |
jog0 | morganfainberg: please wait | 19:40 |
morganfainberg | jog0, figured as much, this is why i asked :) | 19:40 |
jog0 | morganfainberg: the last big bug patch 57193 is at the top of the queue | 19:40 |
jog0 | but until it's merged we want to hold off turning the floodgates on | 19:40 |
jog0 | morganfainberg: thank you very much for asking | 19:40 |
morganfainberg | ah cool. will keep my eye on that one, thanks for all the awesome work the last day. | 19:40 |
zaro | fungi: cool. thx. i guess the thing i'm really not sure of is config in All-Projects-review_security.config | 19:41 |
*** kgriffs has left #openstack-infra | 19:43 | |
fungi | zaro: yeah, i need to give the whole thing a detailed review. but as for that file we probably want to turn it into a section in docs/source/gerrit.rst instead i think (or add a similar file for review-security.rst or something) | 19:43 |
*** gyee has quit IRC | 19:43 | |
fungi | zaro: though we could test it as is and see if applying that works | 19:43 |
*** rongze has joined #openstack-infra | 19:45 | |
*** ruhe has quit IRC | 19:45 | |
zaro | fungi: not sure what you mean. are you suggesting that we just need to document it? or replace what's in the review? | 19:45 |
fungi | zaro: there's a chicken-and-egg problem with trying to puppet gerrit from scratch and add an all-projects acl | 19:46 |
fungi | zaro: we work around that at the moment by simply documenting what the initial administrator should configure there | 19:46 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/jeepyb: Savanna client using separated LP project now https://review.openstack.org/57752 | 19:47 |
*** krtaylor has joined #openstack-infra | 19:48 | |
zaro | fungi: ahh. so instead of puppeting all-projects those configs need to be set manually? it seems like puppeting is doable, no? | 19:48 |
fungi | zaro: not safely... at least certainly not initially | 19:48 |
fungi | zaro: to be able to apply the acl you have to push it through gerrit's ssh git interface | 19:48 |
fungi | zaro: and that acl is what determines who and how changes are allowed to be pushed | 19:49 |
*** mihgen has quit IRC | 19:50 | |
fungi | zaro: it *can* be applied directly on the filesystem via local git operations instead, but that skips gerrit's acl validation, group uuid creation and so on | 19:50 |
*** rongze has quit IRC | 19:50 | |
*** rpodolyaka1 has quit IRC | 19:51 | |
*** fbo_away is now known as fbo | 19:51 | |
*** sarob has quit IRC | 19:51 | |
fungi | zaro: yolanda worked out a way to do it, but i think it involves using git to locally apply a very minimal and rigorously tested all-projects acl along with corresponding database queries to emulate the group uuid mapping which corresponds to that, then as a second stage push the more complex all-projects acl through the normal update mechanism in gerrit | 19:52 |
*** sarob has joined #openstack-infra | 19:52 | |
fungi | while it might be nice to have, it's tricky and certainly not trivial to implement | 19:52 |
*** mihgen has joined #openstack-infra | 19:54 | |
*** yolanda has quit IRC | 19:55 | |
*** whayutin_ has joined #openstack-infra | 19:56 | |
*** weshay has quit IRC | 19:56 | |
zaro | fungi: ok. i was just setting up locally on my machine so didn't realize the complexity. i guess i should change all-projects config to documentation instead? | 19:56 |
*** sarob has quit IRC | 19:56 | |
fungi | zaro: i think so. that's what we already do for our public gerrit instance on review.o.o (merely document what all-projects should look like) | 19:57 |
*** sarob has joined #openstack-infra | 19:59 | |
*** ericw has quit IRC | 20:01 | |
*** danger_fo_away is now known as dangers | 20:01 | |
*** whayutin_ is now known as weshay | 20:02 | |
fungi | zaro: i left a comment on the change summarizing | 20:02 |
zaro | fungi: excellent. thanks. | 20:04 |
jog0 | anyone want to help me build the bug timeline? | 20:04 |
*** hashar has joined #openstack-infra | 20:07 | |
*** herndon has joined #openstack-infra | 20:10 | |
*** vipul is now known as vipul-away | 20:15 | |
*** vipul-away is now known as vipul | 20:15 | |
jog0 | looks like https://review.openstack.org/#/c/57193/ will merge | 20:19 |
mgagne | zaro: any experience with jenkins slaves running windows? | 20:20 |
fungi | jog0: there's still a grenade failure behind it though | 20:20 |
*** ^d has quit IRC | 20:21 | |
notmyname | jog0: unless it times out... ;-) | 20:22 |
jog0 | fungi: link? | 20:22 |
jog0 | notmyname: :( | 20:22 |
jog0 | jeblair sdague: is the timeout bump for grenade in the queue | 20:22 |
mikal | jog0: so... how goes operation WTF Console Log? | 20:23 |
jog0 | mikal: http://status.openstack.org/zuul/ it's at the top of the queue | 20:23 |
jog0 | its all up to greande | 20:23 |
jog0 | grenade | 20:23 |
fungi | jog0: looks like this one is trying to apt-get over the network. and running into connectivity issues... https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/17202/console | 20:23 |
jog0 | fungi: ahh | 20:24 |
fungi | so unrelated | 20:24 |
jog0 | uh oh | 20:25 |
jog0 | 17201 has 2 minutes | 20:25 |
jog0 | and is on hpcloud | 20:25 |
fungi | jog0: the console log on this one is a bit on the brief side (haven't looked at what uploaded to logs.o.o yet): https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-neutron-large-ops/8656/console | 20:25 |
jog0 | fungi: yeah that console log is short | 20:25 |
fungi | same for https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-neutron/21816/console | 20:25 |
jog0 | ohh really short | 20:25 |
jog0 | looks like a network bug | 20:26 |
fungi | short log with a looooong gap | 20:26 |
jog0 | lol | 20:26 |
jog0 | mikal: looks like https://review.openstack.org/#/c/57193/ passed | 20:26 |
notmyname | jog0: does this mean you'll have to flush the whole gate again to not get stuck behind 46 other changes (all tested without this patch) | 20:26 |
jog0 | as long as it beats the buzzer | 20:26 |
fungi | jog0: came in just under the wire | 20:26 |
notmyname | success! | 20:27 |
jog0 | notmyname: wrong patch | 20:27 |
jog0 | mikal: ^ | 20:27 |
jog0 | https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/17202/console puppetlabs | 20:27 |
jog0 | WAT | 20:28 |
jog0 | why | 20:28 |
jog0 | fungi: ^ | 20:28 |
jog0 | fungi: so if apt.puppet goes down we do t ? | 20:28 |
jog0 | to | 20:28 |
fungi | jog0: very good question. i think we don't want that, no | 20:28 |
jog0 | clarkb jeblair sdague: I think we are out of critical mode | 20:29 |
jog0 | turn off the ceremonial bug fix only mode? | 20:29 |
jog0 | mikal: ^ | 20:29 |
fungi | jog0: we install puppet from there to use for configuring the slave image. devstack then wants to update package lists | 20:29 |
jog0 | clarkb: and reload the queue | 20:29 |
zaro | mgagne: did i hear windows? | 20:29 |
* jog0 files a bug against infra | 20:29 | |
mgagne | zaro: sort of. lets say it's a non-linux OS. | 20:29 |
*** CaptTofu has joined #openstack-infra | 20:31 | |
jog0 | fungi: ^ | 20:31 |
jog0 | fungi: https://bugs.launchpad.net/openstack-ci/+bug/1253774 | 20:31 |
uvirtbot | Launchpad bug 1253774 in openstack-ci "Reduce number of apt sources that must be up for gate to work" [Undecided,New] | 20:31 |
fungi | jog0: it might be possible to remove the puppet sources.list entry after we build the image, before it gets snapshotted | 20:31 |
fungi | jog0: but i think we should probably also have a close look at the list to make sure there aren't other things in a similar situation and clean them all up consistently | 20:31 |
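As a starting point for the audit fungi describes, the hosts a slave image depends on can be listed from its apt sources before snapshotting. This is a rough sketch that only handles one-line `deb` / `deb-src` entries; the sample text and the `apt_source_hosts` helper are illustrative assumptions, not taken from the actual image build.

```python
from urllib.parse import urlparse


def apt_source_hosts(sources_text):
    """Return the sorted set of hostnames referenced by deb/deb-src lines."""
    hosts = set()
    for line in sources_text.splitlines():
        # Drop comments and surrounding whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 3 or fields[0] not in ("deb", "deb-src"):
            continue
        # The URI is the first field that looks like a URL; this skips
        # an optional "[arch=...]" options block.
        uri = next((f for f in fields[1:] if "://" in f), None)
        if uri is None:
            continue
        host = urlparse(uri).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


# Illustrative sample only, not the real slave image configuration.
SAMPLE = """\
deb http://archive.ubuntu.com/ubuntu precise main
deb-src http://archive.ubuntu.com/ubuntu precise main
# deb http://disabled.example.org/ubuntu precise main
deb http://apt.puppetlabs.com precise main
"""

print(apt_source_hosts(SAMPLE))
```

Every host in the result is something that has to be reachable when a job refreshes its package lists, so the shorter this list is at snapshot time, the fewer unrelated outages can fail a gate run.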
zaro | mgagne: i have some experience with that sorta thing. | 20:32 |
jog0 | fungi: agreed, I filed a bug because this isn't a fix this ASAP issue | 20:32 |
*** mrodden has quit IRC | 20:32 | |
fungi | jog0: great--thanks! i'll triage and get some thoughts in there in case i don't work on it right away | 20:32 |
fungi | (which seems likely) | 20:32 |
jog0 | fungi: thanks, and with regard to turning on the full gate flood? | 20:32 |
fungi | jog0: i'm in favor, but looking for consensus | 20:33 |
jog0 | fungi: agreed | 20:33 |
anteaya | neutron has been business as usual since last night | 20:33 |
fungi | jog0: though it might be nice to watch one more iteration of jobs to see how fast the current 45 drop | 20:33 |
jog0 | true | 20:34 |
anteaya | I had to give the Japanese core some sense of direction | 20:34 |
anteaya | couldn't leave them hanging and then go to bed | 20:34 |
fungi | anteaya: i think the question is more with regards to auto-reverifying the 100+ we had from the original dump | 20:34 |
anteaya | ah | 20:34 |
jog0 | fungi: so can you babysit zuul | 20:34 |
jog0 | I am going to keep working on the report | 20:35 |
fungi | jog0: for a few hours, yeah | 20:35 |
anteaya | some folks are coming to me and asking for recheck reverify, which I have been doing manually | 20:35 |
mgagne | zaro: I think I found what I was looking for. I'll give it a try until I cut myself. | 20:35 |
anteaya | can we do stages of 25 patches at a time? | 20:35 |
*** mriedem1 has joined #openstack-infra | 20:35 | |
*** mrodden1 has joined #openstack-infra | 20:35 | |
jog0 | fungi: cool, I think we should decide about gate by the end of the hour | 20:35 |
fungi | jog0: agreed | 20:35 |
jog0 | and ping me if anything fails | 20:35 |
fungi | will do | 20:35 |
jog0 | fungi: we should also prep an email and the requeue old stuff scripts just in case | 20:36 |
clarkb | jog0: sorry just got back from errands (woo new lease) | 20:36 |
clarkb | jog0: is there a tl;dr on the info you need? | 20:36 |
jog0 | clarkb: yeah | 20:37 |
anteaya | clarkb: congrats on new lease | 20:37 |
jog0 | clarkb: https://review.openstack.org/#/c/57193/ merged | 20:38 |
clarkb | woot! | 20:38 |
*** vipul is now known as vipul-away | 20:38 | |
*** vipul-away is now known as vipul | 20:38 | |
jog0 | clarkb: so fungi and I decided to watch zuul till the end of the hour | 20:38 |
jog0 | and decide if we call an all clear and requeue the stuff we bumped | 20:38 |
*** mriedem has quit IRC | 20:38 | |
clarkb | ok | 20:39 |
jog0 | clarkb: and I am working on the report | 20:39 |
fungi | basically if it looks like the gate is clearing quickly, then it should be safe to dump in the remainder | 20:40 |
fungi | otherwise, there's still work to do | 20:40 |
fungi | i've poked at the bug timeline a little, but not sure what you're looking for next (an analysis of what changes caused them to emerge?) | 20:40 |
zaro | mgagne: don't try too hard. x64 win is not real nice with jenkins. | 20:41 |
jog0 | fungi: I first want to list when each bug went into play | 20:41 |
fungi | ahh, okay | 20:42 |
jog0 | for the wrapup I was going to explain how we got into the state we did (bug by bug blow type thing) | 20:42 |
jog0 | and our fixes | 20:42 |
jog0 | just to give people a better idea of how bad it was | 20:42 |
mgagne | zaro: ... :-/ | 20:43 |
*** rongze has joined #openstack-infra | 20:47 | |
*** markmcclain has quit IRC | 20:47 | |
jog0 | wow this timeline is a little scary | 20:50 |
jog0 | essentially just 2 bugs did us in | 20:50 |
*** markmc has quit IRC | 20:50 | |
*** rongze has quit IRC | 20:51 | |
*** vipul is now known as vipul-away | 20:53 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/devstack-gate: Add timestamps to devstack-gate output https://review.openstack.org/57770 | 20:53 |
*** vipul-away is now known as vipul | 20:54 | |
jeblair | wow we run a lot of jobs on dg changes :) | 20:54 |
fungi | jeblair: and remember we only just stopped running stable/folsom compat jobs on it last week ;) | 20:55 |
fungi | there were more than a few of those as well | 20:55 |
clarkb | jog0: 1251920 and the nova oslo thing? | 20:55 |
*** rfolco has quit IRC | 20:56 | |
jog0 | clarkb: let me rephrase | 20:57 |
jog0 | 2 bugs put us over the edge | 20:57 |
clarkb | I see | 20:57 |
jog0 | 1251920 and parallel grenade and everything that involved | 20:57 |
*** markmcclain has joined #openstack-infra | 20:57 | |
jog0 | we had enough underlying bugs that that's all it took | 20:57 |
jog0 | and 1251920 hit everything | 20:57 |
*** datsun180b has quit IRC | 20:58 | |
jog0 | clarkb fungi: so how does gate look | 20:58 |
fungi | jog0: all green so far | 20:58 |
mikal | jog0: so just so I am clear, you think you've fixed 1251920? | 20:59 |
mgagne | zaro: what a pain... | 20:59 |
jog0 | fungi: none of the important tests reported back yet | 20:59 |
mikal | jog0: cause I'd like to never have to do that again please | 20:59 |
jog0 | mikal: fixed, no; disabled, yes | 20:59 |
fungi | no failures on any gate jobs since the fix to 1251920 merged (there's a merge conflict a ways down, but that's uninteresting) | 20:59 |
lifeless | clarkb: can I forward that mail to the larger internal list? | 20:59 |
jog0 | we found the test that triggered it and turned it off | 20:59 |
clarkb | lifeless: oh I did a reply only, yeah thats fine | 20:59 |
mikal | jog0: do we understand why that test triggered it yet? | 20:59 |
anteaya | w00t | 20:59 |
jog0 | mikal: :( | 20:59 |
fungi | jog0: quite a few have reported back so far (devstack/tempest/grenade) and are all success stories | 20:59 |
jog0 | mikal: https://review.openstack.org/#/c/57193/ | 21:00 |
jog0 | fungi: if you're happy, I am happy | 21:00 |
fungi | i take that back... no grenades yet (hasn't been long enough) | 21:00 |
*** johnthetubaguy has joined #openstack-infra | 21:00 | |
jog0 | fungi: yeah | 21:01 |
fungi | jog0: current average completion time on grenade seems to be ~58 minutes | 21:01 |
jog0 | mikal: so we still need to fix that bug | 21:02 |
jog0 | so happy hunting to us :) | 21:02 |
fungi | but jenkins may just have not adjusted its expectations downward yet | 21:02 |
jog0 | fungi: can you push a patch to bump the timeout | 21:02 |
jog0 | fungi: ohh | 21:02 |
*** SergeyLukjanov has quit IRC | 21:02 | |
fungi | jog0: i can. note that it won't affect any already-running jobs, only jobs which start after the config change is applied | 21:02 |
*** alcabrera has quit IRC | 21:03 | |
fungi | so we might as well wait to see if it's still that slow first | 21:03 |
jeblair | i'd like to remove the zuul repos before adjusting the timeout | 21:03 |
jog0 | fungi: ack | 21:03 |
*** dprince has quit IRC | 21:03 | |
jog0 | jeblair fungi: so it sounds like you have that covered cool | 21:03 |
*** denis_makogon_ is now known as denis_makogon | 21:04 | |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: fix jjb configuration documentation https://review.openstack.org/57062 | 21:04 |
jeblair | it's currently taking 2.5 minutes to determine that a ref doesn't exist, and a significant part of that is number of refs in the repos; longer term solution is to push zuul refs to a load balanced set of servers | 21:04 |
fungi | eee, yeah... i wonder if some way to expire zuul refs older than a week or a month or something wouldn't be too hard as a hot periodic sort of task | 21:05 |
notmyname | should I be worried that check jobs are still getting hit with 1251920? | 21:06 |
fungi | notmyname: they probably started before the fix for that merged and are based on an earlier state | 21:06 |
notmyname | fungi: https://review.openstack.org/#/c/57753/ | 21:07 |
* fungi checks | 21:07 | |
notmyname | perhaps. but it's close | 21:07 |
*** mrodden1 is now known as mrodden | 21:08 | |
fungi | looks like the job started at 20:17 | 21:08 |
fungi | fix for it merged at 20:27 | 21:08 |
notmyname | kk | 21:08 |
fungi | jog0: stuff seems to be merging and not ejecting... down to 42 in the gate pipeline now | 21:10 |
jog0 | fungi: yeah things are all green :) | 21:10 |
jog0 | so back to business as usual? | 21:10 |
fungi | in about 10-15 minutes we should have a whole swath of changes finishing, several with grenade jobs | 21:11 |
jog0 | fungi: cool | 21:11 |
clarkb | jog0: in the wrapup stuff, can you fill in the swift details? | 21:11 |
clarkb | jog0: they are a bit fuzzy for me because portante managed to just do them :) (which is a good thing) | 21:12 |
jog0 | fungi: sounds like a good plan, I am just getting antsy about letting people get back to work | 21:12 |
portante | what'd i do | 21:12 |
jog0 | clarkb: yeah I will fill that in, in a few minutes | 21:12 |
portante | which way did he go | 21:12 |
jog0 | portante: you want to do it instead | 21:12 |
jog0 | https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013 | 21:12 |
*** gyee has joined #openstack-infra | 21:12 | |
jog0 | so this is the second gate crash that I know of hehe | 21:12 |
clarkb | portante: near the bottom of that etherpad is a paragraph on the swift failures, with a sentence asking joe to fill in detail. can you take a stab at that? | 21:12 |
jeblair | i don't like 'gate crash' for the same reason sdague doesn't like 'flakey test' :) | 21:13 |
clarkb | jeblair: ++ | 21:13 |
portante | yes | 21:13 |
notmyname | jeblair: the "jim failure"? ;-) | 21:13 |
jog0 | jeblair clarkb: it was supposed to be a reference to *gate (watergate etc) | 21:14 |
jog0 | but fair enough | 21:14 |
jog0 | jeblair: what catchy name do you propose? | 21:14 |
notmyname | jeblair: the day the music stopped: gerritgate | 21:14 |
jog0 | notmyname: ^_^ | 21:15 |
jeblair | notmyname: heh, yeah, let's go all out and just say i singlehandedly broke openstack. while on vacation, no less. :) | 21:15 |
notmyname | jeblair: no harm intended :-) | 21:16 |
mikal | jog0: so is it time to send an all clear on people approving changes again? | 21:16 |
mikal | jog0: or am I missing something? | 21:16 |
jeblair | jog0: i think describing it as wedged or stuck is reasonable | 21:16 |
fungi | mikal: just watching gate jobs for a bit to make sure there's nothing else slowing it up | 21:16 |
*** sandywalsh has quit IRC | 21:16 | |
jeblair | notmyname: i understand -- collaborative jesting is difficult in irc and it looks like i don't quite have the knack yet. :) | 21:16 |
fungi | mikal: so far it's a sea of green, though grenade timeouts may still bite us | 21:17 |
jog0 | mikal: not yet, we are waiting a few more minutes | 21:17 |
jeblair | how about we let a bunch of things merge, then stop zuul, remove the repos, restart, and then send an all clear | 21:17 |
jog0 | jeblair: sounds good, but its not as catchy as the day the music stopped: gerritgate | 21:17 |
mikal | jog0: ok, cool | 21:17 |
jeblair | that way we don't send an all clear and then stop zuul. :) | 21:17 |
clarkb | jeblair: I like that plan | 21:17 |
mikal | jog0: so I should sneakily approve my stuff now before the flood? | 21:17 |
clarkb | jeblair: my morning errands are all now complete so I can help with that too :) | 21:18 |
fungi | worth noting, jenkins has adjusted its grenade completion estimates down quite a bit | 21:18 |
jog0 | mikal: heh | 21:18 |
clarkb | fungi: what was the grenade timing problem? | 21:18 |
*** flaper87 is now known as flaper87|afk | 21:18 | |
portante | clarkb, notmyname: how does that paragraph read? | 21:18 |
mikal | jog0: you think I'm joking! | 21:18 |
jog0 | mikal: I don't actually | 21:18 |
mikal | jog0: I'm going to "test" the gate with this here approved patch of mine | 21:18 |
jog0 | mikal: seriously we don't need more tests | 21:18 |
jog0 | we have 40-odd | 21:19 |
mikal | :( | 21:19 |
clarkb | portante: looks good. can you add links to gerrit changes and lp bugs too? | 21:19 |
jeblair | mikal: it won't help since i'm about to dump the queue anyway; likely in 16 minutes | 21:19 |
fungi | clarkb: jeblair has evidence to support zuul ref checking in the setup taking longer than needed because of all the zuul ref buildup in zuul's git repos | 21:19 |
jog0 | mikal: look at zuul btw | 21:19 |
jog0 | about to merge 4 patches | 21:19 |
jeblair | fungi, clarkb: that combined with the sheer number of simultaneous jobs pulling zuul refs | 21:19 |
clarkb | fungi: ooh fun, basically git slowdowns due to ETOOMANYREFS? | 21:19 |
jog0 | if grenade finishes | 21:19 |
mikal | jog0: but none of them mine!@ | 21:19 |
jog0 | mikal: I think one of my patches is in the gate | 21:20 |
jog0 | someone reapproved it :( | 21:20 |
*** reed has joined #openstack-infra | 21:20 | |
fungi | here we go! | 21:20 |
fungi | 4 changes about to cram through | 21:20 |
jog0 | fungi: \o/ | 21:20 |
fungi | and a bunch more right on their heels | 21:20 |
* jog0 has never been so happy to see things work | 21:20 | |
zaro | fungi: you sure security-gerrit all-projects should get included with the general docs? | 21:21 |
fungi | so much greeeeen. so beauuuutiful | 21:21 |
fungi | zaro: well, we will use a shadow gerrit, and we need to know how to rebuild it if needed | 21:21 |
jog0 | fungi: http://www.amsterdam-mamas.nl/wp-content/uploads/2013/02/kermit-Frog.jpg | 21:22 |
clarkb | fungi: zaro: I think manage-projects should manage that for us | 21:22 |
notmyname | portante: I'm not sure how to word it seeing as there still seems to be a DB lock error in swift | 21:22 |
fungi | zaro: i'm unconvinced whether that goes in gerrit.rst as an additional section (or several), additional paragraphs in existing sections, or a whole new rst file separate from it | 21:22 |
*** mindjive1 is now known as mindjiver | 21:22 | |
fungi | clarkb: should manage the all-projects acl? | 21:22 |
portante | notmyname: oh, can you show me? | 21:22 |
clarkb | fungi: zaro: we can still document why and how in the docs though, but the actual management of it can be in manage-projects | 21:22 |
clarkb | fungi: ya | 21:22 |
notmyname | portante: see links in -swift | 21:22 |
fungi | clarkb: how? | 21:22 |
clarkb | fungi: its just another gerrit project :) | 21:22 |
clarkb | so need an All-Projects entry in projects.yaml | 21:23 |
fungi | clarkb: we need at least some stub all-projects acl configuration to be able to add the accounts necessary | 21:23 |
fungi | clarkb: so that manage-projects will work | 21:23 |
*** marun has quit IRC | 21:23 | |
clarkb | fungi: oh good point | 21:23 |
clarkb | silly eggs and their chickens | 21:23 |
portante | notmyname: I see | 21:24 |
zaro | fungi: even as a separate rst file, wouldn't you want to include it into gerrit.rst? | 21:24 |
fungi | clarkb: i was telling zaro, yolanda worked out how to do it, but requires writing a well-tested stub acl via local git operations, making mysql queries to create initial group uuid mappings, and then you can push a more complex acl in on top of it through the normal workflow | 21:24 |
notmyname | portante: point being, while something that needs to get fixed, it seems the gate issues were around the devstack config, not the db lock | 21:24 |
clarkb | fungi: gotcha, ya I think for now we should document it like we do for the other gerrit, so that we can bootstrap but we should also automate it beyond the bootstrapping | 21:25 |
fungi | zaro: if it were a separate rst file, it would just go in the list of server/service-specific rst docs on ci.o.o | 21:25 |
jeblair | i would love that automation | 21:25 |
*** sandywalsh has joined #openstack-infra | 21:25 | |
jeblair | but agree we don't neet to block on it if it's super hard | 21:25 |
jeblair | need | 21:26 |
portante | notmyname: so you think we rushed that change through because of the gate issues? | 21:26 |
portante | that had been in the works well before these gate issues | 21:26 |
notmyname | portante: yes, I do. | 21:26 |
portante | oh, what gives you that impression? | 21:26 |
zaro | fungi: ahh. ok. that sounds good. i'll create new rst for it. | 21:27 |
*** ^d has joined #openstack-infra | 21:27 | |
*** ^d has joined #openstack-infra | 21:27 | |
fungi | jog0: so we have a gate reset a ways down courtesy of https://jenkins02.openstack.org/job/gate-oslo.messaging-python27/159/console (looks like it ran too long) | 21:28 |
notmyname | portante: 2 things: 1) the original bug was introduced in swift on july19 and 2) the gate *appears* stable now, even though the subsequent patch shows there are still issues | 21:28 |
portante | huh? | 21:28 |
*** hdd has quit IRC | 21:29 | |
zaro | fungi: ohh how do you put the new rst in the list of server/service-specific rst docs on ci.o.o ? | 21:29 |
portante | can you share the july19th thing, and have you looked at: https://bugs.launchpad.net/swift/+bug/1224253? | 21:30 |
*** che-arne has joined #openstack-infra | 21:30 | |
uvirtbot | Launchpad bug 1224253 in swift "test_object_upload_in_segments fails with OperationalError database is locked" [Undecided,New] | 21:30 |
fungi | jog0: also looks like we have a bunch of grenade and tempest jobs up near the head which are getting very close to 60 minutes | 21:30 |
portante | I have seen these errors since at least the 12th of September | 21:30 |
portante | they might have been overshadowed by other errors, as rechecks choose one bug to use, when there are multiple in place | 21:31 |
notmyname | portante: ya, see the commit message in the patch https://review.openstack.org/#/c/57019/ | 21:31 |
notmyname | portante: the referenced commit landed on jul19 | 21:32 |
fungi | jog0: jeblair: yup, timeout at the head of the gate on https://jenkins01.openstack.org/job/gate-grenade-devstack-vm/15750/console | 21:32 |
portante | I see that | 21:32 |
jog0 | fungi: :(, will that self correct | 21:32 |
*** hashar has quit IRC | 21:33 | |
jog0 | portante: see status.openstack.org/elastic-recheck/ for better numbers | 21:33 |
fungi | so this is probably a good time to dump the zuul state and start with fresh repos jeblair? or were you still prepping? | 21:33 |
notmyname | jog0: bug 1243973 doesn't show up there | 21:33 |
uvirtbot | Launchpad bug 1243973 in swift "Simultaneous PUT requests for the same account or container causes server error response" [Undecided,Fix committed] https://launchpad.net/bugs/1243973 | 21:34 |
*** pcm_ has quit IRC | 21:34 | |
jeblair | fungi: agreed, i'll save the queue now | 21:34 |
portante | but it would only show up if folks triaged to the point that they found this problem, and most find other jobs failing that are known issues, this one just did not rise to the top | 21:34 |
jog0 | notmyname: ahh if you make a query you can get it added | 21:35 |
portante | It may be just me, but I don't see that as a rush job, an incomplete fix, but not a rush job | 21:35 |
jeblair | okay, queue saved | 21:36 |
jeblair | stopping zuul now | 21:36 |
* portante back in a bit | 21:36 | |
jeblair | removing zuul git repos | 21:37 |
jeblair | fungi, clarkb: want to kill some jenkins jobs? | 21:38 |
fungi | on it | 21:38 |
clarkb | jeblair: yup I will start on 02 | 21:38 |
fungi | i'm sworking up from the bottom on 01 | 21:38 |
fungi | sworking. snark | 21:38 |
fungi | load average on zuul has dropped like a stone too | 21:39 |
fungi | jenkins01 is all clear | 21:41 |
clarkb | jeblair: 02 is clear | 21:42 |
clarkb | jeblair: but there are a bunch of stuck jobs there | 21:42 |
clarkb | maybe we should restart 02 too? | 21:42 |
clarkb | jeblair: how did you unstick the jobs that were stuck yesterday? | 21:43 |
jeblair | clarkb: ok let's restart 02 | 21:43 |
jeblair | clarkb: i deleted their nodes in jenkins, then nodepool | 21:43 |
clarkb | jeblair: can you delete them when the jobs are running? | 21:43 |
* clarkb tries this | 21:43 | |
jeblair | clarkb: yes | 21:43 |
jeblair | nodepool can't though, so i had to do it through the jenkins webui | 21:44 |
fungi | shall i generate a quick list of all nodepool nodes associated with jenkins02 and ask nodepool to start deleting those? | 21:44 |
jeblair | fungi: i think we only need to worry about the problem ones | 21:44 |
fungi | okay | 21:45 |
clarkb | I am starting at the top of the list if you want to work bottom up | 21:45 |
fungi | clarkb: which was the first one you deleted? there were 7 but by the time i started opening windows for them there were only 6 | 21:45 |
clarkb | 708541 and 708544 are being deleted by me | 21:46 |
clarkb | now 708545 | 21:46 |
jeblair | 708547 | 21:46 |
fungi | oh, i had those. maybe one deleted on its own | 21:46 |
jeblair | 708552 | 21:47 |
*** rongze has joined #openstack-infra | 21:47 | |
openstackgerrit | Solly Ross proposed a change to openstack-infra/os-loganalyze: Support Setting The Path Using an ENV Variable https://review.openstack.org/57783 | 21:48 |
openstackgerrit | Solly Ross proposed a change to openstack-infra/os-loganalyze: Introduce Generic Parsing/Filtering Framework https://review.openstack.org/57784 | 21:48 |
openstackgerrit | Solly Ross proposed a change to openstack-infra/os-loganalyze: Introduce Console Version, Move Common Code https://review.openstack.org/57785 | 21:48 |
openstackgerrit | Solly Ross proposed a change to openstack-infra/os-loganalyze: Use JS to Up-Filter https://review.openstack.org/57786 | 21:48 |
clarkb | 02 looks clean now | 21:48 |
clarkb | should I restart it? | 21:48 |
jeblair | clarkb: wait a sec | 21:48 |
clarkb | k | 21:48 |
jeblair | nodepool may still be trying to delete nodes, i want to give it a chance to finish | 21:49 |
fungi | it is | 21:49 |
fungi | some of them were already deleted according to the tracebacks it gave me | 21:49 |
fungi | but i also only explicitly deleted the ones which were being removed by hand in jenkins | 21:50 |
jeblair | ok, the only nodes left on jenkins02 that are offline were ones it failed to delete from a long time ago | 21:50 |
fungi | looks like it thinks there are still over 100 associated with jenkins02 | 21:51 |
jeblair | clarkb: i believe it's safe to restart jenkins02 now | 21:51 |
*** mihgen has quit IRC | 21:51 | |
clarkb | ok shutting it down now then will start again | 21:51 |
jeblair | fungi: yeah, fixing the nodepool cleanup thread is next on my list | 21:51 |
clarkb | its starting now | 21:51 |
fungi | jeblair: did you see my change from a couple days ago? | 21:51 |
jeblair | fungi: no! | 21:52 |
fungi | jeblair: https://jenkins02.openstack.org/ | 21:52 |
fungi | er, that's not the right url :/ | 21:52 |
jeblair | https://review.openstack.org/#/c/57364/ | 21:52 |
*** rongze has quit IRC | 21:52 | |
fungi | that one, yes | 21:52 |
fungi | no idea if that's what you had in mind | 21:52 |
*** CaptTofu has quit IRC | 21:52 | |
jeblair | fungi: that's exactly what i had in mind; any chance that's tested? | 21:52 |
fungi | i would say there's probably very close to 0 chance that's tested | 21:53 |
*** CaptTofu has joined #openstack-infra | 21:53 | |
jeblair | fungi: i'm willing to test that in production. :) | 21:53 |
fungi | should i just spin up a local nodepool and then... i'm not sure on how to make it block the thread | 21:54 |
fungi | okay | 21:54 |
fungi | i'm guessing i'd have to run that under a debugger to properly test it | 21:54 |
* clarkb looks at it | 21:54 | |
clarkb | jeblair: fungi: I am happy giving that a shot as well | 21:55 |
clarkb | its simple and I am pretty confident it can't make the problem worse | 21:55 |
fungi | oh, right, i wouldn't simulate the thread deadlock issue, that's a symptom, not what we were solving (only just looked back at the change myself) | 21:56 |
jeblair | clarkb: jenkins02 up? | 21:56 |
clarkb | jeblair: looks like it | 21:56 |
jeblair | starting zuul | 21:56 |
clarkb | I can get the web ui | 21:56 |
*** mfer has quit IRC | 21:56 | |
jeblair | now we wait for it to clone all the things | 21:56 |
jeblair | oh right, we clone on demand now, don't we? | 21:57 |
jeblair | oh, no there's still a clone everything step on startup | 21:57 |
fungi | at least it's giving nodepool a chance to stock back up for the coming onslaught | 21:58 |
*** svarnau has quit IRC | 21:59 | |
clarkb | jeblair: https://review.openstack.org/#/c/52689/ and its child may interest you | 21:59 |
*** jhesketh has joined #openstack-infra | 21:59 | |
jeblair | clarkb: very much so, thanks :) | 22:00 |
jeblair | we have a lot of repos. | 22:01 |
fungi | and it just keeps increasing | 22:01 |
Shrews | fungi: yep, 6:30 | 22:02 |
jeblair | it's done | 22:02 |
fungi | i see jobs running | 22:02 |
*** julim has quit IRC | 22:03 | |
jeblair | shall i send the reverifies for the gate queue? | 22:03 |
mikal | jeblair: how are you selecting reverifies to send? Or is it just everything approved but unmerged? | 22:04 |
*** lcestari has quit IRC | 22:04 | |
jeblair | mikal: everything that was in the gate queue before i stopped it | 22:04 |
clarkb | wait for it, mikal is going to bribe us to put his changes in first | 22:05 |
openstackgerrit | A change was merged to openstack-infra/nodepool: Skip periodic cleanup if the node is not stale https://review.openstack.org/57364 | 22:05 |
jeblair | yoink | 22:05 |
mikal | clarkb: no bribes | 22:05 |
fungi | jenkins02 seems to be running jobs successfully | 22:05 |
mikal | clarkb: just rage quitting debugging 1251920 unless my demands are met | 22:05 |
clarkb | fungi: yup it has jobs and is happy | 22:05 |
clarkb | mikal: nice. I like the way you swung that around on me | 22:06 |
jeblair | mikal: go aprv your changes now; i'll be slow. | 22:06 |
jeblair | #status ok | 22:06 |
*** ChanServ changes topic to "Discussion of OpenStack Project Infrastructure | Docs http://ci.openstack.org/ | Bugs https://launchpad.net/openstack-ci | Code https://git.openstack.org/cgit/openstack-infra/" | 22:06 | |
mikal | jeblair: heh | 22:06 |
mikal | Its ok | 22:06 |
fungi | it's a race to the gate now | 22:06 |
mikal | I'VE ONLY BEEN WAITING A WEEK FOR THEM TO MERGE | 22:06 |
jeblair | also, i put 5 second sleeps in my script | 22:06 |
mikal | jeblair: seriously though, do you want things to start being approved again? | 22:07 |
mikal | jeblair: or wait a bit for the queue to clear? | 22:07 |
jog0 | mikal: gate is empty | 22:07 |
clarkb | mikal: go for it | 22:07 |
jeblair | mikal: is fine, i think we're all agreed on 'back to normal' now | 22:07 |
fungi | this graph is nothing short of spectacular, btw... (particularly on the weekly version) http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=388&rra_id=all | 22:07 |
jog0 | so who will send the email out? | 22:07 |
fungi | at one point in the past 24 hours, zuul had a 5-minute load average of 143.98 | 22:08 |
jog0 | mikal russellb: ^ | 22:08 |
jeblair | jog0, clarkb: i think one of you should, what with having written most of it. :) | 22:08 |
russellb | fungi: niiiice | 22:08 |
mikal | jog0: I can if you want | 22:08 |
jog0 | jeblair: well we need two emails | 22:08 |
mikal | jog0: or you can take credit, I don't mind | 22:08 |
jeblair | jog0: oh, the all-clear email | 22:08 |
jog0 | mikal: can send out the response to his stop working email | 22:08 |
jog0 | jeblair: yeah | 22:08 |
jog0 | just that for now | 22:08 |
jeblair | mikal, jog0: ++ | 22:09 |
mikal | jog0: ok, doing that now | 22:09 |
jog0 | mikal: thanks | 22:10 |
*** yamahata_ has joined #openstack-infra | 22:10 | |
jog0 | clarkb: are you going to reload the saved jobs? | 22:10 |
clarkb | jog0: jeblair is doing that | 22:10 |
jog0 | clarkb: cool beans | 22:10 |
jeblair | jog0: i am, but i'm being slow about it for some reason :) | 22:10 |
mikal | Done | 22:11 |
jog0 | mikal: thanks | 22:12 |
mikal | jog0: np | 22:12 |
mikal | jog0: thanks for being awesome and stuff | 22:12 |
jeblair | oh, there we go, now my script is running. :) | 22:12 |
jog0 | mikal: any time | 22:12 |
*** fifieldt has joined #openstack-infra | 22:12 | |
jeblair | clarkb, fungi: i'm manually nodepool-deleting a bunch of nodes in the building state for 48 hours | 22:13 |
jog0 | so we aren't worried about grenade timeout anymore | 22:13 |
clarkb | jog0: I think we should keep an eye on it but the zuul recloning should fix that for now | 22:13 |
jog0 | cool | 22:13 |
jeblair | clarkb, fungi: that should help with nodepool responsiveness (until we restart with fungi's patch) | 22:13 |
*** eharney has joined #openstack-infra | 22:14 | |
pabelanger | fungi, gebus. Surprised the system didn't tip over | 22:14 |
jog0 | mikal: https://review.openstack.org/#/c/55605/ | 22:15 |
jog0 | failed pep8 :) | 22:15 |
*** mfer has joined #openstack-infra | 22:15 | |
jog0 | mikal: good thing zuul scheduler can deal with that | 22:16 |
mikal | jog0: but how? I passed it in check | 22:16 |
jog0 | mikal: my guess is the config file stuff | 22:16 |
jog0 | not sure | 22:16 |
jeblair | 2013-11-21 22:14:25.282 | E: nova.conf.sample is not up to date, please run tools/config/generate_sample.sh | 22:16 |
mikal | jog0: oh yeah, that stupid generator | 22:17 |
jog0 | https://jenkins01.openstack.org/job/gate-nova-pep8/12659/console | 22:17 |
*** changbl has quit IRC | 22:17 | |
jog0 | mikal: yeah :/ | 22:17 |
jog0 | thats hit me a bunch too | 22:17 |
mikal | Ugh | 22:17 |
mikal | That's probably going to break all my patches | 22:17 |
mikal | Dammit | 22:17 |
mikal | Yeah, it is the config file thing | 22:17 |
jeblair | queue reload and nodepool deletes both finished | 22:19 |
mikal | jeblair: have you guys had a try of the gate on performance flavours yet? | 22:19 |
mikal | jeblair: because I'd be super interested in results | 22:19 |
mordred | mikal: no | 22:19 |
clarkb | mikal: no, mordred was writing changes to support that | 22:19 |
mikal | Cool. | 22:19 |
mordred | mikal: I put up half of the change needed, haven't done the other half | 22:19 |
mordred | jeblair: there is a nodepool patch up to add the capability for it | 22:20 |
mikal | mordred: is it possible to move a portion of the pool in a region to a different flavor, or is it all or nothing? | 22:20 |
*** dolphm has quit IRC | 22:20 | |
mordred | it is possible to trial balloon things for sure, which is what we'll do | 22:20 |
mordred | mikal: we already only have a couple of rax nodes in the pool because they're slow | 22:20 |
mikal | jeblair: https://review.openstack.org/#/c/56118/ is a small review waiting for a comment from you when you have some idle cycles | 22:20 |
mordred | so if we just move those couple of things, we should be able to test | 22:20 |
mikal | mordred: oh, cool | 22:20 |
*** mrodden has quit IRC | 22:20 | |
mikal | mordred: so the move would be small (in terms of instance count) | 22:21 |
pabelanger | How long does it take to bootstrap a node? | 22:21 |
mordred | yes | 22:21 |
clarkb | pabelanger: image builds take ~20 minutes to an hour | 22:21 |
clarkb | pabelanger: booting off of the snapshots of those images is relatively quick, we timeout after 2 minutes | 22:21 |
*** bknudson has joined #openstack-infra | 22:21 | |
mikal | clarkb: https://review.openstack.org/#/c/56158/ is one for you too... I addressed your concerns and then ignored it during the recent unpleasantness | 22:21 |
clarkb | mordred: I think we should try and get new hpcloud region in today | 22:21 |
mordred | clarkb: I agree | 22:22 |
pabelanger | clarkb, how often are snapshots built? | 22:22 |
bknudson | it would be nice to get this keystoneclient change merged quickly because it's breaking keystone -- https://review.openstack.org/#/c/57583/ | 22:22 |
clarkb | pabelanger: daily per cloud region/az | 22:22 |
clarkb | mordred: do you have time today to address my -1 of the nodepool config change? | 22:22 |
clarkb | mordred: I can just fix it if you don't | 22:22 |
bknudson | never mind... I see it's in the queue already. | 22:23 |
mordred | clarkb: can you just fix it? I'm in a stupid meeting | 22:23 |
clarkb | mordred: sure | 22:23 |
pabelanger | bknudson, interesting concept, somehow adding a priority flag into Zuul. | 22:23 |
bknudson | just let me "bump" it. | 22:23 |
mordred | clarkb: actually, hold off on the new hp region | 22:24 |
clarkb | mordred: oh? should I still fix the nodepool config change? | 22:24 |
mordred | clarkb: yeah - but let's hold off on putting it through | 22:24 |
clarkb | ok | 22:24 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Add new HP Cloud region https://review.openstack.org/56260 | 22:25 |
clarkb | mordred: ^ is the updated config | 22:25 |
mordred | clarkb: yes | 22:26 |
openstackgerrit | Tim Daly, Jr. proposed a change to openstack-infra/config: Make python33 check for tomograph non-voting https://review.openstack.org/57573 | 22:26 |
*** ken1ohmichi has joined #openstack-infra | 22:27 | |
*** mrodden has joined #openstack-infra | 22:30 | |
*** ^d has quit IRC | 22:30 | |
*** loq_mac has joined #openstack-infra | 22:30 | |
*** fbo is now known as fbo_away | 22:30 | |
*** mfer has quit IRC | 22:30 | |
*** ^d has joined #openstack-infra | 22:30 | |
jeblair | mikal: commented, thx. | 22:32 |
jeblair | i have attempted to review patches today. i reviewed some that were approx 1 month old, so it may be a bit before i'm regularly reviewing recent patches. | 22:32 |
mordred | :) | 22:32 |
jeblair | zuul load is not obscene | 22:33 |
clarkb | I am so far behind on reviews, I have hope that next week will be quiet with the holiday and I can find good chunks of review time | 22:33 |
*** dkranz has quit IRC | 22:34 | |
*** jhesketh has quit IRC | 22:35 | |
*** thomasem has quit IRC | 22:36 | |
*** jhesketh has joined #openstack-infra | 22:36 | |
*** ryanpetrello has quit IRC | 22:37 | |
*** michchap has quit IRC | 22:39 | |
jog0 | gate is looking pretty decent at the moment | 22:40 |
jog0 | merging a patch now :) | 22:40 |
*** markmcclain has quit IRC | 22:40 | |
jog0 | mikal: its one of yours | 22:40 |
clarkb | jeblair: for the zuul refs, should we have a monthly job that kills all refs older than a month? | 22:41 |
mikal | jog0: yay! | 22:41 |
clarkb | and pair that with a gc probably | 22:41 |
mordred | clarkb: I was going to give you a hard time about all of my patches that need reviews | 22:41 |
*** michchap has joined #openstack-infra | 22:42 | |
mikal | jeblair: if you're really behind do you love or hate the idea of people highlighting ones which are blocking on specifically you? | 22:42 |
mordred | clarkb: but then you worked on fixing the gate and I didn't | 22:42 |
mordred | clarkb: so, I'll keep my mouth shut | 22:42 |
clarkb | mikal: I like it, if I am specifically requested to review a thing I try to get to it because I know someone is following along | 22:42 |
mordred | clarkb: is gate ok enough that I can recheck patches of mine that were trapped by earlier gate break? | 22:43 |
clarkb | mikal: a lot of reviews seem to go in a black hole so I like it when the chance of that happening is low | 22:43 |
clarkb | mordred: I think so | 22:43 |
mordred | clarkb: ok. is it ok to recheck no bug them? I know that they are pre-gate-fix blockages that I just left alone | 22:43 |
mordred | (these are the pbr integration script patches) | 22:43 |
clarkb | mordred: I think so, I don't think we can get away from that until after the backlog has caught up | 22:44 |
mordred | clarkb: ok. cool. thank you. | 22:44 |
mordred | and thank you to everyone who worked on this - I'm sorry I was AFK for all of it | 22:44 |
mikal | clarkb: cool | 22:45 |
mikal | I ask because in nova land people tend to avoid requesting reviews from people specifically | 22:45 |
mikal | It's seen as rude in 99% of cases | 22:45 |
clarkb | mikal: I have also been known to say I will get to it when I can :) | 22:46 |
clarkb | but it fits into my starring things review pattern really well | 22:46 |
mgagne | zaro: ping | 22:47 |
mordred | mikal: we tend to have a STUPIDLY HIGH amount of patches - so help is always welcome | 22:47 |
*** rongze has joined #openstack-infra | 22:48 | |
jog0 | mordred: so it took us 26 hours to unbreak things | 22:49 |
jog0 | even with drastic measures | 22:49 |
jog0 | just a fun fact | 22:49 |
mordred | jog0: I'm impressed that you fixed things in 26 hours, tbh | 22:50 |
jog0 | mordred: we got 6 bugs fixed | 22:50 |
jog0 | and mordred the top two bugs on our gate failing list | 22:50 |
jog0 | so not too bad | 22:51 |
jog0 | why is horizon doc hanging | 22:51 |
jog0 | https://jenkins02.openstack.org/job/gate-horizon-docs/1439/ | 22:51 |
jog0 | ohh it's just slow | 22:52 |
*** dcramer_ has quit IRC | 22:52 | |
*** jhesketh has quit IRC | 22:53 | |
*** jhesketh has joined #openstack-infra | 22:54 | |
*** rongze has quit IRC | 22:54 | |
*** loq_mac has quit IRC | 22:56 | |
zaro | mgagne: sup? | 22:58 |
mgagne | zaro: sorry, found it ^^' | 22:59 |
mgagne | zaro: I now have a slave running Windows. Now is time to puppetize this thing. ^^' | 23:01 |
*** thomasem has joined #openstack-infra | 23:01 | |
zaro | mgagne: nice! | 23:02 |
*** eharney has quit IRC | 23:02 | |
*** ^d has quit IRC | 23:02 | |
*** ^d has joined #openstack-infra | 23:02 | |
*** weshay has quit IRC | 23:04 | |
*** dizquierdo has quit IRC | 23:06 | |
*** bpokorny has quit IRC | 23:10 | |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Setup a private gerrit instance for security reviews https://review.openstack.org/47937 | 23:10 |
zaro | fungi: ^ | 23:10 |
jeblair | i restarted nodepool | 23:11 |
fungi | zaro: awesome. i'm about to disappear for the night (places to be) but will add it to my pile for tomorrow | 23:13 |
*** thomasem has quit IRC | 23:13 | |
zaro | fungi: in review-security.rst, i only documented high levels. I just copied project.config setting from general gerrit. | 23:13 |
jgriffith | jeblair: jog0 in the spirit of discontinuing "reverify no bug" do we want to log new bugs for random well understood issues? | 23:14 |
zaro | fungi: I didn't understand how the gerrit groups in the doc map to high level users so i was wondering if we could iterate and refine project.config? | 23:14 |
jgriffith | jeblair: jog0 ie failure to connect to puppet repo | 23:14 |
jgriffith | seems like a no bug is appropriate for now until we have a general category or something | 23:14 |
fungi | jeblair: i'll keep an eye on nodepool for the next few days and see if we witness similar stale node behavior to suggest that maybe it wasn't that race (or additional causes) | 23:14 |
jeblair | jgriffith: yes, and it'll help us know if that's a problem; it can be a ci bug | 23:15 |
jgriffith | jeblair: alright, logging one now | 23:15 |
*** wenlock has quit IRC | 23:15 | |
jeblair | jgriffith: (i think we'll just have to learn to live with some bugs that stay open for a long time because they aren't very actionable) | 23:15 |
jeblair | jgriffith: oh wait | 23:15 |
jgriffith | agreed | 23:15 |
jgriffith | but tracking metrics are always helpful | 23:15 |
jeblair | fungi: didn't someone file a bug about that? | 23:15 |
jeblair | jog0: ? ^ | 23:16 |
jgriffith | jeblair: fungi I haven't searched yet... gimmie a sec | 23:16 |
fungi | zaro: yeah, more details on project.config would be great | 23:16 |
clarkb | jgriffith: jeblair: what is the error? | 23:16 |
fungi | jeblair: jgriffith: yeah, jog0 just filed it a couple hours ago | 23:16 |
jeblair | jgriffith: i think there was a suggestion that we remove some apt repos before snapshotting on devstack runs, so actually, that's a pretty actionable bug. :) | 23:16 |
* fungi finds | 23:17 | |
jeblair | clarkb: i think 'apt-get update' fails because we have a puppet apt repo, which is less reliable than the rax mirror we use for the os | 23:17 |
jgriffith | jeblair: coolio | 23:17 |
jgriffith | jeblair: that's correct... in the grenade test | 23:17 |
clarkb | jgriffith: gotcha | 23:17 |
fungi | jgriffith: jeblair: bug 1253774 | 23:17 |
uvirtbot | Launchpad bug 1253774 in openstack-ci "Reduce number of apt sources that must be up for gate to work" [Medium,Confirmed] https://launchpad.net/bugs/1253774 | 23:17 |
jgriffith | http://logs.openstack.org/68/54068/23/gate/gate-grenade-devstack-vm/e036eb6/console.html#_2013-11-21_20_19_18_749 | 23:17 |
clarkb | should the nodepool image build scripts remove that apt repo as one of the last things it does? | 23:17 |
jeblair | that's it! | 23:17 |
jgriffith | fungi: nice... I'll log it against that one | 23:17 |
jgriffith | fungi: jeblair clarkb thanks! | 23:18 |
jeblair | clarkb: i think that's the way i'd do it. | 23:18 |
*** krtaylor has quit IRC | 23:18 | |
fungi | clarkb: you should suggest that in the bug ;) | 23:18 |
clarkb | fungi: I will | 23:18 |
jeblair | fungi: i think you already did suggest that? :) | 23:18 |
fungi | yes, i did | 23:19 |
jeblair | fungi, clarkb: good ideas all around! :) | 23:19 |
fungi | agreement in the bug is just as good though | 23:19 |
* fungi really disappears now. can't keep Shrews and olaph waiting | 23:19 | |
clarkb | oh heh I read the bug now :) | 23:20 |
*** mrodden has quit IRC | 23:24 | |
*** reed has quit IRC | 23:24 | |
*** mrodden has joined #openstack-infra | 23:25 | |
*** yamahata_ has quit IRC | 23:25 | |
*** CaptTofu has quit IRC | 23:25 | |
*** CaptTofu has joined #openstack-infra | 23:26 | |
*** herndon has quit IRC | 23:27 | |
*** lchen has quit IRC | 23:29 | |
*** CaptTofu_ has joined #openstack-infra | 23:30 | |
*** CaptTofu has quit IRC | 23:31 | |
*** wenlock has joined #openstack-infra | 23:32 | |
clarkb | I haven't seen any obvious grenade failures | 23:32 |
*** changbl has joined #openstack-infra | 23:35 | |
jeblair | nodepool started its periodic cleanup at 2315, and hasn't hit an exception or stopped yet | 23:36 |
clarkb | jeblair: I have been thinking about removing git zuul refs, is the best way to do that with a find -mtime then a git gc? | 23:39 |
clarkb | er find -mtime delete? | 23:40 |
jog0 | jgriffith: thanks for getting on board with the new world view | 23:41 |
*** sarob has quit IRC | 23:44 | |
*** sarob has joined #openstack-infra | 23:45 | |
jeblair | clarkb: er, it's slightly more complicated, and i'm not sure it's safe to do while zuul is running | 23:45 |
jeblair | clarkb: i'll try to dig up my script | 23:45 |
clarkb | jeblair: is it more complicated because the refs themselves can get packed? | 23:46 |
*** atiwari has quit IRC | 23:49 | |
*** gyee has quit IRC | 23:49 | |
*** sarob has quit IRC | 23:49 | |
*** gyee has joined #openstack-infra | 23:50 | |
*** rongze has joined #openstack-infra | 23:50 | |
*** atiwari has joined #openstack-infra | 23:50 | |
*** jgrimm has quit IRC | 23:51 | |
*** jhesketh has quit IRC | 23:53 | |
clarkb | lifeless: https://jenkins02.openstack.org/job/gate-neutron-python26/3379/console the html output conversion for neutron is taking forever under python26 (it takes 10 minutes on python27 which is pretty bad too). The subunit files are really large, any thoughts on making it go quicker? | 23:53 |
clarkb | hmm are we not converting to subunitv2 anymore /me checks | 23:53 |
lifeless | get me a copy of them ? | 23:53 |
*** rcleere has quit IRC | 23:54 | |
*** dkliban_ has quit IRC | 23:54 | |
*** rnirmal has quit IRC | 23:54 | |
clarkb | lifeless: nevermind we aren't converting to v2 first | 23:54 |
*** pete5 has quit IRC | 23:54 | |
*** ken1ohmichi has quit IRC | 23:54 | |
clarkb | lifeless: that code got disappeared somehow, I will correct it (v1 going through v2 parser is slow and that is a known problem) | 23:54 |
clarkb | hmm did that change never get merged? | 23:55 |
clarkb | gah it happened when we moved to run-unittest.sh | 23:56 |
*** rongze has quit IRC | 23:56 | |
*** michchap has quit IRC | 23:58 | |
*** michchap has joined #openstack-infra | 23:58 |
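
Editor's note on the zuul ref cleanup discussed at 22:41 and 23:39-23:46: clarkb's packed-refs question points at why a plain `find -mtime -delete` may not be enough, since refs stored in `packed-refs` have no loose file to find. The following is a minimal sketch of one alternative, not jeblair's script (which is not shown in this log). It assumes the refs live under `refs/zuul`, uses the commit's committer date as a stand-in for the ref's age, and does nothing to address jeblair's concern about running while zuul is active. The function names are hypothetical.

```python
import subprocess
import time

# Assumption: Zuul's refs live under this namespace.
ZUUL_REF_PREFIX = "refs/zuul"


def parse_ref_lines(output):
    """Parse 'refname unix-timestamp' lines into (refname, timestamp) pairs."""
    refs = []
    for line in output.splitlines():
        name, _, stamp = line.strip().rpartition(" ")
        # Skip blank lines and refs that carry no committer date.
        if not name or not stamp.isdigit():
            continue
        refs.append((name, int(stamp)))
    return refs


def select_stale(refs, now, max_age_days=30):
    """Return the names of refs whose timestamp is older than the cutoff."""
    cutoff = now - max_age_days * 86400
    return [name for name, stamp in refs if stamp < cutoff]


def list_refs(repo):
    """List refs via git itself, so loose and packed refs are both seen."""
    out = subprocess.check_output(
        ["git", "for-each-ref",
         "--format=%(refname) %(committerdate:unix)", ZUUL_REF_PREFIX],
        cwd=repo, text=True)
    return parse_ref_lines(out)


def cleanup(repo, max_age_days=30):
    """Delete stale refs with update-ref, then garbage-collect the repo."""
    stale = select_stale(list_refs(repo), time.time(), max_age_days)
    for name in stale:
        subprocess.check_call(["git", "update-ref", "-d", name], cwd=repo)
    subprocess.check_call(["git", "gc"], cwd=repo)
    return stale
```

Going through `git for-each-ref` and `git update-ref -d` rather than the filesystem lets git handle loose and packed refs uniformly; whether committer date is a good enough proxy for ref age in a zuul repository is an open assumption here.
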