* fungi has no idea if twitter acts as an rss feed aggregator | 00:00 | |
jeblair | infra-root: if you are interested in supporting statusbot's twitter integration, please fix it; otherwise i think we should disable it in our configuration. | 00:00 |
---|---|---|
mtreinish | fungi: is the script to do that in python? You can just steal the rss bits from o-h api server pretty easily | 00:01 |
fungi | i would be in favor of reverting it if it seems unstable, since we can't really test it | 00:01 |
mtreinish | whatever lib we used for that wasn't that hard to deal with | 00:01 |
clarkb | I believe it is at least configurable so we can in theory start by removing the config for it | 00:01 |
fungi | mtreinish: yeah, statusbot is in python | 00:01 |
clarkb | there is an unmade rye mule with my name on it downstairs /me is going to pop out to get that made but will keep an eye on irc until sleep time | 00:02 |
clarkb | thanks again everyone! | 00:02 |
jeblair | if we see statusbot pop in with a "finished sending ok" then we can assume the error cleared up | 00:03 |
mtreinish | fungi: https://pypi.python.org/pypi/feedgen/ | 00:03 |
fungi | mtreinish: thanks, i'll keep that in mind | 00:04 |
jeblair | clarkb: oh hey i found the error in the log | 00:07 |
jeblair | TwitterError: Text must be less than or equal to 140 characters. | 00:07 |
jeblair | so the split support in statusbot is somehow broken (maybe on 'ok' messages?) | 00:07 |
*** oanson has quit IRC | 00:08 | |
*** oanson has joined #openstack-infra | 00:09 | |
clarkb | fun | 00:09 |
jeblair | that means statusbot should be back in its normal state, it just didn't send you the ok | 00:09 |
jeblair | #status log please avoid merging new project creation changes until after we have the git backends puppeting properly | 00:10 |
openstackstatus | jeblair: finished logging | 00:10 |
jeblair | good | 00:10 |
prometheanfire | woo :D | 00:10 |
jeblair | still, we should either fix that bug or turn it off | 00:11 |
clarkb | also we'll need to watch melody but that should be it for restarting gerrit due to memory cache leaks iirc | 00:11 |
*** vhosakot has quit IRC | 00:11 | |
openstackgerrit | Matthew Thode proposed openstack/diskimage-builder master: Update Gentoo element for element changes https://review.openstack.org/503844 | 00:11 |
jeblair | (the twitter feed doesn't have the status ok, so we are giving people misleading information. i feel pretty strongly about not doing that.) | 00:11 |
*** stakeda has joined #openstack-infra | 00:12 | |
*** csomerville has quit IRC | 00:13 | |
mnaser | can we reapprove changes | 00:16 |
mnaser | ..safely? :> | 00:16 |
clarkb | yes I think so | 00:17 |
clarkb | please let us know if you see any weird behavior but we've sent the ok signal | 00:17 |
clarkb | mnaser: ^ | 00:17 |
mnaser | clarkb will do, i did get one weird quirk | 00:17 |
mnaser | let me try to see if i can repro | 00:18 |
mnaser | i edited a change (with the ui), clicking save changes left it stuck in "working..." , the change submitted but it was stuck on working till i had to f5 | 00:18 |
clarkb | huh | 00:18 |
clarkb | the edit was fine though just ui failed to update? | 00:19 |
mnaser | the edit went through but it looks like the UI got a weird response or something and got stuck in 'working' forever | 00:19 |
mnaser | let me retry | 00:19 |
fungi | improve your gerrit with this one weird quirk | 00:20 |
mordred | clarkb: I also saw a slight weirdness related to abandoning a change that sounds familiar - but it was also doing replication thread spinup - so I didn't pay attention too terribly | 00:20 |
mnaser | okay i can replicate it | 00:20 |
*** chlong has joined #openstack-infra | 00:20 | |
mnaser | https://review.openstack.org/#/c/504361/ -- click on "Commit Message" in the files | 00:20 |
mnaser | click the "edit" icon, change commit message | 00:21 |
mnaser | hit save, close, click "publish edit" | 00:21 |
mnaser | the ui gets stuck on loading | 00:21 |
clarkb | mnaser: woo, though that sounds like the type of bug we document and live with since you can always refresh or edit locally (it's fun how our mass of users can find all the things so much better than I can though :) ) | 00:21 |
mnaser | oh yeah i'm just lazy and dont want to find the branch where i have something to push back up :p | 00:22 |
clarkb | though worth looking into as it may indicate something wrong with our proxy | 00:22 |
mordred | fungi, jeblair: sorry - I was afk for a few - I'll fix the twitter integration | 00:22 |
*** caphrim007 has quit IRC | 00:23 | |
*** caphrim007 has joined #openstack-infra | 00:23 | |
mnaser | let me check if i get any console errors | 00:23 |
mnaser | or 5xx | 00:23 |
mnaser | so it looks like it actually even puts the change in this weird state | 00:25 |
mnaser | where we have change 1, "edit" then 2 | 00:25 |
*** askb has joined #openstack-infra | 00:26 | |
jeblair | mordred: cool thx; you can find the traceback in the 2017-09-18 log | 00:26 |
jeblair | mordred: (also, if you have the password for that account handy, maybe manually tweet the 'ok' for anyone following?) | 00:26 |
*** thorst has joined #openstack-infra | 00:26 | |
ianw | hmm, i just got a 502 error. might be glitch in the matrix | 00:26 |
*** caphrim007 has quit IRC | 00:28 | |
*** mat128 has joined #openstack-infra | 00:28 | |
*** rossella_s has quit IRC | 00:29 | |
*** gcb has joined #openstack-infra | 00:30 | |
*** Swami has quit IRC | 00:30 | |
ianw | and again | 00:30 |
clarkb | on a particular url? | 00:31 |
ianw | just navigating around doing some devstack reviews | 00:32 |
ianw | https://review.openstack.org/#/c/501892/2/functions-common was the one just now | 00:32 |
clarkb | huh I wonder if that is apache not seeing gerrit respond fast enough? | 00:32 |
*** rossella_s has joined #openstack-infra | 00:32 | |
clarkb | also that content gets heavily cached iirc so could be that we have to load more up to avoid timeouts? | 00:32 |
ianw | gerrit-ssl-error.log:[Tue Sep 19 00:30:42.134614 2017] [proxy:error] [pid 118439:tid 139970131998464] [client 122.106.204.156:37824] AH00898: Error reading from remote server returned by /changes/501892/edit, referer: https://review.openstack.org/ | 00:33 |
ianw | that's me | 00:33 |
clarkb | did you try editing the file? | 00:34 |
ianw | no, just pressed "up" on the review to go back to the main review | 00:34 |
clarkb | weird | 00:34 |
clarkb | I wonder if related to mnasers thing | 00:34 |
ianw | i guess the other was | 00:35 |
ianw | gerrit-ssl-error.log:[Tue Sep 19 00:25:42.591561 2017] [proxy:error] [pid 22646:tid 139970140391168] [client 122.106.204.156:37642] AH00898: Error reading from remote server returned by /changes/457963/revisions/4/drafts, referer: https://review.openstack.org/ | 00:35 |
clarkb | and drafts is another edit feature | 00:36 |
clarkb | maybe thats just completely derp in 2.13? | 00:36 |
*** rossella_s has quit IRC | 00:37 | |
ianw | i'm not trying to edit/draft etc ... | 00:37 |
clarkb | It may be trying to check if you have any outstanding drafts/edits? | 00:38 |
ianw | nothing in the gerrit logs of interest around 00:25:42 i can see | 00:38 |
ianw | yeah, firebug shows the call | 00:39 |
*** rossella_s has joined #openstack-infra | 00:39 | |
ianw | a GET to https://review.openstack.org/changes/501892/revisions/06c83af151e39a6f98d0bfe7b3b7cd08921b587f/drafts | 00:40 |
clarkb | probably first thing to try is increase apache timeout | 00:40 |
clarkb | I am also hoping that as caches grow it becomes less of a problem | 00:41 |
*** xarses_ has joined #openstack-infra | 00:42 | |
clarkb | ianw: I think there is nothing in the gerrit log because it didnt error. apache just timed out to the backend | 00:42 |
ianw | yeah, probably right | 00:42 |
ianw | from the error logs, i am also the only one seeing it | 00:43 |
ianw | although probably just very quiet | 00:43 |
clarkb | I havent been able to reproduce from my phone | 00:44 |
clarkb | perhaps rtt is playing a role? | 00:44 |
openstackgerrit | Monty Taylor proposed openstack-infra/statusbot master: Let the python-twitter library handle message splitting https://review.openstack.org/504980 | 00:46 |
mordred | jeblair, clarkb: ^^ that should fix it | 00:46 |
mordred | I'm not sure I have the password handy - looking | 00:46 |
ianw | clarkb: -ETOOFARAWAY | 00:46 |
*** LindaWang has joined #openstack-infra | 00:47 | |
ianw | clarkb: keep an eye ... should i pre-prepare a timeout change just in case as (.au) day goes on we get more reports ? | 00:47 |
clarkb | ianw: I think that is a good idea but puppet isnt running there yet so would have to hand apply for now anyways | 00:48 |
clarkb | but a change tracking it would be good | 00:48 |
ianw | oh right, yeah that was left off | 00:49 |
mordred | clarkb, jeblair: status tweets sent | 00:54 |
*** tiswanso has joined #openstack-infra | 00:58 | |
ianw | clarkb: hmm, proxytimeout should == timeout == 300 | 00:59 |
clarkb | apache docs say default for timeout is 60 did we override? | 01:01 |
ianw | root@review:/var/log/apache2# cat /etc/apache2/apache2.conf | grep '^Timeout' | 01:04 |
ianw | Timeout 300 | 01:04 |
clarkb | must be debuntu | 01:05 |
*** askb has quit IRC | 01:05 | |
ianw | shouldn't need any keepalive stuff | 01:06 |
*** aeng_ has joined #openstack-infra | 01:08 | |
*** aeng has quit IRC | 01:08 | |
ianw | ok, someone other than me would have seen something; from Minneapolis Minnesota (according to maxmind) "Error reading from remote server returned by /changes/" | 01:12 |
ianw | so presumably rtt is not too much of an issue there | 01:12 |
clarkb | ya that should be fairly close | 01:12 |
*** Sukhdev has quit IRC | 01:12 | |
clarkb | melody looks ok | 01:13 |
*** jdandrea_ has joined #openstack-infra | 01:13 | |
clarkb | cpu use might be a little higher | 01:13 |
*** mixos has joined #openstack-infra | 01:16 | |
clarkb | I've got while curl https://review.openstack.org/changes/ ; do echo "next" ; done running in a loop | 01:16 |
clarkb | has not failed yet | 01:16 |
ianw | not the first time we've had trusty-era apache proxy issues :/ | 01:16 |
clarkb | ya | 01:17 |
*** Apoorva_ has joined #openstack-infra | 01:17 | |
clarkb | probably bumps up priority of upgrading to xenial if it persists | 01:17 |
clarkb | (and cache warming doesn't help) | 01:17 |
ianw | yeah ... in this case connection pooling etc don't seem to be likely candidates | 01:18 |
clarkb | my curl has likely run a few hundred times now | 01:18 |
ianw | yeah, i'm tailing the error log to see too | 01:18 |
*** tiswanso has quit IRC | 01:19 | |
*** tiswanso has joined #openstack-infra | 01:20 | |
clarkb | we could also consider using a different proxy in an effort to debug (just have it on port 4443 or something) | 01:20 |
*** Apoorva has quit IRC | 01:20 | |
clarkb | haproxy or nginx etc | 01:20 |
clarkb | I've stopped my loop as it had no errors | 01:21 |
clarkb | I don't think this is a catastrophic problem | 01:21 |
*** cshastri has joined #openstack-infra | 01:21 | |
clarkb | let's keep an eye on it and consider further debug steps but for now I really need to take a break (been a long day) and spend time with family | 01:21 |
*** rtjure has quit IRC | 01:21 | |
clarkb | ping if something urgent comes up and I will check in periodically | 01:21 |
*** Apoorva_ has quit IRC | 01:21 | |
*** sshnaidm has joined #openstack-infra | 01:22 | |
*** cuongnv has joined #openstack-infra | 01:22 | |
*** apetrich has quit IRC | 01:23 | |
ianw | np, will keep an eye. thanks! | 01:23 |
*** apetrich has joined #openstack-infra | 01:25 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Set repo setup release in playbook https://review.openstack.org/504939 | 01:25 |
*** rtjure has joined #openstack-infra | 01:26 | |
*** thorst has quit IRC | 01:27 | |
*** yamahata has quit IRC | 01:29 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Actually sort mount-point list https://review.openstack.org/504819 | 01:30 |
*** liujiong has joined #openstack-infra | 01:39 | |
fungi | heading to bed soon myself, but will attempt to jump on problems from scrollback when i wake up, if any | 01:45 |
*** ijw has quit IRC | 01:51 | |
*** mixos has quit IRC | 01:54 | |
*** hongbin has joined #openstack-infra | 01:56 | |
*** e0ne has joined #openstack-infra | 01:58 | |
*** e0ne has quit IRC | 02:02 | |
*** chlong has quit IRC | 02:03 | |
*** aeng_ is now known as aeng | 02:03 | |
*** xarses_ has quit IRC | 02:04 | |
*** lennyb has quit IRC | 02:04 | |
*** baoli has joined #openstack-infra | 02:05 | |
mnaser | gerrit seems slow/choppy right now | 02:05 |
mnaser | getting 502's now in the UI | 02:06 |
*** lennyb has joined #openstack-infra | 02:06 | |
ianw | yeah, they're increasing | 02:08 |
jeblair | cacti says we just had a major outbound traffic spike | 02:10 |
ianw | for a start, something stackalytics does ... both it seems infra and stackalytics-bot-2 is causing a lot of errors | 02:10 |
ianw | it seems to close the connection on gerrit during queries | 02:10 |
jeblair | melody says we're at 3% garbage collector time | 02:12 |
clarkb | ya melody says someone grabbed fuel-lib from gerrit? | 02:12 |
clarkb | wouldnt be surprised if that is related to traffic spike | 02:13 |
ianw | there's a lot of "5000 ms timeout reached for Diff loader in project openstack ... " too | 02:13 |
ianw | there was a bunch of issues but nothing for ~5 mins now | 02:13 |
jeblair | we're also using as much memory as we were the last time we restarted | 02:14 |
clarkb | :/ | 02:14 |
ianw | getting some more now | 02:14 |
*** cody-somerville has joined #openstack-infra | 02:14 | |
jeblair | (we're at 25GB) | 02:14 |
ianw | stackalytics-bot-2 seems to be just broken | 02:15 |
clarkb | ianw: ya it had problems so they just made a second account hence the -2 iirc | 02:15 |
clarkb | I want to say it has the paramiko fail to nicely close ssh connection problem we ran into | 02:16 |
jeblair | the traffic spike started at 01:40 | 02:16 |
ianw | can we tell what's happening by thread-id maybe somehow? if they're dying for some reason | 02:17 |
*** kong has joined #openstack-infra | 02:17 | |
*** anticw has joined #openstack-infra | 02:17 | |
*** ihrachys has quit IRC | 02:17 | |
anticw | re: the upgrade of gerrit earlier today ... we seem to be lacking the github/gitweb links; are there plans to restore those? | 02:18 |
mnaser | oooh those have disappeared indeed | 02:18 |
anticw | which are the only things that make gerrit slightly usabel | 02:18 |
anticw | or usable | 02:18 |
mnaser | that might have been missed with the upgrade | 02:19 |
mnaser | https://review-dev.openstack.org/#/c/107956/ | 02:19 |
mnaser | i see it in dev | 02:19 |
jeblair | anticw: ouch, that hurts | 02:19 |
ianw | "Error reading from remote server returned by /monitoring, referer: https://review.openstack.org/monitoring" | 02:20 |
ianw | now that's weird, right ... i mean that's a totally different bit | 02:20 |
clarkb | if its in dev then likely just a config difference we have to sort out | 02:20 |
anticw | jeblair: i use those links to review patches, very useful so it would be nice to have them back | 02:20 |
ianw | does this suggest it's more likely to be apache's fault than gerrits fault? | 02:20 |
clarkb | ianw: ya I have a hunch it is the proxy at least partially at fault | 02:21 |
anticw | review-dev has a bogus cert (i'm sure you know this... just sayin') | 02:22 |
jeblair | ianw, clarkb: really? high gc time, memory use at levels previously associated with gerrit being visibly slow? that seems to point pretty squarely at gerrit | 02:22 |
jeblair | how could apache be contributing to this? | 02:23 |
ianw | jeblair: well i'm just thinking that "/monitoring" is a pretty different end-point ... | 02:23 |
jeblair | ianw: yeah, though it's still served by the same jvm | 02:23 |
clarkb | well also ianw saw it earlier before these things spiked I'm sure it is gerrit too | 02:23 |
*** camunoz has quit IRC | 02:24 | |
jeblair | ianw: it's usually a bit more robust than the rest of gerrit (since it doesn't have as much to do), but when things get really bad, we also lose access to it | 02:24 |
jeblair | we've apparently peaked at 31G of jvm memory | 02:25 |
jeblair | out of a max of 30 | 02:25 |
jeblair | (java math?) | 02:25 |
clarkb | likely, since there is non heap stuff too? | 02:26 |
jeblair | my inclination is to bump us up to 48G in the jvm, and restart. and then hope that (a) that's enough, and (b) that traffic spike was anomalous and somehow responsible. | 02:28 |
clarkb | that seems like a reasonable step to take. we have 60gb on the server total | 02:28 |
jeblair | after that, i'm likely to start saying words like "downgrade" or "upgrade". | 02:29 |
clarkb | I know other users of gerrit use significantly bigger servers we may be finding out why :( | 02:29 |
ianw | ++, fwiw | 02:30 |
*** ramishra has joined #openstack-infra | 02:30 | |
ianw | really odd that nothing logs if threads are dying unexpectedly | 02:31 |
jeblair | ianw: are threads dying unexpectedly? | 02:31 |
SamYaple | clarkb: i asked this over the weekend with no response (PTG and travel and all) | 02:31 |
SamYaple | I want to do an automated build on dockerhub pointed at https://github.com/openstack/loci but would require git permissions to install a hook to allow dockerhub build to trigger | 02:31 |
SamYaple | is this something infra would allow? | 02:31 |
clarkb | I think apache is just closing connections more aggressively since gerrit is slower | 02:32 |
clarkb | and so no proper error on the gerrit side | 02:32 |
mnaser | SamYaple infra currently trying to deal with gerrit upgrade issues so might not be a good time :> | 02:32 |
SamYaple | mnaser: ah cool. ill check back at a less busy time. thanks | 02:32 |
clarkb | or maybe not more aggressively but its basically masking any potential errors that could happen later | 02:32 |
jeblair | where do we keep the jvm command? | 02:32 |
ianw | clarkb: but the error suggests that apache didn't know why the connection stopped. i think there's a separate msg for timeouts | 02:33 |
clarkb | jeblair: its in review_site/bin in the init script iirc | 02:33 |
clarkb | it sources from etc/default I want to say | 02:33 |
jeblair | hrm, /etc/default/gerritcodereview doesn't seem to have a mem limit | 02:34 |
mnaser | i think if things are timing out apache would return a 504, not a 502 :x | 02:34 |
*** dave-mccowan has quit IRC | 02:35 | |
jeblair | i will give the first person who finds where we configure the jvm memory size a cookie | 02:35 |
clarkb | jeblair: looks like it is in gerrit.config? | 02:35 |
clarkb | gerrit.sh is running a git config command to get GERRIT_MEMORY | 02:35 |
clarkb | ya heaplimit in gerrit.config | 02:36 |
jeblair | of course | 02:36 |
jeblair | # container_heaplimit: | 02:36 |
jeblair | ^ that's the extent of our documentation of that param :( | 02:36 |
jeblair | okay, i have modified the config file in place | 02:37 |
* jeblair hands clarkb a cookie | 02:37 | |
jeblair | shall we restart now? | 02:38 |
ianw | i think so, 48gb is enough for anyone | 02:38 |
jeblair | #status notice Gerrit is being restarted to feed its insatiable memory appetite | 02:39 |
openstackstatus | jeblair: sending notice | 02:39 |
ianw | mnaser: AH00898 in particular is what apache thinks about it | 02:39 |
clarkb | wfm | 02:39 |
clarkb | also https://stackoverflow.com/questions/169453/bad-gateway-502-error-with-apache-mod-proxy-and-tomcat | 02:39 |
clarkb | re which 500 error apache should return on a timeout | 02:39 |
clarkb | there is anecdotal evidence at least that 502s are what you'd get | 02:40 |
-openstackstatus- NOTICE: Gerrit is being restarted to feed its insatiable memory appetite | 02:40 | |
clarkb | though 300 seconds seems like plenty | 02:40 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Bump gerrit to 48g heap https://review.openstack.org/504993 | 02:40 |
mnaser | i cant imagine gerrit needing 300s to respond | 02:41 |
clarkb | oh though that says it is the tomcat server timing out | 02:41 |
jeblair | that ^ indicates it's running again | 02:41 |
jeblair | clarkb: so maybe we're seeing jetty timeouts | 02:41 |
*** sshnaidm is now known as sshnaidm|off | 02:41 | |
clarkb | ya | 02:41 |
clarkb | but then I'd expect that to get logged by gerrit but maybe something is breaking that | 02:42 |
openstackstatus | jeblair: finished sending notice | 02:42 |
jeblair | i'm going to go eat my own cookie now | 02:43 |
*** hongbin has quit IRC | 02:43 | |
clarkb | it is interesting that the active thread counts jumped up according to melody, I think that would support the theory that jetty is timing out if all the threads are busy | 02:43 |
ianw | clarkb: http threads? | 02:43 |
clarkb | ianw: ya | 02:43 |
*** chason has joined #openstack-infra | 02:43 | |
clarkb | well melody just says threads | 02:43 |
clarkb | its all of them | 02:43 |
* clarkb looks if it can be more specific | 02:43 | |
clarkb | on the melody page if you click on the thread details thing it drops down a list of all the threads which may show us in the future | 02:44 |
ianw | yeah, a lot of the http ones are in timed_wait | 02:45 |
clarkb | they are labeled HTTP-XX | 02:45 |
clarkb | ianw: they are in idlejobpoll | 02:46 |
*** mwarad has joined #openstack-infra | 02:46 | |
clarkb | which if I am assuming things about named methods that implies those are http threads just waiting for a request? | 02:47 |
ianw | agree | 02:47 |
*** hongbin has joined #openstack-infra | 02:49 | |
*** baoli has quit IRC | 02:51 | |
clarkb | gitweb docs say that gerrit looks by default at /usr/lib/cgi-bin/gitweb.cgi which is there but is a symlink, perhaps it stopped following symlinks? | 02:51 |
clarkb | ianw: have you gotten any 500s since the restart? | 02:53 |
*** eumel8 has joined #openstack-infra | 02:53 | |
clarkb | anticw: its definitely a bug that gitweb doesn't show up anymore, we'll have to look into it. My hunch is that gerrit stopped following symlinks to the gitweb cgi for some reason. But will look into it | 02:53 |
ianw | clarkb: nup, nothing coming up | 02:53 |
ianw | AH01136: Unescaped URL path matched ProxyPass; ignoring unsafe nocanon | 02:54 |
clarkb | SamYaple: I don't think there is a good way to delegate that in github today, at least not the way we have the openstack/ org set up. Monty has been fiddling with github hooks/apps around zuul recently so may have ideas | 02:54 |
ianw | is, that's not new | 02:54 |
clarkb | SamYaple: the way we worked around that with read the docs was having zuul jobs hit the rebuild api url for read the docs projects. Perhaps there is something similar in dockerhub? | 02:55 |
SamYaple | clarkb: i can setup a url that you can POST to in dockerhub that triggers the rebuild, can that be done in a post job? | 02:56 |
SamYaple | that way it auto rebuilds and dockerhub doesnt need to access github | 02:56 |
clarkb | ya that is basically the exact thing we do with read the docs. The only potential gotcha there is authentication. Read the docs lets anyone trigger a rebuild anonymously against their API (I think they rate limit it though) | 02:57 |
SamYaple | hmmm. ok so the trigger would be publicly available | 02:59 |
SamYaple | thats *fine*. it can't hurt things. but its not ideal | 03:00 |
clarkb | at least in zuulv2.5 (what we are currently running), once 3.0 is up you'll be able to provide your own project specific secrets that you can manage and as long as you don't disclose them only zuul will know | 03:00 |
SamYaple | any way to make it secret? (or is this something zuulv3 provides) | 03:00 |
SamYaple | ah cool | 03:00 |
SamYaple | i guess i can make them public for now. it wont hurt. i can regen the trigger later when zuulv3 lands | 03:01 |
SamYaple | do you have a link to the way docs does it? (or at least the project i need to dig into) | 03:01 |
clarkb | SamYaple: https://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/hooks.yaml | 03:03 |
SamYaple | clarkb: sweet! something i know how to do! | 03:03 |
SamYaple | thanks for the pointers. its super handy | 03:03 |
clarkb | no problem | 03:04 |
clarkb | ianw: ok I need to afk again. Kids are screaming at me | 03:04 |
clarkb | will pick this up in the morning | 03:04 |
ianw | clarkb / jeblair : thanks! | 03:05 |
ianw | if nothing else, i can collate any issues if they occur | 03:05 |
SamYaple | so a bit of a more broad question. would it be possible to allow arbitrary hooks to be called in post? that way if keystone merges a commit then in the post job it would call a hook to rebuild loci-keystone in dockerhub? | 03:11 |
SamYaple | i mean i can submit a patchset to project-config right now to do this, but if you let one in... | 03:11 |
anticw | clarkb: thanks, i appreciate it as i use that feature very heavily | 03:24 |
*** xarses_ has joined #openstack-infra | 03:25 | |
masayukig | pabelanger: mtreinish: Is there any update for the 404 for http://mirror.mtl01.inap.openstack.org:8080/registry.npmjs/@gulp-sourcemaps%2fmap-sources ? | 03:25 |
*** thorst has joined #openstack-infra | 03:28 | |
*** rlandy|brb is now known as rlandy | 03:32 | |
*** Sree has joined #openstack-infra | 03:32 | |
*** thorst has quit IRC | 03:33 | |
dmsimard | SamYaple: in zuulv3 there's technically nothing preventing you from putting one of your jobs in the keystone project pipelines | 03:45 |
dmsimard | For better or for worse | 03:46 |
dmsimard | Which might be a problem actually.. | 03:47 |
dmsimard | Hey let me put that failing job in your gate for the lulz | 03:47 |
*** jdandrea_ has quit IRC | 03:51 | |
*** hongbin has quit IRC | 03:51 | |
*** coolsvap has joined #openstack-infra | 03:51 | |
*** gcb has quit IRC | 03:53 | |
SamYaple | dmsimard: right thats sort of what i was getting at | 03:53 |
SamYaple | to be fair, a fire-and-forget task like `curl -XPOST URL` is hard to break... | 03:54 |
SamYaple | especially if you ignore it when it breaks `curl || :`. which i would be ok with | 03:55 |
dmsimard | Right, but the complexity of a job is not a factor in allowing folks to put any job on any project | 03:55 |
dmsimard | I think we're mostly hoping that people behave properly for the time being | 03:55 |
SamYaple | yea thats a zuulv3 thing i dont fully understand yet | 03:56 |
SamYaple | and being realistic, we trust a lot of people to not do silly things that abuse the gate | 03:56 |
SamYaple | and it mostly works out | 03:56 |
dmsimard | SamYaple: for example, here you see we define jobs for zuul-jobs: https://github.com/openstack-infra/project-config/blob/master/zuul.yaml#L961 | 03:57 |
dmsimard | and here I added jobs elsewhere on the same project: https://review.openstack.org/#/c/503806/ | 03:58 |
dmsimard | zuulv3 gives a lot of freedom over what you can do and how you can do it | 03:58 |
SamYaple | oh i see.. hmmm | 03:59 |
SamYaple | i bet most projects wouldn't care/notice a post merge job like that | 03:59 |
SamYaple | infra might care if it fails though | 03:59 |
dmsimard | Surely there are things planned (or already implemented, not 100% familiar with everything in v3) to provide some amount of restrictions | 03:59 |
*** e0ne has joined #openstack-infra | 04:00 | |
dmsimard | Like the concept of trusted/untrusted jobs and projects | 04:00 |
*** ykarel has joined #openstack-infra | 04:01 | |
dmsimard | ianw: btw I addressed your comments in https://review.openstack.org/#/c/504554/ with https://review.openstack.org/#/c/504936/ | 04:01 |
SamYaple | this is all very good info. ive got enough to do some cool stuff right now, so im going to see how far that takes me | 04:01 |
*** ykarel_ has joined #openstack-infra | 04:03 | |
*** aeng has quit IRC | 04:03 | |
*** e0ne has quit IRC | 04:04 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add integration tests for multi-node-bridge role https://review.openstack.org/504789 | 04:04 |
dmsimard | ianw: also note that all those multinode roles are integration tested too, the other reviews are in the topic https://review.openstack.org/#/q/topic:zuulv3-multinode | 04:04 |
*** ykarel has quit IRC | 04:05 | |
ianw | dmsimard: cool, thanks, will take a look. does all look good | 04:06 |
*** tpsilva has quit IRC | 04:09 | |
*** mat128 has quit IRC | 04:10 | |
*** udesale has joined #openstack-infra | 04:16 | |
*** claudiub has joined #openstack-infra | 04:20 | |
*** aeng has joined #openstack-infra | 04:21 | |
*** thorst has joined #openstack-infra | 04:29 | |
ianw | dmsimard: what's with the check for iptables6 in https://review.openstack.org/#/c/504553/ ? | 04:31 |
dmsimard | ianw: I missed that one, will fix tomorrow | 04:33 |
dmsimard | Not sure why I did that, must have misremembered the original script | 04:33 |
dmsimard | PTG was a long week... :) | 04:33 |
*** thorst has quit IRC | 04:34 | |
*** benj_ has quit IRC | 04:34 | |
ianw | dmsimard: that's ok, i tend not to -1 unless i'm sure there's something wrong :) | 04:40 |
*** aeng has quit IRC | 04:50 | |
*** aeng has joined #openstack-infra | 04:51 | |
*** benj_ has joined #openstack-infra | 04:55 | |
*** yamamoto has quit IRC | 04:57 | |
*** yamamoto has joined #openstack-infra | 05:07 | |
*** gildub_ has joined #openstack-infra | 05:08 | |
*** gildub_ has quit IRC | 05:12 | |
*** gildub_ has joined #openstack-infra | 05:17 | |
*** cody-somerville has quit IRC | 05:22 | |
*** zhurong has joined #openstack-infra | 05:24 | |
*** ijw has joined #openstack-infra | 05:25 | |
*** ijw has quit IRC | 05:30 | |
*** thorst has joined #openstack-infra | 05:30 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: Only inject cloud-init in CentOS 7.3 https://review.openstack.org/504850 | 05:32 |
*** thorst has quit IRC | 05:34 | |
*** hichihara has quit IRC | 05:35 | |
*** amotoki_ has joined #openstack-infra | 05:39 | |
*** sshnaidm|off has quit IRC | 05:40 | |
*** lbragstad has quit IRC | 05:44 | |
mordred | dmsimard: in v3 there is TOTALLY something preventing you from putting your project in the keystone project pipeline | 05:49 |
mordred | it's that you can't do that at all ;) | 05:50 |
mordred | dmsimard: an untrusted project can only manipulate its own pipeline | 05:50 |
mordred | dmsimard: only config projects can manipulate other project's pipelines | 05:50 |
*** amotoki__ has joined #openstack-infra | 05:52 | |
*** amotoki_ has quit IRC | 05:52 | |
*** ykarel_ is now known as ykarel | 05:53 | |
*** aeng has quit IRC | 05:53 | |
*** amotoki__ has quit IRC | 05:56 | |
ianw | yolanda / AJaeger: fyi as you come online ... gerrit seems fine. we had a period of memory blow-out and increased 502 errors, leading to https://review.openstack.org/#/c/504993/, but it's all been sane since restart, no errors at all | 05:57 |
*** amotoki_ has joined #openstack-infra | 05:57 | |
*** udesale has quit IRC | 05:59 | |
*** udesale has joined #openstack-infra | 06:00 | |
*** jtomasek has joined #openstack-infra | 06:00 | |
*** mriedem has quit IRC | 06:05 | |
*** jdandrea has quit IRC | 06:05 | |
*** akscram1 has quit IRC | 06:05 | |
*** Jeffrey4l has quit IRC | 06:05 | |
*** Shrews has quit IRC | 06:05 | |
*** afazekas has quit IRC | 06:05 | |
*** mdrabe has quit IRC | 06:05 | |
*** numans has quit IRC | 06:05 | |
*** apuimedo has quit IRC | 06:05 | |
*** dhellmann has quit IRC | 06:05 | |
*** ilpianista_ has quit IRC | 06:05 | |
*** uberjay has quit IRC | 06:05 | |
*** mancdaz has quit IRC | 06:05 | |
*** harlowja has quit IRC | 06:05 | |
*** _d34dh0r53_ has quit IRC | 06:05 | |
*** GregHous- has quit IRC | 06:05 | |
*** StevenK has quit IRC | 06:05 | |
*** aspiers has quit IRC | 06:05 | |
*** _Cyclone_ has quit IRC | 06:05 | |
*** Krenair has quit IRC | 06:05 | |
*** ggherdov- has quit IRC | 06:05 | |
*** jamespage has quit IRC | 06:05 | |
*** fmccrthy has quit IRC | 06:05 | |
*** jmccrory has quit IRC | 06:05 | |
*** wendar has quit IRC | 06:05 | |
*** tdasilva has quit IRC | 06:05 | |
*** mwhahaha has quit IRC | 06:05 | |
*** cmurphy has quit IRC | 06:05 | |
*** melwitt has quit IRC | 06:05 | |
*** petrovich has quit IRC | 06:05 | |
*** nhandler has quit IRC | 06:05 | |
*** electrical has quit IRC | 06:05 | |
*** rajinir has quit IRC | 06:05 | |
*** jpmaxman has quit IRC | 06:05 | |
*** Ng has quit IRC | 06:05 | |
*** melwitt has joined #openstack-infra | 06:05 | |
*** wendar has joined #openstack-infra | 06:05 | |
*** d34dh0r53 has joined #openstack-infra | 06:05 | |
*** dhellmann has joined #openstack-infra | 06:05 | |
*** _Cyclone_ has joined #openstack-infra | 06:05 | |
*** Jeffrey4l has joined #openstack-infra | 06:05 | |
*** petrovich has joined #openstack-infra | 06:05 | |
*** StevenK has joined #openstack-infra | 06:05 | |
*** uberjay has joined #openstack-infra | 06:05 | |
*** apuimedo has joined #openstack-infra | 06:05 | |
*** afazekas has joined #openstack-infra | 06:05 | |
*** cmurphy has joined #openstack-infra | 06:05 | |
*** nhandler has joined #openstack-infra | 06:05 | |
*** aspiers has joined #openstack-infra | 06:05 | |
*** Shrews has joined #openstack-infra | 06:05 | |
chandankumar | ianw: Hello | 06:05 |
*** Ng has joined #openstack-infra | 06:05 | |
*** melwitt is now known as Guest36641 | 06:06 | |
*** numans has joined #openstack-infra | 06:06 | |
chandankumar | ianw: https://review.openstack.org/#/c/502224/ review for creating repo for neutron-tempest-plugin got merged | 06:06 |
chandankumar | ianw: but repo is not yet created on git.openstack.org | 06:06 |
*** electrical has joined #openstack-infra | 06:06 | |
chandankumar | ianw: please have a look, Thanks :-) | 06:06 |
*** ggherdov- has joined #openstack-infra | 06:06 | |
*** fmccrthy has joined #openstack-infra | 06:06 | |
*** mwhahaha has joined #openstack-infra | 06:06 | |
*** jamespage has joined #openstack-infra | 06:06 | |
ianw | chandankumar: i think we've stopped puppet as part of gerrit upgrades. admins will be looking at this tomorrow USA time | 06:06 |
ianw | so check after that | 06:06 |
chandankumar | ianw: thanks for the info :-) | 06:07 |
*** panda|bbl has quit IRC | 06:08 | |
*** mrunge has quit IRC | 06:08 | |
*** jdandrea has joined #openstack-infra | 06:09 | |
*** xarses_ has quit IRC | 06:09 | |
*** mrunge has joined #openstack-infra | 06:09 | |
*** masayukig[m] has quit IRC | 06:09 | |
*** aspiers[m] has quit IRC | 06:09 | |
*** mriedem has joined #openstack-infra | 06:10 | |
*** akscram1 has joined #openstack-infra | 06:10 | |
*** mdrabe has joined #openstack-infra | 06:10 | |
*** mancdaz has joined #openstack-infra | 06:10 | |
*** GregHous- has joined #openstack-infra | 06:10 | |
*** jmccrory has joined #openstack-infra | 06:10 | |
*** tdasilva has joined #openstack-infra | 06:10 | |
*** rajinir has joined #openstack-infra | 06:10 | |
*** jpmaxman has joined #openstack-infra | 06:10 | |
*** panda has joined #openstack-infra | 06:11 | |
*** Krenair has joined #openstack-infra | 06:12 | |
*** mnaser has quit IRC | 06:12 | |
*** andreas_s has joined #openstack-infra | 06:22 | |
*** mriedem has quit IRC | 06:25 | |
ianw | clarkb / anticw : so the (cgit) links being available on review-dev are because gitweb isn't enabled there -> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/review_dev.pp#n75 | 06:27 |
ianw | only cgit | 06:27 |
*** mnaser has joined #openstack-infra | 06:28 | |
*** thorst has joined #openstack-infra | 06:31 | |
*** dizquierdo has joined #openstack-infra | 06:31 | |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config master: Switch review.o.o to cgit links https://review.openstack.org/505067 | 06:33 |
ianw | clarkb: ^ an option, maybe. i've run out of time to figure out why gitweb 404's but i'm sure you'll figure it out :) | 06:34 |
*** thorst has quit IRC | 06:35 | |
AJaeger | ianw: thanks for update. I notice that I don't get emails by gerrit this morning. Is that working for you? | 06:36 |
ianw | AJaeger: i am getting emails ok, i think | 06:37 |
*** pgadiya has joined #openstack-infra | 06:38 | |
ianw | AJaeger: now you mention it, maybe i'm not ... | 06:42 |
*** dhajare has joined #openstack-infra | 06:43 | |
*** martinkopec has joined #openstack-infra | 06:44 | |
ianw | i dunno, exim is processing and not erroring | 06:45 |
frickler | ianw: AJaeger: I noticed some mails missing, too, though I also received some | 06:46 |
AJaeger | ianw: didn't get any in the last hour - and rechecked some... | 06:49 |
*** ccamacho has joined #openstack-infra | 06:49 | |
*** florianf has joined #openstack-infra | 06:49 | |
* AJaeger will do other stuff now and report back later today if this didn't heal itself... | 06:49 | |
yolanda | hi, good morning, can i help? reading the messages... | 06:50 |
*** zhurong has quit IRC | 06:54 | |
*** gildub_ has quit IRC | 06:58 | |
*** xinliang has quit IRC | 07:02 | |
*** gildub_ has joined #openstack-infra | 07:03 | |
AJaeger | ianw, yolanda : It looks like post-jobs are not run at all ;( | 07:07 |
AJaeger | infra-root ^ | 07:07 |
AJaeger | https://review.openstack.org/#/c/505068/ just merged, I do not see it in status.o.o/zuul anymore (and nothing in post queue) | 07:07 |
AJaeger | and log files are non-existing: http://logs.openstack.org/aa/aab471e1911cbd91f3ea42c702f712da2c3ed16a/ | 07:08 |
AJaeger | ianw, yolanda: Can you check logs and see why post jobs are not run? | 07:10 |
AJaeger | #status log Zuul is not running any post jobs | 07:13 |
openstackstatus | AJaeger: finished logging | 07:13 |
yolanda | nothing strange on zuul logs... | 07:13 |
AJaeger | thanks for checking - that's strange | 07:14 |
*** mwarad has quit IRC | 07:14 | |
*** _mwarad_ has joined #openstack-infra | 07:14 | |
yolanda | also nodepool is spinning vms | 07:15 |
*** xinliang has joined #openstack-infra | 07:15 | |
AJaeger | yolanda: I do not see any post jobs in status.o.o/zuul | 07:15 |
yolanda | yep | 07:15 |
yolanda | the other queues look fine, but don't know what's going on for post | 07:15 |
*** masayukig[m] has joined #openstack-infra | 07:16 | |
AJaeger | at this time there have to be some - since the translation jobs are all run in periodic queue and thus it should queue all post jobs for repos with translations - and we merged a few in the last hour including api-site | 07:16 |
ianw | yeah i'm not seeing anything obvious | 07:17 |
AJaeger | Let's send an alert out - what about #status alert Post jobs are not executed currently, do not tag any releases. | 07:17 |
*** hashar has joined #openstack-infra | 07:18 | |
yolanda | i'm seeing some POST_FAILURE errors | 07:18 |
*** rcernin has joined #openstack-infra | 07:18 | |
yolanda | #status alert Post jobs are not executed currently, do not tag any releases | 07:19 |
openstackstatus | yolanda: sending alert | 07:19 |
*** zhurong has joined #openstack-infra | 07:19 | |
AJaeger | thanks, yolanda | 07:19 |
AJaeger | I'll be back later | 07:19 |
*** dhajare has quit IRC | 07:21 | |
-openstackstatus- NOTICE: Post jobs are not executed currently, do not tag any releases | 07:22 | |
ianw | the last time post seemed to do something was 2017-09-18 15:01:00,316 DEBUG zuul.IndependentPipelineManager: Finished queue processor: post (changed: True) | 07:22 |
*** ChanServ changes topic to "Post jobs are not executed currently, do not tag any releases" | 07:22 | |
yolanda | yep, it doesn't seem to be detecting new changes in the queue | 07:22 |
*** pcaruana has joined #openstack-infra | 07:22 | |
AJaeger | ianw: so, that's before the gerrit update | 07:22 |
*** tesseract has joined #openstack-infra | 07:24 | |
*** armax has quit IRC | 07:24 | |
openstackstatus | yolanda: finished sending alert | 07:25 |
*** gildub_ has quit IRC | 07:30 | |
*** thorst has joined #openstack-infra | 07:32 | |
*** _mwarad_ has quit IRC | 07:32 | |
*** itooon has joined #openstack-infra | 07:33 | |
*** ilpianista_ has joined #openstack-infra | 07:33 | |
*** aspiers[m] has joined #openstack-infra | 07:33 | |
*** jpena|off is now known as jpena | 07:35 | |
*** threestrands has quit IRC | 07:36 | |
*** thorst has quit IRC | 07:36 | |
*** itooon has quit IRC | 07:38 | |
*** xinliang has quit IRC | 07:42 | |
*** xinliang has joined #openstack-infra | 07:42 | |
*** xinliang has quit IRC | 07:42 | |
*** xinliang has joined #openstack-infra | 07:42 | |
*** thorre_se has joined #openstack-infra | 07:42 | |
*** alexchadin has joined #openstack-infra | 07:44 | |
*** thorre has quit IRC | 07:46 | |
*** thorre_se is now known as thorre | 07:46 | |
*** aarefiev_ptg is now known as aarefiev | 07:47 | |
*** dhajare has joined #openstack-infra | 07:48 | |
*** egonzalez has joined #openstack-infra | 07:52 | |
*** ralonsoh has joined #openstack-infra | 07:53 | |
*** mrunge has quit IRC | 07:54 | |
*** _mwarad_ has joined #openstack-infra | 07:55 | |
*** mrunge has joined #openstack-infra | 07:57 | |
*** d0ugal has quit IRC | 08:01 | |
*** ociuhandu has quit IRC | 08:03 | |
*** shardy has joined #openstack-infra | 08:04 | |
*** d0ugal has joined #openstack-infra | 08:09 | |
*** jpich has joined #openstack-infra | 08:11 | |
*** ykarel is now known as ykarel|lunch | 08:11 | |
*** gildub_ has joined #openstack-infra | 08:12 | |
*** liujiong_lj has joined #openstack-infra | 08:14 | |
*** liujiong has quit IRC | 08:15 | |
*** efoley has joined #openstack-infra | 08:20 | |
*** priteau has joined #openstack-infra | 08:21 | |
*** zhurong has quit IRC | 08:24 | |
*** mrunge has quit IRC | 08:26 | |
*** mrunge has joined #openstack-infra | 08:26 | |
*** gildub_ has quit IRC | 08:28 | |
*** thorst has joined #openstack-infra | 08:33 | |
*** jaosorior has quit IRC | 08:33 | |
*** jaosorior has joined #openstack-infra | 08:34 | |
*** hashar has quit IRC | 08:36 | |
*** thorst has quit IRC | 08:37 | |
*** hashar has joined #openstack-infra | 08:37 | |
*** Sree_ has joined #openstack-infra | 08:38 | |
*** Sree_ is now known as Guest32514 | 08:38 | |
*** Sree has quit IRC | 08:41 | |
frickler | infra-root: if you need a break from checking gerrit: ask.o.o seems also to be down for an extended period this morning. usually there is a 30 minute outage at around 6:30, now it is gone for two hours | 08:44 |
*** electrofelix has joined #openstack-infra | 08:45 | |
*** ykarel|lunch is now known as ykarel | 08:47 | |
*** udesale has quit IRC | 08:50 | |
*** pbourke has joined #openstack-infra | 08:54 | |
*** ijw has joined #openstack-infra | 08:55 | |
*** ijw has quit IRC | 09:00 | |
*** udesale has joined #openstack-infra | 09:03 | |
*** alexchadin has quit IRC | 09:04 | |
*** slaweq has quit IRC | 09:04 | |
*** alexchadin has joined #openstack-infra | 09:05 | |
*** udesale has quit IRC | 09:05 | |
*** udesale has joined #openstack-infra | 09:05 | |
*** zhurong has joined #openstack-infra | 09:06 | |
*** slaweq has joined #openstack-infra | 09:07 | |
*** yamamoto has quit IRC | 09:07 | |
d0ugal | The is:starred search on gerrit no longer works :-( | 09:09 |
d0ugal | actually, ignore me - I was just signed out. | 09:10 |
d0ugal | phew :) | 09:10 |
*** udesale__ has joined #openstack-infra | 09:10 | |
egonzalez | hi, is there an estimated time for post jobs to be back to a normal state? | 09:10 |
*** dizquierdo has quit IRC | 09:11 | |
*** udesale has quit IRC | 09:12 | |
*** e0ne has joined #openstack-infra | 09:24 | |
*** yamamoto has joined #openstack-infra | 09:32 | |
*** thorst has joined #openstack-infra | 09:33 | |
*** tosky has joined #openstack-infra | 09:37 | |
*** thorst has quit IRC | 09:38 | |
*** s-shiono has quit IRC | 09:51 | |
*** stakeda has quit IRC | 09:52 | |
*** ociuhandu has joined #openstack-infra | 09:55 | |
AJaeger | egonzalez: we need to wait for the US part of the team to investigate... | 09:59 |
egonzalez | AJaeger, thanks | 10:00 |
*** ociuhandu has quit IRC | 10:01 | |
*** liujiong_lj has quit IRC | 10:05 | |
openstackgerrit | Thierry Carrez proposed openstack-infra/odsreg master: Update OpenStack logo https://review.openstack.org/505145 | 10:06 |
*** cuongnv has quit IRC | 10:07 | |
*** nicolasbock has joined #openstack-infra | 10:09 | |
*** Guest32514 has quit IRC | 10:10 | |
frickler | another gerrit bug: every patch is shown has having the same topic as itself, so there is always at least "Same Topic (1)" | 10:10 |
frickler | s/has/as/ | 10:11 |
*** Sree has joined #openstack-infra | 10:12 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack-infra/project-config master: Normalize projects.yaml https://review.openstack.org/505148 | 10:12 |
*** sambetts|afk is now known as sambetts | 10:13 | |
*** alexchadin has quit IRC | 10:13 | |
openstackgerrit | Thierry Carrez proposed openstack-infra/odsreg master: Increase title to 60 chars https://review.openstack.org/505150 | 10:15 |
openstackgerrit | Thierry Carrez proposed openstack-infra/odsreg master: Single topic uses admins to review https://review.openstack.org/505151 | 10:15 |
*** Sree has quit IRC | 10:16 | |
*** thorst has joined #openstack-infra | 10:18 | |
*** thorst has quit IRC | 10:19 | |
*** mrunge has quit IRC | 10:23 | |
*** mrunge has joined #openstack-infra | 10:25 | |
*** _mwarad_ has quit IRC | 10:26 | |
*** bhavik1 has joined #openstack-infra | 10:30 | |
*** alexchadin has joined #openstack-infra | 10:31 | |
*** udesale__ has quit IRC | 10:32 | |
*** udesale has joined #openstack-infra | 10:32 | |
*** jtomasek has quit IRC | 10:33 | |
*** bhavik1 has quit IRC | 10:33 | |
*** yamamoto has quit IRC | 10:34 | |
*** jtomasek has joined #openstack-infra | 10:34 | |
*** alexchadin has quit IRC | 10:36 | |
*** yamamoto has joined #openstack-infra | 10:38 | |
*** jkilpatr has quit IRC | 10:42 | |
*** yamamoto has quit IRC | 10:47 | |
frickler | infra-root: it looks like mails from gerrit may not be lost, but just delayed quite a bit, see these headers with more than 4h delay: http://paste.openstack.org/show/621408/ | 10:48 |
*** timothyb89 has quit IRC | 10:48 | |
frickler | not consistent, though, I also did receive a mail at 10:20 with only 2 minutes delay | 10:51 |
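The delay frickler reads off the pasted headers can be estimated with a short script. This is only a sketch: it assumes the message carries a `Date` header and that the topmost `Received` header (the most recent hop) ends with a `; <timestamp>` part. The sample message, hostnames and times below are made up for illustration and are not taken from the paste.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message; hosts and timestamps are invented for the example.
SAMPLE = (
    "Received: from review.example.org by mx.example.org with esmtp; "
    "Tue, 19 Sep 2017 10:45:00 +0000\n"
    "Date: Tue, 19 Sep 2017 06:30:00 +0000\n"
    "Subject: Change in example/project\n"
    "\n"
    "body\n"
)


def delivery_delay(raw_message):
    """Return the time between the Date header and the latest Received hop."""
    msg = message_from_string(raw_message)
    sent = parsedate_to_datetime(msg["Date"])
    received = msg.get_all("Received") or []
    if not received:
        return None
    # The first Received header in the message is the most recent hop;
    # its timestamp follows the last semicolon.
    stamp = received[0].rsplit(";", 1)[1].strip()
    return parsedate_to_datetime(stamp) - sent


print(delivery_delay(SAMPLE))
```

Running this over a mailbox of gerrit notifications would show whether the delay is consistent or only hits some messages, which is what the 10:20 mail with a 2 minute delay suggests.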
*** udesale has quit IRC | 10:54 | |
*** udesale has joined #openstack-infra | 10:55 | |
*** alexchadin has joined #openstack-infra | 10:55 | |
*** dtantsur|afk is now known as dtantsur | 10:58 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: WIP: Honor cloud quotas before launching nodes https://review.openstack.org/503838 | 11:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Don't fail on quota exceeded https://review.openstack.org/503051 | 11:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Make max-servers optional https://review.openstack.org/504282 | 11:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support cores limit per pool https://review.openstack.org/504283 | 11:00 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support ram limit per pool https://review.openstack.org/504284 | 11:00 |
*** wolverineav has joined #openstack-infra | 11:05 | |
*** dhajare has quit IRC | 11:05 | |
*** dhajare has joined #openstack-infra | 11:05 | |
*** sdague has joined #openstack-infra | 11:12 | |
*** cshastri has quit IRC | 11:13 | |
*** baoli has joined #openstack-infra | 11:14 | |
*** jkilpatr has joined #openstack-infra | 11:16 | |
*** thorst has joined #openstack-infra | 11:19 | |
*** yamamoto has joined #openstack-infra | 11:23 | |
*** wolverineav has quit IRC | 11:24 | |
*** thorst has quit IRC | 11:26 | |
*** baoli has quit IRC | 11:28 | |
openstackgerrit | Mathieu Velten proposed openstack-infra/project-config master: Update Magnum DCOS image build https://review.openstack.org/504431 | 11:29 |
*** tpsilva has joined #openstack-infra | 11:29 | |
*** alexchadin has quit IRC | 11:33 | |
*** martinkopec has quit IRC | 11:33 | |
*** alexchadin has joined #openstack-infra | 11:34 | |
*** ldnunes has joined #openstack-infra | 11:35 | |
*** alexchadin has quit IRC | 11:35 | |
hwoarang | good day | 11:36 |
hwoarang | since the gerrit update last night my custom dashboards do not work anymore | 11:36 |
*** alexchadin has joined #openstack-infra | 11:36 | |
hwoarang | has something changed so gerrit-dash-creator has to be updated or is there something wrong with gerrit? | 11:36 |
hwoarang | for example the one returned by ./gerrit-dash-creator dashboards/openstack-ansible.dash returns no results which is not normal | 11:37 |
*** jpena is now known as jpena|lunch | 11:40 | |
dmsimard | mordred: where can we see what projects are trusted and which aren't ? I guess I've only ever had to deal with trusted projects so far cause I haven't seen that kind of restriction in action yet | 11:41 |
*** pgadiya has quit IRC | 11:45 | |
*** ociuhandu has joined #openstack-infra | 11:48 | |
*** zhurong has quit IRC | 11:49 | |
*** alexchadin has quit IRC | 11:56 | |
*** alexchadin has joined #openstack-infra | 11:57 | |
*** jaosorior has quit IRC | 12:02 | |
*** efoley has quit IRC | 12:03 | |
*** efoley_ has joined #openstack-infra | 12:03 | |
Shrews | dmsimard: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul/main.yaml | 12:03 |
*** thorst has joined #openstack-infra | 12:04 | |
*** hwoarang has quit IRC | 12:05 | |
*** panda is now known as panda|lunch | 12:05 | |
*** szahers has joined #openstack-infra | 12:07 | |
*** hwoarang has joined #openstack-infra | 12:07 | |
*** wolverineav has joined #openstack-infra | 12:08 | |
*** jdandrea_ has joined #openstack-infra | 12:09 | |
szahers | Hi folks, we are trying to publish freezer docs, so I have created this change https://review.openstack.org/#/c/504329/ and it got merged, but it seems we are missing something as I can't see freezer docs on https://docs.openstack.org/freezer/latest/ (it's required to get this change https://review.openstack.org/#/c/504325/ to pass) | 12:09 |
szahers | can anyone help or let me know what we should do? | 12:09 |
*** alexchadin has quit IRC | 12:13 | |
*** alexchadin has joined #openstack-infra | 12:13 | |
fungi | ianw: clarkb: yeah, we had been wanting to switch from linking the internal gitweb to linking external cgit instead so as to take more load off gerrit (and possibly allow us to turn off the former eventually). i'm in favor of going ahead with that switch now | 12:15 |
*** jdandrea_ has quit IRC | 12:16 | |
*** dprince has joined #openstack-infra | 12:16 | |
*** martinkopec has joined #openstack-infra | 12:17 | |
*** LindaWang has quit IRC | 12:17 | |
*** LindaWang has joined #openstack-infra | 12:18 | |
AJaeger | szahers: see channel topic - post jobs are not running at all currently. And your docs are published from post jobs. We need to fix that first and then your next change will publish in the post pipeline. | 12:18 |
*** rlandy has joined #openstack-infra | 12:19 | |
szahers | AJaeger ah, my bad. Thanks :) | 12:20 |
fungi | hwoarang: try dropping ,n,z from the end of the dashboard url? that was a deprecated option format which was finally dropped sometime between 2.11 and 2.13 | 12:21 |
fungi | i'll check the gerrit event stream to see what may have changed about the sorts of events on which we're triggering the post pipeline | 12:22 |
hwoarang | fungi: there are no such characters at the end or anywhere in the url | 12:23 |
*** trown|outtypewww is now known as trown | 12:23 | |
fungi | hwoarang: in that case, you may need to manually reconstruct some similar query through the search box in the webui and see how it differs from what the dashboard creator generates | 12:24 |
hwoarang | yeah i thought so .. | 12:24 |
hwoarang | will try that | 12:24 |
fungi | unfortunately gerrit likes to occasionally change their url formats between releases | 12:25 |
*** acoles has joined #openstack-infra | 12:25 | |
*** pblaho has joined #openstack-infra | 12:26 | |
acoles | Hi, thanks infra team for merging this https://review.openstack.org/#/c/504127/ - we're still not seeing alerts for those branches in #openstack-swift, is there anything else that needs to happen to kick gerritbot? | 12:27 |
fungi | acoles: config updates of the gerrit server are temporarily disabled while we work through some remaining issues following yesterday's upgrade maintenance | 12:29 |
fungi | so i expect that change hasn't been applied yet | 12:29 |
acoles | fungi: ok. thanks for replying. | 12:29 |
*** dhajare has quit IRC | 12:30 | |
*** efoley has joined #openstack-infra | 12:32 | |
*** efoley_ has quit IRC | 12:32 | |
*** markmcd has quit IRC | 12:35 | |
*** kgiusti has joined #openstack-infra | 12:36 | |
*** camunoz has joined #openstack-infra | 12:37 | |
*** markmcd has joined #openstack-infra | 12:37 | |
* mhayden tips his hat to the folks who worked hard on upgrading gerrit -- works great! | 12:38 | |
fungi | infra-root: so on the post pipeline triggering... zuul is configured to trigger on ref-updated events; i see some of those while listening on the event stream | 12:38 |
*** sshnaidm has joined #openstack-infra | 12:38 | |
fungi | going to try and track one down in the zuul debug log next | 12:38 |
*** Sree has joined #openstack-infra | 12:39 | |
*** markmcd has quit IRC | 12:39 | |
fungi | log says zuul's adding and processing ref-updated trigger events | 12:41 |
*** markmcd has joined #openstack-infra | 12:41 | |
fungi | mhayden: we're not out of the woods yet... increased memory utilization, missing gitweb links, post pipeline jobs aren't triggering... but thanks! | 12:42 |
*** jpena|lunch is now known as jpena | 12:42 | |
fungi | it's a complex system, so at least we didn't expect to catch everything before upgrading and knew there would be issues to work through today | 12:42 |
sdague | fungi: ah, maybe we need to upgrade gerrit dash creator? | 12:45 |
*** dave-mccowan has joined #openstack-infra | 12:46 | |
fungi | sdague: likely, but i don't know the details there | 12:46 |
*** sshnaidm has quit IRC | 12:47 | |
*** dave-mcc_ has joined #openstack-infra | 12:51 | |
*** dave-mccowan has quit IRC | 12:51 | |
*** pblaho has quit IRC | 12:52 | |
*** esberglu has joined #openstack-infra | 12:52 | |
*** isviridov_away has quit IRC | 12:52 | |
*** vaidy has quit IRC | 12:52 | |
*** LindaWang has quit IRC | 12:53 | |
*** szahers has quit IRC | 12:54 | |
*** bh526r has joined #openstack-infra | 12:55 | |
*** pblaho has joined #openstack-infra | 12:55 | |
sdague | oh, you know what changed | 12:55 |
sdague | label:Code-Review>=-2,self | 12:56 |
sdague | used to only match on things with -2,-1,1,2 | 12:56 |
sdague | now it matches on unvoted things | 12:56 |
*** jcoufal has joined #openstack-infra | 12:56 | |
fungi | "unvoted" as in changes where you've left a comment with a 0 vote? | 12:57 |
*** Goneri has joined #openstack-infra | 12:57 | |
*** mriedem has joined #openstack-infra | 12:57 | |
sdague | or never voted | 12:58 |
sdague | or never left a comment | 12:58 |
*** mat128 has joined #openstack-infra | 12:58 | |
fungi | that's certainly strange | 12:58 |
sdague | that it changed, or the old behavior | 13:00 |
sdague | the old behavior was equally odd | 13:00 |
sdague | well, I sent an ML post out about it, about to have network outage as they upgrade fiber here | 13:03 |
dmsimard | fungi: do you happen to know if jobs running zuul v3 are in logstash yet ? | 13:05 |
*** hashar_ has joined #openstack-infra | 13:06 | |
fungi | dmsimard: no clue | 13:06 |
dmsimard | fungi: searching for "build_change" with a review number doesn't yield anything | 13:06 |
dmsimard | I know jeblair was working on it, perhaps it's not finished yet | 13:06 |
dmsimard | I wanted to create a E-R query for an Ansible issue I've been noticing. | 13:07 |
*** bobh has joined #openstack-infra | 13:07 | |
*** hashar has quit IRC | 13:08 | |
*** amotoki_ has quit IRC | 13:09 | |
*** vaidy has joined #openstack-infra | 13:09 | |
*** gouthamr has joined #openstack-infra | 13:13 | |
*** isviridov_away has joined #openstack-infra | 13:13 | |
openstackgerrit | David Moreau Simard proposed openstack-infra/elastic-recheck master: Add query for Ansible privilege escalation timeout https://review.openstack.org/505233 | 13:14 |
openstackgerrit | David Moreau Simard proposed openstack-infra/elastic-recheck master: Add query for Ansible privilege escalation timeout https://review.openstack.org/505233 | 13:17 |
*** udesale has quit IRC | 13:17 | |
*** acoles has left #openstack-infra | 13:17 | |
sdague | oh, this is pretty good, reviewedby: actually is the better version of it all | 13:19 |
*** chlong has joined #openstack-infra | 13:19 | |
hwoarang | fungi: initial debugging suggests that new gerrit doesn't understand regexp in the foreach statement | 13:23 |
hwoarang | if you hardcode the names of the projects you are looking for then dashboard seems to render fine | 13:23 |
*** panda|lunch is now known as panda | 13:25 | |
fungi | hwoarang: are you starting the regex with ^ | 13:27 |
*** Sree_ has joined #openstack-infra | 13:28 | |
fungi | the api at least seems to handle regular expressions fine as long as they start with a ^ character | 13:28 |
*** alexchadin has quit IRC | 13:28 | |
*** Sree_ is now known as Guest7335 | 13:28 | |
*** felipemonteiro_ has joined #openstack-infra | 13:28 | |
*** felipemonteiro__ has joined #openstack-infra | 13:29 | |
*** Sree has quit IRC | 13:29 | |
*** markvoelker has joined #openstack-infra | 13:30 | |
*** felipemonteiro_ has quit IRC | 13:33 | |
*** tiswanso has quit IRC | 13:33 | |
*** LindaWang has joined #openstack-infra | 13:33 | |
*** ihrachys has joined #openstack-infra | 13:34 | |
hwoarang | yeah it starts with ^ | 13:37 |
*** dtantsur is now known as dtantsur|lunch | 13:38 | |
*** coolsvap has quit IRC | 13:42 | |
*** ijw has joined #openstack-infra | 13:46 | |
hwoarang | very strange because this query does work https://review.openstack.org/#/q/project:openstack/openstack-ansible-rabbitmq_server+AND+status:open but this dashboard (which is similar afaics) doesn't https://review.openstack.org/#/dashboard/?foreach=%28project%3Aopenstack%2Fopenstack%2Dansible%2Drabbitmq_server%29+status%3Aopen&title=OpenStack%2DAnsible+Review+Inbox&Foobar=age%3A30d) | 13:49 |
hwoarang | anyway | 13:49 |
*** ijw has quit IRC | 13:50 | |
dmsimard | fungi: I forgot about the warning for not tagging releases | 13:50 |
dmsimard | fungi: and I tagged something and it's running so I guess whatever it was it works now ? | 13:51 |
dmsimard | this is pre-release FWIW | 13:51 |
*** hongbin has joined #openstack-infra | 13:51 | |
fungi | dmsimard: it's apparently post pipeline jobs which are having trouble, so might not actually impact most release activities | 13:51 |
dmsimard | fungi: is anyone looking into that ? can I help ? | 13:52 |
fungi | dmsimard: except stuff driven through release management, which relies on a post job for changes merged to the releases repo to trigger automated tagging | 13:52 |
sdague | hwoarang: age:30d means patches that haven't changed in over 30d | 13:53 |
sdague | there are only 3 patches | 13:53 |
fungi | dmsimard: all i can tell so far is that we're still getting ref-updated events and the zuul debug log indicates it's adding the trigger and processing them | 13:53 |
sdague | they all have seen activity in the past week | 13:53 |
fungi | dmsimard: but it's not launching any jobs so something's not matching correctly i guess... still digging | 13:53 |
dmsimard | fungi: so a review that merges and triggers POST, the merge occurs properly but then the post job doesn't trigger ? | 13:54 |
hwoarang | sdague: true :/ i will keep digging for a simpler reproducer | 13:54 |
sdague | hwoarang: what are you trying to figure out? | 13:55 |
*** wolverineav has quit IRC | 13:55 | |
hwoarang | sdague: the link produced by ./gerrit-dash-creator dashboards/openstack-ansible.dash returns 0 results for me whereas two days ago the list was pretty massive | 13:55 |
sdague | hwoarang: is it in gerrit dash creator repo? | 13:56 |
hwoarang | yep | 13:56 |
sdague | label:Code-Review>=0,self | 13:57 |
sdague | that's your problem | 13:57 |
sdague | see the email that I sent | 13:57 |
fungi | dmsimard: yeah, i think something about the parameter names may have changed in ref-updated events causing zuul not to find commit details, but that's just a hunch so far. i'm trying to match up the parameter names and data types to what it expects | 13:58 |
sdague | hwoarang: http://lists.openstack.org/pipermail/openstack-dev/2017-September/122277.html | 13:58 |
sdague | Code-Review=0 used to be a noop | 13:58 |
hwoarang | ahh thank you sdague | 13:58 |
sdague | now it actually matches everything you've never voted on | 13:58 |
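The behaviour change sdague describes for `label:Code-Review>=N,self` can be written down as a toy model. This is a sketch of the semantics as stated in the channel, not Gerrit's actual implementation; the function names are hypothetical, and `vote` is `None` when the user never voted on the change.

```python
def old_matches(vote, threshold):
    """Pre-upgrade behaviour as described: only non-zero votes count."""
    return vote is not None and vote != 0 and vote >= threshold


def new_matches(vote, threshold):
    """Post-upgrade behaviour as described: a missing vote is treated as 0."""
    return (vote or 0) >= threshold


for vote in (None, 0, -1, 2):
    print(vote, old_matches(vote, 0), new_matches(vote, 0))
```

Under this model a dashboard section using `label:Code-Review>=0,self` to mean "things I have reviewed" starts matching changes the user never touched, so a query that negates it returns nothing.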
*** efoley has quit IRC | 14:01 | |
*** efoley has joined #openstack-infra | 14:01 | |
*** Guest36641 is now known as melwitt | 14:03 | |
*** erlon has joined #openstack-infra | 14:05 | |
*** esberglu has quit IRC | 14:06 | |
*** esberglu has joined #openstack-infra | 14:07 | |
yolanda | hwoarang, do you recognize this error on suse? http://logs.openstack.org/28/505128/1/check/gate-bifrost-integration-tinyipa-opensuse-423/ff4c568/console.html | 14:08 |
*** srobert has joined #openstack-infra | 14:08 | |
*** rbrndt has joined #openstack-infra | 14:09 | |
*** armax has joined #openstack-infra | 14:09 | |
hwoarang | yolanda: seems like something (bindep?) called 'zypper install' without providing any package name | 14:10 |
yolanda | i was testing this on centos, and i had to pin the version to work, but seems to fail for suse | 14:10 |
dmsimard | fungi: I'll cheer from the sidelines :( | 14:10 |
hwoarang | yolanda: what patch is this? | 14:11 |
fungi | dmsimard: it's slow going because i still have inlaws visiting for my wife's birthday | 14:11 |
*** jheroux has joined #openstack-infra | 14:11 | |
yolanda | http://logs.openstack.org/28/505128 | 14:11 |
dmsimard | fungi: zuul logs aren't in logstash are they ? | 14:11 |
yolanda | 505128 is the patch | 14:11 |
*** esberglu has quit IRC | 14:11 | |
yolanda | hwoarang, actually i'm getting output from the test run, and i get the same for centos: bindep -b &> /dev/null || sudo -H -E /bin/yum -y install | 14:12 |
yolanda | that's the command that is generated | 14:12 |
hwoarang | oh bindep 2.3 will not work for suse | 14:12 |
yolanda | and 2.4 seem to fail on centos | 14:12 |
yolanda | something with unicode | 14:12 |
hwoarang | yolanda: there are some suse specific fixes that made it after 2.3 especially the opensuse leap support | 14:13 |
hwoarang | which is what the gate uses | 14:13 |
yolanda | hwoarang, are you familiar with bindep then? let me show you the error on centos | 14:13 |
yolanda | mm, weird, now it doesn't give errors, but on a clean system it was failing with some unicode error | 14:15 |
hwoarang | maybe the locale was not set properly? | 14:15 |
yolanda | mm, fails on my fedora as well | 14:15 |
clarkb | fungi: you haven't managed to record the json data for a ref-updated event yet, have you? docs say the content is refName, I wonder if that used to be called "ref" | 14:15 |
yolanda | hwoarang, http://paste.openstack.org/show/621434/ | 14:16 |
yolanda | with 2.5.0 | 14:16 |
clarkb | we have a fix for the dashboard so that leaves zuul post, email slowness, gitweb, and memory consumption on the list of items I see from scrollback | 14:17 |
frickler | ask.o.o is still offline, anyone up for a quick apache restart maybe? | 14:18 |
frickler | clarkb: although not urgent, please also add the "each patch has the same topic as itself" to that list, I'm pretty sure that didn't happen before | 14:19 |
*** yamamoto has quit IRC | 14:20 | |
*** esberglu has joined #openstack-infra | 14:20 | |
fungi | clarkb: i have one, just a sec and i'll paste | 14:20 |
*** baoli has joined #openstack-infra | 14:21 | |
clarkb | frickler: I'm not sure I understand that one. I know I tested search by topic worked after the upgrade. Can you expand on the behavior you are seeing? | 14:22 |
*** wolverineav has joined #openstack-infra | 14:22 | |
fungi | clarkb: http://paste.openstack.org/show/621436 | 14:22 |
fungi | there's a bunch of them | 14:22 |
frickler | clarkb: when viewing a singleton patch like https://review.openstack.org/504349 , there is a tab on the right with "Same Topic (1)" and a link to the patch itself. IMHO that tab should only exist when there are other patches with the same topic | 14:23 |
fungi | that sounds more like a behavior change than a bug | 14:23 |
jeblair | fungi: those are all updates to changes so they wouldn't be enqueued in post; we need a ref update to a branch | 14:24 |
fungi | jeblair: ahh, yep, i'll see if i can nab one of those | 14:24 |
sdague | frickler: it's definitely a change | 14:25 |
clarkb | I think 2.11 did that too in some situations. Related changes would include the current change for example | 14:25 |
hwoarang | yolanda: actually bindep fails here as well | 14:25 |
fungi | trap laid | 14:25 |
jeblair | 505248,1 may be about to merge | 14:25 |
hwoarang | something is not right | 14:25 |
sdague | frickler: however, it's honestly less confusing UX for that list to exist all the time instead of only if >=2 patches in the series | 14:25 |
*** bobh has quit IRC | 14:26 | |
fungi | jeblair: i'm listening to the event stream for ref-updated events and filtering out any for refs/changes/ so hopefully should see one when that merges | 14:26 |
jeblair | fascinating -- i'm seeing events for "project":"All-Users" | 14:26 |
jeblair | i wonder if that happens when folks update their settings | 14:26 |
*** efoley has quit IRC | 14:26 | |
yolanda | hwoarang, if you export LANG as en_US, does it work? seemed to do the trick on my fedora | 14:27 |
fungi | so i guess it's possible gerrit no longer emits ref-updated on changes merging. i can't find evidence i caught any earlier besides for change updates | 14:27 |
hwoarang | yolanda: it actually fails with AttributeError: 'Depends' object has no attribute 'platform' | 14:27 |
fungi | checking docs to see if they made that into a new event type | 14:27 |
jeblair | {"submitter":{"name":"Jenkins","username":"jenkins"},"refUpdate":{"oldRev":"a74c237cf717246d659cef241a5d09096cbbc2a0","newRev":"5c5ca5f1eeca79ab8c13ac6b7221077f78f9fa6b","refName":"refs/heads/master","project":"openstack/openstack-ansible-rabbitmq_server"},"type":"ref-updated","eventCreatedOn":1505831339} | 14:30 |
yolanda | hwoarang, with 2.5.0? | 14:30 |
jeblair | fungi, clarkb: ^ | 14:30 |
hwoarang | yep | 14:30 |
frickler | sdague: hmm, not sure I agree, particularly when it may hide other more relevant tabs like cherry-picks or conflicts. but maybe that's really a matter of taste | 14:30 |
jeblair | it looks like they stopped dropping refs/heads from the ref on branches | 14:30 |
jeblair | which means two things: | 14:31 |
jeblair | 1) our pipeline config is wrong 2) our jobs are all wrong | 14:31 |
fungi | oh, ugh. that's going to be a ton of changes to jobs | 14:31 |
fungi | yeah, that | 14:31 |
jeblair | mordred: ^ this could have a significant impact on v3 conversion | 14:31 |
clarkb | 1) should be straightforward; it's 2) that will be painful unless we want to bake behavior compat into zuul | 14:32 |
fungi | i wonder if it also impacts how zuul-cloner needs to handle $ZUUL_REF(NAME) | 14:32 |
*** sdague has quit IRC | 14:32 | |
jeblair | actually........ | 14:32 |
jeblair | i *think* we were clever enough to standardize zuul.ref in v3! :) | 14:32 |
jeblair | so all the v3 job content should not be affected by the change | 14:32 |
yolanda | hwoarang, forced the LANG on that patch, let's see if that helps suse, or it's another problem | 14:32 |
jeblair | clearly we should migrate to v3 before upgrading gerrit. :| | 14:33 |
clarkb | should be simpleish to add a pipeline trigger flag to drop refs/.*/ from the refName ? | 14:34 |
*** ykarel has quit IRC | 14:35 | |
*** efoley has joined #openstack-infra | 14:36 | |
jeblair | needs to be a driver config option, but yeah. | 14:36 |
jeblair | though i'm only okay with that because we're throwing the code away next week. :) | 14:36 |
clarkb | and we can apply that before checking the condition match too so both 1 and 2 are addressed | 14:37 |
*** rbrndt has quit IRC | 14:37 | |
*** rbrndt has joined #openstack-infra | 14:38 | |
clarkb | though have to be careful to only do it for refs/heads and refs/tags I think | 14:38 |
jeblair | clarkb: just refs/heads | 14:38 |
mordred | jeblair: yes - I agree, I think we're fine in v3 | 14:38 |
*** esberglu has quit IRC | 14:39 | |
*** esberglu has joined #openstack-infra | 14:40 | |
jeblair | clarkb: want me to write that patch? | 14:40 |
clarkb | jeblair: yes please. I am going to get a list going of the data we have collected so far | 14:40 |
*** sdague has joined #openstack-infra | 14:40 | |
*** jrist has quit IRC | 14:41 | |
*** esberglu has quit IRC | 14:41 | |
clarkb | and things that need investigation and or fixing | 14:41 |
*** esberglu has joined #openstack-infra | 14:41 | |
tonyb | Can someone check on 503601 I can't see it in status.o.o/zuul by either change number or git SHA. so I'm wondering if it dropped somehow with the update | 14:41 |
*** baoli has quit IRC | 14:42 | |
*** rossella_s has quit IRC | 14:43 | |
*** baoli has joined #openstack-infra | 14:43 | |
*** baoli has quit IRC | 14:44 | |
*** baoli has joined #openstack-infra | 14:44 | |
*** martinkopec has quit IRC | 14:46 | |
*** sdague has quit IRC | 14:46 | |
*** rossella_s has joined #openstack-infra | 14:46 | |
hwoarang | yolanda: my problem is a regression in tumbleweed | 14:47 |
clarkb | tonyb: I think because it didn't get a jenkins +1 it isn't going into the gate after workflow +1. I expect a recheck would sort it out | 14:48 |
AJaeger | tonyb: agree with clarkb ^ | 14:48 |
*** jaosorior has joined #openstack-infra | 14:48 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: Add strip_branch_ref compat option https://review.openstack.org/505290 | 14:48 |
jeblair | clarkb, mordred: ^ | 14:49 |
mordred | jeblair: +2 | 14:50 |
tonyb | clarkb, AJaeger: Thanks. As often happens I had the tools I just didn't use them correctly. | 14:50 |
*** Guest7335 has quit IRC | 14:51 | |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-zuul master: Add gerrit_strip_branch_ref option https://review.openstack.org/505292 | 14:51 |
clarkb | jeblair: lgtm should I go ahead and approve? mordred has +2'd as well | 14:52 |
mordred | clarkb: ++ | 14:52 |
jeblair | clarkb: ya | 14:52 |
jeblair | after it lands, we'll need to make a new branch with the other patch on zuul.o.o | 14:53 |
*** szahers has joined #openstack-infra | 14:53 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: configloader: don't use path in SourceContext comparaison https://review.openstack.org/505293 | 14:53 |
fungi | clarkb: if you're putting together a list, https://review.openstack.org/505067 is a good candidate i think | 14:53 |
clarkb | https://etherpad.openstack.org/p/gerrit-2.13-issues is the list | 14:53 |
fungi | thanks | 14:53 |
*** david-lyle has joined #openstack-infra | 14:53 | |
fungi | and looks like it's already on there | 14:53 |
jeblair | mordred: can you update the regex in the zuulv3 pipeline config? | 14:55 |
clarkb | fungi: just added it :) | 14:55 |
clarkb | of all of these the one that concerns me most is the memory consumption. changes to jgit were supposed to mean that gerrit didn't cache everything until we had to restart anymore, but could mean we just use more memory in general... | 14:56 |
*** beekneemech is now known as bnemec | 14:56 | |
clarkb | it also seems to be a reaction to load, as it clearly picks up as north america wakes | 14:57 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-openstackci master: Pass through gerrit_strip_branch_ref https://review.openstack.org/505296 | 14:57 |
clarkb | (there is image link on etherpad above) | 14:57 |
clarkb | jeblair: oh are we going to have to edit the zuul layout voluptuous checker too? | 14:57 |
clarkb | jeblair: I wonder if ^ will fail without editing your change | 14:57 |
jeblair | clarkb: it's a zuul.conf setting | 14:58 |
clarkb | ah | 14:58 |
clarkb | because it is part of the connection roger | 14:58 |
openstackgerrit | Merged openstack-infra/zuul master: Add strip_branch_ref compat option https://review.openstack.org/505290 | 14:59 |
*** lbragstad has joined #openstack-infra | 14:59 | |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul https://review.openstack.org/505297 | 14:59 |
jeblair | okay "topic:branch-ref" should be ready | 14:59 |
*** xarses_ has joined #openstack-infra | 15:00 | |
jeblair | mordred: nevermind, i'll get the zuulv3 config | 15:00 |
*** szahers has quit IRC | 15:00 | |
*** cody-somerville has joined #openstack-infra | 15:00 | |
mordred | jeblair: (sorry, in my bi-weekly corporate overlord staff meeting) | 15:01 |
dmsimard | infra-root: image builds (at least on review.rdo's nodepool) are broken due to https://github.com/openstack-infra/project-config/commit/0335417d0d1b6b69cc95da0134b15f7e67878f3b merging and the repository not existing on git.o.o | 15:02 |
mordred | dmsimard: I think that's on the TDL for fixing - thanks | 15:03 |
*** andreww has joined #openstack-infra | 15:03 | |
dmsimard | yeah I suspect it's due to the ref issue | 15:03 |
*** ccamacho has quit IRC | 15:03 | |
openstackgerrit | James E. Blair proposed openstack-infra/project-config master: Update post ref regex https://review.openstack.org/505300 | 15:03 |
*** andreww has quit IRC | 15:03 | |
*** trown is now known as trown|brb | 15:03 | |
*** baoli has quit IRC | 15:03 | |
dmsimard | mordred: TDL is https://etherpad.openstack.org/p/gerrit-2.13-issues ? I'll add it there. | 15:04 |
*** baoli has joined #openstack-infra | 15:04 | |
mordred | dmsimard: well, we also had a puppet issue which is causing puppet to have sads on git.o.o | 15:05 |
*** xarses_ has quit IRC | 15:05 | |
mordred | dmsimard: which I _think_ is the root cause of that particular thing | 15:05 |
*** andreas_s has quit IRC | 15:05 | |
dmsimard | mordred: the paramiko thing, yes | 15:05 |
mordred | yah | 15:05 |
dmsimard | mordred: I thought new projects were created by jeepyb ? | 15:05 |
* mordred is stuck in morning meetings so hasn't had a chance to be very helpful yet | 15:05 | |
dmsimard | or is that just gerrit | 15:05 |
*** amoralej has joined #openstack-infra | 15:05 | |
mordred | dmsimard: yes. but jeepyb is run by puppet | 15:05 |
dmsimard | ohhhhhh | 15:05 |
*** andreww has joined #openstack-infra | 15:05 | |
mordred | so puppet is stuck on git0*.o.o, which means that those aren't getting updated | 15:06 |
*** andreww is now known as xarses_ | 15:06 | |
openstackgerrit | Matthew Thode proposed openstack-infra/project-config master: make a gentoo nodepool image https://review.openstack.org/504530 | 15:06 |
dmsimard | mordred: I'll try and reproduce on my end, although the fix is probably to just disable EPEL :/ | 15:07 |
dmsimard | mordred: the base OS version of python-paramiko (provided by centos extras) is much more recent than the one from EPEL | 15:07 |
portdirect | hey - I'm having trouble linking a blueprint to a ps - I've tried a few things, but wondering if there's anything up since the gerrit upgrade? | 15:07 |
pabelanger | yup, once we get puppet going again, it should be created | 15:07 |
dmsimard | mordred: python-paramiko probably got a huge bump in centos extras due to ansible being shipped in extras now. | 15:07 |
*** trown|brb is now known as trown | 15:08 | |
dmsimard | what's happening is that the version of extras is already installed and it's trying to install the version from epel on top or something. | 15:08 |
*** rossella_s has quit IRC | 15:08 | |
clarkb | infra-root thinking it would be good to avoid restarting gerrit for as long as possible (I know this delays fixing stuff like gitweb/cgit) because the longer it runs the better memory profile we will get. It is possible it just needs more memory than before but will reach steady state but also possible it just wants a big gulp amount of memory and we will have a hard time seeing that if we restart it | 15:08 |
clarkb | often | 15:08 |
dmsimard | but I can't do much beyond speculate | 15:08 |
clarkb | portdirect: what method are you using to link the two? | 15:08 |
dmsimard | clarkb: could it be related to the core count/thread bump we did ? | 15:08 |
amoralej | would it be possible to get a new release of diskimage-builder?, i'd like to get a new tag which includes https://review.openstack.org/#/c/500212/ | 15:09 |
clarkb | dmsimard: no, we have had the same core count as before that was just a change to the reindex step which is a process that exited when complete | 15:09 |
dmsimard | clarkb: oh. | 15:09 |
clarkb | dmsimard: that reindex step builds on disk indexes for gerrit and is a one off step | 15:09 |
pabelanger | +2 on topic:branch-ref | 15:09 |
fungi | jeblair: topic:branch-ref lgtm | 15:10 |
*** rossella_s has joined #openstack-infra | 15:10 | |
jeblair | they're all safe to approve | 15:10 |
fungi | i can go back and approve them now that someone else has also +2'd | 15:10 |
fungi | hrm, gate-infra-puppet-apply-3-ubuntu-trusty fails on the last one | 15:11 |
portdirect | clarkb: tried just the "Paritially implements" in the commit message: https://review.openstack.org/#/c/504089/ | 15:11 |
*** lbragstad has quit IRC | 15:11 | |
portdirect | which for us as a hosted project used to fail on the search in launchpad, but would at least link it there. | 15:11 |
fungi | Puppet (err): Invalid parameter gerrit_strip_branch_ref on Class[Openstackci::Zuul_scheduler] at Class[Openstackci::Zuul_scheduler] at /etc/puppet/modules/openstack_project/manifests/zuul_prod.pp:58 | 15:12 |
fungi | oh! wrong depends-on i think | 15:13 |
jeblair | fungi: yep! i'll fix | 15:14 |
fungi | it's already queued in my gertty | 15:14 |
fungi | but... isn't getting pushed to gerrit for some reason | 15:14 |
jeblair | fungi: i see you updated it | 15:14 |
fungi | gertty's gone to offline sync | 15:14 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul https://review.openstack.org/505297 | 15:15 |
fungi | there it goes | 15:15 |
clarkb | portdirect: http://paste.openstack.org/show/621445/ possibly related to the broken search? | 15:15 |
mordred | fungi: +3 | 15:16 |
*** slaweq has quit IRC | 15:16 | |
*** makowals has quit IRC | 15:16 | |
*** yamamoto has joined #openstack-infra | 15:20 | |
*** ccamacho has joined #openstack-infra | 15:21 | |
*** yamamoto has quit IRC | 15:25 | |
*** dimak has quit IRC | 15:25 | |
*** dimak has joined #openstack-infra | 15:26 | |
*** bobh has joined #openstack-infra | 15:26 | |
*** slaweq_ has quit IRC | 15:27 | |
*** slaweq has joined #openstack-infra | 15:28 | |
openstackgerrit | Merged openstack-infra/puppet-zuul master: Add gerrit_strip_branch_ref option https://review.openstack.org/505292 | 15:28 |
*** csomerville has joined #openstack-infra | 15:29 | |
clarkb | infra-root: proposal, let's get zuul post stuff sorted (in progress yay) since zuul puppeting should work as soon as we reenable puppeting globally. Then focus on fixing git backends puppeting so that we can start applying some of the more gerrit specific fixes. The etherpad https://etherpad.openstack.org/p/gerrit-2.13-issues has thoughts/ideas/fixes for each entry so far | 15:29 |
clarkb | that should give us time to gather more data on memory utilization before restarting as well | 15:29 |
clarkb | while still actively working to fix things | 15:29 |
clarkb | once I'm off my weekly tuesday call I will work to get puppet running again | 15:30 |
openstackgerrit | Merged openstack-infra/puppet-openstackci master: Pass through gerrit_strip_branch_ref https://review.openstack.org/505296 | 15:31 |
*** bobh has quit IRC | 15:31 | |
*** cody-somerville has quit IRC | 15:32 | |
*** jaosorior has quit IRC | 15:32 | |
*** Swami has joined #openstack-infra | 15:33 | |
openstackgerrit | Monty Taylor proposed openstack-infra/shade master: Add method to set bootable flag on volumes https://review.openstack.org/502479 | 15:33 |
mordred | clarkb: ++ | 15:34 |
mordred | clarkb: great plan by me | 15:34 |
pabelanger | +1 | 15:34 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul https://review.openstack.org/505297 | 15:35 |
clarkb | I've just approved https://review.openstack.org/#/c/504993/1 which bumps gerrit heap memory usability to 48GB (from 30GB) this was applied by hand last night and getting that in ensures we don't regress once puppet is running again | 15:36 |
*** e0ne has quit IRC | 15:37 | |
fungi | weird, why did gerritbot rereport that? | 15:37 |
fungi | uh | 15:38 |
fungi | gertty reverted that somehow | 15:38 |
fungi | like, on its own | 15:39 |
clarkb | fungi: ? | 15:39 |
fungi | checking its logs now, but wondering if that was some sort of timeout where it pushed the edit but never got a response and tried to roll it back | 15:39 |
fungi | 505297 | 15:39 |
jeblair | fungi: let me push the update to fix it | 15:40 |
fungi | thanks | 15:40 |
fungi | this sort of jives with the behavior people were reporting last night about the webui getting hung up after pushing edits through it | 15:40 |
clarkb | though in those cases it seemed to "work" in the end? I guess error handling in gertty could be the different behavior? | 15:41 |
jeblair | i'm seeing similar issues as fungi did | 15:41 |
jeblair | i have debug log enabled, so can comb through this later | 15:41 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul https://review.openstack.org/505297 | 15:41 |
fungi | looks like maybe it gave up after waiting 20 minutes | 15:41 |
fungi | and yeah, i don't have debug logging enabled | 15:42 |
clarkb | fungi: maybe check for errors on the gerrit side? | 15:43 |
*** LindaWang has quit IRC | 15:45 | |
fungi | nothing in the error log i could find matching that change-id or numeric index | 15:46 |
*** sdague has joined #openstack-infra | 15:46 | |
jeblair | i definitely got some gertty errors during the initial edit -- fungi you may have some in your log too. we'll see if something happens 20 minutes later. | 15:47 |
jeblair | "edit already in progress" | 15:47 |
fungi | for some reason my .gertty.log only had a couple tracebacks from february | 15:47 |
jeblair | huh | 15:48 |
fungi | oh, wait, now it's got some | 15:48 |
fungi | Edit already in progress on change 505297 | 15:48 |
fungi | yeah | 15:48 |
fungi | this is the first thing in there: | 15:49 |
fungi | 2017-09-19 15:14:37,182 Offline due to: HTTPSConnectionPool(host='review.openstack.org', port=443): Read timed out. (read timeout=30) | 15:49 |
openstackgerrit | Merged openstack-infra/system-config master: Bump gerrit to 48g heap https://review.openstack.org/504993 | 15:49 |
fungi | and then the "Edit already in progress" exceptions start up | 15:49 |
fungi | is that from retries? | 15:50 |
fungi | also getting some of these: | 15:50 |
fungi | 2017-09-19 15:33:28,840 Offline due to: ('Connection aborted.', BadStatusLine("''",)) | 15:50 |
*** baoli has quit IRC | 15:50 | |
clarkb | ready for me to uncomment the entry in roots crontab for puppet runs? | 15:52 |
*** yamamoto has joined #openstack-infra | 15:52 | |
fungi | did we want to wait for the system-config change for zuul to land? | 15:53 |
clarkb | we can | 15:53 |
fungi | 505297 is still pending | 15:53 |
clarkb | that will save us waiting for previous run to end | 15:53 |
fungi | that's exactly what i was thinking | 15:53 |
clarkb | kk waiting for that then | 15:53 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config master: add whereto to gerritbot for openstack-doc https://review.openstack.org/505314 | 15:54 |
zaro | Hey great job on the upgrade! | 15:57 |
clarkb | zaro: hey, if you have a moment can you look over https://etherpad.openstack.org/p/gerrit-2.13-issues and see if anything looks familiar? | 15:58 |
clarkb | zaro: I think we have ideas for fixes on most of that | 15:58 |
clarkb | zaro: curious if there is a suggested email thread pool size in particular and if you have any idea on the memory | 15:58 |
*** florianf has quit IRC | 15:59 | |
zaro | clarkb: i think i have a few changes for the cgit thing | 16:00 |
*** dtantsur|lunch is now known as dtantsur | 16:00 | |
kmalloc | the new gerrit seems way faster | 16:01 |
kmalloc | fyi | 16:01 |
kmalloc | just in general use | 16:01 |
openstackgerrit | Merged openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul https://review.openstack.org/505297 | 16:02 |
*** dprince has quit IRC | 16:02 | |
jeblair | fungi: any chance you stop/started gertty? | 16:02 |
clarkb | fungi: ^ you good for me to uncomment crontab entry now? | 16:02 |
clarkb | 1615 will be first puppet_run_all run | 16:03 |
*** yamamoto has quit IRC | 16:03 | |
*** sbezverk has quit IRC | 16:03 | |
*** egonzalez has quit IRC | 16:03 | |
fungi | jeblair: i stop/started it shortly before reviewing those changes because it was hung syncing from overnight and so wasn't finding them | 16:03 |
fungi | clarkb: yeah, go for it | 16:03 |
*** isaacb has joined #openstack-infra | 16:04 | |
*** dprince has joined #openstack-infra | 16:04 | |
clarkb | fungi: #*/15 * * * * flock -n /var/run/puppet/puppet_run_all.lock bash /opt/system-config/production/run_all.sh >> /var/log/puppet_run_all_cron.log 2>&1 that is the line you commented out? | 16:05 |
clarkb | the log file looks wrong to me, shouldn't it be puppet_run_all.log? | 16:06 |
clarkb | looking at system config I may be wrong, but want to double check I am restoring the correct entry | 16:07 |
*** hashar_ is now known as hashar | 16:08 | |
*** Apoorva has joined #openstack-infra | 16:08 | |
clarkb | aha ansible log path is set to the non _cron path | 16:10 |
clarkb | ok I am uncommenting the line I pasted above | 16:10 |
openstackgerrit | Ihar Hrachyshka proposed openstack-infra/project-config master: Revert "neutron: Make grenade-neutron-dvr-multinode job non-voting" https://review.openstack.org/505318 | 16:10 |
clarkb | 16:15 UTC will be next puppet run | 16:11 |
fungi | clarkb: yep | 16:11 |
fungi | that was the one | 16:11 |
clarkb | thanks | 16:11 |
*** hamzy has quit IRC | 16:11 | |
jeblair | clarkb, fungi, mordred: i think mysql may be implicated in the email slowness | 16:13 |
jeblair | based on thread tracebacks | 16:13 |
jeblair | gerrit is currently between emails, and it's sitting in a socket read under mysql | 16:13 |
*** dizquierdo has joined #openstack-infra | 16:14 | |
jeblair | i'll try to get the mysql query it's running | 16:14 |
clarkb | ok | 16:14 |
clarkb | ansible-puppet has started | 16:15 |
clarkb | I am watching it for this first pass and assuming it acts as expected look at addressing problems in git backends next | 16:17 |
zaro | clarkb: memory graph seems to indicate a few gc cycles then no more. does that seem right? | 16:17 |
clarkb | zaro: yes, we increased the heap memory to 48G from 30G after those gc cycles | 16:17 |
mordred | jeblair: ok. off the phone and done with followups | 16:17 |
clarkb | zaro: it was causing slowness and 500 errors | 16:17 |
clarkb | zaro: since the memory bump it has been happy except for using almost all of the extra memory | 16:18 |
fungi | rackspace doesn't have any larger vm flavors than the one we're using, right? | 16:18 |
clarkb | git backends failed to puppet as expected so review.o.o was not touched (also it is in the emergency file list) | 16:18 |
mordred | jeblair: which logs are implicated mysql? | 16:19 |
jeblair | clarkb, mordred: hrm, that may have been a red herring. i'm not seeing it in any really long queries; it may be that the email thread happens to do a lot of queries and i just keep catching it in one | 16:19 |
jeblair | mordred: i'm looking at the thread dump | 16:19 |
mordred | ok. I'll look at mysql to see if I see anything just in case | 16:19 |
clarkb | fungi: there are 90 and 120 GB ram flavors | 16:19 |
clarkb | we are using 60GB flavor currently | 16:19 |
fungi | ahh, okay | 16:20 |
fungi | for some reason i thought 60 was the top | 16:20 |
fungi | regardless that seems pretty huge | 16:20 |
*** caphrim007 has joined #openstack-infra | 16:20 | |
clarkb | fungi: I think zaro will likely report 60gb host instance is small for gerrit, I want to say people run it with 256GB baremetal in places | 16:21 |
clarkb | that said I generally agree, but EJAVA | 16:21 |
mordred | fungi: what's the process for changing the my.cnf config for our db? | 16:21 |
jeblair | i still think we're on the larger size | 16:21 |
clarkb | melody reports a massive drop in memory use fwiw | 16:21 |
fungi | mordred: puppet | 16:21 |
clarkb | and no major gc blip | 16:21 |
fungi | afaik | 16:22 |
clarkb | so maybe we can reach steady state, we just needed a little more | 16:22 |
fungi | mordred: or you mean within the db instance? | 16:22 |
mordred | fungi: I thought there was some cloud thing ... yah | 16:22 |
mordred | the 'profile' or something | 16:22 |
fungi | mordred: i've been doing it through the rackspace dashboard | 16:22 |
jeblair | clarkb: http://help.collab.net/topic/teamforge178/reference/Gerrit-Performance-Tuning-Cheat-Sheet.pdf | 16:22 |
jeblair | clarkb: the collabnet cheat sheet "large" size only goes up to 32G ram | 16:22 |
fungi | mordred: they have an option editor you use to create a custom configuration for the appropriate database type and server version, and then you apply it to a db instance (and restart the instance) | 16:23 |
mordred | fungi: cool. I'm gonna look at a couple of things based on status variables | 16:24 |
clarkb | puppet ran through afs and is now in the else section of nodes | 16:24 |
clarkb | (so zuul should be getting updated shortly) | 16:24 |
mordred | fungi: I don't see a way to get to the option editor | 16:25 |
mordred | ah - nevermind. found it | 16:25 |
fungi | rackspace dashboard: databases -> mysql configurations | 16:25 |
*** askb has joined #openstack-infra | 16:26 | |
fungi | the one we're using for review-mysql instance right now is the "sanity" configuration for 5.1 | 16:26 |
mordred | oh wow - we're still on 5.1 for gerrit :) | 16:26 |
clarkb | dmsimard: is there an easy way (or any way at all I suppose) of doing a package listing and sorting by repo location? I want to make sure that we aren't using epel for anything important before turning it off on the backends (as that seems to be one of the suggested fixes) | 16:26 |
clarkb | mordred: yes that is why upgrading is next on the list | 16:26 |
mordred | when we do the xenial update for 2.14 we should perhaps consider migrating to 5.7 if we haven't already been considering that | 16:26 |
mordred | cool | 16:26 |
clarkb | mordred: we can't do 4 byte utf8 | 16:26 |
fungi | mordred: yep, leading me to wonder whether the planned db instance upgrade/replacement should get its timetable stepped up | 16:27 |
jeblair | clarkb, mordred: this is really confusing. according to exim, gerrit is sometimes waiting one or two minutes between sending out emails. | 16:27 |
jeblair | it doesn't seem to be stuck in any individual long queries | 16:27 |
fungi | if we really are seeing performance issues we can better control with newer mysql | 16:27 |
dmsimard | clarkb: repoquery -i <package> ? | 16:27 |
jeblair | but i wonder if it's just doing *a lot* of them, or many kinda-sorta-slow ones, or having a lot of cache misses... | 16:27 |
*** bobh has joined #openstack-infra | 16:27 | |
*** prometheanfire has left #openstack-infra | 16:28 | |
*** vhosakot has joined #openstack-infra | 16:28 | |
*** rwsu has quit IRC | 16:28 | |
jeblair | the mysql calls are always underneath the account cache | 16:28 |
dmsimard | clarkb: I have a script that already does almost all of what you're looking for | 16:29 |
jeblair | wow show-caches is very slow to respond | 16:30 |
*** rwsu has joined #openstack-infra | 16:30 | |
*** rwsu has quit IRC | 16:30 | |
*** rwsu has joined #openstack-infra | 16:30 | |
jeblair | accounts | 1024 | 2.9ms | 64% | | 16:31 |
mordred | jeblair: there are definitely a couple of mysql-level caches where we have significant misses, which is what I'm looking at right now | 16:31 |
jeblair | what do you say we bump that ^? | 16:31 |
*** mat128 has quit IRC | 16:32 | |
*** bobh has quit IRC | 16:32 | |
clarkb | dmsimard: any chance it is shareable? I owe you beer or cookies if so | 16:32 |
dmsimard | clarkb: yeah, let me adapt it from https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/nodepool/scripts/filter_packages.sh | 16:33 |
clarkb | thanks | 16:33 |
clarkb | in related news, thread on upstream gerrit ml says upstream gerrit has a case of the 500s today as well | 16:33 |
*** ralonsoh has quit IRC | 16:33 | |
*** bh526r has quit IRC | 16:34 | |
*** sdague has quit IRC | 16:35 | |
*** dprince has quit IRC | 16:35 | |
jeblair | i will work on a change to bump the account cache(s) | 16:36 |
*** sdague has joined #openstack-infra | 16:36 | |
*** jpich has quit IRC | 16:36 | |
dmsimard | clarkb: http://paste.openstack.org/show/621449/ should work | 16:36 |
dmsimard | clarkb: example (unsorted) output http://paste.openstack.org/show/621450/ | 16:37 |
*** mikal has quit IRC | 16:37 | |
*** pcaruana has quit IRC | 16:38 | |
clarkb | jeblair: sounds good | 16:38 |
electrofelix | jamielennox: have you had a chance to review https://review.openstack.org/475474 | 16:39 |
clarkb | dmsimard: thanks I will sanity check git backends for epel shortly | 16:39 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-gerrit master: Allow configuring account cache limits https://review.openstack.org/505328 | 16:39 |
openstackgerrit | Merged openstack/os-client-config master: Fix requires_floating_ip https://review.openstack.org/504616 | 16:41 |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Bump gerrit account cache to 2048 https://review.openstack.org/505330 | 16:43 |
jeblair | clarkb, fungi, mordred: ^ "topic:account-cache" is ready | 16:43 |
clarkb | maybe you want a similar stack for increasing email send thread pool count? | 16:44 |
jeblair | clarkb: i actually don't want to increase the threads until i see one thread behaving well | 16:44 |
clarkb | ok | 16:44 |
clarkb | cache stack lgtm | 16:44 |
jeblair | it should not take 1 minute for a computer to compose an email :) | 16:45 |
fungi | and approved | 16:45 |
*** isaacb has quit IRC | 16:45 | |
clarkb | jeblair: ++ | 16:45 |
jeblair | er i spot a typo in one of those | 16:45 |
*** Qiming has quit IRC | 16:45 | |
jeblair | i will -2, fix, and i'm also going to add another cache entry | 16:46 |
fungi | i unapproved them anyway | 16:46 |
fungi | oh, yep | 16:47 |
clarkb | ok | 16:47 |
*** mikal has joined #openstack-infra | 16:47 | |
fungi | you mixed cache_accounts_by{name,email} with cache_by{name,email} | 16:47 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-gerrit master: Allow configuring account/group cache limits https://review.openstack.org/505328 | 16:48 |
jeblair | fungi: yep. fixed, and added another cache: | 16:48 |
jeblair | groups_byuuid | 1024 | 2.0ms | 36% | | 16:48 |
fungi | thanks | 16:48 |
fungi | yeah, that's probably a good addition | 16:48 |
jeblair | updating system-config change now | 16:48 |
fungi | you can probably remove your cr-2 now | 16:48 |
jeblair | done | 16:49 |
*** trown is now known as trown|lunch | 16:49 | |
clarkb | exim comes from epel on git01 | 16:49 |
clarkb | (still waiting on a complete list) | 16:49 |
*** Qiming has joined #openstack-infra | 16:49 | |
openstackgerrit | James E. Blair proposed openstack-infra/system-config master: Bump gerrit account cache to 2048 https://review.openstack.org/505330 | 16:49 |
*** jpena is now known as jpena|away | 16:49 | |
jeblair | okay, those are both ready again ^ | 16:49 |
fungi | clarkb: oh, right, because postfix is the default for centos i guess? | 16:50 |
*** mat128 has joined #openstack-infra | 16:51 | |
clarkb | fungi: must be? | 16:51 |
jeblair | i've never managed to run a production system without epel | 16:51 |
mordred | jeblair, fungi, clarkb: I have an updated db config for review.o.o I'd like to apply - I made one called "review-5.1" since it's a bit specific to review.o.o | 16:51 |
fungi | i think i've (hopefully) fixed our default language preference for ovh now, though i had to set our "location" to something other than france (or else it only allowed to select french) | 16:51 |
mordred | applying it requires restarting the db | 16:51 |
clarkb | mordred: ok, lets try to coordinate that with a gerrit restart | 16:52 |
mordred | yah | 16:52 |
clarkb | mordred: since we have a few changes we want to apply to gerrit as well | 16:52 |
jeblair | mordred: cool, i think maybe after my tuning changes merge ^ yeah that | 16:52 |
*** rcernin has quit IRC | 16:52 | |
clarkb | also maybe get cgit in? | 16:52 |
anticw | +1 | 16:52 |
*** tesseract has quit IRC | 16:52 | |
fungi | checking out the proposed db config now | 16:52 |
*** baoli has joined #openstack-infra | 16:53 | |
clarkb | based on http://paste.openstack.org/show/621452/ I don't think we can turn off epel. exim and cgit are both in epel and important to the git backends | 16:53 |
jeblair | clarkb: what's the problem with epel? | 16:53 |
mordred | fungi: tl;dr is bump query cache size since we have a high number of query_cache_lowmem_prunes - but we have WAY WAY WAY more reads than writes - and also to bump up the innodb_buffer_pool_size since this instance has 4G but we only have 1.2G allocated to innodb_buffer_pool (which means we effectively only have a 2G instance) | 16:54 |
fungi | mordred: yep, for the sake of others not having to log into the rackspace dashboard, this seems to be your additions over our standard "sanity" config: http://paste.openstack.org/show/621453/ | 16:54 |
mordred | yes. that's correct | 16:55 |
mordred | query_cache_type=1 means "on" ... as opposed to 0 which is off and 2 which is "only use for queries that explicitly request query cache" | 16:55 |
clarkb | jeblair: dmsimard believes that the reason we are failing puppet is epel and centos 7.4 python-paramiko packages conflict. So disabling epel in theory fixes that | 16:55 |
clarkb | jeblair: give me a moment and I will get a paste up of the error from git01 puppet runs | 16:55 |
mordred | and query_cache_wlock_invalidate means "invalidate the query cache when someone takes out an explicit write lock on a table" | 16:56 |
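As a rough illustration of the settings mordred describes, here is a minimal Python sketch. The enum and helper names are made up for this example; only the three `query_cache_type` values and the 1.2G-of-4G figures come from the discussion. The actual contents of the "review-5.1" config are in the paste and are not reproduced here.

```python
from enum import IntEnum


class QueryCacheType(IntEnum):
    """The three query_cache_type values as described in the chat."""

    OFF = 0      # query cache disabled
    ON = 1       # query cache enabled
    DEMAND = 2   # only queries that explicitly request the cache use it


def buffer_pool_fraction(buffer_pool_gb: float, instance_ram_gb: float) -> float:
    """Return the share of instance RAM allocated to innodb_buffer_pool_size."""
    if instance_ram_gb <= 0:
        raise ValueError("instance RAM must be positive")
    return buffer_pool_gb / instance_ram_gb


if __name__ == "__main__":
    # Figures quoted in the chat: 1.2G buffer pool on a 4G instance.
    print(QueryCacheType(1).name)
    print(f"{buffer_pool_fraction(1.2, 4.0):.0%} of RAM goes to the buffer pool")
```

The fraction helper just makes the point from the chat concrete: most of the instance's memory was not available to InnoDB before the change.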
*** yamahata has joined #openstack-infra | 16:56 | |
mordred | clarkb: what are we using python-paramiko for on cgit servers? | 16:56 |
clarkb | http://paste.openstack.org/show/621454/ | 16:56 |
clarkb | mordred: it is a jeepyb dep | 16:56 |
*** dprince has joined #openstack-infra | 16:56 | |
*** baoli has quit IRC | 16:56 | |
mordred | how about we install it from pip like the rest of jeepyb and don't try to mix system and pip depends? | 16:56 |
dmsimard | clarkb: exim ? for MTA purposes ? I guess there are other MTAs... :P | 16:57 |
jeblair | dmsimard: you want to be our postmaster? | 16:57 |
mordred | dmsimard: we're exim fans | 16:57 |
clarkb | haha | 16:57 |
fungi | dmsimard: exim is also the default mta on the distro for most of the servers we run | 16:57 |
clarkb | fwiw I'm sure happy to use exim as long as I can go to jeblair and ask questions :) | 16:57 |
dmsimard | I haven't done exim in a long time.. either postfix or qmail | 16:57 |
jeblair | anyway... cgit is also important | 16:57 |
*** nmathew has joined #openstack-infra | 16:57 | |
jeblair | let's see if we can tweak the repo priorities | 16:58 |
mordred | dmsimard: we have in our midst a GIANT exim expert, so we take advantage of that | 16:58 |
dmsimard | we do ? | 16:58 |
dmsimard | I have nothing against exim fwiw, was mostly joking | 16:58 |
mordred | dmsimard: jeblair ran the email systems for UC Berkeley before working on OpenStack :) | 16:58 |
mordred | jeblair: how about not installing paramiko from yum/dnf? | 16:59 |
dmsimard | neat | 16:59 |
fungi | i suppose we could tweak puppet to install paramiko from pypi? | 16:59 |
clarkb | http://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/p/python-paramiko-doc-1.16.1-2.el7.noarch.rpm that is epel package | 16:59 |
dmsimard | what's paramiko being installed for in the first place ? | 16:59 |
fungi | right, what mordred just said | 16:59 |
mordred | the pattern of some-things-from-pip-some-things-from-distro isn't a great pattern and we tend to discourage it | 16:59 |
*** kgiusti has left #openstack-infra | 16:59 | |
fungi | especially on rh platforms where pip and rpm fight over file locations | 16:59 |
mordred | and it is in the jeepyb requirements - so we can just remove it from puppet and let the jeepyb install take care of it I believe | 16:59 |
jeblair | do we install any jeepyb from pypi? | 17:00 |
mordred | from git | 17:00 |
jeblair | any dependencies? | 17:00 |
mordred | we do a pip install . in /opt/jeepyb | 17:00 |
clarkb | trying to find the centos 7 paramiko version to compare | 17:00 |
mordred | yah - a pile of them | 17:00 |
mordred | including paramiko | 17:00 |
dmsimard | jeepyb uses paramiko ? or where is it from ? | 17:00 |
clarkb | we likely install paramiko from distro packages because in the past we needed to do C builds for paramiko | 17:00 |
*** prometheanfire has joined #openstack-infra | 17:00 | |
clarkb | but now cryptography ships a wheel so that is no longer a problem | 17:00 |
mordred | ah - yah. that sounds like why that's there | 17:00 |
clarkb | dmsimard: jeepyb speaks ssh to gerrit to do things and relies on paramiko for that | 17:01 |
dmsimard | ah, ok, makes sense | 17:01 |
mordred | are we still running openstackwatch? | 17:01 |
*** gouthamr has quit IRC | 17:01 | |
*** gouthamr has joined #openstack-infra | 17:01 | |
clarkb | http://mirror.centos.org/centos/7/extras/x86_64/Packages/python-paramiko-2.1.1-2.el7.noarch.rpm that is the non epel package | 17:01 |
clarkb | so ya those two are conflicting. Is that not an epel/centos bug too? | 17:02 |
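To make the conflict between the two linked packages concrete, here is a small Python sketch that splits an RPM file name into name, version, release and arch, then compares the versions. The function names are hypothetical, and the numeric comparison is a simplification that assumes dotted-integer versions; real RPM version ordering has more rules than this.

```python
from typing import Tuple


def parse_rpm_filename(url_or_name: str) -> Tuple[str, str, str, str]:
    """Split 'name-version-release.arch.rpm' into its four parts."""
    filename = url_or_name.rsplit("/", 1)[-1]
    if filename.endswith(".rpm"):
        filename = filename[: -len(".rpm")]
    rest, arch = filename.rsplit(".", 1)
    # The name itself may contain dashes, so split from the right.
    name, version, release = rest.rsplit("-", 2)
    return name, version, release, arch


def version_tuple(version: str) -> Tuple[int, ...]:
    """Simplified comparison key; assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    epel = parse_rpm_filename(
        "http://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/p/"
        "python-paramiko-doc-1.16.1-2.el7.noarch.rpm"
    )
    extras = parse_rpm_filename(
        "http://mirror.centos.org/centos/7/extras/x86_64/Packages/"
        "python-paramiko-2.1.1-2.el7.noarch.rpm"
    )
    print(epel, extras)
    print(version_tuple(epel[1]) < version_tuple(extras[1]))
```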
clarkb | mordred: I don't think so, e-r replaced that didn't it ? | 17:02 |
*** thorre_se has joined #openstack-infra | 17:02 | |
mordred | oh - that's making the rss feed ... | 17:02 |
dmsimard | EmilienM, mwhahaha: does puppet have a 'disablerepo' or 'enablerepo' equivalent for the package resource ? | 17:03 |
mwhahaha | dmsimard: wat | 17:03 |
dmsimard | mwhahaha: like if we want to selectively enable epel for a particular package installation | 17:03 |
dmsimard | yum --enablerepo epel install foo | 17:03 |
openstackgerrit | Merged openstack-infra/system-config master: Switch review.o.o to cgit links https://review.openstack.org/505067 | 17:04 |
mwhahaha | dmsimard: not likely | 17:04 |
jeblair | mordred: are you writing a puppet-jeepyb patch? | 17:04 |
dmsimard | mwhahaha: my googlefu made me lean towards that conclusion as well but thought I'd double check | 17:04 |
mordred | jeblair: yes | 17:04 |
jeblair | cool | 17:04 |
*** thorre has quit IRC | 17:04 | |
*** thorre_se is now known as thorre | 17:04 | |
*** baoli has joined #openstack-infra | 17:04 | |
dmsimard | clarkb: I *think* that technically the paramiko package from EPEL should be retired | 17:05 |
mwhahaha | dmsimard: yes it does | 17:05 |
mwhahaha | dmsimard: https://docs.puppet.com/puppet/latest/type.html#package-provider-yum | 17:05 |
mwhahaha | dmsimard: install_options | 17:05 |
jeblair | fungi: can you +3 https://review.openstack.org/505300 ? | 17:05 |
dmsimard | clarkb: because EPEL has a policy not to provide packages that are in base OS | 17:05 |
clarkb | dmsimard: is that something you can file a bug against just so that we are good users? | 17:05 |
*** hamzy has joined #openstack-infra | 17:05 | |
dmsimard | clarkb: I'll check with someone more knowledgeable than I | 17:05 |
clarkb | dmsimard: I linked both conflicting versions of packages above if you need concrete links | 17:05 |
*** ltomasbo has quit IRC | 17:06 | |
fungi | jeblair: thanks, i missed that since it lacked a topic | 17:06 |
dmsimard | clarkb: you had a paste of the puppet error log right ? | 17:07 |
*** Swami has quit IRC | 17:07 | |
openstackgerrit | Merged openstack-infra/puppet-gerrit master: Allow configuring account/group cache limits https://review.openstack.org/505328 | 17:07 |
*** askb has quit IRC | 17:07 | |
clarkb | dmsimard: ya http://paste.openstack.org/show/621454/ there is more to it than that (more files) but extra content seemed mostly redundant | 17:07 |
jeblair | dmsimard, clarkb: potentially relevant: https://bugzilla.redhat.com/show_bug.cgi?id=1481618 | 17:08 |
openstack | bugzilla.redhat.com bug 1481618 in python-cryptography "RFE: python 3 package for python-cryptography in EPEL7" [Unspecified,New] - Assigned to jeremy | 17:08 |
clarkb | jeblair: the zuul.conf update is in place on zuul.o.o | 17:08 |
clarkb | jeblair: I think that means we can rebase tobiash's change back on master again and then reinstall zuul and restart whenever ready | 17:08 |
*** caphrim007 has quit IRC | 17:09 | |
jeblair | clarkb: yeah, let's do that with our 'everything' restart? | 17:09 |
clarkb | wfm | 17:09 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Stop installing python depends from distro https://review.openstack.org/505335 | 17:09 |
*** szahers has joined #openstack-infra | 17:09 | |
* clarkb is trying to keep the etherpad up to date as we go | 17:09 | |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Stop installing python depends from distro https://review.openstack.org/505335 | 17:09 |
mordred | kk. I think that ^^ should do it | 17:09 |
*** szahers has quit IRC | 17:10 | |
dmsimard | clarkb: in the meantime, we can use --disablerepo epel in install_options for the package puppet resource where paramiko is installed | 17:10 |
mordred | I added a subscribe on the removal so that we'll re-run pip install when we remove those packages | 17:10 |
dmsimard | or what mordred is doing works too I guess | 17:10 |
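The per-package repo selection discussed above amounts to passing `--enablerepo` or `--disablerepo` on the yum command line, which is what `install_options` would forward. A minimal Python sketch of building that command follows; the helper name is hypothetical and it only assembles the argument list, it does not run yum.

```python
from typing import List, Sequence


def yum_install_argv(package: str,
                     enable: Sequence[str] = (),
                     disable: Sequence[str] = ()) -> List[str]:
    """Build a yum install command that toggles repos for this call only."""
    argv = ["yum"]
    for repo in disable:
        argv += ["--disablerepo", repo]
    for repo in enable:
        argv += ["--enablerepo", repo]
    argv += ["install", package]
    return argv


if __name__ == "__main__":
    # Matches the form dmsimard quoted: yum --enablerepo epel install foo
    print(" ".join(yum_install_argv("foo", enable=["epel"])))
    # The interim workaround suggested for paramiko.
    print(" ".join(yum_install_argv("python-paramiko", disable=["epel"])))
```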
*** jpena|away has quit IRC | 17:10 | |
*** amoralej has quit IRC | 17:10 | |
clarkb | ok I have approved mordred's change. I have got a tail of the puppet apply logs going so will try and watch for that | 17:12 |
*** xyang1 has joined #openstack-infra | 17:13 | |
openstackgerrit | Merged openstack-infra/project-config master: Update post ref regex https://review.openstack.org/505300 | 17:13 |
jlvillal | Are there any known issues with emails from Gerrit? | 17:13 |
*** ccamacho has left #openstack-infra | 17:13 | |
*** dizquierdo is now known as alpgarcia | 17:13 | |
jlvillal | On a patch of mine, I had some comments from people, and I uploaded a new patch. But I do not see any emails about it. | 17:14 |
jlvillal | I am seeing other emails from Gerrit though. | 17:14 |
*** sambetts is now known as sambetts|afk | 17:14 | |
clarkb | jlvillal: yes, they are being processed very slowly | 17:15 |
clarkb | jlvillal: more details can be found at https://etherpad.openstack.org/p/gerrit-2.13-issues work is in progress to attempt to address it | 17:15 |
jlvillal | clarkb: Ah, okay. I'll be more patient then. Thanks! | 17:15 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Allow removing openstackwatch https://review.openstack.org/505339 | 17:20 |
mordred | clarkb: ^^ we are still running openstackwatch - but the cloud account it uses doesn't actually work, so it's just a thing we run that errors once an hour | 17:20 |
fungi | interesting spike on review.o.o's eth1 graph every ~4 hours | 17:21 |
fungi | looks like it probably predates the upgrade though | 17:21 |
*** askb has joined #openstack-infra | 17:22 | |
fungi | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=34&rra_id=all | 17:23 |
clarkb | memory use is still looking decent | 17:23 |
*** Swami has joined #openstack-infra | 17:24 | |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Remove openstackwatch https://review.openstack.org/505342 | 17:26 |
openstackgerrit | Monty Taylor proposed openstack-infra/system-config master: Remove openstackwatch reference https://review.openstack.org/505343 | 17:26 |
openstackgerrit | Merged openstack-infra/puppet-jeepyb master: Stop installing python depends from distro https://review.openstack.org/505335 | 17:27 |
*** udesale has joined #openstack-infra | 17:27 | |
*** efoley has quit IRC | 17:28 | |
clarkb | I think 17:45UTC will be the next puppet run on git backends | 17:28 |
clarkb | should include ^ | 17:28 |
*** bobh has joined #openstack-infra | 17:29 | |
*** rbrndt has quit IRC | 17:29 | |
mordred | clarkb: sweet | 17:29 |
*** olaph has quit IRC | 17:30 | |
*** ltomasbo has joined #openstack-infra | 17:30 | |
*** ijw has joined #openstack-infra | 17:30 | |
*** udesale has quit IRC | 17:31 | |
jeblair | clarkb: i need to afk for 30m; it sounds like you might be ready to do the global restart during that time. i think aside from manually merging/installing the zuul patch (which you're obviously familiar with) we should be all set. | 17:32 |
*** olaph has joined #openstack-infra | 17:32 | |
jeblair | clarkb: derp we're missing a change: https://review.openstack.org/505330 | 17:32 |
clarkb | I'll recheck it | 17:32 |
clarkb | I think that is the centos ssh keys issue that I have sort of attempted to debug in spare time | 17:32 |
jeblair | clarkb: thx | 17:33 |
*** bobh has quit IRC | 17:33 | |
fungi | would cleaning up http://git.openstack.org/cgit/openstack-infra/system-config/tree/playbooks/clouds_layouts.yml#n49 help? | 17:34 |
*** tosky has quit IRC | 17:34 | |
fungi | i'll propose a change for that, missed cleanup | 17:35 |
clarkb | fungi: maybe? I have pushed https://review.openstack.org/501887 to help debug | 17:35 |
clarkb | ok rough plan: git backends will update shortly after 1745, see how they do and if successful then we can kick.sh review.o.o with all the changes we want in | 17:35 |
*** rtjure has quit IRC | 17:38 | |
fungi | oh, i guess i already did? https://review.openstack.org/489711 | 17:38 |
clarkb | I don't think pabelanger's concern there is really worth worrying about. Nodepool uses baked in keys to ssh not metadata provided keys | 17:39 |
fungi | pabelanger: can you elaborate on why that will cause nodepool to start failing to boot instances? | 17:39 |
clarkb | so everything nodepool should work fine | 17:39 |
clarkb | we'll just have a period of time where maybe boot will fail because key is between being deleted and recreated | 17:39 |
clarkb | (but nodepool will recycle those properly) | 17:39 |
*** alpgarcia is now known as dizquierdo | 17:39 | |
fungi | i'm still unclear on what keys need recreating | 17:39 |
fungi | this is only removing keys? | 17:39 |
clarkb | fungi: the nova key item in the nova db | 17:40 |
clarkb | fungi: you can't update infra-root-keys in nova. You have to delete it then recreate it with new content | 17:40 |
fungi | but it will still contain the key nodepool's using right? | 17:41 |
clarkb | nodepool's key is baked into the image by dib so that is orthogonal concern | 17:41 |
fungi | i'm really unclear on why removing keys nodepool doesn't use will cause nodepool to be unable to connect | 17:41 |
*** nmathew has quit IRC | 17:41 | |
clarkb | fungi: right its a non issue | 17:42 |
pabelanger | fungi: clarkb: 489711 won't actually delete SSH keys in openstack clouds. In fact, I don't think ansible will run properly once it is merged | 17:42 |
*** rtjure has joined #openstack-infra | 17:42 | |
pabelanger | fungi: clarkb: so, we'll first need to delete infra-root-keys in all clouds, then run cloud launcher again | 17:42 |
clarkb | the only issue re nodepool is when the key is deleted and not yet recreated we could attempt to boot and cause problems because nodepool will set the infra-root-keys key name on the boot args | 17:42 |
pabelanger | but, when we delete infra-root-keys, we'll fail to launch new nodes, until recreated | 17:42 |
clarkb | we can also just delete it a few regions at a time and rerun the cloud launcher | 17:42 |
fungi | pabelanger: in that case, we should probably find a different model than to rely on automating configuration we can never automatically change :/ | 17:42 |
pabelanger | fungi: plan is to patch ansible ssh key task to allow us to force update, but I haven't pushed up that change yet | 17:43 |
fungi | got it, so we're waiting on a missing ansible feature | 17:43 |
fungi | i'm afraid i don't understand enough about the mechanism being described there to know how to go about making the required alterations manually | 17:44 |
*** shardy has quit IRC | 17:44 | |
fungi | use openstackclient to replace the nova ssh keys object with a new bundled version matching the desired change? | 17:45 |
clarkb | mordred: https://review.openstack.org/#/c/505339/ has failures | 17:45 |
*** martinkopec has joined #openstack-infra | 17:45 | |
pabelanger | fungi: we can write a new playbook to rotate keys, it is just a 2 step process. I can take a stab at it once off the phone with airline | 17:46 |
fungi | well, this isn't key rotation, so i'm still confused | 17:47 |
fungi | we're just revoking access for inactive users | 17:47 |
pabelanger | fungi: openstack doesn't have an API to replace an existing ssh key. You first have to delete it, then recreate it using same name | 17:47 |
fungi | we're not replacing an ssh key, we're removing several ssh keys and leaving the others untouched (not replaced) | 17:47 |
*** electrofelix has quit IRC | 17:48 | |
clarkb | fungi: right but its a single key in the nova api/db | 17:48 |
pabelanger | right, but they are blobed into a single key entry | 17:48 |
clarkb | and there is no update method | 17:48 |
fungi | or is what openstack calls an "ssh key" actually a set of keys? | 17:48 |
clarkb | yes that | 17:48 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Allow removing openstackwatch https://review.openstack.org/505339 | 17:48 |
fungi | okay, so terminology mismatch | 17:48 |
mordred | clarkb: that should do it | 17:48 |
mordred | clarkb, fungi, pabelanger: reading keys scrollback | 17:49 |
*** harlowja has joined #openstack-infra | 17:49 | |
fungi | mordred: i think we came to terms with it... the comment pabelanger left on the change used the term "key" in two different ways to mean two different things (because openstack uses the term to mean something different than an actual human would) | 17:50 |
fungi | should be more properly referred to as a "keyset" or something | 17:50 |
mordred | yes - I agree with that | 17:52 |
mordred | fungi: yah - it's actually a "blob of contents to put into an authorized_keys file" | 17:53 |
mordred | fungi: I'm honestly not sure if it supporting more than one key is on purpose - I think it's an accident of how both it and glean and cloud-init all happen to operate | 17:53 |
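Since the "key" here is really a blob of authorized_keys lines, revoking access for some users means rebuilding the whole blob without their lines. A minimal Python sketch, assuming each line has the plain `type base64 comment` form with no options prefix; the function names and the sample keys are placeholders, not real data.

```python
from typing import Iterable, Set


def build_keyset(public_keys: Iterable[str]) -> str:
    """Join individual public key lines into one authorized_keys-style blob."""
    lines = [key.strip() for key in public_keys if key.strip()]
    return "\n".join(lines) + "\n"


def remove_by_comment(keyset: str, comments_to_drop: Set[str]) -> str:
    """Return a new blob without the lines whose comment field matches."""
    kept = []
    for line in keyset.splitlines():
        fields = line.split()
        # Assumed layout: key type, base64 data, then the comment.
        comment = fields[2] if len(fields) >= 3 else ""
        if comment not in comments_to_drop:
            kept.append(line)
    return build_keyset(kept)


if __name__ == "__main__":
    blob = build_keyset([
        "ssh-ed25519 AAAAexample1 alice@example",  # placeholder key
        "ssh-ed25519 AAAAexample2 bob@example",    # placeholder key
    ])
    print(remove_by_comment(blob, {"bob@example"}), end="")
```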
fungi | so more clearly stated, the concern raised over 489711 is that altering the infra-root-keys variable in the clouds_layouts playbook will cause nodepool to be unable to find the bundled keyset previously created on our cloud providers? | 17:54 |
mordred | in any case, I agree with pabelanger - ansible-cloud-launcher can be updated to do a two-step delete/create - or we can add a force-replace flag to os_keypair itself and make a-c-l use that | 17:54 |
mordred | fungi: well - two concerns - one is that that change will not actually get applied correctly | 17:55 |
fungi | i don't get why, but it's probably me failing to understand rest api design | 17:55 |
mordred | fungi: the second is that there will be a point in time during whatever period exists when we DO actually apply that change where nodepool will have some boot errors, but I think that's ok | 17:55 |
clarkb | mordred: I agree those boot errors should only happen for a short period and self correct | 17:56 |
fungi | you can't replace the value, but you could delete and recreate it? (why is replacing not the same thing as deleting and recreating?) | 17:56 |
mordred | fungi: to replace the key we'll need to have something do "delete key ; add key" - so for the second issue there's just a moment in time where the key won't be there and boot --keypair=infra-root-keys will fail because infra-root-keys won't be there | 17:56 |
mordred | fungi: yes. you can totally delete and recreate | 17:56 |
mordred | fungi: just none of our current things actually do that | 17:56 |
mordred | fungi: it's not a super huge thing to fix - it's just a bug/deficiency in our current automation | 17:57 |
fungi | oh, i see, so from an api design perspective, replacing a value would in theory offer some guaranteed continuity where there could never be a point in time where the value was empty | 17:57 |
mordred | yah | 17:57 |
mordred | but that's unpossible in nova api - so we're left with delete/create on our side | 17:58 |
*** ociuhandu has quit IRC | 17:58 | |
fungi | whereas deletion and creation back to back would be treated as independent operations with no guarantee of atomicity | 17:58 |
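The delete-then-create approach and its non-atomic window can be sketched in Python against an in-memory stand-in. `FakeKeypairStore` and `ensure_keypair` are hypothetical names for this illustration; this is not the nova API or the ansible module, only a model of an API that offers create and delete but no update.

```python
from typing import Callable, Dict, Optional


class FakeKeypairStore:
    """In-memory stand-in for a keypair API with create/delete but no update."""

    def __init__(self) -> None:
        self._keypairs: Dict[str, str] = {}

    def create(self, name: str, public_key: str) -> None:
        if name in self._keypairs:
            raise ValueError(f"keypair {name!r} already exists")
        self._keypairs[name] = public_key

    def delete(self, name: str) -> None:
        del self._keypairs[name]

    def get(self, name: str) -> Optional[str]:
        return self._keypairs.get(name)


def ensure_keypair(store: FakeKeypairStore, name: str, desired: str,
                   on_gap: Optional[Callable[[], None]] = None) -> bool:
    """Recreate the keypair only when its content differs; return True if changed."""
    current = store.get(name)
    if current == desired:
        return False
    if current is not None:
        store.delete(name)
        # Between delete and create the name does not exist, so a boot
        # that references it during this window would fail.
        if on_gap is not None:
            on_gap()
    store.create(name, desired)
    return True


if __name__ == "__main__":
    store = FakeKeypairStore()
    store.create("infra-root-keys", "old keyset\n")
    ensure_keypair(store, "infra-root-keys", "new keyset\n",
                   on_gap=lambda: print("gap:", store.get("infra-root-keys")))
    print(store.get("infra-root-keys"), end="")
```

The `on_gap` hook exists only to make the window observable in the example; a retrying consumer such as the one described in the chat would simply see a short run of failures that clear once the create completes.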
*** ociuhandu has joined #openstack-infra | 17:58 | |
*** baoli has quit IRC | 17:58 | |
*** baoli has joined #openstack-infra | 17:59 | |
clarkb | https://review.openstack.org/#/c/505330/2 is in a weird state | 18:00 |
*** tosky has joined #openstack-infra | 18:02 | |
AJaeger | clarkb: needs rebasing I guess | 18:02 |
clarkb | maybe? | 18:02 |
clarkb | mordred: http://paste.openstack.org/show/621463/ I think you need to remove pyyaml from the removes list | 18:03 |
AJaeger | see the patch is on top of - changeset 1 instead of 4 - and the orange icon on parent | 18:03 |
clarkb | AJaeger: aha thanks | 18:03 |
AJaeger | clarkb: not obvious ;( | 18:03 |
mordred | fungi, pabelanger, clarkb: https://github.com/ansible/ansible/pull/30565 | 18:04 |
clarkb | AJaeger: I will rebase and reapprove, thanks | 18:04 |
mordred | clarkb: piddle. one sec | 18:04 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Bump gerrit account cache to 2048 https://review.openstack.org/505330 | 18:04 |
fungi | i don't even see the orange parent icon | 18:05 |
clarkb | it was there after AJaeger mentioned it | 18:05 |
clarkb | I have since rebased | 18:05 |
mordred | clarkb: I mean, to be fair, we don't need cloud-init on that server - but yah | 18:05 |
fungi | i have it pulled up in the webui from before he mentioned it | 18:05 |
fungi | oh, in the related changes box? | 18:06 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init https://review.openstack.org/505352 | 18:06 |
clarkb | it was next to the Parent(s) line (that is where I noticed it) | 18:06 |
fungi | huh, it never displayed for me | 18:06 |
clarkb | but ya hitting related changes also made it apparent with the orange patchset number | 18:07 |
fungi | at least now if you're looking at an older patchset, the "patch sets (2/3)" shows up construction orange so harder to miss | 18:07 |
clarkb | ya | 18:07 |
openstackgerrit | Ihar Hrachyshka proposed openstack-infra/devstack-gate master: Switch from lib/neutron-legacy to lib/neutron https://review.openstack.org/436798 | 18:08 |
clarkb | memory use still looks good | 18:09 |
clarkb | mordred: see comment on 5352 | 18:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init https://review.openstack.org/505352 | 18:11 |
mordred | clarkb: wow, sorry. my first version of that patch sucked | 18:11 |
pabelanger | mordred: wow, thanks. Would have taken me much longer to get that together | 18:12 |
clarkb | can we get a second review on 505352? should hopefully fix git backend puppeting for real | 18:13 |
mordred | pabelanger: that won't show up in 2.5 - so we still likely want to do a dumber version in a-c-l | 18:14 |
jeblair | done | 18:14 |
*** pvaneck has joined #openstack-infra | 18:15 | |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: DNM: Test Ansible 2.4 https://review.openstack.org/505354 | 18:15 |
pabelanger | mordred: ack | 18:16 |
pabelanger | okay, flights now booked for Sydney. I can get started on cloud-launcher change | 18:16 |
openstackgerrit | Rodrigo Duarte proposed openstack-infra/project-config master: Update LDAP domain driver CI job to run tempest full https://review.openstack.org/492223 | 18:18 |
*** trown|lunch is now known as trown | 18:22 | |
openstackgerrit | Merged openstack-infra/system-config master: Bump gerrit account cache to 2048 https://review.openstack.org/505330 | 18:23 |
clarkb | ok ^ is in now. So once 5352 merges and git backends are puppeting we should be ready to work on the big restart | 18:24 |
*** rhallisey has quit IRC | 18:24 | |
*** rhallisey has joined #openstack-infra | 18:24 | |
*** rtjure has quit IRC | 18:24 | |
*** askb has quit IRC | 18:25 | |
clarkb | I'm going to disable puppet cron now since 5352 just failed | 18:27 |
clarkb | this way we don't have to wait for everything to go around in a circle again | 18:27 |
clarkb | can just enable it once we are ready | 18:27 |
*** bobh has joined #openstack-infra | 18:27 | |
openstackgerrit | Rodrigo Duarte proposed openstack-infra/project-config master: Update LDAP domain driver CI job to run tempest full https://review.openstack.org/492223 | 18:28 |
clarkb | oh we have a meeting today | 18:29 |
clarkb | EDISTRACTED | 18:29 |
fungi | in 30 minutes | 18:29 |
mordred | clarkb: cool. I'm ready whenever | 18:30 |
clarkb | ya I'm gonna prep for meeting, I have rechecked 5352 | 18:32 |
jeblair | why is that job so bad? | 18:33 |
*** felipemonteiro__ has quit IRC | 18:34 | |
*** ociuhandu has quit IRC | 18:34 | |
*** rbrndt has joined #openstack-infra | 18:34 | |
*** felipemonteiro__ has joined #openstack-infra | 18:35 | |
openstackgerrit | Merged openstack-infra/infra-specs master: Gerrit ContactStore Removal is implemented https://review.openstack.org/492287 | 18:37 |
clarkb | jeblair: because we manage ssh keys with puppet and nova. There seems to be some conflict between them on centos in particular with glean? https://review.openstack.org/501887 is attempt at getting more data around what is happening | 18:37 |
*** dtantsur is now known as dtantsur|afk | 18:40 | |
openstackgerrit | Timo Tijhof proposed openstack-infra/zuul master: Status: Remove use of deprecated jQuery jqXHR `complete` method https://review.openstack.org/505366 | 18:42 |
*** felipemonteiro has joined #openstack-infra | 18:44 | |
openstackgerrit | Timo Tijhof proposed openstack-infra/zuul master: Status: Don't toggle panel when clicking patch link https://review.openstack.org/505368 | 18:46 |
*** felipemonteiro__ has quit IRC | 18:46 | |
*** ijw has quit IRC | 18:46 | |
mordred | clarkb: I feel like yesterday when we rolled out the new gerrit the CSS worked very well for me ... | 18:47 |
mordred | clarkb: then today it's back to being horizontal scrolling | 18:47 |
mordred | clarkb: did a thing change? or did I just suck looking at things yesterday? | 18:47 |
clarkb | I think it worked for me in chrome this morning but not firefox | 18:47 |
clarkb | css itself hasnt changed as far as I know | 18:47 |
mordred | weird | 18:48 |
jeblair | question about the status alert -- it says "Post jobs are not executed currently, do not tag any releases". the tag/release pipelines are working, as far as we know, yeah? just not post? so really it should be "do not land any changes"? | 18:48 |
clarkb | that may have just been confusion over the effects? | 18:49 |
openstackgerrit | Merged openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init https://review.openstack.org/505352 | 18:50 |
clarkb | re ^ I am reenabling puppet cron now | 18:50 |
AJaeger | jeblair, clarkb: I suggested that - for tags: Most in the post queue is reproducible, so next run will do the work again. Just tags are unique | 18:50 |
mordred | clarkb: cool | 18:50 |
AJaeger | jeblair: feel free to change message ;) | 18:50 |
*** caphrim007 has joined #openstack-infra | 18:50 | |
AJaeger | jeblair: what did I miss? | 18:51 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Status: Remove use of deprecated jQuery jqXHR `complete` method https://review.openstack.org/505369 | 18:53 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Status: Don't toggle panel when clicking patch link https://review.openstack.org/505370 | 18:53 |
*** kgiusti has joined #openstack-infra | 18:54 | |
mordred | clarkb, pabelanger, fungi: remote: https://review.openstack.org/505371 Recreate keypairs when content is different | 18:54 |
clarkb | Infra meeting in 5 minutes. We will recap PTG (including Zuul things) and go over gerrit upgrade time permitting | 18:55 |
mordred | that ^^ is wonky but I think should fix a-c-l between now and when 2.5 releases | 18:55 |
clarkb | mordred: thanks | 18:55 |
clarkb | I'll have to take a look after the meeting | 18:55 |
openstackgerrit | Walter Scheper proposed openstack-dev/pbr master: Use topo ordering to correctly sort changelog entries https://review.openstack.org/505372 | 18:56 |
fungi | mordred: clarkb: on the css front, it lgtm while anonymous, but once i log in it gets too wide due to additional options in the menuish area | 18:56 |
*** dizquierdo has quit IRC | 18:57 | |
fungi | (it adds "my," "people" and "plugins" to the top row) | 18:58 |
clarkb | ah | 18:59 |
clarkb | css edits require service restarts to pick up but its also less urgent I think we can do that later in the week once we've got the actual functionality problems in a better place | 18:59 |
fungi | likely our logo is wider than the upstream default | 18:59 |
clarkb | ok meeting time now in #openstack-meeting | 18:59 |
mordred | also - the search text box is HUGE | 18:59 |
mordred | fungi: but I've got other things making horizontal scroll than just the top bar | 19:00 |
fungi | in my case (1440px wide browser) just my name in the top bar ends up hanging off the right-hand side | 19:01 |
mordred | fungi: https://imgur.com/a/smPzr fwiw | 19:01 |
fungi | though if i switch from "all" to "my" it gets far worse | 19:01 |
mordred | fungi: the patch list in related goes off too | 19:01 |
fungi | ahh, yeah i was looking at the open changes list | 19:02 |
*** baoli has quit IRC | 19:02 | |
mordred | fungi: yah on the changes list only the name goes off to the side | 19:03 |
*** baoli has joined #openstack-infra | 19:03 | |
openstackgerrit | Merged openstack-infra/zuul master: Status: Remove use of deprecated jQuery jqXHR `complete` method https://review.openstack.org/505366 | 19:04 |
openstackgerrit | Merged openstack-infra/zuul master: Status: Don't toggle panel when clicking patch link https://review.openstack.org/505368 | 19:04 |
*** felipemonteiro has quit IRC | 19:05 | |
*** felipemonteiro has joined #openstack-infra | 19:05 | |
*** felipemonteiro__ has joined #openstack-infra | 19:06 | |
clarkb | mordred: git backends puppeted successfully which means after meeting we do kick.sh review.o.o and do the grand restart of things | 19:07 |
mordred | clarkb: woot | 19:08 |
clarkb | we should probably do a bit of planning before we do that as there are several moving parts, but after meeting | 19:08 |
*** felipemonteiro has quit IRC | 19:09 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Use publish-docs-draft base job for docs-draft publishers https://review.openstack.org/504624 | 19:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Removed unused 'status: ' string from log line https://review.openstack.org/505378 | 19:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Emit shell instead of script tasks https://review.openstack.org/505379 | 19:12 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Omit some jobs from shared queue calculation https://review.openstack.org/505380 | 19:12 |
*** slaweq has quit IRC | 19:14 | |
*** slaweq has joined #openstack-infra | 19:14 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Status: Remove use of deprecated jQuery jqXHR `complete` method https://review.openstack.org/505369 | 19:14 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Status: Don't toggle panel when clicking patch link https://review.openstack.org/505370 | 19:14 |
*** Sukhdev has joined #openstack-infra | 19:17 | |
*** felipemonteiro has joined #openstack-infra | 19:18 | |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Add publish-service-types-authority job and mapping https://review.openstack.org/504609 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Add xstatic-check-version and openstack-tox-pypy https://review.openstack.org/504610 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Remove liberty/mitaka job regexes https://review.openstack.org/504964 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Remove unmatched single quotes from jenkins jobs https://review.openstack.org/504965 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Add mapping file setting to skip jobs from share queues https://review.openstack.org/504966 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Have publiccloud-wg gate on itself, not api-wg https://review.openstack.org/504967 | 19:19 |
openstackgerrit | Monty Taylor proposed openstack-infra/project-config master: Make yaml2ical publication job https://review.openstack.org/504968 | 19:19 |
*** felipemonteiro__ has quit IRC | 19:19 | |
*** felipemonteiro__ has joined #openstack-infra | 19:20 | |
*** wolverineav has quit IRC | 19:22 | |
*** felipemonteiro has quit IRC | 19:24 | |
*** felipemonteiro has joined #openstack-infra | 19:24 | |
Sukhdev | Dear Infra folks, can you please get this taken care of so that I can kick start the work on this project - https://review.openstack.org/#/c/503829/ | 19:25 |
*** felipemonteiro__ has quit IRC | 19:26 | |
*** jrist has joined #openstack-infra | 19:26 | |
AJaeger | Sukhdev: Sorry, we cannot currently. We're in the middle of fixing problems after a gerrit update and cannot create new repos currently. | 19:27 |
fungi | also, in the middle of the infra weekly meeting right this moment | 19:29 |
dmsimard | clarkb: is gerrit still on fire ? I pushed a tag to gerrit and while the tag is on git.openstack.org, it doesn't seem like it has replicated down to github yte. | 19:30 |
dmsimard | s/yte/yet/ | 19:30 |
clarkb | dmsimard: yes, we haven't fixed zuul yet for that (though that may be new and exciting behavior) | 19:30 |
dmsimard | clarkb: nevermind, false alarm | 19:31 |
*** tinwood has quit IRC | 19:31 | |
dmsimard | clarkb: some weird sorting by github, the release tag was put after the rc tag | 19:31 |
clarkb | ok | 19:31 |
Sukhdev | AJaeger : any ETA? | 19:32 |
*** tinwood has joined #openstack-infra | 19:32 | |
*** slaweq_ has joined #openstack-infra | 19:33 | |
*** martinkopec has quit IRC | 19:33 | |
*** jtomasek has quit IRC | 19:35 | |
clarkb | Sukhdev: hopefully we will have the majority of problems sorted out by the end of today. We are working as quickly as possible to clean things up around gerrit | 19:35 |
*** slaweq has quit IRC | 19:37 | |
Sukhdev | clarkb : Thanks. will bug you guys tomorrow in that case - best of luck in sorting things out :-) | 19:38 |
openstackgerrit | Sam Yaple proposed openstack-infra/shade master: Allow domain_id for roles https://review.openstack.org/496992 | 19:38 |
openstackgerrit | Sam Yaple proposed openstack-infra/shade master: Move role normalization to normalize.py https://review.openstack.org/500170 | 19:38 |
*** rhallisey has quit IRC | 19:38 | |
*** rhallisey has joined #openstack-infra | 19:38 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Deal with link-logs macro https://review.openstack.org/505387 | 19:38 |
*** felipemonteiro__ has joined #openstack-infra | 19:40 | |
openstackgerrit | Monty Taylor proposed openstack/os-client-config master: Treat clouds.yaml with one cloud like envvars https://review.openstack.org/505388 | 19:41 |
*** felipemonteiro has quit IRC | 19:43 | |
jeblair | dmsimard: see mail from dhellmann on openstack-dev | 19:43 |
jeblair | 2017-09-19 19:10:07.809429 | /tmp/03-24d8b0b01ba04e599848b6cb92742892.sh: line 21: /usr/zuul-env/bin/zuul-cloner: No such file or directory | 19:44 |
jeblair | http://logs.openstack.org/89/89289c511c8c48c8409dd1fb997fc24e64d1dae6/release/ara-tarball-signing/f812ce9/console.html#_2017-09-19_19_10_07_809429 | 19:44 |
jeblair | i don't immediately understand what's happening there | 19:44 |
*** mat128 has quit IRC | 19:46 | |
jeblair | /usr/zuul-env/bin/zuul-cloner exists on signing01 | 19:47 |
mordred | jeblair: I agree - perhaps it's one of the files being passed as argument to zuul-cloner? | 19:47 |
jeblair | Modify: 2017-09-19 19:10:08.153368290 +0000 | 19:48 |
jeblair | though, it may not have existed at the time that job ran? | 19:48 |
mordred | jeblair: also, ls: cannot access /opt/git: No such file or directory | 19:48 |
jeblair | oh wow | 19:48 |
mordred | although I think zuul-cloner should deal with that | 19:48 |
jeblair | did that job run right in the middle of a puppet run that reinstalled zuul? | 19:48 |
mordred | jeblair: golly - it's entirely possible | 19:48 |
jeblair | normally, that's baked into our images, but i think on the static nodes, we have puppet manage it? | 19:49 |
jeblair | i'm glad that's changing in v3 :) | 19:50 |
* clarkb needs to grab something to drink but then am back for coordinating restarts of things | 19:59 | |
mordred | jeblair: me too | 19:59 |
clarkb | maybe we can collect a list of things to do specific to restart on the etherpad and then get volunteers for the items? | 20:00 |
clarkb | also make sure we have everything in place that we want to get in place | 20:00 |
fungi | wfm | 20:01 |
*** rhallisey has quit IRC | 20:02 | |
*** gouthamr has quit IRC | 20:03 | |
clarkb | Bottom of https://etherpad.openstack.org/p/gerrit-2.13-issues has rough plan | 20:06 |
clarkb | mordred: do you want to be in charge of modifying the db instance for gerrit? | 20:06 |
clarkb | jeblair: you want to do zuul installation update and restart? | 20:07 |
*** jesusaur has quit IRC | 20:08 | |
clarkb | the system-config changes we want appear to be on disk | 20:08 |
*** jesusaur has joined #openstack-infra | 20:11 | |
*** baoli has quit IRC | 20:12 | |
jeblair | clarkb: sure | 20:13 |
*** baoli has joined #openstack-infra | 20:13 | |
*** rcernin has joined #openstack-infra | 20:14 | |
jeblair | clarkb: i will update the branch you made with a rebase on current master | 20:14 |
clarkb | jeblair: sounds good | 20:14 |
clarkb | fungi: maybe you want to do the gerrit stop and starts and the puppet run? (I'd like to be able to watch the syslog and see what it says is changing) | 20:15 |
fungi | sure, i can tail that in a screen session on review.o.o | 20:16 |
jeblair | lemme re-order the etherpad | 20:16 |
clarkb | jeblair: go for it | 20:16 |
jeblair | okay, i think that order looks right :) | 20:16 |
jeblair | clarkb: i will await your 'go' signal to shut down zuul | 20:17 |
clarkb | ya I think we are waiting on mordred to be ready on the db items | 20:17 |
clarkb | mordred: fungi ^ can you indicate when you are ready and if the steps on the etherpad look correct to you? | 20:17 |
fungi | i've switched computers so taking a second to pull up another copy of the pad | 20:18 |
*** jkilpatr has quit IRC | 20:18 | |
*** baoli has quit IRC | 20:18 | |
clarkb | I added the two backup commands to the etherpad too | 20:19 |
*** baoli has joined #openstack-infra | 20:19 | |
pabelanger | able to assist if needed | 20:19 |
fungi | found kick.sh | 20:19 |
clarkb | fungi: system-config/tools/kick.sh and should take the fqdn of the host as the single argument iirc | 20:20 |
fungi | yeah, i just confirmed by reading the script | 20:20 |
fungi | cool, getting a screen session going on review.o.o first tailing syslog and filtering for puppet entries | 20:21 |
ianw | heh, that's better than ctrl-r search for last time i did it :) | 20:21 |
fungi | there are two screen sessions already running as root on review.o.o | 20:21 |
clarkb | the one we used yesterday wasn't turned off | 20:22 |
mordred | clarkb: sorry - got sucked in to call - ready to go | 20:22 |
clarkb | ok I think that means everyone is ready to go? the backups will take a few minutes | 20:22 |
* clarkb composes a #status notice before we start | 20:22 | |
*** e0ne has joined #openstack-infra | 20:22 | |
ianw | fungi: i had one too i think, removed | 20:23 |
fungi | oh well, my screen session there is the newest date anyway | 20:23 |
clarkb | how does that notice message look? I think once I send that we can go ahead and start | 20:23 |
fungi | currently running jobs will be restarted? | 20:24 |
clarkb | jeblair: ^ | 20:24 |
jeblair | clarkb: lgtm | 20:25 |
clarkb | ok lets do this | 20:25 |
clarkb | #status notice Zuul and Gerrit are being restarted to address issues discovered with the Gerrit 2.13 upgrade. review.openstack.org will be inaccessible for a few minutes while we make these changes. Currently running jobs will be restarted for you once Zuul and Gerrit are running again. | 20:25 |
openstackstatus | clarkb: sending notice | 20:25 |
jeblair | i will save queues and stop zuul now | 20:25 |
jeblair | zuul is stopped | 20:26 |
fungi | stopping gerrit now | 20:26 |
-openstackstatus- NOTICE: Zuul and Gerrit are being restarted to address issues discovered with the Gerrit 2.13 upgrade. review.openstack.org will be inaccessible for a few minutes while we make these changes. Currently running jobs will be restarted for you once Zuul and Gerrit are running again. | 20:26 | |
fungi | gerrit is stopped | 20:27 |
fungi | mordred's cue | 20:27 |
clarkb | mordred: you are up for doing backups | 20:27 |
*** e0ne has quit IRC | 20:28 | |
clarkb | did we lose mordred? | 20:29 |
fungi | i have `/opt/.../kick.sh review.openstack.org` queued up in a root screen session on puppetmaster.o.o in case others want to attach | 20:29 |
mordred | clarkb: am here | 20:29 |
mordred | I can do backups - one sec | 20:29 |
clarkb | mordred: details are at https://etherpad.openstack.org/p/gerrit-2.13-issues assigned to you since you are doing the db related stuff | 20:29 |
clarkb | oh the filename probably says 2.11 still | 20:30 |
clarkb | fixed | 20:30 |
mordred | clarkb: backing up now | 20:32 |
*** wolverineav has joined #openstack-infra | 20:34 | |
clarkb | db backup is about 1/3 done based on file size | 20:36 |
mordred | (still dumping - yah) | 20:36 |
*** ijw has joined #openstack-infra | 20:37 | |
jlk | *snicker* | 20:37 |
clarkb | 2/3 now | 20:37 |
clarkb | I've realized we will likely need to manually run manage projects after all this is done because gerrit shouldn't be running when it tries to create that new project... | 20:39 |
dmsimard | I like how everyone keeps referring to "tobias's change" and somehow I'm probably the only one not aware of what it is | 20:39 |
anticw | gotta blame someone | 20:40 |
clarkb | mordred: looks done? | 20:40 |
clarkb | dmsimard: its the case sensitive change in zuul that tobiash wrote | 20:40 |
clarkb | dmsimard: we haven't merged it yet because it is backward incompatible but plan to merge it next week | 20:40 |
dmsimard | oh, okay | 20:40 |
fungi | are we reasonably sure puppet isn't going to abort partway through if it can't apply the manage-projects step? | 20:41 |
mordred | clarkb: yes done | 20:41 |
clarkb | fungi: ya I think that can fail on its own. I can reread it though | 20:41 |
mordred | clarkb: I will now apply the config and restart mysql | 20:41 |
*** Sukhdev has quit IRC | 20:41 | |
fungi | i suppose i can kick puppet on review.o.o in parallel with the trove instance restart? | 20:41 |
*** Goneri has quit IRC | 20:42 | |
clarkb | ya manage projects is its own block in gerrit.pp | 20:42 |
mordred | db is restarting the first time - i will need to restart a second time | 20:42 |
clarkb | fungi: lets maybe do one at a time just for simplicity | 20:42 |
fungi | k | 20:42 |
*** jkilpatr has joined #openstack-infra | 20:42 | |
*** srobert_ has joined #openstack-infra | 20:43 | |
mordred | clarkb, fungi: db has been restarted with new config settings | 20:43 |
*** jrist has quit IRC | 20:43 | |
clarkb | mordred: both required times? | 20:43 |
mordred | clarkb: yes. db is ready to go | 20:44 |
fungi | kicking the puppet now | 20:44 |
clarkb | ok | 20:44 |
*** srobert has quit IRC | 20:45 | |
fungi | i think it's nearly done | 20:46 |
fungi | puppet apply logging to syslog lgtm | 20:46 |
*** xyang1 has quit IRC | 20:46 | |
clarkb | grr file mode meant that gerrit init ran but it appears to not have done anything other than download and install some libs | 20:46 |
fungi | though scrolled by rapidly | 20:46 |
clarkb | (I think that is all ok more annoying than anything else) | 20:46 |
clarkb | oh nope | 20:47 |
clarkb | its reindexing | 20:47 |
clarkb | damnit gerrit | 20:47 |
*** srobert_ has quit IRC | 20:47 | |
fungi | so mode change caused the exec to be notified? | 20:47 |
mordred | clarkb: wait- what? | 20:47 |
clarkb | fungi: ya | 20:47 |
mordred | gah | 20:47 |
fungi | should i kill the reindex process? | 20:47 |
clarkb | I think we can kill the reindex process | 20:48 |
fungi | (out of band) | 20:48 |
clarkb | then restore the index that mordred backed up | 20:48 |
fungi | will do | 20:48 |
clarkb | ya | 20:48 |
mordred | kk | 20:48 |
mordred | index.backup.1505853141 | 20:48 |
mordred | /home/gerrit2/index.backup.1505853141 that is | 20:48 |
fungi | killed | 20:49 |
fungi | puppet is continuing from there (may abort?) | 20:49 |
clarkb | and puppet failed | 20:49 |
clarkb | so I think we restore the index backup (lets copy not move it so that we have it if we need it again) | 20:49 |
clarkb | then rerun kick.sh | 20:49 |
fungi | do we need to re-kick or just assume all is well with the restore? | 20:49 |
clarkb | no lets rerun kick.sh | 20:49 |
fungi | k | 20:49 |
fungi | mordred: restoring? | 20:50 |
mordred | on it | 20:50 |
fungi | i'm all queued up for another run as soon as we're clear | 20:50 |
clarkb | (thats my bad, I re double checked file hashes to make sure puppet wouldn't do this but failed to remember mode) | 20:50 |
mordred | root@review:/home/gerrit2# mv review_site/index/ index.borked.$(date +%s) | 20:51 |
mordred | root@review:/home/gerrit2# cp -ax index.backup.1505853141 review_site/index | 20:51 |
mordred | is what I did | 20:51 |
mordred | we can probably just rm that borked one | 20:51 |
mordred | it's done copying | 20:51 |
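
The set-aside-and-restore sequence mordred describes (move the half-rebuilt index out of the way, then copy the backup in rather than moving it) can be sketched in Python. This is only an illustration under stated assumptions: `restore_index` and `demo` are hypothetical names, the directory layout in `demo` is a throwaway temp tree, and `shutil.copytree` keeps modes and timestamps but not ownership, so it is not a full stand-in for `cp -ax`.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def restore_index(site_dir: Path, backup_dir: Path, now: Optional[int] = None) -> Path:
    """Set the live index aside and copy the backup into its place.

    Hypothetical helper; returns the path the old index was moved to.
    """
    stamp = int(time.time()) if now is None else now
    live_index = site_dir / "index"
    # Rough equivalent of `mv review_site/index/ index.borked.$(date +%s)`.
    set_aside = site_dir.parent / f"index.borked.{stamp}"
    shutil.move(str(live_index), str(set_aside))
    # Copy rather than move, so the backup is still there if it is needed again.
    shutil.copytree(backup_dir, live_index, symlinks=True)
    return set_aside


def demo():
    """Exercise restore_index against a temporary directory tree."""
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        site = home / "review_site"
        (site / "index").mkdir(parents=True)
        (site / "index" / "segment").write_text("half-built")
        backup = home / "index.backup.1505853141"
        backup.mkdir()
        (backup / "segment").write_text("good")
        aside = restore_index(site, backup, now=1505854000)
        restored = (site / "index" / "segment").read_text()
        return restored, aside.name, backup.exists()
```

Calling `demo()` restores the backup contents, leaves the set-aside copy next to `review_site`, and keeps the backup directory intact.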
fungi | okay, ready for me to re-kick? | 20:52 |
mordred | (yay for blocks still in filesystem cache from previous move I'm guessing) | 20:52 |
mordred | good on my end | 20:52 |
clarkb | ya I think that we are ready to rerun kick.sh | 20:52 |
fungi | fired up | 20:52 |
clarkb | log output looks better | 20:53 |
*** jcoufal has quit IRC | 20:53 | |
fungi | finished, yeah | 20:54 |
clarkb | and no gerrit running as expected | 20:54 |
clarkb | jeblair: are you ready on the zuul side? I think we are ready to start gerrit | 20:54 |
jeblair | clarkb: go for it | 20:55 |
fungi | patch applied? | 20:55 |
clarkb | fungi: I think you can start gerrit manually now | 20:55 |
fungi | okay, starting gerrit | 20:55 |
clarkb | gerrit log lgtm | 20:56 |
fungi | initscript ran | 20:56 |
jeblair | maybe restart apache? | 20:56 |
clarkb | ya I think apache needs to be convinced as well | 20:56 |
fungi | doing | 20:56 |
fungi | done | 20:57 |
clarkb | its there and reports the correct version | 20:57 |
jeblair | zuul is starting (receiving worker registrations) | 20:58 |
clarkb | cgit may not be quite working, links are there but at least one I tested isn't working | 20:58 |
clarkb | we should be able to iterate on cgit more easily now that puppet should be happyness | 20:58 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-sphinx master: Update exception message to include directories https://review.openstack.org/505400 | 20:58 |
clarkb | hrm did the cache settings not apply? | 20:59 |
* clarkb looks for them | 21:00 | |
anticw | cgit links aren't right :( | 21:00 |
fungi | anticw: yeah, clarkb already spotted that | 21:00 |
ianw | clarkb: hmm, it's got an extra ".git" in there | 21:00 |
ianw | but works without that | 21:00 |
mordred | clarkb: where should I be looking for cgit links? | 21:00 |
fungi | ianw: okay, so probably easy patch | 21:00 |
clarkb | mordred: on the left hand side next to commit and parent rows | 21:00 |
*** esberglu has quit IRC | 21:00 | |
mordred | aha! there we go | 21:01 |
ianw | mordred: ctrl-f (cgit) | 21:01 |
*** esberglu has joined #openstack-infra | 21:01 | |
mordred | I was looking at a change that did not have the links - and refreshing did not add them :( | 21:01 |
mordred | https://review.openstack.org/#/c/502351/ | 21:02 |
mordred | still no cgit there for me | 21:02 |
clarkb | oh we have to pass that cache accounts stuff into ::gerrit | 21:02 |
clarkb | jeblair: ^ should I go ahead and write that change? | 21:02 |
mordred | but https://review.openstack.org/#/c/500170/ does | 21:02 |
clarkb | they are there for me on 502351, I'm guessing browser shenanigans | 21:03 |
jeblair | clarkb: yes please since i don't know what you're talking about; that will be the easiest way for you to explain it to me :) | 21:03 |
clarkb | jeblair: ok will be up momentarily | 21:03 |
mordred | weird | 21:03 |
mordred | shift-reload also doesn't add them - but it's not important if it's just browser-side weirdness for me | 21:03 |
jeblair | i'm trying to figure out why no zuul jobs are running | 21:03 |
fungi | did the cache options not end up in openstack_project::gerrit in addition to openstack_project::review? | 21:04 |
*** esberglu has quit IRC | 21:05 | |
fungi | aha, missing from openstack_project::review | 21:05 |
fungi | can't believe i didn't catch that | 21:05 |
openstackgerrit | Clark Boylan proposed openstack-infra/system-config master: Pass gerrit cache options through to puppet-gerrit https://review.openstack.org/505402 | 21:05 |
fungi | no, wait, we don't need it there | 21:06 |
clarkb | fungi: its actually ^ | 21:06 |
clarkb | ya we don't need it in review.pp | 21:06 |
fungi | right, now i get it | 21:06 |
pabelanger | nodepool building nodes in rax now | 21:07 |
fungi | i guess we could have made them not class params to openstack_project::gerrit | 21:07 |
fungi | and just set them on the gerrit module instantiation within it | 21:07 |
pabelanger | there we go, nodepool looks to be working | 21:08 |
clarkb | jeblair: let us know if you need any help | 21:08 |
fungi | my gertty gets really unhappy after gerrit outages | 21:08 |
jeblair | pabelanger: what was happening? | 21:08 |
clarkb | looks like maybe just waiting on nodes per pabelanger's comments | 21:08 |
jeblair | oh i guess it wasn't even building ready nodes due to geard outage | 21:08 |
*** e0ne has joined #openstack-infra | 21:08 | |
jeblair | re-enqueuing | 21:09 |
pabelanger | jeblair: nodepool looked to be not launching any new nodes because allocation requests were 0 | 21:09 |
pabelanger | but once we started deleting old jobs, it started requesting new allocations | 21:09 |
clarkb | if we can't fix the .git problem on the gerrit side we can have cgit apache rewrite to exclude the .git | 21:09 |
fungi | hrm, or maybe it's gerrit that's unhappy... i still can't retrieve 505402 | 21:09 |
jeblair | fungi: it's because gertty fetches from git.o.o to be nice, and gerrit is replicating | 21:10 |
clarkb | fungi: its there via web ui, are you trying to pull from git.o.o? | 21:10 |
clarkb | ya that | 21:10 |
fungi | jeblair: aha, thanks. forgot about that | 21:10 |
*** trown is now known as trown|outtypewww | 21:10 | |
* fungi will go to a computer capable of using gerrit's webui | 21:10 | |
bkero | fungi: have you tried opening gerrit in dillo? | 21:10 |
ianw | clarkb: yeah, i don't see any settings, on either the gerrit side or the cgit side | 21:11 |
ianw | cgit can ignore the .git if its in the repo, but not ignore it in an incoming url aiui | 21:11 |
ianw | i can look at a url rewriter if you like | 21:11 |
clarkb | ianw: thats probably the easiest thing to do for now and generally user friendly to rewrite that way I think | 21:12 |
ianw | clarkb: ok, will look into after breakfast :) | 21:12 |
*** esberglu has joined #openstack-infra | 21:12 | |
mordred | clarkb, ianw: the cgit type in gerrit hard-codes the .git | 21:13 |
mordred | clarkb, ianw: but there is also a "custom" type that allows us to set url templates | 21:13 |
fungi | so gerrit insists on adding a .git to the project name in those urls? | 21:13 |
mordred | type.setRevision("${project}.git/commit/?id=${commit}"); | 21:13 |
mordred | for instance | 21:13 |
clarkb | ok so that implies the only change we need to get in before restarting again is https://review.openstack.org/505402 ? | 21:13 |
mordred | that's in "case cgit" | 21:13 |
fungi | weird, given cgit doesn't need nor want the .git | 21:13 |
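
The difference between the built-in cgit link type and a "custom" one can be sketched by expanding both templates. The first template is the one mordred quotes; the second is an assumed override that simply drops the `.git`, and `revision_url` is a hypothetical helper, not Gerrit code or the content of any merged change.

```python
from string import Template

# Template quoted in the channel from Gerrit's built-in "cgit" link type.
BUILTIN_REVISION = Template("${project}.git/commit/?id=${commit}")
# Assumed override for a "custom" link type: the same link without ".git".
CUSTOM_REVISION = Template("${project}/commit/?id=${commit}")


def revision_url(base: str, template: Template, project: str, commit: str) -> str:
    """Expand a link template against a cgit base URL (hypothetical helper)."""
    path = template.substitute(project=project, commit=commit)
    return base.rstrip("/") + "/" + path


if __name__ == "__main__":
    base = "http://git.openstack.org/cgit/"
    for tmpl in (BUILTIN_REVISION, CUSTOM_REVISION):
        print(revision_url(base, tmpl, "openstack-infra/zuul", "abc123"))
```

The two printed URLs differ only by the `.git` suffix after the project name, which is the part ianw reports breaking the links.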
*** thorst has quit IRC | 21:14 | |
clarkb | I think the process for that is we want to get ^ merged, make sure it ends up in system-config on puppetmaster, then rerun kick.sh | 21:14 |
clarkb | then manually restart gerrit when ready | 21:14 |
mordred | well- I think for now we can likely emit our own things ... one sec, I think I can make a quick patch | 21:14 |
clarkb | ok | 21:14 |
clarkb | we also want to remove review.o.o from the emergency file and manually run manage-projects | 21:14 |
*** ldnunes has quit IRC | 21:14 | |
*** thorst has joined #openstack-infra | 21:16 | |
clarkb | good news is I think we are puppeting properly now | 21:17 |
clarkb | and future restarts should be short and not involve zuul or the database | 21:18 |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit https://review.openstack.org/505406 | 21:18 |
mordred | clarkb: ^^ | 21:19 |
clarkb | mordred: can you tab them all for consistency? | 21:19 |
clarkb | mordred: I think without that puppet will be noisy | 21:19 |
clarkb | but otherwise lgtm | 21:20 |
mordred | http://git.openstack.org/cgit/openstack-infra/gerrit/tree/gerrit-server/src/main/java/com/google/gerrit/server/config/GitwebConfig.java?h=openstack/2.13.8#n158 | 21:20 |
mordred | clarkb: totes. one sec | 21:20 |
*** thorst has quit IRC | 21:20 | |
openstackgerrit | Monty Taylor proposed openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit https://review.openstack.org/505406 | 21:21 |
openstackgerrit | Matt Riedemann proposed openstack-infra/elastic-recheck master: Add query for live migration invalid disk info bug 1718295 https://review.openstack.org/505410 | 21:22 |
openstack | bug 1718295 in OpenStack Compute (nova) "Unexpected exception in API method: MigrationError_Remote: Migration error: Disk info file is invalid: qemu-img failed to execute - Failed to get shared "write" lock\nIs another process using the image?" [High,Confirmed] https://launchpad.net/bugs/1718295 | 21:22 |
clarkb | infra-root can we get a second review on 505406? I think with that and the cache fix in we want to rerun kick.sh and monitor gerrit (then restart gerrit again) | 21:23 |
clarkb | then I'd like to remove review.o.o from the emergency file in puppet and manually run manage projects | 21:23 |
mordred | ++ | 21:24 |
ianw | lgtm | 21:24 |
clarkb | mordred: re manage-projects do we need to be concerned at all about things out of sync with the manage-projects cache items? | 21:24 |
clarkb | I can't remember if there were any dangerous situations in there | 21:24 |
mordred | nope - manage-projects manages itself | 21:24 |
clarkb | ok cool | 21:25 |
clarkb | seems like it grew more complex and scary recently but that is just my paranoia | 21:25 |
jeblair | clarkb: done | 21:25 |
mordred | however - these: https://review.openstack.org/#/q/topic:fix-manage-projects+status:open would be nice to get someone to review at some point | 21:25 |
jeblair | i'm going to afk for a few; i think the next gerrit restart can be a quick one so doesn't need any zuul action | 21:25 |
mordred | since three of them are from june | 21:25 |
clarkb | jeblair: ++ | 21:25 |
mordred | jeblair: ++ | 21:25 |
clarkb | I've got to watch the kids for a bit but will be around to do the next kick.sh and gerrit restart and stuff once changes merge | 21:27 |
*** vhosakot has quit IRC | 21:28 | |
*** rbrndt has quit IRC | 21:29 | |
fungi | i can do those if they happen soonish, otherwise disappearing for dinner in ~45 minutes | 21:29 |
*** rbrndt has joined #openstack-infra | 21:29 | |
*** rbrndt has quit IRC | 21:29 | |
*** tinwood has quit IRC | 21:30 | |
*** tinwood has joined #openstack-infra | 21:31 | |
*** rhallisey has joined #openstack-infra | 21:33 | |
*** gouthamr has joined #openstack-infra | 21:34 | |
*** vhosakot has joined #openstack-infra | 21:34 | |
*** jheroux has quit IRC | 21:35 | |
*** rbrndt has joined #openstack-infra | 21:36 | |
fungi | there's a post job running!!! | 21:37 |
clarkb | yay | 21:37 |
fungi | well, ref in post with jobs queued at any rate | 21:37 |
fungi | which is further than we were getting before the restart | 21:37 |
clarkb | I'm going to mark off the post jobs item on our etherpad | 21:37 |
clarkb | as that should be sufficient to see the gerrit trigger worked | 21:37 |
*** e0ne has quit IRC | 21:37 | |
fungi | should we #status ok yet, or hold off a smidge longer? | 21:37 |
openstackgerrit | Merged openstack-infra/system-config master: Pass gerrit cache options through to puppet-gerrit https://review.openstack.org/505402 | 21:37 |
*** e0ne has joined #openstack-infra | 21:38 | |
*** e0ne has quit IRC | 21:38 | |
clarkb | lets hold off until after we do the second planned restart | 21:38 |
fungi | k | 21:39 |
clarkb | I'm updating system-config on puppetmaster now to include ^ | 21:40 |
clarkb | that is done | 21:41 |
clarkb | now just waiting on puppet-gerrit change | 21:41 |
fungi | want me to re-kick review once that merges? | 21:41 |
clarkb | please | 21:41 |
*** jistr has quit IRC | 21:41 | |
clarkb | though puppet-gerrit change may merge more slowly due to node availability | 21:41 |
fungi | yeah, seems to be waiting on a few node assignments still | 21:42 |
fungi | xenial specifically | 21:42 |
*** bobh has quit IRC | 21:43 | |
clarkb | mordred: fungi maybe we want to go ahead and run manage projects now? | 21:44 |
clarkb | mordred: any chance you'd be interested in doing that since you have touched that toolchain most recently? | 21:45 |
fungi | seems like it should be fine to do now | 21:45 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add disabled network action plugins for 2.4 https://review.openstack.org/505419 | 21:47 |
mordred | clarkb: I can do it | 21:49 |
mordred | clarkb: on review.o.o or on git* ? | 21:49 |
*** Sukhdev has joined #openstack-infra | 21:50 | |
clarkb | mordred: on review.o.o, git* should be good now | 21:50 |
mordred | kk. running | 21:50 |
clarkb | I'm cleaning up nodes in nodepool that got lost with the zuul restart | 21:50 |
clarkb | should mean that puppet-gerrit change gates faster | 21:50 |
mordred | clarkb: it hath run with no apparent issues | 21:51 |
*** ihrachys has quit IRC | 21:51 | |
*** armax has quit IRC | 21:51 | |
mordred | clarkb: btw- ansible 2.4 added a hiera lookup plugin | 21:52 |
mordred | clarkb: so (not this week) we should be able to potentially rework some of our ansible/puppet jankiness | 21:52 |
clarkb | interesting, will have to be careful that doesn't give all nodes access to any hiera data though | 21:53 |
clarkb | hrm http://git.openstack.org/cgit/openstack/networking-lagopus/ is still empty | 21:54 |
clarkb | mordred: ^ I expected manage projects to create that in gerrit and push an initial commit | 21:54 |
mordred | clarkb: do we need to trigger a replication perhaps? | 21:54 |
clarkb | maybe? | 21:54 |
clarkb | the push of initial commit should do that I thought | 21:54 |
mordred | oh - it's possible that there was a failed attempt and jeepyb thinks it did | 21:54 |
mordred | one sec | 21:54 |
*** jascott1 has joined #openstack-infra | 21:55 | |
mordred | clarkb: I do not see that repo in the projects.yaml on gerrit | 21:56 |
mordred | clarkb: in /etc/project-config | 21:56 |
*** esberglu has quit IRC | 21:57 | |
openstackgerrit | Gage Hugo proposed openstack-infra/project-config master: Skip ansible upgrade job in keystone https://review.openstack.org/505426 | 21:57 |
mordred | clarkb: Date: Wed Sep 13 23:22:39 2017 +0000 | 21:57 |
mordred | clarkb: is the latest commit ... I can obviously git pull - but I'd expect a full ansible/puppet run to get that up to date | 21:58 |
clarkb | oh I know why | 21:58 |
clarkb | its because we make project-config be in sync with git.o.o | 21:58 |
mordred | oh. yah | 21:58 |
clarkb | so this may all self heal as soon as we let puppet run on its own on that server | 21:58 |
mordred | kk. I think we're in good shape - there's no lingering issues with manage-projects | 21:58 |
*** hashar has quit IRC | 21:59 | |
mordred | so a normal run should be fine | 21:59 |
clarkb | in that case I guess we just move forward with the old plan which is basically kick.sh it one more time to get these two changes on, then remove it from emergency file and watch it | 21:59 |
clarkb | mordred: thanks for looking | 21:59 |
fungi | cool, unfortunately the puppet-gerrit change is still waiting for xenial nodes | 21:59 |
fungi | should i punt it into the gate or let it run its course? | 22:00 |
clarkb | probably a good idea to do that so we can get this finished at a reasonable hour | 22:00 |
clarkb | I'll continue to clean out old leaked nodepool nodes too | 22:00 |
fungi | doing | 22:00 |
*** jistr has joined #openstack-infra | 22:01 | |
fungi | enqueued | 22:01 |
clarkb | basically anything used and more than an hour and a half or so old I am deleting | 22:01 |
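
clarkb's rule of thumb for leaked nodes (state is "used" and the node is older than about an hour and a half) amounts to a simple filter. The sketch below is an illustration only: `Node`, its fields, and `leaked_nodes` are hypothetical and do not reflect nodepool's actual data model or CLI.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for one row of a node listing."""
    node_id: int
    state: str
    state_since: datetime


def leaked_nodes(nodes: List[Node], now: datetime,
                 max_age: timedelta = timedelta(minutes=90)) -> List[Node]:
    """Return nodes in the 'used' state for longer than max_age."""
    return [n for n in nodes
            if n.state == "used" and now - n.state_since > max_age]


if __name__ == "__main__":
    now = datetime(2017, 9, 19, 22, 0)
    nodes = [
        Node(1, "used", datetime(2017, 9, 19, 20, 0)),   # used, 2h old
        Node(2, "used", datetime(2017, 9, 19, 21, 30)),  # used, 30m old
        Node(3, "ready", datetime(2017, 9, 19, 19, 0)),  # old but not used
    ]
    print([n.node_id for n in leaked_nodes(nodes, now)])
```

Nodes booted after the restart fall under the age threshold and are left alone, which matches the later remark that the remainder were booted after the restart.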
ianw | was the email issue sorted out? did the threads get turned up? | 22:02 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Disable action and lookup plugins from 2.4 https://review.openstack.org/505419 | 22:03 |
clarkb | ianw: not yet, jeblair thinks the account caching miss rate is negatively affecting it so going to get that turned up first and see how it does. | 22:03 |
clarkb | ianw: jeblair wants to see a single email sender thread perform well before adding more | 22:03 |
ianw | ah right, that's 505402 | 22:03 |
clarkb | ok I think I have all the leaked nodes in nodepool marked for deletion, the remainder were booted after the restart | 22:05 |
*** jcoufal has joined #openstack-infra | 22:05 | |
*** jcoufal has quit IRC | 22:06 | |
*** aeng has joined #openstack-infra | 22:07 | |
*** ijw has quit IRC | 22:07 | |
*** dprince has quit IRC | 22:09 | |
*** rlandy is now known as rlandy|bbl | 22:09 | |
*** camunoz has quit IRC | 22:10 | |
clarkb | thinking about it I think the plan behind doing it on monday after ptg mostly worked out | 22:10 |
fungi | agreed | 22:11 |
clarkb | It was quiet, people mostly were ready for a boring day and we have the next day to sort out issues rather than have it look great for a weekend then get caught by surprise | 22:11 |
*** Sukhdev has quit IRC | 22:12 | |
fungi | looks like the change we wanted is getting xenial nodes in the gate finally | 22:12 |
fungi | and now it has all the nodes it needs | 22:13 |
fungi | i'm getting very close to having to step away, unfortunately | 22:13 |
clarkb | we should manually pull that into the repo in /etc/puppet/modules/gerrit then run kick.sh | 22:14 |
clarkb | mordred: ianw pabelanger jeblair ^ who is still left? | 22:14 |
*** priteau has quit IRC | 22:14 | |
*** dargains has joined #openstack-infra | 22:14 | |
pabelanger | still here | 22:15 |
fungi | okay, i am being dragged away to dinner | 22:15 |
clarkb | fungi: enjoy | 22:15 |
fungi | pabelanger to the rescue! | 22:15 |
fungi | thanks | 22:15 |
fungi | bbiaw | 22:15 |
*** rhallisey has quit IRC | 22:15 | |
pabelanger | clarkb: 505402 and manually kick? | 22:16 |
clarkb | pabelanger: 505406 | 22:16 |
clarkb | 505402 should already be in system config | 22:16 |
pabelanger | clarkb: ++ on 505406 | 22:16 |
clarkb | just waiting on 406 to merge, then we can get it in /etc/puppet/modules/gerrit, then we can kick.sh | 22:16 |
pabelanger | wfm | 22:17 |
clarkb | then we remove review.o.o from emergency file and watch that it puppets as expected | 22:18 |
ianw | me too, if we want | 22:18 |
clarkb | due to not trusting puppet-gerrit's handling of the war I'm tempted to stop gerrit, do index backup again, and then run kick.sh | 22:19 |
clarkb | (in theory its fine now since last kick.sh was fine but ugh) | 22:19 |
*** yamamoto_ has joined #openstack-infra | 22:19 | |
*** gouthamr has quit IRC | 22:21 | |
openstackgerrit | Merged openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit https://review.openstack.org/505406 | 22:21 |
clarkb | I am making sure that is up to date now | 22:21 |
*** chlong has quit IRC | 22:21 | |
pabelanger | k | 22:22 |
clarkb | uh hrm /etc/puppet/modules/gerrit is actually quite ancient | 22:22 |
clarkb | we use origin/master I think | 22:23 |
pabelanger | that's okay, right? We push them to node now | 22:23 |
*** tpsilva has quit IRC | 22:23 | |
clarkb | we push what we check out though | 22:23 |
clarkb | reading modules.env it uses origin/master | 22:23 |
clarkb | just gonna double check reflog but I think that is it | 22:23 |
clarkb | ya reflog confirms | 22:24 |
clarkb | I have done git remote update && git checkout origin/master in /etc/puppet/modules/gerrit | 22:24 |
clarkb | HEAD is dce6e65c8e6e4408ae4b314733dc20d8b3bf3edd | 22:25 |
pabelanger | k | 22:25 |
pabelanger | will see why it wasn't updated | 22:25 |
clarkb | it's because we use the remote ref rather than updating the local one | 22:25 |
clarkb | pretty sure I grok it now | 22:25 |
clarkb | pabelanger: do you want to get ready to run kick.sh against review.o.o? | 22:27 |
*** sdague has quit IRC | 22:27 | |
pabelanger | clarkb: sure | 22:27 |
pabelanger | say when | 22:27 |
clarkb | I'm going to manually stop gerrit first | 22:27 |
clarkb | and backup the index again | 22:28 |
pabelanger | k | 22:28 |
clarkb | will tell you when that is done and we can kick.sh | 22:28 |
clarkb | ianw: maybe you can send another #status notice? | 22:28 |
ianw | ok | 22:28 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Port in changes from ansible 2.4 command module https://review.openstack.org/505430 | 22:29 |
ianw | draft -> #status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart | 22:29 |
clarkb | stopping gerrit now | 22:29 |
ianw | #status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart | 22:29 |
clarkb | index backup is running | 22:30 |
mordred | clarkb: about to afk - but I'll lurk around for the next little bit if you need | 22:30 |
clarkb | and is done | 22:30 |
pabelanger | I don't think status was picked up | 22:30 |
clarkb | pabelanger: go ahead and run kick.sh | 22:30 |
pabelanger | running | 22:31 |
clarkb | we'll have to debug statusbot later (since we are deep into things now) | 22:31 |
ianw | #status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart | 22:31 |
openstackstatus | ianw: sending notice | 22:31 |
clarkb | ok it looks like it applied cleanly gonna check gerrit.config and start gerrit | 22:32 |
ianw | sorry, sometimes my bouncer doesn't authenticate with nickserv :/ | 22:32 |
vhosakot | https://review.openstack.org is down? | 22:32 |
pabelanger | clarkb: kick.sh done | 22:32 |
clarkb | lgtm | 22:32 |
clarkb | pabelanger: ready to start gerrit again? | 22:32 |
pabelanger | vhosakot: yes, see openstackstatus | 22:32 |
pabelanger | clarkb: ready | 22:32 |
vhosakot | pabelanger: cool +1 | 22:32 |
clarkb | starting | 22:32 |
ianw | vhosakot: yes, sorry the notice went out after the stop due to a small issue | 22:32 |
-openstackstatus- NOTICE: Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart | 22:33 | |
vhosakot | ianw: cool, thanks for the info | 22:33 |
clarkb | gerrit log looks happy | 22:33 |
*** dave-mcc_ has quit IRC | 22:33 | |
clarkb | cgit/gitweb links work now | 22:34 |
clarkb | anticw: ^ | 22:34 |
openstackstatus | ianw: finished sending notice | 22:34 |
pabelanger | Hmm | 22:34 |
clarkb | pabelanger: ? | 22:34 |
pabelanger | might be my cache | 22:34 |
pabelanger | but https://review.openstack.org/#/c/237134/ | 22:34 |
pabelanger | have old cgit URLs | 22:35 |
pabelanger | trying another browser | 22:35 |
clarkb | it's there and working for me | 22:35 |
pabelanger | k | 22:35 |
clarkb | so ya I think local caches probably at fault | 22:35 |
pabelanger | agree | 22:35 |
pabelanger | another review is good for me too | 22:35 |
ianw | it says gitweb but goes to cgit | 22:35 |
*** ijw has joined #openstack-infra | 22:35 | |
clarkb | ianw: ya because we switched it to "custom" | 22:35 |
ianw | yep | 22:35 |
clarkb | I think we can live with that for now :) | 22:35 |
pabelanger | ya | 22:36 |
clarkb | ok I think that means last step is to remove review.o.o from emergency file | 22:36 |
*** slaweq_ has quit IRC | 22:36 | |
pabelanger | wfm | 22:36 |
clarkb | and make sure that networking-lagopus gets created | 22:36 |
clarkb | next run is in 8 minutes | 22:37 |
clarkb | will we still have people around to watch that? | 22:37 |
clarkb | infra-root ^ | 22:38 |
clarkb | I can be around | 22:38 |
pabelanger | Ya, will be here | 22:38 |
clarkb | ok removing from emergency file now then | 22:38 |
clarkb | done | 22:38 |
clarkb | I am tailing syslog on review.o.o and the puppet_run_all.log on puppetmaster | 22:39 |
pabelanger | same | 22:40 |
*** rbrndt has quit IRC | 22:41 | |
*** baoli has quit IRC | 22:41 | |
*** rbrndt has joined #openstack-infra | 22:41 | |
*** rbrndt has quit IRC | 22:41 | |
clarkb | pabelanger: here we go | 22:45 |
*** openstackgerrit has quit IRC | 22:47 | |
pabelanger | clarkb: gerrit show-queue does appear to be going down | 22:47 |
*** ijw has quit IRC | 22:48 | |
clarkb | ok puppet run seems to have done what we expect according to logs | 22:48 |
clarkb | gonna look at networking-lagopus now | 22:48 |
clarkb | pabelanger: ya it trends down after the restart as it replicates everything but then trends up as emails are slow so we want to see what it does after replications | 22:48 |
jeblair | back | 22:50 |
pabelanger | puppet passed afs servers | 22:50 |
clarkb | everything looks good except for http://paste.openstack.org/show/621485/ guessing that command changed in gerrit | 22:51 |
jeblair | interestingly, the accounts cache hit ratio didn't change | 22:51 |
clarkb | so we have review.o.o puppeting but project creation is not working | 22:51 |
pabelanger | clarkb: --name doesn't appear to be a switch now | 22:52 |
jeblair | i'm not sure what else to do on the email front | 22:52 |
pabelanger | http://paste.openstack.org/show/621487/ | 22:52 |
clarkb | pabelanger: ya docs seem to agree | 22:52 |
clarkb | so we need jeepyb patch to fix that | 22:53 |
jeblair | i guess we can throw more threads at it, but i'm worried that will just eat up cpu | 22:53 |
clarkb | jeblair: maybe zaro and paladox can help with that too? | 22:53 |
pabelanger | clarkb: actually, gerritlib will need to update | 22:55 |
pabelanger | clarkb: https://review.openstack.org/399308/ already fixed. So we might need a version bump | 22:56 |
clarkb | ya likely needs a tag, though I kinda wish that checked the gerrit version and applied the right method | 22:57 |
clarkb | jeblair: paladox is in #gerrit but in europe iirc so may not respond until sometime tonight for us | 22:57 |
pabelanger | Ya, we need a tag. 0.6.0 is missing the fixes | 22:58 |
jeblair | clarkb: ack; i have emitted a question in #gerrit, thanks | 22:58 |
*** wolverineav has quit IRC | 23:00 | |
*** gouthamr has joined #openstack-infra | 23:01 | |
pabelanger | clarkb: I' | 23:01 |
pabelanger | err | 23:01 |
pabelanger | clarkb: I've added gerritlib issue to gerrit-2.13-issues etherpad | 23:01 |
clarkb | thanks I think I have a patch that should work in followup | 23:03 |
clarkb | to handle arbitrary gerrit | 23:03 |
clarkb | I'm going to test it against review-dev first | 23:04 |
mordred | jeblair: I'd love to know why it's so slow | 23:05 |
clarkb | I'm wondering if it digs through accountPatchReviewDb and its just sad | 23:05 |
clarkb | however rtfsing it uses reviewdb not ^ | 23:05 |
* mordred must away for the evening ... talk to y'all tomorrow | 23:06 | |
*** slaweq has joined #openstack-infra | 23:06 | |
jeblair | clarkb: should we clear status alert now? | 23:06 |
clarkb | jeblair: yes I think so | 23:06 |
jeblair | has anyone checked post jobs work? | 23:07 |
*** xarses_ has quit IRC | 23:07 | |
clarkb | jeblair: they queued | 23:07 |
jeblair | wfm :) | 23:07 |
clarkb | and are still queued due to node scarcity | 23:07 |
*** ijw has joined #openstack-infra | 23:07 | |
clarkb | we still shouldn't merge new project changes until gerritlib is sorted | 23:07 |
*** armax has joined #openstack-infra | 23:07 | |
*** dhajare has joined #openstack-infra | 23:08 | |
pabelanger | Ya, I see a post job complete | 23:08 |
*** openstackgerrit has joined #openstack-infra | 23:09 | |
openstackgerrit | Merged openstack-infra/elastic-recheck master: Add query for live migration invalid disk info bug 1718295 https://review.openstack.org/505410 | 23:09 |
openstack | bug 1718295 in OpenStack Compute (nova) "Unexpected exception in API method: MigrationError_Remote: Migration error: Disk info file is invalid: qemu-img failed to execute - Failed to get shared "write" lock\nIs another process using the image?" [High,Confirmed] https://launchpad.net/bugs/1718295 | 23:10 |
*** ijw has quit IRC | 23:10 | |
*** rcernin has quit IRC | 23:11 | |
*** slaweq has quit IRC | 23:12 | |
jeblair | my hunch is that it's doing too many or too inefficient queries | 23:12 |
jeblair | since almost every time i get the thread dump, it's performing an account query | 23:12 |
*** bobh has joined #openstack-infra | 23:12 | |
openstackgerrit | Clark Boylan proposed openstack-infra/gerritlib master: Handle different gerrit versions with create-project https://review.openstack.org/505438 | 23:13 |
*** dhajare has quit IRC | 23:13 | |
clarkb | pabelanger: jeblair ^ I think that is more friendly gerritlib create-project | 23:13 |
jeblair | and every time i do a mysql processlist, i see something like: SELECT T.notify_new_changes,T.notify_all_comments,T.notify_submitted_changes,T.notify_new_patch_sets,T.notify_abandoned_changes,T.account_id,T.project_name,T.filter FROM account_project_watches T WHERE T.account_id= ... | 23:13 |
clarkb | jeblair: thats against reviewdb too right? | 23:13 |
jeblair | ya | 23:13 |
*** caphrim007 has quit IRC | 23:13 | |
jeblair | so i worry it's something like perform that query for every account | 23:13 |
clarkb | its looking at https://review.openstack.org/#/settings/projects every time? | 23:14 |
clarkb | could also be related to our use of the third party ci no email group? | 23:14 |
clarkb | I know this isn't a shared opinion but lack of gerrit emails is so nice >_> | 23:15 |
pabelanger | clarkb: +2 | 23:15 |
jeblair | clarkb: is "version < '2.12'" going to work? | 23:16 |
clarkb | jeblair: the vast majority of the queued emails are for comments which I think is generated by gerrit-server/src/main/java/com/google/gerrit/server/change/EmailReviewComments.java | 23:16 |
clarkb | jeblair: it seems to work in practice, it does an alnum sort. It may break if you have eg 2.13.9.-gabc vs 2.13.9.-gzxy | 23:17 |
clarkb | jeblair: I think because we are comparing to the first portion of the string it should be fine (if it were more fine grained it would not be ok) | 23:17 |
jeblair | clarkb: or 2.9 ? | 23:17 |
clarkb | oh crud | 23:17 |
clarkb | clearly I've had too long of a day | 23:17 |
clarkb | I'll address that | 23:17 |
jeblair | clarkb: steal this code: http://git.openstack.org/cgit/openstack/gertty/tree/gertty/sync.py#n1558 | 23:18 |
*** vhosakot has quit IRC | 23:22 | |
*** tosky has quit IRC | 23:23 | |
*** dargains has quit IRC | 23:25 | |
openstackgerrit | Clark Boylan proposed openstack-infra/gerritlib master: Handle different gerrit versions with create-project https://review.openstack.org/505438 | 23:25 |
clarkb | stolen | 23:25 |
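The pitfall raised above is that comparing version strings directly sorts them character by character, so '2.9' sorts after '2.12'. A minimal sketch of a numeric comparison follows. It is not the gertty code linked above, nor the change merged as 505438; `parse_version` is a hypothetical helper name, and it assumes the version begins with dot-separated integers followed by an optional suffix.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple.

    Hypothetical helper: '2.13.9-11-gabc' becomes (2, 13, 9). Parsing stops
    at the first dot-separated piece that does not start with a digit.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


# Plain string comparison gets this wrong because '9' > '1'.
string_says_older = "2.9" < "2.12"

# Tuple comparison is numeric, component by component.
tuple_says_older = parse_version("2.9") < parse_version("2.12")
```

Comparing tuples of integers keeps the check independent of how many digits each component has, which is the case that a string comparison misses.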
jeblair | +2 | 23:27 |
jeblair | pabelanger: ^ | 23:27 |
clarkb | gerrit-server/src/main/java/com/google/gerrit/server/mail/CommentSender.java does the actual construction of the comment emails that are sent I think | 23:28 |
pabelanger | jeblair: clarkb: +3 | 23:28 |
*** dargains has joined #openstack-infra | 23:28 | |
clarkb | jeblair: 341baf22a0756b73ba36caec87fcb740061e332f and 89634f70f37c027e009da61776d72f2621a4f69f looks suspicious | 23:30 |
clarkb | that first one mentions notedb | 23:30 |
clarkb | and I'm checked out on 2.13 | 23:30 |
clarkb | perhaps this is an early inefficient use of notedb? | 23:30 |
clarkb | I'm going to send a #status ok now how about "Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged)." | 23:33 |
jeblair | i'm tcpdumping the mysql port | 23:35 |
jeblair | and wow | 23:36 |
clarkb | jeblair: does that status ok look good to you? | 23:36 |
jeblair | clarkb: yes | 23:36 |
jeblair | this query: SELECT T.notify_new_changes,T.notify_all_comments,T.notify_submitted_changes,T.notify_new_patch_sets,T.notify_abandoned_changes,T.account_id,T.project_name,T.filter FROM account_project_watches T WHERE T.account_id= ... | 23:36 |
clarkb | #status ok Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged). | 23:36 |
openstackstatus | clarkb: sending ok | 23:36 |
jeblair | it starts with low account_id numbers, and works its way up to ~26000 | 23:36 |
jeblair | and every time it loops around, gerrit sends another email | 23:36 |
clarkb | wow | 23:36 |
jeblair | so, yeah, i think we're doing *at least* that query for every account for every email | 23:37 |
jeblair | it's not the only query | 23:37 |
jeblair | SELECT T.account_id,T.email_address,T.password,T.external_id FROM account_external_ids T WHERE T.account_id= | 23:37 |
jeblair | that one is also in a continuous loop | 23:37 |
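The pattern described above (one lookup per account for every email) can be contrasted with a single filtered query. The sketch below is illustrative only and is not Gerrit's code: it uses an in-memory SQLite database, a trimmed-down `account_project_watches` table whose column names are taken from the logged query, and assumed 'Y' flag values and project names. The function names are hypothetical.

```python
import sqlite3

# Assumed, simplified schema; the real reviewdb table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_project_watches ("
    "account_id INTEGER, project_name TEXT, notify_all_comments TEXT)"
)
rows = [
    (i, "openstack/nova" if i % 100 == 0 else "openstack/other", "Y")
    for i in range(1, 1001)
]
conn.executemany("INSERT INTO account_project_watches VALUES (?, ?, ?)", rows)


def watchers_per_account(conn, project, account_ids):
    """Issue one query per account, as the tcpdump output suggested."""
    queries = 0
    watchers = []
    for account_id in account_ids:
        queries += 1
        cursor = conn.execute(
            "SELECT project_name, notify_all_comments "
            "FROM account_project_watches WHERE account_id = ?",
            (account_id,),
        )
        for project_name, notify in cursor:
            if project_name == project and notify == "Y":
                watchers.append(account_id)
    return watchers, queries


def watchers_single_query(conn, project):
    """Let the database filter by project in one query."""
    cursor = conn.execute(
        "SELECT account_id FROM account_project_watches "
        "WHERE project_name = ? AND notify_all_comments = 'Y' "
        "ORDER BY account_id",
        (project,),
    )
    return [row[0] for row in cursor], 1
```

Both functions return the same watcher list; the difference is the number of round trips, which grows with the account count in the first form and stays constant in the second.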
*** ChanServ changes topic to "Discussion of OpenStack Developer and Community Infrastructure | docs http://docs.openstack.org/infra/ | bugs https://storyboard.openstack.org/ | source https://git.openstack.org/cgit/openstack-infra/ | channel logs http://eavesdrop.openstack.org/irclogs/%23openstack-infra/" | 23:39 | |
-openstackstatus- NOTICE: Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged). | 23:39 | |
jeblair | this seems to be the set: http://paste.openstack.org/show/621492/ | 23:39 |
clarkb | I'm being told by others at home that I need to go for a walk, and that is probably a good idea | 23:40 |
clarkb | will be afk for a bit | 23:40 |
*** vhosakot has joined #openstack-infra | 23:40 | |
*** gouthamr has quit IRC | 23:40 | |
jeblair | mordred: i know you're gone for the evening; i just wanted to ping you here as a marker for when you're back; i know you'll find that interesting ^ | 23:42 |
openstackstatus | clarkb: finished sending ok | 23:42 |
*** felipemonteiro has joined #openstack-infra | 23:44 | |
*** felipemonteiro__ has quit IRC | 23:44 | |
*** Sukhdev has joined #openstack-infra | 23:46 | |
openstackgerrit | Merged openstack-infra/gerritlib master: Handle different gerrit versions with create-project https://review.openstack.org/505438 | 23:47 |
*** felipemonteiro has quit IRC | 23:49 | |
*** threestrands has joined #openstack-infra | 23:51 | |
*** bcat_ has joined #openstack-infra | 23:51 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /tenants route https://review.openstack.org/503268 | 23:54 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/status route https://review.openstack.org/503269 | 23:55 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/jobs route https://review.openstack.org/503270 | 23:55 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/builds route https://review.openstack.org/466561 | 23:55 |
*** gouthamr has joined #openstack-infra | 23:55 | |
pabelanger | clarkb: I am also AFK | 23:56 |
*** kgiusti has left #openstack-infra | 23:57 | |
*** bobh has quit IRC | 23:57 | |
*** vhosakot has quit IRC | 23:57 | |
tonyb | ianw: What is it y'all need from me for the old stable branch removal? Just an updated list of repos and tags? | 23:58 |
jeblair | clarkb, mordred, fungi, zaro: i filed https://bugs.chromium.org/p/gerrit/issues/detail?id=7261 at the suggestion of paladox and added that to https://etherpad.openstack.org/p/gerrit-2.13-issues | 23:59 |