Tuesday, 2017-09-19

* fungi has no idea if twitter acts as an rss feed aggregator00:00
jeblairinfra-root: if you are interested in supporting statusbot's twitter integration, please fix it; otherwise i think we should disable it in our configuration.00:00
mtreinishfungi: is the script to do that in python. You can just steal the rss bits from o-h api server pretty easily00:01
fungii would be in favor of reverting it if it seems unstable, since we can't really test it00:01
mtreinishwhatever lib we used for that wasn't that hard to deal with00:01
clarkbI believe it is at least configurable so we can in theory start by removing the config for it00:01
fungimtreinish: yeah, statusbot is in python00:01
clarkbthere is an unmade rye mule with my name on it downstairs /me is going to pop out to get that made but will keep an eye on irc until sleep time00:02
clarkbthanks again everyone!00:02
jeblairif we see statusbot pop in with an "finished sending ok" then we can assume the error cleared up00:03
mtreinishfungi: https://pypi.python.org/pypi/feedgen/00:03
fungimtreinish: thanks, i'll keep that in mind00:04
jeblairclarkb: oh hey i found the error in the log00:07
jeblairTwitterError: Text must be less than or equal to 140 characters.00:07
jeblairso the split support in statusbot is somehow broken (maybe on 'ok' messages?)00:07
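
The TwitterError above means a status longer than 140 characters reached the API without being split. A minimal sketch of word-boundary splitting, assuming nothing about statusbot's real implementation (the name split_status is hypothetical, and this is not the fix that was later proposed):

```python
import textwrap

# Limit quoted in the TwitterError message above.
TWEET_LIMIT = 140


def split_status(text, limit=TWEET_LIMIT):
    """Split text into chunks of at most `limit` characters.

    textwrap.wrap breaks on whitespace and, by default, also breaks
    single words longer than `limit`, so no chunk exceeds the limit.
    """
    return textwrap.wrap(text, width=limit)


if __name__ == "__main__":
    for chunk in split_status("ok " * 100):
        print(len(chunk), chunk)
```

Each chunk would then be posted as its own status; short messages such as an "ok" come back as a single chunk.
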
*** oanson has quit IRC00:08
*** oanson has joined #openstack-infra00:09
clarkbfun00:09
jeblairthat means statusbot should be back in its normal state, it just didn't send you the ok00:09
jeblair#status log please avoid merging new project creation changes until after we have the git backends puppeting properly00:10
openstackstatusjeblair: finished logging00:10
jeblairgood00:10
prometheanfirewoo :D00:10
jeblairstill, we should either fix that bug or turn it off00:11
clarkbalso we'll need to watch melody but that should be it for restarting gerrit due to memory cache leaks iirc00:11
*** vhosakot has quit IRC00:11
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: Update Gentoo element for element changes  https://review.openstack.org/50384400:11
jeblair(the twitter feed doesn't have the status ok, so we are giving people misleading information.  i feel pretty strongly about not doing that.)00:11
*** stakeda has joined #openstack-infra00:12
*** csomerville has quit IRC00:13
mnasercan we reapprove changes00:16
mnaser..safely? :>00:16
clarkbyes I think so00:17
clarkbplease let us know if you see any weird behavior but we've sent the ok signal00:17
clarkbmnaser: ^00:17
mnaserclarkb will do, i did get one weird quirk00:17
mnaserlet me try to see if i can repro00:18
mnaseri edited a change (with the ui),  clicking save changes left it stuck in "working..." , the change submitted but it was stuck on working till i had to f500:18
clarkbhuh00:18
clarkbthe edit was fine though just ui failed to update?00:19
mnaserthe edit went through but it looks like the UI got a weird response or something and got stuck in 'working' forever00:19
mnaserlet me retry00:19
fungiimprove your gerrit with this one weird quirk00:20
mordredclarkb: I also saw a slight weirdness related to abandoning a change that sounds familiar - but it was also doing replication thread spinup - so I didn't pay attention too terribly00:20
mnaserokay i can replicate it00:20
*** chlong has joined #openstack-infra00:20
mnaserhttps://review.openstack.org/#/c/504361/ -- click on "Commit Message" in the files00:20
mnaserclick the "edit" icon, change commit message00:21
mnaserhit save, close, click "publish edit"00:21
mnaserthe ui gets stuck on loading00:21
clarkbmnaser: woo, though that sounds like the type of bug we document and live with since you can always refresh or edit locally (it's fun how our mass of users can find all the things so much better than I can though :) )00:21
mnaseroh yeah i'm just lazy and dont want to find the branch where i have something to push back up :p00:22
clarkbthough worth looking into as may indicate something wrong with our proxy00:22
mordredfungi, jeblair: sorry - I was afk for a few - I'll fix the twitter integration00:22
*** caphrim007 has quit IRC00:23
*** caphrim007 has joined #openstack-infra00:23
mnaserlet me check if i get any console errors00:23
mnaseror 5xx00:23
mnaserso it looks like it actually even puts the change in this weird state00:25
mnaserwhere we have change 1, "edit" then 200:25
*** askb has joined #openstack-infra00:26
jeblairmordred: cool thx; you can find the traceback in the 2017-09-18 log00:26
jeblairmordred: (also, if you have the password for that account handy, maybe manually tweet the 'ok' for anyone following?)00:26
*** thorst has joined #openstack-infra00:26
ianwhmm, i just got a 502 error.  might be glitch in the matrix00:26
*** caphrim007 has quit IRC00:28
*** mat128 has joined #openstack-infra00:28
*** rossella_s has quit IRC00:29
*** gcb has joined #openstack-infra00:30
*** Swami has quit IRC00:30
ianwand again00:30
clarkbon a particular url?00:31
ianwjust navigating around doing some devstack reviews00:32
ianwhttps://review.openstack.org/#/c/501892/2/functions-common was the one just now00:32
clarkbhuh I wonder if that is apache not seeing gerrit respond fast enough?00:32
*** rossella_s has joined #openstack-infra00:32
clarkbalso that content gets heavily cached iirc so could be that we have to load more up to avoid timeouts?00:32
ianwgerrit-ssl-error.log:[Tue Sep 19 00:30:42.134614 2017] [proxy:error] [pid 118439:tid 139970131998464] [client 122.106.204.156:37824] AH00898: Error reading from remote server returned by /changes/501892/edit, referer: https://review.openstack.org/00:33
ianwthat's me00:33
clarkbdid you try editing the file?00:34
ianwno, just pressed "up" on the review to go back to the main review00:34
clarkbweird00:34
clarkbI wonder if related to mnasers thing00:34
ianwi guess the other was00:35
ianwgerrit-ssl-error.log:[Tue Sep 19 00:25:42.591561 2017] [proxy:error] [pid 22646:tid 139970140391168] [client 122.106.204.156:37642] AH00898: Error reading from remote server returned by /changes/457963/revisions/4/drafts, referer: https://review.openstack.org/00:35
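
To see which Gerrit endpoints the proxy errors cluster on, the AH00898 lines can be tallied by request path. A rough sketch, assuming the log format matches the two lines quoted above (the function name failing_paths is hypothetical):

```python
import re
from collections import Counter

# Matches the path in lines such as:
#   AH00898: Error reading from remote server returned by /changes/.../drafts, referer: ...
AH00898_RE = re.compile(
    r"AH00898: Error reading from remote server returned by (?P<path>\S+?)(?:,|$)"
)


def failing_paths(lines):
    """Count AH00898 proxy errors per request path."""
    counts = Counter()
    for line in lines:
        match = AH00898_RE.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed): python tally.py < gerrit-ssl-error.log
    for path, count in failing_paths(sys.stdin).most_common():
        print(count, path)
```
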
clarkband drafts is another edit feature00:36
clarkbmaybe thats just completely derp in 2.13?00:36
*** rossella_s has quit IRC00:37
ianwi'm not trying to edit/draft etc ...00:37
clarkbIt may be trying to check if you have any outstanding drafts/edits?00:38
ianwnothing in the gerrit logs of interest around 00:25:42 i can see00:38
ianwyeah, firebug shows the call00:39
*** rossella_s has joined #openstack-infra00:39
ianwa GET to https://review.openstack.org/changes/501892/revisions/06c83af151e39a6f98d0bfe7b3b7cd08921b587f/drafts00:40
clarkbprobably first thing to try is increase apache timeout00:40
clarkbI am also hoping that as caches grow it becomes less of a problem00:41
*** xarses_ has joined #openstack-infra00:42
clarkbianw: I think there is nothing in the gerrit log because it didnt error. apache just timed out to the backend00:42
ianwyeah, probably right00:42
ianwfrom the error logs, i am also the only one seeing it00:43
ianwalthough probably just very quiet00:43
clarkbI havent been able to reproduce from my phone00:44
clarkbperhaps rtt is playing a role?00:44
openstackgerritMonty Taylor proposed openstack-infra/statusbot master: Let the python-twitter library handle message splitting  https://review.openstack.org/50498000:46
mordredjeblair, clarkb: ^^ that should fix it00:46
mordredI'm not sure I have the password handy - looking00:46
ianwclarkb: -ETOOFARAWAY00:46
*** LindaWang has joined #openstack-infra00:47
ianwclarkb: keep an eye ... should i pre-prepare a timeout change just in case as (.au) day goes on we get more reports ?00:47
clarkbianw: I think that is a good idea but puppet isnt running there yet so would have to hand apply for now anyways00:48
clarkbbut a change tracking it would be good00:48
ianwoh right, yeah that was left off00:49
mordredclarkb, jeblair: status tweets sent00:54
*** tiswanso has joined #openstack-infra00:58
ianwclarkb: hmm, proxytimeout should == timeout == 30000:59
clarkbapache docs say default for timeout is 60 did we override?01:01
ianwroot@review:/var/log/apache2# cat /etc/apache2/apache2.conf | grep '^Timeout'01:04
ianwTimeout 30001:04
clarkbmust be debuntu01:05
*** askb has quit IRC01:05
ianwshouldn't need any keepalive stuff01:06
*** aeng_ has joined #openstack-infra01:08
*** aeng has quit IRC01:08
ianwok, someone other than me would have seen something; from Minneapolis Minnesota (according to maxmind)  "Error reading from remote server returned by /changes/"01:12
ianwso presumably rtt is not too much of an issue there01:12
clarkbya that should be fairly close01:12
*** Sukhdev has quit IRC01:12
clarkbmelody looks ok01:13
*** jdandrea_ has joined #openstack-infra01:13
clarkbcpu use might be a little higher01:13
*** mixos has joined #openstack-infra01:16
clarkbI've got while curl https://review.openstack.org/changes/ ; do echo "next" ; done running in a loop01:16
clarkbhas not failed yet01:16
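
One caveat with a bare curl loop: curl exits 0 on HTTP error statuses unless --fail is given, so a 502 would not end the loop. A minimal Python sketch of the same probe that counts 5xx responses explicitly (fetch_status and count_server_errors are hypothetical names; connection-level failures are deliberately left to propagate):

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=30):
    """Return the HTTP status code for url, including 4xx/5xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for error statuses; the code is still what we want.
        return err.code


def count_server_errors(url, attempts, fetch=fetch_status):
    """Request url `attempts` times and count responses with status >= 500."""
    failures = 0
    for _ in range(attempts):
        if fetch(url) >= 500:
            failures += 1
    return failures


if __name__ == "__main__":
    print(count_server_errors("https://review.openstack.org/changes/", 10))
```

The fetch parameter exists so the counting logic can be exercised without touching the network.
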
ianwnot the first time we've had trusty-era apache proxy issues :/01:16
clarkbya01:17
*** Apoorva_ has joined #openstack-infra01:17
clarkbprobably bumps up priority of upgrading to xenial if it persists01:17
clarkb(and cache warming doesn't help)01:17
ianwyeah ... in this case connection pooling etc don't seem to be likely candidates01:18
clarkbmy curl has likely run a few hundred times now01:18
ianwyeah, i'm tailing the error log to see too01:18
*** tiswanso has quit IRC01:19
*** tiswanso has joined #openstack-infra01:20
clarkbwe could also consider using a different proxy in an effort to debug (just have it on port 4443 or something)01:20
*** Apoorva has quit IRC01:20
clarkbhaproxy or nginx etc01:20
clarkbI've stopped my loop as it had no errors01:21
clarkbI don't think this is a catastrophic problem01:21
*** cshastri has joined #openstack-infra01:21
clarkblet's keep an eye on it and consider further debug steps but for now I really need to take a break (been a long day) and spend time with family01:21
*** rtjure has quit IRC01:21
clarkbping if something urgent comes up and I will check in periodically01:21
*** Apoorva_ has quit IRC01:21
*** sshnaidm has joined #openstack-infra01:22
*** cuongnv has joined #openstack-infra01:22
*** apetrich has quit IRC01:23
ianwnp, will keep an eye.  thanks!01:23
*** apetrich has joined #openstack-infra01:25
openstackgerritSagi Shnaidman proposed openstack-infra/tripleo-ci master: Set repo setup release in playbook  https://review.openstack.org/50493901:25
*** rtjure has joined #openstack-infra01:26
*** thorst has quit IRC01:27
*** yamahata has quit IRC01:29
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Actually sort mount-point list  https://review.openstack.org/50481901:30
*** liujiong has joined #openstack-infra01:39
fungiheading to bed soon myself, but will attempt to jump on problems from scrollback when i wake up, if any01:45
*** ijw has quit IRC01:51
*** mixos has quit IRC01:54
*** hongbin has joined #openstack-infra01:56
*** e0ne has joined #openstack-infra01:58
*** e0ne has quit IRC02:02
*** chlong has quit IRC02:03
*** aeng_ is now known as aeng02:03
*** xarses_ has quit IRC02:04
*** lennyb has quit IRC02:04
*** baoli has joined #openstack-infra02:05
mnasergerrit seems slow/choppy right now02:05
mnasergetting 502's now in the UI02:06
*** lennyb has joined #openstack-infra02:06
ianwyeah, they're increasing02:08
jeblaircacti says we just had a major outbound traffic spike02:10
ianwfor a start, something stackalytics does ... both it seems infra and stackalytics-bot-2 is causing a lot of errors02:10
ianwit seems to close the connection on gerrit during queries02:10
jeblairmelody says we're at 3% garbage collector time02:12
clarkbya melody says someone grabbed fuel-lib from gerrit?02:12
clarkbwouldn't be surprised if that is related to traffic spike02:13
ianwthere's a lot of "5000 ms timeout reached for Diff loader in project openstack ... " too02:13
ianwthere was a bunch of issues but nothing for ~5 mins now02:13
jeblairwe're also using as much memory as we were the last time we restarted02:14
clarkb:/02:14
ianwgetting some more now02:14
*** cody-somerville has joined #openstack-infra02:14
jeblair(we're at 25GB)02:14
ianwstackalytics-bot-2 seems to be just broken02:15
clarkbianw: ya it had problems so they just made a second account hence the -2 iirc02:15
clarkbI want to say it has the paramiko fail to nicely close ssh connection problem we ran into02:16
jeblairthe traffic spike started at 01:4002:16
ianwcan we tell what's happening by thread-id maybe somehow?  if they're dying for some reason02:17
*** kong has joined #openstack-infra02:17
*** anticw has joined #openstack-infra02:17
*** ihrachys has quit IRC02:17
anticwre: the upgrade of gerrit earlier today ... we seem to be lacking the github/gitweb links; are there plans to restore those?02:18
mnaseroooh those have disappeared indeed02:18
anticwwhich are the only things that make gerrit slightly usabel02:18
anticwor usable02:18
mnaserthat might have been missed with the upgrade02:19
mnaserhttps://review-dev.openstack.org/#/c/107956/02:19
mnaseri see it in dev02:19
jeblairanticw: ouch, that hurts02:19
ianw"Error reading from remote server returned by /monitoring, referer: https://review.openstack.org/monitoring"02:20
ianwnow that's weird, right ... i mean that's a totally different bit02:20
clarkbif its in dev then likely just a config difference we have to sort out02:20
anticwjeblair: i use those links to review patches, very useful so it would be nice to have them back02:20
ianwdoes this suggest it's more likely to be apache's fault than gerrits fault?02:20
clarkbianw: ya I have a hunch it is the proxy at least partially at fault02:21
anticwreview-dev has a bogus cert (i'm sure you know this... just sayin')02:22
jeblairianw, clarkb: really?  high gc time, memory use at levels previously associated with gerrit being visibly slow?  that seems to point pretty squarely at gerrit02:22
jeblairhow could apache be contributing to this?02:23
ianwjeblair: well i'm just thinking that "/monitoring" is a pretty different end-point ...02:23
jeblairianw: yeah, though it's still served by the same jvm02:23
clarkbwell also ianw saw it earlier before these things spiked I'm sure it is gerrit too02:23
*** camunoz has quit IRC02:24
jeblairianw: it's usually a bit more robust than the rest of gerrit (since it doesn't have as much to do), but when things get really bad, we also lose access to it02:24
jeblairwe've apparently peaked at 31G of jvm memory02:25
jeblairout of a max of 3002:25
jeblair(java math?)02:25
clarkblikely, since there is non heap stuff too?02:26
jeblairmy inclination is to bump us up to 48G in the jvm, and restart.  and then hope that (a) that's enough, and (b) that traffic spike was anomalous and somehow responsible.02:28
clarkbthat seems like a reasonable step to take. we have 60gb on the server total02:28
jeblairafter that, i'm likely to start saying words like "downgrade" or "upgrade".02:29
clarkbI know other users of gerrit use significantly bigger servers we may be finding out why :(02:29
ianw++, fwiw02:30
*** ramishra has joined #openstack-infra02:30
ianwreally odd that nothing logs if threads are dying unexpectedly02:31
jeblairianw: are threads dying unexpectedly?02:31
SamYapleclarkb: i asked this over the weekend with no response (PTG and travel and all)02:31
SamYapleI want to do an automated build on dockerhub pointed at https://github.com/openstack/loci but would require git permissions to install a hook to allow dockerhub build to trigger02:31
SamYapleis this something infra would allow?02:31
clarkbI think apache is just closing connections more aggressively since gerrit is slower02:32
clarkband so no proper error on the gerrit side02:32
mnaserSamYaple infra currently trying to deal with gerrit upgrade issues so might not be a good time :>02:32
SamYaplemnaser: ah cool. ill check back at a less busy time. thanks02:32
clarkbor maybe not more aggressively but its basically masking any potential errors that could happen later02:32
jeblairwhere do we keep the jvm command?02:32
ianwclarkb: but the error suggests that apache didn't know why the connection stopped.  i think there's a separate msg for timeouts02:33
clarkbjeblair: its in review_site/bin in the init script iirc02:33
clarkbit sources from etc/default I want to say02:33
jeblairhrm, /etc/default/gerritcodereview doesn't seem to have a mem limit02:34
mnaseri think if things are timing out apache would return a 504, not a 502 :x02:34
*** dave-mccowan has quit IRC02:35
jeblairi will give the first person who finds where we configure the jvm memory size a cookie02:35
clarkbjeblair: looks like it is is in gerrit.config?02:35
clarkbgerrit.sh is running a git config command to get GERRIT_MEMORY02:35
clarkbya heaplimit in gerrit.config02:36
jeblairof course02:36
jeblair#   container_heaplimit:02:36
jeblair^ that's the extent of our documentation of that param :(02:36
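
As noted above, gerrit.sh reads the heap size out of gerrit.config with a git config command. A small sketch of doing the same lookup from Python, assuming the key is Gerrit's container.heapLimit setting and that git is on the PATH (read_heap_limit and the example path are hypothetical):

```python
import subprocess


def read_heap_limit(config_path):
    """Return container.heapLimit from a gerrit.config file, or None if unset.

    gerrit.config uses git's config file format, so `git config --file`
    can read it; --get exits non-zero when the key is missing.
    """
    result = subprocess.run(
        ["git", "config", "--file", config_path, "--get", "container.heapLimit"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # Assumed location; adjust to the real review_site/etc/gerrit.config.
    print(read_heap_limit("review_site/etc/gerrit.config"))
```
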
jeblairokay, i have modified the config file in place02:37
* jeblair hands clarkb a cookie02:37
jeblairshall we restart now?02:38
ianwi think so, 48gb is enough for anyone02:38
jeblair#status notice Gerrit is being restarted to feed its insatiable memory appetite02:39
openstackstatusjeblair: sending notice02:39
ianwmnaser: AH00898 in particular is what apache thinks about it02:39
clarkbwfm02:39
clarkbalso https://stackoverflow.com/questions/169453/bad-gateway-502-error-with-apache-mod-proxy-and-tomcat02:39
clarkbre which 500 error apache should return on a timeout02:39
clarkbthere is anecdotal evidence at least that 502s are what you'd get02:40
-openstackstatus- NOTICE: Gerrit is being restarted to feed its insatiable memory appetite02:40
clarkbthough 300 seconds seems like plenty02:40
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Bump gerrit to 48g heap  https://review.openstack.org/50499302:40
mnaseri cant imagine gerrit needing 300s to respond02:41
clarkboh though that says it is the tomcat server timing out02:41
jeblairthat ^ indicates it's running again02:41
jeblairclarkb: so maybe we're seeing jetty timeouts02:41
*** sshnaidm is now known as sshnaidm|off02:41
clarkbya02:41
clarkbbut then I'd expect that to get logged by gerrit but maybe something is breaking that02:42
openstackstatusjeblair: finished sending notice02:42
jeblairi'm going to go eat my own cookie now02:43
*** hongbin has quit IRC02:43
clarkbit is interesting that the active thread counts jumped up according to melody, I think that would support the theory that jetty is timing out if all the threads are busy02:43
ianwclarkb: http threads?02:43
clarkbianw: ya02:43
*** chason has joined #openstack-infra02:43
clarkbwell melody just says threads02:43
clarkbits all of them02:43
* clarkb looks if it can be more specific02:43
clarkbon the melody page if you click on the thread details thing it drops down a list of all the threads which may show us in the future02:44
ianwyeah, a lot of the http ones are in timed_wait02:45
clarkbthey are labeled HTTP-XX02:45
clarkbianw: they are in idlejobpoll02:46
*** mwarad has joined #openstack-infra02:46
clarkbwhich if I am assuming things about named methods that implies those are http threads just waiting for a request?02:47
ianwagree02:47
*** hongbin has joined #openstack-infra02:49
*** baoli has quit IRC02:51
clarkbgitweb docs say that gerrit looks by default at /usr/lib/cgi-bin/gitweb.cgi which is there but is a symlink, perhaps it stopped following symlinks?02:51
clarkbianw: have you gotten any 500s since the restart?02:53
*** eumel8 has joined #openstack-infra02:53
clarkbanticw: its definitely a bug that gitweb doesn't show up anymore, we'll have to look into it. My hunch is that gerrit stopped following symlinks to the gitweb cgi for some reason. But will look into it02:53
ianwclarkb: nup, nothing coming up02:53
ianwAH01136: Unescaped URL path matched ProxyPass; ignoring unsafe nocanon02:54
clarkbSamYaple: I don't think there is a good way to delegate that in github today, at least not the way we have the openstack/ org set up. Monty has been fiddling with github hooks/apps around zuul recently so may have ideas02:54
ianwis, that's not new02:54
clarkbSamYaple: the way we worked around that with read the docs was having zuul jobs hit the rebuild api url for read the docs projects. Perhaps there is something similar in dockerhub?02:55
SamYapleclarkb: i can setup a url that you can POST to in dockerhub that triggers the rebuild, can that be done in a post job?02:56
SamYaplethat way it auto rebuilds and dockerhub doesnt need to access github02:56
clarkbya that is basically the exact thing we do with read the docs. The only potential gotcha there is authentication. Read the docs lets anyone trigger a rebuild anonymously against their API (I think they rate limit it though)02:57
SamYaplehmmm. ok so the trigger would be publicly available02:59
SamYaplethats *fine*. it can't hurt things. but its not ideal03:00
clarkbat least in zuulv2.5 (what we are currently running), once 3.0 is up you'll be able to provide your own project specific secrets that you can manage and as long as you don't disclose them only zuul will know03:00
SamYapleanyway to make it secret? (or is this something zuulv3 provides)03:00
SamYapleah cool03:00
SamYaplei guess i can make them public for now. it wont hurt. i can regen the trigger later when zuulv3 lands03:01
SamYapledo you have a link to the way docs does it? (or at least the project i need to dig into)03:01
clarkbSamYaple: https://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/hooks.yaml03:03
SamYapleclarkb: sweet! something i know how to do!03:03
SamYaplethanks for the pointers. its super handy03:03
clarkbno problem03:04
clarkbianw: ok I need to afk again. Kids are screaming at me03:04
clarkbwill pick this up in the morning03:04
ianwclarkb / jeblair : thanks!03:05
ianwif nothing else, i can collate any issues if they occur03:05
SamYapleso a bit of a more broad question. would it be possible to allow arbitrary hooks to be called in post? that way if keystone merges a commit then in the post job it would call a hook to rebuild loci-keystone in dockerhub?03:11
SamYaplei mean i can submit a patchset to project-config right now to do this, but if you let one in...03:11
anticwclarkb: thanks, i appreciate it as i use that feature very heavily03:24
*** xarses_ has joined #openstack-infra03:25
masayukigpabelanger: mtreinish: Is there any update for the 404 for http://mirror.mtl01.inap.openstack.org:8080/registry.npmjs/@gulp-sourcemaps%2fmap-sources ?03:25
*** thorst has joined #openstack-infra03:28
*** rlandy|brb is now known as rlandy03:32
*** Sree has joined #openstack-infra03:32
*** thorst has quit IRC03:33
dmsimardSamYaple: in zuulv3 there's technically nothing preventing you from putting one of your jobs in the keystone project pipelines03:45
dmsimardFor better or for worse03:46
dmsimardWhich might be a problem actually..03:47
dmsimardHey let me put that failing job in your gate for the lulz03:47
*** jdandrea_ has quit IRC03:51
*** hongbin has quit IRC03:51
*** coolsvap has joined #openstack-infra03:51
*** gcb has quit IRC03:53
SamYapledmsimard: right thats sort of what i was getting at03:53
SamYapleto be fair, a fire-and-forget task like `curl -XPOST URL` is hard to break...03:54
SamYapleespecially if you ignore it when it breaks `curl || :`. which i would be ok with03:55
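
The fire-and-forget idea (`curl -XPOST URL || :`) can be sketched in Python as a POST whose failures are swallowed rather than failing the job. This is only an illustration: trigger_rebuild and the example URL are hypothetical, and a real trigger URL would come from the build service.

```python
import urllib.request


def trigger_rebuild(url, opener=urllib.request.urlopen, timeout=10):
    """POST an empty body to url; return True on a 2xx reply, False otherwise.

    Any network or HTTP error is swallowed, mirroring `curl ... || :`.
    URLError, HTTPError and socket timeouts are all subclasses of OSError.
    """
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(trigger_rebuild("https://example.invalid/trigger"))
```

Returning a boolean instead of raising keeps the calling job green while still letting it log whether the trigger went through.
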
dmsimardRight, but the complexity of a job is not a factor in allowing folks to put any job on any project03:55
dmsimardI think we're mostly hoping that people behave properly for the time being03:55
SamYapleyea thats a zuulv3 thing i dont fully understand yet03:56
SamYapleand being realistic, we trust alot of people to not do silly things that abuse the gate03:56
SamYapleand it mostly works out03:56
dmsimardSamYaple: for example, here you see we define jobs for zuul-jobs: https://github.com/openstack-infra/project-config/blob/master/zuul.yaml#L96103:57
dmsimardand here I added jobs elsewhere on the same project: https://review.openstack.org/#/c/503806/03:58
dmsimardzuulv3 gives a lot of freedom over what you can do and how you can do it03:58
SamYapleoh i see.. hmmm03:59
SamYaplei bet most projects wouldn't care/notice a post merge job like that03:59
SamYapleinfra might care if it fails though03:59
dmsimardSurely there are things planned (or already implemented, not 100% familiar with everything in v3) to provide some amount of restrictions03:59
*** e0ne has joined #openstack-infra04:00
dmsimardLike the concept of trusted/untrusted jobs and projects04:00
*** ykarel has joined #openstack-infra04:01
dmsimardianw: btw I addressed your comments in https://review.openstack.org/#/c/504554/ with https://review.openstack.org/#/c/504936/04:01
SamYaplethis is all very good info. ive got enough to do some cool stuff right now, so im going to see how far that takes me04:01
*** ykarel_ has joined #openstack-infra04:03
*** aeng has quit IRC04:03
*** e0ne has quit IRC04:04
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add integration tests for multi-node-bridge role  https://review.openstack.org/50478904:04
dmsimardianw: also note that all those multinode roles are integration tested too, the other reviews are in the topic https://review.openstack.org/#/q/topic:zuulv3-multinode04:04
*** ykarel has quit IRC04:05
ianwdmsimard: cool, thanks, will take a look.  does all look good04:06
*** tpsilva has quit IRC04:09
*** mat128 has quit IRC04:10
*** udesale has joined #openstack-infra04:16
*** claudiub has joined #openstack-infra04:20
*** aeng has joined #openstack-infra04:21
*** thorst has joined #openstack-infra04:29
ianwdmsimard: what's with the check for iptables6 in https://review.openstack.org/#/c/504553/ ?04:31
dmsimardianw: I missed that one, will fix tomorrow04:33
dmsimardNot sure why I did that, must have misremembered the original script04:33
dmsimardPTG was a long week... :)04:33
*** thorst has quit IRC04:34
*** benj_ has quit IRC04:34
ianwdmsimard: that's ok, i tend not to -1 unless i'm sure there's something wrong :)04:40
*** aeng has quit IRC04:50
*** aeng has joined #openstack-infra04:51
*** benj_ has joined #openstack-infra04:55
*** yamamoto has quit IRC04:57
*** yamamoto has joined #openstack-infra05:07
*** gildub_ has joined #openstack-infra05:08
*** gildub_ has quit IRC05:12
*** gildub_ has joined #openstack-infra05:17
*** cody-somerville has quit IRC05:22
*** zhurong has joined #openstack-infra05:24
*** ijw has joined #openstack-infra05:25
*** ijw has quit IRC05:30
*** thorst has joined #openstack-infra05:30
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: Only inject cloud-init in CentOS 7.3  https://review.openstack.org/50485005:32
*** thorst has quit IRC05:34
*** hichihara has quit IRC05:35
*** amotoki_ has joined #openstack-infra05:39
*** sshnaidm|off has quit IRC05:40
*** lbragstad has quit IRC05:44
mordreddmsimard: in v3 there is TOTALLY something preventing you from putting your project in the keystone project pipeline05:49
mordredit's that you can't do that at all ;)05:50
mordreddmsimard: an untrusted project can only manipulate its own pipeline05:50
mordreddmsimard: only config projects can manipulate other project's pipelines05:50
*** amotoki__ has joined #openstack-infra05:52
*** amotoki_ has quit IRC05:52
*** ykarel_ is now known as ykarel05:53
*** aeng has quit IRC05:53
*** amotoki__ has quit IRC05:56
ianwyolanda / AJaeger: fyi as you come online ... gerrit seems fine.  we had a period of memory blow-out and increased 502 errors, leading to https://review.openstack.org/#/c/504993/, but it's all been sane since restart, no errors at all05:57
*** amotoki_ has joined #openstack-infra05:57
*** udesale has quit IRC05:59
*** udesale has joined #openstack-infra06:00
*** jtomasek has joined #openstack-infra06:00
*** mriedem has quit IRC06:05
*** jdandrea has quit IRC06:05
*** akscram1 has quit IRC06:05
*** Jeffrey4l has quit IRC06:05
*** Shrews has quit IRC06:05
*** afazekas has quit IRC06:05
*** mdrabe has quit IRC06:05
*** numans has quit IRC06:05
*** apuimedo has quit IRC06:05
*** dhellmann has quit IRC06:05
*** ilpianista_ has quit IRC06:05
*** uberjay has quit IRC06:05
*** mancdaz has quit IRC06:05
*** harlowja has quit IRC06:05
*** _d34dh0r53_ has quit IRC06:05
*** GregHous- has quit IRC06:05
*** StevenK has quit IRC06:05
*** aspiers has quit IRC06:05
*** _Cyclone_ has quit IRC06:05
*** Krenair has quit IRC06:05
*** ggherdov- has quit IRC06:05
*** jamespage has quit IRC06:05
*** fmccrthy has quit IRC06:05
*** jmccrory has quit IRC06:05
*** wendar has quit IRC06:05
*** tdasilva has quit IRC06:05
*** mwhahaha has quit IRC06:05
*** cmurphy has quit IRC06:05
*** melwitt has quit IRC06:05
*** petrovich has quit IRC06:05
*** nhandler has quit IRC06:05
*** electrical has quit IRC06:05
*** rajinir has quit IRC06:05
*** jpmaxman has quit IRC06:05
*** Ng has quit IRC06:05
*** melwitt has joined #openstack-infra06:05
*** wendar has joined #openstack-infra06:05
*** d34dh0r53 has joined #openstack-infra06:05
*** dhellmann has joined #openstack-infra06:05
*** _Cyclone_ has joined #openstack-infra06:05
*** Jeffrey4l has joined #openstack-infra06:05
*** petrovich has joined #openstack-infra06:05
*** StevenK has joined #openstack-infra06:05
*** uberjay has joined #openstack-infra06:05
*** apuimedo has joined #openstack-infra06:05
*** afazekas has joined #openstack-infra06:05
*** cmurphy has joined #openstack-infra06:05
*** nhandler has joined #openstack-infra06:05
*** aspiers has joined #openstack-infra06:05
*** Shrews has joined #openstack-infra06:05
chandankumarianw: Hello06:05
*** Ng has joined #openstack-infra06:05
*** melwitt is now known as Guest3664106:06
*** numans has joined #openstack-infra06:06
chandankumarianw: https://review.openstack.org/#/c/502224/ review for creating repo for neutron-tempest-plugin got merged06:06
chandankumarianw: but repo is not yet created on git.openstack.org06:06
*** electrical has joined #openstack-infra06:06
chandankumarianw: please have a look, Thanks :-)06:06
*** ggherdov- has joined #openstack-infra06:06
*** fmccrthy has joined #openstack-infra06:06
*** mwhahaha has joined #openstack-infra06:06
*** jamespage has joined #openstack-infra06:06
ianwchandankumar: i think we've stopped puppet as part of gerrit upgrades.  admins will be looking at this tomorrow USA time06:06
ianwso check after that06:06
chandankumarianw: thanks for the info :-)06:07
*** panda|bbl has quit IRC06:08
*** mrunge has quit IRC06:08
*** jdandrea has joined #openstack-infra06:09
*** xarses_ has quit IRC06:09
*** mrunge has joined #openstack-infra06:09
*** masayukig[m] has quit IRC06:09
*** aspiers[m] has quit IRC06:09
*** mriedem has joined #openstack-infra06:10
*** akscram1 has joined #openstack-infra06:10
*** mdrabe has joined #openstack-infra06:10
*** mancdaz has joined #openstack-infra06:10
*** GregHous- has joined #openstack-infra06:10
*** jmccrory has joined #openstack-infra06:10
*** tdasilva has joined #openstack-infra06:10
*** rajinir has joined #openstack-infra06:10
*** jpmaxman has joined #openstack-infra06:10
*** panda has joined #openstack-infra06:11
*** Krenair has joined #openstack-infra06:12
*** mnaser has quit IRC06:12
*** andreas_s has joined #openstack-infra06:22
*** mriedem has quit IRC06:25
ianwclarkb / anticw : so the (cgit) links being available on review-dev are because gitweb isn't enabled there -> https://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/review_dev.pp#n7506:27
ianwonly cgit06:27
*** mnaser has joined #openstack-infra06:28
*** thorst has joined #openstack-infra06:31
*** dizquierdo has joined #openstack-infra06:31
openstackgerritIan Wienand proposed openstack-infra/system-config master: Switch review.o.o to cgit links  https://review.openstack.org/50506706:33
ianwclarkb: ^ an option, maybe.  i've run out of time to figure out why gitweb 404's but i'm sure you'll figure it out :)06:34
*** thorst has quit IRC06:35
AJaegerianw: thanks for update. I notice that I don't get emails by gerrit this morning. Is that working for you?06:36
ianwAJaeger: i am getting emails ok, i think06:37
*** pgadiya has joined #openstack-infra06:38
ianwAJaeger: now you mention it, maybe i'm not ...06:42
*** dhajare has joined #openstack-infra06:43
*** martinkopec has joined #openstack-infra06:44
ianwi dunno, exim is processing and not erroring06:45
fricklerianw: AJaeger: I noticed some mails missing, too, though I also received some06:46
AJaegerianw: didn't get any in the last hour - and rechecked some...06:49
*** ccamacho has joined #openstack-infra06:49
*** florianf has joined #openstack-infra06:49
* AJaeger will do other stuff now and report back later today if this didn't heal itself...06:49
yolandahi, good morning, can i help? reading the messages...06:50
*** zhurong has quit IRC06:54
*** gildub_ has quit IRC06:58
*** xinliang has quit IRC07:02
*** gildub_ has joined #openstack-infra07:03
AJaegerianw, yolanda: It looks like post-jobs are not run at all ;(07:07
AJaegerinfra-root ^07:07
AJaegerhttps://review.openstack.org/#/c/505068/ just merged, I do not see it in status.o.o/zuul anymore (and nothing in post queue)07:07
AJaegerand log files are non-existing: http://logs.openstack.org/aa/aab471e1911cbd91f3ea42c702f712da2c3ed16a/07:08
AJaegerianw, yolanda: Can you check logs and see why post jobs are not run?07:10
AJaeger#status log Zuul is not running any post jobs07:13
openstackstatusAJaeger: finished logging07:13
yolandanothing strange on zuul logs...07:13
AJaegerthanks for checking - that's strange07:14
*** mwarad has quit IRC07:14
*** _mwarad_ has joined #openstack-infra07:14
yolandaalso nodepool is spinning vms07:15
*** xinliang has joined #openstack-infra07:15
AJaegeryolanda: I do not see any post jobs in status.o.o/zuul07:15
yolandayep07:15
yolandathe other queues look fine, but don't know what's going on for post07:15
*** masayukig[m] has joined #openstack-infra07:16
AJaegerat this time there have to be some - since the translation jobs are all run in periodic queue and thus it should queue all post jobs for repos with translations - and we merged a few in the last hour including api-site07:16
ianwyeah i'm not seeing anything obvious07:17
AJaegerLet's send an alert out - what about #status alert Post jobs are not executed currently, do not tag any releases.07:17
*** hashar has joined #openstack-infra07:18
yolandai'm seeing some POST_FAILURE errors07:18
*** rcernin has joined #openstack-infra07:18
yolanda#status alert Post jobs are not executed currently, do not tag any releases07:19
openstackstatusyolanda: sending alert07:19
*** zhurong has joined #openstack-infra07:19
AJaegerthanks, yolanda07:19
AJaegerI'll be back later07:19
*** dhajare has quit IRC07:21
-openstackstatus- NOTICE: Post jobs are not executed currently, do not tag any releases07:22
ianwthe last time post seemed to do something was 2017-09-18 15:01:00,316 DEBUG zuul.IndependentPipelineManager: Finished queue processor: post (changed: True)07:22
*** ChanServ changes topic to "Post jobs are not executed currently, do not tag any releases"07:22
yolandayep, it doesn't seem to be detecting new changes in the queue07:22
*** pcaruana has joined #openstack-infra07:22
AJaegerianw: so, that's before the gerrit update07:22
*** tesseract has joined #openstack-infra07:24
*** armax has quit IRC07:24
openstackstatusyolanda: finished sending alert07:25
*** gildub_ has quit IRC07:30
*** thorst has joined #openstack-infra07:32
*** _mwarad_ has quit IRC07:32
*** itooon has joined #openstack-infra07:33
*** ilpianista_ has joined #openstack-infra07:33
*** aspiers[m] has joined #openstack-infra07:33
*** jpena|off is now known as jpena07:35
*** threestrands has quit IRC07:36
*** thorst has quit IRC07:36
*** itooon has quit IRC07:38
*** xinliang has quit IRC07:42
*** xinliang has joined #openstack-infra07:42
*** xinliang has quit IRC07:42
*** xinliang has joined #openstack-infra07:42
*** thorre_se has joined #openstack-infra07:42
*** alexchadin has joined #openstack-infra07:44
*** thorre has quit IRC07:46
*** thorre_se is now known as thorre07:46
*** aarefiev_ptg is now known as aarefiev07:47
*** dhajare has joined #openstack-infra07:48
*** egonzalez has joined #openstack-infra07:52
*** ralonsoh has joined #openstack-infra07:53
*** mrunge has quit IRC07:54
*** _mwarad_ has joined #openstack-infra07:55
*** mrunge has joined #openstack-infra07:57
*** d0ugal has quit IRC08:01
*** ociuhandu has quit IRC08:03
*** shardy has joined #openstack-infra08:04
*** d0ugal has joined #openstack-infra08:09
*** jpich has joined #openstack-infra08:11
*** ykarel is now known as ykarel|lunch08:11
*** gildub_ has joined #openstack-infra08:12
*** liujiong_lj has joined #openstack-infra08:14
*** liujiong has quit IRC08:15
*** efoley has joined #openstack-infra08:20
*** priteau has joined #openstack-infra08:21
*** zhurong has quit IRC08:24
*** mrunge has quit IRC08:26
*** mrunge has joined #openstack-infra08:26
*** gildub_ has quit IRC08:28
*** thorst has joined #openstack-infra08:33
*** jaosorior has quit IRC08:33
*** jaosorior has joined #openstack-infra08:34
*** hashar has quit IRC08:36
*** thorst has quit IRC08:37
*** hashar has joined #openstack-infra08:37
*** Sree_ has joined #openstack-infra08:38
*** Sree_ is now known as Guest3251408:38
*** Sree has quit IRC08:41
fricklerinfra-root: if you need a break from checking gerrit: ask.o.o seems also to be down for an extended period this morning. usually there is a 30 minute outage at around 6:30, now it is gone for two hours08:44
*** electrofelix has joined #openstack-infra08:45
*** ykarel|lunch is now known as ykarel08:47
*** udesale has quit IRC08:50
*** pbourke has joined #openstack-infra08:54
*** ijw has joined #openstack-infra08:55
*** ijw has quit IRC09:00
*** udesale has joined #openstack-infra09:03
*** alexchadin has quit IRC09:04
*** slaweq has quit IRC09:04
*** alexchadin has joined #openstack-infra09:05
*** udesale has quit IRC09:05
*** udesale has joined #openstack-infra09:05
*** zhurong has joined #openstack-infra09:06
*** slaweq has joined #openstack-infra09:07
*** yamamoto has quit IRC09:07
d0ugalThe is:starred search on gerrit no longer works :-(09:09
d0ugalactually, ignore me - I was just signed out.09:10
d0ugalphew :)09:10
*** udesale__ has joined #openstack-infra09:10
egonzalezhi, is there an estimated time for post jobs to be back to normal state?09:10
*** dizquierdo has quit IRC09:11
*** udesale has quit IRC09:12
*** e0ne has joined #openstack-infra09:24
*** yamamoto has joined #openstack-infra09:32
*** thorst has joined #openstack-infra09:33
*** tosky has joined #openstack-infra09:37
*** thorst has quit IRC09:38
*** s-shiono has quit IRC09:51
*** stakeda has quit IRC09:52
*** ociuhandu has joined #openstack-infra09:55
AJaegeregonzalez: we need to wait for the US part of the team to investigate...09:59
egonzalezAJaeger, thanks10:00
*** ociuhandu has quit IRC10:01
*** liujiong_lj has quit IRC10:05
openstackgerritThierry Carrez proposed openstack-infra/odsreg master: Update OpenStack logo  https://review.openstack.org/50514510:06
*** cuongnv has quit IRC10:07
*** nicolasbock has joined #openstack-infra10:09
*** Guest32514 has quit IRC10:10
frickleranother gerrit bug: every patch is shown has having the same topic as itself, so there is always at least "Same Topic (1)"10:10
fricklers/has/as/10:11
*** Sree has joined #openstack-infra10:12
openstackgerritOpenStack Proposal Bot proposed openstack-infra/project-config master: Normalize projects.yaml  https://review.openstack.org/50514810:12
*** sambetts|afk is now known as sambetts10:13
*** alexchadin has quit IRC10:13
openstackgerritThierry Carrez proposed openstack-infra/odsreg master: Increase title to 60 chars  https://review.openstack.org/50515010:15
openstackgerritThierry Carrez proposed openstack-infra/odsreg master: Single topic uses admins to review  https://review.openstack.org/50515110:15
*** Sree has quit IRC10:16
*** thorst has joined #openstack-infra10:18
*** thorst has quit IRC10:19
*** mrunge has quit IRC10:23
*** mrunge has joined #openstack-infra10:25
*** _mwarad_ has quit IRC10:26
*** bhavik1 has joined #openstack-infra10:30
*** alexchadin has joined #openstack-infra10:31
*** udesale__ has quit IRC10:32
*** udesale has joined #openstack-infra10:32
*** jtomasek has quit IRC10:33
*** bhavik1 has quit IRC10:33
*** yamamoto has quit IRC10:34
*** jtomasek has joined #openstack-infra10:34
*** alexchadin has quit IRC10:36
*** yamamoto has joined #openstack-infra10:38
*** jkilpatr has quit IRC10:42
*** yamamoto has quit IRC10:47
fricklerinfra-root: it looks like mails from gerrit may not be lost, but just delayed quite a bit, see these headers with more than 4h delay: http://paste.openstack.org/show/621408/10:48
*** timothyb89 has quit IRC10:48
fricklernot consistent, though, I also did receive a mail at 10:20 with only 2 minutes delay10:51
*** udesale has quit IRC10:54
*** udesale has joined #openstack-infra10:55
*** alexchadin has joined #openstack-infra10:55
*** dtantsur|afk is now known as dtantsur10:58
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: WIP: Honor cloud quotas before launching nodes  https://review.openstack.org/50383811:00
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Don't fail on quota exceeded  https://review.openstack.org/50305111:00
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Make max-servers optional  https://review.openstack.org/50428211:00
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support cores limit per pool  https://review.openstack.org/50428311:00
openstackgerritTobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support ram limit per pool  https://review.openstack.org/50428411:00
*** wolverineav has joined #openstack-infra11:05
*** dhajare has quit IRC11:05
*** dhajare has joined #openstack-infra11:05
*** sdague has joined #openstack-infra11:12
*** cshastri has quit IRC11:13
*** baoli has joined #openstack-infra11:14
*** jkilpatr has joined #openstack-infra11:16
*** thorst has joined #openstack-infra11:19
*** yamamoto has joined #openstack-infra11:23
*** wolverineav has quit IRC11:24
*** thorst has quit IRC11:26
*** baoli has quit IRC11:28
openstackgerritMathieu Velten proposed openstack-infra/project-config master: Update Magnum DCOS image build  https://review.openstack.org/50443111:29
*** tpsilva has joined #openstack-infra11:29
*** alexchadin has quit IRC11:33
*** martinkopec has quit IRC11:33
*** alexchadin has joined #openstack-infra11:34
*** ldnunes has joined #openstack-infra11:35
*** alexchadin has quit IRC11:35
hwoaranggood day11:36
hwoarangsince the gerrit update last night my custom dashboards do not work anymore11:36
*** alexchadin has joined #openstack-infra11:36
hwoaranghas something changed so gerrit-dash-creator has to be updated or is there something wrong with gerrit?11:36
hwoarangfor example the one returned by ./gerrit-dash-creator dashboards/openstack-ansible.dash returns no results which is not normal11:37
*** jpena is now known as jpena|lunch11:40
dmsimardmordred: where can we see what projects are trusted and which aren't ? I guess I've only ever had to deal with trusted projects so far cause I haven't seen that kind of restriction in action yet11:41
*** pgadiya has quit IRC11:45
*** ociuhandu has joined #openstack-infra11:48
*** zhurong has quit IRC11:49
*** alexchadin has quit IRC11:56
*** alexchadin has joined #openstack-infra11:57
*** jaosorior has quit IRC12:02
*** efoley has quit IRC12:03
*** efoley_ has joined #openstack-infra12:03
Shrewsdmsimard: http://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul/main.yaml12:03
*** thorst has joined #openstack-infra12:04
*** hwoarang has quit IRC12:05
*** panda is now known as panda|lunch12:05
*** szahers has joined #openstack-infra12:07
*** hwoarang has joined #openstack-infra12:07
*** wolverineav has joined #openstack-infra12:08
*** jdandrea_ has joined #openstack-infra12:09
szahersHi folks, We are trying to publish freezer docs, so I have created this change https://review.openstack.org/#/c/504329/ and it got merged, but It seems we are missing something as I can't see freezer docs on https://docs.openstack.org/freezer/latest/ (it's required to get this change https://review.openstack.org/#/c/504325/ to pass)12:09
szahersanyone can help or let me know what should we do ?12:09
*** alexchadin has quit IRC12:13
*** alexchadin has joined #openstack-infra12:13
fungiianw: clarkb: yeah, we had been wanting to switch from linking the internal gitweb to linking external cgit instead so as to take more load off gerrit (and possibly allow us to turn off the former eventually). i'm in favor of going ahead with that switch now12:15
*** jdandrea_ has quit IRC12:16
*** dprince has joined #openstack-infra12:16
*** martinkopec has joined #openstack-infra12:17
*** LindaWang has quit IRC12:17
*** LindaWang has joined #openstack-infra12:18
AJaegerszahers: see channel topic - post jobs are not running at all currently. And your docs are published from post jobs. We need to fix that first and then your next change will publish in the post pipeline.12:18
*** rlandy has joined #openstack-infra12:19
szahersAJaeger ah, my bad. Thanks :)12:20
fungihwoarang: try dropping ,n,z from the end of the dashboard url? that was a deprecated option format which was finally dropped sometime between 2.11 and 2.1312:21
fungii'll check the gerrit event stream to see what may have changed about the sorts of events on which we're triggering the post pipeline12:22
hwoarangfungi: there are no such characters at the end or anywhere in the url12:23
*** trown|outtypewww is now known as trown12:23
fungihwoarang: in that case, you may need to manually reconstruct some similar query through the search box in the webui and see how it differs from what the dashboard creator generates12:24
hwoarangyeah i thought so ..12:24
hwoarangwill try that12:24
fungiunfortunately gerrit likes to occasionally change their url formats between releases12:25
*** acoles has joined #openstack-infra12:25
*** pblaho has joined #openstack-infra12:26
acolesHi, thanks infra team for merging this https://review.openstack.org/#/c/504127/ - we're still not seeing alerts for those branches in #openstack-swift, is there anything else that needs to happen to kick gerritbot?12:27
fungiacoles: config updates of the gerrit server are temporarily disabled while we work through some remaining issues following yesterday's upgrade maintenance12:29
fungiso i expect that change hasn't been applied yet12:29
acolesfungi: ok. thanks for replying.12:29
*** dhajare has quit IRC12:30
*** efoley has joined #openstack-infra12:32
*** efoley_ has quit IRC12:32
*** markmcd has quit IRC12:35
*** kgiusti has joined #openstack-infra12:36
*** camunoz has joined #openstack-infra12:37
*** markmcd has joined #openstack-infra12:37
* mhayden tips his hat to the folks who worked hard on upgrading gerrit -- works great!12:38
fungiinfra-root: so on the post pipeline triggering... zuul is configured to trigger on ref-updated events; i see some of those while listening on the event stream12:38
*** sshnaidm has joined #openstack-infra12:38
fungigoing to try and track one down in the zuul debug log next12:38
*** Sree has joined #openstack-infra12:39
*** markmcd has quit IRC12:39
fungilog says zuul's adding and processing ref-updated trigger events12:41
*** markmcd has joined #openstack-infra12:41
fungimhayden: we're not out of the woods yet... increased memory utilization, missing gitweb links, post pipeline jobs aren't triggering... but thanks!12:42
*** jpena|lunch is now known as jpena12:42
fungiit's a complex system, so at least we didn't expect to catch everything before upgrading and knew there would be issues to work through today12:42
sdaguefungi: ah, maybe we need to upgrade gerrit dash creator?12:45
*** dave-mccowan has joined #openstack-infra12:46
fungisdague: likely, but i don't know the details there12:46
*** sshnaidm has quit IRC12:47
*** dave-mcc_ has joined #openstack-infra12:51
*** dave-mccowan has quit IRC12:51
*** pblaho has quit IRC12:52
*** esberglu has joined #openstack-infra12:52
*** isviridov_away has quit IRC12:52
*** vaidy has quit IRC12:52
*** LindaWang has quit IRC12:53
*** szahers has quit IRC12:54
*** bh526r has joined #openstack-infra12:55
*** pblaho has joined #openstack-infra12:55
sdagueoh, you know what changed12:55
sdaguelabel:Code-Review>=-2,self12:56
sdagueused to only match on things with -2,-1,1,212:56
sdaguenow it matches on unvoted on things12:56
*** jcoufal has joined #openstack-infra12:56
fungi"unvoted" as in changes where you've left a comment with a 0 vote?12:57
*** Goneri has joined #openstack-infra12:57
*** mriedem has joined #openstack-infra12:57
sdagueor never voted12:58
sdagueor never left a comment12:58
*** mat128 has joined #openstack-infra12:58
fungithat's certainly strange12:58
sdaguethat it changed, or the old behavior13:00
sdaguethe old behavior was equally odd13:00
sdaguewell, I sent an ML post out about it, about to have network outage as they upgrade fiber here13:03
dmsimardfungi: do you happen to know if jobs running zuul v3 are in logstash yet ?13:05
*** hashar_ has joined #openstack-infra13:06
fungidmsimard: no clue13:06
dmsimardfungi: searching for "build_change" with a review number doesn't yield anything13:06
dmsimardI know jeblair was working on it, perhaps it's not finished yet13:06
dmsimardI wanted to create a E-R query for an Ansible issue I've been noticing.13:07
*** bobh has joined #openstack-infra13:07
*** hashar has quit IRC13:08
*** amotoki_ has quit IRC13:09
*** vaidy has joined #openstack-infra13:09
*** gouthamr has joined #openstack-infra13:13
*** isviridov_away has joined #openstack-infra13:13
openstackgerritDavid Moreau Simard proposed openstack-infra/elastic-recheck master: Add query for Ansible privilege escalation timeout  https://review.openstack.org/50523313:14
openstackgerritDavid Moreau Simard proposed openstack-infra/elastic-recheck master: Add query for Ansible privilege escalation timeout  https://review.openstack.org/50523313:17
*** udesale has quit IRC13:17
*** acoles has left #openstack-infra13:17
sdagueoh, this is pretty good, reviewedby: actually is the better version of it all13:19
*** chlong has joined #openstack-infra13:19
hwoarangfungi: initial debugging suggests that new gerrit doesn't understand regexp in the foreach statement13:23
hwoarangif you hardcode the names of the projects you are looking for then dashboard seems to render fine13:23
*** panda|lunch is now known as panda13:25
fungihwoarang: are you starting the regex with ^13:27
*** Sree_ has joined #openstack-infra13:28
fungithe api at least seems to handle regular expressions fine as long as they start with a ^ character13:28
*** alexchadin has quit IRC13:28
*** Sree_ is now known as Guest733513:28
*** felipemonteiro_ has joined #openstack-infra13:28
*** felipemonteiro__ has joined #openstack-infra13:29
*** Sree has quit IRC13:29
*** markvoelker has joined #openstack-infra13:30
*** felipemonteiro_ has quit IRC13:33
*** tiswanso has quit IRC13:33
*** LindaWang has joined #openstack-infra13:33
*** ihrachys has joined #openstack-infra13:34
hwoarangyeah it starts with ^13:37
*** dtantsur is now known as dtantsur|lunch13:38
*** coolsvap has quit IRC13:42
*** ijw has joined #openstack-infra13:46
hwoarangvery strange because this query does work https://review.openstack.org/#/q/project:openstack/openstack-ansible-rabbitmq_server+AND+status:open but this dashboard (which is similar afaics) doesn't https://review.openstack.org/#/dashboard/?foreach=%28project%3Aopenstack%2Fopenstack%2Dansible%2Drabbitmq_server%29+status%3Aopen&title=OpenStack%2DAnsible+Review+Inbox&Foobar=age%3A30d)13:49
hwoaranganyway13:49
*** ijw has quit IRC13:50
dmsimardfungi: I forgot about the warning for not tagging releases13:50
dmsimardfungi: and I tagged something and it's running so I guess whatever it was it works now ?13:51
dmsimardthis is pre-release FWIW13:51
*** hongbin has joined #openstack-infra13:51
fungidmsimard: it's apparently post pipeline jobs which are having trouble, so might not actually impact most release activities13:51
dmsimardfungi: is anyone looking into that ? can I help ?13:52
fungidmsimard: except stuff driven through release management, which relies on a post job for changes merged to the releases repo to trigger automated tagging13:52
sdaguehwoarang: age:30d means patches that haven't changed in over 30d13:53
sdaguethere are only 3 patches13:53
fungidmsimard: all i can tell so far is that we're still getting ref-updated events and the zuul debug log indicates it's adding the trigger and processing them13:53
sdaguethey all have seen activity in the past week13:53
fungidmsimard: but it's not launching any jobs so something's not matching correctly i guess... still digging13:53
dmsimardfungi: so a review that merges and triggers POST, the merge occurs properly but then the post job doesn't trigger ?13:54
hwoarangsdague: true :/ i will keep digging for a simpler reproducer13:54
sdaguehwoarang: what are you trying to figure out?13:55
*** wolverineav has quit IRC13:55
hwoarangsdague: the link produced by ./gerrit-dash-creator  dashboards/openstack-ansible.dash returns 0 results for me whereas two days ago the list was pretty massive13:55
sdaguehwoarang: is it in gerrit dash creator repo?13:56
hwoarangyep13:56
sdaguelabel:Code-Review>=0,self13:57
sdaguethat's your problem13:57
sdaguesee the email that I sent13:57
fungidmsimard: yeah, i think something about the parameter names may have changed in ref-updated events causing zuul not to find commit details, but that's just a hunch so far. i'm trying to match up the parameter names and data types to what it expects13:58
sdaguehwoarang: http://lists.openstack.org/pipermail/openstack-dev/2017-September/122277.html13:58
sdagueCode-Review=0 used to be a noop13:58
hwoarangahh thank you sdague13:58
sdaguenow it actually matches everything you've never voted on13:58
*** efoley has quit IRC14:01
*** efoley has joined #openstack-infra14:01
*** Guest36641 is now known as melwitt14:03
*** erlon has joined #openstack-infra14:05
*** esberglu has quit IRC14:06
*** esberglu has joined #openstack-infra14:07
yolandahwoarang, do you recognize this error on suse? http://logs.openstack.org/28/505128/1/check/gate-bifrost-integration-tinyipa-opensuse-423/ff4c568/console.html14:08
*** srobert has joined #openstack-infra14:08
*** rbrndt has joined #openstack-infra14:09
*** armax has joined #openstack-infra14:09
hwoarangyolanda: seems like something (bindep?) called 'zypper install' without providing any package name14:10
yolandai was testing this on centos, and i had to pin the version to work, but seems to fail for suse14:10
dmsimardfungi: I'll cheer from the sidelines :(14:10
hwoarangyolanda: what patch is this?14:11
fungidmsimard: it's slow going because i still have inlaws visiting for my wife's birthday14:11
*** jheroux has joined #openstack-infra14:11
yolandahttp://logs.openstack.org/28/50512814:11
dmsimardfungi: zuul logs aren't in logstash are they ?14:11
yolanda505128 is the patch14:11
*** esberglu has quit IRC14:11
yolandahwoarang, actually i'm getting output from the test run, and i get the same for centos: bindep -b &> /dev/null || sudo -H -E /bin/yum -y install14:12
yolandathat's the command that is generated14:12
hwoarangoh bindep 2.3 will not work for suse14:12
yolandaand 2.4 seem to fail on centos14:12
yolandasomething with unicode14:12
hwoarangyolanda: there are some suse specific fixes that made it after 2.3 especially the opensuse leap support14:13
hwoarangwhich is what the gate uses14:13
yolandahwoarang, are you familiar with bindep then? let me show you the error on centos14:13
yolandamm, weird, now it doesn't give errors, but on a clean system it was failing with some unicode error14:15
hwoarangmaybe the locale was not set properly?14:15
yolandamm, fails on my fedora as well14:15
clarkbfungi: you haven't managed to record the json data for a ref-updated event yet, have you? docs say the content is refName, I wonder if that used to be called "ref"14:15
yolandahwoarang, http://paste.openstack.org/show/621434/14:16
yolandawith 2.5.014:16
clarkbwe have a fix for the dashboard so that leaves zuul post, email slowness, gitweb, and memory consumption on the list of items I see from scrollback14:17
fricklerask.o.o is still offline, anyone up for a quick apache restart maybe?14:18
fricklerclarkb: although not urgent, please also add the "each patch has the same topic as itself" to that list, I'm pretty sure that didn't happen before14:19
*** yamamoto has quit IRC14:20
*** esberglu has joined #openstack-infra14:20
fungiclarkb: i have one, just a sec and i'll paste14:20
*** baoli has joined #openstack-infra14:21
clarkbfrickler: I'm not sure I understand that one. I know I tested search by topic worked after the upgrade. Can you expand on the behavior you are seeing?14:22
*** wolverineav has joined #openstack-infra14:22
fungiclarkb: http://paste.openstack.org/show/62143614:22
fungithere's a bunch of them14:22
fricklerclarkb: when viewing a singleton patch like https://review.openstack.org/504349 , there is a tab on the right with "Same Topic (1)" and a link to the patch itself. IMHO that tab should only exist when there are other patches with the same topic14:23
fungithat sounds more like a behavior change than a bug14:23
jeblairfungi: those are all updates to changes so they wouldn't be enqueued in post; we need a ref update to a branch14:24
fungijeblair: ahh, yep, i'll see if i can nab one of those14:24
sdaguefrickler: it's definitely a change14:25
clarkbI think 2.11 did that too in some situations. Related changes would include the current change for example14:25
hwoarangyolanda: actually bindep fails here as well14:25
fungitrap laid14:25
jeblair505248,1 may be about to merge14:25
hwoarangsomething is not right14:25
sdaguefrickler: however, it's honestly less confusing UX for that list to exist all the time instead of only if >=2 patches in the series14:25
*** bobh has quit IRC14:26
fungijeblair: i'm listening to the event stream for ref-updated events and filtering out any for refs/changes/ so hopefully should see one when that merges14:26
jeblairfascinating -- i'm seeing events for "project":"All-Users"14:26
jeblairi wonder if that happens when folks update their settings14:26
*** efoley has quit IRC14:26
yolandahwoarang, if you export LANG to en_US works? seemed to do the trick in my fedora14:27
fungiso i guess it's possible gerrit no longer emits ref-updated on changes merging. i can't find evidence i caught any earlier besides for change updates14:27
hwoarangyolanda: it actually fails with AttributeError: 'Depends' object has no attribute 'platform'14:27
fungichecking docs to see if they made that into a new event type14:27
jeblair{"submitter":{"name":"Jenkins","username":"jenkins"},"refUpdate":{"oldRev":"a74c237cf717246d659cef241a5d09096cbbc2a0","newRev":"5c5ca5f1eeca79ab8c13ac6b7221077f78f9fa6b","refName":"refs/heads/master","project":"openstack/openstack-ansible-rabbitmq_server"},"type":"ref-updated","eventCreatedOn":1505831339}14:30
yolandahwoarang, with 2.5.0?14:30
jeblairfungi, clarkb: ^14:30
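The filtering fungi describes (watch the event stream for ref-updated events and skip anything under refs/changes/) could be sketched roughly as below. This is only an illustration based on the event shape pasted above; the helper name is_branch_update is hypothetical and is not part of zuul or gerrit.

```python
import json


def is_branch_update(line):
    """Return True for a ref-updated event whose ref is not a change ref.

    Hypothetical helper: field names ("type", "refUpdate", "refName")
    are taken from the event pasted in the channel.
    """
    event = json.loads(line)
    if event.get("type") != "ref-updated":
        return False
    ref_name = event.get("refUpdate", {}).get("refName", "")
    # Updates to refs/changes/... are patchset uploads, not merges to a branch.
    return not ref_name.startswith("refs/changes/")


# The event jeblair pasted, as one line of stream output.
sample = (
    '{"submitter":{"name":"Jenkins","username":"jenkins"},'
    '"refUpdate":{"oldRev":"a74c237cf717246d659cef241a5d09096cbbc2a0",'
    '"newRev":"5c5ca5f1eeca79ab8c13ac6b7221077f78f9fa6b",'
    '"refName":"refs/heads/master",'
    '"project":"openstack/openstack-ansible-rabbitmq_server"},'
    '"type":"ref-updated","eventCreatedOn":1505831339}'
)

if __name__ == "__main__":
    print(is_branch_update(sample))
```

Feeding each line of the stream through a check like this would leave only branch (and tag) updates, which is the kind of event the post pipeline is expected to trigger on.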
hwoarangyep14:30
fricklersdague: hmm, not sure I agree, particularly when it may hide other more relevant tabs like cherry-picks or conflicts. but maybe that's really a matter of taste14:30
jeblairit looks like they stopped dropping refs/heads from the ref on branches14:30
jeblairwhich means two things:14:31
jeblair1) our pipeline config is wrong  2) our jobs are all wrong14:31
fungioh, ugh. that's going to be a ton of changes to jobs14:31
fungiyeah, that14:31
jeblairmordred: ^ this could have a significant impact on v3 conversion14:31
clarkb1) should be straightforward its 2) that will be painful unless we want to bake behavior compat into zuul14:32
fungii wonder if it also impacts how zuul-cloner needs to handle $ZUUL_REF(NAME)14:32
*** sdague has quit IRC14:32
jeblairactually........14:32
jeblairi *think* we were clever enough to standardize zuul.ref in v3! :)14:32
jeblairso all the v3 job content should not be affected by the change14:32
yolandahwoarang, forced the LANG on that patch, let's see if that helps suse, or it's another problem14:32
jeblairclearly we should migrate to v3 before upgrading gerrit.  :|14:33
clarkbshould be simpleish to add a pipeline trigger flag to drop refs/.*/ from the refName ?14:34
*** ykarel has quit IRC14:35
*** efoley has joined #openstack-infra14:36
jeblairneeds to be a driver config option, but yeah.14:36
jeblairthough i'm only okay with that because we're throwing the code away next week.  :)14:36
clarkband we can apply that before checking the condition match too so both 1 and 2 are addressed14:37
*** rbrndt has quit IRC14:37
*** rbrndt has joined #openstack-infra14:38
clarkbthough have to be careful to only do it for refs/heads and refs/tags I think14:38
jeblairclarkb: just refs/heads14:38
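The compat behavior being discussed (drop the refs/heads/ prefix from branch refs only, leaving tags and everything else alone) might look something like the sketch below. It is not the actual zuul change; the function name strip_branch_ref here is a stand-in chosen for illustration.

```python
BRANCH_PREFIX = "refs/heads/"


def strip_branch_ref(ref_name):
    """Return ref_name without a leading refs/heads/, if present.

    Illustrative only: refs/tags/... and any other ref are returned
    unchanged, matching the "just refs/heads" decision in the channel.
    """
    if ref_name.startswith(BRANCH_PREFIX):
        return ref_name[len(BRANCH_PREFIX):]
    return ref_name


if __name__ == "__main__":
    for ref in ("refs/heads/master", "refs/heads/stable/pike", "refs/tags/1.0.0"):
        print(ref, "->", strip_branch_ref(ref))
```

Applying the normalization before the pipeline's ref match is evaluated is what would let both the existing pipeline regex and the existing job content keep seeing the short branch name.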
mordredjeblair: yes - I agree, I think we're fine in v314:38
*** esberglu has quit IRC14:39
*** esberglu has joined #openstack-infra14:40
jeblairclarkb: want me to write that patch?14:40
clarkbjeblair: yes please. I am going to get a list going of the data we have collected so far14:40
*** sdague has joined #openstack-infra14:40
*** jrist has quit IRC14:41
*** esberglu has quit IRC14:41
clarkband things that need investigation and or fixing14:41
*** esberglu has joined #openstack-infra14:41
tonybCan someone check on 503601 I can't see it in status.o.o/zuul by either change number or git SHA.  so I'm wondering if it dropped somehow with the update14:41
*** baoli has quit IRC14:42
*** rossella_s has quit IRC14:43
*** baoli has joined #openstack-infra14:43
*** baoli has quit IRC14:44
*** baoli has joined #openstack-infra14:44
*** martinkopec has quit IRC14:46
*** sdague has quit IRC14:46
*** rossella_s has joined #openstack-infra14:46
hwoarangyolanda: my problem is a regression in tumbleweed14:47
clarkbtonyb: I think because it didn't get a jenkins +1 it isn't going into the gate after workflow +1. I expect a recheck would sort it out14:48
AJaegertonyb: agree with clarkb ^14:48
*** jaosorior has joined #openstack-infra14:48
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add strip_branch_ref compat option  https://review.openstack.org/50529014:48
jeblairclarkb, mordred: ^14:49
mordredjeblair: +214:50
tonybclarkb, AJaeger: Thanks.  As often happens I had the tools I just didn't use them correctly.14:50
*** Guest7335 has quit IRC14:51
openstackgerritJames E. Blair proposed openstack-infra/puppet-zuul master: Add gerrit_strip_branch_ref option  https://review.openstack.org/50529214:51
clarkbjeblair: lgtm should I go ahead and approve? mordred has +2'd as well14:52
mordredclarkb: ++14:52
jeblairclarkb: ya14:52
jeblairafter it lands, we'll need to make a new branch with the other patch on zuul.o.o14:53
*** szahers has joined #openstack-infra14:53
openstackgerritTristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: configloader: don't use path in SourceContext comparaison  https://review.openstack.org/50529314:53
fungiclarkb: if you're putting together a list, https://review.openstack.org/505067 is a good candidate i think14:53
clarkbhttps://etherpad.openstack.org/p/gerrit-2.13-issues is the list14:53
fungithanks14:53
*** david-lyle has joined #openstack-infra14:53
fungiand looks like it's already on there14:53
jeblairmordred: can you update the regex in the zuulv3 pipeline config?14:55
clarkbfungi: just added it :)14:55
clarkbof all of these the one that concerns me most is the memory consumption. changes to jgit were supposed to mean that gerrit didn't cache everything until we had to restart anymore, but could mean we just use more memory in general...14:56
*** beekneemech is now known as bnemec14:56
clarkbit also seems to be a reaction to load, as it clearly picks up as north america wakes14:57
openstackgerritJames E. Blair proposed openstack-infra/puppet-openstackci master: Pass through gerrit_strip_branch_ref  https://review.openstack.org/50529614:57
clarkb(there is image link on etherpad above)14:57
clarkbjeblair: oh are we going to have to edit the zuul layout voluptuous checker too?14:57
clarkbjeblair: I wonder if ^ will fail without editing your change14:57
jeblairclarkb: it's a zuul.conf setting14:58
clarkbah14:58
clarkbbecause it is part of the connection roger14:58
openstackgerritMerged openstack-infra/zuul master: Add strip_branch_ref compat option  https://review.openstack.org/50529014:59
*** lbragstad has joined #openstack-infra14:59
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul  https://review.openstack.org/50529714:59
jeblairokay "topic:branch-ref" should be ready14:59
*** xarses_ has joined #openstack-infra15:00
jeblairmordred: nevermind, i'll get the zuulv3 config15:00
*** szahers has quit IRC15:00
*** cody-somerville has joined #openstack-infra15:00
mordredjeblair: (sorry, in my bi-weekly corporate overlord staff meeting)15:01
dmsimardinfra-root: image builds (at least on review.rdo's nodepool) are broken due to https://github.com/openstack-infra/project-config/commit/0335417d0d1b6b69cc95da0134b15f7e67878f3b merging and the repository not existing on git.o.o15:02
mordreddmsimard: I think that's on the TDL for fixing - thanks15:03
*** andreww has joined #openstack-infra15:03
dmsimardyeah I suspect it's due to the ref issue15:03
*** ccamacho has quit IRC15:03
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Update post ref regex  https://review.openstack.org/50530015:03
*** andreww has quit IRC15:03
*** trown is now known as trown|brb15:03
*** baoli has quit IRC15:03
dmsimardmordred: TDL is https://etherpad.openstack.org/p/gerrit-2.13-issues ? I'll add it there.15:04
*** baoli has joined #openstack-infra15:04
mordreddmsimard: well, we also had a puppet issue which is causing puppet to have sads on git.o.o15:05
*** xarses_ has quit IRC15:05
mordreddmsimard: which I _think_ is the root cause of that particular thing15:05
*** andreas_s has quit IRC15:05
dmsimardmordred: the paramiko thing, yes15:05
mordredyah15:05
dmsimardmordred: I thought new projects were created by jeepyb ?15:05
* mordred is stuck in morning meetings so hasn't had a chance to be very helpful yet15:05
dmsimardor is that just gerrit15:05
*** amoralej has joined #openstack-infra15:05
mordreddmsimard: yes. but jeepyb is run by puppet15:05
dmsimardohhhhhh15:05
*** andreww has joined #openstack-infra15:05
mordredso puppet is stuck on git0*.o.o, which means that those aren't getting updated15:06
*** andreww is now known as xarses_15:06
openstackgerritMatthew Thode proposed openstack-infra/project-config master: make a gentoo nodepool image  https://review.openstack.org/50453015:06
dmsimardmordred: I'll try and reproduce on my end, although the fix is probably to just disable EPEL :/15:07
dmsimardmordred: the base OS version of python-paramiko (provided by centos extras) is much more recent than the one from EPEL15:07
portdirecthey - I'm having trouble linking a blueprint to a ps - I've tried a few things, but wondering if there's anything up since the gerrit upgrade?15:07
pabelangeryup, once we get puppet going again, it should be created15:07
dmsimardmordred: python-paramiko probably got a huge bump in centos extras due to ansible being shipped in extras now.15:07
*** trown|brb is now known as trown15:08
dmsimardwhat's happening is that the version of extras is already installed and it's trying to install the version from epel on top or something.15:08
*** rossella_s has quit IRC15:08
clarkbinfra-root thinking it would be good to avoid restarting gerrit for as long as possible (I know this delays fixing stuff like gitweb/cgit) because the longer it runs the better memory profile we will get. It is possible it just needs more memory than before but will reach steady state but also possible it just wants a big gulp amount of memory and we will have a hard time seeing that if we restart it15:08
clarkboften15:08
dmsimardbut I can't do much beyond speculate15:08
clarkbportdirect: what method are you using to link the two?15:08
dmsimardclarkb: could it be related to the core count/thread bump we did ?15:08
amoralejwould it be possible to get a new release of diskimage-builder?, i'd like to get a new tag which includes https://review.openstack.org/#/c/500212/15:09
clarkbdmsimard: no, we have had the same core count as before that was just a change to the reindex step which is a process that exited when complete15:09
dmsimardclarkb: oh.15:09
clarkbdmsimard: that reindex step builds on disk indexes for gerrit and is a one off step15:09
pabelanger+2 on topic:branch-ref15:09
fungijeblair: topic:branch-ref lgtm15:10
*** rossella_s has joined #openstack-infra15:10
jeblairthey're all safe to approve15:10
fungii can go back and approve them now that someone else has also +2'd15:10
fungihrm, gate-infra-puppet-apply-3-ubuntu-trusty fails on the last one15:11
portdirectclarkb: tried just the "Paritially implements" in the commit message: https://review.openstack.org/#/c/504089/15:11
*** lbragstad has quit IRC15:11
portdirectwhich for us as a hosted project used to fail on the search in launchpad, but would at least link it there.15:11
fungiPuppet (err): Invalid parameter gerrit_strip_branch_ref on Class[Openstackci::Zuul_scheduler] at Class[Openstackci::Zuul_scheduler] at /etc/puppet/modules/openstack_project/manifests/zuul_prod.pp:5815:12
fungioh! wrong depends-on i think15:13
jeblairfungi: yep! i'll fix15:14
fungiit's already queued in my gertty15:14
fungibut... isn't getting pushed to gerrit for some reason15:14
jeblairfungi: i see you updated it15:14
fungigertty's gone to offline sync15:14
openstackgerritJeremy Stanley proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul  https://review.openstack.org/50529715:15
fungithere it goes15:15
clarkbportdirect: http://paste.openstack.org/show/621445/ possibly related to the broken search?15:15
mordredfungi: +315:16
*** slaweq has quit IRC15:16
*** makowals has quit IRC15:16
*** yamamoto has joined #openstack-infra15:20
*** ccamacho has joined #openstack-infra15:21
*** yamamoto has quit IRC15:25
*** dimak has quit IRC15:25
*** dimak has joined #openstack-infra15:26
*** bobh has joined #openstack-infra15:26
*** slaweq_ has quit IRC15:27
*** slaweq has joined #openstack-infra15:28
openstackgerritMerged openstack-infra/puppet-zuul master: Add gerrit_strip_branch_ref option  https://review.openstack.org/50529215:28
*** csomerville has joined #openstack-infra15:29
clarkbinfra-root: proposal, let's get zuul post stuff sorted (in progress yay) since zuul puppeting should work as soon as we reenable puppeting globally. Then focus on fixing git backends puppeting so that we can start applying some of the more gerrit specific fixes. The etherpad https://etherpad.openstack.org/p/gerrit-2.13-issues has thoughts/ideas/fixes for each entry so far15:29
clarkbthat should give us time to gather more data on memory utilization before restarting as well15:29
clarkbwhile still actively working to fix things15:29
clarkbonce I'm off my weekly tuesday call I will work to get puppet running again15:30
openstackgerritMerged openstack-infra/puppet-openstackci master: Pass through gerrit_strip_branch_ref  https://review.openstack.org/50529615:31
*** bobh has quit IRC15:31
*** cody-somerville has quit IRC15:32
*** jaosorior has quit IRC15:32
*** Swami has joined #openstack-infra15:33
openstackgerritMonty Taylor proposed openstack-infra/shade master: Add method to set bootable flag on volumes  https://review.openstack.org/50247915:33
mordredclarkb: ++15:34
mordredclarkb: great plan by me15:34
pabelanger+115:34
openstackgerritJeremy Stanley proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul  https://review.openstack.org/50529715:35
clarkbI've just approved https://review.openstack.org/#/c/504993/1 which bumps gerrit heap memory usability to 48GB (from 30GB) this was applied by hand last night and getting that in ensures we don't regress once puppet is running again15:36
*** e0ne has quit IRC15:37
fungiweird, why did gerritbot rereport that?15:37
fungiuh15:38
fungigertty reverted that somehow15:38
fungilike, on its own15:39
clarkbfungi: ?15:39
fungichecking its logs now, but wondering if that was some sort of timeout where it pushed the edit but never got a response and tried to roll it back15:39
fungi50529715:39
jeblairfungi: let me push the update to fix it15:40
fungithanks15:40
fungithis sort of jives with the behavior people were reporting last night about the webui getting hung up after pushing edits through it15:40
clarkbthough in those cases it seemed to "work" in the end? I guess error handling in gertty could be the different behavior?15:41
jeblairi'm seeing similar issues as fungi did15:41
jeblairi have debug log enabled, so can comb through this later15:41
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul  https://review.openstack.org/50529715:41
fungilooks like maybe it gave up after waiting 20 minutes15:41
fungiand yeah, i don't have debug logging enabled15:42
clarkbfungi: maybe check for errors on the gerrit side?15:43
*** LindaWang has quit IRC15:45
funginothing in the error log i could find matching that change-id or numeric index15:46
*** sdague has joined #openstack-infra15:46
jeblairi definitely got some gertty errors during the initial edit -- fungi you may have some in your log too.  we'll see if something happens 20 minutes later.15:47
jeblair"edit already in progress"15:47
fungifor some reason my .gertty.log only had a couple tracebacks from february15:47
jeblairhuh15:48
fungioh, wait, now it's got some15:48
fungiEdit already in progress on change 50529715:48
fungiyeah15:48
fungithis is the first thing in there:15:49
fungi2017-09-19 15:14:37,182 Offline due to: HTTPSConnectionPool(host='review.openstack.org', port=443): Read timed out. (read timeout=30)15:49
openstackgerritMerged openstack-infra/system-config master: Bump gerrit to 48g heap  https://review.openstack.org/50499315:49
fungiand then the "Edit already in progress" exceptions start up15:49
fungiis that from retries?15:50
fungialso getting some of these:15:50
fungi2017-09-19 15:33:28,840 Offline due to: ('Connection aborted.', BadStatusLine("''",))15:50
*** baoli has quit IRC15:50
clarkbready for me to uncomment the entry in root's crontab for puppet runs?15:52
*** yamamoto has joined #openstack-infra15:52
fungidid we want to wait for the system-config change for zuul to land?15:53
clarkbwe can15:53
fungi505297 is still pending15:53
clarkbthat will save us waiting for previous run to end15:53
fungithat's exactly what i was thinking15:53
clarkbkk waiting for that then15:53
openstackgerritDoug Hellmann proposed openstack-infra/project-config master: add whereto to gerritbot for openstack-doc  https://review.openstack.org/50531415:54
zaroHey great job on the upgrade!15:57
clarkbzaro: hey, if you have a moment can you look over https://etherpad.openstack.org/p/gerrit-2.13-issues and see if anything looks familiar?15:58
clarkbzaro: I think we have ideas for fixes on most of that15:58
clarkbzaro: curious if there is a suggested email thread pool size in particular and if you have any idea on the memory15:58
*** florianf has quit IRC15:59
zaroclarkb: i think i have a few changes for the cgit thing16:00
*** dtantsur|lunch is now known as dtantsur16:00
kmallocthe new gerrit seems way faster16:01
kmallocfyi16:01
kmallocjust in general use16:01
openstackgerritMerged openstack-infra/system-config master: Enable gerrit_strip_branch_ref in Zuul  https://review.openstack.org/50529716:02
*** dprince has quit IRC16:02
jeblairfungi: any chance you stop/started gertty?16:02
clarkbfungi: ^ you good for me to uncomment crontab entry now?16:02
clarkb1615 will be first puppet_run_all run16:03
*** yamamoto has quit IRC16:03
*** sbezverk has quit IRC16:03
*** egonzalez has quit IRC16:03
fungijeblair: i stop/started it shortly before reviewing those changes because it was hung syncing from overnight and so wasn't finding them16:03
fungiclarkb: yeah, go for it16:03
*** isaacb has joined #openstack-infra16:04
*** dprince has joined #openstack-infra16:04
clarkbfungi: #*/15 * * * * flock -n /var/run/puppet/puppet_run_all.lock bash /opt/system-config/production/run_all.sh >> /var/log/puppet_run_all_cron.log 2>&1 that is the line you commented out?16:05
clarkbthe log file looks wrong to me, shouldn't it be puppet_run_all.log?16:06
clarkblooking at system config I may be wrong, but want to double check I am restoring the correct entry16:07
*** hashar_ is now known as hashar16:08
*** Apoorva has joined #openstack-infra16:08
clarkbaha ansible log path is set to the non _cron path16:10
clarkbok I am uncommenting the line I pasted above16:10
openstackgerritIhar Hrachyshka proposed openstack-infra/project-config master: Revert "neutron: Make grenade-neutron-dvr-multinode job non-voting"  https://review.openstack.org/50531816:10
clarkb16:15 UTC will be next puppet run16:11
fungiclarkb: yep16:11
fungithat was the one16:11
clarkbthanks16:11
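The `flock -n` guard in the crontab entry quoted above keeps a new run from starting while the previous one still holds the lock. A minimal Python sketch of the same non-blocking pattern follows; the helper name `try_lock` is hypothetical and this is not how run_all.sh itself is implemented.

```python
import fcntl


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is busy.

    Mirrors `flock -n`: fail immediately instead of waiting for the lock.
    """
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call raise instead of blocking when the lock is held.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle
```

The lock is released when the returned file is closed, so a caller would keep the handle open for the duration of the run and skip the run entirely when `None` comes back.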
*** hamzy has quit IRC16:11
jeblairclarkb, fungi, mordred: i think mysql may be implicated in the email slowness16:13
jeblairbased on thread tracebacks16:13
jeblairgerrit is currently between emails, and it's sitting in a socket read under mysql16:13
*** dizquierdo has joined #openstack-infra16:14
jeblairi'll try to get the mysql query it's running16:14
clarkbok16:14
clarkbansible-puppet has started16:15
clarkbI am watching it for this first pass and assuming it acts as expected look at addressing problems in git backends next16:17
zaroclarkb: memory graph seems to indicate a few gc cycles then no more.  does that seem right?16:17
clarkbzaro: yes, we increased the heap memory to 48G from 30G after those gc cycles16:17
mordredjeblair: ok. off the phone and done with followups16:17
clarkbzaro: it was causing slowness and 500 errors16:17
clarkbzaro: since the memory bump it has been happy except for using almost all of the extra memory16:18
fungirackspace doesn't have any larger vm flavors than the one we're using, right?16:18
clarkbgit backends failed to puppet as expected so review.o.o was not touched (also it is in the emergency file list)16:18
mordredjeblair: which logs are implicated mysql?16:19
jeblairclarkb, mordred: hrm, that may have been a red herring.  i'm not seeing it in any really long queries; it may be that the email thread happens to do a lot of queries and i just keep catching it in one16:19
jeblairmordred: i'm looking at the thread dump16:19
mordredok. I'll look at mysql to see if I see anything just in case16:19
clarkbfungi: there are 90 and 120 GB ram flavors16:19
clarkbwe are using 60GB flavor currently16:19
fungiahh, okay16:20
fungifor some reason i thought 60 was the top16:20
fungiregardless that seems pretty huge16:20
*** caphrim007 has joined #openstack-infra16:20
clarkbfungi: I think zaro will likely report 60gb host instance is small for gerrit, I want to say people run it with 256GB baremetal in places16:21
clarkbthat said I generally agree, but EJAVA16:21
mordredfungi: what's the process for changing the my.cnf config for our db?16:21
jeblairi still think we're on the larger size16:21
clarkbmelody reports a massive drop in memory use fwiw16:21
fungimordred: puppet16:21
clarkband no major gc blip16:21
fungiafaik16:22
clarkbso maybe we can steady state we just needed a little more16:22
fungimordred: or you mean within the db instance?16:22
mordredfungi: I thought there was some cloud thing ... yah16:22
mordredthe 'profile' or something16:22
fungimordred: i've been doing it through the rackspace dashboard16:22
jeblairclarkb: http://help.collab.net/topic/teamforge178/reference/Gerrit-Performance-Tuning-Cheat-Sheet.pdf16:22
jeblairclarkb: the collabnet cheat sheet "large" size only goes up to 32G ram16:22
fungimordred: they have an option editor you use to create a custom configuration for the appropriate database type and server version, and then you apply it to a db instance (and restart the instance)16:23
mordredfungi: cool. I'm gonna look at a couple of things based on status variables16:24
clarkbpuppet ran through afs and is now in the else section of nodes16:24
clarkb(so zuul should be getting updated shortly)16:24
mordredfungi: I don't see a way to get to the option editor16:25
mordredah - nevermind. found it16:25
fungirackspace dashboard: databases -> mysql configurations16:25
*** askb has joined #openstack-infra16:26
fungithe one we're using for review-mysql instance right now is the "sanity" configuration for 5.116:26
mordredoh wow - we're still on 5.1 for gerrit :)16:26
clarkbdmsimard: is there an easy way (or any way at all I suppose) of doing a package listing and sorting by repo location? I want to make sure that we aren't using epel for anything important before turning it off on the backends (as that seems to be one of the suggested fixes)16:26
clarkbmordred: yes that is why upgrading is next on the list16:26
mordredwhen we do the xenial update for 2.14 we should perhaps consider migrating to 5.7 if we haven't already been considering that16:26
mordredcool16:26
clarkbmordred: we can't do 4 byte utf816:26
fungimordred: yep, leading me to wonder whether the planned db instance upgrade/replacement should get its timetable stepped up16:27
jeblairclarkb, mordred: this is really confusing.  according to exim, gerrit is sometimes waiting one or two minutes between sending out emails.16:27
jeblairit doesn't seem to be stuck in any individual long queries16:27
fungiif we really are seeing performance issues we can better control with newer mysql16:27
dmsimardclarkb: repoquery -i <package> ?16:27
jeblairbut i wonder if it's just doing *a lot* of them, or many kinda-sorta-slow ones, or having a lot of cache misses...16:27
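The one-to-two-minute pauses between outgoing emails can be measured by diffing consecutive send timestamps. The sketch below assumes the timestamps have already been extracted from the mail log as `YYYY-MM-DD HH:MM:SS` strings; the function names and the 60-second threshold are illustrative, not taken from any tool mentioned in the channel.

```python
from datetime import datetime


def send_gaps(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Return the gap in seconds between each pair of consecutive sends."""
    times = [datetime.strptime(stamp, fmt) for stamp in timestamps]
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])
    ]


def slow_gaps(timestamps, threshold=60.0):
    """Return only the gaps that are at least `threshold` seconds long."""
    return [gap for gap in send_gaps(timestamps) if gap >= threshold]
```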
*** bobh has joined #openstack-infra16:27
*** prometheanfire has left #openstack-infra16:28
*** vhosakot has joined #openstack-infra16:28
*** rwsu has quit IRC16:28
jeblairthe mysql calls are always underneath the account cache16:28
dmsimardclarkb: I have a script that already does almost all of what you're looking for16:29
jeblairwow show-caches is very slow to respond16:30
*** rwsu has joined #openstack-infra16:30
*** rwsu has quit IRC16:30
*** rwsu has joined #openstack-infra16:30
jeblair  accounts                      |  1024               |   2.9ms | 64%     |16:31
mordredjeblair: there are definitely a couple of mysql-level caches where we have significant misses, which is what I'm looking at right now16:31
jeblairwhat you say we bump that ^?16:31
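A row like the `accounts` line pasted above can be split into fields to spot caches with a low hit ratio. This is a rough sketch that assumes the four columns are cache name, in-memory entry count, average get time, and in-memory hit ratio; real `show-caches` output has more columns and header lines, which this does not handle.

```python
def parse_cache_row(row):
    """Parse one four-column cache row into a dict (assumed column order)."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    name, entries, avg_get, hit_ratio = cells
    return {
        "name": name,
        "entries": int(entries),
        "avg_get_ms": float(avg_get[:-2]),      # drop the trailing "ms"
        "hit_ratio_pct": int(hit_ratio[:-1]),   # drop the trailing "%"
    }


def low_hit_caches(rows, minimum_pct=90):
    """Return names of caches whose hit ratio is below `minimum_pct`."""
    parsed = [parse_cache_row(row) for row in rows]
    return [cache["name"] for cache in parsed if cache["hit_ratio_pct"] < minimum_pct]
```

The 90% cutoff is an arbitrary illustration; the channel only observes that 64% at a full 1024 entries looks worth bumping.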
*** mat128 has quit IRC16:32
*** bobh has quit IRC16:32
clarkbdmsimard: any chance it is shareable? I owe you beer or cookies if so16:32
dmsimardclarkb: yeah, let me adapt it from https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/nodepool/scripts/filter_packages.sh16:33
clarkbthanks16:33
clarkbin related news, thread on upstream gerrit ml says upstream gerrit has a case of the 500s today as well16:33
*** ralonsoh has quit IRC16:33
*** bh526r has quit IRC16:34
*** sdague has quit IRC16:35
*** dprince has quit IRC16:35
jeblairi will work on a change to bump the account cache(s)16:36
*** sdague has joined #openstack-infra16:36
*** jpich has quit IRC16:36
dmsimardclarkb: http://paste.openstack.org/show/621449/ should work16:36
dmsimardclarkb: example (unsorted) output http://paste.openstack.org/show/621450/16:37
*** mikal has quit IRC16:37
*** pcaruana has quit IRC16:38
clarkbjeblair: sounds good16:38
electrofelixjamielennox: have you had a chance to review https://review.openstack.org/47547416:39
clarkbdmsimard: thanks I will sanity check git backends for epel shortly16:39
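The question being answered here is which installed packages come from which repository. The sketch below is not dmsimard's script; it assumes input shaped like `yum list installed` output (package.arch, version, then `@repo`), ignores wrapped or header lines, and uses a hypothetical function name.

```python
from collections import defaultdict


def group_by_repo(listing):
    """Group package names by source repo from three-column listing text."""
    repos = defaultdict(list)
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything not in the assumed shape.
        if len(fields) != 3 or not fields[2].startswith("@"):
            continue
        name_arch, _version, repo = fields
        name = name_arch.rsplit(".", 1)[0]  # drop the architecture suffix
        repos[repo.lstrip("@")].append(name)
    return {repo: sorted(names) for repo, names in repos.items()}
```

Looking up the `epel` key in the result would then show at a glance what depends on that repo before anyone disables it.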
openstackgerritJames E. Blair proposed openstack-infra/puppet-gerrit master: Allow configuring account cache limits  https://review.openstack.org/50532816:39
openstackgerritMerged openstack/os-client-config master: Fix requires_floating_ip  https://review.openstack.org/50461616:41
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Bump gerrit account cache to 2048  https://review.openstack.org/50533016:43
jeblairclarkb, fungi, mordred: ^ "topic:account-cache" is ready16:43
clarkbmaybe you want a similar stack for increasing email send thread pool count?16:44
jeblairclarkb: i actually don't want to increase the threads until i see one thread behaving well16:44
clarkbok16:44
clarkbcache stack lgtm16:44
jeblairit should not take 1 minute for a computer to compose an email :)16:45
fungiand approved16:45
*** isaacb has quit IRC16:45
clarkbjeblair: ++16:45
jeblairer i spot a typo in one of those16:45
*** Qiming has quit IRC16:45
jeblairi will -2, fix, and i'm also going to add another cache entry16:46
fungii unapproved them anyway16:46
fungioh, yep16:47
clarkbok16:47
*** mikal has joined #openstack-infra16:47
fungiyou mixed cache_accounts_by{name,email} with cache_by{name,email}16:47
openstackgerritJames E. Blair proposed openstack-infra/puppet-gerrit master: Allow configuring account/group cache limits  https://review.openstack.org/50532816:48
jeblairfungi: yep.  fixed, and added another cache:16:48
jeblair  groups_byuuid                 |  1024               |   2.0ms | 36%     |16:48
fungithanks16:48
fungiyeah, that's probably a good addition16:48
jeblairupdating system-config change now16:48
fungiyou can probably remove your cr-2 now16:48
jeblairdone16:49
*** trown is now known as trown|lunch16:49
clarkbexim comes from epel on git0116:49
clarkb(still waiting on a complete list)16:49
*** Qiming has joined #openstack-infra16:49
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Bump gerrit account cache to 2048  https://review.openstack.org/50533016:49
*** jpena is now known as jpena|away16:49
jeblairokay, those are both ready again ^16:49
fungiclarkb: oh, right, because postfix is the default for centos i guess?16:50
*** mat128 has joined #openstack-infra16:51
clarkbfungi: must be?16:51
jeblairi've never managed to run a production system without epel16:51
mordredjeblair, fungi, clarkb: I have an updated db config for review.o.o I'd like to apply - I made one called "review-5.1" since it's a bit specific to review.o.o16:51
fungii think i've (hopefully) fixed our default language preference for ovh now, though i had to set our "location" to something other than france (or else it only allowed to select french)16:51
mordredapplying it requires restarting the db16:51
clarkbmordred: ok, let's try to coordinate that with a gerrit restart16:52
mordredyah16:52
clarkbmordred: since we have a few changes we want to apply to gerrit as well16:52
jeblairmordred: cool, i think maybe after my tuning changes merge ^ yeah that16:52
*** rcernin has quit IRC16:52
clarkbalso maybe get cgit in?16:52
anticw+116:52
*** tesseract has quit IRC16:52
fungichecking out the proposed db config now16:52
*** baoli has joined #openstack-infra16:53
clarkbbased on http://paste.openstack.org/show/621452/ I don't think we can turn off epel. exim and cgit are both in epel and important to the git backends16:53
jeblairclarkb: what's the problem with epel?16:53
mordredfungi: tl;dr is bump query cache size since we have a high number of query_cache_lowmem_prunes - but we have WAY WAY WAY more reads than writes - and also to bump up the innodb_buffer_pool_size since this instance has 4G but we only have 1.2G allocated to innodb_buffer_pool (which means we effectively only have a 2G instance)16:54
fungimordred: yep, for the sake of others not having to log into the rackspace dashboard, this seems to be your additions over our standard "sanity" config: http://paste.openstack.org/show/621453/16:54
mordredyes. that's correct16:55
mordredquery_cache_type=1 means "on" ... as opposed to 0 which is off and 2 which is "only use for queries that explicitly request query cache"16:55
clarkbjeblair: dmsimard believes that the reason we are failing puppet is epel and centos 7.4 python-paramiko packages conflict. So disabling epel in theory fixes that16:55
clarkbjeblair: give me a moment and I will get a paste up of the error from git01 puppet runs16:55
mordredand query_cache_wlock_invalidate means "invalidate the query cache when someone takes out an explicit write lock on a table"16:56
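The reasoning behind the tuning above (many low-memory prunes, and a buffer pool that is only a fraction of instance RAM) can be turned into a couple of quick ratios. This sketch takes a dict of MySQL status counters such as `SHOW GLOBAL STATUS` would supply; the function names are hypothetical, and what counts as a bad ratio is left to the operator.

```python
def query_cache_summary(status):
    """Summarize query cache effectiveness from MySQL status counters."""
    hits = status["Qcache_hits"]
    selects = status["Com_select"]       # SELECTs that were not served from cache
    inserts = status["Qcache_inserts"]
    prunes = status["Qcache_lowmem_prunes"]
    lookups = hits + selects
    return {
        "hit_ratio": hits / lookups if lookups else 0.0,
        # A high value suggests entries are evicted for lack of cache memory.
        "prunes_per_insert": prunes / inserts if inserts else 0.0,
    }


def buffer_pool_fraction(pool_bytes, ram_bytes):
    """Return the share of instance RAM given to the InnoDB buffer pool."""
    return pool_bytes / ram_bytes
```

With the figures mentioned in the channel, a 1.2G pool on a 4G instance works out to roughly 0.3 of available memory.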
*** yamahata has joined #openstack-infra16:56
mordredclarkb: what are we using python-paramiko for on cgit servers?16:56
clarkbhttp://paste.openstack.org/show/621454/16:56
clarkbmordred: it is a jeepyb dep16:56
*** dprince has joined #openstack-infra16:56
*** baoli has quit IRC16:56
mordredhow about we install it from pip like the rest of jeepyb and don't try to mix system and pip depends?16:56
dmsimardclarkb: exim ? for MTA purposes ? I guess there are other MTAs... :P16:57
jeblairdmsimard: you want to be our postmaster?16:57
mordreddmsimard: we're exim fans16:57
clarkbhaha16:57
fungidmsimard: exim is also the default mta on the distro for most of the servers we run16:57
clarkbfwiw I'm sure happy to use exim as long as I can go to jeblair and ask questions :)16:57
dmsimardI haven't done exim in a long time.. either postfix or qmail16:57
jeblairanyway... cgit is also important16:57
*** nmathew has joined #openstack-infra16:57
jeblairlet's see if we can tweak the repo priorities16:58
mordreddmsimard: we have in our midst a GIANT exim expert, so we take advantage of that16:58
dmsimardwe do ?16:58
dmsimardI have nothing against exim fwiw, was mostly joking16:58
mordreddmsimard: jeblair ran the email systems for UC Berkeley before working on OpenStack :)16:58
mordredjeblair: how about not installing paramiko from yum/dnf?16:59
dmsimardneat16:59
fungii suppose we could tweak puppet to install paramiko from pypi?16:59
clarkbhttp://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/p/python-paramiko-doc-1.16.1-2.el7.noarch.rpm that is epel package16:59
dmsimardwhat's paramiko being installed for in the first place ?16:59
fungiright, what mordred just said16:59
mordredthe pattern of some-things-from-pip-some-things-from-distro isn't a great pattern and we tend to discourage it16:59
*** kgiusti has left #openstack-infra16:59
fungiespecially on rh platforms where pip and rpm fight over file locations16:59
mordredand it is in the jeepyb requirements, so we can just remove it from puppet and let the jeepyb install take care of it I believe16:59
jeblairdo we install any jeepyb from pypi?17:00
mordredfrom git17:00
jeblairany dependencies?17:00
mordredwe do a pip install . in /opt/jeepyb17:00
clarkbtrying to find the centos 7 paramiko version to compare17:00
mordredyah - a pile of them17:00
mordredincluding paramiko17:00
dmsimardjeepyb uses paramiko ? or where is it from ?17:00
clarkbwe likely install paramiko from distro packages because in the past we needed to do C builds for paramiko17:00
*** prometheanfire has joined #openstack-infra17:00
clarkbbut now cryptography ships a wheel so that is no longer a problem17:00
mordredah - yah. that sounds like why that's there17:00
clarkbdmsimard: jeepyb speaks ssh to gerrit to do things and relies on paramiko for that17:01
dmsimardah, ok, makes sense17:01
mordredare we still running openstackwatch?17:01
*** gouthamr has quit IRC17:01
*** gouthamr has joined #openstack-infra17:01
clarkbhttp://mirror.centos.org/centos/7/extras/x86_64/Packages/python-paramiko-2.1.1-2.el7.noarch.rpm that is the non epel package17:01
clarkbso ya those two are conflicting. Is that not an epel/centos bug too?17:02
clarkbmordred: I don't think so, e-r replaced that didn't it ?17:02
*** thorre_se has joined #openstack-infra17:02
mordredoh - that's making the rss feed ...17:02
dmsimardEmilienM, mwhahaha: does puppet have a 'disablerepo' or 'enablerepo' equivalent for the package resource ?17:03
mwhahahadmsimard: wat17:03
dmsimardmwhahaha: like if we want to selectively enable epel for a particular package installation17:03
dmsimardyum --enablerepo epel install foo17:03
openstackgerritMerged openstack-infra/system-config master: Switch review.o.o to cgit links  https://review.openstack.org/50506717:04
mwhahahadmsimard: not likely17:04
jeblairmordred: are you writing a puppet-jeepyb patch?17:04
dmsimardmwhahaha: my googlefu made me lean towards that conclusion as well but thought I'd double check17:04
mordredjeblair: yes17:04
jeblaircool17:04
*** thorre has quit IRC17:04
*** thorre_se is now known as thorre17:04
*** baoli has joined #openstack-infra17:04
dmsimardclarkb: I *think* that technically the paramiko package from EPEL should be retired17:05
mwhahahadmsimard: yes it does17:05
mwhahahadmsimard: https://docs.puppet.com/puppet/latest/type.html#package-provider-yum17:05
mwhahahadmsimard: install_options17:05
jeblairfungi: can you +3 https://review.openstack.org/505300 ?17:05
dmsimardclarkb: because EPEL has a policy not to provide packages that are in base OS17:05
clarkbdmsimard: is that something you can file a bug against just so that we are good users?17:05
*** hamzy has joined #openstack-infra17:05
dmsimardclarkb: I'll check with someone more knowledgeable than I17:05
clarkbdmsimard: I linked both conflicting versions of packages above if you need concrete links17:05
*** ltomasbo has quit IRC17:06
fungijeblair: thanks, i missed that since it lacked a topic17:06
dmsimardclarkb: you had a paste of the puppet error log right ?17:07
*** Swami has quit IRC17:07
openstackgerritMerged openstack-infra/puppet-gerrit master: Allow configuring account/group cache limits  https://review.openstack.org/50532817:07
*** askb has quit IRC17:07
clarkbdmsimard: ya http://paste.openstack.org/show/621454/ there is more to it than that (more files) but extra content seemed mostly redundant17:07
jeblairdmsimard, clarkb: potentially relevant: https://bugzilla.redhat.com/show_bug.cgi?id=148161817:08
openstackbugzilla.redhat.com bug 1481618 in python-cryptography "RFE: python 3 package for python-cryptography in EPEL7" [Unspecified,New] - Assigned to jeremy17:08
clarkbjeblair: the zuul.conf update is in place on zuul.o.o17:08
clarkbjeblair: I think that means we can rebase tobiash's change back on master again and then reinstall zuul and restart whenever ready17:08
*** caphrim007 has quit IRC17:09
jeblairclarkb: yeah, let's do that with our 'everything' restart?17:09
clarkbwfm17:09
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Stop installing python depends from distro  https://review.openstack.org/50533517:09
*** szahers has joined #openstack-infra17:09
* clarkb is trying to keep the etherpad up to date as we go17:09
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Stop installing python depends from distro  https://review.openstack.org/50533517:09
mordredkk. I think that ^^ should do it17:09
*** szahers has quit IRC17:10
dmsimardclarkb: in the meantime, we can use --disablerepo epel in install_options for the package puppet resource where paramiko is installed17:10
mordredI added a subscribe on the removal so that we'll re-run pip install when we remove those packages17:10
dmsimardor what mordred is doing works too I guess17:10
*** jpena|away has quit IRC17:10
*** amoralej has quit IRC17:10
clarkbok I have approved mordred's change. I have got a tail of the puppet apply logs going so will try and watch for that17:12
*** xyang1 has joined #openstack-infra17:13
openstackgerritMerged openstack-infra/project-config master: Update post ref regex  https://review.openstack.org/50530017:13
jlvillalAre there any known issues with emails from Gerrit?17:13
*** ccamacho has left #openstack-infra17:13
*** dizquierdo is now known as alpgarcia17:13
jlvillalOn a patch of mine, I had some comments from people, and I uploaded a new patch. But I do not see any emails about it.17:14
jlvillalI am seeing other emails from Gerrit though.17:14
*** sambetts is now known as sambetts|afk17:14
clarkbjlvillal: yes, they are being processed very slowly17:15
clarkbjlvillal: more details can be found at https://etherpad.openstack.org/p/gerrit-2.13-issues work is in progress to attempt to address it17:15
jlvillalclarkb: Ah, okay. I'll be more patient then. Thanks!17:15
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Allow removing openstackwatch  https://review.openstack.org/50533917:20
mordredclarkb: ^^ we are still running openstackwatch - but the cloud account it uses doesn't actually work, so it's just a thing we run that errors once an hour17:20
fungiinteresting spike on review.o.o's eth1 graph every ~4 hours17:21
fungilooks like it probably predates the upgrade though17:21
*** askb has joined #openstack-infra17:22
fungihttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=34&rra_id=all17:23
clarkbmemory use is still looking decent17:23
*** Swami has joined #openstack-infra17:24
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Remove openstackwatch  https://review.openstack.org/50534217:26
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Remove openstackwatch reference  https://review.openstack.org/50534317:26
openstackgerritMerged openstack-infra/puppet-jeepyb master: Stop installing python depends from distro  https://review.openstack.org/50533517:27
*** udesale has joined #openstack-infra17:27
*** efoley has quit IRC17:28
clarkbI think 17:45UTC will be the next puppet run on git backends17:28
clarkbshould include ^17:28
*** bobh has joined #openstack-infra17:29
*** rbrndt has quit IRC17:29
mordredclarkb: sweet17:29
*** olaph has quit IRC17:30
*** ltomasbo has joined #openstack-infra17:30
*** ijw has joined #openstack-infra17:30
*** udesale has quit IRC17:31
jeblairclarkb: i need to afk for 30m; it sounds like you might be ready to do the global restart during that time.  i think aside from manually merging/installing the zuul patch (which you're obviously familiar with) we should be all set.17:32
*** olaph has joined #openstack-infra17:32
jeblairclarkb: derp we're missing a change: https://review.openstack.org/50533017:32
clarkbI'll recheck it17:32
clarkbI think that is the centos ssh keys issue that I have sort of attempted to debug in spare time17:32
jeblairclarkb: thx17:33
*** bobh has quit IRC17:33
fungiwould cleaning up http://git.openstack.org/cgit/openstack-infra/system-config/tree/playbooks/clouds_layouts.yml#n49 help?17:34
*** tosky has quit IRC17:34
fungii'll propose a change for that, missed cleanup17:35
clarkbfungi: maybe? I have pushed https://review.openstack.org/501887 to help debug17:35
clarkbok rough plan: git backends will update shortly after 1745, see how they do and if successful then we can kick.sh review.o.o with all the changes we want in17:35
*** rtjure has quit IRC17:38
fungioh, i guess i already did? https://review.openstack.org/48971117:38
clarkbI don't think pabelanger's concern there is really worth worrying about. Nodepool uses baked in keys to ssh not metadata provided keys17:39
fungipabelanger: can you elaborate on why that will cause nodepool to start failing to boot instances?17:39
clarkbso everything nodepool should work fine17:39
clarkbwe'll just have a period of time where maybe boot will fail because key is between being deleted and recreated17:39
clarkb(but nodepool will recycle those properly)17:39
*** alpgarcia is now known as dizquierdo17:39
fungii'm still unclear on what keys need recreating17:39
fungithis is only removing keys?17:39
clarkbfungi: the nova key item in the nova db17:40
clarkbfungi: you can't update infra-root-keys in nova. You have to delete it then recreate it with new content17:40
fungibut it will still contain the key nodepool's using right?17:41
clarkbnodepool's key is baked into the image by dib so that is orthogonal concern17:41
fungii'm really unclear on why removing keys nodepool doesn't use will cause nodepool to be unable to connect17:41
*** nmathew has quit IRC17:41
clarkbfungi: right its a non issue17:42
pabelangerfungi: clarkb: 489711 won't actually delete SSH keys in openstack clouds. In fact, I don't think ansible will run properly once it is merged17:42
*** rtjure has joined #openstack-infra17:42
pabelangerfungi: clarkb: so, we'll first need to delete infra-root-keys in all clouds, then run cloud launcher again17:42
clarkbthe only issue re nodepool is when the key is deleted and not yet recreated we could attempt to boot and cause problems because nodepool will set the infra-root-keys key name on the boot args17:42
pabelangerbut, when we delete infra-root-keys, we'll fail to launch new nodes, until recreated17:42
clarkbwe can also just delete it a few regions at a time and rerun the cloud launcher17:42
fungipabelanger: in that case, we should probably find a different model than to rely on automating configuration we can never automatically change :/17:42
pabelangerfungi: plan is to patch ansible ssh key task to allow us to force update, but I haven't pushed up that change yet17:43
fungigot it, so we're waiting on a missing ansible feature17:43
fungii'm afraid i don't understand enough about the mechanism being described there to know how to go about making the required alterations manually17:44
*** shardy has quit IRC17:44
fungiuse openstackclient to replace the nova ssh keys object with a new bundled version matching the desired change?17:45
clarkbmordred: https://review.openstack.org/#/c/505339/ has failures17:45
*** martinkopec has joined #openstack-infra17:45
pabelangerfungi: we can write a new playbook to rotate keys, it is just a 2 step process. I can take a stab at it once off the phone with airline17:46
fungiwell, this isn't key rotation, so i'm still confused17:47
fungiwe're just revoking access for inactive users17:47
pabelangerfungi: openstack doesn't have an API to replace an existing ssh key. You first have to delete it, then recreate it using same name17:47
fungiwe're not replacing an ssh key, we're removing several ssh keys and leaving the others untouched (not replaced)17:47
*** electrofelix has quit IRC17:48
clarkbfungi: right but its a single key in the nova api/db17:48
pabelangerright, but they are blobed into a single key entry17:48
clarkband there is no update method17:48
fungior is what openstack calls an "ssh key" actually a set of keys?17:48
clarkbyes that17:48
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Allow removing openstackwatch  https://review.openstack.org/50533917:48
fungiokay, so terminology mismatch17:48
mordredclarkb: that should do it17:48
mordredclarkb, fungi, pabelanger: reading keys scrollback17:49
*** harlowja has joined #openstack-infra17:49
fungimordred: i think we came to terms with it... the comment pabelanger left on the change used the term "key" in two different ways to mean two different things (because openstack uses the term to mean something different than an actual human would)17:50
fungishould be more properly referred to as a "keyset" or something17:50
mordredyes - I agree with that17:52
mordredfungi: yah - it's actually a "blob of contents to put into an authorized_keys file"17:53
mordredfungi: I'm honestly not sure if it supporting more than one key is on purpose - I think it's an accident of how both it and glean and cloud-init all happen to operate17:53
fungiso more clearly stated, the concern raised over 489711 is that altering the infra-root-keys variable in the clouds_layouts playbook will cause nodepool to be unable to find the bundled keyset previously created on our cloud providers?17:54
mordredin any case, I agree with pabelanger - ansible-cloud-launcher can be updated to do a two-step delete/create - or we can add a force-replace flag to os_keypair itself and make a-c-l use that17:54
mordredfungi: well - two concerns - one is that that change will not actually get applied correctly17:55
fungii don't get why, but it's probably me failing to understand rest api design17:55
mordredfungi: the second is that there will be a point in time during whatever period exists when we DO actually apply that change where nodepool will have some boot errors, but I think that's ok17:55
clarkbmordred: I agree those boot errors should only happen for a short period and self correct17:56
fungiyou can't replace the value, but you could delete and recreate it? (why is replacing not the same thing as deleting and recreating?)17:56
mordredfungi: to replace the key we'll need to have something do "delete key ; add key" - so for the second issue there's just a moment in time where the key won't be there and boot --keypair=infra-root-keys will fail because infra-root-keys won't be there17:56
mordredfungi: yes. you can totally delete and recreate17:56
mordredfungi: just none of our current things actually do that17:56
mordredfungi: it's not a super huge thing to fix - it's just a bug/deficiency in our current automation17:57
fungioh, i see, so from an api design perspective, replacing a value would in theory offer some guaranteed continuity where there could never be a point in time where the value was empty17:57
mordredyah17:57
mordredbut that's unpossible in nova api - so we're left with delete/create on our side17:58
*** ociuhandu has quit IRC17:58
fungiwhereas deletion and creation back to back would be treated as independent operations with no guarantee of atomicity17:58
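The delete-then-recreate approach discussed above can be sketched roughly as follows. This is a minimal illustration only: `FakeKeypairStore` is a hypothetical in-memory stand-in for a keypair API that has create and delete but no update call, and `ensure_keypair` and its `on_gap` hook are invented names, not the actual cloud launcher or os_keypair code.

```python
class FakeKeypairStore:
    """Hypothetical in-memory keypair store with no update operation."""

    def __init__(self):
        self._keys = {}

    def get(self, name):
        # Returns the stored public key blob, or None if absent.
        return self._keys.get(name)

    def create(self, name, public_key):
        if name in self._keys:
            raise ValueError(f"keypair {name!r} already exists")
        self._keys[name] = public_key

    def delete(self, name):
        self._keys.pop(name, None)


def ensure_keypair(store, name, desired, on_gap=None):
    """Make the named keypair hold `desired`, recreating it if it differs."""
    current = store.get(name)
    if current == desired:
        return "unchanged"
    if current is not None:
        store.delete(name)
        # Between delete and create the name does not exist, so anything
        # that boots with this key name in that window would fail.
        if on_gap is not None:
            on_gap()
    store.create(name, desired)
    return "created" if current is None else "recreated"
```

The `on_gap` hook exists only to make the non-atomic window observable; a real playbook would simply run the two steps back to back and accept a brief period of boot failures.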
*** ociuhandu has joined #openstack-infra17:58
*** baoli has quit IRC17:58
*** baoli has joined #openstack-infra17:59
clarkbhttps://review.openstack.org/#/c/505330/2 is in a weird state18:00
*** tosky has joined #openstack-infra18:02
AJaegerclarkb: needs rebasing I guess18:02
clarkbmaybe?18:02
clarkbmordred: http://paste.openstack.org/show/621463/ I think you need to remove pyyaml from the removes list18:03
AJaegersee the patch is on top of - changeset 1 instead of 4 - and the orange icon on parent18:03
clarkbAJaeger: aha thanks18:03
AJaegerclarkb: not obvious ;(18:03
mordredfungi, pabelanger, clarkb: https://github.com/ansible/ansible/pull/3056518:04
clarkbAJaeger: I will rebase and reapprove, thanks18:04
mordredclarkb: piddle. one sec18:04
openstackgerritClark Boylan proposed openstack-infra/system-config master: Bump gerrit account cache to 2048  https://review.openstack.org/50533018:04
fungii don't even see the orange parent icon18:05
clarkbit was there after AJaeger mentioned it18:05
clarkbI have since rebased18:05
mordredclarkb: I mean, to be fair, we don't need cloud-init on that server - but yah18:05
fungii have it pulled up in the webui from before he mentioned it18:05
fungioh, in the related changes box?18:06
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init  https://review.openstack.org/50535218:06
clarkbit was next to the Parent(s) line (that is where I noticed it)18:06
fungihuh, it never displayed for me18:06
clarkbbut ya hitting related changes also made it apparent with the orange patchset number18:07
fungiat least now if you're looking at an older patchset, the "patch sets (2/3)" shows up construction orange so harder to miss18:07
clarkbya18:07
openstackgerritIhar Hrachyshka proposed openstack-infra/devstack-gate master: Switch from lib/neutron-legacy to lib/neutron  https://review.openstack.org/43679818:08
clarkbmemory use still looks good18:09
clarkbmordred: see comment on 535218:10
openstackgerritMonty Taylor proposed openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init  https://review.openstack.org/50535218:11
mordredclarkb: wow, sorry. my first version of that patch sucked18:11
pabelangermordred: wow, thanks. Would have taken me much longer to get that together18:12
clarkbcan we get a second review on 505352? should hopefully fix git backend puppeting for real18:13
mordredpabelanger: that won't show up in 2.5 - so we still likely want to do a dumber version in a-c-l18:14
jeblairdone18:14
*** pvaneck has joined #openstack-infra18:15
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: DNM: Test Ansible 2.4  https://review.openstack.org/50535418:15
pabelangermordred: ack18:16
pabelangerokay, flights now booked for Sydney.  I can get started on cloud-launcher change18:16
openstackgerritRodrigo Duarte proposed openstack-infra/project-config master: Update LDAP domain driver CI job to run tempest full  https://review.openstack.org/49222318:18
*** trown|lunch is now known as trown18:22
openstackgerritMerged openstack-infra/system-config master: Bump gerrit account cache to 2048  https://review.openstack.org/50533018:23
clarkbok ^ is in now. So once 5352 merges and git backends are puppeting we should be ready to work on the big restart18:24
*** rhallisey has quit IRC18:24
*** rhallisey has joined #openstack-infra18:24
*** rtjure has quit IRC18:24
*** askb has quit IRC18:25
clarkbI'm going to disable puppet cron now since 5352 just failed18:27
clarkbthis way we don't have to wait for everything to go around in a circle again18:27
clarkbcan just enable it once we are ready18:27
*** bobh has joined #openstack-infra18:27
openstackgerritRodrigo Duarte proposed openstack-infra/project-config master: Update LDAP domain driver CI job to run tempest full  https://review.openstack.org/49222318:28
clarkboh we have a meeting today18:29
clarkbEDISTRACTED18:29
fungiin 30 minutes18:29
mordredclarkb: cool. I'm ready whenever18:30
clarkbya I'm gonna prep for meeting, I have rechecked 535218:32
jeblairwhy is that job so bad?18:33
*** felipemonteiro__ has quit IRC18:34
*** ociuhandu has quit IRC18:34
*** rbrndt has joined #openstack-infra18:34
*** felipemonteiro__ has joined #openstack-infra18:35
openstackgerritMerged openstack-infra/infra-specs master: Gerrit ContactStore Removal is implemented  https://review.openstack.org/49228718:37
clarkbjeblair: because we manage ssh keys with puppet and nova. There seems to be some conflict between them on centos in particular with glean? https://review.openstack.org/501887 is an attempt at getting more data around what is happening18:37
*** dtantsur is now known as dtantsur|afk18:40
openstackgerritTimo Tijhof proposed openstack-infra/zuul master: Status: Remove use of deprecated jQuery jqXHR `complete` method  https://review.openstack.org/50536618:42
*** felipemonteiro has joined #openstack-infra18:44
openstackgerritTimo Tijhof proposed openstack-infra/zuul master: Status: Don't toggle panel when clicking patch link  https://review.openstack.org/50536818:46
*** felipemonteiro__ has quit IRC18:46
*** ijw has quit IRC18:46
mordredclarkb: I feel like yesterday when we rolled out the new gerrit the CSS worked very well for me ...18:47
mordredclarkb: then today it's back to being horizontal scrolling18:47
mordredclarkb: did a thing change? or did I just suck looking at things yesterday?18:47
clarkbI think it worked for me in chrome this morning but not firefox18:47
clarkbcss itself hasn't changed as far as I know18:47
mordredweird18:48
jeblairquestion about the status alert -- it says "Post jobs are not executed currently, do not tag any releases".  the tag/release pipelines are working, as far as we know, yeah? just not post? so really it should be "do not land any changes"?18:48
clarkbthat may have just been confusion over the effects?18:49
openstackgerritMerged openstack-infra/puppet-jeepyb master: Don't remove PyYAML - because cloud-init  https://review.openstack.org/50535218:50
clarkbre ^ I am reenabling puppet cron now18:50
AJaegerjeblair, clarkb : I suggested that - for tags: Most in the post queue is reproducible, so next run will do the work again. Just tags are unique18:50
mordredclarkb: cool18:50
AJaegerjeblair: feel free to change message ;)18:50
*** caphrim007 has joined #openstack-infra18:50
AJaegerjeblair: what did I miss?18:51
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Status: Remove use of deprecated jQuery jqXHR `complete` method  https://review.openstack.org/50536918:53
openstackgerritDavid Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Status: Don't toggle panel when clicking patch link  https://review.openstack.org/50537018:53
*** kgiusti has joined #openstack-infra18:54
mordredclarkb, pabelanger, fungi: remote:   https://review.openstack.org/505371 Recreate keypairs when content is different18:54
clarkbInfra meeting in 5 minutes. We will recap PTG (including Zuul things) and go over gerrit upgrade time permitting18:55
mordredthat ^^ is wonky but I think should fix a-c-l between now and when 2.5 releases18:55
clarkbmordred: thanks18:55
clarkbI'll have to take a look after the meeting18:55
openstackgerritWalter Scheper proposed openstack-dev/pbr master: Use topo ordering to correctly sort changelog entries  https://review.openstack.org/50537218:56
fungimordred: clarkb: on the css front, it lgtm while anonymous, but once i log in it gets too wide due to additional options in the menuish area18:56
*** dizquierdo has quit IRC18:57
fungi(it adds "my," "people" and "plugins" to the top row)18:58
clarkbah18:59
clarkbcss edits require service restarts to pick up but its also less urgent I think we can do that later in the week once we've got the actual functionality problems in a better place18:59
fungilikely our logo is wider than the upstream default18:59
clarkbok meeting time now in #openstack-meeting18:59
mordredalso - the search text box is HUGE18:59
mordredfungi: but I've got other things making horizontal scroll than just the top bar19:00
fungiin my case (1440px wide browser) just my name in the top bar ends up hanging off the right-hand side19:01
mordredfungi: https://imgur.com/a/smPzr fwiw19:01
fungithough if i switch from "all" to "my" it gets far worse19:01
mordredfungi: the patch list in related goes off too19:01
fungiahh, yeah i was looking at the open changes list19:02
*** baoli has quit IRC19:02
mordredfungi: yah  on the changes list only the name goes off to the side19:03
*** baoli has joined #openstack-infra19:03
openstackgerritMerged openstack-infra/zuul master: Status: Remove use of deprecated jQuery jqXHR `complete` method  https://review.openstack.org/50536619:04
openstackgerritMerged openstack-infra/zuul master: Status: Don't toggle panel when clicking patch link  https://review.openstack.org/50536819:04
*** felipemonteiro has quit IRC19:05
*** felipemonteiro has joined #openstack-infra19:05
*** felipemonteiro__ has joined #openstack-infra19:06
clarkbmordred: git backends puppeted successfully which means after meeting we do kick.sh review.o.o and do the grand restart of things19:07
mordredclarkb: woot19:08
clarkbwe should probably do a bit of planning before we do that as there are several moving parts, but after meeting19:08
*** felipemonteiro has quit IRC19:09
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Use publish-docs-draft base job for docs-draft publishers  https://review.openstack.org/50462419:12
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Removed unused 'status: ' string from log line  https://review.openstack.org/50537819:12
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Emit shell instead of script tasks  https://review.openstack.org/50537919:12
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Omit some jobs from shared queue calculation  https://review.openstack.org/50538019:12
*** slaweq has quit IRC19:14
*** slaweq has joined #openstack-infra19:14
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Status: Remove use of deprecated jQuery jqXHR `complete` method  https://review.openstack.org/50536919:14
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Status: Don't toggle panel when clicking patch link  https://review.openstack.org/50537019:14
*** Sukhdev has joined #openstack-infra19:17
*** felipemonteiro has joined #openstack-infra19:18
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Add publish-service-types-authority job and mapping  https://review.openstack.org/50460919:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Add xstatic-check-version and openstack-tox-pypy  https://review.openstack.org/50461019:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Remove liberty/mitaka job regexes  https://review.openstack.org/50496419:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Remove unmatched single quotes from jenkins jobs  https://review.openstack.org/50496519:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Add mapping file setting to skip jobs from share queues  https://review.openstack.org/50496619:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Have publiccloud-wg gate on itself, not api-wg  https://review.openstack.org/50496719:19
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Make yaml2ical publication job  https://review.openstack.org/50496819:19
*** felipemonteiro__ has quit IRC19:19
*** felipemonteiro__ has joined #openstack-infra19:20
*** wolverineav has quit IRC19:22
*** felipemonteiro has quit IRC19:24
*** felipemonteiro has joined #openstack-infra19:24
SukhdevDear Infra folks, can you please get this taken care of so that I can kick start the work on this project - https://review.openstack.org/#/c/503829/19:25
*** felipemonteiro__ has quit IRC19:26
*** jrist has joined #openstack-infra19:26
AJaegerSukhdev: Sorry, we cannot currently. We're in the middle of fixing problems after a gerrit update and cannot create new repos currently.19:27
fungialso, in the middle of the infra weekly meeting right this moment19:29
dmsimardclarkb: is gerrit still on fire ? I pushed a tag to gerrit and while the tag is on git.openstack.org, it doesn't seem like it has replicated down to github yte.19:30
dmsimards/yte/yet/19:30
clarkbdmsimard: yes, we haven't fixed zuul yet for that (though that may be new and exciting behavior)19:30
dmsimardclarkb: nevermind, false alarm19:31
*** tinwood has quit IRC19:31
dmsimardclarkb: some weird sorting by github, the release tag was put after the rc tag19:31
clarkbok19:31
SukhdevAJaeger : any ETA?19:32
*** tinwood has joined #openstack-infra19:32
*** slaweq_ has joined #openstack-infra19:33
*** martinkopec has quit IRC19:33
*** jtomasek has quit IRC19:35
clarkbSukhdev: hopefully we will have the majority of problems sorted out by the end of today. We are working as quickly as possible to clean things up around gerrit19:35
*** slaweq has quit IRC19:37
Sukhdevclarkb : Thanks. will bug you guys tomorrow in that case - best of luck in sorting things out :-)19:38
openstackgerritSam Yaple proposed openstack-infra/shade master: Allow domain_id for roles  https://review.openstack.org/49699219:38
openstackgerritSam Yaple proposed openstack-infra/shade master: Move role normalization to normalize.py  https://review.openstack.org/50017019:38
*** rhallisey has quit IRC19:38
*** rhallisey has joined #openstack-infra19:38
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Deal with link-logs macro  https://review.openstack.org/50538719:38
*** felipemonteiro__ has joined #openstack-infra19:40
openstackgerritMonty Taylor proposed openstack/os-client-config master: Treat clouds.yaml with one cloud like envvars  https://review.openstack.org/50538819:41
*** felipemonteiro has quit IRC19:43
jeblairdmsimard: see mail from dhellmann on openstack-dev19:43
jeblair2017-09-19 19:10:07.809429 | /tmp/03-24d8b0b01ba04e599848b6cb92742892.sh: line 21: /usr/zuul-env/bin/zuul-cloner: No such file or directory19:44
jeblairhttp://logs.openstack.org/89/89289c511c8c48c8409dd1fb997fc24e64d1dae6/release/ara-tarball-signing/f812ce9/console.html#_2017-09-19_19_10_07_80942919:44
jeblairi don't immediately understand what's happening there19:44
*** mat128 has quit IRC19:46
jeblair/usr/zuul-env/bin/zuul-cloner exists on signing0119:47
mordredjeblair: I agree - perhaps it's one of the files being passed as argument to zuul-cloner?19:47
jeblairModify: 2017-09-19 19:10:08.153368290 +000019:48
jeblairthough, it may not have existed at the time that job ran?19:48
mordredjeblair: also, ls: cannot access /opt/git: No such file or directory19:48
jeblairoh wow19:48
mordredalthough I think zuul-cloner should deal with that19:48
jeblairdid that job run right in the middle of a puppet run that reinstalled zuul?19:48
mordredjeblair: golly - it's entirely possible19:48
jeblairnormally, that's baked into our images, but i think on the static nodes, we have puppet manage it?19:49
jeblairi'm glad that's changing in v3 :)19:50
* clarkb needs to grab something to drink but then am back for coordinating restarts of things19:59
mordredjeblair: me too19:59
clarkbmaybe we can collect a list of things to do specific to restart on the etherpad and then get volunteers for the items?20:00
clarkbalso make sure we have everything in place that we want to get in place20:00
fungiwfm20:01
*** rhallisey has quit IRC20:02
*** gouthamr has quit IRC20:03
clarkbBottom of https://etherpad.openstack.org/p/gerrit-2.13-issues has rough plan20:06
clarkbmordred: do you want to be in charge of modifying the db instance for gerrit?20:06
clarkbjeblair: you want to do zuul installation update and restart?20:07
*** jesusaur has quit IRC20:08
clarkbthe system-config changes we want appear to be on disk20:08
*** jesusaur has joined #openstack-infra20:11
*** baoli has quit IRC20:12
jeblairclarkb: sure20:13
*** baoli has joined #openstack-infra20:13
*** rcernin has joined #openstack-infra20:14
jeblairclarkb: i will update the branch you made with a rebase on current master20:14
clarkbjeblair: sounds good20:14
clarkbfungi: maybe you want to do the gerrit stop and starts and the puppet run? (I'd like to be able to watch the syslog and see what it says is changing)20:15
fungisure, i can tail that in a screen session on review.o.o20:16
jeblairlemme re-order the etherpad20:16
clarkbjeblair: go for it20:16
jeblairokay, i think that order looks right :)20:16
jeblairclarkb: i will await your 'go' signal to shut down zuul20:17
clarkbya I think we are waiting on mordred to be ready on the db items20:17
clarkbmordred: fungi ^ can you indicate when you are ready and if the steps on the etherpad look correct to you?20:17
fungii've switched computers so taking a second to pull up another copy of the pad20:18
*** jkilpatr has quit IRC20:18
*** baoli has quit IRC20:18
clarkbI added the two backup commands to the etherpad too20:19
*** baoli has joined #openstack-infra20:19
pabelangerable to assist if needed20:19
fungifound kick.sh20:19
clarkbfungi: system-config/tools/kick.sh and should take the fqdn of the host as the single argument iirc20:20
fungiyeah, i just confirmed by reading the script20:20
fungicool, getting a screen session going on review.o.o first tailing syslog and filtering for puppet entries20:21
ianwheh, that's better than ctrl-r search for last time i did it :)20:21
fungithere are two screen sessions already running as root on review.o.o20:21
clarkbthe one we used yesterday wasn't turned off20:22
mordredclarkb: sorry - got sucked in to call - ready  to go20:22
clarkbok I think that means everyone is ready to go? the backups will take a few minutes20:22
* clarkb composes a #status notice before we start20:22
*** e0ne has joined #openstack-infra20:22
ianwfungi: i had one too i think, removed20:23
fungioh well, my screen session there is the newest date anyway20:23
clarkbhow does that notice message look? I think once I send that we can go ahead and start20:23
fungicurrently running jobs will be restarted?20:24
clarkbjeblair: ^20:24
jeblairclarkb: lgtm20:25
clarkbok lets do this20:25
clarkb#status notice Zuul and Gerrit are being restarted to address issues discovered with the Gerrit 2.13 upgrade. review.openstack.org will be inaccessible for a few minutes while we make these changes. Currently running jobs will be restarted for you once Zuul and Gerrit are running again.20:25
openstackstatusclarkb: sending notice20:25
jeblairi will save queues and stop zuul now20:25
jeblairzuul is stopped20:26
fungistopping gerrit now20:26
-openstackstatus- NOTICE: Zuul and Gerrit are being restarted to address issues discovered with the Gerrit 2.13 upgrade. review.openstack.org will be inaccessible for a few minutes while we make these changes. Currently running jobs will be restarted for you once Zuul and Gerrit are running again.20:26
fungigerrit is stopped20:27
fungimordred's cue20:27
clarkbmordred: you are up for doing backups20:27
*** e0ne has quit IRC20:28
clarkbdid we lose mordred?20:29
fungii have `/opt/.../kick.sh review.openstack.org` queued up in a root screen session on puppetmaster.o.o in case others want to attach20:29
mordredclarkb: am here20:29
mordredI can do backups - one sec20:29
clarkbmordred: details are at https://etherpad.openstack.org/p/gerrit-2.13-issues assigned to you since you are doing the db related stuff20:29
clarkboh the filename probably says 2.11 still20:30
clarkbfixed20:30
mordredclarkb: backing up now20:32
*** wolverineav has joined #openstack-infra20:34
clarkbdb backup is about 1/3 done based on file size20:36
mordred(still dumping - yah)20:36
*** ijw has joined #openstack-infra20:37
jlk*snicker*20:37
clarkb2/3 now20:37
clarkbI've realized we will likely need to manually run manage projects after all this is done because gerrit shouldn't be running when it tries to create that new project...20:39
dmsimardI like how everyone keeps referring to "tobias's change" and somehow I'm probably the only one not aware of what it is20:39
anticwgotta blame someone20:40
clarkbmordred: looks done?20:40
clarkbdmsimard: its the case sensitive change in zuul that tobiash wrote20:40
clarkbdmsimard: we haven't merged it yet because it is backward incompatible but plan to merge it next week20:40
dmsimardoh, okay20:40
fungiare we reasonably sure puppet isn't going to abort partway through if it can't apply the manage-projects step?20:41
mordredclarkb: yes done20:41
clarkbfungi: ya I think that can fail on its own. I can reread it though20:41
mordredclarkb: I will now apply the config and restart mysql20:41
*** Sukhdev has quit IRC20:41
fungii suppose i can kick puppet on review.o.o in parallel with the trove instance restart?20:41
*** Goneri has quit IRC20:42
clarkbya manage projects is its own block in gerrit.pp20:42
mordreddb is restarting the first time - i will need to restart a second time20:42
clarkbfungi: lets maybe do one at a time just for simplicity20:42
fungik20:42
*** jkilpatr has joined #openstack-infra20:42
*** srobert_ has joined #openstack-infra20:43
mordredclarkb, fungi: db has been restarted with new config settings20:43
*** jrist has quit IRC20:43
clarkbmordred: both required times?20:43
mordredclarkb: yes. db is ready to go20:44
fungikicking the puppet now20:44
clarkbok20:44
*** srobert has quit IRC20:45
fungii think it's nearly done20:46
fungipuppet apply logging to syslog lgtm20:46
*** xyang1 has quit IRC20:46
clarkbgrr file mode meant that gerrit init ran but it appears to not have done anything other than download and install some libs20:46
fungithough scrolled by rapidly20:46
clarkb(I think that is all ok more annoying than anything else)20:46
clarkboh nope20:47
clarkbits reindexing20:47
clarkbdamnit gerrit20:47
*** srobert_ has quit IRC20:47
fungiso mode change caused the exec to be notified?20:47
mordredclarkb: wait- what?20:47
clarkbfungi: ya20:47
mordredgah20:47
fungishould i kill the reindex process?20:47
clarkbI think we can kill the reindex process20:48
fungi(out of band)20:48
clarkbthen restore the index that mordred backed up20:48
fungiwill do20:48
clarkbya20:48
mordredkk20:48
mordredindex.backup.150585314120:48
mordred/home/gerrit2/index.backup.1505853141 that is20:48
fungikilled20:49
fungipuppet is continuing from there (may abort?)20:49
clarkband puppet failed20:49
clarkbso I think we restore the index backup (lets copy not move it so that we have it if we need it again)20:49
clarkbthen rerun kick.sh20:49
fungido we need to re-kick or just assume all is well with the restore?20:49
clarkbno lets rerun kick.sh20:49
fungik20:49
fungimordred: restoring?20:50
mordredon it20:50
fungii'm all queued up for another run as soon as we're clear20:50
clarkb(thats my bad, I re double checked file hashes to make sure puppet wouldn't do this but failed to remember mode)20:50
mordredroot@review:/home/gerrit2# mv review_site/index/ index.borked.$(date +%s)20:51
mordredroot@review:/home/gerrit2# cp -ax index.backup.1505853141 review_site/index20:51
mordredis what I did20:51
mordredwe can probably just rm that borked one20:51
mordredit's done copying20:51
fungiokay, ready for me to re-kick?20:52
mordred(yay for blocks still in filesystem cache from previous move I'm guessing)20:52
mordredgood on my end20:52
clarkbya I think that we are ready to rerun kick.sh20:52
fungifired up20:52
clarkblog output looks better20:53
*** jcoufal has quit IRC20:53
fungifinished, yeah20:54
clarkband no gerrit running as expected20:54
clarkbjeblair: are you ready on the zuul side? I think we are ready to start gerrit20:54
jeblairclarkb: go for it20:55
fungipatch applied?20:55
clarkbfungi: I think you can start gerrit manually now20:55
fungiokay, starting gerrit20:55
clarkbgerrit log lgtm20:56
fungiinitscript ran20:56
jeblairmaybe restart apache?20:56
clarkbya I think apache needs to be convinced as well20:56
fungidoing20:56
fungidone20:57
clarkbits there and reports the correct version20:57
jeblairzuul is starting (receiving worker registrations)20:58
clarkbcgit may not be quite working, links are there but at least one I tested isn't working20:58
clarkbwe should be able to iterate on cgit more easily now that puppet should be happyness20:58
openstackgerritMonty Taylor proposed openstack-infra/zuul-sphinx master: Update exception message to include directories  https://review.openstack.org/50540020:58
clarkbhrm did the cache settings not apply?20:59
* clarkb looks for them21:00
anticwcgit links aren't right :(21:00
fungianticw: yeah, clarkb already spotted that21:00
ianwclarkb: hmm, it's got an extra ".git" in there21:00
ianwbut works without that21:00
mordredclarkb: where should I be looking for cgit links?21:00
fungiianw: okay, so probably easy patch21:00
clarkbmordred: on the left hand side next to commit and parent rows21:00
*** esberglu has quit IRC21:00
mordredaha! there we go21:01
ianwmordred: ctrl-f (cgit)21:01
*** esberglu has joined #openstack-infra21:01
mordredI was looking at a change that did not have the links - and refreshing did not add them :(21:01
mordredhttps://review.openstack.org/#/c/502351/21:02
mordredstill no cgit there for me21:02
clarkboh we have to pass that cache accounts stuff into ::gerrit21:02
clarkbjeblair: ^ should I go ahead and write that change?21:02
mordredbut https://review.openstack.org/#/c/500170/ does21:02
clarkbthey are there for me on 502351, I'm guessing browser shenanigans21:03
jeblairclarkb: yes please since i don't know what you're talking about; that will be the easiest way for you to explain it to me :)21:03
clarkbjeblair: ok will be up momentarily21:03
mordredweird21:03
mordredshift-reload also doesn't add them - but it's not important if it's just browser-side weirdness for me21:03
jeblairi'm trying to figure out why no zuul jobs are running21:03
fungidid the cache options not end up in openstack_project::gerrit in addition to openstack_project::review?21:04
*** esberglu has quit IRC21:05
fungiaha, missing from openstack_project::review21:05
fungican't believe i didn't catch that21:05
openstackgerritClark Boylan proposed openstack-infra/system-config master: Pass gerrit cache options through to puppet-gerrit  https://review.openstack.org/50540221:05
fungino, wait, we don't need it there21:06
clarkbfungi: its actually ^21:06
clarkbya we don't need it in review.pp21:06
fungiright, now i get it21:06
pabelangernodepool building nodes in rax now21:07
fungii guess we could have made them not class params to openstack_project::gerrit21:07
fungiand just set them on the gerrit module instantiation within it21:07
pabelangerthere we go, nodepool looks to be working21:08
clarkbjeblair: let us know if you need any help21:08
fungimy gertty gets really unhappy after gerrit outages21:08
jeblairpabelanger: what was happening?21:08
clarkblooks like maybe just waiting on nodes per pabelanger's comments21:08
jeblairoh i guess it wasn't even building ready nodes due to geard outage21:08
*** e0ne has joined #openstack-infra21:08
jeblairre-enqueuing21:09
pabelangerjeblair: nodepool looked to be not launching any new nodes because allocation requests were 021:09
pabelangerbut once we started deleting old jobs, it started requesting new allocations21:09
clarkbif we can't fix the .git problem on the gerrit side we can have cgit apache rewrite to exclude the .git21:09
fungihrm, or maybe it's gerrit that's unhappy... i still can't retrieve 50540221:09
jeblairfungi: it's because gertty fetches from git.o.o to be nice, and gerrit is replicating21:10
clarkbfungi: its there via web ui, are you trying to pull from git.oo?21:10
clarkbya that21:10
fungijeblair: aha, thanks. forgot about that21:10
*** trown is now known as trown|outtypewww21:10
* fungi will go to a computer capable of using gerrit's webui21:10
bkerofungi: have you tried opening gerrit in dillo?21:10
ianwclarkb: yeah, i don't see any settings, on either the gerrit side or the cgit side21:11
ianwcgit can ignore the .git if its in the repo, but not ignore it in an incoming url aiui21:11
ianwi can look at a url rewriter if you like21:11
clarkbianw: thats probably the easiest thing to do for now and generally user friendly to rewrite that way I think21:12
ianwclarkb: ok, will look into after breakfast :)21:12
*** esberglu has joined #openstack-infra21:12
mordredclarkb, ianw: the cgit type in gerrit hard-codes the .git21:13
mordredclarkb, ianw: but there is also a "custom" type that allows us to set url templates21:13
fungiso gerrit insists on adding a .git to the project name in those urls?21:13
mordred        type.setRevision("${project}.git/commit/?id=${commit}");21:13
mordredfor instance21:13
clarkbok so that implies the only change we need to get in before restarting again is https://review.openstack.org/505402 ?21:13
mordredthat's in "case cgit"21:13
fungiweird, given cgit doesn't need nor want the .git21:13
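The URL shapes discussed above can be sketched in Python, assuming the templates are expanded by plain ${name} substitution. The first template is the one quoted from Gerrit's "case cgit" branch; the second, without ".git", is a hypothetical "custom" template, and the helper name, project and commit id are placeholders for illustration, not values taken from the eventual patch.

```python
from string import Template

# Template quoted in the log from Gerrit's built-in cgit type.
BUILTIN_CGIT = Template("${project}.git/commit/?id=${commit}")
# Hypothetical "custom" template that drops the hard-coded ".git" suffix.
CUSTOM_CGIT = Template("${project}/commit/?id=${commit}")


def revision_url(template, base, project, commit):
    """Expand a revision template and join it onto the cgit base URL."""
    path = template.substitute(project=project, commit=commit)
    return base.rstrip("/") + "/" + path


base = "http://git.openstack.org/cgit"
builtin = revision_url(BUILTIN_CGIT, base, "openstack-infra/puppet-gerrit", "abc123")
custom = revision_url(CUSTOM_CGIT, base, "openstack-infra/puppet-gerrit", "abc123")
print(builtin)
print(custom)
```

Only the second form matches what ianw reported as working (the link "works without that" extra ".git").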
*** thorst has quit IRC21:14
clarkbI think the process for that is we want to get ^ merged, make sure it ends up on system config on puppetmaster, then rerun kick.sh21:14
clarkbthen manually restart gerrit when ready21:14
mordredwell- I think for now we can likely emit our own things ... one sec, I think I can make a quick patch21:14
clarkbok21:14
clarkbwe also want to remove review.o.o from the emergency file and manually run manage-projects21:14
*** ldnunes has quit IRC21:14
*** thorst has joined #openstack-infra21:16
clarkbgood news is I think we are puppeting properly now21:17
clarkband future restarts should be short and not involve zuul or the database21:18
openstackgerritMonty Taylor proposed openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit  https://review.openstack.org/50540621:18
mordredclarkb: ^^21:19
clarkbmordred: can you tab them all for consistency?21:19
clarkbmordred: I think without that puppet will be noisy21:19
clarkbbut otherwise lgtm21:20
mordredhttp://git.openstack.org/cgit/openstack-infra/gerrit/tree/gerrit-server/src/main/java/com/google/gerrit/server/config/GitwebConfig.java?h=openstack/2.13.8#n15821:20
mordredclarkb: totes. one sec21:20
*** thorst has quit IRC21:20
openstackgerritMonty Taylor proposed openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit  https://review.openstack.org/50540621:21
openstackgerritMatt Riedemann proposed openstack-infra/elastic-recheck master: Add query for live migration invalid disk info bug 1718295  https://review.openstack.org/50541021:22
openstackbug 1718295 in OpenStack Compute (nova) "Unexpected exception in API method: MigrationError_Remote: Migration error: Disk info file is invalid: qemu-img failed to execute - Failed to get shared "write" lock\nIs another process using the image?" [High,Confirmed] https://launchpad.net/bugs/171829521:22
clarkbinfra-root can we get a second review on 505406? I think with that and the cache fix in we want to rerun kick.sh and monitor gerrit (then restart gerrit again)21:23
clarkbthen I'd like to remove review.o.o from the emergency file in puppet and manually run manage projects21:23
mordred++21:24
ianwlgtm21:24
clarkbmordred: re manage-projects do we need to be concerned at all about things out of sync with the manage-projects cache items?21:24
clarkbI can't remember if there were any dangerous situations in there21:24
mordrednope - manage-projects manages itself21:24
clarkbok cool21:25
clarkbseems like it grew more complex and scary recently but that is just my paranoia21:25
jeblairclarkb: done21:25
mordredhowever - these: https://review.openstack.org/#/q/topic:fix-manage-projects+status:open would be nice to get someone to review at some point21:25
jeblairi'm going to afk for a few; i think the next gerrit restart can be a quick one so doesn't need any zuul action21:25
mordredsince three of them are from june21:25
clarkbjeblair: ++21:25
mordredjeblair: ++21:25
clarkbI've got to watch the kids for a bit but will be around to do the next kick.sh and gerrit restart and stuff once changes merge21:27
*** vhosakot has quit IRC21:28
*** rbrndt has quit IRC21:29
fungii can do those if they happen soonish, otherwise disappearing for dinner in ~45 minutes21:29
*** rbrndt has joined #openstack-infra21:29
*** rbrndt has quit IRC21:29
*** tinwood has quit IRC21:30
*** tinwood has joined #openstack-infra21:31
*** rhallisey has joined #openstack-infra21:33
*** gouthamr has joined #openstack-infra21:34
*** vhosakot has joined #openstack-infra21:34
*** jheroux has quit IRC21:35
*** rbrndt has joined #openstack-infra21:36
fungithere's a post job running!!!21:37
clarkbyay21:37
fungiwell, ref in post with jobs queued at any rate21:37
fungiwhich is further than we were getting before the restart21:37
clarkbI'm going to mark off the post jobs item on our etherpad21:37
clarkbas that should be sufficient to see the gerrit trigger worked21:37
*** e0ne has quit IRC21:37
fungishould we #status ok yet, or hold off a smidge longer?21:37
openstackgerritMerged openstack-infra/system-config master: Pass gerrit cache options through to puppet-gerrit  https://review.openstack.org/50540221:37
*** e0ne has joined #openstack-infra21:38
*** e0ne has quit IRC21:38
clarkblets hold off until after we do the second planned restart21:38
fungik21:39
clarkbI'm updating system-config on puppetmaster now to include ^21:40
clarkbthat is done21:41
clarkbnow just waiting on puppet-gerrit change21:41
fungiwant me to re-kick review once that merges?21:41
clarkbplease21:41
*** jistr has quit IRC21:41
clarkbthough puppet-gerrit change may merge more slowly due to node availability21:41
fungiyeah, seems to be waiting on a few node assignments still21:42
fungixenial specifically21:42
*** bobh has quit IRC21:43
clarkbmordred: fungi maybe we want to go ahead and run manage projects now?21:44
clarkbmordred: any chance you'd be interested in doing that since you have touched that toolchain most recently?21:45
fungiseems like it should be fine to do now21:45
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Add disabled network action plugins for 2.4  https://review.openstack.org/50541921:47
mordredclarkb: I can do it21:49
mordredclarkb: on review.o.o or on git* ?21:49
*** Sukhdev has joined #openstack-infra21:50
clarkbmordred: on review.o.o, git* should be good now21:50
mordredkk. running21:50
clarkbI'm cleaning up nodes in nodepool that got lost with the zuul restart21:50
clarkbshould mean that puppet-gerrit change gates faster21:50
mordredclarkb: it hath run with no apparent issues21:51
*** ihrachys has quit IRC21:51
*** armax has quit IRC21:51
mordredclarkb: btw- ansible 2.4 added a hiera lookup plugin21:52
mordredclarkb: so (not this week) we should be able to potentially rework some of our ansible/puppet jankiness21:52
clarkbinteresting, will have to be careful that doesn't give all nodes access to any hiera data though21:53
clarkbhrm http://git.openstack.org/cgit/openstack/networking-lagopus/ is still empty21:54
clarkbmordred: ^ I expected manage projects to create that in gerrit and push an initial commit21:54
mordredclarkb: do we need to trigger a replication perhaps?21:54
clarkbmaybe?21:54
clarkbthe push of initial commit should do that I thought21:54
mordredoh - it's possible that there was a failed attempt and jeepyb thinks it did21:54
mordredone sec21:54
*** jascott1 has joined #openstack-infra21:55
mordredclarkb: I do not see that repo in the projects.yaml on gerrit21:56
mordredclarkb: in /etc/project-config21:56
*** esberglu has quit IRC21:57
openstackgerritGage Hugo proposed openstack-infra/project-config master: Skip ansible upgrade job in keystone  https://review.openstack.org/50542621:57
mordredclarkb: Date:   Wed Sep 13 23:22:39 2017 +000021:57
mordredclarkb: is the latest commit ... I can obviously git pull - but I'd expect a full ansible/puppet run to get that up to date21:58
clarkboh I know why21:58
clarkbits because we make project-config be in sync with git.o.o21:58
mordredoh. yah21:58
clarkbso this may all self heal as soon as we let puppet run on its own on that server21:58
mordredkk. I think we're in good shape - there's no lingering issues with manage-projects21:58
*** hashar has quit IRC21:59
mordredso a normal run should be fine21:59
clarkbin that case I guess we just move forward with the old plan which is basically kick.sh it one more time to get these two changes on, then remove it from emergency file and watch it21:59
clarkbmordred: thanks for looking21:59
fungicool, unfortunately the puppet-gerrit change is still waiting for xenial nodes21:59
fungishould i punt it into the gate or let it run its course?22:00
clarkbprobably a good idea to do that so we can get this finished at a reasonable hour22:00
clarkbI'll continue to clean out old leaked nodepool nodes too22:00
fungidoing22:00
*** jistr has joined #openstack-infra22:01
fungienqueued22:01
clarkbbasically anything used and more than an hour and a half or so old I am deleting22:01
ianwwas the email issue sorted out?  did the threads get turned up?22:02
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Disable action and lookup plugins from 2.4  https://review.openstack.org/50541922:03
clarkbianw: not yet, jeblair thinks the account caching miss rate is negatively affecting it so going to get that turned up first and see how it does.22:03
clarkbianw: jeblair wants to see a single email sender thread perform well before adding more22:03
ianwah right, that's 50540222:03
clarkbok I think I have all the leaked nodes in nodepool marked for deletion, the remainder were booted after the restart22:05
*** jcoufal has joined #openstack-infra22:05
*** jcoufal has quit IRC22:06
*** aeng has joined #openstack-infra22:07
*** ijw has quit IRC22:07
*** dprince has quit IRC22:09
*** rlandy is now known as rlandy|bbl22:09
*** camunoz has quit IRC22:10
clarkbthinking about it I think the plan behind doing it on monday after ptg mostly worked out22:10
fungiagreed22:11
clarkbIt was quiet, people mostly were ready for a boring day and we have the next day to sort out issues rather than have it look great for a weekend then get caught by surprise22:11
*** Sukhdev has quit IRC22:12
fungilooks like the change we wanted is getting xenial nodes in the gate finally22:12
fungiand now it has all the nodes it needs22:13
fungii'm getting very close to having to step away, unfortunately22:13
clarkbwe should manually pull that into the repo in /etc/puppet/modules/gerrit then run kick.sh22:14
clarkbmordred: ianw pabelanger jeblair ^ who is still left?22:14
*** priteau has quit IRC22:14
*** dargains has joined #openstack-infra22:14
pabelangerstill here22:15
fungiokay, i am being dragged away to dinner22:15
clarkbfungi: enjoy22:15
fungipabelanger to the rescue!22:15
fungithanks22:15
fungibbiaw22:15
*** rhallisey has quit IRC22:15
pabelangerclarkb: 505402 and manually kick?22:16
clarkbpabelanger: 50540622:16
clarkb505402 should already be in system config22:16
pabelangerclarkb: ++ on 50540622:16
clarkbjust waiting on 406 to merge then we can get it in /etc/puppet/modules/gerrit then we can kick.sh22:16
pabelangerwfm22:17
clarkbthen we remove review.o.o from emergency file and watch that it puppets as expected22:18
ianwme too, if we want22:18
clarkbdue to not trusting puppet-gerrits handling of the war I'm tempted to stop gerrit, do index backup again, and then run kick.sh22:19
clarkb(in theory its fine now since last kick.sh was fine but ugh)22:19
*** yamamoto_ has joined #openstack-infra22:19
*** gouthamr has quit IRC22:21
openstackgerritMerged openstack-infra/puppet-gerrit master: Override the cgit url settings in gerrit  https://review.openstack.org/50540622:21
clarkbI am making sure that is up to date now22:21
*** chlong has quit IRC22:21
pabelangerk22:22
clarkbuh hrm /etc/puppet/modules/gerrit is actually quite ancient22:22
clarkbwe use origin/master I think22:23
pabelangerthat's okay, right? We push them to node now22:23
*** tpsilva has quit IRC22:23
clarkbwe push what we check out though22:23
clarkbreading modules.env it uses origin/master22:23
clarkbjust gonna double check reflog but I think that is it22:23
clarkbya reflog confirms22:24
clarkbI have done git remote update && git checkout origin/master in /etc/puppet/modules/gerrit22:24
clarkbHEAD is dce6e65c8e6e4408ae4b314733dc20d8b3bf3edd22:25
pabelangerk22:25
pabelangerwill see why it wasn't updated22:25
clarkbits because we use the remote ref rather than updating the local one22:25
clarkbpretty sure I grok it now22:25
clarkbpabelanger: do you want to get ready to run kick.sh against review.o.o?22:27
*** sdague has quit IRC22:27
pabelangerclarkb: sure22:27
pabelangersay when22:27
clarkbI'm going to manually stop gerrit first22:27
clarkband backup the index again22:28
pabelangerk22:28
clarkbwill tell you when that is done and we can kick.sh22:28
clarkbianw: maybe you can send another #status notice?22:28
ianwok22:28
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Port in changes from ansible 2.4 command module  https://review.openstack.org/50543022:29
ianw draft -> #status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart22:29
clarkbstopping gerrit now22:29
ianw#status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart22:29
clarkbindex backup is running22:30
mordredclarkb: about to afk - but I'll lurk around for the next little bit if you need22:30
clarkband is done22:30
pabelangerI don't think status was picked up22:30
clarkbpabelanger: go ahead and run kick.sh22:30
pabelangerrunning22:31
clarkbwe'll have to debug statusbot later (since we are deep into things now)22:31
ianw#status notice Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart22:31
openstackstatusianw: sending notice22:31
clarkbok it looks like it applied cleanly gonna check gerrit.config and start gerrit22:32
ianwsorry, sometimes my bouncer doesn't authenticate with nickserv :/22:32
vhosakothttps://review.openstack.org is down?22:32
pabelangerclarkb: kick.sh done22:32
clarkblgtm22:32
clarkbpabelanger: ready to start gerrit again?22:32
pabelangervhosakot: yes, see openstackstatus22:32
pabelangerclarkb: ready22:32
vhosakotpabelanger: cool +122:32
clarkbstarting22:32
ianwvhosakot: yes, sorry the notice went out after the stop due to a small issue22:32
-openstackstatus- NOTICE: Gerrit is being restarted to address some final issues, review.openstack.org will be inaccessible for a few minutes while we restart22:33
vhosakotianw: cool, thanks for the info22:33
clarkbgerrit log looks happy22:33
*** dave-mcc_ has quit IRC22:33
clarkbcgit/gitweb links work now22:34
clarkbanticw: ^22:34
openstackstatusianw: finished sending notice22:34
pabelangerHmm22:34
clarkbpabelanger: ?22:34
pabelangermight be my cache22:34
pabelangerbut https://review.openstack.org/#/c/237134/22:34
pabelangerhave old cgit URLs22:35
pabelangertrying another browser22:35
clarkbits there and working for me22:35
pabelangerk22:35
clarkbso ya I think local caches probably at fault22:35
pabelangeragree22:35
pabelangeranother review is good for me too22:35
ianwit says gitweb but goes to cgit22:35
*** ijw has joined #openstack-infra22:35
clarkbianw: ya because we switched it to "custom"22:35
ianwyep22:35
clarkbI think we can live with that for now :)22:35
pabelangerya22:36
clarkbok I think that means last step is to remove review.o.o from emergency file22:36
*** slaweq_ has quit IRC22:36
pabelangerwfm22:36
clarkband make sure that networking-lagopus gets created22:36
clarkbnext run is in 8 minutes22:37
clarkbwill we still have people around to watch that?22:37
clarkbinfra-root ^22:38
clarkbI can be around22:38
pabelangerYa, will be here22:38
clarkbok removing from emergency file now then22:38
clarkbdone22:38
clarkbI am tailing syslog on review.o.o and the puppet_run_all.log on puppetmaster22:39
pabelangersame22:40
*** rbrndt has quit IRC22:41
*** baoli has quit IRC22:41
*** rbrndt has joined #openstack-infra22:41
*** rbrndt has quit IRC22:41
clarkbpabelanger: here we go22:45
*** openstackgerrit has quit IRC22:47
pabelangerclarkb: gerrit show-queue does appear to be going down22:47
*** ijw has quit IRC22:48
clarkbok puppet run seems to have done what we expect according to logs22:48
clarkbgonna look at networking-lagopus now22:48
clarkbpabelanger: ya it trends down after the restart as it replicates everything but then trends up as emails are slow so we want to see what it does after replications22:48
jeblairback22:50
pabelangerpuppet passed afs servers22:50
clarkbeverything looks good except for http://paste.openstack.org/show/621485/ guessing that command changed in gerrit22:51
jeblairinterestingly, the accounts cache hit ratio didn't change22:51
clarkbso we have review.o.o puppeting but project creation is not working22:51
pabelangerclarkb: --name doesn't appear to be a switch now22:52
jeblairi'm not sure what else to do on the email front22:52
pabelangerhttp://paste.openstack.org/show/621487/22:52
clarkbpabelanger: ya docs seem to agree22:52
clarkbso we need jeepyb patch to fix that22:53
jeblairi guess we can throw more threads at it, but i'm worried that will just eat up cpu22:53
clarkbjeblair: maybe zaro and paladox can help with that too?22:53
pabelangerclarkb: actually, gerritlib will need to update22:55
pabelangerclarkb: https://review.openstack.org/399308/ already fixed. So we might need a version bump22:56
clarkbya likely needs a tag, though I kinda wish that checked the gerrit version and applied the right method22:57
clarkbjeblair: paladox is in #gerrit but in europe iirc so may not respond until sometime tonight for us22:57
pabelangerYa, we need a tag. 0.6.0 is missing the fixes22:58
jeblairclarkb: ack; i have emitted a question in #gerrit, thanks22:58
*** wolverineav has quit IRC23:00
*** gouthamr has joined #openstack-infra23:01
pabelangerclarkb: I'23:01
pabelangererr23:01
pabelangerclarkb: I've added gerritlib issue to gerrit-2.13-issues etherpad23:01
clarkbthanks I think I have a patch that should work in followup23:03
clarkbto handle arbitrary gerrit23:03
clarkbI'm going to test it against review-dev first23:04
mordredjeblair: I'd love to know why it's so slow23:05
clarkbI'm wondering if it digs through accountPatchReviewDb and its just sad23:05
clarkbhowever rtfsing it uses reviewdb not ^23:05
* mordred must away for the evening ... talk to y'all tomorrow23:06
*** slaweq has joined #openstack-infra23:06
jeblairclarkb: should we clear status alert now?23:06
clarkbjeblair: yes I think so23:06
jeblairhas anyone checked post jobs work?23:07
*** xarses_ has quit IRC23:07
clarkbjeblair: they queued23:07
jeblairwfm :)23:07
clarkband are still queued due to node scarcity23:07
*** ijw has joined #openstack-infra23:07
clarkbwe still shouldn't merge new project changes until gerritlib is sorted23:07
*** armax has joined #openstack-infra23:07
*** dhajare has joined #openstack-infra23:08
pabelangerYa, I see a post job complete23:08
*** openstackgerrit has joined #openstack-infra23:09
openstackgerritMerged openstack-infra/elastic-recheck master: Add query for live migration invalid disk info bug 1718295  https://review.openstack.org/50541023:09
openstackbug 1718295 in OpenStack Compute (nova) "Unexpected exception in API method: MigrationError_Remote: Migration error: Disk info file is invalid: qemu-img failed to execute - Failed to get shared "write" lock\nIs another process using the image?" [High,Confirmed] https://launchpad.net/bugs/171829523:10
*** ijw has quit IRC23:10
*** rcernin has quit IRC23:11
*** slaweq has quit IRC23:12
jeblairmy hunch is that it's doing too many or too inefficient queries23:12
jeblairsince almost every time i get the thread dump, it's performing an account query23:12
*** bobh has joined #openstack-infra23:12
openstackgerritClark Boylan proposed openstack-infra/gerritlib master: Handle different gerrit versions with create-project  https://review.openstack.org/50543823:13
*** dhajare has quit IRC23:13
clarkbpabelanger: jeblair ^ I think that is more friendly gerritlib create-project23:13
jeblairand everytime i do a mysql processlist, i see something like: SELECT T.notify_new_changes,T.notify_all_comments,T.notify_submitted_changes,T.notify_new_patch_sets,T.notify_abandoned_changes,T.account_id,T.project_name,T.filter FROM account_project_watches T WHERE T.account_id= ...23:13
clarkbjeblair: thats against reviewdb too right?23:13
jeblairya23:13
*** caphrim007 has quit IRC23:13
jeblairso i worry it's something like perform that query for every account23:13
clarkbits looking at https://review.openstack.org/#/settings/projects every time?23:14
clarkbcould also be related to our use of the third party ci no email group?23:14
clarkbI know this isn't a shared opinion but lack of gerrit emails is so nice >_>23:15
pabelangerclarkb: +223:15
jeblairclarkb: is "version < '2.12'" going to work?23:16
clarkbjeblair: the vast majority of the queued emails are for comments which I think is generated by gerrit-server/src/main/java/com/google/gerrit/server/change/EmailReviewComments.java23:16
clarkbjeblair: it seems to work in practice, it does an alnum sort. It may break if you have eg 2.13.9.-gabc vs 2.13.9.-gzxy23:17
clarkbjeblair: I think because we are comparing to the first portion of the string it should be fine (if it were more fine grained it would not be ok)23:17
jeblairclarkb: or 2.9 ?23:17
clarkboh crud23:17
clarkbclearly I've had too long of a day23:17
clarkbI'll address that23:17
jeblairclarkb: steal this code: http://git.openstack.org/cgit/openstack/gertty/tree/gertty/sync.py#n155823:18
*** vhosakot has quit IRC23:22
*** tosky has quit IRC23:23
*** dargains has quit IRC23:25
openstackgerritClark Boylan proposed openstack-infra/gerritlib master: Handle different gerrit versions with create-project  https://review.openstack.org/50543823:25
clarkbstolen23:25
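The problem jeblair points at with "version < '2.12'" is that strings compare character by character, so "2.9" sorts after "2.12". A minimal sketch of a numeric comparison follows; parse_version is a hypothetical helper written for illustration and is not the code from gertty or the gerritlib change.

```python
def parse_version(version):
    """Return the leading numeric components of a version string as a tuple.

    '2.13.9-11-gabc' -> (2, 13, 9); parsing stops at the first
    non-numeric component.
    """
    parts = []
    for piece in version.split("-")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


# String comparison: '9' > '1' at the third character, so 2.9 looks "newer".
string_says_older = "2.9" < "2.12"
# Tuple comparison: (2, 9) < (2, 12) compares the integers 9 and 12.
tuple_says_older = parse_version("2.9") < parse_version("2.12")
print(string_says_older, tuple_says_older)
```

Comparing tuples of integers keeps the check correct for both one-digit and two-digit minor versions.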
jeblair+223:27
jeblairpabelanger: ^23:27
clarkbgerrit-server/src/main/java/com/google/gerrit/server/mail/CommentSender.java does the actual construction of the comment emails that are sent I think23:28
pabelangerjeblair: clarkb: +323:28
*** dargains has joined #openstack-infra23:28
clarkbjeblair: 341baf22a0756b73ba36caec87fcb740061e332f and 89634f70f37c027e009da61776d72f2621a4f69f looks suspicious23:30
clarkbthat first one mentions notedb23:30
clarkband I'm checked out on 2.1323:30
clarkbperhaps this is an early inefficient use of notedb?23:30
clarkbI'm going to send a #status ok now how about "Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged)."23:33
jeblairi'm tcpdumping the mysql port23:35
jeblairand wow23:36
clarkbjeblair: does that status ok look good to you?23:36
jeblairclarkb: yes23:36
jeblairthis query: SELECT T.notify_new_changes,T.notify_all_comments,T.notify_submitted_changes,T.notify_new_patch_sets,T.notify_abandoned_changes,T.account_id,T.project_name,T.filter FROM account_project_watches T WHERE T.account_id= ...23:36
clarkb#status ok Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged).23:36
openstackstatusclarkb: sending ok23:36
jeblairit starts with low account_id numbers, and works its way up to ~2600023:36
jeblairand every time it loops around, gerrit sends another email23:36
clarkbwow23:36
jeblairso, yeah, i think we're doing *at least* that query for every account for every email23:37
jeblairit's not the only query23:37
jeblairSELECT T.account_id,T.email_address,T.password,T.external_id FROM account_external_ids T WHERE T.account_id=23:37
jeblairthat one is also in a continuous loop23:37
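The pattern jeblair describes, one query per account for every email, can be sketched with an in-memory SQLite database. This is an illustration only: the table name comes from the query quoted in the log, but the reduced schema, the sample rows and both function names are assumptions, and none of it is Gerrit's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema; the real table has more notify_* columns.
conn.execute(
    "CREATE TABLE account_project_watches "
    "(account_id INTEGER, project_name TEXT, notify_all_comments TEXT)"
)
conn.executemany(
    "INSERT INTO account_project_watches VALUES (?, ?, ?)",
    [(1, "openstack/nova", "Y"), (2, "openstack/zuul", "N"), (3, "openstack/nova", "Y")],
)


def watchers_per_account(conn, account_ids, project):
    """One SELECT per account: the query count grows with the account count."""
    queries, watchers = 0, []
    for account_id in account_ids:
        queries += 1
        cur = conn.execute(
            "SELECT account_id FROM account_project_watches "
            "WHERE account_id = ? AND project_name = ? AND notify_all_comments = 'Y'",
            (account_id, project),
        )
        watchers.extend(row[0] for row in cur)
    return sorted(watchers), queries


def watchers_single_query(conn, project):
    """One SELECT filtered by project, regardless of how many accounts exist."""
    cur = conn.execute(
        "SELECT account_id FROM account_project_watches "
        "WHERE project_name = ? AND notify_all_comments = 'Y'",
        (project,),
    )
    return sorted(row[0] for row in cur), 1


print(watchers_per_account(conn, [1, 2, 3], "openstack/nova"))
print(watchers_single_query(conn, "openstack/nova"))
```

With roughly 26000 accounts, the first shape would mean tens of thousands of round trips per email, which is consistent with the slow send rate seen here.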
*** ChanServ changes topic to "Discussion of OpenStack Developer and Community Infrastructure | docs http://docs.openstack.org/infra/ | bugs https://storyboard.openstack.org/ | source https://git.openstack.org/cgit/openstack-infra/ | channel logs http://eavesdrop.openstack.org/irclogs/%23openstack-infra/"23:39
-openstackstatus- NOTICE: Gerrit is once again part of normal puppet config management. Problems with Gerrit gitweb links and Zuul post jobs have been addressed. We currently cannot create new gerrit projects (fixes in progress) and email sending is slow (being debugged).23:39
jeblairthis seems to be the set: http://paste.openstack.org/show/621492/23:39
clarkbI'm being told by others at home that I need to go for a walk, and that is probably a good idea23:40
clarkbwill be afk for a bit23:40
*** vhosakot has joined #openstack-infra23:40
*** gouthamr has quit IRC23:40
jeblairmordred: i know you're gone for the evening; i just wanted to ping you here as a marker for when you're back; i know you'll find that interesting ^23:42
openstackstatusclarkb: finished sending ok23:42
*** felipemonteiro has joined #openstack-infra23:44
*** felipemonteiro__ has quit IRC23:44
*** Sukhdev has joined #openstack-infra23:46
openstackgerritMerged openstack-infra/gerritlib master: Handle different gerrit versions with create-project  https://review.openstack.org/50543823:47
*** felipemonteiro has quit IRC23:49
*** threestrands has joined #openstack-infra23:51
*** bcat_ has joined #openstack-infra23:51
openstackgerritTristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /tenants route  https://review.openstack.org/50326823:54
openstackgerritTristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/status route  https://review.openstack.org/50326923:55
openstackgerritTristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/jobs route  https://review.openstack.org/50327023:55
openstackgerritTristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/builds route  https://review.openstack.org/46656123:55
*** gouthamr has joined #openstack-infra23:55
pabelangerclarkb: I am also AFK23:56
*** kgiusti has left #openstack-infra23:57
*** bobh has quit IRC23:57
*** vhosakot has quit IRC23:57
tonybianw: What is it y'all need from me for the old stable branch removal?  Just an updated list of repos and tags?23:58
jeblairclarkb, mordred, fungi, zaro: i filed https://bugs.chromium.org/p/gerrit/issues/detail?id=7261 at the suggestion of paladox and added that to https://etherpad.openstack.org/p/gerrit-2.13-issues23:59
