Tuesday, 2013-08-20

clarkbjlk: our jenkins slaves are good at DDoSing our git server00:01
clarkbjlk: particularly when we point them at git-daemon00:01
jlkstrange.00:01
jlkbut your repos are significantly larger than Fedora's were00:01
jlkFedora was thousands of small repos00:01
jlkOur hits were probably more distributed as well, spread out over time and network capabilities. RHT infrastructure had networking gear in between our servers and the Internet; I don't know what they did for throttling or whatnot00:03
jeblairjlk: did you use xinetd or run git-daemon itself?00:04
jlkgood question! I believe I used whatever was packaged in EPEL00:04
jlkwould have been rhel6 era00:04
jeblairjlk: that's pretty much what we're doing, which ends up being xinetd.  so no particular tuning?00:08
fungiclarkb: any good reason not to pass --events on mysqldump runs? currently cronspamming us about skipping the mysql.event table on each server00:08
jlkjeblair: not that I remember.00:09
jlkI think I looked at one point at doing a git export to just get the latest bits instead of doing a full clone, or doing shallow clones, on our build server00:09
jlkbecause it didn't need any history, just needed the bits00:09
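The two lighter-weight fetches jlk mentions look roughly like this; the URL is a placeholder, and git archive only works if the server enables the upload-archive service:

    # shallow clone: a checkout plus only the most recent commit of history
    git clone --depth 1 git://git.example.org/openstack/nova.git
    # archive/export: just the tree at a ref, no .git directory at all
    git archive --remote=git://git.example.org/openstack/nova.git HEAD | tar -xf -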
mordredjeblair: ^^ that's a little bit what I was afraid of - we tend to absolutely slam the cloning infrastructure00:10
jlkah, apparently they do use xinetd to throttle it a lot now00:10
jlkwhere "it" == anonymous clones00:10
openstackgerritClark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy.  https://review.openstack.org/4278400:11
jeblairmordred: not really, we almost never clone00:11
clarkbI can't help myself00:11
ttxmordred: are you done merging back swift m-p tags to master, or should I keep the m-p branch alive for some more time ?00:12
clarkbthat is completely untested but in theory made easy with the puppetlabs module00:12
mordredttx: done with it00:12
ttxmordred: I can delete it now ?00:12
mordredttx: also wrote a patch to potentially do it00:12
mordredttx: yup00:12
clarkbjlk: aren't all git-daemon clones anonymous?00:12
ttxmordred: ok, on my way to final cleanup00:12
mordredttx: https://review.openstack.org/#/c/41927/00:12
jlkwell, yes, I'm not sure why I added that bit of data.00:12
jlk"data"00:12
jeblairclarkb, mordred: http://paste.openstack.org/show/44553/00:13
jeblairclarkb, mordred: that thread is just sitting there.  best i can tell, it's not waiting on a lock.  but it is holding one which is blocking everyone else.00:13
jeblairthat should be the jjb update that changes the git url.  it applied fine on jenkins0100:13
mordredjeblair: wow. that's stellar00:14
jeblairi'm leaning towards "try to manually kill that thread".  any other ideas before i do that?00:14
clarkbjeblair: is it possibly waiting on a locked file?00:15
*** pcrews has quit IRC00:16
*** ^demon has joined #openstack-infra00:16
*** ^demon has joined #openstack-infra00:16
ttxmordred: there is a corner case in the merge-tags thing00:16
jeblairclarkb: it looks like a runaway regex00:16
mordredjeblair: I'd chalk that up to "java sucks sometimes"00:17
mordredttx: yeah?00:17
clarkbfungi: uh I don't know00:17
ttxmordred: for stable/* I'm not sure you actually want to merge tags back... do you ?00:17
* clarkb reads more manpages00:17
*** nati_ueno has quit IRC00:17
clarkbpleia2: if you are really adventurous I think it would be cool to apply 42784 to your test server if it is still up00:17
jeblairmordred: 'cept gearman-plugin is a few rungs down the stacktrace00:18
jeblairmordred: so it's our fault00:18
mordredttx: branch: ^(milestone-proposed).*$00:18
ttxmordred: i.e. when we tag 2013.1.3 on stable/grizzly, do we really want to merge the tags back to havana master?00:18
mordredttx: the job is configured to only run on milestone-proposed00:18
mordredsince that's the only time we ever want to do this00:18
ttxmordred: at release time we use milestone-proposed too, and turn that into stable/*00:18
*** ^d has quit IRC00:18
clarkbmordred: thoughts on fungi's --events mysqldump option?00:19
mordredttx: but it's milestone-proposed when you make the tag, right?00:19
clarkbmordred: is that table useful or just noise?00:19
ttxmordred: so we push like, havana-rc2 tags to milestone-proposed while master switched to icehouse00:19
mordredclarkb: noise. we don't use it00:19
mordredttx: yup. that's fine00:19
clarkbmordred: so better to redirect that warning message to /dev/null than to dump the table?00:20
ttxmordred: ok, just doublechecking00:20
mordredttx: we _do_ want the final tag from havana milestone-proposed to be in master, so that the in-flight versions look "sensible"00:20
mordredbut I agree, the following tags that are made on stable/* should not be merged to master00:20
ttxmordred: can that job generate a conflict ? Or is it always successful ?00:21
mordredttx: and we're making it always a null-merge, so the merge will never bring changes from m-p to master00:21
ttxok, guess that answers my question00:21
mordredttx: it's always successful. it's using the merge strategy which says "just keep my version"00:21
ttxack00:21
ttx+1ed00:22
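The null-merge mordred describes comes down to git's "ours" merge strategy, roughly as below; the branch names follow the discussion above and the push target is illustrative:

    git checkout master
    git merge -s ours milestone-proposed   # record the merge, keep master's tree unchanged
    git push gerrit master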
Alex_GaynorIs there anything I could be doing to help with the "ddosing ourselves with git" issue?00:22
clarkbAlex_Gaynor: right now we are switching to using https instead of git:// as apache deals with ddosing ourselves better00:23
jeblairclarkb, mordred: uh, wow, ok, it got unstuck.00:23
mordredjeblair: wow00:23
Alex_Gaynorclarkb: "apache deals with ddosing ourselves better", I feel like this encapsulates everything I feel about computering (for better and for worse) :)00:23
clarkbAlex_Gaynor: https://review.openstack.org/42784 is one potential way of moving back to using git:// but it needs testing and probably input from someone that knows haproxy better than me00:23
Alex_Gaynorclarkb: I can probably ping some HA proxy friends00:23
clarkbAlex_Gaynor: I am semi hoping we can abuse pleia2's test box if it is still around00:24
jlkseems really strange to make use of https to make things faster...00:24
jlkIIRC git:// isn't doing any encryption, which /should/ make it an easier process to handle.00:24
jeblairAlex_Gaynor, jlk: basically, git under xinetd has no socket queueing, so you're either under the 50 process limit, or over, in which case you get your connection dropped00:24
jlkinteresting00:24
jeblairAlex_Gaynor, jlk: apache at least will let you separately tune how many things you run, vs how many things you queue00:24
clarkband if we increase the connection limit we end up hitting cpu and disk hard00:24
jlknod00:25
Alex_GaynorIs there anything we can point at github?00:25
jeblairso we can set a reasonable number of processes to run at once, and a larger queue00:25
Alex_Gaynorlet them deal with the problem00:25
mordredAlex_Gaynor: hehehe00:25
mordredAlex_Gaynor: that's funny00:25
jeblairAlex_Gaynor: that's been our strategy up to this point00:25
jlkthey appear to be moving away from git:// as much as they can00:25
BobBall_Awaymordred: Now the only failure with VIRTUAL_ENV is grenade... not sure how to fix it though, since we're explicitly trying to perform an upgrade it sounds like it might be more difficult than I'd hope...00:25
jlkbut that might just be because they can stick all sorts of tracking around http usage that they can't w/ git://00:25
mordredBobBall_Away: I think we just may need to do similar work there00:26
jeblairAlex_Gaynor: github still fails quite often, enough for our automagic to notice00:26
mordredBobBall_Away: or backport some of the changes to devstack stable/grizzly00:26
mordredBobBall_Away: but that's thrilling!00:26
BobBall_Awaymordred: effectively the error seems to be it's running in the venv but things (such as pip) haven't been installed in it00:26
jeblairAlex_Gaynor: (i should say partial strategy -- we haven't used github in tests for a long time, but we still use it for cronjobs, etc)00:26
mordredBobBall_Away: I'm going to run out for a second, I'll look at grenade when I get back00:26
BobBall_Awayvery thrilling00:26
BobBall_AwayI'm going to bed now00:26
jlkI think Fedora infrastructure also has multiple front ends for git00:26
jlkthat use a shared FS00:26
mordredBobBall_Away: thanks for your help!00:26
BobBall_Awayit's 1:30am and I've had enough :D00:27
dstufftuse a CDN00:27
dstufft!00:27
jlknot positive though00:27
Alex_Gaynordstufft: doing invalidation on a CDN'd git repo sounds awful00:27
jlkyikes00:27
* mordred has a hunch multiple servers is going to wind up being in the cards eventually00:27
dstufftAlex_Gaynor: I dunno sounds like it wouldn't be that bad actually00:27
lifelessdstufft: I'm not aware of any git CDN's00:27
Alex_Gaynorlifeless: if you run git over HTTP(S) you can just use any HTTP pass-through CDN00:28
clarkblifeless: the http stuff should CDN just fine00:28
jeblairmordred: yep.  i just want it to be multiple good servers.00:28
lifelessAlex_Gaynor: clarkb: yeouch. No. Thanks.00:28
jlkmultiple servers seems easy for read-only support. it's the read/write that's hard with a load balancer00:28
Alex_Gaynormaster/slave git00:28
mordredjlk: we don't need read/write00:28
mordredwe have a single write master00:28
mordredwhich is gerrit00:28
jlkand I really didn't want there to be two vastly different URLs for read-only clone vs write clone00:28
jeblairjlk: we are in the fortunate position of only needing to consider read-only mirrors here00:29
mordredwhich replicates to things00:29
*** nati_ueno has joined #openstack-infra00:29
jlkmordred: oh right, that makes things a lot easier for you00:29
mordredyup00:29
lifelessAlex_Gaynor: clarkb: I presume you are aware of the way plain HTTP with git (and basically all VCS's) works, right ?00:29
lifelessAlex_Gaynor: clarkb: or perhaps I should say, I presume you *aren't* aware, or you wouldn't suggest a CDN be a good fit.00:29
dstufftpretend network latency doesn't exist and just fetch some files ? :V00:29
lifelessdstufft: that's part A of the terror. part B is to either do readv's, or to sporadically download the entire repo all over again, due to the rebalancing of 'pack' operations00:30
openstackgerritClark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet.  https://review.openstack.org/4278500:30
clarkbfungi: mordred ^ that should make mysqldump cronspam less annoying00:30
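The two options discussed amount to something like the following cron commands; the paths are illustrative and this is not necessarily what 42785 actually does:

    # dump mysql.event too, so mysqldump stops warning about skipping it
    mysqldump --all-databases --events | gzip > /var/backups/mysql/all.sql.gz
    # or keep skipping it and silence the warning (at the cost of hiding other stderr output)
    mysqldump --all-databases 2>/dev/null | gzip > /var/backups/mysql/all.sql.gz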
dstufftyou can probably run multiple git slaves and just front them with haproxy proxying streams around; the only hard part would be determining if an incoming stream is read or write. if there's something obvious in the connect that lets you know whether something is authenticated you can just shove all authenticated traffic at the master and anonymous at the read slaves00:31
clarkblifeless: if the repo hasn't changed the packs stay the same00:31
jeblairdstufft: all streams are read.  :)00:31
clarkblifeless: and iirc for large repos like nova you end up with several static packs as git leaves old stuff alone00:31
jeblairfor us00:31
jlkdstufft: I don't think we have to worry about writes, everything is a read00:32
jlkdstufft: only gerrit has write access00:32
dstufftif everything is read then that's even easier00:32
dstufftjust use haproxy as a TCP load balancer00:32
dstufftuse whatever protocol you want, http, git, ssh, doesn't matter00:33
clarkbdstufft: https://review.openstack.org/4278400:33
mordreddstufft: that's what clarkb was looking in to earlier00:33
dstufftwtf is a pp file00:33
mordreddstufft: puppet00:33
dstufftoh00:33
jeblairdstufft, jlk, Alex_Gaynor: so here's the thing -- we spun up a 30g, 8vcpu cloud server for this, and ddosed it with jenkins (it's arguable whether it performed better or worse than the http setup we have on review.o.o)00:34
jlkthat seems really bizarre, unless you're working with huge repos00:34
dstufftyou mean the haproxy solution?00:34
mordredwe have a LOT of activity :)00:34
clarkbdstufft: mordred that is a first stab at using haproxy to do queueing but it can be grown to handle multiple servers00:34
jeblairdstufft, jlk, Alex_Gaynor: before we spin up an army of maxsize(rackspacecloudservers) for this, i figure a little thought and testing of the tuning of one server might be in order.00:34
mordredclarkb: ports             => '29418', ?00:34
Alex_Gaynorjeblair: so, a suggestion from a friend of mine: "instances=32"00:35
Alex_Gaynorjeblair: for xinetd00:35
dstufftoh you were just shoving a bigger server at it00:35
Alex_GaynorI assume this forks 32 processes to handle requests00:35
lifelessclarkb: it tries to accommodate things yes, which makes the behaviour worse, because you get sporadic 'wtf is it doing' when it has to suck down the entire history again.00:35
clarkbdstufft: mordred or maybe we use lbaas to handle multiple services and keep the local haproxy for queueing00:35
jlkmordred: does all that activity require a full clone of the repo?00:35
dstufftwhat does rackspace have for HD's00:35
*** dina_belova has joined #openstack-infra00:35
*** rfolco has joined #openstack-infra00:35
jeblairAlex_Gaynor: we currently have the default of 50.00:35
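For reference, the xinetd stanza being tuned looks roughly like this on a CentOS 6 host; the paths are the packaged defaults and the limits shown are the ones from this conversation, not the production values:

    service git
    {
            disable     = no
            socket_type = stream
            wait        = no
            user        = nobody
            server      = /usr/libexec/git-core/git-daemon
            server_args = --base-path=/var/lib/git --export-all --syslog --inetd
            instances   = 32    # the cap suggested above; the server currently runs the default of 50
    }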
dstufftspinning up more processes won't help if you're IO bound00:35
clarkbmordred: haproxy will listen on 9418 so I stuck gitdaemon on the alternate that gerrit uses00:35
mordredclarkb: ahhhh00:36
mordredclarkb: I agree with jeblair - let's see what a local haproxy queue will do to it00:36
mordredbefore we start adding in multi-machine lbaas00:36
clarkbmordred: definitely00:36
mordredbut potentially yes00:36
jeblairi think we ought to do some real performance testing too00:36
dstufftwhere was the bottleneck?00:36
*** coderanger has joined #openstack-infra00:36
jeblairwhere we figure out where the bottleneck actually is :)00:36
coderangerAlex_Gaynor: Fine :P00:37
Alex_Gaynorcoderanger knows how haproxy works and junk00:37
jeblairand what kind of throughput we can get under different configurations00:37
*** mriedem has joined #openstack-infra00:37
Alex_Gaynorcoderanger: tl;dr; too many things trying to get stuff from git == ddosing ourselves00:37
jlkyeah, curious where the bottleneck is. Disk, or CPU, or network00:37
clarkbcoderanger: Alex_Gaynor https://review.openstack.org/#/c/42784/1/modules/cgit/manifests/init.pp is the important file00:37
dstufftI think before you go changing your configs around you should figure out the bottleneck00:37
coderangerSo cranking down maxconns won't buffer connections like it says in the review comment, it will just leave the socket in the listen queue00:37
dstufftbecause that's going to influence what the solution is a lot :V00:38
coderangerSo if you are getting backed up, you are just going to end up with the kernel refusing conns00:38
clarkbcoderanger: "anything behind that will queue" is what the commit message says. Is that completely wrong?00:38
clarkbah00:38
clarkbwell that doesn't help00:38
coderangerI mean it can smooth out spikes00:39
*** michchap has joined #openstack-infra00:39
coderangerUp to whatever your max fds is00:39
clarkbcoderanger: spikes are the current issue. Our jenkins slaves are a thundering herd00:39
coderangerDo you know the magnitude?00:39
clarkbcoderanger: we need a semi deterministic way of making them wait in line if necessary00:39
jeblair#status ok00:40
*** ChanServ changes topic to "Discussion of OpenStack Developer Infrastructure | docs http://ci.openstack.org | bugs https://launchpad.net/openstack-ci/+milestone/grizzly | https://github.com/openstack-infra/config"00:40
*** dina_belova has quit IRC00:40
coderangerclarkb: If that's the way you want to go, make sure you set the backlog param in haproxy too :)00:40
clarkbcoderanger: absolute worst case is something like ~300 connections all at once based on the number of slaves we have00:40
clarkb+ some fudge for random people using it too00:41
coderangerAhh okay, for 300 conns that's fine as long as you know you can clear them00:41
coderangerDo the slaves retry on failure?00:41
clarkbcoderanger: they do not, and that may help a little but not fix the problem00:41
coderangerIf so, you can also just set the xinetd instances=3200:41
coderangeror probably do that anyway just for safety :)00:41
coderangerAny reason to not use Jenkins' "hash" support in the scm config?00:42
coderangerThats been the default for a while now for exactly this reason00:42
fungicoderanger: we don't really use the scm plugin for this00:42
clarkbcoderanger: because it has been useless for a long time. I believe mordred helped make it better but we tried switching to it and didn't for some other reason00:43
clarkbmordred: jeblair do you remember why we stuck with g-g-p?00:43
coderangerAhh, manual build kickoff times every slave trying to pull down code?00:43
jeblairclarkb: because it has a nice echo statement00:43
mordredless work for jenkins to attempt to do00:43
jeblaircoderanger: yeah, we 'manually' run 400-600 jobs per hour00:44
jeblaircoderanger: obviously it's not manual, but that's the way jenkins sees it; they're triggered by a project gating system hooked up to our code review00:44
coderangerYahr00:44
coderangerAnd to be clear, this is on recent-ish linux, right? :)00:44
clarkbcoderanger: haproxy or jenkins?00:45
mordredwell, the git server is running on centos600:45
coderangerhaproxy00:45
coderanger(this would do truly bad things on Windows)00:45
mordredwe don't do windows00:45
openstackgerritClark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy.  https://review.openstack.org/4278400:45
fungiusing windows would be truly bad things00:45
clarkb^^ now with backlog00:45
uvirtbotclarkb: Error: "^" is not a valid command.00:45
clarkbuvirtbot: sssshhh00:45
uvirtbotclarkb: Error: "sssshhh" is not a valid command.00:45
mordredclarkb: yes. that looks good00:46
coderangerclarkb: Other thing to check is that no hooks on the git server are using the remote IP for anything (access control, logging?)00:47
coderangerOther than that, sounds like it will do what you want :)00:47
clarkbcoderanger: we don't have server side hooks so we should be fine00:47
jlkI hadn't thought about hooks on a git-daemon pull00:48
clarkbcoderanger: cool thanks00:48
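Put together, the kind of haproxy stanza being sketched in 42784 looks roughly like the following; this is a hand-written illustration of the maxconn/backlog discussion, not the actual contents of the review:

    listen git-daemon 0.0.0.0:9418
        mode    tcp
        maxconn 32     # concurrent connections handed through to git-daemon
        backlog 256    # additional connections allowed to wait in the listen queue
        server  local-git-daemon 127.0.0.1:29418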
* jeblair runs again00:49
clarkbcoderanger: what does the hash option to jenkins scm plugin do?00:51
* fungi assumes it's hash-based load distribution00:51
* Alex_Gaynor assumes it reuses the same clone but just fetches that hash00:51
fungiooh, you're probably right00:52
coderangerYeah, the scm plugin uses a cron-style config00:52
coderangerthe hash flag just lets you do <hash based lb>/N00:52
coderangerSpreads out the thundering herd, but that only helps balance against multiple jobs00:52
coderangernot multiple slaves on the same job00:52
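This is presumably the "H" token in Jenkins cron specs; a polling schedule like the one below spreads each job's poll across the interval based on a hash of the job name:

    H/15 * * * *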
clarkbcoderanger: if you want to see shiny graphs and current tests http://status.openstack.org/zuul/00:52
* fungi guessed right00:53
Alex_Gaynorjeez, 600+ outstanding events00:53
Alex_Gaynors/events/results/00:53
clarkbAlex_Gaynor: this is what happens before milestone 3 every single time00:53
clarkbAlex_Gaynor: for grizzly it was particularly painful00:53
Alex_Gaynorclarkb: ahaha, this is my first milestone I guess00:54
clarkbAlex_Gaynor: if we had the grizzly load today we would've been fine, but you guys keep writing more code :)00:54
Alex_Gaynorclarkb: sorry?00:54
Alex_Gaynor:D00:54
Alex_Gaynorclarkb: these events/results are all bottlenecked on git?00:57
*** anteaya has quit IRC00:58
lifelessmordred: is the expectation that doing 'pip install -r requirements.txt' will grab everything a service needs?00:58
lifelessmordred: pyudev which neutron wants is not listed in it's requirements.txt. I suspect it's a transitive dependency :(00:58
clarkbAlex_Gaynor: events definitely are. I don't think results are so it is weird to see results so high00:59
clarkbAlex_Gaynor: actually I take that back. results end up merging code in gerrit which would be bottlenecked too00:59
clarkbAlex_Gaynor: events is gerrit events input into zuul. Things like new patchset or new comment. results are results from jenkins01:00
Alex_Gaynorclarkb: I assume results are serialized, so it's really a head of the line problem?01:01
clarkbAlex_Gaynor: correct01:02
*** lbragstad has joined #openstack-infra01:02
clarkbcomparing cacti graphs for zuul and review.o.o this really seems to be a zuul problem01:02
clarkbmordred: jeblair fungi I think we should merge the change to point d-g at git.o.o01:02
clarkbjeblair: and I wonder if we shouldn't artificially throttle zuul, or at least have the option to01:04
clarkbI feel better when things are slow but under control :)01:04
jeblairclarkb: what?01:05
clarkbjeblair: see the queue lengths on the zuul status page01:05
bodepdwas mgagne in here asking about redirects?01:06
*** beagles has quit IRC01:06
* bodepd searches logs...01:06
clarkbbodepd: he was at some point last week iirc01:07
bodepdclarkb: what was the verdict?01:08
bodepdclarkb: shoul I open a ticket?01:08
bodepdclarkb: we've got a lot of changes that need to happen, and decision to make based on if that happens01:09
clarkbbodepd: I want to say he made the change and it merged01:09
clarkbbodepd: check in the git log for openstack/config01:09
bodepdthe repo does not exist01:09
clarkber openstack-infra/config01:09
*** pabelanger has quit IRC01:10
bodepdno, I meant stackforge/puppet-quantum01:10
clarkboh renames01:10
clarkbhe wanted puppet lint file redirects. I thought that is what you were talking about01:11
clarkbmordred: ^ rename question01:11
bodepdsorry for the lack of context01:11
jeblairi believe that repo has been renamed01:11
bodepdmordred: basically, a github redirect stackforge/puppet-quantum -> stackforge/puppet-neutron01:11
bodepdwould be awesome01:11
bodepdI know it's possible to do if you are admin of an account01:12
jeblairbodepd: i'm opposed to that.01:12
bodepdjeblair: ok.01:12
bodepdjeblair: that is what I need to know. (if it is going to happen or not)01:12
bodepdb/c we have lots of code that needs to be updated otherwise01:12
bodepdjeblair: what is the reason against it?01:13
jeblairbodepd: sorry, it's an extremely busy time, we're even shorter staffed than normal, and we need to focus on keeping openstack running01:13
clarkbjeblair: the last log item for processing result events is from 00:2501:13
*** xchu has joined #openstack-infra01:13
*** pabelanger has joined #openstack-infra01:13
jeblairclarkb: yeah, i'm trying to figure out what it's doing01:14
jeblairclarkb: oh really, i thought this was the last01:15
jeblair2013-08-20 00:09:35,360 DEBUG zuul.Scheduler: Processing result event <Build 3133095c056a4d7ab064e05a01c7b310 of gate-tempest-devstack-vm-postgres-full>01:15
pleia2clarkb: am away from my laptop for a few hours, can do some tests later (my test server is still up)01:16
clarkbpleia2: awesome. That would be helpful as it seems like I am doing 2 other things at the moment01:16
clarkbpleia2: and I think it can wait for tomorrow01:16
jeblairoh you're right01:17
jeblair2013-08-20 00:25:24,949 DEBUG zuul.Scheduler: Processing result event <Build 339f8f6144644de8b354d56303879d7b of gate-cinder-pep8>01:17
clarkbjeblair: which is interesting because it is a result that shouldn't end up merging code or anything like that01:20
*** lcestari has quit IRC01:21
clarkbjeblair: but that would trigger pipeline.manager.onBuildCompleted(build)01:23
clarkbjeblair: 42726,2 is in the check queue01:25
jeblairclarkb: any completion event triggers the pipeline processor01:25
clarkbjeblair: it does look like the gate queue is still being processed though?01:26
jeblairit does?01:27
fungibodepd: per github redirects, i got the impression from the article on their site that it happens automagically when a repo is moved/renamed. but maybe not01:28
clarkbjeblair: well the existing changes are getting some updates. I think anything going through the global event loop is stuck01:28
Alex_Gaynorfungi: yes, when a repo is renamed the redirects should be automatic01:28
clarkbjeblair: though it looks like that is happening for check changes too. So status on the changish/eventqueueobject is being updated but the big while true loop is stuck so we don't update much more than that01:28
clarkbjeblair: are we stuck in the while self.processQueue loop in the pipeline manager?01:30
clarkbjeblair: https://git.openstack.org/cgit/openstack-infra/zuul/tree/zuul/scheduler.py#n103601:31
*** coderanger has left #openstack-infra01:32
*** Ryan_Lane has quit IRC01:32
*** mriedem has quit IRC01:33
clarkbjeblair: http://paste.openstack.org/show/44559/ is the last time I see that log message01:34
jeblairclarkb: it recently logged it again 2013-08-20 01:27:07,488 DEBUG zuul.IndependentPipelineManager: Starting queue processor: check01:34
clarkbjeblair: yeah my version of the debug log was out of date01:35
jeblairclarkb: did it move?01:35
jeblairclarkb: istr top of check had no running jobs01:35
clarkbjeblair: yeah looking at the log it seems to have moved01:36
jeblairclarkb: 2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler sleeping01:36
jeblair2013-08-20 01:27:07,148 DEBUG zuul.Scheduler: Run handler awake01:36
*** dina_belova has joined #openstack-infra01:36
jeblairclarkb: so basically it just spent 1 hour in one iteration of that loop01:36
clarkbjeblair: http://paste.openstack.org/show/44560/01:36
clarkbjeblair: yes01:36
Alex_Gaynorit looks like the queue started to move again?01:37
Alex_Gaynorat least a little01:37
clarkbAlex_Gaynor: yeah a little01:37
clarkbI need to head home or food will be cold. But I will check back in from there01:37
clarkbjeblair: tail -f /var/log/zuul/debug.log | grep 'zuul.*PipelineManager' is what I am running now to see it move01:38
fungiis the gerrit-overloaded-slowing-merges-and-result-posting theory still being batted around? with load average ~300 there and cpu pegged flat out, it seems reasonable for that to crawl01:40
fungier, ~200 i guess01:41
*** dina_belova has quit IRC01:41
Alex_Gaynoreverything broke together is a pretty reasonable explanation it seems01:41
jeblairfungi: it's possible; but we didn't see this earlier when we were busier01:41
fungimmm, point01:42
Alex_Gaynorso what changed such that things started moving again?01:42
Alex_Gaynor(there's still a ton of outstanding events/results)01:43
HenryGTrying to figure out what went wrong in gate-grenade-devstack-vm here: https://review.openstack.org/3508501:44
HenryGHelp?01:45
fungiHenryG: could this be the client backwards compat issue which was causing problems earlier today? have you asked in #openstack-qa?01:47
*** pcrews has joined #openstack-infra01:47
mordredyes it is01:47
*** ftcjeff has joined #openstack-infra01:47
mordredHenryG: known issue from earlier. should be fixed now01:47
HenryGmordred: fungi: thanks. recheck bug #?01:47
jeblairHenryG: it's at the top of the page here: http://status.openstack.org/rechecks/01:48
fungiHenryG: yeah, looking at the console log for that change it looks the same01:48
Alex_Gaynorso I'm starting to think those queue counts can't possibly be right01:49
jeblairAlex_Gaynor: why? it's been stuck/slow for over an hour01:49
Alex_Gaynorjeblair: well, there are ~50 patches in there right now, how can there be 965 results? (is that queue entirely jenkins results?)01:50
jeblairAlex_Gaynor: those are start and stop events for jenkins; something like more than 700 have arrived since the start of the slowness01:53
Alex_Gaynorso 50 * (say 6 tests per) * 2 still doesn't account for 900?01:53
fungiand yeah, it does seem from the cacti graphs that cpu/load have fallen dramatically on zuul in the past couple hours01:53
Alex_GaynorRandom other point: the SCP step for the logs seems to be slower today01:54
jeblairAlex_Gaynor: it's more than 6 jobs per change01:55
jeblairAlex_Gaynor: nova runs 1301:55
jeblairin the check queue01:55
Alex_Gaynorgah, good point, I guess it does add up01:55
Alex_Gaynor1k events :(01:56
*** nati_ueno has quit IRC01:57
jeblairi have attached a debugger.01:58
jeblairi need to get a stack trace, but the last time i tried that with gdb, the old trick i used to use didn't work01:59
clarkbgdb or pdb?02:00
jeblairgdb02:00
jeblaircan you attach pdb to a running process?02:00
Alex_Gaynorattach a gdb, acquire the GIL, use pdb :)02:00
jeblairAlex_Gaynor: do you have instructions for that?02:01
dstufftyou'll have to teach me how to do that someday Alex_Gaynor02:01
Alex_Gaynorif it's a recent gdb there's actually a python embedded in it that lets you do stuff02:01
Alex_Gaynorgdb&\02:01
Alex_Gaynorgdb*\02:01
Alex_Gaynorhttp://wiki.python.org/moin/DebuggingWithGdb has some details02:02
jeblairAlex_Gaynor: afaict, the 'py-bt' thing is a fedora-ism02:02
jeblairhttps://fedoraproject.org/wiki/Features/EasierPythonDebugging#New_gdb_commands02:02
Alex_Gaynorjeblair: it was originally developed by a redhat person for fedora, but it's upstream now02:02
jeblairoh.  this is on precise02:03
Alex_Gaynormaybe debian/friends don't compile with the needed flags or something :(02:03
jeblairAlex_Gaynor: i think those are extra gdb commands02:03
*** rfolco has quit IRC02:04
jeblairah, they are also in the precise python dbg package02:04
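When those helpers are available, the intended usage is roughly the following, assuming the python-dbg package and matching debug symbols are installed:

    $ gdb /usr/bin/python2.7 -p <zuul-pid>
    (gdb) py-bt                      # Python-level backtrace of the current thread
    (gdb) thread apply all py-bt     # the same for every thread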
fungiload average on review.o.o has collapsed too now02:04
* fungi needs to head out to a dinner reservation. bbl02:05
Alex_GaynorI need to head home from the office because at some point it became 7PM, I'll be around more when I'm home02:05
*** rfolco has joined #openstack-infra02:05
clarkbjeblair: anything else I can be doing now to help?02:09
jeblairclarkb: i'm still unable to get a stacktrace.  'py-bt' just says (unable to read python frame information) for every frame02:10
jeblairclarkb: figuring out how to get a stacktrace from a running python on ubuntu precise is what i'm working on now.  any help there would be appreciated02:11
*** yaguang has joined #openstack-infra02:11
clarkbjeblair: ok02:12
jeblairclarkb: apparently those macros expect to be run with python-dbg, which of course is not how we started zuul02:13
clarkbjeblair: http://www.python.org/~jeremy/weblog/031003.html not quite a stack trace but possibly useful02:13
*** xBsd has joined #openstack-infra02:16
clarkbjeblair: also http://svn.python.org/projects/python/trunk/Misc/gdbinit02:17
jeblairclarkb: i think the objects have changed since then02:17
clarkbjeblair: that gdbinit comes with a pystack function02:18
*** ^demon has quit IRC02:20
jeblairclarkb: No symbol "co" in current context.02:20
jeblairclarkb: these all seem to be obsolete.02:20
clarkb:( yeah they are fairly old02:20
* clarkb finds python2.7 branch02:21
*** lbragstad has quit IRC02:22
jeblairclarkb: i think it's due to gcc optimizations02:23
clarkbjeblair: http://hg.python.org/cpython/file/c048b211f634/Misc/gdbinit doesn't seem different but I haven't actually diffed them02:23
clarkbjeblair: ah so the symbols just don't exist because gcc02:23
jeblairi wonder if we could even do Alex_Gaynor's pdb trick with the current level of symbol mangling02:25
Alex_Gaynorjeblair: if you can grab the GIL and use C's PyRun_SimpleString it should be possible02:26
jeblairAlex_Gaynor: that sounds easy but i have no idea how to go about that02:26
Alex_GaynorWhen I'm at a computer and not my phone I'll try to find a reference02:27
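The trick Alex_Gaynor is describing is usually some variant of the gdb session below; it needs the interpreter symbols to resolve and can wedge the process if gdb stops it at an unlucky point, so treat it as a sketch rather than a recipe. Dumping the stacks to a file is the practical variant when the process has no usable terminal for pdb:

    $ gdb -p <zuul-pid>
    (gdb) call (int)PyGILState_Ensure()
    (gdb) call (int)PyRun_SimpleString("import sys, traceback; open('/tmp/zuul-stacks.txt', 'w').write(''.join(''.join(traceback.format_stack(f)) for f in sys._current_frames().values()))")
    (gdb) call (void)PyGILState_Release($1)
    (gdb) detach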
mordredjeblair: I'm here - I do not know what I can do to be helpful to you02:30
clarkbmordred: we need a stacktrace from running zuul02:30
mordredhttp://www.jmcneil.net/2012/04/debugging-your-python-with-gdb-ftw/02:31
mordredreading this now02:31
jeblairmordred: my understanding of that is that it does not work because of gcc optimizations02:32
mordredjeblair: yeah. I believe you are correct02:32
mordredbtw - symbol stripping, which debian is obsessed with, has no real noticeable benefit most times02:33
mordredand screws you in times like this02:33
mordredjeblair: have you installed python-dbg? sometimes deb packages extract the symbols and put them into external files02:33
jeblairthanks debian!02:33
jeblairmordred: yes i have02:33
mordredand gdb can be told to load them as symbol maps02:33
mordredlet me see if i can get some info on that02:34
jeblairmordred: that made the backtraces look like this: #33 0x0000000000466a42 in PyEval_EvalFrameEx ()02:34
jeblairmordred: but still no understanding of arguments or local variables02:34
*** eharney has joined #openstack-infra02:34
mordredso "p *co" does nothing02:35
jeblairNo symbol "co" in current context.02:35
mordredawesome02:35
jeblairso, we could call this a wash02:36
*** dina_belova has joined #openstack-infra02:36
jeblairand restart zuul using the 'python-dbg' binary02:36
mordredoh - wait02:36
mordredthere's a thing dhellman tweeted about the other day02:37
*** jfriedly has quit IRC02:37
clarkbthis must be why people gentoo02:37
jeblairand if it happens again, we'd be in a better place (no idea what that would do to performance though, since i think it is doing refcount debugging as well)02:37
jeblairmordred: that's exciting; i'm holding for your tweet02:37
jeblair(i'll be really excited if the actual method is less than 140 characters)02:37
mordredok. I don't think this is it, but, while I'm looking, look at: https://github.com/albertz/pydbattach02:38
*** rfolco has quit IRC02:38
*** dina_belova has quit IRC02:38
jeblairmordred: wilco02:38
*** mriedem has joined #openstack-infra02:40
jeblairmordred: neat, but it's complicated, and i don't really want to audit it or compile/run it on our server right now02:41
mordredjeblair: ok. that's the closest thing I can find right now02:41
mordredI think that call a wash and restart zuul with python-dbg is our best bet02:42
clarkbwfm02:42
clarkbnot elegant, but if it keeps things moving...02:42
*** bingbu has joined #openstack-infra02:43
jeblairokay that's clearly more complicated than it seems02:46
jeblairImportError: /usr/local/lib/python2.7/dist-packages/Crypto/Util/_counter.so: undefined symbol: Py_InitModule4_6402:46
jeblairok, so i can just restart it as normal, and add some more debug lines to it i guess.02:47
jeblairmaybe add a jenkins style "threadDump" command.  won't that just be the best?02:48
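A thread-dump hook of that sort only takes a few lines of Python; this is a minimal sketch wired to SIGUSR2 and stdout, not what zuul actually ships:

    import signal
    import sys
    import threading
    import traceback

    def dump_threads(signum, frame):
        # map thread idents to names so the dump is readable
        names = dict((t.ident, t.name) for t in threading.enumerate())
        out = []
        for ident, stack in sys._current_frames().items():
            out.append('Thread: %s (%d)' % (names.get(ident, 'unknown'), ident))
            out.extend(line.rstrip() for line in traceback.format_stack(stack))
        print('\n'.join(out))

    signal.signal(signal.SIGUSR2, dump_threads)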
jeblairzuul has been restarted.  it has no queue.02:48
*** mriedem has quit IRC02:49
*** pcrews has quit IRC02:49
clarkbjeblair: that will work too02:49
Alex_Gaynorwell that doesn't sound good02:49
mordredjeblair: sigh. I believe, now that you mention it, to use python-dbg, you will need -dbg versions of all of the c-based python libraries you might have installed02:49
mordredin addition to the -dbg versions of the c libraries they depend on02:49
jeblairmordred: lets move all our servers to rhel02:50
mordredjeblair: ok02:50
clarkbjeblair: or gentoo02:50
mordredjeblair: or gentoo - and we can compile from source ourselves02:50
jeblairmordred: https://bugs.launchpad.net/nova/+bug/937554/comments/1302:51
uvirtbotLaunchpad bug 937554 in nova "Lots of problems with deleting a server immediately after create (dup-of: 934575)" [High,Fix committed]02:51
uvirtbotLaunchpad bug 934575 in nova "notifier endless loops in is_primitive" [Medium,Fix released]02:51
*** eharney has quit IRC02:51
* mordred is looking at the debian packaging and cannot figure out why stack information is missing in the normal python02:51
*** melwitt has quit IRC02:51
mordredthey aren't passing stupid optimizer flags02:51
jeblairhandy instructions for building your own python, in a nova bug report no less!02:51
jeblairmordred: "02:52
jeblair#Recompiling python with make "CFLAGS=-g -fno-inline -fno-strict-aliasing" solves this problem.02:52
jeblairmordred: ^ from that bug report; that help?02:52
mordredahhhh02:52
mordredyes02:52
mordred-fno-inline02:52
mordredI forgot - python actually has a bunch of stuff defined in header files02:52
mordredso -O2 is going to wind up inlining the shit out of it02:52
mordred-O2 includes -finline-small-functions02:55
mordred-O0, which python-dbg is compiled with, does not02:55
mordredthey're all compiled with -g but then dh_strip puts the symbols into python-dbg02:55
mordrednone of that is helpful here02:56
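Spelled out, the rebuild from the bug report is roughly the following; the version and prefix are illustrative:

    wget http://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
    tar xzf Python-2.7.5.tgz && cd Python-2.7.5
    ./configure --prefix=/opt/python2.7-debug
    make "CFLAGS=-g -fno-inline -fno-strict-aliasing"
    sudo make install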
*** afazekas_zz is now known as __afazekas_zz03:02
jeblairi have reverified all the changes that were approved and did not have a vrfy-203:04
*** rcleere has joined #openstack-infra03:04
*** markmcclain has quit IRC03:05
jeblairi have had a very long day and am not useful.  tomorrow i intend to work on nodepool.  if anyone wants to add some more debugging or a threadDump feature to zuul, that would be great; otherwise, i'll get to it later this week03:06
jeblairalso, i'm thinking we should have the gearman-plugin stop sending work status packets.03:06
*** Ryan_Lane has joined #openstack-infra03:07
Alex_Gaynorso, are no builds happening right now?03:07
clarkbI can look into zuul threaddumps03:07
clarkbafter I propose changes to add mysql backups (that should be quick)03:07
jeblairAlex_Gaynor: i restarted zuul, should be running now03:07
Alex_Gaynorjeblair: there doesn't appear to be anything on http://status.openstack.org/zuul/03:07
clarkbjeblair: are work status packets causing problems?03:07
clarkbAlex_Gaynor: refresh? there is stuff for me03:07
*** pcrews has joined #openstack-infra03:08
jeblairAlex_Gaynor: you may need to reload it?03:08
Alex_GaynorI don't even know. I hate browsers.03:08
jeblairclarkb: no, but we ignore them.  just busy work.03:08
mordredjeblair: oh, for some reason I thought we were using them for status bars - I agree with anything you say03:12
clarkbmordred: that is what I thought they were for too03:12
clarkband isn't zuul LOST status the result of not getting a status from gearman?03:13
*** erfanian has quit IRC03:13
jeblairmordred: we could.  what we actually do is grab the estimated time from the first one and then calculate it ourselves.03:13
mordredjeblair: ah. nice03:13
jeblairclarkb: no, it polls gearman to see if the job is still in the queue.  that would be a reasonable thing to do though...03:14
jeblairclarkb: it would have helped with the jobs that got stuck in the jenkins queue and never ran03:14
jeblairclarkb: maybe we should keep it and just reduce the logging.03:14
clarkb++03:14
jeblairi've seen several jobs lost because of errors like this: https://jenkins02.openstack.org/job/gate-grenade-devstack-vm/2370/console03:15
jeblairi have no idea what's going on there.  perhaps a dead slave (nodepool does not have a periodic job to recheck ssh access)03:15
jeblairbut it seems to happen a lot for that.03:15
Alex_GaynorFor all the jobs that were lost when zuul was restarted, are the patch authors notified so they can recheck/reverify?03:16
clarkbAlex_Gaynor: no, but I think jeblair indicated he did it for them03:16
jeblairAlex_Gaynor: i reverified the ones that were approved;03:16
Alex_GaynorOh, that's nice of you!03:17
jeblairI have not done rechecks.03:17
jeblairit's hard to get a gerrit query for that.03:17
Alex_GaynorThings that don't have a current status from jenkins03:17
Alex_Gaynorgerrit doesn't have an easy way to do that? :(03:17
mordred-label:Verified<=2 will get you the ones that are completely new - but it's hard to get the ones that may have had a new patchset uploaded since the last time they were check verified03:19
mordredbecause we don't clear the verified status on the start of a new check job like we do for the gate03:19
mordredactually, you'd want -label:Verified<=2 -label:Approved for the first one, to make sure that you're not catching a thing that the gate has cleared the verified vote03:20
mordredbut still, you're missing a ton there03:21
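For the record, running that kind of query from the command line would look something like this; the label expressions are adapted from the ones above and would need checking against the running Gerrit before trusting the results:

    ssh -p 29418 review.openstack.org gerrit query --format=JSON \
        'status:open -label:Verified<=2 -label:Approved>=1'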
*** HenryG has quit IRC03:25
*** zul has quit IRC03:31
*** cthulhup has joined #openstack-infra03:33
*** cthulhup has quit IRC03:37
*** dina_belova has joined #openstack-infra03:37
*** dina_belova has quit IRC03:42
*** afazekas has joined #openstack-infra03:42
*** boris-42 has joined #openstack-infra03:49
bodepdfungi: I just went through the following process: https://gist.github.com/bodepd/627693203:52
bodepdfungi: and my redirects worked as expected. I did, however, use github's GUI, and I am not sure what process was used by your team03:53
*** xBsd has quit IRC03:53
*** jfriedly has joined #openstack-infra03:55
*** wenlock has joined #openstack-infra03:56
*** mberwanger has joined #openstack-infra03:59
*** yaguang has quit IRC03:59
*** vogxn has joined #openstack-infra04:01
*** michchap_ has joined #openstack-infra04:04
*** michchap has quit IRC04:08
*** yaguang has joined #openstack-infra04:12
*** ftcjeff has quit IRC04:23
*** wenlock has quit IRC04:24
*** dims has quit IRC04:25
*** cthulhup has joined #openstack-infra04:27
*** cthulhup has quit IRC04:31
*** dina_belova has joined #openstack-infra04:38
*** mberwanger has quit IRC04:38
*** dina_belova has quit IRC04:42
*** xBsd has joined #openstack-infra04:47
*** reed has quit IRC04:53
*** yaguang has quit IRC04:59
*** rcleere has quit IRC05:03
fungibodepd: yeah, mordred did the stackforge/puppet-{quantum,neutron} move, but not sure what he did in github land for it. our http://ci.openstack.org/gerrit.html#renaming-a-project recipe suggests "12. Rename the project in GitHub..." so i would assume that's what he did05:07
*** dmakogon_ has joined #openstack-infra05:08
*** yaguang has joined #openstack-infra05:12
*** cthulhup has joined #openstack-infra05:21
*** SergeyLukjanov has joined #openstack-infra05:24
*** cthulhup has quit IRC05:25
*** nicedice_ has quit IRC05:29
mordredfungi, bodepd I'm pretty sure I just deleted the old project and let the new project be created by manage_projects05:34
*** dina_belova has joined #openstack-infra05:38
*** dina_belova has quit IRC05:43
*** thomasbiege has joined #openstack-infra05:48
*** DennyZhang has joined #openstack-infra05:55
*** mikal has quit IRC05:55
*** thomasbiege1 has joined #openstack-infra05:59
*** thomasbiege has quit IRC06:02
*** thomasbiege1 has quit IRC06:13
*** cthulhup has joined #openstack-infra06:15
*** thomasbiege has joined #openstack-infra06:17
*** cthulhup has quit IRC06:20
*** dina_belova has joined #openstack-infra06:39
*** dina_belova has quit IRC06:43
*** tian has quit IRC06:44
*** nayward has joined #openstack-infra06:47
*** fbo is now known as fbo_away06:49
*** SergeyLukjanov has quit IRC06:50
*** jfriedly has quit IRC06:52
bodepdmordred: :( . I'm trying to reach out to some folks at github to see if they can help us set up those redirects06:57
bodepdmordred: I may need someone with actual credentials to approve it once I get a hold of the right person06:58
*** michchap has joined #openstack-infra07:00
*** xchu has quit IRC07:00
*** michchap_ has quit IRC07:02
*** cthulhup has joined #openstack-infra07:09
*** SergeyLukjanov has joined #openstack-infra07:11
*** xchu has joined #openstack-infra07:12
*** cthulhup has quit IRC07:14
*** SergeyLukjanov has quit IRC07:14
*** ruhe has joined #openstack-infra07:26
*** pblaho has joined #openstack-infra07:29
*** boris-42 has quit IRC07:34
*** SergeyLukjanov has joined #openstack-infra07:38
*** dina_belova has joined #openstack-infra07:39
*** michchap has quit IRC07:39
*** michchap has joined #openstack-infra07:39
odyibodepd: Simply contacting Github support had really good turnaround on the redirects from puppetlabs/puppetlabs-* to stackforge/puppet-*.07:41
odyiThey manually put them in long before I actually deleted the repositories.07:42
*** odyssey4me3 has joined #openstack-infra07:47
odyiThe "Approved" label that seems to be a part of each Gerrit project.  What is it used for?  Gerrit docs don't make mention of it so I assume it is a custom label.07:48
* odyi also couldn't find it mentioned in any of the OpenStack/Gerrit workflow docs.07:50
*** michchap has quit IRC07:52
*** morganfainberg is now known as morganfainberg|a07:55
*** DennyZhang has quit IRC07:56
*** SergeyLukjanov has quit IRC08:00
*** vogxn has quit IRC08:03
*** cthulhup has joined #openstack-infra08:03
*** jpich has joined #openstack-infra08:04
*** derekh has joined #openstack-infra08:06
*** fbo_away is now known as fbo08:07
*** cthulhup has quit IRC08:08
*** xchu has quit IRC08:09
*** alex_dolby has joined #openstack-infra08:15
*** jhesketh has quit IRC08:16
alex_dolbyhi guys.. i am running tox -epy26 in the python-novaclient component and getting an error about pbr versions08:17
alex_dolbypbr in setup.py and requirements.txt has different versions..08:18
alex_dolbyany pointers?08:18
*** mkerrin has quit IRC08:20
*** dina_belova has quit IRC08:21
*** ladquin has quit IRC08:24
*** xchu has joined #openstack-infra08:26
*** psedlak has joined #openstack-infra08:27
*** SergeyLukjanov has joined #openstack-infra08:27
*** boris-42 has joined #openstack-infra08:40
*** cthulhup has joined #openstack-infra08:57
*** vogxn has joined #openstack-infra09:02
*** cthulhup has quit IRC09:02
*** arezadr has quit IRC09:12
*** dina_belova has joined #openstack-infra09:22
*** dina_belova has quit IRC09:26
*** bingbu has quit IRC09:27
*** SergeyLukjanov has quit IRC09:32
*** dina_belova has joined #openstack-infra09:32
*** dina_belova has quit IRC09:34
*** dina_belova has joined #openstack-infra09:35
*** yaguang has quit IRC09:43
*** odyssey4me3 has quit IRC09:54
*** xchu has quit IRC10:03
*** odyssey4me3 has joined #openstack-infra10:05
*** dina_belova has quit IRC10:09
*** LinuxJedi has quit IRC10:09
*** ruhe has quit IRC10:12
*** alexpilotti has joined #openstack-infra10:12
*** odyssey4me3 has quit IRC10:17
*** LinuxJedi has joined #openstack-infra10:20
*** ruhe has joined #openstack-infra10:21
*** odyssey4me3 has joined #openstack-infra10:24
*** thomasbiege has quit IRC10:28
*** SergeyLukjanov has joined #openstack-infra10:37
*** nayward has quit IRC10:45
dhellmannmordred, jeblair : were you looking for https://github.com/dhellmann/smiley/ last night?10:49
*** mkerrin has joined #openstack-infra10:52
*** nayward has joined #openstack-infra10:52
*** markmc has joined #openstack-infra10:56
*** thomasbiege has joined #openstack-infra11:02
*** dina_belova has joined #openstack-infra11:09
jswarrenAfter the glanceclient fix yesterday, I reviewed three changes with "recheck no bug" about 12 hours ago.  Jenkins has not re-reviewed them yet.  Anything else I need to do?11:09
jswarrenFor example: https://review.openstack.org/#/c/40232/11:11
*** SergeyLukjanov has quit IRC11:12
*** dina_belova has quit IRC11:14
*** vogxn has quit IRC11:16
*** lcestari has joined #openstack-infra11:17
*** vogxn has joined #openstack-infra11:18
*** zul has joined #openstack-infra11:19
*** dina_belova has joined #openstack-infra11:19
*** dims has joined #openstack-infra11:20
*** dina_belova has quit IRC11:24
*** nayward has quit IRC11:29
*** weshay has joined #openstack-infra11:31
*** vogxn has quit IRC11:31
*** ruhe has quit IRC11:32
*** SergeyLukjanov has joined #openstack-infra11:39
*** nayward has joined #openstack-infra11:41
*** SergeyLukjanov has quit IRC11:44
*** zul has quit IRC11:46
*** pcm_ has joined #openstack-infra11:46
*** vogxn has joined #openstack-infra11:46
*** HenryG has joined #openstack-infra11:49
openstackgerritJulien Danjou proposed a change to openstack/requirements: Add gevent  https://review.openstack.org/4287111:50
*** jjmb1 has quit IRC11:58
*** afazekas is now known as afazekas_no_irq11:59
*** yaguang has joined #openstack-infra12:02
*** ruhe has joined #openstack-infra12:06
*** rfolco has joined #openstack-infra12:07
*** alex_dolby has quit IRC12:09
*** vogxn has quit IRC12:12
*** apcruz has joined #openstack-infra12:18
*** mriedem has joined #openstack-infra12:19
*** dina_belova has joined #openstack-infra12:20
*** sandywalsh has quit IRC12:22
*** sandywalsh has joined #openstack-infra12:24
*** dina_belova has quit IRC12:25
*** anteaya has joined #openstack-infra12:27
*** SergeyLukjanov has joined #openstack-infra12:35
*** ruhe has quit IRC12:36
*** zul has joined #openstack-infra12:37
*** dims has quit IRC12:38
*** dprince has joined #openstack-infra12:39
*** dkranz has joined #openstack-infra12:39
*** dims has joined #openstack-infra12:40
*** dina_belova has joined #openstack-infra12:43
zulso i'm curious why jenkins hasn't been triggered for https://review.openstack.org/#/c/41093/ and https://review.openstack.org/#/c/42789/12:44
*** ruhe has joined #openstack-infra12:47
markmczul, you know, I think I'm seeing this too with my nova reviews12:47
* markmc looks12:47
*** dina_belova has quit IRC12:47
*** SergeyLukjanov has quit IRC12:47
markmczul, ok, not seeing it now - but think I saw zuul missing some submissions yesterday12:48
zulhmmm12:48
zulis there a way to kick them off again?12:49
markmclooks like recheck doesn't work, I don't know of another way12:51
markmcjust change the commit message of the first patch and re-submit12:51
zulok12:55
*** dkranz has quit IRC12:55
anteayamarkmc zul there were issues yesterday with jenkins. As best I understand it, jenkins was ddosing our git server and there was much work to bring about a resolution. Reading the logs, I cannot definitively point to a solution that was found. What you are seeing _may_ be related.13:00
markmcok, thanks13:01
*** jog0 is now known as jog0-away13:01
zulanteaya:  cool thanks13:01
anteayanp13:01
*** mberwanger has joined #openstack-infra13:01
*** adalbas has quit IRC13:03
*** kiall has quit IRC13:08
*** dkliban has quit IRC13:11
*** changbl has quit IRC13:12
*** whayutin_ has joined #openstack-infra13:14
*** weshay has quit IRC13:16
*** xchu has joined #openstack-infra13:20
*** w_ has quit IRC13:23
*** sgviking has quit IRC13:25
*** sgviking has joined #openstack-infra13:25
*** sgviking has quit IRC13:26
*** sgviking has joined #openstack-infra13:26
*** lbragstad has joined #openstack-infra13:27
*** HenryG has quit IRC13:27
*** michchap has joined #openstack-infra13:30
*** mberwanger has quit IRC13:35
*** prad_ has joined #openstack-infra13:37
*** burt has joined #openstack-infra13:42
*** thomasbiege2 has joined #openstack-infra13:43
mordreddhellmann: yes13:44
*** cppcabrera has joined #openstack-infra13:45
*** thomasbiege has quit IRC13:46
jd__ttx, mordred, dhellmann, whoever, I'd need https://review.openstack.org/#/c/42871/ quickly to unblock Ceilometer CI, which is failing13:46
jd__zul: ^13:46
mordredjd__: can you point me to the failing thing?13:46
ttxjd__: looks like I don't have +2 on requirements13:47
* mordred wants to understand why our mirror builder isn't picking it up13:47
ttxmordred: I thought I had, but meh13:47
jd__mordred: http://logs.openstack.org/46/42846/1/check/gate-ceilometer-python27/caaca73/console.html.gz13:47
mordredthank you13:47
jd__mordred: Pymongo does not specify the dependency…13:48
ttxI can certainly spare the effort13:48
mordredjd__: o m g13:48
mordredjd__: SERIOUSLY?13:48
mordredI hate people13:48
*** dina_belova has joined #openstack-infra13:48
jd__I couldn't agree more13:48
jd__I've opened a ticket upstream https://jira.mongodb.org/browse/PYTHON-55813:48
mordredaprvd13:48
ttxmordred: was I supposed to have +2 on requirements ? I forget what we originally said (discovered recently I wasn't subscribed to it)13:49
mordredttx: I'm happy to give you +2 on them - makes sense for you to have it13:49
*** whayutin_ has quit IRC13:49
jd__+1 :)13:50
ttxcan't remember if I signed up for it or not13:51
*** dina_belova has quit IRC13:52
mordredjd__: ok- there is feedback on that bug...13:53
ttxmordred: let me watch the reviews for some time to see if I actually care enough13:53
jd__mordred: just saw, I'm responding13:53
mordredjd__: I did too13:53
jd__ah13:53
* jd__ lags13:53
mordredjd__: "gevent doesn't support python 3 or pypy" -- is there an internal feature of pymongo that you're using that's going to get us in trouble with python 3 and pypy support?13:54
jd__mordred: no, we use nothing fancy13:54
mordredk. cool13:54
mordredI'll be interested to see what's going on here13:54
jd__that's why I'm surprised we see errors about gevent now that we pull pymongo 2.613:54
*** michchap has quit IRC13:57
*** dina_belova has joined #openstack-infra13:58
*** ftcjeff has joined #openstack-infra13:59
openstackgerritRussell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job temporarily  https://review.openstack.org/4289814:00
*** weshay has joined #openstack-infra14:01
*** vogxn has joined #openstack-infra14:01
jd__ah now that talks about greenlet and I'm going to be lost in that again14:02
* jd__ runs14:02
*** dina_belova has quit IRC14:03
openstackgerritRussell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job  https://review.openstack.org/4289814:06
*** dkliban has joined #openstack-infra14:08
*** xBsd has quit IRC14:10
*** xBsd has joined #openstack-infra14:10
*** xBsd has quit IRC14:12
mordredjd__: I think we can remove use_greenlets14:15
mordredjd__: "If you need to use standard Python threads in the same process as Gevent and greenlets"14:16
jd__indeed, we don't use threads so that should be ok I guess14:16
markmcare you sure?14:16
markmclibraries have been known to spawn random threads :)14:17
*** pabelanger has quit IRC14:17
jd__now I'm unsure and scared14:17
mordredmarkmc: well, let's solve that problem when we come to it for real - keeping the option means we're adding another python3 incompatibility14:17
mriedemdhellmann: ping14:17
mordredjd__: can we try a patch to ceilometer that removes the option?14:18
markmcjd__, cooperative coroutines mumble mumble ... oh, look over there!14:18
jd__mordred: sure, it'll take me a sec'14:18
mordredjd__: sending patch in...14:18
jd__mordred: cool14:18
mordredjd__: https://review.openstack.org/4290614:20
*** changbl has joined #openstack-infra14:20
jd__mordred: ack, approving, if Jenkins' happy, we'll be too14:21
mordredgreat!14:21
jd__and we'll be able to revert gevent fortunately14:21
*** xBsd has joined #openstack-infra14:21
mordredI already blocked that from merging14:21
mordredand https://jira.mongodb.org/browse/PYTHON-558?focusedCommentId=407277#comment-407277 for anyone who wants to play along14:22
mriedemdhellmann: nevermind14:22
*** odyssey4me3 has quit IRC14:22
ttxnice turnaround on that bug report14:22
mordreddhellmann: I'm reading the mailing list as being in approval of giving Alex_Gaynor +2 on requirements...14:23
mordreddhellmann: shall we make that happen?14:23
*** beagles has joined #openstack-infra14:28
*** thomasbiege2 has quit IRC14:28
*** rcleere has joined #openstack-infra14:32
*** mrmartin has joined #openstack-infra14:33
*** gordc has joined #openstack-infra14:35
*** markmcclain has joined #openstack-infra14:37
*** ruhe has quit IRC14:38
gordchi folks, would anyone happen to know when the cron job runs to update the CI mirror? i just made a release for a lib and was wondering when jenkins would pick it up ... or if i could force it to get picked up.14:38
*** datsun180b has joined #openstack-infra14:40
*** yaguang has quit IRC14:40
*** __afazekas_zz has quit IRC14:41
mordredgordc: it runs after we land requirements changes - which lib? is it a thing that we should raise the min in openstack/requirements for?14:47
*** odyssey4me4 has joined #openstack-infra14:47
*** senk has joined #openstack-infra14:47
*** derekh has quit IRC14:50
gordcmordred: it's for the pycadf library (a new lib for audit data) -- i did not include a min since some changes were still being made around the time it was added14:50
*** SergeyLukjanov has joined #openstack-infra14:51
*** dina_belova has joined #openstack-infra14:57
*** david-lyle has quit IRC14:57
*** cthulhup has joined #openstack-infra14:58
*** sandywalsh has quit IRC15:01
*** ryanpetrello has joined #openstack-infra15:03
*** wu_wenxiang has joined #openstack-infra15:04
mriedemgordc: hey, i noticed that this didn't automatically change the status/assignee of the bug in launchpad: https://review.openstack.org/#/c/42904/15:04
mriedemwas going to ask dhellmann if the pycadf project is hooked up to launchpad via gerrit for status changes15:05
gordcmriedem: it probably isn't hooked up correctly. i created the launchpad project so good chance i mucked it up :)15:06
wu_wenxiangI see some commits that haven't started check for a long time, for example: https://review.openstack.org/#/c/38963/ and https://review.openstack.org/#/c/42794/15:06
*** ruhe has joined #openstack-infra15:08
*** sridevi has joined #openstack-infra15:08
*** xchu has quit IRC15:08
jeblairwu_wenxiang: leave a comment with "recheck no bug"; we had to restart zuul yesterday and it lost its queue15:09
srideviHi could someone help me with https://review.openstack.org/#/c/34801/15:09
srideviI see "ERROR:root:Could not find any typelib for GnomeKeyring" failures15:10
*** ^d has joined #openstack-infra15:12
*** ^d has joined #openstack-infra15:12
*** xBsd has quit IRC15:12
wu_wenxiangjeblair: Thanks15:12
*** SlickNik has quit IRC15:13
*** vogxn has quit IRC15:13
*** SlickNik has joined #openstack-infra15:14
*** pabelanger has joined #openstack-infra15:15
*** wu_wenxiang has quit IRC15:16
*** david-lyle has joined #openstack-infra15:17
*** sandywalsh has joined #openstack-infra15:17
ryanpetrellojeblair: Can I bug you to take a peek at this review? https://review.openstack.org/#/c/42685/215:17
*** UtahDave has joined #openstack-infra15:19
ryanpetrelloor clarkb for that matter15:19
*** ruhe has quit IRC15:20
jeblairryanpetrello: i'm hacking on a fix for a production problem we've been having right now, but i will make it a point to review it today if the rest of the team hasn't taken care of it15:21
ryanpetrellothanks15:21
ryanpetrellothis obviously takes a back seat :)15:21
*** ruhe has joined #openstack-infra15:22
openstackgerritgordon chung proposed a change to openstack/requirements: assign a min version to pycadf  https://review.openstack.org/4292315:23
*** reed has joined #openstack-infra15:23
*** dina_belova has quit IRC15:24
*** sridevi has quit IRC15:24
mordredryanpetrello: done15:30
ryanpetrellojeblair: Monty approved it, thanks15:30
mordredjeblair: anything I can do to help you?15:30
ryanpetrello(thanks)15:30
openstackgerritA change was merged to openstack-infra/config: Add WSME to StackForge.  https://review.openstack.org/4268515:36
*** nayward has quit IRC15:39
*** afazekas_no_irq is now known as afazekas15:42
*** thomasbiege has joined #openstack-infra15:42
*** vogxn has joined #openstack-infra15:43
NobodyCamjeblair: seems we have no core members on stackforge/pyghmi; we did before the rename15:45
*** rnirmal has joined #openstack-infra15:46
*** zehicle_at_dell has joined #openstack-infra15:47
mordredNobodyCam: looking15:49
NobodyCamthank you mordred :)15:50
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Rename python-impi acl file to pyghmi  https://review.openstack.org/4293215:51
NobodyCamw00t15:51
mordredNobodyCam: should be fixed soon15:51
*** changbl has quit IRC15:51
NobodyCam:) TY15:51
NobodyCammordred: shouldn't you be burning things about now?15:52
mordredNobodyCam: soon15:52
NobodyCam:)15:52
*** mrodden has quit IRC15:53
*** davidhadas has quit IRC15:54
*** ruhe has quit IRC15:55
openstackgerritA change was merged to openstack-infra/config: Rename python-impi acl file to pyghmi  https://review.openstack.org/4293215:58
*** xBsd has joined #openstack-infra15:59
clarkbmorning16:00
NobodyCamgood morning clarkb16:01
clarkbmordred jeblair: which production issue?16:01
mordredclarkb: I'm assuming the thing from yesterday16:01
*** sridevi has joined #openstack-infra16:02
mordredclarkb: if you have a second, a ton of these: https://review.openstack.org/#/q/watchedby:mordred%2540inaugust.com+-label:CodeReview%253C%253D-1+-label:Verified%253C%253D-1+-label:Approved%253E%253D1++-status:workinprogress+-status:draft+-is:starred+-owner:mordred%2540inaugust.com,n,z16:02
clarkbmordred: which one :) it was like a horrible train wreck16:02
mordredclarkb: could use a second +2 and are trivial changes16:02
*** sridevi has quit IRC16:03
clarkbmordred ok I have a couple things I want to fix while I am thinking of them but can look at those after16:03
mordredclarkb: k. they're not important, but most of them are simple enough to be 'while drinking first cup of coffee' fodder16:03
clarkbmordred jeblair what do you think of adding something like celery.contrib.rdb to zuul for stack traces and remote pdb16:03
mordredoy16:04
mordredsomething about using celery in a project that uses gear seems weird16:04
clarkbI would simplify and vendor it16:04
*** mrodden has joined #openstack-infra16:04
clarkbmordred forget it is celery :) but their contrib.rdb module seems relatively decent and they have tests for it16:04
mordredneat16:05
*** gyee has joined #openstack-infra16:05
mordredthen why not just put celery in requirements?16:05
clarkbwe could do that too... seems heavy for something like a contrib module. I could go either way vendor or require16:06
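For context, celery's remote debugger is normally used like this (a minimal sketch of generic celery usage, not zuul code; it assumes celery is importable):

```python
# Drop into a remote pdb session at a point of interest; rdb binds a local
# TCP port, logs the address, and you attach to it with telnet.
from celery.contrib import rdb


def handle_event(event):
    rdb.set_trace()   # pause here and wait for a telnet connection
    return event
```

Vendoring would mean copying roughly that module (a socket-backed Pdb subclass plus a set_trace helper) into zuul instead of depending on all of celery.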
NobodyCammordred: should that merge have fixed us?16:06
mordredNobodyCam: it'll take a minute16:07
NobodyCamahh ok :) TY16:07
mordredNobodyCam: we have to wait for the git pull cron followed by the puppet agent - so it could be as long as 30 minutes from merge16:07
*** jfriedly has joined #openstack-infra16:08
*** gordc has left #openstack-infra16:08
mordredclarkb: also, your haproxy patch has 3 +2's : https://review.openstack.org/#/c/42784/ so I think whenever you want to land that and ride shotgun, you know, whatever16:09
*** odyssey4me4 has quit IRC16:11
jeblairmordred, clarkb: i am reworking nodepool (as i mentioned yesterday)16:11
*** pabelanger has quit IRC16:12
jeblairclarkb: the celery thing is heavyweight.  i don't think we need a full remote debugger, we just need better logging, and the ability to get a stacktrace if something is stuck...16:13
clarkbjeblair: It needs an update: because the proxy is a single source, we need to bump xinetd limits... i will propose that shortly16:13
*** thomasbiege has quit IRC16:13
pleia2testing 42784 here now16:13
jeblairclarkb: and that's just for a desperate situation -- in reality we should always be able to figure out what's going on from logs.  this is perhaps the first time we've been unable to do that with zuul.  :(16:13
clarkbjeblair: ok, I figured remote debugger would give us that and more, but can just log stacktraces as a start16:14
pleia2clarkb: there are some errors for 42784, investigating and drafting up comment now16:16
clarkbpleia2 thanks. /me -> office16:16
openstackgerritRussell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job  https://review.openstack.org/4289816:22
*** boris-42 has quit IRC16:22
*** dina_belova has joined #openstack-infra16:24
ryanpetrellomordred: how long does it generally take for merged openstack-infra/config projects to show up in github.com/stackforge ?16:26
mordredryanpetrello: usually quicker than this - let me look16:26
ryanpetrellothx16:26
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Make the gitweb links in gerrit point to git.o.o  https://review.openstack.org/4269416:27
*** pabelanger has joined #openstack-infra16:27
*** markmc has quit IRC16:29
clarkbpleia2 try stopping xinetd first. It has port 941816:32
clarkbor rather kick it to pick up the new config16:33
*** nicedice_ has joined #openstack-infra16:34
pleia2clarkb: ah, yeah! it didn't pick up the new config, restarting it then starting haproxy is fine16:34
*** xBsd has quit IRC16:36
*** psedlak has quit IRC16:36
clarkbcool, I will encode into puppet16:37
*** adalbas has joined #openstack-infra16:38
*** dina_belova has quit IRC16:41
*** pycabrera has joined #openstack-infra16:42
*** nati_ueno has joined #openstack-infra16:42
*** kgriffs has joined #openstack-infra16:42
*** nati_ueno has joined #openstack-infra16:43
*** pblaho has quit IRC16:43
pleia2having some trouble getting it to clone with haproxy enabled, browsing logs16:43
kgriffshey guys, Kurt here from the Marconi team. We'd like to enable logging and/or meetbot for #openstack-marconi - what's the recommended way to do this?16:43
kgriffshost it ourselves, or is there a shared bot?16:43
*** cppcabrera has quit IRC16:43
*** pycabrera is now known as cppcabrera16:43
pleia2kgriffs: there is a shared bot, hang on, I'll grab a recent review as an example16:44
annegentlemodules/gerritbot/files/gerritbot_channel_config.yaml16:44
annegentlekgriffs: I think that's it. ^^16:44
*** alexpilotti has quit IRC16:44
pleia2kgriffs: https://review.openstack.org/#/c/41512/16:44
annegentlepleia2: mine's not so recent, but https://review.openstack.org/#/c/21696/16:44
annegentleheh16:44
pleia2for logging it's modules/openstack_project/manifests/eavesdrop.pp16:44
pleia2not gerritbot16:44
kgriffscool, thanks!16:45
pleia2gerritbot is the one that tells you updates in reviews merges and things :)16:45
kgriffsactually, I think we are in gerritbot16:45
cppcabrerayup, we have gerritbot running as of yesterday. :D16:45
ryanpetrellomordred: that seemed to work, thx :)16:45
pleia2kgriffs: once it's in eavesdrop you get logs up on http://eavesdrop.openstack.org/16:45
ryanpetrelloI noticed, however that one of the groups was created - https://review.openstack.org/#/admin/groups/202,members - while the other, wsme-ptl, wasn't16:46
dhellmannmordred: I am, too. I was going to wait the number of days specified in https://wiki.openstack.org/wiki/Governance/Approved/CoreDevProcess but I don't have16:46
kgriffspleia2: excellent16:46
dhellmannmordred: added Alex_Gaynor to requirements-core group in gerrit16:48
openstackgerritClark Boylan proposed a change to openstack-infra/config: Proxy git-daemon with haproxy.  https://review.openstack.org/4278416:48
clarkbpleia2: ^ slightly updated. You may want to try those settings as the xinetd ACLs are slightly relaxed to be more friendly to haproxy16:48
pleia2clarkb: great, thanks16:49
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Rework run_shell_command  https://review.openstack.org/4233716:49
openstackgerritMonty Taylor proposed a change to openstack-dev/pbr: Use wheel by default  https://review.openstack.org/4125516:49
mordredryanpetrello: you are now in wsme-core, so you should be able to add other people16:51
mordredas you see fit16:51
mordredryanpetrello: poking wsme-ptl16:51
*** SlickNik has quit IRC16:52
ryanpetrelloawesome, and *thank you*16:52
mordredNobodyCam: you should be set16:52
*** SlickNik has joined #openstack-infra16:52
mordredryanpetrello: I'm excited to have wsme moved in!16:52
*** alexpilotti has joined #openstack-infra16:52
dhellmannmordred: cdevienne is looking forward to having more contributors :-)16:53
NobodyCammordred: Thank you !!16:53
mordreddhellmann: :)16:53
*** kgriffs has left #openstack-infra16:53
*** afazekas has quit IRC16:54
mordreddhellmann: while you're here, could I get a second +2 on https://review.openstack.org/#/c/42515/ ? I have another patch that's wanting it and I'm trying to clear as much of my outstanding niggly stuff before I am out today16:54
dhellmannmordred: sure, looking now16:54
mordreddhellmann: (there's two other in requirements that could probably use love as well)16:54
dhellmannmordred: I've got a standup in 3 minutes, but after that can look at anything you'd like reviewed16:56
clarkbpleia2: anything I can do to help testing/debug git-daemon?16:57
openstackgerritAlejandro Cabrera proposed a change to openstack-infra/config: feat: add marconi channel to eavesdrop  https://review.openstack.org/4295616:57
pleia2clarkb: the patch helps us stop losing the puppet lottery (xinetd should have to look at the file it's subscribed to first before the haproxy stuff happens), but I'm still unable to clone from git:// with it enabled; looking for haproxy-related logs now17:00
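The ordering fix is the usual Puppet subscribe relationship between the xinetd service and its config file, so the service restarts whenever the file changes and before anything layered on top of it. A minimal sketch (the file path and template name are assumptions, not the real module layout):

```puppet
file { '/etc/xinetd.d/git-daemon':
  ensure  => present,
  content => template('openstack_project/git-daemon.xinetd.erb'),
}

service { 'xinetd':
  ensure    => running,
  enable    => true,
  # Restart xinetd whenever its git-daemon config is rewritten.
  subscribe => File['/etc/xinetd.d/git-daemon'],
}
```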
*** ladquin has joined #openstack-infra17:00
*** fbo is now known as fbo_away17:01
*** jerryz has joined #openstack-infra17:04
*** morganfainberg|a is now known as morganfainberg17:04
pleia2gosh, looking for issues with git specifically is a fun google-fu problem17:08
*** dprince has quit IRC17:08
morganfainbergAlex_Gaynor: ping17:09
clarkbpleia2: is it like googling for Go?17:14
pleia2yeah, and screen(1) :)17:14
pleia2might be an issue with my test instance though; it doesn't have an fqdn, for one17:14
fungii cannot, for the life of me, figure out how to adjust bugtask metadata for git-review on bug 1179008 (trying to set it to fix-committed for example). tried repeatedly over the past few days and every time i get a launchpad "timeout error..." ideas?17:15
openstackgerritClark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads.  https://review.openstack.org/4295917:15
uvirtbotLaunchpad bug 1179008 in python-neutronclient "rename requires files to standard names" [Medium,In progress] https://launchpad.net/bugs/117900817:15
pleia2and the IP address might show up weird on hpcloud (the local address the machine thinks it has in `ip addr` is not the public address)17:15
* pleia2 manual tweaks17:16
*** vipul is now known as vipul-away17:16
koolhead17pleia2: hi there17:17
clarkbjeblair: ^ 42959 is a bit of a WIP but I figured I would get that out sooner than later. I am still working on testing it (is the easiest way to do that to write a unittest?)17:17
clarkbfungi: it times out for me too. Maybe we attached too many projects to that bug?17:18
pleia2koolhead17: hey, hope you're enjoying your stay in SF :)17:19
*** vipul-away is now known as vipul17:22
koolhead17pleia2: i am. :)17:23
jeblairclarkb: lgtm; you might need to actually run it in order to test it.  also, i think there is something messed up with signals and using the internal gear server.17:23
koolhead17lets catch up sometime over weekend17:23
mordredjeblair: oh lovely17:23
* koolhead17 waves jeblair mordred clarkb & everyone :D17:23
mordredhey koolhead17 - enjoyin SF?17:23
koolhead17yes sir. its great17:24
koolhead17:)17:24
koolhead17i might be in seattle for a day17:25
clarkbkoolhead17: one day is not enough for seattle :P17:25
Alex_Gaynormorganfainberg: pong17:25
koolhead17clarkb: i know :(17:26
clarkbjeblair: is there still a dev zuul that I can use to test within a running system?17:26
koolhead17clarkb: won`t mind coming to portland for beer for few hr though. :D17:26
koolhead17Alex_Gaynor: hi there17:26
morganfainbergAlex_Gaynor: hey, wanted to follow up with you regarding https://review.openstack.org/#/c/42455/ (since you, in theory, could bump it up to a +2 now; btw, gratz on core for requirements)17:27
pleia2clarkb: so netstat tells me git daemon isn't even running when not on the default port, so trying to fix that now17:27
clarkbjeblair: I have at least one small update to that. I realized that a reconfigure will also reconfigure logging, so I am just going to get the logger each time I need to dump stack traces17:27
morganfainbergAlex_Gaynor: see if there were any outstanding concerns, since that's the next blocker for my caching stuff in keystone.17:27
clarkbpleia2: doesn't xinetd fork git-daemon's on demand as connections come in?17:28
*** vogxn has quit IRC17:28
fungihowever xinetd should be listening on that port17:29
Alex_Gaynormorganfainberg: I don't think there are any outstanding concerns, but I'll have to give it a once over before +2ing :) I'll come around in a few minutes to it17:29
*** pcm_ has quit IRC17:29
pleia2clarkb: yeah, but it still should have: :::9418                     :::*                        LISTEN      10606/xinetd17:29
pleia2as fungi says :)17:29
morganfainbergAlex_Gaynor: thanks! i appreciate it :)17:29
clarkbpleia2: haproxy will be 9418, xinetd on 2941817:29
pleia2right, haproxy shows up on 9418 and no xinetd at all17:30
pleia2can't get it to listen on 2941817:30
clarkbweird17:31
* pleia2 confirms it's not selinux17:31
*** pcm_ has joined #openstack-infra17:32
*** SergeyLukjanov has quit IRC17:34
clarkbjeblair: woot, I wrote a small script that sits in a while loop with that signal handler configured and it seems to work17:35
clarkbjeblair: much easier testing that way than getting a complete zuul running17:35
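A standalone test along the lines clarkb describes might look like this (a minimal sketch, not the actual zuul patch): install a SIGUSR2 handler that dumps every thread's current stack, then idle in a loop and poke the process with kill -USR2.

```python
import signal
import sys
import threading
import time
import traceback


def dump_stacks(signum, frame):
    # Map thread idents to names, then print each thread's current stack.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    for ident, stack in sys._current_frames().items():
        print("Thread %s (%s):" % (ident, names.get(ident, "unknown")))
        print("".join(traceback.format_stack(stack)))


if __name__ == "__main__":
    signal.signal(signal.SIGUSR2, dump_stacks)
    while True:
        time.sleep(1)   # idle; run `kill -USR2 <pid>` from another shell
```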
*** cthulhup has quit IRC17:35
pleia2Aug 20 17:36:39 git-vanilla xinetd[10709]: Service git expects port 9418, not 2941817:36
pleia2heh17:36
pleia2dear xinetd, do it anyway17:37
*** mgagne has joined #openstack-infra17:37
*** mgagne has quit IRC17:37
*** mgagne has joined #openstack-infra17:37
openstackgerritAnita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit.  https://review.openstack.org/4296117:42
*** zul has quit IRC17:42
morganfainbergAlex_Gaynor: looks like dhellmann got to it before you.  thanks :)17:46
Alex_Gaynormorganfainberg: okey doke, sorry bout that, I'm writing some scripts to setup swift for some benchmarkming :)17:46
morganfainbergAlex_Gaynor: not a problem man, was just following up with people today about it.  thanks again!17:47
openstackgerritAnita Kuno proposed a change to openstack-dev/hacking: Testing how .html files are rendered by cgit  https://review.openstack.org/4296117:48
clarkbpleia2: maybe we should consider running it as a stand alone daemon?17:48
clarkbpleia2: and rely on haproxy to do the DDoS protection17:48
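The split being described puts haproxy on the public git port and git-daemon (via xinetd or standalone) on a local high port, with haproxy's connection limits doing the throttling. A minimal sketch, not the reviewed change, with a made-up maxconn value:

```
frontend git-daemon
    bind *:9418
    mode tcp
    default_backend git-daemon-local

backend git-daemon-local
    mode tcp
    # Cap concurrent clones so a thundering herd queues at the proxy
    # instead of overwhelming the daemon.
    server gitd 127.0.0.1:29418 maxconn 32
```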
pleia2clarkb: so it looks like xinetd uses /etc/services to determine where it should bind stuff; by patching /etc/services I got it to work, but this seems sub-optimal17:50
pleia2(commented out the 9418 git lines, added ones for 29418)17:51
*** dina_belova has joined #openstack-infra17:52
clarkbpleia2: so cloning works now? its a start :)17:52
pleia2yeah! This is with haproxy running: git clone git://15.185.127.146/openstack-infra/config.git17:53
pleia2browsing git-daemon docs, the /etc/services thing may actually be more git daemon and less xinetd17:54
pleia2so maybe we do need to change /etc/services17:55
*** dina_belova has quit IRC17:57
clarkbok17:58
anteayapleia2: to add to your list of things to do, here is a patch consisting of an .html file I generated with rst2html: https://review.openstack.org/#/c/42961/17:58
anteayalet me know how it looks17:58
clarkbpleia2: that seems hacky though17:59
*** cppcabrera is now known as cppcabrera_afk17:59
pleia2clarkb: yeah, so if we run it stand alone without --inetd we should be able to specify an alternate --port18:00
clarkbpleia2: I like that better18:01
pleia2I am not sure of the best way to do this, as "the centos way" is using xinetd to run services that don't have specific init scripts; git is just a command line "git daemon..."18:03
clarkbpleia2: ubuntu's git daemon package comes with an init script. we could vendor it for centos18:03
clarkbI am sure that the red hat folk in the channel want to beat me after saying that18:03
pleia2hehe18:04
pleia2so we'd just drop it in /etc/init.d/ ? I am really unfamiliar with rh init system stuff18:04
pleia2(well, after tweaking it to work properly, of course)18:05
openstackgerritClark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads.  https://review.openstack.org/4295918:07
clarkbjeblair: ^ that comes with a test. Let me know what you think18:07
clarkbpleia2: yes, dropping it in /etc/init.d/ and having puppet ensure the service is enabled should be sufficient18:07
clarkbassuming that the debian/ubuntu script doesn't have a bunch of debianisms in it that centos won't like18:08
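Run standalone, git daemon takes the port directly, so an init script would ultimately wrap a command along these lines (the paths here are assumptions, not the actual deployment):

```sh
git daemon --detach --syslog --export-all \
    --base-path=/var/lib/git --port=29418 \
    --pid-file=/var/run/git-daemon.pid
```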
*** changbl has joined #openstack-infra18:09
pleia2clarkb: looking now, it does - hard coded paths, /etc/default references, might actually be worth rewriting18:09
pleia2there are useful things I can pull from it though, hacking away18:10
pleia2anteaya: ok, I'll have a look in a little bit18:12
pleia2or use one someone already wrote http://robescriva.com/blog/2009/01/13/git-daemon-init-scripts-on-centos-52/18:13
anteayak thanks18:13
* pleia2 frowns at no license18:14
pleia2ah, easy enough to write own18:14
clarkbpleia2: let me know if there is anything I can do to help18:16
clarkbI half feel like I threw my crazy haproxy idea over the wall >_>18:17
*** cthulhup has joined #openstack-infra18:17
clarkbwas not my intention :)18:17
*** zul has joined #openstack-infra18:17
pleia2no worries, it mostly worked, certainly didn't anticipate it being so cranky about non-standard ports, it shouldn't be like this :)18:18
*** xBsd has joined #openstack-infra18:23
*** melwitt has joined #openstack-infra18:23
*** cthulhup has quit IRC18:24
clarkbpleia2: jeblair: mordred: Worth noting that the g-g-p times with https://git.o.o seem to be better than against review.o.o on centos unittest slaves18:25
clarkbso maybe we should stop worrying too much about git://18:25
pleia2hmm, maybe there is a way I can edit the server_args line to support port18:27
*** vipul is now known as vipul-away18:30
*** vipul-away is now known as vipul18:30
pleia2not so much18:31
reedneed a staging server for activity.openstack.org18:32
pleia2clarkb: maybe; it seems unlikely that, if we point everything at https, there will be enough load on git:// to cause problems18:34
*** arezadr has joined #openstack-infra18:35
jeblairreed: do you want to write the puppet (we can point you to some docs), or do you want someone else to do it?18:39
reedjeblair, send me the puppet stuff, I'd like to learn18:39
reed(is that a good answer or what?)18:39
*** markmcclain has quit IRC18:40
pleia2reed: http://ci.openstack.org/sysadmin.html#adding-a-new-server is a good start :)18:40
anteayaare we waiting for anything specific for this patch: https://review.openstack.org/#/c/38177/ ("Use cgit server instead of github for everything")? There is quite the lineup of green +'s on it18:40
jeblairreed: it's the most perfect answer ever.  :)18:40
* reed admires his most perfect answer ever, sipping coffee18:41
jeblairreed: http://ci.openstack.org/sysadmin.html#adding-a-new-server18:41
jeblairreed: you should actually start at the top of that doc18:41
pleia2anteaya: still working to tune the git server before we throw everything at it18:41
jeblairreed: it has background info, and also instructions on how to test18:41
anteayapleia2: ah, okay18:41
jeblairreed: but the section i pointed to has the actual steps18:41
jeblairreed: and somewhere, there's mrmartin's change to add his staging server18:42
jeblairlooking18:42
reedjeblair, oh, right... I can copy that too18:42
jeblairreed: https://review.openstack.org/#/c/42608/18:42
reedsweet18:42
*** SergeyLukjanov has joined #openstack-infra18:43
jeblairreed, mrmartin: and sorry i haven't reviewed that yet.  it is a high priority, after we get some of the operational issues we've been dealing with under control18:43
*** SergeyLukjanov has quit IRC18:43
jeblair(this week is very busy due to a feature freeze deadline)18:43
reednp, mrmartin is on vacation today anyway18:44
mordreddamn feature freeze18:44
mordredclarkb, pleia2: git-daemon wants us to edit /etc/services to run it on another port?18:45
*** vipul is now known as vipul-away18:46
pleia2mordred: well, inetd does18:46
pleia2if running it from xinetd or using --inetd on the command line, you can't specify --port because it just does an /etc/services lookup and will only use what's in that file18:47
pleia2I vote that this is broken :)18:47
pleia2but it is what it is18:47
*** openstack` has joined #openstack-infra18:51
*** openstack has quit IRC18:51
*** pabelanger has quit IRC18:52
*** openstack` is now known as openstack18:52
*** boris-42 has joined #openstack-infra18:52
*** afazekas has joined #openstack-infra18:53
mordredpleia2: it seems like a very poor design18:55
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: WIP: provider manager  https://review.openstack.org/4297318:56
openstackgerritClark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads.  https://review.openstack.org/4295918:57
jeblairmordred, clarkb: ^ that is my solution to the problems with rate limits we saw yesterday ^.  i also think it's a bit cleaner and more reliable.18:58
clarkbjeblair: I will review after the meeting18:58
jeblairmordred, clarkb: it needs a little more work, and testing with a real provider instead of my fake one, but it's mostly there and worth a general review18:58
clarkbjeblair: the zuul change should be ready for review as well18:59
jeblairclarkb: thanks18:59
jeblairmeeting time!18:59
jeblairalmost18:59
*** AJaeger has joined #openstack-infra19:01
*** pabelanger has joined #openstack-infra19:01
*** mriedem1 has joined #openstack-infra19:02
*** cppcabrera_afk is now known as cppcabrera19:03
*** mriedem has quit IRC19:03
AJaegerHi infra team, I'd like to have some guidance and help on getting the Basic Install guide build now also for openSUSE - and thus on the docs.openstack.org19:05
AJaegerannegentle guided me in https://review.openstack.org/#/c/41777/ to you.19:06
*** thomasbiege1 has joined #openstack-infra19:06
clarkbAJaeger: we are in our weekly meeting currently, so we may be a bit slow to answer, but will catch up after the meeting19:06
AJaegerclarkb, sorry, didn't know. Ok, I'll stay around and let you finish your meeting. Thanks for the quick heads-up.19:06
*** gyee has quit IRC19:07
*** vipul-away is now known as vipul19:07
*** vipul is now known as vipul-away19:07
*** vipul-away is now known as vipul19:07
AJaegerclarkb, btw if I should send an email or use other means, just tell me19:07
clarkbAJaeger: IRc is probably easiest, it will just be maybe an hour before we can really answer your questiosn19:08
AJaegerclarkb, ok, thanks19:08
*** markmcclain has joined #openstack-infra19:15
Alex_GaynorSo the amount of time between when a job finishes on jenkins and when zuul records it as done seems way too large. Are there any known bottlenecks there, and what can be done to improve that?19:15
*** fbo_away is now known as fbo19:15
nati_uenoJenkins reviews on Gerrit got really readable! Nice19:17
*** dprince has joined #openstack-infra19:18
reedjeblair, pleia2: since the activity-staging server needs to have apache and mysql, should I draw inspiration from static.pp for the include::apache and various mods??19:19
*** kiall_ has joined #openstack-infra19:20
pleia2reed: yes19:21
reedcool19:21
jeblairAlex_Gaynor: link to an example change?19:22
mordrednati_ueno: thanks! (jeblair did it)19:22
*** vipul is now known as vipul-away19:22
nati_uenojeblair: Thanks!19:23
*** nati_ueno has quit IRC19:26
*** jerryz has quit IRC19:26
*** HenryG has joined #openstack-infra19:26
*** cthulhup has joined #openstack-infra19:26
Alex_Gaynorjeblair: just random ones I'm noticing as they happen19:27
*** nati_ueno has joined #openstack-infra19:30
*** gordc has joined #openstack-infra19:34
*** thomasbiege1 has quit IRC19:34
*** cthulhup has quit IRC19:37
*** cthulhup has joined #openstack-infra19:41
*** xBsd has quit IRC19:42
jeblairAlex_Gaynor: don't forget about severed heads;19:43
*** vipul-away is now known as vipul19:43
jeblairAlex_Gaynor: the head of the queue was just severed because it failed a test, but it's still running its tests and won't report until they are done19:43
jeblairAlex_Gaynor: (scroll to the bottom of the gate queue to see it)19:43
Alex_Gaynorjeblair: so the case I was looking at was the top item in the gate queue19:44
Alex_Gaynors/queue/pipeline19:44
russellbbtw, i put up this change earlier today to help free up some jenkins resources over the next couple weeks: https://review.openstack.org/#/c/42898/19:45
*** zul has quit IRC19:47
SlickNikhey guys.19:49
SlickNikjust wanted to report in that review.openstack.org is being much slower than usual.19:50
fungiSlickNik: yes, it's being used much more than usual19:51
clarkbSlickNik: yup, it is getting bogged down by all of the testing to test all of your code :) I think we just agreed to merge a change that will hopefully alleviate some of this19:51
clarkbjeblair: do you want to force merge that change or should I just go ahead and do it?19:51
jeblairclarkb: i'll do it19:51
fungiSlickNik: with the icehouse feature freeze looming, lots of people are trying to submit/review/merge much more code volume than usual19:51
openstackgerritA change was merged to openstack-infra/devstack-gate: Use git.openstack.org as origin  https://review.openstack.org/4269319:52
clarkbjeblair: thanks19:52
SlickNikCool, thanks! Understandable with the FF looming.19:52
SlickNikAnd thanks for being on top of it (as usual).19:52
SlickNikCheers.21:53
clarkbSlickNik: in the meantime you will probably find using git review -d and the gerrit ssh interface to be a little more responsive19:53
clarkband do your reviews locally (not sure if you can do inline comments this way, but otherwise it should work)19:53
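The commands being suggested look roughly like this; the change and patchset numbers are placeholders (42784 just happens to be a change mentioned earlier in this log):

```sh
# fetch a change locally instead of reading it in the web UI
git review -d 42784

# leave a vote over the Gerrit SSH API
ssh -p 29418 review.openstack.org gerrit review --code-review +1 42784,1
```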
openstackgerritAnne Gentle proposed a change to openstack-infra/config: Ensure that the release.path.name is set for the Block Storage  https://review.openstack.org/4298419:54
*** afazekas has quit IRC19:54
ryanpetrelloanybody know if there's a generalized sphinx upload hook for pythonhosted.org ?19:54
pleia2clarkb: I'm heading out to lunch in a couple minutes (might run a bit long), will finish up init script upon my return!19:54
ryanpetrellothat does e.g., http://pythonhosted.org/an_example_pypi_project/buildanduploadsphinx.html19:54
ryanpetrellosimilar to what the rtfd hook does, but uploads directly to pythonhosted.org?19:55
ryanpetrelloif not, I'd be glad to experiment in writing one, just wanting to make sure it doesn't already exist...19:55
*** markmc has joined #openstack-infra19:55
mordredryanpetrello: we have not made one19:55
ryanpetrelloI wonder if doc_upload has the same permissions as maintainer roles do19:56
mordredat some point, I'd love to get a good general design/direction around rtfd/pythonhosted/docs.o.o19:56
ryanpetrelloi.e., if you're a maintainer, you can upload docs19:56
mordreddhellmann, annegentle ^^19:56
mordredryanpetrello: also, look at how we do pypi-upload19:56
clarkbryanpetrello: note we don't use setup.py to upload stuff to pypi because ugh. Instead we have a wrapper around curl to do it so that we don't have to run arbitrary code19:56
annegentlemordred: I've met with Todd Morey in the last couple weeks to try to synch with www for design19:56
jeblairryanpetrello: i looked into it briefly19:56
mordredryanpetrello: it's probably more directly related to how we'd need to upload docs to pypi19:56
annegentlemordred: Sphinx does work well for dev docs19:56
jeblairryanpetrello: it can be done by uploading a zipfile19:56
clarkbAJaeger: still around?19:56
AJaegerclarkb, Yes.19:57
jeblairryanpetrello: so basically, it would be like the pypi-upload job19:57
mordredannegentle: main question is - which of the three available locations should we automatically upload to?19:57
mordredannegentle: or - should we upload to all of them?19:57
annegentlemordred: ah19:57
clarkbAJaeger: ok, give me a quick minute to settle back into doing stuff and I will do my best to answer your questions about new doc jobs19:57
annegentlemordred: one place.19:57
dhellmannmordred: we're looking at pythonhosted for wsme because that's one of the places it is already using19:57
*** dina_belova has joined #openstack-infra19:57
ryanpetrellowhy not as many as you specify via hooks?19:57
*** SergeyLukjanov has joined #openstack-infra19:58
dhellmannmy preference is for rtfd.org, because that's what most people are doing for new projects19:58
ryanpetrelloif you elect pythonhosted vs rtfd19:58
annegentleryanpetrello: why clutter the internet? :)19:58
ryanpetrellothe submission process for those is quite different19:58
dhellmannannegentle: +119:58
ryanpetrellono, I agree19:58
annegentledhellmann: my issue with rtfd is we need the GA info to make good decisions about docs19:58
jeblairopenstack projects should have their docs uploaded to docs.openstack.org19:58
ryanpetrellojust saying we should give folks the flexibility to choose19:58
dhellmannfor openstack stuff, I think we should just host it ourselves19:58
annegentlejeblair: yes19:58
jeblairstackforge projects can do whatever they want19:58
mordredsure19:58
jeblairand we do give them the flexibility to do that right now.19:58
dhellmannannegentle: right, this would just be for third-party or stackforge stuff19:58
annegentlejeblair: sure19:58
ryanpetrelloright, Doug and I are mostly referring to stackforge in this context19:58
annegentledhellmann: ok19:58
*** ^demon has joined #openstack-infra19:59
*** ^demon has joined #openstack-infra19:59
ryanpetrellojust suggesting that stackforge folks may find a "auto-upload to pythonhosted.org on release" useful19:59
ryanpetrellothey currently have this for rtfd19:59
ryanpetrellojust considering another option19:59
dhellmannyep19:59
dhellmannI think we should allow pythonhosted, but encourage rtfd where possible19:59
annegentleryanpetrello: ok. nice that it happens on upload20:00
ryanpetrello+120:00
mordred++20:00
annegentleryanpetrello: but there are good reasons to ci docs20:00
annegentleI'd probably encourage continuous publishing20:00
clarkbAJaeger: we configure all of our jenkins jobs using the Jenkins Job Builder, http://ci.openstack.org/jjb.html20:00
ryanpetrellosure, s/on release/whenever is applicable20:00
dhellmannannegentle: good point20:00
ryanpetrellocontinuous, if it's right for your project/preference20:01
*** lcestari has quit IRC20:01
clarkbAJaeger: that page is a good starting point for learning how JJB works. With the help of that page you should be able to grab an existing doc job that does something similar to what you want and copy pasta as needed without losing too much understanding of what is going on20:01
*** ^d has quit IRC20:01
clarkbAJaeger: then the second thing you need to do is tell zuul to run that jenkins job when you need it to be run20:02
*** mikal has joined #openstack-infra20:02
clarkbAJaeger: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/zuul/layout.yaml is where you do that. http://ci.openstack.org/zuul.html has a brief zuul intro and links to more in-depth docs20:03
clarkbAJaeger: so from a super high level your change will have two parts. 1. add job to jenkins with JJB and 2. tell zuul to run new job in layout.yaml20:03
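As a rough sketch of those two parts (the job name and build step are placeholders, not the real openstack-manuals configuration):

```yaml
# Part 1, Jenkins Job Builder: define the job.
- job:
    name: gate-openstack-manuals-install-guide-opensuse
    builders:
      - shell: 'tools/build-install-guide.sh opensuse'
```

```yaml
# Part 2, zuul layout.yaml: run the new job in the project's check pipeline.
projects:
  - name: openstack/openstack-manuals
    check:
      - gate-openstack-manuals-install-guide-opensuse
```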
AJaegerclarkb: Thanks, I'll check how the current guides are built and see whether I need to duplicate that setup or can somehow hook into it...20:04
*** zehicle_at_dell has quit IRC20:05
openstackgerritClark Boylan proposed a change to openstack-infra/config: Make mysql backup crons quiet.  https://review.openstack.org/4278520:06
clarkbjeblair: mordred fungi ^20:06
clarkband now time for reviews20:06
*** mikal has quit IRC20:07
fungiclarkb: lgtm. i'm popping out for lunch and then i'll try to review a few changes before my next meeting20:09
clarkbAJaeger: feel free to ask questions as they arise. I know I gave the high level info dump and wasn't very specific20:10
AJaegerclarkb, that helped a lot - I got the right pointer. I'll propose a change in a few minutes for you to review that I didn't miss anything...20:12
*** mikal has joined #openstack-infra20:13
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Build Basic Install Guide for openSUSE  https://review.openstack.org/4298820:15
*** dmakogon_ has quit IRC20:16
AJaegerclarkb, my feeling is just that I'm missing something. That was too easy ;)20:16
vipulyou guys aware of review.o.o being slow today?20:17
clarkbvipul: yes, we are DDoSing it with the jenkins slaves20:17
vipulooh fun!20:18
clarkbwe recently merged a devstack gate change that will point more tests to git.openstack.org, which will hopefully alleviate the pressure on review.o.o, but we need the currently running tests to flip over before we see the effect20:18
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager  https://review.openstack.org/4297320:19
clarkbvipul: this is the typical pre feature freeze rush that never fails to break something20:21
clarkbvipul: tl;dr you need to write more code during H1 :)20:21
*** pabelanger has quit IRC20:21
vipulclarkb: h1 is for recovering from all the hangovers at the summit :D20:22
*** pcm_ has quit IRC20:22
*** HenryG has quit IRC20:23
*** mikal has quit IRC20:24
jeblairi think jenkins02 is experiencing a similar slowness as before; i've got jstack trying to get a thread dump; it is responding, but very slowly, and it has a bunch of offline nodes sitting around.20:27
jeblairclarkb, fungi: ^ i uploaded a polished version of the providermanager change; i'm about to start live-testing it20:27
clarkbjeblair: ok, it is next up in my queue.20:28
jeblairclarkb, fungi: i think i will also do something similar to serialize jenkins access, and try to deploy both of those together.20:28
openstackgerritClark Boylan proposed a change to openstack-infra/devstack-gate: Replace review.o.o with git.o.o.  https://review.openstack.org/4298920:28
clarkbjeblair: ^ I noticed that needed doing20:28
jeblairclarkb: no it doesn't; we don't use those anymore20:29
clarkbjeblair: well it needs doing at least for the README20:29
clarkbjeblair: the image building is elsewhere; maybe there should be a cleanup d-g commit, then do the git stuff on top of it20:29
jeblairclarkb: ok, sure, we can change the readme.  i'm pretty sure the image building, whether run manually or nightly, is not causing current performance problems, so i deferred it20:30
jeblairsimilarly, i have deferred removing those things until there's a replacement20:31
jeblair(for manually running)20:31
jeblairclarkb: but can we at least avoid adding that to the gate queue until it's not busy?20:31
clarkbjeblair: ya20:31
clarkbI will WIP it20:32
*** pabelanger has joined #openstack-infra20:32
openstackgerritJim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 1.1.0.  https://review.openstack.org/4299120:32
clarkbrussellb: https://jenkins01.openstack.org/job/gate-nova-python26/1366/console seems to be a fairly frequent test failure20:32
clarkbjeblair: FYI ^ I think that has semi-broken the gate (only nova runs that test so only nova is affected)20:33
russellbboris-42: ^^^20:33
russellbboris-42: can you help dig into that?  since you (and your team) have been working most in that area20:34
mordredanteaya: when you get a moment, would you look at the scrollback in the meeting channel20:35
mordredanteaya: and the discussion of setting up a repo that we'll use for voting for TC motions?20:36
markmcrussellb, I think someone from his team submitted a patch20:36
* markmc digs it up20:36
anteayamordred: I was following some of that20:36
*** kiall_ is now known as Kiall20:36
boris-42russellb I am here20:36
anteayamordred: am I the resource volunteered for duty?20:36
markmcrussellb, it was victor, https://review.openstack.org/#/c/42649/20:36
anteaya:D20:36
mordredanteaya: yup20:37
russellbmarkmc: nicedice_20:37
anteayaokey dokey smokey20:37
russellberr, nice.20:37
boris-42russellb yeah this is already solved20:37
mordredanteaya: you know, if you want :)20:37
anteayayeah yeah yeah20:37
russellbclarkb: looks like we have a patch up for that ... need to get it reviewed/merged though20:37
anteayaso the way I understand it, I go back through the TC meeting logs and pull out past decisions20:37
clarkbrussellb: markmc: great. Note that any nova changes approved before that one probably won't merge20:37
*** mikal has joined #openstack-infra20:38
anteayaand offer them up as patches to the repo20:38
anteayathat I am about to create20:38
anteayato gather the history20:38
markmcclarkb, it only happens like 1 in every 5 times from what I've seen20:38
anteayais that one of the tasks, apart from creating the repo itself20:38
anteayattx: what do we want to call this TC decision repo?20:39
anteayaat the very least, I will learn a lot about the history of the TC20:39
russellbclarkb: that change is approved now20:42
boris-42russellb nice20:42
boris-42russellb thanks20:42
boris-42=)20:42
*** SergeyLukjanov has quit IRC20:42
russellbboris-42: yep, np20:42
jeblairclarkb, mordred: i think jstack is stuck in its deadlock detection.20:43
mordredjeblair: wow20:44
*** dina_belova has quit IRC20:44
*** cthulhup has quit IRC20:45
*** cthulhup has joined #openstack-infra20:45
clarkbload on git.o.o is ~18 and under 1 on review.o.o20:46
clarkbjeblair: that is an impressive feat20:46
openstackgerritRussell Bryant proposed a change to openstack-infra/config: Disable tempest in the cells job  https://review.openstack.org/4289820:46
jeblairi'm attaching the debugger and will try that way20:46
mordredclarkb: woot!20:46
clarkbjeblair: I am working my way through the nodepool client manager change right now20:46
*** cppcabrera is now known as cppcabrera_afk20:47
notmynamemordred: tags for getting pbr with swift...got a few minutes?20:48
lifelessAlex_Gaynor: do you know, is there a way to get a unicode string directly from a memoryview, rather than copying to a bytestring, then decoding to a unicode string?20:48
jeblairclarkb, mordred: i think it's slow because there are so many nodes still attached to it (which is true because it is slow)20:48
jeblairmordred: got a few mins?20:49
Alex_Gaynorlifeless: apparently! codecs.utf_8_codecs(memoryview) seems to work (for example)20:49
*** cthulhup has quit IRC20:50
lifelessAlex_Gaynor: ahha, thanks!20:50
jeblairmordred: i think jenkins02 needs to be stopped, and have all the nodes removed from its config.xml; all related nodes deleted from nova, and then started again.20:50
clarkbjeblair: that is no good. What do you think about an artificial throttle in zuul or nodepool, so that we can at least prevent it from overrunning itself20:50
lifelessAlex_Gaynor: though 2.7's codecs module has no utf_8_codecs attribute20:51
jeblairclarkb: i mentioned that i wanted to serialize access to jenkins, do you want something else?20:51
Alex_Gaynorlifeless: codecs.utf_8_decode(m)20:52
dprincejeblair: question on Gerrit comment syntax. I noticed recently that 'SUCCESS' is green.... and 'FAILED' is red. Is that HTML formatting that does that? or some sort of magic gerrit syntax you'd need to use?20:52
lifelessAlex_Gaynor: ahha! cool.20:52
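For the record, that call decodes straight from the buffer and returns the decoded text together with the number of bytes consumed, for example:

```python
import codecs

buf = memoryview(b"caf\xc3\xa9 latte")
text, consumed = codecs.utf_8_decode(buf)
# text == u"café latte", consumed == 11; no intermediate bytestring copy.
```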
clarkbjeblair: I think serializing access to jenkins is part of the answer; doing more, like adding a configurable queue length so that anything going over some limit blocks, would help too20:52
jeblairclarkb: if we wanted the whole system to be slow, we could have done nothing.  it was self limiting earlier.20:53
jeblairclarkb: and still is20:53
jeblairclarkb: the point is to actually be able to run all of the tests we need to run20:53
jeblairclarkb: that's why we're scaling jenkins horizontally and adding more masters20:53
clarkbjeblair: I am not suggesting to make it slow, you can still make the limit arbitrarily high20:53
jeblairclarkb: what are you suggesting then?20:54
clarkbjeblair: but in cases like this we would be much better off putting a limit on how fast it can be20:54
jeblairclarkb: how fast what?20:54
clarkbjeblair: jobs per hour20:54
jeblairclarkb: are you talking about zuul?20:54
clarkbjeblair: or nodepool concurrent nodes20:54
clarkbjeblair: I am thinking of zuul and/or nodepool. They can both be throttled to take some of the pressure off of jenkins and gerrit20:55
jeblairclarkb: okay, so we just merged a change that will cause tests to not touch gerrit20:55
*** mikal has quit IRC20:56
jeblairclarkb: zuul accesses gerrit serially when creating its changes20:56
jeblairideally, we have just done quite a lot to take the pressure off of gerrit20:56
jeblairclarkb: so what pressure on gerrit do you want to relieve?20:56
ttxanteaya: openstack/governance ideally, though it's a bit overreaching20:56
clarkbjeblair: our major problem today and yesterday appears to be a thundering herd. If we can let them thunder at a tunable pace we should be able to rein it in when jenkins runs faster than its shoes can move20:56
jeblairclarkb: i think you are over-generalizing20:56
ttxbut openstack/tech-governance is a mouthful20:56
anteayattx: I'm fine with openstack/governance20:57
clarkbjeblair: I am trying to be generic, because next milestone it will be some other DDoS20:57
ttxand it's not as if we never renamed any project in the past20:57
anteayado we want it in the openstack/ namespace or the openstack-infra/ namespace do you think, ttx?20:57
ttxwell if one thing is openstack/, that would be it20:57
anteayavery good20:57
clarkbjeblair: and a generic pace enforcement will help us at least keep moving rather than needing emergency fixes to keep going20:57
* anteaya goes back to looking up docs for creating a new git repo20:58
jeblairclarkb: overgeneralizing a problem does not help provide a solution.  how do you write a patch to "don't cause problems"?20:58
jeblairclarkb: your second point20:58
jeblairclarkb: pressure on jenknis20:58
ttxmordred: your cookiecutter thing looks good -- looks like an automated mordred-goes-to-fix-your-project merge20:59
jeblairclarkb: we have seen that jenkins can run a lot of jobs, and have a lot of slaves20:59
anteayattx: I don't have any expectation of any gate or check tests for openstack/governance20:59
jeblairclarkb: but right now, we've seen issues with slaves not being removed from jenkins20:59
ttxanteaya: we could enforce some common template20:59
jeblairclarkb: i don't know why that is.  there may be a bug in the gearman-plugin.  the 'thundering herd' of deleted nodes may just be too much contention for that kind of operation.21:00
ttxanteaya: but not yet maybe21:00
anteayattx: got one in mind?21:00
anteayattx: very good21:00
jeblairclarkb: and as you observed earlier, jenkins does not do well if you do lots of things at once21:00
clarkbjeblair: ya21:00
jeblairclarkb: so serializing access to adding and removing nodes from jenkins may help with that21:00
jeblairclarkb: at least, we might get a better idea of what is going on21:00
clarkbjeblair: I am all for fixing the specific bottlenecks because I want to be able to do as many operations as possible. But I also think having some way of pulling back so that everything doesn't shut down is useful21:01
jeblairclarkb: anyway, you've had some good suggestions, and i'm trying to implement solutions for the problems we've seen based on them21:01
jeblairclarkb: that sounds great.  i have no idea what you're talking about though.21:01
anteayattx: who do you want as core for openstack/governance?21:02
clarkbjeblair: I am not sure where we would want the control to go (probably in zuul) but being able to tell it to launch at most 300 jobs per hour, or some number of jobs per minute/second, etc. will be useful so that in cases like now we can continue to run jenkins jobs without making the problem worse.21:03
jeblairclarkb: why would we want to do that?  what problem does that solve?21:03
ttxanteaya: that's where it gets tricky. You want +2/-2 for TC members. And APRV for the chair (me)21:03
* ttx is in a meeting21:03
clarkbjeblair: I also see that as being useful so that it can be tied to a PID loop (or similar) where it automatically increases the limit and decreases it based on job throughput or some other metric21:03
anteayattx: okay, sorry more questions later21:04
clarkbjeblair: right now it would potentially give jenkins a chance to catch back up on its own21:04
jeblairclarkb: catch up with what?21:04
clarkbjeblair: deleting nodes21:04
jeblairclarkb: oh, i don't think that has anything to do with it21:04
clarkbjeblair: or $otheroperation that has slowed to a crawl21:04
jeblairclarkb: it can't delete nodes because it's deleting nodes21:04
jeblairclarkb: not because it's running jobs21:04
jeblairclarkb: there _are_ things we can control to tune this whole system, but we need to tune the right things.21:04
*** gyee has joined #openstack-infra21:05
*** pblaho has joined #openstack-infra21:05
jeblairclarkb: if you want to rate-limit starting or stopping jobs, that can be done with zuul and gearman, in how they dispatch jobs21:05
jeblairclarkb: but setting an arbitrary jobs-per-hour limit doesn't address an actual problem.21:05
clarkbjeblair: right, I see it as a tool to help implement proper bottleneck fixes21:06
jeblairclarkb: i really don't think it will help21:06
jeblairclarkb: you're creating and tuning a parameter that has nothing to do with the systems that are actually running21:07
clarkbbut it is a parameter that influences everything21:07
jeblairclarkb: for instance, it would do nothing to prevent mass simultaneous deletions of nodes, which is an ACTUAL problem21:07
*** nati_ueno has quit IRC21:07
jeblair(or at least seems to be)21:07
*** melwitt has quit IRC21:08
*** melwitt1 has joined #openstack-infra21:08
clarkbjust noticed that the zuul status timers don't do hours properly...21:08
*** nati_ueno has joined #openstack-infra21:08
clarkbjeblair: but it would reduce the number of nodes that would be deleted together21:08
jeblairclarkb: no, the fix that i'm trying to write right now will do that21:09
jeblairclarkb: it will delete only one node from a jenkins at a time21:09
jeblairclarkb: why would you want to try to fix that another way?21:09
clarkbI am not suggesting this as a fix21:09
jeblairclarkb: what are you suggesting?21:09
clarkbyou would still want to fix that particular problem with the change you are writing21:09
clarkbjeblair: I am suggesting that we have some way of slowing everything down to usable levels while you write that fix21:10
*** rfolco has quit IRC21:10
clarkbwe are very spiky and the ability to smooth out really big spikes will help in fixing the fallout21:10
jeblairclarkb: the fix i want to write will do that?  why don't i just go write that instead of something else that won't fix it?21:11
clarkbbecause next week or during icehouse freeze we will run into similar yet different problems21:11
*** cppcabrera_afk is now known as cppcabrera21:14
*** fbo is now known as fbo_away21:16
jeblairmordred, fungi: ping21:17
mordredjeblair: pong21:17
jeblairmordred: can you clean up jenkins02?21:17
mordredjeblair: yes. is there a description of the problem in the scrollback?21:18
*** vipul is now known as vipul-away21:18
jeblairmordred: yes21:18
mordredjeblair: great. I will find it21:18
jeblairmordred: thanks21:18
*** vipul-away is now known as vipul21:18
mordredttx: next year, can we move the nova FF one week prior? having me be only partially here due to burningman prep is not fantastic21:18
mordredjeblair: oh wow. ok. force stop ok yeah?21:19
jeblairmordred: yep21:19
mordredstopping21:20
mordredbtw - salt-master has cpu pegged on puppetmaster - I'm going to restart it21:20
jeblairmordred: i thought we stopped all the minions?  maybe stop the master too.21:21
mordredgreat21:21
clarkbwe should make a second pass at cleaning up the salt stuff after featurefreeze21:22
clarkbI believe the minions are still going crazy after the ssh thing21:22
clarkbs/ssh/crypto/21:22
jeblairoh, we didn't stop them?21:22
*** thomasbiege1 has joined #openstack-infra21:22
reedfungi, jeblair, pleia2: let me know if you think it may work https://review.openstack.org/#/c/42998/21:23
clarkbjeblair: we stopped them by hand, then restarted them then ran the rekey thing in hopes it would make them sane again21:24
ttxmordred: next year, you shall scream when I show the schedule on the screen21:24
clarkbjeblair: but it didn't we should probably just disable the minion service on the slaves21:24
mordredttx: yes, I will21:24
clarkbttx: I think he did21:25
mordredclarkb: oh, you're right21:25
mordredI did21:25
mordredI believe I mentioned something like "there's going to be a rush and I'm not going to be much help" if the FF is that week21:25
*** thomasbiege1 has quit IRC21:26
ttxnext year if we separate summit/conf it would happen earlier21:26
mordredperfect21:26
lifelessmordred: when do you leave for burning man21:27
lifeless?21:27
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager  https://review.openstack.org/4297321:27
jeblairclarkb: ^ live-tested21:27
*** prad_ has quit IRC21:27
openstackgerritAnita Kuno proposed a change to openstack-infra/config: Creating/adding the openstack/governance repository  https://review.openstack.org/4300221:27
jeblairclarkb: i'm basically just going to do the same thing for jenkins now.21:27
clarkbjeblair: ok21:28
clarkbjeblair: I have only found one minor issue so far21:28
clarkbjeblair: but it won't cause any bugs21:28
anteayamordred ^21:29
mordredjeblair: I've stopped jenkins02, amd currently working on deleting devstack slaves that were attached to it21:29
anteayaso in addition to this patch (I basically just followed the instructions for stackforge repos) what else do I have to do to create the repo?21:29
mordredlifeless: first thing in the morning21:30
anteayado I just create it on my laptop and push it as an empty repo?21:30
clarkblifeless: too soon21:30
anteayagiving it a .gitreview file21:30
lifelessmordred: ack21:30
*** alexpilotti has quit IRC21:33
mordredjeblair: ERROR: n/a (HTTP 400)21:34
mordredjeblair: is that ^^ a symptom of az1 rate limiting?21:34
Alex_Gaynorso trying to access the jenkins pages for some of the running jobs on the zuul status page is resulting in 502s21:35
jeblairmordred: not that i'm aware; i don't see current rate limiting errors from nodepool21:37
mordredAWESOME21:37
anteayaare there tc meeting logs prior to October 2012? this link has October 2012 through to now but not prior: http://eavesdrop.openstack.org/meetings/tc/21:37
jeblairAlex_Gaynor: mordred is working on that21:37
Alex_Gaynorjeblair: okey doke (as always if I can help in some way, let me know)21:37
mordredjeblair: I'm getting that error a lot from running nova list and nova delete21:37
mordredbtw - ERROR: n/a (HTTP 400) is a TERRIBLE error message21:38
*** dprince has quit IRC21:38
jeblairmordred: OverLimit: This request was rate-limited. (HTTP 413)21:40
mordredok21:40
jeblairmordred: ^ that's what that looks like (and just happened)21:40
mordredfantastic21:41
*** boris-42 has quit IRC21:41
*** cppcabrera has left #openstack-infra21:42
mordredjeblair: I'm not having much luck in deleting the nodes... how important is that part of the step?21:44
jeblairmordred: i think you can skip it, nodepool should be able to clean up21:47
jeblairmordred: it will be slow about it, which probably isn't a bad thing21:47
mordredjeblair: ok. then I'm going to delete the node section from config.xml and restart21:47
jeblairmordred: just the devstack nodes21:48
*** mrmartin has quit IRC21:49
*** prad_ has joined #openstack-infra21:50
*** AJaeger has quit IRC21:51
*** thomasbiege1 has joined #openstack-infra21:51
mordredjeblair: jenkins02 is starting21:55
mordredjeblair: and yes - just the devstack nodes were deleted21:55
*** dina_belova has joined #openstack-infra21:55
*** weshay has quit IRC21:55
* fungi is caught up on scrollback from lunch and reviewing gate-performance-improving changes as a first priority21:57
clarkbjeblair: woo finally got through that change21:58
clarkbjeblair: the only major concern I have is with the default timeout used by the manager code21:58
pleia2oh, my lunch was productive; I got to talk to a redhat admin who thinks that, for our use case, running git daemon as a service makes more sense than xinetd anyway since we're using it so much. I feel less bad about writing the init script now ;)21:59
Alex_GaynorSo is this how it works every feature freeze? We fix the latest round of bottlenecks?21:59
clarkbAlex_Gaynor: yes21:59
*** dina_belova has quit IRC22:00
*** thomasbiege1 has quit IRC22:00
clarkbpleia2: oh good22:00
mordredAlex_Gaynor: each time, the feature freeze has been significantly larger than the previous too22:00
Alex_Gaynormordred: sure, that was the underlying premise of my statement, I didn't mean to imply we weren't making progress :)22:00
clarkbAlex_Gaynor: the number of changes that go in the week before feature freeze is not only much greater than the previous feature freeze but much greater than the weeks before it22:00
*** gyee has quit IRC22:01
*** markmc has quit IRC22:02
notmynamemordred: I can do the needful this afternoon for the tagging process to get pbr working22:02
*** rnirmal has quit IRC22:02
*** mriedem1 has quit IRC22:03
*** markmcclain has quit IRC22:03
mordrednotmyname: ok. from my side, I believe we can do that22:05
notmynamemordred: here's, IMO, a simple thing I think will make it all work22:07
*** burt has quit IRC22:08
mordredooh. I like simple things22:08
notmynamemordred: we tag today with 1.9.2 and consume that version number (ie we won't ever "release" a 1.9.2). This will let pbr do the right thing and create version numbers that sort properly22:08
notmynamemordred: if we have another minor release, it will be 1.9.322:08
notmynamemordred: but most likely will be 1.10.0 anyway22:09
mordredwell... we could do that ...22:09
mordredbut it will cause a 1.9.2 to be released to tarballs.o.o22:09
mordredbut I'm ok with that if you are22:09
openstackgerritClark Boylan proposed a change to openstack-infra/zuul: SIGUSR2 logs stack traces for active threads.  https://review.openstack.org/4295922:09
notmynamemordred: I don't see that as a problem, but do you have an alternate suggestion?22:09
clarkbjeblair: ^ now with documentation22:09
jeblairclarkb: just looked at your comment22:09
mordrednotmyname: tagging 1.9.2-dev - which will not cause a release to be cut22:10
mordredand will map closely to your current version in tree22:10
jeblairon cleanupServer in providerManager...22:10
notmynamemordred: to quote from clay on the pbr patch "Rather than waiting for imminent merge, we really should get a 1.9.2 tag on the origin repo *now* so the git based versioning works in sane fashion for review. I don't really care about 1.9.2-dev which doesn't parse by distutils.version.StrictVersion *anyway*."22:10
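The StrictVersion point quoted there is easy to check (a quick illustration, not part of any patch):

```python
from distutils.version import StrictVersion

StrictVersion("1.9.2")        # parses fine
try:
    StrictVersion("1.9.2-dev")
except ValueError as err:
    print(err)                # invalid version number '1.9.2-dev'
```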
clarkbjeblair: about the timeout value22:10
jeblairclarkb: yeah22:10
mordrednotmyname: ok. I'm sold by that22:11
jeblairclarkb: so the timeout loop is a big loop that runs inside of the thread that is trying to delete the server22:11
notmynamemordred: ya, mostly the last line22:11
notmynamemordred: and if you haven't you should read his full comment on https://review.openstack.org/#/c/28892/22:11
notmynamemordred: but I think we can go forward with a 1.9.2 tag and then merge the patch22:12
jeblairclarkb: inside of that loop, it puts a task on the queue to get the server, and waits for that to complete22:12
jeblairclarkb: so i don't think anything about the timeout value changes22:12
jeblairclarkb: overall, we still wait, er, an hour for the server to be deleted (in a thread that is pretty much dedicated to trying to delete the server)22:12
jeblairclarkb: but that shouldn't affect anything else, other than every 2 seconds, that thread asks the provider thread to check on the server22:13
mordrednotmyname: reading now22:13
notmynamemordred: so I think that leaves it here: I'll approve/merge the pbr patch when I see the 1.9.2 tag on master upstream22:13
jeblairclarkb: (if a lot of servers are being slow to be deleted, everything else about that provider will be slow too, but i think that's desirable.  mostly.)22:13
*** ^demon has quit IRC22:13
*** gyee has joined #openstack-infra22:14
*** ^d has joined #openstack-infra22:14
*** pblaho has quit IRC22:14
clarkbjeblair: will it not prevent other tasks from running? for some reason I thought it would, but that function is called from outside the manager thread and does the poll loop there22:14
*** ^d has quit IRC22:14
*** ^d has joined #openstack-infra22:14
clarkbjeblair: so I think I was concerned about nothing22:14
clarkbRunning the delete task happens in the manager thread, which is quick22:15
clarkbjeblair: I will update my vote22:15
jeblairclarkb: exactly, all of those methods just put a task on the manager's queue, running those tasks happens in the dedicated thread, and all the tasks should be simple 1:1 nova api calls22:15
clarkbjeblair: done22:16
clarkbjeblair: pleia2 http://logs.openstack.org/93/42593/4/gate/gate-grenade-devstack-vm/6de9e45/logs/devstack-gate-setup-workspace-new.txt22:18
anteayattx: when you are around but not in a meeting, here is my first attempt: https://review.openstack.org/#/c/43002/22:18
*** ^d has quit IRC22:19
clarkbjeblair: pleia2: I think that may be replication related22:19
clarkbthough I am not sure because I would've expected git to make that more atomic22:20
mordrednotmyname: ok. yes. I think it's a well-written comment, and I appreciate the willingness to go along.22:20
*** dkliban has quit IRC22:20
mordrednotmyname: do you want me to cut a tag? or do you want to do it?22:20
notmynamemordred: I can't make tags for swift (unless that's changed)22:20
notmynamemordred: if I have the perms, I'd be happy to do it22:21
jeblairclarkb: i agree, it wfm locally22:21
clarkbjeblair: pleia2 http://paste.openstack.org/show/44689/ is what I see in the apache log22:21
mordredttx: you around?22:21
jeblairclarkb: what a strange error22:22
clarkbjeblair: ya, file exists though and has timestamps from days in the past22:22
pleia2that is odd, it's just ssh that replicates so it shouldn't be doing something like deleting it first (huh, would it?)22:22
notmynamemordred: after midnight in paris right now..22:22
mordrednotmyname: ok. I'll just do it22:23
jeblairnotmyname, mordred: he's not in that timezone22:23
notmynameah, ok then :-)22:23
clarkbpleia2: I don't expect it to and the mod time on that dir is from the 13th22:23
notmynamemordred: ok. who has permission to push tags? with the change to pbr is that changing?22:23
notmynamejeblair: clarkb: ^ ?22:23
mordrednotmyname: no - it should be still ttx since it's a server project22:23
notmynameok22:24
mordrednotmyname: the main change is that it won't need to commit to change the version anymore22:24
mordrednotmyname: so the chances of your milestone-proposed branch being any different than master are _REALLY_ low :)22:24
openstackgerritJim Branen proposed a change to openstack/requirements: Allow use of hp3parclient 2.0  https://review.openstack.org/4299122:25
mordrednotmyname: 5c6f0015d56478108a623cf65641a39ea91fc2b5 work for you?22:25
notmynamemordred: confirm. 5c6f0015d56478108a623cf65641a39ea91fc2b522:25
*** changbl has quit IRC22:26
mordrednotmyname: done22:26
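The tagging step itself comes down to something like the following; the commit hash is the one confirmed just above, while the signing flag and the "gerrit" remote name are assumptions:

    # tag the agreed-upon commit as 1.9.2 and push the tag up
    git tag -s -m "swift 1.9.2" 1.9.2 5c6f0015d56478108a623cf65641a39ea91fc2b5
    git push gerrit 1.9.2   # remote name is an assumption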
notmynamemordred: thanks22:27
notmynamemordred: final tests on pbr branch22:27
notmynamerd22:27
clarkbI wonder22:29
*** lbragstad has quit IRC22:29
clarkbjeblair: pleia2 so apache is allowed to read the pack and idx files directly without talking to the git http thing22:33
clarkbjeblair: pleia2 and that is what appears to have failed22:33
*** jungleboyj has joined #openstack-infra22:33
jungleboyjCan anyone answer questions about how the Transifex Translations are being automatically done?22:34
clarkbpleia2: any chance selinux is involved?22:34
clarkbjungleboyj: yes I can, whats up?22:34
jungleboyjclarkb: Awesome.  Thank you!22:35
*** jhesketh has joined #openstack-infra22:35
pleia2clarkb: good question, it shouldn't since everything in /var/lib/git should have the right selinux magic to serve it up to httpd22:35
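A quick way to check whether the pack files actually carry the expected SELinux labels, and to relabel them if not (paths assumed from the discussion; this is a sketch, not what was run):

    # show SELinux contexts on the pack files apache is failing to serve
    ls -lZ /var/lib/git/openstack/neutron.git/objects/pack/ | head

    # relabel everything under /var/lib/git according to policy, verbosely
    restorecon -Rv /var/lib/git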
pleia2clarkb: but this is getting quite far out of my git expertise to understand what is happening git-wise (pack and idx files?)22:36
clarkbpleia2: in .git/objects/pack22:36
jungleboyjclarkb: I am working on Cinder and noticed that we had some English strings that were coming out wrong.  When I look at the .po files for en_US I see that it has a msgstr defined that is either incomplete or altogether wrong.  Trying to figure out the right way to fix that.  I had gone through and removed all the msgstrs (msgstr="") since it doesn't make sense to translate English to English but now I see the latest22:37
mordredjungleboyj: can you defined "coming out wrong" ?22:37
clarkbpleia2: the pack files contain a bunch of object files all compressed together, I believe the idx files tell git where to look in that compressed blob for specific objects22:37
clarkbpleia2: that particular file has been in place since the 13th though22:38
pleia2clarkb: I see, so that doesn't sound to me like anything strange that selinux would have a problem with inside /var/lib/git/22:38
clarkbjungleboyj: can you link to a particular example in a proposed change?22:38
clarkbjungleboyj: and I think the way i18n works it does make sense to translate English to English depending on the locale :)22:39
jungleboyjmordred: I had the string _("Failure creating image %s.  Error %s", vol_id, error) or something like that.  In the .po the msgstr for that was just "Failure creating image" and that was all that was printed to the logs.22:39
lifelessbad translator, no cookie22:39
*** apcruz has quit IRC22:40
*** sandywalsh has quit IRC22:40
* clarkb updates cinder repo22:40
*** shardy is now known as shardy_afk22:40
clarkbpleia2: the normal permissions all look fine. I don't know why else apache would fail to see a dir22:41
*** nijaba has quit IRC22:42
mgagneWith JJB, has anyone had the great idea to use parameterized jobs in job-group?22:42
jungleboyjclarkb: Here is the specific example:  https://review.openstack.org/#/c/40948/2/cinder/locale/en_US/LC_MESSAGES/cinder.po  Line 58322:42
pleia2clarkb: /var/log/audit.log is where selinux logs violations, so you can look there22:43
clarkbpleia2: thanks22:43
jungleboyjmsgid "Failed to copy image to volume: %(reason)s"22:43
jungleboyjmsgstr "Failed to copy image to volume"22:43
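gettext's msgfmt can flag exactly this kind of msgid/msgstr mismatch for entries marked python-format, e.g. a format directive present in the msgid but missing from the msgstr; a sketch using the file path from the review linked above:

    # --check-format errors out when format directives in msgid and msgstr
    # don't match; -o /dev/null discards the compiled catalog, we only want
    # the diagnostics
    msgfmt --check-format -o /dev/null cinder/locale/en_US/LC_MESSAGES/cinder.po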
clarkbjungleboyj: we treat transifex as the source of truth for those msgstrs22:45
clarkbjungleboyj: the old string there may have been a casualty of babel doing a fuzzy translation and not understanding the %(reason)s; I am not actually sure there22:46
jungleboyjclarkb: Ok, well, in the case of Cinder the msgstrs are incomplete or wrong.  Need to figure out how to fix it.  Saw the same thing in other projects too.22:46
clarkbjungleboyj: but for patchset 1 the removal of the msgstr would've come from transifex or the update_catalog that we run prior to updating from transifex22:46
clarkbjungleboyj: yeah, things were wrong at one point because babel allows fuzzy translations by default, we have since disabled that. Let me get you a link to the script that proposes these changes22:47
fungijungleboyj: i have seen translations from the "c" source language to en get extremely stale because nobody is checking them for some projects, so eventually the source strings grow different numbers of format string parameters than the obsolete en versions which should normally be identical22:47
clarkbjungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh22:48
clarkbjungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/propose_translation_update.sh#L46-L55 is the most relevant section. I wonder if this is fallout from when we didn't prevent fuzzy matches22:49
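The relevant knob is Babel's update_catalog command, which can be told not to fill in fuzzy matches; a sketch of the sort of invocation the slave script uses (the exact command line may differ from what is in the script):

    # regenerate the .pot template, then update .po files without fuzzy matching
    python setup.py extract_messages
    python setup.py update_catalog --no-fuzzy-matching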
fungijungleboyj: i did a fairly massive pass through nova some months back to clean up english translations (which basically resulted in me duplicating the source strings)22:49
fungii'm not familiar with what the impact from fuzzy matches might be though22:50
clarkbjungleboyj: from git blame http://paste.openstack.org/show/44691/ that was long enough ago to be when fuzzy matching was allowed so I think that is the issue22:50
*** mikal has joined #openstack-infra22:51
clarkbfungi: jungleboyj: we may want to reseed them all with non fuzzy strings based on what is in transifex to get past the cruft that babel let through initially22:51
*** mikal has quit IRC22:52
*** prad_ has quit IRC22:52
fungii take it there's no way to identify a fuzzy vs. non-fuzzy translation of a string solely from the pofile22:53
*** sandywalsh has joined #openstack-infra22:53
notmynamemordred: patch merged (merging) and email sent to ML22:53
mordrednotmyname: woot!22:53
notmynamemordred: thanks for your help on it22:53
mordrednotmyname: thanks for yours! I believe pbr is much better today than it was originally due to addressing your concerns22:54
*** nijaba has joined #openstack-infra22:54
clarkbfungi: there is the "#, fuzzy" comment, but I think babel may not remove those when it has a non-fuzzy translation22:54
clarkbfungi: which makes it a little painful to work with22:54
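Since the fuzzy flag does appear as a "#, fuzzy" comment on the entry, candidates can at least be listed with standard gettext tooling (a sketch, using the cinder catalog path from above):

    # list only the entries currently flagged fuzzy in the en_US catalog
    msgattrib --only-fuzzy cinder/locale/en_US/LC_MESSAGES/cinder.po

    # or just grep for the flag with a little context
    grep -n -A2 '^#, fuzzy' cinder/locale/en_US/LC_MESSAGES/cinder.po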
jungleboyjclarkb: So, let me make sure that I understand.  There are some old en translations that didn't happen properly because fuzzy matching was allowed.22:54
*** ftcjeff has quit IRC22:55
*** markmcclain has joined #openstack-infra22:55
notmynamemordred: in my email I said, "If you have any issues, just ask Monty. Preferably after 10pm on Tuesdays" ;-)22:55
*** michchap has joined #openstack-infra22:55
mordredclarkb: speaking of i18n, we should get swift on the transifex bandwagon - they already use babel and everything22:55
fungiclarkb: right. unless we actually expect un-fuzzed translations to result in the #fuzzy comment also getting removed, no way to tell just from the translated string itself22:55
mordredclarkb: and their translations are in top level like I sort of want everyone else's to be :)22:55
mordrednotmyname: I look forward to those questions :)22:56
clarkbjungleboyj: correct22:56
jungleboyjclarkb: If that is the case, how can I get fixes for those strings that got fuzzed up?22:56
clarkbjungleboyj: you can translate them in transifex, or I think it is still possible to propose a patch that fixes them, but that may not be the case. I will have to double check that22:57
openstackgerritElizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service  https://review.openstack.org/4301222:57
*** mkirk_ has quit IRC22:58
jungleboyjclarkb: Forgive all the noob questions.  How do I translate them in transifex?22:58
clarkbjungleboyj: https://github.com/openstack-infra/config/blob/master/modules/jenkins/files/slave_scripts/upstream_translation_update.sh#L42-L53 we still push local git contents back to transifex so you can propose a fix in git if you like22:58
*** mkirk_ has joined #openstack-infra22:58
clarkbjungleboyj: I have actually never done it :) but I believe you log into https://transifex.com find the cinder project and then you can either update strings in your browser or use the tx tool22:59
*** gordc has left #openstack-infra22:59
jungleboyjclarkb: Ok.22:59
jungleboyjclarkb: FYI, the pot file doesn't have any msgstrs defined in it.  Will changing the pos make a difference?23:00
clarkbthe pot file is a template, it should not have any msgstrs in it23:00
clarkbthe .po files contain the actual translations23:00
*** rcleere has quit IRC23:01
openstackgerritElizabeth Krumbach Joseph proposed a change to openstack-infra/config: Swap git daemon in xinetd for service  https://review.openstack.org/4301223:01
jungleboyjclarkb: That is what I thought.  So, I would need to actually put the changes in the POs.23:01
*** sgviking has quit IRC23:02
*** dkliban has joined #openstack-infra23:02
clarkbjeblair: pleia2 mordred https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console ugh. I think centos and ubuntu must be sufficiently different that this doesn't work quite right. Or something replication related23:02
clarkbjungleboyj: yup23:02
jungleboyjclarkb: Once I do that, is there something I need to do to get a new transifex import to happen?23:03
*** jpich has quit IRC23:03
clarkbjungleboyj: using transifex's tx tool you can pull the .po files and push them back to transifex if you want to use their workflow23:03
clarkbjungleboyj: we import from transifex once a day per project23:03
clarkbso you don't need anything special it should just happen23:03
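A rough outline of the tx-based workflow being described, assuming a .tx/config already exists in the cinder tree; the flags are from the transifex-client of that era and may have changed since:

    pip install transifex-client        # provides the `tx` command
    tx pull -l en_US                    # fetch the current en_US translations
    # ...edit cinder/locale/en_US/LC_MESSAGES/cinder.po locally...
    tx push -t -l en_US                 # push the corrected translations back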
jungleboyjclarkb: Ok, and you don't recommend clearing out all the english msgstrs ?  Just fix the ones that are wrong?23:04
clarkbjungleboyj: right. as en_US is different than C23:04
jeblairclarkb: yeah, three differences: replication over ssh, operating system, git version23:04
clarkband different than en_UK and so on23:04
jungleboyjclarkb: Ok.  Thank you so much for the help!23:04
pleia2clarkb: I think it's a rewrite problem! pulling that file from /cgit works, but not the direct git.openstack.org/openstack/neutron/... location23:05
clarkbpleia2: interesting23:05
openstackgerritMathieu Gagné proposed a change to openstack-infra/jenkins-job-builder: Job-specific subst. in a job group's job list  https://review.openstack.org/4301323:05
*** mrodden has quit IRC23:06
clarkbpleia2: /cgit will be served by cgit though right?23:07
clarkbpleia2: so possibly completely different processes23:07
pleia2clarkb: right23:07
pleia2but at least the files do exist and are servable by apache somewhere23:07
pleia2might be right about git version weirdness23:08
jeblairclarkb: maybe check if that file exists on disk?23:08
pleia2cgit is serving it23:08
jeblairpleia2: could be cached23:09
pleia2ah23:09
jeblairpleia2: if it exists on disk and apache does not serve it, it's as you say, a rewrite problem23:09
jeblairpleia2: if not, we're back to where we were23:09
clarkbjeblair: the files do exist on disk, at least the ones that I have seen23:09
clarkbs/seen/looked at/23:09
*** sgviking has joined #openstack-infra23:09
jeblairclarkb: does openstack/neutron/objects/pack/pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx ?23:10
clarkbhttps://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L19-L30 for those following along.23:10
clarkbjeblair: checking23:10
clarkbjeblair: yes -r--r--r--. 1 cgit cgit 4488 Aug 20 06:18 pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx23:11
jeblairpleia2: sounds like you're on to something23:12
clarkbjeblair: pleia2 do the RewriteRule and ScriptAlias conflict?23:12
pleia2hmm23:12
clarkboh you know23:13
*** jerryz has joined #openstack-infra23:13
clarkbactually no that can't be it23:13
pleia2the regex for pack|idx seems right23:14
clarkbpleia2: yeah that comes straight from the git http man page iirc23:14
*** dims has quit IRC23:15
*** ken1ohmichi has joined #openstack-infra23:18
*** ryanpetrello has quit IRC23:20
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Add JenkinsManager  https://review.openstack.org/4301423:21
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Add an ssh check periodic task  https://review.openstack.org/4301523:21
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Change credentials-id parameter in config file  https://review.openstack.org/4301623:21
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Reduce timeout when waiting for server deletion  https://review.openstack.org/4301723:21
openstackgerritJames E. Blair proposed a change to openstack-infra/nodepool: Add ProviderManager  https://review.openstack.org/4297323:21
mgagnewhich repo should I clone to test? I was able to clone stackforge/puppet-glance and openstack/python-heatclient without problem23:21
clarkbmgagne: neutron and nova appear to currently be failing fairly frequently according to the logs23:21
mgagneclarkb: is it therefore an intermittent issue?23:22
pleia2clarkb: so can it get to some pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx files?23:23
clarkbmgagne: yes, it seems to be intermittent23:23
pleia2er, .idx files23:23
clarkbpleia2: I am not sure yet, actually let me try getting that file directly23:23
clarkbpleia2: mgagne: this may in part depend on the local state of your repo23:23
mgagneclarkb: I'm cloning from scratch, are tests fetching and checking out a specific ref instead?23:24
clarkbmgagne: tests will clone if the repo doesn't already exist otherwise they will do a remote update to fetch what they are missing23:25
clarkbpleia2: directly fetching one of those neutron files with wget fails. This must've been what you tested before23:25
clarkbpleia2: for whatever reason I thought you tested with a git clone which does work23:25
pleia2clarkb: I just tested via web browser23:26
clarkbpleia2: looking at the vhost, cgit will serve anything not under .*/objects, because ScriptAlias / /usr/libexec/git-core/git-http-backend/ will never be used as we rewrite / to /cgit23:27
clarkbpleia2: oh but we rewrite ^/$ to /cgit so anything like /openstack/foo should go to git-http-backend right?23:28
pleia2clarkb: yeah, I think those rewrite things are not for cgit23:28
*** mrodden has joined #openstack-infra23:28
pleia2clarkb: I think they are just for git-http-backend23:29
pleia2fungi added them in a change to support git-http-backend23:29
*** changbl has joined #openstack-infra23:29
*** dims has joined #openstack-infra23:30
*** HenryG has joined #openstack-infra23:31
jeblairclarkb: ^ the new stack of nodepool changes is in production23:32
fungiyup23:32
jeblairclarkb: (i did reduce that timeout, btw, because i think it was ridiculously large)23:32
fungifrom an hour to...?23:33
*** ken1ohmichi has quit IRC23:33
jeblair10 mins23:33
* fungi nods. sounds sane23:33
jeblairwhich is just, well, large.  :)23:33
fungis/ridiculously//23:33
pleia2clarkb: confirmed, I don't have any of the pack rewrite rules in my test instance and I can download packs via cgit (hi fungi!)23:33
clarkbpleia2: I think it may be an selinux thing23:34
clarkbpleia2: httpd itself will access the git files when they hit the AliasMatches23:35
* fungi retries to grok where the ^/$ rewrite could conflict at all with the git-http-backend cgi scriptalias23:35
clarkbbut httpd runs under a different selinux type23:35
clarkbI am very quickly learning about selinux types so that I can test23:35
jeblairselinux would show that error23:35
jeblairclarkb: look in audit.olg23:35
jeblairlog23:35
clarkbaudit.log was a pain to look at ...23:36
pleia2hah23:36
pleia2can grep for git probably23:36
clarkbbut I think I just get annoyed when there are no timestamps. I will look again23:36
fungiclarkb: well, there are timestamps, you just need to learn to read unixtime directly ;)23:37
clarkbI don't see any AVC messages in audit.log23:38
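Two small helpers when digging through audit.log (ausearch ships with the audit userspace tools; the epoch value below is just an example):

    # audit.log timestamps are epoch seconds; convert one to something readable
    date -d @1377038400

    # search today's audit records for SELinux AVC denials
    ausearch -m avc -ts today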
mgagneclarkb: I think it has to do with the way packs are generated. Could be that they are generated on-the-fly and there are contention issues on git.o.o due to the high volume of clones, fetches, etc.23:38
mgagneclarkb: https://www.kernel.org/pub//software/scm/git/docs/git-update-server-info.html23:39
clarkbmgagne: it seems to know where the files are though, it just can't get them23:39
mgagneclarkb: a curl returns the file? Could it be a caching issue? Or is it a timing issue, where by the time you test the existence of the file, it got generated? Trying to figure out what has been tried/tested.23:41
*** rfolco has joined #openstack-infra23:42
clarkbmgagne: wgetting the file that a jenkins slave failed to fetch fails too, but the file is on disk and has been there for at least hours23:42
clarkbmgagne: https://jenkins01.openstack.org/job/gate-neutron-pep8/434/console has a list of things that can't be fetched23:42
clarkbmgagne: however changing the root of the url to /cgit you are able to get the file23:43
clarkbmgagne: so it is only when apache attempts direct access via https://github.com/openstack-infra/config/blob/master/modules/cgit/templates/git.vhost.erb#L28-L29 that it fails23:43
jeblairfurther evidence the scriptalias is not working: the actual apache error log message says "File does not exist: /var/lib/git/openstack/neutron"23:44
jeblairand that _doesn't_ exist23:44
jeblairbecause it's /var/lib/git/openstack/neutron.git23:44
jeblairso presumably the scriptalias directive to use the smart http server would normally translate that,23:44
clarkboh that may be it23:44
pleia2oh wow, right23:45
jeblairbut it's not, so apache is just trying to serve a simple file23:45
pleia2https://git.openstack.org/openstack/neutron.git/objects/pack/pack-8dd2daf4e48bc336b39e06bcb5612bdc2c7bec7c.idx works!23:46
pleia2nice one jeblair23:46
jeblairbut looking at that, i think we're trying to get apache to just serve the files23:46
jeblairit looks like the aliasmatch directives are intended to take precedence, and then scriptalias catches the rest23:47
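The failure mode can be reproduced with a couple of HEAD requests; the hash is the one from the neutron error above, and the expected status codes follow from what was just observed (without .git apache looks for a path that does not exist, with .git the AliasMatch hits the real repo):

    # expected 404: apache looks for /var/lib/git/openstack/neutron, which does not exist
    curl -sI https://git.openstack.org/openstack/neutron/objects/pack/pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx

    # expected 200: with the .git suffix the AliasMatch maps to the on-disk repo
    curl -sI https://git.openstack.org/openstack/neutron.git/objects/pack/pack-de6d5d31c8684408cf90392a88fb0176b4ca8f01.idx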
mroddenany idea why i'm seeing this in my tox runs? http://paste.openstack.org/show/44692/23:47
mroddencannot import setuptools23:47
clarkbjeblair: the config comes from https://www.kernel.org/pub/software/scm/git/docs/git-http-backend.html23:47
mroddenbut it actually installs setuptools 1.0 above...23:47
jeblairclarkb: yeah, and it's the same as on review23:47
*** mriedem has joined #openstack-infra23:48
jeblairclarkb: what if the git smart http server is providing the wrong urls?23:48
jeblair(git version difference)23:48
clarkbjeblair: could be23:49
mgagneGIT_PROJECT_ROOT has a trailing slash23:49
mgagnecould it be?23:49
clarkbmrodden: the uninstall of distribute that happens first is causing the problem I think23:49
mgagnedoc doesn't show/use trailing slash23:49
clarkbmrodden: try updating tox?23:50
pleia2mgagne: perhaps, maybe if it has a trailing slash it does assume neutron/ and won't expand to neutron.git/23:50
mroddenclarkb: ok i'm on 1.423:50
mrodden1.4.2 i think23:50
clarkbthere is a trailing slash on review.o.o, but I can go ahead and update it on git.o.o and restart apache to check23:51
mroddenwow they have 1.6.0 out now...23:51
clarkbmrodden: there has been a lot of churn around setuptools and distribute merging23:51
clarkbmrodden: so there are a bunch of updates to those tools23:51
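The usual workaround during that setuptools/distribute churn was to upgrade the tooling and rebuild the virtualenvs from scratch; a sketch, with the env name standing in for whichever one was failing:

    pip install -U tox virtualenv   # pick up releases that know about the merged setuptools
    rm -rf .tox                     # throw away envs built against old distribute
    tox -r -e pep8                  # -r forces recreation; env name is an assumption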
fungiwell, we have trailing / on GIT_PROJECT_ROOT for the gerrit servers and zuul in fact23:51
*** UtahDave has quit IRC23:51
mroddencrazy23:51
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Move gerrit specific result actions under reporter  https://review.openstack.org/4264423:52
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Add support for emailing results via SMTP  https://review.openstack.org/4264523:52
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Separate reporters from triggers  https://review.openstack.org/4264323:52
clarkbfungi: yeah but this is the only server with this version of git23:52
clarkbanyways restarting apache now23:52
clarkbdidn't help23:53
pleia2nope :\23:53
jeblairuh, so there are very few references to pack files in the gerrit logs23:54
clarkbjeblair: maybe it isn't working there either?23:54
mordredclarkb: oh - interesting23:54
jeblairsome of them are to '.git' dirs, and they work, some omit '.git' and are 404s23:54
pleia2same thing here23:55
jeblairby very few, i mean 1 client this week.23:55
clarkbwarning hack: what if we just symlink openstack/foo to openstack/foo.git on disk?23:55
clarkband handle both cases?23:55
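A sketch of that stopgap on the git server, assuming the bare repos live under /var/lib/git as discussed (a one-off by hand, not something merged anywhere):

    # create foo -> foo.git symlinks so both URL forms resolve on disk
    cd /var/lib/git/openstack
    for repo in *.git; do
        ln -sfn "$repo" "${repo%.git}"
    done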
pleia2clarkb: it hurts, but if we do we can do it in the jeepyb script23:56
jeblairclarkb: maybe to stop the bleeding?  but we really should figure out the problem.23:57
clarkbjeblair: I agree23:57
clarkblet me add a neutron symlink then try grabbing that idx file again23:57
clarkbthat will at least tell us if this is the only problem23:58
* pleia2 nods23:58
jeblair(i don't think we should add it to jeepyb, (unless we decide it's the actual solution) we'll never fix it)23:58
pleia2jeblair: ah, ok23:58
jeblairmordred: i forgot a step earlier: set the nodes to deleted in nodepool23:59
jeblairi'll do that now23:59
