*** dims has quit IRC | 00:00 | |
jeblair | clarkb: the documentation for the -n and -t options suggest to me that is the case | 00:00 |
---|---|---|
*** bearhands is now known as comstud | 00:00 | |
clarkb | jeblair: I know that when doing a git remote update you only get tags on the branches being tracked | 00:01 |
fungi | jog0: yeah, e-mail thread later in the week (maybe thursday) to start comparing opinions might be good. i gather we should also be using subcategory tags and can put more than one on a talk, so that might also be a good way to coordinate some things | 00:01 |
clarkb | jeblair: and git remote update is just fancy git fetch so I think that is correct | 00:01 |
jog0 | fungi: yeah | 00:02 |
*** amotoki_ has quit IRC | 00:06 | |
*** amotoki_ has joined #openstack-infra | 00:06 | |
mordred | clarkb: you have machines that are still borked? | 00:07 |
clarkb | mordred: no I fixed them | 00:07 |
mordred | clarkb: ok | 00:08 |
clarkb | mordred: I have a feeling that if you tried this on one of our precise slaves that needs newer pbr you will see the failure | 00:08 |
clarkb | mordred: zuul-dev maybe | 00:08 |
mordred | clarkb: I can see the pbr upgrade failure | 00:08 |
mordred | clarkb: it's the other failure I can't see | 00:08 |
clarkb | mordred: the install pbr failure? | 00:09 |
lifeless | how do you do multiline file contents in yaml ? | 00:09 |
mordred | lifeless: :- | 00:09 |
clarkb | I would expect the same sort of issue on zuul-dev | 00:09 |
mordred | lifeless: one sec - I will paste example | 00:10 |
mordred | lifeless: https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/devstack-gate.yaml#n13 | 00:10 |
mordred | lifeless: : | is what I meant :) | 00:10 |
*** amotoki_ has quit IRC | 00:12 | |
*** dkranz has joined #openstack-infra | 00:12 | |
*** michchap has joined #openstack-infra | 00:16 | |
*** dims has joined #openstack-infra | 00:16 | |
lifeless | is an ssl chain file needed ? | 00:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 00:18 |
clarkb | woo I think we have non deterministic sudoing in neutron tests when mocks are not set up ahead of time | 00:20 |
*** jhesketh_ has joined #openstack-infra | 00:21 | |
clarkb | marun: https://review.openstack.org/#/c/43558/ my comments on that review hopefully provide more details | 00:22 |
*** svarnau has quit IRC | 00:22 | |
lifeless | clarkb: the top of review.projects.yaml.erb is not like the bottom | 00:23 |
lifeless | clarkb: does the homepage key have special meaning? | 00:23 |
clarkb | lifeless: correct. review.projects.yaml.erb comes in two sections, first section is of defaults and second section is a list of projects with possible default overrides | 00:23 |
clarkb | lifeless: it sets the homepage on github.org. We should probably start overriding those for stackforge projects | 00:25 |
clarkb | *github.com | 00:25 |
clarkb | lifeless: https://github.com/openstack-infra/config where it says website, that value comes from homepage | 00:26 |
clarkb | mordred: ^ | 00:26 |
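The two-section layout clarkb describes for review.projects.yaml.erb (a block of defaults, then a list of projects that may override them) can be sketched roughly as below. This is only an illustration of the merge idea, not the actual manage-projects code: the `resolve_project` helper, the default value, and the sample entries are hypothetical, and only the `homepage` key comes from the discussion.

```python
# Hypothetical sketch: apply per-project overrides on top of shared defaults.
DEFAULTS = {"homepage": "http://openstack.org"}  # assumed default value

PROJECTS = [
    {"project": "openstack-infra/config"},  # inherits the default homepage
    {"project": "stackforge/example", "homepage": "http://example.org"},  # overrides it
]


def resolve_project(entry, defaults=DEFAULTS):
    """Return the effective settings for one project entry."""
    settings = dict(defaults)   # copy so the shared defaults are never mutated
    settings.update(entry)      # keys set on the project win over the defaults
    return settings


for entry in PROJECTS:
    effective = resolve_project(entry)
    print(effective["project"], effective["homepage"])
```

With this shape, overriding the homepage for a stackforge project only requires adding a `homepage` key to that project's entry.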
lifeless | clarkb: where are the docs for adding projects again? | 00:26 |
clarkb | lifeless: http://ci.openstack.org/stackforge.html | 00:26 |
lifeless | cool | 00:27 |
lifeless | what does the separation of github users achieve? | 00:27 |
lifeless | do derivers need to do that ? | 00:27 |
clarkb | lifeless: which separation? stackforge vs openstack or the one about actual users? | 00:28 |
*** nati_uen_ has quit IRC | 00:28 | |
lifeless | gerrit-user: openstack-project-creator | 00:28 |
lifeless | gerrit-committer: OpenStack Project Creator <openstack-infra@lists.openstack.org> | 00:28 |
lifeless | gerrit-key: <%= ssh_project_key %> | 00:28 |
lifeless | in the erb file | 00:28 |
*** fbo_away has quit IRC | 00:29 | |
clarkb | you don't strictly need to do that, but it does do one really nice thing. It prevents you from needing to use an account interactively that always has super powers | 00:29 |
clarkb | I find it really annoying when reviewing code to have the extra buttons unless I need them (because 99% of the time I don't want them and don't want to misclick) | 00:30 |
lifeless | isn't that user also set in hiera? Why is it hardcoded in this template. | 00:30 |
lifeless | oh, I'm thinking github users. | 00:30 |
lifeless | bah | 00:30 |
clarkb | I am going to run home now but should follow IRC again once there if you have more questions | 00:31 |
lifeless | thanks | 00:31 |
lifeless | I will | 00:31 |
lifeless | I made no progress the last two days without a ready source of institutional knowledge | 00:31 |
*** Ryan_Lane has quit IRC | 00:33 | |
*** rfolco has quit IRC | 00:38 | |
*** UtahDave has quit IRC | 00:40 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/config: Use gerrit for the remote update in post jobs https://review.openstack.org/44988 | 00:41 |
mordred | lifeless: so, this may be late in framing... | 00:46 |
*** mgagne has quit IRC | 00:48 | |
mordred | lifeless: but hiera is essentially just a collection of input parameters - so perhaps the knowledge about what the hiera bits do is really a deficiency in describing what the input params need? | 00:48 |
clarkb | mordred yes the params should be better documented in puppet | 00:48 |
mordred | lifeless: also, probably out of scope for right now, but I do believe that ::gerrit and openstack_project::gerrit is not split where it should be in all places | 00:49 |
mordred | some things should be split out and combined via composition - and other things should really just be features of the underlying gerrit module - I've been meaning to do a refactor for a while | 00:49 |
*** rfolco has joined #openstack-infra | 00:50 | |
clarkb | jeblair is that temporary until the zuul change? | 00:50 |
clarkb | mordred ++ see hunners roles and profiles example on github | 00:51 |
mordred | clarkb: aroo? | 00:51 |
lifeless | mordred: it is, which is why I document the hiera bits in the .pp files. | 00:51 |
mordred | lifeless: cool | 00:51 |
* mordred catching up by just poking people | 00:51 | |
jeblair | clarkb: yes; it should only be used by post jobs, so the additional load on gerrit should be small | 00:52 |
jeblair | post,release,pre-release | 00:52 |
*** nos_ has joined #openstack-infra | 00:54 | |
*** nos_ has quit IRC | 00:54 | |
*** nosnos has joined #openstack-infra | 00:55 | |
*** changbl has quit IRC | 00:55 | |
clarkb | mordred: https://github.com/hunner/roles_and_profiles | 00:56 |
clarkb | jeblair: wfm | 00:56 |
clarkb | mordred: our puppet modules should start looking like that, then we can use r10k to pull in modules (and split our modules out) leaving openstack-infra/config with launch/, a puppetfile, and the openstack_project module | 00:59 |
clarkb | mordred: and some scripts | 00:59 |
mordred | clarkb: what's r10k? | 01:00 |
clarkb | mordred: it is like puppet librarian but actually works according to the author | 01:00 |
*** yjiang5 is now known as yjiang5_away | 01:00 | |
*** changbl has joined #openstack-infra | 01:00 | |
mordred | clarkb: I do not understand how that example is different | 01:01 |
mordred | clarkb: there is also no puppetfile | 01:01 |
clarkb | mordred: roles and profiles is a completely different problem than what puppetfiles solve | 01:02 |
clarkb | mordred: all modules that are not roles or profiles could be serviced by a puppetfile | 01:02 |
mordred | clarkb: ah. | 01:02 |
mordred | gotcha | 01:02 |
clarkb | mordred: but by organizing the specific bits into roles and profiles you end up with nicely reconsumable bits | 01:02 |
clarkb | in the other modules | 01:03 |
lifeless | whats test-manage-project.config for ? | 01:03 |
lifeless | the name test in it is ambiguous - is it for ci,or for manual testing of things or ??? | 01:03 |
clarkb | lifeless: it is used against review-dev | 01:04 |
clarkb | lifeless: there is a test-manage-project on review-dev | 01:04 |
mordred | clarkb: ah. ok. (reading presentation) | 01:05 |
mordred | clarkb: I think this is a more complete version of what I was trying to do with the openstack_project:: module and how we're doing things in site.pp | 01:06 |
clarkb | mordred: yup | 01:06 |
mordred | EXCEPT | 01:06 |
clarkb | hunner brought it up when he tried using our gerrit module and that is what I told him. "We seem to have tried to do something like this but never got there" | 01:06 |
mordred | clarkb: so the expectation is that hiera calls go into profiles? | 01:07 |
clarkb | mordred: yeah I disagree with that | 01:07 |
mordred | I do too | 01:07 |
clarkb | and there was some arguing :) but overall I think the structure is good | 01:07 |
mordred | hrm. SO | 01:08 |
mordred | hiera in profiles would reduce a lot of the copying of parameters we do | 01:09 |
mordred | but I don't like that it means that it's hard to do a simple different site.pp without going hiera on it | 01:09 |
clarkb | it would, and if we explicitly make profiles not reconsumable we could get away with it | 01:09 |
clarkb | yeah that. for testing you want to reconsume site.pp | 01:09 |
clarkb | and other folks probably will still pull openstack-infra/config even if it is skeletonized | 01:10 |
lifeless | so I'm doing about 90% copying 10% parameterisation atm | 01:10 |
lifeless | I would do it better and refactor, but ETIME. | 01:10 |
mordred | yah | 01:10 |
mordred | lifeless: that's what I did with the HP gerrit | 01:10 |
mordred | and then filed a note with myself to go back and refactor where I could not parameter | 01:10 |
mordred | (I did send some patches upstream) | 01:10 |
*** pcrews has joined #openstack-infra | 01:10 | |
lifeless | mordred: yeah | 01:11 |
lifeless | mordred: I figured you had a fairly good reason for leaving me with this pile of poo :) | 01:11 |
jeblair | i'm so glad i've spent time helping | 01:12 |
mordred | lifeless: there's a big enough refactor in some places that it just hasn't bubbled up | 01:12 |
lifeless | jeblair: it's been very helpful! | 01:12 |
lifeless | jeblair: by poo, I didn't mean the infrastructure is bad | 01:12 |
lifeless | jeblair: I meant that mordred had already done a full pass over this, but it's still copy-paste material. | 01:13 |
mordred | yup. that's on me | 01:13 |
mordred | I meant to document what I did | 01:13 |
mordred | and then I got busy | 01:13 |
lifeless | jeblair: I didn't mean to cast aspersions on -infra or the config tree as a whole. | 01:14 |
jeblair | lifeless: np. i now understand what you were saying. | 01:14 |
* mordred hangs head | 01:15 | |
lifeless | jeblair: sorry for the phrasing there; in person I think it would have come across properly | 01:16 |
*** jhesketh has quit IRC | 01:16 | |
*** jhesketh has joined #openstack-infra | 01:16 | |
*** amotoki_ has joined #openstack-infra | 01:20 | |
lifeless | the run things many times thing with puppet is super annoying | 01:22 |
marun | clarkb: thanks for the review pointer, I'll see if I can help get that fixed. | 01:23 |
*** amotoki_ has quit IRC | 01:24 | |
jeblair | lifeless: yeah, if we had the test system we really want, we'd spin up (at least) a new node for each puppet change and make sure things don't bitrot and can work once from a fresh run; as it is, we only catch bitrot when we build new servers | 01:25 |
lifeless | jeblair: yeah; the tripleo story has the same issue, we're working towards the same grail | 01:26 |
lifeless | jeblair: [catching things early that is] | 01:26 |
lifeless | I have a manage-projects issue | 01:27 |
lifeless | paramiko.SSHException: not a valid DSA private key file | 01:27 |
lifeless | but | 01:27 |
lifeless | I presume it's looking at less /home/gerrit2/review_site/etc/ssh_project_rsa_key | 01:27 |
lifeless | which is - by name anyhow - meant to be an RSA file | 01:27 |
lifeless | and in fact, | 01:27 |
lifeless | # file /home/gerrit2/review_site/etc/ssh_project_rsa_key | 01:27 |
lifeless | /home/gerrit2/review_site/etc/ssh_project_rsa_key: PEM RSA private key | 01:27 |
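A quick way to double-check which kind of key a PEM file holds, comparable to what `file` reports above, is to read its header line. The sketch below is a stdlib-only illustration; `pem_key_type` is a hypothetical helper and is not how paramiko itself decides which parser to use.

```python
import re

# Traditional PEM private-key headers look like
# "-----BEGIN RSA PRIVATE KEY-----" or "-----BEGIN DSA PRIVATE KEY-----".
# A bare "-----BEGIN PRIVATE KEY-----" header marks a PKCS#8 container.
_PEM_HEADER = re.compile(r"-----BEGIN (?:([A-Z0-9]+) )?PRIVATE KEY-----")


def pem_key_type(pem_text):
    """Return the key type named in the PEM header, or None if no header is found."""
    match = _PEM_HEADER.search(pem_text)
    if match is None:
        return None
    return match.group(1) or "PKCS8"


sample = "-----BEGIN RSA PRIVATE KEY-----\n(base64 body elided)\n-----END RSA PRIVATE KEY-----\n"
print(pem_key_type(sample))
```

If the header says RSA while the traceback complains about DSA, the key file itself is probably not the real problem, and it is worth looking at how the connection is being attempted instead.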
lifeless | also I need a hand with apache ssl - it wants an intermediate chain file, and I have no clue about that for self signed certs | 01:28 |
clarkb | lifeless: for the self signed certs we typically just use the one that comes in /etc/ssl/certs ssl-cert-snakeoil.pem iirc | 01:29 |
*** tian has joined #openstack-infra | 01:30 | |
clarkb | though you may actually care about the contents of your self signed cert | 01:30 |
clarkb | lifeless: https://github.com/saltstack/salt-cloud/pull/68 could your exception be for similar reasons? | 01:31 |
jeblair | lifeless: you should be able to omit the chain file for an ssl cert | 01:33 |
jeblair | lifeless: most of our modules do that if you set the chain file contents to '' in hiera | 01:33 |
jeblair | lifeless: unsure if the gerrit module does that | 01:33 |
mordred | jeblair: it does | 01:34 |
lifeless | Syntax error on line 30 of /etc/apache2/sites-enabled/50-review.testing-cabal.org.conf: | 01:36 |
lifeless | SSLCertificateChainFile: file '/etc/ssl/certs/intermediate.pem' does not exist or is empty | 01:36 |
lifeless | Action 'start' failed. | 01:36 |
clarkb | that may be a bug | 01:37 |
lifeless | gerrit_ssl_chain_file_contents: '' | 01:37 |
lifeless | in hiera | 01:37 |
clarkb | lifeless: I wonder if that is creating a string that != "" in the vhost .erb | 01:38 |
lifeless | clarkb: I made a new self signed cert, will that be compatible with the /etc/ssl/certs/ssl-cert-snakeoil.pem you're mentioning ? | 01:38 |
clarkb | lifeless: no you just use the snakeoil cert instead | 01:38 |
clarkb | lifeless: oh the vhost erb is checking the chain file and not chain file contents | 01:39 |
clarkb | lifeless: so either unset the file variable or change the check in the vhost erb | 01:39 |
lifeless | clarkb: ./modules/meetbot/templates/vhost.erb ? | 01:40 |
clarkb | lifeless: modules/gerrit/templates/gerrit.vhost.erb | 01:41 |
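The behaviour clarkb describes, where the vhost template keys off the chain file path rather than the chain file contents, can be sketched as below. This is a hypothetical Python rendering of the idea, not the real gerrit.vhost.erb; `render_ssl_directives` is an illustrative name, and the snakeoil key path is the usual Debian/Ubuntu location, assumed here rather than taken from the log.

```python
def render_ssl_directives(cert_file, key_file, chain_file=""):
    """Build the SSL lines of a vhost, emitting the chain directive only when a path is set."""
    lines = [
        "SSLEngine on",
        f"SSLCertificateFile {cert_file}",
        f"SSLCertificateKeyFile {key_file}",
    ]
    # The check is on the *path* variable: an empty chain file contents value
    # does not help if the path itself is still set to something.
    if chain_file:
        lines.append(f"SSLCertificateChainFile {chain_file}")
    return "\n".join(lines)


# Self-signed setup: no intermediate chain, so the path is left unset.
print(render_ssl_directives(
    "/etc/ssl/certs/ssl-cert-snakeoil.pem",
    "/etc/ssl/private/ssl-cert-snakeoil.key",  # assumed standard location
))
```

Under this logic, Apache only sees an `SSLCertificateChainFile` line when a path is configured, which matches the failure above: a path was set, but the file behind it was empty.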
*** HenryG has joined #openstack-infra | 01:42 | |
lifeless | clarkb: just adding _contents on the end will be sufficient ? | 01:42 |
clarkb | hmm actually no | 01:43 |
clarkb | I think the check there is correct, because you may have a chain file that isn't managed by puppet | 01:43 |
lifeless | clarkb: is that a use case? | 01:43 |
clarkb | lifeless: yes, I think so | 01:43 |
lifeless | clarkb: surely the answer is 'update it in hiera' in that case? | 01:43 |
clarkb | maybe you don't have hiera or some sites cohabitate | 01:44 |
clarkb | unsetting the other variable is easy enough | 01:44 |
clarkb | lifeless: and appears to be unset by default in ::gerrit and ::openstack_project::gerrit.pp | 01:45 |
lifeless | clarkb: how do you unset it ? I certainly haven't set it. | 01:45 |
clarkb | lifeless: openstack_project::review.pp has it hardcoded | 01:46 |
*** pabelanger has joined #openstack-infra | 01:46 | |
lifeless | clarkb: this seems like needless complexity to me, but sure. | 01:47 |
lifeless | clarkb: just '' ? falsey? | 01:47 |
clarkb | lifeless: review.pp isn't meant to be reconsumed... | 01:47 |
lifeless | clarkb: see the discussion I raised the other day about what is config and what is design | 01:48 |
*** nati_ueno has quit IRC | 01:48 | |
clarkb | lifeless: I totally understand and grok that argument. The roles and profiles thing should get us there | 01:48 |
clarkb | lifeless: but today the outermost onion layer isn't meant to be used that way | 01:48 |
*** nati_ueno has joined #openstack-infra | 01:48 | |
lifeless | clarkb: sure; so my instructions I'm writing - viewable in the review - are 'copy and paste and make X adjustments' | 01:49 |
lifeless | clarkb: w.r.t. complexity I'm saying having a 'not managed by puppet' use case in a puppet managed file is very odd | 01:49 |
clarkb | lifeless: so you can either update the vhost erb like you suggested (but I don't think that is a proper fix, others may disagree) or update openstack_project::review to make this configurable or write your own review.pp | 01:50 |
clarkb | lifeless: the certs themselves don't have to be managed by puppet | 01:50 |
clarkb | you might have some other key management mechanism (which isn't uncommon aiui) | 01:50 |
*** adalbas has quit IRC | 01:51 | |
mordred | clarkb: I think that our assumption that review.pp isn't reconsumable is a split of things in the wrong place | 01:51 |
clarkb | mordred: anything in openstack_project isn't meant to be directly reconsumable if you want to be different | 01:52 |
mordred | clarkb: cause most of the people who are trying to reconsume our stuff are wanting a gerrit that works like ours | 01:52 |
clarkb | right which means you need a chain file | 01:52 |
mordred | clarkb: I know - but I think we may have underestimated how much "OpenStack's Gerrit" is a thing people want | 01:52 |
mordred | as opposed to just "a gerrit" | 01:52 |
mordred | AH | 01:52 |
clarkb | I am totally on board with the make it better this is not ideal | 01:52 |
mordred | I follow what you are saying | 01:53 |
clarkb | but this is how things are today | 01:53 |
lifeless | clarkb: thats not related to 'works like openstacks', thats related to 'has a public certificate not a self signed one' | 01:53 |
lifeless | clarkb: | 01:53 |
lifeless | clarkb: 'certs are in hiera' is related to 'works like openstacks' | 01:53 |
clarkb | lifeless: but puppet itself is config | 01:53 |
clarkb | and in the outermost layer the config mixes with the design | 01:53 |
lifeless | clarkb: there is config and there is config | 01:53 |
lifeless | clarkb: there is server config and there is project config; imagine for a second that the openstack ci infrastructure was an aaS thing. | 01:54 |
clarkb | lifeless: I totally get that | 01:54 |
lifeless | clarkb: which bits would be API driven, and which bits would be deployer problems? | 01:54 |
clarkb | lifeless: and my suggestion to move towards something using the roles and profiles paradigm will get us there | 01:54 |
lifeless | This is clearly a deployer problem. What projects to manage is an API consumer thing. | 01:54 |
clarkb | but today that isn't the case | 01:54 |
lifeless | clarkb: right, we're all agreed ;) | 01:54 |
lifeless | clarkb: I want to be clear - I'm not whinging about what I'm needing to do. | 01:55 |
lifeless | clarkb: I'm arguing that *in the current setup*, the template file supporting a case openstack doesn't deploy is hair on the yak. | 01:55 |
clarkb | lifeless: why, we would like it to be consumable by others | 01:56 |
lifeless | clarkb: and that the future setup, with roles profiles etc might be a time that that matters - or it might now. | 01:56 |
clarkb | I think it matters now because you can take the gerrit module and reuse it | 01:56 |
clarkb | you will just end up with a Gerrit gerrit not an OpenStack gerrit | 01:56 |
lifeless | I'm super wary of early generalisation with this sort of thing. | 01:57 |
mordred | clarkb: I do not think that is true | 01:57 |
clarkb | mordred: which thing isn't true? | 01:57 |
mordred | clarkb: I believe we have openstack isms split across both modules | 01:57 |
lifeless | I appreciate the story - support folk with pluggable certificate management infrastructures. | 01:58 |
mordred | clarkb: ::gerrit does not get you a gerrit gerrit | 01:58 |
clarkb | mordred: right but those are bugs | 01:58 |
clarkb | we shouldn't create more bugs because it makes life easy now | 01:58 |
lifeless | But having two variables that are tied together like this with non-obvious side effects seems like a non-scalable [in terms of understanding and predictability] pattern. | 01:58 |
mordred | clarkb: yah. I'm just supporting the yak shaving argument - if gerrit gerrit doesn't work, and we're wanting to refactor to roles/profiles anyway... | 01:58 |
mordred | clarkb: then patches to support the thing that actually comes up right now, namely, reconsuming openstack gerrit, are probably more important | 01:59 |
clarkb | lifeless: welcome to puppet | 01:59 |
lifeless | clarkb: I'd much rather see a top level thing in puppet for managing 'this is an SSL key and it might come from lots of places and it might have an intermediate file and so on' | 01:59 |
clarkb | mordred: but it is reconsumable... | 01:59 |
*** yaguang has joined #openstack-infra | 01:59 | |
lifeless | wooo | 02:00 |
clarkb | mordred: I am arguing that it isn't reconsumable from openstack_project::review.pp if you want something other than an OpenStack Gerrit | 02:00 |
lifeless | https://review.testing-cabal.org/#/q/status:open,n,z | 02:00 |
lifeless | ^ wooo | 02:00 |
uvirtbot | lifeless: Error: "wooo" is not a valid command. | 02:00 |
lifeless | woo woow woo | 02:00 |
clarkb | mordred: openstack_project::gerrit actually is reconsumable | 02:00 |
clarkb | I think | 02:00 |
mordred | clarkb: ok. I grok | 02:00 |
mordred | clarkb: it's sort of reconsumable | 02:00 |
mordred | there are bugs | 02:00 |
lifeless | clarkb: at the moment it's copy-pastable; I presume thats not what you mean ? | 02:00 |
clarkb | lifeless: right, should be able to reconsume without copy pasta | 02:01 |
mordred | clarkb: or, rather, let me say a different thing... | 02:01 |
Alex_Gaynor | hmmm, pasta | 02:01 |
lifeless | clarkb: you will want to review my docs where I say to copy-paste it then :) | 02:01 |
mordred | I think that "OpenStack's Gerrit" and "An OpenStack Gerrit" are different things | 02:01 |
clarkb | mordred: thats fair | 02:02 |
clarkb | mordred: and openstack_project::review is OpenStack's Gerrit. openstack_project::gerrit is an Openstack Gerrit | 02:02 |
mordred | clarkb: and right now I think that there are elements of "OpenStack's Gerrit" that bleed into the description of "An OpenStack Gerrit" | 02:02 |
mordred | clarkb: and that they do is a bug :) | 02:02 |
mordred | but it's been low priority for _us_ | 02:02 |
clarkb | mordred: I actually think that in this case there isn't much bleeding | 02:03 |
clarkb | openstack_project::gerrit doesn't have the issue, only openstack_project::review does | 02:03 |
clarkb | running openstack_project::gerrit on two servers has helped one of which uses snakeoil certs :0 | 02:03 |
clarkb | * :) so at least in this particular case I think we are fine. There are definitely bugs though | 02:04 |
*** gyee has quit IRC | 02:04 | |
anteaya | pleia2: if I see docs that tell me to install packages on xUbuntu 12.04 can I use Ubuntu? or would not using xUbuntu trip me up? | 02:04 |
anteaya | *I don't know the difference very well* | 02:05 |
lifeless | clarkb: no, thats not the manage-projects issue | 02:05 |
lifeless | clarkb: I can ssh to ssh -p 29418 robertc@review.testing-cabal.org | 02:05 |
lifeless | clarkb: though I get a permission denied of course, as I haven't setup users yet. | 02:05 |
mordred | lifeless: I've been meaning to add initial user creation steps to the puppet module | 02:06 |
mordred | lifeless: so I welcome work on your side in this area :) (I think that doing it via sql will likely be required, due to how initial admin user account works in gerrit) | 02:07 |
lifeless | so do you guys lookup passwords in hiera whenever you need them (e.g. for the manual mysql root commands in gerrit.rst ?) | 02:07 |
clarkb | I typically use the gsql interface | 02:07 |
anteaya | bah I'll just install it and watch it go up in flames if it is going to go | 02:08 |
lifeless | clarkb: gsql ? | 02:08 |
mordred | lifeless: if I need to do sql on the box, I log in as root and just run mysql | 02:09 |
mordred | lifeless: the root user is configured to be able to log in to the mysql on the box | 02:09 |
lifeless | hah so bad instructions | 02:09 |
clarkb | lifeless: ssh gerrit gerrit gsql | 02:09 |
clarkb | its a gerrit command | 02:09 |
mordred | but clarkb's thing is safer | 02:09 |
lifeless | clarkb: can you do that for initial setup? | 02:10 |
mordred | no. for initial setup you'll need direct mysql commands | 02:10 |
lifeless | clarkb: and by ssh gerrit, do you mean the port 29418 ? | 02:10 |
mordred | lifeless: ssh -p 29418 robertc@review.testing-cabal.org gerrit gsql | 02:11 |
clarkb | lifeless: yeah talking to the gerrit ssh server | 02:11 |
clarkb | lifeless: and yes, we also tend to use the root account with its mysql conf file to do things automagically | 02:12 |
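For scripting the `gerrit gsql` call mordred shows above, one option is to build the ssh argument list in code. The sketch below assumes gsql's `-c` option for passing a single query; `gsql_argv` is a hypothetical helper, the `account_groups` table name is an assumption about the Gerrit 2.x schema, and nothing here is executed.

```python
import shlex


def gsql_argv(user, host, query, port=29418):
    """Return the argv for running one SQL query through Gerrit's ssh interface."""
    # The remote command travels as a single string, so the query is quoted
    # to keep it together as one argument on the Gerrit side.
    remote = "gerrit gsql -c " + shlex.quote(query)
    return ["ssh", "-p", str(port), f"{user}@{host}", remote]


argv = gsql_argv(
    "robertc",
    "review.testing-cabal.org",
    "SELECT name FROM account_groups",  # assumed table name
)
print(argv)
```

The list can be handed to `subprocess.run(argv)` once an account with the right permissions exists; as noted in the discussion, this route is not available for the very first setup, which needs direct mysql access.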
lifeless | hmm, i don't see any initial-user setup stuff | 02:12 |
mordred | lifeless: there is none | 02:13 |
mordred | lifeless: initial-user setup is all manual | 02:13 |
mordred | currently | 02:13 |
lifeless | mordred: yes, but I don't see any docs for it. | 02:13 |
mordred | the first person to log in to the gerrit TTW gets admin/root | 02:13 |
lifeless | oh | 02:14 |
clarkb | there is some stuff in git-review we might be able to genericize | 02:14 |
clarkb | that does initial setup for gerrit so that tests can run against it | 02:14 |
clarkb | but it is still pretty rough, I had to hack on it last week to get it to work with gerrit 2.4 | 02:15 |
mordred | Alex_Gaynor: what's fox? | 02:17 |
Alex_Gaynor | mrodden: I assume it's like tox, but I typod it | 02:17 |
mordred | Alex_Gaynor: hahahaha | 02:17 |
mordred | Alex_Gaynor: (that should work | 02:17 |
Alex_Gaynor | mordred: most confusing things I say can be explained by the fact that I'm a very poor typist | 02:18 |
mordred | Alex_Gaynor: unless it's expecting a c thing to be installed - all the gate is doing for ceilometer is what you desscfribe | 02:18 |
mordred | Alex_Gaynor: what broke | 02:18 |
Alex_Gaynor | mordred: I think I probably need to be running mongodb | 02:18 |
mordred | Alex_Gaynor: I have learned that I cannot type without direct visual feedback | 02:18 |
lifeless | does puppet setup the gerrit user for manage-projects etc? | 02:18 |
mordred | Alex_Gaynor: you should not need to be running mongo - clarkb <-- ? | 02:19 |
mordred | lifeless: nope | 02:19 |
lifeless | is there a concordance of such users somewhere? | 02:19 |
clarkb | lifeless: that is the only one | 02:19 |
*** melwitt1 has quit IRC | 02:19 | |
clarkb | we got rid of the sync user so there shouldn't be others | 02:20 |
clarkb | mordred: what about mongo? | 02:20 |
mordred | clarkb: does ceilo need mongo to run tox tests? | 02:20 |
clarkb | mordred: oh yeah ceilometer does mongodb tests, but they should be opportunistic | 02:20 |
mordred | lifeless: http://ci.openstack.org/gerrit.html#gerrit-configuration <-- don't forget those steps :) | 02:20 |
clarkb | because they don't run on our precise hosts which have ancient mongodb | 02:20 |
mordred | lifeless: and then http://ci.openstack.org/gerrit.html#access-controls | 02:21 |
clarkb | you might need the dependencies though so that deps can build | 02:21 |
clarkb | but actual tests probably won't run | 02:21 |
mordred | lifeless: lists all of the global acls which are also not managed by anything | 02:21 |
mordred | lifeless: as well as the groups that one would need | 02:21 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Make gerrit DB setup match actual practice. https://review.openstack.org/44993 | 02:21 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 02:21 |
*** rcleere has joined #openstack-infra | 02:22 | |
mordred | lifeless: and the user needs to be in the Project Bootstrappers group | 02:23 |
*** markmcclain has joined #openstack-infra | 02:23 | |
mordred | (we also have it in Administrators) | 02:23 |
lifeless | what we're doing for tripleo is writing api scripts to drive these sorts of setups | 02:24 |
lifeless | so they aren't tied into machine-orchestration layers [like puppet] | 02:24 |
lifeless | time for a short break for me | 02:24 |
*** pcrews has quit IRC | 02:25 | |
*** ryanpetrello has joined #openstack-infra | 02:29 | |
*** ryanpetrello has quit IRC | 02:40 | |
*** ryanpetrello has joined #openstack-infra | 02:42 | |
*** nati_ueno_2 has joined #openstack-infra | 02:42 | |
*** ryanpetrello has quit IRC | 02:43 | |
*** nati_ueno has quit IRC | 02:46 | |
*** nosnos_ has joined #openstack-infra | 02:47 | |
*** nosnos has quit IRC | 02:49 | |
*** dims has quit IRC | 02:51 | |
*** nosnos_ has quit IRC | 02:55 | |
*** nosnos has joined #openstack-infra | 02:55 | |
*** anteaya has quit IRC | 02:58 | |
pleia2 | oops, missed anteaya by a couple minutes | 03:01 |
*** kiall has quit IRC | 03:07 | |
*** xchu has joined #openstack-infra | 03:12 | |
*** pcrews has joined #openstack-infra | 03:13 | |
openstackgerrit | Alex Gaynor proposed a change to openstack-infra/config: Run the heatclient tests under PyPy https://review.openstack.org/44996 | 03:17 |
*** nati_ueno_2 has quit IRC | 03:25 | |
lifeless | clarkb: mordred: how do you assign the permissions for Project Bootstrappers to the group ? | 03:26 |
*** nati_ueno has joined #openstack-infra | 03:26 | |
*** kiall has joined #openstack-infra | 03:29 | |
*** tian has quit IRC | 03:34 | |
*** rcleere has quit IRC | 03:34 | |
lifeless | mordred: clarkb: how do the acls described in gerrit.rst get into the gerrit system ? | 03:54 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Gerrit docs improvements - user and groups. https://review.openstack.org/45001 | 03:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 03:59 |
*** odyi has quit IRC | 03:59 | |
*** clarkb has quit IRC | 03:59 | |
*** Guest16331 has quit IRC | 03:59 | |
*** mikal has quit IRC | 03:59 | |
*** stevebaker has quit IRC | 03:59 | |
*** mikal_ has joined #openstack-infra | 03:59 | |
*** clarkb has joined #openstack-infra | 03:59 | |
*** lillie has joined #openstack-infra | 04:00 | |
*** odyi has joined #openstack-infra | 04:00 | |
*** lillie is now known as Guest72665 | 04:00 | |
lifeless | clarkb: https://github.com/saltstack/salt-cloud/pull/68 - no; it was that I hadn't created the account yet. | 04:01 |
*** stevebaker has joined #openstack-infra | 04:01 | |
*** mikal_ is now known as mikal | 04:03 | |
*** dkliban has quit IRC | 04:04 | |
fungi | lifeless... same answer for both--the All-Projects acl gets hand configured through the gerrit webui or a suitable acl config file is pushed through the gerrit ssh interface or committed in ~gerrit2/review_site/git/All-Projects.git/ in refs/meta/config | 04:09 |
lifeless | fungi: so https://review.testing-cabal.org/gitweb?p=All-Projects.git;a=blob;f=project.config;h=4d301c96c45478a43922a7bcc8b15f0aef15a3dc;hb=f701a9ea071e144dd4b1858f960ab6e835b3f474 | 04:10 |
lifeless | fungi: that looks similar to the syntax in the docs | 04:10 |
lifeless | fungi: does the one in the docs replace that file ? | 04:10 |
*** markmcclain has quit IRC | 04:11 | |
fungi | though if applied directly to the filesystem without going through the gerrit interfaces, it misses out on built-in syntax checking | 04:11 |
lifeless | fungi: whats the best way? I'm going to write down what you tell me for posterity :) | 04:12 |
fungi | yeah, you'd update/replace that | 04:12 |
fungi | pushing it through gerrit's ssh interface is probably automation-friendly | 04:13 |
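The approach fungi describes, pushing an updated All-Projects ACL through Gerrit's ssh interface, usually amounts to fetching the special `refs/meta/config` ref, editing `project.config`, and pushing back to that same ref. The sketch below only assembles the git commands as argument lists so they can be reviewed before running; `acl_update_commands` is a hypothetical helper, the commit message is a placeholder, and the user and host are taken from earlier in the log.

```python
def acl_update_commands(user, host, port=29418, workdir="All-Projects"):
    """Return the git commands (as argv lists) for a hand-applied ACL update."""
    url = f"ssh://{user}@{host}:{port}/All-Projects"
    return [
        ["git", "clone", url, workdir],
        # refs/meta/config is not a normal branch, so it is fetched explicitly.
        ["git", "-C", workdir, "fetch", "origin", "refs/meta/config"],
        ["git", "-C", workdir, "checkout", "FETCH_HEAD"],
        # ... edit project.config in workdir by hand at this point ...
        ["git", "-C", workdir, "commit", "-a", "-m", "Update global ACLs"],
        # Pushing through Gerrit (rather than editing the repo on disk)
        # keeps Gerrit's built-in syntax checking in the loop.
        ["git", "-C", workdir, "push", "origin", "HEAD:refs/meta/config"],
    ]


for cmd in acl_update_commands("robertc", "review.testing-cabal.org"):
    print(" ".join(cmd))
```

The pushing account needs sufficient rights on All-Projects (for example the initial administrative user), otherwise the final push will be rejected.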
*** vogxn has joined #openstack-infra | 04:13 | |
*** nati_ueno_2 has joined #openstack-infra | 04:14 | |
fungi | manage-projects does similar work for project-specific acls | 04:14 |
fungi | we want to puppet it, just haven't found the time, i think (though maybe there are more subtle obstacles to that one i'm overlooking) | 04:15 |
*** nati_ueno has quit IRC | 04:17 | |
openstackgerrit | Darragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Support use of different git chooser strategies https://review.openstack.org/45002 | 04:18 |
fungi | group membership management is a separate puppetry challenge though... they're all kept in mysql tables | 04:18 |
pleia2 | "Your Presentation proposal for linux.conf.au has been accepted" \o/ | 04:20 |
fungi | congrats, pleia2! | 04:20 |
pleia2 | thanks :) | 04:20 |
lifeless | fungi: I just need a recipe to apply by hand | 04:21 |
pleia2 | I've never been to .au, and first place I get to go is.. Perth? :) | 04:21 |
pleia2 | will have to see about a Sydney stop perhaps | 04:21 |
pleia2 | but yay! | 04:22 |
lifeless | pleia2: congrats :) | 04:25 |
*** SergeyLukjanov has joined #openstack-infra | 04:25 | |
Alex_Gaynor | pleia2: plan an extra few days to visit sydney, when I did that this past year it was awesome | 04:26 |
morganfainberg | pleia2, nice! | 04:28 |
fungi | lifeless: last time i set up a gerrit i updated those default acls through the webui, but you should be able to push it as your initial administrative user, or possibly authenticating as the builtin 'Gerrit Code Review' account (it's case sensitive) using the gerrit ssh service host key as the account's ssh key | 04:31 |
* fungi should afk for the night and try to sleep... stupid jetlag | 04:33 | |
pleia2 | Alex_Gaynor: given the timing, I'm thinking new years in sydney would be fun | 04:36 |
*** ArxCruz has quit IRC | 04:37 | |
*** reed has quit IRC | 04:38 | |
*** vipul is now known as vipul-away | 04:44 | |
*** yongli has joined #openstack-infra | 04:44 | |
openstackgerrit | Darragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Support use of different git chooser strategies https://review.openstack.org/45002 | 04:44 |
lifeless | fungi: what's the ssh command to do acl updates? | 04:47 |
lifeless | (or clarkb / mordred / pleia2 :)) | 04:48 |
*** vipul-away is now known as vipul | 04:50 | |
*** Ryan_Lane has joined #openstack-infra | 04:51 | |
*** boris-42 has joined #openstack-infra | 04:51 | |
*** vogxn has quit IRC | 04:56 | |
*** nati_ueno has joined #openstack-infra | 05:02 | |
clarkb | lifeless: it's weird because you are pushing to a special git ref | 05:02 |
lifeless | clarkb: I've been googling for howtos and docs. | 05:03 |
lifeless | clarkb: and been overloaded with related-but-not-that stuff | 05:03 |
clarkb | lifeless: I think it was removed from our docs when we reorged them, let me find it in history for you | 05:03 |
*** openmike|2 has joined #openstack-infra | 05:04 | |
*** nati_ueno_2 has quit IRC | 05:05 | |
clarkb | lifeless: http://git.openstack.org/cgit/openstack-infra/config/tree/doc/source/gerrit.rst?id=05aaa0b956af5e06b15bf1b760874f99f3d86a79#n831 | 05:06 |
clarkb | lifeless: I like to do that in a brand new repo that I git init and not clone | 05:06 |
clarkb | lifeless: that way I don't have to clean out the actual repo contents | 05:07 |
lifeless | should I restore those docs? | 05:09 |
lifeless | clarkb: ^ | 05:09 |
clarkb | lifeless: no, they were removed because we rely on manage_projects.py now | 05:09 |
clarkb | lifeless: I think gerrit may document this stuff somewhere too, let me see if I can find it in the gerrit documentation | 05:10 |
clarkb | maybe they don't document it in tree | 05:11 |
lifeless | clarkb: but, this stuff isn't done by manage_projects. | 05:17 |
lifeless | clarkb: so, it's not redundant | 05:18 |
clarkb | acl updates are done by manage_projects | 05:19 |
lifeless | clarkb: for all *other* acls, AIUI. | 05:19 |
lifeless | clarkb: but not these. | 05:19 |
clarkb | lifeless: which ones? | 05:20 |
lifeless | clarkb: http://ci.openstack.org/gerrit.html#access-controls | 05:20 |
lifeless | clarkb: those are the ones I'm asking about | 05:20 |
lifeless | clarkb: I asked 'how do I give the 'project bootstrappers' group the right permissions' | 05:20 |
clarkb | All-Projects? we don't currently use manage_projects to manage All-Projects but it can | 05:20 |
clarkb | oh so this is a bootstrapping issue | 05:21 |
lifeless | clarkb: and 'how do the acls in gerrit.rst get into the system' | 05:21 |
lifeless | clarkb: and I was told the answer to both questions was the same : do $something_undocumented to gerrit to make it happen. | 05:21 |
lifeless | clarkb: I'm just trying to get that $undocumented thing documented. | 05:21 |
lifeless | clarkb: you seem to be saying that this *deleted* documentation is such documentation, but that you don't want it documented. | 05:22 |
lifeless | *confused* | 05:22 |
clarkb | ya, I am now currently trying to sort out if there is a sane way to have manage_projects do that for us | 05:22 |
clarkb | lifeless: because gerrit itself should document this | 05:22 |
clarkb | lifeless: and I thought it did, but after some digging I think that I was wrong | 05:22 |
lifeless | clarkb: it should, but I couldn't find it :( | 05:22 |
clarkb | ya I can't find it either. So some subset of the documentation that was removed that explains how to bootstrap All-Projects for project bootstrappers should probably be added back into gerrit.rst | 05:23 |
clarkb | or do that bit by hand in the Gerrit UI | 05:23 |
lifeless | clarkb: I don't know the acl syntax well enough to update the gerrit UI to match | 05:24 |
lifeless | clarkb: I have an unknown text format on my left, and a similar-but-different web UI on my right. | 05:24 |
clarkb | I expected https://review.openstack.org/Documentation/project-setup.html or https://review.openstack.org/Documentation/access-control.html to describe how to push into refs/meta/config | 05:24 |
clarkb | but neither page does | 05:25 |
lifeless | clarkb: folk with practice will know, but not newcomers. I'd like to make it just do X, where X is detailed and has no if's, but's or else's in it. | 05:25 |
clarkb | agreed, I think the steps that explain how to do that in the old docs need to be put back into the existing docs | 05:25 |
lifeless | ok. I shall reinstate them then :) | 05:26 |
clarkb | lifeless: give me a minute I will paste something | 05:26 |
lifeless | thank you! | 05:28 |
lifeless | see, I'm a fumbling newbie, which is why I only make progress when nagging one of you lot :) | 05:28 |
lifeless | I very much doubt I will have jenkins and zuul live and running tests tomorrow :( | 05:28 |
lifeless | but, who knows, maybe I will! | 05:28 |
*** vipul is now known as vipul-away | 05:29 | |
*** Adri2000 has joined #openstack-infra | 05:31 | |
*** nicedice_ has quit IRC | 05:31 | |
*** vogxn has joined #openstack-infra | 05:34 | |
clarkb | lifeless: http://paste.openstack.org/show/45718/ something like that | 05:35 |
lifeless | clarkb: will you be up for much longer? must be nearly 11? | 05:35 |
clarkb | not much longer | 05:35 |
*** openstack has joined #openstack-infra | 14:53 | |
*** openstackstatus has joined #openstack-infra | 14:54 | |
fungi | and they're back | 14:54 |
dizquierdo | hi clarkb, if you're around we can discuss a bit about https://review.openstack.org/#/c/44057/ | 14:54 |
anteaya | yay | 14:54 |
anteaya | thanks | 14:54 |
dizquierdo | from our side we're ready :) | 14:55 |
anteaya | makes reading the backscroll shorter | 14:55 |
*** openstackgerrit has joined #openstack-infra | 14:55 | |
fungi | statusbot had stopped completely, meetbot was simply indefinitely disconnected | 14:55 |
jeblair | dizquierdo: ok, we can approve it now | 14:55 |
fungi | and there's gerritbot... the gang's all here | 14:55 |
*** lcanas has joined #openstack-infra | 14:55 | |
dizquierdo | great thanks jeblair :) | 14:55 |
anteaya | sorry in advance if I bring up a topic that was discussed in the ensuing 9.5 hour outage | 14:55 |
Alex_Gaynor | mordred: Thinking out loud, one thing I think would go a long way to reducing the frustration people are feeling would be increased visibility into what's going on with the neutron failures. What do we know, how are we looking to fix it, are there reviews I should follow, etc. | 14:55 |
jeblair | dizquierdo: we just wanted to make sure that you were ready to switch over to using gerrit (so we wouldn't have to try to move more commits over) | 14:55 |
dizquierdo | we've got a couple of questions (lcanas and me) | 14:55 |
dizquierdo | regarding the process... we have a bot uploading new datasets, so the data is kept updated daily | 14:56 |
dizquierdo | but... not sure how we should proceed now | 14:56 |
*** wenlock has joined #openstack-infra | 14:56 | |
jeblair | dizquierdo: you can still have a bot upload new patches (we do that with translations), but you'll need to manually approve them | 14:56 |
dizquierdo | or, leave datasets in another place, and source code in the gerrit process (perhaps a better idea) | 14:56 |
dizquierdo | ok, then we can leave datasets automatically updated by the bot and later check if this is scalable :) | 14:57 |
mordred | Alex_Gaynor: that's a good idea. I'm not sure where the start of that is... | 14:58 |
dizquierdo | let's go then | 14:58 |
jeblair | Alex_Gaynor: we could start by talking to markmcclain | 14:58 |
mordred | markmcclain: ^^ who would be a good person on neutron side to ask about? | 14:58 |
mordred | jeblair: you beat me to it | 14:58 |
Alex_Gaynor | mordred: A single over-arching task to follow on launchpad, instead of a revolving door of random symptoms would be one place | 14:59 |
anteaya | s/ensuing/previous | 14:59 |
ttx | fungi/jeblair: Looks like there is some shortage in gate-*-python26-able nodes | 15:00 |
markmcclain | Alex_Gaynor: I've been working to more folks to help fix the random failures | 15:00 |
markmcclain | *to recruit | 15:00 |
sdague | ttx: I think it will resolve itself | 15:00 |
ttx | sdague: ok thx :) | 15:00 |
jeblair | dizquierdo: i approved the change; it'll probably be a while before it merges because the test system is recovering from a network incident earlier | 15:00 |
*** mgagne has joined #openstack-infra | 15:01 | |
ttx | better safe than sorry on this day | 15:01 |
dizquierdo | thanks jeblair, let me know any issue you may find :) | 15:01 |
sdague | ttx: there was the big restart, and py26 jobs are slow | 15:01 |
sdague | but in reality I think they'll catch up past the tempest jobs in another 30 minutes | 15:01 |
fungi | agreed, we're only starting to see it recently because the tempest jobs are now so much faster the py26 unit tests for some projects are rivalling them for duration now | 15:02 |
sdague | cinder py26 jobs are only 5 mins, for instance | 15:02 |
Alex_Gaynor | markmcclain: It'd be great to have one over-arching place to track that and get status updates | 15:02 |
ttx | jeblair: you should tweak the zuul status graphs so that they don't show a drop to 0 on the current hour :) | 15:02 |
sdague | heh | 15:03 |
markmcclain | Alex_Gaynor: I'll work on consolidating the info | 15:03 |
fungi | that would probably require predictive modelling, or a lag for the sample duration | 15:03 |
Alex_Gaynor | markmcclain: a single launchpad issue to track all the downstream manifestations would be good I think | 15:03 |
jeblair | ttx: yeah, i'm not sure i found the right graphite function for that yet -- i found one that does a rolling history, but that would make the graph continuously change shape | 15:03 |
fungi | jeblair: spline fitting ;) | 15:04 |
jeblair | ttx: i think what we want is one that scales the most recent value by how complete the hour is | 15:04 |
ttx | jeblair: and uses a dotted line to show that "prediction" | 15:04 |
ttx | a bit tricky :) | 15:05 |
boris-42 | fungi hi | 15:05 |
fungi | boris-42: howdy | 15:06 |
boris-42 | fungi how are you? | 15:06 |
sdague | oh man neutron unit tests are longer than tempest tests huh. | 15:06 |
sdague | at least on 26 | 15:07 |
fungi | boris-42: caffeinated | 15:07 |
boris-42 | fungi and red eyes?) | 15:07 |
fungi | boris-42: always, though i think i inherited them | 15:07 |
jeblair | sdague: congrats! :) | 15:07 |
boris-42 | fungi could I ask about adding new project to stackforge https://review.openstack.org/#/c/44952/ ?) | 15:07 |
fungi | sdague: new target ;) | 15:07 |
sdague | :) | 15:07 |
boris-42 | fungi it is terrible to work on a project without OpenStack CI=) | 15:08 |
*** vogxn has quit IRC | 15:08 | |
fungi | boris-42: i agree! | 15:08 |
sdague | yeh, I only knew the nova timings off the top of my head, which come in at 20 mins | 15:08 |
*** senk has joined #openstack-infra | 15:08 | |
fungi | boris-42: taking a look at the proposal real quick | 15:08 |
*** datsun180b_ has joined #openstack-infra | 15:08 | |
sdague | so actually, given that, we probably could use some more 26 nodes if that's an option | 15:09 |
*** datsun180b has quit IRC | 15:09 | |
*** datsun180b_ is now known as datsun180b | 15:09 | |
jeblair | boris-42: we're happy to have more projects in general. i will review the proposal during my regular reviews (we are usually pretty good about reviewing all changes to /config) | 15:10 |
boris-42 | jeblair thank you | 15:10 |
fungi | as for network outage correlation, gerritbot (on gerrit) and meetbot (on eavesdrop) both fell off irc at 05:40 | 15:10 |
boris-42 | fungi thank you also=) | 15:10 |
fungi | which is around the time of the gaps in cacti graphs as well | 15:10 |
jeblair | boris-42: you should expect that a core member will have reviewed it within 24-48 hours -- we are very busy so be patient. :) | 15:10 |
fungi | and of the various "hung" stale jobs | 15:11 |
boris-42 | jeblair I will be really happy to get reviews in so short a time=) | 15:11 |
boris-42 | if I get* | 15:11 |
*** senk1 has joined #openstack-infra | 15:11 | |
*** hashar has joined #openstack-infra | 15:11 | |
openstackgerrit | A change was merged to openstack-infra/config: Addition of the activity-board project to the OpenStack-infra environment https://review.openstack.org/44057 | 15:12 |
jeblair | sdague: let's keep an eye on it; we bumped the slave counts last week and the current values seemed to be about right (we went from 32->40 precise and 12->14 centos) | 15:12 |
sdague | jeblair: ok | 15:12 |
*** xchu has joined #openstack-infra | 15:12 | |
fungi | aha! https://status.rackspace.com/ indicates a network maintenance in "DFW Datacenter- September 4th from 12:01 AM to 6:00 AM CDT" | 15:12 |
sdague | jeblair: the big issue is neutron actually takes longer on 26 than tempest | 15:12 |
fungi | that would be... pretty much all our long-lived systems | 15:12 |
*** senk has quit IRC | 15:13 | |
sdague | so a lot of neutron jobs in the queue means that we actually back up on 26 | 15:13 |
*** reed has joined #openstack-infra | 15:13 | |
sdague | actually, python 26 neutron is the longest job we have in the queue at all (takes ~ 30 minutes) | 15:13 |
jeblair | sdague: ok, makes sense. let's validate that holds up the head of the queue (and not just deep changes) and we can bump it some more. | 15:15 |
sdague | yep, sure | 15:15 |
*** pcrews has joined #openstack-infra | 15:17 | |
Alex_Gaynor | sdague: clearly we should drop python 2.6 :snrk: | 15:19 |
jeblair | i have to run an errand; bbiab | 15:19 |
fungi | k | 15:19 |
Alex_Gaynor | So it looks like a bunch of post- jobs fail, does anyone trac those? | 15:20 |
Alex_Gaynor | track* | 15:20 |
fungi | Alex_Gaynor: not with any organized regularity | 15:20 |
fungi | i think it varies by project | 15:20 |
Alex_Gaynor | fungi: I wonder if failing post- tests should leave a comment on the review that spawned it, just so people see it | 15:21 |
fungi | we talked about that... i'm also not sure people look at reviews any longer once they're merged. they fall off the radar for the most part | 15:21 |
Alex_Gaynor | yeah, but they'll get an email | 15:21 |
sdague | mordred: any idea why the neutron tox.ini definition doesn't actually print out the slowest tests? | 15:21 |
Alex_Gaynor | or at least if they're like me they'll get an email | 15:21 |
mordred | sdague: nope. not intentional to best of my knowledge | 15:22 |
fungi | yeah, i get so many gerrit e-mails any more i have a hard time paying attention to them | 15:22 |
Alex_Gaynor | fungi: Ah interesting, I use my inbox for tracking "TODO" things | 15:22 |
*** kmartin has quit IRC | 15:22 | |
*** datsun180b has quit IRC | 15:23 | |
sdague | mordred: it looks like it should | 15:23 |
fungi | i should probably tune my gerrit watches and set up some more fine-grained filtering on my mta for different kinds of messages from gerrit | 15:23 |
*** datsun180b has joined #openstack-infra | 15:23 | |
mordred | fungi: enough people seem to use email like Alex_Gaynor does, that perhaps leaving post comments on merged changes would get failed post jobs _some_ attention | 15:23 |
sdague | mordred: oh, does it only work when testr passes? | 15:24 |
*** kiall has quit IRC | 15:24 | |
mordred | sdague: yes. I believe this is the case | 15:24 |
fungi | mordred: Alex_Gaynor: jeblair said something about getting a better post job result reporting system integrated with zuul soon, so i'm hoping that will be a little better than gerrit comment e-mails anyway | 15:24 |
Alex_Gaynor | ah, cool | 15:24 |
jeblair | fungi: that was for bitrot | 15:25 |
fungi | oh, for periodics, not posts | 15:25 |
fungi | hrm :/ | 15:25 |
sdague | ok, that's fair | 15:25 |
*** dkehn has quit IRC | 15:25 | |
sdague | I guess it's just that neutron has 15k unit tests | 15:25 |
sdague | the slowest is only 7s | 15:25 |
*** gyee has joined #openstack-infra | 15:25 | |
*** UtahDave has joined #openstack-infra | 15:26 | |
dstufft | Alex_Gaynor: inbox is my TODO list, worst one besides all the other ones :] | 15:26 |
jeblair | the post jobs don't know what change they are "for", which makes having them report back difficult | 15:26 |
*** Bada has quit IRC | 15:26 | |
jeblair | i have a 90% completed patch to gerrit to add the info we would need to change-merged events to use them instead of ref-updated | 15:26 |
jeblair | but i'm not sure we _want_ to use them instead of ref-updated | 15:27 |
fungi | and having zuul directly e-mail the author or committer e-mail is likely not a great solution either i guess | 15:27 |
mordred | jeblair: nod. good point | 15:27 |
*** nati_ueno has quit IRC | 15:27 | |
mordred | jeblair: and I agree re ref-updated | 15:27 |
*** nati_ueno has joined #openstack-infra | 15:28 | |
*** vogxn has joined #openstack-infra | 15:28 | |
fungi | we would still need ref-updated to catch tags and new branch creation presumably | 15:29 |
fungi | or is that ref-created | 15:29 |
*** vogxn has quit IRC | 15:29 | |
*** pentameter has joined #openstack-infra | 15:30 | |
fungi | yeah, no ref-created event | 15:30 |
fungi | so i think ref-updated is the only thing that catches tags and new branches | 15:31 |
*** ruhe has joined #openstack-infra | 15:31 | |
ryanpetrello | so I know that OpenStack aims to support Py3.3 | 15:32 |
*** nati_ueno has quit IRC | 15:32 | |
ryanpetrello | but for stackforge projects that would like to support 32 | 15:32 |
ryanpetrello | are there any options? | 15:33 |
ryanpetrello | I know we've gotten e.g., pypy working for some projects | 15:33 |
fungi | ryanpetrello: not really, no. we have to keep a bank of dedicated slaves per python version, so offering slaves specifically to support stackforge projects isn't really in the project's best interests resource-wise | 15:33 |
*** dkehn has joined #openstack-infra | 15:34 | |
fungi | right now the combination of needing some system-wide dependencies installed via pip and our use of puppet to manage those slaves and the fact that puppet doesn't deal with the concept of multiple pip versions to support parallel installation of multiple python versions... | 15:34 |
ryanpetrello | right | 15:35 |
ryanpetrello | makes sense | 15:35 |
sdague | man neutron unit tests are extra racy today, huh? | 15:35 |
fungi | in general, puppet in fact doesn't deal with the idea that you may want to install multiple packages of the same name via different mechanisms on one system | 15:35 |
markmcclain | sdague: yeah… been looking into that | 15:37 |
fungi | so using pip-2.7 and pip-3.2 to install tox for both python2.7 and python3.2 interpreters on one machine, for example, can't be orchestrated with puppet currently | 15:37 |
*** mestery_ has joined #openstack-infra | 15:37 | |
Alex_Gaynor | fungi: hmm, this doesn't sound right, you don't need to have a tox installed per vm, for example I just reused the existing py3 builders for PyPy | 15:38 |
fungi | Alex_Gaynor: well, tox is probably a bad example | 15:38 |
dhellmann | how do we get pip installed for python3.3 and pypy? (those are on the same nodes, right?) | 15:39 |
Alex_Gaynor | dhellmann: we don't have a special pip for pypy, stuff is just installed in the tox venv | 15:40 |
Alex_Gaynor | dhellmann: why would we need a per-pypy pip? | 15:40 |
fungi | Alex_Gaynor: but in general, we'd need a long-term-stable distribution supporting the system-wide dependencies of sufficient versions for the right python interpreters | 15:40 |
mordred | so - the tox/python problem actually should go away with latest tox once we put in config file snippets | 15:40 |
*** mestery has quit IRC | 15:40 | |
*** dkehn has quit IRC | 15:40 | |
mordred | the problem before was the tox used system python, not venv python, to run the sdist/install | 15:40 |
mordred | which meant that setup_requires got angry | 15:40 |
mordred | new tox has an option to run setup.py develop in the context of the venv | 15:41 |
dhellmann | Alex_Gaynor: good point. So, fungi, why do we care about having multiple interpreters installed side-by-side? | 15:41 |
mordred | which means that just using tox will get us much further than previously was possible | 15:41 |
*** dkehn has joined #openstack-infra | 15:41 | |
fungi | dhellmann: we'd like to install multiple interpreters so we need fewer different systems to manage | 15:41 |
* dhellmann is confused | 15:42 | |
fungi | dhellmann: but currently the problem is that there are no lts distros with python 3 as their system default python | 15:42 |
mordred | dhellmann: right now, because multiple interpreters on the same box actually break things in our system | 15:42 |
dhellmann | ah | 15:42 |
mordred | dhellmann: we have to maintain a different set of slaves for each interp we want | 15:42 |
dhellmann | so we can't just install 3.2 on a 3.3 node | 15:42 |
mordred | that's obviously poop | 15:42 |
mordred | right | 15:42 |
dhellmann | k | 15:42 |
mordred | although - I _think_ we can now | 15:42 |
mordred | we just haven't tested it yet | 15:42 |
mordred | because, you know, feature freeze | 15:43 |
* dhellmann is impatient to get DreamCompute running well enough to offer nodes | 15:43 | |
*** mestery_ is now known as mestery | 15:43 | |
fungi | dhellmann: well, we could install 3.2 on a 3.3 node but we couldn't use puppet to tell pip to install versions of system-wide dependencies for both 3.2 and 3.3 | 15:43 |
mordred | fungi: which pip installed system-wide depends do we need on unittest slaves? | 15:43 |
dhellmann | fungi: what mordred said | 15:43 |
mordred | fungi: tox is the only one in that context, right? | 15:43 |
fungi | mordred: fewer and fewer... having a look now | 15:43 |
*** vipul is now known as vipul-away | 15:44 | |
fungi | setuptools-git currently as well | 15:44 |
*** kiall has joined #openstack-infra | 15:44 | |
fungi | oh, and python-subunit, git-review... | 15:45 |
sdague | hmmm... why did the pending python26 nova jobs get cancelled when something further up the stack in the gate failed? Those shouldn't be linked. | 15:46 |
mordred | fungi: we do not need setuptools-git | 15:46 |
fungi | virtualenv too | 15:46 |
mordred | fungi: we do not need per-interp virtualenv, and we should not need per-interp subunit | 15:47 |
fungi | okay, so as long as we can install them somewhere the python interpreters we're running can find them | 15:48 |
mordred | fungi: but that's a great list of things we should go clean up | 15:48 |
mordred | fungi: for any project that produces subunit output, subunit should be installed into the venv - so we should be able to access it by running things in the context of .tox/py$venv/bin/python | 15:48 |
fungi | any of that stuff which we can (or nowtimes already do) run from inside a virtualenv instead, ought to get cleaned up | 15:49 |
mordred | fungi: a global virtualenv is fine for tox to use with all of the python versions | 15:49 |
mordred | yup | 15:49 |
mordred | but first, we need to test that new version of tox works like we expect | 15:49 |
*** vipul-away is now known as vipul | 15:50 | |
fungi | and also maybe test some assumptions about whether everyone has upgraded to our current ways of packaging and running tests | 15:50 |
openstackgerrit | A change was merged to openstack-infra/gitdm: Assign Andreas Jaeger to SUSE https://review.openstack.org/43273 | 15:50 |
mordred | ah! | 15:50 |
mordred | https://review.openstack.org/#/c/42178/ | 15:50 |
mordred | fungi: ^^ | 15:50 |
mordred | that merged | 15:50 |
*** yaguang has quit IRC | 15:50 | |
mordred | which means it's tested :) | 15:50 |
fungi | another miracle! | 15:51 |
mordred | the above patch should be ported to all of the openstack projects | 15:51 |
*** changbl has joined #openstack-infra | 15:51 | |
mordred | oh - actually - heads up everyone | 15:51 |
mordred | that patch landed at midnight | 15:51 |
fungi | good news everyone? | 15:51 |
mordred | it requires tox 1.6 - which is fine for us | 15:52 |
mordred | but it's possible that we might get questions from nova devs | 15:52 |
mordred | about why local tox runs stopped working | 15:52 |
mordred | tox reports that they need to upgrade tox | 15:52 |
mordred | but, you know, it's possible they won't read and will ask us | 15:52 |
* mordred sends out an email to the list to tell people to upgrade | 15:53 | |
*** lcanas has quit IRC | 15:53 | |
*** dhellmann is now known as dhellmann_ | 15:55 | |
fungi | thanks for the heads up | 15:55 |
anteaya | sdague: are the pending python26 nova jobs you are looking at in the gate queue? | 15:57 |
anteaya | the ones that got cancelled? | 15:57 |
sdague | anteaya: mostly, I think the fact that we've got a bunch of devstack jobs in the queue is mitigating it | 15:57 |
sdague | yes | 15:57 |
anteaya | okay | 15:58 |
sdague | I forgot that we reset everything | 15:58 |
sdague | even the non linked jobs | 15:58 |
anteaya | how zuul deals with jobs running after the failed job is different now | 15:58 |
*** vogxn has joined #openstack-infra | 15:58 | |
anteaya | as soon as a patch in the gate has a voting failure all jobs after it cease | 15:58 |
sdague | that might be something to think about as future optimization, especially with unit tests now being on order same length as integration tests | 15:58 |
*** vogxn has quit IRC | 15:58 | |
anteaya | the tests that were running used to keep running and they don't now | 15:59 |
sdague | ok | 15:59 |
fungi | if nothing else, this has provided a nice burst of job volume to help prove out the most recent round of performance optimizations | 16:00 |
anteaya | once jeblair has the chance to document all the details, I recommend it | 16:00 |
*** dina_belova has quit IRC | 16:00 | |
anteaya | yes it frees up the nodes to run jobs that have a point to finishing | 16:00 |
*** tstevenson has joined #openstack-infra | 16:00 | |
*** vogxn has joined #openstack-infra | 16:00 | |
fungi | spiked up to around 700 jobs last hour | 16:01 |
*** datsun180b_ has joined #openstack-infra | 16:01 | |
*** vogxn has quit IRC | 16:01 | |
*** svarnau has joined #openstack-infra | 16:01 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add cookiecutter templates repo https://review.openstack.org/42530 | 16:01 |
anteaya | sdague: I don't know if that particular change is covered in this blog post or not: http://amo-probos.org/post/15 it is still on my todo list | 16:02 |
mordred | mrodden: have you seen https://review.openstack.org/#/c/42530 ? | 16:02 |
*** xchu has quit IRC | 16:02 | |
mordred | mrodden: I'm reviewing the run-mirror patch, and I realized I should maybe point you at the cookiecutter stuff in case you wanted to start from a fresh repo | 16:02 |
fungi | anteaya: the blog post is a little higher-level and doesn't go into queue management minutiae quite so much | 16:03 |
mordred | but now as I type that, I believe the plan was to use the jeepyb repo as a seed - so nevermind | 16:03 |
anteaya | fungi: ah okay | 16:03 |
*** datsun180b has quit IRC | 16:03 | |
*** datsun180b_ is now known as datsun180b | 16:03 | |
anteaya | so yes, once jeblair can catch his breath, I recommend hearing the details - nodepool, zuul and friends had a fair makeover in the last 2 weeks | 16:04 |
anteaya | oh and we have 2 check queues/pipelines, which I believe work independently of each other | 16:04 |
mrodden | mordred: i had not seen it... | 16:05 |
anteaya | and zuul is more likely to kick out a patch that it can't merge, not because tests failed but because it needs to spend resources on merging patches with a higher probability of getting in | 16:05 |
mrodden | mordred: we are mostly copying what jeepyb has though, and tailoring it to the new pypi-mirror project | 16:05 |
mordred | mrodden: yeah. I actually read the change this time :P) | 16:05 |
anteaya | you will see the rare comment, identifying this with the advice to rebase and try again | 16:05 |
mordred | jog0: speaking of the above - I think you were out of town when I discovered https://review.openstack.org/42530 | 16:06 |
openstackgerrit | A change was merged to openstack-infra/config: Create new pypi-mirror project for run_mirror.py https://review.openstack.org/39399 | 16:06 |
*** SergeyLukjanov has quit IRC | 16:08 | |
*** tstevenson has quit IRC | 16:08 | |
fungi | anteaya: yeah, the refusal to test changes which won't merge on top of those ahead of it went in a while back | 16:09 |
*** nati_ueno has joined #openstack-infra | 16:09 | |
*** jpmelos has left #openstack-infra | 16:09 | |
anteaya | fungi: ah okay, not new then, new for me | 16:09 |
fungi | i was only really not around for stuff which changed last week | 16:09 |
anteaya | fungi: cool | 16:09 |
mordred | oh! did my experimental thing land? neat | 16:10 |
anteaya | forgive me if I have a hard time keeping track and duplicate information | 16:10 |
anteaya | the last two weeks have been a bit of a blur | 16:10 |
fungi | anteaya: no worries. i think i'm basically caught up on the fundamental changes at this point | 16:10 |
anteaya | yay | 16:10 |
anteaya | so glad you are back, fungi | 16:11 |
anteaya | :D | 16:11 |
*** vogxn has joined #openstack-infra | 16:11 | |
fungi | me too! | 16:11 |
*** odyssey4me has quit IRC | 16:12 | |
anteaya | yeah, less chance of being allergic to something at home, eh? | 16:12 |
mordred | fungi: I added a thing that merged while I was on playa that you might find interesting ... https://review.openstack.org/#/c/41954/ (I just noticed because jog0 was using it) | 16:13 |
anteaya | oh yes the experimental queue has been getting some exercise | 16:14 |
*** boris-42 has quit IRC | 16:14 | |
anteaya | not recently, going by its sparkline but last week it did | 16:15 |
fungi | mordred: oh, right. i remember the discussion around that. awesome | 16:15 |
*** openstackgerrit has quit IRC | 16:16 | |
*** openstackgerrit has joined #openstack-infra | 16:17 | |
*** amotoki has quit IRC | 16:17 | |
mordred | I love the sparklines | 16:18 |
ttx | impressive list of potential merges in the gate right now for the next 10 min | 16:18 |
ttx | gone now | 16:18 |
openstackgerrit | Ilya Shakhat proposed a change to openstack-infra/config: Add projects of Fuel family to Stackforge https://review.openstack.org/45044 | 16:19 |
* ttx pauses | 16:19 | |
anteaya | ttx do you know how many went in? | 16:19 |
*** ruhe has quit IRC | 16:21 | |
sdague | ttx: what caused the reset? I seem to have missed that | 16:21 |
ttx | sdague: there were like a neat line of 20+ almost merged and then poof | 16:27 |
*** svarnau_ has joined #openstack-infra | 16:27 | |
ttx | probably their success is being queued up to the post pipe | 16:27 |
ttx | will be back in 2/3 hours | 16:28 |
*** fbo is now known as fbo_away | 16:28 | |
*** jpich has quit IRC | 16:29 | |
*** svarnau has quit IRC | 16:29 | |
*** kiall has quit IRC | 16:30 | |
*** atiwari has joined #openstack-infra | 16:33 | |
*** senk1 has quit IRC | 16:33 | |
zaro | morning | 16:34 |
clarkb | morning | 16:35 |
*** mrmartin has joined #openstack-infra | 16:35 | |
clarkb | mordred: yes experimental pipeline merged. I think it is a big success | 16:36 |
clarkb | mordred: dkranz is using it to develop heat tempest tests too | 16:36 |
*** sridevi has joined #openstack-infra | 16:36 | |
mordred | neat | 16:36 |
derekh | hi all, would anybody have a chance to look at https://bugs.launchpad.net/openstack-ci/+bug/1217815 please | 16:37 |
uvirtbot | Launchpad bug 1217815 in openstack-ci "Tripleo ci service account in gerrit" [Undecided,New] | 16:37 |
fungi | sdague: which reset? the zuul restart i did this morning or a more recent change getting kicked out unleashing the stack behind it for retesting? | 16:38 |
clarkb | elasticsearch went sideways again | 16:38 |
*** dkehn has quit IRC | 16:38 | |
fungi | clarkb: we probably blew its mind with the surge this morning... or did it suffer from the network maintenance? | 16:39 |
clarkb | fungi: looks like networking issues around 0542 | 16:39 |
clarkb | er 0538 | 16:39 |
*** nicedice_ has joined #openstack-infra | 16:39 | |
fungi | clarkb: that would be the same outage which killed everything else then | 16:40 |
clarkb | [elasticsearch] master_left [[elasticsearch5][YcfPnvKYRaOYjCh8zMJQig][inet[/166.78.24.7:9300]]], reason [failed to ping, tried [3] times, each with maximum [30s] timeout] | 16:40 |
clarkb | fungi: ok, catching up I take it there was an unplanned network flap? | 16:40 |
zaro | clarkb, mordred : bad news :( https://gerrit-review.googlesource.com/#/c/48254/ | 16:40 |
fungi | planned maintenance rs was doing. i think dfw was down hard for close to 30 minutes | 16:40 |
clarkb | I am going to restart elasticsearch services so that they rediscover each other | 16:40 |
*** kiall has joined #openstack-infra | 16:41 | |
*** kiall has quit IRC | 16:41 | |
*** kiall has joined #openstack-infra | 16:41 | |
*** senk has joined #openstack-infra | 16:41 | |
mordred | zaro: uhm. I'm confused by his confusion | 16:42 |
zaro | clarkb, mordred : mfick feels that change owner needs to work with all permissions, not just label. | 16:42 |
clarkb | now which side of the split brain do I want to kick | 16:42 |
* clarkb picks one semi-arbitrarily | 16:42 | |
mordred | zaro: ah. gotcha. so, more work to do then I guess? | 16:43 |
zaro | mordred: i asked him about it, here's the entire discussion if you are interested: http://paste.openstack.org/show/45751/ | 16:43 |
mordred | zaro: will that be hard? or just more work | 16:43 |
*** senk1 has joined #openstack-infra | 16:43 | |
mordred | cool | 16:43 |
zaro | mordred: it's more work that i don't know how to do yet. I mean i don't know how change owner should apply to all of the other permissions. i would just be taking a guess. | 16:44 |
*** senk1 has quit IRC | 16:44 | |
mordred | oy | 16:45 |
*** senk has quit IRC | 16:45 | |
*** jhesketh has quit IRC | 16:45 | |
zaro | mordred: mfick suggested that i request suggestions (doing RFC on commit message) to get feedback on how that group should work with all the other permissions. | 16:46 |
zaro | mordred: or to implement it the best i can (don't like this suggestion). | 16:47 |
mordred | zaro: I think that's a good idea. also, jeblair and clarkb may have thoughts - but since that's not a use case for us, we might not have great suggestions | 16:47 |
pleia2 | anteaya: oh! almost forgot, did you get your ubuntu question answered re: installing packages? | 16:47 |
mordred | perhaps also send a message to the repo-discuss mailing list asking for opinions? | 16:47 |
anteaya | pleia2: not yet, I went with precise and nothing blew up | 16:48 |
clarkb | elasticsearch cluster health is now "yellow"... now we wait for it to recover shards and indices | 16:48 |
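A minimal sketch of polling for the recovery described above, assuming the cluster exposes the standard `_cluster/health` HTTP endpoint. The `http://localhost:9200` address and the helper names are assumptions for illustration, not taken from the infra configuration.

```python
import json
import time
import urllib.request


def parse_status(body):
    """Return the 'status' field (green/yellow/red) from a health response body."""
    return json.loads(body)["status"]


def fetch_status(base_url="http://localhost:9200"):
    """Fetch cluster health over HTTP; base_url is an assumed address."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return parse_status(resp.read().decode("utf-8"))


def wait_for_green(get_status=fetch_status, interval=30, max_polls=120,
                   sleep=time.sleep):
    """Poll until the cluster reports green; True on success, False on timeout."""
    for _ in range(max_polls):
        if get_status() == "green":
            return True
        sleep(interval)
    return False
```

Passing `get_status` and `sleep` as parameters keeps the polling loop testable without a live cluster.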
anteaya | now I can't get it to work, but that may be for other reasons | 16:48 |
zaro | mordred: yeah, i can do all of that but it seems like it will probably be much more work and take longer to get into core. | 16:48 |
zaro | mordred: what about doing as a plugin? | 16:48 |
pleia2 | anteaya: re: ubuntu and xubuntu, they pull from the exact same repository so almost everything worse, it's desktop-specific stuff that won't (xfce panel stuff won't work in Unity, for example) | 16:49 |
pleia2 | s/worse/work | 16:49 |
* pleia2 needs to apply more coffee | 16:50 | |
mordred | zaro: I'm not sure. | 16:50 |
*** dkehn has joined #openstack-infra | 16:50 | |
zaro | clarkb, jeblair: your thoughts on implementing gerrit WIP as a plugin? | 16:51 |
*** kiall has quit IRC | 16:52 | |
*** derekh has quit IRC | 16:53 | |
anteaya | pleia2: yeah that is what I figured, it appeared to install okay, I will continue to figure out what else I am doing wrong | 16:53 |
anteaya | happy coffee | 16:53 |
*** svarnau has joined #openstack-infra | 16:53 | |
pleia2 | anteaya: anything I can help with? | 16:53 |
*** ruhe has joined #openstack-infra | 16:54 | |
zaro | anteaya: happy wifi? | 16:54 |
anteaya | zaro: yes, thank you - I slept so much better last night | 16:54 |
*** nati_ueno has quit IRC | 16:55 | |
*** kiall has joined #openstack-infra | 16:55 | |
anteaya | pleia2: probably, let me finish off this task and get refocused and I'll find you after coffee | 16:55 |
pleia2 | hah, ok :) | 16:55 |
*** nati_ueno has joined #openstack-infra | 16:55 | |
*** svarnau_ has quit IRC | 16:56 | |
fungi | zaro: writing it as a plugin still leaves us with code to maintain indefinitely. getting the feature integrated upstream in some form usable by us leaves us debt-free in that regard | 16:56 |
*** jaypipes has quit IRC | 16:57 | |
*** nati_ueno has quit IRC | 16:57 | |
clarkb | sdague: I think testr is still counting each test twice. so neutron has ~7.5k tests | 16:57 |
*** nati_ueno has joined #openstack-infra | 16:57 | |
clarkb | fungi: after skimming scrollback I see you had a very fun and exciting morning :) thank you for taking care of that | 16:58 |
*** vogxn has quit IRC | 16:58 | |
fungi | clarkb: it wasn't that bad, really | 16:58 |
clarkb | I feel bad because I was going to bed just as things were breaking, but I probably wouldn't have been much use then if I had been paying attention to zuul | 16:58 |
*** nati_ueno has quit IRC | 16:58 | |
*** hashar has quit IRC | 16:58 | |
*** nati_ueno has joined #openstack-infra | 16:59 | |
fungi | i mean, it set back the patches trying to get through for the milestone, but it wasn't really that hard to get things working again | 16:59 |
*** olaph has quit IRC | 16:59 | |
fungi | and hopefully it wasn't too severe a setback overall | 17:00 |
*** ruhe has quit IRC | 17:00 | |
*** olaph has joined #openstack-infra | 17:00 | |
fungi | also, i really like that 700jph spike on the graph which resulted (and didn't destroy anything as far as i can tell, so awesome new high water mark?) | 17:01 |
clarkb | if it isn't a record jph it is near it | 17:01 |
clarkb | sdague: yes the neutron unit tests need some love. yesterday I got tired of jenkins emailing me about sudo attempts so I dug into their tests and found that `sudo ovs-vsctl` isn't deterministically mocked out, depending on test order | 17:02 |
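One common way to avoid order-dependent mocking is to install the patch in `setUp`, so every test starts mocked regardless of what ran before it. The sketch below is illustrative only: `run_ovs_vsctl` and `OvsTestCase` are hypothetical names, not neutron's actual code.

```python
import subprocess
import unittest
from unittest import mock


def run_ovs_vsctl(args):
    """Hypothetical helper standing in for code that shells out via sudo."""
    return subprocess.check_output(["sudo", "ovs-vsctl"] + list(args))


class OvsTestCase(unittest.TestCase):
    def setUp(self):
        # Patch in setUp so each test is mocked no matter which test ran first.
        patcher = mock.patch("subprocess.check_output", return_value=b"")
        self.check_output = patcher.start()
        # addCleanup removes the patch even when the test fails.
        self.addCleanup(patcher.stop)

    def test_add_bridge(self):
        run_ovs_vsctl(["add-br", "br-int"])
        self.check_output.assert_called_once_with(
            ["sudo", "ovs-vsctl", "add-br", "br-int"])
```

With the patch applied per test, no real `sudo` invocation can leak out when tests are reordered or run in parallel.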
clarkb | sdague: so be warned if you run the tests with a sudo enabled user :) | 17:02 |
zaro | fungi: yes, fungi, i understand. the trade-off is that we could get the feature we want in faster without having to meet the strict requirements imposed by gerrit core. | 17:02 |
*** Ryan_Lane has joined #openstack-infra | 17:03 | |
clarkb | zaro: did mfick not offer what the behavior should be for the other permission types? | 17:03 |
zaro | fungi: without doing any more work as well. | 17:03 |
*** Ryan_Lane has quit IRC | 17:03 | |
*** nati_ueno has quit IRC | 17:03 | |
zaro | clarkb: i asked but he did not. | 17:03 |
fungi | zaro: yep, it is definitely a workload balance question | 17:04 |
zaro | clarkb: his suggestion was 'do your best' | 17:04 |
clarkb | :/ | 17:04 |
zaro | yep, double that | 17:04 |
zaro | he did say that i should wait for other people to provide feedback, but since he's already voted -1 i don't know if anyone else will bother. | 17:06 |
openstackgerrit | A change was merged to openstack-infra/config: Add doc build process into Stackalytics https://review.openstack.org/44887 | 17:06 |
anteaya | pleia2: the task I am working on is standing up owncloud, but if you are available I think we should work on puppet-dashboard first, what do you think? | 17:06 |
clarkb | sdague: https://review.openstack.org/#/c/43558/ the comments at the end of that review may be useful | 17:06 |
pleia2 | anteaya: ah, I've never done anything with owncloud | 17:07 |
anteaya | yeah, me either | 17:07 |
openstackgerrit | A change was merged to openstack-infra/config: Also install pypy-dev on pypy nodes https://review.openstack.org/44690 | 17:07 |
anteaya | I think puppet-dashboard is a higher priority right now, what do you think? | 17:07 |
pleia2 | anteaya: I'm kind of on a roll with baremetal stuff, give me an hour or so and then we can have a look at puppet-dashboard? | 17:07 |
anteaya | sure | 17:07 |
pleia2 | great | 17:08 |
anteaya | roll away | 17:08 |
pleia2 | :) | 17:08 |
anteaya | :D | 17:08 |
*** ruhe has joined #openstack-infra | 17:08 | |
*** boris-42 has joined #openstack-infra | 17:08 | |
*** senk has joined #openstack-infra | 17:10 | |
*** mrmartin has quit IRC | 17:16 | |
*** jaypipes has joined #openstack-infra | 17:16 | |
clarkb | new review.o.o and wiki mysql dumps look good | 17:16 |
clarkb | if they rotate properly tonight I will start work on making bup go | 17:16 |
*** nati_ueno has joined #openstack-infra | 17:17 | |
clarkb | mordred: fungi: bringing up multiple pythons on a slave again | 17:19 |
fungi | ya | 17:19 |
clarkb | mordred: fungi: doesn't new tox make the system python's sdist of the thing being tested less painful (or a non issue)? assuming we can remove other global dependencies it should be safe for projects to use python3.2 on precise for example? | 17:19 |
clarkb | the one exception to this is nova because libvirt (but they have a fake for libvirt if it isn't available globally so not a major problem) | 17:20 |
clarkb | fungi: also have you seen https://review.openstack.org/#/c/44378/ ? I can't figure out why Gerrit SSH is so derpy and fails like that | 17:20 |
clarkb | fungi: happens with our gerrit and gerrit 2.6.X | 17:21 |
*** dizquierdo has quit IRC | 17:22 | |
*** kiall has quit IRC | 17:23 | |
*** w__ has joined #openstack-infra | 17:26 | |
*** olaph has quit IRC | 17:28 | |
fungi | clarkb: i think once the tox/virtualenv installs are confirmed to work fine when spawned using a different python interpreter version than is used within the virtualenv, it's probably safe to start running tests from multiple pythons side-by-side where we can | 17:28 |
fungi | though there are also python installability concerns, for example getting python3.2 and python3.3 to install side-by-side on precise may take more packaging work since right now they step on each other's package names and paths in some places | 17:30 |
clarkb | fungi: interesting, so they aren't currently safe to install side by side as on Debian? | 17:30 |
fungi | nope | 17:30 |
*** kiall_ has joined #openstack-infra | 17:31 | |
*** kiall_ has quit IRC | 17:31 | |
*** kiall_ has joined #openstack-infra | 17:31 | |
fungi | now, how much packaging work it'll take to get them more debian-like is certainly worth exploring. might not be much | 17:31 |
dstufft | packaging is the worst | 17:32 |
fungi | but also debian's already moved out of the window where i can source 2.6, 2.7, 3.2 and 3.3 from the same reasonably-integrated suites | 17:32 |
zaro | mordred: https://git.openstack.org/cgit/openstack-infra/publications/ | 17:32 |
fungi | i have one system basically held in that state now for my own short-term sanity but don't expect to keep it like that too long | 17:33 |
zaro | mordred: ^ how do i get the pubs? | 17:33 |
fungi | in short, debian/jessie will release with 2.7 and either 3.3 or 3.4 | 17:33 |
clarkb | fungi: what about (and tell me if this is just over engineered) we run lxcs on each of our slaves for the different python versions | 17:33 |
clarkb | fungi: I think that leaves us with needing some way of locking a slave when one lxc is in use | 17:34 |
fungi | then we have (in effect) multiple slaves | 17:34 |
zaro | mordred: make-index doesn't work, there are no tags. | 17:34 |
clarkb | zaro: I have fixed that bug, I think the change merged yesterday | 17:34 |
fungi | and each lxc container needs to be kept up to date independently as a separate system anyway | 17:34 |
* zaro tries again | 17:34 | |
clarkb | zaro: but even then make-index isn't sufficient. Instead you need to look in the non master branches | 17:34 |
clarkb | zaro: overview for example contains the overview talk, you can just open it in firefox (chrom* doesn't do the zuul animation properly) | 17:35 |
fungi | clarkb: unless we build lxc containers on the fly like we do with virtualenvs, but then that leaves us maintaining the systems which build multiple different lxc containers if nothing else | 17:35 |
clarkb | fungi: that is less of an issue with docker and unionfs | 17:35 |
clarkb | we could respin the images once a day or so then boot containers on the fly | 17:36 |
fungi | and at that point, why not just build distinct system images for on-demand consumption as nodepool slaves? | 17:36 |
clarkb | feels very overengineered for the requirement of "support multiple pythons" | 17:36 |
clarkb | fungi: biggest reason is xen and kvm booting from glance images is slow | 17:37 |
clarkb | fungi: if you already have locally cached container filesystems booting should take almost no time | 17:37 |
fungi | what about mordred's dib/takeovernode/kexec work then? will that be close? | 17:37 |
clarkb | fungi: yeah, similar solutions to different problems | 17:38 |
clarkb | but maybe they can be treated as the same problem? kexec a different root fs based on python need? | 17:38 |
fungi | well, but if those become the de-facto solution underlying nodepool resources, and we move away from long-running slaves... | 17:39 |
*** dkehn has quit IRC | 17:40 | |
*** dkehn has joined #openstack-infra | 17:40 | |
zaro | clarkb: is there only 1 pub, or are there more? | 17:40 |
fungi | also, maybe we have a collating job which takes the various dib-created filesystems and makes them different gpt parts on the same block device, then uses that for all unconsumed slaves, we have a local cache of all possible slave versions available to takeovernode | 17:40 |
zaro | clarkb: i was expecting more. | 17:41 |
clarkb | zaro: currently I believe there is just overview. pleia2 wants to add her talk. I am going to write up a doc on how to do that when I have a free chunk of time | 17:41 |
pleia2 | clarkb: if you rewind a bit in history we have about 10 or so | 17:41 |
fungi | zaro: if you checkout a ref in master from before everything was deleted (check git log) you can get the old presos | 17:41 |
fungi | zaro: but it made more sense to clean those up before separating them into new branches, so i only initially did overview | 17:42 |
fungi | mostly to act as an example for further reorg work | 17:43 |
pleia2 | once we have instructions, we can go about adding the rest back :) | 17:43 |
clarkb | pleia2: I am thinking general process will be 1. create new branch for thing (or have someone do it for you if permissions are a problem) 2. push change to update default branch in .gitreview for that branch 3. push commit(s) with new thing in them | 17:43 |
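Step 2 above (pointing `.gitreview` at the new branch) could be scripted roughly as follows. This is a sketch: it assumes `.gitreview` is an INI file with a `[gerrit]` section and a `defaultbranch` key, and the sample contents and the `my-talk` branch name are made up for illustration.

```python
import configparser
import io


def set_default_branch(gitreview_text, branch):
    """Return .gitreview contents with defaultbranch set to `branch`."""
    parser = configparser.ConfigParser()
    parser.read_string(gitreview_text)
    parser["gerrit"]["defaultbranch"] = branch
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Illustrative file contents; the real file in the repository may differ.
example = """[gerrit]
host=review.openstack.org
port=29418
project=openstack-infra/publications.git
"""

print(set_default_branch(example, "my-talk"))
```

The rewritten file would then be committed on the new branch so that later pushes for review target that branch by default.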
fungi | we wanted to polish the overview file layout a bit to decide what the others should end up looking like once split out and cleaned up | 17:43 |
pleia2 | clarkb: yeah, the "create a new branch" thing is where I get stuck, I don't really understand yet how we do that for projects in this infrastructure | 17:43 |
clarkb | pleia2: typically it requires a person with extra gerrit permissions to click a button in gerrit | 17:44 |
zaro | ohh. i see, it looks like it's just adding slides onto the same deck. | 17:44 |
pleia2 | clarkb: oh good, then I wasn't missing something obvious :) | 17:44 |
fungi | pleia2: but really, decide what branch name you need for a distinct presentation (which isn't just an evolution of an existing presentation which already has a dedicated branch) and i'll be happy to click the button | 17:45 |
clarkb | pleia2: ya, mordred jeblair and I can do it too | 17:45 |
pleia2 | great | 17:46 |
clarkb | I do intend on writing a proper howto doc in the master branch or something in the README though | 17:46 |
fungi | clarkb: also if you can include a basic specification for the minimum requirements and filenames a presentation needs, that would help a ton (also would make for a good technical review checklist, or possible blueprint for future validation tests) | 17:47 |
clarkb | fungi: ++ things like a README.rst for make-index and so on? | 17:47 |
fungi | definitely. that would be great | 17:47 |
*** jaypipes has quit IRC | 17:48 | |
fungi | we also talked about adding a branch for a basically empty template/skeleton presentation (could be just a gutted copy of overview with the theming left intact), which might make the actual instructions a little easier | 17:49 |
clarkb | elasticsearch has recovered and is now rebalancing | 17:49 |
*** jaypipes has joined #openstack-infra | 17:49 | |
clarkb | fungi: good idea. I should probably start with that, put as much boilerplate as possible there and explain how to flesh it out in the howto | 17:50 |
fungi | idea being for a new presentation, you branch from template and start working there | 17:50 |
*** krtaylor has quit IRC | 17:50 | |
fungi | i still don't think our plan has a good solution to global theming changes though... maybe we make updates to the template branch and then cherry-pick those to all the others? | 17:51 |
clarkb | we could use a submodule >_> | 17:51 |
clarkb | but then we have another git repo... | 17:51 |
* fungi is not a fan of submodule so far, but the ways he has used it may not be great either | 17:51 | |
clarkb | fungi: I think this is the sort of thing that submodules were created for. They are substitutes for things that should be treated like packages but have no packaging | 17:52 |
fungi | yeah, i think cherry-picking template updates into other presentation branches would probably be fine as long as we're mindful of making them merge-friendly when we construct them | 17:52 |
clarkb | ++ | 17:52 |
*** jaypipes has quit IRC | 17:53 | |
*** yjiang5_away is now known as yjiang5 | 17:53 | |
clarkb | fungi: https://review.openstack.org/#/c/44378/ do you know what causes ssh clients to complain about a hash mismatch? | 17:53 |
fungi | also, in the same vein, i think anywhere we reuse images in slides we should try to always keep the same file names and paths for them, so it's easier to update them in all branches as needed | 17:53 |
*** ruhe has quit IRC | 17:53 | |
* fungi looks | 17:54 | |
clarkb | fungi: also if you look at that change you will notice that gerrit 2.4 seems to be more picky about some things than 2.6 by default | 17:55 |
clarkb | fungi: was interesting debugging some of the failures | 17:55 |
fungi | hash mismatch is going to be a difference in the key being verified and the one cached | 17:55 |
*** eharney has quit IRC | 17:55 | |
clarkb | fungi: in known_hosts? | 17:55 |
clarkb | fungi: perhaps we should simply not verify the host keys? | 17:55 |
fungi | yeah, in *a* known_hosts anyway. there are multiple places ssh can look | 17:55 |
*** jaypipes has joined #openstack-infra | 17:56 | |
fungi | we can probably tune ssh for more specific behaviors in that regard through adjusting the envvar git uses to determine how it's invoked if we want | 17:56 |
clarkb | it would make the tests a little simpler if we can ignore host key verification as there is a bunch of setup to make the known_hosts file correct | 17:57 |
clarkb | or at least attempting to be correct | 17:57 |
fungi | but basically what that says is that the ssh command as run by git had already seen a key for "localhost" and it didn't match the one served on that port | 17:57 |
*** tstevenson has joined #openstack-infra | 17:57 | |
*** openmike has joined #openstack-infra | 17:57 | |
clarkb | fungi: and that is using the host:port as a key? | 17:58 |
fungi | just host | 17:58 |
clarkb | fungi: oh well then that is probably the issue | 17:58 |
fungi | and yes, if you're going to auto-accept host keys in a script then it really makes just as much sense to ignore them entirely (tofu uses aside) | 17:58 |
clarkb | fungi: since we are running >1 ssh server | 17:58 |
fungi | one workaround if this were a production problem (which it isn't) would be to use multiple aliases for localhost | 17:59 |
clarkb | fungi: I will fiddle with not verifying the key | 17:59 |
fungi | for example, i have several machines behind a many-to-one layer-4 nat at home, with sshd from each of them exposed on different tcp ports at the same global ipv4 address. i need to use distinct names when connecting to them | 18:00 |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: Make references to Jenkins plugins uniform https://review.openstack.org/44960 | 18:00 |
clarkb | fungi: any first thoughts on whether or not we should be testing with our gerrit vs upstream? | 18:00 |
clarkb | fungi: I don't like hitting upstream for a dependency, that seems potentially flaky as well. And after realizing the tests needed updating to run against 2.4 I am slightly worried we may end up with git-review that works against upstream but not our gerrit | 18:01 |
fungi | clarkb: if we don't want to test with more than one gerrit, i think we should probably test with latest upstream release since we're not the only community using git-review any longer | 18:01 |
fungi | but better would be to test more than one gerrit | 18:02 |
*** Ryan_Lane has joined #openstack-infra | 18:02 | |
fungi | like 2.4 (which we use), 2.5 (which mediawiki is using currently?), 2.6 which we're maybe moving to before 2.7 comes out? | 18:02 |
*** Ryan_Lane has quit IRC | 18:03 | |
clarkb | fungi: the matrix becomes very large | 18:03 |
fungi | yeah, it does | 18:03 |
clarkb | though the test suite is small enough that we might just brute force it | 18:03 |
*** Ryan_Lane has joined #openstack-infra | 18:03 | |
fungi | i honestly don't have a great reason for saying we should test multiple versions. it just feels wrong to me to advertise it as a client for gerrit, in the generic sense, but then only test on our old and hacked-up island | 18:04 |
fungi | maybe two tests. latest released gerrit and the gerrit run for openstack infra | 18:04 |
clarkb | that seems reasonable | 18:06 |
fungi | that way it's not broken for our community (because we're a special snowflake who happens to be running the tests!) and if it's broken for anyone else then they ought to run a newer gerrit | 18:06 |
clarkb | or run our gerrit >_> | 18:07 |
clarkb | so hopeful that I won't be able to say such things in the near future | 18:07 |
fungi | sure, but just "you ought to run openstack's old gerrit fork for your users to take advantage of git-review" by itself is not exactly a great answer | 18:07 |
clarkb | ya | 18:08 |
clarkb | speaking of git-review, apparently it doesn't work on windows when the path to git review has spaces in it | 18:10 |
*** dina_belova has joined #openstack-infra | 18:11 | |
*** SergeyLukjanov has joined #openstack-infra | 18:11 | |
fungi | ooh, neat. i would consider that worth filing a bug (or did someone already and i missed it?) | 18:12 |
fungi | also, the GIT_SSH envvar will only take the path to the executable, so you could create a script on the fly which runs ssh -o StrictHostKeyChecking=no "$@" and then point it at that | 18:14 |
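A sketch of the wrapper approach just described: write a small script that invokes ssh with `StrictHostKeyChecking=no` and point `GIT_SSH` at it. The file name `ssh_wrapper.sh` and the helper names are assumptions, and skipping host key checks is only reasonable against throwaway test servers.

```python
import os
import stat
import tempfile


def write_ssh_wrapper(directory):
    """Create a wrapper script that disables strict host key checking."""
    path = os.path.join(directory, "ssh_wrapper.sh")
    with open(path, "w") as f:
        f.write('#!/bin/sh\nexec ssh -o StrictHostKeyChecking=no "$@"\n')
    # Make the wrapper executable for its owner.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def git_env_with_wrapper(directory, base_env=None):
    """Return an environment dict with GIT_SSH pointing at the wrapper."""
    env = dict(os.environ if base_env is None else base_env)
    env["GIT_SSH"] = write_ssh_wrapper(directory)
    return env
```

A test harness could pass the returned dict as `env=` to `subprocess` calls that run git, leaving the caller's own environment untouched.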
harlowja | ok, qq, whos 'Arx Cruz' (or whats the reference?) | 18:14 |
harlowja | :-P | 18:14 |
openstackgerrit | John Dickinson proposed a change to openstack-infra/reviewstats: added Peter Portante as a core dev https://review.openstack.org/45088 | 18:14 |
fungi | clarkb: or override it in the .ssh/config for the user running the tests, alternatively | 18:14 |
openstackgerrit | A change was merged to openstack-infra/reviewstats: added Peter Portante as a core dev https://review.openstack.org/45088 | 18:15 |
fungi | harlowja: i'm guessing it's this guy... https://twitter.com/arxcruz | 18:15 |
*** dina_belova has quit IRC | 18:16 | |
*** melwitt has joined #openstack-infra | 18:16 | |
harlowja | interesting fungi, idk, whoever it is is starting gating jobs on my reviews, ha | 18:17 |
harlowja | https://review.openstack.org/#/c/44074/ :-P | 18:17 |
fungi | harlowja: http://www.openstack.org/community/members/profile/11563 | 18:17 |
harlowja | interesting | 18:17 |
harlowja | is it a bug that his username is starting the zuul jobs? | 18:17 |
harlowja | *since my guess is thats an automated thing? | 18:17 |
fungi | ArxCruz seems to be in irc here, so he can likely tell you if that's his own jenkins or something | 18:17 |
clarkb | fungi: there is a bug for the windows thing. lp probably hasn't spoken to your mail server yet | 18:18 |
harlowja | ah, ArxCruz do u have your own mini-zuul? | 18:18 |
clarkb | fungi: thanks for the heads up on GIT_SSH. I think we are already doing something of the sort so I can just update that | 18:18 |
russellb | i came here to ask that same question | 18:18 |
ArxCruz | harlowja: yes, sorry about that, it's already off line | 18:18 |
ArxCruz | :( | 18:18 |
russellb | about ArxCruz heh | 18:18 |
fungi | ArxCruz: it's the best way to learn! | 18:19 |
ArxCruz | I've configured it with puppet so it started to get everything from gerrit | 18:19 |
russellb | kinda funny really :-) | 18:19 |
ArxCruz | I just noticed when I started receiving emails about bugs being tested | 18:19 |
fungi | i guess it's working ;) | 18:19 |
ArxCruz | fungi: yeah, now I need to understand how to make it just receive the events, not send back information (for now) | 18:20 |
ArxCruz | fungi: harlowja do I need to take some action ? | 18:20 |
clarkb | ArxCruz: to not send back info you update your pipeline config in layout.yaml | 18:20 |
ArxCruz | clarkb: which part? hehe, it's huuuuuuuuuuuge | 18:21 |
fungi | right, zuul-dev's layout has it set as an example, i think | 18:21 |
clarkb | ArxCruz: the pipeline part | 18:21 |
anteaya | ArxCruz: lol | 18:21 |
*** Grizzlebee has quit IRC | 18:21 | |
clarkb | the bits that say success: verify: 1 | 18:21 |
fungi | as zuul-dev does actually get events and run a subset of tests but doesn't report back to gerrit | 18:21 |
clarkb | I think you remove those | 18:21 |
fungi | might make sense to reuse the same layout.yaml zuul-dev does, or even edit that one down further for yours | 18:22 |
ArxCruz | fungi: okay, I will update my zuul to use layout.yaml from zuul-dev :) | 18:26 |
ArxCruz | sorry for any trouble that I may have caused | 18:26 |
*** nati_ueno has quit IRC | 18:27 | |
*** nati_ueno has joined #openstack-infra | 18:27 | |
*** jaypipes has quit IRC | 18:28 | |
ArxCruz | fungi: by the way, I'm trying to configure my own jenkins server too, how can I connect my mini-zuul with my jenkins? is that gearman_workers option ? | 18:28 |
clarkb | fungi: I have a git-review patch running `testr run --parallel --until-failure`. If that doesn't fail for a while I will push it up | 18:30 |
clarkb | fungi: didn't help | 18:32 |
*** mrmartin has joined #openstack-infra | 18:33 | |
ArxCruz | harlowja: fungi I'm using now zuul-dev yaml, please let me know if there's some problem :) | 18:34 |
*** nati_ueno has quit IRC | 18:35 | |
*** kiall_ has quit IRC | 18:35 | |
*** nati_ueno has joined #openstack-infra | 18:36 | |
clarkb | fungi: going to try it with the managed known hosts file without strict host key checking | 18:36 |
clarkb | not that I expect this will help but it will keep the known host keys isolated | 18:36 |
*** kiall has joined #openstack-infra | 18:38 | |
* clarkb abuses 127.0.0.0/8 instead | 18:39 | |
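The idea here is to give each test SSH server its own loopback address, so each one has a distinct host identity in known_hosts. A minimal sketch, assuming a platform such as Linux where the whole 127.0.0.0/8 range is reachable on the loopback interface; `loopback_aliases` is a hypothetical helper name.

```python
import ipaddress


def loopback_aliases(count, start="127.0.0.1"):
    """Return `count` distinct addresses from 127.0.0.0/8, one per test server."""
    network = ipaddress.ip_network("127.0.0.0/8")
    first = ipaddress.ip_address(start)
    addresses = [first + i for i in range(count)]
    if any(addr not in network for addr in addresses):
        raise ValueError("requested addresses fall outside 127.0.0.0/8")
    return [str(addr) for addr in addresses]


print(loopback_aliases(3))
```

Each test server would then bind to its own address instead of sharing `localhost` on different ports.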
*** nati_ueno has quit IRC | 18:40 | |
openstackgerrit | David Lenwell proposed a change to openstack-infra/gitdm: added my self to the piston group and added the launch pad mapping https://review.openstack.org/45092 | 18:40 |
*** sarob has joined #openstack-infra | 18:40 | |
*** jaypipes has joined #openstack-infra | 18:43 | |
sdague | markmcclain: so neutron just reset the gate on unit tests for at least the 4th time since I've been watching, which is killing everyone else's merges | 18:44 |
*** dhellmann_ is now known as dhellmann | 18:46 | |
*** hashar has joined #openstack-infra | 18:48 | |
markmcclain | sdague: yeah saw the same thing | 18:54 |
markmcclain | got a proposed fix up for review | 18:55 |
*** fbo_away is now known as fbo | 18:55 | |
*** reed has quit IRC | 18:55 | |
sdague | markmcclain: cool, it would be nice if we didn't push any more neutron patches into the gate before that fix. It's going to be a long time for people watching patches today anyway, and rather not make it longer on them | 18:58 |
*** gyee has quit IRC | 19:00 | |
ttx | BOO. pep8 fail causing 2 cancels at gate | 19:02 |
anteaya | :( | 19:02 |
ttx | not exactly sure how adding a README.txt makes a pep8 fail though :) | 19:03 |
ttx | hah | 19:04 |
ttx | "git commit title ('This adds a README to brick.') should not end with period" | 19:04 |
ttx | it's actually a HACKING fail | 19:04 |
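The rule that tripped this change is simple to state in code. The function below is an illustrative sketch of such a check, not the actual implementation in the hacking tool.

```python
def title_ends_with_period(commit_message):
    """Return True if the first line of the commit message ends with a period."""
    lines = commit_message.splitlines()
    title = lines[0].rstrip() if lines else ""
    return title.endswith(".")


print(title_ends_with_period("This adds a README to brick."))
```

Running a check like this locally before approving a change would catch the failure without costing a gate reset.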
ttx | mordred: looks like people already forgot your good advice to wait until check tests pass before submitting to gate | 19:04 |
mordred | ttx: longer conversation you may not want to get involved in | 19:04 |
mordred | re HACKING | 19:05 |
ttx | indeed :) | 19:05 |
hashar | hey :) Some people complained a bunch of hours ago that Zuul was stuck, not processing results | 19:05 |
anteaya | it was | 19:06 |
ttx | bunch of hours ago it was broken. | 19:06 |
anteaya | it should be fixed now | 19:06 |
hashar | ahhh good :-) | 19:06 |
anteaya | :D | 19:06 |
ttx | well, not Zuul, the gating process was broken. | 19:06 |
hashar | I could not really help so lamely recommended to .. wait() | 19:06 |
ttx | due to some cloud infra fail IIRC | 19:06 |
anteaya | hashar: thanks, that was all I could have done too | 19:06 |
anteaya | so thanks | 19:07 |
*** SergeyLukjanov has quit IRC | 19:07 | |
anteaya | rax had an outage on our servers running many important services | 19:07 |
anteaya | lost the bots, eavesdrop, don't know what else | 19:07 |
anteaya | everything should be back up now though | 19:08 |
hashar | and I had no clue who to ping during European morning | 19:08 |
anteaya | not sure if it was planned maintenance or an unexpected outage | 19:08 |
hashar | but hey, I am not that much involved in OpenStack to figure out everything | 19:08 |
anteaya | hashar: yeah, I hear that | 19:08 |
anteaya | glad the call to the wait() function was able to work | 19:08 |
mrmartin | maybe we need an EU infra team also | 19:09 |
anteaya | well right now we have 4 infra core with root access, and 3 of us who are working towards that | 19:11 |
*** dina_belova has joined #openstack-infra | 19:11 | |
anteaya | so until any interested party learns enough about everything about infra to have core status and root access it doesn't much matter where they sleep | 19:11 |
anteaya | they couldn't restart infra anyway | 19:12 |
mrmartin | I'll promote this fantastic opportunity in the local user group, and try to find someone who doesn't want to sleep during the next few months | 19:12 |
*** kiall has quit IRC | 19:12 | |
hashar | you can still start the bootstrapping process | 19:12 |
anteaya | well I figure to learn all of infra will take at least 6 months, if they really really pick it up fast - myself I think I will need longer | 19:12 |
hashar | i.e. focus on some european people to have them later become root / whatever privilege | 19:13 |
anteaya | then they have to be core and root | 19:13 |
anteaya | mrmartin: but by all means, any new folks who want to come along and learn are more than welcome | 19:13 |
mrmartin | anteaya: I guess it is a normal process, if I hire somebody now to work on openstack it takes the same 6 months to learn the basics | 19:13 |
anteaya | hashar: sure | 19:13 |
anteaya | all welcome | 19:13 |
*** openmike has quit IRC | 19:14 | |
*** vipul is now known as vipul-away | 19:14 | |
*** vipul-away is now known as vipul | 19:14 | |
anteaya | mrmartin: yes, but yes I think that having capable folks in all time zones would make us all happy | 19:14 |
hashar | a thing that works nicely is to pair experienced people with entry level people | 19:15 |
anteaya | yes, mentoring is a good approach | 19:15 |
hashar | so whenever there is an outage, the newcomer is responsible and the experienced/root person is looking over the shoulder | 19:15 |
hashar | both could update whatever troubleshooting doc already exists | 19:15 |
hashar | next time, the doc will probably be enough for the rookie to sort the issue | 19:15 |
hashar | and so on | 19:15 |
anteaya | fungi tends to help me since he is in the same time zone, when he has time - this week is a bad time for me to hog his attention | 19:15 |
hashar | but that takes a long time | 19:15 |
anteaya | yes | 19:16 |
sdague | ttx: yeh, these are the times I think we need fast fail in zuul | 19:16 |
anteaya | and the room for error in infra is small | 19:16 |
anteaya | but yes, we do what we can and the core/root folks support us a lot | 19:16 |
*** dina_belova has quit IRC | 19:16 | |
anteaya | this time frame right now is a poor example of that, since the core/root members are focusing on keeping this working | 19:17 |
sdague | I need to figure out the right data to catch to figure out throughput modelling for zuul | 19:17 |
anteaya | so us 3 non-core are doing work that doesn't require much support from infra-core | 19:17 |
anteaya | while trying to do what we can to support the ff tension/rush | 19:18 |
sdague | ug, and another neutron reset, /me hopes the fix markmcclain has in the merge queue handles it | 19:18 |
hashar | anteaya: sounds to me like you are greatly offloading root which is a good start | 19:18 |
anteaya | thanks | 19:18 |
anteaya | doing my best | 19:18 |
*** ryanpetrello has quit IRC | 19:19 | |
markmcclain | sdague: once this merges this should fix the gate failure https://review.openstack.org/#/c/45091/ | 19:19 |
anteaya | it's funny, I tell my body to wake me up when I need to be awake | 19:19 |
sdague | markmcclain: cool | 19:19 |
clarkb | markmcclain: there are other races in the neutron tests | 19:19 |
anteaya | yesterday morning I was up and online around 7:30am my time, this morning I didn't wake up until 10:30am | 19:20 |
clarkb | markmcclain: https://review.openstack.org/#/c/43558/ the comments at the end of that review point out a different set of races I think | 19:20 |
anteaya | I came online as things that were down were still being discovered | 19:20 |
anteaya | I couldn't have done anything other than call .wait() either | 19:20 |
markmcclain | clarkb: looking into that one as well | 19:21 |
mordred | sdague: jeblair is working on some helpful changes to the zuul scheduler | 19:24 |
sdague | mordred: cool | 19:24 |
mordred | sdague: he was just catching me up on what I missed | 19:24 |
clarkb | sdague: it will do one level of branching | 19:24 |
mordred | sdague: https://review.openstack.org/#/c/44346/ | 19:25 |
clarkb | so that on the first failure we restart tests behind without the failing test, if we fail again we discard the restarted thing and go again | 19:25 |
*** pblaho has joined #openstack-infra | 19:25 | |
clarkb | fungi: I think abusing 127.0.0.0/8 may have fixed the problem. I will push that up shortly | 19:25 |
clarkb | fungi: nope nevermind | 19:25 |
clarkb | fungi: as soon as I say something we get a failure | 19:25 |
sdague | clarkb: sounds great | 19:27 |
clarkb | fungi: actually, the failure is with scp now | 19:27 |
sdague | the other thing that would be nice to do is have some way to either poke changes out of the queue, or have them promoted on next reset | 19:27 |
russellb | don't think it's a problem exactly, but this review has an interesting issue ... look at all the "uploaded patch N" comments at the end - https://review.openstack.org/#/c/37819/ | 19:27 |
russellb | just fyi i guess | 19:27 |
mrodden | i think Gerrit is having a rough day ^ | 19:28 |
*** vipul is now known as vipul-away | 19:29 | |
sdague | interesting... | 19:29 |
*** ryanpetrello has joined #openstack-infra | 19:30 | |
anteaya | I don't think i have ever seen that before | 19:30 |
mordred | BobBall: ping | 19:31 |
anteaya | any opinion of whether this is worthy of a bug report? | 19:31 |
*** sdague has left #openstack-infra | 19:31 | |
mordred | russellb: wow. that's fascinating | 19:31 |
russellb | mordred: that's kinda what I was thinking :-) | 19:31 |
mrodden | all of the out of order comments are dated sept 3rd 12:00pm | 19:32 |
mordred | sdake: what do you mean "have them promoted on next reset" | 19:32 |
mordred | gah | 19:32 |
mordred | sdake: the above was meant for sdague | 19:32 |
lifeless | morning | 19:33 |
*** yjiang5 is now known as yjiang5_away | 19:33 | |
anteaya | morning lifeless | 19:34 |
lifeless | anteaya: hi thar | 19:34 |
lifeless | clarkb: so want the bad news ? | 19:34 |
jog0 | mordred: the tox change is awesome!! | 19:34 |
clarkb | lifeless: not really, but ok | 19:34 |
mordred | jog0: yay! | 19:34 |
*** vipul-away is now known as vipul | 19:34 | |
lifeless | I can't get the permissions right :( | 19:34 |
lifeless | clarkb: the config file review.o.o is running doesn't seem to permit openstack-project-creator to create projects | 19:35 |
mordred | lifeless: our manage-projects user is in both Project Bootstrappers and Admins | 19:35 |
lifeless | clarkb: I fixed that | 19:35 |
lifeless | clarkb: but I couldn't figure out how to have it push to /refs/meta/config and have it work. | 19:35 |
lifeless | mordred: ohhoho! | 19:35 |
mordred | lifeless: I do not know _why_ this is the case, but it is certainly true | 19:36 |
*** sdague has joined #openstack-infra | 19:36 | |
sdague | ok, so pushing a rev 2 of a patch will kick it out of the gate queue, right? | 19:36 |
fungi | sdague: yes | 19:37 |
sdague | I just did that to the cinder pep8 fail to let stuff behind it make progress | 19:37 |
fungi | that's how i knocked one out earlier when it was unfortunately necessary | 19:37 |
fungi | in that case, it had already been reverified once though, and they had a copy of the unit test failures already | 19:37 |
*** odyi has quit IRC | 19:39 | |
openstackgerrit | David Lenwell proposed a change to openstack-infra/gitdm: added my self to the piston group and added the launch pad mapping https://review.openstack.org/45092 | 19:40 |
sdague | another neutron reset coming | 19:40 |
*** odyi has joined #openstack-infra | 19:40 | |
*** odyi has joined #openstack-infra | 19:40 | |
Ryan_Lane | upgrading mediawiki for security patches | 19:41 |
mordred | Ryan_Lane: bah. security is for weenies | 19:41 |
EmilienM | could anyone make a last review (and maybe an approval) on https://review.openstack.org/#/c/44032 ? | 19:41 |
Ryan_Lane | :D | 19:41 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Gerrit docs improvements - user and groups. https://review.openstack.org/45001 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document review.pp parameters a bit. https://review.openstack.org/44969 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Make gerrit DB setup match actual practice. https://review.openstack.org/44993 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document basic admin hints for jeepyb. https://review.openstack.org/45043 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Non-openstack-ci support for launch/dns.py. https://review.openstack.org/44980 | 19:42 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document bootstrapping of Gerrit ACLs. https://review.openstack.org/45011 | 19:42 |
pleia2 | oh, this guy again | 19:42 |
pleia2 | :) | 19:42 |
*** kiall has joined #openstack-infra | 19:43 | |
lifeless | woohoo, manage-projects run succeeded | 19:43 |
lifeless | what do you guys think, zuul after gerrit? | 19:44 |
openstackgerrit | A change was merged to openstack-infra/gitdm: added my self to the piston group and added the launch pad mapping https://review.openstack.org/45092 | 19:45 |
clarkb | lifeless: sure, you can set up a simple gearman worker for testing until you have jenkins | 19:45 |
clarkb | lifeless: or you can do jenkins then zuul. The advantage with this approach is you can use the gerrit jenkins plugin before zuul is running | 19:46 |
lifeless | clarkb: I worry that what you just said implies running diverged from upstream | 19:46 |
markmcclain | maybe I missed it, but is it a known issue that job to propose translations has not run since Aug 30th? | 19:47 |
clarkb | lifeless: it does imply that, I think it is only useful to do that for a short period of time while you bring up zuul | 19:47 |
clarkb | markmcclain: I bet I know what happened | 19:48 |
lifeless | clarkb: so, that seems like waste - setting up something I would not use for long ? | 19:48 |
clarkb | markmcclain: transifex flipped the world upside down semi recently and I think it may have broken the job, let me check on the transifex side to see if anything is missing | 19:48 |
clarkb | lifeless: setting up jenkins itself is 99% of the battle | 19:49 |
mordred | lifeless: I'd do zuul first | 19:49 |
clarkb | lifeless: adding a plugin to talk to gerrit is a couple button pushes | 19:49 |
mordred | lifeless: since a simple throw-away gearman worker is about 4 lines of code | 19:49 |
mordred | clarkb: it's multiple non-automated button pushes | 19:49 |
mordred | I'd think gerrit -> zuul -> jenkins -> jjb would be the sequence that leaves you with the most intermediary testable states where you don't need to change the tested state to work on the next one | 19:50 |
lifeless | clarkb: http://paste.ubuntu.com/6063924/ is the manual patch I have outstanding vs everything :) | 19:50 |
lifeless | btw | 19:51 |
lifeless | (to the floor) | 19:51 |
lifeless | the prose in gerrit.rst about API projects | 19:51 |
lifeless | is that aspirational or legacy? | 19:51 |
lifeless | since nova is parented on All projects, not API projects, on review.o.o | 19:51 |
jeblair | lifeless: https://review.openstack.org/#/admin/projects/openstack/compute-api,access | 19:52 |
jeblair | lifeless: it is neither aspirational or legacy; it is accurate | 19:52 |
mordred | lifeless: yeah. it's current and accurate | 19:52 |
lifeless | oh! | 19:53 |
lifeless | so whats compute-api vs nova? | 19:53 |
jeblair | lifeless: api docs | 19:53 |
lifeless | ah! | 19:53 |
hashar | if a change got two code-review +2, does it need a third person to vote approved or is one of the CR+2 voter allowed to approve? | 19:54 |
lifeless | hashar: the second +2 voter should approve | 19:55 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Explain API projects a little. https://review.openstack.org/45111 | 19:55 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 19:55 |
lifeless | jeblair: thank you | 19:55 |
jeblair | hashar: what project? | 19:55 |
fungi | hashar: usually if the cr+2 voter didn't approve it's because they either think additional eyes would be good, or because it's not an ideal time to merge it, or maybe because there was a reviewer from a previous patchset who may still have outstanding questions | 19:55 |
hashar | jeblair: on JJB :) | 19:56 |
hashar | lifeless: fungi thanks :) | 19:56 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/git-review: Offset localhost IP address when testing. https://review.openstack.org/45112 | 19:56 |
jeblair | hashar: ah, then you are probably asking about your own behavior, as a recent addition to core :) | 19:57 |
clarkb | fungi: ^ so that isn't completely working according to my testing, but should give you an idea of what I am trying to do | 19:57 |
* fungi checks | 19:57 | |
hashar | jeblair: yeah exactly. Mathieu Gagné and I are both recent approvers of JJB. We both CR+2 but I guess we are both afraid to approve huhu | 19:58 |
hashar | will sort that out tomorrow :-) | 19:58 |
Ryan_Lane | wiki is upgraded | 19:58 |
jeblair | hashar: i agree with fungi's guidelines -- the second core +2 can also aprv, but it is fine to leave it for others if you feel it would benefit from other core reviews | 19:58 |
fungi | clarkb: i think struct.pack()/unpack() are what i've used to simplify ipv4 integer conversion in the past, but can't recall if that's py3k compliant | 19:59 |
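Editor's note: a minimal sketch of the struct-based IPv4/integer conversion fungi mentions, assuming the aim is to offset an address while staying inside 127.0.0.0/8. The helper names (`ip_to_int`, `int_to_ip`, `offset_localhost`) are hypothetical and are not taken from the git-review change under discussion. `socket.inet_aton`, `socket.inet_ntoa` and `struct.pack`/`struct.unpack` with the `"!I"` format are available on both Python 2 and Python 3.

```python
import socket
import struct


def ip_to_int(address):
    """Convert a dotted-quad IPv4 string to an unsigned 32-bit integer."""
    # "!I" = network byte order, unsigned 32-bit integer.
    return struct.unpack("!I", socket.inet_aton(address))[0]


def int_to_ip(value):
    """Convert an unsigned 32-bit integer back to a dotted-quad string."""
    return socket.inet_ntoa(struct.pack("!I", value))


def offset_localhost(offset, base="127.0.0.1"):
    """Return base + offset, refusing to leave 127.0.0.0/8 (hypothetical helper)."""
    value = ip_to_int(base) + offset
    if (value >> 24) != 127:
        raise ValueError("offset leaves the 127.0.0.0/8 loopback range")
    return int_to_ip(value)
```

Each test Gerrit could then be given its own loopback address, for example `offset_localhost(1)` for the first instance and `offset_localhost(2)` for the second.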
hashar | jeblair: yeah that make sense. I have asked the other cr+2 whether he wants a third review. Thx! | 19:59 |
*** portante is now known as portante|afk | 20:00 | |
*** portante|afk is now known as portante | 20:00 | |
mgagne | jeblair: you often have things to add so I kind of always rely on your input before approving | 20:00 |
jeblair | mgagne: ok, i'll try to be less helpful ;) | 20:00 |
clarkb | fungi: I wouldn't worry about that too much at least not while the tests still fail occasionally with hash mismatches | 20:00 |
fungi | hashar: also general good hygiene includes not +2'ing your own patches, and if possible commenting as to why you didn't approve it even though there were enough votes in favor (if you have the minute to spare) | 20:01 |
clarkb | fungi: was hoping you would see something glaringly wrong with what I had done that might explain the failures | 20:01 |
fungi | clarkb: no, waiting to see what it does | 20:01 |
fungi | since jenkins is happily testing it anyway | 20:01 |
fungi | so i didn't bother to run it yet | 20:01 |
*** rcleere has joined #openstack-infra | 20:02 | |
hashar | fungi: wikimedia has a similar policy. So I guess I will be a good OpenStack citizen :) | 20:02 |
*** gyee has joined #openstack-infra | 20:02 | |
anteaya | hashar: since you are core on JJB already you are in a prime position to work toward being infra core/root | 20:04 |
anteaya | if that is what you want | 20:04 |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: Add IRCbot plugin support https://review.openstack.org/44032 | 20:04 |
*** yjiang5 has joined #openstack-infra | 20:05 | |
clarkb | fungi: I have been using `testr run --parallel --until-failure` locally as failure doesn't happen 100% of the time | 20:05 |
hashar | anteaya: I am already too busy with Wikimedia. But I am glad to help maintain the tool I am heavily relying on (aka JJB and Zuul). | 20:05 |
jeblair | clarkb, fungi: what are you working on? | 20:06 |
fungi | clarkb: oh, right | 20:06 |
jeblair | scrollback is intimidating | 20:06 |
anteaya | hashar: okay, fair enough - you know your schedule best, just following up on our earlier conversation on getting more infra root/core folks | 20:06 |
*** nati_ueno has joined #openstack-infra | 20:06 | |
fungi | jeblair: i'l trying to read talk abstracts at the moment. was on the list of things i wanted to get done this morning and now, well, it found its way onto the list of things i wanted to do this afternoon instead | 20:07 |
fungi | er, i'm | 20:07 |
clarkb | jeblair: flaky git review tests | 20:07 |
jeblair | ah ok | 20:07 |
fungi | clarkb: is poking at a nondeterministic ssh host key mismatch in the git-review integration test | 20:07 |
clarkb | jeblair: not a super high priority thing but really annoying | 20:07 |
clarkb | jeblair: did you see the weird change that russellb linked? | 20:08 |
jeblair | clarkb, fungi: speaking of which, i do think we should land https://review.openstack.org/#/c/44988/ soon | 20:08 |
fungi | clarkb: another option you may not have tried... can we just reuse the same host key for multiple gerrits instead? | 20:08 |
pleia2 | anteaya: apologies, "we'll talk in an hour" sailed on past and now I really should grab some lunch, want to chat for 10 minutes about the plan or just wait until I return? | 20:08 |
clarkb | fungi: ooh good idea, yes I think we can do that | 20:08 |
jeblair | we can (a) verify that it works with the publications repo; and (b) it would be good to have the post/release jobs working before we start cutting h3 :) | 20:08 |
anteaya | pleia2: understood | 20:09 |
fungi | jeblair: good idea re testing it with the publications publish jobs | 20:09 |
anteaya | eat first, stable blood sugar is always a good idea | 20:09 |
clarkb | jeblair: +2 from me, I thought I had +2'd it yesterday | 20:09 |
pleia2 | anteaya: ok cool, bbs | 20:09 |
anteaya | k | 20:09 |
jeblair | clarkb: yeah, i assume that weird change is due to gerrit not having enough resolution to sort those patchsets (which i assume really were uploaded) | 20:09 |
jeblair | clarkb: that's a lot of assumptions though | 20:09 |
clarkb | jeblair: I wonder if it could possibly be related to mysqldump? | 20:10 |
clarkb | (I don't have much reason to think so but I am paranoid) | 20:10 |
sdague | jeblair: the patches all were much older than those dates | 20:10 |
mgagne | jeblair: but I learn from your input and attention to details =) | 20:10 |
*** sdague has left #openstack-infra | 20:11 | |
*** sdague has joined #openstack-infra | 20:11 | |
jeblair | sdake: right, but they could still have all been uploaded at once | 20:11 |
jeblair | sdague: ^ | 20:11 |
*** dina_belova has joined #openstack-infra | 20:12 | |
sdague | jeblair: sure.... but how? it's not a series, it's the same patch | 20:12 |
hashar | jeblair: Ryan told me about your post highlighting zuul+gear at http://amo-probos.org/post/15 . I praise your 3l33t graphviz skills. You might want to look at pip seqdiag to generate graphs http://blockdiag.com/en/ :] | 20:12 |
jeblair | hashar: nice, thanks! that's almost as l33t as graphviz :) | 20:14 |
fungi | jeblair: was there a particular publications change you wanted to wait for on 44988, or just manually retrigger a post job on the branch tip? | 20:14 |
jeblair | fungi: i think there was an outstanding change | 20:14 |
* fungi doesn't see one | 20:14 | |
mgagne | hashar: nice find! | 20:14 |
clarkb | fungi: there should be my change to fix the README.rst title in overview | 20:14 |
clarkb | unless that already merged | 20:15 |
jeblair | that's the one i'm thinking of | 20:15 |
clarkb | I think it merged | 20:15 |
fungi | it did | 20:15 |
fungi | but we can retrigger it | 20:15 |
jeblair | fungi: it's a race | 20:15 |
jeblair | fungi: so retriggering won't tell us much, but it will tell us if it works or completely fails | 20:16 |
fungi | ahh, right | 20:16 |
comstud | Hey y'all | 20:16 |
comstud | i found this 'check experimental' | 20:16 |
jeblair | sdague: it would be nice if we know what hartsocks saw | 20:16 |
comstud | things have been in queue for 50m but it doesn't seem they are running | 20:16 |
comstud | not fully implemented or? | 20:16 |
hashar | mgagne: jeblair: seqdiag syntax is really simple. I am probably going to use it to document Wikimedia CI workflow | 20:16 |
jeblair | comstud: the gate has a higher priority | 20:17 |
hashar | mgagne: jeblair: I will share back here once it is completed. | 20:17 |
sdague | jeblair: what's the link again? | 20:17 |
*** dina_belova has quit IRC | 20:17 | |
jeblair | comstud: and is very busy right now so other pipelines are slower | 20:17 |
jeblair | sdague: https://review.openstack.org/#/c/37819/ | 20:17 |
comstud | jeblair: Ok | 20:17 |
fungi | jeblair: you know what? i believe that was the user i did the gerrit account merge on yesterday. i bet an unseen side effect is that those were possibly uploaded by his old account and jenkins is using a mysql timestamp field on update to those rows to sort them | 20:17 |
clarkb | fungi: that sounds very plausible | 20:18 |
fungi | s/jenkins/gerrit/ | 20:18 |
fungi | too many names | 20:18 |
mordred | fungi: yes. that sounds completely reasonable | 20:18 |
jeblair | and another reason we should never do that | 20:19 |
fungi | agreed | 20:19 |
*** jaypipes has quit IRC | 20:19 | |
jeblair | i have to get ready to had to sf for the cloudscaling aws api hackathon | 20:19 |
jeblair | s/had/head/ | 20:19 |
jeblair | where i plan to say things like 'add stuff to tempest' and 'if it's in tempest, it can run in the gate'. | 20:20 |
*** nati_uen_ has joined #openstack-infra | 20:20 | |
lifeless | jeblair: I was just going to ask what you were going to do there :) | 20:20 |
fungi | broken record mode: engaged | 20:20 |
jeblair | lifeless: i'm certainly not going to hack on aws api testing! | 20:20 |
clarkb | jeblair: there are also boto tests in the unit tests (whether or not they should be there is a different problem) | 20:21 |
sdague | mordred: so what does the pbr raw install job do? | 20:21 |
mordred | sdague: it runs through several install permutations with proposed pbr changes on each openstack project | 20:22 |
mordred | sdague: so, it builds a mirror, then injects proposed pbr into that mirror, then installs and re-installs openstack projects into various virtualenvs | 20:23 |
fungi | so sounds like the gerrit db schema really should set "DEFAULT CURRENT_TIMESTAMP" on those columns making them only populate on insert and not on update | 20:23 |
yjiang5 | 20:23 | |
sdague | ok, fwiw, it's the longest test we have now but a good chunk | 20:23 |
clarkb | fungi: so you have confirmed that merging the accounts probably updates those comment timestamps? | 20:24 |
mordred | sdague: it could actually probably be whittled down | 20:24 |
mordred | sdague: it's got several permutations that I do not think we need anymore | 20:24 |
fungi | clarkb: i'm tearing into the current schema now to see | 20:24 |
clarkb | mordred: it can also do all of those installs in parallel right? | 20:24 |
mordred | clarkb: I don't know how | 20:24 |
clarkb | mordred: testr with bash subunit :) | 20:24 |
mordred | we _could_ also make it build wheels so that the installs themselves are much faster | 20:24 |
mordred | clarkb: haha | 20:24 |
anteaya | fungi yes hartsocks was the username hartsock was folded into yesterday | 20:25 |
clarkb | mordred: http://bazaar.launchpad.net/~subunit/subunit/trunk/view/head:/shell/README | 20:25 |
fungi | clarkb: ha. it has both on update and default. it seems to sort on the wrong one | 20:25 |
fungi | anteaya: other way around, but yes | 20:26 |
hashar | have a good afternoon, bed time on my side of the earth. *wave* | 20:26 |
mrodden | watching the zuul queue is awesome today | 20:26 |
fungi | night hashar | 20:26 |
mrodden | there are some patches going through with 30 commits merged | 20:26 |
anteaya | fungi: hartsocks is the current username yes, and hartsock was folded into hartsocks was it not? | 20:26 |
*** hashar has quit IRC | 20:26 | |
openstackgerrit | Peter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added support for JaCoCo plugin Publisher https://review.openstack.org/44705 | 20:27 |
fungi | anteaya: er, right. for some reason i misread you above. ignore me ;) | 20:27 |
clarkb | JaCoCo makes me think of Conan O'Brien | 20:27 |
fungi | clarkb: i was wrong. that table has only one timestamp field, and it does have the on update property set | 20:27 |
* fungi also misread the schema. bad day for trying to read things i guess | 20:28 | |
*** nati_ue__ has joined #openstack-infra | 20:28 | |
anteaya | fungi: no worries, I am often wrong - just not always | 20:28 |
anteaya | no worries | 20:28 |
fungi | clarkb: so yeah, if it's using that column to determine order in which patchsets were uploaded or comments were added, it really ought to not set that property | 20:29 |
*** nati_uen_ has quit IRC | 20:30 | |
fungi | both the patch_sets and patch_comments tables have the same sort of thing going on. given that gerrit doesn't allow you to replace a patchset or edit a comment, it seems unnecessary to update the timestamp beyond insertion | 20:31 |
*** fbo is now known as fbo_away | 20:31 | |
fungi | it's something we could fix if we wanted to, but probably better just to keep it in mind instead | 20:32 |
clarkb | yeah, fix upstream if it is still an issue otherwise I think just being aware of it locally is sufficient | 20:32 |
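Editor's note: a toy model of the ordering problem described above, assuming (as fungi found) a single timestamp column that is set on insert and refreshed on every update. This is not Gerrit's schema or code; the `Row` class, the account names and the integer clock are hypothetical stand-ins chosen only to show why reassigning old rows during an account merge moves them to the end of a timestamp sort.

```python
import itertools

# Stand-in for the database clock: each call to next() is a later instant.
_clock = itertools.count(1)


class Row:
    """One patchset row with a single auto-refreshed timestamp column."""

    def __init__(self, patchset, owner):
        self.patchset = patchset
        self.owner = owner
        self.created = next(_clock)
        # Models DEFAULT CURRENT_TIMESTAMP: populated on insert.
        self.touched = self.created

    def update(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)
        # Models ON UPDATE CURRENT_TIMESTAMP: refreshed on every update.
        self.touched = next(_clock)


# Patchsets 1-3 were uploaded by the old account, patchset 4 by the new one.
rows = [Row(n, "old-account") for n in (1, 2, 3)]
rows.append(Row(4, "new-account"))

# The account merge rewrites the owner on the old rows, touching each of them.
for row in rows:
    if row.owner == "old-account":
        row.update(owner="new-account")

by_touched = [r.patchset for r in sorted(rows, key=lambda r: r.touched)]
by_created = [r.patchset for r in sorted(rows, key=lambda r: r.created)]
```

In this model `by_created` keeps the upload order, while `by_touched` places the untouched patchset 4 ahead of the merged ones, which matches the out-of-order "uploaded patch N" comments seen on the review.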
clarkb | markmcclain: my hunch about why translations proposals are not happening seems to be wrong. I am digging deeper | 20:35 |
clarkb | markmcclain: looks like potentially a different issue related to the transifex changes | 20:37 |
pleia2 | anteaya: ok, having a browse through these files in the etherpad now | 20:37 |
anteaya | k | 20:37 |
anteaya | making additions now | 20:37 |
*** portante is now known as portante|afk | 20:39 | |
anteaya | sorry I am a different colour, I don't know how to get the same colour every time I open an etherpad | 20:39 |
morganfainberg | anteaya, i think you can select a color and it'll persist (cookies?) | 20:40 |
pleia2 | anteaya: if you click on the color next to your name in the list you can change it (it's not a big deal though) | 20:40 |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/config: Gerrit sysadmin tips for account repairs/renaming https://review.openstack.org/44912 | 20:41 |
anteaya | morganfainberg: okay I can try | 20:41 |
fungi | clarkb: ^ that includes the ugly workaround | 20:41 |
anteaya | pleia2: thanks, I never get it exact though, close but not exact | 20:41 |
pleia2 | yeah I change colors depending on the computer I'm working on, so whatever :) | 20:41 |
clarkb | markmcclain: I think I see the problem. Sorting out how to fix now. It actually seems to be related to one of our jenkins job changes that only partially applied | 20:42 |
morganfainberg | anteaya, they should allow you to use hex for colors ;) | 20:42 |
anteaya | pleia2: well there is just the two of us so far, so I'm the other one | 20:42 |
anteaya | morganfainberg: sure they should, I should, me ... sigh okay I'll add to the list | 20:43 |
fungi | morganfainberg: the geek in me says they should set your color to a 24-bit hash of your nick ;) | 20:43 |
anteaya | fungi: I like that | 20:43 |
morganfainberg | fungi, but you might get collisions with only 24 bits | 20:44 |
markmcclain | clarkb: thanks for digging into it | 20:44 |
fungi | morganfainberg: clearly we need to switch etherpad to 128-bit color | 20:44 |
morganfainberg | fungi, YES! | 20:44 |
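Editor's note: a small sketch of fungi's "24-bit hash of your nick" idea, assuming SHA-256 as the hash and a CSS-style `#rrggbb` string as the output; neither choice comes from the conversation or from Etherpad, and `nick_color` is a hypothetical name.

```python
import hashlib


def nick_color(nick):
    """Map a nick to a stable '#rrggbb' colour.

    Uses the first 24 bits (6 hex digits) of the nick's SHA-256 digest.
    """
    digest = hashlib.sha256(nick.encode("utf-8")).hexdigest()
    return "#" + digest[:6]
```

The same nick always gets the same colour on any machine, which addresses the "different colour every time" complaint. morganfainberg's point about collisions holds: with only 2**24 possible colours, the birthday bound puts a 50% chance of two nicks sharing a colour at roughly five thousand users.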
clarkb | markmcclain: puppet on the slave that runs those jobs had been stuck for a month... | 20:47 |
clarkb | markmcclain: I have kicked it, today's translation jobs should work properly | 20:47 |
markmcclain | thanks | 20:47 |
clarkb | markmcclain: thank you for letting us know | 20:48 |
pleia2 | anteaya: so does a gemfile with dependencies for puppet-dashboard exist? | 20:49 |
anteaya | yes | 20:49 |
pleia2 | (ruby is not my forte, excuse me if I get all the terms mixed up) | 20:49 |
anteaya | no worries | 20:49 |
pleia2 | ok cool | 20:49 |
anteaya | ruby is my forte, at least in this channel | 20:49 |
pleia2 | :) | 20:49 |
* fungi tries to make a habit to look at the stale devices list in the puppet dashboard from time to time, but i guess i hadn't checked it for a while before going out of town | 20:49 | |
lifeless | will zuul run on a 512M instance, do you think? | 20:49 |
*** Ryan_Lane has quit IRC | 20:49 | |
pleia2 | anteaya: where is this file in the dashboard source? | 20:50 |
fungi | lifeless: if it's doing basically nothing, probably | 20:50 |
*** Ryan_Lane has joined #openstack-infra | 20:50 | |
anteaya | https://github.com/sodabrew/puppet-dashboard/blob/master/Gemfile | 20:50 |
fungi | lifeless: it and gear are very lightweight | 20:50 |
pleia2 | anteaya: thanks | 20:50 |
jeblair | lifeless: http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=23 | 20:50 |
jeblair | lifeless: do you know about our cacti? | 20:51 |
anteaya | pleia2: the way it works is that once you get a new project you install bundler as a gem, then run `bundle install` | 20:51 |
lifeless | jeblair: thanks; I kindof did. | 20:51 |
jeblair | lifeless: you can probably draw some conclusions based on those graphs and what you know about our load | 20:51 |
jeblair | lifeless: (that would probably be as good as our guesses) | 20:51 |
pleia2 | anteaya: I see, so puppet-bundler enables puppet to consume the gemfiles? | 20:51 |
lifeless | 50G used, hmmm | 20:51 |
jeblair | lifeless: to that i would only add that until recently we struggled with a 1g node. | 20:51 |
anteaya | bundler reads the Gemfile (it gets worked up about the gem.lock file, for reasons I have never understood) and then downloads the listed gems | 20:51 |
*** Ryan_Lane has quit IRC | 20:51 | |
pleia2 | anteaya: ok, I see | 20:51 |
anteaya | pleia2: that is my hope | 20:51 |
*** Ryan_Lane has joined #openstack-infra | 20:51 | |
lifeless | jeblair: struggled with disk space, or memory/cpu ? | 20:52 |
anteaya | I think we will need to see it in action before we know for sure | 20:52 |
jeblair | lifeless: ram | 20:52 |
anteaya | pleia2: the gemfile deals with gem versions too | 20:52 |
pleia2 | anteaya: yeah, I'm thinking we have too many moving pieces to brainstorm for too long before we need to start installing things and kicking tires | 20:52 |
jeblair | lifeless: my guess is that 1g is still okay for a handful of projects | 20:52 |
pleia2 | anteaya: do you have an hpcloud account? | 20:52 |
anteaya | mordred: do I have an hpcloud account? | 20:53 |
lifeless | jeblair: interesting - the 50G disk space seems flat | 20:53 |
lifeless | jeblair: is it proportional to project count? job definitions? | 20:53 |
anteaya | I'm waiting on approval for something, let's roll the dice and see if I have it yet | 20:53 |
*** dims has quit IRC | 20:53 | |
anteaya | pleia2: let's go with no for now and work on yours | 20:53 |
pleia2 | anteaya: you can set one up (free for 3 months) and then mordred helps get your account upgraded to employee | 20:53 |
anteaya | pleia2: yeah, it won't without a phone number | 20:54 |
pleia2 | but ok, I can toss up an instance on my hpcloud account and give you access to the shell for playing | 20:54 |
pleia2 | ah, goodie | 20:54 |
anteaya | and I have had enough spam on my private telephone number | 20:54 |
anteaya | that is what I am waiting for | 20:54 |
anteaya | pleia2: sounds like our smoothest way forward | 20:54 |
openstackgerrit | Emilien Macchi proposed a change to openstack-infra/jenkins-job-builder: Add Plot plugin support https://review.openstack.org/43685 | 20:54 |
*** dims has joined #openstack-infra | 20:55 | |
*** nati_ue__ has quit IRC | 20:55 | |
pleia2 | anteaya: have your ssh public key handy? | 20:55 |
*** nati_uen_ has joined #openstack-infra | 20:55 | |
anteaya | pm'd | 20:56 |
*** rcleere has quit IRC | 20:56 | |
clarkb | jeblair: interesting observation of the nodepool node graph, we don't seem capable of using all 296 nodes with the shorter test times | 20:56 |
clarkb | jeblair: and if you look at the gate queue right now you will see that we occasionally don't start jobs in order | 20:57 |
jeblair | clarkb: i have an idea of how to get nodepool spinning up nodes faster (though keep in mind, gate resets may reduce the actual used capacity) | 20:58 |
* jeblair -> sf | 20:59 | |
jeblair | clarkb: if you can figure out why those jobs are starting out of order, that would be very useful | 21:01 |
clarkb | jeblair: ok | 21:01 |
lifeless | how far out of order? | 21:02 |
lifeless | like, could it be message bus non-order guarantees? | 21:03 |
clarkb | lifeless: not terrible http://status.openstack.org/zuul/ 44501 is the first case of it | 21:03 |
clarkb | *44051 | 21:03 |
clarkb | lifeless: geard should hand out jobs in order though | 21:03 |
clarkb | lifeless: it is one of the reasons we are using geard and not gearman proper | 21:03 |
*** dprince has quit IRC | 21:03 | |
clarkb | though maybe, zuul is placing the jobs on the bus in a different order than I expect | 21:04 |
mordred | clarkb: hand out jobs in order does not mean start in order | 21:04 |
mordred | clarkb: a loaded jenkins could receive the event and then not kick the job that instant | 21:04 |
mordred | perhaps? | 21:04 |
anteaya | fungi: going to take my first run at standing up the sodabrew puppet-dashboard, and am selected an initial ruby version, you had mentioned you had some experience with 1.8/1.9 incompatibility on Fedora 18 puppet | 21:04 |
anteaya | do you have those notes handy/ | 21:04 |
anteaya | ? | 21:04 |
*** ArxCruz has quit IRC | 21:05 | |
anteaya | when working with other puppet services using 1.8 | 21:05 |
clarkb | mordred: the jenkins executor worker threads should only grab work when they are able to perform work | 21:05 |
mordred | clarkb: right - but able to perform work doesn't mean that java won't decide to do something weird in the next second | 21:06 |
*** pcm_ has quit IRC | 21:06 | |
clarkb | mordred: possibly | 21:06 |
mordred | clarkb: it'd be a tiny race, but you never know | 21:06 |
anteaya | s/selected/selecting | 21:06 |
clarkb | mordred: I think the behavior we are seeing according to the status page indicates a larger race | 21:07 |
*** svarnau has quit IRC | 21:07 | |
clarkb | mordred: because many jobs have started whose slots could have run the queued jobs instead | 21:07 |
*** kiall has quit IRC | 21:07 | |
clarkb | by many I mean >5 | 21:08 |
mordred | clarkb: ah. interesting | 21:08 |
fungi | anteaya: gimme a sec and i'll dig those links up | 21:08 |
anteaya | fungi: thank you | 21:09 |
fungi | anteaya: though what i found was an inability to run puppet 2.7 on ruby 1.9.3 because it broke puppet report (among other bits) | 21:09 |
anteaya | okay, good to know | 21:09 |
* mordred is happy for anteaya to look at ruby things | 21:09 | |
anteaya | I do think the consensus was to start on ruby 1.8.7 | 21:09 |
fungi | it may be that puppet dashboard is new enough to not suffer similar problems, and capable of collecting reports from puppet 2.7 machines? | 21:10 |
anteaya | so I think I will start there, but i do need to know what I am trying to avoid/work with | 21:10 |
*** Ryan_Lane has quit IRC | 21:10 | |
fungi | but yes, if it'll run on ubuntu precise's default ruby, then all the better | 21:10 |
*** Ryan_Lane has joined #openstack-infra | 21:10 | |
* anteaya knows ruby is loathed in here, so trying hard to be an seen as an expert in ruby | 21:10 | |
anteaya | _not_ to be seen, _not_ | 21:10 |
anteaya | I can't type | 21:10 |
lifeless | anteaya: too late. | 21:11 |
lifeless | anteaya: we heard you loud and clear; ruby expert - check. | 21:11 |
anteaya | you are so going to regret that | 21:11 |
anteaya | I know so little | 21:11 |
lifeless | I will | 21:11 |
anteaya | ha ha ha | 21:11 |
lifeless | when I'm asking you ruby questions :P | 21:11 |
anteaya | yup | 21:11 |
anteaya | about that time | 21:11 |
anteaya | fungi: I would like to keep that hope alive as long as possible | 21:12 |
*** dina_belova has joined #openstack-infra | 21:12 | |
anteaya | I wonder for my first pass if I try to push it to 1.9 and then see what breaks when I try to consume other puppety things? | 21:12 |
anteaya | I might try that | 21:13 |
anteaya | by push it to I mean use 1.9 | 21:13 |
anteaya | Wednesday is a great day to break stuff, I think I will go that route | 21:14 |
*** NobodyCam_ has joined #openstack-infra | 21:14 | |
*** NobodyCam_ has quit IRC | 21:14 | |
*** dina_belova has quit IRC | 21:17 | |
anteaya | so if I apt-cache ruby1.9.1 it actually gives me ruby 1.9.3: 1.9.3.0-1ubuntu2.4 yay! | 21:17 |
*** Ryan_Lane has quit IRC | 21:17 | |
*** Ryan_Lane1 has joined #openstack-infra | 21:17 | |
*** marun has quit IRC | 21:17 | |
*** svarnau has joined #openstack-infra | 21:19 | |
*** Ryan_Lane1 is now known as Ryan_Lane | 21:20 | |
*** Ryan_Lane has quit IRC | 21:20 | |
*** Ryan_Lane has joined #openstack-infra | 21:20 | |
*** lcestari has quit IRC | 21:20 | |
openstackgerrit | Ryan Petrello proposed a change to openstack-infra/config: Provide a more generic run-tox.sh. https://review.openstack.org/43145 | 21:21 |
clarkb | I think it might be the order that zuul is putting them in the queue | 21:22 |
*** marun has joined #openstack-infra | 21:23 | |
ryanpetrello | mordred: re: 43145, I went ahead and removed the comments and that distribute workaround | 21:23 |
jeblair | clarkb: why? | 21:25 |
clarkb | jeblair: grepping through the logs indicates the launch order may be funny. I am about to paste some of what I have found | 21:25 |
clarkb | jeblair: http://paste.openstack.org/show/45780/ | 21:28 |
mordred | ryanpetrello: woot! thanks | 21:29 |
clarkb | jeblair: found that grepping for 44051,1 since that is at the base of the problem. | 21:30 |
clarkb | it is possible I missed something as a result. Looking more closely now that I have more info | 21:30 |
anteaya | 16 patches potentially ready to pass the gate | 21:31 |
anteaya | they look so pretty all lined up like that | 21:31 |
clarkb | jeblair: now that I know what I am looking for I still see launches for earlier changes happening after the most recent launches for newer changes | 21:33 |
notmyname | any ideas why tox -epep8 passes but a local invocation of pep8 fails? any hints on what to look for? | 21:34 |
*** sarob has quit IRC | 21:34 | |
notmyname | or more specifically, there seem to be real pep8 issues, just that some aren't found by tox | 21:34 |
mordred | notmyname: hrm. | 21:34 |
clarkb | notmyname: because the pep8 job probably doesn't actually run pep8; instead it runs flake8 | 21:34 |
*** sarob has joined #openstack-infra | 21:35 | |
mordred | notmyname: what clarkb said - did you run flake8? | 21:35 |
openstackgerrit | A change was merged to openstack-dev/pbr: Add pypy to tox.ini https://review.openstack.org/44051 | 21:35 |
notmyname | clarkb: mordred: good point, but I think I found the issue (pep8 1.4.5 vs pep8 1.4.6) | 21:36 |
mordred | notmyname: ah yes. we do our best to pin at the beginning of the cycle | 21:37 |
jeblair | clarkb: im on bart so my debugging is limited. are the log msgs lying? | 21:37 |
notmyname | mordred: is that why it's at 1.4.5? because 1.4.6 came out after havana was well underway? | 21:38 |
mordred | notmyname: yes | 21:38 |
clarkb | jeblair: not sure yet, I don't expect so as the time difference is large and it seems to be missing entries that I would expect | 21:38 |
notmyname | mordred: by implication, should I expect that icehouse will bump it and cause new pep8 failures? | 21:38 |
mordred | notmyname: and we've been bitten in the past by sudden pep8 version bumps | 21:38 |
fungi | lifeless: not sure if anyone got back to you on our zuul disk usage, but 90% of it appears to be /var/log/zuul (there was a while where the jenkins-gearman plugin was very noisy) | 21:38 |
notmyname | mordred: right | 21:38 |
mordred | notmyname: potentially - but we're going to try to figure out a good way to manage that | 21:38 |
mordred | so that it doesn't screw yoy | 21:39 |
mordred | you | 21:39 |
clarkb | fungi: it is still very noisy :) | 21:39 |
notmyname | mordred: don't worry. you'll make somebody mad ;-) | 21:39 |
notmyname | mordred: actually, I can guess who that will be ;-) | 21:39 |
*** sarob has quit IRC | 21:39 | |
anteaya | sdague markmcclain so keystone patch 44509 failed on tempest-neutron: https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-neutron/9040/console thought you might want to know | 21:39 |
notmyname | mordred: jeblair: clarkb: in other news, I just clicked "approve" on a patch that will start gating all swift tests on pep8 | 21:40 |
mordred | notmyname: woot | 21:40 |
*** portante|afk is now known as portante | 21:40 | |
alexpilotti | hi guys, I have a patch that has been unable to merge for many hours | 21:40 |
alexpilotti | https://review.openstack.org/#/c/38791/ | 21:40 |
*** kiall has joined #openstack-infra | 21:40 | |
alexpilotti | besides "reverify no bug", is there anything I can do? | 21:41 |
alexpilotti | I have another 2 approved patches that depend on this one | 21:41 |
openstackgerrit | Emilien Macchi proposed a change to openstack-infra/jenkins-job-builder: Add Plot plugin support https://review.openstack.org/43685 | 21:42 |
anteaya | alexpilotti: it is in the gate pipeline | 21:42 |
anteaya | alexpilotti: don't do anything more, it will take you out of the gate queue | 21:42 |
alexpilotti | ok, do you know how I can estimate when it'll run? | 21:43 |
*** weshay has quit IRC | 21:43 | |
*** thomasm has quit IRC | 21:43 | |
markmcclain | anteaya: looking | 21:44 |
anteaya | alexpilotti: sure, go to this page: http://status.openstack.org/zuul/ | 21:45 |
anteaya | look down the middle column - that is the gate queue | 21:46 |
alexpilotti | anteaya: tx! | 21:46 |
anteaya | find your patch number and if the time is known it will appear in the top right corner of the window with your patch | 21:46 |
anteaya | alexpilotti: np, hope it merges | 21:46 |
anteaya | markmcclain: thanks, it might be a legitimate failure, I just know you and sdague were working on this earlier, thought this might help | 21:47 |
clarkb | jeblair: http://paste.openstack.org/show/45784/ I think I figured it out. This is happening when jobs are restarted if the job result is None | 21:49 |
fungi | anteaya: these are things fixed in puppet 3.0 but never backported to 2.x... http://projects.puppetlabs.com/issues/15190 http://projects.puppetlabs.com/issues/8974 | 21:49 |
clarkb | jeblair: so I think we are ok, there may be a better way to handle this though, perhaps using high priority for job restarts like that and normal for gate? | 21:49 |
anteaya | fungi thanks, what is the tl;dr version of what our plan is for going to puppet 3.0, if it is anything other than the time it takes to upgrade? | 21:50 |
clarkb | mordred: lifeless ^ the reason the tests appear to start out of order is we are working around Jenkins failures by restarting jobs. The restart goes at the end of the queue | 21:50 |
alexpilotti | anteaya: since today is the last day for H3, what is the deadline to submit for review? | 21:51 |
*** ewindisch has joined #openstack-infra | 21:51 | |
fungi | anteaya: i think it's a matter of having a second puppet master and migrating systems from old to new incrementally... i had gotten the impression that you can't mix old and new puppet on old and new ruby in the same master, but i could be wrong | 21:51 |
*** ewindisch is now known as ericw | 21:51 | |
anteaya | ttx ^ | 21:52 |
clarkb | fungi: second master is the way to go | 21:52 |
*** mrodden has quit IRC | 21:52 | |
jog0 | jeblair: ping | 21:52 |
*** ericw is now known as ewindisch | 21:52 | |
anteaya | bet he is asleep though | 21:52 |
fungi | anteaya: also the puppet 3.2.0 release notes claim that the ruby 1.9.3 p0 carried in ubuntu precise is unsuitable, so it may be necessary to wait until $next_lts | 21:52 |
anteaya | alexpilotti: keep submitting until someone tells you to stop | 21:52 |
*** portante is now known as portante|afk | 21:52 | |
clarkb | alexpilotti: I think ttx operates on the before I wake up rule but I am not 100% sure | 21:52 |
anteaya | alexpilotti: I don't know, so just keep working | 21:52 |
yjiang5 | anything wrong in gate? Seems a patch has been paused for a long time. | 21:52 |
anteaya | ah okay, so let's not wake him up | 21:52 |
clarkb | yjiang5: nothing wrong, it had a job restarted which slowed that particular one down | 21:52 |
ewindisch | jog0: hopefully, jeblair is on his way to the AWS hackathon ;-) | 21:53 |
yjiang5 | clarkb: aha, got it. | 21:53 |
*** shardy is now known as shardy_afk | 21:53 | |
clarkb | yjiang5: oh actually | 21:53 |
*** ewindisch is now known as ewindisch- | 21:53 | |
clarkb | hmm | 21:53 |
alexpilotti | anteaya clarkb hehehe I like this approach :-) | 21:53 |
jog0 | ewindisch: I will be heading there for a bit myself ;) | 21:53 |
anteaya | go alexpilotti go | 21:53 |
ewindisch- | jog0: cool | 21:53 |
ewindisch- | wedgwood should be showing up, too. I know you miss him. | 21:53 |
clarkb | yjiang5: that particular test has failed (looking at the console log); I wonder if that has made things go sideways | 21:53 |
pleia2 | ewindisch-: you in town this week? | 21:54 |
*** datsun180b has quit IRC | 21:54 | |
*** sdake_ has quit IRC | 21:54 | |
jog0 | clarkb: yeah it's hanging | 21:54 |
openstackgerrit | A change was merged to openstack-infra/jenkins-job-builder: Add Plot plugin support https://review.openstack.org/43685 | 21:54 |
jog0 | https://jenkins02.openstack.org/job/gate-tempest-devstack-vm-postgres-full/7811/console | 21:54 |
ewindisch- | pleia2: yes. We're doing an AWS + tempest hackathon at evault right now (3rd & Howard) | 21:54 |
anteaya | fungi: arrrr - okay so waiting for the next lts release since I know there is an aversion to using rubies through rvm rather than the distributed versions | 21:54 |
clarkb | sdague: jog0: safe to manually kill that job so that the gate keeps moving? | 21:54 |
jog0 | clarkb: kill it | 21:54 |
pleia2 | ewindisch-: ah yes, I've been there a few times :) | 21:55 |
pleia2 | (busy tonight and tomorrow though, alas) | 21:55 |
clarkb | jeblair: you might also mention that doing a hackathon during the day of feature freeze isn't so great :) | 21:55 |
fungi | anteaya: does rvm automagically update on your machine to apply security fixes? | 21:55 |
anteaya | fungi: yeah the migration of systems from old to new is going to be fun | 21:55 |
jog0 | its a doc patch | 21:55 |
openstackgerrit | Tobias Stevenson proposed a change to openstack-infra/gerritbot: Fix comment event reporting in Gerritbot https://review.openstack.org/44913 | 21:55 |
yjiang5 | clarkb: really strange for that simple patch (change HACKING.RST) | 21:55 |
anteaya | fungi: I don't think so, I think you have to tell it to update | 21:55 |
anteaya | manually | 21:55 |
clarkb | yjiang5: I think flaky tests did it in, not the actual change | 21:55 |
clarkb | jog0: done | 21:55 |
jog0 | clarkb: https://review.openstack.org/#/c/42296/ | 21:55 |
jog0 | clarkb: looks like all dependent patches have to rerun now though, sigh | 21:56 |
anteaya | rvm won't tell you when you need to upgrade, just that if you select an older patch version, it will tell you a newer one is available | 21:56 |
clarkb | jog0: yes, that is how zuul works :) | 21:56 |
jog0 | clarkb: it makes sense just frustrating | 21:56 |
fungi | anteaya: then not suitable for us, unfortunately. same reason we don't like installing things from pip either... it's not a ruby vs python thing | 21:56 |
clarkb | jog0: and why aborting the change now instead of waiting is advantageous | 21:56 |
anteaya | fungi: yeah, I hear that and makes so much sense | 21:56 |
anteaya | *she said while railing against the limitations of the box* | 21:57 |
*** mrmartin has quit IRC | 21:57 | |
fungi | at $oldjob we had an aversion to auto-updating servers, but we also had a team of a dozen people handling the semi-automated security updating we performed | 21:57 |
openstackgerrit | Tobias Stevenson proposed a change to openstack-infra/gerritbot: Fix comment event reporting in Gerritbot https://review.openstack.org/44913 | 21:58 |
jog0 | clarkb: it would be nifty if zuul could detect if a unittest-based job is failing and mark it as failed for zuul's status keeping while it finishes running to report to the user | 21:58 |
jog0 | that would have made this issue not exist | 21:59 |
clarkb | jog0: I am so happy you mentioned that. So one of the potentially super awesome things testr + subunit allows us to do is stream the subunit back to zuul | 22:00 |
clarkb | jog0: zuul could then use that stream to do exactly what you have described | 22:00 |
anteaya | fungi: ah yes, well considering the size of the group we have available to manually update security issues, the approach infra takes is the most stable | 22:00 |
clarkb | jog0: but that requires getting everyone on testr which hasn't quite happened yet | 22:00 |
anteaya | I've never accepted limitation well, glad I work remotely | 22:00 |
clarkb | jog0: and also potentially needing bigger pipes. Not sure what the bandwidth and processing requirements are for 350 subunit streams | 22:01 |
jeblair | jog0: yes, we definitely want to do that; also -- tempest uses testr now so we can benefit from that on those tests | 22:01 |
anteaya | getting horizon on testr is going to be a job | 22:02 |
jog0 | clarkb jeblair:cool. the subunit stream bandwidth issue could be fun :) | 22:02 |
fungi | i wonder if a slave-side subunit processing tool dropping a message on a queue to say "this job is going to fail" as soon as it knows that to be the case would be lighter-weight for zuul | 22:02 |
anteaya | mordred: so I should be going to ruby events, yeah? | 22:03 |
clarkb | fungi: thats a good idea. I think jeblair wants to do more with the subunit streams though | 22:03 |
fungi | ahh | 22:03 |
clarkb | fungi: but that may be a good place to start | 22:03 |
anteaya | I need to find something to propose to a ruby conference that would be a good talk | 22:03 |
jeblair | clarkb: aha! (catching up on the out-of-order thing) | 22:03 |
clarkb | why you should use zuul even though it is written in python | 22:03 |
clarkb | anteaya: ^ | 22:03 |
anteaya | great | 22:04 |
jeblair | clarkb, fungi: yeah, i was kind of thinking something on the jenkins-side; but i only have vague ideas at this point | 22:04 |
anteaya | I will do the reading of the jeblair blog and zuul notes and ask many silly questions as I compose slides | 22:04 |
anteaya | thanks for the idea, clarkb | 22:04 |
jeblair | clarkb, fungi: all we really need is something that can cause gearman-plugin to send a work_warning packet | 22:04 |
jeblair | clarkb, fungi: though we could pipe all the data back if we wanted | 22:05 |
*** zeus has quit IRC | 22:05 | |
fungi | seeing how large the subunit captures are from current unit test jobs makes me worry that it would be a firehose streaming them all at zuul during peak activity | 22:05 |
clarkb | jeblair: fungi: sending work warning packets shouldn't be hard | 22:05 |
clarkb | I think the trickiest bit might be getting subunit into jenkins somehow | 22:05 |
jeblair | so then the question is -- what can we reasonably shim in to jenkins | 22:06 |
jeblair | clarkb: yeah | 22:06 |
*** pcrews has quit IRC | 22:06 | |
jeblair | needs some brainstorming. :) | 22:06 |
clarkb | this might be a good case for brainstorming non jenkins gearman workers | 22:06 |
clarkb | though the investment there is potentially very high | 22:06 |
jeblair | clarkb: yeah, though i think devstack workers are at the bottom of the list of likely non-jenkins workers at the moment (that's the hardest problem to solve) | 22:07 |
fungi | i can see how it would work in the non-jenkins future, but don't know enough about what kind of real-time feedback jenkins gets from slaves besides the obvious console stream and the end-of-job and slave status details | 22:07 |
jeblair | clarkb: (by contrast, privileged slaves are the easiest) | 22:07 |
*** pcrews has joined #openstack-infra | 22:08 | |
* anteaya goes for a walk | 22:08 | |
*** dolphm has joined #openstack-infra | 22:09 | |
jeblair | https://etherpad.openstack.org/hackathon-aws-compat | 22:10 |
jeblair | for folks curious about the aws api hackathon ^ | 22:10 |
SpamapS | wtf.. why does the pypi yaml module not just install libyaml support by default if libyaml-dev is available? | 22:11 |
*** svarnau_ has joined #openstack-infra | 22:11 | |
clarkb | ahahahahahaha | 22:12 |
ewindisch- | ty jeblair | 22:12 |
clarkb | so I just looked into the jenkins slave protocol and now see why there are security concerns | 22:13 |
SpamapS | clarkb: no one can be TOLD what the jenkins matrix is... | 22:13 |
*** dina_belova has joined #openstack-infra | 22:13 | |
fungi | clarkb: sounds like it should be easy then! ;) | 22:13 |
clarkb | "Once connected, remote slave agents can send in commands to be executed on the master, so in a way this is like an rsh service." | 22:13 |
clarkb | comment from the code | 22:13 |
clarkb | but yes, this may make it easier to do what we want | 22:14 |
fungi | greeeaaat. that basically parrots what we heard then | 22:14 |
clarkb | the slave side could run the subunit filter, then request the master to pass the work status packet up to zuul | 22:14 |
clarkb | still digging in | 22:14 |
lifeless | fungi: thanks | 22:14 |
*** svarnau has quit IRC | 22:15 | |
*** tstevenson has quit IRC | 22:16 | |
fungi | from a zuul status ui perspective, i suppose we could change the color of the progress bar for failing but not yet completed jobs | 22:16 |
fungi | but then in zuul treat that the same as it currently does a job failure from a queue management perspective | 22:16 |
clarkb | I wonder if the ssh protocol is very different as that is in a plugin I think | 22:17 |
*** dina_belova has quit IRC | 22:18 | |
*** boris-42 has quit IRC | 22:19 | |
clarkb | I think ssh is just a tunnel so probably not | 22:19 |
*** boris-42 has joined #openstack-infra | 22:19 | |
*** ryanpetrello has quit IRC | 22:21 | |
*** dkliban has quit IRC | 22:23 | |
*** burt has quit IRC | 22:25 | |
*** sarob has joined #openstack-infra | 22:25 | |
*** changbl has quit IRC | 22:27 | |
*** sarob has quit IRC | 22:30 | |
*** prad_ has quit IRC | 22:31 | |
*** jhesketh has joined #openstack-infra | 22:32 | |
*** jhesketh__ has joined #openstack-infra | 22:32 | |
*** thedodd has quit IRC | 22:32 | |
*** sarob has joined #openstack-infra | 22:33 | |
*** pblaho has quit IRC | 22:34 | |
*** atiwari has quit IRC | 22:38 | |
clarkb | https://github.com/jenkinsci/remoting appears to be the slave agent code | 22:38 |
clarkb | or at least makes up a significant portion of the agent | 22:40 |
*** sarob has quit IRC | 22:41 | |
*** sarob has joined #openstack-infra | 22:41 | |
*** dhellmann is now known as dhellmann_ | 22:43 | |
*** nati_ueno has quit IRC | 22:44 | |
*** sarob has quit IRC | 22:45 | |
clarkb | the more I read the more I think the ssh slave agents are safe (or at least not outright allowing the slave to do whatever) | 22:47 |
ewindisch- | jeblair: a recorded version of this for new developers would be awesome | 22:48 |
*** ryanpetrello has joined #openstack-infra | 22:50 | |
clarkb | reading remoting it appears to be fairly synchronous, master sends slave a command to execute over a channel, command is executed, results are returned | 22:51 |
fungi | clarkb: it looks like the agent runs as the jenkins user on our slaves, so in theory anything else running as the jenkins user could subvert the agent to do anything the master's protocol handler allows | 22:52 |
*** sarob has joined #openstack-infra | 22:53 | |
fungi | no real privsep between the agent and things the agent is running | 22:53 |
*** gordc has quit IRC | 22:54 | |
*** sarob_ has joined #openstack-infra | 22:56 | |
*** sdake_ has joined #openstack-infra | 22:56 | |
clarkb | fungi: right, but I don't think our ssh slaves are using the jnlp protocol | 22:57 |
fungi | ahh! right, okay | 22:57 |
clarkb | fungi: instead they just stream stdout back to jenkins | 22:57 |
clarkb | which should make it fairly safe | 22:57 |
fungi | but also means limited in-job signaling back to the mothership | 22:58 |
clarkb | right | 22:58 |
fungi | so we could stick something identifiable in-stream on the console and scrape that in a plugin on the master, but that would be extremely hackish | 22:59 |
*** sarob has quit IRC | 22:59 | |
clarkb | right, there are "Listeners" on the agent side. trying to deduce if they do anything interesting | 23:00 |
fungi | ssh protocol has a signaling channel, but it's unlikely that the agent and master support such nuances of ssh | 23:00 |
alexpilotti | I've been watching a patch on zuul for a while and, after executing tasks, it eventually gets back to having all tasks marked as "queued" | 23:01 |
fungi | http://www.ietf.org/rfc/rfc4254.txt section 6.9 (page 14) | 23:01 |
alexpilotti | it's the first time that I take a look at zuul's behaviour | 23:01 |
*** sarob_ is now known as sarob | 23:02 | |
alexpilotti | is this something that I can consider normal and simply wait or should I start thinking about something? :-) | 23:02 |
fungi | alexpilotti: is this an approved change in the gate? if so, there are probably changes in line ahead of it failing and getting kicked out, which means yours has to be retested without those in the mix | 23:02 |
alexpilotti | fungi: yes it is | 23:02 |
clarkb | fungi: jeblair: I am beginning to wonder if maybe the jenkins slaves should be gearman clients and talk to zuul out of bad that way | 23:03 |
clarkb | s/bad/band/ | 23:03 |
lifeless | clarkb: out of bad is entirely appropriate | 23:03 |
*** fbo_away is now known as fbo | 23:03 | |
fungi | alexpilotti: eventually when your change has only passing changes ahead of it, it won't get its jobs reset any longer | 23:03 |
lifeless | clarkb: You *have* looked at the jenkins internals, right? | 23:03 |
fungi | alexpilotti: and once jobs complete for all the changes ahead of it, they will merge and yours will too | 23:03 |
clarkb | lifeless: I have, I am looking at them more now, this is leading me towards wanting to make jenkins slaves gearman clients :) | 23:03 |
alexpilotti | fungi: ok, there's quite an impressive list of 15 of them in front | 23:04 |
lifeless | clarkb: why jenkins at all at that point? | 23:05 |
fungi | clarkb: or zmq or something if making them talk gearman turns out to be less ideal | 23:05 |
alexpilotti | fungi: if I understand right: http://status.openstack.org/zuul/ | 23:05 |
clarkb | lifeless: earlier I suggested this being a reason to use non jenkins workers. Jeblair indicates doing so for devstack/tempest tests will be a lot of work | 23:05 |
clarkb | lifeless: trying to find a path of least resistance | 23:05 |
lifeless | clarkb: ah | 23:06 |
lifeless | clarkb: I am now curious what features of the jenkins slave devstack/tempest use | 23:06 |
clarkb | lifeless: initially thought it might be possible to abuse jenkins insecurities to do this but I don't think that is possible anymore | 23:06 |
*** reed has joined #openstack-infra | 23:07 | |
lifeless | clarkb: plugin code can get remoted to the slave side | 23:07 |
lifeless | clarkb: jenkins would still need to kick the slave off | 23:08 |
reed | this looks weird: https://review.openstack.org/#/c/24184/ uploaded in July, merged in March? | 23:08 |
clarkb | lifeless: we need asynchronous callbacks into the master or similar | 23:08 |
clarkb | lifeless: to tell jenkins a test will fail but to continue running the job | 23:08 |
clarkb | fungi: is reed's change another potential gerrit DB weirdness? | 23:09 |
clarkb | I am going to walk home now. I have managed to neglect lunch and need to scrounge up food | 23:11 |
clarkb | Back in a bit | 23:11 |
lifeless | another puppet question - where does | 23:12 |
lifeless | $openstack_project::jenkins_ssh_key | 23:12 |
lifeless | get set? | 23:12 |
fungi | clarkb: reed: yes, i bet this happened when we ran update queries to change s/quantum/neutron/ (did we touch the changes table?) | 23:12 |
lifeless | ah, init.pp nvm | 23:12 |
*** dina_belova has joined #openstack-infra | 23:13 | |
clarkb | lifeless: that is a bit of a hack... | 23:14 |
clarkb | lifeless: things that use it inherit the base class which typically you avoid in puppet iirc | 23:14 |
lifeless | I can't see where status.openstack.org is defined | 23:14 |
clarkb | lifeless: it is a static.openstack.org vhost | 23:14 |
*** fbo is now known as fbo_away | 23:15 | |
lifeless | clarkb: oh, so /zuul is all static js pulling from zuul.o.o ? | 23:16 |
clarkb | yup | 23:16 |
clarkb | ok really heading home now | 23:16 |
reed | we have a spammer on the #meeting channel | 23:16 |
anteaya | he left | 23:17 |
anteaya | user Hoangg for those with blocking abilities | 23:17 |
*** dina_belova has quit IRC | 23:18 | |
lifeless | does the gerrit user for zuul need to be 'jenkins' or could they be separate role accounts? | 23:19 |
lifeless | what groups is the jenkins user in, in gerrit.o.o ? | 23:19 |
anteaya | alexpilotti: the heat patch 4 in front of yours failed so the test jobs on your patch have cancelled, once the patches in front of the heat patch finish (and if none of them have a failure) the passing patches will merge, the heat patch will be removed from the queue and your patch will be fourth in line after the gate resets | 23:21 |
anteaya | so the earliest it will merge is in about 50 minutes | 23:22 |
alexpilotti | anteaya: tx for the update | 23:22 |
anteaya | np | 23:22 |
alexpilotti | anteaya: what happens if it fails and it's marked as failed in gerrit? | 23:22 |
alexpilotti | anteaya: I mean, I normally would issue a reverify no bug | 23:23 |
anteaya | then you have to address the message in gerrit and take action based upon what you find | 23:23 |
alexpilotti | anteaya: in that case it goes back in the queue and I have to wait.. another 10 hours? :-) | 23:23 |
anteaya | well hopefully before you do reverify no bug, you take a look at the output from the logs | 23:23 |
anteaya | since something may have changed, you might need to rebase | 23:23 |
alexpilotti | anteaya: sure, I mean, in cases in which a totally unrelated failure happens | 23:24 |
anteaya | if you just reverify no bug and there is a legitimate failure in the logs you are wasting your own time | 23:24 |
anteaya | alexpilotti: occasionally the tests will pass but it won't merge due to a merge conflict and you will be advised to rebase and go again | 23:24 |
alexpilotti | anteaya: yeah, but in case (as it happens most of the time) in which there is no legitimate failure | 23:24 |
*** sdake_ has quit IRC | 23:24 | |
alexpilotti | anteaya: I know :-) | 23:24 |
anteaya | okay well the gate queue can only move as fast as the patches in front of it | 23:25 |
alexpilotti | anteaya: I'm referring to a situation in which it has nothing to do with the patch | 23:25 |
anteaya | so yes, if you have to queue again, you have to line up | 23:25 |
jeblair | lifeless: user can be anything; ours is called jenkins for historical consistency | 23:25 |
anteaya | I have no ability to speed things up in the queue | 23:25 |
fungi | lifeless: we put it in the "Continuous Integration Tools | 23:25 |
fungi | lifeless: " group | 23:25 |
fungi | (yay newlines) | 23:26 |
mordred | lifeless: what fungi said | 23:26 |
lifeless | jeblair: fungi: mordred: thanks. | 23:26 |
lifeless | I'll write up calling it zuul then. | 23:26 |
alexpilotti | anteaya: for example, last time this one failed: http://logs.openstack.org/91/38791/11/gate/gate-tempest-devstack-vm-full/238672e/ | 23:26 |
alexpilotti | anteaya: while all the rest was ok. Tempest has nothing to do with my patch (as I'm doing Hyper-V) :-) | 23:27 |
alexpilotti | anteaya: so in a case like this one, reverify no bug is the only chance? | 23:28 |
mordred | http://www.youtube.com/watch?v=ejuK8_12Fmg&feature=youtu.be | 23:28 |
anteaya | yes, if it is the case that the test failure was unrelated to your patch, ensure the project that is responsible for the failed service knows or a bug is filed, then requeue | 23:28 |
anteaya | alexpilotti: yex | 23:28 |
anteaya | yes | 23:28 |
jeblair | alexpilotti: 'reverify bug ####' would be better | 23:28 |
alexpilotti | anteaya: ok, how do I find out which bug #### I should address? | 23:29 |
jeblair | alexpilotti: it helps track nondeterministic failures for the people who are working on fixing those | 23:29 |
*** dims has quit IRC | 23:29 | |
anteaya | alexpilotti: start here: http://status.openstack.org/rechecks/ | 23:29 |
*** nicedice has joined #openstack-infra | 23:29 | |
*** nicedice_ has quit IRC | 23:29 | |
*** hemna is now known as hemnafk | 23:30 | |
alexpilotti | jeblair: ok thanks :-) | 23:30 |
alexpilotti | I wanted to be sure that I was doing everything in my power to keep things on track | 23:30 |
anteaya | alexpilotti: yes, thank you for asking | 23:31 |
anteaya | it is hard for us to know what people already know and what is new information for them | 23:31 |
anteaya | I'm glad you are learning the wonders of status.openstack.org and the zuul and recheck options | 23:32 |
anteaya | alexpilotti: looking good for you atm | 23:34 |
alexpilotti | anteaya: tx, I have another 2 patches waiting after that one | 23:34 |
anteaya | okay | 23:35 |
alexpilotti | anteaya: my only concern is that they make it before dawn here in Europe ;-) | 23:35 |
*** UtahDave has quit IRC | 23:35 | |
*** pentameter has quit IRC | 23:36 | |
reed | alexpilotti, stop the word! | 23:36 |
reed | damn typo | 23:36 |
*** nicedice_ has joined #openstack-infra | 23:36 | |
anteaya | that works too | 23:36 |
reed | world, silly fingers, worLd | 23:36 |
alexpilotti | I was actually wondering :-) | 23:36 |
*** nicedice has quit IRC | 23:37 | |
reed | mordred, squirrel | 23:37 |
anteaya | alexpilotti: yeah, well if ttx tries to freeze anything before your patches merge I will point him to the log | 23:37 |
anteaya | how's that | 23:37 |
alexpilotti | anteaya: you're my hero :-D | 23:37 |
lifeless | is zuul stateless? That is, does it need backing up or will it regenerate everything out of config + gerrit if rebuilt? | 23:37 |
anteaya | thanks | 23:37 |
anteaya | :D | 23:37 |
reed | anteaya doesn't sleep | 23:37 |
reed | and I'll go silent | 23:38 |
reed | byez | 23:38 |
anteaya | well I don't know about that, not quite moving into the land of no sleep like mordred and fungi | 23:38 |
anteaya | always nice to see you reed | 23:38 |
anteaya | :D | 23:38 |
jeblair | lifeless: stateless | 23:38 |
*** michchap has joined #openstack-infra | 23:40 | |
*** nicedice_ has quit IRC | 23:40 | |
*** nicedice has joined #openstack-infra | 23:40 | |
anteaya | alexpilotti: what are the next two patch numbers you need to merge? | 23:41 |
anteaya | let's get them logged | 23:41 |
fungi | clarkb: yeah, so just noticed that we probably should start setting created_on=created_on in our update queries to the changes table in our project renaming recipe. seems that field is also timestamp type with on update set | 23:42 |
alexpilotti | https://review.openstack.org/#/c/40076/ | 23:42 |
alexpilotti | https://review.openstack.org/#/c/42623/ | 23:42 |
fungi | clarkb: i'll update the documentation for that momentarily | 23:42 |
jeblair | fungi: nice catch | 23:42 |
jeblair | also, evil. | 23:43 |
jeblair | i mean, maybe gerrit should just be implemented as a series of sql triggers while they're at it | 23:43 |
anteaya | alexpilotti: great, 11 patches ahead of you right now, let's see what happens in 32 minutes | 23:43 |
fungi | jeblair: yes, gerrit really, really doesn't look like it needs on update set on any of its timestamp fields (and probably should use a more normal datatype for them anyway, but that's another ball of yarn) | 23:43 |
anteaya | there is a chance you might make this one | 23:43 |
alexpilotti | anteaya: there shouldn't be rebase issues, I chained them for this purpose, but seeing them will be relieving :-) | 23:43 |
*** jhesketh has quit IRC | 23:44 | |
anteaya | alexpilotti: if you are asked to rebase now it would be due to the patches that have merged recently | 23:44 |
alexpilotti | as in seeing them in the tree, ahem | 23:44 |
anteaya | no way of knowing until you get a message on the patch | 23:44 |
alexpilotti | yep, that's the reason for the "relieving" part in the previous sentence | 23:44 |
anteaya | well it will be relieving that is for sure | 23:45 |
alexpilotti | lol | 23:45 |
*** nicedice has quit IRC | 23:45 | |
*** ryanpetrello has quit IRC | 23:45 | |
fungi | however, if zuul isn't going to be able to merge your change on top of the changes ahead of it, you'll find out pretty much right away because it won't leave it in the pipeline any more (as of recently) | 23:45 |
anteaya | well that is good to know | 23:46 |
anteaya | yes sorry that was what we were talking about earlier today | 23:46 |
*** dims has joined #openstack-infra | 23:46 | |
fungi | even this morning after the restart when the gate pipeline was really huge, i reverified a change which wouldn't be able to merge and the gerrit change got a rejection from "jenkins" in less than a minute | 23:46 |
* anteaya shakes her head | 23:46 | |
anteaya | helpful | 23:47 |
clarkb | fungi: looking at the changes merged graph after that point we merged a bunch of stuff but it has been slow since. I wonder if one of those things introduced more flakiness? | 23:47 |
clarkb | fungi: also yay on gerrit db :/ | 23:47 |
*** nicedice has joined #openstack-infra | 23:51 | |