*** xinliang has quit IRC | 03:07 | |
*** amoralej|off is now known as amoralej | 07:29 | |
*** isaacb has joined #zuul | 07:49 | |
*** tflink has quit IRC | 07:53 | |
*** tflink_ has joined #zuul | 07:54 | |
*** electrofelix has joined #zuul | 09:03 | |
*** jkilpatr has quit IRC | 10:48 | |
*** smyers has quit IRC | 10:53 | |
*** eventingmonkey has quit IRC | 11:00 | |
*** smyers has joined #zuul | 11:01 | |
*** rcarrillocruz has quit IRC | 11:01 | |
*** tobiash has quit IRC | 11:01 | |
*** leifmadsen has quit IRC | 11:01 | |
*** toabctl has quit IRC | 11:01 | |
*** eventingmonkey has joined #zuul | 11:02 | |
*** tobiash has joined #zuul | 11:03 | |
*** toabctl has joined #zuul | 11:04 | |
*** rcarrillocruz has joined #zuul | 11:06 | |
*** leifmadsen has joined #zuul | 11:06 | |
*** jkilpatr has joined #zuul | 11:07 | |
*** olaph has quit IRC | 12:08 | |
*** amoralej is now known as amoralej|lunch | 12:14 | |
*** olaph has joined #zuul | 12:26 | |
Shrews | pabelanger: jeblair: is this enough to eliminate py2 for nodepool? https://review.openstack.org/492629 I'm not sure what version of python the coverage test uses, but I'm guessing py2? If so, not sure what to do about that one. | 13:19 |
leifmadsen | jeblair: mordred: what is your availability like for a workshop on zuul v3 / github / bootstrap this week? | 13:22 |
leifmadsen | CC Shrews | 13:22 |
*** tflink_ is now known as tflink | 13:29 | |
*** dkranz has joined #zuul | 13:35 | |
*** amoralej|lunch is now known as amoralej | 13:36 | |
Shrews | leifmadsen: not sure how useful i'd be for that conversation, but thursday is likely the only day i have. you might want jlk for the github things | 13:59 |
leifmadsen | Shrews: all good, just keeping you in the loop in case you wanted to listen in | 13:59 |
leifmadsen | I'll ping you once I get enough material to review and we can banter back and forth :) | 13:59 |
Shrews | yeah, i would like to | 13:59 |
Shrews | tomorrow is the Ansible gathering that I know mordred is attending and i *might* partially attend | 14:00 |
jeblair | leifmadsen: thurs or fri are good for me | 14:21 |
jeblair | Shrews: lgtm. the coverage job should use the python specified in tox.ini i think. so just changing that on the v3 branch should work i think. | 14:25 |
leifmadsen | jeblair: Friday is best for me, maybe around 11am EDT? (or after lunch) | 14:25 |
leifmadsen | jeblair: what is your timezone? | 14:29 |
jeblair | leifmadsen: 11am edt wfm; i'm in pdt. | 14:29 |
leifmadsen | ah ok, just tried to schedule for 1pm EDT, and it says your working hours are 3am to 12pm :) | 14:30 |
leifmadsen | sent out, not sure we'll use the whole time, but I blocked it off just in case | 14:31 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Set base environment as python3 https://review.openstack.org/496251 | 14:34 |
Shrews | ugh, why is the gate-dsvm-nodepool job running? | 14:39 |
Shrews | oh, i bet i needed to wait for puppet things | 14:40 |
*** isaacb has quit IRC | 14:48 | |
jeblair | mordred: devstack-gate has a playbooks/plugins/callback/devstack.py file.... i think that's going to cause problems for us the first time we try to use a playbook in playbooks ? | 16:00 |
SpamapS | 'morning zuulians | 16:18 |
SpamapS | I've called playbooks with playbooks before. | 16:18 |
jlk | o/ | 16:18 |
SpamapS | But maybe the zuul restrictions on playbooks will interfere? | 16:18 |
jeblair | SpamapS: i mean there's a plugin, which we disallow in untrusted repos | 16:19 |
jlk | Are we... New Zuulians? | 16:19 |
jlk | slaying dragons and whatnot? | 16:19 |
jlk | jeblair: hrm. Does that get called on the executor, or on the first node of the set? | 16:20 |
jeblair | jlk: the current plugin? it's used from command-line ansible in the devstack-gate project | 16:20 |
jeblair | it is not related to zuulv3. it will, i believe, prohibit us from running any zuulv3 playbooks out of that directory of the devstack-gate repo as long as it's there. | 16:21 |
jlk | Just trying to map it in my head. | 16:22 |
jlk | I think you're right. | 16:23 |
SpamapS | jeblair: OOOOOOHHHHHHH | 16:26 |
mordred | jeblair: we actually do not need that plugin once we're in v3 | 16:26 |
mordred | jeblair: so let's move it to a different directory so it doesn't conflict | 16:26 |
mordred | jeblair: we write out an ansible.cfg with paths in devstack-vm-gate-wrap.sh anyway, so it being adjacent to the playbooks dir is not necessary | 16:27 |
jeblair | mordred: ya. i'll put in a change to move it as part of my series | 16:27 |
jlk | ++ | 16:29 |
pabelanger | openstack-infra related, but adds publish-openstack-python-branch-tarball job: https://review.openstack.org/494672/ would love some reviews | 16:29 |
jeblair | mordred, jlk, SpamapS: oh, hrm, we actually only blacklist *_plugins directories.... maybe we don't need to change anything? | 16:32 |
jeblair | ie, playbooks/callback_plugins/foo would be bad, but playbooks/plugins/callback/foo (which is what's there) is okay? | 16:33 |
jlk | so devstack-gate is already using a non-standard path, set by ansible.cfg in-repo? | 16:33 |
jlk | If ansible picks up that plugin without configuring the path, we need to update our blacklist routines :/ | 16:34 |
jeblair | jlk: correct, it writes an ansible.cfg | 16:34 |
jlk | does the presence of an ansible.cfg file run afoul of our protections, should the playbook be ran from that directory? | 16:38 |
jlk | (in v3) | 16:38 |
mordred | it shouldn't - but also the ansible.cfg file shouldn't get written out by the v3 version of the job | 16:39 |
jlk | ah okay | 16:39 |
mordred | the only purpose of the ansible.cfg in that dir is to set up logging stuff appropriately - which we have already handled so yay! | 16:39 |
jlk | Anything up for urgent review? | 16:48 |
jlk | etherpads are somewhat difficult to follow | 16:50 |
jeblair | mordred, clarkb: i just found that most of the use of the local_conf jjb macro is actually to do this: enable_plugin {project-repo} git://git.openstack.org/{project-repo} | 16:53 |
jlk | mordred: Is there anything I can help out on today / this morning? | 16:53 |
jeblair | which of course, is not a key/value pair. so i think we'll want a special 'enabled_plugins' zuul job variable or something similar | 16:54 |
jeblair | that may mean the devstack_localrc module i just wrote goes on the shelf for a while | 16:54 |
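A hedged sketch of the job variable jeblair floats above; the `enabled_plugins` name comes from his message, but the structure, job name, and parent are illustrative assumptions, not the final interface:

```yaml
# Hypothetical only: a dict mapping plugin name to repo URL, which a devstack
# role could render into "enable_plugin <name> <url>" lines in local.conf.
- job:
    name: devstack-with-neutron-example   # illustrative job name
    parent: devstack
    vars:
      enabled_plugins:
        neutron: git://git.openstack.org/openstack/neutron
```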
mordred | jlk: I'd say one of two things - depending on which seems like a thing you can dive in to more readily ... | 16:55 |
mordred | jlk: either helping with the devstack-gate -> ansible conversion, which might be fun because ansible but also might be opaque because devstack-gate | 16:56 |
mordred | jlk: (that's the more important topic, but also the one that it's possible might be a brick wall) | 16:57 |
jlk | brick wall you say... | 16:58 |
jlk | Is there an active checklist / punchlist for the devstack gate things? I can at least try one or two before giving up | 16:59 |
jlk | (and do we have a good way of testing WIP?) | 16:59 |
mordred | jlk: or, if that's too much infra-specific learning curve, getting the github driver webhook migrated to zuul-web is bound to be fun - it's definitely lower priority fun though | 16:59 |
jlk | oh that thing. Zuul-web is live now? | 16:59 |
mordred | jlk: https://etherpad.openstack.org/p/AIFz4wRKQm is the etherpad we made the other day in prep for devstack-gate conversion | 16:59 |
jlk | okay I have that one open | 17:00 |
jlk | and that TODO block is current? | 17:00 |
mordred | jlk: I believe our idea for testing is to make a nice new job that first just emits the files we expect so we don't have to run devstack itself - and then work up the chain from there | 17:00 |
mordred | looking | 17:01 |
mordred | jlk: I think so - although from digging in a little over the weekend I think we might need to revise that list as we work | 17:03 |
mordred | jeblair: http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/shade.yaml#n4 is an example of slightly more than that, should you need an example of more complex localrc usage | 17:03 |
mordred | jeblair: it's worth noting that http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/shade.yaml#n39 then does some editing that obviously would be done differently if devstack_localrc existed | 17:04 |
jeblair | mordred: yeah i saw that. it's the only job in the entire system that does anything like that. i suspect that job will just end up with its own playbook. | 17:09 |
jeblair | or jobs, rather | 17:09 |
mordred | jeblair: ah - ok. I'm fine making it be its own thing | 17:10 |
jeblair | but yeah, in general, the localconf stuff is unstructured enough that i think we just need to have the v3 version of it behave more or less like the v2 version. i'll make a var so jobs can automatically splat some stuff into /tmp/dg-local.conf. | 17:11 |
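A minimal sketch of the variable jeblair describes, assuming a free-form blob variable (the name `devstack_local_conf` is an assumption) that a role writes verbatim to the dg-local.conf path:

```yaml
# Illustrative role task: splat a job-supplied blob into /tmp/dg-local.conf.
- name: Write job-supplied local.conf blob
  copy:
    content: "{{ devstack_local_conf | default('') }}"
    dest: /tmp/dg-local.conf
```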
mordred | jeblair: may be worth checking in with AJaeger/sdague about plans - my understanding is that they were trying to get more people to use local_conf so that people would do less scripts setting env vars and whatnot | 17:11 |
mordred | jeblair: cool | 17:11 |
jeblair | mordred: i think that's what i just said? | 17:12 |
jeblair | mordred: was that an irc lag issue? | 17:12 |
mordred | yah | 17:12 |
jeblair | cool | 17:12 |
*** electrofelix has quit IRC | 17:12 | |
jeblair | mordred, clarkb: the only issue i can see with this is that it is less amenable to inheritance. so if you have a devstack-foo job which splats some localconf, and you inherit from it, you will need to duplicate the parent's localconf if you want to add your own localconf. | 17:13 |
mordred | yah - I mean, I liked the localconf thing we sketched out - even if the shade jobs woudl be the only jobs using it | 17:14 |
mordred | jeblair: perhaps we should just start out using what you wrote in shade and see how well it works, and if it's good we can point people to it as a way to do more complex stuff just with local.conf and with less env-var logic? | 17:15 |
jeblair | mordred: the thing that is unique to the shade jobs is that they run sed on dg-local.conf | 17:15 |
jeblair | mordred: the thing i wrote was when i thought local.conf was variables. it's not; it's commands as well. | 17:16 |
mordred | yah. that puts a big crimp in the idea | 17:16 |
SpamapS | could we fix that in shade? | 17:16 |
SpamapS | do other projects do that? | 17:16 |
mordred | just shade | 17:16 |
mordred | I guess I'm the only one who found copying 15 lines of localrc more distasteful than three sed lines | 17:17 |
jeblair | or the only one who didn't read the "private interface" line :) | 17:17 |
mordred | :) | 17:17 |
mordred | I mean - actually ... for shade, I could just skip local_conf and do a normal ansible template file with jinja for the logic to write out the local.conf file | 17:18 |
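What mordred suggests here, expressed as a single task; the template name and the destination path are assumptions for illustration:

```yaml
# Render a project-specific jinja template straight to local.conf instead of
# going through the local_conf macro.
- name: Write local.conf from a jinja template
  template:
    src: shade-local.conf.j2            # hypothetical template in the role
    dest: /opt/stack/devstack/local.conf  # assumed devstack location
```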
jeblair | i toyed with the idea of converting the 'enable_service' command into a variable, but it can have 2, 3, or 4 arguments, and there's a 'disable_service' command too. so the whole thing starts to look pretty procedural (because it is -- it's a bash script) and it gets harder and harder to think about it as variables. | 17:21 |
jeblair | mordred: but maybe we can combine the approaches | 17:22 |
jeblair | maybe we can have a variable to "throw this blob into dg-local.conf", but also a "merge these variables through job inheritance and write them to dg-local.conf" | 17:22 |
jeblair | and maybe we can even rethink enabling services -- if we can simplify or standardize the most common forms of the enable_service command, we might be able to do that as a boolean dict: services: {ceilometer-api: True} | 17:25 |
jeblair | i think those may be best thought of as 3 layered approaches, with increasing complexity. and maybe i should just focus on the first to facilitate the most straightforward devstack-gate conversion. | 17:27 |
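A hedged illustration of the boolean services dict sketched a few lines above, rendered into enable_service/disable_service lines; the variable name `devstack_services` and the fragment path are assumptions:

```yaml
# Turn {ceilometer-api: True, ...} into devstack enable/disable lines.
- name: Render service toggles into a local.conf fragment
  copy:
    dest: /tmp/dg-local.conf
    content: |
      {% for name, enabled in devstack_services.items() %}
      {{ 'enable_service' if enabled else 'disable_service' }} {{ name }}
      {% endfor %}
```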
openstackgerrit | Andreas Jaeger proposed openstack-infra/zuul master: Add link to Infra Manual https://review.openstack.org/496329 | 17:28 |
mordred | jeblair: ++ | 17:31 |
mordred | jeblair: I think your suggestion is awesome - but since shade is the only current user that 'needs' it, and the current hack shade is doing is fine, we can defer that to post-ptg | 17:32 |
jeblair | mordred: yeah; i suspect that will let us simplify a lot of other jobs, but yes, that's not something we'd plan on doing pre-ptg anyway. | 17:33 |
mordred | pabelanger: heya - so - your twine patches inspired me in some ways and I wrote followup patches (which seemed to me to be the easiest way to discuss the thoughts I had instead of review comments) | 17:33 |
mordred | jeblair: ++ | 17:33 |
mordred | pabelanger: I shall now push them up - but feel free to tell me I'm dumb/wrong/a-bad-person or whatnot | 17:34 |
jeblair | 17:34 < openstackgerrit> James E. Blair proposed openstack-infra/devstack-gate master: Zuul v3: add a simple devstack job https://review.openstack.org/496330 | 17:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename upload-twine to upload-pypi https://review.openstack.org/496331 | 17:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 17:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 17:35 |
jeblair | devstack-lxc ERROR No valid playbook found | 17:37 |
jeblair | that was helpful! ^ :) | 17:37 |
mordred | jeblair: yay! | 17:37 |
mordred | pabelanger, jeblair: https://review.openstack.org/#/q/topic:upload-python includes pabelanger and my changes - I put up the two 'Add non-OpenStack' patches to show how I thought the abstraction could be used | 17:38 |
jlk | okay I'm looking at one of the TODO items on the devstack-gate list, add-sshkey-to-root role, which I think is creating an SSH key on each node in the set for the root user, and distributing it around. | 17:40 |
jlk | and setting known_hosts accordingly | 17:40 |
jlk | Does that match other people's understandings? | 17:40 |
jlk | although looking at the setup_ssh function, it seems to be using a singular /etc/nodepool/id_rsa.pub file. So maybe I'm wrong? Is this using just a single ssh key file? | 17:41 |
jeblair | jlk: look at SpamapS's work, i think he may have found that the known_hosts bit should happen in that role. and you should be able to re-use the key we're using to log in as the zuul user too (it doesn't need to be a different key for root afaik) | 17:42 |
mordred | jlk: SpamapS has most of that one done although it needs review | 17:42 |
mordred | what jeblair said | 17:42 |
mordred | do we actually do keys for root? | 17:42 |
jlk | yeah I reviewed it, but it seemed to be a different work item | 17:43 |
jeblair | SpamapS's task was to add the private keys for zuul on all hosts, jlk's is to add keys for root | 17:43 |
jeblair | mordred: for nova migration | 17:43 |
SpamapS | I am not 100% sure of the actual role and playbook I did that in btw. | 17:43 |
jlk | oh this is for nova? | 17:43 |
jlk | ahhh | 17:43 |
mordred | ah - gotcha. | 17:43 |
SpamapS | Feels very broad in scope.. | 17:43 |
mordred | so - overall we have 2 different needs: | 17:44 |
SpamapS | But at the same time, I kind of like having that as a basic assumption if you're using zuul-jobs | 17:44 |
SpamapS | "all your nodes start out being able to SSH to each other" | 17:44 |
jeblair | SpamapS: yes, i think having the private key on all hosts is universally applicable | 17:44 |
mordred | - users need to be able to ssh between test nodes as zuul | 17:44 |
mordred | - users need to be able to ssh between test nodes as root | 17:44 |
jlk | if it's for nova, then yeah, each node needs a private key and each other node needs to accept that private key | 17:44 |
SpamapS | ok so I just needed encouragement that there was already agreement on that :) | 17:44 |
jlk | mordred: I'd challenge that second one, not - users, instead - nova | 17:44 |
mordred | yah. there needs to be an ability a job can opt in to that lets a user ssh to other nodes as root | 17:45 |
jlk | SpamapS: so your change takes a single build private key and puts it as zuul's private key? | 17:45 |
SpamapS | jlk: it does | 17:45 |
SpamapS | they all end up with the same key | 17:45 |
SpamapS | which is specific to the build | 17:45 |
jlk | is there any reason not to expand that to do it as root too? | 17:45 |
SpamapS | which I think is a nice feature... single use key for all | 17:45 |
jlk | or nova? What user is nova going to be running as in devstack? | 17:46 |
mordred | stack? | 17:46 |
SpamapS | jlk: no I think adding it to root is a good idea | 17:46 |
SpamapS | I also wonder about using /etc/ssh/known_hosts instead of ~/.ssh/known_hosts | 17:46 |
jlk | At blue box we ran as the "nova" user, and thus I made: https://github.com/blueboxgroup/ursula/blob/master/roles/nova-data/tasks/ssh.yml | 17:46 |
mordred | oh - that might be nice for this | 17:46 |
SpamapS | Can't see any problem with making all the other test nodes part of the global known_hosts. | 17:46 |
* SpamapS is reminded that there is feedback on the ssh key type PR/bug that needs responding | 17:47 | |
mordred | SpamapS: we could then always populate /etc/ssh/known_hosts in base - but make "authorize connection to user" a role people use on demand | 17:47 |
SpamapS | mordred: right | 17:47 |
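A sketch of what the on-demand "authorize connection to user" role mordred describes might contain; the task, variable names, and key path are assumptions, not the role that was eventually written:

```yaml
# Authorize the per-build public key for an arbitrary target user.
- name: Authorize the per-build public key for {{ ssh_target_user }}
  authorized_key:
    user: "{{ ssh_target_user }}"
    # Key is read on the executor side via a file lookup (assumed path variable).
    key: "{{ lookup('file', build_ssh_public_key_path) }}"
  become: yes
```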
SpamapS | Also, consider that the other thing we might want to do is just set StrictHostKeyChecking to "no" | 17:48 |
SpamapS | was that considered? | 17:48 |
mordred | so in a the devstack-gate base job we could have "- {'role': 'authorize-ssh-connection', 'user': 'zuul'}\n- {'role': 'authorize-ssh-connection', 'user': 'root'}" | 17:49 |
SpamapS | might cause issues for jobs that want to ssh to places safely I suppose | 17:49 |
mordred | SpamapS: well - we HAVE the host keys, so if we set them, maybe it catches things attempting to connect to the wrong thing? | 17:49 |
SpamapS | yeah | 17:49 |
SpamapS | n/m on that | 17:49 |
SpamapS | just recoiling in horror at the ansible module I wrote ;) | 17:49 |
SpamapS | Oh also, modules don't automatically get fed hostvars right? | 17:50 |
* SpamapS just realized he assumed that | 17:50 | |
jlk | argh, wtf. | 17:51 |
jlk | In ursula, we're dealing with both "nova" user ssh things, and root (because of libvirt) | 17:51 |
jlk | I think I remember some of that, nova does some things with ssh directly, and libvirt does some as well | 17:51 |
jlk | (if libvirt is configured to use ssh to connect to other remote libvirts). | 17:51 |
jlk | How is it configured in devstack? | 17:52 |
clarkb | nova itself doesn't iirc, its just libvirt | 17:52 |
clarkb | and it ssh's as root in devstack setups | 17:52 |
jlk | are you sure about that? I thought nova did to transport images | 17:52 |
clarkb | jlk: thats all http to glance | 17:52 |
jlk | one sec | 17:52 |
clarkb | in devstack-gate we set up /etc/hosts for dns resolution because libvirt uses hostnames and then also add the nodes to known_hosts and add the ssh key to the root user's authorized keys | 17:53 |
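A rough Ansible sketch of the devstack-gate setup clarkb describes (hosts entries plus a root authorized key); the fact lookup assumes facts were gathered, and the id_rsa.pub path is the /etc/nodepool one mentioned earlier:

```yaml
# Add every test node to /etc/hosts so libvirt can resolve hostnames.
- name: Add test nodes to /etc/hosts
  lineinfile:
    path: /etc/hosts
    line: "{{ hostvars[item]['ansible_default_ipv4']['address'] }} {{ item }}"
  with_items: "{{ groups['all'] }}"
  become: yes

# Authorize the nodepool key for root so nodes can ssh to each other as root.
- name: Authorize the nodepool key for root
  authorized_key:
    user: root
    key: "{{ lookup('file', '/etc/nodepool/id_rsa.pub') }}"
  become: yes
```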
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 17:53 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Add known_hosts generation for multiple nodes https://review.openstack.org/494700 | 17:53 |
* SpamapS rebased | 17:53 | |
clarkb | and yes the user all openstack services run under in devstack is 'stack' | 17:54 |
clarkb | we don't set up ssh things for stack, just root | 17:54 |
jlk | dammit I know there was a reason nova was sshing as nova | 17:56 |
jlk | https://docs.openstack.org/nova/latest/admin/ssh-configuration.html is still live | 17:56 |
clarkb | I think that may be if you want libvirt to talk as the nova user | 17:57 |
clarkb | but the default is root iirc | 17:57 |
jlk | no it's the resize code path instead of the migration code path. It's a different thing I think. Digging more | 17:58 |
clarkb | we test both with tempest and there is no nova user | 17:58 |
clarkb | https://docs.openstack.org/nova/latest/admin/configuring-migrations.html#kvm-libvirt | 18:00 |
jlk | again that's live migration, different from resize. | 18:01 |
clarkb | right but we test both is what I'm saying and only use the live migration libvirt config | 18:01 |
clarkb | makes me wonder if that is just stale docs | 18:01 |
clarkb | since a resize is performed as a libvirt migration aiui | 18:01 |
* clarkb asks nova channel | 18:02 | |
jlk | https://docs.openstack.org/ocata/config-reference/compute/resize.html | 18:02 |
jlk | (for some reason the pike doc set is missing nova) | 18:02 |
jlk | At least in newton era nova was sshing as itself over to other compute hosts | 18:03 |
jlk | and it used the address from the service record | 18:03 |
clarkb | ya the address from the service record is also what libvirt uses so we need the /etc/hosts stuff either way. I have asked the nova channel for clarification on the nova user stuff | 18:04 |
dmsimard | I have some time set aside to help upstream/infra for the foreseeable future, I hear v3 migration is where help is needed most. Do we have a list of todos ? | 18:08 |
clarkb | dmsimard: https://etherpad.openstack.org/p/AIFz4wRKQm has a big list related to the devstack-gate items | 18:09 |
dmsimard | clarkb: okay, I'll look if there's anything in there I can work on, thanks. | 18:10 |
clarkb | jlk: ok based on mriedem's email I have learned things. https://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/devstack-vm-gate.sh#n181 at line 182 there is stealth-mode setup for the stack user. Then at 183 we set up the root user, and it's the root user that we may not actually use, but it looks like mriedem wants to invert things so probably best to set up both for the time being? | 18:10 |
clarkb | we may want to rewrite that as setup_ssh ~stack/.ssh | 18:11 |
dmsimard | I can probably hack on "split devstack-vm-gate-wrap into two main parts", anyone working on that right now ? | 18:12 |
dmsimard | ok, putting my name on there. | 18:15 |
jeblair | the multinode role needs to be finished as well: https://review.openstack.org/#/c/435933/ | 18:20 |
dmsimard | clarkb: where does the bindep for devstack/devstack-gate live ? | 18:25 |
dmsimard | Or is it assumed that the necessary things are embedded through nodepool elements instead ? | 18:25 |
clarkb | dmsimard: we don't really use bindep in them. There are a handful of things we install but most are installed into a virtualenv iirc. Then devstack has its own package listing system that does the rest | 18:25 |
clarkb | the idea is that the two tools bootstrap themselves off a fairly minimal base image | 18:26 |
clarkb | there has been talk of converting devstack to using bindep for its package listing and installing but that hasn't happened yet as far as I know | 18:26 |
clarkb | but that way instead of listing all the things nova needs for example it can just install what nova's bindep file says it needs | 18:26 |
dmsimard | clarkb: ah, looks like http://git.openstack.org/cgit/openstack-dev/devstack/tree/files/rpms/general for example | 18:27 |
*** amoralej is now known as amoralej|off | 18:27 | |
clarkb | ya | 18:27 |
dmsimard | to put myself into perspective, the installation of those packages occurs *after* devstack-gate has run, correct ? | 18:28 |
mordred | yah - devstack-gate's job is mostly just to prepare for a devstack run and then to run devstack | 18:29 |
dmsimard | right, okay, thanks | 18:29 |
jlk | clarkb: that call to $BASE/new/.ssh is confusing. What is $BASE? Is this where it's setting things up for the stack user? | 18:32 |
clarkb | jlk: $BASE/new is the stack user's home dir | 18:33 |
clarkb | and $BASE is the base location for devstack-gate filesystem writing | 18:33 |
clarkb | which is why making it setup_ssh ~stack/.ssh would be clearer | 18:34 |
jlk | gotcha. | 18:34 |
jlk | So it sounds like we need to build on SpamapS's role, which appears to be doing everything as the zuul user, we'll want to duplicate those ssh keys and such for a set of users. root and "stack" to start with. | 18:35 |
jlk | either build on it, or add an adjacent role that runs after his | 18:36 |
SpamapS | Could we do the same role, but with "become" set? | 18:36 |
clarkb | or user argument, since you may want to run as root in all cases (to update /etc/hosts if necessary?) | 18:37 |
jlk | hrm. | 18:37 |
jlk | so if we change his role to adjust /etc/ssh instead of ~/.ssh for things like known_hosts, then all the subsequent role would need to do is distribute authorized_keys and priv/pub keys to the appropriate users | 18:38 |
jlk | oh his role doesn't touch known_hosts does it | 18:39 |
jlk | but yeah I think we want it all to be done as the root user, just setting ownership on the files where appropriate | 18:40 |
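A sketch of the "copy the build ssh key to other users" idea jlk outlines here (done as root, with ownership set explicitly); the source and destination paths and variable names are assumptions, not the role that was later proposed:

```yaml
# Install the per-build private key for an additional user (e.g. root or stack).
- name: Install the per-build private key for {{ target_user }}
  copy:
    src: /home/zuul/.ssh/id_rsa        # assumed location of the build key on the node
    dest: "{{ target_home }}/.ssh/id_rsa"
    owner: "{{ target_user }}"
    mode: "0600"
    remote_src: yes
  become: yes
```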
* jlk afk for lunch and such | 18:42 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 18:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 18:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename upload-twine to upload-pypi https://review.openstack.org/496331 | 18:49 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename tox/tarball-post to python/tarball-post https://review.openstack.org/496364 | 18:49 |
dmsimard | mordred: ah, bummer, you were already working on splitting the ansible stuff out of devstack-gate wrap ? Your name wasn't on the pad so I started looking at it. I'll review your stuff | 18:49 |
pabelanger | mordred: sure, looking at the patches now | 19:04 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename tox/tarball-post to python/tarball-post https://review.openstack.org/496364 | 19:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 19:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 19:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename upload-twine to upload-pypi https://review.openstack.org/496331 | 19:07 |
mordred | dmsimard: my stuff may be completely bogus - feel free to ignore it | 19:09 |
mordred | dmsimard: that was more me exploring/noodling | 19:09 |
mordred | pabelanger: (sorry, just did one last update) | 19:09 |
dmsimard | mordred: you're okay if I pick it up ? | 19:10 |
dmsimard | I planned on cleaning some stuff and then doing the split afterwards | 19:10 |
dmsimard | So worst case there's bound to be a good amount of merge conflicts | 19:11 |
pabelanger | mordred: ack | 19:11 |
mordred | dmsimard: please do! | 19:12 |
mordred | dmsimard: and please bring any sanity to that you can :) | 19:12 |
dmsimard | mordred: ack | 19:12 |
mordred | jeblair: remind me again ... are playbooks defined in one repo available to jobs in a different repo? | 19:13 |
mordred | jeblair: actually -nevermind | 19:13 |
jlk | so playbook locations | 19:16 |
jlk | this role setting up ssh users, that's done in a base/pre.yaml job. The thing to make nova/libvirt work will be done as part of the devstack playbook, right? It can assume that base/pre.yaml has been completed? | 19:16 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Split ensuring tox is installed into a role https://review.openstack.org/496375 | 19:19 |
mordred | jlk: yes. base/pre will run before anything else | 19:21 |
jlk | okay I'm going to make this a separate role, since not everything in zuul-jobs may want the keys spread around like this | 19:22 |
pabelanger | jeblair: mordred: http://zuulv3-dev.openstack.org/logs/sandbox/ publish-openstack-python-branch-tarball worked! | 19:23 |
mordred | pabelanger: woot! | 19:24 |
pabelanger | that was run from post pipeline | 19:24 |
mordred | jlk: ++ | 19:28 |
mordred | pabelanger: that's super exciting | 19:29 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball post job https://review.openstack.org/496380 | 19:30 |
pabelanger | mordred: ya, I'll propose an update to switch to using tarballs.o.o too | 19:30 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename tox/tarball-post to python/tarball-post https://review.openstack.org/496364 | 19:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 19:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 19:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename upload-twine to upload-pypi https://review.openstack.org/496331 | 19:35 |
mordred | pabelanger: woot | 19:35 |
mordred | pabelanger: also - linter job actually helped :) | 19:35 |
pabelanger | :) | 19:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 19:37 |
mordred | pabelanger: do we have anything consuming tox-tarball atm? | 19:41 |
mordred | pabelanger: I don't see anything | 19:41 |
pabelanger | mordred: no, I think we agreed to delete it? | 19:42 |
jeblair | dmsimard, mordred, jlk: if you propose any patches to devstack-gate, would you mind setting the topic to zuulv3 ? (devstack-gate has something like 90 open changes) | 19:44 |
jlk | okay | 19:44 |
mordred | jeblair: ++ | 19:44 |
dmsimard | jeblair: will do | 19:44 |
mordred | pabelanger: yah - although if someone wants a build tarball job in their gate - should we put a build-openstack-tarball in openstack-zuul-jobs using the roles we've got? | 19:44 |
pabelanger | mordred: ya, that might make sense | 19:46 |
jeblair | dmsimard: i notice your changes look like they are heading in the direction of updating the current v2 usage -- in our etherpad brainstorming, we agreed it would be better to go ahead and cut ties with the v2 jobs. we're close enough to the cutover that we can just fork/copy the things we need and generally leave v2 alone. (we may need to make some changes, such as splitting out parts of shell scripts, but we should keep those minimal) | 19:46 |
jeblair | (it takes a really long time to land changes to devstack-gate) | 19:47 |
jlk | what's a good hostgroup to loop over to get all the nodes of a set? | 19:48 |
jeblair | jlk: 'all'? | 19:49 |
jlk | We're not explicitly inserting localhost into the inventory are we? | 19:49 |
dmsimard | jeblair: I was taking an approach to iteratively clean up what was there right now and then split things off which mordred started in https://review.openstack.org/#/c/495927/ | 19:49 |
jeblair | jlk: correct, we're not (partially to avoid that problem) | 19:50 |
jlk | cool | 19:50 |
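A minimal illustration of the answer above: the implicit `all` group covers every node in the build's node set, and localhost is not in the Zuul inventory:

```yaml
# Loop over every peer node from each node in the set.
- hosts: all
  tasks:
    - name: Show each peer node as seen from this node
      debug:
        msg: "{{ inventory_hostname }} sees peer {{ item }}"
      with_items: "{{ groups['all'] }}"
```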
dmsimard | jeblair: This also gives me the opportunity to learn how things work since I'm admittedly not that familiar with devstack/devstack-gate | 19:50 |
dmsimard | jeblair: I'd like to take the split off of mordred's hands so he can work on other stuff since I have some time to invest and help | 19:50 |
jeblair | dmsimard: thanks. i'd suggest keeping changes to the running system to a minimum. i don't see why we should upgrade ansible or move the devstack-gate ansible configuration around. | 19:51 |
dmsimard | fair | 19:53 |
jeblair | dmsimard: (we're not going to be running ansible inside of devstack-gate any more, so that's all dead code) | 19:53 |
dmsimard | jeblair: do we have a "picture" of what we'd like things to look like on v3 ? perhaps that's what I'm missing | 19:54 |
dmsimard | like, are we moving the devstack-gate roles out of devstack-gate for example ? | 19:54 |
*** jkilpatr has quit IRC | 19:55 | |
jeblair | dmsimard: closest thing is in the etherpad. | 19:55 |
dmsimard | jeblair: this one ? https://etherpad.openstack.org/p/AIFz4wRKQm | 19:56 |
jeblair | dmsimard: yep | 19:56 |
dmsimard | okay. | 19:56 |
jeblair | dmsimard: the thing you're working on came from mordred's head, so maybe he can flesh it out a bit more | 19:56 |
jeblair | we're still lacking the overlay network role, which is a well-defined task if anyone wants to finish that up: https://review.openstack.org/#/c/435933/ | 19:57 |
mordred | jeblair, dmsimard: that was mostly me thinking about ways to make small changes to existing d-g so that we could write new v3 d-g jobs that potentially called some of the same shell scripts - but it's much more likely that the whole chunk of that from my brain should be ignored | 19:58 |
jeblair | mordred: yeah. i get the idea. i suspect we will have to do some of that. i don't personally know yet what would be useful. | 20:00 |
jeblair | i'm starting to suspect that a lot of the localrc stuff will be. | 20:00 |
jeblair | but i can only see as far as i've started working. :) | 20:00 |
mordred | yah | 20:03 |
mordred | pabelanger: fwiw - when we have a role in a repo (like zuul-jobs) but no playbook in that repo that uses it, our ansible lint checks don't actually check anything so far | 20:05 |
mordred | pabelanger: we may want to consider adding some test playbooks just to get the linting checks for some of them | 20:06 |
pabelanger | mordred: yes! I think I'm going to write a zuulv3 ansible-lint job which runs across all 4 repos | 20:06 |
mordred | I am noticing this because I added a job to zuul-jobs that uses a role that was defined a few commits before, and the linter doesn't pick up the issues until the patch with the job | 20:06 |
mordred | pabelanger: ++ | 20:06 |
pabelanger | because we have 1 way testing in some cases | 20:06 |
mordred | yup | 20:06 |
pabelanger | I hope to get to that once publishing works today | 20:07 |
mordred | pabelanger: other than things where I'm still fixing it actually working - you ok with my followons to your patches? | 20:07 |
pabelanger | mordred: ya, so far looks good. I should un WIP the base. Or you can | 20:07 |
mordred | cool - I'll squash those real quick | 20:07 |
pabelanger | great | 20:08 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Rename tox/tarball-post to python/tarball-post https://review.openstack.org/496364 | 20:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 20:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 20:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add upload-pypi role https://review.openstack.org/495972 | 20:10 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Remove tox-tarball job https://review.openstack.org/496395 | 20:10 |
jlk | are there zuul inventory variables for node ssh hostkeys? | 20:13 |
pabelanger | no, but ansible provides them via facts | 20:13 |
jlk | I was looking for that variable but wasn't succeeding. What's the fact variable? | 20:14 |
pabelanger | ~/.ssh/known_hosts on executor will have them also | 20:14 |
pabelanger | ansible_ssh_host_key_rsa_public for example | 20:15 |
pabelanger | http://logs.openstack.org/89/489689/18/check/tox-pep8/bfa7f32/zuul-info/host-info.ubuntu-xenial.yaml | 20:15 |
jlk | host_keys | 20:15 |
*** jkilpatr has joined #zuul | 20:15 | |
pabelanger | sorry, not following | 20:16 |
jlk | Ah I was looking at the zuul code that populates known_hosts file, it grabs node['host_keys'] | 20:16 |
jlk | ugh, the fact isn't in a direct format that the known_hosts module uses. | 20:17 |
jlk | the value of the fact is the key without the type info | 20:17 |
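An illustrative way to build known_hosts entries from the host-key fact pabelanger mentions; since the fact holds only the key body, the `ssh-rsa` type is prepended by hand (host naming here is an assumption):

```yaml
# Add each test node's RSA host key to the user's known_hosts.
- name: Add each test node's RSA host key to known_hosts
  known_hosts:
    name: "{{ item }}"
    key: "{{ item }} ssh-rsa {{ hostvars[item]['ansible_ssh_host_key_rsa_public'] }}"
  with_items: "{{ groups['all'] }}"
```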
mordred | jlk: SpamapS wrote some pyhton for this that might be helpful | 20:17 |
mordred | jlk: https://review.openstack.org/#/c/494700/ | 20:17 |
jlk | ah that's exactly what I need, and I don't have to write it into my role | 20:18 |
jlk | Just need to get this review merged | 20:18 |
mordred | \o/ | 20:19 |
mordred | pabelanger, jeblair, jlk: https://review.openstack.org/#/q/status:open+topic:upload-python is ready for review | 20:21 |
jeblair | mordred: all lgtm | 20:26 |
dmsimard | jeblair: I've dropped the ansible.cfg directory review, can we compromise on the other 3 to unpin paramiko, update ansible and centralize config options in ansible.cfg ? I can squash them if necessary. Just feels like a low hanging fruit worth doing :) | 20:27 |
dmsimard | ansible and paramiko have passed check gate already | 20:28 |
dmsimard | Updating ansible seemed worthwhile (and taking out the paramiko pin) as zuul uses ansible>=2.3.0.0 unless I'm mistaken | 20:29 |
mordred | dmsimard: first two look fine to me- the third one increased forks from 5 to 25 | 20:30 |
dmsimard | mordred: right, I can take that out entirely.. ansible defaults forks to 5 so I figured the intention was to raise it | 20:31 |
jeblair | dmsimard: i'm sorry, i can't review that. | 20:31 |
jeblair | dmsimard: if we succeed, that code has 3 weeks of life left. | 20:32 |
jlk | What is our current strategy with variable naming for in-role variables? | 20:32 |
jlk | like an option passed to the role? | 20:32 |
mordred | jlk: I've been trying to prefix them with something sensible | 20:32 |
jeblair | jlk: try to have a name that's unique to the role.. ya ^ | 20:32 |
dmsimard | jeblair: it makes the devstack-gate roles and playbooks run with the same version of ansible zuul is running, we're otherwise exposing ourselves to regressions we could have otherwise caught IMO | 20:33 |
jeblair | like i'm using "devstack_foo" on the stuff i'm writing | 20:33 |
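An example of the prefixing convention being discussed, as a defaults file for a hypothetical "devstack-setup" role (names are illustrative only):

```yaml
# roles/devstack-setup/defaults/main.yaml -- every variable carries the role prefix.
devstack_setup_base_dir: /opt/stack
devstack_setup_extra_services: {}
```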
jeblair | dmsimard: we've moved almost all the interesting playbooks out of devstack-gate already | 20:33 |
jeblair | most of the host setup stuff is now in the base job | 20:34 |
dmsimard | ¯\_(ツ)_/¯ | 20:36 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add upload-pypi role https://review.openstack.org/495972 | 20:37 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Rename tox/tarball-post to python/tarball-post https://review.openstack.org/496364 | 20:37 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add non-OpenStack python tarball creation job https://review.openstack.org/496332 | 20:37 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add non-OpenStack PyPI upload job https://review.openstack.org/496333 | 20:38 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Remove tox-tarball job https://review.openstack.org/496395 | 20:38 |
pabelanger | do we know why webstream is not working? | 20:39 |
pabelanger | follow up question, could somebody with HTML skills add a URL for finger:// too? | 20:40 |
pabelanger | blam! http://tarballs.openstack.org/sandbox/ | 20:40 |
pabelanger | new tar.gz and .whl uploaded | 20:40 |
mordred | pabelanger: oh - we need to restart ze01 now that hostname is set properly | 20:40 |
pabelanger | mordred: okay, cool. I can look into that in a moment | 20:41 |
jeblair | dmsimard: unless mordred can articulate a concrete split task, i think the most useful thing to work on would be to finish the overlay network role: https://review.openstack.org/#/c/435933/ | 20:42 |
pabelanger | jeblair: mordred: publish-openstack-python-branch-tarball complete! | 20:42 |
pabelanger | looking at twine stuff now | 20:42 |
dmsimard | jeblair: I'll have a look | 20:43 |
pabelanger | mordred: jeblair: https://review.openstack.org/496380/ could use a +3 to start publishing branch tarballs for zuul | 20:43 |
mordred | pabelanger, jeblair: could y'all look at https://review.openstack.org/#/c/496375/ real quick? - I've got a couple other jobs thatuse it | 20:44 |
pabelanger | looking | 20:44 |
pabelanger | mordred: So, +2, but I'd really like to bikeshed more about that maybe at PTG. I have some existing roles that do the same, and could be good to standardize them | 20:45 |
mordred | pabelanger: ++ | 20:45 |
pabelanger | mordred: okay, going to start looking at twine role for openstack-dev/sandbox | 20:46 |
pabelanger | start testing with testpypi | 20:46 |
mordred | pabelanger: you should just need a job and secret, right? | 20:46 |
pabelanger | mordred: and release pipeline for tagging | 20:47 |
mordred | pabelanger: like, add release-openstack-python to openstack-dev/sandbox to release pipeline | 20:47 |
mordred | cool | 20:47 |
pabelanger | ++ | 20:47 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Split ensuring tox is installed into a role https://review.openstack.org/496375 | 20:47 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul-jobs master: Role to copy the build ssh key to other users https://review.openstack.org/496413 | 20:51 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball post job https://review.openstack.org/496380 | 20:52 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add build-openstack-python-tarball to check and gate https://review.openstack.org/496416 | 20:53 |
mordred | pabelanger: that ^^ should exercise the build tarball job too | 20:54 |
*** dkranz has quit IRC | 20:54 | |
jlk | Finally taking my lunch break. Will review things after. | 20:55 |
pabelanger | mordred: good idea | 20:55 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Add known_hosts generation for multiple nodes https://review.openstack.org/494700 | 20:56 |
SpamapS | jlk: ^linters fixed | 20:56 |
pabelanger | cool, zuul-feature-zuulv3.whl http://tarballs.openstack.org/zuul/ | 20:57 |
mordred | pabelanger: woot | 20:57 |
pabelanger | mordred: so, we need to reboot whole server or just a service on ze01? | 20:58 |
mordred | pabelanger: just the service- the hostname is already correct | 20:58 |
mordred | pabelanger: the problem is that the last time ze01 started it thought its hostname was just ze01 - so that's what it's adding to the links it sends to the scheduler | 20:59 |
pabelanger | k | 21:00 |
mordred | jeblair, pabelanger: three more https://review.openstack.org/#/q/topic:upload-python+status:open and then the current set is in and landed and we'll just be missing pabelanger's upcoming testpypi job on sandbox | 21:10 |
pabelanger | +2 | 21:11 |
pabelanger | ze01 has also been restarted | 21:11 |
mordred | oh - and gpg signing | 21:11 |
pabelanger | Ya | 21:11 |
Shrews | So, if we were to add the finger gateway to zuul-web (where we have most of the code to handle that already), that would imply that for that to work, zuul-web would need to be accessible to the users (not firewalled off). And it would also have to be started as root (to grab the finger port) and drop privileges. Would those changes be acceptable? Or should we plan to have a separate executable for that? | 21:11 |
mordred | Shrews: we could probably just do a separate binary but use the same consume-remote-finger-stream code that's in zuul-web to avoid the root/drop-privs for zuul-web ... but also I do think we would prefer zuul-web to not be firewalled off from folks | 21:13 |
mordred | jeblair: ^^ thoughts? | 21:13 |
jeblair | mordred, Shrews: yeah that's the way i'm leaning too. | 21:13 |
Shrews | that was my inclination as well. i could probably pound out the code for that before ptg, if we want it | 21:14 |
clarkb | people have already built third party job tracking and info gathering tools around the existing simple zuul api. ++ to not firewalling it off | 21:14 |
jeblair | they are likely on different scale-out axes | 21:14 |
mordred | pabelanger: you want me to take gpg signing and you do the testpypi bits? also - we do not seem to be currently signing the tarballs we upload to tarballs.o.o | 21:14 |
pabelanger | mordred: sure. Ya, I think we should start signing them to tarballs.o.o too | 21:15 |
mordred | oh -nevermind - I see the sigs now | 21:15 |
jeblair | is there an ansible variable that roughly corresponds to 'name of current host' ? | 21:15 |
mordred | inventory_hostname | 21:16 |
jeblair | mordred: thanks | 21:16 |
SpamapS | Note that that specifically is the hostname you gave in inventory. It is not the one that `hostname` returns. | 21:18 |
SpamapS | ansible_hostname is that | 21:19 |
SpamapS | (if you've collected facts) | 21:19 |
jeblair | SpamapS: thanks -- the one in the inventory is the one i want (i'm putting the logs into directories named as in the inventory) | 21:20 |
jeblair | (i expect we'll need to make some changes to the log processor to account for that) | 21:21 |
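A small hedged example of the distinction drawn above, using `inventory_hostname` (the name from Zuul's inventory) for per-node log directories; the `log_root` variable and paths are assumptions:

```yaml
# Pull logs into a per-node directory named after the inventory entry,
# not the node's own `hostname` (which would be ansible_hostname).
- name: Collect logs per node
  synchronize:
    src: /var/log/
    dest: "{{ log_root }}/{{ inventory_hostname }}/"
    mode: pull
```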
pabelanger | jeblair: did you want to look at https://review.openstack.org/495973 adds upload-pypi role to our release-openstack-python | 21:23 |
jeblair | pabelanger: lgtm, +2 | 21:24 |
pabelanger | thanks | 21:24 |
jeblair | pabelanger, mordred, dmsimard: can you look at the diff between patchsets 3 and 4 on 496412 ? | 21:32 |
jeblair | the output from ps3 was http://logs.openstack.org/12/496412/3/check/devstack-lxc/b9e9cbc/job-output.txt.gz | 21:32 |
jeblair | which made me think i needed to add ".path" | 21:32 |
jeblair | i was patterning that off of https://review.openstack.org/495972 -- does it need a similar fix (assuming that is correct)? | 21:33 |
* dmsimard looks | 21:34 | |
jeblair | also, we've broken log uploads: 2017-08-22 21:34:32,212 DEBUG zuul.AnsibleJob: [build: 2b330c84dd04497bad8c5fa9bbb1855e] Ansible output: b"Can't mkdir /var/lib/zuul/builds/2b330c84dd04497bad8c5fa9bbb1855e/ansible/post_playbook_2/secrets: Read-only file system" | 21:35 |
pabelanger | Hmm, I did just restart ze01 | 21:36 |
jeblair | i'll turn on verbose and keep | 21:37 |
dmsimard | jeblair: added a comment in 496412 | 21:37 |
dmsimard | looking at 495972 | 21:38 |
dmsimard | yeah I'm not sure 495972 would work, it's passing off the entire dict | 21:40 |
jeblair | wow, even with -vvv there's nothing more than 2017-08-22 21:41:35,467 DEBUG zuul.AnsibleJob: [build: 018219a924dd47398896d3801258f923] Ansible output: b"Can't mkdir /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets: Read-only file system" | 21:41 |
jeblair | i have no idea why ansible is trying to mkdir there | 21:42 |
mordred | jeblair: I also do not understand why it's trying to mkdir there | 21:45 |
mordred | jeblair: that's the first time we restarted ze01 since landing the secrets naming patch - maybe there was an executor-side bug in that? | 21:46 |
mordred | (although that would certainly be weird) | 21:46 |
dmsimard | https://github.com/openstack-infra/zuul/commit/d6a71ca2b45b56d28f0518d67564f973805e0b34 landed recently enough | 21:47 |
jeblair | dmsimard: yeah, that's why it's failing to write there; though i don't know why it's attempting to write there | 21:47 |
jeblair | zuul-bwrap seems broken as well on ze01: Can't find source path None: No such file or directory | 21:50 |
jeblair | that doesn't happen for me locally | 21:50 |
mordred | jeblair: I cannot see anywhere in the new code that should cause that mkdir | 21:51 |
jeblair | mordred: well that's ansible trying to mkdir | 21:51 |
mordred | right | 21:51 |
jeblair | i really want to strace it, but i can't without zuul-bwrap | 21:52 |
jeblair | ah that's the auth sock | 21:58 |
mordred | ah. that makes sense why it works locally but not there :) | 21:59 |
jeblair | yeah, if i start ssh-agent it works | 22:00 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role to GPG sign artifacts in a directory https://review.openstack.org/496426 | 22:02 |
mordred | jeblair: (btw, I'm assuming you're looking at the ze01 issue so I'm not poking at anything, please let me know if you want another set of eyes) | 22:05 |
jeblair | will do | 22:06 |
dmsimard | I'm trying to look too, trying to figure out where it's from :( | 22:06 |
mordred | dmsimard: unfortunately this is an error you have to be root to see | 22:06 |
mordred | dmsimard: (when the logging system breaks...) | 22:07 |
dmsimard | mordred: oh, yeah, ofc .. but looking at the recent commits and keeping an eye out if I see something wrong | 22:09 |
dmsimard | it's also a way for me to familiarize with the code base :p | 22:09 |
mordred | dmsimard: ++ | 22:12 |
jeblair | it doesn't seem to behave the same way | 22:18 |
jeblair | it's actually trying to ssh | 22:18 |
jeblair | in production, it immediately errors | 22:18 |
mordred | jeblair: that's exceptionally yuck | 22:19 |
jeblair | i'm open to suggestions. | 22:20 |
jeblair | basically, ansible is emitting an error and i have no idea where it's coming from. | 22:21 |
dmsimard | trace AnsibleJob and go step by step? :/ | 22:21 |
dmsimard | as in, insert a pdb.set_trace | 22:21 |
mordred | jeblair: in the log, it looks like it's happening in /var/lib/zuul/builds/30ff4836f886475cbbbe5bd0d87ceca2/trusted/project_0/git.openstack.org/openstack-infra/project-config/playbooks/base/post-ssh.yaml right? | 22:21 |
jeblair | mordred: yep | 22:22 |
pabelanger | Yum, I think I see a security issue | 22:22 |
jeblair | dmsimard: the error is emitted by the ansible-playbook process | 22:22 |
pabelanger | should I talk here or PM? | 22:22 |
pabelanger | jeblair: mordred ^ | 22:23 |
jeblair | pabelanger: if it's exploitable, pm might be best. if it's "stop debug logging the gearman job arguments" we're probably okay to talk about it here. | 22:23 |
dmsimard | pabelanger: believe all hands are on fixing the broken executor right now :p | 22:23 |
mordred | jeblair: I'm poking a smidge | 22:24 |
pabelanger | okay, I just PM'd jeblair | 22:24 |
jeblair | pabelanger: you need to be authenticated to nickserv to pm me | 22:24 |
pabelanger | ah | 22:25 |
pabelanger | 1 sec | 22:25 |
*** pabelanger has quit IRC | 22:25 | |
*** pabelanger has joined #zuul | 22:25 | |
mordred | jeblair: look in /var/lib/zuul/builds/30ff4836f886475cbbbe5bd0d87ceca2/work/logs/job-output.txt | 22:25 |
jeblair | mordred: that's confusing | 22:26 |
mordred | jeblair: yah - that's not the same error I see in the zuul log | 22:26 |
jeblair | mordred: that's probably an error that happened before the error in the log | 22:28 |
jeblair | mordred: since that looks related to unittest post playbook rather than base job ssh post playbook | 22:28 |
jeblair | or post-logs or whatever it was | 22:28 |
mordred | yah | 22:28 |
pabelanger | k, just to confirm we are trying to debug the read-only files system error right? | 22:29 |
mordred | yah. which may or may not be a read-only file system error :) | 22:30 |
pabelanger | k, I see --tmpfs /var/lib/zuul/builds/8a68a873fc004d99936f5b35622cf349/ansible/post_playbook_2/secrets | 22:31 |
pabelanger | but I don't actually see us creating the secrets folder anyplace | 22:31 |
jeblair | i wonder if the error is coming from bwrap | 22:31 |
pabelanger | I think we need to ensure the folder exists first before having bwrap use it | 22:32 |
mordred | jeblair: oh - right. because "Ansible output" is just us reporting stdout | 22:32 |
mordred | from the execution | 22:32 |
jeblair | right, and the bwrap command would be included in that | 22:32 |
mordred | yah | 22:32 |
jeblair | so of course my zuul-bwrap command would be wrong | 22:32 |
jeblair | lemme retry that | 22:32 |
dmsimard | mordred: bingo, that's a bubblewrap exception https://github.com/projectatomic/bubblewrap/blob/5276f816eacc1dc655833d5603556ed64c9d48c5/bubblewrap.c#L978 | 22:33 |
dmsimard | not an ansible exception | 22:33 |
mordred | that definitely looks like that error | 22:33 |
jeblair | it's possible that the tests of the new tmpfs stuff only overlaid a secrets tmpfs over a rw bind, and this is the first time we've overlaid one on a ro bind | 22:33 |
mordred | so - bubblewrap will make a dir to mount a tmpfs on if the dir doesn't exist- but it can't here | 22:34 |
mordred | jeblair: ++ | 22:34 |
pabelanger | mordred: ya | 22:34 |
pabelanger | okay, going to look at gearman debug thing | 22:35 |
jeblair | hrm, i'm still not able to reproduce with zuul-bwrap --ro-paths=/var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1 --secret=/var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets/secrets.yaml=foo:bar /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ /bin/bash | 22:39 |
mordred | pabelanger: I'm gonna go ahead and add the test job to sandbox to do pypi release | 22:39 |
jeblair | that results in: --ro-bind /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1 /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1 --tmpfs /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets --file 5 /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets/secrets.yaml | 22:39 |
pabelanger | mordred: https://review.openstack.org/496421/ | 22:40 |
jeblair | which should be file on tmpfs on ro filesystem | 22:40 |
mordred | jeblair: root@ze01:~# ls /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets/ | 22:41 |
mordred | jeblair: shows nothing | 22:41 |
jeblair | mordred: right -- the content is only there inside the bwrap | 22:41 |
mordred | jeblair: is it possible /var/lib/zuul/builds/018219a924dd47398896d3801258f923/ansible/post_playbook_1/secrets/ got created in one of the previous invocations? | 22:41 |
mordred | like- maybe let's try deleting the secrets dir and making sure that the behavior is the same? | 22:42 |
mordred | pabelanger: oh neat | 22:43 |
jeblair | mordred: ah yeah, that may be experimental error | 22:43 |
jeblair | mordred: so we probably need to create the mountpoint in the executor | 22:44 |
mordred | jeblair: ++ | 22:45 |
pabelanger | ++ | 22:45 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Create secrets dir in bwrap https://review.openstack.org/496432 | 22:46 |
jeblair | mordred, dmsimard, pabelanger: ^ i'm running the full test suite locally on that since we may need to force-merge it | 22:47 |
mordred | jeblair: kk | 22:47 |
mordred | jeblair: I'm ok force-merging that | 22:47 |
pabelanger | jeblair: mordred: left quick question | 22:48 |
jeblair | py35: commands succeeded | 22:49 |
jeblair | congratulations :) | 22:49 |
mordred | jeblair: woot | 22:49 |
jeblair | it also passed pep8 | 22:49 |
jeblair | pabelanger: replied | 22:50 |
mordred | jeblair: ok. I'm gonna force-merge | 22:51 |
pabelanger | ++ | 22:51 |
jeblair | mordred: thanks | 22:51 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Create secrets dir in bwrap https://review.openstack.org/496432 | 22:52 |
mordred | pabelanger: ok - I +A'd the sandbox change | 22:57 |
mordred | jeblair: sandbox change uploaded logs - so that fixed it | 22:57 |
pabelanger | mordred: great! | 22:57 |
mordred | pabelanger: I have https://review.openstack.org/#/c/496428 and https://review.openstack.org/#/c/496426/ for your reading pleasure | 22:58 |
jeblair | mordred, pabelanger: did both of you see my conversation with dmsimard earlier about item.path? | 22:59 |
mordred | nope! | 22:59 |
* mordred goes to look | 23:00 | |
pabelanger | looking for it now | 23:00 |
jeblair | 21:32 < jeblair> pabelanger, mordred, dmsimard: can you look at the diff between patchsets 3 and 4 on 496412 ? | 23:00 |
jeblair | starts there ^ | 23:00 |
mordred | oh! oy | 23:01 |
jeblair | mordred, pabelanger: one of you on fixing that? | 23:02 |
mordred | I'm on it | 23:02 |
pabelanger | okay, still trying to understand :) | 23:03 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role to GPG sign artifacts in a directory https://review.openstack.org/496426 | 23:03 |
mordred | pabelanger: I made bad patches | 23:03 |
jeblair | pabelanger: the upload role needs to use item.path, not just item, because item is a dictionary when you use the results from find | 23:03 |
pabelanger | okay, ya. I see the issue now | 23:03 |
pabelanger | jeblair: yes, that is correct | 23:03 |
mordred | gah | 23:04 |
jeblair | or does it need to use item['path'] ? | 23:04 |
jeblair | dmsimard suggested that ^ | 23:04 |
jeblair | i thought jinja let us use either | 23:04 |
mordred | I think jinja does dict lookups with . - lemme test real quick | 23:04 |
pabelanger | playbooks/python/tarball-post.yaml has the right syntax for find | 23:04 |
pabelanger | {{ item.path }} is correct | 23:05 |
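To illustrate the pattern under discussion: Ansible's find module registers a result whose files key is a list of dictionaries, so a loop over it has to reference item.path (or item['path']), not item. A minimal sketch; the directory and variable names (dist_dir, artifacts) are illustrative, not the actual role's:

```yaml
- name: Find built artifacts (works whether only sdist or only bdist_wheel ran)
  find:
    paths: "{{ dist_dir }}"
    patterns: "*.tar.gz,*.whl"
  register: artifacts

- name: Upload each artifact with twine (assumes upload credentials are already configured)
  command: twine upload {{ item.path }}
  with_items: "{{ artifacts.files }}"
```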
mordred | let me not do that in that one patch real quick ... | 23:06 |
jeblair | mordred: when you're done with that, i have another thing -- it looks like the callback plugin is omitting some error information. see: http://logs.openstack.org/12/496412/4/check/devstack-lxc/b5e5437/job-output.txt.gz#_2017-08-22_22_58_02_595390 | 23:07 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Use item.path not item on results of find https://review.openstack.org/496436 | 23:07 |
jeblair | mordred: that failed because zuul doesn't have permission to write to the destination directory. i found the error by using ara [dmsimard will be happy to know :)] (where it was reported in the json). | 23:07 |
mordred | jeblair: cool - I'll add that to the list to investigate/fix | 23:08 |
pabelanger | mordred: twine upload *.tar.gz should work too right? | 23:08 |
jeblair | mordred: the error is also in the callback json | 23:08 |
mordred | jeblair: yah - I imagine it's generally an issue in item processing | 23:08 |
jeblair | mordred: shows up as stdout -- | 23:08 |
jeblair | "stdout": "mv: cannot move 'src/git.openstack.org/openstack-infra/devstack-gate' to '/opt/stack/devstack-gate': Permission denied", | 23:08 |
jeblair | pabelanger: are you working on the gearman job debug cleanups or should i do that? | 23:09 |
mordred | pabelanger: instead of item.path? yes - but I did the find so that the upload would work if you only did sdist and didn't do bdist_wheel | 23:09 |
mordred | pabelanger: (or vice versa) | 23:09 |
pabelanger | jeblair: it might be faster if you do it, but happy to do it if you tell me where I need to look | 23:09 |
jeblair | pabelanger: k, i'll do it | 23:10 |
pabelanger | mordred: ++ | 23:10 |
mordred | jeblair: if you get a sec, would you look at the gpg sign job to make sure the direction looks ok? | 23:10 |
pabelanger | mordred: k, once we land that, I'll try tagging 0.0.6 for openstack-dev/sandbox | 23:11 |
mordred | jeblair: https://review.openstack.org/#/c/496426/ and https://review.openstack.org/#/c/496428 | 23:11 |
mordred | jeblair: perhaps we should shred the keyring files after signing? | 23:11 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul feature/zuulv3: Stop logging executor job args https://review.openstack.org/496437 | 23:12 |
jeblair | mordred: probably. can you stick a TODO to switch those to using stdin when we upgrade to ansible 2.4 ? | 23:13 |
jeblair | mordred: though, tbh, i don't know if that's possible with gpg | 23:13 |
jeblair | maybe we should add a tmpfs to the jobdir for this use? | 23:14 |
jeblair | like work/tmpfs | 23:14 |
mordred | jeblair: ++ | 23:16 |
mordred | jeblair: I'll add a TODO to use such a thing when it exists | 23:17 |
pabelanger | isn't /tmp inside bwrap already tmpfs? I guess the concern is that it is 777 inside bwrap? | 23:17 |
jeblair | pabelanger: oh you're right. we could use /tmp | 23:18 |
mordred | either of you happen to know off the top of your head how to tell gpg to use a keyring that isn't in ~/.gnupg ? | 23:19 |
jeblair | i think the right thing to do would be to securely make a tmpdir in /tmp, register that as the gpg path, and then write things out there | 23:19 |
jeblair | mordred: i think it's in the system-config docs | 23:19 |
mordred | ya - I think you're right | 23:19 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Use item.path not item on results of find https://review.openstack.org/496436 | 23:19 |
mordred | ah - --homedir | 23:19 |
jeblair | yep! | 23:19 |
jeblair | mordred: what do you think of going ahead and doing the tmpdir thing ^? | 23:20 |
mordred | yup. on it | 23:20 |
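A minimal sketch of the temporary-homedir approach being floated here, using the Ansible tempfile module and gpg's --homedir flag; the signing_key_file and artifacts variables are assumptions, not the real role's names:

```yaml
- name: Create a private GnuPG homedir under /tmp (tmpfs inside the bwrap namespace)
  tempfile:
    state: directory
    prefix: gpg.
  register: gpg_home

- name: Import the signing key into the temporary homedir
  command: gpg --homedir {{ gpg_home.path }} --import {{ signing_key_file }}
  no_log: true

- name: Produce a detached, armored signature for each artifact
  command: gpg --homedir {{ gpg_home.path }} --armor --detach-sign {{ item.path }}
  with_items: "{{ artifacts.files }}"
```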
pabelanger | mordred: do we want to try a twine upload without gpg first? | 23:20 |
jeblair | cool; i'll swing back around tomorrow to do that for ssh-add too, since we can do it without waiting for ansible 2.4 | 23:20 |
jeblair | we'll make fungi happy :) | 23:20 |
pabelanger | jeblair: do you have gpgsign = true in your .gitconfig file? | 23:22 |
pabelanger | reason I ask: if I have it enabled, zuul unit tests fail because it will use ~/.gitconfig and not have access to my ssh-agent session | 23:22 |
jeblair | pabelanger: i don't | 23:23 |
pabelanger | thanks | 23:24 |
* fungi is always happy, but that's cool! | 23:24 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add role to GPG sign artifacts in a directory https://review.openstack.org/496426 | 23:24 |
mordred | jeblair, pabelanger, fungi ^^ how's that? | 23:24 |
jeblair | mordred: dig it | 23:26 |
mordred | woot | 23:26 |
mordred | jeblair: while I've got you: https://review.openstack.org/#/c/496390/ and https://review.openstack.org/#/c/496416 | 23:26 |
pabelanger | mordred: when would we delete the tempfile, on bwrap teardown? | 23:28 |
mordred | yah | 23:28 |
mordred | which happens as soon as the playbook exits | 23:29 |
mordred | pabelanger: do you think we should delete the dir anyway? | 23:29 |
jeblair | mordred: doesn't the pep8 job have the side effect of testing tarball builds? | 23:29 |
mordred | jeblair: not if skip-sdist is true | 23:30 |
jeblair | :( | 23:30 |
mordred | jeblair: (I'm not sure we actually need that gate job in zuul - that's more to verify that the job itself works) | 23:30 |
pabelanger | mordred: maybe, just to be extra paranoid | 23:31 |
pabelanger | +2 not to block | 23:31 |
jeblair | mordred: yeah, but the reason for the job is to add that to every project in openstack, yeah? that will be a bunch of new jobs. | 23:31 |
*** robled has quit IRC | 23:31 | |
fungi | mordred: 496426 patchset 3 using a tempfile causes it to end up on a tmpfs in our default deployment, per the discussion last week? | 23:31 |
*** robled has joined #zuul | 23:32 | |
pabelanger | mordred: going to try twine sans gpg key. Just to keep parallel testing | 23:32 |
mordred | jeblair: oh - no - the reason for the new job is just to make sure there was a job related to tarball building that didn't necessarily upload it - we can also land neither of those | 23:32 |
jeblair | mordred: i'm fine landing both then removing from zuul | 23:33 |
openstackgerrit | Jesse Keating proposed openstack-infra/zuul-jobs master: Role to copy the build ssh key to other users https://review.openstack.org/496413 | 23:33 |
jeblair | mordred: though i have a -1 on the first | 23:33 |
mordred | jeblair: cool | 23:34 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Delete keyring dir when we're done https://review.openstack.org/496440 | 23:34 |
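For reference, the cleanup itself is a one-task affair; a sketch assuming the homedir was registered as gpg_home as in the earlier example (bwrap teardown would discard it anyway, so this is belt and braces):

```yaml
- name: Remove the temporary GnuPG homedir once signing is done
  file:
    path: "{{ gpg_home.path }}"
    state: absent
```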
mordred | jeblair: I don't think we need to land then remove from zuul - we got to see the job work properly via Depends-On | 23:36 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add role to GPG sign artifacts in a directory https://review.openstack.org/496426 | 23:37 |
jeblair | mordred: okay, you may want to abandon the second one then since it's been approved | 23:37 |
mordred | yup | 23:37 |
mordred | sweet. mr. jlk - wanna nudge https://review.openstack.org/#/c/496390 in ? | 23:38 |
pabelanger | okay, zuulv3 saw the tag of sandbox | 23:39 |
pabelanger | but didn't run the job as expected | 23:39 |
mordred | wompwomp | 23:40 |
* jlk looks | 23:41 | |
SpamapS | jlk: oh neat, I never knew about block: | 23:42 |
jeblair | mordred: when you have a minute, the series starting at https://review.openstack.org/496330 can start going in | 23:42 |
jlk | it's a 2.x thing | 23:42 |
jeblair | zuul v2.5 uses it extensively | 23:43 |
pabelanger | 2017-08-22 23:35:11,259 DEBUG zuul.IndependentPipelineManager: No jobs for change <Tag 0x7f8a0b2bb320 creates refs/tags/0.0.6 on 32e8ddc3ed4326bdb3406ddb20a07fd4d49ef733> | 23:43 |
jlk | huh | 23:43 |
jeblair | (in the crazy autogenerated playbook) | 23:43 |
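For anyone else meeting block: for the first time, it groups tasks and adds optional rescue/always sections, roughly try/except/finally; a minimal sketch with placeholder tasks:

```yaml
- block:
    - name: Something that might fail
      command: /bin/false
  rescue:
    - name: Runs only if a task in the block failed
      debug:
        msg: "recovering"
  always:
    - name: Runs whether the block succeeded or failed
      debug:
        msg: "cleaning up"
```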
pabelanger | jeblair: do you mind looking at zuul debug.log on zuulv3.o.o | 23:43 |
jeblair | pabelanger: on it | 23:44 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Stop logging executor job args https://review.openstack.org/496437 | 23:44 |
jlk | I'm not core in openstack-infra/openstack-zuul-jobs | 23:44 |
mordred | jeblair: this: https://review.openstack.org/#/c/496412/6/roles/setup-devstack-source-dirs/tasks/main.yaml is SO MUCH SIMPLER than my brain was making it | 23:44 |
jeblair | mordred: it needs to be a teensy bit more complicated for grenade later on | 23:45 |
pabelanger | Oh | 23:45 |
pabelanger | I see it now | 23:45 |
jeblair | jlk: i bet we can change that; wanna be? | 23:45 |
jlk | Sure, I know a few ansible things :D | 23:45 |
pabelanger | sorry | 23:45 |
jeblair | pabelanger: ok. standing down. :) | 23:45 |
pabelanger | no, I got too excited | 23:45 |
pabelanger | jeblair: please look again | 23:45 |
jeblair | okay, standing up | 23:45 |
pabelanger | confused it with post pipeline | 23:45 |
pabelanger | +3 on 496390 | 23:47 |
mordred | jeblair: yah - but ... it's still super nice and understandable | 23:47 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Delete keyring dir when we're done https://review.openstack.org/496440 | 23:47 |
jlk | what's this "urldefense.proofpoint.com" stuff? | 23:48 |
jeblair | jlk: context? | 23:48 |
jlk | I'm seeing it in my emails from zuul | 23:48 |
jlk | - openstack-doc-build https://urldefense.proofpoint.com/v2/url?u=http-3A__logs.openstack.org_13_496413_2_check_openstack-2Ddoc-2Dbuild_20d1e75_html_&d=DwIDaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=2-RoMLt9561c-QFVUQD1hdw_pcUzCKx9dE1pPx90bnc&m=61V_3686Rn5fSpYrn6qOzto5KT43gaoagNQJ2ijAnlM&s=j6sE801e2k4GgU7Km_OHa305xaCSAI6OOlnluI6Lheg&e= : SUCCESS in 2m 03s | 23:48 |
pabelanger | o.0 | 23:49 |
jlk | I'm not sure if that's something our zuul has done, or if IBM is fucking with my email. | 23:49 |
mordred | I believe someone is intercepting and rewriting that email | 23:49 |
mordred | so I'm gonna go with IBM | 23:49 |
jlk | le sigh | 23:49 |
jeblair | jlk: that'll be your mailserver's spam filter altering your email | 23:49 |
jlk | good thing it's not gpg signed... | 23:49 |
mordred | "Proofpoint Delivers on Gartner's | 23:49 |
mordred | Secure Email Gateway Recommendations | 23:49 |
mordred | " | 23:49 |
mordred | 1) break the ability to gpg sign and verify emails by manipulating them in-flight | 23:50 |
mordred | ok. I'm going to go eat the foods | 23:51 |
mordred | jeblair: oh - interesting ... | 23:52 |
mordred | jeblair: the error from zuul: https://review.openstack.org/#/c/496428 | 23:52 |
mordred | jeblair: so - I think maybe the way to deal with that is just to encrypt the pubring in the secret | 23:53 |
jeblair | mordred: two other choices: b64 encode it, or add support for binary. we'll need to b64encode it internally in zuul to get it into json to go over the wire, but we should be able to swing that. | 23:55 |
jeblair | mordred: i do agree that your idea gets us moving the quickest. | 23:55 |
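One way the base64 option could look on the consuming side, assuming the keyring is stored base64-encoded inside the encrypted secret; the pubring_b64 and gpg_home names are illustrative:

```yaml
- name: Decode the base64-encoded keyring back into a binary file
  shell: echo "{{ pubring_b64 }}" | base64 -d > {{ gpg_home.path }}/pubring.gpg
  no_log: true
```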
jeblair | pabelanger: i'm thinking that the implied branch matcher that came along with adding the job in the .zuul.yaml file in the repo is keeping this from working. | 23:56 |
pabelanger | jeblair: okay | 23:57 |
jeblair | pabelanger: the easiest way to get things moving would be to add the publish job in project-config. other solutions will take some thought. | 23:57 |
pabelanger | jeblair: k, I'll do that then | 23:57 |
pabelanger | thank you | 23:57 |
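Roughly what that workaround amounts to: a project-pipeline entry added via an in-repo .zuul.yaml picks up an implied branch matcher for the branch the config lives on, which apparently keeps a tag ref from matching, while config projects such as project-config add no implied matcher. A hypothetical sketch of the project-config side; the pipeline and job names are assumptions:

```yaml
# In project-config, not in openstack-dev/sandbox's own .zuul.yaml:
- project:
    name: openstack-dev/sandbox
    release:
      jobs:
        - publish-sandbox-tarball
```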
*** SpamapS has quit IRC | 23:59 | |
*** SpamapS has joined #zuul | 23:59 | |
mordred | jeblair: oh! fascinating (implied branch matcher) | 23:59 |