mordred | pabelanger: it's like we're all grown up :) | 00:12 |
pabelanger | IKR | 00:12 |
fungi | mordred: belated "yeseroo" on the moving zuul into the now correct group question | 00:26 |
fungi | also, yay! | 00:27 |
*** smyers has quit IRC | 05:22 | |
*** smyers has joined #zuul | 05:23 | |
*** isaacb has joined #zuul | 07:19 | |
*** isaacb has quit IRC | 08:14 | |
*** amoralej|off is now known as amoralej | 08:25 | |
*** isaacb has joined #zuul | 08:27 | |
*** electrofelix has joined #zuul | 09:09 | |
*** jkilpatr has quit IRC | 10:54 | |
*** isaacb has quit IRC | 11:09 | |
*** isaacb has joined #zuul | 11:10 | |
*** jkilpatr has joined #zuul | 11:29 | |
electrofelix | For zuul 2.5.2, noticed that the filter for zuul is split on a ',' however GitHub changes are reported back as '<PR>,<SHA>' but this can match gerrit patchset numbers for new github projects or gerrit change ids for older github projects | 12:19 |
electrofelix | meaning it's difficult to create a status_url to take you to only one change | 12:20 |
electrofelix | and when I looked to add the similar status filter for Gerrit changes I can only seem to find '<Change>,<Patchset>', in both cases this can result in matching multiple items because there doesn't appear to be a way to perform 'AND' matching, it defaults to 'OR' | 12:20 |
electrofelix | Is this something that will be changed for zuulv3? what direction is being considered? | 12:32 |
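The matching behaviour electrofelix describes can be sketched in a few lines of Python. This is a hypothetical miniature of the status-page filter, not the actual jquery code: the query string is split on ',' and each token is matched independently (OR), so a query meant to select one change/patchset pair matches far more items.

```python
# Hypothetical miniature of the status-page filter, not the real code.
# Items mimic gerrit "change,patchset" and github "PR,sha" identifiers.
items = ["1,2", "1,3", "12,1", "1,abc2f"]

query = "1,2"  # intent: change 1, patchset 2 only

# The filter splits on ',' and ORs the tokens, so every item containing
# "1" (or "2") anywhere in its id matches.
matches = [item for item in items if any(tok in item for tok in query.split(","))]
print(matches)  # → all four items, not just "1,2"
```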
dmsimard|off | mordred: fyi https://review.openstack.org/#/c/487853/ | 12:33 |
mordred | electrofelix: I believe it all works for v3 already - check out https://github.com/gtest-org/ansible/pull/1 | 12:38 |
mordred | electrofelix: (that's openstack's zuulv3 working with that gh repo) | 12:39 |
mordred | dmsimard|off: woot - I'll dust that off | 12:39 |
fungi | zuulv3.o.o has broken... | 12:41 |
fungi | db socket issues | 12:42 |
fungi | maybe it's timing out the connection due to inactivity? | 12:42 |
fungi | http://paste.openstack.org/show/618522 | 12:43 |
electrofelix | mordred: the code in the filter still performs a split on ',' so you can't populate the filter string with something like 1,2 to pick up the second patchset of the first change | 12:45 |
fungi | i'm manually installing mysql-client on zuulv3.o.o for now to test connectivity to the trove instance | 12:46 |
mordred | fungi: kk \o/ | 12:46 |
fungi | the 'connection "mysql"'.dburi in /etc/zuul/zuul.conf gets me working access to the trove instance from zuulv3.o.o fwiw | 12:47 |
mordred | electrofelix: ah - then that sounds like a new issue | 12:47 |
fungi | i'll check the db configuration there and see if it's missing our sane timeout default overrides | 12:48 |
mordred | fungi: oh goodie. my expectation is that the dbapi connection would reconnect | 12:48 |
mordred | fungi: oh - it probably is | 12:48 |
fungi | confirmed, still has the default configuration | 12:51 |
fungi | and we don't seem to have any alternative configurations built in dfw yet applicable to mysql 5.7 instances | 12:51 |
fungi | i'll put one together based on our mysql 5.6 "sanity" config | 12:52 |
electrofelix | mordred: What I have noticed is I can search on 1_2 instead, so I might tweak what we are running locally to change the status_url that is posted back from being 1,2 for gerrit or 1,<sha> for github to being 1_2 and 1_<sha> respectively | 12:53 |
mordred | fungi: thanks! and sorry about that | 12:54 |
*** dkranz has joined #zuul | 12:56 | |
fungi | configuration created, applied, and instance restarting now | 12:56 |
mordred | electrofelix: oh - you mean the /status/{change} zuul page | 12:56 |
electrofelix | mordred: yeap, where the jquery code allows filtering of what is shown | 12:56 |
mordred | electrofelix: I had misunderstood the nature of your question | 12:57 |
mordred | electrofelix: cool | 12:58 |
electrofelix | mordred: however even using the panel_change which uses 'id', a gerrit change matched on 1_2 will also match a github PR 1 if the sha1 starts with a '2', so it's not without some limitations | 13:01 |
electrofelix | so using the status page to filter for a single change is a bit of a hack at the moment | 13:02 |
SpamapS | should probably add source name | 13:03 |
mordred | electrofelix: indeed. I think we should probably make it a richer api somewhere - like change=1,patchset=2 or something (or /status/{change} and /status/{change}/patchset/{patchset} ) | 13:03 |
*** isaacb has quit IRC | 13:04 | |
SpamapS | if it were github:1_2 or openstackgerrit:1_2 it would be unambiguous | 13:04 |
SpamapS | or / or whatever | 13:04 |
*** isaacb has joined #zuul | 13:04 | |
electrofelix | I'll need to test the /status/{change} zuulv3 page just to see if it behaves the same as the filtering in zuulv2 but it did look like it might work the same from the jquery code | 13:05 |
fungi | has anybody looked into the frequent job restarts for the zuulv3 server yet? i just saw a doc-build job in the gate pipeline complete, then restart | 13:06 |
fungi | and saw some changes repeatedly failing yesterday on retry limit | 13:06 |
electrofelix | mordred: Should there be a way to filter on combination of {project} & {change}? as that should be unique while {change} & {patchset} might not be if you are using multiple sources | 13:07 |
mordred | fungi: each time I've looked it's been an issue fetching from a mirror | 13:09 |
mordred | fungi: I wonder if we're missing a retry loop in one of our base jobs - I'll look in to that right now | 13:10 |
mordred | electrofelix: yes, I think so. also, there's work started on a richer dashboard too - we've been deferring it a bit as we get the other bits out the door - but in short, yes, all such things should be possible | 13:11 |
fungi | i'm poking around in documentation trying to see what we should be doing with sqla to self-heal on broken pipe exceptions for pymysql, though i'm a bit out of my depth there | 13:11 |
SpamapS | broken pipe? | 13:12 |
fungi | SpamapS: suspect this is a closed socket for an inactivity timeout (the default server-side wait_timeout in rackspace's trove instances is something absurdly short, like on the order of minutes) | 13:13 |
fungi | SpamapS: tracebacks at http://paste.openstack.org/show/618522/ | 13:14 |
fungi | we override it back to the mysql upstream value normally, but that's a manual process to apply a nonstandard configuration to the instance, and it had been overlooked for the one created to serve as the target for the zuulv3.o.o mysql reporter | 13:15 |
fungi | i see some recommendations of using http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html#sqlalchemy.create_engine.params.pool_recycle to avoid that | 13:17 |
fungi | though it seems like a bit of a workaround (isn't it possible to automatically reestablish the query socket and retry on a failure of that sort?) | 13:17 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Retry updating apt-cache https://review.openstack.org/494204 | 13:21 |
mordred | fungi: ^^ | 13:21 |
mordred | fungi: it is - but I actually think we should use that setting and set it to 1 | 13:22 |
mordred | fungi: mysql connections are not heavyweight like postgres or oracle connections - so most of the idea of a "connection pool" when it's used to keep an established connection around is an anti-pattern for mysql | 13:23 |
fungi | right, that's what i thought too | 13:23 |
mordred | fungi: connection pools to limit the number of active concurrent connections are good of course :) | 13:23 |
fungi | ahhhh... got it | 13:24 |
SpamapS | fungi: I'd personally go with a longer wait_timeout. MySQL connections are pretty cheap to keep idle. | 13:35 |
SpamapS | Especially in this instance. | 13:35 |
mordred | heh. we have the exact opposite viewpoint on this | 13:35 |
mordred | my recommended best practice is wait_timeout=0 | 13:36 |
mordred | and never keeping an idle connection | 13:36 |
fungi | SpamapS: yeah, i mean we do the longer wait_timeout normally. i'm more concerned about downstream users of zuul who may have shorter timeouts thrust upon them and making sure the server is robust in the face of such situations | 13:36 |
fungi | also i suspect there are other sorts of network disruptions which could cause similar symptoms, not just timeouts to idle sockets | 13:37 |
mordred | yup. this is amongst the reasons I recommend always reconnecting - the mysql protocol doesn't have any sort of heartbeat or keepalive - so with long-lived idle connections you're always at risk, at the start of an action, that your connection doesn't work anymore - sometimes due to switches or routers dropping connections in between | 13:38 |
mordred | to protect against that, it's common to issue a statement like select 1; before doing any real work | 13:38 |
mordred | of course, as soon as you do that, you just did a network round trip and completely negated literally every amount of value in a long-lived connection | 13:38 |
mordred | that being that a long-lived connection saves exactly one network roundtrip which is the cost of establishing a connection | 13:39 |
fungi | i thought one of the libs (maybe it wasn't pymysqlclient?) had a built-in query ping which would transparently reconnect the socket on failure | 13:39 |
mordred | right. that's what I'm saying -that's crazypants | 13:39 |
mordred | it's a total waste of energy - establishing a pool of long-lived connections and then issuing a query ping to trigger a reconnect is complexity that provides no value | 13:40 |
fungi | does seem a little hacky, i have to agree on that point | 13:40 |
SpamapS | mordred: In high scale I'm with you. This is not that. | 13:40 |
mordred | it makes TOTAL sense if you're using oracle, postgres or mssql where the connection is heavy weight | 13:41 |
mordred | SpamapS: I'm saying disconnecting is EASIER | 13:41 |
mordred | it's less work for small scale, and it works more efficiently at large scale | 13:41 |
SpamapS | and most libraries do this automatically. Surprised to even see it's a problem. | 13:41 |
mordred | if you just set pool_recycle=1 in sqlalchemy, there is no more tuning or work that is needed | 13:41 |
SpamapS | Ah so there's just magic sauce that is turned off. Why is that turned off??? | 13:42 |
SpamapS | :-P | 13:42 |
mordred | sqlalchemy defaults to -1 which is "never recycle" - because sqlalchemy defaults to a pg-centric worldview | 13:42 |
mordred | and you have to tell sqlalchemy that you'd prefer it to behave appropriately for mysql | 13:42 |
SpamapS | ahhhhhhhhhh | 13:42 |
fungi | so, easy fix sounds like | 13:43 |
SpamapS | I figured this was deeper. :-P | 13:43 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 13:43 |
mordred | thereyago | 13:43 |
fungi | thanks! | 13:44 |
fungi | i would not have found my way there on my own | 13:44 |
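The fix discussed above is a single keyword argument to SQLAlchemy's `create_engine` (`pool_recycle`, per the docs fungi linked). The behaviour itself, discarding and re-opening any connection older than N seconds so a server-side `wait_timeout` never hands you a dead socket, can be sketched with nothing but the stdlib; the class and method names here are made up for illustration:

```python
import sqlite3
import time

class RecyclingConnection:
    """Toy sketch of SQLAlchemy's pool_recycle idea (names invented):
    re-open any connection older than max_age seconds before using it."""

    def __init__(self, dsn, max_age=1.0):
        self.dsn = dsn
        self.max_age = max_age
        self._conn = None
        self._opened_at = 0.0

    def execute(self, sql):
        # Recycle: reconnect if the connection is absent or stale, so a
        # server-side idle timeout never surfaces as a broken pipe here.
        if self._conn is None or time.monotonic() - self._opened_at >= self.max_age:
            if self._conn is not None:
                self._conn.close()
            self._conn = sqlite3.connect(self.dsn)
            self._opened_at = time.monotonic()
        return self._conn.execute(sql).fetchall()

db = RecyclingConnection(":memory:")
print(db.execute("SELECT 1"))  # → [(1,)]
```

With pool_recycle=1 the reconnect costs one extra round trip at most once per second, which is the trade-off mordred argues is cheap for mysql.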
mordred | fungi, SpamapS: we could also go the other route and do pool_pre_ping=True | 13:51 |
fungi | mordred: ahh, right, that's the thing i was thinking of | 13:52 |
fungi | that's what i was looking into in an attempt to fix similar issues we were encountering (and sometimes still do) with lodgeit/paste.o.o | 13:52 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 13:55 |
mordred | the docs seem to indicate that pool defaults to None | 13:56 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Recycle stale SQL connections https://review.openstack.org/494210 | 14:44 |
jeblair | fungi: (moving -infra conversation here) | 14:46 |
jeblair | it looks like the jobdir playbook directory only contains links to repos which are in other directories, and any secrets.yaml files needed for that playbook | 14:47 |
jeblair | (filesystem symbolic links i mean) | 14:47 |
jeblair | so from a space usage pov, it's entirely possible for that to be a tmpfs | 14:48 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 14:49 |
jeblair | we may be able to use the --tmpfs, --symlink, and --file bwrap options to create that setup. | 14:50 |
mordred | jeblair: neat | 14:52 |
jeblair | fungi: and apparently ssh-add will read a key on stdin (though that doesn't appear to be documented in the man page at least, though i have experimentally verified it). using that, we may be able to avoid having the key touch the disk in the job as well. | 14:52 |
fungi | jeblair: that sounds even better still, agreed! | 14:54 |
mordred | jeblair, pabelanger: so - question in my mind from those other two patches - if a secret could be requested for a job by an alternate name - then in theory someone could do "post: - publish-openstack-artifacts: secrets: - my-local-project-secret: name: fileserver" - and publish artifacts to their own fileserver using our trusted job | 14:54 |
jeblair | (that last bit depends on some hand-wavey ansible) | 14:54 |
mordred | jeblair: oh awesome - that was my first thought re: needing to write the file to disk | 14:55 |
fungi | shred is an imperfect belt-and-braces effort to work around those scenarios, so definitely not preferable if there are more elegant alternatives like cache-only filesystems or fifos | 14:55 |
jeblair | mordred: yes! i think you may have found a really good reason *not* to bundle the hostname with the secret (unless that's the intention). | 14:55 |
fungi | granted, those still expose a similar risk if the server doesn't employ encrypted swap | 14:56 |
fungi | oh, though tmpfs normally shouldn't get swapped out i suppose | 14:56 |
fungi | but it's of course possible for the copies in application memory allocations to be paged out still | 14:56 |
fungi | so, you know, no security solution is perfect, defense in depth, et cetera | 14:57 |
fungi | the tmpfs and/or pipe solutions seem much better than relying on shred at least | 14:57 |
pabelanger | mordred: jeblair: roles lgtm | 14:59 |
jeblair | mordred: oh -- we can also set 'final: true' to prohibit that. | 14:59 |
mordred | I'm saying the opposite thing | 14:59 |
mordred | what i'm saying is that the roles are general and describe an operation that's potentially safely re-usable | 14:59 |
jeblair | mordred: to summarise -- if we want a tarball upload job to be reusable, then putting the connection information in secrets or job variables is good. | 14:59 |
mordred | yes! | 15:00 |
jeblair | mordred: if we don't want it to be reusable, then hard-coding the connection info and/or setting final:true is good. | 15:00 |
mordred | yes | 15:00 |
jeblair | mordred: yeah, so if we want this to be *fully* generalizable, then we need to make an "api" for the job so that folks know whether they should supply the connection information in secrets or job variables. either will work. | 15:01 |
mordred | the missing link with the secret is that the playbook has to refer to the secret by the secret's name - we can't hand secret a to job b to show up as variable name c | 15:01 |
jeblair | mordred: ah, i missed that first time through, but yes. | 15:01 |
mordred | since the user would need to re-use the job and its playbook itself | 15:01 |
jeblair | mordred: i think we can do the usual scalar-or-dict thing with secrets to add an optional name field. | 15:02 |
mordred | jeblair: I think this works fine for us for now - mostly raising that if we could request a secret and set the name by which it's exposed in a job, that might be a nice feature add | 15:02 |
mordred | jeblair: ++ | 15:02 |
pabelanger | mordred: I'm pretty sure we could just make https://review.openstack.org/#/c/494230/1/roles/publish-artifacts-to-fileserver/tasks/main.yaml its own role too, we'd basically need to do the same for logs.o.o at some point | 15:03 |
mordred | I mean, I'm not sure who exactly wants openstack zuul to build an artifact that they rsync over ssh to an external server - but there's remarkably little to prevent that :) | 15:03 |
pabelanger | write private key, ssh-add, shred private key, add known hosts | 15:03 |
pabelanger | going to be pretty common | 15:03 |
mordred | pabelanger: gah - I got those two backwards in that patch | 15:04 |
mordred | pabelanger: one sec ... | 15:04 |
pabelanger | Oh wait | 15:04 |
pabelanger | it already is a role | 15:04 |
pabelanger | sorry, just noticed | 15:04 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 15:05 |
mordred | pabelanger: yup - I just did the copy/paste/delete backwards - your thought was my thought too :) | 15:05 |
fungi | 489691 got me thinking... what is the current behavior if someone does attempt to (either accidentally or intentionally) define a job under a name which is already in use by another repo? does zuul refuse to test that change? fail? undefined behavior still? | 15:13 |
mordred | fungi: fails with a syntax error, iirc | 15:14 |
fungi | okay, good | 15:16 |
fungi | i assumed someone had already thought of that as a potential avenue for confusion or abuse | 15:17 |
mordred | fungi: yah- we still need to tell folks to be friendly and prefix their local jobs with their project name or something like that | 15:28 |
mordred | fungi: and then see how far 'be nice to your neighbors' gets us | 15:28 |
*** yolanda has quit IRC | 15:44 | |
*** yolanda has joined #zuul | 15:44 | |
*** isaacb has quit IRC | 15:52 | |
pabelanger | First job using zuulv3 secrets: http://logs.openstack.org/89/489689/15/check/publish-openstack-python-branch-tarball/45a2698/ | 15:54 |
pabelanger | \o/ | 15:54 |
mordred | pabelanger: that's super awesome! | 16:06 |
pabelanger | I'm going to move to testpypi.python.org credentials now | 16:06 |
mordred | pabelanger: I verified that no secrets were emitted into any files on the log server | 16:07 |
mordred | dmsimard|off: in ara - what do you do about information that is otherwise protected by things like no-log - do you also skip writing that data to your database? | 16:07 |
dmsimard|off | mordred: it's handled by Ansible's callback hook thing, _dump_results I think ? | 16:08 |
mordred | dmsimard|off: ok - so you're taking advantage of that, which means the data you're storing is pre-stripped | 16:09 |
dmsimard|off | mordred: the other perhaps relevant thing is related to the ansible-playbook arguments (the "parameters" tab in the UI) which, for example, doesn't save extra-vars by default but can also be tweaked to ignore (or not) other arguments | 16:09 |
dmsimard|off | mordred: right, the data in the database is pre-stripped | 16:09 |
mordred | dmsimard|off: ok - cool. the things we pass in as extra_vars are, in fact, things we would like to be ignored - so that behavior is good for us | 16:10 |
dmsimard|off | so, back to arguments, say you do something like -e password=somethingsecret that won't be saved by default | 16:10 |
dmsimard|off | but you can remove that config to allow extra vars to be saved | 16:10 |
mordred | dmsimard|off: -e@secrets.yaml will work the same way | 16:10 |
mordred | dmsimard|off: oh golly no - we want those to not be stored, so that's perfect :) | 16:11 |
dmsimard|off | blog post about 1.0 highlights will be up today btw, will let you know | 16:12 |
mordred | pabelanger, jeblair, SpamapS: https://review.openstack.org/#/c/487853 is good to go now (as is https://review.openstack.org/#/c/488214) | 16:14 |
mordred | dmsimard|off: oh - I meant to ask you a question about your db design when you tweeted it - can you re-share that link? | 16:18 |
dmsimard|off | mordred: that graph is no longer relevant, I took a sledgehammer to the model already :) | 16:25 |
mordred | dmsimard|off: awesome. and I was just looking at your models.py file anyway | 16:25 |
dmsimard|off | mordred: still happy to get input about the current state though | 16:26 |
dmsimard|off | mordred: make sure you look at feature/1.0 branch | 16:26 |
mordred | dmsimard|off: the concern I had from your graph was a key-value table off on the side that I didn't understand - but the one you have in there now makes more sense to me | 16:26 |
mordred | ah- lemme look at that | 16:26 |
dmsimard|off | mordred: I can also get an updated relationship graph up if need be :) | 16:27 |
*** isaacb has joined #zuul | 16:27 | |
mordred | k. same thing - Record was confusing on the graph - but your comment makes it make sense | 16:27 |
dmsimard|off | mordred: there's two less tables, some fields that have been taken out, some new relations, etc. | 16:27 |
mordred | ara_record allows someone to record arbitrary information - so I agree a k/v table seems fine | 16:27 |
mordred | (I always get worried when I see a k/v table in a rdbms so wanted to double-check) | 16:28 |
dmsimard|off | mordred: yeah, it's just a generic way to attach arbitrary data to a playbook report -- see the record tab in the UI here for examples: http://logs.openstack.org/70/494070/2/check/gate-ara-integration-py27-latest-centos-7/2849bbb/logs/build-playbook/ | 16:29 |
dmsimard|off | People asked for ARA to save many things (git versions, or whatever) I didn't want to include by default so I gave them the means to save whatever they want :) | 16:30 |
mordred | dmsimard|off: ++ | 16:38 |
mordred | jeblair: my patch needed for https://review.openstack.org/#/c/492557 merged and upstream cut a new release with it | 16:41 |
mordred | jeblair: I just hit recheck on it so we can verify, but it's ready for review now - pabelanger, SpamapS ^^ | 16:41 |
pabelanger | mordred: one day for the depends-on in patch :) | 16:43 |
jeblair | ze02 is online and running jobs | 16:46 |
pabelanger | Woah, nice | 16:47 |
SpamapS | neat! | 17:02 |
*** isaacb has quit IRC | 17:07 | |
mordred | pabelanger: right? | 17:13 |
mordred | jeblair: yay! | 17:13 |
dmsimard|off | mordred: there ya go http://rdo.dmsimard.com:1313/2017/08/16/ara-1.0-whats-coming/ | 17:23 |
dmsimard|off | er | 17:23 |
dmsimard|off | wrong paste :) https://dmsimard.com/2017/08/16/whats-coming-in-ara-1.0/ | 17:23 |
*** bhavik1 has joined #zuul | 17:38 | |
mordred | jeblair: in our zuul-jobs doc building, is there any way to cross-reference a zuul glossary concept? | 17:43 |
*** electrofelix has quit IRC | 18:00 | |
SpamapS | looks like jobs are a bit stuck | 18:02 |
SpamapS | http://zuulv3.openstack.org/static/stream.html?uuid=4485757750cb48d1b7b4c0acf2d60d41&logfile=console.log | 18:02 |
SpamapS | 47 minutes at that point | 18:02 |
SpamapS | we could maybe drop the tox-py35 timeout for zuul since it's usually done in < 10min | 18:03 |
SpamapS | so... | 18:07 |
SpamapS | is there anything i can help with? | 18:07 |
* SpamapS is just poking at BonnyCI's zuulv3 migration but otherwise not really zuulv3-ing much | 18:08 | |
pabelanger | we're having networking issues in infracloud, so could be related to long runtimes | 18:09 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add sphinx-autodoc-typehits sphinx extension https://review.openstack.org/492557 | 18:09 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Collect logging information into ara callback https://review.openstack.org/487853 | 18:09 |
pabelanger | or maybe just slow compute node | 18:09 |
jeblair | mordred: yes! https://docs.openstack.org/infra/zuul/feature/zuulv3/developer/docs.html#terminology | 18:10 |
jeblair | (i get a point everytime i can answer a question with a docs link, right?) | 18:11 |
SpamapS | pabelanger: former seems more likely. | 18:11 |
jeblair | pabelanger: internap is working now, right? | 18:14 |
jeblair | pabelanger: (the mirror hostname fixes are in place everywhere?) | 18:14 |
pabelanger | jeblair: not yet, we need to restart nl01 to add cloud info into zk | 18:14 |
jeblair | pabelanger: would you mind doing that? when you do, we can stop using infracloud. | 18:15 |
pabelanger | waiting for https://review.openstack.org/493088/ | 18:15 |
pabelanger | but yes, I can restart | 18:16 |
mordred | jeblair: yes - but can I use those in a README in a role in zuul-jobs and have them point to the zuul docs? | 18:16 |
jeblair | mordred: oh, i'm sorry i missed 'zuul-jobs' in your question. no point for me. :( | 18:16 |
mordred | jeblair: :) | 18:17 |
jeblair | mordred: not at the moment; it'd just have to be a regular RST hyperlink | 18:17 |
pabelanger | okay, nl01.o.o restarted | 18:17 |
mordred | jeblair: if we can't do that yet, we can add it to the todo-list... I mostly wanted to be able to link to secrets | 18:17 |
jeblair | mordred: part of me wants to add stuff like that to zuul-sphinx, but i kind of think it may be a little bit of a waste of time to do it before we have all-job-docs-auto-built-from-zuul-job-api. | 18:17 |
mordred | jeblair: k. I'm going to leave it alone for now since the v3 docs are published to a feature branch location anyway and I don't want to add too much cruft we have to remember to replace | 18:17 |
jeblair | mordred: indeed | 18:18 |
mordred | jeblair: ++ | 18:18 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Create nodepool.cloud inventory variable https://review.openstack.org/493088 | 18:18 |
mordred | jeblair: I mean, we COULD register an intersphinx mapping to zuul's docs | 18:18 |
jeblair | mordred: yeah, though does that still hit the feature branch problem? | 18:18 |
mordred | jeblair: when we do the job doc builds - but yeah, feature branch problem | 18:19 |
mordred | jeblair: so let's wait | 18:19 |
jeblair | kk | 18:19 |
jeblair | SpamapS: do you want to tackle the add-build-sshkey item on https://etherpad.openstack.org/p/AIFz4wRKQm (currently lines 42-44) ? | 18:20 |
mordred | jeblair: ah - there's not a secret glossary anyway | 18:20 |
jeblair | (probably should be :) | 18:20 |
SpamapS | mmm ssh keys :) | 18:20 |
SpamapS | The Zuulonomicon should definitely have an entry for that. | 18:28 |
*** bhavik1 has quit IRC | 18:28 | |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Retry updating apt-cache https://review.openstack.org/494204 | 18:30 |
pabelanger | mordred: do you want to rebase 494231 | 18:30 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Add two roles for publishing artifacts over ssh https://review.openstack.org/494230 | 18:31 |
mordred | oh. hrm | 18:34 |
mordred | one sec | 18:34 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 18:36 |
mordred | pabelanger: I had been working on an update to https://review.openstack.org/494230 :) | 18:36 |
mordred | pabelanger: and yes- I have to afk for a bit - but I will rebase 494231 a soon as I return | 18:37 |
*** amoralej is now known as amoralej|off | 18:47 | |
jeblair | pabelanger: do you understand what went wrong here? http://logs.openstack.org/91/494291/1/check/openstack-doc-build/f60753a/job-output.txt.gz#_2017-08-16_18_41_46_434535 | 18:49 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball to post pipeline https://review.openstack.org/494296 | 18:50 |
pabelanger | jeblair: Hmm, I think that might be related to 494204. it tried 3 times to update the cache and failed | 18:51 |
pabelanger | maybe condition is not correct | 18:52 |
jeblair | pabelanger: does it only display the output on the final 'retry'? | 18:52 |
jeblair | (cause i only see it happening once in that log) | 18:53 |
pabelanger | jeblair: I am not sure, I haven't used retry option much on the apt task | 18:53 |
pabelanger | look at logs, I think our until condition is not correct | 18:54 |
pabelanger | so, we likely did run apt-get update 3 times | 18:54 |
jeblair | pabelanger: more questions -- why is it saying to use the apt module if we *are* using the apt module? also, why are we using the apt module? i thought we stopped that because python-apt wasn't installed? | 18:55 |
SpamapS | hm | 18:55 |
jeblair | aha, we *are* using shell | 18:55 |
jeblair | that change's parent is old so the diff is wrong | 18:56 |
jeblair | http://git.openstack.org/cgit/openstack-infra/zuul-jobs/tree/roles/configure-mirrors/tasks/mirror.yaml | 18:56 |
pabelanger | Oh, ya, so until condition is not correct now | 18:56 |
pabelanger | so, we should revert 494204 | 18:57 |
jeblair | pabelanger: we're going to have to force-merge it | 18:57 |
pabelanger | yes :( | 18:57 |
jeblair | pabelanger: can we fix it? so we don't have to merge 2 changes? | 18:57 |
pabelanger | I think we'll need to register the exit code now, and retry if != 0 | 18:58 |
pabelanger | checking how we do it today in project-config | 18:58 |
jeblair | pabelanger: you think you can work up a probably-correct version of that fix and we can force-merge it? or do you want to force-merge the revert, and then write up the fix as a new role we use from base-test to exercise it first? | 18:59 |
jeblair | (i'm okay with doing the first to save time; we're still mostly pre-production here) | 19:00 |
pabelanger | jeblair: yes, I think we should force revert and do base-test. This is something we need to take care with in the future too | 19:00 |
jeblair | okay, i'll do the revert | 19:00 |
pabelanger | I didn't do a good job remembering this role was being used by a trusted playbook | 19:00 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul-jobs master: Revert "Retry updating apt-cache" https://review.openstack.org/494298 | 19:02 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Revert "Retry updating apt-cache" https://review.openstack.org/494298 | 19:05 |
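The register-and-retry fix pabelanger sketches above ("register the exit code, retry if != 0") has the same shape as this plain-Python loop; the command and names are illustrative only, not the project-config implementation:

```python
import subprocess
import time

def run_with_retries(cmd, retries=3, delay=1.0):
    # Equivalent in shape to an Ansible shell task with
    # register + retries + "until: result.rc == 0".
    for attempt in range(1, retries + 1):
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode == 0:
            return result
        if attempt < retries:
            time.sleep(delay)
    raise RuntimeError("%r failed after %d attempts" % (cmd, retries))

print(run_with_retries(["true"]).returncode)  # → 0
```

The reverted patch failed because its until condition still assumed the apt module's result shape after the task had moved to shell; keying the retry off the registered exit code avoids that mismatch.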
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 19:24 |
pabelanger | mordred: jeblair: so, I think we can start writing some test playbooks in zuul-jobs (maybe a tests folder) to flex some of our trusted roles. basically, the zuul-executor runs playbooks to set up and run ansible-playbook with connection=local on the node. The test playbook could try to use the role and ensure it works | 19:38 |
pabelanger | hopefully that makes sense | 19:39 |
jeblair | pabelanger: that makes sense, however, i question how many roles that approach would be effective with. many of them would need a nearly complete simulation of the local executor environment and a remote node. | 19:41 |
pabelanger | jeblair: Ya, it would be complex fast. I guess we could limit it just to roles used by trusted playbooks, untrusted could be tested via depends-on | 19:43 |
jeblair | i don't think graceful works for zuul executor | 19:48 |
jeblair | i'm going to hard-stop ze01 | 19:48 |
SpamapS | there's a lot of wonky process stuff going on | 19:48 |
SpamapS | wouldn't be surprised if we missed something if it has a TERM handler | 19:49 |
jeblair | SpamapS: it's simpler than that | 19:49 |
jeblair | def graceful(self): | 19:49 |
jeblair | # TODOv3: implement | 19:49 |
jeblair | pass | 19:49 |
SpamapS | curious... has anybody worked out a quick way to test playbooks locally? | 19:49 |
SpamapS | I'm thinking of writing an entry point thing similar to zuul-bwrap but just like, zuul-job-execute that sets up the modules and stuff and runs them in bwrap | 19:50 |
SpamapS | jeblair: :-| | 19:50 |
jeblair | SpamapS: i'm unaware of work having been done on that (though i am aware of the desire) | 19:51 |
SpamapS | like what I'm thinking is just zuul-job-execute -i {inventory of whatever targets you choose} | 19:51 |
pabelanger | SpamapS: I've tried using openstack/dox for testing some ansible things. Basically, it put tox into docker; protected my host system | 19:51 |
SpamapS | and it can pass the inventory in | 19:51 |
pabelanger | zuul-bwrap would be nice too | 19:51 |
SpamapS | pabelanger: yeah I've got a docker thing going.. but having to hand-code an ansible.cfg that works and thinking we have all that code already | 19:51 |
SpamapS | pabelanger: zuul-bwrap already exists :) | 19:52 |
SpamapS | and is half of this | 19:52 |
SpamapS | just need the ansible part | 19:52 |
SpamapS | or.. maybe I'm missing it and zuul-bwrap already works actually | 19:52 |
SpamapS | just have to put an inventory in the work_dir and call ansible-playbook explicitly. | 19:53 |
pabelanger | mordred: since you've signed up to do (pre-)python-tarball work on etherpad. Any interest adding https://review.openstack.org/494296/ to start testing current version of publish-openstack-python-branch-tarball | 19:53 |
SpamapS | I think.. dunno..playing with it now | 19:54 |
SpamapS | that should help us iterate on roles and playbooks faster anyway | 19:54 |
pabelanger | pushing untrusted playbooks up to zuulv3.o.o has been my process. trusted playbooks are a little harder, usually requiring me to use local resources for that | 19:55 |
jeblair | i'm going to prime git clones of all the repos on all the executors to prepare for the startup test (i'm not interested in the clone time as we can do that beforehand) | 20:02 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix documentation nits https://review.openstack.org/494310 | 20:03 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Fix documentation nits https://review.openstack.org/494310 | 20:04 |
clarkb | its also a one time cost so even if your first startup is slower due to cloning subsequent ones shouldn't be | 20:05 |
jeblair | ya | 20:06 |
mordred | SpamapS: I was going to get around to doing that, so thanks! I currently just have an ansible.cfg sitting in the root of my zuul checkout that points all of the plugin dirs to the right plugin places and then I run ansible-playbook with it | 20:12 |
mordred | SpamapS: but as you point out, that's a bunch of by-hand fiddling | 20:13 |
mordred | SpamapS: and then I turn on and off zuul action plugin blocking / etc by just commenting/uncommenting lines in that file | 20:14 |
mordred | SpamapS: of course, what I really want is something that will make me a temp dir in the right structure with repos symlinked in and make an inventory inside of that :) | 20:15 |
mordred | SpamapS, pabelanger: but once we have such a thing, doing a two-node job that runs ansible on one node with the other node in its inventory to test some of the things from trusted contexts would be nice - even if we need to write a specific job for each base-job we want to test and have a zuul-job-execute command with 10 command line options to get all the things set properly ... I think it would be a win (like, | 20:17 |
mordred | it doesn't have to be able to actually read the whole zuul.yaml file or anything to be an improvement) | 20:17 |
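The helper mordred is describing could start out as something like this sketch: create a temp work dir, symlink existing repo checkouts into a `src/` tree, and write a flat inventory beside them. The directory layout and function name are assumptions for illustration, not the executor's real scheme:

```python
import os
import tempfile

def make_job_dir(repos, nodes):
    """Sketch of a local zuul-job-execute work dir builder.

    repos: {'canonical/name/of/repo': '/path/to/local/checkout', ...}
    nodes: {'node1': '192.0.2.10', ...}
    Returns the temp work dir with repos symlinked under src/ and an
    inventory file written next to them.
    """
    work = tempfile.mkdtemp(prefix='zuul-job-')
    src = os.path.join(work, 'src')
    for canonical, path in repos.items():
        dest = os.path.join(src, canonical)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        os.symlink(os.path.abspath(path), dest)
    with open(os.path.join(work, 'inventory'), 'w') as f:
        for name, ip in nodes.items():
            f.write('%s ansible_host=%s\n' % (name, ip))
    return work
```

From there, `ansible-playbook -i <workdir>/inventory` with an ansible.cfg pointing at Zuul's plugin dirs gets close to what the executor does.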
* mordred shuts up and goes back to accomplishing things | 20:18 | |
pabelanger | mordred: ya, that would be interesting to do also | 20:19 |
mordred | pabelanger: should we remove the tarball job from the v2 jobs before landing that change above? otherwise v2 and v3 will be fighting to upload to the same location, yeah? | 20:29 |
mordred | pabelanger: (unless we already did that and I missed it) | 20:30 |
SpamapS | Yeah the next level up, of having all the zuul* vars set would be cool | 20:31 |
SpamapS | but right now I just want to test the playbook with the zuul_ modules and actions | 20:32 |
SpamapS | it is rather tempting to just make something that shoves a job into the executor btw. | 20:32 |
SpamapS | As that would get this done in a rather complete fashion. | 20:33 |
pabelanger | mordred: ya, I thought we did. Let me confirm | 20:35 |
pabelanger | mordred: ya, we just have pre-release and release jobs for zuul still on zuul.o.o | 20:35 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 20:39 |
SpamapS | pabelanger: ^^ I misunderstood the original push for this. I think it makes more sense to have those go into the ansible_ssh_user's .ssh | 20:39 |
mordred | SpamapS: fwiw, this is what I have locally: http://paste.openstack.org/show/618575/ | 20:40 |
mordred | pabelanger: sweet | 20:40 |
SpamapS | mordred: yeah, so, Zuul will happily write that for you if we add the right entry point. | 20:41 |
SpamapS | I don't even think it will be much code | 20:41 |
mordred | SpamapS: oh - totally - that's just my "I'll just make a file" cop out - I'd love a tool | 20:41 |
SpamapS | Just mostly looking at what level to enter now. | 20:42 |
SpamapS | If we enter too high, we have to have sources configured somehow. | 20:42 |
SpamapS | If we go too low, we have to add a bunch of details on the cmdline or we lose assumptions. | 20:42 |
pabelanger | SpamapS: Hmm, why not just ~/.ssh? We should be connected as ansible_ssh_user right? | 20:44 |
pabelanger | also left a few suggestions about ansible syntax | 20:44 |
SpamapS | oh right duh | 20:46 |
*** dkranz has quit IRC | 20:46 | |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 20:54 |
SpamapS | pabelanger: I forgot that yaml has native octals :-P | 20:54 |
jeblair | 1621 repos 9GB | 21:00 |
jeblair | okay, i'd like to take zuulv3.o.o offline now to perform the startup tests | 21:02 |
jeblair | any objection? | 21:03 |
pabelanger | wfm | 21:03 |
jeblair | oh, ha, the first startup is going to take a while as we generate 1600 rsa keypairs. | 21:04 |
jeblair | at the current rate, almost half an hour. | 21:05 |
pabelanger | ha, nice | 21:06 |
jeblair | okay, trying that again now that puppet is disabled | 21:18 |
SpamapS | jeblair: haveged running? | 21:22 |
SpamapS | :) | 21:22 |
SpamapS | so... known hosts propagation | 21:23 |
SpamapS | easiest thing would probably be to call ssh-keyscan | 21:26 |
pabelanger | ya, JJB does this today | 21:32 |
pabelanger | http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/macros.yaml#n845 | 21:32 |
SpamapS | oh that's not handling known_hosts though | 21:34 |
SpamapS | nothing seems to be actually | 21:34 |
pabelanger | Oh, hmm. I linked the wrong thing | 21:36 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Install build private key too https://review.openstack.org/494302 | 21:36 |
SpamapS | oops forgot a fix derp | 21:36 |
SpamapS | pabelanger: ^^ all your comments addressed now I think | 21:36 |
pabelanger | http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/ansible-role-jobs.yaml#n37 | 21:36 |
pabelanger | that is how I deal with it for testing ansible roles today | 21:36 |
pabelanger | so, should be easy to make that into a role | 21:37 |
jeblair | pabelanger, SpamapS: are you talking about the add-ssh-key-to-root role? | 21:37 |
pabelanger | originally ya, I commented about that in the review | 21:38 |
mordred | I mean - we know the known_hosts info for each node, because we get it from nodepool - so we really just need to copy .ssh/known_hosts from the executor to each node - don't need keyscan | 21:38 |
pabelanger | Ya, that too. They should all exist in inventory now | 21:39 |
mordred | pabelanger, SpamapS, if you have a sec, feel like +Aing https://review.openstack.org/#/c/493250 ? | 21:40 |
jeblair | http://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/devstack-vm-gate.sh#n181 is the current devstack-gate equivalent using keyscan (though i agree, we should just use local inventory). however, public vs private ip addresses may come in to play here. we'll want both in the known_hosts file. | 21:40 |
jeblair | oh the executors are running cat jobs! | 21:41 |
pabelanger | Ya, we don't actually have them in inventory directly, but we could just use known_hosts from zuul-executor job dir, that is where we write them | 21:42 |
SpamapS | jeblair: I may have misunderstood the task entirely. :) | 21:43 |
SpamapS | I'm installing the build SSH key in the ansible ssh user's ~/.ssh | 21:44 |
SpamapS | it already gets installed in ~/.ssh/authorized_keys | 21:44 |
* SpamapS reads the jjb task again | 21:44 | |
jeblair | SpamapS: oh okay, you mentioned known_hosts so i assumed you were moving on to the next task | 21:44 |
SpamapS | allow-local-ssh-root is something else right? | 21:44 |
jeblair | i have no idea what that is :( | 21:45 |
SpamapS | jeblair: I figured known_hosts was a third thing | 21:45 |
mordred | SpamapS: allowing each node in the nodeset to ssh to each other | 21:45 |
SpamapS | jeblair: http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/macros.yaml#n845 <-- allow-local-ssh-root seems to allow you to ssh from whatever job you're running to root | 21:45 |
jeblair | SpamapS: yeah, but what's the relevance of that to devstack-gate? | 21:45 |
SpamapS | jeblair: dunno, pabelanger pointed it out | 21:45 |
jeblair | SpamapS: that looks like a macro that's used by our infra puppet module tests | 21:46 |
jeblair | so i don't think it has anything to do with the tasks at hand | 21:46 |
SpamapS | I'll ignore it :) | 21:46 |
mordred | :) | 21:46 |
SpamapS | so https://review.openstack.org/494302 <--- just installs the build private key in ~ | 21:46 |
SpamapS | which is the first half of "let them ssh to eachother" | 21:47 |
mordred | yah. oh - we may have left out something | 21:47 |
jeblair | SpamapS: what's the second half? | 21:47 |
SpamapS | and now I'm poking at the second half, add known_hosts files for all nodes to all nodes. | 21:47 |
mordred | nevermind me- the thing I was about to say is wrong | 21:48 |
jeblair | SpamapS: ah, on the etherpad that was part of the add-ssh-key-to-root role | 21:48 |
jeblair | SpamapS: it's possible that it should, in fact, be part of the role you're working on instead | 21:48 |
jeblair | but that is why i thought you were starting on the root ssh task | 21:48 |
SpamapS | for adding ssh key to root, that seems like another thing entirely .. no? | 21:49 |
SpamapS | also now that I'm playing with known_hosts .. I am truly curious how we're doing that on all jobs. | 21:50 |
jeblair | SpamapS: indeed it is (lines 45-49 on etherpad) | 21:50 |
SpamapS | like, how does the executor handle it? add w/o prompt? | 21:50 |
jeblair | SpamapS: they come from nodepool | 21:50 |
SpamapS | oh nodepool gives 'em to us? | 21:50 |
jeblair | yep | 21:50 |
SpamapS | COOL... can haz in zuul vars? | 21:50 |
jeblair | why? | 21:50 |
SpamapS | so I don't have to keyscan for them. | 21:51 |
SpamapS | or scrape from /etc/ssh | 21:51 |
jeblair | but they're already in ~/.ssh/known_hosts | 21:51 |
SpamapS | aderpaderpaderp, so they are | 21:51 |
* SpamapS steps back from the 4 tasks he just wrote, cracks knuckles, replaces with copy... | 21:52 |
SpamapS | wait no.. copy won't work. | 21:52 |
* SpamapS will get it | 21:52 | |
jeblair | and it started | 21:52 |
jeblair | i'll go get timing info from the logs | 21:52 |
jeblair | okay, one thing i've learned is that i think we need to cache the branch list for every project -- as every configuration update (even speculative ones) needs to know all the project branches, so it's doing git-upload-packs for all projects every time that happens. | 21:55 |
mordred | jeblair: makes sense | 21:56 |
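A branch cache along the lines jeblair suggests can be very small: fetch a project's branch list once, answer later configuration updates from memory, and invalidate on ref-updated events. The `fetch_branches` callable here stands in for the real git-upload-pack round trip (a hypothetical sketch, not Zuul's implementation):

```python
class BranchCache:
    """Sketch of a per-project branch-list cache (illustrative, not Zuul's)."""

    def __init__(self, fetch_branches):
        self._fetch = fetch_branches   # e.g. performs git ls-remote --heads
        self._cache = {}

    def get_branches(self, project):
        # Speculative config updates hit this repeatedly; only the first
        # lookup per project pays for the network round trip.
        if project not in self._cache:
            self._cache[project] = self._fetch(project)
        return self._cache[project]

    def invalidate(self, project):
        # Call on ref-updated (branch create/delete) events so the next
        # lookup re-fetches.
        self._cache.pop(project, None)
```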
SpamapS | known_hosts is tricky.. I wonder if node-to-node comms in the multinode jobs ssh to hostnames or ips. | 21:59 |
SpamapS | guessing short hostnames | 21:59 |
SpamapS | but wondering how those are looked up | 21:59 |
* SpamapS would wager /etc/hosts | 21:59 | |
jeblair | SpamapS: by "private" ip address, actually; which is something we may need to plumb through to the known_hosts file; and we may also need to figure out how to expose that to the job | 22:01 |
jeblair | it took 1.5 minutes for zuul to submit all the cat jobs, and a further 7 minutes for them to complete (with 4 executors), for a total startup time of 8.5 minutes. | 22:02 |
pabelanger | jeblair: SpamapS: couldn't we use the fact cache for 'private' IPs? They should be listed first time we run a play | 22:03 |
jeblair | if we have 8 mergers and 8 executors, it should start up in less than 2.5 minutes. | 22:03 |
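As a sanity check on that estimate: 7 minutes of wall time on 4 executors is roughly 28 worker-minutes of cat-job work, and mergers serve cat jobs too, so 8 executors plus 8 mergers gives 16 workers. A back-of-envelope calculation, assuming the work parallelizes perfectly (the 1.5-minute submission overhead is accounted separately and may partly overlap execution):

```python
# Back-of-envelope from the numbers quoted above.
work_minutes = 7 * 4            # 7 min wall time across 4 executors
workers = 8 + 8                 # 8 executors + 8 mergers, both run cat jobs
estimate = work_minutes / workers
print(estimate)                 # ~1.75 min of cat-job execution time
```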
pabelanger | jeblair: each on their own server? | 22:04 |
jeblair | that seems entirely tractable for a system that rarely needs a full restart | 22:04 |
jeblair | pabelanger: that's how we do it now | 22:04 |
jeblair | i'm going to restart zuulv3 with the normal configuration now | 22:06 |
pabelanger | jeblair: sorry, just confirming. we'd have 8 ze01-ze08 and zm01-zm08? Today ze01 does both operations right? | 22:07 |
SpamapS | pabelanger: yeah I was just looking at that. | 22:08 |
jeblair | pabelanger: i think so. at least, that's what i'd like to have for the ptg. we can wind some mergers down if we don't need them. | 22:08 |
SpamapS | for that we might even want to just ssh-keyscan all the ips we know about for every host. | 22:08 |
SpamapS | from every host.. | 22:08 |
SpamapS | then we'd have all the reachable ones. | 22:09 |
SpamapS | but that could get messy | 22:09 |
pabelanger | jeblair: okay, thanks. Also means we might need to update puppet-zuul for zuulv3 mergers | 22:09 |
*** jkilpatr has quit IRC | 22:09 | |
SpamapS | also ready from nodepool doesn't necessarily mean they're all 100% network plumbed. | 22:09 |
SpamapS | just means the public ips can be ssh'd to | 22:09 |
SpamapS | but I will look closer at what multinode is already doing for these problems | 22:10 |
mordred | SpamapS: we do not assume private network connectivity between nodes | 22:11 |
mordred | SpamapS: the fun bits (the network-overlay role) will use public or private as makes sense for the set of nodes | 22:12 |
mordred | SpamapS: because we can't count on the clouds to provide us with an environment in which multi-node openstack can actually run - thus the manual overlay \o/ | 22:13 |
pabelanger | just thinking, it wouldn't be too hard to add nodepool.ssh_known_hosts variables into the inventory for each host. Maybe that is something users want to access | 22:13 |
pabelanger | ansible facts would gather them too: ansible_ssh_host_key_dsa_public for example: http://logs.openstack.org/96/494296/1/check/tox-pep8/2ec42f9/zuul-info/host-info.ubuntu-xenial.yaml | 22:15 |
SpamapS | if they're already in hostvars that is actually a lot easier | 22:23 |
SpamapS | but what's more complicated still is making sure the right _host_ argument is there | 22:23 |
SpamapS | if it's the IPs of the manual overlay......... | 22:24 |
SpamapS | that seems like a "deep within the bowels of d-g" problem | 22:24 |
mordred | SpamapS: yah - I think you can ignore that case for now | 22:24 |
SpamapS | maybe I shouldn't be doing these things in base jobs | 22:24 |
mordred | becaues this is a base-job thing so there is no network-overlay yet | 22:24 |
mordred | also, I don't think we expect ssh traffic over that overlay - people can use the normal hostname/public-ip for that | 22:25 |
SpamapS | yeah, adding the SSH key is easy | 22:25 |
mordred | the overlay is just for neutron to have a network to manage | 22:25 |
SpamapS | and we'll have the host keys in facts... so one can generate known_hosts after overlay setup | 22:25 |
SpamapS | oh I thought there was host-to-host SSH to enable? | 22:25 |
mordred | if people want to ssh over the overlay they can solve that problem themselves | 22:25 |
mordred | yes. but not over the overlay | 22:25 |
mordred | the overlay is just for neutron in d-g | 22:25 |
mordred | it's not otherwise useful | 22:25 |
SpamapS | Ah ok, so it's just via the already-existing public IP or hostname? | 22:26 |
mordred | yes | 22:26 |
SpamapS | ok so I can just dig through hostvars for ansible_ethX's | 22:26 |
mordred | so that if people want to, for instance, run a zuul command on node one that ansible's to node 2 as part of a functional test of a zuul playbook | 22:26 |
SpamapS | and maybe also resolve the public hostname to an IP for good measure | 22:26 |
mordred | I seriously think you can literally just copy .ssh/known_hosts | 22:26 |
SpamapS | that's what I have now :) | 22:26 |
mordred | \o/ | 22:26 |
mordred | I think maybe writing an /etc/hosts is maybe a thing that's good too? | 22:27 |
jeblair | that's a good start, but i think we'll need to add private ip to known_hosts. | 22:27 |
mordred | I can't remember if we do that currently | 22:27 |
clarkb | mordred: we ssh over the overlay, but you are correct that we don't care about those hostkeys. The sshing is from tempest to the instances and instance to instance to test networking and they are responsible for sorting that out themselves | 22:28 |
openstackgerrit | Clint 'SpamapS' Byrum proposed openstack-infra/zuul-jobs master: Add known_hosts from executor to all nodes https://review.openstack.org/494333 | 22:28 |
SpamapS | ^ the dumb version that just makes sure all the local lines are in the remote file. | 22:28 |
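The "dumb version" amounts to an idempotent line merge: every entry from the executor's known_hosts ends up in the node's file, without duplicating lines already there. As a pure-function sketch (the names are illustrative, not the actual role's tasks):

```python
def merge_known_hosts(local_lines, remote_lines):
    """Return the remote known_hosts content with any missing local
    entries appended, preserving order and skipping duplicates."""
    merged = [line.strip() for line in remote_lines if line.strip()]
    have = set(merged)
    for line in local_lines:
        line = line.strip()
        if line and line not in have:
            merged.append(line)
            have.add(line)
    return merged
```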
mordred | clarkb: right- we don't ssh between nodes over the overlay for normal node-to-node ssh traffic | 22:28 |
SpamapS | and then yeah, we can also add all the interfaces ipv4 and ipv6's there too | 22:29 |
mordred | clarkb, SpamapS, jeblair: that said - the thing SpamapS was getting at earlier - each host does have the hostkey in the ansible hostvars - potentially a role that's "add-hostkey" so that if someone does want to ssh over a different network it's easy for them to add that in a playbook or something? | 22:30 |
mordred | clarkb: do we create an /etc/hosts today? | 22:30 |
clarkb | mordred: on multinode jobs we do yes | 22:31 |
clarkb | mordred: because libvirt live migration uses the hostnames that nova provides so they have to resolve | 22:31 |
SpamapS | I mean, whatever adds those host records should probably also add the known_hosts entries. | 22:31 |
SpamapS | that would maybe be a nice generic role | 22:32 |
SpamapS | ssh-safely-to-from-all-nodes | 22:32 |
SpamapS | or something shorter ;) | 22:32 |
SpamapS | but it's hard to close the loop on what hostname will be ued | 22:33 |
SpamapS | used | 22:33 |
clarkb | the hostname is set before ansible ever shows up | 22:34 |
clarkb | so just use that? | 22:34 |
mordred | yah. each node has hostname set | 22:34 |
mordred | in fact, that's how we do it today | 22:34 |
clarkb | ya | 22:34 |
mordred | SpamapS: http://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/devstack-vm-gate.sh#n184 | 22:35 |
mordred | SpamapS: we obviously don't need the keyscan cause we have keys now | 22:35 |
mordred | oh - we should probably work out something for /etc/nodepool contents | 22:36 |
mordred | clarkb: so we do known_hosts by ip and also put that ip into /etc/hosts with the hostname from each host | 22:39 |
clarkb | ya that should be for the normal non overlay IPs though | 22:40 |
mordred | yah | 22:40 |
clarkb | we do however use the cloud internal non NATed network if present though | 22:40 |
clarkb | because running some of this through NAT causes failures and debugging that has been low priority because the other networks work | 22:40 |
mordred | ok- I think we have another thing to add to the list | 22:42 |
mordred | we'll need a role to write out an /etc/nodepool dir with primary and secondary info for the existing multinode jobs - and probably want a nodeset that has two nodes, one called "primary" and one called "secondary" :) | 22:44 |
mordred | but I think we're going to also need to expose the public/private info from nodepool into the inventory - since the logic of "node.private_ip == private_ip if private_ip else public_ip" is nodepool logic | 22:45 |
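The fallback mordred quotes — use the private address when the cloud provides one, otherwise the public one — plus the /etc/hosts mapping discussed above, sketched with hypothetical field names (the real nodepool node records differ):

```python
def interface_ip(private_ip, public_ip):
    # The nodepool logic quoted above: private_ip if private_ip else public_ip
    return private_ip if private_ip else public_ip

def etc_hosts_lines(nodes):
    """nodes: [{'hostname': ..., 'private_ip': ..., 'public_ip': ...}, ...]
    Returns /etc/hosts-style lines mapping each node's preferred IP to
    its hostname (hypothetical record shape, for illustration)."""
    return ['%s %s' % (interface_ip(n['private_ip'], n['public_ip']),
                       n['hostname'])
            for n in nodes]
```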
*** jkilpatr has joined #zuul | 22:48 | |
jeblair | mordred: we'll want /etc/nodepool for non-devstack jobs which auto-convert, but we shouldn't need it for our devstack job. | 22:49 |
jeblair | mordred: (ie, i think we can put it in the category of 'legacy role', like zuul vars) | 22:50 |
mordred | jeblair: I think we need it for devstack jobs that auto-convert to the legacy ... | 22:50 |
mordred | yah | 22:50 |
mordred | it's just it's an interface we've told people about, so god only knows how it's being used in jobs | 22:50 |
jeblair | right. just saying we don't need it for the devstack job we're building | 22:50 |
mordred | oh - totally | 22:50 |
jeblair | whereas the private ip thing we will need | 22:50 |
mordred | I'm more making a note that we need an answer for it before ptg | 22:51 |
mordred | jeblair: yup | 22:51 |
mordred | jeblair: also - v3 should be restarted/running normal, yeah? | 22:51 |
jeblair | mordred: yep | 22:52 |
mordred | jeblair: I hit +A again on the zuul-jobs docs patch and it doesn't seem to be running | 22:52 |
jeblair | mordred: number? | 22:52 |
mordred | jeblair: https://review.openstack.org/#/c/493250/ | 22:52 |
jeblair | 2017-08-16 22:23:01,027 DEBUG zuul.DependentPipelineManager: Change <Change 0x7fe4c8051e10 493250,1> does not match pipeline requirement <GerritRefFilter connection_name: gerrit open: True current-patchset: True | 22:52 |
jeblair | required-approvals: [{'username': re.compile('zuul'), 'Verified': [1, 2]}, {'Workflow': 1}]> | 22:52 |
jeblair | mordred: got caught in the no-mans-land between check and gate. needs a recheck. | 22:53 |
mordred | nod | 22:53 |
jeblair | done | 22:53 |
mordred | twice :) | 22:53 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:01 |
mordred | jeblair: that ^^ gets your review comments I think | 23:05 |
mordred | jeblair: can I nudge you for a +A on https://review.openstack.org/#/c/494296 ? | 23:09 |
SpamapS | hrm | 23:16 |
jeblair | mordred: +2 and done | 23:17 |
mordred | jeblair: thanks! | 23:21 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add publish-openstack-python-branch-tarball to post pipeline https://review.openstack.org/494296 | 23:33 |
pabelanger | neat! post job running right away | 23:34 |
pabelanger | ha | 23:38 |
pabelanger | 2017-08-16 23:37:16.873474 | ubuntu-xenial | mv: cannot move 'zuul-2.5.2.dev1640.tar.gz' to 'zuul-feature/zuulv3.tar.gz': No such file or directory | 23:38 |
pabelanger | will fix that tomorrow | 23:38 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Use new sphinx roles in docs https://review.openstack.org/493250 | 23:40 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Allow requesting secrets by a different name https://review.openstack.org/494343 | 23:43 |
mordred | jeblair: ^^ that turned out to not be very hard | 23:43 |
mordred | pabelanger: whoops! but nice that it ran! | 23:44 |
jeblair | mordred: of course because of the duplication due to decryption. neat. :) | 23:45 |
mordred | jeblair: exactly! turned out to work rather nicely :) | 23:45 |
SpamapS | oy.. nested loops digging around hostvars is quite a weird thing to get done in ansible | 23:48 |
mordred | SpamapS: indeed. fwiw - don't be afraid of tossing in a python module in the role | 23:49 |
mordred | SpamapS: no need to break your head too much on the jinja | 23:49 |
SpamapS | Yeah I draw the line at 2 filters | 23:50 |
mordred | SpamapS: I've got a simple one in zuul-jobs ./roles/validate-host/library/zuul_debug_info.py if you want to cargo-cult the setup bits :) | 23:51 |
SpamapS | and I think that's what I'll do, since this is a pretty dumb-easy bit for python | 23:51 |
pabelanger | mordred: ^should be our fix | 23:51 |
pabelanger | err | 23:51 |
mordred | yah. python is pretty amazing for that | 23:51 |
pabelanger | remote: https://review.openstack.org/494344 Replace slash for tarball rename | 23:51 |
SpamapS | Since it's like "for each ip of each interface of each ssh key" | 23:51 |
mordred | pabelanger: +A (single-core/bugfix) | 23:52 |
mordred | SpamapS: yah. I'm sure you can do that in jinja, but you won't be sane when you're done | 23:52 |
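The nested loop SpamapS describes is far more pleasant as a small Python library module than as Jinja. A sketch of the core, using real Ansible fact names (`ansible_all_ipv4_addresses`, `ansible_ssh_host_key_rsa_public`, and friends) but omitting the AnsibleModule plumbing a real `library/` module would need:

```python
KEY_TYPES = {            # fact name suffix -> known_hosts key type
    'rsa': 'ssh-rsa',
    'ed25519': 'ssh-ed25519',
    'ecdsa': 'ecdsa-sha2-nistp256',
}

def known_hosts_entries(hostvars):
    """Build 'ip key-type key' known_hosts lines from per-host facts:
    for each host, for each ip of each interface, for each ssh key."""
    entries = []
    for host, facts in sorted(hostvars.items()):
        ips = (facts.get('ansible_all_ipv4_addresses', []) +
               facts.get('ansible_all_ipv6_addresses', []))
        for suffix, keytype in sorted(KEY_TYPES.items()):
            key = facts.get('ansible_ssh_host_key_%s_public' % suffix)
            if not key:
                continue
            for ip in ips:
                entries.append('%s %s %s' % (ip, keytype, key))
    return entries
```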
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Document and update fileserver roles https://review.openstack.org/494291 | 23:53 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Allow requesting secrets by a different name https://review.openstack.org/494343 | 23:55 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!