tristanC | iirc, passing data between jobs may become an issue when we'll want to trigger a single job | 00:14 |
---|---|---|
tristanC | well we'll have to somehow track that data and inject it to a child job when it is triggered | 00:15 |
SpamapS | Not sure I follow | 00:16 |
SpamapS | If there's no parent/child, what do you mean? | 00:16 |
SpamapS | vars would handle any static inputs | 00:16 |
tristanC | SpamapS: e.g., when test job takes as input a build job's artifact, then when we retrigger the test job only, we need to give it something to test | 00:18 |
SpamapS | I think you re-run the build job, which looks in the artifact cache, finds the already built artifact, and sets the artifact again. | 00:19 |
tristanC | SpamapS: right, but then we can't just re-trigger the test job | 00:20 |
tristanC | imagine this pipeline: build -> [test-job] * x -> publish, then it may be valuable to be able to only re-trigger the publish if it fails for some reason | 00:21 |
SpamapS | indeed I see that need | 00:23 |
SpamapS | this is where 'zuul pushes' would come in handy. ;) | 00:24 |
SpamapS | tristanC: also in zuul parlance, that's not really a pipeline. | 00:25 |
SpamapS | that's 2 zuul pipelines. | 00:25 |
SpamapS | maybe we need something to define linkages between pipelines. | 00:25 |
*** threestrands has quit IRC | 00:25 | |
tristanC | SpamapS: right, i was using jenkins terminology, where you are able to recheck part of the test pipeline, which seems pretty neat | 00:26 |
SpamapS | It is, and I'm actually poking around with fitting jenkins/zuul together in interesting ways right now. | 00:28 |
tristanC | regardless of the publish, you could have multiple phases that execute only if the previous one succeeds, e.g.: build -> [simple tests] -> [extensive test] -> [benchmark test] | 00:28 |
SpamapS | That said.. if you follow the gate->landpatch->publish paradigm that zuul was built around, you likely just want something that binds landed patches to artifacts. | 00:28 |
SpamapS | and one way to do that, is to leave a breadcrumb in the log store, and use that to tag the commit in a post-commit, and then use tags. | 00:29 |
SpamapS | basically, something has to maintain state | 00:30 |
tristanC | yes, exactly :-) | 00:30 |
tristanC | SpamapS: would you mind sharing a bit more about how you are fitting jenkins/zuul together? | 00:31 |
SpamapS | tristanC: we're just experimenting. harlowja is writing a jenkins kicker job that will poke jenkins and then we'll put that in various places to see if it can be useful. | 00:32 |
harlowja | mainly for old-style jenkins jobs, not multibranch stuff with jenkinsfiles in it | 00:32 |
SpamapS | We have a lot of pipelines written here.. don't want to have to rewrite it all in zuul and lose momentum. | 00:32 |
SpamapS | but if we can let zuul gate on jenkins jobs that exist already, that seems like a pretty nice win. | 00:33 |
harlowja | an adaptor layer for all the groovy stuff would be interesting :-P | 00:33 |
harlowja | but likely too whack, lol | 00:33 |
tristanC | wouldn't it be possible to have a plugin so that jenkins acts like another executor? | 00:34 |
jeblair | mordred has detailed notes about running jobs in jenkins from zuul that he's discussed in here several times. | 00:36 |
SpamapS | yeah we're taking the lowest barrier path | 00:36 |
SpamapS | mordred's notes have some more involved pathways to jenkins and zuul living in harmony | 00:37 |
harlowja | u should implement a CPS transform of ansible into groovy into ansible | 00:37 |
harlowja | into assembly | 00:37 |
harlowja | into jenkins | 00:37 |
jeblair | SpamapS: i believe your understanding of zuul_return is correct | 00:41 |
jeblair | tristanC: yes, that extra data is something we'll need to solve if we implement the ability to trigger a single job | 00:42 |
tristanC | jeblair: re zk d/c, turns out nodepool-launcher and zuul-scheduler are on the same host, so i think the issue is not related to different network path to/from zookeeper | 00:45 |
tristanC | jeblair: also the exception seems to happen after an idle time, e.g. mostly when the min-ready process starts | 00:46 |
tristanC | well, looking at the code, it seems like zuul should be equally affected but it isn't... the only difference i can see is that zuul is using a timeout argument when creating KazooClient | 00:51 |
tristanC | jeblair: Shrews: considering zk is a critical piece, I'd say we should make the client retry infinitely so that things don't go wrong when there are connection issues | 00:53 |
tristanC | jeblair: Shrews: i'd like to update the patch to use a custom KazooRetry object that would log retry attempts to signal to the operator that operations are stuck because of zk issues | 00:54 |
tristanC | nevermind the zuul timeout argument, it's the default value and it only affects the connection phase | 01:00 |
tristanC | i guess zuul is not affected because of a different usage, e.g. different action rate from nodepool | 01:02 |
*** tristanC has quit IRC | 01:08 | |
*** tristanC has joined #zuul | 01:09 | |
*** gundalow has quit IRC | 01:09 | |
*** gundalow has joined #zuul | 01:10 | |
*** gundalow has joined #zuul | 01:10 | |
mordred | SpamapS, harlowja the biggest issue that the larger writeup I've got was trying to account for is git repo state and getting it on to the jenkins nodes | 01:15 |
mordred | SpamapS, harlowja: given that you already have both a jenkins and some jenkins content, you're obviously in a much better position than I am to work on solving that :) | 01:15 |
tristanC | mordred: if jenkins implements a zuul-executor service, then wouldn't it be able to construct the speculative repo locally? | 01:17 |
mordred | tristanC: dear god no | 01:18 |
clarkb | all this talk of jenkins made me go looking https://issues.jenkins-ci.org/browse/JENKINS-27514 be warned that is still a bug | 01:19 |
mordred | tristanC: getting all of the git merging correct is one of the reasons we wrote zuul in the first place - you'd wind up rewriting basically all of zuul.merger and zuul.executor | 01:19 |
clarkb | it basically makes jenkins unusable and more than just us have verified it happens | 01:19 |
tristanC | jeblair: hum, zuul zk implementation doesn't make use of the didLoseConnection() property | 01:20 |
mordred | tristanC: if you're in a position to write java code or a plugin for jenkins, I think you'd be in much better position to get the jenkins/zuul handshake stuff done | 01:20 |
tristanC | mordred: yes, i was thinking about having a zuul-executor plugin in jenkins that would implement the executor server code | 01:21 |
SpamapS | Hrm | 01:22 |
SpamapS | I am dubious about that. | 01:22 |
SpamapS | tristanC: I see where you're going, but I think you also need to then implement merging? | 01:22 |
SpamapS | And I really don't think it's necessary. | 01:22 |
SpamapS | Basically all you really want Jenkins for is the content, not Jenkins itself. | 01:23 |
SpamapS | So you treat it like a runtime. | 01:23 |
SpamapS | And that means two modes of operation. | 01:23 |
SpamapS | either a) secrets to a standing jenkins | 01:23 |
SpamapS | or b) spin up bespoke masters, feed them a config and local git trees prepared by zuul-executor | 01:23 |
tristanC | SpamapS: well yes, the whole executor module, perhaps we could even load it as-is with jython or something? | 01:24 |
SpamapS | (a) is like, the first step, and then when you realize keeping one running is pointless, you go to (b) | 01:24 |
SpamapS | my god | 01:24 |
SpamapS | It's not tristanC, it's tristanC's monster. | 01:24 |
mordred | SpamapS: except that with that don't you wind up with people who still want to look at their pipeline things in the jenkins ui? | 01:25 |
mordred | https://etherpad.openstack.org/p/zuulv3-jenkins-integration <-- fwiw, in case folks haven't seen it | 01:25 |
SpamapS | mordred: I could see people def wanting to still have their pipeline UI pretties, and keeping (a) for that. | 01:25 |
mordred | SpamapS: b seems not completely terrible | 01:26 |
SpamapS | But the really magical part is giving them their pretties, but lining up speculative merges and submitting them as whole jenkins job runs for gating purposes. | 01:26 |
SpamapS | mordred: (b) def treats jenkins as less valuable. | 01:27 |
mordred | yah- I figured that people who had a tie to some jenkins pipeline for some reason would want jenkins to execute it, because they want their groovy whatever or whatnot | 01:27 |
SpamapS | just a way to run your groovy and use your plugins. | 01:27 |
mordred | ultimately, whatever way someone comes up with will likely work for me, since my main interest in the topic is enabling people to migrate away in a controlled and reasonable manner | 01:30 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool feature/zuulv3: zk: automatically retry command when connection is lost https://review.openstack.org/523640 | 02:01 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: zk: automatically retry command when connection is lost https://review.openstack.org/525851 | 02:06 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Add sphinx_python variable to sphinx role and job https://review.openstack.org/525688 | 02:07 |
jeblair | tristanC: can you analyze the network traffic between nodepool and zk? there should be frequent pings, so the idle period shouldn't be an issue. maybe traffic analysis could show if something is wrong with that. | 02:10 |
openstackgerrit | David Moreau Simard proposed openstack-infra/zuul-jobs master: Add sphinx_python variable to sphinx role and job https://review.openstack.org/525688 | 02:10 |
tristanC | jeblair: sure, i'll run tcpdump on zk host | 02:12 |
tristanC | jeblair: though i think that change is valuable nevertheless, for example it lets operators shut down zk for maintenance without affecting zuul | 02:13 |
jeblair | tristanC: fyi, i left administrative -2s on those changes -- just because i want to be really careful merging them and don't want folks who haven't been in irc today to approve them. | 02:13 |
tristanC | jeblair: no worry about it :-) | 02:13 |
jeblair | i have to run now, chat more tomorrow | 02:14 |
tristanC | have a good evening! | 02:14 |
*** xdl has joined #zuul | 02:41 | |
*** haint has joined #zuul | 03:52 | |
*** nguyentrihai has quit IRC | 03:55 | |
*** xdl has quit IRC | 03:57 | |
*** _ari_ has joined #zuul | 03:58 | |
*** _ari__ has quit IRC | 03:59 | |
vivsoni_ | Hi Team, Zuul check is failing in driverfixes/ocata branch | 04:37 |
vivsoni_ | http://logs.openstack.org/19/525719/1/check/openstack-tox-py27/f0c6813/job-output.txt.gz#_2017-12-05_18_35_28_793241 | 04:37 |
vivsoni_ | can someone please help ? | 04:37 |
tobiash | vivsoni_: what branch of openstack/requirements do you need there? | 04:54 |
tobiash | The job took that from master | 04:54 |
vivsoni_ | driverfixes/ocata branch | 04:56 |
tobiash | vivsoni_: this branch seems not to exist on openstack/requirements | 04:59 |
tobiash | http://git.openstack.org/cgit/openstack/requirements/refs/ | 04:59 |
vivsoni_ | Actually i have proposed my fix on driverfixes/ocata project: openstack/cinder | 05:00 |
vivsoni_ | after proposing, some zuul checks are triggered | 05:00 |
tobiash | vivsoni_: you should reach out in #openstack-infra, there are more people familiar with the openstack jobs (I am not) | 05:01 |
*** threestrands has joined #zuul | 05:09 | |
*** threestrands has quit IRC | 05:09 | |
vivsoni_ | tobiash: ok. Thanks for your time | 05:18 |
tobiash | no problem | 05:25 |
*** xinliang has quit IRC | 06:27 | |
*** xinliang has joined #zuul | 06:39 | |
*** hashar has joined #zuul | 07:49 | |
*** mnaser has quit IRC | 08:08 | |
*** mnaser has joined #zuul | 08:11 | |
*** sshnaidm|off is now known as sshnaidm|rover | 08:20 | |
*** hashar has quit IRC | 08:26 | |
*** hashar has joined #zuul | 08:33 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/jobs route https://review.openstack.org/503270 | 08:44 |
*** bhavik1 has joined #zuul | 09:45 | |
*** electrofelix has joined #zuul | 09:53 | |
tobiash | tristanC: just discovered a route mismatch on https://review.openstack.org/#/c/504807 | 10:25 |
tobiash | tristanC: added a comment about that | 10:25 |
*** bhavik1 has quit IRC | 10:29 | |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/zuul feature/zuulv3: Serve keys from canonical project name https://review.openstack.org/504807 | 10:34 |
tristanC | tobiash: thanks, good catch :-) | 10:36 |
tristanC | jlk: any idea why 504267 would time out on tox-py35? | 10:40 |
*** hashar is now known as hasharAwayu | 10:46 | |
*** hasharAwayu is now known as hasharAway | 10:46 | |
*** hasharAway has quit IRC | 10:55 | |
*** hashar has joined #zuul | 11:00 | |
tobiash | tristanC: does it break locally? | 11:42 |
*** jkilpatr has joined #zuul | 11:59 | |
*** openstackgerrit has quit IRC | 12:03 | |
*** bhavik1 has joined #zuul | 12:43 | |
*** bhavik1 has quit IRC | 12:46 | |
*** openstackgerrit has joined #zuul | 12:51 | |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: web: add /{tenant}/builds route https://review.openstack.org/466561 | 12:51 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: web: make console-stream tenant scoped https://review.openstack.org/505452 | 12:55 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul feature/zuulv3: web: add /{source}/{project}.pub route https://review.openstack.org/502530 | 12:55 |
tristanC | mordred: thanks :-) | 13:00 |
mordred | tristanC: :) | 13:01 |
mordred | tristanC: I don't know what's up with the gh patch though | 13:01 |
*** bhavik1 has joined #zuul | 13:01 | |
mordred | it's still too early for that sort of thing | 13:01 |
tobiash | tristanC: typically timeouts occur in zuul if a patch breaks almost all test cases due to some little overlooked issue... | 13:02 |
tobiash | at least that was the case for most of my changes with timeouts ;) | 13:03 |
tristanC | tobiash: i didn't try this patch locally, actually i don't know much how the gh driver works at all... | 13:04 |
*** dkranz has quit IRC | 13:05 | |
tristanC | mordred: what do you mean it's still too early? | 13:05 |
mordred | tristanC: I mean I'm only on the first cup of coffee - not enough coffee in bloodstream yet :) | 13:08 |
*** bhavik1 has quit IRC | 13:08 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: web: add /{tenant}/jobs route https://review.openstack.org/503270 | 13:14 |
*** jkilpatr has quit IRC | 13:16 | |
*** sshnaidm|rover is now known as sshnaidm|afk | 13:20 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: web: add /{tenant}/builds route https://review.openstack.org/466561 | 13:28 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: web: make console-stream tenant scoped https://review.openstack.org/505452 | 13:36 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: web: add /{source}/{project}.pub route https://review.openstack.org/502530 | 13:37 |
*** jkilpatr has joined #zuul | 13:42 | |
tobiash | yay, four patches less in my staging branch :) | 13:47 |
*** sshnaidm|afk is now known as sshnaidm|rover | 14:02 | |
*** dkranz has joined #zuul | 14:15 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Add stackdumphandler to zuul-web https://review.openstack.org/526086 | 14:50 |
fungi | it was suggested in #openstack-infra a few minutes ago that we still want to provide a default/generic "base" job in our standard library, as described in the user guide | 15:58 |
fungi | there was previously one in zuul-jobs which got removed, and so we likely need something like a zuul-base-jobs repo (name suggested by jeblair) to contain this instead | 15:59 |
fungi | do we expect to provide multiple base jobs in zuul-base-jobs (i'm supposing so from the plurality of its name)? | 15:59 |
fungi | and is the idea that people could fork zuul-base-jobs to use as their config repository, or that they would choose whether to keep their base job in a config repository vs include the zuul-base-jobs repo? | 15:59 |
jeblair | fungi: we currently have multiple base jobs, though they are mostly in service of testing the actual base job. it's possible they will have a part to play in zuul-base-jobs... | 16:00 |
fungi | ahh, right the base-test job could go in there too? | 16:00 |
fungi | s/could/should/ | 16:00 |
jeblair | fungi: i think if we get the "generic" base job(s) just right, and folks trust us, they could directly use the repo. if either of those things aren't true, forking or copying would be the way. i think we want to push hard on the idea that folks should not fork zuul-jobs, but i don't think we can be as confident with zuul-base-jobs. | 16:01 |
jeblair | fungi: re base-test -- maybe? i haven't thought all that through yet :| | 16:02 |
fungi | i'm happy to spin this piece up quickly since it seems to be tripping up early adopters. seems like something we want before general availability of 3.0 | 16:02 |
jeblair | fungi: ++ thanks | 16:02 |
fungi | writing up the new repo creation change for it now | 16:02 |
jeblair | fungi: i'm pretty sure the default base job will be a little different from what we are using right now -- at least until tobiash's work to completely genericize our repo sync is done... | 16:03 |
jeblair | in the mean time, the default base job can use the prepare-workspace role | 16:03 |
fungi | yeah, at a minimum (as also came up in the -infra discussion just now) we probably need to not include the mirror setup role | 16:03 |
pabelanger | I raised the idea the other day about maybe splitting out base job into trusted / untrusted parts too. To aid in testing changes. Right now something like configure-unbound really doesn't need to be trusted, it only is because it exists in the config-project. | 16:04 |
fungi | pabelanger seemed to imply including add-build-sshkeys may be useful at least | 16:04 |
jeblair | pabelanger: i don't want to add another level of inheritance -- it's too inefficient to do that for every job. however, the role doesn't have to be in project config, and can be externally tested. | 16:05 |
sshnaidm|rover | pabelanger, jeblair, hi, I wonder if we can ensure these stories are on the roadmap and don't get lost: https://storyboard.openstack.org/#!/story/2001353 https://storyboard.openstack.org/#!/story/2001354 | 16:08 |
pabelanger | jeblair: right, the role does exist in zuul-jobs, but added as role into base, today. Meaning we lose depends-on. As for inefficient, could you expand more on that? | 16:08 |
jeblair | sshnaidm|rover: thanks! i tagged them with the future work tag (zuulv3.x), so they're on the roadmap now -- we may move them up and assign them to a specific release the next time we re-evaluate it. | 16:10 |
jeblair | pabelanger: every item in the inheritance path adds more work to zuul. if it's something we do for every job, we should keep it in 'base' rather than adding another level. | 16:11 |
sshnaidm|rover | jeblair, great, thanks! | 16:12 |
jeblair | pabelanger: moreover, base jobs can only be defined in config repos, so i'm not sure we could split it anyway. | 16:12 |
pabelanger | jeblair: ah, okay. makes sense. | 16:13 |
fungi | oh, hrm, i guess we already have a zuul-base-jobs repo | 16:20 |
fungi | just needs content | 16:21 |
jeblair | i'm going to administrative -2 some changes that are up for review that are not in the v3.0 roadmap and warrant significant discussion/review before we merge them. i don't mean that to discourage anyone, but rather just to help us focus on the immediate things. | 17:12 |
*** JasonCL has quit IRC | 17:16 | |
fungi | looking at the base jobs we have in project-config, our configuration of the validate-host role also seems pretty openstacky... what's a good zuul_site_traceroute_host instead of git.openstack.org? can we assume /etc/dib-builddate.txt is likely to exist for most generic deployments? | 17:18 |
jeblair | fungi: i'm not sure the first thing is really necessary in a standard base job... and no, we should definitely not assume the second thing. folks may use stock cloud images instead of dib. | 17:19 |
fungi | i'm noticing we set these in base-minimal but not in base | 17:19 |
fungi | base presumably picks up the role's defaults from zuul-jobs | 17:20 |
fungi | looks like the role skips the traceroute if zuul_site_traceroute_host is undefined | 17:20 |
fungi | however our defaults for zuul_site_image_manifest_files are /etc/dib-builddate.txt and /etc/image-hostname.txt | 17:21 |
jeblair | fungi: re defaults: yes that seems likely. | 17:22 |
jeblair | fungi: does the role perhaps gracefully handle those files being absent? | 17:22 |
jeblair | if so, maybe it's okay. | 17:22 |
fungi | would it make more sense to adjust the zuul_site_image_manifest_files for the role in zuul-jobs to something more generic? | 17:22 |
fungi | (or stop shipping a default if the task does a no-op on the operations needing that variable?) | 17:23 |
*** JasonCL has joined #zuul | 17:24 | |
fungi | and yeah, looks like the zuul_debug_info.py script will no-op image manifest check if the variable is empty | 17:25 |
fungi | does a for loop over the list, so will just do nothing in the presence of an empty list | 17:26 |
openstackgerrit | David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: WIP: Add finger gateway https://review.openstack.org/525276 | 17:27 |
fungi | i suppose if the base job includes that role, the role can have variables passed in by a descendant of the base job, hence it makes sense to keep the role itself? | 17:28 |
fungi | because if we pass neither variable into it, looks like that role does nothing at all by default | 17:28 |
fungi | (at least, once we remove the defaults from the role definition) | 17:29 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul-jobs master: Source nvm before using it https://review.openstack.org/526129 | 17:30 |
jeblair | tristanC: comments on 468624 (static driver) | 17:33 |
jeblair | Shrews: i just +2d 488384 -- you +2d it a while ago. do you want to re-review or push it through? | 17:40 |
Shrews | jeblair: lemme check real quick | 17:40 |
Shrews | jeblair: +A'd. i think mordred and myself were just waiting for you to take a look first | 17:43 |
jeblair | Shrews: cool | 17:43 |
clarkb | fungi: if the role does nothing by default seems like you should just include the role in a child job? | 17:46 |
Shrews | jeblair: how do you like logging handler names to be formatted? Is "zuul.<CLASSNAME>" preferable to, say, "zuul.log_streamer.<CLASSNAME>"? might as well clean that up while in that code | 17:46 |
*** hashar is now known as hasharDinner | 17:46 | |
clarkb | that would likely make reading job defs clearer to people if they are in the noop category of user for the role | 17:46 |
jeblair | Shrews: i think we need to move to logical handler names and not focus on classnames. so maybe just "zuul.log_streamer" or something. | 17:47 |
Shrews | jeblair: k | 17:47 |
jeblair | mordred: can you review https://review.openstack.org/518620 please? should be quick | 17:58 |
mordred | jeblair: looking | 18:00 |
clarkb | unfortunately that is cloud dependent | 18:00 |
clarkb | see also rax | 18:00 |
fungi | clarkb: my reasoning was that if the role can be configured in a child job but is a no-op without passing any variables then perhaps it still makes sense to include so that child jobs don't need to include the role, but you're right it's as many characters in the child job definition either way | 18:01 |
clarkb | probably wouldn't hurt to have a "If your cloud uses security groups" prefix? | 18:01 |
clarkb | fungi: ya I just worry that it could potentially be a bad fork in the road for someone debugging their jobs when that should be a noop for them | 18:01 |
mordred | jeblair: that makes me want to come back and ponder if we should expose that as a setting ... but for now that seems like a fine patch | 18:02 |
clarkb | this is also a greater problem with openstack | 18:03 |
clarkb | its super common for people to have networking problems not realizing the default firewall is no traffic | 18:03 |
fungi | clarkb: so at that point, once i strip down the openstackisms from our base job, the only thing it has which base-minimal lacks is the mirror-workspace-git-repos role... is that also overly openstacky? if so i can likely drop base-minimal since it will be identical to base at that point | 18:03 |
*** haint has quit IRC | 18:04 | |
fungi | mirror-workspace-git-repos looks useful and generic enough to keep in base i guess | 18:05 |
clarkb | that is what copies the prepped git repos to the test nodes right? | 18:05 |
clarkb | I think that should probably be kept | 18:05 |
jeblair | fungi: mirror-workspace-git-repos is probably too openstacky right now, until tobiash finishes his work to genericize it. we probably need prepare-workspace instead. | 18:05 |
jeblair | (unless that landed when i wasn't looking) | 18:05 |
fungi | ahh, okay | 18:06 |
tobiash | jeblair: I haven't had time to do that yet | 18:06 |
fungi | i'll swap it out, thanks | 18:06 |
tobiash | it's on my todo list | 18:06 |
fungi | is prepare-workspace enough divergence between base and base-minimal to bother including a base-minimal in the stdlib then? | 18:06 |
clarkb | fungi: probably useful for jobs that don't manipulate source code | 18:07 |
clarkb | like rtfd or whatever | 18:07 |
fungi | sounds reasonable | 18:07 |
fungi | and having base-test be identical to base is still useful simply because it allows you to do shadow testing on changes to base, i suppose? | 18:08 |
jeblair | well, prepare-workspace will dtrt if there are no remote nodes, so base should still be used | 18:09 |
jeblair | base-minimal is only used for testing base | 18:09 |
jeblair | i think dmsimard had some patches to rename it? | 18:09 |
fungi | seems like one of base-minimal or base-test ought to be dropped from the stdlib set in that case? | 18:09 |
jeblair | fungi: i think i'd start with only including base and base-test in zuul-base-jobs. | 18:10 |
fungi | thanks, will do | 18:10 |
jeblair | let's add base-minimal (or whatever it's renamed to) if/when we decide we need it for testing roles there | 18:10 |
dmsimard | jeblair: I have the first patch out of the series done | 18:11 |
dmsimard | haven't got around to making the others | 18:11 |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Refactor provider config to driver module https://review.openstack.org/488384 | 18:11 |
dmsimard | chasing a shade bug right now but I can put up the other patches after | 18:11 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul feature/zuulv3: WIP: Git driver https://review.openstack.org/525614 | 18:12 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul feature/zuulv3: WIP: Git driver https://review.openstack.org/525614 | 18:12 |
*** nguyentrihai has joined #zuul | 18:14 | |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Improve test case node_assignment_at_quota https://review.openstack.org/506134 | 18:15 |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Reorg non detailed instance listing columns https://review.openstack.org/522103 | 18:16 |
jeblair | Shrews: just to double check: 505354 will work with ansible 2.3 as well, right? | 18:19 |
jeblair | i mean... it passes tests. i just want to make sure i'm not missing something. :) | 18:19 |
Shrews | jeblair: oh yeah, totally works | 18:20 |
Shrews | jeblair: there might be something we have to do with the custom modules we have (mordred had a fix, but the code diverged and i had to remove that squash from that change), but that should get us most of the way to 2.4.1 compatible | 18:22 |
Shrews | jeblair: 2.4.0 will NOT work for us, btw | 18:22 |
jeblair | Shrews: cool, i +3d that so we don't lose it at least | 18:22 |
jeblair | tobiash: can you review https://review.openstack.org/514119 please? | 18:22 |
tobiash | looking | 18:23 |
Shrews | ugh, i'm about to climb back into asyncio hell for a bit. if i don't respond quickly, i'll poke my head out to check in eventually | 18:25 |
tobiash | jeblair: this is the last of this transform series right? | 18:25 |
tobiash | ah no, the middle | 18:26 |
jeblair | tobiash: i think there's going to be one more to remove _projects | 18:26 |
tobiash | yeah, that was what confused me | 18:26 |
tobiash | I was thinking we are already at the last step | 18:27 |
jeblair | everything in openstack has (hopefully) been converted to _projects. this lets us rename everything back to projects. then we can drop. | 18:27 |
tobiash | looks good to me | 18:27 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-base-jobs master: Initial boilerplate, packaging and testing https://review.openstack.org/526139 | 18:28 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-base-jobs master: Add generic base and base-test jobs/playbooks https://review.openstack.org/526140 | 18:28 |
fungi | those ^ will be wip until i confirm some testing is happening | 18:29 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Move github webhook from webapp to zuul-web https://review.openstack.org/504267 | 18:29 |
tobiash | tristanC, jlk: ^^ should at least pass the tests | 18:29 |
tobiash | the actual event ingestion via zuul-web is still untested there as I faked the event ingestion in the tests to directly use gearman | 18:29 |
tobiash | spawning a zuul-web for each test case seems like overkill to me so this should be a separate test case | 18:30 |
clarkb | tobiash: jeblair I found a small doc nit in that projects dict change, I'll push a followup | 18:31 |
jeblair | clarkb: oh thanks | 18:32 |
tobiash | good catch | 18:32 |
tobiash | jeblair: regarding the max-quota-age | 18:32 |
tobiash | I'm not sure what a reasonable default should be | 18:32 |
tobiash | I can imagine that this depends on the use case | 18:32 |
tobiash | if you have nodepool in its own tenant, this can and should be larger | 18:33 |
jeblair | tobiash: is it a very expensive operation? maybe just sticking with 60s is okay? | 18:33 |
tobiash | if the tenant is shared with many volatile vms it should be much less | 18:33 |
jeblair | tobiash: but if it's wrong, we invalidate it immediately anyway, so that's not so bad | 18:34 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix mixed canonical/non-canonical project merge https://review.openstack.org/513331 | 18:34 |
openstackgerrit | Merged openstack-infra/nodepool feature/zuulv3: Document security group https://review.openstack.org/518620 | 18:35 |
tobiash | jeblair: I think the operation is at least potentially expensive | 18:35 |
clarkb | iirc nova at least heavily caches its quota values | 18:36 |
tobiash | yes, the invalidation happens at least if we run into the quota | 18:36 |
clarkb | we might want to set it as a multiple of the nova cache period | 18:36 |
jeblair | tobiash: basically, i think the immediate invalidation makes it okay to have it not be very frequent in the shared case. that lets us pick an interval that won't seem too long if the quota expands again. so something between 1 and 5 minutes sounds reasonable. | 18:36 |
tobiash | well, so I'm fine with 60s | 18:37 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-base-jobs master: Initial boilerplate, packaging and testing https://review.openstack.org/526139 | 18:37 |
openstackgerrit | Jeremy Stanley proposed openstack-infra/zuul-base-jobs master: Add generic base and base-test jobs/playbooks https://review.openstack.org/526140 | 18:37 |
tobiash | just thought, someone might want to tune this ;) | 18:37 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Changes for Ansible 2.4 https://review.openstack.org/505354 | 18:39 |
jeblair | tobiash, clarkb: i asked in #openstack-nova about cache periods | 18:39 |
tobiash | thanks | 18:39 |
jeblair | tobiash: i'd prefer to have as few options as possible, so it's not confusing for operators. so if we can pick a value that should generally work, i'd like to do that. only if there's something that really needs tuning should we offer an option. | 18:40 |
jeblair | (i mean, even we don't know what to set it to right now. :) | 18:40 |
tobiash | jeblair: yeah, as nodepool does list_servers anyway it might not be that expensive | 18:45 |
tobiash | but in my older cloud I've seen that the list_servers could take up to 60s | 18:45 |
tobiash | (without cache) | 18:45 |
tobiash | and old shade | 18:45 |
tobiash | so it should be at least 60s | 18:46 |
jeblair | tobiash: it may not take nova as long to answer a limits query as it does to list servers | 18:47 |
tobiash | the limits query is faster | 18:48 |
tobiash | yes | 18:48 |
*** electrofelix has quit IRC | 18:48 | |
clarkb | jeblair: tobiash are we just listing the upper bounds or the usage too? | 18:48 |
fungi | i have a fun catch-22 in https://review.openstack.org/526140 | 18:49 |
tobiash | clarkb: we request the total quota and subtract all foreign vm-usage | 18:49 |
tobiash | clarkb: the result is the quota available for nodepool | 18:50 |
fungi | does the commit message from https://review.openstack.org/491907 four months ago still hold? do we actually want to define zuul-base-jobs as a config repo? | 18:50 |
clarkb | tobiash: ok so we aren't using the quota usage as reported by nova? I think this is the expensive query to make to nova | 18:50 |
fungi | or do we need to instruct zuul _not_ to load the zuul.yaml in zuul-base-jobs so that we can continue using the one in project-config? | 18:50 |
jeblair | fungi: we can either do that, or exclude 'jobs' from that repo. | 18:50 |
jeblair | fungi: (if we add it as a config-project, we'll need to allow it to shadow project-config as well, since it'll duplicate the job) | 18:51 |
fungi | i expect we want to just exclude jobs from it | 18:51 |
tobiash | clarkb: no, it's using get_compute_limits from shade which just returns the absolute quota without usage | 18:51 |
fungi | jeblair: otherwise that seems like it would get really messy? | 18:52 |
tobiash | then we query the server list and sum up the resources of each unknown vm | 18:52 |
clarkb | tobiash: jeblair in that case I wouldn't be too worried about a long period between checks especially since we invalidate on failure | 18:52 |
jeblair | fungi: i think they will have nearly the same effect. adding it as a config-repo might get us a little more syntax validation, but meh. | 18:52 |
clarkb | absolute quota doesn't tend to change often ime | 18:52 |
fungi | jeblair: zuul will (or at least is expected to) continue using the base job from project-config even if we set zuul-base-jobs as a config repo then? | 18:53 |
tobiash | clarkb: but the foreign used quota may | 18:53 |
jeblair | fungi: yes, if we set it up to shadow | 18:53 |
tobiash | e.g. two nodepools fighting for quota in the same tenant ;) | 18:53 |
jeblair | fungi: (zuul-jobs is already set up to shadow in the same way) | 18:54 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Convert zuul.projects to a dict https://review.openstack.org/514119 | 18:54 |
tobiash | but that still would lead to frequent invalidations so I think a longer interval would be ok | 18:54 |
tobiash | like a few minutes | 18:54 |
fungi | jeblair: okay. i'll give that a shot. thanks | 18:55 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: Source nvm before using it https://review.openstack.org/526129 | 18:55 |
tobiash | clarkb, jeblair: so try out 5mins for the start? | 18:55 |
tobiash | worst thing which could happen would be that nodepool cannot use the full available quota for 5mins | 18:56 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Add support for shared ansible_host in inventory https://review.openstack.org/521324 | 18:56 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: github: add integration documentation https://review.openstack.org/522420 | 18:56 |
pabelanger | Yay, 521324 merged :) | 18:57 |
*** sshnaidm|rover is now known as sshnaidm|afk | 19:02 | |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul feature/zuulv3: Fix zuul.projects type in docs https://review.openstack.org/526149 | 19:05 |
*** ricky_vaca has joined #zuul | 19:14 | |
*** harlowja has quit IRC | 19:32 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add cloud quota handling https://review.openstack.org/503838 | 19:35 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Don't fail on quota exceeded https://review.openstack.org/503051 | 19:35 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Make max-servers optional https://review.openstack.org/504282 | 19:35 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support cores limit per pool https://review.openstack.org/504283 | 19:35 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support ram limit per pool https://review.openstack.org/504284 | 19:36 |
*** harlowja has joined #zuul | 19:36 | |
*** JasonCL has quit IRC | 19:36 | |
*** JasonCL has joined #zuul | 19:45 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Fix zuul.projects type in docs https://review.openstack.org/526149 | 19:46 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Rename ssh_port to connection_port https://review.openstack.org/500800 | 19:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support username also for unmanaged cloud images https://review.openstack.org/500808 | 19:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add connection-type to provider diskimage https://review.openstack.org/503148 | 19:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Don't gather host keys for non ssh connections https://review.openstack.org/503166 | 19:49 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add connection-port to provider diskimage https://review.openstack.org/504112 | 19:49 |
*** hasharDinner is now known as hashar | 19:52 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Rename ssh_port to connection_port https://review.openstack.org/500800 | 20:15 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Support username also for unmanaged cloud images https://review.openstack.org/500808 | 20:15 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add connection-type to provider diskimage https://review.openstack.org/503148 | 20:15 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Don't gather host keys for non ssh connections https://review.openstack.org/503166 | 20:15 |
openstackgerrit | Tobias Henkel proposed openstack-infra/nodepool feature/zuulv3: Add connection-port to provider diskimage https://review.openstack.org/504112 | 20:15 |
SpamapS | Hm I've just run into a rather odd situation. | 20:15 |
SpamapS | Have a private repo full of sensitive stuff. | 20:16 |
SpamapS | Want to run tests on that sensitive stuff, but do _not_ want to allow other projects to add it to required-projects. | 20:16 |
SpamapS | Thoughts? | 20:16 |
SpamapS | (I mean other than.. don't do that.. encrypt that stuff... which _is_ the plan.. some day. ;) | 20:17 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul feature/zuulv3: Rename ssh_port to connection_port https://review.openstack.org/500799 | 20:19 |
tobiash | SpamapS: put it into a different tenant? | 20:20 |
tobiash | I will do that for separating projects from each other | 20:20 |
jeblair | yeah, i think that's the way to go. more granular separation within a tenant would be problematic i think. | 20:21 |
tobiash | some commonly used projects like zuul-jobs still can be part of each tenant | 20:22 |
tobiash | my plan is to have a trusted base repo, zuul-jobs and an internal cilib repo shared for all tenants | 20:23 |
*** ricky_vaca has quit IRC | 20:24 | |
harlowja | SpamapS don't have repo full of private stuff :( | 20:24 |
harlowja | see uber, lol | 20:24 |
tobiash | and each tenant gets its own additional trusted repo where it can put its own secrets and intermediate base jobs | 20:24 |
*** JasonCL has quit IRC | 20:26 | |
*** cinerama` has quit IRC | 20:31 | |
*** kklimonda has quit IRC | 20:31 | |
*** xinliang has quit IRC | 20:31 | |
*** fbo_ has quit IRC | 20:31 | |
SpamapS | Oh tenant is actually sufficient. | 20:33 |
*** hashar_ has joined #zuul | 20:33 | |
SpamapS | Thanks! | 20:33 |
jeblair | pabelanger: some minor doc -1s on the command socket changes; the actual code lgtm. | 20:34 |
pabelanger | jeblair: thanks, will look in a moment | 20:35 |
*** hashar has quit IRC | 20:37 | |
*** qwc has quit IRC | 20:41 | |
*** JasonCL has joined #zuul | 20:44 | |
*** cinerama` has joined #zuul | 20:47 | |
*** kklimonda has joined #zuul | 20:47 | |
*** xinliang has joined #zuul | 20:47 | |
*** fbo_ has joined #zuul | 20:47 | |
jeblair | fbo_: left some comments on the git driver change | 20:48 |
*** jkilpatr has quit IRC | 20:54 | |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Remove file extension when building SimpleLayout https://review.openstack.org/525356 | 20:58 |
*** qwc has joined #zuul | 20:59 | |
*** harlowja has quit IRC | 21:00 | |
*** harlowja has joined #zuul | 21:03 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add command socket support to zuul-merger https://review.openstack.org/523197 | 21:06 |
openstackgerrit | Merged openstack-infra/zuul feature/zuulv3: Update playbook paths with extension https://review.openstack.org/525359 | 21:07 |
*** qwc has quit IRC | 21:11 | |
*** JasonCL has quit IRC | 21:12 | |
*** qwc has joined #zuul | 21:12 | |
*** harlowja has quit IRC | 21:32 | |
*** dkranz has quit IRC | 21:35 | |
*** openstackgerrit has quit IRC | 21:38 | |
*** sshnaidm|afk has quit IRC | 21:38 | |
*** tushar has quit IRC | 21:38 | |
*** leifmadsen has quit IRC | 21:38 | |
*** qwc has quit IRC | 21:41 | |
*** qwc has joined #zuul | 21:44 | |
*** qwc has quit IRC | 21:49 | |
*** qwc has joined #zuul | 21:50 | |
*** hashar_ has quit IRC | 21:56 | |
*** openstackgerrit has joined #zuul | 22:01 | |
*** sshnaidm|afk has joined #zuul | 22:01 | |
*** tushar has joined #zuul | 22:01 | |
*** leifmadsen has joined #zuul | 22:01 | |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Move send_command() into ZuulApp https://review.openstack.org/523211 | 22:08 |
openstackgerrit | Paul Belanger proposed openstack-infra/zuul feature/zuulv3: Add command socket support to zuul-scheduler https://review.openstack.org/523466 | 22:19 |
*** jkilpatr has joined #zuul | 22:20 | |
*** harlowja has joined #zuul | 22:36 | |
clarkb | is '2017-12-06 22:58:21,793 INFO nodepool.driver.openstack.OpenStackNodeRequestHandler[nl02.openstack.org-29835-PoolWorker.citycloud-sto2-main]: Node request 200-0001391414 disappeared' normal? because it seems to result in tracebacks | 23:06 |
clarkb | jeblair: wondering if maybe that is a side effect of releasing nodesets more frequently due to window changes or other changes in zuul | 23:06 |
clarkb | http://paste.openstack.org/show/628319/ is the traceback, I don't think it's much of a concern other than it dirties up the logs and makes it harder to find actual errors | 23:08 |
clarkb | hrm there are 1260 occurrences of that exact request disappearing in the log though | 23:09 |
jeblair | well that doesn't seem right | 23:09 |
jeblair | once seems like enough | 23:10 |
clarkb | ya I wonder if the pool worker is attempting to clean it up in a loop? | 23:13 |
clarkb | seems like state got lost somewhere | 23:13 |
*** openstack has joined #zuul | 23:31 | |
*** ChanServ sets mode: +o openstack | 23:31 | |
clarkb | I'm going to check the zk db directly to see if that request is there at all | 23:40 |
clarkb | but I'm guessing I can just update nodepool to drop the handler if the request has properly gone away | 23:40 |
clarkb | that request is indeed just missing from zk | 23:41 |
clarkb | I think launcher._cleanupNodeRequestLocks and _removeCompletedHandlers are racing each other | 23:54 |
clarkb | Shrews: ^ does that seem possible? basically _cleanupNodeRequestLocks is removing the request lock before _removeCompletedHandlers can remove the handler | 23:55 |
clarkb | which results in the exception in the above traceback | 23:55 |
clarkb | also we appear to still leak node request locks in zookeeper; there are far more of them than there are node requests | 23:56 |
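
The quota discussion earlier in this log (total quota minus foreign VM usage, refreshed after a maximum age, invalidated immediately when a launch hits the quota) can be sketched roughly as below. This is an illustration only, not nodepool's code: `CachedQuota`, `fetch_total`, `fetch_foreign_usage` and the injectable `clock` are hypothetical names, and the 300 second default simply mirrors the "5mins for the start" suggestion.

```python
import time
from typing import Callable, Optional


class CachedQuota:
    """Hypothetical helper: caches an available-quota estimate for max_age seconds."""

    def __init__(
        self,
        fetch_total: Callable[[], int],          # assumed: returns the absolute quota
        fetch_foreign_usage: Callable[[], int],  # assumed: sums resources of unknown VMs
        max_age: float = 300.0,                  # assumed default, per the 5 minute idea
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_total = fetch_total
        self._fetch_foreign_usage = fetch_foreign_usage
        self._max_age = max_age
        self._clock = clock
        self._value: Optional[int] = None
        self._fetched_at = 0.0

    def available(self) -> int:
        """Return quota usable by this launcher, refreshing if the cache is stale."""
        now = self._clock()
        if self._value is None or now - self._fetched_at >= self._max_age:
            self._value = self._fetch_total() - self._fetch_foreign_usage()
            self._fetched_at = now
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value, e.g. after a launch fails with quota exceeded."""
        self._value = None
```

With this shape, a long `max_age` only delays noticing that quota has grown; a quota-exceeded failure would call `invalidate()` so the next `available()` call queries the cloud again.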