Monday, 2017-02-13

00:07 <rcarrillocruz> folks, won't attend tomorrow's meeting, i'm in Munich in the new hire orientation thingy for redhat
00:07 <rcarrillocruz> jeblair: ^
00:24 *** jamielennox is now known as jamielennox|away
00:50 *** jamielennox|away is now known as jamielennox
06:02 *** saneax-_-|AFK is now known as saneax
07:28 *** Cibo_ has joined #zuul
08:34 *** abregman has joined #zuul
08:50 *** hashar has joined #zuul
08:56 *** abregman has quit IRC
09:17 *** markmcd has quit IRC
09:48 *** jamielennox is now known as jamielennox|away
10:24 *** jamielennox|away is now known as jamielennox
10:37 *** hashar has quit IRC
10:38 *** hashar has joined #zuul
12:08 *** herlo has quit IRC
12:08 *** herlo has joined #zuul
12:20 *** isaacb has joined #zuul
12:20 *** isaacb_ has joined #zuul
12:21 *** isaacb_ has left #zuul
13:17 <jkt> hi there, I've upgraded and rebooted my zuul-server, and I notice that it's now trying to access my private projects over https without proper credentials, https://paste.kde.org/pjm52hxec
13:18 <jkt> what confuses me is that this code apparently hasn't been touched since 2015, and I only installed this service maybe a month ago
13:24 <jkt> ah well, so according to the logs, this has always been the case, interesting
13:30 *** tristanC_ has joined #zuul
13:31 *** herlo_ has joined #zuul
13:32 *** tflink has quit IRC
13:32 *** hashar has quit IRC
13:32 *** tristanC has quit IRC
13:32 *** herlo has quit IRC
13:32 *** tflink has joined #zuul
13:32 *** hashar has joined #zuul
14:27 <Shrews> pabelanger: So, I retract my retraction. I think the failure you're seeing for min-ready=2/max-servers=1 is valid (and we should change the config).
14:30 <Shrews> hrm... we might be able to make it work, though, by not requesting more than 1 server at a time for min-ready
14:33 <Shrews> that would also solve the "splitting up min-ready requests" issue
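
(For context, a minimal sketch of the idea under discussion: submit one single-node request per missing min-ready node instead of a single multi-node request, so a provider capped at max-servers=1 can still satisfy min-ready=2. Function and parameter names here are illustrative, not nodepool's actual API.)

    def create_min_ready_requests(label, min_ready, ready_count, submit_request):
        """Submit single-node requests until `label` has min_ready nodes."""
        for _ in range(max(0, min_ready - ready_count)):
            # Exactly one node per request; each request can be satisfied
            # (or fail) independently of the others.
            submit_request(node_types=[label])
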
14:39 *** jamielennox is now known as jamielennox|away
14:41 *** abregman has joined #zuul
15:18 *** openstackgerrit has joined #zuul
15:19 <Shrews> pabelanger: https://review.openstack.org/433127
15:20 <pabelanger> Shrews: will look shortly
15:21 *** saneax is now known as saneax-_-|AFK
15:22 <Shrews> pabelanger: no rush. solves the issue you found and fixes something jeblair wanted, too
15:22 <pabelanger> yay
15:43 <Shrews> grr, why did that fail
15:45 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Split up min-ready requests to 1 node per request  https://review.openstack.org/433127
15:49 <Shrews> oooh, it's a race
15:51 <isaacb> pabelanger: is there a way to temporarily disable a job in zuul without editing the layout file and reloading the zuul services?
16:00 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Split up min-ready requests to 1 node per request  https://review.openstack.org/433127
16:00 *** abregman has quit IRC
16:08 <pabelanger> isaacb: no, we don't have anything via the CLI today
16:08 <isaacb> pabelanger: ok, thanks
16:13 *** abregman has joined #zuul
16:27 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Split up min-ready requests to 1 node per request  https://review.openstack.org/433127
16:37 <pabelanger> Shrews: ya, I prefer the split approach over updating our config file.
16:37 <Shrews> pabelanger: there's a problem with it i have to track down
16:38 *** abregman is now known as abregman|afk
16:38 <pabelanger> k
16:42 *** isaacb has quit IRC
16:52 *** saneax-_-|AFK is now known as saneax
16:56 <jeblair> mordred: in 428798 i pointed out the error that's causing tests to fail; you want to update?
17:05 <mordred> jeblair: nope. I don't believe in tests
17:13 <mordred> jeblair: oh holy hell. thank you
17:13 <openstackgerrit> Monty Taylor proposed openstack-infra/zuul feature/zuulv3: Add action plugins to restrict untrusted execution  https://review.openstack.org/428798
17:13 *** abregman|afk has quit IRC
17:20 <jeblair> mordred: green!
17:23 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Split up min-ready requests to 1 node per request  https://review.openstack.org/433127
17:28 <Shrews> jeblair: fyi, before i forget (again), the integration test is failing b/c of provider authentication issues, if you're curious. That should be a simple .yaml change, yeah?
17:28 <jeblair> mordred: +2; who else should review it?
17:29 <jeblair> Shrews: i think so
17:29 <jeblair> Shrews: you want me to poke at it or do you want to?
17:29 <Shrews> k. i'll try to look at it again later. i tried adding a 'cloud' line to it last week, but that didn't work
17:31 <Shrews> jeblair: https://review.openstack.org/431727  ... not sure what the correct solution is
17:32 <jeblair> Shrews: it's intended to use the fakes, which are triggered by "auth-url: 'fake'" i believe
17:33 <jeblair> Shrews: have a link to a failing job?
17:33 <Shrews> jeblair: http://logs.openstack.org/27/433127/1/experimental/gate-zuul-nodepool/0a0f934/testr_results.html.gz
17:34 <Shrews> err, actual log: http://logs.openstack.org/27/433127/1/experimental/gate-zuul-nodepool/0a0f934/logs/logs/nodepool-launcher.log
17:34 <jeblair> Shrews: ah, yeah, so it's trying to use an actual provider, not a fake one
17:34 <mordred> jeblair: might be nice for jlk and/or SpamapS to take a peek
17:35 <jeblair> (i really need to add a test that tests that we don't break running the fake providers in a real daemon, though; i guess if we make this one voting, we will have one :)
17:37 * SpamapS rouses
17:37 <Shrews> pabelanger: ok, think i fixed the issue in 433127. we had a race on re-using existing READY nodes.
17:38 <jeblair> Shrews: the problem is that _updateProvider is instantiating a ProviderManager directly instead of using the helper function get_provider_manager
17:38 <Shrews> jeblair: ugh
17:38 <jeblair> Shrews: i will patch
17:39 <Shrews> jeblair: i love that there are 3 ways to instantiate a ProviderManager, btw
17:39 <pabelanger> Shrews: looking
17:39 <Shrews> still need to remove the static methods at some point
17:39 <openstackgerrit> James E. Blair proposed openstack-infra/nodepool feature/zuulv3: Use helper function to instantiate ProviderManager  https://review.openstack.org/433212
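
(An illustrative sketch of why the helper matters here: constructing ProviderManager directly bypasses the factory logic that can substitute a fake implementation for tests, e.g. when a provider is configured with auth-url: 'fake' as mentioned above. Names are simplified, not nodepool's exact code.)

    class ProviderManager(object):
        def __init__(self, provider):
            self.provider = provider

    class FakeProviderManager(ProviderManager):
        """Test double that never talks to a real cloud."""

    def get_provider_manager(provider):
        # The helper inspects the config and hands back the fake manager
        # for test providers; direct instantiation skips this check.
        if getattr(provider, 'auth_url', None) == 'fake':
            return FakeProviderManager(provider)
        return ProviderManager(provider)
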
17:41 <Shrews> jeblair: thx
17:41 <jeblair> Shrews: yep.  i'd be happy to see it simplified.  to be fair, there used to only be one -- the static method.  you're changing things, which is fine, but if you do that, you can expect the transition period to be confusing.  :)
17:41 <Shrews> to be fair, i am always confused
17:42 <jeblair> Shrews: i left 'check experimental' on that, we'll see if i'm confused :)
17:42 <Shrews> \o/
17:43 <pabelanger> Shrews: left a question on 433127.
17:44 <pabelanger> Shrews: jeblair: with nodepool, will we have 2 daemons or 3?
17:44 <pabelanger> nodepool, nodepool-builder, nodepool-launcher?
17:45 <jeblair> pabelanger: 2: launcher and builder
17:45 <Shrews> pabelanger: replied!
17:45 <pabelanger> jeblair: k
17:45 <jeblair> no more split daemon nonsense :)
17:45 <jeblair> (we can split by provider if needed for performance reasons)
17:45 <pabelanger> ya, wasn't sure if we needed a 3rd for scheduling
17:45 <pabelanger> but sounds like each launcher will have something
17:45 <jeblair> pabelanger: nope, they're self-sufficient
17:46 <pabelanger> very cool
17:51 <mordred> SpamapS, jlk: if you do look at https://review.openstack.org/428798 keep in mind roles aren't handled/managed yet, so are not covered in this
17:52 <mordred> jlk: do playbooks have an implicit role lookup path relative to playbook location?
17:57 <Shrews> jeblair: that seemed to work, though it will need https://review.openstack.org/431719 to pass now
18:02 <jeblair> Shrews: 719 depends on an abandoned change? 431727
18:03 <Shrews> jeblair: oops. lemme remove that depends-on
18:04 *** hashar has quit IRC
18:04 <openstackgerrit> David Shrewsbury proposed openstack-infra/zuul feature/zuulv3: Supply label name to Nodes  https://review.openstack.org/431719
18:05 <Shrews> jeblair: thx for pointing that out ^^^
18:08 <openstackgerrit> James E. Blair proposed openstack-infra/nodepool feature/zuulv3: Use helper function to instantiate ProviderManager  https://review.openstack.org/433212
18:08 <jeblair> Shrews: i approved your label change, but also made the helper function change depend on it; will recheck that now
18:15 <Shrews> jeblair: w00t! success
18:15 <jeblair> Shrews: yay!
18:16 <jeblair> i'm going through the outstanding nodepool stack now
18:16 <jeblair> Shrews: can you give your +1 or +2 to 431647?
18:17 <Shrews> jeblair: done
18:17 <openstackgerrit> Merged openstack-infra/zuul feature/zuulv3: Supply label name to Nodes  https://review.openstack.org/431719
18:20 <jeblair> pabelanger: do you want to review 433127?
18:20 <pabelanger> sure
18:21 <jeblair> pabelanger: that and 433212 are ready for your +3 if you want to give it
18:22 <pabelanger> +3
18:23 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Update nodepool 'list' command  https://review.openstack.org/431647
18:24 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Update nodepool hold to use zookeeper  https://review.openstack.org/431756
18:24 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Re-enable test_dib_image_pause / test_dib_image_upload_pause  https://review.openstack.org/432447
18:24 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Re-enable test_dib_image_delete test  https://review.openstack.org/432606
18:25 <Shrews> pabelanger: If you want to focus on test re-enablement, I can get to work on a node deleter thread
18:25 <pabelanger> Shrews: WFM
18:26 <Shrews> awesome. getting delete working will help avoid bash scripts at the ptg  :)
18:26 <jeblair> ++
18:26 <jeblair> spurious test failure?   http://logs.openstack.org/07/432607/1/gate/gate-nodepool-python27-ubuntu-xenial/b0b24dc/testr_results.html.gz
18:27 <pabelanger> ya, just looking at it
18:28 <jeblair> looks like maybe the provider is still running and creating requests while the test is shutting down and trying to delete the zookeeper tree?
18:29 <jlk> mordred: the implicit role path is the roles/ subdir from where the playbook is.
18:30 <jlk> mordred: beyond that, it's explicit via the config file, with a default of /etc/ansible/roles
18:30 <jeblair> jlk: are we talking roles or plugins?
18:30 <jlk> roles, but I may have misunderstood the question
18:31 <jlk> [10:24:49] <mordred> jlk: do playbooks have an implicit role lookup path relative to playbook location?
18:31 <mordred> jlk: k. thanks
18:31 <jeblair> oh hey i missed that, sorry :)
18:31 <mordred> when we add scrubbing of role content for safety
18:31 <jeblair> cool
18:31 <mordred> we should also apply that logic to anything found in a roles dir adjacent to the playbook when we find it
18:32 <mordred> sort of like how we're throwing an error if we find plugin dirs adjacent to the playbook
18:32 <jeblair> mordred: so we'll need to match "<playbook>/roles/plugins" or something?
18:32 <mordred> jeblair: yah.
18:32 <jeblair> mordred: because we *do* want to allow roles, right?  just not roles with plugins?
18:32 <mordred> right
18:32 <jeblair> k
18:33 <mordred> jeblair: and to be fair, we do want to allow roles with plugins if they are in trusted repos
18:34 <jeblair> right
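
(A rough sketch of the check being described: reject untrusted playbooks that carry Ansible plugin directories, whether next to the playbook itself or inside an adjacent roles/ tree. The plugin directory names follow Ansible's conventions; the surrounding logic is hypothetical.)

    import os

    PLUGIN_DIRS = ('action_plugins', 'library', 'lookup_plugins',
                   'filter_plugins', 'callback_plugins')

    def check_untrusted_playbook(playbook_path):
        """Raise if plugin dirs appear beside the playbook or in its roles."""
        base = os.path.dirname(playbook_path)
        suspects = [os.path.join(base, d) for d in PLUGIN_DIRS]
        roles_dir = os.path.join(base, 'roles')
        if os.path.isdir(roles_dir):
            # Apply the same scrubbing to every role adjacent to the playbook.
            for role in os.listdir(roles_dir):
                suspects.extend(os.path.join(roles_dir, role, d)
                                for d in PLUGIN_DIRS)
        for path in suspects:
            if os.path.isdir(path):
                raise RuntimeError("untrusted plugin dir found: %s" % path)
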
18:35 <jlk> this is all just around automatically running a playbook in-repo, right?
18:35 <jlk> Totally avoided if the "test" calls ansible from one of the nodepool nodes?
18:36 <jeblair> jlk: correct, but that's not the way we want most jobs to be written.  i can imagine that tests *of* ansible would want to be written that way; however, something's gone wrong if that's required in a more general case.
18:37 <jlk> right, I was thinking of the use case where the code being tested is ansible code, which would require custom plugins or modules.
18:37 <Shrews> jeblair: pabelanger: I think jeblair is correct. Since we don't wait on the NodeLaunch threads, they could still access ZK.
18:38 <Shrews> during shutdown, that is
18:38 <jeblair> jlk: yeah, in that case, if it isn't in a secure repo, it would need to do what you say.
18:40 <jeblair> Shrews: it looks like that test created another min-ready request, so the issue may be that we don't wait for the main loop to stop
18:41 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_net_name test  https://review.openstack.org/433227
18:42 <openstackgerrit> James E. Blair proposed openstack-infra/nodepool feature/zuulv3: Wait for main loop when stopping  https://review.openstack.org/433230
18:42 <jeblair> Shrews: ^
18:43 <jeblair> actually, lemme rebase that
18:43 <openstackgerrit> James E. Blair proposed openstack-infra/nodepool feature/zuulv3: Wait for main loop when stopping  https://review.openstack.org/433230
18:43 <jeblair> there
18:44 <jeblair> [alternatively, we could put that join in the test cleanup (but we'd also have to make sure to do so in any other callers)]
18:45 <Shrews> nah, that looks good to me
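
(The fix in 433230 amounts to the standard join-on-stop pattern; a generic sketch, with class and attribute names that are illustrative rather than nodepool's:)

    import threading

    class Launcher(threading.Thread):
        def __init__(self):
            super(Launcher, self).__init__()
            self._stop_event = threading.Event()

        def run(self):
            while not self._stop_event.is_set():
                # ... main loop body: handle requests, etc. ...
                self._stop_event.wait(1)

        def stop(self):
            self._stop_event.set()
            # Wait for the loop to actually exit so callers (like test
            # teardown) don't race it for shared resources such as ZK.
            self.join()
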
18:47 <pabelanger> Shrews: what do you think will be the best way to validate that a node is not launched? That is the current thing we are missing from waitForNodes()
18:47 <pabelanger> I need to assert we don't launch a disabled label
18:47 <pabelanger> and don't have a good way to do that ATM
18:48 <mordred> jeblair, jlk: yah - that's the use-case where I think we'll want to provide some sort of syntactic sugar so that the person in question can write things the 'normal' way but have things executed on a remote node - but I'm pretty sure I don't know exactly what that sugar looks like yet
18:49 <jlk> agreed, will take some use casing
18:49 <Shrews> pabelanger: the request for that node fails?
18:49 <jlk> mordred: jeblair: to that end, we're testing hoist, our Ansible to install zuul/nodepool et al, in Zuul 2.5, with the ansible launcher, using nodepool subnodes. So we'll have SOME experience with how it kind of works
18:50 <Shrews> pabelanger: oh, disabled label
18:51 <jlk> mordred: looking at your change, I'm wondering if we can hack into the various strategies, or use some plugin, to maintain a whitelist or blacklist of modules. Instead of having to create symlinks to a blacklisted module, maybe we can capture which module is in use and reject/approve accordingly?
18:51 <jlk> might be a lot easier to maintain than a pile of symlinks that has to be examined each new Ansible release for possible bad actors.
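
(jlk's idea, sketched generically. Ansible has no ready-made hook with this exact shape, so assume some interception point that sees each task's module name before execution; the module list is purely illustrative.)

    ALLOWED_MODULES = frozenset(['command', 'copy', 'synchronize'])

    def check_module_allowed(module_name):
        """Reject any module not on the explicit whitelist.

        One list to update per Ansible release, instead of auditing a
        pile of symlinks to a blacklisted stand-in module.
        """
        if module_name not in ALLOWED_MODULES:
            raise RuntimeError(
                "module %r is not permitted on the launcher" % module_name)
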
18:52 <pabelanger> Shrews: ya, disabled
18:52 <jlk> also this feels like an excellent topic to speak about at AnsibleFest in London :D
18:52 <pabelanger> test_disabled_label for example
18:52 <Shrews> pabelanger: hrm... iterate through all requests and make sure there are none, or at least none with that label in request.node_types. depends on the test
18:53 <mordred> jlk: ++
18:53 <mordred> jlk: this is definitely the sledgehammer approach atm
18:54 *** saneax is now known as saneax-_-|AFK
18:57 <Shrews> pabelanger: gee, there's really no way to wait for a label that's disabled.
18:57 <Shrews> not sure of the best solution there
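
(One way to write the negative assertion Shrews describes, scanning the outstanding requests. The zk.getNodeRequests() helper is an assumption about the test API, and note the caveat from the log: absence now doesn't prove a request won't appear later, since there's nothing to wait on for a disabled label.)

    def assert_label_not_requested(zk, label):
        """Fail if any outstanding node request asks for `label`."""
        for req in zk.getNodeRequests():  # assumed test/ZK helper
            assert label not in req.node_types, (
                "request %s unexpectedly asks for disabled label %s"
                % (req.id, label))
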
18:58 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_net_name test  https://review.openstack.org/433227
18:58 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_vhd_image test  https://review.openstack.org/433233
19:14 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_dib_upload_fail test  https://review.openstack.org/433235
19:17 <SpamapS> I have a name for our locked down Ansible:  HOSTansIbLE
19:17 <SpamapS> Or if you want to veil it a bit.. HostAnsible
19:26 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Wait for main loop when stopping  https://review.openstack.org/433230
19:31 <jlk> hah
19:31 <jlk> Ansiblocked
19:32 *** hashar has joined #zuul
19:36 <jeblair> nice :)
19:48 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Set Node image_id and launcher attributes  https://review.openstack.org/433242
19:49 <Shrews> jeblair: I'm assuming you wanted Node.image_id to be the external image ID there ^^^ and not a ZK id
19:54 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Re-enable test_node test  https://review.openstack.org/432607
20:02 <hashar> good morning!
20:03 <hashar> do you still accept minor changes to the Zuul master branch or is that closed / deprecated in favor of Zuul v3?
20:03 <hashar> I have a few tiny patches pending which could stand to be landed.  I have them cherry-picked on my instance but feel they might benefit others :D
20:18 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Split up min-ready requests to 1 node per request  https://review.openstack.org/433127
20:18 <openstackgerrit> Merged openstack-infra/nodepool feature/zuulv3: Use helper function to instantiate ProviderManager  https://review.openstack.org/433212
20:27 <mordred> hashar: we've landed a few bugfixes/small changes to master recently
20:27 <mordred> hashar: it sort of depends on which bits they touch
20:27 <hashar> mordred: yes, I noticed, hence my question :]
20:27 <hashar> maybe I should rebase mine and list them somewhere
20:28 <hashar> then you guys can triage them and accept/deny them
20:28 <hashar> and if denied I will just abandon them
20:28 <mordred> hashar: I think that sounds like a great plan
20:31 <hashar> mordred: should i use the openstack-infra list for that or is zuul blessed with a mailing list nowadays? :]
20:31 <mordred> hashar: nope - still on the infra list :)
20:32 <hashar> :)
20:32 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add generator API method for node iteration  https://review.openstack.org/433252
20:44 <Shrews> jeblair: Current node deletion seems to launch a thread per node-to-delete. Given your concerns about thread contention, should we still follow that model?
20:52 <jeblair> Shrews: i think we can probably go ahead and drop that --
20:52 <jeblair> Shrews: the reason it's a thread is due to all the things it (historically) needs to do -- each of which is asynchronous and could fail (much like launching)
20:53 <jeblair> Shrews: well, hrm, that may not be right...
20:54 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add generator API method for node iteration  https://review.openstack.org/433252
20:54 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Set Node image_id and launcher attributes  https://review.openstack.org/433242
20:54 <jeblair> the steps are: remove from jenkins (or revoke from zuul v2.5), clean up subnodes, and then actually delete the node
20:54 <Shrews> pabelanger: added the requested testing in 433242
20:54 <jeblair> in the new world, we're just left with 'delete the node', which is a shade call
20:54 <jeblair> but that shade call is, i think, a blocking call which itself has lots of individual tasks
20:55 <pabelanger> Shrews: thanks, looking
20:55 <jeblair> Shrews: so it may need to remain a thread for now, until we find a way to rework that.
20:55 <jeblair> (tasks like floating ip cleanup, etc)
20:57 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable working test_builder.py tests  https://review.openstack.org/433262
20:57 <jeblair> Shrews: re the earlier question -- i think the image_id was intended to be the zk path
20:57 <jeblair> Shrews:   image_id: /nodepool/image/ubuntu-trusty/builds/123/provider/rax/images/456
20:57 <Shrews> jeblair: ok. right now, i'm thinking a new child thread of the main NodePool thread that periodically iterates through the nodes (with a long'ish sleep period), which itself spawns the deleter threads
20:58 <Shrews> jeblair: ack on image_id
20:58 <jeblair> Shrews: (i think the idea there was to give us an easy way to say "this node was launched from this image / build")
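
(Reading the example path above literally, the node's image_id would be assembled from the build and upload records, roughly like this; the field names are assumptions:)

    def make_image_id(image_name, build_id, provider_name, upload_id):
        # Mirrors the example quoted above:
        # /nodepool/image/ubuntu-trusty/builds/123/provider/rax/images/456
        return '/nodepool/image/%s/builds/%s/provider/%s/images/%s' % (
            image_name, build_id, provider_name, upload_id)
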
20:59 <jeblair> Shrews: that sounds like it would work, but i think we'll want deletes to happen quickly.  to accomplish that, we may either want a short sleep or to start using watches...
21:00 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_diskimage_build_only test  https://review.openstack.org/433265
21:00 <jeblair> Shrews: (deletes happen nearly instantaneously now, and it's important for our throughput)
21:01 <jeblair> Shrews: (setting a watch on every node that is allocated might be an approach)
21:01 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_image_upload_fail test  https://review.openstack.org/433270
21:02 <Shrews> jeblair: ok. i was mostly concerned with querying ALL nodes too often (and the load that might cause), but that seems unavoidable, as watches will not work on things like abnormally terminated launches
21:02 *** jamielennox|away is now known as jamielennox
21:02 <Shrews> i suppose we could mix the two approaches
21:03 <jeblair> Shrews: agreed -- we would probably need to do long poll + watches or short poll.
21:03 <Shrews> Hrm, not likely to get this done this week w/o major effort. Ugh
21:03 <jeblair> Shrews: good news is, polling is a good starting point and we can build on it :)
21:03 <Shrews> yeah. maybe i can get that part done
21:04 <jeblair> Shrews: yeah, and we can set a short poll interval and see how it works, and add watches and reduce the polling later
21:04 <jeblair> if needed
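
(A bare-bones sketch of the agreed starting point: a short-interval polling thread that spawns one deleter thread per node, leaning on the node-iteration generator proposed in 433252. All names are hypothetical; ZK watches could later replace or lengthen the poll.)

    import threading
    import time

    def delete_node(node):
        """Placeholder for the blocking shade call (server, FIPs, etc.)."""

    class DeletedNodeWorker(threading.Thread):
        def __init__(self, zk, interval=5):  # short poll, per the discussion
            super(DeletedNodeWorker, self).__init__()
            self.zk = zk
            self.interval = interval
            self._running = True

        def run(self):
            while self._running:
                for node in self.zk.nodeIterator():  # assumed generator API
                    if node.state == 'deleting':
                        # One thread per node: the cloud-side delete blocks
                        # on many sub-tasks, much like launching does.
                        threading.Thread(target=delete_node,
                                         args=(node,)).start()
                time.sleep(self.interval)

        def stop(self):
            self._running = False
            self.join()
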
21:09 <Shrews> jeblair: one last thing on image_id... the full path is a hidden detail for our ImageUpload model. How about "<image>-<build_id>-<provider>-<upload_id>"?
21:09 <Shrews> or some variant
21:10 <Shrews> meh, i could access the "private" _imageUploadPath() ZK API method to build it, I guess
21:13 <jeblair> Shrews: i don't have strong feelings on that (i figured it's in the zk database so it didn't need to be hidden, and we would present something more useful to the user in CLI utilities).  i think really this is another point in favor of making globally unique images rather than the deep tree approach [this is something i think i'd like to change in a later revision].  fortunately, it's the only thing like this.
21:14 <Shrews> guid++
21:14 <jeblair> we said we were going to learn things.  we did.  :)
21:21 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Set Node image_id and launcher attributes  https://review.openstack.org/433242
21:21 <openstackgerrit> David Shrewsbury proposed openstack-infra/nodepool feature/zuulv3: Add generator API method for node iteration  https://review.openstack.org/433252
21:23 <Shrews> ok, i think image_id is correct now in 433242. going to take a quick walk before the meeting
21:24 *** herlo_ is now known as herlo
21:24 *** herlo has quit IRC
21:24 *** herlo has joined #zuul
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_diskimage_build_only test  https://review.openstack.org/433265
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_dib_upload_fail test  https://review.openstack.org/433235
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_vhd_image test  https://review.openstack.org/433233
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_image_upload_fail test  https://review.openstack.org/433270
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable test_node_net_name test  https://review.openstack.org/433227
21:33 <openstackgerrit> Paul Belanger proposed openstack-infra/nodepool feature/zuulv3: Re-enable working test_builder.py tests  https://review.openstack.org/433262
21:48 <jhesketh> Morning
22:00 <jeblair> it's meeting time in #openstack-meeting-alt
22:06 *** hashar has quit IRC
22:34 *** lespaul has joined #zuul
23:02 *** saneax-_-|AFK is now known as saneax
23:15 *** jamielennox is now known as jamielennox|away
23:20 *** jamielennox|away is now known as jamielennox
23:34 *** lespaul has quit IRC
23:40 <jhesketh> jeblair: so, thinking about the variants and whatnot in v3 (from your email), have you given any thought to storing branch variants in the branches themselves? It'd be very complicated because of the way jobs sprawl across repos, but it'd allow changes to jobs to be tested when proposed
23:40 <jhesketh> I think we may have chatted about this before
23:40 * jhesketh 's memory is poor
23:41 <jhesketh> I suppose for testing branch changes you just need to use a depends-on, but I'm not sure if that'd work in the same repo but across branches
23:42 <jhesketh> the other thing is that stable branches will still contain a .zuul.yaml file. Perhaps we need a policy of cleaning them up if they are otherwise completely ignored
