tristanC | johanssone: i'm no pip expert, though you can grab the web stuff from http://tarballs.openstack.org/zuul/zuul-content-latest.tar.gz | 00:36 |
*** yolanda has joined #zuul | 03:44 | |
*** yolanda_ has quit IRC | 03:46 | |
*** swest has joined #zuul | 04:35 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Change test prints to log.info https://review.openstack.org/554058 | 05:38 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix logging in tests to be quiet when expected https://review.openstack.org/554054 | 05:38 |
*** Rohaan has joined #zuul | 05:39 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix logging in tests to be quiet when expected https://review.openstack.org/554054 | 05:44 |
tobiash | mordred: I took over your logging changes, hope that's ok | 05:44 |
*** hashar has joined #zuul | 05:52 | |
*** AJaeger has quit IRC | 05:54 | |
*** AJaeger has joined #zuul | 05:58 | |
openstackgerrit | Merged openstack-infra/zuul master: Add timestamps to multiline debug message logs https://review.openstack.org/572845 | 06:06 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix logging in tests to be quiet when expected https://review.openstack.org/554054 | 06:22 |
*** pcaruana has joined #zuul | 06:26 | |
Rohaan | tristanC: ping | 06:45 |
tristanC | Rohaan: Hey! | 06:48 |
Rohaan | tristanC: I was trying to install rh-python35-python-openshift, but yum complains : No package rh-python35-python-openshift available. Which repository needs to be enabled for this package? | 06:55 |
*** bhavik1 has joined #zuul | 06:59 | |
tristanC | Rohaan: it's only in sf-master, it will be part of the next release. you can try it out using this repo: http://softwarefactory-project.io/repos/sf-release-master.rpm | 07:01 |
tristanC | Rohaan: or you can install using "pip install openshift" as root | 07:01 |
* Rohaan checks | 07:05 | |
*** bhavik1 has quit IRC | 07:11 | |
Rohaan | tristanC: Did you try this with a local openshift instance or some hosted openshift instance? | 07:49 |
*** jpena|off is now known as jpena | 07:50 | |
tristanC | Rohaan: only with a dedicated all-in-one "oc cluster up" setup | 07:50 |
Rohaan | I tried with an Openshift instance which we get by subscribing to Openshift.io, but it seems we don't have access to create namespaces there :( | 07:52 |
Rohaan | Now I'm trying with a local openshift cluster on my machine. | 07:52 |
tristanC | argh, yes, this proposed implementation needs to be able to create projects and use the kubectl ansible connection plugin to run tasks on the pods | 07:53 |
Rohaan | I'm running sfactory in a centos instance in Virtualbox | 07:53 |
tristanC | well i wrote another driver that creates pods in an existing project, but that's not part of the current container spec implementation in zuul; fwiw this driver is: https://review.openstack.org/#/c/535557/ | 07:54 |
tristanC | Rohaan: you can run oc cluster up on the zuul host or on a separate instance. I've tried both the fedora origin package and the upstream release binary on centos too | 07:55 |
Rohaan | Oh, would it be okay if I do that on the Zuul host? | 07:56 |
Rohaan | lemme try | 07:56 |
tristanC | yes sure, just grab the server binary from https://github.com/openshift/origin/releases and run it alongside zuul | 07:57 |
rcarrillocruz | tristanC: the creation of the openshift node for a given job will be in its project. How is that creation done: is it a template where you just plumb in the node type from nodepool, or is it created within python code? I assume the latter... | 08:06 |
rcarrillocruz | asking cos...it would be VERY interesting if the driver could allow pushing a $deployment_template, thinking of networking use cases where we spin up a topology per a given job | 08:07 |
*** Wei_Liu has joined #zuul | 08:09 | |
tristanC | rcarrillocruz: afaiu, it's better to create a new project for each PR so that they each get an isolated image registry | 08:11 |
tristanC | rcarrillocruz: the current driver i propose creates the project and service account using the python sdk (see the sketch below) here: https://review.openstack.org/#/c/570667/3/nodepool/driver/openshift/provider.py L88 | 08:11 |
rcarrillocruz | agreed, not saying the opposite. What I'm asking is whether the driver is, I assume, just addressing the creation of the PR node | 08:11 |
rcarrillocruz | in terms of how you are developing, wondering if you create with python or you create a YAML template and then push with the kubectl/openshift plugin | 08:12 |
rcarrillocruz | so python | 08:12 |
rcarrillocruz | gotcha | 08:12 |
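For illustration, creating a project with the python sdk looks roughly like this — a hedged sketch using the openshift dynamic client; the project name is hypothetical, and the real code is at the review tristanC links:

```python
# Sketch only -- the actual driver code is at review 570667. Assumes the
# "openshift" Python SDK and a valid ~/.kube/config for the nodepool user.
from kubernetes import config
from openshift.dynamic import DynamicClient

k8s_client = config.new_client_from_config()   # reads ~/.kube/config
dyn = DynamicClient(k8s_client)

# ProjectRequest is the OpenShift resource used to create a new project.
project_requests = dyn.resources.get(
    api_version='project.openshift.io/v1', kind='ProjectRequest')
project_requests.create(body={
    'apiVersion': 'project.openshift.io/v1',
    'kind': 'ProjectRequest',
    'metadata': {'name': 'zuul-pr-42'},        # hypothetical per-PR name
})
```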
tristanC | rcarrillocruz: then the job starts with access to this namespace; the initial implementation just builds an image and starts a single pod with a sleep command (sketched below): https://review.openstack.org/#/c/570669/2/playbooks/openshift/deploy-project.yaml | 08:13 |
rcarrillocruz | also tristanC , i've started poking at kubevirt and it works, i can get VMs on openshift | 08:13 |
rcarrillocruz | it's VERY intriguing for zuul purposes :-) | 08:13 |
rcarrillocruz | getting to a world where you can have zuul in openshift running containers when possible and VMs when not possible is very interesting | 08:14 |
tristanC | but this openshift-base job could easily be extended to use a more complex deployment recipe, e.g. to deploy a cross-project environment | 08:14 |
tristanC | ultimately, here's what a job to test a container deployment looks like: https://softwarefactory-project.io/draft/zuul-openshift/#orgheadline10 | 08:15 |
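Going by that description, the deploy playbook is on the order of the sketch below; the oc invocations and names are assumptions, and the real tasks are in review 570669:

```yaml
# Hedged sketch: build an image from the checked-out source and start a
# single pod that idles so later playbooks can run tasks on it.
- hosts: localhost
  tasks:
    - name: Build the speculative image from the job workspace
      command: oc start-build demo --from-dir={{ zuul.project.src_dir }}

    - name: Start one pod that sleeps until the job attaches to it
      command: oc run demo --image=demo --restart=Never --command -- sleep infinity
```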
*** electrofelix has joined #zuul | 08:16 | |
tristanC | rcarrillocruz: it seems like kubevirt is another openshift use case that may need another nodepool driver. Does it set up floating ip and ssh access, presumably set up with cloud-init? | 08:18 |
rcarrillocruz | yeah, if you look at the kubevirt demo repo, they spin up a cirros pod by attaching a volume to it for cloud-init consumption | 08:19 |
rcarrillocruz | https://raw.githubusercontent.com/kubevirt/demo/master/manifests/vm.yaml | 08:19 |
tristanC | that's neat :-) | 08:21 |
*** gtema has joined #zuul | 08:28 | |
rcarrillocruz | mordred , corvus : ^ , openshift *can* run VMs | 08:34 |
rcarrillocruz | i tested it over the weekend | 08:34 |
Rohaan | tristanC: Hey, I've installed and logged into openshift. Is there some way to force nodepool to create an empty openshift project again? I was at step 3.1 with the previous openshift instance | 08:51 |
Rohaan | I've tried restarting nodepool/zuul services but that doesn't seem to create it | 08:52 |
tristanC | Rohaan: unfortunately, you can't swap out a nodepool provider like that; you need to first set it to max-servers: 0, then remove it... | 08:56 |
tristanC | Rohaan: since it's a test setup, I think you can "systemctl stop zookeeper; rm -Rf /var/lib/zookeeper/version-2/*; systemctl start zookeeper" | 08:57 |
tristanC | that should purge nodepool database | 08:57 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Accumulate errors in context managers - part 2 https://review.openstack.org/574028 | 09:01 |
Rohaan | tristanC: I restarted zookeeper, but now nodepool list returns nothing :( | 09:03 |
tristanC | Rohaan: perhaps restart the nodepool service, or does the openshift-project label have a min-ready value? | 09:05 |
tristanC | (systemctl restart rh-python35-nodepool-launcher) | 09:05 |
Rohaan | yes, I restarted nodepool also | 09:06 |
Rohaan | How can I check if openshift-project label has a min-ready value? | 09:07 |
tristanC | Rohaan: could you share /var/log/nodepool/launcher.log ? | 09:07 |
tristanC | and /etc/nodepool/nodepool.yaml , label definition is in this file | 09:08 |
Rohaan | nodepool.yaml: https://pastebin.com/vSzdLt1k | 09:10 |
Rohaan | nodepool launcher.log : https://pastebin.com/3B0ijhng | 09:10 |
tristanC | Rohaan: oh, you are missing the openshift provider, you need to add https://softwarefactory-project.io/draft/zuul-openshift/#orgheadline9 to /etc/nodepool/nodepool.yaml | 09:12 |
tristanC | no need to restart nodepool, it auto reloads the configuration when it changes | 09:13 |
*** fbo has joined #zuul | 09:14 | |
tristanC | Rohaan: make sure to follow all the steps of that draft zuul-openshift documentation | 09:15 |
Rohaan | ah, sorry | 09:16 |
Rohaan | I had added that in the /root/config/nodepool/openshift.yaml file | 09:17 |
tristanC | Rohaan: then you need to git commit && git push that file to trigger the config-update that will copy the content into /etc/nodepool/nodepool.yaml | 09:18 |
Rohaan | tristanC: Have I added it correctly? https://pastebin.com/8WnhfXmF | 09:22 |
tristanC | Rohaan: that should work, though next time you'd better merge the "providers" and "labels" lists so as not to confuse the yaml parser | 09:24 |
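For reference, the merged shape tristanC suggests looks roughly like this; the entries are illustrative, and the actual openshift provider fields come from the draft doc linked above:

```yaml
# Hedged sketch of a single merged nodepool.yaml: one "labels" list and
# one "providers" list, instead of two separate YAML documents. The
# pre-existing entries and the openshift fields here are illustrative.
labels:
  - name: existing-label        # whatever labels were already defined
  - name: openshift-project
providers:
  - name: existing-provider     # whatever providers were already defined
    driver: openstack
  - name: openshift
    driver: openshift
```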
Rohaan | Now the previous nodepool providers are back, but I still don't see the openshift project :( | 09:29 |
tristanC | Rohaan: could you paste new launcher.log ? | 09:31 |
Rohaan | https://pastebin.com/b52g0sKV | 09:32 |
tristanC | Rohaan: you need to set up nodepool's .kube/config: sudo -u nodepool oc login -u developer https://localhost:8443 | 09:34 |
Rohaan | I remember I did that. but let me try again | 09:35 |
Rohaan | Hm, .kube/config just got created. Earlier I could log in successfully, but I used to get a small "stat .: permission denied" error/warning | 09:40 |
tristanC | Rohaan: hum, perhaps the sudo needs to be run from a public directory, not from /root | 09:41 |
Rohaan | tristanC: Yay, I can now see openshift provider https://pastebin.com/Pcu7Rj54 | 09:47 |
tristanC | Rohaan: nice! | 09:47 |
Rohaan | tristanC: Hey, have I changed .zuul.yaml in the demo project correctly? https://pastebin.com/dpcXLAj5 . Somehow I can't see the job status in the zuul console after doing git review | 10:05 |
tristanC | Rohaan: I think you need to remove the initial "- project" statement and only keep the one that sets the openshift-test job | 10:07 |
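Concretely, the remaining stanza would look something like this — a minimal sketch, with the job name taken from later in this conversation:

```yaml
# Hedged sketch of the corrected .zuul.yaml: one project stanza that
# only adds the openshift-test job to the check pipeline.
- project:
    check:
      jobs:
        - openshift-test
```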
Rohaan | Do I need to create a patch again? Or would amending and doing git review again trigger it? | 10:10 |
Rohaan | Right now I'm just amending it and pushing it again | 10:11 |
tristanC | both actions would work, though it sounds better to amend | 10:12 |
Rohaan | Somehow it's not coming up. Do you know where zuul stores its logs? | 10:15 |
Rohaan | https://ibin.co/44uRhpaq0ztW.png | 10:17 |
Rohaan | tristanC: Any idea? | 10:21 |
tristanC | Rohaan: clicking the "toggle ci" button at the bottom should show zuul configuration errors in gerrit comments | 10:21 |
tristanC | Rohaan: otherwise you need to look at /var/log/zuul/scheduler.log | 10:22 |
Rohaan | tristanC: Thanks. I see it's complaining about base-openshift not being defined https://pastebin.com/jyVEhrXX | 10:24 |
tristanC | Rohaan: you need to copy those files into /root/config/playbooks: https://review.openstack.org/#/c/570669/ | 10:25 |
tristanC | and define the openshift-base job in /root/config/zuul.d/openshift.yaml using this: https://review.openstack.org/#/c/570669/2/zuul.yaml | 10:25 |
tristanC | then commit and push those | 10:26 |
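In outline, the job definition copied into /root/config/zuul.d/openshift.yaml would look roughly like this — a sketch based on the discussion; the authoritative version is in review 570669, and the label name comes from the nodepool config above:

```yaml
# Hedged sketch of the openshift-base job definition, not the review's
# exact content.
- job:
    name: openshift-base
    parent: null
    run: playbooks/openshift/deploy-project.yaml
    nodeset:
      nodes:
        - name: project
          label: openshift-project
```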
*** Wei_Liu has quit IRC | 10:31 | |
Rohaan | tristanC: I found the mistake: in https://review.openstack.org/#/c/570669/2/zuul.yaml the job name is openshift-base, but in the doc we write base-openshift. Now it seems to be coming up | 10:35 |
*** Wei_Liu has joined #zuul | 10:35 | |
Rohaan | but it fails with: openshift-test finger://dhcppc4:7979/7e3edec464a942a9be90d17879af6fd6 : RETRY_LIMIT in 1s | 10:36 |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Fix signature of overridden methods in LogStreamHandler https://review.openstack.org/574204 | 10:44 |
tristanC | Rohaan: to debug RETRY_LIMIT, change the zuul handler to DEBUG in /etc/zuul/executor-logging.yaml; then "systemctl restart rh-python35-zuul-executor"; | 10:45 |
tristanC | Rohaan: then run "/opt/rh/rh-python35/root/usr/bin/zuul-executor verbose" | 10:46 |
tristanC | Rohaan: that will make the executor write ansible-playbook (the job execution) output to /var/log/zuul/executor.log | 10:46 |
tristanC | Rohaan: you can comment "recheck" on the code review to retrigger the job | 10:46 |
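The executor-logging.yaml change amounts to raising the zuul logger to DEBUG; a sketch of the relevant section, assuming the standard Python dictConfig layout (handler names in the installed file may differ):

```yaml
# Hedged sketch of /etc/zuul/executor-logging.yaml after the change:
# only the "level" line changes, from INFO to DEBUG.
loggers:
  zuul:
    handlers:
      - normal        # assumed handler name
    level: DEBUG
    propagate: false
```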
Rohaan | tristanC: Thanks. Getting some python import errors there https://pastebin.com/QqyuHSwr | 10:52 |
tristanC | Rohaan: oh, after you installed the "sf-release-master.rpm", did you run "yum update -y" ? | 10:54 |
Rohaan | oops | 10:54 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add a config_errors info to {tenant}/config-errors endpoint https://review.openstack.org/553873 | 10:56 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add .stestr to .gitignore https://review.openstack.org/574213 | 11:00 |
Rohaan | tristanC: After doing `yum update -y` do I need to restart nodepool also? | 11:02 |
tristanC | Rohaan: I guess mostly ansible got upgraded to version 2.5; then you can just recheck the job without restarting the zuul-executor | 11:04 |
tristanC | if zuul got upgraded, then you need to re-apply the tiny patch that supports openshift resources, it's: https://review.openstack.org/#/c/570668/ | 11:05 |
*** GonZo2000 has joined #zuul | 11:08 | |
*** jpena is now known as jpena|lunch | 11:19 | |
openstackgerrit | Tobias Henkel proposed openstack-infra/zuul master: Switch content type of public keys back to text/plain https://review.openstack.org/574220 | 11:23 |
*** Rohaan has quit IRC | 11:36 | |
*** Rohaan has joined #zuul | 11:50 | |
*** jpena|lunch is now known as jpena | 12:03 | |
*** Rohaan has quit IRC | 12:03 | |
*** rlandy has joined #zuul | 12:17 | |
*** rlandy is now known as rlandy|rover | 12:18 | |
*** pcaruana has quit IRC | 12:39 | |
*** odyssey4me has quit IRC | 12:39 | |
*** odyssey4me has joined #zuul | 12:39 | |
*** Wei_Liu has quit IRC | 12:45 | |
*** Wei_Liu has joined #zuul | 12:45 | |
*** myoung|off is now known as myoung | 13:19 | |
*** pcaruana has joined #zuul | 13:32 | |
openstackgerrit | Merged openstack-infra/nodepool master: Fail quickly for disabled provider pools https://review.openstack.org/573762 | 13:35 |
openstackgerrit | Monty Taylor proposed openstack-infra/zuul master: Hide queue headers for empty queues when filtering https://review.openstack.org/572588 | 13:36 |
*** GonZo2000 has quit IRC | 13:38 | |
openstackgerrit | Merged openstack-infra/nodepool master: Fix 'satisfy' spelling errors https://review.openstack.org/573823 | 13:42 |
Shrews | mordred: thx | 13:44 |
openstackgerrit | Merged openstack-infra/zuul master: Add .stestr to .gitignore https://review.openstack.org/574213 | 13:46 |
openstackgerrit | Merged openstack-infra/zuul master: github: Optimize getPullReviews() a bit https://review.openstack.org/570077 | 13:46 |
*** gtema has quit IRC | 13:50 | |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add tenant yaml validation option to scheduler https://review.openstack.org/574265 | 13:52 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add tenant yaml validation option to scheduler https://review.openstack.org/574265 | 13:55 |
openstackgerrit | Fabien Boucher proposed openstack-infra/zuul master: Add tenant yaml validation option to scheduler https://review.openstack.org/574265 | 13:56 |
*** ianychoi has quit IRC | 14:02 | |
*** needssleep is now known as TheJulia | 14:15 | |
openstackgerrit | Merged openstack-infra/nodepool master: Correctly use connection-port in static driver https://review.openstack.org/569339 | 14:22 |
corvus | i need to run an errand this morning that could take a while; i hope to be back around lunch but it's hard to say. | 14:25 |
pabelanger | fetch-zuul-cloner seems to only work with a single zuul connection; with multiple it will fail | 14:46 |
openstackgerrit | Merged openstack-infra/zuul master: Make file matchers overridable https://review.openstack.org/571745 | 14:46 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Consume Task and TaskManager from openstacksdk https://review.openstack.org/414759 | 14:54 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Switch to RateLimitingTaskManager base class https://review.openstack.org/574287 | 14:57 |
mordred | Shrews, corvus: ^^ those two patches, along with "Import rate limiting TaskManager from nodepool https://review.openstack.org/574285" - should be basically a no-op/cleanup from a nodepool perspective, but should help people over in sdk land see the whole picture | 15:00 |
mordred | I've got at least 2 people who have expressed interest in improving the caching story over there, but if you don't understand the nodepool task manager you really can't touch the shade/sdk caching code | 15:01 |
mordred | I broke it up into two patches so that we could see the is-a-thread to has-a-thread change in place; that first patch should basically work today as-is | 15:03 |
Shrews | mordred: ++ | 15:05 |
Shrews | mordred: why does 574285 depend on 414759? | 15:09 |
Shrews | seems backwards, but i haven't looked too closely yet | 15:10 |
mordred | Shrews: just so we can make sure that the has-a-thread rework is good in nodepool before importing it into sdk | 15:12 |
mordred | the patches don't *actually* depend on each other - but if nodepool reviewers find a flaw in 414759, I don't want the flawed code landing in sdk-land | 15:13 |
Shrews | k | 15:13 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Consume Task and TaskManager from openstacksdk https://review.openstack.org/414759 | 15:21 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Switch to RateLimitingTaskManager base class https://review.openstack.org/574287 | 15:21 |
openstackgerrit | Merged openstack-infra/nodepool master: Log connection port in static driver on timeout https://review.openstack.org/569334 | 15:39 |
openstackgerrit | Monty Taylor proposed openstack-infra/nodepool master: Use openstacksdk instead of os-client-config https://review.openstack.org/566158 | 15:42 |
*** electrofelix has quit IRC | 15:53 | |
*** dtruong_ has quit IRC | 16:05 | |
*** GonZo2000 has joined #zuul | 16:30 | |
openstackgerrit | qingszhao proposed openstack-infra/nodepool master: fix tox python3 overrides https://review.openstack.org/574344 | 17:03 |
*** jpena is now known as jpena|off | 17:13 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: allow build-python-release consumers to override python version https://review.openstack.org/574373 | 18:17 |
*** dtruong has joined #zuul | 18:31 | |
clarkb | tristanC: I wanted to followup on the nodepool azure driver. Any chance you would be interested in working with kata to see if you can get that tested? I can introduce you to people and help too | 18:34 |
*** myoung is now known as myoung|lunch | 18:52 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: allow build-python-release consumers to override python version https://review.openstack.org/574373 | 19:09 |
*** bhavik1 has joined #zuul | 19:14 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/zuul-jobs master: allow build-python-release consumers to override python version https://review.openstack.org/574373 | 19:17 |
*** GonZo2000 has quit IRC | 19:48 | |
*** bhavik1 has quit IRC | 19:58 | |
-openstackstatus- NOTICE: Zuul was restarted for a software upgrade; changes uploaded or approved between 19:30 and 19:50 will need to be rechecked | 19:58 | |
*** myoung|lunch is now known as myoung | 19:59 | |
*** dkranz has quit IRC | 20:05 | |
clarkb | corvus: not sure if you saw, but on friday I noticed that github event handling in zuul doesn't always handle depends-on properly | 20:26 |
corvus | clarkb: no, i missed that | 20:26 |
corvus | clarkb: have more detail yet? | 20:26 |
clarkb | corvus: http://paste.openstack.org/show/723011/ shows the logs from that. The TL;DR is that the depends-on processing path wasn't handled properly when I edited the PR to add the depends-on. I had to push a new "patchset" for it to process the depends-on | 20:27 |
corvus | clarkb: i bet we can come up with a test case for that; i think our fake github is sufficiently expressive for that | 20:28 |
corvus | clarkb: do you know what the event github sent was? | 20:28 |
clarkb | corvus: I don't. But reading the code in the independent pipeline manager's checkForChangesNeededBy, the issue is we return True (and check for True) to know if the depends-on are managed properly, but we return the list of needed changes if there are needed changes. Then elsewhere in the independent manager we call addChange() on those changes to tie them up properly. But in the base manager class we check against True only | 20:30 |
clarkb | _processOneItem() specifically doesn't handle the case where a list of needed changes is returned | 20:30 |
clarkb | so I think this may be an issue on any event if the event does not go through the addChange() path | 20:31 |
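A minimal, self-contained illustration of the mismatch clarkb describes (not zuul's actual code; names are simplified):

```python
# One method returns either True or a list of needed changes, while one
# of its callers tests only for literal True, so the list path is dropped.

def check_for_changes_needed_by(change, satisfied):
    needed = [dep for dep in change["deps"] if dep not in satisfied]
    if not needed:
        return True      # all dependencies satisfied
    return needed        # changes that still need to be enqueued

change = {"id": "pr-1", "deps": ["pr-2"]}

# Independent-manager style: the list is handled via addChange().
result = check_for_changes_needed_by(change, satisfied=set())
if result is not True:
    for dep in result:
        print("enqueue needed change:", dep)

# Base-manager _processOneItem style (the suspect path): anything that
# is not literally True is treated as a failure and the list is ignored.
if check_for_changes_needed_by(change, satisfied=set()) is True:
    print("process item")
else:
    print("item stalls; needed changes never enqueued here")
```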
corvus | clarkb: i think i understand. so it's github-only by virtue of the fact that changing a gerrit commit message means a new patch, but in github it does not. i do think we ought to be able to repro in a test case. | 20:34 |
clarkb | ya, where I am still a little lost reading the code is how a change gets into the pipeline queue via a PR-update event without having gone through the addChange path somewhere along the way | 20:36 |
corvus | it should not be able to | 20:36 |
clarkb | since getChangeQueue (which creates the queues in independent pipelines) creates them as part of addChange() | 20:39 |
clarkb | maybe it was using cached data? | 20:39 |
clarkb | also there is a possibly related problem where changes sometimes show up in the third-party check pipeline twice. tobiash said that was a known issue he was debugging though | 20:40 |
clarkb | I'll start by trying to write this up into a bug as it doesn't sound like a known issue | 20:42 |
*** hashar has quit IRC | 20:44 | |
tobiash | clarkb: actually I planned to debug the double enqueue this week but haven't done so yet | 21:12 |
clarkb | tobiash: that is ok, the week only just started :) | 21:13 |
tobiash | clarkb: regarding adding depends-on later, I also observed this, and I think it doesn't update the cached change since this case isn't really a new patchset | 21:14 |
tobiash | So I guess we need to handle some event properly | 21:15 |
tobiash | I saw that depends-on issue in our deployment too, btw | 21:16 |
corvus | clarkb: moving our discussion about the log issue here -- ( http://logs.openstack.org/20/564220/1/check/devstack-multinode/7a4d81d/ara-report/result/ba7fbc30-b825-4650-a903-a6c4afdcbcb8/ ) | 21:17 |
corvus | are we certain that any v3-native multinode jobs are working? | 21:17 |
corvus | that error looks like the 'command' module is being called without the zuul_log_id argument. that argument is set by the zuul_stream callback plugin inside v2_playbook_on_task_start. i'm starting to wonder if that method isn't always called in the same way that it previously was. | 21:18 |
corvus | tobiash: ^ | 21:19 |
clarkb | ah ok you are about where I am | 21:20 |
corvus | i'm looking at the job-output.json from that build, and i see some "zuul_log_id": null in it | 21:25 |
tobiash | Ara says it's a shell task, but the shell task before is ok | 21:25 |
tobiash | corvus: maybe within the with_items it works differently now? | 21:26 |
corvus | what's the previous shell task that's okay? | 21:27 |
tobiash | http://logs.openstack.org/20/564220/1/check/devstack-multinode/7a4d81d/ara-report/result/1ee921c4-34c4-4294-bf7d-d9b042ea4a6e/ | 21:27 |
corvus | tobiash: that's on a different host | 21:27 |
corvus | looks like http://logs.openstack.org/20/564220/1/check/devstack-multinode/7a4d81d/ara-report/result/0c83e4f7-8241-4edf-9df6-8cf0455c9775/ is the previous shell task on that host, and it didn't have the error | 21:28 |
clarkb | maybe it doesn't call v2_playbook_on_task_start once for every node running the task? | 21:28 |
corvus | tobiash: the failed task isn't in a with_items, is it? | 21:29 |
corvus | it looks like it's just in a when block | 21:29 |
tobiash | Ah, so it's not with_items but a task which is executed on several hosts at once | 21:29 |
corvus | (but it is in a block, which is notable) | 21:29 |
tobiash | clarkb: yes, that's maybe the problem | 21:32 |
tobiash | corvus: this might be testable with tox-remote if we fake two ansible hosts with the same ip | 21:33 |
corvus | ++ | 21:33 |
clarkb | reading the code I think it actually calls the callback on each host | 21:37 |
clarkb | the free StrategyModule's run method seems to do this | 21:37 |
corvus | most of the "zuul_log_id" entries in the json have uuids. only one task has "null" as the value. it's the apt-get update command in the pre-playbook. it's null for both hosts. | 21:38 |
corvus | i don't see the connection yet | 21:38 |
tristanC | clarkb: provided an azure account, or shell access to a system with one already set up, i could find some time to get it working. Though I can't promise to maintain it, and it's probably better if someone from the infra^Wwinterscale team is assigned to it | 21:39 |
corvus | (i suspect that it's the proximate cause for the error though -- you get to write to one log file called "None", but not a second time) | 21:39 |
corvus | tristanC: we don't really "assign" people :) | 21:39 |
corvus | people volunteer for things | 21:40 |
tobiash | corvus: is there something else special with this command task? | 21:44 |
tristanC | corvus: heh, my bad wording. though it is a proprietary service, so supporting it should deserve some sort of compensation :-) | 21:44 |
tobiash | Maybe task.action doesn't match 'command' or 'shell' in this case | 21:45 |
corvus | tobiash: hrm, i can't see why it wouldn't... | 21:45 |
tristanC | clarkb: and moving forward, i guess we need to reconsider some sort of label restriction per project or tenant | 21:45 |
clarkb | tristanC: yes, we have an open story for that and I think to start we can operate on an honor system and if people abuse it go from there | 21:46 |
tobiash | corvus: i need sleep now. If you need help with this just ping me and I'll try to reproduce and debug this tomorrow. | 21:47 |
corvus | tobiash: thanks | 21:49 |
tristanC | clarkb: fair enough. well let me know when i can start putting the code together | 21:49 |
*** myoung is now known as myoung|off | 21:50 | |
corvus | i think we want to finish reworking the driver interface and make sure it can handle containers, then add more drivers | 21:50 |
tristanC | corvus: the first step is more of a prototype to get the sdk working (e.g. create instances, clean up leaks, etc.); that shouldn't be hard to rebase on a driver interface change | 21:51 |
corvus | tristanC: yep | 21:52 |
clarkb | ya I think both of those things can be worked in parallel particularly if we have azure access in the near future | 21:52 |
tobiash | corvus: one thing I noticed is that the task is skipped on one node | 21:55 |
tobiash | Maybe that's the special thing | 21:55 |
clarkb | the action value comes out of the yaml parser for tasks | 21:56 |
corvus | the other task is a role handler. i don't know what a role handler is | 21:56 |
corvus | ok, now i understand what a role handler is | 21:57 |
corvus | anyway, that's a special thing about it | 21:57 |
tobiash | The failed task is skipped for compute1 and fails for controller | 21:58 |
tobiash | So there could be a side effect of the on-skipped callback | 21:58 |
corvus | so far, we have null values on: 1) a handler in a role in a playbook; 2) a task with a when argument in a block with a when argument in a role in a playbook | 21:59 |
corvus | add "after a skipped task" to #2 :) | 21:59 |
corvus | i don't think we can add that to #1 though | 22:00 |
clarkb | it is also a parse error to not have a task action value | 22:00 |
corvus | i'm going to start seeing if i can reproduce any of this locally | 22:03 |
clarkb | ok, I was wrong about running the v2_playbook_on_task_start() function for each host. Reading the linear strategy instead (which appears to be the default), it will only do this for the first host,task pair in its loop | 22:18 |
clarkb | but it also seems to have always had this behavior | 22:19 |
corvus | clarkb: that probably is good enough if we assume that the hosts are distinct (we actually can't because of the new ability to add the same host to the inventory twice, but that's really a separate problem. that's not what we're running into here. so let's set that aside for later) | 22:22 |
corvus | clarkb: it has also occurred to me that we're modifying a data structure here. i wonder if there's a code path in ansible 2.5 where this modification is no longer permanent. like, we get passed a copy of the task structure. | 22:23 |
corvus | clarkb: unless i misunderstood what you said about the first host,task pair... | 22:24 |
clarkb | corvus: my reading of it is that you have a task, command to echo foo, that gets turned into a list of (hostx, task_command_foo), (hosty, task_command_foo) | 22:25 |
corvus | clarkb: do you mean that it only runs once for each task? (and that one invocation just gets whichever host happens to be first?) | 22:25 |
clarkb | the first time through that list it will run the callback against task_command_foo. What I am not sure about is if that task object is the same for hosty and hostx or if that changed (but it might explain the behavior if it is a different task object) | 22:25 |
clarkb | corvus: yup that | 22:25 |
openstackgerrit | Merged openstack-infra/zuul-jobs master: allow build-python-release consumers to override python version https://review.openstack.org/574373 | 22:27 |
clarkb | it does look like blocks play into this as well in the get next task code | 22:29 |
corvus | i haven't been able to reproduce it by including this role: http://paste.openstack.org/show/723237/ | 22:31 |
clarkb | this quickly gets into internal ansible state machine stuff and I'm losing my way through it. But d6872a7b070d1179e7d76bcda1490bb7003c4574 and b107e397cbd75d8e095b08495da2ac2aad3ef96f appear suspicious | 22:33 |
clarkb | that second one in particular: I think it meant the old behavior was that all the tasks were premade, and when you iterate through that host,task list they share task objects | 22:35 |
clarkb | but now I don't know that that happens as they don't cache them upfront | 22:35 |
*** EmilienM|PTO is now known as EmilienM | 22:41 | |
corvus | i have been able to get a null log id by using handlers | 22:41 |
clarkb | corvus: what does that look like? | 22:44 |
clarkb | (I'm not sure I've seen that example) | 22:44 |
corvus | clarkb: patterned after configure-mirrors role | 22:44 |
corvus | there's a handlers/main.yaml with the task in it, and then elsewhere, something says "notify: [task name]" | 22:45 |
clarkb | and its the handler itself in handlers/main.yaml that breaks? | 22:46 |
corvus | clarkb: not so much breaking as being called with zuul_log_id=null | 22:46 |
*** snapiri has quit IRC | 22:47 | |
*** snapiri has joined #zuul | 22:47 | |
corvus | clarkb: in the real exemplar, this happened first, so it didn't break either (it's okay to have a log file called "log-None". but only once. the second time breaks) | 22:47 |
corvus | clarkb: so the block task in devstack (which i still haven't gotten to run with a null task locally) broke because it ran after the handler task had already written a log-None file | 22:48 |
clarkb | why can't it just keep writing to the existing file? similar to how the zuul_log_id will be reused by tasks in a job I think | 22:48 |
clarkb | oh each log is its own uuid | 22:49 |
clarkb | I thought it was the job uuid for some reason | 22:49 |
corvus | clarkb: yeah. i suspect it can't write to the file because the first task is done as root, the second as 'stack' user | 22:50 |
corvus | thus the permission denied error on the file | 22:50 |
corvus | which, i guess is something i'm missing from my local test :) | 22:51 |
clarkb | just thinking out loud here: if each task gets its own uuid for the log file anyway, each task already has a built-in uuid. We could possibly use that instead of passing in an explicit argument | 22:52 |
corvus | clarkb: that seems plausible | 22:52 |
clarkb | granted it is task._uuid so maybe that is why we provide our own | 22:53 |
corvus | i've got something like this in a test locally: http://paste.openstack.org/show/723242/ but it's still not failing on the 'test block' task | 23:11 |
corvus | i think i've got all the significant things | 23:12 |
clarkb | on the block side of things, maybe you need it to have some nodes where the when evaluates to false? | 23:16 |
clarkb | the discover nodes one will only run on one of the multiple hosts due to a when clause | 23:16 |
corvus | clarkb: oh good point | 23:16 |
corvus | still no joy | 23:20 |
corvus | (and i have compute1 and controller in the same inventory order) | 23:21 |
corvus | i don't have any groups though | 23:21 |
openstackgerrit | Tristan Cacqueray proposed openstack-infra/nodepool master: Implement an Azure driver https://review.openstack.org/554432 | 23:24 |
corvus | clarkb: i'm nearing EOD, i wrote up a story here: https://storyboard.openstack.org/#!/story/2002528 | 23:36 |
corvus | mordred, tobiash: ^ that's as far as i've gotten | 23:36 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul master: DNM: reproducer for 2002528 https://review.openstack.org/574487 | 23:38 |
corvus | mordred, tobiash, clarkb: ^ that's my incomplete attempt to modify test_zuul_console to reproduce that | 23:39 |
clarkb | corvus: before you go, I dug up the original from openstack-helm http://logs.openstack.org/55/574055/3/check/openstack-helm-infra-ubuntu/9e9ac15/ara-report/result/54b51530-2857-4003-83e2-53fe446808b0/ that is failing on a shell task without a block or conditional, at least not at the level the task is defined. It's pretty boring other than a delegate_to | 23:44 |
corvus | huh | 23:45 |
clarkb | it is the running of that task on the second node that makes it unhappy (runs fine on the first node) | 23:48 |
corvus | why isn't the previous task on node-1 a pause? | 23:48 |
clarkb | I was just wondering that myself | 23:49 |
clarkb | if you filter to node-1 there in ara there is no task at all | 23:49 |
clarkb | perhaps pause is implicitly a one-node thing in ansible? | 23:50 |
clarkb | it just picks one and does it? | 23:50 |
corvus | yeah, that seems to be the behavior | 23:52 |
clarkb | I wonder if this is the sort of thing where if we ask ansible upstream about the behavior they will know off the top of their heads what is going on | 23:52 |
clarkb | I too need to EOD here shortly. We finally have sun and that means yardwork | 23:53 |
corvus | clarkb: i tried this, still no joy: http://paste.openstack.org/show/723246/ | 23:53 |
*** rlandy|rover is now known as rlandy|rover|bbl | 23:55 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!