hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 00:13 |
*** Goneri has quit IRC | 00:19 | |
*** rlandy|bbl is now known as rlandy | 01:55 | |
*** rlandy has quit IRC | 01:56 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 02:13 |
*** jaganathan has quit IRC | 02:43 | |
*** jaganathan has joined #oooq | 02:47 | |
*** ykarel|away has joined #oooq | 03:48 | |
*** ykarel|away is now known as ykare | 03:48 | |
*** ykare is now known as ykarel | 03:48 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 04:13 |
ykarel | So still we have issues in promotion related to registry | 04:19 |
ykarel | failed: [localhost] (item=swift-account) => {"changed": false, "item": "swift-account", "msg": "Error searching for image trunk.registry.rdoproject.org/tripleomaster/centos-binary-swift-account - 500 Server Error: Internal Server Error (\"{\"message\":\"layer does not exist\"}\")"} | 04:20 |
*** udesale has joined #oooq | 04:21 | |
*** ratailor has joined #oooq | 04:55 | |
ykarel | Also noticed that retries: 3 is not working for docker pull ^^, so we should try fixing retries to see if that helps in reducing failures | 05:14 |
*** pgadiya has joined #oooq | 05:24 | |
*** pgadiya has quit IRC | 05:24 | |
*** udesale_ has joined #oooq | 05:33 | |
*** ykarel_ has joined #oooq | 05:33 | |
*** ykarel has quit IRC | 05:36 | |
*** udesale has quit IRC | 05:36 | |
*** hamzy has quit IRC | 05:37 | |
*** agopi has quit IRC | 05:43 | |
*** agopi has joined #oooq | 05:43 | |
*** jaganathan has quit IRC | 05:44 | |
*** jaganathan has joined #oooq | 05:44 | |
*** marios has joined #oooq | 05:46 | |
*** hamzy has joined #oooq | 05:47 | |
*** quiquell|off is now known as quiquell|ruck | 05:57 | |
quiquell|ruck | ykarel_: Good morning, going to check | 05:58 |
*** jfrancoa has joined #oooq | 06:05 | |
*** jfrancoa has joined #oooq | 06:06 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 06:13 |
*** jtomasek has joined #oooq | 06:25 | |
*** jtomasek has quit IRC | 06:25 | |
*** jtomasek has joined #oooq | 06:26 | |
*** holser__ has joined #oooq | 06:41 | |
*** links has joined #oooq | 06:47 | |
*** skramaja has joined #oooq | 07:00 | |
*** kopecmartin has joined #oooq | 07:15 | |
*** quiquell|ruck is now known as quiquell|ruck|af | 07:15 | |
*** quiquell|ruck|af is now known as quique|ruck|afk | 07:15 | |
*** florianf has joined #oooq | 07:16 | |
*** links has quit IRC | 07:29 | |
*** tesseract has joined #oooq | 07:29 | |
*** udesale__ has joined #oooq | 07:29 | |
*** ykarel__ has joined #oooq | 07:30 | |
*** udesale_ has quit IRC | 07:32 | |
*** ykarel_ has quit IRC | 07:33 | |
*** ykarel__ is now known as ykarel|lunch | 07:36 | |
*** amoralej|off is now known as amoralej | 07:38 | |
*** links has joined #oooq | 07:47 | |
*** tosky has joined #oooq | 07:48 | |
*** links has quit IRC | 07:52 | |
*** quique|ruck|afk is now known as quiquell|ruck | 07:58 | |
*** ccamacho has joined #oooq | 07:59 | |
*** bogdando has joined #oooq | 07:59 | |
*** lucas-afk is now known as lucasagomes | 08:05 | |
*** gkadam has joined #oooq | 08:07 | |
*** ykarel|lunch is now known as ykarel | 08:13 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 08:13 |
ykarel | quiquell|ruck, you find something about promotion failures? | 08:30 |
quiquell|ruck | ykarel: I have to check it out with panda, I also see some timeouts at image fetching | 08:32 |
ykarel | quiquell|ruck, Ok, will prepare a patch to fix retries and see how it goes | 08:32 |
quiquell|ruck | failed: [localhost] (item=[u'etcd', u'f106094e961c5ab430687d673063baee379f6bbd_310b64d1']) => {"changed": false, "item": ["etcd", "f106094e961c5ab430687d673063baee379f6bbd_310b64d1"], "msg": "Error removing image docker.io/tripleomaster/centos-binary-etcd:f106094e961c5ab430687d673063baee379f6bbd_310b64d1 - UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)"} | 08:33 |
quiquell|ruck | at master | 08:33 |
ykarel | hmm so two different errors one for localhost, and other for rdo registry | 08:34 |
quiquell|ruck | I think RDO is ok | 08:35 |
*** udesale_ has joined #oooq | 08:35 | |
quiquell|ruck | TASK [Push images to rdoproject registry with named label] is working fine | 08:35 |
quiquell|ruck | Feels like a docker image layer missing before push | 08:36 |
*** udesale__ has quit IRC | 08:38 | |
quiquell|ruck | Maybe something is half baked at docker hub | 08:43 |
*** agopi has quit IRC | 08:44 | |
*** links has joined #oooq | 08:49 | |
quiquell|ruck | syslog: inotify_add_watch(7, /dev/dm-1, 10) failed: No such file or directory | 08:51 |
quiquell|ruck | Don't know if it can be related | 08:51 |
*** pgadiya has joined #oooq | 08:57 | |
*** pgadiya has quit IRC | 08:57 | |
*** panda|rover|off is now known as panda|rover | 08:59 | |
quiquell|ruck | panda|rover: Hi | 09:02 |
ykarel | quiquell|ruck, fix retries: https://review.rdoproject.org/r/13419 | 09:03 |
quiquell|ruck | panda|rover: ykarel introducing a change on retries | 09:05 |
*** jtomasek has quit IRC | 09:07 | |
amoralej | https://review.openstack.org/#/c/561482/ is failing in check | 09:29 |
amoralej | it seems infra | 09:29 |
amoralej | can i just recheck? | 09:29 |
amoralej | is the undercloud-containers job broken?, two failures in a row | 09:32 |
amoralej | quiquell|ruck, ^ | 09:32 |
quiquell|ruck | amoralej: Let me do a quick check | 09:32 |
amoralej | any known issue? | 09:32 |
amoralej | 2018-04-17 20:50:15.983954 | primary | [WARNING]: No hosts matched, nothing to do | 09:32 |
amoralej | http://logs.openstack.org/82/561482/1/check/tripleo-ci-centos-7-undercloud-containers/ecc4b0c/job-output.txt.gz | 09:32 |
ykarel | amoralej, looks like there is an issue, docker image not found | 09:33 |
ykarel | i have seen other job as well | 09:33 |
ykarel | Image docker.io/tripleomaster/centos-binary-rabbitmq has no tag f106094e961c5ab430687d673063baee379f6bbd_310b64d1 | 09:33 |
quiquell|ruck | ImageUploaderException: Image docker.io/tripleomaster/centos-binary-rabbitmq has no tag f106094e961c5ab430687d673063baee379f6bbd_310b64d1 | 09:34 |
quiquell|ruck | Oops, the same | 09:34 |
quiquell|ruck | There's something weird with docker hub | 09:34 |
ykarel | similar bug is there but for different tag: https://bugs.launchpad.net/tripleo/+bug/1764870 | 09:34 |
openstack | Launchpad bug 1764870 in tripleo "Missing tags in dockerhub, impossible to deploy a containerized undercloud" [Critical,Triaged] - Assigned to Gabriele Cerami (gcerami) | 09:34 |
amoralej | that must be breaking all reviews | 09:35 |
panda|rover | is this still happening ? | 09:36 |
ykarel | yes | 09:36 |
panda|rover | mmm there's no rabbitmq with that tag, I see other containers but not rabbitmq | 09:41 |
ykarel | Tag and push to docker hub: failed: [localhost] (item=[u'f106094e961c5ab430687d673063baee379f6bbd_310b64d1', u'rabbitmq']) => {"changed": false, "item": ["f106094e961c5ab430687d673063baee379f6bbd_310b64d1", "rabbitmq"], "msg": "Error searching for image docker.io/docker.io/tripleomaster/centos-binary-rabbitmq - 500 Server Error: Internal Server Error (\"{\"message\":\"layer does not exist\"}\")"} | 09:49 |
ykarel | and many other images for this tag failed to be pushed | 09:50 |
panda|rover | ykarel: but now the images are there | 09:50 |
panda|rover | I see f106094e961c5ab430687d673063baee379f6bbd_310b64d1 | 09:50 |
panda|rover | ykarel: for rabbitmq | 09:50 |
ykarel | panda|rover, current run 2018-04-18 07:51:27,452 14001 INFO promoter Promoting the container images for dlrn hash f106094e961c5ab430687d673063baee379f6bbd on master to current-tripleo should have fixed it | 09:51 |
panda|rover | ykarel: yeah probably | 09:52 |
panda|rover | so we had a transient error | 09:52 |
panda|rover | and we are not handling it very well | 09:53 |
ykarel | sorry, 2018-04-18 07:51:27,452 14001 INFO promoter Promoting the container images for dlrn hash f106094e961c5ab430687d673063baee379f6bbd on master to current-tripleo | 09:53 |
ykarel | 2018-04-18 08:01:01,758 21610 ERROR promoter Another promoter process is running | 09:53 |
ykarel | looking at which other promoter is running | 09:53 |
ykarel | panda|rover, can you check which is running currently | 09:54 |
ykarel | which release and which hash | 09:54 |
panda|rover | ykarel: mhh I see master and queens are running at the same time, and I'm not sure it should happen | 09:56 |
ykarel | f106094e961c5ab430687d673063baee379f6bbd_310b64d1 | 09:56 |
ykarel | 229 MB | 09:56 |
ykarel | 7 minutes ago for rabbitmq | 09:56 |
panda|rover | ykarel: and it's not easy to understand which hash | 09:56 |
panda|rover | ykarel: no, that was me | 09:56 |
panda|rover | ykarel: I repushed it manually | 09:56 |
ykarel | you pushed? | 09:56 |
ykarel | Okk | 09:56 |
panda|rover | ykarel: but it was already there | 09:56 |
panda|rover | ykarel: all the layers were existing | 09:57 |
ykarel | But the question is why check jobs are using a non promoted hash: f106094e961c5ab430687d673063baee379f6bbd_310b64d1 | 09:57 |
ykarel | is this expected? | 09:58 |
panda|rover | ykarel: because the push happens before the promotion | 09:59 |
panda|rover | ykarel: so we promote only after we have all the containers in place | 09:59 |
panda|rover | the weird thing is why the jobs are trying to get that hash | 10:01 |
ykarel | yeah, that's what I mean, jobs should use the tag specified in the baseurl of https://trunk.rdoproject.org/centos7-master/current-tripleo/delorean.repo | 10:01 |
panda|rover | they should resolve the hash at the start, not taking the containers with the tag "current-tripleo" | 10:01 |
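The resolution step being argued for here can be sketched in a few lines; this is an illustrative helper (the function name and the exact baseurl layout are assumptions, not the promoter's actual code) that pulls the full dlrn hash out of a current-tripleo delorean.repo:

```python
import re

def dlrn_hash_from_repo(repo_text):
    """Extract the full dlrn hash (40-char commit hash + 8-char distro
    hash) from a delorean.repo baseurl line, e.g.
    baseurl=https://trunk.rdoproject.org/centos7-master/f1/06/f106..._310b64d1
    Returns None when no hash-shaped path component is present."""
    match = re.search(r'baseurl=\S*?([0-9a-f]{40}_[0-9a-f]{8})', repo_text)
    return match.group(1) if match else None
```

A job that resolves the hash once at the start and uses it everywhere is immune to the promote tag moving mid-run.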
*** bogdando has quit IRC | 10:02 | |
*** bogdando has joined #oooq | 10:04 | |
*** holser__ has quit IRC | 10:04 | |
*** holser__ has joined #oooq | 10:05 | |
panda|rover | ykarel: that's what I always thought, maybe there was a change, and some pull is using the current-tripleo tag directly instead of the hash | 10:05 |
quiquell|ruck | ykarel: I think openstack jobs use docker hub and rdo jobs use rdo repo | 10:05 |
*** zoli is now known as zoli|lunch | 10:07 | |
panda|rover | ykarel: crap. prep-container is using the tag directly | 10:07 |
panda|rover | quiquell|ruck: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/overcloud-prep-containers/templates/overcloud-prep-containers.sh.j2#L44 | 10:08 |
panda|rover | this should be the resolved hash | 10:09 |
panda|rover | it was overlooked in the review | 10:09 |
panda|rover | and I remember reviewing it | 10:10 |
panda|rover | damn | 10:10 |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 10:14 |
panda|rover | ykarel: where did you take that error log from ? | 10:18 |
ykarel | check job | 10:18 |
ykarel | https://review.openstack.org/#/c/561482 | 10:19 |
panda|rover | this is getting weirder | 10:19 |
*** jaganathan has quit IRC | 10:21 | |
*** udesale__ has joined #oooq | 10:21 | |
panda|rover | ykarel: which job is failing ? | 10:22 |
panda|rover | ykarel: undercloud containers ? | 10:22 |
ykarel | yes | 10:22 |
ykarel | panda|rover, can this be the fix: https://review.openstack.org/#/c/559492 | 10:22 |
*** udesale_ has quit IRC | 10:24 | |
panda|rover | ykarel: this is about container update | 10:28 |
panda|rover | ykarel: do you know the undercloud containers workflow well ? | 10:31 |
ykarel | no :( | 10:31 |
panda|rover | ykarel: what's latest master hash ? | 10:47 |
ykarel | current-tripleo? | 10:48 |
panda|rover | ykarel: yes | 10:48 |
ykarel | a2e69c2c44417c85334944a4c46f91648aa0b97f_3791bf5d | 10:48 |
panda|rover | and we are also trying to promote a new one | 10:53 |
ykarel | yes | 10:53 |
ykarel | panda|rover, looks like https://review.openstack.org/#/c/559492 would solve or at least work around the issue | 11:00 |
panda|rover | ykarel: how so ? | 11:00 |
ykarel | panda|rover, it will set update_containers to true | 11:01 |
ykarel | and because of that containers would be prepared from docker_image_tag | 11:01 |
ykarel | and thus undercloud would be installed from correct tag(promoted one) | 11:01 |
panda|rover | ykarel: I think we have problems with the uploads, not all the containers from that tag are present anyway | 11:11 |
ykarel | yes that's one issue | 11:11 |
ykarel | and other is wrong tag is used for containers in undercloud install | 11:12 |
panda|rover | ykarel: I'm not even sure about that | 11:13 |
panda|rover | ykarel: because it either uses current-tripleo directly, and then I don't understand from where it's taking the explicit hash | 11:13 |
ykarel | f106094e961c5ab430687d673063baee379f6bbd_310b64d1 is used for undercloud containers which is not promoted yet | 11:14 |
*** zoli|lunch is now known as zoli | 11:14 | |
panda|rover | ykarel: or it's trying to download the explicit hash as tag, and I don't understand from where it's taking it | 11:14 |
*** zoli is now known as zoli|wfh | 11:14 | |
*** zoli|wfh is now known as zoli | 11:14 | |
panda|rover | ykarel: certainly not from the current-tripleo tag | 11:14 |
ykarel | yes, /me also trying to find that | 11:15 |
*** udesale__ has quit IRC | 11:16 | |
*** lucasagomes is now known as lucas-hungry | 11:18 | |
amoralej | ykarel, panda|rover so iiuc undercloud containers is broken? | 11:55 |
panda|rover | amoralej: master just promoted, it should be ok | 11:56 |
panda|rover | amoralej: but it's not clear what the method of selecting the tag is, and it is not working well when we have problems with the image uploads | 11:57 |
amoralej | ok | 11:57 |
amoralej | i'll recheck | 11:57 |
weshay | quiquell|ruck, morning | 11:58 |
ykarel | panda|rover, but in promotion jobs we don't have a job with only containerized undercloud and no containerized overcloud | 11:58 |
ykarel | panda|rover, is there any such job? | 11:58 |
*** atoth has joined #oooq | 11:59 | |
panda|rover | ykarel: I think there is | 11:59 |
ykarel | ok, which one, i will check something there | 11:59 |
*** rfolco|off is now known as rfolco | 12:01 | |
weshay | panda|rover, any ideas what the root cause of "500 Server Error: Internal Server Error (\"{\"message\":\"layer does not exist\"}\")"}" is? | 12:01 |
weshay | panda|rover, afaict.. the last run of the container push worked w/o error in both rdo and docker | 12:02 |
panda|rover | weshay: it's either device mapper overload when there are two promotions happening at the same moment | 12:02 |
panda|rover | or file system is failing | 12:02 |
weshay | ah | 12:02 |
panda|rover | we are seeing some worrying messages in the logs | 12:03 |
weshay | ya.. two promotions at the same time seems like an issue | 12:03 |
panda|rover | we are currently stopping all the automatic runs | 12:03 |
weshay | k | 12:03 |
panda|rover | and we have a queens run launched manually | 12:03 |
weshay | manually from the promotion server, or using the manual push script? | 12:03 |
panda|rover | weshay: you in the program call ? I'm there only to not leave quique alone | 12:03 |
weshay | panda|rover, ya.. I'm on | 12:03 |
panda|rover | weshay: manually from the promoter server | 12:03 |
panda|rover | weshay: ok I'll drop then | 12:04 |
weshay | panda|rover, k | 12:04 |
weshay | panda|rover, ya.. ur good | 12:04 |
weshay | quiquell|ruck, will be great | 12:04 |
panda|rover | yeah, just the first time will be a bit rough if they start asking questions :) | 12:04 |
quiquell|ruck | weshay, panda|rover: Let's see, I have already changed my pants | 12:07 |
weshay | quiquell|ruck, lolz | 12:07 |
quiquell|ruck | weshay: Even with the problem at promote we are still green, aren't we? | 12:07 |
weshay | quiquell|ruck, you actually may be one of the first OpenStackers to join the program call in their first 90 days | 12:07 |
weshay | so ++ | 12:07 |
weshay | quiquell|ruck, board says queens is green :) | 12:08 |
weshay | we are green | 12:08 |
quiquell|ruck | deal | 12:08 |
*** lhinds- is now known as lhinds | 12:08 | |
quiquell|ruck | Looks like around 20 is missing | 12:09 |
quiquell|ruck | For the manual promoter to promote queens | 12:09 |
quiquell|ruck | Will check this https://dashboards.rdoproject.org/queens before talking | 12:09 |
weshay | panda|rover, quiquell|ruck are any jobs failing due to the registry? missing containers | 12:09 |
quiquell|ruck | weshay: containerized undercloud | 12:09 |
quiquell|ruck | tripleo-ci-centos-7-undercloud-upgrades | 12:10 |
*** amoralej is now known as amoralej|lunch | 12:10 | |
weshay | hrm.. ya | 12:10 |
weshay | panda|rover, 2018-04-18 08:43:39 | Exception: Image docker.io/tripleomaster/centos-binary-rabbitmq has no tag f106094e961c5ab430687d673063baee379f6bbd_310b64d1. | 12:11 |
ykarel | weshay, is there a place where i can find all the runs for tripleo-ci-centos-7-undercloud-containers? | 12:12 |
weshay | panda|rover, shouldn't that be fixed now that you have pushed all the containers for master? | 12:12 |
quiquell|ruck | ykarel: at zuul's builds | 12:12 |
weshay | ykarel, http://cistatus.tripleo.org/ | 12:12 |
panda|rover | weshay: master promoted again 1 hour ago | 12:12 |
weshay | lolz | 12:13 |
panda|rover | weshay: but there's something wrong with how the undercloud container job chooses its hash | 12:13 |
ykarel | weshay, Thanks | 12:13 |
weshay | panda|rover, ya.. it should use the hash | 12:13 |
weshay | not the current-tripleo tag | 12:13 |
panda|rover | weshay: yep, I see the same error in prep containers now, after the upgrade sprints | 12:14 |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 12:14 |
panda|rover | weshay: we are using the tag directly | 12:14 |
weshay | panda|rover, we should push the hash tag first, and the softlink tag second | 12:14 |
quiquell|ruck | They are talking about containerized undercloud problems now | 12:14 |
panda|rover | weshay: we already do that | 12:14 |
*** jtomasek has joined #oooq | 12:15 | |
panda|rover | weshay https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/container-push/container-push.yml#L122 | 12:15 |
weshay | panda|rover, ya.. so I was looking at that yesterday | 12:16 |
weshay | panda|rover, which with_items is that using? | 12:16 |
weshay | tag: "{{ item[0] }}" | 12:16 |
weshay | \/centos-binary-{{ item[1] }}" | 12:17 |
weshay | those two are confusing to me | 12:17 |
panda|rover | weshay: it's a with_nested. item[0] is the first element in the with_nested list, item[1] is the second | 12:17 |
weshay | ykarel, you see it? | 12:17 |
weshay | panda|rover, ha.. I'm blind.. thanks | 12:18 |
ykarel | weshay, looking | 12:18 |
ykarel | panda|rover, weshay it's using with_nested: | 12:22 |
ykarel | item[0] means 1st list element, and item[1] means element from second list | 12:22 |
ykarel | ohh it's already told, | 12:23 |
ykarel | then what to look | 12:23 |
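For reference, `with_nested` iterates the cross product of its lists; a quick Python model of what the task sees (the example tag and image names come from the log, not from the playbook itself):

```python
from itertools import product

# with_nested: [[tag, ...], [image, ...]] hands the task every
# (tag, image) pair as item, so item[0] is the tag and item[1] the image.
tags = ["f106094e961c5ab430687d673063baee379f6bbd_310b64d1"]
images = ["rabbitmq", "etcd"]

pairs = list(product(tags, images))
for item in pairs:
    print("centos-binary-%s:%s" % (item[1], item[0]))
```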
weshay | quiquell|ruck, nice work! | 12:25 |
quiquell|ruck | weshay: Done! put that on my 90 days plan ! | 12:26 |
quiquell|ruck | :-P | 12:26 |
ykarel | panda|rover, you told that you pushed manually f106094e961c5ab430687d673063baee379f6bbd_310b64d for rabbitmq | 12:26 |
weshay | :) | 12:26 |
ykarel | panda|rover, can you tell how you did that | 12:26 |
quiquell|ruck | thanks for the assistance btw | 12:26 |
panda|rover | ykarel: yes, but only for rabbitmq | 12:26 |
ykarel | Ok, how | 12:26 |
panda|rover | ykarel: I'm not giving out this information for free | 12:27 |
ykarel | :) what need to be done | 12:28 |
panda|rover | ykarel: I want a chicken masala delivered to my home | 12:28 |
ykarel | u r crazy :) | 12:28 |
panda|rover | ykarel: you basically pull the image from registry with the tag, then retag changing the registry name, then push | 12:28 |
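The manual repush panda|rover describes is a pull/retag/push sequence; this helper only builds the command strings (the function name and registry defaults are illustrative, not the actual script used on the promoter server):

```python
def repush_commands(image, tag,
                    src_registry="trunk.registry.rdoproject.org/tripleomaster",
                    dst_registry="docker.io/tripleomaster"):
    """Build the docker commands to pull an image by tag from one
    registry, retag it for another registry, and push it there."""
    src = "%s/%s:%s" % (src_registry, image, tag)
    dst = "%s/%s:%s" % (dst_registry, image, tag)
    return ["docker pull %s" % src,
            "docker tag %s %s" % (src, dst),
            "docker push %s" % dst]
```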
ykarel | did you push only the <hash> tag or also the name <current-tripleo> | 12:29 |
*** trown|outtypewww is now known as trown | 12:29 | |
panda|rover | ykarel: only the hash | 12:29 |
ykarel | please try pushing current-tripleo | 12:29 |
quiquell|ruck | weshay: Do you know how to access the bmc image of the RHEL stacks ? | 12:29 |
panda|rover | ykarel: I'd rather not, it will change the link to a wrong container | 12:29 |
panda|rover | ykarel: did you find any error with current-tripleo | 12:30 |
ykarel | [zuul@subnode-0 ~]$ skopeo inspect docker://docker.io/tripleomaster/centos-binary-rabbitmq:current-tripleo|grep rdo_version | 12:30 |
ykarel | "rdo_version": "b23b33707ba4f4bd0682e58be30e1d16c6232992_d032039d" | 12:30 |
ykarel | panda|rover, ^^ | 12:30 |
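The skopeo check above can be scripted rather than grepped; a small sketch that parses `skopeo inspect` JSON and reads the `rdo_version` label (the sample JSON is a hypothetical, minimal version of the real output):

```python
import json

def image_rdo_version(inspect_output):
    """skopeo inspect prints image metadata as JSON; the promoted hash
    shows up under Labels as "rdo_version" (as grepped in the log)."""
    data = json.loads(inspect_output)
    return (data.get("Labels") or {}).get("rdo_version")

sample = ('{"Name": "docker.io/tripleomaster/centos-binary-rabbitmq", '
          '"Labels": {"rdo_version": '
          '"b23b33707ba4f4bd0682e58be30e1d16c6232992_d032039d"}}')
```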
*** rlandy has joined #oooq | 12:30 | |
panda|rover | ykarel: yes, that's the hash that just promoted 1 hour ago | 12:30 |
ykarel | i can see b23b33707ba4f4bd0682e58be30e1d16c6232992_d032039d | 12:31 |
ykarel | 229 MB | 12:31 |
ykarel | 10 hours ago | 12:31 |
*** lucas-hungry is now known as lucasagomes | 12:31 | |
panda|rover | ykarel: I think the creation time is maintained in the container itself | 12:31 |
panda|rover | ykarel: and it was indeed created 10 hours ago | 12:32 |
panda|rover | ykarel: then there's the promotion pipeline, and the promotion process, with all the delays we had | 12:32 |
ykarel | panda|rover, so when you pushed rabbitmq i saw the time and it was 7 minutes ago | 12:32 |
panda|rover | 10 hours looks right | 12:32 |
panda|rover | heh | 12:32 |
panda|rover | you're right | 12:34 |
panda|rover | so at this point | 12:34 |
panda|rover | ykarel: for b23b33707ba4f4bd0682e58be30e1d16c62 part of the upload happened 10 hours ago | 12:35 |
panda|rover | but we were able to finish only 1 hour ago | 12:35 |
panda|rover | for all the containers | 12:35 |
weshay | quiquell|ruck, you are free to drop | 12:36 |
panda|rover | ykarel: also, we are tagging all the containers with current-tripleo, only after they are all pushed with the hash first | 12:36 |
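That ordering (hash tags first, promote tag last) can be modeled as a two-phase plan; a hypothetical plan builder, not the promoter's real code:

```python
def promotion_plan(images, dlrn_hash, registry="docker.io/tripleomaster",
                   promote_tag="current-tripleo"):
    """Push every image under its hash tag first; only once the full set
    is up is the promote tag applied, so current-tripleo never points
    at a partially uploaded batch."""
    pushes = ["push %s/%s:%s" % (registry, img, dlrn_hash)
              for img in images]
    retags = ["tag %s/%s:%s -> %s/%s:%s"
              % (registry, img, dlrn_hash, registry, img, promote_tag)
              for img in images]
    return pushes + retags
```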
quiquell|ruck | weshay: Ok, Mark McLoughlin is asking for something, I think trello cards for queens | 12:36 |
*** quiquell|ruck is now known as quique|ruck|food | 12:37 | |
weshay | quiquell|ruck, on the call? | 12:37 |
quique|ruck|food | weshay: On the doc | 12:37 |
quique|ruck|food | Comments on the right | 12:37 |
weshay | quique|ruck|food, https://trello.com/c/KhgqKQGB | 12:37 |
panda|rover | is asking for a CIX card | 12:38 |
panda|rover | I updated the bug, I'll update the card too | 12:38 |
rlandy | oh we're still on this container tag issue :( | 12:38 |
quique|ruck|food | weshay: Cool thanks | 12:39 |
rlandy | trown: good morning ... | 12:39 |
trown | rlandy: good morning... I finally had some success last night! | 12:40 |
rlandy | trown: that's good - mine timed out on TASK [Render deployment file for InstanceIdDeployment] | 12:40 |
trown | rlandy: I think it might be a hostname thing... I changed a few things at once though, so not totally sure | 12:40 |
panda|rover | apevec already updated the card | 12:40 |
rlandy | trown: which fs? | 12:41 |
trown | rlandy: but when I went back and deployed non-container pike scenario I could see rabbit was complaining | 12:41 |
trown | rlandy: I did fs007 on pike, just for familiarity in troubleshooting | 12:41 |
rlandy | non-container? or containerized? | 12:41 |
trown | rlandy: I will get what I have in to the patch so you can try, but I eventually got that featureset working | 12:42 |
trown | non-container | 12:42 |
trown | I am trying my changes on fs010 queens now though | 12:42 |
rlandy | trown: cool - in the mean time, I was going to try sshnaidm|off's new patch | 12:42 |
trown | k | 12:42 |
rlandy | trown: my deployment went much further the second time around though | 12:43 |
rlandy | when I reran overcloud-deploy manually on the undercloud | 12:43 |
rlandy | that I can't really explain | 12:44 |
trown | rlandy: that is probably a fluke... it isn't really possible to rerun multinode | 12:45 |
rlandy | trown: nothing really happened the first time round | 12:45 |
rlandy | bailed very early | 12:46 |
ykarel | panda|rover, b23b33707ba4f4bd0682e58be30e1d16c6232992_d032039d tagged images pushed manually? | 12:46 |
ykarel | as i can't see this hash in logs | 12:46 |
panda|rover | ykarel: no, it was an automatic run | 12:47 |
ykarel | ok logs will appear when script finishes | 12:47 |
ykarel | right? | 12:47 |
panda|rover | ykarel: script already finished | 12:48 |
*** apetrich_ has joined #oooq | 12:48 | |
ykarel | then where are logs? | 12:48 |
ykarel | i am looking here: http://38.145.34.55/master.log | 12:48 |
panda|rover | ykarel: search for the hash, it should be at 11:07, the first mention | 12:49 |
panda|rover | 2018-04-18 11:07:33,288 19409 INFO promoter Promoting the container images for dlrn hash b23b33707ba4f4bd0682e58be30e1d16c6232992 on master to current-tripleo | 12:49 |
ykarel | ya found after refreshing :) | 12:49 |
panda|rover | ykarel: oh yeah, we still don't have tail -f via http | 12:50 |
ykarel | so no failure this time | 12:50 |
ykarel | and master promoted | 12:50 |
panda|rover | yes, but I also halted all automatic runs | 12:50 |
ykarel | Ok, so u will manually run for remaining releases | 12:50 |
panda|rover | ykarel: yes, but I'm doing this to test the theory, that the server is unable to handle correctly more than one promotion at a time | 12:51 |
panda|rover | ykarel: we got all sort of dm errors in the logs | 12:51 |
ykarel | okk, i heard dm is deprecated, no? | 12:52 |
quique|ruck|food | panda|rover: We can restart docker instead of the server | 12:52 |
ykarel | dm is devicemapper, right? | 12:52 |
quique|ruck|food | ykarel: Yep | 12:52 |
panda|rover | quique|ruck|food: yes, but after queens finishes | 12:52 |
*** ratailor has quit IRC | 12:52 | |
quique|ruck|food | panda|rover: sure sure | 12:54 |
panda|rover | quique|ruck|food: twice sure, we must be very very sure | 12:54 |
quique|ruck|food | panda|rover: very very sure sure | 12:55 |
panda|rover | ykarel: we had errors in the logs that don't make much sense | 12:55 |
panda|rover | ykarel: missing layers when we are trying to push, or delete an image | 12:55 |
*** apetrich_ has quit IRC | 12:55 | |
ykarel | panda|rover, i think first of all we should fix retries | 12:55 |
ykarel | i think retries is not working | 12:56 |
weshay | panda|rover, note.. pike also promoted .. just in case you didn't see that | 12:56 |
quique|ruck|food | panda|rover: Looks like the ansible docker module does a lookup, maybe after pushing, to check that it's there | 12:56 |
quique|ruck|food | Or maybe before | 12:57 |
panda|rover | ykarel: "If the until parameter isn’t defined, the value for the retries parameter is forced to 1" | 12:57 |
ykarel | and we don't have until | 12:57 |
panda|rover | so it's certainly not working, but if the problem is temporary, it may not be solved by the time we are retrying | 12:57 |
weshay | panda|rover, do you want me to reschedule your 1-1? | 12:58 |
panda|rover | weshay: no, I have to wait for queens promotion to finish before taking any other action | 12:58 |
ykarel | panda|rover, yes it would not be solved but we can learn something from that if it passes in some tries | 12:58 |
weshay | panda|rover, I'm in | 12:59 |
panda|rover | ykarel: if it's a load problem we are just delaying the inevitable | 12:59 |
quique|ruck|food | If we go to the retry, let's add a delay if it's not one by default | 13:00 |
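In plain Python, the behaviour the fixed `retries`/`until` combination plus a delay should give looks roughly like this; the helper is an illustrative sketch, not the Ansible internals (note the docs quote above: without an `until:` condition, `retries` is forced to 1):

```python
import time

def retry(task, retries=3, delay=5):
    """Re-run task until it succeeds, mirroring Ansible's
    retries/until/delay behaviour. Raises the last error once the
    retry budget is exhausted."""
    for attempt in range(1, retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(delay)  # give a transient registry error time to clear
```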
ykarel | hmm but is that load causing issue | 13:02 |
*** Goneri has joined #oooq | 13:04 | |
*** amoralej|lunch is now known as amoralej | 13:05 | |
quique|ruck|food | panda|rover: queen promoted :-) yeeeih !!! | 13:07 |
panda|rover | quique|ruck|food: hhmmm | 13:12 |
quique|ruck|food | https://dashboards.rdoproject.org/queens | 13:17 |
arxcruz | weshay: 1-1 in 10 ? | 13:21 |
*** zoli is now known as zoli|afk | 13:21 | |
weshay | arxcruz, aye | 13:22 |
*** quique|ruck|food is now known as quiquell|ruck | 13:24 | |
trown | booyakasha | 13:29 |
trown | 2018-04-18 13:17:20 | Ran: 1 tests in 73.0000 sec. | 13:29 |
trown | 2018-04-18 13:17:20 | - Passed: 1 | 13:29 |
trown | 2018-04-18 13:17:20 | - Skipped: 0 | 13:29 |
trown | 2018-04-18 13:17:20 | - Expected Fail: 0 | 13:29 |
trown | 2018-04-18 13:17:20 | - Unexpected Success: 0 | 13:29 |
trown | 2018-04-18 13:17:20 | - Failed: 0 | 13:29 |
trown | ^^ featureset10-queens | 13:29 |
trown | rlandy: ^ | 13:29 |
weshay | arxcruz, ready | 13:31 |
weshay | trown++ | 13:31 |
hubbot | weshay: trown's karma is now 39 | 13:31 |
weshay | trown++ | 13:31 |
hubbot | weshay: trown's karma is now 40 | 13:31 |
weshay | trown++ | 13:31 |
hubbot | weshay: trown's karma is now 41 | 13:31 |
panda|rover | booyakasha ? | 13:31 |
panda|rover | booyakasha++ ? | 13:31 |
rlandy | trown: very nice :) | 13:33 |
panda|rover | quiquell|ruck: weird, the script has not finished yet on my console | 13:33 |
panda|rover | it's hanging there | 13:33 |
quiquell|ruck | panda|rover: Same here | 13:35 |
panda|rover | I still see this 25fc8390fb49c6660b4dc953a2e2f0f3c977ff2b as ongoing | 13:35 |
*** trown|brb has joined #oooq | 13:35 | |
*** trown has quit IRC | 13:35 | |
panda|rover | oh | 13:35 |
panda|rover | it's promoting two hashes ? | 13:35 |
panda|rover | in a row ? | 13:35 |
panda|rover | that should not happen ... | 13:36 |
panda|rover | oh no no | 13:36 |
panda|rover | it's promoting phase1 | 13:36 |
panda|rover | Promoting the container images for dlrn hash 25fc8390fb49c6660b4dc953a2e2f0f3c977ff2b on queens to current-tripleo-rdo | 13:36 |
panda|rover | but it's working smoothly | 13:36 |
panda|rover | the overload theory stregthen | 13:37 |
panda|rover | I think we need to put the global lock in place .. | 13:37 |
rlandy | trown: what was the fix? | 13:37 |
rlandy | trown: can I rerun from https://etherpad.openstack.org/p/libvirt-setup-fake-nodepool-poc or are there more instructions? | 13:38 |
panda|rover | and then we can merge the retries fix from ykarel | 13:38 |
quiquell|ruck | panda|rover: More than a global lock, just one promoter script with a scheduler | 13:39 |
quiquell|ruck | This way we can even give priorities to different promotions | 13:39 |
quiquell|ruck | panda|rover: It's stuck here 28754 | 13:40 |
quiquell|ruck | Here /usr/bin/python2 /tmp/ansible_1FfWtT/ansible_module_docker_image.py | 13:41 |
panda|rover | quiquell|ruck: is that a process id ? | 13:41 |
panda|rover | ok | 13:41 |
quiquell|ruck | And the directory doesn't exist | 13:41 |
panda|rover | quiquell|ruck: look at the docker images output | 13:42 |
panda|rover | quiquell|ruck: it's pushing current-tripleo-rdo tags to docker.io | 13:43 |
jfrancoa | panda|rover: weshay: do you have a moment for a question? | 13:43 |
panda|rover | jfrancoa: it's going to cost you | 13:43 |
quiquell|ruck | panda|rover: that's phase 1? | 13:43 |
jfrancoa | panda|rover: whatever it is, I'll pay it ;-) | 13:43 |
panda|rover | jfrancoa: I like gazpacho a lot, you know ? | 13:43 |
weshay | jfrancoa, sure.. in 1-1 but can irc now or chat in a bit | 13:43 |
panda|rover | jfrancoa: shoot | 13:44 |
panda|rover | quiquell|ruck: yes | 13:44 |
quiquell|ruck | panda|rover: So it's doing the right job, just promoting phase 1 too | 13:44 |
quiquell|ruck | :-) | 13:44 |
panda|rover | weshay: phase1 of the previous promotion | 13:44 |
jfrancoa | panda|rover: we were wondering if it would be possible to create a pipeline similar to the experimental one, but dedicated to upgrades | 13:44 |
panda|rover | jfrancoa: I think we have to discuss it with rdo infra folks | 13:44 |
trown|brb | rlandy: I think the fix was the hostname stuff I added to the patch, but ya I updated the etherpad with the updated patch | 13:44 |
*** trown|brb is now known as trown | 13:45 | |
jfrancoa | panda|rover: the thing is that many patches are being backported in tht to former releases, and there is no upgrades job running in that project (which is normal because some of the jobs are not as stable as we'd wish) | 13:45 |
panda|rover | trown: can you stop being successful, there will be nothing left to do for the rest of the sprint :) | 13:45 |
trown | irc is acting goofy | 13:45 |
jfrancoa | panda|rover: but, if we could have an option to trigger all upgrade-related jobs upon request, something like "check-rdo-upgrades" | 13:45 |
*** udesale__ has joined #oooq | 13:45 | |
trown | panda|rover: lol... there is plenty to do to clean up the mess I have made... I was almost ready to give up on the entire sprint yesterday :P | 13:46 |
rlandy | trown: vm flavor? patch still has subnode-2 as control | 13:46 |
rlandy | subnode-1 | 13:46 |
trown | rlandy: ya I switched that back, and made them both big... like I said I changed a lot at the same time | 13:46 |
quiquell|ruck | panda|rover, trown: We are also overloading the sprints ? | 13:46 |
* trown has poor operational discipline | 13:46 | |
quiquell|ruck | With too much success | 13:47 |
panda|rover | quiquell|ruck: we are always overloading sprints | 13:47 |
trown | rlandy: but what is in that patcheset is what worked for me | 13:47 |
rlandy | trown: np - just checking so I test the right thing and avoid asking questions | 13:47 |
* rlandy reads through the diffs | 13:47 | |
trown | rlandy: and I suspect it was just the hostname change that did it, because that looked to be what was causing rabbitmq to barf when I went back to a release I actually know how to troubleshoot well | 13:48 |
panda|rover | jfrancoa: technically it's perfectly doable, the problem is the load this could cause on rdocloud, that's why we should talk with rdo infra | 13:48 |
panda|rover | jfrancoa: we can also try to find suitable alternatives | 13:48 |
jfrancoa | panda|rover: I submitted something in between, which makes use of the experimental pipeline: https://review.rdoproject.org/r/#/c/13420/1/zuul/upstream.yaml@307 | 13:48 |
rlandy | trown: can we pls change out your personal key? | 13:49 |
rlandy | --upload '/home/trown/.ssh/id_rsa.pub:/root/.ssh/authorized_keys' | 13:49 |
trown | rlandy: whoops, sure | 13:49 |
* rlandy has to manually correct that each time | 13:49 | |
jfrancoa | panda|rover: but the amount of jobs triggered would be huge (as the ovb-experimental are also included + the upgrades ones) | 13:50 |
trown | rlandy: will ~/ work for now? | 13:50 |
rlandy | trown: yep | 13:50 |
rlandy | anything we can merge | 13:50 |
rlandy | for now | 13:50 |
trown | rlandy: ok updated that | 13:50 |
rlandy | trown: I needed to edit /etc/resolv.conf | 13:51 |
rlandy | on subnode-0 | 13:51 |
panda|rover | jfrancoa: mmhh | 13:51 |
rlandy | to resolve the repos | 13:51 |
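The manual fix rlandy describes amounts to pointing the subnode at a working resolver; a sketch of it, with the nameserver address purely an example:

```shell
# Hypothetical workaround: give subnode-0 a resolver so the repo
# hostnames resolve. 8.8.8.8 is an example address, not the one used.
sudo tee /etc/resolv.conf >/dev/null <<'EOF'
nameserver 8.8.8.8
EOF
```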
weshay | trown, what was the issue? dns? | 13:51 |
trown | rlandy: hmm I have not needed to do that... | 13:51 |
rlandy | maybe just my setup | 13:51 |
rlandy | ok - let's try this again | 13:51 |
trown | weshay: still not 100% sure, but I think it was rabbitmq hostname issues | 13:51 |
trown | weshay: rabbitmq is very particular about that stuff | 13:52 |
hrybacki | rlandy: this morning confirmed that rebase didn't fix the issue either | 13:52 |
weshay | yes it is | 13:52 |
weshay | trown, is everything pushed to gerrit? | 13:52 |
trown | weshay: ya | 13:53 |
rlandy | hrybacki: then I think we still have a problem with the quickstart change https://github.com/openstack/tripleo-quickstart/commit/c8f9d725ea0306f980b96eff42b0f99230c5b8c7 | 13:53 |
*** holser__ has quit IRC | 13:54 | |
rlandy | and whatever it was fixed to | 13:54 |
trown | testing fresh with only what is in gerrit now to make sure | 13:54 |
rlandy | weshay: ^^ hrybacki is having problem with changes being copied | 13:54 |
*** holser__ has joined #oooq | 13:54 | |
quiquell|ruck | rlandy: Checking RHEL 7.5 I have arrived at "Failed to set up security class mapping" | 13:54 |
rlandy | weshay: you changed that revert? | 13:55 |
weshay | rlandy, that was reverted | 13:55 |
weshay | https://github.com/openstack/tripleo-quickstart/commit/05cce7bd240192b3682ba0545f863ab8e8ed5229 | 13:55 |
weshay | rlandy, the bug the user hit was fixed w/ https://github.com/openstack/tripleo-quickstart/commit/4ab8b782768c91f6ba1be9a9f6f58634a5e202b3 | 13:55 |
rlandy | quiquell|ruck: is that same error as in the screen shot and as on the other two stacks? | 13:56 |
rlandy | if so, I forwarded weshay emails from bob fournier regarding the IPA image | 13:57 |
quiquell|ruck | rlandy: Yep | 13:57 |
rlandy | ok - pls see his response to the emails | 13:57 |
rlandy | I was told to drop the investigation at that point | 13:57 |
quiquell|ruck | weshay: Can you send me those e-mails those e-mails | 13:59 |
* quiquell|ruck rerepeating | 13:59 | |
quiquell|ruck | rlandy: Found a bug from him | 14:02 |
quiquell|ruck | https://bugzilla.redhat.com/show_bug.cgi?id=1566110 | 14:02 |
openstack | bugzilla.redhat.com bug 1566110 in openstack-selinux "selinux errors in IPA with OSP-12 using RHEL 7.5" [High,Closed: notabug] - Assigned to lhh | 14:02 |
rlandy | quiquell|ruck: forwarding you the last emails | 14:03 |
quiquell|ruck | rlandy: most obliged | 14:04 |
quiquell|ruck | are we the ones building the IPA images, or do we have to use "official" ones? | 14:05 |
rlandy | quiquell|ruck: we don't build any images | 14:05 |
rlandy | if you look at the release file, you will see the IPA images defined | 14:06 |
rlandy | ie: where we pull it from | 14:06 |
rlandy | rhos-release install | 14:07 |
quiquell|ruck | Maybe with RHEL 7.5 we are exposing some selinux stuff into the introspected nodes | 14:08 |
quiquell|ruck | Or even a stupid kernel boot option ? | 14:08 |
rlandy | weshay: to get back to hrybacki's problem, he is making changes to tqe on the undercloud in /opt/stack and not seeing those changes picked up in the toci-gate-test run (manually, on an env set up by the reproducer) ... | 14:12 |
rlandy | the zuul_changes are picked up | 14:12 |
rlandy | but not subsequent changes | 14:12 |
weshay | hrybacki, post the log again | 14:12 |
hrybacki | weshay: oh it's so long gone. I'll look up the channel log shortly though | 14:12 |
weshay | faker | 14:12 |
hrybacki | lol | 14:12 |
weshay | :) | 14:13 |
rlandy | hrybacki: weshay: https://paste.fedoraproject.org/paste/KMIyEKt5JSrYYhd4j8Xe1w | 14:13 |
panda|rover | jfrancoa: you have 5 minutes to chat ? | 14:13 |
jfrancoa | panda|rover: sure | 14:13 |
*** zoli|afk is now known as zoli | 14:14 | |
hubbot | FAILING CHECK JOBS: gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv, tripleo-quickstart-extras-gate-newton-delorean-full-minimal | check logs @ https://review.openstack.org/472607 and fix them ASAP. | 14:14 |
*** zoli is now known as zoli|wfh | 14:14 | |
hrybacki | weshay: that was the complete run. Here is the specific example of missing bits: https://paste.fedoraproject.org/paste/C18QtGEix-K9Wrk-I9hguA | 14:14 |
panda|rover | jfrancoa: bj/gcerami ? | 14:14 |
jfrancoa | panda|rover: joining | 14:15 |
weshay | hrybacki, this is something you are running locally? | 14:15 |
hrybacki | weshay: so I am running a reproducer script that is pulling in changes for OOOQ and OOOQ-E | 14:15 |
hrybacki | after the initial deployment I log onto the undercloud and make changes in /opt/stack/* (to avoid submitting patchsets for test runs) | 14:15 |
*** gkadam has quit IRC | 14:16 | |
hrybacki | then I invoke the toci script expecting it to use what lives in /opt/stack/* | 14:16 |
hrybacki | but what we find is /tmp/.quickstart/* does not line up with what is in /opt/stack/* | 14:16 |
hrybacki | my guess is quickstart.sh is re-pulling the changes from gerrit | 14:16 |
hrybacki | IIUC that is not the intended behavior | 14:17 |
hrybacki | and a one letter typo => new patchset :( | 14:18 |
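One way to confirm the mismatch hrybacki describes is to compare the commit each tree is actually at; a sketch, with the directory layout assumed from the chat rather than verified against the reproducer:

```shell
# Hypothetical helper: report whether two checkouts of a repo point at
# the same commit, e.g. /opt/stack/<repo> vs /tmp/.quickstart/<repo>.
same_head() {
    local a b
    a=$(git -C "$1" rev-parse HEAD) || return 2
    b=$(git -C "$2" rev-parse HEAD) || return 2
    if [ "$a" = "$b" ]; then echo same; else echo different; fi
}
# Example invocation (paths are assumptions from the chat):
# same_head /opt/stack/tripleo-quickstart /tmp/.quickstart/tripleo-quickstart
```

If this prints `different` after a toci run, quickstart.sh really did re-fetch from gerrit rather than use the local tree.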
rlandy | quiquell|ruck: panda|rover: note incoming review from rasca adding rhos-13 pipeline ( see discussion on #rhos-dev) | 14:21 |
weshay | hrybacki, hrm... | 14:21 |
rlandy | weshay: ^^ fyi | 14:21 |
weshay | k | 14:21 |
weshay | quiquell|ruck, panda|rover tool I quickly ginned up to help discover the owning DFG for escalations https://github.com/weshayutin/google_sheet_search | 14:21 |
weshay | hrybacki, if you could get that setup and tmate us in that would be very helpful | 14:22 |
hrybacki | weshay: yeah. Not sure if it's a bug or not but it does require me to bombard CI more than I'd prefer when testing stuff out | 14:22 |
hrybacki | If it is a new bug (or a recurrence of an old one) I'm happy to create/add to an old bug report | 14:23 |
hrybacki | rlandy: weshay ^^ | 14:23 |
rlandy | hrybacki: the old bug report was a diff problem to start with | 14:24 |
quiquell|ruck | rlandy: Going to check thanks | 14:24 |
quiquell|ruck | weshay: Watched | 14:25 |
hrybacki | ack. But the intended behavior is for what lives in /opt/stack to be what toci uses and subsequently what ends up in /tmp/.quickstart, right rlandy? | 14:25 |
rlandy | hrybacki: stupid question - do you clean out LOCAL_WORKING_DIR="$WORKSPACE/.quickstart" when you rerun? | 14:26 |
hrybacki | rlandy: on the undercloud? | 14:26 |
hrybacki | rlandy: every execution is on a fresh undercloud (I've never had success running a toci script more than once on a system) | 14:27 |
rlandy | ok - should create a new tmp dir anyways | 14:28 |
* hrybacki nods | 14:28 | |
hrybacki | it does. I also wipe out the old /tmp/repro dirs just in case | 14:28 |
rlandy | quiquell|ruck: wrt rhos-7.5 | 14:28 |
rlandy | bob questions the way we are extracting the IPA image | 14:31 |
rlandy | you can check his untar comments | 14:31 |
rlandy | the envs are still up | 14:31 |
rlandy | tbh - I don't see what we are doing wrong all of a sudden | 14:31 |
rlandy | but we need to prove it | 14:31 |
rlandy | I don't think we magically create selinux issues | 14:32 |
rlandy | the question is if somehow others are turning off selinux so we are the ones showing it | 14:32 |
panda|rover | :( lp is timing out right after I wrote a long bug description | 14:35 |
quiquell|ruck | rlandy: I will check, I have seen a success now running only the introspection on one of the baremetals | 14:37 |
quiquell|ruck | rlandy: Do you know how to check the kernel boot options of the IPA image? | 14:51 |
quiquell|ruck | weshay, adarazs: We can see the script now | 14:53 |
adarazs | quiquell|ruck: ? you mean start the meeting early? | 14:54 |
quiquell|ruck | https://bluejeans.com/7891065232 | 14:54 |
quiquell|ruck | adarazs: Yes, If we all can | 14:54 |
weshay | aye joining | 14:54 |
adarazs | me too | 14:54 |
rlandy | panda|rover: pls see question on #rhos-ops | 15:01 |
rlandy | is there any milestone that needs to be achieved during next week? we are planning a scheduled outage because of a networking maintenance job. Is it ok, or should any specific date be avoided? | 15:01 |
rlandy | rdocloud | 15:02 |
rlandy | quiquell|ruck: ^^ fyi | 15:02 |
panda|rover | rlandy: sorry I wasn't in the program call, last queens import was 3 days ago | 15:04 |
rlandy | panda|rover: pls join #rhos-ops for the discussion | 15:04 |
weshay | quiquell|ruck, https://github.com/rdo-infra/ci-config/tree/master/ci-scripts | 15:13 |
*** ykarel has quit IRC | 15:17 | |
*** quiquell|ruck is now known as quiquell|off | 15:17 | |
rlandy | quiquell|off: ugh - sorry - missed you :( | 15:18 |
rlandy | going to answer the IPA question | 15:18 |
panda|rover | quiquell|off: https://review.rdoproject.org/r/13429 | 15:18 |
panda|rover | d'oh | 15:18 |
quiquell|off | rlandy: Just paste it here I will read it tomorrow | 15:18 |
rlandy | k - good night | 15:18 |
panda|rover | quiquell|off: have a nice rest of day | 15:18 |
rlandy | sorry - got distracted on other channel | 15:19 |
hrybacki | rlandy: weshay is there a way to stop the log squelching in the toci script? | 15:21 |
weshay | hrybacki, like what in particular? | 15:22 |
hrybacki | weshay: I want verbose ansible output with none of this `no_log: true` business for debugging | 15:22 |
weshay | hrybacki, the ansible logs rarely help much | 15:23 |
*** skramaja has quit IRC | 15:23 | |
hrybacki | weshay: I feel like I'm blindly searching for a needle in a haystack | 15:24 |
panda|rover | weshaystack | 15:24 |
weshay | hrybacki, tmate your env | 15:25 |
hrybacki | weshay: ? | 15:26 |
hrybacki | I just need to see what the tripleo-inventory role is actually doing | 15:27 |
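For the verbosity half of what hrybacki is after, Ansible honors an environment variable, so a toci run can be made chatty without editing the scripts; note this does not undo tasks marked `no_log: true` — those have to be edited in the playbook itself:

```shell
# Raise Ansible verbosity for the next run; equivalent to passing -vvv.
# This does NOT override tasks that set no_log: true.
export ANSIBLE_VERBOSITY=3
echo "ANSIBLE_VERBOSITY=$ANSIBLE_VERBOSITY"
```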
*** tosky has quit IRC | 15:27 | |
*** tosky has joined #oooq | 15:29 | |
hrybacki | weshay: http://etherpad.corp.redhat.com/tls-everywhere-on-rdo-cloud -- current issues that aren't struck out are still persisting | 15:30 |
rlandy | trown: got delayed - running with your latest changes now | 15:34 |
trown | rlandy: k... I think I might be missing something, my tests from scratch have failed... one was on fs037 though | 15:35 |
*** bogdando has quit IRC | 15:36 | |
rlandy | running so far | 15:37 |
*** ykarel has joined #oooq | 15:45 | |
*** holser__ has quit IRC | 15:47 | |
*** holser__ has joined #oooq | 15:48 | |
*** udesale__ has quit IRC | 15:52 | |
trown | rlandy: I think we need to reboot the vms after setting the hostname... or maybe just set it with the hostname command too... | 15:53 |
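The alternative trown mentions — applying the hostname immediately instead of waiting for a reboot — would look roughly like this on a subnode; the hostname value is an example, not the one the patch uses:

```shell
# Hypothetical: set the hostname both persistently and for the running
# system, so rabbitmq sees it without a reboot. Requires root.
sudo hostnamectl set-hostname subnode-0.localdomain
sudo hostname subnode-0.localdomain
```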
rlandy | trown: getting fail ssh'ing to subnode-1 | 15:54 |
rlandy | failure | 15:54 |
*** links has quit IRC | 15:55 | |
rlandy | toci-gate-test | 15:55 |
rlandy | says ip is unreachable | 15:55 |
trown | hmmm that is different than what I was hitting | 15:56 |
rlandy | ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=Verbose -o PasswordAuthentication=no -o ConnectionAttempts=32 -tt -i /etc/nodepool/id_rsa 192.168.122.99 sudo mkdir -p /opt/stack/new/tripleo-ci | 15:56 |
rlandy | times out | 15:56 |
rlandy | new error | 15:57 |
*** agopi has joined #oooq | 15:57 | |
rlandy | 192.168.122.17 is the undercloud | 15:58 |
*** jfrancoa has quit IRC | 16:06 | |
*** florianf has quit IRC | 16:16 | |
myoung | weshay: panda|rover: does this look ok (sprint 12 planning summary for email) - would like to send this shortly: https://docs.google.com/document/d/1ZuK4uvRO-kpjiErChzwI5s3PU2pZxwQal5IRYWl32qU/edit?usp=sharing | 16:16 |
myoung | chandankumar: ^^ | 16:18 |
trown | rlandy: ok.. that is different than what I am hitting... apparently I am hitting some new legit DIB bug... see #tripleo | 16:20 |
trown | rlandy: I do think we probably need a reboot of the nodes though... adding that in | 16:21 |
* rlandy tries that | 16:23 | |
weshay | trown, rlandy https://bugs.launchpad.net/tripleo/+bug/1765123 | 16:25 |
openstack | Launchpad bug 1765123 in tripleo "dib-run-parts fails with /tmp/tmpkGJoNl/pre-install.d/05-rpm-epel-release: line 9: DISTRO_NAME: unbound variable" [Critical,Triaged] | 16:25 |
trown | rlandy: I put up a new patch with that, that is also on top of the fix for DIB issue | 16:25 |
trown | rlandy: updating etherpad as well | 16:25 |
panda|rover | trown: is https://review.openstack.org/561630 ready to be merged ? | 16:29 |
trown | panda|rover: no | 16:29 |
rlandy | thanks | 16:30 |
*** kopecmartin has quit IRC | 16:30 | |
weshay | trown, you need to change your patch to depend-on: https://review.openstack.org/#/c/562325/ | 16:32 |
trown | weshay: I am not so sure that patch will work tbh... I put my patch on top of alex's that will exclude dib from current repo | 16:33 |
weshay | trown, k k.. I'm just not sure if the bug is in the latest package | 16:34 |
weshay | in current-tripleo | 16:34 |
trown | that could not get through promotion | 16:34 |
weshay | k | 16:35 |
weshay | trown, sorry.. btw.. for the libvirt playbook.. run as a regular user? | 16:35 |
*** lucasagomes is now known as lucas-afk | 16:35 | |
trown | weshay: ya regular user | 16:36 |
*** panda|rover is now known as panda|rover|off | 16:36 | |
trown | rlandy: I think I am hitting the same thing as you now... I think this is why I hardcoded the key | 16:42 |
trown | rlandy: trying to find a better way | 16:42 |
rlandy | trown: yep - that was not the case before | 16:44 |
weshay | trown, mind if I add | 16:44 |
weshay | - name: ensure libvirt volume path exists | 16:44 |
weshay |   file: | 16:44 |
weshay |     path: "{{ libvirt_volume_path }}" | 16:44 |
weshay |     state: directory | 16:44 |
weshay | to roles/libvirt/setup/common/tasks/main.yml | 16:45 |
rlandy | I just wanted to get the review to a mergeable state | 16:45 |
rlandy | hence the request to remove the hardcoded key | 16:45 |
trown | rlandy: ya, and annoying to have to change that anytime a new patchset is up | 16:45 |
weshay | ? | 16:46 |
trown | weshay: line 54 https://review.openstack.org/#/c/561630/5/roles/libvirt/setup/overcloud/tasks/fake_nodepool.yml | 16:47 |
trown | weshay: I had it hardcoded to /home/trown/... | 16:47 |
trown | weshay: but what I have there now doesn't actually work | 16:48 |
weshay | don't think it's related | 16:48 |
weshay | was failing on | 16:49 |
weshay | TASK [libvirt/setup/common : Start volume pool] ********************************************************************************************************************************************************************* | 16:49 |
weshay | Wednesday 18 April 2018 12:39:16 -0400 (0:00:00.461) 0:00:07.681 ******* | 16:49 |
weshay | fatal: [whayutin-testbox]: FAILED! => {"changed": false, "failed": true, "msg": "cannot open directory '/opt/vm_images': No such file or directory"} | 16:49 |
rlandy | I got that before | 16:49 |
rlandy | create the /opt/vm_images dir | 16:50 |
weshay | rlandy, right.. but ansible should do that | 16:51 |
weshay | :) | 16:51 |
weshay | trown, it won't hurt anything | 16:51 |
trown | weshay: ya we are talking about 2 different things :P | 16:53 |
weshay | trown, rlandy updated the review | 16:54 |
trown | weshay: there are a million things to clean up, that is what will become the sprint, I just want to get something that works | 16:54 |
trown | weshay: if you hack on the same review though... that will get messy | 16:54 |
weshay | ya.. I don't like doing it | 16:54 |
weshay | gerrit sucks in that regard | 16:54 |
weshay | so.. what's a good workflow.. two diff reviews and then reconcile them | 16:54 |
weshay | ? | 16:54 |
weshay | anyone have a sec for a zuul config question | 17:05 |
*** amoralej is now known as amoralej|off | 17:07 | |
rlandy | here we go again | 17:12 |
*** trown is now known as trown|lunch | 17:12 | |
rlandy | Set hostname correctly for subnode-0 - "Failed to connect to the host via ssh | 17:14 |
* rlandy goes back to old changes | 17:14 | |
*** ykarel has quit IRC | 17:21 | |
*** holser__ has quit IRC | 17:24 | |
*** marios has quit IRC | 17:37 | |
*** zoli|wfh is now known as zoli|gone | 18:03 | |
*** ykarel has joined #oooq | 18:04 | |
*** ykarel has quit IRC | 18:19 | |
*** trown|lunch is now known as trown | 18:30 | |
trown | rlandy: I updated review with a fix for the ssh key issue | 18:32 |
trown | rlandy: it now defaults to ~/.ssh/id_rsa.pub ... but can be overridden, and actually works :P | 18:33 |
rlandy | cool - will try in a bit - just fixing some hardware | 18:34 |
trown | the fix from alex for the undercloud install dib issue did not work for me on queens... trying pike | 18:36 |
*** holser__ has joined #oooq | 18:44 | |
trown | oh duh... just realized I have not been passing ZUUL_CHANGES to toci | 18:46 |
*** Goneri has quit IRC | 18:50 | |
*** tosky has quit IRC | 18:52 | |
*** tosky has joined #oooq | 18:55 | |
*** tesseract has quit IRC | 19:15 | |
*** atoth has quit IRC | 19:30 | |
*** holser__ has quit IRC | 19:36 | |
hrybacki | trown: woo! that was another one I was gonna bring up :P | 19:55 |
*** dmellado has quit IRC | 20:45 | |
*** holser__ has joined #oooq | 21:06 | |
rlandy | trown: hit a failure TASK [repo-setup : Setup repos on live host] subnode-2 | 21:07 |
rlandy | see that? | 21:07 |
rlandy | subnode-2? | 21:07 |
rlandy | hosts only has subnode 0 and 1 | 21:08 |
*** strattao has quit IRC | 21:09 | |
trown | nah different hosts... gotta run though | 21:11 |
trown | doesn't matter what we name them in the dummy setup part though | 21:11 |
*** trown is now known as trown|outtypewww | 21:11 | |
*** strattao has joined #oooq | 21:12 | |
*** apetrich_ has joined #oooq | 21:27 | |
*** holser__ has quit IRC | 21:48 | |
*** apetrich_ has quit IRC | 21:49 | |
*** rfolco is now known as rfolco|off | 21:59 | |
*** rlandy has quit IRC | 22:18 | |
*** yolanda has quit IRC | 22:32 | |
*** tosky has quit IRC | 23:02 | |
*** strattao has quit IRC | 23:14 | |
*** strattao has joined #oooq | 23:15 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!