*** jmasud has joined #oooq | 00:04 | |
*** jmasud has quit IRC | 00:43 | |
*** holser has quit IRC | 03:17 | |
*** ykarel|away is now known as ykarel | 04:23 | |
*** ratailor has joined #oooq | 04:38 | |
*** jtomasek has joined #oooq | 04:59 | |
*** jmasud has joined #oooq | 05:13 | |
*** saneax has joined #oooq | 05:26 | |
*** jtomasek has quit IRC | 05:37 | |
ysandeep | folks o/ , have you seen this kind of error before in a container build: "SystemError: The following jobs were incomplete: [{'swift-base" ? but the container build itself seems successful | 05:56 |
ysandeep | https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-containers-rhel-8-rhos-17-build-push/75dc618/logs/build.log ? | 05:56 |
bhagyashris | ysandeep, hey, there was some discussion about this on Friday, you can see the logs here http://paste.openstack.org/show/794448/ | 06:01 |
ysandeep | bhagyashris, thank you o/ | 06:02 |
*** matbu has joined #oooq | 06:06 | |
*** udesale has joined #oooq | 06:19 | |
*** jmasud has quit IRC | 06:40 | |
*** jtomasek has joined #oooq | 06:41 | |
*** jmasud has joined #oooq | 06:42 | |
*** yolanda has joined #oooq | 06:43 | |
*** ysandeep is now known as ysandeep|afk | 06:57 | |
*** skramaja has joined #oooq | 07:04 | |
*** jmasud has quit IRC | 07:12 | |
*** ccamacho has joined #oooq | 07:32 | |
*** tosky has joined #oooq | 07:39 | |
*** amoralej|off is now known as amoralej | 07:52 | |
*** jpena|off is now known as jpena | 07:56 | |
*** ysandeep|afk is now known as ysandeep | 08:08 | |
*** dtantsur has joined #oooq | 08:12 | |
*** jfrancoa has joined #oooq | 08:40 | |
*** sshnaidm|afk is now known as sshnaidm | 08:44 | |
*** jtomasek has quit IRC | 08:48 | |
*** jtomasek has joined #oooq | 08:50 | |
*** apetrich has joined #oooq | 08:54 | |
*** holser has joined #oooq | 08:56 | |
*** jschlueter has joined #oooq | 09:12 | |
*** ccamacho has quit IRC | 09:55 | |
*** jbadiapa has joined #oooq | 10:02 | |
akahat | cgoncalves, o/ | 10:30 |
cgoncalves | akahat, hi | 10:30 |
akahat | cgoncalves, I need to talk about this: https://review.opendev.org/#/c/731501 | 10:30 |
akahat | cgoncalves, will the enable_provider_drivers not work here? | 10:31 |
akahat | I mean we have only three drivers: amphora, octavia and ovn. | 10:31 |
*** ccamacho has joined #oooq | 10:31 | |
cgoncalves | akahat, ideally the OVN provider driver should be appended. we must not assume the 'octavia' and 'amphora' provider drivers are enabled | 10:32 |
akahat | cgoncalves, okay. so appending only ovn will work? | 10:34 |
cgoncalves | akahat, yes, appended if the provider driver isn't already present. I am not sure I follow what's driving this change though. could you please help me understand? | 10:40 |
*** jtomasek has quit IRC | 10:41 | |
akahat | cgoncalves, this will help to understand: https://tree.taiga.io/project/tripleo-ci-board/task/1699?kanban-status=1447275 | 10:42 |
*** jtomasek has joined #oooq | 10:43 | |
*** derekh has joined #oooq | 10:46 | |
cgoncalves | akahat, maybe a better approach would be to construct the 'enabled_provider_drivers' in tempestconf based on the enabled provider drivers that you can get via Octavia API | 10:48 |
cgoncalves | btw, you have a typo in the conf setting. it is "enable***d***_provider_drivers" | 10:49 |
cgoncalves | akahat, https://docs.openstack.org/api-ref/load-balancer/v2/?expanded=list-providers-detail#list-providers | 10:49 |
akahat | cgoncalves, okay. I'll fix it. | 10:52 |
akahat | cgoncalves, thanks :) | 10:52 |
cgoncalves | you're welcome | 10:54 |
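The approach cgoncalves suggests above can be sketched with the load-balancer CLI, which exposes the same data as the List Providers API; the exact value format octavia-tempest-plugin expects for enabled_provider_drivers (plain names vs. name:description pairs) is an assumption here and should be checked against the plugin's config options.

    # Sketch: discover the enabled provider drivers from the Octavia API
    # and join them into a comma-separated config value.
    providers=$(openstack loadbalancer provider list -f value -c name | paste -sd, -)
    echo "enabled_provider_drivers = ${providers}"   # e.g. amphora,octavia,ovn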
arxcruz | sshnaidm: hey, how can I add an ansible collection in tq? | 10:58 |
arxcruz | sshnaidm: context: os_tempest is now using openstack.cloud collection | 10:58 |
arxcruz | and we need to add it on tq | 10:59 |
arxcruz | I'm checking the tripleo-ansible-operator; it has the python.py file and you install it using the tripleo-quickstart-extras requirements, but openstack.cloud doesn't have that, and I'm not sure if it's the right way | 10:59 |
arxcruz | I would do it with ansible-galaxy, but tq doesn't seem to have that | 11:00 |
sshnaidm | arxcruz, https://review.opendev.org/#/c/730083/ | 11:04 |
arxcruz | sshnaidm: danke Herr Shnaidm, I was seeing this path to follow but was unsure | 11:06 |
sshnaidm | arxcruz, de nada, señor Arx | 11:07 |
sshnaidm | or how do you say it in Brazilian Portuguese? :D | 11:07 |
arxcruz | de nada senhor Arx | 11:08 |
arxcruz | we don't have the ñ in portuguese | 11:08 |
arxcruz | in this case it's pronounced the same, just with a slight accent | 11:09 |
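For background on the question above, pulling in a collection such as openstack.cloud is normally done through an ansible-galaxy requirements file; the sketch below is generic (the file name and install path are assumptions), and the actual wiring for tripleo-quickstart is whatever the linked review 730083 implements.

    # Generic sketch, not the tq change itself:
    cat > collections-requirements.yml <<'EOF'
    collections:
      - name: openstack.cloud
    EOF
    ansible-galaxy collection install -r collections-requirements.yml \
        -p "${HOME}/.ansible/collections"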
*** jpena is now known as jpena|lunch | 11:32 | |
*** rfolco has joined #oooq | 11:37 | |
weshay|ruck | 0/ | 11:48 |
cgoncalves | \0 | 11:50 |
weshay|ruck | pojadhav|ruck, rfolco we should sync up | 11:53 |
pojadhav|ruck | weshay|ruck, yup | 11:53 |
weshay|ruck | k.. holding for rfolco | 11:53 |
rfolco | weshay|ruck, pojadhav|ruck need 2 min, will get a coffee | 11:54 |
*** rfolco is now known as rfolco|rover | 11:54 | |
weshay|ruck | https://meet.google.com/one-rbow-bcs | 11:56 |
weshay|ruck | pojadhav|ruck, 2020-06-08 08:33:52.573060 | primary | urllib3.exceptions.LocationParseError: Failed to parse: https://trunk.rdoproject.org/api-centos8-master-uc/api/report_result | 11:58 |
*** rlandy has joined #oooq | 12:06 | |
*** skramaja has quit IRC | 12:10 | |
*** skramaja has joined #oooq | 12:10 | |
rfolco|rover | weshay|ruck, https://review.opendev.org/#/c/733790 | 12:11 |
*** jfrancoa has quit IRC | 12:12 | |
*** jfrancoa has joined #oooq | 12:14 | |
weshay|ruck | rfolco|rover, pojadhav|ruck https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train/773e973/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 12:20 |
*** amoralej is now known as amoralej|lunch | 12:22 | |
*** ratailor has quit IRC | 12:31 | |
rlandy | rfolco|rover: do we have scrum today? | 12:39 |
rfolco|rover | rlandy, no, we have only on thu | 12:40 |
rlandy | oh ok - we took the once a week option | 12:40 |
rfolco|rover | per team's suggestion yep | 12:40 |
rfolco|rover | for this sprint only, as an experiment | 12:40 |
*** udesale_ has joined #oooq | 12:49 | |
weshay|ruck | rfolco|rover, that's the neutron ptl.. Slawomir Kaplonski | 12:52 |
*** udesale has quit IRC | 12:52 | |
ysandeep | rlandy, hey, sorry, I was out on Friday so I am not sure what you and chandankumar decided for the image build issue. Any luck with it? | 12:52 |
rfolco|rover | weshay|ruck, ok thanks | 12:52 |
weshay|ruck | rfolco|rover, irc slaweq | 12:53 |
rlandy | ysandeep: hi - yes - chandankumar changed some settings in the env review | 12:53 |
ysandeep | rlandy, whenever you are free, can we sync for a few minutes today? | 12:53 |
rlandy | ysandeep: but then that image expired :( | 12:53 |
rlandy | so I had to update the image | 12:54 |
*** saneax is now known as saneax_AFK | 12:54 | |
rlandy | ysandeep: yeah - just trying to kick the two diff image build jobs again so we can promote and clear the scenario failures | 12:54 |
rlandy | will ping you in a bit | 12:54 |
rfolco|rover | pojadhav|ruck, need 10 min before we sync | 12:55 |
ysandeep | rlandy, ack thanks! and yes, that container build is failing with weird SystemErrors; I saw you and marios had some discussion about it on Friday. | 12:55 |
pojadhav|ruck | rfolco|rover, okay | 12:56 |
*** jpena|lunch is now known as jpena | 13:03 | |
*** ykarel is now known as ykarel|afk | 13:11 | |
*** amoralej|lunch is now known as amoralej | 13:14 | |
rlandy | ysandeep: ok - if you have time now, let's chat | 13:16 |
ysandeep | rlandy, sure | 13:16 |
rlandy | ysandeep: https://meet.google.com/xpa-ceom-onm | 13:17 |
sshnaidm | rlandy, hi | 13:20 |
sshnaidm | rlandy, how is downstream ovb going? | 13:20 |
*** rlandy is now known as rlandy|mtg | 13:20 | |
rlandy|mtg | sshnaidm: not great - the introspection still fails | 13:20 |
rlandy|mtg | it looks like the cloud is very slow to respond to power actions | 13:21 |
sshnaidm | rlandy|mtg, timeout? | 13:21 |
rlandy|mtg | in mtg now - will show you in a bit | 13:21 |
sshnaidm | ack | 13:21 |
rlandy|mtg | introspection outright fails | 13:21 |
rlandy|mtg | no clear trace as to why | 13:21 |
weshay|ruck | pojadhav|ruck, rfolco|rover directories/files still missing in latest https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/job-output.txt | 13:31 |
rfolco|rover | 2020-06-08 08:37:50.462917 | primary | ok: All assertions passed | 13:35 |
rfolco|rover | 2020-06-08 08:37:50.463339 | | 13:35 |
rfolco|rover | 2020-06-08 08:37:50.500997 | primary | ok: All assertions passed | 13:35 |
rfolco|rover | missing /etc/pki/CA/private | 13:35 |
rfolco|rover | weshay|ruck, ok let me fix it and test the routine locally first | 13:36 |
weshay|ruck | rfolco|rover, what are you fixing? | 13:39 |
rfolco|rover | weshay|ruck, it should fail assertions | 13:39 |
rfolco|rover | isn't it? | 13:39 |
weshay|ruck | rfolco|rover, 2020-06-08 13:24:28.136248 | primary | "msg": "Assertion failed" | 13:39 |
rfolco|rover | weshay|ruck, I was looking at different job https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/ea5c167/job-output.txt | 13:42 |
*** TrevorV has joined #oooq | 13:43 | |
rfolco|rover | weshay|ruck, ok so it's doing what it's supposed to | 13:43 |
weshay|ruck | rfolco|rover, the build is failing appropriately, but we need to resolve the root cause of missing files still | 13:46 |
rfolco|rover | weshay|ruck, yeah, looking at which package provides it and comparing to green jobs | 13:46 |
rfolco|rover | weshay|ruck, openssl-libs-1.1.1c-2.el8_1.1.x86_64 is installed | 13:48 |
rfolco|rover | in the controller at least :) | 13:48 |
rfolco|rover | weshay|ruck, this check is weird... there is no "missing /etc/pki..." on rpm_va https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt | 13:54 |
rfolco|rover | so the grep fails | 13:54 |
rfolco|rover | let me look at this test | 13:54 |
rfolco|rover | just thinking out loud here... | 14:00 |
rfolco|rover | [rfolco@redbox tripleo-ci]$ rpm -Va openssl-libs-1.1.1c-2.el8_1.1.x86_64 /etc/pki/tls/private | 14:00 |
rfolco|rover | [rfolco@redbox tripleo-ci]$ echo $? | 14:00 |
rfolco|rover | 0 | 14:00 |
*** ykarel|afk is now known as ykarel | 14:09 | |
rfolco|rover | weshay|ruck, I think I know what the issue is | 14:09 |
rfolco|rover | weshay|ruck, the check is giving false negative | 14:09 |
weshay|ruck | rfolco|rover, keep in mind, the issue often shows up in the deploy.. what makes you think it's a false negative? | 14:11 |
rfolco|rover | the test now is giving false negative | 14:11 |
rfolco|rover | the fix | 14:11 |
rfolco|rover | weshay|ruck, if we tmate I can explain better | 14:12 |
weshay|ruck | k.. in mtg atm | 14:12 |
rfolco|rover | ok | 14:12 |
*** rlandy|mtg is now known as rlandy | 14:15 | |
rlandy | sshnaidm: hi .. is it possible that we see a significant slowdown on the nodes after changing the network? | 14:16 |
rlandy | sshnaidm: we have jobs timing out that never did before last week | 14:16 |
rfolco|rover | weshay|ruck, ok, I'm confident this is wrong and working on a fix... http://pastebin.test.redhat.com/872965 -- buggy code: https://github.com/openstack/tripleo-ci/blob/508376e178eab29f0debc5dbb40908d5dc985eb1/roles/oooci-build-images/tasks/image_sanity.yaml#L36 | 14:17 |
rfolco|rover | weshay|ruck, in summary: if (***AND ONLY IF***) we find /etc/pki/tls/private in rpm_Va output, we should check if it is "missing" | 14:21 |
rfolco|rover | updated pastebin shows this http://pastebin.test.redhat.com/872972 | 14:21 |
ysandeep | rlandy, fyi.. test run for that last task passed | 14:23 |
weshay|ruck | rfolco|rover, ah ya.. see what you mean.. /etc/pki is not listed here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt | 14:25 |
weshay|ruck | but is marked as failed here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/job-output.txt | 14:25 |
rfolco|rover | weshay|ruck, exactly, we should fail only if the file is in rpm_va plus "missing" | 14:25 |
rfolco|rover | patch coming | 14:26 |
weshay|ruck | k great | 14:26 |
sshnaidm | rlandy, maybe, but I don't know how it's possible | 14:31 |
sshnaidm | rlandy, routing in internal network should be better and faster actually.. | 14:32 |
*** TrevorV has quit IRC | 14:36 | |
sshnaidm | rlandy, do you see slowness in specific steps? | 14:37 |
rlandy | sshnaidm: either way, wrt OVB, we have an IPMI connection now - so that is good, but the response to power on/off is very slow | 14:37 |
sshnaidm | rlandy, logs? | 14:37 |
*** TrevorV has joined #oooq | 14:37 | |
rlandy | sshnaidm: see the last run on https://code.engineering.redhat.com/gerrit/#/c/200436 | 14:39 |
sshnaidm | rlandy, and which jobs time out? | 14:40 |
rlandy | sshnaidm: the MTU on the private subnet is 1450 and 1500 on external | 14:40 |
rlandy | sshnaidm: the ipa multinode job, for example | 14:40 |
rlandy | getting logs | 14:40 |
rlandy | should those MTUs match? | 14:41 |
sshnaidm | "No nodes are manageable at this time." | 14:42 |
sshnaidm | rlandy, the lower the MTU the better.. | 14:42 |
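One quick way to confirm the effective path MTU when the tenant network is set to 1450 is a don't-fragment ping sized to that MTU; the target address below is a placeholder.

    # 1422 bytes of ICMP payload + 8 (ICMP) + 20 (IP) = 1450 on the wire
    ping -M do -s 1422 -c 3 <undercloud_ip>
    # If this fails while smaller payloads pass, the path MTU is below 1450.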
rlandy | sshnaidm: so if you watch introspection, two things happen | 14:44 |
rlandy | the nodes stay in validating for some time | 14:44 |
rlandy | and then get to enroll | 14:44 |
rlandy | or manageable | 14:44 |
rlandy | then when the nodes are in manageable state, | 14:44 |
rlandy | and introspection does start, the power on command is issued | 14:45 |
rlandy | and registered | 14:45 |
rlandy | but the nodes don't power on for a very long time | 14:45 |
sshnaidm | rlandy, I don't see introspection start in this job; it fails before that with the "no manageable nodes" error | 14:46 |
rlandy | I reran on the node | 14:46 |
sshnaidm | rlandy, maybe when introspection starts in job, the nodes are still in enroll | 14:46 |
rlandy | sshnaidm: yes | 14:46 |
rlandy | the nodes take too long to get to every state that is expected | 14:47 |
rlandy | tbh, idk if this cloud can support OVB | 14:47 |
rlandy | the nodes are in fact still in verifying | 14:47 |
rlandy | and only get to enroll afterwards | 14:48 |
sshnaidm | rlandy, so maybe it's worth adding a polling check, with a timeout, to wait for them to reach the manageable state | 14:49 |
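The polling idea could look roughly like the following; the timeout values are arbitrary and a real job would likely do this as an Ansible task rather than a shell loop.

    # Wait until every ironic node reports "manageable", or give up after 15 min.
    timeout=900; interval=30; elapsed=0
    while true; do
        pending=$(openstack baremetal node list -f value -c "Provisioning State" \
                  | grep -vc '^manageable$')
        [ "$pending" -eq 0 ] && break
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "Timed out waiting for nodes to become manageable" >&2
            exit 1
        fi
        sleep "$interval"; elapsed=$((elapsed + interval))
    done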
rlandy | sshnaidm: maybe something else hit this cloud ... see the container build job for example: | 14:49 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?job_name=periodic-tripleo-containers-rhel-8-rhos-17-build-push | 14:49 |
rlandy | see the slow down after 06/02 | 14:49 |
rlandy | so 06/03 onwards | 14:50 |
rlandy | the jobs take twice as long | 14:50 |
rlandy | 2020-06-03 things go downhill | 14:51 |
sshnaidm | yeah, no idea what's happening.. | 14:53 |
rfolco|rover | weshay|ruck, https://review.opendev.org/734112 Fix image_sanity check | 14:57 |
rfolco|rover | weshay|ruck, https://review.rdoproject.org/r/27986 Test image_sanity fix | 14:57 |
rfolco|rover | pojadhav|ruck, ^ | 14:58 |
rlandy | sshnaidm: what's the equivalent of provider_net_shared_3 on rdocloud? | 15:01 |
rlandy | 38.145.32.0/22 | 15:02 |
sshnaidm | rlandy, yep, 38.145.32.0/22 | 15:03 |
weshay|ruck | rfolco|rover, I like the change, but look at the output here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt | 15:03 |
weshay|ruck | /etc/pki is not listed | 15:04 |
weshay|ruck | so either grep should return that it's not found | 15:04 |
weshay|ruck | https://review.opendev.org/#/c/734112/1/roles/oooci-build-images/tasks/image_sanity.yaml | 15:04 |
weshay|ruck | rfolco|rover, run a grep yourself and check the return code on something you find and something you don't | 15:05 |
weshay|ruck | rfolco|rover, and look at ur patch again | 15:05 |
*** skramaja has quit IRC | 15:06 | |
weshay|ruck | zbr, FYI.. https://review.opendev.org/#/c/734083/ | 15:07 |
weshay|ruck | zbr, extra points if we can move to centos-stream | 15:07 |
zbr | i use stream locally, switching to it should be very easy | 15:08 |
zbr | if something works on non-stream, likely it will still work with stream. | 15:08 |
weshay|ruck | zbr, ensuring we have the ecosystem upstream is ++ | 15:08 |
rfolco|rover | weshay|ruck, rpm -Va won't add all the files to the output.... /etc/pki/tls/private exists and is not added to the rpm_va.txt | 15:08 |
weshay|ruck | rfolco|rover, the issue is w/ grep | 15:08 |
weshay|ruck | and the return code | 15:08 |
weshay|ruck | afaict | 15:08 |
ykarel | rlandy, weshay|ruck can u review https://review.opendev.org/#/c/733471/ | 15:09 |
sshnaidm | rlandy, we can revert the tenant config change and see if it helps the other jobs, but we'll lose OVB for now | 15:09 |
rlandy | sshnaidm: discussing on #rhos-ops | 15:10 |
weshay|ruck | rfolco|rover, http://pastebin.test.redhat.com/873012 | 15:10 |
weshay|ruck | ykarel, looking | 15:10 |
rfolco|rover | weshay|ruck, yes, that's why I reverse-grep in my patch | 15:11 |
rlandy | ykarel: oh gosh, that constraints change keeps going | 15:11 |
rfolco|rover | weshay|ruck, I look for missing first | 15:11 |
ykarel | rlandy, yes :( | 15:11 |
rfolco|rover | weshay|ruck, then I reverse grep -v "/etc/pki..." | 15:11 |
weshay|ruck | rfolco|rover, ur right.. missed the -v | 15:11 |
weshay|ruck | :) | 15:11 |
rfolco|rover | weshay|ruck, I tested w/ a file that is not in the rpm_va output, a file that is in the output but not missing... and a file that is marked as missing | 15:12 |
rfolco|rover | weshay|ruck, the famous "works in my machine" | 15:13 |
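The corrected logic described here - fail only when a path both appears in the rpm -Va output and is flagged as missing - boils down to something like the shell sketch below; it is an illustration, not the actual Ansible task in image_sanity.yaml, and the file names are assumptions.

    rpm_va_log=rpm_va.txt            # assumed dump of `rpm -Va` from the built image
    path=/etc/pki/tls/private
    if grep '^missing' "$rpm_va_log" | grep -q "$path"; then
        echo "FAIL: $path reported missing in $rpm_va_log" >&2
        exit 1
    fi
    # A path not mentioned in the output at all is healthy and must not fail the check.
    echo "OK: $path not reported missing"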
weshay|ruck | zbr, can you work w/ carlos on enabling centos-stream for tripleo upstream? not RIGHT NOW, but generally speaking? https://review.opendev.org/#/q/topic:centos-stream+(status:open+OR+status:merged) | 15:16 |
weshay|ruck | we've talked about this is in the past | 15:16 |
zbr | i already added a watch on that change, I will help him. | 15:16 |
cgoncalves | cool, thanks | 15:17 |
weshay|ruck | zbr++ | 15:17 |
weshay|ruck | cgoncalves++ | 15:17 |
*** jmasud has joined #oooq | 15:19 | |
rlandy | ysandeep: fyi ... see #rhos-ops | 15:37 |
ysandeep | rlandy, checking | 15:37 |
rlandy | having discussion about slow downstream cloud | 15:37 |
weshay|ruck | rlandy, sshnaidm working this from another angle for ya.. /me is MOD, looking at the customer escalation board atm | 15:41 |
weshay|ruck | that issue is not tracked there, and imho should be | 15:41 |
cgoncalves | zbr, weshay|ruck: does tripleo build images from an existing centos image ("centos" DIB element) or "centos-minimal"? | 15:41 |
*** jmasud has quit IRC | 15:44 | |
weshay|ruck | sec.. dealing w/ psi issues | 15:45 |
weshay|ruck | rlandy, sshnaidm open a lp to track, and mark promotion blocker | 15:46 |
rlandy | weshay|ruck: ack | 15:47 |
*** ykarel is now known as ykarel|away | 15:51 | |
*** ysandeep is now known as ysandeep|afk | 15:54 | |
rlandy | rfolco|rover: can you post that LP you had on the failing container build when they all pass? | 15:59 |
rlandy | weshay|ruck: ^^ | 15:59 |
weshay|ruck | I'm not aware of that | 16:00 |
weshay|ruck | I'll check the hackmd | 16:00 |
rlandy | weshay|ruck: rfolco|rover: got it ... https://bugs.launchpad.net/tripleo/+bug/1879365 | 16:04 |
openstack | Launchpad bug 1879365 in tripleo "[container build] SystemError: The following jobs were incomplete: state=finished" [High,Incomplete] | 16:04 |
*** udesale_ has quit IRC | 16:08 | |
*** dtantsur is now known as dtantsur|afk | 16:10 | |
*** ysandeep|afk is now known as ysandeep | 16:15 | |
*** jmasud has joined #oooq | 16:17 | |
rlandy | weshay|ruck: sshnaidm: from #rhos-ops, it looks like there are people working on the downstream cloud slowness and korde "I've bumped up the urgency in the case again." | 16:22 |
rlandy | do we want another blocker LP? | 16:22 |
sshnaidm | idk, but I'd like to know about such cases asap and not waste hours trying to find what's wrong | 16:23 |
sshnaidm | the notification part doesn't seem to work at all | 16:23 |
rlandy | ok - creating one anyways | 16:24 |
weshay|ruck | rlandy, what's the bug #? | 16:24 |
weshay|ruck | notification? | 16:25 |
weshay|ruck | sshnaidm, which part are you speaking to? | 16:25 |
sshnaidm | weshay|ruck, if cloud is broken we should know about that asap | 16:25 |
weshay|ruck | rlandy, bz is probably more appropriate in this case | 16:25 |
rlandy | https://one.redhat.com/tools-and-services/details/psi-openstack-cloud-d | 16:25 |
rlandy | weshay|ruck: k - adding | 16:25 |
weshay|ruck | sshnaidm, yes.. agree | 16:25 |
rlandy | weshay|ruck: we're locked out of JIRA atm | 16:25 |
rlandy | auth issue | 16:25 |
rlandy | there is some tracking there | 16:26 |
Tengu | sso's dead apparently. | 16:26 |
weshay|ruck | probably runs on PSI | 16:26 |
Tengu | uhuhu | 16:27 |
ysandeep | rlandy, fyi.. hey, we tried but were unable to reproduce the issue manually - we tried running that test playbook against localhost and the undercloud, and tried running the playbook from outside just like Zuul does, but didn't hit any issue :( | 16:27 |
ysandeep | rlandy, working on the theory that the issue is somewhere else and is being falsely reported. I am rerunning that job with localhost replaced by undercloud as a test. | 16:27 |
weshay|ruck | rlandy, sshnaidm Alan owns the relationship w/ the cloud provider.. we just need to cix.. | 16:27 |
rlandy | ysandeep: ^^ there are a lot of issues with the downstream cloud atm | 16:28 |
weshay|ruck | cix can be cross-referenced w/ jira or whatever other bs we need | 16:28 |
rlandy | weshay|ruck: yeah - creating BZ - will mention the JIRA ticket | 16:28 |
ysandeep | rlandy, ack, not sure if it's related, but will trigger the jobs later then | 16:29 |
rlandy | ysandeep: may not be worth your debug time atm | 16:29 |
weshay|ruck | rlandy, sshnaidm https://access.redhat.com/support/cases/#/case/02671591 | 16:29 |
weshay|ruck | fyi | 16:29 |
ysandeep | rlandy, o/ thanks.. i will go to sleep then.. See you tomorrow o/ Have a great day ahead :) | 16:30 |
rlandy | ysandeep: yeah - sorry about all this | 16:30 |
sshnaidm | weshay|ruck, "There was an error loading case." | 16:30 |
zbr | rlandy or weshay|ruck: quick review on https://review.rdoproject.org/r/#/c/27987/ | 16:30 |
rlandy | ysandeep: will leave you email if there is any progress | 16:30 |
ysandeep | rlandy, thanks! that will help | 16:30 |
*** ysandeep is now known as ysandeep|away | 16:30 | |
weshay|ruck | rlandy, imho.. bz that lists that ticket is enough.. then email rhos-dev w/ the cix flags in the subject | 16:30 |
weshay|ruck | sshnaidm, you may need to be logged in | 16:31 |
weshay|ruck | or another system is down | 16:31 |
weshay|ruck | lolz | 16:31 |
sshnaidm | weshay|ruck, I am.. | 16:31 |
rlandy | weshay|ruck: yep - can't log in to BZ atm | 16:32 |
weshay|ruck | lolz | 16:32 |
rlandy | another 2020 disaster | 16:32 |
rlandy | zbr: we have no more molecule tests on centos7? if so, great | 16:33 |
zbr | rlandy: incorrect: we still have them but we are now using py36 on both c7/8. | 16:33 |
sshnaidm | weshay|ruck, upstream CI times out as well, it takes 1 hour to prepare containers: https://187cce064a1459d372de-21abb6d2b9f578210dfe07e5ee1d658a.ssl.cf1.rackcdn.com/730083/2/check/tripleo-ci-centos-8-scenario001-standalone/ac77443/logs/undercloud/var/log/tripleo-container-image-prepare.log | 16:33 |
rlandy | zbr: then +2 | 16:34 |
zbr | basically this helps us to migrate our codebase to py36 w/o forcing the system bump at the same time | 16:34 |
zbr | smaller steps = safer | 16:34 |
rlandy | ack | 16:34 |
* sshnaidm is out to prepare bunker and supplies | 16:34 | |
*** sshnaidm is now known as sshnaidm|afk | 16:34 | |
rlandy | Requests typically are < 2ms but are now taking > 10 secs. | 16:39 |
rlandy | yep - that looks like our issue | 16:39 |
rlandy | sloooooooooowwww cloud | 16:39 |
*** amoralej is now known as amoralej|lunch | 16:52 | |
*** amoralej|lunch is now known as amoralej|off | 16:52 | |
*** jmasud has quit IRC | 16:57 | |
*** derekh has quit IRC | 17:01 | |
*** jmasud has joined #oooq | 17:06 | |
rfolco|rover | zbr, on a quick look, do you understand why this failed ? https://08c3aae88ab0ce3ed41d-baf4f807d40559415da582760ebf9456.ssl.cf1.rackcdn.com/733659/7/check/tripleo-buildimage-overcloud-full-centos-7-train/c35c051/build.log | 17:15 |
rfolco|rover | zbr, if command -v python3 executed, why is python_path empty, and why was that the last command to run? https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/dib-python/pre-install.d/01-dib-python#L18 | 17:16 |
zbr | not sure, but i remember having a similar problem in other places | 17:18 |
rfolco|rover | zbr, ok thanks... will compare to the scl run | 17:19 |
rfolco|rover | weshay|ruck, can you re-w+ this one https://review.opendev.org/#/c/732618/ | 17:19 |
rfolco|rover | weshay|ruck, not sure what happened | 17:20 |
weshay|ruck | rfolco|rover, depends-on https://review.opendev.org/#/c/730763 | 17:20 |
zbr | rfolco|rover: i wonder if command may return multiple lines in some cases; that could break the code in ugly ways | 17:20 |
*** jpena is now known as jpena|off | 17:20 | |
weshay|ruck | which needs https://review.opendev.org/#/c/733790/ | 17:20 |
rfolco|rover | weshay|ruck, ah ok gotcha | 17:21 |
zbr | i know that type does return multiple results and that you need to "| head -n1" | 17:21 |
weshay|ruck | rfolco|rover, this should fix the epel issue if we saw that in ussuri https://review.opendev.org/#/c/733790/3/container-images/tripleo_kolla_template_overrides.j2 | 17:21 |
rfolco|rover | weshay|ruck, yep | 17:21 |
rfolco|rover | zbr, command -v you mean? | 17:22 |
zbr | yep | 17:22 |
rfolco|rover | aahhh the first result might be empty | 17:23 |
rfolco|rover | or command -v python3 is really returning nothing | 17:27 |
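A defensive version of the interpreter detection being debated, guarding against both empty and multi-line output, might look like this; it is a sketch, not the actual DIB element (diskimage_builder/elements/dib-python/pre-install.d/01-dib-python), which may differ.

    python_path=$(command -v python3 2>/dev/null | head -n1)
    if [ -z "$python_path" ]; then
        python_path=$(command -v python2 2>/dev/null | head -n1)
    fi
    if [ -z "$python_path" ]; then
        echo "no python interpreter found in the image" >&2
        exit 1
    fi
    echo "using $python_path"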
weshay|ruck | rlandy, fyi.. this fails if you run the master release from a centos-7 virthost.. /me updates this patch | 17:45 |
weshay|ruck | https://review.opendev.org/#/c/733471/3 | 17:45 |
weshay|ruck | rlandy, can chat when ever | 17:49 |
rlandy | weshay|ruck: https://meet.google.com/qin-kmpv-nwf | 17:59 |
*** saneax_AFK has quit IRC | 18:03 | |
*** jmasud has quit IRC | 18:11 | |
*** jmasud has joined #oooq | 18:13 | |
*** rlandy is now known as rlandy|mtg | 18:20 | |
*** jmasud has quit IRC | 18:26 | |
weshay|ruck | rlandy|mtg, https://review.opendev.org/#/c/724193/ | 18:50 |
weshay|ruck | https://review.opendev.org/#/c/729824/ | 18:50 |
weshay|ruck | rlandy|mtg, tripleo-build-containers-ubi-8 SUCCESS in 45m 15s (non-voting) | 18:52 |
weshay|ruck | rlandy|mtg, https://review.opendev.org/#/c/724193/50 | 18:52 |
*** rlandy|mtg is now known as rlandy | 18:56 | |
weshay|ruck | rlandy, the patch to fix ussuri containers build is close to merging | 18:57 |
weshay|ruck | known issue | 18:57 |
rlandy | great | 18:57 |
weshay|ruck | rlandy, removed DNM, https://review.opendev.org/#/c/730321/ | 19:12 |
rlandy | thanks - voted | 19:12 |
weshay|ruck | rlandy, k.. and I got these in the right place.. thought I didn't but I did | 19:13 |
weshay|ruck | https://review.opendev.org/#/c/733392/ | 19:13 |
weshay|ruck | https://review.opendev.org/#/c/734100/1/zuul.d/standalone-jobs.yaml | 19:13 |
rlandy | weshay|ruck: CIX email sent for https://bugzilla.redhat.com/show_bug.cgi?id=1845266 | 19:20 |
openstack | bugzilla.redhat.com bug 1845266 in releng "Significant slowdown in running jobs in PSI upshift - internal zuul" [Unspecified,New] - Assigned to apevec | 19:20 |
rlandy | weshay|ruck: I'm going to try the old container build push job again (from testproject) now that kforde says the API response time issues may have been addressed | 19:22 |
rlandy | will see if it makes any difference | 19:22 |
*** jmasud has joined #oooq | 19:37 | |
rlandy | weshay|ruck: https://code.engineering.redhat.com/gerrit/202706 Update rhos-17 promotion criteria with new jobs added. | 19:49 |
weshay|ruck | rlandy, thanks | 19:59 |
rfolco|rover | weshay|ruck, I don't know what to do with fs020, failing on master, ussuri, train.. | 20:07 |
rfolco|rover | weshay|ruck, mostly the same issue: pacemaker | 20:07 |
weshay|ruck | which issue w/ pacemaker? | 20:08 |
weshay|ruck | rfolco|rover, is it on https://hackmd.io/YAqFJrKMThGghTW4P2tabA?both ? | 20:10 |
rfolco|rover | this is failing since ever | 20:11 |
rfolco|rover | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-ussuri&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-train&result=FAILURE | 20:11 |
rfolco|rover | weshay|ruck, https://bugs.launchpad.net/tripleo/+bug/1867602 | 20:11 |
openstack | Launchpad bug 1867602 in tripleo "overcloud deploy failed due to Systemd start for pcsd failed" [Medium,Triaged] | 20:11 |
rfolco|rover | bug filed a few sprints ago | 20:11 |
rfolco|rover | weshay|ruck, its not 100% consistent, it also failed on tempest sometimes | 20:12 |
weshay|ruck | rfolco|rover, that's the images dude | 20:12 |
weshay|ruck | rfolco|rover, it's the missing files in the overcloud images | 20:13 |
rfolco|rover | hmmm | 20:13 |
rfolco|rover | mark as dup then? | 20:13 |
weshay|ruck | rfolco|rover, No such file or directory: '/var/log/pcsd/pcsd.log | 20:13 |
weshay|ruck | rfolco|rover, that goes away.. like the /etc/pki issue goes away when there is a valid working overcloud-full image | 20:14 |
rfolco|rover | ok | 20:15 |
weshay|ruck | rfolco|rover, see my last comment in that bug | 20:15 |
rfolco|rover | weshay|ruck, will mark dup of https://bugs.launchpad.net/tripleo/+bug/1879766 | 20:16 |
openstack | Launchpad bug 1879766 in tripleo "master ovb jobs failing on Destination directory /etc/pki/tls/private does not exist" [Critical,Triaged] - Assigned to chandan kumar (chkumar246) | 20:16 |
weshay|ruck | k | 20:16 |
weshay|ruck | rfolco|rover, the sooner we can push a working image, the sooner these problems go away | 20:18 |
weshay|ruck | rfolco|rover, https://review.rdoproject.org/r/#/c/27986/ | 20:18 |
weshay|ruck | focus there | 20:19 |
rfolco|rover | again... | 20:19 |
rfolco|rover | yep | 20:19 |
rfolco|rover | working on it | 20:19 |
rfolco|rover | well.. now the check IS RIGHT | 20:20 |
rfolco|rover | weshay|ruck, ^ image_sanity is doing what it's supposed to do | 20:21 |
rfolco|rover | the files are really missing and failing the job | 20:21 |
weshay|ruck | not all the time | 20:22 |
weshay|ruck | rfolco|rover, the sanity check was failing ALL the time.. you are fixing that bit | 20:23 |
weshay|ruck | rfolco|rover, pojadhav|ruck and chandankumar should pick it up from you.. and also figure out why it does in fact fail sometimes | 20:24 |
rfolco|rover | weshay|ruck, last time it failed even when the filename was not in the rpm_va output. | 20:25 |
weshay|ruck | rlandy, https://code.engineering.redhat.com/gerrit/#/c/202706/ is correct, merged | 20:30 |
rlandy | thanks | 20:31 |
rfolco|rover | weshay|ruck, but even with the check itself fixed, if a file is marked missing in the rpm_va output the job will fail... | 20:31 |
rfolco|rover | missing /var/lib/pcsd | 20:31 |
weshay|ruck | ya.. same shit | 20:31 |
rfolco|rover | so also need to understand why the image is missing files... | 20:32 |
*** jbadiapa has quit IRC | 20:32 | |
weshay|ruck | rfolco|rover, yes indeed we do.. this started after chandankumar's refactor of the tripleo-ci/ooo-buildimage and oooq/buildimages role | 20:32 |
rfolco|rover | weshay|ruck, I did not look at the code yet, but maybe we close out the qcow2 image while it's still copying files into it | 20:33 |
*** ccamacho has quit IRC | 20:39 | |
*** jtomasek has quit IRC | 21:18 | |
*** jmasud has quit IRC | 21:28 | |
*** jmasud has joined #oooq | 21:38 | |
*** jmasud has quit IRC | 21:55 | |
*** jfrancoa has quit IRC | 21:56 | |
rlandy | weshay|ruck: still around? | 22:02 |
weshay|ruck | rlandy, aye | 22:10 |
weshay|ruck | rlandy, check ur email | 22:10 |
rlandy | weshay|ruck: thanks for the graphical backup | 22:11 |
weshay|ruck | :) | 22:11 |
rlandy | weshay|ruck: what's your opinion on reverting the change that added the private network? | 22:11 |
weshay|ruck | rlandy, do you want me to look at other jobs? | 22:12 |
rlandy | weshay|ruck: I don't think so - we're in the same place. OVB just died - as did BM again in accessing the undercloud | 22:12 |
weshay|ruck | rlandy, let's schedule a 1/2 for you, sagi and myself to chat | 22:12 |
rlandy | hangs on introspection | 22:12 |
rlandy | weshay|ruck: k - tomorrow morning | 22:13 |
rlandy | at this point, I'd rather go back to the direct external node connection | 22:13 |
rlandy | and give OVB a shot another time | 22:13 |
weshay|ruck | k | 22:14 |
weshay|ruck | let's rope in sagi and chat about it | 22:14 |
rlandy | yep | 22:15 |
rlandy | I give up | 22:18 |
*** dmellado_ has joined #oooq | 23:10 | |
*** dmellado has quit IRC | 23:11 | |
*** dmellado_ is now known as dmellado | 23:11 | |
*** tosky has quit IRC | 23:13 | |
*** TrevorV has quit IRC | 23:19 |