Monday, 2020-06-08

*** jmasud has joined #oooq00:04
*** jmasud has quit IRC00:43
*** holser has quit IRC03:17
*** ykarel|away is now known as ykarel04:23
*** ratailor has joined #oooq04:38
*** jtomasek has joined #oooq04:59
*** jmasud has joined #oooq05:13
*** saneax has joined #oooq05:26
*** jtomasek has quit IRC05:37
ysandeepfolks o/ , Have you seen this kind of error before in container build : "SystemError: The following jobs were incomplete: [{'swift-base" ? but the container build itself seems successful05:56
ysandeephttps://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-containers-rhel-8-rhos-17-build-push/75dc618/logs/build.log ?05:56
bhagyashrisysandeep, hey, there was some discussion going on friday you can see the logs here http://paste.openstack.org/show/794448/06:01
ysandeepbhagyashris, thank you o/06:02
*** matbu has joined #oooq06:06
*** udesale has joined #oooq06:19
*** jmasud has quit IRC06:40
*** jtomasek has joined #oooq06:41
*** jmasud has joined #oooq06:42
*** yolanda has joined #oooq06:43
*** ysandeep is now known as ysandeep|afk06:57
*** skramaja has joined #oooq07:04
*** jmasud has quit IRC07:12
*** ccamacho has joined #oooq07:32
*** tosky has joined #oooq07:39
*** amoralej|off is now known as amoralej07:52
*** jpena|off is now known as jpena07:56
*** ysandeep|afk is now known as ysandeep08:08
*** dtantsur has joined #oooq08:12
*** jfrancoa has joined #oooq08:40
*** sshnaidm|afk is now known as sshnaidm08:44
*** jtomasek has quit IRC08:48
*** jtomasek has joined #oooq08:50
*** apetrich has joined #oooq08:54
*** holser has joined #oooq08:56
*** jschlueter has joined #oooq09:12
*** ccamacho has quit IRC09:55
*** jbadiapa has joined #oooq10:02
akahatcgoncalves, o/10:30
cgoncalvesakahat, hi10:30
akahatcgoncalves, i need to talk about this:https://review.opendev.org/#/c/73150110:30
akahatcgoncalves, will the enable_provider_drivers not work here?10:31
akahatI mean we have only three drivers: amphora, octavia and ovn.10:31
*** ccamacho has joined #oooq10:31
cgoncalvesakahat, ideally the OVN provider driver should be appended. we must not assume the 'octavia' and 'amphora' provider drivers are enabled10:32
akahatcgoncalves, okay. so appending only ovn will work?10:34
cgoncalvesakahat, yes, appended if the provider driver isn't already present. I am not sure I follow what's driving this change though. could you please help me understand?10:40
*** jtomasek has quit IRC10:41
akahatcgoncalves, this will help to understand: https://tree.taiga.io/project/tripleo-ci-board/task/1699?kanban-status=144727510:42
*** jtomasek has joined #oooq10:43
*** derekh has joined #oooq10:46
cgoncalvesakahat, maybe a better approach would be to construct the 'enabled_provider_drivers' in tempestconf based on the enabled provider drivers that you can get via Octavia API10:48
cgoncalvesbtw, you have a typo in the conf setting. it is "enable***d***_provider_drivers"10:49
cgoncalvesakahat, https://docs.openstack.org/api-ref/load-balancer/v2/?expanded=list-providers-detail#list-providers10:49
akahatcgoncalves, okay. I'll fix it.10:52
akahatcgoncalves, thanks :)10:52
cgoncalvesyou're welcome10:54
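The append-only-when-absent behaviour cgoncalves describes can be sketched in shell; the current enabled_provider_drivers value and the driver description strings below are made-up examples, not taken from the actual review:

```shell
# Hedged sketch: append the OVN provider driver to Octavia's
# enabled_provider_drivers only when it is not already listed.
# The existing value and description strings are illustrative.
current="amphora:The Octavia Amphora driver"
new="ovn:Octavia OVN driver"

# Wrap in commas so the pattern matches the driver name exactly,
# whether it appears first, last, or in the middle of the list.
case ",$current," in
  *,ovn:*) updated="$current" ;;        # already enabled, leave as-is
  *)       updated="$current,$new" ;;   # append the OVN driver
esac
echo "$updated"
```

The same guard applies whatever driver list the deployment starts with, which is the point cgoncalves makes: never assume 'octavia' and 'amphora' are already enabled.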
arxcruzsshnaidm: hey, how can I add an ansible collection in tq?10:58
arxcruzsshnaidm: context: os_tempest is now using openstack.cloud collection10:58
arxcruzand we need to add it on tq10:59
arxcruzi'm checking the tripleo-ansible-operator, but it has the python.py file, and you install it using the tripleo-quickstart-extras requirements, but openstack.cloud doesn't have one, and I'm not sure if that's the right way10:59
arxcruzI would do it with ansible-galaxy, but that doesn't seem to be available in tq11:00
sshnaidmarxcruz, https://review.opendev.org/#/c/730083/11:04
arxcruzsshnaidm: danke Herr Shnaidm, I was seeing this path to follow but was unsure11:06
sshnaidmarxcruz, de nada, señor Arx11:07
sshnaidmor how do you call it in brasilian language? :D11:07
arxcruzde nada senhor Arx11:08
arxcruzwe don't have the ñ in portuguese11:08
arxcruzin this case it's pronounced the same, with a slight accent11:09
*** jpena is now known as jpena|lunch11:32
*** rfolco has joined #oooq11:37
weshay|ruck0/11:48
cgoncalves\011:50
weshay|ruckpojadhav|ruck, rfolco we should sync up11:53
pojadhav|ruckweshay|ruck, yup11:53
weshay|ruckk.. holding for rfolco11:53
rfolcoweshay|ruck, pojadhav|ruck need 2 min, will get a coffee11:54
*** rfolco is now known as rfolco|rover11:54
weshay|ruckhttps://meet.google.com/one-rbow-bcs11:56
weshay|ruckpojadhav|ruck, 2020-06-08 08:33:52.573060 | primary | urllib3.exceptions.LocationParseError: Failed to parse: https://trunk.rdoproject.org/api-centos8-master-uc/api/report_result11:58
*** rlandy has joined #oooq12:06
*** skramaja has quit IRC12:10
*** skramaja has joined #oooq12:10
rfolco|roverweshay|ruck, https://review.opendev.org/#/c/73379012:11
*** jfrancoa has quit IRC12:12
*** jfrancoa has joined #oooq12:14
weshay|ruckrfolco|rover, pojadhav|ruck https://logserver.rdoproject.org/openstack-periodic-24hr/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset002-train/773e973/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz12:20
*** amoralej is now known as amoralej|lunch12:22
*** ratailor has quit IRC12:31
rlandyrfolco|rover: do we have scrum today?12:39
rfolco|roverrlandy, no, we have only on thu12:40
rlandyoh ok - we took the once a week option12:40
rfolco|roverper team's suggestion yep12:40
rfolco|roverfor this sprint only as experiment12:40
*** udesale_ has joined #oooq12:49
weshay|ruckrfolco|rover, that's the neutron ptl.. Slawomir Kaplonski12:52
*** udesale has quit IRC12:52
ysandeeprlandy, hey sorry i was out on Friday so I am not sure what you and chandankumar decided for the image build issue. Any luck with it?12:52
rfolco|roverweshay|ruck, ok thanks12:52
weshay|ruckrfolco|rover, irc slaweq12:53
rlandyysandeep: hi - yes - chandankumar changed some settings in the env review12:53
ysandeeprlandy, Whenever you are free, can we sync for some minutes today.12:53
rlandyysandeep: but then that image expired :(12:53
rlandyso I had to update the image12:54
*** saneax is now known as saneax_AFK12:54
rlandyysandeep: yeah - just trying to kick the two diff image build jobs again so we can promote and clear the scenario failures12:54
rlandywill ping you in a bit12:54
rfolco|roverpojadhav|ruck, need 10 min before we sync12:55
ysandeeprlandy, ack thanks! and yes that container build is failing with weird SystemErrors, I saw you and marios had some discussion about it on Friday.12:55
pojadhav|ruckrfolco|rover, okay12:56
*** jpena|lunch is now known as jpena13:03
*** ykarel is now known as ykarel|afk13:11
*** amoralej|lunch is now known as amoralej13:14
rlandyysandeep: ok - if you have time now, let's chat13:16
ysandeeprlandy, sure13:16
rlandyysandeep: https://meet.google.com/xpa-ceom-onm13:17
sshnaidmrlandy, hi13:20
sshnaidmrlandy, how is downstream ovb going?13:20
*** rlandy is now known as rlandy|mtg13:20
rlandy|mtgsshnaidm: not great - the introspection still fails13:20
rlandy|mtgit looks like the cloud is very slow to respond to power actions13:21
sshnaidmrlandy|mtg, timeout?13:21
rlandy|mtgin mtg now - will show you in a bit13:21
sshnaidmack13:21
rlandy|mtgintrospection outright fails13:21
rlandy|mtgno clear trace as to why13:21
weshay|ruckpojadhav|ruck, rfolco|rover directories/files still missing in latest https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/job-output.txt13:31
rfolco|rover2020-06-08 08:37:50.462917 | primary | ok: All assertions passed13:35
rfolco|rover2020-06-08 08:37:50.463339 |13:35
rfolco|rover2020-06-08 08:37:50.500997 | primary | ok: All assertions passed13:35
rfolco|rovermissing     /etc/pki/CA/private13:35
rfolco|roverweshay|ruck, ok let me fix it and test the routine locally first13:36
weshay|ruckrfolco|rover, what are you fixing?13:39
rfolco|roverweshay|ruck, it should fail assertions13:39
rfolco|roverisn't ?13:39
weshay|ruckrfolco|rover, 2020-06-08 13:24:28.136248 | primary |   "msg": "Assertion failed"13:39
rfolco|roverweshay|ruck, I was looking at different job https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/ea5c167/job-output.txt13:42
*** TrevorV has joined #oooq13:43
rfolco|roverweshay|ruck, ok so its doing what is supposed to13:43
weshay|ruckrfolco|rover, the build is failing appropriately, but we need to resolve the root cause of missing files still13:46
rfolco|roverweshay|ruck, yeah, looking what package it provides and comparing to green jobs13:46
rfolco|roverweshay|ruck, openssl-libs-1.1.1c-2.el8_1.1.x86_64 is installed13:48
rfolco|roverin the controller at least :)13:48
rfolco|roverweshay|ruck, this check is weird... there is no "missing /etc/pki..." on rpm_va https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt13:54
rfolco|roverso the grep fails13:54
rfolco|roverlet me look at this test13:54
rfolco|roverjust thinking loud here...14:00
rfolco|rover[rfolco@redbox tripleo-ci]$ rpm -Va openssl-libs-1.1.1c-2.el8_1.1.x86_64 /etc/pki/tls/private14:00
rfolco|rover[rfolco@redbox tripleo-ci]$ echo $?14:00
rfolco|rover014:00
*** ykarel|afk is now known as ykarel14:09
rfolco|roverweshay|ruck, I think I know what the issue is14:09
rfolco|roverweshay|ruck, the check is giving false negative14:09
weshay|ruckrfolco|rover, keep in mind, the issue often shows up in the deploy.. what makes you think it's a false negative?14:11
rfolco|roverthe test now is giving false negative14:11
rfolco|roverthe fix14:11
rfolco|roverweshay|ruck, if we tmate I can explain better14:12
weshay|ruckk.. in mtg atm14:12
rfolco|roverok14:12
*** rlandy|mtg is now known as rlandy14:15
rlandysshnaidm: hi  .. is it possible that we see a significant slowdown on the nodes after changing the network?14:16
rlandysshnaidm: we have jobs timing out that never did before last week14:16
rfolco|roverweshay|ruck, ok, I'm confident this is wrong and working on a fix... http://pastebin.test.redhat.com/872965 -- buggy code: https://github.com/openstack/tripleo-ci/blob/508376e178eab29f0debc5dbb40908d5dc985eb1/roles/oooci-build-images/tasks/image_sanity.yaml#L3614:17
rfolco|roverweshay|ruck, in summary: if (***AND ONLY IF***) we find /etc/pki/tls/private in rpm_Va output, we should check if it is "missing"14:21
rfolco|roverupdated pastebin shows this http://pastebin.test.redhat.com/87297214:21
ysandeeprlandy, fyi.. test run for that last task passed14:23
weshay|ruckrfolco|rover, ah ya.. see what you mean.. /etc/pki is not listed here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt14:25
weshay|ruckbut is marked as failed here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/job-output.txt14:25
rfolco|roverweshay|ruck, exactly, we should fail only if the file is in rpm_va plus "missing"14:25
rfolco|roverpatch coming14:26
weshay|ruckk great14:26
sshnaidmrlandy, maybe, but I don't know how it's possible14:31
sshnaidmrlandy, routing in internal network should be better and faster actually..14:32
*** TrevorV has quit IRC14:36
sshnaidmrlandy, do you see slowness in specific steps?14:37
rlandysshnaidm: either way, wrt OVB, we have IPMI connection now - so that is good, but the response to power on/off is very slow14:37
sshnaidmrlandy, logs?14:37
*** TrevorV has joined #oooq14:37
rlandysshnaidm: see the last run on https://code.engineering.redhat.com/gerrit/#/c/20043614:39
sshnaidmrlandy, and which jobs does time out?14:40
rlandysshnaidm: the MTU on the private subnet is 1450 and 1500 on external14:40
rlandysshnaidm: the ipa multinode job for example14:40
rlandygetting logs14:40
rlandyshould those MTUs match?14:41
sshnaidm"No nodes are manageable at this time."14:42
sshnaidmrlandy, the lower the mtu the better..14:42
rlandysshnaidm: so if you watch introspection, two things happen14:44
rlandythe nodes stay in validating for some time14:44
rlandyand then get to enroll14:44
rlandyor manageable14:44
rlandythen when the nodes are in manageable state,14:44
rlandyand introspection does start, the power on command is issued14:45
rlandyand registered14:45
rlandybut the nodes don't power on for a very long time14:45
sshnaidmrlandy, I don't see introspection start in this job, it fails before with the "no manageable nodes" error14:46
rlandyI reran on the node14:46
sshnaidmrlandy, maybe when introspection starts in job, the nodes are still in enroll14:46
rlandysshnaidm: yes14:46
rlandythe nodes take too long to get to every state that is expected14:47
rlandytbh, idk if this cloud can support OVB14:47
rlandythe nodes are in fact still in verifying14:47
rlandyand only get to enroll afterwards14:48
sshnaidmrlandy, so maybe it's worth adding a polling check, with a timeout, for them to reach the manageable state14:49
sshnaidmto wait for them14:49
rlandysshnaidm: maybe something else hit this cloud ... see the container build job for example:14:49
rlandyhttps://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?job_name=periodic-tripleo-containers-rhel-8-rhos-17-build-push14:49
rlandysee the slow down after 06/0214:49
rlandyso 06/03 onwards14:50
rlandythe jobs take twice as long14:50
rlandy2020-06-03 things go downhill14:51
sshnaidmyeah, no idea what's happening..14:53
rfolco|roverweshay|ruck, https://review.opendev.org/734112 Fix image_sanity check14:57
rfolco|roverweshay|ruck, https://review.rdoproject.org/r/27986 Test image_sanity fix14:57
rfolco|roverpojadhav|ruck, ^14:58
rlandysshnaidm: what's the equivalent of provider_net_shared_3 on rdocloud?15:01
rlandy38.145.32.0/2215:02
sshnaidmrlandy, yep, 38.145.32.0/2215:03
weshay|ruckrfolco|rover, I like the change, but look at the output here https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-buildimage-overcloud-full-master/4bd3223/rpm_va.txt15:03
weshay|ruck\/etc/pki is not listed15:04
weshay|ruckso either grep should return that it's not found15:04
weshay|ruckhttps://review.opendev.org/#/c/734112/1/roles/oooci-build-images/tasks/image_sanity.yaml15:04
weshay|ruckrfolco|rover, run a grep yourself and check the return code on something you find and something you don't15:05
weshay|ruckrfolco|rover, and look at ur patch again15:05
*** skramaja has quit IRC15:06
weshay|ruckzbr, FYI.. https://review.opendev.org/#/c/734083/15:07
weshay|ruckzbr, extra points if we can move to centos-stream15:07
zbri use stream locally, switching to it should be very easy15:08
zbrif something works on non-stream, likely it will still work with stream.15:08
weshay|ruckzbr, ensuring we have the ecosystem upstream is ++15:08
rfolco|roverweshay|ruck, rpm -Va won't add all the files to the output.... /etc/pki/tls/private exists and is not added to the rpm_va.txt15:08
weshay|ruckrfolco|rover, the issue is w/ grep15:08
weshay|ruckand the return code15:08
weshay|ruckafaict15:08
ykarelrlandy, weshay|ruck can u review https://review.opendev.org/#/c/733471/15:09
sshnaidmrlandy, we can revert the tenant config and see if it helps the other jobs, but we'll lose ovb for this time15:09
rlandysshnaidm: discussing on #rhos-ops15:10
weshay|ruckrfolco|rover, http://pastebin.test.redhat.com/87301215:10
weshay|ruckykarel, looking15:10
rfolco|roverweshay|ruck, yes thats why I reverse grep in my patch15:11
rlandyykarel: oh gosh, that constraints change keeps going15:11
rfolco|roverweshay|ruck, I look for missing first15:11
ykarelrlandy, yes :(15:11
rfolco|roverweshay|ruck, then I reverse grep -v "/etc/pki..."15:11
weshay|ruckrfolco|rover, ur right.. missed the -v15:11
weshay|ruck:)15:11
rfolco|roverweshay|ruck, I tested w/ a file that is not in rpm_va output, a file that is in the file, but not missing... and a file that is marked as missing15:12
rfolco|roverweshay|ruck, the famous "works in my machine"15:13
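The corrected check rfolco describes — flag a path only when it both appears in the rpm -Va output and is marked "missing", since rpm -Va only lists files that differ from the package database — can be sketched as follows. The sample rpm_va.txt contents and the helper name are illustrative, not the actual image_sanity code:

```shell
# Build a sample rpm -Va style output (contents are made up).
rpm_va=$(mktemp)
cat > "$rpm_va" <<'EOF'
missing     /var/lib/pcsd
S.5....T.  c /etc/sysconfig/example
EOF

# check_path PATH: fail (return 1) only when PATH appears in the
# rpm -Va output AND its line starts with "missing". A path absent
# from the output is fine - rpm -Va omits files that verify clean.
check_path() {
  if grep -F "$1" "$rpm_va" | grep -q '^missing'; then
    echo "FAIL: $1 is missing from the image"
    return 1
  fi
  echo "OK: $1"
  return 0
}

check_path /etc/pki/tls/private   # not in the output at all -> OK
check_path /var/lib/pcsd || true  # listed as missing -> FAIL
```

This is the "reverse grep" ordering discussed above: grep for the path first, and only then decide whether its verification line says "missing", instead of letting a plain grep miss turn into a failure.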
weshay|ruckzbr, can you work w/ carlos on enabling centos-stream for tripleo upstream? not RIGHT NOW, but generally speaking?  https://review.opendev.org/#/q/topic:centos-stream+(status:open+OR+status:merged)15:16
weshay|ruckwe've talked about this is in the past15:16
zbri already added a warch on that change, I will help him.15:16
cgoncalvescool, thanks15:17
weshay|ruckzbr++15:17
weshay|ruckcgoncalves++15:17
*** jmasud has joined #oooq15:19
rlandyysandeep: fyi ... see #rhos-ops15:37
ysandeeprlandy, checking15:37
rlandyhaving discussion about slow downstream cloud15:37
weshay|ruckrlandy, sshnaidm working this from another angle for ya.. /me is MOD, looking at the customer escalation board atm15:41
weshay|ruckthat issue is not tracked there, and imho should be15:41
cgoncalveszbr, weshay|ruck: does tripleo build images from an existing centos image ("centos" DIB element) or "centos-minimal"?15:41
*** jmasud has quit IRC15:44
weshay|rucksec.. dealing w/ psi issues15:45
weshay|ruckrlandy, sshnaidm open a lp to track, and mark promotion blocker15:46
rlandyweshay|ruck: ack15:47
*** ykarel is now known as ykarel|away15:51
*** ysandeep is now known as ysandeep|afk15:54
rlandyrfolco|rover: can you post that LP you had on the failing container build when they all pass?15:59
rlandyweshay|ruck: ^^15:59
weshay|ruckI'm not aware of that16:00
weshay|ruckI'll check the hackmd16:00
rlandyweshay|ruck: rfolco|rover: got it ... https://bugs.launchpad.net/tripleo/+bug/187936516:04
openstackLaunchpad bug 1879365 in tripleo "[container build] SystemError: The following jobs were incomplete: state=finished" [High,Incomplete]16:04
*** udesale_ has quit IRC16:08
*** dtantsur is now known as dtantsur|afk16:10
*** ysandeep|afk is now known as ysandeep16:15
*** jmasud has joined #oooq16:17
rlandyweshay|ruck: sshnaidm: from #rhos-ops, it looks like there are people working on the downstream cloud slowness and korde "I've bumped up the urgency in the case again."16:22
rlandydo we want another blocker LP?16:22
sshnaidmidk, but I'd like to know about such cases asap and not waste hours trying to find what's wrong16:23
sshnaidmthe notification part doesn't seem to work at all16:23
rlandyok - creating one anyways16:24
weshay|ruckrlandy, what's the bug #?16:24
weshay|rucknotification?16:25
weshay|rucksshnaidm, which part are you speaking to?16:25
sshnaidmweshay|ruck, if cloud is broken we should know about that asap16:25
weshay|ruckrlandy, bz is probably more appropriate in this case16:25
rlandyhttps://one.redhat.com/tools-and-services/details/psi-openstack-cloud-d16:25
rlandyweshay|ruck: k - adding16:25
weshay|rucksshnaidm, yes.. agree16:25
rlandyweshay|ruck: we're locked out of JIRA atm16:25
rlandyauth issue16:25
rlandythere is some tracking there16:26
Tengusso's dead apparently.16:26
weshay|ruckprobably runs on PSI16:26
Tenguuhuhu16:27
ysandeeprlandy, Fyi.. hey, we tried but were unable to reproduce the issue manually - we ran that test playbook against localhost and the undercloud, and tried running the playbook from outside just like zuul does, but didn't hit any issue :(16:27
ysandeeprlandy, Working on the theory that the issue is somewhere else and is being falsely reported. I am rerunning that job replacing localhost with undercloud for a test.16:27
weshay|ruckrlandy, sshnaidm Alan owns the relationship w/ the cloud provider.. we just need to cix..16:27
rlandyysandeep: ^^ there are a lot of issues with the downsteam cloud atm16:28
weshay|ruckcix can be cross referenced w/ jira or what ever other bs we need16:28
rlandyweshay|ruck: yeah - creating BZ - will mention the JIRA ticket16:28
ysandeeprlandy, ack, not sure if its related but will trigger jobs later then16:29
rlandyysandeep: may not be worth your debug time atm16:29
weshay|ruckrlandy, sshnaidm https://access.redhat.com/support/cases/#/case/0267159116:29
weshay|ruckfyi16:29
ysandeeprlandy, o/ thanks.. i will go to sleep then.. See you tomorrow o/ Have a great day ahead :)16:30
rlandyysandeep: yeah - sorry about all this16:30
sshnaidmweshay|ruck, "There was an error loading case."16:30
zbrrlandy or weshay|ruck: quick review on https://review.rdoproject.org/r/#/c/27987/16:30
rlandyysandeep: will leave you email if there is any progress16:30
ysandeeprlandy, thanks! that will help16:30
*** ysandeep is now known as ysandeep|away16:30
weshay|ruckrlandy, imho..  bz that lists that ticket is enough.. then email rhos-dev w/ the cix flags in the subject16:30
weshay|rucksshnaidm, you may need to be logged in16:31
weshay|ruckor another system is down16:31
weshay|rucklolz16:31
sshnaidmweshay|ruck, I am..16:31
rlandyweshay|ruck: yep - can't log in to BZ atm16:32
weshay|rucklolz16:32
rlandyanother 2020 disaster16:32
rlandyzbr: we have no more molecule tests on centos7? if so, great16:33
zbrrlandy: incorrect: we still have them but we are now using py36 on both c7/8.16:33
sshnaidmweshay|ruck, upstream CI times out as well, takes 1 hour to prepare containers: https://187cce064a1459d372de-21abb6d2b9f578210dfe07e5ee1d658a.ssl.cf1.rackcdn.com/730083/2/check/tripleo-ci-centos-8-scenario001-standalone/ac77443/logs/undercloud/var/log/tripleo-container-image-prepare.log16:33
rlandyzbr: then +216:34
zbrbasically this helps us to migrate our codebase to py36 w/o forcing the system bump at the same time16:34
zbrsmaller steps = safer16:34
rlandyack16:34
* sshnaidm is out to prepare bunker and supplies16:34
*** sshnaidm is now known as sshnaidm|afk16:34
rlandyRequests typically are < 2ms but are now taking > 10 secs.16:39
rlandyyep - that looks like our issue16:39
rlandysloooooooooowwww cloud16:39
*** amoralej is now known as amoralej|lunch16:52
*** amoralej|lunch is now known as amoralej|off16:52
*** jmasud has quit IRC16:57
*** derekh has quit IRC17:01
*** jmasud has joined #oooq17:06
rfolco|roverzbr, on a quick look, do you understand why this failed ? https://08c3aae88ab0ce3ed41d-baf4f807d40559415da582760ebf9456.ssl.cf1.rackcdn.com/733659/7/check/tripleo-buildimage-overcloud-full-centos-7-train/c35c051/build.log17:15
rfolco|roverzbr, if command -v python3 executed, why is python_path empty, and why was it the last command to run? https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/dib-python/pre-install.d/01-dib-python#L1817:16
zbrnot sure, but i remember having a similar problem in other places17:18
rfolco|roverzbr, ok thanks... will compare to the scl run17:19
rfolco|roverweshay|ruck, can you re-w+ this one https://review.opendev.org/#/c/732618/17:19
rfolco|roverweshay|ruck, not sure what happened17:20
weshay|ruckrfolco|rover, depends-on https://review.opendev.org/#/c/73076317:20
zbrrfolco|rover: i wonder if command may return multiple lines in some cases, which could break the code in ugly ways17:20
*** jpena is now known as jpena|off17:20
weshay|ruckwhich needs https://review.opendev.org/#/c/733790/17:20
rfolco|roverweshay|ruck, ah ok gotcha17:21
zbri know that type does return multiple results and that you need to "| head -n1"17:21
weshay|ruckrfolco|rover, this should fix the epel issue if we saw that in ussuri https://review.opendev.org/#/c/733790/3/container-images/tripleo_kolla_template_overrides.j217:21
rfolco|roverweshay|ruck, yep17:21
rfolco|roverzbr, command -v you mean ?17:22
zbryep17:22
rfolco|roveraahhh the first result might be empty17:23
rfolco|roveror command -v python3 is really retrieving none17:27
weshay|ruckrlandy, fyi.. this fails if you run a master release from a centos-7 virthost.. /me updates this patch17:45
weshay|ruckhttps://review.opendev.org/#/c/733471/317:45
weshay|ruckrlandy, can chat when ever17:49
rlandyweshay|ruck: https://meet.google.com/qin-kmpv-nwf17:59
*** saneax_AFK has quit IRC18:03
*** jmasud has quit IRC18:11
*** jmasud has joined #oooq18:13
*** rlandy is now known as rlandy|mtg18:20
*** jmasud has quit IRC18:26
weshay|ruckrlandy|mtg, https://review.opendev.org/#/c/724193/18:50
weshay|ruckhttps://review.opendev.org/#/c/729824/18:50
weshay|ruckrlandy|mtg, tripleo-build-containers-ubi-8SUCCESS in 45m 15s (non-voting)18:52
weshay|ruckrlandy|mtg, https://review.opendev.org/#/c/724193/5018:52
*** rlandy|mtg is now known as rlandy18:56
weshay|ruckrlandy, the patch to fix ussuri containers build is close to merging18:57
weshay|ruckknown issue18:57
rlandygreat18:57
weshay|ruckrlandy, removed DNM, https://review.opendev.org/#/c/730321/19:12
rlandythanks - voted19:12
weshay|ruckrlandy, k.. and I got these in the right place.. thought I didn't but I did19:13
weshay|ruckhttps://review.opendev.org/#/c/733392/19:13
weshay|ruckhttps://review.opendev.org/#/c/734100/1/zuul.d/standalone-jobs.yaml19:13
rlandyweshay|ruck: CIX email sent for https://bugzilla.redhat.com/show_bug.cgi?id=184526619:20
openstackbugzilla.redhat.com bug 1845266 in releng "Significant slowdown in running jobs in PSI upshift - internal zuul" [Unspecified,New] - Assigned to apevec19:20
rlandyweshay|ruck: I'm going to try the old container build push job again ( from testproject) now that kforde says the API response time issues may have been addressed19:22
rlandywill see if it makes any difference19:22
*** jmasud has joined #oooq19:37
rlandyweshay|ruck:  https://code.engineering.redhat.com/gerrit/202706 Update rhos-17 promotion criteria with new jobs added.19:49
weshay|ruckrlandy, thanks19:59
rfolco|roverweshay|ruck, I don't know what to do with fs020, failing on master, ussuri, train..20:07
rfolco|roverweshay|ruck, mostly the same issue: pacemaker20:07
weshay|ruckwhich issue w/ pacemaker?20:08
weshay|ruckrfolco|rover, is it on https://hackmd.io/YAqFJrKMThGghTW4P2tabA?both ?20:10
rfolco|roverthis is failing since ever20:11
rfolco|roverhttps://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-master&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-ussuri&job_name=periodic-tripleo-ci-centos-8-ovb-1ctlr_2comp-featureset020-train&result=FAILURE20:11
rfolco|roverweshay|ruck, https://bugs.launchpad.net/tripleo/+bug/186760220:11
openstackLaunchpad bug 1867602 in tripleo "overcloud deploy failed due to Systemd start for pcsd failed" [Medium,Triaged]20:11
rfolco|roverbug filed a few sprint ago20:11
rfolco|roverweshay|ruck, its not 100% consistent, it also failed on tempest sometimes20:12
weshay|ruckrfolco|rover, that's the images dude20:12
weshay|ruckrfolco|rover, it's the missing files in the overcloud images20:13
rfolco|roverhmmm20:13
rfolco|rovermark as dup then ?20:13
weshay|ruckrfolco|rover,  No such file or directory: '/var/log/pcsd/pcsd.log20:13
weshay|ruckrfolco|rover, that goes away.. like the /etc/pki issue goes away when there is a valid working overcloud-full image20:14
rfolco|roverok20:15
weshay|ruckrfolco|rover, see my last comment in that bug20:15
rfolco|roverweshay|ruck, will mark dup of https://bugs.launchpad.net/tripleo/+bug/187976620:16
openstackLaunchpad bug 1879766 in tripleo "master ovb jobs failing on Destination directory /etc/pki/tls/private does not exist" [Critical,Triaged] - Assigned to chandan kumar (chkumar246)20:16
weshay|ruckk20:16
weshay|ruckrfolco|rover, sooner we can push a working image, sooner these problems go away20:18
weshay|ruckrfolco|rover, https://review.rdoproject.org/r/#/c/27986/20:18
weshay|ruckfocus there20:19
rfolco|roveragain...20:19
rfolco|roveryep20:19
rfolco|roverworking on it20:19
rfolco|roverwell.. now the check IS RIGHT20:20
rfolco|roverweshay|ruck, ^ image_sanity is doing what is supposed to do20:21
rfolco|roverthe files are really missing and failing the job20:21
weshay|rucknot all the time20:22
weshay|ruckrfolco|rover, the sanity check was failing ALL the time.. you are fixing that bit20:23
weshay|ruckrfolco|rover, pojadhav|ruck and chandankumar should pick it up from you.. and also figure out why it does in fact fail sometimes20:24
rfolco|roverweshay|ruck, last time it failed even if the filename was not in the rpm va output.20:25
weshay|ruckrlandy, https://code.engineering.redhat.com/gerrit/#/c/202706/ is correct, merged20:30
rlandythanks20:31
rfolco|roverweshay|ruck, but even fixing the check itself, if the file is missing on rpm_va output, the job will fail...20:31
rfolco|rovermissing     /var/lib/pcsd20:31
weshay|ruckya.. same shit20:31
rfolco|roverso also need to understand why the image is missing files...20:32
*** jbadiapa has quit IRC20:32
weshay|ruckrfolco|rover, yes indeed we do.. this started after chandankumar's refactor of the tripleo-ci/ooo-buildimage and oooq/buildimages role20:32
rfolco|roverweshay|ruck, I did not look at the code yet, but maybe we close out the qcow2 image while its still copying files into it20:33
*** ccamacho has quit IRC20:39
*** jtomasek has quit IRC21:18
*** jmasud has quit IRC21:28
*** jmasud has joined #oooq21:38
*** jmasud has quit IRC21:55
*** jfrancoa has quit IRC21:56
rlandyweshay|ruck: still around?22:02
weshay|ruckrlandy, aye22:10
weshay|ruckrlandy, check ur email22:10
rlandyweshay|ruck: thanks for the graphical backup22:11
weshay|ruck:)22:11
rlandyweshay|ruck: your opinion of reverting the change to add the private network?22:11
weshay|ruckrlandy, do you want me to look at other jobs?22:12
rlandyweshay|ruck: I don't think so - we're in the same place. OVB just died - as did BM again in accessing the undercloud22:12
weshay|ruckrlandy, let's schedule a 1/2 hour for you, sagi and myself to chat22:12
rlandyhangs on introspection22:12
rlandyweshay|ruck: k - tomorrow morning22:13
rlandyat this point, I'd rather go back to the direct external node connection22:13
rlandyand give OVB a shot another time22:13
weshay|ruckk22:14
weshay|rucklet's rope sagi and chat about it22:14
rlandyyep22:15
rlandyI give up22:18
*** dmellado_ has joined #oooq23:10
*** dmellado has quit IRC23:11
*** dmellado_ is now known as dmellado23:11
*** tosky has quit IRC23:13
*** TrevorV has quit IRC23:19

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!