Thursday, 2018-05-31

hubbotFAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-oooq @ https://review.openstack.org/56429100:24
*** tcw has joined #oooq01:58
*** tcw1 has quit IRC02:01
hubbotFAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-oooq @ https://review.openstack.org/56429102:24
*** rfolco_ has quit IRC02:50
*** sanjayu_ has quit IRC02:59
*** ykarel|away has joined #oooq03:47
*** udesale has joined #oooq04:06
*** rlandy|rover|bbl is now known as rlandy|rover04:20
*** rlandy|rover has quit IRC04:20
hubbotAll check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens.04:24
*** links has joined #oooq04:38
*** gvrangan has joined #oooq05:24
*** ratailor has joined #oooq05:27
*** sshnaidm_pto has quit IRC05:28
*** gvrangan has quit IRC05:32
*** quiquell|off is now known as quiquell05:33
*** gvrangan has joined #oooq05:33
*** pgadiya has joined #oooq05:38
*** pgadiya has quit IRC05:38
*** marios has joined #oooq05:44
*** jaosorior has quit IRC05:58
*** quiquell is now known as quiquell|afk06:13
*** jaosorior has joined #oooq06:14
*** saneax has joined #oooq06:14
*** gvrangan has quit IRC06:23
hubbotAll check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens.06:24
*** gvrangan has joined #oooq06:26
*** florianf has joined #oooq06:56
*** zoli is now known as zoli|wfh07:08
*** zoli|wfh is now known as zoli07:08
*** quiquell|afk is now known as quiquell07:09
*** ykarel_ has joined #oooq07:10
*** ykarel|away has quit IRC07:12
*** tesseract has joined #oooq07:22
*** skramaja has joined #oooq07:23
*** skramaja_ has joined #oooq07:28
*** skramaja has quit IRC07:28
*** ccamacho has joined #oooq07:44
*** amoralej|off is now known as amoralej07:55
*** holser__ has joined #oooq08:02
*** ykarel_ is now known as ykarel08:03
*** ratailor has quit IRC08:21
hubbotAll check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens.08:24
*** holser__ has quit IRC08:24
*** sshnaidm_pto has joined #oooq08:25
*** ratailor has joined #oooq08:29
*** gvrangan has quit IRC08:35
*** ykarel is now known as ykarel|lunch08:42
*** gvrangan has joined #oooq08:52
*** panda|off is now known as panda09:04
quiquellpanda: Good morning09:11
pandaquiquell: helloooo09:12
quiquellpanda: Do you have a minute to check logs, and help me understand the n -> n + 109:13
quiquellFor sure it's not working but whant to know what it's and what not09:13
pandaquiquell: sure, how do you want to proceed ?09:15
quiquellpanda: bj and share screens or shared tmux ?09:17
quiquellhttps://bluejeans.com/789106523209:19
*** ykarel|lunch is now known as ykarel09:31
*** ratailor has quit IRC09:31
*** udesale has quit IRC09:32
*** ratailor has joined #oooq09:33
*** apetrich has quit IRC09:34
arxcruz|ruckdalvarez: around?09:34
arxcruz|ruckdalvarez: have you seen this errors before? http://paste.openstack.org/show/722402/09:35
*** udesale has joined #oooq09:38
*** ratailor has quit IRC09:46
*** ratailor has joined #oooq09:49
dalvarezarxcruz|ruck: no idea09:50
dalvarezi havent seen that before09:50
arxcruz|ruckthis is happening in one of our jobs in ocata all the time :/09:50
dalvarezarxcruz|ruck: for some reason it looks like soemthing is adding a second manager to OVS09:51
*** ratailor has quit IRC09:51
dalvarezmaybe slaweq can take a look09:51
dalvarezlooks like a tripleo thing though09:51
*** ratailor has joined #oooq09:52
dalvarezmaybe not, neutron adds the manager from its code too09:52
dalvarezarxcruz|ruck:  ^09:52
arxcruz|ruckhmmmm, okay09:52
*** gvrangan has quit IRC09:57
*** ykarel_ has joined #oooq10:18
*** ykarel has quit IRC10:21
hubbotAll check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master.10:24
quiquellpanda: Adding fs050 to the gates https://review.openstack.org/#/c/570882/10:24
quiquellIt's working already10:24
*** gvrangan has joined #oooq10:39
pandaquiquell: I have the name for the upgrade ddirection10:45
quiquellpanda: yes !!!10:45
pandaquiquell: for the differentiation10:45
quiquellpanda: what do you have in mind10:46
pandaquiquell: I usually borrow a lot from chemistry lately10:46
pandaquiquell: do you know what a atom orbital is ?10:46
quiquellpanda: Yep I remember the orbits of energy10:47
quiquellor similar10:47
pandaquiquell: ok the electron have a particular value to identify when the are in the same orbital, and it's their spin10:48
pandaquiquell: I suggest we call this spin up and spin down10:48
quiquellpanda: holy shit10:49
*** ykarel_ is now known as ykarel10:49
quiquellpanda: But it's always a spin up, what changes is the initial orbit10:50
quiquellbut the direction is always an upgrade10:50
quiquellincrement/decrement is wrong too10:50
quiquellwhat we change is the install release10:50
quiquellif delta is 110:51
quiquelland install release is is the increment or the decrement10:52
quiquellBut it always going to be a spin up10:52
quiquellDon't know10:52
quiquellpanda: I like spin up  or spin down10:52
quiquellOk, so it means where do we want to put the sable release10:53
quiquellat intall or at target10:54
quiquellso10:54
quiquell--stable-release-place=install --stable-release-place=target10:54
*** skramaja_ has quit IRC10:55
quiquell--stable-release-place=initial --stable-release-place=target10:55
quiquellpanda: What do you think ?10:56
pandafor what are these cli arguments ?10:56
quiquellemit_releases_file.py10:57
quiquellBut similar can be for the toci_type or job env10:57
panda--spin up --spin down ? :)10:57
quiquellIt's always a spin up10:57
quiquellit's an upgrade10:57
quiquellwhat changes is the initial point10:58
quiquellit always goes from a lower orbit the upper one10:58
quiquellbut what changes is the initial orbit10:58
panda--start-from-current --start-from-previous ? The difficulty is always the same, there are not many workds to describe a change in the starting point, our languages are alwasy forcused on what come next :)10:59
*** skramaja has joined #oooq10:59
quiquellnot previous11:00
quiquellwe have de ffu11:00
pandaI give up11:00
pandawe have ffu11:00
quiquellhaha11:00
pandaFFFFFUUUUUUUUUUUUU11:00
*** ratailor_ has joined #oooq11:00
quiquell--stable-at-install --stable-at-target11:01
panda--start-release-is-really-the-one-that-im-passing --start-release-is-not-that-one-its-the-other-one11:02
quiquell--install-with-release --upgrade-with-release11:02
quiquellpanda: Look the las too11:02
quiquells/too/two/11:02
*** ratailor has quit IRC11:03
*** tesseract has quit IRC11:07
*** tesseract has joined #oooq11:08
*** panda is now known as panda|lunch11:13
*** udesale has quit IRC11:33
*** udesale has joined #oooq11:37
*** gvrangan has quit IRC11:42
*** udesale has quit IRC11:47
*** udesale has joined #oooq11:51
quiquellpanda|lunch: Have it upgrade-from-release upgrade-to-release11:59
*** atoth has joined #oooq11:59
panda|lunchquiquell: that might work, now they just have to be mutually exclusive :)12:00
quiquellpanda|lunch: That's just implementation but need to be done12:00
quiquellpanda|lunch: btw, undercloud upgrade finished12:01
quiquellWe can take a look later on12:01
panda|lunchquiquell: ok, give me some time to look at the cards for today12:02
*** sshnaidm_pto has quit IRC12:02
*** udesale has quit IRC12:03
*** udesale has joined #oooq12:03
*** trown|outtypewww is now known as trown12:04
quiquellok12:04
*** udesale_ has joined #oooq12:07
trownhmm... not to throw a wrench in the naming party... but I think the script itself should be figuring out "spin-up" vs "spin-down", and we need to just add our keyword to the job type12:07
trownotherwise we are adding logic somewhere that is not python12:07
quiquelltrown: Figuring it aut means reading an enviroment variable12:08
quiquelltrown: The only bash logic is passing what the job want to do12:08
trownquiquell: how does the environment variable get set?12:09
panda|lunchtrown: agreed12:09
quiquelltrown: toci_type or var in the job12:09
*** udesale has quit IRC12:09
quiquelltrown: At least this are the only options I know of, you guys sure know other ones12:09
trownquiquell: if it is toci_type, then you have to have logic to parse toci_type12:10
quiquelltrown: Of course not doing it at fs12:10
quiquelltrown: Reading toci_type in two places is not a bad thing ?12:11
*** panda|lunch is now known as panda12:11
trownbut... I guess we already pull out featureset... maybe it is fine to only do the parsing in bash, and none of the logic of what to do based on that parsing12:11
quiquelltrown: Changing the toci_type parsing in the only place we do it, I think is better12:13
quiquelltrown: And isolate script from toci_type too there stuff we don't care about there12:13
*** gvrangan has joined #oooq12:13
pandaquiquell: ok, ready12:16
quiquellpanda: give me the key again12:19
pandaquiquell: lol12:19
pandaquiquell: take it from the previous place :)12:19
quiquellWaait I am going to eat soimething first12:19
quiquellIt was wrong12:19
quiquellIt was good ?12:19
pandaquiquell: it was ok12:19
quiquellpanda: zuul@38.145.32.10012:21
pandaquiquell: ok, take you time for the lunch12:21
quiquellpanda: Going to eat something now12:21
pandaquiquell: no rush12:21
quiquellpanda: You can explore already12:21
hubbotAll check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master.12:24
*** rlandy has joined #oooq12:28
quiquellrlandy: Good morning, https://review.openstack.org/#/c/568946/ is merged12:29
*** apetrich has joined #oooq12:29
*** udesale_ has quit IRC12:29
*** udesale_ has joined #oooq12:30
rlandyquiquell: yes - I checked with panda - he ok'ed it. Emilien wants that review in12:30
rlandyarxcruz|ruck: hello!12:31
*** rlandy is now known as rlandy|rover12:32
arxcruz|ruckrlandy|rover: hey12:32
rlandy|roverour zuul queues are so long these days12:32
arxcruz|ruckrlandy|rover: so, quick question, where are the rdo phase 1 jobs?12:32
rlandy|roverwonder if there is anything we can do about that12:32
rlandy|roverci.centos12:32
arxcruz|ruckrlandy|rover: nevermind, the dashboard just take longer to update12:32
rlandy|roverand I ma going to killl that stupid ocata  job today12:32
arxcruz|ruckso pike was promoted :)12:33
arxcruz|rucki was looking this morning, the jobs pass but dashboard wasn't updating, makes me wondering if i was looking the wrong jobs12:33
rlandy|roverarxcruz|ruck: dashboard is usually accurate12:33
rlandy|roverdepends on the hash in use12:33
arxcruz|ruckrlandy|rover: regarding ocata, dalvarez told that something is trying to insert two managers12:34
arxcruz|ruckrlandy|rover: i meant this http://rhos-release.virt.bos.redhat.com:3030/rhosp12:34
arxcruz|ruckuntill a few minutes ago was showing pike 4d old12:34
arxcruz|ruckbut now phase 1 is showing as green :)12:34
rlandy|roverphase2 needs to promote now - I'll check on that12:35
rlandy|roverarxcruz|ruck: what do you mean about two managers?12:35
rlandy|roverthe job gets killed at diff places each time12:35
arxcruz|ruckrlandy|rover: http://paste.openstack.org/show/722402/12:35
arxcruz|ruckTransaction causes multiple rows in \"Manager\" table to have identical values12:36
rlandy|roveryeah  - I have seen that constraint violation error before12:37
rlandy|roverthe problem is that the error is not consistent12:37
rlandy|roverdid you get any more info?12:38
arxcruz|ruckno, just a bunch of errors in neutron, which i believe that's the reason for no valid hosts12:38
arxcruz|ruckbut i'm not good in neutron stuff12:38
rlandy|rovernote that sometimes we get the conductor error12:38
rlandy|roverI have ping'ed nova and ironic guys on this12:39
arxcruz|ruckyou should ask dtantsur|afk12:39
rlandy|roverI have12:39
rlandy|roverand owalsh12:39
arxcruz|ruckdtantsur|afk: I can pay you a beer at brewdog if you help us :D12:39
rlandy|roverif it's a neutron issue, we need to bug someone else12:40
*** gvrangan has quit IRC12:40
rlandy|roverarxcruz|ruck: but this has to get assigned today12:41
rlandy|rovercan't carry on12:41
rlandy|roverI agree there12:41
arxcruz|ruckrlandy|rover: https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-promote-ocata-rdo_trunk-minimal-356/undercloud/var/log/extra/errors.txt.gz12:42
arxcruz|ruckthere are errors in ironic, neutron and nova12:43
rlandy|roverhttps://bugzilla.redhat.com/show_bug.cgi?id=150603512:43
openstackbugzilla.redhat.com bug 1506035 in openstack-neutron "neutron openvswitch agent creates multiple rows in Manager table" [Low,Closed: notabug] - Assigned to amuller12:43
*** gvrangan has joined #oooq12:43
rlandy|rovererror could be harmless12:43
ykarelrlandy|rover, arxcruz|ruck  i think we are seeing non consistent issue in ocata is ironic is being restarted multiple times, may be other services also, in pike i can't see multiple restarts12:43
ykarelcan you look in this area12:43
ykarelis this expected12:43
rlandy|roverykarel: the reason is why12:44
rlandy|roverthe hatrdware is ok for pike, queens, master12:44
rlandy|roverocata works on rodcloud jobs12:44
pandaquiquell: looks like it's doing as expected for the build-test-packages, building them at install time, skipping them at upgrade12:44
quiquellpanda: But install-repo is reinstalling at upgrades12:44
quiquellI think12:44
rlandy|roverykarel: we also have no access to the hardware12:44
pandaquiquell: yes12:44
quiquellpanda: That's what we have to fix12:45
pandaquiquell: let's see how the precedence is12:45
*** udesale has joined #oooq12:45
quiquellbut build-test-packages is alaredy idempotent12:45
rlandy|roverso while I'd love to dump this one some team - we need to pick the right one12:45
*** udesale_ has quit IRC12:45
arxcruz|ruckrlandy|rover: check comment 512:45
ykarelrlandy|rover, i have a reproducer in a beaker machine, in case u want to look12:45
ykareli loan from a colleague12:46
rlandy|roverykarel: did it actually reproducer it?12:46
pandaquiquell: so, the gating repo has priority 1 as I remembered, I'm not wure if this means that newer versions will not be installed12:46
rlandy|roverthe errors12:46
ykarelrlandy|rover, no conductor issue reproduced12:46
ykareland i think the reason is ironic services are being restarted12:46
ykarelwe need to find is that expected12:46
pandaquiquell: that's the next thing to check: if a package in master will be installer over a package in queens even if it comes from a repo with priority=112:46
quiquellpanda: Have you check if the upgrade ins installing queens in master already ?12:47
quiquellrlandy|rover, arxcruz|ruck: going to put for a few days the ruck rover alarm tool we are doing12:47
quiquellrlandy|rover, arxcruz|ruck: let me know if it's of any use12:48
rlandy|roverykarel: do you work with any neutron expert?12:48
ykarelrlandy|rover, nope12:48
pandaquiquell: tht version is 0.20180531112327.b7d84da.el712:48
*** ruck-rover-alert has joined #oooq12:49
pandaquiquell: looks like the one from the change to me12:49
rlandy|roverykarel: the reproducer I ran on my minidell never got very far12:49
rlandy|roverdo you see the services restarting?12:49
rlandy|rovercan you hold that beaker machine?12:49
ruck-rover-alert[Alerting] Promotions for ocata alert: Ocata promotions problem in the last 24h: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=111&orgId=112:49
pandaquiquell: so even if install-repo is not idempotent, yum does the math for us, we are just wasting time12:49
pandainstalling a repo that is not useful anymore12:49
ykarelrlandy|rover, yes services are being started around every 10 minutes12:49
ruck-rover-alert[Alerting] Gate   jobs  upstream alert: Upstream gate is failing in the last 24h: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=55&orgId=112:50
ykarelrlandy|rover, a puppet apply is continuosly running12:50
ruck-rover-alert[Alerting] RDO master noop change results alert: RDO jobs failing at master noop change https://review.openstack.org/#/c/560445.: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=103&orgId=112:50
ruck-rover-alert[Alerting] RDO stable/ocata noop change results alert: RDO jobs failing at stable/ocata noop change https://review.openstack.org/#/c/564291.: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=106&orgId=112:50
quiquellpanda: but then it's wrong, it has to be the one from master not the one from the change12:50
ruck-rover-alert[Alerting] Upstream zuul max enqueued time alert: A lot of jobs at upstream tripleo gate queue, check https://zuul.openstack.org: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=71&orgId=112:50
quiquellI mean after the upgrade the tht have to be the master version ?12:50
rlandy|roverquiquell: can we turn this off12:50
pandaquiquell: lol, true, let's checkj better12:50
quiquellrlandy|rover: Sure sorry12:50
rlandy|roverwe're in the middle of a conversation here12:50
*** ruck-rover-alert has quit IRC12:50
ykarelrlandy|rover, /usr/bin/ruby /usr/bin/puppet apply --summarize --detailed-exitcodes /etc/puppet/manifests/puppet-stack-config.pp12:50
ykarelthis restarts services again and again12:51
rlandy|rovercan you hold that machine for  a bit12:51
rlandy|roverEmilienM: ping if you are around12:51
rlandy|roverremember you offered to help yesterday??12:52
rlandy|roverany thoughts on <ykarel> rlandy|rover, a puppet apply is continuosly running?12:52
rlandy|roverwe have an ocata issue here that is doing us in12:52
rlandy|roverykarel: adding your comments to the bug12:53
*** ruck-rover-alert has joined #oooq12:53
myoungo/ good morning12:54
rlandy|roverykarel: can we get access to your beaker machine?12:55
pandaquiquell: so tht package version is 8.0.3 so it's queens12:55
ykarelrlandy|rover, ok, yes you can get access12:55
pandaquiquell: but since this is after update, I expect this to be master12:55
*** quiquell is now known as quique|lunch12:55
quique|lunchpanda: Give me a minute12:56
ykarelrlandy|rover, i can see service being restarting in ovb jobs also both in pike/ocata but there we might be just lucky12:56
pandaquique|lunch: oh I thought you were lunching 30 minutes ago12:56
pandaquique|lunch: take your time12:56
rlandy|roverI was also looking at this .. https://bugs.launchpad.net/tripleo/+bug/167303012:57
openstackLaunchpad bug 1673030 in tripleo "ocata/stable jobs are broken (httpd restart failures)" [Critical,Fix released] - Assigned to Michele Baldessari (michele)12:57
rlandy|roverlooks like service restarts are nothing new12:57
rlandy|roverthey probably mess up a lot12:57
ykarelhmm, that's bad, i think services should not restart until really required12:59
rlandy|roverykarel: so I think we may have been getting by with this problem for some tim12:59
rlandy|roverwhen EmilienM is available, I'd like to run it by him12:59
ykarelrlandy|rover, okk12:59
ykarelrlandy|rover, ssh -X root@dell-per320-2.gsslab.pnq.redhat.com13:00
rlandy|roverykarel: awesome - I'm in - thank you13:00
ykarelrlandy|rover, ok  and undercloud: ssh -F ~/.quickstart/ssh.config.ansible undercloud13:01
*** florianf has quit IRC13:05
*** florianf has joined #oooq13:05
rlandy|roverarxcruz|ruck: once this bug is assigned - I am considering changing the promotion criteria13:06
rlandy|roverI'll look at why pike phase 2 is behind13:07
rlandy|roveralso - our upstream zuul queues are so long13:07
*** florianf has quit IRC13:08
*** florianf has joined #oooq13:08
*** quique|lunch is now known as quiquell13:09
quiquellpanda: ready13:09
arxcruz|ruckrlandy|rover: pike is behind because phase 1 was 4 days old, i believe we will have a pike phase 2 today13:09
rlandy|roverI hope so13:09
quiquellpanda: Let's look at it before the meeting13:09
rlandy|roverarxcruz|ruck: just retried https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/periodic-pike-rdo_trunk-featureset020-1ctlr_1comp_64gb/13:11
rlandy|rovermirror issue13:11
*** trown is now known as trown|brb13:11
*** trown|brb is now known as trown13:15
pandaquiquell: you there ?13:15
pandaquiquell: the tht version is from queens, so the priority is probably blocking the upgrade for packages13:16
quiquellpanda: Yep13:16
pandaquiquell: so we need to make install-repo smarter13:16
quiquellpanda: renamign the gating repo tarball is not enough _13:17
quiquellI mean adding the release on it13:17
quiquellso install repo just use the repo from the current release it is working on13:17
pandaquiquell: we need to remove what's on the previous version too13:17
quiquellpanda: Maybe we can move it more than remove it13:18
quiquellFor debuggin purposes at the end of install repo13:18
pandaquiquell: doesn't matter, it needs to be disabled13:18
pandain some way13:18
quiquellpanda: Let's hack it and see what happends will try13:18
pandamove it, remove it, stab it, strangle it, whatever :)13:18
quiquellpanda: the gating repo is not needed after install ?13:19
pandaquiquell: not as far as I know13:19
quiquellok13:19
*** marios has quit IRC13:19
*** marios has joined #oooq13:20
EmilienMrlandy|rover: hey, what's up? in meeting the next 1h30, then available13:21
ruck-rover-alert[Alerting] RDO stable/queens noop change results alert: RDO jobs failing at stable/queens noop change https://review.openstack.org/#/c/567224.: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=104&orgId=113:22
ruck-rover-alert[Alerting] Promotions for pike alert: Master promotions problem in the last 24h: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=110&orgId=113:22
ruck-rover-alert[Alerting] RDO stable/pike noop change results alert: RDO jobs failing at stable/pike noop change https://review.openstack.org/#/c/564285.: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?fullscreen=true&edit=true&tab=alert&panelId=105&orgId=113:22
*** ruck-rover-alert has quit IRC13:22
rlandy|roverEmilienM: ok - pls ping me when you have some time - we need your puppet expertise13:23
*** ruck-rover-alert has joined #oooq13:23
ruck-rover-alert[OK] Promotions for pike alert: Master promotions problem in the last 24h13:23
myoungCI squad: standup/scrum in 6m13:24
rlandy|roverEmilienM: https://bugs.launchpad.net/tripleo/+bug/1774079 - this is killing the ocata promotion for weeks13:24
openstackLaunchpad bug 1774079 in tripleo "[ocata promotion] phase1 (ci.centos) job tripleo-quickstart-promote-ocata-rdo_trunk-minimal fails introspection/deploy "No valid host found"" [Critical,Triaged]13:24
ruck-rover-alert[Alerting] Promotions for queens alert: Queens promotions problem in the last 24h13:24
ruck-rover-alert[Alerting] RDO stable/queens noop change results alert: RDO jobs failing at stable/queens noop change https://review.openstack.org/#/c/567224.13:24
rlandy|roverykarel kindly has a reproducer env set up13:24
*** ratailor_ has quit IRC13:24
ruck-rover-alert[OK] Promotions for queens alert: Queens promotions problem in the last 24h13:25
ruck-rover-alert[OK] RDO stable/queens noop change results alert: RDO jobs failing at stable/queens noop change https://review.openstack.org/#/c/567224.13:25
rlandy|roverquiquell: pls, pls can we get these ruck-rover alerts turned off - too many to follow13:25
arxcruz|rucklol13:25
pandaruck-rover-alert: ?13:25
arxcruz|ruckquiquell: perhaps messaging only the ruck and rover13:25
quiquellrlandy|rover: Sorry again, It's not suppose to do it here... my fault13:26
*** ruck-rover-alert has quit IRC13:26
arxcruz|ruckquiquell--13:26
hubbotarxcruz|ruck: quiquell's karma is now 013:26
rlandy|roverquiquell: if it alerts on my personal feed, np13:26
quiquellarxcruz|ruck++13:26
hubbotquiquell: arxcruz|ruck's karma is now 213:26
rlandy|roverit's just a lot of text *** in the middle *** of a chat conversation13:26
quiquellrlandy|rover, arxcruz|ruck: alerts are at #tripleo-ci now13:27
rlandy|roveralso - if things are ok - we don't need to know about them13:27
arxcruz|ruckrlandy|rover: i believe we should assign to ironic, but not sure if they will care too much since it's ocata13:27
panda[Alerting] scrum meeting in 3 min13:27
panda[OK] scrum meeting in 3 min13:27
rlandy|roverpanda: lol13:27
quiquell:-)13:27
quiquellok means "back to normal" they don't get repeated13:28
*** links has quit IRC13:29
*** zoli is now known as zoli|lunch13:30
*** sshnaidm_pto has joined #oooq13:31
myoungrlandy|rover, scrum13:32
*** udesale_ has joined #oooq13:34
rlandy|roverquiquell: alerts on tripleo-ci are great - thanks13:36
*** udesale has quit IRC13:37
quiquellrlandy|rover: We have to adjust them the time range to check problems13:37
pandaquiquell: and add some dampening, to me a OK 3 minutes after an ALERT is too much noise13:43
quiquellpanda: the local influxdb is eating all the m emory, that was it was funny13:44
quiquellpanda: normally it's not so verbose13:44
*** gvrangan has quit IRC13:47
*** jaganathan has quit IRC14:02
ykarelrlandy|rover, seeing local reproducer, cause for restart should be:-14:08
ykarelMay 31 13:03:17 undercloud os-collect-config[714]: /usr/libexec/os-refresh-config/post-configure.d/98-undercloud-setup: line 91: HOME: unbound variable14:08
ykarelMay 31 13:03:17 undercloud os-collect-config[714]: [2018-05-31 13:03:17,766] (os-refresh-config) [ERROR] during post-configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/post-configure.d']' returned non-zero exit status 1]14:08
ykarelMay 31 13:03:17 undercloud os-collect-config[714]: [2018-05-31 13:03:17,767] (os-refresh-config) [ERROR] Aborting...14:08
ykarelMay 31 13:03:17 undercloud os-collect-config[714]: Command failed, will not cache new data. Command 'os-refresh-config' returned non-zero exit status 114:09
*** apetrich has quit IRC14:09
ykarelMay 31 13:03:17 undercloud os-collect-config[714]: Sleeping 1.00 seconds before re-exec.14:09
ykarelEmilienM, ^^ any idea?14:09
rlandy|roverykarel: my best guess is that this restart problem has been going on for some time - but we are seeing the failures somewhat more consistently now. if we get nowhere with the debug today, I think we need to promote14:16
pandaupcrap the undercloud !14:16
ykarelrlandy|rover, sure,14:17
rlandy|rovermaybe EmilienM has some more info here - we are going EOL in august14:17
rlandy|roverI want to give it one more day14:17
ykarelyup waiting for him14:17
quiquellpanda: It was uncrap ?14:17
rlandy|roverin truth, we had no promotions from 05/09 to 05/2214:18
rlandy|roverwhen things went south14:18
rlandy|rovera million things could have gone in in that time14:18
rlandy|roverykarel: but I agree - we don't give up without a proper last fight/investigation14:18
pandaquiquell: I clearly heard "upcrap"14:19
ykarelrlandy|rover, can see this restart even before 05/09, but in pike not14:19
hubbotAll check jobs are working fine on stable/queens, stable/ocata, stable/ocata, master.14:24
*** zoli|lunch is now known as zoli|wfh14:42
*** zoli|wfh is now known as zoli14:42
quiquellno usre needed for ruck/rover alarms now http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?orgId=114:43
rlandy|roverarxcruz|ruck: ugh - https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/periodic-pike-rdo_trunk-featureset020-1ctlr_1comp_64gb/44/console14:51
rlandy|roverpike rdo phase 214:51
arxcruz|ruckrlandy|rover: maybe ykarel can help ?14:52
arxcruz|ruckit's packaging issue14:52
rlandy|roverwe need to upgrade 7.514:52
rlandy|roverchecking host14:52
arxcruz|ruckrlandy|rover: you mean the image directly to 7.5 ?14:53
rlandy|rovercreating bug to start with14:54
arxcruz|ruckrlandy|rover: i'll create14:54
ykarelrlandy|rover, also need to remove the workaround:- install known good kernel version when moving to rhel 7.514:55
pandamyoung: trown quiquell when do we want to finish the discussion ? I'm available14:55
rlandy|roverykarel: yep - knew that one would bite us some day14:55
ykarelyup14:56
quiquellpanda: want to ask myoung stuff about the promoter first, just a few minutes14:56
EmilienMykarel: I have one more 1h meeting and I'm done :D14:56
arxcruz|ruckrlandy|rover: https://bugs.launchpad.net/tripleo/+bug/177443514:57
openstackLaunchpad bug 1774435 in tripleo "Libvirt package dependences broken on RDO Phase 2" [Undecided,Triaged] - Assigned to Ronelle Landy (rlandy)14:57
ykarelEmilienM, no prob, take your time :)14:57
rlandy|roverLast login: Thu May 31 09:16:49 2018 from slave-rdo-ci-fx2-01-s6.v101.rdoci.lab.eng.rdu.redhat.com14:57
rlandy|rover[root@rdo-ci-fx2-01-s6 ~]# cat /etc/redhat-release14:57
rlandy|roverRed Hat Enterprise Linux Server release 7.4 (Maipo)14:57
quiquellpanda: btw fs050 it's ready for the gates https://review.openstack.org/#/c/570882/14:57
rlandy|roverhmmm - this machine must support other jobs though14:58
*** skramaja has quit IRC15:00
myoungquiquell, panda, have tempest squad scrum, avail after15:00
myoungfolks are on PTO, will just be a few mins15:02
trownim available whenever15:05
quiquellrlandy|rover, trown, myoung: +2/+1w fs050 gating https://review.openstack.org/#/c/570882/15:06
rlandy|rovermyoung:what are our internal slaves usually running? this one is rhel 7.415:06
myoungpanda: do you have a hot sec to join my BJ with chandankumar?15:07
rlandy|roverhave you upgraded any to rhel 7.5?15:07
myoungpanda: need a quick bit of advice15:07
chandankumarpanda: https://review.openstack.org/#/q/topic:refstack-support+(status:open+OR+status:merged) we need your eyes on this one15:07
myoungbluejeans.com/matyoung15:07
myoungrlandy|rover: sec15:08
rlandy|roversure15:08
quiquellpanda: have some time for the meeting15:08
myoungrlandy|rover: they should all be at 7.4, now that 7.5 has released we could/should update the virthosts15:13
myoungrlandy|rover: the slaves (small VM's) are running fedora15:13
rlandy|rovermyoung: stupid question - how do we update - rhos-release or an entire reboot?15:14
myoungrlandy|rover: afaik the rhel 7.4 version on virthosts, while needing to be upgraded shouldn't be an issue, as for those jobs UC/OC area all libvirt domains anyway15:14
rlandy|rovermyoung: see error .. https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/periodic-pike-rdo_trunk-featureset020-1ctlr_1comp_64gb/44/console15:14
myoungrlandy|rover: aye it's generally a "rhos-release $theArgs && yum update && reboot"15:15
*** ccamacho has quit IRC15:15
rlandy|roveradmin.so.0(LIBVIRT_ADMIN_PRIVATE_3.2.0)(64bit)\n           Installed: libvirt-libs-3.9.0-14.el7_5.5.x86_64 (@rhelosp-rhel-7.4-server)\n              ~libvirt-15:15
* myoung looks15:15
rlandy|roverso if we update to 7.515:15
rlandy|roverwe should get by this15:15
*** ccamacho has joined #oooq15:15
rlandy|roveralso we are locking the kernel15:15
rlandy|roverhttp://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/roles/prep-internal-host/tasks/main.yml#n3615:15
rlandy|rover"rhos-release rhel-7.5 && yum update && reboot"?15:16
rlandy|roverInstalled: libvirt-libs-3.9.0-14.el7_5.5.x86_6415:17
*** holser__ has joined #oooq15:17
pandamyoung: chandankumar sorry about that. myoung where did you go ?15:17
myoungpanda: we're done, most of tempest team is out so standup was just chandankumar's question15:18
myoungpanda, quiquell now have time :)15:18
myoungrlandy|rover: ack, let's just upgrade them15:18
quiquellpanda: Let's do this I don't have much time left15:18
rlandy|rovermyoung: using  "rhos-release rhel-7.5 && yum update && reboot" ?15:18
*** sanjay__u has joined #oooq15:18
myoungrlandy|rover: in the past we did hit some issues around having multiple versions of rpm's thru upgrades and needed a little kung-fu, mostly because there were some broken package dependencies in the 7.3 --> 7.4 upgrade path (686 binaries were being pulled in).  TLDR we should try on a single virthost first :)15:19
rlandy|rovermyoung: I am on rdo-ci-fx2-01-s6.v101.rdoci.lab.eng.rdu.redhat.com15:19
myoungrlandy|rover: heh hence http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/ci-scripts/virthost-rpm-cleanup.sh15:19
rlandy|rovercould try that one15:19
pandamyoung: quiquell in myoung room15:19
pandaready and waiting15:20
myoungpanda, quiquell, inc15:20
pandalock and loaded15:20
myoungrlandy|rover: can help / jump in after15:20
rlandy|rovermyoung: ping me when you are done with meeting and we'll do this15:20
quiquellconnecting15:20
myoungrlandy|rover:  ack15:20
*** holser__ has quit IRC15:28
*** saneax has quit IRC15:30
*** marios has quit IRC15:39
*** udesale_ has quit IRC15:40
*** jaganathan has joined #oooq15:40
*** jaganathan has quit IRC15:42
*** jaganathan has joined #oooq15:42
EmilienMrlandy|rover: what is the error, what is the context? i can help now15:55
rlandy|roverEmilienM: hi there - nice presentation earlier today! ykarel and I have been looking at the ocata failures in phase 115:56
rlandy|roverci.centos15:56
rlandy|rovergetting bug15:57
rlandy|roverhttps://bugs.launchpad.net/tripleo/+bug/177407915:57
openstackLaunchpad bug 1774079 in tripleo "[ocata promotion] phase1 (ci.centos) job tripleo-quickstart-promote-ocata-rdo_trunk-minimal fails introspection/deploy "No valid host found"" [Critical,Triaged]15:57
rlandy|roverthe problem being that the jobs fails consistently since 05/22 but always in a different place15:58
rlandy|roverykarel set up a reproducer env (which we can give you access to)15:58
rlandy|rovernoticing that there are a lot of restarts15:58
ykarelrlandy|rover, EmilienM i think i got something, proposed a patch15:59
ykarela cherry-pick, EmilienM can you check15:59
rlandy|roverykarel: interesting - date seems to be earlier expected - but it's not in ocata16:02
rlandy|rovernewton?16:02
trownhaha16:03
trownok, bye matt :P16:03
quiquellUps I have to drop btw16:03
myounggah.  bluejeans16:03
quiquellSee you tomorrow will ask panda about stuff you talk about16:03
trownno worries, I think we have a good idea what are the next steps16:03
*** quiquell is now known as quiquell|off16:03
pandatrown: you going to rejoin ?16:04
trownpanda: do we have more to discuss?16:05
ykarelrlandy|rover, not checked for newton16:05
rlandy|roverykarel: np - I'll check16:05
rlandy|roverykarel: do you see an improvement in restarts>16:05
rlandy|rover?16:05
rlandy|roverwe can deal with newton if it fixes ocata16:06
pandatrown: if you are ok with the notes on the card then no, I will remove the maturity labels and it's good to go16:06
ykarelrlandy|rover, improvement where, i haven't applied it locally yet16:06
trownpanda: ya, I think I could work on any of the cards16:06
pandatrown: ok16:06
rlandy|roverykarel: ok  then, we'll see what the gates say16:06
ykarelrlandy|rover, but line 91: HOME: unbound variable is still an issue, what i noticed is the .novaclient file is created in /home/stack and that script is running from root16:08
ykarelso even if $HOME is available, it would look at /root/.novaclient16:09
rlandy|roverykarel: same as with the duplicate entry - there are a lot of error we follow up and sometimes they are unrelated16:09
ykareli meant: https://github.com/openstack/instack-undercloud/blob/stable/ocata/elements/undercloud-install/os-refresh-config/post-configure.d/98-undercloud-setup#L9116:09
rlandy|roverykarel: considering the errors and stack traces, your patch seems like a good shot16:09
ykarelok let's get EmilienM mwhahaha opinion on that16:10
rlandy|roversure16:10
EmilienMykarel: ok, will check16:11
ykarelEmilienM, Thanks16:11
* ykarel leaving16:11
rlandy|roverykarel: thanks for your work on this!16:11
*** amoralej is now known as amoralej|off16:15
*** ykarel_ has joined #oooq16:16
*** ykarel has quit IRC16:18
*** apetrich has joined #oooq16:18
*** ccamacho has quit IRC16:21
*** zoli is now known as zoli|gone16:24
*** zoli|gone is now known as zoli16:24
hubbotAll check jobs are working fine on stable/ocata, stable/ocata, master, stable/queens.16:24
*** panda is now known as panda|off16:25
rlandy|rovermyoung: pls ping when you are ready to work on the virt host upgrade16:27
myoungrlandy|rover: ack, ready16:27
rlandy|rovermyoung: ok - I am logged into rdo-ci-fx2-01-s6.v101.rdoci.lab.eng.rdu.redhat.com as root16:28
rlandy|roverit's current installed with rhel 7.416:28
rlandy|roveram I good to go with  "rhos-release rhel-7.5 && yum update && reboot"?16:28
myoungtrown: panda|off suggests we tag-team on https://trello.com/c/ZlaiOAZ2 by doing test / impl in parallel16:28
rlandy|roverif not, what additional?16:28
myoungrlandy|rover: do you have a tmate already?16:28
rlandy|roverwill create - sec16:29
myoungcan jump into bj if helps or just in tmate either16:29
*** trown is now known as trown|lunch16:29
myoungtrown|lunch: : happy to take either half16:29
*** ykarel_ has quit IRC16:30
myoungrlandy|rover: (shocker) I kept fairly detailed notes when I did the 7.3 --> 7.4 upgrade (https://trello.com/c/yKpd7REc) - reviewing them now16:30
rlandy|rovermyoung: ok16:31
rlandy|roverlet's try on this one box16:31
rlandy|roverthen we can try the virt packages install16:31
rlandy|roverimho - we just rhos-release and hope for the best16:31
* rlandy|rover is full of hope16:31
*** ykarel has joined #oooq16:50
rlandy|roverhttps://code.engineering.redhat.com/gerrit/14041716:52
rlandy|rovermyoung: ^^16:52
EmilienMcould I get someone kicking off a job with https://review.openstack.org/#/c/566916 on featureset035 so I can debug, I don't have the hardware17:06
EmilienMI really want to reproduce the failure17:07
*** sshnaidm_pto has quit IRC17:08
EmilienMsend the bill to weq17:09
EmilienMwes*17:09
rlandy|roverEmilienM: reproducer not working on your personal rdocloud tenant?17:11
rlandy|roverI can kick it on my tenant17:11
myoungrlandy|rover:  https://code.engineering.redhat.com/gerrit/140418 Update the rpm update script to support 7.517:11
rlandy|rovercomment on line 5 and then let's push this17:12
EmilienMoh wait17:14
EmilienMrlandy|rover: let me try it before :-P17:14
EmilienMI always forgot I can deploy OVB in RDO now17:14
* EmilienM facepalm17:14
rlandy|roverEmilienM: depends on resources - let me know17:15
ykarelrlandy|rover, newton phase1 also red, don't we care about it now?17:15
EmilienMrlandy|rover: ack17:15
ykarelrlandy|rover, currently failing at missing pem file: |  /home/jenkins/.ssh/rdo-ci-public.pem: No such file or directory17:16
rlandy|roverykarel: it's been down for ages17:16
rlandy|roverwhole newton pipeline is disabled17:17
ykarelhmm if these are not cared, shouldn't those be removed17:17
myoungrlandy|rover:  looking at https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/a%20multijob%20perspective/job/rdo-promote-newton-rdo_trunk/17:17
ykarelrlandy|rover, i can see those jobs are running, llast run 2 hours ago17:18
rlandy|roverykarel: very few people care about anything after upstream and rdocloud17:18
rlandy|roverI'll look into it more after when we have ocata and pike going17:18
rlandy|roverhave a bunch of slaves to upgrade17:19
rlandy|roverykarel: your patch got two + 2's17:19
ykarelrlandy|rover, Okk Thanks17:19
ykarelrlandy|rover, yeah saw that, if we get that merged today, we can see the result in tomorrow's run17:19
rlandy|roverykarel++17:20
hubbotrlandy|rover: ykarel's karma is now 117:20
EmilienMrlandy|rover: have you seen that before ? http://paste.openstack.org/show/cF3C5lSLqsxORBBPgjSf/17:20
myoungykarel: I think folks "care" - but lastly needs > resources17:20
ykarelmyoung, agree17:20
ykareli also saw that job by chance17:20
EmilienMI wonder if my openrc is missing something17:20
rlandy|roverEmilienM: can you tell why the stack failed?17:20
rlandy|roverdo you have enough resources17:20
EmilienMI removed all resources in my project17:21
rlandy|roverlet me try my tenant17:21
rlandy|roverone moment17:21
*** sshnaidm_pto has joined #oooq17:21
EmilienMrlandy|rover: can you show your openrc plz? (without password!)17:22
ykarelEmilienM, are u trying v2 or v3 openrc?17:22
rlandy|roveruse v217:22
*** atoth has quit IRC17:22
EmilienMv317:22
ykarelhmm try v217:22
*** atoth has joined #oooq17:22
EmilienMoh ok17:22
ykarelbut should be fixed for v3 as well if that's the issue17:22
rlandy|roverit was a stack create limitation for a long time17:23
myoungEmilienM, rlandy|rover, looking at http://paste.openstack.org/show/722452, because my eyes are bleeding from prev paste :)17:23
rlandy|roverv3 compatibility17:23
EmilienMinteresting it works with deployed-server jobs17:25
EmilienMbut not for ovb, probably some resource in heat17:25
rlandy|roveryep - stack issue17:25
EmilienMok stack deploying now :D17:25
rlandy|roverawesome17:25
EmilienMykarel++17:25
hubbotEmilienM: ykarel's karma is now 217:25
EmilienMrlandy|rover++17:25
hubbotEmilienM: rlandy|rover's karma is now 117:25
EmilienMrlandy++17:25
EmilienMawesome work folks I'll keep repeating :D17:25
*** ykarel is now known as ykarel|away17:27
EmilienMCreate_Failed: Resource CREATE failed: resources.undercloud_env: Property error: resources.undercloud_server.properties.key_name: Error validating value 'key-985': The Key (key-985) could not be found.17:28
* EmilienM sad face17:28
EmilienMif someone can try to reproduce https://logs.rdoproject.org/16/566916/7/openstack-check/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/Zdf5fa249cc654c9d961afde1d13457cf/reproducer-quickstart.sh17:32
myoungrlandy|rover:  https://code.engineering.redhat.com/gerrit/140420 Remove old / uneeded slave label (rdo-manager-64-74proto)17:33
rlandy|rovermyoung: awesome17:34
*** ykarel|away has quit IRC17:37
*** d0ugal_ has joined #oooq17:40
*** florianf has quit IRC17:42
*** d0ugal has quit IRC17:42
rlandy|roverEmilienM: will try that in a bit - just updating slaves atm17:43
*** trown|lunch is now known as trown17:48
trownmyoung: I dont really understand how we would parallelize that card17:48
*** d0ugal__ has joined #oooq17:51
*** d0ugal_ has quit IRC17:53
myoungtrown: i have a similar concern, on my way out for food, back in a bit.17:57
*** myoung is now known as myoung|lunch17:57
*** d0ugal__ has quit IRC17:59
*** gvrangan has joined #oooq18:02
rlandy|roverEmilienM: I have the fs035 reproducer in progress on my tenant - it's still doing the nodepool-setup piece. pls send me your key and I'll add it to the undercloud so you can log in18:07
EmilienMoh wow18:09
EmilienMwhy didn't it work for me18:09
EmilienMrlandy|rover: the second one on https://launchpad.net/~emilienm/+sshkeys18:09
*** gvrangan has quit IRC18:09
rlandy|roveridk - would have to check your run command and env. - adding keys18:09
EmilienMthanks18:10
rlandy|roverEmilienM: pls try ssh zuul@38.145.33.18218:11
rlandy|roverit's still running TASK [nodepool-setup : Install packages]18:11
EmilienMrlandy|rover: I'm in! thanks :)18:11
rlandy|roverEmilienM: ok - I'll let you know when this part is done - you should see stuff happening in the /home/zuul dir afterwards18:12
EmilienMright18:13
EmilienMrlandy|rover: thanks a ton18:13
rlandy|roversure18:13
*** d0ugal__ has joined #oooq18:23
*** gvrangan has joined #oooq18:23
hubbotFAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/56044518:25
*** d0ugal__ has quit IRC18:34
*** tesseract has quit IRC18:34
*** ccamacho has joined #oooq18:45
rlandy|roverha - had to reseat that whole server :(18:47
rlandy|roverback now18:47
rlandy|roverEmilienM: ok - finally - you should see things running now on the reproducer undercloud18:59
*** d0ugal__ has joined #oooq18:59
rlandy|rovermyoung|lunch: on to the third slave19:00
rlandy|roverseems to be working though19:00
EmilienMrlandy|rover: indeed, I tailf console.log now19:00
EmilienMthanks again19:00
rlandy|roverwe could pike promote19:00
*** gvrangan has quit IRC19:00
EmilienMstill curious how you could make it work and not me19:01
rlandy|roverkolla failures :(19:29
*** myoung|lunch is now known as myoung19:34
myoungrlandy|rover: cool...19:35
*** holser__ has joined #oooq19:36
rlandy|roverReadTimeoutError(self._pool, None, 'Read timed out.')", "ReadTimeoutError: UnixHTTPConnectionPool(host='localhost', port=None): Read timed out.",19:38
rlandy|roveroh dear19:38
rlandy|rovermyoung: welcome back19:38
rlandy|rovermore fun on rdocloud19:38
rlandy|rovermaybe you want to host image builds as well19:39
myoungoh no19:39
myoungis this from periodics again?19:39
myoungpushing built containers --> rdo registry?19:39
rlandy|rovermyoung: yep - master19:39
rlandy|roveryou seen it?19:39
rlandy|roverhttps://review.rdoproject.org/zuul/19:39
myoungrlandy|rover:  https://bugs.launchpad.net/tripleo/+bug/177163419:41
openstackLaunchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [High,Confirmed]19:41
myoungyes this one was permafailing last sprint19:42
myoungrlandy|rover: related (but different issue) https://bugs.launchpad.net/tripleo/+bug/177146919:42
openstackLaunchpad bug 1771469 in tripleo "RFE: (dlrnapi-promoter) Better handle Error removing image docker.io/tripleoqueens/centos-binary-nova-placement-api:current-tripleo - UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=60)" [High,Triaged]19:42
myoungrlandy|rover: CIX card from last sprint, https://trello.com/c/6t0etyUO/593-cixlp1771634tripleociproa-periodic-container-build-jobs-are-failing-when-pushing-to-rdo-registry-500-504-read-timeout, since wasn't "failing" any more was moved to a "done" state.  I think we should move it back to critical failing jobs, and update https://bugs.launchpad.net/tripleo/+bug/1771634 with the latest failures19:43
openstackLaunchpad bug 1771634 in tripleo "periodic: container build jobs are failing when pushing to rdo registry (500, 504, read timeout)" [High,Confirmed]19:43
myoungarxcruz|ruck: ^^19:43
*** tosky has joined #oooq19:44
*** holser__ has quit IRC20:03
rlandy|roveroh yeah - I remember that bug20:07
rlandy|rovermyoung:ok - updating bug20:08
myoungrlandy|rover: i moved the CIX card back to failing jobs, and updated bug with link to latest failure kolla log20:18
*** apetrich has quit IRC20:23
*** apetrich has joined #oooq20:23
hubbotFAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/56044520:25
rlandy|roveron the last slave update :)20:58
*** trown is now known as trown|outtypewww21:00
*** sshnaidm_pto has quit IRC21:01
EmilienMrlandy|rover: in the environment you gave to me, the undercloud isn't containeirzed21:39
EmilienMhttps://review.openstack.org/#/c/56691621:39
rlandy|roverI'll check the reproducer I used21:39
rlandy|rover: ${ZUUL_CHANGES:="openstack/python-tripleoclient:master:refs/changes/64/570864/2^openstack/tripleo-quickstart:master:refs/changes/16/566916/7"}21:40
rlandy|rover: ${TOCI_JOBTYPE:="ovb-3ctlr_1comp-featureset035"}21:40
rlandy|roverdid I miss something?21:40
EmilienMmhh21:40
rlandy|roverexport EXTRA_VARS="\$EXTRA_VARS --extra-vars dlrn_hash_tag=07f09e500b0f28ecddabf5d8ac808a0a9b399ae3_28943010 "21:41
EmilienMit sounds like quickstart was cloned from master21:41
rlandy|roveroh  that21:42
EmilienM:D21:42
*** links has joined #oooq21:42
rlandy|roverI have a review somewhere to address that problem21:42
EmilienM:-O21:42
rlandy|roverquickstart itself on the machine is not changed21:43
rlandy|roverit should be on the undercloud though21:43
rlandy|roverI'll check21:43
EmilienMrlandy|rover: this? https://review.openstack.org/#/c/564589/21:44
rlandy|roveryep21:44
rlandy|roverbut that come into effect if you need to change the running script21:44
EmilienMI just want to test a change in the featureset21:45
rlandy|roverright ok - you changed the configset itself21:46
rlandy|roverEmilienM: sorry - I'll rerun it with manual edits21:46
EmilienMcan we do that? nice21:46
EmilienMif quickstart doesn't complain, I'm fine with it now21:47
rlandy|rovernothing fancy21:47
rlandy|roveredit reproducer to git review -d21:47
rlandy|roverafter it clones quickstart21:47
rlandy|roverok - I'm taking down the env21:48
EmilienMok21:48
rlandy|roverEmilienM: ok - let's try this again: ssh root@38.145.34.3622:00
rlandy|roveryour key is on the root user22:00
rlandy|roveras zuul is in setup atm22:00
rlandy|roverok - your key is on the zuul user as well now22:01
rlandy|roverarxcruz|ruck: how long should tempest take to run on pike? https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/periodic-pike-rdo_trunk-featureset020-1ctlr_1comp_64gb/45/console - been running for hours22:02
*** tosky has quit IRC22:03
rlandy|rovermyoung: the slaves are back and updated on phase 222:04
myoungrlandy|rover: the virt tempest fs020 job usually takes ~ 5 hours total22:06
myoungrlandy|rover: re virthosts --> rhel 7.5 \o/22:06
rlandy|roverreally22:07
rlandy|roverthat's way long22:07
rlandy|roverhoping it passes and we can promote pike22:07
rlandy|roverone less yellow box to worry about22:08
myoungaye22:13
myoungrlandy|rover: looks like #45 finished tempest with a pass, collecting logs now22:14
myoung@4:40 it's on track for a sub 5 hr runtime :)22:14
*** links has quit IRC22:16
rlandy|roverthat's insane22:19
rlandy|roverbut hopeful we will promote22:20
hubbotFAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/56044522:25
myoungrlandy|rover: we've passed for the job status so we should.  I'll watch the actual promotion into this evening, should take an hourish22:26
rlandy|rovercool - thanks22:27
myoungrlandy|rover: it's already in flight 2018-05-31 18:36:13,294 32625 INFO     promoter Promoting the container images for dlrn hash d5ff1f4b7aaeacd78e4ce1254c9428103893c137 on pike to current-tripleo-rdo-internal22:43
*** d0ugal__ has quit IRC22:48
*** d0ugal__ has joined #oooq23:03
*** rlandy|rover is now known as rlandy|rover|bbl23:09
rlandy|rover|bblcool23:09

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!