Monday, 2018-07-30

hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @  (1 more message)00:39
*** pliu_ has joined #oooq02:05
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-3nodes-multinode @ https://review.openstack.org/567224, stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates- (1 more message)02:39
*** skramaja has joined #oooq03:23
*** gkadam has joined #oooq03:30
*** yolanda_ has joined #oooq03:38
*** links has joined #oooq03:39
*** udesale has joined #oooq03:40
*** yolanda has quit IRC03:41
*** links has quit IRC04:15
*** chkumar|trekk is now known as chandankumar04:30
*** jaganathan has joined #oooq04:30
*** links has joined #oooq04:33
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-3nodes-multinode @ https://review.openstack.org/567224, stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, tripleo-ci-centos-7-scenario008-multinode-oooq-container, legacy- (1 more message)04:39
*** ykarel| has joined #oooq05:20
*** pliu_ has quit IRC05:22
*** ykarel| is now known as ykarel05:35
*** holser_ has joined #oooq05:50
*** jaosorior has joined #oooq05:51
*** ratailor has joined #oooq06:14
chandankumarpanda|rover|off: rfolco|off container builds are failing on horizon https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-centos-7-master-containers-build/f36b526/logs/kolla/logs/000_FAILED_horizon.log06:14
*** jtomasek has joined #oooq06:18
*** quiquell has joined #oooq06:19
*** jfrancoa has joined #oooq06:19
*** brault has quit IRC06:19
quiquellsshnaidm: https://review.rdoproject.org/r/#/c/15041/ rebased06:26
*** agopi has quit IRC06:35
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-3nodes-multinode @ https://review.openstack.org/567224, stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, tripleo-ci-centos-7-scenario008-multinode-oooq-container, legacy- (1 more message)06:39
*** jfrancoa has quit IRC06:43
*** jfrancoa has joined #oooq06:43
*** ccamacho has joined #oooq06:44
*** holser_ has quit IRC06:44
quiquellpanda|rover|off, rfolco|off: I have opened a bug for a new scenario001 issue that is blocking the CI fix06:53
quiquellpanda|rover|off, rfolco|off: https://bugs.launchpad.net/tripleo/+bug/178430706:53
openstackLaunchpad bug 1784307 in tripleo "tripleomaster/centos-binary-collectd:current-tripleo-updated-20180730001257 \"kolla_start\" Restarting" [Undecided,New] - Assigned to Rafael Folco (rafaelfolco)06:53
*** zoli is now known as zoli|wfh06:57
*** zoli|wfh is now known as zoli06:57
*** brault has joined #oooq06:59
*** tesseract has joined #oooq07:04
*** amoralej|off is now known as amoralej07:08
*** yolanda_ is now known as yolanda07:21
*** florianf has joined #oooq07:23
*** dalvarez has joined #oooq07:25
*** jtomasek_ has joined #oooq07:34
*** jtomasek has quit IRC07:34
*** ykarel is now known as ykarel|lunch07:38
*** amoralej_ has joined #oooq07:47
chandankumararxcruz: Hello07:55
chandankumararxcruz: Can I add you as a QE for this card https://trello.com/c/MX0ZJVtI/889-update-python-stestr-to-210 ?07:55
arxcruzchandankumar: hi, done07:56
chandankumararxcruz: thanks :-)07:56
*** bogdando has joined #oooq08:03
*** ykarel|lunch is now known as ykarel08:22
*** skramaja_ has joined #oooq08:24
*** skramaja is now known as Guest7720508:25
*** skramaja_ is now known as skramaja08:25
*** holser_ has joined #oooq08:29
*** d0ugal has joined #oooq08:31
*** d0ugal has quit IRC08:31
*** d0ugal has joined #oooq08:31
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, tripleo-ci-centos-7-scenario008-multinode-oooq-container, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @  (1 more message)08:39
*** skramaja_ has joined #oooq08:40
*** skramaja has quit IRC08:41
*** skramaja_ is now known as skramaja08:41
*** skramaja_ has joined #oooq08:46
*** skramaja is now known as Guest2239308:46
*** skramaja_ is now known as skramaja08:46
*** Guest22393 has quit IRC08:46
*** sshnaidm has quit IRC09:07
*** sshnaidm has joined #oooq09:09
*** sshnaidm is now known as sshnaidm|afk09:12
*** jfrancoa has quit IRC09:13
*** jfrancoa has joined #oooq09:15
*** quiquell is now known as quiquell|mtg09:15
*** dtantsur|afk is now known as dtantsur09:35
*** quiquell|mtg is now known as quiquell09:37
*** ratailor has quit IRC09:44
*** sshnaidm|afk has quit IRC09:47
quiquellykarel, arxcruz, marios, ssbarnea: Did you guys have this in the reproducer ? Inappropriate ioctl for device09:49
arxcruznopnope09:49
arxcruznope*09:50
ssbarneaI've seen this in the past, send me the link.09:50
quiquelldamn...09:50
quiquellssbarnea: It's at my little fedora09:50
ssbarneaI think it is an ansible bug, if I remember well, but the last update should have fixed it.09:50
quiquellssbarnea: Maybe downgrade ansible will help ?09:50
ssbarneaif you redirect ansible's stdout, the pause module was choking with something like this, but they fixed it.09:50
quiquellssbarnea: I am using |tee09:51
quiquellssbarnea: Maybe is that ?09:51
ssbarneai think it was fixed in both ansible branches 2.6/2.5 and released few days ago.09:51
ssbarneaalmost sure, not your fault. let me send a link to the bug09:52
quiquellssbarnea: Ok going to update09:52
quiquellssbarnea: It's not a bug, it's just an execution on my laptop09:52
ssbarneaor better first update ansible and tell me.09:52
quiquellssbarnea: Was going to, let me try09:52
quiquellssbarnea: Thanks man !09:52
mariosquiquell: not something i've seen yet09:53
ykarelquiquell, i also not aware of it09:54
quiquellok09:54
*** ratailor has joined #oooq10:00
quiquellPuff Failed to retrieve repo file from https://trunk.rdoproject.org/centos7-master/current-tripleo/delorean.repo after 10 retries10:05
quiquellis appearing again10:05
*** brault has quit IRC10:10
quiquellssh10:22
*** jaosorior has quit IRC10:26
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, tripleo-ci-centos-7-scenario008-multinode-oooq-container, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @  (1 more message)10:39
quiquellssbarnea: Working!! thanks man, ansible 2.5.7 works10:45
*** udesale has quit IRC10:47
*** panda|rover|off is now known as panda|rover10:50
ssbarneaquiquell: glad to hear that. it seems that being subscribed to all ansible release notes pays off.10:53
quiquellssbarnea: Sure it pays off10:54
ssbarnearegarding ansible versions and fixes/bugs. What if we would have a non-voting gate that uses next-ansible version (one that we do not yet support). Having this would clearly help upgrading code.10:55
*** ratailor has quit IRC10:56
ssbarneathere is something I learned regarding the ansible team: it is like 5-10x faster to have a bug fixed before their release than after. they do care a lot about avoiding regressions, but they lack extensive testing. but if we catch bugs early, we would prevent ansible from releasing breaking changes.10:57
ssbarneaquiquell: read and review https://review.openstack.org/#/c/583965/ -- that is why I knew about the bug.11:01
panda|roversounds familiar11:01
quiquellssbarnea: Damn... Have to do more reviewing :-/11:03
ssbarneawhile ansible fixed it, the fix is still valid, working with all versions and cutting one task.11:03
*** sshnaidm|afk has joined #oooq11:05
ssbarneapanda|rover: quiquell : another one that should be very easy to review: https://review.openstack.org/#/c/582599/11:05
panda|roverssbarnea: we're on fire at the moment, you may get delay on any reviews11:06
quiquellssbarnea: Yep, sorry about delays man, we will get into it later on11:06
ssbarneasure. and again let me know if I can help with something. don't forget about me.11:07
ssbarneai am still following the RHU courses on OSP,... boring but needed.11:09
quiquellpanda|rover: My first PTO was 13th not 15th so I am going to be 2 days less in the sprint11:13
panda|roverquiquell: fantastic11:16
quiquellpanda|rover: I just merge broken stuff so it's good for the sprint11:17
panda|roverssbarnea: Best talk to weshay about your tasks at the present, when you have a rough idea, come back to me and we can distribute across sprint. I guess you'll shadow ruck and rover for quite a while, follow what's happening in #tripleo right now11:23
ssbarneapanda|rover: ok!11:24
ssbarneaquiquell: i think the same applies to me regarding PTO, i will be away the week August 13-17.11:25
panda|roverssbarnea: I don't count it in sprint until weshay says so.11:26
panda|roveryou*11:26
*** zoli is now known as zoli|doctor11:26
*** jaosorior has joined #oooq11:32
*** quiquell is now known as quiquell|lunch11:33
*** rfolco|off is now known as rfolco|ruck11:36
panda|rovermarios: you're working on https://trello.com/c/lrSUMaqw/885-translate-tripleosh-bootstrap-subnodes-into-a-series-of-tasks-s17 ?11:45
panda|rovermarios: can you modify the description and the definition of done ?11:47
mariospanda|rover: yeah ack i posted something will update the card with that as well11:49
*** panda|rover is now known as panda|lunch11:51
rfolco|ruckpanda|lunch, should I assign this one to you ? https://bugs.launchpad.net/tripleo/+bug/177280711:59
openstackLaunchpad bug 1772807 in tripleo "default containerized undercloud install with local CA fails with "Error org.freedesktop.DBus.Error.TimedOut"" [High,Triaged]11:59
rfolco|ruckpanda|lunch, morning11:59
weshaypanda|lunch, rfolco|ruck greetings guys.. let me know if there is anything I can do for you12:02
panda|lunchthought and prayers12:02
panda|lunchrfolco|ruck: morning12:02
weshayssbarnea, let me get settled in and we'll chat k?12:02
weshaypanda|lunch, let me see if I'm reading things correctly... rdo jobs are very red.. and master is at 9 days..12:04
weshayeverything else seems ok12:04
panda|lunchrfolco|ruck: doesn't seem to be ci or quickstart fault12:04
rfolco|ruckpanda|lunch, k... another question,12:06
panda|lunchweshay: define "everything else". Scenarios are broken, upgrades are missing a tag, and we are currently reporting false positives12:06
rfolco|ruckhttps://ci.centos.org/job/tripleo-quickstart-promote-pike-rdo_trunk-minimal/ --> pike promotion12:06
weshaypanda|lunch, scenario01 / 03?12:06
weshaypanda|lunch, hrm ... ok ping me when you have a sec and lets review where we stand re: ruck / rover before the escalation mtg12:07
weshaypanda|lunch, you have a patch for the upgrade tag issue?12:08
panda|lunchweshay: I can talk now, and yes, we have a patch, but was blocked by the scenario failures12:08
panda|lunchweshay: you will see me eating ...12:08
weshayk..12:08
* weshay gets headset12:08
panda|lunchI think I'll join ci escalation too12:09
weshaypanda|lunch, rfolco|ruck in  my blue12:11
*** quiquell|lunch is now known as quiquell12:11
quiquellweshay: welcome back sir12:11
panda|lunchweshay:  https://review.openstack.org/58552812:11
rfolco|ruckweshay, please paste me link, I f* my browser cache12:12
panda|lunchweshay: https://review.openstack.org/58700612:15
weshayquiquell, panda|lunch nice work thus far.. let's start by getting a bug in the commit message for https://review.openstack.org/#/c/587006/ and start knocking these off one by one12:19
weshaystarting w/ https://review.openstack.org/#/c/587006/12:19
*** ratailor has joined #oooq12:19
*** gkadam has quit IRC12:19
panda|lunchweshay: ok12:19
quiquellweshay: ack12:19
quiquellweshay: Admin question, do I have to put my PTO at Orange HRM and appears in the calendar ?12:20
panda|lunchquiquell: weshay commit message updates12:20
panda|lunchd12:20
quiquellpanda|lunch, weshay: Jiris change is failing now at tripleo-ci-centos-7-containers-multinode12:22
weshayquiquell, yes first put it in the pto cal on google so the organization knows ur out, then in orange12:24
quiquellweshay: I can modify the calendar, I think you need to give me permissions12:24
quiquells/can/can't/12:25
weshayquiquell, hrm.. k.. /me sees the 3 node multinode error12:25
quiquellweshay: 3nodes ?12:25
quiquellrfolco|ruck: is this http://logs.openstack.org/99/586499/1/check/tripleo-ci-centos-7-containers-multinode/34499b8/logs/undercloud/home/zuul/tempest.log.txt.gz#_2018-07-30_11_58_0912:26
quiquellrfolco|ruck: related to this https://bugs.launchpad.net/tripleo/+bug/1784017 ?12:26
weshayquiquell, sorry was looking at Emiliens patch12:26
openstackLaunchpad bug 1784017 in tripleo "TestNetworkBasicOps.test_network_basic_ops failures" [Critical,Triaged]12:26
quiquellweshay: Ahh ok, we have to merge Jiri's one first, it has been like a perfect storm so far :-(12:27
weshayquiquell, hrm.. jiri's patch failed in tempest12:28
quiquellweshay: Bug above feels related :-/12:28
quiquellweshay: Going to add the Depends-On to disable health check, now that has failed, It's like impossible to merge this12:29
rfolco|ruckquiquell, not sure, traceback looks diff12:29
quiquellOk we have the +1w in the revert12:29
quiquellrfolco|ruck: Damn...12:30
*** rlandy has joined #oooq12:31
chandankumarpanda|lunch: weshay I need some help here https://review.openstack.org/#/c/584368/4/roles/validate-tempest/defaults/main.yml@2712:32
rfolco|ruckquiquell, I am not discarding the hypothesis12:32
chandankumarit is giving  [undercloud]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'undercloud_enable_tempest' is undefined"}12:32
*** panda|lunch is now known as panda|rover12:32
quiquellpanda|rover, weshay: We have +1w at revert it's just a matter of time and luck12:33
quiquellpanda|rover: I see there is no scenario001 running at the tht health check revert...12:38
quiquellpanda|rover: If they want to re-revert it, it should run it I think12:38
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-3nodes-multinode @ https://review.openstack.org/567224, stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates- (1 more message)12:39
*** links has quit IRC12:44
rlandyrook: hi12:44
weshayEmilienM, bogdando I may need to understand how we can exercise the containers w/ something like a chaos monkey in the promotion jobs to make sure we catch intermittent issues w/ health checks12:45
weshaybah.. wrong channel12:45
rlandyrook: nice work from you and agopi - https://review.openstack.org/#/c/583717/12:45
rlandy053 passing :)12:45
*** zoli|doctor is now known as zoli12:45
weshaypanda|rover, rfolco|ruck is there a bug on the collectd kolla issue?12:45
*** zoli is now known as zoli|wfh12:46
*** zoli|wfh is now known as zoli12:46
rfolco|ruckyes12:46
rfolco|rucksec12:46
bogdandoweshay: not sure promotion jobs are good fit for destructive tests12:46
rlandyrook: where do I look for the logs for broebeat specific stuff?12:46
bogdandoit should be periodic , IMO12:46
weshaybogdando, ya.. not sure either tbh12:46
rlandybrowbeat12:46
rfolco|ruckweshay,12:46
rfolco|ruckhttps://bugs.launchpad.net/tripleo/+bug/1784307 --> tripleomaster/centos-binary-collectd:current-tripleo-updated-20180730001257 \"kolla_start\" Restarting12:46
openstackLaunchpad bug 1784307 in tripleo "tripleomaster/centos-binary-collectd:current-tripleo-updated-20180730001257 \"kolla_start\" Restarting" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)12:46
rfolco|ruckhttps://bugs.launchpad.net/tripleo/+bug/1784233 --> Kolla fails to build horizon container in periodic master job12:46
openstackLaunchpad bug 1784233 in tripleo "Kolla fails to build horizon container in periodic master job" [Critical,Triaged] - Assigned to Gabriele Cerami (gcerami)12:46
weshayrfolco|ruck, ya.. thanks12:47
bogdandoIt's better to have tires on fire periodically then blocking promotions :D12:47
bogdandowhen*12:47
quiquellbogdando: Make sense to exercise scenario001 at tht with commons from ^common/* ?12:48
quiquellbogdando: The healthcheck is there12:48
quiquells/ commons / changes /12:48
*** agopi has joined #oooq12:49
bogdandoquiquell: was that for my periodic proposal for wes's chaos monkeys?12:49
rlandymarios: still not able to get your patch through? :( we didn't have a good day with the gates on friday12:49
bogdandoI'm not sure I'm following right12:49
rookrlandy: /home/<user>/browbeat/log/12:50
quiquellbogdando: review coming12:50
quiquellbogdando: mean this https://review.openstack.org/58705112:50
*** amoralej is now known as amoralej|lunch12:51
rlandyrook: https://logs.rdoproject.org/17/583717/20/openstack-check/legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053-master/298303d/logs/undercloud/home/zuul/ - so then we are not collecting that  - will modify log collection list12:51
quiquellbogdando: Don't know if the other scenarios are needed too12:51
bogdandoI think they do12:51
rookrlandy: it sends the most relevant information to stdout12:51
* rlandy checks log collection12:51
rookrlandy: do we capture that in the console?12:51
bogdandocommon contains the very DF core12:51
quiquellbogdando: So the review makes sense, was weird for me to not see the scenarios exercised12:52
bogdandoso any change to it worth the maximum possible coverage12:52
bogdandoyupp12:52
rlandyrook: yep ... https://logs.rdoproject.org/17/583717/20/openstack-check/legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053-master/298303d/job-output.txt.gz#_2018-07-28_02_52_03_75951312:52
rlandybut I don't see any specific info on the run browbeat line12:52
rlandyrook: it's important for debugging that we collect *all* browbeat logs/info12:53
rookrlandy: ack12:53
rlandyotherwise debug will be very difficult12:53
rookrlandy: agreed.12:53
rlandyrook: ok - so we are just missing /home/user/browbeat/*?12:53
rlandyanything else?12:53
rlandywhile I am modifying log collection12:54
rookone sec i was reviewing what agopi changed.. /me looks12:54
rlandyrook: going in to scrum meeting - so take your time - will ping you when I get out of meeting12:54
mariosrlandy: ack yeah looks like latest also failed on that pike job seems consistent12:55
rlandymarios: consistent with your job or consistent everywhere?12:55
rookagopi: going to make another minor change12:56
*** links has joined #oooq12:56
mariosrlandy: didn't dig yet so don't know12:57
*** amoralej|lunch is now known as amoralej12:57
panda|roverweshay: https://trello.com/c/2WLBEuEc/589-cixlp1771233tripleociproa-te-broker-errors-in-tripleo-3rd-party-jobs we need to sync on this, there are news.12:58
weshayssbarnea, please join the status mtg13:04
weshayssbarnea, let's sync up after the mtg13:04
*** myoung has joined #oooq13:04
myoungtripleo-ci scrum: o/13:05
rascarlandy, hey hi! I saw your work on the reviews, I've tried to test it but unfortunately I'm still hitting the dbus problem13:05
myoungquiquell, ssbarnea ^^13:05
quiquellmyoung: Damn sorry13:05
quiquellGoing  there13:05
*** sshnaidm|afk is now known as sshnaidm13:06
rlandyrasca: in scrum - will ping you when out13:06
rascarlandy, sure no worries and thanks13:06
agopisure thing rook13:09
rookagopi: patch up13:10
rookhttps://review.openstack.org/#/c/583717/21/ansible/oooq/roles/browbeat-run/tasks/main.yml agopi13:12
agopiunderstood rook want to make sure we can look at how browbeat ran13:13
agopilooks good13:13
agopibut the patch isnt ready yet, i've to make changes and test it as changes made to gen_tripleo_hostfile isnt running in the internal CI13:14
mariospanda|rover: rlandy: weshay there are no logs :/  http://logs.openstack.org/95/583195/12/check/tripleo-ci-centos-7-containers-multinode-pike/b8eca56/logs/13:19
weshaymarios, ya.. saw that13:20
weshayneed to see if the job before also had that happen13:20
rlandywe need also to see if it fails elsewhere13:21
panda|roverwe've seen some other instance of this problem13:23
rookagopi: ok13:24
*** agopi is now known as agopi|brb13:25
panda|roversorry guys I keep disconnecting13:27
*** agopi|brb has quit IRC13:30
*** udesale has joined #oooq13:30
*** udesale has quit IRC13:35
*** skramaja has quit IRC13:36
*** Goneri has joined #oooq13:38
*** agopi has joined #oooq13:54
rookrlandy: added to the commit to get more output13:54
rookthanks for bringing that up13:55
rookhow can I check the status of featureset05313:55
rookie, the queue13:55
*** ratailor has quit IRC13:55
weshayssbarnea, sorry.. this went way over, going to have to sync w/ you tomorrow probably14:02
ssbarneaweshay: sure, i have enough to do.14:03
quiquellpanda|rover, rfolco|ruck: Let me know if you find issues at dashboard-ci.tripleo.org14:08
weshayquiquell, check the newbie doc re: pto cal14:09
quiquellweshay: I have it added, but cannot change it14:09
quiquellweshay: Cannot add my days14:09
weshayquiquell, k.. I think you need to email eliska14:09
weshayshould have that in the doc I think14:09
quiquellweshay: Ahh ok found it at the end of the doc... sorry mate14:10
weshayno prob14:10
rlandyrook: sorry - out of scrum14:11
rlandyrook: to check the status of fs053, https://review.rdoproject.org/zuul/status.html - use search box - add your review number14:11
* rlandy works on review now to fix collect logs14:12
rlandymarios: ^^ pls ping me before you go so I know where to pick up the investigation/job watching. thanks14:13
panda|rovermyoung: want to chat about the isolated ansible tests ? I have a few minutes to spare14:16
*** quiquell has quit IRC14:17
myoungpanda|rover, sure, but i'm in tempest squad sprint 17 planning for an hour14:21
panda|rovermyoung: ok, ping me when you're ready14:24
mariosrlandy: ack still on this call for a bit14:25
mariosrlandy: sorry14:25
rlandyno rush14:27
*** links has quit IRC14:27
panda|rovermarios: rlandy 018-07-30 12:38:51.008910 | primary | ok: 4 changed: 4 unreachable: 0 failed: 0 during post-run. The nodes were unreachable, so we didn't get any log14:29
panda|roverthe job timed out14:30
*** udesale has joined #oooq14:34
rlandyrook: https://review.openstack.org/#/c/583576/10/toci-quickstart/config/collect-logs.yml14:37
*** dtantsur is now known as dtantsur|brb14:38
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-3nodes-multinode @ https://review.openstack.org/567224, stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates- (1 more message)14:39
*** ykarel is now known as ykarel|away14:41
*** bogdando has quit IRC14:45
rookrlandy++14:51
hubbotrook: rlandy's karma is now 1714:51
rookyup14:51
rooki added to the ci script to save stdout to logs14:51
rooklog*14:51
*** ykarel|away has quit IRC14:51
*** vinaykns has joined #oooq14:52
rlandyhttp://logs.openstack.org/95/583195/12/check/tripleo-ci-centos-7-containers-multinode-pike/650309f/job-output.txt.gz#_2018-07-29_19_41_26_94788815:08
rlandymarios: ^^ previous run from the same job15:08
mariosrlandy: ack also here http://logs.openstack.org/95/583195/12/check/tripleo-ci-centos-7-containers-multinode-pike/b8eca56/ara-report/15:09
mariosrlandy: but without any more info why15:09
rlandyhmm ...15:16
mariosrlandy: i have no idea why but it seems like it must be related to this change, or at least, i can see green runs on other reviews like15:18
mariosgreen run @ https://review.openstack.org/#/c/585528/ 29th July https://review.openstack.org/#/c/570892/ today15:18
mariosfor tripleo-ci-centos-7-containers-multinode-pike15:18
*** atoth has quit IRC15:21
*** atoth has joined #oooq15:22
rlandyhttp://38.145.35.97/d/cEEjGFFmz/cockpit?orgId=1&panelId=61&fullscreen&var-launchpad_tags=alert&var-promotion_names=current-tripleo&var-promotion_names=current-tripleo-rdo&var-promotion_names=current-tripleo-rdo-testing&var-releases=master&var-releases=queens&var-releases=pike&var-releases=ocata&var-influxdb_filter=job_name%7C%3D%7Ctripleo-ci-centos-7-containers-multinode-pike15:27
rlandyshows no failing jobs15:27
rlandyweird15:27
rlandyeven if I look back 7 days15:28
rlandysova doesn't track this job15:28
rlandymarios: yep _ I do see green runs in other reviews15:34
rlandynot sure if that is an exact diagnosis though15:34
mariosrlandy: yeah. i was also checking the job definition for clues like https://github.com/openstack-infra/tripleo-ci/blob/833c9d2a814c15011c43f25021bdfde19e308afd/zuul.d/multinode-jobs.yaml#L136 but other job with featureset 010 passes like tripleo-ci-centos-7-containers-multinode15:36
rlandyand we git a fairly consistent timeout here15:36
rlandygot15:36
rlandymarios: I think our best bet is a reproducer to see where the timeout is and why15:37
*** dtantsur|brb is now known as dtantsur15:37
mariosrlandy: ack i will try that tomorrow but we don't even get the reproducer here15:42
mariosrlandy: so not sure where to get it from :) I mean it doesn't make sense to get it from a green job15:43
rlandymarios: we can edit one from another fs15:43
mariosrlandy: ah ok15:43
rlandyor job, right15:43
mariosrlandy:right i can edit the zuul changes etc15:43
rlandymarios: ack - ok, I'll try  a reproducer15:44
*** myoung has quit IRC15:44
*** tesseract has quit IRC15:49
*** zoli is now known as zoli|gone15:57
*** zoli|gone is now known as zoli15:58
mariosrlandy: thanks, am gonna call it a day in a sec. i will have a look tomorrow for any updates and otherwise also try the reproducer16:02
mariosrlandy: thanks for your help16:02
mariosas always16:02
*** jfrancoa has quit IRC16:05
panda|roverrlandy: do you have a patch for collect logs ?16:05
rlandypanda|rover: https://review.openstack.org/#/c/583576/10/toci-quickstart/config/collect-logs.yml16:06
rlandypanda|rover: ^^ part of tripleo-ci patch16:07
panda|roverrlandy: ok, we have another bug for logs collection, the one that causes pike job to not collect any logs16:08
panda|roverrlandy: or any other job that times out16:08
panda|roverrlandy: do you have something for that too ?16:08
*** myoung has joined #oooq16:14
panda|roverrlandy:  https://review.openstack.org/58710316:17
panda|rovermarios: ^ that's why the logs were not showing up16:17
rlandypanda|rover:yep - that's the timeout issue we are looking at16:18
rlandypanda|rover: that doesn't fix the timeout, right? just the log collection if timeout happens?16:19
panda|roverrlandy: yep, just the collection16:19
panda|rovermaybe it's better if I create the bug16:21
rlandyyeah , ok - still need to debug the timeout then16:21
rlandybrb16:21
*** rlandy is now known as rlandy|brb16:21
*** ccamacho has quit IRC16:26
panda|roverrfolco|ruck: FYI https://bugs.launchpad.net/tripleo/+bug/178441716:35
openstackLaunchpad bug 1784417 in tripleo "TripleO-CI workflow fails to collect logs after a time out" [Critical,Fix committed] - Assigned to Gabriele Cerami (gcerami)16:35
*** panda|rover is now known as panda|rover|off16:36
myoungpanda|rover|off: sorry i missed you16:37
panda|rover|offmyoung: no you didn't, wanna chat 5 minutes ?16:38
myoungsure16:38
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-3nodes- (1 more message)16:39
myoungpanda|rover|off: https://bluejeans.com/705085945516:39
*** jfrancoa has joined #oooq16:41
*** udesale has quit IRC16:43
weshayrfolco|ruck, you online?16:47
rfolco|ruckweshay, o/16:47
weshayrfolco|ruck, join my blue for a minute16:47
*** jfrancoa has quit IRC16:48
weshayrfolco|ruck, panda|rover|off this can be closed right? https://bugs.launchpad.net/tripleo/+bug/178354016:49
openstackLaunchpad bug 1783540 in tripleo "RDO cloud is not in operational state" [Critical,Triaged] - Assigned to chandan kumar (chkumar246)16:49
weshayrfolco|ruck,16:49
weshayhubbot check jobs:16:49
hubbotweshay: Error: "check" is not a valid command.16:49
weshayTQE, https://review.openstack.org/#/c/560445, I214272a6f25feb75496e44eb0a16269c6ee4cfe216:49
weshayTHT, https://review.openstack.org/#/c/567224, I0cbf9ffb8552411e4dd891c38702ff8d1f6db5b1, stable/queens16:49
weshayTHT, https://review.openstack.org/#/c/564285, If12c8fe9bd0bea98a4842f279399285344f22246, stable/pike16:49
weshayTHT, https://review.openstack.org/#/c/564291, I4c5bdf00ce8cf7eabf669b248b99cb8443e82fab, stable/ocata16:49
panda|rover|offweshay: rhos-ops confirmed it's working16:55
weshaypanda|rover|off, aye16:55
weshaypanda|rover|off, this scares me16:56
weshayhttp://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=116:56
weshaypanda|rover|off, rfolco|ruck http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&panelId=63&fullscreen16:56
weshaypanda|rover|off, that seems wrong to me16:57
weshaypanda|rover|off, based on what I see from the no-op check jobs16:57
*** rlandy|brb is now known as rlandy16:59
panda|rover|offwhy the dashboard says NODE_FAILURE ?16:59
weshaypanda|rover|off, well.. if the provisioning fails17:00
panda|rover|offoh I see17:00
weshaypanda|rover|off, rfolco|ruck I'm going to run cleanup on the tenant unless17:00
weshayone of you are17:00
weshayrlandy, you ready?17:01
rlandyyep - joining17:01
panda|rover|offweshay: in http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&panelId=63&fullscreen, what does time stand for ? If you look at https://review.openstack.org/#/c/586285/ for example, the last check was in 27th of July, not today17:02
weshayhrm..17:03
weshayI'm looking at recent changes17:04
panda|rover|offI still see some NODE_FAILURE yes, but the time column is really confusing17:05
panda|rover|offNODE_FAILURE anyway means that we're even unable to get the undercloud17:08
panda|rover|offwe have 349 instances in nodepool tenant currently17:09
panda|rover|offI've seen much more ...17:09
panda|rover|offbut yeah we have instances older than 1 day17:10
panda|rover|offbut not many17:10
panda|rover|offweshay: I don't think the cleanup script is going to solve much, there are like 10 instances older than 1 day17:11
panda|rover|offweshay: 16 instances.17:11
rfolco|ruckweshay, panda|rover|off I have a doctor appt. will be back in one hour, asap17:13
panda|rover|offyeah, I'm off for real now too.17:15
weshayk17:19
weshayrfolco|ruck, ping me when you are back, let's chat w/ Ronelle17:23
*** amoralej is now known as amoralej|off17:26
rfolco|ruckweshay, ok17:28
*** vinaykns has quit IRC17:31
*** ykarel|away has joined #oooq17:45
*** jaganathan has quit IRC17:50
weshayinteresting https://review.openstack.org/#/c/586843/117:59
*** vinaykns has joined #oooq18:14
panda|rover|offinteresting indeed https://review.openstack.org/#/c/548005/18:20
weshaypanda|rover|off, do we have a bug on the master container build?18:20
*** myoung is now known as myoung|lunch18:22
panda|rover|offweshay: we have one that fails to build horizon in kolla18:23
weshayI don't even see logs :(18:24
weshayhttps://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-centos-7-master-containers-build/a213a15/18:24
panda|rover|offweshay I think it's in the ruck rover etherpad18:24
weshayhttps://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/?C=M;O=D18:24
panda|rover|offmmhh18:25
weshayrfolco|ruck, panda|rover|off how do we not have a bug opened on this ?18:25
weshaypanda|rover|off, I'm not sure if that job is running18:27
weshaypanda|rover|off, rfolco|ruck guess I'll wait until the next set kicks18:27
*** dtantsur is now known as dtantsur|afk18:31
weshaypanda|rover|off, rfolco|ruck I'll watch it.. no worries18:34
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-3nodes- (1 more message)18:39
*** holser_ has quit IRC18:45
panda|rover|offweshay we had this https://bugs.launchpad.net/tripleo/+bug/1784233 yesterday.18:46
openstackLaunchpad bug 1784233 in tripleo "Kolla fails to build horizon container in periodic master job" [Critical,Triaged] - Assigned to Gabriele Cerami (gcerami)18:46
weshaypanda|rover|off, aye.. saw that18:46
panda|rover|offweshay it was merged 3 hours ago. maybe it's related if now is timing out18:46
weshaypanda|rover|off, /me is just worried about the lack of visibility of the tripleo promotions jobs now18:46
weshaythat we have zuulv318:46
weshayno sova18:46
weshayand it looks like the containers build job is not kicking or logging at all18:47
weshaypanda|rover|off, but I'll wait to see what happens when the next set kicks18:47
weshaypanda|rover|off, the ruck/rover cockpit is also not reporting the status correctly18:47
panda|rover|offI usually keep the window open on the logs page when I'm ruck18:49
rfolco|ruckweshay, back18:49
rfolco|ruckweshay, sagi opened the bug for container build job18:50
weshaypanda++18:50
weshayrfolco|ruck, diff issue18:50
rfolco|ruckweshay, missing logs ? panda|rover|off did18:52
rfolco|ruckoh I see your comment in https://review.rdoproject.org/etherpad/p/ruckrover-sprint1718:53
weshayrfolco|ruck, rlandy let's chat on my blue when you both are avail18:55
weshayre: rdo zuul config18:55
rfolco|ruckweshay, I can chat now18:55
weshayrfolco|ruck, k.. waiting on rlandy18:56
weshayrfolco|ruck, /me looking at https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-centos-7-master-containers-build/18:56
weshaywth?18:56
panda|rover|offmmnh logs on rdo was broken last week too18:56
panda|rover|offfor a while18:57
weshaypanda|rover|off, rfolco|ruck so.. if I don't see a set of those jobs kick by the end of my day.. I'm opening a blocker18:58
panda|rover|offmaybe it's related to the NODE_FAILURE19:00
sshnaidmSeems like there is still problem with logs in rdo19:00
sshnaidmlogs that I opened 1 hour before just disappeared19:00
sshnaidmmaybe somebody is cleaning them now19:00
weshayhrm.. k19:01
* weshay asking in #rdo19:01
rfolco|ruckyep, the bug sshnaidm opened on Sunday had the logs and disappeared looks lile19:01
rfolco|rucklike19:01
weshayrfolco|ruck, where is that bug?19:01
rfolco|ruckhttps://bugs.launchpad.net/tripleo/+bug/178423319:02
openstackLaunchpad bug 1784233 in tripleo "Kolla fails to build horizon container in periodic master job" [Critical,Triaged] - Assigned to Gabriele Cerami (gcerami)19:02
rlandyweshay: rfolco|ruck: here - sorry19:02
agopirlandy: ping19:02
rfolco|rucktrying to find what still remains at https://review.rdoproject.org/zuul/builds.html?pipeline=openstack-periodic19:03
rlandyweshay: rfolco|ruck: logged onto bj19:05
rfolco|ruckok going19:05
weshayrlandy, ok.. joining19:05
weshayrfolco|ruck, my blue19:05
weshayrfolco|ruck, https://bluejeans.com/4113567798/19:07
*** holser_ has joined #oooq19:15
*** holser_ has quit IRC19:29
*** holser_ has joined #oooq19:29
*** ykarel|away has quit IRC19:30
*** myoung|lunch is now known as myoung19:34
*** holser_ has quit IRC19:46
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-3nodes- (1 more message)20:39
rlandyha- reproducer passed for marios' review ?20:44
rlandyagopi: hi - responding to ping above20:45
rlandyrook: agopi: if you want to merge your changes to browbeat I can remove the depends-on jobs20:46
*** holser_ has joined #oooq20:47
sshnaidmrlandy, weshay can you review those please? https://review.openstack.org/#/c/580214/  https://review.openstack.org/#/c/583198/20:49
sshnaidmweshay, can you merge this please? https://review.openstack.org/#/c/576816/20:50
weshaysshnaidm, /me looks20:50
agopihello rlandy, https://review.openstack.org/#/c/583717/ new patch didn't trigger a zuul build20:51
agopihttps://review.rdoproject.org/zuul/builds.html?job_name=legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053-master20:51
agopichecked in status as well, nothing queued up either20:51
agopihttps://review.rdoproject.org/zuul/status.html20:51
rlandyagopi: can I recheck?20:52
agopiyes rlandy20:52
rlandyactually rebse20:52
rlandyrebase20:53
agopiokay done just now rebased20:53
rlandyit triggered now20:53
agopiokay thats weird, it didn't trigger on previous patch :/20:53
rlandyok - let's watch it20:53
*** hamzy_ is now known as hamzy20:59
weshaysshnaidm, rlandy panda|rover|off fyi.. be sure to make sure https://ci.centos.org/job/tripleo-quickstart-gate-master-delorean-quick-basic/ is passing on your reviews :)21:04
weshaythe job is working fairly well now21:04
weshaythat installs the undercloud and does all the prep, everything except the overcloud install and test21:05
rlandyrfolco|ruck: panda|rover|off: do I need to edit both playbooks/tripleo-ci/vars/environment_type/ovb.yaml and playbooks/tripleo-ci/templates/toci_gate_test.sh.j221:06
rlandyto add the browbeat playbook21:06
rfolco|rucklooking21:06
rlandydoes not seem to reference the ansible vars21:06
rlandyrfolco|ruck: ^^21:06
*** agopi is now known as agopi|brb21:07
rfolco|ruckrlandy, well, I think we are in a strange point where things tend to be in the new way but still are in the old way, even in the run-v3 workflow21:09
rlandyrfolco|ruck: I edited both21:09
rfolco|ruckrlandy, IMHO you can do both21:09
rfolco|ruckand then we cleanup old21:09
rlandylet's just not merge any of this21:09
rfolco|ruckrlandy, I'll be happy to review this change21:10
rfolco|ruckcount on me21:11
rfolco|ruckeven if its DNM change, testing only21:11
*** agopi|brb has quit IRC21:12
rlandyrfolco|ruck: the demise begins ... https://review.openstack.org/#/c/587228/21:13
rlandyrook: agopi: just FYI ^^ - we are using the browbeat job to test the new zuulv3 configuration21:15
rlandyyou may see another fs053 test start running21:15
rlandyand we may need to rekick testing21:16
rlandypls be patient with us :)21:16
*** florianf has quit IRC21:21
rlandyrfolco|ruck: panda|rover|off: If I copy over the runv3 playbook, I think I will also need everything in playbooks/tripleo-ci21:27
rlandytemplates, vars, etc.21:28
rlandyall referenced in run v321:28
panda|rover|offyep21:28
rlandyunless it's just the playbook itself21:28
rlandyok - copying the lot over21:28
rlandyminus post and run21:28
rlandyhmmm ... maybe post - will need to modify21:29
*** dsneddon has quit IRC21:29
rlandypanda|rover|off: actually that would mean everything in https://github.com/openstack-infra/tripleo-ci/tree/master/playbooks21:31
rlandyimages and all?21:31
*** dsneddon has joined #oooq21:31
rfolco|ruckrlandy, I know you are still working on it... missing stuff like the featureset etc21:33
rfolco|ruckleft a few comments21:33
rfolco|ruckleaving for today21:33
rfolco|ruckimagine all the people, leaving for today21:33
*** rfolco|ruck is now known as rfolco|off21:34
rlandyrfolco|off: yep, missing lots of stuff, we're just getting started - "and miles to go before I sleep ... and miles to go before I sleep"21:34
rfolco|offwill be back l8er21:35
*** Goneri has quit IRC21:36
weshaysshnaidm, I did not really understand your -1 here https://review.openstack.org/#/c/583547/21:50
weshaythe packaging for ceph is built such that the rpms must be removed and the proper version installed21:51
sshnaidmweshay, well, maybe was better to talk with me before approving..21:51
sshnaidmweshay, we discussed it with marios in irc21:51
weshaysshnaidm, I can cancel my workflow if you want21:52
sshnaidmweshay, all this patch doesn't make sense at all21:52
sshnaidmweshay, it completely doesn't matter what to configure there, because in next minute everything will be rewritten by quickstart21:53
sshnaidmweshay, and this can't be reason for bug of course21:53
weshayI hit ceph issues prior to quickstart.sh running21:53
sshnaidmweshay, if so - not because of that21:53
weshayit happens in pre21:53
sshnaidmweshay, ok, let's see the log21:54
weshayit was local w/ the reproducer21:54
sshnaidmweshay, pre- is not related to tripleo.sh21:54
weshayso I don't have any ci logs21:54
sshnaidmweshay, ok, please add full logs to bug if you reproduce it again21:54
weshayaye21:55
weshaysshnaidm, ok.. review is updated21:55
*** myoung has quit IRC21:59
*** holser_ has quit IRC22:00
hubbotFAILING CHECK JOBS on stable/ocata: legacy-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024-ocata @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, legacy-tripleo-ci-centos-7-container-to-container-upgrades-master, legacy-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-master @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-3nodes- (1 more message)22:39
*** tcw has quit IRC22:42
*** tcw has joined #oooq22:42
*** sshnaidm is now known as sshnaidm|afk23:23
*** vinaykns has quit IRC23:24
