Friday, 2019-04-12

*** hamzy has joined #oooq00:14
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039 @ https://review.openstack.org/602248, master: tripleo-ci-centos-7-ovb- (2 more messages)00:36
*** apetrich has quit IRC01:57
*** aakarsh has joined #oooq02:09
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039 @ https://review.openstack.org/602248, master: tripleo-ci-centos-7-ovb- (2 more messages)02:36
*** rlandy|ruck|bbl is now known as rlandy|ruck02:42
*** rlandy|ruck has quit IRC02:49
*** irclogbot_3 has quit IRC03:01
*** irclogbot_3 has joined #oooq03:01
*** ykarel has joined #oooq03:24
*** raukadah is now known as chandankumar03:25
ykarelweshay, did the issue related to buildah get cleared?03:43
ykarellet me know if i can check something03:43
weshayykarel I don't think it's triggered, haven't really looked yet.. was planning on doing it in the morning03:44
ykarelweshay, i see u fired test project with those repos03:45
ykareli can see jobs here:- https://trunk-primary.rdoproject.org/api-centos-stein/api/civotes_detail.html?commit_hash=31a3eed8143f9ab8ecc8bc6123df4c1f5e45f826&distro_hash=5db0bc146d484e09fcc55872c92861109113529d03:45
ykarelboth with and without that patch ^^03:45
ykareland seems same result03:45
weshayyup 2019-04-12 00:02:35.871 14873 ERROR tripleoclient.v1.tripleo_deploy.Deploy [  ] Exception: Not found image: docker://trunk.registry.rdoproject.org/tripleomaster/centos-binary-tempest:586361584d545dd76d055874284fdf70c730a476_4bfa368503:46
weshayykarel aye03:48
weshayand the rpms are on the undercloud03:48
weshayykarel I'll email folks w/ you on it03:49
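
The error above boils down to a tag the registry does not serve. For reference, a minimal sketch of that check (assuming the registry exposes the standard Docker Registry v2 API and allows anonymous reads; the registry, image and tag values are copied from the error message, and requests is the only dependency):

    import requests

    REGISTRY = "https://trunk.registry.rdoproject.org"
    IMAGE = "tripleomaster/centos-binary-tempest"
    TAG = "586361584d545dd76d055874284fdf70c730a476_4bfa3685"

    def tag_exists(registry, image, tag):
        # GET /v2/<name>/manifests/<tag>: 200 means the tag is published,
        # 404 matches the "Not found image" error seen in the deploy log.
        url = "%s/v2/%s/manifests/%s" % (registry, image, tag)
        resp = requests.get(
            url,
            headers={"Accept": "application/vnd.docker.distribution.manifest.v2+json"},
            timeout=30,
        )
        return resp.status_code == 200

    if __name__ == "__main__":
        print("found" if tag_exists(REGISTRY, IMAGE, TAG) else "not found")
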
*** ykarel_ has joined #oooq03:51
*** ykarel has quit IRC03:53
*** ykarel has joined #oooq03:54
ykarelweshay, /me got disconnected, network issue03:54
weshayykarel emailed u03:55
ykarelokk just saw, merging the revert and will see the result in some time03:55
ykarelnext periodic run in around 1 hour03:56
weshayykarel it didn't merge03:56
*** ykarel_ has quit IRC03:56
weshayI don't have merge rights03:56
ykarelweshay, i just +W03:56
weshayah there we go03:56
weshay)03:56
weshay:)03:56
ykarelbut it's still confusing that jobs are failing at different places; strange that standalones are passing, need to look more into what the issue is04:02
weshayykarel the tenant was maxed out04:02
weshayykarel fixed that04:02
weshayshould be clean04:03
weshayykarel https://review.openstack.org/#/c/651964/104:03
weshayI merged that04:03
weshaybut will take a few hours to land04:03
ykarelokk04:03
weshayworkflowed rather04:03
ykarelthat should not hurt the jobs i think04:03
weshayykarel it could.. every other podman/buidah update has :)04:04
ykarelweshay, hmm possible, but for some time now the updates have not been as bad as they used to be04:04
weshayya.. but fool me once, twice .. you know the drill04:05
weshayykarel /me going to bed04:05
ykarelbut will get a clearer picture with the next run04:05
weshaysee you later man.. always thanks for the help!04:05
ykarelweshay, ack good night04:05
*** udesale has joined #oooq04:13
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb- (2 more messages)04:36
*** marios has joined #oooq05:15
*** ykarel has quit IRC05:16
*** ykarel has joined #oooq05:29
*** ykarel_ has joined #oooq05:34
*** ykarel has quit IRC05:36
*** jtomasek has joined #oooq05:39
*** ykarel__ has joined #oooq05:46
*** ykarel_ has quit IRC05:48
*** quiquell has joined #oooq05:50
*** quiquell is now known as quiquell|rover05:50
*** panda has joined #oooq05:54
*** panda has quit IRC05:55
*** marios has quit IRC05:57
*** marios has joined #oooq05:58
*** jfrancoa has joined #oooq06:06
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb- (3 more messages)06:36
*** quiquell|rover is now known as quique|rover|brb06:44
*** ykarel_ has joined #oooq06:49
*** ykarel__ has quit IRC06:51
*** holser_ has joined #oooq06:59
*** skramaja has joined #oooq06:59
*** hamzy has quit IRC07:05
*** hamzy has joined #oooq07:05
*** aakarsh has quit IRC07:05
*** quique|rover|brb is now known as quiquell|rover07:08
*** panda has joined #oooq07:12
*** ykarel_ is now known as ykarel07:16
*** panda has quit IRC07:16
*** amoralej|off is now known as amoralej07:21
*** panda has joined #oooq07:29
zbrwho can join me on bj for a 15min presentation on using pytest to validate release config files? marios, quiquell|rover arxcruz ?07:39
*** jfrancoa has quit IRC07:40
marioszbr: o/ when?07:40
zbrmarios: 9am BST (in 20min) sounds ok?07:41
marioszbr: k07:41
*** apetrich has joined #oooq07:42
*** tosky has joined #oooq07:43
quiquell|roverzbr: CI promotion pipeline is not happy07:44
*** jpena|off is now known as jpena07:47
zbrquiquell|rover: anything i can help with? i see that gate-check failed due to resources. probably the stack cleanup did not work?07:49
quiquell|roverI am into it07:49
*** ykarel is now known as ykarel|lunch07:52
*** ccamacho has joined #oooq07:53
*** jfrancoa has joined #oooq07:58
zbrhttps://bluejeans.com/265541792808:01
*** dtantsur|afk is now known as dtantsur08:01
quiquell|roverchandankumar: do you have some brain cycles for me ?08:02
chandankumarquiquell|rover: yes08:02
quiquell|roverI am super bad at discovering what tempest is trying to do08:02
chandankumarquiquell|rover: where?08:02
quiquell|roverchandankumar: http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-standalone-master/73a08de/logs/tempest.html.gz08:02
quiquell|roverFailed to establish authenticated ssh connection to cirros@192.168.24.113 (Error reading SSH protocol banner). Number attempts: 10. Retry after 11 seconds.08:02
chandankumarquiquell|rover: checking08:03
zbrhttps://review.openstack.org/#/c/649965/08:03
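
The session is about using pytest to validate release config files. A minimal sketch of what such a validation can look like (the directory layout and required keys below are illustrative placeholders; the real checks are the ones proposed in review 649965):

    import glob
    import os

    import pytest
    import yaml

    # Hypothetical location and schema of the release config files.
    RELEASE_CONFIG_GLOB = os.path.join("config", "release", "*.yml")
    REQUIRED_KEYS = {"release", "distro_name", "dlrn_hash_tag"}

    @pytest.mark.parametrize("path", sorted(glob.glob(RELEASE_CONFIG_GLOB)))
    def test_release_config_is_valid(path):
        # Each file must be parseable YAML and carry the expected keys.
        with open(path) as handle:
            data = yaml.safe_load(handle)
        assert isinstance(data, dict), "%s did not parse to a mapping" % path
        missing = REQUIRED_KEYS - set(data)
        assert not missing, "%s is missing keys: %s" % (path, sorted(missing))
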
quiquell|roverchandankumar: this is a cirros instance that tempest is trying to startup ?08:04
quiquell|roverchandankumar: and then try to access it ?08:04
chandankumarquiquell|rover: https://github.com/openstack/tempest/blob/master/tempest/scenario/test_minimum_basic.py#L14508:06
chandankumarquiquell|rover: before rebooting it is trying to ssh into instance08:06
quiquell|roverso it adds the security group to open port 22 there08:07
quiquell|roverand tries to access it08:07
quiquell|roverthis is an instance it has started up on the overcloud installed by the standalone08:07
quiquell|roverthis is it ?08:07
quiquell|roverand also in the log there is the console log of the instance08:09
quiquell|roverok08:09
quiquell|roverthanks !08:09
chandankumarquiquell|rover: yes it adds the security group08:10
chandankumarin this test: create a glance image, create a keypair, create a server, create a volume, attach it, then detach it, then add a floating ip to it08:11
chandankumarand retrieve it, then it tries to ssh into it08:11
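
For orientation, the scenario chandankumar describes maps to a handful of API calls. A rough equivalent using the openstacksdk cloud layer rather than tempest itself (cloud, image and flavor names are placeholders for whatever the standalone deployment provides):

    import openstack

    conn = openstack.connect(cloud="standalone")        # assumed clouds.yaml entry

    keypair = conn.create_keypair("scenario-key")
    server = conn.create_server(
        name="scenario-server",
        image="cirros",                                 # assumed image name
        flavor="m1.tiny",                               # assumed flavor name
        key_name=keypair.name,
        wait=True,
    )
    volume = conn.create_volume(size=1, wait=True)
    conn.attach_volume(server, volume, wait=True)
    conn.detach_volume(server, volume, wait=True)
    conn.add_auto_ip(server, wait=True)                 # the floating ip step
    server = conn.get_server(server.id)
    print("would now ssh to", server.public_v4)         # the step failing above
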
*** amoralej is now known as amoralej|mtg08:13
quiquell|rovertest_minimum_basic_scenario08:15
quiquell|roverI don't see this test in a passing run?08:15
quiquell|roverHave we activated it now?08:15
quiquell|roverAhh no, nothing08:16
quiquell|roverIt is there08:16
chandankumarquiquell|rover: https://github.com/openstack/tripleo-quickstart/blob/master/config/general_config/featureset052.yml#L36 it is running on standalone it should be running08:20
chandankumarI am trying to grep in log why ssh is timing out08:20
quiquell|roverchandankumar: looks like Error reading SSH protocol banner happens when you can make an ssh connection08:20
quiquell|roverchandankumar: but ssh server is not responding08:20
quiquell|roverchandankumar: do we have logs of those instances ?08:21
quiquell|roverchandankumar: at the end it's the ssh logs of the instance created in the overcloud08:21
chandankumarfailed to get http://169.254.169.254/2009-04-04/user-data08:22
chandankumarwarning: no ec2 metadata for user-data08:22
chandankumarfound datasource (ec2, net)08:22
chandankumarTop of dropbear init script08:22
chandankumarStarting dropbear sshd: WARN: generating key of type ecdsa failed!08:22
chandankumarI am not sure it is related08:22
quiquell|roverchandankumar: nope, it's the same on a successful one08:22
quiquell|roverchandankumar: there is nothing special at the instance console :-/08:22
quiquell|roverchandankumar: na, don't worry I am going to recheck the job08:24
*** dsneddon has quit IRC08:26
chandankumarquiquell|rover: one more thing, yesterday we ported jobs to os_tempest https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/standalone-jobs.yaml#L608:26
chandankumarit is still running validate-tempest role08:26
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb- (3 more messages)08:36
mariosthanks zbr08:37
marioswe08:37
mariosr08:37
mariossoren08:37
zbrthanks, for the patience08:38
*** derekh has joined #oooq08:39
zbrmarios: panda : re just enabling the html report, that one is already ready for tripleo-ci : https://review.openstack.org/#/c/651910/1 -- you can click the openstack-tox-py* links to check it.08:41
quiquell|rovermarios, panda, zbr, chandankumar: to make f28 standalone voting https://review.openstack.org/#/c/651230/08:43
quiquell|roverIt's passing now08:43
quiquell|roverhttp://zuul.openstack.org/builds?job_name=tripleo-ci-fedora-28-standalone08:43
quiquell|roveralso uc upgrade https://review.openstack.org/#/c/65121808:44
chandankumarsshnaidm|off: https://review.openstack.org/#/c/639324/ this might run tls tests08:46
mariosquiquell|rover: ack needs to go gate too?08:47
mariosquiquell|rover: (commented)08:47
quiquell|rovereverything that is voting has to be in the gates too I think08:48
quiquell|roverSince the time window to go to gate can break the jobs too08:48
mariosquiquell|rover: right thats what i'm saying :)08:49
quiquell|roverfixing08:50
quiquell|rovermarios: https://review.openstack.org/65123008:54
quiquell|roverzbr: can you revote ? https://review.openstack.org/65123008:55
mariosquiquell|rover: voted. btw are you using a config file to add reviewers? maybe consider removing matt young ... poor matt is still being spammed by you08:56
quiquell|roverI have matt there ?08:58
quiquell|roverI was also the gmail :-/08:58
quiquell|roverpoor man08:58
*** dtantsur is now known as dtantsur|brb08:59
*** dsneddon has joined #oooq09:06
*** ykarel|lunch is now known as ykarel09:21
zbrquiquell|rover: is bringing back the voting job to the people! hurrah! ;)09:27
quiquell|roverzbr: well we are trying it here https://bugs.launchpad.net/tripleo/+bug/182390109:29
openstackLaunchpad bug 1823901 in tripleo "move non-voting jobs back to voting where possible" [High,Triaged]09:29
ykarelquiquell|rover, hi09:32
ykarelquiquell|rover, good to rerun failed ovb master jobs in testproject09:32
ykarelhttps://trunk-primary.rdoproject.org/api-centos-master-uc/api/civotes_detail.html?commit_hash=38f3f4c50df0f4c4f45e956744e4fc2605092306&distro_hash=6fe222ac6797ca0469c45831310cafa8c06f4a0609:33
ykarelovb jobs failed with introspection, 1 standalone failed at tempest SSH09:33
quiquell|roverykarel: i am doing it already09:34
quiquell|roverykarel: the ones that are not affected by the metadata issue https://review.rdoproject.org/r/#/c/20166/09:34
ykarelquiquell|rover, but you missed some jobs in ^^09:34
ykarelthere are multiple failings09:35
chandankumararxcruz: except plain standalone all standalone periodic scenario jobs are using os_tempest09:35
quiquell|roverykarel: they are related to metadata09:35
chandankumarI am updating all in us story itself09:35
quiquell|roverthe not known ones are those09:35
ykarelquiquell|rover, not known?09:35
ykarelu means socket address bound09:35
quiquell|roverbmc socket issue and tempest ssh issue are new09:36
quiquell|roverother failures are known09:36
quiquell|roverjust those two to verify that is not transitory09:36
*** dsneddon has quit IRC09:37
ykarelquiquell|rover, those ^^ are also seen earlier09:38
ykareli think there must be a bug for the socket one at least09:38
ykareliirc that happens when there are some stacks undeleted09:38
quiquell|roverykarel: the one at centos-7-standalone is known ?09:45
ykarelquiquell|rover, seems transient and happening from time to time09:46
ykarelnot sure if there is bug already09:46
ykarelbut chandankumar may know if any09:46
chandankumarykarel: seeing first time09:47
chandankumarykarel: socket error09:47
ykarelchandankumar, nope09:47
ykareltempest one09:47
ykarelchandankumar, https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-standalone-master/73a08de/logs/tempest.html.gz09:47
chandankumarykarel: yes  SSHException: Error reading SSH protocol banner seen first time09:48
chandankumarwaiting for next run09:48
quiquell|rovercould it be that the images are so small that the load is too much?09:48
quiquell|roverand sshd does not respond ?09:48
ykarelack09:49
chandankumarquiquell|rover: nope09:50
chandankumarquiquell|rover: the instance is already spun up, it failed at ssh09:51
zbrquiquell|rover: not sure if it applies here but I remember very well that about one year ago we had problems when a new rhel came out: it took longer to boot and ir jobs were failing to connect because sshd took more time to start; even worse, there was a small period of time where the ssh port was open but the sshd daemon was not working. retries were needed (and not just around the "is port open" check). Also cloud load influenced it.09:51
quiquell|roverzbr: it has retries there but yes the banner issue is usually this09:51
quiquell|roverzbr: do you remember the cloud load stuff ?09:52
chandankumarquiquell|rover: is this issue seen multiple times?09:54
zbrquiquell|rover: in my case it was not this "Error reading SSH protocol banner" error, also I see that we retry for ~4 minutes. Not sure what to say if 4min is too much or too little, but it may worth increasing it a little bit.09:54
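
The retry zbr mentions looks roughly like the loop below: the TCP connection succeeds, but if sshd never sends its banner paramiko raises "Error reading SSH protocol banner" as an SSHException and the caller retries. A minimal sketch (host, credentials and intervals are placeholders mirroring the "10 attempts, retry after 11 seconds" in the tempest log, not tempest's actual code):

    import socket
    import time

    import paramiko

    def wait_for_ssh(host, username="cirros", password="cubswin:)",
                     attempts=10, delay=11):
        for attempt in range(1, attempts + 1):
            client = paramiko.SSHClient()
            client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
            try:
                client.connect(host, username=username, password=password,
                               timeout=10, banner_timeout=30,
                               look_for_keys=False, allow_agent=False)
                return client
            except (paramiko.SSHException, socket.error) as err:
                # Covers both the banner error and plain connection failures.
                print("attempt %d/%d failed: %s" % (attempt, attempts, err))
                client.close()
                time.sleep(delay)
        raise RuntimeError("no authenticated ssh connection to %s" % host)
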
chandankumarnot sure cirros-3.5 image is the issue09:54
chandankumarwe are using cirros-3.6 in os_tempest side09:55
quiquell|roverchandankumar: I think is first time09:55
zbrchandankumar: i think that may be the cause: Starting dropbear sshd: WARN: generating key of type ecdsa failed!09:55
quiquell|roverchandankumar: did we update the image ?09:55
chandankumarquiquell|rover: in devstack, we switched to 3.6 long time ago09:55
quiquell|roverzbr: that appears at a passing job too09:55
zbrecdsa is key in recent ssh, this should not fail. also more interesting is the next line after that.09:56
quiquell|roverzbr: let me find the passing job09:56
chandankumarquiquell|rover: zbr https://github.com/openstack/openstack-ansible-os_tempest/commit/81938c8e73a240e0773c812725de4e802f213b5309:57
chandankumarin next run I think it will use os_Tempest09:57
quiquell|roverzbr: is here too http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-standalone-master/1cc6eea/logs/undercloud/home/zuul/tempest/tempest.log.txt.gz09:58
quiquell|roverchandankumar: the testproject has worked10:00
quiquell|roverchandankumar, ykarel: http://logs.rdoproject.org/66/20166/1/check/periodic-tripleo-ci-centos-7-standalone-master/ddd9bad/10:00
chandankumarquiquell|rover: ok10:01
quiquell|roverchandankumar: === cirros: current=0.3.5 uptime=16.84 ===10:01
quiquell|rover  ____               ____  ____10:01
quiquell|rover / __/ __ ____ ____ / __ \/ __/10:01
quiquell|rover/ /__ / // __// __// /_/ /\ \10:01
quiquell|rover\___//_//_/  /_/   \____/___/10:01
quiquell|rover   http://cirros-cloud.net10:01
quiquell|roverold cirros10:02
quiquell|roverthis is not supposed to use os_tempest?10:02
chandankumarquiquell|rover: /me is confused10:02
chandankumarquiquell|rover: all scenario standalone job is running ostempest10:02
quiquell|roverchandankumar: maybe there is something missing at periodic jobs10:02
chandankumarquiquell|rover: but why this change https://review.rdoproject.org/r/#/c/20061/10:02
quiquell|roverat zuul config10:02
chandankumaris not yet picked10:02
chandankumarit is merged 18 hours ago10:03
quiquell|roverweird10:03
quiquell|roverlet me find10:03
quiquell|roveruse_os_tempest: true10:04
quiquell|roveris there10:04
quiquell|roverhttp://logs.rdoproject.org/66/20166/1/check/periodic-tripleo-ci-centos-7-standalone-master/ddd9bad/zuul-info/inventory.yaml10:04
chandankumarsomething wrong is there10:04
quiquell|roverLet check the output of playbooks10:04
quiquell|roverchandankumar: it's using validate-tempest10:05
chandankumarsomething wrong in job definition10:05
quiquell|roverok let's find10:05
quiquell|rovermaybe it's not overriding it10:05
quiquell|rovercan be10:06
quiquell|roverthat we override in featureset or the like10:06
chandankumarquiquell|rover: if we make changes in fs it will break upstream jobs10:07
chandankumarquiquell|rover: https://review.rdoproject.org/r/#/c/20061/4/zuul.d/standalone-jobs.yaml@18 does it cause something?10:08
quiquell|rover2019-04-12 09:45:09.970895 | primary | PLAY [Validate the deployment] *************************************************10:08
quiquell|rover2019-04-12 09:45:09.994961 | primary |10:08
quiquell|rover2019-04-12 09:45:09.995150 | primary | TASK [include_tasks] ***********************************************************10:08
quiquell|rover2019-04-12 09:45:10.006830 | primary | Friday 12 April 2019  09:45:10 +0000 (0:00:00.081)       0:58:03.969 **********10:08
quiquell|rover2019-04-12 09:45:10.034036 | primary | skipping: [undercloud]10:08
quiquell|rover2019-04-12 09:45:10.049693 | primary |10:08
quiquell|rover2019-04-12 09:45:10.049923 | primary | PLAY RECAP *********************************************************************10:08
quiquell|rover2019-04-12 09:45:10.050092 | primary | undercloud                 : ok=106  changed=59   u10:08
quiquell|roverit's skipping os_tempest ?10:08
ykarelquiquell|rover, ack10:08
ykarelpanda, /me converting build-containers to role https://review.openstack.org/65202710:09
ykarelwe need to reuse it in rdoinfo jobs10:09
*** dsneddon has joined #oooq10:09
quiquell|roverchandankumar: I am opening a bug for it10:10
chandankumarquiquell|rover: sure10:10
pandaykarel: you beat me to it by a couple of days. I probably would have needed this for the rdo on rhel job10:12
ykarelpanda, okk let's get this in first so you have an easy transition :)10:13
*** dsneddon has quit IRC10:14
chandankumarquiquell|rover: I am a big idiot, let me show how10:16
quiquell|roverfound it ?10:16
chandankumarquiquell|rover: https://review.rdoproject.org/r/#/c/20168/10:16
pandaykarel: ok, waiting for the results. Looks good, but for example with the similar change for the other playbook, some variables needed to be passed in a weird way. Maybe this doesn't need it.10:16
quiquell|roverchandankumar: https://bugs.launchpad.net/tripleo/+bug/182451210:16
openstackLaunchpad bug 1824512 in tripleo "periodic standalone jobs not using os_tempest" [High,Triaged] - Assigned to Quique Llorente (quiquell)10:16
quiquell|roverAdd the bug10:16
ykarelpanda, ack let's see how it goes then i iterate to fix it10:17
quiquell|roverchandankumar: ahh yep smelt like that10:17
quiquell|roverchandankumar: I am going to test it at my test review10:18
chandankumarquiquell|rover: sure10:20
chandankumarquiquell|rover: it also gave me the idea to add a new var in the os_tempest role to just run smoke tests10:22
quiquell|rovercool10:22
chandankumarinstead of putting crazy regex10:22
quiquell|roversshnaidm|off: Do I see a pass for centos/libvirt here ? https://review.rdoproject.org/r/#/c/20131/10:25
zbrpanda: look at last test (expand)  from http://logs.openstack.org/72/651772/6/check/openstack-tox-py27/510de77/tox/reports.html -- does it look ok to you?10:29
*** bogdando has joined #oooq10:35
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb- (3 more messages)10:36
*** panda is now known as panda|lunch10:42
*** dsneddon has joined #oooq10:42
*** dsneddon has quit IRC10:47
*** amoralej|mtg is now known as amoralej10:49
*** holser_ is now known as holser|lunch11:01
*** udesale has quit IRC11:05
weshayquiquell|rover when you have a minute, let's 1-111:17
*** dsneddon has joined #oooq11:17
quiquell|rovernow is good11:17
quiquell|rovergoing to your room11:17
weshayzbr were you doing anything w/ polling zuul yesterday.. the infra guys were freaked out11:17
weshayzbr I may need to chat w/ you too for a min11:18
zbrweshay: nothing special, but I am one of the few using https://github.com/openstack/coats/blob/master/coats/openstack_gerrit_zuul_status.user.js11:18
weshayI'll let you know11:18
zbrsure. but as a note I didn't do anything special about zuul yesterday.11:19
weshayquiquell|rover http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/6a18a34/logs/bmc-console.log11:21
*** dsneddon has quit IRC11:21
*** jpena is now known as jpena|lunch11:27
*** jaosorior has quit IRC11:33
arxcruzchandankumar: hey, so on https://tree.taiga.io/project/tripleo-ci-board/task/898 i need to do for  oswin-tempest-plugin and  vmware-nsx-tempest-plugin right ?11:34
arxcruzsorry the delay on this11:34
chandankumararxcruz: yes11:35
chandankumararxcruz: first check in openstack/releases repo11:35
chandankumarif tag is there or not11:35
chandankumararxcruz: thanks for taking care of that11:35
chandankumararxcruz: I have emailed the today's agenda feel free to add or remove from that11:36
arxcruzchandankumar:  vmware-nsx-tempest-plugin isn't in releases11:50
weshayzbr hey you around?11:51
zbrweshay: yep, joining now.11:51
weshayarxcruz did you update the lp spec for tempest?11:51
weshaytripleo lp spec11:52
*** dsneddon has joined #oooq11:53
arxcruzweshay: no, doing now11:56
arxcruzweshay: do11:57
arxcruzweshay: done11:57
weshayarxcruz ok.. paste the review . quiquell|rover needs the same11:57
*** dsneddon has quit IRC11:58
weshayzbr go let clark know :)12:07
quiquell|roverweshay, arxcruz: what's that spec ?12:11
quiquell|roverchandankumar: merging https://review.rdoproject.org/r/#/c/20168/ it's using os_tempest alright12:13
*** skramaja has quit IRC12:13
zbrweshay: the API issue from yesterday seems to be unrelated to our gate-checks, likely it is related to cockpit, see http://eavesdrop.openstack.org/irclogs/%23tripleo/%23tripleo.2019-04-11.log.html#t2019-04-11T19:45:3112:15
arxcruzquiquell|rover: add tempest as tag in lp12:15
zbrsee https://softwarefactory-project.io/grafana/d/000000001/zuul-status?orgId=1&from=1554466523170&to=1555071323170&refresh=5s --- does anyone know who could create these spikes in API?12:15
quiquell|roverarxcruz: so how do I do stuff like that ?12:16
quiquell|roverarxcruz: to other tags12:16
*** rlandy has joined #oooq12:16
chandankumararxcruz: checking12:16
*** rlandy is now known as rlandy|ruck12:16
rlandy|ruckquiquell|rover: hello :)12:17
weshayquiquell|rover https://review.openstack.org/#/c/650283/12:17
quiquell|roverrlandy|ruck: o/12:17
quiquell|roverrlandy|ruck: containers are good now after buildah rollback12:17
chandankumararxcruz: vmware-nsx is also not in releases just like novajoin12:17
quiquell|roverrlandy|ruck: there were some unrelated issues12:18
rlandy|ruckweshay: re: the nova jobs bugs ... paul claims it's fixed upstream12:18
rlandy|ruckquiquell|rover: yep - interesting day yesterday12:18
chandankumararxcruz: let's pin this to this commit https://github.com/openstack/vmware-nsx-tempest-plugin/commit/586361584d545dd76d055874284fdf70c730a47612:18
* rlandy|ruck checks ruck/rover etherpad12:18
weshayrlandy|ruck right.. they may have added the patch to sf last night12:19
quiquell|roverrlandy|ruck: so was the issue the new version of buildah or just buildah vs docker?12:19
rlandy|ruckwhat brought zbr into the mix?12:19
weshayquiquell|rover the latter... buildah vs docker12:20
quiquell|roverweshay: and what's this stuff with repodata ?12:20
rlandy|ruckquiquell|rover: could not get containers at all in OVB jobs12:20
rlandy|rucksee the testproject run12:20
rlandy|ruckquiquell|rover: EmilienM had a new version of buildah and podman12:20
rlandy|ruckwe wanted to test those rpms12:21
quiquell|roverrlandy|ruck: ack now I get it12:21
rlandy|ruckso we could do this again if we needed to12:21
rlandy|ruckquiquell|rover: ^^ possibly with a new patch as the old one didn't work12:21
quiquell|roverrlandy|ruck: you mean reactivating buildah pipeline or doing a testproject ?12:22
quiquell|roveror reproducer ?12:22
rlandy|ruckquiquell|rover: testproject first - not another try at this directly in the pipeline12:23
quiquell|roverrlandy|ruck: we will have to exercise push though12:23
rlandy|rucknot sure what to do with this bug: https://bugs.launchpad.net/tripleo/+bug/1824388?12:23
openstackLaunchpad bug 1824388 in tripleo "periodic jobs are failing undercloud install - Not found image" [Critical,Fix released] - Assigned to Ronelle Landy (rlandy)12:23
rlandy|ruckclose it out due to revert?12:23
quiquell|roverrlandy|ruck: I closed it12:23
weshayrlandy|ruck no.. don't close it12:23
quiquell|roverrlandy|ruck: the pipeline is working now12:24
quiquell|roverrlandy|ruck: the other stuff is just continuing with the transition to buildah12:24
rlandy|ruckstill 'In progress'12:24
weshayrlandy|ruck so .. what we need to explain in the CIX card and meeting is that we should be using "buildah" but it's busted atm.. and needs to be fixed12:24
*** dsneddon has joined #oooq12:24
rlandy|ruckoh wrong bug12:24
quiquell|roverI see it as Fix Released12:24
rlandy|ruckquiquell|rover: ^^ yep - was looking at the wrong bug12:24
rlandy|ruckhmm ... don't see the bug in production chain12:26
quiquell|roverrlandy|ruck: me neither12:26
quiquell|roverI am going to re-tag it12:26
rlandy|ruckhttps://trello.com/c/yNNAaSqC/948-cixlp1824317tripleociproa-periodic-containers-build-fail-at-push-unauthorized-authentication-required-n12:26
quiquell|roverto see if it appears12:26
rlandy|ruckquiquell|rover: weshay: ^^ we still have this one in prod chain list12:27
quiquell|roverrlandy|ruck: well thats buildah too12:27
rlandy|ruckworst case, I'll attach the second bug to this card12:27
quiquell|roverrlandy|ruck: it was fixed though12:27
rlandy|ruckit will give us a space to talk about buildah12:27
quiquell|roverrlandy|ruck: let's see if bug appears after re labeling12:27
quiquell|roveryep12:27
rlandy|ruckyeah if the new one shows up, I'll move the old one12:28
rlandy|rucknot a big deal12:28
*** jpena|lunch is now known as jpena12:28
rlandy|ruckweshay: requesting time this afternoon to bring hw back up12:28
quiquell|roverrlandy|ruck: about this https://bugs.launchpad.net/tripleo/+bug/182431512:29
openstackLaunchpad bug 1824315 in tripleo "periodic fedora28 standalone job failing at test_volume_boot_pattern" [Critical,Fix released] - Assigned to Quique Llorente (quiquell)12:29
quiquell|roverrlandy|ruck: I have closed it, it does not appear anymore12:29
*** dsneddon has quit IRC12:29
chandankumarquiquell|rover: is it possible to track specific tests in sova?12:29
rlandy|ruckquiquell|rover: it's fine - don't worry -12:29
weshayrlandy|ruck when you have a moment to breathe, let's sync on openstack-infra heat stacks, and bm hardware12:29
rlandy|ruckjust let's leave the old card there and I'll add  the new bug in that card to talk about it12:29
quiquell|roverchandankumar: it uses regex more or less so maybe yes12:30
* chandankumar needs to check it12:30
quiquell|roverchandankumar: what do you really want to do/have  ?12:31
rlandy|ruckweshay: ack - will ping in a bit after checked on pidone work12:31
chandankumarquiquell|rover: just to track how many times specific critical tempest tests failed12:31
quiquell|roverweshay, Tengu: fix for issue with pki at reproducer https://review.rdoproject.org/r/#/c/20175/12:31
quiquell|rovermergy mergy12:31
quiquell|roverchandankumar: that's interesting12:32
chandankumarquiquell|rover: I am donot want to go through elastic recheck to look for specific tests grep12:32
quiquell|roverwould be nice to search by tempest test instead of job name12:32
quiquell|roverchandankumar: is it a small subset of tests?12:33
chandankumarit will give an idea of how many times they failed across our jobs12:33
chandankumarquiquell|rover: yes, two critical tests12:33
quiquell|roverif they are only two we can add them to the telegraf python script12:33
quiquell|roverso we can search by them at cockpit12:33
quiquell|roverand write a graph12:33
chandankumarquiquell|rover: basically these 3 tests12:34
chandankumarquiquell|rover: https://github.com/openstack/tripleo-quickstart/blob/master/config/general_config/featureset052.yml#L3512:34
Tenguquiquell|rover: «o/12:34
Tengulemme check that.12:34
chandankumarneed to look into stestr_html report12:34
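
The tracking chandankumar wants boils down to scanning the per-run tempest logs for those test ids. A small sketch of something that could feed the telegraf script (the test ids are the two scenarios mentioned earlier in the channel, the URLs would be each run's tempest console log, and the "FAILED" line format is illustrative):

    import requests

    CRITICAL_TESTS = [
        "tempest.scenario.test_minimum_basic.TestMinimumBasicScenario",
        "tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern",
    ]

    def count_failures(log_urls, tests=CRITICAL_TESTS):
        counts = {test: 0 for test in tests}
        for url in log_urls:
            try:
                text = requests.get(url, timeout=30).text
            except requests.RequestException:
                continue                  # unreachable log: skip this run
            for test in tests:
                if any(test in line and "FAILED" in line
                       for line in text.splitlines()):
                    counts[test] += 1     # count at most once per run
        return counts
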
quiquell|roverrlandy|ruck, weshay: to make f28 standalone voting and gating https://review.openstack.org/#/c/651230/12:35
quiquell|roverpanda|lunch: ^12:35
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario012-standalone @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)12:36
rlandy|ruckquiquell|rover: ack - about time12:36
rlandy|ruckhttps://review.openstack.org/#/c/651010/ merged  - thank you12:37
quiquell|roverweshay: workflow f28 standalone voting/gating https://review.openstack.org/#/c/651230/12:44
quiquell|roverTengu: btw since you have a big lab there12:45
Tenguquiquell|rover: yeah?12:45
quiquell|roverTengu: there is a host mode for reproducer12:45
quiquell|roverTengu: to attack directly your big machines12:45
quiquell|roverTengu: only one node though12:45
Tenguoh?12:46
Tenguthat would be nice for standalone12:46
quiquell|rovernodepool_provider: host12:46
quiquell|roverit will launche the job at same machine reproducer is12:46
Tenguquiquell|rover: is there some "real" doc for that reproducer thingy? The readme is a bit light imho.12:46
quiquell|roverTengu: we have only the README and the doc in the script12:47
quiquell|roverTengu: not more info12:47
Tenguhmm12:48
Tenguwould love to get some more though :)12:48
quiquell|roverWe are just eating our own dog food now, but you are an exception12:48
Tenguhehe12:48
Tenguguess it would be great if I wasn't the only one though ;)12:48
quiquell|roverYep step by step12:48
rlandy|ruckcard commented for prod chain12:49
quiquell|roverrlandy|ruck: I see one old stack there, is this important ?12:49
rfolcorlandy|ruck, scen000-updates is failing on stein - http://logs.openstack.org/09/651809/1/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/61bcb01/logs/undercloud/home/zuul/overcloud_update_prepare_containers.log.txt.gz12:49
rlandy|ruckquiquell|rover: sorry - an old stack where?12:50
panda|lunchquiquell|rover: learn from the mistakes of the past https://review.openstack.org/#/c/633087/ , look at my comments.12:50
rlandy|ruckrfolco: looking12:50
quiquell|roverrfolco: https://bugs.launchpad.net/tripleo/+bug/182450012:50
openstackLaunchpad bug 1824500 in tripleo "periodic stein fs037 updates tripleo-ansible-inventory: error: unrecognized arguments: --undercloud-connection" [Critical,Fix committed] - Assigned to Quique Llorente (quiquell)12:50
quiquell|roverrlandy|ruck: ^12:50
rlandy|ruckyep - thought that one was familiar - thanks12:50
quiquell|roverrfolco: TL;DR the tripleo-validations RPM was built with an old commit after creating the stable/stein branch12:51
*** dtantsur|brb is now known as dtantsur12:51
quiquell|rovernp12:51
chandankumararxcruz: thanks for the comment on the tempest docs!12:51
rfolcoquiquell|rover, thanks12:51
quiquell|roverpanda|lunch: ack, so what do we do here ?12:52
quiquell|roverpanda|lunch: also how important is this job now ?12:52
quiquell|roverrfolco: yw12:52
*** holser|lunch is now known as holser_12:53
panda|lunchquiquell|rover: I'm looking at recent builds. I see seven failures in the last 2 days. At least two are legit. One seems to have failed for the missing containers bug. We may have a chance this time, but be ready to justify the choice with data on a stable history in your hands.12:56
quiquell|roverpanda|lunch: thanks will do, yep let's be conservative here12:57
*** dsneddon has joined #oooq12:58
chandankumararxcruz: weshay os_tempest meeting time https://redhat.bluejeans.com/1571313919/6145/13:00
*** altlogbot_0 has joined #oooq13:03
*** dsneddon has quit IRC13:04
rlandy|ruckquiquell|rover: on ruck/rover etherpad ... "https://bugs.launchpad.net/tripleo/+bug/1824256 - introspection issues, is promotion blocker, and looks like ykarel found something."13:06
openstackLaunchpad bug 1824256 in tripleo "Possible network issues in rdo-cloud causing introspection failures" [Critical,In progress] - Assigned to Ronelle Landy (rlandy)13:06
rlandy|ruckwhat did ykarel find?13:06
rlandy|ruckI see your note with the link to a closed bug13:06
rlandy|ruckthere was an issue with the tenant a while back13:06
rlandy|ruckjust checking if there is something new13:06
*** aakarsh has joined #oooq13:07
quiquell|roverrlandy|ruck: nah red herring it was not that13:07
rlandy|ruckk - np13:07
quiquell|roverrlandy|ruck: didn't update the line13:07
quiquell|roverA lot of stuff should be fixed next run13:07
rlandy|ruckgreat13:08
quiquell|roverrlandy|ruck: run reproducer ci at tripleo-ci change https://review.rdoproject.org/r/#/c/20176/13:15
quiquell|roverrlandy|ruck: with toci dry-run13:15
*** aakarsh|2 has joined #oooq13:17
*** aakarsh has quit IRC13:20
*** aakarsh|2 has quit IRC13:22
zbrpanda|lunch: rlandy|ruck : a simple patch that enables html reporting for py* jobs: https://review.openstack.org/#/c/651910/213:23
quiquell|roverzbr: we use pytest in the emit-releases python script too and in the reproducer unit tests13:25
quiquell|roverzbr: maybe it's nice to have html reports there too13:25
zbrquiquell|rover: this is enabling it for emitreleases. i will do it for the other too.13:25
quiquell|roverzbr: take a look at rdo-infra/ci-config, there are some there too13:26
quiquell|roverohhh html is super nice13:26
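
With the pytest-html plugin installed, the report zbr is proposing is just two extra options on the pytest invocation; a tiny sketch of wiring it up programmatically (the test path is a placeholder for whichever repo adopts it):

    import sys

    import pytest

    if __name__ == "__main__":
        sys.exit(pytest.main([
            "tests/",                     # assumed test directory
            "--html=report.html",         # option added by the pytest-html plugin
            "--self-contained-html",      # inline assets so a single file can be archived
        ]))
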
chandankumararxcruz: you are talking about this https://review.openstack.org/#/c/648121/ ?13:28
arxcruzchandankumar: yes13:28
chandankumararxcruz: https://tree.taiga.io/project/tripleo-ci-board/task/917?kanban-status=144727613:29
chandankumararxcruz: it is from sprint bug backlog13:29
chandankumararxcruz: https://tree.taiga.io/project/tripleo-ci-board/us/96013:30
arxcruzchandankumar: no taiga link on the patch...13:30
chandankumararxcruz: we can ask him to add it13:30
*** dsneddon has joined #oooq13:32
*** altlogbot_0 has quit IRC13:32
*** altlogbot_0 has joined #oooq13:33
*** dsneddon has quit IRC13:36
*** quiquell|rover is now known as quiquell|off13:37
*** altlogbot_0 has quit IRC13:38
*** altlogbot_1 has joined #oooq13:38
*** aakarsh has joined #oooq13:40
*** aakarsh has quit IRC13:40
*** aakarsh has joined #oooq13:41
zbrchandankumar arxcruz rlandy|ruck : when can I get 15mins of your time to present the pytest-html reporting to you?13:56
rlandy|ruckzbr: yes14:00
rlandy|ruckpromised to make time for that14:00
*** altlogbot_1 has quit IRC14:00
rlandy|ruckzbr: pls ping me or book some time14:00
zbrrlandy|ruck: let's see if the other two can, it would be easier to bundle it.14:01
rlandy|ruckzbr: k - let me know14:01
rfolcorlandy|ruck, can you please paste the buildah bug so I can refer to it in my upstream patch?14:05
*** dsneddon has joined #oooq14:06
rlandy|ruckweshay: quiquell|off: panda|lunch: marios: review pls ... https://review.openstack.org/#/c/652078/14:11
*** dsneddon has quit IRC14:11
rlandy|ruckrfolco, https://bugs.launchpad.net/tripleo/+bug/182438814:11
openstackLaunchpad bug 1824388 in tripleo "periodic jobs are failing undercloud install - Not found image" [Critical,Fix released] - Assigned to Ronelle Landy (rlandy)14:11
rlandy|ruckweshay: k- ready to chat when you are14:11
weshayrlandy|ruck k.. let me chat w/ you a bit later14:12
weshayarxcruz I'm avail if you still have time before your next14:12
rlandy|ruckk14:12
rfolcorlandy|ruck, thx14:13
weshayzbr do you have a local instance of the cockpit running?14:15
zbrnope. but btw, false alarm about the zuul api, it was not the cockpit. it was me, well my firefox "upgrade".14:16
mariosrlandy|ruck: ack minor comment14:16
mariosrlandy|ruck: can revote if urgent to merge14:17
rlandy|rucknot urgent14:20
rlandy|ruckmarios: ^^ got a workaround for pidone but this is the real fix14:20
*** altlogbot_3 has joined #oooq14:25
*** Vorrtex has joined #oooq14:27
*** altlogbot_3 has quit IRC14:29
*** altlogbot_0 has joined #oooq14:30
weshayzbr that is so weird14:32
zbrweshay: what? bj?14:33
weshay zbr ur firefox14:33
*** altlogbot_0 has quit IRC14:33
*** panda|lunch is now known as panda14:33
zbryeah, i know. but that time coincided with my system reboot which included a firefox update, something quite rare.14:34
zbrprobably some combination of ff plugins made it enter an endless loop that repeated the http call. that's the only thing I can think about.14:35
*** ykarel is now known as ykarel|afk14:35
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario012-standalone @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)14:36
rlandy|ruckand we're back with node failures :(14:40
weshayrlandy|ruck really?14:43
weshaydang14:43
* weshay looks14:43
*** dsneddon has joined #oooq14:45
rlandy|ruckweshay: investigating14:45
rlandy|ruckwe van talk about it later14:45
rlandy|ruckcan14:45
*** altlogbot_0 has joined #oooq14:48
*** dsneddon has quit IRC14:49
*** altlogbot_0 has quit IRC14:51
*** altlogbot_2 has joined #oooq14:52
rfolcorlandy|ruck, I see 3 jobs failing for stable/stein now, scen8, scen9, and stdln-upgr14:53
rfolcofrom these, apparently only scen9 is green for rocky and master, and failing on stein only14:53
*** altlogbot_2 has quit IRC14:55
*** altlogbot_2 has joined #oooq14:56
*** altlogbot_2 has quit IRC14:56
*** Goneri has joined #oooq14:58
*** altlogbot_1 has joined #oooq14:58
rlandy|ruckwill look in a sec14:58
*** ykarel|afk is now known as ykarel15:00
arxcruzweshay: i do have time now15:01
arxcruzdo you ?15:01
arxcruzso, i'm going to the supermarket buy some stuff for apetrich tomorrow, then i'll be back :D15:04
*** dsneddon has joined #oooq15:04
apetrichnow I'm to blame for :)15:04
arxcruzapetrich: of course :P15:10
arxcruzalways have someone to blame :P15:10
apetricharxcruz, just do like everyone else and create a bug in bugzilla "mistral bug: Missed a meeting because I went to the market"15:11
*** tosky has quit IRC15:11
*** tosky has joined #oooq15:12
arxcruzapetrich: cool, i'll do it :D15:13
*** jfrancoa has quit IRC15:19
*** ccamacho has quit IRC15:20
rlandy|ruckweshay: hmmm ...wondering if we shouldn't stagger the periodic pipeline more so stein/master kick off one hour apart - see if that helps node failures15:21
ykarelrlandy|ruck, so current periodic run was started manually?15:34
rlandy|ruckykarel: no - kicked with  four hourly trigger15:34
ykarelokk should be by cron only, ignore15:35
ykarelyes just saw15:35
rlandy|ruckbut we see node failures often at the start15:35
rlandy|ruckand I was wondering if we are not overloading the system15:35
ykarelit seems overloaded15:35
ykarelrlandy|ruck, yes15:35
rlandy|ruckso ... if we started stein/master apart, maybe we could get by15:35
ykarelrlandy|ruck, see https://zabbix.infra.prod.eng.rdu2.redhat.com/zabbix/screens.php?elementid=22415:36
ykareli see vms possible: tripleoci | ci.m1.large     |  3815:36
ykarelfew minutes back it was 2015:36
ykarelso ^^ could be reason for NODE_FAILURE15:36
chandankumarSee ya guys, Have a nice weekend :-)15:36
*** chandankumar is now known as raukadah15:36
rlandy|ruckykarel  - all a guess really but if we could reduce possible causes, we may get somewhere15:38
ykarelrlandy|ruck, possible causes?15:39
ykarelare there some VMS in Error state15:39
zbrrlandy|ruck: ok for bj seession in ~20min, at 1600 UTC?15:40
rlandy|ruckyeah - we are getting more and more of those VMs in error state15:40
rlandy|ruckcleaning up again15:40
rlandy|ruckzbr: ack15:40
ykarelrlandy|ruck, ack15:40
ykarelrlandy|ruck, good to check why they get into ERROR state15:40
rlandy|rucksure15:41
zbrarxcruz: you are welcomed too.15:41
*** bogdando has quit IRC15:44
rlandy|ruckholy cow15:49
rlandy|ruckwe are in error15:49
rlandy|ruckstacks are ok though15:49
rlandy|ruck {u'message': u'No valid host was found. There are not enough hosts available.', u'code': 500, u'created': u'2019-04-12T15:00:38Z'}15:50
rlandy|rucklikely means we ran out of resources15:50
ykarelyes zabbix indicates the same, how many vms are there in ERROR state?15:52
rlandy|ruckno stack though15:56
rlandy|ruckykarel: 27 - I've actually seen worse15:56
rlandy|ruckgetting rid of those15:56
ykarelack15:56
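
The cleanup being done here amounts to finding the tenant's servers stuck in ERROR and deleting them. A rough sketch with openstacksdk (the cloud name is a placeholder for the rdo-cloud tenant's clouds.yaml entry; it defaults to a dry run):

    import openstack

    def cleanup_error_servers(cloud_name="rdo-cloud-tripleo", delete=False):
        conn = openstack.connect(cloud=cloud_name)
        errored = [s for s in conn.list_servers() if s.status == "ERROR"]
        print("%d servers in ERROR state" % len(errored))
        for server in errored:
            print(" -", server.name, server.id)
            if delete:
                conn.delete_server(server.id, wait=True)
        return errored

    if __name__ == "__main__":
        cleanup_error_servers(delete=False)   # report only; pass delete=True to clean up
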
*** derekh has quit IRC15:58
zbrhttps://bluejeans.com/265541792816:00
*** jfrancoa has joined #oooq16:00
*** jfrancoa has quit IRC16:08
*** dsneddon has quit IRC16:10
ykarelrlandy|ruck, is fs039 failure known?16:11
*** dsneddon has joined #oooq16:11
rlandy|ruckykarel: ack - spoke with sshnaidm16:11
*** rlandy|ruck is now known as rlandy|ruck|mtg16:12
*** dtantsur is now known as dtantsur|afk16:12
ykarelrlandy|ruck|mtg, current master promotion is just missing fs03916:12
ykarelmissing successful jobs: [u'periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-master']16:12
rlandy|ruck|mtgmaybe we should remove from criteria and promote16:12
rlandy|ruck|mtgweshay: ^^ ok?16:12
ykarelif the issue is already known and worked upon good to promote16:12
weshayrlandy|ruck|mtg is 39 master voting?16:13
rlandy|ruck|mtgykarel: just in meeting atm - will modify criteria to promote in a  bit16:13
weshayI don't think it is.. but confirm16:13
ykarelit's third party ovb16:13
weshay oh wait16:13
rlandy|ruck|mtgshould not be16:13
weshayit's ovb16:13
weshayeven if it's ovb it should be non-voting if we're going to promote w/ a known failure16:13
* weshay looks at master promote jobs16:14
rlandy|ruck|mtgweshay; on zbr's presentation16:15
weshayrlandy|ruck|mtg 39 failed on 2019-04-12 06:55:12.036609 | primary | TASK [overcloud-prep-images : Prepare the overcloud images for deploy] *********16:16
*** ykarel is now known as ykarel|away16:16
weshaywhich afaik is infra16:16
rlandy|ruck|mtgcorrect16:16
weshayrlandy|ruck|mtg ack to promote16:16
rlandy|ruck|mtghttps://bugs.launchpad.net/tripleo/+bug/182425616:16
openstackLaunchpad bug 1824256 in tripleo "Possible network issues in rdo-cloud causing introspection failures" [Critical,In progress] - Assigned to Ronelle Landy (rlandy)16:16
ykarel|awayweshay, rlandy|ruck|mtg there is some real bug in fs03916:16
*** dsneddon has quit IRC16:16
ykarel|awayupstream jobs are also failing since 10th16:16
ykarel|awaywith same error in overcloud deploy16:16
*** amoralej is now known as amoralej|off16:17
ykarel|awayhttps://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039&branch=master&result=FAILURE16:17
ykarel|awaysee the jobs whose duration is around 7000 seconds16:17
ykarel|awayfailed at overcloud deploy16:17
rlandy|ruck|mtgyep - will confirm afterwards16:18
*** dsneddon has joined #oooq16:18
ykarel|awayokk, pasting here for reference:- [overcloud.ComputeServiceChain.ServiceServerMetadataHook]: CREATE_FAILED  resources.ServiceServerMetadataHook: u'type'16:18
*** dsneddon has quit IRC16:23
*** ykarel|away has quit IRC16:24
weshayrlandy|ruck|mtg periodic-tripleo-ci-centos-7-standalone-master failed in the latest run16:29
weshayrlandy|ruck|mtg /me is finally ready to chat16:29
weshaywhen ever u r16:29
rlandy|ruck|mtgweshay:in zbr's meeting16:29
rlandy|ruck|mtgwill ping afterwards16:29
*** jpena has quit IRC16:29
*** dsneddon has joined #oooq16:32
*** rlandy|ruck|mtg is now known as rlandy|ruck16:33
*** marios has quit IRC16:35
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario012-standalone @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)16:36
rlandy|ruckweshay: k - let's chat16:37
rlandy|ruckwill get to failing tests after that16:38
*** tosky has quit IRC16:38
weshayrlandy|ruck k /me goes16:39
*** holser_ has quit IRC16:52
zbrweshay: rlandy|ruck https://review.openstack.org/#/c/649965/ (release config testing) is ready for final review.16:53
*** ykarel has joined #oooq16:57
zbrhttps://review.openstack.org/#/c/651910/ is enabling html report on tripleo-ci repo.16:57
weshayrlandy|ruck https://etherpad.openstack.org/p/tripleo-train-topics17:10
*** vinaykns has joined #oooq17:14
rlandy|ruckzbr: will take a look in a bit17:45
*** irclogbot_3 has quit IRC18:08
*** irclogbot_3 has joined #oooq18:10
rlandy|ruckstep one prodchain board is sorted18:13
*** tosky has joined #oooq18:18
raukadahweshay: arxcruz https://etherpad.openstack.org/p/osa-train-ptg18:23
rlandy|ruckah - not bad - hit ipmi failure - drac ip update18:30
rlandy|ruckweshay: ^^18:30
rlandy|ruckcan fix that18:30
*** Vorrtex has quit IRC18:36
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario008-multinode-oooq-container, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-scenario012-standalone @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)18:36
raukadahhubbot1: source18:37
hubbot1raukadah: My source is at https://github.com/ProgVal/Limnoria18:37
*** Vorrtex has joined #oooq18:40
raukadahzbr: is the pytest demo recorded?18:48
raukadahsorry i missed it!18:48
zbrraukadah: no but I can do another one quickly, is only 15mins. no charge for private events, yet.18:48
raukadahzbr: bandwidth low, May be I will catch some time next week18:49
zbrraukadah: sure.18:49
raukadahzbr: great, thanks!18:49
zbrwere you talking about irc bots?18:50
raukadahzbr: nope18:50
raukadahhubbot1: I wanted to check hubbot1 source code18:51
hubbot1raukadah: Error: "I" is not a valid command.18:51
raukadah2 more messages looks annoying, may be something hidden there18:52
raukadahso checking the source18:52
zbrraukadah: ok. me just wanted to say that about a month ago I found a very nice bot named notifico (fully opensource but also hosted freely), https://n.tkte.ch/ -- already using it in multiple channels as it has hooks with multiple systems. adding one for gerrit should be very easy.18:54
raukadahzbr: great, adding to my list to check!18:55
*** apetrich has quit IRC19:00
rlandy|ruckweird just lost ... hostname rdoci-hp-01.v100.rdoci.lab.eng.rdu2.redhat.com: Name or service not known19:10
rlandy|ruckweshay: hmm ... it is not master but stein that would promote w/o fs03919:17
weshayah.. /me looks19:17
weshaysaw that was running well19:18
weshayrlandy|ruck you put up the push change19:18
rlandy|ruckweshay: I put up the push change?19:19
weshayrlandy|ruck nothing is triggering on stein except now for python-tripleoclient19:19
weshayshould be fine to promote19:19
weshayrlandy|ruck there should be very little diff between master and stein atm19:19
rlandy|ruckweshay: sorry - I am confused ...19:20
rlandy|ruckhttps://review.rdoproject.org/zuul/status19:20
rlandy|ruckonly failing stein job is fs03919:20
rlandy|ruckbut it's a deployment failure19:20
weshayya.. I see19:20
rlandy|ruckso I didn;t change the promotion criteria19:21
rlandy|rucklogging a bug on fs039 failing - that is all19:21
weshaythat's an odd heat error19:21
rlandy|ruckI know19:21
rlandy|ruckit's not the error ykarel mentioned19:22
rlandy|rucknot the same one in check failures19:22
weshayrlandy|ruck that is probably not specific to 3919:22
rlandy|ruckpossibly19:22
rlandy|ruckI am combing through the check errors to compare19:22
rlandy|ruckalmost looks like heat got interrupted19:23
weshayrlandy|ruck I think it's a real bug19:25
weshayrlandy|ruck http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039-stein/415c446/logs/undercloud/var/log/containers/heat/heat-engine.log.txt.gz19:26
weshay2019-04-12 17:31:32.052 8 INFO heat.engine.resource [req-a799f5c3-5a74-4b9e-9c2e-e36f5cc7dea4 - admin - default default] creating Value "CompactServices" Stack "overcloud-ComputeServiceChain-atwhtineazrs-ServiceServerMetadataHook-4z35ptevylo2" [dd0d8e8b-d0ab-4a08-84bb-d68cff588f4d]19:26
weshay2019-04-12 17:31:32.066 8 ERROR heat.engine.check_resource [req-a799f5c3-5a74-4b9e-9c2e-e36f5cc7dea4 - admin - default default] Unexpected exception in resource check.: KeyError: u'type'19:26
rlandy|ruck"No module named blazarclient". Not using blazar.19:26
rlandy|ruckthat looks familiar - ok bug in progress19:26
weshayrlandy|ruck those are just warnings though19:27
rlandy|ruckweshay: no the error you pointed out19:27
weshayah k19:27
rlandy|ruck<ykarel|away> okk, pasting here for reference:- [overcloud.ComputeServiceChain.ServiceServerMetadataHook]: CREATE_FAILED  resources.ServiceServerMetadataHook: u'type'19:27
rlandy|ruck^^ real problem19:28
weshayrlandy|ruck might be the change I proposed a revert on https://review.openstack.org/#/c/652137/19:30
rlandy|ruckidk - let's see what runs and passes/fails19:31
weshayrlandy|ruck or https://review.openstack.org/#/c/639119/19:32
weshayyou have the bug submitted?19:32
rlandy|ruckweshay: in progress - one minute - then we can log questions/comments19:32
weshaythis is what happens when rdo-cloud is unstable19:39
rlandy|ruckweshay: https://bugs.launchpad.net/tripleo/+bug/182457919:39
openstackLaunchpad bug 1824579 in tripleo "tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039 jobs are failing overcloud deployment with 'KeyError: u'type''" [Undecided,New]19:39
rlandy|ruck^^ can comment more there19:39
ykarel<weshay> rlandy|ruck or https://review.openstack.org/#/c/639119/19:45
ykarel^^ looks the culprit19:45
*** panda has quit IRC19:46
weshayykarel aye.. one of the two in the topic19:46
ykarelhmm19:46
weshaywill try both19:47
ykarelweshay, fs039 failed on ^^ patch on 5th, so ^^ patch affected it19:47
rlandy|ruckonly fails 039 with that big a change?19:49
ykarelthat patch must have missed some tls config19:49
rlandy|ruckreasonable19:49
rlandy|ruckweshay: you are entering a revert on  https://review.openstack.org/#/c/639119/?19:51
rlandy|ruckpls paste relevant reverts in bug so we track results19:52
weshayrlandy|ruck https://review.openstack.org/#/q/topic:containers-common+(status:open+OR+status:merged)19:52
weshayalready in there19:52
ykarelrlandy|ruck, i suspect this line https://review.openstack.org/#/c/639119/16/deployment/ovn/ovn-metadata-container-puppet.yaml@28419:53
ykarelmetadata_settings: {}19:53
rlandy|ruckgot it19:54
*** ykarel is now known as ykarel|away19:54
* ykarel|away leaving19:54
weshayykarel|away have a good weekend19:54
ykarel|awayu too19:55
weshayrlandy|ruck if we get bm back we can test this introspection error  we see in rdo-cloud w/ it19:58
weshayI see you have two jobs running...19:58
weshayhopefully we can get them non-voting in the pipeline while we also figure out the baseos workflow19:58
rlandy|ruckweshay: git two bm jobs running atm20:00
rlandy|ruckgot20:00
rlandy|ruckupdating the instackenv.json on the others20:00
weshayintrospection is running on one now :)20:01
rlandy|ruckyep - here's hoping20:06
weshayintrospection passed20:12
rlandy|ruckyep - and we're deploying20:14
*** ykarel|away has quit IRC20:17
weshayrlandy|ruck so if introspection passed and the deployment is running.. the ips must be mostly sane20:21
rlandy|ruckweshay: I think so, so far - it's just instackenv.json20:21
rlandy|ruckI emailed matt to confirm20:21
rlandy|ruckthat I wasn't hitting old networks20:21
rlandy|ruckthe two hp changes are in - working on dell now20:22
rlandy|ruckdell is not in order like hp - so it's more work - need to check each machine20:23
weshayk20:27
weshaylet me know if I can help20:27
*** dsneddon has quit IRC20:29
rlandy|ruckalmost done  - two more envs to go20:33
*** Vorrtex has quit IRC20:34
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)20:36
*** zbr has quit IRC20:37
weshayrlandy|ruck woot.. tempest is working on libvirt repro :)20:55
weshayhttps://review.openstack.org/#/q/topic:bug/1824243+(status:open+OR+status:merged)20:55
rlandy|rucknice20:56
weshayarxcruz ^ let's chat about cirros 3.5/6 next week20:56
rlandy|ruckwell we have a failed deployment21:09
rlandy|ruckall instackenv.json updated21:09
rlandy|ruckPing to 10.12.150.126 failed21:11
rlandy|ruckexternal address looks wrong21:11
*** jtomasek has quit IRC21:14
rlandy|ruckweshay: negative - we have changed provisioning and external networks21:22
rlandy|ruckok - I see the vlan sheet - will update21:46
*** rlandy|ruck has quit IRC21:49
*** aakarsh has quit IRC22:28
hubbot1FAILING CHECK JOBS on master: tripleo-ci-centos-7-standalone-upgrade, tripleo-ci-centos-7-scenario012-standalone, tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset053, tripleo-ci-centos-7-scenario008-multinode-oooq-container @ https://review.openstack.org/604298, stable/pike: tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039, tripleo-ci-centos-7-ovb- (2 more messages)22:36
*** aakarsh has joined #oooq22:52
*** tosky has quit IRC23:11
*** aakarsh has quit IRC23:40
*** aakarsh has joined #oooq23:40
