*** med_ has quit IRC | 00:07 | |
jtcressy | Can I force-delete a stack? I can't get this thing to delete no matter how many times I try. | 00:07 |
---|---|---|
jtcressy | there are no instances in "openstack server list" but it still fails saying that it can't delete an instance. Why does it keep trying to delete an instance that doesn't exist??? wtf! | 00:08 |
*** toure is now known as toure|gone | 00:09 | |
*** ooolpbot has joined #tripleo | 00:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782438 | 00:10 |
*** ooolpbot has quit IRC | 00:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 00:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 00:10 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [Critical,New] | 00:10 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-common master: Support for ARA report for ansible playbooks in deploy https://review.openstack.org/565077 | 00:10 |
openstackgerrit | Sagi Shnaidman proposed openstack/python-tripleoclient master: Support ARA report tracking from command line https://review.openstack.org/583799 | 00:12 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-common master: Support for ARA report for ansible playbooks in deploy https://review.openstack.org/565077 | 00:12 |
*** thrash is now known as thrash|g0ne | 00:13 | |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Collect overcloud statistics with ARA https://review.openstack.org/578462 | 00:15 |
jtcressy | On every resource in my stack after attempting deletion: "Stack DELETE cancelled" | 00:16 |
*** moshele has joined #tripleo | 00:16 | |
jtcressy | why? wtf? | 00:16 |
*** honza has joined #tripleo | 00:20 | |
*** honza is now known as Guest70154 | 00:20 | |
*** Guest70154 is now known as honza_ | 00:21 | |
*** itlinux_ has joined #tripleo | 00:22 | |
*** itlinux has quit IRC | 00:23 | |
*** jtcressy has quit IRC | 00:25 | |
*** jtcressy has joined #tripleo | 00:26 | |
jtcressy | mwhahaha: is there any way for me to forcefully delete a failed heat stack? it keeps instantly failing every time I try to delete it | 00:26 |
jtcressy | "openstack stack failures list overcloud" shows me a list of resources that ALL say "DELETE aborted (user triggered cancel)". I triggered no such thing. I keep running "openstack stack delete overcloud -y" repeatedly to no avail. | 00:27 |
*** lblanchard has quit IRC | 00:28 | |
jillr | jtcressy: he's on pto. there is not a stack force delete though, | 00:28 |
jillr | one way I've seen that sort of thing happen: if any resources were deleted manually, the stack can get into a state where it can't cascade through all the substacks and resources correctly, and you end up with an undeletable stack. | 00:29 |
jtcressy | I did not delete any of these resources manually. I've only been running the stack delete command and nothing else | 00:30 |
jillr | you can attempt database surgery to identify what's in a bad state and nudge things along | 00:30 |
jillr | k, that's just one way I've seen it, as an example. | 00:30 |
jtcressy | each time i check what the failures are it seems to be different resources. I can't pin it down. | 00:30 |
jillr | it's probably fastest and easiest to redeploy the undercloud if this is a test/poc cloud, | 00:30 |
jillr | or a great opp to learn heat troubleshooting, depending on how you want to look at it? :) | 00:31 |
jtcressy | I can't even begin to understand why it would cancel all of these deletions. I can't find a single resource that doesn't say "cancelled" | 00:31 |
jillr | heat logs might help, or you can run --debug with your openstack client cmd | 00:32 |
jtcressy | does this make any sense? https://hastebin.com/raw/orovuwehaj | 00:34 |
jillr | couple bugs that could be related: https://bugzilla.redhat.com/show_bug.cgi?id=1568578, https://bugzilla.redhat.com/show_bug.cgi?id=1571384 | 00:37 |
openstack | bugzilla.redhat.com bug 1568578 in rhosp-director "Deleting the OC stack occasionally fails" [Unspecified,Closed: duplicate] - Assigned to rhos-maint | 00:37 |
openstack | bugzilla.redhat.com bug 1571384 in openstack-ironic "libvirt errors are causing virtualbmc power operations to fail, resulting in failed deployments when using virtualbmc" [High,Closed: duplicate] - Assigned to ietingof | 00:37 |
jillr | is vbmc working for your OC nodes? | 00:37 |
jtcressy | vbmc? | 00:37 |
jillr | are you deploying with vms? | 00:38 |
jtcressy | no i'm on bare metal | 00:38 |
jtcressy | all R710's with iDRAC 6 | 00:38 |
jillr | ok, s/vbmc/ipmi then | 00:38 |
jtcressy | heat isn't trying to delete the instances or anything... it keeps getting stuck on the resources I show in that hastebin above. | 00:39 |
jtcressy | I don't think it's a nova issue. | 00:40 |
jillr | the nested stacks are intertwined with each other. you're going to need to trace out what resource it's trying to act on when it fails, and what caused that action to fail. | 00:42 |
jillr | debug logs should be helpful, so you can see what api call is being made when it happens | 00:42 |
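jillr's advice to trace which resource is failing can be approximated with a small helper that walks the nested stacks and keeps only the failed entries. This is a minimal Python sketch, not something from the channel: it assumes the openstack client is on PATH, and that `openstack stack resource list -f json` emits objects with `resource_status`, `resource_name`, `resource_type` and `physical_resource_id` keys. The function names and the depth of 5 are hypothetical choices.

```python
import json
import subprocess


def failed_resources(resources, status="DELETE_FAILED"):
    """Keep only the entries whose resource_status equals `status`.

    `resources` is a list of dicts; the key name is an assumption
    about the client's JSON output.
    """
    return [r for r in resources if r.get("resource_status") == status]


def list_stack_resources(stack="overcloud", depth=5):
    """Ask the openstack client for resources, including nested stacks."""
    cmd = ["openstack", "stack", "resource", "list", stack,
           "--nested-depth", str(depth), "-f", "json"]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return json.loads(out)


def main():
    # Print one line per failed resource so each can be inspected in turn.
    for res in failed_resources(list_stack_resources()):
        print(res.get("resource_name"),
              res.get("resource_type"),
              res.get("physical_resource_id"))
```

Calling `main()` on the undercloud prints the failed resources; running the same delete with `openstack --debug stack delete overcloud -y` then shows the API calls being made, as suggested above.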
jtcressy | I hit a dead end on this resource: "| ServiceChain | eeb871e8-c3c6-4d7a-90bd-4e56da023d32 | OS::Heat::ResourceChain | DELETE_FAILED | 2018-07-18T23:37:37Z |" | 00:45 |
jtcressy | "openstack stack resource list eeb871e8-c3c6-4d7a-90bd-4e56da023d32" gives me no output. | 00:45 |
*** artom has quit IRC | 00:49 | |
jtcressy | Ok.... so i guess running "openstack stack delete overcloud -y" repeatedly eventually WILL delete the stack. I just had to attempt it over 150 times over the course of an hour or so. | 00:49 |
*** itlinux_ has quit IRC | 00:49 | |
jtcressy | openstack stack list now comes up empty | 00:49 |
jtcressy | maybe I can write a brute-forcing script that will repeat this until the stack is deleted. it will be handy next time this happens. | 00:50 |
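The retry script jtcressy describes could look roughly like the following. It is a hedged Python sketch under stated assumptions: the openstack client is on PATH, `openstack stack show <name>` exits non-zero once the stack is gone, and the function names, attempt limit and wait time are all hypothetical. The `run` and `sleep` parameters exist only so the loop can be exercised without a real cloud.

```python
import subprocess
import time


def stack_exists(name, run=subprocess.run):
    """Treat a zero exit code from `openstack stack show` as 'still there'."""
    result = run(["openstack", "stack", "show", name],
                 capture_output=True, text=True)
    return result.returncode == 0


def delete_until_gone(name, attempts=200, wait=30,
                      run=subprocess.run, sleep=time.sleep):
    """Re-issue `openstack stack delete <name> -y` until the stack is gone.

    Returns the number of delete commands issued. Raises RuntimeError if
    the stack still exists after `attempts` deletes.
    """
    issued = 0
    while stack_exists(name, run):
        if issued >= attempts:
            raise RuntimeError(
                "stack %s still present after %d deletes" % (name, issued))
        run(["openstack", "stack", "delete", name, "-y"],
            capture_output=True, text=True)
        issued += 1
        # Give Heat some time to make progress before checking again.
        sleep(wait)
    return issued
```

For example, `delete_until_gone("overcloud")` would keep retrying for up to 200 attempts. This only automates the workaround; it does not explain why the deletes were being cancelled.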
*** noslzzp has quit IRC | 00:58 | |
*** mburned has quit IRC | 01:04 | |
*** haleyb has quit IRC | 01:05 | |
*** jtcressy has quit IRC | 01:06 | |
*** noslzzp has joined #tripleo | 01:10 | |
*** haleyb has joined #tripleo | 01:13 | |
*** mburned has joined #tripleo | 01:13 | |
*** itlinux has joined #tripleo | 01:17 | |
*** mrsoul` has joined #tripleo | 01:21 | |
*** mrsoul_ has joined #tripleo | 01:21 | |
*** Petersingh has joined #tripleo | 01:21 | |
*** mrsoul has quit IRC | 01:23 | |
*** mschuppert has quit IRC | 01:24 | |
*** med_ has joined #tripleo | 01:35 | |
*** med_ has quit IRC | 01:35 | |
*** med_ has joined #tripleo | 01:35 | |
*** Petersingh is now known as Petersingh|afk | 01:39 | |
*** agopi has joined #tripleo | 01:42 | |
*** rbrady has quit IRC | 01:45 | |
*** yamahata has quit IRC | 01:46 | |
EmilienM | can someone look at https://review.openstack.org/#/c/569153/ please | 01:51 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: Remove --use-heat usage, as it's deprecated https://review.openstack.org/581534 | 01:51 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Check container health as part of the deploy https://review.openstack.org/569153 | 01:52 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Limit deploy health checks to paunch managed ones https://review.openstack.org/581529 | 01:52 |
*** mcornea has quit IRC | 01:53 | |
*** ramishra has joined #tripleo | 01:54 | |
*** lblanchard has joined #tripleo | 01:54 | |
*** Petersingh|afk is now known as Petersingh | 01:54 | |
openstackgerrit | Merged openstack/puppet-tripleo stable/queens: remove scenario005 from experimental https://review.openstack.org/583685 | 01:57 |
*** ramishra has quit IRC | 02:01 | |
*** jaganathan has joined #tripleo | 02:06 | |
*** mdnadeem has joined #tripleo | 02:10 | |
*** agopi has quit IRC | 02:17 | |
*** itlinux has quit IRC | 02:19 | |
*** shreshtha has quit IRC | 02:27 | |
*** ramishra has joined #tripleo | 02:29 | |
openstackgerrit | Tuan Do Anh proposed openstack/tripleo-common master: Fix typo of function naming conventions in parameters.py https://review.openstack.org/581178 | 02:35 |
*** moshele has quit IRC | 02:36 | |
*** psachin` has joined #tripleo | 02:37 | |
*** lblanchard has quit IRC | 02:43 | |
*** ramishra has quit IRC | 02:46 | |
*** jaganathan has quit IRC | 02:48 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Read featureset variable as string value https://review.openstack.org/583022 | 02:59 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Set common vars at vars/common.yaml https://review.openstack.org/582885 | 03:08 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take featureset out of TOCI_JOBTYPE https://review.openstack.org/582384 | 03:08 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take environment_type out of TOCI_JOBTYPE https://review.openstack.org/582385 | 03:08 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take nodes out of TOCI_JOBTYPE https://review.openstack.org/582386 | 03:08 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take periodic and dryrun out of TOCI_JOBTYPE https://review.openstack.org/582387 | 03:08 |
*** Petersingh is now known as Petersingh|afk | 03:18 | |
*** psahoo has joined #tripleo | 03:56 | |
*** Petersingh|afk is now known as Petersingh | 03:58 | |
*** udesale has joined #tripleo | 04:02 | |
*** links has joined #tripleo | 04:08 | |
*** Haresh has joined #tripleo | 04:09 | |
*** holser_ has joined #tripleo | 04:11 | |
*** med_ has quit IRC | 04:11 | |
*** pdeore has joined #tripleo | 04:17 | |
*** pdeore has quit IRC | 04:17 | |
*** rh-jelabarre has quit IRC | 04:19 | |
*** ccamacho has quit IRC | 04:21 | |
*** khyr0n has joined #tripleo | 04:22 | |
*** eck` is now known as eck`gone | 04:23 | |
*** pcaruana has joined #tripleo | 04:23 | |
*** ramishra has joined #tripleo | 04:28 | |
*** karthiks has quit IRC | 04:28 | |
*** shreshtha has joined #tripleo | 04:31 | |
*** pcaruana has quit IRC | 04:34 | |
*** Haresh has quit IRC | 04:39 | |
*** ykarel has joined #tripleo | 04:49 | |
*** mdnadeem has quit IRC | 04:55 | |
*** mdnadeem has joined #tripleo | 04:56 | |
*** skramaja has joined #tripleo | 04:57 | |
*** pdeore has joined #tripleo | 04:59 | |
*** mpjetta has quit IRC | 05:03 | |
*** holser_ has quit IRC | 05:04 | |
*** mdnadeem has quit IRC | 05:06 | |
*** Petersingh is now known as Petersingh|bomga | 05:15 | |
*** mpjetta has joined #tripleo | 05:21 | |
*** rrubins has quit IRC | 05:22 | |
*** moshele has joined #tripleo | 05:23 | |
*** ccamacho has joined #tripleo | 05:25 | |
*** quiquell|off is now known as quiquell | 05:26 | |
*** moshele has quit IRC | 05:27 | |
*** flwang has quit IRC | 05:29 | |
*** mnasiadka_ has joined #tripleo | 05:29 | |
*** mnasiadka has quit IRC | 05:29 | |
*** NobodyCam has quit IRC | 05:29 | |
*** mnasiadka_ is now known as mnasiadka | 05:29 | |
*** NobodyCam has joined #tripleo | 05:29 | |
*** mgkwill_ has joined #tripleo | 05:30 | |
*** gregwork_ has joined #tripleo | 05:30 | |
*** mwhahaha has quit IRC | 05:30 | |
*** portdirect has quit IRC | 05:30 | |
*** mgkwill has quit IRC | 05:30 | |
*** mwhahaha has joined #tripleo | 05:30 | |
*** mgkwill_ is now known as mgkwill | 05:30 | |
*** portdirect has joined #tripleo | 05:30 | |
*** colonwq has quit IRC | 05:31 | |
*** morazi has quit IRC | 05:31 | |
*** hamzy has quit IRC | 05:31 | |
*** gregwork has quit IRC | 05:31 | |
*** gregwork_ is now known as gregwork | 05:31 | |
*** portdirect is now known as Guest72952 | 05:32 | |
*** hamzy has joined #tripleo | 05:36 | |
*** mdnadeem has joined #tripleo | 05:37 | |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient master: [WIP] Learn --with-ara https://review.openstack.org/583861 | 05:37 |
*** rrubins has joined #tripleo | 05:38 | |
openstackgerrit | Quique Llorente proposed openstack/tripleo-quickstart master: [DNM] Use --with-ara for featureset010 https://review.openstack.org/583557 | 05:40 |
*** colonwq has joined #tripleo | 05:45 | |
*** morazi has joined #tripleo | 05:45 | |
*** flwang has joined #tripleo | 05:47 | |
*** threestrands has quit IRC | 05:49 | |
*** cshastri has joined #tripleo | 05:50 | |
*** dparkes has joined #tripleo | 05:55 | |
*** janki has joined #tripleo | 05:56 | |
*** mdnadeem has quit IRC | 05:57 | |
*** ratailor has joined #tripleo | 05:57 | |
*** noslzzp has quit IRC | 05:59 | |
*** mdnadeem has joined #tripleo | 05:59 | |
*** hamdyk has joined #tripleo | 06:08 | |
*** threestrands has joined #tripleo | 06:09 | |
*** threestrands has quit IRC | 06:09 | |
*** threestrands has joined #tripleo | 06:09 | |
*** paramite has joined #tripleo | 06:11 | |
*** jtcressy has joined #tripleo | 06:14 | |
*** agopi has joined #tripleo | 06:17 | |
bandini | Simple backport if anyone is around https://review.openstack.org/583107 | 06:20 |
*** udesale_ has joined #tripleo | 06:23 | |
*** udesale has quit IRC | 06:25 | |
*** gkadam has joined #tripleo | 06:26 | |
*** paramite has quit IRC | 06:27 | |
*** paramite has joined #tripleo | 06:28 | |
*** agurenko has joined #tripleo | 06:30 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: Enable logging to stdout/stderr in memcached https://review.openstack.org/583344 | 06:31 |
*** jtcressy has quit IRC | 06:32 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Take periodic and dryrun out of TOCI_JOBTYPE https://review.openstack.org/582387 | 06:32 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Check container health as part of the deploy https://review.openstack.org/569153 | 06:33 |
*** ffiore has joined #tripleo | 06:35 | |
*** jfrancoa has joined #tripleo | 06:36 | |
*** threestrands has quit IRC | 06:36 | |
*** pcaruana has joined #tripleo | 06:37 | |
*** gkadam has quit IRC | 06:38 | |
*** ramishra has quit IRC | 06:40 | |
*** ramishra has joined #tripleo | 06:41 | |
*** yprokule has joined #tripleo | 06:41 | |
*** paramite has quit IRC | 06:43 | |
*** moshele has joined #tripleo | 06:44 | |
*** assassin has joined #tripleo | 06:48 | |
*** Petersingh|bomga is now known as Petersingh | 06:48 | |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: WIP use openshift-ansible container instead of RPMs https://review.openstack.org/583868 | 06:50 |
*** udesale__ has joined #tripleo | 06:51 | |
*** aufi has joined #tripleo | 06:52 | |
*** mrunge_ has joined #tripleo | 06:53 | |
*** udesale_ has quit IRC | 06:54 | |
*** holser_ has joined #tripleo | 06:54 | |
*** aufi_ has joined #tripleo | 06:54 | |
*** mrunge has quit IRC | 06:55 | |
*** paramite has joined #tripleo | 06:57 | |
*** khyr0n has quit IRC | 06:58 | |
*** aufi has quit IRC | 06:58 | |
*** brault has joined #tripleo | 06:58 | |
*** mrsoul_ is now known as mschuppert | 06:59 | |
*** cylopez has joined #tripleo | 06:59 | |
*** cylopez has left #tripleo | 06:59 | |
*** nyechiel has joined #tripleo | 07:00 | |
*** moshele has quit IRC | 07:01 | |
openstackgerrit | waleed mousa proposed openstack/puppet-tripleo master: Adding support for VF LAG in SR-IOV https://review.openstack.org/558411 | 07:01 |
*** peereb has joined #tripleo | 07:02 | |
*** bogdando has joined #tripleo | 07:02 | |
*** openstackgerrit has quit IRC | 07:04 | |
*** sileht has quit IRC | 07:04 | |
*** sileht has joined #tripleo | 07:07 | |
*** dbecker has joined #tripleo | 07:09 | |
*** amoralej|off is now known as amoralej | 07:09 | |
*** ramishra has quit IRC | 07:09 | |
*** paramite has quit IRC | 07:10 | |
*** ramishra has joined #tripleo | 07:11 | |
*** shardy has joined #tripleo | 07:13 | |
*** shardy has quit IRC | 07:13 | |
*** shardy has joined #tripleo | 07:13 | |
*** openstackgerrit has joined #tripleo | 07:13 | |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: tripleo.sh --repo-setup update ceph to luminous and remove older https://review.openstack.org/583547 | 07:13 |
*** paramite has joined #tripleo | 07:13 | |
*** gfidente has joined #tripleo | 07:21 | |
*** gfidente has quit IRC | 07:21 | |
*** gfidente has joined #tripleo | 07:21 | |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: tripleo.sh --repo-setup update ceph to luminous and remove older https://review.openstack.org/583547 | 07:21 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Fix HostnameMap lookup - replace str_replace with yaql https://review.openstack.org/582475 | 07:24 |
*** yprokule_ has joined #tripleo | 07:24 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Move toci_quickstart variables to yaml https://review.openstack.org/582466 | 07:26 |
*** yprokule has quit IRC | 07:27 | |
*** yprokule_ is now known as yprokule | 07:27 | |
*** ykarel is now known as ykarel|lunch | 07:27 | |
openstackgerrit | Quique Llorente proposed openstack/tripleo-heat-templates master: [DNM] To test sprint16 toci refactoring https://review.openstack.org/583874 | 07:28 |
*** janki has quit IRC | 07:29 | |
*** rcernin has quit IRC | 07:29 | |
*** lvdombrkr has joined #tripleo | 07:30 | |
*** avivgt|lunch has joined #tripleo | 07:31 | |
*** noslzzp has joined #tripleo | 07:33 | |
*** tosky has joined #tripleo | 07:37 | |
*** florianf has joined #tripleo | 07:42 | |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates master: [WIP] - mistra_engine container added /usr/share volume https://review.openstack.org/583877 | 07:44 |
*** janki has joined #tripleo | 07:44 | |
openstackgerrit | Tuan Do Anh proposed openstack/tripleo-common master: fix tox python3 overrides https://review.openstack.org/579827 | 07:46 |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: tripleo.sh --repo-setup update ceph to luminous and remove older https://review.openstack.org/583547 | 07:50 |
*** Petersingh is now known as Petersingh|lunch | 07:50 | |
*** yprokule_ has joined #tripleo | 07:52 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Use global ansible.cfg for nodes-uuid playbook https://review.openstack.org/583552 | 07:53 |
*** kopecmartin has joined #tripleo | 07:53 | |
*** yprokule has quit IRC | 07:55 | |
*** yprokule_ is now known as yprokule | 07:55 | |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates master: [WIP] - mistra_engine container added /usr/share volume https://review.openstack.org/583877 | 08:03 |
*** dtantsur|afk is now known as dtantsur | 08:07 | |
*** gkadam has joined #tripleo | 08:11 | |
*** gkadam is now known as gkadam-brb | 08:12 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Enable support for running refstack tests in TQE https://review.openstack.org/570719 | 08:15 |
*** pmannidi has quit IRC | 08:18 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Use global ansible.cfg for nodes-uuid playbook https://review.openstack.org/583552 | 08:20 |
*** derekh has joined #tripleo | 08:23 | |
*** dtantsur is now known as dtantsur|bbl | 08:24 | |
*** holser_ has quit IRC | 08:25 | |
*** ykarel|lunch is now known as ykarel | 08:30 | |
shardy | d0ugal: Hey, can you give any tips on how to trace mistral logs from an action error back to which workflow/task was running the action? | 08:30 |
shardy | d0ugal: http://logs.openstack.org/53/574753/21/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/516a10d/logs/undercloud/var/log/containers/mistral/executor.log.txt.gz#_2018-07-18_11_04_12_336 | 08:30 |
openstackgerrit | Cédric Jeanneret proposed openstack/puppet-tripleo master: Corrected vrrp script for haproxy status https://review.openstack.org/583886 | 08:30 |
shardy | here I can see an action failed because it 404'd getting plan-environment.yaml from swift | 08:30 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Move toci_quickstart variables to yaml https://review.openstack.org/582466 | 08:30 |
shardy | d0ugal: but it's not clear which workflow/task caused this (so I can figure out why the update workflow lost the plan-environment we copy/create during the initial deploy) | 08:31 |
shardy | it'd be cool if there was a tree/forest view option which we could dump into the CI logs, so you can see the exact state of the workflow graph | 08:31 |
flaper87 | bogdando: https://review.openstack.org/#/c/583238/ <- you sure it's this patch fault? | 08:32 |
flaper87 | the CI switched to containerized undercloud and now scenario009 is broken | 08:33 |
flaper87 | :( | 08:33 |
flaper87 | want to merge these patches asap | 08:33 |
flaper87 | shardy: mandre https://review.openstack.org/#/c/583238/ pls | 08:33 |
bogdando | flaper87: I don't know how to debug that zuul breakage ( | 08:33 |
bogdando | syntax* | 08:33 |
bogdando | flaper87: check experimental fails on that patch | 08:34 |
flaper87 | doh, I just noticed | 08:34 |
flaper87 | T_T | 08:34 |
d0ugal | shardy: Looking. Maybe we can add that script I wrote to CI, even if it doesn't land into Mistral yet | 08:34 |
shardy | flaper87: sure but bogdando is right, there's some zuul syntax error badness in the subsequent patches in that series | 08:35 |
* shardy tries to understand why | 08:35 | |
flaper87 | shardy: yeah, I just noticed what he meant | 08:35 |
flaper87 | T_T | 08:35 |
flaper87 | sorry for the noise | 08:35 |
* flaper87 checks | 08:36 | |
shardy | flaper87: cool, I don't really see what's wrong, the list additions seem reasonable | 08:36 |
shardy | "Job tripleo-ci-centos-7-scenario005-multinode-oooq not defined" | 08:36 |
shardy | hmm | 08:36 |
flaper87 | I don't think it's this patch's fault | 08:36 |
flaper87 | lemme re run experimental | 08:36 |
flaper87 | maybe some job definition coming from somewhere else | 08:37 |
shardy | yeah | 08:37 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: Run scenario009 for more services https://review.openstack.org/583238 | 08:37 |
flaper87 | rebased it and re-run check experimental | 08:37 |
*** Petersingh|lunch is now known as Petersingh | 08:37 | |
flaper87 | lets see | 08:37 |
openstackgerrit | Carlos Goncalves proposed openstack/tripleo-heat-templates master: Add scenario010 for testing Octavia https://review.openstack.org/518331 | 08:37 |
openstackgerrit | Luigi Toscano proposed openstack/tripleo-heat-templates master: WIP Deploy Sahara with unversioned endpoints https://review.openstack.org/583890 | 08:38 |
mandre | flaper87: yeah likely not the patch's fault | 08:38 |
d0ugal | shardy: The action execution ID is a few lines above the error in the logs, as action_ex_id | 08:38 |
d0ugal | shardy: http://logs.openstack.org/53/574753/21/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/516a10d/logs/undercloud/var/log/containers/mistral/executor.log.txt.gz#_2018-07-18_11_04_12_305 | 08:38 |
mandre | it's just that experimental has a tripleo-ci-centos-7-scenario005-multinode-oooq job that is probably never defined | 08:39 |
d0ugal | shardy: grepping the engine log for that ID finds the workflow trace | 08:39 |
d0ugal | shardy: http://logs.openstack.org/53/574753/21/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/516a10d/logs/undercloud/var/log/containers/mistral/engine.log.txt.gz#_2018-07-18_11_04_12_369 | 08:39 |
shardy | d0ugal: ah, thanks, I was looking for the ID in [req-79f774cb-9c70-4499-ba1b-2f92c3b19f71 fa0efb24f6194b28aa741bb374178745 0b39e2fc1663472f9b045854a4770f83 | 08:40 |
d0ugal | shardy: yeah, those IDs are confusing. I don't understand them. I need to figure that out. | 08:40 |
shardy | d0ugal: yeah be interested to know if you find out, I assumed one of the second IDs was the action execution | 08:41 |
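d0ugal's tip (take the action execution ID from the executor log, then grep the engine log for it) can be sketched as a small Python helper. This is an illustration under assumptions: the function names are hypothetical, the ID is passed in by hand after reading it from the `action_ex_id` field, and nothing here depends on the exact Mistral log line format. Gzipped logs such as the CI `.txt.gz` files are handled by file extension.

```python
import gzip


def open_log(path):
    """Open a plain or gzip-compressed log file in text mode."""
    opener = gzip.open if path.endswith(".gz") else open
    return opener(path, "rt", encoding="utf-8", errors="replace")


def lines_mentioning(lines, needle):
    """Return every line containing `needle`, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if needle in line]


def trace_action(engine_log_path, action_ex_id):
    """Print the engine log lines that mention the given action execution ID."""
    with open_log(engine_log_path) as handle:
        for line in lines_mentioning(handle, action_ex_id):
            print(line)
```

Usage would be `trace_action("engine.log.txt.gz", "<action_ex_id from executor log>")`, where both arguments are placeholders for the real path and ID.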
*** tesseract has joined #tripleo | 08:41 | |
shardy | Ok so tripleo.plan_management.v1.update_deployment_plan calls tripleo.parameters.generate_passwords which blows up because plan-environment is missing | 08:42 |
shardy | d0ugal: do you happen to know how we maintain the plan-environment.yaml over updates to the plan? | 08:42 |
mandre | flaper87, shardy, bogdando: tripleo-ci-centos-7-scenario005-multinode-oooq definition was removed in https://review.openstack.org/#/c/581376/ | 08:42 |
shardy | I was expecting it to persist because it's got e.g all the passwords etc in it | 08:42 |
shardy | not get re-generated every update | 08:43 |
flaper87 | mandre: just rebased my patch | 08:43 |
flaper87 | it should work now | 08:43 |
flaper87 | let's see | 08:43 |
bogdando | flaper87: well spotted! thanks) | 08:43 |
shardy | mandre: aha, good catch! | 08:43 |
d0ugal | shardy: I don't fully remember, but I don't think it is re-generated. I think that workflow will generate the passwords if they are missing | 08:44 |
d0ugal | I can't think of a reason why it wouldn't exist at that point however. | 08:44 |
shardy | d0ugal: Ok I'll need to modify the update workflow then - how do we avoid changing the passwords, are those stored somewhere else as well as the plan-environment? | 08:44 |
mandre | flaper87: still, we need to remove scenario005 from experimental, the job doesn't exist anymore | 08:45 |
mandre | i'll submit a patch | 08:45 |
shardy | d0ugal: I assume part of the update cleans the old plan contents, but I incorrectly assumed we saved the plan-environment somewhere along the line | 08:45 |
flaper87 | mandre: wait | 08:45 |
flaper87 | let me do it | 08:45 |
flaper87 | just to avoid a bunch of rebases | 08:45 |
d0ugal | shardy: Good question. You are testing my memory :) | 08:46 |
flaper87 | also, why wasn't it removed from the experimental queue when the job was removed? | 08:46 |
shardy | hehe | 08:46 |
flaper87 | there probably is a patch for that already | 08:46 |
* shardy passes d0ugal a coffee ;) | 08:46 | |
flaper87 | mandre: ^ | 08:46 |
* shardy looks at the code to figure it out | 08:46 | |
flaper87 | mandre: https://review.openstack.org/#/c/583680/ | 08:46 |
*** agurenko has quit IRC | 08:46 | |
flaper87 | there's a patch for it already | 08:46 |
*** pradk has joined #tripleo | 08:47 | |
*** holser_ has joined #tripleo | 08:48 | |
d0ugal | shardy: I think they are only stored in user-environment and generated if missing? | 08:49 |
mandre | weshay must wonder what happened to his patch ;) | 08:49 |
d0ugal | shardy: but at one point we did store a copy of what we generated in another location - but I can't find that now. | 08:49 |
d0ugal | shardy: that was back when we stored everything in Mistral envs | 08:49 |
openstackgerrit | Jose Luis Franco proposed openstack-infra/tripleo-ci master: Collect /tmp/ansible-mistral-action into CI job logs. https://review.openstack.org/583896 | 08:49 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: Run scenario009 for more services https://review.openstack.org/583238 | 08:49 |
*** kopecmartin has quit IRC | 08:49 | |
d0ugal | shardy: https://github.com/openstack/tripleo-common/blob/master/tripleo_common/actions/parameters.py#L285-L315 | 08:50 |
shardy | d0ugal: hmm, but there's a huge "passwords" block in the plan-environment | 08:50 |
shardy | https://github.com/openstack/tripleo-common/blob/master/tripleo_common/utils/passwords.py#L50 | 08:50 |
*** kopecmartin has joined #tripleo | 08:50 | |
*** agurenko has joined #tripleo | 08:50 | |
shardy | d0ugal: yeah, so it looks like we delete it, then grab any existing passwords from the heat environment | 08:50 |
shardy | sigh. That kinda breaks my selection of a plan-sample :( | 08:51 |
d0ugal | shardy: yeah, that was done for the upgrade case (upgrading from no initial plan) | 08:51 |
d0ugal | I don't know if we still need that, since everyone should have a plan now? | 08:51 |
shardy | d0ugal: I think we're relying on that, because by the time you update the plan, the old plan-environment is gone | 08:51 |
shardy | but, luckily, we still have the data in heat | 08:51 |
d0ugal | oh | 08:52 |
shardy | I'll have to test to confirm that though | 08:52 |
d0ugal | shardy: I don't see where the old environment is removed? | 08:52 |
d0ugal | I see it being loaded and updated, but I might be missing something | 08:52 |
openstackgerrit | Martin André proposed openstack/tripleo-quickstart-extras master: Restrict undercloud resolvers to IPv4 addresses https://review.openstack.org/583302 | 08:52 |
openstackgerrit | Damien Ciabrini proposed openstack/puppet-tripleo master: Prevent triggering firewall actions while configuring HA services https://review.openstack.org/583648 | 08:54 |
*** udesale_ has joined #tripleo | 08:54 | |
shardy | TODO(d0ugal): We need to put a more robust strategy in place here to handle updating plans. | 08:54 |
shardy | https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L373 | 08:54 |
shardy | ;) | 08:54 |
shardy | IIRC we purge the entire container in the client | 08:54 |
* shardy looks for where | 08:54 | |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: tripleo.sh --repo-setup update ceph to luminous and remove older https://review.openstack.org/583547 | 08:55 |
d0ugal | shardy: ah, I forgot that tripleoclient dove behind the API. gah | 08:55 |
shardy | https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/workflows/plan_management.py#L200 | 08:55 |
openstackgerrit | waleed mousa proposed openstack/puppet-tripleo master: Adding support for VF LAG in SR-IOV for Mellanox interfaces https://review.openstack.org/558411 | 08:56 |
shardy | d0ugal: yeah, I guess for now we could special-case it to save the plan-environment and capabilities-map, but it'd probably be better to have a workflow that does this | 08:56 |
d0ugal | yup | 08:56 |
d0ugal | shardy: we should really try and sort out the tripleoclient mess in Stein :) | 08:56 |
*** udesale__ has quit IRC | 08:57 | |
*** agurenko has quit IRC | 08:57 | |
shardy | d0ugal: yeah agreed, but it's going to be risky and a lot of work | 08:57 |
d0ugal | Indeed | 08:57 |
d0ugal | but it just keeps getting worse and causing extra work. | 08:57 |
shardy | maybe at the PTG we can figure out the steps and attempt it incrementally | 08:57 |
shardy | yeah | 08:57 |
flaper87 | bogdando: you can now remove your -1 https://review.openstack.org/#/c/583238/ :D | 08:57 |
*** salmankhan has joined #tripleo | 08:57 | |
shardy | Ok mystery solved - thanks for your help working through it! :) | 08:57 |
bogdando | oops, I thought it's gone after rebase, flaper87 | 08:58 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates master: Bind mount mistral state for external deployments https://review.openstack.org/583136 | 08:58 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates master: W/a kubespray vault install failure https://review.openstack.org/583132 | 08:58 |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: tripleo.sh --repo-setup update ceph to luminous and remove older https://review.openstack.org/583547 | 08:59 |
d0ugal | shardy: np | 08:59 |
*** agurenko has joined #tripleo | 09:00 | |
mandre | bogdando: do you have an idea why scenario009 ran with containerized undercloud for https://review.openstack.org/#/c/583302/ ? I was kinda under the impression we haven't made the switch yet | 09:02 |
*** cmyster has quit IRC | 09:02 | |
bogdando | mandre: defaults switched | 09:02 |
bogdando | in the client, openstack undercloud install now deploys containerized, --use-heat=False would deploy instack | 09:03 |
bogdando | https://review.openstack.org/#/c/581534/ | 09:03 |
bogdando | mandre: ^^ | 09:03 |
gfidente | sshnaidm|rover if I was marios | 09:03 |
gfidente | I'd have killed you on the comment | 09:03 |
gfidente | about erase vs remove | 09:03 |
bogdando | and the switch was in https://review.openstack.org/#/c/576218 | 09:04 |
*** nyechiel has quit IRC | 09:04 | |
mandre | bogdando: ahh, thanks that explains it | 09:04 |
sshnaidm|rover | gfidente, it's nit, not reason for -1 | 09:04 |
shardy | bogdando: has much testing been done of upgrading from an instack installed undercloud to the containerized one? | 09:05 |
shardy | I ask because I tried it yesterday and it didn't work, but that was possibly due to other issues in my environment | 09:05 |
bogdando | shardy: that's using instack still | 09:05 |
*** Petersingh is now known as Petersingh|afk | 09:05 | |
gfidente | sshnaidm|rover ah sorry, what is the reason for the -1 ? | 09:06 |
shardy | bogdando: I mean we switched the defaults - if someone upgrades then runs "openstack undercloud install" or "openstack undercloud upgrade", the heat based container stuff will run | 09:06 |
bogdando | shardy: oh, do you mean UC upgrade, not OC? | 09:06 |
shardy | what happens then? | 09:06 |
shardy | bogdando: yeah | 09:06 |
sshnaidm|rover | gfidente, If I was gfidente, I would talk to people before killing them | 09:06 |
bogdando | it's been tested in upstream instack-to-cont-upgrade job for a few months | 09:06 |
gfidente | sshnaidm|rover oh dear | 09:07 |
gfidente | I wasn't thinking about killing you for real | 09:07 |
shardy | bogdando: Ok good to hear, hopefully my issues were a one-off then | 09:07 |
gfidente | and if you have something to say to -1, better say it in gerrit so others know | 09:07 |
bogdando | shardy: since May 4 , https://trello.com/c/nFbky9Uk/5-upgrade-support-from-instack-undercloud | 09:07 |
sshnaidm|rover | gfidente, please read comments more carefully | 09:07 |
gfidente | but anyway, I hope it's obvious I was joking | 09:07 |
gfidente | sorry if it wasn't | 09:07 |
bogdando | https://review.openstack.org/#/c/553633/ introduced that job | 09:08 |
gfidente | sshnaidm|rover ok about comments | 09:08 |
*** panda|off is now known as panda | 09:08 | |
gfidente | I see you wrote this "How is that related to tripleo.sh?" | 09:08 |
gfidente | in a change which is making a change to tripleo.sh | 09:08 |
gfidente | I might be overlooking something | 09:08 |
gfidente | what was the real meaning of your comment? | 09:08 |
bogdando | jfrancoa: hi, so it failed now http://logs.openstack.org/15/583515/2/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/0c803dc/logs/undercloud/home/zuul/undercloud_install.log.txt.gz#_2018-07-18_17_35_00 while was passing the previous patchset | 09:09 |
bogdando | how come?.. :( | 09:09 |
jfrancoa | bogdando: I only changed the flag you commented in the patch, instead of adding --use-heat, set containerized_undercloud to true. "openstack undercloud install" should be now the same as "openstack undercloud install --use-heat" right? | 09:11 |
bogdando | comparing to http://logs.openstack.org/47/465047/13/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/0507766/ , jfrancoa , let's find out | 09:11 |
jfrancoa | bogdando: that's the only difference | 09:11 |
bogdando | I mean the instack job | 09:11 |
bogdando | in the original patch | 09:11 |
*** holser_ has quit IRC | 09:11 | |
*** Petersingh|afk is now known as Petersingh | 09:12 | |
bogdando | jfrancoa: never mind, I messed it up | 09:12 |
bogdando | the links | 09:12 |
bogdando | so the passed instack and overcloud upgrades job was http://logs.openstack.org/47/465047/13/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/0507766/logs/undercloud/home/zuul/undercloud_install.log.txt.gz | 09:14 |
bogdando | and failed for the minor refactor on 14 ( | 09:14 |
*** derekh has quit IRC | 09:14 | |
*** kopecmartin has quit IRC | 09:15 | |
bogdando | oh, it passed http://logs.openstack.org/47/465047/14/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/3f333a1/ . Ugh, so confusing :D so that's the containerized UC patch failed on the 2nd run | 09:15 |
bogdando | jfrancoa: yes, you're right wrt that --use-heat change | 09:15 |
marios | gfidente: sshnaidm|rover peace :) brothers be cool | 09:16 |
marios | gfidente: sshnaidm|rover its all good | 09:17 |
marios | gfidente: sshnaidm|rover we are discussing it in oooq too | 09:17 |
jfrancoa | bogdando: yes, I added that modification. but currently by default we install containerized undercloud, right? so --use-heat and not using it should be the same. Or am I missing something? | 09:17 |
*** udesale__ has joined #tripleo | 09:17 | |
*** derekh has joined #tripleo | 09:17 | |
jfrancoa | bogdando: ok, I got it. https://review.openstack.org/#/c/583515/2/config/general_config/featureset051.yml@107 | 09:18 |
bogdando | jfrancoa: comparing http://logs.openstack.org/15/583515/1/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/698ee97/logs/undercloud/home/zuul/install-undercloud.log.txt.gz to http://logs.openstack.org/15/583515/2/check/tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades/67b9d01/logs/undercloud/home/zuul/install-undercloud.log.txt.gz I can see the deploy command changed, which is not expected | 09:19 |
jfrancoa | bogdando: it should be now "{{ undercloud_templates_path }}/ci/common/net-config-simple-bridge.yaml" | 09:19 |
jfrancoa | bogdando: I am going to change it | 09:19 |
bogdando | I can see there is also wrong hiera file | 09:19 |
chkumar|ruck | sileht: Please have a look at this failure https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-master/12091db/logs/tempest.html.gz | 09:19 |
*** kopecmartin has joined #tripleo | 09:19 | |
openstackgerrit | Rabi Mishra proposed openstack/puppet-tripleo master: Check for neutron_plugin_ml2_ansible service when including plugin https://review.openstack.org/583900 | 09:19 |
bogdando | the classic one is for instack and should not be used with cont UC | 09:19 |
bogdando | seems like a little mess in quickstart left after the defaults switched | 09:20 |
*** udesale_ has quit IRC | 09:20 | |
bogdando | EmilienM: ^^ | 09:20 |
sileht | chkumar|ruck, sure | 09:20 |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-quickstart master: Enable containerized undercloud in scenario000-upgrades. https://review.openstack.org/583515 | 09:21 |
jfrancoa | bogdando: let's see if it passes now ^^^ | 09:21 |
bogdando | jfrancoa: right, well spotted. I need another change bundled with the main patch to alter all overcloud_templates_path to undercloud_ | 09:21 |
bogdando | in featuresets | 09:21 |
jfrancoa | bogdando: yes, right. we need to change all those references too | 09:22 |
*** athomas has quit IRC | 09:23 | |
sshnaidm|rover | gfidente, instead of killing somebody, maybe you can help more with that bug - do you know where it happens? jobs, logs..? | 09:24 |
gfidente | sshnaidm|rover sorry I waste my days killing people | 09:26 |
gfidente | sshnaidm|rover are you serious? | 09:26 |
sshnaidm|rover | gfidente, about bug? yes | 09:30 |
ccamacho | hey folks!!! | 09:30 |
ccamacho | https://www.youtube.com/watch?v=ZbZSe6N_BXs | 09:30 |
ccamacho | Happy! | 09:30 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Set common vars at vars/common.yaml https://review.openstack.org/582885 | 09:30 |
ccamacho | <3 love for all | 09:30 |
ccamacho | gfidente sshnaidm|rover marios :* | 09:31 |
ccamacho | jfrancoa not for you | 09:31 |
ccamacho | give me some review love man! | 09:31 |
gfidente | sshnaidm|rover so to be honest I remember something close https://review.openstack.org/#/c/576597/ | 09:31 |
gfidente | sshnaidm|rover but I am definitely not an oooq expert so I am not sure what can be interfering with that stuff | 09:31 |
openstackgerrit | Quique Llorente proposed openstack/tripleo-heat-templates master: [DNM] Test review https://review.openstack.org/583179 | 09:32 |
sshnaidm|rover | gfidente, that's oooq patch, according to patch the problem is in infra code | 09:33 |
sshnaidm|rover | gfidente, *according to bug | 09:33 |
gfidente | sshnaidm|rover yeah I remember infra pre-installed ceph repos | 09:33 |
sileht | chkumar|ruck, you can safely recheck, it's a test that is not expecting the setup to be so slow | 09:34 |
* marios hugs ccamacho and a tree | 09:34 | |
gfidente | sshnaidm|rover this is why tripleo.sh and oooq were removing them I think | 09:34 |
chkumar|ruck | sileht: I will wait for next run then, thanks :-) | 09:34 |
sileht | ccamacho, I proposed a patch to our tempest plugin, to remove this race when the setup is very slow | 09:34 |
marios | ccamacho: which review? | 09:35 |
ccamacho | marios xD | 09:35 |
ccamacho | maybe this one? | 09:35 |
ccamacho | https://review.openstack.org/#/c/581054/ | 09:35 |
ccamacho | simple simple | 09:35 |
*** afazekas|pto is now known as afazekas | 09:39 | |
*** suuuper has joined #tripleo | 09:41 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-specs master: Validation Framework specifications. https://review.openstack.org/583475 | 09:42 |
Tengu | ccamacho: hello! any way to get your feedback for this spec? -^^ :) | 09:43 |
Tengu | as you're apparently working on validations :) | 09:44 |
ccamacho | Tengu yeah I have it right now opened, have a lot of feedback :) | 09:44 |
Tengu | ccamacho: cool! :) | 09:44 |
gfidente | sshnaidm|rover marios so regarding the bug, I think one of oooq or tripleo.sh is meant to remove the pre-existing centos-release packages | 09:45 |
gfidente | sshnaidm|rover marios and in that context the tripleo.sh change looks sane to me | 09:45 |
*** shardy has quit IRC | 09:46 | |
sri_ | shardy, quick question: in my overcloud deployment, instead of using ovs_bonds I've configured linux_bonds with vlans, http://paste.openstack.org/show/726266/, do linux_bonds work out of the box in os-net-config? is there anything we need to be aware of? | 09:47 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Set common vars at vars/common.yaml https://review.openstack.org/582885 | 09:47 |
*** kopecmartin has quit IRC | 09:47 | |
*** kopecmartin has joined #tripleo | 09:48 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Take featureset out of TOCI_JOBTYPE https://review.openstack.org/582384 | 09:48 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Take environment_type out of TOCI_JOBTYPE https://review.openstack.org/582385 | 09:48 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Take nodes out of TOCI_JOBTYPE https://review.openstack.org/582386 | 09:48 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Take periodic and dryrun out of TOCI_JOBTYPE https://review.openstack.org/582387 | 09:49 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Move toci_quickstart variables to yaml https://review.openstack.org/582466 | 09:49 |
marios | ccamacho: ack | 09:50 |
rnoriega | gfidente, hi! don't want to interrupt, just a quick question. What is the tag used for ceph containers in tripleo ci? latest? latest-luminous? | 09:51 |
rnoriega | gfidente, there is also another one called: build-master-XXX-centos-7 | 09:51 |
rnoriega | it's a bit confusing... :-\ | 09:51 |
gfidente | rnoriega hey, we pin to known working versions | 09:55 |
gfidente | location depends on the tripleo version, are you asking 'master' ? | 09:55 |
*** chem has joined #tripleo | 09:56 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: WIP: do-not-review bundle v2 https://review.openstack.org/583574 | 10:00 |
*** honza_ is now known as honza | 10:01 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: use undercloud registry for ceph_namespace in overcloud image prepare https://review.openstack.org/581607 | 10:01 |
openstackgerrit | Merged openstack/puppet-tripleo master: rate limit iptables logging https://review.openstack.org/581748 | 10:02 |
gfidente | chkumar|ruck regarding https://review.openstack.org/#/c/581607 , is the image actually copied into the undercloud registry? | 10:03 |
*** kopecmartin has quit IRC | 10:04 | |
*** Petersingh is now known as Petersingh|away | 10:05 | |
*** peereb has quit IRC | 10:06 | |
*** Petersingh|away has quit IRC | 10:06 | |
*** kopecmartin has joined #tripleo | 10:07 | |
Tengu | florianf and/or ccamacho: maybe I can talk about the "validation framework" during one of your team meetings? So that we can catch the interest of the majority, get good feedback and make a great thing together? :) | 10:14 |
florianf | Tengu: Sounds good. I haven't commented on the spec yet though. Can't do it right now, but later today. | 10:16 |
Tengu | florianf: fine for me. When are your meetings? | 10:17 |
*** dhill_ has quit IRC | 10:17 | |
Tengu | (and where ;) | 10:17 |
*** v1a4 has joined #tripleo | 10:17 | |
florianf | Tengu: next one is on monday. but I'm not gonna be present because pto | 10:19 |
*** leanderthal has joined #tripleo | 10:19 | |
Tengu | florianf: ok. Maybe it would be good to wait the next one if you're back, so that we can get some first comment/reviews? | 10:19 |
florianf | Tengu: I'm away next week, so maybe the Monday after that could be good. Plus, this will give others some time to comment. I'll find out if there is anything else scheduled. | 10:21 |
Tengu | florianf: great! it's on IRC I get? which channel? | 10:22 |
florianf | Tengu: nope, bluejeans | 10:22 |
Tengu | oh, ok. Care to add me to the event (calendar, whatever)? | 10:22 |
rnoriega | gfidente, sorry, I was afk. Yes, asking about master... | 10:23 |
ccamacho | we have our upstream meeting tomorrow Tengu florianf | 10:23 |
ccamacho | maybe might be a good place | 10:23 |
florianf | Tengu: sure, I'll talk to our tc who has all the calendar powers ;-) | 10:23 |
Tengu | ccamacho: as a first catch, depending on the hour, yep | 10:23 |
Tengu | florianf: perfect :). | 10:24 |
Tengu | ccamacho: have you any details? I won't take long anyway, just pointing to the spec so that ppl can at least know about it, read it, and comment out :) | 10:24 |
Tengu | hmm. *do you have any details - better. | 10:24 |
gfidente | rnoriega here https://github.com/openstack/tripleo-common/blob/master/tripleo_common/image/kolla_builder.py#L36 | 10:27 |
gfidente | end up in https://github.com/openstack/tripleo-common/blob/master/container-images/overcloud_containers.yaml#L106 | 10:27 |
*** nyechiel has joined #tripleo | 10:27 | |
gfidente | ci probably uses its own images.yaml though as it customizes prepare | 10:27 |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: Minor updates of pre-provisioned envrionments. https://review.openstack.org/583913 | 10:29 |
Tengu | so. I'll be back in a while, taking a dip in the lake. need some fresh water :). | 10:30 |
rnoriega | gfidente, I see, thanks! | 10:30 |
gfidente | rnoriega I guess older tags are not visible in docker.io | 10:30 |
gfidente | from the web interface | 10:30 |
*** iranzo has joined #tripleo | 10:30 | |
ccamacho | Meeting to share & talk about Upgrade in upstream and in CI Invite folks from other DFG to share, raise issues : https://etherpad.openstack.org/p/tripleo-upgrade-squad-meeting | 10:30 |
gfidente | we should test and bump the version if newer works | 10:30 |
jfrancoa | ccamacho: here you have the upgrades upstream meeting etherpad https://etherpad.openstack.org/p/tripleo-upgrade-squad-meeting | 10:30 |
ccamacho | tomorrow 3:30 CET | 10:30 |
ccamacho | jfrancoa thanks! | 10:30 |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: Minor updates of pre-provisioned envrionments. https://review.openstack.org/583913 | 10:31 |
Tengu | ccamacho: cool, I should be available. Adding the topic. | 10:31 |
chkumar|ruck | gfidente: I think during undercloud install they might be pulled from upstream to the undercloud registry so I thought to use it | 10:31 |
chkumar|ruck | gfidente: https://review.openstack.org/#/c/549216/ added it from the undercloud registry earlier, but it was removed | 10:32 |
rnoriega | gfidente, if there is a mapping between openstack version - ceph version. Why not using latest-$ceph_version ?? | 10:32 |
rnoriega | gfidente, at development cycle, of course. | 10:33 |
gfidente | rnoriega basically because neither ceph-container nor ceph-ansible upstream releases are tested (yet?) with tripleo | 10:35 |
gfidente | rnoriega and they broke more frequently than we wanted | 10:35 |
gfidente | rnoriega so while in theory I agree, mapping to latest is a good idea | 10:35 |
gfidente | rnoriega in practice that broke the entire tripleo ci because of issues that nobody in tripleo could work on | 10:35 |
gfidente | rnoriega hence we decided to pin to known working versions and advance them only after they are tested working | 10:35 |
rnoriega | gfidente, I see, alright. | 10:36 |
gfidente | rnoriega we have DNM submissions to test newer versions of both | 10:36 |
rnoriega | gfidente, just wanted to understand the pipeline. This is for OPNFV Apex (tripleo) where we use the tag: build-master-luminous-centos-7 | 10:36 |
gfidente | https://review.openstack.org/#/c/501987/ and https://review.openstack.org/#/c/562213/ | 10:36 |
rnoriega | gfidente, and people are asking about why not using new container images... and not 8 months old ones... | 10:37 |
gfidente | rnoriega do you need to override our pin for particular reasons? | 10:37 |
gfidente | rnoriega yeah we could bump up the tags if the tests pass | 10:37 |
*** moshele has joined #tripleo | 10:37 | |
gfidente | rnoriega we can try now | 10:37 |
rnoriega | gfidente, usually, the OPNFV community dictates which versions are meant to be included in a release... | 10:37 |
gfidente | what version of ceph? | 10:38 |
rnoriega | gfidente, like, Openstack Queens + OpenDaylight Oxygen + Ceph Luminous... etc | 10:38 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: make reproducer bash syntax more portable https://review.openstack.org/581012 | 10:38 |
rnoriega | gfidente, I'm not sure, I have to ask, but I think it's luminous... | 10:38 |
gfidente | well, we obviously point to luminous, the question is what version of luminous | 10:38 |
gfidente | but all tags include luminous | 10:38 |
rnoriega | gfidente, mmmm... specific version, that I don't know. | 10:39 |
gfidente | right, so I think you should stick with the tag we tested | 10:39 |
rnoriega | gfidente, ok! | 10:39 |
rnoriega | gfidente, thanks for the tip! :-) | 10:39 |
gfidente | but we can at the same time bump up v3.0.3 to v3.0.6 | 10:40 |
gfidente | if it passes tripleo/ci | 10:40 |
gfidente | I'll add you to the submission which tests this | 10:40 |
*** ramishra has quit IRC | 10:40 | |
gfidente | note that 3.0.3 is the version of the *container image* | 10:40 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart-extras master: manage-stack: add env variables in info gathering https://review.openstack.org/583916 | 10:40 |
gfidente | not the version of ceph itself | 10:40 |
*** kopecmartin has quit IRC | 10:41 | |
rnoriega | gfidente, yes please, include me. Thanks! | 10:41 |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: Adjust templating for upgrade scripts. https://review.openstack.org/583917 | 10:41 |
gfidente | rnoriega and v3.0.3 is not 8 months old, but it's dated apr 17th | 10:42 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: DO-NOT-MERGE Test new ceph-container builds https://review.openstack.org/562213 | 10:42 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: DO-NOT-MERGE Test new ceph-container builds https://review.openstack.org/562213 | 10:42 |
rnoriega | gfidente, I meant the build-master-luminous-centos: https://hub.docker.com/r/ceph/daemon/builds/ | 10:42 |
gfidente | rnoriega oh that one is probably an old tag we don't use anyway | 10:43 |
gfidente | note I added you to https://review.openstack.org/#/c/562213/ which is testing 3.0.6 | 10:43 |
gfidente | 22 days old | 10:43 |
rnoriega | gfidente, isn't there a "generic" tag that points to the version you pick? like current-rdo that points to a hash-tag | 10:44 |
rnoriega | gfidente, even if it's just pinned | 10:44 |
gfidente | rnoriega you mean why we don't pin last-known-working version in ceph-container repo vs tripleo repo? | 10:44 |
rnoriega | gfidente, yep | 10:45 |
gfidente | rnoriega right | 10:45 |
gfidente | rnoriega there isn't because ceph last known working is not tested with tripleo | 10:45 |
gfidente | rnoriega so it is in tripleo that we maintain what is known to work with tripleo | 10:45 |
gfidente | same reason why we don't point to -latest | 10:46 |
rnoriega | gfidente, well, make sense. | 10:46 |
gfidente | rnoriega but this is indeed interesting topic | 10:46 |
gfidente | and applies to ceph-ansible as well | 10:46 |
gfidente | we discussed a while ago with weshay if/how we could gate ceph-ansible and ceph-container changes with a tripleo job | 10:46 |
rnoriega | gfidente, consuming external components from tripleo perspective increases complexity | 10:46 |
ccamacho | tENGO DONE :) | 10:46 |
rnoriega | gfidente, not complaining :-) | 10:46 |
ccamacho | Tengu done ** wrong caps | 10:46 |
gfidente | rnoriega I think we can offer infrastructure where to run the tripleo job with "pending" ceph-ansible and ceph-container code | 10:47 |
gfidente | rnoriega but the mechanics of setting up in github triggers for zuul on pull requests are not resolved yet | 10:47 |
gfidente | rnoriega so right now ceph continues to test both ceph-ansible and ceph-container with their test suite | 10:48 |
rnoriega | gfidente, maybe OPNFV could be a good place to test it without breaking the whole tripleo CI. We'd have to discuss it with trozet | 10:48 |
gfidente | rnoriega and tripleo/ci picks them up for testing what is considered stable | 10:48 |
gfidente | *when considered | 10:48 |
rnoriega | gfidente, ok | 10:48 |
gfidente | except they not always are :D | 10:48 |
rnoriega | hahha | 10:48 |
gfidente | rnoriega sure yes | 10:48 |
*** links has quit IRC | 10:49 | |
gfidente | rnoriega to be honest | 10:50 |
gfidente | current approach proved to be much more stable for tripleo devs | 10:50 |
gfidente | we rarely had outages due to breakages in ceph components | 10:50 |
gfidente | when we had any, it was likely misconfig or stuff we could fix in tripleo, but not breakages in a ceph component outside our direct control | 10:50 |
gfidente | which in the past (around pike) was more the case instead | 10:51 |
rnoriega | gfidente, I see | 10:51 |
rnoriega | gfidente, reading your blogpost about ceph-container, ceph-ansible and openstack :-) | 10:52 |
*** jfrancoa is now known as jfrancoa|lunch | 10:52 | |
gfidente | rnoriega yeah that is the surface | 10:52 |
gfidente | of the whole thing | 10:52 |
rnoriega | gfidente, going for lunch now, thanks for the insights! :-) | 10:53 |
gfidente | rnoriega cool and nice if we can share knowledge about all this stuff | 10:53 |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates master: mistral_engine container added /usr/share volume https://review.openstack.org/583877 | 10:53 |
*** amoralej is now known as amoralej|lunch | 10:54 | |
*** ukalifon has joined #tripleo | 10:56 | |
*** salmankhan has quit IRC | 10:57 | |
*** salmankhan has joined #tripleo | 10:58 | |
*** dhill_ has joined #tripleo | 10:58 | |
*** agurenko has quit IRC | 11:02 | |
*** agurenko has joined #tripleo | 11:05 | |
*** ooolpbot has joined #tripleo | 11:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 11:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 11:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 11:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782438 | 11:10 |
*** ooolpbot has quit IRC | 11:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 11:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 11:10 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [Critical,New] | 11:10 |
openstackgerrit | Luigi Toscano proposed openstack/tripleo-heat-templates master: WIP Deploy Sahara with unversioned endpoints https://review.openstack.org/583890 | 11:10 |
*** links has joined #tripleo | 11:12 | |
*** morazi has quit IRC | 11:17 | |
*** quiquell is now known as quiquell|lunch | 11:21 | |
*** udesale__ has quit IRC | 11:28 | |
*** agopi is now known as agopi|brb | 11:32 | |
*** abishop has joined #tripleo | 11:33 | |
*** kopecmartin has joined #tripleo | 11:34 | |
*** pchavva has joined #tripleo | 11:35 | |
*** rh-jelabarre has joined #tripleo | 11:37 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: include domain name into clouds.yaml https://review.openstack.org/582546 | 11:38 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-common master: Support for ARA report for ansible playbooks in deploy https://review.openstack.org/565077 | 11:39 |
*** moguimar has joined #tripleo | 11:42 | |
*** med_ has joined #tripleo | 11:47 | |
*** med_ has quit IRC | 11:47 | |
*** med_ has joined #tripleo | 11:47 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Enable support for running refstack tests in TQE https://review.openstack.org/570719 | 11:48 |
Tengu | ccamacho: great, thanks :). | 11:49 |
*** shardy has joined #tripleo | 11:50 | |
*** lblanchard has joined #tripleo | 11:51 | |
Tengu | ccamacho: thanks a lot for your feedback - will do some corrections for the nits, and will try to reflect your thoughts for the rest :). | 11:54 |
*** pdeore has quit IRC | 11:54 | |
openstackgerrit | Clark Chen proposed openstack/tripleo-ha-utils master: Filter "Starting" and "Stopping" keywords when deleting resources Sometimes the some resource can be in Starting status and resourceid will return "Starting", which will fail to uninstall https://review.openstack.org/583939 | 11:55 |
*** mcornea has joined #tripleo | 11:55 | |
*** moguimar has quit IRC | 11:57 | |
ccamacho | Tengu! Awesome but they are mostly suggestions :) we can speak about it in the upstream call :) | 11:57 |
ccamacho | thank you for the spec proposal | 11:57 |
Tengu | ccamacho: ok, we can see that tomorrow then :). | 11:57 |
Tengu | will just push the nits part, because, well, nits. | 11:58 |
*** athomas has joined #tripleo | 11:58 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-specs master: Validation Framework specifications. https://review.openstack.org/583475 | 11:58 |
Tengu | so no need to review, no real change -^ | 11:58 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: Move to openshift-ansible 3.10 https://review.openstack.org/582495 | 12:00 |
*** shreshtha has quit IRC | 12:00 | |
chkumar|ruck | mandre: https://review.openstack.org/#/c/583940/ | 12:01 |
chkumar|ruck | regarding tempest user in a container | 12:01 |
*** dtantsur|bbl is now known as dtantsur | 12:02 | |
*** agopi|brb has quit IRC | 12:02 | |
*** trown|outtypewww is now known as trown | 12:03 | |
*** thrash|g0ne is now known as thrash | 12:04 | |
*** amoralej|lunch is now known as amoralej | 12:04 | |
trown | chkumar|ruck: I was able to reproduce that issue with scenario009 ... havent got to the bottom of it yet | 12:04 |
*** jfrancoa|lunch is now known as jfrancoa | 12:04 | |
chkumar|ruck | arxcruz: it is related to this bug https://bugzilla.redhat.com/show_bug.cgi?id=1603176 | 12:05 |
openstack | bugzilla.redhat.com bug 1603176 in rhosp-director "[OSP14][Containerized Undercloud] tempest_init_logs docker container exited with exited code!=0 "chown: invalid user: 'tempest:tempest'"" [Low,New] - Assigned to rhos-maint | 12:05 |
chkumar|ruck | arxcruz: it is not related to scenario002 issue | 12:06 |
chkumar|ruck | trown: great | 12:06 |
arxcruz | chkumar|ruck: but we use this docker image right? | 12:07 |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart master: Add fs055 to run refstack tests https://review.openstack.org/570884 | 12:07 |
openstackgerrit | Chandan Kumar proposed openstack-infra/tripleo-ci master: Add FS055 job as experimental to run refstack tests https://review.openstack.org/570892 | 12:07 |
chkumar|ruck | arxcruz: yes, but the tempest user will be used only when the --user tempest flag is passed to docker run | 12:08 |
chkumar|ruck | arxcruz: it does not affect the tempest container | 12:09 |
trown | mandre: have you seen any issue where mistral is failing to access files in /usr/share/ansible/openshift-ansible? scenario009 job is failing on it, and even testing with flaper87 patch that bumps openshift-ansible version I hit the same thing | 12:09 |
arxcruz | chkumar|ruck: please add these comments on the patch and i'll change my vote | 12:09 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart-extras master: ovb-manage: save generated idnum to yaml file https://review.openstack.org/583944 | 12:09 |
*** leanderthal has quit IRC | 12:10 | |
*** ooolpbot has joined #tripleo | 12:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782438 | 12:10 |
*** ooolpbot has quit IRC | 12:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 12:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 12:10 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [Critical,New] | 12:10 |
flaper87 | trown: that's fixed with bogdando patch | 12:10 |
flaper87 | trown: https://review.openstack.org/#/c/583136/ | 12:10 |
flaper87 | trown: basically, the new containerized mistral doesn't have the osa playbooks installed in it | 12:11 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: Add pauch to the includepkgs https://review.openstack.org/579223 | 12:11 |
flaper87 | we have to bindmount them from the base os | 12:11 |
trown | flaper87: ah makes sense | 12:11 |
EmilienM | hellow | 12:12 |
flaper87 | EmilienM: hellow tow youw toow | 12:12 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-heat-templates master: Add secondary DNS server to disable-unbound environment https://review.openstack.org/582164 | 12:12 |
*** pdeore has joined #tripleo | 12:12 | |
EmilienM | flaper87: :) | 12:12 |
flaper87 | trown: feel free to base your patch on top of https://review.openstack.org/#/c/583136/ instead of mine | 12:13 |
flaper87 | trown: we can merge yours before mine lands | 12:13 |
trown | flaper87: I am using it to test 3.10 though | 12:13 |
flaper87 | trown: oh, nvm then | 12:13 |
flaper87 | :) | 12:13 |
trown | flaper87: my patch is not actually needed by CI, so kind of lower priority... It may not even be needed by most people locally | 12:14 |
trown | my ISP just doesnt like 1.1.1.1 | 12:14 |
chkumar|ruck | flaper87: ah, mandre passed me the same patch last night, need to test that | 12:14 |
*** morazi has joined #tripleo | 12:19 | |
flaper87 | trown: oh, mmh, silly isp | 12:20 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-quickstart-extras master: WIP - Adds new bootstrap-subnodes role instead of tripleo.sh https://review.openstack.org/581026 | 12:20 |
*** yrabl has joined #tripleo | 12:21 | |
*** holser_ has joined #tripleo | 12:22 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common stable/queens: Use hostnames in inventory https://review.openstack.org/583285 | 12:22 |
openstackgerrit | James Slagle proposed openstack/tripleo-common stable/queens: Fix dynamic inventory https://review.openstack.org/583286 | 12:22 |
openstackgerrit | James Slagle proposed openstack/tripleo-common stable/queens: Include 'tripleo_role_name' in the inventory https://review.openstack.org/583287 | 12:22 |
*** marrusl has quit IRC | 12:22 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common stable/queens: Remove role_data from inventory https://review.openstack.org/583949 | 12:22 |
*** holser_ has quit IRC | 12:24 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: git integration for GetOvercloudConfig action https://review.openstack.org/579634 | 12:24 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Use /var/lib/mistral/<plan-name> as config-download dir https://review.openstack.org/579635 | 12:24 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Update failures listing to use latest ansible-errors.json location https://review.openstack.org/583293 | 12:24 |
*** raildo has joined #tripleo | 12:24 | |
*** holser_ has joined #tripleo | 12:25 | |
*** yrabl is now known as liverpooler | 12:25 | |
*** cmyster has joined #tripleo | 12:25 | |
*** cmyster has joined #tripleo | 12:25 | |
*** tzumainn has joined #tripleo | 12:25 | |
chkumar|ruck | mandre: regarding tempest tht changes, i think it is ok https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/tempest.yaml#L53 ? | 12:27 |
slagle | EmilienM: d0ugal : could I have another look at this series: https://review.openstack.org/#/q/topic:bug/1779093+(status:open+OR+status:merged) | 12:28 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Use undercloud templates path for UC deployments https://review.openstack.org/583950 | 12:30 |
*** ratailor has quit IRC | 12:33 | |
bogdando | jfrancoa: https://review.openstack.org/#/c/583950/ | 12:33 |
EmilienM | slagle: ack | 12:33 |
bogdando | please use that topic for your fs51 fixes | 12:34 |
openstackgerrit | Cédric Jeanneret proposed openstack/puppet-tripleo master: Corrected vrrp script for haproxy status https://review.openstack.org/583886 | 12:35 |
*** moshele has quit IRC | 12:35 | |
*** ssbarnea1 has quit IRC | 12:36 | |
bogdando | trozet, janki, dsneddon, bfournie: hi, PTAL rebased https://review.openstack.org/#/c/575122/ backport | 12:38 |
openstackgerrit | Cédric Jeanneret proposed openstack/puppet-tripleo master: Corrected vrrp script for haproxy status https://review.openstack.org/583886 | 12:39 |
*** ssbarnea has joined #tripleo | 12:39 | |
*** v1a4 has quit IRC | 12:40 | |
*** rlandy has joined #tripleo | 12:40 | |
*** Guest72952 is now known as portdirect | 12:41 | |
*** edmondsw has joined #tripleo | 12:41 | |
*** noslzzp has quit IRC | 12:45 | |
*** noslzzp has joined #tripleo | 12:45 | |
*** quiquell|lunch is now known as quiquell | 12:45 | |
*** ramishra has joined #tripleo | 12:55 | |
*** eck`gone is now known as eck` | 12:55 | |
*** iranzo is now known as iranzo|AFK | 12:56 | |
weshay | log server is down | 12:56 |
weshay | jobs will fail | 12:56 |
*** weshay changes topic to "Welcome to Rocky. CI status: http://logs.openstack.org is down RED RED RED | https://docs.openstack.org/tripleo-docs/latest" | 12:57 | |
weshay | EmilienM, ^ | 12:57 |
dpeacock | florianf: EmilienM: https://review.openstack.org/#/c/577397/ is ready for you again please :-) | 12:57 |
EmilienM | weshay: no... did you report to infra? | 12:57 |
weshay | -openstackstatus/#openstack-infra- NOTICE: logs.openstack.org is offline, causing POST_FAILURE results from Zuul. Cause and resolution timeframe currently unknown. | 12:57 |
dpeacock | I really wanna get this merged since M3 is looming and I'd like this to make it. | 12:57 |
weshay | they are on it | 12:57 |
EmilienM | ah it's from infra | 12:57 |
EmilienM | ok, thanks for the headsup | 12:57 |
EmilienM | dpeacock: ack | 12:58 |
*** psahoo has quit IRC | 12:58 | |
-openstackstatus- NOTICE: logs.openstack.org is offline, causing POST_FAILURE results from Zuul. Cause and resolution timeframe currently unknown. | 12:59 | |
*** ChanServ changes topic to "logs.openstack.org is offline, causing POST_FAILURE results from Zuul. Cause and resolution timeframe currently unknown." | 12:59 | |
*** moshele has joined #tripleo | 13:00 | |
honza | EmilienM: weshay: could i get your help with https://bugs.launchpad.net/tripleo/+bug/1782438 ? tripleo-ui is broken on oooq master | 13:02 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [Critical,New] | 13:03 |
*** iranzo|AFK is now known as iranzo | 13:03 | |
EmilienM | dpeacock: no -1 but a comment | 13:03 |
honza | EmilienM: weshay: i'm happy to do the work, i just don't really know where to look | 13:03 |
EmilienM | dpeacock: I'm happy to iterate later though, just let me know what you prefer; | 13:03 |
EmilienM | honza: looking now, one sec | 13:03 |
dpeacock | EmilienM: thnaks | 13:03 |
arxcruz | EmilienM: https://review.openstack.org/#/c/583659/ fix https://bugs.launchpad.net/tripleo/+bug/1773325 :) | 13:03 |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 13:03 |
dpeacock | *thanks | 13:03 |
EmilienM | arxcruz: +A, thx for the fix | 13:04 |
EmilienM | honza: it's really weird, I've been testing the UI for a while now and everything works fine | 13:04 |
EmilienM | but I don't use quickstart. | 13:04 |
EmilienM | it shouldn't matter, tbh | 13:04 |
honza | EmilienM: there's the problem | 13:05 |
dpeacock | EmilienM: lol scope creep - ok this might delay things - let me look into it - I was hoping to get *something* through soon - I'll come back to you | 13:05 |
dpeacock | :-) | 13:05 |
EmilienM | oh actually yes it matters | 13:05 |
honza | Is minimal.yml being exercised by ci anywhere? | 13:05 |
EmilienM | dpeacock: I'm happy to +2 it now, if you say we can do it later | 13:05 |
weshay | honza, /me looks | 13:05 |
EmilienM | dpeacock: I've +2-ed, but I want someone from validation to make a final review for the code structure. florianf / gchamoul at least | 13:06 |
mandre | chkumar|ruck: how are you running the tempest container? | 13:07 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Use global ansible.cfg for nodes-uuid playbook https://review.openstack.org/583552 | 13:07 |
mandre | chkumar|ruck: the container you want to run with the 'tempest' user | 13:07 |
chkumar|ruck | mandre: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/validate-tempest/templates/run-tempest.sh.j2#L47 | 13:08 |
weshay | chkumar|ruck, sshnaidm|rover can you guys help honza ? | 13:08 |
chkumar|ruck | mandre: setting enable_tempest to true in undercloud.conf pulls the container on the undercloud and creates /var/log/container tempest | 13:08 |
weshay | https://bugs.launchpad.net/tripleo/+bug/1782438 | 13:08 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [High,Triaged] | 13:08 |
mandre | chkumar|ruck: this is where you need to add the '--user tempest' to the docker command | 13:08 |
dpeacock | EmilienM: Thank you Sir; I'll make a followup patch with your suggestion right after | 13:09 |
EmilienM | dpeacock: cool cool | 13:09 |
chkumar|ruck | mandre: yes, but waiting for kolla patch to merge | 13:09 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-common master: WIP - Group sudoers by aliases https://review.openstack.org/583956 | 13:09 |
EmilienM | dpeacock: credits to Mr Alex for the code https://review.openstack.org/#/c/569153/25/common/deploy-steps-tasks.yaml | 13:09 |
*** ooolpbot has joined #tripleo | 13:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 13:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 13:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 13:10 |
*** ooolpbot has quit IRC | 13:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 13:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 13:10 |
dpeacock | EmilienM: Excellent! | 13:10 |
mandre | chkumar|ruck: you don't have to wait for kolla patch to merge - in the kolla patch you set the default user for the image, but you can force the user in the run-tempest.sh script if you want to test it | 13:10 |
chkumar|ruck | mandre: sure | 13:10 |
chkumar|ruck | mandre: I will prepare a patch for the same | 13:10 |
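The suggestion above (forcing the user on the `docker run` command line instead of waiting for the image default to change) can be sketched roughly as follows. This is only an illustration of the `--user` flag, not the actual run-tempest.sh change; the helper name, the image name and the command are hypothetical.

```python
import shlex


def build_docker_run(image, command, user=None):
    """Return an argv list for `docker run`, optionally forcing the user."""
    argv = ["docker", "run", "--rm"]
    if user:
        # --user overrides whatever default user the image was built with.
        argv += ["--user", user]
    argv.append(image)
    argv += shlex.split(command)
    return argv


# Hypothetical image name and command, for illustration only.
cmd = build_docker_run("example/tempest:latest", "tempest run --smoke", user="tempest")
print(" ".join(shlex.quote(part) for part in cmd))
```

Leaving `user` unset keeps the image's own default, which is what the pending kolla patch would change.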
*** moshele has quit IRC | 13:10 | |
mandre | IIUC this is what arxcruz was asking for | 13:11 |
mandre | chkumar|ruck: what is the issue with scenario002? | 13:11 |
chkumar|ruck | mandre: container check script does not check for package update in tempest container | 13:12 |
*** pradk has quit IRC | 13:12 | |
chkumar|ruck | mandre: waiting for logs.openstack.org to come back up, then I will explain the issue | 13:12 |
mandre | ok so it's unrelated to the user change, right? | 13:12 |
*** jtomasek has joined #tripleo | 13:13 | |
chkumar|ruck | mandre: yup | 13:13 |
chkumar|ruck | mandre: that is totally a different story | 13:13 |
arxcruz | mandre: according to chkumar|ruck it is unrelated, and I'm trusting his word | 13:13 |
*** dprince has joined #tripleo | 13:14 | |
*** agopi has joined #tripleo | 13:15 | |
*** udesale has joined #tripleo | 13:18 | |
*** toure|gone is now known as toure | 13:19 | |
mandre | chkumar|ruck: ah the issue is caused by the tempest container not having the latest package? | 13:20 |
*** artom has joined #tripleo | 13:20 | |
chkumar|ruck | mandre: yup | 13:20 |
chkumar|ruck | mandre: so on this one https://review.openstack.org/573220 I have added a depends-on to a patch from python-tempestconf | 13:21 |
chkumar|ruck | the tempest container should be updated with the python-tempestconf package built by quickstart's DLRN, but that is not happening | 13:22 |
chkumar|ruck | I am not sure where the gotcha is | 13:22 |
openstackgerrit | Tim Rozet proposed openstack/puppet-tripleo stable/queens: Remove table 17 from OVS OF pipeline sync https://review.openstack.org/583009 | 13:23 |
openstackgerrit | Tim Rozet proposed openstack/puppet-tripleo stable/queens: Updates OpenDaylight HA Proxy backend check https://review.openstack.org/581790 | 13:24 |
mandre | chkumar|ruck: do you store the tempest image in the local registry on the undercloud? or are you pulling it via the 'docker run' command in run-tempest.sh | 13:27 |
chkumar|ruck | mandre: it is getting pulled into the local registry | 13:28 |
chkumar|ruck | mandre: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/validate-tempest/defaults/main.yml#L26 | 13:28 |
chkumar|ruck | which references the local registry | 13:28 |
*** mjturek has joined #tripleo | 13:29 | |
chkumar|ruck | mandre: I might be doing something wrong there | 13:30 |
yolanda_ | hi, good afternoon. I have a question.. i'm trying to deploy tripleo, queens, but without external network. | 13:32 |
yolanda_ | and i keep getting an error, it keeps asking me about ExternalNetName parameter | 13:32 |
yolanda_ | what shall i provide there? i don't have external routable network in my setup | 13:32 |
slagle | quiquell: sshnaidm|rover : do you know you're both working on similar approaches with https://review.openstack.org/#/c/583536/ and https://review.openstack.org/#/c/565077/ | 13:34 |
sshnaidm|rover | slagle, yes :) | 13:34 |
slagle | ok | 13:35 |
*** iranzo is now known as iranzo|AFK | 13:36 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: User tempest user with tempest container https://review.openstack.org/583961 | 13:36 |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Use tempest user with tempest container https://review.openstack.org/583961 | 13:37 |
*** iranzo|AFK is now known as iranzo | 13:38 | |
*** moguimar has joined #tripleo | 13:39 | |
*** bfournie has quit IRC | 13:42 | |
*** ChanServ changes topic to "Welcome to Rocky. CI status: http://logs.openstack.org is down RED RED RED | https://docs.openstack.org/tripleo-docs/latest" | 13:43 | |
-openstackstatus- NOTICE: logs.openstack.org is back on-line. Changes with "POST_FAILURE" job results should be rechecked. | 13:43 | |
trown | flaper87: I am seeing the webconsole pod failing to schedule with 3.10 ... is that a familiar issue? | 13:46 |
*** suuuper has quit IRC | 13:46 | |
*** suuuper has joined #tripleo | 13:46 | |
*** ccamacho has quit IRC | 13:49 | |
*** ccamacho1 has joined #tripleo | 13:49 | |
*** nyechiel has quit IRC | 13:51 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: Enable deep_compare of pcmk resources by default https://review.openstack.org/581416 | 13:52 |
*** raildo has quit IRC | 13:53 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Make sure that stonith state is enforced before attempting a scaleup https://review.openstack.org/582521 | 13:54 |
*** raildo has joined #tripleo | 13:54 | |
*** iranzo is now known as iranzo|AFK | 13:54 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Make sure that stonith state is enforced before attempting a scaleup https://review.openstack.org/582521 | 13:55 |
flaper87 | trown: when that one fails it's prob because the nodes are not ready | 13:55 |
trown | flaper87: hmm so a bit racy? | 13:55 |
flaper87 | get in the master node, run oc get pods and see if the pods there are not scheduled | 13:55 |
flaper87 | trown: no, more likely a mislabeled node (or at least that was my problem) | 13:55 |
flaper87 | trown: are you seeing this in your local env? | 13:56 |
trown | flaper87: ya | 13:56 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: avoid Inappropriate ioctl for device with pause module https://review.openstack.org/583965 | 13:56 |
flaper87 | trown: I can help looking into it | 13:56 |
trown | flaper87: ah ya all the pods are in Pending | 13:56 |
openstackgerrit | Sagi Shnaidman proposed openstack/python-tripleoclient master: Support ARA report tracking from command line https://review.openstack.org/583799 | 13:57 |
*** iranzo|AFK is now known as iranzo | 13:57 | |
flaper87 | trown: so, likely what I said. I can help looking into this if you give me access. Make sure you have nodes labeled as infra | 13:58 |
flaper87 | trown: oc get node $NODE | 13:58 |
*** moguimar has quit IRC | 13:59 | |
*** gfidente has quit IRC | 14:00 | |
*** gfidente has joined #tripleo | 14:01 | |
*** gfidente has quit IRC | 14:01 | |
*** gfidente has joined #tripleo | 14:01 | |
trown | flaper87: I am looking at ROLES there? only compute | 14:02 |
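The check discussed above (whether any node carries the infra role) can be automated by reading the ROLES column of the `oc get nodes` table. The sketch below assumes the usual `NAME STATUS ROLES AGE VERSION` header; the helper name and the sample rows are hypothetical and are not output from trown's environment.

```python
def nodes_with_role(oc_get_nodes_output, role):
    """Return names of nodes whose ROLES column contains `role`."""
    lines = [ln for ln in oc_get_nodes_output.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    roles_idx = header.index("ROLES")
    matches = []
    for line in lines[1:]:
        cols = line.split()
        # A node can carry several roles, shown comma-separated.
        if role in cols[roles_idx].split(","):
            matches.append(cols[name_idx])
    return matches


# Hypothetical sample table, for illustration only.
SAMPLE = """\
NAME        STATUS    ROLES     AGE       VERSION
master-0    Ready     master    1h        v1.10.0
worker-0    Ready     compute   1h        v1.10.0
"""

if not nodes_with_role(SAMPLE, "infra"):
    print("no node has the infra role; infra pods may stay Pending")
```

An empty result for `infra` would match the symptom in the log, where all pods sit in Pending and only a compute role is visible.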
*** morazi has quit IRC | 14:08 | |
*** ooolpbot has joined #tripleo | 14:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 14:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 14:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 14:10 |
*** ooolpbot has quit IRC | 14:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 14:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 14:10 |
*** bkopilov has quit IRC | 14:10 | |
*** cshastri has quit IRC | 14:12 | |
*** morazi has joined #tripleo | 14:13 | |
trown | flaper87: is this openshift_node_group_name supposed to end up as "node-config-infra" http://paste.openstack.org/show/726276/ | 14:14 |
*** janki has quit IRC | 14:15 | |
*** paramite has quit IRC | 14:16 | |
*** janki has joined #tripleo | 14:16 | |
openstackgerrit | Derek Higgins proposed openstack/tripleo-quickstart master: [WIP] Add featureset 054 - overcloud baremetal+ansible-ml2 https://review.openstack.org/579601 | 14:17 |
openstackgerrit | Derek Higgins proposed openstack/tripleo-heat-templates master: [WIP] Add scenario 012 - overlcoud baremetal+ansible-ml2 https://review.openstack.org/579603 | 14:18 |
*** bfournie has joined #tripleo | 14:19 | |
*** mdnadeem has quit IRC | 14:25 | |
*** pdeore has quit IRC | 14:26 | |
*** jfrancoa has quit IRC | 14:28 | |
*** med_ has quit IRC | 14:29 | |
honza | EmilienM: weshay: our ovb configs in oooq-extras aren't set up for containerized undercloud; should they be? | 14:30 |
honza | chkumar|ruck: sshnaidm|rover any ideas on https://bugs.launchpad.net/tripleo/+bug/1782438 | 14:30 |
openstack | Launchpad bug 1782438 in tripleo "tripleo ui endpoints misconfigured in containerized undercloud" [High,Triaged] | 14:30 |
*** hamdyk has quit IRC | 14:31 | |
*** quiquell is now known as quiquell|off | 14:32 | |
sshnaidm|rover | honza, well, need to look more, not familiar with that part.. | 14:33 |
sshnaidm|rover | honza, can you point me exactly where quickstart populates this ip? | 14:33 |
*** pradk has joined #tripleo | 14:33 | |
*** rbrady has joined #tripleo | 14:34 | |
*** rbrady has quit IRC | 14:34 | |
*** rbrady has joined #tripleo | 14:34 | |
*** paramite has joined #tripleo | 14:36 | |
*** mdnadeem has joined #tripleo | 14:37 | |
Tengu | florianf: do you have a couple of minutes for a quick question on a validation? | 14:39 |
honza | sshnaidm|rover: all i know is in the bug; the --public-virtual-ip value passed to 'tripleo deploy' is 192.168.24.2 which can't be accessed from outside the virthost. Setting --public-virtual-ip to the virthost's public ip makes undercloud deploy fail (network issue) | 14:40 |
honza | sshnaidm|rover: i don't have the precise error message anymore, unfortunately | 14:41 |
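A quick way to see why 192.168.24.2 is unreachable from outside the virthost is to check whether the address is private and whether it sits inside the undercloud's control-plane subnet. The sketch below uses only the standard `ipaddress` module; the `192.168.24.0/24` default is an assumption based on the address quoted above, and the helper name is hypothetical.

```python
import ipaddress


def describe_vip(vip, ctlplane_cidr="192.168.24.0/24"):
    """Report whether `vip` is private and inside the assumed ctlplane subnet."""
    addr = ipaddress.ip_address(vip)
    net = ipaddress.ip_network(ctlplane_cidr)
    return {
        # True when the VIP belongs to the (assumed) control-plane network.
        "in_ctlplane": addr in net,
        # True for addresses in private/reserved ranges, which are not publicly routable.
        "is_private": addr.is_private,
    }


print(describe_vip("192.168.24.2"))
```

An address that is both private and inside the ctlplane range is only reachable where that network is routed, which is consistent with the UI endpoints failing from outside the virthost.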
*** dparkes has quit IRC | 14:41 | |
*** arxcruz has quit IRC | 14:41 | |
flaper87 | trown: sorry, had to go afk for a bit | 14:41 |
flaper87 | back now | 14:41 |
trown | flaper87: no worries | 14:41 |
honza | sshnaidm|rover: i'm re-running the script now | 14:42 |
sshnaidm|rover | honza, can you paste in bug your reproducing steps? | 14:42 |
honza | sshnaidm|rover: done | 14:43 |
*** arxcruz has joined #tripleo | 14:44 | |
*** bogdando has quit IRC | 14:44 | |
*** leanderthal has joined #tripleo | 14:46 | |
*** jfrancoa has joined #tripleo | 14:46 | |
*** yprokule has quit IRC | 14:47 | |
*** morazi has quit IRC | 14:48 | |
*** morazi has joined #tripleo | 14:54 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-quickstart-extras master: Adds reproducer check+exit+warning dependencies - virtualenv+others https://review.openstack.org/578081 | 14:55 |
*** janki has quit IRC | 14:55 | |
*** ratailor has joined #tripleo | 14:56 | |
owalsh | EmilienM, slagle, dprince: if you get a chance would really appreciate reviews on https://review.openstack.org/577855 | 14:56 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: Move to openshift-ansible 3.10 https://review.openstack.org/582495 | 14:58 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates master: WIP use openshift-ansible container instead of RPMs https://review.openstack.org/583868 | 14:58 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-heat-templates master: Add secondary DNS server to disable-unbound environment https://review.openstack.org/582164 | 14:58 |
*** leanderthal has quit IRC | 15:02 | |
*** dtantsur is now known as dtantsur|afk | 15:02 | |
*** ykarel is now known as ykarel|away | 15:04 | |
slagle | owalsh: i'll take a look | 15:04 |
owalsh | slagle: thanks | 15:04 |
*** weshay changes topic to "Welcome to Rocky. CI status: GREEN | https://docs.openstack.org/tripleo-docs/latest" | 15:06 | |
*** jfrancoa has quit IRC | 15:08 | |
EmilienM | owalsh: ack | 15:08 |
*** jfrancoa has joined #tripleo | 15:08 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: Run bashate via pre-commit https://review.openstack.org/583984 | 15:09 |
EmilienM | owalsh: ouch, it'll take time | 15:09 |
EmilienM | owalsh: I wasn't a fan of having these scripts in THT... | 15:10 |
weshay | EmilienM, just confirming.. this is what you want to see w/ the new healthchecks in the undercloud | 15:10 |
weshay | http://logs.openstack.org/55/573255/4/gate/tripleo-ci-centos-7-containers-multinode/91fb9e6/logs/undercloud/home/zuul/undercloud_install.log.txt.gz#_2018-07-19_14_42_21 | 15:10 |
EmilienM | owalsh: but I don't have anything better in my mind now | 15:10 |
openstackgerrit | Bob Fournier proposed openstack/tripleo-common stable/queens: ensure unique ironic node ID with UCS driver https://review.openstack.org/583985 | 15:10 |
*** ooolpbot has joined #tripleo | 15:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782165 | 15:10 |
*** ooolpbot has quit IRC | 15:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 15:10 |
openstack | Launchpad bug 1782165 in tripleo "ntp servers blocked. Can't deploy undercloud with oooq due to wrong NTP configuration" [Critical,Incomplete] - Assigned to wes hayutin (weshayutin) | 15:10 |
owalsh | EmilienM: yea, these were always a quick hack that turned out to be far too useful :-) | 15:10 |
EmilienM | weshay: yes, this is Alex's patch that is in effect :D | 15:10 |
*** ccamacho1 has quit IRC | 15:10 | |
weshay | k | 15:10 |
EmilienM | weshay: we need to get someone from nova (owalsh?) to look at this one | 15:10 |
EmilienM | but the container isn't healthy in this job | 15:11 |
*** leanderthal has joined #tripleo | 15:11 | |
EmilienM | weshay: is it in all jobs or randomly? | 15:11 |
* EmilienM in a meeting, can't look now | 15:11 | |
* owalsh in a meeting too | 15:11 | |
*** pcaruana has quit IRC | 15:12 | |
weshay | arxcruz++ | 15:12 |
openstackgerrit | Ryan Brady proposed openstack/tripleo-common master: Makes sorting environments with capabilities-map optional https://review.openstack.org/582233 | 15:16 |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart master: Add fs055 to run refstack tests https://review.openstack.org/570884 | 15:16 |
*** mrunge_ is now known as mrunge | 15:16 | |
florianf | Tengu: yup. still there? | 15:20 |
Tengu | florianf: yep :) | 15:20 |
*** iranzo is now known as iranzo|AFK | 15:20 | |
Tengu | florianf: I was wondering if you have any thoughts about weshay's comment on this change: https://review.openstack.org/#/c/582917/ | 15:20 |
slagle | EmilienM: owalsh : i'm not a fan of the script in tht either, but it seems reasonable for now I suppose. as we move towards ansible roles and we can move the deploy tasks out of tht, maybe the script(s) could get moved into the role | 15:20 |
EmilienM | +1 with slagle | 15:21 |
EmilienM | owalsh: how does that sound to you? | 15:21 |
Tengu | florianf: was about to post a question on that topic on openstack-dev anyway. but as you're on validations, maybe you have a first word :). | 15:21 |
*** gkadam-brb is now known as gkadam | 15:21 | |
dpeacock | florianf: hey - any chance you are ready to +2 https://review.openstack.org/#/c/577397/ please? | 15:23 |
*** iranzo|AFK has quit IRC | 15:24 | |
*** sshnaidm|rover is now known as sshnaidm|afk | 15:24 | |
florianf | dpeacock: yes almost. It looks fine, but I want to give it one real test run at least. I'm finishing one thing in my dev environment and then I'm ready. | 15:25 |
dpeacock | florianf: of course - thank you - let me know if you need anything - I'm writing up a doc which I haven't submitted yet. It requires the inventory.yaml generated by tripleo-ansible-inventory which is archived in the homedir. | 15:26 |
dpeacock | florianf: it's in the undercloud-install-<timestamp>.tar.bz2 file. | 15:28 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-quickstart-extras master: Adds reproducer check+exit+warning dependencies - virtualenv+others https://review.openstack.org/578081 | 15:28 |
EmilienM | owalsh: +2 with comment. I'll let slagle approve maybe | 15:30 |
florianf | dpeacock: the doc requires the inventory file? | 15:30 |
EmilienM | weshay: is it in all jobs or randomly? | 15:31 |
dpeacock | florianf: the validation playbook does; it needs the ctlplane ip | 15:31 |
florianf | dpeacock: right. | 15:32 |
dpeacock | After thinking about how to get it, Emilien and I settled on grabbing it from the inventory which was provided during the deployment. | 15:32 |
owalsh | EmilienM: thanks, ack, that's the plan | 15:32 |
EmilienM | owalsh: cool. | 15:32 |
weshay | EmilienM, was in a gate failure | 15:33 |
EmilienM | weshay: damn. | 15:33 |
dpeacock | there may be a better way we can think of, and without trying to get too far ahead, I'm going to talk to the dev list people to see if anyone objects to making the inventory file persist on undercloud instead of being archived after deployment (which is the current state) | 15:33 |
dpeacock | florianf: ^ | 15:33 |
EmilienM | weshay: let's prepare a logstash query to see how frequently we hit this one, give me a sec | 15:33 |
weshay | k | 15:33 |
EmilienM | weshay: build_name: *tripleo-ci* AND build_status: FAILURE AND message: "Up 2 minutes (unhealthy)" AND message: "centos-binary-nova-api:current-tripleo-updated" | 15:36 |
*** lvdombrkr has quit IRC | 15:36 | |
florianf | dpeacock: if the validation is run through mistral, it already has access to the inventory information | 15:36 |
EmilienM | weshay: 7 hits today | 15:36 |
EmilienM | not much but still not good | 15:36 |
EmilienM | owalsh: can you help please? (maybe after your meeting if you still have time) | 15:37 |
EmilienM | weshay: do we have a bug already? | 15:37 |
owalsh | EmilienM: yes, will take a look after | 15:37 |
EmilienM | weshay: https://prnt.sc/k8hrjr | 15:38 |
EmilienM | owalsh: thx | 15:38 |
weshay | EmilienM, no.. I'll make one | 15:38 |
EmilienM | k | 15:38 |
EmilienM | owalsh: first look, it seems like the container looks healthy after a while: http://logs.openstack.org/55/573255/4/gate/tripleo-ci-centos-7-containers-multinode/91fb9e6/logs/undercloud/var/log/extra/docker/containers/nova_api/docker_info.log.txt.gz | 15:40 |
*** aufi_ has quit IRC | 15:40 | |
EmilienM | so maybe it's just nova-api taking too long | 15:40 |
*** holser__ has joined #tripleo | 15:40 | |
EmilienM | and we need to increase the timeout in alex's patch: https://review.openstack.org/#/c/569153/25/common/deploy-steps-tasks.yaml | 15:40 |
owalsh | EmilienM: ah, was about to say ^^^ maybe need to bump this in CI | 15:41 |
EmilienM | :( | 15:41 |
EmilienM | retry driven deployment | 15:41 |
*** ykarel|away has quit IRC | 15:41 | |
EmilienM | weshay: please add the query in the bug report, for the record. | 15:43 |
weshay | sure good idea | 15:44 |
*** holser_ has quit IRC | 15:44 | |
*** udesale has quit IRC | 15:46 | |
*** shreshtha has joined #tripleo | 15:46 | |
*** ramishra has quit IRC | 15:47 | |
*** mdnadeem has quit IRC | 15:48 | |
openstackgerrit | James Slagle proposed openstack/tripleo-docs master: Update docs for /var/lib/mistral/<plan name> https://review.openstack.org/584004 | 15:48 |
*** jfrancoa has quit IRC | 15:48 | |
weshay | https://bugs.launchpad.net/tripleo/+bug/1782598 | 15:49 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] | 15:49 |
florianf | dpeacock: hmm. the inventory doesn't contain any info on the undercloud's ctlplane ip in my environment | 15:50 |
dpeacock | florianf: Did you deploy a containerized undercloud? | 15:51 |
dpeacock | florianf: on my system:- | 15:52 |
dpeacock | [vagrant@undercloud ~]$ grep ctlplane_ip undercloud-ansible-roQcHV/inventory.yaml | 15:52 |
dpeacock | ctlplane_ip: 192.168.24.1 | 15:52 |
*** skramaja has quit IRC | 15:52 | |
*** gfidente has quit IRC | 15:53 | |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-quickstart-extras master: Include minimal Browbeat playbook in baremetal playbook https://review.openstack.org/581488 | 15:54 |
*** leanderthal has quit IRC | 15:55 | |
EmilienM | owalsh: I assigned https://bugs.launchpad.net/tripleo/+bug/1782598 to you | 15:57 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] - Assigned to Oliver Walsh (owalsh) | 15:57 |
*** gfidente has joined #tripleo | 15:57 | |
*** gfidente has quit IRC | 15:57 | |
*** gfidente has joined #tripleo | 15:57 | |
*** gfidente^2nd has joined #tripleo | 15:57 | |
owalsh | EmilienM: ack, seems apache just took a couple of minutes to spawn the 4 nova_api procs | 15:58 |
*** kopecmartin has quit IRC | 16:01 | |
florianf | dpeacock: ok, that's probably it then. | 16:01 |
dpeacock | florianf: yeah this is only for containerized undercloud checks | 16:01 |
florianf | dpeacock: I know. But I wonder: should there be some checks if the variable exists? because if it doesn't the validation will break (not just fail). | 16:02 |
*** ukalifon has quit IRC | 16:04 | |
*** gfidente^2nd has quit IRC | 16:05 | |
dpeacock | Sure that sounds reasonable. | 16:05 |
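The existence check florianf asks for could look roughly like the sketch below. It is a minimal illustration, not code from the patch under review: `find_key`, `require_ctlplane_ip`, and the layout of `sample` are hypothetical, and only the `ctlplane_ip` key and its value come from the grep output earlier in the log.

```python
def find_key(node, key):
    """Return the first value stored under `key` in a nested dict/list, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def require_ctlplane_ip(inventory):
    """Fail with a readable message instead of breaking on an undefined variable."""
    ip = find_key(inventory, "ctlplane_ip")
    if ip is None:
        raise ValueError(
            "ctlplane_ip not found in inventory; "
            "this check only applies to a containerized undercloud"
        )
    return ip


# Hypothetical inventory layout, already parsed into Python objects.
sample = {"undercloud": {"hosts": ["localhost"], "vars": {"ctlplane_ip": "192.168.24.1"}}}
print(require_ctlplane_ip(sample))
```

With a guard of this kind a missing variable turns into a clean validation failure with an explanatory message rather than an aborted run.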
florianf | dpeacock: I added a comment to the patch | 16:06 |
dpeacock | florianf: much appreciated - thank you Sir :-) | 16:07 |
florianf | dpeacock: thank *you*! :) | 16:07 |
*** avivgt|lunch has quit IRC | 16:08 | |
*** ooolpbot has joined #tripleo | 16:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782598 | 16:10 |
*** ooolpbot has quit IRC | 16:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 16:10 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] - Assigned to Oliver Walsh (owalsh) | 16:10 |
*** shardy has quit IRC | 16:11 | |
*** sshnaidm|afk is now known as sshnaidm|rover | 16:11 | |
chkumar|ruck | EmilienM: weshay sshnaidm|rover https://review.openstack.org/#/c/581607/ please have a look at this one | 16:12 |
*** athomas has quit IRC | 16:12 | |
owalsh | larsks: hey, around? got a question re healthchecks | 16:13 |
larsks | owalsh: I'm around, although it's been a while since I looked at health checks... | 16:14 |
owalsh | larsks: wondering if we can alter the start period | 16:14 |
sshnaidm|rover | chkumar|ruck, you need the patch to pass CI before review | 16:14 |
sshnaidm|rover | chkumar|ruck, it has legit job failures | 16:15 |
weshay | thanks chkumar|ruck | 16:15 |
larsks | owalsh: by "can", do you mean "is it possible" or "is it advisable"? | 16:15 |
owalsh | larsks: I think it's necessary... wondering if it's possible | 16:15 |
owalsh | seems to be an option here - https://docs.docker.com/engine/reference/builder/#healthcheck | 16:16 |
sshnaidm|rover | weshay, recheck won't help | 16:16 |
weshay | ok | 16:16 |
chkumar|ruck | sshnaidm|rover: I need to find out how to add it to undercloud | 16:16 |
dpeacock | florianf: actually having checked I think this is implicitly taken care of - please see the three scenarios I just put in this paste and let me know if this addresses your concerns: http://paste.openstack.org/show/726283/ | 16:16 |
larsks | owalsh: I believe the health checks are put in place by the 'paunch' tool, so the real question is if paunch exposes an option for that. Let me see if I can answer that... | 16:17 |
owalsh | larsks: ah, no https://github.com/openstack/paunch/blob/master/paunch/builder/compose1.py#L185 | 16:17 |
sshnaidm|rover | chkumar|ruck, ok, so this patch is WIP so far, it's not ready for review | 16:17 |
larsks | owalsh: well, there you go :). Looks like a pretty simple patch, though. | 16:18 |
owalsh | larsks: ack thanks | 16:18 |
owalsh | EmilienM: think we need to drop the healthcheck patch... needs to consider the interval/retries options https://github.com/openstack/paunch/blob/master/paunch/builder/compose1.py#L185 | 16:19 |
*** karthiks has joined #tripleo | 16:19 | |
*** janki has joined #tripleo | 16:20 | |
*** agopi is now known as agopi|food | 16:21 | |
sshnaidm|rover | slagle, can you please take a look at https://review.openstack.org/#/c/565077/ ? I don't understand what to do on mistral workflows side | 16:21 |
owalsh | EmilienM: default is 3 retries and 30s interval before it marks containers unhealthy | 16:22 |
owalsh | so 60s timeouts definitely ain't right | 16:22 |
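The arithmetic behind owalsh's point can be sketched as follows. This assumes the documented Docker HEALTHCHECK behaviour (first probe one interval after start, each later probe one interval after the previous one completes, unhealthy after `retries` consecutive failures) and the Docker default timeout of 30s, which is not mentioned in the log. The function name is illustrative only.

```python
def seconds_until_unhealthy(interval=30, retries=3, timeout=30):
    """Return (earliest, latest) seconds after start at which a container
    whose probe never succeeds gets marked unhealthy."""
    # Earliest: every probe fails immediately, so only the intervals count.
    earliest = retries * interval
    # Latest: every probe hangs until it hits the probe timeout.
    latest = retries * (interval + timeout)
    return earliest, latest


earliest, latest = seconds_until_unhealthy()
print(earliest, latest)
```

Under these assumptions a container cannot be reported unhealthy before 90 seconds have passed, so a deploy step that only waits 60 seconds cannot observe a final verdict.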
*** amoralej is now known as amoralej|off | 16:23 | |
*** jtcressy has joined #tripleo | 16:24 | |
*** agopi|food is now known as agopi | 16:25 | |
owalsh | larsks: *sigh* looks like that's a new docker option | 16:27 |
larsks | The inevitable march of progress... | 16:27 |
*** ratailor has quit IRC | 16:28 | |
*** moshele has joined #tripleo | 16:29 | |
*** noslzzp has quit IRC | 16:31 | |
florianf | dpeacock: the output will look different when the validation is run through mistral | 16:34 |
florianf | dpeacock: because it uses a custom plugin to format the output: https://github.com/openstack/tripleo-validations/blob/master/validations/callback_plugins/validation_output.py | 16:34 |
*** gfidente is now known as gfidente|afk | 16:34 | |
florianf | dpeacock: which doesn't pick up debug tasks etc. | 16:35 |
*** dtrainor has quit IRC | 16:35 | |
florianf | dpeacock: (you can check directly if you run the validation from the repository root because it contains an ansible.cfg file that sets the validation_output plugin) | 16:35 |
*** ffiore has quit IRC | 16:36 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: ensure pip deps are at the latest version https://review.openstack.org/583736 | 16:36 |
dpeacock | florianf: ok - my apologies - I haven't tried through mistral yet - is there an example command to run it all? | 16:36 |
*** dtrainor has joined #tripleo | 16:38 | |
florianf | dpeacock: no need to apologize at all! that's not exactly obvious... The easiest way to check the output as it would look when run through mistral is to cd into the validations repo root (where an ansible.cfg is located) and run the validation from there | 16:38 |
*** dbecker has quit IRC | 16:39 | |
florianf | dpeacock: I added another comment as a reminder... | 16:42 |
*** suuuper has quit IRC | 16:42 | |
jtcressy | So I once got the tripleo validations to run properly before... but most of the time the pre-deployment validations fail because "Warning! The validation did not run on any host." Is this a common issue? | 16:43 |
*** agopi is now known as agopi|food | 16:44 | |
*** pchavva has quit IRC | 16:45 | |
dpeacock | florianf: sorry - one thing I am missing is pretty crucial - what is the actual command to run this? | 16:46 |
jtcressy | florianf: just noticed you were talking about validations in the above messages, but I didnt catch the first half of the convo. | 16:46 |
jtcressy | dpeacock: I just used this from the docs a few minutes ago to spawn validations in mistral: openstack workflow execution create tripleo.validations.v1.run_groups '{"group_names": ["pre-deployment"]}' | 16:47 |
dpeacock | jtcressy: thank you :-) | 16:49 |
jtcressy | however, unrelated to that issue, I get "Warning! The validation did not run on any host." on every validation it tries to run. These were working once, but I don't know what's different between then and now. | 16:49 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates master: Add sample designate environment for ha https://review.openstack.org/584026 | 16:51 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates master: Use absolute paths in enable-designate environments https://review.openstack.org/584027 | 16:51 |
dpeacock | florianf: for dev purposes - how do I actually run the validation from the repo root? | 16:55 |
*** tesseract has quit IRC | 16:56 | |
dpeacock | literally just ansible-playbook ..? Or is there something else special? | 16:56 |
dpeacock | florianf: ok nevermind - I answered my own question - thanks | 16:57 |
dpeacock | ok right - characterized the problem now - moving on :-) | 16:58 |
florianf | dpeacock: sorry, back. | 16:59 |
dpeacock | Sorry for the stream of consciousness folks | 16:59 |
dpeacock | florianf: it's all good - I have everything I need to get going now :-) | 16:59 |
*** thrash is now known as thrash|biab | 16:59 | |
florianf | dpeacock: ok, cool! :-) | 17:00 |
florianf | jtcressy: the "did not run on any hosts" error usually appears if you run a validation that's supposed to run on the overcloud without having an overcloud deployed | 17:01 |
jtcressy | these are the "pre-deployment" validations though. | 17:01 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: ensure pip deps are at the latest version https://review.openstack.org/583736 | 17:01 |
jtcressy | e.g. the advanced format validation fails with this error and it does not need a deployed overcloud to run. | 17:02 |
florianf | jtcressy: yeah, that shouldn't happen. | 17:02 |
*** owalsh is now known as owalsh_biab | 17:02 | |
jtcressy | I looked through a few logs in /var/log/mistral but cant find anything relevant yet | 17:02 |
florianf | jtcressy: what command did you use to run a single validation? | 17:02 |
jtcressy | I used this to run all of the pre-deployment validations: "openstack workflow execution create tripleo.validations.v1.run_groups '{"group_names": ["pre-deployment"]}'" | 17:03 |
*** links has quit IRC | 17:03 | |
sri_ | dprince, dsneddon: hi, quick question: I've configured bonds in my overcloud with LACP, but the switch side is not learning MAC addresses for the bonds | 17:03 |
jtcressy | florianf: I can also try to start them individually from the UI but they still fail with the same error. | 17:03 |
*** pradk has quit IRC | 17:04 | |
*** brault has quit IRC | 17:04 | |
*** derekh has quit IRC | 17:04 | |
florianf | jtcressy: can you post the output of `python /usr/bin/tripleo-ansible-inventory --list` somewhere? | 17:04 |
*** brault has joined #tripleo | 17:04 | |
florianf | *paste | 17:04 |
*** pradk has joined #tripleo | 17:05 | |
jtcressy | florianf: https://hastebin.com/raw/daquwakigo | 17:06 |
jtcressy | i have no stack deployed, but I do have a plan defined. | 17:06 |
florianf | jtcressy: do you use a stack/plan with a different name than overcloud? | 17:06 |
jtcressy | no | 17:06 |
*** bdodd has quit IRC | 17:06 | |
*** trown is now known as trown|lunch | 17:08 | |
florianf | jtcressy: what happens if you add `--plan overcloud` to the command? | 17:08 |
*** brault has quit IRC | 17:09 | |
jtcressy | florianf: see the second half of the hastebin. | 17:09 |
*** gkadam has quit IRC | 17:09 | |
*** ooolpbot has joined #tripleo | 17:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782267 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782598 | 17:10 |
*** ooolpbot has quit IRC | 17:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 17:10 |
openstack | Launchpad bug 1782267 in tripleo "Stderr: u'Set Chassis Power Control to Up/On failed: Command not supported in present state\n': ProcessExecutionError: Unexpected error while running command." [Critical,Triaged] | 17:10 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] - Assigned to Oliver Walsh (owalsh) | 17:10 |
*** moshele has quit IRC | 17:10 | |
florianf | jtcressy: what if you run this: openstack action execution run tripleo.plan.list | 17:11 |
*** holser__ has quit IRC | 17:11 | |
jtcressy | florianf: output: `{"result": ["overcloud"]}` | 17:11 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Use global ansible.cfg for nodes-uuid playbook https://review.openstack.org/583552 | 17:12 |
*** psachin` has quit IRC | 17:13 | |
*** edmondsw has quit IRC | 17:13 | |
*** artom has quit IRC | 17:13 | |
florianf | jtcressy: ok, this is strange. does this output anything: `echo $TRIPLEO_PLAN_NAME` | 17:14 |
*** weshay changes topic to "Welcome to Rocky. CI status: GREEN, OVB RED due to nodepool nodefailure | https://docs.openstack.org/tripleo-docs/latest" | 17:14 | |
jtcressy | No, it does not. | 17:14 |
jtcressy | i did `source stackrc` a while ago btw in case that envvar is supposed to be populated from there. | 17:15 |
florianf | jtcressy: no, it's not. but if it was set to something else than overcloud (for whatever reason -- I just wanted to rule it out) it would have explain the result | 17:15 |
florianf | *explained | 17:16 |
dsneddon | sri_, You have the switch side configured for LACP as well? | 17:16 |
jtcressy | florianf: is it ok for this var to be empty or undefined? or does it _need_ to be "overcloud"? | 17:16 |
sri_ | yes, I mean my network admin told me | 17:16 |
florianf | jtcressy: yes it is | 17:16 |
sri_ | and also when i create a linux bond, "BONDING_MASTER" is not part of the bond config | 17:17 |
florianf | jtcressy: the inventory will fall back to overcloud if it isn't set. | 17:17 |
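The fallback florianf describes can be illustrated with a short sketch. This is not the actual `tripleo-ansible-inventory` code: `resolve_plan_name` is a hypothetical helper, and treating an empty value the same as an unset one is an assumption made here.

```python
import os


def resolve_plan_name(environ=None, default="overcloud"):
    """Pick the plan name from TRIPLEO_PLAN_NAME, falling back to `default`."""
    environ = os.environ if environ is None else environ
    # `or` makes an empty string behave like an unset variable (an assumption).
    return environ.get("TRIPLEO_PLAN_NAME") or default


print(resolve_plan_name({}))
print(resolve_plan_name({"TRIPLEO_PLAN_NAME": "mycloud"}))
```

This is why an empty `echo $TRIPLEO_PLAN_NAME` is harmless, while a stale value pointing at a different plan would explain an inventory with no overcloud hosts.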
sri_ | dsneddon, ^^ is that a deal breaker ? | 17:18 |
*** edmondsw has joined #tripleo | 17:18 | |
dsneddon | sri_, I'm not sure I understand what you mean when you say "and also when i create a linux bond, "BONDING_MASTER" is not part of the bond config" | 17:18 |
jtcressy | florianf: iirc the validations were running properly on pike, but i'm not sure. it could've also been queens. I've gone through a few re-creations of my undercloud node over the past few weeks. | 17:18 |
sri_ | dsneddon, instead of ovs_bond I've created a linux_bond | 17:18 |
florianf | jtcressy: I've tested it with a fairly recent master | 17:19 |
EmilienM | owalsh_biab: back. Ok so IIUC, we need to consider interval/retries options in paunch, but they are in a too-recent version of Docker so we can't use it now, is that correct? | 17:19 |
florianf | jtcressy: but not completely up to date | 17:20 |
sri_ | dsneddon, in a linux bond, is that parameter needed or is it optional: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/sec-configuring_a_vlan_over_a_bond | 17:20 |
dsneddon | sri_, That's not a problem, but you will have to use a different format for the bonding_options. You can reuse the BondInterfaceOvsOptions parameter, and set "bonding_options: {get_param: BondInterfaceOvsOptions}" in the NIC config. | 17:20 |
jtcressy | florianf: I'm definitely not working from master. I'm sticking to specific releases. | 17:20 |
owalsh_biab | EmilienM: no, we have intervals/retries but no start_period (which would be useful in this case) | 17:20 |
*** artom has joined #tripleo | 17:20 | |
jtcressy | florianf: and I am currently on queens fyi | 17:21 |
owalsh_biab | EmilienM: but looking at https://docs.docker.com/engine/reference/builder/#healthcheck ... | 17:21 |
dsneddon | sri_, so instead of, for instance "bond_mode=balance-tcp lacp=active", you would use "mode=4" for Linux bond options | 17:21 |
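For reference, "mode=4" is the numeric form of the Linux bonding driver's 802.3ad (LACP) mode. The sketch below lists the standard mode numbers and parses an options string of the kind passed as `bonding_options`. The helper `describe_bonding_options` is hypothetical and not part of os-net-config.

```python
# Mode numbers as defined by the Linux kernel bonding driver.
LINUX_BOND_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",  # LACP
    5: "balance-tlb",
    6: "balance-alb",
}


def describe_bonding_options(options):
    """Parse a string such as 'mode=4 miimon=100' and name the bonding mode."""
    parsed = dict(item.split("=", 1) for item in options.split())
    mode = parsed.get("mode")
    if mode is not None:
        if mode.isdigit():
            parsed["mode_name"] = LINUX_BOND_MODES.get(int(mode), "unknown")
        else:
            # The mode was already given by name, e.g. mode=802.3ad.
            parsed["mode_name"] = mode
    return parsed


print(describe_bonding_options("mode=4 miimon=100"))
```

Options written for an OVS bond, such as "bond_mode=balance-tcp lacp=active", have no `mode` key in this format, which is one quick way to spot a string that was copied over unchanged.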
florianf | jtcressy: I have a queens env. let me double check real quick | 17:21 |
sri_ | dsneddon, my nic config: http://paste.openstack.org/show/726295/ | 17:21 |
owalsh_biab | EmilienM: think we will have to just check after the final step and give it enough time to run at least 3 healthchecks (or query the retry/interval to get how long we need to wait) | 17:22 |
dsneddon | sri_, That looks correct | 17:22 |
EmilienM | owalsh_biab: that's not really "step driven deployment" :-/ | 17:23 |
EmilienM | I think the goal was to stop the deployment if a container would fail to run | 17:23 |
EmilienM | to provide quick feedback to the deployer | 17:23 |
owalsh_biab | yea, but healthchecks need to fail $retry times | 17:24 |
sri_ | dsneddon, my network admin is saying something must be wrong on my side, I just wanted to clarify things from my side | 17:24 |
dsneddon | sri_, There are some inspection commands that will tell you if you are getting LLDP from the switch. This would allow you to confirm that the ports are plugged into the correct ports on the switch. | 17:24 |
dsneddon | sri_, You can use "openstack baremetal introspection interface list" and "openstack baremetal introspection interface show" commands | 17:25 |
dsneddon | sri_, Those commands are run on the undercloud with stackrc authentication file. | 17:25 |
EmilienM | owalsh_biab: why nova-api takes so much time also, and not other containers? | 17:25 |
sri_ | dsneddon, cool, let me try | 17:25 |
*** khyr0n has joined #tripleo | 17:26 | |
slagle | EmilienM: bandini : can you review this https://review.openstack.org/#/c/583017, since you reviewed the child | 17:27 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take featureset out of TOCI_JOBTYPE https://review.openstack.org/582384 | 17:27 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take environment_type out of TOCI_JOBTYPE https://review.openstack.org/582385 | 17:27 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take nodes out of TOCI_JOBTYPE https://review.openstack.org/582386 | 17:27 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Take periodic and dryrun out of TOCI_JOBTYPE https://review.openstack.org/582387 | 17:27 |
slagle | merging anything in t-p-e is broken without the first one | 17:27 |
owalsh_biab | EmilienM: see https://bugs.launchpad.net/tripleo/+bug/1782598, request timestamps are very unstable. I'd guess load is high at that time | 17:27 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] - Assigned to Oliver Walsh (owalsh) | 17:27 |
EmilienM | slagle: ack, looking now. | 17:28 |
EmilienM | owalsh_biab: so do you suggest to disable healthcheck for now and see if it was the only container? and in the meantime figure out how to solve this problem | 17:29 |
EmilienM | owalsh_biab: or total revert of alex's patch? | 17:29 |
owalsh_biab | EmilienM: total revert would be safest I think | 17:30 |
* owalsh_biab afk for a bit | 17:31 | |
EmilienM | mhh | 17:31 |
* EmilienM not really happy | 17:31 | |
jtcressy | Is there a minimum storage requirement on overcloud nodes? I just noticed that some of my nodes which have 270GB disk space are deployed via nova/ironic just fine, however nodes which have 135GB disk space outright fail with some sort of block storage error that does a horrible job at describing the true problem. | 17:32 |
cmyster | jtcressy: as in regular nodes or some specifics needs? IIRC 135GB is way more than the minimal | 17:35 |
jtcressy | just regular nodes. Currently all my nodes with 135GB storage are set to compute. I have a single 270GB node set to compute that actually succeeds in deployment. (control and ceph-storage nodes all have 270GB as well) | 17:36 |
jtcressy | the error also occurs in nova/ironic | 17:36 |
*** agopi|food is now known as agopi | 17:36 | |
cmyster | jtcressy: could you run df -h on the nodes that failed? | 17:37 |
jtcressy | heat requests instance -> nova fails to build instance and never even tells ironic to deploy the server. It just outright fails. It feels like it's failing some sort of check but i don't know what. | 17:37 |
jtcressy | cmyster: cant run df -h if it never powers on the failed nodes | 17:37 |
jtcressy | also the disks should be presumed clean on nodes in the "available" provision state | 17:38 |
cmyster | jtcressy: oh, you said failed to deploy, I assumed it at least passed some stages | 17:38 |
jtcressy | cmyster: nope. nova outright refuses to build the instance almost as quickly as heat requested it. | 17:39 |
cmyster | could be a thing for ironic team to have a look at... | 17:39 |
*** mdnadeem has joined #tripleo | 17:39 | |
jtcressy | i'm on queens release btw. no in-dev stuff going on. ;) | 17:39 |
cmyster | so queens, trying to deploy with tripleo? | 17:39 |
openstackgerrit | James Slagle proposed openstack/tripleo-common stable/queens: Include 'tripleo_role_name' in the inventory https://review.openstack.org/583287 | 17:39 |
jtcressy | cmyster: yes. tripleo queens. | 17:40 |
jtcressy | cmyster: here's the brief output in the tripleo UI about the failure: resources.NovaCompute: Went to status ERROR due to "Message: Build of instance edc6a7a4-170c-4cf4-8d00-8f4f03587b77 aborted: Failure prepping block device., Code: 500" | 17:41 |
*** yamahata has joined #tripleo | 17:41 | |
*** pchavva has joined #tripleo | 17:41 | |
sri_ | dsneddon, if you have a minute, can you please take a look at this http://paste.openstack.org/show/726299/ | 17:42 |
cmyster | jtcressy: introspection passed ? | 17:42 |
jtcressy | Yup | 17:42 |
*** chem has quit IRC | 17:43 | |
jtcressy | my instackenv.json actually has old values for the hard drive sizes (270GB) and the introspection updated it to the actual storage size of 135GB. (I moved disks around a few days ago to consolidate what I have into dedicated ceph nodes) | 17:43 |
cmyster | I remember seeing that issue | 17:44 |
cmyster | but where | 17:44 |
cmyster | ipmi? | 17:44 |
cmyster | hmmm | 17:44 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart-extras master: ovb-manage: Allow the use of localhost as undercloud part of the stack https://review.openstack.org/584040 | 17:45 |
dsneddon | sri_, It's possible that eth2-5 are attached to a switch that is not running LLDP (Link-Layer Discovery Protocol) on those ports. | 17:45 |
sri_ | dsneddon, http://paste.openstack.org/show/726300/ | 17:47 |
dsneddon | sri_, Are eth0/1 attached to a different switch than eth2-5? It definitely looks like LLDP is not running on the 2-5 switchports. | 17:50 |
sri_ | dsneddon, let me find out | 17:51 |
pabelanger | have a baremetal question for DIB, anybody in tripleo deal with that before? I'm trying to understand why we need to extract kernel and initial ramdisk into separate images: http://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/baremetal | 17:51 |
*** gfidente|afk has quit IRC | 17:56 | |
slagle | pretty sure it was because that was how Ironic originally required it. it didn't always support whole disk images | 17:57 |
*** sshnaidm|rover is now known as sshnaidm|off | 17:57 | |
pabelanger | slagle: in tripleo, are you still doing kernel and ramdisk images or whole disk images | 17:58 |
jangutter | also, I don't think dib can generate gpt whole-disk image quite yet, so for EFI boot, I _think_ it's still required. Happy to be proven wrong. | 17:58 |
pabelanger | how does it work in OVB jobs today? | 17:59 |
pabelanger | jangutter: if needed, I think we can find somebody to add it, ianw comes to mind | 17:59 |
jangutter | pabelanger: I think there's already a task running for it. | 18:00 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates master: Add sample designate environment for ha https://review.openstack.org/584026 | 18:00 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates master: Use absolute paths in enable-designate environments https://review.openstack.org/584027 | 18:00 |
sri_ | dsneddon, I've asked my network-admin/boss about the switches, he told me "don't get into it, just do what i said" | 18:00 |
jangutter | pabelanger: https://bugzilla.redhat.com/show_bug.cgi?id=1488557 | 18:00 |
openstack | bugzilla.redhat.com bug 1488557 in diskimage-builder "[RFE] diskimage-builder support whole disk images with UEFI whole disk image support for overcloud nodes" [Unspecified,On_dev] - Assigned to yroblamo | 18:00 |
sri_ | dsneddon, very sorry for wasting your time | 18:00 |
pabelanger | rlandy: weshay: panda: maybe you can answer OVB question above about kernel and ramdisk images | 18:00 |
sri_ | dsneddon, and thank you for your time :) | 18:01 |
jangutter | near as I can figure, EFI boot _likes_ having the kernel and ramdisk broken out. | 18:01 |
bnemec | pabelanger: I don't know for sure what ci is doing these days, but the default image build is still using split kernel and ramdisk so ci _should_ be doing that too. | 18:01 |
pabelanger | jangutter: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nb03.openstack.org.yaml#n72 is how we are doing efi images in nodepool, but really don't know much how it works, ianw drove that | 18:02 |
*** janki has quit IRC | 18:03 | |
pabelanger | bnemec: okay, that helps. If we added baremetal into nodepool, which I'm just planning now, we still want to use the 3 images? | 18:03 |
*** edmondsw has quit IRC | 18:03 | |
jangutter | pabelanger: heh, the "vm" element in my setup expressly set mbr(msdos) partition layout, while EFI kinda requires GPT. | 18:03 |
bnemec | pabelanger: I would say yes. | 18:03 |
pabelanger | bnemec: perfect, thanks! | 18:04 |
bnemec | Even if we changed the default, older releases are still using the split images. | 18:04 |
pabelanger | ++ | 18:04 |
florianf | jtcressy: I think I found something re failing validations: https://review.openstack.org/#/c/565201/3 | 18:06 |
dsneddon | sri_, I checked with the engineer who wrote the code for the LLDP data collection, and he said that it also would show that output for eth2-5 if no cable was plugged in to the port. I think that's unlikely, though, since os-net-config would have detected that and thrown an error during deployment. | 18:06 |
florianf | jtcressy: It's been backported to queens | 18:06 |
florianf | jtcressy: I can investigate further tomorrwo | 18:06 |
florianf | *tomorrow | 18:07 |
jtcressy | florianf: sounds good. | 18:07 |
florianf | jtcressy: thanks for the hint | 18:07 |
slagle | pabelanger: we use the split image with kernel, ramdisk, and a partition image | 18:08 |
pabelanger | slagle: ack, where can I look to see how that is built today for CI? | 18:08 |
slagle | quickstart | 18:09 |
*** ooolpbot has joined #tripleo | 18:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782267 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782598 | 18:10 |
*** ooolpbot has quit IRC | 18:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 18:10 |
openstack | Launchpad bug 1782267 in tripleo "Stderr: u'Set Chassis Power Control to Up/On failed: Command not supported in present state\n': ProcessExecutionError: Unexpected error while running command." [Critical,Triaged] | 18:10 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,Triaged] - Assigned to Oliver Walsh (owalsh) | 18:10 |
sri_ | dsneddon, yes, cables are connected on all ports, and os-net-config also runs without any errors. Are you saying it is still an LLDP issue? | 18:10 |
sri_ | dsneddon, my deployment failed trying to ping one of the VLANs | 18:12 |
*** florianf has quit IRC | 18:12 | |
*** trown|lunch is now known as trown | 18:13 | |
sri_ | dsneddon, this part [u'1000BASE-T fdx'] needs to show up for all interfaces, right? | 18:15 |
*** salmankhan has quit IRC | 18:16 | |
*** moshele has joined #tripleo | 18:16 | |
jtcressy | So I'm getting this error regardless of the node's hard drive size now: `Build of instance 5d3fe517-24ac-429a-af98-4289a0a00353 aborted: Failure prepping block device.` | 18:17 |
*** thrash|biab is now known as thrash | 18:17 | |
jtcressy | this is ONLY happening on compute nodes. ceph-storage and control are deployed just fine. | 18:17 |
openstackgerrit | Alan Bishop proposed openstack/puppet-tripleo stable/queens: [Ocata,Pike,Queens-Only] Fix Cinder's Netapp backend https://review.openstack.org/583734 | 18:18 |
*** moshele has quit IRC | 18:19 | |
*** panda is now known as panda|off | 18:20 | |
jtcressy | FAILED node: https://hastebin.com/qinofavugo.py | 18:20 |
jtcressy | Successful (currently building) node: https://hastebin.com/giputuyane.rb | 18:21 |
radez | EmilienM: that issue we were looking at the other day ended up being a puppet-tripleo issue: https://review.openstack.org/#/c/583900/ | 18:21 |
radez | if you get a min to look at it a review would be welcome :) | 18:21 |
EmilienM | radez: it's a way to fix this, indeed | 18:21 |
EmilienM | I'm happy to merge this. | 18:21 |
radez | EmilienM++ thx! | 18:22 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-heat-templates master: Add secondary DNS server to disable-unbound environment https://review.openstack.org/582164 | 18:22 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-heat-templates master: Move to openshift-ansible 3.10 https://review.openstack.org/582495 | 18:23 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-heat-templates master: WIP use openshift-ansible container instead of RPMs https://review.openstack.org/583868 | 18:23 |
jtcressy | This is one of the bug reports mentioning my problem but it says it's been fixed and backported to queens? How come I still experience this problem if it was "fixed"? | 18:24 |
jtcressy | https://bugs.launchpad.net/tripleo/+bug/1749671 | 18:24 |
openstack | Launchpad bug 1749671 in tripleo "Overcloud installation fails with "Failure prepping block device." error" [Critical,Fix released] - Assigned to Harald Jensås (harald-jensas) | 18:24 |
pabelanger | okay, looking more at tripleo-quickstart, ironic-python-agent seems to be the images for ironic? | 18:24 |
*** artom_ has joined #tripleo | 18:25 | |
jtcressy | my current node list: https://hastebin.com/qasikekunu.rb | 18:27 |
dsneddon | sri_, No, I don't think there is necessarily an issue with the lack of LLDP data. I was just pointing out that since the switch isn't running LLDP, we won't get any useful troubleshooting data out of the "openstack baremetal introspection interface list|show" commands for those ports. | 18:27 |
jtcressy | the singular novacompute node still errors out with the "Failure prepping block device" error. | 18:28 |
jtcressy | Anyone here specialize in nova/ironic? | 18:28 |
*** artom has quit IRC | 18:29 | |
*** med_ has joined #tripleo | 18:29 | |
*** med_ has quit IRC | 18:29 | |
*** med_ has joined #tripleo | 18:29 | |
*** marrusl has joined #tripleo | 18:29 | |
dsneddon | sri_, But without LLDP, the switch is completely a black box. | 18:29 |
dsneddon | sri_, A few other things to look at. You can run "cat /proc/net/bonding/bond" on the overcloud nodes, which will give you the status of the bond and LACP. | 18:31 |
dsneddon | sri_, Oops, I meant "cat /proc/net/bonding/bond1" | 18:31 |
*** edmondsw has joined #tripleo | 18:31 | |
*** agurenko has quit IRC | 18:31 | |
*** mdnadeem has quit IRC | 18:35 | |
dsneddon | sri_, Other thing that could be causing issues: incorrect VLAN trunking configuration on the bond on the switch. This can't be detected by os-net-config, and so the bond gets set up but no traffic flows across the VLANs. | 18:36 |
dsneddon | sri_, Another thing could be incorrect cabling, so the ports on the switch are actually connected to different servers, rather than all bond slaves being attached to the same server. | 18:37 |
dsneddon | sri_, Finally, another problem could be native (untagged) vs. trunked (tagged) VLANs. These VLANs should be trunked so they will be tagged on both ends. | 18:38 |
dsneddon | sri_, You also want to make sure that the VLAN IDs are set correctly. StorageNetworkVlanID and InternalApiNetworkVlanID need to be set correctly in your network-environment.yaml (or set correctly in network_data.yaml if you are using a very recent TripleO version). | 18:39 |
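The bond check dsneddon suggests ("cat /proc/net/bonding/bond1") can be scripted. The following is a minimal sketch, assuming the usual "Slave Interface:" / "MII Status:" layout of that file; the function names and the sample text (including the interface names eth2 and eth3) are illustrative and are not output from sri_'s nodes.

```python
def parse_bond_status(text):
    """Map each bond slave to its MII status from /proc/net/bonding/<bond> text."""
    slaves = {}
    current = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # The first "MII Status" line belongs to the bond itself and is
            # skipped because no slave has been seen yet.
            slaves[current] = value
    return slaves


def down_slaves(text):
    """Return the sorted names of slaves whose MII status is not 'up'."""
    return sorted(n for n, s in parse_bond_status(text).items() if s != "up")


# Hypothetical sample, shaped like the kernel's bonding status file.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

Slave Interface: eth2
MII Status: up

Slave Interface: eth3
MII Status: down
"""

print(down_slaves(SAMPLE))
```

On an overcloud node, the text would come from reading /proc/net/bonding/bond1 instead of SAMPLE. A link that is up says nothing about VLAN trunking on the switch, so this only covers the cabling and link part of the checklist above.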
*** akhilaki has joined #tripleo | 18:40 | |
openstackgerrit | Tom Barron proposed openstack/tripleo-heat-templates master: Update manila environment file names https://review.openstack.org/583705 | 18:46 |
*** shreshtha has quit IRC | 18:48 | |
*** bdodd has joined #tripleo | 18:48 | |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-quickstart-extras master: DNM - Adding patch for reproducer test https://review.openstack.org/584065 | 18:51 |
*** moshele has joined #tripleo | 18:59 | |
*** moshele has quit IRC | 18:59 | |
*** rpioso|afk is now known as rpioso | 19:02 | |
*** sri__ has joined #tripleo | 19:02 | |
*** itlinux has joined #tripleo | 19:06 | |
sri__ | dsneddon, understood, i will look into all possible scenarios you mentioned, again thank you very much for your help | 19:11 |
*** tosky has quit IRC | 19:16 | |
*** brault has joined #tripleo | 19:21 | |
*** med_ has quit IRC | 19:23 | |
*** brault has quit IRC | 19:25 | |
*** dbecker has joined #tripleo | 19:25 | |
jtcressy | So I've tracked down my block device configuration problem from earlier to a problem between Heat and Nova... | 19:34 |
jtcressy | It seems Heat thinks a node exists by a particular UUID and tries to tell nova to deploy using said UUID, but that UUID does not exist as a node in ironic!!!! WTF? | 19:35 |
jtcressy | where does heat pull in a list of baremetal nodes from? does it cache a list in its own database? because it is most certainly stale data if it thinks a node still exists after it is LONG gone. | 19:36 |
jtcressy | in `/var/log/nova/nova-compute` grepping for "ERROR" I find a LOT of 404 errors when nova tries to fetch info for a bare metal node by the aforementioned UUID. I don't know where it's getting this from, but I need to know how to get rid of it so it selects new baremetal nodes and stops using stale UUIDs | 19:37 |
jtcressy | my deployments are going nowhere with this, as I cannot deploy any compute nodes because of this bizarre problem. | 19:38 |
jtcressy | 2018-07-19 13:12:53.604 10611 ERROR nova.virt.ironic.driver [req-cd5bd1c3-6cc8-4f27-a59a-832cb4649c0d 101ee8edb5b749e9ac95f5ee15333f4d 29ed7702e21e4480a317eb8b03bab387 - default default] [instance: dff12360-ae5e-49f3-bb52-76d3929a05a8] Error preparing deploy for instance dff12360-ae5e-49f3-bb52-76d3929a05a8 on baremetal node a6caba27-c4c0-4e6b-9b92-2fe65fd87410.: NotFound: Node a6caba27-c4c0-4e6b-9b92-2fe65fd87410 could not be found. (HTTP 404) | 19:41 |
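Grepping the nova-compute log for these 404s can be narrowed to the node UUIDs that nova asks for and ironic no longer knows. This is a rough sketch, assuming the "Node <uuid> could not be found" wording shown in the log line above; the helper name is made up and the sample is a shortened copy of that line.

```python
import re
from collections import Counter

# Matches the ironic NotFound message as it appears in the pasted log line.
NOT_FOUND = re.compile(
    r"Node ([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}) "
    r"could not be found"
)


def missing_nodes(lines):
    """Count how often each node UUID is reported as not found in ERROR lines."""
    counts = Counter()
    for line in lines:
        if "ERROR" in line:
            counts.update(NOT_FOUND.findall(line))
    return counts


# Shortened copy of the log line quoted in the channel.
SAMPLE_LINE = (
    "2018-07-19 13:12:53.604 10611 ERROR nova.virt.ironic.driver "
    "[instance: dff12360-ae5e-49f3-bb52-76d3929a05a8] Error preparing deploy "
    "for instance dff12360-ae5e-49f3-bb52-76d3929a05a8 on baremetal node "
    "a6caba27-c4c0-4e6b-9b92-2fe65fd87410.: NotFound: Node "
    "a6caba27-c4c0-4e6b-9b92-2fe65fd87410 could not be found. (HTTP 404)"
)

for uuid, hits in missing_nodes([SAMPLE_LINE]).items():
    print(uuid, hits)
```

Passing an open log file as `lines` works too, since the function only iterates over it. The resulting UUIDs are the ones worth comparing against the current "openstack baremetal node list" output.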
jtcressy | Does anyone know what might be wrong? | 19:46 |
*** noslzzp has joined #tripleo | 19:49 | |
*** liverpooler has quit IRC | 19:58 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Add override_ansible_cfg https://review.openstack.org/584087 | 19:59 |
*** myoung is now known as myoung|biab | 20:00 | |
*** holser_ has joined #tripleo | 20:05 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: update default logging to match upstream https://review.openstack.org/584088 | 20:06 |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Add override_ansible_cfg https://review.openstack.org/584087 | 20:06 |
*** holser_ has quit IRC | 20:15 | |
*** holser_ has joined #tripleo | 20:15 | |
*** dbecker has quit IRC | 20:15 | |
*** sri__ has quit IRC | 20:22 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: devmode.sh has been upgraded https://review.openstack.org/584097 | 20:25 |
*** dprince has quit IRC | 20:25 | |
*** pchavva has quit IRC | 20:28 | |
*** artom_ is now known as artom | 20:29 | |
*** akhilaki_ has joined #tripleo | 20:31 | |
*** akhilaki has quit IRC | 20:33 | |
*** akhilaki has joined #tripleo | 20:37 | |
*** akhilaki_ has quit IRC | 20:39 | |
trozet | weshay: it's a miracle scenario008 passed on queens: https://review.openstack.org/#/c/581790/ | 20:40 |
trozet | EmilienM: ^ so i think we are good now on https://review.openstack.org/#/c/581791/ when you have a minute | 20:41 |
EmilienM | trozet: good | 20:41 |
*** jroll has joined #tripleo | 20:42 | |
weshay | EmilienM++ | 20:43 |
jroll | EmilienM: I like your last email :) | 20:43 |
EmilienM | jroll: because it has "edge" in the subject? i know it's how I get people to read my garbage :P | 20:43 |
jroll | ha | 20:43 |
jroll | because it's similar to what I'm working on recently :P | 20:44 |
EmilienM | jroll: nice, tell me more | 20:44 |
jroll | EmilienM: not much to say, central DC control plane with remote compute nodes | 20:45 |
EmilienM | jroll: come help me \o/ | 20:45 |
jroll | EmilienM: well, we aren't tripleo users right now, but this is compelling :) | 20:45 |
EmilienM | jroll: oh, what do you use? | 20:46 |
jroll | EmilienM: some homegrown chef stuff at the moment, but this is a new project. have been evaluating OSA for now, mostly because we use a lot of ansible elsewhere | 20:47 |
EmilienM | jroll: come use tripleo | 20:48 |
EmilienM | we are ansible based :D | 20:48 |
jroll | heh | 20:48 |
jroll | wait, you are? TIL | 20:49 |
EmilienM | jroll: mostly. We use a bit of Puppet still for config management, but most of the orchestration is now done by Ansible. | 20:49 |
jroll | neat. | 20:49 |
EmilienM | jroll: https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/ansible_config_download.html | 20:50 |
jroll | thanks, will read up | 20:50 |
EmilienM | jroll: and we're now using a bunch of ansible roles to deploy our services. | 20:50 |
jroll | EmilienM: neat, will check it out | 20:52 |
EmilienM | jroll: i'll save you time, come use tripleo ;-) | 20:52 |
* jroll sends EmilienM's boss a letter to promote him to sales | 20:52 | |
*** lblanchard has quit IRC | 20:53 | |
jroll | EmilienM: I'm hoping to eventually have: 1 API endpoint which connects to X cells, which each control Y sites with compute nodes | 20:53 |
jroll | or something like that | 20:53 |
EmilienM | ahah no I'm not sales | 20:53 |
*** akhilaki has quit IRC | 20:56 | |
*** abishop has quit IRC | 20:56 | |
*** raildo has quit IRC | 20:58 | |
*** raildo has joined #tripleo | 20:58 | |
*** lifeless has quit IRC | 21:01 | |
*** akhilaki has joined #tripleo | 21:02 | |
*** agopi has quit IRC | 21:05 | |
*** bugzy_ has quit IRC | 21:05 | |
*** lifeless has joined #tripleo | 21:06 | |
*** raildo has quit IRC | 21:06 | |
*** trown is now known as trown|outtypewww | 21:07 | |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-quickstart-extras master: Include minimal Browbeat playbook in baremetal playbook https://review.openstack.org/581488 | 21:08 |
*** slaweq has quit IRC | 21:10 | |
*** holser_ has quit IRC | 21:12 | |
*** pradk has quit IRC | 21:12 | |
EmilienM | owalsh_biab: are you going to propose the revert? I'm more in favor of disabling the healthcheck for nova api | 21:12 |
EmilienM | I'll check that later /me afk | 21:12 |
*** bugzy has joined #tripleo | 21:14 | |
owalsh_biab | EmilienM: don't see why nova_api would be the only container affected.... none of the health checks failed, I expect they just timed out | 21:15 |
*** myoung|biab is now known as myoung | 21:16 | |
*** jtcressy has quit IRC | 21:16 | |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: Give healthchecks time to stablize before failing the deployment https://review.openstack.org/584119 | 21:26 |
*** slaweq has joined #tripleo | 21:29 | |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: Give healthchecks time to stabilize before failing the deployment https://review.openstack.org/584119 | 21:29 |
*** bfournie has quit IRC | 21:29 | |
*** owalsh_biab is now known as owalsh | 21:31 | |
*** akhilaki_ has joined #tripleo | 21:37 | |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: Add --override-ansible-cfg https://review.openstack.org/584121 | 21:38 |
*** gbarros has joined #tripleo | 21:41 | |
*** paramite has quit IRC | 21:42 | |
*** jtcressy has joined #tripleo | 21:44 | |
jtcressy | Anyone have a guide on heat database surgery? Things are *very* broken on my undercloud and I think it's heat's fault. | 21:45 |
jtcressy | It keeps trying to deploy a node that doesn't exist instead of picking from my list of current nodes. | 21:45 |
jtcressy | I've rebooted the undercloud, deleted my plan, refreshed everything and it still has this problem. | 21:46 |
openstackgerrit | James Slagle proposed openstack/tripleo-docs master: Document --override-ansible-cfg https://review.openstack.org/584125 | 21:49 |
*** dtrainor has quit IRC | 21:51 | |
*** hamzy has quit IRC | 21:53 | |
*** hamzy has joined #tripleo | 21:53 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common master: Add override_ansible_cfg https://review.openstack.org/584087 | 21:54 |
*** gbarros has quit IRC | 21:56 | |
*** dtrainor has joined #tripleo | 21:56 | |
*** gbarros has joined #tripleo | 21:56 | |
*** rcernin has joined #tripleo | 21:58 | |
*** brault has joined #tripleo | 21:59 | |
*** jtomasek has quit IRC | 22:00 | |
*** brault has quit IRC | 22:03 | |
*** mcornea has quit IRC | 22:04 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Improve nova statedir ownership logic https://review.openstack.org/577855 | 22:08 |
openstackgerrit | Merged openstack/tripleo-puppet-elements master: Update test-requirements.txt https://review.openstack.org/583017 | 22:08 |
*** ooolpbot has joined #tripleo | 22:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1773325 | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782267 | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1782598 | 22:10 |
*** ooolpbot has quit IRC | 22:10 | |
openstack | Launchpad bug 1773325 in tripleo "tempest.api.object_storage.test_object_services is failing on scenario002" [Critical,In progress] - Assigned to Arx Cruz (arxcruz) | 22:10 |
openstack | Launchpad bug 1782267 in tripleo "Stderr: u'Set Chassis Power Control to Up/On failed: Command not supported in present state\n': ProcessExecutionError: Unexpected error while running command." [Critical,Triaged] | 22:10 |
openstack | Launchpad bug 1782598 in tripleo "container health check fails in step 5 on centos-binary-nova-api" [Critical,In progress] - Assigned to Oliver Walsh (owalsh) | 22:10 |
*** slaweq has quit IRC | 22:17 | |
*** gbarros has quit IRC | 22:20 | |
*** gbarros has joined #tripleo | 22:20 | |
*** ipsecguy has quit IRC | 22:22 | |
*** itlinux has quit IRC | 22:22 | |
*** jtcressy has quit IRC | 22:23 | |
*** jtcressy has joined #tripleo | 22:26 | |
*** jtcressy has quit IRC | 22:27 | |
*** edmondsw has quit IRC | 22:30 | |
*** mjturek has quit IRC | 22:31 | |
*** edmondsw has joined #tripleo | 22:31 | |
*** jtcressy has joined #tripleo | 22:33 | |
*** ipsecguy has joined #tripleo | 22:34 | |
*** edmondsw has quit IRC | 22:35 | |
*** gbarros has quit IRC | 22:41 | |
*** itlinux has joined #tripleo | 22:42 | |
*** rcernin_ has joined #tripleo | 22:47 | |
openstackgerrit | Ben Nemec proposed openstack/python-tripleoclient master: Move ironic http boot reno to the correct section https://review.openstack.org/584154 | 22:48 |
openstackgerrit | Merged openstack/puppet-tripleo master: Check for neutron_plugin_ml2_ansible service when including plugin https://review.openstack.org/583900 | 22:48 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: remove scenario005 from experimental https://review.openstack.org/583680 | 22:48 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Run scenario009 for more services https://review.openstack.org/583238 | 22:48 |
*** rcernin has quit IRC | 22:49 | |
*** lblanchard has joined #tripleo | 22:58 | |
*** rlandy is now known as rlandy|bbl | 22:59 | |
*** noslzzp has quit IRC | 22:59 | |
*** tzumainn has quit IRC | 23:02 | |
*** slaweq has joined #tripleo | 23:11 | |
*** dhill_ has quit IRC | 23:14 | |
*** slaweq has quit IRC | 23:16 | |
*** bfournie has joined #tripleo | 23:18 | |
*** gbarros has joined #tripleo | 23:19 | |
*** gbarros has quit IRC | 23:23 | |
*** gbarros has joined #tripleo | 23:24 | |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart master: Let's tempestconf tool handle swift related conf https://review.openstack.org/573220 | 23:31 |
pabelanger | panda|off: rlandy|bbl: weshay: So, here is a very basic example of how we can get a bmc node from nodepool: https://review.rdoproject.org/r/14768/ Looking at tripleo-ci, that seems to be the only thing we do to the image before booting it. | 23:33 |
pabelanger | next step would be looking at working bmc-template node and seeing what networking is setup, SSH account, etc | 23:34 |
pabelanger | SSH key can be generated at runtime, like we do with devstack multinode jobs | 23:34 |
*** gbarros has quit IRC | 23:34 | |
pabelanger | networking, more tricky as there are provider networks | 23:34 |
pabelanger | however, one idea would be to generate overlay networks, but not sure how that would look with ironic bits | 23:35 |
pabelanger | panda|off: rlandy|bbl: weshay: I think I'd like to learn more about the ipxe-boot image next, looking at http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/scripts/prepare-ovb-cloud.sh looks straightforward to create the image | 23:41 |
*** khyr0n has quit IRC | 23:45 | |
*** pmannidi has joined #tripleo | 23:49 | |
*** rpioso is now known as rpioso|afk | 23:51 | |
jtcressy | TIL If I remove nodes from my undercloud, they will linger in the `compute_nodes` table in the `nova` database and will cause heat/nova/ironic to fail deploying new nodes. I had to delete 43 lines of stale node data from that table. I'm beginning another deploy now and hopefully this will let me get past the problem I've been experiencing for the past two days. More details about this upon request! | 23:52 |
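Before deleting rows by hand as jtcressy describes, it may help to list which `compute_nodes` entries no longer match a node in ironic. The sketch below is only the comparison step and makes a loud assumption: that each ironic-backed row identifies its node by UUID (for example in a hostname-style column). The function name and both sample lists are hypothetical, with only the first UUID taken from the log line quoted earlier.

```python
def stale_compute_nodes(compute_node_ids, ironic_node_uuids):
    """Return identifiers present in nova's compute_nodes data but absent from ironic.

    Both arguments are plain iterables of UUID strings; how they are fetched
    (database query, CLI output) is left to the operator.
    """
    known = {uuid.lower() for uuid in ironic_node_uuids}
    return sorted(i for i in set(compute_node_ids) if i.lower() not in known)


# Hypothetical data: one UUID from the 404 in the log, the rest invented.
compute_nodes_rows = [
    "a6caba27-c4c0-4e6b-9b92-2fe65fd87410",
    "11111111-2222-3333-4444-555555555555",
]
ironic_nodes = [
    "11111111-2222-3333-4444-555555555555",
    "66666666-7777-8888-9999-000000000000",
]

print(stale_compute_nodes(compute_nodes_rows, ironic_nodes))
```

The function deliberately does not touch the database. It only produces the list of candidates to review, which keeps the destructive step a separate and explicit decision.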
*** akhilaki__ has joined #tripleo | 23:59 |