*** ooolpbot has joined #tripleo | 00:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 00:10 |
---|---|---|
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 00:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 00:10 |
*** ooolpbot has quit IRC | 00:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 00:10 |
*** jamesdenton has quit IRC | 00:36 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Allow debootstrap to cleanup without a kernel https://review.openstack.org/604692 | 00:42 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: Introduce undercloud_container_cli parameter https://review.openstack.org/600512 | 00:50 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: Fix quickstart undercloud selinux configuration https://review.openstack.org/602703 | 00:50 |
weshay | hrm... scenario001 | 00:51 |
*** rlandy has joined #tripleo | 01:05 | |
EmilienM | weshay: timeouts? | 01:16 |
*** rlandy has quit IRC | 01:19 | |
*** mrsoul has joined #tripleo | 01:19 | |
*** phuongnh has joined #tripleo | 01:22 | |
*** mschuppert has quit IRC | 01:23 | |
weshay | one | 01:28 |
weshay | it failed twice | 01:28 |
weshay | since we turned off validations | 01:28 |
*** owalsh_ has joined #tripleo | 01:29 | |
*** owalsh has quit IRC | 01:33 | |
*** rh-jelabarre has quit IRC | 01:34 | |
openstackgerrit | Emilien Macchi proposed openstack/ansible-role-redhat-subscription master: Add support for RHSM Pools https://review.openstack.org/605290 | 01:44 |
*** jamesdenton has joined #tripleo | 01:46 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: libvirt standalone deployment https://review.openstack.org/591077 | 01:55 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: f28 support for quickstart https://review.openstack.org/591652 | 01:57 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: libvirt standalone deployment https://review.openstack.org/591077 | 01:58 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: libvirt standalone deployment https://review.openstack.org/591077 | 02:37 |
*** psachin has joined #tripleo | 02:40 | |
*** apetrich has quit IRC | 02:42 | |
pvc | hi tripleo team | 03:14 |
pvc | any one knows why my NetworkDeployment is not completing? | 03:15 |
pvc | or should i wait a little bit more | 03:15 |
openstackgerrit | Merged openstack/tripleo-heat-templates stable/rocky: Stop cap granting to empty pool when telemetry disabled https://review.openstack.org/604734 | 03:22 |
*** skramaja has joined #tripleo | 03:25 | |
*** ramishra has joined #tripleo | 03:26 | |
*** udesale has joined #tripleo | 03:53 | |
*** itlinux has joined #tripleo | 03:55 | |
itlinux | hello all, can someone point me to the right docs on how to do minor updates on PIKE BM. I get this issue.. http://paste.openstack.org/show/730915/ | 03:56 |
*** ykarel has joined #tripleo | 03:59 | |
*** pcaruana has joined #tripleo | 04:14 | |
*** nawar has joined #tripleo | 04:24 | |
*** mcornea has quit IRC | 04:25 | |
*** shyamb has joined #tripleo | 04:26 | |
*** shyamb has quit IRC | 04:31 | |
*** shyamb has joined #tripleo | 04:34 | |
*** pcaruana has quit IRC | 04:38 | |
*** Petersingh has joined #tripleo | 04:46 | |
*** Petersingh is now known as Petersingh|afk | 04:47 | |
*** zb has joined #tripleo | 04:58 | |
*** dxiri has joined #tripleo | 05:00 | |
*** zaneb has quit IRC | 05:01 | |
*** nawar has quit IRC | 05:06 | |
*** abishop_ has quit IRC | 05:08 | |
*** abishop has joined #tripleo | 05:12 | |
Tengu | hello there | 05:14 |
*** Petersingh|afk is now known as Petersingh | 05:18 | |
jaosorior | yoo | 05:19 |
itlinux | hello all.. | 05:21 |
pvc | hello all | 05:21 |
itlinux | jaosorior: and Tengu: | 05:21 |
pvc | hello itlinux | 05:21 |
jaosorior | itlinux: hey! how's it going? | 05:21 |
itlinux | just trying to figure the update minor issue.. need your help there.. | 05:22 |
itlinux | http://paste.openstack.org/show/730915/ | 05:22 |
itlinux | any tips on what to look for on this.. | 05:22 |
*** dxiri has quit IRC | 05:22 | |
itlinux | it's driving me crazy :) | 05:22 |
jaosorior | itlinux: I don't have a lot of knowledge about the update and upgrades workflow yet :/ | 05:22 |
jaosorior | unsupported parameters for pacemaker_cluster | 05:22 |
itlinux | ok.. I will check with M.. what timezone is he at? | 05:23 |
jaosorior | sounds like you're using the wrong version of the pacemaker ansible module | 05:23 |
itlinux | that's default I have not changed anything on pacemaker.. | 05:23 |
jaosorior | not pacemaker itself | 05:24 |
jaosorior | but the pacemaker ansible module | 05:24 |
jaosorior | uhm... | 05:24 |
jaosorior | bandini: ^^ | 05:24 |
itlinux | I just deployed the overcloud .. so there is nothing I changed and used one of the default img | 05:24 |
itlinux | ahh the italian guy! ok | 05:24 |
itlinux | I will reach out to him and see what he has to say :) | 05:24 |
*** shyamb has quit IRC | 05:28 | |
*** shyamb has joined #tripleo | 05:36 | |
*** nawar has joined #tripleo | 05:38 | |
quique|rover|off | Good morning | 05:38 |
*** quique|rover|off is now known as quiquell|rover | 05:40 | |
quiquell|rover | ykarel: you there ? | 05:40 |
ykarel | quiquell|rover, yes | 05:40 |
quiquell|rover | ykarel: Did you find something about timeouts and overcloud deploy ? | 05:41 |
*** pcaruana has joined #tripleo | 05:43 | |
*** apetrich has joined #tripleo | 05:46 | |
Tengu | quiquell|rover: heya! from what I know, weshay's patch was merged yesterday evening (CET), and apparently it did help a bit. | 05:47 |
Tengu | quiquell|rover: also, my patch for the selinux issue is working, and should be hopefully gating today. maybe. | 05:47 |
Tengu | eventually. | 05:48 |
quiquell|rover | Tengu: Cool, I still see some timeouts, but at least now we don't have to look at validations | 05:48 |
Tengu | quiquell|rover: hmm ok. well, zuul's pretty loaded as well as the gate. | 05:49 |
ykarel | quiquell|rover, no i didn't look much there, but what i saw is stack creation and config-download were ok, just overcloudrc action stuck for long | 05:49 |
*** cylopez has joined #tripleo | 05:51 | |
quiquell|rover | ykarel: Yep, found that at one timeout, didn't find it elsewhere | 05:52 |
ykarel | quiquell|rover, really? i saw in multiple jobs | 05:52 |
quiquell|rover | ykarel: Can you add some of them to the lp of the timeout ? | 05:55 |
*** jistr has quit IRC | 05:55 | |
quiquell|rover | ykarel: or open new one, so we close the validations ? | 05:55 |
ykarel | quiquell|rover, okk will look in some time | 05:55 |
*** jistr has joined #tripleo | 05:56 | |
quiquell|rover | ykarel: Or give me the logs you found I will open it | 05:56 |
*** jfrancoa has joined #tripleo | 05:57 | |
*** mschuppert has joined #tripleo | 05:59 | |
quiquell|rover | mschuppert: Good morning, we have a queens promotion | 06:01 |
*** iranzo has joined #tripleo | 06:02 | |
mschuppert | quiquell|rover: perfect! will rerun the job | 06:03 |
quiquell|rover | mschuppert: Let me know if it works | 06:04 |
Tengu | ykarel quiquell|rover : lemme know if I can help a bit | 06:04 |
mschuppert | quiquell|rover: sure | 06:04 |
quiquell|rover | ykarel: opening the lp with the overcloud issue | 06:05 |
*** ksambor has joined #tripleo | 06:06 | |
openstackgerrit | Martin Schuppert proposed openstack/puppet-tripleo stable/queens: Revert "Revert "SSL support for haproxy -> novnc proxy connection"" https://review.openstack.org/594145 | 06:07 |
quiquell|rover | ykarel: https://bugs.launchpad.net/tripleo/+bug/17944 | 06:07 |
openstack | Launchpad bug 17944 in samba (Ubuntu) "samba: new changes from Debian require merging" [Medium,Fix released] - Assigned to Adam Conrad (adconrad) | 06:07 |
*** sanjayu_ has joined #tripleo | 06:08 | |
*** shyamb has quit IRC | 06:09 | |
*** nawar has quit IRC | 06:09 | |
*** shyamb has joined #tripleo | 06:11 | |
*** nawar has joined #tripleo | 06:11 | |
*** ratailor has joined #tripleo | 06:12 | |
nawar | hi | 06:12 |
*** jtomasek has joined #tripleo | 06:13 | |
*** holser_ has joined #tripleo | 06:15 | |
*** dmacpher_ has quit IRC | 06:15 | |
ykarel | quiquell|rover, wrong link | 06:15 |
quiquell|rover | ykarel: https://bugs.launchpad.net/tripleo/+bug/1794418 | 06:15 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 06:15 |
*** dmacpher has joined #tripleo | 06:16 | |
ykarel | quiquell|rover, adding few more links and the relation to timeout | 06:17 |
jaosorior | holser_: around? | 06:18 |
holser_ | jaosorior yeah | 06:18 |
quiquell|rover | ykarel: Let's work that | 06:18 |
holser_ | Good morning | 06:18 |
quiquell|rover | ykarel: Do you think it can be related to the guard added to tripleoclient ? | 06:18 |
jaosorior | holser_: was it you that added the topic to the stein forum that's titled: "Zero footprint installer, interests and progress" ? | 06:19 |
ykarel | quiquell|rover, might be but not sure, so i asked exactly when we started seeing it and in which branches, if it's around the time merged and master only then it can be related | 06:19 |
holser_ | jaosorior let me have a look at etherpad... | 06:19 |
* holser_ looking | 06:20 | |
jaosorior | holser_: https://etherpad.openstack.org/p/tripleo-forum-stein | 06:20 |
holser_ | jaosorior nope .... | 06:22 |
holser_ | I don't recall that I added that | 06:22 |
nawar | i need your help | 06:22 |
jaosorior | holser_: the color matched yours... so it must have been someone else.. don't know who though | 06:22 |
jaosorior | need to poke them to get more info on it | 06:22 |
holser_ | I understand but it was not me | 06:22 |
jaosorior | holser_: sure, no problem. | 06:23 |
jaosorior | thanks | 06:23 |
holser_ | You may add a comment and delete it in a couple of weeks if no one cares | 06:23 |
nawar | I'm trying to scale up my env of 2 nodes and add new compute but I'm getting this exception: KeyError: 'passwords' when redeploy | 06:24 |
jaosorior | nawar: what version? | 06:28 |
ykarel | quiquell|rover, commented https://bugs.launchpad.net/tripleo/+bug/1794418/comments/1 | 06:30 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 06:30 |
*** dtrainor has quit IRC | 06:33 | |
*** yprokule has joined #tripleo | 06:34 | |
*** aufi has joined #tripleo | 06:37 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Allow a containerized logrotate to access docker https://review.openstack.org/596274 | 06:37 |
*** shyamb has quit IRC | 06:38 | |
nawar | queens | 06:42 |
*** quiquell|rover is now known as quique|rover|brb | 06:43 | |
*** rdopiera has joined #tripleo | 06:43 | |
*** chkumar|off is now known as chkumar|ruck | 06:44 | |
chkumar|ruck | ykarel: is there some problem with overcloud deploy in timedout? | 06:45 |
ykarel | chkumar|ruck, which problem? | 06:45 |
chkumar|ruck | ykarel: looking at this bug https://bugs.launchpad.net/tripleo/+bug/1794418/ | 06:46 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 06:46 |
ykarel | chkumar|ruck, yes | 06:46 |
ykarel | seen in multiple jobs so issue is there | 06:46 |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-heat-templates master: Check if openstack-glance-registry is enabled before stopping it. https://review.openstack.org/603581 | 06:46 |
nawar | jaosorior: queens | 06:50 |
openstackgerrit | Merged openstack/tripleo-heat-templates stable/rocky: undercloud/stackrc: unset OS_* variables https://review.openstack.org/604936 | 06:51 |
chkumar|ruck | Tengu: Hello | 06:52 |
chkumar|ruck | Tengu: http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/5efcbac/logs/undercloud/home/zuul/undercloud_reinstall.log.txt.gz fs020 is still failing in master at same step with different error | 06:53 |
chkumar|ruck | TASK [Create /var/lib/config-data directory] *********************************** | 06:53 |
chkumar|ruck | 2018-09-26 01:33:54 | fatal: [undercloud]: FAILED! => {"changed": false, "msg": "path /var/lib/config-data/crond/etc/../usr/share/zoneinfo/UTC does not exist", "path": "/var/lib/config-data/crond/etc/../usr/share/zoneinfo/UTC", "state": "absent"} | 06:53 |
Tengu | chkumar|ruck: err, that's not related to my selinux work, that's for sure. | 06:56 |
Tengu | o____O | 06:57 |
Tengu | there isn't any recurse nor absent in the code. | 06:57 |
Tengu | wtf | 06:57 |
Tengu | chkumar|ruck: my patch was successful at least: https://review.rdoproject.org/r/#/c/16354/ | 06:58 |
*** shyamb has joined #tripleo | 06:58 | |
Tengu | chkumar|ruck: BUT it's still not merged. | 06:59 |
Tengu | might be that? | 06:59 |
openstackgerrit | Merged openstack/tripleo-specs master: Support for Podman in Stein https://review.openstack.org/602480 | 06:59 |
openstackgerrit | Merged openstack/tripleo-specs master: Improve upgrades_tasks CI coverage with standalone for Stein https://review.openstack.org/579854 | 06:59 |
chkumar|ruck | Tengu: Ah thanks :-) | 06:59 |
*** quique|rover|brb is now known as quiquell|rover | 07:00 | |
quiquell|rover | chkumar|ruck: o/ | 07:00 |
chkumar|ruck | Tengu: two patches are in zuul merge queue | 07:00 |
chkumar|ruck | https://review.openstack.org/#/c/605039/ and https://review.openstack.org/#/c/602703/ | 07:00 |
chkumar|ruck | hope this will promote master also | 07:00 |
chkumar|ruck | quiquell|rover: \o/ | 07:00 |
Tengu | chkumar|ruck: first one is mine | 07:00 |
chkumar|ruck | quiquell|rover: I am not sure this is a timedout case on rdo cloud http://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset018-master/918dfed/job-output.txt.gz | 07:02 |
chkumar|ruck | yesterday I have seen this type of one also | 07:02 |
quiquell|rover | chkumar|ruck: fs037 is still non voting why do we have it at master criteria ? | 07:02 |
quiquell|rover | chkumar|ruck: fs037 is updates | 07:02 |
*** rcernin has quit IRC | 07:02 | |
chkumar|ruck | quiquell|rover: nope | 07:02 |
chkumar|ruck | quiquell|rover: https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/dlrnapi_promoter/config/master.ini#L28 | 07:03 |
chkumar|ruck | it is still non-voting that's why | 07:03 |
quiquell|rover | chkumar|ruck: Ahh ok, I think I opened an old version | 07:04 |
chkumar|ruck | quiquell|rover: anyway fs037 is broken | 07:05 |
chkumar|ruck | quiquell|rover: good to have a bz | 07:05 |
chkumar|ruck | quiquell|rover: I will be looking into telemetry tempest issue why it is not working after fixes | 07:05 |
chkumar|ruck | and some tempest podman stuff | 07:05 |
chkumar|ruck | for fs027 | 07:05 |
*** amoralej|off is now known as amoralej | 07:06 | |
*** cylopez has quit IRC | 07:07 | |
*** cylopez has joined #tripleo | 07:07 | |
*** sanjayu_ has quit IRC | 07:08 | |
quiquell|rover | chkumar|ruck: fs027 arrg ok | 07:10 |
quiquell|rover | chkumar|ruck: I will look into overcloud deploy timeout | 07:10 |
chkumar|ruck | quiquell|rover: cool! | 07:10 |
quiquell|rover | chkumar|ruck: the overcloud deploy timeout is not promotion blocker ? | 07:10 |
*** waleedm has joined #tripleo | 07:12 | |
nawar | do I report bug ? or not | 07:13 |
openstackgerrit | Merged openstack/tripleo-common master: config: ignore missing server_id from the stack https://review.openstack.org/604483 | 07:14 |
openstackgerrit | Merged openstack/tripleo-common stable/queens: Avoid getting one-empty-element-list in blacklisted_hostnames. https://review.openstack.org/604165 | 07:14 |
*** Petersingh is now known as Petersingh|afk | 07:20 | |
*** psachin has quit IRC | 07:21 | |
quiquell|rover | ykarel: I don't find any pre-21st timeout with the overcloudrc issue, it has to be related | 07:21 |
ykarel | quiquell|rover, okk good to keep an eye, i think u are merging that in rocky too | 07:22 |
ykarel | so good to get some feedback from someone from mistral | 07:23 |
quiquell|rover | ykarel: It's like we have converted a tripleoclient failure into a timeout | 07:23 |
*** shardy has joined #tripleo | 07:23 | |
ykarel | yes | 07:23 |
quiquell|rover | ykarel: Failure can be to throw proper exception instead of just return in the guard | 07:24 |
quiquell|rover | ykarel: But we have to find why zaqar websocket connection is not open at the time | 07:25 |
quiquell|rover | ykarel: Can be infra | 07:25 |
ykarel | no idea | 07:25 |
*** Petersingh|afk is now known as Petersingh | 07:27 | |
*** psachin has joined #tripleo | 07:27 | |
*** shyamb has quit IRC | 07:29 | |
pvc | hi | 07:33 |
pvc | anyone experience stuck on this part | 07:34 |
pvc | 2018-09-26 07:08:36Z [overcloud.Controller.0.NetworkDeployment]: CREATE_IN_PROGRESS state changed | 07:34 |
*** shyamb has joined #tripleo | 07:35 | |
Tengu | chkumar|ruck: -.- post_failure. damn zuul. | 07:36 |
*** bogdando has joined #tripleo | 07:43 | |
*** shyamb has quit IRC | 07:45 | |
*** jpich has joined #tripleo | 07:48 | |
*** chem has joined #tripleo | 07:51 | |
*** jpena|off is now known as jpena | 07:51 | |
quiquell|rover | ykarel: I see warning at zaqar around the problem of overcloud deploy http://logs.openstack.org/47/604447/2/check/tripleo-ci-centos-7-containers-multinode/e18e77f/logs/undercloud/var/log/containers/zaqar/zaqar-server.log.txt.gz#_2018-09-25_20_02_53_063 | 07:57 |
*** rcernin has joined #tripleo | 07:57 | |
quiquell|rover | Also a WebSocket connection closed: None | 07:57 |
*** Petersingh is now known as Petersingh|lunch | 07:57 | |
ykarel | quiquell|rover, i guess those warning should be in success job as well | 07:58 |
ykarel | so seems unrelated if that's the case | 07:58 |
ykarel | websocket can be related | 07:58 |
*** ykarel is now known as ykarel|lunch | 07:59 | |
openstackgerrit | RedHat RDO CI proposed openstack/tripleo-heat-templates master: GATE CHECK for TripleO https://review.openstack.org/604298 | 08:00 |
openstackgerrit | RedHat RDO CI proposed openstack/tripleo-heat-templates stable/rocky: GATE CHECK for TripleO https://review.openstack.org/604293 | 08:00 |
*** akrivoka has joined #tripleo | 08:00 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates stable/queens: Allow a containerized logrotate to access docker https://review.openstack.org/601555 | 08:02 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates stable/pike: Allow a containerized logrotate to access docker https://review.openstack.org/605348 | 08:02 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates stable/rocky: Allow a containerized logrotate to access docker https://review.openstack.org/605349 | 08:03 |
*** Guest42266 has joined #tripleo | 08:04 | |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: Removes EXTRA_TAGS from toci_gate_test&quickstart.sh.j2 https://review.openstack.org/604768 | 08:08 |
therve | quiquell|rover: What kind of issues do you see related to zaqar? | 08:10 |
*** ooolpbot has joined #tripleo | 08:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 08:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1794418 | 08:10 |
*** ooolpbot has quit IRC | 08:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 08:10 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 08:10 |
quiquell|rover | therve: Running tripleo.deployment.v1.create_overcloudrc | 08:11 |
therve | That last bug? | 08:11 |
quiquell|rover | therve: give us this https://bugs.launchpad.net/tripleo/+bug/1794418 | 08:11 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 08:11 |
therve | Looking | 08:12 |
quiquell|rover | therve: suspect is https://review.openstack.org/#/c/603802/ | 08:12 |
holser_ | bandini I have assigned https://bugzilla.redhat.com/show_bug.cgi?id=1631674 back to me | 08:12 |
*** moshele has joined #tripleo | 08:12 | |
openstack | bugzilla.redhat.com bug 1631674 in python-tripleoclient "[UPGRADES][14] MySQL passwords ain't synchronized during when running deploy_steps_playbook.yaml" [Urgent,New] - Assigned to michele | 08:12 |
quiquell|rover | therve: What I don't really know is why do we end in a loop or why the websocket connection is close | 08:12 |
quiquell|rover | therve: The review fixed an issue at tripleoclient failing on a closed connection | 08:13 |
quiquell|rover | therve: Maybe doing a "return" is not correct | 08:13 |
quiquell|rover | therve: But I really want to know why the connection is closed | 08:13 |
*** hjensas has joined #tripleo | 08:15 | |
*** gkadam has joined #tripleo | 08:16 | |
therve | quiquell|rover: http://logs.openstack.org/91/587491/4/check/tripleo-ci-centos-7-containers-multinode/f618803/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2018-09-25_08_20_00 shows the log | 08:21 |
therve | So we don't get into your return clause | 08:21 |
*** ykarel|lunch is now known as ykarel | 08:21 | |
therve | That return is bogus though. It should really be an exception | 08:23 |
quiquell|rover | therve: I don't think it's exception, just do yield with empty values | 08:24 |
quiquell|rover | ykarel: wait_for_messages at stuff without connection is like "We don't have more messages" | 08:24 |
quiquell|rover | therve: ^ | 08:24 |
therve | quiquell|rover: Why? | 08:24 |
*** noama has joined #tripleo | 08:26 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Add podman support to validate-tempest role https://review.openstack.org/605356 | 08:26 |
therve | If we still publish messages, the client won't wait for them | 08:26 |
*** shyamb has joined #tripleo | 08:26 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart master: Switch fs027 to deploy with podman https://review.openstack.org/600517 | 08:28 |
quiquell|rover | therve: The guard fixed an issue of tripleo-client failing because an Exception was raised there (stack was correctly installed though) | 08:29 |
quiquell|rover | therve: was like a false negative | 08:29 |
quiquell|rover | therve: If I add a raise there we go back to the issue | 08:29 |
*** sanjayu_ has joined #tripleo | 08:29 | |
quiquell|rover | therve: now feels like we are going to the next issue after cleaning up the previous | 08:29 |
therve | quiquell|rover: Or maybe it was intermittent and nothing changed? | 08:30 |
*** jistr has quit IRC | 08:30 | |
therve | The fact that one CI run passed doesn't mean much | 08:30 |
quiquell|rover | therve: Can i add raise there and tripleoclient not failing ? I don't want a false negative again | 08:30 |
quiquell|rover | therve: It was not passing | 08:30 |
*** fhubik has joined #tripleo | 08:30 | |
therve | Well it was before, so what changed? | 08:30 |
therve | At the very least, I would try to catch the error from recv, instead of returning right away | 08:31 |
therve | It's possible that we read a bunch of messages successfully, and then fail | 08:31 |
*** jistr has joined #tripleo | 08:31 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Add new featureset 056 for standalone upgarde. https://review.openstack.org/605363 | 08:33 |
*** derekh has joined #tripleo | 08:33 | |
*** derekh has joined #tripleo | 08:34 | |
*** shardy has quit IRC | 08:35 | |
*** shardy has joined #tripleo | 08:36 | |
quiquell|rover | therve: but where is the infinite loop ? | 08:37 |
quiquell|rover | therve: the while true ? | 08:37 |
therve | quiquell|rover: Yeah? You didn't break if the connection is opened | 08:37 |
quiquell|rover | therve: maybe we are not consuming stuff with recv ? | 08:37 |
therve | quiquell|rover: Can we get the ansible-errors.json file? | 08:38 |
therve | It's not in the CI logs | 08:38 |
quiquell|rover | therve: The issue is exactly that, after the long run the file is not there | 08:39 |
therve | quiquell|rover: Should we fix that instead? :) | 08:39 |
quiquell|rover | therve: Hehe agree, we were just suspicious of the guard (and I am not ok with the 'return') | 08:40 |
Tengu | so, validations weren't the culprit for the time_out apparently :( | 08:41 |
*** nawar has quit IRC | 08:42 | |
quiquell|rover | Tengu: Looks like we have other issues, but now that we have removed validations it's easier | 08:42 |
quiquell|rover | Tengu: reducing variables when looking at timeouts is gold | 08:42 |
quiquell|rover | Tengu: gates are better also | 08:43 |
Tengu | :) | 08:43 |
Tengu | apparently time_outs are mainly for t-h-t changes. | 08:43 |
Tengu | quiquell|rover: if you want a fresh look: http://logs.openstack.org/39/605039/1/check/tripleo-ci-centos-7-scenario004-multinode-oooq-container/1a10cd5/ time_out and that one got a post_failure: http://logs.openstack.org/39/605039/1/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/beb1223/ | 08:45 |
*** owalsh_ is now known as owalsh | 08:45 | |
quiquell|rover | Tengu: hate you!!! this is different :-( | 08:46 |
Tengu | sorry ;) | 08:47 |
quiquell|rover | no overcloudrc issue | 08:47 |
chkumar|ruck | Tengu: post_Failure have everything passing | 08:47 |
quiquell|rover | Tengu: 1:30 overcloud deploy | 08:47 |
quiquell|rover | that's really wrong | 08:47 |
*** tosky has joined #tripleo | 08:48 | |
Tengu | chkumar|ruck: ah, it doesn't fail the whole thing? | 08:48 |
quiquell|rover | Tengu: later | 08:48 |
chkumar|ruck | Tengu: it is just a post failure failed to upload to logs somehow | 08:48 |
quiquell|rover | chkumar|ruck: But the other is not | 08:48 |
chkumar|ruck | http://logs.openstack.org/39/605039/1/check/tripleo-ci-centos-7-scenario003-multinode-oooq-container/beb1223/job-output.txt.gz | 08:49 |
*** apetrich has quit IRC | 08:49 | |
chkumar|ruck | quiquell|rover: yes, other is timed out | 08:49 |
quiquell|rover | chkumar|ruck: damn overcloud ARA task are wrong... we have to fix it | 08:50 |
*** nawar has joined #tripleo | 08:51 | |
sshnaidm | stevebaker, hi, I still wait for explanation why you merged this patch when there is discussion in progress there? https://review.openstack.org/#/c/599358/ | 08:52 |
*** chkumar|ruck has quit IRC | 08:58 | |
*** chandankumar has joined #tripleo | 08:59 | |
*** chandankumar is now known as chkumar|ruck | 09:00 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Add new featureset 056 for standalone upgarde. https://review.openstack.org/605363 | 09:00 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Put create repo script into its own tasks file. https://review.openstack.org/605369 | 09:00 |
*** ykarel is now known as ykarel|away | 09:01 | |
*** sileht has quit IRC | 09:02 | |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Refactor workflow actions https://review.openstack.org/603080 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: convert PlansActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603081 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NodesActions to named expoers to avoid using 'this' in thunks https://review.openstack.org/603082 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ValidationsActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603083 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert WorkflowExecutionsActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603084 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert CurrentPlanActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603085 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert FlavorsActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603086 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert I18nActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603087 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert LoggerActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603088 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert LoginActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603089 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NetworksActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603090 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert EnvironmentConfigurationActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603091 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NotificationActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603092 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ParametersActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603093 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert RegisterNodesActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603094 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert RolesActions to named exports to avoid using 'this' in thunks https://review.openstack.org/603095 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert StacksActions to named exports to avoid using 'this' in thunk https://review.openstack.org/603096 | 09:03 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ZaqarActions to named exports to avoid using 'this' in thunk https://review.openstack.org/603097 | 09:03 |
*** Petersingh|lunch is now known as Petersingh | 09:04 | |
*** ykarel|away has quit IRC | 09:05 | |
*** dtrainor has joined #tripleo | 09:07 | |
quiquell|rover | therve: The run from the guard review http://logs.openstack.org/02/603802/1/check/tripleo-ci-centos-7-containers-multinode/1ae99fc/ | 09:08 |
quiquell|rover | therve: it passes | 09:08 |
*** kopecmartin|off is now known as kopecmartin | 09:09 | |
therve | OK let's see | 09:09 |
*** ooolpbot has joined #tripleo | 09:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 09:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1794418 | 09:10 |
*** ooolpbot has quit IRC | 09:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 09:10 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,Triaged] - Assigned to Quique Llorente (quiquell) | 09:10 |
chkumar|ruck | sshnaidm: bogdando https://review.openstack.org/#/c/605038/ needs +W on this | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert PlansActions to named exports https://review.openstack.org/603081 | 09:10 |
*** moshele has quit IRC | 09:10 | |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NodesActions to named expoers https://review.openstack.org/603082 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ValidationsActions to named exports https://review.openstack.org/603083 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert WorkflowExecutionsActions to named exports https://review.openstack.org/603084 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert CurrentPlanActions to named exports https://review.openstack.org/603085 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert FlavorsActions to named exports https://review.openstack.org/603086 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert I18nActions to named exports https://review.openstack.org/603087 | 09:10 |
*** dtantsur|afk is now known as dtantsur | 09:10 | |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert LoggerActions to named exports https://review.openstack.org/603088 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert LoginActions to named exports https://review.openstack.org/603089 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NetworksActions to named exports https://review.openstack.org/603090 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert EnvironmentConfigurationActions to named exports https://review.openstack.org/603091 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert NotificationActions to named exports https://review.openstack.org/603092 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ParametersActions to named exports https://review.openstack.org/603093 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert RegisterNodesActions to named exports https://review.openstack.org/603094 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert RolesActions to named exports https://review.openstack.org/603095 | 09:10 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui master: Convert ZaqarActions to named exports https://review.openstack.org/603097 | 09:10 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart-extras master: WIP: Add necessary bits for N-1->N standalone upgrade. https://review.openstack.org/604736 | 09:11 |
pvc | hi | 09:12 |
pvc | anyone know why i cant ssh to my overcloud instance but its state is running? | 09:13 |
therve | quiquell|rover: "Notifying subscriber" is missing | 09:13 |
shardy | pvc: didn't we discuss this yesterday? | 09:13 |
pvc | i cant install libguestfs :( | 09:13 |
quiquell|rover | therve: where ? | 09:13 |
pvc | so i cant customize the image | 09:13 |
therve | quiquell|rover: http://logs.openstack.org/91/587491/4/check/tripleo-ci-centos-7-containers-multinode/f618803/logs/undercloud/var/log/containers/zaqar/zaqar.log.txt.gz#_2018-09-25_08_19_59_507 | 09:14 |
pvc | but it is stuck on NetworkDeployment shardy | 09:14 |
therve | We get the message, but it's not pushed to websocket | 09:14 |
shardy | pvc: the nova ACTIVE state only means the node booted, it doesn't care about networking etc, so my guess is there is some issue with the network | 09:14 |
shardy | pvc: Ok that probably confirms network issues, you can also set the root password via cloud-init without customizing the image, sec | 09:14 |
pvc | yes | 09:14 |
shardy | out of interest why can't you install libguestfs? | 09:15 |
pvc | please :( | 09:15 |
pvc | where | 09:15 |
pvc | there is an issue on our firewall device | 09:15 |
pvc | but i dont have a privilege to view it | 09:15 |
pvc | so i cant do anything | 09:15 |
shardy | pvc: Ok one moment I'll find a cloud-init example | 09:15 |
*** rcernin has quit IRC | 09:16 | |
shardy | https://github.com/openstack/tripleo-heat-templates/blob/master/firstboot/userdata_root_password.yaml | 09:16 |
shardy | Then you'd include an environment file like: | 09:16 |
quiquell|rover | therve: good and bad looks the same to me for req-2a4f6fca-d09e-4174-965c-ac225d0b60d6 | 09:17 |
quiquell|rover | therve: at good one | 09:17 |
*** salmankhan has joined #tripleo | 09:17 | |
pvc | where will i put the password sharda | 09:17 |
pvc | shardy* | 09:17 |
shardy | give me a moment please | 09:18 |
therve | quiquell|rover: I wonder if it's not a race condition | 09:18 |
therve | quiquell|rover: http://logs.openstack.org/91/587491/4/check/tripleo-ci-centos-7-containers-multinode/f618803/logs/undercloud/var/log/containers/zaqar/zaqar-server.log.txt.gz#_2018-09-25_08_20_00_505 | 09:18 |
therve | It's happening *after* the response has been sent | 09:18 |
*** sileht has joined #tripleo | 09:19 | |
shardy | pvc: http://paste.openstack.org/show/730925/ | 09:19 |
dtantsur | jaosorior: morning! did you have a chance to submit the undercloud edge forum proposal? | 09:20 |
*** waleedm has quit IRC | 09:20 | |
pvc | i will just run this shardy openstack overcloud deploy --templates -e root_password_env.yaml | 09:20 |
shardy | pvc: yes, but if it's baremetal don't you have some nic config scripts as well? | 09:21 |
shardy | s/scripts/templates | 09:21 |
therve | quiquell|rover: http://paste.openstack.org/show/730927/ | 09:21 |
therve | tripleo is too fast | 09:21 |
shardy | that should give you root access via the ipmi console, anyway | 09:21 |
therve | I never thought I'd type that | 09:21 |
*** waleedm has joined #tripleo | 09:21 | |
*** waleedm has quit IRC | 09:22 | |
pvc | yes im just running the | 09:22 |
pvc | openstack overcloud deploy | 09:22 |
quiquell|rover | therve: So we are waiting before subscriber is created at zaqar ? | 09:22 |
pvc | i already created the file root_password shardy | 09:23 |
*** waleedm has joined #tripleo | 09:23 | |
shardy | pvc: Ok, well we have a default network config, normally that's not what you want for baremetal boxes though so you may want to check the docs | 09:23 |
shardy | at least if you have access you can then debug it | 09:23 |
therve | quiquell|rover: No, we send the message before waiting for it | 09:23 |
therve | And there is no mechanism to catch up | 09:24 |
chem | quiquell|rover: hum, a little question, when you're inheriting from a job def, if you set, say "playbooks" variable, is that an override or are they merged with those of the parent definition ? | 09:24 |
pvc | it is okay now shardy openstack overcloud deploy --templates -e root_password_env.yaml | 09:24 |
quiquell|rover | chem: override | 09:24 |
quiquell|rover | chem: Don't know if you can concatenate them though | 09:24 |
shardy | pvc: yes that's what I put in the paste, but as I mentioned you may well need some additional -e arguments to pass a valid network configuration for your hardware | 09:25 |
shardy | pvc: which nodes are you able to connect to and which are not working? | 09:25 |
chem | quiquell|rover: thanks, that's what I was thinking, but it would have been cool to have merging :) | 09:25 |
pvc | i can access my compute but not my controller | 09:25 |
quiquell|rover | therve: so, we ask the client to create overcloudrc, the client send it to zaqar and zaqar send it to mistral ? | 09:25 |
shardy | https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-resource-registry-puppet.j2.yaml#L34 | 09:25 |
pvc | we will deploy again wait | 09:25 |
shardy | https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-resource-registry-puppet.j2.yaml#L27 | 09:26 |
therve | quiquell|rover: No, we ask the client, the client sends it to mistral, mistral answers to zaqar | 09:26 |
shardy | pvc: Ok, that is most likely because we default to a bridged config that works for VMs for the Controller, but there's a noop config for the other roles | 09:26 |
quiquell|rover | therve: and client wait for message at zaqar ? | 09:26 |
shardy | I expect you'll need to pass a valid configuration that depends on your hardware and network setup | 09:26 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Better support for the local devbox cases https://review.openstack.org/593567 | 09:26 |
pvc | Where can i find a docs about passing a valid configuration? | 09:27 |
therve | quiquell|rover: Yes | 09:27 |
therve | quiquell|rover: But if mistral answers too quickly, we don't get it | 09:27 |
ssbarnea | quiquell|rover: https://review.openstack.org/#/c/605021/ really needed, I was hit by it while running reproducer, lucky bogdan already created a CR to fix it. | 09:27 |
quiquell|rover | therve: it answers before we wait for it ? | 09:27 |
therve | quiquell|rover: Yes | 09:27 |
*** ssbarnea|bkp has quit IRC | 09:28 | |
quiquell|rover | therve: so we have to start waiting before sending with another thread or the like | 09:28 |
shardy | https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/network_isolation.html#creating-custom-interface-templates | 09:28 |
quiquell|rover | therve: or registering callback | 09:29 |
therve | quiquell|rover: No the change I pasted is good enough | 09:29 |
pvc | but shardy internet connection is not an issue right since i already have the image? | 09:29 |
therve | You just need to create the websocket client beforehand | 09:29 |
shardy | pvc: ^^ it's in the network isolation section, because in most cases baremetal setups want to use that, but even without you may need to configure the nics, or at least disable that default bridge setup | 09:29 |
shardy | pvc: yes | 09:29 |
quiquell|rover | therve: Ahh I see | 09:29 |
pvc | just to give you an overview shardy im using a Lenovo Server | 09:30 |
quiquell|rover | therve: This can also cause connection not there I suppose and old kind of issues | 09:30 |
quiquell|rover | therve: and about the guard return thing, going to raise an exception | 09:31 |
quiquell|rover | therve: is that ok ? | 09:31 |
quiquell|rover | therve: thanks so much man !! | 09:31 |
therve | quiquell|rover: I'd just catch it same way as timeout | 09:31 |
shardy | pvc: Ok well the controllers get configured via https://github.com/openstack/tripleo-heat-templates/blob/master/net-config-bridge.j2.yaml | 09:32 |
shardy | pvc: so for baremetal I expect you will at least need to set NeutronPublicInterface https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/role.role.j2.yaml#L610 | 09:33 |
openstackgerrit | Athlan-Guyot sofer proposed openstack-infra/tripleo-ci master: WIP: New workflow for standalone upgrade https://review.openstack.org/604706 | 09:33 |
shardy | e.g if that interface isn't the first nic on the box it won't work | 09:33 |
openstackgerrit | Thomas Herve proposed openstack/python-tripleoclient master: Start websocket client before workflows https://review.openstack.org/605377 | 09:33 |
shardy | you can use the real interface name there, or os-net-config sorts the active (link up) interfaces, which doesn't always automatically mean "nic1" is the right device outside of basic VM setups | 09:33 |
therve | quiquell|rover: ^^^, I fixed the other ones in that file | 09:34 |
shardy | it's kind of hard to make that a default that works for any hardware/network setup | 09:34 |
therve | didn't check elsewhere | 09:34 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Put create repo script into its own tasks file. https://review.openstack.org/605369 | 09:34 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Add new featureset 056 for standalone upgarde. https://review.openstack.org/605363 | 09:34 |
openstackgerrit | Athlan-Guyot sofer proposed openstack-infra/tripleo-ci master: WIP: New workflow for standalone upgrade https://review.openstack.org/604706 | 09:35 |
quiquell|rover | therve: some of them are ok, will fix all | 09:35 |
pvc | shardy should i edit this https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/role.role.j2.yaml#L55 | 09:35 |
pvc | to the first nic of my controller baremetal | 09:36 |
shardy | pvc: No, you pass NeutronPublicInterface in the parameter_defaults of a -e file like in my paste | 09:36 |
shardy | parameter_defaults: | 09:36 |
shardy | NeutronPublicInterface: em2 | 09:37 |
shardy | or whatever | 09:37 |
shardy | probably worth browsing the docs as there are a lot of examples of this sort of thing | 09:37 |
*** dsneddon has quit IRC | 09:37 | |
therve | quiquell|rover: I can do it in the same patch, that'll make more sense no? | 09:37 |
pvc | http://paste.openstack.org/show/730929/ shardy | 09:38 |
quiquell|rover | therve: are you preparing the patch with the solution ? | 09:38 |
therve | quiquell|rover: https://review.openstack.org/605377 | 09:39 |
therve | Thought you saw it | 09:39 |
quiquell|rover | therve: nope, thanks ! | 09:40 |
pvc | but the allocation pool im using in the ctlplane subnet is the real ip of the baremetal shardy | 09:40 |
quiquell|rover | therve: Checking if we suffer it elsewhere | 09:41 |
pvc | where can i put the file shardy? | 09:41 |
pvc | i put here | 09:42 |
pvc | directory /usr/share/openstack-tripleo-heat-templates | 09:43 |
quiquell|rover | therve: found another | 09:43 |
therve | quiquell|rover: Yeah me too | 09:43 |
quiquell|rover | therve: plan_management.py | 09:43 |
openstackgerrit | Thomas Herve proposed openstack/python-tripleoclient master: Start websocket client before workflows https://review.openstack.org/605377 | 09:43 |
therve | quiquell|rover: ^^ | 09:43 |
quiquell|rover | therve: maybe this have to be encapsulated in a function | 09:43 |
therve | quiquell|rover: I'll leave you the refactoring :) | 09:44 |
quiquell|rover | therve: hufff, with the latency of openstack merges it would take ages :-) | 09:44 |
quiquell|rover | reFUCKtoring | 09:44 |
shardy | pvc: you should put user generated -e files in your home directory, not /usr/share/openstack-tripleo-heat-templates - that's owned by an RPM package | 09:44 |
shardy | and you need root to write to it... | 09:44 |
shardy | where in your home dir is up to you, for testing anywhere is fine, for production a lot of folks have a git controlled directory and a wrapper script | 09:45 |
quiquell|rover | therve: about proper error message, I suppose is difficult since we don't have the socket | 09:45 |
pvc | noted on this | 09:45 |
pvc | it is working now | 09:45 |
quiquell|rover | therve: Humm why don't we see this in rocky, adding the guard has found this ? | 09:46 |
quiquell|rover | therve: the guard is not merged to rocky yet | 09:46 |
pvc | i will let you know shardy openstack overcloud deploy --templates -e root_password_env.yaml -e config.yaml | 09:46 |
pvc | this is my config.yaml http://paste.openstack.org/show/730929/ shardy | 09:46 |
shardy | pvc: Ok, is "int1" a device name on your hardware? | 09:47 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart-extras master: Only ask for a prompt when safe teardown requested https://review.openstack.org/605379 | 09:48 |
*** shardy is now known as shardy_mtg | 09:48 | |
pvc | yes | 09:49 |
pvc | the device name of my hardware | 09:49 |
*** nawar has quit IRC | 09:50 | |
*** nawar has joined #tripleo | 09:50 | |
*** shyamb has quit IRC | 09:51 | |
*** gfidente has joined #tripleo | 09:53 | |
*** waleedm has quit IRC | 10:01 | |
*** shyamb has joined #tripleo | 10:04 | |
*** jfrancoa has quit IRC | 10:04 | |
pvc | shardy it doesnt get an ip address | 10:04 |
pvc | but the root password worked | 10:04 |
pvc | i can login to it now | 10:04 |
*** jfrancoa has joined #tripleo | 10:05 | |
pvc | but if we restart the network service it can get an IP address shardy* | 10:05 |
quiquell|rover | jaosorior: Possible fix for new timeouts https://review.openstack.org/#/c/605377/ | 10:06 |
quiquell|rover | chkumar|ruck: ^ | 10:06 |
pvc | and it is stuck on NetworkDeployment shardy | 10:09 |
*** ooolpbot has joined #tripleo | 10:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 10:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 10:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 10:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 10:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1794418 | 10:10 |
*** ooolpbot has quit IRC | 10:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 10:10 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,In progress] - Assigned to Thomas Herve (therve) | 10:10 |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient master: Raise proper exception at webscocket close https://review.openstack.org/605387 | 10:11 |
pvc | shardy are you there | 10:11 |
quiquell|rover | therve: About the guard https://review.openstack.org/#/c/605387/ | 10:11 |
*** shardy_mtg has quit IRC | 10:11 | |
quiquell|rover | therve: this is it ? | 10:11 |
pvc | rdo hi | 10:12 |
pvc | quiquell|rover hi | 10:12 |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient master: Raise proper exception at webscocket close https://review.openstack.org/605387 | 10:12 |
chkumar|ruck | quiquell|rover: ack check | 10:14 |
pvc | chkumar|ruck hi | 10:14 |
chkumar|ruck | quiquell|rover: do we need to keep doc guard patch for rocky? | 10:15 |
quiquell|rover | chkumar|ruck: You have read my mind, going to refactor it with the exception | 10:15 |
chkumar|ruck | quiquell|rover: hehe | 10:15 |
quiquell|rover | chkumar|ruck: but it has +2 there :-) | 10:15 |
pvc | anyone not busy here? | 10:15 |
quiquell|rover | chkumar|ruck: Let's do the right thing | 10:16 |
*** shyamb has quit IRC | 10:16 | |
*** fhubik is now known as fhubik|brb | 10:18 | |
quiquell|rover | chkumar|ruck: Humm thinking about it, let's first test the exception at master | 10:18 |
chkumar|ruck | quiquell|rover: yup, that would be better | 10:18 |
quiquell|rover | chkumar|ruck: And then if the stuff is not merged rocky/queens let refactor it | 10:18 |
chkumar|ruck | cool | 10:18 |
quiquell|rover | chkumar|ruck: if merged, well cherry-pick | 10:18 |
*** dciabrin has quit IRC | 10:21 | |
chkumar|ruck | quiquell|rover: ack | 10:21 |
quiquell|rover | bogdando: We need this to fix timeouts https://review.openstack.org/#/c/605377/ | 10:21 |
chkumar|ruck | pvc: Hello | 10:21 |
quiquell|rover | chkumar|ruck: we have to merge this https://review.openstack.org/#/c/605377/ | 10:21 |
pvc | hi chkumar | 10:21 |
pvc | my overcloud deploy is getting stuck on NetworkDeployment, after checking the instance there is an issue with the network service: "no link present check cable" | 10:22 |
chkumar|ruck | pvc: may be marios bogdando Tengu can help on that ^^ | 10:23 |
pvc | hi marios bogdando Tengu | 10:23 |
Tengu | pvc: heya. are all your network interfaces connected to something? | 10:24 |
pvc | what do you mean connected to something Tengu | 10:24 |
chkumar|ruck | jaosorior: we need to patch https://review.openstack.org/#/c/605377/2 for timed out | 10:24 |
pvc | i overrode the network as specified by shardy here http://paste.openstack.org/show/730929/ | 10:25 |
Tengu | pvc: well, if "no link present check cable" is shown, that seems to point to some hardware issue, like a disconnected network interface. | 10:25 |
Tengu | pvc: or maybe faulty cable | 10:25 |
pvc | but the network interface is not the same as the network interface of my baremetal | 10:26 |
pvc | it is okay? | 10:26 |
pvc | but if we do an ifup <interface name> it can get an IP address | 10:26 |
*** Petersingh_ has joined #tripleo | 10:27 | |
*** Petersingh_ is now known as Petersingh|afk | 10:29 | |
openstackgerrit | Harald Jensås proposed openstack/python-tripleoclient master: Undercloud Validations - Deprecated (replaced/removed) opts https://review.openstack.org/604923 | 10:29 |
*** Petersingh has quit IRC | 10:29 | |
*** dciabrin has joined #tripleo | 10:33 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart master: Remove dependency on github for cloning https://review.openstack.org/605388 | 10:34 |
chkumar|ruck | marios: sshnaidm can we merge this https://review.openstack.org/#/c/575588/ running tempest on standalone | 10:37 |
chkumar|ruck | later on we can extend it to full tempest api and scenario test | 10:37 |
sshnaidm | chkumar|ruck, +w | 10:39 |
*** phuongnh has quit IRC | 10:39 | |
chkumar|ruck | sshnaidm: thanks :-) | 10:39 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: Assure copied image is owned by current user https://review.openstack.org/596428 | 10:41 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Put create repo script into its own tasks file. https://review.openstack.org/605369 | 10:42 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-quickstart master: Add new featureset 056 for standalone upgarde. https://review.openstack.org/605363 | 10:42 |
*** dmacpher has quit IRC | 10:43 | |
*** dmacpher has joined #tripleo | 10:44 | |
pvc | hi rdo | 10:45 |
pvc | Tengu what do i need to do? | 10:45 |
*** Petersingh|afk is now known as Petersingh | 10:51 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: Assure copied image is owned by current user https://review.openstack.org/596428 | 10:52 |
jaosorior | dtantsur: I did | 10:53 |
*** gvrangan has joined #tripleo | 10:53 | |
jaosorior | chkumar|ruck: ack | 10:53 |
dtantsur | thanks jaosorior | 10:54 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: Assure copied image is owned by current user https://review.openstack.org/596428 | 10:54 |
*** shyamb has joined #tripleo | 10:55 | |
pvc | dtantsur im deploying now | 10:56 |
openstackgerrit | Derek Higgins proposed openstack/tripleo-heat-templates master: Add scenario 012 - overlcoud baremetal+ansible-ml2 https://review.openstack.org/579603 | 10:59 |
*** thrash|g0ne is now known as thrash | 10:59 | |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci master: [WIP] Testing ansible-ml2 job https://review.openstack.org/582294 | 11:01 |
openstackgerrit | Quique Llorente proposed openstack/tripleo-quickstart master: Use rdo mirror for q,p,o buildlogs at promotions https://review.openstack.org/604324 | 11:02 |
quiquell|rover | chkumar|ruck: What else do we have to look up ? | 11:07 |
*** aufi has quit IRC | 11:07 | |
mschuppert | quiquell|rover: that issue got resolved with the promotion! but there is a new one with gem versions http://logs.openstack.org/45/594145/5/check/puppet-openstack-unit-4.8-centos-7/4d03794/job-output.txt.gz#_2018-09-26_08_06_58_110082 -> https://review.openstack.org/#/c/605350/ | 11:11 |
*** udesale has quit IRC | 11:12 | |
pvc | hi dtantsur | 11:12 |
*** psachin has quit IRC | 11:13 | |
pvc | hi etingof dtantsur. two of my servers don't get an ip address and the error is: Determining IP information for ens4f0... failed; no link present. Check cable? | 11:13 |
dtantsur | pvc: could you please not spam us in several channels simultaneously? | 11:14 |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci master: [WIP] add a job to test the reproducer https://review.openstack.org/604232 | 11:15 |
openstackgerrit | Honza Pokorny proposed openstack/tempest-tripleo-ui master: Add basic project structure https://review.openstack.org/575730 | 11:15 |
pvc | im sorrry dtantsur | 11:15 |
openstackgerrit | Martin André proposed openstack/tripleo-common master: Add wrapper for openshift-ansible docker command https://review.openstack.org/605399 | 11:15 |
*** pcaruana has quit IRC | 11:15 | |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Introduce OpenShiftGlusterNodeVars heat param https://review.openstack.org/604724 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Make glusterfs the default sc when deploying with CNS https://review.openstack.org/604725 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Consolidate openshift-ansible global variables https://review.openstack.org/604726 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Add heat param for openshift prerequisites playbook https://review.openstack.org/604338 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Do not wipe disks on OpenShift gluster nodes https://review.openstack.org/605127 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Remove unused networks from OpenShift roles https://review.openstack.org/604727 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Deploy openshift all in one in scenario009 https://review.openstack.org/603780 | 11:16 |
openstackgerrit | Martin André proposed openstack/tripleo-heat-templates master: Use openshift-ansible container instead of RPMs https://review.openstack.org/583868 | 11:16 |
*** mathlin has quit IRC | 11:16 | |
openstackgerrit | Michal Pryc proposed openstack/instack-undercloud master: Fix curly brackets for ntp::servers: to prevent HTML escaping https://review.openstack.org/605400 | 11:17 |
openstackgerrit | Martin André proposed openstack/tripleo-common master: Add wrapper for openshift-ansible docker command https://review.openstack.org/605399 | 11:19 |
*** jpena is now known as jpena|lunch | 11:21 | |
*** Petersingh is now known as Petersingh|afk | 11:23 | |
*** psachin has joined #tripleo | 11:28 | |
honza | jaosorior: mandre: could you have a look at this patch? https://review.openstack.org/#/c/589572/ | 11:30 |
*** pvc_ has joined #tripleo | 11:30 | |
*** aufi has joined #tripleo | 11:30 | |
chkumar|ruck | quiquell|rover: currently everything is under control | 11:31 |
jaosorior | honza: ack | 11:32 |
*** Petersingh|afk has quit IRC | 11:32 | |
*** pvc has quit IRC | 11:32 | |
honza | mwhahaha: hey, about this patch https://review.openstack.org/#/c/599400/ I admittedly don't know much about the way networking is configured in oooq, but i'm happy to work on it given some guidance --- would you mind elaborating on your last comment? | 11:33 |
honza | jaosorior: thanks! | 11:34 |
mandre | honza: shouldn't we make the expire period configurable? | 11:34 |
chkumar|ruck | quiquell|rover: for queens, fs002 and fs020 are problematic with overcloud prepare image: ction_spec_name\nInvalidActionException: Failed to find action [action_name=baremetal_introspection.get_status]\n'} | 11:35 |
chkumar|ruck | http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/aebdd5c/logs/undercloud/home/zuul/overcloud_prep_images.log.txt.gz and | 11:35 |
chkumar|ruck | http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload/fd3bef2/logs/undercloud/home/zuul/overcloud_prep_images.log.txt.gz | 11:35 |
honza | mandre: good point! | 11:35 |
chkumar|ruck | quiquell|rover: I am not sure what to do, as it has also been seen previously in periodic jobs | 11:35 |
mandre | honza: we can possibly interpret a value as | 11:36 |
mandre | as 'turn it off' | 11:36 |
mandre | so we don't need to introduce a new param | 11:37 |
chkumar|ruck | quiquell|rover: and one with http://logs.rdoproject.org/openstack-periodic-24hr/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-pike/a1840be/logs/undercloud/var/log/mistral/executor.log.txt.gz | 11:37 |
chkumar|ruck | for pike | 11:37 |
honza | mandre: so just replace the expires_by_type with a default? | 11:38 |
honza | mandre: ie if you're configuring the expire period, you might as well configure the mimetypes | 11:39 |
honza | or contenttypes or whatever it's called :) | 11:39 |
mandre | honza: yeah that makes sense | 11:39 |
chkumar|ruck | quiquell|rover: for master promotion only fs020 is blocking na? | 11:40 |
chkumar|ruck | rocky green | 11:40 |
chkumar|ruck | queens fs020 and fs002 | 11:40 |
chkumar|ruck | pike fs016 | 11:40 |
quiquell|rover | mschuppert: How is the rootwrap going ? | 11:42 |
*** dprince has joined #tripleo | 11:45 | |
*** lblanchard has joined #tripleo | 11:47 | |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient stable/rocky: Add a guard to break if no connection https://review.openstack.org/603804 | 11:47 |
weshay | chkumar|ruck, quiquell|rover if you guys want a break I can attend the program call | 11:50 |
weshay | we're green, thanks to you guys | 11:50 |
quiquell|rover | weshay: thanks | 11:50 |
*** raildo has joined #tripleo | 11:51 | |
*** nawar has quit IRC | 11:53 | |
*** nawar has joined #tripleo | 11:54 | |
chkumar|ruck | weshay: thanks :-) | 11:54 |
*** shyamb has quit IRC | 11:55 | |
weshay | rdo cloud jobs still at 30% success rate :( | 11:56 |
openstackgerrit | Honza Pokorny proposed openstack/puppet-tripleo master: Add apaches expires directive for js and css files https://review.openstack.org/589572 | 11:58 |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient master: Raise proper exception at webscocket close https://review.openstack.org/605387 | 11:58 |
*** rh-jelabarre has joined #tripleo | 11:58 | |
*** trown|outtypewww is now known as trown | 11:59 | |
*** shyamb has joined #tripleo | 12:00 | |
openstackgerrit | Quique Llorente proposed openstack/python-tripleoclient stable/rocky: Add a guard to break if no connection https://review.openstack.org/603804 | 12:01 |
*** Petersingh|afk has joined #tripleo | 12:02 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: Sudo virt-resize for libvirt reproducer https://review.openstack.org/605021 | 12:05 |
openstackgerrit | Daniel Alvarez proposed openstack/tripleo-heat-templates master: Configure http/https on OVN Metadata service to talk to Nova https://review.openstack.org/605406 | 12:06 |
*** leanderthal has joined #tripleo | 12:09 | |
*** ooolpbot has joined #tripleo | 12:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 12:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 12:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1794418 | 12:10 |
*** ooolpbot has quit IRC | 12:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 12:10 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,In progress] - Assigned to Thomas Herve (therve) | 12:10 |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Switch previous release of master from 'queens' to 'rocky' https://review.openstack.org/590774 | 12:10 |
*** apetrich has joined #tripleo | 12:11 | |
weshay | chkumar|ruck, arxcruz any idea why we have no tempest.html or stackviz? http://logs.openstack.org/19/603419/2/gate/tripleo-ci-centos-7-undercloud-containers/12fa54c/logs/undercloud/home/zuul/tempest.log.txt.gz | 12:12 |
*** Petersingh|afk is now known as Petersingh | 12:12 | |
chkumar|ruck | weshay: I have opened a bug for the same, it is seen at multiple places where tests failed | 12:13 |
openstackgerrit | wes hayutin proposed openstack/tripleo-common master: Remove non-voting jobs from the gate https://review.openstack.org/603419 | 12:13 |
mschuppert | quiquell|rover: rootwrap is good. we need https://review.openstack.org/605350 | 12:15 |
chkumar|ruck | weshay: https://bugs.launchpad.net/tripleo/+bug/1793665 | 12:15 |
openstack | Launchpad bug 1793665 in tripleo "Fs016/17 periodic jobs fails to generate tempest results once tempest run finishes" [Critical,Triaged] | 12:15 |
chkumar|ruck | weshay: I think it requires a lot of fixes where result generation is done | 12:15 |
quiquell|rover | mschuppert: so you need master promotions no ? | 12:16 |
weshay | chkumar|ruck, arxcruz it's such an error-prone piece of our work | 12:16 |
weshay | breaks very often | 12:16 |
chkumar|ruck | weshay: I need to do some cleanup there, separating tempest result generation from stackviz | 12:17 |
chkumar|ruck | weshay: I will be looking tomorrow | 12:17 |
*** dtantsur is now known as dtantsur|brb | 12:18 | |
*** rlandy has joined #tripleo | 12:18 | |
quiquell|rover | chkumar|ruck: master fs020 is different now, https://logs.rdoproject.org/openstack-periodic/git.openstack.org/openstack-infra/tripleo-ci/master/legacy-periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/5efcbac/logs/undercloud/home/zuul/undercloud_reinstall.log.txt.gz#_2018-09-26_01_33_54 | 12:18 |
quiquell|rover | chkumar|ruck: it's not the selinux stuff | 12:18 |
*** dprince has quit IRC | 12:18 | |
*** shardy has joined #tripleo | 12:19 | |
chkumar|ruck | quiquell|rover: nope | 12:19 |
weshay | jistr, do you want to update the depends-on https://review.openstack.org/#/c/590774/ | 12:19 |
chkumar|ruck | quiquell|rover: fs020 failing at same task /var/lib/config-data dir | 12:19 |
*** pdeore has joined #tripleo | 12:19 | |
quiquell|rover | chkumar|ruck: Do we have a bug for that ? | 12:20 |
*** quiquell|rover is now known as quique|rover|lch | 12:20 | |
chkumar|ruck | quiquell|rover: https://bugs.launchpad.net/tripleo/+bug/1794251 check the description bottom | 12:20 |
openstack | Launchpad bug 1794251 in tripleo "[master] undercloud reinstall failed with invalid selinux context: [Errno 95] Operation not supported" [Critical,In progress] - Assigned to Cédric Jeanneret (cjeanner) | 12:20 |
quique|rover|lch | chkumar|ruck: ack | 12:20 |
*** ratailor has quit IRC | 12:20 | |
chkumar|ruck | quique|rover|lch: when I filed the bz two with selinux 2 with different | 12:20 |
*** panda|off is now known as panda | 12:21 | |
quique|rover|lch | chkumar|ruck: Looks like there is a solution to disable selinux upstream | 12:22 |
mschuppert | quiquell|rover: we have stable branches for puppet-openstack_spec_helper | 12:22 |
*** agopi has quit IRC | 12:24 | |
chkumar|ruck | quique|rover|lch: I am not sure disabling selinux is the right thing | 12:24 |
openstackgerrit | Jiri Stransky proposed openstack-infra/tripleo-ci master: Switch previous release of master from 'queens' to 'rocky' https://review.openstack.org/590774 | 12:24 |
jistr | weshay: done, thanks | 12:25 |
jistr | fyi quique|rover|lch ^ | 12:25 |
jistr | (updated https://review.openstack.org/#/c/590774/ so if you push a new patch set, please fetch first) | 12:25 |
pvc_ | hi guys may i ask? | 12:28 |
pvc_ | jistr | 12:28 |
*** jpena|lunch is now known as jpena | 12:28 | |
pvc_ | shardy hello | 12:28 |
*** pvc_ has left #tripleo | 12:29 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Fix .bashrc path for XDG exports https://review.openstack.org/593103 | 12:29 |
*** pvc_ has joined #tripleo | 12:29 | |
pvc_ | hi shardy | 12:29 |
*** udesale has joined #tripleo | 12:30 | |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: [WIP] Add rocky specific parameters to custom NICs. https://review.openstack.org/605407 | 12:30 |
arxcruz | weshay: chkumar|ruck looking into it | 12:30 |
shardy | pvc_: hi | 12:31 |
pvc_ | hi shardy. it seems that my controller is failing when starting the network service: "Determining IP information for ens4f0... failed; no link present. Check cable?" | 12:32 |
pvc_ | i think this is a hardware issue? | 12:32 |
pvc_ | but if i ifup ens4f0 it has an IP address | 12:32 |
*** amoralej is now known as amoralej|lunch | 12:33 | |
shardy | pvc_: as I said earlier, the default configuration creates an ovs bridge, which is then brought up - is br-ex configured on your node? | 12:34 |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-upgrade master: [WIP] Add rocky specific parameters to custom NICs. https://review.openstack.org/605407 | 12:35 |
openstackgerrit | Athlan-Guyot sofer proposed openstack-infra/tripleo-ci master: WIP: New workflow for standalone upgrade https://review.openstack.org/604706 | 12:35 |
shardy | pvc_: if you want to disable that configuration, do something like OS::TripleO::Controller::Net::SoftwareConfig: /usr/share/openstack-tripleo-heat-templates/net-config-noop.yaml in the resource_registry section of an -e environment file | 12:35 |
pvc_ | on overcloud node or undercloud node? | 12:36 |
shardy | https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-resource-registry-puppet.j2.yaml#L34 | 12:36 |
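A minimal sketch of the environment file shardy describes above, assuming a hypothetical file name `no-net-config.yaml`; the registry key and template path are taken from his message:

```yaml
# no-net-config.yaml (hypothetical file name)
# Maps the Controller network config resource to the no-op template,
# so the default OVS bridge configuration is not applied on that role.
resource_registry:
  OS::TripleO::Controller::Net::SoftwareConfig: /usr/share/openstack-tripleo-heat-templates/net-config-noop.yaml
```

Such a file would be passed with `-e` on the deploy command line, e.g. `openstack overcloud deploy --templates -e no-net-config.yaml`. For another role (Compute, for instance) the matching per-role registry key would presumably be needed instead; that is an assumption, not something stated in the channel.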
shardy | pvc_: which node is failing? | 12:36 |
pvc_ | Controller node | 12:36 |
shardy | Ok, that is the node to debug (!) | 12:36 |
shardy | pvc_: what OpenStack version is this? | 12:37 |
pvc_ | im using ocata shardy | 12:37 |
pvc_ | I'm sorry, it is the Compute node that failed, shardy | 12:39 |
pvc_ | i will reset the bios of the failed node shardy | 12:40 |
*** ubijtsa is now known as assassin | 12:40 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Fix .bashrc path for XDG exports https://review.openstack.org/593103 | 12:40 |
*** psachin has quit IRC | 12:41 | |
EmilienM | bandini: when you have time: https://review.rdoproject.org/r/#/c/16280/ | 12:41 |
EmilienM | bandini: and https://review.openstack.org/#/c/600849/ | 12:41 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Remove timeout logic https://review.openstack.org/589068 | 12:41 |
shardy | pvc_: Ok, as I said earlier for most baremetal deployments you need to actually do some work to configure at least one interface - it might be that net-config-bridge.yaml is enough for a very simple setup, but normally folks require more configuration e.g. multiple nics, bonding, isolated vlans etc | 12:42 |
*** gvrangan has quit IRC | 12:42 | |
quique|rover|lch | jistr: ack, going to remove the note that weshay mention | 12:43 |
pvc_ | this is not enough, right shardy? http://paste.openstack.org/show/730929/ | 12:43 |
*** quique|rover|lch is now known as quiquell|rover | 12:44 | |
mschuppert | can I please get reviews on https://review.openstack.org/#/c/587066/ https://review.openstack.org/#/c/604272/ | 12:44 |
openstackgerrit | Honza Pokorny proposed openstack/puppet-tripleo master: [ui] Add option to configure apache expires https://review.openstack.org/589572 | 12:46 |
*** assassin has left #tripleo | 12:46 | |
openstackgerrit | Quique Llorente proposed openstack-infra/tripleo-ci master: Switch previous release of master from 'queens' to 'rocky' https://review.openstack.org/590774 | 12:47 |
openstackgerrit | Marios Andreou proposed openstack-infra/tripleo-ci master: WIP: trying to remove QUICKSTART_RELEASE into the job definitions https://review.openstack.org/605410 | 12:47 |
quiquell|rover | mandre: ^ weee !!! | 12:47 |
*** shyamb has quit IRC | 12:49 | |
*** jcoufal has joined #tripleo | 12:49 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: DNM: fc28 mega-testing-patch https://review.openstack.org/605411 | 12:50 |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Fix .bashrc path for XDG exports https://review.openstack.org/593103 | 12:54 |
*** janki has joined #tripleo | 12:55 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common stable/rocky: config: ignore missing server_id from the stack https://review.openstack.org/605412 | 12:56 |
*** nawar has left #tripleo | 12:57 | |
*** jfrancoa has quit IRC | 12:57 | |
*** jfrancoa has joined #tripleo | 12:58 | |
*** skramaja has quit IRC | 12:59 | |
*** Guest42266 is now known as florianf | 13:01 | |
*** tzumainn has joined #tripleo | 13:02 | |
*** gfidente has quit IRC | 13:02 | |
*** Petersingh has quit IRC | 13:07 | |
*** sshnaidm is now known as sshnaidm|mtg | 13:07 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Allow sudoing for non root user via wheel group https://review.openstack.org/605415 | 13:07 |
janki | EmilienM, hi | 13:08 |
*** Petersingh has joined #tripleo | 13:09 | |
weshay | chkumar|ruck, ? | 13:11 |
*** Petersingh_ has joined #tripleo | 13:12 | |
*** Petersingh has quit IRC | 13:12 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-quickstart-extras master: Add podman support for log collection https://review.openstack.org/605090 | 13:14 |
*** ohsnap has joined #tripleo | 13:14 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-pacemaker master: Rely on path for CLI calls when possible https://review.openstack.org/604891 | 13:16 |
*** Petersingh_ has quit IRC | 13:19 | |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart-extras master: WIP - Fix stackviz https://review.openstack.org/605419 | 13:21 |
*** Petersingh has joined #tripleo | 13:21 | |
openstackgerrit | Brent Eagles proposed openstack/tripleo-heat-templates master: Handle LP openvswitch meta-package on upgrade https://review.openstack.org/605200 | 13:21 |
*** Petersingh has quit IRC | 13:23 | |
*** Petersingh has joined #tripleo | 13:23 | |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Add podman support to validate-tempest role https://review.openstack.org/605356 | 13:25 |
*** chkumar|ruck is now known as chandankumar | 13:26 | |
EmilienM | janki: hey, in a meeting now. I'll answer later if you have any question | 13:26 |
*** mschuppert has quit IRC | 13:26 | |
*** mrsoul has quit IRC | 13:26 | |
*** agopi has joined #tripleo | 13:27 | |
*** vinaykns has joined #tripleo | 13:27 | |
Tengu | bandini: heya! are you here? | 13:28 |
Tengu | bandini: did the keepalived docker image or provisioning change lately? I get a weird error now when I try to deploy a podman-driven undercloud: modprobe: ERROR: could not insert 'ip_vs': Operation not permitted | 13:29 |
*** panda has quit IRC | 13:32 | |
*** dtantsur|brb is now known as dtantsur | 13:33 | |
*** abishop has quit IRC | 13:40 | |
bandini | Tengu: sort of here ;) seems keepalived is trying to modprobe ip_vs and fails | 13:41 |
*** amoralej|lunch is now known as amoralej | 13:41 | |
Tengu | bandini: it's new, isn't it? | 13:41 |
bandini | not sure if it is a new thing (i'd say it is less likely) | 13:41 |
Tengu | hmm. | 13:41 |
bandini | maybe, although I am not sure if we updated keepalived recently | 13:41 |
Tengu | didn't get that issue while deploying an undercloud until today. last try before my PTO -.-' | 13:41 |
Tengu | bandini: not really sure a *container* should be allowed to modprobe anything anyway. | 13:42 |
bandini | Tengu: what version is in the container? | 13:42 |
bandini | Tengu: exactly | 13:42 |
Tengu | that should be in the prep_host | 13:42 |
bandini | version == version of keepalived | 13:42 |
Tengu | bandini: wait, getting a clean env for the deploy | 13:42 |
Tengu | should be up in a sec. | 13:42 |
*** mschuppert has joined #tripleo | 13:44 | |
*** slaweq has quit IRC | 13:45 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Increase post-timeout to 1 hour https://review.openstack.org/605185 | 13:48 |
Tengu | bandini: still waiting - deploy's running, should get the containers shortly. | 13:49 |
openstackgerrit | Udi Kalifon proposed openstack/tempest-tripleo-ui master: Selenium infra https://review.openstack.org/605424 | 13:49 |
*** gfidente has joined #tripleo | 13:50 | |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Remove unused bootstrap-config.yaml https://review.openstack.org/605009 | 13:56 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Convert *tasks from bootstrap_nodeid to short_bootstrap_node_name https://review.openstack.org/605430 | 13:56 |
Tengu | bandini: hm. in fact keepalived has nothing to do on the undercloud... right? | 13:58 |
Tengu | bandini: docker.io/tripleomaster/centos-binary-keepalived:48a0d56e4ba547ab10a35888138370fb1ec74a97_31a62456 | 13:58 |
bandini | Tengu: keepalived on the undercloud is used for the VIPs when you use TLS (which is the default nowadays) | 13:59 |
*** mjturek has joined #tripleo | 13:59 | |
bandini | i.e. it is always used | 13:59 |
Tengu | bandini: duh. ok. | 13:59 |
Tengu | bandini: does the hash answer your question, or should I do some blackmagic in order to get another version number? | 14:00 |
*** zb is now known as zaneb | 14:01 | |
*** mcornea has joined #tripleo | 14:01 | |
*** mcornea has quit IRC | 14:02 | |
*** mcornea has joined #tripleo | 14:02 | |
bandini | [root@undercloud-0 pacemaker]# docker run -it --net=host --user=root docker.io/tripleomaster/centos-binary-keepalived:48a0d56e4ba547ab10a35888138370fb1ec74a97_31a62456 sh -c 'rpm -q keepalived' | 14:03 |
bandini | keepalived-1.3.5-6.el7.x86_64 | 14:03 |
openstackgerrit | Dmitry Tantsur proposed openstack/diskimage-builder master: Add an element to configure iBFT network interfaces https://review.openstack.org/391787 | 14:03 |
bandini | Tengu: from a quick look we did not upgrade much there, so maybe something else changed | 14:03 |
Tengu | bandini: hmm ok. maybe the bootstrap part? will check. | 14:04 |
Tengu | bandini: https://github.com/openstack/kolla/blob/master/docker/keepalived/extend_start.sh that's where it's called. | 14:06 |
Tengu | but still.... I don't think I saw that before. | 14:07 |
openstackgerrit | Dmitry Tantsur proposed openstack/diskimage-builder master: Add an element to configure iBFT network interfaces https://review.openstack.org/391787 | 14:07 |
Tengu | bandini: would you be OK to move that modprobe into the tripleo-heat-templates/docker/services/keepalived host_prep_tasks instead? | 14:08 |
Tengu | I can of course produce a patch in that way. | 14:08 |
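A rough sketch of what Tengu proposes here, assuming the Ansible `modprobe` module and the usual `host_prep_tasks` list in the service template's `role_data` output; the task name is illustrative and the actual patch may differ:

```yaml
# Fragment of a containerized service template (sketch, not the real patch)
outputs:
  role_data:
    value:
      host_prep_tasks:
        # Load the module on the host so the container does not have to.
        - name: load ip_vs kernel module (illustrative task name)
          modprobe:
            name: ip_vs
            state: present
```

The point of the design is that the kernel module is loaded on the host before the container starts, so the container itself does not need permission to run `modprobe`.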
*** abishop has joined #tripleo | 14:09 | |
shardy | jaosorior: Hey is OS::TripleO::NodeTLSData still used anywhere, seems to be set to OS::Heat::None in both ./environments/enable-tls.yaml and ./overcloud-resource-registry-puppet.j2.yaml ? | 14:11 |
shardy | jaosorior: the reason for my question is I need to modify puppet/extraconfig/tls/tls-cert-inject.yaml as it references bootstrap_nodeid, but AFAICS it's not used anymore | 14:11 |
shardy | so perhaps I can just delete it? | 14:12 |
*** janki has quit IRC | 14:12 | |
weshay | mwhahaha, fetching the containers in pre is a no-go? https://review.openstack.org/#/c/580037/ | 14:14 |
mwhahaha | weshay: pretty much, the more i look at it the less beneficial it is | 14:14 |
jaosorior | shardy: right, it's not used anymore. Tengu refactored the TLS setup to be ansible based now :D | 14:14 |
shardy | jaosorior: Ok thanks, I'll remove it in my series reworking bootstrap things | 14:15 |
weshay | mwhahaha, pulling the containers at one time and comparing across providers? | 14:15 |
Tengu | :) | 14:15 |
*** udesale has quit IRC | 14:15 | |
weshay | not worth it? | 14:15 |
mwhahaha | weshay: the problem is pulling all the containers takes too long (like 40+ mins) | 14:16 |
mwhahaha | weshay: so unless you know up front what containers you're going to need, it's wasted time | 14:16 |
mwhahaha | weshay: pre also shares run's timeout so there is no gain | 14:16 |
Tengu | quiquell|rover: my patch is entering the gate (for the selinux recurse thingy) :) | 14:16 |
Tengu | quiquell|rover: cross your fingers! | 14:17 |
*** panda has joined #tripleo | 14:21 | |
dtantsur | slagle: hey! I'm not sure how your proposal at https://etherpad.openstack.org/p/tripleo-edge-squad-status is better than just having N underclouds per location with templates stored in git.. | 14:22 |
openstackgerrit | Dan Macpherson proposed openstack/ansible-role-openstack-operations master: Adding Backup and Restore Operations https://review.openstack.org/604439 | 14:23 |
openstackgerrit | Honza Pokorny proposed openstack/tempest-tripleo-ui master: Add basic project structure https://review.openstack.org/575730 | 14:23 |
slagle | dtantsur: well, better or worse, that's to be decided i guess. but the point is that you don't have an undercloud at each edge site | 14:26 |
dtantsur | slagle: right, so.. you don't have edge undercloud. | 14:26 |
dtantsur | you have some remote software configuration appliances, but not what people are used to: nodes management via ironic, etc | 14:27 |
slagle | dtantsur: you have something that can execute a deployment. a container with everything embedded perhaps to execute ansible or whatever | 14:27 |
dtantsur | so, it's kind of Federation idea, no? | 14:27 |
dtantsur | a big undercloud talking to smaller ones? | 14:27 |
*** moguimar has quit IRC | 14:27 | |
slagle | dtantsur: yes in a way, if it includes "just enough ironic" to also do baremetal | 14:27 |
*** dxiri has joined #tripleo | 14:28 | |
slagle | where the data was federated, as opposed to one big cloud (undercloud) | 14:28 |
dtantsur | slagle: I see. What I don't like here is the amount of TripleO-specific code to write to make it happen.. | 14:28 |
dtantsur | slagle: now, re RabbitMQ/MariaDB: how is it going to be solved for the overcloud? or is it going to also be federation? | 14:28 |
slagle | dtantsur: i don't see a lot of tripleo specific code | 14:29 |
dtantsur | slagle: well, the federation itself, no? | 14:30 |
shardy | dtantsur: also there are two layers to this problem - for each controlplane datacentre (region?) you could probably in some cases have a director per deployment, but for the compute/storage "far edge" nodes (think a radio antenna) it's unlikely any deployment hardware would be acceptable due to footprint | 14:30 |
slagle | dtantsur: ironic would have to support that | 14:30 |
dtantsur | shardy: right, this is what we're trying to figure out | 14:30 |
slagle | dtantsur: i don't see tripleo adding that onto ironic | 14:30 |
*** moguimar has joined #tripleo | 14:30 | |
slagle | dtantsur: but one HUGE undercloud isn't the answer | 14:30 |
dtantsur | slagle: welllll.. we may :) there'll be a session on that. but that's probably T+ | 14:30 |
dtantsur | slagle: it's not one HUGE, it's distributed | 14:31 |
slagle | it is huge when we talk about the number of distributed sites | 14:31 |
dtantsur | we're talking distributed vs federated (I assume you also expect federated nova, glance, etc) | 14:31 |
EmilienM | bandini, bogdando : thx for the review. Will address. | 14:31 |
slagle | dtantsur: for the undercloud? no, i don't | 14:31 |
dtantsur | slagle: for the overcloud. each service must support federation, no? | 14:32 |
slagle | dtantsur: i'm talking about the deployment tool. how it needs to work so that we can deploy edge architectures | 14:32 |
dtantsur | right, but I'm trying to see the whole picture | 14:32 |
dtantsur | we've spent some time making undercloud NOT a special snowflake, but a case of the overcloud | 14:32 |
slagle | if one huge (ok, "distributed") cloud won't work for the overcloud, it won't work for undercloud either | 14:32 |
dtantsur | so I'm trying to understand how the federation idea plays with all overcloud services | 14:32 |
shardy | each region is an independent cloud, and federation just enables authentication/authorization via Keystone in each region, with some central IdP | 14:33 |
shardy | I'm not clear why all services need to "support federation", e.g what is missing? | 14:33 |
dtantsur | okay, so it's keystone-level federation, not service-level | 14:33 |
EmilienM | bandini: I replied to https://review.rdoproject.org/r/#/c/16280/7/paunch-container-shutdown | 14:33 |
slagle | shardy: i don't see how they do | 14:33 |
dtantsur | i.e. different endpoints for each, say, nova, in each location | 14:33 |
slagle | *why | 14:33 |
EmilienM | bandini: do we need to follow up with a patch? I haven't checked how pacemaker containers are configured for restart policy | 14:33 |
quiquell|rover | Tengu: crossed, hand and feet | 14:34 |
shardy | dtantsur: yeah there were a few ideas on that, but it seems like either one central keystone with multiple regions, Federated access to each region, or possibly some new model where keystone can lazily replicate from a central IdP for each region | 14:34 |
Tengu | :D | 14:34 |
*** artom has quit IRC | 14:34 | |
shardy | the last one I'm not super clear on, but IIRC it's not currently possible | 14:34 |
dtantsur | shardy: okay, so then we don't need ironic to support federation itself | 14:34 |
dtantsur | there will be tripleo code that will know which ironic to talk to (and.. mm.. get rid of nova first?) | 14:35 |
bandini | EmilienM: no I think we're good there, thanks | 14:35 |
dtantsur | slagle: this is what I'm talking about re a lot of tripleo code ^^^ | 14:35 |
slagle | dtantsur: oh to get rid of nova? or know which ironic to talk to? | 14:35 |
shardy | dtantsur: it'd probably be something like an ansible playbook with a list of underclouds, one per region | 14:36 |
dtantsur | slagle: both :) but getting rid of nova is planned already (hopefully) | 14:36 |
shardy | the harder problem is what to do with the "far edge" compute only clusters | 14:36 |
EmilienM | bandini: thx for the help btw | 14:36 |
dtantsur | well, my hard problem is that it's no longer TripleO in any sense of "OpenStack on OpenStack" | 14:37 |
dtantsur | e.g. you cannot just talk to ironic to provision a node | 14:37 |
dtantsur | you need to talk to some playbooks that find an ironic to talk to | 14:37 |
shardy | dtantsur: well it's exactly the same for the controlplane | 14:37 |
*** Vorrtex has joined #tripleo | 14:37 | |
shardy | it's just a special case for some specific compute cluster deployments | 14:37 |
slagle | dtantsur: what i was proposing on the etherpad, was a self contained zero-footprint installer that could install an Ironic at the edge if needed. in that case, there is only one | 14:37 |
dtantsur | slagle: one per location, no? | 14:37 |
slagle | and then use that Ironic to provision local compute/storage | 14:38 |
bogdando | slagle, mwhahaha: "but the point is that you don't have an undercloud at each edge site" that part confuses me. I know there is a related rfe for AIO productized as 1:1 bundled underclouds to all in one node managed by it. So it is very natural to expect these two can play together... and they do not! | 14:38 |
openstackgerrit | Udi Kalifon proposed openstack/tempest-tripleo-ui master: Selenium infra https://review.openstack.org/605424 | 14:38 |
slagle | but honestly, i wouldn't even focus on solving baremetal provisioning at the edge | 14:38 |
slagle | feels like boiling the ocean | 14:38 |
bogdando | (assuming an edge site may be represented by that productized AIO+undercloud bundle) | 14:38 |
dtantsur | well, something has to install all these baremetal nodes | 14:38 |
shardy | yeah, I think just use deployed server, at least for the first pass | 14:38 |
slagle | yes, something does. i don't think we have to provide it right on day 1 | 14:39 |
mwhahaha | bogdando: yea i thought that was the point of having an all in one deployable via an undercloud (and the undercloud can deploy multiples) | 14:39 |
dtantsur | mmmok, come back to me when customers start complaining :D | 14:39 |
bogdando | otherwise, if we go with "non-productized" AIO, which is standalone, we end up with non productized Edge as well :) messy | 14:39 |
* dtantsur imagines someone running around with a RHEL CD in all locations | 14:39 | |
slagle | pigeons, perhaps | 14:40 |
bogdando | and that "something that can execute a deployment" replacing control plane for edge sites, is not really a life cycle tool, can't do upgrades, for example | 14:40 |
dtantsur | pigeons++ | 14:40 |
shardy | dtantsur: FWIW I think the edge cases we're initially targeting will be more like embedded appliances - they'll build hundreds of them with a fixed spec and a standard image, so you just need some way to power them up and point a configuration tool at the running nodes | 14:40 |
shardy | e.g consider a cell phone antenna site | 14:40 |
mwhahaha | RFC 2549 | 14:41 |
bogdando | if acting disconnected from the central UC | 14:41 |
shardy | there are certainly other use-cases to consider too, but perhaps as a second step? | 14:41 |
dtantsur | what I'm trying to understand is how much more we can do beyond "make undercloud consume less memory, so that it fits into a small computer" | 14:41 |
dtantsur | because for me this ^^^ sounds like our plan | 14:41 |
slagle | bogdando: i think we can have lifecycle management, but just not with the architecture we have today in tripleo | 14:41 |
dtantsur | plus some playbooks, dirt and sticks | 14:41 |
bogdando | agree with far edge sites not having UC tho | 14:42 |
slagle | bogdando: everything can't be centrally managed. that's one of the core issues with edge we are trying to solve | 14:42 |
shardy | dtantsur: well that's one goal, but kind of orthogonal to edge clusters which only contain 3 nodes? | 14:42 |
slagle | bogdando: plus, the UC will never scale to manage thousand(s) node scale | 14:42 |
slagle | centrally | 14:42 |
Tengu | bandini: adding the "modprobe" in the services/keepalived.yaml host_prep_tasks solves my issue, I'll provide a patch once my mtg is over. | 14:42 |
dtantsur | okay, let me ask it in a more provocative way: what's the benefit of using tripleo here instead of e.g. kolla-ansible? | 14:42 |
Tengu | bandini: that was easy for once :). | 14:43 |
dtantsur | no central API, no bm provisioning | 14:43 |
slagle | dtantsur: we still need those. just perhaps not at the edge | 14:43 |
dtantsur | slagle: right, and I'm asking about Edge. Why use TripleO there? What is going to make our differentiator? | 14:43 |
shardy | dtantsur: the value is all the control-plane nodes in each DC can still do BM provisioning etc, then you can manage day-2 operations for both controlplane and edge compute sites with one tool | 14:44 |
shardy | e.g TripleO | 14:44 |
*** cylopez has quit IRC | 14:44 | |
bandini | Tengu: right but do we actually need that modprobe for what we do in keepalived? | 14:44 |
shardy | maybe kolla-ansible can do that, I have no idea | 14:44 |
slagle | dtantsur: because we use tripleo for the undercloud, and we want to have it support the edge | 14:44 |
Tengu | bandini: can't say - we don't want to run it from a container, that's all I can say. I'm no keepalived guru :/ | 14:44 |
dtantsur | shardy: without real undercloud on Edge, this is not the same tool. in the real DC you use Ironic, Mistral and co, on Edge you use.. pigeons and ansible? | 14:44 |
slagle | dtantsur: it's not just "run some ansible playbooks" | 14:44 |
dtantsur | slagle: I know why we want it, I'm asking why a customer would. | 14:44 |
shardy | dtantsur: it may be helpful to define your vision of "Edge", here I think we're talking about distributed compute mostly | 14:45 |
Tengu | bandini: in a first step we can move it out of the container, so that we're iso-compatible. You OK with that? | 14:45 |
bandini | Tengu: yeah me neither, I'd be surprised if we need that kernel module only to add a couple of VIPs though (I might be mistaken though) | 14:45 |
shardy | dtantsur: meh - we already support several modes of deployment which don't use Ironic, I don't see how this is any different really | 14:45 |
bandini | Tengu: worksforme as long as we don't forget it (as it will bite us eventually ;) | 14:45 |
Tengu | well, seeing its name, it might be needed, in fact, bandini | 14:45 |
slagle | dtantsur: you seem to be coming from the position that the tool can't evolve. | 14:45 |
openstackgerrit | Mehdi Abaakouk (sileht) proposed openstack/puppet-tripleo master: ceilometer: escape % in crontab https://review.openstack.org/599421 | 14:45 |
shardy | and yeah, it mostly uses ansible because that's what $everybody asked for | 14:45 |
slagle | dtantsur: is TripleO not TripleO because we added support for pre-provisioned nodes? | 14:45 |
dtantsur | shardy: well, it's opt-out, not the only option | 14:45 |
slagle | dtantsur: i can't argue that point | 14:45 |
dtantsur | slagle: no, but it will when you remove the provisioning | 14:46 |
slagle | it's not totally sensical to me | 14:46 |
dtantsur | well, IMO | 14:46 |
shardy | dtantsur: sure there's still much utility in Ironic, we're just saying maybe not (yet at least) for tiny edge deployments | 14:46 |
slagle | dtantsur: we're not removing it | 14:46 |
dtantsur | my main point is: centralized and uniform control plane is a killer feature of tripleo | 14:47 |
dtantsur | I think sacrificing it will reduce our utility compared to more lightweight solutions | 14:48 |
dtantsur | I don't insist you put Ironic where it does not belong (less bug reports for me) | 14:48 |
dtantsur | (although I'm pretty sure somebody will come with a bug asking for Ironic at Edge UC quite soon) | 14:48 |
slagle | dtantsur: what i'm proposing is a way to still get the benefit from the centralized management, but in a distributed/disconnected/portable fashion | 14:48 |
slagle | i think we need to think beyond our existing architecture or just multi-node undercloud or multi-undercloud | 14:49 |
slagle | neither of those scale to thousands of nodes on their own | 14:49 |
*** ade_lee has joined #tripleo | 14:49 | |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci master: [WIP] add a job to test the reproducer https://review.openstack.org/604232 | 14:49 |
dtantsur | slagle: maybe I don't quite understand your proposal? it seems like we're going to have the central undercloud with playbooks (behind Mistral?) only, and whatever is on Edge mostly hidden from operators using that central undercloud. | 14:50 |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load ip_vs module from the host, NOT from the container https://review.openstack.org/605446 | 14:50 |
Tengu | bandini: -^ | 14:50 |
slagle | dtantsur: i'd like us to still use the UC to do the planning, and rendering of the deployment, along with some state tracking | 14:51 |
slagle | dtantsur: not necessarily "live" state | 14:51 |
dtantsur | slagle: it's still mistral+heat+ansible, right? | 14:51 |
slagle | but things like what container image versions were used, what ip's, etc | 14:51 |
shardy | Yeah, I don't think it'll be hidden, there are a few options, e.g multiple compute-only plans which scale out independent of the controlplane | 14:52 |
dtantsur | but no direct control plane access for tasks like introspection? | 14:52 |
slagle | dtantsur: but then separate the actual applying of that deployment, so that it doesnt have to be driven from the UC | 14:52 |
shardy | just because we don't use Ironic doesn't mean the undercloud can't manage those clusters | 14:52 |
*** PhilSliderS has quit IRC | 14:52 | |
slagle | dtantsur: we are already quite far along with config-download+export | 14:52 |
slagle | you get a container image that can do the deployment | 14:52 |
slagle | yes it uses ansible, but i consider that implementation detail almost | 14:52 |
openstackgerrit | Martin Schuppert proposed openstack/tripleo-heat-templates master: Add nova file_backed_memory and memory_backing_dir support for qemu.conf https://review.openstack.org/604360 | 14:53 |
dtantsur | okay, I think I see what the plans are, thanks all. I guess I cannot help much in this, since the ironic part stays unchanged (where it stays). | 14:53 |
slagle | dtantsur: and then management in git of the plan, state, rendered deployment, etc. instead of a centralized swift | 14:53 |
dtantsur | sorry to nitpick, but a git server is also centralized (unlike the protocol itself) | 14:54 |
*** sileht has quit IRC | 14:54 | |
dtantsur | but this probably does not matter much | 14:54 |
slagle | sure, i mean ultimately you need a truth of record. centralized | 14:55 |
*** sileht has joined #tripleo | 14:56 | |
bogdando | Also I think penguins may fit the case better than pigeons | 14:57 |
slagle | my thinking with ideas around git is that i think it's quite nice with how we've ended up using it with config-download | 14:57 |
bogdando | they can swim far distance across islands, for example, and still carry on more CDs with RHEL than over the air :) | 14:57 |
slagle | but it is still kind of hidden behind swift | 14:57 |
slagle | and i'd like the plan to be in git as well, so we have some clear history there. which would be super helpful | 14:57 |
slagle | and then when you consider the edge, where we may be doing deployments that are completely disconnected, we can't exactly be reporting status back over a message bus | 14:58 |
slagle | so saving some state locally seems useful | 14:58 |
dtantsur | bogdando++ | 14:58 |
slagle | with the ability to "sync back" (git push) back to the centralized management | 14:58 |
* dtantsur thinks that picking AMQP for an RPC implementation was a weird idea to begin with.. | 14:59 | |
dtantsur | I like the idea of git for history management though | 15:00 |
dtantsur | do we need a Git as a Service in OpenStack? :) | 15:00 |
*** pcaruana has joined #tripleo | 15:01 | |
slagle | i hope not :) | 15:01 |
*** Petersingh has quit IRC | 15:02 | |
*** Petersingh has joined #tripleo | 15:02 | |
therve | thrash: toure: I think I found the issue with the messaging timeout | 15:05 |
therve | Well root cause at least, don't know why it's happening yet | 15:05 |
thrash | therve: do tell | 15:06 |
therve | thrash: It looks like SIGHUP messes up with the worker, and the RPC client ends up broken | 15:06 |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load isci_tcp module from the host. https://review.openstack.org/605450 | 15:06 |
*** moguimar has quit IRC | 15:06 | |
therve | thrash: We end up here: https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L656 | 15:07 |
therve | Bingo managed to reproduce | 15:07 |
thrash | therve: so, instead of waiting 48 hours... SIGHUP the API? | 15:08 |
therve | thrash: Yep. I did it 5-6 times, and now it's in a broken state | 15:08 |
thrash | therve: Nice work | 15:08 |
thrash | therve: sighup in the host? Or from within the container? | 15:09 |
*** moguimar has joined #tripleo | 15:09 | |
therve | thrash: I think both work, the process id is exposed in the host | 15:09 |
thrash | therve: ack | 15:09 |
therve | thrash: I used "sudo docker exec mistral_api kill -SIGHUP" though | 15:09 |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load dm-multipath module from the host. https://review.openstack.org/605452 | 15:10 |
therve | thrash: I used "sudo docker exec mistral_api kill -SIGHUP 1" though | 15:10 |
thrash | therve: gotcha | 15:10 |
*** ooolpbot has joined #tripleo | 15:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1715374 | 15:10 |
openstack | Launchpad bug 1715374 in tripleo "Reloading compute with SIGHUP prenvents instances to boot" [Critical,In progress] - Assigned to Bogdan Dobrelya (bogdando) | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1792560 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1794418 | 15:10 |
*** ooolpbot has quit IRC | 15:10 | |
openstack | Launchpad bug 1792560 in tripleo "Upgrades in CI still using Q->master instead of R->master release" [Critical,Triaged] - Assigned to Jiří Stránský (jistr) | 15:10 |
openstack | Launchpad bug 1794418 in tripleo "Overcloud deploy error creating overcloudrc" [Critical,In progress] - Assigned to Thomas Herve (therve) | 15:10 |
thrash | therve: That was interesting timing... Look at the bug that just scrolled by | 15:10 |
Tengu | bandini: found a couple more of modprobe within containers... | 15:11 |
* Tengu tracks them down | 15:11 | |
thrash | therve: don't know if related, but interesting :) | 15:11 |
thrash | therve: but yeah... I just ran it once.. and bang. | 15:11 |
therve | \o/ | 15:12 |
thrash | restart the container and it works again. | 15:12 |
therve | thrash: So it's related to https://github.com/openstack/mistral/blob/master/mistral/api/service.py#L51 | 15:13 |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart master: use the debug callback (humananly_readable) https://review.openstack.org/604802 | 15:13 |
thrash | therve: but didn't we determine that the container is not running via wsgi? | 15:14 |
therve | thrash: Why is that important? | 15:14 |
bandini | Tengu: ack, nice. maybe open a bug and add a Related-Bug: XYZ for each review/fix? | 15:14 |
* toure reading backlog | 15:14 | |
Tengu | bandini: yeah, might be an idea ^^' | 15:14 |
thrash | therve: because that code is the WSGIService... | 15:14 |
therve | thrash: We don't run it via wsgi, but it runs the WSGIService anyway | 15:15 |
thrash | therve: I'm a dolt. :P | 15:15 |
therve | :D | 15:15 |
Tengu | bandini: https://bugs.launchpad.net/tripleo/+bug/1794550 - will update the commits. | 15:17 |
openstack | Launchpad bug 1794550 in tripleo "Some kernel modules are loaded from containers" [Medium,Triaged] - Assigned to Cédric Jeanneret (cjeanner) | 15:17 |
bandini | Tengu: awesome, thanks | 15:17 |
*** janki has joined #tripleo | 15:17 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load isci_tcp module from the host. https://review.openstack.org/605450 | 15:19 |
*** iranzo has quit IRC | 15:19 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load ip_vs module from the host https://review.openstack.org/605446 | 15:20 |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-heat-templates master: Load dm-multipath module from the host. https://review.openstack.org/605452 | 15:20 |
Tengu | done. | 15:20 |
Tengu | just the openvswitch, I don't know where it should be loaded. | 15:20 |
*** moguimar has quit IRC | 15:20 | |
*** yprokule has quit IRC | 15:21 | |
Tengu | EmilienM: -^^^ a few reviews :) | 15:21 |
jaosorior | o | 15:22 |
Tengu | https://review.openstack.org/#/q/status:open+topic:bug/1794550 shorter. | 15:22 |
Tengu | jaosorior: guess you'll be happy to see some modprobe dropped from the containers :) | 15:23 |
EmilienM | Tengu: will look after my meetings | 15:23 |
Tengu | EmilienM: np. will sign-off for today :). | 15:23 |
*** artom has joined #tripleo | 15:23 | |
EmilienM | ack | 15:23 |
jaosorior | Tengu: nice!! | 15:23 |
therve | thrash: Surprisingly hard to reproduce a second time | 15:24 |
toure | yeah I issued the SIGHUP | 15:24 |
toure | now the api doesn't reconnect :) | 15:24 |
Tengu | jaosorior: ;) thought so. anyway. see you tomorrow folks | 15:24 |
jaosorior | Tengu: have a good one! | 15:25 |
Tengu | same for you ;) | 15:25 |
toure | oh crap I forgot I cherry picked a change for the eventlets | 15:25 |
thrash | toure: I don't think that's even in play. Red herring. | 15:26 |
thrash | therve: You're right. Hard to reproduce... | 15:26 |
*** chandankumar is now known as chkumar|off | 15:26 | |
toure | ok | 15:27 |
*** quiquell|rover is now known as quique|rover|off | 15:27 | |
*** quique|rover|off is now known as quique|off | 15:28 | |
*** kopecmartin is now known as kopecmartin|ruck | 15:28 | |
*** sshnaidm|mtg is now known as sshnaidm | 15:29 | |
*** marios|rover has joined #tripleo | 15:29 | |
*** dsneddon has joined #tripleo | 15:31 | |
therve | thrash: Something like that http://paste.openstack.org/show/730950/ maybe, but it'd be nice to reproduce it more | 15:31 |
*** leanderthal has quit IRC | 15:31 | |
thrash | therve: I'm wondering if there is a race condition with the SIGHUP and the healthcheck that's happening every two seconds... | 15:32 |
toure | thrash I was going to mention, maybe the original theory of haproxy swamping us | 15:32 |
thrash | therve: the question... Where is the SIGHUP coming from anyway? | 15:32 |
bogdando | thrash: it comes from logrotate post script I think | 15:33 |
thrash | bogdando: Ahhh... | 15:33 |
*** jtomasek has quit IRC | 15:34 | |
*** PhilSliderS has joined #tripleo | 15:34 | |
*** sanjayu_ has quit IRC | 15:35 | |
thrash | therve: bogdando toure the picture is becoming quite a bit clearer... | 15:36 |
janki | EmilienM, hey. I have commented on the patch. am logging off now. thanks :) | 15:36 |
*** janki has quit IRC | 15:36 | |
EmilienM | janki: ack, will look asap | 15:36 |
toure | thrash so you think it is a race condition with logrotate and haproxy polling | 15:37 |
thrash | toure: not sure about the haproxy part... | 15:37 |
thrash | toure: but I'm thinking that if a request comes in at just the wrong time you hit that "forked after connection established" | 15:38 |
thrash | and when you hit that... Hosed. | 15:38 |
thrash | basically... request -> SIGHUP -> reply -> BORKED | 15:38 |
thrash | but that's completely speculation | 15:39 |
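The "forked after connection established" condition thrash mentions can be illustrated with a minimal sketch. This is not oslo.messaging's code; `PidBoundConnection` and `forked_since_connect` are hypothetical names used only to show the general idea of a connection that remembers which process opened it and can tell when it is later used from a different one.

```python
import os


class PidBoundConnection:
    """Toy connection that remembers which process opened it (hypothetical)."""

    def __init__(self):
        # Record the PID at connect time, as a stand-in for a real broker socket.
        self.owner_pid = os.getpid()

    def forked_since_connect(self, current_pid=None):
        """Return True when used from a process other than the one that connected."""
        if current_pid is None:
            current_pid = os.getpid()
        return current_pid != self.owner_pid


conn = PidBoundConnection()
# Same process: the connection is still safe to use.
print(conn.forked_since_connect())
# Simulate a worker with a different PID (e.g. re-spawned after SIGHUP).
print(conn.forked_since_connect(current_pid=conn.owner_pid + 1))
```

A connection opened before workers are re-forked would trip this check on first use in the new worker, which matches the request -> SIGHUP -> reply sequence speculated about above.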
toure | yup that makes sense so if we slow the haproxy polling this should reduce the pressure of the race, which I know is a bandaid more than a fix | 15:39 |
thrash | toure: better to keep the error from happening, which I think therve's idea should alleviate. | 15:40 |
toure | true | 15:40 |
thrash | I think it's reasonable to clean up the rpc_clients on reset. | 15:41 |
*** hjensas has quit IRC | 15:42 | |
toure | +1 | 15:42 |
* toure will test the theory | 15:42 | |
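A rough sketch of the "clean up the RPC clients on reset" idea follows. It is an assumption about the shape of such a fix, not the content of therve's paste or Mistral's actual code: `FakeRPCClient`, `get_rpc_client` and `reset_rpc_clients` are hypothetical names, and the module-level cache stands in for whatever the service really keeps.

```python
class FakeRPCClient:
    """Stand-in for a real RPC client that holds a broker connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


# Hypothetical module-level cache of clients, keyed by name.
_RPC_CLIENTS = {}


def get_rpc_client(name):
    """Return the cached client for `name`, creating it on first use."""
    if name not in _RPC_CLIENTS:
        _RPC_CLIENTS[name] = FakeRPCClient()
    return _RPC_CLIENTS[name]


def reset_rpc_clients():
    """Close and forget every cached client.

    Meant to be called from the service's reset hook (the SIGHUP path), so
    that the next request builds a fresh client in the current process.
    """
    for client in _RPC_CLIENTS.values():
        client.close()
    _RPC_CLIENTS.clear()


before = get_rpc_client("engine")
reset_rpc_clients()
after = get_rpc_client("engine")
print(before is after, before.closed, after.closed)
```

The point of the design is that nothing created before the reset survives it, so a client tied to a pre-reset connection can never be handed to a post-reset request.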
*** jtomasek has joined #tripleo | 15:45 | |
thrash | toure: problem is... I can't reproduce it again. | 15:45 |
thrash | therve: I'll keep trying to trigger it... Not having any luck as of yet. | 15:45 |
toure | I have twice on my systems | 15:45 |
*** numans has joined #tripleo | 15:46 | |
thrash | toure: gonna let mine sit for a bit. | 15:46 |
*** thrash is now known as thrash|biab | 15:47 | |
toure | ack | 15:47 |
openstackgerrit | Merged openstack/ansible-role-redhat-subscription master: Add support for RHSM Pools https://review.openstack.org/605290 | 15:51 |
*** panda is now known as panda|bbl | 15:52 | |
ohsnap | so in the past i deployed a test/dev env using packstack. today im redoing it using tripleo-quickstart, this has been sitting here a little over an hour: TASK [undercloud-deploy : Install the undercloud] | 15:52 |
*** holser_ has quit IRC | 15:54 | |
*** jfrancoa has quit IRC | 15:54 | |
*** jfrancoa has joined #tripleo | 15:56 | |
*** Petersingh is now known as Petersingh|away | 15:57 | |
rh-jelabarre | I'm trying to map a router to a network (openstack router set --external-gateway provider_network external) and I keep getting a "BadRequestException: Unknown error". Any suggestions of what to look into? Or at least figure out what the error means? | 15:58 |
*** dxiri has quit IRC | 15:58 | |
*** ohsnap has quit IRC | 15:59 | |
*** Petersingh|away has quit IRC | 16:00 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Inject undercloud user into the cloud image https://review.openstack.org/603069 | 16:01 |
openstackgerrit | David Peacock proposed openstack/tripleo-heat-templates master: docker-puppet.py: used dedicated hiera entry, not uuid https://review.openstack.org/559182 | 16:02 |
*** rdopiera has quit IRC | 16:04 | |
EmilienM | bnemec: thx for the email, long life to OVB | 16:07 |
bnemec | EmilienM: np. I'm hoping at some point we can do some testing with tripleo-ci on the new branch to make sure it doesn't break anything. | 16:08 |
*** dtantsur is now known as dtantsur|afk | 16:08 | |
bnemec | And if it does to get it fixed before making the switch. | 16:09 |
EmilienM | yes +2 | 16:09 |
openstackgerrit | David Peacock proposed openstack/tripleo-heat-templates master: docker-puppet.py: used dedicated hiera entry, not uuid https://review.openstack.org/559182 | 16:09 |
*** ramishra has quit IRC | 16:09 | |
*** noama has quit IRC | 16:10 | |
openstackgerrit | Sorin Sbarnea proposed openstack/tripleo-quickstart-extras master: Improve output of Verify Sphinx build task https://review.openstack.org/600403 | 16:10 |
*** ksambor has quit IRC | 16:15 | |
*** fhubik|brb has quit IRC | 16:15 | |
*** florianf is now known as florianf|afk | 16:17 | |
*** kopecmartin|ruck is now known as kopecmartin|off | 16:20 | |
*** aufi has quit IRC | 16:21 | |
*** jpich has quit IRC | 16:22 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Introduce OS::TripleO::Services::Podman https://review.openstack.org/604235 | 16:24 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: undercloud: deploy podman https://review.openstack.org/605221 | 16:24 |
EmilienM | chkumar|off: nice work on tempest, thanks | 16:25 |
EmilienM | chkumar|off: I commented though | 16:25 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: Switch fs027 to deploy with podman https://review.openstack.org/600517 | 16:25 |
*** gkadam has quit IRC | 16:25 | |
openstackgerrit | David Peacock proposed openstack/puppet-tripleo master: adding deployment_type fact in support https://review.openstack.org/605478 | 16:27 |
openstackgerrit | David Peacock proposed openstack/tripleo-heat-templates master: docker-puppet.py: used dedicated hiera entry, not uuid https://review.openstack.org/559182 | 16:28 |
*** panda|bbl is now known as panda | 16:35 | |
*** dxiri has joined #tripleo | 16:35 | |
*** trown is now known as trown|lunch | 16:40 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-quickstart master: Document the KVM accelerated mode for building VMs https://review.openstack.org/605483 | 16:40 |
*** sanjayu_ has joined #tripleo | 16:48 | |
*** akrivoka has quit IRC | 16:49 | |
Tengu | EmilienM: just had a discussion, my patches for the modprobe are NOT enough, and should NOT be merged - can you w-1 them ? I'm not on my work laptop, I have no access - for the record: https://review.openstack.org/#/q/status:open+topic:bug/1794550 thank you :) | 16:52 |
*** gfidente has quit IRC | 16:55 | |
*** artom has quit IRC | 16:58 | |
*** jfrancoa has quit IRC | 16:58 | |
*** jfrancoa has joined #tripleo | 16:58 | |
*** derekh has quit IRC | 17:01 | |
*** jfrancoa has quit IRC | 17:02 | |
*** thrash|biab is now known as thrash | 17:03 | |
thrash | toure: ok... repro after sitting for a bit. | 17:03 |
thrash | toure: so I'm going to apply therve's idea and let it sit again. | 17:04 |
*** artom has joined #tripleo | 17:04 | |
EmilienM | Tengu: ok | 17:05 |
Tengu | EmilienM: apparently I'll need to hit kolla-ansible - and persist the modprobe in some way across reboot. I was sure it couldn't be that easy :) | 17:06 |
*** salmankhan has quit IRC | 17:06 | |
EmilienM | Tengu: done | 17:06 |
Tengu | EmilienM: great, thanks! | 17:07 |
EmilienM | Tengu: I WIPed the THT patches. | 17:07 |
Tengu | just want to ensure nobody pushes anything for now. | 17:07 |
EmilienM | Tengu: ci is in bad shape no worries :D | 17:08 |
Tengu | "good news then" | 17:08 |
Tengu | #orNot ;) | 17:08 |
*** jpena is now known as jpena|off | 17:09 | |
*** AJaeger has joined #tripleo | 17:10 | |
AJaeger | bogdando: are you around to discuss https://review.openstack.org/#/c/588587/4/zuul.d/layout.yaml | 17:10 |
AJaeger | ssbarnea: that's your change that I augmented so that it can merge ^ | 17:10 |
openstackgerrit | Brent Eagles proposed openstack/tripleo-heat-templates master: WIP: configure the undercloud host https://review.openstack.org/605489 | 17:11 |
bogdando | AJaeger: hi. Yes please. Tho I think the comment you've provided answers it fully :) | 17:11 |
AJaeger | bogdando: ok - wanted to be around to discuss further if needed... | 17:11 |
bogdando | thanks for the explanation! | 17:11 |
AJaeger | you're welcome, bogdando | 17:12 |
AJaeger | thanks for reviewing | 17:12 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Convert *tasks from bootstrap_nodeid to short_bootstrap_node_name https://review.openstack.org/605430 | 17:15 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Remove unused tls-cert-inject.yaml template https://review.openstack.org/605491 | 17:15 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates master: Add SERVICE_bootstrap_node_ip values to allNodesConfig https://review.openstack.org/605492 | 17:15 |
*** shardy has quit IRC | 17:20 | |
*** dciabrin has quit IRC | 17:20 | |
*** bogdando has quit IRC | 17:25 | |
slagle | thrash: do you recall the reasoning for using the same queue name of "tripleo" for everything? can't we accidentally show messages from other workflows when polling the queue in tripleoclient? | 17:29 |
toure | thrash sounds good | 17:29 |
thrash | slagle: iirc, it was the UI that drove that. | 17:31 |
thrash | slagle: but no, because the workflow id is checked. | 17:32 |
thrash | slagle: *execution id | 17:32 |
thrash | slagle: at least on the CLI side. The CLI is somewhat synchronous, so it knows what it has executed, and looks for messages specifically from that execution. | 17:33 |
slagle | thrash: i don't see that in the code. at least not the way i'm reading it :) | 17:33 |
*** hjensas has joined #tripleo | 17:34 | |
slagle | i see a check on the execution id, but that's only so that we don't bail on getting messages due to a sub-workflow going complete | 17:34 |
slagle | and we've already yielded the payload to the caller before that check | 17:35 |
thrash | slagle: hmmm | 17:36 |
slagle | https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/workflows/base.py#L61 | 17:36 |
slagle | that's where I'm looking | 17:36 |
thrash | slagle: yep... was just looking at that... And I think you're right. | 17:36 |
thrash | slagle: I knew there was a check... But... | 17:37 |
slagle | with tripleoclient, we've never really had a reason that someone might run 2 workflows at the same time, so i wouldn't be surprised if this had gone unnoticed | 17:38 |
*** AJaeger has left #tripleo | 17:39 | |
slagle | but with the status and failures commands I've added that use workflows, we got a bug report of some deployment output polluting the output of the failures command | 17:39 |
thrash | slagle: exactly | 17:39 |
slagle | b/c a deployment was ongoing at the time | 17:39 |
thrash | slagle: should be a simple enough fix. Check for the execution id or the root execution id. | 17:40 |
thrash | before yield. | 17:40 |
slagle | yea, i think so. just wanted to double check my reasoning around the situation :) | 17:40 |
thrash | slagle: no worries... Looks like you got it right. :) | 17:41 |
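The fix thrash describes (check the execution id or the root execution id before yielding) could look roughly like the sketch below. The payload keys `execution_id` and `root_execution_id` and the function name `filter_messages` are assumptions for illustration; the real message format in python-tripleoclient may differ.

```python
def filter_messages(messages, execution_id):
    """Yield only payloads that belong to `execution_id`.

    A payload belongs if it comes from the execution itself or from a
    sub-workflow whose root execution is that execution. The check happens
    before the yield, so unrelated workflows sharing the queue are dropped.
    """
    for payload in messages:
        ids = {payload.get("execution_id"), payload.get("root_execution_id")}
        if execution_id not in ids:
            continue
        yield payload


# Hypothetical messages from a shared "tripleo" queue.
messages = [
    {"execution_id": "A", "root_execution_id": None, "message": "failures: none"},
    {"execution_id": "B", "root_execution_id": None, "message": "deploy step 3"},
    {"execution_id": "C", "root_execution_id": "A", "message": "sub-workflow done"},
]

kept = [m["message"] for m in filter_messages(messages, "A")]
print(kept)
```

Here execution "B" plays the role of the ongoing deployment whose output polluted the failures command; it is skipped, while the sub-workflow "C" rooted at "A" still gets through.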
*** trown|lunch is now known as trown | 17:44 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: libvirt standalone deployment https://review.openstack.org/591077 | 17:53 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: WIP: udpate reproducer to install required deps https://review.openstack.org/600836 | 17:54 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: f28 support for quickstart https://review.openstack.org/591652 | 17:54 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: libvirt standalone deployment https://review.openstack.org/591077 | 17:54 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: f28 support for quickstart https://review.openstack.org/591652 | 17:54 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: WIP: enable fedora-28 for the reproducer https://review.openstack.org/602492 | 17:54 |
openstackgerrit | Merged openstack/python-tripleoclient master: Start websocket client before workflows https://review.openstack.org/605377 | 17:57 |
*** jcoufal has quit IRC | 18:03 | |
*** slaweq has joined #tripleo | 18:12 | |
openstackgerrit | Alex Schultz proposed openstack/python-tripleoclient stable/rocky: Start websocket client before workflows https://review.openstack.org/605499 | 18:17 |
*** amoralej is now known as amoralej|off | 18:20 | |
itlinux | hello all can someone give me a pointer on this issue http://paste.openstack.org/show/730956/ | 18:21 |
itlinux | trying to do an update (minor) on pike | 18:21 |
openstackgerrit | Rafael Folco proposed openstack-infra/tripleo-ci master: Remove toci_jobtype definition from v3 jobs https://review.openstack.org/593863 | 18:22 |
*** slaweq has quit IRC | 18:22 | |
*** pcaruana has quit IRC | 18:24 | |
openstackgerrit | ayoung proposed openstack/tripleo-specs master: Global Galera Database https://review.openstack.org/600555 | 18:29 |
*** zaneb has quit IRC | 18:37 | |
*** zaneb has joined #tripleo | 18:37 | |
weshay | mwhahaha, I think https://review.openstack.org/#/c/603419/ is timing out | 18:45 |
weshay | http://zuul.openstack.org/stream.html?uuid=715bb91a744c45f6afec643ce29ddbdb&logfile=console.log | 18:45 |
openstackgerrit | Rafael Folco proposed openstack/tripleo-quickstart master: Run scenario001-multinode-oooq-container job for config/* changes https://review.openstack.org/602425 | 18:48 |
mwhahaha | weshay: of course it is | 18:49 |
weshay | mwhahaha, it's your fault | 18:53 |
mwhahaha | not my fault someone switched things to non-voting and didn't remove them from the gate :D | 18:53 |
*** jcoufal has joined #tripleo | 18:56 | |
*** abishop_ has joined #tripleo | 18:58 | |
*** abishop has quit IRC | 19:00 | |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: Use sync action get_deployment_failures https://review.openstack.org/605058 | 19:05 |
openstackgerrit | Dan Sneddon proposed openstack/tripleo-heat-templates master: Ping default gateways before controllers https://review.openstack.org/604229 | 19:07 |
*** asbishop has joined #tripleo | 19:12 | |
*** asbishop is now known as abishop | 19:13 | |
*** abishop_ has quit IRC | 19:15 | |
*** pdeore has quit IRC | 19:15 | |
*** salmankhan has joined #tripleo | 19:17 | |
toure | therve looks like your suggestion works | 19:18 |
toure | thrash ^^ | 19:18 |
toure | back in a bit | 19:19 |
*** toure is now known as toure|biab | 19:19 | |
*** jcoufal has quit IRC | 19:21 | |
*** salmankhan has quit IRC | 19:22 | |
openstackgerrit | Emilien Macchi proposed openstack/paunch master: podman: create/delete systemd unit files when restart policy is used https://review.openstack.org/600849 | 19:22 |
EmilienM | bandini: ^ ready for review again, comments addressed | 19:23 |
openstackgerrit | Emilien Macchi proposed openstack/paunch master: Stop hardcoding 'docker' and make it more generic https://review.openstack.org/601290 | 19:23 |
*** slaweq has joined #tripleo | 19:25 | |
mwhahaha | weshay: how many cpus do the CI instances have? 4? | 19:28 |
mwhahaha | nm is 8 | 19:29 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: WIP: udpate reproducer to install required deps https://review.openstack.org/600836 | 19:33 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: WIP: enable fedora-28 for the reproducer https://review.openstack.org/602492 | 19:33 |
openstackgerrit | Alex Schultz proposed openstack/ansible-role-container-registry master: Allow docker image download/upload concurrency https://review.openstack.org/605511 | 19:42 |
openstackgerrit | Chandan Kumar proposed openstack/tripleo-quickstart-extras master: Add podman support to validate-tempest role https://review.openstack.org/605356 | 19:46 |
*** sanjayu_ has quit IRC | 19:46 | |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Increase docker pull/push concurrency https://review.openstack.org/605514 | 19:50 |
thrash | toure|biab: therve I had the opposite. | 19:50 |
thrash | toure|biab: therve It didn't seem to work for me. | 19:51 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-common master: Increase upload concurrency https://review.openstack.org/605515 | 19:52 |
openstackgerrit | Sam Doran proposed openstack/tripleo-ansible master: Use generic names for container platform https://review.openstack.org/600847 | 19:54 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-common master: Increase upload concurrency https://review.openstack.org/605515 | 19:54 |
weshay | mwhahaha, pedal to the metal | 19:58 |
mwhahaha | not sure if it'll help, but worth a try | 19:59 |
*** dsneddon has quit IRC | 20:02 | |
*** dsneddon has joined #tripleo | 20:02 | |
*** agopi has quit IRC | 20:06 | |
*** bnemec has quit IRC | 20:10 | |
openstackgerrit | James Slagle proposed openstack/python-tripleoclient master: Filter messages not from waiting execution https://review.openstack.org/605520 | 20:10 |
weshay | mwhahaha, ok.. finally deploying standalone on f28 w/ centos containers | 20:10 |
weshay | w/ the reproducer scripts | 20:10 |
mwhahaha | always a bonus | 20:10 |
thrash | toure|biab: therve stop() gets called also. Not a bad idea to just clear them there as well... Should get around race conditions | 20:13 |
EmilienM | are we getting CI back one day or? | 20:13 |
EmilienM | where are we? | 20:13 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Update standalone role https://review.openstack.org/605156 | 20:14 |
*** toure|biab is now known as toure | 20:14 | |
toure | thrash hrmm... | 20:15 |
*** bnemec has joined #tripleo | 20:15 | |
thrash | toure: The repro is not 100% | 20:15 |
toure | thrash therve one way to reproduce the issue is to move the logrotate from daily to hourly | 20:15 |
thrash | toure: which leads me to believe it is definitely a race condition | 20:15 |
toure | ack | 20:16 |
thrash | toure: sure. But still *time* | 20:16 |
toure | thrash true | 20:16 |
thrash | But again, it really depends on whether the race happens... I've concatenated the SIGHUP and the action call into a single CLI command | 20:16 |
thrash | toure: and that *sometimes* worked | 20:17 |
toure | but we should run into the race much quicker if it is kicking every hour | 20:17 |
toure | ah | 20:17 |
thrash | toure: I'm testing out adding the client reset to the stop function as well | 20:17 |
thrash | toure: do it every minute. :) | 20:18 |
toure | kk | 20:18 |
toure | hehe | 20:18 |
thrash | toure: or just cron the SIGHUP. :) | 20:18 |
thrash | toure: that's probably the better way. | 20:18 |
toure | true | 20:18 |
thrash | heck, that could be every 5 seconds. | 20:18 |
* toure creates a crontab for SIGHUP | 20:18 | |
toure | :) | 20:18 |
thrash | or probably 20 | 20:18 |
thrash | in fact, I think I'm gonna do that. Cron the SIGHUP, and then watch the action call | 20:19 |
toure | * * * * * sudo docker exec mistral_api kill -SIGHUP 1 >/dev/null 2>&1 | 20:20 |
toure | :) | 20:21 |
openstackgerrit | Emilien Macchi proposed openstack/paunch master: podman: create/delete systemd unit files when restart policy is used https://review.openstack.org/600849 | 20:22 |
openstackgerrit | Emilien Macchi proposed openstack/paunch master: Stop hardcoding 'docker' and make it more generic https://review.openstack.org/601290 | 20:22 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-quickstart-extras master: Update standalone environment file https://review.openstack.org/605523 | 20:23 |
toure | thrash got it | 20:23 |
toure | the cleanup() function works | 20:25 |
thrash | toure: from? | 20:25 |
toure | http://paste.openstack.org/show/730963/ | 20:26 |
toure | thrash ^^ | 20:26 |
thrash | toure: sure... but we need a much larger sample size to be certain. | 20:26 |
toure | yup and I will leave it running all night :p | 20:27 |
thrash | toure: biab | 20:27 |
*** thrash is now known as thrash|biab | 20:27 | |
toure | either ack | 20:27 |
toure | thrash|biab when you get back here is the setup | 20:32 |
toure | http://paste.openstack.org/show/nbn7LVPRox2obnisF6z7/ | 20:32 |
stevebaker | morning | 20:33 |
EmilienM | stevebaker: salut | 20:34 |
EmilienM | mwhahaha: what jobs are timing out the most? | 20:34 |
mwhahaha | i don't know at the moment | 20:34 |
EmilienM | ok let's look at grafana | 20:34 |
EmilienM | weshay: do you have an answer to that? | 20:34 |
mwhahaha | cistatus.tripleo.org seems to think everything is grand (i think it's lying) | 20:35 |
stevebaker | mwhahaha: I've got some more context for talking about concurrent layer copying for docker/skopeo | 20:36 |
*** raildo has quit IRC | 20:36 | |
* weshay reads | 20:36 | |
mwhahaha | stevebaker: yea? the max-concurrent-download stuff in docker seems to have little effect (i tested) | 20:36 |
mwhahaha | stevebaker: i did throw a patch up to improve the tripleo-common number of upload workers | 20:36 |
weshay | EmilienM, what's the question | 20:37 |
weshay | 7 | 20:37 |
weshay | is the answer | 20:37 |
mwhahaha | cause that in my testing seemed to have a better impact on total time | 20:37 |
EmilienM | weshay: question is what jobs timeout the most | 20:37 |
EmilienM | what's the new URL for cockpit? | 20:37 |
stevebaker | mwhahaha: it does allow more layers to be downloaded concurrently, but yeah I didn't see a huge time difference in a test of pulling two images with mostly shared layers | 20:37 |
EmilienM | 38.145.34.131:3000 is down for me | 20:37 |
weshay | http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&panelId=61&fullscreen | 20:37 |
EmilienM | oh ok | 20:38 |
weshay | scen001 and containers-multinode | 20:38 |
mwhahaha | stevebaker: https://review.openstack.org/#/q/topic:concurrent-settings+(status:open+OR+status:merged) is what i proposed a bit ago | 20:38 |
EmilienM | tripleo-ci-centos-7-undercloud-containers - 2h20 | 20:38 |
mwhahaha | stevebaker: i think the tripleo-common one might have some improvement, but the other i would agree doesn't seem to help when the layers are so shared (like ours) | 20:38 |
EmilienM | while it takes 25 min to deploy an undercloud w/o oooq in my env | 20:39 |
EmilienM | it makes me cry | 20:39 |
stevebaker | mwhahaha: as for skopeo, the news is bad. No concurrent copies, and most of the combinations of src->dest can't detect that the destination already has the layer, so the copy happens again. Until that is fixed, buildah in CI will be sloooow | 20:39 |
mwhahaha | stevebaker: oh noes | 20:39 |
EmilienM | omg | 20:40 |
weshay | dear lord don't cry :) | 20:40 |
weshay | there there | 20:40 |
EmilienM | can't we build our own images with pre-fetched containers in a local registry? | 20:41 |
weshay | +1 | 20:41 |
EmilienM | and our CI jobs would just update the layers? | 20:41 |
weshay | I like that plan | 20:41 |
EmilienM | is infra against us having our own centos7 image? | 20:41 |
weshay | DIB that sucks in the containers | 20:41 |
EmilienM | we could build our own image on top of current centos7 | 20:42 |
EmilienM | and install things that take time | 20:42 |
weshay | EmilienM, that would get us our daily updates w/ dlrn-current too | 20:42 |
stevebaker | EmilienM: what do you mean by a local registry? per CI region? | 20:42 |
EmilienM | like openstack-selinux (almost 5 min to install) | 20:42 |
weshay | that's normal | 20:42 |
EmilienM | stevebaker: no I mean deploying a docker registry with our containers | 20:42 |
weshay | EmilienM, but installed and populated at image creation right? | 20:42 |
EmilienM | so when the VM starts, we already have a registry and our CI job would not pull everything from scratch | 20:42 |
weshay | vs.. pre.yaml | 20:42 |
weshay | +1000 | 20:42 |
* weshay starts to cry | 20:43 | |
mwhahaha | good luck with that | 20:43 |
EmilienM | mwhahaha: why? I don't think that's impossible | 20:43 |
weshay | \/skick /sban mwhahaha | 20:43 |
EmilienM | stevebaker: it's technically possible you think? | 20:43 |
weshay | ianw, ^ | 20:43 |
EmilienM | if yes, I'll ping infra | 20:43 |
weshay | whose turn is it to buy scotch for the infra guys? | 20:44 |
stevebaker | EmilienM: do the VM images get rebuilt often enough for the layers to be current? | 20:44 |
weshay | stevebaker, ya | 20:44 |
mwhahaha | no they don't which is kinda the problem | 20:44 |
EmilienM | stevebaker: every night we could do | 20:44 |
weshay | almost every day | 20:44 |
weshay | ya | 20:44 |
EmilienM | the only things it will cost are: | 20:45 |
weshay | the image is 8.6 gb | 20:45 |
weshay | now | 20:45 |
EmilienM | 1) one more image to store (so more storage for our cloud providers) | 20:45 |
EmilienM | 2) more time in the image build process | 20:45 |
EmilienM | yeah I'm afraid about the size with our containers | 20:45 |
weshay | https://nb01.openstack.org/images/ | 20:45 |
EmilienM | I think the first question infra will ask is what size it would take | 20:45 |
EmilienM | and I'm afraid of the answer | 20:46 |
stevebaker | weshay: is it possible to just run CI against container images which are built more frequently? so current-tripleo is promoted images, but we do CI against a daily-tripleo tag or something? | 20:46 |
weshay | stevebaker, so the containers would just be updated w/ dlrn-current | 20:46 |
weshay | ya.. that is fine | 20:46 |
weshay | so we'd have that in the process.. | 20:46 |
EmilienM | I think what takes a bunch of time in our CI is that we run yum update in our containers | 20:46 |
weshay | I know.. we could isolate the services more effectively on the bm | 20:46 |
EmilienM | I mean we have to | 20:46 |
stevebaker | yes that is the slowest part, if there hasn't been a promotion for a while, updating every image with the dlrn-current repo gets slower and slower | 20:47 |
EmilienM | if we could come up with a size that it would take | 20:47 |
mwhahaha | so you realize that the containers are like 17G right? | 20:47 |
EmilienM | maybe we can request infra | 20:47 |
weshay | lolz | 20:47 |
weshay | \/skick mwhahaha | 20:47 |
EmilienM | well | 20:47 |
EmilienM | the image is stored once | 20:47 |
EmilienM | ie no history | 20:47 |
mwhahaha | shouldn't be | 20:47 |
EmilienM | we won't store multiple versions of this image | 20:48 |
mwhahaha | i'm talking about the downloaded containers | 20:48 |
mwhahaha | just a single set | 20:48 |
EmilienM | I know | 20:48 |
EmilienM | I'm trying to find a solution here :D | 20:48 |
weshay | it would be more efficient, even if the image was large | 20:48 |
EmilienM | but to me we download the same things in our CI jobs | 20:48 |
mwhahaha | i think we need to work on other problems | 20:48 |
EmilienM | so let's try to cache them in the image | 20:48 |
weshay | that's why the images are built daily | 20:48 |
weshay | to keep them updated | 20:48 |
weshay | this just takes it to the next level w/ containers | 20:49 |
mwhahaha | yes cause we need more complexity | 20:49 |
weshay | how else can we sell rhel8 | 20:49 |
weshay | MORE COMPLEXITY | 20:49 |
* mwhahaha closes laptop, opens window, escapes | 20:49 | |
mwhahaha | if you need me, i'll be doing something useful like taking a nap | 20:49 |
* mwhahaha would rather we stop doing scenarios and only have a single undercloud job and convert all the services to single light weight standalones | 20:50 | |
mwhahaha | i think we'd get more use out of that | 20:50 |
EmilienM | mwhahaha: why aren't we doing it now | 20:50 |
EmilienM | what does it take? | 20:50 |
EmilienM | we need an env/role per scenario | 20:50 |
EmilienM | let's start with scenario001 | 20:50 |
mwhahaha | time/people | 20:50 |
weshay | $$ | 20:51 |
EmilienM | I thought we agreed to work on that | 20:51 |
mwhahaha | yea it's been 3 days | 20:51 |
mwhahaha | DO TRY AND GIVE PEOPLE SOME TIME | 20:51 |
EmilienM | no | 20:51 |
EmilienM | ok let me do it now | 20:51 |
* EmilienM disappears | 20:51 | |
weshay | EmilienM, converting scenario001 multinode to a standalone? | 20:51 |
EmilienM | yes | 20:52 |
EmilienM | it's not a big deal | 20:52 |
mwhahaha | you can't convert it straight | 20:52 |
weshay | and the standalone has enough resources? | 20:52 |
EmilienM | we need a role with the services | 20:52 |
EmilienM | and an environment | 20:52 |
mwhahaha | you need to pull the services out and split it into multiple jobs | 20:52 |
weshay | ya | 20:52 |
EmilienM | what's the big deal here? | 20:52 |
weshay | I'm all for busting out the scenarios into combinations that would fit standalone | 20:52 |
EmilienM | it's 1 yaml file thingy | 20:53 |
mwhahaha | no it's not | 20:53 |
weshay | nice spec | 20:53 |
* weshay thinks EmilienM sounds like morazi | 20:53 | |
*** Vorrtex has quit IRC | 20:53 | |
EmilienM | come on | 20:53 |
weshay | lolz | 20:53 |
stevebaker | woah | 20:53 |
weshay | mwhahaha, EmilienM let's pause | 20:53 |
weshay | we know we want to work on standalone | 20:54 |
weshay | sure | 20:54 |
weshay | but let's go pursue the DIB/image w/ containers | 20:54 |
mwhahaha | k you go do that | 20:54 |
mwhahaha | we still need the standalone thing | 20:54 |
mwhahaha | for f28 and other gates | 20:54 |
EmilienM | scenario001 is a standalone with a dedicated env file | 20:54 |
EmilienM | overriding the services | 20:54 |
EmilienM | and that's it | 20:54 |
mwhahaha | EmilienM: no it's not because there aren't enough resources in CI to run scenario001 on a single box | 20:54 |
weshay | I'm on TASK [Run docker-puppet tasks (bootstrap tasks) for step 3] ********************************************** | 20:54 |
weshay | w/ f28 | 20:54 |
weshay | :) | 20:54 |
EmilienM | mwhahaha: just trash the multinode scenario001 | 20:55 |
EmilienM | in master | 20:55 |
weshay | mwhahaha, just tell prad to make ceil more efficient | 20:55 |
EmilienM | we can do it with zuul super easily | 20:55 |
* mwhahaha gives up and goes to work on other things | 20:55 | |
weshay | mwhahaha, it's worth trying once | 20:55 |
mwhahaha | k get it done | 20:55 |
EmilienM | mwhahaha: ok tell me what's complicated | 20:55 |
weshay | EmilienM, too many services for one box | 20:56 |
mwhahaha | i already did but you keep blasting past it | 20:56 |
weshay | but maybe it would work | 20:56 |
EmilienM | and I don't want to give up | 20:56 |
weshay | maybe not | 20:56 |
EmilienM | nothing has merged | 20:56 |
EmilienM | mwhahaha: then we reduce the services | 20:56 |
mwhahaha | right to what | 20:56 |
weshay | EmilienM, which one? | 20:56 |
EmilienM | scenario001 | 20:56 |
*** agopi has joined #tripleo | 20:56 | |
EmilienM | it's always broken | 20:56 |
mwhahaha | you need to actually come up with a plan on what you're doing | 20:56 |
weshay | ya.. but which service | 20:56 |
weshay | faker | 20:56 |
EmilienM | first let's make it non voting | 20:56 |
EmilienM | well let's stop testing autoscaling | 20:57 |
weshay | https://github.com/openstack/tripleo-heat-templates/blob/master/README.rst#service-testing-matrix | 20:57 |
EmilienM | let's reduce the tempest tests that we run | 20:57 |
EmilienM | so we can drop Heat maybe in this scenario | 20:57 |
mwhahaha | i'd rather we trash all the scenarios, and run a single standalone multinode (with my new standalone role) and then create individual standalone configurations for the various other services | 20:57 |
EmilienM | Heat is already tested on the containerized undercloud | 20:57 |
mwhahaha | not really | 20:57 |
mwhahaha | but sure | 20:57 |
mwhahaha | since we stopped doing actual heat, it's not really "Tested" | 20:58 |
mwhahaha | i know | 20:58 |
EmilienM | I guess we need some trade offs | 20:58 |
mwhahaha | EmilienM: let's come up with all the services and list out how we want to test them in CI (etherpad?) | 20:58 |
mwhahaha | and we can figure out new featuresets for them | 20:58 |
*** trown is now known as trown|outtypewww | 20:59 | |
mwhahaha | assuming standalone rather than multinode | 20:59 |
EmilienM | so why can't the services run on a single node? | 20:59 |
EmilienM | what changes from having an overcloud? | 21:00 |
EmilienM | once we run the playbooks to deploy, the heat instance isn't running anymore | 21:00 |
mwhahaha | ram | 21:00 |
EmilienM | so nothing should take memory on the host, (beside ansible) | 21:00 |
weshay | ######################################################## | 21:01 |
weshay | Deployment successfull! | 21:01 |
weshay | f28 | 21:01 |
*** abishop has quit IRC | 21:03 | |
matbu | ceph | 21:03 |
matbu | vi hosts | 21:03 |
matbu | ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd:q | 21:03 |
matbu | ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd:q:q! | 21:03 |
matbu | rmddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd:q:q! | 21:03 |
matbu | ddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd:q:q! | 21:03 |
* stevebaker googles how to quit vi | 21:04 | |
weshay | did matbu just die? | 21:04 |
EmilienM | he's stuck in vi | 21:05 |
matbu | EmilienM: oops :) | 21:06 |
*** slaweq has quit IRC | 21:06 | |
matbu | my terminal is totally weird, | 21:06 |
matbu | sorry for the noise | 21:06 |
stevebaker | it cheered up my day | 21:07 |
mwhahaha | EmilienM: so if we only have 8G, a default install of the normal standalone takes 8Gs. Scenario* has way more services | 21:07 |
EmilienM | right | 21:07 |
weshay | matbu, glad you are alive and well | 21:08 |
matbu | weshay: yep just got 90 baremetal nodes deployed | 21:09 |
weshay | matbu, where did you get 90 nodes? | 21:09 |
EmilienM | i think we don't want to know | 21:10 |
matbu | weshay: no no you won't use it for CI :) it's the scale lab | 21:10 |
*** matbu has quit IRC | 21:10 | |
*** matbu has joined #tripleo | 21:11 | |
weshay | 90 nodes killed matbu's irc | 21:11 |
EmilienM | weshay, mwhahaha : https://ethercalc.openstack.org/TripleOStandaloneCICoverage | 21:11 |
matbu | :) | 21:11 |
EmilienM | maybe we could draw up a list that way | 21:12 |
EmilienM | I just drafted something super quick | 21:12 |
weshay | dammit | 21:13 |
weshay | I hate people that are naturally on cocaine | 21:13 |
*** panda is now known as panda|off | 21:13 | |
EmilienM | mwhahaha: maybe we could start by drawing up a list of actions for the CI team | 21:17 |
EmilienM | to give visibility on the work that you think needs to be done for standalone | 21:17 |
EmilienM | this spreadsheet will help if we want to redo the scenarios | 21:18 |
EmilienM | weshay: who from the CI team can work on that topic, in the short term? | 21:18 |
weshay | EmilienM, it's going to be a sprint topic | 21:21 |
weshay | we all can | 21:21 |
weshay | minus the ruck/rover | 21:21 |
mwhahaha | EmilienM: so is this the new version of the scenarios or are you documenting the old ones | 21:22 |
EmilienM | weshay: when is next sprint starting? | 21:22 |
EmilienM | mwhahaha: new | 21:22 |
weshay | EmilienM, thrs | 21:22 |
mwhahaha | so i'd like to get rid of the scenario names | 21:22 |
weshay | tomorrow | 21:22 |
mwhahaha | cause the last thing we need is more magic decoder rings to figure out ci | 21:23 |
weshay | agree, but also will note the names are the least of our problems | 21:23 |
EmilienM | I have to go but I'm back in 1h, and all evening | 21:25 |
EmilienM | mostly | 21:25 |
* weshay is going back to Red Rocks tonight :) | 21:25 | |
EmilienM | can we capture some sort of todo list (high level) | 21:25 |
mwhahaha | see he starts trouble and then just leaves | 21:25 |
EmilienM | mwhahaha: ... | 21:25 |
EmilienM | let's create a list of things we need to do from high level so it's easier to split the work | 21:28 |
EmilienM | I'll draft something when I'm back on etherpad or something | 21:28 |
EmilienM | if nobody started before | 21:28 |
* EmilienM brb | 21:28 | |
weshay | EmilienM, you should come to our planning mtg | 21:31 |
weshay | mwhahaha, EmilienM jaosorior fyi.. I'm seeing much more success from rdo jobs | 21:37 |
weshay | I think the +1 for 3rd party can be expected again | 21:38 |
weshay | http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&panelId=207&fullscreen | 21:38 |
mwhahaha | EmilienM, weshay: ok so i listed the major services and their current coverage under undercloud/ovb: https://ethercalc.openstack.org/TripleOStandaloneCICoverage | 21:46 |
mwhahaha | are we missing anything? | 21:46 |
mwhahaha | so we'd likely want to fill gaps in service coverage via standalone | 21:47 |
*** toure is now known as toure|gone | 21:48 | |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Support ARA statistics in InfluxDB for longest tasks https://review.openstack.org/580238 | 21:54 |
weshay | sec | 21:54 |
*** lblanchard has quit IRC | 22:02 | |
openstackgerrit | Steve Baker proposed openstack/tripleo-common master: Set prepare neutron_driver from NeutronMechanismDrivers https://review.openstack.org/604953 | 22:09 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Calculate ARA metrics for overcloud https://review.openstack.org/604900 | 22:14 |
openstackgerrit | Merged openstack/tripleo-quickstart master: Set containerized_undercloud for OpenShift featureset https://review.openstack.org/602802 | 22:14 |
openstackgerrit | Merged openstack/tripleo-common master: Switch to openshift 3.10 https://review.openstack.org/596820 | 22:17 |
openstackgerrit | Merged openstack/tripleo-common master: Switch to origin-docker-build https://review.openstack.org/599307 | 22:17 |
stevebaker | mwhahaha: permission to merge this? https://review.openstack.org/#/c/604952 | 22:20 |
openstackgerrit | Sam Doran proposed openstack/tripleo-quickstart master: Add YAML standard out callback plugin https://review.openstack.org/605543 | 22:24 |
openstackgerrit | Arx Cruz proposed openstack/tripleo-quickstart-extras master: WIP - Fix stackviz https://review.openstack.org/605419 | 22:27 |
openstackgerrit | Steve Baker proposed openstack/tripleo-quickstart master: Make setup repo task output visible when errored https://review.openstack.org/599358 | 22:28 |
openstackgerrit | Merged openstack/tripleo-common master: Add container-registry image to openshift master role https://review.openstack.org/602112 | 22:33 |
*** rcernin has joined #tripleo | 22:35 | |
*** tosky has quit IRC | 22:37 | |
beagles | we still have a "do not workflow" in effect? | 22:38 |
* beagles glances at channel status, duh | 22:38 | |
EmilienM | beagles: hey | 22:53 |
EmilienM | did you have a chance to look at the sidecar podman container thingy? | 22:53 |
*** rcernin has quit IRC | 22:53 | |
EmilienM | weshay: yes I could join your planning | 22:54 |
EmilienM | weshay: invite me | 22:54 |
beagles | EmilienM: nyet - but can tomorrow a.m. Going to link up with bogdan | 22:54 |
EmilienM | beagles: excellent | 22:54 |
beagles | EmilienM: so what's the story with the gate, is there progress on sorting out why the timeouts? | 22:55 |
*** rcernin has joined #tripleo | 22:55 | |
EmilienM | beagles: not really, bunch of timeouts | 22:55 |
beagles | meh | 22:55 |
EmilienM | but there are some wips | 22:55 |
EmilienM | alex is looking at concurrency | 22:55 |
beagles | I think the last I was following it was timeouts on getting data around (pulling container images, etc)? is that still the thing? | 22:56 |
EmilienM | mwhahaha: ack for the spreadsheet, so we would have "standalone-ceph" for ex? | 22:56 |
EmilienM | beagles: see https://review.openstack.org/#/q/topic:concurrent-settings | 22:56 |
beagles | EmilienM: ack thanks | 22:58 |
EmilienM | beagles: could you please create a card in https://trello.com/b/S8TmOU0u/tripleo-podman and update with progress if any | 23:00 |
EmilienM | it helps to track our efforts, thx | 23:00 |
beagles | EmilienM: sure thing | 23:01 |
EmilienM | thx | 23:01 |
*** eggmaster has quit IRC | 23:08 | |
*** eggmaster has joined #tripleo | 23:08 | |
EmilienM | beagles: so there is another approach | 23:11 |
EmilienM | beagles: look https://review.openstack.org/#/c/604659/2 | 23:11 |
EmilienM | I'm going to update https://review.openstack.org/#/c/604180/ to use that | 23:12 |
EmilienM | we would configure these containers with cap SYSADMIN & mount /var/lib/container | 23:12 |
EmilienM | so they can do things that we need | 23:12 |
EmilienM | but privileged mode is too high for what we need | 23:12 |
EmilienM | (credits to stevebaker for the great idea) | 23:12 |
beagles | ah so this implements what steve baker was talking about it in the email thread | 23:12 |
beagles | that'd be cool | 23:12 |
EmilienM | yea | 23:13 |
EmilienM | beagles: again we want an approach that isn't heavy and complicated | 23:13 |
EmilienM | and that keeps our problem solved: separate neutron container from the services that it manages like haproxy/keepalived/dnsmasq etc | 23:13 |
beagles | EmilienM: yeah, I was just going to say - I'm all for easy | 23:13 |
EmilienM | so we can restart them independently | 23:13 |
EmilienM | cool | 23:14 |
EmilienM | beagles: so yeah let's try that | 23:14 |
beagles | EmilienM: right on | 23:14 |
EmilienM | beagles: I'm working on pacemaker bits now https://review.openstack.org/#/c/604180/ | 23:14 |
EmilienM | but i'll let you figure out the neutron thing | 23:14 |
beagles | EmilienM: ack | 23:14 |
EmilienM | basically we need the containers configured with cap_add=SYS_ADMIN and mount /var/lib/containers | 23:14 |
mwhahaha | EmilienM: so yea we'd likely want a standalone-ceph, but that's a weird one where we'll probably want an ovb-ceph as well | 23:15 |
beagles | right | 23:16 |
EmilienM | mwhahaha: why would we need ovb-ceph? I would hope we can keep ovb as light as possible. | 23:16 |
mwhahaha | EmilienM: one that exercises mistral to deploy ceph-ansible | 23:16 |
mwhahaha | EmilienM: we could do a 1 uc 1 all-in-one (with ceph) | 23:16 |
mwhahaha | to be lighter | 23:16 |
mwhahaha | rather than an HA one | 23:16 |
EmilienM | ah right, dat mistral | 23:17 |
mwhahaha | we could leave 1 multinode job that maybe runs ceph | 23:17 |
mwhahaha | do a 1 uc 1 all-in-one (with ceph) multinode | 23:18 |
mwhahaha | to replace the existing one | 23:18 |
mwhahaha | but these are the types of things that need to be thought out and aren't just "go do a thing" | 23:18 |
mwhahaha | though a 1uc/1all-in-one(with ceph) is a better test if we want to properly exercise mistral, etc | 23:19 |
EmilienM | yes | 23:19 |
mwhahaha | i meant on ovb but anyway | 23:19 |
*** mcornea has quit IRC | 23:20 | |
mwhahaha | from a "basic tripleo deployment" in CI, i'd like to standardize on https://review.openstack.org/#/c/605156/ | 23:20 |
mwhahaha | all the other services would get coverage via a standalone unless it needs something special (octavia/ceph) | 23:20 |
EmilienM | I also think we'll have to make tradeoffs | 23:21 |
EmilienM | testing autoscaling requires all telemetry + heat | 23:21 |
EmilienM | it's clearly not working for us | 23:21 |
EmilienM | timeouts etc | 23:21 |
EmilienM | I would be ok to have heat + basics and actually test heat api | 23:22 |
mwhahaha | does that get covered in the full tempest run we do for promotions? | 23:22 |
EmilienM | not sure about that | 23:22 |
mwhahaha | we should see, because if it does i think that would tick that box | 23:22 |
mwhahaha | that's the other thing that would be nice to know: the tempest tests for these feature sets | 23:22 |
EmilienM | if we could have "advanced" testing in RDO CI (with higher timeouts) | 23:22 |
mwhahaha | because we could be deploying all this stuff but not even bothering to exercise it | 23:22 |
EmilienM | and "simpler" tests in our gate | 23:22 |
EmilienM | it would help imho | 23:23 |
mwhahaha | EmilienM: that's why i recommend additional ovb jobs (other than the HA ones) | 23:23 |
EmilienM | :-o | 23:23 |
EmilienM | do we have capacity? | 23:23 |
mwhahaha | of course not | 23:23 |
mwhahaha | but if we didn't run fs001 and ds35 on everything, maybe | 23:23 |
mwhahaha | why do we run it on just about everything again? | 23:23 |
EmilienM | (is that the time where you blame me? :-)) | 23:24 |
mwhahaha | i've been blaming you since you started this several hours ago | 23:24 |
EmilienM | anyway | 23:24 |
mwhahaha | SO THERE | 23:24 |
EmilienM | you can't blame me | 23:24 |
mwhahaha | :D | 23:24 |
mwhahaha | i can and will | 23:24 |
EmilienM | I've been working with one eye since Monday | 23:24 |
mwhahaha | and you can't stop me | 23:24 |
EmilienM | so i'm half processing? | 23:24 |
mwhahaha | ARRRRRmilienM | 23:24 |
mwhahaha | it would also be beneficial to properly scope jobs to specific areas in the tripleo code base | 23:25 |
mwhahaha | i tend to think that we excessively run things (i'm looking at you tripleo-quickstart) | 23:25 |
*** rh-jelabarre has quit IRC | 23:26 | |
mwhahaha | speaking of random things, is someone driving the ovn switch? | 23:29 |
EmilienM | beagles: ^ | 23:30 |
mwhahaha | cause that likely adds additional minimum requirements for some jobs | 23:30 |
mwhahaha | with new services and such | 23:30 |
*** tzumainn has quit IRC | 23:34 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Allow to run bootstrap containers in privileged mode. https://review.openstack.org/600533 | 23:35 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Set proper setype for service directories https://review.openstack.org/600534 | 23:35 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Allow to deactivate SELinux separation for selected containers https://review.openstack.org/600535 | 23:35 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Support podman when tagging container for Pacemaker https://review.openstack.org/604180 | 23:35 |
*** dxiri has quit IRC | 23:35 | |
EmilienM | stevebaker: https://review.openstack.org/#/c/604180/12/docker/services/pacemaker/cinder-backup.yaml | 23:36 |
EmilienM | something like this ^ see cap | 23:36 |
EmilienM | I haven't tested it (in progress now) | 23:36 |
*** rcernin_ has joined #tripleo | 23:41 | |
openstackgerrit | Alex Schultz proposed openstack/tripleo-docs master: Update standalone docs https://review.openstack.org/603522 | 23:41 |
stevebaker | EmilienM: looks good | 23:42 |
*** rcernin has quit IRC | 23:43 | |
mwhahaha | EmilienM: so why do we run pacemaker on containers-multinode? | 23:47 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Support podman when tagging container for Pacemaker https://review.openstack.org/604180 | 23:47 |
* mwhahaha sighs | 23:47 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Support podman when tagging container for Pacemaker https://review.openstack.org/604180 | 23:47 |
mwhahaha | :q | 23:47 |
EmilienM | mwhahaha: because at some point we said pacemaker would be default on overcloud | 23:48 |
EmilienM | to avoid keepalived | 23:48 |
mwhahaha | yet we didn't make that a thing | 23:48 |
mwhahaha | because if that was a thing, we should have fixed it in the resource-registry | 23:49 |
mwhahaha | and not in the CI scenarios | 23:49 |
*** openstackgerrit has quit IRC | 23:49 | |
* mwhahaha flips tables and will resume hating life tomorrow | 23:49 | |
EmilienM | let's put it this way: | 23:49 |
mwhahaha | tomorrow: roofers, so i shall enjoy banging my head against the wall and hearing it on my roof | 23:49 |
EmilienM | there is room for improvement | 23:49 |
*** rrubins__ has quit IRC | 23:54 | |
*** openstackgerrit has joined #tripleo | 23:58 | |
openstackgerrit | Alex Schultz proposed openstack/ansible-role-chrony master: Revert "Remove zuul configuration" https://review.openstack.org/605557 | 23:58 |
openstackgerrit | Alex Schultz proposed openstack/ansible-role-chrony master: Fix .gitreview https://review.openstack.org/605558 | 23:59 |