*** openstack has joined #tripleo | 00:09 | |
*** openstackgerrit has quit IRC | 00:17 | |
*** openstackgerrit has joined #tripleo | 00:18 | |
*** xinwu has quit IRC | 00:19 | |
*** morazi has quit IRC | 00:29 | |
*** yamahata has joined #tripleo | 00:29 | |
*** ccamacho has quit IRC | 00:34 | |
openstackgerrit | Merged openstack/instack-undercloud: Store events in Undercloud Ceilometer https://review.openstack.org/289789 | 00:37 |
---|---|---|
*** thrash is now known as thrash|g0ne | 00:37 | |
*** xinwu has joined #tripleo | 00:42 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Store events in Ceilometer https://review.openstack.org/290153 | 00:43 |
*** Erming__ has joined #tripleo | 00:44 | |
*** michchap has joined #tripleo | 00:45 | |
*** eggmaste` has joined #tripleo | 00:45 | |
*** ryansb_ has joined #tripleo | 00:45 | |
*** ryansb_ has quit IRC | 00:45 | |
*** ryansb_ has joined #tripleo | 00:45 | |
*** isq_ has joined #tripleo | 00:46 | |
*** pino|work_ has joined #tripleo | 00:47 | |
*** saneax is now known as saneax_AFK | 00:47 | |
*** prometheanfire has quit IRC | 00:47 | |
*** Nakato_ has joined #tripleo | 00:48 | |
*** Erming_ has quit IRC | 00:49 | |
*** pino|work has quit IRC | 00:49 | |
*** michchap_ has quit IRC | 00:49 | |
*** eggmaster has quit IRC | 00:49 | |
*** isq has quit IRC | 00:49 | |
*** ryansb has quit IRC | 00:49 | |
*** Nakato has quit IRC | 00:49 | |
*** ryansb_ is now known as ryansb | 00:49 | |
*** prometheanfire has joined #tripleo | 00:50 | |
*** lblanchard has joined #tripleo | 00:57 | |
*** rhallisey has quit IRC | 01:00 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add missing createUser line to /etc/snmp/snmpd.conf https://review.openstack.org/290317 | 01:11 |
openstackgerrit | Merged openstack/tripleo-common: Add capabilities filter for Nova https://review.openstack.org/288087 | 01:12 |
*** dmacpher-afk has quit IRC | 01:14 | |
*** xinwu has quit IRC | 01:22 | |
*** panda has quit IRC | 01:40 | |
*** panda has joined #tripleo | 01:40 | |
*** xinwu has joined #tripleo | 01:56 | |
openstackgerrit | James Slagle proposed openstack/tripleo-common: Add capabilities filter for Nova https://review.openstack.org/290942 | 01:56 |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Use fstrim to prep the block device https://review.openstack.org/290944 | 02:02 |
*** lblanchard has quit IRC | 02:26 | |
*** rbrady has quit IRC | 02:31 | |
*** dmacpher has joined #tripleo | 02:49 | |
*** Marga_ has quit IRC | 02:54 | |
*** Marga_ has joined #tripleo | 02:56 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: centos-minimal does not provide base https://review.openstack.org/290956 | 03:00 |
*** Marga_ has quit IRC | 03:01 | |
*** xinwu has quit IRC | 03:30 | |
*** xinwu has joined #tripleo | 03:34 | |
*** shivrao has quit IRC | 03:44 | |
*** rlandy has quit IRC | 03:45 | |
*** yamahata has quit IRC | 03:49 | |
*** Marga_ has joined #tripleo | 03:50 | |
*** links has joined #tripleo | 03:50 | |
*** Marga_ has quit IRC | 03:51 | |
*** Marga_ has joined #tripleo | 03:51 | |
*** akuznetsov has joined #tripleo | 03:55 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Clear up "already provided" message https://review.openstack.org/290968 | 03:59 |
*** jaosorior has quit IRC | 04:02 | |
*** jaosorior has joined #tripleo | 04:03 | |
*** panda has quit IRC | 04:22 | |
*** panda has joined #tripleo | 04:22 | |
*** saneax_AFK is now known as saneax | 04:25 | |
*** xinwu has quit IRC | 04:27 | |
*** dmacpher has quit IRC | 04:30 | |
*** dmacpher has joined #tripleo | 04:31 | |
*** akuznetsov has quit IRC | 04:35 | |
*** saneax is now known as saneax_AFK | 04:54 | |
*** masco has joined #tripleo | 05:15 | |
*** dmacpher_ has joined #tripleo | 05:23 | |
*** dmacpher has quit IRC | 05:26 | |
*** cmyster has quit IRC | 05:35 | |
*** panda has quit IRC | 05:40 | |
*** jaosorior has quit IRC | 05:40 | |
*** Erming__ has quit IRC | 05:40 | |
*** panda has joined #tripleo | 05:40 | |
*** Erming__ has joined #tripleo | 05:46 | |
*** jaosorior has joined #tripleo | 05:46 | |
*** jaosorior has quit IRC | 05:46 | |
*** jtomasek has joined #tripleo | 05:53 | |
*** ayoung has quit IRC | 05:53 | |
*** Erming__ has quit IRC | 06:02 | |
*** ayoung has joined #tripleo | 06:04 | |
*** Erming__ has joined #tripleo | 06:08 | |
*** veteran has joined #tripleo | 06:21 | |
*** veteran has quit IRC | 06:22 | |
*** jprovazn has joined #tripleo | 06:23 | |
*** dmacpher has joined #tripleo | 06:27 | |
*** dmacpher_ has quit IRC | 06:28 | |
*** cmyster has joined #tripleo | 06:29 | |
*** cmyster has quit IRC | 06:29 | |
*** cmyster has joined #tripleo | 06:29 | |
openstackgerrit | Purandhar Sairam Mannidi proposed openstack/diskimage-builder: Add support for building images capable of UEFI https://review.openstack.org/287784 | 06:39 |
*** jtomasek has quit IRC | 06:53 | |
openstackgerrit | Purandhar Sairam Mannidi proposed openstack/diskimage-builder: Add support for building images capable of UEFI https://review.openstack.org/287784 | 06:54 |
*** jprovazn has quit IRC | 06:55 | |
*** jprovazn has joined #tripleo | 06:55 | |
*** saneax_AFK is now known as saneax | 06:58 | |
*** xinwu has joined #tripleo | 07:04 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Introduce a UpgradeScriptDeliveryWorfklow as part of tripleo upgrades https://review.openstack.org/289212 | 07:04 |
*** yamahata has joined #tripleo | 07:07 | |
*** rwsu has quit IRC | 07:07 | |
*** ohamada has joined #tripleo | 07:13 | |
bandini | mornin' | 07:14 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Set default locale to image in ubuntu-minimal https://review.openstack.org/290789 | 07:15 |
*** olap has quit IRC | 07:22 | |
*** ohamada has quit IRC | 07:23 | |
*** liverpooler has quit IRC | 07:23 | |
*** trozet has quit IRC | 07:27 | |
*** trozet has joined #tripleo | 07:28 | |
*** ccamacho has joined #tripleo | 07:28 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: stable/liberty: set default upgrade level to kilo https://review.openstack.org/290584 | 07:29 |
*** dshulyak has joined #tripleo | 07:32 | |
*** rcernin has joined #tripleo | 07:33 | |
*** pino|work_ is now known as pino|work | 07:44 | |
*** rdopiera has joined #tripleo | 07:48 | |
*** olap has joined #tripleo | 07:50 | |
*** shivrao has joined #tripleo | 07:52 | |
*** shivrao_ has joined #tripleo | 07:54 | |
*** fgimenez has joined #tripleo | 07:56 | |
*** shivrao has quit IRC | 07:57 | |
*** shivrao_ is now known as shivrao | 07:57 | |
*** rwsu has joined #tripleo | 08:01 | |
*** ifarkas has joined #tripleo | 08:02 | |
*** xinwu has quit IRC | 08:03 | |
*** rain has joined #tripleo | 08:08 | |
*** rain is now known as Guest78631 | 08:09 | |
*** Guest78631 is now known as leanderthal | 08:09 | |
*** aufi has joined #tripleo | 08:09 | |
*** xinwu has joined #tripleo | 08:10 | |
*** paramite has joined #tripleo | 08:14 | |
*** xinwu has quit IRC | 08:16 | |
*** stendulker has joined #tripleo | 08:28 | |
*** mbound has joined #tripleo | 08:31 | |
*** mikelk has joined #tripleo | 08:32 | |
*** pcaruana has joined #tripleo | 08:34 | |
*** liverpooler has joined #tripleo | 08:34 | |
*** liverpooler has quit IRC | 08:35 | |
*** liverpooler has joined #tripleo | 08:35 | |
*** rhefner has quit IRC | 08:36 | |
*** Ng has quit IRC | 08:36 | |
*** igorbelikov has quit IRC | 08:36 | |
*** shivrao has quit IRC | 08:38 | |
*** rhefner has joined #tripleo | 08:40 | |
*** igorbelikov has joined #tripleo | 08:42 | |
*** xinwu has joined #tripleo | 08:43 | |
*** Ng has joined #tripleo | 08:44 | |
*** ChanServ sets mode: +v Ng | 08:44 | |
*** chem has joined #tripleo | 08:46 | |
*** jaosorior has joined #tripleo | 08:52 | |
*** ohamada has joined #tripleo | 08:55 | |
openstackgerrit | Ishant Tyagi proposed openstack/os-collect-config: Add insecure option to the cfn collector https://review.openstack.org/284725 | 08:58 |
*** ishant has joined #tripleo | 08:58 | |
*** dmacpher has quit IRC | 08:59 | |
*** shardy has joined #tripleo | 09:05 | |
*** jaosorior has quit IRC | 09:09 | |
*** jaosorior has joined #tripleo | 09:09 | |
*** jistr has joined #tripleo | 09:09 | |
*** dtantsur|afk is now known as dtantsur | 09:11 | |
*** jcoufal has joined #tripleo | 09:12 | |
*** lucas-dinner is now known as lucasagomes | 09:14 | |
*** xinwu has quit IRC | 09:15 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Moves the swift start/stop into the common_functions.sh file https://review.openstack.org/287960 | 09:23 |
*** openstackgerrit has quit IRC | 09:30 | |
*** openstackgerrit_ has joined #tripleo | 09:30 | |
*** openstackgerrit_ is now known as openstackgerrit | 09:31 | |
*** openstackgerrit has quit IRC | 09:31 | |
*** openstackgerrit_ has joined #tripleo | 09:31 | |
*** pblaho has joined #tripleo | 09:31 | |
*** openstackgerrit_ is now known as openstackgerrit | 09:32 | |
*** openstackgerrit has quit IRC | 09:32 | |
*** openstackgerrit_ has joined #tripleo | 09:32 | |
*** openstackgerrit_ is now known as openstackgerrit | 09:33 | |
*** openstackgerrit has quit IRC | 09:33 | |
*** openstackgerrit_ has joined #tripleo | 09:33 | |
*** openstackgerrit_ is now known as openstackgerrit | 09:34 | |
*** panda has quit IRC | 09:40 | |
*** panda has joined #tripleo | 09:41 | |
*** electrofelix has joined #tripleo | 09:41 | |
*** rasca has quit IRC | 09:49 | |
*** mgould has joined #tripleo | 09:53 | |
*** rasca has joined #tripleo | 09:54 | |
*** akrivoka has joined #tripleo | 09:57 | |
openstackgerrit | Imre Farkas proposed openstack/tripleo-docs: Fix url for current-passed-ci https://review.openstack.org/291078 | 09:57 |
*** tosky has joined #tripleo | 09:58 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fixup swift device string to delimit the ipv6 address with [] https://review.openstack.org/289757 | 09:58 |
*** dmacpher has joined #tripleo | 10:02 | |
*** Marga_ has quit IRC | 10:03 | |
*** liverpooler has quit IRC | 10:05 | |
*** liverpooler has joined #tripleo | 10:10 | |
*** derekh has joined #tripleo | 10:13 | |
*** nico_auv has joined #tripleo | 10:16 | |
*** jtomasek has joined #tripleo | 10:16 | |
*** liverpooler has quit IRC | 10:17 | |
openstackgerrit | Merged openstack/tripleo-docs: Extending the image build information https://review.openstack.org/270290 | 10:22 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fixup systemctl_swift stop/start during the controller upgrade https://review.openstack.org/290501 | 10:22 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates: Fixup systemctl_swift stop/start during the controller upgrade https://review.openstack.org/291088 | 10:25 |
*** paramite is now known as paramite|afk | 10:26 | |
openstackgerrit | Purandhar Sairam Mannidi proposed openstack/diskimage-builder: Add support for building images capable of UEFI https://review.openstack.org/287784 | 10:26 |
openstackgerrit | Merged openstack/tripleo-docs: Update python-rdomanager-oscplugin to python-tripleoclient https://review.openstack.org/290553 | 10:27 |
*** paramite|afk is now known as paramite | 10:28 | |
*** liverpooler has joined #tripleo | 10:29 | |
shardy | Simple docs patch needs a second +2/A please - https://review.openstack.org/#/c/283582/ | 10:32 |
openstackgerrit | Merged openstack/tripleo-docs: Document deploying the overcloud with ssl https://review.openstack.org/265006 | 10:33 |
*** athomas has joined #tripleo | 10:34 | |
marios | shardy: done | 10:35 |
shardy | thanks! | 10:35 |
marios | shardy: can you +2 the cherrypick for that systemctl fix you just +A (thanks for that) https://review.openstack.org/#/c/291088/1 | 10:37 |
openstackgerrit | Merged openstack/tripleo-docs: Removed reference to SpinalStack to prevent confusion https://review.openstack.org/283582 | 10:37 |
shardy | marios: done! | 10:38 |
marios | shardy: tyvm | 10:39 |
*** stendulker has quit IRC | 10:45 | |
*** Marga_ has joined #tripleo | 10:47 | |
*** paramite is now known as paramite|afk | 10:50 | |
jistr | ceph upgrades ready to land, just need a +2 https://review.openstack.org/#/c/289896/ | 10:57 |
jistr | same for upgrade init command for repo switching https://review.openstack.org/#/c/290465/ | 10:57 |
shardy | jistr: So, what's the rationale behind not actually running the script? | 10:59 |
shardy | Obviously we're going for the manual approach for computes due to the need for migrating workloads, but could this safely be automated? | 11:00 |
openstackgerrit | Attila Darazs proposed openstack-infra/tripleo-ci: Use IPv6 on the ceph gate job https://review.openstack.org/289445 | 11:01 |
jistr | shardy: it would run on all nodes at the same time, i'm not sure if that's safe if we want to keep ceph data availability. I'd guess it isn't. | 11:02 |
shardy | jistr: Ok, so we need a way to do a rolling apply of the script | 11:02 |
shardy | makes sense, thanks | 11:02 |
shardy | jistr: I wonder if we should modify SoftwareDeploymentGroup, so it has an option to serialize the deployments | 11:04 |
shardy | and/or do them in batches | 11:04 |
shardy | we already have that support in ResourceGroup (which SoftwareDeploymentGroup is based on), so would potentially be quite easy | 11:05 |
* shardy adds that to the list of things to look into | 11:05 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-common: Install the upgrade-non-controller.sh script with tripleo-common https://review.openstack.org/291101 | 11:06 |
marios | jistr: not sure if that is right yet... testing ^^^ | 11:06 |
jistr | shardy: yeah that would be quite useful i think. It could avoid a CDN hit in other situations, for example. And we might be able to do a minor update without having to control everything synchronously from tripleoclient. | 11:06 |
shardy | ramishra: ^^ Hey maybe this might be something you'd be interested in looking at? | 11:10 |
shardy | ramishra: we'd like SoftwareDeploymentGroup to expose the new rolling update features of ResourceGroup | 11:10 |
*** pblaho has quit IRC | 11:11 | |
*** aufi has quit IRC | 11:11 | |
ramishra | shardy: hey, surely I'll add to my newton todo:) | 11:13 |
*** paramite|afk is now known as paramite | 11:16 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Updated the heat_template_version https://review.openstack.org/288116 | 11:19 |
shardy | ramishra: thanks! :) | 11:23 |
*** ishant has quit IRC | 11:25 | |
*** akrivoka has quit IRC | 11:26 | |
*** trown|outtypewww is now known as trown | 11:27 | |
*** paramite is now known as paramite|afk | 11:31 | |
*** oshvartz has joined #tripleo | 11:33 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Increase default netdev_max_backlog to 10x https://review.openstack.org/289907 | 11:39 |
*** akrivoka has joined #tripleo | 11:40 | |
openstackgerrit | Merged openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy https://review.openstack.org/287961 | 11:42 |
*** mbound has quit IRC | 11:47 | |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: Updated the heat_template_version https://review.openstack.org/288134 | 11:48 |
*** paramite|afk is now known as paramite | 11:51 | |
*** trown is now known as trown|outtypewww | 11:56 | |
slagle | look at all that green | 12:02 |
*** jaosorior has quit IRC | 12:02 | |
*** jaosorior has joined #tripleo | 12:03 | |
shardy | I think we do have an issue with the lint check on stable tho: | 12:04 |
shardy | https://review.openstack.org/#/c/288867/ | 12:04 |
shardy | it's failing on the verify gate on lines unrelated to the patch I think | 12:04 |
* shardy looks for patch which fixes it | 12:05 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Make service certificate come from explicit input https://review.openstack.org/291132 | 12:06 |
jaosorior | shardy: That's the quick patch for the SoftwareConfig stuff we talked about in the morning ^^ | 12:07 |
openstackgerrit | Dmitry Tantsur proposed openstack/python-tripleoclient: Remove hardcoded delay between introspections https://review.openstack.org/291135 | 12:09 |
shardy | Hrm, we have the same long line on master | 12:09 |
jaosorior | Enabling TLS for the CI is all green :D https://review.openstack.org/#/c/281988/ if someone has time to check that out | 12:09 |
slagle | shardy: yea, these liberty patches passed the lint job in the check, but then failed it in the gate | 12:10 |
shardy | https://github.com/rodjek/puppet-lint/commit/2b48ab36bb5334a41f98f8bd75867cd69eb6f859 | 12:11 |
shardy | Needs to be under 140 chars | 12:11 |
* shardy fixes | 12:11 | |
shardy | Hmm, that should just be a warning tho | 12:12 |
openstackgerrit | Steven Hardy proposed openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291136 | 12:16 |
openstackgerrit | Steven Hardy proposed openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291137 | 12:17 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/python-tripleoclient: Remove hardcoded delay between introspections https://review.openstack.org/291135 | 12:23 |
openstackgerrit | Mike Burns proposed openstack/tripleo-heat-templates: Use service tenant for ceilometer https://review.openstack.org/291140 | 12:25 |
openstackgerrit | Mike Burns proposed openstack/instack-undercloud: Use service tenant for ceilometer https://review.openstack.org/291142 | 12:25 |
openstackgerrit | Mike Burns proposed openstack/tripleo-heat-templates: controller/ceilometer: use internalURL for os endpoint type https://review.openstack.org/291145 | 12:31 |
*** weshay has joined #tripleo | 12:31 | |
slagle | EmilienM: hi, getting a CI failure on one of the ipv6 patches, https://review.openstack.org/#/c/272089/ | 12:32 |
openstackgerrit | Steven Hardy proposed openstack/python-tripleoclient: Allow node import via yaml not only csv/json https://review.openstack.org/255228 | 12:32 |
slagle | EmilienM: "Error: Cannot reassign variable nova_ipv6" | 12:32 |
slagle | EmilienM: i don't see where we are reassigning it | 12:32 |
*** lucasagomes is now known as lucas-hungry | 12:34 | |
openstackgerrit | Steven Hardy proposed openstack/tripleo-docs: Update baremetal import to not use --json option https://review.openstack.org/291147 | 12:35 |
openstackgerrit | Steven Hardy proposed openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291136 | 12:37 |
EmilienM | hello | 12:38 |
openstackgerrit | Steven Hardy proposed openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291137 | 12:39 |
EmilienM | slagle: will look asap | 12:39 |
*** paramite is now known as paramite|afk | 12:39 | |
slagle | EmilienM: ok, thanks. i'm stumped on it. b/c I only see nova_ipv6 assigned to one time | 12:40 |
slagle | unless the variable name is used in another module? | 12:40 |
*** paramite|afk is now known as mmagr | 12:42 | |
*** mmagr is now known as paramite | 12:42 | |
EmilienM | slagle: https://review.openstack.org/#/c/272089/11/puppet/extraconfig/ceph/ceph-external-config.yaml | 12:45 |
*** pcaruana has quit IRC | 12:45 | |
shardy | Ok https://review.openstack.org/#/q/I7b3e6177d160f6f0cb775636f25baed2164d2002,n,z does appear to fix the lint failures for instack-undercloud | 12:46 |
shardy | I'm not sure why that wasn't failing before tho tbh | 12:46 |
*** Goneri has quit IRC | 12:46 | |
adarazs | folks, SOS, I'm still seeing trouble with IPv6 and rabbit, I have this in the gate job (on the new IPv6 gate): | 12:48 |
adarazs | Error: curl -k --noproxy localhost --retry 30 --retry-delay 6 -f -L -o /var/lib/rabbitmq/rabbitmqadmin http://guest:guest@fd00:fd00:fd00:2000::12:15672/cli/rabbitmqadmin returned 7 instead of one of [0] | 12:48 |
adarazs | http://logs.openstack.org/45/289445/4/check-tripleo/gate-tripleo-ci-f22-ceph/b6997a4/console.html | 12:48 |
adarazs | that is supposed to be bracketed. and the rabbitmq IPv6 change was merged, so it's supposed to work. | 12:49 |
jistr | marios: o/ | 12:49 |
jistr | marios: could you please review the cinder upgrade when you have a minute, it finally passed CI :)) https://review.openstack.org/#/c/287929/ | 12:50 |
marios | jistr: sure | 12:50 |
adarazs | I don't have enough tripleo-fu to figure out where that command comes from. | 12:50 |
slagle | shardy: cool | 12:51 |
*** trown|outtypewww is now known as trown | 12:52 | |
slagle | EmilienM: you are likely right in that review, but that environment file doesn't get used in CI, so i don't think that would cause the nova_ipv6 issue | 12:53 |
EmilienM | slagle: let me look again, I'm still reading logs | 12:53 |
*** pblaho has joined #tripleo | 12:55 | |
jistr | marios: thanks! | 12:55 |
marios | jistr: after I +A I thought of the -q... do you want it there? | 12:55 |
*** aufi has joined #tripleo | 12:56 | |
EmilienM | slagle: we might have merged something already that would cause that | 12:56 |
marios | jistr: commented there fwiw | 12:56 |
EmilienM | slagle: I'm looking at it, it's in HA jobs only | 12:57 |
*** rhallisey has joined #tripleo | 12:57 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Upgrade of Cinder block storage nodes https://review.openstack.org/287929 | 12:57 |
rhallisey | derekh, morning. Can you see what the journal log returned for the container job? | 12:57 |
*** oshvartz has quit IRC | 12:58 | |
jistr | marios: good catch, thanks! | 12:59 |
marios | jistr: ... bit late now :/ sry | 12:59 |
*** yamahata has quit IRC | 13:00 | |
openstackgerrit | Martin Mágr proposed openstack/tripleo-heat-templates: Keystone domain for Heat https://review.openstack.org/180566 | 13:00 |
EmilienM | slagle: I think I found it | 13:00 |
jistr | marios: no it's fine, i'm not sure if that *always* has to cause problems, probably not, so good that we have it merged so that we can progress with backporting the most important stuff | 13:00 |
*** pcaruana has joined #tripleo | 13:00 | |
jistr | hrmm we don't have gfidente | 13:01 |
*** dprince has joined #tripleo | 13:01 | |
EmilienM | slagle: would it be possible that HA jobs fail since bb05fa304a2eed2caa4840e8039832d369a357f7 ? | 13:01 |
EmilienM | slagle: other HA jobs fail too? | 13:03 |
slagle | EmilienM: other ha jobs are passing | 13:03 |
EmilienM | looking at http://tripleo.org/cistatus.html, it seems ok | 13:03 |
slagle | EmilienM: they also passed on that patch that added nova_ipv6, https://review.openstack.org/#/c/270110/ | 13:03 |
EmilienM | slagle: ok I had a patch but in fact i think it's useless | 13:04 |
EmilienM | slagle: can I push over the patch to address my comment? | 13:04 |
EmilienM | slagle: I think gfidente is not here today | 13:05 |
slagle | sure | 13:05 |
EmilienM | ok | 13:05 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Support the deployment of Ceph over IPv6 https://review.openstack.org/272089 | 13:06 |
EmilienM | let's try again, see if the error happens again | 13:07 |
EmilienM | slagle: is it possible that a bug in heat would duplicate puppet manifests? | 13:07 |
adarazs | derekh: ^ can you help me to track this down? | 13:07 |
slagle | EmilienM: i've not seen it before. but anything is possible | 13:07 |
derekh | adarazs: which error do you want help tracking down? | 13:10 |
*** thrash|g0ne is now known as thrash | 13:10 | |
*** akrivoka has quit IRC | 13:11 | |
*** jayg|g0n3 is now known as jayg | 13:13 | |
*** pradk has joined #tripleo | 13:13 | |
derekh | All, we now have 51 testenvs and (currently) 59 jenkins slaves trying to use them. If the jenkins slaves wait too long for a testenv, then by the time they eventually get one it will be too late to run a full test and ZUUL will time them out, so they spend 1.5 hours using up a testenv only to fail | 13:14 |
derekh | Then the other jobs behind them will fail because they in turn were waiting even more on testenvs, | 13:15 |
derekh | And the whole thing will become a big sea of red timeouts | 13:15 |
derekh | We need to kill jobs that have been waiting for a testenv for more than X minutes to avoid this | 13:15 |
*** morazi has joined #tripleo | 13:15 | |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrade of Cinder block storage nodes https://review.openstack.org/291167 | 13:16 |
derekh | https://review.openstack.org/#/c/290731/ | 13:16 |
slagle | should we just merge that? | 13:16 |
derekh | That should do it ^^ | 13:16 |
*** lucas-hungry is now known as lucasagomes | 13:17 | |
derekh | slagle: I think so, PS1 proved that it can kill the tests, PS2 has passed at least one of the tests | 13:17 |
trown | +1 to just merging | 13:17 |
slagle | yea, it passed nonha | 13:17 |
*** masco has quit IRC | 13:18 | |
openstackgerrit | Merged openstack-infra/tripleo-ci: Kill CI job if it doesn't get a testenv quickly https://review.openstack.org/290731 | 13:18 |
slagle | derekh: overall, i am seeing a lot more green so far today. i think the redeploy helped | 13:19 |
derekh | slagle: shardy trown thanks | 13:19 |
derekh | slagle: Yup, now that we're fully loaded I'm hoping it stays that way over the next hour or 2 | 13:20 |
*** paramite is now known as paramite|afk | 13:21 | |
trown | derekh: with that job killer patch, are we going to get more cases where a single job of the three fails, requiring a full recheck? | 13:21 |
trown | it would be nice if a job killed because of the 20min timeout was auto-requeued and voted with the other jobs based on the requeued job | 13:22 |
derekh | trown: Yup, quite probably, the real solution is to reduce the number of jenkins slaves | 13:22 |
trown | right, that is a simpler solution :) | 13:22 |
derekh | trown: that patch needs to go into infra/project-config , I'm gonna line that up now, but sometimes it takes a while to get things in | 13:22 |
pradk | can i request some reviews on https://review.openstack.org/#/c/289435/ please | 13:22 |
trown | derekh: yep, makes sense | 13:23 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: quiet yum upgrade on cinder nodes https://review.openstack.org/291173 | 13:24 |
derekh | rhallisey: that last run of the containers job failed to deploy the compute node, so I got no logs there to look at (it's the compute node logs you wanted, isn't it?) | 13:25 |
derekh | 2016-03-10 04:26:22.953 | | e9db9ca2-5fc7-4c14-ac30-19c4fc54e8eb | overcloud-novacompute-0 | ERROR | - | NOSTATE | | | 13:25 |
rhallisey | really? I thought it passed earlier.. | 13:25 |
rhallisey | or rather spawned the node.. | 13:25 |
rhallisey | let me look again | 13:25 |
derekh | rhallisey: ok, if it has, and you have a computenode tarball from the ci run, you should now have the journal log entries in that tarball | 13:26 |
rhallisey | where would the tarball be though? | 13:26 |
*** paramite|afk is now known as paramite | 13:31 | |
*** akrivoka has joined #tripleo | 13:37 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Enable glance-api show_image_direct_url for COW https://review.openstack.org/290358 | 13:37 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Set notification driver for nova to send https://review.openstack.org/288497 | 13:38 |
*** tremble has joined #tripleo | 13:38 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Upgrades: install zaqarclient https://review.openstack.org/287708 | 13:41 |
jistr | adarazs: looking further, i'm thinking this might actually need a fix in puppetlabs-rabbitmq. I'm a bit puzzled how this could have worked for anyone before. | 13:42 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add support for DeployArtifactURLs https://review.openstack.org/289094 | 13:42 |
jistr | adarazs: we're passing unbracketed IP into puppetlabs-rabbitmq, and it seems like it's using it for both cases where unbracketed IP would go, and where a bracketed IP would go | 13:44 |
jistr | adarazs: i'll try submitting a patch to t-h-t to pass a bracketed IP, which should fix the problem you're having, but it might break rabbitmq's config file for a change | 13:45 |
*** saneax is now known as saneax_AFK | 13:45 | |
jistr | adarazs: there might be something we're missing though. Do you know who from the network team had rabbitmq working on IPv6? | 13:46 |
*** links has quit IRC | 13:48 | |
*** jdob has joined #tripleo | 13:50 | |
*** akuznetsov has joined #tripleo | 13:51 | |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: IPv6: pass bracketed IP to rabbitmq puppet module https://review.openstack.org/291186 | 13:54 |
*** mkovacik has quit IRC | 13:54 | |
*** mkovacik has joined #tripleo | 13:54 | |
*** akuznetsov has quit IRC | 13:55 | |
jistr | adarazs: ^^ that's the patch, but as i said, it might fix the curl but break something else. So i gave it WIP status. Maybe a fix in puppetlabs-rabbitmq would be better. | 13:57 |
*** Goneri has joined #tripleo | 13:57 | |
* jistr back to upgrades | 13:57 | |
*** snecklifter has joined #tripleo | 13:59 | |
*** jaosorior has quit IRC | 13:59 | |
*** mbound has joined #tripleo | 14:00 | |
snecklifter | Hello, I've been debugging OSP-d installation on Lenovo hardware | 14:00 |
adarazs | jistr: do you have a patch for the rabbit uri thing? (sorry to pester you about it, just trying to make the ipv6 gate asap) | 14:00 |
snecklifter | It looks like the latter exposes a cdc_ether device which potentially tripleo is seeing as an active nic | 14:01 |
jistr | adarazs: :D yes | 14:01 |
jistr | adarazs: i wrote you ~5 or so messages about it, see above | 14:01 |
*** rlandy has joined #tripleo | 14:01 | |
*** akuznetsov has joined #tripleo | 14:02 | |
snecklifter | Does this sound plausible? I'm wondering what the logic is for determining if a nic is active | 14:02 |
snecklifter | 2: enp0s29u1u1u5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000 | 14:02 |
snecklifter | ip reports as unknown | 14:03 |
snecklifter | but ethtool reports link is up | 14:03 |
*** openstackgerrit has quit IRC | 14:03 | |
*** openstackgerrit_ has joined #tripleo | 14:03 | |
snecklifter | Link detected: yes | 14:03 |
shardy | snecklifter: https://github.com/openstack/os-net-config/blob/master/os_net_config/utils.py#L53 | 14:04 |
shardy | that's the logic | 14:04 |
*** trozet has quit IRC | 14:04 | |
*** ifarkas has quit IRC | 14:04 | |
snecklifter | shardy, thanks | 14:04 |
*** openstackgerrit_ is now known as openstackgerrit | 14:04 | |
*** openstackgerrit has quit IRC | 14:04 | |
*** openstackgerrit_ has joined #tripleo | 14:05 | |
*** openstackgerrit_ is now known as openstackgerrit | 14:05 | |
*** openstackgerrit has quit IRC | 14:05 | |
*** openstackgerrit_ has joined #tripleo | 14:06 | |
snecklifter | shardy, what tool is it using to determine presence of carrier and address - sorry, not clear to me | 14:06 |
*** openstackgerrit_ is now known as openstackgerrit | 14:06 | |
*** openstackgerrit has quit IRC | 14:07 | |
*** lblanchard has joined #tripleo | 14:07 | |
*** openstackgerrit_ has joined #tripleo | 14:07 | |
*** openstackgerrit_ is now known as openstackgerrit | 14:08 | |
*** openstackgerrit has quit IRC | 14:08 | |
*** openstackgerrit_ has joined #tripleo | 14:08 | |
shardy | snecklifter: it's looking in /sys/class/net/<device>/carrier | 14:08 |
*** openstackgerrit_ is now known as openstackgerrit | 14:09 | |
*** openstackgerrit has quit IRC | 14:09 | |
*** rodrigods has joined #tripleo | 14:09 | |
snecklifter | ah, simple as that, great, will poke further | 14:09 |
*** openstackgerrit_ has joined #tripleo | 14:09 | |
*** openstackgerrit_ is now known as openstackgerrit | 14:10 | |
*** Guest41345 has joined #tripleo | 14:10 | |
shadower | the CI is no longer busted, is it? | 14:10 |
shadower | (not seeing any problems, just making sure that's the case) | 14:11 |
shardy | shadower: it's working apart from the lint job on instack-undercloud | 14:11 |
shadower | shardy: thanks! | 14:13 |
snecklifter | shardy, could we add /sys/class/net/<device>/operstate == up | 14:16 |
openstackgerrit | Attila Darazs proposed openstack-infra/tripleo-ci: Use IPv6 on the ceph gate job https://review.openstack.org/289445 | 14:18 |
derekh | dprince, hey, if you've got a spare minute or two, can you add your setup here, as it's different from everybody else's: https://etherpad.openstack.org/p/tripleo-dev-env-census | 14:18 |
shardy | snecklifter: yup probably, looks like that could possibly replace the current carrier test? | 14:18 |
shardy | https://www.kernel.org/doc/Documentation/networking/operstates.txt | 14:19 |
*** rbrady has joined #tripleo | 14:20 | |
snecklifter | shardy, sounds like a better test | 14:20 |
snecklifter | local interface also reports UNKNOWN rather than UP but no harm in leaving that check | 14:21 |
dprince | derekh: yeah, I can | 14:21 |
snecklifter | I will prep a patch | 14:21 |
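The link-state check discussed above (reading `/sys/class/net/<device>/carrier`, with `operstate` proposed as an additional or replacement test) could look roughly like the sketch below. This is not the os-net-config code or the patch snecklifter went on to write; the helper names `read_attr` and `link_is_active` and the `sysfs_root` parameter are made up for illustration. It assumes that an `operstate` of `up` is authoritative and that `unknown` (which some drivers and the loopback device report) should fall back to the carrier flag.

```python
from pathlib import Path


def read_attr(device, attr, sysfs_root="/sys/class/net"):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (Path(sysfs_root) / device / attr).read_text().strip()
    except OSError:
        # Missing device, or the kernel refused the read (for example,
        # reading carrier on an interface that is administratively down).
        return None


def link_is_active(device, sysfs_root="/sys/class/net"):
    """Hypothetical check: prefer operstate, fall back to carrier."""
    operstate = read_attr(device, "operstate", sysfs_root)
    if operstate == "up":
        return True
    # "unknown" means the driver does not report an operational state,
    # so the carrier flag is the only remaining hint.
    if operstate in (None, "unknown"):
        return read_attr(device, "carrier", sysfs_root) == "1"
    # down, lowerlayerdown, dormant, notpresent, testing
    return False
```

Keeping the carrier fallback matches the remark that a local interface may report UNKNOWN rather than UP, so dropping the carrier test entirely would misclassify such devices.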
derekh | dprince: thanks | 14:26 |
*** rwsu has quit IRC | 14:33 | |
dtantsur | can someone confirm my understanding of heat that if I need to change something in a template, I have to stack-delete and rebuild? | 14:35 |
dtantsur | i.e. it's not like puppet which modified the existing thing to the declared state? | 14:36 |
dtantsur | shardy, shadower ^^? | 14:36 |
* dtantsur tries to figure out if it's a bug or expected behavior | 14:37 | |
shadower | dtantsur: you should be able to do a "heat stack-update" with the updated templates/parameters | 14:38 |
dtantsur | shadower, so, I'm testing $stuff on the rdo day, and people told me I should add https://github.com/redhat-openstack/tripleo-quickstart/blob/master/playbooks/roles/tripleo/overcloud/templates/overcloud-deploy.sh.j2#L18-L40 | 14:38 |
dtantsur | shadower, I tried adding -e /path/to/such/file.yaml to the deploy command and start it again | 14:38 |
dtantsur | and it resulted in something like "error 4" (on the next attempt, -9), and I dunno if it's worth investigating or if I should just give up and stack-delete first | 14:39 |
shadower | dtantsur: so what you linked is not a heat template | 14:41 |
*** pcaruana has quit IRC | 14:41 | |
* dtantsur is clueless :) | 14:41 | |
dtantsur | shadower, that's fine with me :) was it expected to work at all? | 14:41 |
shadower | dtantsur: I'm not sure. Not familiar with tripleo-quickstart at all yet :-( | 14:42 |
dtantsur | shadower, I'm rather asking if adding this file with -e flag on stack update is expected to work (at least potentially) | 14:42 |
dtantsur | ignore the context for a while :) | 14:42 |
shadower | ah, well yeah adding an environment file on update should work imho | 14:42 |
dtantsur | it didn't :) | 14:43 |
shadower | yeah so assuming the yaml file has the right contents, that sounds like a bug | 14:44 |
dtantsur | do you think I should report it against tripleo? or upstream heat? or tripleo-heat-templates? :) | 14:44 |
tosky | all of them! | 14:45 |
tosky | (sorry) | 14:45 |
dtantsur | easily :D | 14:45 |
*** rwsu has joined #tripleo | 14:50 | |
shardy | dtantsur: I'd start with a bug against tripleo, then we can re-route it if needed | 14:50 |
shardy | dtantsur: in answer to your original question, in nearly all cases it should be possible to update a stack, even from a failed state | 14:51 |
shardy | deleting it and starting again is valid, but shouldn't be mandatory unless you want a clean start | 14:52 |
dtantsur | shardy, ok, do you have some quick checklist which things I should collect for a report in addition to heat resource-list/show? | 14:52 |
shardy | dtantsur: The exact steps to reproduce, any error output, and any associated error in /var/log/heat/engine.log on the undercloud | 14:53 |
*** pcaruana has joined #tripleo | 14:53 | |
dtantsur | shardy, ok | 14:53 |
shardy | dtantsur: for resource-list, do heat resource-list -n5 overcloud | grep FAILED | 14:54 |
shardy | then you'll grab all the nested resources too | 14:54 |
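For a bug report it can be handier to keep the failed rows as structured data rather than raw `grep` output. The sketch below parses the ASCII table that the heat CLI prints and keeps rows whose status ends in `FAILED`. It is only an illustration: `parse_table` and `failed_resources` are hypothetical helpers, and the default column name `resource_status` is an assumption about the table header, so pass a different `status_column` if your client labels it differently.

```python
def parse_table(text):
    """Parse a prettytable-style ASCII table into a list of dicts keyed by header."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip +-----+ border lines and anything that is not a table row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # first row of cells is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def failed_resources(text, status_column="resource_status"):
    """Return the rows whose status column ends with FAILED."""
    return [
        row for row in parse_table(text)
        if row.get(status_column, "").endswith("FAILED")
    ]
```

One way to use it is to save the output of `heat resource-list -n5 overcloud` to a file and pass the file contents to `failed_resources`.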
*** pradk_ has joined #tripleo | 14:54 | |
*** jaosorior has joined #tripleo | 14:56 | |
jaosorior | Has anybody seen the following error while deploying the overcloud in HA? Error: Must pass auth_password to Class[Aodh::Auth] at /var/lib/heat-config/heat-config-puppet/fa59863d-38c0-463f-8754-dbb8b43d4156.pp:1048 on node overcloud-controller-1.localdomain | 14:58 |
openstackgerrit | Tomas Sedovic proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/270831 | 14:58 |
openstackgerrit | Tomas Sedovic proposed openstack/tripleo-heat-templates: Surround MongoDB IPs with braces in the connection string if IPv6 https://review.openstack.org/270154 | 14:58 |
*** akuznetsov has quit IRC | 14:59 | |
openstackgerrit | Gonéri Le Bouder proposed openstack/instack-undercloud: add INTERFACE_MTU parameter https://review.openstack.org/288041 | 14:59 |
dtantsur | shardy, https://bugs.launchpad.net/tripleo/+bug/1555676 and trying to get more information now | 15:00 |
openstack | Launchpad bug 1555676 in tripleo "Failed to add a simple environment file when updating the stack" [Undecided,New] | 15:00 |
shardy | dtantsur: can you add the output of heat deployment-show b1dfc129-91f4-4bce-86d8-fe79aa1c08a4 please? | 15:02 |
shardy | that should give us the stderr of the failed puppet run | 15:03 |
shardy | Actually, sorry, that's 3db70738-20c4-47b1-9c96-cad55212c055 | 15:03 |
*** trozet has joined #tripleo | 15:03 | |
shardy | you need the ID of the OS::Heat::StructuredDeployment resource that's FAILED | 15:04 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Secure haproxy stats endpoint https://review.openstack.org/290912 | 15:04 |
*** devvesa has joined #tripleo | 15:05 | |
dtantsur | shardy, mmm, that's long, lemme fetch it as a file | 15:05 |
*** thrash has quit IRC | 15:05 | |
*** rdopiera has quit IRC | 15:06 | |
slagle | dprince: derekh : i think i might have found an issue with the multiple nics in ci | 15:08 |
*** thrash has joined #tripleo | 15:08 | |
*** thrash has joined #tripleo | 15:08 | |
derekh | slagle: ya? | 15:08 |
slagle | i was looking at one of the jobs that was about to time out | 15:09 |
slagle | the compute node had deployed fine, but when it rebooted, the ctlplane ip became unreachable | 15:09 |
slagle | couldn't ssh or ping | 15:09 |
slagle | turns out there is another job in a different testenv that is also using that same ip for one of its nodes | 15:09 |
slagle | i think this is causing an issue | 15:10 |
dprince | slagle: hmmm. They should be on different bridges though right? | 15:10 |
slagle | i suppose so, yes | 15:10 |
dprince | slagle: like each testenv should have its own bridges now, for each network | 15:10 |
derekh | slagle: ya, the separate bridges should keep them isolated, if it isn't we got a problem | 15:10 |
dprince | slagle: perhaps even some ARP flux is going on or something | 15:10 |
dtantsur | shardy, updated | 15:11 |
slagle | dprince: yea, arp could be it | 15:11 |
slagle | these jobs will probably time out soon, but it's testenv32-testenv1-sbiwjg32inl6 | 15:12 |
shardy | Cannot allocate memory - fork(2) | 15:12 |
shardy | dtantsur: You need more memory or some swap on the overcloud nodes | 15:12 |
dtantsur | shardy, something is terribly wrong with our installer if 4 GiB is not enough even for launching a simple instance... | 15:13 |
shardy | dtantsur: I agree, which is why we need composable services, so you can turn off stuff you don't want | 15:13 |
shardy | as it is, people keep adding stuff and we have no way to turn it off | 15:13 |
dtantsur | shardy, now I understand it's not a question for you, but I have no clue how I (the developer) am supposed to test anything | 15:14 |
shardy | dtantsur: we had to increase the memory on CI nodes from 4G recently for this reason | 15:14 |
shardy | dtantsur: how much ram does your test box have? | 15:14 |
dtantsur | shardy, so, what's the minimum with which I would be able to pass the pingtest on HA? | 15:14 |
dtantsur | shardy, I have a dell box with 32 GiB | 15:14 |
dtantsur | so I can probably bump memory to 6 (with risking of swapping, but still) | 15:15 |
trown | dtantsur: I think the pingtest would have passed with that single worker heat environment... it was just updating the stack without that to include it that bombed out | 15:15 |
slagle | dprince: all the seeds are bridged into the same br-ctlplane though? | 15:16 |
dtantsur | sigh... | 15:16 |
trown | ya | 15:16 |
dtantsur | ok, I'll try stack-delete and rebuild. otherwise I won't be able to test scaling up.. | 15:16 |
dprince | slagle: seriously? did I miss this!? | 15:16 |
slagle | i dunno :) i'm grasping at straws here | 15:17 |
derekh | slagle: yes, they always have been, one nic on br-ctlplane for external access and one bridge on brbmX for internal | 15:17 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Basic beaker one node test. https://review.openstack.org/281376 | 15:17 |
shardy | dtantsur: I run my undercloud and overcloud nodes with 8G (5 nodes total) on a 32G ram box, but with KSM enabled (default) you can just about deploy a 4 node overcloud | 15:18 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Add a service provider. https://review.openstack.org/286124 | 15:18 |
shardy | dtantsur: that said, I mostly stick to 2-3 node deployments (nonha) personally | 15:18 |
slagle | derekh: right, ok. i'm confused how this works i guess | 15:18 |
trown | shardy: KSM enabled is default? | 15:19 |
slagle | there's some networking related issue going on | 15:19 |
slagle | the node is definitely up with the right ip and networking applied (i added a console and watched the cloud-init output) | 15:19 |
shardy | trown: /sys/kernel/mm/ksm is present on my centos7 host | 15:19 |
slagle | but the ip is not reachable externally | 15:19 |
derekh | slagle: dprince could the route on the undercloud be sending traffic out the wrong nic ? | 15:20 |
dprince | slagle: br-ctlplane isn't used for the undercloud ctlplane I think | 15:21 |
dprince | slagle: I think that only gets used for the seed -> jenkins communication. | 15:21 |
derekh | Ya br-ctlplane on the TE host is used for traffic between jenkins slaves and the undercloud | 15:22 |
dprince | derekh: which route? | 15:22 |
trown | shardy: does /sys/kernel/mm/ksm/pages_shared actually show some pages shared though? | 15:22 |
dprince | slagle: agree the naming of br-ctlplane is confusing though. | 15:22 |
trown | shardy: I have a deployment on my centos host, and that shows 0 | 15:22 |
slagle | dprince: when the node first comes up, it has a default route of 192.0.2.1 though | 15:22 |
shardy | trown: Hmm, I'm just deploying some VMs to find out :) | 15:22 |
slagle | i saw that in the cloud-init output | 15:22 |
openstackgerrit | Dan Radez proposed openstack/os-cloud-config: Adding support for pxe_amt and amt_agent https://review.openstack.org/291232 | 15:23 |
derekh | dprince: dunno, I took some straws out of the packet slagle was clutching | 15:23 |
*** jaosorior has quit IRC | 15:23 | |
slagle | dprince: and if there are multiple 192.0.2.1's on that bridge... | 15:23 |
dprince | slagle: what if once the undercloud is installed we deleted the route? | 15:23 |
slagle | from the neutron subnet? | 15:24 |
dprince | slagle: the seed vm only needs external connectivity while it is installing instack-undercloud right | 15:24 |
slagle | until it does the pingtest | 15:24 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-heat-templates: Allow the containerized compute node to spawn larger VMs https://review.openstack.org/291235 | 15:24 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-heat-templates: Remove unused Neutron Agents container https://review.openstack.org/291236 | 15:24 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-heat-templates: Parameterize the heat-docker-agents image https://review.openstack.org/291237 | 15:24 |
slagle | it will need it again to download the image | 15:24 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fixup systemctl_swift stop/start during the controller upgrade https://review.openstack.org/291088 | 15:24 |
derekh | slagle: dprince how about we take a test env host out of rotation, and bring up 3 jenkins slaves using the same TE host for envs and manually run jobs we can poke at | 15:25 |
openstackgerrit | Attila Darazs proposed openstack-infra/tripleo-ci: Use IPv6 on the ceph gate job https://review.openstack.org/289445 | 15:25 |
dprince | derekh: yep, lets do it | 15:25 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Upgrades: object storage node upgrade fix https://review.openstack.org/289826 | 15:25 |
shardy | trown: weird, ksm and ksmtuned services are running, the kernel stuff is loaded, and overcommitting does seem to work, but it's not sharing pages AFAICS | 15:25 |
*** fgimenez has quit IRC | 15:25 | |
derekh | dprince: ok, this will take a little time to setup, I'll be back with login details in a bit | 15:26 |
trown | shardy: hmm, that would be a huge win if that worked... there has to be a lot that could be shared | 15:27 |
trown | larsks: do you know anything about ksm ^ | 15:27 |
larsks | trown: not really, other than that it exists :) | 15:27 |
trown | larsks: k, that is the extent of my knowledge as well | 15:28 |
openstackgerrit | Dan Radez proposed openstack/os-cloud-config: Adding support for pxe_amt https://review.openstack.org/282077 | 15:28 |
derekh | afazekas: you're not using that instance on the ci cloud, are you? I want to zap it if I can | 15:28 |
openstackgerrit | Jaume Devesa proposed openstack/tripleo-docs: Add MidoNet documentation in advanced deployment https://review.openstack.org/270320 | 15:28 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Basic beaker one node test. https://review.openstack.org/281376 | 15:30 |
openstackgerrit | Christopher Brown proposed openstack/os-net-config: Fixes lp bug 1555669 https://review.openstack.org/291243 | 15:30 |
openstack | Launchpad bug 1555669 in os-net-config "better link state detection" [Undecided,New] https://launchpad.net/bugs/1555669 - Assigned to Christopher Brown (snecklifter) | 15:30 |
snecklifter | shardy, ^^^ | 15:31 |
* bnemec glares at puppet-lint | 15:31 | |
larsks | trown: shardy: ...but on my system, ksmtuned is running and looking at the values in /sys/kernel/mm/ksm it seems as if there is page sharing going on. | 15:31 |
bnemec | I turned on KSM for my single-node OpenStack box. It crashed within 24 hours. | 15:32 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Add a service provider. https://review.openstack.org/286124 | 15:32 |
bnemec | /data point | 15:32 |
larsks | bnemec: I ate lunch yesterday and my phone crashed. I'm not sure that's a data point, unless there is some corroborating evidence :) | 15:33 |
bnemec | larsks: Well, in this case the kernel oops included a trace that went through ksm. :-) | 15:33 |
larsks | It looks like ksm is enabled by default on centos7 (and presumably RHEL). | 15:33 |
slagle | derekh: what is the tedev port on br-ctlplane? | 15:33 |
larsks | (That is, I didn't explicitly enable it on my system...) | 15:34 |
openstackgerrit | Merged openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291136 | 15:34 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Generate fedora-atomic images using dib https://review.openstack.org/287167 | 15:34 |
*** mbound has quit IRC | 15:34 | |
openstackgerrit | Merged openstack/instack-undercloud: Fix long line in puppet-stack-config.pp https://review.openstack.org/291137 | 15:34 |
derekh | slagle: that was put there to give the Host an IP on the 192.168.1.0/24 network | 15:34 |
slagle | ok | 15:35 |
trown | larsks: shardy, going to try a drastically overcommitted setup to test ksm page sharing... it does appear to be running by default on CentOS, but doesn't try to share pages unless they would otherwise be swapped out | 15:37 |
*** Goneri has quit IRC | 15:40 | |
*** Goneri has joined #tripleo | 15:40 | |
*** trozet has quit IRC | 15:40 | |
*** adarazs has quit IRC | 15:41 | |
*** adarazs has joined #tripleo | 15:42 | |
dtantsur | folks, could you please take a look at https://review.openstack.org/#/c/288417/ ? | 15:43 |
*** paramite is now known as paramite|afk | 15:43 | |
dtantsur | without this thing, people are complaining that IPA has a different root device selection logic | 15:43 |
dtantsur | and we have no way to override it | 15:43 |
dtantsur | meaning that without root device hints, the root device will change for many people on the next rebuild :( | 15:44 |
dtantsur | (I don't really like this patch, but I dunno what we could do) | 15:44 |
dtantsur | lucasagomes, ^^ | 15:44 |
* lucasagomes looks | 15:44 | |
lucasagomes | look* | 15:44 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Basic beaker one node test. https://review.openstack.org/281376 | 15:45 |
openstackgerrit | Pradeep Kilambi proposed openstack/os-cloud-config: add aodh and gnocchi to keystone service list https://review.openstack.org/272110 | 15:45 |
*** paramite|afk is now known as paramite | 15:46 | |
adarazs | when I get "Merge Failed." from Gerrit, how can I figure out what depends-on patch is actually failing to merge? | 15:46 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/puppet-pacemaker: Add a service provider. https://review.openstack.org/286124 | 15:46 |
adarazs | or what method does Jenkins use to merge them? I tried to cherry pick them all in the specified order and it worked. | 15:46 |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Gnocchi as a Ceilometer metrics storage backend https://review.openstack.org/252032 | 15:47 |
dtantsur | adarazs, it's overly paranoid sometimes | 15:47 |
openstackgerrit | Merged openstack/instack-undercloud: Remove trailing / on keystone admin endpoint https://review.openstack.org/290724 | 15:47 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: object storage node upgrade fix https://review.openstack.org/291254 | 15:47 |
openstackgerrit | Merged openstack/instack-undercloud: Enable heat-manage purge_deleted cron job https://review.openstack.org/282899 | 15:47 |
lucasagomes | dtantsur, +1 and left a nit inline | 15:49 |
*** bvandenh has joined #tripleo | 15:49 | |
lucasagomes | dtantsur, not sure if it's possible tho, I'm not very familiar with the project and it may not be multi-threaded as I think | 15:49 |
lucasagomes | but added the nit to document it anyway | 15:49 |
dtantsur | lucasagomes, it's not multithreaded right now | 15:50 |
dtantsur | yeah, thanks | 15:50 |
shardy | adarazs: sometimes it just means the verify gate jobs failed, not necessarily that there's a merge conflict | 15:50 |
adarazs | shardy: okay, so what can I do? play with the "Depends-On" order of the patches until it passes? | 15:51 |
shardy | adarazs: what patch is it? | 15:51 |
shardy | adarazs: you can do reverify if it's a transient error | 15:52 |
*** yamahata has joined #tripleo | 15:52 | |
adarazs | shardy: https://review.openstack.org/289445 -- adding IPv6 to the gate. it needs a bunch of THT patches for having a chance to pass. | 15:52 |
adarazs | it worked until recently, when I added a hotfix from jistr, but I doubt that the problem is Jiri's patch. | 15:54 |
shardy | adarazs: Hmm, yeah that's not very clear - I thought you meant a merge failure after approval | 15:57 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Enable notifications on undercloud https://review.openstack.org/289518 | 15:57 |
adarazs | shardy: nope. I don't even know what method it uses to merge all this depends-on stuff if the patches are in the same repo. | 15:57 |
shardy | adarazs: ah, that may be the problem | 15:59 |
shardy | you may have a merge conflict between two different Depends-On changes in the same repo, e.g t-h-t | 15:59 |
*** xinwu has joined #tripleo | 15:59 | |
adarazs | yes. all of them are tht changes. | 15:59 |
shardy | ideally you only want one Depends-On per repo, pointing to the head of any series there | 15:59 |
*** jistr has quit IRC | 15:59 | |
shardy | adarazs: You may need to rebase the t-h-t patches into a series, then just depend on the top of the branch | 16:00 |
adarazs | shardy: hm, okay. I will try that. | 16:00 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Remove trailing / on keystone admin endpoint https://review.openstack.org/291266 | 16:00 |
*** paramite has quit IRC | 16:00 | |
*** jtomasek has quit IRC | 16:00 | |
openstackgerrit | Merged openstack/instack-undercloud: Use pymysql database driver for OpenStack DBs https://review.openstack.org/284955 | 16:01 |
*** absubram has joined #tripleo | 16:02 | |
*** absubram_ has joined #tripleo | 16:04 | |
*** mbound has joined #tripleo | 16:05 | |
*** pradk has quit IRC | 16:06 | |
*** absubram has quit IRC | 16:06 | |
*** pradk_ is now known as pradk | 16:06 | |
*** absubram_ is now known as absubram | 16:06 | |
*** eggmaste` is now known as eggmaster | 16:07 | |
dtantsur | trown, do I need more memory on computes or only controllers? or only computes? | 16:08 |
*** bvandenh has quit IRC | 16:08 | |
trown | dtantsur: controllers are where the pressure is | 16:08 |
trown | shardy: larsks: [root@desk-trown ~]# cat /sys/kernel/mm/ksm/pages_sharing | 16:09 |
trown | 1547511 | 16:09 |
trown | so it does "just work" on centos | 16:09 |
larsks | trown: yeah, that's pretty much what I was seeing... | 16:09 |
trown | you just have to be overcommitted for it to kick on | 16:09 |
dtantsur | even more for me :) | 16:09 |
shardy | cool, that explains why I've been able to overcommit then - I guess I didn't launch enough VMs to see it this time | 16:09 |
trown | dtantsur: ya, I am currently running an HA deploy with 12G undercloud and 4 8GB overcloud nodes on a 32G host... so far so good | 16:10 |
dtantsur | awesome | 16:11 |
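The KSM check that trown and shardy did by hand with `cat` can be scripted. The sketch below reads the `run`, `pages_shared` and `pages_sharing` counters from `/sys/kernel/mm/ksm` and reports whether any pages are currently being shared. The function names and the `ksm_dir` parameter are illustrative, not part of any TripleO tooling.

```python
from pathlib import Path


def read_ksm_stats(ksm_dir="/sys/kernel/mm/ksm"):
    """Read a few KSM counters; a value is None if the file is missing or unparsable."""
    stats = {}
    for name in ("run", "pages_shared", "pages_sharing"):
        try:
            stats[name] = int((Path(ksm_dir) / name).read_text())
        except (OSError, ValueError):
            stats[name] = None
    return stats


def ksm_is_merging(stats):
    """True when KSM is switched on and at least one page is being shared."""
    return stats.get("run") == 1 and (stats.get("pages_sharing") or 0) > 0


if __name__ == "__main__":
    current = read_ksm_stats()
    print(current, "merging:", ksm_is_merging(current))
```

As noted in the discussion, a zero in `pages_sharing` does not necessarily mean KSM is broken; on the CentOS hosts discussed here it only started merging once memory was overcommitted.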
d0ugal | slagle: See my comment here: https://review.openstack.org/#/c/288869/ - does it make sense for the parameter to be different between the two files? | 16:13 |
openstackgerrit | James Slagle proposed openstack/instack-undercloud: Revert "run keystone in a wsgi process" https://review.openstack.org/291278 | 16:19 |
EmilienM | ayoung: ^ | 16:22 |
ayoung | EmilienM, what is his nick? | 16:23 |
ayoung | can someone please -2 that | 16:23 |
slagle | d0ugal: it's ok as it is i guess, the other parameters are like that | 16:23 |
ayoung | slagle, kill that please | 16:23 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: IPv6: pass bracketed IP to rabbitmq puppet module https://review.openstack.org/291186 | 16:23 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/270831 | 16:23 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Surround MongoDB IPs with braces in the connection string if IPv6 https://review.openstack.org/270154 | 16:23 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Fix vncproxy_host for IPv6 https://review.openstack.org/287068 | 16:23 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Support the deployment of Ceph over IPv6 https://review.openstack.org/272089 | 16:23 |
slagle | d0ugal: but yea, you'd have to specify both parameters | 16:24 |
*** pblaho has quit IRC | 16:24 | |
*** leanderthal has quit IRC | 16:24 | |
ayoung | slagle, you are going to be inflicting countless Keystone errors on the rest of the world | 16:24 |
ayoung | do | 16:24 |
ayoung | not | 16:24 |
slagle | d0ugal: which is odd | 16:24 |
ayoung | revert | 16:24 |
d0ugal | slagle: Right, that confused me - I got it working by passing both | 16:24 |
adarazs | chooo-chooo here goes the patch train. | 16:24 |
ayoung | bnemec, please remove +2 on https://review.openstack.org/#/c/291278/1 | 16:24 |
slagle | ayoung: it doesnt work on upgrades | 16:25 |
ayoung | slagle, then lets fix that | 16:25 |
EmilienM | do we have upgrade jobs? | 16:25 |
slagle | ayoung: go for it :) | 16:25 |
slagle | EmilienM: we don't. it still has to work | 16:25 |
ayoung | slagle, abandon your patch please | 16:25 |
EmilienM | let's figure what is wrong | 16:25 |
openstackgerrit | Attila Darazs proposed openstack-infra/tripleo-ci: Use IPv6 on the ceph gate job https://review.openstack.org/289445 | 16:25 |
bnemec | It's. Broken. | 16:25 |
slagle | EmilienM: please | 16:25 |
ayoung | bnemec, Eventlet is broken | 16:26 |
slagle | EmilienM: the error is in there | 16:26 |
*** fgimenez has joined #tripleo | 16:26 | |
ayoung | bnemec, Please remove your +2 and put a workflow on it | 16:26 |
bnemec | No | 16:26 |
bnemec | It's fine to revert something if a breakage is found after it merges. | 16:27 |
bnemec | That's what is happening here. | 16:27 |
ayoung | bnemec no | 16:27 |
ayoung | bnemec, lets fix the upgrade | 16:27 |
bnemec | If/when we come up with a fix then it can go back in. | 16:27 |
slagle | ayoung: please do | 16:27 |
ayoung | bnemec, Eventlet is broken | 16:27 |
slagle | or, we can keep discussing | 16:27 |
ayoung | you are going to be putting errors into the installed service | 16:27 |
EmilienM | slagle: do you have httpd logs? | 16:28 |
ayoung | and we don't catch them in Keystone anymore...it's HTTPD only | 16:28 |
shardy | slagle: would it help to raise a bug with more details of the issues found - it's not clear from the revert other then apparently it's broken? | 16:28 |
EmilienM | puppet fails to start apache but doesn't show why | 16:28 |
ayoung | so lets fix this, but if you revert you are going to cause a major load of pain | 16:28 |
slagle | marios: do you have the httpd logs? | 16:28 |
ayoung | shardy, can you please put a stop on the revert. | 16:28 |
EmilienM | marios: or journalctl | 16:28 |
EmilienM | we have missed something I guess, the problem is just apache does not work | 16:29 |
EmilienM | I'm sure eventlet was started before | 16:29 |
EmilienM | and binding can't happen. | 16:29 |
shardy | ayoung: lets stop arguing over the revert - quick reverts are OK, and can easily be un-reverted when the issues are resolved | 16:29 |
ayoung | shardy, not this one | 16:29 |
shardy | let's hear what the actual issue is, then make a call if it can be fixed quickly | 16:29 |
EmilienM | https://github.com/openstack/puppet-keystone/blob/stable/liberty/manifests/init.pp#L924 | 16:30 |
EmilienM | it should stop eventlet before running apache | 16:30 |
slagle | apetrich: were you getting the keystone error on upgrade too? | 16:30 |
slagle | thrash: ^? | 16:30 |
ayoung | shardy, fine, but please workflow -1 the revert until we know | 16:30 |
EmilienM | but if eventlet was already started maybe apache started before | 16:30 |
apetrich | slagle, aye | 16:30 |
ayoung | EmilienM, ok...so I suspect Systemd | 16:30 |
apetrich | slagle, cheers! | 16:30 |
EmilienM | we can suspect anything; I want to see the apache logs | 16:31 |
EmilienM | do not revert this patch before at least providing useful logs | 16:31 |
marios | slagle: EmilienM gimme few will attach to https://bugzilla.redhat.com/show_bug.cgi?id=1316588 | 16:31 |
openstack | bugzilla.redhat.com bug 1316588 in instack-undercloud "Upgrade undercloud fails on keystone error" [High,New] - Assigned to brad | 16:31 |
EmilienM | marios: let me 2 min | 16:31 |
ayoung | EmilienM, what we need to do is remove the whole openstack-keystone systemd config, as that is what is kicking off the start of eventlet | 16:31 |
ayoung | at a minimum, it should be explicitly disabled and the service not run | 16:32 |
EmilienM | the problems looks like in HAproxy | 16:32 |
EmilienM | haproxy[9505]: proxy keystone_admin has no server available! | 16:32 |
EmilienM | haproxy[9505]: proxy keystone_public has no server available! | 16:32 |
EmilienM | no actually eventlet is stopped | 16:33 |
EmilienM | so haproxy is not happy | 16:33 |
*** aufi has quit IRC | 16:33 | |
EmilienM | but apache does not start | 16:33 |
EmilienM | I'm waiting for httpd logs | 16:33 |
slagle | dprince: got some new info | 16:33 |
*** trozet has joined #tripleo | 16:33 | |
bkero | that means that haproxy can't reach the servers associated with keystone_admin and keystone_public | 16:34 |
ayoung | EmilienM, that sounds like HAProxy depends on Keystone being up before starting httpd | 16:34 |
*** rwsu has quit IRC | 16:34 | |
slagle | dprince: managed to get the os-collect-config logs from the failed node, and it looks like the fallback mode of os-net-config took down networking | 16:34 |
ayoung | bkero, of course it can't | 16:34 |
EmilienM | that's not that, I'm pretty sure the answer is in httpd logs | 16:34 |
EmilienM | HAproxy is a warning | 16:34 |
EmilienM | we don't care about that ^ | 16:34 |
ayoung | EmilienM, you think HTTPD failed to start? | 16:34 |
derekh | slagle: dprince I've hijacked testenv32-testenv1 | 16:35 |
derekh | And I've kicked off 3 ha jobs using the following fake jenkins slaves | 16:35 |
EmilienM | ayoung: yes. | 16:35 |
derekh | slagle: dprince 1. 66.187.229.58 2. 66.187.229.82 3. 66.187.229.124 | 16:35 |
EmilienM | haproxy is crying because keystone is down during the upgrade, because we stop eventlet and try to start httpd | 16:35 |
slagle | dprince: http://paste.openstack.org/show/490030/ | 16:35 |
slagle | derekh: ok, i've got a new straw ^^ | 16:36 |
derekh | it's only just started; once the underclouds come up we can take a look | 16:36 |
EmilienM | ayoung: let's wait for marios's logs | 16:36 |
slagle | derekh: something went wrong in os-net-config, and it looks like the fallback mode took down all of networking, or at least the default route to 192.0.2.1 | 16:36 |
*** bvandenh has joined #tripleo | 16:36 | |
slagle | derekh: that paste is from the failed node...i had to save the disk off, mount it, and get the log out | 16:37 |
apetrich | EmilienM, posted systemctl status and error_logs | 16:37 |
* EmilienM looking | 16:37 | |
*** mikelk has quit IRC | 16:37 | |
EmilienM | bingo | 16:37 |
derekh | slagle: nice, we're getting places | 16:37 |
EmilienM | Mar 10 09:18:19 instack.localdomain httpd[10499]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:35357 | 16:37 |
EmilienM | so either keystone eventlet is still started at this time, or HAproxy is stealing the binding, but I don't think it does, since it's bound on VIPs | 16:38 |
EmilienM | it's a problem in vhost config I think, let me check code | 16:38 |
apetrich | EmilienM, I assumed that because of Process: 31065 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=1/FAILURE) it probably could not kill it first | 16:38 |
ayoung | EmilienM, this is all from logs, right? If keystone eventlet started up, it should show in the keystone log | 16:38 |
EmilienM | yeah it's a problem in code | 16:39 |
EmilienM | we let default apache binding in puppet, which is 0.0.0.0 | 16:39 |
EmilienM | and I think haproxy is already using it | 16:39 |
EmilienM | we need to patch https://github.com/openstack/instack-undercloud/blob/stable/liberty/elements/puppet-stack-config/puppet-stack-config.pp#L112 | 16:39 |
EmilienM | to add bind_host options | 16:40 |
EmilienM | I'm doing a patch right now | 16:40 |
EmilienM | the question is, how it works when not testing upgrade | 16:40 |
bnemec | EmilienM: I thought you had already fixed that. Maybe it just needs a backport? | 16:41 |
EmilienM | yeah | 16:41 |
EmilienM | let me check that | 16:41 |
EmilienM | 5d020c717ccc9c4b8758de105816ab6a108dd146 | 16:41 |
openstackgerrit | Emilien Macchi proposed openstack/instack-undercloud: keystone/wsgi: bind on local IP https://review.openstack.org/291292 | 16:41 |
EmilienM | ok it should fix upgrade ^ | 16:42 |
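The "Address already in use" failure diagnosed above comes down to a wildcard bind colliding with a listener that already holds the same port on a specific address. The sketch below reproduces that behaviour with plain sockets on loopback. It is a generic illustration, not the puppet or instack-undercloud fix: `try_bind` is a made-up helper, the port is an ephemeral one picked by the OS rather than 35357, and no `SO_REUSEADDR` is set.

```python
import errno
import socket


def try_bind(host, port):
    """Try to bind and listen; return (socket, None) on success or (None, errno) on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


if __name__ == "__main__":
    # Stand-in for a proxy already listening on one specific address.
    proxy, _ = try_bind("127.0.0.1", 0)
    port = proxy.getsockname()[1]

    # Stand-in for a web server left on its default 0.0.0.0 bind.
    web, err = try_bind("0.0.0.0", port)
    if web is None and err == errno.EADDRINUSE:
        print("wildcard bind refused: port already in use")
    elif web is not None:
        web.close()
    proxy.close()
```

This is why giving the WSGI vhost an explicit bind address, as the overcloud templates already do through `keystone::wsgi::apache::bind_host`, sidesteps the conflict with a proxy listening on a different address.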
*** ohamada has quit IRC | 16:42 | |
slagle | EmilienM: cool, can you test it please | 16:42 |
*** ohamada has joined #tripleo | 16:42 | |
EmilienM | no | 16:42 |
EmilienM | my env is f* up | 16:42 |
ayoung | HA | 16:42 |
ayoung | And there is the rub | 16:42 |
slagle | ayoung: do you want to test it? | 16:42 |
ayoung | slagle, | 16:42 |
EmilienM | I'm sure this is the fix | 16:42 |
EmilienM | but yeah we need to test it | 16:43 |
ayoung | slagle, I can honestly say I do not know how to test it. I have a machine running tripleo | 16:43 |
ayoung | but it has the patch already | 16:43 |
ayoung | slagle, how was the error originally discovered | 16:43 |
apetrich | EmilienM, wait a moment I can test that now | 16:44 |
slagle | ayoung: osp testing | 16:44 |
EmilienM | apetrich++ | 16:44 |
slagle | apetrich: thanks | 16:44 |
ayoung | apetrich, TYVM | 16:44 |
*** pcaruana has quit IRC | 16:47 | |
pradk | EmilienM, do we need/have a similar fix on overcloud? iirc there was a similar issue jprovazn ran into where wsgi was conflicting with ha proxy bind | 16:47 |
EmilienM | that's a good question | 16:47 |
EmilienM | let me check | 16:47 |
EmilienM | yes we have | 16:48 |
*** xinwu has quit IRC | 16:48 | |
EmilienM | puppet/controller.yaml has the right options so we are good | 16:48 |
EmilienM | puppet/controller.yaml: keystone::wsgi::apache::bind_host: {get_input: keystone_public_api_network} | 16:48 |
EmilienM | puppet/controller.yaml: keystone::wsgi::apache::admin_bind_host: {get_input: keystone_admin_api_network} | 16:48 |
thrash | slagle: yes | 16:48 |
pradk | cool | 16:48 |
*** ohamada_ has joined #tripleo | 16:52 | |
*** MaxPC has joined #tripleo | 16:52 | |
*** ohamada has quit IRC | 16:52 | |
thrash | EmilienM: slagle ayoung I'll manually test that backport. | 16:52 |
*** ohamada_ has quit IRC | 16:52 | |
*** ohamada_ has joined #tripleo | 16:53 | |
ayoung | thrash, cool thanks | 16:53 |
ayoung | slagle, in the interest of keeping me from blowing a gasket in case the revert gets accidentally approved, and as a courtesy to me, who has to fix all the nastiness in Keystone Eventlet bugs when they are reported, could you please -1 Workflow the revert for now? | 16:54 |
slagle | i liked it better when you were making demands | 16:55 |
ayoung | slagle, 4 years I've battled this beast | 16:55 |
ayoung | http://adam.younglogic.com/2012/03/keystone-should-move-to-apache-httpd/ | 16:55 |
ayoung | slagle, and, in that time, we built all the federation infrastructure, which is dependent on the Apache modules for Crypto. So, without this, I don't even have a prayer of setting it up, even as a one off or proof of concept | 16:56 |
*** devvesa has quit IRC | 17:00 | |
apetrich | EmilienM, slagle, ayoung good stuff | 17:00 |
*** tosky has quit IRC | 17:00 | |
apetrich | EmilienM, it's still deploying but passed that step | 17:00 |
EmilienM | cool | 17:01 |
ayoung | apetrich, is that a request? That I buy some good stuff for the next face to face? You all going to Austin? | 17:01 |
apetrich | ayoung, not this time unfortunately | 17:02 |
apetrich | Undercloud install complete. | 17:02 |
EmilienM | ok, we can merge my backport then | 17:03 |
EmilienM | apetrich: thx a lot for the testing | 17:03 |
ayoung | EmilienM, so that was not an upgrade test, just a smoke test? | 17:03 |
apetrich | EmilienM, No worries I'm stuck on that one | 17:03 |
ayoung | apetrich, ^^? | 17:03 |
*** lucasagomes is now known as lucas-afk | 17:04 | |
apetrich | ayoung, no this was an upgrade test. I had an env that failed on the upgrade. now it passed all the way | 17:04 |
apetrich | no, * | 17:04 |
*** derekh has quit IRC | 17:05 | |
apetrich | ayoung, so this was an actual CI upgrade test that passed the undercloud upgrade | 17:05 |
thrash | ayoung: I'll be there. :D | 17:05 |
ayoung | thrash, excellent. | 17:05 |
ayoung | apetrich, nice | 17:05 |
trown | I think we should revert the wsgi patch just for giggles anyways | 17:06 |
ayoung | So happy that this is going ahead. | 17:06 |
ayoung | trown, have you ever met me in person? | 17:06 |
trown | :) | 17:06 |
thrash | trown: you are evil. I like that. | 17:06 |
trown | ayoung: ya we met in Tokyo | 17:07 |
* EmilienM coughs | 17:07 | |
thrash | ayoung: my test appears to be running fine as well. | 17:07 |
ayoung | whew | 17:08 |
thrash | ayoung: I put a W-1 on slagle's patch for the sake of your blood pressure. | 17:08 |
MaxPC | is this the right time to mention ayoung thought this was for OSP 9 and not 8 ? | 17:08 |
thrash | haha | 17:09 |
ayoung | Oh, MaxPC | 17:09 |
MaxPC | :p | 17:09 |
*** bvandenh has quit IRC | 17:09 | |
slagle | MaxPC: i dunno, i was going to point out to him that keystone isnt wsgi in the overcloud | 17:09 |
MaxPC | looool | 17:09 |
slagle | which is where i think he'd care about it the most | 17:10 |
slagle | :) | 17:10 |
ayoung | slagle, that is true | 17:10 |
thrash | finally... a clean CI run on https://review.openstack.org/#/c/288568/ | 17:11 |
ayoung | slagle, it's like a cancer. You want to kill it wherever you see it, and any remission is deathly | 17:11 |
MaxPC | I do agree with the sentiment though, let's not give up on it at the first gray cloud | 17:13 |
MaxPC | there might be fair weather on the other side | 17:13 |
slagle | MaxPC: no one did that. | 17:13 |
MaxPC | I know :-) | 17:13 |
slagle | broken is broken, reverts are an option in that case | 17:13 |
slagle | in this case, we got a quick fix, that is great | 17:14 |
MaxPC | I just got a lot of noise around this very quickly :-) it's all good, no blame here | 17:14 |
*** fgimenez has quit IRC | 17:14 | |
MaxPC | only love | 17:15 |
MaxPC | we all want the best product possible | 17:15 |
trown | in my experience posting a revert is the quickest way to get a quick fix if one is possible :) | 17:16 |
bnemec | trown: +1 :-) | 17:16 |
EmilienM | slagle: I should have backported that patch - mea culpa... | 17:16 |
bnemec | I missed it too. It would have broken SSL undercloud on Liberty, so we needed it anyway. | 17:17 |
openstackgerrit | Merged openstack/tripleo-common: Add capabilities filter for Nova https://review.openstack.org/290942 | 17:17 |
*** sthillma has joined #tripleo | 17:18 | |
*** ccamacho has quit IRC | 17:18 | |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/270831 | 17:19 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Fix vncproxy_host for IPv6 https://review.openstack.org/287068 | 17:21 |
bnemec | Speaking of not breaking SSL, https://review.openstack.org/#/c/281988 has passed CI and the dependency is in. | 17:21 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: Support the deployment of Ceph over IPv6 https://review.openstack.org/272089 | 17:21 |
*** sthillma_ has joined #tripleo | 17:22 | |
bnemec | Oh, and I should note that it works on stable too: https://review.openstack.org/#/c/287425/ | 17:22 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-heat-templates: IPv6: pass bracketed IP to rabbitmq puppet module https://review.openstack.org/291186 | 17:23 |
*** sthillma has quit IRC | 17:23 | |
*** sthillma_ is now known as sthillma | 17:23 | |
*** tosky has joined #tripleo | 17:23 | |
*** ccamacho has joined #tripleo | 17:27 | |
dprince | slagle: is this related to http://git.openstack.org/cgit/openstack/os-net-config/commit/?id=c545e46f8fe2362df81e86c187aa6e50be185ad6 | 17:28 |
dprince | slagle: sorry, I was away for a bit. Is os-net-config still what you think the root of our problem is now? | 17:28 |
slagle | dprince: could be, did you see the paste? | 17:28 |
dprince | slagle: that would have landed last week around the time things went south I think | 17:28 |
slagle | i guess nic5 didnt get mapped to anything, and then the fall back took down all of networking on the node | 17:29 |
dprince | slagle: yes, that is why I linked this patch | 17:29 |
slagle | dprince: the tb is from earlier up, in interface_name | 17:30 |
slagle | not sure it's related to that patch | 17:30 |
dprince | slagle: tb? sorry? | 17:30 |
slagle | traceback :) | 17:30 |
bnemec | Hmm, trunk.rdoproject.org seems to be down. | 17:32 |
slagle | nic5 should have gotten mapped to eth4 | 17:32 |
dprince | slagle: yeah | 17:32 |
slagle | and then eth4 passed into utils.interface_name | 17:32 |
slagle | err, utils.interface_mac | 17:32 |
*** thrash is now known as thrash|biab | 17:32 | |
trown | bnemec: ruh roh, asking in #rdo | 17:33 |
dprince | slagle: this is from the compute node you say? probably doesn't matter which node it is | 17:35 |
slagle | dprince: yes it was a compute node | 17:36 |
*** olap has quit IRC | 17:37 | |
dprince | slagle: is os-net-config perhaps running too early on that first pass? So it isn't settled enough to get the correct nic mapping? | 17:37 |
*** liverpooler has quit IRC | 17:37 | |
*** ccamacho has quit IRC | 17:38 | |
*** shivrao has joined #tripleo | 17:38 | |
slagle | dprince: i don't think so. mainly b/c there was a 5 minute delay on boot trying to start the networking service | 17:39 |
slagle | dprince: for some reason, the network systemd service tried dhcp on eth1, which took 5 minutes to time out. | 17:39 |
*** panda has quit IRC | 17:39 | |
dprince | slagle: okay, is perhaps the MAC address assigned to that nic just plain bad then? | 17:40 |
*** panda has joined #tripleo | 17:40 | |
jrist | ugh | 17:40 |
dprince | slagle: I can't imagine libvirt would allow that | 17:40 |
slagle | dprince: i don't see any dupes for this mac, i had already checked | 17:40 |
slagle | dprince: i have the image saved and mounted | 17:41 |
slagle | if you want to poke at it | 17:41 |
dprince | slagle: yeah, I don't think there would be dups, I'm just wondering if something in the image thinks it is a bad MAC for some reason | 17:41 |
dprince | slagle: but it worked the second pass, so couldn't be | 17:41 |
dprince | dsneddon: are you following this, we have a paste file showing an odd os-net-config failure http://paste.openstack.org/show/490030/ | 17:42 |
dsneddon | dprince, I am following | 17:43 |
dprince | dsneddon: thanks | 17:43 |
*** athomas has quit IRC | 17:45 | |
slagle | dprince: are we sure that the nic mapping is fully populated before add_bridge would be called? | 17:45 |
dprince | slagle: I'd like to try reverting the is_active_nic change | 17:45 |
dprince | slagle: it is the only thing that merged last week and it isn't critical I think, or at least we could re-add it easily later | 17:46 |
dprince | slagle: it could potentially be affecting the mappings because _is_active_nic is called from ordered_active_nics | 17:47 |
slagle | dprince: sure, worth a try | 17:48 |
dprince | slagle: is there a bug to reference for this? | 17:48 |
EmilienM | fyi trunk.rdoproject.org is down | 17:48 |
dprince | EmilienM: yeah, bummer | 17:48 |
*** bnemec changes topic to "TripleO | trunk.rdoproject.org is down. CI will fail until it's back up. | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 17:48 | |
dprince | EmilienM: that will give us time to talk perhaps :) | 17:49 |
slagle | dprince: let me file one | 17:50 |
openstackgerrit | Dan Prince proposed openstack/os-net-config: Revert "launchpad bug 1537330, fix _is_active_nic" https://review.openstack.org/291322 | 17:50 |
openstack | Launchpad bug 1537330 in os-net-config "os_net_config.utils._is_active_nic gives wrong result for linux bond" [Undecided,New] https://launchpad.net/bugs/1537330 | 17:50 |
dprince | slagle: ^^^ | 17:50 |
slagle | ok | 17:51 |
*** jtomasek has joined #tripleo | 17:51 | |
EmilienM | damn we don't need CI outage *now* | 17:51 |
dprince | slagle: oh, I didn't file an actual bug | 17:51 |
slagle | i'll file a new one :) | 17:51 |
dprince | slagle: thanks, you've done the best detective work here so far | 17:51 |
dprince | EmilienM: we need our own mirrors man! | 17:51 |
bnemec | Things never go down when it's convenient. | 17:52 |
EmilienM | dprince: we need to mirror Internet | 17:52 |
EmilienM | do we have enough space? | 17:52 |
bnemec | Disk is cheap. ;-) | 17:52 |
dprince | EmilienM: I would actually suggest we run the mirror outside of our cloud I think | 17:56 |
dprince | EmilienM: like on RAX or something | 17:56 |
EmilienM | I would love seeing packaging mirror hosted by OpenStack Infr | 17:56 |
EmilienM | Infra* | 17:56 |
dprince | EmilienM: hey, I wanted to organize the puppet-tripleo stuff a bit better | 17:56 |
dprince | EmilienM: I really like your initial patches here. https://review.openstack.org/#/c/289459/2 | 17:57 |
EmilienM | dprince: my stuff on glance? | 17:57 |
dprince | EmilienM: one comment I had was to eliminate the "defined" pacemaker bits. I don't like that | 17:57 |
dprince | EmilienM: I'd rather just see us control it directly via Heat | 17:57 |
EmilienM | dprince: and I did not know how to do that | 17:57 |
EmilienM | we need a Hiera level for Pacemaker | 17:57 |
EmilienM | that is loaded when running HA | 17:57 |
dprince | EmilienM: I've done this before | 17:58 |
EmilienM | dprince: can you show me an example? | 17:58 |
EmilienM | so I can pick it and do the same for my work | 17:58 |
dprince | EmilienM: yes, probably easier if I just update your patches | 17:58 |
EmilienM | dprince: go ahead man | 17:58 |
dprince | EmilienM: okay, lets work this out | 17:58 |
dprince | EmilienM: so hey, I noticed micheal chapin filed an LP blueprint for a similar thing too | 17:58 |
dprince | EmilienM: https://blueprints.launchpad.net/tripleo/+spec/refactor-puppet-manifests | 17:59 |
dprince | EmilienM: basically similar to what we talked about in Tokyo, what you are doing now, etc. | 17:59 |
EmilienM | dprince: michchap nice | 17:59 |
dprince | EmilienM: should we organize all this under a spec and his blueprint? | 17:59 |
dprince | EmilienM: and then we make composable services depend on it? | 17:59 |
EmilienM | dprince: that would be an approach, yes | 18:00 |
EmilienM | do we really need a spec? | 18:00 |
dprince | EmilienM: perhaps a slightly slower path but it would make the composable services patches slightly smaller in some places | 18:00 |
EmilienM | AFAIK it's just moving code | 18:00 |
dprince | I'm asking that same question, would it make sense to organize it? Or just do it | 18:01 |
slagle | dprince: https://bugs.launchpad.net/tripleo/+bug/1555749 | 18:01 |
openstack | Launchpad bug 1555749 in tripleo "CI: compute node networking unresponsive after os-net-config run" [Undecided,New] | 18:01 |
shardy | I was wondering the same thing - seems like a candidate for a specless blueprint or a spec-lite bug to me | 18:01 |
shardy | IOW just do it under the existing BP you just mentioned ;) | 18:01 |
dprince | EmilienM: lets just reference Michael's LP blueprint for all this code | 18:01 |
dprince | shardy: ack, I agree | 18:01 |
dprince | I'm gonna say this is officially approved then on LP. Any objections to approving https://blueprints.launchpad.net/tripleo/+spec/refactor-puppet-manifests now? | 18:02 |
openstackgerrit | Dan Prince proposed openstack/os-net-config: Revert "launchpad bug 1537330, fix _is_active_nic" https://review.openstack.org/291322 | 18:03 |
openstack | Launchpad bug 1537330 in os-net-config "os_net_config.utils._is_active_nic gives wrong result for linux bond" [Undecided,New] https://launchpad.net/bugs/1537330 | 18:03 |
bnemec | dprince: Just noticed this: https://review.openstack.org/#/c/291243 | 18:07 |
bnemec | Wonder if it could be related. | 18:07 |
slagle | dprince: ah, i see that is_active_nic directly influences what gets mapped, so yea, that could explain it. it must not have seen nic5/eth4 as active | 18:08 |
dprince | bnemec: yep, it could. This same function was changed last week (March 2nd) and is similar to what I'm suggesting reverting | 18:08 |
bnemec | Yeah, that's how I found it. It's listed in the conflicts for the revert. | 18:09 |
dprince | slagle/bnemec: a blind revert (no CI passes) of the os-net-config change from last week should be safe and get us results faster | 18:09 |
shardy | snecklifter: ^^ FYI | 18:10 |
bnemec | dprince: Agreed. I'm not suggesting we don't do the revert, just pointing out a possible fix. | 18:10 |
snecklifter | shardy, thanks | 18:10 |
*** Marga_ has quit IRC | 18:11 | |
snecklifter | except I'm having this issue on OSP-d so ^^^ not affecting it | 18:11 |
*** Marga_ has joined #tripleo | 18:12 | |
openstackgerrit | Dan Prince proposed openstack/os-net-config: Revert "launchpad bug 1537330, fix _is_active_nic" https://review.openstack.org/291322 | 18:12 |
openstack | Launchpad bug 1537330 in os-net-config "os_net_config.utils._is_active_nic gives wrong result for linux bond" [Undecided,New] https://launchpad.net/bugs/1537330 | 18:12 |
dprince | bnemec: think/fixed ^ | 18:12 |
snecklifter | bnemec, I'm not using bonded in this env, I think it's down to the way linux kernel handles ethernet over usb | 18:13 |
*** trown is now known as trown|lunch | 18:14 | |
snecklifter | or the device reports to kernel or whatever | 18:14 |
bnemec | dprince: Thanks | 18:14 |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Revert "Zerofree the image if possible" https://review.openstack.org/291350 | 18:15 |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Use fstrim to prep the block device https://review.openstack.org/291351 | 18:15 |
*** thrash|biab is now known as thrash | 18:16 | |
*** electrofelix has quit IRC | 18:17 | |
slagle | wow, 3 ha jobs on the same testenv hosts really brings it to a crawl | 18:17 |
slagle | we need ha ci job testenv anti-affinity | 18:17 |
dprince | slagle: I think controllers are by definition IO intensive | 18:17 |
dprince | slagle: and to even think that some would even suggest we only run HA jobs. Imagine what would happen then :) | 18:18 |
*** mgould has quit IRC | 18:18 | |
slagle | indeed :) | 18:18 |
EmilienM | rdo server is back | 18:19 |
bnemec | \o/ | 18:19 |
EmilienM | well, need to be tested | 18:19 |
EmilienM | because some other stuffs are still down | 18:19 |
openstackgerrit | Gonéri Le Bouder proposed openstack/instack-undercloud: add INTERFACE_MTU parameter https://review.openstack.org/288041 | 18:20 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-docs: Docs for containerized compute node https://review.openstack.org/254743 | 18:21 |
*** ohamada_ has quit IRC | 18:21 | |
*** xinwu has joined #tripleo | 18:22 | |
*** sthillma has quit IRC | 18:28 | |
*** jaosorior has joined #tripleo | 18:28 | |
*** openstackgerrit_ has joined #tripleo | 18:30 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add a ceph-storage node upgrade script for the upgrade workflow https://review.openstack.org/289896 | 18:32 |
jaosorior | bnemec: Noticed that the overcloud ssl patch is all green? :D https://review.openstack.org/#/c/281988/ | 18:33 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Upgrades: initialization command/snippet https://review.openstack.org/290465 | 18:34 |
bnemec | jaosorior: I did. I was begging for reviews earlier. :-) | 18:34 |
bnemec | Lotta stuff going on right now though. | 18:34 |
jaosorior | you got my +1... but I guess that doesn't really do much :/ | 18:34 |
jaosorior | bnemec: Yeah, noticed also the change you did for the keystone endpoint in the overcloud | 18:34 |
jaosorior | I somehow thought I was using an old undercloud or something | 18:34 |
bnemec | jaosorior: The admin endpoint? | 18:35 |
jaosorior | yeah | 18:35 |
jaosorior | aah, now I noticed it got merged already | 18:35 |
jaosorior | that was fast | 18:35 |
bnemec | jaosorior: Is it wrong in the overcloud too? | 18:35 |
bnemec | Things can merge fast when CI is running reasonably well. | 18:36 |
jaosorior | bnemec: It isn't (that I know of) | 18:36 |
jaosorior | bnemec: Have you seen this error, by the way: | 18:36 |
jaosorior | Error: Must pass auth_password to Class[Aodh::Auth] at /var/lib/heat-config/heat-config-puppet/fa59863d-38c0-463f-8754-dbb8b43d4156.pp:1048 on node overcloud-controller-1.localdomain | 18:36 |
bnemec | jaosorior: You may have images that were built when Aodh was merged, but that was since reverted. | 18:37 |
*** xinwu has quit IRC | 18:37 | |
bnemec | So the templates will no longer pass the Aodh configuration to the puppet modules on the image. | 18:37 |
jaosorior | crap | 18:37 |
bnemec | Although that's in a heat-config manifest. | 18:37 |
jaosorior | alright, will have to re-build the images then | 18:37 |
bnemec | I would have thought that's coming from t-h-t anyway. :-/ | 18:37 |
jaosorior | gonna give it another try | 18:38 |
bnemec | jaosorior: That's the first thing I would try. | 18:38 |
*** sthillma has joined #tripleo | 18:40 | |
jaosorior | bnemec: I don't really understand this change https://review.openstack.org/#/c/290570/ | 18:41 |
jaosorior | will this no longer be worked on then? https://review.openstack.org/#/c/244162/ | 18:42 |
bnemec | jaosorior: Not at all, that's why there's a revert also pushed for the deprecation message change. | 18:42 |
bnemec | I just don't see https://review.openstack.org/#/c/244162/ merging before we branch Mitaka at this point, so the deprecation message is just wrong. | 18:43 |
jaosorior | oh, I see | 18:43 |
bnemec | The deprecation shouldn't have merged before the functional patch in the first place. | 18:43 |
jaosorior | yeah... that change doesn't look like it's gonna be merged any time soon :/ | 18:43 |
bnemec | Right now we're telling people that a thing is deprecated, without the non-deprecated replacement having merged. | 18:44 |
openstackgerrit | Merged openstack/instack-undercloud: Use service tenant for ceilometer https://review.openstack.org/291142 | 18:44 |
jaosorior | bnemec: Now I see | 18:44 |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Use fstrim to prep the block device https://review.openstack.org/290944 | 18:46 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Upgrade of Cinder block storage nodes https://review.openstack.org/291167 | 18:47 |
EmilienM | could we have a review on https://review.openstack.org/#/c/274492/ please ? | 18:53 |
*** rhallisey has quit IRC | 18:58 | |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Switch to package-installs https://review.openstack.org/291367 | 19:00 |
jaosorior | bnemec: Where can I get info about that package-installs? | 19:02 |
jaosorior | documentation and such | 19:02 |
bnemec | jaosorior: It's a dib element: https://github.com/openstack/diskimage-builder/tree/master/elements/package-installs | 19:03 |
bnemec | Speaking of which, I need to add element-deps to those elements now too. | 19:03 |
*** rohitpagedar__ has joined #tripleo | 19:04 | |
jaosorior | bnemec: I see | 19:05 |
jaosorior | thanks | 19:05 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Switch to package-installs https://review.openstack.org/291367 | 19:05 |
*** tosky has quit IRC | 19:06 | |
*** sthillma has quit IRC | 19:09 | |
*** trown|lunch is now known as trown | 19:12 | |
*** jistr has joined #tripleo | 19:15 | |
*** dmsimard has quit IRC | 19:17 | |
*** akrivoka has quit IRC | 19:19 | |
*** absubram has quit IRC | 19:19 | |
*** dmsimard has joined #tripleo | 19:22 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/os-cloud-config: Updated from global requirements https://review.openstack.org/285049 | 19:28 |
*** jcoufal has quit IRC | 19:29 | |
*** rhallisey has joined #tripleo | 19:30 | |
*** sthillma has joined #tripleo | 19:31 | |
*** jaosorior has quit IRC | 19:31 | |
openstackgerrit | James Slagle proposed openstack/os-net-config: Add some debugging output to ordered_active_nics https://review.openstack.org/291384 | 19:35 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-image-elements: Remove mysql-dev dependency from os-svc-install https://review.openstack.org/291385 | 19:36 |
*** dmsimard has quit IRC | 19:36 | |
bnemec | ^Removes a gross legacy hack that is pulling in unnecessary packages on our images. | 19:36 |
*** dmsimard has joined #tripleo | 19:37 | |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Use fstrim to prep the block device https://review.openstack.org/290944 | 19:39 |
*** rcernin has quit IRC | 19:40 | |
*** ayoung has quit IRC | 19:52 | |
*** nico_auv has quit IRC | 19:54 | |
openstackgerrit | Dan Prince proposed openstack-infra/tripleo-ci: Add common bash functions to help track metrics. https://review.openstack.org/291392 | 19:55 |
openstackgerrit | Dan Prince proposed openstack-infra/tripleo-ci: Metrics tracking for TripleO deployment tasks https://review.openstack.org/291393 | 19:55 |
*** trown has quit IRC | 19:56 | |
*** sshnaidm_ has joined #tripleo | 19:58 | |
*** sshnaidm has quit IRC | 19:59 | |
*** trown has joined #tripleo | 19:59 | |
dprince | EmilienM: metrics https://review.openstack.org/#/c/291393/ | 20:00 |
*** rhallisey has quit IRC | 20:01 | |
*** rhallisey has joined #tripleo | 20:02 | |
pradk | quick question, whats the right place to set the user/tenant roles for a new service? is os-cloud-config the right place? | 20:04 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: initialization command/snippet https://review.openstack.org/291400 | 20:06 |
dprince | pradk: we use puppet for that now | 20:07 |
dprince | pradk: so have a look at the puppet module for the respective service | 20:07 |
pradk | dprince, oh we do? i updated this https://review.openstack.org/#/c/272110/7/os_cloud_config/keystone.py | 20:08 |
dprince | pradk: is this what you are asking about: http://git.openstack.org/cgit/openstack/puppet-heat/tree/manifests/keystone/auth.pp | 20:08 |
pradk | dprince, EmilienM mentioned to me that his puppet keystone manage patch was reverted | 20:08 |
EmilienM | dprince: w00t | 20:08 |
pradk | dprince, so basically i need to define the gnocchi user/tenant and set it ResellerAdmin role | 20:08 |
dprince | pradk: correct, we still use os-cloud-config, but we need to get the keystone patch back in | 20:08 |
dprince | pradk: perhaps just do it in both places for now | 20:09 |
dprince | pradk: not a great solution but for now (today) it is the lay of the land | 20:09 |
dprince | pradk: I'm optimistic we'll get the keystone patch into our overcloud heat templates soon again | 20:09 |
bnemec | slagle: https://review.openstack.org/#/c/291322/ passed CI | 20:09 |
pradk | dprince, hmm so if i updated os-cloud-config that should have worked? or i still need keystone patch for it to work | 20:10 |
pradk | dprince, the above patch i mentioned doesnt seem to set the role for me.. i picked the latest pkg build and updated overcloud image with virt-customize | 20:10 |
EmilienM | bnemec: same for https://review.openstack.org/#/c/290568/, except ceph job... not sure it's supposed to pass | 20:12 |
dprince | pradk: this needs to be deployed in your undercloud | 20:12 |
dprince | pradk: did you (perhaps manually) deploy the latest os-cloud-config in your patch to your undercloud? | 20:13 |
pradk | dprince, yea i updated my undercloud os-cloud-config as well | 20:13 |
dprince | pradk: when using os-cloud-config... the configuration occurs externally from the undercloud node | 20:13 |
pradk | os-cloud-config-999.9.9-99999.noarch is what i have from jenkins rpm-build | 20:14 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Add a ceph-storage node upgrade script for the upgrade workflow https://review.openstack.org/291408 | 20:17 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add Rabbit IPv6 only support https://review.openstack.org/290568 | 20:22 |
*** thrash is now known as thrash|bbl | 20:28 | |
*** sthillma has quit IRC | 20:28 | |
*** bnemec changes topic to "TripleO | stable/liberty blocked by https://bugs.launchpad.net/tripleo/+bug/1555803 | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 20:29 | |
dprince | slagle: did you want to try this? https://review.openstack.org/#/c/291322/ | 20:29 |
*** mbound has quit IRC | 20:30 | |
pradk | dprince, once i update the os-cloud-config on undercloud, it should set up the roles when overcloud install runs automatically? or do i need to update something in between for it to kick in? | 20:34 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Use puppet parameter for Heat notifications https://review.openstack.org/291416 | 20:34 |
bnemec | Need ^ before any stable jobs will pass. | 20:35 |
dprince | pradk: I think it'll just go | 20:35 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Enable notifications on undercloud https://review.openstack.org/289518 | 20:37 |
openstackgerrit | Sam Yaple proposed openstack/diskimage-builder: Revert "Zerofree the image if possible" https://review.openstack.org/291350 | 20:38 |
pradk | dprince, any way to confirm that happened? since the overcloud install failed there is no rc file to poke keystone on overcloud | 20:38 |
bnemec | pradk: It won't re-run keystone init on an already deployed overcloud. | 20:39 |
slagle | dprince: it's hard to reproduce...so i dont need to explicitly try it | 20:39 |
bnemec | You have to either redeploy or hack the client to force it to re-run keystone init. | 20:39 |
slagle | dprince: e.g., i cant confirm it fixes it | 20:39 |
dprince | slagle: no, neither can I. But a revert is safe and we can re-add this later I think | 20:40 |
pradk | bnemec, ah ok.. would it log the keystone init somewhere i can check what services it did for? | 20:40 |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix order-of-operations bug in os-net-config restart_interfaces https://review.openstack.org/291420 | 20:42 |
*** absubram has joined #tripleo | 20:44 | |
bnemec | pradk: I'm not sure if that's logged anywhere. | 20:45 |
*** MaxPC has quit IRC | 20:50 | |
openstackgerrit | Merged openstack/tripleo-common: Install the upgrade-non-controller.sh script with tripleo-common https://review.openstack.org/291101 | 20:50 |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix order-of-operations bug in os-net-config restart_interfaces https://review.openstack.org/291420 | 20:51 |
*** yamahata has quit IRC | 20:52 | |
slagle | bnemec: should we just merge the liberty fix? | 20:53 |
slagle | it's about 20th in the queue | 20:53 |
slagle | no point in waiting everything to fail before it | 20:54 |
bnemec | slagle: Might as well. It can't break things worse than they are. | 20:56 |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix order-of-operations bug in os-net-config restart_interfaces https://review.openstack.org/291420 | 20:56 |
*** weshay has quit IRC | 21:01 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Use puppet parameter for Heat notifications https://review.openstack.org/291416 | 21:04 |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix order-of-operations bug in os-net-config restart_interfaces https://review.openstack.org/291420 | 21:06 |
*** bnemec changes topic to "TripleO | stable/liberty fix for https://bugs.launchpad.net/tripleo/+bug/1555803 merged | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 21:09 | |
pradk | so looking at tripleoclient, the keystone-init runs after the stack is created, but i need the user while the deploy is in progress so gnocchi can auth with swift | 21:09 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Surround MongoDB IPs with braces in the connection string if IPv6 https://review.openstack.org/270154 | 21:09 |
pradk | so where does the user/role get created in tripleo ? | 21:10 |
pradk | is os-cloud-config run as a pre or post stack creation step during overcloud deploy? | 21:13 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Surround MongoDB IPs with braces in the connection string if IPv6 https://review.openstack.org/291429 | 21:14 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/270831 | 21:15 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fix vncproxy_host for IPv6 https://review.openstack.org/287068 | 21:15 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/291435 | 21:18 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Fix vncproxy_host for IPv6 https://review.openstack.org/291439 | 21:18 |
*** Marga_ has quit IRC | 21:20 | |
*** weshay has joined #tripleo | 21:20 | |
*** penick has joined #tripleo | 21:21 | |
pradk | dprince, ^^ could you clarify that for me please? | 21:22 |
slagle | bnemec: can you review https://review.openstack.org/#/c/272089/ | 21:24 |
slagle | bnemec: it passed ceph earlier on PS 12 | 21:25 |
slagle | and it was cruising for a pass before it timed out | 21:25 |
slagle | the images took 90mins to build for whatever reason | 21:25 |
slagle | bnemec: i think we could merge it is what i'm trying to say | 21:25 |
bnemec | Why aren't any services smart enough to handle ipv6 automatically? | 21:25 |
slagle | Hah | 21:26 |
bnemec | It seems like everything has a magic "use ipv6" bit that needs to be flipped. | 21:26 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Change default host reserved memory to 2048MB from 512MB https://review.openstack.org/291446 | 21:29 |
*** mbound has joined #tripleo | 21:30 | |
*** liverpooler has joined #tripleo | 21:32 | |
bnemec | slagle: Done | 21:32 |
* bnemec crosses his fingers that he didn't just break the ceph job | 21:32 | |
*** dprince has quit IRC | 21:32 | |
*** mbound has quit IRC | 21:32 | |
*** mbound has joined #tripleo | 21:33 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Support the deployment of Ceph over IPv6 https://review.openstack.org/272089 | 21:35 |
*** panda has quit IRC | 21:40 | |
*** panda has joined #tripleo | 21:40 | |
*** jistr has quit IRC | 21:40 | |
*** lblanchard has quit IRC | 21:40 | |
*** openstackstatus has quit IRC | 21:42 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: compute: include VIR_MIGRATE_TUNNELLED when doing VM shared storage https://review.openstack.org/286584 | 21:43 |
*** dshulyak has quit IRC | 21:43 | |
*** snecklifter has left #tripleo | 21:43 | |
*** snecklifter has joined #tripleo | 21:43 | |
*** openstackstatus has joined #tripleo | 21:45 | |
*** ChanServ sets mode: +v openstackstatus | 21:45 | |
jdob | pradk: so yeah, I remember this coming up and I think it was in relation to manila | 21:46 |
jdob | but the workaround there was to remove the need for the user that early on | 21:46 |
jdob | trying to remember who i was working with on that | 21:47 |
*** r-mibu has quit IRC | 21:47 | |
pradk | jdob, oh interesting, having users while services come up is a pretty standard case i thought.. so i can't get a gnocchi user until the stack is created? | 21:47 |
*** r-mibu has joined #tripleo | 21:47 | |
jdob | it was a guy named ryan, but he's not online right now (not even sure if he's still working on the project) | 21:48 |
jdob | ok, so, this is me dredging my memories from about 6 months ago | 21:48 |
jdob | but IIRC, we want to move the keystone init out of os-cloud-config | 21:48 |
jdob | which would help alleviate this | 21:48 |
jdob | but from what I remember, that's not the case and so far it hasn't been a blocker | 21:49 |
pradk | yea this is not good :( | 21:49 |
pradk | EmilienM, ^^ | 21:49 |
pradk | jdob, wait how does glance do it? | 21:50 |
jdob | magic? | 21:50 |
pradk | jdob, glance uses swift as default backend too | 21:50 |
pradk | i guess our only work around is perhaps to default to file driver instead | 21:50 |
jdob | holy shit: https://review.openstack.org/#/c/209594/ | 21:51 |
jdob | thats the patch I was thinking of | 21:51 |
jdob | cannot believe I found that | 21:51 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers https://review.openstack.org/290687 | 21:51 |
pradk | looking | 21:51 |
jdob | this might be apples and oranges now that I look at it | 21:51 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Support the deployment of Ceph over IPv6 https://review.openstack.org/291455 | 21:51 |
pradk | i guess this is the endpoint itself being accessible | 21:52 |
jdob | ya, you're right | 21:52 |
jdob | i was a bit off, i just remembered it as a keystone related timing issue | 21:53 |
jdob | as for glance, I don't know off the top of my head, someone else in here should | 21:53 |
pradk | reading the tripleoclient code, keystone-init is called once the stack is created.. in our case the stack fails due to the user not available | 21:54 |
*** Marga_ has joined #tripleo | 21:54 | |
pradk | so are we in a chicken-egg problem then? | 21:54 |
pradk | glance uses swift, but it could be glance db sync doesnt require swift to be up? | 21:55 |
jdob | ya :( | 21:55 |
jdob | without knowing much about it, is there a reason a db sync would need the service running? | 21:56 |
jdob | on the surface it seems like it'd be better to execute prior to starting the service | 21:56 |
jdob | though i think i'm reading your comment wrong; your issue is that gnocchi uses keystone for auth during db sync, right? | 21:56 |
*** jayg is now known as jayg|g0n3 | 21:57 | |
pradk | yea so gnocchi has two backends, one is for metadata and other is for metrics .. metadata uses sqlalchemy, and metrics use carbonara which has swift/ceph/file backends | 21:57 |
pradk | jdob, so db sync upgrades both the indexer and storage .. hence swift should be accessible if swift is the default backend | 22:00 |
jdob | ok, so swift is running before gnocchi, so that's fine, but keystone doesn't know about it yet, so gnocchi can't get to it | 22:00 |
pradk | yep thats exactly the issue | 22:01 |
jdob | ok, so it's not a surprise this os-cloud-config patch didn't work, since that's just dorking with users | 22:01 |
jdob | well, shit. | 22:01 |
jdob | there are hooks in the THT templates that can be used for post deployment configuration | 22:02 |
pradk | yea i assumed that would run while keystone is coming up, guess not | 22:02 |
jdob | i can see why you'd think that | 22:02 |
pradk | shouldnt init-keystone run as part of keystone setup :) | 22:02 |
jdob | its an artifact of the past | 22:02 |
pradk | ah ok | 22:02 |
jdob | i wonder if we can put off the db sync until the post config | 22:03 |
jdob | though I suppose the service won't start without db sync running, huh. | 22:03 |
pradk | yea | 22:03 |
pradk | dbsync is part of the puppet manifest | 22:03 |
jdob | ah, shit, so it's not easy to move | 22:04 |
jdob | could probably pass a flag in to disable it, but this is ultimately a not-good line of thought to follow in the first place | 22:04 |
pradk | if we use file driver we won't have this issue but swift/ceph are recommended for large deployments | 22:05 |
pradk | jdob, and document swift use case perhaps | 22:06 |
pradk | as by then the user exists | 22:06 |
jdob | in the interim, that's not a bad solution | 22:06 |
pradk | i hope we can find a solution by default but worst case we could do this | 22:06 |
jdob | it comes down to timeframe, we can push to address keystone init in newton, but that won't be available until osp 10 | 22:07 |
jdob | (talking inside baseball in an upstream channel, but whatever) | 22:07 |
*** shardy has quit IRC | 22:07 | |
*** dshulyak has joined #tripleo | 22:09 | |
pradk | jdob, yea, i was banging my head trying to figure out why it wasn't picking up when all the config is in place | 22:10 |
jdob | its tricky to wrap your head around a few thousand lines of THT + python + puppet :) | 22:11 |
*** rcernin has joined #tripleo | 22:11 | |
pradk | i'll run this by ceilo team to see if we can get an agreement on using file driver as an interim solution for near term | 22:11 |
pradk | jdob, hehe no kidding | 22:11 |
jdob | ok cool. tomorrow i'll start looking at the aodh patches (mostly just don't want to screw up this gnocchi environment right now) | 22:12 |
*** sthillma has joined #tripleo | 22:13 | |
pradk | jdob, sounds good, thx for your reviews and testing .. aodh patch is in good shape and passing ci | 22:13 |
jdob | oh good, that should go smoother then | 22:13 |
*** jtomasek has quit IRC | 22:18 | |
openstackgerrit | Ben Nemec proposed openstack/tripleo-puppet-elements: Use package-installs for puppet installation https://review.openstack.org/291465 | 22:19 |
*** trown is now known as trown|outtypewww | 22:21 | |
*** rcernin has quit IRC | 22:22 | |
*** Goneri has quit IRC | 22:25 | |
*** dshulyak has quit IRC | 22:26 | |
*** jprovazn has quit IRC | 22:38 | |
*** chlong has quit IRC | 22:39 | |
openstackgerrit | Jeff Peeler proposed openstack/tripleo-docs: Docs for containerized compute node https://review.openstack.org/254743 | 22:40 |
*** trozet has quit IRC | 22:42 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: compute: include VIR_MIGRATE_TUNNELLED when doing VM shared storage https://review.openstack.org/286584 | 22:42 |
*** trozet has joined #tripleo | 22:49 | |
*** trozet_ has joined #tripleo | 22:50 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-common: Use m1.tiny instead of m1.demo for the pingtest VM https://review.openstack.org/289845 | 22:51 |
*** morazi has quit IRC | 22:53 | |
*** trozet has quit IRC | 22:54 | |
*** trown|outtypewww has quit IRC | 22:55 | |
*** trown has joined #tripleo | 22:58 | |
*** derekh has joined #tripleo | 23:09 | |
*** dmsimard is now known as dmsimard|pto | 23:10 | |
derekh | slagle: Are you still using them hosts? gonna kick them off again, this time without destroying the test envs at the end | 23:13 |
derekh | bnemec: you either ? ^ | 23:14 |
*** nico_auv has joined #tripleo | 23:14 | |
bnemec | derekh: I am not | 23:15 |
derekh | bnemec: ack, I doubt slagle is either cause they're not running, ok gonna kick them off so they'll be there in the morning when I get here | 23:17 |
bnemec | derekh: Yeah, it's getting late his time, so hopefully he's logged off by now. :-) | 23:17 |
*** dmsimard|pto has quit IRC | 23:26 | |
*** absubram has quit IRC | 23:31 | |
*** xinwu has joined #tripleo | 23:34 | |
*** Goneri has joined #tripleo | 23:36 | |
*** derekh has quit IRC | 23:39 | |
*** dmsimard|pto has joined #tripleo | 23:42 | |
*** yamahata has joined #tripleo | 23:46 | |
*** mkovacik has quit IRC | 23:47 | |
*** penick has quit IRC | 23:51 | |
*** nico_auv has quit IRC | 23:52 | |
*** penick has joined #tripleo | 23:54 | |
*** penick has quit IRC | 23:57 |