*** bana_k has quit IRC | 00:08 | |
*** yamahata has joined #tripleo | 00:22 | |
*** trozet has quit IRC | 00:25 | |
*** leanderthal|afk has quit IRC | 00:36 | |
*** yamahata has quit IRC | 00:37 | |
*** padkrish has quit IRC | 00:41 | |
*** saneax is now known as saneax-_-|AFK | 00:41 | |
*** padkrish has joined #tripleo | 00:41 | |
*** padkrish has quit IRC | 00:47 | |
*** rain has joined #tripleo | 00:50 | |
*** rain is now known as Guest29776 | 00:51 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Add gateway_ip in OS::Neutron::Subnet https://review.openstack.org/379873 | 00:52 |
---|---|---|
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Fix typo in fixing gnocchi upgrade. https://review.openstack.org/379874 | 00:52 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Use -L with chown and set crush map tunables when upgrading Ceph https://review.openstack.org/379875 | 00:52 |
*** tzumainn has quit IRC | 01:00 | |
*** tiswanso has quit IRC | 01:23 | |
*** tiswanso has joined #tripleo | 01:23 | |
*** fultonj has joined #tripleo | 01:25 | |
*** chem has quit IRC | 01:26 | |
*** padkrish has joined #tripleo | 01:34 | |
*** bswartz has quit IRC | 01:44 | |
*** fultonj has quit IRC | 01:48 | |
*** mbozhenko has joined #tripleo | 01:52 | |
*** coolsvap has joined #tripleo | 01:54 | |
*** mbozhenko has quit IRC | 01:57 | |
*** rhallisey has quit IRC | 02:12 | |
*** bswartz has joined #tripleo | 02:15 | |
openstackgerrit | Merged openstack/tripleo-common: Don't set node state during node registration https://review.openstack.org/379843 | 02:16 |
openstackgerrit | Merged openstack-infra/tripleo-ci: Delete ping test environment in periodic jobs https://review.openstack.org/346134 | 02:21 |
*** padkrish has quit IRC | 02:44 | |
*** mbozhenko has joined #tripleo | 02:51 | |
*** mbozhenko has quit IRC | 02:56 | |
*** yamahata has joined #tripleo | 02:59 | |
*** lblanchard has quit IRC | 03:01 | |
*** jeckersb is now known as jeckersb_gone | 03:04 | |
*** david-lyle has quit IRC | 03:04 | |
*** bswartz has quit IRC | 03:27 | |
openstackgerrit | RedHat RDO CI proposed openstack/tripleo-heat-templates: GATE TEST, please ignore https://review.openstack.org/365449 | 03:45 |
*** tiswanso has quit IRC | 03:51 | |
*** sudipto has joined #tripleo | 03:53 | |
*** sudipto_ has joined #tripleo | 03:53 | |
*** links has joined #tripleo | 03:58 | |
*** padkrish has joined #tripleo | 04:02 | |
*** padkrish has quit IRC | 04:19 | |
*** padkrish has joined #tripleo | 04:21 | |
*** ramishra has quit IRC | 04:26 | |
*** ramishra has joined #tripleo | 04:28 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Generate internal TLS hieradata for apache services https://review.openstack.org/366075 | 04:32 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Enable TLS in the internal network for Mysql https://review.openstack.org/378537 | 04:32 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Add flag for internal TLS https://review.openstack.org/365942 | 04:32 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Enable internal TLS for ceilometer https://review.openstack.org/377648 | 04:32 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Enable internal TLS for aodh https://review.openstack.org/377649 | 04:32 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Enable internal TLS for gnocchi https://review.openstack.org/377650 | 04:32 |
*** cllewellyn_ has joined #tripleo | 04:34 | |
*** mburned_out is now known as mburned | 04:43 | |
*** pgadiya has joined #tripleo | 04:52 | |
*** padkrish has quit IRC | 04:53 | |
*** padkrish has joined #tripleo | 05:12 | |
*** skramaja has joined #tripleo | 05:23 | |
*** cllewellyn_ has quit IRC | 05:31 | |
*** Ryjedo_ has joined #tripleo | 05:34 | |
*** Ryjedo has quit IRC | 05:36 | |
*** Ryjedo_ is now known as Ryjedo | 05:36 | |
*** jaosorior has joined #tripleo | 05:40 | |
*** mbozhenko has joined #tripleo | 05:42 | |
*** xuao has quit IRC | 05:50 | |
*** flepied has quit IRC | 05:52 | |
*** aufi has joined #tripleo | 05:54 | |
*** mbozhenko has quit IRC | 05:56 | |
jaosorior | Can someone review this one? https://review.openstack.org/#/c/368643/ | 06:02 |
*** mburned is now known as mburned_out | 06:04 | |
*** limao has joined #tripleo | 06:06 | |
*** rcernin has joined #tripleo | 06:07 | |
*** mbozhenko has joined #tripleo | 06:08 | |
*** mbozhenko has quit IRC | 06:09 | |
*** mbozhenko has joined #tripleo | 06:13 | |
*** mbozhenko has quit IRC | 06:14 | |
*** mbozhenko has joined #tripleo | 06:14 | |
*** mhenkel has joined #tripleo | 06:21 | |
*** rasca has joined #tripleo | 06:22 | |
*** mhenkel has quit IRC | 06:26 | |
*** mhenkel has joined #tripleo | 06:26 | |
*** jprovazn has joined #tripleo | 06:27 | |
ccamacho | morning! | 06:27 |
d0ugal | Morning! | 06:36 |
*** radeks has joined #tripleo | 06:37 | |
*** pcaruana has joined #tripleo | 06:39 | |
*** dciabrin has joined #tripleo | 06:39 | |
*** jaosorior has quit IRC | 06:40 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo: Remove unused pacemaker profiles https://review.openstack.org/379957 | 06:41 |
bandini | morning | 06:41 |
openstackgerrit | gecong proposed openstack/puppet-tripleo: Fix a typo in haproxy.pp https://review.openstack.org/379958 | 06:41 |
*** limao has quit IRC | 06:43 | |
panda|zZ | bandini: ha-ipv6 in CI does not work even with updated rabbitmq ... | 06:45 |
panda|zZ | bandini: I'll investigate later ... :( | 06:45 |
bandini | panda|zZ: let's look at logs. which job? | 06:45 |
panda|zZ | bandini: http://logs.openstack.org/74/363674/28/experimental-tripleo/gate-tripleo-ci-centos-7-ovb-ha-ipv6/726be86/ | 06:46 |
*** limao has joined #tripleo | 06:47 | |
panda|zZ | doesn't seem to be the same error | 06:47 |
panda|zZ | bandini: going afk, later. | 06:47 |
bandini | panda|zZ: ack. it looks like a timeout. it reached controller step5 which means it was almost done | 06:48 |
*** cylopez has joined #tripleo | 06:49 | |
panda|zZ | bandini: same situation as before. | 06:50 |
panda|zZ | bandini: before the new package | 06:50 |
*** limao has quit IRC | 06:50 | |
*** apetrich has quit IRC | 06:52 | |
*** limao has joined #tripleo | 06:53 | |
*** apetrich has joined #tripleo | 06:53 | |
*** flepied has joined #tripleo | 06:53 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: Replace per role manifests with a common role manifest https://review.openstack.org/378736 | 06:56 |
*** padkrish has quit IRC | 06:56 | |
*** padkrish has joined #tripleo | 06:57 | |
*** limao has quit IRC | 06:57 | |
*** tobias_fiberdata has quit IRC | 06:57 | |
bandini | panda|zZ: where do you see that? I see no rabbit errors on controller1 nor controller2 | 06:59 |
*** jlinkes has joined #tripleo | 07:00 | |
*** tobias_fiberdata has joined #tripleo | 07:02 | |
*** mcornea has joined #tripleo | 07:16 | |
*** abehl has joined #tripleo | 07:17 | |
*** mbozhenk1 has joined #tripleo | 07:21 | |
*** jpena|off is now known as jpena | 07:24 | |
*** mbozhenko has quit IRC | 07:24 | |
*** rwsu has quit IRC | 07:27 | |
*** panda|zZ is now known as panda | 07:27 | |
*** pmannidi is now known as pmannidi|Gone | 07:28 | |
panda | bandini: that's what I was trying to say. Symptoms are the same as before the package: timeout after step5. But I didn't see rabbitmq errors | 07:28 |
*** pmannidi|Gone has quit IRC | 07:28 | |
*** limao has joined #tripleo | 07:28 | |
bandini | panda: ack | 07:29 |
*** akuznetsov has joined #tripleo | 07:30 | |
panda | bandini: you remember how much it takes for your local deployments to finish ? | 07:32 |
ccamacho | bandini :) man quick question, I have changed the name of the services config files (i.e. puppet/cinder-storage.yaml -> puppet/blockstorage.yaml) in https://review.openstack.org/#/c/379452/ just wondering, will this mess with your M/N upgrade? | 07:33 |
*** fzdarsky has joined #tripleo | 07:33 | |
ccamacho | well the change is on master | 07:34 |
bandini | panda: meh no not really. I hope I can run a full ipv6 test sometime today | 07:34 |
bandini | ccamacho: let me take a look | 07:34 |
*** coolias has joined #tripleo | 07:34 | |
*** zoli_gone-proxy is now known as zoliXXL | 07:35 | |
*** mburned_out is now known as mburned | 07:35 | |
bandini | ccamacho: I don't think it will affect upgrades | 07:35 |
ccamacho | good to hear that :) | 07:36 |
*** tremble has joined #tripleo | 07:40 | |
*** mcornea has quit IRC | 07:47 | |
*** ayoung has quit IRC | 07:47 | |
*** mcornea has joined #tripleo | 07:47 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Move the rest of static roles resource registry entries to j2 https://review.openstack.org/379452 | 07:48 |
*** coolias has quit IRC | 07:49 | |
marios | o/ | 07:49 |
*** jpich has joined #tripleo | 07:53 | |
*** ayoung has joined #tripleo | 07:53 | |
*** rwsu has joined #tripleo | 07:54 | |
*** yamahata has quit IRC | 07:56 | |
panda | how do I know which tasks are needed to complete step 5? | 07:58 |
*** ayoung has quit IRC | 08:00 | |
panda | ccamacho: do you know ? ^ | 08:02 |
*** saneax-_-|AFK is now known as saneax | 08:02 | |
ccamacho | panda I have some notes about the config steps, let me find them | 08:03 |
panda | ccamacho: wait maybe I'm reading the logs wrong ... | 08:04 |
panda | 2016-09-30 00:08:57.298367 | 2016-09-30 00:08:43Z [ComputeDeployment_Step5]: CREATE_COMPLETE state changed | 08:04 |
panda | 2016-09-30 00:29:43.180688 | 2016-09-30 00:29:34Z [1]: SIGNAL_IN_PROGRESS Signal: deployment e91e5186-ad4a-4e18-b8a6-add5d31e2e6b succeeded | 08:04 |
panda | 2016-09-30 00:29:43.181573 | 2016-09-30 00:29:35Z [1]: CREATE_COMPLETE state changed | 08:05 |
panda | 2016-09-30 00:33:59.635720 | 2016-09-30 00:33:46Z [2]: SIGNAL_IN_PROGRESS Signal: deployment 8bca0c41-27af-44da-9508-613fe62e6b93 succeeded | 08:05 |
panda | 2016-09-30 00:33:59.635808 | 2016-09-30 00:33:48Z [2]: CREATE_COMPLETE state changed | 08:05 |
panda | 2016-09-30 00:44:38.187903 | 2016-09-30 00:44:36Z [AllNodesDeploySteps]: CREATE_FAILED CREATE aborted | 08:05 |
panda | does this mean that step5 completed and there's something else after that timed out ? | 08:05 |
panda | and those signals ? I see that 1 and 2 completed, does this mean that controller0 did not complete ? | 08:06 |
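
The event lines panda pasted can be summarized per resource to see which ones reported a final state. The following is only a minimal sketch: the regular expression is an assumption based on the six pasted lines, and the function name is hypothetical rather than part of any Heat or tripleo-ci tool.

```python
import re

# Matches the event part of a line such as:
#   2016-09-30 00:08:43Z [ComputeDeployment_Step5]: CREATE_COMPLETE state changed
# (format assumed from the lines pasted in the log above)
EVENT_RE = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})Z "
    r"\[(?P<resource>[^\]]+)\]: (?P<status>[A-Z_]+)"
)


def last_status_by_resource(lines):
    """Return {resource: (timestamp, status)} using the last event seen for each."""
    latest = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            latest[match["resource"]] = (match["time"], match["status"])
    return latest


# The six lines quoted in the channel, verbatim.
SAMPLE = [
    "2016-09-30 00:08:57.298367 | 2016-09-30 00:08:43Z [ComputeDeployment_Step5]: CREATE_COMPLETE state changed",
    "2016-09-30 00:29:43.180688 | 2016-09-30 00:29:34Z [1]: SIGNAL_IN_PROGRESS Signal: deployment e91e5186-ad4a-4e18-b8a6-add5d31e2e6b succeeded",
    "2016-09-30 00:29:43.181573 | 2016-09-30 00:29:35Z [1]: CREATE_COMPLETE state changed",
    "2016-09-30 00:33:59.635720 | 2016-09-30 00:33:46Z [2]: SIGNAL_IN_PROGRESS Signal: deployment 8bca0c41-27af-44da-9508-613fe62e6b93 succeeded",
    "2016-09-30 00:33:59.635808 | 2016-09-30 00:33:48Z [2]: CREATE_COMPLETE state changed",
    "2016-09-30 00:44:38.187903 | 2016-09-30 00:44:36Z [AllNodesDeploySteps]: CREATE_FAILED CREATE aborted",
]

if __name__ == "__main__":
    for resource, (time, status) in last_status_by_resource(SAMPLE).items():
        print(f"{time}  {resource:<26} {status}")
```

In this excerpt no event for index `0` appears, which fits panda's question, although the excerpt alone does not show which resource group `[1]` and `[2]` belong to.
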
ccamacho | panda, not sure as from what I remember, between nodes the deployments are not synchronized, so those signals can come from different steps on different nodes | 08:09 |
ccamacho | but | 08:09 |
ccamacho | panda run heat stack-list --show-nested -f "status=FAILED" then get the ID | 08:10 |
ccamacho | and run heat deployment-show <ID> | 08:10 |
ccamacho | and you will see where the deployment failed | 08:11 |
ccamacho | or | 08:11 |
*** dtantsur has joined #tripleo | 08:11 | |
ccamacho | heat stack-list --show-nested | 08:11 |
ccamacho | you will get them all | 08:12 |
panda | ccamacho: the deployment did not fail, it timed out, there is no error in heat stack-list, that's why I'm trying to understand what task is not completing in time | 08:12 |
panda | what tasks remain to be done between step5 and the end of deployment | 08:13 |
ccamacho | panda heat stack-list --show-nested will show all the deployments and the stack | 08:14 |
ccamacho | not showing any error there? | 08:14 |
panda | ccamacho: I don't see any error, but it's a CI log I'm looking at http://logs.openstack.org/74/363674/28/experimental-tripleo/gate-tripleo-ci-centos-7-ovb-ha-ipv6/726be86/logs/postci.txt.gz I don't have the live system | 08:15 |
*** shardy has joined #tripleo | 08:16 | |
*** coolias has joined #tripleo | 08:17 | |
ccamacho | panda /me reading logs | 08:17 |
panda | ccamacho: thanks. | 08:18 |
*** masco has joined #tripleo | 08:18 | |
*** derekh has joined #tripleo | 08:20 | |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates: Fixed NoneType issue when logging-environment.yaml is used https://review.openstack.org/380003 | 08:22 |
matbu | marios: hey back man ? | 08:22 |
shardy | matbu: Hey, good morning | 08:22 |
matbu | shardy: hello | 08:23 |
shardy | thanks for pushing https://review.openstack.org/#/c/379547 | 08:23 |
shardy | looks like a good start, but I added some comments | 08:23 |
shardy | do you want to continue with that, or should I make the changes and push to the review? | 08:23 |
matbu | shardy: ah thanks, yes i was 100% sure of the fix | 08:23 |
marios | matbu: o/ hey yeah | 08:23 |
matbu | shardy: you can update the review if you want ? | 08:23 |
marios | matbu: i got delayed in vienna so didn't get to bed till 5 though :) so struggling a bit atm | 08:23 |
matbu | shardy: or make another one (/me looks at your comments) | 08:24 |
shardy | matbu: ack, will do - thanks for starting it! | 08:24 |
openstackgerrit | Christian Schwede proposed openstack/tripleo-heat-templates: Add system-uuid based hostname entries https://review.openstack.org/358643 | 08:24 |
marios | matbu: catching up still, slowly :) | 08:24 |
matbu | marios: hehe | 08:25 |
matbu | marios: we got good progress this week | 08:25 |
mcornea | hi everyone! how can I debug this kind of error: u'message': u"Failed to run action [action_ex_id=b9d88cae-f32f-4460-8c13-a7d00a3bf1ec, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'container': u'cloudy', u'timeout': 240}']\n ERROR: Failed to validate: The template is not a JSON object or YAML mapping.", | 08:28 |
*** ayoung has joined #tripleo | 08:28 | |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates: Fixed NoneType issue when monitoring-environment.yaml https://review.openstack.org/380007 | 08:29 |
shardy | mcornea: Hi, it sounds like either some malformed template was included via an environment file, or possibly the j2 rendering didn't work and wrote a broken template | 08:29 |
shardy | mcornea: we should say what file, so that's a bug in the error path | 08:29 |
marios | matbu: fantastic... going to try get some reviews done at some point too (I think most things upgrades related that were in play last week are mostly ready now, and i guess there may be some new things too)... we can catchup some more on scrum later | 08:29 |
shardy | mcornea: I'd probably do mkdir overcloud_plan_tmp && pushd overcloud_plan_tmp && swift download overcloud | 08:30 |
shardy | mcornea: then there's a script which does yaml validation in the tree | 08:30 |
matbu | marios: yep sure, i think there are still 2 or 3 reviews, but the basic deployment case (HA + 1 compute) works completely on upstream | 08:30 |
shardy | https://github.com/openstack/tripleo-heat-templates/blob/master/tools/yaml-validate.py | 08:30 |
shardy | mcornea: if you run that over the whole tree, chances are you'll see the broken file | 08:31 |
ccamacho | panda sound like connection issues in the controllers https://paste.fedoraproject.org/438712/47522419/ check http://logs.openstack.org/74/363674/28/experimental-tripleo/gate-tripleo-ci-centos-7-ovb-ha-ipv6/726be86/logs/overcloud-controller-2/var/log/messages | 08:31 |
shardy | ./tools/yaml-validate.py . | 08:31 |
shardy | will walk the entire tree | 08:31 |
shardy | (you may have to copy it into the temporary dir above) | 08:31 |
shardy | s/overcloud/cloudy | 08:33 |
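
The error mcornea hit ("The template is not a JSON object or YAML mapping") suggests checking that every downloaded template parses to a mapping at the top level. The sketch below illustrates that single check; it is not a reproduction of `tools/yaml-validate.py`, the function name is hypothetical, and it assumes the third-party PyYAML package is installed unless another loader is passed in.

```python
import os


def find_non_mapping_templates(root, load=None):
    """Walk `root` and return [(path, problem)] for suspicious .yaml/.yml files.

    `load` converts file text into Python data. It defaults to PyYAML's
    safe_load (third-party, assumed to be installed).
    """
    if load is None:
        import yaml  # PyYAML; imported lazily so a custom loader needs no install
        load = yaml.safe_load

    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith((".yaml", ".yml")):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    data = load(handle.read())
            except Exception as exc:  # the exception type depends on the loader
                problems.append((path, f"parse error: {exc}"))
                continue
            # A template has to be a mapping at the top level.
            if not isinstance(data, dict):
                problems.append(
                    (path, f"top level is {type(data).__name__}, not a mapping")
                )
    return problems
```

Calling `find_non_mapping_templates("overcloud_plan_tmp")` on the directory from shardy's `swift download` step would list any file that is empty, unparsable, or a list or scalar at the top level. A file that is missing altogether would not be reported, since only existing files are visited.
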
*** morazi has joined #tripleo | 08:34 | |
*** hjensas has joined #tripleo | 08:35 | |
*** akuznetsov has quit IRC | 08:36 | |
panda | ccamacho: it's a bit hard to read .. so it is os-collect-config that is failing to connect ? | 08:41 |
panda | ccamacho: I see it in at least 3 different places at different times too | 08:43 |
ccamacho | I think it can be related to collect config, yeahp as you said the error is in different places | 08:44 |
mcornea | shardy: thanks, that's very useful. trying it now | 08:50 |
derekh | So we just noticed that on our undercloud in rh1 the ceilometer db is 28G... 21G of it is the sample DB, any ideas how to deal with it? is there a delete command or something | 08:51 |
*** zoliXXL is now known as zoli|afk | 08:51 | |
*** masco has quit IRC | 08:52 | |
mcornea | shardy: odd, I get Validation successful! | 08:53 |
shardy | mcornea: weird - I'd probably look at the mistral logs on the undercloud next, to see if there was a preceding error with any clues | 08:54 |
shardy | mcornea: if you've got steps to reproduce I can take a look | 08:54 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: WIP j2 template role config templates https://review.openstack.org/378737 | 08:54 |
d0ugal | mcornea, shardy - I'm around to help if I can too | 08:54 |
derekh | it only contains 78,651,678 records in the ceilometer.sample table.... | 08:55 |
shardy | derekh: do we even need ceilometer running? | 08:55 |
shardy | derekh: I disabled it by default a while back, so I'd be tempted to stop the service and drop the entire db? | 08:56 |
*** padkrish has quit IRC | 08:56 | |
mcornea | shardy: d0ugal in the mistral log I've got http://paste.openstack.org/show/583589/ ; trying to see if I can spot anything in one of those files | 08:57 |
derekh | shardy: probably not, but we stopped running it on the overcloud a while back, then messages started building up in rabbit instead, until messaging ground to a practical standstill | 08:57 |
*** paramite has joined #tripleo | 08:58 | |
shardy | derekh: Hmm, that sounds like a configuration issue, e.g we need to also disable the collectors? | 08:58 |
shardy | my patch just turned off the services in instack-undercloud via the enable_telemetry option | 08:58 |
derekh | shardy: although there is no ceilometer db on the overcloud /me doesn't know enough about ceilometer to know why | 08:59 |
d0ugal | mcornea: that sounds like a good place to start. Are you deploying custom templates or is it something standard I can try? | 09:00 |
shardy | derekh: Hmm, weird, I guess this is a question for pradk, but I know he's likely to be travelling today | 09:00 |
shardy | or at least recovering from travelling | 09:00 |
mcornea | d0ugal: it's standard templates with a custom role; I can share my environment if you want | 09:01 |
*** mbozhenk1 has quit IRC | 09:02 | |
derekh | shardy: ok, it can probably wait until monday, our undercloud isn't currently working because of a full filesystem, doesn't matter much as we don't need to control the overcloud at the moment | 09:03 |
derekh | zoli|afk: ^^ | 09:03 |
*** chem has joined #tripleo | 09:05 | |
shardy | bandini: Hey I made a comment on https://bugs.launchpad.net/tripleo/+bug/1629187 | 09:06 |
openstack | Launchpad bug 1629187 in tripleo "{{role.name}}Count parameters are not enough" [Undecided,Triaged] | 09:06 |
shardy | bandini: I think the data you need already exists in hiera, we just need to stop deriving anything from the *Count parameters | 09:06 |
shardy | and instead look at the list lengths in the puppet-tripleo profiles | 09:06 |
bandini | shardy: ah that's excellent, thanks! | 09:09 |
shardy | mcornea: Hmm, that's definitely not a good error output, but you might look at the heat-engine log to see if there's any clues from the validation error | 09:09 |
shardy | obviously we should surface the cause of the validation failure much better, so that's a bug we need to address | 09:09 |
*** tosky has joined #tripleo | 09:11 | |
shardy | mcornea: can you share how you're defining the custom role(s)? | 09:11 |
mcornea | shardy: yep, http://paste.openstack.org/show/583591/ | 09:12 |
shardy | mcornea: and the environment files for the ServiceApi role? | 09:13 |
mcornea | shardy: http://paste.openstack.org/show/583592/ | 09:13 |
shardy | mcornea: FYI ccamacho and I are working to improve the interface so all you'll need to do is modify roles_data.yaml | 09:13 |
shardy | some of the patches for that have landed, but some are still WIP | 09:14 |
mcornea | shardy: yep, I noticed those changes and I wanted to try them with the latest master but then I hit this | 09:14 |
shardy | I hope those can be backported and will resolve the usability issues around the current interfaces | 09:14 |
*** radeks has quit IRC | 09:14 | |
shardy | mcornea: Ok, thanks - I'll see if I can reproduce | 09:15 |
*** tzumainn has joined #tripleo | 09:16 | |
derekh | shardy: found the ceilometer data on the overcloud, 80G in mongo, on the undercloud 20+G in mysql | 09:16 |
shardy | derekh: wow | 09:18 |
*** limao has quit IRC | 09:19 | |
derekh | shardy: that's 80G in /var/lib/mongo, it takes up less ram but still a lot | 09:19 |
derekh | 3426 mongodb 20 0 0.153t 0.017t 0.017t S 0.3 13.9 234:10.76 mongod | 09:19 |
derekh | 17G | 09:19 |
shardy | derekh: Sounds like we need to decide if we really need it, and if not disable it everywhere and clean up? | 09:20 |
derekh | suddenly the heat ram issues don't seem so bad | 09:20 |
shardy | hehe | 09:20 |
shardy | derekh: FWIW it's been confirmed the heat ram thing is a bug, we just don't know what yet | 09:21 |
*** limao has joined #tripleo | 09:21 | |
shardy | when I get the last few release blocker bugs done I'm planning to bisect in an attempt to find it | 09:21 |
*** sudipto_ has quit IRC | 09:21 | |
*** sudipto has quit IRC | 09:21 | |
derekh | shardy: yup, anyways we can pick it up on monday once pradk is in, things are still tacking along | 09:21 |
shardy | we're using 4 times as much ram as in my last tests back in June | 09:21 |
derekh | shardy: ack | 09:21 |
shardy | (for heat-engine) | 09:21 |
derekh | ouch | 09:21 |
ccamacho | shardy quick question, Im playing now with https://review.openstack.org/#/c/378750/ but never did local changes to tripleo_common, so the question is, how to use the review instead of the packaged version when deploying the overcloud? Is there any env var to be configured? | 09:24 |
shardy | ccamacho: sure, there's two ways to do it, either build the local version with tripleo.sh --delorean-build openstack/tripleo-common, then either manually install it via yum localinstall, or configure a local yum repository with a greater priority (e.g 1) than the rest | 09:25 |
openstackgerrit | Dougal Matthews proposed openstack/tripleo-common: Use kwargs to pass in data and error to Mistral Result https://review.openstack.org/375348 | 09:26 |
shardy | ccamacho: Or, given that it's just one file, you can clone the tree then copy that one file in over the packaged one and hack on it in-place | 09:26 |
shardy | ccamacho: in both cases you may need to restart the services and/or reload the actions | 09:26 |
b00tcat | I accidentally deleted my ironic-python-agent.initramfs file on the undercloud and now I can't upload new images anymore - are these built anywhere? | 09:27 |
ccamacho | shardy, thanks! I was replacing the file but wanted to see if there was a better way. ack | 09:27 |
b00tcat | I mean can I get the images from the internet somehow? | 09:27 |
*** radeks has joined #tripleo | 09:27 | |
shardy | ccamacho: http://paste.fedoraproject.org/438764/14752276/ | 09:27 |
shardy | that's what I use to reload actions and bounce services | 09:27 |
shardy | the workflow stuff is commented because I was just modifying the action | 09:28 |
ccamacho | shardy yeah!! thanks :) | 09:28 |
shardy | ccamacho: there probably are better ways, but that's how I do it :) | 09:28 |
*** athomas has joined #tripleo | 09:35 | |
shardy | b00tcat: At one point we stored some artifacts from CI for tripleo-quickstart to use, but I'm not sure if that is still the case | 09:35 |
mcornea | shardy: btw, I found yesterday some bugs when trying to create nodes for the service apis, not sure if you've seen them: https://bugs.launchpad.net/tripleo/+bugs?field.tag=composable-roles | 09:35 |
shardy | b00tcat: can you just rebuild the image, e.g rm -fr ironic-python-agent.* && openstack overcloud image build --type agent-ramdisk ? | 09:36 |
shardy | b00tcat: also, is the image still in glance? | 09:36 |
shardy | mcornea: ack, thanks - I was travelling for $meetings this week so not seen the latest bugs, but I'll try to take a look today | 09:37 |
mcornea | shardy: thanks | 09:38 |
shardy | I'm hoping we can get the remaining composability issues worked out in the next week | 09:38 |
mcornea | ack | 09:38 |
*** ramishra has quit IRC | 09:38 | |
*** ramishra has joined #tripleo | 09:39 | |
b00tcat | shardy: no, I can't find the image on Glance, so yeah I guess I'll have to rebuild :-) | 09:42 |
b00tcat | thanks | 09:42 |
*** mbozhenko has joined #tripleo | 09:42 | |
*** jlinkes_ has joined #tripleo | 09:42 | |
b00tcat | out of curiosity, if I had it in Glance, could I download it? | 09:42 |
b00tcat | I know this is not the channel for these questions but while we're at it ^^" | 09:42 |
shardy | b00tcat: yup, see glance help image-download | 09:43 |
shardy | (or the openstackclient equivalent) | 09:43 |
*** tzumainn has quit IRC | 09:43 | |
*** jlinkes has quit IRC | 09:45 | |
b00tcat | shardy: ty! | 09:46 |
*** mbozhenko has quit IRC | 09:47 | |
*** mbozhenko has joined #tripleo | 09:47 | |
*** zoli|afk is now known as zoli | 09:48 | |
*** zoli is now known as zoliXXL | 09:49 | |
*** yolanda has quit IRC | 09:54 | |
*** chem has quit IRC | 09:55 | |
*** chem has joined #tripleo | 09:55 | |
*** yolanda has joined #tripleo | 09:55 | |
*** electrofelix has joined #tripleo | 09:55 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: j2 template role config templates https://review.openstack.org/378737 | 09:58 |
*** mbozhenko has quit IRC | 09:58 | |
*** mbozhenk1 has joined #tripleo | 09:58 | |
*** abehl has quit IRC | 10:06 | |
*** limao has quit IRC | 10:11 | |
mcornea | shardy: d0ugal heat-engine logs shows this: http://paste.openstack.org/show/583596/ | 10:12 |
shardy | mcornea: does heat template-validate work for your ServiceApi template? | 10:13 |
shardy | it looks like it's failing to validate the ResourceGroup containing the ServiceApi nodes | 10:13 |
shardy | mcornea: if you can paste the template I'll see if I can spot anything | 10:14 |
mcornea | shardy: hm, I see i've got no puppet/serviceapi.yaml | 10:15 |
shardy | aha | 10:15 |
zoliXXL | derekh, shardy - can I compress some older files in /var/log/journal on undercloud host? There are many files from Sep 15 and 16 | 10:16 |
zoliXXL | that might gain some space in / until we do something with the MariaDB | 10:17 |
mcornea | shardy: but this is how the ServiceApi resourcegroup looks in overcloud.yaml: http://paste.openstack.org/show/583599/ | 10:17 |
*** jprovazn has quit IRC | 10:19 | |
shardy | mcornea: ack, I think that looks OK but we need to determine if the problem is with that block or with the template referenced by OS::TripleO::ServiceApi in your environment file | 10:20 |
derekh | zoliXXL: you can probably just delete the ones that are that old, I doubt anybody is going to look at them | 10:20 |
zoliXXL | OK, will do | 10:20 |
*** abehl has joined #tripleo | 10:20 | |
mcornea | shardy: I commented them out from my env file and get the same error | 10:22 |
shardy | mcornea: Ok, trying to reproduce now | 10:22 |
*** coolias has quit IRC | 10:23 | |
*** coolias has joined #tripleo | 10:24 | |
zoliXXL | derek, deleted some files, will finish it after lunch | 10:24 |
zoliXXL | at least some space has been freed so far | 10:24 |
*** zoliXXL is now known as zoli|lunch | 10:26 | |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart: Revert "Return to using ping test in minimal jobs" https://review.openstack.org/379545 | 10:27 |
panda | in a live system, how do I understand what heat is trying to do in this moment ? ps faux ? | 10:33 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation https://review.openstack.org/378750 | 10:33 |
*** Guest29776 is now known as leanderthal | 10:33 | |
*** leanderthal is now known as leanderthal|afk | 10:33 | |
*** gfidente|afk is now known as gfidente | 10:37 | |
panda | also, the controller has load 18 with one vcpu .. gnocchi-metricd is eating most of the cycles | 10:38 |
shardy | panda: I had that problem recently, the problem was my image was old and missing some fixes needed to make gnocchi services start | 10:41 |
shardy | they spin spewing errors into the logs, eating all the CPU | 10:41 |
shardy | https://bugs.launchpad.net/tripleo/+bug/1626473 | 10:42 |
openstack | Launchpad bug 1626473 in tripleo "gnocchi is eating all my CPU :(" [High,Fix released] - Assigned to Carlos Camacho (ccamacho) | 10:43 |
ccamacho | shardy panda released here https://review.openstack.org/#/c/375968/ also mwhahaha sent an email ([openstack-dev] [puppet][tripleo][fuel] Upcoming changes to defaults around using processor count for worker configurations) encouraging folks to add some safer values for all workers out there | 10:46 |
shardy | ccamacho: Yeah, there was also the broken cotyledon issue though which results in the services using a lot of CPU doing nothing | 10:47 |
*** jlinkes__ has joined #tripleo | 10:48 | |
openstackgerrit | Julie Pichon proposed openstack/puppet-tripleo: Clean out UI httpd configuration file https://review.openstack.org/380152 | 10:49 |
panda | ccamacho: shardy: thanks, this is interesting. I'm still trying to understand why ha-ipv6 times out on CI, locally deployment completes, I have the gnocchi problem, but I think I used a recent image. This could explain why CI is timing out, CPU load is high and deployment cannot finish in time, but I can't find anything wrong in gnocchi logs for CI. I'm going to increase the timeout for ha-ipv6 in my patch | 10:51 |
panda | and see what happens. | 10:51 |
shardy | mcornea: I just tried to reproduce and it works for me, here's how I tested http://paste.openstack.org/show/583601/ | 10:51 |
*** jlinkes_ has quit IRC | 10:52 | |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci: Add IPv6 network configuration for ipv6 job types https://review.openstack.org/363674 | 10:52 |
shardy | mcornea: the deployment does fail on step 3 applying puppet, due to one of the bugs you already reported, but the validation problems you're seeing aren't evident | 10:53 |
shardy | mcornea: can you confirm you're putting the serviceapi.yaml file into the directory referenced via --templates? | 10:53 |
shardy | and that it validates OK? | 10:53 |
shardy | http://paste.openstack.org/show/583602/ | 10:55 |
*** coolias has quit IRC | 10:55 | |
shardy | that's how I validate the serviceapi.yaml directly via heatclient | 10:55 |
panda | locally I'm using an image updated to 23 september | 10:56 |
shardy | panda: what version of cotyledon is on the overcloud nodes? | 10:56 |
panda | uh-oh 2016-09-30 10:46:48.183 10651 ERROR cotyledon ValueError: invalid literal for int() with base 10: 'fd00' | 10:56 |
mcornea | shardy: hm, no, I didn't create the puppet/serviceapi.yaml. I was under the impression that it would get automatically created with the latest patches | 10:56 |
shardy | mcornea: aha - no, we're not quite there yet | 10:56 |
shardy | mcornea: that's what ccamacho is working on | 10:57 |
ccamacho | that should be the next step :) | 10:57 |
shardy | Its good feedback regardless, as it's exposed a bad error path/message | 10:58 |
panda | shardy: python-cotyledon is at 1.2.7 | 10:58 |
mcornea | shardy ccamacho ok, that makes perfect sense. I'll add it manually for now. | 10:58 |
ccamacho | mcornea you can review in the mean time :) https://review.openstack.org/#/c/378736/ | 10:58 |
panda | shardy: but that error seems promising in a ipv6 env ... I don't understand why I don't see it on CI though | 10:58 |
tobias_fiberdata | how's openstack newton when it comes to IPv6? I can see you are working on stuff like that | 10:59 |
*** shardy is now known as shardy_lunch | 10:59 | |
tobias_fiberdata | no1 knows perhaps as there's no1 using newton in production (Who would be so stupid). | 10:59 |
tobias_fiberdata | yet | 10:59 |
*** abehl has quit IRC | 10:59 | |
panda | tobias_fiberdata: essentially the problem is not with the protocol itself, it's the software that makes a lot of assumptions about what an address should look like. | 11:00 |
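A rough illustration of panda's point, assuming a hypothetical parser (`naive_host_port` and `careful_host_port` are made-up names, not code from cotyledon, gnocchi or any OpenStack project): code that expects exactly one colon between host and port works for IPv4 and breaks as soon as the host is an unbracketed IPv6 literal. The address used in the note below is only an example.

```python
def naive_host_port(netloc):
    """Hypothetical parser: assumes the first colon separates host and port."""
    parts = netloc.split(":")
    # For an unbracketed IPv6 literal, parts[1] is an address group, not a port.
    return parts[0], int(parts[1])


def careful_host_port(netloc):
    """Split on the last colon and accept a bracketed IPv6 host."""
    host, _, port = netloc.rpartition(":")
    if host.startswith("[") and host.endswith("]"):
        host = host[1:-1]  # strip the URI brackets around the IPv6 literal
    return host, int(port)
```

With an input such as `fd00:fd00:fd00:2000::18:6379`, the naive version ends up calling `int('fd00')` and raises `ValueError`, while `careful_host_port("[fd00:fd00:fd00:2000::18]:6379")` returns the address and port separately.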
tobias_fiberdata | panda, yea ofc, IPv6 is great. what i mean is using newton in production is stupid as it's not officially released. | 11:01 |
tobias_fiberdata | i'm just wondering how mature openstack is with IPv6 | 11:01 |
tobias_fiberdata | in general | 11:01 |
panda | tobias_fiberdata: my answer is still valid for this question | 11:02 |
tobias_fiberdata | panda, alright :) | 11:02 |
mcornea | ccamacho: after cherry picking the patch it fails with: Could not fetch contents for file:///home/stack/templates/tripleo-heat-templates/puppet/controller-config-pacemaker.yaml | 11:06 |
mcornea | ccamacho: do I need something more? | 11:07 |
*** abehl has joined #tripleo | 11:12 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates: Ceilometer Wsgi Mitaka->Newton upgrades https://review.openstack.org/360004 | 11:12 |
*** sudipto_ has joined #tripleo | 11:15 | |
*** sudipto has joined #tripleo | 11:15 | |
panda | shardy_lunch: ccamacho : AH! in CI master is controller-1, I see the same error that I see locally ... So the timeout may still be caused by gnocchi metricd eating all the CPU because ipv6 is not correctly handled .. | 11:15 |
*** tdasilva_ has quit IRC | 11:18 | |
*** rhallisey has joined #tripleo | 11:19 | |
*** zaneb has quit IRC | 11:22 | |
*** lucas-afk is now known as lucasagomes | 11:31 | |
*** dprince has joined #tripleo | 11:40 | |
*** ccamacho has quit IRC | 11:40 | |
*** pkovar has joined #tripleo | 11:40 | |
*** numans has joined #tripleo | 11:41 | |
*** shardy_lunch is now known as shardy | 11:42 | |
shardy | mcornea: Thanks for the feedback - I think we need to remove the definition of OS::TripleO::ControllerConfig from puppet-pacemaker.yaml in that patch | 11:45 |
*** jpena is now known as jpena|lunch | 11:46 | |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: Replace per role manifests with a common role manifest https://review.openstack.org/378736 | 11:47 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: j2 template role config templates https://review.openstack.org/378737 | 11:47 |
jpich | honza: jtomasek: Did you figure out why we upload things to npm? If it's not particularly necessary, we might as well not, till we sort out automation? | 11:49 |
jpich | honza: jtomasek: I looked at a random other JS-based openstack project (ironic-ui) and they don't seem to be there at all, so maybe it's ok not to | 11:50 |
*** akrzos has quit IRC | 11:50 | |
*** ccamacho has joined #tripleo | 11:51 | |
*** jprovazn has joined #tripleo | 11:51 | |
ccamacho | hey mcornea, let me read back, my laptop crashed | 11:52 |
shardy | ccamacho: I updated https://review.openstack.org/#/c/378736 with what I think is the fix | 11:55 |
*** akrzos has joined #tripleo | 11:56 | |
bandini | marios: am running an upgrade cycle test with your latest ceilometer change. will report back | 11:57 |
*** tobias-fiberdata has joined #tripleo | 11:58 | |
*** aufi has quit IRC | 11:58 | |
ccamacho | shardy yeahp, but based on this change I have a question re OSP | 11:59 |
* ccamacho upgraded and all crashed.. | 11:59 | |
ccamacho | **in my laptop | 12:00 |
*** zoli|lunch is now known as zoli | 12:00 | |
*** zoli is now known as zoliXXL | 12:00 | |
*** tobias_fiberdata has quit IRC | 12:02 | |
*** ipsecguy has quit IRC | 12:02 | |
marios | bandini: ack thanks i just updated with the existing comments cos prad is likely still on a plane somewhere | 12:02 |
*** ipsecguy has joined #tripleo | 12:03 | |
*** akrzos has quit IRC | 12:03 | |
*** abehl has quit IRC | 12:04 | |
*** zoliXXL is now known as zoli|brb | 12:05 | |
shadower | jtomasek: what is the capabilities-map.yaml for? | 12:06 |
*** abehl has joined #tripleo | 12:06 | |
jtomasek | jpich: we do? lets discuss it when honza is online | 12:06 |
shardy | shadower: it adds extra data about the environment files for the UI | 12:06 |
jtomasek | shadower: it holds the metadata about environments available in tht repo | 12:07 |
shadower | shardy, jtomasek: ah. That's cool. Thanks | 12:07 |
jtomasek | shadower: we intend to replace it with proper environment metadata defined directly on the environment, but heat currently does not support it | 12:07 |
shadower | jtomasek: I was about to ask if there are plans for something like that :-) | 12:08 |
jpich | jtomasek: Apparently so, but due to versions number conflicts it's causing issues on creating releases (like EmilienM discovered when creating rc2 yesterday) | 12:08 |
*** jayg|g0n3 is now known as jayg | 12:10 | |
*** akrzos has joined #tripleo | 12:10 | |
*** tiswanso has joined #tripleo | 12:10 | |
*** tiswanso has quit IRC | 12:10 | |
*** tiswanso has joined #tripleo | 12:11 | |
shardy | shadower: Some interface additions are needed in heat, e.g see http://lists.openstack.org/pipermail/openstack-dev/2016-August/102297.html | 12:12 |
shardy | I hope we can make some progress on that during Ocata | 12:12 |
*** trown|outtypewww is now known as trown | 12:12 | |
shadower | thanks shardy | 12:13 |
*** amoralej is now known as amoralej|lunch | 12:14 | |
*** panda is now known as panda|afk | 12:15 | |
*** mburned is now known as mburned_out | 12:18 | |
*** gfidente has quit IRC | 12:20 | |
*** fultonj has joined #tripleo | 12:21 | |
*** zoli|brb is now known as zoli | 12:22 | |
*** zoli is now known as zoliXXL | 12:22 | |
*** morazi has quit IRC | 12:23 | |
*** athomas has quit IRC | 12:24 | |
*** ccamacho is now known as ccamacho|lunch | 12:25 | |
*** tiswanso has quit IRC | 12:25 | |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common: Implement stack update as mistral actions https://review.openstack.org/379516 | 12:27 |
*** pgadiya has quit IRC | 12:28 | |
marios | bandini: thanks https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/major_upgrade_controller_pacemaker_3.sh :) | 12:29 |
*** gfidente has joined #tripleo | 12:30 | |
bandini | marios: yeah I felt that was the cleanest solution (not a fan of the third step though) | 12:31 |
*** skramaja has quit IRC | 12:33 | |
EmilienM | I created this gerrit topic https://review.openstack.org/#/q/topic:tripleo/newton-backport | 12:33 |
EmilienM | that includes the patches that we want in stable/newton | 12:33 |
*** tobias-fiberdata has quit IRC | 12:37 | |
marios | bandini: so hopefully the overhead for the extra softwareconfig isn't that great in itself... codewise it is just moving, not adding something, so let's see, but I think it is fine and as you say we can just use the dependencies to orchestrate the upgrade as we intend it | 12:38 |
marios | bandini: i.e. i agree it is cleanest | 12:38 |
bandini | marios: yeah I thought of some other approaches but they were "meh" | 12:39 |
bandini | marios: glad you are okay with it, I did type "git review" and thought "I hope marios is not going to kill me" ;) | 12:39 |
marios | bandini: haha how can i complain about contributions man, honestly thanks very much. matbu have you seen this one? i mean i guess you've likely already pulled it in your testing since it is landed https://review.openstack.org/#/c/377842/ | 12:41 |
*** sudipto_ has quit IRC | 12:43 | |
*** sudipto has quit IRC | 12:43 | |
EmilienM | shardy: can I have review on the backports please? https://review.openstack.org/#/q/topic:tripleo/newton-backport | 12:43 |
*** ayoung has quit IRC | 12:43 | |
*** ayoung has joined #tripleo | 12:44 | |
shardy | EmilienM: There are a few more related to the composable-roles bugfixes and tripleoclient fix we discussed yesterday | 12:45 |
shardy | some of those are WIP but I'll add them to the list so we can land them as they get completed | 12:45 |
EmilienM | shardy: nice, please add them to this gerrit topic | 12:45 |
*** jpena|lunch is now known as jpena | 12:47 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add option to specify Certmonger CA https://review.openstack.org/368643 | 12:48 |
thrash | EmilienM: shardy I should set https://review.openstack.org/#/c/379516/ to topic tripleo/newton-backport? | 12:48 |
*** aufi has joined #tripleo | 12:48 | |
EmilienM | yes | 12:48 |
jpich | EmilienM: Is this for a potential rc3? Is it ok to add a couple of puppet patches here? Thinking of https://bugs.launchpad.net/tripleo/+bug/1628484 and https://bugs.launchpad.net/tripleo/+bug/1628983 (especially the first one) | 12:48 |
openstack | Launchpad bug 1628484 in tripleo "UI fails to load with a "400 Bad Request" error" [Critical,In progress] - Assigned to Honza Pokorny (hpokorny) | 12:48 |
openstack | Launchpad bug 1628983 in tripleo "Puppet httpd config for the tripleo-ui conflicts with the package config" [High,In progress] - Assigned to Julie Pichon (jpichon) | 12:48 |
EmilienM | it has the newton-backport-potential | 12:48 |
EmilienM | jpich: we won't have RC3, RC deadline was yesterday | 12:49 |
thrash | EmilienM: ack | 12:49 |
trown | EmilienM: don't we need bug for https://review.openstack.org/#/c/379875/ ? | 12:49 |
EmilienM | trown: we could, though it's for upgrades | 12:49 |
shardy | EmilienM: added those I'm following and/or assigned to | 12:49 |
EmilienM | jpich: I asked but we can't have more RC :( | 12:49 |
shardy | I should have patches moving some of those out of WIP by the end of today | 12:49 |
jpich | EmilienM: Then what's the difference between cycle-trailing and regular releases? I'm a bit confused :o | 12:50 |
EmilienM | jpich: me too | 12:50 |
shardy | jpich: we can still fix things, they just have to land on master then be backported to stable | 12:50 |
shardy | Seems the only difference is we get a couple more weeks of backporting things before we have to declare the release final | 12:50 |
EmilienM | shardy: we can do that post release also | 12:51 |
jpich | shardy: So the RC2 tag won't be used for the release then, but whatever the top patch is? | 12:51 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common: Implement stack update as mistral actions https://review.openstack.org/379516 | 12:51 |
EmilienM | shardy: like documented http://docs.openstack.org/project-team-guide/stable-branches.html#support-phases | 12:51 |
EmilienM | Phase I (first 6 months): All bugfixes (which meet the criteria described below) are appropriate | 12:51 |
shardy | EmilienM: Yeah, but at some point we have to declare a newton GA | 12:51 |
EmilienM | right | 12:51 |
EmilienM | in terms of git tags, it's too late now | 12:51 |
EmilienM | except we can do newton stable releases later | 12:51 |
EmilienM | like we did with mitaka and liberty | 12:51 |
*** zoliXXL is now known as zoli|brb | 12:51 | |
*** hjensas has quit IRC | 12:52 | |
jpich | EmilienM: The UI package doesn't work without that puppet patch, does that mean no matter what we do now it can't get included in the Newton release? (as in, it'll have to wait until the first stable point release a few weeks later?) | 12:54 |
* jpich goes to read again the release docs tag descriptions | 12:54 | |
EmilienM | what is https://review.openstack.org/#/c/380152/1/manifests/ui.pp ? | 12:55 |
EmilienM | I'm tryin to understand | 12:55 |
EmilienM | ok I see, yum update versus puppet run | 12:56 |
jpich | EmilienM: The other one you already kindly reviewed (thank you!) is the most-most important one, but... yeah, that's it | 12:56 |
jpich | sorry I couldn't explain well | 12:56 |
*** links has quit IRC | 12:56 | |
EmilienM | ok let me review it | 12:57 |
jpich | Thank you | 12:57 |
*** david-lyle has joined #tripleo | 12:57 | |
EmilienM | so we have a conflict between puppet that creates a vhost and package that creates a vhost too right? | 12:57 |
jpich | Right | 12:58 |
jpich | Horizon had the same issue | 12:58 |
jpich | so I copied their solution :) | 12:58 |
jpich | (I linked to it in the bug) | 12:58 |
EmilienM | https://github.com/openstack/puppet-horizon/blob/000a40/manifests/wsgi/apache.pp#L115-L128 | 12:58 |
EmilienM | it's interested you know? | 12:58 |
EmilienM | I never saw it | 12:58 |
EmilienM | +2 | 12:58 |
jpich | Yay, one step closer | 12:58 |
EmilienM | interesting* | 12:59 |
EmilienM | jpich: do you have more blockers? | 12:59 |
jpich | EmilienM: That's the ones I'm aware of. jtomasek would know, about the UI repo itself | 12:59 |
*** cdearborn has joined #tripleo | 12:59 | |
jpich | Maybe the node tagging | 12:59 |
EmilienM | I'm adding tripleo/newton-backport gerrit topic | 12:59 |
jpich | Thank you | 12:59 |
EmilienM | you're welcome, thanks for your work! | 13:00 |
*** tobias-fiberdata has joined #tripleo | 13:01 | |
*** radeks has quit IRC | 13:01 | |
jpich | :) | 13:01 |
*** ayoung has quit IRC | 13:02 | |
jpich | The node tagging stuff doesn't seem to be ready for review yet ( https://bugs.launchpad.net/tripleo/+bug/1625264 ) so I'm not sure about the status /cc jtomasek honza | 13:03 |
openstack | Launchpad bug 1625264 in tripleo "Node tagging workflow doesn't support untagging" [High,In progress] - Assigned to Honza Pokorny (hpokorny) | 13:03 |
*** lblanchard has joined #tripleo | 13:04 | |
jpich | Is the cycle-trailing model new in Newton, or did it exist in Mitaka/previous releases as well? | 13:04 |
*** dtrainor has quit IRC | 13:05 | |
jtomasek | jpich, EmilienM: I am currently working on updating the nodes tagging GUI patch (https://review.openstack.org/#/c/367562) and honza is working on tripleo-common one (https://review.openstack.org/372628) | 13:06 |
EmilienM | panda|afk: DUDE | 13:07 |
EmilienM | panda|afk: increasing the timeout was a good idea :) | 13:07 |
EmilienM | panda|afk: we're running ipv6 pingtest now :) | 13:07 |
EmilienM | and it seems to pass \o/ | 13:07 |
jpich | jtomasek: Ok! | 13:07 |
EmilienM | bnemec: ^ | 13:07 |
EmilienM | jpich: Newton | 13:08 |
EmilienM | jpich: well, it's from Mitaka but in TripleO we use it since Newton afaik | 13:08 |
*** Goneri has joined #tripleo | 13:08 | |
*** panda|afk is now known as panda | 13:09 | |
panda | EmilienM: woohoo! | 13:09 |
jpich | EmilienM: Ok, thanks for the information. So maybe all the processes aren't clearly defined yet? To me https://governance.openstack.org/reference/tags/release_cycle-trailing.html sure indicates we could have a rc3 or whatever number we need for up to 2 more weeks... but I've seen some of the conversations in the release channel and they've left me completely confused | 13:09 |
panda | EmilienM: I think I also know why it's taking too long | 13:09 |
panda | EmilienM: https://bugs.launchpad.net/tripleo/+bug/1629279 | 13:09 |
openstack | Launchpad bug 1629279 in tripleo "gnocchi is eating all my cpu too :(" [Undecided,Confirmed] - Assigned to Gabriele Cerami (gcerami) | 13:09 |
panda | EmilienM: do you know where is os_service_default defined in manifests ? | 13:10 |
EmilienM | jpich: let me confirm with Doug | 13:10 |
*** zoli|brb is now known as zoli | 13:10 | |
EmilienM | panda: ah I know this bug | 13:10 |
*** zoli is now known as zoliXXL | 13:10 | |
EmilienM | panda: we are fixing it | 13:10 |
EmilienM | panda: which process? metricd? | 13:10 |
*** dhill_ has quit IRC | 13:11 | |
*** dhill__ has joined #tripleo | 13:11 | |
EmilienM | ccamacho|lunch was working on it | 13:11 |
panda | EmilienM: yes | 13:11 |
EmilienM | tripleo.sh -- Overcloud pingtest SUCCEEDED | 13:11 |
*** dsavineau has joined #tripleo | 13:11 | |
EmilienM | panda: ok problem solved, but maybe not backported, a sec | 13:11 |
EmilienM | panda: https://review.openstack.org/#/c/375968/ | 13:11 |
panda | EmilienM: it's not the same as before, it's similar, but it depends on a different solution | 13:11 |
jpich | EmilienM: Cool, I'm very curious! | 13:12 |
panda | EmilienM: I called it the same as the other as a joke, but maybe it's not a good idea | 13:12 |
EmilienM | jpich: sure, let me help panda with gnocchi/cpu problem and I'll catchup with Doug if we can make an RC2 | 13:12 |
EmilienM | RC3 | 13:12 |
jpich | EmilienM: Sure!! | 13:13 |
EmilienM | panda: ok so first step, backport https://review.openstack.org/#/c/380278/ | 13:13 |
EmilienM | panda: now let me look how heat template configure gnocchi | 13:14 |
EmilienM | ok it takes the defaults value | 13:14 |
panda | EmilienM: I traced back the problem in coordination_url value in gnocchi.conf, we're passing redis://:Be9gjaFvNmFvyCT6X4uVK3p23@fd00:fd00:fd00:2000::18:6379/ | 13:14 |
EmilienM | so we should be good with this puppet-gnocchi patch | 13:14 |
*** tiswanso has joined #tripleo | 13:14 | |
panda | EmilienM: yes, it's taking os_service_default, but it's wrong ... | 13:14 |
EmilienM | panda: why? | 13:15 |
panda | EmilienM: doesn't have brackets. | 13:15 |
EmilienM | ok that's a new bug | 13:15 |
EmilienM | panda: note that os_service_default is not taken anymore, we now have $::os_workers | 13:15 |
EmilienM | see https://review.openstack.org/#/c/380278/1/manifests/metricd.pp | 13:15 |
panda | EmilienM: os_workers is an ip ? | 13:16 |
EmilienM | panda: no | 13:16 |
EmilienM | panda: see https://review.openstack.org/#/c/375146/ | 13:16 |
EmilienM | mwhahaha (Alex) sent an email: [openstack-dev] [puppet][tripleo][fuel] Upcoming changes to defaults around using processor count for worker configurations | 13:16 |
panda | EmilienM: I see this in gnocchi/manifests/storage.pp | 13:17 |
panda | class gnocchi::storage( | 13:17 |
panda | $package_ensure = 'present', | 13:17 |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates: Fixed NoneType issue when logging-environment.yaml is used https://review.openstack.org/380003 | 13:17 |
panda | $coordination_url = $::os_service_default, | 13:17 |
panda | ) inherits gnocchi::params { | 13:17 |
panda | EmilienM: so it's ok to replace os_service_default with os_workers ? | 13:17 |
panda | os_workers is an url ? | 13:18 |
EmilienM | I think we misunderstood | 13:18 |
EmilienM | ok let me re-phrase | 13:18 |
EmilienM | First of all, we have 2 problems : | 13:18 |
*** dtrainor has joined #tripleo | 13:18 | |
EmilienM | 1) The number of workers by default in Gnocchi is too high and puppet-gnocchi was using default value (os_service_default) in TripleO. Problem solved by https://review.openstack.org/#/c/380278/ where we limit the number of workers | 13:19 |
EmilienM | 2) coordination_url is not set to os_service_default in TripleO but to a URL that apparently is missing brackets. This problem looks valid and we need to solve it | 13:20 |
EmilienM | let me send a patch for 2) and let's do a recheck on the tripleo-ci patch to see if we still need this timeout increase | 13:20 |
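A minimal Python sketch of the normalization discussed for problem 2), under stated assumptions: `bracket_if_ipv6` and `coordination_url` are hypothetical helper names, and the actual fix is in the Puppet patches linked in the log, not in this code. The idea is only that an IPv6 literal must be wrapped in brackets before it is placed in the host part of a URL.

```python
import ipaddress
from urllib.parse import urlsplit


def bracket_if_ipv6(host):
    """Wrap a bare IPv6 literal in brackets; leave IPv4 and hostnames alone."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return host  # a hostname, or a literal that is already bracketed
    return f"[{host}]" if addr.version == 6 else host


def coordination_url(host, port, password):
    """Build a redis URL whose host part is safe for IPv6 (illustrative only)."""
    return f"redis://:{password}@{bracket_if_ipv6(host)}:{port}/"


url = coordination_url("fd00:fd00:fd00:2000::18", 6379, "secret")
parsed = urlsplit(url)
print(url, parsed.hostname, parsed.port)
```

With the brackets in place, a standard URL parser can separate the host from the port; without them, the first colon-delimited group is taken as the host and the rest as a port that is not a number.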
panda | EmilienM: what I don't understand is $coordination_url = $::os_service_default. What is $::os_service_default ? it's a int, a url , an ip ? a jolly value ? | 13:21 |
EmilienM | it's a fact that sets the value to "ensure absent" in Puppet | 13:21 |
EmilienM | so we don't configure it at all | 13:21 |
*** ccamacho|lunch is now known as ccamacho | 13:21 | |
*** ayoung has joined #tripleo | 13:22 | |
*** amoralej|lunch is now known as amoralej | 13:22 | |
EmilienM | you don't have to look at it because we set the parameter in THT | 13:22 |
EmilienM | you're confused by the default value in puppet-gnocchi... but we don't use the default... | 13:22 |
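The sentinel-default idea EmilienM describes can be sketched in Python as follows. This is an analogy only: `SERVICE_DEFAULT` and `apply_settings` are hypothetical names, and the real mechanism is a Puppet fact handled by the Puppet modules, not this code. A parameter left at the sentinel is removed from the configuration so the service falls back to its own built-in default, while an explicit value (such as the one TripleO passes from the heat templates) is written as given.

```python
SERVICE_DEFAULT = object()  # hypothetical stand-in for $::os_service_default


def apply_settings(config, settings):
    """Return a copy of config with sentinel-valued keys removed and others set."""
    result = dict(config)
    for key, value in settings.items():
        if value is SERVICE_DEFAULT:
            result.pop(key, None)  # "ensure absent": leave the option unconfigured
        else:
            result[key] = value
    return result


existing = {"coordination_url": "redis://old", "workers": 4}
untouched_default = apply_settings(existing, {"coordination_url": SERVICE_DEFAULT})
explicit_value = apply_settings(existing, {"coordination_url": "redis://new"})
```

Here `untouched_default` no longer contains `coordination_url` at all, whereas `explicit_value` carries the URL that was passed in.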
panda | EmilienM: ok | 13:22 |
panda | EmilienM: ah, I get it %{hiera('redis_vip')}" in puppet/services/gnocchi-base.yaml | 13:23 |
panda | this is what is not normalized | 13:23 |
panda | EmilienM: in the meantime, I'll backport the workers patch | 13:23 |
EmilienM | panda: yes | 13:24 |
EmilienM | panda: backport what? | 13:24 |
EmilienM | I already sent the puppet-gnocchi backport | 13:24 |
EmilienM | https://review.openstack.org/#/c/380278/ | 13:24 |
panda | ... | 13:24 |
panda | Is there anything left for me to do ? :( | 13:24 |
EmilienM | I'm working on the puppet-tripleo fix to normalize the redis IP | 13:25 |
d0ugal | thrash: if we call openstack overcloud update stack after a new deploy, it wont do much will it? | 13:25 |
d0ugal | thrash: but it would at least test it a bit | 13:25 |
d0ugal | thrash: just wondering if we can add that to CI | 13:25 |
*** jeckersb_gone is now known as jeckersb | 13:26 | |
thrash | d0ugal: correct. It wouldn't do much. But we should add it to CI. It wouldn't even need to be a complicated install. | 13:27 |
bandini | marios: I still get Error: false is not a string. It looks to be a FalseClass at /etc/puppet/modules/ceilometer/manifests/api.pp:94 on node overcloud-con | 13:27 |
*** mcornea has quit IRC | 13:28 | |
bandini | marios: send me your ssh key and I will get you access | 13:28 |
thrash | d0ugal: I think I'm going to pull more of the code from the UpdateManager into the actions themselves. Since we're dealing more with plan than stack. | 13:28 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: gnocchi-base: add gnocchi_redis_password hiera parameter https://review.openstack.org/380291 | 13:28 |
thrash | d0ugal: and I haven't exactly figured out what to do with the interactive update piece yet. | 13:29 |
*** mcornea has joined #tripleo | 13:29 | |
thrash | d0ugal: can a workflow block waiting for a message from zaqar? | 13:30 |
marios | bandini: https://github.com/marios.keys | 13:30 |
*** itzkb_ has joined #tripleo | 13:32 | |
*** fultonj has quit IRC | 13:32 | |
*** fultonj has joined #tripleo | 13:34 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fix typo in fixing gnocchi upgrade. https://review.openstack.org/379874 | 13:34 |
d0ugal | thrash: I'm not sure, I don't see why not tho' | 13:34 |
d0ugal | thrash: I'll try and add it to CI quickly then. | 13:35 |
openstackgerrit | Juan Badia Payno proposed openstack/tripleo-heat-templates: Fixed NoneType issue when monitoring-environment.yaml https://review.openstack.org/380007 | 13:37 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add gateway_ip in OS::Neutron::Subnet https://review.openstack.org/379873 | 13:38 |
*** jlinkes__ has quit IRC | 13:38 | |
*** jlinkes__ has joined #tripleo | 13:38 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Use -L with chown and set crush map tunables when upgrading Ceph https://review.openstack.org/379875 | 13:40 |
ccamacho | mcornea check this, https://paste.fedoraproject.org/439078/14752428/ when using the pcm env file without having a plan created, the deployment will fail... | 13:41 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui: Integrate node tagging workflow https://review.openstack.org/367562 | 13:43 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Telemetry: add redis_password hiera parameter https://review.openstack.org/380291 | 13:44 |
EmilienM | panda: the problem is also with aodh and ceilometer | 13:44 |
EmilienM | I'm fixing them all in one | 13:44 |
itzkb_ | hi, can I get another core review for https://review.openstack.org/#/c/376007/? tnx | 13:45 |
panda | EmilienM: \o/ | 13:46 |
beagles | itzkb_: you get my first +2 | 13:46 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: telemetry: normalize coordination_url https://review.openstack.org/380306 | 13:47 |
itzkb_ | beagles: :-) thanks | 13:49 |
itzkb_ | beagles: congrats | 13:49 |
*** rbowen has joined #tripleo | 13:49 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: telemetry: remove coordination_url hiera settings https://review.openstack.org/380310 | 13:49 |
panda | EmilienM: I'm trying to figure out how, but it doesn't look too easy .. at least I wouldn't know where to put the normalizer function | 13:49 |
EmilienM | panda: https://review.openstack.org/#/q/topic:bug/1629279 | 13:50 |
beagles | itzkb_: thx ... nice to have a low, slow one for the first networking related one to come along :) | 13:50 |
openstackgerrit | Jiri Tomasek proposed openstack/tripleo-ui: Integrate node tagging workflow https://review.openstack.org/367562 | 13:50 |
itzkb_ | beagles: :-) | 13:50 |
*** akshai has joined #tripleo | 13:51 | |
openstackgerrit | Emilien Macchi proposed openstack-infra/tripleo-ci: Add IPv6 network configuration for ipv6 job types https://review.openstack.org/363674 | 13:52 |
EmilienM | panda: trying again with my patches ^ | 13:52 |
matbu | shardy: i posted a few comments to the review if you have time to answer. i think i have all the cards to do the fix, but i need some "core" eyes and advice :) | 13:52 |
*** gchamoul is now known as gchamoul|afk | 13:53 | |
panda | EmilienM: ah, yep, you refactored | 13:53 |
panda | EmilienM: remove the timeout | 13:53 |
EmilienM | panda: I didn't | 13:53 |
honza | jpich: jtomasek: about npm --- i didn't set that up and i assume it's because of some automated script reuse | 13:54 |
panda | EmilienM: we're doomed | 13:54 |
*** bfournie has quit IRC | 13:54 | |
EmilienM | panda: why? | 13:54 |
honza | jpich: jtomasek: i'm not sure how much value there is in publishing to npm | 13:54 |
EmilienM | I want to see if we still need this timeout | 13:54 |
shardy | matbu: ack - I got sidetracked by some custom-roles bugs so feel free to go ahead if you've got bandwidth to update it | 13:54 |
shardy | will check the comments now | 13:54 |
EmilienM | we can easily monitor the timestamps | 13:54 |
panda | EmilienM: ok | 13:55 |
thrash | d0ugal: take a look at update... I'm not sure we *can* put it in CI. | 13:55 |
thrash | d0ugal: because of the breakpoint | 13:55 |
thrash | *breakpoints | 13:55 |
EmilienM | jpich: I'm asking about RC3, let's see :) | 13:55 |
jpich | honza: jtomasek: I'm leaning toward not-doing-that too | 13:56 |
matbu | shardy: ack thanks, during my test i hit the missing overcloud-resource-registry-puppet.yaml at the validate_args step | 13:56 |
jpich | EmilienM: Oh! * listens in * | 13:56 |
thrash | d0ugal: we can, we just have to figure out a polling mechanism, then be able to clear the breakpoints (there's no cli that I can find to just clear the breakpoints) | 13:56 |
matbu | shardy: so i plan to fix that also (i didn't hit that before) | 13:56 |
*** morazi has joined #tripleo | 13:56 | |
thrash | d0ugal: also, since we have the plans now, I don't think --templates, -e, or --answers-file makes sense anymore. (Maybe answers file) | 13:56 |
thrash | d0ugal: wdyt? | 13:57 |
*** shadower has quit IRC | 13:57 | |
*** gchamoul|afk is now known as gchamoul | 13:59 | |
*** limao has joined #tripleo | 13:59 | |
*** rodrigods has quit IRC | 14:00 | |
EmilienM | jpich: /join #openstack-meeting | 14:00 |
*** rodrigods has joined #tripleo | 14:00 | |
*** shadower has joined #tripleo | 14:00 | |
thrash | d0ugal: also thinking about splitting abort and breakpoint clearing to their own commands | 14:01 |
*** limao_ has joined #tripleo | 14:01 | |
d0ugal | thrash: sorry, I am having to deal with an issue at home, so wont be able to help for a bit | 14:02 |
thrash | d0ugal: ok. no worries. | 14:02 |
d0ugal | thrash: water is coming up my kitchen sink alarmingly fast! | 14:02 |
thrash | DOH! | 14:02 |
thrash | go go go | 14:02 |
*** bfournie has joined #tripleo | 14:02 | |
shardy | matbu: I added some comments, let me know if they don't make sense :) | 14:03 |
*** jlinkes_ has joined #tripleo | 14:03 | |
*** limao has quit IRC | 14:04 | |
ccamacho | shardy I think I found why HA is broken: we are not dynamically generating https://review.openstack.org/#/c/378736/4/puppet/manifests/overcloud_controller_pacemaker.pp, only overcloud_controller.pp. Let me undo that change and ask bandini for some feedback, as I think that file will disappear in the future. | 14:04 |
*** yamahata has joined #tripleo | 14:04 | |
matbu | shardy: yep thank you, i'm reading that make sense :) | 14:06 |
*** itzkb_ has quit IRC | 14:06 | |
*** tiswanso has quit IRC | 14:06 | |
*** jlinkes__ has quit IRC | 14:06 | |
*** saneax is now known as saneax-_-|AFK | 14:06 | |
*** jlinkes__ has joined #tripleo | 14:07 | |
*** tiswanso has joined #tripleo | 14:08 | |
*** mbozhenk1 has quit IRC | 14:09 | |
mcornea | ccamacho: sorry, was in a meeting. checking | 14:09 |
*** jlinkes_ has quit IRC | 14:10 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: Replace per role manifests with a common role manifest https://review.openstack.org/378736 | 14:11 |
*** limao has joined #tripleo | 14:11 | |
mcornea | ccamacho: so if I delete the plan and try to deploy with an env file it fails? | 14:11 |
*** limao__ has joined #tripleo | 14:12 | |
ccamacho | yeahp it's weird, if you don't have a plan already created and use an additional env file it will fail (because it's not creating the plan) | 14:12 |
*** jlinkes__ has quit IRC | 14:12 | |
*** jlinkes has joined #tripleo | 14:12 | |
*** dhill__ has quit IRC | 14:13 | |
*** limao_ has quit IRC | 14:14 | |
*** abehl has quit IRC | 14:14 | |
*** dhill_ has joined #tripleo | 14:14 | |
*** limao has quit IRC | 14:16 | |
shardy | ccamacho: Yeah my assumption was that we don't need the pacemaker specific manifest anymore | 14:16 |
*** dtrainor has quit IRC | 14:17 | |
shardy | ccamacho: all the pacemaker stuff should have moved into services/pacemaker, but perhaps I missed something related to the package manifest path | 14:17 |
ccamacho | yeahp | 14:17 |
ccamacho | the patch its different | 14:17 |
ccamacho | let me check if works (deployin now locally) | 14:17 |
ccamacho | once spotted the issue we can fix it | 14:18 |
ccamacho | s/patch/path/ | 14:18 |
ccamacho | mcornea yeahp confirmed and if when using env files you dont run "swift download overcloud" you will have the "the cant find blah stuff error" | 14:20 |
*** limao has joined #tripleo | 14:22 | |
*** limao__ has quit IRC | 14:23 | |
*** yamahata has quit IRC | 14:27 | |
d0ugal | thrash: I agree, I'd like to deprecate the deploy command totally :) | 14:27 |
thrash | d0ugal: whoa. Really? | 14:27 |
d0ugal | thrash: I'll be sending an email with some thoughts about it when things calm down | 14:27 |
thrash | d0ugal: what replaces it? | 14:27 |
d0ugal | thrash: openstack overcloud plan deploy | 14:28 |
thrash | d0ugal: ahhh | 14:28 |
*** jprovazn has quit IRC | 14:28 | |
d0ugal | thrash: (it doesn't fully, yet, but I think it should) | 14:28 |
shardy | d0ugal: maybe we can do that at some point, but can't it just be a shortcut for plan create && plan deploy ? | 14:28 |
shardy | personally I find it convenient and would like to maintain the interface | 14:28 |
d0ugal | shardy: I guess, but I want to greatly simplify the interface and make backwards incompatible changes :) | 14:29 |
thrash | d0ugal: make sure it supports upgrades. :) | 14:29 |
d0ugal | shardy: The problem is, the deploy command at the moment doesn't really fit the plan model - hence the hacks. | 14:29 |
d0ugal | shardy: so something needs to change | 14:29 |
shardy | d0ugal: Yeah, I think there's definitely scope for some kind of deprecation to simplify things | 14:29 |
*** dtrainor has joined #tripleo | 14:29 | |
shardy | I was having a discussion with larsks where he had the idea of a more constrained --extra-files-dir type interface | 14:29 |
shardy | that would remove a lot of the pain we currently have | 14:30 |
openstackgerrit | Brent Eagles proposed openstack/tripleo-heat-templates: Sync xWorker count description to match new derived defaults https://review.openstack.org/380342 | 14:30 |
d0ugal | shardy: +1 | 14:30 |
shardy | but wouldn't necessarily kill the deploy command completely | 14:30 |
d0ugal | Sure, I guess I would be fine with that too :) | 14:30 |
d0ugal | It's in such a messy state I just want to delete it at the moment :) | 14:30 |
shardy | we could introduce such an option, then wire in warnings whenever we resolve a file outside of either the --templates or --extra-files locations | 14:30 |
*** fzdarsky is now known as fzdarsky|afk | 14:31 | |
jpich | EmilienM: Looks like things are somewhat clarified, thanks a lot for following up!! | 14:31 |
*** lucasagomes is now known as lucas-afk | 14:31 | |
EmilienM | jpich: cool | 14:32 |
EmilienM | I'll announce it | 14:32 |
jpich | EmilienM: Also +1 for cat picture <3 | 14:32 |
EmilienM | shardy: rc3 next week | 14:32 |
EmilienM | ok I need to create the milestone in Launchpad and move bugs now :) | 14:33 |
shardy | EmilienM: ack, sounds good | 14:33 |
EmilienM | shardy: by next Friday | 14:33 |
EmilienM | so we have 5 days to close as much as we can | 14:33 |
thrash | shardy: d0ugal also want to see a greater emphasis on using an answers file. | 14:34 |
openstackgerrit | Adriano Petrich proposed openstack/tripleo-heat-templates: GATE TEST, please ignore https://review.openstack.org/365449 | 14:34 |
shardy | EmilienM: great, that should allow us to clear down the issues we currently know about at least | 14:34 |
EmilienM | yep | 14:34 |
*** limao has quit IRC | 14:34 | |
d0ugal | thrash: Sure, except wouldn't it be better to store all that information in a plan? | 14:34 |
thrash | d0ugal: true, true. | 14:34 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart: Clone tripleo-ci in the undercloud https://review.openstack.org/380346 | 14:34 |
*** limao has joined #tripleo | 14:34 | |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Add aodh profile rspec testing https://review.openstack.org/374402 | 14:36 |
*** aufi has quit IRC | 14:37 | |
*** david-lyle has quit IRC | 14:37 | |
*** david-lyle has joined #tripleo | 14:38 | |
shardy | the risk IMO with the answers file approach is we'd need to expose a bunch of configuration options via the CLI again, otherwise it'll just be a list of environment files | 14:38 |
shardy | which is already possible, e.g via --environment-directory | 14:38 |
leifmadsen | dumb question... what is the new "neutron subnet-list" equivalent in Newton? | 14:38 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: j2 template role config templates https://review.openstack.org/378737 | 14:38 |
leifmadsen | I assume it's something under "network list" ? | 14:38 |
shardy | leifmadsen: openstack subnet list | 14:39 |
*** jeckersb is now known as jeckersb_gone | 14:39 | |
shardy | the old command still works for me tho | 14:39 |
leifmadsen | shardy: doesn't work on my quickstart master deploy... | 14:40 |
leifmadsen | just shows me the network commands | 14:40 |
leifmadsen | thanks -- not sure how I didn't see the subnet commands when listing everything | 14:40 |
*** limao_ has joined #tripleo | 14:41 | |
leifmadsen | looks like this section of the documentation needs updating to the new commands? http://tripleo.org/basic_deployment/basic_deployment_cli.html#configure-a-nameserver-for-the-overcloud | 14:41 |
*** jlinkes has quit IRC | 14:42 | |
EmilienM | ok so we have https://launchpad.net/tripleo/+milestone/newton-rc3 | 14:42 |
EmilienM | I'll send an email about it now | 14:42 |
shardy | leifmadsen: I'm kinda surprised if they removed that command without any user visible deprecation messages, during the RC window | 14:42 |
*** jlinkes has joined #tripleo | 14:43 | |
shardy | but my undercloud is a couple of weeks old, so perhaps they did | 14:43 |
leifmadsen | yea... I don't see anything like that (I got it on a few of the other baremetal commands) | 14:43 |
leifmadsen | like "baremetal list" tells me I should use "baremetal node list" but proceeds to show me the list anyways | 14:43 |
shardy | I guess the docs will need updating to standardize on the openstackclient stuff regardless | 14:43 |
leifmadsen | shardy: yea, I just deployed the undercloud last night | 14:43 |
leifmadsen | shardy: if you point me at the source, I'm happy to submit a patch | 14:43 |
leifmadsen | I assume it's tripleo-docs ? | 14:44 |
paramite | shardy, Hey Steve, how can I debug failing overcloud deploy, when only AllNodesDeploySteps (OS::TripleO::PostDeploySteps) fails? | 14:44 |
leifmadsen | project wise on review.openstack ? | 14:44 |
shardy | https://github.com/openstack/tripleo-docs | 14:44 |
shardy | leifmadsen: ^^ patches very welcome, thanks! :) | 14:44 |
leifmadsen | that's not a mirror from openstack? | 14:44 |
shardy | leifmadsen: yes it is | 14:44 |
leifmadsen | ah ok, I'll submit a review on review.openstack.org unless that's not correct :) | 14:44 |
shardy | yup, that's correct, thanks! | 14:45 |
*** limao has quit IRC | 14:45 | |
leifmadsen | thanks! | 14:45 |
shardy | paramite: try running "openstack stack failures list overcloud" | 14:45 |
leifmadsen | testing out composable roles :) | 14:45 |
*** morazi has quit IRC | 14:45 | |
shardy | leifmadsen: ack, note there's still a few issues we're working through: | 14:46 |
openstackgerrit | Julie Pichon proposed openstack/puppet-tripleo: Clean out UI httpd configuration file https://review.openstack.org/380152 | 14:46 |
shardy | https://bugs.launchpad.net/tripleo/+bugs?field.tag=composable-roles | 14:46 |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Add aodh profile rspec testing https://review.openstack.org/374402 | 14:46 |
leifmadsen | shardy: sounds good -- just trying to mostly test with parameters instead of -flavor and -scale openstack flags | 14:47 |
leifmadsen | checking list though so I know what to expect to run into | 14:47 |
leifmadsen | haha, first one is no way to override roles_data.yaml. I was thinking about that too! :) | 14:47 |
shardy | yeah, you have to cp -r /usr/share/openstack-tripleo-heat-templates for now | 14:48 |
shardy | I got a patch up adding -r my_roles_data.yaml, but needs tests | 14:48 |
leifmadsen | shardy: I'll try and test today | 14:48 |
leifmadsen | will be a good way for me to learn how to patch the undercloud to test things :) | 14:49 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart: [WIP] Add configuration of HA IPv6 deployments https://review.openstack.org/380358 | 14:49 |
*** jeckersb_gone is now known as jeckersb | 14:49 | |
shardy | leifmadsen: http://paste.openstack.org/show/583601/ has some notes of how I'm testing a custom role locally | 14:49 |
leifmadsen | awesome thanks! | 14:49 |
leifmadsen | loading that up for later | 14:49 |
openstackgerrit | lilintan proposed openstack/tripleo-quickstart: Drop *openstack/common* in flake8 exclude list https://review.openstack.org/373874 | 14:50 |
shardy | the copy/sed stuff should go away when we fix bug https://bugs.launchpad.net/tripleo/+bug/1626976 completely | 14:50 |
openstack | Launchpad bug 1626976 in tripleo "Custom role requires manual environment/files" [High,In progress] - Assigned to Carlos Camacho (ccamacho) | 14:50 |
openstackgerrit | lilintan proposed openstack/tripleo-common: Drop *openstack/common* in flake8 exclude list https://review.openstack.org/373882 | 14:50 |
leifmadsen | +1 | 14:50 |
leifmadsen | seems pretty close! | 14:50 |
leifmadsen | already loving the composable stuff over whatever was there before :D | 14:50 |
shardy | Yeah, hopefully folks will like it, it's been a lot of work | 14:51 |
leifmadsen | I like it already | 14:52 |
shardy | good to know :) | 14:52 |
leifmadsen | at the very least, it has been easier to learn than looking through giant monolithic templates | 14:52 |
shardy | Yup, it's definitely easier to understand now, provided you don't look at the big scary yaql queries ;) | 14:53 |
leifmadsen | :D | 14:53 |
*** padkrish has joined #tripleo | 14:53 | |
leifmadsen | so far I haven't | 14:53 |
leifmadsen | I learned some stuff back in December/January when I started at RH through OSP7... it has come a long way! | 14:54 |
*** limao_ has quit IRC | 14:55 | |
*** limao has joined #tripleo | 14:56 | |
paramite | shardy, ah-hah, so it's because of: Engine went down during resource CREATE. We experienced that yesterday too with larsks. I see two errors in the log, one is at "2016-09-30 14:19:15.356" AuthorizationFailure: Authorization failed. and the second is at "2016-09-30 14:08:29.146" Data too long for column 'resource_properties' (http://chunk.io/f/56caa6c460d24ba0b13c7f4115d04f04). Is this a known problem or should I file a heat bug? | 14:57 |
*** padkrish_ has joined #tripleo | 14:57 | |
mcornea | has anyone seen this error before: Error: /Stage[main]/Tripleo::Profile::Base::Pacemaker/File[/var/lib/tripleo/pacemaker-restarts]/ensure: change from absent to directory failed: Cannot create /var/lib/tripleo/pacemaker-restarts; parent directory /var/lib/tripleo does not exist ? | 14:57 |
EmilienM | mwhahaha: hey, when you got time, can you review this thing? https://review.openstack.org/#/q/topic:bug/1629279 (I suggest to read the bug report before) | 14:57 |
shardy | paramite: those may be bugs, but are you sure heat-engine didn't get killed by the OOM killer? | 14:57 |
EmilienM | mcornea: is it during an upgrade? | 14:58 |
shardy | paramite: there are heat memory usage issues under investigation, so it may be using up all the ram? | 14:58 |
mwhahaha | EmilienM: ok | 14:58 |
mcornea | EmilienM: no, initial deployment | 14:58 |
leifmadsen | so I'm trying to learn how to deploy without the --compute-scale and --compute-flavor (etc) flags on the openstack overcloud deploy command... I have this error though: http://pastebin.com/UhNc92TJ | 14:58 |
paramite | shardy, yeah, heat is pretty memory hungry, so I probably hit that | 14:58 |
leifmadsen | I must be rooking something up... | 14:58 |
paramite | shardy, ok, os as a workaround I should add more mem to undercloud, right? | 14:59 |
paramite | *so | 14:59 |
shardy | paramite: FYI https://bugs.launchpad.net/heat/+bug/1626675 | 14:59 |
openstack | Launchpad bug 1626675 in heat "Further memory usage issues with big stacks" [Critical,In progress] - Assigned to Zane Bitter (zaneb) | 14:59 |
EmilienM | mcornea: no I never saw it :( | 14:59 |
paramite | thx | 14:59 |
shardy | paramite: yeah unfortunately either more memory or some swap is the only option until those issues are fixed | 14:59 |
leifmadsen | shardy: yea in my experience heat-engine uses a LOT of RAM.... like 800MB per node (at least that's what it was doing for me on Liberty) | 14:59 |
beagles | mrunge: re: https://bugs.launchpad.net/tripleo/+bug/1629315 did you see the actual details of the error. i wonder if it is a heat bug or if it's something funny we are doing with environment files | 15:00 |
openstack | Launchpad bug 1629315 in tripleo "Can't use logging-environment.yaml on overcloud deployment" [Undecided,In progress] - Assigned to Juan Badia Payno (jbadiapa) | 15:00 |
*** padkrish has quit IRC | 15:00 | |
shardy | leifmadsen: if you check the plot referenced from that bug, we reduced it a lot early in newton, then massively increased it again | 15:00 |
shardy | hopefully we can work out why soon | 15:00 |
leifmadsen | shardy: yea I just opened that link :) | 15:00 |
paramite | shardy, ok, thanks, will add more mem then | 15:00 |
leifmadsen | anything to reduce memory usage in heat is ++++++ in my book | 15:00 |
leifmadsen | because it kills underclouds :) | 15:00 |
leifmadsen | shardy: sorry to be a pest, but any ideas on my pastebin above? | 15:01 |
*** zaneb has joined #tripleo | 15:02 | |
shardy | leifmadsen: Yes, you need to use parameter_defaults, not parameters | 15:02 |
shardy | or the parameters are only applied to the top-level overcloud.j2.yaml file | 15:02 |
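
A minimal sketch of the point shardy makes above: values meant to reach nested templates go under `parameter_defaults` rather than `parameters` in the environment file. The helper name `render_environment` and the example values are illustrative assumptions, not taken from the pastebin; the script only renders the text of such a file.

```python
import json


def render_environment(params):
    """Render a small Heat environment file body as text.

    Every value is placed under ``parameter_defaults`` so that it applies to
    nested templates too, not only to the top-level template.
    """
    lines = ["parameter_defaults:"]
    for name in sorted(params):
        # json.dumps yields scalars (numbers, quoted strings) that are also
        # valid YAML scalars, which keeps this sketch dependency-free.
        lines.append("  {}: {}".format(name, json.dumps(params[name])))
    return "\n".join(lines) + "\n"


# Hypothetical values standing in for the --compute-scale / --compute-flavor flags.
example = {"ComputeCount": 2, "OvercloudComputeFlavor": "compute"}
print(render_environment(example), end="")
```

The rendered text could be saved to a file and passed to the deploy command with `-e`; whether these two parameter names match a given template version is an assumption to verify locally.
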
beagles | leifmadsen: the big pain is that there is an antagonistic kind of synergy with heat and python. The arena based allocation, large strings, etc. etc. mean that once memory is consumed once, it's going to stick around. Add multiple processes and you get .. what we got | 15:02 |
*** r-mibu has quit IRC | 15:03 | |
*** r-mibu has joined #tripleo | 15:04 | |
beagles | leifmadsen: allegedly python 3.3's allocator might improve things ... but that's a big if followed by bigger "so what" | 15:04 |
*** limao_ has joined #tripleo | 15:05 | |
leifmadsen | shardy: OOOOOOH | 15:05 |
leifmadsen | shardy: thanks :) I knew I was rooking something up! | 15:05 |
*** limao has quit IRC | 15:06 | |
leifmadsen | beagles: that's exactly what I was seeing -- memory being consumed and never released. It even happens between runs. Once that memory is allocated, it doesn't get released. | 15:06 |
leifmadsen | I wrote some big document about that post-300 node deploy | 15:06 |
leifmadsen | my graphs showed exactly what you just described. Now I know why! | 15:06 |
*** Ryjedo has quit IRC | 15:07 | |
beagles | leifmadsen: some memory should be reused, but I'm guessing that how blocks are reused is far from optimal | 15:08 |
leifmadsen | sounds right | 15:08 |
*** numans has quit IRC | 15:11 | |
leifmadsen | beagles: is there any sort of command I can run to force a garbage collection or something post-run ? | 15:12 |
*** mcornea has quit IRC | 15:13 | |
shardy | sudo systemctl restart openstack-heat-engine ;D | 15:13 |
beagles | leifmadsen: .... | 15:13 |
beagles | yeah that :0 | 15:13 |
derekh | init 6, make sure you get it all | 15:13 |
shardy | to be fair, it's not only heat which suffers from this problem, but it's particularly bad due to our memory usage patterns | 15:13 |
beagles | leifmadsen: afaik, python doesn't have a mechanism to shrink its memory usage | 15:14 |
beagles | shardy: +1 | 15:14 |
leifmadsen | shardy: ah yes... I think I actually remember documenting that lol | 15:15 |
leifmadsen | derekh: lolz :) | 15:15 |
*** limao has joined #tripleo | 15:15 | |
leifmadsen | shardy: oh for sure, it's just one of the main components in my testing that I noticed had the most memory usage | 15:15 |
leifmadsen | which made deploying from an underpowered undercloud that much more underwhelming | 15:16 |
leifmadsen | rabbitMQ was another one :) | 15:16 |
*** akshai has quit IRC | 15:17 | |
*** limao_ has quit IRC | 15:18 | |
*** bnemec is now known as beekneemech | 15:20 | |
*** fzdarsky|afk is now known as fzdarsky | 15:20 | |
*** mcornea has joined #tripleo | 15:26 | |
*** padkrish_ has quit IRC | 15:28 | |
* ccamacho rebooting | 15:28 | |
*** ccamacho has quit IRC | 15:29 | |
*** akuznetsov has joined #tripleo | 15:32 | |
*** Ryjedo has joined #tripleo | 15:37 | |
openstackgerrit | Doug Hellmann proposed openstack/os-cloud-config: Update .gitreview for stable/newton https://review.openstack.org/380385 | 15:37 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: Make keystone api network hiera composable https://review.openstack.org/380386 | 15:37 |
*** jlinkes has quit IRC | 15:38 | |
*** Ryjedo has quit IRC | 15:39 | |
*** Ryjedo has joined #tripleo | 15:39 | |
*** Ryjedo has quit IRC | 15:39 | |
*** Ryjedo has joined #tripleo | 15:40 | |
*** Ryjedo has quit IRC | 15:40 | |
*** padkrish has joined #tripleo | 15:40 | |
openstackgerrit | Doug Hellmann proposed openstack/os-collect-config: Update .gitreview for stable/newton https://review.openstack.org/380388 | 15:41 |
openstackgerrit | Doug Hellmann proposed openstack/os-net-config: Update .gitreview for stable/newton https://review.openstack.org/380389 | 15:42 |
openstackgerrit | Doug Hellmann proposed openstack/os-refresh-config: Update .gitreview for stable/newton https://review.openstack.org/380391 | 15:42 |
openstackgerrit | Doug Hellmann proposed openstack/tripleo-image-elements: Update .gitreview for stable/newton https://review.openstack.org/380395 | 15:44 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation https://review.openstack.org/378750 | 15:48 |
*** dtantsur is now known as dtantsur|afk | 15:50 | |
*** Ryjedo has joined #tripleo | 15:51 | |
*** Ryjedo has quit IRC | 15:51 | |
*** Ryjedo has joined #tripleo | 15:51 | |
*** Ryjedo has quit IRC | 15:51 | |
*** Ryjedo has joined #tripleo | 15:52 | |
*** Ryjedo has quit IRC | 15:52 | |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: Select per-network hostnames for service_node_names https://review.openstack.org/378764 | 15:52 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-heat-templates: Make keystone api network hiera composable https://review.openstack.org/380386 | 15:52 |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Add ceilometer profile rspec testing https://review.openstack.org/380406 | 15:52 |
*** Ryjedo has joined #tripleo | 15:53 | |
*** Ryjedo has quit IRC | 15:53 | |
*** Ryjedo has joined #tripleo | 15:54 | |
*** limao_ has joined #tripleo | 15:54 | |
*** Ryjedo has quit IRC | 15:56 | |
*** pkovar has quit IRC | 15:56 | |
*** Ryjedo has joined #tripleo | 15:56 | |
*** limao has quit IRC | 15:56 | |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Only run ceilometer::db::sync on bootstrap node https://review.openstack.org/380414 | 15:57 |
*** Ryjedo has quit IRC | 15:59 | |
*** Ryjedo has joined #tripleo | 15:59 | |
*** ayoung has quit IRC | 16:00 | |
*** Ryjedo has quit IRC | 16:00 | |
*** tremble has quit IRC | 16:00 | |
*** b00tcat has quit IRC | 16:01 | |
*** akshai has joined #tripleo | 16:01 | |
*** ayoung has joined #tripleo | 16:01 | |
*** akshai_ has joined #tripleo | 16:02 | |
panda | EmilienM: it's not going very well .. but I don't understand why .. maybe puppet-tripleo changes contain errors | 16:03 |
*** ccamacho has joined #tripleo | 16:03 | |
*** limao_ has quit IRC | 16:05 | |
*** limao has joined #tripleo | 16:05 | |
*** akshai has quit IRC | 16:06 | |
EmilienM | panda: where? anything to show? | 16:08 |
*** rcernin has quit IRC | 16:08 | |
panda | EmilienM: almost every job in check-tripleo pipeline has failed | 16:09 |
*** limao has quit IRC | 16:09 | |
*** zoliXXL is now known as zoli|gone | 16:09 | |
panda | EmilienM: ipv6 is hanging anyway at step5, and default timeout expired | 16:10 |
EmilienM | mhh | 16:10 |
EmilienM | ok let me look | 16:10 |
zoli|gone | have a good weekend | 16:11 |
*** bana_k has joined #tripleo | 16:12 | |
*** zoli|gone is now known as zoli_gone-proxy | 16:13 | |
*** padkrish has quit IRC | 16:14 | |
*** trozet has joined #tripleo | 16:19 | |
*** tosky has quit IRC | 16:21 | |
*** padkrish has joined #tripleo | 16:21 | |
mrunge | beagles, pong | 16:25 |
mrunge | beagles, I have an idea, one should be able to trigger that error if you have an empty parameter_defaults: section somewhere | 16:26 |
*** bana_k has quit IRC | 16:27 | |
*** bana_k has joined #tripleo | 16:28 | |
*** shardy has quit IRC | 16:28 | |
mrunge | beagles, so it must be a bug somewhere else, but actually removing those empty sections from monitoring-environment.yaml or logging-environment.yaml works around this bug | 16:28 |
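
A rough sketch of the workaround mrunge describes, assuming the environment file has already been loaded into a Python mapping by some YAML library (a YAML key with no value, such as a bare `parameter_defaults:`, loads as `None`). The function name and the sample data are hypothetical.

```python
def drop_empty_sections(environment):
    """Return a copy of a parsed environment mapping without empty sections.

    A top-level section counts as empty when its value is None (a bare key
    with nothing under it) or an empty mapping or list.
    """
    return {
        name: value
        for name, value in environment.items()
        if value not in (None, {}, [])
    }


# Hypothetical parsed environment with an empty parameter_defaults section.
parsed = {
    "resource_registry": {"OS::Example::Service": "example.yaml"},
    "parameter_defaults": None,
}
print(drop_empty_sections(parsed))
```

This only mirrors the manual edit of removing the empty section; it does not address the underlying bug discussed in the channel.
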
*** rasca has quit IRC | 16:28 | |
*** rasca_ has joined #tripleo | 16:28 | |
*** rasca_ has quit IRC | 16:28 | |
*** jpena is now known as jpena|off | 16:35 | |
*** rbowen has quit IRC | 16:36 | |
*** rbowen has joined #tripleo | 16:37 | |
trozet | hey dprince, why do i see this common usage of taking a comma delimited list, and converting it to a string in the templates: https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/neutron-base.yaml#L90 ? | 16:38 |
trozet | is it to make hiera happy or something? | 16:38 |
dprince | trozet: yes, I'm writing a hiera hook that will change some of this soon | 16:39 |
beagles | mrunge: it's interesting... I'm going to try it in something other than an overcloud deploy and see what happens | 16:39 |
*** mcornea has quit IRC | 16:39 | |
trozet | dprince: so if you dont do that step, how does it look in hieradata? does it not work at all? | 16:39 |
dprince | trozet: it was a work around that has been copied around a bunch of times | 16:40 |
*** padkrish has quit IRC | 16:40 | |
dprince | trozet: it just depends. It goes through Heat, the os-collect-config, os-apply-config, a shell script and then finally to a hiera file | 16:40 |
*** jpich has quit IRC | 16:40 | |
trozet | dprince: cause in ODL we do this: https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/opendaylight-api.yaml#L32, with no str_replace, would that not work by default? (in puppet-odl it expects that value as an array) | 16:40 |
mrunge | beagles, it fails quite quickly, once you're issuing openstack overcloud deploy... | 16:41 |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Only run ceilometer::db::sync on bootstrap node https://review.openstack.org/380414 | 16:41 |
dprince | trozet: probably not I think do the all the conversion in place | 16:41 |
dprince | trozet: just depends on what you need it to be. Presumably you'd want that comma_delimited_list to become an array | 16:42 |
dprince | trozet: and that isn't what happens without the work arounds I think | 16:42 |
dprince | trozet: but once we have a proper hiera hook, and avoid os-apply-config it'll all work out much better I think | 16:42 |
dprince | trozet: Json -> Yaml is quite clean | 16:42 |
trozet | dprince: ok I *think* why it works for us in OPNFV is we override it in templates like this https://github.com/trozet/opnfv-tht/blob/stable/colorado/environments/opendaylight.yaml#L21 | 16:43 |
trozet | dprince: so is that is quoted, so is that maybe being parsed into heat as a single item list? | 16:44 |
trozet | so that is quoted* | 16:44 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Move the rest of static roles resource registry entries to j2 https://review.openstack.org/380443 | 16:44 |
EmilienM | ccamacho: ^ please take care of your backports | 16:44 |
trozet | dprince: yeah i think that is why it works for us | 16:45 |
*** panda is now known as panda|bbl | 16:45 | |
EmilienM | d0ugal: could you check if all required tripleoclient patches are backported or WIP? | 16:46 |
*** electrofelix has quit IRC | 16:47 | |
EmilienM | panda|bbl: https://review.openstack.org/#/c/363674/ it works | 16:51 |
EmilienM | panda|bbl: with brackets this time ! | 16:51 |
EmilienM | panda|bbl: problem solved | 16:52 |
panda|bbl | EmilienM: yes, but all the non ipv6 fail | 16:52 |
EmilienM | (bracket problem) | 16:52 |
EmilienM | panda|bbl: yeah? where | 16:52 |
*** padkrish has joined #tripleo | 16:53 | |
panda|bbl | EmilienM: it took 1h30 to deploy, I think we are just in time ... | 16:53 |
EmilienM | why other jobs fail? | 16:53 |
EmilienM | http://logs.openstack.org/74/363674/31/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha-newton/2f62c29/logs/postci.txt.gz#_2016-09-30_15_25_52_000 | 16:54 |
*** masco has joined #tripleo | 16:54 | |
EmilienM | Error: /Stage[main]/Neutron::Db::Sync/Exec[neutron-db-sync]: Failed to call refresh: neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugin.ini upgrade heads returned 1 instead of one of [0] | 16:54 |
EmilienM | it sounds unrelated | 16:54 |
panda|bbl | EmilienM: some failed on delorean errors | 16:54 |
EmilienM | dprince, thrash: could you please review ipv6 fixes ? https://review.openstack.org/#/q/status:open+topic:bug/1629279 | 16:55 |
EmilienM | panda|bbl: not sure it's related to our work though | 16:55 |
panda|bbl | EmilienM: so, recheck ? | 16:55 |
dprince | trozet: exactly | 16:55 |
EmilienM | panda|bbl: the bracket issue is fixed. can we try now without touching the timeout? Can you push a patch without changing timeout? | 16:56 |
panda|bbl | EmilienM: sure | 16:56 |
EmilienM | panda|bbl: thanks | 16:56 |
EmilienM | panda|bbl: nice catch BTW | 16:56 |
openstackgerrit | Tim Rozet proposed openstack/tripleo-heat-templates: Fixes missing provider mappings for OpenDaylight https://review.openstack.org/380445 | 16:56 |
*** derekh has quit IRC | 16:59 | |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci: Add IPv6 network configuration for ipv6 job types https://review.openstack.org/363674 | 16:59 |
*** bana_k has quit IRC | 17:02 | |
dprince | EmilienM: probably would've combined the t-h-t patches, but whatever | 17:07 |
*** akuznetsov has quit IRC | 17:07 | |
EmilienM | dprince: the order was on purpose, to not have coordination_url set twice | 17:07 |
*** akuznetsov has joined #tripleo | 17:07 | |
dprince | EmilienM: question on this one though https://review.openstack.org/#/c/380291/2 | 17:08 |
dprince | EmilienM: I don't think it hurts to set it twice if only 1 is active. Just a matter of opinion | 17:09 |
EmilienM | dprince: I thought about it and I was thinking of one day having multiple instances of Redis | 17:09 |
EmilienM | either way works for me | 17:10 |
*** padkrish has quit IRC | 17:10 | |
*** mburned_out is now known as mburned | 17:12 | |
EmilienM | dprince: if you have another approach, you can push over it, I would be happy with your change | 17:12 |
*** akuznetsov has quit IRC | 17:12 | |
pabelanger | So, we are seeing LaunchNetworkException: Unable to find public IP of server from tripleo-test-cloud-rh1 | 17:13 |
pabelanger | anybody mind checking it out? | 17:13 |
*** padkrish has joined #tripleo | 17:14 | |
EmilienM | dprince, mwhahaha: ok I can add a new parameter in the profile | 17:17 |
mwhahaha | well no | 17:17 |
EmilienM | ah | 17:17 |
mwhahaha | cause you need to provide it to a bunch of classes | 17:17 |
mwhahaha | there's a few issues with this change looking at it again | 17:18 |
mwhahaha | we should move the hiera lookups to parameters on the classes (it will aid in tests later) | 17:18 |
EmilienM | right | 17:18 |
mwhahaha | the other thing is that it doesn't seem like we have a consistent way to pass these configuration options that get reused all over the place to profiles | 17:18 |
*** morazi has joined #tripleo | 17:18 | |
mwhahaha | he's right that we probably shouldn't just use random hiera keys | 17:18 |
mwhahaha | but i'm not sure what the cleanest option would be | 17:18 |
mwhahaha | other than writing essentially a params class for services where we can shove this info | 17:19 |
*** mhenkel has quit IRC | 17:19 | |
mwhahaha | like a tripleo::config::{redis,mysql,mongo,etc} | 17:19 |
EmilienM | ah, I like this idea | 17:20 |
*** flepied has quit IRC | 17:20 | |
EmilienM | maybe we can write this tripleo::config in Ocata | 17:22 |
EmilienM | and leave it for now | 17:22 |
mwhahaha | sure | 17:22 |
mwhahaha | we can move the hiera stuff to params later as well | 17:22 |
mwhahaha | i started with ceilometer collector profile when i wrote the tests | 17:23 |
EmilienM | yes, it will be very useful for unit testing | 17:23 |
thrash | EmilienM: sure | 17:23 |
mwhahaha | found we always include ceilometer::db::sync | 17:23 |
EmilienM | unit tests will give us interesting feedback, I'm sure ;-) | 17:24 |
beekneemech | pabelanger: Can you send me another recent failure UUID? I've been watching our node list, and it almost seems like there is a race going on here. | 17:25 |
*** masco has quit IRC | 17:25 | |
beekneemech | I've seen situations where a group of nodes gets started, comes up, then I see a bunch of errors in the grafana page, then the instances get vips. | 17:25 |
*** absubram has joined #tripleo | 17:26 | |
pabelanger | beekneemech: 964d0c1e-d490-4db4-b185-a980db1c9bba is the most recent failure | 17:26 |
*** padkrish has quit IRC | 17:27 | |
*** paramite has quit IRC | 17:28 | |
beekneemech | pabelanger: How long does nodepool wait for it to get fip? | 17:28 |
*** akshai_ has quit IRC | 17:29 | |
beekneemech | I have a nova list loop running every 15 seconds, and for that instance I see it in build state, then active without a fip, and by the next iteration it's already gone. | 17:29 |
*** amoralej is now known as amoralej|off | 17:31 | |
pabelanger | beekneemech: I want to say 60 sec | 17:31 |
beekneemech | Now I wish I had timestamped the loop... | 17:32 |
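
A minimal sketch of the kind of timestamped polling loop beekneemech wishes for here: run a listing command at a fixed interval and record the UTC time of each run next to its output. The command in the demo is a placeholder, not the real `nova list` call, and the `poll` helper is an illustrative name.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone


def poll(command, interval, iterations, sleep=time.sleep):
    """Run command repeatedly and return (UTC ISO timestamp, stdout) pairs."""
    samples = []
    for i in range(iterations):
        # Record the time before each run so state changes can be lined up later.
        stamp = datetime.now(timezone.utc).isoformat()
        result = subprocess.run(
            command, stdout=subprocess.PIPE, universal_newlines=True
        )
        samples.append((stamp, result.stdout))
        if i < iterations - 1:
            sleep(interval)
    return samples


if __name__ == "__main__":
    # Placeholder command; the loop in the log ran every 15 seconds.
    placeholder = [sys.executable, "-c", "print('instance list placeholder')"]
    for stamp, output in poll(placeholder, interval=1, iterations=3):
        print(stamp, output, end="")
```

With timestamps on each sample, the gap between an instance going ACTIVE and disappearing can be compared directly against the launcher's wait period.
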
*** dhill_ has quit IRC | 17:36 | |
*** mhenkel has joined #tripleo | 17:37 | |
*** dhill_ has joined #tripleo | 17:39 | |
beekneemech | pabelanger: This is what I'm seeing for that instance: http://paste.openstack.org/show/583654/ | 17:42 |
beekneemech | This is three consecutive runs of nova list with a 15 second sleep between. | 17:42 |
beekneemech | In the first the instance is in BUILD state. | 17:42 |
beekneemech | The second it has booted and is running, but no fip yet. | 17:42 |
beekneemech | And the third it's already gone. | 17:43 |
*** trown is now known as trown|lunch | 17:43 | |
beekneemech | Note that 27790980-e98e-46a4-98e4-592d059cc4ec follows a similar path. | 17:43 |
*** akshai has joined #tripleo | 17:43 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Replace per role manifests with a common role manifest https://review.openstack.org/378736 | 17:44 |
openstackgerrit | Merged openstack/puppet-tripleo: Update .gitreview for stable/newton https://review.openstack.org/379599 | 17:45 |
*** dtrainor has quit IRC | 17:45 | |
*** dtrainor has joined #tripleo | 17:45 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation https://review.openstack.org/378750 | 17:48 |
beekneemech | pabelanger: I wonder if the large number of instances in the nodepool tenant is causing a timing issue. | 17:48 |
openstackgerrit | Merged openstack/tripleo-common: Update .gitreview for stable/newton https://review.openstack.org/379600 | 17:48 |
beekneemech | The launch failures seem to coincide with the time when the list servers action starts taking longer. | 17:48 |
openstackgerrit | Merged openstack/instack-undercloud: Update .gitreview for stable/newton https://review.openstack.org/379598 | 17:48 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Update .gitreview for stable/newton https://review.openstack.org/379602 | 17:49 |
pabelanger | beekneemech: that is what I am thinking too, I have a patch up to revert back to 50 nodes to see if that fixes things. I only noticed these failures once we hit 60 | 17:49 |
beekneemech | pabelanger: We weren't anywhere near 60 when the failures started though. | 17:51 |
beekneemech | The first one started around 14:00, at which point I see only 40 test nodes. | 17:51 |
pabelanger | beekneemech: Right, but I think nodepool is launching more nodes at once, which we've seen in the past to be a problem. EG: 13:56UTC today, we launched 6 nodes, had 2 failures | 17:53 |
*** ayoung has quit IRC | 17:54 | |
*** ayoung has joined #tripleo | 18:01 | |
*** fzdarsky is now known as fzdarsky|afk | 18:07 | |
*** ccamacho has quit IRC | 18:08 | |
*** baseball has joined #tripleo | 18:23 | |
*** sudipto has joined #tripleo | 18:27 | |
*** sudipto has quit IRC | 18:28 | |
*** mhenkel has quit IRC | 18:29 | |
*** sudipto has joined #tripleo | 18:30 | |
*** sudipto_ has joined #tripleo | 18:30 | |
*** cdearborn has quit IRC | 18:31 | |
trozet | dsneddon: hey I was wondering for variables like "neutron_api_node_ips" - they used to exist pre-composable services, now I see they are all missing, except keystone and memcached. Are these being dynamically generated? I see this: https://github.com/openstack/tripleo-heat-templates/blob/master/network/ports/net_ip_list_map.yaml | 18:32 |
trozet | I see references in puppet-tripleo still to variable: manifests/profile/base/neutron/midonet.pp: $neutron_api_node_ips = hiera('neutron_api_node_ips', ''), | 18:34 |
sudipto_ | Can someone help me with a problem i am facing while running the disk image builder with rhel7? | 18:34 |
*** cylopez has quit IRC | 18:35 | |
*** bana_k has joined #tripleo | 18:35 | |
thrash | sudipto_: what's up? | 18:36 |
*** bswartz has joined #tripleo | 18:38 | |
sudipto_ | thrash, i am trying to build rhel7 on ppc64le, it seemed to work fine, after i tweaked a few of the scripts. But it fails abruptly (without telling me why) - after showing this message: http://paste.openstack.org/show/583658/ while finalising the boot loader... | 18:38 |
sudipto_ | can you please give me any clues? | 18:38 |
*** cdearborn has joined #tripleo | 18:39 | |
thrash | sudipto_: You aren't doing this for OpenStack/TripleO purposes..? | 18:39 |
sudipto_ | thrash, yeah for openstack sahara precisely. | 18:40 |
thrash | You are talking about running dib directly. | 18:40 |
thrash | or no? | 18:40 |
sudipto_ | thrash, yeah running DIB with DIB_LOCAL_IMAGE set to my qcow2 | 18:40 |
thrash | Sorry... thought you were asking for help with the higher-level image building we do for ooo. | 18:40 |
thrash | I would probably need to see a bit more context, plus your dib command line to know where to even start looking. | 18:41 |
sudipto_ | thrash, i can get you that info... | 18:42 |
ayoung | EmilienM, what do I do in order to use the os_workers fact from https://review.openstack.org/#/c/375146/3 in keystone? is it :workers => $os_workers, | 18:46 |
ayoung | and do I need some include, or should the fact be picked up by default? | 18:46 |
sudipto_ | thrash, not sure if this is more context or - you want the complete logs: http://paste.openstack.org/show/583660/ | 18:47 |
mwhahaha | ayoung: it's picked up if you have an up to date openstacklib | 18:47 |
*** trown|lunch is now known as trown | 18:47 | |
ayoung | mwhahaha, and how is puppet expecting it? | 18:47 |
mwhahaha | https://github.com/openstack/puppet-openstacklib/blob/master/lib/facter/os_workers.rb | 18:47 |
trown | ayoung: https://review.openstack.org/#/c/375968/ is the patch for puppet-gnocchi | 18:47 |
ayoung | $os_workers or $::os_workers | 18:47 |
ayoung | trown, thanks | 18:47 |
mwhahaha | $::os_workers cause it's a fact | 18:48 |
trozet | slagle: maybe you could help me out if you are available | 18:48 |
mwhahaha | either technically should work | 18:48 |
mwhahaha | ayoung: so long as your openstacklib has that file in the module path, facter should pick up on it and provide os_workers for you. You can test with facter -p os_workers | 18:49 |
ayoung | trown, how'd you come up with 2 for the spec tests? | 18:49 |
thrash | sudipto_: and what was your command line? Did you pass any elements? | 18:49 |
trown | ayoung: that was there before, but it is the minimum it will be set to | 18:50 |
ayoung | trown, OK thanks | 18:50 |
mwhahaha | ayoung: we also provide it in the openstack tests as 2 | 18:50 |
sudipto_ | no i didn't pass anything. Just did diskimage-create rhel7 vm | 18:50 |
mwhahaha | https://github.com/openstack/puppet-openstack_spec_helper/blob/master/lib/puppet-openstack_spec_helper/defaults.rb#L7 | 18:50 |
sudipto_ | thrash, | 18:50 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Telemetry: add redis_password hiera parameter https://review.openstack.org/380291 | 18:52 |
openstackgerrit | Merged openstack/puppet-tripleo: telemetry: normalize coordination_url https://review.openstack.org/380306 | 18:54 |
openstackgerrit | Merged openstack/tripleo-heat-templates: telemetry: remove coordination_url hiera settings https://review.openstack.org/380310 | 18:54 |
*** kbyrne has quit IRC | 19:01 | |
*** mbozhenko has joined #tripleo | 19:06 | |
*** kbyrne has joined #tripleo | 19:06 | |
*** mbozhenko has quit IRC | 19:10 | |
*** panda|bbl is now known as panda | 19:12 | |
*** openstackgerrit has quit IRC | 19:18 | |
*** openstackgerrit has joined #tripleo | 19:18 | |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Make the ovb-updates job work again https://review.openstack.org/374406 | 19:21 |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Force rebuild of images https://review.openstack.org/380521 | 19:21 |
*** ayoung has quit IRC | 19:22 | |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Test with scheduler hints https://review.openstack.org/378040 | 19:23 |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Add support for testing predictable placement https://review.openstack.org/378014 | 19:23 |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Test hostname map https://review.openstack.org/378017 | 19:23 |
thrash | sudipto_: and you're running this on a ppc build? | 19:26 |
leifmadsen | what is a good way to debug what an overcloud deploy is doing... right now I think it is stuck in a loop, but it hasn't failed. Just sitting in CREATE_IN_PROGRESS right now | 19:28 |
leifmadsen | it seems to be stuck at this stack... | 19:29 |
leifmadsen | | AllNodesDeploySteps | 2ab29de0-a2aa-4730-bd | OS::TripleO::PostDeploy | CREATE_IN_PROGRESS | 2016-09-30T18:55:53Z | | 19:29 |
leifmadsen | 19:29 | |
leifmadsen | unless I'm possibly just being overly impatient... there are 6 steps that run right? in the logs I seem to see it doing similar operations, but it does seem to be increasing from step 3, to 4, and now on 5 | 19:30 |
leifmadsen | I'll just keep letting it run I suppose and see what happens | 19:30 |
*** ayoung has joined #tripleo | 19:33 | |
dsneddon | trozet, O functionality has been moved to puppet/services/vip-hosts.yaml. | 19:33 |
leifmadsen | ok, so I was just being impatient... it just finished | 19:34 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Telemetry: add redis_password hiera parameter https://review.openstack.org/380546 | 19:34 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: telemetry: normalize coordination_url https://review.openstack.org/380547 | 19:34 |
dsneddon | trozet, I think that functionality was moved to puppet/services/vip_hosts.yaml. | 19:34 |
beekneemech | [heat-admin@overcloud-controller-0 ~]$ heat stack-list | grep IN_PROGRESS | wc -l | 19:35 |
beekneemech | 40 | 19:35 |
trozet | dsneddon: I'm looking at https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.j2.yaml#L321 | 19:35 |
beekneemech | This is not going to end well... | 19:35 |
trozet | dsneddon: you see keystone_public_api_node_ips is still there | 19:35 |
trozet | dsneddon: so i think <service name>_node_ips is dynamically generated | 19:35 |
trozet | dsneddon: but for opendaylight, 'opendaylight_api_node_ips' never gets generated...the dynamic substitution is a little hard for me to follow - but my current guess is that the ODL service in role_data is actually called 'OpenDaylight' (rather than OpenDaylightApi) which is being used in this dynamic search for the network (OpenDaylightApiNetwork) | 19:37 |
dsneddon | trozet, Do you see that parameter in "heat stack-show overcloud"? | 19:37 |
trozet | so that name in role_data actually matters | 19:37 |
trozet | dsneddon: it was reported in bugzilla so i dont have a setup | 19:38 |
trozet | dsneddon: but i asked Itzik to give it a shot changing the service name in role data from OpenDaylight to OpenDaylightApi and see if that fixes it - I compared everything to NeutronApi and its the only difference I can find | 19:39 |
*** paramite has joined #tripleo | 19:47 | |
dsneddon | trozet, Yeah, my read on this is the same as yours. It's based on role.name. | 19:50 |
trozet | dsneddon: ok it gets a little crazy with those long jinja/substitution expressions across multiple files hhe | 19:50 |
trozet | heh* | 19:50 |
trozet | dsneddon: thanks for looking at it | 19:50 |
dsneddon | trozet, Yeah. I need to figure out how to make Heat tell me what the stack will output without actually deploying. It'll help me see what the eventual yaml looks like. Do you know a way? | 19:51 |
EmilienM | beekneemech: do you mind reviewing 2 backports https://review.openstack.org/380546 and https://review.openstack.org/380547 please ? | 19:52 |
trozet | dsneddon: no but that would be a handy dev tool as well I think | 19:52 |
EmilienM | beekneemech: and fyi we have ipv6 job green | 19:53 |
trozet | dsneddon: like just dump heat parameter and hiera output | 19:53 |
*** sudipto has quit IRC | 19:53 | |
*** sudipto_ has quit IRC | 19:53 | |
*** akuznetsov has joined #tripleo | 19:53 | |
beekneemech | EmilienM: Yeah, my local ipv6 test just passed too. | 19:53 |
*** akuznetsov has quit IRC | 19:54 | |
panda | EmilienM: yes, but non-ipv6 is still red | 19:55 |
panda | EmilienM: I think we could ignore liberty job, the change is not meant for that, but at least newton should pass, but isn't | 19:55 |
*** akuznetsov has joined #tripleo | 19:56 | |
*** baseball has quit IRC | 19:56 | |
*** rbowen has quit IRC | 19:56 | |
panda | EmilienM: again the neutron problem | 19:57 |
*** akuznetsov has quit IRC | 20:03 | |
panda | EmilienM: ok, it was present also before your patches, so we're good | 20:03 |
EmilienM | panda: ok | 20:04 |
panda | EmilienM: so liberty is failing because it can't get a puppet-tripleo package with you change, newton is failinf or an error not dedpdenent on you patch, and mitaka is failing but postci is bugged and we can't understand what's happening. | 20:06 |
*** tdasilva has quit IRC | 20:06 | |
panda | ... looks like I'm typing blindfolded .. | 20:06 |
*** trozet has quit IRC | 20:07 | |
*** akuznetsov has joined #tripleo | 20:07 | |
*** tiswanso has quit IRC | 20:08 | |
*** tiswanso has joined #tripleo | 20:12 | |
*** dprince has quit IRC | 20:13 | |
*** tiswanso has quit IRC | 20:17 | |
*** akuznetsov has quit IRC | 20:22 | |
*** tzumainn has joined #tripleo | 20:22 | |
*** jayg is now known as jayg|g0n3 | 20:32 | |
*** tiswanso has joined #tripleo | 20:36 | |
*** tdasilva has joined #tripleo | 20:40 | |
*** absubram has quit IRC | 20:40 | |
*** tiswanso has quit IRC | 20:41 | |
*** jbadiapa has quit IRC | 20:42 | |
*** Goneri has quit IRC | 20:44 | |
*** cllewellyn_ has joined #tripleo | 20:50 | |
*** gchamoul is now known as gchamoul|afk | 20:55 | |
*** fultonj has quit IRC | 20:59 | |
*** tiswanso has joined #tripleo | 21:01 | |
*** tiswanso has quit IRC | 21:05 | |
*** mbozhenko has joined #tripleo | 21:06 | |
*** akshai has quit IRC | 21:07 | |
*** mbozhenko has quit IRC | 21:10 | |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Add aodh profile rspec testing https://review.openstack.org/374402 | 21:12 |
*** trown is now known as trown|outtypewww | 21:12 | |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Add ceilometer profile rspec testing https://review.openstack.org/380406 | 21:12 |
openstackgerrit | Alex Schultz proposed openstack/puppet-tripleo: Only run ceilometer::db::sync on bootstrap node https://review.openstack.org/380414 | 21:12 |
*** dsavineau has quit IRC | 21:19 | |
*** mhenkel has joined #tripleo | 21:19 | |
*** rlandy has quit IRC | 21:26 | |
*** jeckersb is now known as jeckersb_gone | 21:27 | |
*** cdearborn has quit IRC | 21:28 | |
*** Goneri has joined #tripleo | 21:33 | |
*** tzumainn has quit IRC | 21:36 | |
*** cllewellyn_ has quit IRC | 21:39 | |
*** TSCHAK_ has joined #tripleo | 21:46 | |
*** null_ref has quit IRC | 21:47 | |
*** TSCHAK has quit IRC | 21:49 | |
*** jtomasek has quit IRC | 21:50 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo: Remove unused pacemaker profiles https://review.openstack.org/379957 | 21:57 |
*** flepied has joined #tripleo | 22:04 | |
*** morazi has quit IRC | 22:05 | |
*** lblanchard has quit IRC | 22:11 | |
*** TSCHAK has joined #tripleo | 22:13 | |
*** TSCHAK_ has quit IRC | 22:16 | |
*** tiswanso has joined #tripleo | 22:17 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo: Fix the timeout for pacemaker systemd resources https://review.openstack.org/380665 | 22:18 |
*** tiswanso has quit IRC | 22:22 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates: Change rabbitmq queues HA mode from ha-all to ha-exactly https://review.openstack.org/379584 | 22:52 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo: Change rabbitmq queues HA mode from ha-all to ha-exactly https://review.openstack.org/379586 | 22:52 |
*** mbozhenko has joined #tripleo | 23:07 | |
*** mbozhenko has quit IRC | 23:12 | |
*** Goneri has quit IRC | 23:20 | |
*** bana_k has quit IRC | 23:36 | |
*** dhill_ has quit IRC | 23:53 | |
*** absubram has joined #tripleo | 23:54 |
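
The guess trozet and dsneddon discuss between 19:35 and 19:50 (that a `<service name>_node_ips` key only gets generated when the service's name in role_data matches a `<Name>Network` entry) can be sketched as a toy model. This is only an illustration of that guess, not the actual tripleo-heat-templates logic: the function `node_ips_keys`, the `(role_name, key_prefix)` pairs, and the `internal_api` network values are all hypothetical.

```python
def node_ips_keys(services, service_net_map):
    """Return the '<prefix>_node_ips' keys a name-based lookup would emit.

    services: list of (role_name, key_prefix) pairs (hypothetical shape).
    A key is emitted only when '<role_name>Network' exists in the map,
    which is the guess from the discussion, not confirmed behaviour.
    """
    keys = []
    for role_name, key_prefix in services:
        if f"{role_name}Network" in service_net_map:
            keys.append(f"{key_prefix}_node_ips")
    return keys


# Hypothetical network map; the network values are placeholders.
service_net_map = {
    "NeutronApiNetwork": "internal_api",
    "OpenDaylightApiNetwork": "internal_api",
}

# Service named 'OpenDaylight': no 'OpenDaylightNetwork' entry, so no key.
before = node_ips_keys(
    [("NeutronApi", "neutron_api"), ("OpenDaylight", "opendaylight_api")],
    service_net_map,
)

# Service renamed to 'OpenDaylightApi': the lookup now matches.
after = node_ips_keys(
    [("NeutronApi", "neutron_api"), ("OpenDaylightApi", "opendaylight_api")],
    service_net_map,
)

print(before)
print(after)
```

Under this model the rename trozet asked Itzik to try is what makes `opendaylight_api_node_ips` appear; whether the real templates behave this way is exactly what was still unverified in the channel.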