*** chem has quit IRC | 00:04 | |
*** saneax is now known as saneax_AFK | 00:20 | |
*** trown|outtypewww is now known as trown | 01:06 | |
*** dmacpher-afk has quit IRC | 01:16 | |
openstackgerrit | Merged openstack-infra/tripleo-ci: set -o pipefail in deploy.sh https://review.openstack.org/287433 | 01:34 |
*** panda has quit IRC | 01:40 | |
*** panda has joined #tripleo | 01:40 | |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix hierarchy for Linux Bonds and Linux Bridges https://review.openstack.org/290224 | 01:53 |
*** lazy_prince has joined #tripleo | 01:54 | |
*** dmacpher has joined #tripleo | 02:08 | |
*** dsneddon has quit IRC | 02:18 | |
*** shivrao has quit IRC | 02:25 | |
*** lazy_prince has quit IRC | 02:36 | |
*** slagle changes topic to "TripleO | stable/liberty CI failing: https://bugs.launchpad.net/tripleo/+bug/1554846 | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 02:50 | |
*** yamahata has quit IRC | 02:53 | |
*** Marga__ has joined #tripleo | 03:06 | |
*** anande has joined #tripleo | 03:07 | |
*** Marga_ has quit IRC | 03:10 | |
openstackgerrit | Ryan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 03:18 |
*** Marga__ has quit IRC | 03:20 | |
*** Marga_ has joined #tripleo | 03:20 | |
*** rhallisey has quit IRC | 03:23 | |
*** anande has quit IRC | 03:24 | |
*** veteran has joined #tripleo | 03:25 | |
*** akuznetsov has joined #tripleo | 03:31 | |
*** trozet has joined #tripleo | 03:34 | |
*** xinwu has quit IRC | 03:39 | |
*** yamahata has joined #tripleo | 03:47 | |
openstackgerrit | Nisha Agarwal proposed openstack/diskimage-builder: Add psmisc to the packages for ironic-agent https://review.openstack.org/289364 | 03:49 |
*** links has joined #tripleo | 04:03 | |
*** dmacpher has quit IRC | 04:05 | |
*** masco has joined #tripleo | 04:09 | |
*** shivrao has joined #tripleo | 04:09 | |
*** xinwu has joined #tripleo | 04:19 | |
*** dmacpher has joined #tripleo | 04:22 | |
*** Marga_ has quit IRC | 04:23 | |
*** tserong has quit IRC | 04:28 | |
*** veteran has quit IRC | 04:33 | |
*** xinwu has quit IRC | 04:34 | |
*** akuznetsov has quit IRC | 04:35 | |
*** Marga_ has joined #tripleo | 04:35 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Add Rabbit IPv6 only support https://review.openstack.org/269058 | 04:36 |
*** yamahata has quit IRC | 04:37 | |
*** yamahata has joined #tripleo | 04:38 | |
*** dmacpher has quit IRC | 04:38 | |
*** tserong has joined #tripleo | 04:38 | |
*** rlandy has quit IRC | 04:38 | |
*** Marga_ has quit IRC | 04:40 | |
*** mannidi has joined #tripleo | 04:42 | |
*** shivrao has quit IRC | 04:47 | |
*** dmacpher has joined #tripleo | 04:51 | |
*** Marga_ has joined #tripleo | 04:53 | |
*** Marga_ has quit IRC | 04:54 | |
*** Marga_ has joined #tripleo | 04:54 | |
*** Marga_ has quit IRC | 04:54 | |
*** Marga_ has joined #tripleo | 04:54 | |
*** Marga_ has quit IRC | 04:55 | |
*** Marga_ has joined #tripleo | 04:55 | |
*** shivrao has joined #tripleo | 04:57 | |
*** veteran has joined #tripleo | 05:10 | |
*** saneax_AFK is now known as saneax | 05:10 | |
*** xinwu has joined #tripleo | 05:29 | |
*** shivrao_ has joined #tripleo | 05:43 | |
*** shivrao has quit IRC | 05:47 | |
*** shivrao_ is now known as shivrao | 05:47 | |
*** shivrao has quit IRC | 05:52 | |
*** rcernin has joined #tripleo | 06:09 | |
*** bnemec has quit IRC | 06:11 | |
*** admin0 has joined #tripleo | 06:11 | |
*** cmyster has joined #tripleo | 06:16 | |
*** jtomasek has joined #tripleo | 06:19 | |
*** admin0 has quit IRC | 06:23 | |
*** akuznetsov has joined #tripleo | 06:30 | |
*** trozet has quit IRC | 06:36 | |
*** oshvartz has joined #tripleo | 06:53 | |
*** chlong has quit IRC | 06:54 | |
*** akuznetsov has quit IRC | 06:54 | |
*** akuznetsov has joined #tripleo | 07:02 | |
*** chlong has joined #tripleo | 07:07 | |
*** akuznetsov has quit IRC | 07:10 | |
*** saneax is now known as saneax_AFK | 07:12 | |
*** bvandenh has joined #tripleo | 07:22 | |
*** saneax_AFK is now known as saneax | 07:23 | |
*** bandini_ has joined #tripleo | 07:33 | |
*** bandini has quit IRC | 07:33 | |
*** rdopiera has joined #tripleo | 07:35 | |
*** bvandenh has quit IRC | 07:41 | |
*** liverpooler has joined #tripleo | 07:46 | |
*** dmacpher has quit IRC | 07:47 | |
*** bandini_ is now known as bandini | 07:51 | |
*** ohamada has joined #tripleo | 07:56 | |
*** ccamacho has joined #tripleo | 07:57 | |
*** gfidente has joined #tripleo | 07:58 | |
*** admin0 has joined #tripleo | 08:00 | |
*** NobodyCa1 has joined #tripleo | 08:00 | |
*** admin0 has quit IRC | 08:02 | |
*** jprovazn has joined #tripleo | 08:03 | |
*** NobodyCam has quit IRC | 08:03 | |
*** admin0 has joined #tripleo | 08:04 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Add missing createUser line to /etc/snmp/snmpd.conf https://review.openstack.org/290317 | 08:05 |
*** lazy_prince has joined #tripleo | 08:07 | |
*** fgimenez has joined #tripleo | 08:08 | |
*** fgimenez has quit IRC | 08:08 | |
*** fgimenez has joined #tripleo | 08:08 | |
*** d0ugal has joined #tripleo | 08:09 | |
gfidente | yesterday liberty/ci had an all green job but now it's failing consistently :( | 08:11 |
gfidente | looking into that | 08:11 |
*** dshulyak has joined #tripleo | 08:13 | |
*** aufi has joined #tripleo | 08:14 | |
*** xinwu has quit IRC | 08:17 | |
*** mikelk has joined #tripleo | 08:18 | |
*** mkovacik has joined #tripleo | 08:22 | |
*** tzumainn has joined #tripleo | 08:28 | |
gfidente | marios, last thing we needed was jenkins gate failing to run the tests | 08:33 |
gfidente | unbelievable :) | 08:33 |
openstackgerrit | Merged openstack/tripleo-common: Adds override for the overcloud node user in upgrade-non-controller https://review.openstack.org/289871 | 08:34 |
*** rwsu has joined #tripleo | 08:35 | |
*** athomas has joined #tripleo | 08:36 | |
marios | gfidente: yeah i am rechecking all the things | 08:40 |
marios | gfidente: i also saw this one which is weird https://review.openstack.org/#/c/263991/ | 08:40 |
gfidente | marios, but it's something to do with rake | 08:40 |
gfidente | it's out of our domain | 08:41 |
*** pblaho has quit IRC | 08:41 | |
*** veteran has quit IRC | 08:43 | |
gfidente | now on liberty/ci failing | 08:44 |
gfidente | Neutron::Agents::Ml2::Ovs/Service[neutron-ovs-agent-service]/ensure: change from stopped to running failed: Could not start Service[neutron-ovs-agent-service]: Execution of '/usr/bin/systemctl start neutron-openvswitch-agent' returned 1: Job for neutron-openvswitch-agent.service failed because a timeout was exceeded | 08:44 |
*** hjensas has quit IRC | 08:46 | |
*** aufi has quit IRC | 08:46 | |
gfidente | but why should this be happening only on liberty I am not sure | 08:47 |
*** shardy has joined #tripleo | 08:47 | |
*** openstackgerrit has quit IRC | 08:47 | |
*** openstackgerrit has joined #tripleo | 08:48 | |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes https://review.openstack.org/252041 | 08:51 |
*** akrivoka has joined #tripleo | 08:54 | |
*** links has quit IRC | 08:55 | |
openstackgerrit | Steven Hardy proposed openstack-infra/tripleo-ci: Move tripleo.sh into tripleo-ci repo https://review.openstack.org/272210 | 08:55 |
*** ifarkas has joined #tripleo | 08:56 | |
*** mbound has joined #tripleo | 09:04 | |
*** jaosorior has joined #tripleo | 09:06 | |
*** aufi has joined #tripleo | 09:06 | |
jaosorior | marios: Hey dude, regarding this review https://review.openstack.org/#/c/287199/8 I gave an answer. Thing is, those ports (like the ironic one) are the ports on which the internal service is listening | 09:08 |
marios | jaosorior: ok thanks for checking ... i wasn't sure if that was the case like i poked at https://forge.puppetlabs.com/puppetlabs/haproxy | 09:08 |
jaosorior | and those ports are not set up in the loadbalancer.pp. They're the ones that are set up by the respective puppet manifests | 09:08 |
*** dtantsur|afk is now known as dtantsur | 09:10 | |
marios | jaosorior: ok thanks revoted | 09:10 |
jaosorior | marios: Thanks dude! | 09:10 |
jaosorior | marios: But probably it would make sense to pass in those ports the internal services are listening on, by some means | 09:11 |
jaosorior | in another refactor of that manifest | 09:11 |
marios | jaosorior: well if it is useful/requested but yeah this is big enough a change | 09:11 |
jaosorior | but yeah.... this manifest is getting too big | 09:11 |
shardy | https://review.openstack.org/#/c/289466/ is the path forward for too-big manifests IMO | 09:14 |
jaosorior | shardy: Ah! I had seen that CR. Dude, I'm all in :D | 09:15 |
shardy | e.g moving most stuff into puppet-tripleo so it's not all deployed directly via SoftwareDeployments | 09:15 |
marios | shardy: nice | 09:15 |
jaosorior | I was waiting for the CI result yesterday, and forgot to check it today | 09:15 |
jaosorior | it looks promising | 09:16 |
*** lucas-dinner is now known as lucasagomes | 09:16 | |
*** sshnaidm has quit IRC | 09:17 | |
*** sshnaidm has joined #tripleo | 09:18 | |
*** jistr has joined #tripleo | 09:19 | |
*** admin0 has quit IRC | 09:23 | |
*** yamahata has quit IRC | 09:23 | |
*** admin0 has joined #tripleo | 09:24 | |
*** jcoufal has joined #tripleo | 09:25 | |
*** mgould has joined #tripleo | 09:34 | |
*** electrofelix has joined #tripleo | 09:34 | |
*** paramite has joined #tripleo | 09:36 | |
*** panda has quit IRC | 09:39 | |
*** panda has joined #tripleo | 09:40 | |
*** tremble has joined #tripleo | 09:41 | |
*** tremble has joined #tripleo | 09:41 | |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Enable glance-api show_image_direct_url for COW https://review.openstack.org/290358 | 09:43 |
dtantsur | morning folks! are you aware that the gate is probably broken by the puppet-lint job? | 09:44 |
dtantsur | e.g. http://logs.openstack.org/97/289297/1/check/gate-instack-undercloud-puppet-lint/dbd330c/console.html | 09:45 |
*** derekh has joined #tripleo | 09:46 | |
jaosorior | damn, well, apparently the puppet lint gates are broken :/ | 09:46 |
marios | thanks dtantsur explains https://review.openstack.org/#/c/263991/ (saw this this morning) | 09:48 |
jaosorior | marios: http://logs.openstack.org/97/289297/1/check/gate-instack-undercloud-puppet-lint/dbd330c/console.html#_2016-03-09_09_42_31_501 got any idea where this gate-instack-undercloud-puppet-lint stuff is? | 09:51 |
marios | gfidente: didn't you say it was something to do with rake? | 09:51 |
jaosorior | in some repo | 09:52 |
marios | http://logs.openstack.org/91/263991/3/check/gate-tripleo-heat-templates-puppet-lint/c231396/console.html#_2016-03-09_08_32_41_578 "NoMethodError: undefined method `last_comment' for #<Rake::Application:0x000000013ab500>" | 09:52 |
marios | jaosorior: no sorry i don't | 09:53 |
*** paramite is now known as paramite|afk | 09:56 | |
jistr | marios: this is the only script-delivery.yaml thing we don't have in master yet, right? I'll base the channel switching on top of that change. https://review.openstack.org/#/c/289896/ | 09:56 |
jistr | marios: and good morning :) | 09:57 |
marios | jistr: good morning, double checking | 09:58 |
marios | jistr: there is still the swift fixup at https://review.openstack.org/#/c/289826/ as well | 09:59 |
*** stendulker has joined #tripleo | 10:00 | |
jistr | marios: ah right, thanks. It's only touching the swift .sh file so it shouldn't conflict with the channel switching changes to the upgrade initialization YAMLs. So i hope it's conflict-safe to base this on top of the ceph patch. Thx! | 10:02 |
marios | jistr: yeah just mentioning it the other one makes sense if your basing another change onto it | 10:03 |
marios | you're | 10:03 |
*** nico_auv has joined #tripleo | 10:03 | |
*** olap has joined #tripleo | 10:12 | |
*** hjensas has joined #tripleo | 10:21 | |
*** paramite|afk is now known as paramite | 10:30 | |
*** nico_auv has quit IRC | 10:40 | |
*** tosky has joined #tripleo | 10:41 | |
*** dtantsur is now known as dtantsur|brb | 10:49 | |
gfidente | DEAR RAKE, BUNDLE, OR WHATEVER, SURE, YOU HAVE YOUR CHANCES TO BREAK TOO... BUT WHY TODAY? | 10:50 |
gfidente | I'M SURE PEOPLE WOULD LISTEN TO YOU ANOTHER DAY | 10:51 |
*** mbound has quit IRC | 10:52 | |
gfidente | jaosorior, did your change just pass lint? | 10:53 |
gfidente | https://review.openstack.org/287465 | 10:53 |
shadower | what is undercloud using swift for? | 10:54 |
adarazs | shadower: somebody probably thought the undercloud will work more swiftly when added. ;) | 10:55 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Update enable-tls.yaml with new endpoints https://review.openstack.org/287465 | 10:55 |
openstackgerrit | Moshe Levi proposed openstack/diskimage-builder: Add lshw package to ironic-agent https://review.openstack.org/289233 | 10:55 |
*** dsneddon has joined #tripleo | 10:56 | |
shadower | adarazs: lol | 10:56 |
trown | inspection stores data in swift | 10:57 |
adarazs | maybe even heat uses it for something /o\ | 10:58 |
shadower | so, I'm seeing swift-proxy-server taking up tonnes of memory during the deployment | 10:58 |
trown | glance is swift backed too.... I would bet the memory spike is loading images to glance | 10:59 |
*** mbound has joined #tripleo | 10:59 | |
shadower | oh is it? | 10:59 |
trown | which could very well be some swift bug, or it might be expected... | 10:59 |
shadower | trown: I thought glance just used the images on disk. Because yeah, that would definitely fit what I'm seeing | 11:00 |
gfidente | shadower, heat uses it | 11:00 |
shadower | yea I realised that too now | 11:00 |
gfidente | shadower, and the deployartifacts thing which we are trying to use to install the modules on the nodes 'dynamically' | 11:00 |
gfidente | but for the deployartifacts I think we could use any http | 11:01 |
shadower | gfidente: yep. Although this spike happens when booting the overcloud vms so the glance/image hypothesis fits my observations best so far | 11:01 |
gfidente | ah this I didn't know though ;) | 11:02 |
shadower | I'm installing new centos + undercloud on another machine -- will see if the newer packages help any | 11:02 |
shadower | gfidente: yeah, just checked the config and glance does indeed seem to use swift | 11:05 |
*** stendulker has quit IRC | 11:07 | |
shardy | shadower: this was discussed on the ML here: | 11:13 |
shardy | http://lists.openstack.org/pipermail/openstack-dev/2016-March/088579.html | 11:13 |
*** rhallisey has joined #tripleo | 11:14 | |
shardy | shadower: derekh has seen the same thing, I'm not sure if there is a bug reference, but it seems like a swift bug | 11:14 |
*** adarazs is now known as adarazs_lunch | 11:14 | |
shardy | (or, we're configuring swift wrong I guess) | 11:15 |
derekh | shardy: the reason I didn't create a bug is because I didn't know which it was, we should probably create one anyways and reassign as appropriate | 11:16 |
shardy | derekh: ack, I wasn't complaining, just saying I wasn't sure if we'd tracked down if/where the bug was :) | 11:16 |
derekh | shardy: it's all good, didn't think you were complaining | 11:17 |
shardy | It does seem wrong, I'd expect swift to chunkify the request and not load the whole thing into ram | 11:17 |
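The expectation shardy describes, handling an upload chunk by chunk instead of holding the whole image in RAM, can be sketched as follows. This is only an illustration of the general technique, not swift's or glance's actual code; the function name `copy_in_chunks` and the 64 KiB chunk size are assumptions made for the example.

```python
import io


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Copy src to dst one chunk at a time.

    Returns (total_bytes, largest_chunk) so the caller can see that no
    single read ever exceeded chunk_size, i.e. peak buffering stays bounded.
    """
    total = 0
    largest = 0
    while True:
        chunk = src.read(chunk_size)  # at most chunk_size bytes in memory
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


if __name__ == "__main__":
    # Hypothetical 1 MiB "image" held in an in-memory stream for the demo.
    image = io.BytesIO(b"x" * (1024 * 1024))
    out = io.BytesIO()
    print(copy_in_chunks(image, out))
```

With this shape, memory use is tied to `chunk_size` rather than to the size of the image being uploaded.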
derekh | yup | 11:18 |
*** dshulyak has quit IRC | 11:18 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6 https://review.openstack.org/270700 | 11:20 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Change the default value for NetworkNexusVxlanGlobalConfig https://review.openstack.org/287096 | 11:23 |
*** nijaba has quit IRC | 11:24 | |
*** nijaba has joined #tripleo | 11:25 | |
*** nijaba has quit IRC | 11:25 | |
*** nijaba has joined #tripleo | 11:25 | |
rhallisey | derekh, morning | 11:25 |
rhallisey | I still can't figure out the gate :/ | 11:25 |
rhallisey | still works locally | 11:25 |
rhallisey | but it keeps hanging here http://logs.openstack.org/15/288915/6/check-tripleo/gate-tripleo-ci-f22-containers/a1d1bcb/console.html#_2016-03-08_22_34_29_838 | 11:26 |
derekh | rhallisey: looking | 11:26 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6 https://review.openstack.org/285538 | 11:27 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Make External Load Balancer templates work with IPv6 https://review.openstack.org/270700 | 11:28 |
*** admin0 has quit IRC | 11:30 | |
*** admin0 has joined #tripleo | 11:31 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Neutron settings https://review.openstack.org/289270 | 11:32 |
*** nico_auv has joined #tripleo | 11:33 | |
*** ohamada_ has joined #tripleo | 11:36 | |
*** ohamada has quit IRC | 11:36 | |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes https://review.openstack.org/252041 | 11:36 |
*** dshulyak has joined #tripleo | 11:36 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Update VNI and TunnelID ranges. https://review.openstack.org/289276 | 11:37 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Set swift replicas = min(device_count, replicas) https://review.openstack.org/289274 | 11:38 |
derekh | rhallisey: I'm looking on the compute node for that test you linked, is there normally a journal log on these machines? | 11:38 |
openstackgerrit | Merged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Cinder settings https://review.openstack.org/289271 | 11:39 |
rhallisey | derekh, yes | 11:39 |
rhallisey | derekh, can you watch as the ci runs? O.o | 11:39 |
rhallisey | that would help me figure this out because heat does not return an error on this | 11:40 |
rhallisey | and the create just hangs... | 11:40 |
rhallisey | so confusing.. | 11:40 |
rhallisey | derekh, look for docker-storage-setup in the journal | 11:41 |
rhallisey | that runs right after cloud-init on atomic | 11:41 |
*** chlong has quit IRC | 11:42 | |
derekh | rhallisey: Yup, I can if we recheck one, but before we do that, you say there is normally a journal log, do you know if it's persisted to disk? | 11:43 |
derekh | rhallisey: On our centos nodes it usually is, the end of the ci job tars it up for us to look at | 11:44 |
openstackgerrit | Merged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Heat settings https://review.openstack.org/289272 | 11:44 |
openstackgerrit | Merged openstack/tripleo-heat-templates: puppet: allow config of ad-hoc Glance settings https://review.openstack.org/289273 | 11:44 |
derekh | rhallisey: It's usually somewhere like this on the centos nodes var/log/journal/f32e0af35637b5dfcbedcb0a1de8dca1/system.journal | 11:45 |
*** masco has quit IRC | 11:46 | |
rhallisey | doesn't look like it persists.. | 11:46 |
*** links has joined #tripleo | 11:48 | |
* shardy notes we've apparently given up on passing CI and code reviews for stable/liberty | 11:49 | |
gfidente | so yesterday we had an all green job in liberty | 11:51 |
gfidente | then something went wrong and we hit this | 11:51 |
gfidente | Error: Could not start Service[neutron-ovs-agent-service]: Execution of '/usr/bin/systemctl start neutron-openvswitch-agent' returned 1: Job for neutron-openvswitch-agent.service failed because a timeout was exceeded. | 11:51 |
gfidente | interestingly this is happening for netiso and non netiso, only in liberty/ci | 11:51 |
jistr | jlibosva just came over to chat about this minutes ago. it's been caused by a change in neutron | 11:52 |
gfidente | but I still couldn't figure the root cause of it | 11:52 |
jistr | we may not be hitting this in master CI because of being pinned maybe | 11:52 |
derekh | rhallisey: lets run something like this in get_host_info, it should get you the whole log | 11:53 |
jistr | gfidente: found out with jlibosva, but i'm not sure how to fix it best | 11:53 |
derekh | journalctl | gzip - > journal.log.gz | 11:53 |
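A rough Python equivalent of the one-liner above, in case the log collection needs to be scripted rather than run from a shell. It is a sketch under assumptions: the helper name `dump_command_gzipped` is made up for this example, and nothing here is taken from the actual `get_host_info` function in tripleo-ci.

```python
import gzip
import shutil
import subprocess
import sys


def dump_command_gzipped(cmd, out_path):
    """Run cmd and stream its stdout into a gzip file at out_path.

    Output is copied in blocks, so a large journal is never held fully
    in memory. Raises CalledProcessError if the command fails.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            gzip.open(out_path, "wb") as fh:
        shutil.copyfileobj(proc.stdout, fh)
    # Popen's context manager has waited for the process by this point.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# On a systemd host this mirrors `journalctl | gzip - > journal.log.gz`:
#   dump_command_gzipped(["journalctl"], "journal.log.gz")
```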
jistr | lemme type it :) | 11:53 |
derekh | rhallisey: mind if I edit your tripleo-ci patch a little? | 11:53 |
rhallisey | derekh, ya go ahead | 11:53 |
rhallisey | derekh, let me add back in 2 deps | 11:53 |
rhallisey | I took out two to see if I could get different logs | 11:53 |
derekh | rhallisey: ok go for it | 11:53 |
openstackgerrit | Ryan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 11:55 |
rhallisey | derekh, ok go ahead | 11:55 |
gfidente | jistr, oh thanks | 11:55 |
gfidente | we were not seeing this yesterday though | 11:55 |
jistr | gfidente, shardy: they implemented a change in ovs-agent that it only notifies systemd that it's up when it has connected to rabbit. This has been done apparently to fix recovery of controllers. L3 agent needs to start after OVS agent starts *and is connected to rabbit* it seems. So now they only do systemd notify when OVS connects to rabbit. | 11:55 |
gfidente | so we need to depend compute on controller | 11:56 |
gfidente | did I get it right? | 11:56 |
jistr | gfidente, shardy: which is ok for controllers, but given that we deploy controllers and computes in parallel, OVS agent can start much earlier on compute, and hit systemd timeout because rabbit isn't running yet | 11:56 |
jistr | gfidente: yes, exactly, but... | 11:57 |
gfidente | it's not nice, we tried to avoid that | 11:57 |
gfidente | but we had to do it for ceph fwiw | 11:57 |
jistr | i think it's valid to question if the neutron should behave like this in the first place | 11:57 |
shardy | Yeah, it will cause a significant increase in deployment time if we have to do that | 11:57 |
gfidente | as if we weren't slowing down enough using netiso :) | 11:58 |
jistr | yea. It basically means that controllers + computes cannot be deployed in parallel now, and i think that applies to every deployment, not just TripleO. | 11:58 |
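The race jistr describes can be modelled with a few lines of arithmetic. This is a toy model only: the function name, the timestamps and the 90 second start timeout are illustrative assumptions, not values read from the neutron unit files or from the CI logs.

```python
def agent_start_result(agent_started_at, rabbit_up_at, start_timeout):
    """Model an agent that only signals readiness once rabbit is reachable.

    All arguments are seconds on a shared clock. The agent becomes ready
    at whichever is later: its own start or rabbit coming up. If it had to
    wait longer than start_timeout, the service manager gives up.
    """
    ready_at = max(agent_started_at, rabbit_up_at)
    waited = ready_at - agent_started_at
    return "running" if waited <= start_timeout else "timeout"


if __name__ == "__main__":
    # Controller: rabbit is configured on the same node before the agent starts.
    print(agent_start_result(agent_started_at=300, rabbit_up_at=200, start_timeout=90))
    # Compute deployed in parallel: the agent starts long before rabbit exists.
    print(agent_start_result(agent_started_at=60, rabbit_up_at=200, start_timeout=90))
```

In this model the controller case reports `running` and the parallel compute case reports `timeout`, which matches the failure pattern discussed above.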
*** mannidi has quit IRC | 11:58 | |
gfidente | shardy, can I reprise this $hit https://review.openstack.org/#/c/249149/ over to liberty to see how it goes? | 11:59 |
gfidente | just to see if it passes so at least we have data | 11:59 |
shardy | We can probably minimise the increase by still building the nodes in parallel and only adding a depends_on to ComputeNodesPostDeployment I guess | 12:00 |
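The ordering shardy suggests, building the nodes in parallel and serializing only the post-deployment step, can be sketched as a dependency graph. Only `ComputeNodesPostDeployment` is named in the discussion; the other resource names and the helper `deployment_waves` are assumptions for illustration, and this is plain Python (3.9+ for `graphlib`), not a Heat template.

```python
from graphlib import TopologicalSorter


def deployment_waves(depends_on):
    """Group resources into waves; everything in one wave can run in parallel.

    depends_on maps each resource to the set of resources it must wait for.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical graph: nodes have no mutual dependency, but the compute
# post-deployment step waits for the controller post-deployment step.
GRAPH = {
    "Controller": set(),
    "Compute": set(),
    "ControllerNodesPostDeployment": {"Controller"},
    "ComputeNodesPostDeployment": {"Compute", "ControllerNodesPostDeployment"},
}

if __name__ == "__main__":
    for wave in deployment_waves(GRAPH):
        print(wave)
```

The node builds land in the same wave, so only the final configuration step pays the serialization cost.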
gfidente | yeah that's what the change was supposed to do | 12:00 |
shardy | gfidente: Ah, yeah I see | 12:01 |
shardy | I tested another patch posted by zaneb which serialized the RG's, and that added like 2mins to my local deployment time (about 20%) | 12:01 |
shardy | I guess this may be somewhat less | 12:01 |
gfidente | shardy, yeah I remember that | 12:01 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 12:02 |
gfidente | yes but the good thing of zaneb's change was it preserved the order on update too if I remember correctly | 12:02 |
*** jaosorior has quit IRC | 12:02 | |
gfidente | while this patch does not, it's causing update to run on computes first | 12:02 |
shardy | Yeah | 12:02 |
derekh | rhallisey: ^ that should get you the journal log in the compute tarball | 12:02 |
shardy | well the post-deploy parts will still be serialized | 12:02 |
gfidente | so I'll update it just to see if it passes and -1 | 12:02 |
shardy | so we just need to ensure all the update stuff happens in -post.yaml | 12:02 |
*** jaosorior has joined #tripleo | 12:03 | |
*** trown is now known as trown|commute | 12:03 | |
shardy | the issue previously was UpdateDeployment was in e.g controller.yaml | 12:03 |
rhallisey | derekh, k thanks | 12:03 |
shardy | we can/should probably move it, given that now we're doing update stuff in the Pre/Post puppet resources instead | 12:03 |
slagle | jistr: are they fixing the neutron agent issue on their side, or is it on us? | 12:04 |
*** aufi has quit IRC | 12:05 | |
jistr | slagle: it's open. No decision on neutron side yet. Personally i'd +2 the patch that gfidente linked above in the meantime. That should be the fastest way to unblock us for stable/liberty merging, and then a revert can be discussed (both in neutron and in tripleo). | 12:06 |
slagle | jistr: gfidente yep, ok. lets go ahead and backport that to get CI running on it | 12:08 |
michchap_ | 'It basically means that controllers + computes cannot be deployed in parallel now, and i think that applies to every deployment' | 12:08 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Make compute nodes deployment depend on controller https://review.openstack.org/290438 | 12:08 |
michchap_ | We need to model the dependencies between components rather than using 'step' | 12:08 |
michchap_ | via a services registry or other mechanism so that any node can query for the state of a given service and make decisions about when to include classes | 12:09 |
michchap_ | Then you can deploy control and compute at the same time, and compute's neutron classes just wait for rabbit to be available in the registry before being included | 12:10 |
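A minimal in-process sketch of the registry idea michchap_ outlines: a node blocks until a named service is reported available, then proceeds. The class `ServiceRegistry` and its methods are hypothetical names for this example; a real deployment would need a networked store, which is not modelled here.

```python
import threading


class ServiceRegistry:
    """Toy registry: services are marked available, consumers wait on them."""

    def __init__(self):
        self._cond = threading.Condition()
        self._available = set()

    def mark_available(self, service):
        """Record that a service is up and wake every waiting consumer."""
        with self._cond:
            self._available.add(service)
            self._cond.notify_all()

    def wait_for(self, service, timeout=None):
        """Block until the service is available; False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(
                lambda: service in self._available, timeout=timeout
            )


if __name__ == "__main__":
    registry = ServiceRegistry()

    def compute_node():
        # The compute side only includes its neutron classes once rabbit is up.
        if registry.wait_for("rabbitmq", timeout=5):
            print("compute: rabbitmq available, starting neutron agent")

    worker = threading.Thread(target=compute_node)
    worker.start()
    registry.mark_available("rabbitmq")  # controller side reports rabbit
    worker.join()
```

The timeout on `wait_for` matters for the weakness raised later in the discussion: without one, a broken deployment just sits waiting instead of failing.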
*** thrash|g0ne is now known as thrash | 12:10 | |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Add dib element to generate logical volumes https://review.openstack.org/252041 | 12:12 |
jistr | slagle: ATM i think the new neutron behavior might do more harm than good (parallel deployment of controllers + computes is a good feature imho). Perhaps a better fix would be "let's not require L3 agent to start up after L2 agent, let it be smart and wait for L2 agent to appear before starting to perform actions that require L2 agent running", that might not be easy to achieve though, idk. jlibosva is at lunch now, we can chat more afterwards. | 12:12 |
gfidente | michchap_, that'd be a nice to have yes | 12:13 |
gfidente | I think we're going to face more of this type of issues with composable services | 12:13 |
michchap_ | gfidente: I had a ...bad? idea of doing it using a custom hiera backend that queries pacemaker | 12:14 |
*** chlong has joined #tripleo | 12:14 | |
gfidente | I am not sure myself really, we probably don't want to rely on pacemaker and even if we were to, it might not have a very granular understanding of things, unlike when you set dependencies in the puppet manifest | 12:15 |
michchap_ | but I heard pacemaker is taking a back seat so I can't really use it as a registry going forward | 12:15 |
michchap_ | right | 12:15 |
michchap_ | so I've done this before using consul | 12:15 |
gfidente | we probably also don't want to have anything which is implementation specific, so we can't rely on the status of the puppet resources | 12:16 |
michchap_ | and it was actually a really elegant system/solution. Its main weakness was that the system state converges so you can't tell when it's sitting in a broken state, so failed test runs were often longer than they normally would be | 12:16 |
gfidente | so it looks to me we might just make the heat templates more granular and keep the dependencies in there | 12:16 |
michchap_ | right, that makes sense | 12:17 |
gfidente | shardy, would like it | 12:17 |
michchap_ | (I'm relatively new to tripleo) would that mean we'd end up with one heat resource per puppet 'profile' | 12:17 |
gfidente | :P | 12:18 |
*** dtantsur|brb is now known as dtantsur | 12:18 | |
gfidente | michchap_, from a two minutes chat, that's not a terrible idea yeah... not a puppet 'profile' but rather a service 'profile' though | 12:18 |
gfidente | so it's not implementation specific | 12:19 |
michchap_ | gfidente: yep | 12:20 |
*** lucasagomes is now known as lucas-hungry | 12:21 | |
*** Erming_ has joined #tripleo | 12:24 | |
*** Marga__ has joined #tripleo | 12:24 | |
*** greghaynes has quit IRC | 12:27 | |
*** Erming__ has quit IRC | 12:27 | |
*** chlong has quit IRC | 12:27 | |
*** rhallisey has quit IRC | 12:27 | |
*** Marga_ has quit IRC | 12:27 | |
*** jaosorior has quit IRC | 12:27 | |
*** rasca has quit IRC | 12:27 | |
*** rasca has joined #tripleo | 12:27 | |
*** jaosorior has joined #tripleo | 12:27 | |
shadower | shardy, derekh: thanks. So yeah, I'm hitting the swift-proxy issue and I may as well spend some time digging into it | 12:36 |
dtantsur | folks, while gate is not feeling well, could you review a couple of documentation changes please? https://review.openstack.org/281449 and https://review.openstack.org/284115 | 12:36 |
* shadower looks | 12:38 | |
*** rhallisey has joined #tripleo | 12:39 | |
*** chlong has joined #tripleo | 12:40 | |
*** greghaynes has joined #tripleo | 12:40 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Change the CinderISCSIHelper to lioadm https://review.openstack.org/283712 | 12:42 |
*** admin0 has quit IRC | 12:44 | |
openstackgerrit | Merged openstack/tripleo-docs: Update documentation on fetching introspection data https://review.openstack.org/281449 | 12:45 |
*** admin0 has joined #tripleo | 12:46 | |
dtantsur | shadower, thnx! | 12:46 |
*** mannidi has joined #tripleo | 12:54 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add missing createUser line to /etc/snmp/snmpd.conf https://review.openstack.org/263991 | 12:54 |
EmilienM | gfidente: for the rabbit / mongo patches - I suggest we move forward as they are | 12:57 |
EmilienM | they'll require some cleanup but not this week I think | 12:58 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: initialization command/snippet https://review.openstack.org/290465 | 12:58 |
*** devvesa has joined #tripleo | 13:02 | |
*** morazi has joined #tripleo | 13:04 | |
openstackgerrit | Merged openstack/tripleo-docs: Clarify profile matching documentation https://review.openstack.org/284115 | 13:04 |
*** lazy_prince has quit IRC | 13:05 | |
*** jayg|g0n3 is now known as jayg | 13:05 | |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common: Example yaml for building images https://review.openstack.org/290468 | 13:06 |
slagle | jistr: gfidente : sounds like they are going to revert the neutron packaging change on liberty | 13:08 |
pradk | can i request some reviews on https://review.openstack.org/#/c/289435/ | 13:10 |
jaosorior | gfidente: It hadn't. Still was running into the same error. But now I re-checked it. | 13:10 |
*** dprince has joined #tripleo | 13:14 | |
*** pblaho has joined #tripleo | 13:15 | |
*** jaosorior has quit IRC | 13:15 | |
*** adarazs_lunch is now known as adarazs | 13:15 | |
*** jaosorior has joined #tripleo | 13:15 | |
*** aufi has joined #tripleo | 13:17 | |
*** dmacpher has joined #tripleo | 13:17 | |
*** trown|commute is now known as trown | 13:18 | |
adarazs | are the tripleo gates still busted or does it make sense to recheck? | 13:18 |
slagle | adarazs: liberty is still down | 13:18 |
*** weshay has joined #tripleo | 13:19 | |
gfidente | adarazs, do you know if puppet-memcached and memcached were updated for centos? | 13:23 |
adarazs | gfidente: nope, I don't. | 13:23 |
gfidente | ok yesterday derekh pinged apevac and number80 about it | 13:23 |
gfidente | without those the ipv6 deployment is still going to fail | 13:24 |
openstackgerrit | Miles Gould proposed openstack/python-tripleoclient: [WIP] Use Ironic API v1.11 to support ENROLL state https://review.openstack.org/272206 | 13:28 |
*** lucas-hungry is now known as lucasagomes | 13:28 | |
gfidente | slagle, shardy, jistr hey the depend fixed the issue in ci fwiw | 13:32 |
gfidente | I was looking at zuul, results are coming | 13:32 |
gfidente | how come we're not seeing this in the master branch though and only in liberty? | 13:32 |
gfidente | derekh, ^^ ? | 13:33 |
shardy | gfidente: master is pinned via the current-tripleo link? | 13:33 |
jistr | gfidente: is it because of the delorean pin? | 13:33 |
jistr | jinx! | 13:33 |
shardy | hehe ;) | 13:33 |
gfidente | so I am not sure how the pin works then | 13:33 |
shardy | gfidente: master is periodically promoted (until recently this was manual, but I know the plan was for the periodic job to update it) | 13:34 |
gfidente | can it be pointed to a specific version of any arbitrary package? | 13:34 |
shardy | gfidente: but for stable/liberty we just use stable/liberty trunk | 13:34 |
slagle | gfidente: which depends on is this? | 13:34 |
gfidente | slagle, I meant the heat depend | 13:34 |
gfidente | slagle, https://review.openstack.org/#/c/290438/ | 13:34 |
trown | gfidente: memcached should be updated in delorean deps repo | 13:35 |
openstackgerrit | Jaume Devesa proposed openstack/tripleo-docs: Extending the image build information https://review.openstack.org/270290 | 13:35 |
slagle | gfidente: oh ok | 13:35 |
slagle | gfidente: they are supposedly going to revert the packaging change | 13:35 |
slagle | gfidente: which i think is better than us reacting | 13:35 |
*** links has quit IRC | 13:35 | |
gfidente | I think so as well | 13:36 |
gfidente | but we're completely blocked | 13:36 |
gfidente | until the revert happens in the packages | 13:36 |
gfidente | or can we pin some stable/liberty thing too? | 13:36 |
shadower | gfidente: do you have the instructions for line 20 written somewhere? The gist in https://etherpad.openstack.org/p/tripleo-ipv6-support is 404 on me | 13:36 |
gfidente | shadower, https://gist.github.com/gfidente/a7c72a68fddc20ddae86 | 13:37 |
jaosorior | puppet lint gate seems to be working :D | 13:38 |
jaosorior | any +As for this CR? https://review.openstack.org/#/c/287199/ | 13:38 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-docs: Extend the root device selection documentation https://review.openstack.org/290492 | 13:38 |
dtantsur | mgould, could you proof-read this please ^^? | 13:38 |
* dtantsur writes all the docs \o/ | 13:38 | |
dtantsur | :) | 13:38 |
*** panda has quit IRC | 13:39 | |
*** panda has joined #tripleo | 13:40 | |
slagle | gfidente: the revert is landed | 13:40 |
slagle | just waiting on the build | 13:40 |
derekh | gfidente: we can pin the stable branch with DELOREAN_STABLE_REPO_URL but we never have, if the packaging revert will be fairly quick then we may as well wait, it will take us 2 hours anyway to get it in | 13:42 |
shadower | gfidente: thanks! And the comments should be run on the undercloud, right? | 13:44 |
gfidente | undercloud yes | 13:45 |
shadower | cheers | 13:45 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate https://review.openstack.org/289565 | 13:45 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job https://review.openstack.org/289966 | 13:46 |
trown | derekh: gfidente, there is a current-passed-ci link already on the stable/liberty branch | 13:46 |
slagle | gfidente: the build is done | 13:47 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates: Fixup systemctl_swift stop/start during the controller upgrade https://review.openstack.org/290501 | 13:47 |
trown | it is a bit old because we have had failures to build from source there for a few days | 13:47 |
slagle | gfidente: i just rechecked https://review.openstack.org/#/c/289758/ | 13:47 |
gfidente | thanks guys | 13:47 |
slagle | and https://review.openstack.org/#/c/289355/ | 13:47 |
slagle | the first and last in the series :) | 13:48 |
rhallisey | derekh, rechecking.. The image failed to dl. Should be fixed this time | 13:48 |
jaosorior | slagle: You mean for stable/liberty? | 13:48 |
dtantsur | folks, do you know if the puppet-lint problem got fixed? | 13:48 |
slagle | jaosorior: yes | 13:48 |
derekh | trown: ya, we should switch to that | 13:48 |
jaosorior | slagle: After that is it possible to still merge bug fixes to stable/liberty? | 13:48 |
slagle | dtantsur: i saw it pass this morning | 13:48 |
marios | jistr: i'll rebase https://review.openstack.org/290501 onto your review in a bit | 13:48 |
slagle | jaosorior: yea, assuming ci starts passing | 13:49 |
marios | jistr: though actually, it doesn't touch any common files so should be ok | 13:49 |
jaosorior | slagle: alright, makes sense | 13:49 |
dtantsur | slagle, first problems I saw on my patches were 10am UTC | 13:49 |
slagle | dtantsur: ok, i reverified this one https://review.openstack.org/#/c/283712/ about an hour ago and it passed | 13:49 |
jaosorior | dtantsur: Are you talking about the puppet-lint issues? At least on master they now seem to work | 13:50 |
*** trown is now known as trown|brb | 13:50 | |
dtantsur | good, thanks! | 13:50 |
*** trown|brb is now known as trown | 13:52 | |
*** mannidi_ has joined #tripleo | 13:53 | |
jistr | marios: hmm we probably need to remove the call to systemctl_swift from the object node .sh (mentioned on review), so we might need to touch the swift .sh file too and then a rebase might be needed | 13:53 |
*** mannidi has quit IRC | 13:54 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-common: Create a test flavor for the pingtest VM https://review.openstack.org/289845 | 13:54 |
marios | jistr: check on review (this is only on controllers so ok here) | 13:55 |
dtantsur | also folks, I know you'll hate me... but please review https://review.openstack.org/263309 | 13:57 |
dtantsur | I know that it's huge :( but I found no ways to fix it step-by-step: too many wrong things all over the place | 13:57 |
*** ohamada_ has quit IRC | 13:57 | |
jistr | marios: ah right, ok. Storage nodes can be fixed in a separate patch then. | 13:57 |
marios | jistr: yeah... also confirmed that when you set ControllerEnableSwiftStorage: false we only have swift-proxy on the controllers | 13:58 |
jistr | marios: so it works, cool. I bet i saw a BZ that said otherwise. worksforme :) | 13:59 |
*** Goneri has joined #tripleo | 14:00 | |
jistr | marios: +2'd the swift controller patch. Since the swift fix is just removing the swift-proxy service from a list, we could put it directly into the existing fixup patch. Up to you. https://review.openstack.org/#/c/289826/ | 14:01 |
jistr | *the swift node fix | 14:01 |
marios | jistr: yeah sure if you wana add it there is fine | 14:01 |
jistr | marios: ack will do | 14:02 |
marios | jistr: i'm looking at your other one now https://review.openstack.org/#/c/290465/1 | 14:02 |
jistr | cool, thx | 14:02 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Generate fedora-atomic images using dib https://review.openstack.org/287167 | 14:04 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Upgrades: object storage node upgrade fix https://review.openstack.org/289826 | 14:07 |
*** pradk has quit IRC | 14:08 | |
*** rlandy has joined #tripleo | 14:09 | |
*** saneax is now known as saneax_AFK | 14:13 | |
shardy | Can anyone see what I've done wrong in https://review.openstack.org/#/c/272210/ ? | 14:15 |
shardy | it's failing to find overcloud.yaml after the tripleo.sh move and associated changes, but I can't currently spot why | 14:15 |
*** lblanchard has joined #tripleo | 14:16 | |
gfidente | shardy, I saw that message when one of the environment files wasn't available | 14:17 |
*** dustins has joined #tripleo | 14:18 | |
shardy | gfidente: Ah, that could be it - misleading error if so! | 14:18 |
gfidente | shardy, in https://review.openstack.org/#/c/272210/11 do you need to copy the tenantvm template too? | 14:19 |
gfidente | I was just editing it in https://review.openstack.org/289845 | 14:19 |
gfidente | so can you incorporate those changes? :) | 14:19 |
shardy | gfidente: I have moved that template already, and it appears to be failing on overcloud create, not the pingtest | 14:20 |
jprovazn | dsneddon: ping | 14:21 |
shardy | we'll have to repropose those changes to tripleo-ci, assuming we manage to land this soon and I don't end up rebasing | 14:21 |
gfidente | shardy, yeah I was asking if you wanted to include those changes | 14:21 |
shardy | IMO we shouldn't include any changes in the move, just copy from a specific tripleo-common revision | 14:21 |
gfidente | so we have them there when it lands :) | 14:21 |
gfidente | okay ... booooring | 14:21 |
shardy | gfidente: IMHO we shouldn't do that, sorry | 14:21 |
gfidente | :) | 14:21 |
shardy | If they're urgent, land them to tripleo-common and I'll rebase | 14:22 |
*** NobodyCa1 is now known as NobodyCam | 14:23 | |
EmilienM | gfidente: is it ok to move forward with https://review.openstack.org/#/c/270154/ ? I'll work on the cleanup later, we might need Puppet functions for this need | 14:24 |
derekh | trown: did you say that memcached is updated in delorean deps ? I don't see it | 14:24 |
EmilienM | dprince: hey, can you please revisit your -1 on https://review.openstack.org/#/c/269058/ ? for a first iteration I think it's ok to have it like this, and we can add a hiera level for rabbit later | 14:25 |
gfidente | trown, we need python-memcached too | 14:27 |
gfidente | to make it pass | 14:27 |
dprince | EmilienM: sure, I can let that pass for now | 14:28 |
EmilienM | dprince: thanks | 14:29 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add Rabbit IPv6 only support https://review.openstack.org/269058 | 14:31 |
adarazs | gfidente, derekh: can you help me figure out where's the problem coming from in the ipv6 gate?: https://trello.com/c/wiJrOgZN/40-ipv6-add-upstream-ipv6-gate-jobs | 14:42 |
*** mannidi_ has quit IRC | 14:43 | |
derekh | adarazs: could that commit that just merged be relevant ? ^ | 14:43 |
adarazs | derekh: looks like it, thanks! | 14:44 |
adarazs | does "Depends-On:" check out a change (with all the dependent changes) or does it cherry pick? | 14:47 |
adarazs | so should I include all the changes in the commit message? | 14:48 |
adarazs | gfidente: ^ | 14:48 |
gfidente | depends-on is only used by CI | 14:48 |
gfidente | checkout will pick all the changes | 14:48 |
gfidente | cherry-pick will apply the change on your existing tree | 14:49 |
adarazs | gfidente: yeah, I know, I want to use it for CI on my change. :) | 14:49 |
gfidente | (oh well, checkout doesn't go across different repos) | 14:49 |
adarazs | https://review.openstack.org/289445 -- you told me yesterday that I only need that single depends-on | 14:49 |
*** jprovazn has quit IRC | 14:49 | |
adarazs | to get all the necessary IPv6 changes | 14:49 |
adarazs | is that really true? | 14:49 |
gfidente | ah yes | 14:50 |
adarazs | because https://review.openstack.org/#/c/269058/8 should have been included. | 14:50 |
gfidente | I see the question now | 14:50 |
*** oshvartz has quit IRC | 14:50 | |
adarazs | at least I think it should have | 14:50 |
gfidente | I thought the problem was how to checkout the tree of deps | 14:51 |
adarazs | nope | 14:51 |
gfidente | so yes depends_on is doing checkout afaik | 14:51 |
adarazs | so that should have pulled that rabbit change... anyway, I did a recheck, we'll see if it fails the same way, maybe that rabbit change actually doesn't fix this error. | 14:53 |
derekh | Anybody else want to describe their dev setup here? so newcomers get an idea of what HW setup they may be able to use for tripleo, https://etherpad.openstack.org/p/tripleo-dev-env-census | 14:53 |
*** bnemec has joined #tripleo | 14:53 | |
*** admin0 has quit IRC | 14:54 | |
openstackgerrit | Dougal Matthews proposed openstack/tripleo-docs: Update python-rdomanager-oscplugin to python-tripleoclient https://review.openstack.org/290553 | 14:55 |
*** admin0 has joined #tripleo | 14:56 | |
gfidente | adarazs, I'm checking the rabbit config to see if the changes which that submission does were applied | 14:57 |
gfidente | adarazs, yeah not in there | 14:59 |
adarazs | gfidente: okay, so now it should work as it was merged. | 14:59 |
gfidente | well it should break on mongo | 15:00 |
adarazs | gfidente: is it still true that this change should pull all the remaining necessary changes? https://review.openstack.org/272089 | 15:00 |
gfidente | derekh, ^^ do you know if that is the case and if we can control this behaviour? | 15:00 |
gfidente | I though when depends-on was pointing to a change we would git checkout the change (the entire tree the change needs to be applied cleanly) | 15:01 |
gfidente | *thought | 15:01 |
gfidente | is it doing git cherry-pick instead? | 15:01 |
derekh | gfidente: it should pull in the entire tree, whats missing ? | 15:03 |
gfidente | it apparently didn't for adarazs' change | 15:04 |
adarazs | gfidente: not sure if that rabbit change was part of the dependency for the ceph patch though. /o\ | 15:04 |
rhallisey | derekh looks like ci is failing everywhere on setup | 15:04 |
rhallisey | nvm that just for all my patchs | 15:05 |
rhallisey | patches | 15:05 |
rhallisey | wtf.. | 15:05 |
derekh | rhallisey: 2016-03-09 14:47:55.933 | /opt/stack/new/tripleo-common/scripts/tripleo.sh: line 164: REPO_PREFIX: unbound variable | 15:07 |
rhallisey | ya rebasing | 15:07 |
derekh | rhallisey: does one of your patches change that^ | 15:08 |
*** pblaho has quit IRC | 15:08 | |
derekh | ok | 15:08 |
rhallisey | nope | 15:08 |
*** saneax_AFK is now known as saneax | 15:08 | |
derekh | gfidente: adarazs which patch didn't get pulled in that was supposed to? | 15:08 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate https://review.openstack.org/289565 | 15:08 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job https://review.openstack.org/289966 | 15:08 |
openstackgerrit | Ryan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 15:09 |
*** pradk has joined #tripleo | 15:09 | |
adarazs | derekh: probably https://review.openstack.org/269058 | 15:11 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Add Rabbit IPv6 only support https://review.openstack.org/290568 | 15:12 |
derekh | adarazs: once that patch merges, Zuul will ignore the Depends-On for it | 15:13 |
derekh | adarazs: but there is a window of time before the package appears in the trunk repository | 15:13 |
derekh | if you ran your job during that window then you won't have gotten the package with the change | 15:14 |
adarazs | derekh: the problem was that I assumed that it was pulled during the last gate run, but we had that rabbit error which seemed to suggest we were missing that rabbit ipv6 change. | 15:14 |
openstackgerrit | Ben Nemec proposed openstack/python-tripleoclient: Remove keystone init deprecation message https://review.openstack.org/290570 | 15:14 |
openstackgerrit | Ben Nemec proposed openstack/python-tripleoclient: Revert "Remove keystone init deprecation message" https://review.openstack.org/290571 | 15:14 |
adarazs | derekh: okay, let's see where this recheck takes us :) | 15:14 |
derekh | the window was about 33 minutes long | 15:14 |
adarazs | huh, okay... | 15:14 |
gfidente | bnemec, so we're not going to go endpoints and stuff from puppet for liberty? :( | 15:15 |
pradk | jistr, slagle, are we ok with getting this merged into master or are we waiting on anything else? https://review.openstack.org/#/c/289435/ | 15:15 |
bnemec | gfidente: It doesn't look promising. It's not even ready for Mitaka yet: https://review.openstack.org/#/c/244162/ | 15:15 |
slagle | pradk: i think it just needs to be re-reviewed at this point | 15:16 |
*** tzumainn has quit IRC | 15:17 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Allow the vnc server to bind on IPv6 address on computes https://review.openstack.org/270831 | 15:17 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-heat-templates: Remove forced rabbitmq::file_limit conversion to string https://review.openstack.org/232983 | 15:19 |
shardy | slagle: Hey, quick question about the overcloud-full element - is there any reason we couldn't just setup the element-deps for that so it pulls in all the overcloud pieces? | 15:20 |
shardy | I've just needed to rebuild an overcloud-full image direct via dib, and it's pretty unwieldy | 15:21 |
slagle | shardy: overcloud-full? is that something we still use? | 15:23 |
slagle | i guess so | 15:24 |
shardy | https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_image.py#L186 | 15:24 |
slagle | yea i see | 15:25 |
shardy | slagle: Yeah, inside tripleoclient we end up creating a pretty long list of elements | 15:25 |
slagle | it doesn't really do anything | 15:25 |
slagle | i wonder if it's even needed | 15:25 |
slagle | but yes, if we wanted it to encapsulate all the elements we actually need, i suppose we could | 15:25 |
shardy | Maybe not, will check it out - I was thinking it'd be useful to have it be a meta-element which references the other overcloud-* stuff | 15:26 |
slagle | indeed, that would be useful | 15:26 |
slagle | i guess the thinking is with the yaml image building stuff, we'd have a single yaml definition of overcloud-full | 15:27 |
shardy | Yeah that would work too I guess | 15:28 |
openstackgerrit | Merged openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy https://review.openstack.org/287199 | 15:28 |
shardy | We do some other weird stuff, like include the undercloud-package-install element for overcloud images | 15:28 |
shardy | anyway, thanks for the sanity check, I may look at updating the overcloud-full deps as that would improve my current dib workflow | 15:29 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: stable/liberty: set default upgrade level to kilo https://review.openstack.org/290584 | 15:29 |
*** tremble has quit IRC | 15:30 | |
*** jprovazn has joined #tripleo | 15:31 | |
*** rpothier has joined #tripleo | 15:33 | |
*** trozet has joined #tripleo | 15:35 | |
*** aufi has quit IRC | 15:36 | |
*** trozet has quit IRC | 15:36 | |
*** trozet has joined #tripleo | 15:36 | |
gfidente | recheck time | 15:38 |
*** xinwu has joined #tripleo | 15:43 | |
EmilienM | by enabling SSL in Puppet OpenStack CI, our CI jobs take a very long time to run and sometimes timeout. Have you already hit a similar problem in tripleo? | 15:45 |
bnemec | EmilienM: No, but we don't SSL everything right now, only the public endpoints. | 15:46 |
openstackgerrit | Sergey Gotliv proposed openstack/tripleo-heat-templates: Trove Integration https://review.openstack.org/233240 | 15:47 |
EmilienM | bnemec: ok - our CI is deploying without SSL termination, SSL is configured in WSGI services directly | 15:47 |
EmilienM | I checked cpu_info and we have aes flag | 15:47 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-docs: Document benchmarks and add extra data examples https://review.openstack.org/290605 | 15:47 |
EmilienM | I was wondering if the ssl key size could have an impact on handshakes times | 15:48 |
EmilienM | we use a 4k size iirc | 15:48 |
bnemec | EmilienM: dsneddon tells me that it's normal for SSL on internal things to have a big impact on performance. You might want to talk to him about it. | 15:48 |
bnemec | He seems to have some experience with doing that. | 15:48 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make OpenStack service ports configurable in HAProxy https://review.openstack.org/287961 | 15:50 |
*** adarazs has quit IRC | 15:52 | |
openstackgerrit | Sergey Gotliv proposed openstack/python-tripleoclient: Trove integration https://review.openstack.org/233241 | 15:53 |
*** liverpooler has quit IRC | 15:56 | |
openstackgerrit | Giulio Fidente proposed openstack/python-tripleoclient: Generate a password for Redis and pass it as deployment parameter https://review.openstack.org/290610 | 15:56 |
*** adarazs has joined #tripleo | 15:57 | |
trown | tripleo-quickstart demo starting shortly: http://www.youtube.com/watch?v=4O8KvC66eeU | 15:58 |
social | hmm I'm having issues with VMs, I rebooted whole node because the baremetal vms got stuck in shutdown/reboot in nova | 15:59 |
social | and ironic/nova still keep reporting old state after reboot even though everything is down | 15:59 |
jaosorior | trown: watchin :D | 16:00 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: Ensure access to Redis is password protected https://review.openstack.org/210405 | 16:00 |
*** aufi has joined #tripleo | 16:02 | |
mgould | dtantsur, done - sorry I only just noticed your request! | 16:04 |
dtantsur | thnx | 16:05 |
*** ohamada has joined #tripleo | 16:07 | |
*** paramite has quit IRC | 16:08 | |
*** dshulyak has quit IRC | 16:09 | |
*** yamahata has joined #tripleo | 16:10 | |
*** aufi has quit IRC | 16:13 | |
openstackgerrit | Giulio Fidente proposed openstack/python-tripleoclient: Generate a password for Redis and pass it as deployment parameter https://review.openstack.org/290610 | 16:13 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane https://review.openstack.org/256003 | 16:17 |
*** absubram has joined #tripleo | 16:17 | |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Use Fedora 23 atomic in container gate https://review.openstack.org/289565 | 16:18 |
openstackgerrit | Ryan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 16:19 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane https://review.openstack.org/256003 | 16:20 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-common: Properly setup DNS for the container CI job https://review.openstack.org/289966 | 16:21 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-heat-templates: Allow the containerized compute node to spawn larger VMs https://review.openstack.org/288822 | 16:24 |
openstackgerrit | Ryan Hallisey proposed openstack/tripleo-heat-templates: Remove unused Neutron Agents container https://review.openstack.org/287918 | 16:24 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates: WIP: Allow predictable IPs for Controllers on the ctlplane https://review.openstack.org/256003 | 16:26 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Permits configuration of Cinder enabled_backend via hieradata https://review.openstack.org/289979 | 16:27 |
*** mbound has quit IRC | 16:32 | |
gfidente | slagle, it passed! | 16:34 |
gfidente | holy green ci | 16:34 |
slagle | gfidente: yea, i just merged the first 7 patches | 16:34 |
gfidente | so we can't blame ci anymore | 16:35 |
EmilienM | congrats folks | 16:35 |
* rhallisey lives in the red XD | 16:37 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add IPv6 Support to Isolated Networks https://review.openstack.org/289355 | 16:38 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Add IPv6 versions of the Controller NIC configs https://review.openstack.org/269883 | 16:39 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Make the Neutron subnet ipv6_{ra,address}_mode configurable https://review.openstack.org/289417 | 16:39 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Allow to enable IPv6 on Corosync https://review.openstack.org/289422 | 16:39 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fix rabbit_hosts list for glance-api for IPv6 https://review.openstack.org/289432 | 16:39 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Set /64 cidr_netmask for pcmk VIPs when IPv6 https://review.openstack.org/289461 | 16:39 |
openstackgerrit | Merged openstack/tripleo-heat-templates: Fixup the memcached servers string in nova.conf for v6 https://review.openstack.org/289758 | 16:39 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-docs: Extend the root device selection documentation https://review.openstack.org/290492 | 16:39 |
slagle | unfortunately, tripleo ci is again awash in red | 16:40 |
*** trown has quit IRC | 16:41 | |
pino|work | http://sadtrombone.com | 16:41 |
dprince | slagle: is it just the capacity, memory/CPU issues. Or are you seeing a functional failure? | 16:41 |
slagle | dprince: i'm starting to look through them | 16:42 |
dprince | slagle: I'm wondering if we should expedite the resize we talked about on the list... | 16:42 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: compute: include VIR_MIGRATE_TUNNELLED when doing VM shared storage https://review.openstack.org/286584 | 16:42 |
bandini | jistr: do you have some notes on how the update process (via the new pacemaker_upgrade_1/2.sh scripts) is supposed to work? | 16:42 |
slagle | this...https://review.openstack.org/#/c/289845/, failed with 2016-03-09 14:24:41.811 | Calling <function virsh_start at 0x7ff63efdbc08> with: ['start', 'seed_3'] | 16:42 |
slagle | 2016-03-09 14:24:42.074 | error: Failed to start domain seed_3 | 16:42 |
slagle | 2016-03-09 14:24:42.074 | error: Unable to add port vnet31 to OVS bridge 3brbm_one3: Operation not permitted | 16:42 |
slagle | what the heck is that | 16:42 |
dprince | slagle: sounds like the testenv it ran on is hosed, or perhaps wasn't cleaned up properly | 16:43 |
derekh | slagle: I've seen that before, but not usually until the testenv has a long uptime | 16:43 |
derekh | are we seeing many of them? maybe it now happens more often as we have a lot more nics plugged into each instance | 16:44 |
slagle | and on the other job on that patch, introspection timed out | 16:44 |
slagle | i see no discernible pattern in any of this :) | 16:44 |
*** admin0 has quit IRC | 16:45 | |
*** mikelk has quit IRC | 16:45 | |
dprince | slagle: but the queue has been packed today | 16:45 |
slagle | the next one i'm looking at is "No valid host" during oc node deployment | 16:46 |
*** trown has joined #tripleo | 16:46 | |
dtantsur | slagle, I've also seen this "Failed to start domain seed_3" on one of my patches | 16:46 |
openstackgerrit | Emilien Macchi proposed openstack/instack-undercloud: Use pymysql database driver for OpenStack DBs https://review.openstack.org/284955 | 16:46 |
slagle | dprince: yea, anecdotally, i just want to say these are performance related | 16:46 |
slagle | but i have no facts to that effect | 16:46 |
jistr | bandini: yes, we're hashing it out, i'll send you the link | 16:46 |
dprince | slagle: yeah, I follow. That is why I'm wondering if we re-allocate the testenv's if it would help us | 16:47 |
dprince | derekh: thoughts? | 16:47 |
dprince | derekh: when could we take a window of time and re-deploy them? | 16:48 |
slagle | it's a tough call, if we re-allocate something could go wrong | 16:48 |
slagle | and we could be completely down for a day or so | 16:48 |
*** dshulyak has joined #tripleo | 16:49 | |
*** yamahata has quit IRC | 16:50 | |
*** xinwu has quit IRC | 16:51 | |
*** bvandenh has joined #tripleo | 16:51 | |
*** xinwu has joined #tripleo | 16:51 | |
*** xinwu has quit IRC | 16:51 | |
*** trown is now known as trown|brb | 16:52 | |
derekh | dprince: slagle we can reallocate them in batches, maybe to a small few and see if things improve | 16:53 |
*** trown has joined #tripleo | 16:53 | |
derekh | dprince: slagle any ideas if all the extra nics could be putting a higher load on the host than there used to be | 16:53 |
* derekh saw a load of 140 this morning | 16:54 | |
dprince | derekh: they certainly could be | 16:54 |
slagle | testenv12-testenv0-or57ccjfv7w2 env num 3 must be bad | 16:54 |
dprince | derekh: I was definitely suspicious of it last week | 16:54 |
slagle | it's failed 2 jobs with that can't add ovs port error | 16:54 |
dprince | derekh: with low load things seemed to pass just fine though | 16:54 |
openstackgerrit | Dan Sneddon proposed openstack/os-net-config: Fix hierarchy for Linux Bonds and Linux Bridges https://review.openstack.org/290224 | 16:54 |
dprince | derekh: repeatedly, over the weekend | 16:55 |
derekh | slagle: once that happens, you gotta rebuild the host, that env is hosted | 16:55 |
derekh | *hosed | 16:55 |
*** jaosorior has quit IRC | 16:55 | |
derekh | slagle: that env will continue to fail with the same error until the host is rebooted, we've always had it as a problem, I've just rebuilt them when it happened | 16:56 |
derekh | slagle: but it usually happens after a few months of uptime | 16:56 |
derekh | also we have CPUs overheating, I'm thinking maybe we want to turn off turbo to prevent the CPUs getting throttled under heavy load | 16:57 |
*** xinwu has joined #tripleo | 16:57 | |
derekh | # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo | 16:57 |
derekh | https://bugzilla.redhat.com/show_bug.cgi?id=924570#c34 | 16:57 |
openstack | bugzilla.redhat.com bug 924570 in kernel "regression, package temp above normal induced mce" [Unspecified,New] - Assigned to kernel-maint | 16:57 |
*** tosky has quit IRC | 16:57 | |
slagle | did we have that before the redeploy for net iso? | 16:58 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers https://review.openstack.org/290687 | 16:58 |
*** rwsu has quit IRC | 16:58 | |
derekh | slagle: which one, the CPU heating? yes we noticed it while in brno, but I feel it's worse now (but that's just a gut feeling) | 16:59 |
slagle | derekh: the no_turbo setting | 16:59 |
slagle | just wondering if we were throttling before | 16:59 |
derekh | slagle: ahh, the setting value should have been the same all along, we never changed it | 17:00 |
*** bvandenh has quit IRC | 17:01 | |
*** xinwu has quit IRC | 17:04 | |
*** trown|brb has quit IRC | 17:05 | |
*** jistr has quit IRC | 17:06 | |
*** rcernin has quit IRC | 17:09 | |
openstackgerrit | Merged openstack/instack-undercloud: Nova should not sync power state of overcloud nodes https://review.openstack.org/288052 | 17:10 |
*** yamahata has joined #tripleo | 17:10 | |
*** olap has quit IRC | 17:10 | |
*** olap has joined #tripleo | 17:12 | |
*** ohamada has quit IRC | 17:13 | |
*** devvesa has quit IRC | 17:14 | |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers https://review.openstack.org/290687 | 17:15 |
*** saneax is now known as saneax_AFK | 17:16 | |
*** fgimenez has quit IRC | 17:16 | |
* bnemec just rechecked a patch he had already rechecked | 17:19 | |
*** dshulyak has quit IRC | 17:21 | |
EmilienM | bnemec: I figured out the SSL thing, it was nothing to do with ssl but the tests in tempest that we run (too many...) | 17:22 |
bnemec | EmilienM: Cool | 17:22 |
*** jcoufal has quit IRC | 17:22 | |
*** hjensas has quit IRC | 17:24 | |
*** ifarkas has quit IRC | 17:25 | |
slagle | i think we're going to redeploy some testenv hosts to 3 envs and more ram | 17:25 |
openstackgerrit | Pradeep Kilambi proposed openstack/python-tripleoclient: Add gnocchi password as a deployment param https://review.openstack.org/290710 | 17:25 |
bnemec | \o/ | 17:25 |
slagle | in an attempt to quell the bloodbath | 17:25 |
bnemec | http://1.bp.blogspot.com/_bDijfYcj7qQ/TF7qd5_hHaI/AAAAAAAAAss/I0hr1HOs_8Y/s400/the-descent-horror-movie.jpg | 17:26 |
bnemec | I picture our CI rack looking like that now. Thanks slagle :-P | 17:27 |
*** dtantsur is now known as dtantsur|afk | 17:27 | |
slagle | bnemec: how did you pull that still frame off my web cam? | 17:28 |
bnemec | slagle: Hax! | 17:28 |
*** adarazs has quit IRC | 17:29 | |
*** dshulyak has joined #tripleo | 17:29 | |
*** mbound has joined #tripleo | 17:33 | |
*** adarazs has joined #tripleo | 17:33 | |
*** rdopiera has quit IRC | 17:34 | |
*** admin0 has joined #tripleo | 17:36 | |
*** mbound has quit IRC | 17:38 | |
*** panda has quit IRC | 17:40 | |
*** panda has joined #tripleo | 17:40 | |
derekh | Hi all, expect jobs to start mysteriously failing (differently to usual), I'm about to start redeploying some TE hosts as per the email last night | 17:43 |
*** mgould has quit IRC | 17:44 | |
*** athomas has quit IRC | 17:47 | |
*** Marga__ has quit IRC | 17:49 | |
*** mbound has joined #tripleo | 17:49 | |
openstackgerrit | Merged openstack/os-cloud-config: Fix a typo in usage.rst https://review.openstack.org/234771 | 17:50 |
*** mbound has quit IRC | 17:50 | |
openstackgerrit | Merged openstack/os-cloud-config: Put py34 first in the env order of tox https://review.openstack.org/260515 | 17:50 |
openstackgerrit | Ben Nemec proposed openstack/instack-undercloud: Remove trailing / on keystone admin endpoint https://review.openstack.org/290724 | 17:52 |
*** dustins has quit IRC | 17:55 | |
*** jdob has quit IRC | 17:59 | |
*** mgould has joined #tripleo | 17:59 | |
*** bnemec changes topic to "TripleO | testenvs are being redeployed, expect random CI failures for a while | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 17:59 | |
bnemec | derekh: Updated the channel topic. Should we ask people to hold off on rechecks until you're done? | 17:59 |
greghaynes | bnemec: ianw Hello there - I want to chat about https://review.openstack.org/#/c/211859/ if you all have a min | 18:00 |
stevebaker | bnemec: hey, if I were to try network isolation in OVB which net environment should I use? | 18:00 |
derekh | bnemec: wouldn't do any harm to ask, | 18:00 |
stevebaker | bnemec: popular! | 18:01 |
*** bnemec changes topic to "TripleO | testenvs are being redeployed, expect random CI failures for a while. Please wait to recheck until the redeploy is complete | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 18:01 | |
bnemec | derekh: Done | 18:01 |
bnemec | stevebaker: Indeed! :-) | 18:01 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Kill CI job if it doesn't get a testenv quickly https://review.openstack.org/290731 | 18:01 |
greghaynes | haha, we can chat when things arent explodey | 18:01 |
bnemec | stevebaker: You'll need to have multiple nics available. Neutron doesn't allow tenant vms access to vlans. | 18:02 |
derekh | slagle: dprince bnemec now that we'll have more jenkins nodes than testenvs, something like that ^^ would be a good idea, | 18:02 |
bnemec | stevebaker: I use the "simple" templates from here: https://github.com/cybertron/tripleo-network-templates | 18:02 |
stevebaker | bnemec: yep, I'll be adding multiple nics and assuming I can get past introspection | 18:02 |
bnemec | I added the public nic back to the baremetal nodes in my local templates. | 18:02 |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Gnocchi as a Ceilometer metrics storage backend https://review.openstack.org/252032 | 18:03 |
derekh | slagle: dprince bnemec gotta run in a minute, here is what I've done so far, https://etherpad.openstack.org/p/qVHhIU4TQn | 18:03 |
*** shardy has quit IRC | 18:03 | |
derekh | dprince: was gonna take over, I'll be back later to see if there is anything I can help with | 18:03 |
stevebaker | bnemec: thanks, I'll take a look | 18:03 |
bnemec | # Handover to other people and run away | 18:03 |
* bnemec likes that plan | 18:03 | |
slagle | derekh: thanks :) | 18:04 |
bnemec | stevebaker: If you want to add a bunch of nics, https://github.com/openstack/tripleo-heat-templates/tree/master/network/config/multiple-nics should work too. | 18:04 |
bnemec | I need to test that and make sure OVB can actually handle it though. | 18:05 |
bnemec | I'm a little concerned based on my experience with adding lots of nics to the bmc that we may run into OpenStack bugs doing that. | 18:05 |
bnemec | But we'll see. | 18:05 |
*** olap has quit IRC | 18:05 | |
derekh | slagle: my deploy command failed (my fault), rerunning before I go | 18:06 |
derekh | ok, testenv30 is deploying now | 18:06 |
*** derekh is now known as derekh_afk | 18:06 | |
*** shivrao has joined #tripleo | 18:07 | |
stevebaker | bnemec: I may start with that, since its upstream and I can in theory create as many nics as I need | 18:07 |
*** Marga_ has joined #tripleo | 18:08 | |
*** Marga_ has quit IRC | 18:09 | |
*** xinwu has joined #tripleo | 18:10 | |
*** hjensas has joined #tripleo | 18:10 | |
*** Marga_ has joined #tripleo | 18:11 | |
*** Marga_ has quit IRC | 18:17 | |
greghaynes | ianw: bnemec ok, replied on https://review.openstack.org/#/c/211859/7 - I think theres multiple groups hoping for that to move forward right now so it would be awesome if I could get some more feedback | 18:18 |
*** lucasagomes is now known as lucas-dinner | 18:18 | |
*** mkovacik has quit IRC | 18:19 | |
*** jaosorior has joined #tripleo | 18:21 | |
*** Marga_ has joined #tripleo | 18:22 | |
*** dmacpher is now known as dmacpher-afk | 18:25 | |
*** nico_auv has quit IRC | 18:29 | |
*** mgould has quit IRC | 18:32 | |
*** shivrao has quit IRC | 18:39 | |
*** electrofelix has quit IRC | 18:40 | |
*** olap has joined #tripleo | 18:41 | |
*** rwsu has joined #tripleo | 18:43 | |
*** jdob_lt has joined #tripleo | 18:50 | |
*** jdob_lt is now known as jdob | 18:52 | |
*** Marga_ has quit IRC | 18:53 | |
*** Marga_ has joined #tripleo | 18:54 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Cisco nexus config template - obsolete parameter (replay count). https://review.openstack.org/288437 | 18:54 |
bnemec | Good news, everyone! | 18:57 |
bnemec | It looks like fedorapeople is down, so CI wouldn't be passing anyway (apparently we pull packages from there). | 18:57 |
larsks | How do I pass custom config settings to the *undercloud* install? The 'openstack undercloud install' command doesn't seem to take any parameters... | 18:58 |
*** shivrao has joined #tripleo | 18:59 | |
bnemec | Or maybe those two things are unrelated. My undercloud install is failing to download a package that doesn't even exist on my older undercloud. :-/ | 18:59 |
bnemec | larsks: It doesn't. The undercloud intentionally exposes relatively few configuration options to keep it simple. | 19:00 |
bnemec | What do you need to customize? | 19:00 |
larsks | bnemec: I want to pass in [ssh] libvirt_uri, because I am targeting libvirtd running unprivileged rather than running as root. | 19:01 |
larsks | I can obviously just edit ironic.conf and restart the service, but I was hoping for a more graceful option... | 19:01 |
trown | hmm, this seems worth wiring in to undercloud.conf maybe | 19:02 |
larsks | trown: I know, right? :) | 19:02 |
bnemec | I'm not so sure. I don't want to wire in an option that is only of interest to dev/test people who _aren't_ setting up their environment the way we suggest. | 19:03 |
bnemec | Everything in undercloud.conf is exposed to the user too. | 19:03 |
trown | fair | 19:04 |
trown | I think hiera could work too, will play with that | 19:05 |
trown | but first food | 19:05 |
*** trown is now known as trown|lunch | 19:05 | |
dmsimard | bnemec: fedorapeople is probably the temporary monitoring stuff ? | 19:06 |
bnemec | Yeah, I actually am starting to think maybe we need to allow for arbitrary puppet somehow. | 19:06 |
*** liverpooler has joined #tripleo | 19:06 | |
bnemec | dmsimard: It's a tempest dep. | 19:06 |
dmsimard | ah. | 19:06 |
bnemec | python2-testresources-1.0.0-1.el7.noarch | 19:07 |
bnemec | It's not even installed on my overcloud from maybe a week ago. | 19:07 |
jdob | gfidente, ping | 19:07 |
bnemec | Maybe the package just hasn't arrived in the right repos yet. | 19:07 |
jdob | actually, bnemec ping | 19:09 |
jdob | when you're done with that conversation | 19:09 |
bnemec | jdob: What's up? | 19:10 |
jdob | bnemec, these two patches in the client for the passwords (gnocchi and redis), if they land before the THT stuff, it's gonna break everything right? since they are passing in params that don't exist | 19:10 |
jdob | or am I wrong there | 19:10 |
bnemec | jdob: Right, but there's no way the template changes could pass CI without the client ones. | 19:11 |
jdob | and if i'm right, how do we handle landing these so that we don't get in a weird state | 19:11 |
jdob | this is true, i can't even run the gnocchi patch without the client | 19:11 |
bnemec | There should be a depends-on in the template changes that points at the corresponding client change. | 19:11 |
bnemec | Gerrit won't allow them to merge out of order then. | 19:11 |
jdob | right, but if the client merges and that THT patch takes another few days, isn't shit broken until the THT stuff lands too? | 19:12 |
bnemec | Although like I said, unless someone completely ignores CI that shouldn't happen. | 19:12 |
jdob | or does that password param get ignored if it's specified and unused | 19:12 |
bnemec | Hmm | 19:12 |
jdob | basically, i'm wondering if there is a circular dependency here | 19:12 |
jdob | and how we resolve landing them | 19:12 |
bnemec | I thought we had done this before, but I could be wrong. Let me look quickly. | 19:13 |
jdob | not sure what happens in gerrit if we have two depends-on pointing to each other, but IIRC, i heard that's bad mojo | 19:13 |
bnemec | jdob: Okay, yeah, we're fine if the client lands first. The client passes everything as parameter_defaults: https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L231 | 19:14 |
jdob | ahhh, right, that was the way around this for other issues | 19:14 |
jdob | now i remember, thanks bnemec | 19:14 |
bnemec | np | 19:15 |
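A minimal sketch of the behaviour discussed above, assuming (as bnemec and the linked client code suggest) that values passed under `parameter_defaults` are skipped when the template does not declare them, while values passed under `parameters` must be declared. This is a toy model, not Heat's implementation; `validate_environment` and the parameter names are hypothetical.

```python
# Toy model only: not Heat's code, just an illustration of why the client
# change can land before the template change. All names are hypothetical.

def validate_environment(template_params, environment):
    """Return effective parameter values, or raise on an undeclared explicit parameter."""
    declared = set(template_params)

    # Explicit 'parameters' must all be declared by the template.
    unknown = set(environment.get("parameters", {})) - declared
    if unknown:
        raise ValueError("unknown parameters: %s" % ", ".join(sorted(unknown)))

    # Start from the template's own defaults.
    values = dict(template_params)

    # 'parameter_defaults' entries the template does not declare are skipped.
    for name, value in environment.get("parameter_defaults", {}).items():
        if name in declared:
            values[name] = value

    # Explicit parameters take precedence over defaults.
    values.update(environment.get("parameters", {}))
    return values
```

Under this model, a client that sends a new password as a parameter default is harmless against an older template, whereas sending it as an explicit parameter would fail validation.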
bnemec | Apparently it's only me who can't get to fedorapeople too. I wonder if they got pissed because I kicked off 60+ test rpm builds at once to catch up on the ones that failed because I broke my local network. | 19:16 |
jdob | i can ssh into jdob.fedorapeople.org if that helps | 19:17 |
jdob | or doesn't, I suppose | 19:17 |
bnemec | jdob: https://review.openstack.org/#/c/290710/ hasn't passed CI yet. | 19:18 |
bnemec | Nor has https://review.openstack.org/#/c/290610/ | 19:18 |
jdob | oh shit, my bad, I saw the green but didn't pay close enough attention | 19:18 |
jdob | removed the +As | 19:19 |
*** mkovacik has joined #tripleo | 19:19 | |
bnemec | Thanks | 19:20 |
slagle | dprince: there are some jobs running now on the new testenv30 hosts derekh_afk deployed | 19:20 |
*** saneax_AFK is now known as saneax | 19:21 | |
dprince | slagle: got a link to the jenkins so we can watch it? | 19:21 |
slagle | i'm just logged in right now | 19:23 |
slagle | is there a way to tell from that? | 19:23 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Set default locale to image in ubuntu-minimal https://review.openstack.org/290789 | 19:23 |
dprince | slagle: maybe, I thought perhaps you just noticed one of the jobs from zuul, or tripleo.org | 19:24 |
dprince | slagle: I think you could tell, but you'd have to be on the undercloud, or jenkins perhaps | 19:25 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Enable predictable IPs on non-controllers https://review.openstack.org/290687 | 19:25 |
*** akrivoka has quit IRC | 19:26 | |
slagle | dprince: here's one, https://jenkins05.openstack.org/job/gate-tripleo-ci-f22-ha/783/ | 19:28 |
dprince | slagle: boom, thanks | 19:28 |
dprince | slagle: if this goes well I'll deploy more then right? | 19:29 |
dprince | slagle: vs deploying them all now | 19:31 |
slagle | yea, sounds good | 19:35 |
*** sthillma has joined #tripleo | 19:36 | |
gfidente | jdob, it's ignored since the client passes stuff as parameter_defaults not parameters anymore | 19:39 |
gfidente | so we can pass my_mother_went_to_milan: true | 19:40 |
jdob | gfidente, please tell me there's a test case with that exact data in it | 19:40 |
jdob | that would be amazing | 19:40 |
gfidente | but this wasn't the case before though | 19:40 |
gfidente | client was using parameters: initially | 19:40 |
gfidente | going for the day guys! | 19:41 |
*** gfidente is now known as gfidente|afk | 19:41 | |
*** gfidente|afk has quit IRC | 19:43 | |
*** jprovazn has quit IRC | 19:43 | |
*** yamahata has quit IRC | 19:44 | |
slagle | dprince: so one of the jobs i'm looking at on testenv30-0 the nodes are stuck in wait call-back in ironic | 19:50 |
*** trozet_ has joined #tripleo | 19:50 | |
*** dshulyak has quit IRC | 19:50 | |
slagle | i suppose that could be related to the patch being tested, but that seems unlikely | 19:51 |
dprince | slagle: the one you linked before looks to be stuck building images | 19:51 |
dprince | sudo dd of=/var/log/image_build.txt | 19:51 |
*** trozet has quit IRC | 19:52 | |
dprince | slagle: got a link to the one waiting on Ironic callback? | 19:52 |
slagle | it's one of the running nonha ones | 19:53 |
slagle | let me see if i can find it | 19:53 |
slagle | dprince: https://jenkins06.openstack.org/job/gate-tripleo-ci-f22-nonha/1069/ | 19:54 |
*** trozet_ is now known as trozet | 19:58 | |
slagle | dprince: the nodes are active now, they got rescheduled | 19:58 |
slagle | that was odd...could explain some of these long job times | 19:58 |
openstackgerrit | Ryan Hallisey proposed openstack-infra/tripleo-ci: Allow the continer job to run again https://review.openstack.org/288915 | 19:59 |
slagle | i wonder if BOOTP packets got dropped somewhere | 19:59 |
dprince | slagle: perhaps, related to what though? | 20:02 |
*** jaosorior has quit IRC | 20:02 | |
*** jaosorior has joined #tripleo | 20:03 | |
slagle | dprince: not sure. but I only see the requests for boot.ipxe in the httpd access log after they initially timed out and were then deployed successfully | 20:07 |
slagle | which means they never requested boot.ipxe the first time | 20:07 |
slagle | so i'd guess dhcp failed | 20:07 |
dprince | slagle: interesting. Well this is probably the first job to use that testenv since it came up (we think?) | 20:08 |
slagle | oh let me check the ipxe script | 20:08 |
slagle | how many nics does it try? | 20:08 |
slagle | we have 10 now | 20:08 |
dprince | slagle: yes | 20:08 |
dprince | slagle: but the first Nic should take priority I think | 20:09 |
dprince | slagle: rather, they are ordered | 20:09 |
slagle | i dont think that always works | 20:09 |
slagle | that's why lucas had to add this | 20:09 |
slagle | but, i see it loops over all the nics anyway | 20:10 |
slagle | assuming that's working | 20:10 |
slagle | or they're all active | 20:10 |
dprince | slagle: yes, it would eventually try all of the active NICs | 20:10 |
dprince | slagle: I am doing this: http://git.openstack.org/cgit/openstack/tripleo-incubator/tree/scripts/configure-vm#n28 | 20:11 |
slagle | i see some errors in the inspector log as well | 20:14 |
slagle | http://paste.openstack.org/show/489904/ | 20:14 |
*** jcoufal has joined #tripleo | 20:15 | |
slagle | i wonder if they booted into the inspector ramdisk during deployment | 20:16 |
dprince | slagle: Hmm, not sure. The inspector logs indicate it is skipping some interfaces which were not PXE booting. However I would have expected one of them to be PXE booting (eth0 for example) | 20:17 |
slagle | right | 20:17 |
slagle | well i dont see any DHCPREQUEST's in the inspector-dnsmasq log after the oc deployment started, so that's probably not it | 20:18 |
*** snecklifter has quit IRC | 20:26 | |
bnemec | It may not be working, depending on how old the libvirt in the testenvs is. | 20:27 |
*** trown|lunch is now known as trown | 20:27 | |
bnemec | There's a fallback path for older roms that don't support the inc command. | 20:27 |
bnemec | No idea whether that would be a problem here, but it's possible. | 20:28 |
*** sthillma has quit IRC | 20:28 | |
bnemec | Specifically this: https://github.com/openstack/ironic/blob/master/ironic/drivers/modules/boot.ipxe#L8 | 20:29 |
bnemec | Note || goto old_rom | 20:29 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder: Set default locale to image in ubuntu-minimal https://review.openstack.org/290789 | 20:35 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: IPv6: duak-stack support for Keystone https://review.openstack.org/286344 | 20:38 |
dprince | slagle: the HA job failed with: 2016-03-09 20:45:00.006 | Error finding address for http://10.0.0.4:9292/v1/images: Unable to establish connection to http://10.0.0.4:9292/v1/images | 20:49 |
dprince | slagle: this was during the ping test to the overcloud | 20:49 |
dprince | slagle: seen that before? | 20:49 |
*** saneax is now known as saneax_AFK | 20:50 | |
slagle | possibly, it looks a little familiar | 20:50 |
slagle | we could check in logstash | 20:50 |
dprince | slagle: http://logs.openstack.org/88/288188/3/check-tripleo/gate-tripleo-ci-f22-ha/f4394a0// | 20:50 |
dprince | slagle: I'm gonna say that is possibly new or unrelated to what we are chasing in general | 20:51 |
slagle | yea i dont think it's related | 20:51 |
slagle | bnemec: does * work within double quotes on logstash? | 20:54 |
slagle | see what happens when you start an etherpad of queries? | 20:55 |
slagle | people ask you questions | 20:55 |
bnemec | slagle: I'm not sure. It automatically does a substring match on the Message field, so leading or trailing *'s would be unnecessary. | 20:55 |
bnemec | In theory anyway. :-) | 20:55 |
slagle | yea, that i see working | 20:55 |
slagle | but if i put one in the middle, it doesn't work | 20:56 |
*** prometheanfire has joined #tripleo | 20:58 | |
dprince | slagle: 3 new testenv's online | 20:58 |
dprince | slagle: testenv31-.... | 20:58 |
prometheanfire | so, trying to add a element to test in dib, hopefully only for periodic jobs | 20:58 |
prometheanfire | is all I need to do add a subdir called test-elements/build-succeeds, like in debian/fedora/apt-sources/ironic-agent? | 20:59 |
bnemec | slagle: Might be useful: A query such as "foo bar"~10000000 is an interesting alternative to foo AND bar | 21:01 |
slagle | oh i see | 21:03 |
bnemec | slagle: Although I'm not finding that it quite does what I would expect. I'm just looking at the reference here: http://www.lucenetutorial.com/lucene-query-syntax.html | 21:05 |
dprince | 3 more testenv's active | 21:19 |
dprince | testenv32-.... | 21:19 |
prometheanfire | not sure if that will make it run for all checks though | 21:22 |
*** dshulyak has joined #tripleo | 21:22 | |
*** dshulyak has quit IRC | 21:23 | |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates: Deploy Gnocchi as a Ceilometer metrics storage backend https://review.openstack.org/252032 | 21:23 |
*** trown is now known as trown|outtypewww | 21:25 | |
dprince | 3 more testenv's active | 21:27 |
dprince | testenv33-.... | 21:27 |
*** derekh_afk is now known as derekh | 21:29 | |
derekh | how goes it? | 21:30 |
dprince | derekh: I'm deploying testenv34 now | 21:30 |
slagle | dan is deploying more testenvs | 21:30 |
slagle | derekh: we had some jobs pass testenv30, so decided to redeploy the rest | 21:31 |
dprince | derekh: so we've got like 12 more testenv's or so... | 21:31 |
dprince | derekh: sorry, 12 more test environments | 21:31 |
slagle | some also failed...in unrelated ways | 21:31 |
dprince | yeah, still sort of unsure but decided to go all in to get better results | 21:32 |
derekh | dprince: slagle ok | 21:32 |
slagle | i just rechecked https://review.openstack.org/#/c/272089/ | 21:32 |
slagle | and the 3 jobs got the 3 envs from testenv31 | 21:32 |
slagle | so that will be a good test | 21:32 |
dprince | derekh: I'm also setting times, and swap up on these | 21:32 |
derekh | dprince: great, was about to check ;-) | 21:33 |
dprince | derekh: testenv34 just had some failed machines. Retrying it again... | 21:35 |
*** panda has quit IRC | 21:40 | |
*** panda has joined #tripleo | 21:40 | |
*** r-mibu has quit IRC | 21:47 | |
*** yamahata has joined #tripleo | 21:47 | |
*** r-mibu has joined #tripleo | 21:47 | |
dprince | okay, all 3 testenv34... workers are running now | 21:48 |
dprince | testenv35 building now | 21:48 |
* dprince wonders off for a bit | 21:49 | |
dprince | wander even | 21:49 |
derekh | dprince: gonna remove the unused overcloud ports | 21:49 |
dprince | oh crap, I forgot about those | 21:50 |
dprince | derekh: you can tell the unused ones though right? | 21:50 |
derekh | dprince: they're named, so anything that's not te_testenv3X, | 21:51 |
dprince | derekh: right | 21:51 |
derekh | dprince: at this stage all the old testenvs are gone aren't they | 21:51 |
dprince | derekh: you doing it or me? | 21:51 |
*** ccamacho has quit IRC | 21:51 | |
derekh | dprince: yup | 21:51 |
dprince | derekh: I've got a ~/clean-ports.sh | 21:51 |
dprince | derekh: be very careful with that guy though | 21:51 |
derekh | dprince: I'll stick to my normal way ;-) | 21:52 |
dprince | derekh: script isn't guarded very well. Wrong regex and it'd go wonky | 21:52 |
* derekh deleted all the ports on the overcloud once, back when we were setting all this up, it didn't go very well | 21:53 | |
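A minimal sketch of the kind of guard the discussion above hints at, assuming ports for the current test environments are named with a `te_testenv3X` prefix as derekh describes. The function name and the sample port names are hypothetical, and this is not the contents of `~/clean-ports.sh`; it only selects candidates and does not delete anything.

```python
import re

# Assumed naming convention from the discussion: current testenv ports start
# with te_testenv3 followed by a digit. Adjust the pattern if that is wrong.
KEEP_PATTERN = re.compile(r"^te_testenv3\d")

def select_stale_ports(port_names, keep_pattern=KEEP_PATTERN):
    """Return the port names that do not match keep_pattern.

    Refuses to return a result when every port would be selected, since that
    usually means the regex is wrong rather than that all ports are stale.
    """
    stale = [name for name in port_names if not keep_pattern.search(name)]
    if port_names and len(stale) == len(port_names):
        raise RuntimeError("pattern matches no ports to keep; refusing to select all")
    return stale
```

Printing the returned list and reviewing it before deleting anything keeps a bad regex from wiping every port.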
*** ccamacho has joined #tripleo | 21:57 | |
*** jdob has quit IRC | 22:00 | |
*** jayg is now known as jayg|g0n3 | 22:03 | |
*** jayg|g0n3 is now known as jayg | 22:04 | |
*** jayg is now known as jayg|g0n3 | 22:05 | |
*** absubram has quit IRC | 22:06 | |
dprince | derekh: 17 active testenv-worker machines | 22:07 |
*** lblanchard has quit IRC | 22:08 | |
dprince | derekh: 51 environments. That should be close to enough right? | 22:08 |
slagle | do we have any cache for the puppet modules? | 22:08 |
derekh | dprince: cool, I guess we just wait now | 22:08 |
*** admin0 has left #tripleo | 22:08 | |
dprince | slagle: I don't think so | 22:08 |
derekh | slagle: the git.openstack.org modules should already be on the jenkins node, and we should be getting them from there, if we're not something is broken | 22:09 |
derekh | slagle: for the others we're cloning from github mostly | 22:09 |
derekh | slagle: this is intended to change things to start cloning from the mirror server I've been talking about https://review.openstack.org/#/c/285257/ | 22:10 |
derekh | *from | 22:10 |
openstackgerrit | James Slagle proposed openstack-infra/tripleo-ci: Reuse the source-repositories cache during the image build https://review.openstack.org/290879 | 22:11 |
slagle | derekh: ok. that's probably pointless then ^ | 22:11 |
derekh | slagle: hold on I thought I fixed that recently, standby | 22:12 |
bnemec | So testenvs are all done? | 22:12 |
*** rpothier has left #tripleo | 22:14 | |
*** dshulyak has joined #tripleo | 22:14 | |
slagle | 17 minute image build | 22:14 |
slagle | seems like something is going faster | 22:15 |
derekh | slagle: https://review.openstack.org/#/c/283699/ | 22:15 |
slagle | heh | 22:15 |
slagle | i was looking in the wrong place | 22:16 |
derekh | bnemec: yup, all done, | 22:17 |
bnemec | derekh: Roger, thanks | 22:17 |
*** bnemec changes topic to "TripleO | testenvs redeployed. recheck away! | CI status: http://tripleo.org/cistatus.html | Docs: http://tripleo.org/" | 22:18 | |
derekh | ok, I'm off, will check back in the morning, better tell my green pixels to get ready | 22:18 |
*** derekh has quit IRC | 22:19 | |
*** david-lyle has quit IRC | 22:19 | |
*** david-lyle has joined #tripleo | 22:20 | |
*** jcoufal has quit IRC | 22:28 | |
bandini | so I am getting "ERROR: Failed to validate: Failed to validate: resources[0]: "str_replace" parameters must be a mapping" while deploying THT from master. How do I troubleshoot this in general? heat-*.log aren't all too helpful: http://fpaste.org/336477/62661145/ | 22:31 |
openstackgerrit | Steve Baker proposed openstack/tripleo-heat-templates: Add a large packet ping to all nodes check https://review.openstack.org/290884 | 22:32 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Kill CI job if it doesn't get a testenv quickly https://review.openstack.org/290731 | 22:35 |
*** yamahata has quit IRC | 22:35 | |
*** jtomasek has quit IRC | 22:39 | |
slagle | dprince: do we need dhcp-all-interfaces running on the testenv hosts? | 22:40 |
slagle | isn't it going to keep more and more failed services? | 22:41 |
slagle | i thought there was a bug on that about it eventually degrading performance | 22:41 |
slagle | dprince: yea, this one https://bugzilla.redhat.com/show_bug.cgi?id=1293712 | 22:42 |
openstack | bugzilla.redhat.com bug 1293712 in rhel-osp-director "/etc/udev/rules.d/99-dhcp-all-interfaces.rules causes a slow and miserable degradation until things fail" [Urgent,Assigned] - Assigned to dprince | 22:42 |
slagle | and i see lots of failed services for the vnet interfaces on the testenv host | 22:43 |
slagle | i saw an instance of ironic failing to start one of the vm's on the host, and it had to redeploy it | 22:43 |
slagle | dprince: i'm getting a lot of this in one of the running jobs now, http://paste.openstack.org/show/489922/ | 22:45 |
slagle | i gotta step away. but it might be something to consider...turning off dhcp-all-interfaces on the testenvs | 22:46 |
slagle | seems like that could be related to the vm's having 10 nics now | 22:46 |
openstackgerrit | Matthew Thode proposed openstack/diskimage-builder: Add testing for the Gentoo element https://review.openstack.org/290894 | 22:48 |
*** dshulyak has quit IRC | 22:48 | |
openstackgerrit | Ben Nemec proposed openstack/puppet-tripleo: Allow enabling authentication on haproxy.stats https://review.openstack.org/290896 | 23:01 |