*** openstack has joined #tripleo | 05:58 | |
*** psanchez has joined #tripleo | 05:58 | |
jaosorior | great | 05:58 |
---|---|---|
*** rain has joined #tripleo | 05:58 | |
jaosorior | bandini: It seems that none of the openstack components that are not running over httpd support reload though | 05:58 |
jaosorior | what do you recommend in those cases? | 05:58 |
*** rain is now known as Guest61676 | 05:58 | |
*** saneax is now known as saneax_AFK | 05:59 | |
jaosorior | For example, I have this WIP for heat https://review.openstack.org/#/c/327069/10 but I haven't put any command there. | 05:59 |
bandini | jaosorior: in that case it gets a bit tricky, because if you do a "pcs resource restart openstack-<service>" it will restart all the dependent services too | 05:59 |
*** mbound has quit IRC | 05:59 | |
jaosorior | bandini: can't I just do systemctl restart openstack-heat-api? | 06:00 |
bandini | jaosorior: well it can get racy. basically if, while you do the restart via systemctl, pacemaker monitors that service it will see that it is down so it will try to stop and start it again. so at this point you're racing with pcmk for that | 06:02 |
bandini | maybe it works, but it can break | 06:03 |
jaosorior | bandini: Well, it is a post-save command; all I want is that the service re-reads the configuration so it can read the new certificates | 06:03 |
jaosorior | so if pacemaker tries to restart it again, it shouldn't be that problematic, as I don't do anything after the post-save command | 06:04 |
*** pkovar has joined #tripleo | 06:04 | |
*** tremble has joined #tripleo | 06:07 | |
bandini | jaosorior: let me have a think about that | 06:09 |
*** ooolpbot has joined #tripleo | 06:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 06:10 |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 06:10 |
*** ooolpbot has quit IRC | 06:10 | |
*** dciabrin has quit IRC | 06:10 | |
*** jprovazn has joined #tripleo | 06:14 | |
*** yolanda has joined #tripleo | 06:18 | |
*** yolanda has quit IRC | 06:21 | |
*** ramishra has quit IRC | 06:21 | |
*** florianf has quit IRC | 06:22 | |
*** florianf has joined #tripleo | 06:22 | |
*** yolanda has joined #tripleo | 06:24 | |
*** rcernin has joined #tripleo | 06:24 | |
*** saneax_AFK is now known as saneax | 06:25 | |
*** ramishra has joined #tripleo | 06:26 | |
*** apetrich has quit IRC | 06:34 | |
*** apetrich has joined #tripleo | 06:34 | |
*** jcoufal has quit IRC | 06:36 | |
*** coolsvap has quit IRC | 06:37 | |
*** coolsvap has joined #tripleo | 06:38 | |
bandini | jaosorior: do you happen to have a newton env where the ci issue is seen? | 06:39 |
jaosorior | bandini: No dude :/ | 06:40 |
jaosorior | I don't have an accessible machine | 06:40 |
jaosorior | aaand I have a deployment I test with, but it's from before the issue | 06:40 |
jaosorior | I think ccamacho had one | 06:40 |
bandini | ah right, I will try to set one up then | 06:41 |
bandini | jaosorior: btw I will do a short write-up about the issues of systemctl restart <service> when <service> is managed by pacemaker. I think it will be useful for everyone (me included ;) | 06:43 |
jaosorior | alright! | 06:43 |
jaosorior | thanks dude | 06:43 |
bandini | the reasons why things might break are totally not obvious and are related to how the systemd APIs work | 06:44 |
*** mburned has quit IRC | 06:46 | |
*** xinwu has quit IRC | 06:47 | |
*** mburned has joined #tripleo | 06:50 | |
*** panda has quit IRC | 06:50 | |
*** panda has joined #tripleo | 06:51 | |
*** afazekas is now known as afazekas|dentist | 06:52 | |
bandini | jaosorior: ever seen this https://paste.fedoraproject.org/379899/66059914/ after running tripleo.sh --repo-setup and then --undercloud / | 06:52 |
bandini | ? | 06:52 |
hewbrocca | morning folks | 06:54 |
bandini | yo hewbrocca | 06:54 |
*** mburned has quit IRC | 06:56 | |
hewbrocca | It seems like EmilienM made some progress on the upgrade job last night but it's still not fixed | 06:59 |
bandini | yeah am looking into that as well | 06:59 |
*** ramishra has quit IRC | 07:00 | |
*** mburned has joined #tripleo | 07:00 | |
*** anshul has joined #tripleo | 07:01 | |
hewbrocca | It's an interesting one... | 07:01 |
*** anshul is now known as Guest36238 | 07:01 | |
bandini | I have one theory, am trying to get a newton env installed, but of course some repo screwage is playing against me | 07:02 |
hewbrocca | shocking | 07:03 |
hewbrocca | OK, good stuff, thanks | 07:03 |
bandini | lol | 07:03 |
*** mikelk has joined #tripleo | 07:05 | |
*** ramishra has joined #tripleo | 07:05 | |
*** oshvartz has joined #tripleo | 07:06 | |
jaosorior | bandini: Yes, I usually just remove that package and try again | 07:07 |
bandini | jaosorior: oh boy. ok. trying | 07:08 |
* hewbrocca facepalm | 07:09 | |
bandini | jaosorior: can you get me /var/lib/pacemaker/cib/cib.xml from your newton install? | 07:09 |
*** tesseract has joined #tripleo | 07:09 | |
bandini | nope it fails afterwards as the undercloud tries to reinstall it and it barfs | 07:10 |
*** mcornea has joined #tripleo | 07:11 | |
*** jpich has joined #tripleo | 07:12 | |
bandini | asked in rdo as well | 07:12 |
*** coolsvap has quit IRC | 07:15 | |
jaosorior | bandini, will do | 07:15 |
*** coolsvap has joined #tripleo | 07:15 | |
*** pcaruana has joined #tripleo | 07:17 | |
bandini | jaosorior: Thanks, dude ;) | 07:18 |
jaosorior | hell yeah | 07:19 |
*** ifarkas has joined #tripleo | 07:19 | |
*** ramishra has quit IRC | 07:19 | |
bandini | lol | 07:19 |
*** ramishra has joined #tripleo | 07:19 | |
jaosorior | bandini: http://pastebin.com/hQnf8VZM | 07:23 |
jaosorior | but it's not the same environment as the failure. like I said, I got that from before it started | 07:23 |
jaosorior | maybe you could compare that though | 07:23 |
*** jpena|off is now known as jpena | 07:23 | |
*** hjensas__ has joined #tripleo | 07:23 | |
bandini | jaosorior: yeah as soon as I get an install working | 07:25 |
*** Guest61676 is now known as leanderthal | 07:25 | |
*** ramishra has quit IRC | 07:26 | |
jaosorior | or poke ccamacho when he's back online | 07:26 |
bandini | yeah that is probably faster :D | 07:27 |
*** ramishra has joined #tripleo | 07:28 | |
jaosorior | hewbrocca: Do you know if we're deploying cinder over httpd? | 07:28 |
jaosorior | marios ^^ | 07:29 |
*** shardy has joined #tripleo | 07:30 | |
*** ramishra has quit IRC | 07:32 | |
jaosorior | shardy hey dude, quick question; do you know if we're deploying cinder over httpd? | 07:32 |
*** ohamada has joined #tripleo | 07:33 | |
hewbrocca | jaosorior: Hmm no I really have no idea | 07:34 |
bandini | jaosorior: I believe we are not (unless something changed very recently) | 07:34 |
shardy | jaosorior: I don't think we are - it should be easy enough to check via ps ax | grep cinder on a controller? | 07:34 |
shardy | and/or looking at the httpd conf | 07:35 |
*** ccamacho has joined #tripleo | 07:35 | |
ccamacho | Morning guys! | 07:36 |
openstackgerrit | Oded Shvartz proposed openstack/tripleo-common: overcloud-odl : add new image file definition https://review.openstack.org/266881 | 07:36 |
jaosorior | ccamacho: hey dude, what's up? | 07:37 |
bandini | shardy: for the record, I narrowed down the heat issue I mentioned yesterday. It seems os-collect-config on the nodes gets confused after we yum update to mitaka (from liberty) | 07:37 |
bandini | not sure yet why, but let's call it progress ;) | 07:37 |
*** ramishra has joined #tripleo | 07:37 | |
shardy | bandini: ah, well good to know you've got a handle on it | 07:37 |
shardy | the o-c-c thing doesn't sound good tho | 07:37 |
bandini | yeah I need to narrow down what is going on exactly | 07:38 |
ccamacho | jaosorior, the upgrades issue on CI: yesterday I managed to reproduce the error with an old patch that was passing CI, so I don't think it is related to THT or puppet-tripleo | 07:38 |
shardy | bandini: Ok, well thanks for looking into it :) | 07:38 |
bandini | shardy: let's see how it goes ;) | 07:38 |
jaosorior | ccamacho: Do you still have that environment up and running? | 07:39 |
jaosorior | ccamacho: If so, would it be possible for you to fetch /var/lib/pacemaker/cib/cib.xml from a controller? | 07:39 |
*** ebarrera has joined #tripleo | 07:42 | |
*** dmacpher has quit IRC | 07:43 | |
*** zoli_gone-proxy is now known as zoliXXL | 07:43 | |
*** zoliXXL is now known as zoli|wfh | 07:45 | |
marios | jaosorior: o/ no we are not afaik | 07:45 |
*** ramishra has quit IRC | 07:46 | |
*** ramishra has joined #tripleo | 07:46 | |
ccamacho | jaosorior, I have the environment with the error deployed | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for keystone https://review.openstack.org/327029 | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for heat https://review.openstack.org/327069 | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry https://review.openstack.org/327473 | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ https://review.openstack.org/327482 | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api https://review.openstack.org/328859 | 07:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Add fact to get the fqdn for a host in the different networks https://review.openstack.org/329299 | 07:47 |
ccamacho | jaosorior, sure, give a sec | 07:47 |
*** olap has joined #tripleo | 07:49 | |
ccamacho | jaosorior, http://paste.openstack.org/show/516489/ | 07:50 |
*** dtantsur|afk is now known as dtantsur | 07:50 | |
jaosorior | bandini^^ | 07:53 |
bandini | looking, thanks | 07:54 |
*** numans has quit IRC | 07:54 | |
*** saneax is now known as saneax_AFK | 07:56 | |
jaosorior | bandini https://www.diffchecker.com/5jgcdjda | 07:59 |
bandini | jaosorior: http://acksyn.org/files/tripleo/juan-working.pdf and http://acksyn.org/files/tripleo/carlos-broken.pdf | 08:00 |
bandini | I need to see stuff ;) | 08:00 |
jaosorior | ok | 08:00 |
ccamacho | yeahp, from yesterday's checks there might be something smelly with rabbit... | 08:00 |
jaosorior | so the main difference also is that neutron now depends on openstack-core | 08:00 |
jaosorior | wait | 08:01 |
jaosorior | no | 08:01 |
jaosorior | that neutron doesn't depend anymore on openstack-core | 08:01 |
bandini | there are quite a few constraints that went away | 08:01 |
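The comparison being done here by eye (diffchecker, PDFs) can also be sketched in a few lines of Python. This is only an illustration, assuming the ordering constraints of interest are stored as `rsc_order` elements with `first` and `then` attributes in cib.xml; constraints expressed through resource sets are ignored. The resource ids in the sample fragments are made up for the example and are not taken from either environment.

```python
import xml.etree.ElementTree as ET


def order_constraints(cib_xml):
    """Return the set of (first, then) pairs for every rsc_order in a CIB XML string."""
    root = ET.fromstring(cib_xml)
    return {
        (el.get("first"), el.get("then"))
        for el in root.iter("rsc_order")
        if el.get("first") and el.get("then")  # skip resource-set style constraints
    }


def removed_constraints(working_xml, broken_xml):
    """Ordering constraints present in the working CIB but missing from the broken one."""
    return sorted(order_constraints(working_xml) - order_constraints(broken_xml))


# Hypothetical, heavily trimmed CIB fragments (ids invented for illustration).
WORKING = """<cib><configuration><constraints>
  <rsc_order id="o1" first="openstack-core-clone" then="neutron-server-clone"/>
  <rsc_order id="o2" first="rabbitmq-clone" then="openstack-core-clone"/>
</constraints></configuration></cib>"""

BROKEN = """<cib><configuration><constraints>
  <rsc_order id="o2" first="rabbitmq-clone" then="openstack-core-clone"/>
</constraints></configuration></cib>"""

if __name__ == "__main__":
    for first, then in removed_constraints(WORKING, BROKEN):
        print(f"constraint went away: {first} -> {then}")
```

Against real files, the two strings would be read from the cib.xml fetched from each controller instead of the inline samples.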
bandini | working on installing my newton env | 08:01 |
ccamacho | my CI was deployed with this patch https://review.openstack.org/#/c/328361/ | 08:02 |
ccamacho | which was the last patch without the error | 08:02 |
ccamacho | if you want access to the environment just send me the public keys | 08:03 |
jaosorior | without? | 08:04 |
jaosorior | thought you had an environment with the error | 08:04 |
ccamacho | that env has the error | 08:04 |
*** aufi has joined #tripleo | 08:05 | |
jaosorior | alright | 08:05 |
*** coolsvap_ has joined #tripleo | 08:06 | |
*** coolsvap_ has quit IRC | 08:06 | |
*** osp has quit IRC | 08:06 | |
*** coolsvap_ has joined #tripleo | 08:07 | |
ccamacho | but my last test yesterday was to deploy the patch just before the first error on the upgrades job, to see if it is related to THT or puppet-tripleo, and the error was reproduced.. so from our last tests yesterday it might be an error caused by some rabbit problem.. | 08:07 |
*** athomas has joined #tripleo | 08:07 | |
*** coolsvap has quit IRC | 08:07 | |
*** jaosorior has quit IRC | 08:07 | |
ccamacho | bandini how did you create this? "http://acksyn.org/files/tripleo/juan-working.pdf and http://acksyn.org/files/tripleo/carlos-broken.pdf" | 08:07 |
*** jaosorior has joined #tripleo | 08:08 | |
*** remix_auei is now known as remix_tj | 08:09 | |
*** hjensas_ has joined #tripleo | 08:11 | |
*** chem```` has joined #tripleo | 08:12 | |
*** chem``` has quit IRC | 08:14 | |
*** hjensas__ has quit IRC | 08:14 | |
chem```` | ccamacho: hello, has a solution to the CI problem been found ? | 08:19 |
*** chem```` is now known as chem | 08:20 | |
ccamacho | chem````: not yet man.. | 08:20 |
*** jprovazn has quit IRC | 08:21 | |
chem | ccamacho: when you deployed your test env, did you use the git versions of all the puppet modules, as you described in your documentation? | 08:22 |
chem | ccamacho: I just want to make sure I have the right env to do the tests | 08:22 |
openstackgerrit | yolanda.robla proposed openstack/tripleo-quickstart: Allow to specify templates path on overcloud deployment https://review.openstack.org/329556 | 08:24 |
*** thegodfather is now known as fabbione | 08:26 | |
ccamacho | chem yeahp, you can use the latest master patch or any prior one | 08:26 |
*** paramite has joined #tripleo | 08:27 | |
ccamacho | chem also, if you want, you can have access to my env so you don't have to invest time deploying | 08:27 |
*** apetrich has quit IRC | 08:28 | |
*** lucas-afk is now known as lucasagomes | 08:28 | |
chem | ccamacho: one of my ideas was to bisect the puppet-tripleo/tht code to find out if it's something there | 08:28 |
chem | ccamacho: in the meantime I accept your proposal :) | 08:29 |
ccamacho | what I did was to deploy the latest patch (failed) and the last patch without the error on CI (also failed), so I don't think it is related to tripleo-heat-templates or puppet-tripleo | 08:30 |
chem | https://launchpad.net/%7Esofer-athlan-guyot/+sshkeys | 08:30 |
chem | ccamacho:^ | 08:30 |
chem | ccamacho: ha oki, thanks for the information | 08:31 |
chem | ccamacho: when you say "last patch" you mean tht/puppet-tripleo combo ? | 08:31 |
*** apetrich has joined #tripleo | 08:32 | |
ccamacho | chem yeahp | 08:32 |
openstackgerrit | Karthik S proposed openstack/tripleo-specs: New Spec: tripleo-ovs-dpdk https://review.openstack.org/313871 | 08:36 |
openstackgerrit | Sanjay Upadhyay proposed openstack/tripleo-specs: new spec: tripleo-sriov https://review.openstack.org/313872 | 08:43 |
*** saneax_AFK is now known as saneax | 08:44 | |
*** coolsvap_ is now known as coolsvap | 08:44 | |
*** coolsvap has quit IRC | 08:45 | |
*** coolsvap has joined #tripleo | 08:45 | |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Add infrstucture scripts to prepare rh2 https://review.openstack.org/295243 | 08:46 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: [WIP] Add support for OVB based CI https://review.openstack.org/329521 | 08:46 |
*** derekh has joined #tripleo | 08:47 | |
bandini | ccamacho: https://github.com/mbaldessari/pcs/tree/wip-graph-support | 08:48 |
hewbrocca | derekh: hey, how did rh2 go yesterday | 08:48 |
derekh | hewbrocca: good, all up again now, won't be reinstalling it again, just doing some sanity tests now and then will submit a patch to add it to infra | 08:48 |
ccamacho | bandini, thanks! | 08:49 |
derekh | hewbrocca: I redeployed it last night and recorded the process, started editing the recording to compact it a bit, 3 hours of editing and I'm only half way through .... | 08:49 |
derekh | hewbrocca: should finish the recording before the end of the week | 08:50 |
hewbrocca | derekh: that is really excellent | 08:54 |
chem | bandini: what is this pcs graph support branch doing ? | 08:55 |
chem | bandini: never mind ... I've read the commit. | 08:56 |
hewbrocca | derekh: What's the next step, actually enlisting the cloud in nodepool? | 08:56 |
chem | bandini: is this the tool that creates that http://acksyn.org/files/tripleo/wsgi-openstack-core.pdf | 08:57 |
derekh | hewbrocca: 1. submit a patch to infra to add the cloud to nodepool, 2. add an experimental OVB based job, 3. get it working, 4. switch the job to voting, 5. remove the existing jobs running on rh1, 6. move the rack etc... | 08:58 |
*** mgould|afk is now known as mgould | 08:59 | |
*** jprovazn has joined #tripleo | 08:59 | |
hewbrocca | very good | 09:00 |
*** dtantsur is now known as dtantsur|brb | 09:00 | |
shardy | derekh: Is it possible to have more than one experimental job - we'd just have multiple jobs in the "check experimental" pipeline? | 09:00 |
shardy | derekh: I'd like to add a job that enables heat convergence for check experimental | 09:00 |
derekh | shardy: do you mean more than one experimental job running on rh2? | 09:01 |
hewbrocca | jistr: shardy is making tempting promises of rolling update support for SoftwareDeployments | 09:01 |
derekh | shardy: or just in general, are multiple experimental jobs possible? | 09:01 |
hewbrocca | assuming we can get jdob to do some work for a change | 09:01 |
shardy | derekh: No, I just meant generally | 09:01 |
jistr | hehe | 09:02 |
derekh | shardy: yup, I think that's fine, if it's a problem, we could make the rh2 one non experimental but non voting | 09:02 |
*** dmk0202 has joined #tripleo | 09:02 | |
shardy | hewbrocca, jistr: ramishra has kindly offered to pick it up today as I'm going to be on PTO for a couple of days | 09:02 |
hewbrocca | ahh, too bad, then I can't hassle jdob about it | 09:03 |
hewbrocca | :) | 09:03 |
hewbrocca | ramishra: thanks, all BS aside | 09:03 |
shardy | hewbrocca: don't worry, I'm currently writing a heat spec that we can probably hassle jdob to implement ;) | 09:03 |
hewbrocca | excellent | 09:04 |
shardy | (merge strategy for environments) | 09:04 |
jistr | hewbrocca, shardy: yea we discussed the upgrades in general yesterday. Heat is also getting YAQL support which should mean we wouldn't have to do hacks to process data, hopefully. Still, I think we need to have a PoC to see how we'd approach things in practice. We're capturing the discussion at https://etherpad.openstack.org/p/tripleo-composable-upgrades-discussion | 09:04 |
shardy | jistr: FYI the pattern will end up something like this: | 09:05 |
shardy | http://paste.fedoraproject.org/379924/46606788 | 09:05 |
shardy | jistr: yaql support already landed in newton heat FYI | 09:05 |
*** nijaba has quit IRC | 09:06 | |
*** ifarkas has quit IRC | 09:06 | |
bandini | chem: correct you do pcs -f <cib> constraint order graph > file.dot && dot -Tpdf file.dot > file.pdf | 09:07 |
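For readers without that pcs branch, the idea behind the graph can be approximated with a short Python sketch. This is not the code from the wip-graph-support branch, just an illustration under the assumption that ordering constraints appear in the CIB as `rsc_order` elements with `first` and `then` attributes; constraints that use resource sets are skipped. The function name and the sample resource ids are hypothetical.

```python
import xml.etree.ElementTree as ET


def order_graph_dot(cib_xml):
    """Render the rsc_order constraints of a CIB XML string as a Graphviz DOT digraph."""
    root = ET.fromstring(cib_xml)
    lines = ["digraph order_constraints {"]
    for el in root.iter("rsc_order"):
        first, then = el.get("first"), el.get("then")
        if first and then:  # resource-set style constraints have no first/then
            # An edge first -> then means "start `first` before `then`".
            lines.append(f'  "{first}" -> "{then}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical, heavily trimmed CIB fragment (ids invented for illustration).
SAMPLE_CIB = """<cib><configuration><constraints>
  <rsc_order id="o1" first="rabbitmq-clone" then="openstack-core-clone"/>
  <rsc_order id="o2" first="openstack-core-clone" then="openstack-heat-api-clone"/>
</constraints></configuration></cib>"""

if __name__ == "__main__":
    print(order_graph_dot(SAMPLE_CIB))
```

The printed text can be saved as file.dot and rendered with the same `dot -Tpdf file.dot > file.pdf` step mentioned above.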
jistr | shardy: ack, tahnks | 09:07 |
jistr | *thanks | 09:07 |
bandini | chem: I need to get some time to upstream it | 09:07 |
shardy | jistr: here's a simple example: http://paste.fedoraproject.org/379926/06802314 | 09:07 |
openstackgerrit | Merged openstack/diskimage-builder: Add python logger configuration https://review.openstack.org/325409 | 09:07 |
shardy | http://docs.openstack.org/developer/heat/template_guide/hot_spec.html#yaql | 09:07 |
chem | bandini: that's great, it just makes it plain and simple what the constraints were doing exactly. | 09:07 |
shardy | jistr: note the yaql docs are... not good, but I've been reading the code to fill out the gaps | 09:08 |
shardy | perhaps we can help with some docs patches to improve that | 09:08 |
shardy | mistral docs also contain some examples | 09:08 |
jistr | shardy: yea that's pretty awesome i think. That gives Heat incomparably more data mangling power than it had until now. | 09:08 |
shardy | \o/ | 09:08 |
openstackgerrit | Merged openstack/diskimage-builder: Introspect logging testing more https://review.openstack.org/328071 | 09:08 |
shardy | jistr: the nice thing is we can take a list, manipulate it, then str_replace (or list_join) supports transparently serializing to json | 09:09 |
*** apetrich has quit IRC | 09:10 | |
*** ooolpbot has joined #tripleo | 09:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 09:10 |
*** ooolpbot has quit IRC | 09:10 | |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 09:10 |
hewbrocca | grrr | 09:11 |
jistr | shardy: just wondering now, if we want to create a value processed through YAQL and we want to reuse it in multiple places in a template, would there be a way to do that without using a nested stack like in what you pasted? E.g. a resource of some sort... | 09:12 |
*** apetrich has joined #tripleo | 09:12 | |
*** nijaba has joined #tripleo | 09:13 | |
*** nijaba has joined #tripleo | 09:13 | |
jistr | shardy: I can imagine we could have an OS::TripleO::Yaql type that could accept a `params` array and a `yaql` string and give out a `result` output, but it might be useful to support something in that sense out of the box | 09:13 |
jistr | just an idea though | 09:13 |
shardy | jistr: Yeah there has been discussion on that, for now we'd have to either accept some duplication or do as you say and store the data in a nested stack | 09:15 |
shardy | jistr: there was also discussion of an OS::Heat::Value native resource type | 09:16 |
shardy | I thought there was even a patch but I can't find it atm | 09:16 |
shardy | implementing such a thing would be incredibly easy tho | 09:17 |
jistr | shardy: yea i thought something like this https://paste.fedoraproject.org/379930/66068577/ | 09:17 |
shardy | Yeah, that would work, but it adds to our overload-of-nested-stacks problem | 09:18 |
shardy | probably fine as a first step tho | 09:18 |
jistr | and OS::Heat::Value sounds good to me by the name of it :) Something to get around the fact that we don't have variables available, just resources. So we'd have a resource that represents a variable. | 09:18 |
jistr | and OS::Heat::Value could get rid of the nested stack overload perhaps if implemented within Heat :) | 09:19 |
*** milan has quit IRC | 09:20 | |
shardy | jistr: see this thread for context: http://lists.openstack.org/pipermail/openstack-dev/2016-April/091432.html | 09:22 |
jistr | thanks | 09:22 |
shardy | we'll have to check on the current status of any implementation | 09:22 |
*** florianf has quit IRC | 09:23 | |
*** electrofelix has joined #tripleo | 09:27 | |
*** fzdarsky|afk has joined #tripleo | 09:27 | |
*** apetrich has quit IRC | 09:28 | |
*** apetrich has joined #tripleo | 09:28 | |
jistr | shardy: interesting read. Maybe the advantage of OS::Heat::Value over computable parameter defaults would be the ability to reference other resources within the same stack. | 09:32 |
jistr | ... read their outputs for example | 09:32 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart: Allow passing extra arguments to overcloud deploy script https://review.openstack.org/328970 | 09:33 |
mgould | morning everyone | 09:33 |
mgould | could someone please do a release of instack-undercloud stable/mitaka, so one of my colleagues can QE introspection of UEFI-only nodes? | 09:35 |
shardy | jistr: yep that's true | 09:36 |
*** tosky has joined #tripleo | 09:36 | |
*** florianf has joined #tripleo | 09:37 | |
shardy | mgould: sure - FYI note that it is possible for anyone to propose a release to openstack/releases now | 09:37 |
*** apetrich has quit IRC | 09:37 | |
mgould | shardy: excellent, thanks! | 09:37 |
shardy | we just need to look at the most recent passing stable periodic job and take the hashes from there | 09:37 |
*** apetrich has joined #tripleo | 09:38 | |
*** hjensas_ has quit IRC | 09:38 | |
shardy | derekh: did we ever reinstate the periodic cistatus anywhere? | 09:38 |
shardy | I can pull the results via a script but I wasn't sure if there was an easier interface like the tripleo.org page we had previously | 09:39 |
derekh | shardy: nope but sshnaidm has a different status page that can be used, gimme a sec and I'll find it | 09:39 |
derekh | shardy: http://status-tripleoci.rhcloud.com/ | 09:40 |
derekh | brb | 09:40 |
shardy | derekh: thanks! | 09:41 |
*** fzdarsky|afk is now known as fzdarsky | 09:43 | |
*** mbound has joined #tripleo | 09:56 | |
*** ifarkas has joined #tripleo | 09:57 | |
*** tosky has quit IRC | 09:59 | |
*** milan has joined #tripleo | 10:00 | |
*** dciabrin has joined #tripleo | 10:01 | |
*** tosky has joined #tripleo | 10:06 | |
*** sambetts|afk is now known as sambetts | 10:09 | |
*** dtantsur|brb is now known as dtantsur | 10:13 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-docs: Composable roles within services Tutorial https://review.openstack.org/311512 | 10:14 |
*** akrivoka has joined #tripleo | 10:14 | |
*** dciabrin has quit IRC | 10:17 | |
shardy | Simple docs patch that looks ready to merge if anyone has a moment: https://review.openstack.org/#/c/329699 | 10:17 |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates: DO NOT MERFGE - debug why upgrade job fails https://review.openstack.org/330069 | 10:19 |
*** noslzzp has joined #tripleo | 10:20 | |
bandini | jistr, ccamacho, EmilienM: [re upgrade jobs failing] I pushed https://review.openstack.org/330069 because I want to verify if we are hitting https://bugzilla.redhat.com/show_bug.cgi?id=1327469 | 10:20 |
openstack | bugzilla.redhat.com bug 1327469 in pacemaker "pengine wants to start services that should not be started" [Urgent,New] - Assigned to kgaillot | 10:20 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient: WIP Use Mistral for baremetal introspection https://review.openstack.org/327780 | 10:22 |
*** sirushti has quit IRC | 10:26 | |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient: WIP Use Mistral for baremetal introspection https://review.openstack.org/327780 | 10:26 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient: Use Mistral for baremetal introspection https://review.openstack.org/327780 | 10:29 |
*** jtomasek_ has joined #tripleo | 10:30 | |
*** dsariel has quit IRC | 10:31 | |
*** mikelk has quit IRC | 10:32 | |
*** akrivoka has quit IRC | 10:32 | |
*** sirushti has joined #tripleo | 10:33 | |
ccamacho | bandini ack | 10:33 |
shardy | derekh, sshnaidm: http://status-tripleoci.rhcloud.com/ indicates periodic-tripleo-ci-centos-7-ha-mitaka is constantly failing, and there are only 4 results, is that right? | 10:36 |
sshnaidm | shardy, I don't include there passed results (yet) | 10:36 |
shardy | do we have another script that can summarize all the periodic jobs run? | 10:36 |
shardy | sshnaidm: heh, passed results are the only thing I'm interested in :) | 10:37 |
sshnaidm | shardy, I see :) will add this | 10:37 |
shardy | I've been using tripleo-jobs-gerrit.py for non-periodic job results, it'd be good if we could get a similar script into tripleo-ci that supports the periodic jobs | 10:37 |
shardy | or, even better wire this back into the tripleo.org status pages | 10:38 |
shardy | every time we need to do a release, the first thing required is the latest periodic job pass for a particular branch | 10:38 |
shardy | getting that is kind of a hassle atm | 10:38 |
shardy | Can anyone direct me to the logs for the latest passed stable/mitaka periodic job? | 10:39 |
*** hjensas_ has joined #tripleo | 10:42 | |
*** apetrich has quit IRC | 10:46 | |
derekh | sshnaidm: also, can you remove the f22 jobs, they don't exist any longer | 10:50 |
sshnaidm | derekh, yeah, right | 10:50 |
derekh | shardy: http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ha-mitaka/d3b1c10/console.html | 10:51 |
shardy | derekh: thanks! | 10:51 |
marios | guys is anyone aware of a change in nova for the way we set the scheduler_driver or scheduler_host_manager for newton? (filed https://bugs.launchpad.net/tripleo/+bug/1593182 for now) | 10:52 |
openstack | Launchpad bug 1593182 in tripleo "failed openstack-nova-scheduler after updating undercloud from mitaka to newton packages" [Medium,Triaged] - Assigned to Marios Andreou (marios-b) | 10:52 |
*** akrivoka has joined #tripleo | 10:56 | |
*** ccamacho is now known as ccamacho|lunch | 10:56 | |
*** pkovar has quit IRC | 10:58 | |
*** jtomasek_ has quit IRC | 10:59 | |
jokke_ | hi | 10:59 |
openstackgerrit | Brad P. Crochet proposed openstack/puppet-tripleo: Add Mistral profiles https://review.openstack.org/323431 | 11:00 |
jokke_ | is it common that the gate-tripleo-ci-centos-7-upgrades job fails on overcloud controller resource exhaustion? | 11:02 |
*** olap has quit IRC | 11:04 | |
*** olap has joined #tripleo | 11:05 | |
openstackgerrit | Merged openstack/tripleo-quickstart: use environmental variables for ansible ssh configuration https://review.openstack.org/329124 | 11:06 |
*** dsariel has joined #tripleo | 11:06 | |
sshnaidm | derekh, can you please review it https://review.openstack.org/#/c/326055/ ? | 11:06 |
jistr | jokke_: hi, currently the -upgrades job fails altogether, i don't think we've pinned down the root cause yet, bandini pushed https://review.openstack.org/330069 to investigate | 11:08 |
jistr | jokke_: some info on that is here https://bugs.launchpad.net/tripleo/+bug/1592776 | 11:08 |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 11:08 |
openstackgerrit | Merged openstack/tripleo-quickstart: Adds nested blocks to skip steps if there are no overcloud VMs https://review.openstack.org/329617 | 11:09 |
jokke_ | jistr: I'm looking at the overcloud-controller-0 messages logs | 11:09 |
jokke_ | there is at least a note of high loads (over 10), after which pacemaker starts trying to fence the services off | 11:10 |
*** ramishra has quit IRC | 11:10 | |
jokke_ | there are a lot of high cpu load warnings from crmd (and tons of session starts for rabbitmq, which I dunno if it's normal) | 11:12 |
hewbrocca | jokke_: I would say it is not common -- this is a new development that we are trying to pin down | 11:13 |
jistr | hmm high load could be an issue, but regarding switching services off, it may be switching them off on purpose, as the failures seem to happen during a stack-update where we switch things off/on to apply config changes | 11:13 |
jokke_ | I doubt this is expected: Jun 15 13:15:55 localhost pengine[11468]: warning: Forcing ip-fd00.fd00.fd00.3000..10 away from overcloud-controller-0 after 1000000 failures (max=1000000) | 11:14 |
jokke_ | looking at these logs http://logs.openstack.org/18/311218/6/check-tripleo/gate-tripleo-ci-centos-7-upgrades/a985e87/logs/overcloud-controller-0/var/log/messages | 11:14 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-quickstart: Use quickstart.sh to manage venv in all ci-scripts https://review.openstack.org/330040 | 11:14 |
jistr | jokke_: yea that isn't expected i think | 11:14 |
jistr | https://bugs.launchpad.net/tripleo/+bug/1592776/comments/6 | 11:14 |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 11:14 |
*** ramishra has joined #tripleo | 11:16 | |
*** thrash|g0ne is now known as thrash | 11:20 | |
jokke_ | jistr: added my notes from the oc-controller to that bug | 11:24 |
jistr | thanks :) | 11:24 |
shardy | mgould: see https://review.openstack.org/330476 for the instack-undercloud stable/mitaka release | 11:27 |
*** tobias_fiberdata has joined #tripleo | 11:28 | |
shardy | coolsvap: Hey, if you'd like to help with stable releases, you might like to review the sha's in the CI results referenced there | 11:28 |
shardy | and figure out which other components are due releases (quite a few I suspect) | 11:28 |
*** noslzzp has quit IRC | 11:28 | |
shardy | http://logs.openstack.org/periodic/periodic-tripleo-ci-centos-7-ha-mitaka/d3b1c10/console.html#_2016-06-16_06_24_36_685 | 11:29 |
coolsvap | shardy, sure | 11:29 |
shardy | you can see the sha's used in the test from the package names | 11:29 |
shardy | EmilienM: ^^ FYI I proposed an instack-undercloud release | 11:29 |
shardy | I'll be on PTO for a couple of days, so if there are review changes required, it'd be good if you or someone else can push them :) | 11:30 |
shardy | coolsvap: Note that as in this case for instack-undercloud, several repos were tagged directly, and don't have deliverables in openstack/releases | 11:31 |
shardy | AIUI the right thing is just to add them and continue with versioning from the current tag | 11:31 |
*** lucasagomes is now known as lucas-hungry | 11:31 | |
mgould | shardy: thanks! | 11:36 |
* mgould looks | 11:36 | |
*** weshay has joined #tripleo | 11:37 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for heat https://review.openstack.org/327069 | 11:37 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry https://review.openstack.org/327473 | 11:37 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ https://review.openstack.org/327482 | 11:37 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api https://review.openstack.org/328859 | 11:37 |
openstackgerrit | Flavio Percoco proposed openstack-infra/tripleo-ci: Update Fedora Atomic URL https://review.openstack.org/330477 | 11:38 |
*** tobias_fiberdata has quit IRC | 11:43 | |
*** pkovar has joined #tripleo | 11:45 | |
coolsvap | shardy, got a couple of mins? i have some queries | 11:48 |
shardy | coolsvap: I'm about to drop out on PTO for the afternoon, but yes if they're quick queries ;) | 11:49 |
coolsvap | shardy, after looking at the logs i see some components like diskimage-builder need updates | 11:49 |
coolsvap | please let me know if i'm on the right track | 11:50 |
coolsvap | if yes, how are independent components updated in releases? | 11:50 |
shardy | coolsvap: there was some discussion on the ML about diskimage-builder being a special case, I'm not sure how that concluded | 11:50 |
openstackgerrit | yolanda.robla proposed openstack/tripleo-quickstart: Allow to specify templates path on overcloud deployment https://review.openstack.org/329556 | 11:50 |
shardy | so I'd hold off any dib releases until we can chat with some of the diskimage-builder-core folks | 11:50 |
*** jpena is now known as jpena|lunch | 11:50 | |
coolsvap | alright | 11:51 |
shardy | coolsvap: the main focus is to ensure we've got semi-recent tags on all the other tripleo specific repos | 11:51 |
shardy | https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/tripleo-jobs-gerrit.py#L17 | 11:52 |
*** ramishra has quit IRC | 11:52 | |
shardy | project list there, we can exclude the dib stuff and tripleo-ci | 11:52 |
coolsvap | maybe i need some more time, i will trouble EmilienM | 11:53 |
coolsvap | shardy, ^^ | 11:53 |
coolsvap | i am now looking at instack and i have the same query as diskimage-builder | 11:54 |
*** dprince has joined #tripleo | 11:54 | |
coolsvap | shardy, but i understood what you did with instack-undercloud | 11:56 |
*** noslzzp has joined #tripleo | 11:57 | |
openstackgerrit | Ihar Hrachyshka proposed openstack/tripleo-heat-templates: Stop passing charset=utf8 for neutron database connection option https://review.openstack.org/330490 | 11:57 |
shardy | coolsvap: we can release instack via the same method as for instack-undercloud I think | 11:58 |
shardy | (we could even do it in the same patch if there are dependencies between the two) | 11:59 |
coolsvap | shardy, so do i need to update it under independent or mitaka? | 11:59 |
coolsvap | that's my query | 11:59 |
openstackgerrit | Flavio Percoco proposed openstack/tripleo-heat-templates: Add StepConfig to docker compute-post.yaml https://review.openstack.org/330492 | 11:59 |
shardy | coolsvap: Honestly I'm not sure - we've aligned with the milestone releases from newton, so I just replicated that pattern for mitaka, and there is an existing instack-undercloud deliverable for liberty | 12:00 |
*** apetrich has joined #tripleo | 12:00 | |
coolsvap | i think in independent | 12:00 |
coolsvap | instack does not have stable branches | 12:00 |
shardy | coolsvap: good point, let's see how it's tagged in the governance repo | 12:01 |
coolsvap | tags: | 12:02 |
coolsvap | - release:cycle-with-intermediary | 12:02 |
coolsvap | - type:library | 12:02 |
shardy | Hmm, so we need to either cut branches or change that | 12:02 |
*** jschlueter is now known as jschlueter|afk | 12:02 | |
shardy | thanks for pointing out the inconsistency | 12:02 |
*** ramishra has joined #tripleo | 12:03 | |
shardy | I need to go now, perhaps we can sort this out next week | 12:03 |
shardy | or, please work with EmilienM and slagle_ to resolve it | 12:03 |
coolsvap | shardy, sure | 12:03 |
shardy | coolsvap: thanks! | 12:03 |
shardy | good to have more eyes on this :) | 12:03 |
*** shardy has quit IRC | 12:04 | |
*** MaxPC has joined #tripleo | 12:05 | |
*** jayg|g0n3 is now known as jayg | 12:08 | |
*** coolsvap has quit IRC | 12:10 | |
*** ramishra has quit IRC | 12:10 | |
*** ramishra has joined #tripleo | 12:12 | |
*** rhallisey has joined #tripleo | 12:15 | |
*** slagle_ is now known as slagle | 12:16 | |
*** ramishra has quit IRC | 12:16 | |
hewbrocca | bandini: any more ideas on this failure? | 12:16 |
*** ibravo2 has joined #tripleo | 12:20 | |
*** ibravo2 has quit IRC | 12:20 | |
*** jschlueter|afk is now known as jschlueter | 12:21 | |
*** ramishra has joined #tripleo | 12:21 | |
*** lucas-hungry is now known as lucasagomes | 12:22 | |
*** rcernin has quit IRC | 12:24 | |
*** myoung|afk has joined #tripleo | 12:26 | |
ccamacho|lunch | bandini, i'm still reading through the bugs; i pulled the changes from https://review.openstack.org/#/c/330069/3 to see why it is failing, but i'm not able to determine why the cluster is not starting... | 12:26 |
*** ccamacho|lunch is now known as ccamacho | 12:26 | |
Ng | hey folks, I have a really dumb question :) | 12:27 |
hewbrocca | Ng: Remember, there are no stupid questions... | 12:28 |
Ng | given an OS::Heat::SoftwareConfig script in a template, what actually executes that script? | 12:28 |
hewbrocca | Only stupid people | 12:28 |
hewbrocca | :D | 12:28 |
hewbrocca | jistr: ^^^ | 12:28 |
Ng | I was guessing it would be something in the os-* toolchain, but I've picked through them and if it's there, I'm not seeing it :) | 12:29 |
jistr | Ng: it's an OS::Heat::SoftwareDeployment or OS::Heat::SoftwareDeploymentGroup resource | 12:29 |
jistr | the difference is that *Group takes multiple servers to apply the config (script) on | 12:30 |
*** trown|outtypewww is now known as trown | 12:30 | |
*** tobias_fiberdata has joined #tripleo | 12:30 | |
*** pradk has joined #tripleo | 12:31 | |
jistr | Ng: here is an example https://github.com/openstack/tripleo-heat-templates/blob/4bd42ea849e6d50d16610377b5e89504f5a4f412/extraconfig/tasks/major_upgrade_pacemaker.yaml#L53-L65 | 12:31 |
jistr | Ng: or if you mean what runs the script on the target machine, it is os-collect-config + os-refresh-config i think | 12:33 |
Ng | jistr: so conveniently, that's actually what I'm looking at - we're reliably seeing Step2 not get executed. It just sits in CREATE_IN_PROGRESS | 12:34 |
*** jcoufal has joined #tripleo | 12:34 | |
EmilienM | good morning | 12:34 |
hewbrocca | hey, isn't that the same bug bandini and ccamacho hit yesterday? | 12:34 |
Ng | but yeah, I did mean the actual script on the target machine. I'm trying to work my way back up the stack | 12:34 |
Ng | hewbrocca: yep, I'm pitching in trying to track it down | 12:34 |
hewbrocca | cool | 12:35 |
Ng | and I figured I'd start from where does that script end up and what is supposed to execute it | 12:35 |
slagle | it's the 55-heat-config orc script | 12:35 |
hewbrocca | Seems like a reasonable approach | 12:35 |
Ng | slagle: awesome, thanks | 12:35 |
slagle | as long as occ has collected the deployment | 12:35 |
Ng | afaics Step2 shows up in the json that occ produces | 12:36 |
slagle | i usually do "ps axjf | grep -A 5 os-refresh-config" | 12:36 |
slagle | that will show you if it's actually running something or not | 12:37 |
Ng | it's not, and I've tried running it by hand and adding debugging and it doesn't seem to try and do any script execution | 12:38 |
Ng | interestingly, we don't seem to have a 55-heat-config anywhere in /usr/libexec/os-refresh-config/, yet the Step1 gets executed fine | 12:38 |
*** rcernin has joined #tripleo | 12:38 | |
ccamacho | morning EmilienM! | 12:38 |
EmilienM | jistr: do we have some progress on upgrade job failure? | 12:38 |
Ng | likely the key factor here is that before step2, a yum update runs, and if that is commented out, step2 does get called | 12:39 |
jistr | EmilienM: i didn't look into it personally, bandini posted further debugging patch https://review.openstack.org/#/c/330069 | 12:39 |
ccamacho | EmilienM bandini added to https://review.openstack.org/#/c/330069/ | 12:39 |
*** julim has joined #tripleo | 12:39 | |
Ng | not found a side-effect of that yet though, there aren't any maintainer scripts for orc/oac (occ isn't updated) | 12:40 |
jistr | EmilienM: the CI results aren't there yet | 12:40 |
*** rlandy has joined #tripleo | 12:40 | |
*** athomas has quit IRC | 12:40 | |
Ng | ok, just rebuilt the environment cleanly, without Step1 having run either, and 55-heat-config exists | 12:41 |
* Ng will poke further. thanks everyone :) | 12:41 | |
jistr | Ng: good catch | 12:42 |
hewbrocca | o noes | 12:42 |
hewbrocca | the 55-heat-config bug again | 12:42 |
EmilienM | jistr, bandini: see my comment here: https://review.openstack.org/#/c/330069/3/extraconfig/tasks/pacemaker_resource_restart.sh | 12:43 |
pradk | jistr, Hi, what do you think of https://review.openstack.org/#/c/330096/ | 12:43 |
EmilienM | just fyi, HTH | 12:43 |
*** bfournie has joined #tripleo | 12:46 | |
*** rodrigods has quit IRC | 12:46 | |
*** rodrigods has joined #tripleo | 12:46 | |
*** athomas has joined #tripleo | 12:46 | |
*** bfournie1 has joined #tripleo | 12:50 | |
*** bfournie has quit IRC | 12:50 | |
*** dciabrin has joined #tripleo | 12:51 | |
hewbrocca | jistr: which job is this where it breaks | 12:53 |
hewbrocca | the 55-heat-config thing | 12:53 |
hewbrocca | mburned: ^^^ | 12:53 |
*** rcernin has quit IRC | 12:54 | |
*** ramishra has quit IRC | 12:54 | |
hewbrocca | Ng, bandini this thing where Heat stops -- does it happen on a regular deploy or only on the upgrade job | 12:54 |
Ng | it's the upgrade | 12:55 |
jistr | hewbrocca: it's not a job, it's manual upgrade testing. We don't have upgrades covered by CI (despite having an upgrade job :) ). | 12:55 |
hewbrocca | Oh! | 12:55 |
hewbrocca | I see, OK | 12:55 |
Ng | ah yes, sorry, we're testing this manually | 12:55 |
hewbrocca | and upgrade testing of what, exactly? | 12:55 |
mburned | hewbrocca: jistr: https://bugzilla.redhat.com/show_bug.cgi?id=1278181 | 12:55 |
openstack | bugzilla.redhat.com bug 1278181 in openstack-heat-templates "55-heat-config shouldn't use /var/run for it's DEPLOYED_DIR" [Unspecified,Closed: errata] - Assigned to sbaker | 12:55 |
*** mikelk has joined #tripleo | 12:55 | |
mburned | hewbrocca: that was the issue we had to handle specifically in kilo based updates | 12:55 |
Ng | hewbrocca: liberty to mitaka. bandini has more detail than I do on this, I'm just trying to help out :) | 12:56 |
mburned | hewbrocca: we documented it a bit here: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux_OpenStack_Platform/7/html/Director_Installation_and_Usage/sect-Updating_the_Overcloud.html | 12:56 |
hewbrocca | liberty to mitaka | 12:57 |
*** Goneri has joined #tripleo | 12:57 | |
jistr | yea though the BZ is a bit different. The BZ is about the data of 55-heat-config disappearing, while now we have 55-heat-config itself disappearing | 12:57 |
hewbrocca | right | 12:57 |
mburned | but it should have only been a problem for that single version of heat-templates, not anything newer | 12:57 |
hewbrocca | seems like some mitaka package update must be removing it? | 12:57 |
*** fultonj has joined #tripleo | 12:57 | |
jistr | yea if it happens after the step where yum update is executed, it could be some faulty script in one of the RPMs | 12:58 |
*** ramishra has joined #tripleo | 12:58 | |
hewbrocca | jistr: and is it liberty->mitaka or OSP8->OSP9 | 12:59 |
jistr | hewbrocca: i think bandini was testing upstream | 13:00 |
matbu | hewbrocca: jistr yep we were testing upstream | 13:00 |
matbu | (well we are) | 13:00 |
hewbrocca | Might be time to go hassle #rdo | 13:00 |
*** cdearborn has joined #tripleo | 13:00 | |
*** jpena|lunch is now known as jpena | 13:03 | |
*** tzumainn has joined #tripleo | 13:05 | |
*** ramishra has quit IRC | 13:05 | |
*** ramishra has joined #tripleo | 13:05 | |
*** rcernin has joined #tripleo | 13:06 | |
jistr | matbu: is it an environment where i could take a peek too? | 13:06 |
matbu | jistr: yep | 13:07 |
Ng | I think it's openstack-tripleo-image-elements | 13:07 |
Ng | the rpm has a post-install that does an rsync with --delete against /usr/libexec/os-refresh-config | 13:08 |
*** cllewellyn_ has joined #tripleo | 13:11 | |
*** cllewellyn__ has joined #tripleo | 13:11 | |
Ng | that seems like it would be a bug - if you've built an image with elements that have orc scripts not in openstack-tripleo-image-elements, those orc scripts are going to be wiped out if you upgrade o-t-i-e | 13:13 |
Ng | bandini: ^^ (for when you're back) | 13:14 |
jistr | Ng: yea i think you're right, well spotted. BTW i have liberty downstream now deployed, and i don't have openstack-tripleo-image-elements installed. | 13:14 |
*** dciabrin has quit IRC | 13:15 | |
jistr | so one issue is that o-t-i-e RPM perhaps shouldn't be doing this | 13:15 |
jistr | and another issue is that o-t-i-e probably shouldn't be on overcloud | 13:16 |
Ng | it seems reasonable that it might want to remove its stale orc scripts | 13:16 |
Ng | but this is a fairly blunt hammer :) | 13:16 |
matbu | jistr: you hit the same issue with downstream atm ? | 13:16 |
jistr | https://paste.fedoraproject.org/380037/8299314/raw/ | 13:16 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for glance API and registry https://review.openstack.org/327473 | 13:17 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for RabbitMQ https://review.openstack.org/327482 | 13:17 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo: Enable TLS in the internal network for cinder-api https://review.openstack.org/328859 | 13:17 |
jistr | matbu: i just went through the undercloud upgrade, which seems to have gone fine. I'm not testing further yet b/c we probably miss a lot of things for the migrations downstream still. | 13:17 |
jistr | so t-i-e is being pulled in by instack-undercloud, which is being pulled in by python-tripleoclient | 13:18 |
*** zoli|wfh is now known as zoli|lunch | 13:18 | |
jistr | i'm not sure if we need any of those packages installed on overcloud... probably not? | 13:18 |
jistr | also having a hard python-tripleoclient dependency on instack-undercloud is a bit weird too | 13:19 |
openstackgerrit | Adriano Petrich proposed openstack/tripleo-quickstart: inject debug options on the under/overcloud images https://review.openstack.org/329999 | 13:19 |
matbu | jistr: i can try with a --exclude=t-i-e | 13:19 |
jistr | matbu: that would be cool. Also if we have a liberty overcloud before upgrading at some point, might be worth checking if openstack-tripleo-image-elements is installed at all. I wonder if we're introducing weird deps that pull in packages that have no business in overcloud. | 13:22 |
*** jcoufal has quit IRC | 13:23 | |
Ng | jistr: at least in the environment I'm using, pre-upgrade, o-t-i-e is installed | 13:23 |
jistr | the yum log says | 13:23 |
Ng | (just checked on overcloud-controller-0) | 13:23 |
jistr | Jun 16 12:59:51 Updated: openstack-tripleo-image-elements-5.0.0-0.20160613170807.5feb901.el7.centos.noarch | 13:23 |
jistr | so yea what Ng says | 13:24 |
slagle | i don't think t-i-e should be installed on the overcloud | 13:25 |
*** akshai has joined #tripleo | 13:26 | |
slagle | if it is, the images need to be built with the element-manifest element, otherwise as you've seen the rpm %post is going to wipe everything out | 13:26 |
*** akshai_ has joined #tripleo | 13:27 | |
matbu | jistr: ack, i'll look if it's installed on the overcloud before upgrading | 13:27 |
*** myoung|afk has quit IRC | 13:29 | |
jistr | slagle: trying to track down what pulls in python-tripleoclient but can't find mention of it anywhere in t-p-e or t-i-e | 13:30 |
*** akshai has quit IRC | 13:31 | |
*** ramishra has quit IRC | 13:31 | |
*** myoung|afk has joined #tripleo | 13:32 | |
hewbrocca | eurrgh | 13:33 |
hewbrocca | This image deployment thing | 13:33 |
hewbrocca | how was this going to help us again? | 13:33 |
matbu | hewbrocca: wow that make me think about something ... | 13:35 |
jistr | well tbh the overcloud deployment takes ~30 minutes and not ~3 hours like in Staypuft times with kickstart :) | 13:35 |
hewbrocca | That's good I suppose | 13:36 |
matbu | hewbrocca: a few weeks ago i tested the downstream upgrade (7 to 8) with tripleo-quickstart. With some images built by Wes i got the same issue (heat stuck); when i used the mburned images, it passed correctly | 13:36 |
jistr | matbu: huh that could be a lead. Some customization somewhere is building funky images. | 13:37 |
*** ramishra has joined #tripleo | 13:37 | |
matbu | hewbrocca: i never understood why ... i didn't look further because the official images were working. | 13:37 |
matbu | jistr: yep | 13:37 |
hewbrocca | mburned: ^^^ | 13:37 |
slagle | jistr: yea, i'd start with the overcloud image build log | 13:38 |
slagle | or, go to a liberty ci job from tripleo-ci, and download the host_info.txt for one of the oc nodes | 13:39 |
jistr | i have the "normal" images built with tripleo.sh, and i don't have t-i-e on my env. It's master though, not liberty. | 13:39 |
slagle | it has rpm -qa output in there | 13:39 |
mburned | internally, for osp 7/8 we build using openstack overcloud image build | 13:39 |
mburned | nothing special | 13:39 |
jistr | yea i also don't have t-i-e in OSP 8 env (downstream liberty) | 13:40 |
jistr | so it's something specific to the way we build RDO images probably | 13:40 |
matbu | jistr: which images are used for RDO? | 13:41 |
jistr | matbu, bandini: so you deployed with tripleo-quickstart, using downloaded images rather than built? maybe this? https://github.com/openstack/tripleo-quickstart/blob/master/config/release/liberty.yml | 13:41 |
*** tserong has quit IRC | 13:41 | |
matbu | jistr: yep for me, idk for bandini | 13:41 |
*** tserong has joined #tripleo | 13:42 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-docs: Upgrade documentation https://review.openstack.org/308985 | 13:43 |
trown | jistr: matbu, oh didnt realize we were troubleshooting RDO images | 13:43 |
*** lblanchard has joined #tripleo | 13:44 | |
jistr | trown: can we get to what builds them? | 13:45 |
trown | https://github.com/redhat-openstack/ansible-role-tripleo-image-build/blob/master/vars/default_package_list.yml are the packages that get pre-installed | 13:45 |
trown | I guess one of those is pulling in t-i-e? | 13:45 |
slagle | rdo uses its own code to build images? | 13:46 |
jistr | trown: it's python-tripleoclient -> instack-undercloud -> openstack-tripleo-image-elements | 13:46 |
trown | slagle: it uses the tripleo-common library to do the image building, but it pre-installs a bunch of packages since it is used for undercloud image as well | 13:47 |
trown | slagle: so as not to install 300 packages twice | 13:47 |
jistr | trown: i think we should remove python-tripleoclient from that list for overcloud, and only have it in undercloud | 13:48 |
*** egafford has joined #tripleo | 13:48 | |
trown | jistr: cool that is easy to do, there is an undercloud package install var for just that purpose, I will put up a patch shortly | 13:49 |
slagle | trown: maybe we could maintain the list upstream | 13:49 |
myoung|afk | trown: I'll watch for it (review) | 13:49 |
*** myoung|afk is now known as myoung | 13:49 | |
jistr | trown: thanks | 13:49 |
*** dciabrin has joined #tripleo | 13:49 | |
hewbrocca | well there you go | 13:49 |
weshay | :) | 13:49 |
hewbrocca | what's the chance this also has something to do with this pacemaker issue bandini is arguing with | 13:50 |
trown | slagle: ya I have been trying to think of an alternative where we don't have that list at all, and turn the overcloud image into an undercloud image **AFTER** it is built with DIB/tripleo-common | 13:50 |
hewbrocca | like remove a bunch of stuff? | 13:51 |
trown | ya remove stuff, add stack user, add stuff | 13:51 |
*** hewbrocca is now known as hewbrocca-afk | 13:51 | |
jistr | hewbrocca-afk: it's almost surely unrelated to the pacemaker issue, the pacemaker issue fails on -upgrade job where we build overcloud images through tripleo.sh | 13:52 |
*** jcoufal has joined #tripleo | 13:52 | |
matbu | jistr: which pacemaker issue did bandini hit? I think the latest issue was the heat hanging on step2 | 13:53 |
*** Larion has joined #tripleo | 13:53 | |
ccamacho | EmilienM: can't find anything useful. On the controller, neutron-server is not started after the update; after a "sudo systemctl restart neutron-server" it starts, but then I'm not able to finish the update, or I get another error. I think it's rabbit related, but I'm not sure how to test whether rabbit is actually working properly. | 13:55 |
trown | myoung: https://review.gerrithub.io/280655 | 13:55 |
*** dciabrin has quit IRC | 13:56 | |
ccamacho | EmilienM, jaosorior, chem any new clue? | 13:56 |
*** tobias_fiberdata has quit IRC | 13:56 | |
EmilienM | ccamacho: I haven't worked on it yes | 13:57 |
EmilienM | yet* | 13:57 |
EmilienM | bandini: any news? | 13:57 |
openstackgerrit | Harry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate https://review.openstack.org/329542 | 13:58 |
chem | ccamacho: I'm going back to the pacemaker analysis and following the dates across multiple log files. I've found that the ipaddr failure also happens in a "successful" ha-proxy update. I'm going to log that in the launchpad | 13:58 |
matbu | jistr: so yes, t-i-e is installed on liberty | 13:58 |
EmilienM | jistr: could you reproduce the pacemaker issue? | 13:59 |
jistr | pradk: i think a tweak is required at https://review.openstack.org/#/c/330096/ but otherwise i think it's good, thanks | 14:02 |
*** masco has quit IRC | 14:03 | |
*** tserong has quit IRC | 14:05 | |
jtomasek | I am seeing this error randomly on several service apis in a recent undercloud setup, is it a known thing? http://paste.openstack.org/show/516666/ | 14:06 |
pradk | jistr, so you mean instead of keystone_cli.users.find .. do service.find? | 14:08 |
*** tserong has joined #tripleo | 14:08 | |
jistr | pradk: yea i think that should handle it better. E.g. i don't think there's a 'cinderv2' user, so the 'cinderv2' initialization would always be retried if we looked for users instead. | 14:09 |
pradk | jistr, i see.. ok lemme see what call keystone client supports for services check | 14:10 |
*** fzdarsky is now known as fzdarsky|afk | 14:11 | |
*** dsariel has quit IRC | 14:12 | |
*** tobias_fiberdata has joined #tripleo | 14:14 | |
*** tserong has quit IRC | 14:15 | |
*** Larion has quit IRC | 14:17 | |
*** tserong has joined #tripleo | 14:18 | |
*** zoli|lunch is now known as zoli|wfh | 14:19 | |
openstackgerrit | Harry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate https://review.openstack.org/329542 | 14:20 |
*** dciabrin has joined #tripleo | 14:23 | |
EmilienM | jistr, chem, bandini: is there someone on the pacemaker issue in upgrade job? | 14:23 |
chem | EmilienM: well, I'm reading corosync log and deploying a fresh quickstart | 14:24 |
jistr | EmilienM: i'm tasked with OSP 9 primarily, looking into https://bugzilla.redhat.com/show_bug.cgi?id=1341350 now | 14:25 |
openstack | bugzilla.redhat.com bug 1341350 in rhel-osp-director "rhel-osp-director: registering the overcloud images fails on first attempt with "500 Internal Server Error: Failed to upload image 51672726-cc40-40ea-9ca0-1f8b2267313c (HTTP 500)"" [Unspecified,Assigned] - Assigned to jstransk | 14:25 |
EmilienM | jistr: right but our CI is currently blocked, I spent almost my day on it yesterday and didn't make much progress, /me really needs help | 14:26 |
EmilienM | jistr: I'm also tasked with other things but imho we should fix our CI first | 14:26 |
jistr | trying to look for https://review.openstack.org/#/c/330069 in zuul but can't find it | 14:27 |
jistr | we may need to recheck? | 14:27 |
EmilienM | I did recheck.. | 14:28 |
*** jistr is now known as jistr|mtg | 14:28 | |
*** ifarkas has quit IRC | 14:28 | |
EmilienM | what worries me is that our CI is red | 14:28 |
EmilienM | and nobody really cares | 14:28 |
*** hewbrocca-afk is now known as hewbrocca | 14:30 | |
hewbrocca | EmilienM: I wouldn't say that, we've been working on fixing it all day | 14:30 |
EmilienM | cool, I hope I was wrong then. | 14:31 |
chem | EmilienM: see my new comment on the launchpad, maybe you will see there something that eludes me | 14:32 |
EmilienM | ok | 14:32 |
pabelanger | EmilienM: I feel your pain. Not to pile on to the ci pipeline on tripleo, I found it difficult to contribute simply because of how long the tests were taking. | 14:33 |
*** ramishra has quit IRC | 14:34 | |
pabelanger | some feedback would be to break out testing into smaller portions vs an end-to-end testing model. | 14:34 |
hewbrocca | pabelanger: yes, slagle has done some significant work on that | 14:36 |
*** ramishra has joined #tripleo | 14:37 | |
hewbrocca | folks, I think we need to down tools until this upgrade job is green again | 14:38 |
EmilienM | trown: can we revert our last promotion? | 14:38 |
*** tserong has quit IRC | 14:38 | |
EmilienM | if it's related to the latest version of pacemaker or something? | 14:38 |
EmilienM | I propose we revert what is blocking us, and let bandini and other pacemaker gurus figure out what is going wrong | 14:39 |
matbu | jistr|mtg: bandini hewbrocca just removing the t-i-e package (and deps) unblocks it and it works \o/ | 14:39 |
hewbrocca | matbu: well that's good news | 14:40 |
hewbrocca | but I guess it doesn't fix the upgrade job, right? | 14:40 |
hewbrocca | or does it... | 14:40 |
*** pcaruana has quit IRC | 14:41 | |
matbu | hewbrocca: nope, because the tripleo ci job is about minor upgrades | 14:41 |
openstackgerrit | Harry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate https://review.openstack.org/329542 | 14:41 |
hewbrocca | yeah, I thought not | 14:42 |
hewbrocca | EmilienM: I think the problem is that we have not been able to identify at all, what commit causes the upgrade job to fail | 14:42 |
trown | matbu: myoung I think https://review.gerrithub.io/280666 is a more long term fix to get rid of package drift between RDO images and upstream | 14:43 |
EmilienM | I'm going to compare (again) the latest successful job and a recent failing job and see what differs | 14:43 |
EmilienM | I did it already but it was not super useful | 14:43 |
openstackgerrit | Pradeep Kilambi proposed openstack/python-tripleoclient: Run post deploy config on force https://review.openstack.org/330096 | 14:43 |
ccamacho | hewbrocca ++ yeahp, I don't think it's related to tht or puppet-tripleo, as I have deployed patches from last week.. and had the same problem.. I'm not able to get the error.. | 14:43 |
ccamacho | Emilien I did that already with a patch from last week; the upgrade failed with the same issue... | 14:44 |
chem | EmilienM: you had a link on the package diff, do you still have it ? | 14:44 |
hewbrocca | EmilienM: can you identify a task force to nail this down | 14:44 |
EmilienM | chem: yes but its not relevant | 14:44 |
*** trown is now known as trown|brb | 14:44 | |
EmilienM | it was https://www.diffchecker.com/f9hkwqmj | 14:44 |
hewbrocca | Name your people | 14:44 |
EmilienM | but I'm doing it again | 14:44 |
chem | EmilienM:thanks | 14:44 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart: Fix script creation modes to standard 755 https://review.openstack.org/330620 | 14:45 |
ccamacho | I can give you /etc /var/log from an OK env and a NOK /etc /var/log from the same env after the update... | 14:45 |
EmilienM | hewbrocca: I was wrong and it seems chem ccamacho are also working on it. Let's continue together now | 14:45 |
hewbrocca | OK EmilienM and you need bandini as well? | 14:45 |
EmilienM | ccamacho: yeah? that would be cool | 14:45 |
ccamacho | just a sec | 14:45 |
EmilienM | hewbrocca: oh yeah, for sure | 14:45 |
hewbrocca | OK | 14:45 |
myoung | trown, looked quickly and that seems like a rational solution to a real problem. I'll take a deeper look later on. | 14:45 |
hewbrocca | EmilienM: You might also want to grab beekhof | 14:47 |
EmilienM | ok, package diff: https://www.diffchecker.com/8qlic49z | 14:47 |
EmilienM | on left, successful job, on right, failing job | 14:47 |
chem | EmilienM: it's undercloud, do you have overcloud ? | 14:48 |
EmilienM | so there is dkms, tht, i-u and os-net-config | 14:48 |
*** cllewellyn_ has quit IRC | 14:48 | |
*** cllewellyn__ has quit IRC | 14:48 | |
EmilienM | right, that's undercloud | 14:48 |
EmilienM | damn, there is no rpm-qa logs on overcloud I think | 14:49 |
*** tserong has joined #tripleo | 14:49 | |
slagle | EmilienM: there is | 14:49 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Switch back to current-tripleo pre promotion https://review.openstack.org/111011 | 14:49 |
slagle | EmilienM: in /var/log/host_info.txt | 14:49 |
derekh | thats ^^ testing CI before we did the last promotion | 14:49 |
slagle | or host_info.log | 14:49 |
*** jaosorior has quit IRC | 14:49 | |
EmilienM | slagle: ok thanks, I missed that | 14:49 |
derekh | EmilienM: ^ | 14:49 |
EmilienM | oh awesome | 14:50 |
* EmilienM autoslap | 14:50 | |
*** panda has quit IRC | 14:50 | |
*** panda has joined #tripleo | 14:50 | |
ccamacho | EmilienM https://we.tl/EXKKlP4PjB | 14:51 |
*** trown|brb is now known as trown | 14:52 | |
EmilienM | ccamacho: excellent, thx, let me look now | 14:52 |
chem | I'm off for 30min | 14:52 |
ccamacho | you have there logs from both controller and compute before/after the upgrade | 14:52 |
ccamacho | all /etc and /var/logs | 14:53 |
ccamacho | EmilienM ^ | 14:53 |
EmilienM | excellent | 14:54 |
chem | I'm unoff ... wrong time | 14:55 |
EmilienM | ok I don't get it | 14:56 |
EmilienM | https://www.diffchecker.com/wnc1vjrp | 14:56 |
EmilienM | I need to try again, maybe I did wrong | 14:56 |
*** tserong has quit IRC | 14:56 | |
*** ramishra has quit IRC | 14:57 | |
*** adarazs is now known as adarazs_afk | 14:57 | |
openstackgerrit | Ana Krivokapic proposed openstack/tripleo-ui: Disable "Assign Nodes" link if no nodes are available https://review.openstack.org/330627 | 14:58 |
*** tremble has quit IRC | 14:58 | |
*** dtantsur is now known as dtantsur|bbl | 14:58 | |
openstackgerrit | Harry Rybacki proposed openstack/tripleo-quickstart: [WIP] Add scale to roles gate https://review.openstack.org/329542 | 14:58 |
*** ramishra has joined #tripleo | 14:59 | |
*** chem has quit IRC | 14:59 | |
*** chem has joined #tripleo | 15:00 | |
*** tserong has joined #tripleo | 15:02 | |
EmilienM | ok I have something more helpful | 15:03 |
EmilienM | https://www.diffchecker.com/qpmfhinu | 15:03 |
openstackgerrit | Merged openstack/tripleo-quickstart: Fix script creation modes to standard 755 https://review.openstack.org/330620 | 15:05 |
EmilienM | derekh, trown: what do you think if we come back to the previous repo we were using? | 15:05 |
EmilienM | it seems that all diff is related to the latest promotion | 15:05 |
*** ebarrera has quit IRC | 15:06 | |
trown | fine by me | 15:06 |
chem | EmilienM: this is a nightmare, so many packages ... | 15:06 |
EmilienM | chem: not so much, most of them are openstack related | 15:06 |
*** dciabrin has quit IRC | 15:07 | |
EmilienM | but yeah, it's not an easy one | 15:07 |
EmilienM | trown: how to proceed, by patching tripleo-ci? | 15:07 |
EmilienM | 70/8b/708bf15975d8bfb4b7bc9426a86369d82c0d4dd9_cbd0900e was working well | 15:09 |
*** ooolpbot has joined #tripleo | 15:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 15:10 |
*** ooolpbot has quit IRC | 15:10 | |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 15:10 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-puppet-elements: Add zaqar package to controller image https://review.openstack.org/330636 | 15:12 |
*** aufi has quit IRC | 15:14 | |
*** ramishra has quit IRC | 15:14 | |
*** coolsvap has joined #tripleo | 15:16 | |
*** ramishra has joined #tripleo | 15:16 | |
EmilienM | trown: how can we unpromote our CI? | 15:16 |
EmilienM | trown: should we do something in tripleo-ci/scripts/tripleo.sh ? | 15:16 |
*** jmiu_ has joined #tripleo | 15:17 | |
*** olap has quit IRC | 15:17 | |
trown | EmilienM: sorry on a call, I think I can just manually do it | 15:17 |
trown | derekh: wdyt? | 15:17 |
EmilienM | except if we have an immediate solution, I propose we do it | 15:18 |
*** fzdarsky|afk is now known as fzdarsky | 15:18 | |
*** jistr|mtg is now known as jistr | 15:19 | |
jistr | derekh: if you have the issue reproduced, you could try logging into the controller and run "crm_resource --wait -VVV" | 15:19 |
openstackgerrit | Sanjay Upadhyay proposed openstack/python-tripleoclient: Tripleoclient leaks temporary files https://review.openstack.org/330638 | 15:20 |
*** rbrady has quit IRC | 15:20 | |
jistr | that should repeatedly print some info (what resources it's waiting for to start or stop) which could help us debug that further | 15:20 |
derekh | trown: EmilienM I pushed a patch about 30 minutes ago to test the previous repo; if it works we could either merge the patch (or something similar) or switch back the current-tripleo links on the mirror and trunk servers | 15:20 |
derekh | trown: EmilienM https://review.openstack.org/#/c/111011/68 | 15:21 |
jistr | that's what bandini is trying to do with the CI patch https://review.openstack.org/#/c/330069/ | 15:21 |
EmilienM | derekh: excellent. Thanks. | 15:21 |
derekh | jistr: ack, will see how it goes, thanks | 15:21 |
*** ifarkas has joined #tripleo | 15:24 | |
ccamacho | jistr: for the CI issue [heat-admin@overcloud-controller-0 ~]$ sudo crm_resource --wait -VVV | 15:25 |
ccamacho | notice: LogActions:Start httpd:0(overcloud-controller-0) | 15:25 |
ccamacho | notice: LogActions:Start httpd:0(overcloud-controller-0) | 15:25 |
ccamacho | notice: LogActions:Start httpd:0(overcloud-controller-0) | 15:25 |
derekh | jistr: ccamacho I got this every 2 seconds | 15:27 |
derekh | notice: LogActions: Start openstack-nova-scheduler:0 (overcloud-controller-0) | 15:27 |
derekh | notice: LogActions: Start openstack-cinder-volume (overcloud-controller-0 - blocked) | 15:27 |
derekh | notice: LogActions: Start openstack-nova-api:0 (overcloud-controller-0) | 15:27 |
derekh | notice: LogActions: Start openstack-nova-conductor:0 (overcloud-controller-0) | 15:27 |
derekh | cinder-volume blocked? | 15:27 |
derekh | constraints say it's started after openstack-cinder-scheduler | 15:28 |
jistr | hmm so we probably are hitting https://bugzilla.redhat.com/show_bug.cgi?id=1327469 | 15:28 |
openstack | bugzilla.redhat.com bug 1327469 in pacemaker "pengine wants to start services that should not be started" [Urgent,New] - Assigned to kgaillot | 15:28 |
derekh | and further up the logs, | 15:29 |
derekh | Jun 16 14:03:34 overcloud-controller-0 systemd[1]: openstack-cinder-scheduler.service: main process exited, code=killed, status=14/ALRM | 15:29 |
*** rbrady has joined #tripleo | 15:31 | |
jistr | ccamacho: is your output from the stuck upgrade too, didn't it change to what derekh pasted after a while? | 15:33 |
EmilienM | jistr: I don't understand, if you look packaging diff, we don't have an update on pacemaker bits, or do we? | 15:33 |
*** openstackgerrit has quit IRC | 15:34 | |
EmilienM | jistr: https://www.diffchecker.com/qpmfhinu | 15:34 |
*** openstackgerrit has joined #tripleo | 15:34 | |
ccamacho | nop now is empty.. | 15:34 |
ccamacho | jistr ^ | 15:34 |
*** adarazs_afk is now known as adarazs | 15:35 | |
*** mikelk has quit IRC | 15:35 | |
ccamacho | jistr weird, after running sudo pcs status --full had a lot of services stopped (including httpd) now they are started | 15:37 |
ccamacho | .. | 15:37 |
*** milan has quit IRC | 15:37 | |
ccamacho | without doing anything... | 15:37 |
*** leanderthal is now known as leanderthal|afk | 15:38 | |
jistr | hmm yea... | 15:39 |
ansiwen_ | will in the end (after everything is composable) puppet/manifests/overcloud_compute.pp disappear completely? | 15:40 |
chem | ccamacho: pacemaker has its own life, and starts and stops services as it sees fit, look into /var/log/cluster/corosync.log | 15:40 |
jistr | derekh: could you please paste `pcs constraint show | grep nova` executed on a controller | 15:40 |
derekh | so could this be the problem, cinder scheduler not stopping when it should? | 15:41 |
derekh | Jun 16 14:02:34 overcloud-controller-0 systemd[1]: Stopping OpenStack Cinder Scheduler Server... | 15:41 |
derekh | Jun 16 14:03:34 overcloud-controller-0 systemd[1]: openstack-cinder-scheduler.service: main process exited, code=killed, status=14/ALRM | 15:41 |
derekh | jistr: yup | 15:41 |
ansiwen_ | and what is ::nova::compute::neutron, is that an independent service as well, EmilienM? | 15:41 |
EmilienM | ansiwen_: no | 15:42 |
openstackgerrit | Brad P. Crochet proposed openstack/python-tripleoclient: Add Zaqar password to deployment https://review.openstack.org/330650 | 15:42 |
EmilienM | it's nova.conf parameters for neutron | 15:42 |
EmilienM | but not used anywhere AFAIK | 15:42 |
derekh | jistr: http://paste.openstack.org/show/516688/ | 15:42 |
EmilienM | ansiwen_: https://github.com/openstack/puppet-nova/blob/master/manifests/compute/neutron.pp | 15:43 |
ansiwen_ | EmilienM: thanks | 15:43 |
ansiwen_ | EmilienM: so, will in the end puppet/manifests/overcloud_compute.pp disappear or will there something left? | 15:44 |
chem | jistr: it looks like it's missing the openstack-core-clone openstack-nova-consoleauth-clone colocation | 15:44 |
*** saneax is now known as saneax_AFK | 15:44 | |
ansiwen_ | ccamacho: is puppet-nova something you should also mention in your walkthrough? | 15:45 |
EmilienM | ansiwen_: disappear | 15:45 |
ccamacho | ansiwen_ puppet-nova nop | 15:45 |
EmilienM | ansiwen_: why that? | 15:45 |
jistr | chem: i'm not sure if that's needed though | 15:46 |
jistr | derekh: thanks, that's not what i was hoping for, it actually looks ok :) | 15:46 |
ansiwen_ | because it is another external repository beside tripleo-puppet-elements that seems to be imported by default? | 15:47 |
*** flaper87 has joined #tripleo | 15:47 | |
derekh | jistr: ok, want to log onto this box and have a look? | 15:47 |
ansiwen_ | EmilienM: ok, so, if I want to let overcloud_compute.pp disappear, what could I do next? | 15:48 |
*** mcornea has quit IRC | 15:48 | |
chem | jistr: but then it doesn't form a group with all the openstack-nova-* services | 15:48 |
*** ibravo has joined #tripleo | 15:48 | |
EmilienM | ansiwen_: you can sync with pradk for telemetry stuff | 15:49 |
chem | jistr: it's striking that it's the services under this branch that want to restart: http://file.rdu.redhat.com/~mbaldess/lp1569444/newton-jiri.pdf | 15:49 |
*** mcornea has joined #tripleo | 15:49 | |
*** mcornea has quit IRC | 15:49 | |
EmilienM | ansiwen_: we have ceilometer compute agent | 15:49 |
EmilienM | ansiwen_: i'll finish nova | 15:49 |
*** mcornea has joined #tripleo | 15:49 | |
ansiwen_ | ok, pradk, give me work please :-) | 15:50 |
*** tobias_fiberdata has quit IRC | 15:52 | |
ansiwen_ | ccamacho: did you read my answer? I thought if you mention tripleo-puppet-elements, maybe it's also worth mentioning other external repositories? | 15:52 |
EmilienM | ansiwen_: you can review my stuff https://review.openstack.org/#/q/topic:tripleo/nova/compute-composable | 15:52 |
ansiwen_ | ccamacho: because without EmilienM I would not have known to look at puppet-nova for some definitions | 15:53 |
*** tobias_fiberdata has joined #tripleo | 15:53 | |
EmilienM | dprince: when you get a moment, can you look & give feedback on https://review.openstack.org/#/q/topic:tripleo/nova/compute-composable please? | 15:53 |
*** panda is now known as panda|afk | 15:53 | |
jistr | derekh: yea that would be nice if you can PM me login details | 15:53 |
derekh | jistr: ack | 15:54 |
*** Larion has joined #tripleo | 15:56 | |
openstackgerrit | imain proposed openstack/tripleo-heat-templates: WIP: Containerized Services for Composable Roles https://review.openstack.org/330659 | 15:57 |
*** dmacpher has joined #tripleo | 15:57 | |
*** akshai has joined #tripleo | 15:58 | |
*** tobias_fiberdata has quit IRC | 15:59 | |
*** zoli|wfh is now known as zoli|gone | 15:59 | |
ccamacho | hey ansiwen_ as the tutorial is based on a simple patch, it does not cover puppet-nova | 15:59 |
dprince | EmilienM: yep | 15:59 |
pradk | ansiwen_, sure what are you looking for in particular? | 16:00 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates: TEST: do not merge. Testing if colocation help. https://review.openstack.org/330661 | 16:00 |
pradk | ansiwen_, did you wrap up the libvirt realtime support thing? i see you on that in trello | 16:00 |
*** zoli|gone is now known as zoli_gone-proxy | 16:00 | |
*** tesseract has quit IRC | 16:00 | |
*** xinwu has joined #tripleo | 16:01 | |
EmilienM | pradk: what is that? | 16:01 |
*** dprince has quit IRC | 16:01 | |
*** akshai_ has quit IRC | 16:01 | |
chem | jistr: I'm testing adding a colocation constraint, just to see. | 16:01 |
pradk | EmilienM, ansiwen_ probably knows better on the details of that | 16:01 |
*** dprince has joined #tripleo | 16:02 | |
jistr | chem: ack, thanks. This is a bug in pacemaker, previously we've had luck with working around it by adding more constraints, but ordering constraints were enough IIRC. | 16:02 |
ccamacho | jistr, is normal that this service is the only one stopped? [heat-admin@overcloud-controller-0 ~]$ sudo pcs status --full | grep -i Stopped | 16:02 |
ccamacho | httpd(systemd:httpd):(target-role:Stopped) Stopped | 16:02 |
ccamacho | Stopped: [ overcloud-controller-0 ] | 16:02 |
EmilienM | chem: did we remove it when we landed composable nova? | 16:03 |
*** Larion has quit IRC | 16:04 | |
jistr | ccamacho: probably not, but depends where the operations failed and if any recovery was done etc. I'm looking at derekh's deployment and it's not in the same state as yours probably. | 16:04 |
chem | EmilienM: I'm practically positive that no, it was not removed, but let me check again | 16:04 |
ansiwen_ | pradk: no, I have no idea yet, eglynn created that trello card. don't know the details about it yet | 16:04 |
jistr | EmilienM: i don't think so. It was my first thought, that composable nova removed some constraints, but i wasn't able to find anything like that. | 16:04 |
jistr | that's why i asked derekh to paste those constraints before :) | 16:05 |
ansiwen_ | pradk: I'm just looking for composable role tasks, because this has high prio at the moment. | 16:05 |
EmilienM | jistr: no, I didn't remove any constraint I think I kept it for later :) | 16:05 |
*** lucasagomes is now known as lucas-afk | 16:06 | |
pradk | ansiwen_, ah ok sure.. for ceilo i have started something, but only converted the controller stuff.. there is still compute as ceilo-compute agent runs on compute.. you can look into that if you want | 16:06 |
*** milan has joined #tripleo | 16:06 | |
jistr | what keeps me puzzled is why we started hitting it all of a sudden. But it might be that we just shuffled around puppet code so maybe some order of pcs calls changed which makes the pcs bug fire up in a different way... | 16:06 |
ansiwen_ | pradk: sounds good | 16:07 |
EmilienM | chem: oh wait I found something | 16:07 |
ansiwen_ | pradk: any related pointers? shall I look at your ceilo controller parts? | 16:08 |
EmilienM | require => Pacemaker::Resource::Ocf['openstack-core'], is missing for some nova resource | 16:08 |
*** hewbrocca is now known as hewbrocca-afk | 16:08 | |
pradk | ansiwen_, see puppet/manifests/overcloud_compute.pp and look for ceilo specific config.. thats what we need to convert to composability | 16:08 |
pradk | ansiwen_, should be quite easy i think | 16:08 |
chem | EmilienM: I know I pinged you about this. But it's not relevant for pacemaker config | 16:08 |
pradk | ansiwen_, puppet/compute.yaml has the params | 16:09 |
jistr | chem, EmilienM: as long as it's not missing from the constraint definitions, it should probably be ok... | 16:09 |
ansiwen_ | pradk: easy is good for me :-) | 16:10 |
*** ansiwen_ is now known as ansiwen | 16:10 | |
chem | EmilienM: the heat orchestration ensures that ocf['openstack-core'] is created before | 16:10 |
pradk | ansiwen_, and whatever hiera data is in puppet/hieradata/compute.yaml | 16:10 |
*** ooolpbot has joined #tripleo | 16:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 16:10 |
*** ooolpbot has quit IRC | 16:10 | |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,Confirmed] | 16:10 |
*** akshai_ has joined #tripleo | 16:10 | |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: profiles/nova/pacemaker/consoleauth: add missing require https://review.openstack.org/330669 | 16:10 |
EmilienM | chem: ^ | 16:10 |
ansiwen | pradk: I don't get that separation into hieradata, btw... | 16:10 |
EmilienM | chem: other services are good, I checked | 16:10 |
derekh | jistr: btw, cinder-scheduler is running on the controller, I did that, when I noticed that cinder-volume was blocked | 16:11 |
*** myoung is now known as myoung|biab | 16:12 | |
chem | EmilienM: when I saw the error it was on the top of my hit list. But I cannot see how it can mess around there ... | 16:12 |
*** akshai has quit IRC | 16:13 | |
jistr | derekh: ack thanks, i already cycled the resources status start/stop so it should be ok now i think anyway | 16:13 |
*** xinwu has quit IRC | 16:15 | |
*** myoung|biab has quit IRC | 16:17 | |
* bandini back | 16:18 | |
*** dprince has quit IRC | 16:20 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image https://review.openstack.org/330166 | 16:20 |
*** dprince has joined #tripleo | 16:21 | |
*** ramishra has quit IRC | 16:22 | |
EmilienM | chem: I'm sure it won't help but let's give it a try | 16:22 |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image https://review.openstack.org/330166 | 16:25 |
chem | EmilienM: I'm kinda in the same mood ... let's try some stuff | 16:25 |
*** ramishra has joined #tripleo | 16:26 | |
bandini | EmilienM, jistr: I see that the upgrade job was not triggered on my review. ci overloaded or something else? | 16:28 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates: Work around Nova blocking openstack-core stop https://review.openstack.org/330682 | 16:28 |
jistr | bandini, EmilienM, chem: ^^ this helped on derekh's environment | 16:29 |
jistr | bandini: indeed i think we're hitting the bug 'pengine wants to start its favorite services' | 16:29 |
jistr | what's funny is that this time it doesn't seem that we're missing any constraints | 16:30 |
EmilienM | wait | 16:30 |
chem | jistr: hehe | 16:30 |
EmilienM | it sounds super related to https://review.openstack.org/330669 | 16:30 |
EmilienM | jistr: ^ | 16:30 |
* bandini still catching up | 16:31 | |
jistr | EmilienM: it's worth a shot for sure, but the resource is defined fine, and the constraint too. As long as the `require => Pacemaker::Resource::Ocf['openstack-core'],` is on the constraint definition, it shouldn't matter that it isn't on the resource definition. | 16:31 |
*** dtantsur|bbl is now known as dtantsur | 16:32 | |
EmilienM | jistr: I was wondering if this new failure is related to our patches for nova composable | 16:32 |
bandini | jistr: I see though that in newton a bunch of previously existing constraints went missing | 16:33 |
bandini | jistr: http://acksyn.org/files/tripleo/juan-working.pdf vs http://acksyn.org/files/tripleo/carlos-broken.pdf | 16:33 |
chem | EmilienM: one way to find out would be to roll back all the nova composition that landed yesterday. But once again, I cannot see what would be missing or broken | 16:33 |
jistr | EmilienM: i think it probably is, but that doesn't necessarily mean that there's something wrong in them, it might have just started triggering a pcmk bug by doing operations in different (but maybe not wrong) order. | 16:34 |
*** ramishra has quit IRC | 16:34 | |
EmilienM | ok | 16:34 |
jistr | bandini: oh yea that carlos-broken.pdf doesn't look nice :) | 16:34 |
bandini | jistr: that is why we started triggering this, methinks | 16:35 |
derekh | EmilienM: trown my test of the old repo failed, 2016-06-16 16:25:07.571379 Stack overcloud CREATE_FAILED | 16:35 |
derekh | EmilienM: trown didn't get as far as the update | 16:35 |
EmilienM | damn | 16:35 |
derekh | EmilienM: trown going to retrigger it now | 16:35 |
bandini | jistr: shall we have a 5min sync up? I was gone for the most of the afternoon, not sure what has been tried already | 16:35 |
jistr | bandini: ok so you mean in fact it's blocked on Nova resources, but it's broken by the composable services for Neutron and Gnocchi? | 16:36 |
chem | bandini: where are all the neutron-server resources ... ? | 16:36 |
bandini | jistr: it's one theory, yes | 16:36 |
ccamacho | chem bandini my deployment is using an older patch (Composable Nova).. I will try to go more in the past.. | 16:36 |
bandini | chem: some constraints disappeared for those as well | 16:36 |
trown | EmilienM: derekh, we also have the option to try to move forward... osc_lib is packaged and being added as a dep to openstackclient package shortly | 16:36 |
bandini | now if I could deploy a damn newton install, I would not be so useless | 16:36 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Switch back to current-tripleo pre promotion https://review.openstack.org/111011 | 16:37 |
derekh | trown: EmilienM ^^ retrying | 16:37 |
EmilienM | ack | 16:38 |
EmilienM | trown: yeah | 16:38 |
EmilienM | I'm fine with the THT workaround now | 16:38 |
*** yamahata has joined #tripleo | 16:39 | |
chem | bandini: when/why did they disappear? I don't see when they were removed | 16:40 |
*** ramishra has joined #tripleo | 16:40 | |
bandini | chem: I have to look around in git logs, no idea yet | 16:40 |
*** oshvartz has quit IRC | 16:42 | |
*** dmk0202 has quit IRC | 16:42 | |
ccamacho | guys just to be clear. this is not working as openstack-core is not Started, right? "check_resource openstack-core stopped 1800" | 16:43 |
chem | ccamacho: not exactly | 16:44 |
ccamacho | mmm chem, can you give me more details? | 16:44 |
*** jpich has quit IRC | 16:44 | |
*** xinwu has joined #tripleo | 16:44 | |
jistr | what's tricky about these issues is that CI will never uncover them. B/c when the patch goes through -upgrades CI, it's deployed with old correct t-h-t, and then upgraded to incorrect t-h-t/o-p-m, but the constraints are still in place from the deploy. And the other jobs will deploy with the new (possibly incorrect) code, but they don't do service stops. | 16:45 |
chem | ccamacho: when the pcs resource stop openstack-core is done, the cluster remains in a transition state because some resources that depend on openstack-core (and so should be off) are started. After being in transition for more than 1800 sec, the script aborts | 16:45 |
jistr | So this is something we need to be careful about when extracting composable roles... | 16:45 |
chem | ccamacho: that's how I understand it anyway, if someone can crosscheck | 16:45 |
ccamacho | chem ack | 16:46 |
bandini | chem: I believe you described it correctly, yes | 16:46 |
jistr | chem: i'm not sure if they're still started, maybe it's more that pacemaker *wants* them to start but they can't start because they have constraints in place that prevent that. | 16:46 |
bandini | yep it tries but somehow it can't/won't | 16:47 |
chem | bandini: I'm digging into a non-working log of corosync and the neutron resources and order *are* there | 16:47 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-docs: Document how to clean up a pacemaker cluster after failed update https://review.openstack.org/330695 | 16:47 |
chem | bandini: so it seems that the second graph misses them because it doesn't use them. Not the stuff we're looking I think | 16:47 |
bandini | chem: the neutron resources are there in http://acksyn.org/files/tripleo/carlos-broken.pdf but the ordering constraints seem to be gone no? | 16:49 |
chem | bandini: no in the log they are present : rsc_order first="neutron-netns-cleanup-clone" first-action="start" id="order-neutron-netns-cleanup-clone-neutron-openvswitch-agent-clone-mandatory" then="neutron-openvswitch-agent-clone" then-action="start"/> | 16:49 |
chem | bandini: that's from non working cluster/corosync.txt | 16:50 |
bandini | chem: do you have neutron-server -> neutron-openvswitch-agent too? | 16:50 |
chem | bandini: looking | 16:50 |
bandini | I don't see that one in the broken one | 16:50 |
*** ebarrera has joined #tripleo | 16:50 | |
jistr | yea i don't see that one | 16:51 |
bandini | jistr: what do you think about testing a test review that restores all the missing constraints and then we see how it behaves? | 16:52 |
jistr | ++ | 16:52 |
derekh | jistr: I gotta run soon, if you're gonna want to redeploy that overcloud, I'll have to kill the old one before I go | 16:53 |
jistr | derekh: i will not need a redeploy, will just test the constraint changes live | 16:53 |
jistr | derekh: thanks and have a good evening | 16:53 |
derekh | jistr: ok, ttyl | 16:53 |
chem | bandini: no this one is definitively gone ! | 16:53 |
EmilienM | bandini: I also did https://review.openstack.org/#/c/330669/ | 16:54 |
chem | bandini: looking at a working one, it was there | 16:54 |
EmilienM | bandini: to catch up with how it was before our work on composable nova roles | 16:54 |
chem | bandini: I think this is a goooood lead :) | 16:54 |
*** mbound has quit IRC | 16:54 | |
*** ramishra has quit IRC | 16:55 | |
bandini | or we just kill all the constraints except those specified in the NG HA spec and we deal with the fallout | 16:55 |
bandini | bit of a big hammer, but we're headed that way anyway | 16:55 |
bandini | EmilienM: looking | 16:56 |
*** xinwu has quit IRC | 16:58 | |
ccamacho | chem indeed starting all the services result in ERROR: cluster finished transition but openstack-core was not in stopped state, exiting | 16:59 |
*** derekh has quit IRC | 16:59 | |
ccamacho | ill do it without starting openstack-core | 16:59 |
bandini | jistr: btw. what is the background for doing all those "pcs resource disable" thingies instead of cluster stop and start? | 17:00 |
bandini | do you remember the specifics? | 17:01 |
jistr | bandini: hmm no i don't remember. We could change it if we wanted i guess. The only potential issue i see is that full cluster restart might take more time (the cluster needs to re-form). | 17:04 |
chem | bandini: the patchset where it was moved to composable roles is I896e5dfe6fae49371c9fe7f47c4364eb6f621b07 | 17:04 |
jistr | bandini: also we'd bump the VIPs | 17:04 |
jistr | and there was this issue with cluster stop going bad b/c of pacemaker communicating via VIP on the client side and the VIP disappearing in the middle | 17:05 |
chem | bandini: relative to what is present in the puppet-tripleo it seems that we added if $enable_dhcp to $enable_ovs for the constraint to take place | 17:05 |
jistr | (we discovered that one later, so it's not an original reason for this decision, but still we'd have to tackle that if we decided to change today) | 17:05 |
*** mbound has joined #tripleo | 17:05 | |
bandini | jistr: I see | 17:06 |
jistr | so i added | 17:06 |
jistr | pcs constraint order neutron-server-clone then neutron-openvswitch-agent-clone | 17:06 |
jistr | pcs constraint order neutron-openvswitch-agent-clone then neutron-dhcp-agent-clone | 17:06 |
jistr | pcs constraint order openstack-core-clone then openstack-gnocchi-metricd-clone | 17:06 |
jistr | y | 17:06 |
*** akshai_ has quit IRC | 17:06 | |
jistr | but still it gets stuck | 17:06 |
jistr | not sure if i forgot about some | 17:06 |
bandini | want to paste the CIB so we can check? | 17:07 |
bandini | is sahara covered? | 17:07 |
*** trown is now known as trown|lunch | 17:08 | |
bandini | chem: ack | 17:08 |
*** jpena is now known as jpena|off | 17:08 | |
*** ifarkas has quit IRC | 17:08 | |
jistr | bandini: possibly sahara is not, i worked off the carlos-broken.pdf | 17:08 |
chem | bandini: hum ... let me dig further I may have made a mistake. ... | 17:09 |
chem | bandini:(for the patchset) | 17:09 |
bandini | jistr: send me a CIB and we doublecheck together, if you want | 17:10 |
*** ooolpbot has joined #tripleo | 17:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 17:10 |
*** ooolpbot has quit IRC | 17:10 | |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,In progress] - Assigned to Jiří Stránský (jistr) | 17:10 |
jistr | bandini: it's derekh's machine, it's bit challenging to get it out, gimme a sec :D | 17:12 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-docs: Undercloud install is not a script anymore https://review.openstack.org/330714 | 17:13 |
*** xinwu has joined #tripleo | 17:13 | |
bandini | jistr: :) | 17:13 |
chem | bandini: this is the right one Ia61295943e67efe354a51a26fe4540f288ff6ede | 17:15 |
chem | bandini: but it's sooo old! | 17:15 |
*** leanderthal|afk has quit IRC | 17:16 | |
*** ohamada has quit IRC | 17:18 | |
*** mcornea has quit IRC | 17:20 | |
jistr | bandini: sahara isn't deployed there at all it seems | 17:20 |
*** dtantsur is now known as dtantsur|afk | 17:20 | |
*** sambetts is now known as sambetts|afk | 17:21 | |
jistr | bandini: does it look like something's still loose in that CIB? | 17:21 |
*** rcernin has quit IRC | 17:21 | |
*** athomas has quit IRC | 17:22 | |
bandini | jistr: not really no. do you see the exact same symptoms? | 17:22 |
* bandini need to feed the kids | 17:22 | |
bandini | bbiab | 17:22 |
jistr | yea, the exact same services | 17:22 |
*** panda|afk is now known as panda | 17:23 | |
openstackgerrit | Ben Nemec proposed openstack/tripleo-common: Allow updating of nodes in baremetal import https://review.openstack.org/330717 | 17:23 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-common: Update default branch in .gitreview https://review.openstack.org/330718 | 17:23 |
jistr | same as derekh pasted earlier https://paste.fedoraproject.org/380181/66097809/raw/ | 17:23 |
* EmilienM afk lunch bbiab too | 17:24 | |
jistr | i'm gonna call it a day, but https://review.openstack.org/#/c/330682/ should get us out of trouble for the time being | 17:25 |
*** mcornea has joined #tripleo | 17:25 | |
EmilienM | jistr: thanks, I'll monitor it | 17:25 |
chem | does someone know what happened to NeutronEnableOVSAgent? because it seems that neutron::enable_ovs_agent in puppet-tripleo will always be false. I must be missing something here ... | 17:26 |
bandini | I will be here tonight for another issue anyway, so I will be around | 17:26 |
*** akshai has joined #tripleo | 17:28 | |
*** leanderthal|afk has joined #tripleo | 17:29 | |
chem | jistr: on your platform could you check the value of neutron::enable_ovs_agent in the hieradata ? | 17:29 |
ccamacho | see you tomorrow guys, ill be also seeing the patch | 17:29 |
*** ccamacho has quit IRC | 17:30 | |
jistr | chem: says 'nil', the key isn't set | 17:32 |
jistr | chem: https://paste.fedoraproject.org/380187/60983821/raw/ | 17:33 |
* jistr going for real :) | 17:33 | |
jistr | o/ | 17:33 |
chem | jistr: bye :) | 17:34 |
*** akrivoka has quit IRC | 17:34 | |
*** tosky has quit IRC | 17:34 | |
openstackgerrit | Dan Sneddon proposed openstack/tripleo-specs: Specification for tripleo-lldp-validation blueprint https://review.openstack.org/329203 | 17:38 |
chem | dprince: would you mind if I make the changes to https://review.openstack.org/#/c/299643/ the (upload scripts), I need a break from the ha-upgrade problem | 17:38 |
paramite | Hi guys, can somebody give me a hint what I should fix for "ERROR: Failed to validate: : resources.ControllerServiceChain: : Failed to validate nested template: Invalid type (list)" | 17:38 |
paramite | ? | 17:39 |
*** mcornea has quit IRC | 17:39 | |
chem | dprince: also, if you have some time, could you explain to me how neutron::enable_ovs_agent can ever be true, as NeutronEnableOVSAgent has disappeared from tht? I must be missing something, but I can't find it | 17:41 |
*** jprovazn has quit IRC | 17:44 | |
dprince | chem: looking | 17:47 |
dprince | chem: go ahead and make changes on https://review.openstack.org/#/c/299643/ | 17:49 |
dprince | chem: NeutronEnableOVSAgent was removed, yes. Instead we now set this in your Heat environment's resource_registry: http://git.openstack.org/cgit/openstack/tripleo-heat-templates/tree/environments/neutron-nuage-config.yaml#n7 | 17:51 |
*** mgould is now known as mgould|afk | 17:52 | |
*** myoung|biab has joined #tripleo | 17:53 | |
*** myoung|biab is now known as myoung | 17:54 | |
chem | dprince: ok, thanks for the explanation. | 17:54 |
*** florianf has quit IRC | 17:54 | |
*** mcornea has joined #tripleo | 17:56 | |
matbu | jistr: marios bandini hewbrocca-afk so i got a successful upgrade L->M | 17:57 |
matbu | jistr: marios bandini controller+compute+ceph+converge | 17:57 |
chem | matbu: what was the trick ? | 17:57 |
matbu | i just had to remove the t-i-e from all the nodes after upgrading | 17:57 |
matbu | chem: ^ | 17:58 |
*** electrofelix has quit IRC | 17:58 | |
matbu | chem: i mean major rdo upgrade | 17:58 |
chem | matbu: so that's not related to the CI ha-upgrade ? | 18:00 |
matbu | chem: ha no sorry :( | 18:00 |
chem | matbu: hehe, np, good news anyway :) | 18:01 |
matbu | chem: i need to dig a bit deeper into those jobs, i'm not very familiar with them | 18:01 |
*** paramite has quit IRC | 18:03 | |
*** coolsvap has quit IRC | 18:06 | |
*** ebalduf has joined #tripleo | 18:06 | |
*** coolsvap has joined #tripleo | 18:07 | |
*** rcernin has joined #tripleo | 18:07 | |
ayoung | During Openstack Overcloud deploy, how does heat talk with the (new) controller instance? Is it all via the Metadata server? | 18:08 |
jstir_ | ayoung: ITYM 'OpenStack'. | 18:08 |
*** akshai has quit IRC | 18:08 | |
ayoung | jstir_, I actually mean openstack since it is the CLI | 18:09 |
trown|lunch | lol, trollbot | 18:09 |
ayoung | so openstack overcloud deploy -e thetimehascomethewalrussaidtotalkofmanythings | 18:09 |
*** trown|lunch is now known as trown | 18:09 | |
chem | dprince: oki, I've had a hard look at the paste and all, but I still don't see how https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/neutron.pp#L52 can become true | 18:09 |
*** ooolpbot has joined #tripleo | 18:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1592776 | 18:10 |
*** ooolpbot has quit IRC | 18:10 | |
openstack | Launchpad bug 1592776 in tripleo "upgrade jobs failing with "cluster remained unstable for more than 1800 seconds"" [Critical,In progress] - Assigned to Jiří Stránský (jistr) | 18:10 |
*** akshai has joined #tripleo | 18:11 | |
openstackgerrit | Ryan Brady proposed openstack/tripleo-common: Add baremetal workflows https://review.openstack.org/300200 | 18:11 |
*** coolsvap has quit IRC | 18:12 | |
ayoung | trown, is my diagram correct here: https://adam.younglogic.com/wp-content/uploads/2016/06/VM-config-changes-via-Heat.png | 18:12 |
ayoung | I assume Heat actually *is* the metadata service but listening on a magic port... | 18:13 |
trown | ayoung: ya that is the one part of your diagram I am not totally sure of... everything up to the heat circle is right, and from os-collect-config on is right | 18:14 |
ayoung | trown, what controls all the stages of the update? | 18:14 |
*** pkovar has quit IRC | 18:18 | |
trown | ayoung: not sure what you mean | 18:19 |
trown | ayoung: the puppet manifests in tripleo-heat-templates have a step variable | 18:19 |
ayoung | trown, when you run an update, the log shows all the stages of the deplou | 18:19 |
ayoung | deploy | 18:19 |
ayoung | like | 18:20 |
ayoung | 2016-06-16 17:48:54 [overcloud-AllNodesExtraConfig-htynk23jryte]: UPDATE_IN_PROGRESS Stack UPDATE started | 18:20 |
ayoung | 2016-06-16 17:48:54 [overcloud-AllNodesExtraConfig-htynk23jryte]: UPDATE_COMPLETE Stack UPDATE completed successfully | 18:20 |
ayoung | err I can do better | 18:20 |
ayoung | actually, yah, like that...I realize Heat is processing each of those templates in turn, but is it also synchronizing with the os-collect-config service on the remote host? | 18:21 |
trown | ya when os-collect-config finishes a puppet apply it signals back to heat | 18:22 |
trown | please dont ask how ;P /me does not know | 18:23 |
*** xinwu has quit IRC | 18:24 | |
ayoung | trown, I'm sure shady has it in a presentation somewhere | 18:25 |
EmilienM | chem: it sounds like the workaround does not work https://review.openstack.org/#/c/330682/ | 18:28 |
EmilienM | same for my patch https://review.openstack.org/330669 | 18:28 |
chem | EmilienM: mine works https://review.openstack.org/#/c/330661/ :) | 18:29 |
chem | EmilienM: soooo green, I cannot believe it | 18:30 |
EmilienM | w00t | 18:31 |
EmilienM | wow | 18:31 |
EmilienM | chem: can you update commit message ? and we can land it | 18:31 |
chem | EmilienM: doing | 18:32 |
*** csd_ has quit IRC | 18:32 | |
chem | EmilienM: hope it's not a lucky neutron hitting the right hard drive somewhere in a datacenter | 18:32 |
EmilienM | lol | 18:32 |
EmilienM | it's worth trying | 18:32 |
EmilienM | trown: you ok to land this patch ^ after commit message update? | 18:33 |
EmilienM | asking other reviewers too: bandini, bnemec, dprince ^ | 18:33 |
EmilienM | chem: it would be even better to have it in puppet-tripleo no? | 18:34 |
bnemec | EmilienM: Which one are we merging? | 18:35 |
chem | EmilienM: as I told you, if it were only me I would move all those constraints to tripleo and in the end remove them from tht | 18:35 |
EmilienM | bnemec: chem did https://review.openstack.org/#/c/330661/ and it seems like it works. | 18:35 |
EmilienM | chem: ok so we can keep it there now | 18:36 |
chem | EmilienM: arghhh, but tomorrow we move them all oki ? | 18:37 |
bnemec | EmilienM: Okay, I only have a vague idea of what that's doing, but since it passed CI I'm okay with merging it. | 18:37 |
EmilienM | chem: either way works | 18:37 |
EmilienM | I prefer puppet-tripleo | 18:37 |
EmilienM | but since it's passing CI in THT | 18:37 |
chem | EmilienM: me too | 18:37 |
EmilienM | let's do it now in THT | 18:37 |
EmilienM | and we'll iterate tomorrow | 18:37 |
chem | EmilienM: I'm adjusting | 18:37 |
EmilienM | just update commit message and I propose we land it | 18:38 |
*** Guest36238 has quit IRC | 18:38 | |
chem | EmilienM: ack | 18:38 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates: Colocation make a group for pckm nova resources. https://review.openstack.org/330661 | 18:42 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates: Colocation make a group for pcmk nova resources. https://review.openstack.org/330661 | 18:42 |
*** dmk0202 has joined #tripleo | 18:44 | |
chem | EmilienM: let's hope it passes the gate, but I won't see that in real time. I'm off. | 18:45 |
*** dmk0202 has quit IRC | 18:45 | |
*** akshai_ has joined #tripleo | 18:45 | |
*** chem is now known as chem|off | 18:46 | |
*** saneax_AFK is now known as saneax | 18:46 | |
*** ebarrera has quit IRC | 18:46 | |
EmilienM | chem|off: thanks | 18:46 |
*** akshai has quit IRC | 18:47 | |
openstackgerrit | Dan Prince proposed openstack/tripleo-heat-templates: Composable opencontrail plugin https://review.openstack.org/328471 | 18:47 |
openstackgerrit | Dan Prince proposed openstack/tripleo-heat-templates: Drop extraconfig for neutron-nuage.yaml https://review.openstack.org/327742 | 18:47 |
openstackgerrit | Dan Prince proposed openstack/tripleo-heat-templates: Drop extraconfig for neutron-opencontrail.yaml https://review.openstack.org/328472 | 18:47 |
openstackgerrit | Dan Prince proposed openstack/tripleo-heat-templates: Composable neutron nuage plugin https://review.openstack.org/327741 | 18:47 |
*** akshai_ has quit IRC | 18:50 | |
*** panda has quit IRC | 18:50 | |
*** panda has joined #tripleo | 18:50 | |
openstackgerrit | Sanjay Upadhyay proposed openstack/python-tripleoclient: Tripleoclient leaks temporary files https://review.openstack.org/330638 | 18:51 |
EmilienM | bnemec: I asked infra to unqueue it so we skip all waiting time in the loooonng queue :) | 18:54 |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image https://review.openstack.org/330166 | 18:55 |
*** mcornea has quit IRC | 18:59 | |
openstackgerrit | Merged openstack/tripleo-heat-templates: Colocation make a group for pcmk nova resources. https://review.openstack.org/330661 | 18:59 |
EmilienM | yeah ^ | 18:59 |
*** yolanda has quit IRC | 19:00 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/tripleo-quickstart: return global control of force_cached_image https://review.openstack.org/330166 | 19:00 |
* EmilienM removing alert tag on the bug | 19:00 | |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: Implement Libvirt profile https://review.openstack.org/329682 | 19:02 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: First iteration of libvirt as a composable service https://review.openstack.org/329686 | 19:02 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: Create libvirt micro-service https://review.openstack.org/329714 | 19:02 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: Enable libvirt as a micro-service https://review.openstack.org/329718 | 19:03 |
*** xinwu has joined #tripleo | 19:05 | |
*** saneax is now known as saneax_AFK | 19:08 | |
*** ebalduf has quit IRC | 19:09 | |
openstackgerrit | Ryan Brady proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions https://review.openstack.org/330755 | 19:16 |
*** fultonj has quit IRC | 19:23 | |
*** dprince has quit IRC | 19:27 | |
*** chem|off` has joined #tripleo | 19:31 | |
*** chem|off has quit IRC | 19:33 | |
*** MaxPC has quit IRC | 19:36 | |
*** dmk0202 has joined #tripleo | 19:40 | |
openstackgerrit | Merged openstack/tripleo-quickstart: make --requirements cumulative https://review.openstack.org/330086 | 19:40 |
EmilienM | bnemec: what were your results when you enabled iptables rules by default on the overcloud? | 19:41 |
bnemec | EmilienM: It breaks HA right now. I have a series of patches up to fix it, except I just realized I haven't pushed my latest working local branch for review. :-) | 19:42 |
EmilienM | bnemec: ah | 19:43 |
openstackgerrit | Merged openstack/tripleo-quickstart: return global control of force_cached_image https://review.openstack.org/330166 | 19:43 |
*** mbound has quit IRC | 19:43 | |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Enable firewall by default on the overcloud https://review.openstack.org/321833 | 19:43 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Allow pacemaker ports in firewall https://review.openstack.org/330249 | 19:43 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Stop using deprecated port param in firewall rules https://review.openstack.org/330759 | 19:43 |
openstackgerrit | Ben Nemec proposed openstack/tripleo-heat-templates: Allow sahara ports in firewall https://review.openstack.org/330760 | 19:43 |
*** dmk0202 has quit IRC | 19:43 | |
bnemec | EmilienM: ^ | 19:44 |
bnemec | Those should get it working. | 19:44 |
EmilienM | excellent | 19:44 |
openstackgerrit | John Trowbridge proposed openstack/tripleo-quickstart: use the ansible-role-tripleo-inventory to override native inventory https://review.openstack.org/329938 | 19:48 |
EmilienM | bnemec: I have an idea how to make it composable | 19:49 |
EmilienM | bnemec: part of our roles | 19:49 |
EmilienM | bnemec: I'll work on it based on your patches so we can land yours first | 19:49 |
EmilienM | if you don't mind | 19:49 |
bnemec | EmilienM: No, that sounds good. I definitely want to keep the fixes and the composable work separate because I think the fixes will need to be backported. | 19:50 |
*** fragatin_ has quit IRC | 19:51 | |
EmilienM | bnemec: definitely | 19:52 |
*** lblanchard has quit IRC | 19:56 | |
openstackgerrit | Merged openstack/tripleo-quickstart: Move default stopping point to just before overcloud deploy https://review.openstack.org/330176 | 19:58 |
*** ebalduf has joined #tripleo | 20:01 | |
*** timothyb89 has joined #tripleo | 20:01 | |
*** jayg is now known as jayg|g0n3 | 20:03 | |
timothyb89 | hi all, does anyone here have experience running diskimage-builder behind a proxy? I'm trying to build a nodepool image but hitting proxy errors even with the usual env vars set | 20:08 |
*** fragatina has joined #tripleo | 20:10 | |
*** fragatina has quit IRC | 20:10 | |
*** fragatina has joined #tripleo | 20:11 | |
*** jcoufal has quit IRC | 20:12 | |
rlandy | ayoung: hello - I'm hitting keystoneauth/identity errors when deploying on baremetal overcloud. During deploy (or sometimes after introspection), "ERROR (ConnectFailure): Unable to establish connection to http://<ip>5000/v2.0/tokens". Until that point, things were going along fine and all actions requiring auth worked. | 20:12 |
ayoung | rlandy, did something shoot Keystone? | 20:13 |
*** akshai has joined #tripleo | 20:13 | |
*** noslzzp has quit IRC | 20:13 | |
*** fzdarsky is now known as fzdarsky|afk | 20:14 | |
rlandy | ayoung: I am guessing it just runs out of steam (resource) at some point. | 20:14 |
rlandy | ayoung: I've run a few times on two different baremetal environments | 20:15 |
rlandy | if I need to run introspection twice, for example, I could hit the error even before attempting deploy | 20:15 |
rlandy | I could share the env with you if that helps | 20:16 |
ayoung | rlandy, nope | 20:16 |
ayoung | rlandy, it's not a Keystone issue AFAICT | 20:16 |
ayoung | you just need to figure out what is killing Keystone | 20:16 |
rlandy | ah ok | 20:16 |
ayoung | if you have the env up and running, look at the logs | 20:17 |
rlandy | I'm thinking ironic | 20:17 |
ayoung | journalctl, /var/log/keystone etc | 20:17 |
*** ayoung has quit IRC | 20:18 | |
*** openstackstatus has joined #tripleo | 20:20 | |
*** ChanServ sets mode: +v openstackstatus | 20:20 | |
*** lucas-afk has quit IRC | 20:21 | |
*** dmk0202 has joined #tripleo | 20:23 | |
*** lucasagomes has joined #tripleo | 20:28 | |
openstackgerrit | Andreas Florath proposed openstack/diskimage-builder: Refactor: block-device handling https://review.openstack.org/319591 | 20:30 |
openstackgerrit | Jeff Peeler proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions https://review.openstack.org/330755 | 20:32 |
*** mbound has joined #tripleo | 20:33 | |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart: Update downloaded images to latest delorean repos https://review.openstack.org/327898 | 20:38 |
openstackgerrit | Gabriele Cerami proposed openstack/tripleo-quickstart: Move ironic config to post install https://review.openstack.org/328300 | 20:38 |
*** ibravo has quit IRC | 20:46 | |
*** [1]cdearborn has joined #tripleo | 20:47 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates: [WIP] Initial work to dump and restore galera db during major upgrades https://review.openstack.org/325205 | 20:48 |
openstackgerrit | Jeff Peeler proposed openstack/tripleo-common: [WIP] Fix exception within deployment plan actions https://review.openstack.org/330755 | 20:51 |
*** cdearborn has quit IRC | 20:53 | |
*** [1]cdearborn has quit IRC | 20:54 | |
*** rhallisey has quit IRC | 20:58 | |
openstackgerrit | Stephanie Miller proposed openstack/diskimage-builder: Ironic agent kernel should be owned by user building image https://review.openstack.org/330783 | 20:59 |
*** trown is now known as trown|outtypewww | 21:00 | |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: keystone: deploy composable firewall rules https://review.openstack.org/330785 | 21:04 |
EmilienM | bnemec: it's a PoC but see ^ | 21:04 |
EmilienM | slagle, bnemec: you guys working on firewall, please give any feedback before I continue this thing ^ | 21:04 |
*** chem|off` has quit IRC | 21:07 | |
*** shivrao has joined #tripleo | 21:07 | |
openstackgerrit | Stephanie Miller proposed openstack/diskimage-builder: Ironic agent kernel should be owned by user building image https://review.openstack.org/330783 | 21:28 |
*** Guest36238 has joined #tripleo | 21:29 | |
*** fzdarsky|afk has quit IRC | 21:29 | |
*** rcernin has quit IRC | 21:36 | |
*** dmk0202 has quit IRC | 21:38 | |
*** Goneri has quit IRC | 21:45 | |
openstackgerrit | ayoung proposed openstack/tripleo-quickstart: Allow for multiple undercloud nodes https://review.openstack.org/315749 | 21:50 |
*** chem has joined #tripleo | 21:50 | |
*** yamahata has quit IRC | 21:51 | |
*** chem is now known as chem|off | 21:52 | |
*** panda is now known as panda|Zz | 22:00 | |
openstackgerrit | Ben Nemec proposed openstack/tripleo-docs: Add Liberty and Mitaka admonitions and use them https://review.openstack.org/330802 | 22:01 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: keystone: deploy composable firewall rules https://review.openstack.org/330785 | 22:02 |
*** dmk0202 has joined #tripleo | 22:04 | |
*** egafford has quit IRC | 22:06 | |
*** ayoung has joined #tripleo | 22:07 | |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: Implement Libvirt profile https://review.openstack.org/329682 | 22:09 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: Create libvirt micro-service https://review.openstack.org/329714 | 22:10 |
*** akshai has quit IRC | 22:11 | |
openstackgerrit | ayoung proposed openstack/tripleo-quickstart: Provision Identity VM https://review.openstack.org/328335 | 22:14 |
ayoung | larsks, so...kindof harsh with the -2s there. Do you really object that strenuously to the approach? I'm not certain I can actually meet your suggestions. | 22:16 |
openstackgerrit | Emilien Macchi proposed openstack/puppet-tripleo: nova/api: include ::nova::network::neutron https://review.openstack.org/329529 | 22:16 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates: WIP: integration of the new puppet pacemaker. https://review.openstack.org/302409 | 22:16 |
*** weshay has quit IRC | 22:18 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates: compute: align rabbitmq configuration with nova-base service https://review.openstack.org/330022 | 22:18 |
EmilienM | ayoung: looking at https://review.openstack.org/#/c/328373/ | 22:22 |
EmilienM | so you want ipa server on undercloud, right? | 22:23 |
ayoung | EmilienM, yeah | 22:23 |
*** ebalduf has quit IRC | 22:23 | |
ayoung | EmilienM, but it needs to meet a few different scenarios | 22:23 |
EmilienM | ayoung: why not contriting to instack undercloud? | 22:23 |
EmilienM | contributing* | 22:23 |
ayoung | EmilienM, because I thought instack and quickstart were merging | 22:24 |
EmilienM | ayoung: afaik we use puppet to set up & configure things, no? | 22:24 |
ayoung | I can put it anywhere, but I went through the work to get it here because this seems to be the development tool of choice | 22:24 |
ayoung | EmilienM, that would be a very different set of code | 22:24 |
ayoung | EmilienM, this is all ansible based stuff we did last summer coming on over | 22:25 |
*** rlandy has quit IRC | 22:25 | |
EmilienM | so iirc, we can install things with puppet with instack undercloud OR with ansible with quickstart? | 22:25 |
EmilienM | s/iirc/iiuc/ | 22:25 |
ayoung | there was some work done on getting IPA set up by Puppet back when Lynn Root worked in our group, but I have not looked at it in a long time | 22:26 |
EmilienM | I'm not sure we take a right approach here | 22:26 |
EmilienM | everyone is working on instack-undercloud now | 22:26 |
EmilienM | we integrated recent services, etc | 22:26 |
EmilienM | maybe I'm wrong | 22:26 |
EmilienM | slagle: do you have thoughts on ^ ? | 22:26 |
ayoung | So...I need to put IPA on a separate machine as it conflicts ports with Swift | 22:26 |
ayoung | and, as I said, we are trying to mirror a common deployment where IPA or some other IdP already exists | 22:27 |
ayoung | so, when I said "on undercloud" I really meant "on a separate VM next to the undercloud" | 22:27 |
EmilienM | ah ok ! | 22:27 |
EmilienM | I was confused :) | 22:27 |
ayoung | EmilienM, IPA is very opinionated, and, since it configures the Apache server, it will get in the way of Keystone and Horizon as well | 22:28 |
EmilienM | I'm off now | 22:28 |
ayoung | if we ever put Horizon on the undercloud.... | 22:28 |
ayoung | thanks for your interest | 22:28 |
EmilienM | ok | 22:28 |
EmilienM | ayoung: I'll look at it | 22:28 |
EmilienM | thanks! | 22:28 |
ayoung | EmilienM, I am just not sure I can do what larsks is asking for in the last of the three patches; make it a separate role | 22:29 |
ayoung | the Identity VM is generated when you build the overall structure | 22:29 |
ayoung | He might not want that, instead have it generated as its own role, but that is a much more significant rewrite | 22:29 |
*** xinwu has quit IRC | 22:30 | |
*** dmk0202 has quit IRC | 22:39 | |
*** xinwu has joined #tripleo | 22:44 | |
*** bfournie1 has quit IRC | 22:48 | |
*** saneax_AFK is now known as saneax | 22:49 | |
*** ayoung has quit IRC | 23:07 | |
*** pradk has quit IRC | 23:07 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Pre-install pip/virtualenv packages https://review.openstack.org/327472 | 23:15 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Pre-install pip/virtualenv packages https://review.openstack.org/327472 | 23:21 |
slagle | EmilienM: honestly, i'm not familiar with ipa enough to know if it should be an integrated service on the uc or not | 23:37 |
slagle | it would depend on what types of resources and configuration it needs | 23:37 |
slagle | and if it plays nicely alongside other services | 23:37 |
*** chlong has quit IRC | 23:42 | |
*** ayoung has joined #tripleo | 23:44 | |
*** Guest36238 has quit IRC | 23:46 | |
*** mbound has quit IRC | 23:59 |
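
Note on the stack event lines pasted at 18:20: a rough sketch of how lines in that shape could be parsed to see which nested stacks are still mid-update. It assumes only the line layout shown in the paste (timestamp, stack name in brackets, status, free-text reason); the names `parse_event` and `unfinished` are made up for illustration and are not part of Heat or tripleoclient.

```python
import re
from datetime import datetime

# Layout assumed from the pasted lines:
#   2016-06-16 17:48:54 [stack-name]: UPDATE_IN_PROGRESS Stack UPDATE started
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<resource>[^\]]+)\]: "
    r"(?P<status>[A-Z]+_[A-Z_]+)\s+(?P<reason>.*)$"
)


def parse_event(line):
    """Parse one event line into a dict, or return None if it does not match."""
    m = EVENT_RE.match(line.strip())
    if m is None:
        return None
    return {
        "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "resource": m.group("resource"),
        "status": m.group("status"),
        "reason": m.group("reason"),
    }


def unfinished(lines):
    """Return, sorted, the resources whose latest event is still *_IN_PROGRESS."""
    latest = {}
    for line in lines:
        event = parse_event(line)
        if event is not None:
            # Later lines overwrite earlier ones for the same resource.
            latest[event["resource"]] = event["status"]
    return sorted(r for r, s in latest.items() if s.endswith("_IN_PROGRESS"))
```

Feeding it the deploy output line by line gives a quick list of stacks that started an update but have not yet reported completion, which is roughly the "waiting for os-collect-config to signal back" state trown describes.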