*** daidv has quit IRC | 00:01 | |
kolla-slack | <jascott1> looks like helm uses k8s validation, and for configmaps its 1mb | 00:02 |
---|---|---|
japestinho | sdake I'm able to deploy kolla-k8s aio for testing inside openstack env. after running init-runonce I saw demo1 VM, demo networks and router created. but i can't ping the VM from qdhcp or ping its floating ip | 00:02 |
japestinho | sdake which network type should I use? flat or anything else? | 00:03 |
sdake | japestinho so you're on an openstack deployment? | 00:03 |
sdake | japestinho or you're on bare metal? | 00:03 |
kfox1111 | did anything recently change in canal? | 00:03 |
japestinho | sdake I'm using openstack deployment, I deploy on centos 7 instance | 00:03 |
sbezverk | jascott1: but we do not have that configmap | 00:03 |
sdake | jascott1 k8s has a limit of 1mb, or helm does? | 00:03 |
kolla-slack | <jascott1> helm uses k8s validation so k8s | 00:04 |
sbezverk | jascott1: it seems tiller autogenerate it | 00:04 |
kfox1111 | ah... so tiller may have added a bit more state tracking or something, and our amount of stuff in compute kit pushed us over the limit? | 00:04 |
sbezverk | jascott1: can you add debugging around why this config map gets generated and what triggers it | 00:05 |
kfox1111 | sbezverk: its the release state tracking stuff. | 00:05 |
kolla-slack | <jascott1> i just raised the limit and pushed it | 00:05 |
kfox1111 | jascott1: cool. | 00:05 |
sbezverk | jascott1: in helm or in tiller? | 00:05 |
sbezverk | kfox1111: I have a job with jascott1 debug helm | 00:06 |
*** lamt has quit IRC | 00:06 | |
kfox1111 | sbezverk: cool. | 00:07 |
sbezverk | kfox1111: but if tiller image was changed then it will not get new one.. | 00:08 |
kfox1111 | sbezverk: yeah. it may need to be a tiller update. | 00:08 |
*** lamt has joined #openstack-kolla | 00:08 | |
sbezverk | kfox1111: Yeah I think so too | 00:08 |
kfox1111 | so, there seems to be a pattern in the multinode failures: | 00:09 |
kfox1111 | http://logs.openstack.org/41/454841/8/check/gate-kolla-kubernetes-deploy-centos-binary-3-ceph-multi-nv/0fa9c5c/logs/pods/default-test-dns-centos-7-2-node-rax-ord-8375918-522923-8mc8w.txt | 00:09 |
*** eanylin has joined #openstack-kolla | 00:09 | |
kfox1111 | canal has not been updated for 19 days... so not that... | 00:09 |
*** lamt has quit IRC | 00:10 | |
sdake | this has been happening since kubernetes 1.6 kfox1111 | 00:10 |
sbezverk | kfox1111: yep I saw these before, but they were not very frequent | 00:10 |
*** wagiel has joined #openstack-kolla | 00:20 | |
kolla-slack | <jascott1> @sbezverk I took out the panic and raised the limit to 2mb. | 00:20 |
*** sayantan_ has quit IRC | 00:24 | |
*** wagiel has quit IRC | 00:25 | |
*** srwilkers has quit IRC | 00:26 | |
*** Pavo has joined #openstack-kolla | 00:29 | |
openstackgerrit | Marcus Williams proposed openstack/kolla-ansible master: Add OpenDaylight role https://review.openstack.org/416367 | 00:30 |
*** Pavo has quit IRC | 00:30 | |
*** Pavo has joined #openstack-kolla | 00:31 | |
kfox1111 | ok, saw a multinode succeed. was not rax. | 00:39 |
kfox1111 | and another failure. rax. | 00:39 |
*** duonghq has joined #openstack-kolla | 00:39 | |
kfox1111 | so something network related on that cloud and our config is fighting. | 00:39 |
*** zhurong has joined #openstack-kolla | 00:40 | |
kfox1111 | seems like traffic isn't making it from the slave to the master node related to k8s apiserver. | 00:40 |
*** guoguo has joined #openstack-kolla | 00:42 | |
*** Pavo has quit IRC | 00:48 | |
*** Pavo has joined #openstack-kolla | 00:49 | |
kolla-slack | <jascott1> @sbezverk i raised it in the k8s validation library, k8s will eventually complain but should move past the error | 00:50 |
kolla-slack | <jascott1> to be clear, the change was in the vendored k8s lib for helm, so the limit has been raised in helm but not in k8s | 00:50 |
*** lrensing has joined #openstack-kolla | 00:53 | |
*** tovin07_ has joined #openstack-kolla | 00:56 | |
*** lrensing has quit IRC | 01:03 | |
*** lrensing has joined #openstack-kolla | 01:05 | |
*** cuongnv has joined #openstack-kolla | 01:05 | |
*** daidv has joined #openstack-kolla | 01:10 | |
*** wagiel has joined #openstack-kolla | 01:15 | |
openstackgerrit | Merged openstack/kolla-ansible stable/ocata: Fix Telegraf retention policy not found https://review.openstack.org/453631 | 01:16 |
masber | hi, has anyone used kolla with nic bonding? does it work? | 01:16 |
masber | sorry I am talking about kolla-ansible 4.0 | 01:16 |
*** lrensing has quit IRC | 01:17 | |
*** yangyapeng has joined #openstack-kolla | 01:17 | |
*** wagiel has quit IRC | 01:19 | |
*** lrensing has joined #openstack-kolla | 01:20 | |
*** sayantan_ has joined #openstack-kolla | 01:23 | |
masber | I specified tunnel_interface to be different from network_interface and I think neutron_agent on my compute host can't find the physical_network | 01:23 |
*** lucasxu has joined #openstack-kolla | 01:24 | |
*** MasterOfBugs has quit IRC | 01:25 | |
*** pramodrj07 has quit IRC | 01:25 | |
openstackgerrit | shaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to glance https://review.openstack.org/452176 | 01:27 |
openstackgerrit | Merged openstack/kolla master: Fix name of the mistral-dashboard horizon plugin https://review.openstack.org/449136 | 01:28 |
*** hrw has quit IRC | 01:29 | |
*** hrw has joined #openstack-kolla | 01:31 | |
sbezverk | sdake: it looks like compute kit is still failing | 01:33 |
*** dixiaoli has joined #openstack-kolla | 01:33 | |
*** lucasxu has quit IRC | 01:38 | |
*** lucasxu has joined #openstack-kolla | 01:39 | |
*** Pavo has quit IRC | 01:46 | |
*** lucasxu has quit IRC | 01:46 | |
*** lucasxu has joined #openstack-kolla | 01:46 | |
openstackgerrit | shaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to cinder https://review.openstack.org/452131 | 01:49 |
openstackgerrit | Surya Prakash (spsurya) proposed openstack/kolla-ansible master: Development Environment With Vagrant link not working https://review.openstack.org/455866 | 01:52 |
*** duritong_ has joined #openstack-kolla | 01:58 | |
*** duritong has quit IRC | 02:00 | |
daidv | Morning. | 02:07 |
daidv | Jeffrey4l, I have cherry-pick my commit to Ocata release follow launchpad bug. Can you review for it? | 02:08 |
daidv | https://review.openstack.org/#/c/455747/ | 02:08 |
*** wagiel has joined #openstack-kolla | 02:09 | |
*** ssurana has quit IRC | 02:12 | |
*** wagiel has quit IRC | 02:14 | |
openstackgerrit | shaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to glance https://review.openstack.org/452176 | 02:17 |
*** lucasxu has quit IRC | 02:22 | |
*** caowei has joined #openstack-kolla | 02:34 | |
*** unicell has quit IRC | 02:35 | |
openstackgerrit | Merged openstack/kolla-ansible stable/ocata: Use utf8_general_ci collation as a default collation https://review.openstack.org/455747 | 02:38 |
daidv | Jeffrey4l, duonghq : Thanks! :) | 02:39 |
*** shashank_t_ has quit IRC | 02:40 | |
*** shashank_t_ has joined #openstack-kolla | 02:41 | |
*** lrensing has quit IRC | 02:44 | |
*** shashank_t_ has quit IRC | 02:45 | |
Jeffrey4l | np | 03:01 |
Jeffrey4l | duonghq, could u review https://review.openstack.org/455504 | 03:02 |
*** wagiel has joined #openstack-kolla | 03:03 | |
openstackgerrit | Chen proposed openstack/kolla-ansible master: fix typos on quickstart page https://review.openstack.org/455892 | 03:05 |
*** wagiel has quit IRC | 03:08 | |
*** shashank_t_ has joined #openstack-kolla | 03:12 | |
*** lamt has joined #openstack-kolla | 03:14 | |
duonghq | Jeffrey4l, backport is ok, but how does database url related to tooz config? | 03:19 |
*** eaguilar has quit IRC | 03:19 | |
inc0 | good evening | 03:29 |
duonghq | evening inc0 | 03:31 |
*** sayantan_ has quit IRC | 03:33 | |
spsurya | good evening inc0 | 03:33 |
*** gkadam has joined #openstack-kolla | 03:34 | |
*** sayantan_ has joined #openstack-kolla | 03:34 | |
*** lamt has quit IRC | 03:43 | |
*** krtaylor has quit IRC | 03:45 | |
*** lamt has joined #openstack-kolla | 03:56 | |
*** dave-mccowan has joined #openstack-kolla | 03:57 | |
*** wagiel has joined #openstack-kolla | 03:57 | |
*** wagiel has quit IRC | 04:02 | |
*** zhurong has quit IRC | 04:06 | |
*** lamt has quit IRC | 04:08 | |
*** lamt has joined #openstack-kolla | 04:10 | |
*** dave-mccowan has quit IRC | 04:16 | |
*** dave-mccowan has joined #openstack-kolla | 04:16 | |
*** lamt has quit IRC | 04:19 | |
*** g3ek has quit IRC | 04:23 | |
openstackgerrit | Surya Prakash (spsurya) proposed openstack/kolla-ansible master: Development Environment With Vagrant link not working https://review.openstack.org/455866 | 04:29 |
openstackgerrit | zhubingbing proposed openstack/kolla-ansible master: Fix Multi-regions nova support boot from volume https://review.openstack.org/456042 | 04:30 |
*** g3ek has joined #openstack-kolla | 04:31 | |
SamYaple | Jeffrey4l: your mariadb recovery playbook work is wrong and will almost certainly lead to making someone lose data :( | 04:31 |
SamYaple | Jeffrey4l: i didnt put the seqno stuff in originally because its not accurate | 04:31 |
SamYaple | what you have means you can force a cluster to basically roll back if a node was gracefully stopped then a period of time passed and then the cluster crashed | 04:32 |
SamYaple | there are 7 different failure scenarios and only 5 of them are 100% automatically recoverable from | 04:33 |
SamYaple | 1 of them is never automatically recoverable from | 04:33 |
SamYaple | and one is given the right conditions | 04:33 |
*** zhurong has joined #openstack-kolla | 04:34 | |
*** lucasxu has joined #openstack-kolla | 04:36 | |
SamYaple | inc0: ping about galera recovery above | 04:41 |
SamYaple | you guys should really remove that from the repo, its like super dangerous | 04:41 |
*** jtriley has joined #openstack-kolla | 04:42 | |
*** lamt has joined #openstack-kolla | 04:43 | |
*** bmace has quit IRC | 04:45 | |
*** skramaja has joined #openstack-kolla | 04:46 | |
*** dave-mccowan has quit IRC | 04:50 | |
*** wagiel has joined #openstack-kolla | 04:51 | |
*** lamt has quit IRC | 04:52 | |
*** lamt has joined #openstack-kolla | 04:53 | |
*** wagiel has quit IRC | 04:56 | |
*** sayantan_ has quit IRC | 05:03 | |
*** shashank_t_ has quit IRC | 05:03 | |
*** jtriley has quit IRC | 05:04 | |
*** iceyao has joined #openstack-kolla | 05:05 | |
*** lamt has quit IRC | 05:08 | |
*** bmace has joined #openstack-kolla | 05:10 | |
daidv | Good afternoon. | 05:12 |
daidv | I wonder why we can NOT mix binary image with source image? Can anyone help me, please? | 05:13 |
*** jaosorior_away is now known as jaosorior | 05:13 | |
*** shashank_t_ has joined #openstack-kolla | 05:14 | |
*** yangyape_ has joined #openstack-kolla | 05:16 | |
*** yangyapeng has quit IRC | 05:17 | |
*** rwsu has quit IRC | 05:17 | |
Jeffrey4l | SamYaple, yes. the current solution can not cover all cases and it may cause data loss. | 05:20 |
Jeffrey4l | SamYaple, without the patch, recovery always starts from the first node. that will cause data loss too. | 05:21 |
Jeffrey4l | daidv, re mix binary and source, technically, we can ( need some trick ). But why? i prefer to use source | 05:23 |
daidv | Jeffrey4l, in my case, some images are working well and I just want to append one more project like Panko via source image. | 05:26 |
daidv | So I just think, we can make some special config for deployment some source images with other binary project image together? | 05:27 |
daidv | So currently in Kolla, if a system is using binary images, any new image must be binary too? Right? | 05:28 |
*** MasterOfBugs has joined #openstack-kolla | 05:32 | |
*** pramodrj07 has joined #openstack-kolla | 05:32 | |
Jeffrey4l | daidv, yep. only one type is supported. | 05:33 |
*** sayantan_ has joined #openstack-kolla | 05:34 | |
openstackgerrit | jimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support https://review.openstack.org/452059 | 05:35 |
*** lucasxu has quit IRC | 05:40 | |
*** shashank_t_ has quit IRC | 05:46 | |
*** lamt has joined #openstack-kolla | 05:49 | |
*** unicell has joined #openstack-kolla | 05:55 | |
openstackgerrit | jimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support https://review.openstack.org/452059 | 05:56 |
*** lamt has quit IRC | 05:58 | |
*** mewald has joined #openstack-kolla | 05:59 | |
*** claudiub has joined #openstack-kolla | 05:59 | |
mewald | Is "deploy" the right action to run in order to recover a mariadb cluster with 2 out of 3 failed nodes? | 06:01 |
*** iniazi has quit IRC | 06:08 | |
*** hieulq has joined #openstack-kolla | 06:24 | |
*** hieulq has quit IRC | 06:26 | |
*** pcaruana has joined #openstack-kolla | 06:30 | |
*** pramodrj07 has quit IRC | 06:32 | |
*** MasterOfBugs has quit IRC | 06:32 | |
*** mewald has quit IRC | 06:34 | |
*** bogdando has joined #openstack-kolla | 06:44 | |
openstackgerrit | zhubingbing proposed openstack/kolla-ansible master: Add region config option in globals.yml https://review.openstack.org/454149 | 06:44 |
*** yuanying_ has joined #openstack-kolla | 06:48 | |
openstackgerrit | shaofeng cheng proposed openstack/kolla-ansible master: Nova_backend_ceph variable mobile location. https://review.openstack.org/456077 | 06:48 |
*** yuanying has quit IRC | 06:48 | |
*** pbourke has quit IRC | 06:49 | |
*** pbourke has joined #openstack-kolla | 06:51 | |
*** targon has joined #openstack-kolla | 06:54 | |
*** Kimmo_ has quit IRC | 06:54 | |
*** jistr has quit IRC | 06:54 | |
*** brad[] has quit IRC | 06:55 | |
*** p6 has quit IRC | 06:55 | |
*** p6 has joined #openstack-kolla | 06:55 | |
*** zhubingbing_ has joined #openstack-kolla | 06:55 | |
*** SamYaple has quit IRC | 06:55 | |
*** gema has quit IRC | 06:55 | |
*** dmsimard has quit IRC | 06:55 | |
*** SamYaple has joined #openstack-kolla | 06:55 | |
*** gema has joined #openstack-kolla | 06:55 | |
*** dmsimard has joined #openstack-kolla | 06:55 | |
*** gema has quit IRC | 06:56 | |
*** gema has joined #openstack-kolla | 06:56 | |
*** jistr has joined #openstack-kolla | 06:56 | |
*** manheim has joined #openstack-kolla | 06:58 | |
*** hieulq has joined #openstack-kolla | 07:01 | |
openstackgerrit | zhubingbing proposed openstack/kolla-ansible master: Copy region config option from all.yml to globals.yml https://review.openstack.org/454149 | 07:01 |
*** mewald has joined #openstack-kolla | 07:03 | |
*** mewald1 has joined #openstack-kolla | 07:06 | |
*** mewald has quit IRC | 07:09 | |
*** daidv_ has joined #openstack-kolla | 07:10 | |
openstackgerrit | Merged openstack/kolla-ansible master: Unmount Ceph OSD disks as part of destroy https://review.openstack.org/455714 | 07:11 |
*** shashank_t_ has joined #openstack-kolla | 07:12 | |
mewald1 | Is "deploy" the right action to run in order to recover a mariadb cluster with 2 out of 3 failed nodes? | 07:15 |
*** Serlex has joined #openstack-kolla | 07:17 | |
*** shardy has joined #openstack-kolla | 07:24 | |
*** brad[] has joined #openstack-kolla | 07:24 | |
openstackgerrit | Zeyu Zhu proposed openstack/kolla-ansible master: Remove the variable redefined in deploy-servers.yml https://review.openstack.org/448433 | 07:25 |
openstackgerrit | Zeyu Zhu proposed openstack/kolla-ansible stable/ocata: Modify the hosts of the post-deploy.yml playbook https://review.openstack.org/456094 | 07:26 |
*** nathharp has joined #openstack-kolla | 07:28 | |
*** caowei has quit IRC | 07:32 | |
japestinho | mewald1 I think you should run deploy and reconfigure if needed by mariadb services | 07:32 |
*** jmccarthy has joined #openstack-kolla | 07:33 | |
*** egonzalez has joined #openstack-kolla | 07:36 | |
*** shashank_t_ has quit IRC | 07:38 | |
openstackgerrit | Merged openstack/kolla-ansible master: fix typo https://review.openstack.org/455541 | 07:42 |
*** sayantan_ has quit IRC | 07:43 | |
*** daidv_ has quit IRC | 07:44 | |
*** athomas has joined #openstack-kolla | 07:48 | |
*** daidv_ has joined #openstack-kolla | 07:50 | |
*** kollian has joined #openstack-kolla | 07:50 | |
kollian | hi guys...... I am facing some issues deploying OpenStack with kolla. I am new to kolla and I can't work out where to run the kolla-nuild command to build images. I have installed docker as well as ansible | 07:52 |
kollian | kolla-build* | 07:53 |
kollian | all the dependency | 07:53 |
kollian | can anyone please help | 07:53 |
kollian | will i run after checkout the kolla repo | 07:54 |
kollian | ? | 07:54 |
japestinho | kollian you can follow this link from egonzalez blog http://egonzalez.org/deploy-openstack-designate-with-kolla-ansible/ | 07:57 |
japestinho | kollian download the image tarballs (binary/source) and setup docker private registry | 07:58 |
egonzalez | kollian, to build images need to install kolla too, then kolla-build command will be available | 07:58 |
kollian | japestinho: deployment node and target node can be same right ? | 07:59 |
egonzalez | if using stable branch, can pull images from dockerhub | 07:59 |
kollian | in three node system | 07:59 |
kollian | egonzalez: through PIP packages or is there any script to install | 08:00 |
kollian | ? | 08:00 |
japestinho | kollian yes it can be use for all-in-one or multinode | 08:00 |
zhubingbing_ | sup egonzalez | 08:00 |
zhubingbing_ | ;) | 08:00 |
kollian | using the master one | 08:00 |
egonzalez | kollian, master cannot be installed from pip | 08:00 |
egonzalez | kollian, use pip install -r kolla/requirements.txt -r kolla/test-requirements.txt | 08:01 |
egonzalez | change kolla with kolla-ansible to install kolla ansible requirements | 08:01 |
kollian | egonzalez: then i only need kolla repo to run the command | 08:02 |
egonzalez | kollian, commands can be found at tools/ folder ,kolla-ansible and build.py(this one in kolla repo) | 08:02 |
*** zhubingbing_ has quit IRC | 08:02 | |
*** zhubingbing_ has joined #openstack-kolla | 08:02 | |
kollian | egonzalez: you mean to pull the image from the registry i have to checkout the repo of stable branch like ocata/newton etc ? | 08:04 |
kollian | then i can run pip install ? | 08:04 |
egonzalez | kollian, don't need to download images to just install requirements, images are needed to deploy | 08:06 |
egonzalez | kollian, but if you want to use stable branch, just use pip install kolla and pip install kolla-ansible | 08:06 |
egonzalez | dont have to clone both repos | 08:07 |
japestinho | sup egonzalez :) | 08:11 |
egonzalez | japestinho, zhubingbing_ sup :D | 08:11 |
japestinho | egonzalez which network type should I use if I deploy openstack on openstack | 08:11 |
*** blallau has joined #openstack-kolla | 08:11 | |
*** Administrator_ has quit IRC | 08:12 | |
japestinho | egonzalez I am able to deploy kolla-kubernetes on openstack deployment | 08:12 |
*** Administrator_ has joined #openstack-kolla | 08:12 | |
japestinho | egonzalez after init-runonce I got this | 08:12 |
japestinho | [centos@kolla-k8s ~(keystone_admin)]$ ip netns | 08:12 |
japestinho | qrouter-67793882-51eb-416e-b794-00a3bbf93e25 | 08:12 |
japestinho | qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419 | 08:12 |
japestinho | [centos@kolla-k8s ~(keystone_admin)]$ ip netns e qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419 ping 10.0.0.100 | 08:12 |
japestinho | Cannot open network namespace "qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419": Permission denied | 08:12 |
japestinho | egonzalez I can't ping to demo VM | 08:13 |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla master: DNM: test master branch https://review.openstack.org/456108 | 08:14 |
*** targon has quit IRC | 08:15 | |
egonzalez | japestinho, guess flat network type | 08:15 |
*** shardy has quit IRC | 08:15 | |
*** yangyape_ has quit IRC | 08:16 | |
egonzalez | japestinho, have you tried with root or sudo privs? | 08:16 |
openstackgerrit | Zeyu Zhu proposed openstack/kolla stable/newton: Modify the hosts of the post-deploy.yml playbook https://review.openstack.org/456109 | 08:17 |
*** shardy has joined #openstack-kolla | 08:17 | |
*** yangyapeng has joined #openstack-kolla | 08:18 | |
kollian | egonzalez: Thanks a lot, hope i will get it running | 08:20 |
openstackgerrit | Zeyu Zhu proposed openstack/kolla stable/newton: Modify the hosts of the post-deploy.yml playbook https://review.openstack.org/456109 | 08:21 |
japestinho | egonzalez same thing with root priv | 08:22 |
japestinho | [root@kolla-k8s ~(keystone_admin)]$ ip netns e qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419 ping -c3 8.8.8.8 | 08:22 |
japestinho | RTNETLINK answers: Invalid argument | 08:22 |
japestinho | RTNETLINK answers: Invalid argument | 08:22 |
japestinho | setting the network namespace "qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419" failed: Invalid argument | 08:22 |
*** dmellado has joined #openstack-kolla | 08:23 | |
egonzalez | japestinho, what is the "e" option in ip netns for? | 08:24 |
egonzalez | is it short for exec? | 08:24 |
*** mewald1 has quit IRC | 08:28 | |
blallau | @egonzalez "e" alias "exec" | 08:29 |
openstackgerrit | Shunli Zhou proposed openstack/kolla-ansible master: Correct operating-kolla.rst document https://review.openstack.org/456110 | 08:29 |
*** mewald has joined #openstack-kolla | 08:29 | |
sdake | morning | 08:34 |
egonzalez | morning sdake | 08:36 |
egonzalez | japestinho, this issue is only in your env or more people in kolla-k8s is having the same? | 08:37 |
*** caowei has joined #openstack-kolla | 08:37 | |
japestinho | morning sdake | 08:38 |
japestinho | egonzalez as far as I know, I'm the only one having this issue | 08:38 |
kollian | egonzalez: i more lammy. seems like https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/multinode.html missing the right direction | 08:38 |
kollian | to follow | 08:38 |
kollian | by any deployer | 08:38 |
openstackgerrit | Bertrand Lallau proposed openstack/kolla-ansible master: WIP: Enable Ceph input plugin in Telegraf https://review.openstack.org/455602 | 08:39 |
egonzalez | kollian, can you rephrase? cannot understand what you mean | 08:39 |
kollian | egonzalez: guide dont say about kolla installation | 08:40 |
kollian | it say about kolla-ansible | 08:40 |
egonzalez | japestinho, there was a bug in kolla-ansible with shared /run, maybe k8s does not mount /run as shared https://bugs.launchpad.net/kolla/+bug/1616268 | 08:40 |
openstack | Launchpad bug 1616268 in kolla newton "Stale namespace removal causing "RTNETLINK answers: Invalid argument" errors" [Critical,Fix committed] - Assigned to Jeffrey Zhang (jeffrey4l) | 08:40 |
egonzalez | kollian, kolla is not needed for kolla-ansible deploy unless want to build your own images | 08:41 |
kollian | egonzalez: so how kolla-ansible know about the images ? | 08:42 |
kollian | i.e build images | 08:42 |
egonzalez | kolla-ansible need a couple of setting in globals.yml | 08:42 |
egonzalez | registry, namespace and version. thats all. will pull images from the registry and deploy them | 08:43 |
blallau | @japestinho and @egonzalez exactly, I was thinking about the same issue: /run must be mounted with the "shared" option "/run/:/run/:shared" | 08:43 |
kollian | egonzalez: ohh | 08:44 |
kollian | egonzalez: thanks | 08:44 |
egonzalez | kollian, if you not configure a registry and namespace, by default will pull images from dockerhub | 08:45 |
egonzalez | kollian, *only for stable releases (master not) | 08:45 |
*** gfidente has joined #openstack-kolla | 08:46 | |
sdake | egonzalez do you know where https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/quickstart.html is rendered from | 08:46 |
sdake | egonzalez the docs are incorrect | 08:46 |
egonzalez | sdake, https://github.com/openstack/kolla-ansible/tree/master/deploy-guide/source | 08:47 |
egonzalez | sdake, master is with draft in the url instead of ocata https://docs.openstack.org/project-deploy-guide/kolla-ansible/draft/quickstart.html | 08:47 |
hrw | egonzalez, sdake: can you look at https://review.openstack.org/#/c/450805/ and vote? simple change, CI green | 08:48 |
hrw | I started recheck on some other ones | 08:48 |
japestinho | egonzalez blallau did you mean MountFlags=shared option on /etc/systemd/system/docker.service ? | 08:50 |
egonzalez | japestinho, nope, volumes mounting at container startup | 08:51 |
sdake | hrw done | 08:51 |
sdake | hrw rev needed | 08:51 |
egonzalez | japestinho, in kolla-ansible https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/neutron/defaults/main.yml#L22 dunno where I can find that in kubernetes | 08:51 |
hrw | sdake: rtslib is dependency for targetcli | 08:52 |
hrw | sdake: so listing it was not needed | 08:52 |
sdake | hrw for centos? | 08:53 |
hrw | yes | 08:53 |
sdake | hrw I don't see it installed here: http://logs.openstack.org/05/450805/7/check/gate-kolla-dsvm-build-centos-binary-centos-7-nv/89e115d/console.html.gz | 08:54 |
hrw | sdake: https://pastebin.com/EuVB3q9h | 08:54 |
*** Kimmo_ has joined #openstack-kolla | 08:54 | |
egonzalez | can someone take a look at https://review.openstack.org/#/c/455785/ ? | 08:54 |
egonzalez | temporarily fixes source deploy gates | 08:54 |
hrw | sdake: log you shown does not have targetcli either | 08:54 |
sdake | hrw agreed, seems like a bug | 08:55 |
egonzalez | dont know the reason, but nova service-list is not retrieving registered services when they are in the database | 08:55 |
sdake | hrw we may not require targetcli in nova packaging but may require rtslib | 08:55 |
sdake | hrw i am hesitant to approve a change that breaks centos - if you rev the patch without the rts removal I'll ack it :) | 08:56 |
hrw | sdake: ok, will readd rtslib | 08:56 |
sdake | what we really need is m0ar gates for kolla-ansible | 08:56 |
*** manheim has quit IRC | 08:56 | |
sdake | hrw the rest of the change looks great :) | 08:57 |
sdake | i think I lost my big battery for my phone | 08:57 |
* sdake groans | 08:57 | |
sdake | always on the road, always lose my big battery | 08:57 |
*** shardy is now known as shardy_mtg | 08:57 | |
openstackgerrit | Marcin Juszkiewicz proposed openstack/kolla master: handle rtslib(-fb) package names and dependencies https://review.openstack.org/450805 | 09:01 |
hrw | sdake: here you have | 09:01 |
hrw | (venv-for-kolla) 11:01 hrw@gossamer:kolla$ git branch -m to-merge/debian-rtslib-fb-0412-1101 | 09:01 |
hrw | ;D | 09:01 |
sdake | for folks looking to contribute to kolla-kubernetes, our 0.6.0 release planned for april 15th is here: https://launchpad.net/kolla-kubernetes/+milestone/0.6.0 | 09:05 |
sdake | egonzalez can you have a look at https://review.openstack.org/#/c/450805/ | 09:05 |
sdake | specifically: http://logs.openstack.org/05/450805/8/experimental/gate-kolla-dsvm-deploy-multinode-ubuntu-source-ubuntu-trusty-2-node-nv/3ad4c36/console.html#_2017-04-12_09_04_48_597818 | 09:07 |
sdake | WTB NON FLAKEY GATES | 09:07 |
openstackgerrit | Merged openstack/kolla-ansible master: Fix Multi-regions nova support boot from volume https://review.openstack.org/456042 | 09:07 |
blallau | @egonzalez for 450805: why not use "openstack hypervisor list" instead ? | 09:10 |
*** manheim has joined #openstack-kolla | 09:10 | |
sdake | zhubingbing_ what would be helpful is https://blueprints.launchpad.net/kolla-kubernetes/+spec/move-config-to-kolla-k8s | 09:10 |
*** nhlfr has quit IRC | 09:11 | |
*** retr0h has quit IRC | 09:11 | |
*** lpetrut has joined #openstack-kolla | 09:12 | |
egonzalez | blallau, empty too :( | 09:12 |
egonzalez | blallau, http://paste.openstack.org/show/606241/ | 09:13 |
*** vcn[m] has joined #openstack-kolla | 09:13 | |
*** mewald has quit IRC | 09:13 | |
blallau | cause on install guide it should return something... | 09:16 |
blallau | are you admin user? | 09:16 |
blallau | when you launch "openstack hypervisor list" | 09:16 |
*** iceyao has quit IRC | 09:16 | |
blallau | or "openstack compute service list" | 09:16 |
*** bachp has joined #openstack-kolla | 09:16 | |
*** nhlfr has joined #openstack-kolla | 09:16 | |
*** retr0h has joined #openstack-kolla | 09:16 | |
blallau | seems you are admin user in simple_cell_setup.yml, but here http://paste.openstack.org/show/606241/ | 09:18 |
*** zhubingbing_ has quit IRC | 09:19 | |
*** zhubingbing has joined #openstack-kolla | 09:19 | |
blallau | install guide here https://docs.openstack.org/ocata/install-guide-ubuntu/nova-compute-install.html#add-the-compute-node-to-the-cell-database | 09:20 |
egonzalez | blallau, yep im admin http://paste.openstack.org/show/606242/ | 09:21 |
blallau | "openstack hypervisor list" (in admin) must return something | 09:21 |
egonzalez | this issue is reproduced in gates | 09:21 |
blallau | @egonzalez: ok, I have no idea... | 09:22 |
sdake | egonzalez what do you make of https://review.openstack.org/#/c/453846/22 | 09:23 |
sdake | the last comment | 09:23 |
kollian | egonzalez: I can't see kolla repo to run kolla-build after installation i.e pip install kolla | 09:25 |
egonzalez | sdake, simple_cell_setup is executed right? | 09:25 |
*** Serlex has quit IRC | 09:25 | |
egonzalez | kollian, when installing from pip there is no repo downloaded, kolla-build command is globally available in your system | 09:25 |
sdake | egonzalez appears not | 09:26 |
egonzalez | blallau, filled a bug in nova and python-novaclient https://bugs.launchpad.net/nova/+bug/1682060 | 09:27 |
openstack | Launchpad bug 1682060 in python-novaclient "empty nova service and hypervisor list" [Undecided,New] | 09:27 |
kollian | egonzalez: it is written that i have run tools/start-registry in multinode deployment for image registry | 09:27 |
kollian | how can i do registry | 09:27 |
kollian | ? | 09:28 |
kollian | have to * | 09:28 |
japestinho | kollian docker run -d -p 4000:5000 --restart=always -v /opt/kolla_registry/:/var/lib/registry --name registry registry:2 | 09:29 |
kollian | japestinho: this is for AIO or multinode ? | 09:30 |
japestinho | or you need cd to /usr/share/kolla/tools/ | 09:30 |
blallau | @egonzalez great! thank you | 09:30 |
*** gkadam is now known as gkadam-afk | 09:30 | |
*** zhuzeyu has quit IRC | 09:31 | |
openstackgerrit | Merged openstack/kolla-ansible master: Temporaly fix deploy gate https://review.openstack.org/455785 | 09:31 |
kollian | japestinho: there is no kolla folder in /usr/share/ | 09:31 |
japestinho | kollian I believe it's was for multinode but you can use it for AIO also | 09:32 |
kollian | japestinho: does kolla dir will be created after kolla-ansible installation ? | 09:34 |
japestinho | kollian nope, you need pip install -U kolla first, kolla-ansible will create /usr/share/kolla-ansible | 09:35 |
japestinho | kollian what do you want to create? AIO or multinode deployment? | 09:36 |
*** guoguo has quit IRC | 09:37 | |
kollian | japestinho: multinode | 09:37 |
japestinho | kollian I suggest you follow this one http://egonzalez.org/deploy-openstack-designate-with-kolla-ansible/ | 09:38 |
kollian | japestinho: following https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/multinode.html | 09:39 |
japestinho | kollian you can skip kolla-build process with download image tarballs first | 09:39 |
*** manheim has quit IRC | 09:39 | |
japestinho | kollian then deploy docker private registry | 09:39 |
egonzalez | kollian, ^^ I suggest you to use images from Dockerhub for stable deployments, dont need any registry, just install kolla-ansible, configure globals.yml and deploy | 09:40 |
egonzalez | kollian, images from tarballs are latest change in stable branch (most of them not released as stable yet) | 09:41 |
* egonzalez should change the blog post to avoid issues with stable deploys... | 09:41 | |
kollian | egonzalez: you mean only kolla-ansible installation and configure globals.yml will work for openstack deployment from kolla without running `pip install kolla` | 09:43 |
egonzalez | kollian, yep, kolla is only needed if want to build your own images | 09:44 |
kollian | egonzalez: ok, thanks | 09:44 |
egonzalez | kollian, with this http://paste.openstack.org/show/606250/ is fair enough to deploy, change your IPs and NIC names to match your env, add nodes to inventory and deploy | 09:46 |
hrw | ok, rdo repos still 403 so no centos builds | 09:46 |
egonzalez | kollian, that will download stable images from dockerhub | 09:46 |
kollian | egonzalez: ok, thanks | 09:48 |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla-ansible master: DNM: test master branch https://review.openstack.org/456140 | 09:51 |
japestinho | egonzalez Just curious if I use that private docker registry and using image from tarballs, how to use kolla-build to build latest stable image again? :) | 09:51 |
*** sambetts|afk is now known as sambetts | 09:52 | |
egonzalez | japestinho, ./build.py -t source --tag 4.0.0 -n 192.168.100.215:4000/lokolla | 09:52 |
openstackgerrit | Merged openstack/kolla-ansible master: Nova_backend_ceph variable mobile location. https://review.openstack.org/456077 | 09:52 |
egonzalez | japestinho, add namespace with your registry IP on it | 09:52 |
japestinho | egonzalez ok thanks I got it. | 09:53 |
openstackgerrit | shaofeng cheng proposed openstack/kolla-ansible master: Remove show_multiple_locations in glance-api https://review.openstack.org/456143 | 09:58 |
*** tovin07_ has quit IRC | 09:59 | |
*** daidv has quit IRC | 10:01 | |
*** manheim has joined #openstack-kolla | 10:07 | |
kollian | egonzalez: japestinho i know it could be little embarrassing | 10:12 |
kollian | i am not finding | 10:12 |
kollian | kolla-ansible dir | 10:12 |
kollian | in /usr/share/ | 10:13 |
egonzalez | kollian, did you install kolla-ansible with pip? | 10:13 |
kollian | egonzalez: yes and it installed successfully | 10:13 |
egonzalez | kollian, centos, ubuntu or oracle? | 10:14 |
kollian | http://paste.openstack.org/show/606259/ | 10:15 |
kollian | egonzalez: do i need to specify these somewhere? | 10:15 |
egonzalez | python packages are installed in different paths | 10:16 |
*** rhallisey has quit IRC | 10:16 | |
egonzalez | kollian, for deb is in /usr/local/share/kolla-ansible | 10:16 |
kollian | egonzalez: yes git this in /usr/local :) | 10:17 |
kollian | got* | 10:17 |
kollian | egonzalez: i have not specified anywhere my base distro, i.e. ubuntu | 10:20 |
kollian | do i need to specify it somewhere? | 10:20 |
egonzalez | kollian, yep in globals set `kolla_base_distro: "ubuntu"` in your case | 10:22 |
openstackgerrit | Merged openstack/kolla master: openstack-base: Percona-Server is x86-64 only https://review.openstack.org/449965 | 10:24 |
sdake | spsurya pingola | 10:26 |
sdake | zhubingbing pingola | 10:26 |
sdake | duonghq pingola | 10:26 |
spsurya | sdake: pongola | 10:26 |
sdake | hey fellas | 10:26 |
sdake | been at openstack leadership training this week | 10:27 |
sdake | wondering what your thinking is on the concept of a "protocore" for kolla-kubernetes | 10:27 |
sdake | a protocore is essentially a "core reviewer in training" | 10:27 |
sdake | i know all 3 of you (and more that are not in the channel) show some interest in reviewing kolla-kubernetes | 10:28 |
zhubingbing | hi | 10:28 |
sdake | one of the problems a core review team faces with minting new core reviewers is that often core reviewers have to have a detailed knowledge of the gates and codebase | 10:29 |
sdake | curious if you would be interested in taking part in such a program | 10:29 |
sdake | it works as follows | 10:29 |
sdake | 1. your +1 vote would count as a +2, however, you would not have the ability to +2 or +w a review | 10:30 |
sdake | 2. when a core reviewer reviews the change, they either +2/+w it , or provide feedback to you and the submitter about how the review was not quite right | 10:30 |
sdake | 3. you learn how to become a core reviewer for kolla-kubernetes without having to "guess" at the set of magic incantations necessary to obtain said objective | 10:31 |
sdake | the natural outcome of such a program is that you learn how to review properly for kolla-kubernetes | 10:33 |
*** Serlex has joined #openstack-kolla | 10:33 | |
egonzalez | that sound like a good program to have | 10:33 |
duonghq | sdake, pong | 10:33 |
sdake | duonghq read scrollback :) | 10:33 |
sdake | egonzalez I agree | 10:33 |
sdake | :) | 10:33 |
spsurya | sdake: as you mentioned in third point *protocore* will work towards core reviewer part ? | 10:33 |
spsurya | sdake: idea sounds good | 10:34 |
sdake | yup - the idea is to provide an onramp to core-reviewer without overwhelming you with core reviewer to begin with | 10:34 |
sdake | or alternatively requiring you to move earth with your mind to learn how to become a core reviewer magically | 10:34 |
duonghq | sdake, good plan | 10:34 |
sdake | shame rwellum isn't in the channel :) | 10:35 |
sdake | it does require a 100% commitment from you to learn how to become a core reviewer | 10:35 |
sdake | some of you already are well on your way | 10:36 |
sdake | or core reviewer on other projects | 10:36 |
sdake | although each deliverable is different and has slightly different requirements | 10:36 |
sdake | egonzalez think such a program would be good for kolla-ansible and kolla? | 10:37 |
sdake | Jeffrey4l ^^ | 10:38 |
duonghq | I think the program is good, and maybe unique for Kolla team in OpenStack | 10:39 |
sdake | duonghq indeed nova is doing this now | 10:39 |
sdake | duonghq it's not like i invented the idea myself :) | 10:39 |
duonghq | roger | 10:40 |
spsurya | sdake: bringing this to kolla is a good step | 10:40 |
sdake | haven't brought it yet | 10:40 |
sdake | just wanted to gauge interest | 10:40 |
sdake | if people aren't interested, no reason to do such a thing | 10:41 |
openstackgerrit | Paul Bourke (pbourke) proposed openstack/kolla master: Reparent kolla-toolbox from openstack-base https://review.openstack.org/435023 | 10:42 |
Jeffrey4l | sound great sdake | 10:42 |
sdake | pbourke ^^ | 10:42 |
spsurya | sdake: i think mostly would be interested in this | 10:42 |
sdake | spsurya that sentence didn't parse :) | 10:42 |
spsurya | even everyone | 10:42 |
spsurya | sdake: :) | 10:43 |
pbourke | sdake: sounds good, we need more cores on kolla-ansible | 10:44 |
sdake | pbourke right and how do we get more core reviewers? | 10:44 |
pbourke | sdake: well your idea sounds like a good way towards that | 10:44 |
pbourke | sdake: egonzalez: would you guys mind tag teaming and getting these finally merged? https://review.openstack.org/#/c/435023/ https://review.openstack.org/#/c/435024/ | 10:45 |
pbourke | so tired of fixing conflicts on them | 10:45 |
sdake | pbourke still not a fan of the reparent :) | 10:46 |
pbourke | sdake: hmm | 10:47 |
sdake | pbourke hopefully someday someone fixes that ;) | 10:47 |
pbourke | sdake: yeah im concerned with the complexity of these images | 10:48 |
sdake | pbourke reviewed enjoy | 10:48 |
pbourke | sdake: but (yet another) rearchitect seems very difficult at this stage in the project | 10:48 |
*** cuongnv has quit IRC | 10:55 | |
duonghq | need to back to home, see you in the meeting | 10:59 |
*** duonghq has quit IRC | 10:59 | |
nathharp | egonzalez - don’t know if you remember the vif plugging issue while spawning a large number of instances I mentioned last week? It looks like I had to adjust the number of worker threads for neutron-server | 11:01 |
egonzalez | nathharp, is the issue fixed after increasing workers? | 11:01 |
nathharp | egonzalez - it does appear to be. I am still seeing occasional failures (1 or 2 instances) but I might have a bad hypervisor. | 11:02 |
nathharp | egonzalez - issue appears to have moved on to DHCP allocations | 11:03 |
nathharp | egonzalez - any tips on tuning dnsmasq? | 11:03 |
egonzalez | nathharp, maybe increasing number of l3 and dhcp agents per network | 11:05 |
egonzalez | dhcp_agents_per_network: 2 | 11:05 |
egonzalez | max_l3_agents_per_router: 3 | 11:05 |
*** dixiaoli has quit IRC | 11:06 | |
nathharp | egonzalez thanks I’ll check them out. Issue seems to be the amount of time it takes to release an IP address when an instance is destroyed (if a large number are destroyed) | 11:08 |
*** athomas has quit IRC | 11:08 | |
*** papacz has joined #openstack-kolla | 11:09 | |
hrw | https://review.openstack.org/#/c/455374 - can you guys look at it? switching centos builds to rdo mirror may sort out centos gate failures | 11:11 |
egonzalez | nathharp, interesting, for my knowledge. have you estimated ~ limits per worker? | 11:11 |
openstackgerrit | Merged openstack/kolla master: check mariadb galera status in every loop. https://review.openstack.org/450131 | 11:12 |
*** rhallisey has joined #openstack-kolla | 11:13 | |
*** shardy_mtg is now known as shardy | 11:13 | |
*** athomas has joined #openstack-kolla | 11:14 | |
*** shardy is now known as shardy_lunch | 11:19 | |
egonzalez | pbourke, not sure for merging now toolbox reparent, build gates are failing, IDK if the registry will have kolla_toolbox image with those changes and will potentially break all other gates until rdo issue is fixed | 11:20 |
*** iceyao has joined #openstack-kolla | 11:20 | |
pbourke | egonzalez: see what you mean, lets hang on till gates are green | 11:21 |
openstackgerrit | jimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support https://review.openstack.org/452059 | 11:21 |
pbourke | egonzalez: though they wont go green for that patch as they depend on each other :/ | 11:21 |
egonzalez | pbourke, i know, at least wait until build are green. deploy won't fail until both changes are merged | 11:21 |
egonzalez | pbourke, luckily kolla-ansible is +w and will merge once kolla does | 11:22 |
egonzalez | * s/won't fail/will fail/ | 11:22 |
*** zhurong has quit IRC | 11:26 | |
*** rwallner has joined #openstack-kolla | 11:26 | |
openstackgerrit | Merged openstack/kolla master: Install panko in ceilometer base container https://review.openstack.org/444680 | 11:28 |
*** zhubingbing has quit IRC | 11:29 | |
*** iceyao has quit IRC | 11:30 | |
*** rwallner has quit IRC | 11:30 | |
*** rwallner has joined #openstack-kolla | 11:34 | |
*** haplo37_ has quit IRC | 11:36 | |
*** haplo37_ has joined #openstack-kolla | 11:36 | |
sdake | Jeffrey4l can you represent the protocore idea in the team meeting today | 11:38 |
sdake | I wont be able to make it as i'm in training | 11:38 |
sdake | or pbourke | 11:38 |
sdake | or egonzalez ? :) | 11:38 |
pbourke | sdake: I can do as I already was discussing core stuff with inc0 recently | 11:39 |
sdake | pbourke thanks i'll add to agenda | 11:39 |
sdake | pbourke * protocore (an onramp to core reviewer) (pbourke) | 11:40 |
sdake | enjoy | 11:40 |
sdake | thanks a bunch :) | 11:40 |
sdake | key ideas | 11:40 |
sdake | one protocore +1 + 1 core reviewer +2 = +w | 11:41 |
sdake | protocore is coached by core review team | 11:41 |
sdake | protocores are identified out of current pool of existing reviewers that are keen to join the core review team | 11:42 |
sdake | pbourke add or subtract as you see fit :) | 11:42 |
sdake | bbl | 11:42 |
*** gkadam-afk is now known as gkadam | 11:42 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/kolla master: Updated from global requirements https://review.openstack.org/455928 | 11:44 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/kolla-ansible master: Updated from global requirements https://review.openstack.org/455929 | 11:44 |
spsurya | nice | 11:44 |
nathharp | egonzalez - do you mean the neutron workers that I’ve set? My controller nodes are old but powerful, 2x6core 2.7GHz plus HT. I’ve set api, rpc and rpc_state workers to 25 | 11:49 |
egonzalez | nathharp, i mean, have you estimated how many jobs 1 worker can take before it collapses | 11:50 |
nathharp | egonzalez, I’ve not been very scientific, but somewhere between 40-100 instance creates breaks the defaults | 11:51 |
nathharp | egonzalez, I took some inspiration from https://javacruft.wordpress.com/2014/06/18/168k-instances/ | 11:52 |
nathharp | egonzalez, regarding DHCP releases from dnsmasq - it is releasing ~1 address per second. I’m in an edge case, but I have a class C network, have 200 instances running, delete them all, and recreate. No errors in openstack, but dnsmasq complains about not enough IPs | 11:54 |
nathharp | egonzalez - I think neutron assigns a ‘free’ IP, but dnsmasq thinks it’s still in use | 11:55 |
kollian | egonzalez: i did the same but into this | 11:56 |
kollian | vagrant@Vmachine1:/usr/local/share/kolla-ansible$ sudo kolla-ansible prechecks -i ansible/inventory/multinode Pre-deployment checking : ansible-playbook -i ansible/inventory/multinode -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla -e action=precheck /usr/local/share/kolla-ansible/ansible/site.yml ERROR! the file_name '/etc/kolla/globals.yml' does not exist, or is not readable Command faile | 11:56 |
kollian | egonzalez: should not i run the precheck ? | 11:56 |
kollian | just after kolla-ansible installation and configuring the inventory/multinode | 11:57 |
egonzalez | kollian, have you copied all content from /usr/local.. to /etc/kolla/ cp -r /usr/local/share/kolla-ansible/etc_examples/kolla /etc/kolla/ | 11:57 |
kollian | egonzalez: no, that may be the problem | 11:58 |
egonzalez | globals.yml is modified at /etc/kolla | 11:58 |
egonzalez | kollian, also have to generate passwords | 11:59 |
*** eaguilar has joined #openstack-kolla | 12:00 | |
*** shardy_lunch is now known as shardy | 12:00 | |
openstackgerrit | Serguei Bezverkhi proposed openstack/kolla-kubernetes master: Making resolv.conf to be more flexible https://review.openstack.org/456191 | 12:02 |
*** blallau has quit IRC | 12:07 | |
*** sbezverk has quit IRC | 12:09 | |
*** ipsecguy_ is now known as ipsecguy | 12:10 | |
*** daidv has joined #openstack-kolla | 12:11 | |
*** krtaylor has joined #openstack-kolla | 12:11 | |
*** sbezverk has joined #openstack-kolla | 12:13 | |
openstackgerrit | Merged openstack/kolla-ansible master: Revert "Fix Fluentd warn on dnsmasq.log file parsing" https://review.openstack.org/453837 | 12:15 |
openstackgerrit | Merged openstack/kolla-ansible master: Fix outdated InfluxDB configuration https://review.openstack.org/452667 | 12:18 |
*** rwellum has joined #openstack-kolla | 12:20 | |
openstackgerrit | Merged openstack/kolla-ansible master: Congress: remove oslo_messaging_notifications config https://review.openstack.org/444411 | 12:22 |
openstackgerrit | Merged openstack/kolla-ansible master: Add gnocchi backend precheckes for ceilometer https://review.openstack.org/445312 | 12:24 |
*** haplo37 has quit IRC | 12:25 | |
*** haplo37_ is now known as haplo37 | 12:25 | |
*** daidv_ has quit IRC | 12:26 | |
*** daidv has quit IRC | 12:26 | |
*** daidv has joined #openstack-kolla | 12:26 | |
*** haplo37 has quit IRC | 12:32 | |
*** g3ek has quit IRC | 12:33 | |
*** srwilkers has joined #openstack-kolla | 12:34 | |
*** haplo37 has joined #openstack-kolla | 12:38 | |
*** haplo37_ has joined #openstack-kolla | 12:38 | |
*** g3ek has joined #openstack-kolla | 12:38 | |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla-ansible master: Revert "Temporaly fix deploy gate" https://review.openstack.org/456210 | 12:52 |
*** mbruzek has joined #openstack-kolla | 12:53 | |
*** lamt has joined #openstack-kolla | 12:57 | |
*** goldyfruit has joined #openstack-kolla | 13:01 | |
*** gkadam has quit IRC | 13:05 | |
*** iceyao has joined #openstack-kolla | 13:05 | |
*** lamt has quit IRC | 13:07 | |
*** dvx has joined #openstack-kolla | 13:09 | |
egonzalez | Jeffrey4l, around? re removing until in register tasks https://review.openstack.org/#/c/428719 | 13:10 |
egonzalez | Jeffrey4l, duonghq raised a comment here, https://review.openstack.org/#/c/451876/ | 13:10 |
egonzalez | Jeffrey4l, what happens when there is a network error and the first attempt fails? is retry handled in some other place that I'm missing? | 13:11 |
*** esharao has joined #openstack-kolla | 13:15 | |
mnaser | https://review.openstack.org/#/c/455374/ | 13:16 |
mnaser | anyone has any clues about this nondeterministic failure? | 13:16 |
mnaser | :( | 13:16 |
*** lamt has joined #openstack-kolla | 13:16 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/kolla-ansible master: Updated from global requirements https://review.openstack.org/455929 | 13:18 |
*** eanylin has quit IRC | 13:19 | |
egonzalez | mnaser, randomly fail python tests, ive no idea the reason | 13:22 |
*** erlon has joined #openstack-kolla | 13:22 | |
*** mattmceuen has joined #openstack-kolla | 13:26 | |
openstackgerrit | Bertrand Lallau proposed openstack/kolla-ansible master: Fix neutron agents restarted on ml2 config change https://review.openstack.org/447992 | 13:28 |
*** jtriley has joined #openstack-kolla | 13:35 | |
*** al498u has joined #openstack-kolla | 13:36 | |
*** al498u_ has joined #openstack-kolla | 13:38 | |
*** 7GHAARTU6 has joined #openstack-kolla | 13:41 | |
*** al498u has quit IRC | 13:41 | |
*** papacz has quit IRC | 13:42 | |
*** eaguilar has quit IRC | 13:42 | |
*** 7GHAARTU6 has quit IRC | 13:43 | |
pbourke | mnaser: i think some of the tests are written badly | 13:44 |
pbourke | mnaser: there's global data which they're manipulating, this is bad | 13:44 |
*** zhubingbing_ has joined #openstack-kolla | 13:45 | |
*** bogdando has quit IRC | 13:45 | |
*** zhurong has joined #openstack-kolla | 13:47 | |
*** mattmceuen has quit IRC | 13:48 | |
*** lamt has quit IRC | 13:49 | |
mnaser | pbourke im surprised its become an issue lately | 13:50 |
pbourke | mnaser: the tests are being added to, it's possible some of the newer ones have exposed it | 13:51 |
*** zhubingbing_ has quit IRC | 13:51 | |
pbourke | mnaser: that or the newer tests are wrong :/ | 13:52 |
mnaser | pbourke gotcha.. we'll have to get to the bottom of it, because otherwise things are going to take a long time to merge | 13:52 |
mnaser | esp because this is a voting check | 13:52 |
pbourke | mnaser: yeah | 13:52 |
mnaser | i have a ton of other work stuff i gotta work on but ill try to look at it.. no commitment on that atm :x | 13:52 |
*** srwilkers has quit IRC | 13:53 | |
*** lamt has joined #openstack-kolla | 13:53 | |
*** srwilkers has joined #openstack-kolla | 13:55 | |
*** lrensing has joined #openstack-kolla | 13:56 | |
*** ksumit has joined #openstack-kolla | 13:58 | |
*** ksumit has quit IRC | 13:58 | |
*** zhurong_ has joined #openstack-kolla | 14:00 | |
*** zhurong has quit IRC | 14:01 | |
*** iceyao has quit IRC | 14:02 | |
*** ksumit has joined #openstack-kolla | 14:02 | |
SamYaple | Jeffrey4l: with your mariadb recovery it is impossible to specify a recovery node. the state of mariadb recovery in kolla ansible is really bad right now. like a high probability of losing data | 14:06 |
*** lucasxu has joined #openstack-kolla | 14:10 | |
*** david-lyle has quit IRC | 14:15 | |
*** phuongnh has joined #openstack-kolla | 14:18 | |
*** lamt has quit IRC | 14:21 | |
*** srwilkers has quit IRC | 14:23 | |
*** satyar has joined #openstack-kolla | 14:23 | |
*** lrensing has quit IRC | 14:24 | |
inc0 | good morning | 14:28 |
*** rwallner has quit IRC | 14:28 | |
inc0 | SamYaple: then I have favor to ask | 14:28 |
*** rwallner has joined #openstack-kolla | 14:28 | |
inc0 | would you mind creating a bug in kolla-ansible for it and mark it as critical? | 14:28 |
inc0 | with explanation why it's bad and how to make it not bad? | 14:29 |
*** srwilkers has joined #openstack-kolla | 14:29 | |
*** rwallner has quit IRC | 14:29 | |
SamYaple | inc0: yes, but i will not be able to take on the work to fix it at this time | 14:30 |
inc0 | SamYaple: sure, no pressure | 14:31 |
inc0 | just record it somewhere, write down your thoughts and let someone else deal with it | 14:31 |
inc0 | maybe answer pings here:) | 14:31 |
inc0 | thank you good sir! | 14:31 |
SamYaple | np man | 14:33 |
*** lrensing has joined #openstack-kolla | 14:34 | |
*** daidv has quit IRC | 14:36 | |
Serlex | egonzalez - is it worth creating a bug for simple_cell_setup.yml? | 14:38 |
egonzalez | Serlex, yes if is a bug | 14:38 |
egonzalez | Serlex, what's your issue? | 14:38 |
Serlex | I see that you have a temporary fix | 14:38 |
Serlex | ok ignore that | 14:40 |
openstackgerrit | Dan Ardelean proposed openstack/kolla-ansible master: Add Hyper-V role https://review.openstack.org/455684 | 14:41 |
*** lamt has joined #openstack-kolla | 14:42 | |
SamYaple | inc0: https://bugs.launchpad.net/kolla-ansible/+bug/1682153 | 14:43 |
openstack | Launchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Undecided,New] | 14:43 |
inc0 | thanks SamYaple | 14:44 |
*** manheim has quit IRC | 14:50 | |
*** rwallner has joined #openstack-kolla | 14:53 | |
*** zhubingbing_ has joined #openstack-kolla | 14:57 | |
*** zhurong_ has quit IRC | 14:57 | |
*** lucasxu has quit IRC | 14:57 | |
*** mkoderer has joined #openstack-kolla | 14:58 | |
satyar | Hi inc0 | 14:58 |
satyar | Hi SamYaple | 14:59 |
satyar | continuing the yesterday discussion... | 14:59 |
satyar | after rebooting the Host able to recover the old VMs and they are getting 8950 MTU | 14:59 |
satyar | and accessable | 14:59 |
inc0 | hmm | 15:00 |
satyar | Not sure if kolla recommends the reboot of nodes after upgrade | 15:00 |
inc0 | satyar: how about doing just ifdown and ifup | 15:00 |
inc0 | kolla doesn't | 15:00 |
inc0 | but it seems like neutron dhcp server isn't being clever | 15:00 |
satyar | i did that without rebooting the host VMs not able to get the 8950 MTU | 15:01 |
satyar | only after rebooting the VMs getting 8950 | 15:01 |
satyar | even i tried rebooting the VMs which created before upgrade still no luck | 15:01 |
inc0 | did you try running dhclient? | 15:01 |
satyar | yes | 15:01 |
inc0 | hmm | 15:01 |
satyar | as VMs got 1500 MTU it was not able to communicate out side | 15:02 |
inc0 | how about restarting ovs agent? | 15:02 |
satyar | tried still same | 15:02 |
inc0 | that's very strange | 15:02 |
satyar | not sure how rebooting host solves this | 15:02 |
*** jaosorior is now known as jaosorior_away | 15:02 | |
inc0 | I honestly doubt it's Kolla issue as I see no level that could mean we broke this...but never know | 15:03 |
satyar | seems like kolla | 15:03 |
inc0 | well we use net=host | 15:03 |
satyar | tried with only neutron bare metal works fine | 15:03 |
inc0 | hmm | 15:03 |
inc0 | ok | 15:03 |
inc0 | and if you're not running jumboframes, everything with mtu 1500 and works? | 15:04 |
satyar | yes | 15:04 |
inc0 | damn I don't have a setup that could reproduce this scenario now | 15:05 |
satyar | hmmm | 15:05 |
satyar | its quite easy though | 15:05 |
SamYaple | satyar: neutron does its own mtu calculations internally | 15:05 |
satyar | yes | 15:05 |
inc0 | but rebooting host? | 15:05 |
SamYaple | if the interface came up with a 1500 mtu and you set it to 9000 mtu after the fact, it would be 1500 mtu internally | 15:05 |
SamYaple | so there may be a race condition for you | 15:06 |
inc0 | yeah that would make sense | 15:06 |
satyar | my hosts comes with 9000 by default | 15:06 |
SamYaple | youll want the mtu to be 9000 before starting any nova or neutron services | 15:06 |
inc0 | also restart of neutron agent should help | 15:06 |
satyar | nope | 15:07 |
satyar | didnt | 15:07 |
SamYaple | no that wont help inc0 | 15:07 |
satyar | my host while imaging itself comes with 9000 MTU | 15:07 |
SamYaple | a full shutdown of the vm, stopping nova and neutron agents then starting nova and neutron agents after the mtu is 9000 would work | 15:07 |
SamYaple | satyar: its possible neutron or nova starts before your networking has come up | 15:08 |
SamYaple | in which case the linux default mtu is 1500 | 15:08 |
satyar | SamYaple my hosts having 9000 MTU by default | 15:08 |
SamYaple | unless you recompiled your kernel after making that change, your default mtu is 1500 | 15:08 |
satyar | my kernels having 9000 | 15:08 |
SamYaple | your network settings (a service) have the interfaces at 9000 mtu | 15:08 |
satyar | yes | 15:09 |
inc0 | satyar: did you set mtu to 9000, recompile kernel and create image from this recompiled kernel? | 15:09 |
satyar | you mean VMs? inc0 | 15:10 |
SamYaple | (not that we are recommending to do that to fix the issue) | 15:10 |
inc0 | satyar: no, the host itself | 15:10 |
SamYaple | satyar: im refering to the host here | 15:10 |
satyar | host is recompiled kernel with 9000 MTU | 15:11 |
satyar | so when the machine comes up it comes up with 9000 MTU | 15:11 |
inc0 | question is what sets up mtu to 9000 | 15:11 |
inc0 | if it's kernel or networking service | 15:11 |
inc0 | if you for example set it in /etc/network/interfaces, it won't help | 15:12 |
inc0 | you still can run into issue Sam described | 15:12 |
inc0 | I'd be interested why bare metal works, but that might be because docker just starts too fast... | 15:12 |
SamYaple | satyar: how exactly are you recompiling your kernel | 15:12 |
SamYaple | what are you doing to make it 9000 mtu | 15:13 |
inc0 | and since it's docker that starts neutron, it might cause race condition | 15:13 |
satyar | nope docker is getting installed after | 15:14 |
SamYaple | satyar: for my clarity, on the node hosting the l3 agent and on the compute node, please paste the output of `ip a` | 15:14 |
inc0 | satyar: did you reboot this node before upgrade? | 15:15 |
satyar | http://paste.openstack.org/show/606318/ | 15:16 |
satyar | no | 15:16 |
satyar | inc0 rebooted the host after upgrade | 15:16 |
ksumit | Hello all, I am fairly new to Ansible-Kolla and Docker in general. Is there a way to get CLI access (for example 'cinder list', etc.) after I have deployed OpenStack using Ansible-Kolla? I tried searching, but couldn't find any information. | 15:16 |
satyar | the paste of the machine not rebooted | 15:16 |
satyar | and having the issue | 15:16 |
inc0 | ksumit: if you run kolla-ansible post-deploy it will create /etc/kolla/admin-openrc.sh | 15:17 |
inc0 | source this file and your regular cinder client should work | 15:17 |
satyar | ksumit you can install on any machine which can access the HA | 15:17 |
satyar | and source openrc and access the systems | 15:17 |
inc0 | but we don't install clients | 15:17 |
*** lucasxu has joined #openstack-kolla | 15:18 | |
satyar | nope | 15:18 |
satyar | we dont install clients | 15:18 |
inc0 | satyar: so...I don't have better idea | 15:18 |
ksumit | @inc0 Thanks! So is it just Cinder that can be accessed through command line or will it work for others too? Nova, Manila, etc. | 15:18 |
inc0 | ksumit: it works just fine | 15:19 |
inc0 | I mean | 15:19 |
inc0 | everything;) | 15:19 |
ksumit | Understood. Thanks! | 15:19 |
inc0 | clients like cinder client are just REST api clients | 15:19 |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla master: WIP: fix zun images https://review.openstack.org/456251 | 15:19 |
inc0 | so they'll make http request to APIs | 15:19 |
inc0 | based on env variables which are set in admin-openrc | 15:20 |
satyar | SamYaple: any clue :( | 15:20 |
*** lamt has quit IRC | 15:20 | |
inc0 | Sam might be right, it's really edge case | 15:20 |
inc0 | but possible | 15:20 |
inc0 | but I'm not able to reproduce it unfortunately | 15:21 |
satyar | inc0: SamYaple: I have code ready for jumbo frame support for VMs | 15:23 |
satyar | should i push it upstream? | 15:23 |
inc0 | bbiaf, see you in meeting | 15:23 |
inc0 | satyar: add it to bug report | 15:23 |
inc0 | maybe someone will be able to reproduce it | 15:23 |
satyar | already added | 15:24 |
satyar | should i create another bug for support of jumbo frame in kolla | 15:25 |
satyar | and push the fix for it? | 15:25 |
*** lamt has joined #openstack-kolla | 15:27 | |
sean-k-mooney | jumbo frames should work in kolla today | 15:28 |
sean-k-mooney | i have not read back but was there a case where it does not? | 15:28 |
satyar | i guess the neutron and nova changes are missing in kolla to support jumbo frame | 15:28 |
sean-k-mooney | all that needs to be done is to make the relevant neutron config changes | 15:29 |
*** swinn has joined #openstack-kolla | 15:29 | |
satyar | True sean-k-mooney | 15:29 |
satyar | we dont have by default with kolla yet | 15:29 |
sean-k-mooney | at a minimum though we should document the changes that are required | 15:29 |
inc0 | sean-k-mooney: issue was that after upgrade existing vms gets mtu 1500 | 15:29 |
satyar | yes :) | 15:30 |
satyar | https://bugs.launchpad.net/kolla/+bug/1681919 | 15:30 |
openstack | Launchpad bug 1681919 in kolla "After upgrade VMs getting 1500 MTU although jumbo frame is setup" [Undecided,New] | 15:30 |
sean-k-mooney | inc0: os-vif will fix it if the vm reboots | 15:30 |
satyar | nope it dont | 15:30 |
ksumit | Another question: How do I restart all containers after a reboot? | 15:30 |
inc0 | ksumit: why would you want to do that? | 15:30 |
sean-k-mooney | satyar: yes it does, that was a recent change | 15:30 |
openstackgerrit | Eduardo Gonzalez proposed openstack/kolla-ansible master: WIP: fix zun deployment https://review.openstack.org/456256 | 15:30 |
kfox1111 | morning. | 15:30 |
inc0 | morning kfox1111 | 15:31 |
satyar | in kolla? | 15:31 |
inc0 | satyar: in neutron | 15:31 |
inc0 | that would be change in neutron | 15:31 |
sean-k-mooney | satyar: in os-vif actually, it's in the ocata release | 15:31 |
ksumit | @inc0 I mean I am testing the deployment in a lab environment where they restart the machines every 2 nights for some reason. | 15:31 |
*** Pavo has joined #openstack-kolla | 15:31 | |
satyar | neutron i have 1 week old code | 15:31 |
satyar | of stable ocata | 15:31 |
ksumit | @inc0 So I was wondering about the steps that I need to take after a reboot to get everything working again. | 15:32 |
inc0 | aio or multinode? | 15:32 |
ksumit | AIO for now | 15:32 |
sean-k-mooney | satyar: https://github.com/openstack/os-vif/commit/01da454fc8f50f2c86200df377dab4d21cecb753 | 15:32 |
inc0 | well containers should restart | 15:32 |
inc0 | I mean they should start on their own | 15:32 |
sean-k-mooney | if you have neutron configured with the new mtu, then when the vm reboots we fix the mtus when it is replugged | 15:32 |
satyar | ohh ok | 15:33 |
satyar | will try this now... | 15:33 |
inc0 | order they would restart in isn't determined in any way | 15:33 |
inc0 | so just do docker ps and see if everything works | 15:33 |
swinn | inc0: thanks for the help yesterday, I have a working kolla deployment but a few questions | 15:34 |
satyar | sean-k-mooney: i have the code base already | 15:34 |
inc0 | swinn: shoot | 15:34 |
satyar | that was on 9th Dec 2016 | 15:34 |
swinn | should my compute containers be qemu based? | 15:34 |
inc0 | satyar: this isn't in neutron | 15:34 |
inc0 | it's separate service I think | 15:34 |
sean-k-mooney | satyar: yes that code change was explicitly to deal with the mtu issue on upgrade for TripleO | 15:34 |
inc0 | swinn: depends, qemu or qemu+kvm | 15:34 |
sean-k-mooney | if you update the neutron config before you upgrade that change will fix it when the vm reboots | 15:35 |
swinn | inc0: so in the list of hypervisors, I see qemu reported as the hypervisor type | 15:35 |
inc0 | if you want different hypervisor then it's different story | 15:35 |
swinn | inc0: so am I missing a config flag to change that? | 15:35 |
sean-k-mooney | also the guest os will get the updated mtu by dhcp when it renews its lease if you don't reboot the vms | 15:35 |
satyar | sean-k-mooney bit confused... | 15:36 |
satyar | os-vif code i guess i am missing something | 15:37 |
inc0 | swinn: we don't set virt_type by default | 15:37 |
swinn | and then for customizing services, is the easiest method (without spending days scouring the config reference docs) going to be to let kolla genconfigs and then edit them or will that break something? | 15:37 |
satyar | sean-k-mooney: where this code is residing? | 15:37 |
satyar | is it out of nova/neutron? | 15:37 |
sean-k-mooney | it executed as part of the nova-compute agent | 15:38 |
inc0 | but I believe kvm is default | 15:38 |
satyar | ok | 15:38 |
satyar | then i should have this code also | 15:38 |
inc0 | negative swinn, let me show you | 15:38 |
satyar | as my nova code is also 1 week back | 15:38 |
sean-k-mooney | satyar: the os-vif change was part of ocata | 15:38 |
satyar | ok | 15:39 |
satyar | so if i rebuild the nova-compute image | 15:40 |
satyar | i should be getting these changes, right? | 15:40 |
*** duonghq has joined #openstack-kolla | 15:40 | |
sean-k-mooney | satyar: the workflow is update kolla external neutron config with mtu params, then kolla upgrade, then reboot vms on node(not required if you evacuate node first) | 15:41 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes master: WIP: v4 gates. https://review.openstack.org/454841 | 15:42 |
sbezverk | kfox1111: ping | 15:42 |
satyar | sean-k-mooney: i followed the same | 15:42 |
kfox1111 | sbezverk: hi. | 15:42 |
kfox1111 | the resolv.conf ps looks great. :) | 15:43 |
sbezverk | kfox1111: thanks, I was pinging you about tiller | 15:43 |
kfox1111 | ah. tiller. | 15:43 |
satyar | still getting the same issue | 15:43 |
sbezverk | kfox1111: I patched tiller and it is working now in my local test bed | 15:43 |
kfox1111 | for raising up the size limit? | 15:43 |
sbezverk | kfox1111: yep | 15:44 |
sbezverk | I pushed PR to helm people | 15:44 |
kfox1111 | k. lets put in an official bug report with helm then and ask for a 2.3.1? | 15:44 |
sbezverk | it is done | 15:44 |
kfox1111 | ah. even better. :) | 15:44 |
sbezverk | kfox1111: https://github.com/kubernetes/helm/issues/2261 | 15:44 |
kfox1111 | they have been pretty responsive to us in the past. if they can roll out a quick 2.3.1 bug fix that might be better to go forward then instead of rolling back. | 15:45 |
sean-k-mooney | satyar: i missed the start of the thread, can you describe the issue | 15:45 |
sbezverk | kfox1111: yep I like going forward better ;) | 15:45 |
sbezverk | https://github.com/kubernetes/helm/pull/2262 | 15:45 |
sbezverk | kfox1111: I hope they will not find any negative side effects in bumping up message size limit | 15:47 |
swinn | inc0: that would be appreciated, as an example how would I modify the cpu overcommit in nova.conf and update the containers to honor the setting? I think that would give me the pointers I need to do any others. | 15:47 |
inc0 | swinn: sorry, I was on phone | 15:48 |
swinn | no worries :) | 15:49 |
inc0 | https://docs.openstack.org/developer/kolla-ansible/advanced-configuration.html#openstack-service-configuration-in-kolla | 15:49 |
satyar | sean-k-mooney: raised the bug https://bugs.launchpad.net/kolla/+bug/1681919 | 15:49 |
openstack | Launchpad bug 1681919 in kolla "After upgrade VMs getting 1500 MTU although jumbo frame is setup" [Undecided,New] | 15:49 |
kfox1111 | sbezverk: the line 2 above your change still mentions 10mb. | 15:49 |
inc0 | swinn: ^ you want to use this mechanism | 15:49 |
*** vhosakot has joined #openstack-kolla | 15:50 | |
inc0 | https://docs.openstack.org/developer/kolla-ansible/quickstart.html on this page ctrl+f for virt_type to show you exactly how to do it | 15:50 |
swinn | perfect, thanks a ton | 15:50 |
*** daidv has joined #openstack-kolla | 15:51 | |
sean-k-mooney | satyar: so yes, the behavior you describe should be resolved by restarting the old vm | 15:51 |
sean-k-mooney | that is what the os-vif change was meant to do | 15:51 |
*** lamt has quit IRC | 15:53 | |
*** lrensing has quit IRC | 15:53 | |
satyar | sean-k-mooney i tried, it didn't :( | 15:54 |
satyar | but after rebooting the host the issue got resolved | 15:54 |
*** lrensing has joined #openstack-kolla | 15:54 | |
satyar | sean-k-mooney: mentioned that in the bug also | 15:55 |
sbezverk | kfox1111: sorry I do not follow, where do you see still 10mb? | 15:55 |
kfox1111 | https://github.com/kubernetes/helm/pull/2262/files#diff-8f4541a8ec0fb2b3143ca168f7083f17L31 | 15:55 |
kfox1111 | you tweak line 33. line 31 says: | 15:56 |
kfox1111 | // maxMsgSize use 10MB as the default message size limit. | 15:56 |
mnaser | friendly notification | 15:56 |
mnaser | kolla meeting in 4 | 15:56 |
kfox1111 | mnaser: oh. thanks for the reminder. :) | 15:56 |
mnaser | ..at #openstack-meeting-4 | 15:56 |
sean-k-mooney | if it didn't, it's probably not a kolla bug but rather nova/os-vif/neutron | 15:56 |
sbezverk | kfox1111: right the comment, will fix it | 15:56 |
satyar | yeah tried with baremetal it works fine | 15:57 |
inc0 | meeting time | 15:58 |
sdake | sup peeps | 15:58 |
Jeffrey4l | egonzalez, re until in register, if this kind of normal task need retry for network error, all task should add this. | 15:58 |
sdake | lunch time | 15:58 |
Jeffrey4l | so we assume the network is OK during deployment. | 15:58 |
sbezverk | kfox1111: it looks like helm people will be ok to roll this fix into next release | 15:58 |
sdake | sbezverk did you find the helm 2.3.0 bug? | 15:59 |
sbezverk | sdake: yes, and fixed it | 15:59 |
kfox1111 | sbezverk: will they be willing to roll the next release asap to fix the issue? its kind of a regression. | 15:59 |
egonzalez | Jeffrey4l, roger, thanks | 15:59 |
sdake | sbezverk nice dude! | 15:59 |
*** tovin07 has joined #openstack-kolla | 16:00 | |
sbezverk | kfox1111: that I did not ask, but you should comment on the link I pasted with issue | 16:00 |
kfox1111 | inc0: you see the message: | 16:00 |
kfox1111 | [Openstack-operators] [kolla] Issue with Galera | 16:00 |
kfox1111 | someone needs to answer that one asap. | 16:00 |
sdake | kfox1111 sbezverk meeting time plz | 16:00 |
*** jascott1_ has joined #openstack-kolla | 16:01 | |
inc0 | kfox1111: we have bug opened | 16:01 |
sdake | meeting in #openstack-meeting-4 | 16:01 |
Jeffrey4l | SamYaple, with my patch, you can use "mariadb_recover_inventory_name" variable for specify the recover node. | 16:02 |
*** hieulq_ has joined #openstack-kolla | 16:04 | |
*** manheim has joined #openstack-kolla | 16:06 | |
SamYaple | Jeffrey4l: i made a bug report for kolla, im sure youll all work it out. | 16:06 |
Jeffrey4l | SamYaple, i will read scrollback later. and mind pasting the bug link? | 16:08 |
*** rwsu has joined #openstack-kolla | 16:09 | |
*** manheim has quit IRC | 16:10 | |
*** skramaja has quit IRC | 16:11 | |
*** Pavo has quit IRC | 16:12 | |
swinn | I’m assuming that I can’t communicate with the world because ovs-system and all the bridges report down.... | 16:12 |
swinn | best container to troubleshoot this from? | 16:13 |
*** Pavo has joined #openstack-kolla | 16:13 | |
*** Pavo has quit IRC | 16:13 | |
*** Pavo has joined #openstack-kolla | 16:14 | |
*** Pavo has quit IRC | 16:14 | |
*** Pavo has joined #openstack-kolla | 16:15 | |
*** Pavo has quit IRC | 16:15 | |
*** Pavo has joined #openstack-kolla | 16:15 | |
*** Pavo has quit IRC | 16:16 | |
*** Pavo has joined #openstack-kolla | 16:16 | |
*** Pavo has quit IRC | 16:16 | |
SamYaple | Jeffrey4l: https://bugs.launchpad.net/kolla-ansible/+bug/1682153 | 16:17 |
openstack | Launchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Critical,Confirmed] | 16:17 |
Jeffrey4l | roger. thanks. | 16:17 |
*** hieulq_ has quit IRC | 16:19 | |
*** l4yerffeJ has joined #openstack-kolla | 16:21 | |
*** sayantan_ has joined #openstack-kolla | 16:22 | |
*** l4yerffeJ has quit IRC | 16:23 | |
*** athomas has quit IRC | 16:23 | |
*** Serlex has quit IRC | 16:23 | |
*** jemcevoy has joined #openstack-kolla | 16:24 | |
*** sayanta__ has joined #openstack-kolla | 16:26 | |
*** lamt has joined #openstack-kolla | 16:27 | |
jemcevoy | inc0: How do I clean out my local docker registry/repo so I can rebuild cleanly? The ops server I am using does not have much local storage | 16:28 |
*** sayantan_ has quit IRC | 16:29 | |
*** mewald has joined #openstack-kolla | 16:32 | |
mewald | In my 3-node controller cluster the first two nodes died. I am just trying to bring back up the first of them but RabbitMQ keeps failing and is hanging in a restart loop of the docker container. I cannot extract any logging information since nothing is written. Any ideas? | 16:33 |
inc0 | jemcevoy: we have meeting now, but kolla-ansible destroy;) | 16:33 |
mewald | hehe nope definitely not, it's not a playground environment :D | 16:34 |
mewald | One of the deploy tasks fails but ansible keeps going: https://gist.github.com/mewald1/0b485cd77b73f2a2439b2d2c06ae220a | 16:35 |
*** calbers has quit IRC | 16:38 | |
*** lpetrut has quit IRC | 16:38 | |
mewald | inc0: lol, just noticed you didnt even talk to me xD | 16:38 |
*** gfidente is now known as gfidente|afk | 16:41 | |
*** mewald has quit IRC | 16:43 | |
*** david-lyle has joined #openstack-kolla | 16:45 | |
*** zhubingbing_ has quit IRC | 16:49 | |
*** unicell has quit IRC | 16:53 | |
*** mewald has joined #openstack-kolla | 16:53 | |
*** nathharp has quit IRC | 16:56 | |
inc0 | mewald: yeah:P | 16:59 |
inc0 | so kfox1111 Jeffrey4l | 16:59 |
inc0 | I'd suggest having both trunk and head of stable branches | 16:59 |
kfox1111 | inc0: see my message above about an important email. | 16:59 |
inc0 | so nova-api trunk AND nova-api ocata | 16:59 |
kfox1111 | so, stable patches that aren't released? | 17:00 |
inc0 | kfox1111: https://bugs.launchpad.net/kolla-ansible/+bug/1682153 | 17:00 |
openstack | Launchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Critical,Confirmed] | 17:00 |
inc0 | kfox1111: right | 17:00 |
inc0 | for example kolla-k8s will use it | 17:00 |
egonzalez | i'd use ocata-latest for daily builds | 17:00 |
jemcevoy | inc0: Thankyou | 17:00 |
kfox1111 | inc0: someone should respond to him directly, if they haven't already. I'm not involved in the ansible side enough that I think I should. | 17:01 |
kfox1111 | inc0: ah. the question is I guess, do we want the latest stable built to show up as latest, or trunk? | 17:01 |
kfox1111 | or maybe we do both? | 17:01 |
kfox1111 | latest stable tested build that's unrevisioned is ocata-latest and stable trunk is ocata-trunk or something? | 17:02 |
*** ksumit has quit IRC | 17:02 | |
inc0 | responded | 17:02 |
daidv | So, can I ask you something around "Mixing binary and source image" topic? | 17:02 |
sdake | inc0 i put that protocore email on the ml | 17:02 |
kfox1111 | inc0: thx. | 17:02 |
inc0 | kfox1111: trunk is master branch | 17:02 |
sdake | hope to get some responses from the nova peeps | 17:02 |
inc0 | that how I'd understand it | 17:03 |
sdake | Jeffrey4l quick Q | 17:03 |
daidv | inc0, ping? | 17:03 |
kfox1111 | inc0: true.... what else could we call it... tip? | 17:03 |
inc0 | daidv: I'm here | 17:03 |
kfox1111 | tip of the stable branch? | 17:03 |
daidv | Why did we separate binary image deployment vs source image deployment? | 17:03 |
inc0 | ocata-latest | 17:03 |
kfox1111 | daidv: cause some folks like me like to use rdo vendor packages, and some folks like to use pip packages. | 17:03 |
daidv | I think one service can be deployed by binary or source image? | 17:03 |
sdake | Jeffrey4l do you have time to review a spec related to the etcd work? | 17:03 |
inc0 | daidv: we have had both types from the very beginning | 17:03 |
*** tovin07 has quit IRC | 17:03 | |
kfox1111 | both have different drawbacks/benefits. | 17:04 |
Jeffrey4l | sdake, yep | 17:04 |
inc0 | kfox1111: why not just release name? | 17:04 |
inc0 | ocata == tip of stable/ocata | 17:04 |
inc0 | master == tip of master | 17:04 |
kfox1111 | inc0: why version stable at all then? | 17:05 |
Jeffrey4l | daidv, if you can do something which can make binary and source work together. it is be great. | 17:05 |
inc0 | well tags are better tested | 17:05 |
kfox1111 | what benefit is there to 4.0.0 / 4.0.1? | 17:05 |
Jeffrey4l | kfox1111, i agree on this point. | 17:05 |
Jeffrey4l | with inc0 | 17:05 |
kfox1111 | we could skip point versions then and just do | 17:05 |
Jeffrey4l | for 4.0.1, how can we tell the end-user, this is a tag image or branch image? | 17:05 |
sdake | Jeffrey4l this review will take you some time | 17:06 |
kfox1111 | 4.0-1.. 4.0-2... with latest aliased to ocata-latest | 17:06 |
daidv | Jeffrey4l, yep, I got other question around "Tarball URLs" | 17:06 |
sdake | please provide feedback if it would be useful for kolla-ansible (only) | 17:06 |
sdake | https://review.openstack.org/#/c/243114/5/specs/queens/oslo-config-db.rst | 17:06 |
daidv | Why did we config special release url instead of stable release? | 17:06 |
sdake | inc0 feel free to weigh in plz | 17:06 |
daidv | I mean aodh-1.0.0.tar.gz or aodh-2.0.0.tar.gz instead of aodh-stable-mitaka.tar.gz or aodh-stable-newton.tar.gz | 17:06 |
inc0 | sdake: I think I did | 17:06 |
Jeffrey4l | sdake, then i need review it tomorrow. should go to bed soon ;) | 17:07 |
*** jascott1_ has quit IRC | 17:07 | |
sdake | inc0 i just spoke with emilian face to face here and he said nobody from kolla responded on the etcd thread or review | 17:07 |
*** jascott1_ has joined #openstack-kolla | 17:07 | |
sdake | i honestly don't know much about how it would be used in kolla-ansible which is why I asked you and Jeffrey4l to take a look :) | 17:07 |
Jeffrey4l | kfox1111, what -1 -2 for? | 17:07 |
inc0 | ahh I must've missed clicking publish | 17:07 |
inc0 | or sth | 17:08 |
inc0 | I'll check it out | 17:08 |
sdake | inc0 huh? | 17:08 |
kfox1111 | Jeffrey4l: the problem we have now is we're not doing revisioning. | 17:08 |
Jeffrey4l | support etcd for oslo-config, hrm kolla-ansible should not use such a solution | 17:08 |
inc0 | nvm | 17:08 |
inc0 | I'll review it | 17:08 |
kfox1111 | we need to know when things change in the containers that are not related directly to kolla's code. | 17:08 |
inc0 | Jeffrey4l: not as default | 17:08 |
kfox1111 | revisions do that. | 17:08 |
inc0 | but as option, shy not | 17:08 |
inc0 | why | 17:08 |
Jeffrey4l | kfox1111, we are talking about the stable branch. it is OK to not save the revisioning | 17:09 |
daidv | inc0, I mean can we create some new things to deploy binary and source image together | 17:09 |
kfox1111 | thats how rpm/deb/etc deal with the problem of dep versioning. | 17:09 |
kfox1111 | Jeffrey4l: no, its a big problem we're not handling. | 17:09 |
inc0 | guys, ok I'm context switching too much, one discussion at the time please | 17:09 |
kfox1111 | several times now we have seen bad containers on the hub. | 17:09 |
Jeffrey4l | re "Why did we config special release url instead of stable release?": a tag is more stable than the stable branch | 17:09 |
kfox1111 | and the fix hasn't been to release a new kolla stable revision, | 17:09 |
kfox1111 | but just to rebuild the containers. | 17:09 |
sdake | kfox1111 sbezverk i'm not sure what to make of this : https://github.com/kubernetes/kubernetes/issues/43819#issuecomment-293553452 | 17:09 |
Jeffrey4l | stable branch is still develop/master branch actually. | 17:10 |
inc0 | kfox1111: what about daily? | 17:10 |
inc0 | do we want to push daily builds with revision? | 17:10 |
kfox1111 | inc0: daily would work I think. | 17:10 |
inc0 | and tag it with for example 4.0.0-12.04.2017 | 17:10 |
daidv | Jeffrey4l, gotcha, thanks. | 17:11 |
kfox1111 | inc0: just to be clear, we're talking about daily builds for updated stuff or for patches? | 17:11 |
inc0 | kfox1111: on dockerhub images in general I think | 17:11 |
kfox1111 | inc0: well, I want to squash revisions that don't matter. | 17:11 |
inc0 | because having multiple logics for that is bad | 17:11 |
kfox1111 | it may be weeks between 4.0.0-1 and 4.0.0-2. | 17:11 |
inc0 | I don't know which matters and which doesn't | 17:11 |
kfox1111 | the idea is to minimize the operator load for looking for changes. | 17:11 |
*** jascott1_ has quit IRC | 17:12 | |
inc0 | we don't have capacity to follow changes in every other openstack project | 17:12 |
Jeffrey4l | kfox1111, i do not recommend the operator to use hub.docker.com image directly | 17:12 |
kfox1111 | inc0: we don't need to. its deps we care about when it comes to revision. | 17:12 |
kfox1111 | for example: | 17:12 |
inc0 | not for source builds | 17:12 |
inc0 | and something will change every day, I almost guarantee you | 17:13 |
kfox1111 | we build container 4.0.0. do an rpm -qa on it, see openssl v1.0 | 17:13 |
inc0 | some library will be updated | 17:13 |
kfox1111 | we build it two days later, and the rpm -qa shows openssl v1.1, | 17:13 |
kfox1111 | we automatically push it to the hub as 4.0.0-2 | 17:13 |
inc0 | kfox1111: and who will give us library names we care about? | 17:13 |
kfox1111 | inc0: any. | 17:13 |
kfox1111 | rpm -qa, dpkg -l and pip list. | 17:14 |
inc0 | then it will change almost every day | 17:14 |
kfox1111 | any changes there trigger a rebuild. | 17:14 |
kfox1111 | not usually. | 17:14 |
inc0 | well so here's thing | 17:14 |
kfox1111 | rpms change weekly maybe. pip too. | 17:14 |
inc0 | we can do weekly builds then | 17:14 |
kfox1111 | security updates should hit as soon as they are released by the vendor. | 17:14 |
kfox1111 | really, the rpms and debs and pip stuff doesn't change very often. | 17:15 |
*** shardy has quit IRC | 17:15 | |
kfox1111 | but when it changes we need to release asap. | 17:15 |
kfox1111 | thats all automatable. | 17:15 |
Jeffrey4l | i am strongly against the 4.0.0-2 idea | 17:15 |
inc0 | so problem I'm facing is | 17:15 |
inc0 | if something changes in nova container | 17:15 |
inc0 | we still need to rebuild whole stack | 17:15 |
kfox1111 | Jeffrey4l: the problem is, you want sites to reproduce the gates we have for testing. thats not tenable for most sites. | 17:16 |
inc0 | Jeffrey4l: it's not bad idea if we execute it correctly | 17:16 |
kfox1111 | reusing our gating to test our up to date containers allows shipping tested, updated stuff to our users. | 17:16 |
inc0 | ok, we discussed it long enough for it to be spec | 17:16 |
kfox1111 | thats very very valuable. | 17:16 |
Jeffrey4l | iirc, sdake talked this with me about this. | 17:16 |
inc0 | I think | 17:16 |
inc0 | we need broad feedback from various ops | 17:16 |
kfox1111 | inc0: thats solvable though. | 17:16 |
inc0 | kfox1111: I know it's solvable | 17:17 |
inc0 | but there are lots of moving parts | 17:17 |
*** lamt has quit IRC | 17:17 | |
kfox1111 | inc0: we nightly build fresh containers. we look for changes. if any changes, we gate test. if it passes, we push only the updated containers. | 17:17 |
Jeffrey4l | kfox1111, even for devstack/centos or every other gate job, your concern can not be solved. | 17:17 |
inc0 | kfox1111: we will test always | 17:17 |
inc0 | not only after changes | 17:17 |
Jeffrey4l | on the other hand, how many issues are caused by repo updates? | 17:17 |
kfox1111 | inc0: this is separate from pushing changes into git. | 17:17 |
kfox1111 | inc0: those are tested always. | 17:17 |
Jeffrey4l | i do not see such issue for the last one year. | 17:18 |
inc0 | well, I think we can run gates on cron-based pushes too | 17:18 |
inc0 | btw publisher will run in infra | 17:18 |
kfox1111 | inc0: we have a nightly job thats testing containers for kolla-kubernetes. | 17:18 |
inc0 | and will pull from tarballs as we discuss it now | 17:18 |
kfox1111 | all we need to do is slide in a kolla build step at the front of it. | 17:18 |
inc0 | we merge every day | 17:18 |
kfox1111 | and plumbing to look for changes. | 17:19 |
inc0 | god my brain hurts | 17:19 |
inc0 | I'll start ML thread ok? | 17:19 |
kfox1111 | most of the work's already done. the main big piece left is the authenticated upload to the hub. | 17:19 |
Jeffrey4l | it seem we are talking multi topic now. | 17:19 |
inc0 | yeah | 17:19 |
kfox1111 | could be.. | 17:19 |
Jeffrey4l | ML thread will be good start and let us focus on only one topic | 17:20 |
inc0 | I'll start 2 thread | 17:20 |
kfox1111 | k. | 17:20 |
inc0 | s | 17:20 |
kfox1111 | we need to cover all the topics though. | 17:20 |
inc0 | 1. for daily pushed names | 17:20 |
Jeffrey4l | great | 17:20 |
inc0 | 2. revision mgmt | 17:20 |
kfox1111 | so start as many threads as needed. | 17:20 |
inc0 | anything else? | 17:21 |
kfox1111 | whats daily pushed names about? | 17:21 |
inc0 | ocata-tip | 17:21 |
kfox1111 | that for following trunk/tip? | 17:21 |
inc0 | or whatever | 17:21 |
inc0 | yeah | 17:21 |
kfox1111 | k. | 17:21 |
inc0 | we want to get this sorted out asap | 17:21 |
inc0 | and I think it's easy to just agree | 17:21 |
inc0 | in fact I take that back | 17:22 |
inc0 | can we just agree on "ocata"? | 17:22 |
*** duonghq has quit IRC | 17:22 | |
inc0 | just release name with no revision or anything | 17:22 |
inc0 | that will be documented to be tip of branch | 17:22 |
kfox1111 | so, why ocata and not just 4.0 ? | 17:22 |
kfox1111 | that would maybe fit nicer in the naming scheme. | 17:22 |
kfox1111 | 4.0 | 17:23 |
inc0 | ocata is more meaningful and less prone to confusion | 17:23 |
kfox1111 | 4.0.1 | 17:23 |
kfox1111 | 4.0.1-3 | 17:23 |
sbezverk | kfox1111: yeah PR got approved for 2.3.1 :) | 17:23 |
*** unicell has joined #openstack-kolla | 17:23 | |
kfox1111 | sbezverk: cool. do they have a timeframe? | 17:23 |
inc0 | kfox1111: not sure if nicer | 17:23 |
inc0 | imho codename is nicer | 17:23 |
inc0 | corresponds with what we have in git | 17:23 |
kfox1111 | inc0: 50/50 on that one... | 17:23 |
inc0 | which is branch name ocata | 17:23 |
kfox1111 | its good if you know how openstack names things, worse if you dont. | 17:24 |
inc0 | we can call it stable/ocata to be fully equivalent to git branch name | 17:24 |
kfox1111 | though kolla's versioning not matching the release naming is bad too. :/ | 17:24 |
kfox1111 | all confusing. :/ | 17:24 |
inc0 | actually. stable/ocata is my favorite now | 17:24 |
inc0 | we clearly suggest - it's not semver, it's based on git | 17:24 |
kfox1111 | I dont think tags can have /? | 17:24 |
inc0 | then just ocata | 17:25 |
inc0 | stable is redundant | 17:25 |
kfox1111 | I don't have much opinion on it, as I plan never to use it. its a landmine I think. | 17:25 |
inc0 | it's like - if you don't see any numbers in tag name, assume git | 17:25 |
inc0 | kfox1111: not for gates tho | 17:25 |
inc0 | also keep in mind that it will always pass kolla gates before pushing | 17:26 |
kfox1111 | I need to know my containers are all the same everywhere. targeting a tag whose container binaries change is a problem. | 17:26 |
*** daidv has quit IRC | 17:26 | |
kfox1111 | inc0: yeah. its just an alias to the other, revisioned things. | 17:26 |
inc0 | kfox1111: for prod, for gates we want to know that things break asap | 17:26 |
kfox1111 | so it doesn't much matter. | 17:26 |
inc0 | if kolla-ansible/kolla-k8s breaks after pushing something to dockerhub | 17:27 |
inc0 | we'll see it next day | 17:27 |
inc0 | and *that* is valuable | 17:27 |
inc0 | may slow dev of project a little, but will improve overall quality | 17:27 |
*** egonzalez has quit IRC | 17:29 | |
kfox1111 | its easy for us to test it in gate before pushing though. | 17:29 |
kfox1111 | so we wont push broken. | 17:29 |
kfox1111 | the last two parts we really need to finish the effort is: | 17:30 |
kfox1111 | 1. job that pushes stuff from tarballs.o.o to docker hub. | 17:30 |
kfox1111 | 2. way of fingerprinting a container for changes. | 17:30 |
kfox1111 | 2 I think is mostly just rpm -qa | sort > rpms.txt | 17:30 |
kfox1111 | and diff. | 17:30 |
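
The fingerprinting idea kfox1111 describes (dump the package list, sort it, diff it against the previous build) could be sketched roughly as below. This is only an illustration under assumptions: the function names are hypothetical, and the input is assumed to be the plain-text output of something like `rpm -qa`, `dpkg -l` or `pip list` captured from the old and new containers.

```python
def parse_package_list(text):
    """Return the sorted, de-duplicated, non-empty lines of a package listing."""
    return sorted({line.strip() for line in text.splitlines() if line.strip()})


def diff_fingerprints(old_text, new_text):
    """Compare two package listings and report what was added or removed."""
    old = set(parse_package_list(old_text))
    new = set(parse_package_list(new_text))
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def needs_new_revision(old_text, new_text):
    """True when the package set changed, i.e. a new revision should be pushed."""
    changes = diff_fingerprints(old_text, new_text)
    return bool(changes["added"] or changes["removed"])


if __name__ == "__main__":
    # Hypothetical listings standing in for `rpm -qa` output from two builds.
    old_build = "openssl-1.0\nbash-4.2\n"
    new_build = "bash-4.2\nopenssl-1.1\n"
    print(diff_fingerprints(old_build, new_build))
    print(needs_new_revision(old_build, new_build))
```

Comparing sets rather than raw text means a reordered listing does not count as a change, which matches the `sort` step in the original one-liner.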
sdake | kfox1111 can you ack this plz: https://review.openstack.org/#/c/455502/1 | 17:32 |
sdake | sbezverk can you change your vote on above plz ^ | 17:32 |
kfox1111 | sbezverk: how close is the helm 2.3.1 release? | 17:33 |
sbezverk | sdake: the fix got already merged in helm master | 17:33 |
sdake | when is a release coming? | 17:33 |
sdake | or shall we use master for our gates? | 17:33 |
*** tonanhngo has joined #openstack-kolla | 17:34 | |
sbezverk | 2.3.1 will be release this week | 17:34 |
kfox1111 | sdake: if its a day or two out, could you make your dev work you want to do based on your revert ps, | 17:34 |
kfox1111 | then we reparent on trunk when helm gets released? | 17:34 |
*** tonanhngo has quit IRC | 17:35 | |
kfox1111 | then we dont have to revert/reapply. | 17:35 |
*** sambetts is now known as sambetts|afk | 17:36 | |
*** tonanhngo has joined #openstack-kolla | 17:37 | |
sdake | i guess although master is effectively broken in the meantime | 17:38 |
sdake | we shouldn't be afraid of git revert | 17:38 |
sdake | git revert can be used to revert a revert ;) | 17:38 |
sdake | happens all the time | 17:38 |
kfox1111 | sdake: its kind of broken for the gate. | 17:38 |
kfox1111 | but you don't share any of the setup_helm stuff, | 17:38 |
kfox1111 | in the dev docs, so it doesn't really matter. | 17:39 |
*** nathharp has joined #openstack-kolla | 17:39 | |
kfox1111 | it just muddies the history though and makes a git bisect slightly longer. | 17:39 |
kfox1111 | it feels like its just work that procrastination will solve soon. :) | 17:40 |
*** nathharp has quit IRC | 17:47 | |
*** mewald has quit IRC | 17:52 | |
jemcevoy | inc0: I ran the destroy successfully and I still have 3.2GB in /var/lib/docker/volumes/sha256/_data/docker/registry/v2/ Is there a way to get that space back? | 17:58 |
kfox1111 | inc0: how much more work do you think there is to getting the config over? | 18:01 |
*** mattmceuen has joined #openstack-kolla | 18:02 | |
*** krtaylor has quit IRC | 18:05 | |
*** MasterOfBugs has joined #openstack-kolla | 18:06 | |
*** pramodrj07 has joined #openstack-kolla | 18:06 | |
*** pramodrj07 has quit IRC | 18:06 | |
*** MasterOfBugs has quit IRC | 18:06 | |
*** MasterOfBugs has joined #openstack-kolla | 18:06 | |
*** pramodrj07 has joined #openstack-kolla | 18:06 | |
*** phuongnh has quit IRC | 18:07 | |
*** nathharp has joined #openstack-kolla | 18:07 | |
*** lamt has joined #openstack-kolla | 18:13 | |
inc0 | kfox1111: need to check what's missing | 18:13 |
inc0 | jemcevoy: we don't remove registry | 18:14 |
kfox1111 | inc0: hoping we can get that done by the next release. | 18:14 |
kfox1111 | and I would like to patch it all to support logging to stdout. | 18:14 |
inc0 | ok, I'll do due diligence today | 18:14 |
kfox1111 | but don't want to start until the stuff is merged? | 18:14 |
inc0 | and make sure missing patches are up | 18:14 |
inc0 | I think we're mostly there | 18:14 |
kfox1111 | inc0: cool. thanks. | 18:14 |
kfox1111 | I've got a fluent-bit package up for review in kubernetes/charts I think would work well with kolla-kubernetes in that mode. | 18:15 |
kfox1111 | https://github.com/kubernetes/charts/pull/895 | 18:15 |
*** nathharp has quit IRC | 18:16 | |
*** srwilkers has quit IRC | 18:19 | |
sbezverk | kfox1111: I guess we will have to revert after all | 18:22 |
sbezverk | fixing grpc limitation exposed | 18:23 |
sbezverk | limitation with configmap being capped at 1MB | 18:23 |
*** satyar has quit IRC | 18:23 | |
openstackgerrit | Marcus Williams proposed openstack/kolla-ansible master: Add OpenDaylight role https://review.openstack.org/416367 | 18:32 |
kfox1111 | sbezverk: so the fix didn't fix it for us? | 18:35 |
sbezverk | it fixed only 1 part which was hiding configmap limitation | 18:36 |
kfox1111 | ok. | 18:37 |
kfox1111 | so... | 18:37 |
sbezverk | so I will try to see why 2.3.0 generates a bigger configmap than 2.2.3 | 18:37 |
sbezverk | but it might take longer | 18:38 |
kfox1111 | either helm trims back what extra stuff they are putting in over 2.2, | 18:38 |
kfox1111 | we stick to 2.2 forever, | 18:38 |
kfox1111 | helm comes up with a new way of storing data, | 18:38 |
kfox1111 | k8s gets bigger configmap support, | 18:38 |
kfox1111 | or we stop doing computekit charts. | 18:38 |
kfox1111 | or, helm gets an aggregate feature. where the releases are still separate. | 18:39 |
sbezverk | yeah none of these looks very attractive | 18:39 |
sbezverk | and if we need any new features in helm we are busted | 18:41 |
kfox1111 | which we know we do. | 18:41 |
sbezverk | unless they get backported which is not always possible :( | 18:41 |
kfox1111 | parent values is much more important I think long term than computekit chart. :/ | 18:41 |
sbezverk | well, it depends, for sdake it seems computekit is almost a showstopper | 18:42 |
sbezverk | we could try to request configmap size increase? | 18:43 |
kfox1111 | that wouldn't happen quickly. :/ | 18:43 |
kfox1111 | Id ask to see if the changes to helm 2.3 to the file format could easily be backed off temporarily, | 18:44 |
kfox1111 | and then for 2.4 support something like configmap spillover to a second configmap. | 18:44 |
kfox1111 | another option would be supporting 2.3 for everything but computekit, and changing helm ver in computekit job to 2.2. | 18:45 |
kfox1111 | though that would still prevent us from making 2.3 only changes to the microservice charts I guess.... :/ | 18:45 |
kfox1111 | and parent values will go a long way to being "helm native" | 18:46 |
kfox1111 | I guess this one's one for sdake to answer. | 18:46 |
kfox1111 | sdake: whats more important to you, being more helm native, or computekit as a chart. | 18:46 |
kfox1111 | cause I think we're staring that in the face right now for a few weeks at least. | 18:47 |
*** krtaylor has joined #openstack-kolla | 18:49 | |
*** Manheim has joined #openstack-kolla | 18:53 | |
sbezverk | kfox1111: seems that way, the error comes directly from kubernetes | 18:56 |
sbezverk | so nothing can be done on helm side other than optimization of the size of configmap | 18:56 |
sbezverk | but we do not know if it is doable | 18:56 |
kfox1111 | or make configmap have a pointer to a "next" configmap, | 18:58 |
kfox1111 | and it pulls data out of each one and appends them together. | 18:58 |
kfox1111 | its workaroundable. but not necessarily desirable. | 18:58 |
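
A rough sketch of the "pointer to a next configmap" workaround floated above: split an oversized release record into chunks under the size limit, link each chunk to the next, and reassemble by following the links. This is not how Helm or Tiller actually store releases; plain dicts stand in for ConfigMaps, the `name.partN` key scheme and function names are hypothetical, and the 1 MiB default is an assumption taken from the ~1MB limit discussed in channel.

```python
MAX_CHARS = 1048576  # assumed per-ConfigMap limit (1 MiB), for illustration only


def split_release(name, payload, limit=MAX_CHARS):
    """Split payload into linked chunks; each dict stands in for one ConfigMap."""
    chunks = [payload[i:i + limit] for i in range(0, len(payload), limit)] or [""]
    maps = {}
    for idx, chunk in enumerate(chunks):
        key = f"{name}.part{idx}"
        # The last chunk has no successor, so its "next" pointer is None.
        nxt = f"{name}.part{idx + 1}" if idx + 1 < len(chunks) else None
        maps[key] = {"data": chunk, "next": nxt}
    return maps


def join_release(maps, first):
    """Follow the 'next' pointers from the first chunk and append the data."""
    parts = []
    key = first
    while key is not None:
        entry = maps[key]
        parts.append(entry["data"])
        key = entry["next"]
    return "".join(parts)


if __name__ == "__main__":
    stored = split_release("veering-heron.v1", "x" * 25, limit=10)
    print(sorted(stored))
    print(join_release(stored, "veering-heron.v1.part0") == "x" * 25)
```

The cost, as noted in the discussion, is that every read and write now touches several objects instead of one, which is why it is workable but not necessarily desirable.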
sbezverk | kfox1111: I suspect they package all charts into a single configmap and ship it to tiller | 19:01 |
*** iceyao has joined #openstack-kolla | 19:01 | |
*** srwilkers has joined #openstack-kolla | 19:01 | |
sbezverk | then tiller parses it and instantiate individual objects | 19:01 |
kfox1111 | sbezverk: not sure. it may be the other way around. | 19:02 |
kfox1111 | the chart goes to tiller via grpc, | 19:02 |
kfox1111 | and tiller stores the state it cares about in configmap. | 19:02 |
kfox1111 | I would guess its that way, as the grpc limit is higher than the configmap size. | 19:02 |
kfox1111 | even before you changed it, it was 10m. | 19:02 |
sbezverk | Apr 12 13:42:53 kube-1 journal: 2017/04/12 17:42:53 release_server.go:840: warning: Failed to record release "veering-heron": ConfigMap "veering-heron.v1" is invalid: data: Too long: must have at most 1048576 characters | 19:03 |
kfox1111 | vs the 1m for configmap. | 19:03 |
*** ipsecguy_ has joined #openstack-kolla | 19:03 | |
sbezverk | you see, we do not have this config map, they build it | 19:03 |
sbezverk | for some internal purposes | 19:03 |
kfox1111 | yeah. | 19:03 |
kfox1111 | its all the state necessary to do diffs I think. | 19:03 |
kfox1111 | I wonder if they are doing any compression on it. | 19:04 |
*** ipsecguy has quit IRC | 19:04 | |
*** iceyao has quit IRC | 19:06 | |
*** lucasxu has quit IRC | 19:18 | |
*** fooliouno has joined #openstack-kolla | 19:18 | |
*** lucasxu has joined #openstack-kolla | 19:20 | |
*** jmccarthy has quit IRC | 19:22 | |
*** lamt has quit IRC | 19:22 | |
fooliouno | sbezverk: I'm running into an issue in kolla-k8s when trying to do a set-manager in OVS to use Opendaylight container as a manager. Seeing the same issue as this guy: https://bugs.launchpad.net/kolla/+bug/1637928 | 19:24 |
openstack | Launchpad bug 1637928 in kolla "openvswitch set-manager to ODL doesn't work" [Low,Incomplete] | 19:24 |
fooliouno | sbezverk: any ideas on how to solve the issue | 19:25 |
fooliouno | kfox1111: Jeffrey4l: ^^^ | 19:26 |
kfox1111 | never played with opendaylight. sorry. :/ | 19:27 |
fooliouno | kfox1111: no problem. Thanks! | 19:27 |
sbezverk | fooliouno: same here, I do not think it will work currently in kolla-k8s | 19:29 |
sbezverk | fooliouno: but that would be a great contribution if you get it working ;) | 19:29 |
fooliouno | sbezverk: thanks. do you know why it may not work. I will continue looking into it, but need pointers :) | 19:31 |
sbezverk | fooliouno: it was assumption since we never tested it | 19:31 |
fooliouno | ah .. ok | 19:32 |
sbezverk | and as kfox1111 like to repeat ;) if it is not tested it is broken | 19:32 |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes master: WIP: v4 gates. https://review.openstack.org/454841 | 19:33 |
kfox1111 | fooliouno: I'm guessing it probably needs architectural changes to work? | 19:34 |
kfox1111 | do you need other agents but neutron-openvswitch-agent/openvswitch-* stuff? | 19:34 |
fooliouno | kfox1111: I was hoping to reuse most of the kolla-k8s containers. The neutron-openvswitch-agent would not be needed. | 19:35 |
fooliouno | kfox1111: I was thinking of building an ovs container myself as suggested by the person in the bug report and trying that. | 19:36 |
kfox1111 | does opendaylight have its own neutron agent then? | 19:37 |
kfox1111 | or agent that runs on the host, and neutron talks to just opendaylight? | 19:37 |
fooliouno | kfox1111: I was setting the ml2 mechanism driver in neutron.conf to use odl so that neutron talks to old | 19:39 |
fooliouno | odl | 19:39 |
fooliouno | just for l2 | 19:39 |
kfox1111 | so, ml2 plugin for odl to talk to the odl controller? | 19:39 |
kfox1111 | what gets settings from the odl controller to the node? | 19:39 |
fooliouno | from my understanding, neutron talks to odl and odl controls ovs | 19:40 |
kfox1111 | hmm.... | 19:40 |
kfox1111 | the question is how. | 19:40 |
kfox1111 | is ovs exported externally, and ip given to odl? | 19:40 |
kfox1111 | if its something like that, then the question is, how does odl discover it/get configured to talk to it. | 19:41 |
fooliouno | I thought that when we set the manager in ovs (ovs-vsctl set-manager), we provide the ODL container IP | 19:41 |
kfox1111 | ah. ok. | 19:42 |
fooliouno | And the problem is that that step is currently broken | 19:42 |
kfox1111 | that step isn't existing I think. | 19:42 |
fooliouno | I ran it manually in the ovs container and it doesnt do anything | 19:43 |
kfox1111 | hmm.. thats weird then. there must be more to it than just that. | 19:43 |
kfox1111 | to make that always run, it should get added to: helm/microservice/openvswitch-vswitchd-daemonset/templates/openvswitch-vswitchd-daemonset.yaml | 19:44 |
fooliouno | It works in a non-containerized environment FWIW :) | 19:44 |
kfox1111 | you kubectl exec'ed into the openvswitch-vswitchd -c main on the compute node and ran ovs-vsctl set-manager ... and it didn't work? | 19:45 |
fooliouno | correct | 19:45 |
kfox1111 | hmm... | 19:45 |
kfox1111 | is it a version difference? | 19:45 |
kfox1111 | ovs version incompatible? | 19:45 |
fooliouno | sorry .. didnt use the -c main but rest was same | 19:46 |
kfox1111 | looks like there is only one container in that pod. so same with or without it. | 19:46 |
fooliouno | Hmm .. I should perhaps try a container with an older ovs version? | 19:46 |
kfox1111 | do you have just one controller? | 19:48 |
fooliouno | one controller as in one kolla_controller node? | 19:48 |
kfox1111 | for opendaylight. | 19:48 |
fooliouno | yes | 19:48 |
*** vhosakot has quit IRC | 19:48 | |
*** vhosakot has joined #openstack-kolla | 19:49 | |
kfox1111 | is there an equiv of neutron agent-list | grep openvswitch-agent | 19:49 |
kfox1111 | in odl to see if hosts are bound to it? | 19:49 |
kfox1111 | do you have an external opendaylight controller you could point it to, | 19:51 |
kfox1111 | to help narrow it down from issues from running the controller in k8s vs issues with the compute nodes talking to a controller? | 19:51 |
*** srwilkers has quit IRC | 19:51 | |
fooliouno | When we look in odl container, there is nothing in it. presumably since ovs could not register with it. | 19:52 |
kfox1111 | k... then yeah, we have to determine if its an issue with ovs, or some issue in between. | 19:53 |
kfox1111 | does odl provide a rest api in the controller? | 19:53 |
kfox1111 | if so, can you curl it from the ovs container? | 19:53 |
fooliouno | FYI .. curl and ssh from ovs container to odl container works | 19:53 |
*** lamt has joined #openstack-kolla | 19:54 | |
fooliouno | Its just that the ovs-vsctl set-manager command fails | 19:54 |
*** lamt has quit IRC | 19:54 | |
kfox1111 | it fails with an error, or says it succeeds and doesn't work? | 19:54 |
fooliouno | nothing. no logs. no output. nothing :) | 19:54 |
kfox1111 | hmm... | 19:55 |
sdake | sup | 19:55 |
sdake | i saw my name used in vain :) | 19:55 |
kfox1111 | are you trying to do dns service resolution? or registering it by service ip? | 19:55 |
fooliouno | tcpdump doesnt show any packets going out either | 19:55 |
kfox1111 | or alternately, whats your ovs-vsctl set-manager look like? | 19:56 |
fooliouno | Just using the ODL container IP:port | 19:56 |
fooliouno | no dns resolution | 19:56 |
kfox1111 | hmm... weird. | 19:56 |
kfox1111 | I'm guessing dns name resolution will fail. | 19:56 |
kfox1111 | but if your not using it, thats not an issue... | 19:56 |
kfox1111 | and echo $? right after the ovs-vsctl set-manager is 0? | 19:57 |
kfox1111 | does ovs-vsctl get-manager print the things you'd expect? | 19:57 |
fooliouno | 127 | 19:58 |
kfox1111 | 127's an error... | 19:58 |
kfox1111 | try the get-manager, but would expect that to not have the info.. | 19:58 |
fooliouno | It shows the one we set: "tcp:10.244.1.72:6640" | 19:59 |
sdake | re computekit | 19:59 |
sdake | i understand there is a 1mb limit | 19:59 |
kfox1111 | fooliouno: oh. interesting. | 19:59 |
sdake | is it in helm itself or kubernetes? | 19:59 |
sdake | and why did 2.3 break computekit? | 19:59 |
kfox1111 | sdake: helm was broken, but once the limit in helm was fixed, a limit in k8s was shown. | 19:59 |
sdake | or was it always broken | 20:00 |
kfox1111 | 2.2 worked. | 20:00 |
kfox1111 | the problem I think is they added a bit more metadata of their own to each release. | 20:00 |
kfox1111 | we were right under the line in 2.2, and just over it in 2.3 | 20:00 |
kfox1111 | even adding a few features in 2.2 might have pushed us over. | 20:00 |
sdake | k8s library gprc is limited to 1mb? | 20:00 |
kfox1111 | 10m. we upped it to 20m. | 20:01 |
sdake | grpc | 20:01 |
kfox1111 | but helm stores state in k8s configmaps, | 20:01 |
sdake | where is the limit that is breaking now | 20:01 |
kfox1111 | which are 1m limited. | 20:01 |
kfox1111 | and with the 20m fix in, we still have the 1m limit causing problems. | 20:01 |
fooliouno | kfox1111: sorry .. the output for echo $? was 0. misunderstood when and how to run it. | 20:01 |
sdake | that kind of kills the advantage of having conditions at all | 20:02 |
kfox1111 | scroll up for my list of possible things that could happen. | 20:02 |
kfox1111 | fooliouno: ok. cool. so, its getting added to ovs ok. | 20:02 |
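
The check discussed above (run `ovs-vsctl set-manager`, look at the exit status, then read the value back with `ovs-vsctl get-manager`) can be sketched as below. This is only an illustration: the `check_manager` helper and the injectable `runner` are hypothetical names, and a passing result only shows that the target was recorded in the OVS database, not that OVS actually connected to ODL.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes a command and returns (exit status, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def run(cmd: List[str]) -> Tuple[int, str]:
    """Default runner: execute the command on the local machine."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def check_manager(target: str, runner: Runner = run) -> dict:
    """Set the OVS manager, then confirm the exit status and stored value."""
    set_rc, _ = runner(["ovs-vsctl", "set-manager", target])
    get_rc, out = runner(["ovs-vsctl", "get-manager"])
    # Tolerate optional surrounding quotes in the printed targets.
    recorded = [line.strip().strip('"') for line in out.splitlines()]
    return {
        "set_ok": set_rc == 0,
        "recorded": get_rc == 0 and target in recorded,
    }
```

Inside the openvswitch-vswitchd container this would be called as `check_manager("tcp:10.244.1.72:6640")`, using the target quoted earlier in the log.
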
sdake | is there any way to increase the limit in kubernetes? | 20:02 |
kfox1111 | sdake: "only a matter of code" ;) | 20:02 |
fooliouno | yep, but ovs doesnt seem to act on it. However, if I set it to neutron, it works and creates the br-int etc | 20:02 |
*** lamt has joined #openstack-kolla | 20:03 | |
kfox1111 | fooliouno: so then, either ovs is not reaching out to odl, or can't contact it for some reason. | 20:03 |
fooliouno | right .. I think ovs is not reaching out since tcpdump on ovs and odl is empty | 20:03 |
kfox1111 | fooliouno: I'd compare versions of ovs in the container and on the bare host you tried. | 20:03 |
kfox1111 | maybe there's something different there. | 20:03 |
kfox1111 | fooliouno: the way k8s works, it might be tricky to tcpdump on the right interface to capture it. | 20:04 |
fooliouno | ok .. | 20:04 |
kfox1111 | is ovs and opendaylight on the same host? | 20:05 |
fooliouno | we dumped on all interfaces for that port. yes .. both containers on same host for now. | 20:05 |
fooliouno | Do you think its a good idea to build our own ovs container with an older version | 20:06 |
kfox1111 | not sure it will help or not... | 20:07 |
kfox1111 | what k8s sdn are you using? | 20:07 |
fooliouno | ocata | 20:07 |
fooliouno | sorry .. flannel | 20:08 |
kfox1111 | flannel, ok... | 20:08 |
kfox1111 | and same host... | 20:08 |
kfox1111 | and the odl controller is net=host or not? | 20:08 |
fooliouno | we have a kube master on one host and kolla_controller and kolla_compute on a second host | 20:08 |
kfox1111 | oh. so odl controller on kolla_controller, and compute on kolla_compute? | 20:09 |
fooliouno | odl is not net-host = true. We debated that. | 20:09 |
fooliouno | both labels set to second host. same node. | 20:09 |
kfox1111 | k. | 20:09 |
fooliouno | odl doesnt really need access to host netns I believe | 20:09 |
fooliouno | but we could be wrong | 20:10 |
kfox1111 | not sure. I would guess not. | 20:10 |
sdake | kfox1111 you got an issue tracker or something that tracks this limit? | 20:12 |
kfox1111 | sdake: nope. maybe sbezverk has one. | 20:12 |
sdake | sbezverk issue tracker on this limit? | 20:12 |
sdake | jascott1 up for fixing a kubernetes limitation? | 20:13 |
sdake | jascott1_ ?? | 20:13 |
kolla-slack | <jascott1> I am! on a call, bbiab | 20:13 |
kfox1111 | sdake: thats not going to be a quick thing I think. :/ | 20:14 |
sdake | kfox1111 its not redefining a global then? :) | 20:14 |
kfox1111 | sdake: it may be that easy. but they don't make changes quickly in my experience. :/ | 20:17 |
openstackgerrit | Bertrand Lallau proposed openstack/kolla-ansible master: Magnum: add oslo_messaging_notifications config https://review.openstack.org/451857 | 20:19 |
openstackgerrit | Bertrand Lallau proposed openstack/kolla-ansible stable/ocata: Trove fix backup restore with Swift https://review.openstack.org/451780 | 20:20 |
*** srwilkers has joined #openstack-kolla | 20:28 | |
kolla-slack | <jascott1> ok so last night figured out that the config map and secret limit is 1mb and gRPC is 10mb | 20:31 |
kolla-slack | <jascott1> the release configmap is too big. so what do we want to do? | 20:31 |
kfox1111 | jascott1: so, we have afew options I think. | 20:31 |
kfox1111 | 1, something changed in helm to require more for the same chart. that could be undone temporarily. | 20:32 |
kfox1111 | 2. its possible to make spillover configmaps? if too big, truncate, mark the configmap as needing -> other configmap and put the rest there. | 20:32 |
kfox1111 | 3. fix it and wait in k8s. | 20:32 |
kolla-slack | <jascott1> oh can we link configmaps like that? | 20:33 |
kfox1111 | I'm leaning to 2 being the solution. | 20:33 |
kfox1111 | jascott1: its just configmaps in k8s. can't see why you couldn't build it on top. | 20:33 |
kfox1111 | data: xxxxxxx | 20:33 |
kolla-slack | <jascott1> hmm arent we hitting both the 1mb and the 10mb limit? | 20:33 |
kfox1111 | next: configmapname2 | 20:33 |
kolla-slack | <jascott1> how to get past the 10mb limit? | 20:34 |
kfox1111 | jascott1: we hit the 10mb limit for sure. but seemed to be ok when sbezverk patched the check out. | 20:34 |
kfox1111 | then hit the configmap limit and are stuck. | 20:34 |
kolla-slack | <jascott1> and then its only going to get worse as we add more openstack components | 20:34 |
kfox1111 | jascott1: yeah. hence I think removing the limit from tiller is the right fix. #2. | 20:35 |
kfox1111 | then tiller's not constrained to k8s ever. | 20:35 |
kolla-slack | <jascott1> yeah I rolled a helm last night with a 2mb before I signed off, did that work? | 20:35 |
kfox1111 | 20mb you mean? | 20:36 |
kfox1111 | I think thats what sbezverk was testing with. | 20:36 |
kfox1111 | it was then the configmap that broke after that. | 20:36 |
kolla-slack | <jascott1> no I changed the k8s validation limit to 2mb for configmaps | 20:36 |
*** vhosakot has quit IRC | 20:36 | |
kfox1111 | oh. | 20:36 |
kfox1111 | he said then k8s was still complaining about 1m limit. | 20:36 |
kolla-slack | <jascott1> hmm | 20:36 |
kolla-slack | <jascott1> yeah so the k8s validation lib is vendored to helm so it wouldnt change k8s itself | 20:37 |
kfox1111 | its the api-server I think thats limiting it. | 20:37 |
kfox1111 | yeah. | 20:37 |
kolla-slack | <jascott1> im sure k8s is using the same validation (not hacked) | 20:37 |
kfox1111 | so, tiller serializes stuff to put into the configmap, then puts it in releasename-configmap.data | 20:38 |
kolla-slack | <jascott1> sounds right | 20:38 |
*** jmccarthy has joined #openstack-kolla | 20:38 | |
kfox1111 | if we take that, %900k, if larger, releasename-configmap-%i | 20:38 |
kfox1111 | it would fit, | 20:38 |
kfox1111 | and retrieving would just be the reverse. append all the configmap data together before tiller gets it. | 20:39 |
kfox1111 | probably a couple dozen lines of code? | 20:39 |
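
The spillover idea above (cut the serialized release into pieces of about 900k, store them as `releasename-configmap-%i`, and append them back together on read) might look roughly like this. It is a sketch under stated assumptions, not tiller code: `split_release`, `join_release`, the dict layout and the `next` pointer are all hypothetical, and plain dicts stand in for real ConfigMap objects.

```python
from typing import Dict, List

# Assumed chunk size: stays under the 1048576-character ConfigMap limit
# quoted in the tiller log, leaving some headroom.
CHUNK_SIZE = 900_000


def split_release(name: str, data: str, chunk_size: int = CHUNK_SIZE) -> List[Dict]:
    """Split serialized release data into linked ConfigMap-like dicts."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [""]
    maps = []
    for index, chunk in enumerate(chunks):
        has_next = index + 1 < len(chunks)
        maps.append({
            "name": f"{name}-{index}",
            "data": chunk,
            # Hypothetical pointer to the next piece, None on the last one.
            "next": f"{name}-{index + 1}" if has_next else None,
        })
    return maps


def join_release(maps_by_name: Dict[str, Dict], first: str) -> str:
    """Follow the "next" pointers and append all chunk data together."""
    parts = []
    current = first
    while current is not None:
        entry = maps_by_name[current]
        parts.append(entry["data"])
        current = entry["next"]
    return "".join(parts)
```

Retrieval is just the reverse of storage, so the caller never has to know the release was split.
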
kolla-slack | <jascott1> but are we saying that we are not over 10mb? | 20:39 |
*** jmccarthy has quit IRC | 20:39 | |
kolla-slack | <jascott1> for entire release? | 20:39 |
kfox1111 | we're over 10m, but under 20m. and setting it to 20m seems to work. | 20:40 |
kolla-slack | <jascott1> ah ok | 20:40 |
kfox1111 | so grpc limit isnt really 10m as far as we can tell. | 20:40 |
kfox1111 | not sure what the real upper limit is. | 20:40 |
kolla-slack | <jascott1> right weve only dealt with helm set limits afaik | 20:40 |
kfox1111 | one other thing we could potentially do... which does make it less helm native potentially... | 20:41 |
kfox1111 | right now, we share a common lib, and it gets copied into every subchart's /charts dir. | 20:41 |
kfox1111 | its not clear to me, if you could "optimize" the chart, by deleting all the duplicates. | 20:42 |
kolla-slack | <jascott1> ah | 20:42 |
kolla-slack | <jascott1> interesting | 20:42 |
kfox1111 | data deduplication. :) | 20:42 |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move rabbitmq config to kolla-k8s https://review.openstack.org/450581 | 20:43 |
kolla-slack | <jascott1> I will take a look into what we are generating | 20:43 |
kfox1111 | if helm supports that, the helm package process should probably do that. | 20:43 |
kfox1111 | we should be able to build the computekit package, | 20:43 |
kfox1111 | extract it, remove all but one of the kolla-common subcharts, | 20:43 |
kfox1111 | and tar it back up. | 20:43 |
kfox1111 | and then see if it deploys. | 20:44 |
kfox1111 | not sure if that will affect the configmap object at all though. | 20:44 |
kfox1111 | that may be the difference between the grpc limit and the configmap one. | 20:44 |
kfox1111 | it may lessen the effect on grpc. but not the configmap. | 20:44 |
sdake | jascott1 so end goal is helm install openstack with individual enablement of services | 20:44 |
kolla-slack | <jascott1> right | 20:44 |
sdake | jascott1 if people want to use service charts thats good too | 20:45 |
sdake | jascott1 i'd like to understand if the limit is in kubernetes or in helm that we are running into | 20:45 |
sdake | unfortunately I am not capable of making this determination on my own | 20:46 |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move heat config to kolla-k8s https://review.openstack.org/450589 | 20:46 |
*** tonanhngo_ has joined #openstack-kolla | 20:48 | |
*** mewald has joined #openstack-kolla | 20:48 | |
*** tonanhng_ has joined #openstack-kolla | 20:49 | |
*** tonanhngo has quit IRC | 20:51 | |
*** tonanhngo_ has quit IRC | 20:52 | |
*** tonanhng_ has quit IRC | 20:53 | |
mewald | In my 3-node controller cluster the first two nodes died. I am just trying to bring back up the first of them but RabbitMQ keeps failing and is hanging in a restart loop of the docker container. I cannot extract any logging information since nothing is written. Any ideas? | 20:56 |
*** tonanhngo has joined #openstack-kolla | 20:58 | |
*** gfidente|afk has quit IRC | 20:59 | |
sdake | mewald panic? | 21:00 |
sdake | mewald how did the first two nodes fail? | 21:00 |
mewald | sdake: They "failed" by running out of disk space. We had to take them down, replace disks and freshly install them | 21:01 |
sdake | ok - so you want to recover with your remaining node to the other two nodes? | 21:02 |
*** jascott1_ has joined #openstack-kolla | 21:02 | |
sdake | inc0 got a 911 here... | 21:02 |
inc0 | mewald: docker logs rabbitmq shows nothing? | 21:02 |
sdake | mewald i have 0% power and no wall plug | 21:03 |
*** jascott1_ has quit IRC | 21:03 | |
sdake | i'll be back in the hotel in about 30 minutes - although not sure how much I can help | 21:03 |
sdake | mewald do you have the original disks? | 21:03 |
mewald | inc0: Starting the container looks like this: https://gist.github.com/mewald1/0de8d046c4b59942103b08f0b706ee6a | 21:04 |
mewald | sdake: puhh yeah but recovering that RAID is going to be pain | 21:04 |
*** jascott1_ has joined #openstack-kolla | 21:04 | |
inc0 | hmm and nothing in /var/lib/docker/volumes/kolla_logs/rabbitmq? | 21:04 |
mewald | inc0: /var/lib/docker/volumes/kolla_logs/_data/rabbitmq/ remains completely empty | 21:04 |
inc0 | duh | 21:04 |
inc0 | hmm | 21:05 |
*** jtriley has quit IRC | 21:05 | |
kolla-slack | <jascott1> sdake its ultimately a k8s limit as far as I can tell | 21:05 |
openstackgerrit | Merged openstack/kolla master: Switch to RDO proxy mirrors https://review.openstack.org/455374 | 21:05 |
inc0 | mewald: what about syslog? | 21:05 |
inc0 | anything there? | 21:06 |
mewald | checking | 21:07 |
mewald | https://gist.github.com/mewald1/bf62938603e5ba096211527a61a26d76 | 21:07 |
inc0 | yeah not really helpful | 21:08 |
mewald | yeah, unfortunately not | 21:09 |
*** lucasxu has quit IRC | 21:09 | |
inc0 | let me try sth | 21:09 |
mewald | I would usually try to kill all rabbits including the last one standing and try to deploy but I am afraid nothing will work after that :D | 21:10 |
mewald | the cloud would be down ;) | 21:10 |
inc0 | docker inspect shows env variables right? | 21:10 |
inc0 | well not necessarily | 21:10 |
inc0 | that being said | 21:10 |
inc0 | try to stop one remaining rabbitmq node | 21:10 |
inc0 | (I assume it's working?) | 21:10 |
sbezverk | kfox1111: we should probably revisit the concept of compute kit and maybe decommission it? I mean, we cannot get stuck with older version of helm just because of compute kit. Also it is not the most popular way of deploying it.. what do you think? | 21:11 |
mewald | inc0: stop or remove the remaining one? | 21:12 |
inc0 | bring all of them to stop | 21:12 |
inc0 | and try to turn them on one at the time | 21:13 |
kolla-slack | <jascott1> @sbezverk whats the alternative? many smaller releases working together? | 21:13 |
mewald | ok | 21:13 |
kfox1111 | sbezverk: yeah, its a cool concept, but maybe its time has not come yet. | 21:13 |
*** lucasxu has joined #openstack-kolla | 21:13 | |
kfox1111 | just doing a simple shell script to launch all the service packages works just as well. | 21:13 |
mewald | inc0: the one that worked before continues to work | 21:14 |
sbezverk | kfox1111: jascott1: right for the moment we I guess need to stick to service charts | 21:14 |
kfox1111 | if I had to pick one or the other, I think 2.3 is much more important then computekit. | 21:14 |
inc0 | mewald: so what I'm trying to do is to restart whole cluster | 21:14 |
inc0 | as it might have run into ugly state since majority of it died | 21:14 |
kfox1111 | I think computekit's cool though, so we shouldn't just forget about it. but for now, I don't think its really worth blocking stuff like helm native values over. | 21:14 |
sbezverk | kfox1111: agree, and we can still work on optimization in parallel.. it gives me a chance to practice in go ;) | 21:14 |
kfox1111 | +1 | 21:15 |
mewald | inc0: I just did "docker kill" then "docker start" on both rabbitmqs on both nodes. That should have restarted it, no? | 21:15 |
inc0 | do docker stop on all | 21:15 |
kolla-slack | <jascott1> @kfox1111 you dont have any faith in trying to reduce the common lib ? | 21:15 |
inc0 | then docker start one at the time | 21:15 |
kolla-slack | <jascott1> or its duplication i mean | 21:16 |
kfox1111 | jascott1_: pretty sure it wont fix the configmap issue. but we could try. | 21:16 |
kolla-slack | <jascott1> ok I will check on the tiller=host thing and then look at cm’s | 21:16 |
mewald | inc0: changes nothing. the one that worked before starts up, the other one dies | 21:17 |
inc0 | config files looks right on all of them? | 21:17 |
kfox1111 | jascott1_: even assuming we can get the size down some, once we add configmaps for all the things as helm charts we may be right back in the same boat. | 21:18 |
mewald | inc0: yeah they are identical except stuff like bind addresses. I even checked erlang cookie in the volume | 21:18 |
jascott1_ | kfox1111 oh yeah good point. hmm | 21:18 |
inc0 | hmm | 21:19 |
kfox1111 | this is inherently a helm max release issue I think. | 21:19 |
inc0 | so in clusterer config you have something called gospel node | 21:19 |
kfox1111 | there's currently some upper bounds in how complicated an application can be for helm. | 21:19 |
inc0 | it's one working or one that was broken? | 21:19 |
mewald | yeah, what exactly is that? | 21:20 |
mewald | wait checking | 21:20 |
kfox1111 | not uncommon. most things run into scaling issues at some point. | 21:20 |
mewald | inc0: on both nodes the gospel is the one working | 21:20 |
inc0 | and on one working? | 21:20 |
*** lrensing has quit IRC | 21:20 | |
mewald | on the one working it is also the one working: on ctrl00 ctrl02 ist gospel, on ctrl02 ctrl02 is gospel | 21:21 |
mewald | inc0: https://gist.github.com/mewald1/0e70039d50d55111284b8da54fb24e05 | 21:22 |
inc0 | damn I ran out of ideas | 21:24 |
inc0 | try on rabbitmq support channel | 21:24 |
inc0 | or SamYaple you know rabbitmq | 21:24 |
-openstackstatus- NOTICE: Restarting Gerrit for our weekly memory leak cleanup. | 21:25 | |
*** Manheim has quit IRC | 21:26 | |
sbezverk | jascott1: I am trying to dump manifest of failing release to see what is in it | 21:26 |
mewald | inc0: I just noticed that the locales seem to be different on the nodes. "date" gives me "Wed Apr …" while the one not working shows "Mi 12. Apr" (german) Do you think that could cause it? | 21:26 |
sbezverk | jascott1: maybe it will help to understand where we can squeeze it a bit ;) | 21:26 |
jascott1_ | sbezverk cool | 21:27 |
inc0 | mewald: entirely possible | 21:27 |
inc0 | clusters like their time synchronized | 21:27 |
*** manheim has joined #openstack-kolla | 21:27 | |
kfox1111 | sbezverk: maybe could get a non failing one and see if you can decode it. | 21:28 |
kfox1111 | its probably base64 encoded, so if you undid it, it would be interesting to see what kind of data's in there. | 21:28 |
sbezverk | kfox1111: right the idea was to collect from 2.2.3 and from 2.3.0 and compare | 21:28 |
kfox1111 | sbezverk: yeah. but having a working one might tell us if chart data deduplication might help too. | 21:28 |
mewald | inc0: was worth a try but it's still in a restart loop | 21:29 |
sbezverk | kfox1111: I cannot get working one with 2.3.0 | 21:29 |
sbezverk | so the only option I see is to get it from 2.2.3 or 2.2.2 | 21:30 |
mewald | also, time was synchronized, only locale was different | 21:30 |
inc0 | crap...full destroy - redeploy of cluster is always an option | 21:30 |
kfox1111 | sbezverk: if we can decode a 2.2.x one, that might be enough to tell us if dedup might help. | 21:30 |
sbezverk | kfox1111: sounds good, I am on it :) | 21:30 |
inc0 | docker rm -f rabbitmq -> kolla-ansible deploy -t rabbitmq | 21:30 |
mewald | puhh, if everythings down after that it's going to be a long night :D | 21:31 |
mewald | shall I risk it? | 21:31 |
*** lucasxu has quit IRC | 21:31 | |
sdake | moment | 21:31 |
kfox1111 | mewald: are you using ceilometer? | 21:32 |
mewald | yep | 21:32 |
kfox1111 | care if it loses data points? | 21:32 |
sdake | mewald please do the following | 21:32 |
*** manheim has quit IRC | 21:32 | |
kfox1111 | some datapoints I mean. | 21:32 |
sdake | make a backup of the existing data | 21:32 |
kfox1111 | (some sites do, some don't.) | 21:32 |
*** krtaylor has quit IRC | 21:32 | |
sdake | tar -czvf docker.tar.gz /var/lib/docker | 21:32 |
sdake | mewald for the most part a tar of /var/lib/docker is reloadable somewhere else | 21:33 |
inc0 | rabbimq volume is enough | 21:33 |
inc0 | but I'd be careful with dumping datafiles of rabbitmq to new cluster | 21:33 |
sdake | just make a backup before doing anything permanently destructive | 21:34 |
sdake | of all 3 disks | 21:34 |
mewald | losing some datapoints in ceilometer won't hurt | 21:34 |
kfox1111 | mewald: then rebuilding rabbit should probably be fine. most of the rest of openstack is stateless rpc. | 21:34 |
mewald | sdake: 3 disks? | 21:35 |
sdake | mewald is your database intact? | 21:35 |
*** manheim has joined #openstack-kolla | 21:35 | |
sdake | mewald i lost power - so was reading scrollback | 21:35 |
sdake | hang tight | 21:35 |
mewald | k | 21:36 |
sdake | mewald did mariadb recover correctly? | 21:36 |
sdake | if the problem is only rabbitmq, stopping and starting rabbitmq should get things going again | 21:37 |
sdake | if the problem is mariadb, zug | 21:37 |
mewald | yes I got that to work after figuring out the trick with "mariadb_recover_inventory_name" | 21:37 |
sdake | rabbitmq = pile of garb | 21:38 |
sdake | wish we had another choice for the messaging component | 21:38 |
sdake | it works until it doesn't | 21:38 |
mewald | yeah so you are suggesting "docker kill rabbitmq && docker rm rabbitmq && docker volume rm rabbitmq" then run "kolla-ansible deploy"? | 21:38 |
sdake | mewald i'm not sure what the correct course of corrective action is | 21:39 |
*** goldyfruit has quit IRC | 21:39 | |
sdake | i can tell you the mariadb volume is only used to store a pid file iirc | 21:39 |
inc0 | mewald: you can keep volume | 21:39 |
sdake | rather the rabbitmq volume | 21:39 |
sdake | and that file is used to synchronize the ha cluster | 21:39 |
sdake | maybe its not a pid file but a sequence file | 21:39 |
sdake | if you restart things, rabbitmq should recreate that file and openstack will reconnect to the services | 21:40 |
sdake | (services being memcached, rabbitmq, mariadb) | 21:40 |
mewald | yeah, my experience taught me to always delete the volumes, too. Had many very strange things happening when I forgot the volume :D | 21:40 |
inc0 | well | 21:40 |
inc0 | destroy volume will be safer | 21:40 |
sdake | mewald which type of filesystem in use for /var/lib/docker? | 21:41 |
inc0 | rabbitmq data isn't that important for openstack | 21:41 |
sdake | inc0 is correct, rabbitmq data is not crucial to operation | 21:41 |
sdake | the only thing that is crucial is the database | 21:41 |
mewald | sdake: ext4 | 21:42 |
sdake | mewald thats good news | 21:42 |
sdake | mewald are you using overlayfs or thin lvm or something else? | 21:42 |
kfox1111 | sdake: rabbit data only matters if you care about ceilometer data. some sites, like mine, do care in some cases. | 21:42 |
sdake | kfox1111 i see - didn't know there was a ceilometer + rabbitmq linkage | 21:42 |
mewald | the ext4 is on an lvm2 lv | 21:42 |
mewald | but nothing thin or so | 21:42 |
sdake | mewald run docker info -> paste | 21:42 |
sdake | this will tell us the storage driver in use | 21:43 |
kfox1111 | sdake: notification data goes to ceilometer. if your org relies on event data being reliably captured, for auditing purposes for example, | 21:44 |
kfox1111 | then losing it can be bad. | 21:44 |
mewald | https://gist.github.com/mewald1/bb04483b496cf3c5c992eba893eb1fa3 | 21:44 |
kfox1111 | thats pretty much the only use case I'm aware of though that does. | 21:45 |
mewald | ctrl02 is the working node, ctrl00 the failing one | 21:45 |
sdake | ubuntu | 21:45 |
sdake | i would stop all rabbitmq instances | 21:45 |
sdake | delete all of their volumes | 21:45 |
sdake | then deploy | 21:45 |
kfox1111 | sbezverk: want me to wait on the recheck, or just call it goo? | 21:45 |
sdake | but lets wait for inc0 to weigh in | 21:45 |
kfox1111 | good? | 21:45 |
openstackgerrit | Merged openstack/kolla-kubernetes master: Making resolv.conf to be more flexible https://review.openstack.org/456191 | 21:46 |
sbezverk | jascott1: I see manifest of release in clear text, but the error is referencing to configmap, any suggestions how to access this configmap? | 21:47 |
sbezverk | kfox1111: re-check which one? | 21:47 |
kfox1111 | sbezverk: the rabbit config one. | 21:47 |
sbezverk | kfox1111: looks good to me.. the failure is not related to rabbit for sure.. | 21:47 |
kfox1111 | k. I'll just wf it then. | 21:48 |
kfox1111 | thanks. | 21:48 |
kolla-slack | <jascott1> @sbezverk you are talking about the encoded format of Release->Data ? | 21:49 |
kfox1111 | jascott1_: yeah. | 21:50 |
kfox1111 | looks like its gzipped. | 21:50 |
kfox1111 | ungzipping it shows its a binary file, | 21:50 |
kolla-slack | <jascott1> oh duh thats right | 21:50 |
kfox1111 | with big chunks of json. | 21:50 |
kfox1111 | well, base64 encoded, gzipped binary but mostly text. | 21:50 |
kfox1111 | weird. | 21:51 |
jascott1_ | someone in helm was trying to figure out format as well | 21:51 |
sbezverk | kfox1111: I do not see gzip, I see clear text manifest | 21:51 |
*** pcaruana has quit IRC | 21:51 | |
jascott1_ | the data tho | 21:51 |
kfox1111 | so, the readily recognizable parts are requirements.yaml, | 21:51 |
kfox1111 | and values.yaml. | 21:51 |
sbezverk | it does not look complete | 21:51 |
mewald | sdake: I am going to do it now | 21:51 |
kfox1111 | sbezverk: odd. the one I'm seeing is gziped... its an older one though. | 21:52 |
openstackgerrit | Merged openstack/kolla-kubernetes master: Move rabbitmq config to kolla-k8s https://review.openstack.org/450581 | 21:52 |
kfox1111 | let me look at another one... | 21:53 |
inc0 | I don't know much about clustering in rabbitmq | 21:53 |
mewald | sdake inc0: it worked | 21:53 |
inc0 | redeploy worked? | 21:53 |
mewald | yup | 21:53 |
inc0 | all good then> | 21:53 |
sdake | nice mewald | 21:54 |
mewald | I deleted all: container, image, volume from all nodes | 21:54 |
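
For reference, the reset sequence that worked here (remove the rabbitmq container and volume on every controller, then redeploy only rabbitmq) can be written down as an ordered plan. This is a minimal sketch: `rabbitmq_reset_plan` is a hypothetical helper, reaching the nodes over `ssh` is an assumption, and it only builds the command strings without running anything.

```python
from typing import List


def rabbitmq_reset_plan(hosts: List[str]) -> List[str]:
    """Return the shell commands, in order, for a full RabbitMQ reset."""
    plan = []
    for host in hosts:
        # Remove the container and its named volume on every controller.
        plan.append(f"ssh {host} docker rm -f rabbitmq")
        plan.append(f"ssh {host} docker volume rm rabbitmq")
    # Redeploy only the rabbitmq role once all nodes are clean.
    plan.append("kolla-ansible deploy -t rabbitmq")
    return plan
```

Printing the plan for review before running it by hand keeps a destructive step like this deliberate, and the queue contents are lost either way, as discussed above for ceilometer.
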
*** rwallner has quit IRC | 21:54 | |
sdake | mewald i'd highly recommend setting up a cron job and tarballing /var/lib/docker | 21:54 |
sdake | mewald on all of your control nodes | 21:54 |
sdake | or using backup software | 21:54 |
sdake | raid is not a replacement for backups | 21:54 |
sdake | high availability is not a replacement for backups | 21:54 |
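
The tarball backup suggested above (`tar -czvf docker.tar.gz /var/lib/docker`, run from cron) could also be done with a small script that timestamps each archive. This is a sketch only: `backup_dir` and the `prefix` naming are hypothetical, and archiving the files of running services may capture an inconsistent state, so a database dump such as the hourly mysqldump mentioned in this log is still worth keeping alongside it.

```python
import os
import tarfile
import time


def backup_dir(source: str, dest_dir: str, prefix: str = "docker") -> str:
    """Write a gzip-compressed tarball of source into dest_dir; return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    os.makedirs(dest_dir, exist_ok=True)
    archive = os.path.join(dest_dir, f"{prefix}-{stamp}.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the source directory name, not the full path.
        tar.add(source, arcname=os.path.basename(source.rstrip("/")))
    return archive
```

A cron entry would then call something like `backup_dir("/var/lib/docker", "/backup")`, where `/backup` is a placeholder that should live on a different disk or node.
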
kfox1111 | hmm... with helm v2.3, its a gzip. not sure about v2. | 21:54 |
mewald | sdake: yes, ur right. What I am currently doing is mysqldumping mariadb to a different node every hour | 21:55 |
sdake | mewald nice - that works too | 21:55 |
sdake | as long as you have the database you should be gtg | 21:55 |
kfox1111 | sbezverk: jascott1_: ok, looking at another chart, I'm seeing all the templates in it too. | 21:55 |
mewald | yeah, openstack is quite easy to recover in that respect | 21:55 |
kfox1111 | so, yeah, if deduplication works, it may fix the issue. | 21:55 |
jascott1_ | technosophos said its base64enc(gzip(protobuf)) | 21:55 |
mewald | but this stuff that I just had with rabbitmq makes me hate all this crap :D | 21:55 |
kfox1111 | jascott1_: yeah. that sounds about right, looking at the output. | 21:56 |
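
Given the `base64enc(gzip(protobuf))` layout described above, the outer two layers of a release payload can be peeled off with the standard library. This is a hedged sketch: `decode_release`, `encode_release` and `count_occurrences` are hypothetical helpers, and the protobuf inside is left as raw bytes because parsing it would need helm's message definitions.

```python
import base64
import gzip


def decode_release(encoded: str) -> bytes:
    """Undo base64enc(gzip(...)); returns the raw protobuf bytes."""
    return gzip.decompress(base64.b64decode(encoded))


def encode_release(raw: bytes) -> str:
    """Inverse of decode_release, handy for round-trip checks."""
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def count_occurrences(raw: bytes, marker: bytes) -> int:
    """Rough duplication indicator: how often a marker appears in the payload."""
    return raw.count(marker)
```

Counting a marker such as `b"kolla-common"` in the decoded bytes gives a quick, approximate sense of how many copies of the shared chart a release carries.
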
sdake | mewald i think the only real solution to that problem is to not run out of /var/lib/docker disk space | 21:56 |
inc0 | kfox1111: I think heat and ironic is all that's left | 21:56 |
kfox1111 | but its the full templates that are the sizable thing. | 21:56 |
sdake | docker behaves poorly when it runs out of disk space | 21:56 |
sdake | applications do as well | 21:56 |
kfox1111 | inc0: sweet. almost there. :) | 21:56 |
kfox1111 | inc0: mariadb too? | 21:56 |
sdake | mewald tbh you're lucky your db wasn't destroyed | 21:56 |
sbezverk | jascott1_: right looked at the wrong place , sorry | 21:56 |
kfox1111 | sbezverk: I think the next step next then is this: | 21:56 |
inc0 | hmm...mariadb is still not there? let me check | 21:56 |
kfox1111 | sbezverk: add a step to compute_kit building that: | 21:57 |
kfox1111 | extracts the built compute kit to a temp dir, | 21:57 |
kfox1111 | rm -rf on all dirs named kolla-common | 21:57 |
kfox1111 | cp -a helm/kolla-common to the computekit/charts/ | 21:57 |
kfox1111 | and tar it back up, | 21:57 |
mewald | inc0: everything is good with my mariadb :) | 21:57 |
kfox1111 | and let the gate continue on as normal. | 21:58 |
kfox1111 | that may fix the issue. | 21:58 |
inc0 | yeah you're right mariadb still missing, on it | 21:58 |
*** esharao has quit IRC | 21:58 | |
inc0 | mewald: mariadb remarks were for kfox1111 :P | 21:58 |
inc0 | brb rebooting | 21:58 |
mewald | ah, getting confused :D someone should add threads to IRC xD | 21:59 |
inc0 | +1 to that | 21:59 |
*** haplo37 has quit IRC | 21:59 | |
*** g3ek has quit IRC | 21:59 | |
*** lamt has quit IRC | 21:59 | |
sdake | mewald they did - its called slack ;-) | 21:59 |
mewald | sdake: true xD Is there a kolla team on slack? | 21:59 |
sdake | mewald its mirrored to irc | 21:59 |
sdake | and we use irc for main communication | 22:00 |
sdake | its mostly to help kubernetes community communicate in one forum with openstack community | 22:00 |
mewald | yeah makes sense | 22:00 |
*** rwallner has joined #openstack-kolla | 22:01 | |
*** rhallisey has quit IRC | 22:01 | |
*** tonanhngo has quit IRC | 22:03 | |
*** lamt has joined #openstack-kolla | 22:04 | |
*** tonanhngo has joined #openstack-kolla | 22:05 | |
sbezverk | kfox1111: what are we trying to get by doing this? just a single copy of kolla-common? | 22:05 |
*** rwallner has quit IRC | 22:05 | |
kfox1111 | sbezverk: yeah. talking it over with technosophos right now. | 22:06 |
*** MasterOfBugs has quit IRC | 22:06 | |
*** pramodrj07 has quit IRC | 22:06 | |
inc0 | back | 22:06 |
*** tonanhngo_ has joined #openstack-kolla | 22:06 | |
*** pramodrj07 has joined #openstack-kolla | 22:06 | |
*** MasterOfBugs has joined #openstack-kolla | 22:06 | |
kfox1111 | sbezverk: actually I think the change needs to be slightly different. | 22:07 |
kfox1111 | sbezverk: we make compute kit depend on kolla-common too. | 22:07 |
kfox1111 | and after built, rm -f */kolla-compute/templates/* out of the tar except the root level kolla-compute. | 22:08 |
*** haplo37 has joined #openstack-kolla | 22:08 | |
*** g3ek has joined #openstack-kolla | 22:08 | |
*** tonanhngo_ has quit IRC | 22:08 | |
*** tonanhngo_ has joined #openstack-kolla | 22:09 | |
*** tonanhngo_ has quit IRC | 22:09 | |
*** tonanhngo has quit IRC | 22:09 | |
sbezverk | kfox1111: can you delete files right in the tar file? never seen that before.. | 22:10 |
kfox1111 | I don't think so. | 22:10 |
sdake | you can do that via a pipe | 22:10 |
sdake | but not directly as you would like | 22:11 |
kfox1111 | the change I was trying to say though is that we can't delete the child libs entirely, just their template dir content. | 22:11 |
kfox1111 | otherwise, when we go to parent var includes, things would break. | 22:11 |
kfox1111 | the child values still need to be in there. | 22:12 |
inc0 | brb | 22:12 |
*** inc0 has quit IRC | 22:12 | |
*** inc0 has joined #openstack-kolla | 22:12 | |
inc0 | at some point you need to restart weechat... | 22:13 |
*** fooliouno has quit IRC | 22:13 | |
*** manheim has quit IRC | 22:13 | |
sbezverk | kfox1111: I think the issue is not with templates but with duplication in the rendered values.yaml. if you add --debug to compute kit, you will see huge duplication of variables over and over | 22:14 |
sbezverk | I am not 100% sure if it is the same case in configmap, but could easily be the case | 22:15 |
sdake | gotta jet for dinner - bbl | 22:15 |
kfox1111 | sbezverk: yeah. I'm sure thats part of it. | 22:15 |
*** tonanhngo has joined #openstack-kolla | 22:15 | |
kfox1111 | but it should include a ton of copies of the templates too. | 22:15 |
kfox1111 | just cutting it down by half would probably do the trick. | 22:15 |
jascott1_ | sbezverk can you pastebin that output or something? | 22:16 |
kfox1111 | I think just doing the tar data dedup would be a good exercise. shouldn't take too long to try. | 22:17 |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move heat config to kolla-k8s https://review.openstack.org/450589 | 22:17 |
kfox1111 | the configmap data implies it will help. | 22:17 |
sbezverk | kfox1111: I cannot say I completely understand steps you are proposing, but if you put together PS I can quickly test in my local test bed as I can easily reproduce the issue | 22:17 |
kfox1111 | sbezverk: k. I'll take a quick stab at it. sec... | 22:17 |
*** Pavo has joined #openstack-kolla | 22:19 | |
kfox1111 | sbezverk: whats the full name of the tarball? | 22:19 |
kfox1111 | I don't have it handy. | 22:19 |
inc0 | kfox1111: btw..when are we planning to tackle clustered mariadb? | 22:20 |
mewald | sdake inc0: just deployed the third controller and again rabbit has those issues :D It's madness | 22:20 |
kfox1111 | inc0: I'm kind of waiting to see what k8s comes up with for that. | 22:20 |
inc0 | kfox1111: +1 to that...just might take some time | 22:20 |
kfox1111 | I think it kind of should be mariadb-operator. | 22:20 |
inc0 | especially that surprisingly google sells service like that | 22:20 |
kfox1111 | but if it doesn't show soon, doing a daemonset with host passthrough might be a good intermediary. | 22:21 |
sbezverk | kfox1111: I do not see compute kit tar file, only individual services | 22:22 |
sbezverk | tar files | 22:22 |
kfox1111 | I found it I think. | 22:23 |
kfox1111 | compute-kit-0.6.0-1.tgz | 22:23 |
sbezverk | kfox1111: :) where did you find it? in helm repo? | 22:23 |
kfox1111 | had one laying around. | 22:23 |
sbezverk | kfox1111: maybe because I did not build compute-kit but was deploying from helm/compute-kits/compute-kit | 22:25 |
kfox1111 | ah. yeah. | 22:26 |
*** Pavo has quit IRC | 22:33 | |
openstackgerrit | Kevin Fox proposed openstack/kolla-kubernetes master: WIP: Test compress compute kit https://review.openstack.org/456406 | 22:33 |
kfox1111 | sbezverk: ----^ | 22:33 |
kfox1111 | gotta build all and use the generated package. | 22:35 |
sbezverk | kfox1111: I see, finishing cleaning up my test bed from previous run | 22:35 |
sbezverk | and will try it | 22:36 |
*** mbruzek has quit IRC | 22:42 | |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move mariadb configs to k8s https://review.openstack.org/456407 | 22:43 |
sbezverk | kfox1111: started, waiting :) | 22:43 |
*** srwilkers has quit IRC | 22:46 | |
swinn | if I’m not able to access an external public network, do I just need to customize the ml2_conf.ini and enter a flat to physical mapping then reconfigure? | 22:47 |
*** lamt has quit IRC | 22:47 | |
swinn | or at least, good chance of that being the case? | 22:47 |
sbezverk | kfox1111: yeah!!! it worked | 22:47 |
kfox1111 | sbezverk: interesting. how big is the configmap now? | 22:48 |
kfox1111 | sbezverk: we may want to migrate that logic into both the build_compute_kit script ,and the build_service script. | 22:48 |
kfox1111 | and or ask helm to add it to helm. :) | 22:48 |
kfox1111 | which they do sound amenable to. | 22:49 |
sbezverk | kfox1111: it should be easier for them than to fight with k8s to increase limit | 22:51 |
*** claudiub has quit IRC | 22:58 | |
*** lamt has joined #openstack-kolla | 22:59 | |
inc0 | kfox1111 sbezverk sdake https://review.openstack.org/#/c/450589/ | 23:00 |
inc0 | plz | 23:00 |
kfox1111 | looks good to me. | 23:01 |
sbezverk | kfox1111: do we test heat in any gate jobs? | 23:01 |
kfox1111 | sbezverk: not yet. :/ | 23:01 |
inc0 | mariadb is waiting for gates | 23:01 |
kfox1111 | we really need a test for that. | 23:01 |
kfox1111 | even something really simple, like making a heat template that creates a server group, then heat stack-create -f foo.yaml foo would do the trick. | 23:02 |
sbezverk | kfox1111: do we want this check before merging this change? I mean it will not break anything, but there will always be doubt.. | 23:03 |
kfox1111 | sbezverk: its not tested, so its already broken. ;) | 23:03 |
kfox1111 | I'd rather get out of kolla-ansible completely for the next release thats coming up really really soon. | 23:04 |
kfox1111 | having a release thats sort of both is really ugly. | 23:04 |
sbezverk | kfox1111: then inc0 needs to promise, if we find anything config related for heat, he would need to fix it ;) | 23:04 |
kfox1111 | he's already promised its a straight up copy from kolla-ansible. | 23:04 |
inc0 | well, cut down copy | 23:05 |
kfox1111 | so not really any more broken than whats in kolla-ansible now. | 23:05 |
inc0 | but yeah | 23:05 |
kfox1111 | sdake: think we have a different solution to the computekit thing without backing out of 2.3. | 23:06 |
kfox1111 | sdake: https://review.openstack.org/#/c/456406/ | 23:06 |
sbezverk | I am off, had a very early start today, have a good one folks | 23:06 |
kfox1111 | sbezverk: have a good one. :) | 23:07 |
*** rwallner has joined #openstack-kolla | 23:07 | |
*** vhosakot has joined #openstack-kolla | 23:07 | |
*** lamt has quit IRC | 23:08 | |
kfox1111 | inc0: looks like comparing helm/service to ansible/roles, we still need ironic, mariadb and openvswitch configs. | 23:08 |
inc0 | ironic needs rebase, mariadb is up for review | 23:08 |
inc0 | ovs - I'm on it | 23:08 |
openstackgerrit | Merged openstack/kolla-kubernetes master: Move heat config to kolla-k8s https://review.openstack.org/450589 | 23:10 |
*** rwallner has quit IRC | 23:11 | |
*** lamt has joined #openstack-kolla | 23:12 | |
kfox1111 | mariadb failed hard. | 23:12 |
kfox1111 | missing file. | 23:12 |
kfox1111 | wsrep-notify.sh.j2 | 23:12 |
*** lamt has quit IRC | 23:12 | |
*** lamt has joined #openstack-kolla | 23:13 | |
*** lamt has quit IRC | 23:13 | |
inc0 | yeah I thought I can remove it | 23:13 |
swinn | I feel like I’m at the final stages of having a working cluster. Last step is I can’t figure out how to map a subnet for floating ips to a physical flat network. Any guides on this out there? | 23:13 |
*** lamt has joined #openstack-kolla | 23:16 | |
*** lamt has quit IRC | 23:17 | |
*** srwilkers has joined #openstack-kolla | 23:19 | |
kfox1111 | swinn: not sure I understand the question. | 23:21 |
kfox1111 | are you using tenant networking? | 23:21 |
*** sayanta__ has quit IRC | 23:21 | |
swinn | I’m using the default networking with ovs but need to map a subnet for floating IPs to a real physical network | 23:22 |
swinn | when I create the public network, the provider is vxlan | 23:22 |
swinn | but in this case it should use a flat provider to connect to a physical network | 23:22 |
kfox1111 | haven't tried it with kolla but you create the public network telling it to use a provider network. | 23:22 |
kfox1111 | swinn: something like: https://github.com/openstack/kolla-kubernetes/blob/master/tests/bin/basic_tests.sh#L92 | 23:23 |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move mariadb configs to k8s https://review.openstack.org/456407 | 23:25 |
kfox1111 | inc0: so, after the last ps merges, do we just tweak the pathfinder.py to point to the local ansible dir and delete kolla-ansible from the gate? | 23:26 |
inc0 | yup | 23:26 |
kfox1111 | nice. :) | 23:26 |
swinn | kfox1111: thanks for the link, the syntax changed a bit but it helped | 23:27 |
kfox1111 | cool. | 23:28 |
inc0 | kfox1111: flat external + vxlans wont work? | 23:28 |
kfox1111 | inc0: flat external + tenant vxlans works. | 23:29 |
inc0 | yeah that's what I thought | 23:29 |
kfox1111 | inc0: he was just doing a net-create though without the type, so was allocating vxlans instead of flat. | 23:29 |
kfox1111 | just had to use the right flags. | 23:29 |
inc0 | then you create allocation pool on external flat and routers | 23:29 |
kfox1111 | right. | 23:29 |
swinn | it was the physnet1 mapping that got me, it’s a named interface | 23:29 |
kfox1111 | ah. yeah. | 23:30 |
kfox1111 | "your network naming may vary." :) | 23:30 |
inc0 | kfox1111: so ovs confs are in neutron role | 23:31 |
inc0 | which means we should have those | 23:31 |
kfox1111 | oh. ok. | 23:31 |
kfox1111 | is it in the rm -rf? | 23:32 |
kfox1111 | maybe we just need to test that. | 23:32 |
kfox1111 | yeah. its not in the rm -rf list. so we probably should just test that. | 23:35 |
*** erlon has quit IRC | 23:35 | |
openstackgerrit | Michal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Finalize move of configs to kolla-k8s https://review.openstack.org/456412 | 23:35 |
inc0 | kfox1111: or remove all rm -rfs ^ ;) | 23:35 |
kfox1111 | that works too. :) | 23:35 |
inc0 | that will show us if we missed anything too | 23:36 |
inc0 | well gotta rebase ironic | 23:36 |
inc0 | but let's wait till mariadb merges | 23:36 |
kfox1111 | inc0: maybe the change note should be something like "remove kolla-ansible as a dependency" ? | 23:36 |
swinn | kfox1111 and inc0: thanks so much for the help past few days, I can reach floating ips and all services are up | 23:37 |
kfox1111 | swinn: np. glad you got it working. :) | 23:37 |
inc0 | swinn: my pleasure | 23:38 |
masber | morning all, would you recommend running ceph and nova on the same node or separate in terms of performance? | 23:38 |
kfox1111 | masber: we ran our site hyperconverged. worked well until recovery needs to happen. | 23:38 |
inc0 | masber: ceph and nova as in compute and ceph-osd? | 23:38 |
kfox1111 | then ceph gets really resource hungry. | 23:38 |
masber | I see | 23:39 |
kfox1111 | strangles out compute. | 23:39 |
*** Pavo has joined #openstack-kolla | 23:39 | |
masber | what about latency? | 23:39 |
kfox1111 | generally more predictable if separate. | 23:39 |
kfox1111 | no noisy vm's in the way. | 23:39 |
masber | I am not sure about hyperconverged, I am thinking to keep them separate, in that case I can run ephemeral storage for high performance | 23:40 |
inc0 | masber: hyperconverged won't make it closer to storage | 23:40 |
kfox1111 | yeah. if you have the resource, I'd recommend keeping them separate. | 23:40 |
inc0 | as in you never know if your ephemeral will land on osd with compute | 23:40 |
kfox1111 | inc0: right. | 23:40 |
inc0 | reason you might want to do it is to not waste disk slots in compute nodes;) | 23:41 |
masber | ok I understand | 23:42 |
inc0 | and have all cloud running on single spec of servers | 23:42 |
kfox1111 | my advice, avoid it if you can. its generally ok if you are small enough you can't. | 23:43 |
masber | yes, on the other hand you can always setup your vms to have "local" storage as ephemeral drives as scratch volume for high performance and then another drive to ceph for archiving | 23:43 |
*** zhubingbing_ has joined #openstack-kolla | 23:43 | |
masber | I am trying to migrate from HPC to private cloud, if that makes sense | 23:43 |
kfox1111 | yeah. thats what I do. | 23:43 |
kfox1111 | so if you want a pet that can float, you do a cinder backed volume. | 23:43 |
kfox1111 | if you want faster ephemeral, you don't check the box. :) | 23:44 |
kfox1111 | best of both worlds. | 23:44 |
inc0 | yeah, ephemerals are in general bad idea if you want to have live migration | 23:44 |
kfox1111 | if you back nova with ceph, you remove the choice. | 23:44 |
masber | that is a good point, ephemeral disks won't move on live migration, is that right? | 23:45 |
inc0 | with block migration they will, but that makes it much harder | 23:45 |
inc0 | and volatile | 23:45 |
kfox1111 | masber: my site does hpc and cloud. are you trying to migrate hpc workload to a private cloud, or just coming from mindset? | 23:45 |
kfox1111 | yeah. and block migration has never worked for me live. | 23:45 |
masber | kfox1111, we currently have a small HPC cluster and we would like to move to private cloud due to the flexibility it provides | 23:46 |
kfox1111 | trying to migrate mpi jobs? | 23:47 |
kfox1111 | and whats your definition of small? :) | 23:47 |
masber | no MPI thanks good | 23:47 |
kfox1111 | everyone's definition of size differs. | 23:47 |
kfox1111 | ok. so then its not too bad to migrate then. :) | 23:47 |
kfox1111 | we do a lot of mpi jobs. | 23:47 |
masber | yes sorry, small means 26 nodes running centos 6 and using rocks cluster for provisioning | 23:48 |
kfox1111 | ah. rocks. :) | 23:48 |
kfox1111 | we've got quite a few of those clusters around. | 23:48 |
masber | and the fact that we don't use MPI makes it less complicated for virtualization I guess | 23:48 |
kfox1111 | pretty reliable really. | 23:48 |
kfox1111 | much much less. :) | 23:48 |
kfox1111 | so.... not to discourage you too much... but | 23:49 |
kfox1111 | we're taking one of our HTC computing clouds and tearing it down. | 23:49 |
*** Pavo has quit IRC | 23:50 | |
kfox1111 | getting about a 20% performance bump moving the system from openstack to raw kubernetes. | 23:50 |
masber | I see | 23:50 |
kfox1111 | you can tune openstack vm's to get most of the performance difference back, but its a fair chunk of work. | 23:50 |
inc0 | well...that kinda corresponds with virtualization overhead | 23:50 |
masber | you can use magnum to deploy kubernetes? | 23:50 |
kfox1111 | but with k8s, it passes the raw cpu through, so doesn't need tuning. | 23:51 |
inc0 | masber: then you're losing both:) it's about removing virt layer altogether | 23:51 |
inc0 | kfox1111: I wonder if slurm could use k8s | 23:51 |
masber | magnum can deploy kubernetes on bare-metal | 23:51 |
*** mewald has quit IRC | 23:51 | |
masber | no need to run on vms I think | 23:51 |
kfox1111 | inc0: I'm using condor running in k8s managed containers on bare metal. | 23:51 |
kfox1111 | slurm would work just as well. | 23:51 |
masber | yes, my idea of openstack is to use it as a unified environment for a cluster as it can do lots of things | 23:53 |
masber | and gives the opportunity to the end user to setup their own environments | 23:53 |
masber | kfox1111, what about mesos? have you looked at that option? | 23:54 |
kfox1111 | yeah. openstacks good for that. | 23:54 |
*** Pavo has joined #openstack-kolla | 23:54 | |
kfox1111 | masber: let me show you a little graph... :) | 23:54 |
kfox1111 | masber: https://trends.google.com/trends/explore?q=kubernetes,docker%20swarm,mesos | 23:55 |
kfox1111 | I picked k8s. :) | 23:55 |
masber | nice graph | 23:56 |
masber | I see | 23:56 |
kfox1111 | all the major players are lining up behind k8s. redhat, ubuntu, google, coreos, etc. | 23:57 |
kfox1111 | so, thats where I'm placing my bet. | 23:57 |
*** Pavo has quit IRC | 23:57 | |
masber | kfox1111, do you use NUMA architecture? | 23:58 |
kfox1111 | hard to avoid these days. :) | 23:58 |
kfox1111 | but, yeah. | 23:58 |
kfox1111 | we had one of the early SGI Altix's. 512 cores in one numa machine spread over I think it was something like 3 racks. | 23:59 |
masber | ok, so cpu pinning didn't help much | 23:59 |
*** ssurana has joined #openstack-kolla | 23:59 | |
masber | in terms of performance | 23:59 |
kfox1111 | it did a lot in that environment. :) | 23:59 |
kfox1111 | rack to rack memory access was really expensive. | 23:59 |
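
The chart dedup step kfox1111 described around 22:07-22:12 (drop the template files from nested copies of the shared library chart, keep the root-level copy and every values file) can be sketched roughly as below. This is only an illustration and not the script from review 456406: the function names, the `kolla-common` default and the assumed `<root>/charts/...` layout are assumptions. It also follows sdake's remark that you cannot delete from a tar in place, so it streams the members into a new archive.

```python
import tarfile


def is_nested_lib_template(path, lib_name="kolla-common"):
    """True for files under a NESTED <lib_name>/templates/ directory.

    Assumed layout (hypothetical): <root>/charts/<lib_name>/templates/... is
    the root-level copy and is kept; any deeper copy counts as a duplicate.
    """
    parts = path.split("/")
    for i in range(len(parts) - 3):
        if (parts[i] == "charts" and parts[i + 1] == lib_name
                and parts[i + 2] == "templates"):
            # index 1 means directly under the top-level chart
            return i > 1
    return False


def strip_nested_templates(src, dst, lib_name="kolla-common"):
    """Copy the src .tgz to dst, skipping duplicated library templates.

    Values files and directory entries are always copied, since the child
    values still need to be present. Returns the list of dropped names.
    """
    dropped = []
    with tarfile.open(src, "r:gz") as tin, tarfile.open(dst, "w:gz") as tout:
        for member in tin:
            if member.isfile() and is_nested_lib_template(member.name, lib_name):
                dropped.append(member.name)
                continue
            fileobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fileobj)
    return dropped
```

Something like `strip_nested_templates("compute-kit-0.6.0-1.tgz", "compute-kit-dedup.tgz")` would then produce the slimmed package; whether that is enough to get the release configmap under the 1mb limit would still have to be checked in the gate.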