Wednesday, 2017-04-12

*** daidv has quit IRC00:01
kolla-slack<jascott1> looks like helm uses k8s validation, and for configmaps it's 1mb00:02
japestinhosdake I'm able to deploy kolla-k8s aio for testing inside an openstack env. after running init-runonce I saw the demo1 VM, demo networks and router created. but i can't ping the VM from qdhcp or ping its floating ip00:02
japestinhosdake which network type should I use? flat or anything else?00:03
sdakejapestinho so you're on an openstack deployment?00:03
sdakejapestinho or you're on bare metal?00:03
kfox1111did anything recently change in canal?00:03
japestinhosdake I'm using openstack deployment, I deploy on centos 7 instance00:03
sbezverkjascott1: but we do not have that configmap00:03
sdakejascott1 k8s has a limit of 1mb, or helm does?00:03
kolla-slack<jascott1> helm uses k8s validation so k8s00:04
sbezverkjascott1: it seems tiller autogenerates it00:04
kfox1111ah... so tiller may have added a bit more state tracking or something, and our amount of stuff in compute kit pushed us over the limit?00:04
sbezverkjascott1: can you add debugging around why this config map gets generated and what triggers it00:05
kfox1111sbezverk: it's the release state tracking stuff.00:05
kolla-slack<jascott1> i just raised the limit and pushed it00:05
kfox1111jascott1: cool.00:05
sbezverkjascott1: in helm or in tiller?00:05
sbezverkkfox1111: I have a job with jascott1 debug helm00:06
*** lamt has quit IRC00:06
kfox1111sbezverk: cool.00:07
sbezverkkfox1111: but if tiller image was changed then it will not get new one..00:08
kfox1111sbezverk: yeah. it may need to be a tiller update.00:08
*** lamt has joined #openstack-kolla00:08
sbezverkkfox1111: Yeah I think so too00:08
kfox1111so, there seems to be a pattern in the multinode failures:00:09
kfox1111http://logs.openstack.org/41/454841/8/check/gate-kolla-kubernetes-deploy-centos-binary-3-ceph-multi-nv/0fa9c5c/logs/pods/default-test-dns-centos-7-2-node-rax-ord-8375918-522923-8mc8w.txt00:09
*** eanylin has joined #openstack-kolla00:09
kfox1111canal has not been updated for 19 days... so not that...00:09
*** lamt has quit IRC00:10
sdakethis has been happening since kubernetes 1.6 kfox111100:10
sbezverkkfox1111: yep I saw these before, but they were not very frequent00:10
*** wagiel has joined #openstack-kolla00:20
kolla-slack<jascott1> @sbezverk I took out the panic and raised the limit to 2mb.00:20
*** sayantan_ has quit IRC00:24
*** wagiel has quit IRC00:25
*** srwilkers has quit IRC00:26
*** Pavo has joined #openstack-kolla00:29
openstackgerritMarcus Williams proposed openstack/kolla-ansible master: Add OpenDaylight role  https://review.openstack.org/41636700:30
*** Pavo has quit IRC00:30
*** Pavo has joined #openstack-kolla00:31
kfox1111ok, saw a multinode succeed. was not rax.00:39
kfox1111and another failure. rax.00:39
*** duonghq has joined #openstack-kolla00:39
kfox1111so something network related on that cloud and our config is fighting.00:39
*** zhurong has joined #openstack-kolla00:40
kfox1111seems like traffic isn't making it from the slave to the master node related to k8s apiserver.00:40
*** guoguo has joined #openstack-kolla00:42
*** Pavo has quit IRC00:48
*** Pavo has joined #openstack-kolla00:49
kolla-slack<jascott1> @sbezverk i raised it in the k8s validation library, k8s will eventually complain but should move past the error00:50
kolla-slack<jascott1> to be clear, the change was in the vendored k8s lib for helm, so helm's limit has been increased but k8s's has not00:50
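A minimal sketch of the kind of size check discussed above, assuming the limit is 1 MiB counted over the UTF-8 bytes of all ConfigMap keys and values. The function names and the exact accounting are illustrative assumptions, not the actual Helm or Kubernetes validation code.

```python
# Assumed limit, based on the "1mb" figure mentioned in the channel.
ONE_MIB = 1024 * 1024


def configmap_data_size(data):
    """Return the total size in bytes of all keys and values (UTF-8)."""
    return sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in data.items()
    )


def exceeds_limit(data, limit=ONE_MIB):
    """True when the ConfigMap data is larger than the assumed limit."""
    return configmap_data_size(data) > limit
```

A check like this could be run against a rendered release record before it is stored, to see how close a large chart is to the limit.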
*** lrensing has joined #openstack-kolla00:53
*** tovin07_ has joined #openstack-kolla00:56
*** lrensing has quit IRC01:03
*** lrensing has joined #openstack-kolla01:05
*** cuongnv has joined #openstack-kolla01:05
*** daidv has joined #openstack-kolla01:10
*** wagiel has joined #openstack-kolla01:15
openstackgerritMerged openstack/kolla-ansible stable/ocata: Fix Telegraf retention policy not found  https://review.openstack.org/45363101:16
masberhi, has anyone used kolla with nic bonding? does it work?01:16
masbersorry I am talking about kolla-ansible 4.001:16
*** lrensing has quit IRC01:17
*** yangyapeng has joined #openstack-kolla01:17
*** wagiel has quit IRC01:19
*** lrensing has joined #openstack-kolla01:20
*** sayantan_ has joined #openstack-kolla01:23
masberI specified tunnel_interface to be different from network_interface and I think neutron_agent on my compute host can't find the physical_network01:23
*** lucasxu has joined #openstack-kolla01:24
*** MasterOfBugs has quit IRC01:25
*** pramodrj07 has quit IRC01:25
openstackgerritshaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to glance  https://review.openstack.org/45217601:27
openstackgerritMerged openstack/kolla master: Fix name of the mistral-dashboard horizon plugin  https://review.openstack.org/44913601:28
*** hrw has quit IRC01:29
*** hrw has joined #openstack-kolla01:31
sbezverksdake: it looks like compute kit is still failing01:33
*** dixiaoli has joined #openstack-kolla01:33
*** lucasxu has quit IRC01:38
*** lucasxu has joined #openstack-kolla01:39
*** Pavo has quit IRC01:46
*** lucasxu has quit IRC01:46
*** lucasxu has joined #openstack-kolla01:46
openstackgerritshaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to cinder  https://review.openstack.org/45213101:49
openstackgerritSurya Prakash (spsurya) proposed openstack/kolla-ansible master: Development Environment With Vagrant link not working  https://review.openstack.org/45586601:52
*** duritong_ has joined #openstack-kolla01:58
*** duritong has quit IRC02:00
daidvMorning.02:07
daidvJeffrey4l, I have cherry-picked my commit to the Ocata release, following the launchpad bug. Can you review it?02:08
daidvhttps://review.openstack.org/#/c/455747/02:08
*** wagiel has joined #openstack-kolla02:09
*** ssurana has quit IRC02:12
*** wagiel has quit IRC02:14
openstackgerritshaofeng cheng proposed openstack/kolla-ansible master: Add VMware DataStore support to glance  https://review.openstack.org/45217602:17
*** lucasxu has quit IRC02:22
*** caowei has joined #openstack-kolla02:34
*** unicell has quit IRC02:35
openstackgerritMerged openstack/kolla-ansible stable/ocata: Use utf8_general_ci collation as a default collation  https://review.openstack.org/45574702:38
daidvJeffrey4l, duonghq : Thanks! :)02:39
*** shashank_t_ has quit IRC02:40
*** shashank_t_ has joined #openstack-kolla02:41
*** lrensing has quit IRC02:44
*** shashank_t_ has quit IRC02:45
Jeffrey4lnp03:01
Jeffrey4lduonghq, could u review https://review.openstack.org/45550403:02
*** wagiel has joined #openstack-kolla03:03
openstackgerritChen proposed openstack/kolla-ansible master: fix typos on quickstart page  https://review.openstack.org/45589203:05
*** wagiel has quit IRC03:08
*** shashank_t_ has joined #openstack-kolla03:12
*** lamt has joined #openstack-kolla03:14
duonghqJeffrey4l, backport is ok, but how is the database url related to the tooz config?03:19
*** eaguilar has quit IRC03:19
inc0good evening03:29
duonghqevening inc003:31
*** sayantan_ has quit IRC03:33
spsuryagood evening inc003:33
*** gkadam has joined #openstack-kolla03:34
*** sayantan_ has joined #openstack-kolla03:34
*** lamt has quit IRC03:43
*** krtaylor has quit IRC03:45
*** lamt has joined #openstack-kolla03:56
*** dave-mccowan has joined #openstack-kolla03:57
*** wagiel has joined #openstack-kolla03:57
*** wagiel has quit IRC04:02
*** zhurong has quit IRC04:06
*** lamt has quit IRC04:08
*** lamt has joined #openstack-kolla04:10
*** dave-mccowan has quit IRC04:16
*** dave-mccowan has joined #openstack-kolla04:16
*** lamt has quit IRC04:19
*** g3ek has quit IRC04:23
openstackgerritSurya Prakash (spsurya) proposed openstack/kolla-ansible master: Development Environment With Vagrant link not working  https://review.openstack.org/45586604:29
openstackgerritzhubingbing proposed openstack/kolla-ansible master: Fix Multi-regions nova support boot from volume  https://review.openstack.org/45604204:30
*** g3ek has joined #openstack-kolla04:31
SamYapleJeffrey4l: your mariadb recovery playbook work is wrong and will almost certainly lead to making someone lose data :(04:31
SamYapleJeffrey4l: i didnt put the seqno stuff in originally because its not accurate04:31
SamYaplewhat you have means you can force a cluster to basically roll back if a node was gracefully stopped then a period of time passed and then the cluster crashed04:32
SamYaplethere are 7 different failure scenarios and only 5 of them are 100% automatically recoverable from04:33
SamYaple1 of them is never automatically recoverable from04:33
SamYapleand one is given the right conditions04:33
*** zhurong has joined #openstack-kolla04:34
*** lucasxu has joined #openstack-kolla04:36
SamYapleinc0: ping about galera recovery above04:41
SamYapleyou guys should really remove that from the repo, it's like super dangerous04:41
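A minimal sketch of the rollback scenario described above, not the playbook's actual logic. It assumes each node reports the seqno recorded in its Galera state file, with -1 meaning the position is unknown (as after a crash). The function names and node names are hypothetical.

```python
def naive_bootstrap_candidate(seqnos):
    """Pick the node with the highest *recorded* seqno, ignoring unknowns.

    This is the risky approach: a node that was stopped gracefully long
    ago still has a valid-looking seqno, while crashed nodes report -1.
    """
    known = {node: seqno for node, seqno in seqnos.items() if seqno >= 0}
    if not known:
        return None
    return max(known, key=known.get)


def cautious_bootstrap_candidate(seqnos):
    """Refuse to choose automatically if any node's position is unknown."""
    if any(seqno < 0 for seqno in seqnos.values()):
        return None  # needs manual inspection
    return max(seqnos, key=seqnos.get)


# node1 was stopped cleanly earlier; node2 and node3 kept running, then crashed.
state = {"node1": 100, "node2": -1, "node3": -1}
print(naive_bootstrap_candidate(state))     # picks the stale node
print(cautious_bootstrap_candidate(state))  # declines to pick
```

In this scenario the naive choice bootstraps from the node that stopped first, which would discard every write made after it left the cluster.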
*** jtriley has joined #openstack-kolla04:42
*** lamt has joined #openstack-kolla04:43
*** bmace has quit IRC04:45
*** skramaja has joined #openstack-kolla04:46
*** dave-mccowan has quit IRC04:50
*** wagiel has joined #openstack-kolla04:51
*** lamt has quit IRC04:52
*** lamt has joined #openstack-kolla04:53
*** wagiel has quit IRC04:56
*** sayantan_ has quit IRC05:03
*** shashank_t_ has quit IRC05:03
*** jtriley has quit IRC05:04
*** iceyao has joined #openstack-kolla05:05
*** lamt has quit IRC05:08
*** bmace has joined #openstack-kolla05:10
daidvGood afternoon.05:12
daidvI wonder why we can NOT mix binary image with source image? Can anyone help me, please?05:13
*** jaosorior_away is now known as jaosorior05:13
*** shashank_t_ has joined #openstack-kolla05:14
*** yangyape_ has joined #openstack-kolla05:16
*** yangyapeng has quit IRC05:17
*** rwsu has quit IRC05:17
Jeffrey4lSamYaple, yes. the current solution cannot cover all cases and it may cause data loss.05:20
Jeffrey4lSamYaple, without the patch, recovery always starts from the first node. that will cause data loss too.05:21
Jeffrey4ldaidv, re mix binary and source,  technically, we can ( need some trick ). But why? i prefer to use source05:23
daidvJeffrey4l, in my case, some images are working well and I just want to add one more project like Panko via a source image.05:26
daidvSo I wonder, can we make some special config to deploy some source images together with other binary project images?05:27
daidvSo currently in Kolla, if a system is using binary images, any new image must also be binary? Right?05:28
*** MasterOfBugs has joined #openstack-kolla05:32
*** pramodrj07 has joined #openstack-kolla05:32
Jeffrey4ldaidv, yep. only one type is supported.05:33
*** sayantan_ has joined #openstack-kolla05:34
openstackgerritjimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support  https://review.openstack.org/45205905:35
*** lucasxu has quit IRC05:40
*** shashank_t_ has quit IRC05:46
*** lamt has joined #openstack-kolla05:49
*** unicell has joined #openstack-kolla05:55
openstackgerritjimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support  https://review.openstack.org/45205905:56
*** lamt has quit IRC05:58
*** mewald has joined #openstack-kolla05:59
*** claudiub has joined #openstack-kolla05:59
mewaldIs "deploy" the right action to run in order to recover a mariadb cluster with 2 out of 3 failed nodes?06:01
*** iniazi has quit IRC06:08
*** hieulq has joined #openstack-kolla06:24
*** hieulq has quit IRC06:26
*** pcaruana has joined #openstack-kolla06:30
*** pramodrj07 has quit IRC06:32
*** MasterOfBugs has quit IRC06:32
*** mewald has quit IRC06:34
*** bogdando has joined #openstack-kolla06:44
openstackgerritzhubingbing proposed openstack/kolla-ansible master: Add region config option in globals.yml  https://review.openstack.org/45414906:44
*** yuanying_ has joined #openstack-kolla06:48
openstackgerritshaofeng cheng proposed openstack/kolla-ansible master: Nova_backend_ceph variable mobile location.  https://review.openstack.org/45607706:48
*** yuanying has quit IRC06:48
*** pbourke has quit IRC06:49
*** pbourke has joined #openstack-kolla06:51
*** targon has joined #openstack-kolla06:54
*** Kimmo_ has quit IRC06:54
*** jistr has quit IRC06:54
*** brad[] has quit IRC06:55
*** p6 has quit IRC06:55
*** p6 has joined #openstack-kolla06:55
*** zhubingbing_ has joined #openstack-kolla06:55
*** SamYaple has quit IRC06:55
*** gema has quit IRC06:55
*** dmsimard has quit IRC06:55
*** SamYaple has joined #openstack-kolla06:55
*** gema has joined #openstack-kolla06:55
*** dmsimard has joined #openstack-kolla06:55
*** gema has quit IRC06:56
*** gema has joined #openstack-kolla06:56
*** jistr has joined #openstack-kolla06:56
*** manheim has joined #openstack-kolla06:58
*** hieulq has joined #openstack-kolla07:01
openstackgerritzhubingbing proposed openstack/kolla-ansible master: Copy region config option from all.yml to globals.yml  https://review.openstack.org/45414907:01
*** mewald has joined #openstack-kolla07:03
*** mewald1 has joined #openstack-kolla07:06
*** mewald has quit IRC07:09
*** daidv_ has joined #openstack-kolla07:10
openstackgerritMerged openstack/kolla-ansible master: Unmount Ceph OSD disks as part of destroy  https://review.openstack.org/45571407:11
*** shashank_t_ has joined #openstack-kolla07:12
mewald1Is "deploy" the right action to run in order to recover a mariadb cluster with 2 out of 3 failed nodes?07:15
*** Serlex has joined #openstack-kolla07:17
*** shardy has joined #openstack-kolla07:24
*** brad[] has joined #openstack-kolla07:24
openstackgerritZeyu Zhu proposed openstack/kolla-ansible master: Remove the variable redefined in deploy-servers.yml  https://review.openstack.org/44843307:25
openstackgerritZeyu Zhu proposed openstack/kolla-ansible stable/ocata: Modify the hosts of the post-deploy.yml playbook  https://review.openstack.org/45609407:26
*** nathharp has joined #openstack-kolla07:28
*** caowei has quit IRC07:32
japestinhomewald1 I think you should run deploy and reconfigure if needed by mariadb services07:32
*** jmccarthy has joined #openstack-kolla07:33
*** egonzalez has joined #openstack-kolla07:36
*** shashank_t_ has quit IRC07:38
openstackgerritMerged openstack/kolla-ansible master: fix typo  https://review.openstack.org/45554107:42
*** sayantan_ has quit IRC07:43
*** daidv_ has quit IRC07:44
*** athomas has joined #openstack-kolla07:48
*** daidv_ has joined #openstack-kolla07:50
*** kollian has joined #openstack-kolla07:50
kollian hi guys......  I am facing some issues in deploying OpenStack with kolla, I am just new to kolla. I don't understand where to run the kolla-nuild command to build images. I have installed docker as well as ansible07:52
kolliankolla-build*07:53
kollianall the dependency07:53
kolliancan anyone please help07:53
kollianwill i run after checkout the kolla repo07:54
kollian?07:54
japestinhokollian you can follow this link from egonzalez blog http://egonzalez.org/deploy-openstack-designate-with-kolla-ansible/07:57
japestinhokollian download the image tarballs (binary/source) and setup docker private registry07:58
egonzalezkollian, to build images you need to install kolla too, then kolla-build command will be available07:58
kollianjapestinho: deployment node and target node can be same right ?07:59
egonzalezif using stable branch, can pull images from dockerhub07:59
kollianin three node system07:59
kollianegonzalez: through PIP packages or is there any script to install08:00
kollian?08:00
japestinhokollian yes it can be used for all-in-one or multinode08:00
zhubingbing_sup egonzalez08:00
zhubingbing_;)08:00
kollianusing the master one08:00
egonzalezkollian, master cannot be installed from pip08:00
egonzalezkollian, use pip install -r kolla/requirements -r kolla/test-requirements08:01
egonzalezchange kolla with kolla-ansible to install kolla ansible requirements08:01
kollianegonzalez: then i only need kolla repo to run the command08:02
egonzalezkollian, commands can be found at tools/ folder ,kolla-ansible and build.py(this one in kolla repo)08:02
*** zhubingbing_ has quit IRC08:02
*** zhubingbing_ has joined #openstack-kolla08:02
kollianegonzalez: you mean to pull the image from the registry i have to checkout the repo of stable branch like ocata/newton etc ?08:04
kollianthen i can run pip install ?08:04
egonzalezkollian, don't need to download images to just install requirements, images are needed to deploy08:06
egonzalezkollian, but if you want to use stable branch, just use pip install  kolla and pip install kolla-ansible08:06
egonzalezdont have to clone both repos08:07
japestinhosup egonzalez :)08:11
egonzalezjapestinho, zhubingbing_ sup :D08:11
japestinhoegonzalez which network type should I use if I deploy openstack on openstack08:11
*** blallau has joined #openstack-kolla08:11
*** Administrator_ has quit IRC08:12
japestinhoegonzalez I am able to deploy kolla-kubernetes on openstack deployment08:12
*** Administrator_ has joined #openstack-kolla08:12
japestinhoegonzalez after init-runonce I got this08:12
japestinho[centos@kolla-k8s ~(keystone_admin)]$ ip netns08:12
japestinhoqrouter-67793882-51eb-416e-b794-00a3bbf93e2508:12
japestinhoqdhcp-c3a937cf-ec04-4d02-ac46-d266c77c041908:12
japestinho[centos@kolla-k8s ~(keystone_admin)]$ ip netns e qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419 ping 10.0.0.10008:12
japestinhoCannot open network namespace "qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419": Permission denied08:12
japestinhoegonzalez I can't ping to demo VM08:13
openstackgerritEduardo Gonzalez proposed openstack/kolla master: DNM: test master branch  https://review.openstack.org/45610808:14
*** targon has quit IRC08:15
egonzalezjapestinho, guess flat network type08:15
*** shardy has quit IRC08:15
*** yangyape_ has quit IRC08:16
egonzalezjapestinho, have you tried with root or sudo privs?08:16
openstackgerritZeyu Zhu proposed openstack/kolla stable/newton: Modify the hosts of the post-deploy.yml playbook  https://review.openstack.org/45610908:17
*** shardy has joined #openstack-kolla08:17
*** yangyapeng has joined #openstack-kolla08:18
kollianegonzalez: Thanks a lot, hope i will get it running08:20
openstackgerritZeyu Zhu proposed openstack/kolla stable/newton: Modify the hosts of the post-deploy.yml playbook  https://review.openstack.org/45610908:21
japestinhoegonzalez same thing with root priv08:22
japestinho[root@kolla-k8s ~(keystone_admin)]$ ip netns e qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419 ping -c3 8.8.8.808:22
japestinhoRTNETLINK answers: Invalid argument08:22
japestinhoRTNETLINK answers: Invalid argument08:22
japestinhosetting the network namespace "qdhcp-c3a937cf-ec04-4d02-ac46-d266c77c0419" failed: Invalid argument08:22
*** dmellado has joined #openstack-kolla08:23
egonzalezjapestinho, what is the "e" option in ip netns for?08:24
egonzalezis it short for exec?08:24
*** mewald1 has quit IRC08:28
blallau@egonzalez "e" alias "exec"08:29
openstackgerritShunli Zhou proposed openstack/kolla-ansible master: Correct operating-kolla.rst document  https://review.openstack.org/45611008:29
*** mewald has joined #openstack-kolla08:29
sdakemorning08:34
egonzalezmorning sdake08:36
egonzalezjapestinho, this issue is only in your env or more people in kolla-k8s is having the same?08:37
*** caowei has joined #openstack-kolla08:37
japestinhomorning sdake08:38
japestinhoegonzalez as far as I know, I'm the only one having this issue08:38
kollianegonzalez: i more lammy. seems like https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/multinode.html  missing the right direction08:38
kollianto follow08:38
kollianby any deployer08:38
openstackgerritBertrand Lallau proposed openstack/kolla-ansible master: WIP: Enable Ceph input plugin in Telegraf  https://review.openstack.org/45560208:39
egonzalezkollian, can you rephrase? cannot understand what you mean08:39
kollianegonzalez: guide dont say about kolla installation08:40
kollianit say about kolla-ansible08:40
egonzalezjapestinho, there was a bug in kolla-ansible with shared /run, maybe k8s does not mount /run as shared https://bugs.launchpad.net/kolla/+bug/161626808:40
openstackLaunchpad bug 1616268 in kolla newton "Stale namespace removal causing "RTNETLINK answers: Invalid argument" errors" [Critical,Fix committed] - Assigned to Jeffrey Zhang (jeffrey4l)08:40
egonzalezkollian, kolla is not needed for kolla-ansible deploy unless want to build your own images08:41
kollianegonzalez: so how kolla-ansible know about the images ?08:42
kolliani.e build images08:42
egonzalezkolla-ansible need a couple of setting in globals.yml08:42
egonzalezregistry, namespace and version. thats all. will pull images from the registry and deploy them08:43
blallau@japestinho and @egonzalez exactly, I was thinking about the same issue: /run must mount with "shared" option "/run/:/run/:shared"08:43
kollianegonzalez: ohh08:44
kollianegonzalez: thknks08:44
egonzalezkollian, if you do not configure a registry and namespace, by default it will pull images from dockerhub08:45
egonzalezkollian, *only for stable releases (master not)08:45
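A minimal sketch of how the registry, namespace and version settings mentioned above could combine into an image reference. It assumes the `<registry>/<namespace>/<base distro>-<install type>-<service>:<tag>` naming pattern used by Kolla images at the time; the function name and default values are illustrative assumptions, not kolla-ansible code.

```python
def image_ref(service, namespace="kolla", tag="4.0.0", registry=None,
              base_distro="centos", install_type="binary"):
    """Build an image reference from the deployment settings.

    With no registry set, the reference has no host prefix, so Docker
    resolves it against its default registry (Docker Hub).
    """
    prefix = f"{registry}/" if registry else ""
    return f"{prefix}{namespace}/{base_distro}-{install_type}-{service}:{tag}"


# Default: pulled from Docker Hub.
print(image_ref("nova-compute"))
# Private registry and custom namespace.
print(image_ref("glance-api", namespace="lokolla",
                registry="192.168.100.215:4000",
                base_distro="ubuntu", install_type="source"))
```

The second call mirrors the registry address and namespace used in the `build.py` example elsewhere in this log.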
*** gfidente has joined #openstack-kolla08:46
sdakeegonzalez do you know where https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/quickstart.html is rendered from08:46
sdakeegonzalez the docs are incorrect08:46
egonzalezsdake, https://github.com/openstack/kolla-ansible/tree/master/deploy-guide/source08:47
egonzalezsdake, master is with draft in the url instead of ocata https://docs.openstack.org/project-deploy-guide/kolla-ansible/draft/quickstart.html08:47
hrwegonzalez, sdake: can you look at https://review.openstack.org/#/c/450805/ and vote? simple change, CI green08:48
hrwI started recheck on some other ones08:48
japestinhoegonzalez blallau did you mean MountFlags=shared option on /etc/systemd/system/docker.service ?08:50
egonzalezjapestinho, nope, volumes mounting at container startup08:51
sdakehrw done08:51
sdakehrw rev needed08:51
egonzalezjapestinho, in kolla-ansible https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/neutron/defaults/main.yml#L22 dunno where I can find that in kubernetes08:51
hrwsdake: rtslib is dependency for targetcli08:52
hrwsdake: so listing it was not needed08:52
sdakehrw for centos?08:53
hrwyes08:53
sdakehrw I don't see it installed here: http://logs.openstack.org/05/450805/7/check/gate-kolla-dsvm-build-centos-binary-centos-7-nv/89e115d/console.html.gz08:54
hrwsdake: https://pastebin.com/EuVB3q9h08:54
*** Kimmo_ has joined #openstack-kolla08:54
egonzalezcan someone take a look at https://review.openstack.org/#/c/455785/ ?08:54
egonzaleztemporarily fixes source deploy gates08:54
hrwsdake: log you shown does not have targetcli either08:54
sdakehrw agreed, seems like a bug08:55
egonzalezdont know the reason, but nova service-list is not retrieving registered services when they are in the database08:55
sdakehrw we may not require targetcli in nova packaging but may require rtslib08:55
sdakehrw i am hesitant to approve a change that breaks centos - if you rev the patch without the rts removal I'll ack it :)08:56
hrwsdake: ok, will readd rtslib08:56
sdakewhat we really need is m0ar gates for kolla-ansible08:56
*** manheim has quit IRC08:56
sdakehrw the rest of the change looks great :)08:57
sdakei think I lost my big battery for my phone08:57
* sdake groans08:57
sdakealways on the road, always lose my big battery08:57
*** shardy is now known as shardy_mtg08:57
openstackgerritMarcin Juszkiewicz proposed openstack/kolla master: handle rtslib(-fb) package names and dependencies  https://review.openstack.org/45080509:01
hrwsdake: here you have09:01
hrw(venv-for-kolla) 11:01 hrw@gossamer:kolla$ git branch -m to-merge/debian-rtslib-fb-0412-110109:01
hrw;D09:01
sdakefor folks looking to contribute to kolla-kubernetes, our 0.6.0 release planned for april 15th is here: https://launchpad.net/kolla-kubernetes/+milestone/0.6.009:05
sdakeegonzalez can you have a look at https://review.openstack.org/#/c/450805/09:05
sdakespecifically: http://logs.openstack.org/05/450805/8/experimental/gate-kolla-dsvm-deploy-multinode-ubuntu-source-ubuntu-trusty-2-node-nv/3ad4c36/console.html#_2017-04-12_09_04_48_59781809:07
sdakeWTB NON FLAKEY GATES09:07
openstackgerritMerged openstack/kolla-ansible master: Fix Multi-regions nova support boot from volume  https://review.openstack.org/45604209:07
blallau@egonzalez for 450805: why not use "openstack hypervisor list" instead ?09:10
*** manheim has joined #openstack-kolla09:10
sdakezhubingbing_ what would be helpful is https://blueprints.launchpad.net/kolla-kubernetes/+spec/move-config-to-kolla-k8s09:10
*** nhlfr has quit IRC09:11
*** retr0h has quit IRC09:11
*** lpetrut has joined #openstack-kolla09:12
egonzalezblallau, empty too :(09:12
egonzalezblallau, http://paste.openstack.org/show/606241/09:13
*** vcn[m] has joined #openstack-kolla09:13
*** mewald has quit IRC09:13
blallaucause on install guide it should return something...09:16
blallauare you admin user?09:16
blallauwhen you launch "openstack hypervisor list"09:16
*** iceyao has quit IRC09:16
blallauor "openstack compute service list"09:16
*** bachp has joined #openstack-kolla09:16
*** nhlfr has joined #openstack-kolla09:16
*** retr0h has joined #openstack-kolla09:16
blallauseems you are admin user in simple_cell_setup.yml, but here http://paste.openstack.org/show/606241/09:18
*** zhubingbing_ has quit IRC09:19
*** zhubingbing has joined #openstack-kolla09:19
blallauinstall guide here https://docs.openstack.org/ocata/install-guide-ubuntu/nova-compute-install.html#add-the-compute-node-to-the-cell-database09:20
egonzalezblallau, yep im admin http://paste.openstack.org/show/606242/09:21
blallau"openstack hypervisor list" (in admin) must return something09:21
egonzalezthis issue is reproduced in gates09:21
blallau@egonzalez: ok, I have no idea...09:22
sdakeegonzalez what do you make of https://review.openstack.org/#/c/453846/2209:23
sdakethe last comment09:23
kollianegonzalez: I can't see the kolla repo to run kolla-build after installation i.e pip install kolla09:25
egonzalezsdake, simple_cell_setup is executed right?09:25
*** Serlex has quit IRC09:25
egonzalezkollian, when installing from pip there is no repo downloaded, kolla-build command is globally available in your system09:25
sdakeegonzalez appears not09:26
egonzalezblallau, filled a bug in nova and python-novaclient https://bugs.launchpad.net/nova/+bug/168206009:27
openstackLaunchpad bug 1682060 in python-novaclient "empty nova service and hypervisor list" [Undecided,New]09:27
kollianegonzalez:  it is written that i have run tools/start-registry in multinode deployment for image registry09:27
kollianhow can i do registry09:27
kollian?09:28
kollianhave to *09:28
japestinhokollian docker run -d -p 4000:5000 --restart=always -v /opt/kolla_registry/:/var/lib/registry --name registry registry:209:29
kollianjapestinho: this is for AIO or multinode ?09:30
japestinhoor you need cd to /usr/share/kolla/tools/09:30
blallau@egonzalez great! thank you09:30
*** gkadam is now known as gkadam-afk09:30
*** zhuzeyu has quit IRC09:31
openstackgerritMerged openstack/kolla-ansible master: Temporaly fix deploy gate  https://review.openstack.org/45578509:31
kollianjapestinho: there is no kolla folder in /usr/share/09:31
japestinhokollian I believe it was for multinode but you can use it for AIO also09:32
kollianjapestinho: will the kolla dir be created after kolla-ansible installation ?09:34
japestinhokollian nope, you need pip install -U kolla first, kolla-ansible will create /usr/share/kolla-ansible09:35
japestinhokollian what do you want to create? AIO or multinode deployment?09:36
*** guoguo has quit IRC09:37
kollianjapestinho: multinode09:37
japestinhokollian I suggest you follow this one http://egonzalez.org/deploy-openstack-designate-with-kolla-ansible/09:38
kollianjapestinho: following https://docs.openstack.org/project-deploy-guide/kolla-ansible/ocata/multinode.html09:39
japestinhokollian you can skip kolla-build process with download image tarballs first09:39
*** manheim has quit IRC09:39
japestinhokollian then deploy docker private registry09:39
egonzalezkollian, ^^ I suggest you to use images from Dockerhub for stable deployments, dont need any registry, just install kolla-ansible, configure globals.yml and deploy09:40
egonzalezkollian, images from tarballs are latest change in stable branch (most of them not released as stable yet)09:41
* egonzalez should change the blog post to avoid issues with stable deploys...09:41
kollianegonzalez: you mean only kolla-ansible installation and configure globals.yml will work for openstack deployment from kolla without running `pip install kolla`09:43
egonzalezkollian, yep, kolla is only needed if want to build your own images09:44
kollianegonzalez: ok, thanks09:44
egonzalezkollian, with this http://paste.openstack.org/show/606250/ is fair enough to deploy, change your IPs and NIC names to match your env, add nodes to inventory and deploy09:46
hrwok, rdo repos still 403 so no centos builds09:46
egonzalezkollian, that will download stable images from dockerhub09:46
kollianegonzalez: ok, thanks09:48
openstackgerritEduardo Gonzalez proposed openstack/kolla-ansible master: DNM: test master branch  https://review.openstack.org/45614009:51
japestinhoegonzalez Just curious if I use that private docker registry and using image from tarballs, how to use kolla-build to build latest stable image again? :)09:51
*** sambetts|afk is now known as sambetts09:52
egonzalezjapestinho, ./build.py -t source --tag 4.0.0 -n 192.168.100.215:4000/lokolla09:52
openstackgerritMerged openstack/kolla-ansible master: Nova_backend_ceph variable mobile location.  https://review.openstack.org/45607709:52
egonzalezjapestinho, add namespace with your registry IP on it09:52
japestinhoegonzalez ok thanks I got it.09:53
openstackgerritshaofeng cheng proposed openstack/kolla-ansible master: Remove show_multiple_locations in glance-api  https://review.openstack.org/45614309:58
*** tovin07_ has quit IRC09:59
*** daidv has quit IRC10:01
*** manheim has joined #openstack-kolla10:07
kollianegonzalez: japestinho i know it could be little embarrassing10:12
kolliani am not finding10:12
kolliankolla-ansible dir10:12
kollianin /usr/share/10:13
egonzalezkollian, did you install kolla-ansible with pip?10:13
kollianegonzalez: yes and it installed successfully10:13
egonzalezkollian, centos, ubuntu or oracle?10:14
kollianhttp://paste.openstack.org/show/606259/10:15
kollianegonzalez: do i need to specify these somewhere ?10:15
egonzalezpython packages are installed in different paths10:16
*** rhallisey has quit IRC10:16
egonzalezkollian, for deb is in /usr/local/share/kolla-ansible10:16
kollianegonzalez: yes git this in /user/local :)10:17
kolliangot*10:17
kollianegonzalez: i have not specified anywhere about my base distro i.e ubuntu10:20
kolliando i need to specify it somewhere ?10:20
egonzalezkollian, yep in globals set `kolla_base_distro: "ubuntu"` in your case10:22
openstackgerritMerged openstack/kolla master: openstack-base: Percona-Server is x86-64 only  https://review.openstack.org/44996510:24
sdakespsurya pingola10:26
sdakezhubingbing pingola10:26
sdakeduonghq pingola10:26
spsuryasdake: pongola10:26
sdakehey fellas10:26
sdakebeen at openstack leadership training this week10:27
sdakewondering what your thinking is on the concept of a "protocore" for kolla-kubernetes10:27
sdakea protocore is essentially a "core reviewer in training"10:27
sdakei know all 3 of you (and more that are not in the channel) show some interest in reviewing kolla-kubernetes10:28
zhubingbinghi10:28
sdakeone of the problems a core review team faces with minting new core reviewers is that often core reviewers have to have a detailed knowledge of the gates and codebase10:29
sdakecurious if you would be interested in taking part in such a program10:29
sdakeit works as follows10:29
sdake1. your +1 vote would count as a +2, however, you would not have the ability to +2 or +w a review10:30
sdake2. when a core reviewer reviews the change, they either +2/+w it , or provide feedback to you and the submitter about how the review was not quite right10:30
sdake3. you learn how to become a core reviewer for kolla-kubernetes without having to "guess" at the set of magic incantations necessary to obtain said objective10:31
sdakethe natural outcome of such a program is that you learn how to review properly for kolla-kubernetes10:33
*** Serlex has joined #openstack-kolla10:33
egonzalezthat sounds like a good program to have10:33
duonghqsdake, pong10:33
sdakeduonghq read scrollback :)10:33
sdakeegonzalez I agree10:33
sdake:)10:33
spsuryasdake: as you mentioned in third point *protocore* will work towards core reviewer part ?10:33
spsuryasdake: idea sounds good10:34
sdakeyup - the idea is to provide an onramp to core-reviewer without overwhelming you with core reviewer to begin with10:34
sdakeor alternately requiring you to move earth with your mind to learn how to become a core reviewer magically10:34
duonghqsdake, good plan10:34
sdakeshame rwellum isn't in the channel :)10:35
sdakeit does require a 100% commitment from you to learn how to become a core reviewer10:35
sdakesome of you already are well on your way10:36
sdakeor core reviewer on other projects10:36
sdakealthough each deliverable is different and has slightly different requirements10:36
sdakeegonzalez think such a program would be good for kolla-ansible and kolla?10:37
sdakeJeffrey4l ^^10:38
duonghqI think the program is good, and maybe unique for Kolla team in OpenStack10:39
sdakeduonghq indeed nova is doing this now10:39
sdakeduonghq it's not like i invented the idea myself :)10:39
duonghqroger10:40
spsuryasdake: bringing this to kolla is a good step10:40
sdakehaven't brought it yet10:40
sdakejust wanted to gauge interest10:40
sdakeif people aren't interested, no reason to do such a thing10:41
openstackgerritPaul Bourke (pbourke) proposed openstack/kolla master: Reparent kolla-toolbox from openstack-base  https://review.openstack.org/43502310:42
Jeffrey4lsound great sdake10:42
sdakepbourke ^^10:42
spsuryasdake: i think mostly would be interested in this10:42
sdakespsurya that sentence didn't parse :)10:42
spsuryaeven evryone10:42
spsuryasdake: :)10:43
pbourkesdake: sounds good, we need more cores on kolla-ansible10:44
sdakepbourke right and how do we get more core reviewers?10:44
pbourkesdake: well your idea sounds like a good way towards that10:44
pbourkesdake: egonzalez: would you guys mind tag teaming and getting these finally merged? https://review.openstack.org/#/c/435023/ https://review.openstack.org/#/c/435024/10:45
pbourkeso tired of fixing conflicts on them10:45
sdakepbourke still not a fan of the reparent :)10:46
pbourkesdake: hmm10:47
sdakepbourke hopefully someday someone fixes that ;)10:47
pbourkesdake: yeah im concerned with the complexity of these images10:48
sdakepbourke reviewed enjoy10:48
pbourkesdake: but (yet another) rearchitect seems very difficult at this stage in the project10:48
*** cuongnv has quit IRC10:55
duonghqneed to go back home, see you in the meeting10:59
*** duonghq has quit IRC10:59
nathharpegonzalez - don’t know if you remember the vif plugging issue while spawning a large number of instances I mentioned last week?   It looks like I had to adjust the number of worker threads for neutron-server11:01
egonzaleznathharp, is the issue fixed after increasing workers?11:01
nathharpegonzalez - it does appear to be.   I am still seeing occasional failures (1 or 2 instances) but I might have a bad hypervisor.11:02
nathharpegonzalez - issue appears to have moved on to DHCP allocations11:03
nathharpegonzalez - any tips on tuning dnsmasq?11:03
egonzaleznathharp, maybe increasing number of l3 and dhcp agents per network11:05
egonzalezdhcp_agents_per_network: 211:05
egonzalezmax_l3_agents_per_router: 311:05
*** dixiaoli has quit IRC11:06
nathharpegonzalez thanks I’ll check them out.  Issue seems to be the amount of time it takes to release an IP address when an instance is destroyed (if a large number are destroyed)11:08
*** athomas has quit IRC11:08
*** papacz has joined #openstack-kolla11:09
hrwhttps://review.openstack.org/#/c/455374 - can you guys look at it? switching centos builds to rdo mirror may sort out centos gate failures11:11
egonzaleznathharp, interesting, for my knowledge. have you estimated ~ limits per worker?11:11
openstackgerritMerged openstack/kolla master: check mariadb galera status in every loop.  https://review.openstack.org/45013111:12
*** rhallisey has joined #openstack-kolla11:13
*** shardy_mtg is now known as shardy11:13
*** athomas has joined #openstack-kolla11:14
*** shardy is now known as shardy_lunch11:19
egonzalezpbourke, not sure for merging now toolbox reparent, build gates are failing, IDK if the registry will have kolla_toolbox image with those changes and will potentially break all other gates until rdo issue is fixed11:20
*** iceyao has joined #openstack-kolla11:20
pbourkeegonzalez: see what you mean, lets hang on till gates are green11:21
openstackgerritjimmygc proposed openstack/kolla-ansible master: Add Glance Swift backend support  https://review.openstack.org/45205911:21
pbourkeegonzalez: though they wont go green for that patch as they depend on each other :/11:21
egonzalezpbourke,  i know, at least wait until build are green. deploy won't fail until both changes are merged11:21
egonzalezpbourke, luckly kolla-ansible is +w and will merge once kolla do11:22
egonzalez* s/won't fail/will fail/11:22
*** zhurong has quit IRC11:26
*** rwallner has joined #openstack-kolla11:26
openstackgerritMerged openstack/kolla master: Install panko in ceilometer base container  https://review.openstack.org/44468011:28
*** zhubingbing has quit IRC11:29
*** iceyao has quit IRC11:30
*** rwallner has quit IRC11:30
*** rwallner has joined #openstack-kolla11:34
*** haplo37_ has quit IRC11:36
*** haplo37_ has joined #openstack-kolla11:36
sdakeJeffrey4l can you represent the protocore idea in the team meeting today11:38
sdakeI wont be able to make it as i'm in training11:38
sdakeor pbourke11:38
sdakeor egonzalez ? :)11:38
pbourkesdake: I can do as I already was discussing core stuff with inc0 recently11:39
sdakepbourke thanks i'll add to agenda11:39
sdakepbourke * protocore (an onramp to core reviewer) (pbourke)11:40
sdakeenjoy11:40
sdakethanks a bunch :)11:40
sdakekey ideas11:40
sdakeone protocore +1 + 1 core reviewer +2 = +w11:41
sdakeprotocore is coached by core review team11:41
sdakeprotocores are identified out of current pool of existing reviewers that are keen to join the core review team11:42
sdakepbourke add or subtract as you see fit :)11:42
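The voting rule sketched in the key ideas above can be written down as a small check. This is only an illustrative model, not Gerrit's API: the `Vote` type, the role labels, and the function name are hypothetical, and the baseline of two core +2 votes is an assumption that the chat does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    reviewer: str
    role: str   # "core", "protocore", or "other" (labels assumed for illustration)
    score: int  # the score as cast in review, e.g. 1 or 2


def can_workflow(votes):
    """Return True when a change could be approved (+w) under the proposed rule."""
    core_plus_two = sum(1 for v in votes if v.role == "core" and v.score >= 2)
    proto_plus_one = sum(1 for v in votes if v.role == "protocore" and v.score >= 1)
    # Assumption: two core +2 votes is the normal baseline. A protocore +1 may
    # stand in for one of them, but at least one real core +2 is always needed.
    return core_plus_two >= 2 or (core_plus_two >= 1 and proto_plus_one >= 1)
```

Under this model a protocore never approves alone, which matches the point that the core reviewer still makes the final +2/+w call and gives feedback.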
sdakebbl11:42
*** gkadam-afk is now known as gkadam11:42
openstackgerritOpenStack Proposal Bot proposed openstack/kolla master: Updated from global requirements  https://review.openstack.org/45592811:44
openstackgerritOpenStack Proposal Bot proposed openstack/kolla-ansible master: Updated from global requirements  https://review.openstack.org/45592911:44
spsuryanice11:44
nathharpegonzalez - do you mean the neutron workers that I’ve set?   My controller nodes are old but powerful, 2x6core 2.7GHz plus HT.   I’ve set api, rpc and rpc_state workers to 2511:49
egonzaleznathharp, i mean, have you estimated how many jobs one worker can take before it collapses?11:50
nathharpegonzalez, I’ve not been very scientific, but somewhere between 40-100 instance creates breaks the defaults11:51
nathharpegonzalez, I took some inspiration from https://javacruft.wordpress.com/2014/06/18/168k-instances/11:52
nathharpegonzalez, regarding DHCP releases from dnsmasq - it is releasing ~1 address per second.   I’m in an edge case, but I have a class C network, have 200 instances running, delete them all, and recreate.   No errors in openstack, but dnsmasq complains about not enough IPs11:54
nathharpegonzalez - I think neutron assigns a ‘free’ IP, but dnsmasq thinks it’s still in use11:55
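A rough back-of-the-envelope check of the numbers above, as a sketch only. The release rate of about one address per second and the 200 instances come from the chat; the pool size of 254 is an assumption for a /24 (the real allocation pool is smaller once gateway and DHCP ports are taken), and the function names are hypothetical.

```python
def addresses_still_held(deleted_instances, elapsed_seconds, releases_per_second=1.0):
    """Leases dnsmasq has not yet released, given a fixed release rate."""
    released = min(deleted_instances, int(elapsed_seconds * releases_per_second))
    return deleted_instances - released


def free_addresses(pool_size, held):
    """Addresses that dnsmasq still considers available."""
    return pool_size - held


if __name__ == "__main__":
    # One minute after deleting 200 instances, try to recreate all 200.
    held = addresses_still_held(200, elapsed_seconds=60)
    print("still held:", held)
    print("free by dnsmasq's view:", free_addresses(254, held))
```

With these assumptions, recreating 200 instances within the first minute or so asks for more addresses than dnsmasq believes are free, which is consistent with the "not enough IPs" complaint even though neutron itself reports no error.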
kollianegonzalez: i did the same but into this11:56
kollianvagrant@Vmachine1:/usr/local/share/kolla-ansible$ sudo kolla-ansible prechecks -i ansible/inventory/multinode Pre-deployment checking : ansible-playbook -i ansible/inventory/multinode -e @/etc/kolla/globals.yml -e @/etc/kolla/passwords.yml -e CONFIG_DIR=/etc/kolla  -e action=precheck /usr/local/share/kolla-ansible/ansible/site.yml  ERROR! the file_name '/etc/kolla/globals.yml' does not exist, or is not readable Command faile11:56
kollianegonzalez: should not i run the precheck ?11:56
kollianjust after kolla-ansible installation and configuring the inventory/multinode11:57
egonzalezkollian, have you copied all content from /usr/local.. to /etc/kolla/ cp -r /usr/local/share/kolla-ansible/etc_examples/kolla /etc/kolla/11:57
kollianegonzalez: no, that may be the problem11:58
egonzalezglobals.yml is modified at /etc/kolla11:58
egonzalezkollian, also have to generate passwords11:59
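The prechecks error above comes down to files missing from the config directory. A minimal sketch of a check one could run first, assuming the two file names from the `-e @/etc/kolla/...` arguments in the pasted command; the function name is hypothetical and this is not part of kolla-ansible.

```python
from pathlib import Path

# File names taken from the ansible-playbook command line pasted above.
REQUIRED_FILES = ("globals.yml", "passwords.yml")


def missing_config_files(config_dir="/etc/kolla"):
    """Return the required config files that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_config_files()
    if missing:
        print("missing in /etc/kolla:", ", ".join(missing))
    else:
        print("config files present; prechecks can read them")
```

If either file is reported missing, copying the example config and generating passwords, as suggested above, would be the next step.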
*** eaguilar has joined #openstack-kolla12:00
*** shardy_lunch is now known as shardy12:00
openstackgerritSerguei Bezverkhi proposed openstack/kolla-kubernetes master: Making resolv.conf to be more flexible  https://review.openstack.org/45619112:02
*** blallau has quit IRC12:07
*** sbezverk has quit IRC12:09
*** ipsecguy_ is now known as ipsecguy12:10
*** daidv has joined #openstack-kolla12:11
*** krtaylor has joined #openstack-kolla12:11
*** sbezverk has joined #openstack-kolla12:13
openstackgerritMerged openstack/kolla-ansible master: Revert "Fix Fluentd warn on dnsmasq.log file parsing"  https://review.openstack.org/45383712:15
openstackgerritMerged openstack/kolla-ansible master: Fix outdated InfluxDB configuration  https://review.openstack.org/45266712:18
*** rwellum has joined #openstack-kolla12:20
openstackgerritMerged openstack/kolla-ansible master: Congress: remove oslo_messaging_notifications config  https://review.openstack.org/44441112:22
openstackgerritMerged openstack/kolla-ansible master: Add gnocchi backend precheckes for ceilometer  https://review.openstack.org/44531212:24
*** haplo37 has quit IRC12:25
*** haplo37_ is now known as haplo3712:25
*** daidv_ has quit IRC12:26
*** daidv has quit IRC12:26
*** daidv has joined #openstack-kolla12:26
*** haplo37 has quit IRC12:32
*** g3ek has quit IRC12:33
*** srwilkers has joined #openstack-kolla12:34
*** haplo37 has joined #openstack-kolla12:38
*** haplo37_ has joined #openstack-kolla12:38
*** g3ek has joined #openstack-kolla12:38
openstackgerritEduardo Gonzalez proposed openstack/kolla-ansible master: Revert "Temporaly fix deploy gate"  https://review.openstack.org/45621012:52
*** mbruzek has joined #openstack-kolla12:53
*** lamt has joined #openstack-kolla12:57
*** goldyfruit has joined #openstack-kolla13:01
*** gkadam has quit IRC13:05
*** iceyao has joined #openstack-kolla13:05
*** lamt has quit IRC13:07
*** dvx has joined #openstack-kolla13:09
egonzalezJeffrey4l, around? re removing until in register tasks https://review.openstack.org/#/c/42871913:10
egonzalezJeffrey4l, duonghq raised a comment here, https://review.openstack.org/#/c/451876/13:10
egonzalezJeffrey4l, what happens when there is a network error and the first attempt fails? is retry handled in some other place that I'm missing?13:11
*** esharao has joined #openstack-kolla13:15
mnaserhttps://review.openstack.org/#/c/455374/13:16
mnaseranyone has any clues about this nondeterministic failure?13:16
mnaser:(13:16
*** lamt has joined #openstack-kolla13:16
openstackgerritOpenStack Proposal Bot proposed openstack/kolla-ansible master: Updated from global requirements  https://review.openstack.org/45592913:18
*** eanylin has quit IRC13:19
egonzalezmnaser, python tests fail randomly, i've no idea of the reason13:22
*** erlon has joined #openstack-kolla13:22
*** mattmceuen has joined #openstack-kolla13:26
openstackgerritBertrand Lallau proposed openstack/kolla-ansible master: Fix neutron agents restarted on ml2 config change  https://review.openstack.org/44799213:28
*** jtriley has joined #openstack-kolla13:35
*** al498u has joined #openstack-kolla13:36
*** al498u_ has joined #openstack-kolla13:38
*** 7GHAARTU6 has joined #openstack-kolla13:41
*** al498u has quit IRC13:41
*** papacz has quit IRC13:42
*** eaguilar has quit IRC13:42
*** 7GHAARTU6 has quit IRC13:43
pbourkemnaser: i think some of the tests are written badly13:44
pbourkemnaser: there's global data which they're manipulating, this is bad13:44
*** zhubingbing_ has joined #openstack-kolla13:45
*** bogdando has quit IRC13:45
*** zhurong has joined #openstack-kolla13:47
*** mattmceuen has quit IRC13:48
*** lamt has quit IRC13:49
mnaserpbourke im surprised its become an issue lately13:50
pbourkemnaser: the tests are being added to, it's possible some of the newer ones have exposed it13:51
*** zhubingbing_ has quit IRC13:51
pbourkemnaser: that or the newer tests are wrong :/13:52
mnaserpbourke gotcha.. we'll have to get to the bottom of it, because otherwise things are going to take a long time to merge13:52
mnaseresp because this is a voting check13:52
pbourkemnaser: yeah13:52
mnaseri have a ton of other work stuff i gotta work on but ill try to look at it.. no commitment on that atm :x13:52
*** srwilkers has quit IRC13:53
*** lamt has joined #openstack-kolla13:53
*** srwilkers has joined #openstack-kolla13:55
*** lrensing has joined #openstack-kolla13:56
*** ksumit has joined #openstack-kolla13:58
*** ksumit has quit IRC13:58
*** zhurong_ has joined #openstack-kolla14:00
*** zhurong has quit IRC14:01
*** iceyao has quit IRC14:02
*** ksumit has joined #openstack-kolla14:02
SamYapleJeffrey4l: with your mariadb recovery it is impossible to specify a recovery node. the state of mariadb recovery in kolla ansible is really bad right now. like a high probability of losing data14:06
*** lucasxu has joined #openstack-kolla14:10
*** david-lyle has quit IRC14:15
*** phuongnh has joined #openstack-kolla14:18
*** lamt has quit IRC14:21
*** srwilkers has quit IRC14:23
*** satyar has joined #openstack-kolla14:23
*** lrensing has quit IRC14:24
inc0good morning14:28
*** rwallner has quit IRC14:28
inc0SamYaple: then I have favor to ask14:28
*** rwallner has joined #openstack-kolla14:28
inc0would you mind creating a bug in kolla-ansible for it and mark it as critical?14:28
inc0with explanation why it's bad and how to make it not bad?14:29
*** srwilkers has joined #openstack-kolla14:29
*** rwallner has quit IRC14:29
SamYapleinc0: yes, but i will not be able to take on the work to fix it at this time14:30
inc0SamYaple: sure, no pressure14:31
inc0just record it somewhere, write down your thoughts and let someone else deal with  it14:31
inc0maybe answer pings here:)14:31
inc0thank you good sir!14:31
SamYaplenp man14:33
*** lrensing has joined #openstack-kolla14:34
*** daidv has quit IRC14:36
Serlexegonzalez - is it worth creating a bug for simple_cell_setup.yml?14:38
egonzalezSerlex, yes if is a bug14:38
egonzalezSerlex, what's your issue?14:38
SerlexI see that you have a temporary fix14:38
Serlexok ignore that14:40
openstackgerritDan Ardelean proposed openstack/kolla-ansible master: Add Hyper-V role  https://review.openstack.org/45568414:41
*** lamt has joined #openstack-kolla14:42
SamYapleinc0: https://bugs.launchpad.net/kolla-ansible/+bug/168215314:43
openstackLaunchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Undecided,New]14:43
inc0thanks SamYaple14:44
*** manheim has quit IRC14:50
*** rwallner has joined #openstack-kolla14:53
*** zhubingbing_ has joined #openstack-kolla14:57
*** zhurong_ has quit IRC14:57
*** lucasxu has quit IRC14:57
*** mkoderer has joined #openstack-kolla14:58
satyarHi inc014:58
satyarHi SamYaple14:59
satyarcontinuing the yesterday discussion...14:59
satyarafter rebooting the Host able to recover the old VMs and they are getting 8950 MTU14:59
satyarand accessable14:59
inc0hmm15:00
satyarNot sure if kolla recommends the reboot of nodes after upgrade15:00
inc0satyar: how about doing just ifdown and ifup15:00
inc0kolla doesn't15:00
inc0but it seems like neutron dhcp server isn't being clever15:00
satyari did that without rebooting the host VMs not able to get the 8950 MTU15:01
satyaronly after rebooting the VMs getting 895015:01
satyareven i tried rebooting the VMs which created before upgrade still no luck15:01
inc0did you try running dhclient?15:01
satyaryes15:01
inc0hmm15:01
satyaras VMs got 1500 MTU it was not able to communicate out side15:02
inc0how about restarting ovs agent?15:02
satyartried still same15:02
inc0that's very strange15:02
satyarnot sure how rebooting host solves this15:02
*** jaosorior is now known as jaosorior_away15:02
inc0I honestly doubt it's Kolla issue as I see no level that could mean we broke this...but never know15:03
satyarseems like kolla15:03
inc0well we use net=host15:03
satyartried with only neutron bare metal works fine15:03
inc0hmm15:03
inc0ok15:03
inc0and if you're not running jumboframes, everything with mtu 1500 and works?15:04
satyaryes15:04
inc0damn I don't have setup that could reproduce this scenerio now15:05
satyarhmmm15:05
satyarits quite easy though15:05
SamYaplesatyar: neutron does its own mtu calculations internally15:05
satyaryes15:05
inc0but rebooting host?15:05
SamYapleif the interface came up with a 1500 mtu and you set it to 9000 mtu after the fact, it would be 1500 mtu internally15:05
SamYapleso there may be a race condition for you15:06
inc0yeah that would make sense15:06
satyarmy hosts comes with 9000 by default15:06
SamYapleyoull want the mtu to be 9000 before starting any nova or neutron services15:06
inc0also restart of neutron agent should help15:06
satyarnope15:07
satyardidnt15:07
SamYapleno that wont help inc015:07
satyarmy host while imaging itself comes with 9000 MTU15:07
SamYaplea full shutdown of the vm, stopping nova and neutron agents then starting nova and neutron agents after the mtu is 9000 would work15:07
SamYaplesatyar: its possible neutron or nova starts before your networking has come up15:08
SamYaplein which case the linux default mtu is 150015:08
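One way to rule the suspected race in or out is to read the interface MTUs right before the nova and neutron services start. A minimal sketch, assuming a Linux host where the kernel exposes each interface's MTU at `/sys/class/net/<iface>/mtu`; the function names and the interface names in the example are hypothetical.

```python
from pathlib import Path


def read_mtu(interface, sysfs_root="/sys/class/net"):
    """Read the current MTU of one interface from sysfs."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def interfaces_below(expected_mtu, interfaces, sysfs_root="/sys/class/net"):
    """Map each interface whose MTU is below expected_mtu to its current MTU."""
    too_small = {}
    for name in interfaces:
        mtu = read_mtu(name, sysfs_root)
        if mtu < expected_mtu:
            too_small[name] = mtu
    return too_small


if __name__ == "__main__":
    # "eth0" and "eth1" are placeholders; use the host's real interface names.
    print(interfaces_below(9000, ["eth0", "eth1"]))
```

An empty result at service start time would suggest the interfaces were already at 9000 and the race is not the cause; a non-empty one would support the ordering theory.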
satyarSamYaple my hosts having 9000 MTU by default15:08
SamYapleunless you recompiled your kernel after making that change, your default mtu is 150015:08
satyarmy kernels having 900015:08
SamYapleyour network settings (a service) have the interfaces at 9000 mtu15:08
satyaryes15:09
inc0satyar: did you set mtu to 9000, recompile kernel and create image from this recompiled kernel?15:09
satyaryou mean VMs? inc015:10
SamYaple(not that we are recommending to do that to fix the issue)15:10
inc0satyar: no, the host itself15:10
SamYaplesatyar: im referring to the host here15:10
satyarhost is recompiled kernel with 9000 MTU15:11
satyarso when the machine comes up it comes up with 9000 MTU15:11
inc0question is what sets up mtu to 900015:11
inc0if it's kernel or networking service15:11
inc0if you for example set it in /etc/network/interfaces, it won't help15:12
inc0you still can run into issue Sam described15:12
inc0I'd be interested why bare metal works, but that might be because docker just starts too fast...15:12
SamYaplesatyar: how exactly are you recompiling your kernel15:12
SamYaplewhat are you doing to make it 9000 mtu15:13
inc0and since it's docker that starts neutron, it might cause race condition15:13
satyarnope docker is getting installed after15:14
SamYaplesatyar: for my clarity, on the node hosting the l3 agent and on the compute node, please paste the output of `ip a`15:14
inc0satyar: did you reboot this node before upgrade?15:15
satyarhttp://paste.openstack.org/show/606318/15:16
satyarno15:16
satyarinc0 rebooted the host after upgrade15:16
ksumitHello all, I am fairly new to Ansible-Kolla and Docker in general. Is there a way to get CLI access (for example 'cinder list', etc.) after I have deployed OpenStack using Ansible-Kolla? I tried searching, but couldn't find any information.15:16
satyarthe paste of the machine not rebooted15:16
satyarand having the issue15:16
inc0ksumit: if you run kolla-ansible post-deploy it will create /etc/kolla/admin-openrc.sh15:17
inc0source this file and your regular cinder client should work15:17
satyarksumit you can install on any machine which can access the HA15:17
satyarand source openrc and access the systems15:17
inc0but we don't install clients15:17
*** lucasxu has joined #openstack-kolla15:18
satyarnnope15:18
satyarwe dont install clients15:18
inc0satyar: so...I don't have better idea15:18
ksumit@inc0 Thanks! So is it just Cinder that can be accessed through command line or will it work for others too? Nova, Manila, etc.15:18
inc0ksumit: it works just fine15:19
inc0I mean15:19
inc0everything;)15:19
ksumitUnderstood. Thanks!15:19
inc0clients like cinder client are just REST api clients15:19
openstackgerritEduardo Gonzalez proposed openstack/kolla master: WIP: fix zun images  https://review.openstack.org/45625115:19
inc0so they'll make http request to APIs15:19
inc0based on env variables which are set in admin-openrc15:20
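To see which settings the clients would pick up, the `export` lines of an openrc file can be parsed into a dictionary. This is a sketch under the assumption that the file consists of plain `export NAME=value` lines; the function name is hypothetical and the sample values in the usage note are made up.

```python
import shlex


def parse_openrc(text):
    """Collect NAME=value pairs from 'export' lines of an openrc-style script."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("export "):
            continue
        # shlex handles shell quoting and drops trailing '#' comments.
        for token in shlex.split(line[len("export "):], comments=True):
            key, sep, value = token.partition("=")
            if sep:
                env[key] = value
    return env


if __name__ == "__main__":
    with open("/etc/kolla/admin-openrc.sh") as handle:
        for key in sorted(parse_openrc(handle.read())):
            print(key)  # print names only, to avoid echoing the password
```

For example, `parse_openrc("export OS_USERNAME=admin")` yields a dictionary with the single key `OS_USERNAME`. Sourcing the file in a shell remains the normal way to use it; the parser is only for inspection.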
satyarSamYaple: any clue :(15:20
*** lamt has quit IRC15:20
inc0Sam might be right, it's really edge case15:20
inc0but possible15:20
inc0but I'm not able to reproduce it unfortunately15:21
satyarinc0: SamYaple: I have code ready for jumbo frame support for VMs15:23
satyarshould i push it upstream?15:23
inc0bbiaf, see you in meeting15:23
inc0satyar: add it to bug report15:23
inc0maybe someone will be able to reproduce it15:23
satyaralready added15:24
satyarshould i create another bug for support of jumbo frame in kolla15:25
satyarand push the fix for it?15:25
*** lamt has joined #openstack-kolla15:27
sean-k-mooneyjumbo frames should work in kolla today15:28
sean-k-mooneyi have not read back but was there a case where it does not?15:28
satyari guess the neutron and nova changes are missing in kolla to support jumbo frame15:28
sean-k-mooneyall that needs to be done is set the relevant neutron config changes15:29
*** swinn has joined #openstack-kolla15:29
satyarTrue sean-k-mooney15:29
satyarwe dont have by default with kolla yet15:29
sean-k-mooneyat a minimum though we should document the changes that are required15:29
inc0sean-k-mooney: issue was that after upgrade existing vms gets mtu 150015:29
satyaryes :)15:30
satyarhttps://bugs.launchpad.net/kolla/+bug/168191915:30
openstackLaunchpad bug 1681919 in kolla "After upgrade VMs getting 1500 MTU although jumbo frame is setup" [Undecided,New]15:30
sean-k-mooneyinc0: os-vif will fix it if the vm reboots15:30
satyarnope it dont15:30
ksumitAnother question: How do I restart all containers after a reboot?15:30
inc0ksumit: why would you want to do that?15:30
sean-k-mooneysatyar: yes it does, that was a recent change15:30
openstackgerritEduardo Gonzalez proposed openstack/kolla-ansible master: WIP: fix zun deployment  https://review.openstack.org/45625615:30
kfox1111morning.15:30
inc0morning kfox111115:31
satyarin kolla?15:31
inc0satyar: in neutron15:31
inc0that would be change in neutron15:31
sean-k-mooneysatyar: in os-vif actually, it's in the ocata release15:31
ksumit@inc0 I mean I am testing the deployment in a lab environment where they restart the machines every 2 nights for some reason.15:31
*** Pavo has joined #openstack-kolla15:31
satyarneutron i have 1 week old code15:31
satyarof stable ocata15:31
ksumit@inc0 So I was wondering about the steps that I need to take after a reboot to get everything working again.15:32
inc0aio or multinode?15:32
ksumitAIO for now15:32
sean-k-mooneysatyar: https://github.com/openstack/os-vif/commit/01da454fc8f50f2c86200df377dab4d21cecb75315:32
inc0well containers should restart15:32
inc0I mean they should start on their own15:32
sean-k-mooneyif you have neutron configured with the new mtu, if the vm reboots we fix the mtus when it is replugged15:32
satyarohh ok15:33
satyarwill try this now...15:33
inc0order they would restart in isn't determined in any way15:33
inc0so just do docker ps and see if everything works15:33
swinninc0: thanks for the help yesterday, I have a working kolla deployment but a few questions15:34
satyarsean-k-mooney: i have the code base already15:34
inc0swinn: shoot15:34
satyarthat was on 9th Dec 201615:34
swinnshould my compute containers be qemu based?15:34
inc0satyar: this isn't in neutron15:34
inc0it's separate service I think15:34
sean-k-mooneysatyar: yes that code change was explicitly to deal with the mtu issue on upgrade for tripleo15:34
inc0swinn: depends, qemu or qemu+kvm15:34
sean-k-mooneyif you update the neutron config before you upgrade that change will fix it when the vm reboots15:35
swinninc0: so in the list of hypervisors, I see qemu reported as the hypervisor type15:35
inc0if you want different hypervisor then it's different story15:35
swinninc0: so am I missing a config flag to change that?15:35
sean-k-mooneyalso the guest os will get the updated mtu by dhcp when it renews its lease if you don't reboot the vms15:35
satyarsean-k-mooney bit confused...15:36
satyaros-vif code i guess i am missing something15:37
inc0swinn: we don't set virt_type by default15:37
swinnand then for customizing services, is the easiest method (without spending days scouring the config reference docs) going to be to let kolla genconfigs and then edit them or will that break something?15:37
satyarsean-k-mooney: where this code is residing?15:37
satyaris it out of nova/neutron?15:37
sean-k-mooneyit executed as part of the nova-compute agent15:38
inc0but I believe kvm is default15:38
satyarok15:38
satyarthen i should have this code also15:38
inc0negative swinn, let me show you15:38
satyaras my nova code is also 1 week back15:38
sean-k-mooneysatyar: the os-vif change was part of ocata15:38
satyarok15:39
satyarso if i rebuild the nova-compute image15:40
satyari should be getting this changes right?15:40
*** duonghq has joined #openstack-kolla15:40
sean-k-mooneysatyar: the workflow is update kolla external neutron config with mtu params, then kolla upgrade, then reboot vms on node(not required if you evacuate node first)15:41
openstackgerritKevin Fox proposed openstack/kolla-kubernetes master: WIP: v4 gates.  https://review.openstack.org/45484115:42
sbezverkkfox1111: ping15:42
satyarsean-k-mooney: i followed the same15:42
kfox1111sbezverk: hi.15:42
kfox1111the resolve.conf ps looks great. :)15:43
sbezverkkfox1111: thanks, I was pinging you about tiller15:43
kfox1111ah. tiller.15:43
satyarstill getting the same issue15:43
sbezverkkfox1111: I patched tiller and it is working now in my local test bed15:43
kfox1111for raising up the size limit?15:43
sbezverkkfox1111: yep15:44
sbezverkI pushed PR to helm people15:44
kfox1111k. lets put in an official bug report with helm then and ask for a 2.3.1?15:44
sbezverkit is done15:44
kfox1111ah. even better. :)15:44
sbezverkkfox1111: https://github.com/kubernetes/helm/issues/226115:44
kfox1111they have been pretty responsive to us in the past. if they can roll out a quick 2.3.1 bug fix that might be better to go forward then instead of rolling back.15:45
sean-k-mooneysatyar: i missed the start of the thread, can you describe the issue15:45
sbezverkkfox1111: yep I like going forward better ;)15:45
sbezverkhttps://github.com/kubernetes/helm/pull/226215:45
sbezverkkfox1111: I hope they will not find any negative side effects in bumping up message size limit15:47
swinninc0: that would be appreciated, as an example how would I modify the cpu overcommit in nova.conf and update the containers to honor the setting? I think that would give me the pointers I need to do any others.15:47
inc0swinn: sorry, I was on phone15:48
swinnno worries :)15:49
inc0https://docs.openstack.org/developer/kolla-ansible/advanced-configuration.html#openstack-service-configuration-in-kolla15:49
satyarsean-k-mooney: raised the bug https://bugs.launchpad.net/kolla/+bug/168191915:49
openstackLaunchpad bug 1681919 in kolla "After upgrade VMs getting 1500 MTU although jumbo frame is setup" [Undecided,New]15:49
kfox1111sbezverk: the line 2 above your change still mentions 10mb.15:49
inc0swinn: ^ you want to use this mechanism15:49
*** vhosakot has joined #openstack-kolla15:50
inc0https://docs.openstack.org/developer/kolla-ansible/quickstart.html on this page ctrl+f for virt_type to show you exactly how to do it15:50
swinnperfect, thanks a ton15:50
*** daidv has joined #openstack-kolla15:51
sean-k-mooneysatyar: so yes the behavior you describe should be resolved by restarting the old vm15:51
sean-k-mooneythat is what the os-vif change was meant to do15:51
*** lamt has quit IRC15:53
*** lrensing has quit IRC15:53
satyarsean-k-mooney i tried it didnt :(15:54
satyarbut after rebooting the host the issue got resolved15:54
*** lrensing has joined #openstack-kolla15:54
satyarsean-k-mooney: mentioned that in the bug also15:55
sbezverkkfox1111: sorry I do not follow, where do you see still 10mb?15:55
kfox1111https://github.com/kubernetes/helm/pull/2262/files#diff-8f4541a8ec0fb2b3143ca168f7083f17L3115:55
kfox1111you tweak line 33. line 31 says:15:56
kfox1111  // maxMsgSize use 10MB as the default message size limit.15:56
mnaserfriendly notification15:56
mnaserkolla meeting in 415:56
kfox1111mnaser: oh. thanks for the reminder. :)15:56
mnaser..at #openstack-meeting-415:56
sean-k-mooneyif it didn't it's probably not a kolla bug but rather nova/os-vif/neutron15:56
sbezverkkfox1111: right the comment, will fix it15:56
satyaryeah tried with baremetal it works fine15:57
inc0meeting time15:58
sdakesup peeps15:58
Jeffrey4legonzalez, re until in register, if this kind of normal task need retry for network error, all task should add this.15:58
sdakelunch time15:58
Jeffrey4lso we assume the network is OK during deployment.15:58
sbezverkkfox1111: it looks like helm people will be ok to roll this fix into next release15:58
sdakesbezverk did you find the helm 2.3.0 bug?15:59
sbezverksdake: yes, and fixed it15:59
kfox1111sbezverk: will they be willing to roll the next release asap to fix the issue? its kind of a regression.15:59
egonzalezJeffrey4l, roger, thanks15:59
sdakesbezverk nice dude!15:59
*** tovin07 has joined #openstack-kolla16:00
sbezverkkfox1111: that I did not ask, but you should comment on the link I pasted with issue16:00
kfox1111inc0: you see the message:16:00
kfox1111[Openstack-operators] [kolla] Issue with Galera16:00
kfox1111someone needs to answer that one asap.16:00
sdakekfox1111 sbezverk meeting time plz16:00
*** jascott1_ has joined #openstack-kolla16:01
inc0kfox1111: we have bug opened16:01
sdakemeeting in #openstack-meeting-416:01
Jeffrey4lSamYaple, with my patch, you can use "mariadb_recover_inventory_name" variable for specify the recover  node.16:02
*** hieulq_ has joined #openstack-kolla16:04
*** manheim has joined #openstack-kolla16:06
SamYapleJeffrey4l: i made a bug report for kolla, im sure youll all work it out.16:06
Jeffrey4lSamYaple, i will read scrollback later. and mind pasting the bug link?16:08
*** rwsu has joined #openstack-kolla16:09
*** manheim has quit IRC16:10
*** skramaja has quit IRC16:11
*** Pavo has quit IRC16:12
swinnI’m assuming that I can’t communicate with the world because ovs-system and all the bridges report down....16:12
swinnbest container to troubleshoot this from?16:13
*** Pavo has joined #openstack-kolla16:13
*** Pavo has quit IRC16:13
*** Pavo has joined #openstack-kolla16:14
*** Pavo has quit IRC16:14
*** Pavo has joined #openstack-kolla16:15
*** Pavo has quit IRC16:15
*** Pavo has joined #openstack-kolla16:15
*** Pavo has quit IRC16:16
*** Pavo has joined #openstack-kolla16:16
*** Pavo has quit IRC16:16
SamYapleJeffrey4l: https://bugs.launchpad.net/kolla-ansible/+bug/168215316:17
openstackLaunchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Critical,Confirmed]16:17
Jeffrey4lroger. thanks.16:17
*** hieulq_ has quit IRC16:19
*** l4yerffeJ has joined #openstack-kolla16:21
*** sayantan_ has joined #openstack-kolla16:22
*** l4yerffeJ has quit IRC16:23
*** athomas has quit IRC16:23
*** Serlex has quit IRC16:23
*** jemcevoy has joined #openstack-kolla16:24
*** sayanta__ has joined #openstack-kolla16:26
*** lamt has joined #openstack-kolla16:27
jemcevoyinc0: How do I clean out my local docker registry/repo so I can rebuild cleanly?  The ops server I am using does not have much local storage16:28
*** sayantan_ has quit IRC16:29
*** mewald has joined #openstack-kolla16:32
mewaldIn my 3-node controller cluster the first two nodes died. I am just trying to bring back up the first of them but RabbitMQ keeps failing and is hanging in a restart loop of the docker container. I cannot extract any logging information since nothing is written. Any ideas?16:33
inc0jemcevoy: we have meeting now, but kolla-ansible destroy;)16:33
mewaldhehe nope definitely not, it's not a playground environment :D16:34
mewaldOne of the deploy tasks fails but ansible keeps going: https://gist.github.com/mewald1/0b485cd77b73f2a2439b2d2c06ae220a16:35
*** calbers has quit IRC16:38
*** lpetrut has quit IRC16:38
mewaldinc0: lol, just noticed you didnt even talk to me xD16:38
*** gfidente is now known as gfidente|afk16:41
*** mewald has quit IRC16:43
*** david-lyle has joined #openstack-kolla16:45
*** zhubingbing_ has quit IRC16:49
*** unicell has quit IRC16:53
*** mewald has joined #openstack-kolla16:53
*** nathharp has quit IRC16:56
inc0mewald: yeah:P16:59
inc0so kfox1111 Jeffrey4l16:59
inc0I'd suggest having both trunk and head of stable branches16:59
kfox1111inc0: see my message above about an important email.16:59
inc0so nova-api trunk AND nova-api ocata16:59
kfox1111so, stable patches that aren't released?17:00
inc0kfox1111: https://bugs.launchpad.net/kolla-ansible/+bug/168215317:00
openstackLaunchpad bug 1682153 in kolla-ansible "mariadb_recovery is prone to data loss" [Critical,Confirmed]17:00
inc0kfox1111: right17:00
inc0for example kolla-k8s will use it17:00
egonzalezi'd use ocata-latest for daily builds17:00
jemcevoyinc0: Thankyou17:00
kfox1111inc0: someone should respond to him directly, if they haven't already. I'm not involved in the ansible side enough that I think I should.17:01
kfox1111inc0: ah. the question is I guess, do we want the latest stable built to show up as latest, or trunk?17:01
kfox1111or maybe we do both?17:01
kfox1111latest stable tested build thats unrevisioned is ocata-latest and stable trunk is ocata-trunk or something?17:02
*** ksumit has quit IRC17:02
inc0responded17:02
daidvSo, can I ask you something around "Mixing binary and source image" topic?17:02
sdakeinc0 i put that protocore email on the ml17:02
kfox1111inc0: thx.17:02
inc0kfox1111: trunk is master branch17:02
sdakehope to get some responses from the nova peeps17:02
inc0that how I'd understand it17:03
sdakeJeffrey4l quick Q17:03
daidvinc0, ping?17:03
kfox1111inc0: true.... what else could we call it... tip?17:03
inc0daidv: I'm here17:03
kfox1111tip of the stable branch?17:03
daidvWhy did we separate binary image deployment vs source image deployment?17:03
inc0ocata-latest17:03
kfox1111daidv: cause some folks like me like to use rdo vendor packages, and some folks like to use pip packages.17:03
daidvI think one service can be deployed by binary or source image?17:03
sdakeJeffrey4l do you have time to review a spec related to the etcd work?17:03
inc0daidv: we have had both types from the very beginning17:03
*** tovin07 has quit IRC17:03
kfox1111both have different drawbacks/benefits.17:04
Jeffrey4lsdake, yep17:04
inc0kfox1111: why not just release name?17:04
inc0ocata == tip of stable/ocata17:04
inc0master == tip of master17:04
kfox1111inc0: why version stable at all then?17:05
Jeffrey4ldaidv, if you can do something which can make binary and source work together, it would be great.17:05
inc0well tags are better tested17:05
kfox1111what benefit is there to 4.0.0 / 4.0.1?17:05
Jeffrey4lkfox1111, i agree on this point.17:05
Jeffrey4lwith inc017:05
kfox1111we could skip point versions then and just do17:05
Jeffrey4lfor 4.0.1, how can we tell the end-user, this is a tag image or branch image?17:05
sdakeJeffrey4l this review will take you some time17:06
kfox11114.0-1.. 4.0-2... with latest aliased to ocata-latest17:06
daidvJeffrey4l, yep, I got other question around  "Tarball URLs"17:06
sdakeplease provide feedback if it would be useful for kolla-ansible (only)17:06
sdakehttps://review.openstack.org/#/c/243114/5/specs/queens/oslo-config-db.rst17:06
daidvWhy did we config special release url instead of stable release?17:06
sdakeinc0 feel free to weigh in plz17:06
daidvI mean aodh-1.0.0.tar.gz or aodh-2.0.0.tar.gz instead of aodh-stable-mitaka.tar.gz or aodh-stable-newton.tar.gz17:06
inc0sdake: I think I did17:06
Jeffrey4lsdake, then i need review it tomorrow. should go to bed soon ;)17:07
*** jascott1_ has quit IRC17:07
sdakeinc0 i just spoke with emilian face to face here and he said nobody from kolla responded on the etcd thread or review17:07
*** jascott1_ has joined #openstack-kolla17:07
sdakei honestly dont know much about how it would be used in kolla-ansible which is why I asked you and Jeffrey4l to take a look :)17:07
Jeffrey4lkfox1111, what -1 -2 for?17:07
inc0ahh I must've missed clicking publish17:07
inc0or sth17:08
inc0I'll check it out17:08
sdakeinc0 huh?17:08
kfox1111Jeffrey4l: the problem we have now is we're not doing revisioning.17:08
Jeffrey4lsupport etcd for oslo-config, hrm kolla-ansible should not use such a solution17:08
inc0nvm17:08
inc0I'll review it17:08
kfox1111we need to know when things change in the containers that are not related directly to kolla's code.17:08
inc0Jeffrey4l: not as default17:08
kfox1111revisions do that.17:08
inc0but as option, shy not17:08
inc0why17:08
Jeffrey4lkfox1111, we are talking about the stable branch. it is OK to not save the revisioning17:09
daidvinc0, I mean can we create some new things to deploy binary and source image together17:09
kfox1111thats how rpm/deb/etc deal with the problem of dep versioning.17:09
kfox1111Jeffrey4l: no, its a big problem we're not handling.17:09
inc0guys, ok I'm context switching too much, one discussion at the time please17:09
kfox1111several times now we have seen bad containers on the hub.17:09
Jeffrey4lre "Why did we config special release url instead of stable release?": a tag is more stable than a stable branch17:09
kfox1111and the fix hasn't been to release a new kolla stable revision,17:09
kfox1111but just to rebuild the containers.17:09
sdakekfox1111 sbezverk i'm not sure what to make of this : https://github.com/kubernetes/kubernetes/issues/43819#issuecomment-29355345217:09
Jeffrey4lstable branch is still develop/master branch actually.17:10
inc0kfox1111: what about daily?17:10
inc0do we want to push daily builds with revision?17:10
kfox1111inc0: daily would work I think.17:10
inc0and tag it with for example 4.0.0-12.04.201717:10
daidvJeffrey4l, gotcha, thanks.17:11
kfox1111inc0: just to be clear, we're talking about daily builds for updated stuff or for patches?17:11
inc0kfox1111: on dockerhub images in general I think17:11
kfox1111inc0: well, I want to squash revisions that don't matter.17:11
inc0because having multiple logics for that is bad17:11
kfox1111it may be weeks between 4.0.0-1 and 4.0.0-2.17:11
inc0I don't know which matters and which doesn't17:11
kfox1111the idea is to minimize the operator load for looking for changes.17:11
*** jascott1_ has quit IRC17:12
inc0we don't have capacity to follow changes in every other openstack project17:12
Jeffrey4lkfox1111, i do not recommend the operator to use hub.docker.com image directly17:12
kfox1111inc0: we don't need to. its deps we care about when it comes to revision.17:12
kfox1111for example:17:12
inc0not for source builds17:12
inc0and something will change every day, I almost guarantee you17:13
kfox1111we build container 4.0.0. do an rpm -Uvh on it, see openssl v1.017:13
inc0some library will be updated17:13
kfox1111we build it two days later, and the rpm -qa shows openssl v1.1,17:13
kfox1111we automatically push it to the hub as 4.0.0-217:13
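[Editor's note] The revision scheme kfox1111 describes (rebuild, compare installed packages, push 4.0.0-2 only if something changed) can be sketched roughly as below. This is an illustration under assumptions, not kolla's publishing code: the function name `next_revision_tag` is hypothetical, the package sets are passed in as plain dicts, and a bare `4.0.0` tag is assumed to count as revision 1 so that the first change yields `4.0.0-2`, as in the example above.

```python
import re

# Matches "4.0.0" or "4.0.0-3"; the "-N" revision suffix is optional.
_TAG_PATTERN = re.compile(r"(?P<base>\d+\.\d+\.\d+)(?:-(?P<rev>\d+))?")


def next_revision_tag(current_tag, old_packages, new_packages):
    """Return the tag to push for a rebuilt image.

    old_packages / new_packages map package name -> version, e.g. the
    parsed output of a package listing taken inside each image.
    If nothing changed, the current tag is returned unchanged.
    """
    if old_packages == new_packages:
        return current_tag
    match = _TAG_PATTERN.fullmatch(current_tag)
    if match is None:
        raise ValueError("unexpected tag format: %r" % current_tag)
    # Assumption: an unsuffixed tag is treated as revision 1.
    revision = int(match.group("rev") or 1)
    return "%s-%d" % (match.group("base"), revision + 1)
```

With this shape, weeks can pass between 4.0.0-1 and 4.0.0-2 if no dependency moves, which is the "squash revisions that don't matter" behaviour asked for later in the discussion.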
inc0kfox1111: and who will give us library names we care about?17:13
kfox1111inc0: any.17:13
kfox1111rpm -qa, dpkg -l and pip list.17:14
inc0then it will change almost every day17:14
kfox1111any changes there trigger a rebuild.17:14
kfox1111not usually.17:14
inc0well so here's thing17:14
kfox1111rpms change weekly maybe. pip too.17:14
inc0we can do weekly builds then17:14
kfox1111security updates should hit as soon as they are released by the vendor.17:14
kfox1111really, the rpms and debs and pip stuff doesn't change very often.17:15
*** shardy has quit IRC17:15
kfox1111but when it changes we need to release asap.17:15
kfox1111thats all automatable.17:15
Jeffrey4li am strongly against the 4.0.0-2 idea17:15
inc0so problem I'm facing is17:15
inc0if something changes in nova container17:15
inc0we still need to rebuild whole stack17:15
kfox1111Jeffrey4l: the problem is, you want sites to reproduce the gates we have for testing. thats not tenable for most sites.17:16
inc0Jeffrey4l: it's not bad idea if we execute it correctly17:16
kfox1111reusing our gating to test our up to date containers allows shipping tested, updated stuff to our users.17:16
inc0ok, we discussed it long enough for it to be spec17:16
kfox1111thats very very valuable.17:16
Jeffrey4liirc, sdake talked with me about this.17:16
inc0I think17:16
inc0we need broad feedback from various ops17:16
kfox1111inc0: thats solvable though.17:16
inc0kfox1111: I know it's solvable17:17
inc0but there are lots of moving parts17:17
*** lamt has quit IRC17:17
kfox1111inc0: we nightly build fresh containers. we look for changes. if any changes, we gate test. if it passes, we push only the updated containers.17:17
Jeffrey4lkfox1111, even for devstack/centos or every other gate job, your concern can not be solved.17:17
inc0kfox1111: we will test always17:17
inc0not only after changes17:17
Jeffrey4lon the other hand, how many issues are caused by the repo update?17:17
kfox1111inc0: this is separate from pushing changes into git.17:17
kfox1111inc0: those are tested always.17:17
Jeffrey4li have not seen such an issue in the last year.17:18
inc0well, I think we can run gates on cron-based pushes too17:18
inc0btw publisher will run in infra17:18
kfox1111inc0: we have a nightly job thats testing containers for kolla-kubernetes.17:18
inc0and will pull from tarballs as we discuss it now17:18
kfox1111all we need to do is slide in a kolla build step at the front of it.17:18
inc0we merge every day17:18
kfox1111and plumbing to look for changes.17:19
inc0god my brain hurts17:19
inc0I'll start ML thread ok?17:19
kfox1111most of the work's already done. the main big piece left is the authenticated upload to the hub.17:19
Jeffrey4lit seems we are talking about multiple topics now.17:19
inc0yeah17:19
kfox1111could be..17:19
Jeffrey4lML thread will be good start and let us focus on only one topic17:20
inc0I'll start 2 thread17:20
kfox1111k.17:20
inc0s17:20
kfox1111we need to cover all the topics though.17:20
inc01. for daily pushed names17:20
Jeffrey4lgreat17:20
inc02. revision mgmt17:20
kfox1111so start as many threads as needed.17:20
inc0anything else?17:21
kfox1111whats daily pushed names about?17:21
inc0ocata-tip17:21
kfox1111that for following trunk/tip?17:21
inc0or whatever17:21
inc0yeah17:21
kfox1111k.17:21
inc0we want to get this sorted out asap17:21
inc0and I think it's easy to just agree17:21
inc0in fact I take that back17:22
inc0can we just agree on "ocata"?17:22
*** duonghq has quit IRC17:22
inc0just release name with no revision or anything17:22
inc0that will be documented to be tip of branch17:22
kfox1111so, why ocata and not just 4.0 ?17:22
kfox1111that would fit maybe nicer in the naming scheme.17:22
kfox11114.017:23
inc0ocata is more meaningful and less prone to confusion17:23
kfox11114.0.117:23
kfox11114.0.1-317:23
sbezverkkfox1111: yeah PR got approved for 2.3.1 :)17:23
*** unicell has joined #openstack-kolla17:23
kfox1111sbezverk: cool. do they have a timeframe?17:23
inc0kfox1111: not sure if nicer17:23
inc0imho codename is nicer17:23
inc0corresponds with what we have in git17:23
kfox1111inc0: 50/50 on that one...17:23
inc0which is branch name ocata17:23
kfox1111its good if you know how openstack names things, worse if you dont.17:24
inc0we can call it stable/ocata to be fully equivalent to git branch name17:24
kfox1111though kolla's versioning not matching the release naming is bad too. :/17:24
kfox1111all confusing. :/17:24
inc0actually. stable/ocata is my favorite now17:24
inc0we clearly suggest - it's not semver, it's based on git17:24
kfox1111I dont think tags can have /?17:24
inc0then just ocata17:25
inc0stable is redundant17:25
kfox1111I don't have much opinion on it, as I plan never to use it. its a landmine I think.17:25
inc0it's like - if you don't see any numbers in tag name, assume git17:25
inc0kfox1111: not for gates tho17:25
inc0also keep in mind that it will always pass kolla gates before pushing17:26
kfox1111I need to know my containers are all the same everywhere. targeting a tag whose container binaries change is a problem.17:26
*** daidv has quit IRC17:26
kfox1111inc0: yeah. its just an alias to the other, revisioned things.17:26
inc0kfox1111: for prod, for gates we want to know that things break asap17:26
kfox1111so it doesn't much matter.17:26
inc0if kolla-ansible/kolla-k8s breaks after pushing something to dockerhub17:27
inc0we'll see it next day17:27
inc0and *that* is valuable17:27
inc0may slow dev of project a little, but will improve overall quality17:27
*** egonzalez has quit IRC17:29
kfox1111its easy for us to test it in gate before pushing though.17:29
kfox1111so we wont push broken.17:29
kfox1111the last two parts we really need to finish the effort is:17:30
kfox11111. job that pushes stuff from tarballs.o.o to docker hub.17:30
kfox11112. way of fingerprinting a container for changes.17:30
kfox11112 I think is mostly just rpm -qa | sort > rpms.txt17:30
kfox1111and diff.17:30
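[Editor's note] The fingerprinting idea (dump the sorted package list from each build and diff the two) could look something like the sketch below. It assumes the listings have already been captured as text, one package per line, for example from `rpm -qa`, `dpkg -l` or `pip list` run inside the container; the function names are hypothetical and any package strings used with it are made up for illustration.

```python
def parse_package_list(text):
    """Turn a package listing (one entry per line) into a set of entries."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def diff_fingerprints(old_text, new_text):
    """Compare two package listings.

    Returns a dict with the entries that disappeared and the entries
    that appeared. An upgraded package shows up once in each list,
    because its version is part of the entry string.
    """
    old = parse_package_list(old_text)
    new = parse_package_list(new_text)
    return {
        "removed": sorted(old - new),
        "added": sorted(new - old),
    }


def has_changed(old_text, new_text):
    """True if the two listings differ in any entry (order is ignored)."""
    return parse_package_list(old_text) != parse_package_list(new_text)
```

A nightly job could then gate-test and push only the images for which `has_changed` is true.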
sdakekfox1111 can you ack this plz: https://review.openstack.org/#/c/455502/117:32
sdakesbezverk can you change your vote on above plz ^17:32
kfox1111sbezverk: how close is the helm 2.3.1 release?17:33
sbezverksdake: the fix got already merged in helm master17:33
sdakewhen is a release coming?17:33
sdakeor shall we use master for our gates?17:33
*** tonanhngo has joined #openstack-kolla17:34
sbezverk2.3.1 will be released this week17:34
kfox1111sdake: if its a day or two out, could you make your dev work you want to do based on your revert ps,17:34
kfox1111then we reparent on trunk when helm gets released?17:34
*** tonanhngo has quit IRC17:35
kfox1111then we dont have to revert/reapply.17:35
*** sambetts is now known as sambetts|afk17:36
*** tonanhngo has joined #openstack-kolla17:37
sdakei guess although master is effectively broken in the meantime17:38
sdakewe shouldn't be afraid of git revert17:38
sdakegit revert can be used to revert a revert ;)17:38
sdakehappens all the time17:38
kfox1111sdake: its kind of broken for the gate.17:38
kfox1111but you don't share any of the setup_helm stuff,17:38
kfox1111in the dev docs, so it doesn't really matter.17:39
*** nathharp has joined #openstack-kolla17:39
kfox1111it just muddies the history though and makes a git bisect slightly longer.17:39
kfox1111it feels like its just work that procrastination will solve soon. :)17:40
*** nathharp has quit IRC17:47
*** mewald has quit IRC17:52
jemcevoyinc0: I ran the destroy successfully and I still have 3.2GB in /var/lib/docker/volumes/sha256/_data/docker/registry/v2/  Is there a way to get that space back?17:58
kfox1111inc0: how much more work do you think there is to getting the config over?18:01
*** mattmceuen has joined #openstack-kolla18:02
*** krtaylor has quit IRC18:05
*** MasterOfBugs has joined #openstack-kolla18:06
*** pramodrj07 has joined #openstack-kolla18:06
*** pramodrj07 has quit IRC18:06
*** MasterOfBugs has quit IRC18:06
*** MasterOfBugs has joined #openstack-kolla18:06
*** pramodrj07 has joined #openstack-kolla18:06
*** phuongnh has quit IRC18:07
*** nathharp has joined #openstack-kolla18:07
*** lamt has joined #openstack-kolla18:13
inc0kfox1111: need to check what's missing18:13
inc0jemcevoy: we don't remove registry18:14
kfox1111inc0: hoping we can get that done by the next release.18:14
kfox1111and I would like to patch it all to support logging to stdout.18:14
inc0ok, I'll do due diligence today18:14
kfox1111but don't want to start until the stuff is merged?18:14
inc0and make sure missing patches are up18:14
inc0I think we're mostly there18:14
kfox1111inc0: cool. thanks.18:14
kfox1111I've got a fluent-bit package up for review in kubernetes/charts I think would work well with kolla-kubernetes in that mode.18:15
kfox1111https://github.com/kubernetes/charts/pull/89518:15
*** nathharp has quit IRC18:16
*** srwilkers has quit IRC18:19
sbezverkkfox1111: I guess we will have to revert after all18:22
sbezverkfixing grpc limitation exposed18:23
sbezverklimitation with configmap being capped at 1MB18:23
*** satyar has quit IRC18:23
openstackgerritMarcus Williams proposed openstack/kolla-ansible master: Add OpenDaylight role  https://review.openstack.org/41636718:32
kfox1111sbezverk: so the fix didn't fix it for us?18:35
sbezverkit fixed only 1 part which was hiding configmap limitation18:36
kfox1111ok.18:37
kfox1111so...18:37
sbezverkso I will try to see why 2.3.0 generates a bigger configmap than 2.2.318:37
sbezverkbut it might take longer18:38
kfox1111either helm trims back what extra stuff they are putting in over 2.2,18:38
kfox1111we stick to 2.2 forever,18:38
kfox1111helm comes up with a new way of storing data,18:38
kfox1111k8s gets bigger configmap support,18:38
kfox1111or we stop doing computekit charts.18:38
kfox1111or, helm gets an aggregate feature. where the releases are still separate.18:39
sbezverkyeah none of these looks very attractive18:39
sbezverkand if we need any new features in helm we are busted18:41
kfox1111which we know we do.18:41
sbezverkunless they get backported which is not always possible :(18:41
kfox1111parent values is much more important I think long term than computekit chart. :/18:41
sbezverkwell, it depends, for sdake it seems computekit is almost a showstopper18:42
sbezverkwe could try to request configmap size increase?18:43
kfox1111that wouldn't happen quickly. :/18:43
kfox1111Id ask to see if the changes to helm 2.3 to the file format could easily be backed off temporarily,18:44
kfox1111and then for 2.4 support something like configmap spillover to a second configmap.18:44
kfox1111another option would be supporting 2.3 for everything but computekit, and changing helm ver in computekit job to 2.2.18:45
kfox1111though that would still prevent us from making 2.3 only changes to the microservice charts I guess.... :/18:45
kfox1111and parent values will go a long way to being "helm native"18:46
kfox1111I guess this one's one for sdake to answer.18:46
kfox1111sdake: whats more important to you, being more helm native, or computekit as a chart.18:46
kfox1111cause I think we're staring that in the face right now for a few weeks at least.18:47
*** krtaylor has joined #openstack-kolla18:49
*** Manheim has joined #openstack-kolla18:53
sbezverkkfox1111: seems that way, the error comes directly from kubernetes18:56
sbezverkso nothing can be done on helm side other than optimization of the size of configmap18:56
sbezverkbut we do not know if it is doable18:56
kfox1111or make configmap have a pointer to a "next" configmap,18:58
kfox1111and it pulls data out of each one and appends them together.18:58
kfox1111its workaroundable. but not necessarily desirable.18:58
sbezverkkfox1111: I suspect they package all charts into a single configmap and ship it to tiller19:01
*** iceyao has joined #openstack-kolla19:01
*** srwilkers has joined #openstack-kolla19:01
sbezverkthen tiller parses it and instantiate individual objects19:01
kfox1111sbezverk: not sure. it may be the other way around.19:02
kfox1111the chart goes to tiller via grpc,19:02
kfox1111and tiller stores the state it cares about in configmap.19:02
kfox1111I would guess its that way, as the grpc limit is higher than the configmap size.19:02
kfox1111even before you changed it, it was 10m.19:02
sbezverkApr 12 13:42:53 kube-1 journal: 2017/04/12 17:42:53 release_server.go:840: warning: Failed to record release "veering-heron": ConfigMap "veering-heron.v1" is invalid: data: Too long: must have at most 1048576 characters19:03
kfox1111vs the 1m for configmap.19:03
*** ipsecguy_ has joined #openstack-kolla19:03
sbezverkyou see, we do not have this config map, they build it19:03
sbezverkfor some internal purposes19:03
kfox1111yeah.19:03
kfox1111its all the state necessary to do diffs I think.19:03
kfox1111I wonder if they are doing any compression on it.19:04
*** ipsecguy has quit IRC19:04
*** iceyao has quit IRC19:06
*** lucasxu has quit IRC19:18
*** fooliouno has joined #openstack-kolla19:18
*** lucasxu has joined #openstack-kolla19:20
*** jmccarthy has quit IRC19:22
*** lamt has quit IRC19:22
fooliounosbezverk: I'm running into an issue in kolla-k8s when trying to do a set-manager in OVS to use Opendaylight container as a manager. Seeing the same issue as this guy: https://bugs.launchpad.net/kolla/+bug/163792819:24
openstackLaunchpad bug 1637928 in kolla "openvswitch set-manager to ODL doesn't work" [Low,Incomplete]19:24
fooliounosbezverk: any ideas on how to solve the issue19:25
fooliounokfox1111: Jeffrey4l: ^^^19:26
kfox1111never played with opendaylight. sorry. :/19:27
fooliounokfox1111: no problem. Thanks!19:27
sbezverkfooliouno: same here, I do not think it will work currently in kolla-k8s19:29
sbezverkfooliouno: but that would be a great contribution if you get it working ;)19:29
fooliounosbezverk: thanks. do you know why it may not work. I will continue looking into it, but need pointers :)19:31
sbezverkfooliouno: it was assumption since we never tested it19:31
fooliounoah .. ok19:32
sbezverkand as kfox1111 likes to repeat ;) if it is not tested it is broken19:32
openstackgerritKevin Fox proposed openstack/kolla-kubernetes master: WIP: v4 gates.  https://review.openstack.org/45484119:33
kfox1111fooliouno: I'm guessing it probably needs architectural changes to work?19:34
kfox1111do you need other agents but neutron-openvswitch-agent/openvswitch-* stuff?19:34
fooliounokfox1111: I was hoping to reuse most of the kolla-k8s containers. The neutron-openvswitch-agent would not be needed.19:35
fooliounokfox1111: I was thinking of building an ovs container myself as suggested by the person in the bug report and trying that.19:36
kfox1111does opendaylight have its own neutron agent then?19:37
kfox1111or agent that runs on the host, and neutron talks to just opendaylight?19:37
fooliounokfox1111: I was setting the ml2 mechanism driver in neutron.conf to use odl so that neutron talks to old19:39
fooliounoodl19:39
fooliounojust for l219:39
kfox1111so, ml2 plugin for odl to talk to the odl controller?19:39
kfox1111what gets settings from the odl controller to the node?19:39
fooliounofrom my understanding, neutron talks to odl and odl controls ovs19:40
kfox1111hmm....19:40
kfox1111the question is how.19:40
kfox1111is ovs exported externally, and ip given to odl?19:40
kfox1111if its something like that, then the question is, how does odl discover it/get configured to talk to it.19:41
fooliounoI thought that when we set the manager in ovs (ovs-vsctl set-manager), we provide the ODL container IP19:41
kfox1111ah. ok.19:42
fooliounoAnd the problem is that that step is currently broken19:42
kfox1111that step isn't existing I think.19:42
fooliounoI ran it manually in the ovs container and it doesnt do anything19:43
kfox1111hmm.. thats weird then. there must be more to it than just that.19:43
kfox1111to make that always run, it should get added to: helm/microservice/openvswitch-vswitchd-daemonset/templates/openvswitch-vswitchd-daemonset.yaml19:44
fooliounoIt works in a non-containerized environment FWIW :)19:44
kfox1111you kubectl exec'ed into the openvswitch-vswitchd -c main on the compute node and ran ovs-vsctl set-manager ... and it didn't work?19:45
fooliounocorrect19:45
kfox1111hmm...19:45
kfox1111is it a version difference?19:45
kfox1111ovs version incompatible?19:45
fooliounosorry .. didnt use the -c main but rest was same19:46
kfox1111looks like there is only one container in that pod. so same with or without it.19:46
fooliounoHmm .. I should perhaps try a container with an older ovs version?19:46
kfox1111do you have just one controller?19:48
fooliounoone controller as in one kolla_controller node?19:48
kfox1111for opendaylight.19:48
fooliounoyes19:48
*** vhosakot has quit IRC19:48
*** vhosakot has joined #openstack-kolla19:49
kfox1111is there an equiv of neutron agent-list | grep openvswitch-agent19:49
kfox1111in odl to see if hosts are bound to it?19:49
kfox1111do you have an external opendaylight controller you could point it to,19:51
kfox1111to help narrow it down from issues from running the controller in k8s vs issues with the compute nodes talking to a controller?19:51
*** srwilkers has quit IRC19:51
fooliounoWhen we look in odl container, there is nothing in it. presumably since ovs could not register with it.19:52
kfox1111k... then yeah, we have to determine if its an issue with ovs, or some issue in between.19:53
kfox1111does odl provide a rest api in the controller?19:53
kfox1111if so, can you curl it from the ovs container?19:53
fooliounoFYI .. curl and ssh from ovs container to odl container works19:53
*** lamt has joined #openstack-kolla19:54
fooliounoIts just that the ovs-vsctl set-manager command fails19:54
*** lamt has quit IRC19:54
kfox1111it fails with an error, or says it succeeds and doesn't work?19:54
fooliounonothing. no logs. no output. nothing :)19:54
kfox1111hmm...19:55
sdakesup19:55
sdakei saw my name used in vain :)19:55
kfox1111are you trying to do dns service resolution? or registering it by service ip?19:55
fooliounotcpdump doesnt show any packets going out either19:55
kfox1111or alternately, whats your ovs-vsctl set-manager look like?19:56
fooliounoJust using the ODL container IP:port19:56
fooliounono dns resolution19:56
kfox1111hmm... weird.19:56
kfox1111I'm guessing dns name resolution will fail.19:56
kfox1111but if you're not using it, thats not an issue...19:56
kfox1111and echo $? right after the ovs-vsctl set-manager is 0?19:57
kfox1111does ovs-vsctl get-manager print the things you'd expect?19:57
fooliouno12719:58
kfox1111127's an error...19:58
kfox1111try the get-manager, but would expect that to not have the info..19:58
fooliounoIt shows the one we set: "tcp:10.244.1.72:6640"19:59
sdakere computekit19:59
sdakei understand there is a 1mb limit19:59
kfox1111fooliouno: oh. interesting.19:59
sdakeis it in helm itself or kubernetes?19:59
sdakeand why did 2.3 break computekit?19:59
kfox1111sdake: helm was broken, but once the limit in helm was fixed, a limit in k8s was shown.19:59
sdakeor was it always broken20:00
kfox11112.2 worked.20:00
kfox1111the problem I think is they added a bit more metadata of their own to each release.20:00
kfox1111we were right under the line in 2.2, and just over it in 2.320:00
kfox1111even adding a few features in 2.2 might have pushed us over.20:00
sdakek8s library gprc is limited to 1mb?20:00
kfox111110m. we upped it to 20m.20:01
sdakegrpc20:01
kfox1111but helm stores state in k8s configmaps,20:01
sdakewhere is the limit that is breaking now20:01
kfox1111which are 1m limited.20:01
kfox1111and with the 20m fix in, we still have the 1m limit causing problems.20:01
fooliounokfox1111: sorry .. the output for echo $? was 0. misunderstood when and how to run it.20:01
sdakethat kind of kills the advantage of having conditions at all20:02
kfox1111scroll up for my list of possible things that could happen.20:02
kfox1111fooliouno: ok. cool. so, its getting added to ovs ok.20:02
sdakeis there any way to increase the limit in kubernetes?20:02
kfox1111sdake: "only a matter of code" ;)20:02
fooliounoyep, but ovs doesnt seem to act on it. However, if I set it to neutron, it works and creates the br-int etc20:02
*** lamt has joined #openstack-kolla20:03
kfox1111fooliouno: so then, either ovs is not reaching out to odl, or can't contact it for some reason.20:03
fooliounoright .. I think ovs is not reaching out since tcpdump on ovs and odl is empty20:03
kfox1111fooliouno: I'd compare versions of ovs in the container and on the bare host you tried.20:03
kfox1111maybe there's something different there.20:03
kfox1111fooliouno: the way k8s works, it might be tricky to tcpdump on the right interface to capture it.20:04
fooliounook ..20:04
kfox1111is ovs and opendaylight on the same host?20:05
fooliounowe dumped on all interfaces for that port. yes .. both containers on same host for now.20:05
fooliounoDo you think its a good idea to build our own ovs container with an older version20:06
kfox1111not sure it will help or not...20:07
kfox1111what k8s sdn are you using?20:07
fooliounoocata20:07
fooliounosorry .. flannel20:08
kfox1111flannel, ok...20:08
kfox1111and same host...20:08
kfox1111and the odl controller is net=host or not?20:08
fooliounowe have a kube master on one host and kolla_controller and kolla_compute on a second host20:08
kfox1111oh. so odl controller on kolla_controller, and compute on kolla_compute?20:09
fooliounoodl is not net-host = true. We debated that.20:09
fooliounoboth labels set to second host. same node.20:09
kfox1111k.20:09
fooliounoodl doesnt really need access to host netns I believe20:09
fooliounobut we could be wrong20:10
kfox1111not sure. I would guess not.20:10
sdakekfox1111 you got an issue tracker or something that tracks this limit?20:12
kfox1111sdake: nope. maybe sbezverk has one.20:12
sdakesbezverk issue tracker on this limit?20:12
sdakejascott1 up for fixing a kubernetes limitation?20:13
sdakejascott1_ ??20:13
kolla-slack<jascott1> I am! on a call, bbiab20:13
kfox1111sdake: thats not going to be a quick thing I think. :/20:14
sdakekfox1111 its not redefining a global then? :)20:14
kfox1111sdake: it may be that easy. but they don't make changes quickly in my experience. :/20:17
openstackgerritBertrand Lallau proposed openstack/kolla-ansible master: Magnum: add oslo_messaging_notifications config  https://review.openstack.org/45185720:19
openstackgerritBertrand Lallau proposed openstack/kolla-ansible stable/ocata: Trove fix backup restore with Swift  https://review.openstack.org/45178020:20
*** srwilkers has joined #openstack-kolla20:28
kolla-slack<jascott1> ok so last night figured out that the config map and secret limit is 1mb and gRPC is 10mb20:31
kolla-slack<jascott1> the release configmap is too big. so what do we want to do?20:31
kfox1111jascott1: so, we have a few options I think.20:31
kfox11111, something changed in helm to require more for the same chart. that could be undone temporarily.20:32
kfox11112. its possible to make spillover configmaps? if too big, truncate, mark the configmap as needing -> other configmap and put the rest there.20:32
kfox11113. fix it and wait in k8s.20:32
kolla-slack<jascott1> oh can we link configmaps like that?20:33
kfox1111I'm leaning to 2 being the solution.20:33
kfox1111jascott1: its just configmaps in k8s. can't see why you couldn't build it on top.20:33
kfox1111data: xxxxxxx20:33
kolla-slack<jascott1> hmm arent we hitting both the 1mb and the 10mb limit?20:33
kfox1111next: configmapname220:33
kolla-slack<jascott1> how to get past the 10mb limit?20:34
kfox1111jascott1: we hit the 10mb limit for sure. but seemed to be ok when sbezverk patched the check out.20:34
kfox1111then hit the configmap limit and are stuck.20:34
kolla-slack<jascott1> and then its only going to get worse as we add more openstack components20:34
kfox1111jascott1: yeah. hence I think removing the limit from tiller is the right fix. #2.20:35
kfox1111then tiller's not constrained to k8s ever.20:35
kolla-slack<jascott1> yeah I rolled a helm last night with a 2mb before I signed off, did that work?20:35
kfox111120mb you mean?20:36
kfox1111I think thats what sbezverk was testing with.20:36
kfox1111it was then the configmap that broke after that.20:36
kolla-slack<jascott1> no I changed the k8s validation limit to 2mb for configmaps20:36
*** vhosakot has quit IRC20:36
kfox1111oh.20:36
kfox1111he said then k8s was still complaining about 1m limit.20:36
kolla-slack<jascott1> hmm20:36
kolla-slack<jascott1> yeah so the k8s validation lib is vendored to helm so it wouldnt change k8s itself20:37
kfox1111its the api-server I think thats limiting it.20:37
kfox1111yeah.20:37
kolla-slack<jascott1> im sure k8s is using the same validation (not hacked)20:37
kfox1111so, tiller serializes stuff to put into the configmap, then puts it in releasename-configmap.data20:38
kolla-slack<jascott1> sounds right20:38
*** jmccarthy has joined #openstack-kolla20:38
kfox1111if we take that, %900k, if larger, releasename-configmap-%i20:38
kfox1111it would fit,20:38
kfox1111and retrieving would just be the reverse. append all the configmap data together before tiller gets it.20:39
kfox1111probably a couple dozen lines of code?20:39
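A minimal sketch of the spillover idea discussed above, assuming a 900 KB chunk size and the releasename-configmap-%i naming from the conversation. The function names, the "next" field, and the plain dict standing in for the k8s API are hypothetical; this is not tiller code.

```python
CHUNK_SIZE = 900 * 1000  # assumed margin below the ~1mb configmap limit


def split_release(name, data, chunk_size=CHUNK_SIZE):
    """Split serialized release data into configmap-sized, linked chunks."""
    count = max(1, -(-len(data) // chunk_size))  # ceiling division
    chunks = {}
    for i in range(count):
        part = data[i * chunk_size:(i + 1) * chunk_size]
        # Each chunk names its successor; the last chunk points nowhere.
        nxt = "%s-configmap-%d" % (name, i + 1) if i + 1 < count else None
        chunks["%s-configmap-%d" % (name, i)] = {"data": part, "next": nxt}
    return chunks


def join_release(name, chunks):
    """Follow the 'next' links and append the chunk data back together."""
    parts = []
    key = "%s-configmap-0" % name
    while key is not None:
        entry = chunks[key]
        parts.append(entry["data"])
        key = entry["next"]
    return "".join(parts)
```

Retrieval is just the reverse of storage, so the caller only ever sees the rejoined string.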
kolla-slack<jascott1> but are we saying that we are not over 10mb?20:39
*** jmccarthy has quit IRC20:39
kolla-slack<jascott1> for entire release?20:39
kfox1111we're over 10m, but under 20m. and setting it to 20m seems to work.20:40
kolla-slack<jascott1> ah ok20:40
kfox1111so grpc limit isnt really 10m as far as we can tell.20:40
kfox1111not sure what the real upper limit is.20:40
kolla-slack<jascott1> right weve only dealt with helm set limits afaik20:40
kfox1111one other thing we could potentially do... which does make it less helm native potentially...20:41
kfox1111right now, we share a common lib, and it gets copied into every subchart's /charts dir.20:41
kfox1111its not clear to me, if you could "optimize" the chart, by deleting all the duplicates.20:42
kolla-slack<jascott1> ah20:42
kolla-slack<jascott1> interesting20:42
kfox1111data deduplication. :)20:42
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move rabbitmq config to kolla-k8s  https://review.openstack.org/45058120:43
kolla-slack<jascott1> I will take a look into what we are generating20:43
kfox1111if helm supports that, the helm package process should probably do that.20:43
kfox1111we should be able to build the computekit package,20:43
kfox1111extract it, remove all but one of the kolla-common subcharts,20:43
kfox1111and tar it back up.20:43
kfox1111and then see if it deploys.20:44
kfox1111not sure if that will affect the configmap object at all though.20:44
kfox1111that may be the difference between the grpc limit and the configmap one.20:44
kfox1111it may lessen the effect on grpc. but not the configmap.20:44
sdakejascott1 so end goal is helm install openstack with individual enablement of services20:44
kolla-slack<jascott1> right20:44
sdakejascott1 if people want to use service charts thats good too20:45
sdakejascott1 i'd like to understand if the limit is in kubernetes or in helm that we are running into20:45
sdakeunfortunately I am not capable of making this determination on my own20:46
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move heat config to kolla-k8s  https://review.openstack.org/45058920:46
*** tonanhngo_ has joined #openstack-kolla20:48
*** mewald has joined #openstack-kolla20:48
*** tonanhng_ has joined #openstack-kolla20:49
*** tonanhngo has quit IRC20:51
*** tonanhngo_ has quit IRC20:52
*** tonanhng_ has quit IRC20:53
mewaldIn my 3-node controller cluster the first two nodes died. I am just trying to bring back up the first of them but RabbitMQ keeps failing and is hanging in a restart loop of the docker container. I cannot extract any logging information since nothing is written. Any ideas?20:56
*** tonanhngo has joined #openstack-kolla20:58
*** gfidente|afk has quit IRC20:59
sdakemewald panic?21:00
sdakemewald how did the first two nodes fail?21:00
mewaldsdake: They "failed" by running out of disk space. We had to take them down, replace disks and freshly install them21:01
sdakeok - so you want to recover with your remaining node to the other  two nodes?21:02
*** jascott1_ has joined #openstack-kolla21:02
sdakeinc0 got a 911 here...21:02
inc0mewald: docker logs rabbitmq shows nothing?21:02
sdakemewald i have 0% power and no wall plug21:03
*** jascott1_ has quit IRC21:03
sdakei'll be back in the hotel in about 30 minutes - although not sure how much I can help21:03
sdakemewald do you have the original disks?21:03
mewaldinc0: Starting the container looks like this: https://gist.github.com/mewald1/0de8d046c4b59942103b08f0b706ee6a21:04
mewaldsdake: puhh yeah but recovering that RAID is going to be pain21:04
*** jascott1_ has joined #openstack-kolla21:04
inc0hmm and nothing in /var/lib/docker/volumes/kolla_logs/rabbitmq?21:04
mewaldinc0: /var/lib/docker/volumes/kolla_logs/_data/rabbitmq/ remains completely empty21:04
inc0duh21:04
inc0hmm21:05
*** jtriley has quit IRC21:05
kolla-slack<jascott1> sdake its ultimately a k8s limit as far as I can tell21:05
openstackgerritMerged openstack/kolla master: Switch to RDO proxy mirrors  https://review.openstack.org/45537421:05
inc0mewald: what about syslog?21:05
inc0anything there?21:06
mewaldchecking21:07
mewaldhttps://gist.github.com/mewald1/bf62938603e5ba096211527a61a26d7621:07
inc0yeah not really helpful21:08
mewaldyeah, unfortunately not21:09
*** lucasxu has quit IRC21:09
inc0let me try sth21:09
mewaldI would usually try to kill all rabbits including the last one standing and try to deploy but I am afraid nothing will work after that :D21:10
mewaldthe cloud would be down ;)21:10
inc0docker inspect shows env variables right?21:10
inc0well not necessarily21:10
inc0that being said21:10
inc0try to stop one remaining rabbitmq node21:10
inc0(I assume it's working?)21:10
sbezverkkfox1111: we should probably revisit the concept of compute kit and maybe decommission it? I mean, we cannot get stuck with an older version of helm just because of compute kit. Also it is not the most popular way of deploying it.. what do you think?21:11
mewaldinc0: stop or remove the remaining one?21:12
inc0bring all of them to stop21:12
inc0and try to turn them on one at a time21:13
kolla-slack<jascott1> @sbezverk whats the alternative? many smaller releases working together?21:13
mewaldok21:13
kfox1111sbezverk: yeah, itsa cool concept, but maybe its time has not come yet.21:13
*** lucasxu has joined #openstack-kolla21:13
kfox1111just doing a simple shell script to launch all the service packages works just as well.21:13
mewaldinc0: the one that worked before continues to work21:14
sbezverkkfox1111: jascott1: right for the moment we I guess need to stick to service charts21:14
kfox1111if I had to pick one or the other, I think 2.3 is much more important than computekit.21:14
inc0mewald: so what I'm trying to do is to restart whole cluster21:14
inc0as it might have run into ugly state since majority of it died21:14
kfox1111I think computekit's cool though, so we shouldn't just forget about it. but for now, I don't think its really worth blocking stuff like helm native values over.21:14
sbezverkkfox1111: agree, and we can still work on optimization in parallel.. it gives me a chance to practice in go ;)21:14
kfox1111+121:15
mewaldinc0: I just did "docker kill" then "docker start" on both rabbitmqs on both nodes. That should have restarted it, no?21:15
inc0do docker stop on all21:15
kolla-slack<jascott1> @kfox1111 you dont have any faith in trying to reduce the common lib ?21:15
inc0then docker start one at a time21:15
kolla-slack<jascott1> or its duplication i mean21:16
kfox1111jascott1_: pretty sure it wont fix the configmap issue. but we could try.21:16
kolla-slack<jascott1> ok I will check on the tiller=host thing and then look at cm’s21:16
mewaldinc0: changes nothing. the one that worked before starts up, the other one dies21:17
inc0config files looks right on all of them?21:17
kfox1111jascott1_: even assuming we can get the size down some, once we add configmaps for all the things as helm charts we may be right back in the same boat.21:18
mewaldinc0: yeah they are identical except stuff like bind addresses. I even checked erlang cookie in the volume21:18
jascott1_kfox1111 oh yeah good point. hmm21:18
inc0hmm21:19
kfox1111this is inherently a helm max release issue I think.21:19
inc0so in clusterer config you have something called gospel node21:19
kfox1111there's currently some upper bounds in how complicated an application can be for helm.21:19
inc0it's one working or one that was broken?21:19
mewaldyeah, what exactly is that?21:20
mewaldwait checking21:20
kfox1111not uncommon. most things run into scaling issues at some point.21:20
mewaldinc0: on both nodes the gospel is the one working21:20
inc0and on one working?21:20
*** lrensing has quit IRC21:20
mewaldon the one working it is also the one working: on ctrl00 ctrl02 ist gospel, on ctrl02 ctrl02 is gospel21:21
mewaldinc0: https://gist.github.com/mewald1/0e70039d50d55111284b8da54fb24e0521:22
inc0damn I ran out of ideas21:24
inc0try on rabbitmq support channel21:24
inc0or SamYaple you know rabbitmq21:24
-openstackstatus- NOTICE: Restarting Gerrit for our weekly memory leak cleanup.21:25
*** Manheim has quit IRC21:26
sbezverkjascott1: I am trying to dump manifest of failing release to see what is in it21:26
mewaldinc0: I just noticed that the locales seem to be different on the nodes. "date" gives me "Wed Apr …" while the one not working shows "Mi 12. Apr" (german) Do you think that could cause it?21:26
sbezverkjascott1: maybe it will help to understand where we can squeeze it a bit ;)21:26
jascott1_sbezverk cool21:27
inc0mewald: entirely possible21:27
inc0clusters like their time synchronized21:27
*** manheim has joined #openstack-kolla21:27
kfox1111sbezverk: maybe could get a non failing one and see if you can decode it.21:28
kfox1111its probably base64 encoded, so if you undid it, it would be interesting to see what kind of data's in there.21:28
sbezverkkfox1111: right the idea was to collect from 2.2.3 and from 2.3.0 and compare21:28
kfox1111sbezverk: yeah. but having a working one might tell us if chart data deduplication might help too.21:28
mewaldinc0: was worth a try but it's still in a restart loop21:29
sbezverkkfox1111: I cannot get working one with 2.3.021:29
sbezverkso the only option I see is to get it from 2.2.3 or 2.2.221:30
mewaldalso, time was synchronized, only locale was different21:30
inc0crap...full destroy - redeploy of cluster is always an option21:30
kfox1111sbezverk: if we can decode a 2.2.x one, that might be enough to tell us if dedup might help.21:30
sbezverkkfox1111: sounds good, I am on it :)21:30
inc0docker rm -f rabbitmq -> kolla-ansible deploy -t rabbitmq21:30
mewaldpuhh, if everythings down after that it's going to be a long night :D21:31
mewaldshall I risk it?21:31
*** lucasxu has quit IRC21:31
sdakemoment21:31
kfox1111mewald: are you using ceilometer?21:32
mewaldyep21:32
kfox1111care if it loses data points?21:32
sdakemewald please do the following21:32
*** manheim has quit IRC21:32
kfox1111some datapoints I mean.21:32
sdakemake a backup of the existing data21:32
kfox1111(some sites do, some don't.)21:32
*** krtaylor has quit IRC21:32
sdaketar -czvf docker.tar.gz /var/lib/docker21:32
sdakemewald for the most part a tar of /var/lib/docker is reloadable somewhere else21:33
inc0rabbitmq volume is enough21:33
inc0but I'd be careful with dumping datafiles of rabbitmq to new cluster21:33
sdakejust make a backup before doing anything permanently destructive21:34
sdakeof all 3 disks21:34
mewaldlosing some datapoints in ceilometer won't hurt21:34
kfox1111mewald: then rebuilding rabbit should probably be fine then. most of the rest of openstack is stateless rpc.21:34
mewaldsdake: 3 disks?21:35
sdakemewald is your database intact?21:35
*** manheim has joined #openstack-kolla21:35
sdakemewald i lost power - so was reading scrollback21:35
sdakehang tight21:35
mewaldk21:36
sdakemewald did mariadb recover correctly?21:36
sdakeif the problem is only rabbitmq, stopping and starting rabbitmq should get things going again21:37
sdakeif the problem is mariadb, zug21:37
mewaldyes I got that to work after figuring out the trick with "mariadb_recover_inventory_name"21:37
sdakerabbitmq = pile of garb21:38
sdakewish we had another choice for the messaging component21:38
sdakeit works until it doesn't21:38
mewaldyeah so you are suggesting "docker kill rabbitmq && docker rm rabbitmq && docker volume rm rabbitmq" then run "kolla-ansible deploy"?21:38
sdakemewald i'm not sure what the correct course of corrective action is21:39
*** goldyfruit has quit IRC21:39
sdakei can tell you the mariadb volume is only used to store a pid file iirc21:39
inc0mewald: you can keep volume21:39
sdakerather the rabbitmq volume21:39
sdakeand that file is used to synchronize the ha cluster21:39
sdakemaybe its not a pid file but a sequence file21:39
sdakeif you restart things, rabbitmq should recreate that file and openstack will reconnect to the services21:40
sdake(services being memcached, rabbitmq, mariadb)21:40
mewaldyeah, my experience taught me to always delete the volumes, too. Had many very strange things happening when I forgot the volume :D21:40
inc0well21:40
inc0destroy volume will be safer21:40
sdakemewald which type of filesystem in use for /var/lib/docker?21:41
inc0rabbitmq data isn't that important for openstack21:41
sdakeinc0 is correct, rabbitmq data is not crucial to operation21:41
sdakethe only thing that is crucial is the database21:41
mewaldsdake: ext421:42
sdakemewald thats good news21:42
sdakemewald are you using overlayfs or thin lvm or something else?21:42
kfox1111sdake: rabbit data only matters if you care about ceilometer data. some sites, like mine, do care in some cases.21:42
sdakekfox1111 i see - didn't know there was a ceilometer + rabbitmq linkage21:42
mewaldthe ext4 is on an lvm2 lv21:42
mewaldbut nothing thin or so21:42
sdakemewald run docker info -> paste21:42
sdakethis will tell us the storage driver in use21:43
kfox1111sdake: notification data goes to ceilometer. if your org relies on event data being reliably captured, for auditing purposes for example,21:44
kfox1111then losing it can be bad.21:44
mewaldhttps://gist.github.com/mewald1/bb04483b496cf3c5c992eba893eb1fa321:44
kfox1111thats pretty much the only use case I'm aware of though that does.21:45
mewaldctrl02 is the working node, ctrl00 the failing one21:45
sdakeubuntu21:45
sdakei would stop all rabbitmq instances21:45
sdakedelete all of their volumes21:45
sdakethen deploy21:45
kfox1111sbezverk: want me to wait on the recheck, or just call it goo?21:45
sdakebut lets wait for inc0 to weigh in21:45
kfox1111good?21:45
openstackgerritMerged openstack/kolla-kubernetes master: Making resolv.conf to be more flexible  https://review.openstack.org/45619121:46
sbezverkjascott1: I see manifest of release in clear text, but the error is referencing to configmap, any suggestions how to access this configmap?21:47
sbezverkkfox1111: re-check which one?21:47
kfox1111sbezverk: the rabbit config one.21:47
sbezverkkfox1111: looks good to me.. the failure is not related to rabbit for sure..21:47
kfox1111k. I'll just wf it then.21:48
kfox1111thanks.21:48
kolla-slack<jascott1> @sbezverk you are talking about the encoded format of Release->Data ?21:49
kfox1111jascott1_: yeah.21:50
kfox1111looks like its gzipped.21:50
kfox1111ungzipping it shows its a binary file,21:50
kolla-slack<jascott1> oh duh thats right21:50
kfox1111with big chunks of json.21:50
kfox1111well, base64 encoded, gzipped binary but mostly text.21:50
kfox1111weird.21:51
jascott1_someone in helm was trying to figure out format as well21:51
sbezverkkfox1111: I do not see gzip, I see clear text manifest21:51
*** pcaruana has quit IRC21:51
jascott1_the data tho21:51
kfox1111so, the readily recognizable parts are requirements.yaml,21:51
kfox1111and values.yaml.21:51
sbezverkit does not look complete21:51
mewaldsdake: I am going to do it now21:51
kfox1111sbezverk: odd. the one I'm seeing is gzipped... its an older one though.21:52
openstackgerritMerged openstack/kolla-kubernetes master: Move rabbitmq config to kolla-k8s  https://review.openstack.org/45058121:52
kfox1111let me look at another one...21:53
inc0I don't know much about clustering in rabbitmq21:53
mewaldsdake inc0: it worked21:53
inc0redeploy worked?21:53
mewaldyup21:53
inc0all good then?21:53
sdakenice mewald21:54
mewaldI deleted all: container, image, volume from all nodes21:54
*** rwallner has quit IRC21:54
sdakemewald i'd highly recommend setting up a cron job and tarballing /var/lib/docker21:54
sdakemewald on all of your control nodes21:54
sdakeor using backup software21:54
sdakeraid is not a replacement for backups21:54
sdakehigh availability is not a replacement for backups21:54
kfox1111hmm... with helm v2.3, its a gzip. not sure about v2.21:54
mewaldsdake: yes, ur right. What I am currently doing is mysqldumping mariadb to a different node every hour21:55
sdakemewald nice - that works too21:55
sdakeas long as you have the database you should be gtg21:55
kfox1111sbezverk: jascott1_: ok, looking at another chart, I'm seeing all the templates in it too.21:55
mewaldyeah, openstack is quite easy to recover in that respect21:55
kfox1111so, yeah, if deduplication works, it may fix the issue.21:55
jascott1_technosophos said its base64enc(gzip(protobuf))21:55
mewaldbut this stuff that I just had with rabbitmq makes me hate all this crap :D21:55
kfox1111jascott1_: yeah. that sounds about right, looking at the output.21:56
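A minimal sketch of peeling off the outer layers of the release data, assuming the base64enc(gzip(protobuf)) layout quoted above. The function names are hypothetical, and the result is still raw protobuf bytes, since the protobuf schema is not shown in this discussion.

```python
import base64
import gzip


def decode_release(encoded):
    """Undo base64 and gzip; the result is still serialized protobuf bytes."""
    raw = base64.b64decode(encoded)
    if raw[:2] != b"\x1f\x8b":  # gzip magic number
        raise ValueError("payload is not gzip-compressed")
    return gzip.decompress(raw)


def encode_release(payload):
    """Inverse of decode_release, handy for round-trip checks."""
    return base64.b64encode(gzip.compress(payload)).decode("ascii")
```

The magic-number check gives a clear error when a release turns out not to be gzipped, which matches the uncertainty above about older helm versions.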
sdakemewald i think the only real solution to that problem is to not run out of /var/lib/docker disk space21:56
inc0kfox1111: I think heat and ironic is all that's left21:56
kfox1111but its the full templates that are the sizable thing.21:56
sdakedocker behaves poorly when it runs out of disk space21:56
sdakeapplications do as well21:56
kfox1111inc0: sweet. almost there. :)21:56
kfox1111inc0: mariadb too?21:56
sdakemewald tbh you're lucky your db wasn't destroyed21:56
sbezverkjascott1_: right looked at the wrong place , sorry21:56
kfox1111sbezverk: I think the next step next then is this:21:56
inc0hmm...mariadb is still not there? let me check21:56
kfox1111sbezverk: add a step to compute_kit building that:21:57
kfox1111extracts the built compute kit to a temp dir,21:57
kfox1111rm -rf on all dirs named kolla-common21:57
kfox1111cp -a helm/kolla-common to the computekit/charts/21:57
kfox1111and tar it back up,21:57
mewaldinc0: everything is good with my mariadb :)21:57
kfox1111and let the gate continue on as normal.21:58
kfox1111that may fix the issue.21:58
inc0yeah you're right mariadb still missing, on it21:58
*** esharao has quit IRC21:58
inc0mewald: mariadb remarks were for kfox1111 :P21:58
inc0brb rebooting21:58
mewaldah, getting confused :D someone should add threads to IRC xD21:59
inc0+1 to that21:59
*** haplo37 has quit IRC21:59
*** g3ek has quit IRC21:59
*** lamt has quit IRC21:59
sdakemewald they did - its called slack ;-)21:59
mewaldsdake: true xD Is there a kolla team on slack?21:59
sdakemewald its mirrored to irc21:59
sdakeand we use irc for main communication22:00
sdakeits mostly to help kubernetes community communicate in one forum with openstack community22:00
mewaldyeah makes sense22:00
*** rwallner has joined #openstack-kolla22:01
*** rhallisey has quit IRC22:01
*** tonanhngo has quit IRC22:03
*** lamt has joined #openstack-kolla22:04
*** tonanhngo has joined #openstack-kolla22:05
sbezverkkfox1111: what are we trying to get doing this? just a single copy of kolla-common?22:05
*** rwallner has quit IRC22:05
kfox1111sbezverk: yeah. talking it over with technosophos right now.22:06
*** MasterOfBugs has quit IRC22:06
*** pramodrj07 has quit IRC22:06
inc0back22:06
*** tonanhngo_ has joined #openstack-kolla22:06
*** pramodrj07 has joined #openstack-kolla22:06
*** MasterOfBugs has joined #openstack-kolla22:06
kfox1111sbezverk: actually I think the change needs to be slightly different.22:07
kfox1111sbezverk: we make compute kit depend on kolla-common too.22:07
kfox1111and after built, rm -f */kolla-compute/templates/* out of the tar except the root level kolla-compute.22:08
*** haplo37 has joined #openstack-kolla22:08
*** g3ek has joined #openstack-kolla22:08
*** tonanhngo_ has quit IRC22:08
*** tonanhngo_ has joined #openstack-kolla22:09
*** tonanhngo_ has quit IRC22:09
*** tonanhngo has quit IRC22:09
sbezverkkfox1111: can you delete files right in the tar file? never seen that before..22:10
kfox1111I don't think so.22:10
sdakeyou can do that via a pipe22:10
sdakebut not directly as you would like22:11
kfox1111the change I was trying to say though is that we can't delete the child lib's entirely, just their template dir content.22:11
kfox1111otherwise, when we go to parent var includes, things would break.22:11
kfox1111the child values still need to be in there.22:12
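A rough sketch of the repacking step described above, assuming the shared library subchart is named kolla-common (as mentioned earlier in the discussion) and that the root-level copy sits at <chart>/charts/kolla-common. The function names are hypothetical and this is not the actual build script: it drops only the template files of nested copies and keeps their values.yaml and everything else.

```python
import tarfile


def is_duplicate_template(path, lib="kolla-common"):
    """True for files under <lib>/templates/ in any copy below the root level."""
    parts = path.strip("/").split("/")
    for i, part in enumerate(parts[:-2]):
        # Index 2 is the root-level copy: <chart>/charts/<lib>/templates/...
        if part == lib and parts[i + 1] == "templates" and i > 2:
            return True
    return False


def dedup_chart(src, dst):
    """Copy a chart tarball, skipping duplicated library templates."""
    with tarfile.open(src, "r:gz") as tin, tarfile.open(dst, "w:gz") as tout:
        for member in tin:
            if is_duplicate_template(member.name):
                continue
            fileobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fileobj)
```

Working on a copy rather than editing the tarball in place sidesteps the question of deleting files inside a tar file.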
inc0brb22:12
*** inc0 has quit IRC22:12
*** inc0 has joined #openstack-kolla22:12
inc0at some point you need to restart weechat...22:13
*** fooliouno has quit IRC22:13
*** manheim has quit IRC22:13
sbezverkkfox1111: I think the issue is not with templates but with duplication of rendered values.yaml. if you add --debug to compute kit, you will see huge duplication of variables over and over22:14
sbezverkI am not 100% sure if it is the same case in configmap, but could easily be the case22:15
sdakegotta jet for dinner - bbl22:15
kfox1111sbezverk: yeah. I'm sure thats part of it.22:15
*** tonanhngo has joined #openstack-kolla22:15
kfox1111but it should include a ton of copies of the templates too.22:15
kfox1111just cutting it down by half would probably do the trick.22:15
jascott1_sbezverk can you pastebin that output or something?22:16
kfox1111I think just doing the tar datadedup would be a good exercise. shouldn't take too long to try.22:17
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move heat config to kolla-k8s  https://review.openstack.org/45058922:17
kfox1111the configmap data implies it will help.22:17
sbezverkkfox1111: I cannot say I completely understand steps you are proposing, but if you put together PS I can quickly test in my local test bed as I can easily reproduce the issue22:17
kfox1111sbezverk: k. I'll take a quick stab at it. sec...22:17
*** Pavo has joined #openstack-kolla22:19
kfox1111sbezverk: whats the full name of the tarball?22:19
kfox1111I don't have it handy.22:19
inc0kfox1111: btw..when are we planning to tackle clustered mariadb?22:20
mewaldsdake inc0: just deployed the third controller and again rabbit has those issues :D It's madness22:20
kfox1111inc0: I'm kind of waiting to see what k8s comes up with for that.22:20
inc0kfox1111: +1 to that...just might take some time22:20
kfox1111I think it kind of should be mariadb-operator.22:20
inc0especially that surprisingly google sells service like that22:20
kfox1111but if it doesn't show soon, doing a daemonset with host passthrough might be a good intermediary.22:21
sbezverkkfox1111: I do not see compute kit tar file, only individual services22:22
sbezverktar files22:22
kfox1111I found it I think.22:23
kfox1111compute-kit-0.6.0-1.tgz22:23
sbezverkkfox1111: :) where did you find it? in helm repo?22:23
kfox1111had one laying around.22:23
sbezverkkfox1111: maybe because I did not build compute-kit but was deploying from helm/compute-kits/compute-kit22:25
kfox1111ah. yeah.22:26
*** Pavo has quit IRC22:33
openstackgerritKevin Fox proposed openstack/kolla-kubernetes master: WIP: Test compress compute kit  https://review.openstack.org/45640622:33
kfox1111sbezverk: ----^22:33
kfox1111gotta build all and use the generated package.22:35
sbezverkkfox1111: I see, finishing cleaning up my test bed from previous run22:35
sbezverkand will try it22:36
*** mbruzek has quit IRC22:42
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move mariadb configs to k8s  https://review.openstack.org/45640722:43
sbezverkkfox1111: started, waiting :)22:43
*** srwilkers has quit IRC22:46
swinnif I’m not able to access an external public network, do I just need to customize the ml2_conf.ini and enter a flat to physical mapping then reconfigure?22:47
*** lamt has quit IRC22:47
swinnor at least, good chance of that being the case?22:47
sbezverkkfox1111: yeah!!! it worked22:47
kfox1111sbezverk: interesting. how big is the configmap now?22:48
kfox1111sbezverk: we may want to migrate that logic into both the build_compute_kit script ,and the build_service script.22:48
kfox1111and or ask helm to add it to helm. :)22:48
kfox1111which they do sound amenable to.22:49
sbezverkkfox1111: it should be easier for them than to fight with k8s to increase limit22:51
*** claudiub has quit IRC22:58
*** lamt has joined #openstack-kolla22:59
inc0kfox1111 sbezverk sdake https://review.openstack.org/#/c/450589/23:00
inc0plz23:00
kfox1111looks good to me.23:01
sbezverkkfox1111: do we test heat in any gate jobs?23:01
kfox1111sbezverk: not yet. :/23:01
inc0mariadb is waiting for gates23:01
kfox1111we really need a test for that.23:01
kfox1111even something really simple, like making a heat template that creates a server group, then heat stack-create -f foo.yaml foo would do the trick.23:02
sbezverkkfox1111: do we want this check before merging this change? I mean it will not break anything, but there will always be doubt..23:03
kfox1111sbezverk: its not tested, so its already broken. ;)23:03
kfox1111I'd rather get out of kolla-ansible completely for the next release thats coming up really really soon.23:04
kfox1111having a release thats sort of both is really ugly.23:04
sbezverkkfox1111: then inc0 needs to promise, if we find anything config related for heat, he would need to fix it ;)23:04
kfox1111he's already promised its a straight up copy from kolla-ansible.23:04
inc0well, cut down copy23:05
kfox1111so not really any more broken than whats in kolla-ansible now.23:05
inc0but yeah23:05
kfox1111sdake: think we have a different solution to the computekit thing without backing out of 2.3.23:06
kfox1111sdake: https://review.openstack.org/#/c/456406/23:06
sbezverkI am off, had a very early start today, have a good one folks23:06
kfox1111sbezverk: have a good one. :)23:07
*** rwallner has joined #openstack-kolla23:07
*** vhosakot has joined #openstack-kolla23:07
*** lamt has quit IRC23:08
kfox1111inc0: looks like comparing helm/service to ansible/roles, we still need ironic, mariadb and openvswitch config's.23:08
inc0ironic needs rebase, mariadb is up for review23:08
inc0ovs - I'm on it23:08
openstackgerritMerged openstack/kolla-kubernetes master: Move heat config to kolla-k8s  https://review.openstack.org/45058923:10
*** rwallner has quit IRC23:11
*** lamt has joined #openstack-kolla23:12
kfox1111mariadb failed hard.23:12
kfox1111missing file.23:12
kfox1111wsrep-notify.sh.j223:12
*** lamt has quit IRC23:12
*** lamt has joined #openstack-kolla23:13
*** lamt has quit IRC23:13
inc0yeah I thought I can remove it23:13
swinnI feel like I’m at the final stages of having a working cluster. Last step is I can’t figure out how to map a subnet for floating ips to a physical flat network. Any guides on this out there?23:13
*** lamt has joined #openstack-kolla23:16
*** lamt has quit IRC23:17
*** srwilkers has joined #openstack-kolla23:19
kfox1111swinn: not sure I understand the question.23:21
kfox1111are you using tenant networking?23:21
*** sayanta__ has quit IRC23:21
swinnI’m using the default networking with ovs but need to map a subnet for floating IPs to a real physical network23:22
swinnwhen I create the public network, the provider is vxlan23:22
swinnbut in this case it should use a flat provider to connect to a physical network23:22
kfox1111haven't tried it with kolla but you create the public network telling it to use a provider network.23:22
kfox1111swinn: something like: https://github.com/openstack/kolla-kubernetes/blob/master/tests/bin/basic_tests.sh#L9223:23
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Move mariadb configs to k8s  https://review.openstack.org/45640723:25
kfox1111inc0: so, after the last ps merges, do we just tweak the pathfinder.py to point to the local ansible dir and delete kolla-ansible from the gate?23:26
inc0yup23:26
kfox1111nice. :)23:26
swinnkfox1111: thanks for the link, the syntax changed a bit but it helped23:27
kfox1111cool.23:28
inc0kfox1111: flat external + vxlans wont work?23:28
kfox1111inc0: flat external + tenant vxlans works.23:29
inc0yeah that's what I thought23:29
kfox1111inc0: he was just doing a net-create though without the type, so was allocating vxlans instead of flat.23:29
kfox1111just had to use the right flags.23:29
inc0then you create allocation pool on external flat and routers23:29
kfox1111right.23:29
swinnit was the physnet1 mapping that got me, it’s a named interface23:29
kfox1111ah. yeah.23:30
kfox1111"your network naming may vary. " :)23:30
inc0kfox1111: so ovs confs are in neutron role23:31
inc0which means we should have those23:31
kfox1111oh. ok.23:31
kfox1111is it in the rm -rf?23:32
kfox1111maybe we just need to test that.23:32
kfox1111yeah. its not in the rm -rf list. so we probably should just test that.23:35
*** erlon has quit IRC23:35
openstackgerritMichal Jastrzebski (inc0) proposed openstack/kolla-kubernetes master: Finalize move of configs to kolla-k8s  https://review.openstack.org/45641223:35
inc0kfox1111: or remove all rm -rfs ^ ;)23:35
kfox1111that works too. :)23:35
inc0that will show us if we missed anything too23:36
inc0well gotta rebase ironic23:36
inc0but let's wait till mariadb merges23:36
kfox1111inc0: maybe the change note should be something like "remove kolla-ansible as a dependency" ?23:36
swinnkfox1111 and inc0: thanks so much for the help past few days, I can reach floating ips and all services are up23:37
kfox1111swinn: np. glad you got it working. :)23:37
inc0swinn: my pleasure23:38
masbermorning all, would you recommend running ceph and nova on the same node or separate interms of performance?23:38
kfox1111masber: we ran our site hyperconverged. worked well until recovery needs to happen.23:38
inc0masber: ceph and nova as in compute and ceph-osd?23:38
kfox1111then ceph gets really resource hungry.23:38
masberI see23:39
kfox1111strangles out compute.23:39
*** Pavo has joined #openstack-kolla23:39
masberwhat about latency?23:39
kfox1111generally more predictable if separate.23:39
kfox1111no noisy vm's in the way.23:39
masberI am not sure about hyperconverged, I am thinking to keep them separate, in that case I can run ephemeral storage for high performance23:40
inc0masber: hyperconverged won't make it closer to storage23:40
kfox1111yeah. if you have the resource, I'd recommend keeping them separate.23:40
inc0as in you never know if your ephemeral will land on osd with compute23:40
kfox1111inc0: right.23:40
inc0reason you might want to do it is to not waste disk slots in compute nodes;)23:41
masberok I understand23:42
inc0and have all cloud running on single spec of servers23:42
kfox1111my advice, avoid it if you can. its generally ok if you are small enough you can't.23:43
masberyes, on the other hand you can always setup your vms to have "local" storage as ephemeral drives as scratch volume for high performance and then another drive to ceph for archiving23:43
*** zhubingbing_ has joined #openstack-kolla23:43
masberI am trying to migrate from HPC to private cloud, if that makes sense23:43
kfox1111yeah. thats what I do.23:43
kfox1111so if you want a pet that can float, you do a cinder backed volume.23:43
kfox1111if you want faster ephemeral, you don't check the box. :)23:44
kfox1111best of both worlds.23:44
inc0yeah, ephemerals are in general bad idea if you want to have live migration23:44
kfox1111if you back nova with ceph, you remove the choice.23:44
masberthat is a good point, ephemeral disks won't move on live migration, is that right?23:45
inc0with block migration they will, but that makes it much harder23:45
inc0and volatile23:45
kfox1111masber: my site does hpc and cloud. are you trying to migrate hpc workload to a private cloud, or just coming from mindset?23:45
kfox1111yeah. and block migration has never worked for me live.23:45
masberkfox1111, we currently have a small HPC cluster and we would like to move to private cloud due to the flexibility it provides23:46
kfox1111trying to migrate mpi jobs?23:47
kfox1111and whats your definition of small? :)23:47
masberno MPI, thank god23:47
kfox1111everyone's definition of size differs.23:47
kfox1111ok. so then its not too bad to migrate then. :)23:47
kfox1111we do a lot of mpi jobs.23:47
masberyes sorry, small means 26 nodes running centos 6 and using rocks cluster for provisioning23:48
kfox1111ah. rocks. :)23:48
kfox1111we've got quite a few of those clusters around.23:48
masberand the fact that we don't use MPI makes it less complicated for virtualization I guess23:48
kfox1111pretty reliable really.23:48
kfox1111much much less. :)23:48
kfox1111so.... not to discourage you too much... but23:49
kfox1111we're taking one of our HTC computing clouds and tearing it down.23:49
*** Pavo has quit IRC23:50
kfox1111getting about a 20% performance bump moving the system from openstack to raw kubernetes.23:50
masberI see23:50
kfox1111you can tune openstack vm's to get most of the performance difference back, but its a fair chunk of work.23:50
inc0well...that kinda corresponds with virtualization overhead23:50
masberyou can use magnum to deploy kubernetes?23:50
kfox1111but with k8s, it passes the raw cpu through, so doesn't need tuning.23:51
inc0masber: then you're losing both:) it's about removing virt layer altogether23:51
inc0kfox1111: I wonder if slurm could use k8s23:51
masbermagnum can deploy kubernetes on bare-metal23:51
*** mewald has quit IRC23:51
masberno need to run on vms I think23:51
kfox1111inc0: I'm using condor running in k8s managed containers on bare metal.23:51
kfox1111slurm would work just as well.23:51
masberyes, my idea of openstack is to use it as a unified environment for a cluster as it can do lots of things23:53
masberand gives the opportunity to the end user to set up their own environments23:53
masberkfox1111, what about mesos? have you looked at that option?23:54
kfox1111yeah. openstack's good for that.23:54
*** Pavo has joined #openstack-kolla23:54
kfox1111masber: let me show you a little graph... :)23:54
kfox1111masber: https://trends.google.com/trends/explore?q=kubernetes,docker%20swarm,mesos23:55
kfox1111I picked k8s. :)23:55
masbernice graph23:56
masberI see23:56
kfox1111all the major players are lining up behind k8s.  redhat, ubuntu, google, coreos, etc.23:57
kfox1111so, thats where I'm placing my bet.23:57
*** Pavo has quit IRC23:57
masberkfox1111, do you use NUMA architecture?23:58
kfox1111hard to avoid these days. :)23:58
kfox1111but, yeah.23:58
kfox1111we had one of the early SGI Altix's. 512 cores in one numa machine spread over I think it was something like 3 racks.23:59
masberok, so cpu pinning didn't help much23:59
*** ssurana has joined #openstack-kolla23:59
masberin terms of performance23:59
kfox1111it did a lot in that environment. :)23:59
kfox1111rack to rack memory access was really expensive.23:59