Monday, 2018-12-10

*** dave-mccowan has quit IRC  00:23
*** PagliaccisCloud has quit IRC  00:37
*** dave-mccowan has joined #openstack-containers  00:57
*** Nel1x has quit IRC  01:02
*** hongbin has quit IRC  01:45
*** PagliaccisCloud has joined #openstack-containers  01:49
*** cbrumm_ has joined #openstack-containers  02:12
*** cbrumm has quit IRC  02:13
*** lbragstad has quit IRC  02:13
*** lbragstad has joined #openstack-containers  02:16
*** itlinux has quit IRC  02:29
*** hongbin has joined #openstack-containers  03:05
*** lbragstad has quit IRC  03:38
*** udesale has joined #openstack-containers  03:40
*** hongbin has quit IRC  04:00
*** ramishra has joined #openstack-containers  04:34
*** ykarel has joined #openstack-containers  04:40
*** ricolin has joined #openstack-containers  04:45
*** dave-mccowan has quit IRC  04:53
*** ykarel_ has joined #openstack-containers  04:59
*** ykarel has quit IRC  05:02
*** ykarel_ has quit IRC  06:30
*** belmoreira has quit IRC  07:04
*** belmoreira has joined #openstack-containers  07:16
*** belmoreira has quit IRC  07:18
*** belmoreira has joined #openstack-containers  07:18
*** rcernin has quit IRC  07:23
*** udesale has quit IRC  07:24
*** ykarel_ has joined #openstack-containers  07:48
*** ykarel_ is now known as ykarel  08:16
*** ykarel is now known as ykarel|lunch  08:17
*** ykarel|lunch is now known as ykarel  08:42
*** udesale has joined #openstack-containers  08:49
*** shrasool has joined #openstack-containers  09:25
<lxkong> strigazi: hi, i'm working on how to accelerate the launch time of a magnum k8s cluster and i found there are several software deployments that are executed in sequence after the kubemaster is alive.  09:42
<lxkong> strigazi: that's why i asked in the heat channel  09:42
<strigazi> lxkong: give me a sec, I'll explain  09:43
<lxkong> sure  09:43
<strigazi> The improvement I have in mind is to parallelize the deployment of master and worker nodes. Running the SDs in parallel won't help a lot.  09:49
<strigazi> lxkong: the major bottleneck is that nodes wait for the master to pull everything. The deployment of k8s manifests is async so it is fast.  09:50
<strigazi> lxkong: I'm writing a PoC for this now  09:52
<lxkong> strigazi: in my devstack deployment, the core_dns_service takes 80s and kubernetes_dashboard takes 171s somehow  10:00
<lxkong> in sequence  10:01
<lxkong> for master and worker, i have created this https://storyboard.openstack.org/#!/story/2004573 for HA clusters  10:01
<lxkong> but for non-ha, after the master gets created and has an ip address, the workers could start  10:02
<lxkong> rather than waiting until all the things are finished inside the master  10:02
<brtknr> lxkong: isn't that to do with your internet connection speed, as it is mostly the time required to download containers?  10:03
<brtknr> strigazi: i like the idea of launching the master and workers in parallel!  10:04
<lxkong> brtknr: no, besides the container installation time, we also want to decrease the provision time for the other steps  10:04
<lxkong> another k8s cluster was just created in my devstack, i can see:  10:05
<lxkong> https://www.irccloud.com/pastebin/xKUT4VOM/  10:06
<lxkong> most of them are just `kubectl apply`, i don't know why they took so long to finish  10:06
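
One way to see where those minutes go is to pull per-resource status and update timestamps out of Heat itself. A minimal sketch, assuming python-heatclient's OSC plugin is installed and `k8s-cluster` is a placeholder for the actual stack name:

```sh
# Hedged sketch: walk the nested stacks and look at when each
# SoftwareDeployment resource was last updated; comparing timestamps shows
# which deployment eats the time. "k8s-cluster" is a placeholder stack name.
openstack stack resource list --nested-depth 5 k8s-cluster | grep -i deployment
```
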
<ricolin> IMO try to increase the polling interval, might help if you want it to run in less time, just remember to change it back after you're done  10:10
<ricolin> https://github.com/openstack/heat-templates/blob/master/hot/software-config/boot-config/templates/fragments/os-collect-config.conf#L10  10:10
<ricolin> s/increase/decrease/  10:10
<ricolin> strigazi, have you tried that yet?  10:10
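
For reference, what ricolin is pointing at is the os-collect-config polling interval that the linked fragment writes onto the node. A hedged sketch of tweaking it on a running node; the path and option name follow that heat-templates fragment and the service name depends on the image (plain os-collect-config vs. the heat-container-agent container), so adjust as needed:

```sh
# Hedged sketch: shorten the os-collect-config polling interval on a node.
sudo sed -i 's/^polling_interval *=.*/polling_interval = 5/' /etc/os-collect-config.conf
# Restart whichever agent your image runs (os-collect-config or heat-container-agent).
sudo systemctl restart os-collect-config
# The default is 30; set it back once the cluster is up, or every node will
# keep polling the heat API aggressively.
```
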
<lxkong> ricolin: 'change it back after you're done' sounds like a hack instead of a solution :-(  10:13
<ricolin> lxkong, actually not a hack IMO, just a performance improvement for the vm  10:14
<ricolin> lxkong, you can always leave it with a quick polling interval  10:14
<lxkong> ricolin: you mean, in the magnum scenario, we give it a small value and remove it after something is done?  10:15
*** salmankhan has joined #openstack-containers  10:15
<lxkong> but won't a small polling interval increase the api load?  10:15
<lxkong> ricolin: that may work, but it's not a perfect solution  10:16
<ricolin> lxkong, it will increase the api load to a certain level. Not sure there's a better design than polling, maybe we can try to figure one out :)  10:17
<ricolin> lxkong, and yes, you should be able to change it after you're done  10:18
*** udesale has quit IRC  10:18
<lxkong> ricolin: actually, my original requirement is to execute some scripts inside the vm in parallel, currently we are using SDs, but it seems like that doesn't support parallelism  10:19
<lxkong> unless we write all the scripts in one SD rather than using multiple SDs for a single vm  10:20
<ricolin> you can use a shell script do to that in an SD, right?  10:20
<ricolin> s/do to/to do/  10:20
<lxkong> ricolin: exactly, and ugly :-)  10:20
<lxkong> i guess in magnum, we separate the script into different SDs to gain some modular benefit  10:21
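
Collapsing everything into one SD would not have to mean one monolithic script either. A rough sketch of a single deployment script that keeps the fragments as separate files but backgrounds the independent ones; the fragment paths are illustrative, not the actual magnum driver filenames:

```sh
#!/bin/sh
# Hedged sketch: one SoftwareDeployment that runs independent fragments
# concurrently. Paths below are placeholders.
/srv/magnum/fragments/core-dns-service.sh &
/srv/magnum/fragments/kubernetes-dashboard.sh &
/srv/magnum/fragments/calico-service.sh &
wait   # per-fragment error handling omitted for brevity
```
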
<ricolin> lxkong, we can start discussing whether we should have that in the agent, just have to figure out how to do it in a dozen agents  10:22
<ricolin> shell, ansible, puppet, etc  10:22
<lxkong> ricolin: sounds like a plan  10:22
<ricolin> lxkong, happy to help review and trigger a discussion  10:23
<ricolin> and it will be better if we get people to help on it  10:23
<ricolin> like partial success  10:24
<ricolin> we need people to do it for sure  10:24
* ricolin Hope he got more time on that  10:24
<lxkong> ricolin: btw, not sure if you know anybody in the heat team who is familiar with Octavia that can help with this https://storyboard.openstack.org/#!/story/2004564?  10:25
<ricolin> ramishra, is the one  10:26
<lxkong> ramishra: it'd be much appreciated if you are available for this feature  10:28
<ricolin> lxkong, btw, it's not like everything listed in the etherpad will happen, but it would be very appreciated if you can leave your feedback in the etherpad with your name, so we can give more voice to that issue https://etherpad.openstack.org/p/heat-user-berlin  10:29
<ramishra> lxkong: you asking me to implement it? ;) I can review if someone works on it, have no time to do it myself.  10:40
<lxkong> ramishra: that's totally fine, thanks  10:41
<lxkong> strigazi: hi, do you have some time to discuss https://review.openstack.org/#/c/497144/? i'm very happy to hear your suggestions  10:48
<brtknr> lxkong: i am currently trying to set up a simple lbaas using neutron, octavia seemed like overkill for what i was trying  10:49
<lxkong> but neutron-lbaas was already deprecated  10:50
<lxkong> but using neutron-lbaas is ok for this feature, we just need to add another hook for it  10:50
<brtknr> lxkong: yes, but it should still work... i think magnum uses neutron lbaas by default, you need to enable octavia explicitly in /etc/magnum/magnum.conf  10:51
<brtknr> I got as far as setting up the lbaas but my controller manager is complaining: http://paste.openstack.org/show/736864/  10:51
<brtknr> I am currently digging around for solutions  10:51
<lxkong> brtknr: nope, if Octavia is deployed in your cloud, it will be used as the default  10:51
<lxkong> brtknr: https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342  10:53
<brtknr> oh okay  10:55
<strigazi> sorry, I was in a meeting  10:55
<brtknr> what about loadbalancers for services though?  10:55
<brtknr> strigazi: wb  10:56
<brtknr> my https://github.com/openstack/magnum/blob/f27bde71719905e6f274a1a57799595780bc50c2/magnum/drivers/heat/template_def.py#L342  10:56
<brtknr> oops sorry  10:56
<strigazi> I'm reading the thread  10:56
<strigazi> lxkong: the time that matters is, after the apiserver reports ok, how much time the SDs take.  10:58
<brtknr> https://github.com/openstack/magnum/blob/53a1840d68382fd7bd6cc1f7c6752a37a632b50b/magnum/drivers/heat/k8s_template_def.py#L104  10:58
<brtknr> looks like you're right lxkong  10:58
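
The linked template_def.py / k8s_template_def.py code is what picks the Octavia load balancer automatically when the cloud exposes one. A quick, hedged way to check what your own cloud advertises in the service catalog:

```sh
# Hedged sketch: see whether an Octavia / load-balancer endpoint is registered;
# magnum's template definitions key off its presence.
openstack service list | grep -Ei 'octavia|load-balancer'
openstack catalog list | grep -i load-balancer
```
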
<strigazi> putting everything in one script might not help if the node performance is poor. The biggest gain is pulling everything in parallel.  11:00
<lxkong> strigazi: `pull everything in parallel`, you mean for atomic images or heat SDs?  11:01
<lxkong> strigazi: from the logs, all the SDs are executed in sequence, so the total time is about 4min after the apiserver is up and running  11:03
<lxkong> but from what those scripts do, 4min is not reasonable  11:03
<lxkong> and from ricolin's explanation, there is a param for os-collect-config, `polling-interval`, with a default value of 30, that may be the reason, but i'm not sure  11:04
<lxkong> os-collect-config/os-refresh-config/os-apply-config....so many things  11:05
<strigazi> maybe your bottleneck is somewhere else. check what these scripts do. The times you present don't make sense to me. In our cloud the agent starts at 12:18:53 and the last SD finishes at 12:19:29  11:08
<lxkong> strigazi: are you using the latest magnum code?  11:09
<strigazi> lxkong: just a warning, if you reduce the polling time, you will ddos your heat api. Just measure exactly where the time is spent.  11:10
<lxkong> my environment is a devstack with the latest code for all the projects  11:10
<brtknr> a little while ago, when i was testing changes to magnum on devstack, the VM that i was using had its ip address throttled by openstack servers... I changed the floating ip and the problem was solved.  11:10
<strigazi> let me check my devstack which is master.  11:10
<lxkong> strigazi: yeah, i know  11:10
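
A practical way to do that measurement is to read the heat-container-agent journal on the master and compare the timestamps between consecutive deployments. A minimal sketch; the unit name may differ depending on the image:

```sh
# Hedged sketch: inspect per-deployment timing from the agent log on the master.
sudo journalctl -u heat-container-agent --no-pager -o short-iso | less
```
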
*** udesale has joined #openstack-containers  11:11
<strigazi> in devstack master: start Dec 07 18:29:55, end Dec 07 18:31:47  11:12
<lxkong> hmm...  11:13
<lxkong> strigazi: do you mind pasting your heat-container-agent service log?  11:15
<lxkong> from the service start to when all the phases finished  11:16
<strigazi> http://paste.openstack.org/raw/736889/  11:21
<lxkong> strigazi: in your cluster, you don't have calico_service and kubernetes_dashboard, right?  11:25
*** ArchiFleKs has joined #openstack-containers  11:29
<lxkong> strigazi: btw, have you tried to run `atomic install` in a multi-thread or multi-process manner when deploying the k8s cluster?  11:34
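
For context, the driver fragments run their `atomic install` calls one after another; in principle they could be backgrounded. A rough, untested sketch, with flags and image names that are illustrative rather than the exact magnum invocations, and with the caveat that concurrent pulls into the same ostree repo may simply serialize or conflict on the repo lock:

```sh
# Hedged sketch: install the system containers concurrently instead of serially.
# NOTE: ostree locking may serialize these anyway; this is an idea, not a tested recipe.
atomic install --storage ostree --system --name=kubelet \
  docker.io/openstackmagnum/kubernetes-kubelet:v1.11.5 &
atomic install --storage ostree --system --name=kube-apiserver \
  docker.io/openstackmagnum/kubernetes-apiserver:v1.11.5 &
wait
```
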
* lxkong has to get some sleep, will continue testing tomorrow  11:38
<strigazi> lxkong: we don't use calico, maybe calico is slow  12:11
<strigazi> lxkong: I have the dashboard too  12:13
<strigazi> lxkong: paste.openstack.org cut my paste  12:22
<strigazi> lxkong: here it is https://paste.fedoraproject.org/paste/6-B--pO2cvkvx73-RDbXRg/raw  12:22
*** salmankhan has quit IRC  12:24
*** salmankhan has joined #openstack-containers  12:31
*** cbrumm_ has quit IRC  12:42
*** cbrumm_ has joined #openstack-containers  12:49
<strigazi> lxkong: https://paste.fedoraproject.org/paste/HfYsaoQLuNDpKsSJ80VCbw/raw with calico  13:08
<strigazi> lxkong: less than two mins  13:09
*** dave-mccowan has joined #openstack-containers  13:17
*** jmlowe has quit IRC  13:45
<brtknr> strigazi: have you started using cloud-controller-manager by any chance?  13:53
<openstackgerrit> Merged openstack/magnum stable/rocky: Add support for www_authenticate_uri in ContextHook  https://review.openstack.org/623679  13:57
*** openstackstatus has joined #openstack-containers  14:17
*** ChanServ sets mode: +v openstackstatus  14:17
*** lbragstad has joined #openstack-containers  14:17
*** jmlowe has joined #openstack-containers  14:24
*** jmlowe has quit IRC  14:24
*** jmlowe has joined #openstack-containers  14:25
*** aspiers has quit IRC  14:26
*** jmlowe has quit IRC  14:30
*** shrasool has quit IRC  14:34
*** jmlowe has joined #openstack-containers  14:35
*** aspiers has joined #openstack-containers  14:56
<brtknr> anyone here started using the cloud-controller-manager service?  15:09
<brtknr> with magnum?  15:09
<brtknr> looks like there is a service for it  15:09
<brtknr> https://hub.docker.com/r/openstackmagnum/openstack-cloud-controller-manager/  15:09
*** salmankhan1 has joined #openstack-containers  15:11
*** salmankhan has quit IRC  15:12
*** salmankhan1 is now known as salmankhan  15:12
<brtknr> is this wip?  15:12
<brtknr> looks like this change was abandoned in July and restored recently: https://review.openstack.org/#/c/577477/3/magnum/drivers/common/templates/kubernetes/fragments/configure-kubernetes-master.sh  15:18
<strigazi> brtknr: the PS i pushed works  15:18
<strigazi> brtknr: you can test  15:19
<strigazi> brtknr: i pushed PS version 3  15:19
<brtknr> strigazi: nice, what prompted you to look at occm?  15:22
<brtknr> there is nothing with the tag v0.2.0 on dockerhub btw  15:23
<brtknr> occm is tagged to download v0.2.0  15:23
<strigazi> brtknr: http://paste.openstack.org/show/736905/  15:31
<strigazi> brtknr: I was testing things upstream for the lbaas delete hook written by lxkong. At CERN we are still not interested.  15:32
<strigazi> brtknr: well, we might need it for the autoscaler :)  15:33
<strigazi> brtknr: we need it for the autoscaler, but i didn't have it in mind xD  15:33
<brtknr> oops, i was inspecting the wrong provider  15:34
<brtknr> i'm currently looking at neutron-lbaasv2, using cloud-provider=openstack hasn't worked for me. i'm checking to see if cloud-provider=external will  15:35
<strigazi> brtknr: with octavia "It works in devstack"TM  15:35
<brtknr> lol  15:36
*** hongbin has joined #openstack-containers  15:36
<brtknr> are you using rocky at cern or still queens?  15:36
<strigazi> 40% rocky  15:36
<strigazi> or more  15:36
<strigazi> we cherry-pick only what we need, it is difficult to rebase. Our network is too special  15:37
<brtknr> i see  15:38
<strigazi> we plan to upgrade in Jan, after the holidays.  15:40
<strigazi> brtknr: we might be on queens but we use k8s v1.12.3.  15:41
<brtknr> oh, how are you using v1.12.3? the openstackmagnum docker hub only goes up to v1.11.5? or are you using the upstream k8s image?  15:44
*** itlinux has joined #openstack-containers  15:44
*** jmlowe has quit IRC  15:46
*** itlinux has quit IRC  15:46
<brtknr> strigazi: do you need kubelet running on master for occm to work?  15:51
*** lpetrut has joined #openstack-containers  16:12
*** munimeha1 has joined #openstack-containers  16:21
*** jmlowe has joined #openstack-containers  16:24
<strigazi> brtknr yes  16:27
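
A quick, hedged way to confirm that once a cluster is up: check that the master registered a kubelet and that the cloud-controller-manager pod is running. The namespace and naming may differ in the patch set:

```sh
# Hedged sketch: verify the master node is registered and the occm pod is up.
kubectl get nodes -o wide
kubectl -n kube-system get pods -o wide | grep -i cloud-controller
```
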
*** jmlowe has quit IRC  16:34
*** openstackgerrit has quit IRC  16:35
<brtknr> cool, neutron lbaas v2 is working now :)  16:41
<brtknr> with cloud-provider=external  16:42
<brtknr> strigazi: just tested https://review.openstack.org/#/c/577477/3 and it works great!  16:43
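
A minimal smoke test for that integration, as a hedged sketch, is to expose a deployment as a LoadBalancer Service and wait for the external address to be provisioned:

```sh
# Hedged sketch: exercise the cloud provider's load balancer integration.
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer
kubectl get svc nginx -w   # EXTERNAL-IP should move from <pending> to a VIP
```
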
<brtknr> can we cherry-pick https://review.openstack.org/#/c/571190 to queens?  16:51
<brtknr> i get a merge conflict when i try to do it  16:52
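
One way past that is a manual backport: cherry-pick the change onto stable/queens, resolve the conflicts, and push it as a new review. A hedged sketch; the refs/changes path needs the actual patchset number, shown as /N here:

```sh
# Hedged sketch: manual backport of change 571190 to stable/queens.
git checkout -b backport-571190 origin/stable/queens
git fetch https://review.openstack.org/openstack/magnum refs/changes/90/571190/N
git cherry-pick -x FETCH_HEAD   # fix conflicts, git add, git cherry-pick --continue
git review stable/queens
```
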
*** ykarel has quit IRC  16:52
*** ykarel has joined #openstack-containers  16:53
*** itlinux has joined #openstack-containers  16:56
*** udesale has quit IRC  16:58
*** openstackgerrit has joined #openstack-containers  17:01
<openstackgerrit> Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label  https://review.openstack.org/624132  17:01
*** ykarel is now known as ykarel|away  17:03
*** ricolin has quit IRC  17:05
<openstackgerrit> Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label  https://review.openstack.org/624132  17:09
<openstackgerrit> Bharat Kunwar proposed openstack/magnum stable/queens: k8s_fedora: Add cloud_provider_enabled label  https://review.openstack.org/624132  17:10
*** ricolin has joined #openstack-containers  17:14
*** munimeha1 has quit IRC  17:14
*** PagliaccisCloud has quit IRC  17:26
*** jmlowe has joined #openstack-containers  18:12
*** ricolin has quit IRC  18:22
*** PagliaccisCloud has joined #openstack-containers  18:34
*** salmankhan has quit IRC  18:36
*** ykarel|away has quit IRC  19:08
*** PagliaccisCloud has quit IRC  19:27
*** salmankhan has joined #openstack-containers  20:01
*** shrasool has joined #openstack-containers  20:03
*** salmankhan has quit IRC  20:06
<openstackgerrit> Roberto Soares proposed openstack/magnum master: [k8s] Add vulnerability scanner  https://review.openstack.org/598142  20:10
*** shrasool has quit IRC  20:18
*** jmlowe has quit IRC  20:26
*** lpetrut has quit IRC  20:27
*** jmlowe has joined #openstack-containers  20:32
*** jmlowe has quit IRC  20:33
<mordred> anybody know off the top of their head - using magnum with atomic hosts, is there a particular directory on each node that should be used for files that should persist across reboots?  20:37
*** jmlowe has joined #openstack-containers  20:58
<cbrumm_> According to redhat's docs /var is writable and persists through reboots  21:01
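
A quick way to sanity-check that on a node, as a hedged sketch:

```sh
# Hedged sketch: confirm /var is a writable, persistent location on an Atomic host.
findmnt /var
ostree admin status        # the ostree deployments are immutable; /var lives outside them
touch /var/lib/persist-test && echo writable
```
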
*** salmankhan has joined #openstack-containers  21:04
<lxkong> strigazi: hi, are you still here? Could you please give more suggestions about https://review.openstack.org/#/c/497144? I didn't fully understand your comment 'This is per coe. All other features are written per driver', I want to make sure we agree on the plugin mechanism before I start writing some docs for how to config and test.  21:10
*** roukoswarf has joined #openstack-containers  21:12
<roukoswarf> does magnum allow you to set the deployed kubernetes version anywhere?  21:15
<lxkong> roukoswarf: `--labels kube_tag=YOUR_VERSION_HERE`  21:16
<lxkong> make sure you can find the version under the openstackmagnum account on docker hub if you are using the upstream images  21:17
<roukoswarf> ah okay, forgot about labels and didn't know where it was sourcing it from  21:17
<roukoswarf> no versions beyond 1.11? is there a specific update frequency?  21:18
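
For concreteness, a hedged sketch of how the label is typically passed on a cluster template; the other arguments are illustrative, and the tag has to exist under the openstackmagnum namespace on Docker Hub:

```sh
# Hedged sketch: pin the Kubernetes version via the kube_tag label.
openstack coe cluster template create k8s-v1-11-5 \
  --image fedora-atomic-latest \
  --external-network public \
  --coe kubernetes \
  --labels kube_tag=v1.11.5-1
```

Worth noting, as an assumption rather than a confirmed diagnosis: kube_tag is consumed by the fedora-atomic driver, so a cluster built from a CoreOS image pins its own version, which may explain the v1.10.3+coreos.0 nodes reported below.
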
*** salmankhan has quit IRC  21:36
<roukoswarf> so even with kube_tag set to v1.11.5-1, i end up with nodes using v1.10.3+coreos.0, is there something i'm missing?  22:18
*** itlinux has quit IRC  22:43
<mnaser> lxkong: strigazi ricolin i actually took a shot at improving the deploy speed  22:56
<mnaser> it is failing, i don't know why, but if someone knows the right thing to do, it might pass  22:57
<mnaser> it involves using softwaredeploymentgroups  22:57
<mnaser> https://review.openstack.org/623724  22:58
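
To try that change locally, the review can be pulled into a magnum checkout with git-review; a hedged sketch, with paths and service handling depending on your devstack:

```sh
# Hedged sketch: fetch review 623724 for local testing.
cd /opt/stack/magnum
git review -d 623724   # checks out the latest patchset on a local branch
# Restart your devstack's magnum services, then create a cluster and compare
# the stack timings against master.
```
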
<mnaser> btw, anyone going to be at kubecon?  22:58
*** rcernin has joined #openstack-containers  22:59
<lxkong> mnaser: not me :-(  23:05
<mnaser> lxkong: aw bummer  23:06
<mnaser> btw, i would love reviews on this stack here -- https://review.openstack.org/#/c/623628/  23:07
<mnaser> everything with the functional: prefix brings back functional tests for magnum and runs them in an env with nested virt!  23:07
<mnaser> also https://review.openstack.org/#/c/619643/ would be nice because that breaks things for a lot of users :(  23:09
<lxkong> mnaser: thanks for proposing this https://review.openstack.org/#/c/623724/, i've added it to my review list and will test when i'm available  23:09
<mnaser> lxkong: yeah, i don't have the tools to repro right now but i think that's the direction we wanna go  23:12
<lxkong> mnaser: definitely  23:12
<mnaser> that way in non-ha deploys, the servers get deployed quickly and the minions can start spinning up already while the softwaredeploys happen for the masters  23:12
<mnaser> but i haven't had time to set up a dev env to hack on it, but yay for working functional tests so we can actually just see that change in CI :)  23:12
<lxkong> mnaser: what i thought is to spin up the workers just after the vm is created successfully  23:13
<lxkong> because what the workers need is just an ip address  23:13
<mnaser> lxkong: yep! and that's why ha goes up faster, because they take the lb ip address  23:13
<lxkong> in an ha cluster it's the lb vip, for a non-ha cluster that's master0's private ip  23:13
<mnaser> vs in non-ha we have to wait for the vm to go up  23:13
<mnaser> yup :D  23:13
<mnaser> also with the approach i took there  23:14
<lxkong> yeah  23:14
<mnaser> you reduce the # of resources you have  23:14
<mnaser> no need to have a softwareconfig+softwaredeployment per every node  23:14
<mnaser> esp when the softwareconfigs are actually static, so it is less load on heat  23:14
<lxkong> i'll give it a try after reviewing and testing this one https://review.openstack.org/#/c/561783/  23:15
*** dave-mccowan has quit IRC  23:19
<mnaser> lxkong: awesome! :)  23:29
<mnaser> please do keep me updated  23:29
<lxkong> sure, i will  23:29
<mnaser> if you need anything please feel free to ping!  23:30
<lxkong> mnaser: yep, thanks :-)  23:30
*** itlinux has joined #openstack-containers  23:49

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!