Wednesday, 2020-02-12

08:47 <openstackgerrit> Feilong Wang proposed openstack/magnum master: [k8s] Fix instance ID issue with podman and autoscaler  https://review.opendev.org/707336
08:57 <flwang1> strigazi: brtknr: meeting in 3 mins
09:00 <flwang1> #startmeeting magnum
09:00 <openstack> Meeting started Wed Feb 12 09:00:39 2020 UTC and is due to finish in 60 minutes.  The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00 *** openstack changes topic to " (Meeting topic: magnum)"
09:00 <openstack> The meeting name has been set to 'magnum'
09:00 <flwang1> #topic roll call
09:00 *** openstack changes topic to "roll call (Meeting topic: magnum)"
09:00 <flwang1> o/
09:02 <brtknr> o/
09:03 <flwang1> brtknr: hey, how are you?
09:03 <brtknr> good thanks, and you?
09:03 <flwang1> very good
09:03 <flwang1> let's wait for strigazi a bit
09:04 <flwang1> brtknr: did you see my comments on your CSI patch?
09:05 <brtknr> yes, thanks for reviewing
09:06 <brtknr> did you see the issue on devstack?
09:06 <brtknr> can you leave a comment about what k8s version you used, and whether it was podman or coreos etc.?
09:06 <brtknr> i haven't seen the same issue locally
09:06 <flwang1> ok, will do
09:06 <flwang1> i'm using v1.16.3 with podman
09:06 <flwang1> and coreos
09:08 <flwang1> brtknr: let's go through the agenda?
09:08 <brtknr> sounds good
09:08 <brtknr> btw, instead of --kubelet-insecure-tls, maybe i can use --kubelet-certificate-authority
09:09 <brtknr> and use the CA for the cluster?
09:09 <flwang1> brtknr: that would be great, otherwise enterprise users won't like it
09:09 <flwang1> 1. Help with removing the constraint that there must be a minimum of 1 worker in a given nodegroup (including default-worker).
09:09 <flwang1> have you got any ideas for this yet?
09:10 <brtknr> i can manually specify count as 0 and get the cluster to reach CREATE_COMPLETE, but i haven't been able to override the value of node_count to 0; at the moment it defaults to 1
09:12 <brtknr> i haven't been able to figure out where exactly this constraint is applied. any pointer would be appreciated, but i realise there is no easy answer without properly digging underneath
09:12 <flwang1> what do you mean by manually specify? like `openstack coe cluster create xxx --node-count 0`?
09:12 <brtknr> that will not work because there is an API-level constraint
09:12 <brtknr> when i remove the API-level constraint, the node-count still defaults to 1
09:12 <flwang1> brtknr: i see. so you mean you hacked the code to set it to 0?
09:13 <brtknr> i can override count in the kubecluster.yaml file
09:13 <brtknr> and only then does the cluster reach CREATE_COMPLETE
09:13 <flwang1> ah, i see.
09:13 <flwang1> i would say if it works at the Heat level, then the overall idea should work
09:13 <flwang1> we can manage it in magnum scope
09:14 <flwang1> it would be nice if you can dig in and propose a patch so that we can start to review from there
09:14 <brtknr> sounds good
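[Editor's note] The API-level constraint brtknr is hunting for behaves like a minimum-value validator on node_count. A toy illustration of the pattern, not magnum's actual code (the real check sits somewhere in magnum's API/attribute validation layer, which is exactly what was still being located here):

```python
def validate_node_count(value, minimum=1):
    """Reject node counts below the minimum.

    Supporting empty nodegroups, as discussed above, would amount to
    relaxing this kind of check to minimum=0 (and fixing the separate
    default-to-1 behaviour brtknr observed).
    """
    if value < minimum:
        raise ValueError(f"node_count must be >= {minimum}, got {value}")
    return value

validate_node_count(3)             # passes today
validate_node_count(0, minimum=0)  # would pass once the constraint is relaxed
```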
09:15 <flwang1> 2. metrics server CrashLoopBackOff
09:15 <flwang1> is there anything we should discuss about this?
09:17 <brtknr> flwang1: i saw your comment
09:17 <brtknr> i will see if there is a way to do this without --kubelet-insecure-tls
09:18 <brtknr> there are --kubelet-certificate-authority and --tls-cert-file options which i haven't explored
09:18 <flwang1> cool, i appreciate your work on this issue
09:19 <flwang1> 3. the heat agent log
09:19 <brtknr> we're still on the roll call topic btw
09:19 <brtknr> lol
09:19 <flwang1> ah, sorry
09:19 <flwang1> #topic heat agent log
09:19 *** openstack changes topic to "heat agent log (Meeting topic: magnum)"
09:19 <flwang1> for this one, we probably need to wait for the investigation result from strigazi
09:19 <flwang1> so let's skip it for now?
09:20 <brtknr> i was digging into the heat agent log yesterday, as it's sometimes impossible to see what is happening while the cluster is creating
09:20 <brtknr> the main problem is that subprocess.communicate does not provide an option to stream output
09:21 <flwang1> :(
09:21 <brtknr> there may be a way to output stdout and stderr to a file, on the other hand
09:22 <brtknr> from wherever it's being executed from
09:22 <brtknr> but happy to wait for what strigazi has to say
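[Editor's note] The limitation brtknr describes is that `subprocess.communicate()` only returns output after the child process exits. A minimal sketch of the workaround he hints at (stream via `Popen` and mirror each line to a log file as it arrives); this is an illustrative pattern, not the heat-agent's actual code:

```python
import subprocess
import sys

def run_streaming(cmd, log_path):
    """Run a command, mirroring its combined stdout/stderr to a log
    file line by line instead of buffering everything in communicate()."""
    lines = []
    with open(log_path, "w") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
        bufsize=1,                 # line-buffered in text mode
    ) as proc:
        for line in proc.stdout:   # yields lines as they are produced
            log.write(line)
            log.flush()            # progress is visible while still running
            lines.append(line)
    return proc.returncode, "".join(lines)

# example: a child that prints two lines
rc, out = run_streaming(
    [sys.executable, "-c", "print('step 1'); print('step 2')"],
    "/tmp/heat-agent-demo.log",
)
```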
09:22 <flwang1> cool, thanks
09:22 <flwang1> move on?
09:25 <brtknr> sure
09:27 <flwang1> #topic volume AZ
09:27 *** openstack changes topic to "volume AZ (Meeting topic: magnum)"
09:27 <flwang1> brtknr: did you get a chance to review the volume AZ fix https://review.opendev.org/705592 ?
09:30 <brtknr> yes, it has complicated logic
09:31 <flwang1> brtknr: yep, i don't like it TBH, but i can't figure out a better way to solve it
09:32 <brtknr> how has it worked fine until now?
09:32 <flwang1> sorry?
09:32 <flwang1> you mean why it was working?
09:33 <brtknr> until now, how have we survived without this patch is what i'm asking
09:33 <flwang1> probably because most companies are not using multi-AZ
09:33 <flwang1> without multi-AZ, users won't run into this issue
09:34 <flwang1> as far as i know, Nectar (jakeyip) are hacking the code
09:34 <flwang1> i don't know if CERN is using multi-AZ
09:35 <elenalindq> Hi there, may I interrupt with a question? I tried `openstack coe cluster update $CLUSTER_NAME replace node_count=3` and it failed because of lack of resources, so my stack is in UPDATE_FAILED state. I fixed the quota and tried to rerun the update command hoping it would kick it off again, but nothing happens. If I try `openstack stack update <stack_id> --existing` it will kick off the update, which succeeds (heat shows UPDATE_COMPLETE), but `openstack coe cluster list` still shows my stack in UPDATE_FAILED. Is there a way to rerun the update from magnum? Using OpenStack Train.
09:36 <brtknr> flwang1: can we not get a default value for az in the same way nova does?
09:37 <brtknr> elenalindq: if you are using train with an up-to-date CLI, you can rerun `openstack coe cluster resize <cluster_name> 3`
09:37 <flwang1> brtknr: cinder can handle the "" for az
09:37 <brtknr> and this will reupdate the heat stack
09:37 <elenalindq> thank you brtknr!
09:37 <brtknr> flwang1: but not nova?
09:37 <flwang1> brtknr: correction, cinder can NOT handle the "" for az
09:37 <flwang1> but nova can
09:37 <flwang1> cinder will just return a 400 IIRC
09:38 <brtknr> flwang1: can we look into the nova code to see how they infer a sensible default for the availability zone?
09:38 <flwang1> you can easily test this without using multi-AZ
09:38 <flwang1> are you trying to solve this issue in cinder?
09:39 <brtknr> i think magnum should have an internal default for availability zone
09:39 <brtknr> rather than ""
09:40 <flwang1> like a config option?
09:40 <flwang1> then how can you set the default value for this option?
09:40 <flwang1> and this may break backward compatibility :(
09:43 <brtknr> flwang1: hmm, will cinder accept None?
09:43 <flwang1> brtknr: no, based on what i tried
09:44 <flwang1> you can give the patch a try and we can discuss offline
09:44 <flwang1> it's a small issue, but it makes the template complicated, i understand that
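[Editor's note] The behaviour under discussion is that cinder rejects an empty availability zone (400) while nova tolerates one, so the client-side workaround is to omit the key entirely when no AZ is set. A hedged sketch of that idea with a hypothetical helper, not the actual patch under review:

```python
def volume_create_args(size_gb, availability_zone=None):
    """Build kwargs for a cinder volume-create call.

    Cinder returns a 400 for availability_zone="" (nova handles "" fine),
    so the key is dropped when the AZ is unset or empty, letting cinder
    fall back to its own configured default zone.
    """
    args = {"size": size_gb}
    if availability_zone:  # skips both None and ""
        args["availability_zone"] = availability_zone
    return args

volume_create_args(10, "")          # -> {"size": 10}
volume_create_args(10, "az-west1")  # -> {"size": 10, "availability_zone": "az-west1"}
```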
09:46 <flwang1> brtknr: let's move on
09:46 <flwang1> #topic docker storage for fedora coreos
09:46 *** openstack changes topic to "docker storage for fedora coreos (Meeting topic: magnum)"
09:46 <flwang1> docker storage driver for fedora coreos https://review.opendev.org/696256
09:47 <flwang1> brtknr: can you pls revisit the above patch?
09:47 <flwang1> strigazi: ^
09:51 <brtknr> flwang1: my main issue with that patch is that we should try to use the same configure-docker-storage.sh and add a condition in there for fedora coreos
09:51 <brtknr> it is hard to factor out common elements when they are in different files
09:52 <flwang1> brtknr: we can't use the same script. i'm kind of using the same one from coreos
09:52 <flwang1> because the logic is different
09:53 <flwang1> brtknr: see https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_coreos_v1/templates/fragments/configure-docker.yaml
09:54 <brtknr> ah ok, i see what you mean
09:54 <brtknr> my bad
09:55 <brtknr> when i tested it, it worked for me
09:55 <flwang1> all good, please revisit it, because we do need it for the fedora coreos driver to remove the TODO :)
09:56 <flwang1> let's move on, we only have 5 mins
09:56 <brtknr> i just realised that atomic has its own fragment
09:56 <brtknr> so this pattern makes sense
09:56 <flwang1> #topic autoscaler podman issue
09:56 *** openstack changes topic to "autoscaler podman issue (Meeting topic: magnum)"
09:57 <brtknr> i am happy to take this patch as is
09:57 <brtknr> on the topic of the autoscaler, are you guys planning to work on supporting nodegroups?
09:57 <flwang1> this is a brand new bug, see https://github.com/kubernetes/autoscaler/issues/2819
09:58 <flwang1> brtknr: i'm planning to support the /resize api first and then nodegroups; not sure if the cern guys will take the nodegroups support
09:58 <flwang1> brtknr: and here is the fix https://review.opendev.org/707336
09:58 <flwang1> we just need to add the volume mount for /etc/machine-id
09:59 <flwang1> the bug reporter has confirmed that works for him
09:59 <brtknr> flwang1: excellent, looks reasonable to me
09:59 <brtknr> we haven't started using coreos in prod yet, but this kind of bug is precisely the reason why
10:00 <brtknr> what do you mean by support the resize api and then nodegroups?
10:00 <brtknr> doesn't it already support resize?
10:01 <flwang1> brtknr: it's using the old way https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/magnum/magnum_manager_heat.go#L231
10:02 <flwang1> in short, it calls heat to remove the node and then calls the magnum update api to set the correct number, which doesn't make sense given we now have the resize api
10:03 <brtknr> flwang1: ah okay, that's true
10:03 <flwang1> i'd like to use resize to replace it
10:03 <brtknr> why did we not update the node count on magnum directly?
10:03 <flwang1> to drop the dependency on heat
10:04 <flwang1> because the magnum update api can't specify which node to delete
10:04 <brtknr> flwang1: oh i see
10:04 <flwang1> :)
10:04 <brtknr> but if you remove the node from the heat stack and update the node count, magnum doesn't remove any extra nodes?
10:05 <flwang1> magnum just updates the number, because the node has already been deleted
10:05 <flwang1> it just "magically works" :)
10:06 <brtknr> i really want support for nodegroup autoscaling
10:06 <flwang1> anyway, i think we all agree resize is the right way to do this
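[Editor's note] The difference being discussed is that magnum's resize API lets the caller name the exact nodes to delete, while the old update API only takes a count. A hedged sketch of the resize request body (my reading of the `POST /v1/clusters/<cluster-id>/actions/resize` call; the autoscaler itself is Go, and field details should be checked against the API reference):

```python
def resize_request(node_count, nodes_to_remove=None, nodegroup=None):
    """Build the JSON body for magnum's cluster resize action.

    Unlike a plain `cluster update` of node_count, resize can carry a
    nodes_to_remove list, so a scale-down can target the specific
    instance the autoscaler picked instead of an arbitrary one.
    """
    body = {"node_count": node_count}
    if nodes_to_remove:
        body["nodes_to_remove"] = nodes_to_remove
    if nodegroup:
        body["nodegroup"] = nodegroup
    return body

# scale down from 3 to 2, removing one specific node
resize_request(2, nodes_to_remove=["node-2-instance-id"])
```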
10:06 <brtknr> happy to work on this, but i will need to learn golang first
10:06 <flwang1> brtknr: then show me the code :D
10:06 <flwang1> let's move on
10:06 <flwang1> i'm going to close this meeting now
10:07 <flwang1> we can discuss the out-of-box storage class offline
10:07 <brtknr> #topic out of box storage class?
10:07 <flwang1> brtknr: it's related to this one https://review.opendev.org/676832
10:07 <flwang1> i proposed it before
10:07 <brtknr> i see dioguerra lurking in the background
10:09 <flwang1> :)
10:09 <flwang1> let's end the meeting first
10:09 <flwang1> #endmeeting
10:09 *** openstack changes topic to "OpenStack Containers Team | Meeting: every Wednesday @ 9AM UTC | Agenda: https://etherpad.openstack.org/p/magnum-weekly-meeting"
10:09 <openstack> Meeting ended Wed Feb 12 10:09:13 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
10:09 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.html
10:09 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.txt
10:09 <openstack> Log:            http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.log.html
10:09 <brtknr> flwang1: ah, the post-install manifest would solve a few problems for us!
10:09 <flwang1> i'm so happy you like it :)
10:09 <flwang1> yep, it can resolve a lot of vendor-specific requirements
10:10 <flwang1> actually, the design has been agreed by strigazi as well
10:10 <flwang1> but we didn't push it hard
10:10 <flwang1> now we have users asking for it again, so i'm trying to pick it up and try again
10:11 <brtknr> flwang1: i'd really like the option to also provide this as a label
10:11 <brtknr> rather than magnum.conf
10:12 <brtknr> rather than just magnum.conf
10:13 <flwang1> you mean put the URL as a label?
10:14 <flwang1> so try to get the label first, and if that fails, fall back to magnum.conf?
10:15 <brtknr> flwang1: yep
10:15 <flwang1> please leave a comment, then you will get a new patch set ;)
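[Editor's note] The label-first, config-fallback lookup brtknr asks for could be sketched like this. The label name `post_install_manifest_url` and the config default are hypothetical; the actual patch may name them differently:

```python
DEFAULT_MANIFEST_URL = None  # would come from magnum.conf in practice

def post_install_manifest_url(cluster_labels, conf_default=DEFAULT_MANIFEST_URL):
    """Resolve the post-install manifest URL: a per-cluster label wins,
    otherwise fall back to the operator-wide magnum.conf setting."""
    return cluster_labels.get("post_install_manifest_url") or conf_default

# per-cluster override via label
post_install_manifest_url({"post_install_manifest_url": "https://example.com/extra.yaml"})
# no label set: the operator default applies
post_install_manifest_url({}, conf_default="https://example.com/site-default.yaml")
```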
10:16 <brtknr> i've already done it
10:16 <flwang1> wonderful
10:16 <flwang1> i have to head off now
10:17 <flwang1> thank you for joining the meeting, brtknr
10:17 <brtknr> okay, it was good to catch up!
10:17 <brtknr> goodnight flwang1
10:17 <flwang1> the best thing i did in 2019 was inviting you to join the magnum team
10:18 <flwang1> just wanna say thank you for all you have done for Magnum
10:19 <brtknr> flwang1: aw :) i am humbled! i have learnt a lot from you and strigazi!
10:19 <flwang1> brtknr: cheers, ttyl
10:35 <dioguerra> brtknr: i always lurk in the background, but always late :(
11:25 <brtknr> dioguerra: :) can i ask you some questions re node_count? i am wondering what forces node_count to be a minimum of 1
