Wednesday, 2020-02-12

08:47 <openstackgerrit> Feilong Wang proposed openstack/magnum master: [k8s] Fix instance ID issue with podman and autoscaler  https://review.opendev.org/707336
08:57 <flwang1> strigazi: brtknr: meeting in 3 mins
09:00 <flwang1> #startmeeting magnum
09:00 <openstack> Meeting started Wed Feb 12 09:00:39 2020 UTC and is due to finish in 60 minutes.  The chair is flwang1. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00 *** openstack changes topic to " (Meeting topic: magnum)"
09:00 <openstack> The meeting name has been set to 'magnum'
09:00 <flwang1> #topic roll call
09:00 *** openstack changes topic to "roll call (Meeting topic: magnum)"
09:00 <flwang1> o/
09:02 <brtknr> o/
09:03 <flwang1> brtknr: hey, how are you?
09:03 <brtknr> good thanks, and you?
09:03 <flwang1> very good
09:03 <flwang1> let's wait for strigazi a bit
09:04 <flwang1> brtknr: did you see my comments on your CSI patch?
09:05 <brtknr> yes, thanks for reviewing
09:06 <brtknr> did you see the issue on devstack?
09:06 <brtknr> can you leave a comment about what k8s version you used, and whether it was podman or coreos etc.?
09:06 <brtknr> i haven't seen the same issue locally
09:06 <flwang1> ok, will do
09:06 <flwang1> i'm using v1.16.3 with podman
09:06 <flwang1> and coreos
09:08 <flwang1> brtknr: let's go through the agenda?
09:08 <brtknr> sounds good
09:08 <brtknr> btw, instead of --kubelet-insecure-tls, maybe i can use --kubelet-certificate-authority
09:09 <brtknr> and use the CA for the cluster?
09:09 <flwang1> brtknr: that would be great, otherwise enterprise users won't like it
09:09 <flwang1> 1. Help with removing the constraint that there must be a minimum of 1 worker in a given nodegroup (including default-worker).
09:09 <flwang1> have you got any ideas for this yet?
09:10 <brtknr> i can manually specify count as 0 and get the cluster to reach CREATE_COMPLETE, but i haven't been able to override the value of node_count to 0; at the moment it defaults to 1
09:12 <brtknr> i haven't been able to figure out where exactly this constraint is applied. any pointer would be appreciated, but i realise there is no easy answer without properly digging underneath
09:12 <flwang1> what do you mean by manually specify? like `openstack coe cluster create xxx --node-count 0`?
09:12 <brtknr> that will not work because there is an API-level constraint
09:12 <brtknr> when i remove the API-level constraint, the node-count still defaults to 1
09:12 <flwang1> brtknr: i see. so you mean you hacked the code to set it to 0?
09:13 <brtknr> i can override count in the kubecluster.yaml file
09:13 <brtknr> and only then does the cluster reach CREATE_COMPLETE
09:13 <flwang1> ah, i see.
09:13 <flwang1> i would say if it works at the Heat level, then the overall idea should work
09:13 <flwang1> we can manage it in magnum scope
09:14 <flwang1> it would be nice if you can dig in and propose a patch so that we can start to review from there
09:14 <brtknr> sounds good
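[Editor's note] The API-level constraint brtknr is hunting for behaves like a minimum-value validator on node_count. A toy illustration of the pattern, not magnum's actual code (the real check sits somewhere in magnum's API/attribute validation layer, which is exactly what was still being located here):

```python
def validate_node_count(value, minimum=1):
    """Reject node counts below the minimum.

    Supporting empty nodegroups, as discussed above, would amount to
    relaxing this kind of check to minimum=0 (and fixing the separate
    default-to-1 behaviour brtknr observed).
    """
    if value < minimum:
        raise ValueError(f"node_count must be >= {minimum}, got {value}")
    return value

validate_node_count(3)             # passes today
validate_node_count(0, minimum=0)  # would pass once the constraint is relaxed
```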
09:15 <flwang1> 2. metrics server CrashLoopBackOff
09:15 <flwang1> is there anything we should discuss about this?
09:17 <brtknr> flwang1: i saw your comment
09:17 <brtknr> i will see if there is a way to do this without --kubelet-insecure-tls
09:18 <brtknr> there are --kubelet-certificate-authority and --tls-cert-file options which i haven't explored
09:18 <flwang1> cool, i appreciate your work on this issue
09:19 <flwang1> 3. the heat agent log
09:19 <brtknr> we're still on the roll call topic btw
09:19 <brtknr> lol
09:19 <flwang1> ah, sorry
09:19 <flwang1> #topic heat agent log
09:19 *** openstack changes topic to "heat agent log (Meeting topic: magnum)"
09:19 <flwang1> for this one, we probably need to wait for the investigation result from strigazi
09:19 <flwang1> so let's skip it for now?
09:20 <brtknr> i was digging into the heat agent log yesterday, as it's sometimes impossible to see what is happening while the cluster is creating
09:20 <brtknr> the main problem is that subprocess.communicate does not provide an option to stream output
09:21 <flwang1> :(
09:21 <brtknr> there may be a way to output stdout and stderr to a file, on the other hand
09:22 <brtknr> from wherever it's being executed from
09:22 <brtknr> but happy to wait for what strigazi has to say
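[Editor's note] The limitation brtknr describes is that `subprocess.communicate()` only returns output after the child process exits. A minimal sketch of the workaround he hints at (stream via `Popen` and mirror each line to a log file as it arrives); this is an illustrative pattern, not the heat-agent's actual code:

```python
import subprocess
import sys

def run_streaming(cmd, log_path):
    """Run a command, mirroring its combined stdout/stderr to a log
    file line by line instead of buffering everything in communicate()."""
    lines = []
    with open(log_path, "w") as log, subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
        bufsize=1,                 # line-buffered in text mode
    ) as proc:
        for line in proc.stdout:   # yields lines as they are produced
            log.write(line)
            log.flush()            # progress is visible while still running
            lines.append(line)
    return proc.returncode, "".join(lines)

# example: a child that prints two lines
rc, out = run_streaming(
    [sys.executable, "-c", "print('step 1'); print('step 2')"],
    "/tmp/heat-agent-demo.log",
)
```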
09:22 <flwang1> cool, thanks
09:22 <flwang1> move on?
09:25 <brtknr> sure
09:27 <flwang1> #topic volume AZ
09:27 *** openstack changes topic to "volume AZ (Meeting topic: magnum)"
09:27 <flwang1> brtknr: did you get a chance to review the volume AZ fix https://review.opendev.org/705592 ?
09:30 <brtknr> yes, it has complicated logic
09:31 <flwang1> brtknr: yep, i don't like it TBH, but i can't figure out a better way to solve it
09:32 <brtknr> how has it worked fine until now?
09:32 <flwang1> sorry?
09:32 <flwang1> you mean why it was working?
09:33 <brtknr> until now, how have we survived without this patch is what i'm asking
09:33 <flwang1> probably because most companies are not using multi-AZ
09:33 <flwang1> without multi-AZ, users won't run into this issue
09:34 <flwang1> as far as i know, Nectar (jakeyip) are hacking the code
09:34 <flwang1> i don't know if CERN is using multi-AZ
09:35 <elenalindq> Hi there, may I interrupt with a question? I tried `openstack coe cluster update $CLUSTER_NAME replace node_count=3` and it failed because of lack of resources, so my stack is in UPDATE_FAILED state. I fixed the quota and tried to rerun the update command hoping it would kick it off again, but nothing happens. If I try `openstack stack update <stack_id> --existing` it will kick off the update, which succeeds (heat shows UPDATE_COMPLETE), but `openstack coe cluster list` still shows my stack in UPDATE_FAILED. Is there a way to rerun the update from magnum? Using OpenStack Train.
09:36 <brtknr> flwang1: can we not get a default value for az in the same way nova does?
09:37 <brtknr> elenalindq: if you are using train with an up-to-date CLI, you can rerun `openstack coe cluster resize <cluster_name> 3`
09:37 <flwang1> brtknr: cinder can handle the "" for az
09:37 <brtknr> and this will reupdate the heat stack
09:37 <elenalindq> thank you brtknr!
09:37 <brtknr> flwang1: but not nova?
09:37 <flwang1> brtknr: correction, cinder can NOT handle the "" for az
09:37 <flwang1> but nova can
09:37 <flwang1> cinder will just return a 400 IIRC
09:38 <brtknr> flwang1: can we look into the nova code to see how they infer a sensible default for the availability zone?
09:38 <flwang1> you can easily test this without using multi-AZ
09:38 <flwang1> are you trying to solve this issue in cinder?
09:39 <brtknr> i think magnum should have an internal default for availability zone
09:39 <brtknr> rather than ""
09:40 <flwang1> like a config option?
09:40 <flwang1> then how can you set the default value for this option?
09:40 <flwang1> and this may break backward compatibility :(
09:43 <brtknr> flwang1: hmm, will cinder accept None?
09:43 <flwang1> brtknr: no, based on what i tried
09:44 <flwang1> you can give the patch a try and we can discuss offline
09:44 <flwang1> it's a small issue, but it makes the template complicated, i understand that
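[Editor's note] The behaviour under discussion is that cinder rejects an empty availability zone (400) while nova tolerates one, so the client-side workaround is to omit the key entirely when no AZ is set. A hedged sketch of that idea with a hypothetical helper, not the actual patch under review:

```python
def volume_create_args(size_gb, availability_zone=None):
    """Build kwargs for a cinder volume-create call.

    Cinder returns a 400 for availability_zone="" (nova handles "" fine),
    so the key is dropped when the AZ is unset or empty, letting cinder
    fall back to its own configured default zone.
    """
    args = {"size": size_gb}
    if availability_zone:  # skips both None and ""
        args["availability_zone"] = availability_zone
    return args

volume_create_args(10, "")          # -> {"size": 10}
volume_create_args(10, "az-west1")  # -> {"size": 10, "availability_zone": "az-west1"}
```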
09:46 <flwang1> brtknr: let's move on
09:46 <flwang1> #topic docker storage for fedora coreos
09:46 *** openstack changes topic to "docker storage for fedora coreos (Meeting topic: magnum)"
09:46 <flwang1> docker storage driver for fedora coreos https://review.opendev.org/696256
09:47 <flwang1> brtknr: can you pls revisit the above patch?
09:47 <flwang1> strigazi: ^
09:51 <brtknr> flwang1: my main issue with that patch is that we should try to use the same configure-docker-storage.sh and add a condition in there for fedora coreos
09:51 <brtknr> it is hard to factor out common elements when they are in different files
09:52 <flwang1> brtknr: we can't use the same script. i'm kind of using the same one from coreos
09:52 <flwang1> because the logic is different
09:53 <flwang1> brtknr: see https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_coreos_v1/templates/fragments/configure-docker.yaml
09:54 <brtknr> ah ok, i see what you mean
09:54 <brtknr> my bad
09:55 <brtknr> when i tested it, it worked for me
09:55 <flwang1> all good, please revisit it, because we do need it for the fedora coreos driver to remove the TODO :)
09:56 <flwang1> let's move on, we only have 5 mins
09:56 <brtknr> i just realised that atomic has its own fragment
09:56 <brtknr> so this pattern makes sense
09:56 <flwang1> #topic autoscaler podman issue
09:56 *** openstack changes topic to "autoscaler podman issue (Meeting topic: magnum)"
09:57 <brtknr> i am happy to take this patch as is
09:57 <brtknr> on the topic of the autoscaler, are you guys planning to work on supporting nodegroups?
09:57 <flwang1> this is a brand new bug, see https://github.com/kubernetes/autoscaler/issues/2819
09:58 <flwang1> brtknr: i'm planning to support the /resize api first and then nodegroups; not sure if the cern guys will take the nodegroups support
09:58 <flwang1> brtknr: and here is the fix https://review.opendev.org/707336
09:58 <flwang1> we just need to add the volume mount for /etc/machine-id
09:59 <flwang1> the bug reporter has confirmed that works for him
09:59 <brtknr> flwang1: excellent, looks reasonable to me
09:59 <brtknr> we haven't started using coreos in prod yet, but this kind of bug is precisely the reason why
10:00 <brtknr> what do you mean by support the resize api and then nodegroups?
10:00 <brtknr> doesn't it already support resize?
10:01 <flwang1> brtknr: it's using the old way https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/magnum/magnum_manager_heat.go#L231
10:02 <flwang1> in short, it calls heat to remove the node and then calls the magnum update api to set the correct number, which doesn't make sense given we now have the resize api
10:03 <brtknr> flwang1: ah okay, that's true
10:03 <flwang1> i'd like to use resize to replace it
10:03 <brtknr> why did we not update the node count on magnum directly?
10:03 <flwang1> to drop the dependency on heat
10:04 <flwang1> because the magnum update api can't specify which node to delete
10:04 <brtknr> flwang1: oh i see
10:04 <flwang1> :)
10:04 <brtknr> but if you remove the node from the heat stack and update the node count, magnum doesn't remove any extra nodes?
10:05 <flwang1> magnum just updates the number, because the node has already been deleted
10:05 <flwang1> it just "magically works" :)
10:06 <brtknr> i really want support for nodegroup autoscaling
10:06 <flwang1> anyway, i think we all agree resize is the right way to do this
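[Editor's note] The difference being discussed is that magnum's resize API lets the caller name the exact nodes to delete, while the old update API only takes a count. A hedged sketch of the resize request body (my reading of the `POST /v1/clusters/<cluster-id>/actions/resize` call; the autoscaler itself is Go, and field details should be checked against the API reference):

```python
def resize_request(node_count, nodes_to_remove=None, nodegroup=None):
    """Build the JSON body for magnum's cluster resize action.

    Unlike a plain `cluster update` of node_count, resize can carry a
    nodes_to_remove list, so a scale-down can target the specific
    instance the autoscaler picked instead of an arbitrary one.
    """
    body = {"node_count": node_count}
    if nodes_to_remove:
        body["nodes_to_remove"] = nodes_to_remove
    if nodegroup:
        body["nodegroup"] = nodegroup
    return body

# scale down from 3 to 2, removing one specific node
resize_request(2, nodes_to_remove=["node-2-instance-id"])
```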
10:06 <brtknr> happy to work on this, but i will need to learn golang first
10:06 <flwang1> brtknr: then show me the code :D
10:06 <flwang1> let's move on
10:06 <flwang1> i'm going to close this meeting now
10:07 <flwang1> we can discuss the out-of-box storage class offline
10:07 <brtknr> #topic out of box storage class?
10:07 <flwang1> brtknr: it's related to this one https://review.opendev.org/676832
10:07 <flwang1> i proposed it before
10:07 <brtknr> i see dioguerra lurking in the background
10:09 <flwang1> :)
10:09 <flwang1> let's end the meeting first
10:09 <flwang1> #endmeeting
10:09 *** openstack changes topic to "OpenStack Containers Team | Meeting: every Wednesday @ 9AM UTC | Agenda: https://etherpad.openstack.org/p/magnum-weekly-meeting"
10:09 <openstack> Meeting ended Wed Feb 12 10:09:13 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
10:09 <openstack> Minutes:        http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.html
10:09 <openstack> Minutes (text): http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.txt
10:09 <openstack> Log:            http://eavesdrop.openstack.org/meetings/magnum/2020/magnum.2020-02-12-09.00.log.html
10:09 <brtknr> flwang1: ah, the post-install manifest would solve a few problems for us!
10:09 <flwang1> i'm so happy you like it :)
10:09 <flwang1> yep, it can resolve a lot of vendor-specific requirements
10:10 <flwang1> actually, the design has been agreed by strigazi as well
10:10 <flwang1> but we didn't push it hard
10:10 <flwang1> now we have users asking for it again, so i'm trying to pick it up and try again
10:11 <brtknr> flwang1: i'd really like the option to also provide this as a label
10:11 <brtknr> rather than magnum.conf
10:12 <brtknr> rather than just magnum.conf
10:13 <flwang1> you mean put the URL as a label?
10:14 <flwang1> so try to get the label first, and if that fails, fall back to magnum.conf?
10:15 <brtknr> flwang1: yep
10:15 <flwang1> please leave a comment, then you will get a new patch set ;)
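[Editor's note] The label-first, config-fallback lookup brtknr asks for could be sketched like this. The label name `post_install_manifest_url` and the config default are hypothetical; the actual patch may name them differently:

```python
DEFAULT_MANIFEST_URL = None  # would come from magnum.conf in practice

def post_install_manifest_url(cluster_labels, conf_default=DEFAULT_MANIFEST_URL):
    """Resolve the post-install manifest URL: a per-cluster label wins,
    otherwise fall back to the operator-wide magnum.conf setting."""
    return cluster_labels.get("post_install_manifest_url") or conf_default

# per-cluster override via label
post_install_manifest_url({"post_install_manifest_url": "https://example.com/extra.yaml"})
# no label set: the operator default applies
post_install_manifest_url({}, conf_default="https://example.com/site-default.yaml")
```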
10:16 <brtknr> i've already done it
10:16 <flwang1> wonderful
10:16 <flwang1> i have to head off now
10:17 <flwang1> thank you for joining the meeting, brtknr
10:17 <brtknr> okay, it was good to catch up!
10:17 <brtknr> goodnight flwang1
10:17 <flwang1> the best thing i did in 2019 was inviting you to join the magnum team
10:18 <flwang1> just wanna say thank you for all you have done for Magnum
10:19 <brtknr> flwang1: aw :) i am humbled! i have learnt a lot from you and strigazi!
10:19 <flwang1> brtknr: cheers, ttyl
10:35 <dioguerra> brtknr: i always lurk in the background, but always late :(
11:25 <brtknr> dioguerra: :) can i ask you some questions re node_count? i am wondering what forces node_count to be a minimum of 1
