dalees | Hi all, meeting here in 5 minutes. | 07:55 |
jakeyip | hi dalees | 07:57 |
dalees | #startmeeting magnum | 08:00 |
opendevmeet | Meeting started Tue Jun 24 08:00:09 2025 UTC and is due to finish in 60 minutes. The chair is dalees. Information about MeetBot at http://wiki.debian.org/MeetBot. | 08:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 08:00 |
opendevmeet | The meeting name has been set to 'magnum' | 08:00 |
dalees | #topic Roll Call | 08:00 |
dalees | o/ | 08:00 |
jakeyip | o/ | 08:01 |
sd109 | o/ | 08:01 |
dalees | mnasiadka: ping | 08:01 |
mnasiadka | o/ (but on a different meeting so might not be very responsive) | 08:01 |
dalees | okay | 08:02 |
dalees | Agenda has reviews and one topic, so I moved that first. | 08:02 |
dalees | #topic Upgrade procedure | 08:02 |
dalees | jakeyip: you brought this one up? | 08:03 |
jakeyip | hi, that's mine. just want to know, does anyone have an upgrade workflow in mind? | 08:03 |
dalees | for the combination of magnum, helm driver, helm charts, capo, capi? | 08:04 |
jakeyip | yes. | 08:05 |
dalees | do you mean versions, or actual processes of performing them? | 08:06 |
jakeyip | we recently looked into upgrading capi/capo but realised there are dependencies on magnum-capi-helm and also the helm charts | 08:07 |
sd109 | capi-helm-charts publishes a dependencies.json which specifies the capi version each release of the charts is tested against: https://github.com/azimuth-cloud/capi-helm-charts/blob/affae0544b07c4b2e641b3b5bf990e561c055a91/dependencies.json | 08:07 |
dalees | yeah, we've done capi/capo upgrades, but not to the latest where capo dropped v1alpha7. it will be tricky to make sure all clusters are moved off the old helm charts first. | 08:08 |
jakeyip | sd109: are there upgrade tests being run? | 08:10 |
jakeyip | charts will provision the resources at the version specified. upgrading CTs (cluster templates) / charts _should_ upgrade those resources too? | 08:11 |
jakeyip | capi / capo upgrades also upgrade the resources too I believe, but ideally they should be upgraded by charts first? | 08:13 |
dalees | jakeyip: so the way these versions work is that one is the 'stored' version (usually the latest) and the k8s api can translate between that and any other served version (hub-and-spoke in the kubebuilder docs). | 08:14 |
dalees | so once you upgrade capo, it'll update the CRDs in k8s with the new versions, and start storing and serving those (eg v1beta1). It doesn't matter whether the charts write the old or the new crd version, as long as it's still served. | 08:15 |
dalees | so you only need to care when a version stops being served; otherwise you can keep using either the old or new crd versions. | 08:15 |
sd109 | There are upgrade tests being run, but since we haven't upgraded CAPO past v0.10 yet (due to some security group changes in v0.11 and the need for the new ORC installation in v0.12), we haven't actually tested the v0.12 upgrade that drops v1alpha7, for example | 08:16 |
dalees | kubebuilder has some good info on this if you want to read more: https://book.kubebuilder.io/multiversion-tutorial/conversion-concepts.html | 08:16 |
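As a concrete illustration of the served/stored distinction, here is a minimal sketch using the kubernetes Python client (the same ecosystem the driver builds on), assuming a kubeconfig pointing at the CAPI management cluster; it surfaces the same served/stored information as the `kubectl get crd` command mentioned later in the meeting.

```python
# Sketch: inspect which API versions of the CAPO OpenStackCluster CRD
# are served, which one is the storage ("hub") version, and which
# versions may still exist in etcd. Assumes a kubeconfig for the
# CAPI management cluster.
from kubernetes import client, config

config.load_kube_config()

crd_name = "openstackclusters.infrastructure.cluster.x-k8s.io"
crd = client.ApiextensionsV1Api().read_custom_resource_definition(crd_name)

for v in crd.spec.versions:
    print(f"{v.name}: served={v.served} storage={v.storage}")

# status.storedVersions lists every version objects may still be stored
# as; these matter when a CAPO release stops serving a version.
print("storedVersions:", crd.status.stored_versions)
```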
jakeyip | in this case, the chart will be using an older version, what happens when you do a helm upgrade? | 08:16 |
dalees | sd109: ah, that's helpful to know you haven't gotten there either! | 08:17 |
sd109 | When you do a helm upgrade, it will upgrade the resources to new v1beta1 thanks to https://github.com/azimuth-cloud/capi-helm-charts/pull/423 | 08:18 |
jakeyip | no, I mean a `helm upgrade` that changes values but not the chart, which happens when you resize etc | 08:19 |
dalees | if helm tries to talk to capo with an old crd version that isn't served it'll fail. | 08:20 |
dalees | i.e. new capo (after v1alpha7 is no longer served) and an old chart that specifies v1alpha7. | 08:21 |
jakeyip | and in the case where it is still served but not the latest? | 08:21 |
dalees | if it's served, it'll just translate to v1beta1 and store as that. | 08:21 |
jakeyip | ok | 08:22 |
dalees | that's the hub-and-spoke model the kubebuilder docs talk about. (I *think* it changes the stored version in etcd on first write after the controller upgrade) | 08:22 |
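To make the hub-and-spoke point concrete: while a version is still served, the same object can be read at any served version and the apiserver converts on the fly. A hedged sketch, with placeholder namespace and cluster names:

```python
# Sketch: read the same OpenStackCluster at two API versions. While both
# are served, the apiserver converts between them transparently; once a
# version stops being served, the same call returns 404 (which is what
# an old chart or driver would hit). "magnum-system" and "my-cluster"
# are placeholders.
from kubernetes import client, config
from kubernetes.client.rest import ApiException

config.load_kube_config()
api = client.CustomObjectsApi()

for version in ("v1alpha7", "v1beta1"):
    try:
        obj = api.get_namespaced_custom_object(
            group="infrastructure.cluster.x-k8s.io",
            version=version,
            namespace="magnum-system",
            plural="openstackclusters",
            name="my-cluster",
        )
        print(version, "served, apiVersion:", obj["apiVersion"])
    except ApiException as exc:
        print(version, "not available:", exc.status)
```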
jakeyip | I'll read that | 08:23 |
sd109 | So I guess we need to make sure all user clusters are using a new enough version of capi-helm-charts to be on v1beta1 before we upgrade CAPO on the management cluster | 08:23 |
jakeyip | yeah, and the driver also needs to be upgraded to talk v1beta1 first | 08:24 |
jakeyip | I _think_ the sequence is something like: driver -> charts -> cluster templates -> all clusters -> capi+capo? | 08:26 |
dalees | yeah, that sounds right - for the version of capo that drops v1alpha7 (was it v0.10?). | 08:27 |
sd109 | I think the only place which needs updating in the driver is here: https://opendev.org/openstack/magnum-capi-helm/src/commit/60dc96c4dae8628e92c20b1ca594c4cf10eba5e4/magnum_capi_helm/kubernetes.py#L289 | 08:27 |
dalees | you do need capo at least up to a version that supports v1beta1 first (but that will be most of the installs) | 08:27 |
sd109 | And as far as I can tell that actually only affects the health_status of the cluster, because it gets a 404 when trying to fetch the v1alpha7 version of the openstackcluster object from the management cluster | 08:28 |
sd109 | It was v0.12 that dropped v1alpha7 in CAPO | 08:28 |
dalees | ah v0.12, thanks. | 08:28 |
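For context, a rough sketch of the shape of the change being discussed; this is not the actual patch (the real code is at the kubernetes.py link above) and the names here are illustrative. The driver hard-codes the CAPO group/version it uses to fetch OpenStackCluster objects, so moving to v1beta1 is essentially changing that constant:

```python
# Illustrative only: the real constant lives in
# magnum_capi_helm/kubernetes.py (see sd109's link above).
CAPO_GROUP = "infrastructure.cluster.x-k8s.io"
CAPO_VERSION = "v1beta1"  # was "v1alpha7"; requires a CAPO release that
                          # already serves v1beta1 (most installs do)

def openstackcluster_path(namespace, name):
    """Apiserver path the driver GETs when reporting health_status."""
    return (f"/apis/{CAPO_GROUP}/{CAPO_VERSION}"
            f"/namespaces/{namespace}/openstackclusters/{name}")
```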
jakeyip | I think there were more issues than that but I can't remember now | 08:29 |
dalees | jakeyip: you had a patchset for that - did it merge? | 08:30 |
jakeyip | for magnum? | 08:30 |
jakeyip | sorry for the driver? I didn't merge it yet I think | 08:30 |
dalees | ah this one - https://review.opendev.org/c/openstack/magnum-capi-helm/+/950806 | 08:31 |
dalees | sd109: would you have a look at that soon? | 08:31 |
sd109 | Can we move that to v1beta1 now instead of v1alpha7? | 08:31 |
dalees | yeah, I'd prefer that | 08:32 |
jakeyip | can we / should we jump more than two versions at once? | 08:32 |
dalees | well, it blocks upgrade to capo 0.12 if we don't | 08:32 |
jakeyip | does capo serve more than 2 versions at once? | 08:33 |
dalees | It's probably worth noting the version restrictions, but yes they do. | 08:34 |
dalees | (i need to look up these particular versions of capo) | 08:35 |
jakeyip | ok I really need to try it out first then report back | 08:35 |
jakeyip | happy to skip ahead while I take a look at capo, then come back later | 08:35 |
sd109 | There are some CAPO docs on API versions here which suggest CAPO does some kind of automatic migration to new API versions for us: https://cluster-api-openstack.sigs.k8s.io/topics/crd-changes/v1alpha7-to-v1beta1#migration | 08:35 |
sd109 | I also need to go away and have a closer look so happy to move on for now | 08:36 |
dalees | if you would like a whole lot of yaml to read through, all the info about your capo version is found in: `kubectl get crd openstackclusters.infrastructure.cluster.x-k8s.io -o yaml` | 08:36 |
dalees | look for "served: true", "stored: true" and "storedVersions" | 08:36 |
dalees | okay, we shall move on. thanks for sharing what we each know! | 08:37 |
dalees | #topic Review: Autoscaling min/max defaults | 08:37 |
dalees | #link https://review.opendev.org/c/openstack/magnum-capi-helm/+/952061 | 08:38 |
dalees | so this is from a customer request to be able to change min and max autoscaling values | 08:38 |
dalees | i think we touched on this last meeting, but needed more thinking time. | 08:39 |
dalees | perhaps it's the same again | 08:39 |
sd109 | Yeah sorry I haven't had time to look at that one yet, I'm hoping to get to it this week and will leave any comments I have on the patch itself | 08:40 |
jakeyip | hm I thought I reviewed that but seems like I didn't vote | 08:40 |
dalees | thanks both, we can move on unless there are things to discuss. please leave review notes in there when you get to it. | 08:41 |
jakeyip | I will give it a go again | 08:41 |
jakeyip | oh it's in draft :P | 08:42 |
dalees | #topic Review: Poll more Clusters for health status updates | 08:42 |
dalees | https://review.opendev.org/c/openstack/magnum/+/948681 | 08:43 |
jakeyip | I will review this | 08:43 |
dalees | so this one, I understand, will add polling load to the conductor for a large number of clusters | 08:43 |
dalees | we've been running it for ages, and it enables a few things I'm doing in later patchsets in the helm driver: better health_status, and pulling back node_count from autoscaler. | 08:44 |
dalees | it would be better to use watches for this than polling though. | 08:44 |
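On the watches-versus-polling point, a hedged sketch of what a watch could look like with the kubernetes Python client; this is an illustration, not a proposed patch, and the group/version/field names assume CAPO's v1beta1 API:

```python
# Sketch: a watch on OpenStackCluster objects pushes status changes to
# the consumer instead of re-reading every cluster on a timer.
from kubernetes import client, config, watch

config.load_kube_config()
api = client.CustomObjectsApi()

w = watch.Watch()
for event in w.stream(api.list_cluster_custom_object,
                      group="infrastructure.cluster.x-k8s.io",
                      version="v1beta1",
                      plural="openstackclusters"):
    obj = event["object"]
    name = obj["metadata"]["name"]
    ready = obj.get("status", {}).get("ready")
    print(f"{event['type']} {name}: ready={ready}")
    # a conductor could map events like these onto Magnum's health_status
```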
jakeyip | hm the current situation is syncing _COMPLETE, but this adds more? | 08:47 |
sd109 | Don't think I have anything to add on this one, seems like a nice addition to me but agree that the extra load is worth thinking about | 08:49 |
dalees | ah, that's true - it already is polling CREATE_COMPLETE and UPDATE_COMPLETE. So really that would be most clusters. | 08:49 |
jakeyip | yeah, I didn't understand your comment "Without the _COMPLETE I also wonder if...". | 08:51 |
dalees | huh. I think I had confused myself on what was being added. | 08:53 |
dalees | agree, _COMPLETE is already there. | 08:53 |
jakeyip | I think what you want is adding CREATE_IN_PROGRESS to surface the errors where a cluster gets stuck midway through creation, with things like autoscaler pod errors? | 08:54 |
jakeyip | maybe need to clarify the use case in the commit message, then good to go | 08:54 |
dalees | yeah I think so. thanks | 08:55 |
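For clarity, the shape of the change under discussion is roughly the following; the names are indicative, not the patch's actual code. It widens the set of cluster states the conductor's periodic health sync considers:

```python
# Illustrative only: the states whose clusters get polled for health
# updates. Adding CREATE_IN_PROGRESS surfaces clusters stuck midway
# through creation (e.g. autoscaler pod errors).
POLLED_STATES = (
    "CREATE_COMPLETE",     # already polled today
    "UPDATE_COMPLETE",     # already polled today
    "CREATE_IN_PROGRESS",  # the addition under review
)
```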
dalees | there are 3 more reviews noted in the agenda but only 5 minutes left. Any particular ones of those, or others, we might talk about? | 08:56 |
dalees | #topic Open Discussion | 08:56 |
dalees | or other topics, for the last part of the meeting | 08:56 |
jakeyip | I will look at them, maybe discuss next week | 08:57 |
jakeyip | next meeting :P | 08:57 |
sd109 | Yeah I haven't had time to look at the two Helm reviews either so I don't think we need to discuss them now | 08:58 |
dalees | all good, those helm ones are from StackHPC. John's update to his one looks good, and I want to progress Stig's one sometime as it's hurting us occasionally, but it can wait. | 08:59 |
sd109 | Great, thanks. I'm also trying to get someone from our side to progress Stig's one too, but it's proving difficult to find the time at the moment | 09:00 |
dalees | yep, i think it's promising. probably just needs to move more things (like delete) to the same reconciliation loop to avoid the conflicts. | 09:01 |
dalees | but yeah, time! | 09:01 |
dalees | #endmeeting | 09:01 |
opendevmeet | Meeting ended Tue Jun 24 09:01:27 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 09:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/magnum/2025/magnum.2025-06-24-08.00.html | 09:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/magnum/2025/magnum.2025-06-24-08.00.txt | 09:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/magnum/2025/magnum.2025-06-24-08.00.log.html | 09:01 |
dalees | thanks both for coming and sharing! | 09:01 |
jakeyip | thanks dalees | 09:02 |