Wednesday, 2023-02-15

opendevreviewOpenStack Proposal Bot proposed openstack/magnum master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/magnum/+/871939 04:25
jakeyipI've never tried that TBH08:53
jakeyiphi all08:53
jakeyipplease add to agenda08:54
jakeyipplease add to agenda https://etherpad.opendev.org/p/magnum-weekly-meeting 08:54
daleeshi jakeyip, all08:59
jakeyiphi dalees :)08:59
jakeyip#startmeeting magnum09:00
opendevmeetMeeting started Wed Feb 15 09:00:34 2023 UTC and is due to finish in 60 minutes.  The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.09:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.09:00
opendevmeetThe meeting name has been set to 'magnum'09:00
jakeyip#topic Roll Call09:00
jakeyipo/09:00
daleeso/09:01
travissotoo/09:01
jakeyip#link https://etherpad.opendev.org/p/magnum-weekly-meeting 09:02
jakeyipPlease feel free to populate the agenda09:02
jakeyiphi travissoto 09:03
jakeyipthanks everyone for coming to the meeting. feel free to join in at anytime.09:03
jakeyip#topic PTG09:03
travissotohi all09:04
jakeyipThere are two PTGs this time: (1) Virtual PTG (March 27-31) and (2) OpenInfra Summit + PTG (June 13-15)09:04
jakeyipdoes everyone have a preference for PTG?09:05
jakeyipunfortunately I probably will not be able to make it to OpenInfra09:05
jakeyipif there is no hard preference I may book something for March and see if there will be interest closer to the date09:06
daleeslikewise, virtual is preferred this time for me.09:06
travissoto+109:06
jakeyipok09:07
jakeyip#action jakeyip to book Virtual PTG09:07
jakeyip#topic Antelope supported versions09:07
jakeyipThanks everyone for the patches to make FCOS 36-37 work 09:08
jakeyipand also k8s v1.2409:08
jakeyipA common issue for users is that they are unsure which versions of FCOS / K8S are supported. I have recently fixed up the docs to reflect that09:09
jakeyip#link https://docs.openstack.org/magnum/latest/user/#supported-versions 09:09
jakeyipI would like to propose that we say FCOS 36/37 + k8s v1.24 is supported for this cycle.09:10
jakeyipwhat does everyone think?09:10
daleesI've also passed conformance for 1.25, and had 1.26 mostly running (but I have not reviewed kube-system pod versions yet).09:10
daleessounds good, 1.24 is still supported k8s version.09:10
jakeyipdalees: do the default labels work?09:11
daleesjakeyip: unlikely, that was the other topic I'd like to discuss. We bump a large number of things and the defaults are way out of date now (calico etc)09:11
jakeyipdalees: yeah that is a problem. lots of discussion needed there :)09:12
jakeyipok if we are in agreement let's just target 1.24 and 1.25 as stretch goal ;)09:12
jakeyip#info Antelope supported version FCOS 36/37 and Kubernetes v1.2409:13
daleesmaybe we can share (or update) working template labels if we have the 1.24 locked in. it's hard to know to update the default without the version (1.24) set in place.09:13
jakeyipwe can target tests for these versions, and we can have labels that work possibly out of the box (later discussion)09:13
jakeyipgreat :)09:13
jakeyip#topic Deprecation09:14
jakeyipAs Antelope is about to come to an end, I am in a bit of a hurry to mark things as deprecated this cycle, so to allow them to be removed in 2 cycles' time09:15
jakeyip#topic Deprecate Fedora Atomic for Kubernetes09:16
jakeyip#link https://review.opendev.org/c/openstack/magnum/+/833949 I see a few +1, I think we can get this in this cycle. Thanks dalees :)09:16
jakeyip#topic Deprecate Swarm09:16
jakeyipis anybody still using swarm? or not?09:16
daleesnot us.09:17
jakeyipwe are not using Swarm at all, and I'm not even sure it works.09:17
jakeyipOK if there is someone using Swarm please feel free to email me. If not I will drop a mail on the ML to see who is using and wants to take up maintenance09:18
jakeyip#action jakeyip Propose deprecation of Swarm to ML09:18
dalees+1 needs a mailing list post, and then propose removal if it's not relevant anymore.09:19
jakeyip:)09:19
jakeyipif we can get Fedora Atomic out there is lots of code we can remove09:19
jakeyip#topic python-magnumclient intermittent failures after tox409:20
jakeyipSo I tried updating to tox4 format, but weirdly I am getting intermittent failures running `tox -e py38` with those changes.09:21
jakeyipit fails in check too09:21
jakeyipwe have patches stuck because of this. if someone can help it'll be great09:22
jakeyipalright we went through the previous items pretty quickly, are there questions for those items, dalees / travissoto  ?09:23
daleesi'll see if i can look into those failures with magnumclient, the "intermittent" part is concerning.09:24
travissotono not from me at this stage :)09:24
jakeyipthanks dalees 09:24
jakeyipdalees: do you want to discuss prometheus helm chart now? 09:25
daleesyeah, sure.09:26
jakeyip#topic Prometheus helm charts09:26
daleesso the prometheus/grafana stack is installed into kube-system namespace with helm if monitoring_enabled is set. 09:26
daleesand this breaks in 1.2209:27
daleeswe(actually, travissoto ) replaced the helm charts with the newer, completely different ones from kube-prometheus-stack.09:27
daleesdo others want or use these, or should we keep this patch local and remove the complexity from Magnum?09:28
jakeyipI think I tried the default one and gave up :)09:29
jakeyipdalees: do you install it for all your users?09:30
daleesour templates enable it by default yes, but many turn it off and install their own monitoring stacks.09:30
jakeyipyeah ok09:31
jakeyipwe don't install it by default IIRC and I suspect users might prefer their own09:32
daleesas we refactor to CAPI, we're going to consider if and how we keep it. It's a big job to keep it maintained in Magnum codebase.09:32
jakeyipof course having more things work out of the box is great for users, but keeping them up to date is an issue09:33
jakeyipand the more there is in the codebase, the more we are responsible for testing09:33
jakeyipwe don't have good tests for that now (?) so merging changes will be difficult without tests09:34
daleesyeah, with k8s 1.24 as supported i suspect that won't work out of the box.09:35
jakeyipI propose that we should remove it if it is broken. What does everyone think?09:35
travissotoagree better to remove it09:35
daleesok by me09:36
daleeslet's confirm it's broken first.09:36
dalees(it is for us, but i want to be sure it's not our local patches)09:37
jakeyipOK. can you help to confirm in devstack and send up a change to remove it if it is?09:37
jakeyipI may take a look later too09:37
daleesok09:38
jakeyip#action dalees / travissoto to confirm prometheus helm chart is broken, and propose patch to remove if it is09:39
jakeyip#topic Container Labels09:39
jakeyipbig topic ;)09:39
jakeyipI guess executive summary is: default labels may be broken, but updating them may break existing cluster templates that do not set them09:40
jakeyipwhat are our options?09:40
jakeyipoh oh 09:42
daleesyep, that's pretty difficult. making a template that relies on defaults that may change isn't a great experience.09:43
daleeswe've a similar issue with manifests, if someone updates the calico manifests for version v1.23 and we're still allowing users to create calico v1.13 their new clusters all break.09:44
daleesI've resolved this by copying all templates that change and picking them up with version matches (as seen in https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_fedora_coreos_v1/templates/kubecluster.yaml#L1424 )09:46
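The version-matching approach dalees describes can be pictured with a minimal sketch. It is not Magnum's actual logic (the linked template does this inside a Heat template); the table contents, directory names, and function names below are hypothetical and only illustrate picking a manifest set by version range.

```python
# Hypothetical sketch: choose a manifest directory by matching the requested
# add-on version against known minimum versions. All names and values here
# are invented for illustration and are not taken from Magnum.

def parse_version(tag):
    """Turn a tag such as 'v3.21.2' into a tuple of ints: (3, 21, 2)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))

# Ordered newest first: (minimum major/minor, manifest directory).
MANIFESTS = [
    ((3, 21), "calico/v3.21.x"),
    ((3, 13), "calico/v3.13.x"),
]

def select_manifest(tag):
    """Return the newest manifest directory whose minimum version is satisfied."""
    version = parse_version(tag)
    for minimum, directory in MANIFESTS:
        if version[:2] >= minimum:
            return directory
    raise ValueError(f"no manifest available for {tag}")
```

Keeping one copy of the manifests per version range means a newer copy can be added without changing what an older, still-permitted version resolves to.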
jakeyiplook on the bright side, new cluster breaking is better than current cluster breaking09:47
daleesback to labels topic though - i think defining as many labels as possible in a template helps, which is what we end up doing. it means the defaults don't apply.09:47
jakeyipthat's what we do too because the defaults are just too old09:48
daleesso who has the problem of user templates breaking? do we just run with it and update them to match the current k8s (1.24 right now)? and produce example templates that can be published with little chance of breaking?09:49
jakeyipI did have our organisation templates broken before, that's why I learnt to pin as many labels as possible09:50
daleesyou mentioned breaking existing clusters - how are these labels ever re-applied to a running cluster? the upgrade or scaling process doesn't do it (if we're talking kube-system container images). Existing Heat stacks stay the same. 09:51
jakeyipyeah sorry I mean existing cluster _templates_, I might have typo09:51
daleesah ok; just checking I understood properly09:52
daleesthis type of problem may not go away with CAPI. we still need some concept of cluster templates.09:53
jakeyipyeah, I feel the least disruptive is to leave the defaults alone and document what works for the current versions09:55
daleesbut then you sacrifice the "works out of the box" experience, if that is  the goal.09:55
jakeyipupdating the labels in code is an impossible task. we can push them to latest in Antelope, but by the time an organisation installs / upgrades to Antelope they will be out of date already09:56
daleesyou could remove all defaults and force them to be specified in template labels :)09:56
jakeyip:)09:56
jakeyipfor CAPI? :)09:56
daleesit's worth considering yeah.09:57
daleesthen you only maintain versions in one place, and they match the k8s version09:57
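The "no defaults, everything specified in the template" idea could look roughly like the check below. This is a sketch under assumptions: the set of required labels is an illustrative subset chosen for the example, and `missing_labels` is a hypothetical helper, not an existing Magnum function.

```python
# Hypothetical sketch: reject a cluster template unless it pins every label
# in a required set, so that no in-code default is ever applied.
# The label names are an illustrative subset only.
REQUIRED_LABELS = {"kube_tag", "calico_tag", "coredns_tag"}

def missing_labels(labels):
    """Return the required labels that the template does not set, sorted."""
    return sorted(REQUIRED_LABELS - set(labels))
```

A template that sets only `kube_tag` would be reported as missing `calico_tag` and `coredns_tag`, which moves the version choice out of the code and into the template.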
jakeyipyeah agree much nicer09:58
jakeyipI guess what we can do better is document it. I've heard complaints :)09:58
jakeyipwe are almost out of time. any other topic?09:58
daleeskeen to hear others' ideas on the topic, who aren't in the meeting but involved in Magnum.09:59
daleesI've got some for another week, but they can wait. thanks for the discussion09:59
jakeyipme too. let's hold this regularly and more may join10:00
jakeyipThanks dalees and travissoto for coming10:00
jakeyip#endmeeting10:01
opendevmeetMeeting ended Wed Feb 15 10:01:03 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)10:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-02-15-09.00.html 10:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-02-15-09.00.txt 10:01
opendevmeetLog:            https://meetings.opendev.org/meetings/magnum/2023/magnum.2023-02-15-09.00.log.html 10:01
jakeyipdalees / travissoto : how many people from catalyst are working on Magnum?10:07
daleesjakeyip: the two of us currently, with another learning the internals and a couple of others in the immediate team. We cover other services also, so not just Magnum.10:11
daleesjakeyip: how about your org?10:12
jakeyipdalees: generally only me. yes similarly I also help in other services.10:15
opendevreviewMatthew Heler proposed openstack/magnum master: Support multi AZ for k8s multi masters  https://review.opendev.org/c/openstack/magnum/+/714347 10:51
supamattdalees: do you have a WIP patch of those prometheus changes available somewhere? 11:21
guilhermesp_____hey jakeyip just saw your reply on the conformance email. Yeah i think that could be PSP in fact. kube-apiserver fails to start with a rancher-1.25.* image and magnum master12:35
guilhermesp_____Feb 14 20:24:06 k8s-cluster-dgpwfkugdna5-master-0 conmon[119164]: E0214 20:24:06.615919       1 run.go:74] "command failed" err="admission-control plugin \"PodSecurityPolicy\" is unknown"12:35
guilhermesp_____if you want a full trace of the logs, dont hesitate i can share them all :) 12:37
mnasiadkaoops, forgot about the meeting13:39
opendevreviewTyler proposed openstack/magnum master: Update devstack plugin with capi management  https://review.opendev.org/c/openstack/magnum/+/872755 13:47
opendevreviewTyler proposed openstack/magnum master: Update devstack plugin with capi management  https://review.opendev.org/c/openstack/magnum/+/872755 16:58
supamattguilhermesp_____: I have k8s 1.26.1 working, you need to remove PodSecurity from the admission list. This can be done with a label.17:23
opendevreviewTyler proposed openstack/magnum-tempest-plugin master: DNM: WIP Get tests passing on cluster-api  https://review.opendev.org/c/openstack/magnum-tempest-plugin/+/872759 17:38
opendevreviewTyler proposed openstack/magnum master: Update devstack plugin with capi management  https://review.opendev.org/c/openstack/magnum/+/872755 17:43
opendevreviewElod Illes proposed openstack/python-magnumclient master: DNM: dummy change to test gate health  https://review.opendev.org/c/openstack/python-magnumclient/+/874014 19:37
daleesguilhermesp_____: supamatt: Ah! I hadn't realized we'd defaulted our admission controller to remove PodSecurityPolicy which is allowing 1.25 to function. Goes back to the labels discussion earlier in meeting - I'll propose a changeset to Magnum to remove it, but it may be something that needs to be specified per k8s version (1.20 clusters might20:36
daleeslike it, and 1.25 cannot have it).20:36
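The per-version behaviour dalees raises can be sketched as follows. This is an illustration only, not the change proposed to Magnum: `admission_plugins` is a hypothetical helper, and the sketch assumes the Kubernetes version arrives as a tag string such as `v1.25.3`. It relies on the fact, consistent with the kube-apiserver error quoted above, that the PodSecurityPolicy admission plugin no longer exists from Kubernetes 1.25 onward.

```python
# Hypothetical sketch: filter an admission plugin list by Kubernetes version.
# PodSecurityPolicy is kept for older clusters and dropped for 1.25+,
# where kube-apiserver rejects it as an unknown plugin.

def admission_plugins(kube_tag, plugins):
    """Return the plugin list that is valid for the given Kubernetes tag."""
    major, minor = (int(part) for part in kube_tag.lstrip("v").split(".")[:2])
    if (major, minor) >= (1, 25):
        return [name for name in plugins if name != "PodSecurityPolicy"]
    return list(plugins)
```

With a single static default, either the 1.20 or the 1.25 case has to be overridden by a label; a version-aware selection like this is one way to avoid that, at the cost of more logic to maintain.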
daleessupamatt: I'll see what we can do to share the prometheus changes, happy to.20:37
opendevreviewDale Smith proposed openstack/magnum master: Remove PodSecurityPolicy from default admission controller list  https://review.opendev.org/c/openstack/magnum/+/874031 20:58
dalees^ created this for discussion. It's directly related to the meeting discussion on default labels - perhaps publishing example magnum templates for each k8s version is the way to go, instead.20:59