Tuesday, 2024-10-08

clarkbmeeting time19:00
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Oct  8 19:00:11 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/Y55UWOXI5A5Z25CV5MKKQ7XWODWHBQ73/ Our Agenda19:00
clarkb#topic Announcements19:00
clarkb#link https://www.socallinuxexpo.org/scale/22x/events/open-infra-days CFP for Open Infra Days event at SCaLE is open until November 119:00
clarkbstill plenty of time to submit for SCaLE if you are interested19:00
clarkbalso stuff is still not decided yet but it's looking likely i'll be out for at least some of next week. I'll update if anything becomes concrete19:01
clarkbif I am not around for our meeting I'll defer to others on whether or not we should have it19:01
clarkb#topic Switch OpenDev Zuul Tenants to Ansible 9 by default19:03
fungii will probably not be around for the meeting next week, as i'll be attending openinfra days north america on tuesday and wednesday (flying/driving monday and thursday)19:03
clarkball the more reason to skip it if I'm not around19:03
clarkb#link https://review.opendev.org/c/openstack/project-config/+/93132019:03
clarkbthis is something that we roughly agreed to merge nowish yesterday and haven't seen any objections to19:03
clarkbI'll go ahead and approve it in the next few minutes if there are no last minute objections that we need to deal with19:04
clarkbbut tldr is opendev and zuul are using ansible 9 by default for some time and it generally works19:04
clarkbtesting with devstack+tempest also works19:04
clarkbso I expect it to generally be fine19:05
fungilgtm!19:05
clarkband approved19:05
clarkb#topic Rocky Package Mirror Creation19:06
clarkbStill no change on this. We can probably drop it from the agenda until a change shows up. But I wanted to remind NeilHanlon to feel free to reach out with questions if there are any19:06
fungisounds good, thanks19:06
clarkb#topic Rackspace's Flex Cloud19:07
fungi...is awesome, really19:07
clarkband there has been progress here19:07
* fungi is not a paid spokesperson19:07
clarkbcorvus set up swift usage using application credentials. The main missing functionality is acls to restrict credential access to specific services or actions. And that may exist we just need to do more testing19:08
clarkbtl;dr is there are two approaches for this. The first is to create a dedicated user for swift and then use swift acl functionality to limit access19:08
clarkbit isn't clear if the integration between old rax user creation and new rax flex swift enables this. They said try it and let them know how it goes19:09
fungiwhich we're told is expected to work, but that we should let them know if not19:09
fungiyeah, that19:09
clarkbthe other approach is to use keystonemiddleware based acls on application credentials which would live entirely within rax-flex and not involve the old auth/user stuff at all19:09
fungiwhich seems preferable if it can be made to work, since it doesn't rely on proprietary apis19:10
clarkbhowever for this to work rax-flex would need to configure keystonemiddleware in swift and we are not sure if they have done that. But again probably something we can test to see if it works the way we expect and provide feedback if not19:10
corvusif anyone else wants to try these things, i would welcome that19:10
clarkbother than that swift in rax-flex is working and corvus has been pushing images to it successfully19:10
fungiand is therefore in theory more portable to other openstack providers too19:10
clarkbcorvus: ack19:10
clarkbwhich takes us to our next topic19:11
clarkb#topic Zuul-launcher image builds19:11
clarkbwe are successfully building images and uploading them to swift with a 72 hour expiration time. The next step is to deploy the zuul-launcher service to consume that job output and upload to clouds?19:11
clarkbI can't remember if I have reviewed that change now but realize I really should if not19:11
clarkbhttps://review.opendev.org/c/opendev/system-config/+/924188 it did merge yay19:12
clarkbthat didn't deploy a zuul-launcher yet though just built out the tooling to configure one?19:12
corvusthe 72 hour thing is almost done19:12
corvusyes, someone should launch.py the node soon-ish19:12
corvusthen we need this patch in zuul: https://review.opendev.org/93120819:13
fungii can try to look at that tomorrow, emergencies and my other work depending19:13
corvusminimal openstack driver19:13
fungistart with zuul-launcher01.opendev.org presumably19:13
clarkb#link https://review.opendev.org/931208 Openstack driver for the zuul-launcher subsystem19:13
corvusonce that's all in place, then just a bit more zuul configuration change to tell it to go to work19:13
clarkbfungi: or zl01 to mimic how nodepool was done19:13
corvusfungi: maybe "zl01" -- short like the others19:14
fungier, zl01 yes, i should have looked back at the change in gerrit ;)19:14
corvusoh yeah, we would have named it there :)19:14
fungiit already has an entry in hiera/common.yaml19:14
fungiwfm19:14
fungihopefully tomorrow morning i can get it booted and push up inventory and dns additions for that19:15
clarkbthe other bit of work is we can also add jobs to opendev/zuul-jobs for different image builds19:15
clarkbcurrently we've just got debian bullseye in there19:15
clarkbfungi: thanks!19:15
clarkbanything else on this topic?19:16
corvusnot from me, thx19:16
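For reference, the 72 hour expiration mentioned in this topic can be expressed to Swift as a relative lifetime in seconds. The sketch below assumes the upload sets Swift's X-Delete-After header; the log does not say which mechanism the image build job actually uses, and the helper name expiry_headers is hypothetical.

```python
from datetime import timedelta


def expiry_headers(lifetime: timedelta) -> dict[str, str]:
    """Build request headers asking Swift to delete an object after `lifetime`.

    Assumption: the uploader passes these headers on the object PUT.
    """
    seconds = int(lifetime.total_seconds())
    if seconds <= 0:
        raise ValueError("lifetime must be positive")
    # X-Delete-After takes a number of seconds relative to the upload time.
    return {"X-Delete-After": str(seconds)}


headers = expiry_headers(timedelta(hours=72))
print(headers)
```

A relative header like this keeps the expiry tied to each upload, so every new image build gets its own 72 hour window without any separate cleanup job.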
clarkb#topic Updating ansible+ansible-lint versions in our repos19:16
clarkb#link https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/926970 is the last current open change related to this work19:16
clarkbthis change has been open for long enough that I have had to fix arm image building twice to get it to pass ci :)19:17
clarkbnow that the openstack release is behind us is anyone else interested in reviewing this change? I think it would be good to land it while things are stableish19:17
clarkbthat will set us up with modern tooling that we hopefully don't have to modify much for a bit19:17
clarkbif there isn't interest in additional review maybe we should go ahead and approve it once we're satisfied ansible 9 hasn't set anything on fire19:18
fungiyeah, the longer that sits open the more likely it is to bitrot19:19
clarkbok I'll probably +A this afternoon if the ansible update seems stable enough19:19
clarkb#topic OpenStack OpenAPI spec publishing19:19
clarkb#link https://review.opendev.org/92193419:20
clarkbI checked this change yesterday and haven't seen any review responses yet19:20
clarkbfungi: how do you want to approach this? Bring it up at the PTG or just reach out directly now or something else maybe?19:20
fungii expect it will resurface in the sdk team's ptg sessions, since that's where it last arose19:21
fungii'm fine dropping it from the agenda until it comes up again19:21
fungii mainly wanted to make sure we had some eyes on the proposal19:21
clarkbsounds good19:21
clarkb#topic Backup Server Pruning19:21
clarkbThe smaller of our two backup servers needs pruning roughly every 2.5 weeks at this point19:21
fungithis has been an increasingly frequent task, albeit an easy one19:22
clarkba quick investigation shows there are a number of old server backups sitting on that server19:22
clarkbI didn't du the directories but I suspect they contain at least some backup data that we could cleanup19:22
clarkbask01, ethercalc02, etherpad01, gitea01, lists, review-dev01, and review0119:22
clarkbthis is the list I identified. Either services that went away completely or services that had their primary server replaced with a new name19:23
clarkbI think there are two options here to improve the disk usage and space situation on this server. We can either A) remove those server specific backup dirs and the corresponding login details and clean things up in place or B) add a new volume to the server and update the mounts so that we're backing up into a clean location19:24
fungiwe made filesystem snapshots of each server as they were deleted, so i'm okay with removing their backups at this point if we've needed nothing from them all this time19:24
clarkbthen at some point later in the future we can delete the old volume19:24
fungibut also, yes, rotating the volume would have a similar effect19:24
fungikeep in mind something similar will arise with the wiki replacement19:25
clarkbif we do a volume rotation we may need to ensure that the ansible run for backup servers runs before we try to do backups to reinstall the backup target locations19:25
clarkbso either way I suspect there is a bit of intervention to do. Personally I think the in place cleanup seems simpler and easier to test (because we don't have to fuss with external things like volumes and coordinate mount moves)19:25
clarkbanyone else have an opinion on what they feel is safest/best for our backups?19:26
clarkbin that case I guess we can try some in place cleanup. Probably for a less important service first like ethercalc then decide if we should switch to the volume replacement if that doesn't go smoothly19:28
clarkbDepending on how my week goes with other stuff I'll see if I have time to do that19:28
clarkb(too much is in the air and I'm trying to avoid overcommitting...)19:29
clarkb#topic Mailman 3 Upgrade19:29
clarkb#link https://review.opendev.org/c/opendev/system-config/+/930236 will upgrade our installation of mailman 3 to the latest versions19:29
fungispeaking of overcommitting ;)19:29
clarkbfungi has pushed a change to upgrade our mailman 3 installation to the latest version19:29
clarkbthe upgrade tasks themselves like db migration should be handled by the containers19:29
fungii'm happy to babysit this as it deploys and do some in-production tests in case there are any problems not caught in ci19:30
clarkbso in theory we build no container images, deploy new configs, push it all and tell it to run and voila upgraded mailman19:30
clarkbs/build no/build new/19:30
clarkbfungi: any new features we should be aware of?19:30
fungimailman is a lot of separate components, albeit aggregated into a smaller number of container images. the changelogs are sort of scattered19:30
fungii can try to put together a list of links to all the various changelogs if that's of interest19:31
clarkbI don't have specific interest. More just wondering if there was anything they were advertising as new and improved19:31
fungithough also some of this is upgrading stuff like alpine and django19:31
fungithere were notable dmarc mitigation improvements for gmail19:32
clarkbfungi: are they something you have to opt into or will we automatically get that in the upgrade? I think generally we haven't had too much trouble with gmail maybe because our volume isn't too high19:32
fungito work around the fact that gmail's dkim records are a do-as-i-say-not-as-i-do situation so making assumptions about how gmail would handle messages from gmail addresses based on what's in dns turned out to be more recently problematic19:33
fungiit's something we'd have to add specific domains to an exclusion list for19:33
fungithe issue hasn't hit us yet afaik19:34
clarkbthe exclusion list says "this domain is backed by gmail treat it special"?19:34
fungithere were some additional config options for list behaviors as well which got discussed on mailman-users, but i don't recall the specifics at the moment19:34
fungiclarkb: an override list of "if addresses are at this domain then pretend that you need to do full address rewriting even if dns says it's unnecessary"19:35
clarkbgot it19:35
clarkbfungi: any thoughts on timing for the upgrade?19:36
clarkbthe changes themselves look good to me at this point19:36
funginext week would be harder for me. this week or week-after-next are preferable19:36
clarkbI expect to be around through at least thursday but probably also friday at this rate19:37
clarkbbut then ya next week not good for me either19:37
clarkbshould we just try and send it tomorrow? or is it better to wait for week after next since we're both traveling?19:38
fungitomorrow wfm19:38
clarkbcool I'll be around and can help19:39
fungii'm mostly just doing procrastinated talk prep at this point19:39
fungiyou know how it is ;)19:39
clarkbanything else mm3 related?19:39
clarkbheh yes19:39
fungii didn't have anything19:39
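The domain override list fungi describes earlier in this topic can be pictured as a check that runs before the usual DNS-based decision. This is only an illustrative sketch: needs_from_rewrite, its parameters, and the example domains are hypothetical and are not Mailman's actual API or configuration names. The only outside fact it relies on is that a published DMARC policy is one of none, quarantine, or reject.

```python
from typing import Optional


def needs_from_rewrite(
    address: str,
    dns_policy: Optional[str],
    override_domains: set[str],
) -> bool:
    """Decide whether a list should apply full address rewriting to a post.

    dns_policy is the DMARC policy published for the sender's domain
    ("none", "quarantine", "reject"), or None if no record was found.
    """
    domain = address.rsplit("@", 1)[-1].lower()
    # Override: treat listed domains as strict even if DNS says otherwise.
    if domain in override_domains:
        return True
    # Otherwise trust what the domain publishes in DNS.
    return dns_policy in {"quarantine", "reject"}


overrides = {"gmail.com"}
print(needs_from_rewrite("someone@gmail.com", "none", overrides))
print(needs_from_rewrite("someone@example.org", "none", overrides))
```

Putting the override ahead of the DNS lookup result matches the behavior described in the discussion: the published record is ignored only for domains an operator has explicitly listed.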
clarkb#topic Upgrading Old Servers19:40
clarkbUnfortunately I don't think there is anything new here19:40
clarkbstill waiting on updated patchsets for the mediawiki stack19:40
fungiit would be nice to reactivate work on the mediawiki changes yeah19:40
fungithe old server is getting overrun by crawlers19:40
clarkbI did want to make a note for tonyb that I believe some of my comments were related to avoiding restarting the container every time we run ansible against the host. fungi just made changes to the meetpad container stuff to ensure we only restart things when there are actual updates which may be useful to refer to19:41
clarkband some of our other services do similar if you grep around for that sort of ansible task19:42
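The "only restart when there are actual updates" idea from the note to tonyb boils down to comparing image identifiers before and after a pull. The sketch below shows that comparison in Python rather than as the Ansible tasks used in system-config; the function name, the service names, and the image IDs are all hypothetical.

```python
def services_to_restart(
    before: dict[str, str], after: dict[str, str]
) -> list[str]:
    """Return services whose image ID changed between two snapshots.

    Each mapping goes from service name to the image ID seen for it.
    A service missing from `before` counts as changed.
    """
    return sorted(
        name for name, image_id in after.items() if before.get(name) != image_id
    )


before_pull = {"web": "sha256:aaa", "db": "sha256:bbb"}
after_pull = {"web": "sha256:ccc", "db": "sha256:bbb"}
print(services_to_restart(before_pull, after_pull))
```

When the returned list is empty the configuration run can skip the restart step entirely, which is the no-new-container case discussed in the next topic.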
clarkb#topic Docker compose plugin with Podman19:42
funginote that the meetpad container restart changes are not yet approved, still under review19:42
clarkbfungi: I thought we landed them?19:43
clarkband we confirmed that we didn't break in the no new container case, but have yet to get an upstream release so haven't confirmed the new containers case19:43
fungioh, never mind, they did merge19:43
fungii'm clearly scattered lately19:43
clarkbthis topic is related to the previous in that last week we basically said it would be worthwhile to pick a simple but illustrative service like paste and upgrade its server to noble then set up the docker compose plugin with podman behind it19:43
fungidon't get old, the mind is the first thing to go19:44
clarkbthe motivation behind this is it would allow us to continue to use docker compose which should be a smaller migration than say podman compose but also host our images in quay and get speculative images in testing19:44
clarkbanyway I don't think anyone has started on this, but if there is interest in doing this I think this would be a good project for someone interested in helping OpenDev19:44
clarkbin particular you should be able to model everything in zuul using updates to existing ci jobs or new ci jobs19:45
clarkbthen you'd only need an existing root to boot the replacement noble server. If anyone is listening and is interested in this I'm happy to help and it's likely I'll end up pushing it along myself if others don't take an interest19:45
clarkbI think it should create good overall exposure to how opendev system-config configuration management with ansible works as well as our image building process and deployments of containers via docker compose (in conjunction with ansible coordinating things)19:46
clarkb#topic Open Discussion19:46
clarkbAnything else?19:46
funginothing on my end19:47
clarkbI'll give it until 19:50 and if nothing else comes up we can end a bit early today19:48
clarkband that is the promised time19:50
clarkbthanks everyone19:50
clarkb#endmeeting19:50
opendevmeetMeeting ended Tue Oct  8 19:50:39 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:50
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2024/infra.2024-10-08-19.00.html19:50
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-10-08-19.00.txt19:50
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2024/infra.2024-10-08-19.00.log.html19:50
fungithanks clarkb!19:50
