Tuesday, 2025-04-22

clarkbAlmost meeting time18:58
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Apr 22 19:00:04 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/6CDVW2M7T3K4QDPYT2TKHMPIHN7TSGV6/ Our Agenda19:00
clarkb#topic Announcements19:00
clarkbI will be travelling May 6th and unable to host the regular meeting. (that is two weeks from today)19:01
clarkbI think we can either skip that one or someone else can host19:01
clarkbThat was all I had to announce. Anything else?19:01
fungii'll be travelling as well19:01
fungion that date19:01
fungiprobably worth noting that the both of us will be fairly distracted that entire week with in-person meetings19:02
clarkbya I guess we'll be mostly out Monday-Friday that week19:02
clarkbdirectly impacting this meeting but potentially other things19:03
fungibut still in approximately the same time zones and can probably still do things in an emergency19:03
fungi(with some latency)19:03
clarkbIf that is all for announcements I think we can dive into the agenda19:04
clarkb#topic Zuul-launcher image builds19:04
corvuscanceling that mtg sounds good19:04
clarkbcorvus sent email asking for volunteers to help with image builds and now we have volunteers \o/19:05
fungiit's like magic19:05
corvuswe got some volunteers!19:05
clarkbthen unrelated to ^ one thing corvus noticed dogfooding with zuul is that nodepool sets the private_ip vars to the public_ip values if there is no private_ip19:05
fungi(very formidable magic)19:05
clarkbthis was originally done to make it easy for openstack testing to avoid using NAT and always use the "private ip"19:06
corvusmnasiadka is working on arm64 stuff19:06
corvus#link arm64 image builds https://review.opendev.org/c/opendev/zuul-providers/+/94784119:06
clarkbhowever other users may find this behavior less useful or counterproductive, so for niz we discussed having public_ip* and private_ip* always be the respective public and private values. Then we can have additional vars like interface IP for "this is how you access the test node from the outside world" and maybe local_ip for interfaces that avoid NAT19:07
clarkbso there may be a small API change that jobs will need to accommodate as we shift from nodepool to niz19:07
corvusoh we're talking about the other thing?19:07
corvusneil volunteered to work on rocky image builds19:07
clarkbsorry I just started braindumping. Feel free to proceed on the volunteers and we can discuss IPs after19:07
corvuson the private_ip thing -- i advocate having the nodepool vars maintain consistent behavior, then in the future, switch to using new zuul vars that have different behavior19:08
fungithat makes sense19:09
fungibasically backward-compatible but deprecated nodepool vars19:09
corvusso we can continue with the idea that the switch to niz won't require any changes at the time19:09
corvusbut once done, jobs will be using a deprecated api that will need to change19:09
corvusat some point in the future19:09
corvusthat will require a change in zuul which is in progress19:09
corvusyep19:09
clarkbsounds good to me19:10
corvusso we shouldn't move any more tenants over until that's finished19:10
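A minimal sketch of the two behaviors discussed above, to make the difference concrete. This is not nodepool or zuul code: the function names and dictionary keys (`public_ip`, `private_ip`, `interface_ip`, `local_ip`) are hypothetical stand-ins for the variables mentioned in the discussion, and the fallback choices in the second function are assumptions.

```python
from typing import Dict, Optional


def legacy_style_vars(public_ip: Optional[str],
                      private_ip: Optional[str]) -> Dict[str, Optional[str]]:
    """Behavior described for nodepool: private falls back to public."""
    return {
        "public_ip": public_ip,
        # If the node has no private address, reuse the public one.
        "private_ip": private_ip if private_ip else public_ip,
    }


def proposed_style_vars(public_ip: Optional[str],
                        private_ip: Optional[str]) -> Dict[str, Optional[str]]:
    """Proposed behavior: report what exists, add explicit helper vars."""
    return {
        "public_ip": public_ip,
        "private_ip": private_ip,  # stays None when there is none
        # Hypothetical: address used to reach the node from outside.
        "interface_ip": public_ip or private_ip,
        # Hypothetical: address that avoids NAT when one is available.
        "local_ip": private_ip or public_ip,
    }


if __name__ == "__main__":
    # Documentation-range address, node with no private network.
    print(legacy_style_vars("203.0.113.10", None))
    print(proposed_style_vars("203.0.113.10", None))
```

Keeping the first behavior on the old variables and putting the second behind new names is what lets existing jobs run unchanged during the move.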
clarkbany other concerns with what we're learning so far?19:11
corvusnope; i also started the change to emit statsd gauges for nodes/limits so we can has graphs19:11
corvusso hopefully in a few weeks, we'll have ducks all lined up for wider rollout19:12
clarkbseems like good steady progress19:12
clarkb++19:12
corvusi think that's about it19:12
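For readers unfamiliar with the statsd gauges mentioned above, here is a rough sketch of what emitting one looks like on the wire. It is not the zuul change itself: the metric names are made up for illustration, and the host and port are assumed defaults. The only piece taken from the statsd protocol is the datagram format `<name>:<value>|g`.

```python
import socket
from typing import Dict, Union

Number = Union[int, float]


def format_gauge(name: str, value: Number) -> bytes:
    """Encode a single statsd gauge datagram: '<name>:<value>|g'."""
    return f"{name}:{value}|g".encode("ascii")


def send_gauges(gauges: Dict[str, Number],
                host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send each gauge as its own UDP datagram (fire and forget)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for name, value in gauges.items():
            sock.sendto(format_gauge(name, value), (host, port))
    finally:
        sock.close()


# Hypothetical metric names for node usage and provider limits.
EXAMPLE_GAUGES = {
    "example.provider.nodes.in_use": 12,
    "example.provider.limits.instances": 50,
}

if __name__ == "__main__":
    send_gauges(EXAMPLE_GAUGES)
```

Graphing usage against limits then only needs the two series side by side.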
clarkb#topic Container hygiene tasks19:12
clarkb#link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.1219:12
corvus(oh btw i just noticed: opendev-build-diskimage-ubuntu-noble-arm64 https://zuul.opendev.org/t/opendev/build/26bf8c4c705e40639e57208bae1f8c18 : SUCCESS in 1h 17m 29s )19:12
clarkbmaybe this topic would've been better after the next one (gerrit server move) but now that gerrit is largely done this is going to be back on my radar19:13
clarkbcorvus: nice!19:13
clarkbvery reasonable runtime too19:13
clarkbbut tl;dr is I'm still trying to update us to python3.12. Gerrit and limnoria are two big ones that are outstanding and both need a bit of care to land: we should restart gerrit on the new image once built to ensure it works, and I don't want to do that right after the server move; limnoria updates may impact meetings like this one if we interrupt a meeting19:13
clarkbI should just look at a calendar and meeting schedule and find a time to do limnoria then also plan for gerrit19:14
fungithe gerrit restart will be a good opportunity to exercise the sigint change too19:14
clarkb++19:14
clarkbso just be aware of that I guess and I'll try to get these over the finish line soonish19:15
clarkb#topic Switching Gerrit to run on Review0319:15
clarkbThis happened yesterday as scheduled and well within the hour we allocated19:15
clarkb#link https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 Notes on the migration plan19:15
fungi15 minutes downtime!19:15
clarkbI tried to keep notes there and mark things off as we went if anyone wants to look back on it19:16
clarkbthe current situation is that review02 is in the emergency file and review03 is in our inventory as a normal review server19:16
clarkbThere are a number of followup changes that we should work through as we're confident that rollback is less and less likely19:16
clarkb#link https://review.opendev.org/c/opendev/zone-opendev.org/+/947858 Reset DNS TTL19:16
clarkb#link https://review.opendev.org/c/opendev/system-config/+/947758 Drop review02 from inventory19:16
clarkb#link https://review.opendev.org/c/opendev/system-config/+/947759 Drop ignored docker compose version specifier19:16
clarkb#link https://review.opendev.org/c/opendev/system-config/+/882900 Migrate gerrit images to quay19:17
clarkbReviews very much appreciated. I'd also like to talk about that last change for a bit, but before I do I also would like everyone to look at review02 and double check there isn't anything they want to be preserved on review0319:17
clarkbfor example I had gerrit user database cleanup notes in my homedir that I copied over to review03. It wouldn't be the end of the world to go to backup to get that info but making it easy seemed nice19:17
fungialready looked, i don't need any of my files19:18
clarkbcool. Once you've done that the other thing to weigh in on is when you feel comfortable with cleaning up review02. I think we'll keep its bfv volume and data volume around longer than the instance itself. But still removing the instance is a big step19:18
clarkbdon't need answers on that now, but maybe followup in #opendev with thoughts/concerns/timeline ideas19:19
corvusi have no files19:19
funginor i now19:20
clarkbfor that last linked change I discovered that building the gerrit container image with docker fails in the move to quay because we don't set up buildkitd mirror data19:20
fungibaleeted19:20
clarkbThe latest patchset to that change updates the jobs to use podman to build the images instead19:20
clarkbit seems to work based on the fact that the system-config-run jobs manage to run gerrit and pass our test cases and take screenshots19:21
clarkbhowever I don't think we should switch only gerrit to build with podman. I think we should switch the entirety of our image builds that move to quay (leaving the old image builds that are stuck on docker using docker seems fine)19:21
fungiwfm19:22
clarkbany upfront concerns with doing that? I know corvus said earlier today that it would be good to keep zuul and opendev somewhat in sync here so that we have a battle tested consistent approach19:22
fungithat coordination makes sense, sure19:23
clarkbI can push a change up to do the switch across that set of jobs then rebase the gerrit image move onto that19:23
clarkbthat should make it more reviewable/mergeable19:23
corvuszuul does it differently currently19:24
fungisounds good, i'll prioritize reviewing those19:24
clarkbcorvus: it did look like the initial test of having zuul use podman worked though?19:24
clarkbso at least no Dockerfile content problems19:24
clarkbbut we may want to inspect the images for difference?19:24
corvusi pushed a change up to zuul to see if podman works19:25
corvushttps://review.opendev.org/94784819:25
corvuslooks like it does19:25
corvusso probably the only reason we needed those settings in the zuul project was for nodepool-builder19:25
corvuswhich we don't care about long-term19:25
clarkbya maybe we can just let nodepool-builder be special for now then phase it out as part of the zuul-launcher effort19:25
clarkbsimilar to how opendev would phase out docker builds as we move to quay19:26
corvusdoes opendev do any multi-arch builds?19:26
corvus(also, sorry, i'm laggy)19:26
clarkbcorvus: no multi arch builds that I am aware of. But double checking that is a good idea19:26
fungii'd be fine letting nodepool-builder images be unique/different until we no longer produce updates for them19:27
clarkband we don't have to commit to anything in this meeting. I just want to get this out there so we can start considering the impacts and options19:28
corvussystem-config-build-image-python-builder-3.12-bookworm looks like it's multi-arch19:28
clarkboh right the base images19:28
clarkbbecause nodepool-builder depends on them19:28
clarkbno opendev service images do multi arch, but we make the base images multiarch so that nodepool can build on them19:29
corvusoh, well if it's only for nodepool.... then nbd.19:29
clarkbcorvus: yes I think that is the only place "we" use that19:29
corvusthat wfm then.19:30
clarkbok cool sounds like we have a rough plan and I can get the opendev side organized around that19:30
clarkbAnything else Gerrit related?19:30
clarkb#topic Upgrading old servers19:31
clarkbI think we can continue then19:31
clarkbthe other thing that is back on my todo list radar now that Gerrit is largely updated is picking up this process for other services19:31
clarkbmirror-update and eavesdrop are both on the todo list maybe I can find some time for them soon19:32
clarkbHelp is appreciated if anyone else is able to pick a server or two or three as well19:32
clarkbI don't really have any updates on this topic other than to say that gerrit has been the focus for a while19:32
clarkbanyone else have updates?19:32
fungii don't, other than working on shutting refstack down instead of upgrading it19:33
clarkboh ya I guess that is related. refstack won't be updated it will be shutdown19:33
clarkbsolves the problem in a different but still valid way19:33
clarkbthanks!19:33
fungibut that's pending some feedback from foundation staff on whether they want to announce it19:33
fungihopefully later this week19:34
clarkb#topic Working through our TODO list19:34
clarkb#link https://etherpad.opendev.org/p/opendev-january-2025-meetup19:34
clarkbjust a reminder that if everything else we're discussing isn't keeping you busy enough we've got a big list on this etherpad19:34
clarkba good place for people who want to get more involved to look as well. Feel free to ask me any questions you may have. I'm super happy to help anyone work on these things19:35
NeilHanlono/ just stopping in to say i'll be poking at rocky/centos images this week if mnasiadka doesn't beat me to it :) 19:36
fungithanks NeilHanlon!!!19:36
clarkb++19:36
clarkb#topic Rotating mailman 3 logs19:37
fungino progress, sorry19:37
clarkback. I wasn't sure and wanted to give you the opportunity if I had missed it19:37
clarkb#topic Moving hound image to quay.io19:37
clarkbso this is related to the earlier discussion we had with gerrit and also the change to do this landed late yesterday after I posted the meeting agenda19:37
clarkbI think the process here is basically we'll update the image build to use podman as part of the earlier discussed changes. Then roll forward and all should be well19:38
clarkbat this point lodgeit and hound are on quay but neither are affected by the issue that hit gerrit so they are in a midway point19:38
clarkbbut we'll fix them up before getting to gerrit and then we can do gerrit19:38
clarkb#topic Renewing wiki's cert19:40
clarkbAs you may have noticed our cert checker is unhappy this cert expires in just over two weeks19:40
clarkbdue to travel mentioned earlier I think my plan is to get a new cert issued and in place next week19:40
clarkbapologies for the continued email alerts but they should go away soonish19:41
fungithanks!19:41
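A small sketch of the kind of check a cert checker performs, in case it helps when confirming the replacement cert is in place. This is not the checker OpenDev runs: the helper names are hypothetical and the hostname is left as a parameter. It relies only on Python's standard `ssl` and `socket` modules.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a cert's notAfter string.

    `not_after` uses the format returned by getpeercert(),
    e.g. 'May  6 19:00:00 2025 GMT'.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now.timestamp()) / 86400


def fetch_not_after(hostname: str, port: int = 443,
                    timeout: float = 10.0) -> str:
    """Connect over TLS and return the served cert's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


if __name__ == "__main__":
    import sys
    # Usage: python check_cert.py <hostname>
    remaining = days_until_expiry(fetch_not_after(sys.argv[1]),
                                  datetime.now(timezone.utc))
    print(f"{remaining:.1f} days remaining")
```

Note that with the default context an already expired cert fails verification at connect time, so this only reports on certs that are still valid.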
clarkb#topic Open Discussion19:41
clarkbAnything else?19:41
fungii didn't have anything19:41
clarkbI think the screen is still running on review0319:43
clarkbI was going to clean that up and then got distracted today. I've got it on my todo list so it should go away soonish19:43
clarkbThe openinfra summit europe 2025 cfp is open until sometime in june19:44
clarkblooks like June 13 it closes19:44
fungii'm going to duck out early, thanks clarkb!19:45
clarkbyup we're winding down anyway19:46
clarkbI'll leave the floor open for a few more minutes if there is anything else but then end the meeting19:46
clarkbSounds like that may be everything. Thanks Everyone!19:48
clarkbWe'll be back next week at this time and location, but then likely cancelling the meeting the week after19:48
clarkb#endmeeting19:48
opendevmeetMeeting ended Tue Apr 22 19:48:43 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:48
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-22-19.00.html19:48
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-22-19.00.txt19:48
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-22-19.00.log.html19:48
