Tuesday, 2025-02-04

*** diablo_rojo_phone is now known as Guest796400:30
fungiwe'll get started in just a moment19:00
clarkbI've asked fungi to chair today as I'm fighting a headache that is being annoying19:00
fungiwe wish you all the best!19:01
fungiwell, i do anyway, can't speak for the rest of this lot19:01
fungi#startmeeting infra19:01
opendevmeetMeeting started Tue Feb  4 19:01:56 2025 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
fungi#link as always, the agenda is at https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting19:02
fungi#topic Announcements19:02
fungii don't have any announcements in mind, did anyone have anything that needs mentioning?19:03
clarkbnot me19:03
fungii'll skip over the actions and specs review sections since they seem to be similarly empty19:03
fungi#topic Zuul-launcher image builds (corvus)19:04
fungithe agenda mentions that zuul-launcher related configs (images, labels, providers, etc) have been refactored into a zuul-providers repo19:04
fungi#link https://opendev.org/opendev/zuul-providers19:04
fungialso that Zuul itself is starting to explore dogfooding of the new launcher managed images in jobs zuul runs19:04
fungii reviewed some of those changes, but don't really have any updates myself19:05
corvusi was hoping we could keep the jobs separate, but they have to live together with the image definitions; so the idea is we'll (eventually) put that repo in all the tenants, but only load jobs/projects in the opendev tenant.19:05
corvusanyway, that's almost exactly what was in opendev/zuul-jobs before, so anyone working on image jobs can just retarget that repo with no changes19:06
fungiis the colocation of build jobs and image definitions a security requirement?19:06
corvusyep19:06
fungisounds good to me19:06
fungiany next steps? what should we be reviewing/changing now?19:07
corvusi've run a job using a node on the launcher, including the new nodescan functionality19:07
corvusi think next i'll probably propose something like having zuul run its unit test jobs on new-style nodes19:07
corvusso we can get a little more volume on it19:07
fungicool so we can do a piecemeal migration from nodepool-managed labels to zuul-launcher labels?19:08
corvusyep  (eventually; i wouldn't say we're ready for that yet)19:08
corvusi think we want to try to catch some more bugs first :)19:08
fungisure, i just meant as a future low-impact migration plan19:09
corvusyep definitely19:09
fungiwe're not stuck with a big-bang (much)19:09
corvusi think we will have a sliding scale, and we can slide it backwards any time when the time comes19:09
fungigreat! anything else you want to note about this?19:09
corvusthat's it19:10
fungiin that case let's move on to the next topic (unless anyone has questions, feel free to interrupt whenever)19:10
fungi#topic Unpinning our Grafana deployment (clarkb)19:10
fungiseems like major progress was made on this19:11
clarkbya this went more quickly than I anticipated19:11
fungithe changes linked in the agenda are merged19:11
clarkbwe're running a noble grafana02 in production on the latest grafana 10 release as of an hour ago or so19:11
clarkbthe next steps will be cleaning up the old server once we're happy with this new one19:12
clarkband then figuring out an upgrade to grafana 1119:12
fungiyeah, i was about to ask19:12
clarkbI guess be on the lookout for cleanup changes as I'll try to get those up soon19:12
fungiso probably similar process but in-place now that we're on a noble server?19:12
clarkbya19:12
clarkbthe 11 upgrade will probably require updates to grafyaml or our graph definitions though due to angular being deprecated in 1019:12
clarkbit looks like 11 may have a toggle to reenable angular if it comes to that too but I think we should try and move our graphs away from angular first19:13
fungiso hold a 11.x test node, see if it works, then let the upgrade deploy if no obvious issues are identified19:13
clarkbyup with also debugging of angular deprecation on that held node19:13
fungioh, got it, we probably need adjustments to grafyam19:13
fungil19:13
fungianything else we should know? or anyone have questions?19:14
clarkbnot from me19:14
fungigreat, thanks for working on this!19:15
fungi#topic Upgrading Old Servers (clarkb)19:15
clarkbmostly this is a catch all for updates around this.19:15
clarkbWe did update launch node to error if we detect fewer than 2 cpus19:16
fungitonyb was working on the wiki, i might have missed updates if there were any19:16
clarkbya not sure if tonyb has any updates for wiki specifically19:16
fungiand we've still got cacti, storyboard, and translate as well19:16
fungiseems like probably nothing to cover here for now, aside from the change to detect problem instances in rax xen19:17
clarkb++19:17
fungichanges i guess, with the typo correction19:17
fungi#topic Sprinting to Upgrade Servers to Focal (clarkb)19:17
fungirelated to the previous topic19:18
clarkbthis is an idea I had that came out of doing the paste and grafana server replacements. The work itself is often fairly straightforward with most issues caught in CI before we deploy anything19:18
clarkbthen the major time sink is waiting for reviews on the various changes to update dns, add to inventory, reupdate dns etc19:18
fungii'll note that the topic is probably a typo19:19
fungii guess you meant upgrade to noble19:19
clarkboh yes19:19
clarkbsorry19:19
fungino probs19:19
clarkbso basically I was wondering if others would be willing to focus on this next week so that we can try and speed the process up and get some of the lower hanging fruit done19:19
fungii should have read the notes under it before i set the topic ;)19:20
fungii'm around next week for a sprint. did you have a particular day or days in mind?19:20
clarkbpart of the end state I'm hoping for is general confidence in noble and podman before we replace the gerrit server19:20
fungithat would certainly be good to have19:20
clarkbno particular days probably start monday and end when we're tired of working on this specific set of tasks19:20
clarkband just ask people to try and help replace servers as well as review changes to replace servers19:21
fungito be clear, this is essentially a blocker to moving our images off dockerhub to quay, if we want to retain speculative testing of images, right?19:21
clarkbyes19:21
clarkbthere are many reasons to do it19:21
fungijust making sure i've got the motivation stated19:21
clarkbbetter ci, less docker hub, less old ubuntu19:22
fungisure, upgrades are a good idea regardless, but at least to me that's the big carrot19:22
fungidockerhub equals pain19:22
clarkbI think most of the platform specific gotchas have been addressed at this point. Now we just need to do the uplift hence the ask for focused time on it19:23
corvusi also feel like we can be flexible about reviews on essentially "rote" changes... like if there's nothing too novel about an upgrade, we've all agreed it's a good idea and it's probably okay to push that through with minimal review19:23
clarkbI'm happy with that too19:23
corvusand if something novel comes up, flag it for more discussion19:24
clarkbI like that19:24
fungiyeah, i have been doing mostly single-core approvals on those if they come from another of our sysadmins and i plan to be around to keep an eye on things19:24
fungiso if folks are interested in making a solid dent in the random dockerhub rate limit failures for our jobs, let's try to move a bunch of stuff to new enough ubuntu next week19:24
corvus++19:24
fungianything else we want to do right now for planning on this? or questions/concerns?19:25
clarkbnope later this week I'll try and put together a todo list that people can pick off19:26
clarkbthanks!19:26
fungithat would be great19:27
fungi#topic Switch to quay.io/opendevmirror images where possible (clarkb)19:27
fungiseems like we have a logical progression in topics19:28
clarkbI tried to order them that way :)19:28
fungiprescient19:28
clarkbmade progress on this last week but still have gerrit, zuul db, and I think one other to do19:28
clarkbcorvus: any concern with just doing this for zuul or should it be coordinated to minimize the loss of build records?19:28
fungizuul-db only really affects zuul-web services right? or will it cause reporting failures while it's down? i can't remember now19:29
corvusit could cause reporting failures19:29
fungiand yeah, we'd presumably lose some builds between the cracks19:29
clarkbit will affect the record keeping of jobs that finish while the db restarts19:29
corvusbut it will also retry19:29
corvusso do it fast enough it may be ok19:29
fungiso we could probably get by without pausing the whole system19:29
clarkbI think it is a relatively quick but not instantaneous restart. On the order of 15-30 seconds?19:30
clarkbmostly in mariadb startup costs19:30
corvusyeah... given we're not doing a release or anything, i'd say roll the dice :)19:30
fungii'm willing to do it early or late in my hours, or on a weekend, to minimize impact19:30
clarkbwfm thanks19:30
corvus(i mean, technically, this could happen any time if a hypervisor hiccups)19:30
clarkbgood point19:30
fungiyou make a really good point, we still have making the db ha as an outstanding task19:31
fungiand we've considered the risk low19:31
fungiso maybe just ~whenever (within reason)19:31
fungiwe can probably knock it out later this week in that case19:32
clarkb++19:32
corvus++19:32
fungiany other points that bear raising on this?19:32
clarkbnot from me19:32
fungi#topic Running certcheck on bridge (fungi)19:33
fungiit said "clarkb" on the agenda but it's really me at this point19:33
fungiand i haven't gotten to it yet, but this is a good reminder19:33
fungii don't really have anything to add, other than to note that i'm holding up one of the things we could move off the old cacti server19:34
fungiand i should really find a few minutes to get to it19:34
fungi#topic Service Coordinator Election (clarkb)19:34
fungicongratulations! no, wait, too soon19:34
clarkbThis is a reminder that the nomination period opens toady19:34
clarkb*today even19:34
clarkbI'm happy to answer questions if there is interest in someone else running19:35
fungias am i, and all our previous leaders19:35
fungi(i'm speaking on their behalf. we'll get mordred back here yet)19:35
clarkbha19:36
fungiit's totally rewarding, and nothing like whitewashing this here picket fence19:36
fungi#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/NGS2APEFQB45OCJCQ645P5N6XCH52BXW/ February 2025 OpenDev Service Coordinator Election19:37
fungi#topic Working through our TODO list (clarkb)19:38
clarkbAnd this is a reminder that we have a todo list that came out of meetup last month19:38
clarkbI've been trying to use it to inform what I poke at (the first few items are directly related to launch node updates and grafana server replacement)19:38
fungi#link https://etherpad.opendev.org/p/r.cb73d0388959699f27a517446dabaa71 2025q1 meetup notes19:38
clarkbif you're lacking things to do feel free to take a look there and dive in19:39
fungian excellent reminder19:39
fungiany specific items you want to call out as priorities from there?19:39
clarkbnot really19:40
fungilet's get cracking!19:40
fungi#topic Open discussion19:40
fungifreeform poetry is welcome, or whatever you feel appropriate19:41
corvusvogon poetry?19:42
clarkbI'm going to step out now and try to get this headache under control. thanks everyone!19:42
fungithy micturations are to me...19:42
fungifeel better!19:42
fungithese services aren't going to coordinate themselves, after all19:42
corvus++19:42
fungiin that case, enjoy the remaining 15-20 minutes for your preferred pastimes19:43
fungithanks everyone!19:43
fungi#endmeeting19:43
opendevmeetMeeting ended Tue Feb  4 19:43:55 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:43
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.html19:43
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.txt19:43
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.log.html19:43

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!