*** diablo_rojo_phone is now known as Guest7964 | 00:30 | |
fungi | we'll get started in just a moment | 19:00 |
clarkb | I've asked fungi to chair today as I'm fighting a headache that is being annoying | 19:00 |
fungi | we wish you all the best! | 19:01 |
fungi | well, i do anyway, can't speak for the rest of this lot | 19:01 |
fungi | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Feb 4 19:01:56 2025 UTC and is due to finish in 60 minutes. The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
fungi | #link as always, the agenda is at https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting | 19:02 |
fungi | #topic Announcements | 19:02 |
fungi | i don't have any announcements in mind, did anyone have anything that needs mentioning? | 19:03 |
clarkb | not me | 19:03 |
fungi | i'll skip over the actions and specs review sections since they seem to be similarly empty | 19:03 |
fungi | #topic Zuul-launcher image builds (corvus) | 19:04 |
fungi | the agenda mentions that zuul-launcher related configs (images, labels, providers, etc) have been refactored into a zuul-providers repo | 19:04 |
fungi | #link https://opendev.org/opendev/zuul-providers | 19:04 |
fungi | also that Zuul itself is starting to explore dogfooding of the new launcher managed images in jobs zuul runs | 19:04 |
fungi | i reviewed some of those changes, but don't really have any updates myself | 19:05 |
corvus | i was hoping we could keep the jobs separate, but they have to live together with the image definitions; so the idea is we'll (eventually) put that repo in all the tenants, but only load jobs/projects in the opendev tenant. | 19:05 |
corvus | anyway, that's almost exactly what was in opendev/zuul-jobs before, so anyone working on image jobs can just retarget that repo with no changes | 19:06 |
fungi | is the colocation of build jobs and image definitions a security requirement? | 19:06 |
corvus | yep | 19:06 |
fungi | sounds good to me | 19:06 |
fungi | any next steps? what should we be reviewing/changing now? | 19:07 |
corvus | i've run a job using a node on the launcher, including the new nodescan functionality | 19:07 |
corvus | i think next i'll probably propose something like having zuul run its unit test jobs on new-style nodes | 19:07 |
corvus | so we can get a little more volume on it | 19:07 |
fungi | cool so we can do a piecemeal migration from nodepool-managed labels to zuul-launcher labels? | 19:08 |
corvus | yep (eventually; i wouldn't say we're ready for that yet) | 19:08 |
corvus | i think we want to try to catch some more bugs first :) | 19:08 |
fungi | sure, i just meant as a future low-impact migration plan | 19:09 |
corvus | yep definitely | 19:09 |
fungi | we're not stuck with a big-bang (much) | 19:09 |
corvus | i think we will have a sliding scale, and we can slide it backwards any time when the time comes | 19:09 |
fungi | great! anything else you want to note about this? | 19:09 |
corvus | that's it | 19:10 |
fungi | in that case let's move on to the next topic (unless anyone has questions, feel free to interrupt whenever) | 19:10 |
fungi | #topic Unpinning our Grafana deployment (clarkb) | 19:10 |
fungi | seems like major progress was made on this | 19:11 |
clarkb | ya this went more quickly than I anticipated | 19:11 |
fungi | the changes linked in the agenda are merged | 19:11 |
clarkb | we're running a noble grafana02 in production on the latest grafana 10 release as of an hour ago or so | 19:11 |
clarkb | the next steps will be cleaning up the old server once we're happy with this new one | 19:12 |
clarkb | and then figuring out an upgrade to grafana 11 | 19:12 |
fungi | yeah, i was about to ask | 19:12 |
clarkb | I guess be on the lookout for cleanup changes as I'll try to get those up soon | 19:12 |
fungi | so probably similar process but in-place now that we're on a noble server? | 19:12 |
clarkb | ya | 19:12 |
clarkb | the 11 upgrade will probably require updates to grafyaml or our graph definitions though due to angular being deprecated in 10 | 19:12 |
clarkb | it looks like 11 may have a toggle to reenable angular if it comes to that too but I think we should try and move our graphs away from angular first | 19:13 |
fungi | so hold an 11.x test node, see if it works, then let the upgrade deploy if no obvious issues are identified | 19:13 |
clarkb | yup with also debugging of angular deprecation on that held node | 19:13 |
fungi | oh, got it, we probably need adjustments to grafyaml | 19:13 |
fungi | anything else we should know? or anyone have questions? | 19:14 |
clarkb | not from me | 19:14 |
fungi | great, thanks for working on this! | 19:15 |
fungi | #topic Upgrading Old Servers (clarkb) | 19:15 |
clarkb | mostly this is a catch all for updates around this. | 19:15 |
clarkb | We did update launch node to error if we detect fewer than 2 cpus | 19:16 |
fungi | tonyb was working on the wiki, i might have missed updates if there were any | 19:16 |
clarkb | ya not sure if tonyb has any updates for wiki specifically | 19:16 |
fungi | and we've still got cacti, storyboard, and translate as well | 19:16 |
fungi | seems like probably nothing to cover here for now, aside from the change to detect problem instances in rax xen | 19:17 |
clarkb | ++ | 19:17 |
fungi | changes i guess, with the typo correction | 19:17 |
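A minimal sketch of the kind of check mentioned above for launch node, assuming the CPU count is read from the output of `nproc` on the new server. The names `LaunchError` and `check_cpu_count` are hypothetical and are not taken from the actual launch node code.

```python
class LaunchError(Exception):
    """Raised when a freshly booted server fails a sanity check (hypothetical)."""


def check_cpu_count(nproc_output: str, minimum: int = 2) -> int:
    """Parse `nproc` output and fail if the server has too few CPUs.

    Assumption: the caller has already run `nproc` on the new server
    and passes in its raw text output.
    """
    count = int(nproc_output.strip())
    if count < minimum:
        raise LaunchError(
            f"server reports {count} CPU(s); expected at least {minimum}"
        )
    return count
```

Failing early like this lets the operator delete the instance and boot a replacement before any DNS or inventory changes are proposed.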
fungi | #topic Sprinting to Upgrade Servers to Focal (clarkb) | 19:17 |
fungi | related to the previous topic | 19:18 |
clarkb | this is an idea I had that came out of doing the paste and grafana server replacements. The work itself is often fairly straightforward with most issues caught in CI before we deploy anything | 19:18 |
clarkb | then the major time sink is waiting for reviews on the various changes to update dns, add to inventory, reupdate dns etc | 19:18 |
fungi | i'll note that the topic is probably a typo | 19:19 |
fungi | i guess you meant upgrade to noble | 19:19 |
clarkb | oh yes | 19:19 |
clarkb | sorry | 19:19 |
fungi | no probs | 19:19 |
clarkb | so basically I was wondering if others would be willing to focus on this next week so that we can try and speed the process up and get some of the lower hanging fruit done | 19:19 |
fungi | i should have read the notes under it before i set the topic ;) | 19:20 |
fungi | i'm around next week for a sprint. did you have a particular day or days in mind? | 19:20 |
clarkb | part of my end state goal I'm hoping for is general confidence in noble and podman before we replace the gerrit server | 19:20 |
fungi | that would certainly be good to have | 19:20 |
clarkb | no particular days, probably start monday and end when we're tired of working on this specific set of tasks | 19:20 |
clarkb | and just ask people to try and help replace servers as well as review changes to replace servers | 19:21 |
fungi | to be clear, this is essentially a blocker to moving our images off dockerhub to quay, if we want to retain speculative testing of images, right? | 19:21 |
clarkb | yes | 19:21 |
clarkb | there are many reasons to do it | 19:21 |
fungi | just making sure i've got the motivation stated | 19:21 |
clarkb | better ci, less docker hub, less old ubuntu | 19:22 |
fungi | sure, upgrades are a good idea regardless, but at least to me that's the big carrot | 19:22 |
fungi | dockerhub equals pain | 19:22 |
clarkb | I think most of the platform specific gotchas have been addressed at this point. Now we just need to do the uplift hence the ask for focused time on it | 19:23 |
corvus | i also feel like we can be flexible about reviews on essentially "rote" changes... like if there's nothing too novel about an upgrade, we've all agreed it's a good idea and it's probably okay to push that through with minimal review | 19:23 |
clarkb | I'm happy with that too | 19:23 |
corvus | and if something novel comes up, flag it for more discussion | 19:24 |
clarkb | I like that | 19:24 |
fungi | yeah, i have been doing mostly single-core approvals on those if they come from another of our sysadmins and i plan to be around to keep an eye on things | 19:24 |
fungi | so if folks are interested in making a solid dent in the random dockerhub rate limit failures for our jobs, let's try to move a bunch of stuff to new enough ubuntu next week | 19:24 |
corvus | ++ | 19:24 |
fungi | anything else we want to do right now for planning on this? or questions/concerns? | 19:25 |
clarkb | nope, later this week I'll try and put a todo list that people can pick off | 19:26 |
clarkb | thanks! | 19:26 |
fungi | that would be great | 19:27 |
fungi | #topic Switch to quay.io/opendevmirror images where possible (clarkb) | 19:27 |
fungi | seems like we have a logical progression in topics | 19:28 |
clarkb | I tried to order them that way :) | 19:28 |
fungi | prescient | 19:28 |
clarkb | made progress on this last week but still have gerrit, zuul db, and I think one other to do | 19:28 |
clarkb | corvus: any concern with just doing this for zuul or should it be coordinated to minimize the loss of build records? | 19:28 |
fungi | zuul-db only really affects zuul-web services right? or will it cause reporting failures while it's down? i can't remember now | 19:29 |
corvus | it could cause reporting failures | 19:29 |
fungi | and yeah, we'd presumably lose some builds between the cracks | 19:29 |
clarkb | it will affect the record keeping of jobs that finish while the db restarts | 19:29 |
corvus | but it will also retry | 19:29 |
corvus | so do it fast enough it may be ok | 19:29 |
fungi | so we could probably get by without pausing the whole system | 19:29 |
clarkb | I think it is a relatively quick but not instantaneous restart. On the order of 15-30 seconds? | 19:30 |
clarkb | mostly in mariadb startup costs | 19:30 |
corvus | yeah... given we're not doing a release or anything, i'd say roll the dice :) | 19:30 |
fungi | i'm willing to do it early or late in my hours, or on a weekend, to minimize impact | 19:30 |
clarkb | wfm thanks | 19:30 |
corvus | (i mean, technically, this could happen any time if a hypervisor hiccups) | 19:30 |
clarkb | good point | 19:30 |
fungi | you make a really good point, we still have making the db ha as an outstanding task | 19:31 |
fungi | and we've considered the risk low | 19:31 |
fungi | so maybe just ~whenever (within reason) | 19:31 |
fungi | we can probably knock it out later this week in that case | 19:32 |
clarkb | ++ | 19:32 |
corvus | ++ | 19:32 |
fungi | any other points that bear raising on this? | 19:32 |
clarkb | not from me | 19:32 |
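The "it will also retry" point above can be illustrated with a generic bounded retry loop. This is only a sketch of the general technique and is not Zuul's reporting code; `report_with_retry`, its parameters, and the use of `ConnectionError` are all assumptions made for illustration.

```python
import time


def report_with_retry(report, attempts=5, delay=1.0, sleep=time.sleep):
    """Call report() until it succeeds or the attempts run out.

    Assumption: a database outage surfaces as ConnectionError.
    The sleep function is injectable so the loop can be tested quickly.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return report()
        except ConnectionError as exc:
            last_error = exc
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay)
    raise last_error
```

With a loop of this shape, a restart that finishes inside the total retry window loses no records, which matches the "do it fast enough it may be ok" reasoning.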
fungi | #topic Running certcheck on bridge (fungi) | 19:33 |
fungi | it said "clarkb" on the agenda but it's really me at this point | 19:33 |
fungi | and i haven't gotten to it yet, but this is a good reminder | 19:33 |
fungi | i don't really have anything to add, other than to note that i'm holding up one of the things we could move off the old cacti server | 19:34 |
fungi | and i should really find a few minutes to get to it | 19:34 |
fungi | #topic Service Coordinator Election (clarkb) | 19:34 |
fungi | congratulations! no, wait, too soon | 19:34 |
clarkb | This is a reminder that the nomination period opens today | 19:34 |
clarkb | I'm happy to answer questions if there is interest in someone else running | 19:35 |
fungi | as am i, and all our previous leaders | 19:35 |
fungi | (i'm speaking on their behalf. we'll get mordred back here yet) | 19:35 |
clarkb | ha | 19:36 |
fungi | it's totally rewarding, and nothing like whitewashing this here picket fence | 19:36 |
fungi | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/NGS2APEFQB45OCJCQ645P5N6XCH52BXW/ February 2025 OpenDev Service Coordinator Election | 19:37 |
fungi | #topic Working through our TODO list (clarkb) | 19:38 |
clarkb | And this is a reminder that we have a todo list that came out of meetup last month | 19:38 |
clarkb | I've been trying to use it to inform what I poke at (the first few items are directly related to launch node updates and grafana server replacement) | 19:38 |
fungi | #link https://etherpad.opendev.org/p/r.cb73d0388959699f27a517446dabaa71 2025q1 meetup notes | 19:38 |
clarkb | if you're lacking things to do feel free to take a look there and dive in | 19:39 |
fungi | an excellent reminder | 19:39 |
fungi | any specific items you want to call out as priorities from there? | 19:39 |
clarkb | not really | 19:40 |
fungi | let's get cracking! | 19:40 |
fungi | #topic Open discussion | 19:40 |
fungi | freeform poetry is welcome, or whatever you feel appropriate | 19:41 |
corvus | vogon poetry? | 19:42 |
clarkb | I'm going to step out now and try to get this headache under control. thanks everyone! | 19:42 |
fungi | thy micturations are to me... | 19:42 |
fungi | feel better! | 19:42 |
fungi | these services aren't going to coordinate themselves, after all | 19:42 |
corvus | ++ | 19:42 |
fungi | in that case, enjoy the remaining 15-20 minutes for your preferred pastimes | 19:43 |
fungi | thanks everyone! | 19:43 |
fungi | #endmeeting | 19:43 |
opendevmeet | Meeting ended Tue Feb 4 19:43:55 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:43 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.html | 19:43 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.txt | 19:43 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2025/infra.2025-02-04-19.01.log.html | 19:43 |