Tuesday, 2024-01-16

clarkbAlmost meeting time18:59
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Jan 16 19:00:13 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/6MI3ENLNO7K43AW6PPVMW52K4CHSW7VQ/ Our Agenda19:00
frickler\o19:00
clarkb#topic Announcements19:00
clarkbA reminder that we'll be doing an OpenInfra Live episode covering all things OpenDev Thursday at 1500 UTC19:01
clarkbfeel free to listen in and ask questions19:01
fricklervPTG will be 2024-04-08 to -04-1219:02
clarkbyup as mentioned in #opendev I intended to discuss an idea for us to participate this time around at the end of the meeting19:02
clarkbsignups happen now through February 1819:02
clarkb#topic Server Upgrades19:03
clarkbNot sure if we've got tonyb here (it is a bit early in the day)19:03
* tonyb waves19:03
clarkbprogress has been made on meetpad server upgrades. tonyb and I did testing the other day to figure out why the jvb wasn't used in the test env19:04
fungithanks for working through that!19:04
clarkbtl;dr is that the jvb was trying to connect to the prod server and that made everything sad19:04
clarkb#link https://review.opendev.org/c/opendev/system-config/+/905510 Upgrading meetpad service to jammy19:04
clarkbthis stack of changes is the current output of that testing and effort from tonyb19:04
clarkbreviews would be great. I think we can safely land all of those changes since they shouldn't affect prod but only improve our ability to test the service in CI19:05
clarkbtonyb: anything else to call out on this topic? definitely continue to ping me if you need extra eyeballs19:06
tonybI think meetpad is under control.  As I mentioned I've started thinking about the "other" servers, so if people have time, looking at my ideas for wiki.o.o would be good19:06
fungiwhere were those ideas again?19:07
tonybIs there a reason to keep hound on focal? or is the main idea to get off of bionic first19:07
clarkbtonyb: only that the priority was to ditch the older servers first19:07
clarkbhound should run fine on jammy19:07
tonybfungi: https://etherpad.opendev.org/p/opendev-bionic-server-upgrades#L5819:08
tonyb#link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades#L5819:08
fungithanks!19:08
* clarkb makes a note to review the notes19:08
* fungi notes clarkb's note to review the notes19:08
tonybOkay I'll add hound to my list.  it looks like it'll be another simple one19:08
fungii wish gitea would gain feature parity with hound, and then we could decommission another almost redundant service19:09
corvusi wonder if anything with gitea code search has changed19:09
clarkbcorvus: it's still "weird"19:09
corvusbummer19:09
tonybAssuming that none of the index/cached data is critical and can/will be regenerated on a new server19:09
fungitonyb: yes, there's no real state maintained there. it can just be replaced whole19:09
clarkbthe latest issue we discovered was that you need to update files for it to notice things have changed to update the index. It doesn't just reindex everything all at once so sometimes things get missed and don't show up19:10
clarkbthis is for gitea not hound. Hound reindexes everything on startup then does incremental pulls to reindex over time19:10
fungialso i prefer that hound gives you all matching lines from a repo, rather than just the repos which matched and one or two example matches19:10
fungii guess that could be a feature request for a search option or something19:11
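For context, the hound behavior described above (full index at startup, then incremental git polls) is driven by a small JSON config. A minimal sketch, assuming the stock config.json format with a hypothetical repo entry, not OpenDev's actual configuration:

```json
{
  "max-concurrent-indexers": 2,
  "dbpath": "data",
  "repos": {
    "system-config": {
      "url": "https://opendev.org/opendev/system-config.git",
      "ms-between-poll": 30000
    }
  }
}
```

Because the index lives entirely under dbpath and is rebuilt from the repos at startup, a replacement server can regenerate everything itself, which is why no state migration is needed.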
clarkb#topic Python Container Updates19:11
clarkbNot to cut the previous discussion short but want to ensure we get time for everything today19:11
clarkb#link https://review.opendev.org/c/opendev/system-config/+/905018 Drop Bullseye python3.11 images19:11
clarkbThis is a cleanup change that we can make to our image builds after zuul-registry updated19:11
clarkbwe are still waiting on zuul-operator to switch in order to remove the last bullseye image19:12
clarkbI also used Zuul as a test platform for the 3.12 images and after fixing an import for a removed module it all seems to be working19:12
clarkbMy main concern was that we might have 3.12 changes that affect the images themselves but that doesn't appear to be the case19:13
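The 3.12 import fix mentioned above is typical of code that relied on modules removed in Python 3.12. The log doesn't name the module that was fixed; `imp` (removed in 3.12) is used below purely as a hypothetical example of the usual importlib replacement:

```python
import importlib.util


def load_source(name, path):
    """Stand-in for imp.load_source(), which was removed in Python 3.12."""
    # Build a module spec from the file path, then execute the module
    # in a fresh module object, mirroring the old imp behavior.
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Code ported this way keeps working on older interpreters too, since importlib has provided these APIs since Python 3.5.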
clarkb#topic Upgrading Zuul's DB Server19:13
clarkbThere were hardware issues? on the hosting side that caused the existing db server to be migrated19:14
clarkbfrickler: found at least one build whose final status wasn't recorded properly in the db and the short outage for that migration appears to be the cause19:14
clarkbI don't think this drastically changes the urgency of this move, but does give more fuel to the replace it fire I guess19:15
fricklernice wording, I agree to that19:15
clarkbthat said I still haven't found much time to think it through. I'm somewhat inclined to go with the simplest thing closest to what we're already doing as a result. Spin up a single mysql/mariadb server with the intent of scaling it up later should we decide to19:15
tonybYeah I think that's my preference as well19:16
tonybit *seems* like we can add slaves after the fact with a small outage19:17
clarkbwe don't have to agree to that now, but raising objections before the next meeting would be good and we could write that down somewhere as the plan otherwise19:17
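The "add replicas later" idea is standard MariaDB replication. A hypothetical sketch of the commands involved (hostnames and credentials are illustrative, and the replica is assumed to have been seeded from a dump of the primary, which is where the small outage comes in):

```sql
-- On the primary: create a replication user (names are made up)
CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On the new replica: point it at the primary and start replicating
CHANGE MASTER TO
  MASTER_HOST='zuul-db-primary.example.org',
  MASTER_USER='repl',
  MASTER_PASSWORD='secret',
  MASTER_USE_GTID=slave_pos;
START SLAVE;
```

This supports starting with a single server now and scaling out only if we decide we need to.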
clarkb#topic Matrix Homeserver Hosting19:19
clarkbLast week fungi reached out to EMS about updating our hosting service/plan19:19
clarkbfungi: is there anything to add to that? I assume that our existing service will continue running as is without outage or config changes its merely the business side that changes?19:19
fungiyeah, it sounds like they'll just update our hosting plan, bill the foundation, and increase our quota for user count19:20
fungibut they're supposed to get up with us by the end of the month with details19:20
clarkbthank you for digging into that. I guess we'll practice patience until they get back to us19:20
tonybhopefully billing for the new plan doesn't start until ^^ happens19:21
fungithe up side is we no longer have to be quite so careful about the number of accounts we create on it, since the limit for that's increasing from 5 to 2019:21
fungitonyb: yes, it will take effect when our current plan runs out, they said19:21
tonybcool beans19:22
fungiand they're contacting us at least a week prior to that19:22
fungiwhich is all the info i have for now19:22
clarkb#topic OpenAFS Quota Issues19:22
fungifolks with access to the infra-root inbox can also read the e-mail they sent me19:22
tonybOh that reminds me ...19:23
tonybfungi: IIUC you use mutt to access that ... can you share a redacted mutt.config?19:23
clarkbI spent some time digging into this last week19:23
fungitonyb: sure, we'll catch up after the meeting19:23
tonybperfect19:24
clarkbThere were no obvious issues with extra arches in the ubuntu-ports volume. That would've been too easy :)19:24
clarkbHowever, I think we can probably start the process to cleanup ubuntu bionic in ports and maybe debian buster19:24
clarkbThe xenial indexes are still there, not getting updated and the packages don't appear to be there either. This is a minimal cleanup opportunity since it is just the indexes but we could clear those out when we clear out the others as I expect it to take the same process19:25
fungi"ubuntu ports" in this case means arm64 packages, for clarification19:25
funginot amd6419:25
clarkbright19:25
clarkbon the ubuntu amd64 mirror volume I noticed that we actually mirror a number of source packages19:25
clarkbunfortunately it seems like reprepro uses these to detect package updates and we can't easily remove them?19:26
clarkb(we'd have to do a post reprepro step to delete them I think and then have indexes that point at files that don't exist which is not nice)19:26
clarkbat least I wasn't able to find a reprepro flag to not mirror those packages19:26
fungiwell, it would be the "source indices" which aren't the same files19:26
clarkboh maybe we can delete those too then?19:27
clarkbI suspect this would be a pretty good size reduction if we can figure out a good way to remove those files19:27
fungibut also we might be veering close to violating licenses for some packages if we don't also host sources for them19:27
fungisince we technically are redistributing them19:28
clarkbeven as a mirror of an upstream that isn't modifying content or code?19:28
fungithen again, we omit mirroring sources for rpm-based distros right?19:29
clarkbyes I think so19:29
fungiso maybe don't worry about it unless someone tells us we need to19:29
clarkbmy non lawyer understanding of the gpl for example is that we'd need to provide sources if asked19:29
clarkbso we'd go grab the source for the thing and provide that19:29
fungiright19:29
clarkbnot that we necessarily have to provide it next to the binary package19:29
fungithe bigger risk there is if we continue to serve packages which are no longer available elsewhere19:30
fungisince actually fulfilling such requests would become challenging19:30
clarkbya I suppose that is possible but also not super likely for ubuntu in particular who has everything in source control. Might be a pain to dig out of launchpad though19:30
clarkbI'm also open to other ideas for improving disk consumption. The other thing I wanted to look into but haven't done yet is whether or not the stream mirrors are still providing many copies of certain packages that might be prunable19:31
fricklerbut that's another good reason to drop eoled versions, I like that ;)19:32
clarkbya I think we can start there19:32
fungifrickler: agreed19:32
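For reference, the reprepro knob closest to "don't mirror sources" is the Architectures field in conf/distributions; reprepro only fetches source packages when "source" is listed there. A sketch with illustrative values, not our actual config, and whether dropping it interferes with the update detection discussed above is untested:

```
# conf/distributions (illustrative fragment)
Origin: Ubuntu
Codename: jammy
# "source" deliberately omitted from Architectures so reprepro does not
# fetch .dsc/.orig.tar files; impact on update detection is untested
Architectures: amd64
Components: main universe multiverse
Update: jammy
```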
clarkb#topic Broken Wheel Cache/Mirror Builds19:33
clarkbone issue was that openafs was not working on arm64 nodes for centos stream. The problem there appears to be that our images stopped getting rebuilt on nb04 which meant our test nodes had old kernels19:34
clarkbthe old kernels didn't work with new openafs packaging stuff19:34
clarkbI cleaned up nb04 and we have new images now19:34
clarkbhttps://review.opendev.org/c/openstack/openstack-zuul-jobs/+/905270 merged after the images updated19:35
clarkbI don't know whether or not we've built a new package yet?19:35
clarkbbut progress anyway. Would be good to finish running that down and tonyb indicated an interest in doing that19:35
tonybYup.19:36
fungiit was pretty late when i approved that, i think, so may need to wait until later today19:37
clarkback19:37
clarkb#topic Gitea disks filling19:37
clarkbThe cron job configuration merged but looking at app.ini timestamps and process timestamps I don't think we restarted gitea to pick up the changes19:37
clarkbwe've got two other related changes and I'm thinking we get those reviewed and possibly landed then we can do a manual rolling restart if still necessary19:38
clarkb#link https://review.opendev.org/c/opendev/system-config/+/904868 update robots.txt on upstream's suggestion19:38
clarkb#link https://review.opendev.org/c/opendev/system-config/+/905020 Disable an unneeded cron job in gitea19:38
tonybThey both have enough votes to approve19:39
clarkboh cool I think I missed that when looking at gerrit19:39
tonybso yeah I think we're good to land and rolling restart19:39
clarkbI think we want to approve 905020 first in case 904868 does an automated rolling restart19:40
clarkbI hesitate to approve myself as we're supposed to have an ice storm starting in about 3 hours19:40
clarkbbut can approve wednesday if I haven't lost power19:41
corvusconsider rescheduling the ice storm19:41
clarkbI wish. I'm tired of this cold weather. Tomorrow will be above freezing for the first time since Friday19:41
tonybI can approve it and watch for updates and restarts19:42
clarkbthanks19:42
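Disabling a gitea cron task amounts to an app.ini fragment along these lines; the specific task name below is a guess for illustration, the real one is in change 905020:

```ini
; app.ini: disable a periodic gitea task
; [cron.update_checker] is a hypothetical example task name
[cron.update_checker]
ENABLED = false
```

Gitea only rereads app.ini at startup, which is why the rolling restart discussed above is needed for the change to take effect.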
clarkb#topic OpenDev Service Coordinator Election19:42
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/TB2OFBIGWZEYC7L4MCYA46EXIX5T47TY/19:42
clarkbI made an election schedule official on the mailing list19:42
clarkbI continue to be happy for someone else to step into the role and will support whoever that may be19:43
clarkbI also continue to be willing to do the work should others prefer not to19:43
clarkbOne nice thing about getting the schedule out now is it should give everyone plenty of time to consider it :)19:43
fungiyou deserve a break, but also cycling back through those of us who were infra ptls is probably equally counter-productive19:43
fungia new volunteer would be ideal19:44
* tonyb sees the amount of work clarkb does and is awestruck19:45
clarkbheh I'm just chasing the electrons around. Eventually I might catch one19:45
clarkb#topic Open Discussion19:46
clarkbAs mentioned earlier I'm thinking it would be a good idea for us to try out the PTG again19:46
tonybYup I think that'd be good.19:47
clarkbthe reason for that is we've got a number of pots on the fire between general maintenance (gitea upgrades, container image updates, etc), fixing issues that arise, and future looking stuff, so I think some focused time would be good19:47
fungimmm, hotpot19:48
fungicount me in!19:48
clarkbI wanted to bring it up because we said we could also just schedule time if we need it. I'm basically asserting that I think we could use some focused time and I'm happy to organize it outside of/before the PTG as well or as an alternative19:48
tonybfocused time is good.19:49
* frickler would certainly prefer something outside of the PTG19:49
clarkbfrickler: thats good feedback19:49
tonybI guess during the PTG could make it hard for us "as a team" to cover off attending other projects sessions19:49
clarkbtonyb: ya that's the struggle19:50
tonyb(including the TC)19:50
clarkbgiven frickler's feedback I'm thinking that maybe the best thing would be to schedule a couple of days prior to the PTG (maybe as early as February) and then also sign up for the PTG and just do our best during the bigger event19:50
fungiwe can also call it ptg prep, and use it for last-minute testing of our infrastructure19:51
fungimeetpad, etherpad, ptgbot, etc19:51
clarkbI'll be able to look at a calendar and cross check against holidays and other events after the OIL episode. I'll try to get that out Fridayish19:51
clarkbfungi: ++19:51
fungias opposed to doing it after the ptg19:51
clarkbfungi: yup I think we should definitely do it before19:52
clarkbeven the PTG feels a bit late, but having something in February and something in April seems not too crazy19:52
tonybWorks for me19:52
tonybI don't think I'm able to relocate to the US for the PTG this time which sucks :(19:52
clarkb:/ we'll just figure out how to make timezones work as best we can19:54
clarkbAnything else in our last ~6 minutes?19:54
fricklerwhat about the expiring ssl certs?19:54
clarkbfrickler: tonyb and I were going to do the linaro one today19:55
clarkbI'll remind the openstack.org sysadmins about the other one19:55
fricklerthere were also some more coming up recently19:55
fricklermirror02.ord.rax.opendev.org and review.opendev.org19:56
fungiyeah, i saw those, i suspect something started failing the letsencrypt periodic job, need to check the logs19:56
clarkbeither that or we need to restart apache to clear out old workers19:56
fungiother possibility is apache processes clinging to old data, yep19:56
fricklerok, so everything under control, fine19:57
clarkblooks like certs on the mirror node are not new files so likely the jobs19:57
clarkbit's a good callout and why we alert with 30 days of warning so that we can fix whatever needs to be fixed in time :)19:57
fungifull ack19:58
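A quick way to check the situation above, sketched against a throwaway self-signed cert so it is runnable anywhere (the openssl CLI is assumed to be installed; the real check would point at the mirror's actual cert path and hostname):

```shell
# Generate a throwaway self-signed cert, then read its expiry the same
# way you would for the cert file on a mirror node.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 30 -subj "/CN=example" \
    -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null

# Expiry of the cert file on disk:
openssl x509 -noout -enddate -in "$tmp/cert.pem"

# Against a live host you would instead check what apache is serving:
#   echo | openssl s_client -connect review.opendev.org:443 2>/dev/null \
#       | openssl x509 -noout -enddate
# If the served date is older than the on-disk one, stale apache workers
# are still holding the old cert and a graceful reload should clear them.
```

Comparing the on-disk date with the served date distinguishes the two failure modes discussed above: a failing letsencrypt periodic job versus apache workers clinging to old data.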
clarkbthank you for your time everyone19:58
clarkbI'll call it here19:58
clarkb#endmeeting19:58
opendevmeetMeeting ended Tue Jan 16 19:58:37 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:58
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2024/infra.2024-01-16-19.00.html19:58
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-01-16-19.00.txt19:58
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2024/infra.2024-01-16-19.00.log.html19:58
tonybThanks everyone19:58
fungithanks!19:59
fricklero/19:59
