clarkb | couple of minutes to our weekly meeting | 18:58 |
---|---|---|
clarkb | I expect it may be lightly attended today. Which is fine. I'll run through the agenda and see what happens | 18:58 |
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Apr 29 19:00:31 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/BQPBY4QVBYNC3VTOU3HXAUTESQSC7WKZ/ Our Agenda | 19:00 |
clarkb | #topic Announcements | 19:00 |
clarkb | Due to travel obligations neither fungi nor I can attend the meeting next week. For that reason we basically decided last week to cancel the next meeting on May 6 | 19:01 |
clarkb | I'm going ahead and making that official now. The May 6 meeting will be cancelled. We'll be back the week after. See you there | 19:01 |
clarkb | Anything else to announce before we dive into today's content? | 19:01 |
clarkb | #topic Zuul-launcher image builds | 19:03 |
clarkb | mnasiadka volunteered to add some arm64 images to zuul launcher. That work is in progress and I think the latest change merged earlier today | 19:03 |
clarkb | So far you should be able to use noble arm64 nodes from zuul launcher and jammy is in progress | 19:04 |
clarkb | this is great to see as it proves we can continue to do multi arch images with zuul launcher and getting help from interested parties is a huge bonus | 19:04 |
clarkb | I think the disk io improvements osuosl made semi recently help keep the image build times reasonable too. A group effort all around | 19:04 |
clarkb | I think the next steps there are to continue to add images and then also start dogfooding them | 19:05 |
clarkb | #link https://review.opendev.org/c/opendev/zuul-providers/+/948318/ is next up | 19:05 |
clarkb | Did anyone else have nodepool in zuul updates? | 19:05 |
clarkb | #topic Container hygiene tasks | 19:06 |
clarkb | Next up we've managed to update all the images to python3.12 except ircbot/limnoria | 19:06 |
clarkb | #link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.12 | 19:06 |
clarkb | I'll continue to link the topic link rather than a specific change in case anyone else finds cases that were missed. If you push changes with the topic we'll get them automatically on the review list that way | 19:07 |
clarkb | otherwise things have gone smoothly. Even jeepyb on the gerrit image is running under python3.12 now | 19:07 |
clarkb | fungi: any specific thoughts on when we should update limnoria? | 19:07 |
fungi | later today? | 19:07 |
clarkb | I'll be around so that should work for me | 19:08 |
fungi | i'm happy to babysit that deploy | 19:08 |
clarkb | cool | 19:08 |
fungi | should be a quiet time so won't disrupt meetings | 19:08 |
clarkb | the other hygiene task is a change to do container image builds with docker hub names forced to resolve to ipv4 addrs via /etc/hosts | 19:08 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/948247 Force docker hub access to happen over ipv4 for better rate limits. | 19:08 |
clarkb | we already force this on the system-config-run-* job side of things to fetch the images via ipv4 when doing service tests | 19:09 |
clarkb | but we also fetch images from docker hub during the image build process and we've hit a few errors there recently due to rate limits | 19:09 |
clarkb | I think this is a good halfway step while we slowly move to quay for everything | 19:09 |
fungi | yeah, i wasn't comfortable single-core approving that too quickly, but if nobody else wants to look it over there's no need to delay further | 19:10 |
fungi | the sooner it merges, the fewer rechecks we'll need (in theory) | 19:10 |
clarkb | it's also theoretically easy to remove later if we like | 19:10 |
clarkb | but ya maybe proceed with limnoria then that and see where we are afterwards in terms of reliability | 19:11 |
clarkb | anything else container related? (I actually have a gerrit container item but I'll save that for the next topic since it is all about gerrit) | 19:11 |
clarkb | #topic Switching Gerrit to run on Review03 | 19:12 |
clarkb | #link https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 Notes on the migration plan | 19:12 |
clarkb | this is basically done at this point | 19:12 |
clarkb | we've been on the new server for just over a week now and the old server is shutdown | 19:13 |
clarkb | since then we've created new projects in gerrit and switched jeepyb over to python3.12 with a new container image (and updated the stop signal to make podman happy) | 19:13 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/882900 Migrate gerrit images to quay | 19:13 |
clarkb | at this point it should be fine for us to move our gerrit image over to quay and we'll maintain speculative testing | 19:13 |
clarkb | however, when we do that we'll rebuild the image so should plan to restart gerrit again. Which means this week may not be the best for that as I'm traveling all next week | 19:14 |
clarkb | I don't think this is urgent though and can pick it up when I am back. I just wanted to make everyone aware of that as part of some of the last followup to this server move | 19:14 |
clarkb | The other big todo remaining is deleting the review02 instance | 19:14 |
clarkb | this server is boot from volume and has a data volume. I figure we can preserve both the boot from volume disk and the data volume but delete the instance (as long as the bfv volume doesn't automatically delete; I should check that) | 19:15 |
clarkb | any concerns with that cleanup? Do you think we should delete the bfv volume and/or the data volume too at this point? | 19:15 |
fungi | sounds fine to me | 19:15 |
fungi | we have backups | 19:16 |
fungi | though we could also snapshot them | 19:16 |
clarkb | ya and either way I figured we could delete the instance then delete the volumes later too | 19:16 |
clarkb | so more time to decide we don't need them | 19:16 |
fungi | in theory snapshots go to cheaper/slower storage so are less taxing than leaving r/w volumes | 19:17 |
clarkb | given that I'll try to do that tomorrow probably. Double check the bfv volume won't delete automatically, name that volume so we know what it is later, then delete the instance | 19:17 |
clarkb | and then as further followup we can snapshot then delete the volume | 19:17 |
clarkb | #link https://www.gerritcodereview.com/3.11.html | 19:17 |
clarkb | the last gerrit related item I have is for us to start thinking about 3.11 upgrades | 19:18 |
clarkb | Ideally we'd get that done before openstack is too deep into the current release cycle | 19:18 |
clarkb | I think that is possible | 19:18 |
fungi | yeah, seems doable | 19:19 |
clarkb | we already test the upgrade and it seems to work on the surface. The big change is that you have to manage refs/meta/config through reviews now by default. We'll want to investigate how that impacts manage-projects if at all given our existing acls | 19:19 |
clarkb | I suspect that our existing acls will mean nothing changes for us and only new installs are affected | 19:19 |
clarkb | but again that's a when-I-get-back item for me. Happy for others to dive in too. I'd probably start with holding a node and doing some manual upgrades on the test setup | 19:20 |
clarkb | #topic Upgrading old servers | 19:20 |
clarkb | I don't have any updates since I did review03 | 19:20 |
clarkb | did anyone else? | 19:20 |
fungi | nope, no word yet on refstack announcement plans | 19:22 |
clarkb | ack thanks | 19:22 |
clarkb | #topic Working through our TODO list | 19:22 |
clarkb | #link https://etherpad.opendev.org/p/opendev-january-2025-meetup | 19:22 |
clarkb | just another friendly reminder that you can find a high level backlog here | 19:23 |
clarkb | happy to discuss any of these with volunteers for picking up work if more info is needed | 19:23 |
clarkb | #topic Rotating mailman 3 logs | 19:24 |
clarkb | fungi: any news on this one? | 19:24 |
fungi | pushing it now, thanks for the reminder | 19:24 |
fungi | #link https://review.opendev.org/c/opendev/system-config/+/948478 Rotate mailman-core logs [NEW] | 19:24 |
fungi | looks like it was simpler than i expected | 19:24 |
clarkb | fungi: do we want to hold a node and let it copytruncate at least once before landing that? | 19:25 |
clarkb | I seem to recall there was concern that copytruncate wasn't sufficient in all cases? Though it seems like it should be since the file handles never change for the running process | 19:25 |
clarkb | thank you for getting that up | 19:25 |
fungi | we can, though i'm not sure it would be an effective test | 19:25 |
clarkb | I think as long as the files have data in them it should exercise it? | 19:26 |
clarkb | and then we can look at lsof to see any obviously leaked fds or something | 19:26 |
fungi | yeah, i guess we just need to confirm that the services keep running and write new loglines | 19:26 |
clarkb | yup | 19:27 |
fungi | copytruncate should avoid the risk of leaking fds | 19:27 |
fungi | since mailman isn't opening a new file | 19:27 |
clarkb | ya that was my understanding too but I remember someone in the upstream issues saying it didn't work. Maybe they were simply mistaken | 19:27 |
clarkb | also worst case we probably remove the logrotate config and then restart mailman so not a huge deal if we land it and it doesn't work as expected | 19:28 |
fungi | right | 19:28 |
fungi | easy enough to recover from | 19:28 |
clarkb | ++ ok I'm fine with proceeding then. I'm also needing to check if *.log works there but I can do that when I review it properly | 19:29 |
clarkb | #topic Renewing wiki's cert | 19:29 |
clarkb | The cert expires while I'm traveling so my goal is to replace it this week | 19:29 |
clarkb | in fact I think that is a good task to do while waiting on limnoria updates later today | 19:29 |
clarkb | so I'll try to get that moving today. If it isn't done by friday say something place | 19:29 |
clarkb | *say something please | 19:29 |
clarkb | as I want to ensure it is done this week | 19:30 |
clarkb | I'll buy a one year cert. Then we may have to renew early next year as I think march 2026 ish is when max cert validity starts to fall below a year | 19:30 |
clarkb | but these certs are cheap enough that I'm fine with that. We lose like $1-$2 worth of cert validity | 19:30 |
fungi | sounds great. terrible but great | 19:31 |
fungi | greatly terrible | 19:31 |
clarkb | #topic Occasional Log Upload Failures to OVH | 19:31 |
clarkb | I've noticed a couple mornings the last week or so that we had very infrequent POST_FAILURE results due to ovh log uploads | 19:32 |
clarkb | both times I noticed this the blip was short and not widespread | 19:32 |
clarkb | so we never disabled that backend | 19:32 |
clarkb | I want to make note of it so that others can keep an eye out and we can debug further if things get worse | 19:32 |
clarkb | and if it does get worse we can always remove that provider | 19:33 |
clarkb | #topic Open Discussion | 19:33 |
clarkb | tonyb started https://review.opendev.org/c/openstack/project-config/+/948033 to discuss hosting rdo in opendev | 19:33 |
clarkb | probably worth reading over if you haven't yet just to call out any concerns. I noted a few things but I think they are all solvable and not blockers | 19:34 |
clarkb | also sean-k-mooney discovered that paste doesn't have utf8 4 byte support in mariadb | 19:35 |
clarkb | we may need to update lodgeit to support 4 byte in the first place then do a db migration (possibly manually) | 19:35 |
clarkb | I think going 3 byte -> 4 byte is a straightforward migration as the db just needs to allocate more disk space and there is no data loss | 19:36 |
fungi | yes | 19:36 |
clarkb | Anything else? | 19:36 |
fungi | i left a comment on the mailman log rotation change with pointers about file globbing | 19:37 |
fungi | spoiler: should be fine | 19:37 |
fungi | unless ansible wants that string quoted or something | 19:38 |
clarkb | ya my main concern is the ansible role supporting it | 19:38 |
fungi | but tests will tell us | 19:38 |
clarkb | but I think ianw fixed the issues with it | 19:38 |
clarkb | previously we used the filename as the name for the logrotate config file but now we hash them iirc | 19:38 |
fungi | oh, i hadn't realized it created problems in the past | 19:38 |
clarkb | ya because we'd get /etc/logrotate.d/*.conf | 19:38 |
fungi | got it, that's why we have e.g. /etc/logrotate.d/854d0b.conf on our servers | 19:39 |
clarkb | but now it's something like $(echo '*' | sha256sum).conf | 19:39 |
clarkb | I just wanted to double check before I +2'd | 19:39 |
fungi | cool, please do! | 19:39 |
clarkb | oh yup the role readme even says it may be a wildcard so I think it's fine | 19:40 |
clarkb | I just have memories of when it wasn't | 19:40 |
fungi | huh, actually we already have logrotate configuration for /var/lib/mailman/web-data/logs/*.log in /etc/logrotate.d/8e2e5c.conf | 19:41 |
clarkb | ya I think maybe we didn't realize how many log files mm3 has | 19:41 |
clarkb | ? | 19:41 |
fungi | yeah, fixing the change now | 19:41 |
clarkb | anything else for the meeting? | 19:42 |
clarkb | I think we can end a bit early otherwise. Thanks for your time helping keep opendev up and running everyone | 19:43 |
fungi | i've got nothing else | 19:44 |
fungi | log rotation change updated though | 19:44 |
clarkb | yup I'll look again | 19:44 |
clarkb | and then I'm going to eat lunch | 19:44 |
fungi | thanks clarkb! time to cook dinner, then can approve limnoria container change | 19:45 |
clarkb | #endmeeting | 19:45 |
opendevmeet | Meeting ended Tue Apr 29 19:45:37 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:45 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-29-19.00.html | 19:45 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-29-19.00.txt | 19:45 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2025/infra.2025-04-29-19.00.log.html | 19:45 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!
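
The copytruncate reasoning from the mailman log rotation topic (the running process keeps the same file handle, so nothing needs to be reopened) can be sketched in a few lines. This is a minimal illustration only, not the change under review: the file name `mailman.log`, the `.1` suffix, and the function `demo_copytruncate` are hypothetical, and the "rotation" is done by hand with a copy followed by an in-place truncate rather than by logrotate itself.

```python
import os
import shutil
import tempfile


def demo_copytruncate(workdir):
    """Simulate a copy-then-truncate rotation against a writer that keeps its file open."""
    live = os.path.join(workdir, "mailman.log")  # hypothetical log name
    rotated = live + ".1"                        # hypothetical rotated name

    # The "service" opens its log once in append mode and never reopens it.
    writer = open(live, "a")
    writer.write("before rotation\n")
    writer.flush()
    inode_before = os.stat(live).st_ino

    # Rotation step: copy the contents aside, then truncate the original in place.
    shutil.copyfile(live, rotated)
    os.truncate(live, 0)

    # The same handle keeps working; append mode puts the write at the new end of file.
    writer.write("after rotation\n")
    writer.flush()
    writer.close()

    with open(live) as f:
        live_text = f.read()
    with open(rotated) as f:
        rotated_text = f.read()

    return {
        "same_inode": os.stat(live).st_ino == inode_before,
        "live": live_text,
        "rotated": rotated_text,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_copytruncate(tmp))
```

Because the live file is never replaced, its inode stays the same and the writer has no stale descriptor to leak, which matches the point made in the meeting. The trade-off is that anything written between the copy and the truncate can be lost, so this approach suits logs where a rare dropped line is acceptable.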