clarkb | meeting will start in just a few minutes | 18:58 |
---|---|---|
fungi | oh, yep! | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Aug 2 19:01:24 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-August/000348.html Our Agenda | 19:01 |
clarkb | I am prepared with an agenda :) | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | The Service Coordinator nomination period has officially begun | 19:02 |
ianw | o/ | 19:02 |
clarkb | It started today and will run through August 16, 2022. I'll send a followup email to the thread I started last week warning people of this timeline :) | 19:02 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-July/000347.html | 19:02 |
clarkb | #topic Topics | 19:03 |
clarkb | #topic Improving OpenDev CD Throughput | 19:03 |
clarkb | I don't have anything new on this item. I'm thinking maybe we can pull it off the agenda until we've got new developments? Seems like we've got a lot of other stuff going on in the meantime | 19:03 |
clarkb | Any objections to that? | 19:04 |
fungi | none from me | 19:05 |
ianw | not really, it is a permanent todo :) | 19:06 |
clarkb | ok cool | 19:06 |
clarkb | #topic Updating Grafana Management Tooling | 19:06 |
clarkb | #link https://review.opendev.org/q/topic:grafana-json | 19:06 |
clarkb | ianw: ^ I think this stack is largely ready to go though I had some comments on it. Not sure if you want to respin or land as is and then improve in followups | 19:06 |
ianw | sorry, i wanted to get back and make sure i responded to comments before merging | 19:07 |
clarkb | either approach is fine with me. I just didn't want anyone to feel my comments were necessary improvements. I did +2 after all | 19:07 |
ianw | i should have time to get to it soon | 19:08 |
clarkb | sounds good | 19:08 |
clarkb | #topic Bastion Host Updates | 19:08 |
clarkb | This item is largely a proxy for the zuul streaming log file cleanup work | 19:08 |
clarkb | at least for now | 19:09 |
clarkb | ianw: did those changes get included in the weekend upgrade of zuul? | 19:09 |
*** | kopecmartin_ is now known as kopecmartin | 19:09 |
clarkb | If so I think we should manually clear out the remaining files on bridge (and static) and then we can monitor to see how many sneak through due to aborted jobs and similar situations with zuul | 19:09 |
ianw | i haven't yet pushed the +w on those changes as i haven't responded to corvus' comments on the files potentially not being removed for aborted jobs | 19:10 |
ianw | the suggestion was a background thread to remove them | 19:11 |
clarkb | got it. FWIW it was my impression that we can land the stack you've got as it is an improvement. Just that we will also want to look into a tmpreaper setup for the straggler files | 19:11 |
ianw | i'm starting to think perhaps documenting the situation a bit better first, and we can see how much of an issue it is, and perhaps if we can land something that puts them in more of a reserved namespace i'd feel better about a generic cleaner | 19:12 |
ianw | so i have a half-written doc change that i'll clean up ... very soon :) | 19:12 |
clarkb | before landing the current improvements? | 19:12 |
ianw | i'll push that on top and feel ok about landing what's there, i think | 19:13 |
clarkb | got it | 19:13 |
clarkb | I just didn't want this to get forgotten as the creation of the tmpfiles will eventually bite us I think :) | 19:14 |
clarkb | this plan seems reasonable though. I think we can move on | 19:14 |
clarkb | #topic Upgrading Bionic Servers to Focal/Jammy | 19:14 |
clarkb | #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes on the work that needs to be done. | 19:14 |
clarkb | I've been using the mailman3 work as a good exercise for checking Jammy generally works for our config management | 19:15 |
clarkb | There are two changes related to improving Jammy support that are landable today | 19:15 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/851094/2 Run system-config-run-base against Jammy | 19:15 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/851266/1 Fix install-docker role for Jammy | 19:15 |
fungi | thanks, i'll make sure to check those out | 19:15 |
clarkb | Overall nothing really crazy about using Jammy yet which is a good thing | 19:16 |
fungi | ahh, i already reviewed one of them ;) | 19:16 |
clarkb | But I haven't gotten to the apache config for mailman3 yet as I've been struggling with various mailman3 related things recently | 19:16 |
clarkb | I'm hoping we'll have functioning apache configs in the mailman3 context soon and that will hopefully expose if that server has any new things we need to accommodate | 19:16 |
clarkb | If you are curious about the mailman3 work it isn't for a bionic upgrade but it is updating some other old software | 19:17 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/851248 WIP change to run mailman3 on Jammy | 19:17 |
ianw | thanks, mm3 seems like something we have to do eventually :) | 19:18 |
ianw | #link https://review.opendev.org/c/zuul/nodepool/+/849273 Dockerfile: move into separate group when running under cgroupsv2 | 19:18 |
ianw | is i guess related, and maybe invalidates your "nothing really crazy" bit :) | 19:18 |
clarkb | The latest thing is hacking around assumptions in the upstream docker image configs. And for some reason port 8000 doesn't show up as listening even though I've gotten the errors out of the service logs now as far as I can tell | 19:18 |
clarkb | oh ya cgroupsv2 is definitely something that falls under crazy :) | 19:18 |
clarkb | I'll take a look at that change today | 19:18 |
clarkb | Anyway there is progress here. Slow but it is happening :) | 19:19 |
clarkb | #topic Gitea 1.17 Upgrade | 19:20 |
clarkb | Gitea made their 1.17.0 release over the weekend. Yesterday I updated my WIP change that was deploying and testing the gitea release candidates to this final release version | 19:20 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/847204 | 19:20 |
clarkb | I don't think we are in a rush to upgrade. I've tried to call out all of the breaking changes in my commit message and give details why they do or do not affect us | 19:21 |
clarkb | The screenshots and our testing seem to show it generally works though. So if we are happy after reviews I think we can upgrade whenever we feel ready | 19:21 |
clarkb | Seems like we're making good time through the agenda | 19:23 |
clarkb | #topic Rocky Images Not Booting | 19:23 |
clarkb | Late last week it was pointed out that our Rocky Linux images were no longer booting | 19:23 |
clarkb | I pulled one up in a rescue instance and noticed there were no kernels in /boot and there were no entries in grub.cfg to boot | 19:24 |
clarkb | ianw: managed to trace this back to the machine ID missing in the image which prevents kernels from being installed to /boot and that prevents grub from updating its config to boot a kernel | 19:24 |
clarkb | The fix for this has landed in DIB. The next steps in addressing this are to make a DIB release and update nodepool to use that release. | 19:24 |
clarkb | However, there is one change that we'd like to get into the DIB release that hasn't merged yet which adds support for rocky linux 9 | 19:25 |
clarkb | In any case we should expect this to be happy in the near future | 19:25 |
clarkb | ianw: out of curiosity how did you trace it back to the machine id? | 19:26 |
ianw | i started by looking at the dib functional boot jobs; luckily we had one that did work with logs still so had a comparison point | 19:27 |
ianw | eventually i realised that the upstream container had updated since that run, so that cracked open something to explore | 19:27 |
ianw | comparing to the older version of the container, the new one didn't have a /boot directory ... which i thought might be the problem at first | 19:28 |
ianw | when that didn't pan out, i started wondering about how the kernel actually got into /boot, which led me to the rpm scripts run by kernel-core package, which led to tracing /bin/kernel-install, which led to realising it was looking for a machine-id ... | 19:29 |
ianw | just now though, weirdly, following this same thread on the rocky9 images, there doesn't seem to be a kernel installed either. but the jobs are booting ok there when built under dib. so i don't know what's going on with that | 19:30 |
clarkb | I also wonder if the rhel stuff needs similar fixes? Or are they likely shielded by not starting from a container image? | 19:32 |
ianw | i do feel like we've been down a similar path with machine-ids and kernel installs on the other build types, in various incarnations | 19:33 |
ianw | https://review.opendev.org/c/openstack/diskimage-builder/+/675056/ even | 19:34 |
clarkb | ha ya ok | 19:34 |
ianw | oh god, how depressing | 19:35 |
ianw | https://bugzilla.redhat.com/show_bug.cgi?id=1737355#c7 | 19:35 |
ianw | "Oh, and I actually debugged this once before 2 years ago and forgot about it! :) | 19:35 |
ianw | https://review.opendev.org/#/c/504300/" | 19:35 |
ianw | so now i've debugged it 3 times?! | 19:35 |
fungi | that sounds familiar | 19:36 |
clarkb | #topic Open Discussion | 19:38 |
clarkb | Seems like the last topic was well covered :) and that was all on the agenda | 19:38 |
ianw | could i ask for reviews on | 19:39 |
ianw | #link https://review.opendev.org/q/topic:ansible-lint-update-6 | 19:39 |
fungi | oh, yep | 19:40 |
ianw | there's a lot, but hopefully nothing too controversial. it should bring everything in sync | 19:40 |
ianw | everything being our *-jobs repos | 19:40 |
fungi | lots are already merged too | 19:40 |
clarkb | related to linting, the new rules about spaces after keywords hit zuul | 19:41 |
clarkb | I expect that will continue to crop up in places | 19:41 |
fungi | related to the rocky troubleshooting, 851520 is probably safe to merge now | 19:41 |
ianw | speaking of that, i proposed | 19:41 |
ianw | #link https://review.opendev.org/c/openstack/releases/+/851273 hacking: release 5.0.0 | 19:42 |
ianw | i don't know who usually looks after that. that would bring in a flake8 that is 3.10 compatible | 19:42 |
fungi | also on an entirely separate note, i have a wip behavior change proposed for git-review which could use some feedback as to whether it's desirable: https://review.opendev.org/850061 | 19:42 |
clarkb | ianw: I think that openstack often updates those at the beginning of a cycle so may be too late now and have to wait for ~end of october? | 19:43 |
ianw | yeah, true, i guess it has potential to do more than just support 3.10 | 19:43 |
frickler | ianw: I think it belongs to qa, so kopecmartin | 19:44 |
clarkb | Anything else? Last call | 19:47 |
fungi | i got nuthin | 19:47 |
fungi | oh, though i don't expect to be around much thursday, just a heads up to everyone | 19:49 |
clarkb | thank you for the heads up. | 19:49 |
clarkb | Thanks everyone. We'll see you back here next week | 19:49 |
clarkb | #endmeeting | 19:49 |
opendevmeet | Meeting ended Tue Aug 2 19:49:44 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:49 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.html | 19:49 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.txt | 19:49 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.log.html | 19:49 |
fungi | thanks clarkb! | 19:49 |