clarkb | almost meeting time | 18:59 |
---|---|---|
frickler | o/ | 19:00 |
ianw | o/ | 19:01 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Jun 6 19:01:18 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | there we go was wondering what happened to the bot | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/Y3ZTMR6ZJJDWPZPNWYB32UC2HHGFZH73/ Our Agenda | 19:01 |
clarkb | This just went out. Sorry about that, I got super nerd sniped yesterday afternoon looking into the python and openafs thing | 19:01 |
clarkb | debuginfod is really cool and useful btw | 19:02 |
clarkb | #topic Announcements | 19:02 |
clarkb | A reminder that next week we'll skip having a meeting since several of us will be attending the open infra summit | 19:02 |
clarkb | Then for June 20th (the meeting after next) I won't be able to make it as I'll be in the middle of travel. I'm happy for others to run a meeting if they like. I just can't run it myself | 19:03 |
clarkb | #topic Topics | 19:03 |
clarkb | I removed the quay topic. I think it's basically in a steady state situation now | 19:04 |
clarkb | dib functional testing is working again too | 19:04 |
clarkb | #topic Bastion Host Updates | 19:04 |
clarkb | #link https://review.opendev.org/q/topic:bridge-backups | 19:04 |
clarkb | tonyb appears to have reviewed this stack, thanks! Fungi mentioned he would take a look too but I don't think that has happened yet | 19:05 |
clarkb | The other thing I was thinking about recently is we should probably start looking at updating our ansible version on bridge to ansible 8 (we are currently on 7) | 19:05 |
fungi | no, i've been too distracted by other things, sorry | 19:05 |
clarkb | In theory that will be self testing and if we get the change that bumps things up to run all the system-config-run-* jobs we should get really good coverage of it | 19:05 |
clarkb | Not sure if anyone is interested in doing that. I don't think I have time for the next couple of weeks but I may start looking after if no one else beats me to it | 19:06 |
clarkb | Throwing it out there if there is interest since I think it could be a good one as our testing for it should be robust | 19:06 |
ianw | i think we're testing the git master so it should be fairly easy | 19:06 |
clarkb | ianw: I think that has been failing though. But proposing a change to move the cap from <8 to <9 should do what we need | 19:07 |
clarkb | and then ensuring all the jobs we want to trigger also trigger | 19:07 |
fungi | zuul is well behind that in what versions it supports, but shouldn't be an issue for our nested ansible calls, right? | 19:07 |
clarkb | fungi: correct | 19:07 |
clarkb | they are pretty well separated in our environment | 19:07 |
clarkb | #topic Mailman 3 | 19:08 |
clarkb | fungi: any updates with your testing of the vhost fixes? | 19:08 |
fungi | nope, distractions | 19:09 |
fungi | sorry | 19:09 |
clarkb | hopefully post summit we'll all have fewer distractions | 19:09 |
clarkb | #topic Gerrit Updates | 19:09 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/884779 Revert bind mounts for Gerrit plugin data | 19:09 |
clarkb | I pushed this change because I think I've decided that in the short term the best thing for us may be to just clear out that data when we launch new gerrit containers | 19:10 |
clarkb | I'd love feedback on that and/or the changes I pushed to manually clear things on gerrit startup | 19:10 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/880672 Dealing with leaked replication tasks on disk | 19:10 |
clarkb | I don't think this second change will clear all the files we need to clear but it will be easier to see what else is leaking after we land it if we decide to go that route instead. Feedback very much welcome either way or if you have alternative suggestions | 19:10 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/885317 Update Gerrit 3.8 image to 3.8.0 final release | 19:11 |
clarkb | I also pushed this update yesterday to update our 3.8 image. This won't affect production, but will make our upgrade testing a bit more realistic | 19:11 |
clarkb | #topic Upgrading Old Servers | 19:12 |
clarkb | #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes | 19:12 |
clarkb | I've replaced the old zp01 server with a zp02 server. Though nothing may be using it... | 19:12 |
clarkb | Still on the list are a handful of mirror nodes, meetpad, and I think the insecure ci registry | 19:13 |
clarkb | I'm going to continue to try and pick them off one by one as I can. Probably mirror nodes will be my next target | 19:13 |
clarkb | Help welcome | 19:13 |
tonyb | can you link to the zp01 change? | 19:13 |
clarkb | tonyb: yes one sec | 19:14 |
tonyb | that'd help me grok what's needed | 19:14 |
tonyb | rather than thinking I know what to do ;P | 19:14 |
clarkb | tonyb: https://review.opendev.org/q/topic:replace-zp01+OR+topic:zp02 | 19:14 |
clarkb | tonyb: you do need a root to launch the new node, but if you propose a change like https://review.opendev.org/c/opendev/system-config/+/885076 which updates the test node label type so that testing shows jammy works then I'm happy to do that with/for you | 19:15 |
tonyb | perfect | 19:15 |
clarkb | basically mock it up and see that testing shows it is happy then I can launch the node and stick it in the change in the inventory hosts file | 19:15 |
clarkb | tonyb: insecure-ci-registry would probably be a good one and/or the mirrors, since they have minimal state on host. meetpad is a bit weird because we need to sort out how to replace the control plane for that service | 19:16 |
clarkb | Related to all this but slightly different is the update of zuul servers to jammy for potential podman use | 19:16 |
clarkb | (currently going to jammy but still using docker) | 19:16 |
tonyb | got it | 19:17 |
clarkb | All of the mergers have been replaced and 6 executors have been booted. This exposed a behavior in new openafs where lseek()ing the openafs ioctl device/file crashes the process with a kernel oops | 19:17 |
clarkb | this was a problem with zuul because it uses python open() which does an lseek under the hood. corvus replaced that open() with an os.open() which does not do any magic under the hood | 19:18 |
corvus | (actually all 12 booted) | 19:18 |
clarkb | corvus: oh cool I thought it was only 6 | 19:18 |
clarkb | static02 is a jammy node running jammy openafs and that has been functional so we expect that once we launch updated executor code on the new nodes they should be happy | 19:18 |
clarkb | this was a specific corner case that was sad | 19:18 |
corvus | yeah, plan was to get it all done before monday :/ | 19:18 |
ianw | so was this https://gerrit.openafs.org/#/c/14918/ ? i'm unclear if that fix wasn't there or if it's a different issue | 19:19 |
clarkb | ianw: the problem persisted in the 1.8.9 package the ppa built | 19:20 |
clarkb | I think you expected that version to include that patch? if so then I don't think that patch was the fix | 19:20 |
clarkb | however the description in that change definitely seems to match the behavior we saw | 19:20 |
corvus | (though i agree, the words in the commit message sure make that sound like it should be a fix) | 19:21 |
ianw | yeah, i perhaps miscalculated if that patch is there ... | 19:21 |
clarkb | in any case I think it is unlikely we'll be opening that structure outside of the single place zuul does it or in openafs itself | 19:21 |
clarkb | and openafs itself seems to be working on static02 so we're probably good? Let's proceed and keep an eye on it and we can always look at rebuilding new ppa packages later if necessary | 19:22 |
ianw | $ cat src/afs/LINUX/osi_ioctl.c | grep default_llseek | 19:23 |
ianw | $ | 19:23 |
ianw | ... sigh ... you would have thought I'd think to check that :/ | 19:23 |
corvus | (pam modules would be the other place to watch out for that) | 19:23 |
ianw | that at least explains why 1.8.9 didn't fix it for us | 19:23 |
clarkb | cool I think that gives us a path forward should this problem continue to persist. We can build a 1.8.9 with that fix backported | 19:24 |
ianw | so the counterpoint to that is that we could pull that patch in | 19:24 |
clarkb | ianw: yup or convince ubuntu to pull it in maybe | 19:24 |
ianw | sorry about that. i don't know why i didn't think of it until just then | 19:24 |
clarkb | but it doesn't seem necessary yet so I think we can proceed with distro 1.8.8 and take it from there | 19:24 |
clarkb | Anything else related to server updates? | 19:25 |
ianw | it is in the openafs-stable-1_8_x branch in the openafs git | 19:26 |
fungi | fixed in 1.9.x at all? | 19:26 |
fungi | mainly curious, i have no idea how much unrelated breakage 1.9 will mean | 19:26 |
clarkb | cool so we can fetch the backport out of that branch then and it should apply cleanly to the existing packaging | 19:27 |
ianw | that was what i looked at. but that doesn't seem to have tags | 19:27 |
clarkb | fungi: I think it merged to 1.9 | 19:27 |
fungi | ahh | 19:27 |
clarkb | ya I think 1.9 is master right now? | 19:27 |
clarkb | #topic Fedora cleanup | 19:28 |
clarkb | tonyb: you've been poking at the zuul-jobs role stuff for this and corvus pointed out the possibility of using the new thing that we never actually switched to... | 19:29 |
tonyb | Yeah | 19:29 |
clarkb | tonyb: did you have thoughts on what makes sense for pushing this ahead? | 19:29 |
tonyb | I'm looking for feedback on timing and priorities | 19:29 |
fungi | (there's an openafs 1.9.1 or 1.9.2 maybe, so they're supposedly releasing from it, but maybe they aren't making tags there) | 19:30 |
tonyb | The new thing looks good but I admit I'm not in a position to judge the effort needed to make it a reality | 19:30 |
clarkb | I'll be honest my personal priority for this is low, I was just trying to find easy wins for maybe adding rocky/bookworm mirroring | 19:30 |
tonyb | I get that the fedora_mirror_enabled is a hack | 19:31 |
clarkb | for this reason I'm personally happy to take our time and add the new thing to configure mirrors. But that likely would take longer than bookworm's release date | 19:31 |
frickler | how much effort is increasing afs capacity? | 19:31 |
clarkb | tonyb: the main thing is going to be adding configuration for the new thing and adding the new thing to the base-test base jobs and then reparenting a representative sampling of jobs to base-tests to ensure it is doing what we expect | 19:31 |
tonyb | corvus: It'd be good to get your thoughts on how desirable the new thing is | 19:31 |
fungi | bookworm release day is saturday, btw | 19:32 |
clarkb | frickler: you need to add volumes to existing servers (easy but increases potential for failures) or add new servers (more work) | 19:32 |
tonyb | I hesitate, could we "fork" the configure-mirrors role into openstack-jobs, remove the fedora stuff while I do the right thing in zuul? | 19:33 |
tonyb | clarkb: Yup I can totally do that. | 19:33 |
clarkb | tonyb: you'd probably need to fork it into opendev/base-jobs | 19:33 |
frickler | maybe it still would be worth to decouple cleaning up old mirrors from setting up new stuff? | 19:33 |
clarkb | tonyb: since this will affect all opendev users | 19:33 |
corvus | well, the new thing is designed to have the kind of flexibility we apparently are now starting to require, so i think it's better for opendev and the wider community (in that it lets others actually use mirror roles in zuul-jobs which, basically, we're the only ones who can actually use right now) | 19:33 |
tonyb | Okay, same idea but wrong repo ;P | 19:33 |
clarkb | frickler: yes that is doable too. It would still be lowish priority for me though which is why I was looking for easy wins | 19:34 |
clarkb | I just don't have time to add new distro content when I can barely keep up with what we already have so my personal preference is cleanup first | 19:34 |
corvus | i think it's the classic long-term/short-term balancing act, and i don't have a good read on making that decision. so i'm just able to provide background. :) | 19:35 |
frickler | iirc the patch to add bookworm mirroring is already present, just needs afs capacity | 19:35 |
tonyb | corvus: okay. | 19:35 |
clarkb | frickler: that and cleanup of buster | 19:35 |
clarkb | I think if I were trying to lead this I would look at doing the new mirror setup thing, test it with base-test, update base, clean up fedora mirroring, then decide if we need to adjust capacity from there or not | 19:36 |
clarkb | because from where I'm sitting we can't keep up so reducing effort first is a win | 19:36 |
clarkb | but if others want to push bookworm ahead and do something different i'm ok with that too | 19:36 |
tonyb | Okay. We'll make that the plan. | 19:37 |
clarkb | also we can do the two things concurrently which is nice | 19:37 |
clarkb | they don't conflict with each other | 19:37 |
tonyb | I'll put myself on the hook for making that happen .... as long as I can count on help/support for doing the new mirror_info thing | 19:37 |
clarkb | yup I can continue to help | 19:38 |
tonyb | perfect | 19:38 |
clarkb | #topic Storyboard | 19:39 |
clarkb | I think fungi has been keeping up with the updates there. But anything I've missed worth calling out? | 19:39 |
fungi | nothing new since last week, afaik | 19:40 |
clarkb | #topic Open Infra Summit | 19:41 |
clarkb | I sent out an email trying to organize a low key gathering for those of us that will be there (for Zuul and OpenDev but really no one is counting). The beer garden worked really well in berlin and about 2.5km away from the summit venue is a brewery with a ton of outdoor picnic table type setups | 19:42 |
clarkb | I'm hoping it doesn't rain (current forecast says it will rain in the morning but be dry in the afternoon) and we can go hang out Thursday at 6ish there | 19:42 |
clarkb | It looks like it may be a little cool. 67F/19C as a high and it will probably be a bit cooler in the evening so fingers crossed that still works out | 19:43 |
clarkb | if it gets too cold or rainy we'll figure it out then | 19:43 |
corvus | that's warmer and drier than here now... maybe i should open a beer | 19:44 |
clarkb | Also a reminder that next week is the summit. I expect it will get quiet around here. | 19:44 |
clarkb | #topic Open Discussion | 19:44 |
clarkb | Anything else? | 19:44 |
frickler | I'm slowly working my way through zuul config error cleanups | 19:45 |
corvus | related to the ongoing work to clean up errors, there are some zuul changes arriving soon that will hopefully help with that. new layout for the config errors page, ability to filter, and display of warnings. | 19:46 |
corvus | (i also have adding sorting to that list, but that will make more sense after the warnings arrive, so that may be a few changes away still) | 19:46 |
frickler | got the consent from the TC now to force-merge things that get stalled due to failing CI | 19:46 |
clarkb | frickler: I pushed a DNM change to confirm that github reports there are no valid merge methods for ansible and testinfra :/ | 19:46 |
clarkb | frickler: but I think your removal of the project listings is working fine so we don't need to dig into that with any urgency | 19:47 |
frickler | clarkb: yes, I saw that. some day zuul will likely need to switch to graphql for that | 19:47 |
clarkb | or figure out if different perms are now required | 19:48 |
frickler | corvus: I've switched to using the json from the API, will that also change? | 19:49 |
corvus | frickler: so far only new fields | 19:49 |
clarkb | I think I can give everyone 10 minutes back | 19:50 |
clarkb | feel free to continue discussion in #opendev or on the mailing list | 19:51 |
clarkb | thank you for your time and help! | 19:51 |
clarkb | #endmeeting | 19:51 |
opendevmeet | Meeting ended Tue Jun 6 19:51:11 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:51 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-06-06-19.01.html | 19:51 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-06-06-19.01.txt | 19:51 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-06-06-19.01.log.html | 19:51 |
fungi | thanks clarkb! | 19:51 |
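
A note on the `open()` versus `os.open()` exchange at 19:18: the sketch below shows the general shape of reading a file through the descriptor-level calls rather than the built-in `open()`. It is an illustration only and not the actual Zuul change; the helper name `read_raw` is hypothetical, and a regular temporary file stands in for the OpenAFS ioctl file mentioned in the log.

```python
import os
import tempfile


def read_raw(path, size=4096):
    """Read up to `size` bytes using only os.open/os.read/os.close.

    Hypothetical helper: it goes straight to the descriptor-level calls and
    skips the file-object layer that the built-in open() wraps around them.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, size)
    finally:
        # Always release the descriptor, even if the read fails.
        os.close(fd)


if __name__ == "__main__":
    # Stand-in regular file; the case in the log was the OpenAFS ioctl file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
    try:
        print(read_raw(tmp.name))
    finally:
        os.unlink(tmp.name)
```

As described in the log, the difference that mattered is that the built-in `open()` can issue an lseek on the descriptor while setting up its file object, whereas `os.open()` only opens it and leaves every further call to the caller.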
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!