clarkb | Our team meeting will begin momentarily | 19:00 |
---|---|---|
ianw | o/ | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Apr 25 19:01:02 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/UT4QRPDF3HXQNR3PTIJZR5MIP5WPOTAW/ Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I don't have any announcements this week | 19:01 |
clarkb | #topic Topics | 19:01 |
clarkb | we can dive right in | 19:01 |
clarkb | #topic Migrating to Quay.io | 19:01 |
clarkb | A fair bit has happened on this since we last met | 19:02 |
clarkb | In the Zuul world changes are up to convert all of the zuul image publications to quay.io instead of docker hub | 19:02 |
clarkb | I think they all pass CI now too so Zuul should move soon and we'll want to update all our docker-compose.yaml files to match when that happens | 19:02 |
clarkb | Piggybacking off of the hard work done in Zuul I've started doing some of the work for opendev as well. In particular I have copied our images from docker hub to quay.io to pull in old content. Most of our images only tag latest so for most things this isn't super necessary, but other images (like Gerrit) do tag versions. | 19:03 |
corvus | technically the zuul-operator doesn't pass, but i think the current issue is bitrot and there's only a small chance there's a container image issue lurking there. | 19:03 |
clarkb | I did skip four images that I don't plan to transfer over. opendevorg/bazel, opendevorg/grafana, opendevorg/jitsi-meet-prosody, and opendevorg/jitsi-meet-web since none of these are images we use today | 19:04 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/881285 WIP change to convert an image to quay.io using new jobs. | 19:04 |
fungi | sounds good to me, thanks for working through that | 19:04 |
clarkb | This change is WIP and open for feedback. corvus already had some feedback which i need to address | 19:04 |
clarkb | corvus: did you see my message in #opendev earlier today? I'm tempted to update ensure-quay-repo to check for a registry-type flag on each image in container_images and if it is set to quay then the role would act on it otherwise it would skip. Then we can run the role in the opendev base job | 19:05 |
clarkb | the major downside to this is that opendev quay users would need to include the api_token. I suppose I can skip if that isn't defined either | 19:05 |
corvus | clarkb: ah missed that message, but i like the idea. also, checking for the api_token is probably a good idea; i don't think we want to *require* the role :) | 19:06 |
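The skip logic discussed above might look something like the following (a hypothetical Python sketch for illustration only — the real role is written in Ansible, and the `registry-type` flag and `api_token` names are the ones proposed in the conversation, not a finalized interface):

```python
def images_to_manage(container_images, api_token=None):
    """Return only the images the ensure-quay-repo role should act on.

    Skip everything when no API token is configured (so the role is not
    required by the base job), and skip any image entry not explicitly
    flagged with registry-type: quay.
    """
    if not api_token:
        return []
    return [img for img in container_images
            if img.get("registry-type") == "quay"]
```

With this shape, running the role unconditionally in the opendev base job is safe: jobs without a token or without quay-flagged images simply become a no-op.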
clarkb | ok, I can work on that and that will result in changes to 881285 (and a new change) | 19:06 |
clarkb | I did test ensure-quay-repo this morning and it worked for me locally outside of zuul | 19:06 |
clarkb | The other thing to keep in mind for opendev image updates is that we should resync the images from docker hub to quay.io just before we update the zuul jobs to build the image. That way we have the latest and greatest on both sides before the publication method changes | 19:07 |
clarkb | And I don't think we should try to do all images at once. It will be too much churn and potential broken stuff to debug. We should go an image or two at a time and ensure that we're deploying the image from its new home afterwards | 19:08 |
ianw | ++ | 19:08 |
clarkb | I'll try to pick this back up again later today. I think it is realistic that at least some of our images are published to quay.io automatically before our next meeting | 19:08 |
clarkb | any other questions/concerns/comments on this effort? | 19:09 |
clarkb | #topic Bastion Host Updates | 19:10 |
clarkb | I think the backups changes still need reviews? Is any other infra root willing/able to look at those? I think there is value here but also getting it wrong is potentially dangerous so review is valuable | 19:11 |
ianw | yeah it only makes sense if we have enough people who want to hold bits of it | 19:11 |
fungi | i can, keep meaning to | 19:12 |
clarkb | thanks. | 19:12 |
fungi | what's the review topic again? | 19:12 |
clarkb | #link https://review.opendev.org/q/topic:bridge-backups | 19:12 |
fungi | thanks | 19:12 |
clarkb | Anything else bastion related? | 19:13 |
ianw | not from me | 19:13 |
clarkb | #topic Mailman 3 | 19:14 |
clarkb | fungi: any progress with the held node? | 19:14 |
fungi | sorry, nothing to report yet. i've started playing with a new held node, but spent the past week catching up on urgent things that piled up while i was on vacation | 19:14 |
clarkb | #topic Gerrit Updates | 19:15 |
clarkb | before more important Gerrit items I want to note we are still running the most recent proper release :) but gerrit 3.8 seems imminent | 19:15 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/879906 Gerrit also normalized indentation of config files we should consider this to be in sync | 19:15 |
clarkb | fungi: ^ is a WIP change. We had talked about modifying the normalizer to insert a comment to explain the format to users. Is that the main missing piece before we unwip this change? | 19:16 |
fungi | yeah, i guess so | 19:16 |
ianw | i did propose something for that | 19:17 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/880898/ better document gerrit acl config format | 19:17 |
ianw | #link https://review.opendev.org/c/openstack/project-config/+/880898 | 19:17 |
fungi | also can't recall if i mentioned it already, but i saw something recently in the gerrit upstream matrix channel about changes to how links are done with ui plugins in 3.8, so we'll need to make sure we test our gitea integration | 19:17 |
ianw | that i think helps generally. if something is wrong, it should give you a clue what it is without having to read the source | 19:17 |
clarkb | looks like the stack needs a rebase too? | 19:18 |
ianw | oh something might have merged, i can rebase | 19:18 |
clarkb | thanks i think we can remove the wip from the parent change at that point too | 19:18 |
clarkb | then it is just a matter of reviewing and applying the changes | 19:18 |
fungi | it'll need a rebase for every single change that touches or adds an acl, so we probably need to just make a decision to merge, then rebase, then review while other changes are on hold | 19:18 |
clarkb | fungi: yes, that should be testable from our test instances since the links will send you to valid opendev.org links | 19:19 |
clarkb | fungi: something worth checking when we do upgrade testing though | 19:19 |
ianw | fungi: you're also still wip on it | 19:20 |
ianw | oh sorry, clarkb mentioned that | 19:20 |
clarkb | The other Gerrit thing to call out is cleaning up 3.6 images and updating our upgrade job | 19:20 |
fungi | ianw: yeah, i had proposed it as a straw man seeking folks to convince me the benefits outweighed the negatives | 19:20 |
clarkb | I put this on the list of things that new contributors could do, but I think we should probably tackle this sooner than later. I can probably poke at this later this week | 19:20 |
fungi | i guess i'm convinced enough to un-wip since others see that balance tipping toward benefit | 19:21 |
ianw | ++ i might have some time too | 19:21 |
clarkb | fungi: ya having cleaner diffs in the future is my motivation and that seems worthwhile to me | 19:21 |
clarkb | The last Gerrit item I have is the replication plugin leaking all over the review_site/data dir | 19:22 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/880672 Dealing with leaked replication tasks on disk | 19:22 |
clarkb | This is where I started digging into the problem and while that will help a bit it won't fully solve the issue | 19:22 |
clarkb | I ended up reading a fair bit of the plugin code and that resulted in two upstream bugs | 19:23 |
clarkb | #link https://bugs.chromium.org/p/gerrit/issues/detail?id=16867 | 19:23 |
clarkb | This first issue is related to trying to replicate All-Projects and All-Users. I'm pretty sure that what happens here is the replication plugin checks if it should replicate those repos after creating the waiting/ file on disk, but then short circuits when it shouldn't replicate without sending the finished event which cleans things up | 19:24 |
clarkb | #link https://bugs.chromium.org/p/gerrit/issues/detail?id=16868 | 19:24 |
clarkb | This second one is related to the plugin filtering out refs that it shouldn't replicate on changes it would otherwise replicate. For example you update a change via the web ui, this updates three refs. Two of the three should be replicated but not the third. I think this creates a new file on disk that gets properly managed and deleted, orphaning the original with all three refs | 19:25 |
clarkb | I think this situation is also related to the massive number of errors we get on startup. In those cases I suspect that we are actually filtering all refs for replication and it confuses the state machine | 19:26 |
clarkb | My workaround change addresses the first issue and some of the second issue. | 19:26 |
clarkb | I think it is still worth landing to reduce mental overhead but it likely won't completely solve the noise problem on startup. | 19:27 |
clarkb | I am hoping someone upstream can chime in on whether or not my ideas for fixing these problems are sound but I haven't seen any responses yet | 19:27 |
clarkb | And after doing all that I'm reasonably sure that deleting the files is completely fine and that the only harm here is the leaking on disk | 19:28 |
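Since the leaked task files are apparently safe to delete, a cleanup pass could be as simple as age-filtering the waiting directory (a hypothetical sketch — the directory layout and the one-day threshold are assumptions for illustration, not necessarily what change 880672 does):

```python
import time
from pathlib import Path

def leaked_tasks(waiting_dir, max_age=86400):
    """Return replication task files older than max_age seconds.

    Anything still sitting in the waiting/ directory long after the
    replication events that should have cleaned it up has almost
    certainly been orphaned by the short-circuit bugs described above.
    """
    now = time.time()
    return sorted(p for p in Path(waiting_dir).iterdir()
                  if p.is_file() and now - p.stat().st_mtime > max_age)
```

Listing first and deleting as a second step keeps the operation easy to dry-run before trusting it on the production review_site/data dir.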
ianw | i'm guessing there's not a ton of activity on that plugin ... you might be the expert now :) | 19:28 |
clarkb | ya if I can find some time to set up a test instance (I'll likely hold one launched by zuul) and recompile etc I will try to fix it | 19:28 |
clarkb | just lots of other stuff going on now that I'm really certain this is almost harmless :) | 19:29 |
fungi | and also the people who do have an interest in it are mostly interested in privileged gerrit-to-gerrit replication | 19:29 |
clarkb | fungi: yup that was something I realized in debugging this. Seems like very few people use the plugin with the restrictions we have | 19:29 |
clarkb | That was all I had for Gerrit. Anything else? | 19:29 |
clarkb | #topic Upgrading Servers | 19:31 |
clarkb | Etherpad is now running on the new server. The move went well and it stuck to the expected timeline | 19:31 |
clarkb | We have since removed etherpad01 from our inventory and dns. I shutdown the server via the nova api too. | 19:31 |
clarkb | At this point we're probably comfortable deleting the server and its cinder volume? | 19:31 |
fungi | fine by me, yeah | 19:32 |
ianw | me too, new server lgtm | 19:32 |
ianw | and we double checked the backups | 19:32 |
fungi | i've been using it heavily, no problems | 19:32 |
clarkb | great I'll try to get that done (probably tomorrow) | 19:33 |
clarkb | Nameserver replacement is also in progress | 19:33 |
clarkb | Looks like the change to deploy the new servers (but not use them as resolvers yet) has landed | 19:33 |
clarkb | #link https://etherpad.opendev.org/p/2023-opendev-dns | 19:33 |
ianw | yep they should be active today, i'll keep on top of it | 19:33 |
clarkb | ianw: is there anything else you need at this point? | 19:34 |
ianw | nope, i'll let people know when we can change registry records | 19:34 |
clarkb | thanks! | 19:34 |
ianw | i am assuming gating.dev and zuul* on gandi are corvus? | 19:35 |
clarkb | I doubt I'll have time to take on any replacements this week, but I'd like to grab another one or two and push on them next week. Maybe jitsi meet or mirrors or something | 19:35 |
clarkb | ianw: I think so | 19:35 |
ianw | actually zuul-ci.org is different to zuulci.org | 19:35 |
fungi | the zuulci.org domain is likely the foundation | 19:35 |
ianw | at least the registrar is | 19:36 |
corvus | yep | 19:36 |
fungi | yeah, zuulci.org is using csc global for its registrar, which is who the foundation goes through | 19:36 |
fungi | any whois for our domains with "Registrar: CSC Corporate Domains, Inc." is pretty much guaranteed to be foundation staff coordinated | 19:37 |
clarkb | I did give the foundation registrar managers a heads up this was going to happen soon too. They are ready and willing to help us when the time comes | 19:37 |
fungi | having a clear list of the domains we want them to update will be key, of course | 19:38 |
clarkb | fungi: sounds like opendev.org and zuulci.org | 19:38 |
clarkb | gating.dev and zuul-ci.org are the other two domains hosted by our nameservers and corvus has control of those | 19:38 |
clarkb | Anything else related to nameservers or replacing servers generally? | 19:39 |
ianw | nope | 19:40 |
clarkb | #topic AFS volume quotas | 19:40 |
clarkb | Good news! The usage seems to have stabilized? | 19:41 |
clarkb | it's still quite high but I don't think it has budged since last week | 19:41 |
fungi | the universe has changed in your favor, collect three tokens | 19:41 |
ianw | i hope to get back to wheel cleanup | 19:41 |
ianw | and fedora upgrades | 19:41 |
clarkb | sounds good and ya this continues to not be an emergency but I'm trying to keep an eye on it | 19:42 |
clarkb | #topic Gitea 1.19 | 19:43 |
clarkb | At this point I'm happy waiting for 1.19.2. One of ianw's bugs has been fixed and will be included in 1.19.2. The other is marked as should be fixed for 1.19.2 but hasn't been fixed yet | 19:43 |
clarkb | There are no critical features we are missing. 1.19 was largely cut to add their actions implementation from what I can see | 19:43 |
clarkb | If that changes we may accelerate and deploy 1.19.1 before .2 | 19:44 |
clarkb | #topic Quo vadis Storyboard | 19:44 |
fungi | as an aside, are we any closer to being concerned about needing to switch to forgejo, or has that teapot stopped tempesting? | 19:44 |
clarkb | #undo | 19:44 |
opendevmeet | Removing item from minutes: #topic Quo vadis Storyboard | 19:44 |
clarkb | fungi: no, as far as I can tell upstream development is still happening in the open via loosely organized maintainers that write down their release goals at the beginning of their dev cycles | 19:44 |
clarkb | No features have been removed. Licensing has remained the same. etc | 19:45 |
fungi | any idea if forgejo has hard-forked or is continuing to pull new commits from gitea? | 19:45 |
clarkb | I think their intention is to avoid a hard fork | 19:45 |
fungi | just wondering if it's diverging substantially over time, or sitting stagnant | 19:45 |
clarkb | from what I've seen (and I'm totally not an expert) its basically the same code base with a different name | 19:46 |
fungi | the less it becomes like gitea, the harder it will be to switch if we decide we want to | 19:46 |
fungi | hence my curiosity | 19:46 |
fungi | anyway, i can read up on it, just didn't know if anyone had seen any news on that front | 19:47 |
clarkb | looking at their release notes they seem to be making intermediate releases between gitea releases but following gitea releases overall | 19:48 |
clarkb | and ya double checking on that is probably reasonable. I just haven't seen anything indicating major differences yet | 19:48 |
clarkb | #topic Pruning vexxhost backup server | 19:48 |
clarkb | We've gotten a few emails about this recently. We are at 92% of capacity. I suspect adding some of these new servers hasn't helped either | 19:48 |
clarkb | I can probably run the pruning script tomorrow | 19:49 |
ianw | i think we're about monthly on pruning that, so it probably tracks | 19:49 |
clarkb | and then maybe we check relative disk usage compared to prior prunes | 19:49 |
clarkb | ah ok | 19:49 |
ianw | i can run it and watch it | 19:50 |
clarkb | ianw: if you like. I'm happy to do it tomorrow too (I just have enough on my plate for today) | 19:50 |
ianw | it would be interesting to track individual backup sizes | 19:51 |
clarkb | And with that we've made it to the end of our agenda (actually this last topic was added by me at the last minute) | 19:51 |
fungi | yeah, i can also do it in the background tomorrow probably. i guess whoever gets to it first just let everyone else know ;) | 19:51 |
clarkb | ++ | 19:51 |
clarkb | #topic Open Discussion | 19:51 |
clarkb | Anything else? | 19:51 |
frickler | did you skip sb intentionally? not much to say probably anyway | 19:52 |
frickler | also I'll be away for 3 weeks starting thursday | 19:52 |
clarkb | frickler: oh you know what, I undid it and then forgot it | 19:52 |
clarkb | thats on me | 19:52 |
ianw | (backup prune running in a screen now) | 19:52 |
clarkb | frickler: fungi: re sb I did want to ask about recent discussion around that. Is that mostly around the logistics of reflecting changes in projects within our project-config repo? | 19:53 |
fungi | ianw: noted, thanks! | 19:53 |
frickler | clarkb: the latest was just about removing it for retired projects | 19:54 |
fungi | clarkb: sb's database schema supports marking projects as "inactive" in order to not present them as autocompletion options and stuff, so i've been trying to remember to set that when projects retire, but should do an audit to see if any have been missed | 19:54 |
clarkb | got it. I guess it is a good idea to mark both retired projects that way as well as those that move elsewhere | 19:54 |
fungi | also removing "use-storyboard" and "groups" entries from retired projects in our gerrit/projects.yaml file if there are any | 19:55 |
frickler | another short note: the wheel builds are still broken, I think ianw has some patches up for that? | 19:55 |
frickler | current issue seems to be afs on centos once again | 19:55 |
fungi | on a similar note, there was at least one dib change to fix openeuler image builds, do we need a new dib release for that before we turn them back on, or did it already happen? | 19:56 |
ianw | i do have something up that will release the builds separately. i'll try to take a look at the build failures | 19:56 |
frickler | fungi: another dib release will be needed iirc | 19:56 |
ianw | we would need a dib release and nodepool rebuild. i can look at that | 19:56 |
ianw | i don't think we have anything else in the queue for dib | 19:56 |
ianw | if centos is failing that often means our images are out of sync and have different kernel versions to the headers available from the mirror | 19:57 |
frickler | oh, there's also the pypi org thing | 19:59 |
frickler | do we want to register at least an opendev org there? | 19:59 |
clarkb | I'll admit I haven't read it yet. | 19:59 |
clarkb | The trusted publisher thing kinda made me grumpy | 19:59 |
fungi | related, i registered an "opendev.org" user on pypi ("opendev" was already taken) in case we eventually want to replace the openstackci account there | 20:00 |
clarkb | we are officially at time | 20:01 |
fungi | thanks clarkb! | 20:01 |
clarkb | I'll end the meeting here but feel free to continue discussion in #opendev or on the mailing list | 20:01 |
clarkb | #endmeeting | 20:01 |
opendevmeet | Meeting ended Tue Apr 25 20:01:19 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-25-19.01.html | 20:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-25-19.01.txt | 20:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-25-19.01.log.html | 20:01 |
clarkb | thank you everyone! | 20:01 |
frickler | o/ | 20:02 |