Nick | Message | Time (UTC) |
---|---|---|
fungi | ahoy! | 19:00 |
clarkb | Hello | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Apr 18 19:01:06 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/246L5WVFVKR4XU6PIQRILQ6Z4PPG6NDZ/ Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I didn't have any announcements | 19:02 |
clarkb | #topic Topics | 19:02 |
clarkb | We can jump right in | 19:02 |
clarkb | #topic Migrating container images to quay.io | 19:02 |
clarkb | Last week the promotion of container images through the intermediate registry ran successfully against zuul-client (this is the image that we've been using to test changes to jobs/playbooks/roles) | 19:03 |
corvus | i plan on copying over zuul images and updating zuul repos this week | 19:04 |
clarkb | I suspect that we are really close to taking action on this in OpenDev. In particular I expect the next sets of tasks to roughly be: copying existing image data from docker hub to quay, updating our jobs and possibly rebuilding images to test things, figuring out how to auto provision public repos in quay for new images | 19:04 |
clarkb | corvus: were you planning to do the copy of images for opendev as well? | 19:04 |
corvus | was just planning on zuul | 19:05 |
clarkb | corvus: and does the script handle deltas if we were to copy things today then rerun it again quickly before images moved? | 19:05 |
corvus | i think so | 19:05 |
corvus | with a quick change to omit :latest, could also run it again after the move, but probably no point to doing that | 19:06 |
clarkb | ok cool I can look at running it for opendev this week as an initial step with a plan to sync up any deltas as we actually move images on the zuul job side | 19:06 |
ianw | i know we figured out from old blog posts the bits to pre-make a public image on quay, but did we codify that in zuul-jobs yet? | 19:07 |
corvus | i think i shared the latest version of the script (which handles org renames and multi-arch) earlier in #opendev; i don't have the link handy right now though | 19:07 |
clarkb | ianw: I pushed a role for it https://review.opendev.org/c/zuul/zuul-jobs/+/877834 but I'm not sure where we would inject that in the current job setup | 19:07 |
clarkb | ianw: maybe it would be a separate job that runs before the promote job | 19:08 |
clarkb | to decouple things cleanly? | 19:08 |
ianw | ahh right yes i remember that now :) | 19:08 |
corvus | or could be a pre-run playbook for an inherited job | 19:08 |
clarkb | corvus: oh ya that should work too due to the nesting order | 19:08 |
ianw | it could be. since this requires an API key, that was my thinking that you'd already have to have an api key to use the tag-based promotion path anyway | 19:09 |
clarkb | ianw: ya though the creation api token needs very little in terms of permissions so doing it through the intermediate registry with a very limited key may still make sense | 19:09 |
clarkb | I can try to take a look at where to add this later this week too. I basically need to get through the etherpad (and possibly gitea?) stuff then I have a lot more time | 19:10 |
corvus | incidentally, i haven't heard anything more from the quay people about the zuul org. that's a little disappointing. :/ | 19:10 |
ianw | ++ i'm happy to help out too. agree we can sort out details later | 19:10 |
clarkb | sounds good. | 19:10 |
corvus | i don't think we need the api creation role for the zuul projects; we don't make new container images very often | 19:11 |
clarkb | Also as a side note it doesn't look like docker hub accidentally did the april 14 doom change (we didn't expect them to but images are updating on docker hub since) | 19:11 |
corvus | i mean, once it shows up and is settled, i'm not opposed to having it there; just that it's not in the critical path for now. | 19:12 |
clarkb | ya we could manually create them in opendev too, but I think we end up adding/removing images often enough that would be annoying | 19:12 |
ianw | no they definitely backtracked on that one | 19:12 |
ianw | docker i mean | 19:12 |
corvus | yep, every python version...etc.. | 19:13 |
clarkb | ianw: ya I know. They announced it too. I just wanted to make sure that reality panned out that way and it appears to have done so | 19:13 |
corvus | trust but verify. also, maybe don't actually trust. | 19:13 |
clarkb | alright anything else on this? Hopefully we've got some exciting updates next week | 19:14 |
corvus | heh i'm hoping for boring updates :) | 19:14 |
ianw | nope -- irrespective of changes upstream, i think we've got something nicer giving us options to point at multiple places | 19:15 |
clarkb | exciting because its done (at least for zuul) not due to any fireworks :) | 19:15 |
clarkb | #topic Bastion Host Updates | 19:15 |
corvus | ++ | 19:15 |
clarkb | The only thing I'm aware of here is the multiway encrypted backups stack needing reviews still | 19:15 |
clarkb | Launch node appears to be managing reverse dns in rax now and the openstack command in the venv we install can talk to rax and dns helper output all appear to work now when launching nodes | 19:16 |
ianw | yeah i used it to launch some dns nodes and it finally worked to give me all output :) | 19:17 |
clarkb | any other bridge related items? | 19:17 |
ianw | i even thought, wow, this is close to being something that could be a zuul job ... :) | 19:17 |
fungi | that would certainly be a cool future | 19:18 |
clarkb | #topic Mailman 3 | 19:19 |
clarkb | We'll keep moving along. I noticed some activity on the changes related to mailman 3 vhosting this morning but suspect it is still too early to have much to report? | 19:19 |
fungi | i've got a fresh held lists node from today (104.130.219.137) which includes the changes in 867986 and 867987, and am starting to try out the recommended commands on it this week for django site creation and association in postorius | 19:19 |
fungi | #link https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/message/I5MLJAESRXQARS3MZHF75YQCBY2OUL6G/ Re: Multi-domain oddities in Hyperkitty and Postorius | 19:19 |
fungi | but yeah, no actual progress to report | 19:20 |
clarkb | hopefully we'll have good news next week | 19:20 |
clarkb | #topic Gerrit Updates | 19:20 |
clarkb | There has been some movement on ACL synchronization to better align our project-config acl files with what is in Gerrit now post 3.7 upgrade migration | 19:21 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/880115 Update project-config acls to match post migration acls in Gerrit | 19:21 |
clarkb | ianw: ^ the only reason I haven't +2'd that is you indicated we could land a couple of changes together to correct some of the concerns (post-review in particular) | 19:21 |
clarkb | I haven't seen that followup change yet, but I think what you've got in that first one is fine as long as we do have a followup | 19:22 |
clarkb | #link https://review.opendev.org/c/openstack/project-config/+/879906 Gerrit also normalized indentation of config files we should consider this to be in sync | 19:22 |
ianw | ohh yes sorry, i had that up in emacs and got distracted on dns yesterday. will do | 19:22 |
clarkb | This second change is going to modify every single acl though and should be coordinated with a manual run of manage-projects I think | 19:22 |
clarkb | (that first one is small enough it should be fine to land) | 19:22 |
clarkb | But the idea here is that since Gerrit seems to insist on hardtabbing in the config files we should do that same thing to reduce deltas making it easier to read diffs and understand changes when upgrades happen | 19:23 |
clarkb | I think this is less urgent, but something we should eventually get to. | 19:23 |
ianw | yeah, i mean ideally we don't have changes on upgrade that get us out of sync, but if we do, it's easier to look at without also reformatting the whole thing | 19:24 |
fungi | #link https://review.opendev.org/879906 Indent Gerrit ACL options | 19:25 |
fungi | oh that was linked already | 19:25 |
clarkb | ya I think we should get ianw's first update in and then look at the tabs situation | 19:25 |
clarkb | since tabs are less necessary and more painful to get applied cleanly | 19:25 |
fungi | it's still wip because i'm unconvinced we should enforce it, given people already struggle with the current acl normalization checks | 19:25 |
clarkb | and it gives us time to decide if we think it should be enforced. I'm really sad we don't think people can properly add tabs to files :( | 19:26 |
clarkb | I also started on trying to address the leaked replication files on disk | 19:26 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/880672 Dealing with leaked replication tasks on disk | 19:26 |
clarkb | This is tested now (though somewhat artificially) | 19:27 |
corvus | would it maybe help to add comments to files telling people to use tabs? | 19:27 |
clarkb | corvus: I think fungi's concern is that editors don't always treat the tab button as a hard tab | 19:27 |
corvus | they're not hard to deal with, but it's sometimes unclear whether one should... | 19:27 |
clarkb | which will lead to confusing people when their changes fail | 19:27 |
fungi | oh, there's an idea, insert a boilerplate comment block summarizing the normalization enforcement rules | 19:27 |
clarkb | it certainly can't hurt to have that info available when making edits to those files | 19:28 |
corvus | could also maybe look at editor config lines... | 19:28 |
fungi | though i don't know if gerrit strips comments, it would at least be easier to ignore that block in the diffs | 19:28 |
ianw | it does also give you a diff to correct your mistake | 19:28 |
ianw | (it == the normalization path) | 19:29 |
corvus | oh yeah i was assuming comments work... i don't know. | 19:29 |
fungi | yeah, mainly hoping to avoid more review round-trips | 19:30 |
clarkb | if any of the current files have comments in them in project-config we could check pretty easily | 19:30 |
clarkb | looks like they don't :( | 19:30 |
fungi | none do, no | 19:31 |
clarkb | anyway I think that is the less urgent of the two sides of the acl normalization cleanups so we can tackle that once we're happy with the functional side | 19:31 |
fungi | tangentially, came up in the gerrit matrix/discord today that 3.8 has a new plugin system for code linking so our overrides to swap gitiles out with gitea may need revisiting during the next upgrade | 19:31 |
clarkb | for the replication tasks cleanup workaround I'm fairly certain it is ok for those files to be deleted while gerrit is shutdown because we weren't bind mounting the data dir previously. My change adds a script to the gerrit images that inspects the json content and deletes those that we know leak leaving tasks we want to replicate behind | 19:32 |
clarkb | it is possible that it may also leave other classes of task that can't be replicated behind but there are 11k ish files currently and skimming them I've found these three types so far | 19:32 |
clarkb | it will be easier to see any potential fourth type after we deal with these first | 19:32 |
clarkb | Feedback on whether or not this seems like a good approach would be good in addition to reviews for whether or not it does what it says on the tin | 19:33 |
fungi | what is upstream's position on the bug, or has anyone weighed in yet? | 19:34 |
clarkb | I haven't filed a bug on this one yet but probably should | 19:34 |
clarkb | now that I understand it a bit better. | 19:34 |
clarkb | Another user said that changing group perms for the replication targets didn't change it for them though which was my hunch (we replicate as if we are the anonymous user) | 19:34 |
ianw | it tried it and failed, right? it does seem like it should unlink the file after that | 19:35 |
clarkb | ianw: yes however it also does retries and I suspect that is the problem here | 19:35 |
clarkb | the plugin probably can't tell the difference between failure due to gerrit acls (or some other internal mechanism) saying no and a network failure or temporary rejection from the remote | 19:36 |
ianw | ahh, yeah that sounds very likely | 19:36 |
fungi | i would totally buy that explanation | 19:36 |
clarkb | if I find time I can dig into the plugin implementation itself | 19:36 |
clarkb | in the meantime I suspect that what I've proposed is a safe way to manage the blast radius of leaking files to disk over time and avoiding errors in gerrit's error log at startup | 19:37 |
ianw | but also fixable, luckily clarkb is our honorary on staff Java developer | 19:37 |
clarkb | heh | 19:37 |
clarkb | I'm happy for reviewers to say "this should be fixed upstream we don't want this hacky workaround" too | 19:37 |
ianw | it's only a git revert away from removal though | 19:37 |
clarkb | and the last gerrit related item I had was a reminder we should clean up the 3.6 image at some point. Add a 3.8 image and update our upgrade job | 19:37 |
ianw | that might be a fun one to test quay creation | 19:38 |
clarkb | I don't think we'll revert at this point. I'm happy for us to remove the 3.6 image now | 19:38 |
clarkb | oh ya that could be a good one for adding 3.8 maybe | 19:38 |
clarkb | I can followup on this as I dig into the quay stuff more later this week to see if it makes sense in that process somewhere | 19:39 |
clarkb | #topic Upgrading Servers | 19:39 |
clarkb | static.opendev.org and the ~40 something other names it hosts are now on a jammy static02 host. static01 is removed and out of dns too | 19:39 |
clarkb | I've got a new etherpad02 server up and running and tested a data migration from etherpad01 to etherpad02. It takes about 30 minutes to dump the db and 30 minutes to restore it plus time to copy the data between hosts and double check you aren't doing something silly. I notified service-announce that there would be a 90 minute outage of etherpad tomorrow at 22:00 UTC to do the actual move | 19:40 |
clarkb | #link https://paste.opendev.org/show/brRuhPssVLSi4UnF5hcN/ The etherpad move plan. | 19:40 |
clarkb | This is the plan I wrote down based on my local notes of testing the process. I put it in paste and not etherpad because etherpad will be shutdown during this process to avoid data in the wrong location | 19:41 |
fungi | that's some serious foresight | 19:41 |
clarkb | Please review that if you have time before tomorrow at 22:00 UTC. It is relatively straightforward but extra eyeballs making sure I haven't done something silly are appreciated | 19:41 |
clarkb | ianw: has also made progress on replacing nameservers | 19:42 |
clarkb | #link https://etherpad.opendev.org/p/2023-opendev-dns | 19:42 |
ianw | clarkb: plan lgtm. you could also use "zcat dump.gz | ..." :) | 19:42 |
fungi | yeah, i'm good with what you have there | 19:42 |
clarkb | This etherpad has links to changes. I've reviewed most of those changes and left questions on a couple of them. One of which also failed testing for a valid reason (I left a note indicating what i think is the fix) | 19:43 |
clarkb | thanks! | 19:43 |
ianw | yep thanks for going through that. and good catch on updating the other zones too | 19:43 |
clarkb | ianw: I also left notes on the etherpad about a few things I noticed were missing. We have 3 other zone files to update and reverse dns for the vexxhost nameserver would ideally be set | 19:43 |
ianw | i'm actually thinking maybe we template in the nameservers, but i'll think about that | 19:43 |
clarkb | ianw: I approved the change to update le testing to jammy as well not sure if that merged or not | 19:43 |
fungi | i'll probably be mia between 18:30 and 20:30 or so, but will definitely be back by 22:00 for the maintenance | 19:43 |
clarkb | Next week I'll probably start looking at jitsi meet or mirror nodes. Say something if you want to help and have a preference for what is left to do | 19:44 |
clarkb | #topic AFS volume utilization | 19:45 |
clarkb | we have crept up to 92.2% from 91.7% since last week | 19:45 |
clarkb | if that growth rate holds we'll have about 15 weeks before there is a problem. Which is about 3 ish months? | 19:46 |
ianw | i still haven't got back to wheel clearouts or f36 (sigh, now f38 is out anyway) | 19:46 |
clarkb | ack I think we have time. But we should probably look into those tasks sooner than later to see where we end up disk wise and make our next decisions from there | 19:47 |
clarkb | #topic Gitea 1.19 | 19:48 |
clarkb | I've got a change up to upgrade to gitea 1.19.1 | 19:48 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/877541 Upgrade opendev.org to 1.19.1 | 19:48 |
clarkb | however, ianw rightly pointed out that the changes to api interaction are weird and likely unnecessary | 19:48 |
ianw | yeah i just noted a comment on that about removing the auth for more endpoints | 19:49 |
clarkb | digging into this ianw filed two bugs against gitea around previously anonymous apis requiring auth now and improper headers for responses indicating auth is required | 19:49 |
ianw | #link https://github.com/go-gitea/gitea/issues/24159#issuecomment-1513620323 | 19:49 |
ianw | oh, and also a pull request now | 19:49 |
clarkb | ianw: yes it looks like the change to add scoped tokens to their api system made far reaching changes to the api that are problematic | 19:49 |
clarkb | thank you for looking into that I was just focused on making it work I didn't even consider they might have problems (in particular we were passing auth creds to the request so it wasn't clear to me that we went from public to private) | 19:50 |
clarkb | Anyway I think we can wait for 1.19.2 to fix this to avoid any potential breakage for users anonymously talking to our gitea api | 19:51 |
clarkb | (we could scan the request logs for evidence of this if we were in a hurry) | 19:51 |
clarkb | but 1.19.2 should be out soon enough I hope and we don't currently have an urgent need to upgrade. | 19:51 |
clarkb | Reviews on that change would be helpful though as I expect minimal deltas between now and 1.19.2 when available (just cleanup our api requests to reflect they don't need auth anymore) | 19:52 |
ianw | yeah i agree that's unlikely -- nobody complained yet. if we want to just go with it that's fine, but i think we should revert the user/pass/auth force when we can so we note any further regressions when it's fixed | 19:52 |
clarkb | #topic Storyboard | 19:53 |
clarkb | have we seen any more requests to mark things RO? | 19:53 |
clarkb | Mostly curious if the moves by some projects have been showing up on our radar yet | 19:54 |
clarkb | I'll take that as a no :) | 19:55 |
clarkb | #topic Open Discussion | 19:55 |
clarkb | Anything else? | 19:55 |
fungi | i have a handful of project moves off storyboard i need to clean up behind | 19:55 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/880570?usp=dashboard | 19:55 |
fungi | see recent open/merged changes for gerrit/projects.yaml for a list of relevant exodus | 19:56 |
ianw | that and a follow-on are quick ones to update links on the main page | 19:56 |
ianw | also i agree that usp= thing is very annoying | 19:56 |
clarkb | I noticed google uses usp in google docs I think it was yesterday | 19:57 |
clarkb | so ya its basically something they appear to have added to gerrit for their purposes but without any open source usage of it | 19:57 |
clarkb | they did at least confirm that it is unused in gerrit 3.7. You would need to write a plugin or something like that to consume the info | 19:59 |
corvus | they interested in disabling it, or are they like "just ignore it" | 19:59 |
clarkb | corvus: they say it has no effect in open source gerrit and you should ignore it | 19:59 |
clarkb | to tl;dr | 19:59 |
ianw | except everyone knows how i copied the review link | 20:00 |
ianw | which, i admit, i don't really care about, but, why do you need to know | 20:00 |
clarkb | right it betrays info of the context where you copied links (email, dashboards, related changes, etc) | 20:00 |
clarkb | we are at time. Thank you everyone! we'll be back next week same time and location | 20:01 |
clarkb | feel free to pick up or continue conversation in #opendev or the mailing list if we want to continue to discuss any of these items | 20:01 |
clarkb | #endmeeting | 20:01 |
opendevmeet | Meeting ended Tue Apr 18 20:01:38 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-18-19.01.html | 20:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-18-19.01.txt | 20:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-04-18-19.01.log.html | 20:01 |
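
The AFS utilization estimate from the 19:45 to 19:46 exchange (92.2% this week, up from 91.7% last week, "about 15 weeks") can be reproduced with a simple linear extrapolation. This is only a sketch: the function name is hypothetical, the 100% limit is an assumption about where "a problem" starts, and it assumes the one-week growth rate holds steady, as clarkb's estimate does.

```python
def weeks_until_full(current_pct: float, previous_pct: float,
                     limit_pct: float = 100.0) -> float:
    """Linearly extrapolate how many weeks remain until utilization hits limit_pct.

    Assumes the two readings are one week apart and that growth stays constant.
    limit_pct defaults to 100.0, which is an assumption, not a figure from the log.
    """
    weekly_growth = current_pct - previous_pct
    if weekly_growth <= 0:
        raise ValueError("utilization is not growing; no projection possible")
    return (limit_pct - current_pct) / weekly_growth


# Readings quoted in the meeting: 91.7% last week, 92.2% this week.
weeks = weeks_until_full(92.2, 91.7)
print(f"{weeks:.1f} weeks remaining at the current growth rate")
```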