nick | message | time |
---|---|---|
clarkb | Just about meeting time | 18:59 |
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Nov 12 19:00:16 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/HQSCECQODT5XIHWR633MLLITCB3FG243/ Our Agenda | 19:00 |
clarkb | sorry for getting the agenda out late this week. I was out yesterday so sent it first thing today | 19:00 |
clarkb | #topic Announcements | 19:01 |
clarkb | I didn't have anything | 19:01 |
clarkb | I expect we'll have our regularly scheduled meeting next week and the week after. There is a slight possibility the one the week after may run into thanksgiving plans and get skipped but I have no plans that would do so at this point | 19:02 |
clarkb | #topic Zuul-launcher image builds | 19:02 |
clarkb | corvus: last week you indicated that we needed additional changes in zuul as well as needing testing for raw image upload/download timing | 19:02 |
clarkb | any updates on those items? | 19:02 |
corvus | nothing since last week | 19:03 |
corvus | slowly working on the upstream zuul changes; not started on the opendev-specific changes | 19:03 |
clarkb | thanks. So to recap on the opendev side we can add more image builds (in addition to bullseye) and test image builds using raw images instead of qcow2 to see what the timing of that looks like | 19:04 |
corvus | yep -- though note that one of the needed upstream changes is making the zuul-launcher download faster; so raw upload timings would be useful, but download timings are not optimized yet. | 19:04 |
clarkb | got it | 19:05 |
clarkb | anything else on this subject? | 19:05 |
corvus | not from me | 19:06 |
clarkb | #topic Backup Server Pruning | 19:06 |
clarkb | as mentioned last week ianw pushed up a change to make automated retirement and purging of backups from ansible possible. It requires us to explicitly list things to retire then purge but is nice for record keeping in comparison to what I did manually | 19:06 |
clarkb | the underlying process of removing backups ends up being very similar though | 19:07 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/933700 Backup deletions managed through Ansible | 19:07 |
clarkb | this is the primary change to do that. I'm happy with it but did write a followup to fix a minor issue we're already hitting after my manual cleanups | 19:07 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/934768 Handle backup verification for purged backups | 19:07 |
clarkb | maybe we can get those reviewed and landed then continue with cleanups using this system? I suspect I'll have to manually bring ethercalc02 into the same state on the other backup server but that isn't a big deal | 19:08 |
clarkb | if we're happy with that I'll update my documentation change as well to refer to this system | 19:09 |
clarkb | or just abandon it if it doesn't serve a purpose | 19:09 |
fungi | worth noting, it seems like the "backup inconsistency" warnings we keep receiving for ethercalc are sent weekly | 19:10 |
clarkb | fungi: yup that second change should address that | 19:10 |
fungi | so it wasn't just a one-time event | 19:10 |
fungi | cool | 19:10 |
clarkb | (I have to touch .retired in the ethercalc02 dir to make that work but then it should be handled) | 19:10 |
clarkb | #topic Upgrading old servers | 19:11 |
clarkb | I don't think there is anything new on this topic. But I'll leave it open for a minute or two for others to jump in with updates if we have them | 19:12 |
clarkb | #topic Docker compose plugin with podman service for servers | 19:14 |
clarkb | similar situation with this topic. I'm unaware of any updates but will leave it open for a couple minutes if there are any | 19:14 |
clarkb | #topic Enabling mailman3 bounce processing | 19:16 |
clarkb | there are updates on this topic. | 19:16 |
clarkb | As promised I configured service-discuss to enable bounce processing. I didn't change any of the defaults as they seemed like a reasonable place to start. | 19:16 |
clarkb | This list is very low traffic so nothing really happened until I sent the meeting agenda email earlier today. That resulted in two list members having non zero scores | 19:17 |
clarkb | This is a good indication that bounce processing is unlikely to take immediate drastic action but also a more active list like openstack-discuss might more quickly trend towards removing people | 19:17 |
clarkb | in any case I think I'm comfortable with proceeding with enabling this on more (all?) lists if others are | 19:18 |
frickler | +1 | 19:18 |
clarkb | fungi: you moderate many lists, any preference on approach here? Should we try to do it on a list by list basis via moderator action or something else? | 19:18 |
fungi | i think it's probably fine to roll out to more lists at this point | 19:20 |
clarkb | fungi: did you want to pick some you moderate and do that? I can apply it to the other opendev lists | 19:21 |
clarkb | corvus: I guess you might consider doing it for the zuul lists too | 19:21 |
fungi | sure, i can | 19:21 |
corvus | do we need to manually do it for all lists? is there a default for new lists? | 19:21 |
clarkb | corvus: we currently don't configure it when creating new lists, but in theory we can update list creation to enable it on them. But ya we don't really manage list settings post creation | 19:22 |
fungi | it's possible the default for new lists is already on, and this is migrated lists we're changing? i'd need to look | 19:22 |
clarkb | oh ya that could be too. | 19:22 |
corvus | ack... my view is that this should be enabled for all existing and new lists without further delay (but, also, without urgency). | 19:23 |
clarkb | but ya we should enable it by default on new lists as part of this process. I'm less sure if there is value in automating enablement in existing lists | 19:23 |
clarkb | corvus: ack thanks | 19:23 |
corvus | so whatever the best method to achieve that is... :) | 19:23 |
fungi | i'll take a look at the settings for the new list added by change 924432 in july | 19:24 |
fungi | but it'll take me a few minutes since i need to use the superuser on it | 19:24 |
fungi | "Process Bounces" is "no" there | 19:25 |
clarkb | ok so default is off on list creation but it should be possible to switch that to on somehow | 19:26 |
fungi | so i guess we have it off by default for new lists, i'll see if we have that set in config | 19:26 |
clarkb | and then we leave existing lists alone so we'll need to either manually toggle them or write some tool to go update each of them separately | 19:26 |
clarkb | corvus: more generally I think it should be safe for list moderators to toggle the setting in the lists they manage | 19:26 |
clarkb | and if we do automate it for everything we should noop on those | 19:26 |
corvus | ack; i'll manually flip the bit for zuul lists | 19:27 |
corvus | erm, what's the option name? :) | 19:28 |
clarkb | corvus: you go to settings then bounce processing then flip it to enabled | 19:29 |
corvus | https://imgur.com/FwNWrOI | 19:29 |
corvus | that one? | 19:29 |
clarkb | yes | 19:29 |
corvus | ok; that was already set for zuul lists | 19:29 |
clarkb | oh more data backing up that this should be fine | 19:30 |
clarkb | anyway we can move on, I think we've got rough agreement to go ahead and do this, we can update new list creation then sort out existing lists as we go | 19:30 |
clarkb | #topic Intermediate Insecure CI Registry Pruning | 19:30 |
clarkb | The intermediate registry / insecure ci registry seems to be more stable now after updating its installation | 19:31 |
clarkb | after discussing needed pruning last week we did some code review and found a bugfix as well as added a dry run option | 19:31 |
clarkb | the plan is still to do a proper pruning on Friday as announced but I'm curious if we want to do a dry run first say tomorrow? | 19:31 |
clarkb | or do we think it is safest to do the dry run in the announced window which is ~Friday | 19:32 |
corvus | i think dry-run now sounds good | 19:32 |
clarkb | ok I'll work on figuring that out probably first thing tomorrow. (I only took one day off yesterday but I returned to a fairly big backlog I'm digging through today) | 19:33 |
corvus | i mean, worst case if i completely botched it is that we accidentally delete some temporary registry things and there's a small blip in jobs which can be corrected by rechecking. | 19:33 |
frickler | this might also be affected by the rax identity issues, though? | 19:33 |
clarkb | frickler: oh yes good point | 19:33 |
clarkb | so waiting for that to resolve is a good idea too | 19:33 |
clarkb | hopefully by tomorrow morning that will be happy and I'll start a dry run in screen on the registry node | 19:34 |
clarkb | I am hopeful that the bugfix means this will just work (tm) | 19:34 |
corvus | yep, it's a plausible explanation and i think we can reset to "assume it works and debug what doesn't" | 19:35 |
clarkb | #topic Gerrit 3.10 Upgrade Planning | 19:35 |
clarkb | #link https://etherpad.opendev.org/p/gerrit-upgrade-3.10 Gerrit upgrade planning document | 19:35 |
clarkb | I would like to announce this upgrade soon if possible. Does the currently pencilled in time of Friday December 6 starting at 1600 UTC not work for some reason? | 19:36 |
fungi | wfm | 19:37 |
clarkb | ya I'm not hearing any concerns I'll work on sending that email out later today unless something comes up before then | 19:37 |
clarkb | Other than announcing the upgrade the other current todo is simply going through the etherpad and ensuring we're comfortable with the changes/updates and any mitigations we might have | 19:38 |
clarkb | so please look over the etherpad, add notes or concerns if you have them and I'll try to regularly review it and address them | 19:38 |
clarkb | otherwise this seems like we're on track for upgrading on the 6th | 19:39 |
clarkb | #topic RTD Build Trigger Requests | 19:40 |
clarkb | after more debugging ianw is of the opinion that we're getting hit by client fingerprinting in the CDN layer | 19:40 |
clarkb | one thought is that simply using a different client (like curl) may sidestep the issue | 19:41 |
clarkb | #link https://review.opendev.org/c/zuul/zuul-jobs/+/934243 switch to curl instead of ansible uri module | 19:41 |
ianw | yeah, they linked to a post about their anti-bot things | 19:41 |
ianw | https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/ | 19:42 |
clarkb | #link https://about.readthedocs.com/blog/2024/07/ai-crawlers-abuse/ | 19:42 |
clarkb | I'm actually not sure if curl is in our executor images | 19:42 |
clarkb | we should check that before approving I guess but otherwise that seems like a reasonable workaround to me | 19:43 |
fungi | it's present on the executors themselves at least (i checked) | 19:43 |
clarkb | this would run within the container I think | 19:43 |
ianw | i think that job might start a node but not use it? | 19:43 |
fungi | ah, i'm always confused as to whether ansible is running things from inside the zuul-executor container or the distro | 19:43 |
fungi | ianw: yeah, that came up separately, for some reason it has a default nodeset instead of an empty one | 19:44 |
clarkb | https://opendev.org/openstack/project-config/src/branch/master/playbooks/publish/trigger-rtd.yaml#L2 it runs on localhost in the job not sure if the job has a nodeset | 19:44 |
corvus | docker run --rm -it quay.io/zuul-ci/zuul-executor:latest bash | 19:44 |
corvus | root@8f99eae3145d:/# curl | 19:44 |
corvus | curl: try 'curl --help' or 'curl --manual' for more information | 19:44 |
clarkb | https://opendev.org/openstack/project-config/src/branch/master/zuul.d/jobs.yaml#L1108-L1127 | 19:44 |
clarkb | it does have a nodeset I think | 19:44 |
clarkb | corvus: ok cool so it should work then we can also use an empty nodeset in the job | 19:45 |
clarkb | #topic promote-openstack-manuals-developer fails with Ansible errors | 19:46 |
clarkb | This is a different type of job error that occurs due to variables being undefined | 19:46 |
clarkb | #link https://zuul.opendev.org/t/openstack/build/1a84db5d173c4777b9d730923721b04a Example failure | 19:46 |
clarkb | part of the problem here is that this promote job works for a special set of developer docs that aren't part of the main docs.openstack.org site so have special everything and in this case something isn't quite right | 19:47 |
clarkb | I suspect that fixing this is going to require someone to page in all of the things that make this different and then apply that perspective to the jobs and add the missing bits? | 19:47 |
frickler | note that the last known successful run of that job was > 3y ago. and I assume the regression may have been triggered by a change in zuul almost that old | 19:48 |
frickler | https://opendev.org/zuul/zuul/commit/be50a6ca42c41c0608dd02930a01123afd4e6064 | 19:48 |
clarkb | ya at this point I doubt that we're trying to dig down deltas in what changed to break it and instead just need to understand it properly and determine why it is broken and roll forward | 19:48 |
clarkb | personally I've always argued that developer.openstack.org should've been docs.openstack.org/developer and use the same systems as exist for docs | 19:49 |
clarkb | this is I think evidence for why this is a good idea but also that ship sailed a long time ago and the best thing is to simply figure out what needs to be changed to make it work | 19:49 |
clarkb | is anyone interested in debugging this further? I know fungi took a stab at it. | 19:51 |
clarkb | also I'm not sure there is anything opendev or zuul specific about it other than that is where the failure is originating | 19:51 |
clarkb | it should be solvable by anyone reading the jobs and error message? | 19:51 |
fungi | i've unfortunately already paged out 99% of the context there. pulled in too many directions | 19:51 |
frickler | the error comes from some data that iiuc is coming from a secret | 19:52 |
fungi | the only reason i even started looking at it was because i noticed the site mentioned and linked to trystack.org which hasn't existed for years, and i wanted to get rid of that dead link | 19:52 |
frickler | so not easy to debug for an outsider IMO | 19:52 |
clarkb | frickler: outsiders have the same info that we do when it comes to secrets in zuul though? | 19:52 |
clarkb | like I don't generally go off and decrypt things | 19:52 |
clarkb | (in fact I'm not sure I ever have) | 19:53 |
frickler | it may be needed in this case, though | 19:53 |
frickler | but anyway, I can look further into this, but with low priority | 19:53 |
clarkb | yes it is possible this would be that case | 19:53 |
clarkb | ok thanks | 19:53 |
clarkb | #topic Open Discussion | 19:53 |
clarkb | Anything else? | 19:53 |
frickler | someone mentioned connectivity issues to vexxhost IPv6 earlier today | 19:54 |
fungi | the openinfra foundation gained control of the openinfra.org domain and is going to start working to relocate their various sites out of the google-controlled .dev tld | 19:54 |
frickler | I didn't look closer yet, but seems to be a recurring issue of what we had earlier | 19:55 |
fungi | i'm putting together a plan to migrate lists.openinfra.dev to lists.openinfra.org | 19:55 |
clarkb | and jayf mentioned connectivity issues that I think were actually slowness (connections are logged in sshd log but things took longer than expected) | 19:55 |
frickler | I did confirm the "no route to host" from AS3320 (Deutsche Telekom, German incumbent ISP) | 19:55 |
fungi | i'm shooting for migrating that mailman site the first week of december, so probably sending an announcement to the foundation mailing list about it on monday. i'll circulate an etherpad with the planned steps in the coming days | 19:56 |
clarkb | fungi: thanks for the heads up | 19:56 |
frickler | fungi: can we keep a redirect from the old site? | 19:56 |
fungi | the change on the mailman side is pretty simple because it's all in mariadb, so just some update queries (ideally with the services temporarily offline) | 19:56 |
fungi | frickler: yeah, i plan to keep the old urls and addresses working indefinitely | 19:57 |
frickler | or do they want to drop the .dev domain? | 19:57 |
fungi | they just want .org to be the official domain, but will retain control of the .dev one | 19:57 |
frickler | cool +1 | 19:57 |
fungi | there are no plans to drop it, so we can keep redirects and address aliases for ~ever | 19:57 |
fungi | i'll take care of the changes to add the redirects, aliases, config update, et cetera | 19:58 |
corvus | #link https://review.opendev.org/c/openstack/project-config/+/934832 Fix openstack developer docs promote job [NEW] | 19:58 |
corvus | i did that with no special knowledge | 19:58 |
corvus | i just read the error message and looked at the secret def | 19:59 |
corvus | hope it works | 19:59 |
clarkb | and we are at time | 20:00 |
clarkb | thank you everyone | 20:00 |
clarkb | We'll be back next week same time and location | 20:00 |
clarkb | #endmeeting | 20:00 |
opendevmeet | Meeting ended Tue Nov 12 20:00:18 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2024/infra.2024-11-12-19.00.html | 20:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-11-12-19.00.txt | 20:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2024/infra.2024-11-12-19.00.log.html | 20:00 |
frickler | oh, I was somehow assuming that all data in the secret would be encrypted and didn't check deeper yet, nice | 20:01 |