*** clarkb is now known as Guest298 | 01:19 | |
*** Guest298 is now known as clarkb | 01:20 | |
clarkb | meeting time | 19:00 |
---|---|---|
clarkb | I'm actually going to try and end a little early today as I have an appointment to get to after our meeting | 19:00 |
fungi | sgtm | 19:01 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Nov 29 19:01:04 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-November/000383.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
ianw | o/ | 19:01 |
clarkb | There is a board meeting next week December 6 at 2100 UTC and weblate for openstack translations will be discussed if anyone is interested in attending | 19:02 |
clarkb | The summit cfp is also open. I noticed that SCaLE's cfp closes friday as well | 19:03 |
clarkb | and FOSDEM is doing things | 19:03 |
clarkb | anything else to announce? | 19:04 |
clarkb | #topic Bastion Host Updates | 19:05 |
clarkb | with the holiday weekend I've sort of lost where we are at on this | 19:06 |
clarkb | I think there were changes to manage rax dns with a small script | 19:06 |
clarkb | and openstacksdk updated to fix the networking issue when booting rax nodes | 19:07 |
ianw | yep there's a stack for updating the node launcher @ | 19:07 |
ianw | #link https://review.opendev.org/q/topic:rax-rdns | 19:07 |
fungi | #link https://review.opendev.org/865320 Improve launch-node deps and fix script bugs | 19:07 |
fungi | also that | 19:07 |
ianw | that has a tool for updating RAX rdns automatically when launching nodes, and also updates our dns outputs etc. | 19:08 |
ianw | there's also | 19:08 |
ianw | #link https://review.opendev.org/q/topic:bridge-osc | 19:08 |
ianw | which updates the "openstack" command on the bastion host that doesn't currently work | 19:09 |
ianw | and then | 19:09 |
ianw | #link https://review.opendev.org/q/topic:bridge-ansible-update | 19:09 |
fungi | though in theory we could just deeplink from /usr/local/(s)bin to the osc in the launch env and not manage two separate copies | 19:09 |
ianw | is a separate stack that updates ansible on the bridge | 19:09 |
ianw | fungi: that's what https://review.opendev.org/c/opendev/system-config/+/865606 does :) | 19:10 |
fungi | oh, perfect ;) | 19:10 |
clarkb | cool. I'll do my best to pull these up after my appointment and review as many as I can | 19:11 |
clarkb | anything else with the bastion? | 19:11 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/865784 | 19:12 |
ianw | is a tool for backing up bits of the host | 19:12 |
ianw | that's about it. i think it's pretty close to being as much of a "normal" host as i think it can be | 19:13 |
ianw | there's still | 19:13 |
ianw | #link https://review.opendev.org/q/topic:prod-bastion-group | 19:13 |
clarkb | and that last link will do the parallel runs right? | 19:13 |
ianw | to update jobs so they can run in parallel | 19:14 |
ianw | yep | 19:14 |
clarkb | I'm thinking we should stabilize as much as we can prior to that as that alone demands a fair bit of attention | 19:14 |
ianw | agree on that, not high priority | 19:14 |
clarkb | #topic Upgrading Old Servers | 19:14 |
clarkb | #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes | 19:14 |
clarkb | I don't have anything new to add to this :( I keep finding distractions | 19:15 |
clarkb | #topic Mailman 3 | 19:15 |
clarkb | #link https://etherpad.opendev.org/p/mm3migration Server and list migration notes | 19:15 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/865361 | 19:15 |
clarkb | I believe fungi is just about ready to deploy the new server | 19:15 |
clarkb | then it's a matter of announcing and performing the swap over | 19:15 |
fungi | well, the server exists, but the inventory change is in the gate now | 19:16 |
fungi | dns is already there | 19:16 |
clarkb | it just merged :) | 19:16 |
fungi | perfect | 19:16 |
fungi | and i've added reverse dns and am making sure it's clear of any blocklists | 19:17 |
fungi | so assuming the deploy from the inventory addition checks out, we should be able to announce a maintenance window now | 19:17 |
clarkb | is there anything you need from us at this point? | 19:18 |
fungi | i'll float a draft etherpad in #opendev later today or early tomorrow | 19:18 |
clarkb | sounds good | 19:18 |
fungi | when would folks want to do the maintenance? monday december 5? | 19:18 |
clarkb | that day should work for me | 19:18 |
ianw | ++ | 19:19 |
fungi | as for things to cover in the announcement, people wanting to manage list moderation queues and configs or adjust their subscriptions will need to create accounts (but if they use the same address from the configs we've imported then those roles will be linked as soon as they follow the link from the resulting confirmation e-mail) | 19:20 |
fungi | also we're unable to migrate any held messages from moderation queues, so list moderators will want to do a quick pass over theirs shortly before the maintenance window | 19:20 |
fungi | other than that, and the brief outage and what the new ip addresses are, is there anything else i should cover in the announcement? | 19:21 |
frickler | can we first stop incoming mails to avoid race conditions? | 19:21 |
corvus | i can do that for zuul on sunday/monday | 19:21 |
clarkb | frickler: yes the etherpad link above covers the whole process | 19:21 |
clarkb | and I'm pretty sure stopping incoming mail is part of it? | 19:22 |
frickler | but before the final moderation run? | 19:22 |
clarkb | that should force it to queue up and then get through to the new server when dns is updated | 19:22 |
clarkb | frickler: oh specifically for moderation. For that I'm not sure | 19:22 |
frickler | maybe not so relevant for these low volume lists, but possibly for openstack-discuss | 19:22 |
fungi | the plan is to stop incoming mails by switching the hostnames for the sites to nonexistent addresses temporarily | 19:23 |
fungi | because we can't easily turn off some sites and not others at the mta layer | 19:23 |
fungi | but also to keep that as brief as possible, just long enough for dns to propagate and the imports to run | 19:24 |
fungi | and then we'll switch to the proper addresses in dns so messages deferred on the senders' mtas will end up at the correct server | 19:24 |
fungi | for openstack-discuss, i'll personally be moderating it anyway and can literally check the moderation queue via an /etc/hosts override after dns is swizzled before the import | 19:25 |
frickler | ok, guess I should've read the etherpad first ;) | 19:25 |
fungi | but that won't be migrated in this upcoming window, it'll be sometime early next year | 19:26 |
clarkb | #topic Quo Vadis Storyboard | 19:27 |
clarkb | I sent the followup to the thread that I promised (a little late sorry) | 19:27 |
clarkb | tried to summarize the position the opendev team is in and some ideas for improving the status quo | 19:28 |
clarkb | I'd love to hear feedback (ideally on the thread so that others can easily follow along) | 19:28 |
clarkb | frickler did mention that if we're going to invest further in the gitea deployment (and maybe if we don't anyway) we'll need to sort out the gitea fork situation | 19:28 |
clarkb | I followed that whole thing as it was happening a few weeks ago and didn't feel like we needed to take any drastic action at the time | 19:29 |
clarkb | But i agree it is worth being informed and up to date on that situation to ensure we're putting effort in the correct location | 19:29 |
fungi | i guess people are still pushing to fork and it hasn't settled out yet? | 19:29 |
clarkb | yes I think a fork is in progress, but I'm not sure it is deployable yet | 19:29 |
fungi | does it have a name? | 19:30 |
frickler | yes, my main concern when I saw this was we might build something for gitea and then it turns out we would want/need the fork instead | 19:30 |
frickler | forgejo | 19:30 |
frickler | https://codeberg.org/forgejo/forgejo | 19:30 |
frickler | not sure though whether that is _the_ fork or just one of multiple ones | 19:30 |
fungi | they had to one-up gitea on unpronounceable names | 19:30 |
frickler | it's esperanto for "forge" they say | 19:31 |
fungi | i saw | 19:32 |
clarkb | I don't think we need drastic action today either fwiw | 19:32 |
clarkb | but it would be good to evaluate both if we dig deeper into using gitea's extra functionality | 19:32 |
fungi | some of the folks involved in that have a history of getting very excited about things and then dropping them months later, so i'm not holding my breath yet anyway | 19:33 |
clarkb | #topic Vexxhost instance rescuing | 19:34 |
clarkb | This is mostly on the agenda at this point to try and remind me to find time when jrosser's day overlaps with mine to gather info on the bfv setup they use for rescuing | 19:34 |
clarkb | then I can try and set that up in vexxhost and test it. If I can make that work then we can develop a rescue image with all the right settings maybe | 19:35 |
clarkb | but no new updates on this since the last time we met | 19:35 |
clarkb | (holidays will do that) | 19:35 |
clarkb | #topic Gerrit 3.6 Upgrade | 19:35 |
clarkb | #link https://etherpad.opendev.org/p/gerrit-upgrade-3.6 | 19:35 |
clarkb | ianw: I've skimmed this doc a couple times at this point. The main new bit of feedback I've got (which is on the etherpad too) is that the latest 3.5 release which we upgraded to recently was made partially to fix a bunch of issues with copy approvals | 19:35 |
clarkb | I think it might be worth catching up on the state of copy approvals upstream (just to be sure there aren't any more bug fixes outstanding) then give it a go on our installation? | 19:36 |
clarkb | that way if our install finds new issues we have time to work upstream to address them | 19:36 |
ianw | yeah i agree -- aiui that can be run online just fine right? | 19:36 |
ianw | it looks likely to be a multiple-hour thing | 19:37 |
clarkb | ianw: yes, it is supposed to be able to run online in the background. Digging through logs for that might be the hard part to double check it was happy | 19:37 |
clarkb | the other piece which I don't think we've done is evaluate any other potentially breaking or user visible updates | 19:37 |
clarkb | just to try and identify any major things people might complain about | 19:38 |
ianw | yeah, i need to spin up the node | 19:38 |
ianw | then we can poke | 19:38 |
clarkb | 3.7 is looking to be far more problematic too so I'm glad we aren't trying to make that jump yet | 19:38 |
clarkb | it has currently broken recheck comments for example | 19:39 |
ianw | if you like, i can try running the copy-approvals when it slows down in a few hours, and monitor it | 19:39 |
clarkb | ianw: I think the first thing is to look at the changelog and open changes for 3.5 to see if we are missing any copy approvals updates | 19:39 |
clarkb | update our image if necessary but then ya running it | 19:39 |
corvus | clarkb: upstream bug suggests there should be an easy fix for that in zuul; i'll be investigating it soon | 19:40 |
ianw | ++ i'll check that out this morning and we can sync on it | 19:40 |
clarkb | corvus: it is a bit weird to me that they want zuul to address it? it's the comment added event which has for a decade now included the content of the comment | 19:40 |
clarkb | corvus: I mean, it's great if we can work around it but I feel like that breaks a pretty base level expectation | 19:40 |
clarkb | ianw: sounds good. Anything else gerrit 3.6 upgrade related? | 19:41 |
corvus | assuming the upstream bug is accurate, they've had this backwards compat in place for years at our request, so i'm not ready to ding them on that. | 19:41 |
ianw | clarkb: nope, let's get that groundwork done then we can decide on update schedules | 19:42 |
clarkb | sounds good | 19:42 |
corvus | anyway, give me a chance to actually look into it so i can speak from knowledge :) | 19:42 |
clarkb | corvus: ++ | 19:42 |
clarkb | #topic Acme.sh failures | 19:43 |
clarkb | acme.sh switched to ecc certs/keys by default (from rsa) recently and broke our ability to renew things | 19:43 |
clarkb | ianw fungi and I all poked at it a bit and I think ianw was able to track it down to that specific change and wrote an upstream issue about it | 19:43 |
clarkb | #link https://github.com/acmesh-official/acme.sh/issues/4416 | 19:43 |
clarkb | Basically file paths changed and now acme.sh can't find data it is looking for later | 19:44 |
clarkb | to address this we've pinned to the previous release: 3.0.5 and the change to do that landed recently | 19:44 |
clarkb | we should expect the certs that our cert checker is complaining about to refresh overnight today, which should address that (maybe sooner) | 19:44 |
clarkb | One thing I wanted to ask is if we should try and explicitly set settings like that to avoid underlying changes impacting us | 19:45 |
clarkb | we can set the key type and length and so on explicitly | 19:45 |
clarkb | and that might allow us to avoid pinning? | 19:45 |
clarkb | The downside to this is 5 years from now when rsa 2048 is no longer safe <_< | 19:45 |
ianw | we could -- this feels like a bit of a corner case because the ecc path seems to be a bit of a separate, opt-in thing, and upstream sort of half-switched it | 19:46 |
ianw | they can't really make the tool suddenly start updating certs in a different place on disk (a directory with _ecc appended) because that would seem to break everything | 19:47 |
clarkb | maybe we wait for feedback on the issue before deciding on how to move forward post pin | 19:47 |
clarkb | I just wanted to call the option of being explicit as an alternative to pinning | 19:47 |
ianw | yeah; i think it's good we've pointed it out -- we've run from dev for several years and this is the first time we've had an issue | 19:47 |
ianw | so i'll keep an eye, and if we can get back to a point of running from dev I think that's desirable to continue being a canary | 19:48 |
ianw | we can always pin to the last known working thing easily | 19:48 |
clarkb | sounds good and thank you for digging into that yesterday | 19:48 |
clarkb | I had a note on your debugging change too, not sure if you saw | 19:49 |
clarkb | the one that updates driver.sh | 19:49 |
clarkb | #topic Open Discussion | 19:49 |
ianw | ok thanks, will go over those. just a couple of things that would have made it easier if it happens again | 19:49 |
clarkb | ++ | 19:49 |
clarkb | corvus: ianw's fix to openstacksdk for launch node may be what we need for latest openstacksdk to work with nodepool too so I'll try to reprioritize testing that with your test tool | 19:50 |
clarkb | also linaro's new arm cloud appears to be near ready for use. This is being spun up to get us off old hardware that equinix wants to shut down | 19:50 |
clarkb | I expect we'll need to move the builder and the mirror node manually (but maybe linaro has some magic to move those vms? I doubt it though ) | 19:51 |
ianw | yeah pretty sure that will all be a rebuild | 19:51 |
clarkb | And a reminder that we'll turn off the iweb cloud at the end of the year. | 19:51 |
ianw | that's ok, good test of the launch node changes :) | 19:51 |
clarkb | ++ | 19:51 |
clarkb | last call for anything else | 19:52 |
fungi | i've checked the new mm3 server's ip addresses against spamhaus and senderbase, and they're all clean | 19:54 |
clarkb | excellent | 19:55 |
clarkb | Thank you all for your time (in the meeting and working on OpenDev)! We'll be back next week. I should look at a calendar soon to see how December holidays and the new year impact our schedule. | 19:55 |
fungi | announcement is being drafted in the migration plan etherpad and is nearly complete, so i'll give folks a heads up in #opendev once it's ready to proof | 19:55 |
fungi | thanks clarkb! | 19:55 |
clarkb | #endmeeting | 19:55 |
opendevmeet | Meeting ended Tue Nov 29 19:55:27 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:55 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-29-19.01.html | 19:55 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-29-19.01.txt | 19:55 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-29-19.01.log.html | 19:55 |
clarkb | 5 minutes early :) | 19:55 |
fungi | nice | 19:55 |
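
For readers wondering what the blocklist check on the new mm3 server's addresses amounts to mechanically: DNS-based blocklists are usually queried by reversing the address and looking it up under the list's zone. The log does not say how fungi ran the check, so the following is only a minimal sketch. The function name `dnsbl_query_name` is hypothetical, the default zone `zen.spamhaus.org` is an assumption, and the addresses are documentation placeholders rather than the real server's.

```python
import ipaddress


def dnsbl_query_name(address: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name to look up when checking an IP against a DNSBL.

    Hypothetical helper for illustration; the zone default is an assumption.
    """
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # IPv4: reverse the four dotted octets.
        labels = str(ip).split(".")[::-1]
    else:
        # IPv6: expand to 32 hex nibbles, then reverse them.
        labels = list(ip.exploded.replace(":", ""))[::-1]
    return ".".join(labels) + "." + zone


if __name__ == "__main__":
    # Documentation-range addresses, not the real server's.
    for addr in ("192.0.2.1", "2001:db8::1"):
        print(dnsbl_query_name(addr))
```

Resolving the resulting name with an ordinary A-record lookup is the second half of the check. By the usual DNSBL convention, a name that does not exist means the address is not listed, while an answer in 127.0.0.0/8 generally indicates a listing or an error code defined by the list operator.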