clarkb | Just about meeting time | 18:59 |
---|---|---|
clarkb | I'm going to try and manage this while eating a bowl of rice | 18:59 |
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Mar 5 19:00:20 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/UG2JFEL6XFFLDT5UYDHCBYNAJF72XXHZ/ Our Agenda | 19:00 |
clarkb | #topic Announcements | 19:00 |
fungi | hold bowl with one hand, chopsticks with second hand, type with third hand | 19:00 |
clarkb | small note that I'll be AFK through a good chunk of tomorrow. Taking advantage of a morning matinee and kids being in school to see Dune | 19:01 |
clarkb | #topic Server Upgrades | 19:02 |
clarkb | I haven't seen any new movement on this | 19:02 |
tonyb | nope. I'll address the review feedback and boot the new servers today | 19:02 |
clarkb | Worth calling out that the announced rackspace mfa switch may impact our ability to run launch node. I've got notes to discuss that further at the tail end of the meeting | 19:03 |
clarkb | tonyb: ah if you boot today you should be fine | 19:03 |
clarkb | #topic MariaDB Upgrades | 19:03 |
clarkb | The paste db upgrade went as expected. It seems to have only touched system tables, and it backed up those tables first. That backup is less than 1MB, so it is reasonable to keep having the process do it | 19:03 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.11 | 19:04 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11 | 19:04 |
clarkb | I went ahead and pushed these two changes to upgrade refstack and etherpad's backing databases. I did have to make a small change to etherpad's test cases because the log output from 10.11 was updated to say mariadb is ready instead of mysql is ready | 19:05 |
clarkb | reviews welcome as well as any feedback on whether we're comfortable with docker-compose kicking the upgrade off automatically or if we'd prefer manual intervention for up to the minute backups | 19:06 |
clarkb | after these two gerrit, gitea, and mailman 3 are the remaining dbs that need upgrades. I'll try to continue to step through them | 19:07 |
clarkb | #topic AFS Mirror cleanups | 19:07 |
clarkb | OpenSUSE Leap and Debian Buster have been removed from afs mirroring as well as nodepool | 19:08 |
clarkb | Next up is CentOS 7 which we've got some stuff in progress for under topic:drop-centos-7 | 19:08 |
clarkb | I did realize that CentOS 7 had/has far more reach than the other two so decided to announce a removal date for March 15 in order to minimize impact to the openstack release process | 19:09 |
clarkb | the impact should still be minimal but there were enough places that centos 7 was still showing up that I didn't want to just blaze ahead like I did with the others | 19:09 |
clarkb | we're currently cleaning up project configs then late this week or early next week I'll drop zuul-jobs testing of centos 7 and remove wheel caching for centos 7 | 19:10 |
fungi | the custom nodeset definition in devstack is nearly done merging backports across 8 active branches | 19:10 |
clarkb | then we can do the actual nodeset and nodepool removal on the 15th and once that is done clean up afs | 19:10 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/906013 Improve DKMS for CentOS OpenAFS testing/packaging | 19:10 |
fungi | though i expect whichever tries merging last to fail with errors we can then use to see what old branches of other projects are using the devstack nodeset | 19:10 |
clarkb | this change isn't directly related to the cleanup but involves centos and afs and I think will make it easier to understand failures with dkms on the platform | 19:11 |
clarkb | fungi: ya | 19:11 |
clarkb | fungi: keystone for example | 19:11 |
clarkb | slow but steady progress. And we've already freed up like 400GB of openafs consumption | 19:12 |
fungi | this cleanup effort is likely to be pretty vast, since copies of bits like custom nodesets and jobs are declared across many, many branches and only the last removal will actually tell you what was using it | 19:12 |
clarkb | I will note that last friday when I tried to clean up the buster mirror content afs01.dfw.openstack.org lost a "disk" and everything went sideways | 19:12 |
clarkb | it isn't clear to me if this was due to deleting a few hundred gigabytes of data or just coincidence | 19:12 |
clarkb | something we should be aware of when making other large changes to openafs. Rax addressed it quickly at least | 19:13 |
fungi | yeah, and other than retrying some vos releases there wasn't any lasting impact | 19:13 |
clarkb | fungi: yes, I mentioned it elsewhere but we really need openstack to clean up old stuff early in the branching process instead of at eol time | 19:13 |
clarkb | because we're ending up with ancient configs that make no sense in modern openstack that continue to be carried forward release after release increasing the cleanup time/cost | 19:14 |
fungi | i think people define shared resources in branched repos without considering how zuul uses them | 19:14 |
fungi | and not realizing that even if you delete something out of your master branch, other projects will just keep using it from a branch 5 releases ago | 19:15 |
clarkb | Another issue that I ran into was that openafs doesn't load on debian bookworm. | 19:15 |
clarkb | #link https://gerrit.openafs.org/#/c/15668/ Fix for openafs on arm with newer gcc | 19:15 |
clarkb | * doesn't load on debian bookworm arm64 | 19:15 |
clarkb | Once upstream merges this fix for it I'll submit a bug to debian to see if we can get that fixed (it doesn't work at all so should be a good candidate for a fixup) | 19:15 |
clarkb | Once we've chipped enough of this old stuff out we can add in new things :) | 19:16 |
clarkb | if anyone wants to get a headstart on that a new dib job to start building 24.04 might be helpful | 19:16 |
clarkb | *start building Ubuntu 24.04 to avoid any confusion on what I was referring to | 19:17 |
frickler | is it coincidence that openafs is using gerrit and we are using openafs? /me just notices this | 19:17 |
clarkb | frickler: yes I think it is | 19:17 |
clarkb | #topic OpenDev Email Hosting | 19:19 |
clarkb | Don't think we have anything new to mention on this. But kept it on the agenda in case we had any stronger opinions | 19:19 |
* clarkb | will give everyone a couple minutes to chime in if so. Otherwise we can continue on | 19:19 |
frickler | I'd be fine with dropping it from the agenda and reviving once we consider it to be more urgent again | 19:20 |
clarkb | wfm I can do that | 19:20 |
clarkb | #topic Project Renames | 19:21 |
clarkb | This is mostly a reminder that we're planning to do renames after the openstack release on April 19 | 19:21 |
clarkb | we can adjust this timing as necessary | 19:21 |
clarkb | so please say something if that timing is especially bad for some reason | 19:22 |
frickler | the release should happen earlier, the date is after the PTG | 19:22 |
clarkb | correct. It's basically release then ptg then the 19th | 19:23 |
clarkb | we didn't want to conflict with the ptg or the release so we're doing it late | 19:23 |
clarkb | which is a good lead into our next tope | 19:23 |
clarkb | *topic | 19:23 |
clarkb | #topic PTG Planning | 19:24 |
clarkb | #link https://ptg.opendev.org/ptg.html | 19:24 |
clarkb | I was hoping this schedule would be a bit more filled in before picking times but it is very empty | 19:24 |
clarkb | rather than wait for others to fill in I think we can go ahead and grab some time. | 19:24 |
clarkb | Something like Wednesday 0400-0600 and Thursday 1400-1600UTC. Gives enough time between blocks to catch up on sleep. | 19:24 |
clarkb | monday and tuesday tend to be busy so I'm trying to accommodate that | 19:25 |
frickler | +1 | 19:26 |
tonyb | Works for me. I admit I'll only be attending APAC friendly meetings | 19:26 |
fungi | ever since the ptg organizers stopped trying to pre-schedule times for all registered teams, many teams tend to wait until the last week to book any slots | 19:26 |
tonyb | I was thinking of dropping into the openeuler session | 19:26 |
clarkb | tonyb: that doesn't conflict with the times I proposed does it? | 19:27 |
clarkb | no it is on friday so we're good there | 19:27 |
frickler | I wasn't even aware that the scheduling is already happening. seems it is only announced to PTLs/session leaders? | 19:27 |
tonyb | I don't think so. the one I saw was Friday | 19:28 |
clarkb | frickler: yes emails did go out to the session leaders. Not sure if emails went out more broadly. | 19:28 |
clarkb | I can make a note that we may need to communicate this more widely | 19:28 |
tonyb | I think it only goes to session leaders | 19:28 |
clarkb | anyway I'll get us signed up for those two blocks later today | 19:31 |
clarkb | #topic Rax MFA Requirement | 19:31 |
fungi | sounds good | 19:31 |
clarkb | fungi received email today announcing that rax will require MFA for authenticating starting march 26, 2024 | 19:31 |
fungi | they've also added a similar notice on the login page for their portal | 19:32 |
clarkb | enabling MFA breaks normal openstack api auth. We have to either use a rax api key or bearer token | 19:32 |
clarkb | this means all of our automation is impacted. | 19:32 |
clarkb | Since bearer tokens expire (relatively quickly too) we've decided to investigate using the api_key method. To do this we need to install rackspaceauth as a keystoneauth1 plugin to all the places we use the api | 19:33 |
clarkb | then we need to use the api key value instead of regular user auth | 19:33 |
clarkb | the rough plan here is to test this with nodepool using a single region to start; that way we can check that launcher and builder operations work (or don't) | 19:33 |
frickler | do we know the lifetime for those api keys? | 19:34 |
clarkb | then when that works we can switch all rax nodepool providers over to the new system and update our control plane management to use the same api-key stuff. Then we can opt in to MFA when ready | 19:34 |
clarkb | fungi: ^ do you know the answer to frickler's question? You were testing this with your personal account any indication of a lifetime? | 19:34 |
fungi | frickler: i generated one for my personal rackspace account years ago and it's never changed | 19:34 |
fungi | from what i can tell it only changes if you click the "reset" button next to it in the account settings | 19:35 |
clarkb | If you'd like to help with reviews or pitch in pushing changes we're using topic:rackspace-mfa | 19:35 |
tonyb | I'm not really seeing how that helps with security at all? | 19:35 |
fungi | tonyb: it helps with security theater | 19:35 |
fungi | if you force people to make changes then you can't say you didn't do anything | 19:36 |
clarkb | fungi: for the system-config change we need to put new secrets in private vars. Is that done yet? | 19:36 |
clarkb | thinking about our next steps and I think it is roughly add the new private vars, land the system-config change, then update nodepool config | 19:36 |
fungi | yes, i left a comment on the change saying i did it too | 19:36 |
clarkb | then we can either land the nodepool change or try it out of the intermediate registry. Pulling from the intermediate registry will only work for the launcher image I think since the builder is multiarch and docker isn't able to negotiate multiarch images out of the intermediate registry currently :/ | 19:37 |
clarkb | fungi: thanks! | 19:37 |
clarkb | fungi: we should be able to push up a project-config update with a depends on system-config too if we haven't yet | 19:38 |
clarkb | but I think that is where we're at until a couple of things merge | 19:38 |
corvus | i think if it works for launcher that's good enough to land the nodepool change | 19:38 |
clarkb | as an alternative we can manually install the lib into the image if the launcher is multiarch too and can't be fetched out of testing | 19:38 |
fungi | clarkb: what needs changing in project-config? i can do that | 19:38 |
corvus | (i don't think we need to prove it works to land the nodepool change; it's pretty simple. but still, it'd be nice to avoid churn or errors there since there's no real way to test it other than in prod) | 19:38 |
clarkb | fungi: we have to update the nodepool/nl01.opendev.org and nodepool/nodepool.yaml files to force one of the three rax providers to use your newly defined clouds.yaml entries | 19:39 |
fungi | oh, right that | 19:39 |
clarkb | corvus: ++ | 19:39 |
fungi | yeah i'll get that proposed | 19:39 |
fungi | though probably not until after 21z | 19:39 |
clarkb | fungi: I would pick the rax region with the lowest capacity to reduce impact if it doesn't work | 19:39 |
fungi | good idea | 19:39 |
fungi | we have three weeks to get this working, which seems like plenty, but if we run into problems that time can disappear on us very quickly | 19:40 |
clarkb | agreed best to get as much info as we can as early as possible then adjust our plan as necessary | 19:41 |
frickler | what about log uploads, are these also affected or not? the earlier discussion in #opendev didn't seem conclusive to me | 19:41 |
fungi | we use swift account credentials for that, not keystone | 19:42 |
fungi | as i understand it | 19:42 |
fungi | those are separate accounts defined in swift itself and scoped to specific swift acls | 19:42 |
clarkb | ya so I don't think they will be affected but we should double check on that (check that we are using special creds and check that they aren't affected though i'm not sure how we do this second thing) | 19:42 |
clarkb | corvus: you may recall the details as I think youset that UP/ | 19:43 |
clarkb | (and I can't type) | 19:43 |
fungi | we can also, worst case, fall back to only uploading to ovh in the interim while we work it out | 19:43 |
clarkb | not ideal but ya that would work | 19:44 |
clarkb | as far as actual MFA implementation goes their docs refer to phone authenticator apps. Typically this means they are doing totp so we should be able to do that here as well | 19:44 |
clarkb | similar to how some of our other accounts have done totp | 19:44 |
clarkb | Still a lot of unknowns for now but we've got a plan to learn more. Next week we can catch up and make sure there aren't any glaring issues we need to address | 19:46 |
clarkb | #topic Open Discussion | 19:46 |
clarkb | Anything else before we end the meeting? | 19:46 |
corvus | clarkb: i don't recall the details.... | 19:47 |
clarkb | corvus: ack we should be able to log in to the swift stuff and check and/or look at our secrets in zuul | 19:47 |
corvus | yeah, probably worth looking into ahead of time | 19:47 |
corvus | because i agree, something is different about it | 19:47 |
clarkb | openstack is starting to get into release mode. Keep that in mind when making changes | 19:48 |
clarkb | and that's about all I had | 19:48 |
clarkb | sounds like that is everything for today. Thank you everyone for your time and effort operating and improving opendev | 19:51 |
clarkb | #endmeeting | 19:51 |
opendevmeet | Meeting ended Tue Mar 5 19:51:55 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:51 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.html | 19:51 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.txt | 19:51 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.log.html | 19:51 |
frickler | thx clarkb | 19:52 |
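
Editorial sketch, not part of the meeting: clarkb guessed above that the Rackspace MFA is TOTP because the docs refer to phone authenticator apps. Assuming that guess is right and that the provider uses the common RFC 6238 defaults (HMAC-SHA1, 30 second step, 6 digits), which the log does not confirm, a code could be generated with only the Python standard library roughly as follows. The function names `hotp` and `totp` and the demo secret are illustrative, not taken from any OpenDev tooling.

```python
import base64
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now=None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP where the counter is the number of elapsed time steps."""
    # Authenticator secrets are usually shared as base32, often without padding.
    cleaned = secret_b32.replace(" ", "").upper()
    padded = cleaned + "=" * (-len(cleaned) % 8)
    secret = base64.b32decode(padded)
    if now is None:
        now = time.time()
    return hotp(secret, int(now) // step, digits)


if __name__ == "__main__":
    # Hypothetical secret for demonstration only (the RFC test key, base32 encoded).
    demo_secret = base64.b32encode(b"12345678901234567890").decode()
    print(totp(demo_secret))
```

The shared secret would be stored alongside the other private credentials, and the step and digit count would need adjusting if the provider deviates from the defaults assumed here.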