clarkb | just about meeting time | 18:59 |
---|---|---|
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Nov 21 19:00:52 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
fungi | indeed | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WBYBD2663WL2IJD7NLDHBQ5ANRNRSMX3/ Our Agenda | 19:00 |
clarkb | #topic Announcements | 19:01 |
clarkb | It is Thanksgiving week in the US. I saw the TC meeting was cancelled today as a result. I will be less and less around as the week progresses. Have to start on food prep tomorrow | 19:01 |
clarkb | basically heads up that it may get quiet but I'll probably check my matrix connection at times | 19:01 |
clarkb | #topic Server Upgrades | 19:02 |
clarkb | tonyb has made progress on this and replaced the ord mirror. The new jammy mirror is in use | 19:02 |
tonyb | \o/ | 19:02 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/901504 Helper tool for mirror node volume management | 19:03 |
fungi | awesome job | 19:03 |
tonyb | I created mirror02.bhs1 today, and tested ^^^ | 19:03 |
clarkb | one thing that came out of that is the mirror nodes have volumes that are set up differently than all our other hosts so the existing tools can't be used | 19:03 |
clarkb | to avoid manual effort which results in errors and deltas tonyb volunteered to write a tool to simplify things. | 19:03 |
clarkb | I need to rereview it | 19:03 |
clarkb | tonyb: other than reviewing changes and answering questions you have is there anything the rest of us can be doing to help? | 19:04 |
tonyb | Nope I'm working through things. | 19:04 |
tonyb | if anything comes up I'll yell | 19:04 |
clarkb | sounds good and thank you for the help! | 19:05 |
clarkb | #topic Python Container Updates | 19:05 |
clarkb | No update on getting zuul-operator off of old debian. But uwsgi builds against python3.12 now so we can add python3.12 images if we want | 19:05 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/898756 And parent add python3.12 images | 19:06 |
clarkb | I don't expect we'll be making use of those quickly, but I do like getting the base images ready so that we aren't preventing anyone from testing with them | 19:06 |
tonyb | ++ | 19:06 |
clarkb | They should be straightforward reviews. The parent is a bookkeeping noop and the child only adds new images that you have to explicitly opt into using | 19:06 |
clarkb | #topic Gitea 1.21.0 | 19:08 |
clarkb | I worked through the changelog and have the gitea test job running with screenshots that look correct now | 19:09 |
clarkb | However, it seems there is rough consensus that we'd like to rotate our ssh keys out in gitea before we upgrade to avoid needing to disable ssh key length checking | 19:09 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/901082 Support gitea key rotation | 19:09 |
clarkb | This change should allow us to do that entirely through configuration management. (the existing config management doesn't quite do what we need for rotating keys) | 19:09 |
clarkb | As written it should noop. Then we can create a new key, add it to gitea, then also update gerrit config management to deploy the key there and select it | 19:10 |
clarkb | the gerrit side is not yet implemented as I was hoping for feedback on 901082 first | 19:10 |
clarkb | Oh and I think we should use an ed25519 key because they have a single length which hopefully avoids gitea changing minimum lengths in the future on us | 19:11 |
tonyb | Sounds good to me. | 19:11 |
fungi | i'm fine with it | 19:12 |
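As a rough illustration of the key rotation step discussed above, a minimal sketch of generating the replacement keypair with standard OpenSSH tooling; the output path and comment are placeholders, not the values the config management change will actually use:

```python
import subprocess

# Generate a new ed25519 keypair for the Gerrit -> Gitea replication rotation.
# ed25519 keys are always 256 bits, so there is no key-length knob for a
# future Gitea minimum-length bump to trip over the way short RSA keys can.
# The path and comment below are hypothetical, for illustration only.
subprocess.run(
    [
        "ssh-keygen",
        "-t", "ed25519",
        "-f", "/tmp/gerrit-replication-ed25519",  # hypothetical output path
        "-N", "",                                  # no passphrase
        "-C", "gerrit-replication-rotation",       # hypothetical comment
    ],
    check=True,
)
```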
clarkb | If you are interested in seeing what changes with gitea other than the ssh key stuff the change is ready for review | 19:12 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/897679 Upgrade to 1.21.0 | 19:12 |
clarkb | There are other things that change but none of them in a very impactful way | 19:12 |
clarkb | #topic Gerrit 3.8 Upgrade | 19:13 |
clarkb | This is done! | 19:13 |
tonyb | \o/ | 19:14 |
clarkb | It went really well as far as I can tell | 19:14 |
clarkb | The one issue we've seen is that html/js resources from the old version seem to stay cached, affecting the web ui file editor | 19:14 |
clarkb | If you hard refresh or clear your caches this seems to fix it | 19:14 |
clarkb | I've gone ahead and started on the gerrit container image cleanup for 3.7 and updates for 3.9 | 19:14 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/901469 Updates our gerrit image builds post upgrade | 19:14 |
clarkb | I figure we can merge those first thing next week if we still don't have a reason to rollback to 3.7 | 19:15 |
tonyb | Is it worth sending an email to service-announce (and pointing other projects at it) explaining the html/js issue? | 19:15 |
clarkb | tonyb: ya I can do that as a response to the upgrade announcement | 19:15 |
tonyb | Okay, I wasn't volunteering you ;P | 19:16 |
clarkb | tonyb: no it's fine, then I don't have to moderate it through the list :) | 19:16 |
tonyb | :) | 19:16 |
fungi | i get the impression not many people use the built-in change editor, and some of them will end up never seeing the problem because of their browser pulling the new version before they do | 19:16 |
clarkb | I also sent email to upstream about it and at least one person indicated they had seen the issue before as well but weren't sure of any changes that avoid it | 19:16 |
clarkb | In related good news the gerrit 3.9 upgrade looks similar to the 3.8 upgrade. Minimal downtime to run init and then online reindexing | 19:18 |
clarkb | I haven't gone through the change list though so there may be annoying things we have to deal with pre upgrade | 19:18 |
clarkb | Anyway if you agree about removing 3.7 early next week maybe review the change and indicate that in review or something | 19:19 |
clarkb | #topic Upgrading Zuul's MySQL DB Server | 19:19 |
clarkb | In addition to upgrading gerrit last friday we also did a big zuul db migration to accommodate buildsets with multiple refs | 19:19 |
clarkb | in that migration we discovered that the older mysql there didn't support modern sql syntax for renaming foreign key constraints | 19:20 |
clarkb | This has since been fixed in the zuul migration, but to avoid similar problems in the future it is probably a good idea for us to look into running a more modern mysql/maria db for zuul | 19:20 |
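For illustration, a hedged sketch of the portable pattern in an Alembic migration: rather than relying on newer rename syntax, drop the foreign key and re-create it under the new name. Table, column, and constraint names here are made up, not the actual Zuul schema or its actual fix:

```python
from alembic import op


def upgrade():
    # Older MySQL has no direct way to rename a foreign key constraint,
    # so a portable migration drops the old constraint and re-adds it
    # under the new name instead. Names below are illustrative only.
    op.drop_constraint(
        "zuul_build_buildset_id_fkey", "zuul_build", type_="foreignkey"
    )
    op.create_foreign_key(
        "zuul_build_build_set_id_fkey",  # new constraint name
        "zuul_build",                    # source table
        "zuul_buildset",                 # referent table
        ["buildset_id"],                 # local column(s)
        ["id"],                          # remote column(s)
    )
```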
clarkb | I don't think we're going to create a plan for that here in this meeting but wanted to bring it up so that we can call out concerns or items to think about. I have 2. The first is where do we run it? Should it be on a dedicated host or just on say zuul01? I think disk size and memory needs will determine that. And are we currently backing up the db? If not, should we before we move it? | 19:21 |
clarkb | I suspect that the size of the database may make it somewhat impactful to run it alongside the existing schedulers and we'll need a new host dedicated to the database instead. That's fine but a small departure from how we run mariadb next to our other services | 19:22 |
fungi | i don't see a mysqldump in root's crontab on either of the schedulers, for reference | 19:22 |
tonyb | It'd be a departure from how we typically run the DB, but consistent with how we're running it for zuul today right? | 19:24 |
clarkb | tonyb: correct. | 19:24 |
clarkb | tonyb: basically all of the self hosted non trove dbs currently are run out of the same docker compose for $service on the same host | 19:24 |
clarkb | but that is because all of those dbs are small enough or servers are large enough that the impact is minimal | 19:25 |
clarkb | I suspect that won't be the case here | 19:25 |
tonyb | Yup that makes sense | 19:25 |
fungi | well, first off, we're running zuul with no spof other than haproxy and that trove instance at the moment. would we want a db cluster? | 19:25 |
clarkb | maybe the thing to do is collect info in an etherpad (current db version, current db size needs for disk and memory, backups and backup sizes if any) and then use that to build a plan off of | 19:25 |
clarkb | so I'm not sure how zuul would handle that | 19:26 |
clarkb | for example is it galera safe? | 19:26 |
fungi | all questions we ought to ask | 19:26 |
clarkb | unlike zookeeper, which automatically fails over and handles clustering out of the box, with db servers it's a lot more hands on and has impacts on the sorts of queries and inserts you can do, for example | 19:27 |
fungi | in the short term though, should we schedule some downtime to reboot the current trove instance onto a newer version (if available)? | 19:27 |
clarkb | I think it depends on how much newer we can get? If it is still fairly ancient then probably not worthwhile but if it is modern then it may be worth doing | 19:28 |
clarkb | but ya this is the sort of info gathering we need before we can make any reasonable decisions | 19:28 |
tonyb | Yup. | 19:29 |
clarkb | https://etherpad.opendev.org/p/opendev-zuul-mysql-upgrade <- | 19:29 |
clarkb | lets collect questions and answers there | 19:29 |
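As a starting point for that info gathering, a small sketch of pulling the server version and on-disk database size with pymysql; the host, credentials, and schema name are placeholders, and in practice this would be run against the existing trove instance:

```python
import pymysql

# Collect two of the basics proposed for the etherpad: the server version
# and the approximate on-disk size of the Zuul database.
# Connection details below are placeholders, not the real instance.
conn = pymysql.connect(
    host="zuul-db.example.org", user="zuul", password="secret", database="zuul"
)
with conn.cursor() as cur:
    cur.execute("SELECT VERSION()")
    print("server version:", cur.fetchone()[0])

    cur.execute(
        """
        SELECT ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2)
        FROM information_schema.tables
        WHERE table_schema = %s
        """,
        ("zuul",),
    )
    print("approx size (GiB):", cur.fetchone()[0])
conn.close()
```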
fungi | the "upgrade instance" option is greyed out in the rackspace webui for that db, just checked. not sure if that means 5.7 is the latest they have, or what | 19:29 |
tonyb | Well that's a start. | 19:30 |
fungi | if i create a new instance they have mysql 8.0 or percona 8.0 or mariadb 10.4 as options | 19:30 |
fungi | so anyway, in-place upgrading seems to be unavailable for it | 19:31 |
fungi | no idea if those versions are also ~ancient | 19:31 |
tonyb | So we could stick with trove and dump|restore | 19:32 |
clarkb | 10.4 is like old old stable but still supported for a bit iirc | 19:32 |
clarkb | it's what a lot of our stuff runs on and I haven't prioritized upgrades yet because it isn't EOL for another year or two iirc | 19:32 |
clarkb | I've got a list of questions in that etherpad now | 19:32 |
tonyb | 10.4.32 was released last week | 19:32 |
clarkb | I think collect what we can on that etherpad then loop corvus in and make an informed decision | 19:34 |
corvus | oh hi, today has been busy for me, sorry just catching up | 19:35 |
clarkb | corvus: I don't think it is urgent. Just trying to get a handle on what an updated non trove zuul db looks like | 19:35 |
corvus | i think i'd add that we have generally been okay with losing the entire build db, thus the current decisions around deployment | 19:35 |
corvus | and lack of backups etc | 19:35 |
corvus | we could decide to change that, but that's definitely a first-order input into requirements :) | 19:36 |
corvus | if we wanted to remove the spof, we could do what the zuul operator does and run percona xtradb | 19:36 |
corvus | but none of us knows how to run it other than just using the pxc operator, so that's a k8s thing. | 19:36 |
corvus | if we run our own mysql spof, then i think it should be on a separate host since we now treat the schedulers as disposable | 19:37 |
fungi | those all sound like reasonable constraints | 19:39 |
corvus | maybe worth doing a survey of db clustering solutions that are reasonably low effort | 19:41 |
clarkb | ++ | 19:41 |
corvus | i feel like this is not important enough for us to sink huge amounts of ops time into running a zero-downtime cluster and risk more downtime by not doing it well enough. | 19:42 |
fungi | and that aren't kv stores, presumably. we need an actual rdbms right? | 19:42 |
corvus | so if it's hard/risky, i would lean toward just run a tidy mariadb on a dedicated node. | 19:42 |
clarkb | corvus: ++ | 19:42 |
corvus | but if it's reasonably easy (like it is in k8s with pxc-operator), then maybe worth it | 19:42 |
clarkb | I think it may be a really interesting learning experience if people are into that but also based on people's struggles with openstack stuff it seems running db clusters isn't always straightforward | 19:43 |
corvus | fungi: yes, mysql/mariadb or postgres specifically. no others. | 19:43 |
fungi | #nopostgres | 19:43 |
corvus | we should probably not exclude pgsql from our consideration, even though we're generally mysql biased so far. | 19:43 |
fungi | ah, okay | 19:44 |
corvus | fungi: was that a veto of postgres, or tongue in cheek? | 19:44 |
fungi | it was an interpretation of your followup to my question about rdbms | 19:44 |
fungi | but then you clarified | 19:44 |
clarkb | ok I've tried to collect what we've said so far in that etherpad | 19:45 |
corvus | oh that was a deep cut. i get it. :) | 19:45 |
* fungi has no preference, just remembers the postgres wars in openstack | 19:45 |
clarkb | I think the next step(s) is/are to fill in the rest of the answers to those questions and get some basic info on clustering options | 19:45 |
corvus | anyway, the ship has sailed on zuul supporting both. both are first-class citizens and will continue to be afaict, even though supporting two is O(n^2) effort. | 19:45 |
clarkb | I'm definitely not going to get to that this week :) I can already start to feel the pull of cooking happening | 19:45 |
corvus | so either is fine, if, say, postgres clustering is easy. | 19:46 |
fungi | wfm | 19:46 |
corvus | yep. these are good notes to take and will help me remember this after recovering from pumpkin pie. | 19:46 |
clarkb | maybe resync at next week's meeting and find specific volunteers for remaining info gathering post holiday | 19:46 |
corvus | ++ | 19:47 |
fungi | sounds good | 19:47 |
clarkb | #topic Open Discussion | 19:47 |
clarkb | I think if I could get one more thing done this week it would be to land the python3.12 image updates since that is low impact. But otherwise I'm happy to wait on the gitea ssh stuff and gerrit image cleanup/additions | 19:48 |
clarkb | I'm definitely going to start being around less regularly. Apparently we have to roast turkey stuff tomorrow because we're not cooking it whole and need gravy makings | 19:48 |
clarkb | but also before that happens the turkey needs to be "deconstructed" | 19:49 |
clarkb | such a kind way of saying "butchered" | 19:49 |
fungi | openinfra foundation board individual member representative nominations are open until december 15 | 19:49 |
fungi | #link https://lists.openinfra.dev/archives/list/foundation@lists.openinfra.dev/thread/YJIQL444JMKFRSHUBYDWUQHBF7P7UDJF/ 2024 Open Infrastructure Foundation Individual Director nominations are open | 19:50 |
clarkb | One thing on my todo list for after thanksgiving is to start on Foundation Annual Report content for OpenDev (and Zuul) | 19:51 |
clarkb | I plan to stick that into etherpads like I've done before so that others can provide feedback easily | 19:51 |
clarkb | If there is something specific you're proud of or really want to see covered feel free to let me know | 19:51 |
clarkb | Last call for anything else? Otherwise we can go eat $meal a bit early | 19:52 |
tonyb | nothing from me | 19:53 |
clarkb | Sounds like that is everything. Thank you everyone for your time and I hope you get to enjoy Thanksgiving if you are celebrating | 19:53 |
clarkb | #endmeeting | 19:53 |
opendevmeet | Meeting ended Tue Nov 21 19:53:33 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:53 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.html | 19:53 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.txt | 19:53 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.log.html | 19:53 |
fungi | thanks clarkb! | 19:53 |