Tuesday, 2024-07-02

clarkbjust about meeting time18:58
* tonyb is trying to wake up18:59
clarkbgood morning18:59
tonybI'm vertical at least18:59
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Jul  2 19:00:18 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/VV6IZNOMAO2KDKBBQ45R3VPNSRCOLWCG/ Our Agenda19:00
clarkb#topic Announcements19:00
clarkbA reminder that Thursday (and possibly Friday?) is a holiday for several of us19:00
tonybhappy turkey day!19:01
clarkbno this is happy please do your best to not set everything on fire with your fireworks day :)19:01
corvusi will celebrate with maple syrup!19:01
clarkbmight also be worth calling out that openstack has loaded up the CI system with some important changes and we should do our best to ensure we aren't impacting their ability to merge19:02
tonyboh right.  silly me19:02
clarkbfungi: frickler: I saw rumblings of merge issues but haven't seen any of that tied back to infrastructure issues.19:02
clarkbwell we can get back to that at the end of the agenda if there are concerns19:03
clarkb#topic Upgrading Old Servers19:04
clarkbtonyb: I've not been able to pay as much attention to this as I would've liked over the last week. Plenty of other distractions. Anything new to share about the wiki work?19:04
tonybi can get the IP of the held node19:04
tonybI did a snapshot and import of the current wiki data and it seems good19:05
clarkbtonyb: does that include the database content?19:05
tonybyup19:05
tonybit helped me discover some additional work to do but nothing significant 19:05
clarkbthat is great. I guess that proves we can do the sideways migration (probably with a reasonable downtime to avoid deltas between the two sides)19:05
tonybyup I'd say the outage will be an hour tops19:06
tonybso I'd like y'all to look at the wiki on that node19:06
tonybto test for extension breakage etc19:07
tonybthen I can publish changes for review19:07
clarkb++ I can #link a url if we have one (or do we need to edit /etc/hosts locally to have vhosts align?)19:07
fungiare images working too? the separate non-database file storage in mediawiki required some careful handling last i did this19:07
tonybyup images work too19:07
fungiawesome!19:07
tonybyou'll need to edit hosts with the IP 19:08
clarkbdo you have the IP? otherwise we have to go look in nodepool's hold list19:08
tonybI'll drop the details in #opendev when I get to my laptop 19:08
clarkbperfect thanks!19:08
clarkbI also wanted to ask about booting noble nodes. Have we booted one yet? I know the stack to make that possible landed last week.19:09
tonybfor noble I was going to try mirror mode but that was made complex19:09
tonybThe vexxhost mirrors are currently boot from volume19:10
tonybthe rax clouds don't have noble images19:10
tonybetc nothing super worrisome but it means I'll probably stall for a bit while I figure out the right way forward 19:11
clarkbseems reasonable19:11
clarkbI also wanted to note that debian fixed some openafs packaging so we may be able to flip some of those infra roles jobs that are non voting to voting at this point19:11
tonyb104.239.143.6 wiki99.opendev.org19:11
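For anyone following along, the hosts edit tonyb describes could look roughly like the sketch below. It only formats the line and checks whether a hosts file already contains it. The helper names `hosts_entry` and `has_entry` are made up for illustration, and the sketch assumes the held node answers for the `wiki99.opendev.org` vhost at the address posted above.

```python
def hosts_entry(ip: str, hostname: str) -> str:
    """Format a single /etc/hosts line mapping hostname to ip."""
    return f"{ip} {hostname}"


def has_entry(hosts_text: str, ip: str, hostname: str) -> bool:
    """Return True if hosts_text already maps hostname to ip.

    Comments (anything after '#') are ignored, and a hostname may appear
    anywhere after the address on its line.
    """
    for raw in hosts_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == ip and hostname in fields[1:]:
            return True
    return False


# Address and hostname as posted in the log; both go away when the hold is released.
print(hosts_entry("104.239.143.6", "wiki99.opendev.org"))
```

Remember to remove the line again once the held node is deleted, since the address may later be reused by something unrelated.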
tonybI can look at that too19:12
tonybI wanted to check the next release and see if there is anything we need that isn't backported19:12
clarkbI haven't tested the fixes myself but flipping from non voting to voting to run the jobs then see what fails/succeeds seems reasonable19:12
clarkbAnything else related to upgrading servers?19:13
tonybIt's a bit of a tangent but I've been looking at kAFS also19:13
fungioh, i did confirm that this week's new openafs packages in debian fixed my dkms build problems19:14
fungispeaking of afs19:14
tonybNice19:14
clarkbA good lead into the next topic19:14
clarkb#topic Cleaning Up AFS Mirror Content19:15
clarkba number of the topic:drop-centos-8-stream changes have merged at this point, but what remains is currently stuck behind projects like glance (and tempest etc?) still trying to run fips jobs on centos 8 stream19:15
clarkbI don't think we want to force merge cleanups this week due to the holiday and openstack being preoccupied with the security fix patches, but maybe we should consider force merging things next week?19:16
clarkbCurrently the jobs are broken and projects have just set things to non voting which is completely useless from our perspective. The jobs cannot succeed and need to be removed/replaced and if that isn't happening more naturally (due to the non voting workaround) I think we should be more forceful19:17
clarkbany concerns with doing that next week? Maybe we want to see where the openstack patching stands on monday?19:17
tonybSounds okay to me19:18
tonybwe pre-warned of that plan19:18
tonybWhat's the status of Xenial and Bionic?19:19
tonybWhat can I/we do to help get that content removed19:19
clarkbtonyb: my efforts there stalled on trying to clean up centos 8 stream first (I prioritized that way since xenial jobs still function and centos 8 stream had fewer tendrils)19:20
clarkbI do have a semi recent change up to system-config to remove our last uses of xenial with a warning that once we do that we are at higher risk of breaking things from an opendev perspective19:20
clarkbthat was always going to happen with our plan so it's a matter of timing. We could proceed with that if we think the risk is low enough19:20
clarkbtopic:drop-ubuntu-xenial should have that change. Let me see about a direct link19:20
clarkb#link https://review.opendev.org/c/opendev/system-config/+/922680 Xenial CI Job removal from system-config19:21
tonybThanks19:22
clarkb#topic Gitea 1.22 Upgrade19:22
clarkbUpstream still doesn't have a 1.22.1 release19:23
clarkbso I haven't really looked at this much more. However, I did cycle out my held nodes as the previous ones were old enough to not have logs available in zuul related to them anymore19:24
clarkbalso a user pointed out yesterday on irc that tarball downloads for repos do not work currently. They were working not that long ago and the logs indicate 200 responses with 19 bytes of json saying "complete: false"19:24
clarkbmy plan is to test that with the 1.22 held nodes and see if the behavior persists as I suspect it may just be a gitea bug19:25
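One way to check the held nodes for the symptom clarkb describes is to look at the downloaded body rather than the status code, since the status is 200 in both the good and the bad case. The sketch below is a guess at such a check. The exact response body is not quoted in the log, so it assumes something like `{"complete":false}` followed by a newline (which does come to 19 bytes). The function name is hypothetical.

```python
import json


def is_incomplete_archive_stub(body: bytes) -> bool:
    """Return True if body looks like a small JSON stub with complete == false.

    A real tarball is binary and normally far larger, so anything big or
    anything that fails to parse as JSON is treated as "not the stub".
    """
    if len(body) > 1024:
        return False
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return False
    return isinstance(payload, dict) and payload.get("complete") is False


# Assumed shape of the bad response; the real body may differ.
stub = b'{"complete":false}\n'
print(len(stub), is_incomplete_archive_stub(stub))
```

Fetching the archive URL and passing the response bytes to this function would tell the two cases apart without relying on the HTTP status.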
clarkb#topic OpenMetal Cloud Rebuild19:25
clarkbI haven't seen any response to my last email19:26
clarkbThe main concern is that I think configuring storage properly is something that they should consider more generally for their product and we're a good test case which means we may want to avoid fixing it directly using frickler's kolla knowledge19:27
clarkbbasically trying to avoid stepping on toes and help provide some value back to the donating organization. With the holiday and it being summer for those of us in north america it may also just be a vacation problem. I'll try to followup again after the holiday and see if we can get direction that avoids toe stepping19:28
tonybSounds good.19:28
clarkb#topic Testing Rackspace's New Cloud Offering19:28
clarkbsimilarly with this one I haven't heard back on the email I sent19:28
tonybI guess we could also file an issue / ticket19:28
clarkbtonyb: ya that might be another way to get attention19:29
clarkbIn the rackspace case I think I may lean on some folks at the foundation who may be in more regular contact with them and see if we can set something up19:29
clarkbagain with the holiday this week I don't expect it to move quickly though19:29
fungiyeah, cloudnull seems to busy for our usual heckling ;)19:30
fungier, too busy19:30
clarkb#topic Nodepool in Zuul19:31
corvusyou may remember this zuul spec from a while back19:31
corvusthe main work of implementation on that has begun, so i think in the not too distant future, this may become more relevant to opendev19:32
clarkbthe main goals are to express image and node info directly in zuul configs as well as using the zuul runtime engine to process things like image builds19:32
corvusyep, and a big part of that is being able to build images inside a zuul job19:33
clarkband reduce confusion over zuul and nodepool being different things for historical reasons despite being tightly coupled today19:33
fungiand provide opportunities for things like acceptance testing of images, i suppose19:33
tonybCan you link to the spec so I can ask reasonable questions 19:33
corvusso in opendev, we will need to port our image building from nodepool-builder into zuul jobs19:33
corvusfungi: yep19:33
corvus#link nodepool in zuul spec https://zuul-ci.org/docs/zuul/latest/developer/specs/nodepool-in-zuul.html19:34
tonybThank you19:34
clarkbI suspect that we'll be able to port an image at a time as we sort out any unexpected items19:35
corvusas for moving the image building into jobs -- there's a bit of work to do there, but i don't think it's going to be too bad, and we'll have help19:36
corvusfirst, i expect that the image build jobs are basically going to be "run diskimage-builder with the same parameters we use today inside nodepool-builder"19:36
corvussecond, the folks at bmw already build their images this way, and are offering their existing ansible roles that execute DIB to zuul-jobs19:36
corvusso a lot of the boilerplate for "run dib in a zuul job" should exist in some form19:37
corvuswe also have my old proof-of-concept patch: https://review.opendev.org/84879219:37
corvusthat just does it in a shell script, but the principle is the same19:37
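As a rough illustration of "run diskimage-builder with the same parameters", a job could assemble the `disk-image-create` command line from the image definition and hand it to a task runner. The sketch below only builds and prints the argument list. The image name and element list are illustrative examples rather than OpenDev's real configuration, the function name `dib_command` is hypothetical, and this is not the BMW roles or the proof-of-concept change mentioned above.

```python
import shlex


def dib_command(image_name, elements, image_types=("qcow2",)):
    """Build an argv list for disk-image-create.

    -o sets the output image name and -t the comma-separated output formats.
    The elements are passed as positional arguments.
    """
    if not elements:
        raise ValueError("at least one element is required")
    return [
        "disk-image-create",
        "-o", image_name,
        "-t", ",".join(image_types),
        *elements,
    ]


# Example values only; a real job would take these from the image definition.
cmd = dib_command("ubuntu-noble", ["ubuntu-minimal", "vm"])
print(shlex.join(cmd))
```

Settings that nodepool-builder passes through the environment today (for example `DIB_RELEASE`) would presumably need to be carried over alongside the command line.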
clarkbI suspect that zuul will want to ship things that work out of the box for people migrating (though we're likely the test case for those)?19:38
corvusso anyway, as clarkb says, i think we can start working on these image jobs one at a time, once zuul grows the ability to run them19:38
corvusclarkb: yeah, i think there should be a straightforward path for anyone using nodepool-builder.19:39
corvusthe other thing opendev may see soon is a zuul-launcher host19:40
tonybExciting :)19:40
clarkbtalking out loud here: but do we need another host? Could just run that on the existing launcher nodes?19:40
corvusthat will be the zuul component responsible for launching nodes, and at least at first, driving image build workflow19:40
clarkbthough maybe since long term nodepool goes away not having things named nlXY is preferred19:40
corvusclarkb: we... could, i think?  but unless we're very constrained, i think it would be good to have a new host for simplicity19:41
clarkbI don't think we're constrained. Was more thinking it might speed up the conversion process to not launch new nodes too (though that isn't too big of a deal either)19:41
corvusthe process for doing all of this will be to effectively develop this in a shadow mode.  so zuul is going to grow a lot of features that are not enabled by default, and are not documented.19:42
corvusand i think we're going to get all the way to the end and have the ability to run both systems in parallel for a while before we call it done.19:42
fungithat sounds reasonable19:43
corvusi'll take care of writing the deployment changes, and doing a lot of the (undocumented) job/workflow construction19:43
fungithanks!19:43
corvusi will be leaning on other folks to help with the image build jobs themselves, because i am not as expert in that as others are :)19:43
clarkbsounds like a plan19:44
* tonyb is far from an expert but is happy to "drive" the opendev side of the image-building19:44
corvustonyb: ++ thanks!  and you will be soon!  :)19:44
clarkbanything else?19:45
tonyb\o/19:45
corvusi think you can expect to see some deployment changes relatively soon....19:45
corvusoh, and if anyone asks, i don't think opendev running zuul-launcher should be seen as a sign it's time for other folks to do so19:45
clarkbprobably want to point people at zuul release notes for signal that they are expected to migrate?19:46
corvusobviously anyone is welcome to -- but for clarity, sometimes we point to ourselves as an example of how zuul should be run; but in this case i consider this more like part of the development process, and it's not a signal of maturity or production readiness.19:46
corvusjust want to be clear on that19:46
corvusclarkb: exactly -- documentation and release notes will be a much better signal for that19:46
tonybGood to be clear19:47
corvusi think that's about it; thanks19:47
tonybthanks corvus 19:47
fungithat is also how we handled the switch from jenkins to ansible builders in zuul 2.5.x19:47
clarkb#topic Collating Backlog Items From the Group19:47
clarkbtonyb: you added this item, do you want to drive or should I from the notes (I know it is early for you)19:48
tonybI can 19:48
clarkbgo for it19:48
tonybIt's kind of a discussion point19:48
tonybAs a group we're somewhat overcommitted and we all have a list (physical or not) of things to do19:48
fungii have a to do item to find where i put my to do list19:49
tonybI was thinking of coming up with a lightweight way to track these things19:49
fungiwe've used etherpads fairly well for scratch coordination in the past19:50
tonybMy initial idea was to have a bot (#noteit) that we could use to add a topic and link to the IRC logs where an item was discussed19:50
clarkbone idea I had was that maybe one of the super simple web kanban board things that have sprouted up could be worth trying. Downside to that is who knows how long those services will stay up and whether or not data is exportable19:50
clarkbbut https://gokanban.io/ for example is like etherpad for kanban I think19:50
clarkbtonyb: I do like the idea of linking back to the irc logs for greater context19:51
corvusi like the idea of #noteit so when i'm off doing something for a day or two and come back i can find the important bits in backlog;  seems similar to #status log which we might be able to abuse for that purpose too19:51
clarkbso much of the context ends up in IRC and we often end up grepping/googling/searching irc logs19:51
clarkbya I'd be willing to try it19:51
tonybAny of those could work I mainly wanted a non-invasive way to keep track19:52
fungior add a new #status track but sure log and even the others could link back to the irc log url where they were called19:52
clarkb(to be clear the kanban thing was an idea in addition to the noteit idea not a replacement. It was just something that came to mind when reading this on the agenda yesterday as I put it together)19:52
clarkbsounds like there are no objections if someone has time to implement the feature in the bot19:53
tonybI figured as much #noteit would be the "quick add" to $whatever and then we'd edit that backend once it's done19:53
clarkbmakes sense19:54
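To make the #noteit idea a little more concrete, here is a minimal sketch of the two pieces discussed: recognizing the command and building a link back to the channel log. Nothing here is an existing bot feature. The command syntax, the function names, the log base URL, and the `#t<timestamp>` anchor format are all assumptions for illustration.

```python
import re
from datetime import datetime
from urllib.parse import quote

# Assumed command syntax: "#noteit <free-form topic>"
NOTEIT = re.compile(r"^#noteit\s+(?P<topic>\S.*)$")
# Assumed location and layout of the published channel logs.
LOG_BASE = "https://meetings.opendev.org/irclogs"


def parse_noteit(message):
    """Return the topic text if message is a #noteit command, else None."""
    match = NOTEIT.match(message.strip())
    return match.group("topic").strip() if match else None


def log_link(channel, when):
    """Build a link to the log line for channel at the datetime when."""
    chan = quote(channel, safe="")  # "#opendev" becomes "%23opendev"
    day = when.strftime("%Y-%m-%d")
    anchor = when.strftime("%Y-%m-%dT%H:%M:%S")
    return f"{LOG_BASE}/{chan}/{chan}.{day}.log.html#t{anchor}"


topic = parse_noteit("#noteit clean up the specs repo")
link = log_link("#opendev", datetime(2024, 7, 2, 19, 50, 0))
print(topic, link)
```

The resulting (topic, link) pair is what would be appended to whichever backend gets chosen, whether that is an etherpad, a kanban board, or a new #status mode.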
tonybA related note is the specs repo is .... outdated19:54
fungiyes indeedily-doodily19:54
clarkbya the main issue with the specs repo is as you note we're largely overcommitted with the existing stuff so finding time for new things or major changes is difficult and then that reflects back on the specs repo19:54
tonybIf I start moving things around in there19:54
clarkbbut yes I think we should probably declare bankruptcy in the specs repo and carry over a small set of things we know we absolutely want to do19:54
fungiin my tenure as ptl, pre-opendev, i tried to reframe the specs list as a help-wanted board19:55
clarkbprometheus and the login improvements come to mind as things to carry over19:55
tonybfor example mailman3 can be marked as done right?19:55
fungisince that's more what it is19:55
clarkbtonyb: yes that one can be marked done19:55
fungibut yeah, some of those things can be crossed off the list now19:55
clarkbtonyb: but ya I think patches to clean that up and maybe reflect the help wanted ness of the situation more explicitly would all be good19:55
tonybOkay.  I can push some of those and ask for reviews from time-to-time to make sure I'm on the right path19:56
clarkb++ sounds good19:57
fungithanks!19:57
clarkbwe have a few more minutes19:57
clarkb#topic Open Discussion19:57
clarkbanything else that didn't get captured in the agenda that we want to call out quickly?19:57
tonybSo in the near term I'll add a new mode to the bot for #status to do the noteit stuff19:57
fungithe lists performance adjustment seems to have worked out, messages are coming through quite a lot faster now19:57
clarkbya I haven't noticed any major lags since the queuing stuff went in19:58
clarkbthank you everyone for all your help running OpenDev and your time during the meeting!19:59
tonybNice.  My mail is often laggy anyway so I didn't notice at all :/19:59
clarkbWe'll be back here next week at our regularly scheduled time, but as always feel free to start discussions on the mailing list or on irc if things are urgent19:59
clarkbor if you just want quick feedback, doesn't have to be urgent19:59
clarkb#endmeeting20:00
opendevmeetMeeting ended Tue Jul  2 20:00:06 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)20:00
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2024/infra.2024-07-02-19.00.html20:00
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-07-02-19.00.txt20:00
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2024/infra.2024-07-02-19.00.log.html20:00
clarkband now we can all go find $meal20:00
tonybThanks all!20:00
fungithanks clarkb!20:00
corvusthanks clarkb!20:00
