Tuesday, 2022-11-22

clarkbmeeting time19:00
fungiindeed19:00
ianwo/19:01
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Nov 22 19:01:51 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkb#link https://lists.opendev.org/pipermail/service-discuss/2022-November/000381.html Our Agenda19:02
clarkb#topic Announcements19:02
clarkbfungi: are individual director nominations still open?19:02
fungiyes, until january19:03
fungii think?19:03
fungino, december 1619:03
fungi#link https://lists.openinfra.dev/pipermail/foundation/2022-November/003104.html 2023 Open Infrastructure Foundation Individual Director nominations are open19:04
clarkbthanks19:04
fungialso cfp is open for the vancouver summit in june19:04
fungi#link https://lists.openinfra.dev/pipermail/foundation/2022-November/003105.html The CFP for the OpenInfra Summit 2023 is open19:04
clarkb#topic Bastion Host Updates19:06
clarkbLet's dive right in19:06
clarkbthe bastion is doing centrally managed known hosts now. We also have a dedicated venv for launch node now19:06
clarkbfungi discovered that this venv isn't working with rax though. It needed older cinderclient, and with that fixed now needs something to address networking stuff with rax. iirc this was the problem that corvus and ianw ran into on old bridge? I think we need older osc maybe?19:07
clarkbfungi: that may be why my osc is pinned back on bridge01 fwiw19:07
fungiwell, the cinderclient version only gets in the way if you want to use launch-node's --volume option in rackspace, but it seems there's also a problem with the default networks behavior19:07
clarkbright, I suspect that the reason I pinned back osc but not cinderclient is I was not doing volume stuff but was doing server stuff19:08
ianw... that does sound familiar19:08
fungirackspace doesn't have neutron api support, and for whatever reason osc thinks it should19:08
fungimaybe we need to add another option to our clouds.yaml for it?19:08
clarkbianw: I'm pretty sure it's the same issue that corvus ran into and helped you solve when you ran into it. And I suspect it was fixed for me on bridge01 with older osc19:08
clarkbfungi: if you can figure that out you win a prize. unfortunately the options for stuff like that aren't super clear19:09
ianw(https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2022-10-24.log.html#t2022-10-24T22:56:16 was what i saw when bringing up new bridge)19:09
fungilooks like the server create command is hitting a problem in openstack.resource._translate_response() where it complains about the networks metadata19:09
ianw"Bad networks format"19:10
fungiyeah, that's it19:10
clarkbwe can solve the venv problem outside the meeting but I wanted to call it out19:11
fungiokay, so we think ~root/corvus-venv (on old bridge?) has a working set?19:11
clarkbfungi: ya19:11
fungiyeah, that's enough for me to dig deeper19:12
fungithanks19:12
clarkbare there any other bridge updates or changes to call out?19:12
ianwthe ansible 6 work is close19:12
ianw#link https://review.opendev.org/q/topic:bridge-ansible-update19:13
ianwanything with a +1 can be reviewed :)19:13
clarkboh right I pulled up the change there that I haven't reviewed yet but saw it was huge compared to the others and got distracted :)19:13
ianw#link https://review.opendev.org/c/opendev/system-config/+/86509219:13
ianwis one for disabling known_hosts now that we have the global one19:13
clarkbianw: I think you can go ahead and approve that one if you like. I mostly didn't approve it just so that we could take a minute to consider if there was a valid use case for not doing it19:14
ianwok, i can't think of one :)  i have in my todo to look at making it a global thing ... because i don't think we should have any logging in between machines that we don't explicitly codify in system-config19:15
ianw#link https://review.opendev.org/q/topic:prod-bastion-group19:15
ianwhas a couple of changes still if we want to push on the parallel running of jobs.  19:15
ianw(it doesn't do anything in parallel, but sets up the dependencies so they could, in theory)19:15
clarkb++ I do want to push on that. But I also want to be able to observe things as they change19:16
clarkbprobably won't dig into that until next week19:16
ianwthat is very high touch, and i don't have 100% confidence it wouldn't cause problems, so not a fire and forget19:16
ianw++19:16
ianwother than that, i just want to write up something on the migration process, and i think it's (finally!) done19:16
clarkbexcellent. Thanks again19:17
ianwthere's no more points i can think of that we can really automate more19:17
clarkbsounds like we can move on to the next item19:18
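A minimal sketch of the idea ianw describes under the prod-bastion-group topic: once dependencies between jobs are declared, the jobs with no unmet dependencies can be grouped into batches that could, in theory, run in parallel. This is not the actual system-config or Zuul implementation. The job names and the `parallel_batches` helper are hypothetical, and the code only uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical job names and dependencies, for illustration only.
# Each key maps to the set of jobs that must finish before it starts.
job_deps = {
    "bootstrap": set(),
    "service-a": {"bootstrap"},
    "service-b": {"bootstrap"},
    "service-c": {"bootstrap"},
    "report": {"service-a", "service-b", "service-c"},
}


def parallel_batches(deps):
    """Group jobs into ordered batches.

    Every job in a batch has all of its dependencies satisfied by
    earlier batches, so the jobs inside one batch could run concurrently.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(job_deps)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

With these made-up dependencies the three service jobs land in the same batch, which is the situation where running them side by side becomes possible. Whether doing so is safe is a separate question, as the discussion above notes.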
clarkb#topic Upgrading Servers19:18
clarkbI was hoping to make progress on another server last week then I got sick and now this is behind a few other efforts :/19:18
clarkbthat said one of those efforts is mailman3 and does include a server upgrade. Let's just skip ahead to talk about that19:18
clarkb#topic Mailman 319:18
clarkbchanges for mailman3 config management have landed19:19
clarkbafter fungi picked this back up and addressed some issues with newer mailman and django19:19
clarkbI believe the next step is to launch a node, add it to our inventory and get mailman3 running on it. Then we can plan migrations for opendev and zuul?19:19
clarkber I guess they are already planned. We can go ahead and perform them :)19:19
fungiyeah, the migration plan is outlined here:19:20
fungi#link https://etherpad.opendev.org/p/mm3migration Mailman 3 Migration Plan19:20
clarkbanything else to add to this?19:22
ianwsounds great, thanks!19:22
fungii'm thinking since we said let's do go/no-go in the meeting today for a monday 2022-11-28 maintenance, i'm leaning toward no-go since i'm currently struggling to get the server created and expect to not be around much on thursday/friday (also a number of people are quite unlikely to see an announcement this week)19:23
clarkbfungi: that is reasonable. no objection from me19:23
ianwyeah i wouldn't want to take this one on alone :)19:23
fungilater next week might work, we could do a different day than friday given clarkb's unavailability19:24
fungialso apologies for typos and slow response, i seem to be suffering some mighty lag/packet loss at the moment19:24
clarkbyup later that week works for me. I'm just not around that friday19:25
fungianyway, nothing else on this topic19:25
clarkb(the 2nd)19:25
fungimainly working to get the prod server booted and added to inventory19:25
clarkbcool, let's move on to the next thing19:25
clarkb#topic Using pip wheel in python base images19:25
clarkbThe change to update how our assemble system creates wheels in our base images has landed19:26
clarkbAt this point I don't expect any problems due to the testing I did with nodepool which covered siblings and all that19:26
clarkbBut if you do see problems please make note of them so that they can be debugged, but reverting is also fine19:26
clarkbmostly just a heads up19:26
clarkb#topic Vexxhost instance rescuing19:28
clarkbI didn't want this to fall off our minds without some sort of conclusion19:28
clarkbFor normally booted nodes we can create a custom boot image that uses a different device label for /19:29
clarkbwe can create this with dib and upload it or make a snapshot of an image we've done surgery on in the cloud.19:29
clarkbHowever, for BFV nodes we would need a special image with special metadata flags and I have no idea what the appropriate flags would be19:29
clarkbgiven that, do we want to change anything about how we operate in vexxhost? previously people had asked about console login19:30
clarkbI'm personally hopeful that public clouds would make this work out of the box for users. But that isn't currently the case so I'm open to ideas19:31
ianwi guess review is the big one ...19:31
ianwi guess console login only helps if it gets to the point of starting tty's ... so not borked initrd etc.19:32
clarkbya it's definitely a subset of issues19:32
clarkbmaybe our effort is best spent trying to help mnaser figure out what that metadata for bfv instances is?19:32
clarkber metadata for bfv rescue images is19:33
fungiyes, being able to mount the broken server's rootfs on another server instance is really the only 100% solution19:33
ianwit does feel like there's not much replacement for it19:33
ianwi'm trying to think of things in production that have stopped boot19:34
clarkbianw: the lists server is the main one. But that's due to its ancientness19:34
clarkb(which we are addressing by replacing it)19:34
ianwa disk being down in the large static volume and not mounting was one i can think of19:34
ianwi think i had to jump into rescue mode for that once19:35
fungifor that case, having a ramdisk image in the bootloader is useful19:35
fungibut not for cases where the hypervisor can't fire the bootloader at all (rackspace pv lists.o.o)19:36
clarkbya it definitely seems like rescuing is the most versatile tool. I'll see if I can learn anything more about how to configure ceph for it19:37
clarkbin particular jrosser had input and we might be able to try setting metadata like jrosser and see if it works :)19:37
clarkband if we can figure it out then other vexxhost users will potentially benefit too19:38
clarkbso worthwhile I think19:38
clarkb#topic Quo vadis Storyboard19:39
clarkbIt's been about two weeks since I sent out another email asking for more input19:39
clarkb#link https://lists.opendev.org/pipermail/service-discuss/2022-October/000370.html19:39
clarkbwe've not had any responses since my last email. I think that means we should probably consider feedback gathered19:39
clarkbAt this point I think our next step is to work out amongst ourselves (but probably still on the mailing list?) what we think opendev's next steps should be19:40
clarkbto summarize the feedback from our users, it seems there aren't any specifically interested in using storyboard or helping to maintain/develop it.19:41
clarkbThere are users interested in migrating from storyboard to launchpad19:41
ianwthe options are really to upgrade it to something maintainable or somehow gracefully shut it down?19:41
fricklerdo we know whether there is any other storyboard deployment beside ours?19:42
clarkbianw: yes I think ultimately where that puts us is we need to decide if we are willing to maintain it ourselves (including necessary dev work). This would require a shift in our priorities considering it hasn't been updated yet19:42
clarkbianw: and if we don't want to do that figure out if migrating to something else is feasible19:42
clarkbfrickler: there were once upon a time but I'm not sure today19:42
fungias always people can fork or take up maintenance later if it's something they're using, regardless of whether we're maintaining it19:44
clarkbright I'm not too worried about others unless they are willing to help us. And so far no one has indicated that is something they can or want to do19:45
clarkbI don't think we should make any decisions in the meeting. But I wanted to summarize where we've ended up and what our two likely paths forward appear to be19:45
ianwit feels like if we're going to expend a fair bit of effort, it might be worth thinking about putting that into figuring out something like using gitea issues19:45
clarkbianw: my main concern with that is I don't think we can effectively disable repo usage and allow issues19:45
clarkbit's definitely something that can be investigated though19:46
fungiright now there are a lot of features in gitea which we avoid supporting because we aren't allowing people to have gitea accounts19:46
clarkband a number of features we explicitly don't want19:46
fungias soon as we do allow gitea accounts, we need to more carefully go through the features and work out which ones can be turned off or which we're stuck supporting because people will end up using them19:47
clarkbBut we've got really good CI for gitea so testing setups shouldn't be difficult if people want to look into that.19:47
fungialso not allowing authenticated activities has shielded us from a number of security vulnerabilities in gitea19:47
clarkb++19:47
clarkbanyway I do think we should continue the discussion on the mailing list. I can write a followup indicating that we have options like maintain it ourselves and do the work, migrate to something else (lp and/or gitea) and try to outline the effort required for each?19:48
clarkbDo we want that in a separate thread? Or should I keep rolling forward with the existing one19:48
ianwi do agree, but it feels like useful time spent investigating that path19:49
ianwit probably also loops back to our authentication options, something that would also be useful to work on i guess19:52
clarkbya at the end of the day everything ends up intertwined somehow and we're doing our best to try and prioritize :)19:53
clarkb(fwiw I think the recent priorities around gerrit and zuul and strong CI have been the right choice. They have enabled us to move more quickly on other things as necessary)19:53
clarkbsounds like no strong opinion on where exactly I send the email. I'll take a look at the existing thread and decide if a follow up makes sense or not19:54
clarkbbut I'll try to get that out later today or tomorrow and we can pick up discussion there19:54
fungiagreed, i would want to find time to put into the sso spec if we went that route19:54
clarkb#topic Open Discussion19:56
clarkbAnything else before we run out of time19:56
fricklerjust thx to ianw for the mastodon setup19:57
fungireminder that i don't expect to be around much thursday through the weekend19:57
clarkbfungi: me too19:57
fricklerthat went much easier than I had expected19:57
clarkbmy kids actually have wednesday - monday out of school19:57
clarkbI wish I realized that sooner19:57
fungiwell, if you're not around tomorrow i won't tell anyone ;)19:57
ianw++ enjoy the turkeys19:58
clarkband we are at time. Thank you for your time everyone. We'll be back next week19:59
fungithanks clarkb!19:59
clarkb#endmeeting19:59
opendevmeetMeeting ended Tue Nov 22 19:59:54 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:59
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.html19:59
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.txt19:59
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.log.html19:59
