clarkb | meeting time | 19:00 |
---|---|---|
fungi | indeed | 19:00 |
ianw | o/ | 19:01 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Nov 22 19:01:51 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-November/000381.html Our Agenda | 19:02 |
clarkb | #topic Announcements | 19:02 |
clarkb | fungi: are individual director nominations still open? | 19:02 |
fungi | yes, until january | 19:03 |
fungi | i think? | 19:03 |
fungi | no, december 16 | 19:03 |
fungi | #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003104.html 2023 Open Infrastructure Foundation Individual Director nominations are open | 19:04 |
clarkb | thanks | 19:04 |
fungi | also cfp is open for the vancouver summit in june | 19:04 |
fungi | #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003105.html The CFP for the OpenInfra Summit 2023 is open | 19:04 |
clarkb | #topic Bastion Host Updates | 19:06 |
clarkb | Let's dive right in | 19:06 |
clarkb | the bastion is doing centrally managed known hosts now. We also have a dedicated venv for launch node now | 19:06 |
clarkb | fungi discovered that this venv isn't working with rax though. It needed older cinderclient, and with that fixed now needs something to address networking stuff with rax. iirc this was the problem that corvus and ianw ran into on old bridge? I think we need older osc maybe? | 19:07 |
clarkb | fungi: that may be why my osc is pinned back on bridge01 fwiw | 19:07 |
fungi | well, the cinderclient version only gets in the way if you want to use launch-node's --volume option in rackspace, but it seems there's also a problem with the default networks behavior | 19:07 |
clarkb | right, I suspect that the reason I pinned back osc but not cinderclient is I was not doing volume stuff but was doing server stuff | 19:08 |
ianw | ... that does sound familiar | 19:08 |
fungi | rackspace doesn't have neutron api support, and for whatever reason osc thinks it should | 19:08 |
fungi | maybe we need to add another option to our clouds.yaml for it? | 19:08 |
clarkb | ianw: I'm pretty sure it's the same issue that corvus ran into and helped you solve when you ran into it. And I suspect it was fixed for me on bridge01 with older osc | 19:08 |
clarkb | fungi: if you can figure that out you win a prize. unfortunately the options for stuff like that aren't super clear | 19:09 |
ianw | (https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2022-10-24.log.html#t2022-10-24T22:56:16 was what i saw when bringing up new bridge) | 19:09 |
fungi | looks like the server create command is hitting a problem in openstack.resource._translate_response() where it complains about the networks metadata | 19:09 |
ianw | "Bad networks format" | 19:10 |
fungi | yeah, that's it | 19:10 |
clarkb | we can solve the venv problem outside the meeting but I wanted to call it out | 19:11 |
fungi | okay, so we think ~root/corvus-venv (on old bridge?) has a working set? | 19:11 |
clarkb | fungi: ya | 19:11 |
fungi | yeah, that's enough for me to dig deeper | 19:12 |
fungi | thanks | 19:12 |
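A minimal sketch of the clouds.yaml idea fungi raises above: openstacksdk ships a built-in `rackspace` vendor profile that pre-sets several of that cloud's quirks, and it may (this is an assumption, not a confirmed fix) also be the right place to sort out the default-networks behaviour launch-node is hitting. The cloud name, credentials and regions below are placeholders:

```yaml
# Hypothetical clouds.yaml entry; "rax" and the auth values are
# placeholders. profile: rackspace pulls in openstacksdk's bundled
# vendor settings for Rackspace. Whether that is enough to avoid the
# "Bad networks format" error is the part that still needs testing.
clouds:
  rax:
    profile: rackspace
    auth:
      username: placeholder-user
      password: placeholder-password
      project_id: "000000"
    regions:
      - DFW
      - ORD
      - IAD
```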
clarkb | are there any other bridge updates or changes to call out? | 19:12 |
ianw | the ansible 6 work is close | 19:12 |
ianw | #link https://review.opendev.org/q/topic:bridge-ansible-update | 19:13 |
ianw | anything with a +1 can be reviewed :) | 19:13 |
clarkb | oh right I pulled up the change there that I haven't reviewed yet but saw it was huge compared to the others and got distracted :) | 19:13 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/865092 | 19:13 |
ianw | is one for disabling known_hosts now that we have the global one | 19:13 |
clarkb | ianw: I think you can go ahead and approve that one if you like. I mostly didn't approve it just so that we could take a minute to consider if there was a valid use case for not doing it | 19:14 |
ianw | ok, i can't think of one :) i have in my todo to look at making it a global thing ... because i don't think we should have any logging in between machines that we don't explicitly codify in system-config | 19:15 |
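As a rough illustration of what "making it a global thing" could look like (a sketch of one possible approach, not what change 865092 or any planned follow-up actually does), an Ansible-managed ssh client drop-in could ignore per-user known_hosts entirely and trust only the centrally managed file. The drop-in path assumes an OpenSSH new enough to include /etc/ssh/ssh_config.d/*.conf:

```yaml
# Sketch only: rely solely on the centrally managed global known_hosts
# and stop per-user files from accumulating. Paths and option values
# are assumptions, not what system-config currently deploys.
- name: Prefer the global known_hosts over per-user files
  ansible.builtin.copy:
    dest: /etc/ssh/ssh_config.d/99-global-known-hosts.conf
    mode: "0644"
    content: |
      Host *
          UserKnownHostsFile /dev/null
          GlobalKnownHostsFile /etc/ssh/ssh_known_hosts
```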
ianw | #link https://review.opendev.org/q/topic:prod-bastion-group | 19:15 |
ianw | has a couple of changes still if we want to push on the parallel running of jobs. | 19:15 |
ianw | (it doesn't do anything in parallel, but sets up the dependencies so they could, in theory) | 19:15 |
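To make "sets up the dependencies so they could, in theory" concrete, this is roughly what it means in Zuul's project-pipeline syntax: each production job declares only the jobs it genuinely needs, which leaves Zuul free to run the rest concurrently once those finish. The job names here are illustrative stand-ins, not the actual system-config jobs or their real dependency graph:

```yaml
# Hypothetical deploy pipeline snippet. Both service jobs depend only
# on the bootstrap job, so Zuul may run them in parallel once it
# completes; nothing forces them into a single serial chain.
- project:
    deploy:
      jobs:
        - infra-prod-bootstrap
        - infra-prod-service-foo:
            dependencies:
              - infra-prod-bootstrap
        - infra-prod-service-bar:
            dependencies:
              - infra-prod-bootstrap
```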
clarkb | ++ I do want to push on that. But I also want to be able to observe things as they change | 19:16 |
clarkb | probably won't dig into that until next week | 19:16 |
ianw | that is very high touch, and i don't have 100% confidence it wouldn't cause problems, so not a fire and forget | 19:16 |
ianw | ++ | 19:16 |
ianw | other than that, i just want to write up something on the migration process, and i think it's (finally!) done | 19:16 |
clarkb | excellent. Thanks again | 19:17 |
ianw | there's no more points i can think of that we can really automate more | 19:17 |
clarkb | sounds like we can move on to the next item | 19:18 |
clarkb | #topic Upgrading Servers | 19:18 |
clarkb | I was hoping to make progress on another server last week then I got sick and now this is behind a few other efforts :/ | 19:18 |
clarkb | that said one of those efforts is mailman3 and does include a server upgrade. Let's just skip ahead to talk about that | 19:18 |
clarkb | #topic Mailman 3 | 19:18 |
clarkb | changes for mailman3 config management have landed | 19:19 |
clarkb | after fungi picked this back up and addressed some issues with newer mailman and django | 19:19 |
clarkb | I believe the next step is to launch a node, add it to our inventory and get mailman3 running on it. Then we can plan migrations for opendev and zuul? | 19:19 |
clarkb | er I guess they are already planned. We can go ahead and perform them :) | 19:19 |
fungi | yeah, the migration plan is outlined here: | 19:20 |
fungi | #link https://etherpad.opendev.org/p/mm3migration Mailman 3 Migration Plan | 19:20 |
clarkb | anything else to add to this? | 19:22 |
ianw | sounds great, thanks! | 19:22 |
fungi | i'm thinking since we said let's do go/no-go in the meeting today for a monday 2022-11-28 maintenance, i'm leaning toward no-go since i'm currently struggling to get the server created and expect to not be around much on thursday/friday (also a number of people are quite unlikely to see an announcement this week) | 19:23 |
clarkb | fungi: that is reasonable. no objection from me | 19:23 |
ianw | yeah i wouldn't want to take this one on alone :) | 19:23 |
fungi | later next week might work, we could do a different day than friday given clarkb's unavailability | 19:24 |
fungi | also apologies for typos and slow response, i seem to be suffering some mighty lag/packet loss at the moment | 19:24 |
clarkb | yup later that week works for me. I'm just not around that friday | 19:25 |
fungi | anyway, nothing else on this topic | 19:25 |
clarkb | (the 2nd) | 19:25 |
fungi | mainly working to get the prod server booted and added to inventory | 19:25 |
clarkb | cool, let's move on to the next thing | 19:25 |
clarkb | #topic Using pip wheel in python base images | 19:25 |
clarkb | The change to update how our assemble system creates wheels in our base images has landed | 19:26 |
clarkb | At this point I don't expect any problems due to the testing I did with nodepool which covered siblings and all that | 19:26 |
clarkb | But if you do see problems please make note of them so that they can be debugged, but reverting is also fine | 19:26 |
clarkb | mostly just a heads up | 19:26 |
clarkb | #topic Vexxhost instance rescuing | 19:28 |
clarkb | I didn't want this to fall off our minds without some sort of conclusion | 19:28 |
clarkb | For normally booted nodes we can create a custom boot image that uses a different device label for / | 19:29 |
clarkb | we can create this with dib and upload it or make a snapshot of an image we've done surgery on in the cloud. | 19:29 |
clarkb | However, for BFV nodes we would need a special image with special metadata flags and I have no idea what the appropriate flags would be | 19:29 |
clarkb | given that, do we want to change anything about how we operate in vexxhost? previously people had asked about console login | 19:30 |
clarkb | I'm personally hopeful that public clouds would make this work out of the box for users. But that isn't currently the case so I'm open to ideas | 19:31 |
ianw | i guess review is the big one ... | 19:31 |
ianw | i guess console login only helps if it gets to the point of starting tty's ... so not borked initrd etc. | 19:32 |
clarkb | ya its definitely a subset of issues | 19:32 |
clarkb | maybe our effort is best spent trying to help mnaser figure out what that metadata for bfv instances is? | 19:32 |
clarkb | er metadata for bfv rescue images is | 19:33 |
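If the missing piece turns out to be nova's stable-device-rescue image properties (an assumption; confirming this with mnaser/vexxhost is exactly the open question), the rescue image would need metadata along these lines when uploaded to glance:

```yaml
# Candidate glance image properties for a rescue image, based on
# nova's stable-device rescue feature; whether these are what a
# boot-from-volume rescue in vexxhost actually requires is unverified.
properties:
  hw_rescue_device: disk
  hw_rescue_bus: virtio
```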
fungi | yes, being able to mount the broken server's rootfs on another server instance is really the only 100% solution | 19:33 |
ianw | it does feel like there's not much replacement for it | 19:33 |
ianw | i'm trying to think of things in production that have stopped boot | 19:34 |
clarkb | ianw: the lists server is the main one. But that's due to its ancientness | 19:34 |
clarkb | (which we are addressing by replacing it) | 19:34 |
ianw | a disk being down in the large static volume and not mounting was one i can think of | 19:34 |
ianw | i think i had to jump into rescue mode for that once | 19:35 |
fungi | for that case, having a ramdisk image in the bootloader is useful | 19:35 |
fungi | but not for cases where the hypervisor can't fire the bootloader at all (rackspace pv lists.o.o) | 19:36 |
clarkb | ya it definitely seems like rescuing is the most versatile tool. I'll see if I can learn anything more about how to configure ceph for it | 19:37 |
clarkb | in particular jrosser had input and we might be able to try setting metadata like jrosser and see if it works :) | 19:37 |
clarkb | and if we can figure it out then other vexxhost users will potentially benefit too | 19:38 |
clarkb | so worthwhile I think | 19:38 |
clarkb | #topic Quo vadis Storyboard | 19:39 |
clarkb | It's been about two weeks since I sent out another email asking for more input | 19:39 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-October/000370.html | 19:39 |
clarkb | we've not had any responses since my last email. I think that means we should probably consider feedback gathered | 19:39 |
clarkb | At this point I think our next step is to work out amongst ourselves (but probably still on the mailing list?) what we think opendev's next steps should be | 19:40 |
clarkb | to summarize the feedback from our users, it seems there aren't any specifically interested in using storyboard or helping to maintain/develop it. | 19:41 |
clarkb | There are users interested in migrating from storyboard to launchpad | 19:41 |
ianw | the options are really to upgrade it to something maintainable or somehow gracefully shut it down? | 19:41 |
frickler | do we know whether there is any other storyboard deployment beside ours? | 19:42 |
clarkb | ianw: yes I think ultimately where that puts us is we need to decide if we are willing to maintain it ourselves (including necessary dev work). This would require a shift in our priorities considering it hasn't been updated yet | 19:42 |
clarkb | ianw: and if we don't want to do that figure out if migrating to something else is feasible | 19:42 |
clarkb | frickler: there were once upon a time but I'm not sure today | 19:42 |
fungi | as always people can fork or take up maintenance later if it's something they're using, regardless of whether we're maintaining it | 19:44 |
clarkb | right I'm not too worried about others unless they are willing to help us. And so far no one has indicated that is something they can or want to do | 19:45 |
clarkb | I don't think we should make any decisions in the meeting. But I wanted to summarize where we've ended up and what our two likely paths forward appear to be | 19:45 |
ianw | it feels like if we're going to expend a fair bit of effort, it might be worth thinking about putting that into figuring out something like using gitea issues | 19:45 |
clarkb | ianw: my main concern with that is I don't think we can effectively disable repo usage and allow issues | 19:45 |
clarkb | its definitely something that can be investigated though | 19:46 |
fungi | right now there are a lot of features in gitea which we avoid supporting because we aren't allowing people to have gitea accounts | 19:46 |
clarkb | and a number of features we explicitly don't want | 19:46 |
fungi | as soon as we do allow gitea accounts, we need to more carefully go through the features and work out which ones can be turned off or which we're stuck supporting because people will end up using them | 19:47 |
clarkb | But we've got really good CI for gitea so testing setups shouldn't be difficult if people want to look into that. | 19:47 |
fungi | also not allowing authenticated activities has shielded us from a number of security vulnerabilities in gitea | 19:47 |
clarkb | ++ | 19:47 |
clarkb | anyway I do think we should continue the discussion on the mailing list. I can write a followup indicating that we have options like maintain it ourselves and do the work, migrate to something else (lp and/or gitea) and try to outline the effort required for each? | 19:48 |
clarkb | Do we want that in a separate thread? Or should I keep rolling forward with the existing one | 19:48 |
ianw | i do agree, but it feels like useful time spent investigating that path | 19:49 |
ianw | it probably also loops back to our authentication options, something that would also be useful to work on i guess | 19:52 |
clarkb | ya at the end of the day everything ends up intertwined somehow and we're doing our best to try and prioritize :) | 19:53 |
clarkb | (fwiw I think the recent priorities around gerrit and zuul and strong CI have been the right choice. They have enabled us to move more quickly on other things as necessary) | 19:53 |
clarkb | sounds like no strong opinion on where exactly I send the email. I'll take a look at the existing thread and decide if a follow up makes sense or not | 19:54 |
clarkb | but I'll try to get that out later today or tomorrow and we can pick up discussion there | 19:54 |
fungi | agreed, i would want to find time to put into the sso spec if we went that route | 19:54 |
clarkb | #topic Open Discussion | 19:56 |
clarkb | Anything else before we run out of time? | 19:56 |
frickler | just thx to ianw for the mastodon setup | 19:57 |
fungi | reminder that i don't expect to be around much thursday through the weekend | 19:57 |
clarkb | fungi: me too | 19:57 |
frickler | that went much easier than I had expected | 19:57 |
clarkb | my kids actually have wednesday - monday out of school | 19:57 |
clarkb | I wish I realized that sooner | 19:57 |
fungi | well, if you're not around tomorrow i won't tell anyone ;) | 19:57 |
ianw | ++ enjoy the turkeys | 19:58 |
clarkb | and we are at time. Thank you for your time everyone. We'll be back next week | 19:59 |
fungi | thanks clarkb! | 19:59 |
clarkb | #endmeeting | 19:59 |
opendevmeet | Meeting ended Tue Nov 22 19:59:54 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:59 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.html | 19:59 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.txt | 19:59 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.log.html | 19:59 |