Tuesday, 2022-11-15

18:59 <clarkb> almost meeting time.
19:00 <fungi> indeed it be
19:01 <clarkb> #startmeeting infra
19:01 <opendevmeet> Meeting started Tue Nov 15 19:01:01 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01 <opendevmeet> The meeting name has been set to 'infra'
19:01 <ianw> o/
19:01 <clarkb> #link https://lists.opendev.org/pipermail/service-discuss/2022-November/000379.html Our Agenda
19:01 <clarkb> #topic Announcements
19:01 <clarkb> I had no announcements
19:01 <clarkb> #topic Topics
19:01 <fungi> there are a couple
19:01 <clarkb> #undo
19:01 <opendevmeet> Removing item from minutes: #topic Topics
19:01 <clarkb> go for it
19:01 <fungi> nominations for the openinfra foundation board of directors are open
19:02 <fungi> and the cfp for the openinfra summit in vancouver is now open as well
19:02 <fungi> #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003104.html 2023 Open Infrastructure Foundation Individual Director nominations are open
19:02 <fungi> #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003105.html The CFP for the OpenInfra Summit 2023 is open
19:03 <fungi> that's all i can think of though
19:03 <clarkb> #topic Bastion Host Updates
19:03 <clarkb> #link https://review.opendev.org/q/topic:prod-bastion-group
19:03 <clarkb> #link https://review.opendev.org/q/topic:bridge-ansible-venv
19:04 <clarkb> looks like a few changes have merged since we last discussed this. ianw anything urgent or otherwise not captured by the two change topics that we should look at?
19:04 <clarkb> One idea I had was maybe we should consolidate to a single topic for review even if there are distinct trees of change happening?
19:05 <ianw> yeah i can clean up; i think prod-bastion-group is really now about being in a position to run parallel jobs
19:05 <ianw> which is basically "setup source in one place, then fire off jobs"
19:06 <clarkb> ah so maybe another topic for "things we need to do before turning off the old server"?
19:06 <ianw> the bridge-ansible-venv one i'll get back to is storing the host keys for our servers and deploying them to /etc/ssh
19:07 <ianw> fungi had some good points on making that better, so that's wip, but i'll get to that soon
19:07 <ianw> (the idea being that when we start a new bridge, i'm trying to make it so we have as few manual steps as possible :)
19:08 <clarkb> ++
19:08 <ianw> so writing down the manual steps has been a good way to try and think of ways to codify them :)
19:08 <fungi> it's a good approach, but gets weird if you don't include all ip addresses along with the hostnames, and we have that info in the inventory already
19:08 <ianw> the only other one is
19:08 <ianw> #link https://review.opendev.org/c/opendev/system-config/+/861284
19:08 <ianw> which converts our launch-node into a small package installed in a venv on the bridge node
19:09 <clarkb> and that should address our paramiko needs?
19:09 <clarkb> I'll have to take a look at that
19:09 <fungi> i'm going to need to launch a new server for mm3 this week probably, so will try to give that change a closer look
19:10 <clarkb> Great, I'll have need of that too for our next topic
19:10 <clarkb> #topic Upgrading old Servers
19:10 <ianw> yep, it fixes that issue, and i think is a path to help with openstacksdk versions too
19:10 <ianw> if we need two venvs with different versions -- well that's not great, but at least possible
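
A minimal sketch of the per-tool venv idea ianw describes: build an isolated virtualenv on the bridge host and pip-install a pinned openstacksdk into it, one venv per required version. The paths and version pins below are illustrative assumptions, not what change 861284 actually does.

    #!/usr/bin/env python3
    # Sketch only: create an isolated venv and install pinned requirements
    # into it, so two venvs can carry two different openstacksdk versions.
    # Paths and version pins are illustrative, not taken from change 861284.
    import subprocess
    import venv
    from pathlib import Path

    def build_tool_venv(path: Path, *requirements: str) -> None:
        """Create a venv at `path` and install the given requirements."""
        venv.EnvBuilder(with_pip=True, clear=True).create(path)
        subprocess.run([str(path / "bin" / "pip"), "install", *requirements],
                       check=True)

    if __name__ == "__main__":
        # Hypothetical: one venv with an older sdk pin, one tracking latest.
        build_tool_venv(Path("/opt/launcher-venv-old"), "openstacksdk<0.99")
        build_tool_venv(Path("/opt/launcher-venv-new"), "openstacksdk")
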
19:11 <clarkb> #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades
19:11 <clarkb> later this week I'm hoping to put a dent into some of these. I think we've basically sorted out all of the logistical challenges for doing jammy things once 861284 has landed
19:11 <clarkb> so ya I'll try to review that and continue to make progress here
19:12 <clarkb> I don't think there is much else to say about this now. We've hit the point where we just need to start doing the upgrades.
19:13 <clarkb> #topic Mailman 3
19:13 <clarkb> fungi has been pushing on this again. We ran into some problems with updated images that I think should be happier now
19:13 <fungi> in preparing to do a final round of import tests, i got stuck on upstream updates to the container image tag we were tracking which slightly broke our setup automation
19:13 <fungi> and fixing those in turn broke our forked images
19:14 <fungi> which i think i have sorted (waiting for zuul to confirm)
19:14 <clarkb> the good news is that this is giving us a sneak preview of the sorts of changes that need to be made for mm3 updates
19:14 <clarkb> I wouldn't say the story there is great, but it should at least be doable
19:14 <fungi> but pending successful repeat of a final round of migration testing, i hope to boot a future mm3 prod server later this week and send announcements for migration maintenance maybe?
19:15 <clarkb> the main issue being the way mailman deals with dependencies and needing specific versions of stuff
19:15 <clarkb> fungi: ++
19:15 <fungi> yeah, it's dependency hell, more or less
19:16 <fungi> anyway, might be worth looking at the calendar for some tentative dates
19:16 <fungi> i don't expect we need to provide a ton of advance notice for lists.opendev.org and lists.zuul-ci.org, but a couple of weeks heads-up might be nice
19:16 <fungi> which puts us in early december at the soonest
19:16 <clarkb> that seems reasonable
19:16 <ianw> ++
19:17 <fungi> based on my tests so far, importing those two sites plus dns cut-over is all doable inside of an hour
19:17 <fungi> lists.openstack.org will need a few hours minimum, but i won't want to tackle that until early next year
19:18 <fungi> should i be looking at trying to do the initial sites over a weekend, or do we think a friday is probably acceptable?
19:18 <clarkb> I think weekdays should be fine. Both lists are quite low traffic
19:18 <fungi> thinking about friday december 2, or that weekend (3..4)
19:19 <clarkb> the second is bad for me, but I won't let that stop you
19:19 <fungi> assuming things look good by the end of this week i can send a two-week warning on friday
19:20 <fungi> could shoot for friday december 9 instead, if that works better for folks, but that's getting closer to holidays
19:20 <fungi> i expect to be travelling at the end of december, so won't be around a computer as much
19:20 <corvus> i think almost no notice is required and you should feel free to do it at your convenience
19:21 <fungi> yeah, from a sending and receiving messages standpoint it should be essentially transparent
19:21 <corvus> (low traffic lists + smtp queuing)
19:22 <fungi> for people doing list moderation, the webui will be down, and when it comes back for them it will need a new login (they'll be able to get into their accounts via the password recovery steps)
19:23 <fungi> that's really the biggest impact. that and some unknowns around how dkim signatures on some folks' messages may stop validating if the new mailman alters posts in ways that v2 didn't
19:24 <frickler> maybe we should have a test ml where people concerned about their mail setup could test this?
19:24 <clarkb> frickler: unfortunately that would require setting up an entirely new domain
19:25 <clarkb> I think if we didn't have to do so much on a per-domain basis this would be easier to test and transition
19:25 <clarkb> but starting with low traffic domains is a reasonable stand-in
19:25 <clarkb> anyway I agree. I don't think a ton of notice is necessary. But sending some notice is probably a good idea as there will be user facing changes
19:25 <frickler> not upfront, but add something like test@lists.opendev.org
19:25 <fungi> would require a new domain before the migration, but could be a test ml on a migrated domain later if people want to test through it without bothering legitimate lists with noise posts
19:25 <clarkb> frickler: oh I see, ya not a bad idea
19:26 <fungi> we've had a test ml in the past, but dropped it during some domain shuffles in recent years
19:26 <fungi> i have no objection to adding one
19:26 <frickler> so that people from the large lists could test before those get moved
19:27 <clarkb> anything else on this topic or should we move on?
19:27 <fungi> if we're talking fridays, the next possible friday would be november 25, but since the 24th is a holiday around here i'm probably not going to be around much on the 25th
19:27 <clarkb> fungi: maybe we should consider a monday instead? and do the 7th or 28th?
19:29 <ianw> it's easier for me to help on a monday, but also i don't imagine i'd be much help ;)
19:29 <fungi> mondays mean smaller tolerance for extra downtime and more people actively trying to use stuff
19:29 <fungi> but fridays mean subtle problems may not get spotted for days
19:29 <fungi> so it's hard to say which is worse
19:29 <clarkb> right, but as corvus points out the risk there is minimal
19:30 <clarkb> for openstack that might change, but for opendev.org and zuul-ci.org I think it is fine?
19:30 <fungi> i have a call at 17:00 utc on the 28th but am basically open otherwise
19:30 <fungi> i can give it a shot. later in the day also means ianw is on hand
19:31 <fungi> or earlier in the day for frickler
19:31 <clarkb> and we can always reschedule if we get closer and decide that timing is bad
19:32 <fungi> yeah, i guess we can revisit in next week's meeting for a go/no-go
19:32 <clarkb> sounds good
19:32 <fungi> no need to burn more time on today's agenda
19:32 <clarkb> #topic Updating python base images
19:32 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/862152
19:32 <clarkb> This change has the reviews it needs. Assuming nothing else comes up tomorrow I plan to merge it then rebuild some of our images as well
19:33 <clarkb> At this point this is mostly a heads up that image churn will occur but it is expected to largely be a noop
19:33 <clarkb> #topic Etherpad container log growth
19:33 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/864060
19:34 <clarkb> This change has been approved. I half expect we need to manually restart the container to have it change behavior though
19:34 <clarkb> I'll try to check on it after lunch today and can manually restart it at that point if necessary
19:34 <clarkb> This should make the etherpad service far more reliable which is great
19:35 <clarkb> #topic Quo vadis Storyboard
19:35 <clarkb> #link https://lists.opendev.org/pipermail/service-discuss/2022-October/000370.html
19:35 <clarkb> I did send that update to the mailing list last week explicitly asking for users that would like to keep using storyboard so that we can plan accordingly
19:35 <clarkb> (and maybe convince them to help maintain it :) )
19:36 <clarkb> I have not seen any movement on that thread since then though
19:36 <fungi> i've added a highlight of that topic to this week's foundation newsletter too, just to get some added visibility
19:36 <clarkb> It does look like the openstack TC is not interested in making any broad openstack wide decisions either. Which means it is unlikely we'll get openstack pushing in any single direction.
19:37 <clarkb> I think we should keep the discussion open for another week then consider feedback collected and use that to make decisions on what we should be doing
19:37 <clarkb> Any other thoughts or concerns about storyboard?
19:38 <fungi> i had a user request in #storyboard this morning, but it was fairly easily resolved
19:38 <fungi> duplicate account, colliding e-mail address
19:38 <fungi> deactivated the old account and removed the e-mail address from it, then account autocreation for the new openid worked
19:40 <clarkb> #topic Vexxhost server rescue behavior
19:40 <clarkb> I did more testing of this and learned a bit
19:40 <clarkb> For normally launched instances resolving the disk label collision does fix things.
19:41 <clarkb> For BFV instances melwitt pointed me at the tempest testing for bfv server rescues and in that testing they set very specific image disk type and bus options
19:41 <clarkb> I suspect that we need an image created with those properties properly set for the volume setup in vexxhost. Then theoretically this would work
19:42 <clarkb> In both cases I think that vexxhost should consider creating a dedicated rescue image. Possibly one for bfv and one for non bfv. But with labels set (or uuid used) and the appropriate flags
19:42 <clarkb> mnaser: ^ I don't think this is urgent, but it is also a nice feature to have. I'd be curious to know if you have any feedback on that as well
19:42 <corvus> it sounds like bfv was something we previously needed but don't anymore; should we migrate to non-bfv?
19:43 <clarkb> corvus: that is probably worth considering as well. I did that with the newly deployed gitea load balancer
19:43 <mnaser> i suggest sticking to bfv, non-bfv means your data is sitting on local storage
19:43 <mnaser> risks are if the hv goes poof the data might be gone, so if it's cattle then you're fine
19:43 <mnaser> but if it's a pet that might be a bit of a bad time
19:43 <clarkb> mnaser: I think for things like gitea and gerrit we would still mount a distinct data volume, but don't necessarily need the disk to be managed that way too. For the load balancer this is definitely a non issue
19:44 <mnaser> oh well in that case when you're factoring it in then you're good
19:44 <fungi> yeah, we would just deploy a new load balancer in a matter of minutes, it has no persistent state whatsoever
19:45 <clarkb> but also I think these concerns are distinct. Server rescue should work, particularly for a public cloud imo, so that users can fix things up themselves.
19:45 <clarkb> then whether or not we boot bfv is something we should consider
19:45 <ianw> definitely an interesting point to add to our launch node docs though?  mostly things are "cattle" but with various degrees of how annoying it would be to restore i guess
19:46 <clarkb> In any case I just wanted to give an update on what I found. rescue can be made to work for non bfv instances on our end and possibly for bfv as well but I'm unsure what to set those image property values to for ceph volumes
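
For context, the tempest coverage melwitt pointed at exercises nova's stable device rescue feature, which keys off the hw_rescue_device and hw_rescue_bus image properties. A hedged sketch of setting them with openstacksdk follows; the cloud name, image name, and chosen values are illustrative guesses rather than a verified recipe for vexxhost's ceph-backed volumes.

    #!/usr/bin/env python3
    # Sketch only: tag a candidate rescue image with the stable-device-rescue
    # properties. The cloud name, image name, and the bus/device values are
    # assumptions for illustration, not tested against vexxhost.
    import openstack

    conn = openstack.connect(cloud="vexxhost")  # clouds.yaml entry name

    image = conn.image.find_image("dedicated-rescue-image",
                                  ignore_missing=False)
    # Extra keyword arguments to update_image are stored as image properties.
    conn.image.update_image(
        image,
        hw_rescue_device="disk",  # present the rescue image as a disk device
        hw_rescue_bus="virtio",   # attached on the virtio bus
    )
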
19:46 <clarkb> #topic Replacing Twitter
19:47 <clarkb> we currently have a twitter account to post our status bot alerts to
19:47 <clarkb> frickler has put this topic up asking if we should consider a switch to fosstodon instead
19:47 <frickler> yes, not urgent for anything, but we should at least prepare imo
19:48 <clarkb> I think that is a reasonable thing to do, but I have no idea what that requires in the status bot code. I assume we'd need a new driver since I doubt mastodon and twitter share an api
19:48 <fungi> is it just a credential/url change for the api in statusbot's integration, or does it need a different api?
19:48 <fungi> yeah, same thing i was wondering
19:48 <ianw> definitely a different api
19:48 <ianw> i had a quick look and not *too* hard
19:49 <fungi> i guess if someone has time to write the necessary bindings for whatever library implements that, i'm okay with it, but i don't use either twitter or mastodon so can't speak to the exodus situation there
19:49 <ianw> i don't feel like we need to abandon ship on twitter, but also adding mastodon seems like very little cost.
19:49 <corvus> there isn't a technical reason to remove twitter support or stop posting there.  (of course, there may be non-technical reasons, but there always have been).  the twitter support itself is a tertiary output (after irc and wiki)
19:50 <clarkb> right we could post to both locations (as long as we can still log in to twitter which has apparently become an issue for 2fa users)
19:50 <fungi> that sums up my position on it as well, yes
19:50 <corvus> and i agree that adding mastodon is desirable if there are folks who would like to receive info there
19:50 <ianw> opendevinfra i think have chosen fosstodon as a host -- it seems reasonable.  it broadly shares our philosophy, only caveat is that it is english only (for moderation purposes)
19:51 <ianw> not that i think we send status in other languages ...
19:51 <clarkb> I would say if you are interested and have the time to do it then go for it :) This is unlikely to be a priority but also something that I don't expect should take long
19:51 <corvus> fosstodon seems like an appropriate place, but note that they currently have a waitlist
19:52 <ianw> yeah, personally i have an account and like it enough to pitch in some $ via the patreon.  it seems like a more sustainable model to pay for things you like
19:53 <clarkb> maybe we can get on the waitlist now so that it is ready when we have the driver written?
19:53 <clarkb> but ya I agree fosstodon seems appropriate
19:53 <corvus> it looks like waitlist processing is not slow
19:53 <corvus> anecdote: https://fosstodon.org/@acmegating waitlisted and approved within a few hours
19:53 <ianw> well i can quickly put in "opendevinfra" as a name, if we like
19:53 <frickler> +1
19:54 <fungi> that matches the twitter account we're using, right?
19:54 <frickler> yes
19:54 <ianw> yep
19:54 <fungi> if so, sgtm
19:54 <ianw> well i'll do that and add it to the usual places, and i think we can make statusbot talk to it pretty quick
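
As a rough idea of the work involved in making statusbot "talk to it": posting to a Mastodon instance is a single authenticated HTTP call to its /api/v1/statuses endpoint. The sketch below is not the eventual statusbot driver; the token handling and example message are placeholders.

    #!/usr/bin/env python3
    # Sketch only: post a status to a Mastodon instance using its REST API.
    # A real statusbot driver would wire this into the existing config and
    # output plumbing; the access token and message here are placeholders.
    import requests

    def post_status(instance_url: str, access_token: str, text: str) -> dict:
        """Post `text` as a new status and return the decoded API response."""
        resp = requests.post(
            f"{instance_url}/api/v1/statuses",
            headers={"Authorization": f"Bearer {access_token}"},
            data={"status": text},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        post_status("https://fosstodon.org", "REPLACE_WITH_ACCESS_TOKEN",
                    "Example status alert text")
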
19:54 <clarkb> sounds good
19:55 <clarkb> #topic Open Discussion
19:55 <clarkb> There were a few more things I was hoping to get to that weren't on the agenda. Let's see if we can cover them really quickly
19:55 <clarkb> I've got a change up to upgrade our Gerrit version
19:55 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/864217 Upgrade Gerrit to 3.5.4
19:56 <clarkb> This change needs its parent too in order to land
19:56 <clarkb> If that lands sometime this week I can be around to restart Gerrit to pick it up at some point
19:56 <clarkb> openstack/ansible-role-zookeeper was renamed to windmill/ansible-role-zookeeper and we've since created a new openstack/ansible-role-zookeeper
19:57 <fungi> yeah, we recently created a new repository for openstack whose name collides with a redirect for an old repository we moved out of openstack (different project entirely, just happens to use the same name). i've got a held node and am going to poke at options
19:57 <clarkb> I didn't expect this to cause problems because we have a bunch of foo/project-config repos but because there was a prior rename we have redirects in place which this runs afoul of
19:57 <clarkb> in addition to fixing this in gitea one way or another we should look into updating our testing to call these problems out
19:58 <clarkb> And finally, nodepool needs newer openstacksdk in order to run under python3.11 (because old sdk uses pythonisms that were deprecated and removed in 3.11). However new openstacksdk previously didn't work with our nodepool and clouds
19:58 <frickler> also there's an interesting git-review patch adding patch set descriptions, which looks useful to me https://review.opendev.org/c/opendev/git-review/+/864098 some concern on whether more sophisticated url mangling might be needed, maybe have a look if you're interested
19:58 <clarkb> corvus has a nodepool test script thing that I'm hoping to try and use to test this without doing a whole nodepool deployment to see if openstacksdk updates have made things better (and if not identify the problems)
19:58 <ianw> heh, well if it can happen it will ... is the only problem really that apache is sending things the wrong way?
19:59 <clarkb> ianw: it's gitea itself redirecting, but ya I think that may be the only problem?
19:59 <fungi> ianw: not apache but gitea
19:59 <clarkb> frickler: interesting, I'm not even sure I know what that feature does in gerrit. I'll have to take a look
19:59 <frickler> and I'd also like to learn other roots' opinions on the Ubuntu FIPS token patch, if I'm in a minority I might be fine with getting outvoted
20:00 <frickler> clarkb: you can see it in the patch, the submitter used their patch version
20:00 <clarkb> frickler: excellent that will help :)
20:00 <frickler> https://review.opendev.org/c/openstack/project-config/+/861457 for fips
20:01 <clarkb> frickler: re FIPS I think that is more a question for openstack
20:01 <fungi> yeah, the main issue with things like gerrit patchset descriptions is that we currently can't add regression tests for newer gerrit features unless we can get our git-review tests able to deploy newer gerrit versions
20:01 <clarkb> I don't think it runs afoul of our expectations from a hosting side
20:01 <corvus> i would have expected a new project creation to invalidate the gitea redirects.  regardless of why it didn't work out, the last time i looked, the redirects were a gitea db entry.  probably can be fixed manually, but if so, then we should remember to record that in our yaml files for repo moves since we have assumed that we should be able to reconstruct the gitea redirect mappings from that data alone.
20:01 <clarkb> corvus: yup ++
20:01 <fungi> corvus: yes, i plan to comment or comment out the relevant entry in opendev/project-config
20:01 <clarkb> frickler: from the hosting side of things this is a big part of why I don't think we should have fips specific images
20:01 <fungi> after i finish playing with the held node to determine options
20:02 <clarkb> frickler: but we already allow jobs to interact with proprietary services (quay is/was for example)
20:02 <clarkb> We are at time now. Feel free to continue discussion in #opendev or on the mailing list. Thank you for your time everyone
20:03 <clarkb> Next week is a big US holiday week but I expect I'll be around through tuesday and probably most of wednesday
20:03 <clarkb> I don't expect to be around much thursday and friday
20:03 <clarkb> #endmeeting
20:03 <opendevmeet> Meeting ended Tue Nov 15 20:03:19 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
20:03 <opendevmeet> Minutes:        https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-15-19.01.html
20:03 <opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-15-19.01.txt
20:03 <opendevmeet> Log:            https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-15-19.01.log.html
20:03 <fungi> same for me
20:03 <fungi> thanks clarkb!
20:04 <ianw> can't believe it's thanksgiving already again!
20:05 <fungi> we do seem to have a lot of those
