clarkb | meeting time | 19:00 |
---|---|---|
fungi | indeed | 19:00 |
ianw | o/ | 19:01 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Nov 22 19:01:51 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-November/000381.html Our Agenda | 19:02 |
clarkb | #topic Announcements | 19:02 |
clarkb | fungi: are individual director nominations still open? | 19:02 |
fungi | yes, until january | 19:03 |
fungi | i think? | 19:03 |
fungi | no, december 16 | 19:03 |
fungi | #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003104.html 2023 Open Infrastructure Foundation Individual Director nominations are open | 19:04 |
clarkb | thanks | 19:04 |
fungi | also cfp is open for the vancouver summit in june | 19:04 |
fungi | #link https://lists.openinfra.dev/pipermail/foundation/2022-November/003105.html The CFP for the OpenInfra Summit 2023 is open | 19:04 |
clarkb | #topic Bastion Host Updates | 19:06 |
clarkb | Let's dive right in | 19:06 |
clarkb | the bastion is doing centrally managed known hosts now. We also have a dedicated venv for launch node now | 19:06 |
clarkb | fungi discovered that this venv isn't working with rax though. It needed older cinderclient, and with that fixed now needs something to address networking stuff with rax. iirc this was the problem that corvus and ianw ran into on old bridge? I think we need older osc maybe? | 19:07 |
clarkb | fungi: that may be why my osc is pinned back on bridge01 fwiw | 19:07 |
fungi | well, the cinderclient version only gets in the way if you want to use launch-node's --volume option in rackspace, but it seems there's also a problem with the default networks behavior | 19:07 |
clarkb | right, I suspect that the reason I pinned back osc but not cinderclient is I was not doing volume stuff but was doing server stuff | 19:08 |
ianw | ... that does sound familiar | 19:08 |
fungi | rackspace doesn't have neutron api support, and for whatever reason osc thinks it should | 19:08 |
fungi | maybe we need to add another option to our clouds.yaml for it? | 19:08 |
clarkb | ianw: I'm pretty sure it's the same issue that corvus ran into and helped you solve when you ran into it. And I suspect it was fixed for me on bridge01 with older osc | 19:08 |
clarkb | fungi: if you can figure that out you win a prize. unfortunately the options for stuff like that aren't super clear | 19:09 |
ianw | (https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2022-10-24.log.html#t2022-10-24T22:56:16 was what i saw when bringing up new bridge) | 19:09 |
fungi | looks like the server create command is hitting a problem in openstack.resource._translate_response() where it complains about the networks metadata | 19:09 |
ianw | "Bad networks format" | 19:10 |
fungi | yeah, that's it | 19:10 |
clarkb | we can solve the venv problem outside the meeting but I wanted to call it out | 19:11 |
fungi | okay, so we think ~root/corvus-venv (on old bridge?) has a working set? | 19:11 |
clarkb | fungi: ya | 19:11 |
fungi | yeah, that's enough for me to dig deeper | 19:12 |
fungi | thanks | 19:12 |
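A minimal sketch of the clouds.yaml idea fungi raises above: openstacksdk ships a built-in `rackspace` vendor profile that pre-sets several of that cloud's quirks, and it may (this is an assumption, not a confirmed fix) also be the right place to sort out the default-networks behaviour launch-node is hitting. The cloud name, credentials and regions below are placeholders:

```yaml
# Hypothetical clouds.yaml entry; "rax" and the auth values are
# placeholders. profile: rackspace pulls in openstacksdk's bundled
# vendor settings for Rackspace. Whether that is enough to avoid the
# "Bad networks format" error is the part that still needs testing.
clouds:
  rax:
    profile: rackspace
    auth:
      username: placeholder-user
      password: placeholder-password
      project_id: "000000"
    regions:
      - DFW
      - ORD
      - IAD
```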
clarkb | are there any other bridge updates or changes to call out? | 19:12 |
ianw | the ansible 6 work is close | 19:12 |
ianw | #link https://review.opendev.org/q/topic:bridge-ansible-update | 19:13 |
ianw | anything with a +1 can be reviewed :) | 19:13 |
clarkb | oh right I pulled up the change there that I haven't reviewed yet but saw it was huge compared to the others and got distracted :) | 19:13 |
ianw | #link https://review.opendev.org/c/opendev/system-config/+/865092 | 19:13 |
ianw | is one for disabling known_hosts now that we have the global one | 19:13 |
clarkb | ianw: I think you can go ahead and approve that one if you like. I mostly didn't approve it just so that we could take a minute to consider if there was a valid use case for not doing it | 19:14 |
ianw | ok, i can't think of one :) i have in my todo to look at making it a global thing ... because i don't think we should have any logging in between machines that we don't explicitly codify in system-config | 19:15 |
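As a rough illustration of what "making it a global thing" could look like (a sketch of one possible approach, not what change 865092 or any planned follow-up actually does), an Ansible-managed ssh client drop-in could ignore per-user known_hosts entirely and trust only the centrally managed file. The drop-in path assumes an OpenSSH new enough to include /etc/ssh/ssh_config.d/*.conf:

```yaml
# Sketch only: rely solely on the centrally managed global known_hosts
# and stop per-user files from accumulating. Paths and option values
# are assumptions, not what system-config currently deploys.
- name: Prefer the global known_hosts over per-user files
  ansible.builtin.copy:
    dest: /etc/ssh/ssh_config.d/99-global-known-hosts.conf
    mode: "0644"
    content: |
      Host *
          UserKnownHostsFile /dev/null
          GlobalKnownHostsFile /etc/ssh/ssh_known_hosts
```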
ianw | #link https://review.opendev.org/q/topic:prod-bastion-group | 19:15 |
ianw | has a couple of changes still if we want to push on the parallel running of jobs. | 19:15 |
ianw | (it doesn't do anything in parallel, but sets up the dependencies so they could, in theory) | 19:15 |
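To make "sets up the dependencies so they could, in theory" concrete, this is roughly what it means in Zuul's project-pipeline syntax: each production job declares only the jobs it genuinely needs, which leaves Zuul free to run the rest concurrently once those finish. The job names here are illustrative stand-ins, not the actual system-config jobs or their real dependency graph:

```yaml
# Hypothetical deploy pipeline snippet. Both service jobs depend only
# on the bootstrap job, so Zuul may run them in parallel once it
# completes; nothing forces them into a single serial chain.
- project:
    deploy:
      jobs:
        - infra-prod-bootstrap
        - infra-prod-service-foo:
            dependencies:
              - infra-prod-bootstrap
        - infra-prod-service-bar:
            dependencies:
              - infra-prod-bootstrap
```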
clarkb | ++ I do want to push on that. But I also want to be able to observe things as they change | 19:16 |
clarkb | probably won't dig into that until next week | 19:16 |
ianw | that is very high touch, and i don't have 100% confidence it wouldn't cause problems, so not a fire and forget | 19:16 |
ianw | ++ | 19:16 |
ianw | other than that, i just want to write up something on the migration process, and i think it's (finally!) done | 19:16 |
clarkb | excellent. Thanks again | 19:17 |
ianw | there's no more points i can think of that we can really automate more | 19:17 |
clarkb | sounds like we can move on to the next item | 19:18 |
clarkb | #topic Upgrading Servers | 19:18 |
clarkb | I was hoping to make progress on another server last week then I got sick and now this is behind a few other efforts :/ | 19:18 |
clarkb | that said one of those efforts is mailman3 and does include a server upgrade. Let's just skip ahead to talk about that | 19:18 |
clarkb | #topic Mailman 3 | 19:18 |
clarkb | changes for mailman3 config management have landed | 19:19 |
clarkb | after fungi picked this back up and addressed some issues with newer mailman and django | 19:19 |
clarkb | I believe the next step is to launch a node, add it to our inventory and get mailman3 running on it. Then we can plan migrations for opendev and zuul? | 19:19 |
clarkb | er I guess they are already planned. We can go ahead and perform them :) | 19:19 |
fungi | yeah, the migration plan is outlined here: | 19:20 |
fungi | #link https://etherpad.opendev.org/p/mm3migration Mailman 3 Migration Plan | 19:20 |
clarkb | anything else to add to this? | 19:22 |
ianw | sounds great, thanks! | 19:22 |
fungi | i'm thinking since we said let's do go/no-go in the meeting today for a monday 2022-11-28 maintenance, i'm leaning toward no-go since i'm currently struggling to get the server created and expect to not be around much on thursday/friday (also a number of people are quite unlikely to see an announcement this week) | 19:23 |
clarkb | fungi: that is reasonable. no objection from me | 19:23 |
ianw | yeah i wouldn't want to take this one on alone :) | 19:23 |
fungi | later next week might work, we could do a different day than friday given clarkb's unavailability | 19:24 |
fungi | also apologies for typos and slow response, i seem to be suffering some mighty lag/packet loss at the moment | 19:24 |
clarkb | yup later that week works for me. I'm just not around that friday | 19:25 |
fungi | anyway, nothing else on this topic | 19:25 |
clarkb | (the 2nd) | 19:25 |
fungi | mainly working to get the prod server booted and added to inventory | 19:25 |
clarkb | cool, let's move on to the next thing | 19:25 |
clarkb | #topic Using pip wheel in python base images | 19:25 |
clarkb | The change to update how our assemble system creates wheels in our base images has landed | 19:26 |
clarkb | At this point I don't expect any problems due to the testing I did with nodepool which covered siblings and all that | 19:26 |
clarkb | But if you do see problems please make note of them so that they can be debugged, but reverting is also fine | 19:26 |
clarkb | mostly just a heads up | 19:26 |
clarkb | #topic Vexxhost instance rescuing | 19:28 |
clarkb | I didn't want this to fall off our minds without some sort of conclusion | 19:28 |
clarkb | For normally booted nodes we can create a custom boot image that uses a different device label for / | 19:29 |
clarkb | we can create this with dib and upload it or make a snapshot of an image we've done surgery on in the cloud. | 19:29 |
clarkb | However, for BFV nodes we would need a special image with special metadata flags and I have no idea what the appropriate flags would be | 19:29 |
clarkb | given that, do we want to change anything about how we operate in vexxhost? previously people had asked about console login | 19:30 |
clarkb | I'm personally hopeful that public clouds would make this work out of the box for users. But that isn't currently the case so I'm open to ideas | 19:31 |
ianw | i guess review is the big one ... | 19:31 |
ianw | i guess console login only helps if it gets to the point of starting tty's ... so not borked initrd etc. | 19:32 |
clarkb | ya its definitely a subset of issues | 19:32 |
clarkb | maybe our effort is best spent trying to help mnaser figure out what that metadata for bfv instances is? | 19:32 |
clarkb | er metadata for bfv rescue images is | 19:33 |
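If the missing piece turns out to be nova's stable-device-rescue image properties (an assumption; confirming this with mnaser/vexxhost is exactly the open question), the rescue image would need metadata along these lines when uploaded to glance:

```yaml
# Candidate glance image properties for a rescue image, based on
# nova's stable-device rescue feature; whether these are what a
# boot-from-volume rescue in vexxhost actually requires is unverified.
properties:
  hw_rescue_device: disk
  hw_rescue_bus: virtio
```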
fungi | yes, being able to mount the broken server's rootfs on another server instance is really the only 100% solution | 19:33 |
ianw | it does feel like there's not much replacement for it | 19:33 |
ianw | i'm trying to think of things in production that have stopped boot | 19:34 |
clarkb | ianw: the lists server is the main one. But that's due to its ancientness | 19:34 |
clarkb | (which we are addressing by replacing it) | 19:34 |
ianw | a disk being down in the large static volume and not mounting was one i can think of | 19:34 |
ianw | i think i had to jump into rescue mode for that once | 19:35 |
fungi | for that case, having a ramdisk image in the bootloader is useful | 19:35 |
fungi | but not for cases where the hypervisor can't fire the bootloader at all (rackspace pv lists.o.o) | 19:36 |
clarkb | ya it definitely seems like rescuing is the most versatile tool. I'll see if I can learn anything more about how to configure ceph for it | 19:37 |
clarkb | in particular jrosser had input and we might be able to try setting metadata like jrosser and see if it works :) | 19:37 |
clarkb | and if we can figure it out then other vexxhost users will potentially benefit too | 19:38 |
clarkb | so worthwhile I think | 19:38 |
clarkb | #topic Quo vadis Storyboard | 19:39 |
clarkb | It's been about two weeks since I sent out another email asking for more input | 19:39 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-October/000370.html | 19:39 |
clarkb | we've not had any responses since my last email. I think that means we should probably consider feedback gathered | 19:39 |
clarkb | At this point I think our next step is to work out amongst ourselves (but probably still on the mailing list?) what we think opendev's next steps should be | 19:40 |
clarkb | to summarize the feedback from our users, it seems there aren't any specifically interested in using storyboard or helping to maintain/develop it. | 19:41 |
clarkb | There are users interested in migrating from storyboard to launchpad | 19:41 |
ianw | the options are really to upgrade it to something maintainable or somehow gracefully shut it down? | 19:41 |
frickler | do we know whether there is any other storyboard deployment beside ours? | 19:42 |
clarkb | ianw: yes I think ultimately where that puts us is we need to decide if we are willing to maintain it ourselves (including necessary dev work). This would require a shift in our priorities considering it hasn't been updated yet | 19:42 |
clarkb | ianw: and if we don't want to do that figure out if migrating to something else is feasible | 19:42 |
clarkb | frickler: there were once upon a time but I'm not sure today | 19:42 |
fungi | as always people can fork or take up maintenance later if it's something they're using, regardless of whether we're maintaining it | 19:44 |
clarkb | right I'm not too worried about others unless they are willing to help us. And so far no one has indicated that is something they can or want to do | 19:45 |
clarkb | I don't think we should make any decisions in the meeting. But I wanted to summarize where we've ended up and what our two likely paths forward appear to be | 19:45 |
ianw | it feels like if we're going to expend a fair bit of effort, it might be worth thinking about putting that into figuring out something like using gitea issues | 19:45 |
clarkb | ianw: my main concern with that is I don't think we can effectively disable repo usage and allow issues | 19:45 |
clarkb | its definitely something that can be investigated though | 19:46 |
fungi | right now there are a lot of features in gitea which we avoid supporting because we aren't allowing people to have gitea accounts | 19:46 |
clarkb | and a number of features we explicitly don't want | 19:46 |
fungi | as soon as we do allow gitea accounts, we need to more carefully go through the features and work out which ones can be turned off or which we're stuck supporting because people will end up using them | 19:47 |
clarkb | But we've got really good CI for gitea so testing setups shouldn't be difficult if people want to look into that. | 19:47 |
fungi | also not allowing authenticated activities has shielded us from a number of security vulnerabilities in gitea | 19:47 |
clarkb | ++ | 19:47 |
clarkb | anyway I do think we should continue the discussion on the mailing list. I can write a followup indicating that we have options like maintain it ourselves and do the work, migrate to something else (lp and/or gitea) and try to outline the effort required for each? | 19:48 |
clarkb | Do we want that in a separate thread? Or should I keep rolling forward with the existing one | 19:48 |
ianw | i do agree, but it feels like useful time spent investigating that path | 19:49 |
ianw | it probably also loops back to our authentication options, something that would also be useful to work on i guess | 19:52 |
clarkb | ya at the end of the day everything ends up intertwined somehow and we're doing our best to try and prioritize :) | 19:53 |
clarkb | (fwiw I think the recent priorities around gerrit and zuul and strong CI have been the right choice. They have enabled us to move more quickly on other things as necessary) | 19:53 |
clarkb | sounds like no strong opinion on where exactly I send the email. I'll take a look at the existing thread and decide if a follow up makes sense or not | 19:54 |
clarkb | but I'll try to get that out later today or tomorrow and we can pick up discussion there | 19:54 |
fungi | agreed, i would want to find time to put into the sso spec if we went that route | 19:54 |
clarkb | #topic Open Discussion | 19:56 |
clarkb | Anything else before we run out of time? | 19:56 |
frickler | just thx to ianw for the mastodon setup | 19:57 |
fungi | reminder that i don't expect to be around much thursday through the weekend | 19:57 |
clarkb | fungi: me too | 19:57 |
frickler | that went much easier than I had expected | 19:57 |
clarkb | my kids actually have wednesday - monday out of school | 19:57 |
clarkb | I wish I realized that sooner | 19:57 |
fungi | well, if you're not around tomorrow i won't tell anyone ;) | 19:57 |
ianw | ++ enjoy the turkeys | 19:58 |
clarkb | and we are at time. Thank you for your time everyone. We'll be back next week | 19:59 |
fungi | thanks clarkb! | 19:59 |
clarkb | #endmeeting | 19:59 |
opendevmeet | Meeting ended Tue Nov 22 19:59:54 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:59 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.html | 19:59 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.txt | 19:59 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2022/infra.2022-11-22-19.01.log.html | 19:59 |