Tuesday, 2022-08-16

clarkbAnyone else here for the meeting?19:00
clarkbI think we'll be short fungi. It will likely be a quick one19:00
fricklero/19:00
fricklerdid I miss some note about fungi?19:00
ianwo/19:01
clarkbhe is attempting to have some vacation time19:01
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Aug 16 19:01:14 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkbBut if you watch #opendev you might not be able to tell the difference19:01
clarkb#link https://lists.opendev.org/pipermail/service-discuss/2022-August/000352.html Our Agenda19:01
clarkb#topic Announcements19:01
clarkbOpenDev service coordinator nominations end today.19:01
clarkbI haven't seen any nominations yet. I'm taking that as a sign that everyone thinks i should do it again19:02
clarkbI guess if no one else says they are interested I'll go ahead and make my own nomination official later this afternoon19:03
clarkb#topic Topics19:04
clarkb#topic Improving Grafana Management Tooling19:05
clarkbI was going to call out https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/851955 as the last remaining change to close this out, but looks like it merged sometime between when I sent the agenda and our meeting starting19:05
clarkbianw: is there anything else to call out on this subject or can we consider this completed?19:05
ianwI think we can call it done, thanks19:06
clarkbgreat, thank you for putting that together19:07
clarkb#topic Bastion Host Updates19:07
clarkbThis got side tracked by venv installation management iirc. Any other updates on this one?19:08
ianwno, sorry, i've been sidetracked away from this, but it's 2nd on my todo list19:09
clarkbNo problem. I think those venv updates will be generally useful (as already indicated by the borg work)19:09
clarkb#topic Upgrading Bionic servers to Focal/Jammy19:09
ianwbut yeah, i want to get things more isolated before we tackle the upgrade19:09
clarkbthats a good lead into this subject. Things all end up related to each other.19:09
clarkb#link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes on the work that needs to be done.19:09
clarkbIn general though we're laying groundwork to make it possible to update to Jammy as well as make certain servers like bridge cleaner to update19:10
clarkbDid anyone else have questions/concerns/plans for upgrades that they wanted to discuss?19:12
ianwnot really for me, i guess bridge is the first one I'd like to get into19:12
clarkbonward then19:13
clarkb#topic Mailman 319:13
clarkb#link https://review.opendev.org/c/opendev/system-config/+/851248 WIP change to deploy a mailman 3 instance19:13
clarkbMuch progress has been made with mailman 3 since our last meeting19:14
clarkbIn particular I believe that exim is properly configured to forward mail to mailman and mailman and exim know how to send mail outbound19:14
clarkbA permissions issue with xapian write locations was addressed allowing hyperkitty to spin up successfully and list (empty) archives19:15
clarkbI've also updated the testing to force mailman's hourly cron jobs to run during CI which makes things like hyperkitty bootstrap properly19:15
clarkbThere are now two major things to work on. The first is testing the migrations of our existing lists from mailman 2 to mailman 3. My plan is to work with fungi on that using a held CI node19:16
clarkbWe want to make sure that both the data (subscribers and archives) migrate cleanly as well as the configuration (handling dmarc, publicly accessible vs not, and so on)19:16
clarkbThe other big item is dealing with uid:gid mappings between the host and the containers. Currently I'm punting on this because there doesn't seem to be a good answer for this. But the tl;dr is that mailman is uid 100 in both mailman-web and mailman-core and gid 100 in one of those and 65535 in the other19:17
clarkbthat makes mapping it onto users on the host a bit painful, particularly since they both differ with gids.19:18
clarkbMy current thinking on that is we can take the upstream images, inherit from them and change the uid:gid pair on both images to be consistent and also map them to something like 10500:10500 well out of the way of anything else on the system19:18
clarkbThat does mean we'll be building our own images though which adds some overhead, but nothing we haven't done for other pieces of software.19:19
clarkbIf others have better ideas or suggestions I am all ears :)19:19
clarkbOh and finally there is a currently held node that you should feel free to poke at.19:19
ianwhrm, is the uid/gid thing a bug?19:20
clarkbit might be. There are actually a couple of other things I've modified locally that are probably worth filing against upstream. Worst case they tell us that they won't fix it19:21
clarkbWhy don't I work on filing those issues today before I take any drastic steps like building our own images based on theirs19:21
ianwsounds sane, ... maybe just nobody has tried running them for real on the same host or something?19:22
fricklerat least finding out if there is some hidden reasoning behind those choices would also have been my idea19:22
clarkbya or they use docker volumes exclusively and don't think about the mapping19:22
clarkbI'll work on filing those today. The other issues are ALLOWED_HOSTS is basically useless as it doesn't get interpolated into the django config properly and the django config hardcodes assumed docker hostnames19:23
ianwwouldn't you still have the same issue if the volume was shared?19:23
clarkbianw: I think the uid overlap means it isn't an issue19:24
clarkbdefinitely not very clean and still not good to have uid 100 in a volume mapping to _apt on the host19:24
clarkbone issue is if they change things on their side they may break anyone using the images :/19:24
clarkbnot likely to be easy to fix, but making people aware of it and double checking we aren't missing something makes it worthwhile19:24
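[Editor's sketch] The collision described above (a container uid landing on an existing host account) can be checked by scanning passwd-format text for the uid in question. This is only an illustration: the function name owner_of_uid and the SAMPLE_PASSWD entries are hypothetical, not taken from any OpenDev host.

```python
from typing import Optional

# Hypothetical passwd-format sample; real hosts will differ.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
_apt:x:100:65534::/nonexistent:/usr/sbin/nologin
"""


def owner_of_uid(passwd_text: str, uid: int) -> Optional[str]:
    """Return the account name that owns uid in passwd-format text, or None."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd fields: name:password:uid:gid:gecos:home:shell
        if len(fields) >= 3 and fields[2].isdigit() and int(fields[2]) == uid:
            return fields[0]
    return None


if __name__ == "__main__":
    print(owner_of_uid(SAMPLE_PASSWD, 100))    # container uid discussed above
    print(owner_of_uid(SAMPLE_PASSWD, 10500))  # proposed out-of-the-way uid
```

On a real host the text would come from /etc/passwd; a None result for the proposed id suggests it is not already claimed by a local account.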
clarkbAnything else related to mailman 3? I'm happy to answer questions if people have them19:26
ianwnot for me, but thanks for working on it!19:27
clarkb#topic Gitea 1.1719:27
clarkb#link https://review.opendev.org/c/opendev/system-config/+/84720419:28
clarkbReviews on that change would be appreciated, but I do think waiting for 1.17.1 to happen before we upgrade is worthwhile19:28
clarkb1.17.1's milestone has quite a few bugs that seem like good things to have fixed before we upgrade19:28
clarkb#link https://github.com/go-gitea/gitea/milestone/12219:28
clarkbI do think a lot of them don't affect us because we are disabling the new package repo system they added19:29
clarkbBut we don't have an immediate need to update and this release seems like it could use some improving. I'm happy to wait :)19:30
ianw++19:30
clarkbThere are a lot of breaking changes in the release notes that I've noted in the commit message and tried to explain how they affect us if at all19:31
clarkbThe early review is worthwhile simply to get through all of that19:31
clarkb#topic Open Discussion19:33
clarkbThat was everything on the agenda19:33
clarkbAnything else19:33
ianwnot for me, i've been a bit side-tracked into some zuul regression testing for the console streaming bits after i broke that with the zuul restart on the weekend19:35
clarkbTripleo has been having ansible ssh failures in their jobs. From what I can tell the ssh connections are from 127.0.0.1 to 127.0.0.2 as the zuul user. Journald seems to record the ssh connection succeeds but then claims the remote closes the connection and ansible says ""Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote19:35
clarkbhost\r\nConnection closed by 127.0.0.2 port 22", "unreachable": true"19:35
clarkbI know that ansible will report ssh errors when it cannot do its remote node bootstrapping due to full disks and similar problems19:36
clarkbI suspect this is some sort of ansible specific problem and that ssh itself is functioning, but that is still mostly a hunch at this point19:37
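[Editor's sketch] One way to test the hunch that sshd itself is answering is to open a plain TCP connection and read the identification line the server sends first. An empty result would mean the peer closed before identifying itself, which is what the kex_exchange_identification message reports. The helper names below are hypothetical, and this is a rough debugging aid under those assumptions rather than the approach used in the jobs.

```python
import socket


def first_line(data: bytes) -> str:
    """Return the first line of data without its line ending ('' if empty)."""
    return data.split(b"\n", 1)[0].rstrip(b"\r").decode("ascii", errors="replace")


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Connect to host:port and return the first line the server sends."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        data = b""
        # Read until a newline arrives, the peer closes, or 255 bytes are seen.
        while b"\n" not in data and len(data) < 255:
            chunk = sock.recv(255 - len(data))
            if not chunk:
                break
            data += chunk
    return first_line(data)
```

Calling read_ssh_banner("127.0.0.2") on an affected node would be the natural use; a line starting with "SSH-2.0-" indicates the daemon got as far as identifying itself.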
ianwwell i wasted about ... too many hours ... with a similar unreachable host19:38
clarkbwas that the fstrings thing?19:39
ianw"ssh-keyscan localhost -p 2022" does *not* scan the ssh running on port 2022.  "ssh-keyscan -p 2022 localhost" does.  if you have a ssh on port 22, the first will give you the keys for that19:39
clarkboh fun19:39
ianwand ansible will give you very little clue about host key mismatches19:40
clarkbright ansible's error reporting here is I think what is hampering further debugging. I've suggested they maybe increase verbosity to help sort it out19:40
ianwyep then you can see the ssh calls19:40
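[Editor's sketch] The ssh-keyscan argument-order behaviour ianw describes is consistent with an option parser that stops at the first non-option argument. That explanation is an assumption here, not something confirmed in the meeting. Python's getopt.getopt parses this way, so it can illustrate the effect; parse_keyscan_style is a hypothetical stand-in, not ssh-keyscan's real parser.

```python
import getopt
from typing import List, Tuple


def parse_keyscan_style(argv: List[str]) -> Tuple[int, List[str]]:
    """Parse '-p PORT' then hosts, stopping option parsing at the first host."""
    # getopt.getopt treats everything after the first non-option as operands.
    opts, hosts = getopt.getopt(argv, "p:")
    port = 22  # default port when -p is not seen before the first host
    for flag, value in opts:
        if flag == "-p":
            port = int(value)
    return port, hosts


if __name__ == "__main__":
    print(parse_keyscan_style(["localhost", "-p", "2022"]))
    print(parse_keyscan_style(["-p", "2022", "localhost"]))
```

In the first call the port stays at 22 and "-p" and "2022" end up in the host list; in the second the port is 2022 and only "localhost" is a host.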
clarkbSounds like that may be everything?19:43
ianwnothing more from me19:44
clarkbThank you everyone! We'll be back here next week at the same time and location19:45
clarkb#endmeeting19:45
opendevmeetMeeting ended Tue Aug 16 19:45:04 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:45
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-16-19.01.html19:45
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-16-19.01.txt19:45
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-16-19.01.log.html19:45
fricklerthx clarkb 19:45
