clarkb | Our weekly meeting will begin momentarily | 18:59 |
---|---|---|
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Aug 30 19:01:36 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/pipermail/service-discuss/2022-August/000356.html Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
ianw | o/ | 19:02 |
clarkb | OpenStack's feature freeze begins this week. | 19:02 |
clarkb | Good time to be on the lookout for any code review and CI issues that the stress of feature freeze often brings (though more recently the added load hasn't been too bad) | 19:02 |
clarkb | Also sounds like starlingx is trying to prepare and ship a release | 19:03 |
clarkb | We should avoid landing risky changes to the infrastructure as well | 19:03 |
clarkb | use your judgement etc (I don't think we need to stop the weekly zuul upgrades for example as those have been fairly stable) | 19:03 |
clarkb | #topic Bastion Host Updates | 19:04 |
clarkb | I don't have anything new to add to this. Did anyone else? I think we ended up taking a step back on the zuul console log stuff to reevaluate things | 19:05 |
fungi | i didn't | 19:05 |
ianw | i didn't, still working on the zuul console stuff too in light of last week; we can do the manual cleanup | 19:07 |
clarkb | ok I can pull up the find command I ran previously and rerun it | 19:08 |
clarkb | maybe with a shorter timeframe to keep. I think I used a month last time. | 19:08 |
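For reference, the cleanup being discussed would look something like the sketch below. This is not the exact command clarkb ran; the path and filename pattern are assumptions based on where zuul's console streamer typically leaves its files.

```shell
# A minimal sketch, assuming the leftover zuul console logs live in /tmp
# on the bastion as console-*.log (path and pattern are assumptions).
# List files older than a week first:
find /tmp -maxdepth 1 -name 'console-*.log' -mtime +7 -print
# ...then rerun with -delete once the listing looks correct:
find /tmp -maxdepth 1 -name 'console-*.log' -mtime +7 -delete
```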
clarkb | #topic Upgrading Bionic Servers | 19:08 |
clarkb | I've been distracted by gitea (something that needs upgrades actually) and the mailman3 stuff. | 19:08 |
clarkb | Anyone else look at upgrades yet? | 19:09 |
clarkb | Sounds like no. That's fine, we'll pick this up in the future. | 19:10 |
clarkb | #topic Mailman 3 | 19:10 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/851248 Add a mailman 3 server | 19:10 |
clarkb | This change continues to converge closer and closer towards something that is deployable. And really at this point we might want to think about deploying a server? | 19:11 |
clarkb | In particular fungi tested the migration of opendev lists and that seemed to go well | 19:11 |
clarkb | configuration expectations made it across the migration | 19:11 |
fungi | yep, seemed to keep the settings we want | 19:11 |
clarkb | #link https://etherpad.opendev.org/p/mm3migration Server and list migration notes | 19:11 |
clarkb | I did add some notes to the bottom of that etherpad for additional things to check. One of them I've updated the change for already. | 19:12 |
fungi | those are the exact commands i'm running, so we can turn it into a migration script as things get closer | 19:12 |
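The commands themselves live in the etherpad rather than this log, but the usual mm2-to-mm3 import sequence is roughly the sketch below. List names and paths are hypothetical, and in a containerized deployment these would be run inside the mailman-core and mailman-web environments respectively.

```shell
# A minimal sketch of a typical mm2 -> mm3 list migration; not necessarily
# the exact commands from the etherpad. Names and paths are hypothetical.
# Import the old list's configuration from its mm2 pickle:
mailman import21 service-discuss@lists.opendev.org /srv/mm2/lists/service-discuss/config.pck
# Import the pipermail mbox archive into hyperkitty:
django-admin hyperkitty_import -l service-discuss@lists.opendev.org \
    /srv/mm2/archives/private/service-discuss.mbox/service-discuss.mbox
# Rebuild the xapian search index for the list afterwards:
django-admin update_index_one_list service-discuss@lists.opendev.org
```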
clarkb | I think testing dmarc if possible is a good next step. Unfortunately I'm not super sure about how we should test that | 19:12 |
ianw | that sounds hard without making dns entries? | 19:13 |
clarkb | I guess we'd want to know if our existing dmarc valid signature preserving behavior in mm2 is preserved and if not if the mm3 dmarc options are good? | 19:13 |
fungi | it may be easier to test that once we've migrated lists.opendev.org lists to a new prod server | 19:13 |
clarkb | ya I think we can test the "pass through" behavior of service-discuss@lists.opendev.org using the test server if we can send signed email to the test server. But doing that without dns is likely painful | 19:13 |
fungi | like, consider dmarc related adjustments part of the adjustment period for the opendev site migration | 19:13 |
clarkb | that makes sense as we'd have dns all sorted for that | 19:14 |
fungi | before we do the other sites | 19:14 |
fungi | alternative option would be to add an unused fqdn to a mm3 site on the held node and set up dns and certs, et cetera | 19:15 |
clarkb | that seems like overkill | 19:15 |
fungi | yes, i'm reluctant to create even more work | 19:15 |
clarkb | I think worst case we'll just end up using new different config for dmarc handling | 19:15 |
clarkb | and if we sort that out on the opendev lists before we migrate the others that is likely fine | 19:15 |
fungi | and at least mm3 has a greater variety of options for dmarc mitigation | 19:15 |
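Those mitigation options are per-list settings in Mailman 3, so one way to inspect what a held node is actually configured to do is via the core's interactive shell. A rough sketch, with the list name hypothetical:

```shell
# A minimal sketch, assuming direct access to the mailman core CLI on the
# held node. `mailman shell -l` binds the named list to `m`.
mailman shell -l service-discuss@lists.opendev.org
# Inside the interactive shell:
#   print(m.dmarc_mitigate_action)           # no_mitigation, munge_from,
#                                            # wrap_message, reject, or discard
#   print(m.dmarc_mitigate_unconditionally)  # mitigate all posts, or only
#                                            # those from domains publishing
#                                            # a strict DMARC policy
```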
clarkb | The other thing I had in mind was testing migration of openstack-discuss as there are mm3 issues/posts about people hitting timeouts and similar errors with large list migrations | 19:17 |
clarkb | Maybe we should do that as a last sanity check of mm3 as deployed by this change then if that is happy clean the change up to make it mergeable? | 19:17 |
fungi | should be pretty easy to run through, happy to give that a shot this evening | 19:17 |
clarkb | great. Mostly I think if we are going to have any problems with migrating it will be with that list so that gives good confidence in the process | 19:17 |
fungi | rsync will take a while | 19:18 |
clarkb | fungi: one thing to take note of is disk consumption needs for the hyperkitty xapian indexes and the old pipermail archives and the new storage for mm3 emails | 19:18 |
clarkb | so that when we boot a new server we can estimate likely disk usage needs and size it properly | 19:18 |
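A rough way to capture those numbers on the held node is a few `du` runs over the relevant trees. The paths below are assumptions based on common mailman layouts, not necessarily what the change deploys:

```shell
# A minimal sketch for recording disk consumption on the held node;
# all paths are assumptions based on common mailman layouts.
du -sh /var/lib/mailman/archives                    # old mm2 pipermail archives
du -sh /var/lib/mailman3                            # mm3 message store and core data
du -sh /var/lib/mailman3/web-data/fulltext_index    # hyperkitty xapian index
```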
* clarkb | adds a note to the etherpad | 19:18 |
fungi | also we should spend some time thinking about the actual migration logistics steps (beyond just the commands), like when do we shut things down, when do we adjust dns records, how to make sure incoming messages are deferred | 19:18 |
fungi | i have a basic idea of the sequence in my head, i'll put a section for it in that pad | 19:19 |
clarkb | ++ thanks | 19:20 |
clarkb | This is the sort of thing we should be able to safely do for say opendev and zuul in the near future then do openstack and starlingx once their releases are out the door | 19:20 |
fungi | there will necessarily be some unavoidable downtime, but we can at least avoid bouncing or misdelivering during the maintenance | 19:20 |
fungi | also there are still some todo comments in that change | 19:21 |
clarkb | I just deleted a few of them with my last update | 19:21 |
fungi | for one, still need to work out the apache rewrite syntax for preserving the old pipermail archive copies | 19:21 |
fungi | oh! i haven't looked at that last revision yet | 19:22 |
clarkb | I didn't address that particular one | 19:22 |
clarkb | I can try to take a look later today at that though | 19:22 |
fungi | i can probably also work it out now that we have an idea of what it would look like | 19:22 |
clarkb | and clean up any other todos https://review.opendev.org/c/opendev/system-config/+/851248/65/playbooks/service-lists3.yaml had high level ones we can clean up now too | 19:22 |
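The pipermail-preservation piece being discussed would be on the order of the Apache snippet below. The filesystem path and URL prefix are assumptions for illustration; the final change may use rewrite rules or a different layout entirely.

```apache
# A minimal sketch of serving the old static pipermail archives alongside
# mm3/hyperkitty; the archive path and URL prefix are assumptions.
Alias /pipermail /var/lib/mailman/archives/public
<Directory /var/lib/mailman/archives/public>
    Options Indexes FollowSymLinks
    Require all granted
</Directory>
```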
corvus | i think moving zuul over right after opendev would be great | 19:23 |
clarkb | fungi: if you poke at the migration for service-discuss I can focus on cleaning up the change to make it as mergeable as possible at this point | 19:23 |
clarkb | corvus: good to hear. I suspected you would be interested :) | 19:23 |
clarkb | So ya long story short I think we need to double check a few things and clean the change up to make it landable but we are fast approaching the point where we actually want to deploy a new mailman3 server | 19:24 |
clarkb | exciting | 19:24 |
clarkb | I can also clean the change up to not deploy zuul openstack starlingx etc lists for now. Just have it deploy opendev to start since that will mimic our migration path | 19:25 |
clarkb | Anything else mm3 related? | 19:25 |
clarkb | #topic Gerrit load issues | 19:26 |
clarkb | This is mostly still on here as a sanity check. I don't think we have seen this issue persist? | 19:27 |
clarkb | Additionally the http thread limit increase doesn't seem to have caused any negative effects | 19:27 |
fungi | i haven't seen any new issues | 19:28 |
clarkb | There were other changes I had in mind (like bumping ssh threads and http threads together to keep http above the ssh+http git limit), but considering we seem stable here I think we leave it be unless we observe issues again | 19:28 |
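For reference, the knobs in question live in Gerrit's gerrit.config. The values below are placeholders to illustrate the relationship being discussed, not opendev's production settings:

```
# A minimal sketch of the relevant gerrit.config sections; the numbers are
# placeholders. The idea discussed is to raise sshd.threads and
# httpd.maxThreads together so http capacity stays above the combined
# ssh+http git request load.
[httpd]
        maxThreads = 100
[sshd]
        threads = 100
```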
clarkb | #topic Jaeger Tracing Server | 19:30 |
clarkb | corvus: I haven't seen any changes for this yet. No rush but wanted to make sure I wasn't missing anything | 19:31 |
clarkb | Mostly just a check in to make sure you didn't need reviews or other input | 19:32 |
corvus | clarkb: nope not yet -- i'm actually working on the zuul tutorial/example version of that today, so expect the opendev change to follow. | 19:33 |
clarkb | sounds good | 19:34 |
clarkb | #topic Fedora 36 | 19:34 |
clarkb | ianw: can you fill us in on the plans here? In particular one of the potentially dangerous things for the openstack release is updating the fedora version under them as they try to release | 19:35 |
ianw | just trying to move ahead with this so we don't get too far behind | 19:36 |
ianw | i'm not sure it's in the release path for too much -- we certainly say that "fedora-latest" may change to be the latest as we get to it | 19:36 |
ianw | one sticking point is the openshift related jobs, used by nodepool | 19:37 |
ianw | unfortunately it seems due to go changes, the openshift client is broken on fedora 36 | 19:38 |
ianw | tangentially, this is also using centos-7 and a SIG repo to deploy openshift-3 for the other side of the testing. i spent about a day trying to figure out a way to migrate off that | 19:38 |
clarkb | ianw: the fedora 36 image is up and running now though, so we can discover these problems at least? The next steps are flipping the default nodeset labels? | 19:39 |
ianw | #link https://review.opendev.org/c/zuul/zuul-jobs/+/854047 | 19:39 |
ianw | has details about why that doesn't work | 19:39 |
ianw | clarkb: yep, nodes are working. i've made the changes to nodesets dependent on the zuul-jobs updates (so we know it at least works there), so have to merge them first | 19:41 |
ianw | i think they are now fully reviewed, thanks | 19:41 |
ianw | (the zuul-jobs changes) | 19:41 |
clarkb | ok so mostly a matter of updating base testing and fixing issues that come up | 19:42 |
clarkb | Anything that needs specific attention? | 19:43 |
ianw | i don't think so | 19:44 |
ianw | thanks, unless somebody finds trying to get openshift things running exciting | 19:45 |
clarkb | I've run away from that one before :) | 19:46 |
clarkb | It might not be a terrible idea to start a thread on the zuul discuss list about whether or not the openshift CI is viable | 19:46 |
fungi | or at least "practical" | 19:46 |
clarkb | and instead start treating it like one of the other nodepool providers that doesn't get an actual deployment | 19:46 |
clarkb | not ideal but would reduce some of the headache for maintaining it | 19:47 |
fungi | related, there was a suggestion about using larger instances to test it | 19:48 |
fungi | there's this change which has been pending for a while: | 19:48 |
fungi | #link https://review.opendev.org/844116 Add 16 vcpu flavors | 19:49 |
fungi | they also come with more ram, as a side effect | 19:49 |
fungi | we could do something similar but combine that with the nested-virt labels we use in some providers to get a larger node with nested virt acceleration | 19:49 |
ianw | we could ... | 19:50 |
ianw | but it just feels weird that you can't even start the thing with less than 9.6gb of ram | 19:50 |
fungi | i concur | 19:51 |
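For reference, the label-plus-flavor combination fungi described a few lines up would look roughly like this in nodepool's configuration. All of the label, provider, flavor, and image names are hypothetical:

```yaml
# A minimal sketch of a larger nested-virt nodepool label; every name here
# is hypothetical, not an actual opendev configuration.
labels:
  - name: ubuntu-jammy-16vcpu-nested
providers:
  - name: example-provider
    pools:
      - name: main
        labels:
          - name: ubuntu-jammy-16vcpu-nested
            flavor-name: nested-virt-16vcpu
            diskimage: ubuntu-jammy
```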
corvus | it would be interesting to know if there are any users of the openshift drivers (vs the k8s driver, given the overlap in functionality). | 19:51 |
ianw | ++ ... that was my other thought that this may not even be worth testing like this | 19:51 |
corvus | (i am aware of people who use the k8s nodepool driver with openshift) | 19:52 |
ianw | right, but something like minikube might be a better way to do this testing? | 19:52 |
clarkb | I think tristanC may use them. But agreed starting a thread on the zuul mailing list to figure this out is probably a good idea | 19:52 |
ianw | yeah, perhaps that is the best place to start, i can send something out | 19:53 |
clarkb | #topic Open Discussion | 19:53 |
clarkb | We are nearing the end of our hour. Anything else before time is up? | 19:54 |
fungi | i got nothin' | 19:54 |
clarkb | The gitea upgrade appears to have gone smoothly | 19:54 |
clarkb | I wonder if anyone even noticed :) | 19:55 |
clarkb | Monday is technically a holiday here. I'll probably be around but less so (I think I've got a BBQ to go to) | 19:55 |
fungi | oh, good point. i should try to not be around as much on monday | 19:56 |
ianw | time to put your white shoes away | 19:56 |
* fungi puts on his red shoes and dances the blues | 19:57 | |
clarkb | Sounds like that is everything. Thank you everyone | 19:58 |
clarkb | We'll see you back here next week | 19:58 |
clarkb | #endmeeting | 19:58 |
opendevmeet | Meeting ended Tue Aug 30 19:58:24 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:58 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-30-19.01.html | 19:58 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-30-19.01.txt | 19:58 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-30-19.01.log.html | 19:58 |
corvus | thanks clarkb ! | 19:58 |