Tuesday, 2023-01-24

clarkbmeeting time19:00
fungiahoy!19:01
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Jan 24 19:01:46 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/SAMQHW6WCCF4LKQ2IADJ4VJGZZENI72D/ Our Agenda19:01
clarkb#topic Announcements19:01
clarkbI sent email last week and made the Service Coordinator nomination period that begins on January 31, 2023 official19:02
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/32BIEDDOWDUITX26NSNUSUB6GJYFHWWP/19:02
clarkbI'll send a reminder email on the 31st that things have opened up too19:03
clarkb#topic Bastion Host Updates19:03
clarkbI suspect there hasn't been much change here after the week we had last week...19:04
clarkbBut on the todo list we need to shutdown the old bridge and clean it up when we are satisfied doing so is fine19:04
clarkb#link https://review.opendev.org/q/topic:bridge-backups19:04
clarkbWe need to review ^ that stack of changes (I need to personally do it, but anything around backups and encryption demands time and attention and I haven't had a ton of that recently)19:05
ianwoh right, yes everyone has signed off on that19:05
clarkband then finally once we've dealt with those items we can start looking at parallel infra-prod jobs again:19:05
clarkb#link https://review.opendev.org/q/topic:prod-bastion-group Remaining changes are part of parallel ansible runs on bridge19:05
ianw(i've shutdown the old bridge.  it's already in emergency.  i'll work on inventory changes, etc.)19:06
clarkbthanks!19:06
clarkbanything else to add to this topic?19:06
ianwnot today!19:06
clarkb#topic Mailman 319:07
clarkbfungi: I know we've all been underwater with various security things, you more than others. Did you manage to make any progress on the outstanding mailman3 items?19:07
clarkbFor those following along these are the major todo items that I recall:19:07
clarkbWe need a service restart to set the site_owner config19:07
fungii've started catching back up on this, i added some initial notes to the bottom of https://etherpad.opendev.org/p/mm3migration in the todo section19:08
fungioh, right, i should add the restart19:08
clarkbWe need to figure out domain vhosting and likely change domain configuration in the mm3 django install to do this19:08
clarkband we need to fix the root email alias on the server19:08
fungii think all that's captured in the pad now19:09
clarkb#link https://etherpad.opendev.org/p/mm3migration live todo list for mailman3 work.19:09
clarkbexcellent. Anything else to mention on this topic?19:09
funginope, my focus will be the restart (should be able to do that after the meeting) and troubleshooting the job failures on 86798719:10
clarkbsounds good, thanks. Again let me know if I can help19:10
fungisure thing19:10
clarkb#topic Gerrit Updates19:10
clarkbThis has sort of morphed out of the Gerrit 3.6 post upgrade task tracking into a bigger set of items19:11
clarkb#link https://review.opendev.org/c/opendev/system-config/+/870114 Add Gerrit 3.6 -> 3.7 Upgrade test job19:11
clarkbthis change is a post upgrade item. ianw I responded to your comment there and basically indicated I feel like punting on that for now is ok/desirable since we aren't trying to fully automate the gerrit upgrade in production (yet)19:11
clarkb#link https://review.opendev.org/c/opendev/system-config/+/870874 Convert Gerrit to run on our base python image19:12
clarkbThis one looks super straightforward, but is actually fairly involved. It turns out that the old openjdk images on dockerhub aren't really something we should use going forward.19:12
clarkbI've elected to address that by switching gerrit over to our base python image and installing java from debian repos. The reason for this is it will allow us to update python on those images to 3.10 or 3.11 for jeepyb in a straightforward manner19:13
clarkbDebian bullseye includes java 11 (what we currently run on) and java 17 (what we'll eventually move to) which is nice too19:13
clarkb#link https://review.opendev.org/c/opendev/system-config/+/870877 Run Gerrit under Java 1719:13
clarkbthis change is a followup to the previous change that switches us from java 11 to 17 as gerrit 3.6 release notes say 17 is fully supported as of 3.6. That said, I have had to add a workaround for a bug in running gerrit under java 17 (the bug is linked to in that change)19:14
clarkbI should write the gerrit mailing list today asking them about that because "fully supported" and "use this workaround for the jvm" seem to be in conflict with one another19:14
clarkbAnd finally ianw has a change to convert us away from deprecated copy conditions in 3.6 (this needs to be done before we upgrade to 3.7 along with other things like conversion to submit-requirements)19:15
clarkb#link https://review.opendev.org/c/openstack/project-config/+/867931 Cleaning up deprecated copy conditions in project ACLs19:15
clarkbI need to review ^ that one and will try to do that today. Then we can land that when we're happy with the state of acls generally (which I think we may already be)19:16
clarkbAny other Gerrit updates?19:16
ianwyeah i think we're fine on that -- to just log what happened i re-loaded the acl's per19:16
ianw#link https://etherpad.opendev.org/p/760YNeM5OEFS1hlr7bE519:16
ianwsomeone else should probably double check the logs but the only "errors" were for projects that were retired and in R/O mode19:17
ianwi saw some discussion on that in #opendev and wondering if we should still make the change to jeepyb to stop on acl failures19:18
ianwi think we can, because we won't try to load ACL's for retired projects (normally?)19:18
clarkbianw: we will if we haven't already cached that we've updated them19:18
ianwalthough i guess it does mean we can't run the mass reload?19:18
clarkbwhich is my concern since the cache may not persist forever19:18
clarkbya exactly19:18
ianwyeah19:18
clarkbI think we need to handle errors for updating RO projects or remove RO projects from projects.yaml or something if we want to make errors more forceful19:19
clarkbI'm open to ideas if people want to leave them in that jeepyb review19:19
clarkbbut it's not something we can land as is so I WIP'd it19:19
clarkbI can use your captured error logging for hints too19:20
clarkbI should go look at that for multiple reasons :)19:20
ianwok, yeah something to think about.  they're all using the retired acl file19:21
fungii suppose our projects.yaml cleanup job could also propose removals of read-only projects $somehow19:21
ianwi guess we need to probe though if it's been applied ...19:21
clarkbwell we know the retired acl does apply cleanly which would imply anything using that acl that fails is very likely to have failed due to being ro19:22
clarkbI want to say gerrit says something like "you can't modify this RO project" which we could use as an indication to ignore too19:22
ianwyeah it says19:23
ianwopenstack-attic_compute-api.txt- ! [remote rejected] HEAD -> refs/meta/config (prohibited by Gerrit: project state does not permit write)19:23
clarkbcool, I can work on a new patchset that checks for that error specifically and only ignore it if that is returned else error19:23
ianw++19:23
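The check discussed above could look roughly like the sketch below. This is not jeepyb's actual code: the function names, the exception class, and the idea of passing the captured `git push` output in as a string are all assumptions made for illustration. The only detail taken from the log is the rejection text Gerrit printed for the retired project.

```python
# Hypothetical sketch: none of these names come from jeepyb itself.
# The marker string is copied from the Gerrit rejection quoted in the log.
READ_ONLY_MARKER = "project state does not permit write"


class AclPushError(Exception):
    """Raised when an ACL push fails for a reason we should not ignore."""


def is_read_only_rejection(push_output: str) -> bool:
    """Return True if the push output shows a read-only project rejection."""
    return any(
        "[remote rejected]" in line and READ_ONLY_MARKER in line
        for line in push_output.splitlines()
    )


def check_acl_push(project: str, returncode: int, push_output: str) -> str:
    """Classify the result of pushing an ACL to refs/meta/config.

    Successful pushes and read-only rejections are reported as strings;
    any other failure raises so the run stops instead of continuing.
    """
    if returncode == 0:
        return "updated"
    if is_read_only_rejection(push_output):
        return "skipped-read-only"
    raise AclPushError(f"{project}: ACL push failed:\n{push_output}")
```

With this shape, a mass ACL reload would still pass over retired read-only projects, while any other push failure would stop the run.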
clarkbanything else related to Gerrit before we go to the next thing?19:24
clarkb#topic Gitea 1.18 upgrade19:25
clarkbyesterday we did a minor upgrade from 1.17.3 to 1.17.419:25
clarkbThat was in preparation for an upgrade to 1.18.x which has now made it to 1.18.319:26
clarkb#link https://review.opendev.org/c/opendev/system-config/+/870851 Upgrade to 1.18.319:26
clarkbThere is a held node against the child change of ^ that can be used to preview things. It seems like it is working happily.19:26
clarkbreviews and double checking the changelog and the held node are much appreciated. I'm happy to watch that go in19:27
clarkb(this is something that has been on my todo list since like December so will be glad to have it done :) )19:27
ianw++ will do19:27
clarkb#topic Pruning backups on the rax server19:28
fungiyeah, i keep meaning to get to that19:28
clarkbThe rax backup server is warning us that we're at 92% of capacity and we should run the pruning tool19:28
fungidoesn't seem dire yet19:28
clarkbfungi: oh thanks.19:28
clarkbI wanted to bring it up here just to make sure it didn't get completely forgotten under the fun of last week :)19:29
ianwyeah i can clear that out19:29
clarkbsounds like we've got a couple volunteers so should get done soon enough. Thanks again19:29
clarkb#topic Linaro Cloud Updates19:30
clarkb#link https://review.opendev.org/c/openstack/project-config/+/871196 Remove old linaro cloud from Nodepool19:30
clarkbI've reviewed that stack and I think it can go in whenever we are ready. Also note that the ssl cert for that cloud expires in 2 days so sooner than later is a good idea19:30
clarkbI know ianw is actively debugging use of the new cloud, but any chance you can give us a quick update on the modifications you made to that cloud?19:31
ianwyep, basically we have 2tb to play with on the cloud, but it was all assigned to a cinder pool19:31
fricklerdo we need a delay between the above cleanup patches?19:32
clarkbfrickler: yes, ideally we land the first one then wait until nodepool is done cleaning everything up before landing the second19:32
fricklero.k., I just approved the first and will only review the next one, then19:33
clarkbthanks!19:33
ianwanyway, i deleted that cinder pool, and made it only 150gb which is enough room for the cache volume we attach to the mirror node19:34
ianwthe rest of the storage i just attached as a regular file system, and moved the glance image storage/libvirt storage into volumes mounted from there19:34
fricklerthe new mirror seems to need another deployment run to recreate some things on the new volume19:35
ianwso, basically we have enough room now to store our uploaded images and run i think as many vm's as we have floating ip's for19:35
ianwfrickler: ahh, that may be, yes i did delete its cache volume and recreate it.  i probably should have done a manual run of the mirror deployment against it.  will double check soon19:36
clarkbsounds like good progress. Anything else to add on the arm cloud migration?19:36
clarkb#topic Upgrading servers19:37
clarkbthis is reasonably high on my todo list but things like git security patching took precedence...19:37
clarkbI'm hoping once I've got my backlog of gitea and gerrit things out of the way I'll be able to focus on this more19:38
clarkbNo real updates on this one.19:38
clarkb#topic Quo vadis Storyboard19:38
clarkband same story on this topic :(19:38
clarkbWhich takes us to the end of our scheduled agenda19:39
clarkb#topic Open Discussion19:39
clarkbThere are a couple things I wanted to mention here.19:39
clarkbFirst is that we discovered gitea does cross repo searching (only on the primary branch) similar to hound. This caused us to wonder if we could drop hound as a result, but ianw pointed out that hound does regex search and gitea does not19:39
clarkbThat said I've been using it for simple searches and it seems to work well.19:40
clarkbAnd second I've pushed tox -> nox conversion changes for bindep, jeepyb, git-review, and system-config now. For at least some of these (git-review) I don't think tox is working at all.19:40
fungiand it sounds like the underlying search library gitea uses supports regex, so it may just be simple glue/ui patching to gitea to add that?19:40
ianwyeah, i use both all the time :)19:40
clarkbfungi: yes there is opportunity to improve gitea to expose regex searching as both bleve (the default we use) and elasticsearch appear to support regex searches19:41
ianwi think hound is pretty low maintenance, i personally would like to keep it.  it probably doesn't need its own host as it does now, but not sure where else it would live19:41
clarkbianw: ya I think as long as it has more functionality keeping it makes sense19:42
fungiif we get approximate feature parity in gitea, then dropping one more redundant ancillary service will still be good though19:42
clarkb++19:42
fungiwe can always host a static redirect from the codesearch name to gitea's explore search19:43
fricklerI like that I can just type "co" in the browser address bar and it will autocomplete and then focus in the search field19:43
fricklertwo more things from me:19:44
clarkbI've been trying to do the nox stuff when I've got a hole in my schedule that is too short to really dive into more involved tasks. I haven't seen any really strong reactions either way on nox. Please say something if you think I'm wasting my time and I'll try to fill those odd blocks of time with something else.19:44
fungihttps://opendev.org/explore/code also seems to focus on the search field, so a redirect would preserve that experience19:44
fricklerah, o.k.19:44
* frickler will be holidaying the upcoming two weeks and mostly be offline19:45
frickleralso I didn't get to add AFS to the agenda as promised. seems there was another cleanup in fedora, so we are good for now except maybe for some centos+ubuntu quota adjustments19:45
fungithanks for keeping an eye on that19:46
clarkbya I almost added it, but the dashboard looks pretty good right now so figured it could wait19:46
clarkbfrickler: I hope you are able to do something fun for your holidays19:46
fungii think i'm moments away from having updated screenshots to see if the latest revision of the donor logos addition works19:46
fungithough looks like the gitea image build takes a while, so it will probably still spill over into my next meeting19:47
fricklerwell as much fun as it gets with the weather and everything19:48
clarkbOh one last thought, should we offer to help debian with the git stuff since we already worked through much of it? fungi already gave them pointers, but I'm worried there hasn't been any movement there after a week. I just don't know what all might be involved particularly since that package isn't in salsa?19:50
fungii've been keeping an eye on https://repo.or.cz/w/git/debian.git/ and haven't seen any movement on the debian branches there either19:51
fricklermaybe ask some debian people like zigo or kevko for their judgement?19:51
fungithat sounds reasonable19:52
fungii'm not a dd so couldn't nmu anything without a sponsor, but also git is central enough to so many things i'd be a little uneasy being the one to nmu that anyway19:52
clarkbfrickler: excellent idea19:52
ianwi am a dd ... but would probably not do that for git! :)19:53
ianwwell not without consent, anyway 19:53
ianwbut ... if we find the right people happy to help19:53
fungiianw: oh! you're a dd? i have something else i need official dds to weigh in on at some point, but it's not directly related to opendev19:54
fungii'll follow up with you later on it19:54
ianwhaha well yeah, i maintain a few things; i was much more active back in itanium days on ia64 things19:55
clarkbianw: when I worked at intel we actually had some of those racked up doing I forget what19:55
ianwbut ... well that's a slice of history now :)19:55
ianwclarkb: probably making a lot of noise and heat19:55
clarkbha19:55
clarkbsounds like that is everything for today. I've got another meeting in a few minutes so stopping here and getting time between would be great19:56
clarkbthanks everyone!19:56
ianwthanks clarkb!19:56
clarkbboth for your time today and all the hard work everyone does to make this machine roll forward19:56
clarkb#endmeeting19:56
opendevmeetMeeting ended Tue Jan 24 19:56:32 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:56
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2023/infra.2023-01-24-19.01.html19:56
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-01-24-19.01.txt19:56
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2023/infra.2023-01-24-19.01.log.html19:56
