-opendevstatus- NOTICE: Depends-On using https://review.opendev.org URLs are currently not working. This was due to a config change in Zuul that we are reverting and will be restarting Zuul to pick up. | 17:40 | |
clarkb | Anyone else here for the meeting? We will start shortly | 18:59 |
---|---|---|
clarkb | I expect it might be a small crew today as fungi is out and I think diablo_rojo_phone is busy | 19:00 |
corvus | o/ | 19:00 |
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue Jul 13 19:01:04 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2021-July/000267.html Our Agenda | 19:01 |
diablo_rojo_phone | Juuuust made it to Seattle. | 19:01 |
diablo_rojo_phone | So I may be half paying attention. | 19:01 |
clarkb | diablo_rojo_phone: don't worry about it | 19:01 |
clarkb | #topic Announcements | 19:01 |
ianw | o/ | 19:01 |
clarkb | A reminder that the gerrit server will be moving July 18 at 23:00UTC. We'll talk about that in more depth later in the meeting though | 19:02 |
clarkb | Other than that I dind't have any announcements | 19:02 |
clarkb | I can't type didn't today | 19:02 |
clarkb | #topic Actions from last meeting | 19:02 |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-07-06-19.01.txt minutes from last meeting | 19:02 |
clarkb | #action someone write spec to replace Cacti with Prometheus | 19:02 |
clarkb | That hasn't happened yet. I'm not too worried about it as we've been focused on updates to other systems. But once we can get up for some air that would be a good thing to look at next | 19:03 |
clarkb | There were no other actions recorded that I saw | 19:03 |
clarkb | #topic Specs Approval | 19:03 |
clarkb | #link https://review.opendev.org/796156 Supporting communications on our very own Matrix homeserver | 19:03 |
clarkb | I think this is now in a position where people can review it with enough real world information to make informed decisions | 19:04 |
clarkb | We have a matrix homeserver up for opendev.org. We have a test channel on that server. infra-root can invite themselves to that channel using the admin account (details in the typical location) or you can ask corvus, mordred, fungi, or myself to add you | 19:04 |
clarkb | though it doesn't look like fungi has made it in there yet | 19:05 |
clarkb | given all that do we think we are in a position to put the spec up for approval now? I'm comfortable with that myself | 19:05 |
corvus | ++ | 19:05 |
clarkb | corvus: ^ you may have input | 19:05 |
clarkb | considering the focus on gerrit things this week and that fungi is not currently around. What about asking for reviews before 7/22 then approving it then if there are no objections | 19:06 |
clarkb | (gives people a few days after this week to review it) | 19:06 |
corvus | wfm | 19:07 |
clarkb | Alright infra-root please review https://review.opendev.org/796156 by 7/22 | 19:07 |
clarkb | and feel free to interact with the system that is there to aid your review | 19:08 |
clarkb | #topic Topics | 19:08 |
clarkb | #topic review server upgrade | 19:08 |
clarkb | This is still scheduled for July 18 at 23:00 UTC | 19:09 |
clarkb | we ran into a small hiccup today when it was noticed that depends-on had stopped working. Turns out this was related to switching zuul to talk to review01.opendev.org instead of review.opendev.org. Zuul uses that config to line up the depends-on and determine which are valid | 19:09 |
clarkb | A revert of that change is in the gate right now and we'll need to restart Zuul once the deploy job for zuul runs for that | 19:10 |
clarkb | To handle the DNS update during the migration I think we can force merge the DNS change in gerrit on review02, then manually pull and run the dns deploy playbook on bridge | 19:10 |
clarkb | Not as elegant but prevents depends-on from being ignored | 19:10 |
clarkb | ianw: ^ not sure if you had caught up on all that scrollback yet but that is the tldr | 19:10 |
ianw | ++ yep, i will re-write the checklist today to account for that | 19:11 |
clarkb | I also pushed up a change to reduce the TTL on the review.o.o cname record to 300 seconds since updating that will be more important now | 19:11 |
clarkb | we should be able to land that one today to get it out of the way | 19:11 |
ianw | yep, good idea | 19:12 |
clarkb | I think it would be good to do a resync of review02 today as well. Then we can spin it up with the current gerrit image and make sure everything looks happy | 19:12 |
clarkb | I have a related item on the next topic, but I'll hold off in case there are other upgrade specific things to go over | 19:12 |
clarkb | oh! have reminders gone out yet? we should send those to the mailing list. The meme is people don't read but we can only do our best :) | 19:13 |
corvus | send our reminder gifs? | 19:13 |
ianw | ahh, i said i would do that and got sidetracked sorry. i'll send one in reply to the original now | 19:13 |
clarkb | ianw: thanks! | 19:13 |
clarkb | corvus: no, reminders that the server will have a longer than typical outage | 19:13 |
clarkb | corvus: but adding gifs is probably a good way to get people to read them :) | 19:13 |
clarkb | Anything else? | 19:14 |
clarkb | #topic Gerrit Account Cleanup | 19:16 |
clarkb | I won't bother with cleaning up the ~176 external ID conflicts that I retired accounts for until after the move | 19:16 |
clarkb | however efoley reached out yesterday after they managed to get themselves into a bad spot with their account. | 19:16 |
clarkb | The root cause has been captured in https://bugs.chromium.org/p/gerrit/issues/detail?id=14776. The tldr is that deleting emails in the web ui is not safe: if you delete the email for your openid, it also deletes your openid externalid | 19:17 |
clarkb | We can't fix this in a simple way because of the conflicts I have been working to clean up. What we can do is take advantage of the downtime to push a fix to the externalid records under gerrit, then we'll reindex anyway and in theory be happy | 19:17 |
clarkb | ianw: ^ I started looking at the testing and staging of this on review02 today. That led me to create a /home/gerrit2/scratch/ dir where I was going to clone All-Users to and then checkout refs/meta/external-ids to make the necessary edits so they are staged and ready to go (and possibly test them?) | 19:18 |
clarkb | ianw: but I ran into a weird thing: I don't want that dir to be backed up because the refs/meta/external-ids checkout has tons of small files and we already backup the source repo | 19:19 |
clarkb | ianw: is there a better location for me to do that? maybe /tmp? we can figure that out after the meeting too | 19:19 |
ianw | hrm, yeah the root disk should be big enough to handle it | 19:20 |
clarkb | But I am hoping to be able to stage that all up, push the fixes back into review_site/git/All-Users.git after we sync up to current state then maybe have efoley test a login against review02 if we turn it on | 19:20 |
clarkb | I'll coordinate that with ianw and we can edit the outage doc with what we learn | 19:20 |
ianw | otherwise we could do something like add an exclude to ~gerrit2/tmp ... that might be a good idea as even on the old server we've acquired random intermediate bits of little value | 19:20 |
ianw | so working in ~/tmp/<user>/ ... would be good that we know we can always remove those bits (and a signal to us working to remind us to consider it ephemeral and do things another way if we want it persisted) | 19:21 |
clarkb | not a bad idea. I actually do that on my personal machines because tmp is small | 19:22 |
clarkb | Anyway I think /tmp will work for now and we can coordinate the syncing and testing bits later | 19:23 |
clarkb | Another odd thing I noticed when doing that is /home/gerrit2 is root:root ownership | 19:23 |
clarkb | which means gerrit2 can't create dirs or files in its own homedir root. I suspect something related to docker containers with that? | 19:23 |
clarkb | Not critical either, but things like that make me want to turn on the gerrit if we can and ensure it starts up cleanly | 19:24 |
clarkb | #topic gitea backups failing to one backup target | 19:24 |
ianw | hrm, i quite possibly did a mkdir of /home/gerrit2 to get the LVM mounted there | 19:24 |
clarkb | ianw: ah | 19:24 |
clarkb | ianw: re gitea backups do we still suspect timeout values in mysql configs? | 19:25 |
ianw | so that would be an oversight. i definitely have started it and played with it, so it does minimally work | 19:25 |
clarkb | cool | 19:25 |
ianw | umm, last thing was the ipv6 between gitea01 -> backup seemed to not work | 19:25 |
clarkb | oh right this is the vexxhost between regions routing problem | 19:25 |
ianw | i've reported that to mnaser and i believe an issue was raised, but i haven't followed up since | 19:25 |
clarkb | ok. This topic is on here mostly to remind me to catch up if there are any updates to catch up on. Sounds like we are still waiting for vexxhost | 19:26 |
clarkb | maybe we should consider dropping the AAAA record for now? | 19:26 |
ianw | it seems unfortunate but we could | 19:27 |
ianw | also the filesystem component of the backup is working | 19:27 |
ianw | so it must be falling back to ipv4 | 19:27 |
clarkb | I wonder if the streaming setup for the db prevents fallback from working | 19:27 |
clarkb | because the stream gets interrupted | 19:27 |
clarkb | vs the fs backup which can simply reconnect and then start doing files | 19:28 |
ianw | afaics borg doesn't log anything of interest relating to that | 19:29 |
ianw | i'll have to fiddle more, i'll put it on the todo list | 19:29 |
clarkb | It seems plausible at least | 19:29 |
ianw | the ipv6 may be a red herring to the actual problem | 19:29 |
clarkb | ya | 19:29 |
clarkb | and thanks | 19:29 |
ianw | it would just be nice to debug one thing at a time :) | 19:29 |
clarkb | ++ | 19:30 |
clarkb | #topic Gitea 1.14.4 upgrade scheduling | 19:30 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/800274 Gitea 1.14.4 upgrade | 19:30 |
clarkb | I've got this change passing now. It is one of the larger Gitea upgrade changes that we've had I think | 19:30 |
clarkb | worthy of careful review. There is a link to a held test node running that code too | 19:30 |
clarkb | Given everything else happening I'm happy to defer this to next week assuming things settle down a bit :) But if you have time for review this week that would be helpful | 19:31 |
clarkb | as that way I can address any concerns before we actually do the upgrade | 19:31 |
ianw | ++ i played around and change overall lgtm | 19:31 |
clarkb | #topic Scheduling Gerrit Project Renames | 19:32 |
clarkb | We said we'd do this the week after the server upgrade/move previously. Does anyone have opinions on a specific day for that? Probably Monday 7/26 or Friday 7/30? (I think I'm supposed to not be around on 7/30) | 19:33 |
clarkb | Any objections to pencilling in 7/26? | 19:34 |
clarkb | Let's pencil that in then and when fungi returns we can talk about a specific timeframe | 19:35 |
clarkb | I expect the rename outage to be quite short as we can do online reindexing | 19:35 |
clarkb | #topic Open Discussion | 19:35 |
clarkb | Anything else? | 19:35 |
ianw | I got https://paste01.opendev.org/ up | 19:36 |
ianw | i have a minor change to db layout to merge, but will then import the old database | 19:36 |
ianw | if it seems to work, i'm presuming no objections to changing the paste.openstack.org CNAME ? | 19:37 |
clarkb | sounds good to me | 19:37 |
ianw | i don't think the service has a bright future, but it should continue chugging along for a while in its container | 19:38 |
ianw | as with all good web apps, every library it depends on has changed to the point that you basically have to rewrite everything to update it | 19:39 |
clarkb | fun, I think vexxhost was doing some minor maintenance with it though | 19:39 |
ianw | yeah, i got into "this bit deprecated from main framework, use this library -- oh, that library is now unmaintained and has bug that makes it not work with later versions of main framework" loop and gave up | 19:40 |
clarkb | Sounds like that may be about it. I'll go ahead and call the meeting here so that we can proceed with the Zuul restart | 19:42 |
clarkb | As always feel free to bring discussion up in #opendev or at service-discuss@lists.opendev.org | 19:43 |
clarkb | Thank you everyone | 19:43 |
clarkb | #endmeeting | 19:43 |
opendevmeet | Meeting ended Tue Jul 13 19:43:16 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:43 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2021/infra.2021-07-13-19.01.html | 19:43 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2021/infra.2021-07-13-19.01.txt | 19:43 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2021/infra.2021-07-13-19.01.log.html | 19:43 |