Friday, 2025-04-04

[00:03] opendevreview: Merged openstack/project-config master: Move yaml2ical to the opendev tenant  https://review.opendev.org/c/openstack/project-config/+/946280
[00:06] opendevreview: Merged opendev/bindep master: Fix authors/maintainers format in pyproject.toml  https://review.opendev.org/c/opendev/bindep/+/946218
[00:07] opendevreview: Merged opendev/engagement master: Drop maintainers field from pyproject.toml  https://review.opendev.org/c/opendev/engagement/+/946314
[00:29] clarkb: that is an odd distinction and probably assumes that software is built by an individual at all times
[00:31] tonyb: Okay quotas increased, adding 50GB to mirrors.ubuntu still left it > 90%, and flagging a warning, so I added a total of 100GB.  I hope that's okay
[00:31] tonyb: I'll give it an hour and verify I can see the change in grafana
[00:31] clarkb: should be fine. Ubuntu in particular grows slowly typically
[00:32] clarkb: tonyb: you may have to wait longer since grafana may look at the ro volumes and not rw volumes. I can't remember how often ubuntu and centos 9 stream vos release
[00:32] tonyb: Ah okay
[00:32] clarkb: (I wouldn't manually vos release, just wait for the cron jobs to do it)
[00:33] tonyb: Yeah.  No rush.  I checked and the new quota has been applied so I'm happy to wait
[00:36] opendevreview: Tony Breeds proposed opendev/system-config master: Add tony's AFS admin user to UserList  https://review.opendev.org/c/opendev/system-config/+/946316
[00:39] tonyb: clarkb: Update Gerrit images to 3.10.5 and 3.11.2 | https://review.opendev.org/c/opendev/system-config/+/946050 looks good to me.  Just need to plan the approval and restart
[00:39] clarkb: tonyb: thanks my goal is to start on that first thing tomorrow morning
[00:40] tonyb: clarkb: Sounds good
[00:40] clarkb: that way I have the afternoon to watch for anything unexpected before I disappear for a long weekend
[00:40] tonyb: ++
[01:22] tonyb: Also FWIW, I can see the quota changes in grafana \o/
[13:26] opendevreview: Merged opendev/irc-meetings master: kolla: Move meeting one hour backwards (DST)  https://review.opendev.org/c/opendev/irc-meetings/+/946126
[13:37] opendevreview: Merged opendev/gerritbot master: Feature: Keep alive interval of 60 by default  https://review.opendev.org/c/opendev/gerritbot/+/941117
[14:45] clarkb: any reason to not approve https://review.opendev.org/c/opendev/system-config/+/946050 nowish and plan to restart gerrit on 3.10.5 later today after the image updates?
[14:45] fungi: i can think of none
[14:45] fungi: approved it just now
[14:46] clarkb: thanks. I just wanted to make sure there wasn't anything going on as I'm just rolling in now
[14:46] fungi: i'm probably disappearing in about half an hour to grab lunch and maybe run a quick errand, but should be around for the rest of the day after 16:30 or so
[14:46] fungi: happy to help with a restart
[14:49] clarkb: sounds great. I'm not in a huge hurry to do that and it will take some time to gate and promote anyway
[15:08] *** dhill is now known as Guest12898
[15:21] fungi: okay, disappearing for a bit, shouldn't be much more than an hour
[15:54] opendevreview: Merged opendev/system-config master: Update Gerrit images to 3.10.5 and 3.11.2  https://review.opendev.org/c/opendev/system-config/+/946050
[15:57] clarkb: images for ^ have promoted
[15:57] clarkb: the rest of the deployment jobs (which should noop) are still running
[15:57] dan_with: Hi @fungi, Could I have you all attempt to `openstack server reboot --hard <VM_ID>`? We're sorting through an issue with role/admin.
[15:57] clarkb: dan_with: yes I'll do that shortly
[15:58] dan_with: Thanks
[16:00] clarkb: dan_with: just issued that request
[16:00] dan_with: ok thanks
[16:03] clarkb: the server pings and https://mirror.dfw3.raxflex.opendev.org/ responds with expected content now
[16:03] clarkb: I guess let us know when you're happy on your side and we can reenable the region in the ci system
[16:06] dan_with: ok, I had to attach the volume after the boot, so you may need to log in via console and check things out. I would recommend a reboot (again), just to make sure everything comes up cleanly. It has both multipath paths now, so the desired result has been achieved.
[16:08] clarkb: oh that's good to know
[16:08] clarkb: checking now
[16:09] clarkb: confirmed that it didn't mount things properly. Rebooting it again which should take care of that
[16:09] clarkb: (I thought about mount -a but we want to see it come up on boot anyway)
[16:10] dan_with: yes please. I couldn't attach the volume through the admin account until it started to boot. I'll have to solve the role/policy problem, but wanted to get you unstuck and your server happy.
[16:11] clarkb: seems to have come up with the volume attached and mounting at boot worked
[16:12] clarkb: it's possible we leaked a small amount of data onto the rootfs at the two mount points but df reports minimal disk consumption for / so I decided to not overthink cleaning that up for a mirror node
[16:12] clarkb: dan_with: you're happy with the server now? and we can return it to service if we are happy with it?
[16:13] dan_with: Excellent. Thank you for your patience and willingness to work with me. I can tell you have a great team. You can return to service if you would like. I just wanted to make sure the volume mounted correctly and it didn't need an fsck.
[16:13] clarkb: booting was slow it may have fsckd. Let me check
[16:14] clarkb: dmesg -T only reports that rootfs was skipped
[16:14] clarkb: so I don't think it fscked
[16:14] clarkb: dan_with: and thank you for the help and resources
[16:15] clarkb: I've +2'd https://review.opendev.org/c/openstack/project-config/+/946266 and when fungi returns he can confirm that the server looks good and approve it if so
[16:15] dan_with: You're welcome. Sorry for the odyssey down a rabbit hole. It did help uncover a major problem that I'll be working on today. So, have a wonderful Friday and weekend.
[16:16] clarkb: you too!
[16:42] fungi: looks like we're all set for the upgrade when ready?
[16:43] fungi: thanks dan_with for your help on the mirror server!
[16:43] clarkb: fungi: I think so
[16:43] clarkb: did you want to drive or should I?
[16:43] dan_with: you bet
[16:43] clarkb: I think the process is: note the current image, pull the new image and check it looks correct, down the service, move waiting queue aside, optionally delete the two large gerrit caches or move them aside, start the service
[16:44] fungi: i can drive the gerrit upgrade but would need a few minutes to get settled first
[16:45] clarkb: gerrit_file_diff, git_file_diff, git_modified_files are the three main caches that have given us problems
[16:45] clarkb: I know that gerrit has also tried to improve things but unsure if any of that has made it onto the 3.10 branch. So may be worth attempting to restart as is. Or we can just go easy mode and not bother
[16:45] clarkb: (we don't want to delete all caches as that will force everyone to log back in again which is annoying)
[16:46] clarkb: diff_intraline appears to be another large one
[16:48] clarkb: and each cache has 2-3 files associated with it. The h2 db, a lock file and an optional trace file. I've been moving all aside when clearing them out
[16:50] opendevreview: Merged openstack/project-config master: Revert "Temporarily turn down raxflex-dfw3 use"  https://review.opendev.org/c/openstack/project-config/+/946266
[16:52] fungi: okay, i have a root screen session open on review.o.o
[16:53] clarkb: cool let me jump on there
[16:53] fungi: opendevorg/gerrit               <none>    66bb7c64dabb   10 months ago   682MB
[16:53] fungi: that looks like the image we're running?
[16:53] fungi: doesn't seem like it's been that long
[16:53] clarkb: is it ok if I type?
[16:54] fungi: sure
[16:54] fungi: been longer since we restarted gerrit than i realized
[16:55] clarkb: opendevorg/gerrit               3.10      279db0b1f27b   8 weeks ago
[16:55] clarkb: it's that one I think
[16:55] fungi: okay, so 8 weeks seems more reasonable
[16:55] fungi: that didn't come up in the `docker image list` output for me
[16:56] fungi: oh, no it was me scrolling my local tmux session and not the remote screen session, okay
[16:56] clarkb: I think it was there but your small terminals cut it off :)
[16:56] fungi: yeah, i see it now
[16:56] fungi: okay, so 279db0b1f27b is the id we're running
[16:56] clarkb: yup
[16:56] clarkb: as that matches the opendevorg/gerrit:3.10 image from docker ps -a
[16:57] fungi: pulling new images
[16:57] fungi: opendevorg/gerrit               3.10      f5b922fbdc07   2 hours ago     681MB
[16:57] fungi: that's the one we're switching to
[16:58] clarkb: that mariadb image also probably updated?
[16:58] clarkb: maybe not; the 10.11 image isn't getting updated as frequently so could be either way
[16:58] fungi: quay.io/opendevmirror/mariadb   10.11     4254659f2379   8 weeks ago     326MB
[16:58] clarkb: ya looks like it hasn't. That's fine and I think expected
[16:58] fungi: that looks older than the last restart, right
[16:59] fungi: mv /home/gerrit2/review_site/data/replication/ref-updates/waiting /home/gerrit2/tmp/replication_waiting_queues/waiting_queue_20250404
[16:59] clarkb: ya that looks good to me
[16:59] fungi: that's how we'll move the waiting queue once the server is offline
[17:00] fungi: okay, i'll status notice something before we restart
[17:00] clarkb: do you want to move the four caches (* 2 or 3 files) that I mentioned above too or see if gerrit has improved?
[17:00] clarkb: we know that if things do go sideways/sad that stopping gerrit and moving caches aside at that point does seem to resolve it
[17:00] clarkb: so we do have an out
[17:00] fungi: what's your gut feel with those? are they getting cumbersomely large?
[17:01] clarkb: the sizes we currently have are in a range where we've seen things be fine on restart and we've seen things not be fine
[17:01] clarkb: I suspect that may be due to iops/throughput when it does its startup pruning
[17:01] clarkb: so the big caches are ok if the network and ceph are happy but not when it's slower? But that is just a hunch
[17:02] clarkb: removing the caches does cause diffs to be available on changes pages quicker
[17:02] clarkb: but may slow down page loads overall
[17:02] clarkb: vs taking the 5-10 minute hit upfront and having faster loads overall
[17:05] fungi: so something like `rm /home/gerrit2/review_site/cache/{gerrit_file_diff,git_file_diff,git_modified_files,diff_intraline}.{h2,db,lock}`
[17:05] clarkb: I think it's h2.db not h2,db
[17:05] clarkb: and ya I guess we can leave the trace files there probably (though I've moved them in the past)
[17:06] fungi: oh, trace
[17:06] fungi: i misread your earlier comment about file suffixes
[17:08] fungi: okay, gerrit_file_diff and git_file_diff don't have a .trace.db
[17:08] fungi: but otherwise `rm /home/gerrit2/review_site/cache/{gerrit_file_diff,git_file_diff,git_modified_files,diff_intraline}.{h2,lock,trace}.db`
[17:09] clarkb: ya that looks about right
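
The brace pattern agreed on above can be previewed before anything is deleted. This is a minimal sketch, assuming bash is installed (brace expansion is a bash feature, not part of POSIX sh). The helper name `expand_cache_files` is hypothetical; the cache directory and file names are the ones quoted in the log.

```shell
# Hypothetical helper: print the paths that the brace pattern from the log
# expands to, without touching any files.
# The expansion runs in an explicit bash because plain sh may not support it.
expand_cache_files() {
    bash -c 'printf "%s\n" "$1"/{gerrit_file_diff,git_file_diff,git_modified_files,diff_intraline}.{h2,lock,trace}.db' _ "$1"
}

# Preview only; nothing is removed.
expand_cache_files /home/gerrit2/review_site/cache
```

The preview lists twelve paths (four caches times three suffixes). Since two of the caches have no `.trace.db` file, a plain `rm` would report those paths as missing while still removing the others; `rm -f` would skip them silently.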
[17:09] fungi: rm them or mv them?
[17:09] clarkb: I moved them previously then rm'd them out of the dest dir. The reason for that is I wasn't completely sure that gerrit would recreate them automatically but it appears to do so so I think rm should be safe
[17:09] clarkb: basically allowed me to be able to recreate the files if necessary with the right perms etc
[17:10] clarkb: but gerrit seems to create missing caches on startup
[17:10] fungi: so is there a reason to save the replication waiting queue, or could we just rm that too?
[17:11] clarkb: the reason to save the replication queue is to have a data corpus to debug and fix the issue as there are different scenarios that can cause those events to leak
[17:11] clarkb: at this point i think we have probably collected enough data for that
[17:11] fungi: so no real benefit to preserving those either
[17:11] clarkb: if you want to rm that content that should be fine too (it might be slow as it is a number of inodes)
[17:12] clarkb: so I guess mv then rm would allow us to do it with a potentially shorter gerrit outage
[17:12] clarkb: but it's probably not slow enough to matter
[17:12] fungi: that's a good point, unlinking tons of inodes could be slow
[17:12] fungi: i'll stick with the mv for that
[17:12] clarkb: sounds good
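
The sequence discussed so far (note the running image, pull, stop, move the waiting queue aside, optionally clear the large caches, start) can be laid out as a dry-run script. This is only a sketch: the `docker compose` invocations are an assumption about how the service is managed and do not appear in the log, `run` and `restart_plan` are hypothetical helper names, and the paths are the ones quoted in the conversation. With the default `DRY_RUN=1` it only prints what it would do.

```shell
#!/bin/sh
# Dry-run sketch of the restart sequence. Set DRY_RUN=0 to execute for real.
DRY_RUN="${DRY_RUN:-1}"
SITE=/home/gerrit2/review_site
QUEUE_BACKUPS=/home/gerrit2/tmp/replication_waiting_queues
STAMP="$(date +%Y%m%d)"   # date suffix, e.g. 20250404

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

restart_plan() {
    run docker image list      # note the image id currently in use
    # Assumption: the service is compose-managed and this runs from the
    # directory that holds its compose file.
    run docker compose pull
    run docker compose down
    # mv rather than rm: a rename is quick even when the queue holds many inodes.
    run mv "$SITE/data/replication/ref-updates/waiting" \
        "$QUEUE_BACKUPS/waiting_queue_$STAMP"
    # Optional: drop the large caches; -f ignores files that do not exist.
    for cache in gerrit_file_diff git_file_diff git_modified_files diff_intraline; do
        for suffix in h2 lock trace; do
            run rm -f "$SITE/cache/$cache.$suffix.db"
        done
    done
    run docker compose up -d
}

restart_plan
```

Keeping the slow cleanup of the moved queue out of the script means it can happen after the service is back, which is the reason given above for preferring `mv`.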
[17:14] fungi: status notice The Gerrit service on review.opendev.org will be offline momentarily for a patch release update
[17:14] fungi: that look okay?
[17:15] clarkb: yup notice logtm
[17:15] fungi: #status notice The Gerrit service on review.opendev.org will be offline momentarily for a patch release update
[17:15] opendevstatus: fungi: sending notice
[17:15] clarkb: and the command you have queued up is a mouthful but also looks right
[17:15] -opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily for a patch release update
[17:15] fungi: once that returns, yeah... enter key
[17:17] clarkb: side note grafana says dfw3 is back in service
[17:17] fungi: nice
[17:18] opendevstatus: fungi: finished sending notice
[17:18] fungi: restarting...
[17:18] clarkb: [2025-04-04T17:18:35.604Z] [main] INFO  com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.10.5-1-g47283ba335-dirty ready
[17:19] fungi: Powered by Gerrit Code Review (3.10.5-1-g47283ba335-dirty)
[17:19] clarkb: web ui is up and I can view diffs on https://review.opendev.org/c/openstack/project-config/+/946280
[17:19] fungi: yeah, came up rather quickly
[17:19] fungi: might be one of our fastest restarts yet
[17:20] clarkb: https://review.opendev.org/c/starlingx/specs/+/946244
[17:20] clarkb: this got a new patchset
[17:20] clarkb: and I believe it was pushed post restart
[17:20] clarkb: so we can check replication using that new patchset. I'll work on that now
[17:21] clarkb: origin https://opendev.org/starlingx/specs/ and git fetch origin refs/changes/44/946244/14 produces 1172d3da10debd68ceb4c5f3189b8b9f8d1315d6 which seems to match the new patchset so I think that is good
[17:22] fungi: yes, looks right here too
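
Gerrit publishes each patchset under a ref of the form `refs/changes/<last two digits of the change number>/<change number>/<patchset>`, which is what the fetch above relies on. A small sketch of building that ref name follows; `change_ref` is a hypothetical helper, not a Gerrit or git command.

```shell
# Hypothetical helper: build the Gerrit change ref for a change number
# and patchset number.
change_ref() {
    change="$1"
    patchset="$2"
    # The shard directory is the last two digits of the change number,
    # zero-padded (change 946208 would land under 08).
    shard="$(printf '%02d' "$((change % 100))")"
    printf 'refs/changes/%s/%s/%s\n' "$shard" "$change" "$patchset"
}

change_ref 946244 14

# To compare against the replica (needs network access, so left commented):
# git ls-remote https://opendev.org/starlingx/specs "$(change_ref 946244 14)"
```

If replication worked, the commit id reported for that ref on the replica should match the patchset shown in Gerrit; in the log it was 1172d3da10debd68ceb4c5f3189b8b9f8d1315d6.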
[17:22] clarkb: the error log is still full of people trying to connect with ancient ci systems that can't negotiate ssh
[17:22] clarkb: I guess we shouldn't put random user ips in public config management even if we think they are tied to a system and not an individual?
[17:22] clarkb: I'd like to add them to a permanent firewall block list. I guess we can use secret vars for that maybe
[17:23] clarkb: it is an IBM ip address if anyone happens to know how to escalate to IBM via red hat or something
[17:23] fungi: for the record, this works too and may be easier: https://opendev.org/starlingx/specs/commit/1172d3da10debd68ceb4c5f3189b8b9f8d1315d6
[17:23] fungi: (just reference the commit id)
[17:24] clarkb: oh cool. I never do that because I can never remember what the pathing is for gitea. But I can remember the special gerrit changes ref for whatever reason
[17:24] clarkb: brains are weird
[17:24] clarkb: anything else you want me to check?
[17:24] fungi: no, seems like this is good
[17:24] clarkb: I detached from the screen and will let you decide when you want to shut it down
[19:15] Clark[m]: Gerrit still happy?
[19:16] fungi: seems so
[19:16] fungi: hard to tell on a friday
[21:22] Clark[m]: firefox has native vertical tabs now. I don't think it looks as good or works as well as tree style tabs but progress
