Monday, 2025-04-14

fungithe infra-root address got the same message too00:45
fungii really don't know, i generally defer such questions to i18n team folks (ian and seongsoo mainly)00:46
clarkbbit of a slow start this morning, but local system is updated which is nice. Once I've caught up on messages I'm going to look into updating gerrit vars for the ssh host keys so that we can move to land that change.15:32
clarkbI also realized that synchronizing the gerrit data might require temporarily adding personal ssh keys to the gerrit port 22 ssh user account so that we can sync with the correct perms and take advantage of rsync speedups when updating existing data15:32
clarkbI'll update the etherpad with more details on the dns thing and ^ today too15:33
clarkbhoping we can actually sync data today/tomorrow and test the server too15:33
fungiwouldn't we just generate a temporary ssh key on the new server and add it to authorized_keys on the old server?15:37
clarkbya that would also work. I was thinking use ssh -A and then it's slightly less effort?15:39
clarkbbut we need to do something to make the sync possible from review_site to review_site. Another option would be to sync from review_site to gerrit2/tmp using personal keys and then mv/cp over to review_site which loses some rsync benefits15:39
fungifor similar work on the mailman 3 migration i generated a key on the new server, trusted it on the old server, then ran rsync on the new server to pull files from the old, since i was going to need to do it multiple times15:42
clarkbinfra-root secret vars should be in place for https://review.opendev.org/c/opendev/system-config/+/947044 now. I updated the host vars for both review02 and review03 to use review02's current keys. Please double check that. I'm pretty sure I didn't copy the wrong new data from review03 but worth ensuring the data I put in place matches review02's content as part of review15:59
clarkbok got a number of updates into https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 there are a number of todos to prep changes that I'll start on now. Then it's a matter of sorting out how to synchronize stuff. Feedback very much welcome16:08
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Reduce the review.o.o record TTL  https://review.opendev.org/c/opendev/zone-opendev.org/+/94713616:12
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Switch review.o.o to review03  https://review.opendev.org/c/opendev/zone-opendev.org/+/94713716:12
opendevreviewClark Boylan proposed opendev/system-config master: Unstage review03.opendev.org  https://review.opendev.org/c/opendev/system-config/+/94713916:18
clarkbfungi: re mm3 syncs is there a reason to prefer the orientation you used of having mm2 trust a key on mm3 then mm3 runs the sync vs having mm2 be trusted by mm3 and having mm2 run the sync?16:20
clarkbI'm just trying to mental model if there is a benefit to doing things in a specific way that I haven't considered16:20
fungimainly trusting the replacement more than the server that's being replaced16:21
fungisince the replacement will presumably live on long after the replaced server has been archived and deleted16:22
clarkbgot it. Not a case of making rsync more efficient/faster but one of appropriate trust levels16:23
clarkbone thing that is different with gerrit compared to mailman is gerrit uses the ssh key in its user for replication and it relies on the default key names for that. But as long as we use a non default for rsync and pass -i keyname we should be fine (you can do that with rsync right?)16:25
clarkblooks like you have to construct the ssh command for rsync to use or set up .ssh/config to do that16:25
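A minimal sketch of the "construct the ssh command for rsync" option mentioned above, written in Python so the argument list is explicit. It relies on rsync's `-e` option and ssh's `-i` and `-o IdentitiesOnly=yes` options. The key file name, the remote user and host string, and the directory paths are all hypothetical placeholders, not values taken from the actual servers.

```python
import shlex


def build_rsync_pull(key_path, remote, src, dest):
    """Return an rsync argv that pulls src from remote using one specific ssh key."""
    # rsync's -e option sets the remote shell command. ssh -i selects the
    # identity file, and IdentitiesOnly=yes keeps ssh from offering other keys.
    ssh_cmd = shlex.join(["ssh", "-i", key_path, "-o", "IdentitiesOnly=yes"])
    return ["rsync", "-a", "-e", ssh_cmd, f"{remote}:{src}", dest]


# All values below are illustrative assumptions.
cmd = build_rsync_pull(
    "/home/gerrit2/.ssh/id_sync_example",  # hypothetical non-default key
    "gerrit2@review02.opendev.org",        # hypothetical user@host
    "/home/gerrit2/review_site/git/",      # hypothetical source path
    "/home/gerrit2/review_site/git/",      # hypothetical destination path
)
print(shlex.join(cmd))
```

The same effect can be had with a `Host` entry and `IdentityFile` in `.ssh/config`, which is the other option named in the message above.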
clarkbI guess another option is don't generate another key but trust the replication key16:26
fungiyeah, honestly i'd just reuse the replication key and be done with it16:26
fungithen there's no additional cleanup to worry about, the only addition is to authorized_keys on the old server which is going to get deleted anyway16:27
clarkbmakes sense. I added that idea to the etherpad and it seems like the one we'll likely go with16:27
clarkbok the etherpad is updated with notes for the db dump, copy, restore; git content rsync; and gerrit index rsync17:11
clarkbbefore I proceed it would be good to get feedback on those to ensure I'm not going to do anything stupid as this has the potential to impact the production server17:12
clarkbon further inspection the replication key is actually a non default key and we set some .ssh/config to force that key to be used with giteas. I think the easiest solution here is to generate a new ed25519 key on review03 and add its pubkey to authorized_keys on review02. That way we're not impacting any existing keys but we get a default key so rsync doesn't need complicated17:18
clarkbconfiguration. Then when all this is done we can remove that ed25519 key17:18
clarkbI'll update the etherpad to state ^ is my plan and let others look that over with the more complete context of the etherpad17:18
clarkbhttps://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 is the etherpad17:20
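The temporary-key plan described above (generate an ed25519 key on the new server, trust it on the old one, remove it afterwards) could be sketched roughly as follows. This is an illustration under stated assumptions, not the commands that were actually run: the key path, the comment string, and the sample key material are hypothetical, and the helpers only build the `ssh-keygen` argument list and edit `authorized_keys` text in memory.

```python
def keygen_argv(key_path, comment):
    """Build an ssh-keygen argv for a passphrase-less ed25519 key."""
    # -t key type, -f output file, -N passphrase (empty here), -C comment
    return ["ssh-keygen", "-t", "ed25519", "-f", key_path, "-N", "", "-C", comment]


def add_authorized_key(existing, pubkey_line):
    """Return authorized_keys text with pubkey_line appended if it is missing."""
    lines = existing.splitlines()
    if pubkey_line.strip() not in (line.strip() for line in lines):
        lines.append(pubkey_line.strip())
    return "\n".join(lines) + "\n"


def remove_authorized_key(existing, comment):
    """Return authorized_keys text without entries ending in the given comment."""
    kept = [
        line for line in existing.splitlines()
        if not line.rstrip().endswith(" " + comment)
    ]
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical values for illustration only.
comment = "temp-review-sync"
pubkey = "ssh-ed25519 AAAAexamplekeymaterial " + comment
before = "ssh-rsa AAAAexistingkey someone@example\n"

trusted = add_authorized_key(before, pubkey)
cleaned = remove_authorized_key(trusted, comment)
print(keygen_argv("/home/gerrit2/.ssh/id_ed25519", comment))
```

Tagging the key with a distinctive comment is one way to make the later cleanup a simple match on that comment.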
corvusthe default gerrit2 ssh key isn't used for anything?  like gitea replication?  or any gerrit hook scripts?17:33
clarkbcorvus: it isn't used for gitea replication anymore. It was used but then gitea complained it didn't have enough bits so we added a new flip flop key system to be able to switch keys out in gitea and then .ssh/config selects it17:35
clarkbI haven't been able to find anything that would use the old id_rsa key (jeepyb seems to use a key specific to it)17:35
clarkbat this point i have a new ed25519 key on 03 that is trusted by 02.17:35
clarkbThis seems like it should be safe and then easily cleaned up when the host migration is complete17:35
corvussgtm then; that was the only thing that jumped out at me.  i left a few supporting comments.17:36
clarkbthanks!17:36
clarkbI think fungi and I will start on these synchronization tasks around 1800 UTC so that we can get the test node up and running and confirm we get a working gerrit. Then we'll replay what we did in a week and update DNS to make it official17:39
corvusclarkb: after your last comment i now fully understand "on further inspection the replication key is actually a non default key" :)17:41
corvuss/last/earlier/17:41
corvus(for some reason i was reading that as "key type" or something similar and didn't understand the relevance)17:42
clarkband still no concerns?17:43
opendevreviewTim Burke proposed opendev/git-review master: Fix TypeError when reporting old git version  https://review.opendev.org/c/opendev/git-review/+/94714518:16
corvuscorrect18:17
clarkbfwiw we're doing the index sync first and it is not very fast :/19:13
clarkbI have a family lunch in a bit and the goal is to start up the git repo sync when the index sync finishes, check that all looks good before popping out. If things look not good we can ^C and stop the sync on review03 and then go to lunch19:14
clarkbbut that way it can run in the background while we're all busy with other things anyway. Then probably tomorrow morning redo syncs to see how much of a speedup we get and then test the actual service on the new server19:14
clarkbthe first index sync completed and took 55 minutes and 6 seconds. The git repo sync is running now and will probably take less time as there is less total data19:39
clarkbI think the way my day is going my plan is to run another pass of synchronization tomorrow morning and get an idea of the speedups we can expect from subsequent syncs then some point tomorrow turn on gerrit on review0319:39
clarkbthe current git sync is running in a root screen on review03 if anyone notices problems with that it should be safe to attach and ^C and we'll just do a short sync that needs catching up later19:40
clarkband the git sync finished19:55
clarkbthe screen is still up on review03 but I'm going to leave things as is for the afternoon so that I can eat lunch and work on other stuff. Will pick this back up tomorrow19:59
clarkbWhen I get back from lunch I'll start putting a meeting agenda together. Let me know if there are any edits you'd like to see and i'll do my best to get to them19:59
clarkbI've updated the meeting agenda. Anything else we need to add?22:09
clarkblast call on meeting agenda updates. I'll probably send it out not long after 23:00 UTC22:42
opendevreviewJames E. Blair proposed opendev/system-config master: Mirror node:22  https://review.opendev.org/c/opendev/system-config/+/94716022:44
corvusi'm restarting zuul sched/web/launcher to pick up the change that will allow us to set image import timeouts22:48
clarkbcorvus: that will also pick up the gerrit event stream update?22:49
clarkb(something to keep in mind I guess if we notice any oddities with that)22:49
corvussure will :)22:49
clarkbok I tried to write a good number of test cases and it should largely noop for opendev since we won't set the timeout22:50
clarkbbut ... something to keep an eye on22:50
corvusokay, we should be on the new code now as the second batch is starting up22:59
opendevreviewJames E. Blair proposed opendev/zuul-providers master: Add import timeout for rax classic  https://review.opendev.org/c/opendev/zuul-providers/+/94716323:00
corvusand that should exercise both things :)23:00
corvusit got an instant +1 so that's good.23:01
clarkb+2 from me23:01
corvusJames E. Blair: Uploaded patch set 1. (2025-04-14 23:00:18+0000)23:01
corvusZuul: Patch Set 1: Verified+1 (2025-04-14 23:00:25+0000)23:01
corvusthat's < 10 seconds, so i think the change is in effect :)23:02
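The "< 10 seconds" figure can be checked directly from the two timestamps quoted above. A small worked example, using only the two times as they appear in the pasted Gerrit comments:

```python
from datetime import datetime

# Timestamps copied from the two Gerrit comment lines quoted above.
fmt = "%Y-%m-%d %H:%M:%S%z"
uploaded = datetime.strptime("2025-04-14 23:00:18+0000", fmt)
verified = datetime.strptime("2025-04-14 23:00:25+0000", fmt)

elapsed = (verified - uploaded).total_seconds()
print(elapsed)  # 7.0
```

That is 7 seconds between the patch set upload and Zuul's Verified+1.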
clarkb++23:02
opendevreviewMerged opendev/zuul-providers master: Add import timeout for rax classic  https://review.opendev.org/c/opendev/zuul-providers/+/94716323:02
corvusthis is so fast :)23:02
clarkbzoom zoom23:02
clarkbwould be good to exercise it with a change that runs jobs too23:02
corvusLast reconfigured:23:03
corvusMon, Apr 14, 2025 11:02 PM23:03
corvusthat change is in production :)23:03
clarkblet me send the meeting agenda then I can look to see if I have any changes.23:03
clarkbhrm nothing easily convenient to update. I guess I can push something throwaway23:06
opendevreviewClark Boylan proposed opendev/system-config master: DNM rebuild gitea  https://review.opendev.org/c/opendev/system-config/+/94716523:08
clarkbthat should be a sufficiently interesting change to exercise things23:08
clarkband it queued up and has node requests which implies the merge happened? So ya seems like things are streaming in23:08
clarkband jobs are running23:13
