Monday, 2025-04-21

*** darmach9 is now known as darmach11:43
clarkbinfra-root I'm getting ready to put the review servers in the emergency file and do presyncing of the data (which should dramatically speed up the actual syncs during the downtime)13:50
clarkbhttps://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 is our document13:51
fungicool, i'm around but on a conference call for the next hour13:51
clarkbone thing I realized is that the manage-projects git repo cache and json state cache are not synced as part of this migration. That should be fine as they are both caches13:52
clarkbbut I wanted to call it out as something that we might see later when manage-projects runs for the first time. In particular I would expect it to be slow the first run13:52
fungiah, yeah, cold cache. we could copy it over, but it won't slow down the outage if we don't, just slow down the first mp run13:57
clarkbya I think we ignore it for now and we can try and copy it over later if it becomes a problem14:01
clarkbindex has presynced, git is presyncing now14:01
clarkbthen I'll also copy the replication config over but not put it in place yet.14:02
clarkbI think /opt/lib/jeepyb is the manage-projects cache. Probably copying over the project.cache json blob is sufficient then let it grow out the git repo/ref cache over time as things change relative to that? I don't know14:06
clarkbbut as mentioned I think it should be safe for us to proceed as is and just let it build out that cache in the first place14:06
fungishould we do a service notice sometime in the next hour reminding people about the upcoming outage and pointing to https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/ ?14:07
clarkb++14:07
fungiand yeah, i think we perform a cold cache test of manage-projects in zuul jobs, right?14:08
fungi(because there is no jeepyb cache on the test nodes)14:08
fungiservice notice Reminder: The Gerrit service on review.opendev.org will be offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/14:10
fungisomething like that?14:10
clarkb"yes" I'm not sure the regular system-config-run-review jobs use manage-projects at all. But the separate jeepyb gerritlib integration job does. However this jeepyb gerritlib integration job uses a much smaller projects.yaml that isn't the same as our production one14:11
fungis/^service notice/status notice/14:11
clarkbthe notice lgtm14:11
fungicool14:11
clarkbgit presync took almost 10 minutes but a rerun took only a few seconds14:12
clarkbso I think we're in good shape there now14:12
fungiperfect14:12
clarkbnow to copy over the replication config so that it is ready for us later14:13
fungi#status notice notice Reminder: The Gerrit service on review.opendev.org will be offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/14:13
opendevstatusfungi: sending notice14:13
-opendevstatus- NOTICE: notice Reminder: The Gerrit service on review.opendev.org will be offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/14:13
clarkbthe replication config is stashed in ~gerrit2/tmp/clarkb/replication.config for now14:16
opendevstatusfungi: finished sending notice14:16
clarkbI guess I can go ahead and clear out the cache files on review03 now since gerrit is stopped there14:17
clarkbfungi: for later: maybe you can be set up to force merge https://review.opendev.org/c/opendev/zone-opendev.org/+/947137 to ensure it goes in ahead of the hourly jobs at 1600 UTC (maybe merge it at 15:59?)14:20
clarkbthen I'll be ready to shutdown gerrit on review02 as soon as that deploys and sync data across14:20
fungiyeah, can do14:21
clarkblooking at /usr/local/bin/manage-projects it bind mounts root's ssh known_hosts file which appears to have the record in it that we need for the ssh connection to work. It also bind mounts /opt/lib/jeepyb which does not exist on review03 yet. It isn't clear to me if docker/podman will create that directory as part of the bind mount process or if we need to create it first14:31
clarkb"If you use --volume to bind-mount a file or directory that does not yet exist on the Docker host, Docker automatically creates the directory on the host for you. It's always created as a directory."14:33
clarkbso in theory I think this should work as is14:34
clarkbI guess we can sync over the manage-projects cache after we have review03 up and running since manage-projects shouldn't run until we land the switch to pull it out of review-staging. Maybe that is the safest thing to do?14:38
clarkbI guess let's worry about that when the initial migration is done14:38
fungiyeah, seems like something to double-check later today15:01
fungii've got the submit queued up for 947137,1 in about 45 minutes15:14
clarkbthen do we want to resend your notice at 1600 too?15:17
fungiyeah15:26
fungii can do that once i merge the change15:26
fungistatus notice notice The Gerrit service on review.opendev.org is offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/15:27
clarkb++15:27
fungislight edit to bring that one into the present-tense15:27
clarkbthen I'll wait for that to roll out and for DNS to start resolving the new location before I stop gerrit on 02 and begin data syncs15:27
fungiin theory we shouldn't need to wait for either of those15:28
clarkbwe need to wait long enough that the chagne replicatiosn and zuul can fetch it15:28
fungithe moment the deploy finishes we should be able to go ahead15:28
clarkb*that the change replicates15:28
clarkbbut ya once we're far enough along it should be fine to proceed15:28
fungiwell, that yes, but that will happen before the deploy completes15:29
clarkb++15:29
fungiwhereas change in dns resolution won't start to propagate until after it's deployed to the server15:29
clarkbfwiw I was thinking something like dig review.opendev.org @ns03.opendev.org just to verify the dns update occurred but not necessarily wait for google et al to see it15:30
fungiyeah, wfm15:32
fungiin theory that should show the new value shortly before the deploy job completes15:33
fungiinfra-root: 10 minutes to maintenance15:50
clarkbI'm ready now :)15:51
clarkbI'm staged to stop gerrit on 02 and dump the sql db there. Then on 03 there is a screen where I'll run the data synchronization from15:53
clarkb(no screen on 02 as its just stopping the service and performing the db dump)15:53
fungithanks, attached15:54
fungimerging the dns change now15:58
opendevreviewMerged opendev/zone-opendev.org master: Switch review.o.o to review03  https://review.opendev.org/c/opendev/zone-opendev.org/+/94713715:59
fungi#status notice notice The Gerrit service on review.opendev.org is offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/15:59
opendevstatusfungi: sending notice15:59
clarkbdeploy jobs have started15:59
-opendevstatus- NOTICE: notice The Gerrit service on review.opendev.org is offline for a server replacement maintenance between 16:00-17:00 UTC today per https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/message/D6VGKXHKXCV6TD6MFJY4H4KQBIM3AQYI/15:59
clarkbside note in the future maybe we should do things at the half hour so that we avoid hourly jobs entirely?16:01
fungifair point16:01
opendevstatusfungi: finished sending notice16:02
clarkb`dig review.opendev.org @ns03.opendev.org` seems to show me the new record now16:02
clarkband that deploy job was a success. Should I want for bootstrap bridge for hourlies to finish or just proceed with stopping services now?16:02
clarkbs/want/wait/16:02
fungii would just proceed16:03
clarkbI guess I should proceed since dns updates can be seen on that server at this point anyway and there is no gerrit on 03 running16:03
clarkbdoing so16:03
clarkbgerrit is down and mariadb is up16:04
clarkbwill do the db dump momentarily (want to update the etherpad progress notes)16:04
fungii've been crossing out what you've done too16:05
fungilooks right16:06
clarkbdb is synced and loaded. proceeding to the rsyncs now16:08
fungik16:08
fungiseems done16:10
fungilooks like i would expect16:10
clarkbindex and git should be synced now16:11
clarkbfungi: were you able to follow ^ that or do you want to double check in scrollback?16:11
fungii was able to follow it16:12
clarkbcool and you're happy that db, git, and indexes all synced? If so I'll copy the replication config over and I think we can turn on gerrit16:12
fungiyeah, all looks right16:13
fungireplication config lgtm too16:13
clarkbok ready for me to start gerrit on 03?16:14
fungiyep, go for it16:14
clarkbcool just updated the etherpad to reflect we're about to start gerrit.16:15
clarkbI'll start gerrit momentarily16:15
clarkb[2025-04-21T16:15:49.548Z] [main] INFO  com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.10.5-1-g47283ba335-dirty ready16:16
clarkbunfortunately our old fail to ssh friend is still there16:16
clarkbso IP change didn't help that16:16
fungiPowered by Gerrit Code Review (3.10.5-1-g47283ba335-dirty)\16:16
fungiwebui is up16:16
clarkbI'm going to work on logging in and all that now16:16
fungii was able to log in16:17
fungipulling up changes and diffs is working for me16:17
fungivery snappy16:17
clarkbI was able to log in as well16:18
corvusi just did some searching by files and it's working (so i guess that confirms some indexes)16:18
corvusgertty is happy16:19
clarkbcorvus: any chance you've checked zuul yet?16:19
corvusnope but i can16:19
clarkbif we think zuul is happy we can approve https://review.opendev.org/c/opendev/zone-opendev.org/+/947614 and see that zuul ci testing and replication to gitea etc all work16:19
fungiyeah, my gertty went "offline" for a bit during the outage but came right back on its own without a restart16:20
corvuszuul01 is getting stream events16:21
clarkbcorvus: cool I think we approve 947614 then16:21
corvuslooks pretty normal to me16:21
fungiopenstack tenant isn't very busy, but that's to be expected16:21
clarkbI approved that change16:22
clarkbits gate job enqueued16:22
fungiyeah, picked it up right away16:22
fungiseems the job is running normally16:22
opendevreviewMerged opendev/zone-opendev.org master: Update the SOA serial  https://review.opendev.org/c/opendev/zone-opendev.org/+/94761416:23
fungiwatching for `host -t soa opendev.org ns03.opendev.org` to update16:23
clarkbhttps://opendev.org/opendev/zone-opendev.org/commits/branch/master shows that replicated16:24
fungicurrently serving 1744647079 but should switch to 1744911189 during deploy16:24
fungiopendev.org has SOA record adns02.opendev.org. hostmaster.opendev.org. 1744911189 3600 600 864000 30016:25
clarkbgreat that roughly verifies code review approval, zuul gating, zuul merge, zuul deploy of system-config stuff16:26
fungialso 3 minutes from approval to serving a dns update is... impressive16:26
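The SOA check above can be scripted. The following is a minimal sketch, assuming the single-line `host -t soa` output format quoted in the log; the helper names `soa_serial` and `serial_is_current` are hypothetical, and the plain integer comparison ignores serial-number wraparound.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of any OpenDev tooling.

# Read `host -t soa` output on stdin and print the zone serial.
# Expected line shape (as quoted in the log):
#   opendev.org has SOA record adns02.opendev.org. hostmaster.opendev.org. 1744911189 3600 600 864000 300
# The serial is the 7th whitespace-separated field of that line.
soa_serial() {
    awk '/has SOA record/ { print $7; exit }'
}

# Succeed when the served serial has reached the expected one.
# Assumption: simple numeric comparison, no wraparound handling.
serial_is_current() {
    local served="$1" expected="$2"
    [ "$served" -ge "$expected" ]
}

# Usage sketch (needs network access, so it is left commented out):
#   served=$(host -t soa opendev.org ns03.opendev.org | soa_serial)
#   serial_is_current "$served" 1744911189 && echo "zone updated"
```

Querying the authoritative server directly, as done here, avoids waiting on resolver caches to expire.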
clarkbthe next big step is https://review.opendev.org/c/opendev/system-config/+/947139 but because that touches inventory/.* we will trigger manage projects. So before we get to 947139 do we want to sync /opt/lib/jeepyb?16:28
clarkbpart of me is leaning towards that sync just to avoid errors in manage-projects due to timeouts16:28
clarkbor other unexpected behavior16:28
fungii was going to say it would be good to exercise it with a cold/missing cache, but you make a good point about job timeouts16:28
clarkbwhile y'all think about that I'm going to manually disable the replication config on review0216:28
clarkbreplication config on review02 is unconfigured (I left it in the state 03 was in before the migration. Top level configs in place but removed all the targets)16:30
clarkb/opt/lib/jeepyb is owned by root not gerrit2 so the key setup I used to sync the other data may not work.16:31
clarkbfungi: ^ any good ideas for the best way to sync that? It is small so should be quick once we have a plan for it16:32
clarkbhttps://review.opendev.org/c/starlingx/test/+/947751 is a newly pushed patchset16:33
clarkbI'm going to double check that replicated16:33
corvusm-p is idempotent right, so if it does error we can "just" keep running it?  i don't feel strongly either way.16:34
clarkbhttps://opendev.org/starlingx/test/commit/ac386f4a548e95ac32a789bae8612d004b28909e seems to show that replicated16:35
clarkbcorvus: yes it should be. The idea is it checks state against gerrit if the local cache doesn't know the answer to things like what is the hash of the refs/meta/config16:35
clarkbcorvus: so it should slowly rebuild that cache from scratch if all goes as expected and we don't sync the cache16:35
fungiprobably an easy way to copy it is to just tar it up on 02, scp that to 03 and untar there16:37
clarkbthe original use of tar16:37
fungithat way we can preserve ownership without needing to worry about which accounts have access to ssh16:37
clarkbI marked check pushing new patchsets and checking replication as done via starlingx/test 947751,316:38
fungiwell, no, the original use of tar was as a writer for tape archives ;)16:38
fungihence its name16:38
clarkbfungi: any chance you want to sync /opt/lib/jeepyb?16:38
clarkbyou can use /home/gerrit2/tmp/ as the staging location for the tarball on 02 then gerrit2 can fetch it from there16:38
fungii can, sure16:38
fungiit's tiny anyway, under a megabyte16:39
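The tar-based copy discussed above might look roughly like the sketch below. The function names are hypothetical, the copy step between hosts is left as a comment because the exact command used is not in the log, and recorded ownership is only restored when the extraction runs as root.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Pack a directory into a tarball. -C keeps archive paths relative to the
# parent directory so the archive can be unpacked under any destination.
pack_dir() {
    local src="$1" tarball="$2"
    tar -cf "$tarball" -C "$(dirname "$src")" "$(basename "$src")"
}

# Unpack under a destination directory. -p preserves permissions; ownership
# recorded in the archive is restored only when this runs as root.
unpack_dir() {
    local tarball="$1" dest="$2"
    tar -xpf "$tarball" -C "$dest"
}

# Print only the SHA-256 digest of a file, for comparing both sides.
file_digest() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Usage sketch (paths taken from the log; the transfer step is an assumption):
#   on review02, as root:  pack_dir /opt/lib/jeepyb /home/gerrit2/tmp/jeepyb_cache.tar
#   copy the tarball to review03 (for example with scp as gerrit2)
#   on review03, as root:  unpack_dir tmp/jeepyb_cache.tar /opt/lib
#   on both hosts:         file_digest /opt/lib/jeepyb/project.cache
```

Comparing digests on both hosts confirms the file arrived intact without relying on timestamps.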
clarkbthis isn't on the etherpad but I've run gerrit show-queue just to check that the server isn't running unexpected tasks or falling behind16:40
clarkbit looks correct to me16:40
clarkbI've also spot checked the "reviewed" flag on older changes that I know I've reviewed and those flags are present so the sql db sync appears to have worked16:42
fungi`tar tf tmp/jeepyb_cache.tar` as gerrit2 on review03 shows expected file list16:42
clarkboh huh are those directories empty?16:42
clarkbI guess we create a scratch space to do the refs/meta/config updates that we don't keep around then record the hash in the json blob16:43
fungiyeah, i wonder if we should have copied something else16:43
fungithe /opt/lib/jeepyb/* directories on review02 are definitely all empty too16:43
clarkbthe project.cache file timestamp looks about right16:43
clarkband /usr/local/bin/manage-projects does -v/opt/lib/jeepyb:/opt/lib/jeepyb for bind mounts so I think the path is correct16:44
clarkbI think we just don't keep as much data there as I expected and the json blob is the interesting thing?16:44
fungi/opt/lib/jeepyb/project.cache.old i guess16:46
clarkbfungi: /opt/lib/jeepyb/project.cache16:46
fungiaha16:46
fungi/opt/lib/jeepyb/project.cache.old must be cruft from long ago16:46
fungilast modified over 2 years ago, yep16:47
clarkbI guess that gives us a third option we could sync just /opt/lib/jeepyb/project.cache16:47
clarkbso that we've got an up to date json blob with refs/meta/config hashes we don't have to recalculate16:47
clarkbbut then let jeepyb build out whatever scratch workspace it needs from there16:47
clarkbI kinda like that as a halfway step16:48
clarkbfungi: cool I see that file extracted now16:49
clarkband looks like you hashed it and confirmed it matches what is on 0216:50
clarkbgiven that should I go ahead and remove my WIP from https://review.opendev.org/c/opendev/system-config/+/947139 and remove review03.opendev.org from the emergency file?16:50
fungiyeah, let's see what happens16:50
clarkbok wip is removed and review03 is not in the emergency file. review02 is in the emergency file and will stay there16:51
clarkbanyone else want to review 947139 or should I approve it?16:52
clarkbI guess I should approve it16:55
clarkbthis is done. You have a few minutes while that gates to stop it16:55
fungithanks!16:57
clarkbonce that is done I have ~three changes in mind for the longer term followup. First is replacing sighup with sigint (this is already written). Then a change to pull review02 from the inventory. Then finally a change that depends on both of those prior changes that removes the docker compose version specifier so that we stop getting warnings about that from the new host17:02
clarkbthose are longer term because I don't think we're in a huge rush to land any of them. Timing is likely largely based around when we're comfortable we won't rollback for some reason17:03
clarkboh then we can also move the gerrit images to quay17:03
clarkbhourly deploy just completed so when 947139 lands it will dive right in. I think manage-projects occurs near the end though fwiw17:09
fungiall three of those sound good, yes17:09
fungis/three/four/17:09
opendevreviewMerged opendev/system-config master: Unstage review03.opendev.org  https://review.opendev.org/c/opendev/system-config/+/94713917:10
clarkboh that didn't trigger quite as many jobs as I expected. but manage-projects did queue up17:10
fungieven better17:10
clarkbinfra-root now that we're on the noble host if you need to run `docker-compose down` for some reason to shutdown gerrit on 03 you'll see that it sits there waiting and waiting. The reason is we have a 5 minute timeout for gerrit to stop and we issue a sighup that never reaches the container due to apparmor rules. If you do this you can switch to another prompt and `ps -elf | grep17:23
clarkbgerrit | grep jdk` to get the pid of the java process then `sudo kill -HUP $pid` and docker-compose down will see the container has shutdown and proceed as usual17:23
clarkbI have a change up to switch us over to sigint which should just work then you won't have to think about this anymore17:23
clarkbbut calling this out now in case that sigint change doesn't merge before you need this trick17:24
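The PID lookup described above could be wrapped as in this sketch. The function name `gerrit_java_pid` is hypothetical; it assumes the standard `ps -elf` column layout, where the PID is the 4th column, and it skips awk lines so the filter does not match its own process.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Read `ps -elf` output on stdin and print the PID (4th column) of the first
# process whose line mentions both "gerrit" and "jdk", mirroring the
# `grep gerrit | grep jdk` chain. Lines mentioning awk are skipped so the
# filter's own command line cannot be picked up by mistake.
gerrit_java_pid() {
    awk '/gerrit/ && /jdk/ && !/awk/ { print $4; exit }'
}

# Usage sketch (left commented out because it signals a live process):
#   pid=$(ps -elf | gerrit_java_pid)
#   [ -n "$pid" ] && sudo kill -HUP "$pid"
```

Checking that the PID is non-empty before calling `kill` avoids sending a signal with a missing argument.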
clarkbmanage-projects should be just a minute or two away now17:25
clarkbit just started17:26
clarkbI'm tailing the log on bridge. It runs against gitea first. Also I think ansible may buffer the entire output of manage-projects and not write it until its done?17:26
clarkbits done. I'm like 90% certain it nooped as expected but the log isn't small so digging in a bit more now17:29
fungithe log is very verbose. it lists tons of things it decides not to do17:30
corvussigint lgtm17:31
clarkb`grep manage_projects /var/log/ansible/manage-projects.yaml.log | grep -v 'Processing project' | grep -v 'skipping ACLs'` the only output from this is that it wrote its cache file at the end17:31
clarkbI think that confirms it nooped17:31
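The no-op check above can be kept as a small reusable filter. This is a sketch, assuming the log path and the two filtered phrases quoted in the grep command; the function name `interesting_lines` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is illustrative.

# Print manage_projects log lines that are neither per-project progress
# messages nor ACL skips. If only the final cache write remains, the run
# most likely changed nothing.
interesting_lines() {
    local logfile="$1"
    grep manage_projects "$logfile" \
        | grep -v 'Processing project' \
        | grep -v 'skipping ACLs'
}

# Usage sketch (log path taken from the command in the log):
#   interesting_lines /var/log/ansible/manage-projects.yaml.log
```

Filtering out the known-noisy messages is easier to audit than searching for every message that would indicate a change.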
corvushttps://review.opendev.org/94754017:32
clarkband ya I think ^ should be safe to land before we decide to remove review02 since sigint should work with the old docker compose setup too17:33
clarkbbut the other changes I need to write are largely going to wait on us being ready to cleanup the old server I think17:33
clarkbI think we're in a good spot now. I haven't seen anything concerning. Review02 gerrit is shutdown and in the emergency file. Review03 is now acting like a normal gerrit server with that last change landing. I'm going to take a break for something to eat (I skipped breakfast) then will dive into the todo list at the end of the etherpad17:35
clarkbthank you everyone for the extra set of eyeballs.17:35
corvusclarkb: thank you for the planning and doing of the work! :)17:57
fungiyes, thanks!!!18:05
fungiwent very smoothly, and we were able to keep the outage to about 15 minutes at the start of the window18:05
fungiextreme success18:05
opendevreviewClark Boylan proposed opendev/system-config master: Add review03 to a couple of places that were missed  https://review.opendev.org/c/opendev/system-config/+/94775718:09
opendevreviewClark Boylan proposed opendev/system-config master: Remove review02 from the inventory  https://review.opendev.org/c/opendev/system-config/+/94775818:09
opendevreviewClark Boylan proposed opendev/system-config master: Drop docker-compose version specifier for Gerrit  https://review.opendev.org/c/opendev/system-config/+/94775918:09
clarkbinfra-root like 947540 I expect 947757 to be a safe change to land now. It adds the server to a couple places I missed initially. Nothing major but we probably want to fix that before we worry about cleaning up the server in the subsequent change18:09
opendevreviewClark Boylan proposed opendev/system-config master: Migrate gerrit images to quay.io  https://review.opendev.org/c/opendev/system-config/+/88290018:22
opendevreviewClark Boylan proposed opendev/system-config master: Fix gerrit upgrade testing  https://review.opendev.org/c/opendev/system-config/+/94776118:24
clarkbinfra-root 947761 is another thing I noticed that should be safe to land whenever as only upgrade testing of gerrit is really affected18:25
clarkband then 882900 is a restoration and rebase of the old change that would've put gerrit images on quay.18:26
clarkband with those changes I think I'm caught up on followup changes18:27
clarkbnow to sync the user cleanup info from my homedir on the old host to the new host18:27
clarkbreview03:~clarkb/gerrit_user_cleanups should now contain the content from review02:~clarkb/gerrit_user_cleanups18:31
clarkbI feel like I'm largely caught up now and just waiting on reviews for the safe changes18:33
clarkbgiven that I'm going to unpause ubuntu-noble image builds now18:33
clarkb#status log Migrated Gerrit services from review02 to review0318:35
opendevstatusclarkb: finished logging18:35
fungiyay!18:35
clarkb#status log Unpaused ubuntu-noble and ubuntu-noble-arm64 nodepool image builds18:35
opendevstatusclarkb: finished logging18:35
fungialso i think i've reviewed and +2'd all the outstanding related changes as of moments ago18:35
clarkbthanks. Looks like I can approve a few of them. I'll approve the sigint change and the fixup to add review03 in a few places18:37
corvusyou got a pair of +2s on all of them.  sounds like a winning hand18:38
clarkbthe quay change is broken. Its pulling the base images from quay which it shouldn't. I'll fix that18:38
fungigerrit-base?18:39
clarkbfor anyone following along I approved https://review.opendev.org/c/opendev/system-config/+/947761 https://review.opendev.org/c/opendev/system-config/+/947757 and https://review.opendev.org/c/opendev/system-config/+/947540 I think they are mostly disjoint enough to be safe to land together or separately etc so just sent it18:40
fungiwe do still have old copies of that on quay too, but i guess the change doesn't start updating those?18:40
clarkbfungi: no the python base images18:40
fungiaha18:40
fungibecause originally we had moved everything over18:40
fungipython-builder and python-base both apparently18:41
opendevreviewClark Boylan proposed opendev/system-config master: Migrate gerrit images to quay.io  https://review.opendev.org/c/opendev/system-config/+/88290018:41
clarkbya but then we found the problem and now the new approach is to do it bit by bit as things move to noble18:41
fungiwe had also been publishing a literal gerrit-base image to quay18:42
clarkbya we'll resume doing that with 88290018:42
clarkbthats the image that has java and everything but the war in it. Then we have the version specific images that drop the way in18:43
clarkbs/drop the way in/drop the war in/18:43
fungiah, we start from python-base as gerrit-base and then amend it and publish the latter18:43
clarkbya we build on the python image to add in java there18:44
clarkbsince we need both java and python. Then we mix in the specific version of gerrit that we want in the final image18:44
fungivia system-config-upload-image-gerrit-base18:44
fungiokay, makes sense18:44
fungiis there a reason we don't publish python-builder and python-base to both dockerhub and quay, so builds for dependent images going to quay can use it from there?18:45
corvuswe do have it in opendevmirror; zuul uses it18:46
clarkbprobably the main thing is the docker hub pipeline is different than the quay pipeline. I think we could switch docker hub pipeline over to the container generic pipeline too though18:46
clarkbbut basically ^ its just more work right now probably18:47
fungimakes sense18:47
corvushttps://quay.io/repository/opendevmirror/python-base?tab=tags18:47
corvusswitching to mirror would reduce the docker calls, but we'd lose stacked image testing18:48
clarkbthings are still gating. I'm going to eat lunch while I wait19:11
fungii'm around in case any of them goes sideways19:11
clarkbhttps://zuul.opendev.org/t/openstack/build/8521e3cbdb584dd897f73794e0c36879/log/job-output.txt#17743 is an interesting result from the quay.io change19:46
clarkbI suspect but can't say for sure that we maybe didn't use the newly built base image there and instead used the old stale base image that is on quay19:47
clarkband then that explodes when trying to run the resulting war under that java version?19:47
clarkbthis isn't urgent though so I'm just going to ignore that for now. I think we can fetch the images locally and inspect them directly to determine what is going on there19:48
fungithe newly-built vs stale gerrit-base image?19:48
clarkbfungi: ya that error is saying the java can't run the classes in the war I think. So I'm thinking the java version is older than expected so we fetched whatever is actually on quay and not built by https://zuul.opendev.org/t/openstack/build/c0fb6820e42049a3b7cc3c7f7058ee8a19:49
clarkbwe should be using the newly built image in that buildset but maybe the jobs aren't configured correctly to do that or something19:49
clarkbbut since this is all down the road followups I'm also thinking punt for now19:49
fungisystem-config-build-image-gerrit-3.10 needs to run with the artifact system-config-build-image-gerrit-base creates, right? or are they only combined when system-config-run-review-3.10 runs?19:50
clarkbright system-config-build-image-gerrit-3.10 should run with the image built by system-config-build-image-gerrit-base19:52
clarkbwithin the same buildset19:52
clarkbhowever I suspect this hasn't happened based on that error19:52
fungisystem-config-build-image-gerrit-3.10 started after system-config-build-image-gerrit-base completed, so it had the opportunity at least19:53
clarkbhttps://zuul.opendev.org/t/openstack/build/c0fb6820e42049a3b7cc3c7f7058ee8a/log/job-output.txt#2699-2758 is where we push to the buildset registry19:55
corvusi see that job running the docker-image/pre.yaml playbook; maybe it should run container-image/pre.yaml ?19:55
clarkbhttps://zuul.opendev.org/t/openstack/build/26a4c797e04f4f9e90a279d6da54bab8/log/job-output.txt#1404-1531 is where we fetch it19:56
clarkbcorvus: aha that may cause it19:56
fungiah, right, if it's not using podman then it's going to pull from dockerhub19:57
fungier, from quay directly19:57
clarkbcorvus: where do you see that? I'm looking at the console on the jobs and both seem to use container-image?19:57
fungiinstead of the intermediate, because it can't handle proxy locations for anything other than dockerhub19:58
fungiintermediate or buildset19:58
corvushttps://zuul.opendev.org/t/openstack/build/8521e3cbdb584dd897f73794e0c36879/console19:58
corvus3rd pre playbook19:58
clarkbah in the run job19:59
clarkbI do wonder if actually the problem is using docker to build the image since as fungi says that will not use the mirror unless we're using buildah or buildx?19:59
clarkband then separately we may also need to update the run job as well?19:59
clarkbthe three followup changes should land shortly. I'll watch them and check things like the docker compose file update20:02
clarkbI don't recall needing to switch that for system-config-run-paste and from testing it seemed to work when we moved it. But maybe that is also broken20:05
clarkbdefinitely worthy of followup but will have to be another time20:05
corvusmaybe we just accept things are squirrely until everything in a given stack is switched over20:06
clarkbya that might be a reasonable option depending on the underlying issue20:06
clarkbthe remaining gitea build is about to run testinfra so hopefully it finishes soon20:17
opendevreviewMerged opendev/system-config master: Use sigint instead of sighup to stop gerrit  https://review.opendev.org/c/opendev/system-config/+/94754020:24
opendevreviewMerged opendev/system-config master: Add review03 to a couple of places that were missed  https://review.opendev.org/c/opendev/system-config/+/94775720:24
opendevreviewMerged opendev/system-config master: Fix gerrit upgrade testing  https://review.opendev.org/c/opendev/system-config/+/94776120:24
clarkbstop_signal is now SIGINT on disk and the running containers appear to have been left alone. Job is still running for 947540 though20:28
fungii guess we'll want to do a live test of it at some point when things are quiet20:28
clarkbya but I think that is safe to do another day as well. I have that held node where I tested it a lot (admittedly primarily with kill not docker compose but still)20:29
fungiagreed20:29
clarkbmanage projects ran again and again appears to have noop'd20:42
clarkbonce these changes deploy I'm going to work on getting a meeting agenda out. Please let me know if you have any edits/updates/deletions/etc to add20:44
clarkbbut then I may call it early today. i had an early start and its hard to context switch into something else after doing gerrit all day20:45
clarkball three of those changes have deployed now and all three buildsets report success20:52
fungigood idea20:53
clarkbI made note of this in #openstack-infra but new noble images have built and are in the process of uploading to the various cloud regions20:54
clarkbI don't really know how to test things for neutron. I'm guessing specific jobs should be run but if I tried rechecking stuff now it would be hit and miss for getting on the new image20:54
fungii expect they'll let us know if things are (still/re)broken21:01
clarkbya I think the main risk is we build a second new image before they can test and they both end up broken21:03
clarkbbut noble pushed new kernels that supposedly fix this and we can't stay in the past forever. It has been almost a month. If things are still broken knowing that and finding workarounds is probably the right thing21:04
fungiexactly21:08
clarkbhttps://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting has the updates I wanted to apply to it in place21:10
clarkbanything else?21:10
corvuslgtm; we can talk about who has signed up for image building21:12
clarkboh ya let me add that21:12
opendevreviewJames E. Blair proposed zuul/zuul-jobs master: Add win_zuul_console to start-zuul-console  https://review.opendev.org/c/zuul/zuul-jobs/+/94776921:27
clarkbI'll send out the meeting agenda at 22:00 UTC21:36
clarkblooking at the build and system-config-run jobs in https://review.opendev.org/c/opendev/system-config/+/947010 for hound I'm like 95% certain that that image was properly used from the intermediate build registry and not the two year old image that is on quay22:03
clarkbmy suspicion with the gerrit change is that because docker image builds are trying to use the speculative state we're running into a problem22:03
clarkbbut its fine on the runtime side when we use podman as is22:04
clarkbya I think the missing piece is that use-buildset-registry also needs to configure /etc/buildkitd.toml or we can switch from docker to podman for the builds maybe22:12
opendevreviewClark Boylan proposed opendev/system-config master: Migrate gerrit images to quay.io  https://review.opendev.org/c/opendev/system-config/+/88290022:20
clarkbthat attempts to build with podman as the build command which should in theory recognize the mirror config that is set up for us22:21
clarkbcorvus: also note that https://zuul.opendev.org/t/openstack/build/8521e3cbdb584dd897f73794e0c36879/console#2/0/13/localhost this seems to be how we preload things for system-config-run jobs that use a mix of docker and podman things22:24
clarkbso I think the system-config-run jobs are ok, but only when artifacts are setup to preload the images?22:24
clarkbtimes like this I wish the docker image system was less implementation specific (everyone has their own config files and behaviors...)22:25
clarkbok now to get the meeting agenda out22:26
clarkbfungi: I've detached from the review03 screen. I think we can clean it up. If you agree I can attach later and exit the shells in there22:30
clarkbor feel free too22:31
fungii agree, i already detached a couple hours ago22:31
clarkbcool I'll probably get to that tomorrow morning. I'm going to wind down my typing activities for the day. I had to reload keys just to push that 882900 update22:32
clarkbwhich is a sign I've had my keys loaded for long enough. This way I can go on a bike ride too22:32
fungiindeed22:32
opendevreviewMerged openstack/project-config master: storlets: Add gerritbot notification about stable branches  https://review.opendev.org/c/openstack/project-config/+/94746922:54
opendevreviewMerged openstack/project-config master: zaqar: Fix missing notification about stable branches  https://review.opendev.org/c/openstack/project-config/+/94746722:55
opendevreviewMerged openstack/project-config master: watcher: Remove notification of puppet-watcher  https://review.opendev.org/c/openstack/project-config/+/94746622:57
opendevreviewMerged openstack/project-config master: Remove telemetry groups  https://review.opendev.org/c/openstack/project-config/+/94676723:00
opendevreviewMerged openstack/project-config master: Only pause update_constraints.sh when needed  https://review.opendev.org/c/openstack/project-config/+/94654123:00
opendevreviewMerged openstack/project-config master: Charms: add review priority to charms repos  https://review.opendev.org/c/openstack/project-config/+/94238123:00
opendevreviewMerged opendev/gerritlib master: Run the Gerritlib Jeepyb Gerrit integration job on Noble  https://review.opendev.org/c/opendev/gerritlib/+/94440723:01
opendevreviewMerged openstack/project-config master: Remove qdrouterd role from the config  https://review.opendev.org/c/openstack/project-config/+/93819423:08
opendevreviewMerged openstack/project-config master: Move OSA sync to integrated repository  https://review.opendev.org/c/openstack/project-config/+/94762823:08
opendevreviewMerged openstack/project-config master: Deprecate openstack-ansible-tests repository  https://review.opendev.org/c/openstack/project-config/+/94762923:08
opendevreviewMerged opendev/system-config master: Publish hound container images to quay  https://review.opendev.org/c/opendev/system-config/+/94701023:17
