Tuesday, 2024-12-17

clarkblast call on meeting agenda content. I'll get that sent out shortly00:16
tonybnothing from me00:17
fungii have nothing else to suggest either00:18
clarkbok I'll send it nowish then00:21
opendevreviewTony Breeds proposed openstack/diskimage-builder master: Add a tool for displaying CPU flags  https://review.opendev.org/c/openstack/diskimage-builder/+/93783604:52
opendevreviewJoel Capitao proposed openstack/project-config master: Authorize packstack-core to force push to remove branch  https://review.opendev.org/c/openstack/project-config/+/93779208:27
opendevreviewJoel Capitao proposed openstack/project-config master: Authorize packstack-release to delete  https://review.opendev.org/c/openstack/project-config/+/93779208:41
*** ykarel_ is now known as ykarel12:28
NeilHanlonclarkb, fungi: at this time Rocky is not intending to diverge from the x86-v3 baseline that RH has decided on for v10.. though if a SIG wanted to, I'd support it... we're already planning on having RISC-V support for Rocky 10 via a SIG, so...15:02
clarkbNeilHanlon: I doubt we'd drive such an effort. Mostly just curious what sorts of choices others are making15:15
karolinku[m]if, after this CPU flags info is updated, it appears that x86-v3 is available, would you consider creating a label which would work with C10?15:17
clarkbkarolinku[m]: I think that depends a lot on what we discover. My primary concerns are that availability will be extremely limited, and as mentioned yesterday I feel like that is ok for minor features that are minimally used, which projects can live without simply by having longer jobs, vs an entire platform that only works on more specialized hardware15:28
clarkbI also have a concern that this effectively means centos 10 can only run on what is likely our most performant hardware15:29
clarkbit's one thing to turn off nested virt support and force projects to use emulation or rewrite tests to approach the problem another way. It is another to say "you can't have centos 10 stream anymore today because a cloud is no longer present"15:30
clarkbdisk use of gitea09 and paste seems stable even with the logging changes (that's good)17:08
clarkbI'll see how things are looking after my morning of meetings and we can probably proceed with a gerrit restart at that point17:08
opendevreviewMerged openstack/project-config master: Authorize packstack-release to delete  https://review.opendev.org/c/openstack/project-config/+/93779217:20
opendevreviewTony Breeds proposed openstack/diskimage-builder master: Add a tool for displaying CPU flags  https://review.opendev.org/c/openstack/diskimage-builder/+/93783619:00
tonybkarolinku[m]: Can you rebase your CS-10 testing patch ontop of ^^20:04
tonybkarolinku[m]: note my change doesn't *do* anything to ensure x86_64-v3 nodes but it makes it very clear if you got one.20:04
clarkbtonyb: karolinku[m] I would continue to use the nested virt labels for now since that constrains the problem space a bit and it would be good to know which clouds among those are not capable20:08
clarkbthen once we've understood that subset we can go to the wider labels20:08
tonybclarkb: you wanted me to include KVM info alongside the cpu-level. Are you thinking capturing the output of `qemu-kvm --version` and `qemu-system-$(arch) --version` is adequate?20:09
clarkbya that seems adequate. But also we can probably just check the packaged versions too and not have an explicit check20:10
clarkb7.2 and newer can emulate haswell (though I don't know what the underlying cpu requirements for that are)20:10
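The version checks tonyb describes can be sketched in shell. Which of these binaries exists varies by distro (`qemu-kvm` is RHEL-ish; Debian/Ubuntu images usually only ship `qemu-system-*`), so this probes both defensively — the binary names present on any given test node are an assumption:

```shell
# Probe the emulator versions discussed above; treat a missing binary as
# informational rather than an error.
for cmd in qemu-kvm "qemu-system-$(arch)"; do
    if command -v "$cmd" >/dev/null 2>&1; then
        printf '%s: %s\n' "$cmd" "$("$cmd" --version | head -n1)"
    else
        printf '%s: not installed\n' "$cmd"
    fi
done
```

Capturing this in the job output is enough to later correlate node labels with x86-64-v3 and KVM capability.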
tonybclarkb: Yup I agree, I also have in my head a poorly formed request about adding a label explicitly for "foundational support" for RHEL-10-like distros.  NOT for general CI of those OSs20:12
tonybso we'd know that we can build images, but not actually build and add them to our clouds20:12
tonybwith my thinking being that as we do get clouds and adequate quota so that these machines are more common we could potentially add CI20:14
tonybalthough I haven't thought much about if that's actually helpful20:14
Clark[m]tonyb: for the foundational stuff I wondered if we could just qemu emulate haswell to check a build and not do functests20:21
Clark[m]We already emulate those VM boots for checking iirc20:21
Clark[m]So it shouldn't be any slower20:21
tonybClark[m]: That's certainly possible20:22
clarkbinfra-root there is nothing in the openstack gate or release pipelines. I've just notified the release team that I'll be restarting Gerrit shortly20:55
clarkbif there are no objections I'll start on that in a couple of minutes by sending a notice first then figure out the commands I need to run while that posts20:56
tonybno objection from me20:57
clarkbI'm not pulling new images or anything like that just a docker-compose down; mv the waiting queue task files for replication plugin then docker-compose up -d20:58
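A minimal sketch of that restart sequence. The compose directory and the replication plugin's waiting-queue path are illustrative assumptions here, not the production layout:

```shell
# Down the containers, set the replication plugin's queued task files
# aside so they are not replayed on startup, then bring Gerrit back up.
cd /etc/gerrit-compose                          # hypothetical compose dir
docker-compose down
mv /home/gerrit2/review_site/data/replication/ref-updates/waiting \
   /home/gerrit2/tmp/replication-waiting-$(date +%Y%m%d)
docker-compose up -d
```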
clarkb#status notice Gerrit will be restarted to pick up a small configuration update. You may notice a short Gerrit outage.20:59
opendevstatusclarkb: sending notice20:59
-opendevstatus- NOTICE: Gerrit will be restarted to pick up a small configuration update. You may notice a short Gerrit outage.21:00
opendevstatusclarkb: finished sending notice21:03
clarkbok proceeding with the restart now. There is a root screen on review02 if anyone else is interested but this should be quick21:03
clarkbweb ui is up but I'm still waiting for diffs (this is expected)21:05
clarkbthe /var/log/containers content for the db container appears to have updated as expected21:06
clarkbstill waiting for files/diffs21:09
JayFhow long does it usually take for the diffs to start back working again?21:12
clarkbusually about 5 minutes; it's going long this time but I'm not seeing anything yet indicating why21:13
clarkbit does prune caches on startup and I suspect that is related21:14
clarkbseems like it may have stopped responding too?21:15
JayFI'm seeing the same. 21:15
clarkbarg I don't understand what is going on yet; the error log is largely devoid of anything related21:15
clarkbthere are some ssh timeouts21:15
clarkboh now the error log is logging rejected connections over http so that explains that at least21:16
clarkbnot the underlying cause but it is logging it 21:16
clarkbfrom what I can see gerrit is rejecting http connections so then the apache proxy is responding with 502 bad gateways21:20
clarkbgerrit show-queue shows a gerrit_file_diff pruning task started around the restart21:22
clarkbI suspect this is related to the lack of diffs21:22
clarkbhowever it isn't clear to me yet why this has snowballed into gerrit rejecting http connections. maybe all of us trying to load diffs filled up all the http slots?21:22
corvusclarkb: i'm around and can take a look21:23
clarkbcorvus: thanks. the gerrit_file_diff cache is 61GB on disk21:24
clarkbthe last time we pruned it was 0100 and gerrit says it was 3GB at that time21:25
clarkbhowever this is an h2 db so the on disk size might be much larger than the actual cache content size?21:25
hasharyup :)21:25
hasharunless you get it vacuumed from time to time21:25
clarkbno vacuuming that i know of21:25
clarkbthere are a number of other stuck tasks in the queue as well so I'm guessing that things got stuck in gerrit running startup tasks and that is leading to other things not proceeding21:26
hashardo you have some monitoring in place?21:27
clarkbhashar: we don't use the prometheus plugin type stuff if that is what you are asking. And I think we removed that one java monitoring plugin when log4shell happened21:28
clarkb(possibly bad) ideas: we could stop gerrit and start it again to clear out the tasks and/or try manually killing tasks21:28
clarkbif we stop and start gerrit again we could potentially move the git file diff cache aside to see if bringing that up clean is happier21:28
hasharthe prometheus plugin and the grafana dashboard maintained somewhere upstream   have very nicely helped us21:29
corvusi ran a show-caches command a few mins ago; it hasn't returned yet21:29
clarkbcorvus: ya that one is slow21:29
corvusclarkb: do you mean we removed java melody?21:29
hasharshow-caches runs a full gc iirc21:29
clarkbcorvus: yes java melody was the one I couldn't remember the name for21:29
clarkblooks like gerrit is responsive now fwiw21:30
corvuslosing javamelody is sad; it has been very helpful to get a stack trace when something was stuck... :/21:30
clarkbI wonder if the show caches running a gc tripped something over into working again? or it could be that we just needed to wait21:30
clarkbcorvus: agreed but it also did scary things with the logging methods iirc so we removed it at the time. We could probably add it back now?21:30
corvuswell my show-caches hasn't returned yet21:30
clarkbgerrit has cleaned up the logging stuff quite a bit21:30
JayFMy diff loads (well, at least once) now, fwiw.21:31
corvusi don't see a significant difference in the queue21:31
clarkbcorvus: ya the show queue output looks very similar to when it was being sad21:32
hasharfor the H2 backed cache: the database will grow up over time and get fragmented as bits are written/removed from it21:32
corvusso if it's responsive, i'm guessing it's just slowly working through a backlog while still performing the pruning21:32
clarkbhashar: is the suggestion that you delete the backing files occasionally?21:32
hasharon startup, Gerrit only allows 200ms to compact the database which is certainly not long enough and if the host does not restart often, it is essentially never cleaned21:32
corvus"does not restart often" matches here :)21:33
hasharon our setup we have been transferring the caches over and over and had massive multiple-GB caches which eventually one day ended up filling the disk21:33
hasharI probably spent two weeks of my life debugging it, exactly two years ago21:33
hasharmy fix was to set `-Dh2.maxCompactTime=15000`  which sets the time allowed to compact the db to 15 seconds21:34
hasharand after some restart the files were smaller21:34
hasharI wrote about my debugging at https://phabricator.wikimedia.org/phame/post/view/300/shrinking_h2_database_files/21:34
hasharthe raw journal is in https://phabricator.wikimedia.org/T323754 21:35
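hashar's workaround is a JVM system property, so it has to reach the Gerrit java process somehow. One hedged sketch — `container.javaOptions` is a real gerrit.config knob, but whether this particular container entrypoint honors it, and the config path itself, are assumptions about the deployment:

```shell
# Append the H2 compaction-window property to Gerrit's JVM options so the
# driver gets 15s (instead of the tiny default) to compact on connect.
git config -f /home/gerrit2/review_site/etc/gerrit.config \
    --add container.javaOptions "-Dh2.maxCompactTime=15000"
```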
clarkbhashar: did the problems with those caches eventually lead to slow startup times like we observed?21:35
hasharbut I don't think we had any performance issue. The files were just super large21:35
clarkbI'm wondering if this is related or we may have a different more urgent problem that we need to debug first then get back to the gerrit cache stuff21:35
hasharI don't think that slowed the startup times21:35
hasharat least I don't remember that to have been an issue21:35
hasharnor that anything has improved after having the files compacted21:36
hasharour diff caches went from 12G and 8.2G down to 0.5G21:36
clarkblooks like index tasks for some changes show up in the queue then go away21:36
corvusclarkb: i'm going to attempt to get a stack trace in the container21:37
clarkbbut the ones at the top of the queue listing from around when we restarted are slow and still in there21:37
clarkbcorvus: ok21:37
corvusloving "ps: command not found"21:38
hashar(for the H2 cache, Gerrit 3.10 has a system running the H2 cache pruning on a daily basis at 1:00 (UTC I think) https://gerrit-review.googlesource.com/Documentation/config-gerrit.html#cachePruning21:38
clarkbthere is a growing set of errors in the error log for a user attempting to push to starlingx/metal21:38
clarkbbut it almost looks like they tried to push and got an error but then gerrit eventually caught up and now they get no new changes errors?21:39
corvusclarkb: /home/gerrit2/review_site/logs/jstack.log21:40
clarkbcorvus: looks like the gerrit_file_diff is RUNNABLE in that list21:42
clarkbsystem load has shot back up and I think http is unhappy again21:42
clarkbthough maybe less unhappy than before seems things eventually loaded for me just slowly21:42
corvusi'm worried that there may be a deadlock...21:43
clarkbcorvus: the queue dropped down quite a bit actually21:43
corvuslemme put some things together21:43
clarkback21:43
clarkbgerrit_file_diff and git_file_diff are quite large on disk21:45
clarkbeverything else appears to be under 10GB21:45
corvusi think a lot of the index commands are waiting on my show-caches command to finish21:47
corvusso that's clearly a dangerous command to run :(21:47
corvusthat thread is runnable though; it's apparently reading the h2 db.21:48
clarkbcorvus: ah ok your show-cache isn't showing up anymore and the index tasks are no longer in there21:48
corvusoh yep it finished21:48
clarkbso theory time: the cache pruning is very expensive on very large backing files and things may get caught up in that?21:48
clarkbI suspect, though I don't know for sure, that we can remove the cache files and have gerrit start those caches over again21:49
clarkbhashar: ^ do you know?21:49
clarkbthe git_file_diff backing file is even larger21:49
hashargerrit_file_diff and git_file_diff hold `git diff` output21:50
clarkbbut I'm kind of thinking it may be prudent to stop gerrit, move those files aside / delete them, then start gerrit back up again21:50
hasharor something like that, so they are necessarily super large21:50
clarkbhashar: ya but 61GB and 222GB large?21:50
clarkbthe actual data in them is about 3GB21:50
hasharoh21:50
hasharis the cache pruning showing in the show-queue output or the jstack?21:51
clarkbhashar: both21:51
hasharhttps://gerrit-review.googlesource.com/Documentation/config-gerrit.html#cachePruning21:51
hasharlooks like it default to be enabled on startup21:51
hasharso potentially cachePruning.pruneOnStartup=false ?21:51
clarkbya and before 3.10 it only ran on startup21:51
clarkbwell we do want to prune, I think, to try and keep disk usage under control over time21:51
corvusi did another jstack dump in jstack-2.log21:52
corvusthe disk cache pruner thread stack is different, so it's doing something :)21:52
clarkbok that is good to confirm21:53
hasharmy fix was to raise it to 15 seconds to let it prune the caches (`java -Dh2.maxCompactTime=15000`)21:53
hasharand I haven't looked at whether that had any effect with the newer "pruneOnStartup"21:53
clarkbhashar: I think that is a good followup though I am concerned it won't be sufficient with things being as large as they are21:53
clarkband instead wondering if we can just delete the caches. I want to say we can and gerrit generates new empty caches on startup21:54
hashartrue yes21:54
clarkbbut probably a good idea to move the cache aside rather than delete it21:54
clarkbthen once gerrit is up and happy delete it21:54
corvusclarkb: i think you are correct; i think all the h2 stuff can be considered ephemeral and i'm fairly sure an individual missing h2 db would be created empty.21:54
clarkbI'm kinda thinking we should maybe take the hit of moving the files aside then given their size and apparent impact on startup21:55
hasharthere is one h2 backed db which is not a cache though21:55
hasharI think the one storing whether a given file has been reviewed21:55
clarkboh we're processing git_file_diff according to show queue now so it is done with gerrit_file_diff21:55
corvusyep21:55
clarkbhashar: we store that one in mariadb21:55
corvusand the web ui is pretty responsive21:55
hasharso it took like 20 minutes to clear the git_file_diff?21:56
hashar+1 on mariadb :)21:56
clarkbhashar: more like 45 minutes for gerrit_file_diff I think21:56
hasharhas the file become smaller at least?21:56
clarkbno21:57
clarkbcorvus: do you think stopping gerrit, moving the two massive diff caches aside then starting gerrit again is something we should try and if so should we do that nowish?21:58
clarkbit does feel like this is related to gerrit becoming overly occupied with cache maintenance on startup leading to an inability to process other tasks/requests21:58
clarkbI suspect that growth of those files will be more constrained in the future too since we have daily pruning now when before it was only pruning on startup22:00
corvusclarkb: i think we're going to get out of this eventually and the system is sufficiently operable at the moment that we don't need to do that.22:01
corvusbut maybe taking the hit of that now with a few days before holidays sets us up to be more resilient if there is another problem later?22:02
clarkbcorvus: that is/was one of my thoughts though I suspect if there is a reason to restart gerrit later we can simply bundle the cache moves into that effort22:03
corvusbasically -- i'd say let's do that friday, if friday weren't like the last day anyone would be around for a while.  so maybe now is better if we want to just nip it in the bud.22:03
clarkbya I think the main reason to do it now would be to see that we can restart gerrit safely with a known process before we have holidays22:03
corvusi'm on board with that line of thinking22:03
tonyb++22:03
corvusclarkb:  all clear from me whenever you want to do that22:04
clarkbok the process I'm thinking is we down gerrit, move the replication waiting queue aside, move the gerrit_file_diff and git_file_diff files out of the cache and into /home/gerrit2/tmp (this prevents them from being backed up but is the same fs so it should be an immediate mv of large files), then up gerrit22:05
clarkblet me get commands written down for all that then we can send a notice and try again22:05
corvusoh look there's a tempfile22:05
corvus-rw-r--r-- 1 gerrit2 gerrit2 237595340800 Dec 17 22:05 git_file_diff.h2.db22:05
corvus-rw-r--r-- 1 gerrit2 gerrit2   1517463392 Dec 17 21:50 git_file_diff.698130932.385.temp.db22:05
corvusi wonder if that can be used to gauge progress22:06
corvusanyway -- main thing i was looking for is to "mv gerrit_*_diff.*" out of the way -- to make sure we get all the related files.22:07
corvusuh not sure if that made it through the bridge right, but you get the idea.22:07
clarkbya basically get the lock file and the trace file and the tempfile if present22:07
corvus++22:08
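Spelled out, the "get all the related files" move might look like the following — the cache and tmp paths are assumptions, and Gerrit must be stopped first:

```shell
# Move every file belonging to the two oversized caches out of the cache
# directory; the glob picks up the .h2.db file plus any .lock.db,
# .trace.db, and *.temp.db companions mentioned above.
cd /home/gerrit2/review_site/cache
for cache in gerrit_file_diff git_file_diff; do
    mv "$cache".* /home/gerrit2/tmp/
done
```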
clarkbsomeone want to work on sending a status notice? I should be ready by the time that gets through22:09
clarkbactually I think I'm ready now22:10
clarkbhow about #status notice You may have noticed the Gerrit restart was a bit bumpy. We have identified an issue with Gerrit caches that we'd like to address which we think will make this better. This requires one more restart22:11
hashar+1 :)22:11
corvus++22:11
clarkbok sending that now22:11
clarkb#status notice You may have noticed the Gerrit restart was a bit bumpy. We have identified an issue with Gerrit caches that we'd like to address which we think will make this better. This requires one more restart22:11
opendevstatusclarkb: sending notice22:11
-opendevstatus- NOTICE: You may have noticed the Gerrit restart was a bit bumpy. We have identified an issue with Gerrit caches that we'd like to address which we think will make this better. This requires one more restart22:12
hasharfun: your git_file_diff has a diskLimit of 2G and gerrit_file_diff of 3G22:12
clarkbhashar: ya that was based on a single day size22:12
clarkbwhich is what the docs suggest22:12
hasharthe cache pruning that happens on startup does some magic sql queries to keep those caches under those limits22:13
clarkbwell we had the default limits which were 128MB iirc, then we did a restart a day after a prior restart and based the sizes on the reported prune size from the second restart22:13
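For reference, the diskLimit values being discussed live in per-cache sections of gerrit.config; a minimal sketch (the section names are the cache names reported by Gerrit, the values are the ones quoted above):

```ini
[cache "gerrit_file_diff"]
    diskLimit = 3g
[cache "git_file_diff"]
    diskLimit = 2g
```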
hasharif you have the h2 files on disk that are severely larger than those (230G and 61G), then my guess is you suffered from the same issue I have encountered: the database needs to be compacted22:13
hasharwhich is a different mechanism than the cachePruning one22:13
clarkbright its the content vs the backing file problem. But now pruning happens daily which is new in 3.10 so maybe that will keep it under control22:14
corvusit would do that only if pruning also compacts the db?22:14
hasharthat is only keeping the data under those 2G and 3G limits22:14
corvusi think hashar is talking about something like postgres vacuuming22:14
hasharthe compaction is at a lower level (that is the H2 driver itself)22:15
hasharyeah same as vacuuming22:15
opendevstatusclarkb: finished sending notice22:15
corvusso we'd need a "compact h2" system cron job22:15
hasharI knew of the concept after Sqlite Vacuum22:15
hasharand eventually found H2 has the exact same logic but named Compact22:15
hasharwhen Gerrit connects to the H2 databases through the java driver, the driver does compact upon connection22:16
clarkbok proceeding22:16
hasharfor up to 20 ms22:16
corvusoh that's the timeout you mentioned22:16
hasharor up to system property `h2.maxCompactTime`  ms22:16
hasharyeah sorry I wasn't clear22:16
hasharso in 20 ms it can't vacuum much22:16
corvusno you were clear :)22:16
hasharI gave an arbitrary 15 seconds value, restarted Gerrit some times and eventually the file got smaller22:17
corvusso right after clarkb clears out the db file would be a really good time to bump that property22:17
hasharand if you have been carrying those h2 cache files over and over as we did22:18
hasharthen I guess you had the exact same issue I have encountered :b22:18
hasharever growing caches!22:18
clarkboh oops it's already coming back22:18
hasharwhich makes me regret to not have pushed that further upstream to have a nice solution implemented22:18
clarkbbut we can deal with that some time  later as in theory we have time for that22:18
hasharhave you nuked both h2 files?22:18
corvusclarkb: yeah i think we just do that today/tomorrow or maybe next year :)22:19
corvusbut that's a system-config change and i think we don't want to rush it22:19
clarkb++22:19
clarkbI have confirmed that new h2 backing files and lock files were created22:19
clarkband show-queue looks much much cleaner22:21
clarkband I can see diffs22:21
hasharcongratulations!22:22
clarkbthe disk cache pruning things that show up in show-queue are not running; they are the 01:00 scheduled tasks I believe22:22
clarkband distinct from the startup gerrit tasks which have all completed as far as I can tell22:22
corvusyep i saw some startup cache pruning that is done now22:23
clarkbI'm making notes now to followup with deletion of the large h2 backing files from the tmpdir and to look at the compaction option hashar mentioned22:25
clarkbthen I'll push an update to my podman docker compose change to make it mergeable22:25
clarkbwhich should be a good exercise of gerrit overall22:25
corvusi'll delete the jstack logs22:27
clarkbhashar: any idea if this is something upstream knows about? It does seem like h2 is a bad caching implementation if it grows indefinitely22:27
hasharI guess I will investigate the size of those file caches. At quick glance Wikimedia runs with the default of 10mb in memory and whatever the default is for disk based22:28
clarkbI think I saw there was someone looking at an alternative caching implementation I wonder if that is better22:28
hasharI might have mentioned it on the upstream Discord yeah22:28
hasharmy guess is Google is using something else22:28
hasharbut Sap surely relies on H222:28
clarkbalso fwiw I had originally thought that the caches being too small is what led to diffs not loading quickly on startup but now I'm wondering if that is purely related to the backing files not pruning quickly22:28
clarkband so in theory smaller caches are actually better?22:28
hasharso maybe worth filing a task for it or at least have the driver to compact things 22:29
hasharno clue22:29
clarkbwe may want to revert the changes to increase the cache sizes and clear out all of the h2 caches and reset our baselines22:29
clarkb(I don't think this is urgent)22:29
hasharmy guess is one want to look at the hit ratio22:30
opendevreviewClark Boylan proposed opendev/system-config master: Run containers on Noble with docker compose and podman  https://review.opendev.org/c/opendev/system-config/+/93764122:31
clarkbI think ^ should be mergeable now though it won't do anything until we have followups that run on noble. Probably also want to sanity check that the jobs it does run don't try to install and use podman22:32
hasharI am off, congratulations on fixing the Gerrit caches!22:32
clarkbhashar: thank you for the help!22:33
hasharyou are very welcome, I am always happy to help here22:34
hasharI guess take note of the magic compact time for next year22:35
clarkbyup I have written a note in my notes file locally22:35
hasharand do note https://phabricator.wikimedia.org/phame/post/view/300/shrinking_h2_database_files/22:35
clarkbdone added the link to my notes22:36
hasharalso the compaction time should be able to be passed to the java database driver based on my comment at https://phabricator.wikimedia.org/T323754#841974222:36
hasharjava -cp h2-1.3.176.jar org.h2.tools.Shell -url 'jdbc:h2:file:.git_file_diff;IFEXISTS=TRUE;MAX_COMPACT_TIME=15000'22:37
hasharsql> SHUTDOWN COMPACT;22:37
hasharThe file went from 9G to 302MB \o/22:37
hasharwhich is a HACKY way to compact it manually :b22:37
clarkbhashar: did you do that while gerrit was running?22:37
hasharnop22:37
hasharlocally22:37
hasharerr I mean independently22:38
hasharI think I copied the whole cache file on my local machine to ease debugging22:38
clarkbya that makes sense. seems likely that would conflict with gerrit h2 operations, especially since they set the default compaction time so short it must be something you don't want happening while stuff is running22:38
hasharbut potentially Gerrit could learn yet another setting that would be passed down to the jdbc connection url22:38
hasharbut I did not look into that since the java property was a quick/good enough fix22:39
clarkbmakes sense thank you for the pointers22:39
hasharat least I wrote a blog post! :b22:39
hasharwhat I wonder is why you would have the issue only triggering now22:40
clarkbhashar: I'm guessing the size went over some threshold that caused us to not trim within the timeout of some request or client and then that snowballed into many many requests22:41
clarkbit is interesting that Gerrit started almost immediately with no file diff delay etc after moving those files aside22:41
clarkbmakes me feel more confident this was related22:41
hasharmost probably yeah22:42
hasharthe cache pruning tries to keep your caches below 2G/3G apparently.  Maybe it does not run that often22:43
hasharanyway, no cache, no slowness :)22:43
hasharyou are probably set for the next ten years22:43
clarkbha indeed22:43
hasharI am off for real!22:43
clarkbgoodnight22:43
clarkbtimburke: I'm just now seeing your comments on the swift container deleter script that I wrote. Sorry, I have ignored that mostly because more important things have been fighting for my attention. I'll try to get back to that at some point as we still have need for it, and thank you for the feedback22:45
clarkbinfra-root I've confirmed that I can push a change, comment on a change, search changes, view diffs and other web ui actions. I've also fetched the new patchset I pushed from gitea implying replication is working22:47
clarkbthe only thing I haven't seen yet is a change merge22:47
clarkbI don't have any changes that are in a good spot to merge right this moment. Does anyone else have a change we can merge as a canary?22:48
tonybsorry I don't think I do22:50
corvusactually yes22:55
corvus+3 https://review.opendev.org/93597022:55
clarkbawesome thanks!22:56
clarkbcorvus: hrm that change isn't showing up in zuul22:57
clarkbbut zuul says it is starting the jobs so it must've seen it?22:57
clarkboh that's opendev zuul jobs not zuul/zuul-jobs22:58
clarkbI see it now22:59
clarkbalso mordred wandertracks has a couple of changes in the gate queued up forever, I wonder what is with that22:59
clarkbhttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_46d/937641/13/check/system-config-run-review-3.10/46d66cc/bridge99.opendev.org/ara-report/results/147.html I think that shows gerrit system-config-run jobs properly selecting the defaul.yaml from my updated install-docker role and not the noble specific file23:34
clarkbI see it installing docker-compose too and not docker compose23:34
clarkbcorvus: the opendev tenant also has some stuck promote jobs for an opendev/base-jobs merge from a month ago. I suspect we can simply evict those changes and the wandertracks ones and move on, but before we attempt that I wanted to make sure you didn't want to try and debug this23:36
clarkbthough I suspect log files may have rolled over23:37
opendevreviewClark Boylan proposed opendev/zuul-jobs master: Compress raw/vhd images  https://review.opendev.org/c/opendev/zuul-jobs/+/93597023:47
clarkbcorvus: ^ there was a post run failure and I think that should fix it23:47
clarkbI ended up using https://review.opendev.org/c/opendev/sandbox/+/934997 to test merging. That worked and it replicated here: https://opendev.org/opendev/sandbox/commits/branch/master23:49

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!