opendevreview | Tony Breeds proposed openstack/project-config master: [Discussion] Import rdoinfo from the rdoproject https://review.opendev.org/c/openstack/project-config/+/948033 | 00:04 |
opendevreview | Tony Breeds proposed openstack/project-config master: [Discussion] Import rdoinfo from the rdoproject https://review.opendev.org/c/openstack/project-config/+/948033 | 01:08 |
mnasiadka | corvus: good, so now that I understand how it works - I’ll work on the rest next week :) | 05:08 |
*** ykarel_ is now known as ykarel | 05:17 | |
frickler | infra-root: I noticed that the git gc cron is still running on both review02 and review03, but I assume it doesn't matter since the repos on 02 shouldn't get used anymore? | 05:54 |
frickler | but there is also a small regression in the content of those cron mails, on review03 there are lines like "text1\Mtext2", while for review02 those would appear properly formatted as two lines | 05:56 |
frickler | eh, s/\M/^M/ | 05:56 |
opendevreview | Lukas Kranz proposed zuul/zuul-jobs master: mirror-workspace-git-repos: Allow deleting current branch https://review.opendev.org/c/zuul/zuul-jobs/+/946033 | 07:05 |
*** rlandy__ is now known as rlandy | 11:28 | |
noonedeadpunk | Hey! Any idea why https://www.openstack.org/project-mascots is timing out? Or am I using the wrong page to look for project mascots? | 12:23 |
noonedeadpunk | As it also seems it went out of search engines as well | 12:23 |
fungi | frickler: sounds like stray carriage returns in the message body | 12:59 |
fungi | noonedeadpunk: i think there have been problems in the past with the vexxhost ceph object store that it's served from, i've seen it with the mascot logos and also member org logos on the openinfra site. i'll give the sysadmins and vexxhost folks a heads up | 13:00 |
fungi | oh, though the last time this happened (february) it was unrelated to vexxhost's ceph object storage, so i've initially just given the webdev contractors for the foundation a heads up | 13:07 |
fungi | noonedeadpunk: it's been loading intermittently for me, if you reload a few times you might get lucky, but also they've confirmed and are looking into the problem | 13:30 |
noonedeadpunk | It could also be because I'm in a more remote location than you... | 13:30 |
noonedeadpunk | thus cloudflare might be also taking longer to get the data | 13:31 |
noonedeadpunk | At least my success rate was around 0 last 5 or 6 attempts | 13:31 |
noonedeadpunk | thanks for escalation! | 13:32 |
fungi | noonedeadpunk: seems like they made some adjustments that should make it load more reliably through the cdn now if you want to try again | 14:05 |
fungi | they found some backend code that was taking too long to return and they're going to make it async to speed up loading | 14:06 |
noonedeadpunk | Yeah, it's waaay better now | 14:09 |
noonedeadpunk | I think it was loading all media before returning the page | 14:10 |
noonedeadpunk | or smth like that | 14:10 |
fungi | it still is for now, they simply increased some timeouts as a workaround while they refactor the code | 14:12 |
opendevreview | Ildiko Vancsa proposed opendev/system-config master: Remove logging from Kata IRC channels https://review.opendev.org/c/opendev/system-config/+/948081 | 14:35 |
clarkb | frickler: correct, the git gc runs against the local git repos so it's fine for that to keep running on review02 until we shut it down | 14:46 |
clarkb | frickler: I don't see what you mean about the ^M but maybe my mail client is rendering things even in the raw message view? | 14:47 |
clarkb | once I'm caught up on morning stuff I intend on approving https://review.opendev.org/c/opendev/system-config/+/947758 to remove review02 from our inventory | 14:54 |
clarkb | please say something if you think it is too early to do so (it will apply about 3 days after the server move and doesn't mean we'll delete the server yet. Just trying to get through things one step at a time) | 14:54 |
fungi | sounds great to me | 14:56 |
clarkb | at that point https://review.opendev.org/c/opendev/system-config/+/947759 should be safe too (makes the docker-compose.yaml docker compose specific) so I'll approve that one too | 14:57 |
frickler | clarkb: maybe, I'm using mutt | 15:34 |
fungi | i'm also using mutt, i'll check my cronspam folder | 15:39 |
clarkb | doesn't look like we get interleaved output either (we shouldn't; it's a find command that should run things serially) | 15:48 |
clarkb | but maybe it has to do with changes to git or find? | 15:48 |
fungi | some of the lines in the "Subject: Cron <gerrit2@review03> find /home/gerrit2/review_site/git/ ..." message are separated by a bare cr (^M) instead of an lf, specifically those that say "Expanding reachable commits in commit graph: ..." | 15:57 |
fungi | my guess is that something changed wrt the output of git tools on noble | 15:58 |
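For context, the cron being discussed walks the Gerrit repository tree and garbage-collects one repo at a time; a minimal sketch of that shape is below. The exact find predicates and gc options in the real cron entry are truncated in the mail subject above, so the ones here are assumptions.

```shell
# Hypothetical sketch of a serial per-repo gc pass over the Gerrit repos.
# The real cron command is truncated above, so these predicates and options
# are assumptions for illustration only.
find /home/gerrit2/review_site/git/ -type d -name '*.git' | while read -r repo; do
    # one repository at a time, so output from different repos should not interleave
    git -C "$repo" gc --auto
done
```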
clarkb | those lines get rendered with line feeds instead of a rendered ^M even in the raw viewer for me | 15:59 |
clarkb | but ya I'm not sure that is a regression | 15:59 |
frickler | oh, maybe the command really only sends ^M to have the output stay on the same line? | 15:59 |
fungi | most of the other lines are not strung together, e.g. the ones that list the repositories being processed | 15:59 |
frickler | I guess I can try running git-gc on some of those repos locally | 15:59 |
fungi | yeah, you'd need to get them into a not-yet-collected state though | 16:00 |
clarkb | I have approved https://review.opendev.org/c/opendev/system-config/+/947758 | 16:06 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM intentional Gitea failure to hold a node https://review.opendev.org/c/opendev/system-config/+/848181 | 16:10 |
frickler | hmm, whatever I do, git-gc gives me much more output than what I see in those mails. also it stops generating any output as soon as I try to pipe it somewhere for detailed inspection. anyway it doesn't seem critical, so I'll just leave it at that; I just stumbled over the difference when I saw the two mails side by side this morning | 16:15 |
fungi | frickler: you need stdin to not be a tty i think, or maybe stdout. try piping through cat and redirecting in from /dev/null? | 16:17 |
clarkb | an at now job might also somewhat replicate the cron env? | 16:17 |
fungi | also some of those lines might be on other fds, could be a combination of stdout and stderr are interleaved there | 16:17 |
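A rough way to approximate what the cron sees (stdin not a tty, both output streams captured) and to make any bare carriage returns visible, along the lines suggested above; the repository path is only an example.

```shell
# Run gc the way cron would: stdin redirected from /dev/null, stdout and
# stderr captured together; cat -A then shows bare CRs as ^M so they can be
# spotted. The repository path is only an example.
git -C /home/gerrit2/review_site/git/example.git gc --auto </dev/null 2>&1 \
    | cat -A > gc-output.txt
# An "at now" job runs with a cron-like environment and mails its output,
# which is another way to approximate the cron mail:
echo 'git -C /home/gerrit2/review_site/git/example.git gc --auto' | at now
```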
opendevreview | Merged opendev/system-config master: Remove review02 from the inventory https://review.opendev.org/c/opendev/system-config/+/947758 | 16:57 |
clarkb | that is ahead of hourly jobs. I'm keeping an eye on it as jobs run | 17:00 |
clarkb | other than removing review2 from zuul known_hosts that should largely be a noop though | 17:01 |
clarkb | I guess the docs get updated | 17:01 |
clarkb | deployment of 947758 reports success. I'll approve the change to edit the docker-compose.yaml file now | 17:39 |
clarkb | infra-root https://zuul.opendev.org/t/openstack/build/572f8d17b5f249e9a415e35572a30614/logs just had a post failure with no logs. I'm trying to find the executor that ran it now to see if this is a sad swift backend | 17:50 |
clarkb | it ran on ze03 and failed to upload to ovh bhs1 | 17:52 |
clarkb | it's possible this is a one-off so we shouldn't disable ovh yet. But we should monitor things and be prepared to do so | 17:53 |
clarkb | https://zuul.opendev.org/t/openstack/build/4ff5dc8ad91f47c29abeceac1070ba86/logs is another likely case. But I've only seen the two so far | 18:05 |
clarkb | that one ran on ze06 and also tried to upload to ovh bhs | 18:06 |
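A quick way to chase one of these down on an executor is to grep the debug log for the build UUID and look at the log upload steps; the log path here is an assumption about the deployment.

```shell
# Find the swift upload activity for the failed build on the executor that
# ran it; the debug log path is an assumption.
grep 572f8d17b5f249e9a415e35572a30614 /var/log/zuul/executor-debug.log \
    | grep -i -e swift -e upload
```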
clarkb | the ovh status page says object storage is degraded but I think that is for https://public-cloud.status-ovhcloud.com/incidents/491vx956zx6b which shouldn't affect us as we don't use s3 sdks | 18:09 |
clarkb | anyway if it gets persistently worse we can disable ovh bhs and possibly ovh gra | 18:09 |
fungi | i'll try to keep an eye out | 18:20 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update gitea web admin docs https://review.opendev.org/c/opendev/system-config/+/948116 | 18:34 |
clarkb | fungi: ^ this came out of testing the sync tag function on the held gitea. That gitea is here: https://104.239.175.21:3081 if you want to test too | 18:38 |
clarkb | looks like when you click the button it enqueues the task into a queue. If you navigate to monitoring -> queues on the left side there is a tag_sync queue which by default has 1 worker and up to 4 workers and the number in queue will jump up to 19XY then down to 0. This happens quickly on the test node so you have to go fast to see it there. I suspect it won't be so fast in | 18:39 |
clarkb | production | 18:39 |
clarkb | fungi: so now the question is do we just want to send it on the production node and trust that having things in a queue with a limited number of workers won't overwhelm it? | 18:39 |
clarkb | annoyingly the service log doesn't seem to report success or completion and cross checking the code I think that is expected (it reports errors) | 18:40 |
clarkb | so we'd be relying on that number-in-queue value to drop to 0 to know when it is done | 18:40 |
fungi | i suppose we could temporarily down gitea09 in haproxy first if we're especially concerned | 18:41 |
clarkb | ya maybe not a bad idea for the first one | 18:42 |
clarkb | my focus today is still getting those gerrit cleanups landed. So I'll recheck that change before going to lunch. But it's going slow enough that maybe after lunch I'll just go ahead and do that with 09 (pull it out of haproxy, click the button, wait for the queue to report 0 entries, then add back to haproxy) | 18:43 |
clarkb | then we can check the results and if they look good proceed to do the others | 18:43 |
clarkb | fungi: re 948116 I don't think this changed our security stance compared to the past | 18:44 |
clarkb | I'm just updating the docs so they are accurate for the current gitea deployment | 18:44 |
fungi | no, i agree, i was just pointing out that we do it for some other services, though they're more problematic than gitea | 18:44 |
clarkb | ah | 18:44 |
clarkb | ok version specifier removal change has been rechecked. I should look for food now | 19:05 |
fungi | that was the one where you first spotted the ovh log upload failure? | 19:08 |
fungi | bon appétit! | 19:08 |
clarkb | I've logged into gitea09, but socat on gitea-lb02 is giving me permission denied to the socket object for haproxy command | 20:22 |
clarkb | this server is still jammy not noble so not hitting the apparmor problems I don't believe | 20:23 |
clarkb | oh wait the path changed | 20:24 |
clarkb | yup /var/haproxy/run/stats was the old path but we moved it to /var/lib/haproxy/run/stats even on jammy just to keep in sync with noble | 20:25 |
clarkb | infra-root: gitea09 has been pulled out of rotation on gitea-lb02 so that I can click the button to resync git tags in repos on gitea09 | 20:26 |
fungi | sounds good | 20:26 |
fungi | and yeah, i recall having to change the command i was running from my shell history when we changed the socket path | 20:27 |
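For reference, the drain and re-enable go through the haproxy admin socket at the new path; a sketch is below, where the backend and server names are assumptions and would need to match the real haproxy configuration.

```shell
# Take gitea09 out of rotation via the haproxy admin socket (note the new
# /var/lib/haproxy/run/stats path); backend/server names are assumptions.
echo "disable server balance_git_https/gitea09.opendev.org" \
    | sudo socat stdio /var/lib/haproxy/run/stats
# ...do the maintenance, then put it back and check server states:
echo "enable server balance_git_https/gitea09.opendev.org" \
    | sudo socat stdio /var/lib/haproxy/run/stats
echo "show stat" | sudo socat stdio /var/lib/haproxy/run/stats
```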
clarkb | I think it is already about halfway done | 20:29 |
clarkb | it's done. So about 5-ish minutes total | 20:31 |
clarkb | https://gitea09.opendev.org:3081/openstack/openstack-ansible-haproxy_server/tags?q=ussuri-eol shows up now | 20:31 |
clarkb | noonedeadpunk: ^ fyi | 20:31 |
clarkb | spot checking zuul and nova I don't see anything going wrong either | 20:32 |
clarkb | I guess if we don't see any problems between now and tomorrow we can click that sync tags button on the other 5 backends | 20:33 |
clarkb | I will put 09 back into service with haproxy now | 20:33 |
clarkb | it's probably fine to do these without taking backends out of haproxy too since it was quick | 20:33 |
clarkb | I'm logged out of gitea09 now and everything should be back to "normal" | 20:37 |
clarkb | #status log Ran gitea tag synchronization on gitea09 via the web dashboard | 20:41 |
opendevstatus | clarkb: finished logging | 20:41 |
fungi | yeah, seems to have fixed it, lgtm | 20:48 |
clarkb | I noticed the archive queue has 100000 entries in it. This is probably related to why archives don't work anymore | 20:50 |
clarkb | I'm not going to debug that now. It's a known thing and when they do work they fill the disk which is worse | 20:50 |
opendevreview | Merged opendev/system-config master: Drop docker-compose version specifier for Gerrit https://review.opendev.org/c/opendev/system-config/+/947759 | 21:00 |
clarkb | that appears to have applied successfully | 21:06 |
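That change drops the obsolete top-level version: key from the compose file, which newer docker compose ignores and warns about; a quick sanity check on the host could look like the sketch below, where the compose file path is an assumption.

```shell
# Validate that the compose file still parses after removing the obsolete
# "version:" key; the file path is an assumption.
docker compose -f /etc/gerrit-compose/docker-compose.yaml config --quiet
```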
clarkb | I went ahead and approved the gitea docs update change | 21:08 |
opendevreview | Merged opendev/system-config master: Update gitea web admin docs https://review.opendev.org/c/opendev/system-config/+/948116 | 21:11 |
clarkb | fungi: I think we can consider the python3.12 update for gerrit base images tomorrow too | 21:14 |
fungi | yeah, that would be great, i plan to be around and can do the restart for it too | 21:15 |
clarkb | though we should check the git log on gerrit stable-3.10 before we do that just because we build from source | 21:15 |
clarkb | https://gerrit.googlesource.com/gerrit/+log/refs/heads/stable-3.10 it has never been an issue but I do try to check quickly for anything problematic these days | 21:16 |
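Skimming the branch before a rebuild is easy from a local clone; a sketch is below, where the range of interest (roughly since the last image build) is an assumption.

```shell
# Fetch upstream Gerrit and skim what landed on stable-3.10 recently; the
# time window standing in for "since the last image build" is an assumption.
git clone https://gerrit.googlesource.com/gerrit
cd gerrit
git log --oneline --since='2 weeks ago' origin/stable-3.10
```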
fungi | sure | 21:16 |
clarkb | looks like they are in the middle of debugging some change id lookup latency. The changes primarily seem to be around when you import changes/projects from other servers with different serverids | 21:19 |
clarkb | we have never done that so I suspect most of those changes are noops for us and we're good to go | 21:19 |
clarkb | but feel free to look it over and call out any concerns. I'm happy to dig in further too | 21:20 |
opendevreview | Merged opendev/system-config master: Remove logging from Kata IRC channels https://review.opendev.org/c/opendev/system-config/+/948081 | 21:28 |
clarkb | fungi: I don't think ^ restarted the container to pick up the new config | 21:57 |
fungi | oh, huh | 22:02 |
fungi | i wonder if we removed the automatic restart from the playbook and decided to just do manual restarts for safety? i'll take a look after dinner | 22:03 |
fungi | looks like https://opendev.org/opendev/system-config/src/commit/a4a885b/playbooks/roles/limnoria/tasks/main.yaml only effectively restarts the container on an image update? | 22:59 |
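That pattern is why the config change needed a manual bounce; a sketch of the manual restart follows, where the compose file location on eavesdrop01 is hypothetical.

```shell
# Manually restart the meetbot (limnoria) container so it re-reads its
# configuration; the compose file path is hypothetical.
sudo docker compose -f /path/to/limnoria/docker-compose.yaml restart
```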
fungi | #status log restarted meetbot container on eavesdrop01 to pick up configuration change from https://review.opendev.org/948081 | 23:02 |
opendevstatus | fungi: finished logging | 23:03 |