mnaser | error: RPC failed; curl 56 GnuTLS recv error (-54): Error in the pull function.\nerror: 20 bytes of body are still expected\nfatal: expected flush after ref listing\n | 02:04 |
---|---|---|
mnaser | We're seeing failed hits on opendev.org for our builds :x | 02:04 |
mnaser | Noticing some timeouts | 02:05 |
tonyb | mnaser: checking ... | 02:05 |
tonyb | mnaser: Definitely something going on | 02:09 |
mnaser | We haven't had any changes on our side. | 02:13 |
tonyb | The load-average on a few of the servers is "high" and there seems to be more than one user-agent crawling | 02:19 |
mnaser | ah, the classic. | 02:23 |
tonyb | I'm unable to really pinpoint any errors, the load is a little high but not problematically so | 03:04 |
tonyb | mnaser: If you see more timeouts can you capture the client ip/timestamp for me to do more detailed digging? | 03:04 |
mnaser | Sure. | 03:08 |
opendevreview | Dr. Jens Harbott proposed openstack/project-config master: Only pause update_constraints.sh when needed https://review.opendev.org/c/openstack/project-config/+/946541 | 05:36 |
zigo | Not sure who to ask, but the single-use codes from OpenInfraID are valid for only 10 minutes. That's the time it takes for mail to pass greylisting. Could this be increased to at least 30 minutes? | 09:29 |
tonyb | I understand your pain. That's one for fungi or clarkb to raise with tipit | 09:31 |
zigo | btw, what's the schedule URL for the PTG ? :) | 09:31 |
zigo | Found it ... | 09:32 |
*** ykarel_ is now known as ykarel | 12:52 | |
*** mrunge_ is now known as mrunge | 13:59 | |
opendevreview | yatin proposed opendev/irc-meetings master: Move Neutron CI Weekly meeting 1 hour earlier https://review.opendev.org/c/opendev/irc-meetings/+/946631 | 14:00 |
fungi | zigo: thanks! i brought the 600-second openinfraid otp timeout up with the developers who maintain it, and they've indicated it's easily configurable so we'll see what they feel comfortable increasing it to | 14:51 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP test gerrit on noble https://review.opendev.org/c/opendev/system-config/+/946637 | 15:03 |
clarkb | infra-root ^ I realized I should take a step back on replacing gerrit and ensure testing is happy with it but then also write out a plan and do more ansible behavior checking when adding a new server etc and dump that all into an etherpad we can look over before making potentially disruptive changes | 15:04 |
opendevreview | Merged opendev/irc-meetings master: Move Neutron meeting to 1300 UTC on Tuesdays https://review.opendev.org/c/opendev/irc-meetings/+/946063 | 15:06 |
opendevreview | Merged opendev/irc-meetings master: Move Neutron CI Weekly meeting 1 hour earlier https://review.opendev.org/c/opendev/irc-meetings/+/946631 | 15:07 |
clarkb | following up on sad gitea from last night all of the current backends seem to be operating nominally after a quick check. Let us know if that changes and I'm happy to dig in more | 15:09 |
fungi | clarkb: i think i've addressed your comments on https://review.opendev.org/946284 if you get a chance for another look | 15:11 |
clarkb | will do | 15:13 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update gitea to 1.23.7 https://review.opendev.org/c/opendev/system-config/+/946640 | 15:15 |
opendevreview | Merged opendev/yaml2ical master: Update Python versions and boilerplate https://review.opendev.org/c/opendev/yaml2ical/+/946284 | 15:28 |
opendevreview | Merged opendev/yaml2ical master: Address W504 linting rule https://review.opendev.org/c/opendev/yaml2ical/+/946285 | 15:28 |
clarkb | https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 gerrit planning document is coming together there. Bit of an outline for now as I'm paying attention to the foundation board meeting | 15:30 |
clarkb | as a reminder we will not have a team meeting today | 16:01 |
clarkb | infra-root https://gerrit-review.googlesource.com/c/homepage/+/464287 gerrit is discussing adding native support for jujutsu change-id headers | 16:30 |
clarkb | I left a couple comments there with a tl;dr of basically adding that support sounds great but not if it breaks existing users workflows (which it will as proposed so I suggested an alternative) | 16:30 |
*** diablo_rojo_phone is now known as Guest13175 | 16:32 | |
clarkb | jitsi meet just published a new release | 16:32 |
clarkb | meetpad02 and jvb02 are still in the emergency file so I think we're fine and won't update | 16:32 |
clarkb | ok https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 is starting to have some more interesting content like how do we want to handle manage-projects which will fail on the new server for a bit | 16:46 |
corvus | clarkb: i think https://gerrit-review.googlesource.com/c/homepage/+/464287/7/pages/design-docs/support-jujutsu/use-cases.md#45 may be the answer to the question in your comment? | 16:55 |
corvus | (also the next one) | 16:55 |
corvus | not sure if you're aiming for a solution that meets those criteria, or are advocating the criteria be changed (or maybe i misunderstand the criteria) | 16:55 |
clarkb | corvus: I'm saying that criteria is wrong/broken because it breaks Git users | 16:57 |
clarkb | I'm trying to argue that adding jujutsu support is fine as long as they do not break Git users who have used Gerrit for more than a decade | 16:57 |
corvus | got it | 16:57 |
clarkb | and I think that is possible if you have the commit hook respect the jujutsu change-id and add it to the commit message | 16:58 |
clarkb | I think this change is particularly problematic for OpenDev because I don't want to spend my time doing git and jujutsu compatibility support | 16:58 |
clarkb | which will be the case if gerrit allows jujutsu to push without a change-id in the commit message due to rebasing losing the header field | 16:58 |
corvus | yeah, that sounds like a reasonable adaptation. i do wonder if some of the pressure here is coming from the kernel and this is all really just an end-run around kernel devs opposition to Change-Id footers, which would make the current acceptance criteria non-negotiable | 17:00 |
fungi | clarkb: it looks like a primary reason for their proposal is to stop requiring the commit hook | 17:01 |
clarkb | ya I don't think a clear motivation is expressed beyond supporting jujutsu change ids | 17:02 |
corvus | it seems like the real technical issue is that jj change ids are not preserved on cherry-pick/rebase. if it weren't for that, then everything would work fine with gerrit and i could see us easily getting to a point where there's no footer at all anymore. they mention that as a future stretch goal in the doc. | 17:02 |
clarkb | corvus: yup | 17:02 |
corvus | probably no one wants to say :) | 17:02 |
fungi | in the "background" section it talks about the commit message hook being a constant source of confusion for new gerrit users | 17:02 |
corvus | if things happened in the other order: git was updated to add and not delete change-id headers, then everything would be a lot easier i think. | 17:02 |
corvus | git is a constant source of confusion for new git users | 17:03 |
fungi | yes | 17:03 |
corvus | also old ones | 17:03 |
clarkb | corvus: ++ though we'd need a minimum git version but at least then you could tell people there is a light at the end of the tunnel | 17:03 |
*** Guest13175 is now known as help | 17:03 | |
clarkb | I just think that someone deciding to use a different client breaking everyone using the old "official" (for lack of a better term) client is a terrible decision | 17:03 |
*** help is now known as Guest13179 | 17:04 | |
fungi | clarkb: first i've heard of it, but it looks like they consider jujutsu to be a completely separate version control system (somewhat compatible with and implemented on top of git), not merely a special git client | 17:05 |
clarkb | fedora, debian, and ubuntu lack jujutsu packages but tumbleweed has them | 17:05 |
*** Guest13179 is now known as diablo_rojo_phone | 17:06 | |
clarkb | fungi: yes jujutsu is its own DVCS system but in the case of Gerrit it will be speaking git 100% of the time aiui | 17:06 |
clarkb | the interop layer is jujutsu's support of git fetch and push protocols | 17:06 |
clarkb | anyway what that means is I as a tumbleweed user can trivially decide to use jujutsu to push changes to gerrit that people using debian, ubuntu, or fedora would either need to be "experts" to update by manually discovering the change id and adding it to commit messages or they would have to install jujutsu from somewhere else and switch tools | 17:08 |
clarkb | I'm all for supporting a different set of client tools but not at the expense of existing users is all | 17:08 |
clarkb | and I think that is achievable with a simple hook update | 17:09 |
fungi | though it would require jj users to still install and use gerrit's hook | 17:11 |
corvus | yeah, so all gerrit users are equal until git is updated to support jj change-ids | 17:12 |
clarkb | right | 17:12 |
corvus | maybe it's enough to still have the option to require the change-id footer in a particular gerrit install? so opendev has that option enabled but, like, companies doing kernel dev with gerrit don't? | 17:16 |
fungi | that is, of course, assuming jj even supports traditional git hook scripts (i suppose it must?) | 17:16 |
clarkb | ya looking further maybe that is the unstated issue | 17:17 |
clarkb | the jujutsu does point at the pre-commit tool (ugh) | 17:17 |
clarkb | *docs | 17:17 |
clarkb | corvus: ya maybe that is a workaround | 17:18 |
clarkb | thinking about this more it's also possible that we could update git review -d and -x etc to rewrite things when our users fetch stuff if the data isn't in the commit message | 17:20 |
clarkb | then at least our common tooling would mitigate things | 17:20 |
corvus | yeah, there's an old resolved comment about doing something similar with the download plugin | 17:20 |
clarkb | in https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 I've bolded the major open question I have at this point. Feedback on that would be appreciated. Otherwise I think I've collected enough info to at least boot the new server (but probably not add it to the inventory yet). I'll look into doing the server boot after lunch most likely | 17:27 |
opendevreview | Merged zuul/zuul-jobs master: Add role: `ensure-python-command`, refactor similar roles https://review.opendev.org/c/zuul/zuul-jobs/+/941490 | 17:37 |
clarkb | the gitea 1.23.7 change looks good https://review.opendev.org/c/opendev/system-config/+/946640 (I checked the system-config screenshot and the job passed) | 18:19 |
clarkb | thats another one I'm happy to babysit this afternoon | 18:19 |
clarkb | in prep for booting a new gerrit server I have created a new data volume for gerrit. I kept the same size (1024GB) and type (nvme) as the existing data volume. nvme appears to be the default anyway | 18:32 |
clarkb | I will boot with a 64gb root disk boot from volume to also match the old server. I think the argument that we're far less likely to have problems with bfv has won me over. I'm willing to deal with more painful recoveries if the chance of needing to do so in the first place is significantly less | 18:33 |
fungi | makes sense | 18:40 |
*** jamesdenton_ is now known as jamesdenton | 18:52 | |
clarkb | launching review03 failed when trying to set up the data volume mount path at /home/gerrit2 | 19:44 |
clarkb | looks like we leak the bfv volume when that happens | 19:45 |
clarkb | https://opendev.org/opendev/system-config/src/branch/master/launch/src/opendev_launch/mount_volume.sh#L26 this check failed and we exited 1. It was checking /dev/vdb. I notice that review02 has mounted /dev/vdc and wonder if it identified the wrong device name (that comes from the openstack apis) | 19:46 |
clarkb | I think the issue is that we run make swap before we run mount_volume.sh and make_swap uses /dev/vdb | 19:48 |
clarkb | one way to solve this would be to flip the order of the two scripts. Another is for me to attach the volume after launch node runs and more manually do things | 19:48 |
clarkb | I'm going to try swapping the order since I think that is better for us long term. But first I'm going to delete these volumes and start over (the scripts don't work if there is already content on the volumes) | 19:50 |
fungi | yeah, i've still never had the separate volume creation feature in launch-node match what i needed/expected | 19:50 |
fungi | and simply creating and attaching a volume after booting the server is simple enough anyway | 19:50 |
clarkb | ok both new leaked bfv and new data volume have been deleted | 19:53 |
clarkb | starting over now | 19:53 |
fungi | cpython 3.14.0a7, 3.13.3, 3.12.10, 3.11.12, 3.10.17 and 3.9.22 were just tagged | 19:55 |
fungi | 3.14.0a7 is the final alpha, next prerelease will be 3.14.0b1 | 19:55 |
opendevreview | Clark Boylan proposed opendev/system-config master: Create swap after dealing with data volumes https://review.opendev.org/c/opendev/system-config/+/946709 | 19:59 |
clarkb | that change got me past the previous error. I don't know why the bfv volume wasn't automatically cleaned up | 19:59 |
clarkb | we do delete the server then attempt to delete the volume but we got this error: "Invalid volume: Volume status must be available or error or error_restoring or error_extending or error_managing and must not be migrating, attached, belong to a group, have snapshots or be disassociated from snapshots after volume transfer." | 20:02 |
clarkb | I suspect that we're not waiting long enough for the volume to detach and become available after the server is deleted. We could add a busy wait | 20:02 |
clarkb | it looks like it tries to delete all volumes attached to the server though which may not be what we want either | 20:03 |
clarkb | hrm | 20:03 |
fungi | i wonder if we should be explicitly enabling the delete-on-termination feature for bfv instances | 20:07 |
clarkb | maybe. That gives us less flexibility for having the image hang around after we delete a server | 20:08 |
clarkb | we would have to snapshot I think (which isn't the end of the world) | 20:08 |
fungi | yeah, i mean, a normal non-bfv server deletes its rootfs when the server instance is deleted, so it's no worse | 20:10 |
opendevreview | Clark Boylan proposed opendev/zone-opendev.org master: Add review03 to forward DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/946711 | 20:13 |
clarkb | maybe only approve ^ when you are happy with the state of the server. Then I can ask for reverse ptr records at that point. I'm going to work on the inventory addition change which I will wip as well since that needs careful consideration | 20:14 |
clarkb | then I'll look at improving launch node further | 20:14 |
fungi | logging into it now | 20:14 |
fungi | server (ubuntu version, filesystems, ram/swap) all lgtm | 20:15 |
fungi | approved the dns addition | 20:16 |
fungi | both ip addresses were correct and working for me to ssh as well | 20:16 |
fungi | make sure to check talos/senderbase or something similar for the addresses being on relevant e-mail blocklists | 20:17 |
fungi | since we do send a fair volume of notifications from gerrit | 20:17 |
fungi | i'll check as well | 20:17 |
clarkb | fungi: yes I checked the two links that launch node emits and they were both clear | 20:18 |
fungi | cool | 20:18 |
clarkb | they are spamhaus queries by ip | 20:18 |
fungi | cisco talos reputation lookups are clean for both the v4 and v6 addresses | 20:20 |
opendevreview | Merged opendev/zone-opendev.org master: Add review03 to forward DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/946711 | 20:22 |
opendevreview | Clark Boylan proposed opendev/system-config master: Add new Noble review03 to the inventory https://review.opendev.org/c/opendev/system-config/+/946637 | 20:22 |
clarkb | I'm going to WIP ^ for now | 20:23 |
clarkb | that change needs much more careful review | 20:23 |
fungi | sure | 20:23 |
fungi | as for the manage-projects question in the etherpad, i had some followup questions | 20:23 |
fungi | yeah, so given your confirmation of my suspicion, i'm in agreement on your preferred solution | 20:24 |
clarkb | fungi: yup just responded on the etherpad. Since I wrote that original set of questions and ideas down I discovered the review-staging group which 946637 puts review03 into | 20:25 |
clarkb | its there to solve this exact problem | 20:25 |
clarkb | mnaser: frickler fyi https://github.com/openstack/kolla-ansible/pull/52 | 20:26 |
mnaser | oh, didnt we have something that autoclosed it? | 20:26 |
clarkb | mnaser: oh sorry that was meant for mnasiadka | 20:26 |
clarkb | mnaser: we do, but the autocloser doesn't bring it to the attention of project members unless they are subscribed to github events. I strongly encourage project membership subscribe to github events if they replicate to github | 20:27 |
clarkb | mnaser: I did have a question for you though. What is the best way to request reverse PTR records for the new review03 gerrit server | 20:27 |
clarkb | mnaser: you can see the new A and AAAA forward records in https://review.opendev.org/c/opendev/zone-opendev.org/+/946711/1/zones/opendev.org/zone.db | 20:28 |
fungi | or https://opendev.org/opendev/zone-opendev.org/src/branch/master/zones/opendev.org/zone.db#L690-L691 now as well | 20:29 |
clarkb | fungi: did you want to review https://review.opendev.org/c/opendev/system-config/+/946709 that is what is currently in place on bridge due to my manual edit and it seemed to work | 21:05 |
fungi | lgtm, thanks | 21:08 |
opendevreview | Clark Boylan proposed opendev/system-config master: Have launch_node delete volumes more properly https://review.opendev.org/c/opendev/system-config/+/946715 | 21:10 |
clarkb | I'm going to wip ^ because that is completely untested and potentially destructive | 21:10 |
clarkb | infra-root no meeting today but if I can ask everyone to look at https://review.opendev.org/c/opendev/system-config/+/946637 and check https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 and either give the go ahead to land that or suggestions to make it safer that would be great. Then hopefully we land that in the next few days and start with initial data sync | 21:15 |
clarkb | fungi: should we upgrade gitea? Looks like you +2'd https://review.opendev.org/c/opendev/system-config/+/946640 | 21:16 |
fungi | yeah, i'm around for a while still if you're ready | 21:18 |
opendevreview | Aurelio Jargas proposed zuul/zuul-jobs master: ensure-python-command: Install venv in Zuul-scoped path https://review.opendev.org/c/zuul/zuul-jobs/+/945277 | 21:20 |
opendevreview | Aurelio Jargas proposed zuul/zuul-jobs master: ensure-python-command: Optimize pip install https://review.opendev.org/c/zuul/zuul-jobs/+/945278 | 21:26 |
opendevreview | Aurelio Jargas proposed zuul/zuul-jobs master: ensure-nox: Add support for global symlink https://review.opendev.org/c/zuul/zuul-jobs/+/945276 | 21:26 |
clarkb | fungi: yup I'm ready | 21:27 |
opendevreview | Merged opendev/system-config master: Create swap after dealing with data volumes https://review.opendev.org/c/opendev/system-config/+/946709 | 21:27 |
fungi | okay, approved it just now | 21:28 |
clarkb | thanks | 21:28 |
mnasiadka | clarkb: Any guidance what should we do there? Usually there was a bot (I think) that autoclosed such PRs and direct the user to push it via Gerrit, right? | 21:42 |
clarkb | mnasiadka: there is still a bot and it should get auto closed (I think it runs when you merge something and maybe daily?) | 21:47 |
opendevreview | Aurelio Jargas proposed zuul/zuul-jobs master: Fix deprecated cleanup-run https://review.opendev.org/c/zuul/zuul-jobs/+/946718 | 21:47 |
clarkb | but I strongly encourage anyone whose projects are mirrored to github subscribe to github notifications so that you can help steer people in the right direction and/or pull in fixes that would otherwise die on the vine in github | 21:47 |
fungi | 946640 hit a post_failure on system-config-upload-image-gitea | 21:49 |
fungi | https://zuul.opendev.org/t/openstack/build/dcb3518e599a43bf8affc9d1717a573a | 21:49 |
fungi | problem with docker.io/opendevorg/assets? | 21:50 |
fungi | i guess we haven't moved that to quay yet | 21:51 |
clarkb | we haven't. This problem looks like the one where we expect a 404 from the buildset registry because that image isn't part of the buildset. This is supposed to cause docker to look at docker hub next. We theorize that if the docker lookup hits rate limits the docker client reports the first error it hit which was the 404 and not the last error or all errors | 21:52 |
clarkb | I think it should be safe to recheck though that probably means it won't merge until after 5 pm local | 21:52 |
clarkb | basically it tried to get the assets image from the buildset registry and didn't find it there (expected) it then tried to get it from docker hub and failed then reported the expected error rather than the unexpected one | 21:53 |
fungi | i directly reenqueued it to the gate to save some time | 21:57 |
fungi | i'll try to be around still as it deploys | 21:57 |
clarkb | wfm I should be around since it went straight to the gate | 22:03 |
clarkb | jamesdenton: any update on that low bw over private network instance? No rush just checking in to see if possibly we can delete the node at this point | 22:24 |
fungi | ugh, it failed again | 22:32 |
fungi | https://zuul.opendev.org/t/openstack/build/4f898b80134f4d2996bae236e0ae7bf1 | 22:32 |
fungi | this time on docker.io/library/debian:bookworm-slim | 22:33 |
fungi | maybe we should try again tomorrow and see if the dockerhub gods smile more favorably on us? | 22:33 |
clarkb | works for me | 22:44 |
clarkb | it is now on my todo list to upgrade gitea tomorrow | 22:44 |
fungi | thanks | 22:45 |
opendevreview | James E. Blair proposed opendev/system-config master: Add setuptools to python-builder assemble https://review.opendev.org/c/opendev/system-config/+/946722 | 23:16 |
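
The busy wait suggested at 20:02 (wait for the volume to detach and become available after the server is deleted, then delete it) could be sketched roughly as below. This is only an illustration and not the code in launch-node or in change 946715: `wait_until_deletable`, its parameters, and the `get_status` callable are hypothetical names, and the set of acceptable states is taken only from the error message quoted in the log. In real use `get_status` would wrap whatever cloud API call returns the volume's current status.

```python
import time
from typing import Callable

# Statuses the quoted error message lists as acceptable for deletion.
DELETABLE = {
    "available",
    "error",
    "error_restoring",
    "error_extending",
    "error_managing",
}


def wait_until_deletable(
    get_status: Callable[[], str],
    timeout: float = 120.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until the volume reports a deletable status.

    get_status is a hypothetical callable supplied by the caller.
    Returns the final status, or raises TimeoutError if the volume is
    still not deletable once the timeout has elapsed.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in DELETABLE:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"volume still {status!r} after {timeout}s")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. It does not address the separate concern raised at 20:03 about which attached volumes should be deleted at all.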