clarkb | just about meeting time | 18:59 |
---|---|---|
clarkb | #startmeeting infra | 19:01 |
opendevmeet | Meeting started Tue May 16 19:01:26 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
opendevmeet | The meeting name has been set to 'infra' | 19:01 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/34DDMF4OX5CPXU2ARFFXA66IHRFDS3F2/ Our Agenda | 19:01 |
clarkb | #topic Announcements | 19:01 |
clarkb | I didn't really have anything to announce. | 19:02 |
clarkb | Which means we can dive right in I guess | 19:02 |
clarkb | #topic Migrating to Quay.io | 19:02 |
clarkb | #link https://etherpad.opendev.org/p/opendev-quay-migration-2023 Plan/TODO list | 19:02 |
clarkb | unfortunately I discovered a deficiency in docker that impacts speculative testing of container images when they are not hosted on dockerhub | 19:03 |
clarkb | #link https://etherpad.opendev.org/p/3anTDDTht91wLwohumzW Discovered a deficiency in Docker impacting speculative testing of container images | 19:03 |
clarkb | I put together a document describing the problem and some potential solutions or workarounds. | 19:03 |
clarkb | I think if we want to workaround it our best option is to build all of our images with buildx (it doesn't have the same deficiency) and then use skopeo to fetch the images out of band in testing only | 19:04 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/882977 illustration of skopeo prefetching | 19:04 |
clarkb | this change shows the skopeo workaround works and how we can apply it without impacting production | 19:04 |
clarkb | I keep getting distracted but my goal is to pick this effort up again in the near future and implement the workaround everywhere | 19:05 |
clarkb | If we don't like those workarounds I think we should consider rolling back to docker hub and then implementing one of the more robust long term options of switching to podman | 19:05 |
clarkb | but that represents significant effort that will likely need to be taken slowly to ensure we don't break anything, hence the idea we should roll back the container image hosting move if we do that | 19:06 |
clarkb | I haven't heard a ton of feedback on whether or not people are ok with the workarounds other than fungi saying he was fine with them. Please let me know if you have thoughts | 19:08 |
clarkb | #topic Bastion Host Changes | 19:08 |
ianw | does this affect zuul as well? or does testing there not pull? | 19:08 |
clarkb | ianw: it does affect zuul as well. Though less completely because zuul has k8s testing | 19:08 |
clarkb | and zuul is in a different tenant so Depends-On for opendev python base images may not work anyway? | 19:09 |
clarkb | #undo | 19:09 |
opendevmeet | Removing item from minutes: #topic Bastion Host Changes | 19:09 |
corvusx | hi um i said some things and didn't see a response so i don't know what the current state of lagging is | 19:10 |
clarkb | I suspect that it will be much easier for zuul to move to podman and friends for CI too since the testing doesn't directly map to a production deployment anywhere | 19:10 |
clarkb | corvusx: I haven't seen any messages from you | 19:10 |
corvusx | so anyway: | 19:10 |
corvusx | i think the skopeo hack basically means we throw away everything we've done with speculative builds and go with a completely different approach which is potentially very fragile | 19:10 |
corvusx | i mean, we have all of this speculative build stuff tested and working in zuul-jobs -- but i guess the quay stuff is only working with podman. so it seems like that's the way we should go | 19:10 |
clarkb | my concern with that is that is likely months of effort as we go through each service one by one and figure out how permissions and stuff are affected and so on | 19:11 |
corvusx | basically, every interaction with an image is an opportunity to miss getting the speculative image | 19:11 |
clarkb | which is fine if we want to do that but I think we should roll back to docker hub first | 19:11 |
corvusx | podman is supposed to be a drop-in replacement | 19:11 |
corvusx | why would permissions be affected? | 19:11 |
clarkb | but we know it isn't | 19:11 |
clarkb | corvusx: because it runs as a regular user without a root daemon | 19:11 |
fungi | specifically, what i said i was good with was rolling forward to iterate on making things work for quay (whether that means temporary workarounds or whatever) rather than rolling back to dockerhub | 19:12 |
corvusx | i mean, they might have fixed whatever was wrong before? i haven't seen any issues with the bind mount stuff. | 19:12 |
clarkb | corvusx: yes the bind specific problems may be addressed. It's more that, as I dug into it, podman-compose vs docker-compose + podman appear very different from docker-compose with docker due to podman's different design | 19:12 |
clarkb | there are also the nested podman issues that ianw hit with nodepool-builder | 19:12 |
clarkb | all of it should be doable, but I don't think it's a week of work | 19:13 |
corvusx | how so? i was certainly led to believe that it's a drop in by the folks who documented that in zuul | 19:13 |
clarkb | corvusx: so the first thing is that podman-compose is not feature complete compared to docker-compose, aiui | 19:13 |
corvusx | our docker-compose usage is limited to the feature set available like 5 years ago | 19:14 |
clarkb | corvusx: for this reason you can run podman + docker-compose using podman as a daemon. But if you do that I don't know if you are stuck on old docker-compose | 19:14 |
clarkb | but then separately if I run podman as root I will get different results than when I run it as a regular user | 19:14 |
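A minimal sketch of the two modes being contrasted above, assuming podman's stock systemd socket units; the socket paths are podman defaults and the compose file is whatever a service already ships:

```shell
# Rootful podman: docker-compose talks to a root-owned podman socket,
# which is the closest drop-in for the current dockerd setup.
sudo systemctl enable --now podman.socket
export DOCKER_HOST=unix:///run/podman/podman.sock
docker-compose up -d

# Rootless podman: same compose file, but a per-user socket and a user
# namespace, so bind-mount ownership and low ports behave differently
# (the "different results as root vs a regular user" point above).
systemctl --user enable --now podman.socket
export DOCKER_HOST=unix:///run/user/$(id -u)/podman/podman.sock
docker-compose up -d
```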
corvusx | it's just our docker-compose usage is so simple i'm having a hard time believing that this could be a huge problem. it's basically just bind mounts and host networking... if podman/podman-compose can't handle that, then i don't know why it's even a thing | 19:15 |
clarkb | corvusx: I think we already have evidence of it being an issue though with nodepool builder running nested podman | 19:16 |
clarkb | I agree it shouldn't be an issue but then ianw pointed at some historical ones | 19:16 |
clarkb | there is also the issue of installing the toolchain on ubuntu | 19:16 |
corvusx | what's the nodepool builder running nested podman problem (and does that have any relation to what we do on servers?) | 19:16 |
clarkb | corvusx: my understanding is that if you run nodepool-builder under podman then all of the diskimage builder container elements stop working because of the nesting of podman | 19:17 |
clarkb | I don't know why this occurs | 19:17 |
clarkb | it would affect our nodepool builder servers | 19:17 |
corvusx | that does sound like a problem | 19:18 |
ianw | https://opendev.org/zuul/nodepool/src/branch/master/Dockerfile#L90 is the things we know about; cgroups issues | 19:18 |
corvusx | so i'm still in the camp that we should not consider the skopeo thing an intermediate solution | 19:18 |
ianw | i don't think anyone really knows what would happen under podman | 19:18 |
corvusx | i think if we want to go that direction, we should very explicitly make a decision to completely throw out all the work we've done on this (which is fine, that's a thing we can decide) | 19:19 |
clarkb | corvusx: I'm not sure I understand why we would have to throw it all out. The change I pushed shows it works with all of the speculative stuff in place? | 19:19 |
corvusx | but i don't think it makes sense to do that as an interim step, because i think it's potentially very error prone (every container image interaction in any job is an opportunity to miss some place that needs to be preloaded) | 19:19 |
clarkb | It is fragile though as you have to be explicit about which images are potentially speculative | 19:19 |
corvusx | and by the time we find and fix all those things, we will have built a completely new solution that supersedes the old one | 19:20 |
clarkb | ok in that case I think we need to roll back the quay.io moves, then do the switch to podman, then move to quay.io. I don't think we can move to podman in a week. I suspect it will be months before everything is converted since the only time we tried it, it blew up on us. | 19:20 |
clarkb | the first thing that will need to be solved is installing podman and friends. | 19:21 |
clarkb | Then we can use system-config-run jobs to see what if anything just breaks. Then start holding nodes and sorting out the transition from one to the other | 19:21 |
corvusx | we could still consider doing the preload thing, or we could consider staying on docker | 19:22 |
corvusx | i mean staying on dockerhub | 19:22 |
ianw | the workaround in system-config does mean that if someone was implementing from first principles in zuul-jobs, there's some pretty big caveats that aren't mentioned? | 19:22 |
corvusx | (also, are we absolutely sure that there isn't a way to make this work?) | 19:22 |
corvusx | ianw: i don't follow, can you elaborate? | 19:23 |
clarkb | corvusx: I mean I'm as sure as I can be. There is a closed issue on moby with people saying this is still a problem as of February. | 19:23 |
clarkb | Digging around in the docker config docs I can't find any indication that you can set up a mirror for anything but docker.io. Another option would be writing the custom proxy thing and configuring docker daemon with an http proxy that does what we need | 19:24 |
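For reference, a sketch of the dockerd limitation being described: `registry-mirrors` in `/etc/docker/daemon.json` only applies to images pulled from docker.io, so pulls from any other registry bypass the mirror (and therefore the buildset registry) entirely. The mirror URL and image names are placeholders.

```shell
# Placeholder mirror URL; only Docker Hub pulls are redirected through it.
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "registry-mirrors": ["https://mirror.example.opendev.org"]
}
EOF
sudo systemctl restart docker

docker pull mariadb:10.11                  # docker.io image: goes via the mirror
docker pull quay.io/example/image:latest   # non-docker.io: ignores the mirror
```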
fungi | closed as a "wontfix" by the maintainers who apparently don't see the problem with fallbacks only working for dockerhub | 19:24 |
corvusx | clarkb: the proxy was the original idea and worked poorly | 19:24 |
clarkb | corvusx: ya it seems like something that would need a lot of from scratch implementation too rather than just using an existing http proxy due to the specific behavior that is needed | 19:25 |
clarkb | and then potentially break when docker changes its behaviors | 19:25 |
corvusx | clarkb: no i literally mean we did that and it worked poorly | 19:25 |
corvusx | like that's part of why zuul-registry is designed the way it is | 19:25 |
clarkb | oh I see | 19:26 |
corvusx | i'm not going to say it can't be done, i'm just saying, it's not the easy way out here | 19:26 |
ianw | corvusx: i mean use-buildset-registry isn't a transparent thing if you're not pulling from docker.io? which i think is at least a tacit assumption in zuul-jobs | 19:26 |
corvusx | ianw: i agree; i thought it was tested with non-docker images and thought it worked | 19:26 |
corvusx | ianw: so if it really is borked like that, i agree it needs a big red warning | 19:27 |
clarkb | corvusx: ianw: fwiw I think others should double check what I've found. Best I could tell gerrit 3.8 jobs failed because it couldn't find tags for 3.8 because it was talking directly to quay.io which hasn't had the newer 3.8 tag synced over | 19:27 |
clarkb | basically completely ignoring the speculative state and talking directly to quay.io | 19:27 |
fungi | #link https://github.com/moby/moby/pull/34319 "Implement private registry mirror support" | 19:28 |
clarkb | then I found the issue I mentioned previously which seemed to confirm this is expected behavior | 19:28 |
fungi | for reference | 19:28 |
corvusx | i feel like the best approach would be to either (after switching back to dockerhub if we so wish) move to podman completely, or redesign the system based on preloading images (and doing that in a suitably generic way that we can set out rules about what jobs need to expect and do). something based on artifact data in zuul probably (like pull-from-intermediate-registry but for individual jobs). | 19:28 |
corvusx | it will be much less transparent but should work. | 19:28 |
corvusx | i think both of those have a good chance of producing a positive outcome | 19:29 |
clarkb | I thought about the second option there, but I don't think you can ever pull images in your actual job workload because that would overwrite what is preloaded | 19:29 |
clarkb | I couldn't come up with a way to encode that in zuul-jobs as a "platform" level thing that our service deployments wouldn't immediately clobber over | 19:29 |
clarkb | which is why I ended up with the role embedded in system-config in my example change proving it works (at least minimally) | 19:30 |
corvusx | clarkb: we risk getting into the weeds here, but how do we make opendev prod jobs work then? | 19:30 |
clarkb | corvusx: https://review.opendev.org/c/opendev/system-config/+/882977 illustrates it | 19:31 |
corvusx | yeah i'm reading it | 19:31 |
clarkb | corvusx: you basically have your normal docker-compose pull, then after that and before docker-compose up you run skopeo to sideload the image(s) | 19:31 |
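Roughly, the CI-only step looks like the sketch below; the registry host, port, image name, and tag are placeholders rather than the actual values used in 882977:

```shell
docker-compose pull          # unchanged production code path

if [ "${CI:-0}" = "1" ]; then
  # Testing only: overwrite the image that was just pulled with the
  # speculative one served by the Zuul buildset registry.
  skopeo copy --src-tls-verify=false \
    docker://buildset-registry.example:5000/opendevorg/gerrit:3.8 \
    docker-daemon:quay.io/opendevorg/gerrit:3.8
fi

docker-compose up -d         # unchanged production code path
```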
corvusx | right so that means that job doesn't test our production playbooks? | 19:32 |
clarkb | corvusx: I guess it depends on how you look at it. The skopeo side load only runs in CI but everything else is also tested and runs in production too | 19:32 |
corvusx | right, it has a "CI=1" style flag | 19:32 |
clarkb | yes but only for an additional step. It doesn't remove what runs in production | 19:33 |
clarkb | so what goes into production is still tested from a code coverage perspective | 19:33 |
corvusx | got it | 19:34 |
clarkb | anyway it sounds like I should hold off on making any changes in the near future. I think it would be helpful if others can look at this and update the etherpad with their thoughts | 19:34 |
clarkb | I have spent a fair bit of time digging around this problem and workarounds and ended up with that workaround. | 19:34 |
clarkb | I don't think there is a straightforward solution because docker quite simply is badly designed and doesn't support this workflow despite buildx and podman etc all doing so | 19:34 |
corvusx | yeah, if we look at it in a different perspective: | 19:35 |
corvusx | it seems what we have in zuul-jobs is: the docker world where all the roles work together as long as you only use docker images and tools; then basically the same for podman (with the exception that it also works with docker images) | 19:36 |
fungi | it does seem rather balkanized | 19:36 |
corvusx | so if we like what we have been using, then it seems like we're drawn toward "use podman family of tools". and if we're willing to accept a different method of building and testing things (with much less magic and more explicit decisions about when and where we use speculative images) then we open the door to preloading | 19:37 |
clarkb | in my mind the workaround with explicit decisions is ideally a short/mid term workaround that we employ to complete the migration to quay.io. Then we continue to move to podman | 19:38 |
clarkb | because clearly docker is a problem here ( we want to move to quay.io for a different problem too) | 19:38 |
clarkb | as an alternative we can stay on docker hub and move to podman then move to quay.io | 19:38 |
clarkb | I guess it depends on which change we feel is a priority | 19:39 |
corvusx | okay, let me try to be convinced of skopeo as a workaround -- it's certainly better than what we have now since we basically have no speculative testing | 19:39 |
ianw | i know it's unhelpful from the peanut gallery, but yeah, i've certainly touted the "ci == production" features of opendev deployment several times in talks, etc. it is something that has been a big staple of the system | 19:39 |
fungi | right, it seems like a question of two paths to the same destination: whether we switch the toolchain first and then the hosting location, or the other way 'round | 19:39 |
corvusx | ianw: yes me too. full disclosure, i have one scheduled a few weeks from now :) but that's not driving my strong interest -- i can still give a talk and talk about what doesn't work too :) | 19:40 |
corvusx | i'm a bit worried about the nodepool thing | 19:40 |
corvusx | like does that mean it's simply not going to be possible to run it in nodepool? does that make podman DOA for us? | 19:40 |
clarkb | I'm personally happy with either approach. I've been looking at short term workarounds because quay.io migration is halfway done and completing that would be nice. But rolling back isn't the end of the world either | 19:40 |
clarkb | corvusx: I don't think it is DOA we just likely need to run it that way and figure out what random settings need to be toggled | 19:41 |
clarkb | More just that it isn't likely to be directly drop in, there is work to do I think | 19:41 |
tonyb | I certainly don't have the background of you all but I feel like completing the migration then starting a new wholesale tool migration to podman and co is what I'd look at | 19:42 |
clarkb | as a time check we've spent most of our meeting on this topic. We can followup on this outside the meeting? There are a few other topics to dig into | 19:42 |
corvusx | okay, my concern with the preload "hack" is that it's fragile and doesn't give us full test coverage. so if migrating nodepool will take a long time then we end up with a permanent hack | 19:42 |
corvusx | if we thought we could do it in a relatively short period and most of the skopeo hack work is done, then maybe that's the way to go. but if it's really uncertain and going to take a long time, that makes me think that rollback is better. | 19:43 |
clarkb | corvusx: in that case maybe we should focus on "install podman and friends" first then we can use that to run some test jobs with podman instead of docker | 19:44 |
clarkb | spend some time on that then decide if we rollback or rollforward with the workaround | 19:44 |
corvusx | sounds like if we can run nodepool we're probably golden? | 19:44 |
clarkb | yes I think nodepool and gerrit are the two big questions | 19:44 |
corvusx | what's weird about gerrit? | 19:44 |
clarkb | corvusx: all of the sideband command stuff | 19:45 |
corvusx | like "docker run gerrit foo" ? | 19:45 |
corvusx | er "docker exec" ? | 19:45 |
clarkb | ya for reindexing and all that | 19:45 |
clarkb | I think a lot of it is run not exec currently so it spins up another container on the same bind mounts and does operations | 19:46 |
clarkb | I'm less concerned this will break than nodepool builder stuff | 19:46 |
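For context, the sideband pattern in question looks roughly like this; the image name, bind mount path, and reindex invocation are illustrative, not the exact commands in our playbooks:

```shell
# Spin up a second container on the same bind-mounted site to run a
# maintenance command alongside (or instead of) the main container.
docker run --rm \
  -v /home/gerrit2/review_site:/var/gerrit \
  quay.io/opendevorg/gerrit:3.8 \
  java -jar /var/gerrit/bin/gerrit.war reindex -d /var/gerrit
# Under rootless podman the same command runs inside a user namespace, so
# ownership/permissions on the bind-mounted site are the thing to verify.
```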
fungi | i guess you're saying it makes a good canary because of the additional complexity? | 19:46 |
corvusx | "neat" | 19:46 |
clarkb | but I think it is good coverage of a workflow we have | 19:46 |
clarkb | fungi: ya | 19:46 |
clarkb | most of our services are a few bind mounts and an http server. Super simple and basic | 19:46 |
fungi | got it, so not for any specific thing that we suspect is problematic | 19:46 |
clarkb | gerrit is different. nodepool-builder is different | 19:47 |
fungi | (unlike the cgroups problem witnessed with nodepool) | 19:47 |
ianw | mysql/mariadb dumps are another thing that calls into the running container | 19:47 |
corvusx | so is anyone interested in getting nodepool-builder to work? | 19:47 |
clarkb | ianw: ya and gerrit would cover that too | 19:47 |
tonyb | I am interested but I don't have time to dedicate to it for several weeks | 19:48 |
ianw | i think the "interesting" path in nodepool-builder would be the containerfile element | 19:48 |
clarkb | corvusx: I can continue to push this along if others prefer. But it might be a day or two until I can spin up testing for that (I would probably rely on system-config-run after sorting out some podman install stuff) | 19:48 |
clarkb | ianw: ++ | 19:48 |
ianw | which is used by fedora and rocky only. fedora we've discussed ... | 19:48 |
clarkb | basically I think step 0 is a system-config role to install podman/buildah/skopeo/podman-compose or docker-compose with podman backing it | 19:49 |
clarkb | then we can have the CI system produce a nodepool-builder we can poke at | 19:49 |
clarkb | and I can do that but probably no sooner than tomorrow | 19:49 |
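A rough sketch of what that "step 0" install might look like on an Ubuntu node, assuming the packages are available from the distro archives; which compose option to use is exactly the open question above:

```shell
sudo apt-get update
sudo apt-get install -y podman buildah skopeo

# Option A: podman-compose (feature parity with docker-compose is the concern)
sudo apt-get install -y podman-compose

# Option B: keep docker-compose and point it at the podman socket instead
sudo systemctl enable --now podman.socket
export DOCKER_HOST=unix:///run/podman/podman.sock
```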
corvusx | is there some person or group who is driving the rocky requirement that can help? | 19:50 |
ianw | ++ running dib gate testing will stress containerfile | 19:50 |
clarkb | corvusx: NeilHanlon and the openstack ansible group | 19:50 |
clarkb | I can reach out to them once I've got the basic stuff running | 19:50 |
corvusx | is fedora being removed? | 19:51 |
clarkb | corvusx: I think so; no one really chimed in saying they have a use case for it since I last prodded | 19:52 |
ianw | (although now i wonder about the dib gate and if that's pulling the nodepool-builder speculatively built container properly?) | 19:52 |
corvusx | ianw: it is almost certainly not | 19:52 |
corvusx | i do have a suggestion for a last-ditch way to avoid the nested podman problem with nodepool | 19:53 |
corvusx | if we find there is no way to get it to work, we can implement my "build an image in zuul and hack nodepool-builder to upload it" idea that i mentioned in the nodepool-in-zuul spec | 19:54 |
tonyb | I spent a couple of days verifying that the only thing we "lose" with switching off fedora is some newer virt stack testing, and that repo is available for CentOS so it's not even a real loss | 19:54 |
clarkb | tonyb: thanks for confirming | 19:54 |
clarkb | corvusx: oh thats a neat idea. | 19:54 |
corvusx | essentially, anticipate the nodepool-in-zuul work (which itself would alleviate this problem) by doing the rocky build in zuul and then having the nodepool builder job for that download and upload the build | 19:54 |
clarkb | ok lets continue this outside the meeting. I think we have a few investigative next steps we can take in order to better understand potential impact and I'm volunteering to start on those within a day or two | 19:55 |
corvusx | that's not going to be the easiest thing (like, if the issue can be resolved by adding "--work-please" to podman, that's best) but if we literally can't make it work, that gets us out of the corner we backed ourselves into. | 19:55 |
clarkb | I'm going to skip the bastion, mailman3, gerrit, server upgrades, and storyboard topics today because I think the only one with any real movement since last week is the fedora topic | 19:56 |
clarkb | #topic AFS volume utilization | 19:56 |
clarkb | utilization is trending back up. We seem to bounce around up and down with a general trend of upwards | 19:56 |
clarkb | as mentioned I asked for any last comments on whether or not people had a use case for fedora last week with the intent of making changes starting this week | 19:56 |
clarkb | there really wasn't any "we need fedora" feedback. We got some "why no rhel? Why this why that?" feedback instead | 19:57 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/883083 Cleanup unused fedora mirror content | 19:57 |
clarkb | I think this is a good step 0 that will free up fileserver disk | 19:57 |
fungi | sean is looking into the possibility of rhel again too | 19:57 |
fungi | has another question about it into rh legal | 19:58 |
clarkb | regardless of what happens to fedora this change should be safe as it removes unused fedora mirror content | 19:58 |
clarkb | I think if we can land that in the near future it would be great | 19:58 |
clarkb | that will address the immediate lets not run out of disk on AFS problem | 19:58 |
ianw | ... that has been a very "looked at" thing over the years | 19:58 |
clarkb | Then separately I also think we can proceed with removing the fedora bits we are using. The first step for that would be to configure the fedora jobs to stop relying on the mirrors. Then we can clean up the mirror content that was used. Then finally we can start removing jobs and remove the images | 19:59 |
ianw | i think given tonyb's input too, i don't see too much using this | 19:59 |
clarkb | The actual removal will probably take some time as we clean up jobs. But we should be able to clean up mirrors which is the immediate concern, then gracefully shut down/remove the rest | 19:59 |
ianw | one thing is maybe we should stop the mirror setup first? i think zuul-jobs breakage might be the main pain point | 20:00 |
clarkb | ianw: yes thats what I said :) | 20:00 |
clarkb | basically we can keep the nodes but they can talk to upstream package repos | 20:00 |
ianw | ++ | 20:00 |
clarkb | note https://review.opendev.org/c/opendev/system-config/+/883083 removes mirroring for content we don't have nodes for | 20:00 |
clarkb | double check me on that but I think that means we can merge it right now | 20:01 |
clarkb | but for the bits we do use we need to stop consuming them in the jobs first and ya I think that is a good first step in the shutdown of what does exist | 20:01 |
clarkb | and we are at time. | 20:01 |
clarkb | Sorry this went a bit long and we skipped some topics | 20:01 |
clarkb | Feel free to continue discussion in #opendev or on the mailing list for any thing that was missed (including the skipped topics) | 20:02 |
clarkb | But I won't keep you any longer than I already have | 20:02 |
clarkb | thank you for your time today! | 20:02 |
clarkb | #endmeeting | 20:02 |
opendevmeet | Meeting ended Tue May 16 20:02:24 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 20:02 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-16-19.01.html | 20:02 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-16-19.01.txt | 20:02 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2023/infra.2023-05-16-19.01.log.html | 20:02 |
corvusx | or longer than i have :) | 20:02 |
corvusx | thanks and sorry all :) | 20:02 |
tonyb | Thanks everyone | 20:02 |
fungi | thanks! | 20:02 |
corvus | ... | 20:36 |