clarkb | fungi: hrm I wonder if the py_modules in setup.py could go in setup.cfg? That seems less hacky if so | 00:00 |
clarkb | PBR does not run the release job | 00:00 |
clarkb | I just pushed a change to add that job | 00:04 |
clarkb | ok last call for meeting agenda items | 00:10 |
tonyb | none from me | 00:10 |
clarkb | and sent | 00:15 |
clarkb | the build-python-release job succeeded in PBR so that's a good sign we haven't broken anything too badly with the recent pyproject.toml work there | 00:17 |
clarkb | I guess fungi manually checked the package builds in PBR last week too so this is just good belts and suspenders | 00:17 |
opendevreview | Jaromír Wysoglad proposed openstack/project-config master: Remove infrawatch/sg-core from zuul repos https://review.opendev.org/c/openstack/project-config/+/938537 | 08:11 |
*** persia is now known as Guest5226 | 09:40 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Use PBR's pyproject.toml build-backend support https://review.opendev.org/c/opendev/bindep/+/816741 | 15:11 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Evacuate most metadata out of setup.cfg https://review.opendev.org/c/opendev/bindep/+/938520 | 15:11 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Drop support for Python 3.6 https://review.opendev.org/c/opendev/bindep/+/938568 | 15:41 |
fungi | wip for now as a test ^ | 15:42 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Drop requirements.txt https://review.opendev.org/c/opendev/bindep/+/938570 | 15:48 |
fungi | also mostly just a test for now ^ | 15:48 |
clarkb | I guess once we're happy with the state of those changes we push a pbr beta release and set that version in the pyproject.toml and make sure the whole stack is still happy? | 15:49 |
fungi | yeah, unless we want to dive into figuring out how to get pbr to source the package name from pyproject.toml instead of setup.cfg, which would potentially allow projects to drop their setup.cfg files completely | 15:50 |
clarkb | since pbr will continue to use setuptools as the backend I'm not sure that is urgent. | 15:51 |
fungi | agreed | 15:54 |
fungi | but i do think an update to pbr's usage doc is warranted based on outcomes from this testing, i'll try to work on that between meetings | 15:54 |
fungi | (basically it should state what the most minimal required setup.py and setup.cfg are for projects wanting to move metadata into pyproject.toml) | 15:55 |
fungi | and also mention the python 3.7 minimum | 15:55 |
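For reference, the minimal layout fungi describes might look something like the following sketch; the exact pbr version that ships the build backend, and the pin values, are assumptions, not values confirmed in the discussion:

```toml
# pyproject.toml -- sketch of PBR's pyproject.toml build-backend setup
[build-system]
requires = ["pbr>=6.1.0", "setuptools>=64"]
build-backend = "pbr.build"
```

With this in place, setup.py can shrink to a stub that calls `setuptools.setup(pbr=True)`, and setup.cfg keeps at least a `[metadata]` section with the package `name`, since pbr still sources the name from setup.cfg as noted earlier in the conversation.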
fungi | wow, i didn't expect the requirements.txt removal to pass testing | 16:06 |
fungi | noonedeadpunk: epel mirroring from pubmirror3.math.uh.edu seems to have resumed working, so i guess it was just offline for a few days | 16:16 |
noonedeadpunk | even mirrors need to have holidays :D | 16:22 |
noonedeadpunk | thanks for the update! | 16:22 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: Stop using deprecated pkg_resources API https://review.opendev.org/c/openstack/diskimage-builder/+/907691 | 16:43 |
opendevreview | Merged openstack/diskimage-builder master: Fix: Run final Gentoo tasks in post-install.d https://review.opendev.org/c/openstack/diskimage-builder/+/937658 | 17:42 |
tonyb | frickler, JayF: re mediawiki annotations. Something like: https://blog.jasonantman.com/2012/02/using-templates-to-track-outdated-content-in-a-documentation-mediawiki/ ? | 19:34 |
tonyb | or: https://www.mediawiki.org/wiki/Template:Note | 19:37 |
frickler | tonyb: ack, that looks interesting, will take a closer look tomorrow, thx for the links | 19:38 |
tonyb | Thinking about the ansible-devel series I think: https://review.opendev.org/c/opendev/system-config/+/934937/3 which makes minimal modifications to the pip3 role to work with EXTERNALLY-MANAGED python (essentially installing pip with --break-system-packages) is the wrong thing to do. | 20:05 |
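The EXTERNALLY-MANAGED behavior tonyb mentions comes from PEP 668: distros drop a marker file into the interpreter's stdlib directory, and pip then refuses to install into the system environment unless `--break-system-packages` is passed. A small sketch of how that marker can be detected (illustrative only, not part of the pip3 role):

```python
# Check for the PEP 668 EXTERNALLY-MANAGED marker that makes the
# system pip refuse to install packages -- the behavior the pip3 role
# change works around with --break-system-packages.
import sysconfig
from pathlib import Path

marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
print("externally managed:", marker.exists())
```

On Ubuntu Noble and Fedora this prints True for the distro python; on older releases (and in most virtualenvs) it prints False.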
fungi | not mentioned in the meeting but i have a long-ish stack of changes proposed to fix testing and update packaging in bindep, for anyone looking to take a break from complex reviews ;) | 20:06 |
tonyb | I'm not sure what the right thing to do is. Not doing something like that means that we'll have to find and update to all the places we use pip (and virtualenv?) | 20:06 |
tonyb | or updating the pip3 role to install pip itself into a venv would mean that everything we then install with pip will go into that venv (won't it?) | 20:08 |
fungi | might not be the worst thing, probably most of our special handling for pip and v{irtual,}env can just be (carefully) deleted | 20:08 |
tonyb | Would we, like we did with podman, make that systematic change for >= Noble or try to fit it to older releases | 20:09 |
fungi | without having spent any time looking at it in the past few years, my gut says most of it can be completely deleted for all the platforms we're still using | 20:13 |
tonyb | fungi: Do you mean that we'd rely on distro packages for python3-pip and python3-venv ? | 20:13 |
fungi | most of the special handling there hails from transitions from python 2 to 3 and puppet to ansible | 20:13 |
fungi | and, yes, probably distro packages are perfectly fine now | 20:14 |
fungi | if distro packages aren't fine, anything post python 3.6 (so post ubuntu bionic) can also use pip and virtualenv zipapps executed directly as python modules instead of relying on running get-pip.py to bootstrap installation | 20:15 |
tonyb | So the pip3 role, which seems to be the heart of the issue, would keep installing from get-pip for < Noble and install distro packages otherwise (>= Noble). If we hit problems with that we'll analyse and potentially switch to zipapps as needed ? | 20:17 |
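The split tonyb proposes could be expressed in the role as release-conditional tasks along these lines (task names, the get-pip.py path, and the exact conditionals are hypothetical, not the actual role's contents):

```yaml
# Sketch of branching the pip3 role on Ubuntu release
- name: Install pip and venv from distro packages on Noble and newer
  package:
    name:
      - python3-pip
      - python3-venv
    state: present
  when: ansible_distribution == 'Ubuntu' and
        ansible_distribution_version is version('24.04', '>=')

- name: Bootstrap pip with get-pip.py on older releases
  command: python3 /opt/get-pip.py
  when: ansible_distribution == 'Ubuntu' and
        ansible_distribution_version is version('24.04', '<')
```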
opendevreview | James E. Blair proposed opendev/system-config master: Add jobs to mirror dockerhub images https://review.opendev.org/c/opendev/system-config/+/938508 | 20:18 |
tonyb | and this applies to OpenDev systems specifically but more general CI jobs would stick with installing tools like pip/venv/tox from pypi ? | 20:19 |
Clark[m] | tonyb: I suspect most places we used pip globally before have been replaced by container images and we can ignore those. Then for everything else I think distro package for pip and venv can be used to pip install whatever we're managing into a virtualenv then we run it from there | 20:23 |
Clark[m] | And ya making that transition on noble and newer might be easiest | 20:24 |
tonyb | Okay, I'll work on that today | 20:35 |
clarkb | tonyb: as a data point when I was doing the podman + docker compose stuff on noble I had test jobs for gerrit and gitea and paste and a few others iirc and none of them had pip issues because all of that is bundled up in the container images now | 21:03 |
tonyb | clarkb: Fair, I hit it in the ansible-devel series as we maintain our own ansible-venv | 21:04 |
clarkb | looking at system-config run file matchers, jobs for things like the ci registry trigger on pip3 updates, but looking in the code I don't think we use pip | 21:05 |
clarkb | ya I think ansible may be one of the few places where we still pip install something | 21:05 |
clarkb | I just want to point out that the impact may be limited so we can probably focus on the existing use cases rather than the old ones to make the best choice for the future | 21:06 |
tonyb | I also hit it in nodepool but as you say I expect it's few and far between | 21:06 |
clarkb | oh we install docker-compose with pip3 | 21:09 |
clarkb | so ya clean break with noble which doesn't use docker-compose anymore makes a lot of sense | 21:09 |
clarkb | the nodepool-launcher use is redundant with ^ it's just explicitly doing it for some reason | 21:10 |
clarkb | afsmon to grab afs stats is another use. I suspect this use and the ansible install may be the primary things other than docker-compose | 21:10 |
tonyb | Okay, and given nodepool-launcher itself will possibly "go away" and/or get refactored away | 21:11 |
tonyb | it's probably not a problem | 21:11 |
clarkb | yup but also if we drop the use there then it will use what is in install-docker on noble anyway | 21:11 |
clarkb | I can push up a change for that cleanup | 21:12 |
tonyb | Oh okay. That'd be good if you have time | 21:12 |
clarkb | but this has me thinking maybe instead of refactoring pip3 we just explicitly use python3-venv to make a venv for the few places we do things with pip other than docker-compose install | 21:13 |
clarkb | then pip3 can die on the vine with the old portion of the install-docker role | 21:13 |
tonyb | That's also doable | 21:15 |
opendevreview | Clark Boylan proposed opendev/system-config master: Remove explicit docker-compose install in nodepool-launcher https://review.opendev.org/c/opendev/system-config/+/938620 | 21:15 |
clarkb | its also possible that I haven't found all the cases involved | 21:15 |
clarkb | but I think that is all of them doing a git grep for pip in system-config/playbooks/roles. I see afsmon, install ansible, create venv, install-docker and pip3 itself | 21:17 |
clarkb | install-docker won't need it on noble and newer which leaves us with install ansible, afsmon, and create venv. The last one there (create venv) is already in use for borg, install ansible, and launch-node | 21:18 |
tonyb | That sounds good to me | 21:18 |
clarkb | all that to say I think maybe if we land 938620 and switch afsmon to the create venv system we might be good? | 21:18 |
clarkb | the create venv role relies on `virtualenv_command: '/usr/bin/python3 -m venv'` | 21:19 |
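What the create-venv approach amounts to can be sketched with the stdlib venv module, which is what `/usr/bin/python3 -m venv` invokes; tools then run from the venv's own bin directory so nothing touches system site-packages. Paths here are illustrative, not the role's actual layout:

```python
# Build a venv the same way the role's virtualenv_command does, then
# run the venv's own pip -- installs stay inside the venv, which keeps
# EXTERNALLY-MANAGED system interpreters happy.
import subprocess
import tempfile
import venv
from pathlib import Path

target = Path(tempfile.mkdtemp()) / "demo-venv"
venv.EnvBuilder(with_pip=True).create(target)

result = subprocess.run(
    [str(target / "bin" / "python"), "-m", "pip", "--version"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```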
tonyb | I'll poke around also but that sounds likely | 21:19 |
tonyb | I'll probably do a very small change to pip3 so that it explicitly fails if used on noble | 21:21 |
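That guard could be as small as a single task at the top of the role (the message text and conditional are a hypothetical sketch):

```yaml
# Sketch of an explicit failure for the legacy pip3 role on Noble
- name: Refuse to run the pip3 role on Noble and newer
  fail:
    msg: "The pip3 role is retired on Noble; use the create-venv role instead."
  when: ansible_distribution == 'Ubuntu' and
        ansible_distribution_version is version('24.04', '>=')
```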
clarkb | ++ that can direct us to use create-venv instead and help us catch anything we missed | 21:21 |
tonyb | It's a plan! | 21:24 |
tonyb | now to work out exactly what changed in the upgrade from F40 to F41, that broke logging into wiki.openstack :/ | 21:24 |
clarkb | tonyb: my only feedback on the wiki upgrade plan is it isn't clear if we're going to try and do the multistep upgrade in one shot (so one longer downtime) or do it in several smaller downtimes. I don't think that is critical to call out there though | 21:24 |
tonyb | clarkb: I was hoping that itemising each "downtime" and the final sentence about the process taking from mid-Jan to mid-Feb would make that somewhat clear. I'll add something earlier and more explicit | 21:27 |
tonyb | Okay `update-crypto-policies --set DEFAULT:SHA1` 'fixed' things so I can log into wiki.openstack | 21:32 |
fungi | mainly i just remember being personally responsible for introducing our pip3 puppet module and moderately sad we copied that into ansible | 21:44 |
clarkb | fungi: good news its almost completely gone at this point | 21:45 |
clarkb | tonyb: I left a response to your latest comment on https://review.opendev.org/c/opendev/system-config/+/921321 | 21:56 |
clarkb | tonyb: I guess its not really a simplification but closer to what we do for other things so might be simpler to understand | 21:56 |
tonyb | Yeah, I like that. I ruled it out because the MediaWiki container publishes/exposes port 80 and I was trying to stick with that, but it looks like using host mode ignores it | 21:58 |
opendevreview | Merged opendev/system-config master: Add jobs to mirror dockerhub images https://review.opendev.org/c/opendev/system-config/+/938508 | 22:13 |
clarkb | I've sent some questions about the gerrit h2 cache db backing files to gerrit discord | 22:53 |
tonyb | At the moment with MediaWiki I have the service installing and managing its own MariaDB in a single compose directory, copied from lodgeit. Would it be better to use the mariadb role to create that database? Or is the extra complexity/overhead too much? | 23:19 |
tonyb | Oh, where do we tell the server to start services in /etc/*-compose/ at system start? | 23:20 |
clarkb | tonyb: I think all of our services except for the zuul db which is dedicated due to its size use the mariadb in the same compose setup | 23:21 |
tonyb | Okay | 23:22 |
clarkb | I think I would default to that unless the database is large or has other performance considerations | 23:22 |
clarkb | tonyb: re starting on system start that may be something I didn't consider with podman + docker compose. With docker and docker-compose the docker daemon would start on boot and start up any services. I'm not sure if podman with docker compose will do that for us? | 23:23 |
clarkb | https://github.com/containers/podman/blob/main/contrib/systemd/system/podman-restart.service.in we might need to experiement with something like this? Its possible that the installation already works as is. | 23:24 |
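The unit clarkb links is roughly the following (abridged here; the upstream template substitutes install paths and a logging variable, so treat this as an approximation of the shipped file):

```ini
# Abridged sketch of podman's contrib podman-restart.service
[Unit]
Description=Podman Start All Containers With Restart Policy Set To Always

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/podman start --all --filter restart-policy=always
ExecStop=/bin/sh -c '/usr/bin/podman stop -t 10 --all --filter restart-policy=always'

[Install]
WantedBy=default.target
```

The `--filter restart-policy=always` on both start and stop is what makes the later observation about `restart: always` containers (and only those) surviving reboots make sense.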
fungi | might be worth comparing/contrasting to the mailman deployment since that's our newest from-scratch setup | 23:24 |
fungi | but probably already consistent | 23:25 |
clarkb | looks like I may still have a held node with docker compose and podman I can reboot | 23:25 |
clarkb | 158.69.66.158 this is a held paste node running under podman not docker | 23:26 |
clarkb | I'm going to reboot it and see if the containers come back up | 23:26 |
clarkb | tonyb: the two docker compose managed containers (lodgeit and mariadb) did come back up but the selenium container that is managed via docker commands did not | 23:27 |
clarkb | so it seems that things that get docker-compose/docker compose managed should respect reboots. I'm going to stop the services with docker-compose down and see if a reboot starts them up again | 23:28 |
tonyb | clarkb: sounds good | 23:28 |
clarkb | I did a docker-compose stop and then rebooted. That did restart the containers | 23:30 |
clarkb | I believe that with docker as the backend that would not start the containers on boot | 23:30 |
clarkb | but I'm not 100% certain of that. I think we can work with that and simply use docker-compose down instead if we want to reboot and not have stuff come back up | 23:31 |
clarkb | (testing that now) | 23:31 |
tonyb | Yeah that's my expectation | 23:31 |
clarkb | yup down works (because the containers are fully removed) so I think podman should mostly just work there and approximate the docker behavior | 23:32 |
clarkb | tonyb: I'll docker-compose up -d if you want to inspect things yourself | 23:33 |
clarkb | corvus: ^ fyi since you've been following along with the podman is docker on noble work. This might interest you | 23:33 |
clarkb | podman-restart.service loaded active exited Podman Start All Containers With Restart Policy Set To Always | 23:34 |
clarkb | systemd reports this unit is present and active so I think the podman package is already installing that for us and it is working (just to understand how this works) | 23:34 |
clarkb | the docker-compose.yaml for lodgeit does have restart-always on those two services | 23:35 |
corvus | clarkb: we should check the "restart" setting in the docker-compose file | 23:35 |
clarkb | corvus: yup for paste at least it is set to always which matches the unit's description of what it is doing. But other services may not have that | 23:35 |
clarkb | looks like we are pretty consistent about that. However, gerrit doesn't do so | 23:36 |
clarkb | I suspect that this is still workable | 23:37 |
clarkb | but something to be aware of and maybe we would write a special unit just for gerrit or something to have more control over it | 23:37 |
corvus | er i'm lost | 23:37 |
corvus | what's the problem? is the assertion that podman and docker backends would do something different? | 23:38 |
clarkb | corvus: yes I think so. I believe that if we issued a reboot on review02 today without touching docker state that gerrit would be started up automatically for us on boot. However, I am not 100% certain of this I would need to test it | 23:38 |
clarkb | and the reason for that is the containers would've been in a running state when the docker daemon was asked to shutdown so it would restore to that state | 23:38 |
clarkb | corvus: in the case of podman and its magical restart systemd unit it only does so for containers with a policy of restart always | 23:39 |
clarkb | the vast majority of our containers are set to restart always so I think in the general case we are fine. Gerrit is the one exception I know of but there may be a few others too (I haven't done an exhaustive search yet) | 23:39 |
clarkb | I'll set up a hold for gerrit that will enable me to test this easily | 23:40 |
corvus | well, at one point we were going for "unless-stopped" which i advocate for and think is the safest | 23:40 |
corvus | i still think our containers should be set to that | 23:40 |
corvus | but it seems like we did not come to a consensus on that because they are in fact all over the map | 23:41 |
corvus | so i think there's 3 things to straighten out: | 23:41 |
corvus | 1) what should our container policy be? i think the only wrong answer here is "whatever". then there's a series of better answers that are debatable. :) | 23:41 |
corvus | 2) what does docker do with a running container configured with a restart value other than "always". | 23:42 |
clarkb | I have rechecked https://review.opendev.org/c/opendev/system-config/+/893571 and will put a hold on that for testing of reboot behavior with dockerd | 23:42 |
corvus | 3) what does podman do in the same situation. | 23:42 |
corvus | i think we have some assumptions about the answer to #2 but we should check those before comparing to podman. | 23:43 |
clarkb | corvus: ++ my check and hold are in place and should allow us to answer #2 | 23:43 |
clarkb | we can easily modify the docker-compose file on that held node to test different restart values for dockerd. Then separately we can test with the held paste server for podman behavior | 23:43 |
corvus | looks like we have a mix of "always" "no" "unless-stopped" and "on-failure". | 23:43 |
clarkb | corvus: I think some of the gerrit containers are also unset which would get us whatever the default behavior is? | 23:44 |
clarkb | no is the default I think | 23:45 |
corvus | https://docs.docker.com/engine/containers/start-containers-automatically/ says default is "no" | 23:45 |
corvus | i can't recall if there is a reason we set gerrit to "no" these days, other than just wanting to do sanity checks after a crash... | 23:45 |
clarkb | ya I think that is the only reason. We want to make sure the host is happy before starting the service | 23:45 |
clarkb | we have the backing data in a volume which could potentially not mount cleanly when we reboot | 23:46 |
corvus | good point. and for everything else, i think "unless-stopped" would be a good standard policy for us. | 23:46 |
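The policy corvus advocates is a one-line change per service in the compose file (the service and image names below are hypothetical examples, not a specific opendev deployment):

```yaml
# Hypothetical compose stanza showing the proposed standard policy.
# "unless-stopped" restarts the container on daemon/host restart unless
# it was explicitly stopped; "no" (the default) never auto-restarts.
services:
  lodgeit:
    image: docker.io/opendevorg/lodgeit
    restart: unless-stopped
```

As the experiments later in the log show, podman's contrib restart unit only matches `restart-policy=always`, so under podman `unless-stopped` currently behaves like `no` across reboots.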
corvus | (maybe we could make gerrit safe for unless-stopped by throwing in a sanity check script in the entrypoint) | 23:47 |
clarkb | corvus: or if we have to write a special unit for it anyway that unit could depend on the fs mount or something along those lines | 23:47 |
clarkb | I manually changed the restart: always to restart: no and down then up'd the held lodgeit services. I'll reboot now to see what it does. Then step through for the other two restart policies | 23:48 |
corvus | yeah. i'm holding out hope that the experimental results come in with equivalent behavior. that's certainly what i'd expect. but if they really mean "always" and not "unless stopped" then i guess we should do that. | 23:48 |
clarkb | I wonder if the problem is podman doesn't keep that state around when stopped? | 23:49 |
clarkb | corvus: https://github.com/containers/podman/blob/main/contrib/systemd/system/podman-restart.service.in#L12 looking at this I think that always is important | 23:49 |
corvus | yeah, if the only data available to it is the policy, then i reckon their hands are tied. | 23:49 |
corvus | it sure does look like it does what it says. | 23:50 |
clarkb | though the next line implies maybe there is some other state because it's stopping all the containers without always | 23:50 |
clarkb | with policy no under podman the containers don't appear to restart on reboot. One of them is in a "created" state now. The other exited(0) | 23:51 |
corvus | i'm having trouble understanding what the ExecStop is good for: https://github.com/containers/podman/blob/main/contrib/systemd/system/podman-restart.service.in#L13 | 23:53 |
clarkb | corvus: maybe it avoids the created state like I saw (I haven't checked the local copy of the unit specification to see if it does that stop) | 23:53 |
clarkb | unless-stopped is the same as no with podman | 23:55 |
clarkb | corvus: the exec stop is present on my held paste noble node so not sure now | 23:56 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!