tonyb | huzzah | 01:27 |
tonyb | When y'all are awake any chance I can get some reviews on: https://review.opendev.org/q/project:zuul/nodepool+status:open+owner:tonyb | 06:04 |
frickler | tonyb: I can't ensure being awake, but in my current state I only have a -1 for you | 06:13 |
tonyb | frickler: I'll take a -1 | 06:15 |
frickler | zuul-lb02 says: Packages with upgradable origin but kept back: Ubuntu noble-updates: rsyslog | 06:32 |
frickler | oops where did these trailing spaces come from? anyway, I'll check what apt says about this when invoked manually | 06:33 |
noonedeadpunk | I guess this can be landed now?:) https://review.opendev.org/c/opendev/system-config/+/946068 | 07:41 |
noonedeadpunk | as seems that ubuntu has updated packages after the release: https://ubuntu-cloud.archive.canonical.com/ubuntu/dists/noble-updates/epoxy/ | 07:42 |
tonyb | I think there are concerns about the AFS quota | 07:42 |
noonedeadpunk | there was plenty for uca last time I checked? | 07:43 |
noonedeadpunk | (and UCA is quite small) | 07:43 |
tonyb | I can look at it in a bit, just explaining one potential holdup | 07:43 |
noonedeadpunk | If I'm not mistaken - that's current usage? https://grafana.opendev.org/d/9871b26303/afs?orgId=1&from=now-6h&to=now&timezone=utc&viewPanel=panel-34 | 07:44 |
noonedeadpunk | ++ | 07:44 |
noonedeadpunk | I think when we discussed it a couple of days ago, folks just wanted to wait for the release, not fetch packages that will be updated in a day or so | 07:44 |
tonyb | noonedeadpunk: Well if you're correct (and you probably are), there's no space issue there. | 07:47 |
tonyb | frickler: any reason I, or you, shouldn't +A https://review.opendev.org/c/opendev/system-config/+/946068 ? | 07:48 |
frickler | tonyb: noonedeadpunk: ah, right, I wanted to do that yesterday already, but somehow I forgot about it with all the release things happening. approved and I'll check the sync log later | 07:58 |
tonyb | frickler: Thank you | 08:02 |
noonedeadpunk | thanks! | 08:02 |
noonedeadpunk | well, packages got updated this morning, so it's good it wasn't done yesterday :) | 08:02 |
frickler | tonyb: I guess once noble is in, we can also finally proceed with kicking xenial out? ;) https://review.opendev.org/c/opendev/system-config/+/883468 | 08:12 |
frickler | btw. this is the output from zuul-lb02, didn't we disable phased-upgrades everywhere? or only in our mirrors? I assume that we do not use our mirrors on these servers in order to be able to get security updates as fast as possible? https://paste.opendev.org/show/bRW1B90Hvu5pO9PMXH3K/ | 08:13 |
tonyb | frickler: Maybe? I'd want to double check we don't have any old servers running xenial | 08:20 |
tonyb | frickler: I have no idea about phased updates, I'd like to think that updates coming from ubuntu-security wouldn't be phased? | 08:31 |
opendevreview | Merged opendev/system-config master: Add Epoxy UCA to mirrors https://review.opendev.org/c/opendev/system-config/+/946068 | 08:37 |
frickler | oh my, we're really still building xenial images. then I agree it isn't as trivial as I assumed. I'd still vote to proceed with the cleanup, but will wait for more feedback first | 08:58 |
frickler | uca update deployed successfully, next run should be around 10 UTC | 09:00 |
tonyb | frickler: I doubt I'll be of any help but I'll be around at 10 UTC | 09:22 |
noonedeadpunk | ok, nice (about uca) | 09:41 |
noonedeadpunk | seems epoxy is there (ᵔᴥᵔ) | 10:55 |
fungi | clarkb: yeah, the tomllib used in python throws a parsing error if you try breaking an inline table into multiple lines | 12:01 |
fungi | apparently this would need toml 1.1.0 which still doesn't seem to have emerged | 12:04 |
fungi | frickler: note that's coming from noble-updates not noble-security | 12:09 |
fungi | were you expecting a security update for it? | 12:10 |
frickler | fungi: no, I only mentioned security updates because that's the main reason I see why we'd use upstream repos instead of our mirrors, see the paste I posted. using our mirror would avoid delaying phased upgrades iirc. but there's also no real issue, I only checked the automated mail because we had long-stuck upgrades earlier | 12:31 |
Clark[m] | the concern with afs quotas is for Ubuntu and centos stream volumes. Not the uca volume. We need to increase quota for Ubuntu and uca | 13:34 |
Clark[m] | Xenial sticks around for a number of reasons. On the infra side I'm happy to drop our testing and muddle through. However Openstack relies on it for translations? Or is that bionic? The main thing is we have to clean up what we can before removing the test images then dropping it from the mirror | 13:35 |
Clark[m] | We have never used our CI mirrors for production nodes. I don't think we should either. We can however disable the phased updates on newer nodes whose apt supports it. It's just a config flag iirc? | 13:36 |
frickler | yes to the latter | 13:37 |
frickler | and I think translation updates run on bionic, but I need to check again | 13:37 |
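The config flag Clark[m] alludes to is, I believe, apt's Always-Include-Phased-Updates option. A sketch of opting a host out of phased updates follows; in production the file would live under /etc/apt/apt.conf.d/, but here it is written to a temp dir purely so the content can be checked without root:

```shell
# Hedged sketch: make apt (and update-manager) always take phased updates
# immediately instead of holding some back. The file name is our choice.
conf_dir="$(mktemp -d)"
cat > "${conf_dir}/99-always-include-phased-updates" <<'EOF'
APT::Get::Always-Include-Phased-Updates "true";
Update-Manager::Always-Include-Phased-Updates "true";
EOF
```

With that file in /etc/apt/apt.conf.d/, the "kept back" messages for phased packages like the rsyslog one above should stop.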
frickler | noonedeadpunk: noble epoxy update seems to have worked fine, do you have a job already where you can check it? | 13:39 |
frickler | the other thing that came to my mind when looking at dib-image-list: do we want to clean up the 3y old gentoo images? and possibly also > 120d old openeuler? | 13:42 |
Clark[m] | I think we should clean out Gentoo. And openeuler was in the process of getting updated (it broke because the mirror updated and then they didn't also update the images) but only updated half of what needed updating | 13:43 |
Clark[m] | I think we were hoping that there would be more of a push for the second half but I guess that fizzled out | 13:44 |
frickler | translation is ubuntu-bionic, see e.g. https://zuul.openstack.org/build/f330dfc184be480187a10f59c5f6d435 . there isn't a way to filter builds by nodeset, or is there? | 13:44 |
frickler | the images are named openEuler-22-03-LTS*, new ones would be *-24-* I think? but anyway let me look at cleaning gentoo and hope JayF doesn't get too sad ;) | 13:46 |
fungi | worth keeping an eye out for breakage today related to https://discuss.python.org/t/upcoming-changes-in-the-pypa-wheel-project/85967 | 13:46 |
fungi | i checked back when the plan was first announced and couldn't find anywhere we're doing stuff they're removing, but just be aware as i could have missed something | 13:48 |
frickler | nice, very considerate of them to wait until after our release ;) | 13:51 |
Clark[m] | Er I realize I said we need to increase quota for Ubuntu and uca earlier. I meant Ubuntu and centos stream | 13:52 |
noonedeadpunk | frickler: well, partially. https://review.opendev.org/c/openstack/openstack-ansible-openstack_hosts/+/945569 is passing now. I could push some with depends-on, but I'd rather wait for this one to merge and then any other job will show the result | 13:56 |
noonedeadpunk | but at least apt is fine | 13:57 |
noonedeadpunk | https://zuul.opendev.org/t/openstack/build/fe899483e4274a83b5de4b5837092cb8 was failing before fwiw | 13:57 |
JayF | frickler: I didn't even know we had published Gentoo images | 13:59 |
fungi | they've been held and broken for a long time | 14:06 |
opendevreview | Dr. Jens Harbott proposed opendev/base-jobs master: Drop gentoo-17-0-systemd nodeset https://review.opendev.org/c/opendev/base-jobs/+/946264 | 14:47 |
clarkb | tonyb: I know you expressed interest in doing afs quotas | 14:58 |
clarkb | tonyb: I'll hold off on doing them myself so that you can potentially do that today. Let me know if you have questions (afs has its quirks) | 14:59 |
dan_with | Hi, I work for Rackspace Flex. Could you reboot a VM in your account? VM instance ID: 902d04c1-e5b5-45dc-8317-ce90e3114ebc, name: mirror01.dfw3.raxflex.opendev.org? It has a volume attached that only has one multipath connection. | 14:59 |
dan_with | Even better if you can power it off for 15 minutes to let me migrate the VM and volume | 15:00 |
fungi | dan_with: how urgent is it? if you can wait a couple hours, that would give us an opportunity to let any running jobs in that region complete so we don't cause any failures | 15:04 |
fungi | then we can take the server offline as long as you need | 15:04 |
fungi | if it's urgent, we can of course absorb the hit, just not as gracefully | 15:05 |
dan_with | It can wait a few hours. It's not in immediate danger. I just don't want it to be more than a few hours. Thanks | 15:05 |
fungi | dan_with: you've got it, i'll get things underway for that asap and let you know once we've got the server powered off | 15:05 |
dan_with | ok thanks | 15:06 |
clarkb | another option (and I'm not advocating for this now just noting it for the future) is that nodepool is quota aware. So dan_with could set our instance quota to 0 then wait for graphs on https://grafana.opendev.org/d/6d29645669/nodepool3a-rackspace-flex?orgId=1&from=now-6h&to=now&timezone=utc&var-region=$__all to show the region as unused and do the reboot on the cloud side | 15:06 |
clarkb | I appreciate the coordination with us. Just wanted to make note of ^ | 15:07 |
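For reference, draining a region the way clarkb describes is roughly a one-line nodepool config change; the provider and pool names below are placeholders, not the actual opendev configuration:

```yaml
# Hypothetical fragment of a nodepool launcher config. Setting max-servers
# to 0 makes nodepool stop booting new nodes in that pool while existing
# ones finish their jobs and drain away.
providers:
  - name: raxflex-dfw3
    pools:
      - name: main
        max-servers: 0
```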
opendevreview | Jeremy Stanley proposed openstack/project-config master: Temporarily turn down raxflex-dfw3 use https://review.opendev.org/c/openstack/project-config/+/946265 | 15:08 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Revert "Temporarily turn down raxflex-dfw3 use" https://review.opendev.org/c/openstack/project-config/+/946266 | 15:08 |
dan_with | Yeah, ideally we would want to actually poweroff the VM from Skyline or OpenStack, then do migration to a new host. | 15:09 |
fungi | i've self-approved 946265 but will also apply it locally on nl05 so it takes effect a little sooner | 15:10 |
fungi | clarkb: nodepool-launcher notices config changes straight away and doesn't need a restart, right? | 15:11 |
clarkb | fungi: changes to its own config yes. I think its less reliable (or doesn't happen at all) for clouds.yaml | 15:14 |
clarkb | so ya if you manually edit the file (be careful to get the region correct; the ansible yaml rewriting changes the order and makes things harder to read) it should take effect immediately, then the ansible run should noop and enforce the new value | 15:14 |
fungi | cool, that's what i did | 15:15 |
clarkb | we will also want to put the server in the emergency file while it is offline | 15:15 |
clarkb | several ansible playbooks will be sad if it is down | 15:15 |
fungi | aha, yeah i'll go ahead and add it now | 15:16 |
clarkb | the grafana dashboard I linked previously reflects your update | 15:18 |
clarkb | and we are down to 2 in use nodes | 15:19 |
clarkb | hrm max just went back to 32 | 15:19 |
clarkb | did you race the hourly jobs maybe? | 15:19 |
clarkb | I think you must've | 15:19 |
fungi | clarkb: no, i accidentally set the wrong region to 0 initially because of the way the associative arrays are randomly arranged in the generated nodepool.yaml file | 15:20 |
fungi | but it's on the right one now | 15:20 |
fungi | and it's fallen to 23 in-use according to `nodepool-launcher list` | 15:21 |
fungi | well, 24 now, i guess one was transitioning from booting to in-use | 15:22 |
clarkb | oh I see I mixed them up on the graphs | 15:23 |
opendevreview | Merged openstack/project-config master: Temporarily turn down raxflex-dfw3 use https://review.opendev.org/c/openstack/project-config/+/946265 | 15:25 |
fungi | we're down to 5 servers in dfw3 now | 15:45 |
corvus | fungi: should probably do niz as well | 15:49 |
fungi | oh, right i forgot it was in use there too | 15:49 |
corvus | i'll whip up a change | 15:49 |
fungi | thanks! saves me remembering where that is off the top of my head | 15:49 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Disable rax-flex-dfw3 https://review.opendev.org/c/opendev/zuul-providers/+/946273 | 15:52 |
corvus | now we'll see if the schedulers are running with the patch that implements that.... | 15:53 |
corvus | survey says yes | 15:53 |
fungi | down to 2 in-use servers in dfw3 now | 15:55 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Revert "Disable rax-flex-dfw3" https://review.opendev.org/c/opendev/zuul-providers/+/946274 | 15:55 |
opendevreview | James E. Blair proposed opendev/zuul-providers master: Use built-in noop job https://review.opendev.org/c/opendev/zuul-providers/+/946275 | 15:57 |
opendevreview | Merged opendev/zuul-providers master: Disable rax-flex-dfw3 https://review.opendev.org/c/opendev/zuul-providers/+/946273 | 15:58 |
opendevreview | Merged opendev/zuul-providers master: Use built-in noop job https://review.opendev.org/c/opendev/zuul-providers/+/946275 | 15:59 |
fungi | okay, dfw3 is empty now, shutting down the mirror there temporarily | 16:01 |
dan_with | okay let me know when you are ready for me to do migrations | 16:02 |
fungi | still reporting ACTIVE state in nova, so may take a little longer | 16:03 |
fungi | er, i was looking at the wrong region/mirror | 16:04 |
fungi | dan_with: mirror01.dfw3.raxflex.opendev.org is reporting POWEROFF state in nova now, so should be all set for you. take as long as you need, but please let us know when it's done | 16:04 |
dan_with | Roger that. I will let you know when done | 16:04 |
fungi | s/POWEROFF/SHUTOFF/ | 16:05 |
fungi | thanks! | 16:05 |
fungi | clarkb: this is redundant, right? https://opendev.org/opendev/yaml2ical/src/branch/master/setup.cfg#L21-L23 | 16:08 |
fungi | looks like it's been in there since that file was first created | 16:10 |
fungi | so i'm guessing it's cargo culted | 16:10 |
clarkb | fungi: that is a bit brain melty | 16:15 |
fungi | yeah | 16:15 |
clarkb | fungi: I think it may not be redundant bceause pbr is saying its own setup hook is that one but others may override? | 16:15 |
clarkb | it depends on how we look that up | 16:15 |
clarkb | oh but this is in yaml2ical | 16:16 |
fungi | right | 16:16 |
fungi | it's not clear why a pbr-using project would have to specifically indicate pbr's setup hook, since i thought pbr injected that | 16:17 |
clarkb | fungi: pbr/util.py looks for [global].setup_hooks not setup-hooks. So ya its not doing anything. But then pbr always runs its setup hook after the project specific setup hooks so it is also redundant | 16:17 |
clarkb | I agree it is redundant (and also buggy so it does nothing and then uses pbr defaults) | 16:17 |
fungi | thanks, that's what i thought. will clean it up as part of the overhaul i'm in th emiddle of | 16:18 |
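For context, the cargo-culted setup.cfg block being discussed looks approximately like this (reconstructed from the conversation, not quoted from the repo):

```ini
[global]
setup-hooks =
    pbr.hooks.setup_hook
```

As clarkb notes, pbr's config reader looks for the key `setup_hooks`, so the hyphenated spelling is never read at all; and even the matching key would be redundant, because pbr always runs its own setup hook after any project-specific ones.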
opendevreview | Jeremy Stanley proposed openstack/project-config master: Move yaml2ical to the opendev tenant https://review.opendev.org/c/openstack/project-config/+/946280 | 16:36 |
dan_with | Can you start the VM now, without putting it back in the rotation/pool, so I can check the status of the multipath connections? | 16:39 |
clarkb | we can | 16:40 |
fungi | looks like it's already started | 16:40 |
clarkb | https://mirror.dfw3.raxflex.opendev.org isn't returning data for me yet | 16:41 |
fungi | status ACTIVE | 16:41 |
fungi | updated 2025-04-03T16:23:27Z | 16:41 |
fungi | so it was presumably booted about 18 minutes ago | 16:41 |
fungi | though i'm currently unable to ssh into it | 16:42 |
clarkb | doesn't seem to ping either. Might need to check the console log? | 16:42 |
fungi | `openstack console log show ...` returns an error | 16:42 |
fungi | Instance 902d04c1-e5b5-45dc-8317-ce90e3114ebc could not be found. | 16:43 |
dan_with | hold on just a second | 16:43 |
clarkb | infra-root can I get reviews on https://review.opendev.org/c/opendev/system-config/+/946050 to update our gerrit image to the latest bugfix release? If we think we're sufficiently past the openstack release Id be happy to do a gerrit restart for that early tomorrow morning? That should be a good time for it | 17:03 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Update Python versions and boilerplate https://review.opendev.org/c/opendev/yaml2ical/+/946284 | 17:04 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Address W504 linting rule https://review.opendev.org/c/opendev/yaml2ical/+/946285 | 17:04 |
clarkb | fungi: I left a question on 946284 before doing a more in depth review (wanted to see test results too) | 17:22 |
clarkb | mnaser: I'm starting to look into replacing our gerrit server (review02 currently hosted in vexxhost ca-ymq-1) with a new server in the same location to do two things: 1) update the base operating system and 2) switch flavors to vexxhost's v3 flavors so that we can avoid boot from volume (in particular this will simplify server rescuing should we ever need that). Currently our quota | 17:28 |
clarkb | limit for memory is 204800, we're using 156672 of which ~128gb is the existing server. Would it be possible to bump that limit up so that we can have a second 128gb server up for a few weeks (my goal is to have the new server in production by the 18th so maybe we would shutdown the old server by the 25th preserving only its old root disk and volume?) | 17:28 |
clarkb | mnaser: happy to chat more if it helps or answer questions. | 17:28 |
mnaser | clarkb: if you drop boot from volume it will make it harder for us to live migrate the server though :< | 17:30 |
mnaser | but we can totally do it that's fine | 17:30 |
clarkb | that is good to know about live migration. I wonder if the gerrit server is very live migratable anyway, just due to its size and activity? We can probably take cold migrations periodically if they aren't too long. | 17:31 |
fungi | i didn't realize bfv made live migration easier, something to do with avoiding the nova image cache? | 17:31 |
clarkb | the problem is server rescuing is an almost completely undocumented magical dark art with boot from volume | 17:32 |
mnaser | it very much is lol | 17:32 |
mnaser | clarkb: i'm curious if you check instance actions for the gerrit instance if we've already done live migrations | 17:32 |
clarkb | mnaser: would that show up under openstack server show output? | 17:32 |
mnaser | openstack server action list i think | 17:33 |
clarkb | `server event list` seems to be it. Double checking now | 17:34 |
clarkb | yes there are several live migrations. Based on timestamps we have two in july 2021, four a few minutes apart in september 2021, one in october 2022, and three a few minutes apart in april 2023 | 17:37 |
fungi | so it's been about 2 years since the last one | 17:37 |
fungi | i guess we're overdue ;) | 17:37 |
clarkb | and at that frequency I feel like we could take a cold migration once in a while if necessary | 17:38 |
clarkb | I guess it's a tradeoff between relying on live migration to hopefully make things painless most of the time and incurring higher pain if something goes really wrong, vs making the really-wrong case easier to work with and having more periodic lesser pain | 17:38 |
clarkb | I personally want to avoid major pain in the major problem cases (where you'd rescue) but I'm open to being convinced otherwise, particularly since we have never had to rescue this instance | 17:39 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Add eatmydata support to ensure-zookeeper https://review.opendev.org/c/zuul/zuul-jobs/+/946289 | 17:40 |
mnaser | clarkb: also the other thing is i have no data guarantee if you dont use bfv | 17:42 |
mnaser | it is local storage 100%, so if the local raid array blows up | 17:42 |
mnaser | it's time to pull out backups | 17:42 |
mnaser | (i dont wanna nag, i just wanna manage expectations :)) | 17:42 |
clarkb | mnaser: the important data does live on a data volume that we mount | 17:42 |
mnaser | ahhh okay got it | 17:42 |
clarkb | mnaser: and its definitely not nagging this is important useful info. I appreciate it | 17:42 |
mnaser | so /var/gerrit or whatnot goes into data, but the actual os in the root | 17:42 |
clarkb | yup | 17:43 |
mnaser | ok so you're good it's almost like you've used this cloud for a while =) | 17:43 |
clarkb | mnaser: with v3 flavors you can choose to bfv or use local data disks? | 17:43 |
clarkb | or is bfv v2 flavors only? | 17:43 |
mnaser | yes, if you give it a boot volume, it will boot from it, otherwise, whatever the root_gb is will be on local disk | 17:43 |
clarkb | got it. So maybe the thing to do is sleep on it with the info we have and decide if we stick to bfv for its upsides given the downside has never arisen yet | 17:44 |
mnaser | yeah v3 you can have both :) | 17:44 |
clarkb | and the downside is solveable its just not as easy as it would be without bfv | 17:44 |
clarkb | mnaser: have you ever successfully rescued an instance that was booted from volume in that region? I'm just wondering if anyone else might be able to point us in the right direction should it come up | 17:45 |
clarkb | I could boot a test node and figure it out myself. Maybe that is step 0 and document it | 17:45 |
corvus | i wonder if lack of storage guarantees on / would increase the chances we would need to perform a rescue | 17:46 |
clarkb | corvus: ya that very well could be | 17:46 |
clarkb | and more generally maybe this is feedback we should try to get back to the nova team. Boot from volume is valuable to users for data integrity but it is difficult to work with, which can scare people away. Addressing the difficulty would help users take advantage of important features | 17:48 |
clarkb | ok I think I've convinced myself to test bfv rescuing first on a new dummy node (no quota updates needed for that) | 17:49 |
jrosser | I have done bfv rescue on my clouds, I can dig some notes out later if that’s helpful | 17:49 |
clarkb | jrosser: ya I seem to recall you helped the last time this came up | 17:49 |
clarkb | but then we sorted it out otherwise or something and never did the last step of actually testing it | 17:49 |
jrosser | iirc there was an image property involved and needing to specify a micro version on the cli | 17:51 |
jrosser | and an unobvious hazard of not being able to rescue bfv instance with the same image | 17:51 |
clarkb | that sounds familiar. You basically need a special rescue image with a special property set. And you have to use a microversion that enables rescuing bfv instances at all, and due to the special image you have to supply an image and can't rely on the default, which is to use the existing image | 17:52 |
clarkb | jrosser: I guess if you have those details that would be helpful. But no rush I probably won't get to this until tomorrow at the earliest | 17:54 |
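Based on jrosser's recollection, a BFV rescue might look roughly like the following; the image and server names are placeholders, and the property name and microversion (Nova's "stable device rescue", microversion 2.87) should be verified against the Nova docs before relying on this:

```shell
# Hedged sketch: mark a dedicated rescue image as usable for stable
# device rescue, then rescue a boot-from-volume server with it. An
# explicit --image is required; the instance's own image reportedly
# cannot be reused for a BFV rescue.
openstack image set rescue-img --property hw_rescue_device=disk

openstack --os-compute-api-version 2.87 server rescue \
    --image rescue-img test-bfv-server

# when finished:
openstack server unrescue test-bfv-server
```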
clarkb | fungi: yaml2ical's pyproject.toml has an invalid license field but it isn't apparent to me why that is based on the error and what is in the file | 17:56 |
dan_with | I'm still working this issue | 17:58 |
clarkb | dan_with: ack let us know if there is anything we can do to help | 17:59 |
fungi | dan_with: no worries, and best of luck! we're fine however long you need | 18:01 |
dan_with | Thank you. Still getting the kinks worked out with a new cloud | 18:01 |
fungi | i feel at least partly at fault for whatever trauma openstack is causing you right now ;) | 18:02 |
dan_with | lol ;) | 18:06 |
mnaser | fungi: "A previous PEP had specified license to be a table with a file or a text key, this format is now deprecated. Most build backends now support the new format as shown in the following table." | 18:13 |
mnaser | i wonder if too old setuptools that is doing that validation there | 18:13 |
mnaser | since it does seem to want a key with file or text | 18:13 |
*** dhill is now known as Guest12857 | 18:14 | |
fungi | mnaser: that's a too-old setuptools, yes | 18:14 |
fungi | i think it needs at least setuptools 77 | 18:14 |
mnaser | or deprecated format :P | 18:15 |
mnaser | depends how long you want your afternoon to be =P | 18:15 |
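Side by side, the two license formats mnaser quotes look like this (the metadata values are illustrative, not from yaml2ical):

```toml
[project]
# Deprecated PEP 621 form: a table with a "file" or "text" key.
# license = { text = "Apache-2.0" }

# Current PEP 639 form: a plain SPDX expression string; setuptools
# only accepts this from roughly version 77 onward.
license = "Apache-2.0"
license-files = ["LICENSE"]
```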
clarkb | mnaser: if it were you it sounds like you'd use bfv ya? | 18:16 |
mnaser | clarkb: i feel like there's likeliness that the os would get so borked it would require a rescue, even less if it's all sitting on fairly reliable ceph | 18:16 |
mnaser | sorry, little likeliness | 18:17 |
fungi | mnaser: looks like the fix is that test-release-openstack needs to run on a newer python version | 18:17 |
clarkb | mnaser: ya I think that is what corvus is saying too. Basically by being bfv we reduce the likelihood and then you don't have to worry so much so maybe its best to take advantage of that | 18:17 |
fungi | right now it's running on ubuntu-focal with python 3.8 and the necessary setuptools version to support that syntax only works with python 3.9 or newer | 18:17 |
mnaser | fungi: that'll do it :) | 18:17 |
clarkb | fungi: we can drop the job particularly if moving into the opendev tenant and just have your nox targets build an sdist first or something | 18:18 |
mnaser | clarkb: yeah, that's my thoughts to be honest, also less likely to scramble for a rebuild if a hypervisor blows up too | 18:18 |
fungi | clarkb: yes, that will likely be my "fix" | 18:18 |
clarkb | mnaser: ack I think that perspective is useful too | 18:18 |
clarkb | since you've got a much broader set of experience dealing with the cloud and openstack and when things fail than we do | 18:18 |
mnaser | if a hypervisor dies, i start your instance up on another node and we go back to regular programming | 18:19 |
mnaser | vs you having to rebuild the whole thing at 2am | 18:19 |
clarkb | I'm coming around to that and thinking bfv is a good idea even with the separate data volume | 18:19 |
mnaser | i say bfv + separate data for sure, to make migration/updates easier | 18:20 |
clarkb | ++ | 18:20 |
clarkb | just need the quota bump and I should be able to start spinning something up that we can migrate onto | 18:20 |
mnaser | sorry i forgot if i asked the project id | 18:20 |
mnaser | i should be able to add, just let me know what you want the new values to be | 18:20 |
clarkb | 204800 is the current value. Maybe we double it? that would be 409600 (happy to do less if you prefer). | 18:21 |
clarkb | Let me find the project id | 18:21 |
mnaser | openstack token issue -- or if you have the project name | 18:22 |
mnaser | openstackci maybe it was? | 18:22 |
clarkb | ya just confirmed that appears to be it | 18:22 |
clarkb | openstackci specifically | 18:22 |
mnaser | done | 18:22 |
clarkb | I see it reflected in the limits show --absolute output. Many thanks! | 18:23 |
clarkb | and for the record the other limitation is max_total_volume_gigabytes but by my math we have sufficient headroom there | 18:24 |
clarkb | so no need to update it | 18:24 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Move yaml2ical to the opendev tenant https://review.opendev.org/c/openstack/project-config/+/946280 | 18:25 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Update Python versions and boilerplate https://review.opendev.org/c/opendev/yaml2ical/+/946284 | 18:25 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Address W504 linting rule https://review.opendev.org/c/opendev/yaml2ical/+/946285 | 18:25 |
fungi | clarkb: as far as the not wrapping lines after operators rule, that was apparently something pep 8 was edited years back to reverse the earlier recommendation that you should | 18:26 |
clarkb | I can't help but feel like that is a very pythonic approach. Tell everyone you should do things one arbitrary way and get them used to it. Then as soon as everyone is comfortable, change to do the exact opposite | 18:27 |
fungi | hah | 18:27 |
fungi | so true | 18:27 |
clarkb | mnaser: oh I also meant to ask if you've seen any gitea slowness issues? I've tried to sort of keep an eye on it and haven't seen evidence of that since we did the memcache update and further blocked some ai web crawlers | 18:28 |
fungi | https://peps.python.org/pep-0008/#should-a-line-break-before-or-after-a-binary-operator has some of the discussion | 18:29 |
mnaser | clarkb: it's actually been pretty good lately, haven't seen any issues | 18:29 |
clarkb | mnaser: awesome thanks for confirming. I think our improvements to web traffic handling for crawlers, memcached, and a bugfix to prevent OOMs in gitea have made gitea much more stable in the last few weeks than it was a couple months ago | 18:30 |
clarkb | fungi: those arguments are convincing but once you've spent decades doing it the other way you don't get to ban hammer everyone for doing it that way | 18:30 |
clarkb | just accept both | 18:30 |
fungi | indeed, we could just ignore both W503 and W504 | 18:31 |
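If that's the route taken, ignoring both rules is a one-line flake8 setting; whether the section lives in setup.cfg or tox.ini depends on the repo:

```ini
# Sketch: accept line breaks both before (W504-style) and after
# (W503-style) binary operators by suppressing both warnings.
[flake8]
extend-ignore = W503, W504
```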
clarkb | I think it was noonedeadpunk who also had gitea problems fetching constraints? I suspect that problem has faded away too | 18:36 |
fungi | here's hoping | 18:36 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Add eatmydata support to ensure-zookeeper https://review.opendev.org/c/zuul/zuul-jobs/+/946289 | 19:05 |
dan_with | hey @fungi, you okay with me starting that VM now to make sure everything is good? | 20:20 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Add eatmydata support to ensure-zookeeper https://review.opendev.org/c/zuul/zuul-jobs/+/946289 | 20:21 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: Remove Red Hat support from ensure-zookeeper https://review.opendev.org/c/zuul/zuul-jobs/+/946304 | 20:21 |
fungi | dan_with: you bet, go for it! | 20:22 |
dan_with | ok | 20:22 |
corvus | dan_with: yes, it's safe to restart it at any time | 20:22 |
clarkb | I'm going to pop out for a bike ride while the weather is nice | 20:58 |
clarkb | tonyb: I can help with afs quota stuff when I get back | 20:58 |
fungi | i'm also around to help with afs adjustment questions | 21:00 |
dan_with | I'm going to need to keep the server down; there is an error with glance/swift and the people who have access are away for the day. Are you all okay with it being down overnight or does it need to come back online today? | 21:52 |
fungi | dan_with: yeah, that's perfectly fine. sorry it's giving you so much trouble | 21:53 |
dan_with | Thanks for understanding. It uncovered an interesting issue that will make the cloud better. I'll keep you updated. It will be my priority until it is back online. | 21:54 |
fungi | much appreciated. we're notorious for exposing bugs in our own software somehow | 21:55 |
dan_with | lol. That's just good work | 21:55 |
tonyb | clarkb, fungi: Yup I'd like to do the AFS quota updates, but I'm in and out this morning so if we don't overlap maybe it'd be quicker if y'all do it this time | 22:22 |
tonyb | I assume I do account setup etc on afsdb01? | 22:35 |
fungi | tonyb: you mean addition of a superuser? | 22:37 |
fungi | tonyb: you can just use localauth on one of the fileservers if you prefer | 22:38 |
tonyb | Yeah (I think). I have created krb5 accounts tonyb{,/admin}@OPENSTACK.ORG, now I'm doing the `pts createuser` step (which I assume I'd need to do to issue quota change commands) | 22:38 |
fungi | tonyb: for example, recently i did this on afs01.dfw.openstack.org: sudo fs setquota -localauth /afs/.openstack.org/mirror/ubuntu-ports 850000000 | 22:39 |
fungi | because i was lazy and didn't feel like fiddling with authenticating my admin account | 22:40 |
tonyb | Oh, that's much easier | 22:40 |
fungi | (or not really lazy so much as i think it was a time period where openafs in debian/sid wasn't building its lkm on the available version of the kernel) | 22:41 |
tonyb | more efficient than lazy | 22:42 |
fungi | but yeah, if you add an admin account for yourself, you can do those operations from the comfort of your own workstation too | 22:42 |
tonyb | Oh okay, that's also good to know. | 22:42 |
fungi | if you want help with the kerberos user principal and admin user creation steps, i'm happy to assist though | 22:43 |
fungi | it eventually comes in handy, not so much for making quick quota adjustments but in other more involved work | 22:44 |
tonyb | I think I've done those already | 22:44 |
tonyb | Well the krb principal part | 22:44 |
fungi | so not the pts createuser and pts adduser steps yet | 22:45 |
fungi | and anyway, yeah i'd say do those similarly to the fs setquota example i gave, e.g. on afs01.dfw.o.o it'll be something like `sudo pts createuser -localauth $USERNAME -id UID` | 22:46 |
tonyb | Yup, but the docs are good, I was just unclear where to run them. | 22:47 |
fungi | or it might be `pts createuser $USERNAME -id UID -localauth` | 22:47 |
fungi | not sure if the -localauth can go between createuser and your username | 22:48 |
fungi | would need to check the manpage/context help | 22:48 |
tonyb | Okay | 22:48 |
fungi | but yeah, you'll either need to be root or sudo the command to use -localauth | 22:48 |
tonyb | FWIW: ```tonyb@afs01:~$ sudo pts createuser tony.admin -id 9 -localauth | 22:50 |
tonyb | User tony.admin has id 9 | 22:50 |
tonyb | ``` | 22:50 |
fungi | not quite as auspicious as 8, but still perfectly cromulent | 22:51 |
tonyb | frickler: has that honor | 22:51 |
fungi | his id embiggens us all | 22:51 |
tonyb | :) | 22:55 |
fungi | tonyb: unrelated, i replied to your comments on 946219, let me know if it's still confusing | 23:12 |
tonyb | Okay so I have accounts (I need to update UserList). I'm learning about AFS and verifying the capacity we have. From grafana it looks like we have 3 servers each with 5TB and we're using roughly 7.5TB, so we have "oodles" of room left. I haven't checked if the sum of all the quotas is < 15TB but I don't think that's needed .... So how much additional quota am I adding to which volumes (I guess mirror.ubuntu and | 23:16 |
tonyb | mirror.centos-stream) | 23:16 |
Clark[m] | Another thing to keep in mind is that you set quotas on the rw volume and the ro volume sort of catches up | 23:17 |
Clark[m] | I would bump them both by say 50gb? | 23:17 |
Clark[m] | The afs dashboard in grafana is a good resource for understanding usage and available room | 23:18 |
tonyb | fungi: LGTM. Happy for me to +A those changes? | 23:18 |
fungi | sure | 23:18 |
tonyb | clarkb: Thanks. I'll go ahead and do that | 23:20 |
fungi | note that it does change the representation in the package metadata slightly, and though pypi is able to figure out the previous arrangement and combine the separate "Author" (name) and Author-email metadata fields, the new arrangement is more correct and results in a name within the Author-email metadata field instead | 23:21 |
tonyb | fungi: noted | 23:21 |
opendevreview | Merged opendev/engagement master: Fix authors/maintainers format in pyproject.toml https://review.opendev.org/c/opendev/engagement/+/946219 | 23:22 |
fungi | tonyb: also the official spec is https://packaging.python.org/en/latest/specifications/pyproject-toml/#authors-maintainers | 23:23 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Update Python versions and boilerplate https://review.opendev.org/c/opendev/yaml2ical/+/946284 | 23:26 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Address W504 linting rule https://review.opendev.org/c/opendev/yaml2ical/+/946285 | 23:26 |
fungi | Clark[m]: 946280 got updated to remove the previously failing openstack-specific job too | 23:27 |
tonyb | So mirror.centos-stream is currently 350GB (350000000), so adding 50GB == 400000000 | 23:29 |
fungi | that looks right to me, yep | 23:30 |
* fungi counts the zeroes again | 23:30 | |
tonyb | and mirror.ubuntu is 1.2TB (1200000000), so adding 50GB == 1250000000 | 23:30 |
fungi | yes, correct number of nulls there too | 23:31 |
tonyb | Thanks | 23:32 |
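The quota arithmetic above can be checked with a few lines of Python. This is a sketch for zero-counting only; it assumes the decimal convention the channel is using (quotas expressed in kilobyte units, 1 GB == 1,000,000 KB), matching the 350GB -> 350000000 figures in the log.

```python
# AFS quotas here are expressed in kilobyte units; the channel's numbers
# imply decimal GB (1 GB == 1,000,000 KB).
KB_PER_GB = 1_000_000

def bump_quota(current_kb: int, add_gb: int) -> int:
    """Return the new quota (in KB units) after adding add_gb gigabytes."""
    return current_kb + add_gb * KB_PER_GB

# mirror.centos-stream: 350GB + 50GB
print(bump_quota(350_000_000, 50))    # 400000000
# mirror.ubuntu: 1.2TB + 50GB
print(bump_quota(1_200_000_000, 50))  # 1250000000
```

Counting the zeroes programmatically avoids exactly the double-check fungi does above.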
tonyb | `sudo fs setquota -localauth ....` is saying that -localauth isn't a valid flag, and the man page indicates I need to log in/auth to my tonyb.admin account, but that is counter-indicated by what fungi said earlier | 23:35 |
fungi | huh, i thought i had done that, but maybe it was an erroneous example from my shell history | 23:36 |
tonyb | Okay, I need to pop out for a bit. I'll figure it out when I get back | 23:36 |
tonyb | #Learning | 23:37 |
fungi | tonyb: i concur, command errors, manpage for fs_setquota also doesn't mention the ability to use localauth, i guess that's limited to pts, vos, et cetera | 23:39 |
fungi | so the example authentication in the next section of our docs is what you'll want with your admin account (pagsh ..., export KRB5CCNAME=..., kinit ..., aklog) | 23:40 |
fungi | you can do that locally if you already have the tools installed, or on the fileserver if you prefer | 23:41 |
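The authenticated flow fungi describes (pagsh, KRB5CCNAME, kinit, aklog, then the quota change) can be sketched roughly as below. This is a hedged sketch assembled from the steps named in the conversation, not the project's actual documented commands: the cache path, principal name, volume path, and quota value are all examples/assumptions.

```shell
# Start a new PAG (process authentication group) shell so the AFS token
# is scoped to this session:
pagsh

# Use a private Kerberos credential cache (path is an example):
export KRB5CCNAME=/tmp/krb5cc_afsadmin_$$

# Authenticate as the admin principal (name is an example from the log):
kinit tony.admin@OPENSTACK.ORG

# Convert the Kerberos ticket into an AFS token:
aklog

# Now fs setquota works as the authenticated admin; since -localauth is
# not supported by fs, no sudo/root is needed here:
fs setquota /afs/.openstack.org/mirror/centos-stream 400000000
```

As noted above, this can be run from a local workstation with the kerberos/openafs client tools installed, or on a fileserver.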
clarkb | fungi: re 946280 I guess we'll figure out pypi later if we need it? | 23:42 |
fungi | clarkb: no, the change in yaml2ical adds the non-openstack-specific jobs | 23:43 |
fungi | same ones bindep uses | 23:44 |
clarkb | ah | 23:44 |
fungi | so should "just work" | 23:44 |
opendevreview | Jeremy Stanley proposed opendev/bindep master: Fix authors/maintainers format in pyproject.toml https://review.opendev.org/c/opendev/bindep/+/946218 | 23:45 |
opendevreview | Jeremy Stanley proposed opendev/engagement master: Drop maintainers field from pyproject.toml https://review.opendev.org/c/opendev/engagement/+/946314 | 23:47 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Update Python versions and boilerplate https://review.opendev.org/c/opendev/yaml2ical/+/946284 | 23:48 |
opendevreview | Jeremy Stanley proposed opendev/yaml2ical master: Address W504 linting rule https://review.opendev.org/c/opendev/yaml2ical/+/946285 | 23:48 |
fungi | reading https://packaging.python.org/en/latest/specifications/core-metadata/#core-metadata-maintainer i just noticed this disclaimer: "Note that this field is intended for use when a project is being maintained by someone other than the original author: it should be omitted if it is identical to Author." | 23:49 |
fungi | so technically a mistake we've been making for years in setup.cfg as well | 23:54 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!