Friday, 2025-06-06

*** bauzas7 is now known as bauzas		00:49
opendevreview	Friedrich Hiekel proposed openstack/nova master: Fix nova-scheduler placement error https://review.opendev.org/c/openstack/nova/+/951936	08:00
stephenfin	sean-k-mooney: Could you add this follow-up to your review queue? https://review.opendev.org/c/openstack/nova/+/951640/	09:42
sean-k-mooney	i'm still going through my emails this morning so i can review it now	09:43
sean-k-mooney	ok so its a followup to api: Stop using wsgi.Controller.api_version to switch between API versions	09:43
stephenfin	yup	09:44
stephenfin	per gmaan's comments on same	09:45
sean-k-mooney	cool done	09:47
sean-k-mooney	assuming the parent passes ci this time it should be mergeable whenever it gets to that point	09:47
opendevreview	Stephen Finucane proposed openstack/nova master: api: Address issues with hypervisors APIs https://review.opendev.org/c/openstack/nova/+/950867	10:39
opendevreview	Stephen Finucane proposed openstack/nova master: api: Address issues with instance actions API https://review.opendev.org/c/openstack/nova/+/951941	10:45
opendevreview	Balazs Gibizer proposed openstack/nova master: Remove unused config options https://review.opendev.org/c/openstack/nova/+/951945	11:31
gibi	sean-k-mooney: stephenfin: a follow up for WSGIServer removal ^^	11:32
sean-k-mooney	didn't you propose that before? i thought i reviewed it	11:33
stephenfin	I love deleting code	11:33
stephenfin	done	11:33
gibi	thanks	11:33
sean-k-mooney	oh there are more cases we missed.	11:34
gibi	yepp, I just detected them when I started looking at the deeventletification of nova-api	11:34
sean-k-mooney	ok that's going to conflict with my pre-commit updates but i'll approve and i can update the last patch for bandit	11:36
sean-k-mooney	stephenfin: since you're about, mind looking at https://review.opendev.org/q/topic:%22update-precommit%22	11:36
stephenfin	done to	11:39
stephenfin	*too	11:39
sean-k-mooney	thanks, ill need to rebase the bandit one later but ill do that after the other changes have merged	11:40
stephenfin	gmaan: sean-k-mooney: gibi: If any of you have the time/inclination, I could do with some thoughts on how to approach https://github.com/stephenfin/nova/commit/2d14a58f6310861b426359d71fba085e924e0538 (not pushed yet to avoid overwhelming Gerrit more than I have already)	11:56
stephenfin	the tl;dr: is that our use of the "soft" additionalProperties validator doesn't work when oneOf is used	11:56
sean-k-mooney	... 4/6 just for servers. better than all in one but that's going to be fun to review...	11:57
stephenfin	That validator deletes unrecognised ("additional") keys from the input data. That's all fine and dandy normally, but when you're using 'oneOf' with 2 or more subschemas with non-overlapping keys, then keys that are invalid for an earlier schema but which are valid for a later schema will end up getting deleted	11:58
sean-k-mooney	stephenfin: so it's nice and short but is there a specific point you want input on?	11:58
stephenfin	yeah: how on earth do we work around this? 😅	11:59
sean-k-mooney	just reading now	11:59
sean-k-mooney	so ok where are you using oneOf there	11:59
sean-k-mooney	ah	12:00
sean-k-mooney	so either we get the reservation_id	12:00
sean-k-mooney	or the full server response?	12:00
stephenfin	yup	12:01
sean-k-mooney	can we set additionalProperties true in the reservation_id path	12:01
gibi	stephenfin: I'm a bit low on context. Why does the validator delete keys based on earlier schemas?	12:01
stephenfin	I didn't know that was a thing until I'd to write the schema for it	12:01
stephenfin	gibi: sean-k-mooney: Here's a much smaller example https://paste.opendev.org/show/bpifVj7I2TTaQtFwVj4T/	12:01
sean-k-mooney	stephenfin: reservation id is only used in multi create	12:02
stephenfin	gibi: https://github.com/openstack/nova/blob/master/nova/api/validation/validators.py#L182	12:02
stephenfin	first added here https://github.com/openstack/nova/commit/4fefd25c807c087694a0e66f4d91fe0e64ef3f18	12:02
sean-k-mooney	let me try something locally for one sec	12:03
stephenfin	gibi: the idea behind that appears to be to allow users to specify provide additional properties in requests when using the legacy v2 API, but to strip them out and ignore them so that e.g. they can't accidentally trigger code paths for newer API versions	12:03
stephenfin	s/specify provide/provide/	12:04
stephenfin	but if you apply that to that minimal reproducer I just shared, and make a request that satisfies the second subschema (e.g. {"bar": "hello, world"}) but not the first, we attempt to match against the first one and strip out properties that are not permitted by that	12:05
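The failure mode described above can be illustrated with a minimal, pure-Python sketch. This is not nova's validator and not the jsonschema library: `soft_strip`, `matches` and `validate_one_of` are hypothetical stand-ins, and the first subschema's key ("foo") is an assumption, since only the {"bar": "hello, world"} request body is quoted in the discussion.

```python
def soft_strip(instance, subschema):
    """Hypothetical 'soft' additionalProperties: delete undeclared keys in place."""
    allowed = subschema.get("properties", {})
    for key in list(instance):
        if key not in allowed:
            del instance[key]


def matches(instance, subschema):
    """Very reduced validity check: all required keys must be present."""
    return all(key in instance for key in subschema.get("required", []))


def validate_one_of(instance, subschemas):
    """Try each subschema in order, like oneOf; exactly one must match."""
    matched = 0
    for subschema in subschemas:
        # Each subschema is processed in isolation, so the stripping step
        # cannot know that a later subschema would have accepted the key.
        soft_strip(instance, subschema)
        if matches(instance, subschema):
            matched += 1
    return matched == 1


# Two subschemas with non-overlapping keys (assumed shape of the reproducer).
subschemas = [
    {"properties": {"foo": {}}, "required": ["foo"]},
    {"properties": {"bar": {}}, "required": ["bar"]},
]

body = {"bar": "hello, world"}
ok = validate_one_of(body, subschemas)
print(ok, body)
```

In this sketch the first subschema removes "bar" before the second one is ever tried, so a body that should satisfy the second branch ends up empty and matches neither.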
gibi	stephenfin: I see so this is only for the legacy v2 compatible mode	12:05
stephenfin	yes	12:05
gibi	so the rule would be to strip out only those properties that are not in the oneOf branch that eventually matched. But I guess this is not easy to implement in that context	12:06
stephenfin	I've been sitting on this for a few months, and now that things are merging rapidly (thanks sean-k-mooney, gmaan), it's properly time to figure out a solution 😅	12:06
stephenfin	no, unfortunately not	12:07
stephenfin	jsonschema iterates through the oneOf entries and processes each subschema separately, so we have no visibility into anything "above" the subschema, if that makes senes	12:08
stephenfin	*sense	12:08
gibi	then the best we can do is that for those schemas that have oneOf we relax the current rule and always preserve additional properties	12:09
stephenfin	Yeah, that's where I'm thinking we'll end up too. I just don't know the consequences of same	12:10
gibi	this is a compatibility layer that should not really be used any more	12:11
stephenfin	I was thinking I could move this "delete additional properties" logic into the various affected methods themselves, but that's a big list and I don't know if there's any way to know we're handling a v2 request vs. a v2.1+ request	12:11
gibi	I would be inclined to simply deprecate it as we cannot maintain it any more	12:11
stephenfin	yeah, that's the other option: delete v2	12:11
sean-k-mooney	is something like this valid https://paste.opendev.org/show/bkS797jQUZZoSNXnhlvO/	12:12
* stephenfin tries		12:12
sean-k-mooney	and if so is oneOf valid as the value of required	12:12
sean-k-mooney	also have you asked gemini :)	12:12
sean-k-mooney	ok it has an idea	12:14
sean-k-mooney	https://paste.opendev.org/show/bMERk0Vt08kwAdUW6tLi/	12:14
sean-k-mooney	let me ask it with the full example	12:14
gibi	hm does the validator in the _soft_validate_additional_properties see the whole big schema? or does it have some kind of memory when the validation descends the tree? Could we use that validator object to stop applying the rule in the subtree of all oneOfs dynamically?	12:14
stephenfin	sean-k-mooney: ah, no that won't work (and I actually recall trying it before): you can't combine additionalProperties with oneOf https://github.com/python-jsonschema/jsonschema/issues/193	12:15
stephenfin	https://ajv.js.org/faq.html#additional-properties-inside-compound-keywords-anyof-oneof-etc	12:16
stephenfin	gibi: it does not, at least not with oneOf. It gets handed the subschemas individually	12:17
stephenfin	and the issue is that e.g. if I'm given a body like {'server': {...}, 'invalidkey': 'stuff'} then I probably want to strip out 'invalidkey' (a made up example but you get my point)	12:18
sean-k-mooney	https://paste.opendev.org/show/828072/	12:20
sean-k-mooney	stephenfin: ^ still reading that but that's what gemini thinks about the problem	12:20
stephenfin	Not a fan of approach 1. I really don't want to encode logic in our schemas. They're already complex enough	12:22
sean-k-mooney	for this specific case i thought that was possibly ok if it was only this one case we needed to handle for 2.0 validation	12:23
stephenfin	Hmm, actually: maybe I can just disable response schema validation entirely for v2.0...	12:23
sean-k-mooney	that was going to be my other question	12:23
sean-k-mooney	v2.0 is technically unversioned	12:23
stephenfin	It doesn't fix the problem but it does side-step it	12:23
sean-k-mooney	so it can't really have a schema	12:23
sean-k-mooney	by the way this could also be a bug	12:24
sean-k-mooney	like 2.0 and 2.1 are meant to be identical	12:24
sean-k-mooney	the only delta is the microversion endpoint	12:24
sean-k-mooney	but servers should not change	12:25
stephenfin	they're mostly identical: look at I4c066075165f69d355a74979808fa0ad8d6546ab for example	12:26
stephenfin	(commit 4a29486f4cf15720d09ed0ed86c47cd07f5cd14a)	12:26
sean-k-mooney	well scheduler hints need additional properties true always	12:27
sean-k-mooney	even in 2.100	12:27
sean-k-mooney	that is not a 2.1 vs 2.0 compatibility thing	12:27
stephenfin	you're right, good point	12:28
sean-k-mooney	scheduler hints are always extensible because they are one of the ways we provide for operators writing their own scheduler filter	12:28
sean-k-mooney	they are however like extra specs	12:28
sean-k-mooney	in that we can have definitions for the standard ones	12:29
sean-k-mooney	stephenfin: if we only have two examples to choose between i don't think the if,then,else is much more complex than the oneOf	12:31
sean-k-mooney	so i would go with that for now	12:31
sean-k-mooney	both are conditionals, it's just the if is more explicit	12:31
sean-k-mooney	if you need multiple then i would vote for not validating responses for 2.0 at all	12:32
sean-k-mooney	but we should see what others think	12:32
sean-k-mooney	stephenfin: we do not have an example of this for 2.1+ right?	12:33
stephenfin	No, it's not an issue for v2.1 since the soft additional properties stuff only applies to v2.0	12:34
sean-k-mooney	stephenfin: actually i asked gemini for some other variations that avoid the repetition, let me link them	12:35
sean-k-mooney	https://paste.opendev.org/show/b7wJJXIE4edOqlCZ3Ijj/	12:37
sean-k-mooney	can we use unevaluatedProperties?	12:40
sean-k-mooney	if so option 3 is an option, 1 and 2 are mostly the same, it just comes down to an if or switch semantic to select the correct one	12:41
sean-k-mooney	i would go with 3>2>1 in that order	12:42
opendevreview	Stephen Finucane proposed openstack/nova-specs master: Remove v2.0 API https://review.opendev.org/c/openstack/nova-specs/+/951949	12:44
stephenfin	sean-k-mooney: gibi: gmaan: Would be interested in hearing if there's a solid reason not to do ^	12:44
sean-k-mooney	hum that's also an approach. if we do that can we add v3 as a perfect copy of 2.100 :)	12:44
sean-k-mooney	stephenfin: please look at the previous paste if you didn't see that, however	12:45
stephenfin	I suggested that before. dansmith said no	12:45
sean-k-mooney	how about v4	12:46
sean-k-mooney	that would be at least new	12:46
gibi	stephenfin: in general I'm OK to remove v2.0, it is old as hell and right now it poses problems for us. I'm also OK to just sidestep the problem and disable response validation in 2.0	12:49
sean-k-mooney	my personal preference would be to add v3 in 2026.1 and remove v2.0 while keeping all of 2.1-2.100 until 2028.1+	12:49
sean-k-mooney	and in 2028.1 we would see if enough people had moved to v3 to deprecate 2.1+ or not and set a deadline for that, but basically feature freeze that code next cycle.	12:51
sean-k-mooney	by feature freeze i just mean no more 2.x microversions	12:51
sean-k-mooney	not duplicate the code	12:52
gibi	where is the v3 and v4 idea coming from? Could we just have 2.1, remove 2.0 and that is it?	12:54
sean-k-mooney	gibi: that was me just saying removing v2.0 specifically i think is good	12:55
sean-k-mooney	gibi: but as i have argued for years i think it's time for a v3	12:55
sean-k-mooney	or to raise our min version	12:55
sean-k-mooney	i know dan does not want to do that	12:55
sean-k-mooney	and i don't think we will have time to even consider a v3 this cycle	12:55
sean-k-mooney	i'm just frustrated that we won't even consider doing it again	12:56
sean-k-mooney	i think it actively makes our rest api worse	12:56
gibi	ahh the raising of the min version.	12:56
sean-k-mooney	ya so if we remove 2.0 we effectively raise it to 2.1	12:56
gibi	yeah here we don't need it but yes if it helps deleting code then we can consider raising the min version	12:56
sean-k-mooney	which i think is fine since they should be identical	12:56
sean-k-mooney	i am supportive of stephenfin's proposal to do the minimum possible min_version increase from 2.0 to 2.1	12:57
sean-k-mooney	by removing 2.0	12:58
sean-k-mooney	but once we are done with eventlet removal and the open api schema work	12:58
sean-k-mooney	and service/manager policy work	12:58
sean-k-mooney	then i think we should revisit v3 or raising the min version to somewhere in the 2.77 ish range	12:59
sean-k-mooney	i'm not pushing for that now but i think we should at least discuss how this would work in the next 18 months	12:59
gibi	yeah I think it makes sense to discuss it and collect a list of things we would gain out of such a change	13:00
sean-k-mooney	stephenfin: you still have not responded on the other schema permutations https://paste.opendev.org/show/b7wJJXIE4edOqlCZ3Ijj/ do any of those look reasonable to you? as i said my preference there would be 3>2>1	13:02
sean-k-mooney	unevaluatedProperties if viable seems clean and concise	13:02
gibi	on a totally independent topic. I just learned today that `openstack server list` triggers two independent full scatter-gather calls one for getting the instances (obviously needed) and one for getting the bdms for the instances (not so obvious). 1) https://github.com/openstack/nova/blob/68c2341b765a22b9b81894d2ff3b21fd5f8632ec/nova/api/openstack/compute/servers.py#L326	13:02
gibi	2)https://github.com/openstack/nova/blob/68c2341b765a22b9b81894d2ff3b21fd5f8632ec/nova/api/openstack/compute/servers.py#L342 and https://github.com/openstack/nova/blob/68c2341b765a22b9b81894d2ff3b21fd5f8632ec/nova/api/openstack/compute/views/servers.py#L497	13:02
stephenfin	sean-k-mooney: sorry, I'd just assumed we were going to skip validation for response bodies and be done with it	13:03
sean-k-mooney	gibi: hum that's unfortunate	13:03
sean-k-mooney	gibi: i assume we can't pass an arg to do the join eagerly	13:04
elodilles	hi folks, i know thay you are busy with other things, but there is a small "tool" change i'd appreciate you would review and merge if possible (yeah, it's a small fix for backport validator, related to unmaintained branches :/): https://review.opendev.org/c/openstack/nova/+/949628	13:05
elodilles	s/thay/that/	13:05
gibi	elodilles: +2	13:06
sean-k-mooney	looks fine to me i guess	13:06
sean-k-mooney	it is one of the first times we have had to skip eols	13:06
gibi	sean-k-mooney: we might, but the server/view code is entered from different places so we should only conditionally do the join if we know it will be used	13:06
stephenfin	gibi: sean-k-mooney: thanks for the help https://github.com/stephenfin/nova/commit/2630daf37c8adf4b443c80e66139ea6285c04eae	13:07
stephenfin	will push all those once the bulk of what's already up on gerrit has merged	13:07
elodilles	thanks gibi sean-k-mooney o/	13:08
sean-k-mooney	gibi: i'm on a call but we have something like "expected_filds" or something like that to control which fields are lazy loaded on the instance object	13:35
sean-k-mooney	i don't know if we can use that in some way	13:35
sean-k-mooney	i was thinking of https://github.com/openstack/nova/blob/master/nova/objects/instance.py#L77C5-L101	13:37
opendevreview	Balazs Gibizer proposed openstack/nova master: Run nova-api and -metadata in threaded mode https://review.opendev.org/c/openstack/nova/+/951957	13:46
gibi	sean-k-mooney: yeah we can ask for the join, the problem is that there are multiple ways of reaching those calls and some will use the bdms later, some won't. Also the way this is built up makes it pretty hard to untangle them. It is not impossible to refactor this but I decided not to try it now	13:48
gibi	btw if https://review.opendev.org/c/openstack/nova/+/951957 works then that is a nice surprise and an easy win	13:48
gibi	locally it works for me but I need the full tempest run from nova-next to believe in it	13:49
sean-k-mooney	gibi: well i had working versions last cycle so it really won't take much	14:00
sean-k-mooney	gibi: as in the api was one of the first things i tried to move, so with the infra you have added i'm expecting that to not be super complex to get working	14:01
sean-k-mooney	gibi: and yes scatter gather is the only use of eventlet spawn in them	14:02
sean-k-mooney	that's why i suggested starting with the api	14:02
sean-k-mooney	gibi: i know i need to go back and review the most recent version but i think scheduler, api and metadata are doable by m2	14:03
gibi	yeah now that I see api working locally I believe that it is doable for m2	14:05
sean-k-mooney	it might be worth asking clark to look at https://review.opendev.org/c/openstack/devstack/+/948436 again	14:06
sean-k-mooney	although it will be a few days in any case before we get that far in the series	14:07
gibi	good point. I pinged clark in the review now and I can ping him on IRC next week	14:09
*** haleyb is now known as haleyb|out		14:17
opendevreview	Merged openstack/nova master: [tool] Fix backport validator for non-SLURP https://review.opendev.org/c/openstack/nova/+/949628	15:58
opendevreview	Elod Illes proposed openstack/nova stable/2025.1: [tool] Fix backport validator for non-SLURP https://review.opendev.org/c/openstack/nova/+/951968	16:03
opendevreview	Elod Illes proposed openstack/nova stable/2024.2: [tool] Fix backport validator for non-SLURP https://review.opendev.org/c/openstack/nova/+/951969	16:04
opendevreview	Elod Illes proposed openstack/nova stable/2024.1: [tool] Fix backport validator for non-SLURP https://review.opendev.org/c/openstack/nova/+/951970	16:05
noonedeadpunk	I'm trying here to setup hugepages for one of my computes, but instance gets filtered out at NUMATopologyFilter with no good candidates https://github.com/openstack/nova/blob/master/nova/scheduler/filters/numa_topology_filter.py#L115-L121	16:29
noonedeadpunk	compute object seems to be having mempages https://paste.openstack.org/show/bRHPMmumA43t0KHk186K/	16:31
noonedeadpunk	I wonder if that might also be due to the PCI Passthrough attempt, as that's I guess what might not work nicely with numa placement?	16:33
noonedeadpunk	flavor defined as `hw:cpu_policy='dedicated', hw:mem_page_size='2MB', hw:vif_multiqueue_enabled='true', trait:CUSTOM_GPU='required', pci_passthrough:alias='A10_FULL:1'`	16:34
noonedeadpunk	and obviously once I unset hw:mem_page_size - scheduling passes nicely	16:35
opendevreview	sean mooney proposed openstack/nova master: [DNM] testing py3.13 eventlet bug workaround https://review.opendev.org/c/openstack/nova/+/951749	16:35
sean-k-mooney	you can have both enabled	16:57
sean-k-mooney	however if you have a numa vm by default we require the pci device requested by the guest to come from the same numa node as the guest cpus/memory	16:58
sean-k-mooney	noonedeadpunk: you can disable that in your pci config	16:58
sean-k-mooney	noonedeadpunk: in your pci alias you can set https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias numa_policy=preferred	17:00
sean-k-mooney	noonedeadpunk: you can also set that in the flavor which is generally simpler	17:00
sean-k-mooney	i would recommend setting hw:pci_numa_affinity_policy=preferred or hw:pci_numa_affinity_policy=socket	17:02
sean-k-mooney	preferred is the most relaxed policy	17:02
sean-k-mooney	i would also suggest using hw:mem_page_size=large	17:02
noonedeadpunk	large is 1G?	17:05
noonedeadpunk	let me try setting policies then...	17:06
noonedeadpunk	as I tried couple of options, like hw:numa_nodes=1	17:06
noonedeadpunk	nah, didn't fly somehow... I guess best thing to do for me now - go for a weekend and check on that with fresh eyes after it :)	17:12
gmaan	about 2.0 removal. I am not sure what the benefit is if we keep v2.1 as min version. 2.0 is just an endpoint without any change from v2.1 so no overhead of maintenance. This will just break users using the 2.0 endpoint without giving any gain	17:12
sean-k-mooney	noonedeadpunk: large is anything that isn't small i.e. anything other than the 4k default pagesize	17:21
noonedeadpunk	aha, ok	17:22
sean-k-mooney	noonedeadpunk: if you have not turned on debug yet do that on monday in the scheduler	17:22
sean-k-mooney	you will need that to discover why	17:22
noonedeadpunk	I did that on scheduler	17:22
sean-k-mooney	ack	17:22
noonedeadpunk	that's where I grabbed cpu capabilities	17:23
noonedeadpunk	*compute	17:23
sean-k-mooney	so your flavor was asking for pinned cpus and hugepages	17:23
noonedeadpunk	yeah	17:23
sean-k-mooney	did you actually configure cpu_dedicated_set ?	17:23
noonedeadpunk	and PCI device	17:23
sean-k-mooney	in the compute section of the nova.conf	17:24
sean-k-mooney	on the compute node	17:24
noonedeadpunk	nope, I did not, I just set reserved_host_cpus = 2 and that's it	17:24
noonedeadpunk	but it has no VMs running, as is in the isolated aggregate	17:24
sean-k-mooney	ok well that's probably the issue	17:24
sean-k-mooney	what release are you testing with	17:24
noonedeadpunk	Caracal	17:25
noonedeadpunk	when I drop hugepages from properties - it does allocate pinned cpus though	17:25
sean-k-mooney	do you have https://docs.openstack.org/nova/2024.1/configuration/config.html#workarounds.disable_fallback_pcpu_query defined	17:26
noonedeadpunk	or you saying that without hugepages it does not go through the numa filter...	17:26
noonedeadpunk	nope	17:26
sean-k-mooney	ok so it's falling back	17:26
sean-k-mooney	so any of hugepages, pinning or an explicit numa topology will result in the scheduler using the numa filter to validate the host	17:27
sean-k-mooney	how many hugepages did you allocate on the host? and how much ram are you asking for in the vm	17:27
noonedeadpunk	ok, here where the problem can actually be....	17:28
noonedeadpunk	as I kind of took example from docs for now for some POC	17:28
noonedeadpunk	so it's 1024*2MB	17:28
noonedeadpunk	and I'm asking for really big instance of 200Gb RAM	17:28
noonedeadpunk	So what I should be doing - is to split available ram into blocks and that num_pages * page_size == ram?	17:29
sean-k-mooney	would you mind pasting the output of virsh capabilities	17:30
sean-k-mooney	i can advise based on the size of the host with that	17:30
noonedeadpunk	https://paste.openstack.org/show/bYi3OvxXJbP6Grx275O7/	17:31
noonedeadpunk	but we plan to run there like... 4 VMs max?	17:31
sean-k-mooney	ok so you have 2 numa nodes	17:32
noonedeadpunk	yup	17:33
noonedeadpunk	no HT	17:33
sean-k-mooney	and you allocated 2 GB of 2MB hugepages per numa node	17:33
sean-k-mooney	<pages unit='KiB' size='2048'>1024</pages>	17:34
noonedeadpunk	Right... And I guess that's where the problem is	17:34
sean-k-mooney	noonedeadpunk: try creating a 1G vm that asks for 1 cpu with hw:mem_page_size=large	17:34
sean-k-mooney	if that works all you need to do is decide how you want to partition the host ram	17:34
sean-k-mooney	i.e. how much to reserve exclusively for hugepages	17:34
noonedeadpunk	so just to double check where I was wrong. When I set `hw:mem_page_size='2MB'` in flavor - what needs to happen is that flavor ram / hw:mem_page_size = amount of available on compute hugepages?	17:36
sean-k-mooney	looks like you have 512GB total of which 500 is usable	17:36
noonedeadpunk	yeah, sounds about right	17:36
noonedeadpunk	so 256 on single numa node	17:36
sean-k-mooney	ya i would start conservatively and allocate 480GB of hugepages on that system total	17:37
noonedeadpunk	but actually seeing virsh capabilities I'm really understanding where my logic went off now	17:37
noonedeadpunk	++	17:37
sean-k-mooney	that leaves 32GB for the host	17:38
sean-k-mooney	which with only 4 vms total of 120GB each should be plenty	17:38
sean-k-mooney	you can obviously slice that up and run more or fewer vms	17:39
noonedeadpunk	there's a limit on pci devices... but also it's a poc more or less	17:39
sean-k-mooney	so you need to allocate 245760 2MB hugepages i think	17:39
sean-k-mooney	hugepagesz=2M hugepages=245760	17:40
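The page counts quoted above follow from simple division, which can be sketched as below. This is only a back-of-the-envelope helper: `hugepages_needed` is a hypothetical name, and the sketch assumes binary units (1 GB = 1024 MB) and does not model how nova places guest memory on NUMA nodes.

```python
MIB_PER_GIB = 1024


def hugepages_needed(ram_gib: int, page_size_mib: int) -> int:
    """Number of hugepages of the given size needed to back ram_gib of RAM."""
    total_mib = ram_gib * MIB_PER_GIB
    if total_mib % page_size_mib:
        raise ValueError("RAM size is not a multiple of the page size")
    return total_mib // page_size_mib


# 480GB reserved for hugepages on the host, as suggested in the discussion.
pool_2m = hugepages_needed(480, 2)       # value for hugepagesz=2M
pool_1g = hugepages_needed(480, 1024)    # value for hugepagesz=1G

# The 200GB flavor from the discussion, backed by 2MB pages.
flavor_pages = hugepages_needed(200, 2)

# What the host had: 1024 x 2MB pages on each of its 2 NUMA nodes.
allocated_pages = 1024 * 2

print(pool_2m, pool_1g, flavor_pages, allocated_pages)
```

The last two numbers show why the original request could not be satisfied: the flavor needs far more 2MB pages than the host had allocated.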
sean-k-mooney	with that said the server is large enough that you might consider using 1G pages just to have faster boot times	17:41
sean-k-mooney	in which case the kernel command line would be `hugepagesz=1G hugepages=480`	17:41
sean-k-mooney	you can allocate these at runtime too but it's recommended to do it at boot	17:41
sean-k-mooney	doing it at runtime is best effort	17:41
noonedeadpunk	oh, how can I allocate at runtime?	17:42
noonedeadpunk	as that's part for osa I'm missing now :)	17:42
noonedeadpunk	and I don't want to reboot hardware on grub changes there	17:42
sean-k-mooney	you echo a value in sys fs	17:42
noonedeadpunk	aha	17:42
noonedeadpunk	right	17:42
sean-k-mooney	let me get you an example	17:42
sean-k-mooney	you can do it with systemd or udev rules too	17:43
noonedeadpunk	I'm changing dev-hugepages.mount as well	17:43
sean-k-mooney	echo 1024 | sudo tee /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages	17:44
sean-k-mooney	nova and libvirt have caches so you need to restart nova-compute and libvirt to make sure they see the change	17:44
sean-k-mooney	so you echo the desired amount	17:45
sean-k-mooney	cat that value and you can see how many are there	17:45
sean-k-mooney	when you write to it it will block until it's done i think	17:45
noonedeadpunk	ok, trying that	17:47
sean-k-mooney	allocating 1G pages at runtime normally does not work because there often isn't enough free contiguous 1G address space	17:47
noonedeadpunk	friday evening - perfect time :)	17:47
sean-k-mooney	noonedeadpunk: well we can pick it back up on monday too. i was just leaving anyway	17:48
noonedeadpunk	I think I will try to go with 2MB still, as I can recall also your advice about live migrations	17:48
sean-k-mooney	but ya i think nova was working fine, you just asked for more resources than were available	17:48
noonedeadpunk	that qemu would operate on page size for migration	17:48
noonedeadpunk	so I'm not sure I wanna do it that big	17:49
sean-k-mooney	ya it just tends to work better for vms in general	17:49
noonedeadpunk	Yeah, I think this what was the issue	17:49
noonedeadpunk	fwiw, I also got nowhere with migrations crashing on qemu, when more than 80% of memory is used by stress...	17:49
sean-k-mooney	oh anything interesting	17:50
sean-k-mooney	oh you said nowhere	17:50
sean-k-mooney	ok	17:50
noonedeadpunk	not really.. except that 20GB for stress out of 32GB VM is working, but 14GB stress out of 16GB VM crashing	17:50
noonedeadpunk	so it's not amount of memory or intensity - it's somehow percentage of VM memory which matters	17:51
noonedeadpunk	which makes no sense to me at least	17:51
sean-k-mooney	ya that's super odd to me too	17:51
noonedeadpunk	I also tried post copy with no difference	17:51
sean-k-mooney	and it was qemu that was crashing, not the guest kernel OOMing, right	17:51
noonedeadpunk	yeah	17:52
noonedeadpunk	migration failed, reason=crashed in qemu logs	17:52
sean-k-mooney	ya i haven't seen that or heard other reports	17:52
sean-k-mooney	that to me says qemu bug	17:53
noonedeadpunk	and tried on different generations of intels	17:53
sean-k-mooney	but i don't know what it could be	17:53
noonedeadpunk	was very easy to reproduce - get 16gb VM on hypervisor with ubuntu 22.04 or 24.04, on VM run `screen stress-ng --vm 4 --vm-bytes 14G --vm-keep`; live migrate -> profit	17:55
noonedeadpunk	almost 100% reproduction rate for me	17:56
noonedeadpunk	maybe 98... not sure	17:56
noonedeadpunk	yeah, once I increased the amount of pages, the instance spawned as expected	18:04
noonedeadpunk	so thanks a ton for help!	18:06
noonedeadpunk	these just somehow did not add up for me	18:06
gmaan	stephenfin: stephenfin gibi I replied on the v2.0 removal spec. It will break users as v2.1 is not same as v2.0 from user perspective - https://review.opendev.org/c/openstack/nova-specs/+/951949	18:08
gmaan	if somehow we can know no one use v2.0 then I will be happy to remove	18:08
gmaan	and my take that time will be to bump the min microversion also but we have to be very careful about this change as it will be big impact on users	18:09
sean-k-mooney	i guess installers can choose not to deploy 2.0 today. we don't in our new installer, and other tools already don't deploy it either and have not for a while	18:16
noonedeadpunk	In OSA v2.1 seems to be deployed like always	19:08
noonedeadpunk	or well. Since Liberty: https://review.opendev.org/c/openstack/openstack-ansible/+/227839	19:08
gmaan	melwitt: about project-admin comment in manager role spec, sean-k-mooney and I replied but I agree to add some explanation in spec. While I am doing that, I wanted to check if it is ok to add that explanation in follow up and we merge current change or you would like to see that in this change itself. https://review.opendev.org/c/openstack/nova-specs/+/937650/9/specs/2025.2/approved/policy-manager-role-default.rst#29	19:17
gmaan	either is ok for me	19:17
opendevreview	Merged openstack/nova master: api: Stop using wsgi.Controller.api_version to switch between API versions https://review.opendev.org/c/openstack/nova/+/936366	20:17
melwitt	gmaan: cool thanks, I'm reading through. I don't think it's necessary to follow up or change it, it was just a bit of context I was missing or didn't already know	20:42
melwitt	this whole time I thought project-admin meant admin but only for a given project. if it's global, then I don't understand what's the difference between system-admin and project-admin then 😛	20:43
melwitt	but the important point is that if project-admin is global then that makes it obvious the usefulness of project-manager	20:44
gmaan	melwitt: yeah, by doing back and forth on those admin, we end up keeping only one admin (legacy admin) who can do things on system level	21:49
gmaan	if we could have system admin then it could make admin at different level more clear/safe to use but that confused operator instead of helpful :)	21:51
opendevreview	Merged openstack/nova-specs master: Propose API policy manager role spec https://review.opendev.org/c/openstack/nova-specs/+/937650	22:38
