Monday, 2025-03-17

r-taketnhi. I have two entries for the next PTG (about CCA, RDT/MPAM). Besides purchasing a ticket and submitting a spec, what else should I prepare for the PTG?07:08
fricklerUggla: could you have a quick look at https://review.opendev.org/c/openstack/releases/+/943801 pls?08:03
IshanShanware[m]* sean-k-mooney:... (full message at <https://matrix.org/oftc/media/v1/media/download/Af0thDY-PzrHZLStDMYwU6GFaZIiMSA41JuprrmuPgJq6AY3dEeP8-2cHH70wc9rwy3S6QmtPw_bSYPSO95YLsdCeV64AjOgAG1hdHJpeC5vcmcvVUNuTVBJVllKcFlkQmR5bnhsc0V2Qllv>)09:19
fricklerUggla: bauzas: also what's the status with rc1 for nova+placement? we're past the deadline now09:36
bauzasfrickler: as said in the nova rc1 patch, we're missing some patches before tagging09:36
bauzaseverything is tracked here https://etherpad.opendev.org/p/nova-epoxy-rc-potential09:37
bauzasplacement rc1 can be tagged tho, just double-checking the last dependencies09:37
fricklerbauzas: unless I miss something, the only remaining nova change is https://review.opendev.org/c/openstack/nova/+/943952 which is a pretty trivial but necessary change that keeps failing due to unrelated CI instability. let me know once you're ready to get this force-merged to unblock things :-)09:43
bauzas_frickler: not exactly, we're also missing a prelude patch that needs to be written as a copy of the cycle highlights09:44
bauzas_that's a TODO on me this morning since we start to have a review quorum from the nova cores about the highlights09:45
opendevreviewSylvain Bauza proposed openstack/nova master: Add Epoxy prelude section  https://review.opendev.org/c/openstack/nova/+/94471310:19
IshanShanware[m]<IshanShanware[m]> "sean-k-mooney:..." <- bauzas:  any thoughts here ? 11:29
*** IshanShanware[m] is now known as ishanwar[m]11:54
opendevreviewribaudr proposed openstack/nova master: FUP improve comment accuracy and variable naming for tag removal  https://review.opendev.org/c/openstack/nova/+/94312412:14
opendevreviewribaudr proposed openstack/nova master: FUP Remove unnecessary PCI check  https://review.opendev.org/c/openstack/nova/+/94410512:14
opendevreviewribaudr proposed openstack/nova master: FUP improve and add integration tests for PCI SR-IOV servers  https://review.opendev.org/c/openstack/nova/+/94410612:14
opendevreviewribaudr proposed openstack/nova master: FUP Add a warning to make non-explicit live migration request debugging easier  https://review.opendev.org/c/openstack/nova/+/94413312:14
opendevreviewribaudr proposed openstack/nova master: FUP Update pci-passthrough and virtual-gpu documentation  https://review.opendev.org/c/openstack/nova/+/94415312:14
bauzascores, can we please give this a go ? https://review.opendev.org/c/openstack/nova/+/94471313:45
dansmithbauzas: I did three minutes ago13:47
bauzascool13:47
opendevreviewSylvain Bauza proposed openstack/nova master: Add Epoxy prelude section  https://review.opendev.org/c/openstack/nova/+/94471313:49
bauzasupdated ^13:49
dansmithsame13:50
dansmithbauzas: wanna fix up that cycle highlights one too and I'll +2?13:53
bauzasyup13:53
dansmithbauzas: thanks for always being on top of these release things13:54
dansmithuggla a de grandes chaussures à remplir [French: "uggla has big shoes to fill"]13:54
bauzaslast time unless Uggla lets me do that again in Flamingo :)13:54
bauzasdansmith: lollylol13:54
bauzascycle highlights updated too13:55
dansmithoh duh, I can't +2 that13:56
dansmithsorry, but +1d :)13:56
opendevreviewMerged openstack/nova master: Update compute rpc alias for epoxy  https://review.opendev.org/c/openstack/nova/+/94395213:56
Uggla@dansmith, you will learn French, that's a good start !13:57
dansmithI will not.. one french translation via google from me per cycle and that was the one for flamingo :)13:58
dansmiththe only one I really "know" is "voulez-vous coucher ... " you know the rest :)13:59
Uggla@bauzas, I'll try to do it for flamingo, but I hope you will not be far away to check it. ;)13:59
bauzasUggla: GMaps tells me a distance of 36 kms by road indeed14:00
Uggla@dansmith, a useful one. ;)14:01
Ugglabauzas, any chance you merge a couple of FUPs for vGPU before RC 1 ?14:01
dansmithwe're already late right? unless it's something critical I'd vote to stay the course, backport if it's important14:02
bauzasUggla: I'm a bit lost now, the bottom patches are in the gate, but for the other 3 ones, there is still a -1 for the 3rd14:05
bauzasUggla: tbc, we can merge the two first as those are in the gate, but for the other ones, I don't think so unfortunately :(14:05
bauzasif those two patches merge before we merge rc1, all good but I won't await them unfortunately14:07
Ugglabauzas, the one for the doc is important. but it can be backported without pb.14:08
bauzasyep, I know :(14:09
UgglaI can put it down in the series if it helps.14:10
UgglaI have fixed the comments from @gibi and @sean-k-mooney, but it needs a review. Anyway let me know if I can help if not then we'll see and backport.14:11
dansmithgibi: sorry to complicate the workflow with those grammar fixes. I was thinking I could +2 it after that and totally forgot about the TODOs, else I would have just left the comments15:43
gibino worries15:44
gibiit's easy to just pull the new ps from gerrit and continue from that15:44
dansmithas long as no more changes were local, but yeah15:44
gibiyeah I had no local changes to apply on top with a stash / stash pop15:45
dansmithbauzas: fancy re-applying your +2 to this? https://review.opendev.org/c/openstack/nova-specs/+/94348616:01
gibidansmith: left some comments in the spec16:16
dansmithgibi: thanks, I totally see now why those migrate scenarios sound wrong, obviously not my intention :)16:26
gibicool16:27
dansmithgibi: left a question for you on the evac thing, I don't have an easy way to test that locally, but I'm also not sure what you're worried about specifically16:27
gibidansmith: I feel evac is different but I need to take a look at the code to see how it is different...16:28
dansmithgibi: okay did you want me to just mention that we need to look at that in the implementation or dig into the difference? I could *try* to set up something to test the evac but I'm not sure if I'll be able to squeeze it in16:29
gibiI'm OK to mention evac as also probably affected and then later we can figure out what that really means16:30
dansmithokay16:32
gibihttps://github.com/openstack/nova/blob/6b45672b23db47ccbf74dbda4f2678fd25ea22ed/nova/conductor/manager.py#L1325 the place conductor schedules the evac, but above it we have the old case where the evac is forced to a host and skipping the scheduler16:33
gibiI'm not sure if we try to move the allocation to the migration uuid in either case16:33
dansmithah, okay, but probably the same principle applies here if you're in transition and don't have the PCI-i-p filter enabled such that we heal the allocation for it and the inventory right?16:34
dansmithsince we key the inventory reserved=1 off of the objects.PCIDevice and not the allocation, I think we'll reserve the device immediately regardless16:34
bauzasmaybe I misunderstood, but I was able to understand what "burned" meant16:35
bauzas'burned' to me means that this is reserved and you can no longer use this device unless you're able to unreserve it16:35
bauzasso, of course, when evacuating something that was already reserved, we then also reserve the target16:36
bauzasboth then are 'burned'16:36
gibidansmith: my point is that if evac does not call move_allocations then evac operation will not be blocked by placement like cold/live migrate and resize that fails at move_allocations if reserved=116:36
dansmithbauzas: I think the problem is I referred to "burned" in the early part as happening immediately, but I wrote the migration lifecycle stuff using burned to mean after the allocation was dropped.. it doesn't change what we're doing, it's just not clear16:37
dansmithbauzas: I think we should say "burned happens as soon as reserved is incremented", thus instances run with a "burned" device the whole time.. that will be clearer16:37
bauzascool cool16:37
bauzasfwiw, nothing changes to me, this is just a documentation question, hence why I +2d it16:38
bauzaswe can change the spec in a follow-up if that's really that16:38
dansmithgibi: ah okay16:38
dansmithbauzas: yep, I'm rephrasing now16:38
gibidansmith: maybe evac will just work out of the box16:38
gibidansmith: anyhow I can try evac at some point later this week with your WIP code16:39
gibiso for the spec, just mention that evac might or might not be affected by the placement bug and we can refine that later16:39
dansmithgibi: ack will do16:39
gibithanks16:43
dansmithgibi: if the situation you described doesn't heal the allocations before we start the instance, then evacuation would be able to oversubscribe a host today, even without this I think right?16:45
gibiI think the situation (i.e. not having move_allocation called) during evac, does not change the healing logic running on the destination compute. Hopefully it handles the fact that the instance uuid is allocating from two RP trees, one from the source node and one from the destination node16:47
gibibut that problem preexists, it is not OTU specific16:47
gibi* problem, if any16:48
opendevreviewDan Smith proposed openstack/nova-specs master: Add one-time-use-devices spec  https://review.opendev.org/c/openstack/nova-specs/+/94348616:48
dansmithgibi: since we call _update() synchronously in instance_claim() I think we *have* to be fixing the allocation before we start running the instance, and would fail if we couldn't16:48
dansmithanyway, see if these ^ wording updates make it clearer16:49
gibidansmith: I think we are on the same page. The healing happens regardless of whether move_allocations is called in the conductor16:50
dansmith...and synchronously such that if it fails we fail, but definitely good to confirm16:50
gibiyeah16:51
gibilooking at the spec...16:51
dansmithalso, please clarify your "over-subscribed" comment.. perhaps it's just a meaning-of-the-phrase thing16:51
gibidansmith: ack, clarified in the reivew17:10
gibi*review17:11
dansmithreplied.. I guess it's just what you think "over-subscribed" means...17:15
gibidansmith: do we want to fix both oversubscribed cases in placement?17:21
gibior just the reserved=used case?17:21
dansmithgibi: I'm not sure what you mean.. does placement define over-subscribed to mean something specific?17:21
gibidansmith: I just don't want to suggest we will fix both over-subscribed cases in a later bugfix. I think we only want to fix the reserved=used case17:22
dansmithit'll be the same fix I think17:23
dansmithI haven't actually done it yet (because I haven't decided about the signaling) but it's basically in the same place where it just adds them all up and decides if it's over.. it would actually be worse to (try to) fix it for one and not the other I think17:23
dansmithI thought we were arguing over the meaning of "over-subscribed" and whether total=1,used=1,reserved=1 was "over-subscribed"? :)17:24
gibiI think both are oversubscribed but they are different situations. I think the reserved=used case is more of an allowed oversubscription, as nova-managed VMs are not using more resources than available. The used > total case is different as it means nova VMs are using more than they should17:25
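[Editor's sketch] The two situations gibi separates here can be illustrated with a few lines of Python. This is only a minimal sketch, not nova or placement code: the Inventory class and classify() helper are hypothetical names, and it assumes an allocation_ratio of 1.0 and the usual placement capacity rule, capacity = total - reserved.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """Hypothetical stand-in for one resource class on a resource provider."""
    total: int
    reserved: int
    used: int

    @property
    def capacity(self) -> int:
        # Assumption: allocation_ratio is 1.0, so capacity = total - reserved.
        return self.total - self.reserved


def classify(inv: Inventory) -> str:
    """Label an inventory with the two cases discussed in the log."""
    if inv.used > inv.total:
        # Consumers hold more than physically exists.
        return "used > total"
    if inv.used > inv.capacity:
        # Usage fits within total, but reserved overlaps what is in use,
        # e.g. total=1, used=1, reserved=1.
        return "reserved overlaps used"
    return "ok"


if __name__ == "__main__":
    for inv in (Inventory(1, 1, 1), Inventory(1, 0, 2), Inventory(1, 0, 1)):
        print(inv, "->", classify(inv))
```

In this sketch total=1, used=1, reserved=1 is over capacity only on paper (usage still fits within total), which matches the "allowed" case above, while used > total is the case where consumers hold more than exists.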
dansmithyes, very much agree.. I thought that was the nit you were picking17:26
dansmithor rather,17:26
dansmiththat you were suggesting something other than the above definition17:26
gibiso I don't want to signal in this spec that we will allow used > total during move operations in the future, as I'm not sure we should allow nova-compute to even start with such a config in the first place17:27
dansmithokay I'm super confused17:28
dansmithgmeet? :)17:28
dansmithor tomorrow, if it's late for you17:28
gibiwe can have a quick gmeet, I started my day late so I plan to be around longer17:29
dansmithmeet.google.com/njh-oqjw-azk17:29
opendevreviewDan Smith proposed openstack/nova-specs master: Add one-time-use-devices spec  https://review.opendev.org/c/openstack/nova-specs/+/94348618:02
opendevreviewDoug Goldstein proposed openstack/nova master: ironic: fix logging of validation errors  https://review.opendev.org/c/openstack/nova/+/94201918:45
mikalMorning19:18
opendevreviewBalazs Gibizer proposed openstack/nova master: Reproduce bug/2098496  https://review.opendev.org/c/openstack/nova/+/94167319:42
opendevreviewBalazs Gibizer proposed openstack/nova master: Ignore metadata tags in pci/stats _find_pool logic  https://review.opendev.org/c/openstack/nova/+/94427719:42
opendevreviewBalazs Gibizer proposed openstack/nova master: Ignore metadata tags in pci/stats _find_pool logic  https://review.opendev.org/c/openstack/nova/+/94427719:45
opendevreviewDoug Goldstein proposed openstack/nova master: ironic: fix logging of validation errors  https://review.opendev.org/c/openstack/nova/+/94201922:04
