Monday, 2019-07-08

01:28 *** altlogbot_3 has quit IRC
01:29 *** altlogbot_2 has joined #openstack-placement
05:57 *** tetsuro has joined #openstack-placement
05:58 *** tetsuro has quit IRC
05:59 *** tetsuro has joined #openstack-placement
06:03 *** tetsuro has quit IRC
07:17 *** helenafm has joined #openstack-placement
07:28 *** tssurya has joined #openstack-placement
07:47 <openstackgerrit> Tetsuro Nakamura proposed openstack/placement master: Support `same_subtree` queryparam  https://review.opendev.org/668376
07:47 <openstackgerrit> Tetsuro Nakamura proposed openstack/placement master: Doc `same_subtree` queryparam  https://review.opendev.org/669616
07:55 *** tetsuro has joined #openstack-placement
07:57 *** tetsuro has quit IRC
08:01 *** ttsiouts has joined #openstack-placement
08:13 *** ttsiouts has quit IRC
08:13 *** ttsiouts has joined #openstack-placement
08:18 *** ttsiouts has quit IRC
08:18 <openstackgerrit> Chris Dent proposed openstack/placement master: Update implemented spec and spec document handling  https://review.opendev.org/669184
08:18 <openstackgerrit> Chris Dent proposed openstack/placement master: Add whereto for testing redirect rules  https://review.opendev.org/669370
08:18 <openstackgerrit> Chris Dent proposed openstack/placement master: tox: Stop building api-ref docs with the main docs  https://review.opendev.org/669371
08:24 *** ttsiouts has joined #openstack-placement
08:28 <helenafm> :q
09:05 *** cdent has joined #openstack-placement
09:45 *** cdent has quit IRC
10:32 *** ttsiouts has quit IRC
10:33 *** ttsiouts has joined #openstack-placement
10:33 *** cdent has joined #openstack-placement
10:37 *** ttsiouts has quit IRC
10:46 <cdent> gibi: have you had a chance to look at tetsuro's same_subtree work? needs someone besides me and efried_pto looking at it
10:49 <gibi> cdent: not yet, I will dive into it now.
10:49 <cdent> great, thanks
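
For anyone following along with the review above (668376): same_subtree is a proposed query parameter for GET /allocation_candidates that requires the providers satisfying the named request-group suffixes to sit in a single provider subtree (for example, one NUMA node and its children). A rough sketch of how a client might exercise it, assuming a placement endpoint and token; the suffix names, amounts, and the microversion shown are illustrative only, since the patch was still under review at this point:

    import requests

    PLACEMENT = "http://placement.example.com"   # assumed endpoint
    HEADERS = {
        "x-auth-token": "ADMIN_TOKEN",           # assumed credential
        # microversion is illustrative; the patch under review defines the real one
        "openstack-api-version": "placement 1.36",
    }

    # Ask for compute resources and a vGPU, and require that the providers
    # satisfying the _COMPUTE and _ACCEL groups share a subtree (NUMA affinity).
    resp = requests.get(
        f"{PLACEMENT}/allocation_candidates",
        headers=HEADERS,
        params={
            "resources_COMPUTE": "VCPU:2,MEMORY_MB:4096",
            "resources_ACCEL": "VGPU:1",
            "same_subtree": "_COMPUTE,_ACCEL",
            "group_policy": "none",
        },
    )
    print(resp.status_code, len(resp.json().get("allocation_requests", [])))

Each returned allocation request would then draw the _COMPUTE and _ACCEL resources from providers where one is an ancestor of (or the same as) the others.
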
11:01 *** ttsiouts has joined #openstack-placement
11:12 *** cdent has quit IRC
11:18 *** cdent has joined #openstack-placement
12:03 *** sean-k-mooney has quit IRC
12:16 *** sean-k-mooney has joined #openstack-placement
12:26 *** cdent has quit IRC
12:33 *** artom has joined #openstack-placement
12:42 *** edleafe has joined #openstack-placement
12:57 <openstackgerrit> Merged openstack/os-resource-classes master: Add Python 3 Train unit tests  https://review.opendev.org/669479
13:00 <openstackgerrit> Merged openstack/os-traits master: Add Python 3 Train unit tests  https://review.opendev.org/669480
13:01 *** takashin has left #openstack-placement
13:23 *** mriedem has joined #openstack-placement
13:34 *** cdent has joined #openstack-placement
13:40 <gibi> cdent: this same_subtree patch is pretty dense. I hope I finish it today but no promises.
13:40 <cdent> gibi: no worries, there's not a huge rush because we're fairly ahead of schedule, but the sooner it is merged, the sooner nova can start playing with it I suppose
13:41 <gibi> Is there a volunteer from nova side to play with this during Train?
13:42 <gibi> I mean who will be the developer first consuming this? as we might need a review from that dev as well
13:44 <cdent> gibi: I don't really know
13:44 <cdent> efried_pto: is the one who has been driving that it needs to happen
13:46 <gibi> cdent: OK
13:47 <cdent> I hope this doesn't turn into another situation where placement is months or even years ahead of nova, but it could well do, and that's fine
13:50 *** amodi has joined #openstack-placement
13:54 *** efried_pto is now known as efried
13:54 <gibi> I don't remember a nova spec that explicitly stated same_subtree as a requirement
13:55 <gibi> and nova tends to be a lot slower to move forward than placement
13:55 <efried> I'm sort of hoping to bully bauzas into abandoning his vgpu affinity spec in favor of working NUMA nesting in nova.
13:55 <gibi> anyhow I will review the implementation, it was just a sidetrack of mine to get the "user" of the feature involved
13:55 <bauzas> efried: sorry but why?
13:56 <bauzas> I already said both specs aren't competitive
13:56 <efried> No, they're not mutually exclusive... except from the standpoint of development resource.
13:57 <efried> IMO retrofitting filter-based affinity for placement-modeled vgpu is a backward step.
13:58 <bauzas> that's your opinion
13:58 <efried> That's what "IMO" means.
13:59 <efried> I'm just one voice. If there's a preference from the team to move ahead with it, so be it.
13:59 <gibi> there is a complexity difference between the two solutions that could mean lead-time differences as well.
14:00 <efried> yes, absolutely. The filter bauzas proposes could be easily contained in Train. NUMA modeling/affinity in placement may well take longer.
14:00 <bauzas> efried: I'm not adding a filter
14:00 <efried> it has already taken long enough.
14:00 <efried> weigher?
14:01 <bauzas> indeed, have you looked at my spec honestly?
14:01 <bauzas> the weigher isn't even needed
14:01 <efried> I certainly could be misusing terminology. I'm really good at that.
14:01 <bauzas> if that's really a problem for you, I can even remove it honestly
14:02 <bauzas> what I really just need is https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@92
14:02 <bauzas> efried: the point is, a filter can get NoValidHosts
14:02 <bauzas> efried: not a weigher
14:03 <bauzas> efried: it just helps to make sure we spread instances between hosts
14:03 <bauzas> for vGPUs
14:03 <bauzas> in order to have fewer races
14:03 <efried> bauzas: Where is the code for @92 going to live, if not in the NUMATopologyFilter or a new weigher? In the libvirt virt driver?
14:04 <bauzas> efried: I said it in the spec
14:05 <bauzas> https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@101
14:05 <bauzas> https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@211 and https://review.opendev.org/#/c/650963/9/specs/train/approved/libvirt-vgpu-numa-affinity.rst@214
14:14 <efried> bauzas: Okay, I've reread the spec.
14:14 <bauzas> thanks
14:14 <efried> It all makes perfect sense in a world where there's no NUMA affinity at the placement level.
14:15 <efried> but once we have that, 80% of this code goes away.
14:16 <efried> A pack/spread weigher for its own sake may make sense.
14:16 <efried> though not related to affinity
14:17 <efried> The code to pick the proper NUMA node based on which PGPUs are allocated becomes n/a.
14:18 <efried> So my point is, a) this becomes tech debt almost immediately; and b) the effort spent coding & reviewing could be better spent getting us closer to placement-based NUMA modeling & affinity.
14:18 <efried> IMO
14:18 <cdent> IMOT
14:19 <cdent> Too
14:19 <cdent> otherwise we've spent a huge complexity cheque in placement for nada
14:21 <bauzas> efried: cdent: there will still be nova changes for using the new placement microversions
14:22 <bauzas> so, yeah, I'll work on this too
14:22 <bauzas> efried: the libvirt code will still possibly be there but AFAIK and unless I misunderstood, we will only have hard affinity
14:23 <bauzas> by using placement
14:23 <efried> correct
14:25 <efried> using group_policy=none would allow you to employ post-scheduling soft affinity
14:28 *** purplerbot has quit IRC
14:30 *** purplerbot has joined #openstack-placement
14:30 <bauzas> efried: so operators wanting to *only* have soft affinity for vGPUs would still need the libvirt claim
14:30 <bauzas> and maybe the weigher
14:32 <efried> actually, I take it back
14:33 <efried> you can't use the soft affinity after the fact - you can't disregard the claim you got.
14:33 <efried> it's all or nothing
14:34 <efried> You can use a host that hasn't modeled NUMA in nested providers, and get soft affinity; or if you're on a host that has modeled NUMA in nested providers, you can have hard affinity or no affinity.
14:51 <bauzas> efried: if you don't ask placement for hard affinity, then you can still have soft affinity
14:52 <bauzas> that's why I say the spec is not competitive
14:53 <efried> bauzas: you can't have soft affinity if the host is modeled with nested RPs.
14:53 <efried> bauzas: because we've created an allocation with resources from particular NUMA nodes.
14:54 <efried> So if we didn't request affinity from placement, and you happen to get allocations from opposite NUMA nodes, you can't just ignore that on the host and pick a different distribution of NUMA nodes.
14:54 <efried> so
14:54 <efried> all or nothing
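
To put efried's "all or nothing" point in concrete terms: on a host whose NUMA nodes are modeled as nested resource providers, a request that did not ask placement for affinity can come back with its claim split across NUMA-node providers, and the compute host has to honour that split. A purely hypothetical allocation shape (placeholder provider names and made-up amounts):

    # Hypothetical allocations recorded by placement for one consumer, keyed by
    # resource provider UUID (placeholder names used instead of real UUIDs).
    # The VCPUs were claimed against NUMA node 0's provider while the VGPU was
    # claimed against NUMA node 1's provider.
    allocations = {
        "NUMA0_RP_UUID": {"resources": {"VCPU": 4, "MEMORY_MB": 4096}},
        "NUMA1_RP_UUID": {"resources": {"VGPU": 1}},
    }

The virt driver can't quietly move that VGPU next to the VCPUs on NUMA node 0, because placement already holds the claim against NUMA node 1's provider, which is why post-scheduling soft affinity stops being an option once the host uses nested RPs.
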
15:13 *** dklyle has joined #openstack-placement
15:14 <mriedem> ima butt in with a question,
15:14 <mriedem> on https://developer.openstack.org/api-ref/placement/?expanded=update-allocations-detail#update-allocations
15:14 <mriedem> the 409 in the description is mostly about inventory conflicts,
15:15 <mriedem> but isn't there also a 409 response if the consumer exists and you pass 1.28 with consumer_generation=None?
15:15 <cdent> yes
15:16 <cdent> Inventory and/or allocations changed while attempting to allocate
15:17 <cdent> one could argue (weakly) that "or inventories are updated by another thread while attempting the operation" fits, since allocations change an inventory's capacity
15:17 <cdent> and if you try to send None for consumer_generation and it doesn't work, then there have been allocations out from under you
15:17 <cdent> but yes, it could be documented better
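
A sketch of the case mriedem is describing, with a placeholder endpoint, token, and UUIDs: from microversion 1.28 the PUT /allocations/{consumer_uuid} body carries consumer_generation, which is null only when the consumer does not exist yet. Sending null (or a stale generation) for a consumer that already has allocations yields the 409 in question, a consumer-generation conflict rather than the inventory conflict the api-ref description emphasises:

    import requests

    PLACEMENT = "http://placement.example.com"   # assumed endpoint
    HEADERS = {
        "x-auth-token": "ADMIN_TOKEN",           # assumed credential
        "openstack-api-version": "placement 1.28",
    }

    # Write allocations claiming the consumer is brand new (consumer_generation=None).
    resp = requests.put(
        f"{PLACEMENT}/allocations/CONSUMER_UUID",   # placeholder consumer UUID
        headers=HEADERS,
        json={
            "allocations": {
                "RP_UUID": {"resources": {"VCPU": 1, "MEMORY_MB": 512}},  # placeholder RP
            },
            "project_id": "PROJECT_ID",
            "user_id": "USER_ID",
            "consumer_generation": None,
        },
    )
    # If allocations already exist for CONSUMER_UUID, its generation is no longer
    # null, so placement answers 409 Conflict instead of 204.
    print(resp.status_code)
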
15:20 * mriedem is storyboarding
15:20 <cdent> you are a star
15:22 <mriedem> https://storyboard.openstack.org/#!/story/2006180
15:28 *** dklyle has quit IRC
15:28 *** dklyle has joined #openstack-placement
15:29 *** amodi has quit IRC
15:39 *** amodi has joined #openstack-placement
15:43 *** helenafm has quit IRC
15:53 *** tssurya has quit IRC
16:03 <gibi> cdent: left comments in https://review.opendev.org/#/c/668376
16:04 <cdent> thanks gibi
16:07 <sean-k-mooney> efried: in the numa case we should catch that in the NUMA topology filter before we hit the compute node
16:07 <sean-k-mooney> it won't be going away, at least not in the short term after we have NUMA support in placement
16:08 <efried> sean-k-mooney: Is the NUMATopologyFilter a filter or a weigher?
16:08 <sean-k-mooney> it might eventually go away, but ya, we can't ignore the allocation from placement
16:08 <sean-k-mooney> efried: it's a filter
16:08 <sean-k-mooney> doing hard affinity
16:08 <efried> so the filter part that rejects an allocation that doesn't provide affinity - that would be moot
16:09 <efried> because why would we not have requested affinity from placement if we were going to enforce it in the filter anyway?
16:09 <sean-k-mooney> if and only if we implement everything it currently does in placement
16:09 <sean-k-mooney> so cpu, hugepages, and pci device numa affinity
16:09 <sean-k-mooney> when we can enforce all of the above with placement it can go away, but not before we have all 3
16:11 <sean-k-mooney> so it will allow us to do it piecemeal, in that more and more of the filtering can be left to placement and eventually everything will be enforced by placement and it can be removed once we have parity
16:12 <sean-k-mooney> that is probably after U
16:24 <sean-k-mooney> stephenfin: jangutter: i fixed my missing bug nit in https://review.opendev.org/#/c/666387/2 if you want to hit that one quickly. it's not urgent but let's try and land it by m2
16:25 <sean-k-mooney> we might want to backport it to Stein too, but we can do that when needed
16:25 <sean-k-mooney> oops wrong channel
16:26 <cdent> cleanly sean-k-mooney needs to be kickbanned for violating the rules so egregiously
16:26 <cdent> clearly!
16:27 <sean-k-mooney> clearly :)
16:31 <edleafe> and certainly not cleanly!
16:33 * cdent waves goodnight
16:33 *** cdent has quit IRC
16:35 <efried> sean-k-mooney: I agree the filter itself needs to stick around, especially because (IMO) we should not be trying to go whole hog in placement with things like hugepages etc. However, pieces of that code will become redundant (run but never reject) due to the bits we are enabling in placement.
16:36 <sean-k-mooney> yep
16:36 <sean-k-mooney> although that code needs to be made placement aware
16:36 <efried> though (run but never reject) may be better as (remove) depending on how reliably we're running the placement side.
16:37 <sean-k-mooney> it's the same code that does the assignment on the compute node, and it needs to know it can only assign the resources that correspond to the allocations/RPs selected by placement
16:37 <efried> yeah, that could get a little crazy.
16:37 <sean-k-mooney> efried: the filter works by invoking the assignment code that will be used on the compute node, without actually claiming the resources in the RT
16:38 <sean-k-mooney> so 90% of the code would still be used after placement does it
16:38 <sean-k-mooney> but it needs to learn that it can't select from all resources anymore and can only look at the resources that correspond to the RPs in the placement allocation
16:39 <sean-k-mooney> part of the logic will be updated to look at the allocation by the standardise-cpu-in-placement work
16:40 <sean-k-mooney> when hugepages or pci devices are moved to placement the rest will have to be updated
16:40 <sean-k-mooney> on the plus side it should make the filter faster
16:41 <efried> which is kind of the whole point
16:41 <sean-k-mooney> making it faster
16:41 <efried> faster in two senses
16:42 <efried> failures happen earlier with less racing; and the filter itself actually performs better.
16:42 <efried> whole point of placement
16:42 <sean-k-mooney> ya, although the real win is reducing reschedules; however, to do that we likely need to move the RT claim to the conductor too.
16:43 *** ttsiouts has quit IRC
16:44 *** ttsiouts has joined #openstack-placement
16:49 *** ttsiouts has quit IRC
21:53 *** mriedem has quit IRC
22:22 <openstackgerrit> Merged openstack/placement master: Update implemented spec and spec document handling  https://review.opendev.org/669184
22:32 <openstackgerrit> Merged openstack/placement master: Add whereto for testing redirect rules  https://review.opendev.org/669370
22:32 <openstackgerrit> Merged openstack/placement master: tox: Stop building api-ref docs with the main docs  https://review.opendev.org/669371
