Tuesday, 2019-04-23

*** tetsuro has joined #openstack-placement00:12
*** tetsuro_ has joined #openstack-placement00:14
*** tetsuro has quit IRC00:16
*** ttsiouts has joined #openstack-placement00:19
*** ttsiouts_ has joined #openstack-placement00:26
*** ttsiouts has quit IRC00:28
*** mriedem has quit IRC00:30
*** ttsiouts has joined #openstack-placement00:34
*** ttsiouts_ has quit IRC00:36
*** ttsiouts has quit IRC00:41
*** ttsiouts has joined #openstack-placement00:43
*** tetsuro_ has quit IRC00:50
*** ttsiouts has quit IRC00:51
*** takashin has joined #openstack-placement01:43
*** dklyle has quit IRC02:45
*** dklyle has joined #openstack-placement02:46
*** ttsiouts has joined #openstack-placement02:48
*** takashin has left #openstack-placement03:03
*** ttsiouts has quit IRC03:21
*** ttsiouts has joined #openstack-placement05:18
*** ttsiouts_ has joined #openstack-placement05:43
*** ttsiouts has quit IRC05:45
*** ttsiouts has joined #openstack-placement05:47
*** ttsiouts_ has quit IRC05:49
*** ttsiouts has quit IRC05:52
openstackgerritSurya Seetharaman proposed openstack/placement master: [WIP] Spec: Support Consumer Types  https://review.opendev.org/65479906:15
*** ttsiouts has joined #openstack-placement07:22
*** helenafm has joined #openstack-placement08:08
*** tssurya has joined #openstack-placement08:12
tssuryacdent: good morning08:14
tssuryalet me know if you have some time, I had a couple of questions for the consumer types spec08:15
*** ttsiouts_ has joined #openstack-placement08:21
*** ttsiouts has quit IRC08:23
*** e0ne has joined #openstack-placement08:39
*** gibi_off is now known as gibi08:47
*** ttsiouts has joined #openstack-placement08:53
*** ttsiouts_ has quit IRC08:56
sean-k-mooneyo/10:35
sean-k-mooneyedleafe: efried are either of ye about at the moment?10:35
sean-k-mooneyquick question for ye.10:35
sean-k-mooneyi was talking to bauzas earlier today and a question came up which bauzas is going to look into later10:36
sean-k-mooneywhen we have nested resource providers will placement return multiple allocation candidates for the same host10:36
sean-k-mooneye.g. if I request resources from both the root rp and a resource that can be provided by either of two child rps will I get 2 allocation candidates10:37
sean-k-mooney1 with cpus/ram from the root and the other resource from child RP1, and a second with the other resources from child rp2?10:38
sean-k-mooneythis is important for several of the numa/bandwidth related features but it is also important for the sharing resource providers use case too.10:39
sean-k-mooneyin the sharing case it's not really about child resource providers but more that the host is a member of 2 aggregates for different shared disk providers. do I get an allocation candidate for both possible configurations or just one of them10:40
sean-k-mooneyanyway if anyone knows the answer to the above ^ let me know10:41
gibisean-k-mooney: placement returns multiple candidates per compute if possible. See for example the gabbi case https://github.com/openstack/placement/blob/931a9e124251a0322550ff016ae1ad080cd472f3/placement/tests/functional/gabbits/allocation-candidates.yaml#L60210:47
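A minimal sketch of the behaviour gibi describes, assuming a reachable placement endpoint and microversion 1.29 (which, as assumed here, reports root_provider_uuid in the provider summaries); the URL, token, and resource amounts are placeholders:

    # Count how many allocation candidates one request yields per root
    # provider (i.e. per compute host). Endpoint, token and amounts are fake.
    import collections
    import requests

    PLACEMENT_URL = "http://placement.example.com"  # placeholder
    TOKEN = "..."                                    # placeholder

    resp = requests.get(
        f"{PLACEMENT_URL}/allocation_candidates",
        headers={"X-Auth-Token": TOKEN,
                 "OpenStack-API-Version": "placement 1.29"},
        params={"resources": "VCPU:1,MEMORY_MB:512,DISK_GB:10", "limit": 20},
    )
    body = resp.json()

    summaries = body["provider_summaries"]
    per_root = collections.Counter()
    for ar in body["allocation_requests"]:
        # Each allocation request maps provider UUIDs to the resources it
        # would consume from them; group candidates by their root provider.
        roots = {summaries[rp]["root_provider_uuid"] for rp in ar["allocations"]}
        for root in roots:
            per_root[root] += 1
    print(per_root)  # a host with nested/sharing providers can appear several times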
sean-k-mooneygibi: ok that is good to know and is there a way to limit the candidates per host without limiting the total set10:48
gibisean-k-mooney: I don't think it is possible10:48
sean-k-mooneyok that might be useful as things get more nested10:49
sean-k-mooneyif you have a low limit set like CERN has then it could be problematic10:49
sean-k-mooneythey limit to like 20 or 50 allocation candidates but if they get 5 allocations per host then that's only 10 hosts instead of 5010:50
sean-k-mooneyor 4 I guess10:50
gibisean-k-mooney: they will know how nested their deployment will be so they can adjust the limit accordingly10:52
sean-k-mooneynot really10:52
sean-k-mooneythey set the limit for performance reasons10:52
sean-k-mooneyso they can't increase it but we could ask placement to only return 1 candidate per host or 3 instead of all possible combinations10:53
gibiOK, I see10:53
sean-k-mooneyjust taking the bandwidth case if I install 2 4x10G NICs in a host that gives me 8 PFs10:54
gibialternatively we can change placement to return one candidate for each possible host before it returns the second candidate for the first host (e.g. order the candidates differently)10:54
sean-k-mooneyassuming all were on the same physnet and had capacity left that would result in 8 allocation candidates10:54
sean-k-mooneygibi: perhaps that has other issues too10:55
sean-k-mooneynamely numa affinity10:55
sean-k-mooneyi think this is a larger topic that is worth discussing with more people10:55
gibiOK so probably one ordering will not be good for all cases10:55
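A sketch of the alternative ordering gibi floats above: interleave the candidate list so every root provider (host) contributes one candidate before any host contributes a second. The candidate and root representations here are simplified assumptions:

    from itertools import chain, zip_longest

    def interleave_by_root(candidates, root_of):
        """Order candidates round-robin across root providers.

        candidates: iterable of allocation candidates (any shape).
        root_of: callable returning the root provider uuid of a candidate.
        """
        by_root = {}
        for cand in candidates:
            by_root.setdefault(root_of(cand), []).append(cand)
        # First candidate of every host, then every second candidate, and so on.
        rounds = zip_longest(*by_root.values())
        return [c for c in chain.from_iterable(rounds) if c is not None]

With a limit of 20 applied after such an ordering, 20 different hosts are seen before any host's second candidate, instead of one nested host consuming 8 of the 20 slots.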
*** tetsuro has joined #openstack-placement10:56
gibisean-k-mooney: agree about discussing it with others10:56
sean-k-mooneyya I think this is just another pair of axes that we need to consider in the wider "richer request syntax" discussion10:56
gibisean-k-mooney: as an extra problem, nova only considers the first candidate per host10:57
sean-k-mooneygibi: yes today.10:57
sean-k-mooneyif we want to do weighing/filtering on allocation candidates instead of hosts in the future that would change10:58
gibisean-k-mooney: I agree10:58
sean-k-mooneyanyway I realised while talking to bauzas that we had assumed that there would be multiple allocation candidates for the same host but I had never checked10:59
sean-k-mooneythanks for pointing out that test10:59
sean-k-mooneyI'm not sure if it's also the case for nested11:00
sean-k-mooneyit's definitely asserting the behavior for local and sharing providers11:00
gibisean-k-mooney: there are nested cases in the same file11:00
sean-k-mooneyso in theory nova should already be handling it11:00
gibisean-k-mooney: nova handles the multiple candidates per host by taking the first candidate11:01
sean-k-mooneyah yes like these https://github.com/openstack/placement/blob/931a9e124251a0322550ff016ae1ad080cd472f3/placement/tests/functional/gabbits/allocation-candidates.yaml#L538-L55711:02
gibisean-k-mooney: yes11:03
*** tetsuro has quit IRC11:03
sean-k-mooneygibi: ya taking the first is fine until we start considering numa11:03
gibisean-k-mooney: yes, for numa affinity the nova scheduler should start understanding allocation candidates11:04
sean-k-mooneyactually we could have issues with taking the first one for sharing providers also right?11:05
sean-k-mooneyactually maybe not11:05
gibisean-k-mooney: if we want to prefer sharing over local disk for a compute that has both, then yes11:05
sean-k-mooneysince a host will either be configured to use local storage or the shared one for ephemeral storage11:05
sean-k-mooneyI would have to think about it some more but there is probably some unexpected behavior lingering there.11:08
sean-k-mooneyI'm not sure how we handle bfv vs non-bfv or local storage vs rbd image backend for ephemeral storage11:09
sean-k-mooneyor bfv on a host with rbd configured for ephemeral storage. as in that case we could have 2 sharing resource providers of storage from different ceph clusters.11:10
sean-k-mooneyin other words I should probably learn how we do storage with placement at some point :)11:11
*** belmoreira has joined #openstack-placement11:45
*** mriedem has joined #openstack-placement12:11
edleafetssurya: cdent is off this week12:34
tssuryaedleafe: oh thanks for telling me12:35
tssuryaso I guess I will catch him at the ptg then12:36
edleafeyeah, he'll be there12:36
*** belmoreira has quit IRC13:07
*** belmoreira has joined #openstack-placement13:12
efriedsean-k-mooney: I'm here now, catching up...13:40
efriedsean-k-mooney: Having only read your intro, yes, it's absolutely possible to get multiple candidates from the same host, in both nested and sharing cases. We have quite a few gabbi tests that demonstrate this behavior.13:41
efriedokay, I see that's where gibi pointed you.13:41
sean-k-mooneyefried: yes13:42
sean-k-mooneyefried: i had assumed it would13:42
efriedI see the issue is again how to limit the number of candidates we get.13:42
efriedand/or dictate the order in which they arrive13:43
efriedWe have a placement config to randomize13:43
sean-k-mooneyya if you want to do weighing/filtering on the nova side with allocation candidates13:43
efriedRandomize will get you close-ish to "spread" behavior. Non-randomize *might* get you close-ish to "pack" behavior. Neither is all the way there.13:43
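A toy illustration of the randomize-plus-limit point (the option efried refers to is presumably placement's randomize_allocation_candidates setting); the candidate list below is fabricated purely for illustration:

    import random

    # 50 hosts, 8 candidates each (e.g. 8 PFs that could satisfy a VF request).
    candidates = [f"host{h}-cand{c}" for h in range(50) for c in range(8)]

    natural = candidates[:20]                 # packed order: limit eaten by a few hosts
    shuffled = random.sample(candidates, 20)  # randomized order before the limit

    print(len({c.split("-")[0] for c in natural}))   # 3 distinct hosts
    print(len({c.split("-")[0] for c in shuffled}))  # typically ~15-19 distinct hosts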
sean-k-mooneyusing limit might result in failure because not enough hosts were considered13:44
efriedRight, assuming there's some reason for the hosts at the top of the list to be unsuitable.13:44
sean-k-mooneyyes which is likely if you are using anything not tracked in placement13:44
efriedWe should in general keep striving to tailor the placement queries to make fewer allocation candidates unsuitable.13:44
sean-k-mooneylike sriov or pci passthrough or numa or pinning or ...13:45
efriedyup13:45
efriedswhy we're trying to get all that stuff into placement.13:45
sean-k-mooneyyep13:45
sean-k-mooneyI'm just wondering if we need an allocation_per_host query arg or config setting13:46
sean-k-mooneyuntil we get to the point of tracking all that stuff in placement13:46
sean-k-mooneyalthough even tracking everything in placement won't fix it13:47
sean-k-mooneywe will still have a combinatorial problem if there are multiple RPs of the same resources13:47
sean-k-mooneye.g. multiple PFs13:47
sean-k-mooneyif you ask for 1 vf, 1 cpu, 1G of ram and 10G of disk but have 8 PFs you will get 8 allocation candidates for the same host13:48
sean-k-mooneyif you add in a request for a vGPU and you have 2 pGPUs on the host that becomes 16 allocation candidates13:49
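The arithmetic behind those numbers is just a cross product over the providers that can satisfy each requested group; a small sketch with illustrative names only:

    from itertools import product

    pf_providers = [f"pf{i}" for i in range(8)]  # 8 PFs that could supply the VF
    pgpu_providers = ["pgpu0", "pgpu1"]          # 2 pGPUs that could supply the vGPU

    # One candidate per combination of (PF, pGPU) for the same host.
    candidates = list(product(pf_providers, pgpu_providers))
    print(len(candidates))  # 16 allocation candidates for this one host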
sean-k-mooneytracking numa in placement might reduce that set or it might increase it13:50
sean-k-mooney(because you have multiple sets of cpus/memory)13:50
sean-k-mooneyso it's not a problem in Stein but it might become one in Train if we don't have some logic for this13:51
sean-k-mooneyedleafe: by the way when efried was joking about weighers in placement on Thursday was that actually a thing?13:52
edleafesean-k-mooney: ideas like that have been suggested, and pretty much laughed off13:52
efriedIf you mean "preferred traits" that's been semi-seriously considered.13:52
edleafethat's different13:53
sean-k-mooneycorrect13:53
edleafethat's more like "if you have some with those traits, great, otherwise I'll take what you got"13:53
efriedNo, preferred traits is a weigher-in-placement.13:53
sean-k-mooneyI mean a weigher that sorts the allocation candidates13:53
sean-k-mooneyefried: preferred traits, when I first proposed it, was in nova13:54
sean-k-mooneyedleafe: yes more or less13:54
sean-k-mooneyedleafe: preferred traits was: when we get to the nova weigher, look at the allocation candidates and choose the one with the most traits that are in the preferred list13:55
edleafeyeah, that's different than having a bunch of random criteria sorting ACs13:55
sean-k-mooneyya13:55
efriedI guess you could implement it as, "if results with these traits, return only those results and skip the rest." I had been thinking of it as, "Find all the candidates as usual, but put the ones with these traits at the front of the list."13:55
sean-k-mooneyweighers in placement would be on the placement api side13:55
sean-k-mooneybefore we return the allocation candidates response to nova13:55
sean-k-mooneyefried: yes that is how preferred traits is intended to work. but it was proposed as a nova-only feature13:56
edleafeefried: that's what people were asking for, and we had agreed that was best done in the scheduler13:56
sean-k-mooneythe placement request would not have any preferred traits in it13:56
efriedI don't remember that being agreed to; I thought it was just that it hadn't made the cut yet.13:57
efried...and we would continue to weigh in the scheduler until it did.13:57
efriedbecause the only problem with doing it in the scheduler is the limit thing.13:57
edleafeefried: It was rejected for exactly that reason: it already exists in nova, and it's a nova concept13:58
sean-k-mooneywell there was certainly an element of "there are more important things to do first"13:58
edleafesean-k-mooney: yep. Re-implementing nova in placement is not a priority :)13:58
sean-k-mooneyedleafe: yes although sorting results is a general use case that goes beyond nova13:59
efried++13:59
efriedAgree that placement currently *doesn't* handle sorting, except for random vs "natural"13:59
sean-k-mooneyanyway I'll assume it will continue to be not a priority for Train13:59
efriedbut that doesn't mean that it never should, or that it's a concept anathema to the DNA of placement.13:59
edleafesean-k-mooney: sure, but it should be a generic solution14:00
sean-k-mooneythat is separate from the problem I was describing above14:00
sean-k-mooneyedleafe: yep totally agree that if placement ever provides allocation sorting it should be a general solution14:00
efriedHow generic?14:01
efriedSeems to me like we need to support a finite set of sorting policies14:01
sean-k-mooneyinitially I would start with ascending/descending based on availability of a resource class on a host14:01
sean-k-mooneysort=VCPU>14:02
efriedright, most/least available resource by percentage/absolute of $RC14:02
sean-k-mooneyya something like that14:02
efriedor [$RC, ...] (in order)14:02
efriedso "least available by absolute" would be a "pack" strategy for example.14:03
efriedIt also sounds like "spread roots" would be another desired policy.14:03
sean-k-mooneythe other interesting sorting criteria could be on traits but I would defer that until way later14:03
efriedYeah, without trying to weigh traits relative to each other, it would just be: pass in a [$list of preferred traits] and sort based on which candidate has the most of them.14:04
sean-k-mooneyyes14:05
efriedweighing traits relative to each other would be tricky, and would potentially require a whole other policy, like "rarest" traits are "heavier". That seems like going too far to me.14:05
sean-k-mooneythere are more complex things you could do but baby steps14:05
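A sketch of the trait-preference ordering described above, assuming each candidate is reduced to the set of trait names its providers carry (a simplification, not the real data model):

    def sort_by_preferred_traits(candidates, preferred):
        """Most preferred-trait matches first; no relative weighting of traits."""
        preferred = set(preferred)
        return sorted(candidates,
                      key=lambda traits: len(preferred & traits),
                      reverse=True)

    cands = [{"HW_CPU_X86_AVX2"}, {"STORAGE_DISK_SSD", "HW_CPU_X86_AVX2"}, set()]
    print(sort_by_preferred_traits(cands, ["STORAGE_DISK_SSD", "HW_CPU_X86_AVX2"]))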
edleafeI'm not sure why those types of sorts are better done in placement than in the calling service14:06
efriedbecause the calling service needs to be able to avoid dealing with tens of thousands of results14:06
efriedCERN case.14:06
sean-k-mooneyedleafe: the issue is with the use of limit14:06
sean-k-mooneyCERN do like limit=20 or limit=5014:07
efried^ so they don't have to deal with tens of thousands of results14:07
sean-k-mooneyyep14:07
edleafeSo we do the filtering *and* weighing in placement so that we only have to return the 20 "best"?14:07
sean-k-mooneywhich to be fair randomisation helps with but it distorts the set that nova can then filter/sort14:08
efriededleafe: Right, you would say sort_policy=foo&limit=2014:08
efriedwith a generic ?sort_policy=<enum val> you could start small, implementing "pack" and "spread" based on the total available resource. IMO this is not a nova-ism at all. "Use the open jug of milk from the fridge before starting a new jug".14:09
edleafeefried: sure, I get that. Weighers in Nova aren't always that simple, though14:10
sean-k-mooneyefried: pack is easy as it's just an order by rp_uuid14:10
sean-k-mooneyspread is harder14:10
efriedsean-k-mooney: pack isn't sort by rp_uuid.14:10
efriedthat assumes all the resource and trait requests are the same.14:11
sean-k-mooneysorry you are right14:11
efriedpack is sort by least available remaining resource. and spread is sort by *most* available remaining resource.14:11
sean-k-mooneyI was thinking of something related but different14:11
sean-k-mooneyyes14:11
sean-k-mooneythat is what I meant by VCPU> or MEMORY_MB<14:12
efriedYes, that's generic and complicated and powerful14:12
efriedto keep it simple, you could just average the percentage of remaining amount of all the resources in the request14:12
efried(keep it simple for the client, that is :)14:12
efriedso the client just says "pack" or "spread".14:12
sean-k-mooneyperhaps.14:13
efriedit's not perfect, but it gets real close while being vastly simpler for the caller.14:13
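A sketch of that averaging idea, with simplified summary and request shapes (these are assumptions, not placement's internal structures): score each candidate's root provider by the mean fraction of each requested resource class left free after the allocation, then sort ascending for "pack" or descending for "spread".

    def remaining_fraction(summary, requested):
        """summary: {rc: {"capacity": int, "used": int}} for a root provider.
        requested: {rc: amount} taken from the candidate request."""
        fractions = []
        for rc, amount in requested.items():
            cap = summary[rc]["capacity"]
            fractions.append((cap - summary[rc]["used"] - amount) / cap)
        return sum(fractions) / len(fractions)

    def order_candidates(candidates, requested, policy="pack"):
        """candidates: list of (candidate, root_summary) pairs."""
        return sorted(candidates,
                      key=lambda pair: remaining_fraction(pair[1], requested),
                      reverse=(policy == "spread"))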
sean-k-mooneyanyway I just wanted to raise the possible problem we could have with limit as the combinatorics of allocation candidates increase with nested rps14:14
edleafepack/spread can be by percent of free resources, number of instances, number per switch, etc. We will have to keep this *very* limited.14:14
sean-k-mooneyefried: it does not solve my original problem however14:14
efriedno, for that you need a policy like "split roots".14:14
sean-k-mooneyim not sure we want to go that route14:15
sean-k-mooneyi feel it will be limiting14:15
sean-k-mooneybut anyway it's just something for us to be thinking about14:15
efriedperhaps I'm not understanding the specific use cases you're trying to solve.14:15
efriedI thought you were trying to get it so the list of candidates didn't have umpteen results for the same host clustered together at the top of the list.14:16
sean-k-mooneyall hosts return 8 possible allocation candidates. I want to limit that to a max of 214:16
efried...in case for some reason that host as a whole is unsuitable and we run off the end of our limit before we get to a suitable one.14:16
sean-k-mooneyefried: yes14:17
efriedRandomize results gets you pretty close to solving that problem though.14:17
efriedrandomize+limit14:17
efrieduntil there's some other criterion in play, it's almost as good as "spread roots".14:17
sean-k-mooneybut the second we track numa in placement that does not work14:17
efriedeh, why not?14:17
sean-k-mooneybecause we can't express the NUMA topology relationship to placement14:18
sean-k-mooneyso while one of the allocation candidates may be valid for the host we don't know which one14:18
sean-k-mooneyif we randomise it then we will get false 'no valid host' results14:19
efriedoh, okay, I was assuming when you said "track numa in placement" that you meant we were expressing the topology relationship.14:19
efriedyeah, so this is where being able to express topology and affinity matters.14:19
sean-k-mooneyi just mean track rps in a tree14:19
efriedboth of those blueprints are proposed for Train.14:19
sean-k-mooneywithout the richer query syntax randomisation is not going to help14:20
sean-k-mooneyyep they are.14:20
sean-k-mooneywill both land in train14:20
efriedrandomization will *help*, but it won't *solve*. It will still sometimes break.14:22
efriedwill both land in train <== is that a question?14:22
efriedOne could hope so. But it seems quite unlikely that we'll be able to land anything for affinity on the placement side and also consume it on the nova side within Train.14:23
sean-k-mooneyyes :) although perhaps a slightly biased one14:23
sean-k-mooneyya14:23
sean-k-mooneybut I'm not going to lose hope that Train will deliver good improvements in general14:24
*** tetsuro has joined #openstack-placement14:59
*** tetsuro has quit IRC15:01
*** ttsiouts has quit IRC15:27
*** ttsiouts has joined #openstack-placement15:27
*** ttsiouts has quit IRC15:32
*** helenafm has quit IRC15:37
*** e0ne has quit IRC15:50
*** belmoreira has quit IRC16:03
openstackgerritKashyap Chamarthy proposed openstack/os-traits master: Add CPU traits for Meltdown/Spectre mitigation  https://review.opendev.org/65519316:45
*** altlogbot_0 has quit IRC16:50
*** altlogbot_3 has joined #openstack-placement16:56
*** e0ne has joined #openstack-placement17:20
*** e0ne has quit IRC17:24
*** tssurya has quit IRC18:07
*** tssurya has joined #openstack-placement18:10
*** e0ne has joined #openstack-placement18:29
*** amodi has joined #openstack-placement18:43
*** amodi has quit IRC18:55
*** e0ne has quit IRC19:01
-openstackstatus- NOTICE: the zuul scheduler is being restarted now in order to address a memory utilization problem; changes under test will be reenqueued automatically19:09
*** tssurya has quit IRC19:39
*** ttsiouts has joined #openstack-placement21:42
*** ttsiouts has quit IRC23:14
*** ttsiouts has joined #openstack-placement23:15
*** ttsiouts has quit IRC23:19

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!