Tuesday, 2019-05-28

02:32  *** takashin has left #openstack-placement
03:21  *** licanwei has joined #openstack-placement
06:06  *** e0ne has joined #openstack-placement
06:25  *** e0ne has quit IRC
06:25  *** tetsuro has joined #openstack-placement
07:09  *** tetsuro has quit IRC
07:10  *** yikun has joined #openstack-placement
07:37  *** helenafm has joined #openstack-placement
07:51  *** tetsuro has joined #openstack-placement
08:38  *** tetsuro has quit IRC
08:57  *** tetsuro has joined #openstack-placement
08:59  <openstackgerrit> Balazs Gibizer proposed openstack/placement master: Resource provider - request group mapping in allocation candidate  https://review.opendev.org/657582
09:00  *** e0ne has joined #openstack-placement
09:18  <gibi> efried: if you can answer my question in https://review.opendev.org/#/c/657419/7/api-ref/source/parameters.yaml@169 then I can +A that patch
09:18  <gibi> efried: cdent seems to be out
09:25  <openstackgerrit> Balazs Gibizer proposed openstack/os-traits master: Add REPORT_PARENT_INTERFACE_NAME_FOR_SRIOV_NIC trait  https://review.opendev.org/658852
09:46  <openstackgerrit> Chris Dent proposed openstack/placement master: Optionally run a wsgi profiler when asked  https://review.opendev.org/643269
10:05  <openstackgerrit> Chris Dent proposed openstack/placement master: DNM: See what happens with 10000 resource providers  https://review.opendev.org/657423
10:07  *** cdent has joined #openstack-placement
10:26  *** tetsuro has quit IRC
10:42  *** jaypipes has joined #openstack-placement
10:46  *** jaypipes has quit IRC
10:58  *** jaypipes has joined #openstack-placement
11:33  *** ttsiouts has joined #openstack-placement
11:38  *** ttsiouts has quit IRC
12:29  <cdent> edleafe: can I delegate a thing to you if you're able?
12:31  <cdent> which is: orchestrate when office hours should be
13:05  <openstackgerrit> Chris Dent proposed openstack/placement master: perfload with written allocations  https://review.opendev.org/660754
13:15  *** mriedem has joined #openstack-placement
13:20  *** purplerbot has quit IRC
13:20  *** purplerbot has joined #openstack-placement
13:28  <efried> gibi: Responded in https://review.opendev.org/#/c/657419/
13:52  <gibi> efried: thanks
13:52  <gibi> approved the patch
14:01  *** amodi has joined #openstack-placement
14:10  <efried> Thanks gibi
14:10  <efried> cdent: We might want to have a hangout with the placement+nova teams (or maybe put it on the agenda for the next nova meeting) about the use cases for can_split.
14:10  <efried> Because it sounds like "land me anywhere" isn't good enough, and "preserve existing behavior" is not possible.
14:11  <efried> (other than by doing nothing, i.e. not including NUMA modeling in any form)
14:14  <cdent> efried: I think we should perhaps first (or simultaneously) do a query to operators, and something during the nova meeting
14:15  <cdent> I think the shortest path to preserving existing behavior is "only model NUMA when the host is configured to do so"
14:15  <cdent> that does mean operators have to do something, but perhaps that's not such a big deal
14:16  <efried> cdent: Right, but that gets us to the limitation that you can't co-locate guests with and without a NUMA topology.
14:16  <cdent> yes, which is why I'm saying we need to talk to operators. maybe that's nbd
14:16  <cdent> or
14:17  <cdent> maybe it is not enough of a deal to incur the cost
14:17  <efried> okay, and we "talk to operators" via the ML?
14:17  <efried> or what?
14:17  <cdent> do we have any other option?
14:17  <efried> that's what I'm asking.
14:17  <cdent> it's a good place to start
14:17  <efried> right
14:18  <cdent> i guess [ops][nova][placement] tags or something like that
14:18  <efried> want me to craft something?
14:19  <cdent> well, I think if I do it I will bias it too much one way. If you do it, it will bias it too much another way, so it depends on which foot we want to start on (and the other will follow later with the other foot)
14:19  <efried> I think I can do it without bias. Starting...
14:22  <efried> ...but going to run some errands, so expect something in an hour or two.
14:23  <cdent> aye aye
14:30  <openstackgerrit> Chris Dent proposed openstack/placement master: Add oslo.middleware.cors to conf generator  https://review.opendev.org/661769
14:59  *** dklyle has joined #openstack-placement
15:05  *** e0ne has quit IRC
15:44  <efried> cdent: I'm back.
15:44  <cdent> welcome back
15:44  <efried> Unsurprisingly, having a hard time phrasing questions
15:44  <efried> because they sound silly to me
15:44  <cdent> do you want to share a draft somewhere?
15:44  <efried> yeah, lemme etherpad it...
15:45  <cdent> do you need a camp counsellor to tell you there are no stupid questions?
15:51  *** helenafm has quit IRC
16:20  <cdent> efried: Imma be outtie pretty soon, I see you got drawn away. If you send me the etherpad link I can look before you wake up. But in the meantime the important questions to me are:
16:21  <cdent> How important is it to be able to mix non-NUMA and NUMA workloads on the same host?
16:22  <cdent> Should any host that supports NUMA represent its VCPU and MEMORY as part of NUMA nodes, or should some machines be "simple"?
16:23  <cdent> If given a choice between a faster and simpler placement service that requires managing inventory more closely (as in the "Should..." question above), or a potentially slower placement service that always reports NUMA inventory, which would you choose?
16:24  <cdent> Do you distinguish between low and high performance hosts? How?
16:24  <sean-k-mooney> cdent: i would prefer if all hosts were modeled as NUMA hosts
16:24  <sean-k-mooney> and similarly if all guests were modeled as NUMA guests, so we can simplify the code
16:25  <cdent> But mostly: I think we've laid out the questions pretty well on the spec and we need to summarize them and point interested parties to the spec
16:25  <sean-k-mooney> but if we need to support both then i can live with that
16:25  <cdent> sean-k-mooney: yes, I know, these are prompts to try to help eric create an email to find out what operators prefer
16:25  <sean-k-mooney> i am almost finished with https://review.opendev.org/#/c/658510/4/doc/source/specs/train/approved/2005575-nested-magic.rst by the way
16:25  <cdent> our preferences aren't really the important part
16:25  <cdent> excellent, glad to hear it
16:26  <sean-k-mooney> well my preferences are based on simplifying the nova and placement code by only having one code path for all instances
16:26  <cdent> I think my comment on there about the locus of control being in the wrong place is the crux of things for me.
16:26  <cdent> placement doesn't know what an instance is
16:26  <cdent> yet the only reason can_split is coming up is to allow a certain type of instance
16:26  <cdent> so to me, it seems...model breaking
16:27  <sean-k-mooney> hehe well if we make all instances numa instances then it goes away
16:27  <cdent> would that mean changing a bunch of flavors?
16:28  <sean-k-mooney> no
16:28  <cdent> I guess I'm not certain what you mean by "numa instance"
16:28  <sean-k-mooney> any instance that does not have a numa configuration specified via hw:numa_nodes, or implicitly created by hugepages or cpu pinning, is a non-numa instance
16:29  <sean-k-mooney> we implicitly generate a request for one numa node if you set hw:cpu_policy=dedicated or hw:mem_page_size=large
16:29  <sean-k-mooney> the change would be: if you don't set hw:numa_nodes=x, we assume x is 1
16:30  <sean-k-mooney> that breaks some use cases but it also simplifies the code in some respects
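[editor's note] The implicit-NUMA rule sean-k-mooney describes above can be sketched roughly as follows. This is an editorial illustration, not nova's actual code (the real logic lives in nova's hardware handling, and the helper name here is made up):

```python
# Illustrative sketch of the rule described above: an instance gets a guest
# NUMA topology if hw:numa_nodes is set explicitly, or implicitly if the
# flavor asks for CPU pinning or an explicit page size.
def implied_numa_nodes(extra_specs):
    """Return the number of guest NUMA nodes a flavor implies, or None."""
    if "hw:numa_nodes" in extra_specs:
        return int(extra_specs["hw:numa_nodes"])
    if extra_specs.get("hw:cpu_policy") == "dedicated":
        return 1  # pinning implies a single-node NUMA topology
    page_size = extra_specs.get("hw:mem_page_size", "")
    if page_size == "large" or page_size.isdigit():
        return 1  # explicit page-size requests imply NUMA too
    return None  # a "non-numa" instance under the current rules
```

Under the change sean-k-mooney floats (assume hw:numa_nodes=1 when unset), the final `return None` would simply become `return 1`, which is the "all instances are numa instances" simplification, at the cost of the use cases discussed below.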
16:30  <cdent> that breaks what has been described as The Problem: making use of spare cpus spread across numa nodes for instances that "don't care"
16:30  <sean-k-mooney> yep
16:31  <cdent> so presumably that's a deal breaker _if_ The Problem really is the problem. which is why I think we need to talk to operators
16:31  <sean-k-mooney> it does, but supporting that use case causes a lot of extra complexity
16:31  <cdent> because if it is not a problem, then we can either solve it your way
16:31  <cdent> or solve it my way: don't report numa inventory for some hosts
16:32  <sean-k-mooney> or solve it the way i'm suggesting in the spec, which does neither
16:32  <sean-k-mooney> e.g. it supports having the spread, lives with the complexity, and adds a little more to respect the allocations from placement
16:33  <cdent> I'll look forward to reading that later, but to reiterate: I think we need to stop talking about what's possible, and talk more about what people want
16:33  <cdent> because it feels like we got down this road in the first place by saying "we have to be able to do X" without really confirming that requirement versus its costs
16:33  <cdent> brb
16:35  * cdent is back
16:37  <cdent> efried: maybe some of that chat ^ will be good fodder?
16:38  <sean-k-mooney> anyway, to your point: yes, it would be good to know if operators care about the "i don't care about numa" use case. i think this will be enterprise folk mainly, and as always you will have the telcos on the other end making life complicated.
16:39  <cdent> mriedem: this stack may not really be your bailiwick, but if you're inclined, tetsuro has done some nice cleanups which appear to result in some good performance gains: https://review.opendev.org/658778 (and they also make some of the nested magic easier to do)
16:40  <cdent> sean-k-mooney: I think the telco side of things is relatively sane/clear [1]: they want to report NUMA and they want to place with high levels of control. The real issue here is effective use of resources for people who are just "give me a vm".
16:40  <cdent> [1] I can't believe I just said that
16:40  <mriedem> i'm gonna be spending quite a bit of time here writing a recreate test for zigo's issue from earlier
16:40  <mriedem> since it goes back to ocata
16:40  <sean-k-mooney> cdent: hehe, that telcos are sane? are you feeling ok
16:41  <cdent> mriedem: ossum! There's no rush on that tetsuro thing, was just thinking it might tickle your fancy
16:41  <sean-k-mooney> :)
16:41  <sean-k-mooney> but yes, i know what you mean
16:42  <cdent> I know, right? But what I mean is: they essentially don't care about dynamic placement. They frequently want a very specific thing. The enterprise or cloudy case is what we think of as "the simple case", but mixing it into a numa setting is complicating things. Which is why I'm wondering if perhaps "don't report that numa stuff" is a reasonable out.
16:43  <sean-k-mooney> cdent: well, the current numa-in-placement spec was proposing a per-compute-node config option that lists the resource classes to report per numa node
16:43  <sean-k-mooney> which defaulted to none for backwards compatibility
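[editor's note] The per-NUMA-node reporting option sean-k-mooney describes could be sketched like this. This is an editorial illustration with made-up names (the spec's actual option name and data shapes may differ); the point is only that an empty list, the default, reproduces today's root-provider-only behavior:

```python
# Sketch: split host inventory between the root resource provider and NUMA
# child providers, driven by a per-compute-node list of resource classes.
def build_inventories(root_inventory, numa_nodes, per_numa_classes):
    """Split inventory between the root provider and NUMA child providers.

    root_inventory: {resource_class: total} for the whole host
    numa_nodes: list of per-node totals, e.g. [{"VCPU": 8}, {"VCPU": 8}]
    per_numa_classes: classes to report per NUMA node (default: empty list,
        which keeps everything on the root provider, as today)
    """
    root = {rc: total for rc, total in root_inventory.items()
            if rc not in per_numa_classes}
    children = [{rc: total for rc, total in node.items()
                 if rc in per_numa_classes}
                for node in numa_nodes]
    return root, children
```

With `per_numa_classes=[]` the children carry no inventory and the root reports everything, which is the backwards-compatible default mentioned above; listing VCPU and MEMORY_MB moves those classes onto the NUMA child providers.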
16:44  <sean-k-mooney> so if we continue to tell people not to mix numa instances with non-numa instances
16:44  <cdent> right, so maybe that's enough
16:44  <sean-k-mooney> that is what we recommend today, by the way, to be sure
16:45  <cdent> perhaps that's another for efried's list of questions then: are you okay with the status quo?
16:45  <sean-k-mooney> ya, it might be, although we would likely have to do some prefilter stuff to transform our existing flavors into something placement would like
16:45  <cdent> I assume cfriesen might have thoughts
16:45  <cdent> but he's not in here
16:46  <sean-k-mooney> i know he has customers that use openstack to spin up a single giant vm that uses all the resources on the platform, and he does not want to have to tell them to create multiple numa nodes in the guest
16:46  <sean-k-mooney> so he would prefer not to break the "i don't care" use case for them
16:47  <sean-k-mooney> i hope they are a minority
16:47  <sean-k-mooney> the people that i believe really want to mix numa and non-numa instances on the same host are the edge folks
16:48  <sean-k-mooney> mainly because they don't have enough hosts at an edge site to partition them statically
16:48  <cdent> it's the case that you can already mix, right? the issue is mixing efficiently
16:49  <sean-k-mooney> well yes, but we are removing that limitation at least partially in train
16:49  <sean-k-mooney> and windriver have downstream-only code that makes it work today
16:49  <sean-k-mooney> they have a host agent that confines the floating instances so they don't float over the pinned ones
16:49  <sean-k-mooney> and it dynamically changes their confinement as pinned instances are added and removed
16:50  <sean-k-mooney> i think that is open-sourced as part of starlingx but i'm not sure
16:52  <cdent> if that were present on a host _and_ numa vcpu was being reported _and_ can_split was being used, placement would quickly become wrong
16:53  <sean-k-mooney> yes, if it did not respect the numa node the allocation was from
16:53  <sean-k-mooney> but this predates placement so it was not a use case they considered
16:54  <cdent> sure, I get that, just thinking out loud
16:54  <cdent> but it supports my concerns about locus of control
16:55  <sean-k-mooney> specifically?
16:56  <cdent> can_split is worrisome when the operating system or other tools on the compute node might move _which_ vcpus are being used by a workload
16:56  <cdent> anyway, i'll watch the spec, I have to depart for the evening
16:56  <sean-k-mooney> ah
16:56  <cdent> i linked to the log near here on the spec, so hopefully other people will hop on
16:56  <cdent> g'night
16:56  * cdent waves
16:57  *** cdent has left #openstack-placement
16:57  <sean-k-mooney> well, i think if a virt driver reports resources in a numa-aware way, it's reasonable to require it to consume them based on the allocations it was provided
17:17  *** irclogbot_0 has quit IRC
17:19  *** irclogbot_0 has joined #openstack-placement
17:29  <sean-k-mooney> efried: i may have -1'd https://review.opendev.org/#/c/658510/4/doc/source/specs/train/approved/2005575-nested-magic.rst but i agree with the general direction you are putting forward and most of the decisions that have been made
17:29  <sean-k-mooney> i still need to finish it, but i also need to clear my head, so i'll come back to it tomorrow
17:29  <efried> sean-k-mooney: Duly noted. I don't see that spec getting merged before the can_split issue is resolved.
17:30  <efried> and so far we seem to have like four different (*very* different) views on the requirements around that.
17:49  *** ttsiouts has joined #openstack-placement
17:59  *** ttsiouts has quit IRC
18:17  *** efried has quit IRC
18:18  *** efried has joined #openstack-placement
19:56  *** licanwei has quit IRC
20:07  *** mriedem has quit IRC
20:51  *** mriedem has joined #openstack-placement
20:58  *** e0ne has joined #openstack-placement
21:12  *** e0ne has quit IRC
22:29  *** mriedem has quit IRC
22:49  *** amodi has quit IRC

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!