Thursday, 2018-09-27

*** fanzhang_ has left #openstack-cyborg00:58
*** openstackgerrit has joined #openstack-cyborg03:14
openstackgerritXinran WANG proposed openstack/cyborg master: Allocation/Deallocation API Specification  https://review.openstack.org/597991 03:14
*** jiapei has joined #openstack-cyborg03:42
*** fanzhang_ has joined #openstack-cyborg07:11
*** fanzhang_ has left #openstack-cyborg07:11
*** fanzhang_ has joined #openstack-cyborg07:11
*** helenafm has joined #openstack-cyborg07:13
*** jiapei has quit IRC08:31
*** helenafm has quit IRC08:49
*** helenafm has joined #openstack-cyborg10:15
*** helenafm has quit IRC11:24
*** efried has quit IRC16:10
*** efried has joined #openstack-cyborg16:10
*** dims_ is now known as dims18:05
*** Sundar has joined #openstack-cyborg19:10
Sundarefried: Please LMK when you are here.19:11
efriedSundar: ō/19:11
Sundarefried: On the Nova spec, you correctly noted that I am not asking to include acceN in Placement queries19:11
SundarThese additional properties are needed to specify bitstreams, functions and things that Nova doesn't care about19:12
efriedI understand19:12
SundarSo, I have proposed that they be bundled with the request groups in the same way as resources or traits19:12
SundarSo, if we stick to that idea, do you expect pushback from other Nova developers? What does it take to get this through?19:13
efriedI could see some people objecting to the fact that it *looks* like they're tied together19:14
efriedwhich, of course, they are19:14
efriedI think you're going to have the same problem gibi is trying to solve via https://review.openstack.org/#/c/597601/ -- there's no way to map a request group to a piece of an allocation.19:15
efriedBut that's orthogonal to the design decision of using accN to denote such information.19:15
SundarThe alternative is to split the device profile entries in each VAR and store it in Cyborg. The problem then is that the information gets duplicated. That can lead to issues with consistency.19:17
efriedWhich information gets duplicated?19:17
efriedyou mean include stuff like bitstream info in the device profile?19:18
efriedthat seems... reasonable.19:18
efriedat least for phase 119:18
SundarFor e.g., say the device profile says: { resources:ACCEL_GPU=2; trait:GPU_FOO=required}. Then we would create 2 separate VARs. Each one could contain a copy like: {resources: ACCEL_GPU=1, trait:GPU_FOO=required}19:19
SundarSo, there are 2 VARs, each requesting 1 resource19:19
efriedyes, that's as it should be. One VAR, one accelerator.19:20
SundarThat means, the resources/traits go duplicated. Also, the semantics that both resources need to come from the same resource provider is lost from Cyborg's perspective19:20
Sundar*got19:20
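
The split Sundar describes can be sketched as follows. This is only an illustration: the dict layout and the `split_into_vars` helper are hypothetical, not Cyborg code. It shows that each VAR ends up carrying its own copy of the traits, and that nothing in the individual VARs records that they came from one request.

```python
from copy import deepcopy


def split_into_vars(profile):
    """Hypothetical helper: emit one VAR per requested device.

    `profile` is an assumed layout:
    {"resources": {resource_class: count}, "traits": {trait: "required"}}
    """
    result = []
    for resource_class, count in profile["resources"].items():
        for _ in range(count):
            result.append({
                # Each VAR asks for exactly one device.
                "resources": {resource_class: 1},
                # The traits are copied into every VAR (the duplication).
                "traits": deepcopy(profile["traits"]),
            })
    return result


profile = {"resources": {"ACCEL_GPU": 2}, "traits": {"GPU_FOO": "required"}}
device_vars = split_into_vars(profile)
```

With the profile from the discussion, `device_vars` holds two identical entries, each requesting `ACCEL_GPU=1` with `GPU_FOO` required.
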
efriedSo19:20
efriedI was thinking a device profile would be for *one* device.19:20
efriedBut19:20
efriedthen you can't, as you say, demand that two devices come from the same provider19:21
efriedor more generally, take advantage of granular syntax.19:21
efriedunless19:21
SundarYes. Nova would have that info, and presumably can colocate them. But, if Cyborg needs that info to connect the accelerators together, or whatever, that is ruled out19:22
efriedwe number the extra spec key using the same pattern as we do for the other resources19:22
efriedwhich is actually probably a better, more composable design.19:22
SundarYes!19:22
Sundar+119:22
SundarThat is the proposal in the spec19:22
efriedit is?19:22
efriedI got the impression you had set up device profiles so they could contain more than one accelerator19:23
SundarYes, I propose to take the device profile fields, convert them to extra spec granular syntax and fold them in19:23
SundarYes, in the above example, we would convert that to:19:23
Sundarresources2:ACCEL_GPU=2; trait2:GPU_FOO=required19:24
SundarThe only sticking point is who does that numbering. I thought Cyborg or os-acc could do that. But the comments point out that there could be request groups in the flavor unrelated to device profiles or Cyborg19:24
SundarSo, Nova should do the numbering. I am fine with that19:24
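
A minimal sketch of the numbering step, assuming the device profile arrives with unnumbered `resources:` / `trait:` keys and the flavor may already contain numbered groups. The `fold_profile` helper and its "next free suffix" rule are assumptions made for illustration; this is not how Nova is known to implement it.

```python
import re

# Matches granular extra spec keys such as "resources1:VGPU" or "trait:GPU_FOO".
_GROUP_KEY = re.compile(r"^(resources|trait)(\d*):(.+)$")


def used_suffixes(extra_specs):
    """Collect the numeric suffixes already used by numbered groups."""
    used = set()
    for key in extra_specs:
        match = _GROUP_KEY.match(key)
        if match and match.group(2):
            used.add(int(match.group(2)))
    return used


def fold_profile(extra_specs, profile_group):
    """Hypothetical: fold unnumbered profile keys in under a free suffix."""
    suffix = max(used_suffixes(extra_specs), default=0) + 1
    merged = dict(extra_specs)
    for key, value in profile_group.items():
        match = _GROUP_KEY.match(key)
        if match is None:
            raise ValueError(f"not a resources/trait key: {key}")
        prefix, _, name = match.groups()
        merged[f"{prefix}{suffix}:{name}"] = value
    return merged, suffix


flavor = {"resources1:VGPU": "1", "hw:cpu_policy": "dedicated"}
profile_group = {"resources:ACCEL_GPU": "2", "trait:GPU_FOO": "required"}
merged, suffix = fold_profile(flavor, profile_group)
```

Because the flavor already uses group 1, the profile lands in group 2, which gives the `resources2:ACCEL_GPU=2; trait2:GPU_FOO=required` form quoted above without touching the unrelated flavor keys.
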
efriedYeah, you should ask gibi how he did that for the network bandwidth PoC. Pretty sure he would have done it from the nova thing that parses the extra specs.19:25
SundarOK. I'll take a look at the spec you cited above, and ask him.19:26
SundarThanks!19:26
Sundarefried: This spec seems orthogonal to our needs. Do you have pointers to other specs from him on this topic?19:27
efriedSundar: Here's where I suspect that is getting done: https://github.com/openstack/nova/blob/e658f41d686e4533640b101622f2342348c0316d/nova/scheduler/utils.py#L433 19:28
efriedOh, the spec I cited above has nothing to do with renumbering request groups. That was just acking the fact that you're (probably) going to need a way to map a numbered group in the *request* to some piece of the allocation in the *response*. And placement has no way right now to help you with that - you basically have to reverse-engineer it yourself.19:29
efriedLet's say for example that your user wants two accelerators of the same type (say VGPU), doesn't care if they're from the same provider or not, and also wants to associate some other resource with each - let's say VRAM.19:31
efriedSo the user specifies one dev profile with VGPU=1,VRAM=1024 and another with VGPU=1,VRAM=204819:31
efriedplacement, in its infinite wisdom, decides to satisfy both accelerators from the same provider19:32
efriedso you get back an allocation with: VGPU=2,VRAM=307219:32
efriedAnd now you have to reverse-engineer that to figure out that you actually wanted one VGPU=1,VRAM=1024 and one VGPU=1,VRAM=204819:33
efriedSimple example, but add in more device profiles at the same time and/or try to generalize that reverse engineering, and it quickly becomes problematic.19:33
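
The arithmetic in efried's example can be sketched like this. The `merge_same_provider` helper is hypothetical and only models per-provider totals as plain sums; it is not placement code. The point it illustrates is that two different sets of request groups can collapse into the same allocation, so the totals alone cannot be mapped back to the groups.

```python
from collections import Counter


def merge_same_provider(groups):
    """Hypothetical model: sum request groups satisfied by one provider."""
    total = Counter()
    for group in groups:
        total.update(group)
    return dict(total)


# The two device profiles from the example.
requested = [{"VGPU": 1, "VRAM": 1024}, {"VGPU": 1, "VRAM": 2048}]
allocation = merge_same_provider(requested)

# A different pair of requests that yields the same totals.
other_request = [{"VGPU": 1, "VRAM": 1536}, {"VGPU": 1, "VRAM": 1536}]
other_allocation = merge_same_provider(other_request)
```

Both `allocation` and `other_allocation` come out as `VGPU=2, VRAM=3072`, so the caller has to keep its own record of what each group asked for.
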
efriedThat's what gibi proposed his spec for.19:34
SundarYea, I got that19:34
efriedBecause in placement, *while* we're calculating allocation candidates (but not after), we know which request group corresponds to which bit of each allocation request.19:34
efriedWe just don't carry that information forward to the response, or preserve it internally in any way.19:34
SundarA related issue is that, if VRAM were not a standard resource, you may say accel:VRAM=2048 or something19:34
efriedum, rather not. If it's not a standard resource, it can be CUSTOM_VRAM, but it should still be tracked in placement.19:35
SundarNow, the accel extra spec has to go with the rest of the request group, for the device to be suitably configured during bind19:35
efriedIf not, you're setting yourself up for resource contention problems that placement was expressly designed to obviate.19:36
SundarOK, that was not a good example. What if I want to assign the GPU to host software like DPDK for cryptography, rather than to a VM? Then the dev prof may say accel:assign-to-host=true19:37
SundarThis is not a trait19:38
Sundarbecause it is not meant for scheduling,19:38
Sundaronly to control how the assignment is done19:39
efriedI don't understand that use case. But yeah, traits are another useful way to expose the problem.19:39
efriedIf I ask for resources1=VGPU:1&required1=CUSTOM_FOO&resources2=VGPU:1, and I get back two allocations from different providers, I need to go figure out (based on traits? or what I already know about the providers?) which VGPU I'm supposed to apply to CUSTOM_FOO.19:40
efriedagain, it's not sooper hard, but once you generalize it out, you end up essentially having to duplicate a ton of the placement logic.19:41
SundarYes19:42
SundarOK, it still comes down to how gibi solved this problem, I suppose. I'll dig around.19:43
efried*this* problem I don't think gibi solved.19:44
efriedHe solved the renumbering problem.19:44
efriedI believe his PoC behaved quite similarly to what you're proposing - the resources/required gizmos live in the profile (the binding profile? or port? in his case)19:44
efriedand then he has to renumber those groups when he folds the profile in with the other flavor stuff.19:45
efriedThey wrote up a blog post that walked through the PoC, and I think some of that becomes clear in the demo19:45
efriedhttps://rubasov.github.io/2018/09/21/openstack-qos-min-bw-demo.html 19:46
efriedyeah, search for resource_request in the above doc and you'll see what I'm talking about.19:47
efriedeach port profile looks like it gets to ask for resources for a single port. (I guess the actual VIF/VNIC resource is implicit? Not sure how that's working.)19:48
efriedSo the resources/required keys are unnumbered in the port profile.19:48
efriedAnd then (not sure if this is in the doc, but I know it's true because it was in the original spec and I confirmed by asking the question in the room when they did the demo) those get numbered in a non-conflicting way when they get folded into the rest of the request.19:49
efriedSundar: switching screens; ping me if you want to talk more.19:55
