*** fanzhang_ has left #openstack-cyborg | 00:58 | |
*** openstackgerrit has joined #openstack-cyborg | 03:14 | |
openstackgerrit | Xinran WANG proposed openstack/cyborg master: Allocation/Deallocation API Specification https://review.openstack.org/597991 | 03:14 |
*** jiapei has joined #openstack-cyborg | 03:42 | |
*** fanzhang_ has joined #openstack-cyborg | 07:11 | |
*** fanzhang_ has left #openstack-cyborg | 07:11 | |
*** fanzhang_ has joined #openstack-cyborg | 07:11 | |
*** helenafm has joined #openstack-cyborg | 07:13 | |
*** jiapei has quit IRC | 08:31 | |
*** helenafm has quit IRC | 08:49 | |
*** helenafm has joined #openstack-cyborg | 10:15 | |
*** helenafm has quit IRC | 11:24 | |
*** efried has quit IRC | 16:10 | |
*** efried has joined #openstack-cyborg | 16:10 | |
*** dims_ is now known as dims | 18:05 | |
*** Sundar has joined #openstack-cyborg | 19:10 | |
Sundar | efried: Please LMK when you are here. | 19:11 |
efried | Sundar: ō/ | 19:11 |
Sundar | efried: On the Nova spec, you correctly noted that I am not asking to include accN in Placement queries | 19:11 |
Sundar | These additional properties are needed to specify bitstreams, functions and things that Nova doesn't care about | 19:12 |
efried | I understand | 19:12 |
Sundar | So, I have proposed that they be bundled with the request groups in the same way as resources or traits | 19:12 |
Sundar | So, if we stick to that idea, do you expect pushback from other Nova developers? What does it take to get this through? | 19:13 |
efried | I could see some people objecting to the fact that it *looks* like they're tied together | 19:14 |
efried | which, of course, they are | 19:14 |
efried | I think you're going to have the same problem gibi is trying to solve via https://review.openstack.org/#/c/597601/ -- there's no way to map a request group to a piece of an allocation. | 19:15 |
efried | But that's orthogonal to the design decision of using accN to denote such information. | 19:15 |
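A hypothetical sketch of the bundling Sundar is proposing, with accN keys riding alongside the numbered resources/trait keys of the same request group (all key names are illustrative, not taken from the spec):

```python
# Hypothetical flavor extra specs: the acc2 keys carry bitstream/function
# info for Cyborg; placement only sees the resources2/trait2 keys.
extra_specs = {
    "resources2:FPGA": "1",
    "trait2:CUSTOM_FPGA_INTEL": "required",
    "acc2:bitstream_id": "0xbeef",  # ignored by placement, consumed by Cyborg
}
```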
Sundar | The alternative is to split the device profile entries in each VAR and store it in Cyborg. The problem then is that the information gets duplicated. That can lead to issues with consistency. | 19:17 |
efried | Which information gets duplicated? | 19:17 |
efried | you mean include stuff like bitstream info in the device profile? | 19:18 |
efried | that seems... reasonable. | 19:18 |
efried | at least for phase 1 | 19:18 |
Sundar | E.g., say the device profile says: {resources:ACCEL_GPU=2, trait:GPU_FOO=required}. Then we would create 2 separate VARs. Each one could contain a copy like: {resources:ACCEL_GPU=1, trait:GPU_FOO=required} | 19:19 |
Sundar | So, there are 2 VARs, each requesting 1 resource | 19:19 |
efried | yes, that's as it should be. One VAR, one accelerator. | 19:20 |
Sundar | That means the resources/traits get duplicated. Also, the semantics that both resources need to come from the same resource provider is lost from Cyborg's perspective | 19:20 |
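A minimal sketch of the split Sundar describes, assuming dict-shaped profiles and VARs; split_into_vars() and the key names are illustrative, not Cyborg's actual API:

```python
# Splitting a multi-accelerator device profile into per-accelerator VARs.
device_profile = {
    "resources:ACCEL_GPU": 2,
    "trait:GPU_FOO": "required",
}

def split_into_vars(profile):
    """Create one VAR per requested accelerator unit."""
    count = profile["resources:ACCEL_GPU"]
    return [
        {"resources:ACCEL_GPU": 1, "trait:GPU_FOO": "required"}
        for _ in range(count)
    ]

vars_ = split_into_vars(device_profile)
# -> two VARs, each asking for 1 GPU; the "both from the same provider"
#    semantic of the original profile is lost in the copies.
```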
efried | So | 19:20 |
efried | I was thinking a device profile would be for *one* device. | 19:20 |
efried | But | 19:20 |
efried | then you can't, as you say, demand that two devices come from the same provider | 19:21 |
efried | or more generally, take advantage of granular syntax. | 19:21 |
efried | unless | 19:21 |
Sundar | Yes. Nova would have that info, and presumably can colocate them. But, if Cyborg needs that info to connect the accelerators together, or whatever, that is ruled out | 19:22 |
efried | we number the extra spec key using the same pattern as we do for the other resources | 19:22 |
efried | which is actually probably a better, more composable design. | 19:22 |
Sundar | Yes! | 19:22 |
Sundar | +1 | 19:22 |
Sundar | That is the proposal in the spec | 19:22 |
efried | it is? | 19:22 |
efried | I got the impression you had set up device profiles so they could contain more than one accelerator | 19:23 |
Sundar | Yes, I propose to take the device profile fields, convert them to extra spec granular syntax and fold them in | 19:23 |
Sundar | Yes, in the above example, we would convert that to: | 19:23 |
Sundar | resources2:ACCEL_GPU=2; trait2:GPU_FOO=required | 19:24 |
Sundar | The only sticking point is who does that numbering. I thought Cyborg or os-acc could do that. But the comments point out that there could be request groups in the flavor unrelated to device profiles or Cyborg | 19:24 |
Sundar | So, Nova should do the numbering. I am fine with that | 19:24 |
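A rough sketch of that folding, assuming Nova picks the next free group number; fold_profile_into_extra_specs() is hypothetical, and the granular-syntax keys follow the pattern from the example above:

```python
import re

def fold_profile_into_extra_specs(extra_specs, profile):
    # Collect group numbers already used by the flavor (e.g. "resources1:").
    used = {
        int(m.group(1))
        for key in extra_specs
        for m in [re.match(r"(?:resources|trait)(\d+):", key)]
        if m
    }
    n = max(used, default=0) + 1
    # Rewrite each unnumbered profile key into the chosen group.
    for key, value in profile.items():
        prefix, _, rest = key.partition(":")  # "resources" / "trait"
        extra_specs["%s%d:%s" % (prefix, n, rest)] = value
    return extra_specs

specs = fold_profile_into_extra_specs(
    {"resources1:VCPU": "4"},
    {"resources:ACCEL_GPU": "2", "trait:GPU_FOO": "required"},
)
# -> adds resources2:ACCEL_GPU=2 and trait2:GPU_FOO=required
```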
efried | Yeah, you should ask gibi how he did that for the network bandwidth PoC. Pretty sure he would have done it from the nova thing that parses the extra specs. | 19:25 |
Sundar | OK. I'll take a look at the spec you cited above, and ask him. | 19:26 |
Sundar | Thanks! | 19:26 |
Sundar | efried: This spec seems orthogonal to our needs. Do you have pointers to other specs from him on this topic? | 19:27 |
efried | Sundar: Here's where I suspect that is getting done: https://github.com/openstack/nova/blob/e658f41d686e4533640b101622f2342348c0316d/nova/scheduler/utils.py#L433 | 19:28 |
efried | Oh, the spec I cited above has nothing to do with renumbering request groups. That was just acking the fact that you're (probably) going to need a way to map a numbered group in the *request* to some piece of the allocation in the *response*. And placement has no way right now to help you with that - you basically have to reverse-engineer it yourself. | 19:29 |
efried | Let's say for example that your user wants two accelerators of the same type (say VGPU), doesn't care if they're from the same provider or not, and also wants to associate some other resource with each - let's say VRAM. | 19:31 |
efried | So the user specifies one dev profile with VGPU=1,VRAM=1024 and another with VGPU=1,VRAM=2048 | 19:31 |
efried | placement, in its infinite wisdom, decides to satisfy both accelerators from the same provider | 19:32 |
efried | so you get back an allocation with: VGPU=2,VRAM=3072 | 19:32 |
efried | And now you have to reverse-engineer that to figure out that you actually wanted one VGPU=1,VRAM=1024 and one VGPU=1,VRAM=2048 | 19:33 |
efried | Simple example, but add in more device profiles at the same time and/or try to generalize that reverse engineering, and it quickly becomes problematic. | 19:33 |
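In data form, the example efried walks through looks roughly like this (provider names invented):

```python
# What was asked for: two numbered groups, which placement may merge.
group1 = {"VGPU": 1, "VRAM": 1024}
group2 = {"VGPU": 1, "VRAM": 2048}

# What comes back when both land on one provider: only summed totals.
allocation = {"rp_x": {"VGPU": 2, "VRAM": 3072}}

# Cyborg must now recover group1 + group2 from the totals to configure
# each accelerator -- trivial here, but with more overlapping profiles
# the decomposition is ambiguous, and placement (which knew the mapping
# while building candidates) no longer exposes it.
```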
efried | That's what gibi proposed his spec for. | 19:34 |
Sundar | Yea, I got that | 19:34 |
efried | Because in placement, *while* we're calculating allocation candidates (but not after), we know which request group corresponds to which bit of each allocation request. | 19:34 |
efried | We just don't carry that information forward to the response, or preserve it internally in any way. | 19:34 |
Sundar | A related issue is that, if VRAM were not a standard resource, you may say accel:VRAM=2048 or something | 19:34 |
efried | um, rather not. If it's not a standard resource, it can be CUSTOM_VRAM, but it should still be tracked in placement. | 19:35 |
Sundar | Now, the accel extra spec has to go with the rest of the request group, for the device to be suitably configured during bind | 19:35 |
efried | If not, you're setting yourself up for resource contention problems that placement was expressly designed to obviate. | 19:36 |
Sundar | OK, that was not a good example. What if I want to assign the GPU to host software like DPDK for cryptography, rather than to a VM? Then the dev prof may say accel:assign-to-host=true | 19:37 |
Sundar | This is not a trait | 19:38 |
Sundar | because it is not meant for scheduling, | 19:38 |
Sundar | only to control how the assignment is done | 19:39 |
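So the profile entry would look something like this (key names from Sundar's example; the overall shape is illustrative):

```python
device_profile = {
    "resources:ACCEL_GPU": "1",
    "accel:assign-to-host": "true",  # bind-time config only; Nova and
                                     # placement never schedule on this key
}
```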
efried | I don't understand that use case. But yeah, traits are another useful way to expose the problem. | 19:39 |
efried | If I ask for resource1=VGPU:1&required1=CUSTOM_FOO&resource2=VGPU:1, and I get back two allocations from different providers, I need to go figure out (based on traits? or what I already know about the providers?) which VGPU I'm supposed to apply to CUSTOM_FOO. | 19:40 |
efried | again, it's not sooper hard, but once you generalize it out, you end up essentially having to duplicate a ton of the placement logic. | 19:41 |
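A hedged sketch of that trait-based reverse mapping (data shapes and provider_for_group() invented for illustration):

```python
# Allocations came back from two providers; provider traits are fetched
# separately from placement.
allocations = {"rp_a": {"VGPU": 1}, "rp_b": {"VGPU": 1}}
provider_traits = {"rp_a": {"CUSTOM_FOO"}, "rp_b": set()}

def provider_for_group(required_traits):
    """Pick a provider whose traits satisfy the group's requirements."""
    for rp in allocations:
        if required_traits <= provider_traits[rp]:
            return rp

# required1=CUSTOM_FOO -> rp_a; the trait-less group gets the leftover.
# Generalizing this (overlapping traits, shared resources) is exactly
# the placement logic one ends up duplicating.
```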
Sundar | Yes | 19:42 |
Sundar | OK, it still comes down to how gibi solved this problem, I suppose. I'll dig around. | 19:43 |
efried | *this* problem I don't think gibi solved. | 19:44 |
efried | He solved the renumbering problem. | 19:44 |
efried | I believe his PoC behaved quite similarly to what you're proposing - the resources/required gizmos live in the profile (the binding profile? or port? in his case) | 19:44 |
efried | and then he has to renumber those groups when he folds the profile in with the other flavor stuff. | 19:45 |
efried | They wrote up a blog post that walked through the PoC, and I think some of that becomes clear in the demo | 19:45 |
efried | https://rubasov.github.io/2018/09/21/openstack-qos-min-bw-demo.html | 19:46 |
efried | yeah, search for resource_request in the above doc and you'll see what I'm talking about. | 19:47 |
efried | each port profile looks like it gets to ask for resources for a single port. (I guess the actual VIF/VNIC resource is implicit? Not sure how that's working.) | 19:48 |
efried | So the resources/required keys are unnumbered in the port profile. | 19:48 |
efried | And then (not sure if this is in the doc, but I know it's true because it was in the original spec and I confirmed by asking the question in the room when they did the demo) those get numbered in a non-conflicting way when they get folded into the rest of the request. | 19:49 |
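The port's resource_request in that PoC is roughly this shape (unnumbered, scoped to one port):

```python
# Unnumbered resources/required on the port, as in the QoS bandwidth PoC:
resource_request = {
    "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
    "required": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"],
}
# Nova folds this into the flavor's request under an unused group number
# (e.g. resources3:NET_BW_EGR_KILOBIT_PER_SEC=1000) -- the same
# non-conflicting renumbering a Cyborg device profile would need.
```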
efried | Sundar: switching screens; ping me if you want to talk more. | 19:55 |