*** songwenping_ has joined #openstack-cyborg | 00:36 | |
*** songwenping__ has quit IRC | 00:39 | |
*** songwenping__ has joined #openstack-cyborg | 00:50 | |
*** songwenping_ has quit IRC | 00:53 | |
*** songwenping_ has joined #openstack-cyborg | 01:27 | |
*** songwenping__ has quit IRC | 01:29 | |
*** tetsuro has joined #openstack-cyborg | 04:22 | |
*** links has joined #openstack-cyborg | 05:28 | |
*** Yumeng has joined #openstack-cyborg | 06:07 | |
*** Sundar has joined #openstack-cyborg | 06:08 | |
Sundar | Hello all | 06:08 |
---|---|---|
Yumeng | hello | 06:08 |
brinzhang | Sundar means add two nova APIs to support bind and unbinding ARQs? | 06:08 |
*** chenke has joined #openstack-cyborg | 06:08 | |
songwenping_ | Hello all | 06:09 |
chenke | Hi I move here. | 06:09 |
brinzhang | hello all | 06:09 |
Sundar | brinzhang -- That is not clear to me. If we can do HTTP PATCH on the server object, we can apply or remove device profiles. | 06:10 |
*** xinranwang has joined #openstack-cyborg | 06:10 | |
Sundar | So I think it is one API only | 06:10 |
brinzhang | Yeah, one or two, either is OK for me | 06:10 |
Sundar | Second problem is with libvirt/qemu -- I don't think we can add PCI interfaces to a running VM. We have to stop it first, regenerate the domain XML and then restart the server | 06:11 |
*** s_shogo has joined #openstack-cyborg | 06:12 | |
Sundar | What is the openstack CLI command you are thinking of? | 06:12 |
Sundar | One option is to avoid all this, and just do a resize with a different flavor, which has a different device profile | 06:13 |
Sundar | Then no need to change Nova API at all | 06:13 |
songwenping_ | Sundar, I have tested adding and removing PCI interfaces on a running VM, like this: https://review.opendev.org/#/c/729945/5/nova/virt/libvirt/driver.py | 06:13 |
Sundar | songwenping_ : How does this work? How will libvirt change the domain XML on the fly? | 06:14 |
brinzhang | Sundar, I mean I want to add two Nova APIs to support binding and unbinding ARQs from Nova, so that we can use "nova bind-accelerators" and/or "unbind-accelerators" | 06:14 |
*** shaohe_feng has joined #openstack-cyborg | 06:15 | |
*** haibin-huang has joined #openstack-cyborg | 06:15 | |
brinzhang | this needs a spec for the Nova team | 06:15 |
Sundar | brinzhang: The Nova CLI is old. In the new openstack CLI, we say: $ openstack server create --flavor myflavor ..., where myflavor has the device profile. What is the openstack CLI you plan to use for hot attach/detach? | 06:16 |
songwenping_ | `guest.detach_device(hostdevPCI, persistent=True, live=True)` this function with the "live=True" param works for a running VM. | 06:17 |
songwenping_ | we can dumpxml the VM and see the hostdev PCI device attached. | 06:17 |
brinzhang | Sundar, they are the same; we should complete the CLI in novaclient first and then add support in openstackclient, otherwise we can't complete this feature in Nova | 06:17 |
brinzhang | Sundar: binding/unbinding will be an independent operation in nova | 06:18 |
brinzhang | For now we don't plan to support the resize operation; that would be out of my control | 06:19 |
brinzhang | I think | 06:19 |
Sundar | brinzhang: will you pass a device profile to this CLI? if so, will one VM have 2 or more associated device profiles? | 06:19 |
brinzhang | Sundar: maybe you are right, but I am not sure I can answer your question; I don't know whether it needs to pass device_profile to the CLI | 06:21 |
Sundar | songwenping_: Great. I found this: https://libvirt.org/pci-hotplug.html Thanks for the pointer. | 06:22 |
brinzhang | If we want to bind an ARQ to an instance whose flavor has no accel:device_profile, maybe we should pass it. | 06:23 |
brinzhang | For unbind, I think we should check whether the instance has the accel:device_profile, and then accept or reject the unbind. | 06:24 |
songwenping_ | Yeah, right Sundar. Libvirt has supported hotplug for PCI devices. | 06:25 |
Sundar | IMHO, it is better to support resize: it covers hot-add and is more generally useful also. It allows us to keep the model that one VM has one device profile. | 06:25 |
Sundar | It is a standard Nova operation: so less resistance, and maybe we can even get help from Nova developers | 06:26 |
shaohe_feng | Libvirt has supported hotplug for PCI devices for a long time-_- | 06:27 |
Sundar | brinzhang: on a resize, does Nova scheduler get involved? If so, can it pick a different host? | 06:27 |
brinzhang | Sundar, resize may need to change host, which is not easy to control. I know it should be supported, but it needs time | 06:27 |
xinranwang | I think it will do re-schedule | 06:27 |
Sundar | Libvirt does not support it for the default config: looks like you have to specify a different machine type, q35. | 06:27 |
Sundar | brinzhang: we already have the scheduler flow for device profiles and ARQs. Why do you think it is difficult? | 06:29 |
Yumeng | brinzhang: I think unshelve will also do re-schedule, right? | 06:30 |
brinzhang | unshelve just needs to re-create ARQs, it's easy to get | 06:31 |
brinzhang | I have not looked into resize, so I cannot answer clearly. | 06:32 |
Sundar | We can bring this up in the Nova discussion. | 06:32 |
Sundar | Anyway, we need to talk to them about other VM ops for accelerators. | 06:33 |
shaohe_feng | Sundar: a question about scheduling. You brought up a discussion about this: | 06:33 |
brinzhang | This release we have rebuild/evacuate, suspend/resume, shelve/unshelve, I think it's enough to do in V | 06:33 |
shaohe_feng | add a new filter for dynamic programming | 06:34 |
shaohe_feng | let cyborg do the filter, right? | 06:34 |
Sundar | shaohe_feng: not sure what you are referring to | 06:34 |
brinzhang | In https://review.opendev.org/#/c/729945/5/api-guide/source/accelerator-support.rst we record the operations supported in cyborg | 06:34 |
*** minmin has joined #openstack-cyborg | 06:35 | |
Yumeng | brinzhang: I think we can also start to support resize; chenke can look into it. Anyway, we can start discussing it with Nova during the session. | 06:36 |
brinzhang | agree | 06:36 |
brinzhang | after chenke talked with nova team, I think we can get more info about resize. | 06:37 |
shaohe_feng | For example, currently the function of the FPGA is ovs with 4 VFs, but no VM uses it. Now, with no more FPGAs available, we can select this ovs FPGA for IPsec with 2 NICs, and change its function. | 06:38 |
shaohe_feng | ^ Sundar so this needs a scheduler improvement, right? | 06:38 |
*** zhurong has joined #openstack-cyborg | 06:39 | |
shaohe_feng | and you brought up adding a new filter. | 06:39 |
shaohe_feng | I remember. | 06:39 |
Sundar | shaohe_feng: That doesn't involve the Nova scheduler. We can define a device profile with a trait that selects that type of FPGA, and an accel:bitstream that refers to ipsec. Then Placement may return some FPGA with ovs image if it is free. But Cyborg will notice it does not have ipsec and do the programming. | 06:40 |
Sundar | What we don't have is this: | 06:41 |
shaohe_feng | yes, that's the current flow in cyborg at present | 06:42 |
Sundar | Say, we have two free FPGAs, one with ovs and one with ipsec. In this case, we want the scheduler to pick the one with ipsec preferentially. That is not there today. | 06:42 |
Sundar | The scheduler has a weigher but it selects among hosts, not among allocation candidates from Placement | 06:43 |
shaohe_feng | you cannot put ovs or ipsec in the trait | 06:43 |
Yumeng | brinzhang: I was curious, given "unshelve just needs to re-create ARQs, it's easy to get", does that mean the shelve operation does not actually release the accelerator in the hypervisor, and the accelerator is still a claimed allocation? Is that why there is no re-schedule in unshelve? | 06:43 |
shaohe_feng | but the weigher also does not support FPGA, right? | 06:44 |
Yumeng | please correct me if I was wrong | 06:44 |
*** Mudit has joined #openstack-cyborg | 06:44 | |
Sundar | That is right. At least, if you do, it makes it too specific -- if you ask for ipsec and no FPGA with ipsec is free, the request will fail. No reprogramming. | 06:44 |
Sundar | What we really want is preferred traits: | 06:44 |
Sundar | We should be able to say I prefer the resource provider have this trait but, if not, give me something | 06:44 |
Sundar | Alex Xu knows about this | 06:44 |
shaohe_feng | everyone also knows about it | 06:45 |
shaohe_feng | -_- | 06:45 |
shaohe_feng | but we want to improve it | 06:45 |
Sundar | shaohe_feng: There are many weighers, and we can write custom weighers too. None of them cover FPGA, because they deal with hosts, not device RPs. | 06:45 |
shaohe_feng | and everyone knows placement is so weak :') | 06:45 |
shaohe_feng | we can consider how to improve it. | 06:46 |
Sundar | What I meant is, Alex Xu knows Cyborg wants preferred traits. I think he was looking into implementing it at some point, but he got busy with other things | 06:46 |
shaohe_feng | yes, we can consider writing a custom weigher for FPGA | 06:47 |
Sundar | Nope, that won't help, as I said | 06:47 |
shaohe_feng | he did not give us a good suggestion | 06:47 |
shaohe_feng | what he knows, others also know | 06:47 |
shaohe_feng | we want a good solution | 06:48 |
shaohe_feng | not the common knowledge | 06:48 |
brinzhang | Yumeng: when shelving an instance we delete the instance's ARQs from Nova and Cyborg; when unshelving we have to create them again, so unshelve needs to re-schedule | 06:48 |
shaohe_feng | yes, we can improve placement, every one can improve it. but we can let upstream accept it. | 06:49 |
brinzhang | Yumeng: the poc code is here, you can review https://review.opendev.org/#/c/729563/2 | 06:49 |
shaohe_feng | for example, SDL can be used for placement, but it is a big change for placement, right? A vendor can improve it in this way, but it is difficult upstream, right? | 06:50 |
Sundar | If we can get preferred traits, that would help. Shall we add that to Nova discussion? | 06:51 |
brinzhang | we will introduce the service version to support these operations, controlled in https://review.opendev.org/#/c/715326/13/nova/compute/api.py@288 | 06:51 |
shaohe_feng | preferred is just a case of SDL. SDL can cover it. | 06:52 |
shaohe_feng | and preferred can not resolve the complex scenarios :') | 06:54 |
Sundar | Yumeng: ^ | 06:54 |
Sundar | brinzhang: ^ | 06:54 |
Yumeng | Sundar,yes sure. I will add operations to Nova etherpad | 06:55 |
shaohe_feng | Yumeng no, not this release. | 06:55 |
Sundar | No I mean discuss preferred traits | 06:55 |
shaohe_feng | the next release. | 06:55 |
shaohe_feng | I just want to know some history from Sundar | 06:56 |
Yumeng | I don't think we have enough time to discuss preferred traits. | 06:56 |
shaohe_feng | but not preferred history | 06:56 |
Yumeng | yes next release. | 06:56 |
brinzhang | I found the Nova PTG time is too late for China; I will attend it as soon as I can. | 06:56 |
shaohe_feng | about the filter and weigher discussion | 06:56 |
Yumeng | this release, we discuss nova operations and smartnic integration | 06:56 |
brinzhang | Yumeng: you can ping me when it starts | 06:56 |
Yumeng | yes, nova session is 10:00-11:00pm beijing time Friday. | 06:57 |
Yumeng | brinzhang:ok. | 06:57 |
brinzhang | Yumeng: ack | 06:58 |
Sundar | Ok, anything else for me? :) | 06:58 |
Yumeng | Nope for today. Thank you Sundar ! | 06:59 |
Sundar | Thank you, Yumeng and all! Have a good day. | 06:59 |
brinzhang | Sundar, hope you have a good day, bye | 06:59 |
Yumeng | Have a good night. BTW, will you come tomorrow? | 06:59 |
Sundar | Yumeng, am I required for some topic tomorrow? | 06:59 |
Sundar | New drivers and driver programs -- sounds interesting | 07:00 |
Yumeng | Driver program API support? API attribute? | 07:00 |
Sundar | I'll try to join for 1 hour, but I have a long day tomorrow | 07:00 |
Sundar | Bye for now :) | 07:01 |
Yumeng | ok. Thanks! Then I will move New drivers and driver programs first, others later. | 07:01 |
Yumeng | Sundar: bye. | 07:01 |
*** Sundar has quit IRC | 07:01 | |
brinzhang | Is everyone in Zoom Chinese? | 07:02 |
Yumeng | hi. let's take a break for 10 mins. | 07:02 |
Yumeng | brinzhang: nope. We have shogo and one guest. | 07:03 |
Yumeng | after ten mins shall we go back to zoom? | 07:03 |
Mudit | Hello! from India this is Mudit | 07:03 |
chenke | is your phone OK with Zoom, brin? | 07:03 |
chenke | hello dudit | 07:04 |
chenke | hello mudit | 07:04 |
brinzhang | Yeah, I asked but no one responded, I think my network is not very good | 07:04 |
Yumeng | hello Mudit! very welcome! | 07:04 |
brinzhang | welcome Mudit^ | 07:04 |
chenke | you can ask again; if I am here, I will reply to you. | 07:04 |
s_shogo | Sure, I can't understand Mandarin,,,, thanks. | 07:05 |
haibin-huang | Do we have a Cyborg API for programming FPGAs? | 07:05 |
Yumeng | hi Mudit: are you interested in using accelerator devices? | 07:05 |
Mudit | yes looking at smart nic specifically | 07:06 |
Yumeng | what kind of smart nic? SRIOV or others? | 07:06 |
Mudit | SRIOV basic + FPGA based or ARM cores too | 07:07 |
s_shogo | haibin-huang cyborg has no dynamic programming API currently; it is under development in this patch: https://review.opendev.org/#/c/698190/ | 07:07 |
haibin-huang | ok, I will see it, thank you | 07:08 |
haibin-huang | when will this patch be merged to the master branch? | 07:08 |
brinzhang | s_shogo: you should update this patch and address the comments; let's merge this in the V release | 07:09 |
Mudit | does cyborg address LCM of FPGAs as well? | 07:09 |
s_shogo | haibin-huang I'll restart this work after the PTG, and I would like to complete it in June, I hope :) | 07:09 |
s_shogo | brinzhang yes, that's right. | 07:09 |
chenke | cool | 07:10 |
brinzhang | s_shogo: I think you can, come on ^^ | 07:10 |
haibin-huang | cool, s_shogo | 07:10 |
Yumeng | Mudit: sounds cool. cyborg is trying to implement SRIOV in this release. LCM of FPGAs is not supported yet. | 07:11 |
Mudit | thanks yumeng | 07:12 |
Yumeng | currently supported Intel FPGAs https://github.com/openstack/cyborg/blob/master/cyborg/accelerator/drivers/fpga/intel/driver.py | 07:12 |
*** shaohe_feng has quit IRC | 07:12 | |
Yumeng | Mudit: if possible, can you share your use cases? just add them to the etherpad | 07:13 |
Mudit | yes sure I will | 07:13 |
Yumeng | we want to hear feedback from everywhere, vendors or operators.^^ | 07:13 |
Yumeng | Thanks Mudit! | 07:14 |
Mudit | does Cyborg treat a smart NIC with FPGA as a different class than, say, a standalone FPGA? | 07:14 |
*** minmin has quit IRC | 07:21 | |
*** links has quit IRC | 07:49 | |
*** links has joined #openstack-cyborg | 07:53 | |
s_shogo | Yumeng: related to tomorrow's topic, ( [Jun 3, 7:00 - 7:15 UTC] Implementation for Device Enable/Disable API ) | 08:01 |
s_shogo | important topics :) | 08:01 |
brinzhang | xinranwang, Yumeng: Placement seems to be able to filter data by resource name https://docs.openstack.org/api-ref/placement/?expanded=list-resource-providers-detail,list-resource-classes-detail,list-resource-provider-inventories-detail#list-resource-classes | 08:08 |
brinzhang | xinranwang: if we want to get all Cyborg resources, I think it's difficult, but we could get one resource class at a time and then do the diff in Cyborg. Just an idea | 08:09 |
*** Mudit has quit IRC | 08:16 | |
*** tetsuro has quit IRC | 08:42 | |
*** tetsuro has joined #openstack-cyborg | 08:42 | |
*** tetsuro has quit IRC | 09:29 | |
*** s_shogo has quit IRC | 09:49 | |
*** links has quit IRC | 10:16 | |
*** haibin-huang has quit IRC | 10:20 | |
*** links has joined #openstack-cyborg | 10:21 | |
*** brinzhang has quit IRC | 11:29 | |
*** tetsuro has joined #openstack-cyborg | 11:56 | |
*** dansmith has quit IRC | 12:00 | |
*** dansmith has joined #openstack-cyborg | 12:01 | |
openstackgerrit | zhangboye proposed openstack/cyborg-specs master: Add py38 package metadata https://review.opendev.org/732549 | 12:04 |
openstackgerrit | zhangboye proposed openstack/python-cyborgclient master: Fix hacking min version to 3.0.1 https://review.opendev.org/732554 | 12:08 |
*** xinranwang has quit IRC | 12:29 | |
*** Yumeng has quit IRC | 13:54 | |
*** tetsuro has quit IRC | 14:03 | |
*** chenke has quit IRC | 14:09 | |
*** links has quit IRC | 16:39 | |
openstackgerrit | Hervé Beraud proposed openstack/cyborg master: Stop to use the __future__ module. https://review.opendev.org/732827 | 18:09 |
openstackgerrit | Hervé Beraud proposed openstack/python-cyborgclient master: Stop to use the __future__ module. https://review.opendev.org/732918 | 18:47 |
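
The "preferred traits" idea Sundar describes at 06:42-06:44 (prefer the free FPGA that already has ipsec, but still accept the one with ovs and reprogram it) can be sketched in a few lines. This is only an illustration of the ranking logic, not Nova or Placement code: the `Candidate` class, the `rank_candidates` function, and the trait names are all hypothetical, and as noted in the discussion no such weighing over allocation candidates existed at the time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one device resource provider from Placement."""
    name: str
    traits: frozenset


def rank_candidates(candidates, required, preferred):
    """Drop candidates missing a required trait, then put the ones that
    match the most preferred traits first."""
    required = frozenset(required)
    preferred = frozenset(preferred)
    eligible = [c for c in candidates if required <= c.traits]
    # sorted() is stable even with reverse=True, so ties keep their input order.
    return sorted(eligible, key=lambda c: len(preferred & c.traits), reverse=True)


# Two free FPGAs, as in the example from the discussion (trait names are made up).
free_fpgas = [
    Candidate("fpga-ovs", frozenset({"CUSTOM_FPGA_INTEL", "CUSTOM_FUNCTION_OVS"})),
    Candidate("fpga-ipsec", frozenset({"CUSTOM_FPGA_INTEL", "CUSTOM_FUNCTION_IPSEC"})),
]
ranked = rank_candidates(
    free_fpgas,
    required={"CUSTOM_FPGA_INTEL"},
    preferred={"CUSTOM_FUNCTION_IPSEC"},
)
print([c.name for c in ranked])  # ['fpga-ipsec', 'fpga-ovs']
```

If only the ovs FPGA were free, it would still be returned, which is the difference from making ipsec a required trait: the request does not fail, and Cyborg can reprogram the device.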