*** Yumeng has quit IRC | 01:54 | |
openstackgerrit | Merged openstack/python-cyborgclient master: Fix a config error ralated to entry point https://review.opendev.org/734006 | 02:23 |
---|---|---|
*** xinranwang has joined #openstack-cyborg | 02:28 | |
*** swp20 has joined #openstack-cyborg | 02:28 | |
*** Sundar has joined #openstack-cyborg | 02:32 | |
*** songwenping_ has joined #openstack-cyborg | 02:49 | |
*** swp20 has quit IRC | 02:52 | |
*** Yumeng has joined #openstack-cyborg | 02:55 | |
*** s_shogo has joined #openstack-cyborg | 02:58 | |
*** chenke has joined #openstack-cyborg | 02:59 | |
chenke | HI | 03:00 |
Yumeng | hi all | 03:00 |
chenke | Hi yumeng. | 03:00 |
*** songwenping__ has joined #openstack-cyborg | 03:00 | |
s_shogo | Hi all | 03:00 |
*** wangzhh has joined #openstack-cyborg | 03:01 | |
Yumeng | welcome back wangzhh ^^ | 03:01 |
wangzhh | Haha long time no see. | 03:01 |
xinranwang | Hi all | 03:02 |
Yumeng | quite a long time. | 03:02 |
wangzhh | Hi xinran~~ | 03:03 |
xinranwang | wow, long time no see ya | 03:03 |
wangzhh | Yep, I can contribute recently | 03:03 |
*** swp20 has joined #openstack-cyborg | 03:03 | |
Yumeng | wow, cool! | 03:03 |
*** songwenping_ has quit IRC | 03:03 | |
Yumeng | Let's start the meeting! | 03:04 |
Yumeng | #startmeeting openstack-cyborg | 03:04 |
openstack | Meeting started Thu Jun 11 03:04:07 2020 UTC and is due to finish in 60 minutes. The chair is Yumeng. Information about MeetBot at http://wiki.debian.org/MeetBot. | 03:04 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 03:04 |
*** openstack changes topic to " (Meeting topic: openstack-cyborg)" | 03:04 | |
openstack | The meeting name has been set to 'openstack_cyborg' | 03:04 |
Yumeng | #topic Roll call | 03:04 |
*** openstack changes topic to "Roll call (Meeting topic: openstack-cyborg)" | 03:04 | |
Yumeng | #info Yumeng | 03:04 |
Sundar | Hi all | 03:04 |
Sundar | #info Sundar | 03:04 |
swp20 | Hi all | 03:04 |
wangzhh | #info wangzhh | 03:04 |
s_shogo | #info s_shogo | 03:04 |
xinranwang | #info xinranwang | 03:04 |
swp20 | #info swp20 | 03:04 |
chenke | #info chenke | 03:04 |
Yumeng | #topic Agenda | 03:05 |
*** openstack changes topic to "Agenda (Meeting topic: openstack-cyborg)" | 03:05 | |
*** songwenping__ has quit IRC | 03:05 | |
Yumeng | we will start from SmartNic topic. | 03:05 |
Yumeng | #topic SmartNic | 03:05 |
*** openstack changes topic to "SmartNic (Meeting topic: openstack-cyborg)" | 03:05 | |
Sundar | Thanks, Yumeng | 03:06 |
Sundar | I'd like to followup on our PTG discussion | 03:06 |
Sundar | Just to set the background: there are broadly two kinds of 'smart NICs'. A smart NIC may have a single ‘device’ that combines the accelerator and the NIC, or two (or more) components in a single PCI card, with separate accelerator and NIC components. | 03:06 |
Sundar | We should model the smart NIC as a single RP representing the combined accelerator/NIC for the first case. Agreed, right? | 03:07 |
Sundar | Yumeng, xinranwang, chenke, all : ^ | 03:08 |
Sundar | Anybody around?! | 03:09 |
chenke | HI sundar. | 03:09 |
wangzhh | I haven't followed for a long time, so I'll just listen at first.... | 03:09 |
Yumeng | yes, agree. I may have missed some in the PTG. But agree. | 03:10 |
chenke | Do you mean the first case has only one region, right? | 03:10 |
Sundar | The concept of region is specific to FPGAs. A smart NIC may not have FPGAs. Maybe it is a NIC ASIC + many ARM cores. | 03:11 |
s_shogo | As a first step, that is good. agree. | 03:11 |
xinranwang | It may have fpga as well. I think we need to consider that. I think subprovider can support this. | 03:11 |
Sundar | Great. For the second case, we could have a hierarchy with separate RPs for the accelerator and the NICs, and a top-level resource-less RP which aggregates all the children RPs and combines their traits | 03:11 |
Sundar | xinranwang: Of course. My point was, the concept of region can be used if it is a FPGA, not otherwise. | 03:12 |
chenke | Sundar ok. got it. | 03:12 |
Sundar | For example: N3000 may be modeled as two RPs - one for the FPGA and one for the 2 Fortville NICs. Perhaps each Fortville NIC can be a separate RP. That is up to us. | 03:13 |
xinranwang | I agree. | 03:13 |
Sundar | Correspondingly, Cyborg will create a separate deployable for each component, and a top-level deployable that contains all component deployables. | 03:14 |
Sundar | The resources and traits of the children RPs are exposed in the top-level RP; similarly, the accelerators and attributes of children deployables are in the top-level deployable. | 03:15 |
xinranwang | It corresponds to the device topology. Cyborg can report like this. If we need to interact with other projects like neutron, we should consider more, because neutron reports to placement if the bandwidth feature is enabled. | 03:16 |
*** tetsuro has quit IRC | 03:16 | |
Sundar | xinranwang: I am hoping that Cyborg can create RPs for all components, not leave it to neutron even for bandwidth provider. Otherwise, it gets a bit complicated. | 03:17 |
Sundar | Neutron can, of course, use the RP that Cyborg created | 03:17 |
xinranwang | Neutron is in charge of the physnet; not sure cyborg should take it over. | 03:17 |
xinranwang | it a network concept | 03:18 |
xinranwang | s/it/it's | 03:18 |
Sundar | Yes. The physnet is best left to the admin or the OpenStack installer | 03:18 |
Sundar | Cyborg shouldn't get in the way. | 03:18 |
Sundar | However, we could model the physnet as a trait. Cyborg creates the NIC's RP but doesn't want to manage the physnet trait on that RP. | 03:19 |
xinranwang | what is neutron's bandwidth feature enabled, how will cyborg know that neutron report to placement as well. | 03:19 |
xinranwang | s/what is/what if | 03:19 |
Sundar | So, we can leave it to the admin or installer however they do it. That is how PCI whitelist handles physnet anyway. | 03:19 |
xinranwang | I am not sure this will be accepted by the community. I am still thinking about it. | 03:20 |
Sundar | xinranwang: Good point, I have thought about it. Hoping we can reach agreement with Neutron that they need not create RPs in this case. | 03:21 |
Sundar | What won't be accepted: physnet as trait, or Cyborg creating the RPs for NICs? | 03:21 |
xinranwang | Anyway, I will propose a spec in nova community and everyone can discuss there. But I am still thinking about which solution I should propose first, with the others as alternatives. | 03:22 |
Sundar | We should have some internal agreement hopefully before we approach others | 03:23 |
xinranwang | Yes, of course | 03:24 |
Sundar | Anyways, what do others think of Cyborg creating RPs for the NIC side too? It will do that for the first type where the accelerator and NIC are combined. To keep it uniform, we should do that for the second case too | 03:24 |
Sundar | Otherwise, we'll have different solutions for different types of smart NICs, depending on whether they have a single component or multiple components | 03:25 |
Sundar | Yumeng, chenke, s_shogo, all: ^ | 03:26 |
Yumeng | so if the admin uses the Placement CLI to set traits, then the admin should know the “physicalnet” and the RP and RC of this smartNIC. so does that mean the admin needs to GET the physicalnet first, then GET the RP and RC, then report? | 03:27 |
Sundar | Anyway, I'll throw another idea in. Ideally, the admin should be able to formulate the device profile in the same way, independent of whether it is a single-component or multi-component device. | 03:27 |
xinranwang | s_shogo: | 03:28 |
Sundar | Yumeng: the admin needs to get the RP for the NIC and set the trait there. he could do so via the installer | 03:28 |
xinranwang | Shogo, are you around? | 03:28 |
s_shogo | xinranwang: yes, I'm considering that, while thinking about the operation of N3000.. | 03:29 |
Sundar | Good. Thanks, s_shogo | 03:29 |
Sundar | To repeat: Ideally, the admin should be able to formulate the device profile in the same way, independent of whether it is a single-component or multi-component device. | 03:30 |
Sundar | That common device profile would look like this: | 03:30 |
chenke | Sundar: I am not familiar with this. | 03:30 |
*** shaohe_feng has joined #openstack-cyborg | 03:30 | |
xinranwang | s_shogo: lol, np. Just for other things. Haibin cannot access IRC, and he ran into some problem with this https://review.opendev.org/#/c/698190/ , could you contact him, maybe by email? | 03:30 |
Sundar | chenke: ok, np | 03:31 |
s_shogo | Yes, I agree from the operator's point of view: same way to treat the device profile | 03:32 |
Sundar | { "name": "my-smartnic-dp", | 03:32 |
Sundar | "groups": [{ | 03:32 |
Sundar | "resources:FPGA": "1", | 03:32 |
Sundar | "resources:CUSTOM_NIC_X": "1", | 03:32 |
Sundar | "trait:CUSTOM_FPGA_REGION_ID_FOO": "required", | 03:33 |
Sundar | "trait:CUSTOM_NIC_TRAIT_BAR": "required", | 03:33 |
Sundar | "accel:bitstream_id": "3AFE" | 03:33 |
s_shogo | xinranwang: OK, I talked with haibin and shaohe yesterday, and that seems to be solved. Of course, if there is another one, I can help with that, too. | 03:33 |
Sundar | }] | 03:33 |
Sundar | } | 03:33 |
Sundar | IOW, the resource, traits and Cyborg properties for both the accelerator and NIC would be presented as a single resource group, which would ensure that a single RP would have to satisfy that. That single RP could be the top-level RP of a hierarchy. | 03:33 |
shaohe_feng | s_shogo: yes, another one, need your help | 03:34 |
xinranwang | s_shogo: thanks, please contact him when you get time. | 03:34 |
Sundar | Basically, unless it is a single request group, it is not guaranteed that the resources will come from the same RP | 03:34 |
Sundar | Do you have any comments or questions? | 03:35 |
s_shogo | shaohe_feng, OK,I got it! Could you contact me via e-mail like yesterday? | 03:35 |
shaohe_feng | s_shogo: yes. | 03:35 |
Sundar | During ARQ binding, Cyborg would still get a single RP as today. In the case of a multi-component device, Cyborg would translate that to the top-level Deployable object, and figure out what constituent components are present. | 03:35 |
xinranwang | we can merge the trait to a single request group, from cyborg, or from neutron. | 03:35 |
xinranwang | I am not against you, Sundar. I understand your proposal; it is one of the solutions we presented in the PTG. | 03:36 |
Sundar | Neutron cannot add traits to a RP owned by Cyborg, as per existing agreement among developers. | 03:36 |
Sundar | I don't think we got into the aspects of how the device profile should look, during the PTG. It is important to nail that down, so we get a uniform way of handling all smart NICs, whether they have 1 component or many | 03:38 |
xinranwang | If neutron doesn't want to do this, Cyborg can update traits created by neutron; in that case, cyborg should know whether neutron created the RP or not. We should think about it. | 03:38 |
Yumeng | Sundar,xinranwang: I just got a question. if admin knows the "physicalnet", why can't the admin create a device_profile with "physicalnet" directly? | 03:39 |
Sundar | If one OpenStack service starts updating traits on RPs created by another service, that can cause confusion. You are welcome to discuss it with other developers. | 03:39 |
Sundar | We have already had many such discussions during Nova-Cyborg discussion. | 03:40 |
xinranwang | What the device_profile looks like is not related to how we report the RP. They are 2 questions. | 03:40 |
xinranwang | Yumeng: yes, admin can do that. | 03:40 |
Yumeng | so that neither nova nor neutron has to merge the "physicalnet" trait into the request group | 03:40 |
swp20 | Sundar, 'During ARQ binding, Cyborg would still get a single RP as today.' the RP is for FPGA or NIC_X? | 03:40 |
Sundar | My point is, the device profile must have a single RG for both sides. If you create 2 separate RGs for the multi-component case, it won't ensure co-location. | 03:41 |
Sundar | swp20: That would be the top-level RP, which contains both the FPGA and NIC RPs as children. | 03:41 |
xinranwang | I think we have mixed 2 questions... one is who reports to placement, one is how nova gets these traits before scheduling. | 03:41 |
Yumeng | xinranwang: ok. so that's also one of the solutions. | 03:41 |
Sundar | Yumeng: admin can create device profiles with physnet as a trait. The question is, how is that trait set in the RP, before it is referenced in the device profile. | 03:42 |
Sundar | xinranwang: No, I have addressed both questions separately above. | 03:43 |
swp20 | Sundar: got it,Thanks. | 03:45 |
xinranwang | Yumeng: yes, sure. That's one of the solutions | 03:45 |
Sundar | If it is not clear, let me state it again: I think Cyborg should create all the RPs needed for a smart NIC. It populates most traits. Some traits, like physnet, may need to come from the admin. The admin creates device profiles. The user/tenant creates Neutron ports with the device profile name. Nova uses the device profiles in neutron ports + | 03:46 |
Sundar | flavor to do the scheduling. | 03:46 |
xinranwang | Yes, I agree. That's one of the solutions | 03:46 |
Sundar | What alternative solution is under consideration? | 03:47 |
xinranwang | ok, let me state | 03:47 |
xinranwang | 1. cyborg creates rp, neutron uses this rp and updates physnet traits. | 03:48 |
swp20 | Sundar: ' Some traits, like physnet', should we report these traits to Placement? | 03:48 |
xinranwang | 2. neutron creates rp, cyborg uses this rp to update acc related rc and traits. | 03:48 |
Sundar | First, you are assuming it is ok for one service to create an RP and another to update it by adding traits. Secondly, who will create the top-level RP for multi-component NICs? | 03:49 |
xinranwang | there is also a solution using provider-config.yaml | 03:51 |
Sundar | swp20: Some traits can be discovered from the device's PCI ID, like the type of NIC. Those can be added as traits by Cyborg. Others, like physnet or external network connectivity, are known only to the admin or the OpenStack installer. They are best left out of Cyborg. | 03:51 |
Sundar | The provider-config.yaml does not create RPs today. | 03:51 |
Sundar | It can be enhanced in the future to do that, but that will be another long discussion and development | 03:52 |
xinranwang | cyborg creates rp and others update traits. | 03:52 |
xinranwang | I am investigating how neutron reports physnet traits, it seems neutron also creates a logical rp. Not sure, need to verify this. | 03:53 |
Sundar | You are free to bring this up. Look at past IRC discussions before you spend time on this. | 03:53 |
Sundar | Yes, Neutron creates an RP for bandwidth provider feature. | 03:53 |
swp20 | Sundar: so how do we schedule the resources by these traits that are left out of Cyborg? | 03:54 |
*** tetsuro has joined #openstack-cyborg | 03:54 | |
Sundar | Just one more point before I conclude: when we have multi-component NICs with different RPs, it is important that the resource classes and traits for the accelerator RP and the NIC RP be totally disjoint (no overlapping resource classes or traits). That will usually be the case. | 03:55 |
shaohe_feng | neutron has a create_pci_requests_for_sriov_ports | 03:55 |
Sundar | swp20: The admin can reference such traits in the device profile which he creates in Cyborg. | 03:55 |
Sundar | Ok, this has taken a bit longer than I thought :). I didn't mean to take up the whole meeting. | 03:56 |
xinranwang | Yes, every solution has its pros & cons... Need more investigation | 03:56 |
Yumeng | Sundar, xinranwang and all: IMHO, if nova and neutron have no objections, I would prefer the solution in which Cyborg creates the top-level RP for multi-component NICs, since Cyborg does the lifecycle management of NICs. So I personally prefer that Cyborg create all the RPs needed for a smart NIC. | 03:56 |
shaohe_feng | it should be able to create the RP. but anyway, whoever creates the RP, cyborg and neutron should do it the same way, which means they know about each other | 03:57 |
Sundar | Yumeng: Sure. We need to discuss with neutron folks on the bandwidth provider case, where they create the RP now | 03:57 |
xinranwang | Yes, I prefer this too. RP creation in cyborg seems reasonable. | 03:57 |
xinranwang | About the physnet trait. that's what we need to discuss more. | 03:58 |
shaohe_feng | admin can set physnet for both cyborg and neutron. Tenant does not need to know this. Neutron can get the detail and device profile from cyborg, and add physnet when neutron calls create_pci_requests_for_sriov_ports | 03:58 |
xinranwang | And also need to discuss the consistency about rp created in neutron and cyborg. | 03:59 |
shaohe_feng | agree | 03:59 |
wangzhh | I think it's better to let cyborg create RP. And for the traits, can cybog auto sync it from neutron? | 04:00 |
wangzhh | *cyborg | 04:00 |
swp20 | Sundar: not familiar with this, i will track it. thanks | 04:00 |
xinranwang | Neutron needs to provide an interface in this case. wangzhh | 04:00 |
shaohe_feng | 1. if we just let neutron set the physnet, that is convenient for the admin. but that convenience may need more changes and more effort. | 04:01 |
shaohe_feng | actually, cyborg can not know the physical networks. | 04:03 |
wangzhh | It is bridge_mapping conf of neutron, not sure if we can get it from neutron net-show external-net. xinranwang | 04:03 |
xinranwang | neutron will report physnet trait only if the bandwidth feature is enabled, as I understand it. So I am not sure we can get info from these apis | 04:04 |
Sundar | I would suggest that we keep networking aspects out of Cyborg. That can only lead to complexity and trouble. | 04:05 |
Yumeng | agree. | 04:06 |
Yumeng | seems we have run out of time! let's continue discussion in wechat or ML. | 04:06 |
Yumeng | And I will quickly mention the left topics. | 04:07 |
Yumeng | #topic third-party driver CI | 04:07 |
*** openstack changes topic to "third-party driver CI (Meeting topic: openstack-cyborg)" | 04:07 | |
Yumeng | we got three new drivers in the victoria release. do we need to require CI for new drivers? what should be included in the tests? | 04:09 |
Yumeng | In my opinion, third-party driver CI is good. But not sure each vendor can provide it. I think cyborg should at least require one or two main driver vendors to provide this kind of CI tests. | 04:11 |
Yumeng | what do you think? | 04:11 |
xinranwang | Hmmm, better to have 3rd CI. We can keep the driver as experimental until it has 3rd CI | 04:12 |
Sundar | Yumeng: sounds like a good idea, but how do we force vendors to provide it? We can only say that, if they provide it, we will support it, otherwise it is an unsupported driver. | 04:13 |
Sundar | xinranwang: yes, I think we are aligned. | 04:13 |
Yumeng | Sundar: yes. we cannot require any vendor. marking drivers 'experimental' or 'supported' sounds like a good idea! | 04:14 |
Yumeng | anyway, we don't have to conclude it today. we can think more. and continue discussion. | 04:14 |
Sundar | Would ZTE provide 3p CI for their smart NIC? Just curious :) | 04:15 |
Yumeng | #topic storyboard usage | 04:15 |
Yumeng | Sundar: haha not sure. I need to check with the network team | 04:16 |
Yumeng | about the storyboard usage, we got some complaints in the PTG. I investigated it a bit and found it is very flexible in defining your own priorities, worklists and so on. so it is worth using. | 04:18 |
Yumeng | so please check links in topic 3-7 https://wiki.openstack.org/wiki/Meetings/CyborgTeamMeeting#Agenda, and review the usage guide. We need to decide how we want to use the storyboard. | 04:20 |
Yumeng | Otherwise, if there are any other good solutions, we can also discuss them. | 04:20 |
Sundar | I'm fine with whatever you all decide. | 04:21 |
Yumeng | ok. Thanks Sundar. | 04:22 |
Yumeng | #topic AoB | 04:22 |
*** openstack changes topic to "AoB (Meeting topic: openstack-cyborg)" | 04:22 | |
Yumeng | Does anybody want to bring up anything else? | 04:22 |
Yumeng | seems not. | 04:23 |
Yumeng | That's great, we had a very effective meeting today. Thank you all! | 04:23 |
Sundar | Thank you all | 04:24 |
Yumeng | Let's meet next week. Have a good day/night! | 04:24 |
Sundar | Bye | 04:24 |
*** Sundar has quit IRC | 04:24 | |
Yumeng | #endmeeting | 04:24 |
*** openstack changes topic to "Pending patches (Meeting topic: openstack-cyborg)" | 04:24 | |
openstack | Meeting ended Thu Jun 11 04:24:27 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 04:24 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2020/openstack_cyborg.2020-06-11-03.04.html | 04:24 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/openstack_cyborg/2020/openstack_cyborg.2020-06-11-03.04.txt | 04:24 |
openstack | Log: http://eavesdrop.openstack.org/meetings/openstack_cyborg/2020/openstack_cyborg.2020-06-11-03.04.log.html | 04:24 |
*** s_shogo has quit IRC | 04:27 | |
*** links has joined #openstack-cyborg | 04:50 | |
*** links has quit IRC | 05:12 | |
*** links has joined #openstack-cyborg | 05:21 | |
*** tetsuro_ has joined #openstack-cyborg | 05:45 | |
*** tetsuro__ has joined #openstack-cyborg | 05:47 | |
*** tetsuro has quit IRC | 05:47 | |
*** tetsuro_ has quit IRC | 05:50 | |
*** dustinc has quit IRC | 06:35 | |
*** shaohe_feng has quit IRC | 06:43 | |
*** xinranwang has quit IRC | 06:54 | |
*** chenke has quit IRC | 06:57 | |
*** wangzhh has quit IRC | 07:01 | |
*** tetsuro__ has quit IRC | 07:37 | |
*** tetsuro has joined #openstack-cyborg | 07:40 | |
*** tetsuro has quit IRC | 08:57 | |
*** tetsuro has joined #openstack-cyborg | 09:00 | |
*** tetsuro has quit IRC | 09:28 | |
openstackgerrit | Hervé Beraud proposed openstack/python-cyborgclient master: Use unittest.mock instead of mock https://review.opendev.org/734506 | 09:49 |
*** efried has quit IRC | 10:32 | |
*** jraju__ has joined #openstack-cyborg | 10:35 | |
*** links has quit IRC | 10:36 | |
*** Yumeng has quit IRC | 11:08 | |
*** gmann has quit IRC | 12:33 | |
*** gmann has joined #openstack-cyborg | 12:33 | |
*** efried has joined #openstack-cyborg | 13:20 | |
*** jraju__ has quit IRC | 14:57 | |
*** dustinc has joined #openstack-cyborg | 21:55 |
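
The multi-component smart NIC model discussed in the meeting above (separate RPs for the accelerator and the NIC under a resource-less top-level RP, plus a device profile with a single request group) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cyborg or Placement code: the `ResourceProvider` class and the `aggregate` and `group_matches` helpers are hypothetical names, only `resources:` keys and `trait:` keys with the value `required` are interpreted, and the resource classes and traits are the placeholders from Sundar's example device profile.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ResourceProvider:
    """Hypothetical stand-in for a Placement resource provider (RP)."""
    name: str
    resources: Dict[str, int] = field(default_factory=dict)
    traits: Set[str] = field(default_factory=set)
    children: List["ResourceProvider"] = field(default_factory=list)


def aggregate(rp: ResourceProvider) -> Tuple[Dict[str, int], Set[str]]:
    """Combine the resources and traits of an RP and all of its descendants.

    Raises ValueError if two components share a resource class or trait,
    since the discussion asks for them to be disjoint.
    """
    resources = dict(rp.resources)
    traits = set(rp.traits)
    for child in rp.children:
        child_resources, child_traits = aggregate(child)
        overlap = (resources.keys() & child_resources.keys()) | (traits & child_traits)
        if overlap:
            raise ValueError(f"overlapping names under {rp.name}: {sorted(overlap)}")
        resources.update(child_resources)
        traits |= child_traits
    return resources, traits


def group_matches(group: Dict[str, str], rp: ResourceProvider) -> bool:
    """Check whether one request group can be satisfied by a single RP view."""
    resources, traits = aggregate(rp)
    for key, value in group.items():
        prefix, _, name = key.partition(":")
        if prefix == "resources" and resources.get(name, 0) < int(value):
            return False
        if prefix == "trait" and value == "required" and name not in traits:
            return False
        # Other keys (for example accel:bitstream_id) are Cyborg properties
        # and are ignored in this sketch.
    return True


# Assumed topology for a two-component card such as the N3000 example.
fpga = ResourceProvider("fpga", {"FPGA": 1}, {"CUSTOM_FPGA_REGION_ID_FOO"})
nic = ResourceProvider("nic", {"CUSTOM_NIC_X": 1}, {"CUSTOM_NIC_TRAIT_BAR"})
top = ResourceProvider("smartnic-top", children=[fpga, nic])

# The single request group from the example device profile.
group = {
    "resources:FPGA": "1",
    "resources:CUSTOM_NIC_X": "1",
    "trait:CUSTOM_FPGA_REGION_ID_FOO": "required",
    "trait:CUSTOM_NIC_TRAIT_BAR": "required",
    "accel:bitstream_id": "3AFE",
}
```

Because the group names resources from both components, only the top-level RP can satisfy it in this sketch, which is the co-location argument made at 03:41. The overlap check mirrors the 03:55 point that the accelerator RP and the NIC RP should carry disjoint resource classes and traits.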