Monday, 2023-08-28

opendevreviewOpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata  https://review.opendev.org/c/openstack/nova/+/89164804:17
*** elodilles_pto is now known as elodilles06:41
dvo-plvsean-k-mooney, Hello. Are you here ?07:36
gokhanihello folks, I am trying to migrate from vmware to openstack and my vm is stuck at booting from hard disk. I converted it from vmdk to raw. I followed the https://superuser.openinfra.dev/articles/how-to-migrate-from-vmware-and-hyper-v-to-openstack/ guide. my fstab file on my vm is https://paste.openstack.org/show/bqO5mJ9gCpyHEEBgvMjZ/. what can be the reason for it getting stuck at boot from hard disk?09:16
gokhanivm is ubuntu 20.04 and cloudinit package is installed09:16
bauzasgood morning Nova10:32
* bauzas is eventually back at work10:32
sean-k-mooneyo/10:32
sean-k-mooneyi would say a lot has changed since you were last here but until last week the ci has been mostly unable to merge things so...10:33
bauzasargh10:34
sean-k-mooneygmann, dansmith and others have fortunately managed to improve that of late but the reality is very little had merged in the early part of your pto10:34
bauzasI haven't yet looked at gerrit10:34
bauzasbut this afternoon, I'll do10:35
bauzasthis morning, paperwork + email scrubbing :(10:35
sean-k-mooneyi feel like confining that to the morning is probably ambitious10:35
bauzastbh, I need way more coffee before looking at the gate :)10:39
sean-k-mooneyit's relatively fine at the moment10:41
sean-k-mooneybar the normal volume detach issues which i think will be a ptg topic10:41
gibibauzas: o/ welcome back. we did not pass the bug triage baton around during your PTO, I was too lazy to find a candidate during the meeting. Otherwise the meetings were held and were as productive as the summer period allows.11:34
bauzas++ thanks for the offer11:34
dvo-plvsean-k-mooney, What do you think about multi nic support in our patch, by adding additional configuration via a config file or, even better, using the binding profile?12:42
sean-k-mooneyhard no on config file12:43
sean-k-mooneyor user specifying data in the binding profile12:43
sean-k-mooneywell it depends on where the config file would be used, what's in it and how that is configured12:44
sean-k-mooneybut i don't think that's likely to work.12:44
dvo-plvget configuration here to generate correct socket name https://review.opendev.org/c/openstack/neutron/+/869510/13/neutron/plugins/ml2/drivers/openvswitch/mech_driver/mech_openvswitch.py#22112:44
dvo-plvand here to create correct vf number https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/ovs.py#34912:44
sean-k-mooneynote that binding_profile is explicitly for nova to pass data to the network backend12:44
sean-k-mooney as in you would have something in the neutron l2 agent's config file12:45
sean-k-mooneywhich gets passed as part of agent['configurations']12:45
sean-k-mooneythat you need to use to calculate the vf number12:45
sean-k-mooneyi don't really think that is the correct approach honestly12:47
dvo-plvbut we need to somehow correlate the formula for calculating the socket and vf number12:49
sean-k-mooneyhow i think this should work is likely nova should gather the info required and store it in the binding_profile and then neutron should use that to calculate the path12:49
dvo-plvwe use a simple number for the vf, and the socket name is stdvio + vf number. Mellanox uses something like a pf0vf0 formulation12:50
dvo-plvsorry, what info is required, and what entity should we parse to get this info12:50
sean-k-mooneythe vf number is only meaningful if you know the parent PF pci device12:51
sean-k-mooneycan you express what exactly is required to match12:52
sean-k-mooneyare you using the vf number to identify the dpdk port or something like that12:54
sean-k-mooneyi.e. stdvio10 maps to dpdk1012:54
dvo-plvin our case to create correct socket, we need socket name base ( stdvio ) and vf number according to the formula "domain bus slot" * 8 + "func number"12:54
dvo-plvI added comment here https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/ovs.py#35212:55
sean-k-mooneythat does not explain why that is needed and the algorithm you implemented does not match that12:55
sean-k-mooneywhat the code does is take 0000:2b:01.2 then reduce it to 01.2 then split that to 01 and 212:57
dvo-plvwe reserved first 4 pci for pf needs12:57
sean-k-mooneyand do 8*1+212:57
sean-k-mooneyso you are discarding the domain and bus numbers then looking only at the slot and function12:58
sean-k-mooneyso 0000:2b:01.2 and 0000:3b:01.2 will compute the same socket12:59
dvo-plvyes, we use this approach to work with single nic support12:59
sean-k-mooneyright which i don't think is sufficient to move forward with this13:00
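A minimal sketch of the collision described above, assuming the calculation is exactly what the discussion states (drop domain and bus, then slot * 8 + function) and that the socket name is `stdvio` plus that number. The function name `vf_number_from_pci` is hypothetical and is not taken from the os-vif patch.

```python
def vf_number_from_pci(address):
    """Compute slot * 8 + function, ignoring the domain and bus (as discussed)."""
    _domain, _bus, slot_func = address.split(":")
    slot, func = slot_func.split(".")
    # PCI slot and function numbers are written in hexadecimal.
    return int(slot, 16) * 8 + int(func, 16)


def socket_name(address, base="stdvio"):
    """Build the socket name from the assumed base plus the computed number."""
    return "%s%d" % (base, vf_number_from_pci(address))


# Two VFs on different buses (for example, two NICs) map to the same name.
for addr in ("0000:2b:01.2", "0000:3b:01.2"):
    print(addr, "->", socket_name(addr))
```

Both addresses produce the same socket name, which is why the approach only holds while a single NIC is present.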
dvo-plvmulti nic support is in development, so I would like to provide a method which can calculate the correct vf number and socket for all vendors by passing a formula in the binding profile13:00
sean-k-mooneythat feels like a hack13:01
sean-k-mooneyi also don't like the idea of effectively embedding a dsl to express the formula which is then interpreted at run time13:01
sean-k-mooneydvo-plv: looking at the ovs and dpdk docs13:04
sean-k-mooneythe way we add port representors is by passing the PCI address of the PF + the representor vf offset13:04
sean-k-mooneyhttps://docs.openvswitch.org/en/latest/topics/dpdk/phy/?highlight=dpdk#representors13:04
sean-k-mooneyi.e. ovs-vsctl add-port br0 dpdk-rep3 -- set Interface dpdk-rep3 type=dpdk \13:05
sean-k-mooney   options:dpdk-devargs=0000:08:00.0,representor=vf313:05
dvo-plvyes, we do the same, but representor identifier is not fixed, so it depends on vendor dpdk driver13:06
sean-k-mooneywell that's kind of a problem13:06
sean-k-mooneysince the code you want to modify is meant to be vendor neutral13:06
sean-k-mooneymeaning you are not allowed to make vendor specific changes to it13:07
dvo-plvthis is why I would like to provide config via binding profile13:07
sean-k-mooneyno 13:09
sean-k-mooneythat's not really an option13:09
sean-k-mooneyat least not without a lot of other changes13:10
sean-k-mooneyso here https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/tests/unit/test_plugin.py#44013:10
sean-k-mooneyyou are creating the port doing effectively13:10
sean-k-mooneyovs-vsctl add-port br0 dpdk-rep3 -- set Interface dpdk-rep3 type=dpdk \13:10
sean-k-mooney   options:dpdk-devargs=0000:08:00.0,representor=vf313:10
sean-k-mooneywhen you say the representor identifier are you referring to vf313:14
sean-k-mooneyor the vhost-user socket file name13:15
sean-k-mooneydvo-plv: or both?13:15
dvo-plvboth13:15
sean-k-mooneyso the convention in ovs is the socket name should match the port name13:15
sean-k-mooneyat least for interfaces of type vhost-user and vhost-user-client13:16
sean-k-mooneyor whatever the ovs constants for those are13:16
dvo-plvI believe it can be correct for dpdk port type too13:17
dvo-plvbut as long as there is no convention regarding name formulation, we have to hardcode the socket name at the dpdk initialization step13:19
dvo-plvwe create the vf and socket at the same time and then pass it to some application to use13:19
sean-k-mooneythat's not how things will work in nova/openstack13:20
sean-k-mooneythe VFs would have to be statically allocated effectively at boot time13:20
sean-k-mooneywhat driver is the PF bound to?13:20
sean-k-mooneyit's bound to a kernel driver not DPDK yes?13:21
dvo-plvwe use default vfio 13:22
sean-k-mooneyfor the VFs or the PF13:22
sean-k-mooneyvfio-pci is ok for the vf but not the PF13:23
sean-k-mooneybased on https://doc.dpdk.org/guides-22.11/prog_guide/switch_representation.html#vf-representors i think the PFs are bound to a kernel driver and the VFs are bound to vfio-pci then managed by the dpdk userspace driver13:28
sean-k-mooneyalthough i'm not sure that is correct if i look at https://doc.dpdk.org/guides-22.11/prog_guide/switch_representation.html#basic-sr-iov13:31
sean-k-mooney"A DPDK application running on the hypervisor owns the PF device, which is arbitrarily assigned port index 3"13:32
sean-k-mooneyhonestly i think upstream ovs needs to be modified to take the vhost-user socket path as a parameter when we do the port add13:35
dvo-plvAt the beginning we talked about that here https://review.opendev.org/c/openstack/nova-specs/+/859290/4..18//COMMIT_MSG#b913:35
dvo-plvso this is why we firstly created our mech driver13:35
dvo-plvso, maybe the correct way would be to get back to our mech driver?13:36
sean-k-mooney if it needs to be vendor specific it would need to be out of tree both on the ml2 side and os-vif side13:37
sean-k-mooneythe reason i'm asking about the PF driver by the way is to ensure libvirt can enumerate the VFs13:38
sean-k-mooneyit can't do that as far as i am aware if the PF is added to dpdk13:38
dvo-plvwe use vfio-pci for pf and vf functions13:40
dvo-plvwe can observe our interfaces only via the dpdk driver13:40
dvo-plvdpdk ntnic-pmd13:40
sean-k-mooneythen this won't work with libvirt13:40
sean-k-mooneysince we will have no nodedevs for the VFs13:41
sean-k-mooneyso we will have no entries in the pci_device table13:41
sean-k-mooneyunless your driver is implementing a sysfs interface for them?13:42
dvo-plvyes, https://paste.opendev.org/show/b8UDrLC9e0iKrEJR6z6j/13:45
sean-k-mooneyif you have that interface then we can use sysfs to look up the vf number13:45
sean-k-mooneywhich os-vif already has code for.13:46
sean-k-mooneyhttps://github.com/openstack/os-vif/blob/master/vif_plug_ovs/linux_net.py#L360-L37813:48
dvo-plvyes, I remember that I tried to use this algo, but it is not correct for us as far as I remember. for example we have this pci 000:2b:01.1/, so according to the formula the vf number should be 9, but sysfs calculates it as virtfn513:49
sean-k-mooneythat sounds like a bug in your dpdk driver then13:50
sean-k-mooneylooking at that output it is the 6th VF so 5 is correct if 0 indexed13:52
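A rough sketch of the sysfs lookup being discussed, assuming the PF directory exposes `virtfnN` symlinks that point at each VF's PCI address. This is not the os-vif code linked above; `vf_index_from_sysfs` and `make_fake_pf` are hypothetical names, and the fake tree (including every VF address other than `0000:2b:01.1`) is invented so the example runs without real hardware.

```python
import os
import re
import tempfile


def vf_index_from_sysfs(pf_sysfs_dir, vf_address):
    """Return N for the virtfnN symlink under the PF that points at vf_address."""
    for entry in sorted(os.listdir(pf_sysfs_dir)):
        match = re.fullmatch(r"virtfn(\d+)", entry)
        if match is None:
            continue
        target = os.readlink(os.path.join(pf_sysfs_dir, entry))
        if os.path.basename(target.rstrip("/")) == vf_address:
            return int(match.group(1))
    raise LookupError("no virtfn link points at %s" % vf_address)


def make_fake_pf(root, pf_address, vf_addresses):
    """Create a stand-in for /sys/bus/pci/devices/<pf>/ with virtfn symlinks."""
    pf_dir = os.path.join(root, pf_address)
    os.makedirs(pf_dir)
    for index, vf in enumerate(vf_addresses):
        # Dangling links are fine here: only the link text is read back.
        os.symlink("../" + vf, os.path.join(pf_dir, "virtfn%d" % index))
    return pf_dir


with tempfile.TemporaryDirectory() as root:
    # Assumed layout: the first VF sits at function 4 of the PF's slot.
    vfs = ["0000:2b:00.4", "0000:2b:00.5", "0000:2b:00.6",
           "0000:2b:00.7", "0000:2b:01.0", "0000:2b:01.1"]
    pf_dir = make_fake_pf(root, "0000:2b:00.0", vfs)
    vf_index = vf_index_from_sysfs(pf_dir, "0000:2b:01.1")
    print("vf index:", vf_index)
```

With this layout the address `0000:2b:01.1` is the sixth VF, so the zero-based index is 5 rather than the 9 that the slot * 8 + function formula gives.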
sean-k-mooneyi think the best way to make this portable and upstreamable would be to add the socket name as a dpdk arg when adding the port and use the data from sysfs13:56
sean-k-mooneythat would require 2 changes to the dpdk driver but it would mean we would not have to calculate the socket path in a special way13:57
sean-k-mooney so we can keep the neutron ml2 driver unmodified at least in terms of the vhost-user path generation13:57
dvo-plvthis is not a bug, because we reserved first 4 pci for physical interfaces14:05
dvo-plvSo what about getting back to the separate ml2 plugin? like Agilio?14:10
dvo-plvsorry, but i did not get your option with dpdk args14:15
dvo-plvthe socket is created by the dpdk pmd driver at init time14:15
dvo-plv...-a 0000:2b:00.0,representor=[4-6],portqueues=[4:1,5:1,6:1] -a 0000:2b:00.4 -a 0000:2b:00.5 -a 0000:2b:00.6"14:16
dvo-plvwe pass next other config to the dpdk driver14:16
dvo-plvafter that it creates sockets with names stdvio+vfnumber14:16
sean-k-mooneyshouldn't the socket be created by qemu with dpdk as the client14:18
sean-k-mooneywhen dpdk is the server then if the vswitch is restarted it breaks network connectivity for all vms14:19
sean-k-mooneythis is why we moved to qemu server dpdk client mode many years ago14:19
dvo-plvthis is not an issue for us, after ovs restart and socket recreation, connectivity comes back14:22
sean-k-mooneyqemu will not reconnect to the vhost-user socket if it's recreated by dpdk14:22
sean-k-mooneythe socket FD will change14:23
sean-k-mooneythere is a way to make qemu do that, i believe that was added significantly later, but we don't configure that14:23
sean-k-mooneywe do not support generating "<reconnect enabled='yes' timeout='10'/>"14:24
sean-k-mooneyhttps://libvirt.org/formatdomain.html#vhost-user-interface14:24
sean-k-mooneyso if qemu is running in client mode and you restart ovs to do a package upgrade it will break network connectivity for all guests until you hard reboot them14:26
sean-k-mooneyhave you tested this in an openstack environment?14:26
dvo-plvthis is qemu command, -chardev socket,id=char0,path=/usr/local/var/run/stdvio6,server we set the server on the qemu side14:26
sean-k-mooneyok then qemu is running in server mode which means qemu is creating the socket not dpdk14:27
sean-k-mooneydpdk is running in client mode and connecting to the socket created by qemu14:27
sean-k-mooneythat is the way we recommend you use vhost user14:27
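For illustration, a sketch of the two libvirt vhost-user interface shapes being contrasted, built with Python's standard `xml.etree.ElementTree`. The element and attribute names follow the libvirt domain format page linked above; the helper `vhostuser_interface` is hypothetical, and this is not how nova generates its XML. Emitting `<reconnect>` only for client mode is an assumption made here.

```python
import xml.etree.ElementTree as ET


def vhostuser_interface(socket_path, mode, reconnect_timeout=None):
    """Build a libvirt <interface type='vhostuser'> element.

    mode is from qemu's point of view: 'server' means qemu creates the socket.
    """
    if mode not in ("server", "client"):
        raise ValueError("mode must be 'server' or 'client'")
    iface = ET.Element("interface", {"type": "vhostuser"})
    source = ET.SubElement(
        iface, "source", {"type": "unix", "path": socket_path, "mode": mode})
    # Assumption: reconnect is only meaningful when qemu is the client.
    if mode == "client" and reconnect_timeout is not None:
        ET.SubElement(
            source, "reconnect",
            {"enabled": "yes", "timeout": str(reconnect_timeout)})
    ET.SubElement(iface, "model", {"type": "virtio"})
    return iface


# qemu as server (the mode recommended in the discussion): dpdk connects to it.
server_iface = vhostuser_interface("/usr/local/var/run/stdvio6", "server")
# qemu as client with the reconnect element that nova does not generate.
client_iface = vhostuser_interface("/usr/local/var/run/stdvio6", "client", 10)

print(ET.tostring(server_iface, encoding="unicode"))
print(ET.tostring(client_iface, encoding="unicode"))
```

The socket path is the one from the qemu command quoted above; any other path would work the same way.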
sean-k-mooneywhat i'm suggesting is we add the path to the dpdk args when os-vif creates the interface in ovs14:28
sean-k-mooneyand the driver should use that instead of trying to calculate it14:28
dvo-plvyes, we formulate socket name by pci address according to the formula, because we reserve 4 first pcis for physical ports14:29
sean-k-mooneyyep but you're not actually passing the vf number to dpdk in the representor arg14:30
sean-k-mooneywhat you are passing is the integer offset to add to the pf address14:30
sean-k-mooneythat is why you expect virtfn5 -> ../0000:2b:01.1/ to be 914:31
dvo-plvyes, and this offset does not give us the ability to work with sysfs14:32
dvo-plv+ socket name like stdvio + vf_num14:32
sean-k-mooneythat's not the vf_num14:33
sean-k-mooneyit's the pci endpoint number14:33
sean-k-mooneyit's a different thing14:33
dvo-plvsure14:33
sean-k-mooneyreally the dpdk driver should just accept representor=vf5 + the pf address14:34
sean-k-mooneyand compute that internally14:34
dvo-plvwe can get this offset from here in some way root@server23:~# cat /sys/bus/pci/devices/0000\:2b\:00.0/sriov_offset 14:34
dvo-plv414:34
sean-k-mooneyif we can read that from sysfs14:36
sean-k-mooneythen we can drop the algorithm in os-vif and use the existing function and add the offset14:36
sean-k-mooneythat still does not really help for the ml2 driver14:36
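A small sketch of the arithmetic proposed here: take the zero-based VF index from sysfs and add the PF's `sriov_offset` to get the number the driver calls the "vf number". The function names are hypothetical, the stride of 1 is an assumption (the log only mentions `sriov_offset`), and the PF is assumed to sit at function 0 of slot 0 as in `0000:2b:00.0`.

```python
def endpoint_number(vf_index, sriov_offset, sriov_stride=1):
    """Offset of a VF from its PF: first-VF offset plus index times stride."""
    return sriov_offset + vf_index * sriov_stride


def endpoint_to_slot_func(endpoint):
    """Split an endpoint number into the slot.function part of a PCI address."""
    slot, func = divmod(endpoint, 8)
    return "%02x.%x" % (slot, func)


# Values from the discussion: virtfn5 and sriov_offset = 4.
number = endpoint_number(vf_index=5, sriov_offset=4)
print(number, endpoint_to_slot_func(number), "stdvio%d" % number)
```

For VF index 5 this gives 9, which matches both the `01.1` address and the value the existing slot * 8 + function formula was producing.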
sean-k-mooneydo you have a document that describes the step by step process of adding the representor netdevs to ovs-dpdk manually14:37
dvo-plvone moment14:38
dvo-plvhttps://docs.napatech.com/r/Getting-Started-with-Napatech-Link-VirtualizationTM-Software/Create-the-OVS-Provider-Bridge-and-Start-2-VMs14:39
sean-k-mooneythat is incomplete. it does not have the port add commands for the representor ports14:41
dvo-plvitem 4 Add the dpdkvp0 virtual port to the br-int bridge:14:42
*** JasonF is now known as JaqyF14:54
*** JaqyF is now known as JayF14:54
dvo-plvmaybe we can create logic like with vhostuser_socket_dir, if vhostuser_socket_name is set, get this name from config + pci from sysfs?15:12
noonedeadpunkhey folks. I wanna double-check one thing with you, that I'm not missing anything. So for volume to be attached to a VM with scsi bus - volume *must* be created from the image?16:59
noonedeadpunkAs I've found a spec to allow volume define that as well, but it was abandoned16:59
sean-k-mooneythe volume no17:11
sean-k-mooneybut the vm root disk must have hw_disk_bus=scsi17:12
sean-k-mooneywe only look at the metadata on the root disk be that a local disk or cinder volume root disk17:12
sean-k-mooneyand all other cinder volumes must use the same disk bus17:12
sean-k-mooneynoonedeadpunk: so right now we do not support attaching block devices with different buses17:13
sean-k-mooneythere is a way to occasionally make that work in a specific edge case but it's not supported upstream17:13
noonedeadpunkaha, ok, I see then17:14
sean-k-mooneynoonedeadpunk: we discussed at the last ptg adding a way to support per volume disk buses17:14
sean-k-mooneybut no one actually worked on it17:14
noonedeadpunkSo if you found yourself in situation when 25 volumes are not enough - you should start from scratch kinda?17:14
sean-k-mooneywell. it depends17:15
sean-k-mooneythe vm i assume is using virtio-blk now17:15
noonedeadpunkyup17:15
sean-k-mooneyis the vm bfv or local storage17:15
sean-k-mooneyand are you looking for an admin solution or an end user one17:15
noonedeadpunkI'm not sure if it's bfv or not as instance seem to be gone....17:18
noonedeadpunkbut admin solution is fine17:18
sean-k-mooneythen you can use this https://docs.openstack.org/nova/latest/cli/nova-manage.html#image-property-set17:18
noonedeadpunk(or I just can't find it somehow)17:18
noonedeadpunkugh... We're running xena (going to upgrade to 2023.1 in a month)17:18
noonedeadpunkbut that is really handy command I didn't know about17:19
noonedeadpunkso thanks for pointing me to it!17:19
sean-k-mooneyit's really meant for helping people upgrade17:19
sean-k-mooneybut it will update the embedded image metadata17:19
sean-k-mooneyif it's boot from volume17:19
sean-k-mooneythere is another hack17:20
sean-k-mooneytl;dr is using an old microversion of the rebuild api17:20
sean-k-mooneyallows bfv guests to just update the image metadata17:20
sean-k-mooneyi.e. on microversions where rebuild was not supported for boot from volume17:20
sean-k-mooneyif you used the same image uuid we allowed the metadata to be updated17:21
noonedeadpunkiirc rebuild is supported quite recently? like zed or 2023.1?17:21
sean-k-mooneythis is generally not that safe as it would break the vm in some cases and in the new microversion it will actually rebuild the volume17:21
sean-k-mooneyyes17:21
noonedeadpunkok, awesome, thanks a lot!17:22
sean-k-mooneyso prior to actually supporting it properly there was a poorly documented feature where rebuild to the same image "only for bfv guests" would not destroy data and just update the metadata17:22
sean-k-mooneybut honestly that should never have been a thing17:22
sean-k-mooneywe effectively forgot it existed until we added real rebuild support17:23
noonedeadpunkI can recall writing smth to the ML regarding that17:23
noonedeadpunkbut already forgot about this feature :D17:25
sean-k-mooneyhaving a normally destructive api be non-destructive only if it's BFV is too easy to forget17:27
