Monday, 2025-04-14

opendevreview Stefan K proposed openstack/nova-specs master: Add Cloud Hypervisor support spec  https://review.opendev.org/c/openstack/nova-specs/+/945549 09:22
opendevreview Stefan K proposed openstack/nova-specs master: Add Cloud Hypervisor support spec  https://review.opendev.org/c/openstack/nova-specs/+/945549 09:32
opendevreview Merged openstack/nova-specs master: tox: Drop envdir  https://review.opendev.org/c/openstack/nova-specs/+/941184 11:00
opendevreview Andre Aranha proposed openstack/nova master: Replace paramiko with ssh-python  https://review.opendev.org/c/openstack/nova/+/946922 11:49
bbezak Hi - I'd like to bring to the nova team's attention a pretty interesting (and somewhat convoluted to troubleshoot) bug - https://bugs.launchpad.net/nova/+bug/2104255. Namely, stripping the switchdev capability from an active SR-IOV VF on nova-compute service restart. Workaround is to restart nova-compute after the VF is detached 12:50
sean-k-mooney bbezak: note that nova does not support VF-LAG at all :) 12:53
sean-k-mooney at least not officially 12:53
sean-k-mooney Mellanox never actually did the work to enable it in neutron or nova, they just figured out a hack to make it work 12:54
bbezak indeed. I like that patch though - https://review.opendev.org/c/openstack/nova/+/884439. now we don't need to add switchdev by hand to binding profiles ;) 12:55
bbezak and it works very well :) 12:55
sean-k-mooney that's a different feature 12:55
sean-k-mooney so you're using hardware offloaded ovs, not generic sriov 12:55
bbezak yes 12:55
sean-k-mooney bbezak: for what it's worth the translation of capabilities to port binding was meant to be done before hardware offloaded ovs was merged 12:56
sean-k-mooney bbezak: it actually predates the creation of placement and got put on hold for like 6 years 12:57
sean-k-mooney bbezak: so it was always intended that you would not set switchdev 12:57
sean-k-mooney i.e. that nova would 12:57
sean-k-mooney bbezak: anyway looking at the bug report 12:58
bbezak thx for background info sean-k-mooney 12:59
sean-k-mooney it looks like the network capabilities are lost when an instance is "torn down" 12:59
sean-k-mooney does that mean the instance is deleted or stopped? 12:59
bbezak well. it happens on nova-compute service restart - on actively attached VFs 13:00
bbezak then when tearing the vm down, the vfs are not usable 13:00
sean-k-mooney right but this is because while it's attached to a vm 13:00
sean-k-mooney we cannot inspect its ethtool feature flags 13:01
sean-k-mooney i guess the problem is that we also cache this info on startup 13:02
sean-k-mooney so if the agent gets restarted while the vms are running 13:02
bbezak yes, it is attached to the vm, then the nova-compute restart strips the feature in the db (as it is not visible on the host). and then when the vm is removed the vf doesn't have the capability 13:02
sean-k-mooney it will no longer have the network capabilities info 13:02
bbezak exactly 13:02
sean-k-mooney which is not wrong per se 13:03
sean-k-mooney it's just wrong because it will never get updated again when it's detached 13:03
bbezak yeap 13:03
sean-k-mooney there are a couple of ways to fix that 13:04
sean-k-mooney they all have trade-offs 13:04
sean-k-mooney we could clear the cache when we detach an interface 13:04
sean-k-mooney we could refuse to update the capabilities if the device is not in the available state 13:05
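
A minimal sketch of the second option mentioned above: skip the capability refresh unless the device is in the available state. The PciDevice class, the status strings, and the update_capabilities helper are hypothetical names used for illustration only and are not taken from nova's code.

```python
from dataclasses import dataclass, field

# Hypothetical status values, named after the states mentioned in the chat.
AVAILABLE = "available"
CLAIMED = "claimed"
ALLOCATED = "allocated"


@dataclass
class PciDevice:
    """Illustrative stand-in for a tracked PCI device record."""
    address: str
    status: str = AVAILABLE
    capabilities: dict = field(default_factory=dict)


def update_capabilities(dev: PciDevice, reported: dict) -> bool:
    """Apply host-reported capabilities only while the device is available.

    Returns True if the stored capabilities were replaced, False if the
    update was refused.
    """
    if dev.status != AVAILABLE:
        # While the VF is attached to a guest, the host cannot inspect its
        # feature flags, so the report is incomplete: keep the stored value.
        return False
    dev.capabilities = dict(reported)
    return True


# A restart while the VF is allocated reports no capabilities.
vf = PciDevice("0000:03:00.2", status=ALLOCATED,
               capabilities={"network": ["switchdev"]})
update_capabilities(vf, {})  # refused, switchdev is kept
```

The trade-off in this sketch is that a real capability change made while the device is attached would go unnoticed until the device becomes available again and is refreshed.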
bbezak I'm wondering if we could look to existed.extra_info for capabilities 13:07
bbezak but maybe that is too narrow 13:07
bbezak thinking 13:07
bbezak :) 13:07
bbezak here https://github.com/openstack/nova/blob/1ad11b13884baeaa6ed9f8f5818f4d176f4d3134/nova/pci/manager.py#L271-L289 13:07
sean-k-mooney you mean merge the values from the db with those from libvirt and have the db ones take precedence while it's in claimed or allocated 13:07
gibi Uggla: here is my eventlet removal summary https://gibizer.github.io/posts/Eventlet-Removal-Flamingo-PTG/ feel free to link to it in your PTG summary mail 13:11
bbezak sth like that 13:11
bbezak I'm not taking account other pci devices that may don 13:11
bbezak I'm not taking into account other pci devices that may not like that 13:12
bbezak I guess 13:12
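
The merge idea discussed above could look roughly like the following sketch: values stored in the db take precedence over what the hypervisor reports while the device is claimed or allocated, and the fresh report wins otherwise. The merge_capabilities function, its parameters, and the status strings are assumptions made for illustration; this is not the code in nova/pci/manager.py.

```python
# Hypothetical set of states in which the host report is considered incomplete.
IN_USE_STATES = ("claimed", "allocated")


def merge_capabilities(db_caps: dict, host_caps: dict, status: str) -> dict:
    """Merge stored and freshly reported capabilities for one device."""
    if status in IN_USE_STATES:
        merged = dict(host_caps)
        # Stored values override the report, since the host cannot see
        # the feature flags of a VF that is attached to a guest.
        merged.update(db_caps)
        return merged
    # Device is free: trust what the host reports right now.
    return dict(host_caps)


# Restart while allocated: the empty report does not erase switchdev.
kept = merge_capabilities({"network": ["switchdev"]}, {}, "allocated")
```

As noted in the chat, this does not consider other kinds of PCI devices for which keeping stale stored values might be the wrong behaviour.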
sean-k-mooney bbezak: sorry on a call 13:27
bbezak no rush! 13:28
sean-k-mooney bbezak: we likely need to experiment with what is the correct approach. 13:29
bbezak yeah 13:45
opendevreview Balazs Gibizer proposed openstack/nova master: split monkey_patching form import  https://review.opendev.org/c/openstack/nova/+/922425 14:07
opendevreview sean mooney proposed openstack/nova master: Remove workaround for ovn live migration  https://review.opendev.org/c/openstack/nova/+/946950 14:11
opendevreview Pranali Deore proposed openstack/nova master: DNM: Test glance new location api  https://review.opendev.org/c/openstack/nova/+/891207 14:28
opendevreview Amit Uniyal proposed openstack/nova stable/2024.2: Libvirt: updates resource provider trait list  https://review.opendev.org/c/openstack/nova/+/932522 15:00
opendevreview Merged openstack/nova stable/2024.2: Libvirt: updates resource provider trait list  https://review.opendev.org/c/openstack/nova/+/932522 22:59
