Tuesday, 2021-11-16

sean-k-mooneyhyang[m]: it's merged, it's not released yet00:59
opendevreviewmelanie witt proposed openstack/nova master: Poison usage of eventlet spawn_n() in tests  https://review.opendev.org/c/openstack/nova/+/81804203:52
*** abhishekk is now known as akekane|home05:12
*** akekane|home is now known as abhishekk05:12
EugenMayerIt seems like the first boot of an instance (with cloud-init) has different results in networking than all the boots after that. Is that intended?07:10
EugenMayeram I right that using --user-data has 2 flavours: with --config-drive true it expects meta-data-like content there, while without a config drive YAML-based cloud-init files are expected? So the same --user-data feeds 2 different subsystems on init?07:47
EugenMayer(reading https://docs.openstack.org/nova/queens/user/config-drive.html)07:48
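For reference, a minimal sketch of the two invocations being compared (image, flavor, and file names are hypothetical):

    # user data served to cloud-init via the metadata service
    openstack server create --image ubuntu --flavor m1.small \
        --user-data cloud-init.yaml vm-1
    # the same user data delivered via a config drive instead
    openstack server create --image ubuntu --flavor m1.small \
        --user-data cloud-init.yaml --config-drive true vm-2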
gibi_o/ morning nova08:03
*** gibi_ is now known as gibi08:03
bauzasgood spec review day, Nova08:40
* gibi already on it08:45
opendevreviewMerged openstack/nova-specs master: Repropose flavour and image defined ephemeral storage encryption  https://review.opendev.org/c/openstack/nova-specs/+/81086709:13
jengbersMorning, I have been searching OpenStack history, but I haven't been able to find why it is not possible to add existing instances to server groups. Does anyone know if that is just never discussed or if there is a fundamental technical problem?09:33
jengbersWe have been adding servers to server groups by changing the database and the scheduler has always handled this well.09:33
gibijengbers: the problem is consistency. When you add a server to a group you can create a situation where the group membership and the actual placement of the instance contradict each other09:38
gibiand the question is what to do then09:38
gibia) move the instance to restore consistency09:39
gibib) reject the addition of the instance to the group09:39
gibic) allow temporary inconsistency and let the next move operation on the instance fix the group 09:40
gibiI think we never agreed which direction to take09:40
sean-k-mooneythere was a proposal to extend it recently to allow this, but the suggestion then was to have the add do a migration, which I don't think is the right approach09:43
sean-k-mooneyit's particularly a problem for the affinity policy since that is more likely to break than the anti-affinity policy09:43
kashyapbauzas: I might not be able to make the meeting today; I see it's at 17u CET :-(09:48
kashyapMorning, BTW09:48
bauzaskashyap: ah ok, no worries then09:48
bauzaskashyap: just add notes to your specless ask in the agenda, so we can discuss there09:49
kashyapbauzas: Yep; doing that now09:50
kashyapbauzas: If there are any questions, you can ask me here, I'll answer when I'm back later in the evening.  I'm away from 17-18u CET09:50
jengbersI guess option b) would be the least surprising.09:52
gibijengbers: with option b) the problem is that the user must first somehow move the instance to the proper place (but the user has no tool for it) and then add it to the group 09:59
opendevreviewMerged openstack/nova-specs master: Store and allow libvirt instance device buses and models to be updated  https://review.opendev.org/c/openstack/nova-specs/+/81023510:00
sean-k-mooneygibi: if you have the correct weigher enabled I believe you can ask for an instance to be created on the same host or a different host to a specific instance via a scheduler hint 10:01
sean-k-mooneyit certainly requires a lot of knowledge of nova and the instance to do correctly10:01
gibisean-k-mooney: if you use SameHost / DifferentHost filters then you don't need server groups 10:02
sean-k-mooneyyou can't, as a normal user, cold migrate to the same host or similarly align them all10:02
sean-k-mooneygibi: ya that is also kind of true10:02
gibithey implement similar logic but a very different way :)10:02
sean-k-mooneyyep10:03
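For reference, a sketch of those scheduler hints, assuming the SameHostFilter/DifferentHostFilter are enabled (server names and the UUID are hypothetical):

    # land the new server on a different host than an existing instance
    openstack server create --image cirros --flavor m1.tiny \
        --hint different_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 vm-b
    # or on the same host
    openstack server create --image cirros --flavor m1.tiny \
        --hint same_host=a0cf03a5-d921-4877-bb5c-86d26cf818e1 vm-c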
sean-k-mooneyi do wish we had ways to make server groups more useful10:03
sean-k-mooneybut it's kind of hard to extend them for the reason above10:03
gibisean-k-mooney: for that we need to solve jengbers' problem and also extend the logic to support multiple groups per instance (or nested groups10:03
gibi)10:04
gibiboth are painfully missing but hard to solve10:04
gibibauzas: I'm done with the spec sweep. I could not really comment on the ironic one https://review.opendev.org/c/openstack/nova-specs/+/815789 and it seems nobody commented yet10:05
gibithe rest of the specs have feedback 10:05
sean-k-mooneyya. on second thought we should have "server aggregates" in parallel to server groups. you know, so we can aggregate servers, give that aggregate of servers a name, and even have some metadata that can be shared, like "this server is the primary of the aggregate", and just not have it related to vm placement at all10:05
bauzasgibi: I still have 3 specs to look at10:05
sean-k-mooneythat way we can pretend server groups dont exist :)10:05
bauzasgibi: but OK, and thanks for the fish10:05
gibisean-k-mooney: :D10:06
jengbersgibi, sean-k-mooney: If it was only possible for admins, that could work, because they can also migrate servers, but for users it seems quite hard.10:14
gibijengbers: yeah that could work. Feel free to propose a spec about the new API to get wider discussion around it 10:15
jengbersOn the other hand, they can power off and start an instance. I guess that would mean it is started on a different hypervisor.10:16
sean-k-mooneyreally there are 2 paths we could take. 1) allow a normal user to ask nova to cold/live migrate an instance to be consistent with a server group that it is currently not a member of, and then allow them to add the server to the group after, rejecting the request if the policy is violated10:20
sean-k-mooneyor 2) we can have the server group add trigger the migration as part of the request10:21
kashyapIs this failing for anyone else too?10:24
kashyap    tempest.api.compute.admin.test_live_migration.LiveAutoBlockMigrationV225Test.test_live_migration_with_trunk [108.238299s] ... FAILED10:24
gibikashyap: could you link the test run?10:25
kashyapgibi: https://zuul.opendev.org/t/openstack/build/632f8ed30e9a4a04a32648843f227ef310:25
jkulikwe've extended the server-groups API downstream to allow adding servers after the fact. we opted not to allow adding servers if this would be against the server-group's rules10:25
gibilooking10:25
gibikashyap: I think you got hit by https://bugs.launchpad.net/neutron/+bug/194042510:27
jkulikit helps customers if they already spawned an instance and forgot the server-group and now want to spawn another instance in some affinity to the existing one10:28
gibithe stack trace is the same10:28
gibijkulik, jengbers: so both of you would like the same behavior, you should team up proposing this upstream :)10:28
kashyapgibi: Oh, thank you10:28
kashyapgibi: Now what?  ... Should I do a "recheck 1940425"?10:29
kashyapOr pray to the ju-ju at the bottom of the sea?  Or...10:29
gibikashyap: yepp, recheck bug 194042510:29
jkulikhttps://github.com/sapcc/nova/commit/7220be3968ee1dd257c9add88228cc5bb9857795 is the main commit downstream for us10:29
jkulikgibi: yes, we talked internally already about proposing this upstream, but small team, much work :/10:30
gibikashyap: I added your run to the bug; maybe that way we can get attention on the failure, as it is still happening10:30
kashyapgibi: Thx for the quick spot10:30
gibijkulik: no pressure, I know that type of frustration 10:31
bauzasjkulik: we tried discussing this upstream in the past, but operators are very afraid of the race conditions it creates10:35
bauzasjkulik: problem is, in a distributed service model like Nova, you can't get a valid answer on whether you can do it, as when you validate, you don't ask the nova-compute service10:36
jkulikbauzas: that reminds me ... we wanted to change the DB to disallow having a server in multiple server-groups to help with races. we haven't done that, yet. thanks :D10:37
bauzasin theory we should hold new instance creations per compute once you ask for adding a new instance to the group10:38
jkulikour problem is a little different, still, as we use VMware and not libvirt. thus, we have a lot of hidden hypervisors as nova only sees the cluster. therefore, hard anti-affinity doesn't really matter for us that much10:40
jkulikcustomers want to make sure they run on different hypervisors and thus we sync the server-groups to the VMware clusters. VMware then migrates VMs around to make sure the rules apply.10:40
jkuliki.e. most of our customers depend on soft-anti-affinity, which is a Weigher in nova-scheduler anyways10:42
kashyapbauzas: Alright, added it to the Open Discussion here: https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting11:43
opendevreviewRajat Dhasmana proposed openstack/nova-specs master: Add spec for volume backed server rebuild  https://review.opendev.org/c/openstack/nova-specs/+/80962111:44
dmitriisgibi: tyvm for the feedback12:10
gibidmitriis: you are welcome. it is a well written spec, thanks for putting in the effort12:10
sean-k-mooneyI don't know if I hit send on my response to the last version of it12:22
sean-k-mooneygibi you are correct, the PCI passthrough filter will be sufficient without the prefilter12:23
sean-k-mooneybut the prefilter can help reduce the set if we only report the new trait on hosts that have off-path devices12:23
sean-k-mooneyand the capability to use them of course12:23
sean-k-mooneyI'll re-review that spec later today12:24
gibisean-k-mooney: yes, exactly my argument, the prefilter is not mandatory but it is good to have12:26
*** mdbooth1 is now known as mdbooth12:53
dmitriissean-k-mooney: ack, ty for confirming13:00
sean-k-mooneyI don't currently have access to hardware to test what you have done, but I may have access before the end of the cycle. if I do I might reach out to you and try and test it end to end, although I don't know if I will have time to do that or not13:02
elodillesbauzas: i'll update now the meeting wiki #stable section if that is not interfering with you right now13:24
dmitriissean-k-mooney: btw, fnordahl and I have done end-to-end testing of this in a lab. Here's a PPA https://launchpad.net/~fnordahl/+archive/ubuntu/smartnic-enablement that was used in the process (the WIP reviews are in use there). It has https://listman.redhat.com/archives/libvir-list/2021-November/msg00431.html included as well - I am trying to get13:26
dmitriissomeone to review it sooner than later.13:26
dmitriisit doesn't yet have the prefilter and compute capability parts that were recently added to the spec but I will work on updating the WIP review soon with that and on raising a relevant os-traits change13:28
dmitriisWe had a VM booted with a floating IP assigned which we then connected to via a router. The flows were offloaded into the ConnectX-6 chip present on BF2.13:30
sean-k-mooneyI think I have pinged that patch to people downstream already but I'll let the virt team know13:31
dmitriisack, tyvm13:32
dmitriissean-k-mooney: besides testing overlays we also tried using VLAN provider networks. That worked as well but the only thing to note there is that collocating VMs with ports attached to overlay networks via PCI devices with the ones that are directly attached to VLAN networks is going to be problematic with the current whitelist based lookup13:33
dmitriisimplementation.13:33
dmitriisentries in the whitelist get a physnet tag (either null for overlay networks or a physnet label)13:34
sean-k-mooneycorrect they do13:34
dmitriisbut there is only one vendor/device id pair  13:34
sean-k-mooneyand technically null was never intended to be supported13:34
sean-k-mooneywe never had a nova feature to support overlays with pci devices13:35
sean-k-mooneythey exploited a lack of null checking and it happened to work13:35
sean-k-mooneydmitriis: anyway, back to your point: why is that problematic?13:36
dmitriissean-k-mooney: heh, yes, I wasn't aware of the history but the hardware offload docs explicitly mention that null needs to be used https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html#configure-nodes-vxlan-configuration13:36
sean-k-mooneydmitriis: yes that was never intended to work13:36
sean-k-mooneybut people now have it in production13:36
sean-k-mooneydmitriis: https://bugs.launchpad.net/nova/+bug/191528213:37
dmitriissean-k-mooney: IIRC PCI requests come with a specific physnet parameter (or null). So when PCI stats are looked at, this parameter is used for lookup13:37
dmitriislet me find that code again13:37
sean-k-mooneyyes we end up passing Python None in the pci request13:37
sean-k-mooneybecause the physnet of a vxlan or geneve network is not set13:38
sean-k-mooneythat will match the null physnet specified in the whitelist 13:38
dmitriis((Pdb)) request13:39
dmitriisInstancePCIRequest(alias_name=<?>,count=1,is_new=<?>,numa_policy=<?>,request_id=c3a87cba-323a-4203-bca7-0916927dcd5b,requester_id='28ea5b12-729c-46b4-b441-518fe786ea10',spec=[{physical_network=None,remote_managed='True'}])13:39
dmitriisI  had something like this ^13:39
sean-k-mooneyyep13:39
sean-k-mooneythat should work13:39
sean-k-mooneythat is the python None13:39
sean-k-mooneynote it's not quoted13:39
sean-k-mooneythat will match physical_network=null13:40
dmitriisah, maybe that's an old note that I have. It was since fixed to use a string13:40
sean-k-mooneyyou have to use "'physical_network':null" not "'physical_network':'null'" in the whitelist13:40
sean-k-mooneylike this passthrough_whitelist={ "vendor_id":"15b3", "product_id":"101e", "physical_network":null }13:41
sean-k-mooneythat enables a ConnectX-6 Dx for overlay networking13:42
sean-k-mooneydmitriis: if you want to have some VFs for vlan/flat and others for geneve tunnels you need to use the address field to partition the VFs into groups 13:43
dmitriissean-k-mooney: I suppose that could be one way to do it13:44
sean-k-mooneydmitriis: this is because tunnels were never meant to be supported at all, so we never implemented a way to allow a device to be part of multiple physnets 13:44
sean-k-mooneydmitriis: if it was not for the fact that this was used in production we would have closed this as a security bug and blocked the use of null; the details are in the bug13:45
dmitriissean-k-mooney: yeah, makes sense. I think that documenting this and suggesting address-based partitioning as a workaround is viable for now13:46
sean-k-mooneydmitriis: the tl;dr is we use a JSON parser to parse the whitelist, and in JSON unquoted null is mapped to the Python None object, which just happens to be what we get when we parse the physnet from networks that don't have one13:46
sean-k-mooneywhich is why physical_network=None in the pci request will actually match13:47
sean-k-mooneysince that is also the Python None object, not the string 'None'13:48
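To see the tl;dr in action, a quick runnable Python check:

    import json
    # unquoted JSON null becomes the Python None object...
    print(json.loads('{"physical_network": null}'))    # {'physical_network': None}
    # ...while a quoted "null" stays a string and does NOT match
    print(json.loads('{"physical_network": "null"}'))  # {'physical_network': 'null'}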
sean-k-mooneyfyi the docs for the whitelist are not great, but in case you don't know, we support both bash-style globs and python regex expressions in the address field13:49
sean-k-mooneyand we support both in either the string or dict form 13:50
sean-k-mooneyhttps://docs.openstack.org/nova/latest/configuration/config.html#pci.passthrough_whitelist has some examples13:50
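A sketch of the address-based partitioning mentioned above, using the documented regex support in the address field (the bus/function ranges and physnet name are hypothetical):

    # VFs on functions 1-3 of 0000:82:00 reserved for a VLAN physnet
    passthrough_whitelist = { "address": {"domain": "0000", "bus": "82", "slot": "00", "function": "[1-3]"}, "vendor_id": "15b3", "product_id": "101e", "physical_network": "physnet1" }
    # VFs on functions 4-6 reserved for overlay (geneve/vxlan) networks
    passthrough_whitelist = { "address": {"domain": "0000", "bus": "82", "slot": "00", "function": "[4-6]"}, "vendor_id": "15b3", "product_id": "101e", "physical_network": null }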
dmitriissean-k-mooney: I recall some other place in Nova where I had to use a string instead (trying to find where so maybe I wrongly brought this up here).13:50
sean-k-mooneythere might be; if you find it let me know and I might know the history, or it might just be a bug13:51
dmitriissean-k-mooney: that's what we used in the lab13:52
dmitriispassthrough_whitelist = [{"vendor_id": "15b3", "product_id": "101e", "physical_network": null, "remote_managed": "true"}]13:52
dmitriisand for physnets: passthrough_whitelist = [{"vendor_id": "15b3", "product_id": "101e", "physical_network": "physnet1", "remote_managed": "true"}]13:53
sean-k-mooneynot at the same time, right?13:53
sean-k-mooneyif you add the address field you could use both, but both look valid to me: the first for geneve and the second for flat/vlan13:54
sean-k-mooneyhuh interesting13:55
sean-k-mooneythe VFs for the ConnectX-6 on the BlueField-2 have the same vendor and product id as a normal ConnectX-613:55
bauzaselodilles: sure, please do it, I'll just update the wikipage after you13:59
dmitriissean-k-mooney: ack on the address field usage.14:06
dmitriissean-k-mooney: I don't have a separate ConnectX-6 at hand but BF2 has ConnectX-6 in it. Let me check the PCI ID DB - I think I've seen different ids but maybe that's for something else.14:07
elodillesbauzas: done, thanks (i might overused the info and link markers o:) feel free to edit :))14:11
bauzaselodilles: ack, thanks14:11
dmitriissean-k-mooney: so the PF is different but VFs look like the ones from a "regular" ConnectX-6.14:13
dmitriisPF:14:13
dmitriis82:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)14:13
dmitriis82:00.0 0200: 15b3:a2d6 (rev 01)14:13
dmitriis        Subsystem: 15b3:006114:13
dmitriisVF:14:13
dmitriis82:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function (rev 01)14:13
dmitriis82:00.3 0200: 15b3:101e (rev 01)14:13
dmitriis        Subsystem: 15b3:006114:13
dmitriisso careful "remote_managed" tagging is needed 14:14
sean-k-mooneyack good to know14:14
sean-k-mooneydmitriis: you can use the address of the PF and the vendor id of the VF to whitelist all the VFs that belong to that PF14:29
sean-k-mooneyjust so you know14:29
sean-k-mooneydmitriis: that behavior is not well known14:29
dmitriissean-k-mooney: didn't know that (not surprisingly), thanks for the info.14:32
dmitriissean-k-mooney: btw, BF2 does bonding at the ARM CPU side transparently to the hypervisor14:33
dmitriisand there's an option to hide the inactive PF for the hypervisor side: https://docs.mellanox.com/display/BlueFieldSWv24011082/BlueField%20Link%20Aggregation14:34
dmitriisthat makes it easier for OpenStack deployers/operators since only one PF needs to be taken into account14:35
dmitriissean-k-mooney: so this is the place where I had to use a string (instead of bool, not None so my reference was not correct) https://review.opendev.org/c/openstack/nova/+/812111/3/nova/network/neutron.py#2295 - that's where a device spec is dynamically generated (not based on flavor or image properties).14:42
sean-k-mooneyI'm on a call but I'll look it up after, thanks14:45
dmitriisack14:47
Adri2000hi, I've got a race condition issue on ussuri and victoria when resizing an instance... specifically this is with /var/lib/nova/instances on NFS, and the following happens sometimes when resizing an instance where a cold migration is triggered: `qemu-img resize` will be run on the new compute node, before the old compute node has fully released the lock on the disk file; this15:03
Adri2000will put the instance in ERROR state. does that ring a bell to anyone?15:03
Adri2000ERROR nova.compute.manager [req-...] [instance: 6ca672fd-8746-441f-bbca-6baa3234bb5e] Setting instance vm_state to ERROR: oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command. Command: qemu-img resize /var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk Exit code: 1 Stdout: ''15:03
Adri2000Stderr: "qemu-img: Could not open '/var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk': Could not open '/var/lib/nova/instances/6ca672fd-8746-441f-bbca-6baa3234bb5e/disk': Permission denied\n"15:03
sean-k-mooneyAdri2000: are you using nfsv315:04
Adri2000sean-k-mooney: `/var/lib/nova/instances type nfs4 (rw,relatime,vers=4.1...`15:04
sean-k-mooneyok, NFSv3 has locking issues; v4.1 improves the situation but I'd recommend v4.2+15:05
sean-k-mooneylyarwood: does ^ seem familiar to you15:05
sean-k-mooneyAdri2000: I believe there are some tunables in the mount options that can be used to help resolve this15:07
sean-k-mooneyAdri2000: are you using raw images?15:11
bauzasgibi: I'll have to hardstop our meeting by 5:50pm our TZ15:11
gibibauzas: ack 15:12
bauzasin case we have to continue discussing, could you be chairing it ?15:12
sean-k-mooneydmitriis: oh there15:13
sean-k-mooneystr(self._is_remote_managed(vnic_type)),15:13
sean-k-mooneydmitriis: ya that makes sense15:13
Adri2000sean-k-mooney: qcow3 images. one nfs option I have currently is local_lock=none, maybe I should look into this one.15:13
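As an aside, a sketch of what moving the mount to NFS v4.2 could look like in /etc/fstab, per the v4.2+ recommendation above (server, export path, and exact option set are hypothetical; verify what your NFS server supports):

    nfs-server:/export/nova  /var/lib/nova/instances  nfs4  rw,relatime,vers=4.2,local_lock=none  0  0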
sean-k-mooneydmitriis: technically the tags are defined to be of type String15:14
sean-k-mooneyso it's a dict of string to string15:14
sean-k-mooneyAdri2000: ack, the reason I asked is that apparently the locking behavior in qemu is different for raw vs qcow15:15
dmitriissean-k-mooney: ack15:15
sean-k-mooneydmitriis: https://github.com/openstack/nova/blob/master/nova/pci/devspec.py#L262-L26315:17
dmitriissean-k-mooney: yep, makes sense15:18
dmitriissean-k-mooney: btw, the ovn-vif repo is now up under ovn-org https://github.com/ovn-org/ovn-vif15:18
sean-k-mooneyyes i saw your comment15:19
sean-k-mooneyjust looking at the code15:19
dmitriisack15:19
sean-k-mooneyam I right in assuming we do not want to allow these devices to be used for flavor-based pci passthrough15:19
dmitriissean-k-mooney: yes, they won't be of much use without being plugged appropriately. Not the VFs at least.15:20
sean-k-mooneyya15:21
sean-k-mooneyI'm wondering if we should explicitly block that15:21
sean-k-mooneyunfortunately I don't see a trivial way to do that15:21
sean-k-mooneyalthough we might already do that15:22
dmitriissean-k-mooney: I guess we could exclude devices from search results if remote_managed is present but not requested15:22
sean-k-mooneyyep15:22
sean-k-mooneyI was just going to provide an example15:22
sean-k-mooneywe do this in other cases already15:23
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/pci/stats.py#L411-L43315:23
sean-k-mooneyThat filters out PFs if you did not ask for one15:23
sean-k-mooneydmitriis: so you can copy-paste https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L520-L53515:24
sean-k-mooneyand then add a new function that will filter out remote-managed devices if not requested15:24
dmitriissean-k-mooney: https://review.opendev.org/c/openstack/nova/+/812111/3/nova/network/neutron.py#229515:24
sean-k-mooneyI did this recently when I added support for vdpa https://github.com/openstack/nova/blob/master/nova/pci/stats.py#L54015:25
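A minimal sketch of what such a filter could look like, modeled on the existing pool filters in nova/pci/stats.py; the function name, tag constant, and exact spec handling are assumptions, not the merged implementation:

    PCI_REMOTE_MANAGED_TAG = 'remote_managed'  # assumed tag name

    def filter_pools_for_unrequested_remote_managed(pools, request_specs):
        """Drop pools of remote-managed devices unless at least one
        request spec explicitly asked for remote_managed devices."""
        requested = any(spec.get(PCI_REMOTE_MANAGED_TAG) == 'true'
                        for spec in request_specs)
        if not requested:
            pools = [pool for pool in pools
                     if pool.get(PCI_REMOTE_MANAGED_TAG) != 'true']
        return pools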
dmitriisactually, I'm explicitly passing remote_managed=False15:25
sean-k-mooneythat won't work15:25
dmitriissean-k-mooney: even with this? https://review.opendev.org/c/openstack/nova/+/812111/3/nova/pci/stats.py#11115:25
sean-k-mooneyit will break existing deployments on upgrade as their existing devices won't have remote_managed=False15:25
sean-k-mooneyand it would only apply to pci requests from ports15:26
sean-k-mooneydmitriis: that would work but we might end up doing a data migration of all existing rows15:26
sean-k-mooneydmitriis: ok, we can review this as part of the code review rather than the spec.15:27
sean-k-mooneyI'm just finishing reading it now and I'll approve it shortly15:27
dmitriissean-k-mooney: ack, I am open to adding a filter as you suggested15:27
sean-k-mooneyeither would work but one involves updating every row in the pci device table with remote_managed=false :)15:28
dmitriisright, I would certainly like to avoid introducing a change that would break with a stale state in PCI stats15:28
sean-k-mooneythe important thing is there is not a gap in the design15:28
dmitriisagreed15:28
*** artom__ is now known as artom15:34
sean-k-mooneydmitriis: ok, I captured some of my thoughts from this conversation in the spec, but +2 +w from me15:43
sean-k-mooneydmitriis: feel free to ping me to review the implementation too. I do not have +2 rights on the code repo but I'll try and spend some time reviewing it end to end next week15:45
dmitriissean-k-mooney: tyvm. I'll try to get the code updated with some of the latest changes by then. Still have to extend func tests to cover more cases but there are some already.15:46
dmitriissean-k-mooney: speaking of other lifecycle operations, I've spent some time looking at the recent VF hot-plug/unplug changes so I may revisit some of the unsupported operations at a later point15:47
bauzasreminder : nova weekly meeting starts in 13 mins here in this #chan15:47
dmitriismaybe we can actually make things like cold migration work, just need to review that further15:47
sean-k-mooneyack15:47
sean-k-mooneydmitriis: it might just work15:48
sean-k-mooneythere is very little on the nova side that will need to be updated15:48
sean-k-mooneyalso for the live migration15:48
dmitriissean-k-mooney: yes, we might need to document the need for extra slots to be added via the new config15:48
dmitriishttps://review.opendev.org/c/openstack/nova/+/545034/16/nova/conf/libvirt.py15:49
sean-k-mooneydmitriis: that is really only needed for q3515:49
sean-k-mooneyand we already have a config to add extra slots in that case15:49
sean-k-mooneyyep that one15:49
dmitriisack15:49
sean-k-mooneythe pc machine type has 24 or 32 pci slots by default15:50
sean-k-mooneyfor q35 the default behavior is to allocate all that are required for your vm +1 free for hotplug15:50
sean-k-mooneyoh...15:51
sean-k-mooneythere might be a bug in sriov live migration with q3515:51
dmitriisFrom the guest OS perspective, the PCI addressing is tied to the virtual PCI topology. Hopefully it is consistent across migration so that device naming doesn't change for the guest while the MAC is reprogrammed anyway.15:52
sean-k-mooneyi did most of my testing with pc, and when I tested with q35 I don't know if I tested with more than one sriov nic15:52
sean-k-mooneydmitriis: we don't guarantee it will be15:52
sean-k-mooneyso it might change15:53
dmitriissean-k-mooney: ah, good to know. Changing PCI addresses will change persistent device names tied to PCI addresses.15:53
sean-k-mooneyyes the way around that is to leverage device role tagging15:54
sean-k-mooneybut really we want qemu/kvm/nvidia to finish implementing live migration support for vdpa15:54
sean-k-mooneyso that we can just leave the vdpa device attached15:54
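For background on the device role tagging sean-k-mooney mentions: a tag set at boot/attach time is exposed to the guest via the metadata API, letting the guest identify the NIC regardless of its PCI address. A hedged sketch of a boot request with a tagged network (UUIDs and the tag name are hypothetical; network tags at boot need compute API microversion 2.42+):

    POST /v2.1/servers
    {
        "server": {
            "name": "tagged-nic-vm",
            "imageRef": "IMAGE_UUID",
            "flavorRef": "FLAVOR_UUID",
            "networks": [{"uuid": "NET_UUID", "tag": "nic1"}]
        }
    }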
sean-k-mooneydmitriis: by the way, at some point we likely need to consider how to support vdpa+bluefield-215:55
sean-k-mooneywe can get the simple version working first however.15:56
dmitriissean-k-mooney: yes, I agree. There are two cases: software and hardware vDPA. For soft vDPA there is an extra agent needed on the hypervisor host.15:56
dmitriisso that definitely has some challenges15:56
sean-k-mooneyI'm hoping we can simply not specify a device_type and rely on remote_managed=True15:56
dmitriisanother interesting area is Scalable Functions (SFs) which rely on mdev and a vendor-specific driver15:56
sean-k-mooneywell maybe not we can see15:57
sean-k-mooneydmitriis: yes i have worked with that in the past15:57
dmitriisit kind of erases the benefits of hardware virtio tbh 15:57
sean-k-mooneyit's not clear whether the mdev-based approach will go to market, at least from the vendor I was working with15:57
sean-k-mooneywell the mdev implementation can be in hardware and present virtio too15:58
sean-k-mooneyit predates the vdpa bus15:58
dmitriisah, in that case, I take it back :^)15:58
whoami-rajatHi, just to be sure the nova meeting is in this channel right?15:59
dmitriisI was also thinking of what CXL would bring and how much churn it will introduce to the existing PCI management implementation in Nova15:59
sean-k-mooneythe prototype I was working on used an FPGA to implement virtio in "hardware" but the long-term plan was to do that in an ASIC. I just don't know if they have pivoted to vdpa now or not, but it was mdev-based at the time15:59
bauzas#startmeeting nova16:00
opendevmeetMeeting started Tue Nov 16 16:00:10 2021 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
gibio/16:00
elodilleso/16:00
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:00
bauzasgood 'day, everyone ;)16:00
whoami-rajatHi16:00
opendevreviewMerged openstack/nova-specs master: Integration With Off-path Network Backends  https://review.opendev.org/c/openstack/nova-specs/+/78745816:00
gmanno/16:01
bauzasI'll have to hardstop working in 45-ish mins, sooo16:01
bauzas#chair gibi16:01
opendevmeetCurrent chairs: bauzas gibi16:01
bauzassorry again16:01
gibiso I will take the rest16:01
* bauzas is a taxi16:01
bauzasanyway, let's start16:02
bauzas#topic Bugs (stuck/critical) 16:02
bauzas#info No Critical bug16:02
bauzas#link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 28 new untriaged bugs (+3 since the last meeting)16:02
bauzas#help Nova bug triage help is appreciated https://wiki.openstack.org/wiki/Nova/BugTriage16:02
bauzasI'm really a sad panda16:02
bauzasin general, I'm triaging bugs on Tuesday, but I forgot about our today's spec review day :)16:03
bauzasso I'll look at the bugs tomorrow16:03
bauzasin case people want to help us, <316:03
bauzasany bug to discuss ?16:03
bauzas#link https://storyboard.openstack.org/#!/project/openstack/placement 33 open stories (+1 since the last meeting) in Storyboard for Placement 16:04
bauzasabout this...16:04
bauzasI tried to find which story was new :)16:04
bauzasbut the last story was already the one I knew16:05
bauzasso, in case people know...16:05
dansmitho/16:05
gibibauzas: if at some point I have time I can try to dig but I'm pretty full at the moment16:06
bauzasalso, Storyboard is a bit... slow, I'd say16:06
bauzasit takes at least 5 secs every time to look at a story16:06
bauzasI mean, for stories, maybe we should use Facebook then ? :p16:07
bauzas(heh, :p )16:07
* bauzas was joking in case people didn't know16:07
bauzasOK, this looks like a bad joke16:08
bauzasmoving on :p16:08
bauzas#topic Gate status 16:08
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:08
bauzasnothing new16:08
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status 16:08
bauzasnow placement-nova-tox-functional-py38 job works again :)16:09
bauzasthanks !16:09
bauzas#topic Release Planning 16:09
bauzas#info Yoga-1 is due Nov 18th #link https://releases.openstack.org/yoga/schedule.html#y-116:10
bauzaswhich is in 2 days16:10
bauzasnothing really to say about it16:10
bauzas#info Spec review day is today16:10
bauzasI think I reviewed all the specs but one (but I see this one was merged ;) )16:10
bauzasthanks for all who already reviewed specs16:11
gibiyeah I think we pushed forward all the open specs 16:11
whoami-rajatSorry if I'm interrupting but I had one doubt regarding my spec16:12
bauzaswe merged 3 specs today16:12
bauzaswhoami-rajat: no worries, we can discuss this spec if you want during the open discussion topic16:12
whoami-rajatack thanks bauzas 16:12
bauzaswhoami-rajat: but what is your concern ?16:12
bauzasa tl;dr if you prefer16:13
bauzasfor other specs, I'll mark the related blueprints accepted in Launchpad by tomorrow16:14
whoami-rajatbauzas, so I'm working on the reimage spec for volume backed instances and we decided to send connector details with the reimage API call and cinder will do the attachment update (this was during PTG), Lee pointed out that we should follow our current mechanism of nova doing attachment update like we do for other operations16:14
bauzasok, if this is a technical question, let's discuss this during the open discussion topic as I said16:15
whoami-rajatsure, np16:15
bauzasok, next topic then16:15
bauzas#topic Review priorities 16:15
bauzas#link  https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement)+label:Review-Priority%252B116:15
bauzas#info https://review.opendev.org/c/openstack/nova/+/816861 bauzas proposing a documentation change for helping contributors to ask for reviews16:16
bauzasgibi already provided some comments on it16:16
bauzasI guess the concern is how to help contributors to ask for reviews priorities like we did with the etherpad16:16
bauzasbut if we have a consensus saying that it is not an issue, I'll stop 16:17
bauzasbut my only concern is that I think asking people to come on IRC and ping folks is difficult so we could use gerrit16:18
gibiwhat is more difficult? Finding the reason for a fault in nova code and fixing it, or joining IRC to ask for review help?16:19
sean-k-mooneywell one you "might" be able to do offline/async16:19
sean-k-mooneythe other involves talking to people, albeit by text16:20
sean-k-mooneyunfortunately those are sometimes non-overlapping skill sets16:20
bauzasgibi: I'm just thinking of on and off contributors that just provide bugfixes16:20
gibidoing code review is talking to people via text :)16:20
bauzasbut let's continue discussing this in the proposal, I don't wanna drag the whole attention by now16:21
sean-k-mooneybauzas: for one-off patches I think the expectation should still be on us to watch the patches come in and help them 16:21
sean-k-mooneyrather than assuming they will use any tools we provide16:21
bauzassean-k-mooney: yeah but then how to discover them ?16:21
bauzaseither way, let's discuss this by Gerrit :p16:22
sean-k-mooneywell if it's a similar time zone I watch for the irc bot commenting on the patches16:22
sean-k-mooneyif I don't recognise it or the name I open it16:22
sean-k-mooneyand then one of us can request the review priority in gerrit or publicise the patch to others16:22
bauzasthat's one direction16:23
sean-k-mooneyif there is something in gerrit I can set, I'm happy to do that on patches when I think they are ready, otherwise I'll just ping them to ye as I do now16:23
bauzaseither way, we have a large number of items for the open discussion topic, so let's move on16:23
sean-k-mooneyack16:24
bauzas#topic Stable Branches 16:24
bauzaselodilles: fancy copy/pasting or do you want me to do so ?16:24
elodilleseither way is OK :)16:24
bauzasI can do it16:25
bauzas#info stable gates' status look OK, no blocked branch16:25
bauzas#info final ussuri nova package release was published (21.2.4)16:25
bauzas#info ussuri-em tagging patch is waiting for final python-novaclient release patch to merge16:25
bauzas#link https://review.opendev.org/c/openstack/releases/+/81793016:26
bauzas#link https://review.opendev.org/c/openstack/releases/+/81760616:26
bauzas#info intermittent volume detach issue: afaik Lee has an idea and started to work on how it can be fixed:16:26
bauzas#link https://review.opendev.org/c/openstack/tempest/+/817772/16:26
bauzasany question ? 16:26
elodillesthanks :)16:26
bauzaslooks like none16:27
bauzas#topic Sub/related team Highlights 16:27
gibithe volume detach issue feels more and more like it's not related to detach16:27
bauzas#undo16:27
opendevmeetRemoving item from minutes: #topic Sub/related team Highlights 16:27
gibithe kernel panic happens before we issue detach16:28
elodillesgibi: true16:28
gibiit is either related to the attach or the live migration itself16:28
gibiI have trials placing sleep in different places to see where we are too fast https://review.opendev.org/c/openstack/nova/+/81756416:28
bauzaswhich stable branches are impacted ?16:28
gibistable/victoria16:28
bauzasubuntu focal-ish I guess ?16:29
bauzasack thanks16:29
elodilles(and other branches as well, but might be different root causes)16:29
gibiI only see kernel panic in stable/victoria (a lot) and one single failure in stable/wallaby16:29
gibiso if there are detach issues in older stable that is either not causing kernel panic, or we don't see the panic in the logs16:30
bauzasI guess kernel versions are different between branches16:30
bauzasright?16:30
bauzascould we somehow try another kernel version for stable/victoria 16:31
bauzas?16:31
gibiwe tested with guest cirros 0.5.1 (victoria default) and 0.5.2 (master default) it is reproducible with both16:31
bauzasack so unrelated16:31
gibithere is a summary here https://bugs.launchpad.net/nova/+bug/1950310/comments/816:31
bauzas#link https://bugs.launchpad.net/nova/+bug/1950310/comments/8 explaining the guest kernel panic related to stable/victoria branch16:32
sean-k-mooneyya the few cases I looked at with you last week were all happening before detach16:32
sean-k-mooneyso it's either the attach or the live migration16:32
gibisean-k-mooney: I have more logs in the runs of https://review.opendev.org/c/openstack/nova/+/817564 if you are interested16:32
sean-k-mooneyi looked downstream at our qemu bugs but didnt see anythign relevent16:32
sean-k-mooneygibi: sure ill try and take a look proably tomorrow16:33
sean-k-mooneybut ill open it in a tab16:33
gibisean-k-mooney: thanks, I will retrigger that patch for a couple times to see if the current sleep before the live migration helps16:33
bauzasa good sleep always helps16:34
bauzas:)16:34
elodilles:]16:34
sean-k-mooneywhen sleep does not work we can also try a trusty print statement16:34
gibisleep is not there as a solution but as a troubleshooting aid, to see at which step we are too fast :D16:35
* sean-k-mooney is dismayed by how many race conditions __don't__ appear when you use print for debugging16:35
gibiand I do have a lot of print(server.console) like statements in the tempest :D16:35
sean-k-mooneyi think we can move on, but it's good you were able to confirm we were attaching before the kernel finished booting16:36
sean-k-mooneyat least in some cases16:36
sean-k-mooneythat at least lends weight to the idea we are racing16:36
bauzasok, let's move on16:37
gibiack16:37
bauzasagain, large agenda todayu16:37
bauzas #topic Sub/related team Highlights 16:37
bauzasdamn16:37
bauzas#topic Sub/related team Highlights 16:37
bauzasLibvirt : lyarwood ?16:37
bauzasI guess nothing to tell16:38
bauzasmoving on to the last topic 16:38
bauzas#topic Open discussion 16:38
bauzaswhoami-rajat: please queue 16:39
whoami-rajatthanks!16:39
bauzas(kashyapc) Blueprint for review: "Switch to 'virtio' as the default display device" -- https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device 16:39
bauzasthis is a specless bp ask16:39
bauzaskashyap said " The full rationale is in the blueprint; in short: "cirrus"  display device has many limitations and is "considered harmful"[1] by  QEMU graphics maintainers since 2014."16:39
bauzasdo we need a spec for this bp or are we OK for approving it by now ?16:40
whoami-rajatso lyarwood had a concern with my reimage spec, we agreed to pass the connector info to reimage API (cinder) and cinder will do attachment update and return the connection info with events payload16:40
gibiI think we don't need a spec this is pretty self contained in the libvirt driver16:40
bauzaskashyap was unable to attend the meeting today16:40
whoami-rajat(in PTG)16:40
sean-k-mooneyi think we are ok with approving it; the main thing to call out is we will be changing it for existing instances too16:40
bauzaswhoami-rajat: please hold, sorry16:40
whoami-rajatoh ok16:40
gibithe only open question we had with sean-k-mooney is how to change the default16:40
gibibut kashyap tested that changing the default during hard reboot does not cause any trouble for guests16:41
gibias the new video dev has a fallback vga mode16:41
bauzasgibi: I'm thinking hard of any potential upgrade implication16:41
sean-k-mooneyright, so when we discussed this before we decided to change it only for new instances to avoid upgrade issues16:41
bauzascorrect16:41
sean-k-mooneyour downstream QE tested this with windows guests and linux guests and both seemed to be ok with the change16:41
bauzasI'm in favor of not touching the running instances16:42
bauzasor asking to rebuild them16:42
gibiwe are not touching the running instance, we only touch hard rebooting instances16:42
sean-k-mooneyso kashyap has implemented this for all instances16:42
bauzasgibi: which happens when you stop/start, right?16:42
gibiright16:42
sean-k-mooneybauzas: yes, as gibi says, it will only take effect when the xml is next regenerated16:42
gibiit happens while the guest is not running16:42
*** akekane_ is now known as abhishekk16:43
gibiit is not an unplug/plug for a running guest16:43
bauzasdo we want admins to opt-in instances ?16:43
bauzasor do we agree it would be done automatically?16:43
sean-k-mooneyit will happen on start/stop, hard reboot, or a non-live move operation16:43
gibibauzas: I trust kashyap that it is safe to change this device 16:44
bauzasdo we also want to have a nova-status upgrade check for yoga about this ?16:44
sean-k-mooneyno16:44
bauzasgibi: me too16:44
sean-k-mooneywhy would we need to16:44
sean-k-mooneywe are not removing support for cirrus16:44
gibiwe don't remove cirrus16:44
sean-k-mooneyjust not the default16:44
gibiyepp16:44
sean-k-mooneygibi: context is downstream it is being removed from rhel 916:45
bauzassean-k-mooney: sure, that just means that long-living instances could continue running cirrus16:45
sean-k-mooneyso we need to care about it for our product16:45
sean-k-mooneyactually cirrus is not being removed in rhel 916:45
sean-k-mooneybut like in rhel 10 16:45
sean-k-mooneybauzas: yep which i think is ok16:46
sean-k-mooneywe could have a nova-status check but it would have to run on the compute nodes16:46
sean-k-mooneywhich is kind of not nice16:46
sean-k-mooneysince it would have to check the xmls16:46
bauzasI know16:46
sean-k-mooneyso I would not add it personally16:46
bauzasI'm just saying that we enter a time that could last long16:47
gibiI agree, we don't need upgrade check16:47
sean-k-mooneyshall we continue this in the patch review16:48
bauzasbut agreed on the fact this is not a problem until cirrus support is removed, and this is not an upstream question16:48
bauzassean-k-mooney: you're right, nothing needing a spec16:48
bauzas#agreed https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device is accepted as specless BP for the Yoga release timeframe16:49
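For operators who want to control the display device explicitly rather than rely on the default, the existing hw_video_model image property already covers both directions; a sketch (the image name is hypothetical):

    # pin the old default on a specific image
    openstack image set --property hw_video_model=cirrus my-image
    # or opt in to virtio ahead of the default change
    openstack image set --property hw_video_model=virtio my-image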
bauzasmoving on16:49
gibi\o/16:49
bauzasnext item16:49
bauzas(kashyapc) Blueprint for review: "Add ability to control the memory used by fully emulated QEMU guests -- https://blueprints.launchpad.net/nova/+spec/control-qemu-tb-cache 16:49
bauzasagain, a specless bp ask16:49
bauzashe said " This blueprint allows us to configure how much memory a  plain-emulated (TCG) VM, which is what OpenStack CI uses.  Recently,  QEMU changed the default memory used by TCG VMs to be much higher, thus  reducing the no. of VMs you TCG could run per host.  Note: the libvirt  patch required for this  will be in libvirt-v7.10.0 (December 2021)."16:49
bauzas" See this issue for more details: https://gitlab.com/qemu-project/qemu/-/issues/693 (Qemu increased memory usage with TCG)"16:49
sean-k-mooneyI'm a little torn on this16:50
sean-k-mooneyI'm not sure I like this being a per-host config option16:50
sean-k-mooneybut it's also breaking existing deployments16:50
sean-k-mooneyso we can't really address that with flavor extra specs or image properties16:51
sean-k-mooneysince it would be a pain for operators to use16:51
gibibut that requires rebuild of existing instances16:51
sean-k-mooneyyep16:51
sean-k-mooneyso with that in mind the config option probably is the way to go16:51
sean-k-mooneyjust need to bear in mind it might change after a hard reboot if you live migrate16:51
gibiyeah, config as a first step, if later more fine grained control is needed we can add an extra_spec16:51
bauzasthere are libvirt dependencies16:52
sean-k-mooneyif we capture the "this should really be the same on all hosts in a region" piece in the docs I'm ok with this16:52
sean-k-mooneybauzas: and qemu deps16:52
bauzasyou need a recent libvirt in order to be able to use it16:52
bauzasright16:52
sean-k-mooneyit's only supported on qemu 5.0+16:52
gibisean-k-mooney: yeah that make sense to document16:52
sean-k-mooneyso we will need a libvirt version and qemu version check in the code16:53
sean-k-mooneywhich is fine we know how to do that16:53
bauzasso, if this is a config option, the docs have to explain which versions you need16:53
sean-k-mooneyyep16:53
bauzaswe would expose something unusable for most people otherwise16:53
sean-k-mooneythe only tricky bit will be live migration16:53
sean-k-mooneyif the destination is not new enough but the source host is16:54
bauzascorrect, the checks ?16:54
sean-k-mooneywe will need to make sure we validate that16:54
bauzasright16:54
bauzasbut this looks to me an implementation detail16:54
bauzasall of this seems not needing a spec, right?16:54
bauzasupgrade concerns are N/A16:54
bauzasas you explicitely need a recent qemu16:55
sean-k-mooneyam, the live migration check will be a little complex, but other than that I don't see a need for a spec16:55
sean-k-mooneyI'm a little concerned about the live migration check, which is what makes me hesitate to say no spec16:55
bauzaswe can revisit this decision if the patch goes hairy16:55
sean-k-mooneyyes16:55
sean-k-mooneythat works for me16:55
gibiworks for me too16:56
sean-k-mooneyI think we have the hypervisor version available in the conductor so I think we can do it without an rpc/object change16:56
bauzas#agreed https://blueprints.launchpad.net/nova/+spec/control-qemu-tb-cache can be a specless BP but we need to know more about the live migration checks before we approve 16:56
bauzasgibi: sean-k-mooney: works for you what I wrote ?16:56
sean-k-mooney+116:57
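A sketch of how the new knob could appear in nova.conf, assuming it lands as tb_cache_size under [libvirt] as the blueprint proposes (the option name, unit, and value here are assumptions):

    [libvirt]
    virt_type = qemu
    # cap the per-guest TCG translation-block cache, e.g. in MiB
    tb_cache_size = 128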
bauzasok,16:57
bauzasnext topic is ganso16:57
bauzasand eventually, whoami-rajat16:57
gansohi!16:57
bauzasganso: you have one min :)16:57
gansoso my question is about adding hw_vif_multiqueue_enabled setting to flavors16:57
gansoit was removed from the original spec16:57
gansohttps://review.opendev.org/c/openstack/nova-specs/+/128825/comment/7ad32947_73515762/#9016:57
gansotoday it can be only used in image properties16:58
gansodoes it make at all semantically or is this something that only makes sense as an image property?16:58
sean-k-mooneyya this came up semi recently16:58
sean-k-mooneyi think we can just add this in the flavor16:58
bauzasthe other way would be a concern to me16:58
gansook. Would this require a spec?16:58
bauzasas users could use a new property16:59
sean-k-mooneywell image properties are for exposing things that affect the virtualised hardware16:59
bauzasbut given we already accept this for images, I don't see a problem with accepting it as a flavor extraspec16:59
sean-k-mooneyso in general you want that to be user-settable16:59
gansogreat17:00
bauzassean-k-mooney: right, I was just explaining that image > flavor seems not debatable while flavor > image seems to be discussed17:00
gansoto me it sounds simple enough to not require a spec, do you agree?17:00
bauzasgood question17:00
bauzasbut we're overtime17:00
sean-k-mooneyhttps://blueprints.launchpad.net/nova/+spec/multiqueue-flavor-extra-spec17:00
sean-k-mooneythis is the implementation https://review.opendev.org/q/topic:bp/multiqueue-flavor-extra-spec17:01
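For context, a sketch of the two ways to request multiqueue: the existing, documented image property versus the flavor extra spec proposed in the linked blueprint (image/flavor names are hypothetical; the extra spec had not merged at the time of this discussion):

    # today: via image property
    openstack image set --property hw_vif_multiqueue_enabled=true my-image
    # proposed: via flavor extra spec
    openstack flavor set --property hw:vif_multiqueue_enabled=true my-flavor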
bauzasganso: whoami-rajat: let's continue discussing your concerns after the meeting17:01
bauzas#endmeeting17:01
opendevmeetMeeting ended Tue Nov 16 17:01:10 2021 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.html17:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.txt17:01
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2021/nova.2021-11-16-16.00.log.html17:01
whoami-rajatack17:01
bauzasI need to leave17:01
sean-k-mooneyganso: stephenfin  was working on this before he moved team last cycle17:01
bauzasganso: about your ask, I'll put the specless bp acceptance to next week17:01
sean-k-mooneyganso: i think we can do it as a specless blueprint17:01
bauzasganso: but we can basically agree on this without waiting for it to be papered17:02
sean-k-mooneyganso: all of the code is there, I just didn't get time to pick it back up after stephenfin moved, so if you want to pick it up please do17:02
* bauzas needs to leave17:02
gansosean-k-mooney, bauzas thank you very much!!17:02
gibiwhoami-rajat: would be nice to have the reimage discussion when lyarwood is present17:03
gibiI don't feel knowledgeable enough in cinder17:03
whoami-rajatgibi, ok, just wanted the team's thoughts on it, can you suggest a time that would be suitable?17:04
gibiwhoami-rajat: try to ping lyarwood tomorrow 17:05
whoami-rajatok17:06
gibiboth bauzas and I were +2 on your spec so a quick chat with lyarwood would be enough17:06
whoami-rajatack, i will fix the gate failure and see what lyarwood thinks about it17:06
opendevreviewDan Smith proposed openstack/nova master: WIP: Revert project-specific APIs for servers  https://review.opendev.org/c/openstack/nova/+/81620617:19
kashyapbauzas: Just back: on the blueprint for changing video model to "virtio": yes, you can trust the test results posted in the change.  As noted, I've got it properly integration-tested for Windows and Linux guests with Red Hat virt QE17:37
kashyapAlso, gibi --^ (Thanks for the trust :))17:37
gibikashyap: :)17:38
kashyapgibi: On context: it is not specific to downstream RHEL9 removing it (as sean-k-mooney phrased it).  *Regardless* of what RHEL9 does it is not a good default.  That's the bare argument.17:38
kashyap"it" == Cirrus, I mean.17:38
kashyapgibi: On the tb-cache thing: as a reminder, it is mostly used by CI setups that can't have KVM.  All sensible production users will use KVM17:39
kashyapUnless they have some need to run emulated-only guests -- because the performance is cripplingly slow compared to hardware-accelerated virt17:40
gibiyeah good point17:40
gibiit is for a specific non production use case17:40
clarkbkashyap: there are production use cases for emulation though. For example docker image builds for different architectures (we do a bunch of that)17:40
clarkbThat doesn't concern nova, but it should be something that qemu/libvirt consider17:41
clarkbbasically the emulation use case shouldn't simply be dismissed17:41
kashyapclarkb: Heya.  Fully agree - that's a valid use-case.  :-)  But I was speaking from a compute-workload point of view: 90% of them are on KVM driver17:42
kashyapclarkb: Enabling cross-arch builds is one of the appealing points, sure.17:42
kashyapclarkb: Although, my use of "sensible production users" is a bit dismissive, I agree.  Sorry :)17:43
sean-k-mooneykashyap: rackspace used to run their public cloud using qemu for x86 on power hardware for a long time17:43
kashyapsean-k-mooney: Sure; but it's also far, far less secure. And upstream QEMU doesn't have any security guarantees17:45
sean-k-mooneykashyap: yep and that is fine for many17:45
sean-k-mooneyespecially if they use selinux/containers to add an extra layer of security around the qemu instance17:46
kashyapsean-k-mooney: Sure; as long as they're aware of it.  I just double-checked with the QEMU folks: they "explicitly *disclaim* any security for TCG"17:46
sean-k-mooneyyes I know16:46
sean-k-mooneyit's in their wiki16:46
kashyapPublic docs: https://qemu-project.gitlab.io/qemu/system/security.html17:47
sean-k-mooneyhttps://www.qemu.org/docs/master/system/security.html#non-virtualization-use-case17:47
kashyapYep.17:47
kashyapsean-k-mooney: Note, though; SELinux/AppArmor can mitigate *some* of the risk, but as the QEMU folks say elsewhere: "depending on the config you can still have *massive* holes you can drive a truck through"  (Cc: gibi, clarkb)17:50
clarkbsure, I'm not saying it is a good idea for production cloud VM usage. But I do think there are valid use cases out there17:51
kashyapAgreed.  I was just tempering the "production cloud w/ TCG" usage point-of-view.  In case any lurkers are observing this conversation, I wanted to plug the security implications here17:52
sean-k-mooneyi don't really think it's a debate; there have been several production large-scale clouds that did run with just qemu17:53
sean-k-mooneydepending on your security model it may or may not be an issue17:53
kashyapAlso Rackspace used to offer Xen too.  Not just plain QEMU.18:01
kashyapI don't want to belabour this point. I wonder who these "largescale clouds" are. Overall, any serious user who wants to run non-toy compute workloads will not use plain emulation.18:02
kashyapAnyhow...time to wrap up the day.18:03
*** tosky is now known as Guest605418:05
*** tosky_ is now known as tosky18:05
*** tosky_ is now known as tosky18:48
daspsean-k-mooney: I opened the BP like you suggested but didn't tag it for yoga properly, so it may have been missed: https://blueprints.launchpad.net/nova/+spec/configurable-no-compression-image-types19:11
sean-k-mooneyam, we will tag it when it's reviewed, but you just need to add it to the meeting agenda by updating the wiki19:12
opendevreviewRodrigo Barbieri proposed openstack/nova master: Add 'hw:vif_multiqueue_enabled' flavor extra spec  https://review.opendev.org/c/openstack/nova/+/79235619:12
daspsean-k-mooney: thanks, done19:17
*** mdbooth5 is now known as mdbooth19:35
opendevreviewArtom Lifshitz proposed openstack/nova master: DNM: Test token expiration during live migration  https://review.opendev.org/c/openstack/nova/+/81777820:19
*** tosky is now known as Guest607022:42
*** tosky_ is now known as tosky22:42
*** tosky is now known as Guest607323:07
*** tosky_ is now known as tosky23:07

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!