Wednesday, 2022-02-02

03:35 <opendevreview> melanie witt proposed openstack/nova master: Add logic to enforce local api and db limits  https://review.opendev.org/c/openstack/nova/+/712139
03:35 <opendevreview> melanie witt proposed openstack/nova master: Enforce api and db limits  https://review.opendev.org/c/openstack/nova/+/712142
03:35 <opendevreview> melanie witt proposed openstack/nova master: Update quota_class APIs for db and api limits  https://review.opendev.org/c/openstack/nova/+/712143
03:35 <opendevreview> melanie witt proposed openstack/nova master: Update limit APIs  https://review.opendev.org/c/openstack/nova/+/712707
03:35 <opendevreview> melanie witt proposed openstack/nova master: Update quota sets APIs  https://review.opendev.org/c/openstack/nova/+/712749
03:35 <opendevreview> melanie witt proposed openstack/nova master: Tell oslo.limit how to count nova resources  https://review.opendev.org/c/openstack/nova/+/713301
03:35 <opendevreview> melanie witt proposed openstack/nova master: Enforce resource limits using oslo.limit  https://review.opendev.org/c/openstack/nova/+/615180
03:35 <opendevreview> melanie witt proposed openstack/nova master: Add legacy limits and usage to placement unified limits  https://review.opendev.org/c/openstack/nova/+/713498
03:35 <opendevreview> melanie witt proposed openstack/nova master: Update quota apis with keystone limits and usage  https://review.opendev.org/c/openstack/nova/+/713499
03:35 <opendevreview> melanie witt proposed openstack/nova master: Add reno for unified limits  https://review.opendev.org/c/openstack/nova/+/715271
03:35 <opendevreview> melanie witt proposed openstack/nova master: Enable unified limits in the nova-next job  https://review.opendev.org/c/openstack/nova/+/789963
07:38 *** hemna8 is now known as hemna
10:10 <opendevreview> Fabian Wiesel proposed openstack/nova master: Transport context to all threads  https://review.opendev.org/c/openstack/nova/+/827467
10:14 <opendevreview> Fabian Wiesel proposed openstack/nova master: VmWare: Remove unused legacy_nodename regex  https://review.opendev.org/c/openstack/nova/+/806336
10:18 <amoralej> hi, it seems devstack on centos9 is failing with some errors related to qemu unplugging volumes
10:19 <amoralej> https://review.opendev.org/c/openstack/devstack/+/827420
10:19 <amoralej> Feb 02 09:04:16.541765 centos-9-stream-ovh-gra1-0028274422 nova-compute[72151]: ERROR oslo_messaging.rpc.server nova.exception.DeviceDetachFailed: Device detach failed for vdb: Run out of retry while detaching device vdb with device alias virtio-disk1 from instance ee94e833-1561-480f-b6c8-0e391833c0c7 from the live domain config. Device is still attached to the guest.
10:19 <amoralej> Feb 02 09:04:16.325611 centos-9-stream-ovh-gra1-0028274422 nova-compute[72151]: ERROR nova.virt.libvirt.driver [None req-7e39fb4f-50b1-4d86-b9ce-7addab205c30 tempest-ServerStableDeviceRescueTest-117889040 tempest-ServerStableDeviceRescueTest-117889040-project] Waiting for libvirt event about the detach of device vdb with device alias virtio-disk1 from instance ee94e833-1561-480f-b6c8-0e391833c0c7 is timed out.
10:19 <amoralej> and similar
10:20 <amoralej> did anyone see this issue? is it a known issue?
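For context, the "Run out of retry" error above comes from a detach pattern like the following: ask the hypervisor to detach, wait for a libvirt device-removed event, and retry a bounded number of times before giving up. This is only a rough sketch of that pattern, not nova's actual implementation; MAX_ATTEMPTS, detach_device and wait_for_detach_event are all hypothetical names.

    import time

    MAX_ATTEMPTS = 8     # hypothetical retry budget
    EVENT_TIMEOUT = 20   # hypothetical seconds to wait for the libvirt event

    class DeviceDetachFailed(Exception):
        pass

    def detach_with_retry(guest, dev_alias, detach_device, wait_for_detach_event):
        """Retry a device detach until the hypervisor confirms removal.

        detach_device sends the detach request; wait_for_detach_event blocks
        until a device-removed event arrives or the timeout elapses, returning
        False on timeout. Both are injected because this is an illustration.
        """
        for attempt in range(1, MAX_ATTEMPTS + 1):
            detach_device(guest, dev_alias)
            if wait_for_detach_event(guest, dev_alias, timeout=EVENT_TIMEOUT):
                return  # device is gone
            time.sleep(attempt)  # back off, then ask again
        # Corresponds to the "Run out of retry while detaching device" error.
        raise DeviceDetachFailed(
            f"{dev_alias} still attached after {MAX_ATTEMPTS} attempts")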
10:37 <opendevreview> Fabian Wiesel proposed openstack/nova master: Vmware: Fix indentation in conditionals  https://review.opendev.org/c/openstack/nova/+/806391
11:44 <MrClayPole> When I live migrate any instance that has more than one volume attached, the Windows VMs sometimes lose access to their volumes and then hang. The logging on the host stops until after they are restarted. I'm not sure how to troubleshoot this further, or whether live migration of instances with more than one disk is supported. Any help would be appreciated.
11:45 <MrClayPole> The VMs in question are using NFS storage via the NetApp cinder driver
12:01 <ralonsoh> gibi, hi, this is about https://bugs.launchpad.net/neutron/+bug/1959749
12:01 <ralonsoh> please check https://review.opendev.org/c/openstack/neutron/+/468982/5/neutron/services/qos/drivers/manager.py
12:02 <ralonsoh> "openstack network qos rule type list" will return only those rule types accepted by all loaded mech drivers
12:02 <ralonsoh> You can create a rule of any type, but you won't be able to assign it
12:08 <gibi> ralonsoh: do you mean assigning it to a port?
12:08 <ralonsoh> if the port is bound
12:08 <ralonsoh> because that means you'll call the mech driver qos extension
12:09 <opendevreview> Stephen Finucane proposed openstack/nova master: docs: Follow-ups for cells v2, architecture docs  https://review.opendev.org/c/openstack/nova/+/827336
12:10 <gibi> ralonsoh: I have a port with a qos policy that uses the min pps rule type and I can boot a VM with that port and the port is bound successfully
12:11 <ralonsoh> gibi, in your deployment, what backend are you using?
12:11 <ralonsoh> or backends
12:12 <gibi> ovs and sriov
12:12 <opendevreview> Dmitrii Shcherbakov proposed openstack/nova master: Introduce remote_managed tag for PCI devs  https://review.opendev.org/c/openstack/nova/+/824834
12:12 <opendevreview> Dmitrii Shcherbakov proposed openstack/nova master: Bump os-traits to 2.7.0  https://review.opendev.org/c/openstack/nova/+/826675
12:12 <opendevreview> Dmitrii Shcherbakov proposed openstack/nova master: [yoga] Add support for VNIC_TYPE_SMARTNIC  https://review.opendev.org/c/openstack/nova/+/824835
12:12 <opendevreview> Dmitrii Shcherbakov proposed openstack/nova master: Filter computes without remote-managed ports early  https://review.opendev.org/c/openstack/nova/+/812111
12:12 <gibi> ovs supports the min packet rate rule
12:12 <ralonsoh> gibi, yes, right, as I was guessing
12:12 <ralonsoh> the problem is how we build the rule type set, as I commented in the bug
12:13 <gibi> looking...
12:13 <ralonsoh> we return the intersection of all mech driver supported types
12:13 <ralonsoh> instead of returning the union
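To make the intersection-vs-union point concrete, here is a minimal sketch (not neutron's actual manager.py code) with hypothetical per-driver capability sets:

    # Hypothetical per-driver supported rule types, for illustration only.
    SUPPORTED_RULE_TYPES = {
        "ovs": {"bandwidth_limit", "minimum_bandwidth", "minimum_packet_rate"},
        "sriov": {"bandwidth_limit", "minimum_bandwidth"},
    }

    def advertised_rule_types(drivers):
        """Intersection: only what every loaded driver supports (current behaviour)."""
        return set.intersection(*(SUPPORTED_RULE_TYPES[d] for d in drivers))

    def all_rule_types(drivers):
        """Union: what any loaded driver supports (what the bug asks for)."""
        return set.union(*(SUPPORTED_RULE_TYPES[d] for d in drivers))

    print(advertised_rule_types(["ovs", "sriov"]))
    # {'bandwidth_limit', 'minimum_bandwidth'} -- minimum_packet_rate disappears
    print(all_rule_types(["ovs", "sriov"]))
    # includes 'minimum_packet_rate', matching the port gibi could boot with

This is why a min pps policy can work on an ovs-backed port even though "openstack network qos rule type list" never shows the type while the sriov driver is also loaded.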
12:13 <sean-k-mooney> gibi: ovn does not support qos fully yet, I think
12:13 <gibi> sean-k-mooney: yes, I use ovs :)
12:13 <sean-k-mooney> so if you are using the new default it might not work unless you reverted to ml2/ovs
12:13 <sean-k-mooney> ok
12:13 <gibi> sean-k-mooney: I reverted, yes :)
12:14 <ralonsoh> sean-k-mooney, we do support qos in OVN
12:14 <ralonsoh> fully
12:14 <gibi> ralonsoh: ohh, so because I have sriov I cannot see the min pps
12:14 <ralonsoh> gibi, right
12:14 <gibi> ralonsoh: let me check that in another devstack that only has ovs but not sriov
12:14 <ralonsoh> I think this should be reconsidered in the API
12:14 <sean-k-mooney> ralonsoh: that's new this cycle, right?
12:14 <gibi> ralonsoh: I agree
12:14 <ralonsoh> I'll push a patch today
12:14 <gibi> ralonsoh: it sounds incorrect that I can boot with a qos rule but the rule type list does not show it
12:14 <gibi> ralonsoh: thank you!
12:15 <ralonsoh> sean-k-mooney, that was supported since wallaby
12:15 <ralonsoh> and in D/S in OSP16
12:15 <sean-k-mooney> oh ok
12:16 <sean-k-mooney> ralonsoh: this is not the first time this api design choice has come up
12:16 <ralonsoh> sean-k-mooney, yeah... I think the current implementation is wrong
12:16 <sean-k-mooney> well, I meant in general
12:17 <sean-k-mooney> neutron also has the same problem with vlan transparency
12:17 <sean-k-mooney> to work around that, vlan transparency was set to yes for sriov
12:17 <sean-k-mooney> even though it really does not support it properly
12:18 <ralonsoh> the aim of the API is to return only what is supported by all drivers
12:18 <sean-k-mooney> I think in general neutron needs to list the capability per ml2 driver
12:18 <ralonsoh> for example: https://review.opendev.org/q/3299cdffae5cd7196a1676da103da5e2e413ec21
12:18 <sean-k-mooney> ralonsoh: ya, I know, but that has never felt useful to me
12:18 <ralonsoh> it was changed before and then reverted
12:18 <sean-k-mooney> the api should ideally list the qos policies per driver
12:18 <ralonsoh> sean-k-mooney, then what we can do is to create another API call
12:19 <ralonsoh> return all_supported_qos_types
12:19 <ralonsoh> or something similar
12:19 <sean-k-mooney> perhaps
12:20 <ralonsoh> I'll propose a new API
12:22 <sean-k-mooney> I still think having the new api return a dictionary keyed by either the driver or vnic-type, with the supported qos policies as the values, would be the way to organise that api, but just adding a property to the existing one to list all is less work
12:23 <ralonsoh> I think we can do this with the current API, just adding a new parameter to the CLI call
12:23 <sean-k-mooney> the issue I have with all_supported_qos_types is that you can't tell if a port you create will work with any specific policy
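As a rough illustration of the response shape sean-k-mooney is suggesting (purely hypothetical, not an existing neutron API):

    # Hypothetical per-driver rule-type listing a new API could return.
    rule_types_by_driver = {
        "ovs": ["bandwidth_limit", "minimum_bandwidth", "minimum_packet_rate"],
        "sriov": ["bandwidth_limit", "minimum_bandwidth"],
    }
    # Unlike a flat all_supported_qos_types list, this lets a client check
    # whether the backend a port will bind to actually accepts a given policy.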
12:23 <sean-k-mooney> ralonsoh: actually, at the end of the day what we really need is scheduler support
12:23 <ralonsoh> yes
12:24 <gibi> minimum pps / bw has scheduler support :)
12:24 <sean-k-mooney> neutron needs to use traits to model which hosts support which policies, and nova needs to schedule the port to such a host based on the request
12:24 <sean-k-mooney> gibi: it does, but I'm thinking of dscp etc.
12:24 <gibi> yeah, for dscp it is a different game
12:24 <sean-k-mooney> e.g. the non-quantitative qos
12:24 <sean-k-mooney> it's a fair point about min*
12:25 <sean-k-mooney> those are ahead of the game
12:25 <stephenfin> sean-k-mooney: Care to finally get these in? It's only been 16 months :) https://review.opendev.org/c/openstack/nova/+/705792/ https://review.opendev.org/c/openstack/nova/+/754448/
12:25 <gibi> and supporting the non-quantitative ones is problematic due to the resourceless request group problem in placement
12:26 <sean-k-mooney> stephenfin: I'm looking at https://review.opendev.org/c/openstack/nova/+/814562 now but I can look at them after
12:26 <stephenfin> Okay, sweet. ty :)
12:26 <sean-k-mooney> oh, the neutron refactoring, ya I'll review those too
12:35 <sean-k-mooney> ah, you have a follow-up for the cells doc, cool. I was going to ask if you were doing a new revision
12:44 *** dasm|off is now known as dasm|rover
12:56 *** amoralej is now known as amoralej|lunch
13:12 <gibi> ralonsoh: I'm not sure I understand the reason for the wontfix on https://bugs.launchpad.net/neutron/+bug/1959749
13:16 <gibi> ralonsoh: your #1 point is what I would need to work, so that the rule type list returns all rule types, not just the rule types that are supported by every configured driver
13:32 <ralonsoh> gibi, sorry, I don't know why I set this flag
13:32 <ralonsoh> confirmed, this is the correct value
13:32 <gibi> ralonsoh: that is better, thanks :)
13:48 <admin1> hi guys .. is there a way to "transfer ownership" of an instance from one project to another?
14:01 *** amoralej|lunch is now known as amoralej
14:04 <opendevreview> yuval proposed openstack/nova master: Lightbits LightOS driver  https://review.opendev.org/c/openstack/nova/+/821606
14:58 <opendevreview> Dmitrii Shcherbakov proposed openstack/nova master: Document remote-managed port usage considerations  https://review.opendev.org/c/openstack/nova/+/827513
16:45 <gibi> if somebody wants a change of (code) scenery then I can suggest looking at the placement code review series to support any-traits queries in microversion 1.39. The series starts here https://review.opendev.org/c/openstack/placement/+/825846/3 :)
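For anyone curious what the any-traits support looks like on the wire, a minimal sketch of a 1.39 request (the endpoint and raw token are hypothetical placeholders; real clients authenticate via keystoneauth):

    import requests

    PLACEMENT = "http://placement.example.com"  # hypothetical endpoint
    HEADERS = {
        "X-Auth-Token": "<token>",
        "OpenStack-API-Version": "placement 1.39",
    }

    # The in: prefix added by the series means "any of these traits",
    # whereas pre-1.39 every trait listed in required= had to be present.
    resp = requests.get(
        f"{PLACEMENT}/resource_providers",
        params={"required": "in:HW_NIC_OFFLOAD_GENEVE,HW_NIC_OFFLOAD_VXLAN"},
        headers=HEADERS,
    )
    print(resp.json())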
17:01 <opendevreview> Merged openstack/nova master: docs: Add a new cells v2 document  https://review.opendev.org/c/openstack/nova/+/814562
17:15 <melwitt> gibi: I will look at some new code scenery :)
17:19 *** lbragstad9 is now known as lbragstad
17:27 <dmitriis> gibi, sean-k-mooney: mostly been getting unrelated gate failures, so I am waiting for some rechecks to complete. I made a functional change to the patch that introduces the remote_managed tag here https://review.opendev.org/c/openstack/nova/+/824834/8/nova/pci/devspec.py#322 to include a check for the presence of a serial number when a device is
17:27 <dmitriis> tagged as remote_managed, which is something I overlooked in the previous iteration, and updated testing to reflect that.
17:28 <dmitriis> I started working on the docs and started a doc review, but most of the docs will be in Neutron under the OVN driver guide, similar to how it's done today with OVS hardware offload.
17:42 <gibi> melwitt: thanks! :)
17:43 <gibi> dmitriis: I will read back tomorrow, I have to go now
17:43 <dmitriis> gibi: np, thanks a lot for the help so far
18:08 *** amoralej is now known as amoralej|off
18:13 <sean-k-mooney> dmitriis: ack
18:14 <sean-k-mooney> dmitriis: I'm looking at some downstream stuff currently but I'll try to take a look, probably tomorrow at this point
18:14 <sean-k-mooney> dmitriis: most of the docs make sense for the neutron guide, but we should detail how to use the remote managed flag etc. in nova
18:16 <dmitriis> sean-k-mooney: ack, ta.
18:16 <dmitriis> sean-k-mooney: I currently describe some of it in the latest doc change and reference the option docstring, but I can expand the description in the docs themselves too.
18:18 <opendevreview> Ilya Popov proposed openstack/nova master: Fix to implement 'pack' or 'spread' VM's NUMA cells  https://review.opendev.org/c/openstack/nova/+/805649
18:34 <opendevreview> melanie witt proposed openstack/nova master: Raise InstanceNotFound on fkey constraint fail saving info cache  https://review.opendev.org/c/openstack/nova/+/826942
18:51 <sean-k-mooney> o/ are we tracking the failure of tempest.api.compute.servers.test_device_tagging.TaggedAttachmentsTest.test_tagged_attachment
18:51 <sean-k-mooney> as a potential gate issue anywhere?
18:52 <sean-k-mooney> I'm seeing that fail more and more on reviews over the last 2 weeks
18:52 <sean-k-mooney> it's like it knew lyarwood was starting on kubevirt this week :)
18:53 *** artom__ is now known as artom
18:54 <sean-k-mooney> so this could be q35 related
18:54 <artom> Yeah, repeating what I said downstream... it's not even a Tempest race or whatever, it's the guest itself. Is this the q35 problem again? Surely we'd see other tests fail in that job, unless nova-next doesn't do any other device attachment tests, which would be... weird
18:54 <sean-k-mooney> but I'm not sure about that
18:54 <sean-k-mooney> it could be that the volume is not attached fully yet
18:54 <sean-k-mooney> i.e. the series from lee to wait for the vm to be pingable might help
18:54 <sean-k-mooney> but in this case the test is sshing into the vm
18:55 <sean-k-mooney> to check the tag is there, right?
18:55 <sean-k-mooney> the failure message at the top level is Details: Timeout while verifying metadata on server.
18:55 <artom> No, there are definitely other tests that attach stuff that pass
18:56 <sean-k-mooney> so the test is doing  Remote command: set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; curl http://169.254.169.254/openstack/latest/meta_data.json
18:56 <artom> Mind you, they may not be SSH'ing into the guest?
18:56 <sean-k-mooney> by sshing into the guest
18:56 <sean-k-mooney> and that ssh connection is, well, connecting
18:57 <artom> So it's curl that's timing out?
18:57 <sean-k-mooney> I'm looking at https://zuul.opendev.org/t/openstack/build/d836724c364843e98bf893ac71574828
18:57 <sean-k-mooney> no, I think the curl command is working
18:57 <sean-k-mooney> it's doing it in a loop
18:57 <sean-k-mooney> but by the time the test completes the data is not in the metadata service
18:58 <sean-k-mooney> but that could be related to the attach taking a long time
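What the test loop boils down to is polling meta_data.json from inside the guest until the tagged device shows up, or timing out. A hedged sketch of that (the ssh_run callable and timeout values are made up; the devices/tags fields follow the device-tagging metadata format):

    import json
    import time

    METADATA_URL = "http://169.254.169.254/openstack/latest/meta_data.json"

    def wait_for_tagged_device(ssh_run, tag, timeout=300, interval=5):
        """Poll the metadata service in the guest until a device carries tag.

        ssh_run is a hypothetical callable that runs a shell command in the
        guest over SSH and returns its stdout, as tempest does above.
        """
        deadline = time.time() + timeout
        while time.time() < deadline:
            out = ssh_run(f"curl -s {METADATA_URL}")
            devices = json.loads(out).get("devices", [])
            if any(tag in dev.get("tags", []) for dev in devices):
                return True
            time.sleep(interval)
        return False  # surfaces as "Timeout while verifying metadata on server"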
19:00 <artom> It's not 100% though, so whatever we try will have to be rechecked at least a few times
19:01 <sean-k-mooney> ya, I don't know, it's just gotten flaky recently
19:01 <sean-k-mooney> no clear reason why
19:02 <sean-k-mooney> and as you say it's not 100%, so it's hard to tell why
19:11 <artom> I wonder if we should wait for volume and interface attach before carrying on
19:11 <artom> Like, we're somehow "confusing" the guest by issuing two device attach commands in quick succession
19:11 <artom> I realize how non-engineery that sounds
19:22 <opendevreview> Artom Lifshitz proposed openstack/nova master: DNM: Testing change to test_tagged_attachment in tempest  https://review.opendev.org/c/openstack/nova/+/827549
19:22 <artom> ^^ we'll see
19:51 <sean-k-mooney> artom: we probably should, although I think we have an instance level lock at the compute manager so only one attachment can happen at a time
19:51 <sean-k-mooney> we do for 2 interfaces or volumes, but not sure about one of each
19:52 <sean-k-mooney> so ya, let's see if that helps
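On the locking point, a minimal sketch of serializing attachments per instance with oslo.concurrency (illustrative only; whether nova actually takes one shared lock across volume and interface attaches is exactly the open question here):

    from oslo_concurrency import lockutils

    def attach_device(instance_uuid, do_attach):
        # One lock name per instance: volume and interface attaches would
        # both need to use this same name to be mutually exclusive.
        with lockutils.lock(f"instance-attach-{instance_uuid}"):
            do_attach()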
19:54 <sean-k-mooney> o/ all, chat to ye tomorrow
19:54 <artom> It's a shot in the dark to serve as a data point
21:19 <admin1> hi .. is there a nova command to see all ongoing migrations?
21:23 <mloza> hello, need help figuring out what would cause nova to delete a port on an instance
21:23 <mloza> I see this in the logs `Creating event network-vif-deleted`
21:23 <mloza> here's the full logs https://paste.openstack.org/raw/bRTqxJGwy3WxAPGVEPo9/
21:52 <admin1> openstack compute service delete 56 => Unable to delete compute service that has in-progress migrations. .. How do I check these migrations?
22:01 *** dasm|rover is now known as dasm|off
22:08 <melwitt> admin1: https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server-migration.html#server-migration-list
22:10 <admin1> melwitt, this cluster is still in rocky .. so that command is not there
22:10 <admin1> we are in the process of upgrading it .. so migrating, delete compute, reinstall and upgrade, add it back
22:11 <melwitt> oh I see
22:11 <opendevreview> melanie witt proposed openstack/nova master: Raise InstanceNotFound on fkey constraint fail saving info cache  https://review.opendev.org/c/openstack/nova/+/826942
22:18 <admin1> nova migration-list lists the migrations, but there is nothing pending or ongoing
22:23 <melwitt> you could try the force or abort commands on the migrations if they are leftover https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-live-migration-force-complete
22:23 <melwitt> https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-live-migration-abort
22:34 <admin1> i found the instance.. the instance is in pre-migrating status .... when i enter the command nova live-migration-abort $UUID $ID, it says Instance $UUID is in an invalid state for 'abort live migration'
22:34 <admin1> how do I abort a pre-migrating status?
22:35 <admin1> server show status is Running .. so the instance does not have a pre-migrating status
22:38 <melwitt> did you try the force complete too?
22:40 <admin1> yeah .. it gave Instance $UUID is in an invalid state for 'force_complete'
22:41 <admin1> maybe i can just delete the compute service from the db?
22:41 <admin1> and add it again ..
22:41 <admin1> want to do it properly though ..
22:44 <admin1> melwitt, https://gist.github.com/a1git/9a975b96cd91da4683084c7df3220530
22:45 <admin1> basically the way i have been upgrading from rocky (xenial) -> bionic is .. for example, migrate all instances from A to B and empty A, then compute service delete A, reinstall A with bionic, same hostname, install nova .. and then repeat the process again ..
22:45 <admin1> but for some reason, this one is locked/blocked ..
22:46 <melwitt> +1 going into the db is the last resort if the proper tools don't work
22:48 <admin1> i think 'pre-migrating' is blocking the deletion .. and I have found nothing to force this to either error or completed
22:51 <admin1> i did a mysqldump and i found the pre-migrating status in only 1 table .. nova.migrations ..
22:53 <melwitt> I think you are right that it's the migration(s) blocking the service delete
22:54 <admin1> update migrations set status='error' where instance_uuid=$UUID and status='pre-migrating';
22:54 <admin1> and then the delete worked fine :) .. openstack compute service delete 56
22:56 <melwitt> I was just about to suggest that, change the status rather than deleting the migration record :)
23:03 <admin1> how to delete a host from the placement service?
23:04 <melwitt> hm, the service delete should have done that
23:05 <melwitt> it's the host you deleted the service for, right?
23:06 <admin1> ResourceProviderCreationFailed: Failed to create resource provider h1
23:07 <admin1> openstack compute service list -- it gets added there ..
23:08 <admin1> i tried via "openstack compute service delete $id" and in a 2nd attempt nova service-delete $uuid
23:08 <admin1> 4 other servers, no issues .. this one hypervisor = this and that errors :)
23:08 <melwitt> this is the cli for the placement service https://docs.openstack.org/osc-placement/latest/cli/index.html#resource-provider-delete
23:10 <melwitt> just be careful and make sure it's not associated with any instance allocations
23:10 <melwitt> before deleting it
23:10 <admin1> there are no instances
23:10 <melwitt> k
23:10 <admin1> all instances were migrated, this is a new install which just came up
23:10 <melwitt> ok cool
23:10 <admin1> no such command in rocky
23:11 <admin1> this upgrade is necessary to upgrade from rocky .. as stein needs 18.04
23:11 <admin1> if i need to delete a hypervisor, isn't it just deleting the compute service?
23:12 <melwitt> yeah, but it does a few things: deletes the nova service record, the nova compute_nodes record, and then the placement resource provider
23:12 <melwitt> and the last step has (probably) failed in your case from the previous delete
23:13 <melwitt> so it sees a duplicate placement resource provider and refuses to create it. so you are correct, you need to delete it so it can recreate it again
23:13 <melwitt> as for the command, you could upgrade openstackclient to newer than rocky and it will still work with rocky
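For the record, the osc-placement command melwitt links wraps a small set of placement REST calls; roughly the following (a hedged sketch with a hypothetical endpoint and raw token, where real tooling authenticates via keystoneauth):

    import requests

    PLACEMENT = "http://placement.example.com"  # hypothetical endpoint
    HEADERS = {"X-Auth-Token": "<token>",
               "OpenStack-API-Version": "placement 1.17"}

    # 1. Find the orphaned resource provider by hypervisor name.
    rps = requests.get(f"{PLACEMENT}/resource_providers",
                       params={"name": "h1"}, headers=HEADERS).json()
    rp_uuid = rps["resource_providers"][0]["uuid"]

    # 2. Check that no instance allocations still point at it.
    allocs = requests.get(
        f"{PLACEMENT}/resource_providers/{rp_uuid}/allocations",
        headers=HEADERS).json()
    assert not allocs["allocations"], "still has allocations, do not delete"

    # 3. Delete it so the freshly installed nova-compute can recreate it.
    requests.delete(f"{PLACEMENT}/resource_providers/{rp_uuid}",
                    headers=HEADERS)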
23:18 <admin1> sudo apt install python3-openstackclient => python3-openstackclient is already the newest version (5.6.0-0ubuntu1). ... no such command
23:19 <melwitt> yeah, I'd try pip installing it in a venv or something
23:23 <admin1> 5.6.0 is the latest version, even via pip install
23:23 <admin1> in a venv
23:23 <admin1> and it does not have placement
23:23 <melwitt> oh ugh, sorry, you need the osc-placement package
23:23 <melwitt> it's an openstackclient plugin
23:24 <admin1> hmm.. how to install it?
23:24 <melwitt> I think you can just apt install it
23:24 <admin1> found it
23:27 <admin1> i will try tomorrow on this ..
23:27 <admin1> melwitt, many thanks .. will report back tomorrow
23:28 <melwitt> ok yw o/
