Friday, 2022-09-30

opendevreviewAmit Uniyal proposed openstack/nova master: Adds check if resized to swap zero  https://review.opendev.org/c/openstack/nova/+/85733906:48
amorinhey nova team, if you have time someday to review this: https://review.opendev.org/c/openstack/nova/+/85368207:01
UgglaPoke bauzas --> https://photos.app.goo.gl/MRmeetQr749GFh3s5   ;)07:15
bauzasdamn fucking numpy07:31
bauzasI should have prepared better07:31
bauzasand now I have to wait until Feb 2023 IIRC07:32
bauzasgraaaah07:32
bauzasoh no, actually after Mar 29th :(07:34
Ugglagibi, bauzas could you have a look at https://review.opendev.org/c/openstack/nova/+/85435509:15
bauzasouch.09:17
bauzasUggla: what do you mean by "soft delete is deprecated ?"09:17
Ugglabauzas, This is mentioned and checked in a test that we should not create soft delete objects.09:18
bauzasI'd say this isn't recommended, true09:19
bauzasbut "deprecated" seems harsh to me09:19
bauzaslike, if you signal that instances won't be soft deleted, it would be an operators revolution09:20
Ugglabauzas, https://opendev.org/openstack/nova/src/commit/adeea3d5e7d7337d2817dd5c46334c76c05995ef/nova/tests/unit/db/test_models.py#L2109:21
UgglaI have just reused the wording from L54.09:22
Ugglabut I can change the wording if you wish.09:23
gibiUggla: done09:29
gibibauzas: what we want to say is that no new ovo can be created with soft delete support09:30
gibiand this is enforced by a test case as Uggla noted above09:30
bauzasok, the wording seems weird to me but when I read the original commit msg, this is saying the same09:31
bauzasanyway09:31
bauzasI knew about new tables09:31
Ugglagibi, thx09:33
ralonsohgibi, hey, do you know how to book the operator-hour? 09:37
ralonsohI tried "#operator-hour-neutron book icehouse-FriB1"09:37
ralonsohbut I don't want to break anything testing other commands09:37
gibiralonsoh: I did not try to book it. bauzas, did you book it?09:38
ralonsohI did for neutron09:38
ralonsohbut not for the operator-hour09:38
bauzasralonsoh: gibi: yeah I did it for the nova one09:38
bauzasralonsoh: but you need to ask the infra folks to add you as the neutron owner09:38
bauzassec09:38
ralonsohbauzas, I am09:39
ralonsohI already booked the neutron slots09:39
bauzashttps://lists.openstack.org/pipermail/openstack-discuss/2022-September/030301.html09:39
bauzasyou need to register the new track09:39
ralonsohahhhhh09:39
ralonsohbauzas, thanks a lot09:39
bauzasnp09:39
bauzasralonsoh: that reminds me, I guess we will want a x-p session at the PTG between nova and neutron09:40
bauzasso, let's just ask our folks if they want09:40
bauzasand we'll try to find a slot09:40
ralonsohI think so, mainly for live migration09:40
ralonsohdo you have topics for it?09:41
bauzasralonsoh: none for the moment AFAICS09:42
ralonsohbauzas, I'll send a mail today09:42
ralonsohwith a section in our etherpad09:42
bauzasOK09:42
auniyalHello #openstack-qa 10:23
auniyalHello #openstack-nova 10:23
auniyalI added a new flavor in here - https://opendev.org/openstack/nova/src/commit/aad31e6ba489f720f5bdc765c132fd0f059a0329/nova/tests/fixtures/nova.py#L73110:24
auniyalso updated this as well - https://opendev.org/openstack/nova/src/branch/master/nova/tests/functional/api_sample_tests/api_samples/flavors/v2.75/flavors-list-resp.json.tpl10:24
auniyalis there any other place I should update ?10:24
auniyalright now, some functional tests are failing here - https://opendev.org/openstack/nova/src/commit/aad31e6ba489f720f5bdc765c132fd0f059a0329/nova/tests/functional/api_samples_test_base.py#L39910:25
auniyalmy added functional tests are passing, but existing tests are failing10:28
auniyalthese - https://opendev.org/openstack/nova/src/branch/master/nova/tests/functional/api_sample_tests/test_flavors.py10:29
auniyalgibi, bauzas ^^10:31
stephenfinbauzas: Does https://review.opendev.org/c/openstack/python-novaclient/+/816158 still need to be done (and backported)?10:38
opendevreviewEigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization  https://review.opendev.org/c/openstack/nova-specs/+/85885811:05
opendevreviewEigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization  https://review.opendev.org/c/openstack/nova-specs/+/85885811:18
obregibi, sean-k-mooney: Some clarifications and changes added ^^. Please have a look when time permits. As for workflow I wonder about the comments: Should I mark comments as resolved when I answer them to signal that they are answered; or should you mark them as resolved to signal that my answer is accepted?11:18
fricklerhas it ever been discussed to build some mechanism to allow instances to store their SSH host key in nova, so that users could get a certified instance/hostkey association from the API? rough idea would be to allow a POST to a special metadata API endpoint, which could be authenticated like other metadata accesses11:56
frickleronce that is done, one could extend neutron-designate integration to generate SSHFP records in addition to A+PTR11:58
gibiobre: mark them resolved if you think you resolved them. We can always reopen them if we think further discussion is needed12:06
opendevreviewEigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization  https://review.opendev.org/c/openstack/nova-specs/+/85885812:17
opendevreviewAmit Uniyal proposed openstack/nova master: Adds check if resized to swap zero  https://review.opendev.org/c/openstack/nova/+/85733913:15
bauzasstephenfin: good call about the client patches, I need to take a look at them13:34
bauzasfrickler: sounds to me something not nova-related13:35
bauzasie. you want some datastore to persist a tuple (instance, hostkey)13:35
fricklerbauzas: I think the verification of the instance ID would need to come from nova, else it could be spoofed from a different host13:39
*** dasm|off is now known as dasm13:39
fricklersome hacky workaround would be to have the instance output the key to the console log and just parse it from there, but that sounds a bit flaky13:40
bauzasfrickler: maybe I misunderstood your usecase13:56
bauzaswhen ssh'ing the instance, you mean ?13:57
bauzaswe already have the fingerprint13:57
fricklerbauzas: where do we have that? the use case is securing the initial SSH connection to a fresh instance, yes14:18
*** lbragstad4 is now known as lbragstad14:53
bauzasfrickler: https://docs.openstack.org/api-ref/compute/?expanded=show-keypair-details-detail#show-keypair-details14:59
bauzaswhen you import a pubkey, then it creates the fingerprint14:59
fricklerbauzas: ah, but the issue is about the hostkey, not the user key. cloud-init can output it on the console but I would want a nicer solution https://stackoverflow.com/questions/66658406/openstack-how-to-find-out-vms-key-fingerprint-before-first-ssh-session15:02
fricklerthe other option would be to have something similar to AWS IAM host roles and via that give cloud-init credentials with which it could talk to designate15:05
fricklerbut maybe you are right and this is more an issue for neutron than for nova15:09
clarkbfrickler: yes mordred brought this up probably 8 years ago at this point. I think the idea then was maybe to have the hypervisor scan the hostkey since it can talk directly to the correct interface then report that back through the api15:11
fricklerclarkb: ah, scanning from the outside instead of relying on some action from the instance would make this more generally useful, that's a good point, too15:14
bauzasfrickler: hah, about the hostkey15:26
bauzasI mean the compute node15:26
bauzasthen, as I said, not a nova usecase IMO15:26
clarkbbauzas: why isn't that a nova use case?15:31
clarkbnova is the only entity in a position to accurately query the info15:31
bauzasclarkb: that's port-dependent on the host, right?15:31
clarkbyes, you'd need to make a network request over the correct interface/port to accurately retrieve the information15:32
clarkband you can't traverse additional network segments or you lose the accuracy15:32
bauzascorrect15:33
bauzasso you can't rely on nova itself15:33
bauzasbut I'm open to discuss the usecase in a spec 15:33
bauzasor even at the PTG15:33
bauzasI guess this is for blessing the instance ssh connection ?15:34
clarkbya ultimately I don't really care what implements it, but I do think being able to get the host key from nova show and be able to trust it is a very useful feature15:34
clarkbbauzas: yes, it is so that you don't have to take a leap of faith the first time you make an ssh connection that you are not being mitm'd15:34
bauzasclarkb: but then the connection to the port is managed thru neutron 15:35
bauzasbut maybe that's something we can sort out at the PTG15:35
clarkbbut it is a host attribute. Your ports may come and go but the hostkey doesn't necessarily change15:35
bauzasI understand the usecase15:35
clarkbit may ultimately require cooperation between the compute and network layers15:36
bauzasthis is OS-specific tho15:36
clarkbwell it is ssh specific15:36
bauzasactually no15:36
bauzasyeah that15:36
clarkbwindows and solaris and linux and osx can all run an sshd15:36
bauzasyeah15:36
bauzasanyway, I'm not eventually against the usecase, I think we can agree it's a valid one15:37
bauzasnow the problem is how to get this15:37
bauzasand how to present this15:37
clarkbyup, I think the reason it hasn't been done is that it is complicated15:44
*** dasm is now known as dasm|off21:17
atmarkhow to properly delete a VM stuck in BUILD status? openstack server delete throws no server with name or ID exists22:06
sean-k-mooneythat is the correct way if we have gotten to the point of creating an instance object in the db22:07
sean-k-mooneyif it's stuck before that point then there should be no record to delete22:07
atmarkserver list still list the instance22:08
atmarkhow can I get rid of the record?22:08
sean-k-mooneyyou should be able to just do a server delete22:08
sean-k-mooneyif you can't then you have a broken deployment and you need to triage why the request is failing22:09
sean-k-mooneydid you try deleting with the uuid22:09
sean-k-mooneyrather than the name22:09
sean-k-mooneyif that fails you should check the nova-api logs to see if there is an internal exception22:10
atmarki'm deleting by uuid but the response is no name or ID exists 22:10
atmarki think this is an old instance from last year 22:11
sean-k-mooneyack there is obviously an error internally; without knowing what that is there is not much more i can suggest22:11
sean-k-mooneyif you take the request id you should be able to find it in the api logs22:11
atmarki'm doing cleanup atm and just caught one in a weird state 22:11
sean-k-mooneyit's possible that it exists in the api db but not in the cell db22:12
sean-k-mooneyif you do a server show does it have a host22:12
atmarkno22:12
sean-k-mooneyok so it's before the scheduler has selected one most likely22:12
atmarkhttps://paste.openstack.org/show/bXZv2CMnmfdGeswoUKUC/22:13
atmarkVM is octavia amphorae22:14
sean-k-mooney--all is all tenants22:14
sean-k-mooneyis that owned by your current project22:15
johnsomYep, the instance is owned by the Octavia service account.22:15
sean-k-mooneyno what i mean is when openstack server delete 6fb27c67-539c-4525-8747-e26487d15e75 is run22:15
atmarkyes, i have admin access to all tenants 22:15
sean-k-mooneywhat user is that22:15
sean-k-mooneyok so thats an admin user22:16
sean-k-mooneyi think you should be able to delete the vm then, even if your current token is not for the correct project, due to admin rights if you are not using new policy22:16
sean-k-mooneythe server show is implying that its not in the current project22:17
sean-k-mooneyso i was wondering if the delete was failing for the same reason22:17
johnsomThere is a --all-projects for the server delete command too, just like for list.22:18
sean-k-mooneyi thought admins could bypass that check22:18
johnsomI thought so too honestly22:18
sean-k-mooneyits been a while since i have done that so cant recall if you need to set the project id somehow or not22:19
atmarkopenstack server --os-project-id 848fe125a93a408ba8a8044fb87e9cdf delete 6fb27c67-539c-4525-8747-e26487d15e7522:20
atmark848.. is tenant ID of octavia 22:20
sean-k-mooneythat would require your current user to have the member or admin role on that project for keystone to be able to issue the token22:22
sean-k-mooneythe uuid should be enough if you're an admin22:22
sean-k-mooney--all-projects is only required to delete by name22:22
sean-k-mooneyon a different project22:23
atmarki'm able to list VMs `openstack server list --os-project-id 848fe125a93a408ba8a8044fb87e9cdf`22:23
sean-k-mooneyack22:23
sean-k-mooneyand i assume the delete didnt work22:23
atmarkit didn't22:23
sean-k-mooneyyou could try --force22:23
sean-k-mooneybut that is not for this usecase22:23
sean-k-mooneyit's for forcing the delete now if you have soft delete enabled22:24
sean-k-mooneyyou could also try reset-state22:24
sean-k-mooneythen delete22:24
sean-k-mooneyso reset the state to error22:24
sean-k-mooneythen delete from error22:24
atmarkthrows a no ID found on reset-state22:25
sean-k-mooneyya so i think what is happening is there is a record in the api db22:26
sean-k-mooneybut there is no cell mapping for the instance and no record in the cell db22:26
sean-k-mooneyreset-state would try and update the instance in the cell db22:27
sean-k-mooneybut that fails because it was not created22:27
atmarkthis is what's in the db https://paste.openstack.org/show/bbS8V1bZK0VcFO7WBD7p/22:28
sean-k-mooneyif you have db access you could confirm that by checking the api db to see if the build request exists and/or instance_mapping22:28
sean-k-mooneyok thats in which db22:28
sean-k-mooneycell022:28
sean-k-mooneyor api22:28
sean-k-mooneynova has 3 databases by default22:29
atmarkfull columns https://paste.openstack.org/show/bsnbvSiQ7ieLLCW7zPyF/ 22:29
atmarknova22:29
sean-k-mooneythat does not tell me anything what installer did you use22:29
atmarkkolla-ansible22:29
sean-k-mooneyok i have that deploy locally let me see which db that is22:29
sean-k-mooneywhich version?22:29
atmarkussuri22:30
sean-k-mooneyack22:30
atmarkthere are 3 nova dbs: nova, nova_api and nova_cell022:30
atmarkthe paste came from nova db22:30
sean-k-mooneyack ok so nova is the nova cell1 db22:30
sean-k-mooneyso an instance should only end up in cell1 after it has been assigned a host22:31
sean-k-mooneythe only time after it's booted that it can not have a host in cell 1 is if it's shelved22:31
atmarkso it didn't end up in any cell since it doesn't have a host?22:36
atmarkhost is NULL 22:36
sean-k-mooneyselect * from nova_api.instance_mappings where instance_uuid = "ded221d0-8410-4f89-b322-b9ff0207a0dd";22:36
sean-k-mooneycan you run that but with your instance uuid22:36
sean-k-mooneyyou should see something like this https://paste.openstack.org/show/bAUqQT5Al2Q9GoF9Nln2/22:37
atmarkhttps://paste.openstack.org/show/b9tx6I2rQ3tUtsGBR98g/22:38
sean-k-mooneywhat im wondering is does that exist and is the cell_id set22:38
sean-k-mooneyand if so is that cell in your cell_mappings table22:38
sean-k-mooneyok so cell_id is 622:38
sean-k-mooneythe cell mappings22:39
sean-k-mooneypotentially has password info22:39
sean-k-mooneyso don't paste that22:39
atmark6 exist22:39
sean-k-mooneybut does the db connection string for cell id 6 end in nova22:39
atmarkhttps://paste.openstack.org/show/bh9PvsUNrxNuYSRsATga/22:39
atmark6 and 922:40
sean-k-mooneyyep22:40
sean-k-mooneydoes 6 point to the db where the instance is22:40
sean-k-mooneyi.e. nova in your case22:40
sean-k-mooneyit should end in :3306/nova  22:40
sean-k-mooneythis is how we know which database to look in to delete the instance at the cell level22:41
atmarkhttps://paste.openstack.org/show/blXqFYFedaFGS9U7igPG/22:41
atmarkcell022:41
atmarkit's pointing to nova_cell022:42
sean-k-mooneyya so cell0 is where it should be if it failed22:42
sean-k-mooneycan you check cell 0 and see if it's also in the instance table there22:42
atmarkyup it's here22:43
sean-k-mooneyok so its in both nova and nova_cell022:44
sean-k-mooneythats really strange22:44
atmarkcorrect22:44
sean-k-mooneyso the quickest way to fix this is just to delete that one record22:44
sean-k-mooneybut it should only ever end up in one of those two dbs22:44
sean-k-mooneyany instance that failed before it was scheduled to a host should end up in nova_cell022:45
atmarkdelete in both db?22:45
atmarkprobably safe to delete in both right 22:45
sean-k-mooneyyou could try just removing it in nova22:45
sean-k-mooneyand then delete again but ya it should be safe to delete in both22:45
sean-k-mooneyi don't know of a failure mode like this unless you changed your cell mapping at some point22:46
sean-k-mooneynormally cell 0 gets id 1 and cell_1 gets id 222:46
sean-k-mooneyalthough there is no significance to the specific id 22:47
atmarkiirc, we did play with mappings because something failed22:47
sean-k-mooneyack so maybe this changed at some point 22:47
sean-k-mooneyi have never seen this specific issue before22:48
atmarkyep. anyway, this is for monday. don't wanna cause something bad22:48
atmarkthanks for the help 22:48
atmarkwe have 3 prod envs, only 1 env is exhibiting this issue 22:49
sean-k-mooneyack22:49
atmarkall same version22:49
* sean-k-mooney is currently deploying my home openstack with kolla-ansible in parallel22:50
sean-k-mooneyit's my favorite way to install and manage openstack22:50
sean-k-mooneyi'm just sad that the zed release is not available currently22:50
