opendevreview | Amit Uniyal proposed openstack/nova master: Adds check if resized to swap zero https://review.opendev.org/c/openstack/nova/+/857339 | 06:48 |
amorin | hey nova team, if you have time someday to review this: https://review.opendev.org/c/openstack/nova/+/853682 | 07:01 |
Uggla | Poke bauzas --> https://photos.app.goo.gl/MRmeetQr749GFh3s5 ;) | 07:15 |
bauzas | damn fucking numpy | 07:31 |
bauzas | I should have prepared better | 07:31 |
bauzas | and now I have to wait until Feb 2023 IIRC | 07:32 |
bauzas | graaaah | 07:32 |
bauzas | oh no, actually after Mar 29th :( | 07:34 |
Uggla | gibi, bauzas could you have a look at https://review.opendev.org/c/openstack/nova/+/854355 | 09:15 |
bauzas | ouch. | 09:17 |
bauzas | Uggla: what do you mean by "soft delete is deprecated"? | 09:17 |
Uggla | bauzas, it's mentioned and enforced in a test: we should not create new soft-delete objects. | 09:18 |
bauzas | I'd say this isn't recommended, true | 09:19 |
bauzas | but "deprecated" seems harsh to me | 09:19 |
bauzas | like, if you signal that instances won't be soft deleted, it would cause an operator revolt | 09:20 |
Uggla | bauzas, https://opendev.org/openstack/nova/src/commit/adeea3d5e7d7337d2817dd5c46334c76c05995ef/nova/tests/unit/db/test_models.py#L21 | 09:21 |
Uggla | I just reused the wording from L54. | 09:22 |
Uggla | but I can change the wording if you wish. | 09:23 |
gibi | Uggla: done | 09:29 |
gibi | bauzas: what we want to say is that no new ovo can be created with soft-delete support | 09:30 |
gibi | and this is enforced by a test case as Uggla noted above | 09:30 |
bauzas | ok, the wording seems weird to me, but when I read the original commit msg, it says the same thing | 09:31 |
bauzas | anyway | 09:31 |
bauzas | I knew about new tables | 09:31 |
Uggla | gibi, thx | 09:33 |
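A minimal sketch of the guard test being discussed, assuming it is enforced by comparing every model against a frozen list of legacy soft-delete models — the class names and legacy entries below are illustrative, not nova's exact code:

```python
# Hedged sketch: fail if anyone adds soft-delete support to a new model.
import unittest

from oslo_db.sqlalchemy import models as oslo_models

from nova.db.main import models


class TestNoNewSoftDeleteModels(unittest.TestCase):

    # Frozen list of legacy models that already carry soft delete;
    # the entries here are illustrative placeholders.
    LEGACY_SOFT_DELETE_MODELS = {'Instance', 'InstanceSystemMetadata'}

    def test_no_new_soft_delete_models(self):
        soft_deletable = {
            mapper.class_.__name__
            for mapper in models.BASE.registry.mappers
            if issubclass(mapper.class_, oslo_models.SoftDeleteMixin)
        }
        self.assertEqual(
            set(), soft_deletable - self.LEGACY_SOFT_DELETE_MODELS,
            'new models must not support soft delete')
```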
ralonsoh | gibi, hey, do you know how to book the operator-hour? | 09:37 |
ralonsoh | I tried "#operator-hour-neutron book icehouse-FriB1" | 09:37 |
ralonsoh | but I don't want to break anything testing other commands | 09:37 |
gibi | ralonsoh: I did not try to book it. bauzas, did you book it? | 09:38 |
ralonsoh | I did for neutron | 09:38 |
ralonsoh | but not for the operator-hour | 09:38 |
bauzas | ralonsoh: gibi: yeah I did it for the nova one | 09:38 |
bauzas | ralonsoh: but you need to ask the infra folks to add you as the neutron owner | 09:38 |
bauzas | sec | 09:38 |
ralonsoh | bauzas, I am | 09:39 |
ralonsoh | I already booked the neutron slots | 09:39 |
bauzas | https://lists.openstack.org/pipermail/openstack-discuss/2022-September/030301.html | 09:39 |
bauzas | you need to register the new track | 09:39 |
ralonsoh | ahhhhh | 09:39 |
ralonsoh | bauzas, thanks a lot | 09:39 |
bauzas | np | 09:39 |
bauzas | ralonsoh: that reminds me, I guess we will want an x-p session at the PTG between nova and neutron | 09:40 |
bauzas | so, let's just ask our folks if they want | 09:40 |
bauzas | and we'll try to find a slot | 09:40 |
ralonsoh | I think so, mainly for live migration | 09:40 |
ralonsoh | do you have topics for it? | 09:41 |
bauzas | ralonsoh: none for the moment AFAICS | 09:42 |
ralonsoh | bauzas, I'll send a mail today | 09:42 |
ralonsoh | with a section in our etherpad | 09:42 |
bauzas | OK | 09:42 |
auniyal | Hello #openstack-qa | 10:23 |
auniyal | Hello #openstack-nova | 10:23 |
auniyal | I added a new flavor in here - https://opendev.org/openstack/nova/src/commit/aad31e6ba489f720f5bdc765c132fd0f059a0329/nova/tests/fixtures/nova.py#L731 | 10:24 |
auniyal | so updated this as well - https://opendev.org/openstack/nova/src/branch/master/nova/tests/functional/api_sample_tests/api_samples/flavors/v2.75/flavors-list-resp.json.tpl | 10:24 |
auniyal | is there any other place I should update? | 10:24 |
auniyal | right now, some functional tests are failing here - https://opendev.org/openstack/nova/src/commit/aad31e6ba489f720f5bdc765c132fd0f059a0329/nova/tests/functional/api_samples_test_base.py#L399 | 10:25 |
auniyal | my added functional tests are passing, but existing tests are failing | 10:28 |
auniyal | these - https://opendev.org/openstack/nova/src/branch/master/nova/tests/functional/api_sample_tests/test_flavors.py | 10:29 |
auniyal | gibi, bauzas ^^ | 10:31 |
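For context on that failure mode: the api_samples tests compare live responses against the stored samples, so a flavor added to the fixture shows up in every flavors response — the remaining flavors templates and the committed doc/api_samples files need the new entry too, not just the v2.75 list template. Nova's functional test base can also regenerate samples when GENERATE_SAMPLES is set in the environment; a simplified sketch of that gate (not nova's exact code):

```python
# Simplified sketch of the verify-or-regenerate gate; the real logic
# lives in nova/tests/functional/api_samples_test_base.py.
import os


def verify_or_generate(sample_path, response_body):
    if os.getenv('GENERATE_SAMPLES'):
        # e.g. GENERATE_SAMPLES=True tox -e functional: rewrite the
        # stored sample from the live response instead of comparing.
        with open(sample_path, 'w') as f:
            f.write(response_body)
    else:
        with open(sample_path) as f:
            assert f.read() == response_body, 'sample mismatch'
```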
stephenfin | bauzas: Does https://review.opendev.org/c/openstack/python-novaclient/+/816158 still need to be done (and backported)? | 10:38 |
opendevreview | Eigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization https://review.opendev.org/c/openstack/nova-specs/+/858858 | 11:05 |
opendevreview | Eigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization https://review.opendev.org/c/openstack/nova-specs/+/858858 | 11:18 |
obre | gibi, sean-k-mooney: Some clarifications and changes added ^^. Please have a look when time permits. As for workflow, I wonder about the comments: should I mark comments as resolved when I answer them, to signal that they are answered, or should you mark them as resolved, to signal that my answer is accepted? | 11:18 |
frickler | has it ever been discussed to build some mechanism to allow instances to store their SSH host key in nova, so that users could get a certified instance/hostkey association from the API? rough idea would be to allow a POST to a special metadata API endpoint, which could be authenticated like other metadata accesses | 11:56 |
frickler | once that is done, one could extend neutron-designate integration to generate SSHFP records in addition to A+PTR | 11:58 |
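Purely to illustrate the idea (no such endpoint exists in nova today), the instance-side call might look like this, with the /hostkeys path invented for the sketch:

```python
# Hypothetical sketch of the proposal above; the 'hostkeys' metadata
# endpoint is invented for illustration and does not exist in nova.
import requests

METADATA_URL = 'http://169.254.169.254/openstack/latest/hostkeys'

with open('/etc/ssh/ssh_host_ed25519_key.pub') as f:
    pubkey = f.read().strip()

# Authenticated like other metadata traffic: nova-metadata maps the
# request's source address/port binding back to the instance, so the
# guest needs no extra credentials.
requests.post(METADATA_URL, json={'hostkey': pubkey}, timeout=10)
```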
gibi | obre: mark them resolved if you think you resolved them. We can always reopen one if we think further discussion is needed | 12:06 |
opendevreview | Eigil Obrestad proposed openstack/nova-specs master: Compute Inventory Customization https://review.opendev.org/c/openstack/nova-specs/+/858858 | 12:17 |
opendevreview | Amit Uniyal proposed openstack/nova master: Adds check if resized to swap zero https://review.opendev.org/c/openstack/nova/+/857339 | 13:15 |
bauzas | stephenfin: good call about the client patches, I need to take a look at them | 13:34 |
bauzas | frickler: sounds to me like something that's not nova-related | 13:35 |
bauzas | ie. you want some datastore to persist a tuple (instance, hostkey) | 13:35 |
frickler | bauzas: I think the verification of the instance ID would need to come from nova, else it could be spoofed from a different host | 13:39 |
*** dasm|off is now known as dasm | 13:39 | |
frickler | some hacky workaround would be to have the instance output the key to the console log and just parse it from there, but that sounds a bit flaky | 13:40 |
bauzas | frickler: maybe I misunderstood your usecase | 13:56 |
bauzas | when ssh'ing the instance, you mean ? | 13:57 |
bauzas | we already have the fingerprint | 13:57 |
frickler | bauzas: where do we have that? the use case is securing the initial SSH connection to a fresh instance, yes | 14:18 |
*** lbragstad4 is now known as lbragstad | 14:53 | |
bauzas | frickler: https://docs.openstack.org/api-ref/compute/?expanded=show-keypair-details-detail#show-keypair-details | 14:59 |
bauzas | when you import a pubkey, then it creates the fingerprint | 14:59 |
frickler | bauzas: ah, but the issue is about the hostkey, not the user key. cloud-init can output it on the console but I would want a nicer solution https://stackoverflow.com/questions/66658406/openstack-how-to-find-out-vms-key-fingerprint-before-first-ssh-session | 15:02 |
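For reference, the console-log workaround sketched with python-novaclient; this assumes an authenticated keystoneauth session and that the image's cloud-init prints host keys between its usual marker lines (verify both against your deployment):

```python
# Sketch of the 'parse the console log' workaround; the BEGIN/END
# markers are cloud-init's defaults, so check them for your image.
import re

from novaclient import client

nova = client.Client('2.1', session=session)  # assumes an existing session
console = nova.servers.get_console_output(server_id, length=2000)

m = re.search(
    r'-----BEGIN SSH HOST KEY KEYS-----\r?\n(.*?)'
    r'-----END SSH HOST KEY KEYS-----',
    console, re.S)
if m:
    for line in m.group(1).strip().splitlines():
        print(line.strip())  # e.g. 'ssh-ed25519 AAAA... root@vm'
```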
frickler | the other option would be to have something similar like AWS IAM host roles and via that give cloud-init credentials with which it could talk to designate | 15:05 |
frickler | but maybe you are right and this is more an issue for neutron than for nova | 15:09 |
clarkb | frickler: yes mordred brought this up probably 8 years ago at this point. I think the idea then was maybe to have the hypervisor scan the hostkey since it can talk directly to the correct interface then report that back through the api | 15:11 |
frickler | clarkb: ah, scanning from the outside instead of relying on some action from the instance would make this more generally useful, that's a good point, too | 15:14 |
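A hedged sketch of that hypervisor-side scan: ssh-keyscan is a standard OpenSSH tool, but how the result would be reported back through the nova API is the unbuilt part:

```python
# The compute host sits on the right network segment, so it can query
# the guest directly; reporting the key via the API is hypothetical.
import subprocess


def scan_hostkey(instance_ip: str) -> str:
    out = subprocess.run(
        ['ssh-keyscan', '-t', 'ed25519', instance_ip],
        capture_output=True, text=True, timeout=10, check=True)
    # One line per key, e.g. '192.0.2.10 ssh-ed25519 AAAA...'
    return out.stdout.strip()
```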
bauzas | frickler: hah, about the hostkey | 15:26 |
bauzas | I mean the compute node | 15:26 |
bauzas | then, as I said, not a nova usecase IMO | 15:26 |
clarkb | bauzas: why isn't that a nova use case? | 15:31 |
clarkb | nova is the only entity in a position to accurately query the info | 15:31 |
bauzas | clarkb: that's port-dependent on the host, right? | 15:31 |
clarkb | yes, you'd need to make a network request over the correct interface/port to accurately retrieve the information | 15:32 |
clarkb | and you can't traverse additional network segments or you lose the accuracy | 15:32 |
bauzas | correct | 15:33 |
bauzas | so you can't rely on nova itself | 15:33 |
bauzas | but I'm open to discuss the usecase in a spec | 15:33 |
bauzas | or even at the PTG | 15:33 |
bauzas | I guess this is for blessing the instance ssh connection? | 15:34 |
clarkb | ya ultimately I don't really care what implements it, but I do think being able to get the host key from nova show and be able to trust it is a very useful feature | 15:34 |
clarkb | bauzas: yes, it is so that you don't have to take a leap of faith the first time you make an ssh connection that you are not being mitm'd | 15:34 |
bauzas | clarkb: but then the connection to the port is managed thru neutron | 15:35 |
bauzas | but maybe that's something we can sort out at the PTG | 15:35 |
clarkb | but it is a host attribute. Your ports may come and go but the hostkey doesn't necessarily change | 15:35 |
bauzas | I understand the usecase | 15:35 |
clarkb | it may ultimately require cooperation between the compute and network layers | 15:36 |
bauzas | this is OS-specific tho | 15:36 |
clarkb | well it is ssh specific | 15:36 |
bauzas | actually no | 15:36 |
bauzas | yeah that | 15:36 |
clarkb | windows and solaris and linux and osx can all run an sshd | 15:36 |
bauzas | yeah | 15:36 |
bauzas | anyway, I'm not against the usecase; I think we can agree it's a valid one | 15:37 |
bauzas | now the problem is how to get this | 15:37 |
bauzas | and how to present this | 15:37 |
clarkb | yup, I think the reason it hasn't been done is that it is complicated | 15:44 |
*** dasm is now known as dasm|off | 21:17 | |
atmark | how do I properly delete a VM stuck in BUILD status? openstack server delete throws 'no server with a name or ID exists' | 22:06 |
sean-k-mooney | that is the correct way, if we have gotten to the point of creating an instance object in the db | 22:07 |
sean-k-mooney | if it's stuck before that point then there should be no record to delete | 22:07 |
atmark | server list still lists the instance | 22:08 |
atmark | how can I get rid of the record? | 22:08 |
sean-k-mooney | you should be able to just do a server delete | 22:08 |
sean-k-mooney | if you can't, then you have a broken deployment and you need to triage why the request is failing | 22:09 |
sean-k-mooney | did you try deleting with the uuid | 22:09 |
sean-k-mooney | rather than the name | 22:09 |
sean-k-mooney | if that fails you should check the nova-api log to see if there is an internal exception | 22:10 |
atmark | i'm deleting by uuid but the response is 'no name or ID exists' | 22:10 |
atmark | i think this an old instance from last year | 22:11 |
sean-k-mooney | ack, there is obviously an error internally; without knowing what it is, there is not much more i can suggest | 22:11 |
sean-k-mooney | if you take the request id you should be able to find it in the api logs | 22:11 |
atmark | i'm doing cleanup atm and just caught one in a weird state | 22:11 |
sean-k-mooney | it's possible that it exists in the api db but not in the cell db | 22:12 |
sean-k-mooney | if you do a server show does it have a host | 22:12 |
atmark | no | 22:12 |
sean-k-mooney | ok, so it's before the scheduler has selected one, most likely | 22:12 |
atmark | https://paste.openstack.org/show/bXZv2CMnmfdGeswoUKUC/ | 22:13 |
atmark | the VM is an octavia amphora | 22:14 |
sean-k-mooney | --all is all tenants | 22:14 |
sean-k-mooney | is that owned by your current project | 22:15 |
johnsom | Yep, the instance is owned by the Octavia service account. | 22:15 |
sean-k-mooney | no what i mean is when openstack server delete 6fb27c67-539c-4525-8747-e26487d15e75 is run | 22:15 |
atmark | yes, i have admin access to all tenants | 22:15 |
sean-k-mooney | what user is that | 22:15 |
sean-k-mooney | ok so that's an admin user | 22:16 |
sean-k-mooney | i think you should be able to delete the vm then, even if your current token is not for the correct project, due to admin rights, if you are not using the new policy | 22:16 |
sean-k-mooney | the server show is implying that it's not in the current project | 22:17 |
sean-k-mooney | so i was wondering if the delete was failing for the same reason | 22:17 |
johnsom | There is a --all-projects for the server delete command too, just like for list. | 22:18 |
sean-k-mooney | i thought admins could bypass that check | 22:18 |
johnsom | I thought so too honestly | 22:18 |
sean-k-mooney | it's been a while since i have done that, so i can't recall if you need to set the project id somehow or not | 22:19 |
atmark | openstack server --os-project-id 848fe125a93a408ba8a8044fb87e9cdf delete 6fb27c67-539c-4525-8747-e26487d15e75 | 22:20 |
atmark | 848.. is tenant ID of octavia | 22:20 |
sean-k-mooney | that would require your current user to have the member or admin role on that project for keystone to be able to issue the token | 22:22 |
sean-k-mooney | the uuid should be enough if you're an admin | 22:22 |
sean-k-mooney | --all-projects is only required to delete by name | 22:22 |
sean-k-mooney | in a different project | 22:23 |
atmark | i'm able to list VMs `openstack server list --os-project-id 848fe125a93a408ba8a8044fb87e9cdf` | 22:23 |
sean-k-mooney | ack | 22:23 |
sean-k-mooney | and i assume the delete didn't work | 22:23 |
atmark | it didn't | 22:23 |
sean-k-mooney | you could try --force | 22:23 |
sean-k-mooney | but that is not for this usecase | 22:23 |
sean-k-mooney | it's for forcing the delete now, if you have soft delete enabled | 22:24 |
sean-k-mooney | you could also try reset-state | 22:24 |
sean-k-mooney | then delete | 22:24 |
sean-k-mooney | so reset the state to error | 22:24 |
sean-k-mooney | then delete from error | 22:24 |
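For reference, that sequence sketched with python-novaclient, assuming an authenticated keystoneauth session:

```python
# Sketch of the reset-then-delete sequence suggested above.
from novaclient import client

nova = client.Client('2.1', session=session)  # assumes an existing session
server = nova.servers.get('6fb27c67-539c-4525-8747-e26487d15e75')
nova.servers.reset_state(server, state='error')  # force the state to ERROR
nova.servers.delete(server)                      # then delete from ERROR
```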
atmark | it throws a 'no ID found' on reset-state | 22:25 |
sean-k-mooney | ya, so i think what is happening is there is a record in the api db | 22:26 |
sean-k-mooney | but there is no cell mapping for the instance and no record in the cell db | 22:26 |
sean-k-mooney | reset-state would try to update the instance in the cell db | 22:27 |
sean-k-mooney | but that fails because it was not created | 22:27 |
atmark | this is what's in the db https://paste.openstack.org/show/bbS8V1bZK0VcFO7WBD7p/ | 22:28 |
sean-k-mooney | if you have db access, you could confirm that by checking the api db to see if the build request exists and/or the instance_mapping | 22:28 |
sean-k-mooney | ok, that's in which db? | 22:28 |
sean-k-mooney | cell0 | 22:28 |
sean-k-mooney | or api | 22:28 |
sean-k-mooney | nova has 3 databases by default | 22:29 |
atmark | full columns https://paste.openstack.org/show/bsnbvSiQ7ieLLCW7zPyF/ | 22:29 |
atmark | nova | 22:29 |
sean-k-mooney | that does not tell me anything. what installer did you use? | 22:29 |
atmark | kolla-ansible | 22:29 |
sean-k-mooney | ok i have that deploy locally let me see which db that is | 22:29 |
sean-k-mooney | which version? | 22:29 |
atmark | ussuri | 22:30 |
sean-k-mooney | ack | 22:30 |
atmark | there are 3 nova dbs: nova, nova_api and nova_cell0 | 22:30 |
atmark | the paste came from nova db | 22:30 |
sean-k-mooney | ack, ok so nova is the cell1 db | 22:30 |
sean-k-mooney | so an instance should only end up in cell1 after it has been assigned a host | 22:31 |
sean-k-mooney | the only time after it's booted that it can have no host in cell1 is if it's shelved | 22:31 |
atmark | so it didn't end up in any cell since it doesn't have a host? | 22:36 |
atmark | host is NULL | 22:36 |
sean-k-mooney | select * from nova_api.instance_mappings where instance_uuid = "ded221d0-8410-4f89-b322-b9ff0207a0dd"; | 22:36 |
sean-k-mooney | can you run that but with your instance uuid | 22:36 |
sean-k-mooney | you should see something like this https://paste.openstack.org/show/bAUqQT5Al2Q9GoF9Nln2/ | 22:37 |
atmark | https://paste.openstack.org/show/b9tx6I2rQ3tUtsGBR98g/ | 22:38 |
sean-k-mooney | what i'm wondering is: does that exist, and is the cell_id set | 22:38 |
sean-k-mooney | and if so is that cell in your cell_mappings table | 22:38 |
sean-k-mooney | ok so cell_id is 6 | 22:38 |
sean-k-mooney | the cell_mappings table | 22:39 |
sean-k-mooney | potentially has password info | 22:39 |
sean-k-mooney | so don't paste that | 22:39 |
atmark | 6 exists | 22:39 |
sean-k-mooney | but does the db connection string for cell id 6 end in nova | 22:39 |
atmark | https://paste.openstack.org/show/bh9PvsUNrxNuYSRsATga/ | 22:39 |
atmark | 6 and 9 | 22:40 |
sean-k-mooney | yep | 22:40 |
sean-k-mooney | does 6 point to the db where the instance is | 22:40 |
sean-k-mooney | i.e. nova in your case | 22:40 |
sean-k-mooney | it should end in :3306/nova | 22:40 |
sean-k-mooney | this is how we know which database to look in to delete the instance at the cell level | 22:41 |
atmark | https://paste.openstack.org/show/blXqFYFedaFGS9U7igPG/ | 22:41 |
atmark | cell0 | 22:41 |
atmark | it's pointing to nova_cell0 | 22:42 |
sean-k-mooney | ya, so cell0 is where it should be if it failed | 22:42 |
sean-k-mooney | can you check cell0 and see if it's also in the instance table there | 22:42 |
atmark | yup it's here | 22:43 |
sean-k-mooney | ok so it's in both nova and nova_cell0 | 22:44 |
sean-k-mooney | that's really strange | 22:44 |
atmark | correct | 22:44 |
sean-k-mooney | so the quickest way to fix this is just to delete that one record | 22:44 |
sean-k-mooney | but it should only ever end up in one of those two dbs | 22:44 |
sean-k-mooney | any instance that failed before it was scheduled to a host should end up in nova_cell0 | 22:45 |
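Pulling the triage above into one place — a hedged SQLAlchemy sketch of the lookups just walked through; the connection URL is a placeholder and the uuid is the stuck instance from this log:

```python
# Hedged recap of the api-db triage above; do not paste real
# cell_mappings output anywhere, it can contain credentials.
from sqlalchemy import create_engine, text

uuid = '6fb27c67-539c-4525-8747-e26487d15e75'
api = create_engine('mysql+pymysql://nova:SECRET@dbhost/nova_api')  # placeholder

with api.connect() as conn:
    mapping = conn.execute(
        text('SELECT cell_id FROM instance_mappings '
             'WHERE instance_uuid = :uuid'), {'uuid': uuid}).one()
    cell = conn.execute(
        text('SELECT database_connection FROM cell_mappings '
             'WHERE id = :id'), {'id': mapping.cell_id}).one()

# The instance row should exist only in the database that
# cell.database_connection points at (nova_cell0 in this case); a
# duplicate row in the cell1 'nova' db is the orphan to clean up.
print(cell.database_connection)
```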
atmark | delete in both db? | 22:45 |
atmark | probably safe to delete in both right | 22:45 |
sean-k-mooney | you could try just removing it in nova | 22:45 |
sean-k-mooney | and then delete again but ya it should be safe to delete in both | 22:45 |
sean-k-mooney | i don't know of a failure mode like this unless you changed your cell mappings at some point | 22:46 |
sean-k-mooney | normally cell0 gets id 1 and cell1 gets id 2 | 22:46 |
sean-k-mooney | although there is no significance to the specific id | 22:47 |
atmark | iirc, we did play with the mappings because something failed | 22:47 |
sean-k-mooney | ack, so maybe this changed at some point | 22:47 |
sean-k-mooney | i have never seen this specific issue before | 22:48 |
atmark | yep. anyway, this is for monday. don't wanna cause something bad | 22:48 |
atmark | thanks for the help | 22:48 |
atmark | we have 3 prod envs, only 1 env is exhibiting this issue | 22:49 |
sean-k-mooney | ack | 22:49 |
atmark | all same version | 22:49 |
* sean-k-mooney is currently deploying my home openstack with kolla-ansible in parallel | 22:50 | |
sean-k-mooney | it's my favorite way to install and manage openstack | 22:50 |
sean-k-mooney | i'm just sad that the zed release is not available currently | 22:50 |