opendevreview | Takashi Kajinami proposed openstack/os-resource-classes master: Fix outdated envlist https://review.opendev.org/c/openstack/os-resource-classes/+/939018 | 04:55 |
---|---|---|
opendevreview | Takashi Kajinami proposed openstack/python-novaclient master: Remove environment for Python 3.8 https://review.opendev.org/c/openstack/python-novaclient/+/939022 | 05:04 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Drop environment for Python 3.8 https://review.opendev.org/c/openstack/nova/+/939027 | 05:11 |
opendevreview | Hemanth N proposed openstack/nova master: Add memory stats to compute monitor https://review.opendev.org/c/openstack/nova/+/939044 | 05:37 |
opendevreview | Hemanth N proposed openstack/nova master: Add memory stats to compute monitor https://review.opendev.org/c/openstack/nova/+/939044 | 06:08 |
opendevreview | Hemanth N proposed openstack/nova master: Add memory stats to compute monitor https://review.opendev.org/c/openstack/nova/+/939044 | 07:18 |
opendevreview | Hemanth N proposed openstack/nova master: Add memory stats to compute monitor https://review.opendev.org/c/openstack/nova/+/939044 | 07:48 |
gibi | bauzas: when you are up. I would like to formally ask for a spec freeze exception for https://review.opendev.org/c/openstack/nova-specs/+/938910 | 08:40 |
gibi | This is a spin off of the vTPM live migration spec | 08:40 |
gibi | but the impact is moved out of it as a) it has API impact b) it is an independently useful feature | 08:41 |
gibi | we already have plenty of core approval on it so it does not seem too controversial | 08:41 |
gibi | the assignee of the impl is a question mark today but just because we want to give the opportunity to auniyal or ratailor to pick it up. But if it does not fit their timeline for E then I can be the fallback to push the implementation. Fortunately it is a small impact one | 08:43 |
gibi | more context is in the comment thread https://review.opendev.org/c/openstack/nova-specs/+/938843/1#message-c65ec478107012ab4c141bd79812528a74f7761d | 08:44 |
auniyal | hey gibi, can you please have a look at https://bugs.launchpad.net/nova/+bug/2093869/, nova-place placement bug | 09:10 |
ratailor | gibi, I could try to do that. | 09:10 |
gibi | ratailor: cool. Thanks for stepping up | 09:11 |
gibi | auniyal: looking | 09:12 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Add support for showing finish_time https://review.opendev.org/c/openstack/nova/+/928933 | 09:38 |
gibi | auniyal: so this is a system with PCI in Placement enabled and after a while, even if the devstack sits idle the compute starts failing with the conflict exception | 09:46 |
gibi | auniyal: you wrote that "If we recreate resource provider, then new once does not have DISK_GB resource-class | 09:53 |
gibi | " | 09:53 |
gibi | does it mean you stopped the compute, deleted the RP tree, and then started the compute? | 09:53 |
gibi | did you have VMs running on this compute? | 09:54 |
gibi | have you just enabled PCI in Placement or also have some devices configured via [pci]device_spec config option? | 09:56 |
auniyal | not in right now- but I encountered the same issue, earlier (do not this VM-setup anymore ) | 09:56 |
auniyal | that time, I had a VM which was failing to start using nova cmd, - on checking compute logs I saw this error. | 09:56 |
auniyal | that time, I eventually ended up deleting the RP and creating a new one, | 09:56 |
auniyal | > enabled PCI in Placement or also have some devices configured via [pci]device_spec config option? | 09:57 |
auniyal | no | 09:57 |
auniyal | both times, there were no PCI changes | 09:57 |
auniyal | or any update in any service confs, the system is the same as how devstack deployed it | 09:58 |
auniyal | only thing is, I have ceph enabled for storage | 09:59 |
opendevreview | Dmitriy Chubinidze proposed openstack/nova master: Adding link for RabbitMQ installation during nova deployment on controller node. https://review.opendev.org/c/openstack/nova/+/938702 | 09:59 |
auniyal | >>> not in right now- but I encountered the same issue, earlier (do not this VM-setup anymore ) | 10:03 |
auniyal | meant, I do not have that setup anymore, and this is a new setup having similar issue | 10:03 |
gibi | auniyal: so the error is coming from https://github.com/openstack/nova/blob/a459467899d2b406aa8cf530ae481255eaf3c957/nova/compute/resource_tracker.py#L1360-L1370 Interestingly the comment in the code contradicts the actual InventoryInUse exception in the logs. So I think the assumption that all the InventoryInUse exceptions from update_from_provider_tree are related to PCI in Placement is wrong | 10:17 |
gibi | so that exception translation needs to be fixed to only react to real exceptions from the PCI in Placement codepath and not react to other InventoryInUse exceptions. | 10:18 |
gibi | Still I'm wondering why the exception is raised in your env | 10:18 |
gibi | could you replace the raise exception.PlacementPciException(error=str(e)) | 10:19 |
auniyal | yes, I was surprised seeing PlacementPciException ! | 10:19 |
gibi | line with a raise e | 10:19 |
gibi | and try to start up the compute | 10:19 |
auniyal | ack, please give a min | 10:20 |
auniyal | gibi, https://paste.openstack.org/show/bqQZYiVU8NA5PdRxn7Rm/ | 10:23 |
auniyal | compute went up though, in compute service list | 10:24 |
auniyal | https://paste.openstack.org/show/bL5nGTkXQjNgVMltv6fL/ | 10:25 |
auniyal | same msg is still coming on compute restart | 10:26 |
gibi | yepp | 10:28 |
gibi | so there are at least two separate issues 1) we made that exception a hard fail since PCI in Placement as we too eagerly translate the exception 2) for some unknown reason in your env nova-compute cannot update the disk inventory at startup | 10:29 |
gibi | auniyal: could you put your nova-cpu.conf in paste.openstack.org? | 10:34 |
auniyal | here https://paste.openstack.org/show/bqpnfND7kpB29US0UA6P/ | 10:37 |
gibi | thanks | 10:37 |
gibi | OK so this is a compute with rbd configured | 10:39 |
auniyal | can't create new VM - https://paste.openstack.org/show/b2TQPF1lLwbFj3QZsaeC/ | 10:40 |
gibi | for the 2) I assume, but no proof yet, that for some reason the compute tries to modify the total DISK_GB inventory to a value that is smaller than the current usage in placement causing placement to reject the change | 10:40 |
gibi | in the last log you pasted, it seems that the compute tries to report an inventory without any DISK_GB meaning that any existing DISK_GB inventory is deleted, and placement rejects that as there is still usage from the RC | 10:42 |
gibi | is this a devstack that was built with rbd in the first place or was it built with local disk first and then reconfigured with rbd afterwards? | 10:43 |
auniyal | no reconfiguration, only ceph/rbd | 10:44 |
gibi | https://github.com/openstack/nova/blob/a459467899d2b406aa8cf530ae481255eaf3c957/nova/virt/libvirt/driver.py#L9456-L9473 this is the code that causes DISK_GB to be reported. | 10:48 |
gibi | so for some reason disk_gb is 0 there | 10:49 |
gibi | but you could try to prove that by adding some extra logs around that code | 10:49 |
gibi | that disk_gb is filled in https://github.com/openstack/nova/blob/a459467899d2b406aa8cf530ae481255eaf3c957/nova/virt/libvirt/driver.py#L8306-L8307 in case of rbd | 10:50 |
gibi | based on the info from the rbd pool | 10:50 |
gibi | so you should check what the rbd reports for the vms pool | 10:53 |
auniyal | https://paste.openstack.org/show/b6WyabNygyB5hiTBUGdN/ | 10:58 |
auniyal | in between delimiters -- ## | 10:58 |
gibi | yepp that proves that the compute tries to remove the DISK_GB inventory in placement due to getting 0 total available disk from rbd | 11:04 |
gibi | so you should check the vms rbd ceph pool in your deployment | 11:05 |
gibi | why it is reporting 0 disk | 11:05 |
auniyal | sudo rbd ls volumes hangs | 11:14 |
auniyal | gibi - https://paste.openstack.org/show/bzYaK4id1g94RyidGpcU/ | 11:16 |
auniyal | no keyring found | 11:17 |
auniyal | https://paste.openstack.org/show/bfgQbk0ZCHeDPzszszSj/ | 11:22 |
gibi | OK. so you have some ceph issues. | 12:13 |
gibi | I suggest pinging the storage folks to get help debugging it | 12:14 |
gibi | I filed a separate bug for the misleading error message conflating DISK_GB with PCI in Placement https://bugs.launchpad.net/nova/+bug/2093879 | 12:26 |
gibi | bauzas: could you check my ping from this morning regarding the spec freeze exception? | 12:30 |
opendevreview | Balazs Gibizer proposed openstack/placement stable/2023.2: Add round-robin candidate generation strategy https://review.opendev.org/c/openstack/placement/+/938947 | 12:33 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first https://review.opendev.org/c/openstack/nova/+/937275 | 12:41 |
bauzas | sorry I missed your ping | 12:43 |
bauzas | gibi: I just +Wd your spec | 12:45 |
gibi | bauzas: thanks | 12:45 |
opendevreview | Merged openstack/nova-specs master: Image properties in server show https://review.opendev.org/c/openstack/nova-specs/+/938910 | 12:57 |
opendevreview | Merged openstack/nova-specs master: vTPM live migration https://review.opendev.org/c/openstack/nova-specs/+/936775 | 13:15 |
auniyal | gibi ack, thanks | 13:23 |
PrzemekK | How to connect an encrypted volume (LUKS) to a different VM? Is it just detach/attach? | 13:30 |
sean-k-mooney | PrzemekK: that depends: if the vm is in the same project and the user that issues the attach has access to the encryption secret then i think that should work | 13:34 |
sean-k-mooney | detach for encrypted vs non encrypted volumes is the same more or less | 13:34 |
opendevreview | Takashi Natsume proposed openstack/nova-specs master: Create specs directory for 2025.2 Flamingo https://review.opendev.org/c/openstack/nova-specs/+/939091 | 13:35 |
PrzemekK | ok. Same project / user admin. I want to back up via Commvault and get a strange error that it is unable to create a snapshot. It creates a snapshot and attaches it to the backup machine | 13:42 |
sean-k-mooney | so an admin cannot retrieve the secret from barbican, but if the admin user that is doing the backup is added as a member of the project. a generic user with the admin cannot access secrets stored in barbican | 14:15 |
sean-k-mooney | * is added as a member of the project it should be able to retrieve it. | 14:16 |
sean-k-mooney | *with the admin role cannot ... | 14:16 |
sean-k-mooney | PrzemekK: by the way commvault's backup solution is a purely out of tree implementation and may or may not be supported by your openstack vendor or work with encrypted volumes | 14:17 |
sean-k-mooney | if it relies on attaching the volume to a backup vm for example that wont work with boot from volume guests | 14:18 |
dansmith | sean-k-mooney: did we have a bug for the iso+gpt thing or was it just something we stumbled upon? I can't find it in the bug tracker if so | 14:36 |
sean-k-mooney | we have a bug for generic iso support and after we got that working a new oslo.utils release with the gpt issue was cut | 14:37 |
sean-k-mooney | i can see if i can find the other bug if you want to at least track it as related | 14:37 |
dansmith | I found the generic iso problem, but it's way old | 14:37 |
dansmith | I mean, predates and is unrelated to the gpt thing | 14:38 |
sean-k-mooney | the multi format issue came out of https://review.opendev.org/c/openstack/nova/+/909611 | 14:38 |
dansmith | right | 14:38 |
sean-k-mooney | so i have commented on https://bugs.launchpad.net/nova/+bug/2054446 | 14:38 |
sean-k-mooney | that iso support in general was only partly broken i.e. isos that support block device booting still worked | 14:39 |
sean-k-mooney | so its not really the same thing but they are related | 14:39 |
sean-k-mooney | i dont know if you want to file a separate bug or not | 14:39 |
dansmith | yeah related by virtue of us causing that fix another issue | 14:39 |
dansmith | just wondering what to use for a related-bug on my fix patch.. I can file one with basically the details from the comments on the other patch, | 14:40 |
dansmith | but it will be pretty much just paperwork at that point | 14:40 |
sean-k-mooney | dansmith: the introduction of the gpt inspector came out of the ironic cve, is that correct | 14:40 |
dansmith | no | 14:40 |
sean-k-mooney | oh ok i thought it was motivated by that originally. | 14:40 |
dansmith | it came out of the original thing, to get us to stop using raw for two things, it just wasn't critical for the qcow fix of course | 14:40 |
sean-k-mooney | dansmith: paperwork wise i would just track it however you find simplest unless bauzas has a specific request for a dedicated bug | 14:41 |
dansmith | yeah, so I was just going to say: | 14:41 |
dansmith | bauzas: this fix needs to get in: https://review.opendev.org/c/openstack/nova/+/931833 we noticed it because it caused a problem for the fix of another bug, but we never filed a bug for this specifically, | 14:41 |
dansmith | since it was just on the heels of landing the initial gpt detection support | 14:42 |
bauzas | dansmith: context ? | 14:42 |
dansmith | so if we need paperwork, please say, but if not, could you review? | 14:42 |
bauzas | want me to review ? | 14:42 |
bauzas | ack, ok | 14:42 |
bauzas | I already added myself for reviewing it indeed | 14:42 |
dansmith | thanks | 14:42 |
dansmith | I've got a tangled web of reviews for disk inspection across several projects and I need to start reducing that before I lose it :) | 14:43 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first https://review.opendev.org/c/openstack/nova/+/937275 | 14:53 |
*** ykarel_ is now known as ykarel | 15:02 | |
sean-k-mooney | gibi: ^ fyi see comment inline, im pretty sure peers is not the correct group | 15:02 |
sean-k-mooney | im going to see if i can quickly update https://review.opendev.org/c/openstack/nova/+/933365 to address the review feedback. if i cant get that to a mergeable state today however im going to have to ask someone else to take it over to unblock the requirements patch | 15:04 |
gibi | sean-k-mooney: thanksh | 15:04 |
gibi | sean-k-mooney: thank | 15:04 |
gibi | s | 15:04 |
opendevreview | Balazs Gibizer proposed openstack/nova master: [CI][nova-next]test with placement ac breadth-first https://review.opendev.org/c/openstack/nova/+/937275 | 15:06 |
opendevreview | sean mooney proposed openstack/nova master: [eventlet] update nova tests for eventlet 0.37.0 https://review.opendev.org/c/openstack/nova/+/933365 | 15:34 |
opendevreview | sean mooney proposed openstack/nova master: [eventlet] update nova tests for eventlet 0.38.2 https://review.opendev.org/c/openstack/nova/+/933365 | 15:35 |
sean-k-mooney | hberaud ^ i think that might be enough. the issue we had with the functional test seems to have been resolved with the oslo.log release as well | 15:36 |
opendevreview | Dr. Jens Harbott proposed openstack/nova master: DNM: Test eventlet bump https://review.opendev.org/c/openstack/nova/+/938879 | 15:38 |
sean-k-mooney | frickler: i rebased ^ to confirm if my other change is enough but it works locally so i suspect it is | 15:47 |
sean-k-mooney | locally im emulating the test env by doing " .tox/functional/bin/python3 -m pip install -U -c https://opendev.org/openstack/requirements/raw/commit/27ded6c22e5f1ea9a16c8d3dabbd7fad72775f00/upper-constraints.txt -r requirements.txt -r test-requirements.txt eventlet" | 15:48 |
sean-k-mooney | that's the url to the uc bump patch | 15:48 |
MengyangZhang[m] | sean-k-mooney: Hey sean, are you planning to talk about this https://review.opendev.org/c/openstack/nova-specs/+/932653 on Thur? I see it listed on the agenda of Thu's cinder-nova meeting. Just want to give a quick update about it. I already brought it up to cinder last week and they gave the green light since there were no cinder changes needed. | 16:43 |
MengyangZhang[m] | I was wondering what's next step regarding submitting the proposed code changes. Since the spec freeze has passed I heard, would it be possible to follow the specless workflow? Happy to discuss it in the next nova meeting. | 16:43 |
sean-k-mooney | MengyangZhang[m]: so technically we are past the spec approval deadline for this release. it was last thursday | 17:35 |
sean-k-mooney | so procedurally this would need a spec freeze exception. it might be better however to plan this for next cycle but start on the implementation in parallel | 17:36 |
sean-k-mooney | hi folks https://review.opendev.org/c/openstack/nova/+/933365 is green and so is https://review.opendev.org/c/openstack/nova/+/938879/2 so i think we are ok to proceed with merging that | 18:08 |
sean-k-mooney | gibi: melwitt: would ye be able to review https://review.opendev.org/c/openstack/nova/+/933365/3 | 18:09 |
melwitt | sean-k-mooney: did you ever find some reason that the new eventlet causes the waitall() calls to change? | 18:11 |
sean-k-mooney | no | 18:12 |
sean-k-mooney | i dont think its particularly relevant to the behavior of the test however | 18:13 |
sean-k-mooney | so i didnt spend much time trying to find out | 18:13 |
JayF | New eventlet fixed a few locking issues around os.read()/os.write() patched interactions aiui; the unit test change made sense to me in that context | 18:13 |
JayF | but I'm obviously not a nova or eventlet expert, just someone who has read all the involved code :) | 18:14 |
sean-k-mooney | im going to quickly test without the change to make sure it still fails | 18:28 |
sean-k-mooney | its possible that the oslo change makes it not required | 18:28 |
sean-k-mooney | but i suspect that the extra call to waitall is related to that | 18:29 |
sean-k-mooney | with the new oslo release i was able to remove the functional test change | 18:29 |
sean-k-mooney | but i didnt test master with the new versions which im doing now | 18:30 |
JayF | nice, that'll build even more confidence that the oslo.log pipemutex stuff fixed something \o/ | 18:30 |
* JayF is mainly trying to be the openstack liaison for Itamar's efforts on eventlet/oslo compat items | 18:30 | |
sean-k-mooney | hum ok master apparently passed. but i need to verify that it has the correct versions of eventlet. changing branch might have caused the venv to be recreated | 18:33 |
sean-k-mooney | https://paste.opendev.org/show/bUq1UNJTP1NHybsQJHPN/ | 18:34 |
sean-k-mooney | so ya i actually think the nova change is not required any more | 18:34 |
sean-k-mooney | frickler: do we want to try dropping the depends-on from the requirements change? | 18:35 |
sean-k-mooney | oslo.log==7.0.0 might be all that was required to fix nova with eventlet 0.38.2 | 18:36 |
opendevreview | melanie witt proposed openstack/nova master: Bump requirement to PrettyTable>=2.4.0 https://review.opendev.org/c/openstack/nova/+/939157 | 19:04 |
frickler | sean-k-mooney: ack, confirmed locally, updating the change now | 19:13 |
sean-k-mooney | cool im going to call it a night, ill leave my nova change open but ill abandon it if the requirement patch passes | 19:32 |
opendevreview | melanie witt proposed openstack/nova stable/2024.2: libvirt: Wrap un-proxied listDevices() and listAllDevices() https://review.opendev.org/c/openstack/nova/+/939158 | 19:38 |
opendevreview | Dr. Jens Harbott proposed openstack/nova master: DNM: Test eventlet bump https://review.opendev.org/c/openstack/nova/+/938879 | 20:19 |
opendevreview | melanie witt proposed openstack/nova stable/2024.1: libvirt: Wrap un-proxied listDevices() and listAllDevices() https://review.opendev.org/c/openstack/nova/+/939162 | 20:33 |
opendevreview | melanie witt proposed openstack/nova stable/2023.2: libvirt: Wrap un-proxied listDevices() and listAllDevices() https://review.opendev.org/c/openstack/nova/+/939163 | 20:34 |
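
Note on the exception translation gibi discussed at 10:17 and 10:18: the proposed fix is to translate only those conflicts that really come from the PCI in Placement codepath and let every other InventoryInUse through unchanged. A minimal sketch of that idea follows. It is not nova code: both exception classes are stand-ins, and the `resource_classes` attribute, the `CUSTOM_PCI_` prefix check, and the `update_placement` helper are assumptions made only for illustration.

```python
class InventoryInUse(Exception):
    """Stand-in for the conflict raised when in-use inventory is changed."""

    def __init__(self, resource_classes):
        super().__init__(
            "Inventory in use for: " + ", ".join(resource_classes))
        # Hypothetical attribute: which resource classes caused the conflict.
        self.resource_classes = list(resource_classes)


class PlacementPciException(Exception):
    """Stand-in for the fatal PCI in Placement error."""


def is_pci_resource_class(rc):
    # Assumption for this sketch: PCI inventories are recognisable by prefix.
    return rc.startswith("CUSTOM_PCI_")


def update_placement(update_fn):
    """Run update_fn, translating only PCI-related inventory conflicts."""
    try:
        update_fn()
    except InventoryInUse as e:
        if any(is_pci_resource_class(rc) for rc in e.resource_classes):
            # A PCI reconfiguration was blocked by existing allocations.
            raise PlacementPciException(str(e)) from e
        # Anything else (for example DISK_GB) is re-raised untouched so the
        # log shows the real cause instead of a PCI error.
        raise
```

With this shape, the DISK_GB conflict from the bug report would surface as a plain InventoryInUse rather than as a PlacementPciException.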
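
Note on the second issue gibi described at 10:40 and 10:42: when rbd reports 0 total disk, the compute builds an inventory without DISK_GB, which amounts to deleting the existing DISK_GB inventory, and placement rejects that while usage remains. The toy model below illustrates that interaction under stated assumptions. `build_inventory` and `apply_inventory` are hypothetical helpers, not the nova or placement APIs, and the real placement checks are more detailed.

```python
class InventoryInUse(Exception):
    """Stand-in for the conflict placement returns for in-use inventory."""


def build_inventory(vcpus, memory_mb, disk_gb):
    """Build the inventory a compute would report (toy version)."""
    inventory = {
        "VCPU": {"total": vcpus},
        "MEMORY_MB": {"total": memory_mb},
    }
    # DISK_GB is only reported when the backend says there is disk at all.
    # A backend reporting 0 therefore drops the resource class entirely.
    if disk_gb > 0:
        inventory["DISK_GB"] = {"total": disk_gb}
    return inventory


def apply_inventory(current_usage, new_inventory):
    """Toy placement check: refuse to drop inventory that still has usage."""
    blocked = [
        rc for rc, used in current_usage.items()
        if used > 0 and rc not in new_inventory
    ]
    if blocked:
        raise InventoryInUse(
            "Inventory in use for: " + ", ".join(sorted(blocked)))
    return new_inventory
```

In this model, `apply_inventory({"DISK_GB": 20}, build_inventory(8, 16384, 0))` raises InventoryInUse, which mirrors the conflict seen at compute startup once the ceph pool stopped reporting capacity.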