Nick | Message | Time |
---|---|---|
*** tobias-urdin9 is now known as tobias-urdin | 13:03 | |
croeland1 | o/ | 14:00 |
pranali | #startmeeting glance | 14:00 |
opendevmeet | Meeting started Thu Dec 7 14:00:28 2023 UTC and is due to finish in 60 minutes. The chair is pranali. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:00 |
opendevmeet | The meeting name has been set to 'glance' | 14:00 |
pranali | #topic roll call | 14:00 |
pranali | #link https://etherpad.openstack.org/p/glance-team-meeting-agenda | 14:00 |
pranali | o/ | 14:00 |
mrjoshi | o/ | 14:00 |
pranali | ok, so assuming everyone is back here let's start :) | 14:01 |
rosmaita | o/ | 14:01 |
pranali | #topic release/periodic jobs updates | 14:01 |
pranali | M2 is 5 weeks from now which will be spec freeze for us as well | 14:01 |
pranali | Periodic jobs are all green except intermittent TIME_OUTs on fips jobs | 14:02 |
pranali | moving to next | 14:02 |
pranali | #topic RBD deletion Issue | 14:02 |
pranali | #link https://bugs.launchpad.net/glance/+bug/2045769 - Image remains in active state even image data is deleted from the rbd store | 14:02 |
pranali | so i've observed this issue during testing of the new add location api, when a delete is attempted while hash calculation is still ongoing after the image has been set to active | 14:03 |
rosmaita | you asked me to look at this yesterday but i forgot | 14:04 |
pranali | yeah np | 14:04 |
pranali | I just thought you might have an idea on this because i have seen one of your old patches where the commit msg mentions that when the store throws the in-use exception it deletes the data as well | 14:05 |
pranali | #link https://github.com/openstack/glance/commit/f267bd6cde0e2b3ef5d08ae7c91831e1c88ed990 | 14:05 |
pranali | this one ^ | 14:05 |
rosmaita | ok, i will claim that i co-authored the part that doesn't have a bug | 14:06 |
pranali | ohh ok | 14:06 |
rosmaita | (just kidding) | 14:07 |
pranali | :D | 14:07 |
pranali | I've tried to fix that in my current location import patch by marking the image as deleted after catching the exception | 14:08 |
pranali | #link https://review.opendev.org/c/openstack/glance/+/886749/31/glance/async_/flows/location_import.py#83 | 14:08 |
pranali | but after noticing this same issue for image download as well i think it should be handled in the delete operation itself, right? | 14:09 |
rosmaita | sorry, i'm still trying to figure out the context (looking at the bug https://bugs.launchpad.net/glance/+bug/2045769 ) | 14:10 |
rosmaita | with that bug, for step #1 | 14:10 |
rosmaita | the image has been uploaded and gone active before you go to step #2, is that right? | 14:11 |
pranali | yes | 14:11 |
rosmaita | ok, and since that was a regular 'glance image-create', the hash would be done during the upload (not later? or have we changed that?) | 14:12 |
pranali | yeah i think so | 14:13 |
rosmaita | ok, what i'm getting at is that i don't think the hash computation is involved in this issue | 14:13 |
rosmaita | the error in step #2 i'm pretty sure is coming from the client | 14:14 |
pranali | hmm need to check that but download has got NotFound error since the data was lost | 14:15 |
pranali | #link https://paste.opendev.org/show/bg8hJ7kF7CYJVM4lZMe2/ | 14:16 |
rosmaita | right | 14:16 |
pranali | I'm just not sure why store raises InUseByStore exception if it deletes the data | 14:17 |
rosmaita | right | 14:17 |
rosmaita | i wonder if the image cache has anything to do with this | 14:18 |
abhishekk | rosmaita, let me explain the issue here | 14:18 |
rosmaita | have you tried it without the cache (or do we always cache these days) | 14:18 |
rosmaita | please! | 14:18 |
abhishekk | I created the image A of 5 gb (hash is calculated) and image is active now | 14:19 |
abhishekk | I sent image download request, download started and in 2nd window I sent delete image request | 14:19 |
abhishekk | Now what happens is the download is interrupted as the data is deleted, but the delete call fails saying the image is in use | 14:20 |
abhishekk | and the image remains in active state | 14:20 |
abhishekk | on the 2nd download call we get an error that the image has no data | 14:20 |
abhishekk | Problem is the store reports that the image is busy but it also deletes the data from the store | 14:20 |
abhishekk | And the user sees that the delete call failed and that the image is still active | 14:21 |
rosmaita | are all the locations gone at that point? | 14:21 |
abhishekk | (assume) There is only one location, store deletes the data and returns Busy exception to glance | 14:22 |
abhishekk | glance does not delete the location and keeps the image in active state | 14:22 |
rosmaita | but does it leave the location on the image | 14:22 |
abhishekk | yes | 14:22 |
rosmaita | ok | 14:22 |
abhishekk | I think this is a serious issue | 14:23 |
abhishekk | There are two possibilities, | 14:23 |
abhishekk | regression in ceph | 14:24 |
abhishekk | or store code is wrong | 14:24 |
rosmaita | or both! | 14:24 |
abhishekk | :D | 14:24 |
abhishekk | my suggestion to pranali is to deploy quincy and check this scenario | 14:25 |
rosmaita | ok, on the plus side, though, the user deleted the image and the data is gone, so they will be annoyed that it still shows active, but shouldn't be too annoyed because they deleted it | 14:25 |
abhishekk | correct | 14:25 |
rosmaita | do we have debug logs from the first image delete in this scenario? | 14:26 |
pranali | abhishekk, i've tried to change the ceph version in nova-ceph-multistore job but it's failing | 14:27 |
pranali | #link https://zuul.opendev.org/t/openstack/build/e62d4a18b87f4be1872c84c0560f61d3 | 14:27 |
abhishekk | pranali, might have it | 14:28 |
* pranali is checking | 14:28 | |
abhishekk | find out the error, and try, because we need to rule out the possibilities | 14:28 |
abhishekk | this issue is easily reproducible, so we can get logs again | 14:29 |
rosmaita | so basically, glance_store rbd driver asks ceph to delete the data, it gets back an is-busy-error, but ceph deletes the data anyway | 14:29 |
rosmaita | glance thinks that the delete failed, so it keeps the image in 'active' | 14:30 |
rosmaita | and doesn't remove the location where it thinks the data is | 14:30 |
abhishekk | correct | 14:30 |
rosmaita | but since ceph deleted the data, all downloads fail | 14:30 |
abhishekk | yes | 14:30 |
rosmaita | and this is with current master code, and which ceph? | 14:31 |
pranali | i think the latest ceph, Reef | 14:32 |
rosmaita | ok | 14:32 |
pranali | we have not yet confirmed whether it's there with previous versions of ceph as well | 14:32 |
pranali | #link https://etherpad.opendev.org/p/image-delete-from-rbd-issue | 14:33 |
pranali | I've these logs atm | 14:33 |
rosmaita | ok, thanks | 14:34 |
abhishekk | etherpad is empty? | 14:34 |
pranali | :O | 14:34 |
pranali | I can see the logs in that etherpad | 14:35 |
abhishekk | its empty for me | 14:36 |
mrjoshi | It's empty for me too | 14:36 |
abhishekk | \o | 14:37 |
abhishekk | \o/ voodoo | 14:37 |
* croeland1 sees nothing | 14:38 | |
pranali | hmm not sure why it's now showing me reconnecting continuously :/ | 14:39 |
* abhishekk it's magic, issue does not want us to solve it except pranali | 14:39 | |
pranali | LOL | 14:39 |
pranali | ok, I think we should move ahead and can continue this discussion on glance channel | 14:41 |
pranali | #link https://paste.opendev.org/show/b8Lt6CgF5Sjd7Sd3g8SV/, tried to add the logs here | 14:42 |
rosmaita | ok, i can see that one | 14:42 |
abhishekk | I think your logs broke etherpad :P | 14:42 |
pranali | please ignore the above link, #link https://paste.opendev.org/show/b8sruRYp2tcRcJ9sWwqB/ | 14:43 |
abhishekk | I think you can explore above possibilities to isolate the problem | 14:45 |
abhishekk | let's move ahead, | 14:45 |
pranali | yeah | 14:45 |
pranali | #topic Specs | 14:45 |
pranali | #link https://review.opendev.org/c/openstack/glance-specs/+/899367 - Use Centralized database for cache operations | 14:46 |
abhishekk | also do one check, upload large image and during upload delete the image and see what happens | 14:46 |
pranali | #link https://review.opendev.org/c/openstack/glance-specs/+/900267 - New API to restore image | 14:46 |
pranali | #link https://review.opendev.org/c/openstack/glance-specs/+/899804 - [Spec Lite] Deprecate location strategy | 14:46 |
pranali | #link https://review.opendev.org/c/openstack/glance-specs/+/899805 - [Spec Lite] Deprecate cachemanage middleware | 14:46 |
pranali | #link https://review.opendev.org/c/openstack/glance-specs/+/899857 - Caracal project priorities | 14:46 |
pranali | abhishekk, yeah that also should be tried, I will do that | 14:47 |
pranali | kindly have a look at these specs; I've sent emails about the deprecation specs to the ML, so we can wait on those till the end of this month in case anyone has objections | 14:48 |
abhishekk | please review centralized db spec, that is most important this cycle | 14:49 |
pranali | yes | 14:49 |
pranali | that's it from me | 14:50 |
pranali | let's move to open discussions | 14:51 |
pranali | #topic Open Discussions | 14:51 |
abhishekk | I don't have anything else | 14:51 |
mrjoshi | I would like to highlight | 14:51 |
rosmaita | ok, somebody please bug me tomorrow about reviewing specs | 14:51 |
pranali | rosmaita, ack :) | 14:52 |
abhishekk | haha | 14:52 |
* abhishekk signing out | 14:52 | |
abhishekk | thank you all | 14:52 |
pranali | Thanks everyone for joining !! | 14:52 |
pranali | #endmeeting | 14:52 |
opendevmeet | Meeting ended Thu Dec 7 14:52:59 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 14:52 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.html | 14:52 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.txt | 14:52 |
opendevmeet | Log: https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.log.html | 14:52 |
*** tobias-urdin34 is now known as tobias-urdin | 17:27 |
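
The failure mode summarized in the RBD deletion topic (the store removes the data but still reports the image as in use, so glance keeps the image active with a stale location) can be illustrated with a small, self-contained sketch. Everything here is hypothetical and for illustration only: `FakeRbdStore`, `Image`, `delete_image_naive`, and `delete_image_checked` are toy names, and the local `InUseByStore` class is only a stand-in for the store's in-use error. This is not glance or glance_store code, and the meeting did not settle whether the real fix belongs in glance, in the store driver, or in ceph.

```python
from dataclasses import dataclass, field


class InUseByStore(Exception):
    """Local stand-in for the store's 'image in use' error (hypothetical)."""


class FakeRbdStore:
    """Toy store mimicking the reported behaviour: delete() removes the
    data and then still raises an in-use error."""

    def __init__(self):
        self._data = {}

    def add(self, location, data):
        self._data[location] = data

    def exists(self, location):
        return location in self._data

    def delete(self, location):
        # The data is removed...
        self._data.pop(location, None)
        # ...but the caller is told the image is busy.
        raise InUseByStore(location)


@dataclass
class Image:
    status: str = "active"
    locations: list = field(default_factory=list)


def delete_image_naive(image, store):
    """Trusts the exception: the image stays active with a stale location."""
    for loc in list(image.locations):
        try:
            store.delete(loc)
        except InUseByStore:
            return False
        image.locations.remove(loc)
    image.status = "deleted"
    return True


def delete_image_checked(image, store):
    """Assumed mitigation: on an in-use error, check whether the data is
    actually still there before giving up on the delete."""
    for loc in list(image.locations):
        try:
            store.delete(loc)
        except InUseByStore:
            if store.exists(loc):
                return False  # data really is still present and busy
            # Data is gone despite the error: drop the location anyway.
        image.locations.remove(loc)
    image.status = "deleted"
    return True
```

With the naive handler the image ends up active but without data, which matches the symptom in the bug report; the checked variant is only one possible way to reconcile the image state, under the assumption that the store can cheaply tell whether the data still exists.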