Monday, 2020-06-22

*** spotz has joined #openstack-glance00:25
*** rcernin_ has joined #openstack-glance02:58
*** rcernin has quit IRC02:59
*** rcernin_ has quit IRC03:16
*** rcernin_ has joined #openstack-glance03:32
*** Liang__ has joined #openstack-glance03:34
*** rcernin_ has quit IRC03:45
*** rcernin has joined #openstack-glance03:45
*** ratailor has joined #openstack-glance04:16
*** evrardjp has quit IRC04:33
*** evrardjp has joined #openstack-glance04:33
*** Liang__ has quit IRC04:49
*** Liang__ has joined #openstack-glance04:51
*** rcernin has quit IRC05:32
*** udesale has joined #openstack-glance05:33
*** rcernin has joined #openstack-glance05:40
*** m75abrams has joined #openstack-glance06:16
openstackgerritAbhishek Kekane proposed openstack/glance_store master: Release notes for Victoria Milestone 1  https://review.opendev.org/73721806:41
openstackgerritRajat Dhasmana proposed openstack/glance_store master: Don't allow image creation with encrypted nfs volumes  https://review.opendev.org/73250607:05
openstackgerritRajat Dhasmana proposed openstack/glance_store master: Don't allow image creation with encrypted nfs volumes  https://review.opendev.org/73250607:08
*** bhagyashris is now known as bhagyashris|lunc07:27
*** lpetrut has joined #openstack-glance07:40
*** rcernin_ has joined #openstack-glance07:47
*** rcernin has quit IRC07:47
*** rcernin_ has quit IRC07:54
*** bhagyashris|lunc is now known as bhagyashris08:34
*** rajivmucheli has joined #openstack-glance08:52
*** rajivmucheli has quit IRC09:13
*** tkajinam has quit IRC09:21
*** Liang__ has quit IRC09:59
*** ratailor has quit IRC12:21
*** udesale_ has joined #openstack-glance12:25
*** udesale has quit IRC12:27
*** rosmaita has joined #openstack-glance12:51
*** jv_ has joined #openstack-glance13:52
openstackgerritMarkus Hentsch proposed openstack/glance-specs master: Spec for the Glance part of Image Encryption  https://review.opendev.org/60966714:48
*** m75abrams has quit IRC15:44
*** gyee has joined #openstack-glance15:55
*** udesale_ has quit IRC16:21
*** lpetrut has quit IRC16:50
dansmithabhishekk: jokke: Need some  help figuring out why the copy-to-store isn't working in my test17:40
dansmithI see the task starting, but eventually it looks like they fail like this:17:40
dansmithhttps://zuul.opendev.org/t/openstack/build/18e4701c1a374bf09269778479160f25/log/controller/logs/screen-g-api.txt#746617:40
dansmithspecifically: glance.common.exception.Forbidden: You are not permitted to modify 'os_glance_importing_to_stores' on this image.17:41
dansmithnova just waits for the full timeout for the copy to finish but never sees that happen, I'm guessing because glance isn't updating those meta properties?17:41
abhishekkdansmith, never faced this issue before17:46
abhishekkCan I see the test?17:47
dansmithit's just a test trying to create an instance,17:47
dansmithwith my nova patch that requests a copy-to-store from file to rbd before progressing17:48
dansmithit works locally17:48
dansmithabhishekk: what would cause glance to not be able to update that status property on the image?17:49
dansmithshouldn't that be using an admin context or something such that it can always do that?17:49
abhishekkyes it should17:49
dansmithabhishekk: but it looks like it's failing even in just the plugin loading process right?17:54
dansmithis that loading the copy-to-store plugin?17:54
abhishekkNo plugin is loaded at the run-time i.e. when service starts17:55
abhishekkthis is where we build the actual flow of image creation17:56
dansmithokay, I don't even see any context usage in that path really17:56
dansmithabhishekk: you have the curse of being helpful so I'm pinging you, but if someone else is the better point of contact, let me know17:58
abhishekkdansmith, jokke is the one but I guess at the moment he is facing some connectivity issues17:59
dansmithokay17:59
abhishekkdoes multiple import test worked>18:00
abhishekk?18:00
abhishekkor you just trying copying scenario first?18:01
dansmiththe import conversion from qcow to raw worked fine,18:04
dansmiththis is now trying to do a copy-to-store18:04
dansmithseems to be failing here:18:04
dansmith    image.extra_properties[18:04
dansmith        'os_glance_importing_to_stores'] = ','.join((store for store in18:04
dansmith                                                     stores if18:04
dansmith                                                     store is not None))18:04
abhishekkyes, but this is the same flow which ran successfully for conversion thing18:05
dansmithbut with different owners18:05
abhishekkBut in that case you are creating image with single store I guess18:05
dansmithis that why? it's letting me do a copy-to-store, but then the task fails later async?18:05
abhishekkI guess so18:06
dansmiththat seems broken to me18:07
dansmith(a) if I can read the image, why can't I copy it to a store I have access to?18:07
dansmith(b) the import should fail with 403,18:07
dansmithand not just fail later18:07
dansmithand (c) if it fails later, it needs to add to failed_stores or whatever so I know to stop looking18:07
abhishekkthe C part is not executing as it is failing before actual taskflow execution starts18:08
abhishekkwhat two different owners are you talking about?18:09
abhishekkJun 22 15:57:24.812816 ubuntu-bionic-rax-iad-0017311577 glance-api[14039]: WARNING keystonemiddleware.auth_token [-] Authorization failed for token: keystonemiddleware.auth_token._exceptions.InvalidToken: Token authorization failed18:09
dansmithdevstack uploads the image as one tenant, but tempest tests all run under a different one18:10
abhishekkI can see this warning in between as well, but I guess it doesn't related18:10
dansmithre: (c) above, if you returned 20x to me when you started the task, but the task fails later, there needs to be some way for my polling loop to determine that it has failed and I should stop polling18:11
dansmithideally, you would start the task running before returning from the initial POST, so I know that 20x means it's actually loaded and runnable, but what matters more is that I have some way of knowing18:13
abhishekkyeah18:14
dansmithbecause I'm also seeing what looks like a race to start the conversion.. if two clients ask for copy-to-store simultaneously, it kinda looks like two tasks will start with the same goals18:14
abhishekkyes18:16
dansmiththat's a pretty big problem, if so18:17
abhishekkIMO second operation will fail in that case18:18
dansmithpresumably tasks are tracked in the database such that we could determine if more than one is started with the same goal before running right?18:18
dansmithwill currently or should be made to fail?18:18
abhishekklooking for confirmation18:20
dansmithokay, my next question will be... if I make nova do this with admin credentials, will that fix the ability to update the image?18:21
dansmithand/or do you think it's a bug that this is failing in the first place with another tenant asking for the copy?18:22
abhishekkFor first question, if two clients ask for copy-to-store same image simultaneously then there will problem18:23
abhishekkif I make nova do this with admin credentials, will that fix the ability to update the image? >>> this should work ok18:24
dansmithokay, so admin should fix me quickly, but.. what *should* be the case? I would expect using the user's token is better so that any auditing takes place as the user, etc.18:26
openstackgerritMerged openstack/glance_store master: Release notes for Victoria Milestone 1  https://review.opendev.org/73721818:26
abhishekknot sure at the moment, need to go through it again :(18:29
dansmithokay18:31
dansmithI just want to know whether it's expected to use admin for this, which means the bug is "I should have gotten 403 instead of 20x", or if it should work which means the bug is "admin shouldn't be required but is"18:32
dansmithabhishekk: either way, it sounds like there's also an important bug around preventing two nodes from copying the image at the same time right?18:32
abhishekktwo nodes, means two controller nodes?18:33
dansmithno, compute nodes18:33
dansmithor of course, two separate users who do this manually18:35
abhishekkpossibility is more if that is not HA setup (i.e. only single glance is running then its 100% sure)18:35
dansmithokay I'm not sure what you mean18:36
abhishekkI am saying staging is local to api node18:36
dansmitheven with only one controller node running glance standalone, there will be multiple worker processes right?18:36
abhishekkwe are using file store for staging which will be only one for all the processes18:37
dansmithokay, so are you saying that if we weren't hitting this permission issue, that one of the threads would "win" and the rest would never start, but only if they're all threads on one machine?18:37
dansmithand that if there are multiple API worker nodes, that "bad things would happen" in that case?18:37
abhishekkwhat I am saying is18:38
abhishekk1. there are two requests for copying image A to store 2 and 318:38
abhishekk2. first request will copy image in staging area /whatever/os_glance_staging_dir18:39
abhishekk3. Second request will also try to copy iamge in staging area but it will see that file is present there already so it will exit that task and proceed to import18:39
abhishekk4. In this case there might be possibility that when Import task starts for request 2 staging of image from request 1 is not completed18:40
dansmithso it will start copying the partial image?18:41
abhishekkwhich leads to corrupt or partial image upload18:41
dansmithgotcha, so that will happen even in the single-glance-api case18:41
abhishekkyes, can be18:42
dansmithokay, so that's a bug to file for sure right?18:42
abhishekkyes18:43
dansmithcan you do that and get me the bug number, since you understand that path?18:43
dansmithand I think another bug is the "report 20x but fail with permissions later, should have returned 403" issue right? I can file that one18:43
abhishekkIs it ok if I do it tomorrow morning my time?18:44
dansmithNO, DO IT NOW!18:44
dansmith(yes of course :)18:44
abhishekk:D18:44
dansmithbut confirm, the above 403 is a bug and I will file, yeah?18:44
dansmithi.e. if it is going to fail because admin is required, it should fail as 403 immediately and not 20x and fail later18:45
abhishekkI am still confused about this case, but I think it as a bug18:46
dansmithokay I will write it up as a bug and you can look tomorrow and maybe get a better idea with complete prose instead of my IRC ramblings18:46
abhishekkWe need couple of more error handling around these new plugins18:47
abhishekklike if import operation is in progress and someone sends copy-image request for the same image18:47
dansmithsupposedly there is a known error for that, according to the API18:48
dansmithand I'm checking for it18:48
dansmithhowever, I think it doesn't ever happen because the credential thing never allows the other one to start18:48
abhishekkit will happen18:49
abhishekkWe have some flags around import operation to set image to active while importing to multiple stores18:49
dansmithokay, I'm checking for it, but I don't log anything, so maybe it is18:49
dansmithabhishekk: right, but because of the permissions thing, the image never gets the "importing_to_stores=$newstore" set on it18:50
dansmithso if you're checking that, you won't see it in progress18:50
dansmithbut maybe there is other accounting?18:50
abhishekkI will test this scenario tomorrow which is in my mind18:51
dansmithack18:51
abhishekkBut sounds like "importing_to_stores=$newstore this will prevent second one18:51
dansmithmeaning prevent the race between two competing imports?18:52
smcginnisabhishekk: I updated https://review.opendev.org/#/c/735735/ to pick up that release note. If you could take a quick look when you have a moment, that would be great.18:52
abhishekksmcginnis, ack, will look in 5-10 minutes18:52
smcginnisThanks!18:53
abhishekkdansmith, I will try to reproduce this issue in my local environment first before filing this18:53
dansmithokay18:53
abhishekkJust to be on same page, the scenario I will test will be18:54
abhishekk1. create image in single store18:54
abhishekk2. send one request to copy it in other stores18:54
abhishekk3. Send 2nd request to do the same18:54
abhishekkalmost immediately18:54
dansmithyes, but,18:55
dansmithunless the task checks that no other task is also doing the import *after* it has succeeded in updating the image state, you can get a double insert race18:55
dansmithmeaning you get this pattern:18:55
dansmith1. task1 checks to see if already importing, false18:55
dansmith2. task2 checks to see if already importing, false18:56
dansmith3. task1 updates to importing18:56
dansmith4. task2 updates to importing18:56
dansmith5. chaos18:56
dansmithit's hard to prove that can't happen by testing externally18:56
abhishekkIn case of copy operation iamge status does not change it remains active only18:56
dansmithI know, by "checks" I mean "checks to see if the store is in os_glance_importing_to_stores"18:57
abhishekkaha18:57
dansmithto prevent the above,18:57
dansmitheach task needs a second check, after it succeeds in updating, to make sure that it and only it is importing18:57
abhishekkhmm18:58
abhishekksmcginnis, what is the significance of 'No' an empty file?19:00
smcginnisHuh, no idea where that came from. Let me fix that quick.19:01
abhishekksmcginnis, ack19:01
abhishekkdansmith, I am going offline now, its quiet late here (please let me know if you found something else)19:02
smcginnisabhishekk: Cleaned that up.19:02
dansmithabhishekk: just filed: https://bugs.launchpad.net/glance/+bug/188458719:02
openstackLaunchpad bug 1884587 in Glance "image import copy-to-store API should reflect proper authorization" [Undecided,New]19:02
smcginnisabhishekk: Have a good night.19:02
dansmithabhishekk: no problem we can circle back tomorrow19:02
abhishekksmcginnis, looking19:02
abhishekkdansmith, looking19:02
dansmithabhishekk: seriously, it's fine to go and look tomorrow19:03
abhishekkyes, thank you and have a good day ahead19:03
abhishekksmcginnis, for you too ^^19:03
dansmitho/19:03
smcginnisThanks abhishekk19:04
abhishekko/~19:04
dansmithabhishekk: for your morrow, I filed this bug in anticipation of additional looking you might do tomorrow, just to try to clearly lay out the process by which multiple threads may get started: https://bugs.launchpad.net/glance/+bug/188459619:53
openstackLaunchpad bug 1884596 in Glance "image import copy-to-store will start multiple importing threads due to race condition" [Undecided,New]19:53
dansmithif that is not possible for some reason, then we can close it as invalid, but ... I think it's valid :)19:53
openstackgerritDan Smith proposed openstack/glance master: WIP: Create image import factory with admin context  https://review.opendev.org/73738221:29
*** rcernin_ has joined #openstack-glance22:32
*** rcernin_ has quit IRC22:47
*** tkajinam has joined #openstack-glance22:51
*** rcernin_ has joined #openstack-glance23:02
*** rcernin_ has quit IRC23:16
*** rcernin has joined #openstack-glance23:18

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!