*** abhishekk is now known as akekane|home | 03:55 | |
*** akekane|home is now known as abhishekk | 03:55 | |
*** akekane_ is now known as abhishekk | 06:01 | |
*** rpittau|afk is now known as rpittau | 07:24 | |
*** jokke_ is now known as jokke | 11:07 | |
*** jokke is now known as jokke_ | 11:07 | |
abhishekk | #startmeeting glance | 14:00 |
---|---|---|
opendevmeet | Meeting started Thu Aug 26 14:00:09 2021 UTC and is due to finish in 60 minutes. The chair is abhishekk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:00 |
opendevmeet | The meeting name has been set to 'glance' | 14:00 |
abhishekk | #topic roll call | 14:00 |
abhishekk | #link https://etherpad.openstack.org/p/glance-team-meeting-agenda | 14:00 |
dansmith | o/ | 14:00 |
abhishekk | o/ | 14:00 |
abhishekk | let's wait a couple of minutes for others to show | 14:01 |
abhishekk | I doubt rosmaita will join us today | 14:01 |
jokke_ | o/ | 14:01 |
abhishekk | Let's start | 14:02 |
abhishekk | #topic release/periodic jobs update | 14:02 |
abhishekk | M3 next week, but we will tag it a week after M3 | 14:02 |
abhishekk | i.e. the week after next | 14:02 |
abhishekk | so we still have around 6 working days to get things done | 14:02 |
abhishekk | python-glanceclient needs to be tagged next week though | 14:03 |
abhishekk | I will put a release patch around Sept 01 for the same | 14:03 |
pdeore | o/ | 14:03 |
abhishekk | Surprisingly, the periodic jobs have had no timeouts for the last 3 days | 14:04 |
abhishekk | all green at the moment | 14:04 |
abhishekk | #topic M3 targets | 14:04 |
abhishekk | Glance Xena 3 review dashboard - https://tinyurl.com/glance-xena-3 | 14:04 |
abhishekk | Most of the policy patches are merged and the remaining ones are approved | 14:04 |
abhishekk | due to heavy traffic in the gate we are facing some unusual failures; I will keep a watch on them | 14:05 |
jokke_ | I'd say usual milestone failures :) | 14:05 |
abhishekk | Thank you croelandt, dansmith and lance for reviewing these patches as a priority | 14:05 |
jokke_ | Every cycle the same thing. Just like every year the winter surprises the Finns :D | 14:06 |
abhishekk | again and again at a crucial time | 14:06 |
abhishekk | Cache API - Still under review - FFE required? | 14:06 |
abhishekk | There are some comments on tests, and some doc changes need to be made | 14:07 |
abhishekk | we need to mention the header we are using to clear the cached and queued images in the docs and api reference as well | 14:07 |
dansmith | I have some more draft comments I'm still mulling, I will move those to the latest PS or drop them accordingly | 14:07 |
jokke_ | I fixed the comments and the output of clear_cache as Dan kindly pointed out that it was very silly behaviour | 14:07 |
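For readers following this thread, here is a minimal sketch of the kind of request the missing documentation would describe, written with python-requests. It assumes the cache-clearing endpoint is `DELETE /v2/cache` and that the header under discussion is named `x-image-cache-clear-target` (accepting `cache`, `queue`, or an empty value for both); the endpoint URL, token, and especially the header name are taken from the cache API work under review at the time and should be checked against the api-ref that finally merged.

```python
# Hedged sketch only: the endpoint path and header name are assumptions based
# on the cache API patches under review; verify against the merged api-ref.
import requests

GLANCE_URL = "http://glance.example.com:9292"  # placeholder Glance endpoint
TOKEN = "replace-with-a-keystone-token"        # placeholder auth token

resp = requests.delete(
    f"{GLANCE_URL}/v2/cache",
    headers={
        "X-Auth-Token": TOKEN,
        # 'cache' targets cached images, 'queue' targets queued images;
        # an empty value is assumed to clear both.
        "x-image-cache-clear-target": "cache",
    },
)
resp.raise_for_status()  # a 2xx (likely 204 No Content) indicates success
```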
abhishekk | ack | 14:08 |
abhishekk | We will revisit the progress next week and decide on FFE grant for the same | 14:08 |
abhishekk | Any questions ? | 14:09 |
abhishekk | Same goes for metadef project persona | 14:09 |
jokke_ | Just FYI I'll be on PTO next week. I'd say the debate on the tests is a great opportunity for a follow-up patch after the FF, as that API change needs to merge so we can get the client patch in before it needs to be released | 14:09 |
abhishekk | Patches are now open for review and we have good functional test coverage there to ensure the new RBAC behavior | 14:09 |
jokke_ | so FFE for that specific work is not great as it needs the client side | 14:10 |
abhishekk | ack, I haven't had enough time to have a look at the new patch set or other review comments | 14:10 |
abhishekk | In your absence I will work on that | 14:11 |
abhishekk | Coming back to RBAC metadef, we are working on glance-tempest-plugin protection testing and that will be up and complete by tomorrow | 14:12 |
abhishekk | But I think the functional coverage on the glance side is good and we can consider those changes for M3 | 14:13 |
abhishekk | I will apply for the FFE for the same if this is not merged before M3 | 14:14 |
abhishekk | Moving ahead | 14:14 |
abhishekk | #topic Wallaby backports | 14:15 |
croelandt | whoami-rajat: ^ | 14:15 |
croelandt | So, these 2 backports in the agenda are part of a huge bug fix that includes Cinder patches as well | 14:15 |
whoami-rajat | hi | 14:15 |
croelandt | We were under the impression it was ok to backport in Wallaby | 14:15 |
croelandt | the first patch is a new feature, but it is also support for the bug fix | 14:16 |
croelandt | I think Rajat has users affected by this in upstream Cinder, am I right? | 14:16 |
whoami-rajat | yes | 14:16 |
whoami-rajat | so most of the fixes on the glance cinder side, like multi-store support and format info support, are all dependent on this attachment API code | 14:16 |
abhishekk | I am also under the impression that if we have a crucial bug fix then we can backport supporting patches for it to stable branches | 14:17 |
abhishekk | and I have seen some similar kinds of backports upstream in the past (not for glance though) | 14:18 |
croelandt | jokke_: what do you think? | 14:18 |
whoami-rajat | I've already backported the cinder side changes and they're already +2ed, so we won't have any issues on the code side as far as I'm aware | 14:18 |
abhishekk | Can we also have opinion from some other stable maintainers ? | 14:19 |
jokke_ | I think I already pointed out in the early phase of fixing these bugs that we should not have made the prevent-qcow2-on-NFS fix depend on the attachment change, as that is not really backportable under the policy. | 14:20 |
whoami-rajat | we can't do the qcow2 change without adding the new attachment API changes, it depends on the attachments get command | 14:21 |
whoami-rajat | s/command/api | 14:21 |
abhishekk | hmm, the policy suggests some corner cases as well | 14:21 |
jokke_ | And I do know and understand that we will be backporting these downstream anyway, but that's a totally different story. As for the upstream backport, all the refactoring, new dependencies etc. of that attachment API make it a very dodgy backport | 14:23 |
croelandt | Until when are we gonna be backporting stuff in wallaby? | 14:23 |
croelandt | This might not be an issue for long :D | 14:24 |
jokke_ | whoami-rajat: I think we could have done that by looking at the volume connection we get and at the image file we have. There was no need for the attachment api to figure out that we have a qcow2+NFS combo that we cannot support | 14:24 |
jokke_ | croelandt: wallaby is still in active stable maintenance for another ~8 months | 14:25 |
abhishekk | I think we should get opinion from stable team as well | 14:26 |
croelandt | jokke_: is it likely that backporting these patches is going to be an issue in the next 8 months? | 14:26 |
whoami-rajat | jokke_, the initialize_connection call doesn't return the right format; the feature I implemented on the cinder side was discussed during the PTG and it was decided to include the format in connection_info in the new attachment API response | 14:26 |
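To illustrate the dependency whoami-rajat is describing, here is a rough sketch of how a consumer could read the volume format out of the new attachments API response and reject the qcow2-on-NFS combination. This is not the glance_store cinder driver code; the microversion, credentials, attachment ID, and the exact connection_info keys are placeholder assumptions based on the discussion above.

```python
# Illustrative only -- not the actual glance_store cinder driver code.
# Assumes python-cinderclient v3 with a microversion new enough for the
# attachments API, and that connection_info carries a 'format' key as
# described above; credentials and IDs are placeholders.
from keystoneauth1.identity import v3
from keystoneauth1 import session
from cinderclient import client as cinder_client

auth = v3.Password(
    auth_url="http://keystone.example.com/v3",
    username="glance",
    password="secret",
    project_name="service",
    user_domain_id="default",
    project_domain_id="default",
)
sess = session.Session(auth=auth)
cinder = cinder_client.Client("3.54", session=sess)

attachment = cinder.attachments.show("11111111-2222-3333-4444-555555555555")
connection_info = getattr(attachment, "connection_info", {}) or {}

# Reject the combination discussed above: a qcow2-formatted volume on NFS.
if (connection_info.get("driver_volume_type") == "nfs"
        and connection_info.get("format") == "qcow2"):
    raise RuntimeError("qcow2 on an NFS-backed Cinder volume is not supported")
```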
abhishekk | Unfortunately we have a couple of requests for it from other customers, but we will stick to the policy if we need to | 14:26 |
abhishekk | we had a lengthy discussion about that | 14:27 |
croelandt | abhishekk: so, how do we make our decision? | 14:28 |
dansmith | I'm not on glance stable, | 14:28 |
dansmith | but I definitely opt for less backporting in general, and definitely extreme caution over anything complex unless it's absolutely necessary | 14:28 |
dansmith | I think I looked at this before briefly, and I don't remember all the details, but anything that requires glance and cinder things to be backported is high risk for breaking people that don't upgrade both services in lockstep unless both sides are fully tolerant (and tested that way) of one happening before the other | 14:29 |
dansmith | downstream we can handle that testing and risk (and support if it breaks) but it's not really appropriate material for upstream stable in general, IMHO | 14:30 |
abhishekk | croelandt, I think we need some opinion from other stable maintainers as well | 14:31 |
abhishekk | but as dansmith said just now, this might be problematic in the case of an upgrade | 14:31 |
rosmaita | sorry i'm late | 14:31 |
jokke_ | dansmith: yeah, that kind of can be flagged in the requirements, which we don't currently do. But in general there are just too many red flags. It's not just one or two of our stable rules this specific case is crossing | 14:31 |
jokke_ | hi rosmaita \o | 14:31 |
jokke_ | just in time for the attachment api backport discussion | 14:32 |
rosmaita | ah | 14:32 |
dansmith | jokke_: requirements.txt you mean? that has nothing to do with what is installed on other servers in a cluster, and certainly no direct impact on distro packages | 14:32 |
abhishekk | I think rosmaita has been a stable member for a very long time | 14:32 |
rosmaita | too long | 14:32 |
rosmaita | and not very stable | 14:32 |
abhishekk | :D | 14:32 |
croelandt | *badum tss* | 14:32 |
rosmaita | is there a place i can read the scrollback? | 14:33 |
abhishekk | So just to give you short overview | 14:33 |
rosmaita | i think the logs don't get published until the meeting ends | 14:33 |
rosmaita | ok, short overview is good | 14:33 |
jokke_ | dansmith: well sure, not the service. I was more thinking of cinderclient and os_brick needing to be able to do the right thing anyways | 14:33 |
abhishekk | we have one bug fix to backport which depends on a patch that is implemented as a feature | 14:33 |
abhishekk | #link https://review.opendev.org/c/openstack/glance_store/+/805927 | 14:33 |
abhishekk | this is actual bug fix | 14:33 |
abhishekk | #link https://review.opendev.org/c/openstack/glance_store/+/805926 | 14:34 |
abhishekk | this is the dependent patch which is needed for the above backport | 14:34 |
abhishekk | I am in favor of this backport because I thought: | 14:34 |
abhishekk | the change is related to the cinder driver and will not affect any other glance backend drivers | 14:35 |
jokke_ | rosmaita: basically the qcow2+NFS fix was implemented in a way that depends on the attachment API support, which is a problematic backport because it introduces a new dependency, depends on cinder side backports, and refactors a significant amount of the driver code | 14:35 |
abhishekk | and in the past for some other projects I have seen these kinds of backports being supported | 14:35 |
rosmaita | well, rajat described the cinder attitude to driver backports very clearly in his last comment on https://review.opendev.org/c/openstack/glance_store/+/805926 | 14:36 |
abhishekk | yes | 14:37 |
rosmaita | our view is that people actually use the drivers, and it's a big ask to make them upgrade their entire cloud to a new release, rather than update within their current release | 14:37 |
dansmith | you could apply that reasoning to any feature backport for a buddy right? | 14:38 |
rosmaita | not really | 14:38 |
dansmith | "My buddy doesn't want to upgrade but does want this one feature, so we're going to backport so he doesn't have to upgrade?" | 14:38 |
rosmaita | if it impacts main cinder code, we don't do it | 14:39 |
rosmaita | the difference is that it's isolated to a single driver | 14:39 |
jokke_ | dansmith: that's why we backport in downstream like there is no tomorrow | 14:39 |
dansmith | you can apply this to a driver or the main code, either yields the same result for me :) | 14:39 |
dansmith | jokke_: exactly | 14:39 |
croelandt | jokke_: true that | 14:39 |
abhishekk | croelandt, I think downstream it is then | 14:40 |
croelandt | jokke_: I'm gonna start backporting every patch, at this rate | 14:40 |
abhishekk | I can understand the feeling | 14:40 |
jokke_ | And I personally think that downstream that is a business decision with an attached commitment to support any problems it brings. Upstream we should be very cautious about what we backport, as we do not have similar control of the environment | 14:41 |
dansmith | this ^ | 14:41 |
abhishekk | ack | 14:41 |
abhishekk | any counter arguments to this? | 14:42 |
abhishekk | ok, moving ahead then | 14:43 |
abhishekk | #topic Holiday plans | 14:43 |
abhishekk | croelandt is going on a 2-week holiday from Monday | 14:43 |
abhishekk | and jokke_ for 1 week | 14:43 |
rosmaita | slackers! | 14:44 |
abhishekk | is any other core member planning to take time off during the same period? | 14:44 |
croelandt | rosmaita: hey, people died so that I could have PTO | 14:44 |
croelandt | I'm glad they did not die so I could eat Brussel sprouts | 14:44 |
jokke_ | rosmaita: I have an excuse. HR and the Irish gov will rain proper shaitstorm on me and my manager if I don't use my holidays. so there's that :P | 14:44 |
abhishekk | those two weeks we are going to have ninja time I guess | 14:45 |
rosmaita | i'm just jealous, that's all | 14:45 |
abhishekk | ++ | 14:45 |
dansmith | abhishekk: I have no such plans at the moment | 14:45 |
abhishekk | great | 14:45 |
abhishekk | me neither | 14:45 |
croelandt | rosmaita: Unionize, comrade | 14:46 |
jokke_ | LOL | 14:46 |
rosmaita | is RH unionized in France? | 14:46 |
abhishekk | and on top of that jokke_ will send me picture of his tent and beer | 14:46 |
abhishekk | I guess that's it from me for today | 14:47 |
jokke_ | Tovarich Cyril :D | 14:47 |
abhishekk | moving to open discussion | 14:47 |
abhishekk | #topic Open discussion | 14:47 |
abhishekk | Nothing from me | 14:47 |
rajiv | Hi, sorry to divert from the holiday mood. Firstly, thanks for merging/commenting on a few of my bugs. The one below is still pending: https://bugs.launchpad.net/swift/+bug/1899495 - any update on this? | 14:47 |
jokke_ | I can be inclusive and send the pictures to the rest of ye too! | 14:47 |
abhishekk | Happy holidays croelandt and jokke_ | 14:47 |
rosmaita | jokke_: what's your current beer total on that app? | 14:48 |
abhishekk | rajiv, I think everyone is busy at the moment on M3 priorities as it is just around the corner | 14:48 |
jokke_ | rajiv: like I mentioned, that is a tricky one from the Glance point of view. And I have a couple of others still in the works too | 14:48 |
rajiv | okay, I would like to understand how the glance-api process consumes memory. For example, different image types and sizes result in different glance-api process memory consumption. | 14:49 |
jokke_ | rajiv: I'll be back chasing those once I'm back after next week. | 14:49 |
croelandt | rosmaita: some employees are in a union | 14:49 |
* croelandt is not :-( | 14:49 | |
abhishekk | we can make ours | 14:49 |
rosmaita | rajiv: i think you get a 409 on a container delete if it's not empty | 14:49 |
abhishekk | glance union | 14:49 |
rosmaita | not sure that's a helpful operation | 14:50 |
rajiv | rosmaita: the swift container isn't empty; since the deletion goes in parallel, a conflict occurs. | 14:50 |
rosmaita | i mean, "observation" | 14:50 |
rajiv | we introduced retries but it didn't help either. | 14:50 |
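As an aside for anyone hitting the same 409, a rough sketch of the kind of cleanup-and-retry being described, using python-swiftclient: Swift only allows deleting a container once every object (including image segment chunks) is gone, otherwise it answers 409 Conflict. The credentials, container name, and retry policy here are placeholders, not Glance's own deletion path.

```python
# Illustrative workaround sketch, not Glance's own deletion code.
# Swift refuses to delete a non-empty container with 409 Conflict, so the
# objects (image chunks) have to be removed first; names are placeholders.
import time

from swiftclient import client as swift_client
from swiftclient.exceptions import ClientException

conn = swift_client.Connection(
    authurl="http://keystone.example.com/v3",
    user="glance",
    key="secret",
    auth_version="3",
    os_options={
        "project_name": "service",
        "project_domain_name": "Default",
        "user_domain_name": "Default",
    },
)

container = "glance_images"  # placeholder container name

for attempt in range(5):
    _, objects = conn.get_container(container, full_listing=True)
    for obj in objects:
        conn.delete_object(container, obj["name"])
    try:
        conn.delete_container(container)
        break
    except ClientException as exc:
        if exc.http_status != 409:
            raise
        # 409: new chunks appeared or deletes are still settling; retry.
        time.sleep(2)
```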
abhishekk | rosmaita, is cinder hitting any tempest failures in gate at the moment ? | 14:50 |
rosmaita | abhishekk: i sure hope not | 14:50 |
rajiv | today, I had a user upload 20 images in parallel and glance-api crashed. | 14:51 |
rosmaita | i will look | 14:51 |
abhishekk | rosmaita, ack, please let me know | 14:51 |
rajiv | rosmaita: jokke_ abhishekk any suggestion on my second question ? | 14:52 |
jokke_ | rajiv: I saw your question in #os-glance ... so the memory consumption is very tricky to predict. There is quite a bit of buffering involved as the data actually passes through the API service, and like you saw there might be lots of data in buffers and caches when you have lots of concurrent connections in flight | 14:52 |
rajiv | jokke_: yes, i raised it there as well but had no response, hence i asked here. | 14:52 |
jokke_ | rajiv: yeah, you were gone by the time I saw it :D | 14:53 |
rajiv | initially I set 3GB as the limit, but setting 5GB for 20 image uploads still chokes the glance-api process and sometimes the process gets killed | 14:53 |
rosmaita | abhishekk: just got a tempest-integrated-storage failure on tempest.api.compute.admin.test_volume.AttachSCSIVolumeTestJSON.test_attach_scsi_disk_with_config_drive | 14:54 |
jokke_ | In general this is one of those things; we've seen busy production clouds having a set of decent dedicated servers for gapi alone, as it can be quite taxing | 14:54 |
rajiv | hence the upload terminates and sends back an HTTP 502, and we have to manually delete the chunks in the swift container as well. | 14:54 |
rosmaita | just that one test, though | 14:54 |
abhishekk | rosmaita, yeah, I am hitting that three times since morning | 14:54 |
rosmaita | that same test? | 14:54 |
abhishekk | s/I am/I hit | 14:54 |
abhishekk | yeah | 14:55 |
jokke_ | rajiv: yeah, I don't think we ever designed any part of the service around really capping memory. So I think setting limits for it will eventually lead to that same situation again | 14:55 |
rajiv | jokke_: okay, is there a doc or code I can refer to? | 14:55 |
rajiv | to understand how memory consumption works? Or a pattern? | 14:56 |
jokke_ | rajiv: it's just a matter of whether it's 5 concurrent operations, 20 or 30. But you will eventually hit your artificial limit and get the service killed | 14:56 |
rosmaita | abhishekk: it did pass on a patch that depended on the one with the failure (which of course got a -2 from Zuul because the dependency didn't merge) | 14:56 |
abhishekk | rosmaita, not same test, mine is resize related | 14:56 |
rajiv | the image being uploaded was ~900GB; among the 20, only 3 images were created. | 14:56 |
abhishekk | https://52b5fef6b4a63a70ea73-b7be325c2c973618eb7074df9913ea2c.ssl.cf5.rackcdn.com/799636/21/check/tempest-integrated-storage/91789c0/testr_results.html | 14:57 |
jokke_ | rajiv: I can't recall us having any documentation about that. | 14:57 |
rosmaita | ok, mine was a server-failed-to-delete problem | 14:57 |
jokke_ | rajiv: the main limiting factor is chunk size | 14:57 |
abhishekk | rosmaita, ack | 14:57 |
abhishekk | thank you | 14:57 |
rajiv | the chunk size is 200MB, and enabling buffering did not help either. | 14:57 |
abhishekk | 3 minutes to go | 14:58 |
abhishekk | 2 | 14:58 |
jokke_ | so each of the greenlet worker threads will eat some memory, and while you have data transfers in flight there are obviously network buffers involved, but really the chunking is the main limiting factor keeping the API from just caching your whole 900 gigs into memory if the storage is slow :D | 14:58 |
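To make the chunking point concrete, a generic streaming sketch (not Glance's actual code path) showing why per-transfer memory is bounded by the chunk size rather than the image size; the 200 MB value matches the chunk size mentioned above, and the `store` object with a `write()` method is hypothetical.

```python
# Generic streaming sketch, not Glance's actual code: memory use per transfer
# is bounded by the chunk size, not by the total image size.
CHUNK_SIZE = 200 * 1024 * 1024  # 200 MB, matching the chunk size above


def iter_chunks(image_file, chunk_size=CHUNK_SIZE):
    """Yield the image data one chunk at a time."""
    while True:
        chunk = image_file.read(chunk_size)
        if not chunk:
            break
        yield chunk


def upload(image_file, store):
    # Only one chunk per call is held in memory; with N concurrent uploads
    # the buffering grows roughly as N * CHUNK_SIZE (plus network buffers).
    for chunk in iter_chunks(image_file):
        store.write(chunk)  # hypothetical store object with a write() method
```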
rosmaita | hmmm, looks like it never got to the point where it could attach a volume, couldn't ssh into the vm | 14:59 |
jokke_ | rajiv: we can continue on #os-glance as we're running out of time if you prefer | 14:59 |
rajiv | sure, switching over. | 14:59 |
abhishekk | rosmaita, yes | 14:59 |
abhishekk | thank you all | 14:59 |
abhishekk | have a nice weekend | 14:59 |
rosmaita | i think it's just one of those random failures | 14:59 |
jokke_ | thanks everyone | 14:59 |
abhishekk | #endmeeting | 15:00 |
opendevmeet | Meeting ended Thu Aug 26 15:00:13 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 15:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/glance/2021/glance.2021-08-26-14.00.html | 15:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/glance/2021/glance.2021-08-26-14.00.txt | 15:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/glance/2021/glance.2021-08-26-14.00.log.html | 15:00 |
*** rpittau is now known as rpittau|afk | 16:02 | |
*** akekane_ is now known as abhishekk | 16:18 |