opendevreview | ASHWIN A NAIR proposed openstack/python-swiftclient master: support part-num in python swiftClient https://review.opendev.org/c/openstack/python-swiftclient/+/902020 | 00:47 |
---|---|---|
opendevreview | ASHWIN A NAIR proposed openstack/python-swiftclient master: support part-num in python swiftClient https://review.opendev.org/c/openstack/python-swiftclient/+/902020 | 00:50 |
opendevreview | ASHWIN A NAIR proposed openstack/python-swiftclient master: support part-num in python swiftClient https://review.opendev.org/c/openstack/python-swiftclient/+/902020 | 00:51 |
opendevreview | Merged openstack/swift master: tests: Update CORS geckodriver https://review.opendev.org/c/openstack/swift/+/913285 | 03:48 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/913731 | 04:07 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 04:12 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 05:06 |
opendevreview | Matthew Oliver proposed openstack/swift master: Auto-sharding: first attempt at _elect_leader https://review.opendev.org/c/openstack/swift/+/667030 | 06:23 |
Nicolas | Hi everyone, I have a question about the swift-object-expirer process, my assumption from the documentation is that one daemon should be enough for a given cluster however I found in a cluster that we inherited that multiple daemons were running, how are you managing this on your side ? | 09:36 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: rename probe test_mixed_policy_mpu.py to test_mpu.py https://review.opendev.org/c/openstack/swift/+/913756 | 11:52 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: First cut limited functionality MPU middleware https://review.opendev.org/c/openstack/swift/+/913712 | 14:45 |
opendevreview | Merged openstack/swift feature/mpu: rename probe test_mixed_policy_mpu.py to test_mpu.py https://review.opendev.org/c/openstack/swift/+/913756 | 15:43 |
timburke | Nicolas, having just one expirer severely limits how quickly expired objects actually get unlinked from disk in a large cluster | 17:41 |
timburke | fortunately, it's mostly making network requests then waiting around for responses, so you can get pretty far with just one by cranking up concurrency. eventually, though, you'll notice your expirer queues taking longer and longer to clear, or your users will complain about expired objects still appearing in listings, or you'll notice that the disks are more full than it seems like they ought to be | 17:41 |
timburke | eventually, you'll find yourself wanting to use the process/processes config options to break up work across multiple nodes. we typically run one on all our object-server nodes, so it scales up as we scale up the amount of data we're storing | 17:43 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 17:44 |
acoles | timburke: apologies, I cannot make today's meeting and I'm also not around next week | 18:09 |
timburke | acoles, no worries -- i can't today, either ;-) another orthodontist appt | 18:34 |
acoles | oh right, I remember - good luck ! | 18:38 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: object-server: include deleted metadata in PUT/DELETE response https://review.opendev.org/c/openstack/swift/+/913831 | 18:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: proxy: Add list of backend response data to request environ https://review.opendev.org/c/openstack/swift/+/913832 | 18:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: FakeSwift: support environ_updates https://review.opendev.org/c/openstack/swift/+/913833 | 18:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: Add first cut MPU MW async cleanup markers https://review.opendev.org/c/openstack/swift/+/913834 | 18:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: WIP add MPU cleanup auditor https://review.opendev.org/c/openstack/swift/+/913835 | 18:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: WIP s3 mpu async cleanup https://review.opendev.org/c/openstack/swift/+/913836 | 18:39 |
opendevreview | Tim Burke proposed openstack/swift master: recon-cron: Tolerate missing directories https://review.opendev.org/c/openstack/swift/+/913841 | 19:20 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 20:38 |
opendevreview | Jianjian Huo proposed openstack/swift master: proxy: use cooperative tokens to coalesce updating shard range requests into backend https://review.opendev.org/c/openstack/swift/+/908969 | 20:38 |
mattoliver | kk, no acoles or timburke . Anyone else here for a meetings? | 21:01 |
mattoliver | *meeting | 21:01 |
mattoliver | If not, it's just me :P Question, either I skip or just talk to myself.. the latter makes me crazy but gives us minutes.. so crazy matt it is! :P | 21:03 |
mattoliver | #startmeeting swift | 21:03 |
opendevmeet | Meeting started Wed Mar 20 21:03:22 2024 UTC and is due to finish in 60 minutes. The chair is mattoliver. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:03 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:03 |
opendevmeet | The meeting name has been set to 'swift' | 21:03 |
mattoliver | who's here for the swift meeting? | 21:03 |
mattoliver | o/ (might as well make it feel as normal as possible :P) | 21:03 |
mattoliver | As always the agenda is at | 21:04 |
mattoliver | #link https://wiki.openstack.org/wiki/Meetings/Swift | 21:04 |
mattoliver | And I did update the agenda for those reading along later. Although I forgot to add the first topic | 21:05 |
mattoliver | #topic caracal release! | 21:05 |
mattoliver | #link https://review.opendev.org/c/openstack/releases/+/912371 | 21:06 |
patch-bot | patch 912371 - releases - Release Swift for 2024.1 Caracal (MERGED) - 3 patch sets | 21:06 |
mattoliver | It's landed, so we have a release! And because we have no one else here I'll have to say it... it's the best swift release yet! | 21:06 |
mattoliver | Actually there were a lot of awesome things in this release, so kudos to everyone who works on swift! | 21:07 |
mattoliver | Moving on, and most of these I'll just mention and move on, because the relevant parties aren't here today to give us real status updates. But will still mention them as they seem to be the active things running atm in swift land. | 21:08 |
mattoliver | #topic s3api: Fix handling of non-ascii access keys | 21:08 |
JayF | I confirm best swift release ever, thanks mattoliver :) | 21:09 |
mattoliver | We came across this in our prod. Seems we missed a unicode conversion to wsgi string when doing the py3 migration all those years ago. And we only just discovered it. | 21:09 |
mattoliver | lol, thanks JayF | 21:09 |
mattoliver | But we have a fix | 21:09 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/913723 | 21:09 |
patch-bot | patch 913723 - swift - s3api: Fix handling of non-ascii access keys - 1 patch set | 21:09 |
mattoliver | It's caused if someone has a non-ascii character in their aws key. | 21:10 |
mattoliver | We had tests we thought covered this.. and looks like we did. But our fake app in the tests wasn't a good enough fake :( | 21:11 |
mattoliver | Anyway.. just a heads up, I expect that to land before the next meeting. | 21:11 |
mattoliver | #topic expirer grace period | 21:11 |
mattoliver | This is an older topic from the last meeting, but left it in. There is active work going on it. We have an intern working on it atm, and it's coming along nicely. There are current discussions around maybe changing the name a little. Basically it gives us an optional grace period when expiring objects, so there can be like a soft delete. | 21:13 |
mattoliver | one of our users had a need for this. And it's optional. | 21:14 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/874806 | 21:14 |
patch-bot | patch 874806 - swift - expirer: per account and container grace period - 17 patch sets | 21:14 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/874710 | 21:15 |
patch-bot | patch 874710 - swift - support x-open-expired header for expired objects - 36 patch sets | 21:15 |
mattoliver | I'll try and remember to link first :P | 21:15 |
mattoliver | #topic cooperative tokens | 21:15 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/890174 | 21:15 |
patch-bot | patch 890174 - swift - common: add memcached based cooperative token mech... - 14 patch sets | 21:15 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/908969 | 21:15 |
patch-bot | patch 908969 - swift - proxy: use cooperative tokens to coalesce updating... - 10 patch sets | 21:15 |
mattoliver | Jianjian is leading the charge here. And it's really interesting and awesome work. | 21:16 |
mattoliver | For those who have been running swift for a while, or come to the last few PTGs, we'd talked about a thundering herd problem, when there are a lot of updates coming to one sharded container and the root's shard-ranges have ttl'ed out of cache. | 21:17 |
indianwhocodes | o/ | 21:17 |
mattoliver | Well this uses a token in the cache as a lock that updaters can grab. If they get it then they can go to the back end and get the latest set of shardranges and then they'll place them back in cache. It's a modified version of what's called a ghetto lock. A slightly more concurrent version that better supports a distributed system like swift | 21:19 |
mattoliver | Hey indianwhocodes just you and me, so was mostly talking to myself. | 21:19 |
mattoliver | but not anymore! now I don't look as crazy :P | 21:19 |
indianwhocodes | ya scrolled up, good so far! | 21:20 |
indianwhocodes | I have currently nothing ready for reviews yet | 21:20 |
mattoliver | Anyway, the work is awesome. I hope to get in and review the chain. | 21:20 |
mattoliver | thanks :P | 21:20 |
mattoliver | ok next topic | 21:21 |
mattoliver | #topic Feature/MPU feature branch | 21:21 |
mattoliver | So we've created a feature branch! We haven't done one of those in Swift since container-sharding. | 21:21 |
mattoliver | And it's because we're working, finally, on something we've been talking about for years. We used to call it ALO, or atomic large object. Ie, a large object where users can't see the segments so we can have a 1:1 connection. | 21:22 |
mattoliver | We'll follow the MPU api somewhat. No idea what the name will end up being, swift MPU? So this will go with our DLO and SLO. And our s3api will eventually just use them. And no more weird edgecases of orphaned segments! | 21:24 |
indianwhocodes | sounds promising! | 21:24 |
mattoliver | acoles: is leading the charge. No doubt if he was here you'd get a better explanation than I can give :P | 21:24 |
mattoliver | It does! | 21:25 |
mattoliver | #topic aws-chunked transfers | 21:25 |
mattoliver | timburke: would be better at talking about these. And making progress. | 21:25 |
mattoliver | I did look at the patches in the past, I should do so again! | 21:26 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/909049 | 21:26 |
patch-bot | patch 909049 - swift - s3api: Improve checksum-mismatch detection - 5 patch sets | 21:26 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/909800 | 21:26 |
patch-bot | patch 909800 - swift - utils: Add crc32c function - 5 patch sets | 21:26 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/909801 | 21:26 |
patch-bot | patch 909801 - swift - s3api: Add support for additional checksums - 6 patch sets | 21:26 |
indianwhocodes | I have been blackbox testing tim's patch with mountpoint-s3 benchmarks | 21:27 |
mattoliver | I'm attempting to recreate the probetest failure of that ^ one. | 21:27 |
mattoliver | oh nice | 21:27 |
mattoliver | oh I forgot more links | 21:27 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/909802 | 21:27 |
patch-bot | patch 909802 - swift - WIP: s3api: Additional checksums for MPUs - 6 patch sets | 21:27 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/836755 | 21:27 |
patch-bot | patch 836755 - swift - Add support of Sigv4-streaming - 15 patch sets | 21:27 |
mattoliver | man that timburke is a machine! | 21:28 |
mattoliver | yeah we need these aws-chunked transfers for mountpoint-s3, don't we. | 21:28 |
indianwhocodes | agreed. | 21:28 |
mattoliver | So indianwhocodes you're a good man to test and review these :) | 21:28 |
indianwhocodes | i am looking into adding more s3api cross-compat tests to p 909801 | 21:29 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/909801 - swift - s3api: Add support for additional checksums - 6 patch sets | 21:29 |
mattoliver | oh nice! | 21:30 |
mattoliver | If/when you have a patch ready, let's add it to the list of patches then :) | 21:30 |
mattoliver | Let's move on.. almost at the end of the agenda :) I don't you we're busy! | 21:31 |
mattoliver | #topic drive-full-checker | 21:31 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/907523 | 21:31 |
patch-bot | patch 907523 - swift - drive-full-checker - 34 patch sets | 21:31 |
mattoliver | I think this is awesome, and we want to get this landed at some point. One of our SREs is interested in testing it. But downstream stuff got in the way this last week or so. Looks like timburke's had a play. | 21:32 |
mattoliver | I hope to too at some point. | 21:33 |
mattoliver | Maybe we can play with it in a VSAIO at least. | 21:33 |
mattoliver | Hopefully I can trick sre into looking this week or so :P | 21:33 |
mattoliver | #topic s3api and slo Partnum support | 21:34 |
mattoliver | So the swift side chain landed! | 21:34 |
mattoliver | Nice work indianwhocodes !! | 21:34 |
indianwhocodes | finally. | 21:34 |
mattoliver | Now we can add support to python-swiftclient! | 21:34 |
mattoliver | #link https://review.opendev.org/c/openstack/python-swiftclient/+/902020 | 21:35 |
patch-bot | patch 902020 - python-swiftclient - support part-num in python swiftClient - 15 patch sets | 21:35 |
indianwhocodes | i intend to play with the drive-full checker too if it goes ahead with mountpoint they will go hand in hand as major wins | 21:35 |
mattoliver | Oh yeah, great point! | 21:35 |
mattoliver | nice | 21:35 |
indianwhocodes | reason i say that is that it took me just 2 benchmarking jobs to fill up my vsaio, so just the amount of data we have in prod will have a significant impact!!! | 21:36 |
mattoliver | Well I'll try and take a look at the swiftclient partnum patch as soon as I can so we can get it all squared away. | 21:36 |
indianwhocodes | sounds good. | 21:36 |
mattoliver | Hey it's a jianjian , too bad he missed me talking about his awesome cooperative token stuff, he's going to have to read the logs :P | 21:37 |
jianjian | sorry for being late | 21:38 |
jianjian | that's awesome! yes, I will read the logs. :-P | 21:38 |
mattoliver | indianwhocodes: you can increase your vsaio disk size if need be. But maybe easily filling them is what we need for testing the drive-full-checker anyway :P | 21:38 |
indianwhocodes | exactly. | 21:38 |
mattoliver | #topic Drop support for liberasurecode<1.4.0 | 21:39 |
mattoliver | Last topic I have on the agenda before I open the floor. | 21:39 |
mattoliver | This is from last meeting, and I think yeah sounds good.. but I guess I should actually go review it :P | 21:40 |
mattoliver | I see jianjian did. Nice | 21:40 |
indianwhocodes | i just did as well, lol. | 21:40 |
mattoliver | oh nice | 21:40 |
mattoliver | oh you did too, 8 mins ago! | 21:41 |
mattoliver | Well hopefully that means it'll land over the next week :) | 21:41 |
mattoliver | That's all the topics I gathered (over the 10 mins before this meeting) :P | 21:41 |
mattoliver | So | 21:41 |
jianjian | yeah, looked good to me, the new liberasurecode also is able to replace old .so library with static functions, which is nice | 21:41 |
mattoliver | oh cool | 21:42 |
mattoliver | #topic open floor | 21:42 |
mattoliver | I've been blowing dust off 3 year old patches I had for a better auto-sharding leader-election algorithm. Still fairly basic, but better than what we have: | 21:43 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/667030 | 21:44 |
patch-bot | patch 667030 - swift - Auto-sharding: first attempt at _elect_leader - 10 patch sets | 21:44 |
mattoliver | That's basically a rebase and squash and an attempt to address some comments from 3 years ago. | 21:44 |
jianjian | wow, leader election... that's cool! | 21:44 |
mattoliver | Well it's something we always had planned. And we purposely picked an overly simple one to land sharding | 21:45 |
mattoliver | But have always told people not to use auto-sharding because it isn't production ready. But we do use auto-sharding in tests | 21:45 |
jianjian | was there a design doc? or mostly it's described in the commit message | 21:46 |
mattoliver | Auto-sharding basically means the sharder takes responsibility not just for sharding but for identifying, scanning and initiating the scanning too. | 21:46 |
mattoliver | yeah there is, and there's been a lot of docs over the years. | 21:47 |
mattoliver | I've been trying to gather them all up. I'll find the current link | 21:47 |
indianwhocodes | I have tried it before on a personal swift cluster | 21:47 |
jianjian | thanks. I guess only the leader can kick off a sharding with auto-sharding, what happens if the leader node dies? will another node stand up to be a new leader? | 21:49 |
mattoliver | these are the problems. | 21:49 |
mattoliver | #link https://docs.google.com/document/d/1VSpmPcEt1NDhDLb8Btvfl6BwnaeGboONvltSM-ZHUVQ/edit?usp=sharing | 21:50 |
mattoliver | ^ that I think only works for nvidians, but I'll make it available to everyone when I get the chance | 21:50 |
mattoliver | There are a lot of leader election options to choose from. But the first version is an increment of the existing one. | 21:51 |
jianjian | 👍 | 21:51 |
mattoliver | Existing one is super simple.. if you're index 0 for the container (partition) then you're the leader.. | 21:51 |
mattoliver | So great for testing and simple, but doesn't actually use any real election and it's super easy to get 2 who think they're the leader because of rebalances and eventual consistency | 21:52 |
mattoliver | this newer version takes ring versions into account and only listens and gets a quorum of votes from the latest ring. Meaning handoffs (old primaries) now don't get a say. | 21:53 |
mattoliver | And currently I think there is a double check. Am I leader, scan, am I still the leader, write ranges into shardranges and replicate. | 21:54 |
mattoliver | throw away the work if I'm not. | 21:54 |
mattoliver | though maybe that's inefficient. But taking the less split brainy approach | 21:55 |
jianjian | so a shard leader could stop in the middle of the sharding during rebalancing, and then cancel its work | 21:55 |
mattoliver | another approach is to elect quicker and just get better at dealing and recovering from the split-brains. But in reality we need to solve this anyway, because it'll happen | 21:56 |
mattoliver | not in the middle of sharding.. only in scanning for shardranges. | 21:56 |
mattoliver | Once a leader has scanned and inserted the shardranges. Sharding then happens as per normal. | 21:57 |
mattoliver | but yeah, atm | 21:57 |
mattoliver | I am thinking of also adding a memcache sentinel lock or something | 21:57 |
mattoliver | as an enhancement. | 21:57 |
mattoliver | Or maybe we just need a brand new approach :) | 21:58 |
mattoliver | like I said, these are from years ago and I'm relearning what past Matt was thinking :P | 21:58 |
jianjian | lol | 21:58 |
mattoliver | Dream is one day, we have auto-sharding enabled by default in swift. And we never need to think about it. | 21:59 |
jianjian | that'll be cool | 21:59 |
mattoliver | And for us downstream, we deprecate it from the controller | 21:59 |
mattoliver | Anyway, we're at time | 22:00 |
jianjian | I also feel we need shard shrinking as well regarding to the topic of sharding | 22:00 |
jianjian | yeah | 22:00 |
mattoliver | Yeah, there is a shrinking edge case that is blocking out auto-shrinking atm too, which is another blocker. So that also needs to be solved! | 22:01 |
mattoliver | Thanks for coming and thanks for working on swift! | 22:01 |
mattoliver | #endmeeting | 22:01 |
opendevmeet | Meeting ended Wed Mar 20 22:01:15 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 22:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2024/swift.2024-03-20-21.03.html | 22:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2024/swift.2024-03-20-21.03.txt | 22:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2024/swift.2024-03-20-21.03.log.html | 22:01 |
mattoliver | jianjian: That's why I called that doc, the path to auto-sharding ;) | 22:01 |
jianjian | good summary in the doc, thanks for sharing it | 22:06 |
opendevreview | Anish Kachinthaya proposed openstack/swift master: expirer: per account and container grace period https://review.opendev.org/c/openstack/swift/+/874806 | 22:18 |
timburke | thanks for running the meeting, mattoliver! | 22:55 |
mattoliver | nps | 23:20 |
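
The cooperative-token idea mattoliver describes at 21:17-21:19 (one updater grabs a token in the cache, goes to the backend for shard ranges, then repopulates the cache) might look roughly like the sketch below. This is not the code from patch 890174: `FakeCache`, `fetch_shard_ranges`, the key names, and the TTL values are all hypothetical, and the sketch uses a single plain token rather than the "slightly more concurrent" variant mentioned in the meeting. The only property it relies on is an atomic add-if-absent operation with a TTL, which memcached provides.

```python
import time


class FakeCache:
    """In-memory stand-in for memcached (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry time)

    def add(self, key, value, ttl):
        """Store value only if key is absent or expired; True on success."""
        now = time.monotonic()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False
        self._data[key] = (value, now + ttl)
        return True

    def set(self, key, value, ttl):
        """Unconditionally store value with a TTL."""
        self._data[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        """Return the value, or None if missing or expired."""
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return None
        return entry[0]


def fetch_shard_ranges(cache, container, backend_fetch,
                       token_ttl=10, data_ttl=600):
    """Return (shard_ranges, source); source is 'cache', 'backend' or 'wait'."""
    data_key = "shard-ranges/" + container   # hypothetical key layout
    token_key = "token/" + container
    cached = cache.get(data_key)
    if cached is not None:
        return cached, "cache"
    # Cache miss: only the caller that wins the token goes to the backend.
    if cache.add(token_key, 1, token_ttl):
        ranges = backend_fetch(container)
        cache.set(data_key, ranges, data_ttl)
        return ranges, "backend"
    # Someone else holds the token; the caller should retry the cache shortly
    # instead of joining the thundering herd on the backend.
    return None, "wait"
```

The token TTL is what keeps a crashed token holder from blocking everyone else indefinitely: once it expires, the next caller can win `add` and fetch from the backend itself.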
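
The leader-election behaviour discussed in the open floor (21:53-21:55: only votes from the latest ring count, a quorum is required, and the leader re-checks its status after scanning and throws the work away if it lost leadership) can be illustrated with the sketch below. It is not the `_elect_leader` code from patch 667030; `elect_leader`, `scan_with_double_check`, and the vote format are assumptions made up for this example.

```python
def elect_leader(votes, replica_count):
    """Pick a leader from (voter_ring_version, candidate) votes.

    Only votes cast by nodes on the newest ring version seen are counted,
    so nodes still on an old ring (for example old primaries that are now
    handoffs) get no say. Returns None when no candidate reaches a quorum.
    """
    if not votes:
        return None
    latest = max(ring_version for ring_version, _ in votes)
    tally = {}
    for ring_version, candidate in votes:
        if ring_version == latest:
            tally[candidate] = tally.get(candidate, 0) + 1
    quorum = replica_count // 2 + 1
    for candidate, count in tally.items():
        if count >= quorum:
            return candidate
    return None


def scan_with_double_check(is_leader, scan, commit):
    """Scan for shard ranges, committing only if still leader afterwards."""
    if not is_leader():
        return False
    ranges = scan()
    if not is_leader():
        # Leadership changed during the scan: throw the work away.
        return False
    commit(ranges)
    return True
```

With three replicas, `elect_leader([(5, "node-a"), (5, "node-a"), (4, "node-b")], 3)` picks `node-a`, because two votes from ring version 5 meet the quorum of two and the vote from version 4 is ignored.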