*** tkajinam has joined #openstack-swift | 00:02 | |
*** psachin has joined #openstack-swift | 02:43 | |
*** gkadam has joined #openstack-swift | 03:19 | |
*** rcernin has quit IRC | 03:35 | |
*** baojg has joined #openstack-swift | 03:48 | |
*** rcernin has joined #openstack-swift | 03:50 | |
*** rcernin has quit IRC | 03:57 | |
*** rcernin has joined #openstack-swift | 03:58 | |
*** baojg has quit IRC | 04:12 | |
*** baojg has joined #openstack-swift | 04:18 | |
*** gkadam has quit IRC | 04:21 | |
*** baojg has quit IRC | 04:23 | |
*** godog has quit IRC | 04:26 | |
*** baojg has joined #openstack-swift | 05:11 | |
*** tkajinam_ has joined #openstack-swift | 05:16 | |
*** tkajinam has quit IRC | 05:18 | |
*** baojg has quit IRC | 05:22 | |
*** baojg has joined #openstack-swift | 05:23 | |
*** tkajinam__ has joined #openstack-swift | 05:25 | |
*** tkajinam_ has quit IRC | 05:27 | |
*** e0ne has joined #openstack-swift | 05:52 | |
*** e0ne has quit IRC | 05:53 | |
*** tkajinam_ has joined #openstack-swift | 06:09 | |
*** tkajinam__ has quit IRC | 06:12 | |
*** spsurya has joined #openstack-swift | 06:21 | |
*** ccamacho has joined #openstack-swift | 07:12 | |
*** baojg has quit IRC | 07:36 | |
*** pcaruana has joined #openstack-swift | 07:41 | |
*** gkadam has joined #openstack-swift | 07:51 | |
*** baojg has joined #openstack-swift | 08:03 | |
*** tkajinam_ has quit IRC | 08:12 | |
*** godog has joined #openstack-swift | 08:18 | |
*** rcernin has quit IRC | 08:56 | |
*** mikecmpbll has quit IRC | 09:04 | |
*** hseipp has joined #openstack-swift | 09:05 | |
*** mikecmpbll has joined #openstack-swift | 09:18 | |
*** baojg has quit IRC | 09:57 | |
*** baojg has joined #openstack-swift | 09:58 | |
*** baojg has quit IRC | 10:09 | |
*** e0ne has joined #openstack-swift | 10:24 | |
*** mahatic has joined #openstack-swift | 10:51 | |
*** ChanServ sets mode: +v mahatic | 10:51 | |
*** mvkr has quit IRC | 11:34 | |
*** mvkr has joined #openstack-swift | 12:06 | |
*** e0ne has quit IRC | 12:25 | |
*** e0ne has joined #openstack-swift | 12:30 | |
*** baojg has joined #openstack-swift | 13:00 | |
*** psachin has quit IRC | 13:16 | |
*** e0ne has quit IRC | 14:05 | |
*** e0ne has joined #openstack-swift | 14:08 | |
*** openstackgerrit has joined #openstack-swift | 15:24 | |
openstackgerrit | Thiago da Silva proposed openstack/swift master: Remove duplicate statement https://review.openstack.org/632486 | 15:24 |
*** ccamacho has quit IRC | 15:37 | |
*** ccamacho has joined #openstack-swift | 15:37 | |
*** ybunker has joined #openstack-swift | 15:46 | |
ybunker | hi all, quick question.. I have a swift cluster and some of the obj drives are getting more used than others; for example, some are at 95% of used space and others are at 82%, on the same node and also on different nodes.. the weights in the object ring are the same for those drives.. any ideas what could be going on here? also, is there a way to stop "storing" data on those 95%-used disks? | 15:48 |
*** openstackgerrit has quit IRC | 15:51 | |
*** ianychoi has joined #openstack-swift | 16:26 | |
*** ccamacho has quit IRC | 16:33 | |
*** e0ne has quit IRC | 16:38 | |
*** e0ne has joined #openstack-swift | 16:39 | |
*** hseipp has quit IRC | 16:42 | |
*** pcaruana has quit IRC | 17:02 | |
*** e0ne has quit IRC | 17:02 | |
DHE | do you have a sane number of partitions? sounds like you may have too few | 17:06 |
DHE | and no, you can't just stop using certain drives. swift needs to be able to consistently predict where an object is located by name alone | 17:06 |
*** ccamacho has joined #openstack-swift | 17:16 | |
ybunker | DHE: I've got 8192 partitions, with a replica count of 3, 1 region, 8 zones and 72 devices | 17:21 |
DHE | seems okay... | 17:25 |
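(A quick sanity check on the geometry ybunker just described; this is only back-of-the-envelope arithmetic, not output from the cluster:)

```python
# Ring geometry from the conversation: 8192 partitions, 3 replicas, 72 devices.
partitions, replicas, devices = 8192, 3, 72

# Partition-replicas spread across devices of equal weight:
per_device = partitions * replicas / devices
print(per_device)  # ~341 partition-replicas per device

# ~341 per device is plenty of granularity, so the 82% vs 95% imbalance is
# unlikely to come from too few partitions alone; stale rings or uneven
# weights/object sizes are more likely culprits.
```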
ybunker | DHE: don't know where else to look, and every day the used space is growing | 17:26 |
DHE | are the overused devices (disks) consistently on the same nodes? or does one node have a mixture of high- and low-usage disks? | 17:28 |
tdasilva | it's also a good idea to double-check that none of your nodes have old rings... swift-recon --md5 will check all rings for you | 17:30 |
DHE | that's a good point... | 17:30 |
ybunker | thanks a lot guys will check on that | 17:31 |
DHE | I'm assuming 72 disks here, not 72 nodes/hosts/servers | 17:31 |
*** mikecmpbll has quit IRC | 17:43 | |
*** ccamacho has quit IRC | 17:51 | |
ybunker | 72 disks yes, with 8 data nodes | 17:52 |
ybunker | the rings are the same for all the nodes, so that's not the problem :( | 17:55 |
timburke | ybunker: sounds like you're getting into a cluster-full situation, which typically sucks :-( even those ~80% full drives aren't going to be super happy; fwiw my usual recommendation is to try to keep drives under ~75% full | 18:00 |
ybunker | we are planning to add 4x more data nodes.. but it will take at least a month... :S | 18:01 |
timburke | ...can we delete some data? | 18:01 |
*** pcaruana has joined #openstack-swift | 18:01 | |
ybunker | two data nodes were at 60~70% used, so i changed the weights of those nodes so more data balances there.. but the other nodes, instead of freeing up a little space, just grow and grow :( | 18:02 |
timburke | the core trouble is that those full drives are going to start responding 507, even for a lot of replication requests, which means that the remaining drives will fill up *even more quickly*, and you'll probably get some super-replicated data | 18:03 |
timburke | if you're confident that your drives are healthy and unlikely to fail in the next couple months (to give you time to not only get the new hardware in place but also get replication to settle), you might want to look at the handoffs_first and handoffs_delete options.... i'd feel much more comfortable recommending them if you already had the new hardware in place, though, and just needed to make replication go faster | 18:05 |
ybunker | the thing is that the cluster has millions and millions of images.., mmm is it possible at some point to delete some of the replicas? | 18:05 |
timburke | see https://github.com/openstack/swift/blob/2.20.0/etc/object-server.conf-sample#L279-L296 for the config options | 18:06 |
ybunker | timburke: thanks a lot, let me take a look on that | 18:06 |
ybunker | timburke: are those options available in the juno release? 2.2.0? | 18:07 |
timburke | you could reduce the replica count for the ring... but it'll come at a cost to durability, and probably wouldn't be a quick fix. i wouldn't recommend it unless you already know you want a two-replica policy or something | 18:08 |
timburke | should go back fairly far... but then, juno's pretty old... lemme see... | 18:08 |
timburke | looks like you're good: https://github.com/openstack/swift/commit/e078dc3da05ce9e7c2b36e05686d28101381eec8 | 18:09 |
timburke | (missing sample config got added in 1.13.0) | 18:10 |
ybunker | thanks :), so handoffs_first should be changed to True, and leave handoff_delete at auto | 18:11 |
ybunker | oh sorry to 2 | 18:12 |
timburke | probably? you'll definitely want to have handoffs_first=true when rebalancing... and yeah, handoff_delete=2 seems not-crazy | 18:12 |
timburke | once the new hardware's in place and you've had a few good replication cycles, you'll want to take those back to the defaults | 18:13 |
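(For reference, a minimal sketch of the temporary settings discussed above, as they would appear in the [object-replicator] section of the object-server config; the values are the ones suggested in the conversation and should be reverted to the defaults once the new hardware is in and replication has settled:)

```ini
[object-replicator]
# Temporary settings while rebalancing onto new capacity:
# process handoff partitions before primaries...
handoffs_first = True
# ...and allow removing a handoff copy once 2 of the 3 primary nodes have
# confirmed they hold the data (the default, "auto", waits for all primaries)
handoff_delete = 2
```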
ybunker | another problem we've got is that we can't let the object-replicator process run all day; we start the process in a specific window and then stop it, because latency goes through the roof | 18:13 |
ybunker | so the obj-replicator process runs for about 4 hours a day | 18:14 |
timburke | good news is, more disks should definitely help with that | 18:15 |
timburke | are your auditors on the same schedule? | 18:16 |
ybunker | yes | 18:16 |
timburke | makes sense. anything you can do to avoid the disk-thrashing, i'd imagine... | 18:17 |
ybunker | do I need any special configuration on the object-auditor? i just have concurrency = 1, files_per_second = 1, zero_byte_files_per_second = 5 and bytes_per_second = 1000 | 18:19 |
*** pcaruana has quit IRC | 18:23 | |
timburke | seems... ok-ish, i guess? how far does it get in that 4hr window? i feel like with that tuning, you should be able to have them running continuously without really impacting client traffic... | 18:23 |
timburke | i'd be inclined to increase concurrency to # of disks on the node, but that's me | 18:25 |
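(A hedged sketch of where those auditor knobs live, in the [object-auditor] section of the object-server config; the rate limits are ybunker's current values, and concurrency = 9 is an assumption based on timburke's suggestion of one per disk, with 72 disks spread over 8 nodes:)

```ini
[object-auditor]
# current rate limits from the conversation
files_per_second = 1
bytes_per_second = 1000
zero_byte_files_per_second = 5
# timburke's suggestion: roughly one auditor per disk on the node
# (72 disks / 8 nodes = 9; assumes disks are spread evenly)
concurrency = 9
```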
*** gkadam has quit IRC | 18:26 | |
timburke | how long is that cycle time? i feel like it must take a while... | 18:26 |
ybunker | on object-replicator i had concurrency = 2 and replicator_workers = 6 | 18:27 |
timburke | yeah, i think i like that one better. the auditor doesn't have the same concurrency/workers split iirc | 18:28 |
timburke | i think they might not even mean the same thing :-( | 18:28 |
timburke | ugh, yeah: https://review.openstack.org/#/c/572571/ | 18:29 |
patchbot | patch 572571 - swift - object-auditor: change "concurrency" to "auditor_w... - 1 patch set | 18:29 |
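(To make that naming mismatch concrete, a sketch of the replicator settings ybunker mentions just above, with comments reflecting the behaviour the patch is trying to clarify:)

```ini
[object-replicator]
# for the replicator, "concurrency" is the number of replication jobs run in
# parallel per worker, and "replicator_workers" is the number of worker
# processes; the object-auditor has no such split, and its single
# "concurrency" option really behaves like a workers count (hence the patch
# above renaming it to "auditor_workers" in the sample configs)
concurrency = 2
replicator_workers = 6
```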
*** ybunker has quit IRC | 18:34 | |
*** ybunker has joined #openstack-swift | 18:34 | |
*** ybunker has quit IRC | 18:45 | |
*** ybunker has joined #openstack-swift | 18:45 | |
*** openstackgerrit has joined #openstack-swift | 18:45 | |
openstackgerrit | Tim Burke proposed openstack/swift master: object-auditor: change "concurrency" to "auditor_workers" in configs https://review.openstack.org/572571 | 18:45 |
*** mikecmpbll has joined #openstack-swift | 18:47 | |
openstackgerrit | Tim Burke proposed openstack/swift master: object-auditor: change "concurrency" to "auditor_workers" in configs https://review.openstack.org/572571 | 18:53 |
*** baojg has quit IRC | 18:57 | |
*** baojg has joined #openstack-swift | 18:57 | |
*** baojg has quit IRC | 18:58 | |
*** e0ne has joined #openstack-swift | 19:07 | |
*** takamatsu has quit IRC | 19:27 | |
ybunker | ok so I disable object-replicator on the nodes that have more capacity | 19:38 |
ybunker | then do I flip on handoffs_first and drop handoff_delete to 2 for the object-replicator on all the nodes, or do i have to change that just on the most-full nodes? | 19:38 |
*** e0ne has quit IRC | 19:42 | |
*** ybunker has quit IRC | 19:43 | |
timburke | that first bit sounds a little terrifying. why are we turning the replicator off entirely? as for the config changes, i always find it easier to reason about a cluster when i have configs as uniform as possible across the nodes... i think i'd do that on all of them | 19:46 |
*** pcaruana has joined #openstack-swift | 19:50 | |
*** pcaruana has quit IRC | 20:11 | |
DHE | also remember that replication is push based. It seems to me running the replicator is more likely to allow a host to delete objects once it realizes that it is a handoff node and the primaries are all healthy. (is that something a replicator does?) | 20:14 |
*** pcaruana has joined #openstack-swift | 20:24 | |
*** portante has left #openstack-swift | 20:32 | |
zaitcev | I tried to make everything less complicated for the container server, and it went very poorly. | 21:01 |
zaitcev | I mean less complicated than my previous patch, which had ShardRange(row[0].decode('utf-8'), row[1:]) | 21:02 |
zaitcev | The biggest problem is the code that insists on using the nul character for SQL markers. | 21:03 |
zaitcev | Like... m = x + b'\x00', then sql("SELECT FROM table WHERE name < ?", m) | 21:05 |
zaitcev | There's NO WAY that I can see to use unicode there | 21:05 |
clayg | timburke: thanks for pointing me at p 437523 and p 609843 - those are both good to keep on the radar | 21:13 |
patchbot | https://review.openstack.org/#/c/437523/ - swift - Store version id when copying object to archive - 9 patch sets | 21:13 |
patchbot | https://review.openstack.org/#/c/609843/ - swift - Allow arbitrary UTF-8 strings as delimiters in con... - 2 patch sets | 21:13 |
zaitcev | Does anyone remember what that zero actually does? | 21:14 |
* zaitcev pokes mattoliverau | 21:17 | |
zaitcev | https://github.com/openstack/swift/blob/master/swift/container/sharder.py#L237 | 21:17 |
zaitcev | https://github.com/openstack/swift/blob/master/swift/common/utils.py#L4799 | 21:18 |
zaitcev | (the latter is actually bogus in py3, but never mind) | 21:18 |
*** pcaruana has quit IRC | 21:19 | |
*** baojg has joined #openstack-swift | 21:21 | |
*** baojg has quit IRC | 21:27 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Fix socket leak on object-server death https://review.openstack.org/575254 | 21:39 |
timburke | zaitcev: we can't use u'\x00'? the idea is that `name == x` should be included, but no other valid object name after that. though i can't remember now why we didn't use `name <= ?`... | 21:45 |
timburke | why is that last one bogus on py3? | 21:46 |
zaitcev | wait, what | 21:49 |
zaitcev | oh, so a NUL is a valid unicode character | 21:49 |
zaitcev | timburke: thanks a lot, I have something to re-think here. | 21:52 |
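(A tiny illustration of timburke's point; the name here is made up and this is not the sharder code itself, just the string ordering it relies on:)

```python
# u'\x00' (NUL) is a valid Unicode code point, and the lowest one, so the same
# trick used with b'\x00' on bytes works on text strings too.
x = u'some/object/name'
marker = x + u'\x00'

assert x < marker               # x itself satisfies "name < marker"
assert (x + u'!') > marker      # but any real name extending x sorts after
                                # the marker, so it is excluded
# In other words, "WHERE name < x + '\x00'" behaves like "WHERE name <= x" for
# valid object names, just written as a strict comparison.
```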
*** rcernin has joined #openstack-swift | 22:10 | |
*** rcernin has quit IRC | 22:17 | |
*** rcernin has joined #openstack-swift | 22:19 | |
*** lifeless_ is now known as lifeless | 22:34 | |
*** baojg has joined #openstack-swift | 22:52 | |
*** tkajinam has joined #openstack-swift | 22:54 | |
openstackgerrit | Merged openstack/swift master: Remove duplicate statement https://review.openstack.org/632486 | 23:12 |
timburke | so this is weird. func testing https://review.openstack.org/#/c/575254/ (which needs another patchset; i was dumb in my last one), i put some russian-roulette middleware in my object server pipelines, try to pull down something sizeable, and one of two things happens | 23:14 |
patchbot | patch 575254 - swift - Fix socket leak on object-server death - 3 patch sets | 23:14 |
timburke | either i see three object server deaths and a traceback in the proxy that ends with ShortReadError | 23:14 |
timburke | (which is good, that's the behavior i want) | 23:14 |
timburke | or i see *one* object server death and a traceback coming out of catch_errors that ends with BadResponseLength | 23:15 |
timburke | and i can't seem to figure out where i'm getting a response body file that wouldn't have my ByteCountEnforcer :-( | 23:16 |
timburke | i even tried pushing the wrapping up into utils/request_helpers... | 23:18 |
zaitcev | ugh | 23:59 |