openstackgerrit | Matthew Oliver proposed openstack/swift master: Refactor db auditors into a db_auditor base class https://review.openstack.org/614895 | 00:58 |
*** two_tired has joined #openstack-swift | 01:11 | |
DHE | notmyname: while I didn't use something to 100% confirm (eg: `perf`), I've seen very visibly instances of CPU hit 100% and network hitting ~800 megabit, and the opposite - network saturated at gigabit and ~70% cpu usage | 01:38 |
DHE | while I did use perf briefly, I saw most CPU time in the EC functions during a GET operation (no PUTs) on the proxy host | 01:38 |
openstackgerrit | Matthew Oliver proposed openstack/swift master: Refactor db auditors into a db_auditor base class https://review.openstack.org/614895 | 02:58 |
*** two_tired has quit IRC | 03:37 | |
*** itlinux has quit IRC | 04:09 | |
openstackgerrit | Matthew Oliver proposed openstack/swift master: Refactor db auditors into a db_auditor base class https://review.openstack.org/614895 | 04:33 |
mattoliverau | ^ damn you pep8 errors, I swear I ran pep8 locally.. somehow missed it. | 04:33 |
*** threestrands has quit IRC | 05:40 | |
openstackgerrit | Rajat Dhasmana proposed openstack/swift master: Update min tox version to 2.0 https://review.openstack.org/615003 | 06:10 |
*** tellesnobrega_ has joined #openstack-swift | 06:15 | |
*** tellesnobrega has quit IRC | 06:18 | |
openstackgerrit | Kim Bao Long proposed openstack/swift-specs master: Update the min version of tox to 2.0 https://review.openstack.org/615031 | 06:47 |
openstackgerrit | Rajat Dhasmana proposed openstack/swift-specs master: Update min tox version to 2.0 https://review.openstack.org/615051 | 07:04 |
*** pcaruana has joined #openstack-swift | 07:20 | |
*** ccamacho has joined #openstack-swift | 07:34 | |
openstackgerrit | Vu Cong Tuan proposed openstack/python-swiftclient master: Switch to stestr https://review.openstack.org/581610 | 09:42 |
*** e0ne has joined #openstack-swift | 09:51 | |
*** ccamacho has quit IRC | 13:14 | |
*** ccamacho has joined #openstack-swift | 13:14 | |
*** tdasilva has joined #openstack-swift | 13:58 | |
*** ChanServ sets mode: +v tdasilva | 13:58 | |
*** two_tired has joined #openstack-swift | 14:30 | |
*** ccamacho has quit IRC | 14:36 | |
*** mrjk_ has joined #openstack-swift | 14:48 | |
mrjk_ | Hi | 14:52 |
*** ccamacho has joined #openstack-swift | 14:53 | |
mrjk_ | I've hit 100% disk usage on my cluster. I have now added new disks and started a rebalance, but disk usage isn't getting any lower. I can't find any documentation about this - any hint that could explain why my disks are not rebalancing? | 14:53 |
*** mrjk_ has quit IRC | 15:00 | |
*** two_tired has quit IRC | 15:01 | |
*** DJ_Machete has joined #openstack-swift | 15:03 | |
DJ_Machete | good afternoon everyone. i am new to swift programming and having an issue with getting the proper return status code from a web server post request | 15:05 |
DJ_Machete | what i need to do is present an alert to the user if the web request failed so they know. the first time i make the request, successful or not, it returns code 0, which isn't correct because i output the response and i can see the real status code in there | 15:07 |
DJ_Machete | then, the next request i make will return the previous web status code | 15:07 |
mrjk | DJ_Machete, if you are talking about Apple products, you are in the wrong channel | 15:12 |
DJ_Machete | this isnt for xcode swift? | 15:12 |
DHE | no, openstack swift is a data storage project | 15:12 |
DJ_Machete | ah, ok..oops, sorry | 15:13 |
*** DJ_Machete has left #openstack-swift | 15:13 | |
*** gyee has joined #openstack-swift | 15:26 | |
*** zaitcev has quit IRC | 15:44 | |
*** pcaruana has quit IRC | 16:17 | |
notmyname | good morning | 16:20 |
notmyname | mrjk: ah disk full. yeah, that's tough. you find anything yet? | 16:20 |
*** e0ne has quit IRC | 16:33 | |
timburke | notmyname: as i recall, acoles did some testing and found that (1) decoding from all data frags was considerably faster than reconstructing with parity and (2) *any* parity frag led to roughly the same slowdown -- it didn't matter if it was all ec_num_parity of them or just one | 16:55 |
notmyname | timburke: ah, ok. I think it was the 2nd statement I was remembering | 16:56 |
timburke | and that second one, combined with our default shuffle behavior, might explain why you thought that it'd always take about the same time -- it'd be exceedingly likely that you get at least one parity frag by shuffling | 16:57 |
notmyname | heh, yeah | 16:57 |
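A minimal sketch of the effect timburke describes above, assuming pyeclib is installed; the 10+4 `liberasurecode_rs_vand` scheme and the 4 MiB segment size are arbitrary stand-ins, not the policy acoles actually measured:

```python
# Rough micro-benchmark of EC decode cost: all data frags vs. one parity frag.
# The 10+4 scheme, backend and segment size below are arbitrary assumptions.
import os
import timeit

from pyeclib.ec_iface import ECDriver

K, M = 10, 4
driver = ECDriver(k=K, m=M, ec_type='liberasurecode_rs_vand')
data = os.urandom(4 * 2 ** 20)       # one 4 MiB segment
frags = driver.encode(data)          # K data frags followed by M parity frags

candidates = {
    'all data frags': frags[:K],           # straight reassembly
    'one parity frag': frags[1:K + 1],     # drop one data frag, add one parity frag
}
for name, chosen in candidates.items():
    secs = timeit.timeit(lambda: driver.decode(chosen), number=20)
    print('%-16s %.3fs for 20 decodes' % (name, secs))
```

With a systematic RS backend the first case is roughly a concatenation while the second forces a real reconstruction, which would line up with the "any parity frag costs about the same" observation.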
mrjk | notmyname, I reduced the weight of the full disks a bit, and I expect that to let new objects be correctly synchronised, and then let the reaper do its job. We'll see if that works | 17:12 |
notmyname | mrjk: yeah. you'll also probably want to set handoffs_only to true and handoff_delete to 1 or 2. also maybe limit/raise rsync connections (if you're using a replicated policy) to get data off of the full drives as quickly as possible | 17:16 |
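A hedged sketch of where those knobs live, based on the sample object-server.conf; the exact spelling of the handoffs-only switch depends on the Swift version (older object-replicators only have handoffs_first, while handoffs_only exists on the reconstructor and newer replicators), and the rsync connection limit is usually the per-module "max connections" in rsyncd.conf rather than a Swift option:

```ini
# Sketch only -- check your version's sample configs before copying.
[object-replicator]
# prioritize (or, where handoffs_only exists, exclusively process) handoff
# partitions while digging out of a full-disk situation
handoffs_first = True
# delete a local handoff once this many primary copies are confirmed;
# the default "auto" waits for all replicas, 2 is the usual full-drive compromise
handoff_delete = 2
```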
mrjk | I just checked, and it seems to be working a bit; I gained some free space (a couple of MB) on my filled disks. \o/ | 17:16 |
notmyname | longer term, you should set fallocate_reserve so that you're never at truly 100% full | 17:16 |
notmyname | note that in order to delete something in swift (ie handle a client DELETE request), swift will *write* a new file. even though that file is empty, it has metadata, takes up inodes, etc, so you need some bytes available before it can unlink the existing larger file | 17:17 |
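A hedged sketch of the fallocate_reserve setting mentioned above; it lives in the [DEFAULT] section of object-server.conf (and can be set on the account/container servers too), and depending on the Swift version the value is a byte count or a percentage:

```ini
[DEFAULT]
# keep some headroom so drives never reach truly 100% full; a DELETE still
# needs a few bytes to write its zero-byte tombstone (.ts) file before the
# larger data file can be unlinked
fallocate_reserve = 1%
```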
notmyname | and as a general principle, you'll see a "replication bump" when you add capacity. by default the existing data won't be removed from its current location until it's durably stored in all the new primary locations. this means that for a 3-replica policy, you'll have 4 replicas temporarily for the data that was remapped to new hardware. so if you're at 100% full, it's really hard | 17:20 |
notmyname | to make progress (this is what the handoff_delete config value is designed for) | 17:20 |
notmyname | the tl;dr is "don't let drives fill up. it's a lot easier to avoid that than it is to dig yourself out once you're in that situation" | 17:20 |
openstackgerrit | Tim Burke proposed openstack/swift master: Add "historical document" note to ring background docs https://review.openstack.org/615251 | 17:22 |
mrjk | notmyname, yep, definitely, I'm rediscovering some swift settings :) I'll definitely go for the fallocate config. Thanks for the details. I was expecting this replication bump; that's why I'm temporarily reducing the 100% disks to a lower weight. | 17:26 |
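For reference, the weight reduction mrjk describes is typically done with swift-ring-builder; the builder file name, the d10 device id and the weights below are placeholders, not values from his cluster:

```console
$ swift-ring-builder object.builder search d10          # confirm which device d10 is
$ swift-ring-builder object.builder set_weight d10 50   # e.g. down from 100; lower weight = less data
$ swift-ring-builder object.builder rebalance           # writes a new object.ring.gz
# then distribute the new ring file to every node as usual
```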
notmyname | timburke: nice | 17:26 |
notmyname | timburke: really well said (the historical doc note) | 17:26 |
*** ccamacho has quit IRC | 17:27 | |
mrjk | notmyname, also, playing with the handoff_* settings should only concern the filled-up nodes, right? | 17:28 |
mrjk | And yeah, we let it fill up because our customer didn't want to pay for new disks ^^ Now they feel sad, and I'm trying to fix their mess | 17:28 |
notmyname | mrjk: I'm guessing you're trying to "fix the full drives" some other way than `rm -rf *` ("good news, customer! you have lots of free space now!") ;-) | 17:31 |
mrjk | haha | 17:32 |
mrjk | I wish, but I would get killed by them (actually, data should be replicated 3 times, so it should be fine). Still I'm trying my best to help them | 17:32 |
notmyname | mrjk: changing the handoff_* settings isn't good for normal operation because it does introduce a slightly higher chance of data loss than normal operation (eg not ensuring that there are n_replica in primary locations before deleting local handoffs) | 17:32 |
notmyname | but it's normally ok in situations like this because they are temporary | 17:33 |
mrjk | yep, anyway, they can't push anything else to those nodes because it's 100% everywhere ... yeah, I don't care, I told them to stop pushing stuff to it while we are fixing their mess ^^ I'm just trying to check whether that's supported in my swift version | 17:34 |
notmyname | so eg setting handoff_delete to 1 means that the replicator only ensures that there's a copy in 1 primary location before deleting the local copy. setting it to 2 is generally the compromise to improve things in a full-drive scenario | 17:34 |
notmyname | mrjk: reminds me of this old classic :-) https://www.andrews.edu/~freeman/bofh/bofh1.html | 17:36 |
notmyname | mrjk: I'm sure you're already looking at these, but in case not, here are the links. https://docs.openstack.org/swift/latest/deployment_guide.html for docs on config values and https://github.com/openstack/swift/tree/master/etc for well-commented sample config files | 17:44 |
*** itlinux has joined #openstack-swift | 17:58 | |
mrjk | Haha, nice, I didn't know BOFH :) Thank you for the resources anyway. I won't be able to use the handoff_* settings with those servers, as they mainly use it as a 99%-read cluster | 18:03 |
mrjk | they use their swift to distribute files across regions | 18:04 |
notmyname | mrjk: I'm not sure why reading from swift would prevent you from using handoff_delete settings | 18:05 |
*** itlinux has quit IRC | 18:06 | |
-openstackstatus- NOTICE: OpenStack infra's mirror nodes stopped accepting connections on ports 8080, 8081, and 8082. We will notify when this is fixed and jobs can be rechecked if they failed to communicate with a mirror on these ports. | 18:10 | |
mrjk | yep, but handoffs only appear when people write into the cluster. There are almost no POSTs in the logs, so I guess playing with this parameter would not gain me that much space ... Because there are no writes, I should not have any handoffs in this region. Is my understanding correct? | 18:20 |
mrjk | Do you know a way to list the actual handoffs on the disks? | 18:21 |
notmyname | mostly. if you've got full drives, you need to add more capacity. any ring adjustment (either weights like you did or adding drives) will cause existing data to now be on handoffs. the handoffs_only will just churn on those to get it done more quickly | 18:21 |
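To answer the question about listing the handoffs actually sitting on the disks, here is a minimal sketch that compares partition directories against the ring's primary assignments. It assumes the default /srv/node layout, the policy-0 objects directory and rings in /etc/swift, it matches on device name only (ignoring ip/port), and it should be run on the storage node itself:

```python
import os

from swift.common.ring import Ring

DEVICES = '/srv/node'                 # default devices mount point (assumption)
ring = Ring('/etc/swift', ring_name='object')

for device in sorted(os.listdir(DEVICES)):
    objects_dir = os.path.join(DEVICES, device, 'objects')   # policy-0 only
    if not os.path.isdir(objects_dir):
        continue
    handoffs = []
    for part in os.listdir(objects_dir):
        if not part.isdigit():
            continue                  # skip auditor status files and the like
        primaries = {node['device'] for node in ring.get_part_nodes(int(part))}
        if device not in primaries:
            handoffs.append(part)
    print('%s: %d handoff partitions' % (device, len(handoffs)))
```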
*** pcaruana has joined #openstack-swift | 18:27 | |
*** zaitcev has joined #openstack-swift | 18:28 | |
*** ChanServ sets mode: +v zaitcev | 18:28 | |
openstackgerrit | Merged openstack/swift master: Add "historical document" note to ring background docs https://review.openstack.org/615251 | 18:30 |
mrjk | notmyname, ok, so the handoff mechanism also happens during a rebalance. That finally makes sense. | 18:32 |
*** e0ne has joined #openstack-swift | 18:34 | |
-openstackstatus- NOTICE: The firewall situation with ports 8080, 8081, and 8082 on mirror nodes has been resolved. You can recheck jobs that have failed to communicate to the mirrors on those ports now. | 18:55 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Update min tox version to 2.0 https://review.openstack.org/615003 | 19:00 |
openstackgerrit | Tim Burke proposed openstack/swift master: Update min tox version to 2.3.2 https://review.openstack.org/615003 | 19:01 |
*** e0ne has quit IRC | 19:05 | |
*** pcaruana has quit IRC | 19:08 | |
*** jistr has quit IRC | 19:35 | |
*** jistr has joined #openstack-swift | 19:37 | |
openstackgerrit | Tim Burke proposed openstack/swift master: s3api: Include '-' in S3 ETags of normal SLOs https://review.openstack.org/592231 | 21:09 |
openstackgerrit | Tim Burke proposed openstack/swift master: py3: Monkey-patch json.loads to accept bytes on py35 https://review.openstack.org/615336 | 21:40 |
timburke | zaitcev: ^^^ | 21:44 |
zaitcev | oh, well | 21:44 |
zaitcev | that sounds interesting | 21:45 |
zaitcev | Although I just started looking at 614656 | 21:45 |
zaitcev | timburke: could you explain to me what the if attr=="_orig" does? Surely the super does not have _orig, so how does this work? https://git.openstack.org/cgit/openstack/swift/tree/swift/__init__.py?id=c112203e0ef8f69cdd5a78c260029839a8763d26#n69 | 22:00 |
zaitcev | At first I thought, well maybe it's for someone inheriting us, then... but wait, even if they inherit, they see our _orig. | 22:01 |
zaitcev | oh | 22:02 |
zaitcev | Maybe it protects from anyone else seeing our own _orig? | 22:03 |
timburke | no, i was just worried about how __getattribute__ is a *really* low level override... looks like i don't actually need it, though? i'll see about cleaning that up a bit | 22:21 |
openstackgerrit | Merged openstack/swift master: Update min tox version to 2.3.2 https://review.openstack.org/615003 | 22:22 |
timburke | nope! never mind, i was right to be defensive :-) was testing with python3 before instead of python3.5 | 22:29 |
timburke | with the affected version, without the attr == '_orig' code, "RecursionError: maximum recursion depth exceeded" | 22:29 |
*** ianychoi has quit IRC | 23:02 | |
zaitcev | Oh, I see. It's because __getattribute__ itself has to look up self._orig (to call getattr on it), which re-enters __getattribute__. | 23:09 |
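For readers following along, a minimal hypothetical sketch of the pattern under discussion (not the actual swift/__init__.py code): a wrapper whose __getattribute__ delegates to a wrapped module has to short-circuit the lookup of its own _orig attribute, because reading self._orig inside __getattribute__ would otherwise re-enter __getattribute__ and recurse until RecursionError, which is exactly the failure timburke saw on python3.5:

```python
import json


class JsonModuleWrapper(object):
    """Hypothetical stand-in: wrap a module, tweak one attribute, delegate the rest."""

    def __init__(self, orig):
        self._orig = orig

    def __getattribute__(self, attr):
        if attr == '_orig':
            # Short-circuit: without this, the self._orig lookup below would
            # call __getattribute__('_orig') again, recursing forever.
            return super(JsonModuleWrapper, self).__getattribute__(attr)
        orig = self._orig
        if attr == 'loads':
            def loads(s, **kw):
                if isinstance(s, bytes):   # py35's json.loads rejects bytes
                    s = s.decode('utf-8')
                return orig.loads(s, **kw)
            return loads
        return getattr(orig, attr)


patched = JsonModuleWrapper(json)
assert patched.loads(b'{"ok": true}') == {'ok': True}
```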
timburke | i kinda like this idea of "code scarring"... i should consider backing out some of those changes where i swapped responses to byte strings now that p 601872 landed | 23:14 |
patchbot | https://review.openstack.org/#/c/601872/ - swift - Let error messages to be normal strings again (MERGED) - 1 patch set | 23:14 |
*** gyee has quit IRC | 23:48 | |
*** ianychoi has joined #openstack-swift | 23:55 |