*** ho has joined #openstack-swift | 00:02 | |
*** zhill has quit IRC | 00:02 | |
*** Tahmina has quit IRC | 00:03 | |
*** annegentle has joined #openstack-swift | 00:06 | |
openstackgerrit | Samuel Merritt proposed openstack/swift: EC: don't mix different fragment archives https://review.openstack.org/168185 | 00:09 |
---|---|---|
torgomatic | "don't cross the streams" | 00:09 |
*** km has joined #openstack-swift | 00:11 | |
*** kei_yama has joined #openstack-swift | 00:11 | |
*** rmcall has quit IRC | 00:12 | |
*** kota_ has joined #openstack-swift | 00:29 | |
*** reed has quit IRC | 00:29 | |
*** reed has joined #openstack-swift | 00:30 | |
*** reed has quit IRC | 00:30 | |
*** reed has joined #openstack-swift | 00:31 | |
openstackgerrit | Samuel Merritt proposed openstack/swift: EC: don't mix different fragment archives https://review.openstack.org/168185 | 00:38 |
openstackgerrit | Clay Gerrard proposed openstack/swift: Add Fragment Index filter support to ssync https://review.openstack.org/165188 | 00:59 |
openstackgerrit | Clay Gerrard proposed openstack/swift: wip: ec reconstructor probe test https://review.openstack.org/164291 | 00:59 |
openstackgerrit | Clay Gerrard proposed openstack/swift: Erasure Code Reconstructor https://review.openstack.org/131872 | 00:59 |
*** annegentle has quit IRC | 01:14 | |
*** annegentle has joined #openstack-swift | 01:17 | |
*** vinsh has joined #openstack-swift | 01:37 | |
*** vinsh has quit IRC | 01:43 | |
*** vinsh has joined #openstack-swift | 01:43 | |
*** vinsh has quit IRC | 01:47 | |
*** annegentle has quit IRC | 01:48 | |
*** Gues_____ has joined #openstack-swift | 01:48 | |
*** happyeveryday has joined #openstack-swift | 01:50 | |
*** happyeveryday has quit IRC | 01:58 | |
*** haigang has joined #openstack-swift | 02:00 | |
*** rmcall has joined #openstack-swift | 02:08 | |
*** kota_ has quit IRC | 02:08 | |
*** rmcall_ has joined #openstack-swift | 02:11 | |
*** rmcall has quit IRC | 02:12 | |
*** rmcall_ is now known as rmcall | 02:12 | |
*** haomaiwang has joined #openstack-swift | 02:14 | |
*** zaitcev has quit IRC | 02:18 | |
*** rmcall has quit IRC | 02:22 | |
* charz our jenkins is pretty busy, ha | 02:24 | |
*** haigang has quit IRC | 02:29 | |
*** tsg has quit IRC | 02:29 | |
*** haigang has joined #openstack-swift | 02:30 | |
*** Gues_____ has quit IRC | 02:47 | |
*** annegentle has joined #openstack-swift | 02:48 | |
*** gyee has quit IRC | 02:50 | |
*** Gues_____ has joined #openstack-swift | 02:53 | |
*** annegentle has quit IRC | 02:54 | |
*** devlaps has quit IRC | 02:54 | |
*** panbalag has quit IRC | 03:11 | |
notmyname | charz: is it causing problems? is the test cluster able to handle it? | 03:14 |
charz | notmyname: so far so good. It looks good for now. | 03:15 |
notmyname | ok | 03:15 |
charz | notmyname: Just got some pushover notification when some build fail. | 03:16 |
charz | notmyname: I'm monitoring on it. | 03:16 |
notmyname | thanks! | 03:16 |
charz | notmyname: my pleasure | 03:17 |
mattoliverau | Your 3rd party testing gets to firetest this week :) | 03:23 |
*** tsg has joined #openstack-swift | 03:26 | |
clayg | man it's surprising how prolific the assumption that get_by_index will always return a policy instance is | 03:32 |
clayg | the ECObjectController unlike the ReplicatedObjectController really can't deal with an unknown policy index | 03:33 |
clayg | whereas the replicated policy can always just use the default object.ring.gz and move forward - the ec controller needs a whole bunch of important policy-specific configuration to get the right number of fragments laid out with the right bits | 03:34 |
clayg | I'm not sure if I'm leaning toward raising an error in the proxy on unknown policy (like if the container_info has 2 cached, but this proxy hasn't been restarted yet) | 03:35 |
clayg | or if I should require a replicated policy so we know we can do something useful with the bytes | 03:35 |
notmyname | and then punt to the reconciler to sort it out later? | 03:36 |
clayg | notmyname: yeah of course - that's sort of the way the replicated object controller was designed | 03:36 |
notmyname | right. just walking through that path in my head | 03:36 |
clayg | I think raising an error would make sense as well - but for GETs too? 404 seems reasonable there - eventual consistency and all | 03:37 |
notmyname | so on object PUT, if the container server has a policy index that the proxy doesn't know about, either fall back to replicated storage or raise an error. is that the whole of it? | 03:39 |
clayg | in the object server we decided to raise 400 - which I'm not even entirely sure doesn't just go straight out to the client | 03:39 |
notmyname | * fall back to the default policy | 03:39 |
notmyname | that doesn't sound right | 03:40 |
clayg | yeah i was thinking default - there was something that freaked me out there... PUT x-copy-from: crazy-unknown-policy-index would 404 | 03:40 |
notmyname | 4xx is like saying, "hey client, you know that object you wanted me to store in that container? well you're doing it bad and should feel bad" | 03:41 |
notmyname | I mean, if we found the container info (to even get a policy index at all), then we know that the index is valid, right? | 03:42 |
clayg | yeah I don't think it was intentional to have it make its way out of the proxy - I think I got confused thinking 400's worked like 404's and the proxy could work around it | 03:43 |
clayg | notmyname: yeah we don't let you get rid of policy indexes | 03:44 |
clayg | if we find it in the cache *someone* knows what to do with it | 03:44 |
clayg | but if you don't (you being either a object-server or proxy-server) - what can you do? | 03:44 |
clayg | when there was only one type of policy and they were all perfect replicas there really was no "wrong thing" you could do - you just do your thing | 03:45 |
clayg | notmyname: yeah anyway I think finding a server that doesn't know about a policy index is a server error | 03:48 |
notmyname | yeah, I agree. | 03:48 |
clayg | maybe a 501? 503 (chill out i'm HUPing right now) | 03:48 |
clayg | also the proxy has a better chance of working around a backend 5XX than a 4XX - that was my bad | 03:49 |
clayg | easy fix for the object servers - I'm less jazzed about returning a 5XX to the client - but if we stick the object in the wrong spot - or go looking for it in the wrong spot - it's only *barely* helping | 03:49 |
clayg | ... I think | 03:49 |
*** annegentle has joined #openstack-swift | 03:50 | |
notmyname | 503 I think | 03:50 |
notmyname | you're typing exactly what I'm thinking | 03:51 |
notmyname | yeah, on the one hand I think the error reveals an operation issue: configs are fully deployed before requests start using them | 03:51 |
clayg | yeah I think "deprecated" was the wrong word for the policy config option that enables you to avoid this problem | 03:51 |
notmyname | on the other, you can't do that atomically | 03:51 |
clayg | if you roll out your swift.conf and rings - with the policy turned off - then you can rolling enable the policy in swift.conf and even if someone has a stale config - at least they know what to do with it | 03:52 |
notmyname | ah, right | 03:52 |
clayg | in this way you can make sure everyone knows about policy before you allow the first container to be created | 03:52 |
notmyname | so yeah. "deprecated" isn't a great word there. "disabled" might be slightly better | 03:52 |
clayg | in fairness - we didn't strongly encourage this operational practice - but I think we're going to have to | 03:53 |
notmyname | ya, I was just wondering about where this sort of info will be written down | 03:53 |
notmyname | ie docs | 03:53 |
clayg | notmyname: I think someone wanted to reserve "disabled" for making it like you couldn't write new objects - currently deprecated just prevents it from being listed in info - or creating new containers with the policy - existing containers continue to work | 03:53 |
notmyname | ah | 03:54 |
clayg | notmyname: I think there's an existing section in the storage policy part of the admin guide that sort of hints that doing something like this might be a good idea | 03:54 |
notmyname | "someone" ;-) | 03:54 |
clayg | notmyname: maybe it was me? | 03:54 |
notmyname | lol | 03:54 |
clayg | notmyname: I honestly don't recall | 03:54 |
notmyname | I think it's funny because it's not like every one of us didn't have an opportunity to comment. it's owned by everyone now :-) | 03:54 |
*** annegentle has quit IRC | 03:55 | |
clayg | http://docs.openstack.org/developer/swift/overview_policies.html#deprecating-policies | 03:55 |
clayg | second paragraph - you can also use deprecated... | 03:56 |
clayg | still kinda dancing around the edge - rolling policy introduction is gunna have to be a thing | 03:56 |
clayg | how did we not know! | 03:56 |
clayg | well shit - now we have to come up with words | 03:57 |
notmyname | eh, it's not that bad. stronger words are needed, yes, but mostly it's clarifying what's there instead of new stuff altogether | 03:58 |
clayg | oh... yeah I guess that's true | 03:58 |
clayg | see we built the right THING we just named it wrong | 03:58 |
notmyname | we did everything but the hard problem? ;-) | 03:58 |
clayg | I'd rather be doing it right and looking the fool than doing it wrong and looking good | 03:58 |
clayg | typical | 03:59 |
clayg | lazy | 03:59 |
notmyname | lol | 03:59 |
notmyname | next you'll tell me that "account" and "user" were terrible choices too! | 03:59 |
clayg | no i'm used to it now | 03:59 |
notmyname | or a container-updater that walks object data | 03:59 |
clayg | obviously it's updating the containers | 04:00 |
clayg | i'll probably just start calling restaurants that are closed "deprecated" and it'll totally be fine | 04:00 |
notmyname | lol | 04:00 |
clayg | ok, 503's all around - I'll put something in the body about your cluster operator is doing it wrong | 04:01 |
mattoliverau | Wow, I go to lunch and that's when the channel gets interesting... Typical :p | 04:03 |
clayg | mattoliverau: we were waiting for you to leave | 04:03 |
clayg | mattoliverau: you think we should mix in all the refactoring as we go? | 04:03 |
mattoliverau | Closed restaurants == deprecated.. Love it! Lol | 04:03 |
clayg | mattoliverau: way back in the day i naively thought we could work on the refactoring on master informed by the abstractions we were building in EC | 04:04 |
clayg | but then it turned out that EC was hard (who knew?) | 04:04 |
clayg | stupid fragment indexes and commit requests | 04:05 |
mattoliverau | Lol, yeah.. Some stuff is hard, other stuff is similar but different.. Making refactoring a pain | 04:06 |
clayg | mattoliverau: I think historically we've not done the best dealing with refactoring - we're probably better at it now | 04:06 |
clayg | mattoliverau: but just ask tdasilva how hard it is to merge "clean ups" | 04:07 |
clayg | mattoliverau: it's all like "hey I moved a bunch of code around because aesthetically I think it's better" - "oh yeah, what does it do" - "ideally? nothing. probably some bugs in there tho." | 04:07 |
*** annegentle has joined #openstack-swift | 04:08 | |
clayg | mattoliverau: I'm not sure merging a bunch of duplicated code just to develop it in isolation is better in the long run - but it's the idea I had | 04:09 |
mattoliverau | Thanks for your response on the review.. Let's get stuff in, keep it logical, easily separated.. Duplication is a pain but there will be a trade off. I'm of the opinion of let's get EC working nicely alongside Repl.. We can iterate in master later (famous last words) but especially when a new diskfile/policy is added and we need to start attacking 3 | 04:09 |
clayg | I think someone said it was a good idea - probably acoles_away - and he's smart | 04:09 |
clayg | I think torgomatic told me I'm crazy - but he often thinks I'm doing it wrong | 04:09 |
mattoliverau | Lol | 04:10 |
clayg | mattoliverau: a'ight - well don't think I'm picking on you | 04:10 |
clayg | mattoliverau: you looked at some shit I threw up and said "hey duplicated code sucks" and I was all like "yeah I know but *reasons* wah wah wah" | 04:11 |
mattoliverau | I think at this, the 11th hour, let's not do too many more crazy refactors.. Cause that could damage repl | 04:11 |
clayg | doesn't make me right | 04:11 |
clayg | just means we've got an argument worth having | 04:11 |
clayg | notmyname: *why* did we never capture chuck's swift design principles somewhere! I feel like I could use them right about now | 04:12 |
mattoliverau | If we all thought the same way, we'd have no real innovation ;) plus arguments are fun :p | 04:12 |
clayg | mattoliverau: there was one about you're not done with the design until you've argued about it - or something | 04:12 |
mattoliverau | Ha, awesome | 04:13 |
* clayg shouts call out into the abyss for crieth | 04:13 | |
*** annegentle has quit IRC | 04:13 | |
clayg | was that in atlanta? hong kong? | 04:15 |
notmyname | clayg: https://etherpad.openstack.org/p/juno_swift_core_principles | 04:15 |
clayg | oh yippee! | 04:15 |
*** ppai has joined #openstack-swift | 04:17 | |
*** Gues_____ has quit IRC | 04:19 | |
clayg | gah now I remember why we stalled out - we couldn't decide where to stick it :\ | 04:21 |
notmyname | should be in https://github.com/openstack/swift/blob/master/CONTRIBUTING.md | 04:22 |
*** ppai has quit IRC | 04:23 | |
notmyname | ok, I've got a really early start tomorrow. I've got to take care of some other things before bed, so I'm logging off | 04:24 |
clayg | notmyname: thanks for all the help! | 04:24 |
mattoliverau | notmyname: night! I'll be online to help out tomorrow, so see you then. | 04:25 |
*** ppai has joined #openstack-swift | 04:36 | |
clayg | how can I know if i'm mark downing correctly? | 04:43 |
openstackgerrit | Clay Gerrard proposed openstack/swift: Add Swift Design Principles to CONTRIBUTING.md https://review.openstack.org/168221 | 04:46 |
*** reed has quit IRC | 05:03 | |
*** ppai has quit IRC | 05:05 | |
*** kota_ has joined #openstack-swift | 05:11 | |
*** annegentle has joined #openstack-swift | 05:19 | |
*** ppai has joined #openstack-swift | 05:20 | |
*** annegentle has quit IRC | 05:24 | |
*** tsg has quit IRC | 05:30 | |
*** jamielennox is now known as jamielennox|away | 05:41 | |
*** nshaikh has joined #openstack-swift | 05:42 | |
*** SkyRocknRoll has joined #openstack-swift | 05:53 | |
*** annegentle has joined #openstack-swift | 06:20 | |
*** annegentle has quit IRC | 06:25 | |
*** dmorita has quit IRC | 07:02 | |
*** chlong has quit IRC | 07:16 | |
openstackgerrit | Yuan Zhou proposed openstack/swift: EC: Allow proxy to read from all SN in EC policy https://review.openstack.org/168254 | 07:42 |
clayg | oh eventlet you owe me fucking 3 hours | 07:45 |
clayg | i was reminiscing to this guy in the office about this one time Douglas Crockford came to RAX to do a tech talk - he said a lot of interesting things | 07:46 |
clayg | but one thing I remember was him postulating that being good at programming actually requires you to be slightly broken | 07:47 |
clayg | most humans avoid pain | 07:47 |
openstackgerrit | Yuan Zhou proposed openstack/swift: EC: Allow proxy to read from all SN in EC policy https://review.openstack.org/168254 | 08:02 |
*** anticw has quit IRC | 08:07 | |
*** anticw has joined #openstack-swift | 08:09 | |
*** admin6 has joined #openstack-swift | 08:17 | |
openstackgerrit | Yuan Zhou proposed openstack/swift: EC: Allow proxy to read from all SN in EC policy https://review.openstack.org/168254 | 08:21 |
*** annegentle has joined #openstack-swift | 08:22 | |
*** admin6 has quit IRC | 08:24 | |
*** admin6 has joined #openstack-swift | 08:24 | |
*** annegentle has quit IRC | 08:27 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: Extract EC PUT to ECObjController https://review.openstack.org/164950 | 08:34 |
clayg | everybody likes writing new tests in proxy.test_server - I don't even like to have that file open in my editor for fear it might eat up all my memory or startup some sockets and spawn skynet on my laptop | 08:38 |
clayg | i'll see you guys later in the morning | 08:39 |
clayg | it's Friday! | 08:39 |
*** jistr has joined #openstack-swift | 08:40 | |
*** Cipher45 has quit IRC | 08:50 | |
*** Cipher45 has joined #openstack-swift | 08:51 | |
*** Cipher45 has joined #openstack-swift | 08:51 | |
*** Cipher45 has quit IRC | 08:53 | |
*** Cipher45 has joined #openstack-swift | 08:53 | |
*** Cipher45 has joined #openstack-swift | 08:53 | |
*** jordanP has joined #openstack-swift | 08:53 | |
mattoliverau | clayg: night man | 08:57 |
*** acoles_away is now known as acoles | 09:04 | |
acoles | morning | 09:05 |
acoles | mattoliverau: evening ;) | 09:05 |
mattoliverau | acoles: hey morning! | 09:08 |
acoles | mattoliverau: just catching up on the overnight fun, so what did you and clayg conclude about duplicate code ? | 09:13 |
acoles | mattoliverau: my take is first thing we do after major release is refactor, introduce bugs and find em before next release ;) | 09:14 |
mattoliverau | Its good for now :) keeps the code paths separate and easy to read, worry about possible refactor later | 09:14 |
mattoliverau | So like you said ;) | 09:14 |
acoles | mattoliverau: cool. have a good weekend. are you gonna watch the cricket final? | 09:16 |
mattoliverau | Getting late and failing at setting up container syncing between 2 saios... Must be time for dinner and a break to clear my head | 09:16 |
acoles | mattoliverau: i see its an all down-under affair | 09:16 |
*** mmcardle has joined #openstack-swift | 09:19 | |
mattoliverau | acoles: probably, but first I'll be online in the morning to help out as much as I can pre end of US day... Well inject some sarcasm anyway :p | 09:20 |
*** Akshat has joined #openstack-swift | 09:21 | |
Akshat | Hi | 09:21 |
acoles | mattoliverau: :D i'm never sure if they understand that ;) | 09:21 |
*** ppai has quit IRC | 09:21 | |
Akshat | I did a swift installation, but getting very bad PUT performance | 09:21 |
Akshat | I see an error in logs | 09:21 |
Akshat | object-server: ERROR container update failed | 09:22 |
Akshat | any pointers will be helpful | 09:23 |
mattoliverau | acoles: me neither.. Its half the fun :p | 09:23 |
*** kota_ has quit IRC | 09:23 | |
acoles | Akshat: that error would suggest that the object server(s) are timing out connecting to container servers. check the container servers are running and that the object servers can route http to them. | 09:27 |
Akshat | they are | 09:27 |
Akshat | I am facing this error intermittently | 09:27 |
Akshat | not continuous | 09:27 |
Akshat | they are running on same machine | 09:27 |
Akshat | just different ports | 09:27 |
Akshat | I am getting 50tps for 75KB PUT | 09:29 |
Akshat | though I have very huge servers | 09:30 |
Akshat | not sure what is blocking it | 09:30 |
Akshat | @acoles...what could be blocking according to you | 09:30 |
acoles | Akshat: although its logged as an error the object server is designed to recover, the failed update will be queued for later retry. that can happen when the container server is heavily loaded | 09:31 |
Akshat | if I do a top | 09:31 |
Akshat | I dont see it consuming 1 cpu | 09:31 |
Akshat | I have a 32 core machine | 09:31 |
*** joeljwright has joined #openstack-swift | 09:32 | |
Akshat | not sure what could be wrong | 09:32 |
acoles | Akshat: sorry i have to go to a meeting now | 09:32 |
Akshat | oh | 09:32 |
Akshat | when can we reconvene | 09:32 |
Akshat | Hi Joel | 09:34 |
*** ppai has joined #openstack-swift | 09:35 | |
*** ho has quit IRC | 09:45 | |
*** foexle has joined #openstack-swift | 09:46 | |
*** silor has joined #openstack-swift | 09:50 | |
Akshat | Hi | 09:56 |
Akshat | I need help with swift | 09:56 |
*** ppai has quit IRC | 09:56 | |
*** annegentle has joined #openstack-swift | 10:23 | |
*** haomaiwang has quit IRC | 10:24 | |
*** annegentle has quit IRC | 10:29 | |
Akshat | hi | 10:40 |
*** kei_yama has quit IRC | 10:45 | |
*** haigang has quit IRC | 10:48 | |
ctennis | Akshat: look in the swift logs for a transaction id (starting with tx...) for a specific operation that is failing to work properly, and follow that through the logs to see where the bottleneck is. | 11:17 |
*** erlon has joined #openstack-swift | 11:18 | |
Akshat | I can see txn | 11:20 |
Akshat | which logs should I follow it in | 11:20 |
ctennis | everywhere in your system | 11:21 |
ctennis | how many drives/nodes do you have? | 11:21 |
*** annegentle has joined #openstack-swift | 11:24 | |
*** km has quit IRC | 11:25 | |
*** haypo has joined #openstack-swift | 11:26 | |
*** annegentle has quit IRC | 11:29 | |
openstackgerrit | Merged openstack/swift: Multiple Fragment Archive support for suffix hashes https://review.openstack.org/159637 | 11:34 |
*** Akshat has quit IRC | 11:46 | |
*** admin6 has left #openstack-swift | 11:47 | |
*** Cipher45 has quit IRC | 12:14 | |
*** Cipher45 has joined #openstack-swift | 12:15 | |
*** Cipher45 has joined #openstack-swift | 12:15 | |
*** annegentle has joined #openstack-swift | 12:22 | |
*** annegentle has quit IRC | 12:23 | |
*** fthiagogv has joined #openstack-swift | 12:26 | |
*** panbalag has joined #openstack-swift | 12:50 | |
openstackgerrit | Thiago da Silva proposed openstack/swift: Select policy when running functional test https://review.openstack.org/167595 | 13:00 |
*** nshaikh has quit IRC | 13:19 | |
*** jrichli has joined #openstack-swift | 13:22 | |
*** annegentle has joined #openstack-swift | 13:24 | |
*** mahatic has joined #openstack-swift | 13:27 | |
*** annegentle has quit IRC | 13:29 | |
*** SkyRocknRoll has quit IRC | 13:37 | |
*** annegentle has joined #openstack-swift | 13:43 | |
*** Akshat has joined #openstack-swift | 13:54 | |
Akshat | ctennis ...I have 8 nodes, 184 drives | 13:57 |
ctennis | ok Akshat, just follow the transaction logs and see where the bottleneck is..everything is logged and timed. | 13:57 |
Akshat | I only see logs in syslog | 13:58 |
Akshat | any other place I can see them | 13:58 |
Akshat | I don't find that txn id anywhere else | 13:58 |
ctennis | swift sends everything to syslog, not sure how your setup is, depends on where you have things set to log to | 14:00 |
*** annegentle has quit IRC | 14:07 | |
*** Trixboxer has joined #openstack-swift | 14:17 | |
*** vinsh has joined #openstack-swift | 14:18 | |
*** fthiagogv has quit IRC | 14:26 | |
*** vinsh has quit IRC | 14:27 | |
openstackgerrit | Alistair Coles proposed openstack/swift: Make ECDiskFile require a fragment index https://review.openstack.org/168076 | 14:27 |
*** vinsh has joined #openstack-swift | 14:28 | |
*** G________ has joined #openstack-swift | 14:32 | |
*** annegentle has joined #openstack-swift | 14:32 | |
*** vinsh has quit IRC | 14:32 | |
*** reed has joined #openstack-swift | 14:33 | |
*** AbyssOne is now known as a1|away | 14:40 | |
*** mahatic has quit IRC | 14:47 | |
*** vinsh has joined #openstack-swift | 14:53 | |
*** lpabon has joined #openstack-swift | 14:56 | |
openstackgerrit | Alistair Coles proposed openstack/swift: Make ECDiskFile require a fragment index https://review.openstack.org/168076 | 14:56 |
*** Akshat has quit IRC | 15:01 | |
*** pokoli has joined #openstack-swift | 15:01 | |
*** vinsh has quit IRC | 15:01 | |
*** pokoli has left #openstack-swift | 15:03 | |
*** annegentle has quit IRC | 15:04 | |
*** annegentle has joined #openstack-swift | 15:05 | |
*** Akshat has joined #openstack-swift | 15:05 | |
*** rdaly2 has joined #openstack-swift | 15:06 | |
*** G________ has quit IRC | 15:08 | |
*** devlaps has joined #openstack-swift | 15:10 | |
*** Cipher45 has left #openstack-swift | 15:11 | |
*** reed has quit IRC | 15:14 | |
*** G________ has joined #openstack-swift | 15:16 | |
*** Akshat has quit IRC | 15:17 | |
*** fifieldt has quit IRC | 15:21 | |
*** lpabon has quit IRC | 15:26 | |
*** mahatic has joined #openstack-swift | 15:28 | |
*** mahatic has quit IRC | 15:28 | |
*** G________ has quit IRC | 15:29 | |
*** gyee has joined #openstack-swift | 15:29 | |
*** haomaiwang has joined #openstack-swift | 15:39 | |
*** mahatic has joined #openstack-swift | 15:44 | |
*** dencaval has quit IRC | 15:52 | |
*** tsg has joined #openstack-swift | 15:57 | |
openstackgerrit | Alistair Coles proposed openstack/swift: Make ECDiskFile require a fragment index https://review.openstack.org/168076 | 15:59 |
*** foexle has quit IRC | 16:05 | |
*** G________ has joined #openstack-swift | 16:05 | |
*** annegentle has quit IRC | 16:06 | |
*** chuck_ is now known as zul | 16:19 | |
*** zul has quit IRC | 16:19 | |
*** zul has joined #openstack-swift | 16:19 | |
*** G________ has quit IRC | 16:22 | |
clayg | morning | 16:26 |
*** G________ has joined #openstack-swift | 16:26 | |
acoles | clayg: its friday! multi-fi-hash-thingy landed | 16:26 |
*** Akshat has joined #openstack-swift | 16:27 | |
clayg | oh thank goodness! | 16:27 |
acoles | clayg: so are you awake enough to take a question? | 16:28 |
clayg | let's get this party started! | 16:28 |
clayg | i have COFFEEEEEEEEEEYYYYYYYYY | 16:28 |
acoles | thats the spirit! | 16:28 |
*** haypo has left #openstack-swift | 16:28 | |
acoles | clayg: i am working on https://review.openstack.org/#/c/165188/8 and looking at yield_hashes | 16:29 |
acoles | clayg: i had this conversation with peluse a while back - if we end up with a stray .data that is newer than anything else in the obj dir but has no durable, we don't want to yield that from yield_hashes do we? | 16:30 |
clayg | acoles: I just glanced at https://review.openstack.org/#/c/168076/4 but I get the feeling I'm going to love it | 16:30 |
clayg | acoles: correct - only yield out .data that is durable | 16:30 |
acoles | clayg: cool i love an easily reached agreement :) | 16:31 |
clayg | acoles: one exception might be if we have a way to sync suffixes - then the receiver might want a way to somehow know that it has some data for fi X and tell the remote end to... magic | 16:31 |
acoles | clayg: so i *think* its broken right now but if it is i will fix and write tests to prove | 16:31 |
clayg | acoles: perfect! | 16:31 |
acoles | next question: | 16:31 |
acoles | (can you tell i have been waiting!) | 16:32 |
clayg | acoles: actually i'd love to have you help me fixup the validate something something in the middle of the cleanup_list_dir something - it's ugly - did you see it? | 16:32 |
clayg | i sort of don't remember | 16:32 |
Akshat | Hi | 16:32 |
Akshat | what can be done to improve the performance for swift | 16:33 |
Akshat | I am getting a really bad performance | 16:33 |
acoles | clayg: yeah i'm going to clean up all that | 16:33 |
clayg | Akshat: did you fix the errors? | 16:33 |
acoles | clayg: the power of gather_ondisk_files will be unleashed :) | 16:33 |
Akshat | I could not find that txn id | 16:33 |
Akshat | still seeing object-server: ERROR container update failed | 16:33 |
clayg | acoles: whoa | 16:33 |
Akshat | in syslogs | 16:33 |
Akshat | and a very bad tps | 16:34 |
clayg | Akshat: yeah that makes put hit a timeout - slows things down quite a bit | 16:34 |
clayg | Akshat: yup - at least you know what you need to fix | 16:34 |
Akshat | what is the cause of that error | 16:34 |
Akshat | and how to fix it | 16:34 |
clayg | it means the object server can't talk to the container server - it'll normally say something more about - timeout - econnrefused - 4XX/5XX something like this | 16:34 |
Akshat | yup...in debug I saw something like greenlet econnrefused | 16:35 |
Akshat | but could not figure out any root cause for it | 16:35 |
Akshat | or how to fix it. | 16:35 |
clayg | acoles: wait - were you asking a "next question" | 16:35 |
acoles | clayg: next question, https://review.openstack.org/#/c/168076/4 is going to break stuff if i rebase the chain onto it | 16:36 |
acoles | clayg: yes! | 16:36 |
clayg | Akshat: oh nice - yeah that's easy the container servers aren't listening where the proxy is telling the object server to expect them to be | 16:36 |
acoles | clayg: all stuff that could be fixed but wasted effort if we don't like 168076 so what do you suggest? | 16:36 |
clayg | acoles: yeah but i love that chain | 16:37 |
Akshat | clayg: what could be the possible reason, could it be replication | 16:37 |
clayg | but I *do* like that change | 16:37 |
clayg | Akshat: nah - probably just network - do you have replication ports in your rings? | 16:37 |
Akshat | clayg: can proxy point to handoff node for containers | 16:37 |
clayg | Akshat: can you swift-ring-builder container.builder - and try to talk to one of those servers from the object server with curl | 16:38 |
acoles | clayg: k, well i'll stuck working on the yield_hashes/ssync patch 165188 then take a look at the rebase | 16:38 |
patchbot | acoles: https://review.openstack.org/#/c/165188/ | 16:38 |
Akshat | wait lemme try the commands | 16:38 |
clayg | Akshat: the proxy will talk to handoff containers - but it always sends down the primaries to the object server for write update | 16:38 |
*** tsg has quit IRC | 16:39 | |
acoles | clayg: s/stuck/stick/ | 16:39 |
clayg | acoles: ok - well let me review the fix name_to_ts and maybe we can rebase it on 165188 and fix from the recon down or something? | 16:39 |
clayg | i want to work on recon tests anyway - so trying to maintain some failing tests would probably highlight to me the grossest parts | 16:40 |
clayg | I like this plan! | 16:40 |
clayg | today is going to be a good day! | 16:40 |
*** Akshat has quit IRC | 16:40 | |
acoles | clayg: ok. now the bad news... | 16:40 |
acoles | ...just kidding | 16:41 |
clayg | mattoliverau: cschwede: tdasilva: torgomatic: can you guys rock the proxy today? patch 164950 is sort of annoyingly good and has some questionable diffs because I'm looking at these files against master and trying to rip out cruft built up over the development of feature/ec | 16:41 |
patchbot | clayg: https://review.openstack.org/#/c/164950/ | 16:41 |
clayg | acoles: oh nice | 16:41 |
clayg | oh shit! i just call the reconstructor recon :'( | 16:41 |
clayg | damit peluse ^^^ | 16:42 |
clayg | maybe it should be the recoder? | 16:42 |
cschwede | clayg: looking | 16:42 |
acoles | the 'fixerupper' | 16:42 |
clayg | swift-init object-fixer once -nv | 16:43 |
acoles | +1 | 16:44 |
clayg | yeah but re-EC-doer is like recoder but like dyslexic | 16:44 |
clayg | acoles: so - there is no bad news? | 16:46 |
clayg | cschwede: oh i feel like I tricked you - not annoyingly good - annoyingly BIG - there's nothing good about moving a bunch of code in the proxy - but I'm pretty sure some parts of it are good - mostly the lots and lots of new tests and the random bug fixes i found writing the tests | 16:49 |
*** silor has quit IRC | 16:49 | |
clayg | awww - you guys - come on! https://review.openstack.org/#/c/168221/ | 16:50 |
acoles | clayg: no tests, no +2 | 16:51 |
*** Akshat has joined #openstack-swift | 16:51 | |
acoles | :) | 16:51 |
acoles | i wanna see a spec first | 16:51 |
Akshat | clayg: I did try curl | 16:51 |
Akshat | fetching contents of a container from object server | 16:51 |
Akshat | it looks to be working fine | 16:51 |
Akshat | clayg: Also I see this error intermittently | 16:52 |
clayg | maybe the rings on the proxy are stale - i'm not sure if the econnrefused traceback includes the ip and port of the container server - be really nice if it did :\ | 16:54 |
cschwede | clayg: i’ll need some more time for the obj.py review in patch 164950, no blocker so far. unfortunately my coworking space closes soon, thus i have to interrupt the review | 16:54 |
patchbot | cschwede: https://review.openstack.org/#/c/164950/ | 16:54 |
Akshat | its the same ring all across | 16:54 |
Akshat | clayg: I verified the cksum | 16:55 |
Akshat | its the same | 16:55 |
clayg | cschwede: no problem - any thing you can throw out would be helpful - my eyes were glossing over last night and today is new challenges! | 16:55 |
clayg | Akshat: well dag nab it | 16:56 |
Akshat | clayg: sorry din't get you | 16:57 |
clayg | well there's no good reason for it to work from the object server with curl and not from the object server process | 16:58 |
clayg | so we're probably not talking to the same node/ip/port with curl as the object-server process is reading from the headers | 16:58 |
Akshat | clayg: the data in ring shows to be reachable | 16:59 |
Akshat | what best can be done to debug this | 17:00 |
clayg | yah but what does the error message say? | 17:00 |
Akshat | #012Traceback (most recent call last):#012 File "/usr/local/lib/python2.7/dist-packages/swift/obj/server.py", line 194, in async_update#012 full_path, headers_out)#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 157, in http_connect#012 ipaddr, port, method, path, headers, query_string, ssl)#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 189, in http_connect_raw#012 conn.endheaders()#012 File "/usr/lib/python2.7/httplib.py", line 969, in endheaders#012 self._send_output(message_body)#012 File "/usr/lib/python2.7/httplib.py", line 829, in _send_output#012 self.send(msg)#012 File "/usr/lib/python2.7/httplib.py", line 791, in send#012 self.connect()#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 108, in connect#012 return HTTPConnection.connect(self)#012 File "/usr/lib/python2.7/httplib.py", line 772, in connect#012 self.timeout, self.source_address)#012 File "/usr/local/lib/python2.7/dist-packages/eventlet/green/socket.py", line 60, in create_connection#012 raise error(msg)#012error: [Errno 111] ECONNREFUSED (txn: tx3866e75241e84b78b5ad0-0055151c57 | 17:00 |
Akshat | exact error | 17:00 |
*** haomaiw__ has joined #openstack-swift | 17:02 | |
Akshat | is curl command to fetch contents of the container good enough to validate | 17:02 |
clayg | Akshat: nope, there's a line above that | 17:02 |
clayg | ERROR container update failed with -> then it has the ip:port/dev | 17:02 |
Akshat | ERROR container update failed with 10.65.53.242:6001/sdn (saving for async update later): Timeout (3s) (txn: tx14298e16edc849a289f2e-00551504ba) | 17:03 |
*** haomaiwang has quit IRC | 17:04 | |
*** annegentle has joined #openstack-swift | 17:04 | |
*** Trixboxer has quit IRC | 17:04 | |
clayg | yeah ok, so and `curl http://10.65.53.242:6001/sdn` returns something about 4XX bad path? | 17:06 |
clayg | *from the host that logged that line* | 17:06 |
*** btorch has quit IRC | 17:10 | |
*** jistr has quit IRC | 17:13 | |
openstackgerrit | Thiago da Silva proposed openstack/swift: Add Swift Design Principles to CONTRIBUTING.md https://review.openstack.org/168221 | 17:13 |
*** G________ has quit IRC | 17:14 | |
*** nshaikh has joined #openstack-swift | 17:18 | |
clayg | yeah that's what i'm talking about! | 17:19 |
openstackgerrit | Thiago da Silva proposed openstack/swift: Document SWIFT_TEST_POLICY for regular functional tests https://review.openstack.org/167958 | 17:22 |
*** RayAngelone has joined #openstack-swift | 17:23 | |
acoles | tdasilva: ^^ thanks! | 17:23 |
tdasilva | acoles: hope it is ok, it was just a minor fix, so I thought it would not mind | 17:24 |
tdasilva | s/it/you | 17:24 |
*** nshaikh has quit IRC | 17:27 | |
*** jeblair is now known as denethor | 17:27 | |
*** denethor is now known as jeblair | 17:27 | |
*** cebruns has quit IRC | 17:30 | |
acoles | tdasilva: no not at all its helpful. funny thing is that iirc i spotted that i had left the word 'would' out before pushing first version, but i then put it in wrong place :) | 17:31 |
clayg | tdasilva: maybe I should add "seplling is not a prioitry" - or maybe that's just me :\ | 17:32 |
tdasilva | acoles, clayg: haha | 17:32 |
clayg | docs are required - probably spelling and grammer is optional | 17:33 |
tdasilva | that works | 17:33 |
tdasilva | :P | 17:33 |
acoles | clayg: s/grammer/grammar/ ;) | 17:33 |
tdasilva | clayg: have we considered moving the EC classes in controllers/obj.py to a different file? | 17:34 |
clayg | lol! | 17:34 |
clayg | tdasilva: i fucking tried and my stupid register trick didn't work anymore unless I could find someone to import the file and execute it! | 17:34 |
clayg | although now I'm thinking adding it to __init__ sort of like we do with the proxy controllers would work | 17:34 |
*** btorch has joined #openstack-swift | 17:35 | |
clayg | tdasilva: i was actually thinking it'd be sorta hawt to see everything (policy, controller, diskfilemanager) moved under a common module - like all the ec in one place! | 17:35 |
clayg | I have this idea with these @register hooks that someone could actually just ship a storage policy and hook it in with entry point | 17:36 |
tdasilva | clayg: btw: I like the idea of the router, I think it will be helpful if we introduce new subclasses besides ec and replication ;) | 17:36 |
tdasilva | yes! | 17:36 |
clayg | i thought you would :) | 17:36 |
tdasilva | hehe | 17:36 |
*** haomaiwa_ has joined #openstack-swift | 17:40 | |
*** haomaiw__ has quit IRC | 17:40 | |
Akshat | clayg: I am out of network right now | 17:44 |
Akshat | will try the curl you specified and update you | 17:44 |
*** jordanP has quit IRC | 17:45 | |
*** haomaiwa_ has quit IRC | 17:53 | |
clayg | acoles: I'm changing up the parse_on_disk_filename | 17:53 |
acoles | clayg: in patch 165188 there's a bunch of tests that moved and i think as a result have not picked up the changes to dev paths | 17:54 |
patchbot | acoles: https://review.openstack.org/#/c/165188/ | 17:54 |
clayg | changes to dev paths? | 17:55 |
acoles | clayg: like on feature ec they are /srv/sda1 and in 165188 they are /srv/dev/ | 17:55 |
*** haomaiwang has joined #openstack-swift | 17:55 | |
clayg | so... they're failing? | 17:55 |
acoles | the return values to mocked get_dev_path | 17:55 |
acoles | idk why they got changed | 17:56 |
* clayg is way lost | 17:56 | |
acoles | but do we want to change them back again | 17:56 |
acoles | clayg: yeah i am lost too :/ | 17:56 |
acoles | clayg: i remember peluse grumbling about having to change some stuff like that | 17:57 |
clayg | oh i dont' know why those tests were *originally* mocking get_dev_path - i probably just didn't notice it when i copy pasted them | 17:57 |
clayg | acoles: maybe they just need to turn off mount check and use self.existing_dev1 (or something else that creates a real directory) | 17:58 |
acoles | clayg: its just that when i compare the moved tests with the originals on feature/ec there's a ton of diffs and i was hoping there would be none (you just relocated from the DiskFileMixin to DiskFileManagerMixin) | 18:00 |
acoles | hmmm | 18:01 |
Akshat | clayg: I tried curl http://10.65.53.241:6001/sdc | 18:05 |
Akshat | it gives me invalid url | 18:05 |
Akshat | Error was ERROR container update failed with 10.65.53.241:6001/sdc (saving for async update later): Timeout (3s) (txn: txc49b77bd7808406597127-005515999a | 18:05 |
clayg | acoles: i don't care about diff count in the tests - lots of tests diff makes it look like we're trying :D | 18:06 |
*** haomaiwang has quit IRC | 18:06 | |
clayg | well now hold on - Akshat the Timeout isn't the same as econnrefused is it? | 18:07 |
clayg | a) the 3s timeout is the node timeout (by default) - so that means the object-server made the connection and the container wasn't able to respond | 18:07 |
Akshat | If I see the debug log...trace says econnrefused | 18:08 |
clayg | b) i don't think an ECONNREFUSED would be logged as a timeout - i waited for an ack for x seconds and didn't hear anything is different from I talked to this host and he said he won't accept connections on that port | 18:08 |
clayg | what? | 18:08 |
clayg | Akshat: are you sure those are the same request? can you put a large snippet from the logs in a paste somewhere? | 18:09 |
clayg | paste/gist | 18:09 |
Akshat | sure | 18:09 |
openstackgerrit | Alistair Coles proposed openstack/swift: Add Fragment Index filter support to ssync https://review.openstack.org/165188 | 18:16 |
acoles | clayg: ^^ i cleaned up yield_hashes and there's a TODO where 168076 will allow more cleanup, i *think* this wants rebasing on patch 168076 so 168076 is start of chain | 18:16 |
patchbot | acoles: https://review.openstack.org/#/c/168076/ | 18:16 |
acoles | clayg: i have to work offline for a couple of hours back later | 18:17 |
*** acoles is now known as acoles_away | 18:19 | |
*** welldannit has quit IRC | 18:21 | |
clayg | Akshat: i don't see any econnrefused in there - just the timeouts from the container servers - that's a different problem - how many and what kind of disks do you have servicing the container work load - i can see you're using ssbench - the scenario .json would be helpful as well | 18:23 |
clayg | Akshat: it's quite possible the work load doesn't make sense for the hardware - what's the total rps on the write path - you might try more limited crud scenarios to find good values for isolated workloads before trying to combine them | 18:24 |
Akshat | this is the scenario json | 18:26 |
Akshat | { | 18:26 |
Akshat | "name": "Small test scenario", | 18:26 |
Akshat | "sizes": [{ | 18:26 |
Akshat | "name": "tiny", | 18:26 |
Akshat | "size_min": 7680, | 18:26 |
Akshat | "size_max":7680 | 18:26 |
Akshat | }], | 18:26 |
Akshat | "initial_files": { | 18:26 |
Akshat | "tiny" : 100 | 18:26 |
Akshat | }, | 18:26 |
Akshat | "operation_count": 500, | 18:26 |
Akshat | "crud_profile": [1,0,0,0], | 18:26 |
Akshat | "user_count": 1 | 18:26 |
Akshat | } | 18:26 |
Akshat | how do I check rps on write path | 18:26 |
Akshat | I am only trying a PUT | 18:26 |
Akshat | and its giving me the worst performance | 18:27 |
Akshat | how do you switch on DEBUG logs for container-server | 18:27 |
Akshat | I have 184 disks, SATA, 3TB each | 18:27 |
*** mmcardle has quit IRC | 18:30 | |
*** gyee has quit IRC | 18:32 | |
clayg | requests/per/second - should be in the output of ssbench | 18:33 |
clayg | are you over-riding the user count on the command line? what's the command you use to run this scenario? please use paste | 18:33 |
Akshat | YES | 18:33 |
clayg | I don't understand where the HEADs are coming from | 18:34 |
Akshat | u 75 c 100 o 1000 | 18:34 |
Akshat | tps 36 | 18:34 |
Akshat | 2 zones | 18:34 |
Akshat | 184 drives across 8 nodes | 18:34 |
Akshat | 23 drive on each node | 18:34 |
Akshat | partition power 13 | 18:34 |
Akshat | what does HEAD signify | 18:35 |
Akshat | Mar 27 11:31:19 dfw-appblx061-25 swift: ERROR container update failed with 10.65.53.241:6001/sdc (saving for async update later): #012Traceback (most recent call last):#012 File "/usr/local/lib/python2.7/dist-packages/swift/obj/server.py", line 194, in async_update#012 full_path, headers_out)#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 157, in http_connect#012 ipaddr, port, method, path, headers, query_string, ssl)#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 189, in http_connect_raw#012 conn.endheaders()#012 File "/usr/lib/python2.7/httplib.py", line 969, in endheaders#012 self._send_output(message_body)#012 File "/usr/lib/python2.7/httplib.py", line 829, in _send_output#012 self.send(msg)#012 File "/usr/lib/python2.7/httplib.py", line 791, in send#012 self.connect()#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 108, in connect#012 return HTTPConnection.connect(self)#012 File "/usr/lib/python2.7/httplib.py", line 772, in connect#012 self.timeout, self.source_address)#012 File "/usr/local/lib/python2.7/dist-packages/eventlet/green/socket.py", line 60, in create_connection#012 raise error(msg)#012error: [Errno 111] ECONNREFUSED (txn: tx13a72f5fa163455a9a153-005515a1f6) | 18:35 |
Akshat | Mar 27 11:31:19 dfw-appblx061-25 swift: 10.65.53.241 - - [27/Mar/2015:18:31:18 +0000] "PUT /sdd/150/AUTH_72dc104eb65941ebb73cc0d3dd406022/ssbench_000013/tiny_000711" 201 - "PUT http://10.65.53.241:8080/v1/AUTH_72dc104eb65941ebb73cc0d3dd406022/ssbench_000013/tiny_000711" "txaf2d37659db149b5b9b15-005515a1f6" "proxy-server 38167" 0.7375 "-" 37848 | 18:35 |
Akshat | Mar 27 11:31:19 dfw-appblx061-25 swift: ERROR container update failed with 10.65.53.241:6001/sdd (saving for async update later): #012Traceback (most recent call last):#012 File "/usr/local/lib/python2.7/dist-packages/swift/obj/server.py", line 196, in async_update#012 response = conn.getresponse()#012 File "/usr/local/lib/python2.7/dist-packages/swift/common/bufferedhttp.py", line 123, in getresponse#012 response = HTTPConnection.getresponse(self)#012 File "/usr/lib/python2.7/httplib.py", line 1045, in getresponse#012 response.begin()#012 File "/usr/lib/python2.7/httplib.py", line 409, in begin#012 version, status, reason = self._read_status()#012 File "/usr/lib/python2.7/httplib.py", line 373, in _read_status#012 raise BadStatusLine(line)#012BadStatusLine: '' (txn: tx39e47fd5cf5042feabff4-005515a1eb) | 18:35 |
Akshat | this is the debug of the same log | 18:36 |
Akshat | check this https://gist.github.com/akshatknsl/82f7df535e85659819e9 | 18:37 |
Akshat | I have the debug logs | 18:37 |
glange | Akshat: http://paste.openstack.org/ <-- use that next time for multi-line pastes | 18:38 |
clayg | Akshat: stop. pasteing. these. lines. in. channel. - please just link to a gist | 18:38 |
clayg | glange: thanks :) | 18:38 |
glange | no, thank you! :) | 18:38 |
clayg | glange: you guys grumbling about this ec stuff over there? You know we're going to start reviewing the merge to master next week. | 18:38 |
Akshat | glange: sure | 18:38 |
clayg | glange: ya'll gunna even look at it? or is there some sinister plan to come through and -2 everything trololololo | 18:39 |
glange | hahah | 18:39 |
Akshat | http://paste.openstack.org/show/197210/ | 18:40 |
Akshat | clayg: These are the same logs in debug, they give much more detail | 18:40 |
*** annegentle has quit IRC | 18:40 | |
*** annegentle has joined #openstack-swift | 18:41 | |
*** Guest____ has joined #openstack-swift | 18:41 | |
*** panbalag has quit IRC | 18:46 | |
clayg | Mar 27 11:31:20 dfw-appblx061-25 swift: SIGTERM received <- are you doing like failure testing? stopping process during benchmark? | 18:48 |
clayg | because *that'll* log some errors :P | 18:48 |
Akshat | no failure testing | 18:49 |
Akshat | it was too slow | 18:49 |
Akshat | so killed the process | 18:49 |
Akshat | to paste you the logs | 18:49 |
*** nshaikh has joined #openstack-swift | 18:49 | |
Akshat | There is a trace for container update error | 18:50 |
Akshat | It points to error in /usr/local/lib/python2.7/dist-packages/swift/obj/server.py | 18:50 |
Akshat | File "/usr/local/lib/python2.7/dist-packages/eventlet/green/socket.py", line 60, in create_connection#012 | 18:51 |
Akshat | I have been stuck with this since days now | 18:52 |
Akshat | :( | 18:52 |
*** tsg has joined #openstack-swift | 18:55 | |
*** tsg_ has joined #openstack-swift | 18:57 | |
*** tsg has quit IRC | 19:00 | |
Akshat | clayg: what can be my next steps | 19:00 |
ctennis | it looks to me like you don't have servers running on 10.65.53.241 on ports 6000 and 6001...hence the connection refused errors | 19:03 |
Akshat | they are running | 19:03 |
Akshat | I verified telnet and netstat | 19:04 |
ctennis | you may need to use something like tcpdump to understand why it can't connect then | 19:05 |
ctennis | perhaps you're hitting a system resource limitation | 19:05 |
Akshat | could replication be the bottleneck | 19:08 |
ctennis | there's also nothing in this log indicating connections to anything except 10.65.53.241, so if you have 8 nodes it doesn't reflect in this log | 19:08 |
ctennis | that seems unlikely to me | 19:08 |
*** nshaikh has left #openstack-swift | 19:09 | |
Akshat | I will try few things as suggested | 19:13 |
Akshat | will update you ctennis and clayg | 19:13 |
Akshat | Thanks all | 19:14 |
clayg | ctennis: thanks | 19:17 |
Akshat | Just one question | 19:18 |
Akshat | I have a node with 23 disks | 19:18 |
Akshat | can a single object/container server cater to all 23 | 19:18 |
ctennis | well Akshat you should run multiple workers not just a single process | 19:20 |
Akshat | I have a 32 core machine and 32 workers | 19:20 |
ctennis | ok that will work fine | 19:20 |
*** gsilvis has quit IRC | 19:21 | |
Akshat | what is usual tps for PUT in swift | 19:22 |
Akshat | I mean you have benchmarked | 19:22 |
*** gsilvis has joined #openstack-swift | 19:22 | |
*** Akshat has quit IRC | 19:25 | |
*** Akshat has joined #openstack-swift | 19:26 | |
sweeper | Akshat: over 9000 | 19:26 |
*** joeljwright has quit IRC | 19:28 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: Make ECDiskFile require a fragment index https://review.openstack.org/168076 | 19:36 |
*** devlaps has quit IRC | 19:36 | |
*** Akshat has quit IRC | 19:37 | |
*** Guest____ has quit IRC | 19:40 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: Add Fragment Index filter support to ssync https://review.openstack.org/165188 | 20:08 |
clayg | idk.... i'm really staring to like the object-recoder | 20:11 |
clayg | it's the truc - reconstructor truct truck truckt truct tr ct tr ct truct reconstructor reconstructor reconsturct - it's hard | 20:13 |
*** acoles_away is now known as acoles | 20:27 | |
clayg | acoles: you're back!? | 20:31 |
clayg | i'm digging this make/parse_on_disk_filename man - thanks for that | 20:31 |
acoles | clayg: i am just for a while. i'm just looking at your changes. | 20:32 |
clayg | i'm redoing the recon change now - it's also digging on it - tests are getting better | 20:32 |
clayg | damnit! the recoder change | 20:32 |
acoles | clayg: i just had to take my son to cricket training and wrote some more tests for yield_hashes while i was waiting so will try to push those up | 20:32 |
acoles | clayg: fixer fixer fixer! | 20:32 |
clayg | fixer - recoder - reconstrcutor - everything is all on the left hand - I think i need a dovark or something | 20:33 |
clayg | acoles: i was going to give you a hard time about writing tests during baseball practice - but I've totally been missing will's practice for like three weeks now | 20:34 |
clayg | i guess it's better to be there and be destracted than to onot be there :\ | 20:34 |
acoles | clayg: oh i wasn't there i sit outside in the car - he's 16 and doesn't want me around :P | 20:35 |
acoles | clayg: so yeah you parser more bullet proof, cool | 20:36 |
acoles | s/you/you made/ | 20:36 |
acoles | clayg: dict return, cool. can we start sending dicts in ssync? i'd like that | 20:38 |
openstackgerrit | Clay Gerrard proposed openstack/swift: Erasure Code Reconstructor https://review.openstack.org/131872 | 20:39 |
clayg | acoles: now that's an idea | 20:39 |
clayg | maybe json or msgpack - even csv is fine if the parser on the other end can throw out trailing items | 20:40 |
acoles | clayg: did you get chance to look over this (i don't expect you to have had time!) https://trello.com/c/UjVTwXj7/154-optimize-ssync-to-not-replicate-data-file-when-only-durable-is-missing | 20:40 |
clayg | we should sneak that fix in now acctually | 20:40 |
acoles | clayg: if we are going to change the protocol we *must* make it extensible otherwise it will have to change again to do the right thing for meta files | 20:41 |
clayg | acoles: well csv you can just add to the end - it works - we've been doing it with log parsing lines for years | 20:42 |
clayg | i hate json, we could use query string encoding, msgpack is nice because it only has byte arrays so it never screws you on unicode | 20:42 |
clayg | and can have non string keys | 20:43 |
clayg | and it has good cross platform support | 20:43 |
acoles | clayg: yes, but we may need to have optional pieces like durable timestamp for EC but not there for repl so having k,v pairs is useful, although more verbose | 20:43 |
clayg | redbo: what's the bomb diggity serailization format for go? surely there's something better than json the hipsters are using | 20:43 |
clayg | acoles: true dat | 20:44 |
clayg | acoles: you *know* we are changing the REPLICATION verb to SSYNC | 20:44 |
clayg | so.... like..... you're not going to have old ssync processes talking to the new endpoints while you UUUUUUppppgrAAde | 20:45 |
clayg | we should *DO IT* | 20:45 |
acoles | clayg: of course | 20:45 |
clayg | notmyname: look more stuff to do like... *today* | 20:45 |
acoles | is notmyname around today? | 20:46 |
clayg | taxes? i don't know why this man has to spend three days on taxes - it's like he has overseas investmnets or something | 20:46 |
clayg | also - I think the deadline is in like a couple of weeks? my wife has that stuff filed in like Janurary. | 20:47 |
acoles | three days in a taxi sounds like a hell of a journey | 20:47 |
acoles | oh taxes | 20:47 |
clayg | lol | 20:47 |
acoles | thought that was your speling agin | 20:47 |
clayg | acoles: so I sorta think there should be something plumbed into diskfile_from_hash that's just like "create a durable if you have this file" | 20:48 |
acoles | clayg: speaking of wives, i'd better get these new tests added | 20:48 |
clayg | yeah i really shoulnd't be allowed to communicte in written word without a spellchecker :\ | 20:48 |
* acoles goes to look who calls diskfile_from_hash | 20:49 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: wip: ec reconstructor probe test https://review.openstack.org/164291 | 20:49 |
clayg | a'ight jenkins do your thing | 20:49 |
clayg | acoles: the receiver - if it has someone wanting to push it a ts then it knows the remote end has a .durable for that file - so if it also has that file/ts but not durable - when it goes to get_from_hash it could also be like and feel free to make this timestamp durable if it's not already | 20:50 |
acoles | clayg: uhuh, so ssync_receiver just fixes up the durable | 20:50 |
acoles | yeah, that | 20:51 |
clayg | it's a cheap hack - just some plumbing | 20:51 |
clayg | not as good as k/v ssync protocol | 20:51 |
clayg | in the missing check updates | 20:51 |
clayg | we really should do that | 20:51 |
clayg | i'll look at it - but i need to shut down an head into the office | 20:51 |
clayg | it's already like mid afternoon out here | 20:51 |
acoles | k well i need to go reaquaint with my family | 20:52 |
clayg | but like i don't know... i was doing stuff | 20:52 |
clayg | good luck with that - I think I've got that on my calander for mid-april? | 20:52 |
acoles | lol | 20:52 |
-openstackstatus- NOTICE: Gerrit maintenance commences in 1 hour at 22:00 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html | 21:01 | |
*** lcurtis has joined #openstack-swift | 21:01 | |
mattoliverau | Morning | 21:09 |
acoles | mattoliverau: good morning! | 21:09 |
mattoliverau | acoles: Good evening :) | 21:09 |
*** annegentle has quit IRC | 21:10 | |
*** csmart has quit IRC | 21:13 | |
*** csmart has joined #openstack-swift | 21:13 | |
openstackgerrit | Alistair Coles proposed openstack/swift: Add Fragment Index filter support to ssync https://review.openstack.org/165188 | 21:14 |
acoles | ^^ just added tests | 21:14 |
*** gyee has joined #openstack-swift | 21:14 | |
*** jrichli has quit IRC | 21:18 | |
acoles | mattoliverau: hey it was great to overlap with you but i am shutting down now, have a good weekend! | 21:19 |
mattoliverau | The pleasure was mine, av a great weekend, go and enjoy your family :) | 21:21 |
*** acoles is now known as acoles_away | 21:22 | |
*** thumpba has joined #openstack-swift | 21:35 | |
*** mahatic has quit IRC | 21:40 | |
*** annegentle has joined #openstack-swift | 22:04 | |
clayg | wft are you doing gerrit? | 22:05 |
-openstackstatus- NOTICE: Gerrit is offline for maintenance, ETA 22:30 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html | 22:05 | |
*** ChanServ changes topic to "Gerrit is offline for maintenance, ETA 22:30 UTC http://lists.openstack.org/pipermail/openstack-dev/2015-March/059948.html" | 22:05 | |
*** RayAngelone has quit IRC | 22:06 | |
clayg | oh sure - now you tell me | 22:06 |
clayg | srly, i don't know how to work now | 22:07 |
clayg | i don't think dvcs work the way I think they do | 22:07 |
mattoliverau | sigh.. thanks gerrit | 22:09 |
clayg | zaitcev: y u no in channel!? | 22:10 |
*** Nadeem has joined #openstack-swift | 22:14 | |
torgomatic | bash: line 1: 2789 Segmentation fault (core dumped) nosetests -s test.unit.proxy.test_server:TestObjectECRangedGET.test_multiple_ranges | 22:14 |
torgomatic | can I go home now? | 22:14 |
clayg | torgomatic WINS - FATALITY | 22:15 |
mattoliverau | lol, a day isn't over until your code seg faults :P | 22:30 |
*** erlon has quit IRC | 22:31 | |
mattoliverau | Well on the plus side, gerrit seems to be back | 22:32 |
*** ChanServ changes topic to "Review Dashboard: http://goo.gl/uRzLBX | Overview Dashboard: http://goo.gl/2By1qv | EC status: https://gist.github.com/notmyname/fd006c061ccb28e8ecfc | Logs: http://eavesdrop.openstack.org/irclogs/%23openstack-swift/" | 22:33 | |
*** welldannit has joined #openstack-swift | 22:49 | |
notmyname | hello | 22:51 |
notmyname | I'm here now | 22:51 |
notmyname | not doing taxes today. worse: I had to do a sales call | 22:51 |
notmyname | acoles_away: ^ | 22:51 |
mattoliverau | notmyname: moring | 22:53 |
notmyname | what have I missed today? | 22:54 |
notmyname | clayg: are we happy today? or sad? | 22:56 |
mattoliverau | torgomatic is seg faulting everything so wants to go home, and clayg isn't swearing enough.. so he must be fine. | 22:58 |
*** gyee has quit IRC | 22:58 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: Don't let api version be a suggestion https://review.openstack.org/168509 | 23:01 |
clayg | a'ight gerrit what are you doing? | 23:01 |
notmyname | clayg: ah, good patch | 23:02 |
*** rdaly2 has quit IRC | 23:07 | |
clayg | well - maybe - half of the commit message is like "??? maybe ??? is this ???" | 23:10 |
clayg | glange: ^ you guys have lots of paths in logs | 23:10 |
clayg | ok - so gerrit is back up - we can get back to work | 23:11 |
notmyname | wow, did you see the ML thread "[openstack-dev] [swift] swift memory usage in centos7 devstack jobs"? | 23:12 |
clayg | yeah interesting | 23:13 |
clayg | I was thinking it was like that other issues we were tracking - but it was all RSS - so my first guess was libraies | 23:13 |
clayg | I honestly don't know how he did that output where he ... how did he phrase it "dug into the heap" | 23:14 |
* clayg is such a johnny at ops | 23:14 | |
openstackgerrit | Clay Gerrard proposed openstack/swift: Don't let api version be a suggestion https://review.openstack.org/168509 | 23:15 |
clayg | oops - forgot the bug # | 23:15 |
*** annegentle has quit IRC | 23:19 | |
notmyname | clayg: 400 seems to fit (and makes more sense IMO than 412) http://tools.ietf.org/html/rfc7231#section-6.5.1 | 23:26 |
clayg | so the two existing behaviors were 404 and 412 - and you choose 400? | 23:29 |
clayg | Yeah I can see that. | 23:29 |
notmyname | I'd prefer 412 over 404, but I was basing 400 on your distaste for 412 (and I think you're right according to the rfc) | 23:30 |
notmyname | 400 > 412 > 404 IMO | 23:30 |
*** cebruns has joined #openstack-swift | 23:33 | |
notmyname | clayg: I'm about to have to get on a call, but I was working on a patch for the version check | 23:42 |
notmyname | clayg: do you want a gist of my WIP diff or the thing I found? or would you rather I just push over it tonight? | 23:43 |
notmyname | clayg: gerrit comment left. gotta run | 23:47 |
*** lcurtis has quit IRC | 23:47 | |
*** Nadeem has quit IRC | 23:52 | |
clayg | oh you know what - meant to do lstrip - but i didn't think about vvvv1 | 23:58 |
clayg | i suppose we could just make a list of accepted version strings - if we think we can get in all the ones we care about | 23:59 |
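
The `lstrip` pitfall mentioned at 23:58 and the "list of accepted version strings" idea from 23:59 can be sketched in a few lines of Python. This is only an illustration, not the code from review 168509: the function names `strip_version_prefix` and `is_accepted_version` and the contents of `ACCEPTED_VERSIONS` are assumptions made up for the example, since the log does not say which version strings should be accepted.

```python
# Hypothetical whitelist; the log does not say which version strings Swift accepts.
ACCEPTED_VERSIONS = frozenset(["v1", "v1.0"])


def strip_version_prefix(segment):
    """Naive approach: drop leading 'v' characters from the path segment.

    str.lstrip removes *every* leading character found in its argument,
    so 'vvvv1' is reduced to '1' just like 'v1' is.
    """
    return segment.lstrip("v")


def is_accepted_version(segment):
    """Whitelist approach: accept only exact, known version strings."""
    return segment in ACCEPTED_VERSIONS


if __name__ == "__main__":
    for candidate in ("v1", "vvvv1", "v2"):
        print(candidate, repr(strip_version_prefix(candidate)),
              is_accepted_version(candidate))
```

With the naive strip, `vvvv1` and `v1` become indistinguishable, which is presumably the case clayg had not thought about. An exact-match whitelist rejects `vvvv1` outright, at the cost of having to enumerate every version string that should keep working.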