*** itlinux has joined #openstack-swift | 00:11 | |
*** zufar_ has joined #openstack-swift | 00:57 | |
zufar_ | hello all | 00:57 |
---|---|---|
*** threestrands has quit IRC | 00:59 | |
*** Jeffrey4l has left #openstack-swift | 01:26 | |
*** mrjk_ has joined #openstack-swift | 01:51 | |
*** lifeless_ has joined #openstack-swift | 01:55 | |
*** irclogbot_2 has quit IRC | 02:00 | |
*** lifeless has quit IRC | 02:00 | |
*** zufar_ has quit IRC | 02:00 | |
*** mrjk has quit IRC | 02:00 | |
*** irclogbot_2 has joined #openstack-swift | 02:04 | |
kota_ | hello | 02:04 |
kota_ | notmyname: FYI, I'll introduce recent Swift upstream and ProxyFS at an OpenStack Event in Japan, https://openstack-jp.connpass.com/event/113590/ | 02:06 |
kota_ | today, in my time. | 02:06 |
*** baojg has joined #openstack-swift | 02:10 | |
*** psachin has joined #openstack-swift | 02:46 | |
notmyname | kota_: that's great to hear. good luck! | 03:32 |
*** psachin has quit IRC | 05:23 | |
*** pcaruana has joined #openstack-swift | 07:25 | |
*** pcaruana has quit IRC | 07:55 | |
*** pcaruana has joined #openstack-swift | 07:55 | |
*** ccamacho has joined #openstack-swift | 08:13 | |
*** dosaboy has quit IRC | 08:34 | |
*** DHE has quit IRC | 08:35 | |
*** DHE has joined #openstack-swift | 08:47 | |
*** e0ne has joined #openstack-swift | 08:55 | |
*** dosaboy has joined #openstack-swift | 09:10 | |
*** mikecmpbll has joined #openstack-swift | 09:13 | |
*** dosaboy has quit IRC | 09:15 | |
*** dosaboy has joined #openstack-swift | 09:17 | |
*** mvkr has joined #openstack-swift | 09:41 | |
*** mikecmpbll has quit IRC | 09:41 | |
*** mikecmpbll has joined #openstack-swift | 09:43 | |
*** admin6 has joined #openstack-swift | 10:16 | |
*** admin6 has quit IRC | 11:29 | |
*** e0ne has quit IRC | 12:33 | |
*** e0ne has joined #openstack-swift | 12:39 | |
*** e0ne has quit IRC | 12:40 | |
*** e0ne has joined #openstack-swift | 12:40 | |
*** admin6 has joined #openstack-swift | 12:58 | |
*** admin6 has quit IRC | 15:48 | |
*** itlinux has quit IRC | 15:58 | |
*** pcaruana has quit IRC | 16:00 | |
cwright | Hi everyone, I'm struggling with a strange issue we have been seeing on one of our swift proxy nodes | 16:01 |
cwright | We deployed swift into 2 datacenters, Reston and London, each with a pair of swift proxies (sw-proxy01 and sw-proxy02) | 16:01 |
cwright | These servers are deployed/configured via configuration management, and are all identical except for IP address | 16:01 |
cwright | Each proxy runs memcached, and swift-proxy is configured to use both memcached servers in its datacenter | 16:01 |
cwright | The issue is that in our Reston environment the sw-proxy02 node constantly spits out errors about connecting to memcached | 16:01 |
cwright | "Timeout getting a connection to memcached" and "Error limiting server 10.10.10.180:11211" | 16:01 |
cwright | None of the other 3 proxy servers ever displays these issues. There are no networking rules/filters in place | 16:01 |
cwright | Is this something anyone here has seen before? | 16:01 |
cwright | Here's some sample log output: https://gist.github.com/corywright/eead89b552026c19cabcf71f906aa17c | 16:02 |
*** e0ne has quit IRC | 16:22 | |
*** itlinux has joined #openstack-swift | 17:02 | |
*** NM has joined #openstack-swift | 17:08 | |
*** mikecmpbll has quit IRC | 17:10 | |
*** itlinux has quit IRC | 17:28 | |
*** itlinux has joined #openstack-swift | 17:28 | |
*** itlinux has quit IRC | 17:29 | |
*** itlinux has joined #openstack-swift | 17:34 | |
*** mikecmpbll has joined #openstack-swift | 17:51 | |
zaitcev | cwright: You may think that there is no filtering in place, but I would start by testing with telnet, from the proxy of course, and maybe "su - swift", in case it's SELinux. | 18:03 |
timburke | good morning | 18:04 |
zaitcev | cwright: If it's not filtering, then I can divulge that I did have to change the default timeouts for real networks. | 18:04 |
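The telnet check zaitcev suggests can also be scripted. The following is a minimal sketch, not part of Swift: `can_connect` is a hypothetical helper that only verifies that a TCP connection opens within a given timeout, which is roughly what a successful `telnet host 11211` shows. Running it as the `swift` user from the affected proxy would mirror the "su - swift" suggestion.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds.

    Hypothetical helper for illustration; it does not speak the memcached
    protocol, it only checks that the TCP handshake completes.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `can_connect("10.10.10.180", 11211, timeout=0.3)` would test the server named in the error message with the same 0.3 second budget that notmyname mentions as the default connect timeout.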
notmyname | cwright: I would recommend using separate memcache pools for each region. yes, you'll not have quite as good cache hit rate, but on the other hand you'll only do region-local lookups for cache | 18:05 |
notmyname | zaitcev: adjusting the timeouts for "real" networks is a good point. should we adjust the defaults and/or the docs upstream? | 18:05 |
zaitcev | [filter:cache] | 18:07 |
zaitcev | connect_timeout = 1.2 | 18:07 |
zaitcev | I think default is 1 and bumping it juuuust this much was enough to resolve my timeouts. | 18:08 |
notmyname | default seems to be 0.3 | 18:08 |
zaitcev | oh | 18:08 |
zaitcev | Well, I basically just experimented until it worked. | 18:08 |
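As a rough illustration of the setting discussed above, the sketch below parses a `[filter:cache]` section and reports the connect timeout that would apply. The sample text reuses zaitcev's `connect_timeout = 1.2` and the server address from the error message; the 0.3 fallback is the default notmyname quotes. `effective_connect_timeout` is a hypothetical helper, not Swift's own config loader, and the `use` line is an assumption about how the section is typically declared.

```python
import configparser

# Sample proxy config fragment. The connect_timeout value comes from the
# discussion above; the `use` line is an assumption for illustration.
SAMPLE = """
[filter:cache]
use = egg:swift#memcache
memcache_servers = 10.10.10.180:11211
connect_timeout = 1.2
"""

# Default as reported in the channel ("default seems to be 0.3").
DEFAULT_CONNECT_TIMEOUT = 0.3


def effective_connect_timeout(conf_text):
    """Return connect_timeout from [filter:cache], or the default if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getfloat(
        "filter:cache", "connect_timeout", fallback=DEFAULT_CONNECT_TIMEOUT
    )


print(effective_connect_timeout(SAMPLE))
```

A check like this only confirms what the file says; whether a larger value actually helps still has to be found by experiment, as zaitcev describes.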
*** openstackgerrit has joined #openstack-swift | 18:10 | |
openstackgerrit | Tim Burke proposed openstack/swift master: misc test cleanup https://review.openstack.org/631077 | 18:10 |
cwright | hi, thanks zaitcev and notmyname. maybe I didn't describe it properly, but we are using separate memcache pools for each region | 18:12 |
cwright | zaitcev: of course it's possible that something is interfering, but i've tried all the telnet/network checks, and our networking team has looked too. | 18:13 |
cwright | it is even reporting these errors when trying to connect to its own public ip | 18:13 |
cwright | i spent some time digging through the source code and searching old bugs and it almost looks like a memcache connection pool issue to me | 18:13 |
*** ccamacho has quit IRC | 18:14 | |
cwright | i will try the connect_timeout adjustment now and see if that helps, thanks zaitcev | 18:14 |
notmyname | I'm going to reset the counter on the wall if this is true (seriously, we have one), but have you checked MTU settings? `ping -M do -s 8972 [destinationIP]` | 18:16 |
zaitcev | cwright: I'm still unclear on the situation on your end: does every connection fail, or do only some? All you said was: "constantly spits out". | 18:17 |
notmyname | use 1472 to check for the standard 1500 MTU. 8972 checks for jumbo frames (9000 MTU) | 18:18 |
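The two payload sizes notmyname gives follow from simple arithmetic: the `-s` value is the MTU minus the 20-byte IPv4 header (assuming no IP options) and the 8-byte ICMP echo header. A short sketch, with `ping_payload_for_mtu` as a hypothetical helper name:

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def ping_payload_for_mtu(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


for mtu in (1500, 9000):
    print(mtu, ping_payload_for_mtu(mtu))
```

With `-M do` forbidding fragmentation, a ping with that payload succeeds only if every hop on the path carries the full MTU.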
cwright | zaitcev: i'm not certain, these servers are still getting relatively little traffic, but after a recent restart of swift-proxy the errors began immediately. | 18:18 |
cwright | notmyname: let me check | 18:19 |
cwright | notmyname: i think you've clued me in on something... | 18:21 |
notmyname | oh no! do I have to reset the counter? | 18:22 |
cwright | maybe :) I will confirm for you in a bit | 18:22 |
notmyname | so far the record is 15 days. it's normally no more than 3. (the 15 day record was because of the winter holidays!) http://d.not.mn/Image.jpeg | 18:24 |
cwright | notmyname: thanks a ton. i can't begin to say how much i appreciate the help and wisdom of everyone here. | 18:28 |
notmyname | was it MTUs? | 18:28 |
cwright | yeah, it's going to be. we haven't adjusted yet but the pings are telling | 18:29 |
cwright | go ahead and reset your counter :) | 18:30 |
notmyname | http://d.not.mn/Image_2.jpeg | 18:31 |
notmyname | whomp whomp | 18:31 |
notmyname | the swiftstack support team (that I sit right next to) are sad now | 18:32 |
*** e0ne has joined #openstack-swift | 19:29 | |
*** NM has quit IRC | 20:01 | |
clayg | lo | 20:02 |
clayg | l | 20:02 |
cwright | notmyname: well, sorry to say but that didn't fix it. | 20:18 |
cwright | updated the mtu, rebooted, a few mins later the same errors are showing up | 20:20 |
cwright | i will try the connect_timeout now | 20:21 |
*** NM has joined #openstack-swift | 20:32 | |
*** NM has quit IRC | 20:33 | |
*** e0ne has quit IRC | 20:55 | |
*** itlinux has quit IRC | 21:49 | |
*** jistr has quit IRC | 22:32 | |
*** jistr has joined #openstack-swift | 22:33 | |
*** jistr has quit IRC | 22:49 | |
*** jistr has joined #openstack-swift | 22:50 |