*** fyang has joined #openstack-dev | 00:18 | |
*** jdurgin has quit IRC | 00:41 | |
*** namaqua has joined #openstack-dev | 01:21 | |
*** HugoKuo has joined #openstack-dev | 01:26 | |
*** cloudgroups has joined #openstack-dev | 01:30 | |
*** namaqua has quit IRC | 01:42 | |
*** mattray has quit IRC | 01:45 | |
*** lorin1 has joined #openstack-dev | 01:51 | |
*** yamahata_ has joined #openstack-dev | 01:54 | |
*** yamahata_ has joined #openstack-dev | 01:55 | |
*** BK_man has joined #openstack-dev | 01:59 | |
BK_man | hi all | 01:59 |
---|---|---|
*** yamahata__ has joined #openstack-dev | 02:00 | |
BK_man | any scalability considerations on Nova deployments? I mean, how many VMs is it able to handle in the Cactus release? | 02:01 |
*** BK_man has quit IRC | 02:14 | |
*** BK_man has joined #openstack-dev | 02:16 | |
*** BK_man has quit IRC | 02:18 | |
*** BK_man has joined #openstack-dev | 02:36 | |
lorin1 | BK_man: I think the largest Nova deployment in use is about 300 nodes. I'm not sure how many VMs per node they use. | 02:50 |
BK_man | lorin1: are you sure about the # of nodes? Is it on Cactus? | 02:51 |
lorin1 | BK_man: I'm not sure about the exact number, it's the DOE Magellan project: http://magellan.alcf.anl.gov/. I don't know if they are running Cactus or not. | 02:52 |
BK_man | lorin1: thanks! | 02:52 |
*** lorin1 has quit IRC | 02:53 | |
*** cloudgroups has left #openstack-dev | 03:17 | |
*** fyang has quit IRC | 03:27 | |
*** BK_man has quit IRC | 03:42 | |
*** yamahata__ has quit IRC | 04:02 | |
*** BK_man has joined #openstack-dev | 04:08 | |
*** mattray has joined #openstack-dev | 04:14 | |
*** fyang has joined #openstack-dev | 04:31 | |
*** agarwalla_ has joined #openstack-dev | 05:21 | |
*** agarwalla has joined #openstack-dev | 05:29 | |
*** agarwalla_ has quit IRC | 05:29 | |
*** mattray has quit IRC | 05:34 | |
*** mattray has joined #openstack-dev | 05:36 | |
*** mattray has quit IRC | 05:38 | |
*** kanaka has joined #openstack-dev | 05:49 | |
*** jamesurquhart has joined #openstack-dev | 05:56 | |
*** jamesurquhart has left #openstack-dev | 05:56 | |
*** fyang has quit IRC | 06:05 | |
*** Binbin has joined #openstack-dev | 06:52 | |
*** yamahata__ has joined #openstack-dev | 07:04 | |
comstud | soren awake yet? | 08:30 |
comstud | soren: see comments and patch i added to 771512. curious if this is what you did, or if it's completely different. | 08:31 |
comstud | bed time for me, back in 7 hrs or so | 08:36 |
* comstud & | 08:36 | |
*** Eyk has joined #openstack-dev | 09:41 | |
soren | comstud: It's quite different, actually. | 10:14 |
soren | comstud: My approach involved catching the timeout exception in the trampoline and then I was expecting to find a way to check if the connection had been established. If not, I'd just re-raise the exception, otherwise I'd just pass on the exception and move on. | 10:18 |
*** Eyk has quit IRC | 11:07 | |
*** markvoelker has joined #openstack-dev | 11:20 | |
*** hazmat has joined #openstack-dev | 12:28 | |
soren | comstud: Hm... I have the smallest imaginable patch. It seems to actually solve the problem. I don't completely understand *why* yet, but there you go. | 12:37 |
*** mattray has joined #openstack-dev | 12:38 | |
soren | comstud: It's so small and easy that I question my test case. | 12:38 |
soren | comstud: http://paste.ubuntu.com/606131/ | 12:39 |
*** adiantum has joined #openstack-dev | 12:51 | |
*** cloudgroups has joined #openstack-dev | 13:21 | |
*** westmaas has quit IRC | 13:22 | |
*** westmaas has joined #openstack-dev | 13:22 | |
*** statik has joined #openstack-dev | 13:25 | |
*** throughnothing has joined #openstack-dev | 13:28 | |
*** throughnothing has quit IRC | 13:52 | |
*** cloudgroups has left #openstack-dev | 13:57 | |
*** Eyk has joined #openstack-dev | 13:57 | |
*** mattray has quit IRC | 14:02 | |
soren | comstud: Of course that won't be how it gets upstreamed.. | 14:05 |
*** throughnothing has joined #openstack-dev | 14:06 | |
*** throughnothing has quit IRC | 14:07 | |
*** throughnothing has joined #openstack-dev | 14:07 | |
*** jkoelker has joined #openstack-dev | 14:40 | |
*** mattray has joined #openstack-dev | 14:42 | |
*** mattray has quit IRC | 14:50 | |
*** clayg has joined #openstack-dev | 14:54 | |
*** dweimer has joined #openstack-dev | 14:58 | |
comstud | soren: that does seem to work | 15:10 |
comstud | it passes my test case, anyway | 15:11 |
*** lorin1 has joined #openstack-dev | 15:11 | |
comstud | which includes a real timeout | 15:11 |
soren | comstud: It does, doesn't it? I was surprised. | 15:12 |
soren | comstud: I was fully expecting to have to detect whether the connection was in good shape, and if not, reraise the exception. | 15:12 |
soren | comstud: Yeah, mine too | 15:12 |
comstud | that's a lot more simple than my fix | 15:12 |
comstud | :) | 15:12 |
soren | Ever so slightly. | 15:12 |
soren | :) | 15:12 |
*** lorin1 has quit IRC | 15:13 | |
*** lorin1 has joined #openstack-dev | 15:13 | |
comstud | i have to admit that i don't even understand how your fix works | 15:13 |
soren | Oh. | 15:14 |
comstud | eventlet internals are pretty foreign to me | 15:14 |
*** lorin1 has left #openstack-dev | 15:14 | |
*** lorin1 has joined #openstack-dev | 15:14 | |
comstud | i dug into it for quite a while, but | 15:14 |
*** lorin1 has left #openstack-dev | 15:14 | |
*** rnirmal has joined #openstack-dev | 15:14 | |
soren | So the trick is that when the timeout fires, the "current.throw" thing is a greenlet trick to force a thread to throw an exception (regardless of what it's doing right now). | 15:14 |
comstud | yep, caught that. | 15:15 |
soren | ...so I catch that, and switch to this greenlet to let it try and do something to the socket. | 15:15 |
soren | "something" being "whatever it was intending to do if everything had been fine". | 15:16 |
soren | ...but yeah, I don't completely understand it either. | 15:16 |
comstud | i'm curious how the 'real timeout' is still working. | 15:16 |
*** hub_cap has joined #openstack-dev | 15:16 | |
soren | I was *just* typing the same thing. | 15:16 |
comstud | that's the part that is confusing me | 15:16 |
soren | It's weird. | 15:16 |
comstud | that's one word for it | 15:16 |
soren | Yeah. I don't get that at all. | 15:16 |
comstud | :) | 15:16 |
soren | That's why I'm questioning my test cases. | 15:17 |
soren | I feel like it shouldn't work. | 15:17 |
comstud | yeah | 15:17 |
soren | You know, seeing as it's a bit pointless to throw the exception only to unconditionally catch it. | 15:18 |
comstud | i might add some debug for this, just because i'm generally curious | 15:18 |
soren | ...so they must have added it for some reason. | 15:18 |
comstud | haha yea | 15:18 |
soren | I guess this would be functionally equivalent: http://paste.ubuntu.com/606176/ | 15:20 |
*** cp16net has joined #openstack-dev | 15:20 | |
soren | If not, I have *no* idea why the other thing works. | 15:20 |
comstud | yeah | 15:21 |
soren | Yup, it acts the same. | 15:21 |
comstud | yep | 15:21 |
comstud | just tested too | 15:21 |
comstud | uh, so where the hell is the timeout coming from | 15:22 |
soren | I just found that out. | 15:22 |
soren | It happens in greenio in the connect method. | 15:22 |
soren | It has a loop where it calls into the trampoline. | 15:22 |
soren | Instead of the trampoline causing the exception to be thrown, it just returns, and the next time it goes through the loop, it sees that the timeout has passed and then it throws the exception. | 15:23 |
*** dragondm has joined #openstack-dev | 15:23 | |
comstud | ahh, yeah | 15:23 |
comstud | i see now | 15:23 |
comstud | recv() doesn't have that same check | 15:24 |
comstud | so it looks like this would work for connect() | 15:24 |
comstud | but not recv() | 15:24 |
soren | Ah. | 15:24 |
soren | Ok, so maybe it should? | 15:24 |
comstud | perhaps so | 15:25 |
soren | Hm... Making a test case for recv is harder. | 15:25 |
comstud | although | 15:26 |
comstud | ok. | 15:27 |
soren | :) | 15:27 |
comstud | i don't see any harm caused by adding there too | 15:29 |
comstud | i think i can see where this fix fails, though | 15:30 |
soren | great! | 15:30 |
comstud | if you happen to make it into trampoline for the socket operation | 15:30 |
comstud | and then there's a timeout | 15:30 |
comstud | ie, our sleep kinda masks that | 15:30 |
comstud | case | 15:30 |
comstud | my connect timed out case is last in my tests | 15:31 |
comstud | i bet if i put it first, it'll not time out correctly | 15:31 |
* comstud checks | 15:31 | |
*** adiantum has quit IRC | 15:31 | |
comstud | hm | 15:32 |
soren | oh, btw.. | 15:33 |
comstud | ok, somehow it's still working. | 15:33 |
soren | So there's a case we might not be fixing. | 15:33 |
soren | So, there's our specific case where there's a lot of small requests. | 15:35 |
*** rnirmal_ has joined #openstack-dev | 15:35 | |
soren | in between each, eventlet has a chance to go and handle these timeouts and do something useful. | 15:35 |
*** cp16net has quit IRC | 15:36 | |
soren | ...but there's a different case where something is blocking for so long that it alone causes the timeout to fire. So a single, long, blocking operation. | 15:36 |
*** rnirmal has quit IRC | 15:36 | |
*** rnirmal_ is now known as rnirmal | 15:36 | |
comstud | yea | 15:36 |
*** hub_cap has quit IRC | 15:36 | |
soren | ...where eventlet of course has no chance to do anything for the duration. | 15:36 |
soren | ...and I'm not sure that case is covered by at least my approach. It may be covered by yours (which I honestly haven't completely grokked yet). | 15:37 |
*** hub_cap has joined #openstack-dev | 15:37 | |
soren | That's what I was referring to yesterday when I said I had a fix for our case, but I didn't think it completely solved the problem. | 15:37 |
comstud | i'm pretty positive my fix covers all cases | 15:37 |
comstud | but maybe not | 15:37 |
soren | Ok. | 15:37 |
comstud | because essentially...i'm waiting for all poll()s for I/O to get a chance to complete before firing timers | 15:38 |
soren | It seems to me that an approach that catches the timeout, does *something* to check if the socket actually timed out, and then reraises if it did, would also solve it completely | 15:39 |
soren | ...and would not require per-hub work. | 15:39 |
soren | Oh. | 15:39 |
comstud | yeah, I started looking at it and my brain went to mush | 15:39 |
soren | I can relate. | 15:39 |
comstud | there's not a good way, i think, to tell if things like 'recv()' time out | 15:40 |
soren | ..but that was my starting point for a fix. Ok, I think I've filled you in on all the thoughts I've had on the subject. | 15:40 |
comstud | because.. | 15:40 |
comstud | you're just waiting to receive data in a period of time | 15:40 |
comstud | you can't check that the socket is disconnected or anything | 15:40 |
comstud | because it can still be connected | 15:40 |
comstud | in that case | 15:40 |
comstud | you just haven't received data yet | 15:40 |
soren | Something like recording when the last actual recv call was attempted. | 15:40 |
*** dprince has joined #openstack-dev | 15:40 | |
soren | If one hasn't been attempted since the timeout occurred, try one last time. | 15:41 |
soren | Or something. | 15:41 |
comstud | also something stupid could be done to check the time elapsed in the poll() callbacks when I/O is ready... | 15:41 |
comstud | and loop through all timers and increment them by that amount | 15:41 |
comstud | heh. | 15:41 |
*** cp16net has joined #openstack-dev | 15:41 | |
soren | *shudder* | 15:41 |
soren | :) | 15:41 |
comstud | yeah exactly.. | 15:41 |
soren | I sort of got the impression that the eventlet guy doesn't really consider this a bug. | 15:42 |
comstud | same here | 15:42 |
soren | ..which is a bit concerning. | 15:42 |
comstud | this is a real bug, but at the same time, I think he does have a point. | 15:42 |
soren | I had written up a response, but I think your response was great, so i didn't bother. | 15:42 |
comstud | cools | 15:43 |
comstud | thnx:) | 15:43 |
soren | For sure. There's no doubt that the fact that we see this is a symptom that we're doing something wrong... | 15:43 |
soren | ...but if I understand the contract with Eventlet correctly, this bug is real. It shouldn't happen no matter how stupid my code is. | 15:43 |
comstud | yeah, that's my thought | 15:43 |
comstud | i think we're going to have to make some other changes on our end.. | 15:43 |
soren | Certainly. | 15:44 |
comstud | i wanted to try out my patch... | 15:44 |
comstud | and then figure out where our performance issue is | 15:44 |
comstud | and tackle that | 15:44 |
comstud | first.. see how much fixing eventlet does for us | 15:44 |
soren | The way I see it, our problem is one of optimisation. | 15:44 |
comstud | but ya, there's a real problem if we're blocking that much | 15:44 |
soren | Not so much a bug. | 15:44 |
comstud | I/O loops like these are not designed to work with too much blocking | 15:45 |
comstud | we might need to throw some crap into a queue for worker threads | 15:45 |
comstud | or something like that | 15:45 |
comstud | i don't know enough yet about nova to say for sure, however. | 15:45 |
comstud | i just really started working on nova code again.. since some initial work in August | 15:45 |
comstud | :) | 15:45 |
comstud | soren: agree. | 15:45 |
*** adiantum has joined #openstack-dev | 15:46 | |
comstud | in any case, i feel like i can break your patch... i just need to come up with the right test case. | 15:47 |
*** jwilmes has joined #openstack-dev | 15:47 | |
comstud | if i can get past that initial timeout check in connect() and get into trampoline() where trampoline() is waiting on the poll to complete... | 15:47 |
comstud | the timeout should not be returned | 15:47 |
comstud | when it should be | 15:47 |
*** fyang has joined #openstack-dev | 15:48 | |
comstud | i gotta hop into the shower.. gotta mtg in 10 | 15:48 |
*** mattray has joined #openstack-dev | 15:52 | |
*** fyang has quit IRC | 15:56 | |
*** lorin1 has joined #openstack-dev | 16:17 | |
*** cp16net has quit IRC | 16:23 | |
*** hub_cap has quit IRC | 16:24 | |
*** hub_cap has joined #openstack-dev | 16:24 | |
*** adiantum has quit IRC | 16:37 | |
*** cp16net has joined #openstack-dev | 16:41 | |
*** rnirmal has quit IRC | 16:43 | |
*** rnirmal has joined #openstack-dev | 16:43 | |
*** elasticdog has joined #openstack-dev | 16:45 | |
*** jwilmes has quit IRC | 16:56 | |
*** Vek has quit IRC | 16:57 | |
*** jdurgin has joined #openstack-dev | 16:59 | |
*** bcwaldon has joined #openstack-dev | 16:59 | |
vishy | bcwaldon: ping | 17:02 |
*** mattray has quit IRC | 17:02 | |
bcwaldon | vishy: 10 min, plz | 17:03 |
vishy | bcwaldon: np | 17:03 |
*** mattray has joined #openstack-dev | 17:09 | |
jaypipes | dprince, _0x44: would really appreciate a review on https://code.launchpad.net/~jaypipes/glance/bug713154/+merge/59110 if you have a few minutes. thx in advance! | 17:14 |
_0x44 | jaypipes: looking | 17:15 |
jaypipes | cheers | 17:15 |
_0x44 | jaypipes: I have a branch someplace that refactored out the parsing for swift/s3 into somewhere common | 17:16 |
_0x44 | I think it's stale, because it was before the huge glance changes in Cactus | 17:16 |
*** BK_man has quit IRC | 17:20 | |
*** dragondm has quit IRC | 17:23 | |
comstud | ok, mtgs done | 17:26 |
*** cp16net has quit IRC | 17:29 | |
*** dragondm has joined #openstack-dev | 17:31 | |
*** BK_man has joined #openstack-dev | 17:34 | |
dprince | jaypipes: on it | 17:47 |
*** xtoddx has joined #openstack-dev | 17:59 | |
*** dovetaildan has joined #openstack-dev | 18:02 | |
*** _cerberus_ has joined #openstack-dev | 18:27 | |
*** ChanServ sets mode: +v _cerberus_ | 18:27 | |
*** nhm has joined #openstack-dev | 18:29 | |
dprince | jaypipes: Just updated the glance s3 merge prop. | 18:40 |
jaypipes | dprince: thx dan! | 18:40 |
dprince | jaypipes: Couple of minor things (maybe just questions). | 18:40 |
dprince | jaypipes: Did you actually try this? Functionally? | 18:40 |
*** cp16net has joined #openstack-dev | 18:40 | |
dprince | jaypipes: I mean against say Amazon's S3 service? | 18:41 |
jaypipes | dprince: no, against nova-objectstore... | 18:41 |
jaypipes | dprince: might have picked up some issues :( | 18:42 |
*** blamar has joined #openstack-dev | 18:43 | |
jaypipes | dprince: it hasn't been a top priority, to be honest :) | 18:44 |
dprince | jaypipes: trying it now.... | 18:47 |
jaypipes | dprince: re: your question #2, there is a reason for that, yes. | 18:47 |
dprince | jaypipes: in a VPC. | 18:47 |
jaypipes | dprince: the reason is because in one case we want to check existence and error on it, in the other, we don't want to check for existence... | 18:48 |
dprince | jaypipes: Okay. I see. That makes sense. Give me a minute to try this against S3. I just want to see it work. | 18:49 |
jaypipes | dprince: yeah, me too :) working with diego as well on that front (testing on real S3.) | 18:50 |
dprince | jaypipes: Okay. Found something. | 18:51 |
dprince | jaypipes: line 201 | 18:52 |
dprince | jaypipes: if key.exists(): | 18:52 |
dprince | jaypipes: needs to be something like: | 18:52 |
dprince | if key and key.exists(): | 18:52 |
dprince | jaypipes: essentially key can be null. | 18:52 |
jaypipes | hmm... | 18:52 |
dprince | After making that change I still get this error: | 18:53 |
dprince | Error uploading image: 'Input' object has no attribute 'seek' | 18:53 |
dprince | jaypipes: Sorry. I forgot to mention I'm just running a quick glance add command to upload a ramdisk into S3. | 18:53 |
jaypipes | dprince: farts. | 18:54 |
jaypipes | dprince: OK, please make a note on the MP. Seems the boto S3 API isn't as specific as it needs to be... | 18:54 |
jaypipes | dprince: just mentions the input needs to be a file-like object, which I interpreted as needing a read attribute... | 18:55 |
dprince | jaypipes: Sure. I'll make some notes. | 18:56 |
jaypipes | dprince: appreciated. trying to wrap up the version stuff first... | 18:57 |
jaypipes | dprince: which will cause merge conflicts with the S3 stuff anyway ;) | 18:57 |
*** Vek has joined #openstack-dev | 18:57 | |
dprince | jaypipes: Cool. I agree in doing that first then. So hey. Did my packager branch look good for the versioned API stuff? Or did you have more drastic changes in mind for the glance packager stuff? (I linked it in the merge prop) | 18:59 |
dprince | jaypipes: feel free to ninja any/all/or none of it if we want to make another branch. | 18:59 |
jaypipes | dprince: yeah, saw that. I didn't see any problems. | 18:59 |
dprince | jaypipes: We just need to coordinate the changes. Shall I merge prop it and then Soren can take it from there? | 19:00 |
jaypipes | dprince: not yet. :) | 19:00 |
jaypipes | dprince: changing from v1.0 to v1. | 19:01 |
dprince | jaypipes: I could go either way on that. | 19:02 |
dprince | jaypipes: doesn't really matter I guess but I actually prefer the v1.0 for consistency. Sorry. I neglected to reply to the ML thread. | 19:03 |
jaypipes | dprince: actually, rackspace and others are moving towards v1 and not using v1.0 in URIs, which is why I reconsidered. :) | 19:06 |
bcwaldon | :) | 19:07 |
dprince | dprince: Sure. good to be ahead of the game then. Kind of a preference thing anyway. Not going to make everyone happy. | 19:08 |
*** hub_cap_ has joined #openstack-dev | 19:10 | |
*** hub_cap has quit IRC | 19:10 | |
*** hub_cap_ is now known as hub_cap | 19:10 | |
*** HugoKuo has quit IRC | 19:17 | |
*** hub_cap has quit IRC | 19:35 | |
*** mattray has quit IRC | 19:46 | |
*** mattray has joined #openstack-dev | 19:51 | |
*** agarwalla has quit IRC | 19:58 | |
*** ironcamel has joined #openstack-dev | 20:08 | |
*** cp16net has quit IRC | 20:16 | |
*** dprince has quit IRC | 20:28 | |
*** lorin1 has quit IRC | 21:02 | |
*** bcwaldon has quit IRC | 21:02 | |
*** User665 has joined #openstack-dev | 21:08 | |
*** User665 has quit IRC | 21:09 | |
*** mattray has quit IRC | 21:18 | |
*** bcwaldon has joined #openstack-dev | 21:37 | |
*** mattray has joined #openstack-dev | 21:52 | |
*** markvoelker has quit IRC | 21:58 | |
*** cp16net has joined #openstack-dev | 21:58 | |
*** bcwaldon has quit IRC | 22:07 | |
*** jkoelker has quit IRC | 22:23 | |
*** rnirmal has quit IRC | 22:35 | |
*** cp16net has quit IRC | 23:01 |
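
Editor's sketch of the connect() behaviour soren and comstud work out between 15:22 and 15:24: the loop re-checks its own deadline on every pass, so a timeout is still reported even when the wait (the "trampoline") returns quietly instead of raising. This is not eventlet's code. `connect_with_deadline`, `try_connect`, `wait_for_io` and `ConnectTimeout` are hypothetical names used only to illustrate the control flow described in the log.

```python
import time


class ConnectTimeout(Exception):
    """Raised by this sketch when the deadline passes before connecting."""


def connect_with_deadline(try_connect, wait_for_io, timeout, clock=time.monotonic):
    """Retry try_connect() until it succeeds or `timeout` seconds elapse.

    try_connect: hypothetical callable that returns True once connected.
    wait_for_io: hypothetical stand-in for the trampoline; it may return
                 early without raising, even if its own timer fired.
    """
    deadline = clock() + timeout
    while True:
        # Always attempt the operation first, so a connection that became
        # ready while we were busy elsewhere wins over a stale timer.
        if try_connect():
            return True
        # The loop, not the wait, enforces the deadline.
        if clock() >= deadline:
            raise ConnectTimeout("timed out")
        wait_for_io(deadline - clock())
```

As comstud notes at 15:24, a recv() path without an equivalent deadline check in its loop would not get this safety net, which is why the same trick does not carry over automatically.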
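
Editor's sketch of the two issues dprince reports between 18:52 and 18:55: guarding against a missing key before calling `exists()`, and checking that an upload source offers `seek` as well as `read`. This is neither Glance nor boto code. `FakeKey`, `key_already_exists` and `is_seekable_file_like` are hypothetical names, and the only behaviour assumed is what the log states (the key can be null, and the input lacked a `seek` attribute).

```python
import io


class FakeKey:
    """Hypothetical stand-in for a key object that has an exists() method."""

    def __init__(self, present):
        self._present = present

    def exists(self):
        return self._present


def key_already_exists(key):
    # The lookup may hand back None, so test the key itself before
    # calling exists() on it (the `if key and key.exists():` suggestion).
    return bool(key and key.exists())


def is_seekable_file_like(obj):
    # A read() method alone was not enough in the failing upload;
    # the object was also expected to provide seek().
    return callable(getattr(obj, "read", None)) and callable(getattr(obj, "seek", None))


class ReadOnlyInput:
    """Hypothetical input wrapper that offers read() but no seek()."""

    def read(self, size=-1):
        return b""
```

For instance, `is_seekable_file_like(io.BytesIO(b"data"))` is true, while `is_seekable_file_like(ReadOnlyInput())` is false, which matches the shape of the "no attribute 'seek'" error in the log.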