Wednesday, 2011-05-11

*** fyang has joined #openstack-dev00:18
*** jdurgin has quit IRC00:41
*** namaqua has joined #openstack-dev01:21
*** HugoKuo has joined #openstack-dev01:26
*** cloudgroups has joined #openstack-dev01:30
*** namaqua has quit IRC01:42
*** mattray has quit IRC01:45
*** lorin1 has joined #openstack-dev01:51
*** yamahata_ has joined #openstack-dev01:54
*** yamahata_ has joined #openstack-dev01:55
*** BK_man has joined #openstack-dev01:59
BK_manhi all01:59
*** yamahata__ has joined #openstack-dev02:00
BK_manany scalability considerations on Nova deployments? I mean, how many VMs is it able to handle in the Cactus release?02:01
*** BK_man has quit IRC02:14
*** BK_man has joined #openstack-dev02:16
*** BK_man has quit IRC02:18
*** BK_man has joined #openstack-dev02:36
lorin1BK_man I think the largest Nova deployment in use is about 300 nodes. I'm not sure how many VMs per node they use.02:50
BK_manlorin1: are you sure about # of nodes? Is it on Cactus?02:51
lorin1BK_man: I'm not sure about the exact number, it's the DOE Magellan project: http://magellan.alcf.anl.gov/. I don't know if they are running Cactus or not.02:52
BK_manlorin1: thanks!02:52
*** lorin1 has quit IRC02:53
*** cloudgroups has left #openstack-dev03:17
*** fyang has quit IRC03:27
*** BK_man has quit IRC03:42
*** yamahata__ has quit IRC04:02
*** BK_man has joined #openstack-dev04:08
*** mattray has joined #openstack-dev04:14
*** fyang has joined #openstack-dev04:31
*** agarwalla_ has joined #openstack-dev05:21
*** agarwalla has joined #openstack-dev05:29
*** agarwalla_ has quit IRC05:29
*** mattray has quit IRC05:34
*** mattray has joined #openstack-dev05:36
*** mattray has quit IRC05:38
*** kanaka has joined #openstack-dev05:49
*** jamesurquhart has joined #openstack-dev05:56
*** jamesurquhart has left #openstack-dev05:56
*** fyang has quit IRC06:05
*** Binbin has joined #openstack-dev06:52
*** yamahata__ has joined #openstack-dev07:04
comstudsoren awake yet?08:30
comstudsoren: see comments and patch i added to 771512.  curious if this is what you did, or if it's completely different.08:31
comstudbed time for me, back in 7 hrs or so08:36
* comstud &08:36
*** Eyk has joined #openstack-dev09:41
sorencomstud: It's quite different, actually.10:14
sorencomstud: My approach involved catching the timeout exception in the trampoline and then I was expecting to find a way to check if the connection had been established. If not, I'd just re-raise the exception, otherwise I'd just pass on the exception and move on.10:18
*** Eyk has quit IRC11:07
*** markvoelker has joined #openstack-dev11:20
*** hazmat has joined #openstack-dev12:28
sorencomstud: Hm... I have the smallest imaginable patch. It seems to actually solve the problem. I don't completely understand *why* yet, but there you go.12:37
*** mattray has joined #openstack-dev12:38
sorencomstud: It's so small and easy that I question my test case.12:38
sorencomstud: http://paste.ubuntu.com/606131/12:39
*** adiantum has joined #openstack-dev12:51
*** cloudgroups has joined #openstack-dev13:21
*** westmaas has quit IRC13:22
*** westmaas has joined #openstack-dev13:22
*** statik has joined #openstack-dev13:25
*** throughnothing has joined #openstack-dev13:28
*** throughnothing has quit IRC13:52
*** cloudgroups has left #openstack-dev13:57
*** Eyk has joined #openstack-dev13:57
*** mattray has quit IRC14:02
sorencomstud: Of course that won't be how it gets upstreamed..14:05
*** throughnothing has joined #openstack-dev14:06
*** throughnothing has quit IRC14:07
*** throughnothing has joined #openstack-dev14:07
*** jkoelker has joined #openstack-dev14:40
*** mattray has joined #openstack-dev14:42
*** mattray has quit IRC14:50
*** clayg has joined #openstack-dev14:54
*** dweimer has joined #openstack-dev14:58
comstudsoren: that does seem to work15:10
comstudit passes my test case, anyway15:11
*** lorin1 has joined #openstack-dev15:11
comstudwhich includes a real timeout15:11
sorencomstud: It does, doesn't it? I was surprised.15:12
sorencomstud: I was fully expecting to have to detect whether the connection was in good shape, and if not, reraise the exception.15:12
sorencomstud: Yeah, mine too15:12
comstudthat's a lot more simple than my fix15:12
comstud:)15:12
sorenEver so slightly.15:12
soren:)15:12
*** lorin1 has quit IRC15:13
*** lorin1 has joined #openstack-dev15:13
comstudi have to admit that i don't even understand how your fix works15:13
sorenOh.15:14
comstudeventlet internals is pretty foreign to me15:14
*** lorin1 has left #openstack-dev15:14
*** lorin1 has joined #openstack-dev15:14
comstudi dug into it for quite a while, but15:14
*** lorin1 has left #openstack-dev15:14
*** rnirmal has joined #openstack-dev15:14
sorenSo the trick is that when the timeout fires, the "current.throw" thing is a greenlet trick to force a thread to throw an exception (regardless of what it's doing right now).15:14
comstudyep, caught that.15:15
soren...so I catch that, and switch to this greenlet to let it try and do something to the socket.15:15
soren"something" being "whatever it was intending to do if everything had been fine".15:16
soren...but yeah, I don't completely understand it either.15:16
comstudi'm curious how the 'real timeout' is still working.15:16
*** hub_cap has joined #openstack-dev15:16
sorenI was *just* typing the same thing.15:16
comstudthat's the part that is confusing me15:16
sorenIt's weird.15:16
comstudthat's one word for it15:16
sorenYeah. I don't get that at all.15:16
comstud:)15:16
sorenThat's why I'm questioning my test cases.15:17
sorenI feel like it shouldn't work.15:17
comstudyeah15:17
sorenYou know, seeing as it's a bit pointless to throw the exception only to unconditionally catch it.15:18
comstudi might add some debug for this, just because i'm generally curious15:18
soren...so they must have added it for some reason.15:18
comstudhaha yea15:18
sorenI guess this would be functionally equivalent: http://paste.ubuntu.com/606176/15:20
*** cp16net has joined #openstack-dev15:20
sorenIf not, I have *no* idea why the other thing works.15:20
comstudyeah15:21
sorenYup, it acts the same.15:21
comstudyep15:21
comstudjust tested too15:21
comstuduh, so where the hell is the timeout coming from15:22
sorenI just found that out.15:22
sorenIt happens in greenio in the connect method.15:22
sorenIt has a loop where it calls into the trampoline.15:22
sorenInstead of the trampoline causing the exception to be thrown, it just returns, and the next time it goes through the loop, it sees that the timeout has passed and then it throws the exception.15:23
*** dragondm has joined #openstack-dev15:23
comstudahh, yeah15:23
comstudi see now15:23
comstudrecv() doesn't have that same check15:24
comstudso it looks like this would work for connect()15:24
comstudbut not recv()15:24
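
A minimal, eventlet-free sketch of the behaviour described above. It is not eventlet's actual code: `try_connect`, `try_recv` and `wait_until_ready` are hypothetical callables standing in for the non-blocking socket call and the trampoline. It assumes the wait returns quietly when the timer fires instead of raising, so only a loop that rechecks its own deadline still reports the timeout.

```python
import socket
import time


def connect_with_deadline(try_connect, wait_until_ready, timeout):
    """Retry try_connect() until it succeeds or the deadline passes.

    try_connect and wait_until_ready are hypothetical stand-ins for the
    non-blocking connect attempt and the trampoline.
    """
    deadline = time.monotonic() + timeout
    while True:
        if try_connect():
            return True
        # The deadline is rechecked on every pass through the loop, so a
        # wait that returns silently still ends in a timeout here.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout("timed out")
        wait_until_ready(remaining)


def recv_without_check(try_recv, wait_until_ready, timeout):
    """Same shape but with no deadline check: a swallowed timeout is lost."""
    data = try_recv()
    if data is None:
        # May return without data and without raising.
        wait_until_ready(timeout)
        data = try_recv()
    return data
```

With a wait that never raises, `connect_with_deadline` still ends in `socket.timeout`, while `recv_without_check` just hands back `None`, which is the gap comstud points at for recv().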
sorenAh.15:24
sorenOk, so maybe it should?15:24
comstudperhaps so15:25
sorenHm... Making a test case for recv is harder.15:25
comstudalthough15:26
comstudok.15:27
soren:)15:27
comstudi don't see any harm caused by adding there too15:29
comstudi think i can see where this fix fails, though15:30
sorengreat!15:30
comstudif you happen to make it into trampoline for the socket operation15:30
comstudand then there's a timeout15:30
comstudie, our sleep kinda masks that15:30
comstudcase15:30
comstudmy connect timed out case is last in my tests15:31
comstudi bet if i put it first, it'll not time out correctly15:31
* comstud checks15:31
*** adiantum has quit IRC15:31
comstudhm15:32
sorenoh, btw..15:33
comstudok, somehow it's still working.15:33
sorenSo there's a case we might not be fixing.15:33
sorenSo, there's our specific case where there's a lot of small requests.15:35
*** rnirmal_ has joined #openstack-dev15:35
sorenin between each, eventlet has a chance to go and handle these timeouts and do something useful.15:35
*** cp16net has quit IRC15:36
soren...but there's a different case where something is blocking for so long that it alone causes the timeout to fire. So a single, long, blocking operation.15:36
*** rnirmal has quit IRC15:36
*** rnirmal_ is now known as rnirmal15:36
comstudyea15:36
*** hub_cap has quit IRC15:36
soren...where eventlet of course has no chance to do anything for the duration.15:36
soren...and I'm not sure that case is covered by at least my approach. It may be covered by yours (which I honestly haven't completely grokked yet).15:37
*** hub_cap has joined #openstack-dev15:37
sorenThat's what I was referring to yesterday when I said I had a fix for our case, but I didn't think it completely solved the problem.15:37
comstudi'm pretty positive my fix covers all cases15:37
comstudbut maybe not15:37
sorenOk.15:37
comstudbecause essentially...i'm waiting for all poll()s for I/O to get a chance to complete before firing timers15:38
sorenIt seems to me that an approach that catches the timeout, does *something* to check if the socket actually timed out, and then reraises if it did, would also solve it completely15:39
soren...and would not require per-hub work.15:39
sorenOh.15:39
comstudyeah, I started looking at it and my brain went to mush15:39
sorenI can relate.15:39
comstudthere's not a good way, i think, to tell if things like 'recv()' time out15:40
soren..but that was my starting point for a fix. Ok, I think I've filled you in on all the thoughts I've had on the subject.15:40
comstudbecause..15:40
comstudyou're just waiting to receive data in a period of time15:40
comstudyou can't check that the socket is disconnected or anything15:40
comstudbecause it can still be connected15:40
comstudin that case15:40
comstudyou just haven't received data yet15:40
sorenSomething like recording when the last actual recv call was attempted.15:40
*** dprince has joined #openstack-dev15:40
sorenIf one hasn't been attempted since the timeout occurred, try one last time.15:41
sorenOr something.15:41
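
A rough sketch of the idea soren floats here, under stated assumptions: `RecvTracker`, `attempt` and `on_timeout` are hypothetical names, and `try_recv` stands in for a non-blocking receive that returns `None` when no data is available. Nothing below is eventlet API. It only shows the bookkeeping: remember when the last receive was attempted and, if none has happened since the timer fired, try once more before giving up.

```python
import time


class RecvTracker:
    """Track the time of the last receive attempt (illustrative only)."""

    def __init__(self, try_recv):
        self._try_recv = try_recv  # hypothetical non-blocking receive
        self.last_attempt = None

    def attempt(self):
        # Record the attempt time before calling the receive.
        self.last_attempt = time.monotonic()
        return self._try_recv()

    def on_timeout(self, fired_at):
        """Handle a timer that fired at fired_at (a time.monotonic() value).

        Returns the result of one final attempt if no receive has been
        tried since the timer fired; otherwise returns None, and the
        caller should treat the timeout as real.
        """
        if self.last_attempt is None or self.last_attempt < fired_at:
            return self.attempt()
        return None
```

As comstud notes, a connected socket that simply has no data yet looks the same as a slow one, so this can only rescue data that had already arrived when the timer fired.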
comstudalso something stupid could be done to check the time elapsed in the poll() callbacks when I/O is ready...15:41
comstudand loop through all timers and increment them by that amount15:41
comstudheh.15:41
*** cp16net has joined #openstack-dev15:41
soren*shudder*15:41
soren:)15:41
comstudyeah exactly..15:41
sorenI sort of got the impression that the eventlet guy doesn't really consider this a bug.15:42
comstudsame here15:42
soren..which is a bit concerning.15:42
comstudthis is a real bug, but at the same time, I think he does have a point.15:42
sorenI had written up a response, but I think your response was great, so i didn't bother.15:42
comstudcools15:43
comstudthnx:)15:43
sorenFor sure. There's no doubt that the fact that we see this is a symptom that we're doing something wrong...15:43
soren...but if I understand the contract with Eventlet correctly, this bug is real. It shouldn't happen no matter how stupid my code is.15:43
comstudyeah, that's my thought15:43
comstudi think we're going to have to make some other changes on our end..15:43
sorenCertainly.15:44
comstudi wanted to try out my patch...15:44
comstudand then figure out where our performance issue is15:44
comstudand tackle that15:44
comstudfirst.. see how much fixing eventlet does for us15:44
sorenThe way I see it, our problem is one of optimisation.15:44
comstudbut ya, there's a real problem if we're blocking that much15:44
sorenNot so much a bug.15:44
comstudI/O loops like these are not designed to work with too much blocking15:45
comstudwe might need to throw some crap into a queue for worker threads15:45
comstudor something like that15:45
comstudi don't know enough yet about nova to say for sure, however.15:45
comstudi just really started working on nova code again.. since some initial work in August15:45
comstud:)15:45
comstudsoren: agree.15:45
*** adiantum has joined #openstack-dev15:46
comstudin any case, i feel like i can break your patch... i just need to come up with the right test case.15:47
*** jwilmes has joined #openstack-dev15:47
comstudif i can get past that initial timeout check in connect() and get into trampoline() where trampoline() is waiting on the poll to complete...15:47
comstudthe timeout should not be returned15:47
comstudwhen it should be15:47
*** fyang has joined #openstack-dev15:48
comstudi gotta hop into the shower.. gotta mtg in 1015:48
*** mattray has joined #openstack-dev15:52
*** fyang has quit IRC15:56
*** lorin1 has joined #openstack-dev16:17
*** cp16net has quit IRC16:23
*** hub_cap has quit IRC16:24
*** hub_cap has joined #openstack-dev16:24
*** adiantum has quit IRC16:37
*** cp16net has joined #openstack-dev16:41
*** rnirmal has quit IRC16:43
*** rnirmal has joined #openstack-dev16:43
*** elasticdog has joined #openstack-dev16:45
*** jwilmes has quit IRC16:56
*** Vek has quit IRC16:57
*** jdurgin has joined #openstack-dev16:59
*** bcwaldon has joined #openstack-dev16:59
vishybcwaldon: ping17:02
*** mattray has quit IRC17:02
bcwaldonvishy: 10 min, plz17:03
vishybcwaldon: np17:03
*** mattray has joined #openstack-dev17:09
jaypipesdprince, _0x44: would really appreciate a review on https://code.launchpad.net/~jaypipes/glance/bug713154/+merge/59110 if you have a few minutes. thx in advance!17:14
_0x44jaypipes: looking17:15
jaypipescheers17:15
_0x44jaypipes: I have a branch someplace that refactored out the parsing for swift/s3 into somewhere common17:16
_0x44I think it's stale, because it was before the huge glance changes in Cactus17:16
*** BK_man has quit IRC17:20
*** dragondm has quit IRC17:23
comstudok, mtgs done17:26
*** cp16net has quit IRC17:29
*** dragondm has joined #openstack-dev17:31
*** BK_man has joined #openstack-dev17:34
dprincejaypipes: on it17:47
*** xtoddx has joined #openstack-dev17:59
*** dovetaildan has joined #openstack-dev18:02
*** _cerberus_ has joined #openstack-dev18:27
*** ChanServ sets mode: +v _cerberus_18:27
*** nhm has joined #openstack-dev18:29
dprincejaypipes: Just updated the glance s3 merge prop.18:40
jaypipesdprince: thx dan!18:40
dprincejaypipes: Couple of minor things (maybe just questions).18:40
dprincejaypipes: Did you actually try this? Functionally?18:40
*** cp16net has joined #openstack-dev18:40
dprincejaypipes: I mean against say Amazon's S3 service?18:41
jaypipesdprince: no, against nova-objectstore...18:41
jaypipesdprince: might have picked up some issues :(18:42
*** blamar has joined #openstack-dev18:43
jaypipesdprince: it hasn't been a top priority, to be honest :)18:44
dprincejaypipes: trying it now....18:47
jaypipesdprince: re: your question #2, there is a reason for that, yes.18:47
dprincejaypipes: in a VPC.18:47
jaypipesdprince: the reason is because in one case we want to check existence and error on it, in the other, we don't want to check for existence...18:48
dprincejaypipes: Okay. I see. That makes sense. Give me a minute to try this against S3. I just want to see it work.18:49
jaypipesdprince: yeah, me too :) working with diego as well on that front (testing on real S3.)18:50
dprincejaypipes: Okay. Found something.18:51
dprincejaypipes: line 20118:52
dprincejaypipes: if key.exists():18:52
dprincejaypipes: needs to be something like:18:52
dprinceif key and key.exists():18:52
dprincejaypipes: essentially key can be null.18:52
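
The guard dprince suggests can be sketched in isolation. `FakeKey` and `key_already_stored` are hypothetical names used only for illustration. The one assumption taken from the discussion is that the key lookup can yield `None` and that a real key object has an `exists()` method.

```python
class FakeKey:
    """Hypothetical stand-in for an S3 key object with an exists() method."""

    def __init__(self, present):
        self._present = present

    def exists(self):
        return self._present


def key_already_stored(key):
    """Return True only when a key object was found and reports it exists.

    key may be None when the lookup finds nothing, so it is tested first;
    calling key.exists() on None would raise AttributeError.
    """
    return bool(key and key.exists())
```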
jaypipeshmm...18:52
dprinceAfter making that change I still get this error:18:53
dprinceError uploading image: 'Input' object has no attribute 'seek'18:53
dprincejaypipes: Sorry. I forgot to mention I'm just running a quick glance add command to upload a ramdisk into S3.18:53
jaypipesdprince: farts.18:54
jaypipesdprince: OK, please make a note on the MP. Seems the boto S3 API isn't as specific as it needs to be...18:54
jaypipesdprince: just mentions the input needs to be a file-like object, which I interpreted as needing a read attribute...18:55
dprincejaypipes: Sure. I'll make some notes.18:56
jaypipesdprince: appreciated. trying to wrap up the version stuff first...18:57
jaypipesdprince: which will cause merge conflicts with the S3 stuff anyway ;)18:57
*** Vek has joined #openstack-dev18:57
dprincejaypipes: Cool. I agree in doing that first then. So hey. Did my packager branch look good for the versioned API stuff? Or did you have more drastic changes in mind for the glance packager stuff? (I linked it in the merge prop)18:59
dprincejaypipes: feel free to ninja any/all/or none of it if we want to make another branch.18:59
jaypipesdprince: yeah, saw that. I didn't see any problems.18:59
dprincejaypipes: We just need to coordinate the changes. Shall I merge prop it and then Soren can take it from there?19:00
jaypipesdprince: not yet. :)19:00
jaypipesdprince: changing from v1.0 to v1.19:01
dprincejaypipes: I could go either way on that.19:02
dprincejaypipes: doesn't really matter I guess but I actually prefer the v1.0 for consistency. Sorry. I neglected to reply to the ML thread.19:03
jaypipesdprince: actually, rackspace and others are moving towards v1 and not using v1.0 in URIs, which is why I reconsidered. :)19:06
bcwaldon:)19:07
dprincedprince: Sure. good to be ahead of the game then. Kind of a preference thing anyway. Not going to make everyone happy.19:08
*** hub_cap_ has joined #openstack-dev19:10
*** hub_cap has quit IRC19:10
*** hub_cap_ is now known as hub_cap19:10
*** HugoKuo has quit IRC19:17
*** hub_cap has quit IRC19:35
*** mattray has quit IRC19:46
*** mattray has joined #openstack-dev19:51
*** agarwalla has quit IRC19:58
*** ironcamel has joined #openstack-dev20:08
*** cp16net has quit IRC20:16
*** dprince has quit IRC20:28
*** lorin1 has quit IRC21:02
*** bcwaldon has quit IRC21:02
*** User665 has joined #openstack-dev21:08
*** User665 has quit IRC21:09
*** mattray has quit IRC21:18
*** bcwaldon has joined #openstack-dev21:37
*** mattray has joined #openstack-dev21:52
*** markvoelker has quit IRC21:58
*** cp16net has joined #openstack-dev21:58
*** bcwaldon has quit IRC22:07
*** jkoelker has quit IRC22:23
*** rnirmal has quit IRC22:35
*** cp16net has quit IRC23:01
