Wednesday, 2013-09-04

*** kgriffs_afk is now known as kgriffs00:35
*** flaper87 is now known as flaper87|afk00:36
*** kgriffs is now known as kgriffs_afk00:45
*** nos_ has joined #openstack-marconi00:54
*** nos_ has quit IRC00:54
*** nosnos has joined #openstack-marconi00:55
*** amitgandhi has quit IRC01:33
*** kgriffs_afk is now known as kgriffs01:36
*** kgriffs is now known as kgriffs_afk01:45
*** ayoung has quit IRC01:52
*** whenry has joined #openstack-marconi01:55
*** ayoung has joined #openstack-marconi02:11
*** amitgandhi has joined #openstack-marconi02:22
*** whenry has quit IRC02:29
*** kgriffs_afk is now known as kgriffs02:36
*** kgriffs is now known as kgriffs_afk02:45
*** nosnos_ has joined #openstack-marconi02:47
*** nosnos has quit IRC02:49
*** nosnos_ has quit IRC02:55
*** nosnos has joined #openstack-marconi02:55
*** ayoung has quit IRC03:05
*** amitgandhi has quit IRC03:17
*** whenry has joined #openstack-marconi03:18
*** whenry has quit IRC03:34
*** kgriffs_afk is now known as kgriffs03:36
*** kgriffs is now known as kgriffs_afk03:46
*** kgriffs_afk is now known as kgriffs04:37
*** kgriffs is now known as kgriffs_afk04:46
*** whenry has joined #openstack-marconi05:23
*** openstack has joined #openstack-marconi14:53
flaper87nope, swift's use case is different from marconi's14:54
flaper87swift has blob objects, we have messages14:54
oz_akan_in both cases we have a resource that we want to delete14:54
*** openstackgerrit has joined #openstack-marconi14:55
flaper87but the use case is not the same. When talking to message systems, most of the time, message deletion is not something users care about14:55
oz_akan_why do you think data type is a differentiator for deletes?14:55
kgriffshttps://wiki.openstack.org/wiki/Marconi/specs/api/v1/responsecodes14:55
kgriffsit is up to date assuming a couple of pending patches are merged14:55
oz_akan_flaper87: deleting a message is important as it guarantees it won't be processed by others (though in this case there is a claim id)14:56
flaper87I agree we should differentiate both cases, but I don't agree w/ treating it as an error14:56
oz_akan_I think if you delete a message, you do it because you care about it14:56
flaper87oz_akan_: right, when you ack a message in AMQP systems you don't get an error back if it was already acked14:56
kgriffsbrb (meeting)14:57
oz_akan_flaper87: got it14:57
oz_akan_404 is just another response.. if not, what do you think we could return?14:57
flaper87oz_akan_: something that is not an error: 200 and 204? Not sure to be honest14:59
flaper87kgriffs: https://review.openstack.org/#/c/45070/1/global-requirements.txt15:00
*** key4 has quit IRC15:00
*** key4 has joined #openstack-marconi15:01
oz_akan_hmm, 200 for a successful delete, 204 for a not-found message15:01
oz_akan_might be; still, I liked 404 because of the definition in the wiki: The requested resource could not be found but may be available again in the future.[2] Subsequent requests by the client are permissible.15:02
oz_akan_https://wiki.openstack.org/wiki/Marconi/specs/api/v1/responsecodes#Delete_Messages15:03
oz_akan_here15:03
oz_akan_Delete message from a non-existing queue: 20415:03
oz_akan_zyuan_ had said that we return 404 in this case15:03
oz_akan_unfortunately he is off, can't verify at the moment15:03
oz_akan_kgriffs: flaper87 ^^15:03
flaper87oz_akan_: in which case?15:07
oz_akan_Delete message from a non existing queue15:09
oz_akan_the document says it returns 20415:09
oz_akan_while zyuan_ said that it returns 40415:09
oz_akan_I think malini_afk had said the same15:09
oz_akan_I am not sure if the wiki page is wrong15:09
oz_akan_flaper87: ^^15:11
flaper87we return 204 https://github.com/stackforge/marconi/blob/master/marconi/transport/wsgi/queues.py#L8515:12
oz_akan_that code is for deleting a queue15:14
oz_akan_right?15:14
flaper87ah, sorry, delete from a non-existing queue15:14
flaper87T_T15:14
flaper87oz_akan_: https://github.com/stackforge/marconi/blob/master/marconi/transport/wsgi/messages.py#L30315:15
oz_akan_flaper87: ok, we don't really check if the queue exists when deleting a message15:16
flaper87oz_akan_: nope, I was checking in the backend as well15:16
oz_akan_flaper87: so we don't return anything specific15:16
oz_akan_only listing a queue that doesn't exist returns 40415:16
oz_akan_the documentation is correct15:17
oz_akan_flaper87: tks15:17
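The behavior just confirmed, that message deletion is idempotent and never checks the queue, looks roughly like this as a Falcon responder. A minimal sketch with hypothetical names, not Marconi's exact code (the real responder is the messages.py link above):

```python
import falcon


class MessageResource(object):
    """Hypothetical sketch of the idempotent-delete pattern discussed
    above; see marconi/transport/wsgi/messages.py for the real thing."""

    def __init__(self, message_controller):
        self._message_ctrl = message_controller

    def on_delete(self, req, resp, project_id, queue_name, message_id):
        # Note: no check that the queue exists. Deleting a message that
        # is already gone succeeds quietly rather than returning an error.
        self._message_ctrl.delete(queue_name, message_id,
                                  project=project_id,
                                  claim=req.get_param('claim_id'))

        resp.status = falcon.HTTP_204
```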
oz_akan_https://bugs.launchpad.net/marconi/+bug/122076815:17
oz_akan_I created this to consider delete message response15:17
flaper87oz_akan_: awesome, thanks for that! This definitely needs further discussion15:17
openstackgerritA change was merged to stackforge/marconi: fix: claimed message require claim_id to delete  https://review.openstack.org/4333915:26
oz_akan_flaper87: not that awesome, just a bug report :D15:31
oz_akan_I am out for lunch15:31
flaper87oz_akan_: enjoy :D15:31
flaper87kgriffs: ping15:31
oz_akan_thanks15:31
kgriffsback15:54
kgriffsflaper87: so, on this: https://bugs.launchpad.net/marconi/+bug/122076815:58
kgriffsIs there a proposed solution?15:59
kgriffsas in, what return code and/or body?15:59
flaper87not yet16:04
flaper87I think this is worth discussing in our next meeting16:04
flaper87kgriffs: btw, unrelated topic16:05
flaper87About the sqlalchemy / MySQL backend, I talked this morning w/ Yeela and she wanted to take that bp16:05
flaper87She was going to work on the proton / qpid one but, since the rel backend is our priority, she volunteered to work on it16:06
flaper87She'll attend our next meeting16:07
flaper87but, if you're ok w/ that, we could assign it to her so that other folks know someone is already going to work on that16:08
flaper87kgriffs: also, can you review test patches ?16:08
flaper87:D16:08
*** ayoung has quit IRC16:11
kgriffsflaper87: re backend, I'm cool with that. let me assign her16:11
flaper87kgriffs: thx16:12
kgriffsyeela kaplan?16:13
flaper87kgriffs: yup16:13
kgriffscool, we need some more RedHat copyrights. :D16:14
* kgriffs loves contributors16:14
flaper87kgriffs: YEAHHH!!!16:14
kgriffshttps://blueprints.launchpad.net/marconi/+spec/sql-storage-driver16:15
flaper87kgriffs: awesome! thanks!16:15
kgriffsthoughts on this?16:15
kgriffshttps://blueprints.launchpad.net/marconi/+spec/redis-storage-driver16:15
*** openstackgerrit has quit IRC16:16
*** openstackgerrit has joined #openstack-marconi16:16
flaper87kgriffs: +1 for that. cppcabrera has some work already done on that16:17
flaper87https://github.com/cabrera/marconi-redis16:17
flaper87Developing it outside is the best test for: 1) our test suite structure, 2) our plugins stuff16:17
flaper87once it's done, I think he can submit it for review16:18
kgriffsok, so keep the blueprint but develop in a separate repo?16:20
flaper87until it's done, I guess. So, usually this kind of blueprint is implemented separately and then submitted in a single patch16:21
flaper87but we could split it into several patches, 1 for each controller16:21
flaper87which makes reviews easier16:21
flaper87What I meant w/ developing it in a separate repo is that I like the fact that he's doing it that way, because that allows us to test both the test suite and the plugins thing.16:22
flaper87but, I'd be happy to pull that backend into Marconi's source tree16:23
*** amitgandhi has quit IRC16:23
kgriffsyep, makes sense16:24
kgriffsnot a bad model actually16:24
kgriffsfuture drivers can be essentially incubated in other/personal repos16:24
kgriffsand then we can pull them in IFF it makes sense and the core team wants to take over maintenance16:24
*** amitgandhi has joined #openstack-marconi16:25
*** amitgandhi has quit IRC16:25
*** amitgandhi has joined #openstack-marconi16:25
flaper87kgriffs: correct! awesome!16:27
flaper87brb, dinner16:27
kgriffsciao16:32
kgriffsoz_akan_: does that load test delete messages, or leave them in the DB?16:40
kgriffs(after running)16:40
*** ayoung has joined #openstack-marconi17:05
*** kgriffs is now known as kgriffs_afk17:29
*** kgriffs_afk is now known as kgriffs18:05
flaper87kgriffs: ping18:18
flaper87kgriffs: could you take a look here? https://review.openstack.org/#/c/45070/1/global-requirements.txt18:18
kgriffsyeah, sorry, been swamped and haven't been diligent about reviews today18:19
* kgriffs is looking18:19
flaper87kgriffs: no worries, that's just a very quick one that you certainly know the answer to18:20
kgriffssometimes falcon does .postX18:20
flaper87https://github.com/racker/falcon/blob/master/falcon/version.py#L1918:20
kgriffsso, if we have ==0.1.6 then you would miss 0.1.6.postX18:21
kgriffsthe .post things are sort of silly, I know18:21
flaper87cool, I think that's all he wanted to know. I didn't know the answer w.r.t falcon and went lazy on it :P18:21
kgriffsI should really just bump the first minor up more often and leave the second for interim stuff18:22
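The pinning problem kgriffs describes, illustrated: under Python packaging version rules an exact pin does not match post-releases, while a bounded range does. Hypothetical requirements lines, not the actual patch under review:

```
# Illustrative pins only; not the actual global-requirements change:
falcon==0.1.6        # exact pin: will not match 0.1.6.post3 and friends
falcon>=0.1.6,<0.2   # bounded range: admits post-releases and interim bumps
```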
kgriffsI can comment on that18:22
flaper87kk, thanks!18:22
kgriffscommented18:24
flaper87danke sir!18:25
kgriffsno problemo18:25
kgriffsoz_akan_: I'm on the mongo primary now, attempting to reproduce the 404 issue. I'll let you know how it goes.18:41
oz_akan_ok18:41
oz_akan_server02 right?18:41
oz_akan_kgriffs: ^^18:41
oz_akan_mng-02 to be exact18:41
kgriffsmar-tst-ord-mng-0218:42
kgriffsright?18:43
oz_akan_right18:43
kgriffs166.78.112.2518:43
kgriffscool18:43
oz_akan_I see exactly 40000 messages, as if created with a loop18:43
*** JRow has left #openstack-marconi18:44
oz_akan_kgriffs: any luck?19:38
kgriffsas far as I can gather, it's not a timestamp issue19:44
kgriffsif it were, you should see 204, not 40419:44
kgriffsthe queue in the query definitely exists in the queues collection on primary19:45
kgriffsI guess I can check the secondaries19:45
kgriffsactually, one more thing also19:45
kgriffswait a sec19:46
kgriffseeeeenteresting19:46
kgriffsmongodb.claims.create claims several messages by applying a claim ID to them19:46
kgriffsthen, it turns around and does a find for the same messages so it knows which were updated19:47
kgriffsthat would be a problem for an eventually consistent collection, but you said that you still get a 404 when not reading from secondaries?19:48
oz_akan_kgriffs: no19:49
oz_akan_kgriffs: I just got a 404 on the very first request when not reading from secondaries19:49
oz_akan_just one request amongst thousands19:49
oz_akan_so I can say we have a problem only when we write to primary and read from secondaries19:50
oz_akan_claim id logic might be the case then19:50
kgriffsodd that you would still get a single 404 tho19:50
kgriffsok, so here is what the code does19:51
oz_akan_I didn't test write-read from primary enough times to say we always get a 404 on the first request19:52
kgriffsok19:52
kgriffswell, we definitely have a race condition when reading from secondaries19:52
kgriffswhat happens is this:19:52
kgriffsa batch of messages is tagged by ID19:52
oz_akan_(I feel like watching a thriller, with popcorn)19:53
kgriffsbut then the claims controller reads those back as a sort of sanity check: in between creating the list of message IDs to update and actually tagging them with the claim, some of them may have gotten claimed by another process19:53
kgriffsLOL19:53
kgriffsaaaanyway19:53
kgriffshere's the scary part19:53
oz_akan_oh19:54
kgriffsA Nobel Peach Prize laureate is about to start World War III19:54
kgriffsno, wait, that's not it19:54
* kgriffs gets mind back on topic19:55
kgriffss/peach/peace19:55
kgriffs<sigh>19:55
oz_akan_(let me drink this cold coke)19:55
kgriffsok, so if the secondary is behind the primary by enough, then that final get to build the list of claimed IDs returns an empty result set. This then triggers mongodb.get to raise exceptions.ClaimDoesNotExist19:57
kgriffswhich is then propagated up the stack by mongodb.create19:57
kgriffsand finally...19:57
* kgriffs queues scary music19:57
* oz_akan_ hiding19:57
kgriffswsgi.claims.on_post catches ClaimDoesNotExist and converts it to falcon.HTTPNotFound()!!!!19:58
kgriffs<dun-dun-duuuuuuun!>19:58
oz_akan_what a tragedy19:58
kgriffsyou said it19:58
kgriffsI cried the whole time19:58
oz_akan_:o19:58
kgriffs(when I wasn't hiding)19:59
kgriffsso, we need to either change the logic, add a retry loop, or always read from the primary for that one call19:59
kgriffsso many choices, so little time… :p19:59
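The failure path narrated above, condensed into a self-contained sketch. The dict-based replicas and all names here are illustrative, not Marconi's actual code:

```python
# Condensed model of the race: writes land on the primary, but the
# claim read-back may hit a secondary that has not replicated yet.


class ClaimDoesNotExist(Exception):
    pass


primary = {}    # message_id -> claim_id
secondary = {}  # lagging replica: never receives the writes in this demo


def create_claim(message_ids, claim_id, read_replica):
    # 1. Tag the batch with the claim ID (writes always go to the primary).
    for mid in message_ids:
        primary[mid] = claim_id

    # 2. Read the tagged messages back as a sanity check, in case another
    #    process claimed some of them in the meantime.
    claimed = [mid for mid, cid in read_replica.items() if cid == claim_id]

    # 3. On a lagging secondary the read-back comes up empty, which is
    #    indistinguishable from a nonexistent claim, so the controller
    #    raises ClaimDoesNotExist, which the WSGI layer converts into
    #    falcon.HTTPNotFound, i.e. the spurious 404.
    if not claimed:
        raise ClaimDoesNotExist()
    return claimed


print(create_claim(['m1', 'm2'], 'claim-A', primary))  # ok: reads primary
create_claim(['m3'], 'claim-B', secondary)             # raises: replica lags
```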
oz_akan_hmm19:59
oz_akan_fastest, w=320:00
oz_akan_so we will write to all, and read from secondaries preferred20:00
kgriffswon't that hang if one of the nodes goes down?20:00
oz_akan_fastest, I mean, fastest to implement20:00
oz_akan_w=majority20:00
kgriffsyes, but I am concerned about hanging when one node goes down20:00
kgriffsmajority I think mitigates that, correct?20:01
oz_akan_majority is safe20:01
oz_akan_if we gave 3, it would20:01
kgriffsyeah20:01
kgriffsthat's what I'm worried about20:01
oz_akan_majority is a keyword20:01
oz_akan_ah..20:01
oz_akan_sorry20:01
oz_akan_right20:01
oz_akan_majority is not good enough20:01
kgriffsyeah20:02
oz_akan_maybe there is "all" .. I will check that20:02
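For reference, the write-concern options being weighed, expressed as illustrative pymongo connection strings (not Marconi's actual config). w=majority acks once most members have the write, so the particular secondary being read may still lag; w=3 waits for every member of a 3-node set, and thus blocks, up to wtimeoutMS, whenever one member is down:

```python
from pymongo import MongoClient

# w=majority: ack after most replica-set members have the write; a given
# secondary may still be behind, which is why it doesn't fix the 404 race.
safe = MongoClient(
    'mongodb://mar-tst-ord-mng-02/?w=majority&wtimeoutMS=5000')

# w=3: ack only after all three members of a 3-node set have the write;
# blocks (until wtimeoutMS) whenever any one member is down.
strict = MongoClient(
    'mongodb://mar-tst-ord-mng-02/?w=3&wtimeoutMS=5000')
```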
oz_akan_ok then if we read from primary we are fine20:02
oz_akan_at least20:02
kgriffsyeah, I'm just wondering if that will impact performance any?20:02
oz_akan_it will, we can measure20:02
oz_akan_though I think we have so many writes that we have to go to primary most of the time anyways20:03
oz_akan_let's measure and see what we will have20:03
oz_akan_I think this could be a temporary solution anyways20:04
oz_akan_app needs to be able to cover this at some point20:04
oz_akan_app = marconi20:04
kgriffsok. a retry would slow down claim creation anyway20:04
kgriffsok, let me prepare a patch for you to try20:04
kgriffsI'll make it read from primary for just that one call (hopefully that won't be too convoluted, heh)20:05
oz_akan_that one call20:09
kgriffsdoing the get from primary20:10
oz_akan_might not be easy, as I think we decide that at the driver level while creating a connection20:10
kgriffsI guess I could make all claim GETs hit the primary, or just when creating?20:10
kgriffsmmm20:10
kgriffsrings a bell20:10
kgriffsseems like with RSE I had to keep two connections20:11
kgriffslet me check it out20:11
oz_akan_oh that is trivky20:12
oz_akan_tricky means very tricky20:12
oz_akan_trivky means very tricky20:12
oz_akan_flaper87: missed the movie20:13
kgriffs"You can specify a read preference mode on connection objects, database objects, collection objects, or per-operation."20:16
kgriffssweet20:16
kgriffslet me try that20:16
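A sketch of the per-operation override quoted above, using the pymongo 2.x API current at the time (modern pymongo would use Collection.with_options instead). The query field and claim ID are illustrative, not Marconi's actual schema:

```python
from pymongo import MongoClient, ReadPreference

# Connection-level default: prefer secondaries, as in the load test.
client = MongoClient(
    'mongodb://mar-tst-ord-mng-02/?w=1&readPreference=secondaryPreferred')
messages = client.marconi.messages

claim_id = 'claim-123'  # hypothetical placeholder

# Per-operation override: pin just the claim read-back to the primary
# so it cannot race replication lag; all other reads keep the default.
claimed = messages.find({'claim_id': claim_id},
                        read_preference=ReadPreference.PRIMARY)
```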
oz_akan_per operation, amazing20:24
kgriffsnot sure if it will work for real, but we can try it. :p20:24
oz_akan_so you say this is the setup for the following episode20:26
oz_akan_I can't wait to watch it20:26
kgriffshe20:26
kgriffsheh20:26
kgriffsoz - what's the easiest way for me to get you the code change to try?20:33
oz_akan_a branch is easiest20:33
oz_akan_or like last time, I can apply a patch20:33
oz_akan_if you have a public fork I could install it easily20:34
kgriffsok20:35
kgriffslet me do that20:35
oz_akan_ok20:35
*** ayoung is now known as ayoung-afk20:36
kgriffsoz: https://github.com/kgriffs/marconi21:03
kgriffsI just set up a temporary fork to try this out21:03
kgriffs(just use the master branch)21:03
kgriffsnot sure if this is easier than using gerrit, but whatever21:04
kgriffsoz_akan_: ^^^21:06
oz_akan_got it21:07
kgriffsok21:09
kgriffsI was lazy and didn't test it locally (didn't want to set up a repl set)21:09
oz_akan_starting test21:10
oz_akan_w=1&readPreference=secondaryPreferred. so far no 404s21:11
oz_akan_am I experiencing a happy ending?21:12
oz_akan_http://198.61.239.147:8000/log/20130904-2111/report.html21:12
oz_akan_I am not sure about the performance yet, but with the patch it seems we don't get any more 404s21:13
oz_akan_please don't touch master branch on that repo21:13
oz_akan_I will run benchmark later this evening or tomorrow morning21:13
oz_akan_I have to leave now21:14
oz_akan_kgriffs: ^^21:14
kgriffsoh, ok. It doesn't include my perf patch for large queues, I just forked from master/mainline21:14
oz_akan_yes, there we have 60K+ messages21:14
oz_akan_I will run on an empty queue21:14
oz_akan_bye for now21:15
*** oz_akan_ has quit IRC21:15
*** flaper87 is now known as flaper87|afk22:11
*** tedross has quit IRC22:20
*** oz_akan_ has joined #openstack-marconi22:27
*** oz_akan_ has quit IRC22:31
*** amitgandhi has quit IRC22:39
*** oz_akan_ has joined #openstack-marconi23:11
*** oz_akan_ has quit IRC23:53
