Tuesday, 2016-04-12

*** Qiming has quit IRC00:38
*** zzxwill has quit IRC01:14
openstackgerritzzxwill proposed openstack/python-senlinclient: Correct some typos  https://review.openstack.org/30394401:23
*** Qiming has joined #senlin01:34
*** yanyanhu has joined #senlin01:38
*** elynn has joined #senlin02:00
*** elynn has quit IRC02:05
*** elynn has joined #senlin02:05
Qimingelynn, seems the session_for_write() thing is not reliable02:06
Qimingstill some parameters to tune, maybe?02:06
elynnStill have some potential race condition.02:06
openstackgerritMerged openstack/python-senlinclient: Correct some typos  https://review.openstack.org/30394402:07
Qimingso we cannot treat session_for_write as safe atomic way for db api?02:07
Qimingwhat's the different between obj.save(session) and session.add(obj) ?02:07
elynnI thinks so, but I think we can not just lock the db by db_api02:07
elynnsession.add() is for new record02:07
Qimingthe later was seen in oslo.db doc, the former is what we have been using02:07
elynnojb.save() is for update.02:08
QimingI'm suspecting if that is true02:08
Qimingcannot understand why two nodes being added to the same cluster are getting the same index value02:09
*** ddeja has quit IRC02:09
elynnsave will not flush the session before is save to db I think.02:10
elynnIf they both go into https://github.com/openstack/senlin/blob/master/senlin/db/sqlalchemy/api.py#L162 at the same time, their index will be the same.02:11
elynnI think.02:11
*** yuanying has quit IRC02:12
*** yuanying has joined #senlin02:12
Qimingso we need to check if the cluster_next_index is "successful" and "retry", if the session_for_write is a real DB transaction02:13
yanyanhuQiming, I'm testing the membership issue02:15
elynnYou mean outside db_api?02:16
Qimingnot sure02:16
Qimingthe previous DB isolation hacks are all going away when we switch to the context manager and reader/writer thing02:16
yanyanhudidn't happened locally without that patch02:16
Qimingyanyanhu, it is rare02:17
yanyanhuyes02:17
yanyanhulet me make more tests02:17
Qimingbut there is a possibility, as the gate revealed02:17
yanyanhuright, need to figure out what is the reason02:17
elynnI think this problem is cause by enable concurrency of functional tests. Which will test some potential race condition.02:18
yanyanhuelynn, but that happened inside a single test case02:18
Qimingcluster membership one is not that case02:18
yanyanhuno clue in engine log02:19
Qimingit is telling us that the db transaction isolation is very fragile now02:19
yanyanhuQiming, yes, agree02:19
Qimingstill need to inject auto_commit and expire_on_commit options I guess02:19
elynnhmm, indeed, yes02:20
Qimingor even the 'isolation_level' option if that does not solve the problem02:20
yanyanhumaybe it happens once during hundred times of running, but there is likelihood02:20
Qimingyep, that is very annoying and very dangerous02:21
elynnyes, should find a way to lock write session.02:21
elynnI should go through oslo_db docs02:21
yanyanhuQiming, elynn, I have some new understanding about the relationship between Rally and Tempest. If you guys have time, I want to make a call with you to make a quick sync about the next step for senlin stress/scenario test support.02:23
elynnDo you think will this patch help ? https://github.com/openstack/keystone/commit/9535c0908465b626b01fccec034905196f78dc1a02:24
elynnyanyanhu, I want to talk about it with you, How about noon at starbucks?02:24
yanyanhuelynn, that's ok for me02:25
Qimingok02:25
Qiminghttp://stackoverflow.com/questions/2334824/how-to-increase-a-counter-in-sqlalchemy02:25
elynnhmm, interesting02:26
elynnLet's have a try.02:27
Qiminghttp://stackoverflow.com/questions/4167568/how-to-do-atomic-increment-decrement-with-elixir-sqlalchemy02:27
Qimingthe key of the second one is the combination of the filter and update operation02:27
elynnok, so the key is to do a db transaction as quick as we can.02:28
Qimingavoid introduce python values or objects02:29
Qimingtwo possibilities02:30
Qimingthe session.commit() wasn't called quickly enough in the session_for_write() logic02:31
Qiminganother possibility is the commit() wasn't flushed into db because the object in question wasn't expired immediately02:32
elynnsession.commit() will automatically call after with session_for_write I think.02:32
elynnsession.commit() will automatically call after with session_for_write block I think.02:32
Qimingif that is the case, session_for_write() is not preventing a reentrance into the transaction logic02:33
elynnFrom what we observed , that might be true, but I'm not so sure why02:34
Qimingso elynn02:43
Qimingfrom http://stackoverflow.com/questions/21653319/sqlalchemy-concurrency-update-issue02:43
Qimingand http://stackoverflow.com/questions/26356131/when-a-sqlalchemy-concurrent-updates-the-same-record-what-is-wrong02:43
QimingI think we should go query.with_for_update()02:44
elynnLet me look at these links02:44
Qimingthe second link did mentioned some possibility of dead locks, but that is not a concern in our context02:45
QimingI'm so curious/confused why such a problem wasn't discussed on mailinglist02:45
Qimingor ... people have all found some secret sauces of different taste?02:46
openstackgerritMerged openstack/senlin: Catch DBDuplicateEntry Error during cred_create  https://review.openstack.org/30320402:46
Qimingsigh, this is really not an easy task: http://skien.cc/blog/2014/01/15/sqlalchemy-and-race-conditions-implementing/02:49
elynnThey have their small secrets.02:49
elynnI would guess they solve the atomic issue outside of db_api.02:50
Qimingusing python lock?02:50
Qimingthat would be an overkill02:50
Qimingdb consistency is the foundation02:51
Qimingunless ... you don't trust the db you are using, which is true ins some contexts where you are using some distributed KV stores02:51
Qiminghere is another article: http://rachbelaid.com/handling-race-condition-insert-with-sqlalchemy/02:52
*** yuanying has quit IRC02:52
Qimingeven insertion has some problems, but this post is a good reference for the credential thing we were looking at02:53
elynnI'm going through the docs of oslo.db hoping I can find some way to solve this http://docs.openstack.org/developer/oslo.db/api/oslo.db.sqlalchemy.session.html02:55
Qimingokay02:58
Qimingi'd suggest you read at least the last post02:59
Qimingit is very inspiring02:59
elynnyes, I'm reading.02:59
elynnTry and rollback and get again?03:07
Qimingwhat do you mean?03:10
Qimingthat solution of 'get_or_create_link' looks pretty clean one for cred_create_or_update()03:11
elynnI mean the logic from your pasting blog, it's try to create and catch error and do a session rollback and then try to get existing record, it's a little complicated. But the this blog shows the workflow of how the race condition happens.03:11
Qimingyour current workaround, catching DBDuplicateEntity looks similar03:12
elynnNot so sure which is better.03:13
Qimingas for the index problem03:13
QimingI'd suggest we try the with_for_update() hack03:13
Qimingaccording to this: http://docs.sqlalchemy.org/en/rel_0_9/orm/query.html#sqlalchemy.orm.query.Query.with_for_update03:14
Qimingwith_for_update() is superceding the with_lockmode() call03:14
elynnLet's test it.03:15
Qimingem03:15
openstackgerritEthan Lynn proposed openstack/senlin: [WIP] Fix for cluster_next_index db call  https://review.openstack.org/30437103:17
*** yuanying has joined #senlin03:50
*** elynn has quit IRC04:21
*** zzxwill has joined #senlin05:17
*** elynn has joined #senlin05:44
*** elynn has quit IRC05:49
*** elynn has joined #senlin05:49
*** zzxwill has quit IRC06:19
elynnHi Qiming06:26
elynnDo we need to apply with_for_update for all update operation?06:27
openstackgerritEthan Lynn proposed openstack/senlin: Rename tempest_tests to tempest  https://review.openstack.org/30326406:36
Qimingelynn, I'd suggest we think twice06:36
Qimingit is ... after all ... a hack06:37
elynnok, so for now just apply for cluster_next_index.06:37
Qiminguntil we find another trace of concurrency error06:38
openstackgerritEthan Lynn proposed openstack/senlin: Use with_for_update in cluster_next_index db call  https://review.openstack.org/30437106:42
openstackgerritQiming Teng proposed openstack/senlin: Clarify some guidelines on contribution  https://review.openstack.org/30441307:08
openstackgerritQiming Teng proposed openstack/senlin: Release note for API versioning  https://review.openstack.org/30441407:11
openstackgerritQiming Teng proposed openstack/senlin: Release note for trust creation concurrency  https://review.openstack.org/30441607:15
openstackgerritQiming Teng proposed openstack/senlin: Release note for the event generation bug fix  https://review.openstack.org/30441907:19
openstackgerritQiming Teng proposed openstack/senlin: Reorg user documentation to be references  https://review.openstack.org/30442307:29
openstackgerritMerged openstack/python-senlinclient: Updated from global requirements  https://review.openstack.org/30300607:41
*** zzxwill has joined #senlin07:56
openstackgerritQiming Teng proposed openstack/senlin: Initial framework for user tutorial doc  https://review.openstack.org/30444408:24
*** elynn has quit IRC08:33
*** elynn has joined #senlin08:36
*** zzxwill has quit IRC08:39
*** zzxwill has joined #senlin08:40
openstackgerritQiming Teng proposed openstack/senlin: Fix problems in glossary  https://review.openstack.org/30445008:42
*** elynn_ has joined #senlin08:44
*** zzxwill has quit IRC08:47
*** elynn has quit IRC08:48
*** zzxwill has joined #senlin09:07
*** zzxwill has quit IRC09:10
*** zzxwill has joined #senlin09:15
*** openstackgerrit has quit IRC09:17
*** openstackgerrit has joined #senlin09:18
*** zzxwill has quit IRC09:20
openstackgerritQiming Teng proposed openstack/senlin: Clarify some guidelines on contribution  https://review.openstack.org/30441309:23
*** yanyanhu has quit IRC09:24
*** shu-mutou is now known as shu-mutou-AFK09:35
openstackgerritMerged openstack/senlin: Clarify some guidelines on contribution  https://review.openstack.org/30441309:42
*** openstackstatus has quit IRC09:57
*** openstack has joined #senlin10:02
*** zzxwill has joined #senlin10:03
*** Qiming has quit IRC10:14
*** zzxwill has quit IRC10:23
*** zzxwill has joined #senlin10:25
*** elynn_ has quit IRC10:28
*** zzxwill has quit IRC10:34
*** zzxwill has joined #senlin10:34
*** zzxwill has quit IRC10:48
*** zzxwill has joined #senlin10:50
*** zzxwill has quit IRC11:13
*** Qiming has joined #senlin11:22
*** zzxwill has joined #senlin11:57
*** openstack has quit IRC12:04
*** openstack has joined #senlin12:08
*** zzxwill has quit IRC12:10
openstackgerritMerged openstack/senlin: Release note for API versioning  https://review.openstack.org/30441412:43
openstackgerritMerged openstack/senlin: Release note for trust creation concurrency  https://review.openstack.org/30441612:43
openstackgerritMerged openstack/senlin: Release note for the event generation bug fix  https://review.openstack.org/30441912:43
openstackgerritMerged openstack/senlin: Reorg user documentation to be references  https://review.openstack.org/30442312:43
openstackgerritQiming Teng proposed openstack/senlin: Fix problems in glossary  https://review.openstack.org/30445012:44
*** cschulz_ has joined #senlin12:47
*** elynn has joined #senlin12:55
*** yanyanhu has joined #senlin12:58
*** zzxwill has joined #senlin13:01
*** zzxwill has quit IRC13:08
*** zzxwill has joined #senlin13:09
*** elynn has quit IRC13:12
*** elynn has joined #senlin13:13
*** zzxwill has quit IRC13:14
*** zzxwill has joined #senlin13:19
*** zzxwill has quit IRC13:23
*** zzxwill has joined #senlin13:29
openstackgerritLiuqing Jing proposed openstack/senlin-dashboard: Fix empty `Timestamp` column in cluster/node event tables  https://review.openstack.org/30277213:50
*** jhesketh has left #senlin13:57
*** elynn has quit IRC14:00
yanyanhuhi, Qiming, about the approval process, I think using policy to support it is better since policy is supposed to be customized by user/developer14:01
yanyanhubut if we embedded such mechanism into senlin scaling request, it's difficult to make it flexible enough to meet everyone's demand14:02
Qimingthe approval is not supposed to be done in minutes14:05
Qimingit will be processed by human in the middle14:05
cschulz_The issue with policy is that it is after the action is started which causes a problem with locking14:05
yanyanhuyes, just a trade-off between refactoring policy mechanism and redesign request process14:06
Qimingexactly, our policy definition is rule set to be check when action is about to be executed14:06
cschulz_One way we could do it is to have a policy make a approval request and then cancel the action.14:07
cschulz_The response from approval would be to create a special scaling request.14:07
Qimingyes, cschulz_, that is in fact turning the scaling request into two steps14:08
yanyanhuso the policy does some works at background14:08
Qimingthe second one is the one we can handle today14:08
Qimingthe first step, doesn't look like a real request yet14:09
yanyanhucan action timeout provide some help here?14:09
cschulz_The issue is that the condition that generated the original request may have changed.14:09
cschulz_So there really need to be a reevaluation of the situation at the time of the approval14:09
Qimingem14:10
Qimingthat is an issue14:10
Qimingbut we cannot avoid that, right?14:10
yanyanhuyes14:10
yanyanhuwe can't14:10
Qimingif the request is delayed for 20 minutes for approval14:10
yanyanhuthis problem always exists14:10
cschulz_That is why my thinking was leaning toward doing all the decision process outside before ever issuing any requests to Senlin14:11
yanyanhuI agree with cschulz_14:11
Qimingyep14:11
yanyanhusince this kind of processing is actually out of senlin's scope14:11
Qimingso long the user knows what will happen14:11
yanyanhueven out of ceilometer/monasca's scope I guess14:11
cschulz_Makes it much simpler as far as Senlin, just need a secure way for external entities to tell Senlin to take actions.14:12
cschulz_e.g. Zaqar14:12
yanyanhuyep14:12
Qimingokay, that makes perfect sense14:12
Qimingsenlin doesn't have to know the source of the trigger14:12
cschulz_Nope14:12
QimingI'm thinking if we should improve our scaling policy14:13
Qimingsay, if a request is about adding more then 10 VMs a time, we should send a message/alert14:13
cschulz_I think we need to have an notification policy14:13
yanyanhuthat's a good idea14:14
Qimingreport this to the boss, :)14:14
cschulz_To send a message on Zaqar, and a Zaqar Receiver14:14
Qimingyep14:14
cschulz_Am working on a Zaqar client for Senlin14:14
Qimingspeaking of emitting notifications14:14
QimingI'm thinking of unifying that into the senlin event module14:15
yanyanhumay also need to extend event?14:15
yanyanhu:)14:15
Qiminguse the same event interface14:15
Qimingallow more than one backend for the event, configurable for deployers14:15
cschulz_Where can I look into that?  Any documentation?14:15
yanyanhuyes14:15
Qimingone backend is db, one is zaqar queue, one is oslo_notification14:15
yanyanhulike ceilometer sampling14:16
yanyanhuno document for senlin event I guess...14:16
yanyanhuas of now14:16
Qimingcschulz_, http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/event.py14:16
yanyanhuwow14:17
yanyanhuit is there :)14:17
yanyanhuoh, just code...14:17
cschulz_thanks.  I read the code.14:17
Qimingeach function is basically constructing a Event object and serializing that into DB14:17
yanyanhuwill leave. talk to you guys later14:18
Qimingwe can extend that with DB as just one of backend14:18
cschulz_Qiming, your Notes available times seem to be in US time zone14:18
Qimingbye yanyanhu14:18
cschulz_bye yanyanhu14:18
yanyanhubye14:18
*** yanyanhu has quit IRC14:18
Qiminghahah14:18
Qimingmy day time is mostly empty, but my night time is ... busy, overlapping meetings14:19
cschulz_Your availability appears on my Notes to be from now to EOD EST.  Do you work all night in Beijing?14:20
*** zzxwill has quit IRC14:26
Qimingno, cschulz_, I sleep14:30
cschulz_So Notes is wrong.  That's no surprise14:30
Qimingokay, I see14:31
QimingI was trying to schedule something during the coming summit14:31
Qimingso ... the best way to do that is to switch my notes time zone, schedule them, then switch back14:31
QimingI forgot to switch back I think14:32
cschulz_OK.  makes sense.  You got any time tomorrow?  Morning your time?14:32
Qimingyes14:32
cschulz_let me know what works for you.  I'll send invite.14:37
*** zzxwill has joined #senlin14:40
Qimingok14:42
*** zzxwill has quit IRC14:46
*** zzxwill has joined #senlin15:02
*** jdandrea_ has joined #senlin15:10
*** jdandrea has quit IRC15:14
*** openstackgerrit has quit IRC15:18
*** openstackgerrit has joined #senlin15:19
*** jdandrea_ is now known as jdandrea15:25
*** zzxwill has quit IRC15:47
*** Qiming has quit IRC16:22
*** yuanying has quit IRC19:04
*** nk2527 has left #senlin19:11
*** cschulz_ has quit IRC20:38
*** yuanying has joined #senlin23:18
*** Qiming has joined #senlin23:35
*** yuanying has quit IRC23:36

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!