Monday, 2016-11-21

*** ruijie has joined #senlin00:58
*** yanyanhu has joined #senlin01:41
yanyanhuhi, Qiming, just got a response from Heidi last weekend and she said they are still waiting for the illustrator to give them the updated version since they didn't think the first version was good enough.01:43
Qimingok01:44
yanyanhuand she will message me as soon as she gets it01:44
*** elynn has joined #senlin01:44
Qimingas long as they are still working on it, it is fine01:44
yanyanhuyep, looks so01:46
*** elynn has quit IRC01:47
*** elynn has joined #senlin01:49
*** XueFeng has joined #senlin01:56
*** elynn has joined #senlin02:11
openstackgerritMerged openstack/python-senlinclient: Support  "global_project" arguments for action-list  https://review.openstack.org/39780502:11
openstackgerritQiming Teng proposed openstack/senlin: Move notifications object down one level  https://review.openstack.org/40002402:15
openstackgerritmiaohb proposed openstack/python-senlinclient: Cluster collect display error  https://review.openstack.org/40002602:25
*** yuanying has quit IRC02:50
*** yuanying has joined #senlin02:51
openstackgerritQiming Teng proposed openstack/senlin: Registry support for notification classes  https://review.openstack.org/40003302:54
Qimingare we suffering from the same problem? https://review.openstack.org/#/c/378572/03:19
*** zhurong has joined #senlin03:27
*** yuanying has quit IRC03:46
*** elynn has quit IRC03:48
*** yuanying has joined #senlin03:48
*** zhurong has quit IRC03:58
*** zhurong has joined #senlin03:59
*** gongysh2 has quit IRC04:01
*** shu-mutou-AWAY is now known as shu-mutou04:13
*** zhurong has quit IRC04:26
*** gongysh2 has joined #senlin04:27
*** rasmus has joined #senlin04:32
*** rasmus has quit IRC04:36
*** elynn has joined #senlin04:52
yanyanhuhi, Qiming, we are not I think. We inherit from service.Service for three services including engine, dispatcher and health manager and I think all of them are used with threadgroup04:55
Qimingokay, good to know04:55
*** elynn has quit IRC04:57
*** elynn has joined #senlin04:58
*** zhurong has joined #senlin05:11
openstackgerritMerged openstack/senlin: Add TODO item about referencing existing pool  https://review.openstack.org/39887205:12
*** zhurong has quit IRC05:31
*** zhurong has joined #senlin05:35
openstackgerritMerged openstack/senlin: Add request object for event-get  https://review.openstack.org/39983505:51
openstackgerritMerged openstack/senlin: Engine support for profile-validate2  https://review.openstack.org/39886305:53
*** elynn has quit IRC06:09
*** elynn has joined #senlin06:31
openstackgerritlvdongbing proposed openstack/senlin: API support for profile-validate2  https://review.openstack.org/40006206:31
*** elynn has quit IRC06:36
*** elynn has joined #senlin06:36
Qimingyanyanhu, free for a quick discussion?06:46
yanyanhuQiming, sure06:46
Qimingworking on versioned notification ...06:46
yanyanhuok06:46
QimingI don't think we will have a big problem unifying the logging interface among database, message, file, etc06:47
QimingI'm deferring that work till we get a poc implementation for versioned notification06:47
Qimingonce the interface is proved to be flexible, generic enough for both database and message, we can work on the generalization step06:48
yanyanhuyes, that makes sense. Once the poc implementation is ready, it will be easy to add more backend06:48
Qimingbefore we are there, we need to get the versioned notification thing done06:48
QimingI'm somehow blocked by the granularity problem when modeling events06:49
Qimingit is not like the LOG.info, LOG.error, ... which we treated as free06:49
Qimingfor notifications, if no one is receiving and processing them, it seems that they will be accumulated into the message queue for a long time06:50
yanyanhuyou're worried about the overhead06:50
yanyanhuoh, I see06:50
QimingI just experienced that when reinstalling devstack with gnocchi and aodh added06:50
yanyanhuagree with this. Actually I think amqp is designed for runtime message delivering06:51
yanyanhuwith an assumption that the consumer is always online to receive and handle message06:51
Qimingand ... previous experience debugging some enterprise middleware ... do NOT overly log, do NOT overly notify ...06:51
yanyanhuthis is different from log type of message, e.g. kafka06:51
yanyanhuyes, it is06:52
Qimingalright, I did consider this when drafting the spec file, so eventually, we will expose some switches in the config file for users to customize ...06:52
Qimingwhat level of events should be fired, what kind of events should be masked etc ...06:53
yanyanhuyep06:53
Qimingthat would be ... complex to use, but flexible enough to meet requirements we are not anticipating06:53
Qimingokay, enough recap06:53
Qimingthe problem is ... we have too many choices to send event notifications06:53
Qimingwe have to make some design decisions on this06:54
Qimingwe are not supposed to emit a notification whenever we just need a debug info06:54
Qimingeven with user customization, we will only decide whether to emit an event at the last moment06:55
yanyanhulast moment, you mean?06:55
Qimingwe are not supposed to place a lot of 'if ... else ..' calls at the call site06:55
yanyanhuQiming, that's for sure...06:56
Qimingtake the LOG.info calls as an example06:56
Qimingwe are calling it everywhere06:56
Qimingif we want to do conditional logging, we are not supposed to add 'if (info is allowed for this module, for this action) then LOG.info' everywhere06:57
Qimingwe will keep the call site simple, just a single line, LOG.info(...)06:57
yanyanhuyes06:57
yanyanhuso there should be a filter for this purpose?06:58
Qimingthen in the driver layer, we decide whether we will actually generate an event notification (or db record)06:58
Qimingcalled a filter or a filter chain if you want06:58
Qimingbut that filtering logic is not supposed to be placed at the call site, instead it should be placed inside the 'info' call06:59
yanyanhuyes06:59
Qimingin other words, the 'info' call should be smart enough to handle these customizations, correct?06:59
yanyanhuright06:59
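
The idea discussed above, keeping each call site to a single line and deciding inside the 'info' call whether anything is actually sent, can be sketched roughly as follows. This is only an illustration under assumptions: EventEmitter, its min_level and masked switches, and the event name "node.create.start" are hypothetical and are not taken from Senlin's code or configuration.

```python
# Hypothetical sketch; none of these class or option names come from Senlin.
LEVELS = {"debug": 10, "info": 20, "warning": 30, "error": 40}


class EventEmitter:
    """Keeps filtering out of the call site by deciding inside each call."""

    def __init__(self, min_level="info", masked=()):
        self.min_level = LEVELS[min_level]
        self.masked = set(masked)  # event types the operator switched off
        self.sent = []             # stand-in for a message queue or DB table

    def _allowed(self, level, event_type):
        return LEVELS[level] >= self.min_level and event_type not in self.masked

    def _emit(self, level, event_type, **payload):
        # The decision is made here, at the last moment, not by the caller.
        if not self._allowed(level, event_type):
            return False
        self.sent.append((level, event_type, payload))
        return True

    def info(self, event_type, **payload):
        return self._emit("info", event_type, **payload)

    def error(self, event_type, **payload):
        return self._emit("error", event_type, **payload)


emitter = EventEmitter(min_level="info", masked={"node.create.start"})
emitter.info("cluster.scale_out.start", cluster="c1")   # recorded
emitter.info("node.create.start", node="n1")            # masked, dropped
emitter.error("cluster.scale_out.error", cluster="c1")  # recorded
```

Every call site stays a one-liner; changing what gets emitted only means changing how the emitter is constructed.
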
Qimingokay, then ... where do we place those 'info/warn/error/' calls?07:00
Qiming(suppose we can filter them eventually, efficiently, at the last moment)07:00
yanyanhueach key point of workflow I guess?07:00
yanyanhue.g. action starts, succeeds07:01
Qimingright, the question lies in the definition of "key point of workflows"07:01
Qimingsay, cluster-scale-out as a workflow07:01
yanyanhutough question :)07:01
Qimingwhere do we call event generation?07:01
yanyanhuinside engine, I think service call, action building, policy taking effect, action scheduling/executing/finishing?07:02
yanyanhuand those points inside each sub action07:03
yanyanhuIt's hard to ask the end user to make that decision I feel07:04
Qimingwe can emit events at the following places: 1) rpc request received and validated; 2) cluster_scale_out action queued; 3) cluster_scale_out action starts execution; 4) cluster_scale_out action forks node_create action; 5) node_create action queued; 6) node_create action starts execution; 7) node_create action fails/succeeds; 8) cluster_create action fails/succeeds; 9) the original request reached a conclusion, i.e. cluster was scaled or not (status changes)07:04
QimingI do see every step a key point in the workflow07:05
yanyanhuyes, those events should be emitted07:05
Qimingbut I don't think we need to log them all07:05
Qimingit is too heavy07:05
Qimingcompletely ruining the idea of notification07:06
yanyanhuQiming, that's true07:06
yanyanhuespecially consider the overhead from interacting with event backend07:06
*** guoshan has joined #senlin07:06
Qimingcorrect, 5), 6), 7) above are proportional to the scale of a cluster operation07:07
Qimingafter drawing this on a paper, I'm astonished ...07:08
Qimingwe cannot afford logging so many events where each event will carry a lot of payload (based on my current design)07:08
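
A back-of-the-envelope count of the nine emission points listed above, assuming (as stated in the discussion) that only 5), 6) and 7) repeat once per node and the other six happen once per operation. The function name and the 100-node figure are illustrative only.

```python
def events_per_scale_out(node_count, per_operation=6, per_node=3):
    """Count events if every one of the nine points emits a notification.

    per_operation covers points 1-4, 8 and 9; per_node covers points 5-7.
    """
    return per_operation + per_node * node_count


# Emitting at every point for a 100-node scale-out:
fine_grained = events_per_scale_out(100)

# Emitting only a start event plus one end-or-error event:
coarse_grained = 2

print(fine_grained, coarse_grained)
```

Each of those events would also carry a serialized payload, so the byte cost grows with the same factor.
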
Qimingoslo versioned objects, when dumped, are already generating a lot of overhead regarding bytes added07:09
yanyanhuyes07:09
yanyanhuat large scale, that could be very inefficient07:09
Qimingsuppose we dump the cluster properties for all the events above, and all the action properties for these events07:09
Qimingif we don't dump all the properties, we will be challenged ... why the cluster_scale_out event didn't tell me when it was started and when it was stopped?  I want to compute the duration of its execution ...07:11
Qimingso ... a tough decision, right?07:11
yanyanhuyes07:11
yanyanhuif so maybe we start from coarse granularity (e.g. only logging cluster level events) and then try finer granularity and evaluate how the overhead increases?07:11
Qiminghere is my current proposal07:11
Qimingwe don't dump action details07:12
Qimingwe think from end user's perspective07:12
Qimingthey shouldn't care about the asynchronous/synchronous execution of cluster operations ...07:13
Qimingit was ... senlin ... that makes things a "mess"07:13
Qimingtake node_create as an example07:14
Qimingif it is a derived action, not one originated from RPC request, we don't have to expose that detail to users07:14
Qimingwe instead should strive and focus on exposing information on the cluster operation itself ... even if it fails, we let the users know why it failed ..07:15
openstackgerritxu-haiwei proposed openstack/senlin: Update host node 'dependents' when create/delete container node  https://review.openstack.org/39601607:15
Qimingthat is the 'original' goal of events or notifications07:15
yanyanhuQiming, yep, totally makes sense. They are events not "debug" info07:16
*** zhurong has quit IRC07:16
Qimingback to the list above07:16
QimingI'd like to focus on 3), 9) only07:16
yanyanhu8) is duplicated with action list/get?07:17
Qimingin terms of event notification, there will be three types of events for this operation: cluster.scale_out.start, cluster.scale_out.end, cluster.scale_out.error07:18
Qimingand .. that is ALL07:18
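
A minimal sketch of that proposal: only actions that originate from an RPC request emit start/end/error, while derived actions such as node_create stay silent. The notify helper, its derived flag, and the list used as a sink are assumptions made for illustration; they are not Senlin APIs.

```python
from contextlib import contextmanager


@contextmanager
def notify(sink, action_name, derived=False):
    """Emit <action>.start, then <action>.end or <action>.error.

    Derived actions (not originated from an RPC request) emit nothing.
    """
    if derived:
        yield
        return
    sink.append(f"{action_name}.start")
    try:
        yield
    except Exception:
        sink.append(f"{action_name}.error")
        raise
    else:
        sink.append(f"{action_name}.end")


# Successful scale-out that forks three derived node_create actions.
events = []
with notify(events, "cluster.scale_out"):
    for _ in range(3):
        with notify(events, "node.create", derived=True):
            pass

# Failed scale-out: the error event replaces the end event.
failed = []
try:
    with notify(failed, "cluster.scale_out"):
        raise RuntimeError("illustrative failure")
except RuntimeError:
    pass
```

The number of notifications stays constant no matter how many nodes the operation touches.
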
yanyanhuthat's reasonable07:18
Qimingoh, 8) is inside the 'do_scale_out' function, and 9) is at the end of the '_execute' function07:19
yanyanhuI see07:19
Qimingit is gonna complicate the event generation a little bit, regarding the derivation of "status reason", but ...07:20
Qimingthe simplification of overall infrastructure may justify that effort, I hope07:20
yanyanhuit will I think07:21
Qimingokay, will proceed on this07:21
yanyanhuotherwise, the overhead could be unaffordable07:21
yanyanhugreat, thanks for the explanation :)07:21
Qimingand try apply the same principle on node operations (those derived from RPC)07:21
Qimingem ... actually, it may not be that complex07:22
Qimingwe have been working very hard to reduce error message into the action.status and even into cluster status07:22
Qimingthat includes failures of policy checks ...07:23
Qimingso we will see if we have to emit something when a policy check has failed07:23
yanyanhuok07:23
yanyanhusounds feasible07:23
Qimingit is not that interesting either, if we have recorded the reason why a cluster operation has failed07:24
Qimingokay, thx for ur time, :)07:24
yanyanhuevent can be used together with action get I think07:24
yanyanhumy pleasure07:24
yanyanhuhope this digging can help the team better understand the design principle07:25
yanyanhu:)07:25
openstackgerritlvdongbing proposed openstack/senlin: Engine support for profile-create2  https://review.openstack.org/40007507:25
Qimingwill try to document these design considerations when creating developer docs07:25
yanyanhugreat07:27
openstackgerritmiaohb proposed openstack/python-senlinclient: The default value of "--list" in cluster-collect's help message displays error  https://review.openstack.org/40007607:30
openstackgerritlvdongbing proposed openstack/senlin: API support for profile-create2  https://review.openstack.org/40007907:50
openstackgerritYanyan Hu proposed openstack/senlin: Fix an error in integration test  https://review.openstack.org/40008107:53
openstackgerritmiaohb proposed openstack/python-senlinclient: Revise the help message of cluster-collect  https://review.openstack.org/40007608:06
openstackgerritmiaohb proposed openstack/python-senlinclient: Revise the help message of cluster-collect  https://review.openstack.org/40007608:12
openstackgerritmiaohb proposed openstack/python-senlinclient: Revise the help info of cluster collect  https://review.openstack.org/40002608:19
openstackgerritlvdongbing proposed openstack/senlin: Remove dead code related to profile-get in engine layer  https://review.openstack.org/40009308:22
openstackgerritlvdongbing proposed openstack/senlin: Remove dead code related to profile-update in engine layer  https://review.openstack.org/40010408:30
openstackgerritShan Guo proposed openstack/senlin: Modify the cli in doc of policy attach command  https://review.openstack.org/40010508:31
*** gongysh2 has quit IRC08:43
openstackgerritlvdongbing proposed openstack/senlin: Remove dead code related to profile-delete in engine layer  https://review.openstack.org/40011408:44
openstackgerritMerged openstack/senlin: Add engine support for event_get2  https://review.openstack.org/39983608:50
openstackgerritMerged openstack/senlin: Api support for event_get2  https://review.openstack.org/39984108:52
openstackgerritRUIJIE YUAN proposed openstack/senlin: prepare for "destory" parameter in cluster-replace-nodes  https://review.openstack.org/40012909:04
openstackgerritmiaohb proposed openstack/python-senlinclient: Fix error in cluster collect  https://review.openstack.org/40013309:15
openstackgerritYanyan Hu proposed openstack/senlin: Versioned request object for receiver-delete  https://review.openstack.org/40013509:19
openstackgerritYanyan Hu proposed openstack/senlin: Engine support for receiver_delete2  https://review.openstack.org/40013609:19
*** yanyanhu has quit IRC09:24
*** shu-mutou is now known as shu-mutou-AWAY09:25
openstackgerritMerged openstack/python-senlinclient: Updated from global requirements  https://review.openstack.org/39537709:30
openstackgerritlvdongbing proposed openstack/senlin: Versioned request objects for profile_type  https://review.openstack.org/40014809:39
*** elynn has quit IRC09:51
*** guoshan has quit IRC10:40
*** guoshan has joined #senlin11:41
*** guoshan has quit IRC11:46
-openstackstatus- NOTICE: We are currently having capacity issues with our ubuntu-xenial nodes. We have addressed the issue but it will be another few hours before new images have been uploaded to all cloud providers.12:20
*** catintheroof has joined #senlin12:31
openstackgerritXueFeng Liu proposed openstack/senlin: Fix nova resource leak  https://review.openstack.org/40023212:40
*** guoshan has joined #senlin12:42
*** guoshan has quit IRC12:47
openstackgerritMerged openstack/senlin: Update host node 'dependents' when create/delete container node  https://review.openstack.org/39601613:33
*** guoshan has joined #senlin13:43
*** bran has quit IRC13:44
*** guoshan has quit IRC13:47
openstackgerritQiming Teng proposed openstack/senlin: Remove NotificationPayloadBase class  https://review.openstack.org/40026614:20
openstackgerritQiming Teng proposed openstack/senlin: New fields for versioned notification  https://review.openstack.org/40026714:20
openstackgerritQiming Teng proposed openstack/senlin: New fields for versioned notification  https://review.openstack.org/40026714:21
*** guoshan has joined #senlin14:44
*** guoshan has quit IRC14:48
*** elynn has joined #senlin14:50
*** elynn has quit IRC15:09
*** guoshan has joined #senlin15:44
*** guoshan has quit IRC15:49
*** guoshan has joined #senlin16:45
*** guoshan has quit IRC16:50
*** guoshan has joined #senlin17:46
*** guoshan has quit IRC17:51
*** guoshan has joined #senlin19:01
*** guoshan has quit IRC19:05
*** guoshan has joined #senlin20:02
*** guoshan has quit IRC20:06
*** guoshan has joined #senlin21:03
*** guoshan has quit IRC21:07
*** shu-mutou-AWAY has quit IRC21:41
*** guoshan has joined #senlin22:03
*** guoshan has quit IRC22:08
*** guoshan has joined #senlin23:04
*** guoshan has quit IRC23:09
*** openstack has joined #senlin23:47
