Wednesday, 2015-08-19

*** lkarm has joined #senlin00:10
*** lkarm has quit IRC00:15
*** Qiming has joined #senlin00:47
Qimingmorning01:01
xuhaiweiQiming, morning01:15
*** Yanyanhu has joined #senlin01:15
*** branw has quit IRC01:20
*** Yanyan has joined #senlin01:22
Yanyanhi, xuhaiwei, about the failure in these two patches, will check whether we need a fix in the test case01:24
Yanyanhttps://review.openstack.org/213964 and https://review.openstack.org/21394401:25
*** Yanyanhu has quit IRC01:25
xuhaiweiYanyan, I am checking them too, it seems not the test case's problem01:26
Yanyanok01:26
Yanyanit happened locally after I recreated my test environment01:27
YanyanI think it is caused by the version update of some packages, e.g. oslo.db01:27
xuhaiweimaybe01:27
YanyanI guess they changed the exception msg raised when an invalid sort_dir is provided01:29
Yanyanthus caused these failures01:29
Yanyanwill make a test01:29
xuhaiweiok01:29
Yanyanhi, xuhaiwei, I think it is the reason01:32
Yanyanjust feel the new error msg is a little weird01:32
Yanyan'Unknown sort direction, must be one of: asc-nullsfirst, asc-nullslast, desc-nullsfirst, desc-nullslast'01:32
Yanyanwhy there is string 'null' here01:32
xuhaiweihave no idea01:33
Yanyanhmm, seems they do use these strings in the latest code01:35
xuhaiweihow to reproduce this error?01:37
Yanyanjust run tox -epy27 -r01:37
YanyanI think one of the recent package version updates caused this problem01:38
Yanyanwill propose a fix for this01:38
*** Qiming has quit IRC01:39
*** mathspanda has joined #senlin01:43
xuhaiweisee this Yanyan, https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/utils.py#L14301:45
xuhaiweiit's due to oslo.db's update01:46
Yanyanyes01:46
Yanyanand we didn't update our local package and thus didn't find it01:46
*** ChrisSen has joined #senlin01:52
*** Qiming has joined #senlin01:53
*** elynn has joined #senlin01:54
openstackgerritYanyan Hu proposed stackforge/senlin: Fix some test cases about illegal sort_dir  https://review.openstack.org/21441601:56
xuhaiweiYanyan, don't you think it's a little strange? we give sort_dir='desc', but the error message calls for 'desc-nullsfirst'?01:56
Yanyanhi, xuhaiwei, the patch has been proposed here, let's see whether it can fix the problem, thanks01:56
Yanyanyes, that's why I feel the msg is a little weird...01:57
Yanyandon't understand why there is a 'nulls' here01:57
xuhaiweiit seems both 'desc' and 'desc-nullsfirst' work01:58
Yanyanyes01:58
Yanyanand oslo_db also uses this word in their own test cases for the utils module01:59
Yanyanlike this: http://git.openstack.org/cgit/openstack/oslo.db/tree/oslo_db/tests/sqlalchemy/test_utils.py#n23602:00
Yanyanoh02:01
YanyanI guess desc is now equal to desc-nullsfirst + desc-nullslast02:01
xuhaiweiyes02:02
xuhaiweimaybe they want to be more specific02:02
Yanyanthey provide more accurate support for query02:02
Yanyanright02:02
Yanyanhmm, good news :)02:02
Yanyanok, let's see whether this fix works. If so, we can try to rebase those blocked patches.02:03
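A minimal sketch of why both 'desc' and 'desc-nullsfirst' can be accepted while the error message lists only the four compound values. It assumes the direction string is split on '-' into a main direction and an optional null-ordering suffix. `parse_sort_dir` and `VALID_SORT_DIRS` are hypothetical names written for illustration and are not presented as oslo.db's API.

```python
import itertools

# The four compound values quoted in the error message above.
VALID_SORT_DIRS = [
    "-".join(pair)
    for pair in itertools.product(["asc", "desc"], ["nullsfirst", "nullslast"])
]


def parse_sort_dir(sort_dir):
    """Split a sort direction into (main direction, null ordering or None)."""
    main, sep, nulls = sort_dir.partition("-")
    # A suffix is optional, but if a '-' is present it must be a known one.
    valid_nulls = ("nullsfirst", "nullslast") if sep else ("",)
    if main not in ("asc", "desc") or nulls not in valid_nulls:
        raise ValueError(
            "Unknown sort direction, must be one of: %s"
            % ", ".join(VALID_SORT_DIRS)
        )
    return main, (nulls or None)
```

Under this assumption a bare 'desc' simply carries no explicit null ordering, which would explain why it still passes validation even though the message does not list it.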
*** Qiming_ has joined #senlin02:03
xuhaiweiok02:03
Yanyanhello, Qiming02:04
Yanyanseems your network is not stable ;)02:04
*** Qiming has quit IRC02:05
*** Qiming__ has joined #senlin02:05
xuhaiweiit seems he is having some meetup02:06
*** Qiming_ has quit IRC02:06
Yanyanyes02:07
*** Qiming__ has quit IRC02:15
openstackgerritYanyan Hu proposed stackforge/senlin: Check size limitation in cluster scale in/out action  https://review.openstack.org/21396402:18
*** jdandrea has quit IRC02:20
*** jroyal has joined #senlin02:22
*** jroyal has quit IRC02:26
openstackgerritYanyan Hu proposed stackforge/senlin: Add functional test for listing profile_type  https://review.openstack.org/21304002:34
openstackgerritMerged stackforge/senlin: Fix some test cases about illegal sort_dir  https://review.openstack.org/21441602:35
openstackgerritYanyan Hu proposed stackforge/senlin: Add functional test for listing policy_types  https://review.openstack.org/21362602:35
*** lkarm has joined #senlin02:37
openstackgerritxu-haiwei proposed stackforge/senlin: Revise cluster-scale-in/out default value  https://review.openstack.org/21394402:38
openstackgerritxu-haiwei proposed stackforge/senlin: Handle exceptions in keystone_v3 driver  https://review.openstack.org/21356902:38
*** Qiming has joined #senlin02:41
Qimingsigh, network is very limited02:42
*** lkarm has quit IRC02:42
Yanyanyes02:43
Yanyanthe meetup has started?02:43
Qimingyes02:45
openstackgerritYanyan Hu proposed stackforge/senlin: Add functional test for listing profile_type  https://review.openstack.org/21304002:45
Yanyanthe hackathon is tomorrow?02:45
Qimingit already started02:45
Yanyanoh02:45
Qimingfixing/reviewing bugs02:45
Yanyannice :)02:46
xuhaiweiwhich project?02:51
YanyanI guess in several projects02:53
*** Qiming has quit IRC02:55
*** mathspanda has quit IRC02:57
*** Qiming has joined #senlin03:03
openstackgerritxu-haiwei proposed stackforge/senlin: Fix some exception mapping miss  https://review.openstack.org/21443103:04
*** elynn_ has joined #senlin03:09
*** elynn has quit IRC03:13
Qiminghi03:44
Qimingthe sort_dir patch03:45
Yanyanhello03:45
QimingI was wondering if it is affecting03:45
Yanyanyou mean?03:46
Qimingwhat is your oslo.db version?03:47
Yanyanlet me check03:47
Yanyan2.4.0 for tox03:48
Yanyanand 2.1.0 in local03:48
Yanyanseems gate uses a newer version than the one defined in requirement03:50
*** Qiming has quit IRC03:55
*** Qiming has joined #senlin03:56
Qimingah, I see, gate is using oslo.db 2.4.004:01
Yanyanyes04:01
Yanyanso it broke our tests04:01
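The mismatch described above (2.4.0 in the tox environment, 2.1.0 locally) is easy to miss. A small sketch for checking which version an environment actually has installed, using the standard library's `importlib.metadata` (Python 3.8 or later, so newer than the py27 environment in this log). `installed_version` and `version_key` are hypothetical helper names, and `version_key` assumes purely numeric dotted versions.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_key(version):
    """Turn a numeric dotted version such as '2.4.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    # Run this inside each environment (local, tox) and compare the results.
    print("oslo.db:", installed_version("oslo.db"))
```

Running it once in the local environment and once inside the tox virtualenv would show the kind of difference reported here.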
*** Qiming has quit IRC04:06
openstackgerritxu-haiwei proposed stackforge/senlin: Fix some exception mapping miss  https://review.openstack.org/21443104:16
*** mathspanda has joined #senlin04:26
*** mathspanda has quit IRC04:30
*** mathspanda has joined #senlin04:31
openstackgerritYanyan Hu proposed stackforge/senlin: Use wait_for_delete to wait for nova server deletion  https://review.openstack.org/21444804:56
*** Qiming has joined #senlin05:14
*** lkarm has joined #senlin05:20
*** lkarm has quit IRC05:24
*** Qiming has quit IRC05:26
*** Qiming has joined #senlin05:29
*** Qiming has quit IRC05:29
*** Qiming has joined #senlin05:30
openstackgerritLinPeiyu proposed stackforge/senlin: Fix misleading document for webhooks usage  https://review.openstack.org/21445505:39
*** Qiming has quit IRC05:43
*** Qiming has joined #senlin05:43
*** Qiming_ has joined #senlin05:46
*** Qiming_ has quit IRC05:47
openstackgerritxu-haiwei proposed stackforge/senlin: Fix some exception mapping miss  https://review.openstack.org/21443105:48
*** Qiming has quit IRC05:50
*** Qiming has joined #senlin05:55
openstackgerritYanyan Hu proposed stackforge/senlin: Allow NODE_DELETE action to steal node lock  https://review.openstack.org/21445906:00
Yanyanhi, Qiming, free to talk?06:00
Qimingnot now, about to present06:00
Yanyanok, talk later06:01
Yanyanhave a good lecture :)06:01
Qimingwill do my best06:01
QimingKen is presenting Neutron, I'm kinda lost at the moment06:02
Yanyanneutron is complicated...06:02
mathspandahi, xuhaiwei.06:06
mathspanda'-c' is for the specified cluster, but '-C' is for credential.06:06
xuhaiweiyes06:07
mathspandathe example i wrote is '-C'06:07
mathspandaoh. i know what's wrong.06:08
xuhaiwei:)06:08
mathspandathanks.:)06:08
xuhaiweinope06:08
openstackgerritLinPeiyu proposed stackforge/senlin: Fix misleading document for webhooks usage  https://review.openstack.org/21445506:10
openstackgerritMerged stackforge/senlin: Add functional test for listing profile_type  https://review.openstack.org/21304006:25
openstackgerritMerged stackforge/senlin: Fix some exception mapping miss  https://review.openstack.org/21443106:38
openstackgerritMerged stackforge/senlin: Fix misleading document for webhooks usage  https://review.openstack.org/21445506:38
*** ChrisSen has quit IRC06:54
Qimingpresentation done07:02
xuhaiweiabout what07:03
Yanyanhow about it?07:03
Yanyanseems it's the heat's meeting time07:03
Qimingyes07:04
xuhaiweiwhat kind of people have joined?07:04
YanyanI'm gonna join openstack-meeting channel to listen :)07:07
Yanyanhi, xuhaiwei, I think it's a China openstack community activity07:08
*** xuhaiwei_ has joined #senlin07:10
xuhaiwei_it seems there are many meetups of this kind in China07:11
*** xuhaiwei has quit IRC07:13
Yanyanxuhaiwei_, yes07:13
*** jroyal has joined #senlin07:16
*** jroyal has quit IRC07:20
Yanyanhi, Qiming, are you free now?07:41
Qimingfine07:41
Yanyanjust pushed some comments on the node lock patch07:41
YanyanI think you're right that it's not safe to steal the node lock in most cases07:42
YanyanI think the only safe case is when the old owner action of the node is gone07:42
Qimingif you have some nodes that cannot be deleted07:42
Qimingyou will first investigate why it is still locked07:42
Qimingthere could be some bugs in the code07:43
Yanyanyep07:43
Yanyanso if the code is well written, this should never happen07:43
Qimingif you are allowing node deletion unconditionally, a lot of bugs will be masked07:43
Yanyanunless you kill/restart the engine07:43
Yanyanyes07:43
Yanyanso maybe we just allow the lock stealing when we can ensure the parent engine of node's owner action has gone07:44
Yanyanin other cases, we don't allow it07:44
Qimingwhen you find some nodes cannot be deleted, you will first look into why that happened07:44
*** mathspanda has quit IRC07:44
Yanyanyes, if it happens accidentally, this should be a bug07:45
Qimingthere are two cases: bug in code (e.g. exception not caught) leaving node still locked, need to be fixed07:45
Qimingor there are cases beyond our control, if that is the case, we check the node status, and decide whether to force a steal07:46
Yanyanabout checking the node status, you mean check the 'status' attr of node?07:47
Qimingstealing locks unconditionally for NODE_DELETE action is bad, it is like the HARestarter resource07:49
Qimingnode status07:49
Qimingunder certain conditions, we may find that a node must be deleted forcibly, if that is the case, we will check node status for a decision07:50
Qimingmaybe there will be other cases for testing07:50
Qimingother conditions, sorry07:50
Yanyanyou mean we should delete node when it is in status like 'ACTIVE' 'INIT'?07:53
Yanyanbut not 'CREATING' 'DELETING' or 'UPDATING'07:54
Yanyanor some logic like this?07:55
Qimingyes08:04
Qimingwe will find out which status is safe to delete, which status is not08:04
Qimingthe basic assumption (starting point) would be: under no condition will we forcibly delete a node, unless we cannot find another solution08:05
Yanyanagree with the assumption, but I think we may not be able to make the decision from just checking the status attr of node08:06
Qimingsaw your comments08:07
Yanyanunless we know the detail of physical resource behind the node08:07
Qimingright, we need to deal with them case by case08:07
Yanyanyes08:07
Yanyanso before that, only case we can handle is the engine dying08:08
Qimingabout the multi-engine case, some special logic is needed when an engine starts08:08
Qimingit was documented in the FEATURES.rst as 'scavenger' process08:08
Yanyanyes, something like a scanning08:08
Qimingi.e. when an engine starts up, it will look for nodes/clusters .... those that are in a hung status and recover them08:09
Yanyanyes. this will be the complete solution08:10
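A rough sketch of the start-up scan discussed above, restricted to the one case the conversation treats as safe: a lock whose owning engine is gone. Everything here is hypothetical (the `NodeLock` record, the heartbeat map, the 60 second timeout); it is not Senlin's design, which had not been written at this point in the log.

```python
import time
from dataclasses import dataclass


@dataclass
class NodeLock:
    node_id: str
    engine_id: str  # engine whose action currently owns the lock


def find_stale_locks(locks, heartbeats, now, timeout=60.0):
    """Return locks whose owning engine has not reported within `timeout` seconds.

    `heartbeats` maps engine_id to the time of its last heartbeat (seconds).
    An engine with no heartbeat entry is treated as dead.
    """
    stale = []
    for lock in locks:
        last_seen = heartbeats.get(lock.engine_id)
        if last_seen is None or now - last_seen > timeout:
            stale.append(lock)
    return stale


def scavenge(locks, heartbeats, now=None, timeout=60.0):
    """Drop stale locks at engine start-up; return (kept, released)."""
    now = time.time() if now is None else now
    released = find_stale_locks(locks, heartbeats, now, timeout)
    kept = [lock for lock in locks if lock not in released]
    return kept, released
```

Locks held by a live engine are never touched, which matches the starting assumption that forced takeover is a last resort. Deciding what to do with the nodes behind the released locks is the case-by-case part that the sketch leaves out.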
Yanyanok, will add support for engine alive check to help decide whether we need node lock stealing before we can support more cases08:15
Qimingwe need a design here08:15
Yanyanhmm, for scavenger08:15
Qimingcurrently, we don't have multi-engine support, right?08:15
Yanyanright08:15
Qimingit was designed, but not yet implemented08:16
Qimingso there is a priority here08:16
Qimingeither we add multi-engine support first, then add scavenger08:16
Qimingif we plan like this, the scavenger would be a complete design08:16
Qimingon the other hand, if we add scavenger now, and implement multi-engine support later08:17
Qimingthe scavenger will have to be rewritten08:17
Qimings/will/may08:17
Yanyanhmm, actually, our current implementation should support multiple engines theoretically, we just didn't test it before08:19
Yanyanbut it may not be able to work correctly since some parts like dispatcher may not support it08:20
xuhaiwei_though don't understand the 'scavenger' well, do we need it now? I think the second way sounds better08:21
Yanyanxuhaiwei_, we may not need to add it now, but if we want to support multiple engines, it is necessary I think08:22
xuhaiwei_just saw the FEATURES.rst, 'scavenger process' is in the High priority list08:23
QimingMy experience writing test cases for the scheduler and dispatcher module told me that multi-engine is not finished08:23
Qimingxuhaiwei_, it was there as high because we were assuming that multi-engine support is ready08:24
Yanyanyes, that's true08:24
YanyanQiming, just as you said, we need a plan for this feature08:25
xuhaiwei_what do you mean by 'multi-engine is not finished'08:25
Yanyanmaybe not in liberty-3, but we need a timeline for it08:25
*** LiuWei has joined #senlin08:26
Qimingxuhaiwei_, it means you start two senlin-engine processes to service user requests08:27
Yanyanhi, xuhaiwei_, we actually only run a single engine thread now08:27
Qimingmulti-engine setup is a workaround for Python's GIL (Global Interpreter Lock) problem08:28
xuhaiwei_as I understand it, a new engine service will be started when a new request comes in, so if there is only one request, only one engine service is started, right?08:29
Qimingengine service is a process08:31
Qimingwe handle requests using eventlets -- a Python emulation of multi-threads, as other projects do08:31
xuhaiwei_so multi-engine processes are already started before request comes?08:33
Qimingyes08:36
Qimingthese engines will share requests forwarded by the senlin-api process08:36
xuhaiwei_got it08:36
xuhaiwei_just confirmed heat started 9 engine processes by default08:37
Yanyanyou can check this option in senlin.conf #num_engine_workers = 108:38
Yanyanit should be 1 by default08:38
xuhaiwei_yes08:39
Qimingxuhaiwei_, it depends on your number of processors I think08:39
xuhaiwei_oh08:40
Yanyanso, Qiming, what is your opinion about it?08:45
*** mathspanda has joined #senlin08:45
Yanyanshould we start working on multiple engine support first?08:45
Qimingit would be great if we can double confirm the multi-engine support08:45
Yanyanhmm, yes, we can do some tests about it08:46
Qimingthen we base the scavenger work on it08:46
Qiminggreat, thanks08:46
Yanyanno problem. And do we need the interim solution before we can support this feature? for engine died case08:47
Yanyanor we can handle it manually08:48
Yanyansince the work will be replaced after scavenger is supported08:48
QimingI don't think we need to do it08:48
Yanyanok08:49
Yanyanwill do some tests on multiple engines, hope there are not many holes there :)08:49
openstackgerritYanyan Hu proposed stackforge/senlin: Use wait_for_delete to wait for nova server deletion  https://review.openstack.org/21444809:04
Yanyanran some tests using concurrent cluster creation and deletion with two engine threads, seems the basic workflow is ok :)09:19
Yanyanthe node_create and cluster_create actions were assigned to the two engines nearly equally09:20
*** Qiming has quit IRC09:30
*** mathspanda has quit IRC09:35
xuhaiwei_cool, Yanyan09:36
Yanyanlooks good, created and deleted 4 clusters with 14 nodes :)09:36
Yanyanof course, no exception happened during node creation and deletion09:36
Yanyanotherwise, errors could have occurred09:37
YanyanI guess maybe we can enable two engine threads by default when doing daily development work09:37
Yanyanit can help find problems :)09:37
xuhaiwei_ok09:38
*** Qiming has joined #senlin09:38
Yanyanlet me increase the cluster size to 2009:38
Qimingokay09:40
Yanyanlooks pretty good ;)09:40
Yanyan5 concurrent clusters with 40 nodes: 36 heat stacks and 4 nova servers09:40
Qimingjust for creation?09:41
Yanyanconcurrent creation and deletion :)09:41
Yanyanwrote a shell script, sleep 1 second before each step09:41
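The shell script itself is not shown in the log. A sketch of the same idea in Python, assuming each cluster is created and then deleted with a pause before each step, and that several such cycles run at once. `create` and `delete` are hypothetical callables supplied by the caller (for example, thin wrappers around whatever client is in use), so no real Senlin command or flag is implied.

```python
import threading
import time


def run_cluster(name, create, delete, delay=1.0):
    """Create one cluster, then delete it, pausing before each step."""
    time.sleep(delay)
    create(name)
    time.sleep(delay)
    delete(name)


def stress(create, delete, count=5, delay=1.0):
    """Run `count` create/delete cycles concurrently; return the names used."""
    names = ["stress-%d" % i for i in range(count)]
    threads = [
        threading.Thread(target=run_cluster, args=(name, create, delete, delay))
        for name in names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return names
```

Deleting only one second after the create request was sent, as in the test described here, means the delete can arrive while the create action is still running, which is exactly the situation that exercises the locking code.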
Yanyanlet me make more tests09:42
Qimingokay09:42
Yanyanbut the API response becomes noticeably slow...09:43
Yanyanit takes about 2 seconds to get a response09:44
Yanyanah, this time, a cluster deletion failed although all its nodes had been deleted09:45
Qimingokay, is that a concurrency problem?09:48
QimingI'm afraid we have such problems when dealing with locks09:48
Yanyanhmm, guess so, but the second deletion succeeded09:48
Yanyanyes, also think so09:49
Yanyanguess it is caused by lock competition between cluster creating and deleting action09:49
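To make the suspected failure mode concrete: if a create action still holds the cluster lock when the delete action arrives, the delete cannot take the lock and fails, while a later attempt succeeds once the lock is released. The sketch below only illustrates that guess; `ClusterLocks` and `acquire_with_retry` are hypothetical, in-memory stand-ins and say nothing about how Senlin's locks are actually stored.

```python
import time


class ClusterLocks:
    """In-memory stand-in for a lock table keyed by cluster ID."""

    def __init__(self):
        self._owners = {}

    def acquire(self, cluster_id, action_id):
        """Take the lock if it is free or already ours; return True on success."""
        owner = self._owners.setdefault(cluster_id, action_id)
        return owner == action_id

    def release(self, cluster_id, action_id):
        """Release the lock only if `action_id` owns it."""
        if self._owners.get(cluster_id) == action_id:
            del self._owners[cluster_id]


def acquire_with_retry(locks, cluster_id, action_id, attempts=3, delay=0.5):
    """Try to take the lock a bounded number of times before giving up."""
    for attempt in range(attempts):
        if locks.acquire(cluster_id, action_id):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Whether a bounded retry is the right response, or whether the delete should instead be queued behind the create, is a design question the conversation leaves open.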
Qimingem09:50
QimingI'm feeling we have some hidden bug there09:50
Qimingthe action logics09:50
Yanyansince in the test, I deleted cluster just a second after the creating request was sent out09:50
Qimingthose APIs were written without a careful design and it was not thoroughly revised09:51
Yanyanem, need more tests here09:51
Yanyanoh, a question is should we focus on this issue before l-3 deadline?09:53
QimingAs I can recall, Zhai HF has done some tests there, he told me it is not stable somehow09:53
Qimingif it is an action api problem it should be solved asap09:53
Yanyanthat's true09:54
Yanyanso maybe we enable multiple engines by default, I think this can help us find problems09:54
Qimingright09:56
Yanyanem, will use multi-engine env for daily work09:59
Yanyanprepare to leave10:00
Yanyansee U guys tomorrow10:02
*** Yanyan has quit IRC10:07
*** elynn_ has quit IRC10:18
openstackgerritMerged stackforge/senlin: Revise cluster-scale-in/out default value  https://review.openstack.org/21394410:44
openstackgerritMerged stackforge/senlin: Use Senlin generic driver to manage ceilometer_v2 driver  https://review.openstack.org/21359310:45
*** LiuWei has quit IRC11:21
*** Qiming has quit IRC11:35
*** branw has joined #senlin11:53
*** lkarm has joined #senlin12:34
*** jdandrea has joined #senlin13:55
*** Qiming has joined #senlin14:42
*** Qiming has quit IRC16:07
*** lkarm has quit IRC16:58
*** lkarm has joined #senlin16:59
*** lkarm has quit IRC16:59
*** lkarm has joined #senlin17:00
*** lkarm has quit IRC17:09
*** lkarm has joined #senlin17:10
*** jdandrea has left #senlin19:24
*** jdandrea has joined #senlin19:24
*** lkarm has quit IRC19:46
*** lkarm has joined #senlin19:46
*** lkarm has quit IRC19:47
*** lkarm has joined #senlin19:47
*** lkarm has quit IRC21:26
*** lkarm has joined #senlin21:27
*** lkarm has quit IRC21:31
*** lkarm has joined #senlin21:52
*** lkarm has quit IRC21:56
*** lkarm has joined #senlin22:17
*** lkarm has quit IRC22:23
*** lkarm has joined #senlin22:23
*** lkarm has quit IRC22:23
*** lkarm has joined #senlin22:23
*** lkarm has quit IRC22:28
*** xuhaiwei_ has quit IRC23:32
