*** zhurong has joined #senlin | 00:48 | |
*** hanwei has joined #senlin | 01:11 | |
*** Drago2 has quit IRC | 01:32 | |
*** yanyanhu has joined #senlin | 01:44 | |
*** elynn has joined #senlin | 01:46 | |
*** zhurong has quit IRC | 01:49 | |
*** zhurong has joined #senlin | 01:51 | |
*** openstackgerrit has joined #senlin | 01:55 | |
*** ChanServ sets mode: +v openstackgerrit | 01:55 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/senlin-dashboard: Updated from global requirements https://review.openstack.org/422971 | 01:55 |
---|---|---|
openstackgerrit | Merged openstack/senlin: Fix various problems in doc tree https://review.openstack.org/422386 | 02:06 |
yanyanhu | hi, Qiming, around? | 02:08 |
yanyanhu | I have got the titles and abstracts of those two NFV-related proposals for the Boston summit from Xinhui and Haiwei. Will add them to the proposal review list of IBM | 02:08 |
yanyanhu | and there is the etherpad to track them | 02:09 |
yanyanhu | https://etherpad.openstack.org/p/senlin-boston-summit-proposal | 02:09 |
yanyanhu | hi, xuhaiwei_ , around? | 02:10 |
openstackgerrit | XueFeng Liu proposed openstack/senlin: Revise create action when do cluster check https://review.openstack.org/421615 | 02:15 |
Qiming | ruijie_, a failed action is a failed action | 02:21 |
Qiming | I don't think we should focus on actions | 02:22 |
Qiming | they are senlin internals | 02:22 |
Qiming | instead, we should let users do what they want to do on their clusters and nodes | 02:22 |
ruijie_ | Yes Qiming, since in that case, node1:error, node2:active | 02:22 |
ruijie_ | After we trigger cluster_recover, node1 will be recreated by default | 02:23 |
ruijie_ | and the desired_capacity:0 | 02:23 |
ruijie_ | that doesn't make sense .. | 02:23 |
Qiming | right, desired capacity was the user's intent | 02:23 |
Qiming | node1 is error, then user can delete it | 02:23 |
ruijie_ | Then it will be hard for the user to recover the cluster to the condition the user expected. | 02:25 |
Qiming | okay | 02:26 |
Qiming | there seems to be something we can improve | 02:26 |
Qiming | but never touch/operate the actions from API directly | 02:26 |
ruijie_ | Since the action is an internal obj, adding an action_id parameter to cluster_recover may not be a good approach? | 02:27 |
Qiming | suppose you have a failed action, then you try to resume/retry that action, this 'resume' or 'retry' can then become a new action that may fail ... | 02:27 |
Qiming | please think from a different angle if possible | 02:28 |
Qiming | we are treating actions as read-only from the user's perspective | 02:28 |
Qiming | once that door is opened, we will face a lot of conditions leading the system out of control ... | 02:29 |
ruijie_ | Agreed, Qiming, that's why I said we should mark the cluster with a special STATUS since that may need manual intervention | 02:29 |
openstackgerrit | Merged openstack/python-senlinclient: Add deprecation of cluster-run cli https://review.openstack.org/422498 | 02:29 |
Qiming | say you have a cluster A, desired_capacity is 0, but it still has two nodes, one is in ERROR, the other is in ACTIVE state | 02:30 |
Qiming | you are the user, what do you want to do next? | 02:30 |
ruijie_ | I hope to do the scale_in again to delete these 2 nodes .. | 02:31 |
Qiming | then do it | 02:31 |
Qiming | what is the blocker? | 02:31 |
ruijie_ | there will be a mess. I can do cluster_recover to recover the cluster if scale_out failed, since it will recreate the node by default, and I can mark the cluster as ERROR or something after cluster_recover failed. | 02:35 |
ruijie_ | but anyway, this could be a workaround to handle failure. | 02:37 |
ruijie_ | it's just that a scale_in failure will not be easy to handle | 02:37 |
Qiming | it is not a mess | 02:38 |
Qiming | the current design is much clearer than before | 02:38 |
Qiming | what is causing the problem is the cluster_recover operation, we can discuss that in a separate thread | 02:38 |
Qiming | the semantics of cluster_recover are still ambiguous, I agree | 02:39 |
Qiming | we can ignore that operation for the moment | 02:39 |
Qiming | back to the situation you have | 02:39 |
Qiming | one cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE | 02:40 |
Qiming | you can still do cluster-scale-in -c 2, right? | 02:40 |
Qiming | if I'm remembering this correctly, senlin will check if you have 2 nodes in the cluster, and do another scale-in operation | 02:41 |
Qiming | it will sort of "ignore" your previous desired_capacity | 02:41 |
ruijie_ | right Qiming | 02:42 |
Qiming | during the lifetime of a cluster, you will meet a lot of situations where "desired_capacity" doesn't match the actual capacity, failures may happen anywhere, any time | 02:42 |
Qiming | what senlin guarantees you is this: it will validate the operation against the actual state, not the desired state | 02:43 |
ruijie_ | scale itself is fine and clear, just I think we could find a way to handle some basic failures | 02:43 |
Qiming | even if senlin cannot successfully complete an operation, it will record your previous attempt into the "desired_capacity" after the operation | 02:44 |
Qiming | next time, when you want to do an operation, senlin is always providing you: 1. the fact; 2. your previous intention | 02:44 |
Qiming | that is enough for you to make a decision | 02:44 |
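The behaviour Qiming describes here (validate a request against the nodes that actually exist, then record the user's intent in desired_capacity even when the operation partly fails) can be sketched roughly as follows. This is only an illustration, not senlin code: the `Cluster` class, `scale_in` function and `delete_node` callback are hypothetical names, and raising an error when `count` exceeds the actual size is an assumption, since the log does not say how senlin handles that case.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a cluster record (not senlin's model)."""
    desired_capacity: int
    nodes: dict = field(default_factory=dict)  # node name -> status


def scale_in(cluster, count, delete_node):
    """Remove `count` nodes, validating against the actual node count.

    `delete_node(name)` is a caller-supplied callable returning True on
    success. Returns the names of nodes that could not be deleted.
    """
    actual = len(cluster.nodes)
    if count > actual:
        # Assumption: reject the request; the log does not specify this.
        raise ValueError("cannot remove %d nodes from %d" % (count, actual))
    # The previous desired_capacity is deliberately not consulted here.
    victims = list(cluster.nodes)[:count]
    failed = []
    for name in victims:
        if delete_node(name):
            del cluster.nodes[name]
        else:
            cluster.nodes[name] = "ERROR"
            failed.append(name)
    # Record the user's intent even if some deletions failed.
    cluster.desired_capacity = actual - count
    return failed
```

After a partial failure the caller can read both facts back: `cluster.nodes` shows what really exists, and `cluster.desired_capacity` shows what was last asked for.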
Qiming | if this is clear enough, we can discuss the cluster_recover operation | 02:45 |
ruijie_ | yes Qiming | 02:45 |
Qiming | currently, cluster_recover is very easily misinterpreted as doing all and everything to recover a cluster to a healthy status | 02:46 |
Qiming | the problem lies in the definition of "healthy" | 02:46 |
Qiming | back to your situation: one cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE | 02:46 |
Qiming | when you type cluster-recover, what is in your mind? | 02:46 |
Qiming | senlin doesn't know ... | 02:47 |
ruijie_ | do scale_in again or check the action triggered | 02:47 |
Qiming | the status of the cluster is: two nodes, one ERROR, one ACTIVE | 02:47 |
ruijie_ | but not a good idea since action is visible to user | 02:47 |
ruijie_ | invisible | 02:47 |
Qiming | when senlin gets the cluster_recover request, it can do one of two things: | 02:47 |
Qiming | 1. delete the two nodes, so that the cluster has 0 nodes; | 02:48 |
Qiming | 2. recover the two nodes, so that the cluster has 2 nodes | 02:48 |
Qiming | how would senlin make such a decision? | 02:48 |
Qiming | this is something we can improve | 02:48 |
ruijie_ | so, the desired_capacity is a point we can use? | 02:49 |
Qiming | right | 02:49 |
Qiming | exactly | 02:49 |
ruijie_ | since it represents the user's expected condition | 02:49 |
Qiming | put it differently, we may want senlin cluster-recover to "recover" a cluster to its current "desired_capacity" | 02:50 |
Qiming | as a different example, if a cluster has desired_capacity set to 5, but it only has 2 nodes currently, one ERROR, one ACTIVE | 02:51 |
Qiming | current behavior of cluster-recover only attempts to recover the ERROR node | 02:51 |
ruijie_ | I see, Qiming. Then we need to make sure the desired_capacity is operated correctly in all other actions | 02:51 |
Qiming | however, user's intent may be: 1) recover the error node if needed, 2) delete active node if needed, 3) create new nodes if needed | 02:52 |
Qiming | right, that is the dilemma facing us | 02:52 |
Qiming | this is worth a spec discussion in my opinion | 02:53 |
Qiming | it is not a simple bug fix as I see it | 02:53 |
ruijie_ | Agreed Qiming. | 02:54 |
Qiming | be cautious though | 02:55 |
Qiming | this work, once initiated, overlaps with the cluster-scale-in or cluster-scale-out logic | 02:56 |
Qiming | all scaling operations can be done this way: 1) cluster-resize --desired <new_value> <cluster> 2) cluster-recover if necessary | 02:57 |
Qiming | that means, you first set the new 'desired_capacity', then you use cluster-recover to enforce it, ... | 02:57 |
Qiming | cluster-recover == recover-bad-nodes + delete-extra-nodes + create-new-nodes | 02:58 |
Qiming | anyway, we need to spend more time on this for clarification | 02:59 |
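The decomposition proposed above (cluster-recover == recover-bad-nodes + delete-extra-nodes + create-new-nodes) amounts to reconciling the actual nodes against desired_capacity. A minimal sketch of one possible planning policy follows. It is not senlin's implementation: the function name `plan_recover` is hypothetical, and preferring to delete ERROR nodes once there are enough ACTIVE ones is an assumption, because the discussion explicitly leaves these semantics to a future spec.

```python
def plan_recover(nodes, desired_capacity):
    """Plan how to bring a cluster to `desired_capacity` healthy nodes.

    `nodes` maps node name -> status string. Returns a tuple
    (recover, delete, create): names to recover, names to delete, and
    the number of brand-new nodes to create.
    """
    good = [n for n, s in nodes.items() if s == "ACTIVE"]
    bad = [n for n, s in nodes.items() if s != "ACTIVE"]

    if len(good) >= desired_capacity:
        # Enough healthy nodes already: drop every bad node plus any
        # surplus healthy ones (assumed policy, see lead-in).
        return [], bad + good[desired_capacity:], 0

    missing = desired_capacity - len(good)
    recover = bad[:missing]          # repair as many bad nodes as needed
    delete = bad[missing:]           # bad nodes beyond what is needed
    create = missing - len(recover)  # whatever is still missing
    return recover, delete, create
```

With the two examples from the discussion (one ERROR node and one ACTIVE node), a desired_capacity of 0 plans the deletion of both nodes, while a desired_capacity of 5 plans one recovery and three creations.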
ruijie_ | yes Qiming, maybe we need to have a clear definition of "desired_capacity" | 03:02 |
Qiming | desired capacity is defined by its name: desired capacity | 03:03 |
Qiming | whenever you read the value from a cluster, you know, after all previous operations, what size the user wants the cluster to have | 03:04 |
Qiming | desired capacity is used to determine cluster status | 03:04 |
Qiming | you know, cluster status is not determined by the operations performed on it | 03:04 |
Qiming | users and we should be aware of a simple truth: desired capacity can be the reality, but in a cloud environment, it is very common for the values to differ | 03:05 |
ruijie_ | yes Qiming, will think about it | 03:07 |
Qiming | great | 03:07 |
openstackgerrit | Qiming Teng proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012 | 03:11 |
*** zhurong has quit IRC | 03:39 | |
openstackgerrit | Merged openstack/senlin-dashboard: Imported Translations from Zanata https://review.openstack.org/421683 | 03:55 |
openstackgerrit | Merged openstack/senlin-dashboard: Updated from global requirements https://review.openstack.org/422971 | 03:56 |
*** shu-mutou-AWAY is now known as shu-mutou | 04:03 | |
openstackgerrit | XueFeng Liu proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012 | 04:39 |
*** catinthe_ has quit IRC | 04:54 | |
*** catintheroof has joined #senlin | 05:02 | |
*** catintheroof has quit IRC | 05:14 | |
yanyanhu | hi, xuhaiwei_, are you around? | 05:16 |
*** eandersson_ has joined #senlin | 05:30 | |
*** eandersson__ has quit IRC | 05:33 | |
*** catintheroof has joined #senlin | 05:46 | |
*** zhurong has joined #senlin | 05:49 | |
openstackgerrit | Merged openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012 | 05:59 |
openstackgerrit | Merged openstack/python-senlinclient: Updated from global requirements https://review.openstack.org/420865 | 06:15 |
openstackgerrit | miaohb proposed openstack/senlin: Add db purge in senlin manage https://review.openstack.org/420666 | 06:43 |
*** XueFeng has quit IRC | 06:50 | |
*** zhurong has quit IRC | 07:03 | |
*** Jeffrey4l_ has quit IRC | 07:10 | |
*** XueFeng has joined #senlin | 07:19 | |
xuhaiwei_ | yanyanhu: sorry , just saw it now | 07:48 |
yanyanhu | hi, xuhaiwei_ no problem | 07:50 |
yanyanhu | just want to discuss with you about the proposal title | 07:50 |
yanyanhu | this one | 07:50 |
yanyanhu | https://etherpad.openstack.org/p/boston_summit_proposal_senlin_tacker | 07:50 |
yanyanhu | I feel the current name is not that striking :) | 07:51 |
yanyanhu | and also it doesn't mention Tacker, which is an important foundation of that topic | 07:52 |
yanyanhu | maybe we can consider changing it to something like "integrating Senlin and Tacker for scalable and highly available VDU management"? | 07:53 |
yanyanhu | NFV is a term with too broad a scope... | 07:53 |
yanyanhu | I guess being more specific could be helpful to attract audience :) | 07:55 |
xuhaiwei_ | yanyanhu: yes, the title is not good IMO :) | 07:57 |
yanyanhu | yep, if "Tacker" does not appear in the title, it is not appropriate, since this is expected to be a joint proposal from Senlin and Tacker :) | 07:58 |
xuhaiwei_ | yanyanhu: My manager also mentioned this point, he thought the title should not contain project names in it | 07:59 |
yanyanhu | xuhaiwei_, that is also ok I think | 07:59 |
yanyanhu | if we mention neither tacker nor senlin | 08:00 |
xuhaiwei_ | yea | 08:00 |
yanyanhu | if so, the title can be more specific about the issue we want to address | 08:00 |
yanyanhu | it is hard to attract people if the name is too generic :) | 08:01 |
yanyanhu | NFV has been talked about for years | 08:01 |
xuhaiwei_ | another problem is that we should stress why senlin is used for the vnf scaling, that means how senlin meets nfv's use case, it seems vnf's scaling is not that easy, there should be many specific configurations | 08:02 |
yanyanhu | so we just focus on the specific issue we are about to address or the features we provide | 08:02 |
yanyanhu | xuhaiwei_, yes absolutely | 08:02 |
yanyanhu | that should be the most important part of the presentation | 08:03 |
yanyanhu | why we use senlin to support VDU scaling(pool management) | 08:03 |
xuhaiwei_ | so I think we should figure out how to start a VNF first, to see what configurations are needed there | 08:03 |
yanyanhu | what is the benefit and what is the challenge | 08:03 |
yanyanhu | xuhaiwei_, yes. We can just put the implementation details aside for a while | 08:04 |
Qiming | VDU is only for NFV guys ... but it may be fine | 08:04 |
yanyanhu | and figure it out at a higher design level | 08:04 |
Qiming | mentioning "senlin" or "tacker" in the title is not gonna help attract more audience | 08:05 |
yanyanhu | Qiming, thanks. please just correct me if I use it incorrectly :) | 08:05 |
xuhaiwei_ | yanyanhu: another question about senlin, does senlin support a policy that can be triggered by time? for example 18:00 every day, scale out a vm? | 08:05 |
yanyanhu | actually I'm not very clear about the difference between those terms :) | 08:05 |
Qiming | and we have already passed the stage to raise the community's awareness of this project or others | 08:05 |
*** eandersson_ has quit IRC | 08:06 | |
yanyanhu | xuhaiwei_, we can if it is needed | 08:06 |
*** eandersson_ has joined #senlin | 08:06 | |
Qiming | xuhaiwei_, you can create a cron tab entry easily for those requirements | 08:06 |
yanyanhu | actually you can implement any policy if it matches the interface senlin defines for policy plugin | 08:06 |
Qiming | it is out of senlin's current scope I think | 08:06 |
yanyanhu | that why I said we can put the implementation detail aside in current stage | 08:07 |
yanyanhu | that's | 08:07 |
Qiming | the sky is your only limit for how to trigger a senlin operation | 08:07 |
xuhaiwei_ | just saw some articles which said some vnfs need to be scaled out due to time changes | 08:07 |
Qiming | adding a new cron policy to trigger certain actions at a specific point in time | 08:08 |
Qiming | that is doable too | 08:08 |
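For the "18:00 every day" case asked about above, an external cron tab entry is the route Qiming suggests. If the trigger were written as code instead (for example inside a hypothetical cron-style policy plugin), the core would be computing the next run time. The sketch below shows only that step; the function name `next_daily_run` is an assumption for illustration, and the call that actually scales the cluster is left to the caller.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now, at=time(18, 0)):
    """Return the next datetime strictly after `now` at wall-clock time `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now): use tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

A scheduler loop would sleep until the returned time, invoke whatever scale-out operation the user has chosen, and then call the function again for the following day.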
xuhaiwei_ | got it | 08:08 |
xuhaiwei_ | currently if we want to install some applications into vms, we have to use software_config or just user_data? | 08:10 |
xuhaiwei_ | I mean make the whole story automatic | 08:10 |
ruijie_ | Hi all, just drafted a topic about our use case, hope to hear your opinions :) | 08:16 |
ruijie_ | https://etherpad.openstack.org/p/senlin-dtdream-use-case | 08:16 |
Qiming | xuhaiwei_, you can do both ways, depending on which profile you are using | 08:17 |
Qiming | there is no single process that fits all usage scenarios I think | 08:17 |
xuhaiwei_ | Qiming, got it, thanks | 08:17 |
Qiming | np | 08:17 |
Qiming | be cautious when using software config | 08:18 |
Qiming | it is very difficult to debug ... | 08:18 |
xuhaiwei_ | ok | 08:18 |
Qiming | you will need a vm image that has all the required agents installed | 08:19 |
xuhaiwei_ | the required agents are? | 08:20 |
Qiming | those agents will write data here and there, /var/log, /var/lib, /var/run ... | 08:20 |
Qiming | ... | 08:20 |
Qiming | os-collect-config, os-refresh-config, heat-config, heat-script-config, ... etc | 08:21 |
xuhaiwei_ | ohhh | 08:21 |
xuhaiwei_ | ok, where can I have it? | 08:21 |
xuhaiwei_ | you made it by diskimage-builder? | 08:21 |
Qiming | http://git.openstack.org/cgit/openstack/heat-templates/tree/hot/software-config/elements/README.rst | 08:22 |
Qiming | all those things have to be installed in order to have software config and software deployment work | 08:23 |
Qiming | most of the software has now been migrated into a new repo here: http://git.openstack.org/cgit/openstack/heat-agents/ | 08:23 |
xuhaiwei_ | I used to create this kind of image once | 08:23 |
Qiming | I was a fan of software config | 08:24 |
Qiming | after playing it for a while | 08:24 |
Qiming | I gave up | 08:24 |
xuhaiwei_ | why | 08:24 |
Qiming | difficult to debug | 08:24 |
Qiming | and ... useless | 08:25 |
Qiming | there are many twisted parameters to tune | 08:25 |
Qiming | software_config_transport for example | 08:25 |
xuhaiwei_ | I think it can make the heat template clean at least :) | 08:26 |
Qiming | okay, if you think that way | 08:26 |
xuhaiwei_ | honestly I am not using it too much | 08:26 |
Qiming | what if you want to change a parameter to your software config? | 08:26 |
Qiming | and try making a mistake by typing $(SOMETHING) instead of ${SOMETHING} ... | 08:27 |
Qiming | it will take you a week to find where things go wrong | 08:28 |
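The typo mentioned above is hard to spot because both forms are valid Bash: ${SOMETHING} expands a variable, while $(SOMETHING) runs SOMETHING as a command and substitutes its output. A small pre-flight check can flag the suspicious form before a script is handed to software config. The sketch below is a heuristic only and is not part of heat or senlin; the name `find_suspect_substitutions` is hypothetical, and it assumes variables are written in upper case while real commands are not.

```python
import re

# $(NAME) where NAME looks like an upper-case variable name rather than
# a command with arguments (heuristic assumption, see lead-in).
SUSPECT = re.compile(r"\$\(([A-Z_][A-Z0-9_]*)\)")


def find_suspect_substitutions(script):
    """Return (line_number, name) pairs where $(NAME) may be a mistyped ${NAME}."""
    hits = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Ordinary command substitutions such as `$(date +%s)` are not reported, since they do not match the upper-case-name pattern.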
xuhaiwei_ | I'd better not try | 08:28 |
Qiming | you have to understand the whole workflow ... read all the agents' source code to figure out where debug output is written | 08:29 |
xuhaiwei_ | yes, that's not good | 08:29 |
Qiming | bookmark this link: http://git.openstack.org/cgit/openstack/heat-agents/tree/heat-config/os-refresh-config/configure.d/55-heat-config | 08:30 |
Qiming | you will need it quickly | 08:30 |
xuhaiwei_ | done | 08:31 |
Qiming | also this: http://git.openstack.org/cgit/openstack/heat-agents/tree/heat-config-script/install.d/hook-script.py | 08:32 |
Qiming | if you are doing things using Bash scripts | 08:32 |
xuhaiwei_ | got it | 08:33 |
Qiming | this is the code you will check if software deployment is not found in the VM: http://git.openstack.org/cgit/openstack/os-collect-config/tree/os_collect_config/collect.py | 08:34 |
xuhaiwei_ | ok | 08:35 |
xuhaiwei_ | I used to debug os-xxx-config a little | 08:35 |
xuhaiwei_ | in fact, currently I am doing something terrible enough: installing an openstack environment with ironic on a NEC server and NEC switch in a VLAN network | 08:37 |
xuhaiwei_ | now I know why hardware vendors want to contribute to openstack | 08:39 |
Qiming | :) | 08:48 |
Qiming | good experiences | 08:48 |
xuhaiwei_ | by the way, happy Chinese xiaonian to all, let's go home early today! | 08:51 |
ruijie_ | haha, my stage name is xiaonian :) | 08:52 |
xuhaiwei_ | what is stage name? ruijie_ | 08:54 |
ruijie_ | oh, the company asks everyone to have a stage name (nickname) | 08:55 |
ruijie_ | then we call each other's nickname :) | 08:55 |
xuhaiwei_ | got it :) | 08:56 |
xuhaiwei_ | today is special to you | 08:57 |
ruijie_ | haha :) | 08:58 |
*** openstackgerrit has quit IRC | 09:02 | |
yanyanhu | xuhaiwei_, thanks, you too :) | 09:03 |
yanyanhu | remember to eat dumplings | 09:03 |
yanyanhu | ruijie_, nice name, haha | 09:04 |
xuhaiwei_ | today my wife and I will eat malatang :) | 09:04 |
yanyanhu | cool | 09:05 |
yanyanhu | haha | 09:05 |
xuhaiwei_ | ma la tang | 09:05 |
xuhaiwei_ | leaving now, see u | 09:05 |
yanyanhu | see u | 09:05 |
ruijie_ | I used a dictionary to look up malatang :) | 09:09 |
yanyanhu | :P | 09:15 |
*** yanyanhu has quit IRC | 09:19 | |
*** openstackgerrit has joined #senlin | 09:22 | |
*** ChanServ sets mode: +v openstackgerrit | 09:22 | |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4 https://review.openstack.org/423139 | 09:22 |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin: revise tempest api test for cluster 4 https://review.openstack.org/423140 | 09:22 |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4 https://review.openstack.org/423139 | 09:32 |
openstackgerrit | Qiming Teng proposed openstack/senlin: Add developer doc for health policy https://review.openstack.org/423174 | 10:16 |
Qiming | XueFeng, online? | 10:37 |
openstackgerrit | Qiming Teng proposed openstack/senlin: Add developer doc for health policy https://review.openstack.org/423174 | 10:37 |
*** hanwei has quit IRC | 10:39 | |
*** Jeffrey4l has joined #senlin | 10:40 | |
*** lixinhui has quit IRC | 10:43 | |
openstackgerrit | Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209 | 11:05 |
openstackgerrit | Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209 | 11:06 |
openstackgerrit | Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209 | 11:23 |
*** hanwei has joined #senlin | 11:49 | |
*** hanwei_ has joined #senlin | 11:56 | |
*** hanwei has quit IRC | 11:58 | |
*** hanwei has joined #senlin | 11:59 | |
*** hanwei_ has quit IRC | 12:03 | |
*** wllabs has quit IRC | 12:09 | |
*** catinthe_ has joined #senlin | 12:32 | |
*** catintheroof has quit IRC | 12:33 | |
*** fabian4 has quit IRC | 12:37 | |
*** Jeffrey4l_ has joined #senlin | 13:00 | |
*** Jeffrey4l has quit IRC | 13:04 | |
XueFeng | hi,QiMing | 13:24 |
*** catintheroof has joined #senlin | 14:33 | |
*** catinthe_ has quit IRC | 14:36 | |
*** elynn has quit IRC | 14:54 | |
*** Drago1 has joined #senlin | 16:18 | |
*** Drago1 has quit IRC | 18:32 | |
*** Drago1 has joined #senlin | 18:43 | |
*** catinthe_ has joined #senlin | 19:50 | |
*** catintheroof has quit IRC | 19:53 | |
*** catinthe_ has quit IRC | 21:02 | |
*** catintheroof has joined #senlin | 21:03 | |
*** catintheroof has quit IRC | 21:07 | |
*** Jeffrey4l__ has joined #senlin | 21:35 | |
*** Jeffrey4l_ has quit IRC | 21:35 |