*** catintheroof has quit IRC | 00:19 | |
*** zhurong has joined #senlin | 00:55 | |
*** shu-mutou-AWAY has quit IRC | 01:19 | |
*** yanyanhu has joined #senlin | 01:26 | |
openstackgerrit | XueFeng Liu proposed openstack/senlin master: Add prepare and cleanup to all needed test cases https://review.openstack.org/455331 | 01:53 |
---|---|---|
*** dixiaoli has joined #senlin | 01:54 | |
*** dixiaoli has quit IRC | 01:58 | |
*** dixiaoli has joined #senlin | 02:02 | |
openstackgerrit | XueFeng Liu proposed openstack/senlin master: Add prepare and cleanup for nova server local tempest https://review.openstack.org/455325 | 02:07 |
Qiming | XueFeng, there? | 02:08 |
*** elynn has joined #senlin | 02:08 | |
*** catintheroof has joined #senlin | 02:16 | |
*** shu-mutou has joined #senlin | 02:21 | |
*** catintheroof has quit IRC | 02:34 | |
*** catintheroof has joined #senlin | 02:35 | |
root____5 | hi, QiMing | 02:39 |
XueFeng | hi,QiMing | 02:39 |
Qiming | Hi | 02:40 |
*** catintheroof has quit IRC | 02:40 | |
Qiming | I'm confused by your patches to the tempest/common/utils | 02:40 |
Qiming | and the big patch that you are making to the tempest test cases | 02:41 |
Qiming | the changes about nova key pair and subnet are only relevant to integration tests, right? | 02:42 |
Qiming | because in api and functional tests, we use openstack_test backend, not the real openstack backend | 02:43 |
Qiming | there is no need to create keypairs or network/subnet for most of the api/functional tests | 02:43 |
root____5 | hi, Qiming | 02:50 |
XueFeng | hi,QiMing | 02:50 |
XueFeng | In a local tempest run | 02:50 |
XueFeng | We need to create these for each dynamic user/project | 02:51 |
XueFeng | tempest runs with dynamic credentials by default | 02:51 |
XueFeng | You can do a test. cd /opt/stack/tempest, then run `nosetests -v senlin` | 02:53 |
Qiming | did you set cloud_backend = openstack_test in your senlin.conf before running api tests? | 02:56 |
Qiming | or functional tests? | 02:56 |
XueFeng | Oh, not set. | 02:57 |
XueFeng | Let me do a test | 02:57 |
*** dixiaoli has quit IRC | 03:00 | |
*** dixiaoli has joined #senlin | 03:01 | |
root____5 | QiMing, it's ok | 03:10 |
Qiming | that new logic is useful, however, for integration tests where we use real openstack drivers | 03:11 |
Qiming | so please consider making it specific to integration tests | 03:11 |
Qiming | instead of polluting the api and functional tests | 03:11 |
XueFeng | ok, got it. | 03:12 |
XueFeng | Will remove the change for api and functional tests | 03:12 |
Qiming | okay | 03:12 |
XueFeng | Thanks:) | 03:13 |
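
The backend check Qiming asks about can be sketched as a small script. This is a minimal sketch, not part of senlin: it assumes `cloud_backend` lives in the `[DEFAULT]` section of senlin.conf and that the fallback value is `openstack`; `uses_test_backend` and `SAMPLE_CONF` are hypothetical names used only for illustration.

```python
import configparser

# Sample senlin.conf content; the [DEFAULT] placement is an assumption.
SAMPLE_CONF = """
[DEFAULT]
# fake driver backend, intended for api and functional tests
cloud_backend = openstack_test
"""


def uses_test_backend(conf_text):
    """Return True if the config text selects the openstack_test backend."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumed fallback: the real openstack backend when the option is absent.
    backend = parser["DEFAULT"].get("cloud_backend", fallback="openstack")
    return backend == "openstack_test"


if __name__ == "__main__":
    print(uses_test_backend(SAMPLE_CONF))
```

With the test backend selected, api and functional tests should not need real keypairs or networks, which is why that setup logic fits the integration tests only.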
openstackgerrit | yangyide proposed openstack/senlin master: Improve node check for health_policy_poll https://review.openstack.org/455503 | 03:52 |
*** dixiaoli has quit IRC | 04:17 | |
*** dixiaoli has joined #senlin | 04:19 | |
*** dixiaoli has quit IRC | 04:23 | |
openstackgerrit | yangyide proposed openstack/senlin master: Improve node check for health_policy_poll https://review.openstack.org/455503 | 05:05 |
*** dixiaoli has joined #senlin | 05:26 | |
ruijie | hi Qiming, online? | 05:52 |
Qiming | yes | 06:00 |
ruijie | https://blueprints.launchpad.net/senlin/+spec/scaling-action-support-health-check | 06:01 |
ruijie | about this bp, I got 2 ideas, not sure which one is better | 06:01 |
ruijie | 1. do health check before scaling. 2. do health check after scaling | 06:01 |
ruijie | since the process is _execute() -> pre_op() -> do_xyz() -> post_op() | 06:02 |
*** elynn has quit IRC | 06:03 | |
ruijie | if we want to do health check before scaling, we need to sync resource status in pre_op() in the deletion policy if one exists, or in do_xyz() if no deletion policy exists | 06:03 |
ruijie | it might be heavyweight to sync resource status in the deletion policy since it is single-threaded | 06:04 |
ruijie | the second option is to do the health check after scaling, which means we don't care about the status of resources for the current action, we do it synchronized with a number of node_check actions. | 06:05 |
Qiming | got your points | 06:06 |
Qiming | so ... the question is ... why do we want a health check when doing scaling? | 06:06 |
ruijie | for scale in action, we will choose bad nodes first .. | 06:07 |
ruijie | this is the only question | 06:07 |
Qiming | I'm taking a step back to revisit the reason why we are doing this | 06:07 |
Qiming | we want to make sure that the scaling operation is doing things right, correct? | 06:08 |
ruijie | yes Qiming | 06:08 |
Qiming | then the choice is evident | 06:08 |
Qiming | it is for sure that introducing an extra step before scaling would mean some overhead | 06:09 |
Qiming | that was the reason we wanted to make 'health_check' optional | 06:10 |
ruijie | so .. we do it before we choose candidates | 06:10 |
Qiming | agreed | 06:11 |
Qiming | this extra checking will be meaningless if we append it to the end of a scaling operation | 06:11 |
ruijie | yes Qiming, but I am not sure if we need this process for scale out too | 06:12 |
Qiming | in an ideal case, senlin will know the EXACT statuses of all cluster nodes before doing scaling | 06:13 |
Qiming | whether we want to do health check before scaling out can be left for future decision | 06:14 |
Qiming | now, let's focus on getting node status checked before *scaling in* | 06:15 |
ruijie | okay, Qiming, that makes sense :) | 06:15 |
Qiming | the question becomes how do we implement it | 06:15 |
Qiming | as you have mentioned, there are two places where you will need to invoke the health check | 06:16 |
Qiming | one in the do_scale_in() (or do_resize()) action, the other is in the deletion policy | 06:17 |
Qiming | anywhere else? | 06:17 |
ruijie | no Qiming | 06:17 |
Qiming | any impact on other policies such as LB, cross-az placement, cross-region placement? | 06:18 |
ruijie | np, we can use engine_node.do_check() to do this work | 06:18 |
Qiming | so you mean we are not forking new child actions for this | 06:19 |
ruijie | in do_scale_in() we can actually use node_action to do it, but in deletion policy that is not easy | 06:20 |
Qiming | okay. We are not supposed to create actions in a policy | 06:20 |
ruijie | yes Qiming | 06:21 |
Qiming | then we don't have a lot of choices | 06:21 |
ruijie | and we may need to handle nodes whose status is WARNING since we introduced it for nodes that failed to be added to the lb pool | 06:21 |
Qiming | okay, that makes sense | 06:22 |
ruijie | thanks Qiming, much clearer now :) | 06:24 |
Qiming | thanks for bringing these up | 06:24 |
ruijie | np :) | 06:24 |
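
The outcome of this discussion (refresh node status first, then choose scale-in candidates so that bad nodes go first) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Node`, `STATUS_RANK`, `refresh_status`, and `pick_scale_in_candidates` are hypothetical names, the `probe` callable stands in for whatever `engine_node.do_check()` does, and the ERROR, WARNING, ACTIVE ordering is an assumed priority rather than senlin's actual rule.

```python
from dataclasses import dataclass

# Lower rank is removed earlier; this ordering is an assumption.
STATUS_RANK = {"ERROR": 0, "WARNING": 1, "ACTIVE": 2}


@dataclass
class Node:
    name: str
    status: str


def refresh_status(node, probe):
    """Update the recorded status from a health probe (hypothetical stand-in)."""
    node.status = probe(node)
    return node


def pick_scale_in_candidates(nodes, count, probe):
    """Check every node first, then pick the `count` least healthy ones."""
    checked = [refresh_status(n, probe) for n in nodes]
    # sorted() is stable, so nodes with equal status keep their original order.
    ranked = sorted(checked, key=lambda n: STATUS_RANK.get(n.status, 0))
    return ranked[:count]


if __name__ == "__main__":
    real = {"n1": "ACTIVE", "n2": "ERROR", "n3": "WARNING"}
    cluster = [Node(name, "ACTIVE") for name in real]
    victims = pick_scale_in_candidates(cluster, 2, lambda n: real[n.name])
    print([n.name for n in victims])
```

Checking before selection is what makes the extra step meaningful: a check appended after the scaling operation could not influence which nodes were removed.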
yuanbin | ruijie, the health policy action recreate, I don't use this action? | 06:26 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/senlin-dashboard master: Imported Translations from Zanata https://review.openstack.org/455535 | 06:34 |
ruijie | em, yuanbin, you mean for cluster_recover? | 06:41 |
yuanbin | ruijie, yes | 06:42 |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin master: handle node which status is WARNING https://review.openstack.org/455542 | 06:42 |
ruijie | that's 2 scenarios | 06:44 |
ruijie | for the health policy, we can monitor the node by polling the status of the node or listening to rabbit, then we do the action to recover it, by recreating or rebuilding, etc. | 06:45 |
ruijie | doing a health check in a scaling action is another scenario, we may want to know the exact status of a node before we do scale in or scale out, because we are supposed to remove bad nodes first | 06:47 |
yuanbin | ruijie, yes, with rebuilding I have some problems, my instance uses a cinder volume to boot, nova rebuild doesn't support a cinder volume as root device | 06:48 |
yuanbin | ruijie, remove bad nodes, create new nodes from profile, I know. | 06:49 |
ruijie | oh okay :) I am not very familiar with nova .. | 06:50 |
yuanbin | ruijie, I have a question, when I use cluster-update and the new profile uses cinder boot, the cluster-update executing image_update will fail | 06:51 |
yuanbin | ruijie, so, when I use cluster-update to change the image, I want to execute the recreate action, but recreate is not implemented | 06:52 |
ruijie | https://git.openstack.org/cgit/openstack/senlin/tree/senlin/profiles/os/nova/server.py#n930 | 06:53 |
ruijie | I think Qiming may know this better :) | 06:54 |
Qiming | sounds to me that you are not simply updating the image of a server | 06:57 |
Qiming | you are actually updating its block device mapping | 06:57 |
Qiming | I'm not 100% sure nova supports such an operation | 06:57 |
Qiming | even if nova does, senlin has not yet supported it. | 06:58 |
yuanbin | Qiming, yes, if the instance boots from a cinder volume with a ceph backend, rebuild would have to update the cinder volume | 06:58 |
Qiming | please read the do_update logic as ruijie pointed out above and feel free to propose a fix | 06:59 |
Qiming | I don't have a cinder env or a ceph deployment to experiment with this | 06:59 |
Qiming | I'm more than happy to learn things in this space though | 07:00 |
yuanbin | Qiming, I know the senlin update logic, I only have a small question about update_image | 07:02 |
Qiming | okay? | 07:03 |
Qiming | did you mean question or problem? | 07:04 |
Qiming | there are many different scenarios to consider in the do_update workflow, we tried to optimize it so that we won't reboot/rebuild the server twice if we can solve the problem in one reboot | 07:05 |
Qiming | however, I cannot assure you that the root device changes have been considered | 07:06 |
Qiming | that is the reason I think we need your help on fixing it | 07:06 |
yuanbin | Qiming, senlin calls rebuild, and if my profile uses block_device_mapping_v2 to create the boot volume, nova rebuild does not succeed in my boot volume rebuild test | 07:13 |
yuanbin | Qiming, this is a nova rebuild problem | 07:14 |
Qiming | alright | 07:14 |
Qiming | any workaround we can think of? | 07:14 |
Qiming | for example, some other nova apis we can leverage? | 07:14 |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin master: handle node which status is WARNING https://review.openstack.org/455542 | 07:14 |
Qiming | yuanbin, https://developer.openstack.org/api-ref/compute/#servers-run-an-action-servers-action | 07:15 |
yuanbin | Qiming, I want to change senlin from calling nova rebuild to recreate: delete the existing nodes of the old profile, and use the new profile to create nodes. | 07:17 |
Qiming | that makes sense | 07:17 |
Qiming | since we can either do rebuild or recreate for this purpose | 07:17 |
Qiming | how are we gonna make the decision to avoid breaking existing users? | 07:18 |
yuanbin | Qiming, nova will support detaching the root device, then using cinder to create a new volume and attach it to the instance, to implement root device update | 07:18 |
Qiming | if ("block_device_mapping_v2" is used) then we do recreate; else we do rebuild? | 07:19 |
yuanbin | Qiming, yes | 07:20 |
Qiming | or ... we can check the boot_index property for deciding this? | 07:20 |
Qiming | if bdmv2 is not used for root device, recreate would be an over-kill, right? | 07:20 |
yuanbin | yes | 07:21 |
Qiming | okay, this means we have a decent workaround | 07:21 |
Qiming | we don't blindly recreate, but we do recreate when unavoidable | 07:22 |
Qiming | recreate is harmful ... because the instance id will be changed, the IP address allocated will be changed ... | 07:22 |
Qiming | but ... if recreate is the only way out, we should do it | 07:23 |
yuanbin | I know, if we recreate, the IP will change, the others (security group, password, ...) can probably be configured the same | 07:24 |
Qiming | yes | 07:41 |
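
The workaround agreed on above (recreate only when the root device comes from `block_device_mapping_v2`, rebuild otherwise) might look something like this. It is a sketch, not the senlin implementation: `boots_from_volume` and `image_update_strategy` are hypothetical helpers, and the properties dict is assumed to mirror a nova server profile spec where `boot_index` 0 marks the boot device.

```python
def boots_from_volume(properties):
    """Return True if any bdmv2 entry is the boot device (boot_index == 0)."""
    for bdm in properties.get("block_device_mapping_v2") or []:
        if bdm.get("boot_index") == 0:
            return True
    return False


def image_update_strategy(properties):
    """Pick 'recreate' only when rebuild cannot work; otherwise 'rebuild'.

    Recreate changes the instance id and the allocated IP address, so it is
    chosen only when the root device is a volume.
    """
    return "recreate" if boots_from_volume(properties) else "rebuild"


if __name__ == "__main__":
    volume_boot = {
        "block_device_mapping_v2": [
            {"boot_index": 0, "source_type": "image",
             "destination_type": "volume", "volume_size": 10},
        ],
    }
    image_boot = {"image": "cirros"}
    print(image_update_strategy(volume_boot))
    print(image_update_strategy(image_boot))
```

Checking `boot_index` rather than the mere presence of `block_device_mapping_v2` keeps existing users on the rebuild path when the mapping only describes extra data volumes.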
openstackgerrit | chenyb4 proposed openstack/senlin master: fix node do_check invalid code https://review.openstack.org/455575 | 08:09 |
openstackgerrit | RUIJIE YUAN proposed openstack/senlin master: handle node which status is WARNING https://review.openstack.org/455542 | 08:25 |
*** elynn has joined #senlin | 08:38 | |
*** openstackgerrit has quit IRC | 09:03 | |
*** zhurong has quit IRC | 10:10 | |
*** zhurong has joined #senlin | 10:11 | |
*** ruijie has quit IRC | 10:13 | |
*** elynn has quit IRC | 10:13 | |
*** yanyanhu has quit IRC | 10:35 | |
*** dixiaoli has quit IRC | 11:21 | |
*** dixiaoli has joined #senlin | 11:21 | |
*** dixiaoli has quit IRC | 11:25 | |
*** XueFeng has quit IRC | 12:04 | |
*** root____5 has quit IRC | 12:04 | |
*** shu-mutou is now known as shu-mutou-AWAY | 12:27 | |
*** zhurong has quit IRC | 12:54 | |
*** yanyanhu has joined #senlin | 12:56 | |
*** XueFengLiu has joined #senlin | 13:00 | |
*** yanyanhu has quit IRC | 13:58 | |
*** testtest has joined #senlin | 13:59 | |
*** testtest has quit IRC | 14:00 | |
*** testtest has joined #senlin | 14:03 | |
*** testtest has quit IRC | 14:06 | |
*** XueFengLiu has quit IRC | 14:06 | |
-openstackstatus- NOTICE: latest base images have mistakenly put python3 in some places expecting python2 causing widespread failure of docs patches - fixes are underway | 14:28 | |
-openstackstatus- NOTICE: we have rolled back centos-7, fedora-25 and ubuntu-xenial images to the previous days release. Feel free to recheck your jobs now. | 14:48 | |
*** catintheroof has joined #senlin | 21:19 | |
*** catintheroof has quit IRC | 21:21 | |
*** catintheroof has joined #senlin | 21:21 | |
*** dixiaoli has joined #senlin | 22:43 | |
*** catintheroof has quit IRC | 23:29 | |
*** catintheroof has joined #senlin | 23:30 | |
*** catintheroof has quit IRC | 23:30 | |
*** dixiaoli has quit IRC | 23:36 |