*** wanghao has joined #openstack-mogan | 00:01 | |
openstackgerrit | wanghao proposed openstack/mogan-specs master: No need flavor and network in managing server https://review.openstack.org/494980 | 00:39 |
wanghao | morning mogan! | 00:41 |
zhenguo | morning! | 00:47 |
*** wanghao has quit IRC | 00:50 | |
*** litao__ has joined #openstack-mogan | 01:06 | |
openstackgerrit | Tao Li proposed openstack/mogan master: Rollback to the original status when server powering action failed https://review.openstack.org/476862 | 01:31 |
*** wanghao has joined #openstack-mogan | 01:44 | |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 01:56 |
liusheng | zhenguo: please see my reply inline: https://review.openstack.org/#/c/496151 | 02:14 |
zhenguo | liusheng: ok | 02:14 |
zhenguo | liusheng: see my replies | 02:43 |
openstackgerrit | wanghao proposed openstack/mogan master: Introduce Cinder client into Mogan https://review.openstack.org/489455 | 02:46 |
zhenguo | liusheng: seems for affinity with non-members it's hard to select an affinity_zone with enough nodes | 02:48 |
zhenguo | liusheng: and for anti_affinity we need to get nodes all from different affinity_zones, also hard | 02:49 |
zhenguo | liusheng: let me think more about this | 02:49 |
liusheng | zhenguo: thanks :), I have left replies. hah | 02:51 |
zhenguo | liusheng: do we have an affinity_zone list with server group object? | 02:51 |
liusheng | zhenguo: no | 02:52 |
zhenguo | liusheng: I remember you mentioned that you will add that | 02:52 |
liusheng | zhenguo: as we discussed last time, yes, but it is calculated on the fly from the members of a server group, like how Nova does it | 02:53 |
zhenguo | liusheng: ok | 02:54 |
liusheng | zhenguo: the advantage is we don't need to maintain the relationship between the affinity zone and the server group | 02:54 |
liusheng | zhenguo: we just need to care about the members | 02:54 |
zhenguo | liusheng: seems I need some time to go through your whole logic again, lol | 02:55 |
liusheng | zhenguo: hah, thanks, I also thought about it a few times; maybe the current implementation is not very efficient, any suggestion from you is appreciated :D | 02:56 |
zhenguo | liusheng: hah | 02:57 |
zhenguo | liusheng: seems on every scheduling you will fetch the whole affinity zone to nodes map and construct a dict with all of them, it's really not efficient | 03:06 |
liusheng | zhenguo: yes.., maybe it is better to implement a cache mechanism, but it seems hard to keep it updated | 03:10 |
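For reference, a minimal sketch of the per-request map building being discussed here, assuming a hypothetical report client with `list_aggregates()` and `get_nodes_in_aggregate()` helpers (not Mogan's actual interfaces):

```python
from collections import defaultdict


def build_affinity_zone_map(report_client):
    """Rebuild the affinity_zone -> nodes map for a single scheduling request.

    This is the naive approach discussed above: list every aggregate,
    then list the nodes of each one, so the cost grows with the number
    of aggregates on every request.
    """
    zone_to_nodes = defaultdict(list)
    for agg in report_client.list_aggregates():                 # hypothetical helper
        zone = agg.metadata.get('affinity_zone')                 # zone lives in aggregate metadata
        if not zone:
            continue
        nodes = report_client.get_nodes_in_aggregate(agg.uuid)  # hypothetical helper
        zone_to_nodes[zone].extend(nodes)
    return dict(zone_to_nodes)
```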
zhenguo | liusheng: yes, can we leverage placement? | 03:10 |
liusheng | zhenguo: admins may set the affinity zone for an aggregate from the API | 03:11 |
zhenguo | liusheng: yes | 03:11 |
zhenguo | liusheng: a node can only be in one affinity_zone, so if we refactor to change it like what I suggested, what's the main problem? | 03:13 |
zhenguo | liusheng: seems just for server_group without any member, right? | 03:15 |
zhenguo | liusheng: seems anti-affinity also has problems | 03:15 |
liusheng | zhenguo: maybe not only that; for example, for the affinity policy, if we request 3 servers in one request, we may be able to select 4 available nodes, but if the 4 available nodes are distributed across 4 affinity zones, we still cannot create the servers | 03:17 |
zhenguo | liusheng: if there's a member in the server group, we can avoid that | 03:17 |
liusheng | zhenguo: oh, yes | 03:17 |
zhenguo | liusheng: so the main problem for us is how to make sure the affinity zone has enough nodes | 03:18 |
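The 3-servers/4-zones example above boils down to a simple check; a small illustrative helper (names are assumptions, not Mogan code):

```python
def pick_affinity_zone(zone_to_nodes, num_servers):
    """Return an affinity zone that can hold all requested servers, or None.

    With num_servers=3 and four free nodes spread over four zones, e.g.
    {'z1': ['n1'], 'z2': ['n2'], 'z3': ['n3'], 'z4': ['n4']}, no zone
    qualifies, so the request must fail even though 4 > 3 nodes exist.
    """
    for zone, nodes in zone_to_nodes.items():
        if len(nodes) >= num_servers:
            return zone
    return None
```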
openstackgerrit | wanghao proposed openstack/mogan master: Manage existing BMs: Part-2 https://review.openstack.org/481544 | 03:19 |
liusheng | zhenguo: yes, and also, we need to get the relationship between the affinity zone and nodes, because we need to set the affinity_zone field for the server after scheduling | 03:19 |
zhenguo | liusheng: for anti_affinity? | 03:21 |
zhenguo | liusheng: for affinity, there's only one affinity_zone, it's easy to know | 03:21 |
liusheng | zhenguo: both, yes, we need to set the server.affinity_zone field | 03:22 |
* zhenguo brb | 03:23 | |
zhenguo | liusheng: seems at least for some cases, we don't need to fetch all az and node map | 03:34 |
liusheng | zhenguo: seems yes, for the affinity zone with members in server group, I can optimize | 03:38 |
*** wanghao has quit IRC | 03:43 | |
openstackgerrit | Merged openstack/mogan master: Leverage _detach_interface for destroying networks https://review.openstack.org/496569 | 04:34 |
openstackgerrit | Merged openstack/mogan master: Return addresses with server API object https://review.openstack.org/495826 | 04:34 |
openstackgerrit | Merged openstack/mogan master: Add missed abstractmethod decorators for db abstract layer methods https://review.openstack.org/498152 | 04:37 |
openstackgerrit | Merged openstack/mogan master: Clean up server node uuid on task revert https://review.openstack.org/496207 | 04:37 |
openstackgerrit | Merged openstack/mogan master: Add support for scheduler_hints https://review.openstack.org/463534 | 04:39 |
*** wanghao has joined #openstack-mogan | 04:47 | |
*** wanghao has quit IRC | 05:01 | |
*** wanghao has joined #openstack-mogan | 05:11 | |
*** wanghao has quit IRC | 05:26 | |
*** wanghao has joined #openstack-mogan | 05:38 | |
zhenguo | liusheng: seems you can get [(zone1, [node-1, node-2]), (zone2, [node-3, node-4])] from our cache? | 07:01 |
liusheng | zhenguo: the cache only stores the mapping between aggregates and nodes, not affinity zones | 07:05 |
zhenguo | liusheng: affinity_zone is metadata of an aggregate, | 07:05 |
zhenguo | liusheng: you can know which aggregates have the affinity_zone | 07:06 |
liusheng | zhenguo: yes, but I need to query the aggregate metadata, | 07:06 |
zhenguo | liusheng: seems now you also query the aggregate metadata before calling placement | 07:06 |
liusheng | zhenguo: and if I use the cache, I need an interface from mogan-engine | 07:07 |
liusheng | zhenguo: yes | 07:07 |
zhenguo | liusheng: you don't need to call mogan-engine | 07:07 |
zhenguo | liusheng: as mogan-engine is also a proxy to the report client | 07:07 |
liusheng | zhenguo: why ? | 07:08 |
zhenguo | liusheng: you can check mogan-engine code | 07:08 |
liusheng | zhenguo: only mogan-engine can update the cache | 07:08 |
zhenguo | liusheng: why do you need to update the cache | 07:08 |
zhenguo | liusheng: you just read it | 07:08 |
liusheng | zhenguo: it is empty | 07:08 |
liusheng | zhenguo: if we don't call mogan-engine | 07:08 |
zhenguo | liusheng: oh, seems so, only the engine initializes the cache | 07:09 |
liusheng | zhenguo: yes | 07:09 |
zhenguo | liusheng: so we can't leverage the cache | 07:09 |
liusheng | zhenguo: seems so, if there is a big improvement, we can add an interface in mogan-engine for the scheduler, but.. | 07:10 |
zhenguo | liusheng: maybe we don't need a separate scheduler service, lol | 07:11 |
liusheng | zhenguo: lol | 07:11 |
zhenguo | liusheng: forget it, let's try to find another approach | 07:11 |
liusheng | zhenguo: hah, I am not sure if it will be a performance bottleneck, only some placement calls and db queries | 07:13 |
zhenguo | liusheng: I'm concerned that if we have many affinity_zones, e.g. named by rack, every scheduling call will issue as many REST calls to placement as there are racks | 07:14 |
liusheng | zhenguo: seems so, though it is only O(n) complexity | 07:17 |
zhenguo | liusheng: server group is really hard to handle | 07:18 |
zhenguo | liusheng: 1. for affinity with members, we just need one placement call with member_of: agg_list | 07:22 |
liusheng | zhenguo: yes, this can be optimized | 07:22 |
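A rough sketch of case 1 (affinity with existing members): one allocation-candidates query limited to the member's aggregates via placement's member_of filter. The endpoint, token handling, resource class, and microversion below are assumptions for illustration; a real deployment would use a keystoneauth1 session rather than raw requests.

```python
import requests  # illustration only; real code would use a keystoneauth1 session

PLACEMENT_URL = 'http://placement.example.com'    # assumed endpoint
HEADERS = {
    'X-Auth-Token': 'ADMIN_TOKEN',                # placeholder token
    'OpenStack-API-Version': 'placement 1.21',    # microversion with member_of support
}


def candidates_in_aggregates(agg_uuids, resource_class='CUSTOM_BAREMETAL'):
    """One placement call: candidates that are members of the given aggregates."""
    params = {
        'resources': '%s:1' % resource_class,
        # 'in:' means "member of any of these aggregates"
        'member_of': 'in:' + ','.join(agg_uuids),
    }
    resp = requests.get(PLACEMENT_URL + '/allocation_candidates',
                        params=params, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()['provider_summaries']
```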
zhenguo | liusheng: 2. for affinity without members, we need to find an affinity zone with enough nodes | 07:22 |
liusheng | zhenguo: yes | 07:23 |
zhenguo | liusheng: can we handle this more efficiently | 07:23 |
liusheng | zhenguo: is there any way to get all the aggregate-node mappings in one placement call? | 07:25 |
liusheng | zhenguo: seems cannot .. | 07:25 |
zhenguo | liusheng: if we only select one node it's easy to handle | 07:26 |
zhenguo | liusheng: not scheduling num_servers nodes | 07:27 |
liusheng | zhenguo: yes | 07:27 |
zhenguo | liusheng: so, how about adding a server group check before calling the scheduler? | 07:27 |
zhenguo | liusheng: it also benefits anti-affinity | 07:28 |
zhenguo | liusheng: but it could still end up without enough nodes | 07:29 |
zhenguo | liusheng: if we selected an affinity zone without enough nodes | 07:29 |
liusheng | zhenguo: yes | 07:29 |
zhenguo | liusheng: do you know how nova handles this? | 07:29 |
liusheng | zhenguo: nova has hosts and aggregates in its own db, so I guess it is easier | 07:30 |
liusheng | zhenguo: and the scheduler in Nova is different from ours; the nova scheduler will check hosts one by one, and for multiple creation it will also select them one by one | 07:31 |
zhenguo | liusheng: when scheduling with the affinity policy, does it select a host with num_servers * resources | 07:31 |
zhenguo | liusheng: if it checks hosts one by one, how does it make sure there are enough resources in the affinity policy scenario | 07:32 |
liusheng | zhenguo: seems the AffinityFilter will only check if it is the same host, and other filters will check the resources | 07:35 |
zhenguo | liusheng: yes, | 07:35 |
zhenguo | liusheng: can you help confirm with our nova guys, to see whether nova can handle such a multiple-server creation with affinity policy scenario? | 07:37 |
liusheng | zhenguo: ok, let me ask | 07:38 |
liusheng | zhenguo: just asked; in Nova, when creating multiple servers, the scheduler will select a host for the servers one by one, every server will pass through all the filters, and if there are not enough resources for the servers, scheduling will fail | 07:45 |
zhenguo | liusheng: lol | 07:46 |
zhenguo | liusheng: so for the affinity policy, there's no special check | 07:47 |
liusheng | zhenguo: yes | 07:48 |
liusheng | zhenguo: in Nova there is more info in the db, and there is an in-memory HostState object | 07:48 |
zhenguo | liusheng: but they also have our problem, right? | 07:49 |
zhenguo | liusheng: if there's no member in the server group, they also face the same problem as us | 07:50 |
liusheng | zhenguo: for affinity policy ? | 07:51 |
zhenguo | liusheng: for multiple servers | 07:51 |
zhenguo | liusheng: they go through the filters only once, right? | 07:52 |
liusheng | zhenguo: they will check hosts one by one, and will also check for servers one by one. but zhenyu just told me that the Nova scheduler is also starting to support checking the resources for multiple servers at once | 07:52 |
liusheng | zhenguo: not once | 07:52 |
liusheng | zhenguo: because they also start to use placement, and they may use the total resource requirement as the query filter passed to the placement api to select rps | 07:53 |
zhenguo | liusheng: oh, yes, there's a for loop over num_servers | 07:53 |
zhenguo | liusheng: so, in nova, if there are no members in the server group, they just choose one host which satisfies one server's requirements, right? | 07:58 |
liusheng | zhenguo: yes | 07:59 |
zhenguo | liusheng: so it will just fail if there are not enough resources? | 07:59 |
liusheng | zhenguo: yes | 07:59 |
liusheng | zhenguo: their targets are hosts, it's simpler | 07:59 |
zhenguo | liusheng: oh | 08:00 |
zhenguo | liusheng: we can continue to discuss the anti-affinity scenario | 08:04 |
zhenguo | liusheng: is there a way to get the affinity zone for rp | 08:04 |
zhenguo | liusheng:*from | 08:04 |
liusheng | zhenguo: seems need: rp -- > agg --> affinity zone | 08:05 |
liusheng | zhenguo: it is also not very efficient | 08:05 |
zhenguo | liusheng: you mean we need one more placement call to get the rp's aggregates, then get the affinity zone from the aggregates | 08:06 |
liusheng | zhenguo: yes, since the affinity zone is only metadata on aggregates | 08:07 |
liusheng | zhenguo: oh, maybe we can optimize like Nova does | 08:09 |
zhenguo | liusheng: how? | 08:09 |
liusheng | zhenguo: we don't query all the affinity zones' node lists, we only need to find enough nodes and stop the cycle | 08:10 |
zhenguo | liusheng: you mean for affinity policy without members? | 08:10 |
liusheng | zhenguo: for all the situations except the affinity-policy-with-members situation | 08:11 |
zhenguo | liusheng: how do you handle the anti-affinity policy | 08:12 |
liusheng | zhenguo: when we find num_servers nodes which satisfy the conditions, we stop the cycle | 08:13 |
zhenguo | liusheng: we only need one node for anti_affinity | 08:14 |
zhenguo | liusheng: in different affinity zones | 08:14 |
liusheng | zhenguo: yes, we only need to find num_servers affinity zones where each one has at least 1 available node | 08:15 |
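A hedged sketch of that early-exit idea, stopping the cycle as soon as the policy can be satisfied; nodes_in_zone is a hypothetical per-zone lookup (for example one placement query per zone), not an existing Mogan function:

```python
def zones_satisfying_policy(zones, nodes_in_zone, policy, num_servers):
    """Stop iterating as soon as the request can be satisfied.

    affinity:      one zone with at least num_servers free nodes.
    anti-affinity: num_servers distinct zones, each with a free node.
    """
    picked = []
    for zone in zones:
        nodes = nodes_in_zone(zone)        # e.g. one placement query per zone
        if policy == 'affinity':
            if len(nodes) >= num_servers:
                return [zone]              # early exit: this zone fits all servers
        elif policy == 'anti-affinity':
            if nodes:
                picked.append(zone)
                if len(picked) >= num_servers:
                    return picked          # early exit: enough distinct zones
    return None                            # worst case: every zone was checked
```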
zhenguo | liusheng: seems for multi-server creation with anti-affinity we can move the logic to the engine side, and call select_destination for each server. | 08:17 |
liusheng | zhenguo: seems we cannot, for now our scheduler doesn't have an in-memory HostState object | 08:18 |
zhenguo | liusheng: why do we need that | 08:19 |
liusheng | zhenguo: for multiple creation, the 2nd server's scheduling needs to consider the 1st server's resource consumption | 08:21 |
zhenguo | liusheng: yes, after the first one is scheduled, we can add it to the server group's members, and use the new server group for the next schedule | 08:21 |
liusheng | that seems similar to my optimization suggestion | 08:23 |
liusheng | zhenguo: and it doesn't need to introduce rpc calls | 08:24 |
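An illustrative pseudo-flow of the engine-side alternative being proposed; select_destination, the server objects, and the group object here are simplified stand-ins for Mogan's real interfaces:

```python
def schedule_one_by_one(scheduler, servers, server_group):
    """Schedule servers sequentially, feeding each decision back into the group.

    After each server is placed, its uuid is appended to the group's
    members, so the next select_destination call already sees the
    previous consumption without needing an in-memory HostState.
    """
    placements = {}
    for server in servers:
        node = scheduler.select_destination(server, server_group)
        placements[server.uuid] = node
        server_group.members.append(server.uuid)  # next iteration honours this
    return placements
```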
zhenguo | liusheng: you will try each affinity zone, right? | 08:25 |
liusheng | zhenguo: yes, it also schedules the servers one by one | 08:25 |
zhenguo | liusheng: but it's just trying; you may need to try all affinity zones, right? | 08:26 |
liusheng | zhenguo: that is the worst case. hah | 08:26 |
zhenguo | liusheng: yes, | 08:26 |
zhenguo | liusheng: but you may try more times | 08:26 |
liusheng | zhenguo: yes | 08:27 |
zhenguo | liusheng: you can compare the two solutions, I will be right back :D | 08:29 |
liusheng | zhenguo: ok | 08:29 |
litao__ | wanghao: It seems the list manageable servers API should also return availability_zone | 08:32 |
litao__ | zhenguo: you mentioned it in the last discussion | 08:32 |
zhenguo | litao__: you can just ignore az for now, hah | 08:40 |
litao__ | zhenguo: ok | 08:40 |
litao__ | zhenguo: If it doesn't return the az, I will use the default_schedule_az in the db record | 08:41 |
litao__ | zhenguo: right? | 08:41 |
zhenguo | litao__: you can just set it to None | 08:54 |
zhenguo | litao__: as the default_schedule_az is used for scheduling, | 08:54 |
litao__ | zhenguo: Isn't the default_schedule_az None? | 08:55 |
litao__ | is it? | 08:55 |
zhenguo | litao__: yes, it defaults to None, but if it's set to something else, you should not use it | 08:55 |
zhenguo | litao__: it should be None for your cas | 08:56 |
zhenguo | *case | 08:56 |
litao__ | zhenguo: OK | 08:56 |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 08:57 |
zhenguo | liusheng: some nits in the patch | 09:18 |
liusheng | zhenguo: ok, thanks, replies inline, and will update the patch | 09:26 |
*** wanghao has quit IRC | 09:29 | |
*** wanghao has joined #openstack-mogan | 09:29 | |
*** wanghao has quit IRC | 09:29 | |
*** wanghao has joined #openstack-mogan | 09:30 | |
*** wanghao has quit IRC | 09:30 | |
*** wanghao has joined #openstack-mogan | 09:31 | |
*** wanghao has quit IRC | 09:31 | |
*** wanghao has joined #openstack-mogan | 09:31 | |
*** wanghao has quit IRC | 09:32 | |
*** wanghao has joined #openstack-mogan | 09:32 | |
*** wanghao has quit IRC | 09:33 | |
*** wanghao has joined #openstack-mogan | 09:33 | |
*** wanghao has quit IRC | 09:33 | |
*** wanghao has joined #openstack-mogan | 09:34 | |
*** wanghao has quit IRC | 09:34 | |
*** wanghao has joined #openstack-mogan | 09:35 | |
openstackgerrit | liusheng proposed openstack/mogan master: Use server group in scheduler https://review.openstack.org/496151 | 09:52 |
openstackgerrit | Zhenguo Niu proposed openstack/mogan master: Add socat console support https://review.openstack.org/493836 | 09:53 |
zhenguo | liusheng: maybe we'd better add an affinity_zone list to the server group | 09:53 |
zhenguo | liusheng: as for server delete, we don't care about the performance | 09:54 |
liusheng | zhenguo: if so we need to maintain that property | 09:54 |
zhenguo | liusheng: which can remove the loop check during scheduling, right? | 09:56 |
liusheng | zhenguo: for anti-affinity maybe not | 09:56 |
zhenguo | liusheng: yes | 09:57 |
liusheng | zhenguo: the affinity zones list in the server group is a white list for affinity and a black list for anti-affinity | 09:57 |
zhenguo | liusheng: yes, seems it makes more sense to have that property with server group | 09:57 |
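A small sketch of how such a persisted list could be applied at schedule time, as a whitelist for affinity and a blacklist for anti-affinity; the field and function names are illustrative assumptions rather than an existing Mogan API:

```python
def filter_candidate_zones(candidate_zones, group_affinity_zones, policy):
    """Prune candidate zones using a persisted server_group.affinity_zones list.

    affinity:      whitelist, servers must stay inside the stored zones.
    anti-affinity: blacklist, servers must avoid the zones already used.
    """
    if not group_affinity_zones:
        return list(candidate_zones)   # empty group: every zone is allowed
    if policy == 'affinity':
        return [z for z in candidate_zones if z in group_affinity_zones]
    return [z for z in candidate_zones if z not in group_affinity_zones]
```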
zhenguo | liusheng: the only concern with not adding it is that when we delete a server we need to check that, right? | 09:58 |
liusheng | zhenguo: and the affinity group may also be deleted by admins | 09:59 |
zhenguo | liusheng: if they delete it, how do you handle the server affinity zone then? | 10:00 |
liusheng | zhenguo: maybe the current implementation is not very likely to be the performance bottleneck, hah, Nova also has a similar process | 10:00 |
zhenguo | liusheng: we can be better than nova | 10:01 |
liusheng | zhenguo: just a similar process; actually, I think our scheduler should be much faster than the Nova scheduler | 10:02 |
zhenguo | liusheng: yes, but if we need to call placement many times, it will not be | 10:02 |
liusheng | zhenguo: maybe not that many times; the current way, I think, is more optimized. and the placement API is also pure db queries, it shouldn't be very slow | 10:03 |
liusheng | zhenguo: even for baremetal server creation, I think the most time-consuming step is not scheduling, the building in Ironic may take a much longer time | 10:04 |
liusheng | zhenguo: that is different from VMs | 10:05 |
zhenguo | liusheng: but the scheduler may also be time consuming if you try 1000 times, lol | 10:05 |
liusheng | zhenguo: lol. | 10:06 |
zhenguo | liusheng: seriously, how do you handle the server's affinity zone if admins delete the affinity zone | 10:06 |
liusheng | zhenguo: not sure, we may need to stop the deletion | 10:07 |
liusheng | zhenguo: need to check how Nova does | 10:07 |
zhenguo | liusheng: nova's affinity zone is host | 10:08 |
liusheng | zhenguo: oh, yes... | 10:08 |
liusheng | zhenguo: and we can also remove/add nodes for an affinity zone.. :( | 10:08 |
zhenguo | liusheng: yes | 10:08 |
zhenguo | liusheng: maybe need to add some limits | 10:09 |
liusheng | zhenguo: that may make things more complex if we add an affinity zone field for server group :( | 10:09 |
zhenguo | liusheng: like when a node is already deployed, we should not allow it to be deleted from the az | 10:09 |
zhenguo | liusheng: why | 10:09 |
zhenguo | liusheng: it seems the same as the current situation | 10:10 |
liusheng | zhenguo: when scheduling, there may not be enough nodes in an affinity zone for creating, but meanwhile, admins may add nodes to the affinity zone | 10:10 |
liusheng | zhenguo: or remove.. | 10:11 |
liusheng | zhenguo: may lead to concurrency issues :( | 10:11 |
zhenguo | liusheng: if they don't get in or move out in time, we don't need to consider them | 10:11 |
liusheng | zhenguo: hah | 10:12 |
zhenguo | liusheng: it's Chinese Valentine's Day, will you still work? | 10:13 |
liusheng | zhenguo: yes, but will go home. hah | 10:14 |
zhenguo | liusheng: hah | 10:14 |
zhenguo | liusheng: see you | 10:14 |
liusheng | zhenguo: will you still work ? | 10:14 |
liusheng | zhenguo: ok, bye | 10:14 |
zhenguo | liusheng: no, lol | 10:15 |
zhenguo | liusheng: have a good night | 10:15 |
* zhenguo away | 10:15 | |
liusheng | zhenguo: :) | 10:15 |
*** litao__ has quit IRC | 11:55 | |
*** wanghao_ has joined #openstack-mogan | 18:11 | |
*** wanghao has quit IRC | 18:12 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!