*** goldyfruit___ has joined #openstack-masakari | 00:01 | |
*** openstackgerrit has joined #openstack-masakari | 02:50 | |
openstackgerrit | pengyuesheng proposed openstack/masakari master: Blacklist requests-mock 1.7.0 https://review.opendev.org/683269 | 02:50 |
---|---|---|
openstackgerrit | Arthur Dayne proposed openstack/masakari master: Fix the bug #1836354 that masakari cannot funtion well with noauth2 strategy https://review.opendev.org/670680 | 03:36 |
openstack | bug 1836354 in masakari "Masakari service cannot function well when set auth_strategy=noauth2" [Undecided,In progress] https://launchpad.net/bugs/1836354 - Assigned to Arthur Dayne (palagend) | 03:36 |
*** jawad_axd has joined #openstack-masakari | 05:58 | |
openstackgerrit | Shilpa Devharakar proposed openstack/masakari-monitors master: Add operator guide documentation https://review.opendev.org/489095 | 06:01 |
*** jawad_ax_ has joined #openstack-masakari | 06:02 | |
*** jawad_axd has quit IRC | 06:03 | |
openstackgerrit | Shilpa Devharakar proposed openstack/python-masakariclient master: Update operator guide documentation https://review.opendev.org/683306 | 07:21 |
*** tpatil has joined #openstack-masakari | 08:04 | |
tpatil | uneek: Ping? | 08:09 |
uneek | pong | 08:09 |
tpatil | uneek: You want to set up hostmonitor right? | 08:10 |
uneek | I want to set up what's necessary ;) | 08:10 |
tpatil | uneek: we are updating operator's guide for masakari-monitors | 08:11 |
uneek | so as I assume, hostmonitor is the least that I need | 08:11 |
tpatil | #link : https://review.opendev.org/#/c/489095/11 | 08:11 |
tpatil | yes, if you want to evacuate VMs running on the compute host, you need masakari-hostmonitor | 08:11 |
uneek | yeah, I'm following the commit logs | 08:11 |
tpatil | please download https://review.opendev.org/#/c/489095/11/doc/source/_static/images/masakari-monitors.jpg,unified and see the architecture diagram. | 08:12 |
tpatil | it will help you to understand the components needs in your setup | 08:12 |
tpatil | s/needs/needed | 08:13 |
tpatil | the same architecture diagram is also available on masakari wiki | 08:14 |
tpatil | #link : https://wiki.openstack.org/wiki/Masakari | 08:14 |
uneek | I'm sorry, but somehow there is still a twisted knot in my head: How is this supposed to work? How should the engine realize that the compute is down, if the hostmonitor is running on the computenode? Is the hostmonitor permanently talking to the local pacemaker? I don't assume that. I'm afraid that diagram doesn't answer my question. | 08:20 |
uneek | or is the hostmonitor just a process that connects to pacemaker and monitors the availability of the compute resource? | 08:22 |
uneek | I'm happy to contribute to better documentation that would answer this, but I would need to understand it first | 08:23 |
tpatil | That's correct, hostmonitor is a process which gets the information from CRM on whether any of the compute nodes is down and then sends the notification to masakari | 08:23 |
tpatil | say you have compute nodes A and B and you have created a pacemaker cluster | 08:26 |
tpatil | Now, if compute A goes down, masakari-hostmonitor running on compute B will read the information from CRM and it will see that compute A is down, then it will send the notification to masakari-api service that compute A is down, and then masakari-engine will execute the workflow to evacuate VMs running on compute host A | 08:27 |
uneek | Ah, so essentially it could also run on the control nodes | 08:28 |
tpatil | one should run masakari-api and masakari-engine services on the controller nodes | 08:28 |
uneek | can the hostmonitor also connect to a pacemaker-remote process? | 08:28 |
tpatil | masakari-monitor process should run only on compute nodes | 08:28 |
openstackgerrit | Shilpa Devharakar proposed openstack/python-masakariclient master: Update operator guide documentation https://review.opendev.org/683306 | 08:29 |
uneek | so can you please elaborate on what the tasks of hostmonitor are, such that it must run on the computes? If it's only reading from the CRM, then this can clearly also be done on the controllers. | 08:31 |
uneek | I can see why the instance and process monitors need to run on compute. But does hostmonitor do anything besides that? | 08:32 |
tpatil | I agree, it's possible to run hostmonitor on controller node as well | 08:33 |
tpatil | but the other three masakari-monitor services should run on the compute node, i.e. processmonitor, instancemonitor and introspectivemonitor | 08:34 |
tpatil | hostmonitor internally calls cibadmin --query to check the status of all hosts and if any of the hosts are offline, notification will be sent | 08:36 |
tpatil | you can see the code here: https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L332 | 08:38 |
tpatil | and https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L301 | 08:38 |
uneek | mom, afk | 08:38 |
uneek | re | 08:46 |
uneek | so, on the pacemaker-remote nodes the "cibadmin --query" command fails; that's one of the reasons I want to understand what's happening | 08:47 |
uneek | And I assume the matching of the resource/node-names in pacemaker is done against the name of the compute service that is registered in openstack-nova - so best stick with fqdn there | 08:49 |
tpatil | yes that's correct | 08:50 |
tpatil | hostmonitor internally maintains the status of the hosts; after it runs the command "cibadmin --query", it gets the current status of the hosts | 08:51 |
tpatil | it will compare it with the old status and accordingly send a notification to masakari | 08:51 |
uneek | the other monitor processes don't need access to the pacemaker, right? they only gather local information and talk directly to the masakari-api, right? | 08:51 |
tpatil | yes, that's correct. no need of pacemaker for other monitors | 08:52 |
uneek | ok - I think I've got it a bit clearer now. | 08:52 |
tpatil | Great, if you have any further questions please contact me on mailing list or IRC | 08:53 |
uneek | I would suggest splitting the architecture diagram completely between the different monitor types/scenarios, so you don't get overwhelmed by all the possible communications | 08:54 |
uneek | yeah, I will, I'll stick here in IRC ;) | 08:54 |
tpatil | I think that's a good suggestion, will try to do it at our end | 08:55 |
tpatil | I will bring up this point in the bi-weekly meeting | 08:55 |
tpatil | the next meeting will be held on Tuesday | 08:56 |
uneek | Just FYI: One thing that makes it a bit harder for me to declutter all the information is that we have a deployment policy here: we build minimal docker containers for every service and only run the needed parts where they should run. So that's why I'm a) trying to get this straight in my head, b) trying to explicitly describe the interfaces between the components. And from my current view, I | 08:58 |
uneek | really think that the hostmonitor process is better placed on some control nodes rather than the compute nodes | 08:58 |
uneek | because by simple trial I found that cibadmin --query fails on the pacemaker-remote nodes | 08:59 |
uneek | which will be installed on all the compute-nodes | 08:59 |
tpatil | uneek: I will discuss this point with Sampath, who is PTL, in the next meeting and answer this question | 09:02 |
uneek | +1 | 09:02 |
openstackgerrit | pengyuesheng proposed openstack/masakari master: Update the constraints url https://review.opendev.org/683333 | 09:22 |
*** tpatil has quit IRC | 09:34 | |
*** brinzhang has quit IRC | 10:14 | |
*** goldyfruit___ has quit IRC | 12:12 | |
*** jawad_ax_ has quit IRC | 13:16 | |
*** jawad_axd has joined #openstack-masakari | 13:17 | |
*** jawad_axd has quit IRC | 13:22 | |
*** goldyfruit___ has joined #openstack-masakari | 13:28 | |
*** goldyfruit_ has joined #openstack-masakari | 13:35 | |
*** goldyfruit___ has quit IRC | 13:35 | |
*** goldyfruit___ has joined #openstack-masakari | 13:36 | |
*** goldyfruit_ has quit IRC | 13:39 | |
*** openstackgerrit has quit IRC | 14:06 | |
*** goldyfruit___ has quit IRC | 15:21 | |
*** goldyfruit has joined #openstack-masakari | 15:26 | |
*** jawad_axd has joined #openstack-masakari | 15:55 | |
*** jawad_axd has quit IRC | 15:59 | |
*** goldyfruit has quit IRC | 17:01 | |
*** goldyfruit has joined #openstack-masakari | 17:01 | |
*** goldyfruit has quit IRC | 17:31 | |
*** openstackgerrit has joined #openstack-masakari | 17:41 | |
openstackgerrit | OpenStack Release Bot proposed openstack/python-masakariclient master: Update master for stable/train https://review.opendev.org/683615 | 17:41 |
*** goldyfruit has joined #openstack-masakari | 18:07 | |
*** goldyfruit_ has joined #openstack-masakari | 18:14 | |
*** goldyfruit has quit IRC | 18:17 | |
*** goldyfruit_ has quit IRC | 22:42 | |
*** jawad_axd has joined #openstack-masakari | 23:24 | |
*** jawad_axd has quit IRC | 23:28 |
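
The behaviour tpatil describes between 08:36 and 08:51 (query the CIB, derive the current status of each host, compare it with the previously stored status, and notify on a change) can be sketched roughly as follows. This is only an illustration, not the masakari-monitors implementation: the function names `parse_host_status` and `diff_status` are hypothetical, the sample XML is a hand-written stand-in for `cibadmin --query` output, and the assumption that a host is online exactly when its `node_state` element carries `crmd="online"` is a simplification that may not hold for every Pacemaker version.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-ins for "cibadmin --query" output (assumed shape).
SAMPLE_BEFORE = """<cib><status>
  <node_state id="1" uname="compute-a.example.org" crmd="online"/>
  <node_state id="2" uname="compute-b.example.org" crmd="online"/>
</status></cib>"""

SAMPLE_AFTER = """<cib><status>
  <node_state id="1" uname="compute-a.example.org" crmd="offline"/>
  <node_state id="2" uname="compute-b.example.org" crmd="online"/>
</status></cib>"""


def parse_host_status(cib_xml):
    """Map each host name to 'online' or 'offline' (hypothetical helper)."""
    root = ET.fromstring(cib_xml)
    status = {}
    for node in root.iter("node_state"):
        name = node.get("uname")
        if name is None:
            continue  # skip entries without a host name
        # Assumption: crmd="online" means the host is up.
        status[name] = "online" if node.get("crmd") == "online" else "offline"
    return status


def diff_status(old, new):
    """Return (host, new_state) for hosts whose state changed since 'old'."""
    changes = []
    for host, state in new.items():
        # Hosts seen for the first time are only recorded, not reported.
        if host in old and old[host] != state:
            changes.append((host, state))
    return changes


previous = parse_host_status(SAMPLE_BEFORE)
current = parse_host_status(SAMPLE_AFTER)
for host, state in diff_status(previous, current):
    # A real monitor would send a notification to masakari-api here.
    print(f"{host} is now {state}")
```

Because the sketch only needs the CIB XML as input, it illustrates uneek's point that this step depends on where `cibadmin --query` works rather than on running on a compute node.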