*** rcernin has quit IRC | 00:05 | |
*** k_mouza has joined #heat | 00:15 | |
*** k_mouza has quit IRC | 00:20 | |
*** rcernin has joined #heat | 00:31 | |
*** weshay|ruck has quit IRC | 00:53 | |
*** weshay_ has joined #heat | 00:53 | |
*** andrein has quit IRC | 00:57 | |
*** zzzeek has quit IRC | 00:57 | |
*** andrein has joined #heat | 01:01 | |
*** zzzeek has joined #heat | 01:01 | |
*** tkajinam has quit IRC | 01:06 | |
*** tkajinam has joined #heat | 01:06 | |
*** k_mouza has joined #heat | 01:31 | |
*** tkajinam has quit IRC | 01:33 | |
*** tkajinam has joined #heat | 01:34 | |
*** k_mouza has quit IRC | 01:36 | |
*** ricolin has quit IRC | 02:15 | |
*** ricolin has joined #heat | 02:21 | |
*** k_mouza has joined #heat | 02:45 | |
*** k_mouza has quit IRC | 02:50 | |
*** k_mouza has joined #heat | 03:27 | |
*** k_mouza has quit IRC | 03:31 | |
*** udesale has joined #heat | 04:45 | |
*** ramishra has quit IRC | 05:11 | |
*** ramishra has joined #heat | 05:16 | |
*** brtknr has quit IRC | 06:17 | |
*** vishalmanchanda has joined #heat | 06:39 | |
*** tosky has joined #heat | 07:39 | |
*** brtknr has joined #heat | 08:46 | |
*** rcernin has quit IRC | 09:12 | |
*** k_mouza has joined #heat | 09:29 | |
*** tkajinam has quit IRC | 10:01 | |
*** ramishra has quit IRC | 10:08 | |
*** ramishra has joined #heat | 10:49 | |
*** ricolin has quit IRC | 10:52 | |
openstackgerrit | Rabi Mishra proposed openstack/heat stable/rocky: Don't store signal_url for ec2 signaling of deployments https://review.opendev.org/743732 | 11:50 |
*** udesale_ has joined #heat | 12:19 | |
*** udesale has quit IRC | 12:21 | |
*** weshay_ is now known as weshay|ruck | 12:54 | |
zaneb | nsmeds: I think this is the bug report for that issue: https://storyboard.openstack.org/#!/story/2007843 | 12:55 |
pas-ha | I believe it all stems from the fact that the heat engines' ids are not stable and are re-created on each restart. the same zombie entries are then seen in heat service-list, and must be periodically cleaned up from outside | 13:22 |
pas-ha | ideally i'd like them to be stable enough, like hostname-<worker-index>. | 13:23 |
pas-ha | this is not a complete solution, but zombies will be left only if one downscales the heat-engine workers on the host, which is IMO a way less frequent operation than a heat-engine restart | 13:24 |
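A minimal sketch of the naming scheme pas-ha describes, next to a per-start random id for contrast. The function names are hypothetical and this is not Heat's code; it only illustrates why a `hostname-<worker-index>` id survives a restart while a freshly generated UUID does not.

```python
import socket
import uuid


def random_engine_id():
    """A new id on every call, i.e. on every (re)start of a worker."""
    return str(uuid.uuid4())


def stable_engine_id(worker_index, hostname=None):
    """Hypothetical stable id of the form hostname-<worker-index>.

    The same host and worker index always yield the same id, so a
    restarted worker would reuse its old name instead of leaving a
    zombie behind.
    """
    host = hostname if hostname is not None else socket.gethostname()
    return f"{host}-{worker_index}"


if __name__ == "__main__":
    print(random_engine_id())   # differs on every run
    print(stable_engine_id(0))  # same on every run on this host
```

As pas-ha notes, leftovers would then appear only when the number of workers on a host is reduced, since the higher indexes would no longer be claimed.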
zaneb | pas-ha: I wish I understood better how those queues are used. are they just there to wait for responses to messages sent out by the engine using call()? | 13:28 |
jrosser | if you ever get in a situation where the heat services constantly restart, bad things end up happening real quick | 13:28 |
jrosser | that's caused us a whole-cloud outage at least once | 13:28 |
zaneb | o.O | 13:29 |
jrosser | because you've got just a bazillion queues generated and rabbitmq is just totally wedged | 13:29 |
zaneb | I'd love to fix this, it seems to be getting worse. I'd never heard of it before like a year ago and now there's multiple people (at least 4-5) complaining about it | 13:29 |
zaneb | the only proposal we've had was to fix it in devstack, which I nixed because that doesn't actually help any real clouds | 13:30 |
zaneb | we store in the DB a list of the engines and their status, so we know if they're not still alive. in principle we could delete those queues. in practice, I don't know if oslo.messaging gives us the APIs to do that | 13:31 |
zaneb | but also I really wanna know if we're losing messages when we do that | 13:32 |
zaneb | (I suspect the increased reports are a result of more people running OpenStack on top of k8s) | 13:33 |
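A rough sketch of the idea zaneb outlines: use the last-seen time of each engine to decide which engines are gone, then derive the queue names that could be removed. Everything here is an assumption for illustration: the `(engine_id, last_seen)` record shape, the grace period, and the queue-name prefixes are hypothetical, and it does not answer the open questions of whether oslo.messaging exposes a deletion API or whether messages would be lost.

```python
from datetime import datetime, timedelta

# Assumed prefixes for per-engine queues; check them against the queue
# names your own broker actually lists before relying on them.
ASSUMED_PREFIXES = ("engine_worker.", "heat-engine-listener.")


def dead_engine_ids(records, now, grace=timedelta(minutes=3)):
    """Return ids of engines whose last report is older than `grace`.

    `records` is an iterable of (engine_id, last_seen) pairs, a
    hypothetical stand-in for the engine list kept in the DB.
    """
    return [eid for eid, last_seen in records if now - last_seen > grace]


def queues_for(engine_ids, prefixes=ASSUMED_PREFIXES):
    """Build the candidate queue names for the given engine ids."""
    return [prefix + eid for eid in engine_ids for prefix in prefixes]


if __name__ == "__main__":
    now = datetime(2020, 1, 1, 12, 0, 0)
    records = [
        ("engine-a", now - timedelta(minutes=10)),  # stale
        ("engine-b", now - timedelta(seconds=30)),  # still reporting
    ]
    for name in queues_for(dead_engine_ids(records, now)):
        print(name)
```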
*** gmoro has joined #heat | 13:36 | |
jrosser | i don't know the root cause but the symptom we had was like this | 13:40 |
jrosser | INFO heat.engine.worker [-] Starting engine_worker (1.4) in engine b3ab1e59-1ed5-4293-b16130 | 13:41 |
jrosser | INFO oslo_service.service [-] Child 16146 killed by signal 9 | 13:41 |
jrosser | just over and over in the heat service log | 13:41 |
nsmeds | @zaneb @jrosser thanks for the discussion! Considering we do not use Heat (yet), as it's a fairly new OpenStack deployment, do you think we're safe to just use `rabbitmqctl` to delete all the Heat queues? | 13:54 |
zaneb | nsmeds: in that situation, I would probably shut down all of the heat services, delete the queues, and start them back up again | 13:55 |
nsmeds | Btw, we use openstack-ansible - so not k8s. This only occurred on the staging cluster (which we deployed first, and had a few issues figuring out settings). The production cluster was deployed "smoothly" and doesn't have this same issue with Heat queues. | 13:55 |
nsmeds | Ok, I will try that today. Thank you! | 13:56 |
zaneb | that way you know you have what you need and no more | 13:56 |
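A hedged sketch of the cleanup zaneb suggests, run only after all Heat services are stopped. It selects Heat-looking queues from a list of names and builds `rabbitmqctl delete_queue` commands without executing them. The prefixes and the helper names are assumptions, not taken from the paste jrosser links; compare them with the output of `rabbitmqctl list_queues name` on your own broker first.

```python
# Assumed prefixes of per-engine Heat queues; verify against your broker.
HEAT_QUEUE_PREFIXES = ("engine_worker.", "heat-engine-listener.")


def heat_queues(queue_names, prefixes=HEAT_QUEUE_PREFIXES):
    """Keep only the queue names that start with one of the prefixes."""
    return [q for q in queue_names if q.startswith(tuple(prefixes))]


def delete_commands(queue_names, vhost="/"):
    """Build one `rabbitmqctl delete_queue` argument list per queue.

    Nothing is executed here; pass each list to subprocess.run() only
    after reviewing it and with heat-engine and heat-api stopped.
    """
    return [["rabbitmqctl", "-p", vhost, "delete_queue", q] for q in queue_names]


if __name__ == "__main__":
    # Hypothetical sample, standing in for `rabbitmqctl list_queues name`.
    sample = [
        "engine_worker.1111",
        "heat-engine-listener.1111",
        "notifications.info",
    ]
    for cmd in delete_commands(heat_queues(sample)):
        print(" ".join(cmd))
```

If Heat has its own vhost, passing it as `vhost` keeps the commands away from other services' queues.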
*** cliffparsons has quit IRC | 13:57 | |
jrosser | nsmeds: i take no credit for the bash skills here but someone gave me this to help clean up http://paste.openstack.org/show/796430/ | 13:57 |
nsmeds | @jrosser thank you! | 13:58 |
zaneb | jrosser: 9 is SIGKILL so that would do it | 13:58 |
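For anyone reading such logs later, a small sketch that pulls the child PID and signal number out of the `oslo_service` line jrosser quoted and resolves the number to its name with Python's `signal` module. The function name is hypothetical, and the regular expression assumes the exact wording shown in the log above.

```python
import re
import signal


def describe_kill(log_line):
    """Return (pid, signal_name) for a 'killed by signal N' log line.

    Returns None if the line does not match or the number is not a
    known signal on this platform.
    """
    match = re.search(r"Child (\d+) killed by signal (\d+)", log_line)
    if match is None:
        return None
    pid, signum = int(match.group(1)), int(match.group(2))
    try:
        return pid, signal.Signals(signum).name
    except ValueError:
        return None


if __name__ == "__main__":
    line = "INFO oslo_service.service [-] Child 16146 killed by signal 9"
    print(describe_kill(line))
```

SIGKILL cannot be caught or handled, so a worker killed this way gets no chance to clean up after itself.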
*** cliffparsons has joined #heat | 13:58 | |
*** ricolin has joined #heat | 14:40 | |
*** cliffparsons has quit IRC | 14:40 | |
*** rcernin has joined #heat | 14:42 | |
*** rcernin_ has joined #heat | 14:52 | |
ricolin | tosky, I will send a patch to migrate those two jobs | 14:53 |
*** rcernin_ has quit IRC | 14:56 | |
*** rcernin has quit IRC | 14:58 | |
*** irclogbot_2 has quit IRC | 14:58 | |
tosky | ricolin: thank you! | 14:58 |
*** irclogbot_1 has joined #heat | 14:58 | |
ricolin | nsmeds, heat generates queues for the engine service every time an engine service starts, so what you need to do is set an expiry time on the queues; then after a restart, the old queues will eventually be removed | 14:59 |
ricolin | nsmeds, probably some commands like | 14:59 |
ricolin | nsmeds, http://paste.openstack.org/show/796434/ | 14:59 |
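A minimal sketch of the expiry approach ricolin describes, assuming it is done with a RabbitMQ policy that sets the `expires` key (in milliseconds) on matching queues. The contents of the paste are not reproduced here; the policy name, pattern, and TTL below are illustrative assumptions. The function only builds the `rabbitmqctl set_policy` argument list and does not run it.

```python
import json


def expiry_policy_command(pattern, ttl_seconds,
                          name="heat-queue-expiry", vhost="/"):
    """Build a `rabbitmqctl set_policy` command that expires idle queues.

    `pattern` is a regular expression matched against queue names, and
    `ttl_seconds` is how long a queue may stay unused before the broker
    deletes it. The policy name is a hypothetical default.
    """
    definition = json.dumps({"expires": int(ttl_seconds * 1000)})
    return [
        "rabbitmqctl", "set_policy",
        "-p", vhost,
        "--apply-to", "queues",
        name, pattern, definition,
    ]


if __name__ == "__main__":
    # Assumed pattern for per-engine queues; adjust to your queue names.
    cmd = expiry_policy_command(r"^(engine_worker|heat-engine-listener)\.", 3600)
    print(cmd)
```

A queue counts as unused only while it has no consumers, so queues belonging to running engines should not be affected by such a policy.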
ricolin | tosky, no, thank you!:) | 15:00 |
* ricolin keeps forgetting those two jobs | 15:00 | |
*** hoonetorg has quit IRC | 15:09 | |
nsmeds | @ricolin ok, I'll look into that as well :) | 15:16 |
*** hoonetorg has joined #heat | 15:21 | |
*** cliffparsons has joined #heat | 15:47 | |
*** k_mouza has quit IRC | 16:09 | |
*** udesale_ has quit IRC | 16:24 | |
*** k_mouza has joined #heat | 16:28 | |
*** k_mouza has quit IRC | 16:33 | |
*** k_mouza has joined #heat | 16:46 | |
*** k_mouza has quit IRC | 16:50 | |
*** k_mouza has joined #heat | 16:57 | |
*** ricolin has quit IRC | 17:01 | |
*** k_mouza has quit IRC | 17:01 | |
*** k_mouza has joined #heat | 17:15 | |
*** k_mouza has quit IRC | 17:19 | |
*** k_mouza has joined #heat | 17:35 | |
*** k_mouza has quit IRC | 17:39 | |
*** k_mouza has joined #heat | 17:54 | |
*** k_mouza has quit IRC | 17:58 | |
*** k_mouza has joined #heat | 18:10 | |
*** k_mouza has quit IRC | 18:14 | |
*** vishalmanchanda has quit IRC | 18:29 | |
*** k_mouza has joined #heat | 18:31 | |
*** k_mouza has quit IRC | 18:39 | |
*** k_mouza has joined #heat | 19:22 | |
*** k_mouza has quit IRC | 19:27 | |
*** k_mouza has joined #heat | 19:33 | |
*** k_mouza has quit IRC | 19:37 | |
*** k_mouza has joined #heat | 19:42 | |
*** k_mouza has quit IRC | 19:47 | |
*** k_mouza has joined #heat | 20:05 | |
*** k_mouza has quit IRC | 20:09 | |
*** rcernin_ has joined #heat | 22:36 | |
*** rcernin_ has quit IRC | 22:48 | |
*** rcernin has joined #heat | 22:48 | |
*** tkajinam has joined #heat | 22:53 | |
*** cliffparsons has quit IRC | 23:13 | |
*** tosky has quit IRC | 23:21 | |
*** ramishra has quit IRC | 23:52 |