*** bobh has joined #openstack-mistral | 01:12 | |
*** bobh has quit IRC | 01:28 | |
*** bobh has joined #openstack-mistral | 02:49 | |
rakhmerov | d0ugal: I did | 03:43 |
---|---|---|
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: Refactor action execution checker without using scheduler https://review.openstack.org/616096 | 03:54 |
*** bobh has quit IRC | 03:55 | |
*** bobh has joined #openstack-mistral | 04:08 | |
*** bobh has quit IRC | 04:10 | |
*** apetrich has quit IRC | 04:45 | |
*** apetrich has joined #openstack-mistral | 04:46 | |
openstackgerrit | Ayub Massalha proposed openstack/mistral master: Updating the execution_expiration_policy.batch_size to 1000 https://review.openstack.org/616414 | 04:58 |
*** akovi has joined #openstack-mistral | 05:21 | |
*** jtomasek has joined #openstack-mistral | 08:18 | |
*** shardy has joined #openstack-mistral | 09:22 | |
rakhmerov | apetrich: thanks for reviewing! | 09:28 |
apetrich | rakhmerov, no worries. I was trying to finish them yesterday | 09:28 |
apetrich | rakhmerov, I'm almost done on the last one | 09:29 |
rakhmerov | ok | 09:49 |
therve | d0ugal: I have some questions about the action expirer if you're around | 09:57 |
rakhmerov | therve: hi, I might be able to help too or akovi | 09:58 |
therve | rakhmerov: Hey thanks | 09:59 |
therve | rakhmerov: I've seen some weird behavior when it finds a stuck action | 09:59 |
therve | rakhmerov: Does it resume the action? | 09:59 |
rakhmerov | what do you mean by "stuck"? When it's running for too long? | 09:59 |
rakhmerov | it can only move it to ERROR state after it sees that a configured number of heartbeats have been missed | 10:00 |
therve | rakhmerov: When it's detected as expired by the action execution checker | 10:00 |
rakhmerov | yes | 10:00 |
rakhmerov | it fails them | 10:00 |
therve | I've seen something where it fails the current tasks, but goes on the next task in the execution | 10:01 |
rakhmerov | considering them hopeless to ever finish | 10:02 |
rakhmerov | if this failed task has "on-error" or "on-complete" (or one of those things in task-defaults) then it's possible, yes | 10:02 |
d0ugal | therve: here now, was just getting coffee | 10:03 |
d0ugal | therve: but it looks like you might have gotten the answer you needed. Do you have an example? | 10:03 |
therve | d0ugal: Yeah it's a tripleo action | 10:03 |
therve | The issue is that the on_error references <% execution() %> | 10:04 |
d0ugal | aha | 10:04 |
therve | And that fails in a broken way in that case | 10:04 |
d0ugal | Makes sense | 10:04 |
rakhmerov | therve: fails with what error? | 10:04 |
d0ugal | therve: Do we know why it is considered expired? I thought that only happened if the executor crashed/restarted | 10:05 |
rakhmerov | the expression looks OK to me, it's supposed to return a dict with the current wf execution info | 10:05 |
therve | rakhmerov: "WorkflowExecution not found" | 10:05 |
therve | d0ugal: Yes I think the executor restarted | 10:06 |
d0ugal | I guess execution is missing from the context | 10:06 |
d0ugal | hmm, but maybe not | 10:06 |
rakhmerov | d0ugal: not necessarily, it can just be running for too long (longer than heartbeat checking interval multiplied by max missed heartbeats set in the config) | 10:06 |
* d0ugal digs into the code | 10:06 | |
rakhmerov | because, e.g., the executor is too busy | 10:07 |
d0ugal | oh, what is the default max? | 10:07 |
*** shardy has quit IRC | 10:07 | |
rakhmerov | 15 afaik | 10:07 |
d0ugal | max_missed_heartbeats * check_interval? | 10:07 |
rakhmerov | yep | 10:07 |
d0ugal | so that is only 300 seconds | 10:08 |
rakhmerov | yes | 10:08 |
rakhmerov | but there's also another option first_heartbeat_timeout | 10:08 |
rakhmerov | which is 3600 by default | 10:08 |
d0ugal | Right | 10:08 |
rakhmerov | the first heartbeat is OK to come within an hour | 10:08 |
rakhmerov | which is quite reasonable because executors might be busy and don't spread load evenly enough | 10:09 |
*** shardy has joined #openstack-mistral | 10:09 | |
d0ugal | rakhmerov: so an individual action can't take longer than 65 mins (by default)? | 10:09 |
rakhmerov | I guess for most cases 3600 is too big a value though | 10:09 |
d0ugal | yeah | 10:09 |
rakhmerov | d0ugal: right | 10:09 |
d0ugal | We run ansible playbooks in actions - I am not sure how long they can take | 10:10 |
d0ugal | but I suspect some of them could be slow | 10:10 |
rakhmerov | therve: hah, interesting.. "WorkflowExecution not found" is probably a different issue | 10:10 |
rakhmerov | d0ugal: yeah | 10:10 |
*** jrist has quit IRC | 10:14 | |
*** shardy has quit IRC | 10:14 | |
*** jrist has joined #openstack-mistral | 10:16 | |
*** shardy has joined #openstack-mistral | 10:19 | |
*** shardy has quit IRC | 10:38 | |
*** shardy has joined #openstack-mistral | 10:39 | |
*** shardy has quit IRC | 10:44 | |
vgvoleg | Hi everyone! Just registered a little blueprint, please give some feedback https://blueprints.launchpad.net/mistral/+spec/mistral-getting-execution-with-specific-list-of-fields | 11:18 |
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: Allow None for 'params' when starting a workflow execution https://review.openstack.org/616549 | 13:55 |
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: Allow None for 'params' when starting a workflow execution https://review.openstack.org/616549 | 14:01 |
*** jtomasek has quit IRC | 14:02 | |
*** akovi has quit IRC | 14:02 | |
*** jaosorior has quit IRC | 14:02 | |
*** mmethot has quit IRC | 14:02 | |
*** mmethot has joined #openstack-mistral | 14:04 | |
*** jtomasek has joined #openstack-mistral | 14:04 | |
*** jaosorior has joined #openstack-mistral | 14:08 | |
*** bobh has joined #openstack-mistral | 14:16 | |
*** shardy has joined #openstack-mistral | 15:08 | |
*** shardy has quit IRC | 15:09 | |
*** shardy has joined #openstack-mistral | 15:09 | |
*** shardy has quit IRC | 15:09 | |
*** shardy has joined #openstack-mistral | 15:11 | |
*** shardy has quit IRC | 15:40 | |
d0ugal | vgvoleg: That sounds like a good addition to me | 17:03 |
*** irclogbot_1 has quit IRC | 17:31 | |
openstackgerrit | Merged openstack/mistral master: Improve workflow completion logic by removing periodic jobs https://review.openstack.org/607807 | 17:43 |
*** irclogbot_1 has joined #openstack-mistral | 18:02 | |
*** irclogbot_1 has quit IRC | 18:05 | |
*** edleafe_ has joined #openstack-mistral | 18:22 | |
openstackgerrit | Renat Akhmerov proposed openstack/mistral stable/rocky: Improve workflow completion logic by removing periodic jobs https://review.openstack.org/616658 | 18:26 |
*** irclogbot_1 has joined #openstack-mistral | 18:27 | |
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: Add batch size for integrity checker https://review.openstack.org/615720 | 18:27 |
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: Refactor action execution checker without using scheduler https://review.openstack.org/616096 | 18:27 |
*** bobh has quit IRC | 18:29 | |
*** irclogbot_1 has quit IRC | 18:32 | |
*** irclogbot_1 has joined #openstack-mistral | 18:37 | |
*** jtomasek has quit IRC | 20:05 | |
openstackgerrit | Oleg Ovcharuk proposed openstack/mistral master: Support getting execution by id with specific list of fields https://review.openstack.org/616693 | 21:09 |
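
Note on the heartbeat numbers discussed between 10:06 and 10:09: the arithmetic can be sketched in a few lines of Python. This is only a restatement of the figures quoted in the conversation, not Mistral's checker code. The class and function names are hypothetical, and the `check_interval` value of 20 seconds is an assumption inferred from the quoted 300 seconds divided by the quoted default of 15 missed heartbeats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatSettings:
    """Hypothetical container for the three options named in the chat."""
    # Assumption: inferred from "only 300 seconds" / 15, not stated in the log.
    check_interval: int = 20
    # "15 afaik" per the conversation.
    max_missed_heartbeats: int = 15
    # "which is 3600 by default" per the conversation.
    first_heartbeat_timeout: int = 3600


def missed_heartbeat_window(settings: HeartbeatSettings) -> int:
    """Seconds of silence tolerated: max_missed_heartbeats * check_interval."""
    return settings.max_missed_heartbeats * settings.check_interval


def max_silent_seconds(settings: HeartbeatSettings) -> int:
    """Upper bound discussed in the chat: first heartbeat grace period
    plus the missed-heartbeat window."""
    return settings.first_heartbeat_timeout + missed_heartbeat_window(settings)


if __name__ == "__main__":
    defaults = HeartbeatSettings()
    print("missed-heartbeat window (s):", missed_heartbeat_window(defaults))
    print("upper bound (min):", max_silent_seconds(defaults) / 60)
```

With these assumed defaults the window comes out at 300 seconds and the upper bound at 65 minutes, matching the figures d0ugal and rakhmerov arrive at. Deployments that run long actions (for example the ansible playbooks mentioned at 10:10) would presumably need to check these options against their longest expected action.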