Friday, 2016-09-16

*** bobh has joined #openstack-mistral00:23
*** bobh has quit IRC00:23
*** bobh has joined #openstack-mistral00:27
*** gyee has quit IRC00:56
*** bobh has quit IRC00:56
*** bobh has joined #openstack-mistral01:13
*** bobh has quit IRC01:44
*** bobh has joined #openstack-mistral01:46
openstackgerritWinson Chan proposed openstack/mistral: Abstract authentication function  https://review.openstack.org/37123301:47
openstackgerritWinson Chan proposed openstack/python-mistralclient: Abstract authentication function  https://review.openstack.org/37123401:48
*** bobh has quit IRC01:50
*** bobh has joined #openstack-mistral01:53
*** rrecio_ has quit IRC02:05
*** bobh has quit IRC02:17
*** catintheroof has joined #openstack-mistral02:27
*** vishwanathj has quit IRC02:39
*** janki has joined #openstack-mistral02:50
*** gmann has quit IRC03:35
*** gmann has joined #openstack-mistral03:37
*** janki has quit IRC03:42
*** sharatss has joined #openstack-mistral03:46
*** hparekh has joined #openstack-mistral04:24
*** harlowja has quit IRC04:45
*** jaosorior has joined #openstack-mistral04:57
*** vishwanathj has joined #openstack-mistral04:59
*** vishwanathj has quit IRC05:00
openstackgerritLucky samadhiya proposed openstack/mistral: delete python bytecode including pyo before every test run  https://review.openstack.org/37128606:16
*** Ravikiran_K has joined #openstack-mistral06:34
*** stevebaker has quit IRC07:20
*** stevebaker has joined #openstack-mistral07:23
openstackgerritLucky samadhiya proposed openstack/python-mistralclient: delete python bytecode including pyo before every test run  https://review.openstack.org/37130507:29
*** kong has quit IRC07:41
openstackgerritMerged openstack/python-mistralclient: Updated from global requirements  https://review.openstack.org/37112307:54
*** shardy has joined #openstack-mistral07:58
d0ugalddeja, rakhmerov - Hey, so just a small update, we are not totally sure what happened but we seemed to be causing a race condition. We managed to resolve it in CI for now, I am going to try and re-create it now we know more next week and hopefully come up with a mistral bug that is actionable :)07:59
*** jpich has joined #openstack-mistral08:01
*** openstackgerrit has quit IRC08:03
*** openstackgerrit has joined #openstack-mistral08:04
therved0ugal, Still no good idea what happened?08:17
d0ugald0ugal: nah, we know - sort of08:18
d0ugaltherve: to the point that we were able to stop it happening :)08:19
d0ugaltherve: but we have not fixed it yet, just removed the commit that highlighted the issue08:19
therved0ugal, Didn't you just remove a bunch of tasks?08:19
d0ugaltherve: Yeah08:19
therveSo yeah no good idea :D08:20
d0ugaltherve: so, openstack undercloud install at the end starts off a workflow, but didn't wait for it to finish - it is fairly quick, but we think some action calls were started by following commands that caused a race condition08:20
d0ugaltherve: we have a theory, but I need to write some nasty shell scripts to try and reproduce it at some point08:20
therveAh, ok08:20
d0ugalMy personal theory is that we start a Mistral workflow, then we start some direct synchronous action calls while the workflow is still running08:21
d0ugaland because there is only one vCPU on the multinode jobs, what happens then?08:21
d0ugal(and one CPU means one worker in Mistral by default I think)08:22
therveThere are 4 vcpu AFAIK08:22
d0ugalIf that is correct, it shouldn't be that hard to try. I just need to re-create my undercloud with one vCPU08:22
therveBut still only one executor and one engine08:22
d0ugaltherve: oh right, I am sure slagle said the multinodes only have 1, but I may have misunderstood that bit08:23
d0ugal(I fail to understand quite a few things about CI)08:23
therved0ugal, Still, that only matters for the API AFAICT08:23
d0ugaltherve: Yeah, that is the one thing that makes me doubt it :)08:23
d0ugalI'll try and re-create it with various combinations of this anyway08:24
d0ugalActually, I am going to try now. This seems more fun than rebasing patches.08:25
*** brunograz_ has quit IRC08:26
*** sharatss has quit IRC08:27
*** jchhatbar has joined #openstack-mistral08:30
*** nmakhotkin has joined #openstack-mistral08:30
*** jchhatbar is now known as janki08:32
*** shardy has quit IRC08:40
therveAhah08:51
therved0ugal, Reproduced08:51
d0ugaltherve: oh, how?08:51
therved0ugal, Simply using run-action08:51
d0ugaltherve: when?08:51
thervemistral run-action mistral.executions_create inputdata.json08:51
d0ugaltherve: Sure, I know the command.08:52
therveThe executor will talk back to the engine via the API, and that's when you have the cycle, so the timeout08:52
d0ugaltherve: I don't fully follow - can you share the full example?08:53
therveHum sure let me try to have the simplest example08:53
d0ugaltherve: oh, is this to do with 1 worker?08:53
d0ugalI think maybe I do understand08:53
therveWell I have 4 API workers, but it doesn't matter08:54
d0ugalthe API is waiting for the response from the engine on the API?08:54
d0ugalah08:54
d0ugalThen I am missing the point :)08:54
therveThe engine is waiting for the executor, which is waiting for the API, which is waiting for the engine08:56
d0ugalright08:56
therved0ugal, http://paste.openstack.org/show/579281/08:57
therveSimplest reproducer08:57
d0ugaltherve: $ mistral run-action mistral.action_executions_create '{"name": "std.echo"}'09:02
d0ugal:)09:02
therveThat should do the trick too :)09:03
therveI should have added "Simplest that I found" :)09:03
d0ugalhah, indeed.09:03
d0ugaltherve: Thanks for finding it.09:03
therveActually that one doesn't work :)09:03
d0ugalIt doesn't?09:04
d0ugalhttp://paste.openstack.org/show/579295/09:04
therveWeird09:05
therved0ugal, Oh because I worked around the problem09:05
therveIt does trigger the timeout09:06
d0ugaltherve: How did you workaround it?09:06
therved0ugal, I changed the oslo messaging executor to threading09:06
d0ugalah09:06
openstackgerritNikolay Mahotkin proposed openstack/mistral: Updating mistralclient docs  https://review.openstack.org/37082109:08
d0ugaltherve: thanks for finding it, are you going to open a bug?09:10
therved0ugal, Sure, doing it now09:11
d0ugalI've never liked our use of direct action calls, I'd like us to move away from them09:12
d0ugalMaybe I can use this as another excuse to do so :)09:12
therveSure, doesn't mean it should blow up :)09:13
therved0ugal, https://bugs.launchpad.net/mistral/+bug/162428409:14
openstackLaunchpad bug 1624284 in Mistral "MessagingTimeout when executing mistral actions" [Undecided,New]09:14
therverakhmerov ^^^09:14
d0ugaltherve: so reverting that mistral patch would have worked, if our CI wasn't broken :-D09:17
therved0ugal, Yep09:17
d0ugalI guess the solution is to make it configurable?09:17
therveI *doubt* it09:18
d0ugalI'm not sure how hard it is to support multiple messaging executors09:18
d0ugalah, ok09:18
therveThe solution is to use oslo_messaging properly. Which to be fair is somewhat hard.09:18
d0ugal:)09:19
*** shardy has joined #openstack-mistral09:26
d0ugaltherve: so, here is the bit I still don't really understand...09:39
d0ugaltherve: How do we re-create it with the tripleo actions?09:39
d0ugalIt sort of makes sense to me that it would fail with a mistral one like that09:40
therved0ugal, That part I'm not sure. Possibly the environment interactions?09:41
d0ugaltherve: oh, good point. That in effect creates nested actions I guess.09:41
therved0ugal, I tried to track down usage of mistral itself in the actions, and it seems to only be about using environments09:42
d0ugalRight, that makes sense09:43
d0ugalI didn't think I'd understand it by the end of this week, so I'm glad that I sort-of do09:43
d0ugalThanks :)09:44
*** sharatss has joined #openstack-mistral09:45
therveYou're welcome09:45
therveThat was an itch I had to scratch :)09:45
openstackgerritDougal Matthews proposed openstack/mistral: Correct documentation about task attributes 'action' and 'workflow'  https://review.openstack.org/37140410:03
*** janki has quit IRC10:22
*** kong has joined #openstack-mistral10:38
d0ugaltherve: To change to threading, did you just change the executor string?10:49
therved0ugal, Yep10:49
d0ugaltherve: to "threading"?10:49
therveYep10:49
d0ugaltherve: cool, I am going to try it in CI with a depends on, just to link it all together and check that works.10:49
openstackgerritDougal Matthews proposed openstack/mistral: [TESTING] Change the oslo messaging executor to threading  https://review.openstack.org/37143510:53
openstackgerritHardik Parekh proposed openstack/python-mistralclient: Added pagination options for workflow and actions  https://review.openstack.org/37143910:57
openstackgerritHardik Parekh proposed openstack/python-mistralclient: Added pagination options for workflow and actions  https://review.openstack.org/37143910:59
openstackgerritDougal Matthews proposed openstack/mistral: Correct documentation about task attributes 'action' and 'workflow'  https://review.openstack.org/37140411:20
*** catintheroof has quit IRC11:39
*** dprince has joined #openstack-mistral12:00
*** bobh has joined #openstack-mistral12:03
*** hparekh has quit IRC12:08
*** catintheroof has joined #openstack-mistral12:27
openstackgerritMerged openstack/mistral: Correct documentation about task attributes 'action' and 'workflow'  https://review.openstack.org/37140412:28
*** bobh has quit IRC12:29
*** vishwanathj has joined #openstack-mistral12:55
ddejatherve, d0ugal Hi guys, I'm glad that you have found the issue13:00
ddejatalking about changing blocking into threading - It was threading before, but rakhmerov changed it to blocking to make mistral work faster if I remember correctly13:01
ddejaI know how it sounds - how making something blocking instead of threading makes things faster. But this enables us to use a non-locking model in the DB13:02
ddejaI would like to ask you, why do you call mistral from mistral? Because I'm not sure if the bug is really a bug, mistral calling itself doesn't sound like a good idea to me13:03
*** jaosorior has quit IRC13:03
*** jaosorior has joined #openstack-mistral13:04
*** bobh has joined #openstack-mistral13:06
therveddeja, Because actions can do several things13:07
therveThat's a serious limitation. It's also one that I reproduced quickly, but I'm pretty sure I can find other problems with using blocking13:07
ddejatherve: OK. Could you show a real example where mistral calling itself is used? Because I can't think of a scenario where it is needed13:09
therveddeja, https://github.com/openstack/tripleo-common/blob/master/tripleo_common/actions/plan.py#L9113:10
ddejawhy not put this into a workflow? and on success call another action?13:12
*** bobh has quit IRC13:12
therveSure there are workarounds13:12
therveBut the behavior is still broken13:12
ddejaAgree13:13
ddejaNevertheless I'm wondering if such usage of mistral is OK, I would like to see Renat's opinion13:13
therveUsing the blocking executor is trying to hide concurrency issues under the rug13:15
ddejatherve: let me find the patch where it was changed13:16
therveddeja, https://review.openstack.org/#/c/356343/13:17
ddejatherve: oh, and there is an explanation in this patch13:20
therveSome, yeah13:21
*** tonytan4ever has joined #openstack-mistral13:27
*** Ravikiran_K has quit IRC13:36
*** tonytan_brb has joined #openstack-mistral13:44
*** brian_price has quit IRC13:46
*** tonytan4ever has quit IRC13:46
ddejatherve: Oh, now I got it. You are calling the mistral environment-create from a mistral action, but it hangs on the API. It is a serious issue13:47
openstackgerritMerged openstack/mistral: Take os_actions_endpoint_type into use  https://review.openstack.org/36925313:48
ddejatherve: but I've tested it now - it works fine (creating an environment while action is running)13:57
therveddeja, While the action is running, or inside the action itself?13:58
ddejatherve: oh...13:58
ddejathat matters, right?13:59
therveWell yeah :)13:59
ddejabut I can't think why? It just creates a mistralclient object and calls the API13:59
ddejait shouldn't matter14:00
ddejabecause I see an obvious deadlock with calling mistral run-action from a mistral action, but this should work...14:01
ddejalet me try it on my devstack14:01
*** bobh has joined #openstack-mistral14:09
d0ugalddeja: it is a bug in that we have been doing it for months and it just stopped working :(14:10
*** jaosorior has quit IRC14:10
d0ugalddeja: we first started using this approach late last year actually14:10
*** nmakhotkin has quit IRC14:10
d0ugalddeja: From what I remember, dprince and/or rbrady did chat with rakhmerov about our usage place (with environments)14:11
openstackgerritSharat Sharma proposed openstack/mistral-lib: Small changes like deletion of extra underline in the docs  https://review.openstack.org/37157114:13
*** bobh has quit IRC14:13
ddejad0ugal: well, but it seems it works fine14:17
ddejad0ugal, therve: http://paste.openstack.org/show/580024/14:17
therveddeja, Not sure what I'm looking at14:18
therveAh you made a env create call14:18
ddejatherve: yes. Inside the std.noop.  But I've used a debugger14:19
ddejaso I'm just pasting this line into the action body to be sure14:19
therveddeja, It should be in the executor, I think?14:19
therveYou seem to be in the engine here14:19
ddejait is executor14:19
*** tonytan_brb is now known as tonytan4ever14:20
d0ugalddeja: FWIW, it works in one of our CI systems but not the other14:20
d0ugalWe don't know why that is.14:20
ddejawell, calling env.create from action execution works fine...14:21
ddejaat least on devstack with ubuntu14:21
d0ugalddeja: Yeah, and this is why it's been so hard to track down because it seems to work in some places fine14:22
ddejait is failing on Centos, right?14:22
therveFWIW env create was just a hypothesis14:22
therveIn theory env create just hits the database in the API, so it shouldn't raise an issue14:22
ddejatherve: env create should be OK. I'm wondering if there may be a place where mistral run-action is called from an action14:23
ddejabecause that will block things14:23
ddejad0ugal, therve is it failing on CentOS only?14:25
therveIt's unlikely to be the issue14:25
d0ugalI agree with therve, but as far as I know that is the only place it has been seen.14:25
ddejaI'm not sure - we have recently changed how the API service is being run. Maybe the default config on CentOS differs from the one on Ubuntu14:26
therveFeel free to dissect tripleo logs to find the right cause14:30
therveOr just use the simple example I gave which trivially reproduces a similar issue14:30
d0ugalddeja: Can you point me to that change?14:31
ddejad0ugal: well, it doesn't matter. It uses python, so it is the same independent of the OS14:32
d0ugalTrue, in theory :)14:32
ddejaI've set the number of workers to 1, but still can't reproduce it :/14:34
ddejad0ugal: do you have a Ceno14:39
ddejaCentOS with mistral?14:39
d0ugalddeja: Yup!14:39
d0ugalddeja: I won't have much time to look at it again today, but I will be focusing on this on Monday14:39
ddejacan you run 'sudo ps aux | grep mistral' and tell how many api services are being run?14:40
ddejajust this one cmd, I'm curious if it differs from ubuntu ;)14:40
*** kong has quit IRC14:51
*** rrecio_ has joined #openstack-mistral14:57
*** bobh has joined #openstack-mistral15:10
*** bobh has quit IRC15:14
ddejad0ugal, therve, guys, I found the root cause15:47
ddejaor maybe not...15:48
*** tonytan4ever has quit IRC16:04
*** bobh has joined #openstack-mistral16:10
*** bobh has quit IRC16:12
*** bobh has joined #openstack-mistral16:12
*** sharatss has quit IRC16:15
*** dprince has quit IRC16:24
*** jpich has quit IRC16:33
*** bobh_ has joined #openstack-mistral16:34
*** bobh has quit IRC16:37
*** bobh_ has quit IRC16:42
*** tonytan4ever has joined #openstack-mistral17:04
*** tonytan4ever has quit IRC17:10
*** tonytan4ever has joined #openstack-mistral17:29
ddejad0ugal, therve, rakhmerov: Ok guys, now I really found the root cause (after really carefully reading what's in the tripleo failing gate logs). I left a note under the bug report https://bugs.launchpad.net/mistral/+bug/162428417:37
openstackLaunchpad bug 1624284 in Mistral "MessagingTimeout when executing mistral actions" [Undecided,Confirmed] - Assigned to Dawid Deja (dawid-deja-0)17:37
*** bobh has joined #openstack-mistral17:43
*** bobh has quit IRC17:47
*** bobh has joined #openstack-mistral17:50
*** harlowja has joined #openstack-mistral18:13
*** bobh has quit IRC18:27
*** chlong_ has quit IRC18:27
rbradyddeja: thanks for the update!18:32
*** brian_price has joined #openstack-mistral18:39
*** bobh has joined #openstack-mistral19:28
*** bobh has quit IRC19:37
*** dprince has joined #openstack-mistral19:37
*** dprince has quit IRC19:39
therveddeja, Sounds about right. That's mostly the same that I was doing though :)19:57
therveThe interaction between several calls is more worrying20:00
*** bobh has joined #openstack-mistral20:18
*** bobh has quit IRC20:39
*** vishwanathj has quit IRC21:22
*** bobh has joined #openstack-mistral22:03
*** bobh has quit IRC22:15
*** catintheroof has quit IRC22:31
*** brian_price has quit IRC23:47
*** bobh has joined #openstack-mistral23:52
