openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Modify the REST doc to show a Java job type execution https://review.openstack.org/71423 | 00:02 |
---|---|---|
*** tmckay has left #savanna | 00:06 | |
*** qwerty_nor has quit IRC | 00:08 | |
*** ruhe has joined #savanna | 00:13 | |
*** matsuhashi has joined #savanna | 00:20 | |
*** nosnos has joined #savanna | 00:56 | |
*** ruhe has quit IRC | 01:18 | |
*** mst89 has quit IRC | 01:45 | |
openstackgerrit | Andrew Lazarev proposed a change to openstack/python-savannaclient: Changed base Resource class to prevent changing of passed arguments https://review.openstack.org/71086 | 01:56 |
mattf | alazarev? | 01:57 |
*** IlyaE has quit IRC | 02:11 | |
*** ityaptin has quit IRC | 02:16 | |
*** ityaptin has joined #savanna | 02:16 | |
*** aignatov_ is now known as aignatov | 03:11 | |
*** ghenriks has quit IRC | 04:09 | |
*** ghenriks has joined #savanna | 04:15 | |
*** jcooley_ has quit IRC | 04:43 | |
*** akuznetsov has joined #savanna | 04:45 | |
openstackgerrit | Alexander Ignatov proposed a change to openstack/savanna-dashboard: Compatibility with python-savannaclient>=0.5 https://review.openstack.org/70360 | 04:47 |
*** nosnos_ has joined #savanna | 05:02 | |
*** nosnos has quit IRC | 05:06 | |
*** aignatov is now known as aignatov_ | 05:27 | |
*** akuznetsov has quit IRC | 05:34 | |
*** akuznetsov has joined #savanna | 05:36 | |
*** DinaBelova_ is now known as DinaBelova | 05:41 | |
*** jcooley_ has joined #savanna | 05:49 | |
*** DinaBelova is now known as DinaBelova_ | 06:06 | |
*** matsuhashi has quit IRC | 06:07 | |
*** matsuhashi has joined #savanna | 06:07 | |
*** matsuhas_ has joined #savanna | 06:09 | |
*** matsuhashi has quit IRC | 06:09 | |
openstackgerrit | Jenkins proposed a change to openstack/savanna: Imported Translations from Transifex https://review.openstack.org/70918 | 06:09 |
*** jcooley_ has quit IRC | 06:11 | |
*** jcooley_ has joined #savanna | 06:24 | |
*** _nadya_ has joined #savanna | 06:25 | |
*** IlyaE has joined #savanna | 06:34 | |
*** nosnos has joined #savanna | 06:40 | |
*** nosnos_ has quit IRC | 06:42 | |
*** _nadya_ has quit IRC | 06:43 | |
*** IlyaE has quit IRC | 06:45 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 06:48 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 06:54 | |
*** IlyaE has joined #savanna | 06:58 | |
*** akuznetsov has quit IRC | 07:06 | |
*** akuznetsov has joined #savanna | 07:09 | |
*** IlyaE has quit IRC | 07:09 | |
*** _nadya_ has joined #savanna | 07:09 | |
*** IlyaE has joined #savanna | 07:21 | |
*** _nadya_ has quit IRC | 07:22 | |
*** _nadya_ has joined #savanna | 07:27 | |
*** _nadya_ has quit IRC | 07:28 | |
*** IlyaE has quit IRC | 07:33 | |
*** IlyaE has joined #savanna | 07:41 | |
*** skolekonov has joined #savanna | 07:52 | |
*** IlyaE has quit IRC | 07:54 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 08:20 | |
openstackgerrit | Daniele Venzano proposed a change to openstack/savanna-image-elements: Add a Spark element https://review.openstack.org/71237 | 08:26 |
*** bogdando has joined #savanna | 08:30 | |
*** dmitryme has joined #savanna | 09:00 | |
*** aignatov_ is now known as aignatov | 09:10 | |
*** jcooley_ has quit IRC | 09:32 | |
*** DinaBelova_ is now known as DinaBelova | 09:37 | |
openstackgerrit | A change was merged to openstack/savanna-extra: Small tweak to the wordcount example README https://review.openstack.org/71413 | 10:06 |
openstackgerrit | A change was merged to openstack/savanna-dashboard: Sync with global-requirements https://review.openstack.org/71355 | 10:07 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack/savanna: Modify the REST doc to show a Java job type execution https://review.openstack.org/71423 | 10:13 |
*** ruhe has joined #savanna | 10:21 | |
*** aignatov is now known as aignatov_ | 10:22 | |
openstackgerrit | A change was merged to openstack/savanna: Sync with global-requirements https://review.openstack.org/71356 | 10:37 |
openstackgerrit | Daniele Venzano proposed a change to openstack/savanna-image-elements: Add a Spark element https://review.openstack.org/71237 | 10:49 |
openstackgerrit | A change was merged to openstack/savanna: Extract configs beginning with "edp." from job_configs['configs'] https://review.openstack.org/69712 | 10:50 |
openstackgerrit | A change was merged to openstack/savanna: Generate streaming tag in mapreduce job https://review.openstack.org/69727 | 10:52 |
openstackgerrit | A change was merged to openstack/savanna: Add validation check for streaming elements on MapReduce without libs https://review.openstack.org/69960 | 10:52 |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 10:54 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 10:57 | |
*** aignatov_ is now known as aignatov | 10:58 | |
openstackgerrit | A change was merged to openstack/savanna: Add integration test for streaming mapreduce https://review.openstack.org/70829 | 11:02 |
openstackgerrit | Nikita Konovalov proposed a change to openstack/savanna: Moving rest to Pecan/WSME framework https://review.openstack.org/63908 | 11:05 |
openstackgerrit | Nikita Konovalov proposed a change to openstack/savanna: Moving rest to Pecan/WSME framework https://review.openstack.org/63908 | 11:12 |
*** venza has quit IRC | 11:37 | |
*** venza has joined #savanna | 11:38 | |
openstackgerrit | A change was merged to openstack/savanna-dashboard: Adding floating ip pool to node groups details for cluster https://review.openstack.org/71381 | 11:44 |
*** DinaBelova is now known as DinaBelova_ | 11:52 | |
*** DinaBelova_ is now known as DinaBelova | 12:05 | |
*** IvanBerezovskiy has joined #savanna | 12:19 | |
*** qwerty_nor has joined #savanna | 12:23 | |
openstackgerrit | A change was merged to openstack/savanna: Refactored unit tests structure https://review.openstack.org/70211 | 12:25 |
*** venza has quit IRC | 12:25 | |
*** jcooley_ has joined #savanna | 12:26 | |
*** venza has joined #savanna | 12:26 | |
*** jcooley_ has quit IRC | 12:31 | |
openstackgerrit | Sergey Reshetnyak proposed a change to openstack/savanna: Add IDH plugin https://review.openstack.org/71507 | 12:35 |
openstackgerrit | Yaroslav Lobankov proposed a change to openstack/savanna: Default OpenStack auth port was changed https://review.openstack.org/71509 | 12:42 |
*** dmitryme has quit IRC | 12:43 | |
*** _crobertsrh is now known as crobertsrh | 12:45 | |
*** dmitryme has joined #savanna | 13:06 | |
openstackgerrit | Nikita Konovalov proposed a change to openstack/savanna: Moving rest to Pecan/WSME framework https://review.openstack.org/63908 | 13:41 |
openstackgerrit | Nikita Konovalov proposed a change to openstack/savanna: Moving rest to Pecan/WSME framework https://review.openstack.org/63908 | 13:53 |
*** dmitryme has quit IRC | 13:55 | |
*** jcooley_ has joined #savanna | 13:57 | |
*** jcooley_ has quit IRC | 14:03 | |
*** dmitryme has joined #savanna | 14:09 | |
*** tmckay has joined #savanna | 14:37 | |
*** dmitryme has quit IRC | 14:51 | |
*** dmitryme has joined #savanna | 14:51 | |
*** dmitryme has quit IRC | 15:01 | |
*** matsuhas_ has quit IRC | 15:17 | |
openstackgerrit | A change was merged to openstack/savanna: Default OpenStack auth port was changed https://review.openstack.org/71509 | 15:18 |
*** nosnos has quit IRC | 15:21 | |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Move 'main_class' and 'java_opts' into edp.java configs https://review.openstack.org/69982 | 15:35 |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Update the edp user doc to discuss "edp." configs for Java jobs https://review.openstack.org/71403 | 15:38 |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Modify the REST doc to show a Java job type execution https://review.openstack.org/71423 | 15:38 |
openstackgerrit | Sergey Reshetnyak proposed a change to openstack/savanna: DO NOT MERGE Testing CI with cinder https://review.openstack.org/71572 | 15:43 |
*** jcooley_ has joined #savanna | 15:46 | |
*** aignatov is now known as aignatov_ | 15:48 | |
*** jmaron has joined #savanna | 15:51 | |
*** skostiuchenko has joined #savanna | 15:52 | |
*** jcooley_ has quit IRC | 15:53 | |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Add utilities for supporting dotted job types https://review.openstack.org/71387 | 15:53 |
*** IvanBerezovskiy has left #savanna | 16:05 | |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Remove extra Java job type fields from JobExecutions https://review.openstack.org/70420 | 16:08 |
*** akuznetsov has quit IRC | 16:16 | |
*** akuznetsov has joined #savanna | 16:29 | |
*** jcooley_ has joined #savanna | 16:30 | |
*** jcooley_ has quit IRC | 16:56 | |
*** jcooley_ has joined #savanna | 16:57 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 16:58 | |
*** akuznetsov has quit IRC | 17:02 | |
*** DinaBelova is now known as DinaBelova_ | 17:04 | |
*** mst89 has joined #savanna | 17:19 | |
*** mst89 has quit IRC | 17:23 | |
*** _nadya_ has joined #savanna | 17:29 | |
*** mattf is now known as _mattf | 17:33 | |
*** ylobankov1 has joined #savanna | 17:34 | |
*** DinaBelova_ is now known as DinaBelova | 17:37 | |
*** jcooley_ has quit IRC | 17:41 | |
*** jcooley_ has joined #savanna | 17:42 | |
*** akuznetsov has joined #savanna | 17:42 | |
*** _nadya_ has quit IRC | 17:52 | |
*** aignatov_ is now known as aignatov | 17:53 | |
*** _mattf is now known as mattf | 17:55 | |
SergeyLukjanov | team meeting will be in 5 mins | 17:59 |
*** alazarev has joined #savanna | 18:05 | |
SergeyLukjanov | irc meeting | 18:08 |
SergeyLukjanov | jmaron | 18:08 |
SergeyLukjanov | aignatov | 18:08 |
SergeyLukjanov | ^^ | 18:08 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 18:08 | |
aignatov | SergeyLukjanov: I'm there | 18:08 |
*** DinaBelova is now known as DinaBelova_ | 18:09 | |
*** DinaBelova_ is now known as DinaBelova | 18:11 | |
*** mst89 has joined #savanna | 18:17 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 18:17 | |
*** IlyaE has joined #savanna | 18:20 | |
*** dmitryme has joined #savanna | 18:21 | |
*** jcooley_ has quit IRC | 18:21 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 18:21 | |
*** DinaBelova is now known as DinaBelova_ | 18:23 | |
*** akuznetsov has quit IRC | 18:37 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 18:45 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 18:51 | |
*** jcooley_ has joined #savanna | 18:58 | |
*** ruhe_ has joined #savanna | 18:59 | |
aignatov | mattf: clone of what you meant? | 19:00 |
mattf | more thoughts on cancel/stop use case? | 19:00 |
mattf | clone of running job. basically client can get details, let user edit and start a new job | 19:00 |
mattf | that's a useful feature anyway | 19:01 |
mattf | stop becomes clone + delete old | 19:01 |
*** ylobankov1 has left #savanna | 19:01 | |
aignatov | why need to clone on 'stop'? | 19:01 |
mattf | so the stop use case is to effectively "pause" a job, edit it and "unpause" (or restart) | 19:02 |
tmckay | yes, that's the idea | 19:02 |
mattf | it sounds like the use case has value of "start a new job" by allowing the user to avoid reentering information | 19:03 |
crobertsrh | Yes | 19:03 |
mattf | so clone would solve this by letting the user take a running job, click clone, have prepopulated fields, make minor edits and click run/start/create | 19:03 |
mattf | at that point you have the old job and the new job running. you can just delete the old one manually. | 19:03 |
aignatov | but what if I just don't want to run my old job and create a new one totally different from the first? | 19:03 |
mattf | the ui could even have a clone w/ delete option | 19:04 |
tmckay | mattf, it could also be used like job hold or checkpointing in condor | 19:04 |
tmckay | stop it for a while, run it later, without losing the setup | 19:04 |
tmckay | long running cluster of course | 19:04 |
mattf | tmckay, but everyone agreed that stop was deleting the cluster in transient case | 19:05 |
tmckay | yes | 19:05 |
mattf | deleting cluster wipes out the world, there's no incremental restart there | 19:05 |
crobertsrh | Hmm, that functionality is essentially there....you can currently "relaunch" a currently running job. | 19:05 |
crobertsrh | and then just kill the original one. | 19:05 |
mattf | crobertsrh, funny that | 19:05 |
tmckay | crobertsrh, does relaunch have edit? | 19:06 |
crobertsrh | Yes, relaunch brings up job configs tab pre-populated with previous run | 19:07 |
tmckay | so, maybe that's enough | 19:07 |
tmckay | maybe we have what we want without "cancel". You just have to make sure you relaunch before you delete :) | 19:08 |
tmckay | That would keep the cluster from dying too, wouldn't it, in the transient case? | 19:08 |
crobertsrh | Yeah, that is the difference | 19:08 |
* tmckay has looked at that code, but it's been a while | 19:08 | |
crobertsrh | Not sure how cluster transientness is handled | 19:08 |
crobertsrh | Might just be "if cluster is transient and not busy, blow it away" | 19:08 |
crobertsrh | or it might be "kill cluster after this job is done" | 19:09 |
tmckay | It's pretty simple, iirc. Looks at active jobs | 19:09 |
* tmckay goes to look | 19:09 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 19:09 | |
crobertsrh | Maybe I can just add another button that says "clone" (even though it points to same thing as "relaunch" :) ) | 19:10 |
crobertsrh | or just clarify what "relaunch" is with a better word | 19:10 |
tmckay | jc = conductor.job_execution_count(ctx, | 19:10 |
tmckay | end_time=None, | 19:10 |
tmckay | cluster_id=cluster.id) | 19:10 |
tmckay | if jc > 0: | 19:10 |
tmckay | continue | 19:10 |
tmckay | so, any old job will do it looks like | 19:11 |
tmckay | crobertsrh, relaunch is probably fine, maybe just some verbiage about relaunch semantics in the docs somewhere | 19:12 |
crobertsrh | As long as the job actually makes it into the cluster before you kill the first. | 19:12 |
tmckay | I think that's a pretty good chance. I can go look at what job_execution_count does, I think it's a db read... | 19:12 |
tmckay | The db write would be almost instantaneous | 19:12 |
aignatov | mattf, say you have a cluster with TBs of data and write a job for some scientific ops, let's say in pure java, with tons of configuration, and this job runs for about two days/weeks or whatever. Then some research is done and you find that a pig job will be more effective, but it differs from the pure java job and has its own set of configuration. So the simple use case is to stop/cancel/kill the old job and forget about it; there is nothing to clone because it's | 19:13 |
aignatov | totally unneeded for the new pig job. | 19:14 |
*** DinaBelova_ is now known as DinaBelova | 19:14 | |
mattf | aignatov, ok, where are you headed? | 19:15 |
tmckay | crobertsrh, yes, it's literally a database count of job execution objects where end_time is Null, ie it hasn't completed yet | 19:15 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 19:15 | |
crobertsrh | ok, I'll quit my worrying then | 19:16 |
aignatov | i'd like to see just 'cancel' operation in new rest api :-) | 19:16 |
* mattf grins | 19:17 | |
crobertsrh | I should probably also add a "cancel" option to the job_executions in the dashboard as well....or maybe I'll call it "oops". | 19:17 |
mattf | aignatov, is your concern that the TBs of data cluster was started via a transient job launch and it'll go away? | 19:17 |
mattf | crobertsrh, lol | 19:18 |
aignatov | mattf, I'm talking only about long running clusters, not transient | 19:18 |
mattf | in your use case, what's the difference from launching a new job against the TBs cluster? | 19:19 |
tmckay | hmm, well, if you use the same output source... | 19:20 |
tmckay | then you're leaving it up to human timing. "Probably, this will be fine, because I hit delete right after I hit submit" | 19:20 |
tmckay | Worse if you have a <prep> tag that messes with your paths. The old job might still be writing to it | 19:21 |
mattf | tmckay, client side you can delete before submitting new. you keep the old around long enough to get the details to save editing | 19:21 |
mattf | aignatov, i've a feeling there's something to the case you're after. i'm just failing to grasp it. | 19:22 |
tmckay | mattf, okay, that works. | 19:22 |
aignatov | difference is that my new job can't be created just by cloning old one and modifying few fields | 19:22 |
mattf | what do you gain by cancel/stopping over deleting the running job on a persistent cluster? | 19:23 |
tmckay | aignatov, if there is no value in the old job, why not just "delete" which means stop and erase | 19:23 |
tmckay | what he said | 19:23 |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 19:23 | |
* tmckay will be back soon | 19:24 | |
aignatov | I think stop or cancel means just shut down the map reduce tasks on the cluster, but leave info about the job in the savanna db. 'Delete' I see as 'stop' + 'erase' | 19:25 |
mattf | btw, there's a followup conversation on if stop/cancel should be part of the url or just a PUT (modify) w/ a new job state that the service understands | 19:25 |
aignatov | where erase is clean of savanna db about job execution | 19:25 |
mattf | aignatov, that's exactly how i see it. what i'm not seeing is why you want to care to separate stop & erase | 19:25 |
*** ruhe_ has quit IRC | 19:27 | |
aignatov | in my vision I can stop job at some time and erase it later and these are two separate operations | 19:28 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 19:28 | |
mattf | and you can stop a job and later restart it? | 19:29 |
mattf | the state graph has: PENDING -> RUNNING -> STOPPED -> PENDING ? | 19:29 |
mattf | also PENDING -> RUNNING -> DELETED and PENDING -> DELETED | 19:30 |
mattf | plus * -> ERROR | 19:30 |
crobertsrh | I don't think we have the means to "checkpoint" a job like that, do we? | 19:30 |
mattf | the transition to PENDING would have to mean a fresh start | 19:31 |
mattf | vs RUNNING -> STOPPED -> RUNNING, which is more of a checkpoint transition | 19:31 |
crobertsrh | Ah....that might be an interesting addition too. | 19:32 |
crobertsrh | but not really different from relaunching as a new job_execution | 19:32 |
mattf | it's probably quite possible w/ many big data frameworks, where it was a total pita for generic processes in condor | 19:32 |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 19:32 | |
*** jcooley_ has quit IRC | 19:35 | |
*** jcooley_ has joined #savanna | 19:36 | |
mattf | i need to scoot, bbiab | 19:36 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 19:38 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 19:39 | |
*** jcooley_ has quit IRC | 19:40 | |
*** jcooley_ has joined #savanna | 19:41 | |
tmckay | I see two cases for clone. 1) add another job like one currently running, mostly the same but at least with a different output source and probably more differences | 19:43 |
tmckay | 2) rerun the current job with a config or 2 changed | 19:43 |
tmckay | In #1, we increase the job count | 19:44 |
tmckay | In #2, we just replace the running job. But this is where it gets dicey if the same output data source is used, in my mind. And if you allow it on transients, you can't delete old before you submit new because the cluster might go away in between | 19:45 |
*** jcooley_ has quit IRC | 19:45 | |
tmckay | crobertsrh, ^^ what do you think? Am I missing something? | 19:45 |
tmckay | I'm not sure #2 is really possible with transients. Unless a "stop" operation doesn't fill in the end time field. It's not running, it just never... completed. Then you have to delete it by hand later | 19:47 |
crobertsrh | sorry...was afk for a sec | 19:48 |
*** jcooley_ has joined #savanna | 19:50 | |
crobertsrh | Not sure about #2....it seems to make sense to have a cloned/rerun job get a new job_ex id | 19:50 |
crobertsrh | I suppose the UI could restrict use of some options on jobs that are on a transient cluster...might be confusing though. | 19:51 |
*** aignatov is now known as aignatov_ | 19:56 | |
*** jcooley_ has quit IRC | 19:56 | |
*** jcooley_ has joined #savanna | 19:57 | |
tmckay | crobertsrh, I agree, new job ex id. I'm specifically worried about the case where the same output path is used (swift or hdfs). | 19:57 |
tmckay | The second you have two jobs exist that address the same output path, you have the potential for trouble. Even if that potential is small, I still don't like it | 19:58 |
crobertsrh | Yeah, that is a potential problem. Users definitely need to be aware there. I specifically did NOT keep the input/output source selection for relaunched jobs for that reason. The user needs to re-select input/output even on relaunch. | 19:59 |
*** dmitryme has quit IRC | 19:59 | |
tmckay | so if output is the same, that means somehow delete old before submitting new. But deleting old before submitting new on a transient cluster means the cluster could get deleted in that split second. | 20:00 |
*** mattf is now known as _mattf | 20:00 | |
crobertsrh | Or, in the case of a swift output source, the user might need to handle that cleanup on their own...or have their job pre-clean the output dir | 20:00 |
tmckay | And if you're copying old to new, editing new, deleting old, submitting new, that's got to be orchestrated through the interface and indicated by the user | 20:01 |
tmckay | right | 20:01 |
tmckay | This is where "stop" to leave a job in limbo, do what you need to do, and delete it later may be a good option. Certainly seems simpler from the UI side. | 20:02 |
*** _mattf is now known as mattf | 20:02 | |
crobertsrh | I agree | 20:02 |
tmckay | I think we need a way to mark a job "undead" | 20:03 |
tmckay | to cease action but preserve the cluster | 20:04 |
crobertsrh | Does oozie have a mechanism to pause a job? | 20:06 |
tmckay | that one, I don't know. Pretty easy to check the client, though. | 20:07 |
*** _nadya_ has joined #savanna | 20:09 | |
crobertsrh | Looks like there is a SUSPEND state you can put a workflow into | 20:10 |
tmckay | Managing a Job -- A HTTP PUT request starts, suspends, resumes or kills a job. | 20:10 |
tmckay | :) | 20:10 |
tmckay | so there you go. For a transient cluster with relaunch, I suspect what we really want is suspend, submit, delete suspended | 20:11 |
*** jcooley_ has quit IRC | 20:12 | |
crobertsrh | Yep | 20:12 |
*** jcooley_ has joined #savanna | 20:12 | |
tmckay | If Oozie can do suspend/resume, not sure why Savanna shouldn't | 20:15 |
crobertsrh | Yeah, seems like something we should definitely have. | 20:17 |
*** alazarev has quit IRC | 20:21 | |
tmckay | hmm, suspend is talked about together with something called bundles. I'll have to look more | 20:22 |
*** mst89 has quit IRC | 20:23 | |
*** _nadya_ has quit IRC | 20:24 | |
*** eanxgeek has joined #savanna | 20:31 | |
tmckay | crobertsrh, okay, it apparently applies at multiple levels. http://oozie.apache.org/docs/3.1.3-incubating/DG_CommandLineTool.html#Suspending_a_Workflow_Coordinator_or_Bundle_Job. | 20:36 |
tmckay | The advantage I see of suspend over "cancel" (ie KILL) is that the end_time I assume isn't filled in, so it counts as an active job in the savanna execution count | 20:37 |
crobertsrh | Pretty nifty | 20:37 |
* tmckay goes to work on something more well defined, like dotted names :) | 20:37 | |
tmckay | crobertsrh, well, duh, even without any end_time semantics, the state becomes "suspended". | 20:38 |
tmckay | So that could be tested for | 20:38 |
crobertsrh | right :) | 20:38 |
tmckay | Case of the Thursdays, lol. I think there is a case for every day, eventually... | 20:39 |
crobertsrh | I'd take a case of the Saturdays. | 20:40 |
*** alazarev has joined #savanna | 20:41 | |
*** alazarev has quit IRC | 20:41 | |
tmckay | bah, my alembic migration is still on the wrong branch somehow. Darn this distributed development! | 20:42 |
tmckay | ;-) | 20:42 |
tmckay | time to start this bad boy from master | 20:43 |
mattf | <tmckay> If Oozie can do suspend/resume, not sure why Savanna shouldn't | 20:43 |
mattf | <crobertsrh> Yeah, seems like something we should definitely have. | 20:43 |
mattf | we may not always have oozie as our interface, e.g. the spark addition inbound | 20:43 |
tmckay | mattf, that's a valid point | 20:44 |
mattf | begs the question tho, should spark be added to oozie? | 20:44 |
crobertsrh | Ah, I did not realize we had Spark coming | 20:44 |
tmckay | mattf, does oozie have spark actions? | 20:45 |
mattf | i dunno | 20:45 |
crobertsrh | I'm looking for that right now :) | 20:46 |
tmckay | mattf, so if you read back, there's still something troubling me. It's relaunch on transient with the same output path. | 20:46 |
mattf | http://comments.gmane.org/gmane.comp.lang.scala.spark.user/1417 | 20:46 |
mattf | "Having said that, it is fairly easy to leverage oozie to launch spark | 20:46 |
mattf | jobs - using the java action." | 20:46 |
crobertsrh | same thing I was reading :) | 20:46 |
tmckay | mattf, ah, yes, I was thinking java action... | 20:47 |
* mattf hi5 crobertsrh | 20:47 | |
crobertsrh | hooray google | 20:47 |
tmckay | shoot, in java actions, we can do *stinkin* anything | 20:47 |
mattf | it's the escape hatch | 20:47 |
mattf | enter w/ caution | 20:47 |
tmckay | mattf, I was thinking that "suspend" was a good way to stop a job without losing the cluster, submit the new one, then follow with "kill" | 20:48 |
tmckay | no windows for badness | 20:49 |
mattf | imho, as long as we're not making a new endpoint for cancel/stop/suspend and we're going the HTTP PUT to status field direction, we're in good shape and can add the feature later | 20:49 |
*** eanxgeek is now known as eanxgeek|log | 20:50 | |
mattf | ^^ kinda the middle ground in my mind, since i still can't see the use case that needs cancel | 20:50 |
tmckay | okay, I agree. Do we have an HTTP PUT for status in the CR? I don't recall one... | 20:50 |
mattf | we do not, i can file one | 20:50 |
mattf | aignatov_, thoughts? | 20:51 |
tmckay | okay. That would be great. Then we can do anything we need to under the hood with the oozie client | 20:51 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 21:02 | |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Remove extra Java job type fields from JobExecutions https://review.openstack.org/70420 | 21:09 |
*** DinaBelova is now known as DinaBelova_ | 21:25 | |
*** jcooley_ has quit IRC | 21:28 | |
*** jcooley_ has joined #savanna | 21:29 | |
tmckay | crobertsrh, if https://review.openstack.org/#/c/70810/ is done let's take it out of work in progress | 21:41 |
crobertsrh | We will need the python-savannaclient updated as well. Is that on tap? | 21:41 |
crobertsrh | I took it out of WIP | 21:42 |
tmckay | yeah, moving that out of WIP too. We just have to bite the bullet and request a merge of all at the same time | 21:42 |
crobertsrh | Actually, it might still work even with the old client | 21:42 |
tmckay | yeah, removing job_exec_data isn't strictly necessary. | 21:43 |
crobertsrh | exactly | 21:43 |
tmckay | yay, no more WIP changes | 21:47 |
*** mst89 has joined #savanna | 21:49 | |
*** crobertsrh is now known as _crobertsrh | 21:52 | |
*** IlyaE has quit IRC | 22:08 | |
openstackgerrit | Trevor McKay proposed a change to openstack/savanna: Add utilities for supporting dotted job types https://review.openstack.org/71387 | 22:09 |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 22:11 | |
*** IlyaE has joined #savanna | 22:14 | |
*** tmckay is now known as _tmckay | 22:35 | |
mattf | SergeyLukjanov, fyi, i just finished a pass over the i3 milestone | 23:34 |
openstackgerrit | Andrew Lazarev proposed a change to openstack/savanna: [IDH] Fixed cluster start without jobtracker service https://review.openstack.org/71689 | 23:43 |
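
The job-state discussion in this log (roughly 19:29 to 20:49) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Savanna code: `JobState`, `TRANSITIONS`, `transition`, `cluster_is_idle` and `relaunch_on_transient` are hypothetical names, `SUSPENDED` stands in for the "stop"/"suspend" state that was proposed, and `cluster_is_idle` loosely mirrors the active-job count quoted at 19:10 (jobs whose `end_time` is not set). The point it tries to show is the ordering suggested at 20:11: suspend the old job, submit the new one, then delete the suspended one, so a transient cluster never looks idle in between.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUSPENDED = "SUSPENDED"  # stands in for the STOPPED state floated at 19:29
    DELETED = "DELETED"
    ERROR = "ERROR"


# Allowed transitions, following the graph sketched in the discussion.
# ERROR is reachable from every live state ("* -> ERROR"); treating DELETED
# and ERROR as terminal is an assumption of this sketch.
TRANSITIONS = {
    JobState.PENDING: {JobState.RUNNING, JobState.DELETED, JobState.ERROR},
    JobState.RUNNING: {JobState.SUSPENDED, JobState.DELETED, JobState.ERROR},
    JobState.SUSPENDED: {JobState.RUNNING, JobState.DELETED, JobState.ERROR},
    JobState.DELETED: set(),
    JobState.ERROR: set(),
}

# States that count as "not finished yet", i.e. that keep a transient
# cluster alive (hypothetical analogue of end_time being unset).
ACTIVE = {JobState.PENDING, JobState.RUNNING, JobState.SUSPENDED}


def transition(current, target):
    """Return the target state, or raise ValueError if the move is illegal."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def cluster_is_idle(jobs):
    """True when no job on the cluster is in an active state."""
    return not any(state in ACTIVE for state in jobs.values())


def relaunch_on_transient(jobs, old_id, new_id):
    """Suspend the old job, submit the new one, then delete the old one.

    Returns the idle check observed after each of the three steps.
    """
    idle_seen = []
    jobs[old_id] = transition(jobs[old_id], JobState.SUSPENDED)
    idle_seen.append(cluster_is_idle(jobs))
    jobs[new_id] = JobState.PENDING  # the new job execution is submitted
    idle_seen.append(cluster_is_idle(jobs))
    jobs[old_id] = transition(jobs[old_id], JobState.DELETED)
    idle_seen.append(cluster_is_idle(jobs))
    return idle_seen
```

With a single running job, deleting it first leaves the cluster idle until the new job is submitted, which is the window tmckay was worried about; going through `relaunch_on_transient` keeps at least one active job at every step.
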