Hiho,

I can't remember exactly where the suggestion came from, but here it is anyway.

Wouldn't it be useful to have jobs implementing the task interface? The advantages would be:

  - you can put jobs into task containers, and wait() for them to change state/finish
  - if we have notification on tasks (see earlier mail from November or so), we get notification on jobs as well, w/o the need to introduce a new mechanism
  - it seems simple to map the job states to task states:

      Done       -> DoneOK
      Failed     -> DoneFail
      all others -> Running

    (as a side note, we should consider having the same names for Done and Failed...)

The only difference I see in terms of semantics is that tasks can be Pending - i.e. not yet started. Jobs can't: as soon as you create them, they are at least Queued.

However, that is a topic I wanted to open anyway: one of our earliest use cases is bulk job submission (from Hrabri/DRMAA). The current API does not allow that use case to be implemented efficiently - the only possibility is:

  for ( int i = 0; i < 1000000; i++ )
  {
    my_jobs[i] = job_server.submit_job (job_description[i]);
  }

Having jobs more task-like would help:

  for ( int i = 0; i < 1000000; i++ )
  {
    task_container.add_task (job_server.create_job (job_description[i]));
  }
  task_container.run ();

The semantics would now be:

  job = job_server.create_job (job_description); // job is pending now
  job.run ();                                    // job is queued now, or whatever - see previous mail.

That argument is very similar to the one we had about task creation. There we have:

  task t1 = file.read <async> (...); // t1 is running
  task t2 = file.read <task>  (...); // t2 is pending
  t2.run ();                         // t2 is running

The same for jobs then:

  job j1 = job_server.submit_job <async> (...); // j1 is running
  job j2 = job_server.create_job <task>  (...); // j2 is pending
  j2.run ();                                    // j2 is running

Feedback, as always, would be very welcome. An issue in that respect is in the issue list (since a couple of weeks actually).

Best regards,

  Andre.
-- "So much time, so little to do..." -- Garfield
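The state mapping proposed in the mail above can be written down as a small function. This is only a sketch under stated assumptions: the enum and function names (`job_state`, `task_state`, `to_task_state`, `is_final`) are hypothetical and are not taken from any SAGA header; only the mapping itself (Done -> DoneOK, Failed -> DoneFail, all others -> Running) comes from the mail.

```cpp
// Hypothetical state enums; only the mapping itself is taken from the mail.
enum class job_state  { Queued, Running, Done, Failed };
enum class task_state { Pending, Running, DoneOK, DoneFail };

// Map a job state onto the task state a task container would observe.
task_state to_task_state (job_state s)
{
  switch ( s )
  {
    case job_state::Done:   return task_state::DoneOK;
    case job_state::Failed: return task_state::DoneFail;
    default:                return task_state::Running; // all others
  }
}

// A job counts as finished once its mapped task state is DoneOK or DoneFail.
bool is_final (job_state s)
{
  const task_state t = to_task_state (s);
  return t == task_state::DoneOK || t == task_state::DoneFail;
}
```

Note that nothing maps to `task_state::Pending` here, which is exactly the semantic gap the mail goes on to discuss.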
I fully support having the task interface implemented on jobs. An additional use case for this (apart from what Andre mentioned) would be to integrate jobs as first-class citizens in workflows built out of saga::task's.

Regards Hartmut

Andre Merzky wrote:
Wouldn't it be useful to have jobs implementing the task interface?
Certainly, no. Jobs and Tasks are two different things, and they are this on purpose.

However, Tasks always have been the mechanism for asynchronous operation, which is kind-of obsoleted by having asynchronous ops directly.

If you want to work on the "S" of SAGA: why not unify both Tasks and Jobs into a better "Job" notion, and do local asynchronous operations via async, local calls?

Thilo
--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Quoting [Thilo Kielmann] (Feb 04 2006):
Date: Sat, 4 Feb 2006 22:29:41 +0100
From: Thilo Kielmann
To: saga-rg@ggf.org
Subject: Re: [saga-rg] tasks and jobs...

Wouldn't it be useful to have jobs implementing the task interface?
Certainly, no. Jobs and Tasks are two different things, and they are this on purpose.
Hmm, I am getting confused, as you suggest below to unify them...
However, Tasks always have been the mechanism for asynchronous operation, which is kind-of obsoleted by having asynchronous ops directly.
Well, not really I think - as tasks _are_ our asynchronous operations.

synchronous seek:

  off_t new_pos = file.seek (pos, whence);                   // 1

asynchronous seek:

  saga::task t2 = file.seek <async> (pos, whence, &newpos);  // 2
  saga::task t3 = file.seek <task>  (pos, whence, &newpos);  // 3

(The only difference between 2 and 3 is that the task is running (2) or pending (3).)
If you want to work on the "S" of SAGA: why not unify both Tasks and Jobs into a better "Job" notion, and do local asynchronous operations via async, local calls?
I seem to misunderstand you, as I think that is what we do right now... isn't it? And we don't want to replace tasks with jobs, as that is pure overkill (migrate an async call? having pre-execution state? etc...) - so that is probably not what you mean either?

Cheers, Andre.
Well, not really I think - as tasks _are_ our asynchronous operations.
No, they are not. We have:

  - synchronous operation
  - asynchronous operation   ; starts immediately
  - task creating operation  ; needs explicit start

My suggestion is to remove the tasks, and put everything useful from the task interface into the job interface. And do async operations by async operations.

Thilo
--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
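The three flavours listed above (synchronous, asynchronous with immediate start, task with explicit start) can be mimicked in standard C++ to make the distinction concrete. This is a sketch under stated assumptions, not SAGA code: `seek_sync`, `seek_async` and `seek_task` are hypothetical names, and the "seek" simply returns the position it was given.

```cpp
#include <future>
#include <stdexcept>

// Synchronous flavour: a stand-in for a blocking call such as file.seek().
long seek_sync (long pos) { return pos; }

// Asynchronous flavour: the operation starts immediately on another thread.
std::future<long> seek_async (long pos)
{
  return std::async (std::launch::async, seek_sync, pos);
}

// Task flavour: the handle stays pending until run() is called explicitly.
class seek_task
{
  public:
    explicit seek_task (long pos)
      : op_     ([pos] { return seek_sync (pos); })
      , result_ (op_.get_future ())
    {}

    bool pending () const { return !started_; }

    // Explicit start. For brevity the work runs in the calling thread;
    // a real binding would hand it to a worker.
    void run ()
    {
      if ( started_ ) return;
      started_ = true;
      op_ ();
    }

    // Returns the result; may be called only once, and only after run().
    long wait ()
    {
      if ( !started_ ) throw std::logic_error ("task was never run");
      return result_.get ();
    }

  private:
    std::packaged_task<long ()> op_;
    std::future<long>           result_;
    bool                        started_ = false;
};
```

In this sketch both non-blocking flavours end up behind a future, which is the sense in which the two kinds of handle can be seen as essentially the same thing; the only extra on the task flavour is `run()`.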
Quoting [Thilo Kielmann] (Feb 04 2006):
Well, not really I think - as tasks _are_ our asynchronous operations.
No, they are not. We have:
  - synchronous operation
  - asynchronous operation   ; starts immediately
  - task creating operation  ; needs explicit start
Both of the latter calls return a task handle (or call it async op handle); that is why, in SAGA, they are essentially the same.
My suggestion is to remove the tasks, [...] And do async operations by async operations.
The task only has run, cancel and wait, apart from state inspection. Cancel and wait are also needed by the async version. So you suggest to get rid of run? Well, we had that discussion. Also,
and put everything useful from the task interface into the job interface.
Well, our point is really that everything useful from task should go into the job interface (job should implement task). So, maybe you are actually in violent agreement, kind of? :-) Somehow I doubt it :-P

Still somewhat confused, Andre.
I am actually in favour of removing the concept of "tasks" from the SAGA spec. It simply creates intellectual overhead. (Remember people keep asking about the difference of tasks and jobs??)
  - synchronous operation
  - asynchronous operation   ; starts immediately
  - task creating operation  ; needs explicit start
Both of the latter calls return a task handle (or call it async op handle); that is why, in SAGA, they are essentially the same.
so, call it async handle
Well, our point is really that everything useful from task should go into the job interface (job should implement task).
So, maybe you are actually in violent agreement, kind of? :-) Somehow I doubt it :-P
violent, at least -- agreement, maybe?

Thilo
--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Quoting [Thilo Kielmann] (Feb 04 2006):
I am actually in favour of removing the concept of "tasks" from the SAGA spec. It simply creates intellectual overhead.
Well, I think we had that discussion already. And I am happy that it is closed ;-)
(Remember people keep asking about the difference of tasks and jobs??)
Right. One more reason to unify them... :-)
Well, our point is really that everything useful from task should go into the job interface (job should implement task).
So, maybe you are actually in violent agreement, kind of? :-) Somehow I doubt it :-P
violent, at least -- agreement, maybe?
hehe :-) Andre.
Hi,

I just had a discussion with Thilo about the topic, as he and I obviously talked somewhat orthogonally to each other... Well, now we have the same opinion, kind of, and I have barely any bruises...

Anyway, I want to summarize our point here, as I probably was not really clear in my initial post. Sorry if re-iteration of the topic bores you...

So, we have tasks, which represent async operations, with a couple of states attached, and the ability to call run(), wait() and cancel() on these. And we can collect them in containers, and wait() on many of these tasks conveniently.

And then we have jobs, which represent remote executables, with a couple of states attached, and the ability to call run (== create them), wait() and cancel(). And some more methods. And we can't collect them in containers right now, but would like to.

You see the similarities, right? It's even more obvious in code:

Tasks:
--------------------------------------------
  task_container tc;

  task t = file.copy <saga::task> (...);
  t.run  ( );
  t.wait (1.0);

  tc.add  (t);
  tc.wait ( );
--------------------------------------------

Jobs:
--------------------------------------------
  job_container jc;

  job j = job_server.submit (job_descr);
  j.wait (1.0);

  jc.add  (j);
  jc.wait ( );
--------------------------------------------

slightly changed:
--------------------------------------------
  job_container jc;

! job j = job_server.create (job_descr);
+ j.run  ( );
  j.wait (1.0);

  jc.add  (j);
  jc.wait ( );
--------------------------------------------

The similarities are obvious I think. Now, if job would IMPLEMENT the task interface (or inherit from task), we would unify both classes, and hence:

  - simplify jobs (leave only those methods which are specific to jobs, like migrate, signal, ...)
  - allow to put jobs into task containers, efficiently handling large amounts of jobs and other tasks
  - have the API and used paradigms more uniform.

Also, if later tasks get suspendable, as Gregor rightly suggested, we can move more methods to tasks, w/o breaking the paradigms.

In terms of state, the following mappings would be appropriate:

  job::Pending                             -> task::Pending
  job::Done                                -> task::Done
  job::Failed                              -> task::Failed
  job::???                                 -> task::Cancelled
  job::Queued,Running,Pre/Poststaging,...  -> task::Running

So, no adjustments to the state models are needed AFAICS, apart from Cancelled (Does it make sense on jobs? Should job::stop be job::cancel? Should task::cancel be task::stop?)

Hope that clarifies things. I think Gregor was on target with his remarks, and Hartmut signalled consent as well. And I think I convinced Thilo (Andre rubs his bruises).

Unless there is any opposition, I'll go ahead and document that in the strawman then, ok?

Cheers, Andre.
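To illustrate what "job IMPLEMENTS the task interface" could look like, here is a minimal C++ sketch. All names (`state`, `task`, `job`, `task_container`, and the method signatures) are assumptions made for illustration and are not the SAGA API; the state changes are simulated locally rather than driven by a real resource manager.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical unified state model, following the mapping in the mail.
enum class state { Pending, Running, Done, Failed, Cancelled };

// The task interface: run, wait, cancel and state inspection only.
struct task
{
  virtual ~task () = default;
  virtual void  run       ()       = 0;
  virtual void  wait      ()       = 0;
  virtual void  cancel    ()       = 0;
  virtual state get_state () const = 0;
};

// A job implements task and keeps only job-specific extras such as migrate().
class job : public task
{
  public:
    explicit job (std::string description) : descr_ (std::move (description)) {}

    // Simulated transitions; a real job would talk to a resource manager.
    void run    () override { if ( st_ == state::Pending ) st_ = state::Running; }
    void wait   () override { if ( st_ == state::Running ) st_ = state::Done;    }
    void cancel () override
    {
      if ( st_ == state::Pending || st_ == state::Running ) st_ = state::Cancelled;
    }
    state get_state () const override { return st_; }

    void migrate (const std::string & host) { host_ = host; } // job-only method

  private:
    std::string descr_;
    std::string host_;
    state       st_ = state::Pending;
};

// The container only sees tasks, so it can hold jobs and other tasks alike.
class task_container
{
  public:
    void add  (std::shared_ptr<task> t) { tasks_.push_back (std::move (t)); }
    void run  () { for ( auto & t : tasks_ ) t->run  (); }
    void wait () { for ( auto & t : tasks_ ) t->wait (); }

  private:
    std::vector<std::shared_ptr<task>> tasks_;
};
```

With such a layout, a `std::make_shared<job> (...)` can be added to the container next to any other task, and `tc.run ()` followed by `tc.wait ()` drives both kinds through the same calls.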
Ouch! Andre has a pretty good punch!
I agree with his proposal ;-)

Thilo

On Mon, Feb 06, 2006 at 04:22:55PM +0100, Andre Merzky wrote:
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Andre,
I have a couple of reservations about this action that you may be able to
answer.
I had been hoping to avoid implementing the 'Task' namespace in the java
bindings and encourage developers to use the language's support for
threading to allow asynchronous method calls in the client code.
I am therefore concerned about creating a dependency between the
'JobManagement' namespace and the 'Task' namespace.
The submission of remote jobs is naturally asynchronous, and there are
natural semantic parallels to the asynchronous model described by the
'Task' namespace. However from my reading of the API I understood these two
models to be independent in purpose; creating a dependency could hinder the
natural description (and development) of these two areas of the API.
The idea of a job container for the management of a large number of remote
jobs is useful. However the TaskContainer does not appear to be wholly
compatible; the run() method to start the asynchronous operations is
unnecessary for jobs that have been submitted to a remote resource.
Graeme
Hi Graeme, Quoting [G.E.POUND@soton.ac.uk] (Feb 06 2006):
Andre,
I have a couple of reservations about this action that you may be able to answer.
I had been hoping to avoid implementing the 'Task' namespace in the java bindings and encourage developers to use the language's support for threading to allow asynchronous method calls in the client code.
I understood that much from your last comments, but my limited knowledge does not allow me to give a qualified answer I'm afraid. Hmm, maybe we can work this out together :-)

Could you post some code examples to demonstrate how an asynchronous seek (for example) would be coded in a java application?

The C++ code would be:

  saga::file f (url);
  saga::task t = f.seek <saga::task> (off, whence, &pos);

  t.run  ();
  t.wait ();

Next question is, how would a java application manage many tasks - i.e. is there something similar to a saga::task_container ?

  saga::task_container tc;

  tc.add (task_1);
  tc.add (task_2);
  tc.add (task_3);

  tc.run  ();
  tc.wait ();
I am therefore concerned about creating a dependency between the 'JobManagement' namespace and the 'Task' namespace.
I think, for the java bindings that would mean that job implements the task interface. Apart from the run you mention below, the semantics of the task interface are actually included in job already, more or less; only the methods and states are named differently (that is really one motivation for the proposal).
The submission of remote jobs is naturally asynchronous, and there are natural semantic parallels to the asynchronous model described by the 'Task' namespace. However from my reading of the API I understood these two models to be independent in purpose; creating a dependency could hinder the natural description (and development) of these two areas of the API.
Both models are not really different on purpose. Would you see an advantage of having them truly separate?
The idea of a job container for the management of a large number of remote jobs is useful. However the TaskContainer does not appear to be wholly compatible; the run() method to start the asynchronous operations is unnecessary for jobs that have been submitted to a remote resource.
You are right. Well, that might be somewhat subtle, but in the example code below, the submit_job() call is accompanied by a new create_job(). That creates a job which needs to be run(), which would make it compatible with the task model.

  saga::job j1 = job_server.create_job (jd);
  // job state is 'pending'

  j1.run ();
  // job state is 'running' or so - 'not pending'

  j1.wait ();
  // job state is Done or Failed

  saga::job j2 = job_server.submit_job (jd);
  // job state is 'running' or so - 'not pending'

  j2.wait ();
  // job state is Done or Failed

That is very similar to the semantics we have for tasks...

One additional reason for this proposal is that we want to approach the bulk operations soon. Consider a parameter sweep, where 100,000 jobs are to be run.

  for ( i = 0; i < 100000; i++ )
  {
    jobs[i] = job_server.submit (jd[i]);  // SUBMIT
  }

  for ( i = 0; i < 100000; i++ )
  {
    jobs[i].wait ();
  }

As each submission is very likely to involve at least one remote operation (the submissions are independent), that will take a lot of time.

Compare that to:

  task_container tc;

  for ( i = 0; i < 100000; i++ )
  {
    tc.add (job_server.create (jd[i]));  // CREATE
  }

  tc.run  ();
  tc.wait ();

It is rather straightforward to optimize the task container for bulk job submission, or bulk operations in general (we hope).

What I am not sure about is what that would look like in native java. Are there similar mechanisms?

Looking forward to your comments,

  Andre.
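The cost argument above can be made concrete with a toy model that only counts remote round trips. This is a sketch under stated assumptions: `counting_job_server`, `submit_bulk` and `bulk_container` are hypothetical names invented for the illustration, and whether a real backend can accept a whole batch in one call is an assumption, not something the thread establishes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a job service; it only counts remote round trips.
class counting_job_server
{
  public:
    // One remote call per job.
    void submit (const std::string & jd) { (void) jd; ++round_trips_; }

    // One remote call for the whole batch (assumed to be supported).
    void submit_bulk (const std::vector<std::string> & jds)
    {
      if ( !jds.empty () ) ++round_trips_;
    }

    std::size_t round_trips () const { return round_trips_; }

  private:
    std::size_t round_trips_ = 0;
};

// Collects job descriptions locally; nothing is sent until run().
class bulk_container
{
  public:
    explicit bulk_container (counting_job_server & js) : js_ (js) {}

    void add (std::string jd) { pending_.push_back (std::move (jd)); }

    void run ()
    {
      js_.submit_bulk (pending_);
      pending_.clear ();
    }

  private:
    counting_job_server &    js_;
    std::vector<std::string> pending_;
};
```

Submitting N descriptions one by one costs N round trips in this model, while adding them to the container and calling `run ()` once costs a single round trip. That is the optimisation the create/run split makes possible.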
Andre,
The benefits of the application of the TaskContainer semantics to job
submission are compelling. Whilst the separation of JobService.submitJob()
into JobService.create() and Job.run() methods adds a little complexity,
the possibility of optimised bulk operations may justify this.
The role of the JobService changes a little, it may be used to create Job
objects without submitting them to the resource manager. Will
JobService.runJob() invoke the Job object that has been submitted to the
resource manager?
---
I am yet to be convinced that the 'Task' and 'JobManagement' namespaces
should be linked. Whilst the semantics of these namespaces are similar they
are designed for different purposes; a simple model of asynchronous method
calls in the client, and the submission of batch jobs to a remote resource.
The advantage of keeping these two namespaces separate is to avoid an
unnecessary dependency between these two different areas of the API. For
example; if in the future additional methods are required to support
asynchronous method calls these would be reflected in the JobManagement
package. [Or if the Task namespace were altered to support language
specific features, see below]
The advantage of linking the namespaces would be to allow all classes
implementing TaskContainer to handle Job objects.
---
In Java the basis for multithreading support is sub-classing the
java.lang.Thread class (or implementing the java.lang.Runnable interface).
In practice SAGA implementations could return inner classes that may be
called asynchronously (by either of these approaches) for each API method.
This is essentially the whole story (prior to Java 5.0), and the whole
approach could be encapsulated within the SAGA.Task model.
The problem with encapsulation within the SAGA.Task model is that the
fine-grain control over the threads (when sub-classing java.lang.Thread) is
lost. Furthermore when using Java 5.0 it would not be possible to leverage
the high-level concurrency utilities now available
(java.util.concurrent.*). For example the task scheduling framework would
be appropriate to control the execution of the threads.
For more details see:
http://java.sun.com/docs/books/tutorial/essential/threads/index.html
In C# the model is nicer than in Java. Methods may be invoked asynchronously
by creating a 'delegate' for that method using Thread.ThreadStart.
Evaluation of the delegates in different threads can be managed at a
high-level via the thread pool.
When working with these languages there will be some functionality that
it will be important not to preclude. In Java <5.0 there may be little
above and beyond that which is available in SAGA.Task of value to most
users, and encapsulation may be the correct approach. However the
high-level support for concurrency available in Java 5.0 and C# is
certainly important. I am unsure whether access to this functionality is
best achieved by by-passing, or altering SAGA.Task namespace in the
language bindings.
Graeme
Quoting Andre Merzky
Hi Graeme,
Quoting [G.E.POUND@soton.ac.uk] (Feb 06 2006):
Andre,
I have a couple of reservations about this action that you may be able
to
answer.
I had been hoping to avoid implementing the 'Task' namespace in the java bindings and encourage developers to use the language's support to threading to allow asynchronous method calls in the client code.
I understood that much from your last comments, but my limited knowledge does not allow me to give a qualified answer I'm afraid. Hmm, maybe we can work this out together :-)
Could you post some code examples to demonstrate how a asynchronous seek (for example) would be coded in a java application?
The C++ code would be:
saga::file f (url); saga::task t = f.seek saga::task (off, whence, &pos);
t.run (); t.wait ();
Next question is, how would a java application manage many tasks - i.e. is there something similar to a saga::task_container ?
saga::task_container tc;
tc.add (task_1); tc.add (task_2); tc.add (task_3);
tc.run (); tc.wait ();
I am therefore concerned about creating a dependency between the 'JobManagement' namespace and the 'Task' namespace.
I think, for the java bindings that would mean that job implements the task interface. Apart from the run you mention below, the semantic of the task interface is actually included in job already, more or less, only the methods and states are differently named (that is one motivation for the proposal really).
The submission of remote jobs is naturally asynchronous, and there are natural semantic parallels to the asynchronous model described by the 'Task' namespace. However from my reading of the API I understood these two models to be independent in purpose; creating a dependency could hinder the natural description (and development) of these two areas of the API.
The two models are not really different in purpose. Would you see an advantage in having them truly separate?
The idea of a job container for the management of a large number of remote jobs is useful. However the TaskContainer does not appear to be wholly compatible; the run() method to start the asynchronous operations is unnecessary for jobs that have been submitted to a remote resource.
You are right. Well, that might be somewhat subtle, but in the example code below, the submit_job() call is accompanied by a new create_job(). That creates a job which needs to be run(), which would make it compatible with the task model.
saga::job j1 = job_server.create_job (jd); // job state is 'pending'
j1.run (); // job state is 'running' or so - 'not pending'
j1.wait (); // job state is Done or Failed
saga::job j2 = job_server.submit_job (jd); // job state is 'running' or so - 'not pending'
j2.wait (); // job state is Done or Failed
That is very similar to the semantics we have for tasks...
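To make the two paths concrete, here is a minimal C++ sketch of the lifecycle described above. The classes `job` and `job_server` below are simplified stand-ins written for illustration, not the SAGA implementation: `wait()` merely simulates completion, the `Failed` state is never reached, and expressing `submit_job()` as `create_job()` followed by `run()` is an assumption about how the two calls could relate.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not the real SAGA classes.
enum class job_state { Pending, Running, Done, Failed };

class job {
public:
    explicit job(std::string description) : description_(std::move(description)) {}

    // Pending -> Running. A second run() is rejected and returns false.
    bool run() {
        if (state_ != job_state::Pending) return false;
        state_ = job_state::Running;
        return true;
    }

    // Stand-in for blocking until the remote job has finished.
    void wait() {
        if (state_ == job_state::Running) state_ = job_state::Done;
    }

    job_state state() const { return state_; }

private:
    std::string description_;
    job_state   state_ = job_state::Pending;
};

class job_server {
public:
    // Proposed call: the returned job has not been started yet.
    job create_job(const std::string& jd) const { return job(jd); }

    // Existing call, expressed here as create_job() followed by run().
    job submit_job(const std::string& jd) const {
        job j = create_job(jd);
        j.run();
        return j;
    }
};

// Walks through both paths from the mail.
void lifecycle_demo() {
    job_server js;

    job j1 = js.create_job("jd");
    assert(j1.state() == job_state::Pending);
    j1.run();
    assert(j1.state() == job_state::Running);
    j1.wait();
    assert(j1.state() == job_state::Done);

    job j2 = js.submit_job("jd");
    assert(j2.state() == job_state::Running);  // never observed as Pending
    j2.wait();
    assert(j2.state() == job_state::Done);
}
```

In this sketch the only difference between the two paths is whether the caller ever sees the job in the Pending state.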
An additional reason for this proposal is that we want to approach bulk operations soon. Consider a parameter sweep where 100,000 jobs are to be run.
  for ( i = 0; i < 100000; i++ ) {
    jobs[i] = job_server.submit (jd[i]); // SUBMIT
  }
  for ( i = 0; i < 100000; i++ ) {
    jobs[i].wait ();
  }
Each submission very likely involves at least one remote operation, and since the submissions are independent of each other, the loop as a whole will take a lot of time.
Compare that to:
  task_container tc;
  for ( i = 0; i < 100000; i++ ) {
    tc.add (job_server.create (jd[i])); // CREATE
  }
  tc.run ();
  tc.wait ();
It is rather straightforward to optimize the task container for bulk job submission, or bulk operations in general (we hope).
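The potential gain can be illustrated with a small C++ sketch that counts remote round trips for the two loops above. Everything here is hypothetical: `backend`, `submit_one()` and `submit_bulk()` are invented for illustration, and the sketch simply assumes that a backend can accept a whole batch of job descriptions in one request, which the thread does not establish.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical backend that only counts remote round trips.
struct backend {
    std::size_t round_trips  = 0;
    std::size_t jobs_started = 0;

    // One remote operation per job.
    void submit_one(const std::string&) {
        ++round_trips;
        ++jobs_started;
    }

    // Assumption: a whole batch can be sent in a single request.
    void submit_bulk(const std::vector<std::string>& jds) {
        if (jds.empty()) return;
        ++round_trips;
        jobs_started += jds.size();
    }
};

// Collects job descriptions and hands them over in one batch on run().
class task_container {
public:
    explicit task_container(backend& b) : backend_(b) {}

    void add(std::string jd) { pending_.push_back(std::move(jd)); }

    void run() {
        backend_.submit_bulk(pending_);
        pending_.clear();
    }

private:
    backend&                 backend_;
    std::vector<std::string> pending_;
};

// Round trips needed when every job is submitted individually.
std::size_t round_trips_one_by_one(std::size_t n) {
    backend b;
    for (std::size_t i = 0; i < n; ++i) b.submit_one("jd");
    assert(b.jobs_started == n);
    return b.round_trips;
}

// Round trips needed when jobs are created first and run via the container.
std::size_t round_trips_bulk(std::size_t n) {
    backend b;
    task_container tc(b);
    for (std::size_t i = 0; i < n; ++i) tc.add("jd");
    tc.run();
    assert(b.jobs_started == n);
    return b.round_trips;
}
```

Under these assumptions `round_trips_one_by_one(n)` grows linearly with `n`, while `round_trips_bulk(n)` stays at one for any non-empty batch.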
What I am not sure about is what that would look like in native Java. Are there similar mechanisms?
Looking forward to your comments,
Andre.
Graeme
Quoting Andre Merzky
Hi,
I just had a discussion with Thilo about the topic, as he and I were obviously talking somewhat orthogonally to each other...
Well, now we have the same opinion, kind of, and I have barely any bruises... Anyway, I want to summarize our point here, as I probably was not really clear in my initial post.
Sorry if re-iteration of the topic bores you...
So, we have tasks, which represent async operations, with a couple of states attached, and the ability to call run(), wait() and cancel() on these. And we can collect them in containers, and wait() on many of these tasks conveniently.
And then we have jobs, which represent remote executables, with a couple of states attached, and the ability to call run (== create them), wait() and cancel(). And some more methods. And we can't collect them in containers right now, but would like to.
You see the similarities, right? It's even more obvious in code:
Tasks:
--------------------------------------------
  task_container tc;
  task t = file.copy <saga::task> (...);
  t.run  (   );
  t.wait (1.0);
  tc.add  (t);
  tc.wait ( );
--------------------------------------------
Jobs:
--------------------------------------------
  job_container jc;
  job j = job_server.submit (job_descr);
  j.wait (1.0);
  jc.add  (j);
  jc.wait ( );
--------------------------------------------
slightly changed:
--------------------------------------------
  job_container jc;
! job j = job_server.create (job_descr);
+ j.run  (   );
  j.wait (1.0);
  jc.add  (j);
  jc.wait ( );
--------------------------------------------
The similarities are obvious, I think. Now, if job were to IMPLEMENT the task interface (or inherit from task), we would unify both classes, and hence:
- simplify jobs (leave only those methods which are specific to jobs, like migrate, signal, ...)
- allow jobs to be put into task containers, efficiently handling large numbers of jobs and other tasks
- make the API and the paradigms used more uniform.
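The three points above can be sketched in C++ as follows. This is only an illustration under assumptions: the `task` interface is reduced to run(), wait() and state(), the helper `basic_task` and the class `file_copy_task` are invented for the example, and migrate() is a counter rather than a real operation.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

enum class task_state { Pending, Running, Done };

// Reduced task interface (an assumption; the real one has more methods).
class task {
public:
    virtual ~task() = default;
    virtual void run() = 0;
    virtual void wait() = 0;
    virtual task_state state() const = 0;
};

// Hypothetical helper holding the shared state bookkeeping.
class basic_task : public task {
public:
    void run() override {
        if (state_ == task_state::Pending) state_ = task_state::Running;
    }
    void wait() override {
        if (state_ == task_state::Running) state_ = task_state::Done;
    }
    task_state state() const override { return state_; }

private:
    task_state state_ = task_state::Pending;
};

// An ordinary asynchronous operation.
class file_copy_task : public basic_task {};

// A job is a task plus the methods that are specific to jobs.
class job : public basic_task {
public:
    void migrate() { ++migrations_; }
    int  migrations() const { return migrations_; }

private:
    int migrations_ = 0;
};

// One container type handles jobs and other tasks alike.
class task_container {
public:
    void add(std::shared_ptr<task> t) { tasks_.push_back(std::move(t)); }
    void run()  { for (auto& t : tasks_) t->run(); }
    void wait() { for (auto& t : tasks_) t->wait(); }

    bool all_done() const {
        for (const auto& t : tasks_)
            if (t->state() != task_state::Done) return false;
        return true;
    }

private:
    std::vector<std::shared_ptr<task>> tasks_;
};

// Mixes a job and a file copy in the same container.
void mixed_container_demo() {
    auto copy = std::make_shared<file_copy_task>();
    auto j    = std::make_shared<job>();

    task_container tc;
    tc.add(copy);
    tc.add(j);
    tc.run();
    tc.wait();
    assert(tc.all_done());

    j->migrate();  // job-specific method stays on job
    assert(j->migrations() == 1);
}
```

With this layout a separate job_container is no longer needed in the sketch, and job keeps only what tasks do not have.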
Also, if tasks later become suspendable, as Gregor rightly suggested, we can move more methods to tasks, w/o breaking the paradigms.
In terms of state, the following mappings would be appropriate:
  job::Pending                             -> task::Pending
  job::Done                                -> task::Done
  job::Failed                              -> task::Failed
  job::???                                 -> task::Cancelled
  job::Queued,Running,Pre/Poststaging,...  -> task::Running
So, no adjustments to the state models are needed AFAICS, apart from Cancelled (Does it make sense on jobs? Should job::stop be job::cancel? Should task::cancel be task::stop?)
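As a sketch, the mapping can be written as a single C++ function. The enumerator names `PreStaging` and `PostStaging` are guesses at how "Pre/Poststaging" might be spelled, and the job state list is deliberately incomplete. Since the mail leaves open which job state corresponds to task::Cancelled, no job state maps to it here.

```cpp
#include <cassert>

// Job states named in the mail; exact spellings are assumptions.
enum class job_state {
    Pending, Queued, PreStaging, Running, PostStaging, Done, Failed
};

enum class task_state { Pending, Running, Done, Failed, Cancelled };

// Maps a job state onto the coarser task state model.
task_state to_task_state(job_state s) {
    switch (s) {
        case job_state::Pending: return task_state::Pending;
        case job_state::Done:    return task_state::Done;
        case job_state::Failed:  return task_state::Failed;
        // Every "in flight" job state collapses to Running.
        case job_state::Queued:
        case job_state::PreStaging:
        case job_state::Running:
        case job_state::PostStaging:
            return task_state::Running;
    }
    return task_state::Running;  // not reached for valid enumerators
}

// Checks the mapping against the table in the mail.
void mapping_demo() {
    assert(to_task_state(job_state::Pending)     == task_state::Pending);
    assert(to_task_state(job_state::Queued)      == task_state::Running);
    assert(to_task_state(job_state::PostStaging) == task_state::Running);
    assert(to_task_state(job_state::Done)        == task_state::Done);
    assert(to_task_state(job_state::Failed)      == task_state::Failed);
}
```

The mapping is many-to-one, so a task-level view of a job loses the distinction between queued, staging and running.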
Hope that clarifies things. I think Gregor was on target with his remarks, and Hartmut signalled consent as well. And I think I convinced Thilo (Andre rubs his bruises).
Unless there is any opposition, I'll go ahead and document that in the strawman then, ok?
Cheers, Andre.
Quoting [Thilo Kielmann] (Feb 04 2006):
Date: Sat, 4 Feb 2006 22:29:41 +0100
From: Thilo Kielmann
To: saga-rg@ggf.org
Subject: Re: [saga-rg] tasks and jobs...
> Wouldn't it be useful to have jobs implementing the task interface?
Certainly, no. Jobs and Tasks are two different things, and they are so on purpose.
However, Tasks have always been the mechanism for asynchronous operation, which is kind-of obsoleted by having asynchronous ops directly.
If you want to work on the "S" of SAGA: why not unify both Tasks and Jobs into a better "Job" notion, and do local asynchronous operations via async, local calls?
Thilo
-- "So much time, so little to do..." -- Garfield
Hi Graeme,
thanks for the pointer!
I still have some questions if you don't mind...
What I have seen at the URL you sent is how to program threads in Java. What I am unsure about is what a Java-SAGA programmer would actually _do_.
For example, if he wants to do an async seek on a file, what would he do?
Let's try:
- Create a new class which is 'Runnable', and inherits 'saga::file'
- implement 'run' to perform a seek
Can't be, because run is void (or can you change that?)
Try again:
- Create a new class which is 'Runnable', and inherits 'saga::file'
- have seek not do anything except set the args
- have run perform the seek async
Hmm, again not nice, as your object needs to be very stateful: e.g. store args for many reads, which are run later.
Try again:
- saga::file is already 'Runnable'
- no, wait, it still has only one 'run' method, so I can't do an async seek and then an async read...
You see, I am really on the wrong track - sorry, this probably looks very foolish to you...
I would very much appreciate a code example. That usually helps me most :-)
I will google somewhat, I am sure there are for example async file classes around or so..
Thanks, Andre.
-- "So much time, so little to do..." -- Garfield
Andre,
Sure, no problem. Here are two examples of how asynchronous support could be provided in the Java bindings. The first completely encapsulates threading within the SAGA.Task model, the second exposes more of the threading capabilities.
#1 Encapsulated threading model
Following the syntax of the SAGA.Task model, the task factory will return objects implementing the Task interface for each of the Directory methods.
  //Example #1
  DirectoryTaskFactory dtf = dir.createTaskFactory ();
  Task t1 = dtf.ls (dir);
  t1.run ();
In this example, when the method run() is invoked a thread is created and started internally by the object t1. Inner classes could also be used to conceal the gruesome details of this.
All management of the state of the thread must be via the Task interface. This is fine as this is the purpose of the Task interface, particularly at the level at which the user programs. However the problem comes when more detailed control over the thread is required; this may be required by more sophisticated implementations of TaskContainer.
#2 Exposed threading model
If we need to expose fine-grained control over the threading model in the Java bindings it would be necessary to change the SAGA.Task namespace.
  //Example #2
  DirectoryTaskFactory dtf = dir.createTaskFactory ();
  java.lang.Thread t1 = dtf.ls (dir);
  t1.start (); // not t1.run(): calling run() directly would execute in the caller's thread
  TaskContainer tc;
  tc.addTask (t1);
Through this approach the interface exposed to the user is more complex, however the TaskContainer may be more powerful. Implementations based on Java 5.0 can use the high-level concurrency utilities to control the threads efficiently (and safely?).
This approach would require the Task namespace to be specific to asynchronous method invocation, and preclude any link to the JobManagement namespace - but, of course, I do not think that this is a problem.
A similar situation exists in C#, in which 'tasks' could be returned as delegates or threads to facilitate the use of the high-level threading support in the language, including the thread pool.
---
Of course there is a tangential issue about the hazards of invoking these methods asynchronously. For example, the developer of a Java implementation should ensure that asynchronous methods synchronize properly. Also, asynchronous methods called on a SAGA directory can easily be used to illustrate the potential issues arising from calling operations that move and copy the contents of a directory, i.e.:
  // Create Task factories
  DirectoryTaskFactory dtf = dir.createTaskFactory ();
  // Create Tasks
  Task t1 = dtf.ls (result);
  Task t2 = dtf.copy (source,target);
  Task t3 = dtf.move (source,target);
With no control over the order of these operations a user of the SAGA API could quickly write very unsafe code.
Graeme
participants (5)
- Andre Merzky
- G.E.POUND@soton.ac.uk
- Graeme Pound
- Hartmut Kaiser
- Thilo Kielmann