Hi all,

I was just following VU's discussion and would like to comment on it.

As I understand it, tasks are by definition independent of each other, because they are asynchronous operations, right? With this in mind, SAGA users should be responsible for managing possible race conditions themselves. They are the only ones who are aware of the exact nature of the calls they put into tasks. That is why it is easier for them to avoid race conditions in their own code than for the SAGA spec to avoid race conditions in every case without knowing the exact semantics.

Copying around objects that are used by tasks seems problematic, as these different copies need to be synchronised afterwards (as Andre pointed out).

Hence, I would propose leaving the problem of race conditions to the end user, because there is probably no general solution that would not contradict certain use cases.

Regards,
Stephan

On Mon, 17 Jul 2006, Andre Merzky wrote:
Quoting [Thilo Kielmann] (Jul 17 2006):
Giving it another thought, I think it isn't as bad as my last mail assumed it to be.
The difference is that SAGA tasks aren't threads. They are kind-of single operations (e.g., a single file.read, but no sequence of multiple of such operations). It is just that the obvious implementation in Java would be using threads for everything asynchronous...
Right, that is what SAGA tasks are: they represent an async operation.
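To illustrate the distinction drawn above, here is a minimal sketch in Python. It does not use the SAGA API; `read_file`, `run_read_as_task`, and `demo` are hypothetical names, and a `concurrent.futures.Future` merely stands in for a task. The point is that the task wraps exactly one operation (a single file read), and the thread that happens to run it is an implementation detail.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_file(path, size):
    """One single operation: read up to `size` bytes from `path`."""
    with open(path, "rb") as handle:
        return handle.read(size)


def run_read_as_task(path, size):
    """Run the single read asynchronously and wait for its result.

    The Future stands in for a task: it represents one operation,
    not an arbitrary sequence of operations as a thread would.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        task = pool.submit(read_file, path, size)  # returns immediately
        return task.result()                       # block until the read is done


def demo():
    # Create a small temporary file so the example is self-contained.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"hello saga")
        return run_read_as_task(path, 5)
    finally:
        os.remove(path)
```

Calling `demo()` reads the first five bytes of the temporary file through the task.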
Still, if state sharing between multiple tasks (or between a task and the main thread) is desired, we need to start out by defining data consistency of all local and remote objects...
We already had a discussion about this, which was triggered by similar comments from Felix. The result of the discussion was (cited from the spec intro):
\subsubsection{Consistency Model}
We had a lengthy discussion about consistency models, with the agreement that the consistency model is to be defined and documented by the implementation. The API spec itself does not assume any specific consistency model, as we feel that (a) POSIX consistency is not achievable within reasonable effort/performance, (b) if the user assumes the worst (no consistency), he will still be able to make good use of the API, and (c) reality will be somewhere in the middle.
After discussing further with some OGSA folx at last GGF, I added:
Implementors SHOULD, however, strive to implement ``At Most Once'' consistency, as that seems (a) to be generally supported by most Grid middleware, (b) implementable in distributed systems with reasonable effort, and (c) useful and intuitively expected by most end users.
There has been some recent discussion on the BES and OGSA lists about At-Least-Once and At-Most-Once, but I am pretty positive that our use cases benefit most from At-Most-Once.
Is that what you are looking for?
It does not really say anything about the shared state of objects, or the lifetime consequences for these objects (that is what this thread originally tried to discuss).
For that, I tried to clean up the intro once more, see the CVS version of
\subsubsection{Life Time Management}
In light of what we discussed about consistency and tasks, does that section make sense to you?
Cheers, Andre.
Thilo
On Mon, Jul 17, 2006 at 04:16:26AM +0200, Thilo Kielmann wrote:
Date: Mon, 17 Jul 2006 04:16:26 +0200
From: Thilo Kielmann
To: saga-rg@ggf.org
Subject: [saga-rg] SAGA thread model

I am sorry to say, but SAGA's task model seems to me severely flawed. This is for two reasons:
1. the "main" thread (executing sync operations) needs to be considered as yet another task
2. there must be a concise definition of the shared state. current solutions are ad-hoc and mostly undefined. shared state is:
   - local objects, shared between multiple tasks of the same process
     here: definition of synchronization between tasks
   - remote objects, in the service(s)
     here: definition of legal execution orders
so far, I can see only a few incidental definitions, but they are far from being concise.
"tasks in a bulk operation have to be independent"
"a task cancel is doing 'best effort' but can not guarantee cancelation"
The latter, BTW, is a special case, because this is about connection termination for which you can formally prove that there is no protocol that can guarantee this AND notify both parties of successful termination.
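As a purely local analogy for "best effort" cancel, here is a small Python sketch. It is not SAGA code and says nothing about the distributed termination problem mentioned above; `cancel_demo` and `blocking_operation` are hypothetical names. It only shows that even a local task API may refuse to cancel: `Future.cancel()` in `concurrent.futures` returns `False` once the operation has started running.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def cancel_demo():
    """Return (cancel_running, cancel_queued, result) for two tasks."""
    started = threading.Event()
    release = threading.Event()

    def blocking_operation():
        started.set()    # signal that the operation is now running
        release.wait()   # stay busy until the caller lets us finish
        return "finished"

    # A single worker, so the second task has to wait in the queue.
    with ThreadPoolExecutor(max_workers=1) as pool:
        running = pool.submit(blocking_operation)
        started.wait()                          # make sure it really runs
        queued = pool.submit(lambda: "never runs")

        cancel_running = running.cancel()       # too late: already executing
        cancel_queued = queued.cancel()         # still pending: can be cancelled

        release.set()                           # let the running task complete
        result = running.result()

    return cancel_running, cancel_queued, result
```

The caller has to inspect the return value of `cancel()`; requesting a cancel is not the same as the operation having been cancelled.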
To be constructive: what the task model must do first thing is
- define tasks
- define which data is shared between tasks and which concurrency control happens on this shared data
That is the only way to define clearly what tasks will do in the event of sharing, really.
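A minimal Python sketch of what "shared data plus concurrency control" means in practice, assuming the concurrency control is done by the user rather than by the API. Nothing here is SAGA API: `SharedCounter` and `run_tasks` are hypothetical names, and a thread pool stands in for the task runtime. The shared state is one local object, and the rule for accessing it is made explicit with a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCounter:
    """A local object shared between several tasks of the same process."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        # The read-modify-write below is the critical section. Without
        # the lock, concurrent tasks could lose updates.
        with self._lock:
            self.value += amount


def run_tasks(task_count=8, increments=1000):
    """Run `task_count` tasks that all update the same counter."""
    counter = SharedCounter()

    def operation():
        for _ in range(increments):
            counter.add(1)

    with ThreadPoolExecutor(max_workers=4) as pool:
        tasks = [pool.submit(operation) for _ in range(task_count)]
        for task in tasks:
            task.result()   # wait for completion and re-raise any error

    return counter.value
```

With the lock in place the result is always `task_count * increments`. Whether such locking is the user's job or something the task model defines is exactly the question under discussion here.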
You may want to look at:
This is: Doug Lea, "Concurrent Programming in Java: Design Principles and Patterns"
This book uses 280 pages on objects, shared state and concurrency control before using 95 pages for the thread operations...
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/ -- "So much time, so little to do..." -- Garfield