
Ha, finally someone really critical :-D

Thanks Felix, comments are invited.

Cheers, Andre.

----- Forwarded message from Felix Hupfeld <hupfeld@zib.de> -----

I think there is a problem in the foundation of the SAGA approach. The underlying assumptions seem to be that:
(a) prospective designers of Grid applications are unaware of distr. system issues
(b) distr. system issues are not fundamental, but can be solved by technical means
on (a): Designers of Grid applications are presumably already writing parallel applications on clusters or large HPC machines. They are well aware that remote memory is not local memory and that they can't use DSM just like local memory without serious performance issues. To name one of the fundamental problems of writing distr. systems...
on (b): There are fundamental tradeoffs to be considered when building distributed systems (DS), like availability vs. consistency, scalability vs. consistency, ... The state of the art in CS is that these can't be solved by technical means but are laws of the nature of DS. In practice that means: if you design an API for a DS, where you _must_ state which consistency guarantees you give, you assume certain requirements of the user and force him into your chosen point on the tradeoff spectrum.
With assumptions (a) and (b), the designers of SAGA seem to pursue an approach where the user has a very POSIX-like API. Although the spec. ignores all DS issues, it seems to imply that the API gives the user a single-copy view of the overall Grid, which makes the strongest choice on consistency, but results in the worst possible scalability and availability.
To illustrate, I will elaborate on the File API, but the same arguments can be made for namespaces and logical files, replicas, ...
The spec. does not make any statements about the consistency guarantees of the read and write operations. I assume that should imply that it gives POSIX-like read and write guarantees, resulting in bad scalability and availability. The authors must explicitly state that. If the consistency guarantees are left open and governed by the API implementation, no applications can be written towards that API. You can't write applications if you don't know when you will be able to read the data you have written on local or remote instances.
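To make the point concrete, here is a minimal sketch, assuming a toy in-memory file with two replicas. Everything in it (ToyReplicatedFile, producer_consumer, the synchronous flag) is hypothetical and not part of the SAGA spec. It only shows that the same application code returns different data depending on a guarantee the API never states.

```python
class ToyReplicatedFile:
    """Two in-memory replicas of one file. Hypothetical, not a SAGA class."""

    def __init__(self, synchronous):
        # synchronous=True: a write reaches both replicas before it returns.
        # synchronous=False: the other replica is updated only by propagate().
        self.synchronous = synchronous
        self.replicas = [b"", b""]
        self.pending = []  # (replica index, data) not yet propagated

    def write(self, replica, data):
        self.replicas[replica] = data
        other = 1 - replica
        if self.synchronous:
            self.replicas[other] = data
        else:
            self.pending.append((other, data))

    def read(self, replica):
        return self.replicas[replica]

    def propagate(self):
        # Apply all deferred updates, in the order they were written.
        for replica, data in self.pending:
            self.replicas[replica] = data
        self.pending.clear()


def producer_consumer(f):
    """Identical application code for both implementations."""
    f.write(0, b"result")  # written on one instance
    return f.read(1)       # read on another instance


strong = producer_consumer(ToyReplicatedFile(synchronous=True))
weak = producer_consumer(ToyReplicatedFile(synchronous=False))
print(strong, weak)
```

With the synchronous implementation the reader sees the written data; with the deferred one it sees an empty file until propagate() runs. An application written against the unspecified API cannot know which of the two it will get.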
A solution to this problem would be the specification of required consistency guarantees by the applications at open() time. This, however, would require thorough research on consistency models and application requirements, so I can't propose one offhand.
If SAGA chooses to give single-system-like guarantees, this must be explicitly stated. All interfaces that deal with data are unusable without a specification of consistency guarantees.
However, if you choose to be as POSIX-like as possible and have single-system semantics, why not directly use parts of the POSIX spec. and strip it down where needed?
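Purely to illustrate the shape such an interface might take (not which models it should offer), here is a rough sketch. The names Consistency, ToyBackend, FileHandle and open_file, and the three listed levels, are assumptions made up for this example; none of them come from the SAGA spec.

```python
from dataclasses import dataclass
from enum import Enum


class Consistency(Enum):
    # Illustrative levels only; choosing the real set is the open research question.
    SINGLE_COPY = "single-copy"
    READ_YOUR_WRITES = "read-your-writes"
    EVENTUAL = "eventual"


class UnsupportedConsistency(Exception):
    """Raised when a backend cannot honour the requested guarantee."""


@dataclass(frozen=True)
class FileHandle:
    url: str
    consistency: Consistency  # the guarantee the application was promised


@dataclass(frozen=True)
class ToyBackend:
    name: str
    supported: frozenset  # set of Consistency levels this backend can give


def open_file(backend, url, consistency):
    """Open url, failing early if the requested guarantee is unavailable."""
    if consistency not in backend.supported:
        raise UnsupportedConsistency(
            f"{backend.name} cannot provide {consistency.value}"
        )
    return FileHandle(url, consistency)


wide_area = ToyBackend(
    "wide-area-replicas",
    frozenset({Consistency.EVENTUAL, Consistency.READ_YOUR_WRITES}),
)
handle = open_file(wide_area, "file://example/data.bin", Consistency.EVENTUAL)
print(handle.consistency.value)
```

The point of the sketch is only that the application states its requirement at open() time and gets an error up front if the implementation cannot meet it, instead of silently receiving weaker semantics.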
----- End forwarded message -----

-- 
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www: http://www.merzky.net  |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+