
Quoting [Steven Newhouse] (May 30 2005):
Oh - I wonder if we sent you an old version?
Probably... it dates back to GGF 13 or soon after.
An up-to-date version is on the SAGA wiki front page: http://wiki.cct.lsu.edu/saga/ - the task stuff is better described there.
That is an implementation issue, really. We could imagine SAGA implementations which can only handle gsi and gridftp, and other implementations which have a plugin mechanism and can handle all types of protocols.
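For illustration, a minimal C++ sketch of such a plugin mechanism (the names file_adaptor and adaptor_registry are invented for this example, not taken from the SAGA spec or from GAT): each protocol plugin implements a common adaptor interface, and the implementation dispatches on the URL scheme.

  #include <map>
  #include <memory>
  #include <stdexcept>
  #include <string>
  #include <utility>

  // interface every protocol plugin implements
  class file_adaptor {
  public:
    virtual ~file_adaptor () = default;
    // copy the file at 'source' to 'target', both given as URLs
    virtual void copy (const std::string & source,
                       const std::string & target) = 0;
  };

  // the implementation keeps one adaptor per URL scheme
  class adaptor_registry {
    std::map <std::string, std::unique_ptr <file_adaptor>> adaptors_;
  public:
    void add (const std::string & scheme,
              std::unique_ptr <file_adaptor> a) {
      adaptors_[scheme] = std::move (a);
    }
    // dispatch on scheme; this would also let a user query the
    // supported protocols instead of waiting for calls to fail
    file_adaptor & lookup (const std::string & scheme) {
      auto it = adaptors_.find (scheme);
      if (it == adaptors_.end ())
        throw std::runtime_error ("no adaptor for scheme: " + scheme);
      return *it->second;
    }
  };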
But how do I as a user discover which protocols are supported by the implementation? Do I discover this by calls failing?
Uhm, that question pops up wherever we look - and there is no good answer in sight. In GAT, we allow any:// as protocol - meaning that GAT can choose whatever it finds. But that has drawbacks. Consider the following URLs:

  ftp://my.remote.host:1234//tmp/test.dat
  ftp://my.remote.host//tmp/test.dat
  gridftp://my.remote.host//tmp/test.dat
  http://my.remote.host//tmp/test.dat

They may all refer to the same physical location - or not! That all depends on the service setup. So any:// leaves that pretty much open to wild guessing. As do the above URLs, really: the user probably does not know the http server root settings on all remote hosts. If you think about it that way, the stage-in/stage-out settings of most job description languages are equally flawed. It's a very general problem. I would be happy if somebody in the group came up with a good approach to it. I can think of only two, and both have flaws:

1) ALWAYS use a replica system/grid file system.
   Flaw: it needs to get populated, and the user needs to be able to navigate it globally.

2) Provide a URL translation service, either for the user or to be used by the implementation:

     url = URLTranslate ("ftp://my.remote.host//tmp/test.dat", "http://");
     // url is set to http://my.remote.host//servermount/tmp/test.dat

   Flaw: the service would need to know about local configurations, needs to be kept in sync, requires a remote operation for each action on any URL, etc.

Conclusion: we don't know a good answer, at least not on the API level...
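To make the second flaw concrete, a minimal C++ sketch (translate_url, mount_table, and the /servermount prefix are invented for this example): the translator can only answer if it already knows, per host and per scheme, where each service roots its namespace - exactly the configuration that is hard to keep in sync.

  #include <map>
  #include <stdexcept>
  #include <string>
  #include <utility>

  // per-host, per-scheme mount prefix (e.g. the http server root)
  using mount_table =
    std::map <std::pair <std::string, std::string>, std::string>;

  std::string translate_url (const mount_table & mounts,
                             const std::string & host,
                             const std::string & path,
                             const std::string & target_scheme) {
    auto it = mounts.find ({host, target_scheme});
    if (it == mounts.end ())
      throw std::runtime_error ("no mapping for " + host);
    return target_scheme + "://" + host + it->second + path;
  }

  // usage: with {"my.remote.host", "http"} -> "/servermount" in the
  // table, translate_url (m, "my.remote.host", "/tmp/test.dat", "http")
  // yields "http://my.remote.host/servermount/tmp/test.dat" - but the
  // table itself is exactly the local configuration that must be kept
  // in sync.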
The latter is better, we think, but the API spec does not specify implementation or architecture.
If you want middleware providers to support the interface, specifying how protocol plugins are added is as important as specifying how users will expect to use the APIs.
You might be right - but I am not sure. If every middleware provider _can_ implement its own SAGA version in whatever way he wants, he might actually do that. If at some point there is a SAGA implementation which allows well-defined plugins, the middleware providers might use that instead - or _still_ want to implement it their own way. To be sure: we want to have a pluggable implementation (and in fact we are working on one), but that plugin specification should, in our opinion, be totally distinct from the SAGA API spec. What do others think about this issue?
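For illustration, a minimal sketch (hypothetical symbol and file names) of why the plugin mechanism can stay out of the API spec: the implementation loads adaptors at runtime, and only the implementation - never the SAGA API - needs to know the factory symbol it expects.

  #include <dlfcn.h>

  #include <stdexcept>
  #include <string>

  class file_adaptor;  // the interface sketched above

  // load a protocol plugin from a shared object; the factory symbol
  // name ("create_adaptor") is a private contract between the SAGA
  // implementation and its plugins - the API spec never mentions it
  file_adaptor * load_adaptor (const std::string & plugin_path) {
    void * handle = dlopen (plugin_path.c_str (), RTLD_NOW);
    if (handle == nullptr)
      throw std::runtime_error (dlerror ());
    using factory_t = file_adaptor * (*) ();
    auto factory = reinterpret_cast <factory_t>
                     (dlsym (handle, "create_adaptor"));
    if (factory == nullptr)
      throw std::runtime_error ("plugin exports no create_adaptor");
    return factory ();
  }

  // usage (hypothetical plugin file name):
  //   file_adaptor * ftp = load_adaptor ("libsaga_adaptor_ftp.so");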
I am not sure if I understand your distinction between Tier 1 and Tier 2 - could you give an example please?
Tier 1 - Stuff you are going to do now. Generic task & error framework and support for file movement and job submission.
Tier 2 - Things for later (V2 of the spec?) - streams, logical file catalogues, etc.
I see, ok.

Tier 1:
 - session handle
 - errors
 - attributes
 - tasks
 - files
 - logical files
 - job submission, brokering
 - streams

So basically what is in the API right now.

Tier 2:
 - steering and monitoring
 - possibly combining logical/physical files (read on logical files)
 - task dependencies (simple workflows and batches)
 - extensions to Tier 1 classes

There is no good and explicit roadmap for Tier 2 right now.

Best regards, Andre.
Steven
--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+