
Quoting [Steven Newhouse] (May 30 2005):
Oh - I wonder if we sent you an old version?
Probably... it dates back to GGF 13 or soon after.
An up-to-date version is on the SAGA wiki front page: http://wiki.cct.lsu.edu/saga/ - the task stuff is better described there.
That is an implementation issue, really. We could imagine SAGA implementations which can only handle gsi and gridftp, and other implementations which have a plugin mechanism and can handle all types of protocols.
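For illustration, a minimal C++ sketch of such a plugin mechanism (the names file_adaptor and adaptor_registry are invented for this example, not taken from the SAGA spec or from GAT): each protocol plugin implements a common adaptor interface, and the implementation dispatches on the URL scheme.

  #include <map>
  #include <memory>
  #include <stdexcept>
  #include <string>
  #include <utility>

  // interface every protocol plugin implements
  class file_adaptor {
  public:
    virtual ~file_adaptor () = default;
    // copy the file at 'source' to 'target', both given as URLs
    virtual void copy (const std::string & source,
                       const std::string & target) = 0;
  };

  // the implementation keeps one adaptor per URL scheme
  class adaptor_registry {
    std::map <std::string, std::unique_ptr <file_adaptor>> adaptors_;
  public:
    void add (const std::string & scheme,
              std::unique_ptr <file_adaptor> a) {
      adaptors_[scheme] = std::move (a);
    }
    // dispatch on scheme; this would also let a user query the
    // supported protocols instead of waiting for calls to fail
    file_adaptor & lookup (const std::string & scheme) {
      auto it = adaptors_.find (scheme);
      if (it == adaptors_.end ())
        throw std::runtime_error ("no adaptor for scheme: " + scheme);
      return *it->second;
    }
  };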
But how do I as a user discover which protocols are supported by the implementation? Do I discover this by calls failing?
Uhm, that question pops up wherever we look - and there is no good answer in sight. In GAT, we allow any:// as protocol - meaning that GAT can choose whatever it finds. But that has drawbacks. Consider the following URLs:

  ftp://my.remote.host:1234//tmp/test.dat
  ftp://my.remote.host//tmp/test.dat
  gridftp://my.remote.host//tmp/test.dat
  http://my.remote.host//tmp/test.dat

They may all refer to the same physical location - or not! That all depends on the service setup. So any:// leaves that pretty much open to wild guessing. As do the above URLs, really: the user probably does not know the http server root settings on all remote hosts. If you think about it that way, the stage-in/stage-out settings of most job description languages are equally flawed. It's a very general problem. I would be happy if somebody in the group came up with a good approach to it. I can think of only two, and both have flaws:

1) ALWAYS use a replica system/grid file system.
   Flaw: it needs to get populated, and the user needs to be able to navigate it globally.

2) Provide a URL translation service, either for the user or to be used by the implementation:

     url = URLTranslate ("ftp://my.remote.host//tmp/test.dat", "http://");
     // url is set to http://my.remote.host//servermount/tmp/test.dat

   Flaw: the service would need to know about local configurations, needs to be kept in sync, requires a remote operation for each action on any URL, etc.

Conclusion: we don't know a good answer, at least not on the API level...
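To make the second flaw concrete, a minimal C++ sketch (translate_url, mount_table, and the /servermount prefix are invented for this example): the translator can only answer if it already knows, per host and per scheme, where each service roots its namespace - exactly the configuration that is hard to keep in sync.

  #include <map>
  #include <stdexcept>
  #include <string>
  #include <utility>

  // per-host, per-scheme mount prefix (e.g. the http server root)
  using mount_table =
    std::map <std::pair <std::string, std::string>, std::string>;

  std::string translate_url (const mount_table & mounts,
                             const std::string & host,
                             const std::string & path,
                             const std::string & target_scheme) {
    auto it = mounts.find ({host, target_scheme});
    if (it == mounts.end ())
      throw std::runtime_error ("no mapping for " + host);
    return target_scheme + "://" + host + it->second + path;
  }

  // usage: with {"my.remote.host", "http"} -> "/servermount" in the
  // table, translate_url (m, "my.remote.host", "/tmp/test.dat", "http")
  // yields "http://my.remote.host/servermount/tmp/test.dat" - but the
  // table itself is exactly the local configuration that must be kept
  // in sync.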
The latter is better, we think, but the API spec does not specify implementation or architecture.
If you want middleware providers to support the interface, specifying how protocol plugins are added is as important as specifying how users will expect to use the APIs.
You might be right - but I am not sure. If every middleware provider _can_ implement its own SAGA version in whatever way he wants, he might actually do that. If at some point there is a SAGA implementation which allows well-defined plugins, the middleware providers might use that instead - or _still_ want to implement it their own way. To be sure: we want to have a pluggable implementation (and in fact we are working on one), but that plugin specification should, in our opinion, be totally distinct from the SAGA API spec. What do others think about this issue?
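For illustration, a minimal sketch (hypothetical symbol and file names) of why the plugin mechanism can stay out of the API spec: the implementation loads adaptors at runtime, and only the implementation - never the SAGA API - needs to know the factory symbol it expects.

  #include <dlfcn.h>

  #include <stdexcept>
  #include <string>

  class file_adaptor;  // the interface sketched above

  // load a protocol plugin from a shared object; the factory symbol
  // name ("create_adaptor") is a private contract between the SAGA
  // implementation and its plugins - the API spec never mentions it
  file_adaptor * load_adaptor (const std::string & plugin_path) {
    void * handle = dlopen (plugin_path.c_str (), RTLD_NOW);
    if (handle == nullptr)
      throw std::runtime_error (dlerror ());
    using factory_t = file_adaptor * (*) ();
    auto factory = reinterpret_cast <factory_t>
                     (dlsym (handle, "create_adaptor"));
    if (factory == nullptr)
      throw std::runtime_error ("plugin exports no create_adaptor");
    return factory ();
  }

  // usage (hypothetical plugin file name):
  //   file_adaptor * ftp = load_adaptor ("libsaga_adaptor_ftp.so");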
I am not sure if I understand your distinction between Tier 1 and Tier 2 - could you give an example please?
Tier 1 - Stuff you are going to do now. Generic task & error framework and support for file movement and job submission.
Tier 2 - Things for later (V2 of the spec?) - streams, logical file catalogues, etc.
I see, ok.

Tier 1:
 - session handle
 - errors
 - attributes
 - tasks
 - files
 - logical files
 - job submission, brokering
 - streams

So basically what is in the API right now.

Tier 2:
 - steering and monitoring
 - possibly combining logical/physical files (read on logical files)
 - task dependencies (simple workflows and batches)
 - extensions to Tier 1 classes

There is no good and explicit roadmap for Tier 2 right now.

Best regards, Andre.
Steven
--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www:  http://www.merzky.net |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+