
Hi Ashiq, Quoting [Ashiq Anjum] (Oct 11 2009):
Dear Andre,
Many thanks for your reply. I think the SAGA mailing list may be the best place to discuss this issue.
Yep, agree. I also agree with most of your arguments below! (quote is shortened) Yes, a workflow API package in SAGA would cleanly solve the use case - as said earlier, we are working on something like this. I'll try to share some more details on this list soon - but so far we have only implemented it, and did not write anything up...
On the other hand, if a workflow is submitted as a single entity, the users will manage just one job, and the underlying resource management system (such as the glite WMS or a Grid Gateway) may split the job into tasks, provide the monitoring information to the users, and present a single result to them.
*submitting* a workflow (instead of expressing it in a SAGA API, which is what I meant above) may well be part of the workflow API. But we will be facing the same problems as with the job package: there we decided *not* to support submitting a native job description, because either we would need to decide on a single job description language, and it's hard to pick one over the other (they all have legitimate use cases); or one supports all job description languages, which defeats the purpose of the SAGA API, really. The same arguments hold for workflow languages, so it is not clear what the support for that will be in the workflow extension...
There may be a number of other scenarios where workflow support in SAGA is needed. I am making the case here that SAGA can see wide-scale adoption in a number of scientific and business communities if workflow support is made available.
There may be a number of ways to support workflows in SAGA. Ideally, the SAGA API should generate JDL/JSDL from a workflow, which can then be interpreted by an underlying Grid middleware.
Yes, that is the part which is straightforward: express job dependencies etc. in a SAGA API package, and create JDL (or whatever) in the adaptor, for submission.
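To make that part concrete, here is a minimal sketch (Python for brevity; all class and method names are made up for illustration, this is not the actual SAGA API) of how dependencies expressed through a workflow package could be rendered, on the adaptor side, as a glite-style "Type = dag" JDL description. The JDL layout is approximated; a real adaptor would follow the gLite WMS documentation exactly.

```python
# Hypothetical workflow package: the application expresses job dependencies,
# the "adaptor" side (to_jdl) renders them as a glite-style JDL DAG string.
# Names and JDL layout are illustrative approximations, not real SAGA API.

class Workflow:
    def __init__(self):
        self.nodes = {}   # node name -> executable
        self.edges = []   # (parent, child) dependency pairs

    def add_node(self, name, executable):
        self.nodes[name] = executable

    def add_edge(self, parent, child):
        self.edges.append((parent, child))

    def to_jdl(self):
        """Render the workflow as a glite-style 'Type = dag' JDL string."""
        lines = ['[', '  Type = "dag";', '  nodes = [']
        for name, exe in sorted(self.nodes.items()):
            lines.append('    %s = [ description = [ Executable = "%s"; ]; ];'
                         % (name, exe))
        lines.append('  ];')
        lines.append('  dependencies = {')
        for parent, child in self.edges:
            lines.append('    { %s, %s }' % (parent, child))
        lines.append('  };')
        lines.append(']')
        return '\n'.join(lines)


if __name__ == '__main__':
    wf = Workflow()
    wf.add_node('stage_in', '/bin/cp')
    wf.add_node('analyze',  '/usr/bin/analyze')
    wf.add_edge('stage_in', 'analyze')
    print(wf.to_jdl())
```

The point of the sketch is only the division of labor: the API package holds a backend-neutral dependency graph, and each adaptor serializes it into whatever its middleware understands.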
1. Large jobs can execute whole workflows, but we would be forcing the whole workflow to be scheduled on a single site. This may create more problems, as we would require the job to transfer all datasets to that single site. This may also overload some sites, and may marginalize the role of meta-schedulers, which were created for multi-site scheduling.
Unless the large job runs over multiple sites, and is using exactly the resources the WF needs. Just saying...
2. Pilot jobs may help with performance optimization, but it is not clear how they can execute a whole workflow, especially if the workflow is distributed over more than one site. If a workflow is distributed, pilot jobs have no mechanism to communicate and coordinate across sites.
One needs one pilot job per cluster, or per resource, to solve that - I don't think that this is a problem. Yes, one needs a coordinating instance - see below.
3. A DAG enactor could be interesting, but how it would coordinate with the underlying Grid resources remains to be investigated.
We may discuss the possibility of a workflow adaptor which is abstract in nature, so that the resulting job descriptions can be executed in different enactment environments. SAGA would generate generic workflow descriptions (JSDL/JDL), and the adaptors could be extended to support the enactment functionality. Alternatively, SAGA could provide a partial enactment engine; however, how that should be executed needs an open debate. Yet another scenario could be to translate workflow descriptions into a SAGA API, which may then automatically dispatch them to underlying adaptors.
Yes, one always needs a WF enactor, somewhere. I'd argue that this should not be implemented in SAGA, really: SAGA is a library, and a DAG enactor should be a service or a tool -- "separation of concerns" is needed here, IMHO. So, let's keep the discussion of an enactor separate from the question of how to express WFs in SAGA, if possible.
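To illustrate why the enactor sits naturally outside the library: all it fundamentally needs from something like SAGA is a "submit this job and wait for it" primitive. A minimal sketch of a standalone enactor (Python; the run_job callback stands in for the actual submission backend, all names are hypothetical):

```python
# Minimal standalone DAG enactor sketch: runs each node once all of its
# parents have finished. Job execution is delegated to run_job(name), which
# a real tool would implement via saga::job (or any other backend) - which
# is exactly why the enactor can live outside the library.

def enact(nodes, edges, run_job):
    """Execute a DAG given node names and (parent, child) edges.
    Calls run_job(name) for each node in a dependency-respecting order,
    and returns the execution order."""
    pending  = {n: 0  for n in nodes}   # node -> number of unfinished parents
    children = {n: [] for n in nodes}   # node -> dependent child nodes
    for parent, child in edges:
        pending[child] += 1
        children[parent].append(child)

    ready = [n for n, deps in pending.items() if deps == 0]
    order = []
    while ready:
        node = ready.pop(0)
        run_job(node)                   # submit + wait, via the library
        order.append(node)
        for child in children[node]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)

    if len(order) != len(pending):
        raise ValueError("workflow has a cycle")
    return order


if __name__ == '__main__':
    executed = []
    enact(['stage_in', 'analyze', 'stage_out'],
          [('stage_in', 'analyze'), ('analyze', 'stage_out')],
          executed.append)
    print(executed)   # parents always appear before their children
```

Nothing in the loop depends on SAGA internals, which is the "separation of concerns" argument in miniature: the library expresses and submits jobs, the tool orders them.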
We do not want to break the standard by suggesting an adaptor to support workflows; however, we need some mechanism so that SAGA implementations can support DAGs and other workflows.
Don't worry - it's a fair request, and I am happy to see use cases beyond our own workflow use cases! The original point I was trying to make was: it is the attempt to tunnel workflow descriptions through the *job* adaptor, in a glite specific way, which breaks the SAGA specification.

Ok, I think the next action item is on me, to send some more details on our workflow approach in SAGA. I will only manage to do so after OGF, I'm afraid...

Best, Andre.

PS.: a possible short term solution: assume your workflow description file is 'file://localhost/wf.jdl'. Further, 'wms.cern.net' is a glite gateway, and 'glite.localdomain.uk' is a local glite node which can submit to the WMS. Then you can run your workflow by calling (pseudo code):

  saga::job::service js ("glite://glite.localdomain.uk");
  saga::job::description jd;

  saga::string stage_wf ("file://localhost/wf.jdl > gridftp://glite.localdomain.uk/tmp/wf.jdl");

  jd.set_vector_attribute ("FileTransfer", [stage_wf]);
  jd.set_attribute        ("Executable",   "glite-wms-job-submit");
  jd.set_vector_attribute ("Arguments",    ["-a", "/tmp/wf.jdl"]);

  saga::job::job j = js.create_job (jd);
  j.run  ();
  j.wait ();

-- Nothing is ever easy.