Re: [jsdl-wg] DataStaging concerns

20 May 2005

Peter G. Lane wrote:
...
> Forgive me if I'm reiterating on a topic.  I've only been reading up on
> JSDL since yesterday.  I have a few concerns about the DataStaging
> section.  Primarily, I'm wondering if it really makes sense to have it
> as part of the core schema.  I think it would be better to have
> extensions like POSIXApplication for more specific DRM configurations.
We suspect that there's going to be quite a bit of extension in that
area, and welcome feedback for post-1.0 (we're very very unlikely to
change anything for JSDL 1.0 now; it doesn't do everything, but it does
a useful fraction and too many people need something - anything! - now).
...
> 1) There's still controversy over whether staging should or should not
> be integrated into a DRM.  As far as I can tell, for example, the BES
> doesn't have any plans to implement staging.  DRMAA makes this
> optional.  If BES ends up using JSDL, wouldn't this be a violation of
> the spec which requires each element to be supported in some way?
"Supported" has a very particular meaning within a JSDL context, and the
effective meaning could include a definite response "I don't know how to
do data staging, man!" We discussed data staging quite a few times
(around a year ago IIRC) and what we came up with is a minimum to allow
processing of jobs on a number of different systems including domains
like cross-cluster deployment where everything has to be shipped in first.

If we'd had a proper workflow language too, we'd have done data staging
differently. But there wasn't something suitable already existing (BPEL
does something else) and if we'd had to develop our own, we'd still
be arguing about it now.
...
> 3) I don't particularly like that the DataStaging sections include an
> option to remove the file at the end of the job.  If I'm staging out
> data then this doesn't make a whole lot of sense.  I'd much prefer a
> separate section which explicitly lists all the files that are to be
> removed from the submission machine after the job has been completed.
> This would also cover the case of data that is created rather than
> staged in but still needs to be removed after job completion.
The "remove data at the end of the job" applies after staging it out
(it'd be a bit silly otherwise). It is also the case that you can list a
file in the data staging section and not have it staged in or out, but
just deleted at end-of-job.
...
> 4) Based on the current GRAM incarnation, it would be nice to let RFT's
> transfer request description extend a base staging schema and then use
> that in the JSDL document rather than adding a bunch of extensions to
> DataStaging.  This is similar to how I'd want to go about using
> POSIXApplication.
I don't fully grasp what you mean here. The following is legal (modulo
namespaces) according to draft-18:

   <jsdl:DataStaging>
      <jsdl:FileName>example</jsdl:FileName>
      <!-- I think we agreed to default this next element -->
      <jsdl:CreationFlag>jsdl:overwrite</jsdl:CreationFlag>
      <jsdl:Source>
        <!-- I've no idea what your RFT schema might look like -->
        <rft:SynchFileWithSomewhere>...</rft:SynchFileWithSomewhere>
      </jsdl:Source>
   </jsdl:DataStaging>

(Hmm, the non-normative examples in the data-staging section of d18 seem
out of step. Bother.)

Given that the above is legal, what's the problem?

Donal.

Donal K. Fellows