
Peter G. Lane wrote:
Forgive me if I'm reiterating on a topic. I've only been reading up on JSDL since yesterday. I have a few concerns about the DataStaging section. Primarily, I'm wondering if it really makes sense to have it as part of the core schema. I think it would be better to have extensions like POSIXApplication for more specific DRM configurations.
We suspect that there's going to be quite a bit of extension in that area, and welcome feedback for post-1.0 (we're very very unlikely to change anything for JSDL 1.0 now; it doesn't do everything, but it does a useful fraction and too many people need something - anything! - now).
1) There's still controversy over whether staging should or should not be integrated into a DRM. As far as I can tell, for example, the BES doesn't have any plans to implement staging. DRMAA makes this optional. If BES ends up using JSDL, wouldn't this be a violation of the spec which requires each element to be supported in some way?
"Supported" has a very particular meaning within a JSDL context, and the effective meaning could include a definite response "I don't know how to do data staging, man!" We discussed data staging quite a few times (around a year ago IIRC) and what we came up with is a minimum to allow processing of jobs on a number of different systems, including domains like cross-cluster deployment where everything has to be shipped in first. If we'd had a proper workflow language too, we'd have done data staging differently. But there wasn't something suitable already existing (BPEL does something else) and if we'd had to develop our own, we'd still be arguing about it now.
3) I don't particularly like that the DataStaging sections include an option to remove the file at the end of the job. If I'm staging out data then this doesn't make a whole lot of sense. I'd much prefer a separate section which explicitly lists all the files that are to be removed from the submission machine after the job has been completed. This would also cover the case of data that is created rather than staged in but still needs to be removed after job completion.
The "remove data at the end of the job" applies after staging it out (it'd be a bit silly otherwise). It is also the case that you can list a file in the data staging section and not have it staged in or out, but just deleted at end-of-job.
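As a rough illustration of those two cases, here is a sketch (modulo namespaces) of one entry that is staged out and then removed, and one that is neither staged in nor out but simply deleted at end-of-job. The jsdl:DeleteOnTermination and jsdl:URI element names are assumptions taken from the published JSDL 1.0 schema and may not match draft-18 exactly; the file names and target URL are made up for the example.

```xml
<!-- Sketch only: DeleteOnTermination and URI are assumed element names;
     file names and the target URL are hypothetical. -->

<!-- Case 1: stage the file out to the target, then delete the local copy -->
<jsdl:DataStaging>
  <jsdl:FileName>result.dat</jsdl:FileName>
  <jsdl:CreationFlag>jsdl:overwrite</jsdl:CreationFlag>
  <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
  <jsdl:Target>
    <jsdl:URI>http://example.org/results/result.dat</jsdl:URI>
  </jsdl:Target>
</jsdl:DataStaging>

<!-- Case 2: no Source and no Target, so nothing is transferred;
     the file is only cleaned up at end-of-job -->
<jsdl:DataStaging>
  <jsdl:FileName>scratch.tmp</jsdl:FileName>
  <jsdl:CreationFlag>jsdl:overwrite</jsdl:CreationFlag>
  <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
</jsdl:DataStaging>
```

The second entry is how a file created by the job itself, rather than staged in, could still be listed for removal.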
4) Based on the current GRAM incarnation, it would be nice to let RFT's transfer request description extend a base staging schema and then use that in the JSDL document rather than adding a bunch of extensions to DataStaging. This is similar to how I'd want to go about using POSIXApplication.
I don't fully grasp what you mean here. The following is legal (modulo namespaces) according to draft-18:

  <jsdl:DataStaging>
    <jsdl:FileName>example</jsdl:FileName>
    <!-- I think we agreed to default this next element -->
    <jsdl:CreationFlag>jsdl:overwrite</jsdl:CreationFlag>
    <jsdl:Source>
      <!-- I've no idea what your RFT schema might look like -->
      <rft:SynchFileWithSomewhere>...</rft:SynchFileWithSomewhere>
    </jsdl:Source>
  </jsdl:DataStaging>

(Hmm, the non-normative examples in the data-staging section of d18 seem out of step. Bother.)
Given that the above is legal, what's the problem?

Donal.