
Partial (incremental) transfers can be done without extensions by specifying an appropriate URI. The spec already includes this example:

    <jsdl:DataStaging>
      <jsdl:Source>
        <jsdl:URI>rsync://foo.bar.com/~me/job1/input</jsdl:URI>
      </jsdl:Source>
      ...
    </jsdl:DataStaging>

Michel Drescher wrote:
Hi Andrew,
first off, welcome to JSDL. :-)
In the data staging elements there is a creation flag that indicates whether to overwrite or append. Often, one only wants to overwrite if the target and the source are different, for example if I am using a large data set that changes infrequently, or if several jobs on the same resource will use the "same" input file. Is there a way to do a conditional stage-in?
Not at the moment. This may be different in the future.
But we're talking about a major performance optimization.
I am not objecting in general, but I do wonder - how is "different" defined? By creation date? By modification date? Size? MD5 hash? - Who carries out the assessment and decides if source and target are different or the same?
I think a more prominent use case for a performance enhancement is partial file transfers (in your example, this would transfer only the changed bits of your data set).
But both partial file transfers and conditional file transfers can already be realised today using extensions to JSDL. Note that, referring to JSDL 1.0 draft 19 (http://tinyurl.com/cvfmk), neither the source nor the target element in a JSDL document instance has to have a jsdl:URI child element. You can add as many child elements and attributes as you want.
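As a rough illustration of what such an extension might look like, here is a sketch of a conditional stage-in. The ex: namespace, the ex:StageIfDifferent element and its compareBy attribute are all hypothetical, invented for this example; they are not part of JSDL, and a consuming system would have to be taught to understand them. The jsdl namespace URI shown is assumed to be the one from the published JSDL 1.0 and may differ in a draft.

```xml
<!-- Sketch only: ex:* names are hypothetical, not defined by JSDL. -->
<jsdl:DataStaging
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:ex="http://example.org/jsdl-conditional-staging">
  <jsdl:FileName>input</jsdl:FileName>
  <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
  <jsdl:Source>
    <jsdl:URI>rsync://foo.bar.com/~me/job1/input</jsdl:URI>
    <!-- Hypothetical extension: overwrite only if source and target
         differ, with "different" defined here as an MD5 mismatch. -->
    <ex:StageIfDifferent compareBy="md5"/>
  </jsdl:Source>
</jsdl:DataStaging>
```

Making the comparison criterion an explicit attribute is one way to answer the question of how "different" is defined; who actually performs the comparison is still left to the consuming system.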
--
Andreas Savva
Fujitsu Laboratories Ltd