
Hi everyone!

We have something of a problem: we need a way to join the virtual filesystem locations that we stage files to/from in DataStaging to the places referred to in PosixApplication (especially Argument and Environment, though this also applies to Input, Output and Error; possibly WorkingDirectory too). We want to virtualize the location within the real target filesystem(s) where the files are located, but we do not have any standard substitution scheme. So far, we've only got the ability to specify literal strings, which gives either absolute locations (non-portable) or locations that are relative to the working directory (great for the simple cases, but awful for anything complex).

In DRMAA, this problem is solved by allowing magical tokens to be put at the start of applicable parameters and defining a substitution scheme for those magical tokens. (This is also the UNICORE approach.) That works, but I use the term "magical" with good reason; the values are not easily manipulable with XML processing tools. The other basic alternative, processing with XSLT on the way in, has the problem that it tends to end up binding things to specific locations far too soon.

What I think we need is a way to say that a certain term is to be made relative to a particular virtual filesystem root. As far as I'm concerned, the easiest way of doing this is to put optional attributes on the elements needing this treatment (listed above) that say that the element value needs to be interpreted as being relative to the root of the virtual filesystem named in the attribute, with an absent attribute indicating that the element value is to be used literally. Alternatively, we could put a complex type inside the element, but that's much messier IMO.
I suggest that the attribute should be called jsdl:filesystem (or rather, whatever we use in DataStaging, but I don't remember for sure right now) and we might use it like this fragment:

  <jsdl:Application>
    <jsdl:ApplicationName>Gaussian</jsdl:ApplicationName>
    <jsdl-exec:POSIXApplication>
      <jsdl-exec:Input jsdl:filesystem="HOME">input.txt</jsdl-exec:Input>
      <jsdl-exec:StackSpaceLimit>8388608</jsdl-exec:StackSpaceLimit>
    </jsdl-exec:POSIXApplication>
  </jsdl:Application>

We'd need to decide what predefined virtual filesystems to support, of course. Or have we done that already? :^)

Donal.
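
To make the proposed semantics concrete, here is a minimal sketch in Python of how a JSDL consumer might interpret the attribute. It is only an illustration under stated assumptions: the namespace URIs are placeholders rather than the real JSDL ones, the ROOTS table and the resolve() helper are hypothetical, and the target is assumed to use POSIX-style paths.

```python
import posixpath
import xml.etree.ElementTree as ET

# Placeholder namespace URIs for illustration only (not the real JSDL URIs).
JSDL_NS = "http://example.org/jsdl"
EXEC_NS = "http://example.org/jsdl-exec"

# Hypothetical mapping from virtual filesystem names to real roots on one target.
ROOTS = {"HOME": "/home/alice", "SCRATCH": "/scratch/alice"}

DOC = f"""
<jsdl:Application xmlns:jsdl="{JSDL_NS}" xmlns:jsdl-exec="{EXEC_NS}">
  <jsdl-exec:POSIXApplication>
    <jsdl-exec:Input jsdl:filesystem="HOME"> input.txt </jsdl-exec:Input>
    <jsdl-exec:Output> output.txt </jsdl-exec:Output>
  </jsdl-exec:POSIXApplication>
</jsdl:Application>
"""


def resolve(element, roots):
    """Return the element value, joined to its virtual filesystem root if one is named."""
    value = (element.text or "").strip()
    name = element.get(f"{{{JSDL_NS}}}filesystem")
    if name is None:
        # Absent attribute: the element value is used literally.
        return value
    if name not in roots:
        raise KeyError(f"unknown virtual filesystem: {name}")
    # Assumes a POSIX target; the job description itself never mentions a separator.
    return posixpath.join(roots[name], value)


root = ET.fromstring(DOC)
posix_app = root.find(f"{{{EXEC_NS}}}POSIXApplication")
# Map each child's local name (namespace stripped) to its resolved value.
resolved = {child.tag.split("}")[1]: resolve(child, ROOTS) for child in posix_app}
print(resolved)
```

Because the filesystem name stays in an attribute, ordinary XML tools can read or rewrite it without parsing the element text, and binding to a real path happens only at the point where ROOTS is known.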

On Apr 26, Donal K. Fellows loaded a tape reading:
Hi everyone!
...
I suggest that the attribute should be called jsdl:filesystem (or rather, whatever we use in DataStaging but I don't remember for sure right now) and we might use it like this fragment:
<jsdl:Application> <jsdl:ApplicationName> Gaussian </jsdl:ApplicationName> <jsdl-exec:POSIXApplication> <jsdl-exec:Input jsdl:filesystem="HOME"> input.txt </jsdl-exec:Input> <jsdl-exec:StackSpaceLimit> 8388608 </jsdl-exec:StackSpaceLimit> </jsdl-exec:POSIXApplication> </jsdl:Application>
This seems as reasonable as the other magical approaches. It only prevents arbitrary string splicing like we have in GRAM... we use a ${VARIABLE} substitution rule in string fields like args and environment settings. The only thing more general might be some attribute to enable such a rewrite rule on a particular string element field? E.g. by the ever faithful URI or QName "extensible enumeration type" to indicate how such a filesystem string name is merged into the target element.
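
For comparison, a rough sketch of ${VARIABLE}-style splicing in Python. This uses the standard library's string.Template purely as a stand-in; it is not GRAM's implementation, and the bindings and the splice() helper are invented for the example.

```python
from string import Template

# Hypothetical variable bindings supplied by the execution system.
BINDINGS = {"HOME": "/home/alice", "SCRATCH": "/scratch/alice"}


def splice(value, bindings):
    """Expand ${NAME} references anywhere in the string; unknown names are left as-is."""
    return Template(value).safe_substitute(bindings)


# A reference can appear in the middle of a string, or several times in one string.
argument = splice("--input=${HOME}/input.txt", BINDINGS)
environment = splice("${SCRATCH}/lib:${HOME}/lib", BINDINGS)
unknown = splice("${UNDEFINED}/data", BINDINGS)
print(argument, environment, unknown)
```

The attribute scheme cannot express the first two cases, since it only prefixes a whole element value with one root; that is the "arbitrary string splicing" it gives up.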
We'd need to decide what predefined virtual filesystems to support, of course. Or have we done that already? :^)
Donal.
Could this be handled using a more open content model? I am still not sure I understand the use case, e.g. who defines the virtual labels and their meaning. Could they ever be created dynamically by some external negotiation or stateful process, meaning the label really only means something to the client and for the provider it is just an arbitrary space in a pool?

karl

--
Karl Czajkowski
karlcz@univa.com

Karl Czajkowski wrote:
This seems as reasonable as the other magical approaches. It only prevents arbitrary string splicing like we have in GRAM... we use a ${VARIABLE} substitution rule in string fields like args and environment settings.
Yeah, that's what ant and a few other tools do too. I seem to recall hearing/reading that XML hackers hate it as an idiom. :^)
The only thing more general might be some attribute to enable such a rewrite rule on a particular string element field? E.g. by the ever faithful URI or QName "extensible enumeration type" to indicate how such a filesystem string name is merged into the target element.
In general that's a good idea, but it's much more complex and we can do 90% of what we need with only 10% of the effort. As a bonus, my scheme also lets us avoid having to define what a directory separator looks like, at least for basic use. :^)
Could this be handled using a more open content model? I am still not sure I understand the use case, e.g. who defines the virtual labels and their meaning. Could they ever be created dynamically by some external negotiation or stateful process, meaning the label really only means something to the client and for the provider it is just an arbitrary space in a pool?
The virtual labels are a separate thing that we need to pin down, especially as they're needed elsewhere too. I suggest that we probably want a basic set like these (of course, users and JSDL consumers can create their own):

  HOME          - Executing user's home directory.
  TEMP (or TMP) - Temporary directory not guaranteed to last past the end of staging out, but relatively fast. Might not be shared across a cluster.
  SCRATCH       - Temporary directory that can hold "large" temporary files with a longer lifespan than TEMP, where it is distinct. (I know of a few systems where these are quite separate, and where it is shared cluster-wide.)

I suppose we could add the job working directory to that list, but if anyone is doing cunning stuff there they should probably do it by hand.

Creating new kinds of these things through negotiation or other stateful processing? Sounds fine to me.

Donal.
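
As a rough illustration of the basic set, a consumer might build its table of predefined labels along these lines. This is a sketch, not a proposal for how any implementation must do it: default_filesystems() is a hypothetical helper, SCRATCH_DIR is an assumed site-specific environment variable, and falling back to the temporary directory when no distinct scratch area exists is just one possible policy.

```python
import os
import tempfile


def default_filesystems(environ=os.environ):
    """Build a hypothetical table of predefined virtual filesystem roots for this host."""
    temp = tempfile.gettempdir()
    return {
        "HOME": os.path.expanduser("~"),  # executing user's home directory
        "TEMP": temp,                     # fast, short-lived temporary space
        "TMP": temp,                      # alias for TEMP
        # SCRATCH_DIR is an assumed variable name; fall back to TEMP if it is unset.
        "SCRATCH": environ.get("SCRATCH_DIR", temp),
    }


filesystems = default_filesystems()
# Users and consumers can add their own labels on top of the basic set.
filesystems["PROJECT"] = "/data/project42"
print(sorted(filesystems))
```

Labels created through negotiation would simply be further entries in the same table, meaningful to the client while the provider maps them to whatever space it allocates.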