Wildcards in data staging elements

Dear jsdl folks,
what is the opinion on using JSDL DataStaging elements that contain wildcards? Or even something more powerful such as filesets with includes and excludes, as in Apache Ant?
Thanks and best regards,
Bernd.
--
Dr. Bernd Schuller
Distributed Systems and Grid Computing, Juelich Supercomputing Centre
mail: b.schuller@fz-juelich.de
phone: +49 2461 61-8736 (fax: -6656)
personal blog: http://www.jroller.com/page/gridhaus
http://www.fz-juelich.de/jsc

Bernd Schuller wrote:
Dear jsdl folks,
what is the opinion on using JSDL DataStaging elements that contain wildcards? Or even something more powerful such as filesets with includes and excludes, as in Apache Ant?
This is quite a common requirement, especially for legacy codes that create multiple output files or use output file names that include the process ID or similar. I can see two major problems with wildcards:
1. It can be very difficult to get the underlying tools to deal with the wildcards correctly, or at all. This is particularly the case for output transfers where the only way to obtain the output files is to specify their names to the underlying middleware explicitly.
2. It creates a dependence on the filesystem that is not there currently. I can currently look at a JSDL file and know exactly the steps that will be performed. Adding wildcards means I also need to see the filesystems involved in order to know what the JSDL is actually going to do.
Jon.
--
Jonathan Giddy
Grid Technologies Co-ordinator
Welsh e-Science Centre, Cardiff University
ph (029) 2087 9153
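Jon's second point can be illustrated with a small sketch. The pattern and the two directory listings below are made up for illustration, and Python's shell-style `fnmatch` matching stands in for whatever wildcard syntax a JSDL extension might define. The same pattern selects different files depending on what the job left behind, so the job description alone no longer says which transfers will happen.

```python
from fnmatch import fnmatchcase


def expand(pattern, listing):
    """Return the names in `listing` that match a shell-style pattern."""
    return sorted(name for name in listing if fnmatchcase(name, pattern))


# Hypothetical working directories after two runs of the same job;
# the output names embed a process ID, as in the legacy codes mentioned above.
run_a = ["out.1234.dat", "out.1235.dat", "job.log"]
run_b = ["out.9876.dat", "core", "job.log"]

pattern = "out.*.dat"
print(expand(pattern, run_a))  # two files staged out
print(expand(pattern, run_b))  # one file staged out, from the same description
```

With explicit file names, both runs would attempt exactly the same transfers, and a missing file would show up as an error rather than as a silently shorter list.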

Bernd Schuller wrote:
what is the opinion on using JSDL DataStaging elements that contain wildcards? Or even something more powerful such as filesets with includes and excludes, as in Apache Ant?
I've been tending to think in terms of "ship this whole directory tree", which doesn't need (explicit) wildcards, and which permits all sorts of interesting behind-the-scenes optimizations.
One problem with wildcards is that they can sometimes lead to unexpected results; I can remember someone creating a file called '-f' in a globally-writeable directory once and getting some very strange complaints from the sysadmins. :-) This is why JSDL as written in the spec does not have any wildcards within it. (If it did, they'd *not* be shell-like, but rather XML-like.)
Donal.
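The '-f' anecdote can be reproduced without touching a real filesystem. In the sketch below, the `remove` command is a hypothetical stand-in built with `argparse` (it deletes nothing), and the directory listing is invented. Once a shell-style `*` has been expanded, a file named `-f` is indistinguishable from an option unless the argument list is protected with `--`.

```python
import argparse
from fnmatch import fnmatchcase

# Hypothetical stand-in for a tool such as rm: "-f" is an option,
# everything else is treated as a file name. Nothing is actually deleted.
parser = argparse.ArgumentParser(prog="remove")
parser.add_argument("-f", "--force", action="store_true")
parser.add_argument("files", nargs="*")

# Invented listing of a shared directory that contains a file called "-f".
listing = ["-f", "notes.txt", "data.csv"]

# What a shell-style "*" would expand to for this listing.
expanded = sorted(name for name in listing if fnmatchcase(name, "*"))

naive = parser.parse_args(expanded)          # "-f" is swallowed as the option
safe = parser.parse_args(["--"] + expanded)  # "--" ends option parsing

print(naive.force, naive.files)
print(safe.force, safe.files)
```

In the naive call the file name silently turns into the force flag and drops out of the file list, which is the kind of surprise an explicit list of names in the job description avoids.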

2007/11/13, Donal K. Fellows <donal.k.fellows@manchester.ac.uk>:
Bernd Schuller wrote:
what is the opinion on using JSDL DataStaging elements that contain wildcards? Or even something more powerful such as filesets with includes and excludes, as in Apache Ant?
One problem with wildcards is that they can sometimes lead to unexpected results; I can remember someone creating a file called '-f' in a globally-writeable directory once and getting some very strange complaints from the sysadmins. :-)
I'd vote against wildcards. They are somewhat platform-dependent (see Donal's statement on -f) and create dependencies on the implementation of the underlying filesystem (see Jon's statement).
IMHO this is more or less a convenience feature: instead of writing a staging element for each of "outfile1.dat" to "outfile100.dat" by hand, there should be an easier way to do it. But this is something that should be handled by the UI used for submission, and (stab me if I'm wrong) as such out of the scope of JSDL.
-Alexander
--
Dipl.-Inform. Alexander Papaspyrou
Robotics Research Institute, Information Technology Section
Dortmund Technical University, Germany
http://ds.e-technik.uni-dortmund.de/~alexp
phone: +49(231)755-5058
fax: +49(231)755-3251
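As a rough sketch of the client-side approach Alexander describes, a submission tool could generate the hundred explicit staging elements itself. The function name `stage_out_elements` and the target URL are hypothetical, and the element names (`DataStaging`, `FileName`, `CreationFlag`, `Target`, `URI`) and namespace follow my reading of JSDL 1.0, so they should be checked against the specification before use.

```python
import xml.etree.ElementTree as ET

# JSDL 1.0 namespace as I understand it; verify against the spec.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"


def stage_out_elements(file_names, target_base):
    """Build one explicit DataStaging element per output file (hypothetical helper)."""
    elements = []
    for name in file_names:
        staging = ET.Element(f"{{{JSDL_NS}}}DataStaging")
        ET.SubElement(staging, f"{{{JSDL_NS}}}FileName").text = name
        ET.SubElement(staging, f"{{{JSDL_NS}}}CreationFlag").text = "overwrite"
        target = ET.SubElement(staging, f"{{{JSDL_NS}}}Target")
        ET.SubElement(target, f"{{{JSDL_NS}}}URI").text = target_base + name
        elements.append(staging)
    return elements


# "outfile1.dat" to "outfile100.dat", expanded by the client instead of by JSDL.
names = [f"outfile{i}.dat" for i in range(1, 101)]
# The target location is a placeholder, not a real endpoint.
elements = stage_out_elements(names, "gsiftp://example.org/results/")

ET.register_namespace("jsdl", JSDL_NS)
print(ET.tostring(elements[0], encoding="unicode"))
```

The resulting job description stays fully explicit, so the consuming system needs no wildcard support and the reader of the JSDL still sees every transfer. It does not help when the file names are unknown at submission time, which is the case Jon raises.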
participants (4)
- Alexander Papaspyrou
- Bernd Schuller
- Donal K. Fellows
- Jonathan Giddy