
In the course of implementing some additional JSDL 1.0 features (or extensions), a couple of issues have reared their ugly heads that I wanted to bring to the group's attention.
The first has to do with data staging, working directory, file systems, and everything else having to do with paths. It turns out that there are some exceptionally popular applications out there that are very poorly written. This is probably a surprise to no one, but unfortunately, it's a fact of life that in the grid computing business, we have to support many of these applications despite their egregiousness. In particular, I refer to applications which cannot handle input files or paths which contain spaces. Recently, while trying to run a large batch of BLAST jobs using our new implementation of the SCRATCH file system, we discovered that BLAST is unable to deal with paths to the search database which contain spaces. As we thought about what to do about this, we realized that, sad as it may seem, it may be necessary to add to JSDL the ability to say that an Application (or certain inputs) must have paths without any spaces included. Would this be a Resource requirement? I'm not sure exactly how to classify it, but it seems like some JSDL annotation is necessary.
The second problem has to do with the SPMD extensions. I feel very strongly that the enumerated type that identifies which kind of SPMD application you are running is insufficient. If you look at UVa as an example compute environment, we have at least 3 versions of MPICH running on our cluster at any given time (one each for the gcc, pgi, and intel compilers). Each one requires different libraries to be located in the search path and each one is started (sadly) with a different mpiexec. I think that a rework of the SPMD extensions to allow for some kind of greater specificity in the SPMD type is called for.
Sincerely,
Mark Morgan
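To make the SPMD point concrete, here is a minimal sketch of why a single enumerated value such as "MPICH" underdetermines the launch. It assumes a site keeps a lookup table keyed by MPI flavor and compiler. The `MpiRuntime` class, the `RUNTIMES` table, the `launch_command` helper, and every path in it are hypothetical placeholders, not UVa's real layout and not part of the SPMD extension.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MpiRuntime:
    """One installed MPI build (hypothetical record, for illustration only)."""
    mpiexec: str                # launcher binary for this build
    lib_dirs: Tuple[str, ...]   # directories that must be on the library search path


# Hypothetical site table. All paths are invented placeholders.
RUNTIMES: Dict[Tuple[str, str], MpiRuntime] = {
    ("MPICH", "gcc"): MpiRuntime("/opt/mpich-gcc/bin/mpiexec", ("/opt/mpich-gcc/lib",)),
    ("MPICH", "pgi"): MpiRuntime("/opt/mpich-pgi/bin/mpiexec", ("/opt/mpich-pgi/lib",)),
    ("MPICH", "intel"): MpiRuntime("/opt/mpich-intel/bin/mpiexec", ("/opt/mpich-intel/lib",)),
}


def launch_command(flavor: str, compiler: str, nprocs: int,
                   executable: str) -> Tuple[List[str], Dict[str, str]]:
    """Return the argv and extra environment needed to start the job."""
    runtime = RUNTIMES[(flavor, compiler)]  # KeyError if the site lacks this build
    argv = [runtime.mpiexec, "-n", str(nprocs), executable]
    env = {"LD_LIBRARY_PATH": ":".join(runtime.lib_dirs)}
    return argv, env
```

With only the flavor "MPICH" available, the lookup key is incomplete; the compiler (or some equivalent qualifier) is the extra specificity being asked for.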

Mark Morgan wrote:
The first has to do with data staging, working directory, file systems, and everything else having to do with paths. It turns out that there are some exceptionally popular applications out there that are very poorly written. This is probably a surprise to no one, but unfortunately, it's a fact of life that in the grid computing business, we have to support many of these applications despite their egregiousness. In particular, I refer to applications which cannot handle input files or paths which contain spaces. Recently, while trying to run a large batch of BLAST jobs using our new implementation of the SCRATCH file system, we discovered that BLAST is unable to deal with paths to the search database which contain spaces. As we thought about what to do about this, we realized that, sad as it may seem, it may be necessary to add to JSDL the ability to say that an Application (or certain inputs) must have paths without any spaces included. Would this be a Resource requirement? I'm not sure exactly how to classify it, but it seems like some JSDL annotation is necessary.
I'm not sure that this is a JSDL requirement as such, as that's correctly conveying what you wish to achieve. It's just that BLAST is crapping out on it. I'd instead suggest that what you really need is something in your execution environment (i.e. I think this is a BES requirement) to say "crap out if there are spaces in filenames" or "convert 'bad' characters to underscores" or something like that.
In fact, by running into trouble this way I believe you've actually demonstrated that your implementation is (in some ways at least) correct. :-D
Donal.
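The two execution-environment policies suggested above ("crap out if there are spaces" and "convert 'bad' characters to underscores") could look roughly like the following. This is a minimal sketch, assuming POSIX-style paths and a deliberately conservative definition of "bad" characters. The policy names, `UnsafePathError`, and `apply_path_policy` are hypothetical; neither JSDL nor BES defines them.

```python
import re

# Hypothetical policy names; not defined by JSDL or BES.
REJECT = "reject"
SANITIZE = "sanitize"

# Assumption: anything other than letters, digits, dot, underscore, or hyphen
# counts as a "bad" character within a single path component.
_BAD_CHARS = re.compile(r"[^A-Za-z0-9._-]")


class UnsafePathError(ValueError):
    """Raised when a path contains characters the application cannot handle."""


def apply_path_policy(path: str, policy: str = REJECT) -> str:
    """Check (REJECT) or rewrite (SANITIZE) every component of a POSIX-style path."""
    components = path.split("/")
    if policy == REJECT:
        for part in components:
            if _BAD_CHARS.search(part):
                raise UnsafePathError(f"unsafe component {part!r} in {path!r}")
        return path
    if policy == SANITIZE:
        # Replace each bad character with an underscore, keeping the separators.
        return "/".join(_BAD_CHARS.sub("_", part) for part in components)
    raise ValueError(f"unknown policy {policy!r}")
```

Note that sanitizing changes where the job actually runs, so anything staged in or out would have to use the rewritten path as well.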

I have to disagree. My implementation is not correct insofar as my job is to run jobs for scientists here at UVa. Those scientists use BLAST. BLAST currently cannot run on my implementation because my implementation can have spaces in the path to where the BES jobs run. Therefore I have failed my scientists. We all know that BLAST is crapping out on spaces because it is poorly written, and in a perfect world we would be able to justify the failure on that basis, but unfortunately, we are in the business of supporting poorly written applications. An application that cannot handle spaces in the paths is a fact of life and we need to be able to describe that unfortunate circumstance.
-Mark
On Tue, 2008-10-14 at 10:24 +0100, Donal K. Fellows wrote:
Mark Morgan wrote:
The first has to do with data staging, working directory, file systems, and everything else having to do with paths. It turns out that there are some exceptionally popular applications out there that are very poorly written. This is probably a surprise to no one, but unfortunately, it's a fact of life that in the grid computing business, we have to support many of these applications despite their egregiousness. In particular, I refer to applications which cannot handle input files or paths which contain spaces. Recently, while trying to run a large batch of BLAST jobs using our new implementation of the SCRATCH file system, we discovered that BLAST is unable to deal with paths to the search database which contain spaces. As we thought about what to do about this, we realized that, sad as it may seem, it may be necessary to add to JSDL the ability to say that an Application (or certain inputs) must have paths without any spaces included. Would this be a Resource requirement? I'm not sure exactly how to classify it, but it seems like some JSDL annotation is necessary.
I'm not sure that this is a JSDL requirement as such, as that's correctly conveying what you wish to achieve. It's just that BLAST is crapping out on it. I'd instead suggest that what you really need is something in your execution environment (i.e. I think this is a BES requirement) to say "crap out if there are spaces in filenames" or "convert 'bad' characters to underscores" or something like that.
In fact, by running into trouble this way I believe you've actually demonstrated that your implementation is (in some ways at least) correct. :-D
Donal.

Mark Morgan wrote:
I have to disagree. My implementation is not correct insofar as my job is to run jobs for scientists here at UVa. Those scientists use BLAST. BLAST currently cannot run on my implementation because my implementation can have spaces in the path to where the BES jobs run. Therefore I have failed my scientists. We all know that BLAST is crapping out on spaces because it is poorly written, and in a perfect world we would be able to justify the failure on that basis, but unfortunately, we are in the business of supporting poorly written applications. An application that cannot handle spaces in the paths is a fact of life and we need to be able to describe that unfortunate circumstance.
I agree that there's a system failure. I just disagree that this is JSDL's fault; it is correctly describing what people want to run. It's just a shame that the application isn't as capable as that.
Guessing from what you say that we're talking about running stuff on Windows with its odd ideas about user-specific temp directory names, the immediate fix is probably to add an OS constraint because changing the temporary directory is plain awkward and the default in most languages has a space in it. And complain to the BLAST maintainers, of course.
If I'm guessing wrong and the issue is users' files with spaces in the names (as opposed to in the path) I'd just call that PEBKAC and stop worrying about it. ;-)
Donal.

An application that cannot handle spaces in the paths is a fact of life and we need to be able to describe that unfortunate circumstance.
But I'm not sure that this is the fault of JSDL. I can specify a file path in JSDL that does not exist - regardless of the 'correctness' of the application or not - that would seem to be my fault, not JSDL's.
A verification requirement in the BES would seem to be more appropriate. A local BES might know that BLAST on that platform can only have data files in /opt/data and that a file URL cannot have any spaces.
This might be something described in the Application Template specification or someplace else?
Steven
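A verification step of the kind described above might be sketched as follows. It assumes the local BES keeps a per-application table of path rules and checks a request's paths before running anything. The `AppConstraints` record, the `CONSTRAINTS` table, and `verify_paths` are hypothetical names; only the /opt/data and no-spaces rules for BLAST come from the message itself.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AppConstraints:
    """Hypothetical per-application rules a local BES might keep."""
    data_root: Optional[str] = None  # data files must live under this directory
    allow_spaces: bool = True        # whether paths may contain spaces


# Example rules from the message above; the table itself is an assumption.
CONSTRAINTS: Dict[str, AppConstraints] = {
    "blast": AppConstraints(data_root="/opt/data", allow_spaces=False),
}


def verify_paths(app: str, paths: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the request passes."""
    rules = CONSTRAINTS.get(app.lower())
    if rules is None:
        return []  # nothing known about this application, so nothing to check
    problems = []
    for p in paths:
        if not rules.allow_spaces and " " in p:
            problems.append(f"{p!r}: contains a space")
        if rules.data_root is not None:
            root = PurePosixPath(rules.data_root)
            candidate = PurePosixPath(p)
            # Purely lexical check: ".." segments are not resolved here.
            if candidate != root and root not in candidate.parents:
                problems.append(f"{p!r}: not under {rules.data_root}")
    return problems
```

Keeping the rules in the execution service rather than in the job document matches the suggestion that this is site knowledge about the application, not part of the request.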

I would tend to agree with you about an Application Template or application description.
On Tue, 2008-10-14 at 07:22 -0700, Steven Newhouse wrote:
An application that cannot handle spaces in the paths is a fact of life and we need to be able to describe that unfortunate circumstance.
But I'm not sure that this is the fault of JSDL. I can specify a file path in JSDL that does not exist - regardless of the 'correctness' of the application or not - that would seem to be my fault, not JSDL's.
A verification requirement in the BES would seem to be more appropriate. A local BES might know that BLAST on that platform can only have data files in /opt/data and that a file URL cannot have any spaces.
This might be something described in the Application Template specification or someplace else?
Steven

Steven Newhouse wrote:
An application that cannot handle spaces in the paths is a fact of life and we need to be able to describe that unfortunate circumstance.
But I'm not sure that this is the fault of JSDL. I can specify a file path in JSDL that does not exist - regardless of the 'correctness' of the application or not - that would seem to be my fault, not JSDL's.
A verification requirement in the BES would seem to be more appropriate. A local BES might know that BLAST on that platform can only have data files in /opt/data and that a file URL cannot have any spaces.
This might be something described in the Application Template specification or someplace else?
Thank you, yes. I agree with the way you say this. JSDL doesn't describe the application, but rather the request to execute the application. JSDL does not care if the application is likely to choke on the values provided; it doesn't syntax-check the arguments, as it's just a document format.
Donal.

Participants (3):
- Donal K. Fellows
- Mark Morgan
- Steven Newhouse