Minutes for OGSA HPC Profile telecon (Aug 18 2006)

[[ these will be uploaded to sourceforge shortly ]]

OGSA HPC Profile Minutes
Fri Aug 18 2006

Participants:
  Marty Humphrey (University of Virginia)
  Marvin Theimer (MSFT)
  Glenn Wasson (University of Virginia)
  Donal Fellows (University of Manchester)

Minutes: Marty Humphrey

* Summary of Actions:

  AI-HPCP-0818a: Donal Fellows to review HPC Profile Application.
  AI-HPCP-0818b: Marvin Theimer will add paragraph to HPC Basic Profile stating: "all other elements of the JSDL 1.0 MAY appear but MAY also result in a 'not implemented' fault, but only the list here must be supported"
  AI-HPCP-0818c: Marvin Theimer will add first draft of section 4 to HPC Basic Profile that restricts BES operations to singletons

* Previous minutes: NONE (this is the first teleconference)

* Review of Actions: NO Previous Actions (this is the first teleconference)

* Discuss the State of BES
--------------------------

Marvin Theimer has grabbed the pen on the BES doc. Plan: send it out by Wednesday, including a refactoring into [1] arbitrary clients vs [2] sys admins; management would only have start and stop (i.e. put query into the factory).

Main thing: add a fleshed-out extended state model (current draft: "there WILL be an extended state model, and here's the basic one").

Extend the data-staging example so that Ian F and Dave S can see the details and/or are satisfied.

Glenn: factoring into two port types (or three: activity, factory, management; but Activity is not currently in it), not going to be in an Appendix.

Interesting issue: what ultimately shows up in the activity management interface? Glenn: seemed like from the base case that the only thing you would do from the interface is Cancel.

Marvin recapped his email to BES re: three renderings.

Claim: basics of BES are shaping up to be what we expected (shaping up in a good way).

* HPC Profile Application Extension - JSDL extension for describing executables, etc.
-------------------------------------------------------------------------------------

Fairly far along; are we happy with what's in there? Consensus is yes, we're happy. Next step is to have someone other than Chris go through it to see if there are any details that they don't like.

Donal: the document needs to say which bits are the normative MUSTs.

AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.

Goal: to get on the agenda for JSDL. Donal will contact Andreas to get on the JSDL WG agenda.

* HPC Basic Profile
-------------------

Issue: what's the style of writing this doc? How do the restrictions work? Previous discussion: do we really want to create a doc that only has MUSTs? Currently, the style is: "These are the elements that MUST be supported and all other aspects COULD be supported but you also might get a fault back." Consensus is that this is the best style.

Missing: another paragraph: "all other elements of the JSDL 1.0 MAY appear but MAY also result in a 'not implemented' fault, but only the list here must be supported." Example: file system element.

Glenn: I don't like the strike-out approach because you need both docs.

AI-HPCP-0818b: Marvin Theimer will add paragraph stating: "all other elements of the JSDL 1.0 MAY appear but MAY also result in a 'not implemented' fault, but only the list here must be supported."

Donal says: probably want to add more text with regard to range values. E.g., it must support JSDL "exact", but "exact" allows you to specify what you mean by exact. In the case of range values, you also want to mandate the syntactic requirements of integer values, i.e., set-value resources or single-value resources. If NOT "set", then go with "exact".

Marvin: Do we want to, in the base case, restrict total cpu count to be exact? Consensus in call is that this is tricky, because it's arguably not as simple as we would like. Consider putting this issue in a tracker.

This leaves section 5 of BES: our only restriction is to eliminate required support for arrays.

AI-HPCP-0818c: Marvin Theimer will add first draft of section 4 that restricts BES operations to singletons. This will result in our first complete draft. He will add in a first draft of the security section.

* Other news
------------

Donal: At the end of the OGSA call yesterday, it was mentioned that EMS and data people are looking to do a combined use case, based on the HPC Profile. We all agreed on the call that this was great news and we're anxious to work with them.
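The support rule recorded under HPC Basic Profile (listed elements MUST be supported; anything else MAY appear but MAY draw a 'not implemented' fault) could be sketched roughly as below. This is only an illustration: the names in REQUIRED are placeholders rather than the profile's actual list, and NotImplementedFault and check_elements are hypothetical names, not anything defined by JSDL or BES.

```python
# Sketch of the rule: listed elements MUST be supported; any other
# JSDL 1.0 element MAY appear but MAY trigger a "not implemented" fault.

class NotImplementedFault(Exception):
    """Hypothetical stand-in for the profile's 'not implemented' fault."""

# Placeholder names only; the real list is whatever the profile enumerates.
REQUIRED = frozenset({"JobName", "Executable", "Argument"})

def check_elements(elements, optional_supported=frozenset()):
    """Accept a job description or raise NotImplementedFault.

    elements: names of the JSDL elements present in the request.
    optional_supported: non-required elements this service chose to implement.
    """
    supported = REQUIRED | optional_supported
    unsupported = sorted(set(elements) - supported)
    if unsupported:
        # The fault names every element the service does not implement.
        raise NotImplementedFault(", ".join(unsupported))
    return True
```

A service that implements the file system element would pass optional_supported=frozenset({"FileSystem"}); one that does not would fault on the same request, and both would still conform to the profile as described in the minutes.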

AI-HPCP-0818c: Marvin Theimer will add first draft of section 4 to HPC Basic Profile that restricts BES operations to singletons
Could you expand on this... having the BES interface support operations on vectors of activities has been fairly important. Is this what you are planning to remove? Is this a profile restriction or an edit on the main BES document?
Steven
--
Dr Steven Newhouse Mob:+44(0)7920489420 Tel:+44(0)23 80598789
Director, Open Middleware Infrastructure Institute-UK (OMII-UK)
c/o Suite 6005, Faraday Building (B21), Highfield Campus, Southampton University, Highfield, Southampton, SO17 1BJ, UK

I am also concerned about this. We've never had vector operations in Globus, but Chris Smith argued eloquently for their importance in the early stages of BES, and so I became convinced that we need them.
Ian.
_______________________________________________________________
Ian Foster -- Weblog: http://ianfoster.typepad.com
Computation Institute: www.ci.uchicago.edu & www.ci.anl.gov
Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
Tel: +1 630 252 4619 --- Globus Alliance: www.globus.org

If I'm recalling correctly, there were two issues here.
1. general simplicity of the approach
2. failure semantics - would it be possible to define the failure semantics of array operations (i.e. partial failures) in a way that was precise enough for interop?
Also, although array ops are not in the base case, they are in the initial set of extensions to the base case proposed in the use case document.
Glenn
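To make the partial-failure question concrete, here is one possible shape for a vector operation that reports each activity's outcome separately. It is a sketch only, not something the group has agreed: ItemResult, terminate_many and the terminate_one callback are invented names, and BES may define its faults quite differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class ItemResult:
    activity_id: str
    ok: bool
    fault: Optional[str] = None  # fault text when ok is False

def terminate_many(ids: List[str],
                   terminate_one: Callable[[str], None]) -> List[ItemResult]:
    """Apply terminate_one to every id and report each outcome separately."""
    results = []
    for activity_id in ids:
        try:
            terminate_one(activity_id)
            results.append(ItemResult(activity_id, True))
        except Exception as exc:  # one failure must not hide the others
            results.append(ItemResult(activity_id, False, str(exc)))
    return results
```

Under this shape a singleton request is simply the case where ids has length one, so the response format would not need to change between a base case and a vector extension.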

Hi;
I totally agree that they're important. However, in keeping with having as simple an HPC Profile base case as possible, the HPC Profile will restrict its base case to only allowing singleton vectors. Whether we want to define the BES base case as non-vector operations and then immediately define a vector extension or whether we want to define the base case as vector operations with the restriction being put in the HPC Profile specification is something still to be determined.
Marvin.

humphrey@cs.virginia.edu wrote:
OGSA HPC Profile Minutes – Fri Aug 18 2006 [...] AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.
As promised, I've reviewed the HPC Profile Application extension document and applied a few changes (though fewer than I originally thought). I've uploaded a change-tracked version to GridForge (in the JSDL documentation tree) which has my changes in it together with comments detailing a few issues that I think could do with addressing before it goes final. (Some comments just describe reasoning behind one or two of the changes I've made though.)
https://forge.gridforum.org/sf/docman/do/viewDocument/projects.jsdl-wg/docma...
The main thing of note I suppose is that I've beefed up the Security Considerations to state that it is important that the UserName element not be allowed to do end-runs around security. But that needs checking by someone else to ensure that I've not gone too far and required that implementations *have* a security context that could allow deduction of what username to use. That's just the sane thing to do... :-)
Cc:ing the JSDL WG so that they can get a chance to review the doc (plus my comments) too.
Donal.

I've had a look through the document; here are some comments:
1. This definition is based on that of the POSIXApplication with the main change being the removal of the filesystem attribute and most of the limits. Could you add a section summarizing these changes, or highlight them in each element definition? I think this is very important given that the element names used are the same (even if in a different namespace). It may help to reduce confusion.
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element followup?
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created. Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
5. Why was GroupName dropped?
6. Consider introducing a sub-section for "Security considerations" in each element definition. I think this makes more sense than mixing such information in the functional definition. It also makes more sense for this kind of spec than trying to list everything in a single Security considerations section at the back. (I'm also thinking of doing something like this in the errata release of 1.0).
(Not sure if I can make the HPCP call tonight, but I might turn up for the first few minutes.)
Andreas
--
Andreas Savva
Fujitsu Laboratories Ltd

Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element followup?
The difference is that xsd:normalizedString trims spaces and xsd:string does not. For arguments, it is sometimes necessary to have leading or trailing spaces. Occasionally, you even need args that are nothing but whitespace. What this spec is supposed to do is to indicate that whatever the implementation does, it must get the argument quoting right so that (near[*]) arbitrary arguments can be given.
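For reference, XML Schema Part 2 expresses the distinction through the whiteSpace facet: xsd:string uses "preserve", xsd:normalizedString uses "replace" (each tab, line feed and carriage return becomes a space), and xsd:token uses "collapse" (replace, then squeeze runs of spaces and strip both ends). The sketch below mimics those three facet values on plain Python strings rather than calling a schema processor; the helper names are illustrative only.

```python
import re

def ws_preserve(value: str) -> str:
    """xsd:string: the value is left untouched."""
    return value

def ws_replace(value: str) -> str:
    """xsd:normalizedString: tab, LF and CR each become one space."""
    return re.sub(r"[\t\n\r]", " ", value)

def ws_collapse(value: str) -> str:
    """xsd:token: replace, then collapse space runs and trim the ends."""
    return re.sub(r" +", " ", ws_replace(value)).strip(" ")

arg = "  a\tb  "
print(repr(ws_preserve(arg)))  # '  a\tb  '
print(repr(ws_replace(arg)))   # '  a b  '
print(repr(ws_collapse(arg)))  # 'a b'
```

Of the three, only the xsd:string behaviour hands an argument such as a lone tab through to the job unchanged.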
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created.
If the user doesn't give a working directory, it's up to the implementor to do some default way of handling this. Some systems will want to do this by running in the home directory, and others will want to create a new per-job directory. The only files that the user should count on being there are those that they have staged there themselves. (If the user does not specify any stagings, they must obviously not be relying on any relatively-named files at all. Their choice.) Whatever the implementor does, they should document what they do in that case. The HPC Profile may wish to restrict this. (I'd recommend that they do just that!)
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
5. Why was GroupName dropped?
Not portable to non-POSIX platforms I think.
Donal.
[* You'd still be limited by the XML 1.0 spec, and there's the whole separate business of understanding the character encodings actually used. Perhaps the spec should say something there, since it is likely that just copying the bytes from the input document won't work. ]

A few comments inline, just in case I can't make the call.
Donal K. Fellows wrote:
Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
Yes, I think it should be optional in the schema and made mandatory in the HPCP (for the 'final' submission to the container).
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element followup?
The difference is that xsd:normalizedString trims spaces and xsd:string does not. For arguments, it is sometimes necessary to have leading or trailing spaces. Occasionally, you even need args that are nothing but whitespace. What this spec is supposed to do is to indicate that whatever the implementation does, it must get the argument quoting right so that (near[*]) arbitrary arguments can be given.
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have is: if you change the argument type to a string, why do you then need multiple argument elements? Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created.
If the user doesn't give a working directory, it's up to the implementor to do some default way of handling this. Some systems will want to do this by running in the home directory, and others will want to create a new per-job directory. The only files that the user should count on being there are those that they have staged there themselves. (If the user does not specify any stagings, they must obviously not be relying on any relatively-named files at all. Their choice.)
Whatever the implementor does, they should document what they do in that case. The HPC Profile may wish to restrict this. (I'd recommend that they do just that!)
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative to home. I think a way to specify that without giving the full path name isn't too much to ask.
5. Why was GroupName dropped?
Not portable to non-POSIX platforms I think.
Donal. [* You'd still be limited by the XML 1.0 spec, and there's the whole separate business of understanding the character encodings actually used. Perhaps the spec should say something there, since it is likely that just copying the bytes from the input document won't work. ]
-- Andreas Savva Fujitsu Laboratories Ltd

Andreas Savva wrote:
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have if you change the argument type to a string is then why do you need multiple argument elements?
Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
I've tried looking up the word "collapsed" in relation to XML, and it seems to be fairly meaningless. Instead, these things are best defined in terms of XSD facets, and given the general requirements (no trimming, no internal replacement of whitespace sequences, no processing at all) xsd:string is the exact type we need. If the main JSDL spec says anything else, I think we can say that it's in error. :-)
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
   3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
   3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
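The three cases can be sketched as a small resolver. Everything named here is an assumption made for illustration: resolve_working_dir and the default_policy values "home" (pattern 3a) and "isolated" (pattern 3b) are invented, and creating the directory in case 2 reflects the proposal in this message only, which is disputed elsewhere in the thread.

```python
import os
import tempfile
from typing import Optional

def resolve_working_dir(requested: Optional[str],
                        default_policy: str = "home") -> str:
    """Return the directory a job should run in, creating it when needed."""
    if requested is not None:
        # Cases 1 and 2: exist_ok=True leaves an existing directory alone
        # and creates a missing one (including any missing parents).
        os.makedirs(requested, exist_ok=True)
        return requested
    if default_policy == "home":
        # Case 3a: an existing well-known directory.
        return os.path.expanduser("~")
    if default_policy == "isolated":
        # Case 3b: a fresh, randomly named per-job directory.
        return tempfile.mkdtemp(prefix="job-")
    raise ValueError(f"unknown default policy: {default_policy}")
```

An implementation that prefers to fault in case 2 would replace the os.makedirs call with an existence check; either way the choice is what the implementation would need to document.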
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
Donal.

Donal,
Donal K. Fellows wrote:
Andreas Savva wrote:
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have if you change the argument type to a string is then why do you need multiple argument elements?
Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
I've tried looking up the word "collapsed" in relation to XML, and it seems to be fairly meaningless. Instead, these things are best defined in terms of XSD facets, and given the general requirements (no trimming, no internal replacement of whitespace sequences, no processing at all) xsd:string is the exact type we need. If the main JSDL spec says anything else, I think we can say that it's in error. :-)
I don't think 'collapsed' was used in any XML-specific sense; rather it was probably used in the spec to mean that empty argument elements should not be discarded. Actually the spec should say clearly that an empty argument element maps to an empty string. It shows examples of this but it should say so in the definition. (One for the tracker.) Though I tend to agree with you that 'string' is more appropriate(*) than 'normalizedstring' I would still like to know why we chose normalized string in the first place. I'll put this in the tracker as well and I'd like to ask that this part of the definition of argument in the HPC application extension be kept pending until we have a discussion of this issue, probably at GGF18. Is this agreeable? (* if only because I can imagine cases where I'd want to pass a 'tab' character)
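A sketch of the mapping discussed here, with one argv entry per Argument element, an empty element becoming an empty string, and whitespace passed through untouched. The namespace URI and the App wrapper element are placeholders, not the ones used in the draft.

```python
import xml.etree.ElementTree as ET

NS = "urn:example:hpc-application"  # placeholder namespace, not the draft's

def arguments(doc: str) -> list:
    """Return one argv entry per Argument element, in document order."""
    root = ET.fromstring(doc)
    # An empty element has text None; map it to "" instead of dropping it.
    return [(el.text or "") for el in root.iter(f"{{{NS}}}Argument")]

doc = (
    f'<App xmlns="{NS}">'
    "<Argument>-n</Argument>"
    "<Argument></Argument>"
    "<Argument> two words\t</Argument>"
    "</App>"
)
print(arguments(doc))  # ['-n', '', ' two words\t']
```

The parser hands back element text as written, so the leading space and trailing tab in the third argument survive; any trimming would have to come from a schema-aware layer applying a whiteSpace facet.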
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
   3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
   3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
I don't agree that all existing batch systems create a non-existing directory for case 2, and I would not want to see a MUST or SHOULD for this in the document. A MAY might not be objectionable.
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
I might be willing to agree if the limitations are described adequately in the document.
--
Andreas Savva
Fujitsu Laboratories Ltd

On 01/9/06 06:08, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
   3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
   3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
I actually believe that in the case of '2' that the implementation should raise a fault and not create the directory. First off, I don't believe that any of the existing low-level batch systems will do this creation for you, and supporting these batch systems is a primary goal of the HPC Profile. Second, I'm not a big supporter of "implied" behaviour. That is, there is a side-effect that a directory is created when this job is run. I'd prefer that a user needs to make sure the directory exists if they are specifying a working directory (using working directory seems to imply that the user knows a fair bit about the execution environment). As for 3, I agree. It's up to the implementation to document its behaviour.
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
Again ... I think it's up to the implementation to define its behaviour if the working directory isn't specified. At this point, for the HPC Profile, I'd like to leave it at that. -- Chris
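
To make the three cases concrete, here is a minimal Python sketch of how an implementation might resolve a job's working directory. It is not taken from any HPC Profile or BES text: the function name, the two policy enums, and the fault class are all hypothetical, and the two values for case 2 simply model the two positions taken in this thread (create the directory, or raise a fault).

```python
import os
import tempfile
from enum import Enum


class MissingDirPolicy(Enum):
    """Case 2: what to do when the requested directory does not exist."""
    CREATE = "create"  # the directory is created on the user's behalf
    FAULT = "fault"    # the submission is rejected instead


class DefaultDirPolicy(Enum):
    """Case 3: what to do when no working directory is specified."""
    WELL_KNOWN = "well-known"  # 3a: use a documented directory such as $HOME
    ISOLATED = "isolated"      # 3b: create a fresh, randomly named directory


class WorkingDirectoryFault(Exception):
    """Hypothetical fault raised when a requested directory is missing."""


def resolve_working_directory(requested, missing, default, well_known=None):
    """Return the directory a job should run in (illustrative only)."""
    if requested is None:
        # Case 3: behaviour is implementation-defined but must be documented.
        if default is DefaultDirPolicy.WELL_KNOWN:
            return well_known or os.path.expanduser("~")
        return tempfile.mkdtemp(prefix="job-")
    if os.path.isdir(requested):
        # Case 1: an existing directory is used as-is and never re-created.
        return requested
    # Case 2: the two options debated in this thread.
    if missing is MissingDirPolicy.FAULT:
        raise WorkingDirectoryFault(f"no such directory: {requested}")
    os.makedirs(requested)
    return requested
```

Whichever policy a service picks for cases 2 and 3, the point made in the thread still applies: the choice has to be documented (or published by the service) so that clients can discover it before submitting an activity.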

Christopher Smith wrote:
I actually believe that in the case of '2' that the implementation should raise a fault and not create the directory. First off, I don't believe that any of the existing low-level batch systems will do this creation for you, and supporting these batch systems is a primary goal of the HPC Profile. Second, I'm not a big supporter of "implied" behaviour. That is, there is a side-effect that a directory is created when this job is run. I'd prefer that a user needs to make sure the directory exists if they are specifying a working directory (using working directory seems to imply that the user knows a fair bit about the execution environment).
As for 3, I agree. It's up to the implementation to document its behaviour.
From my perspective, everything's OK as long as what happens is all clearly documented. :-) Maybe we could even state that this is part of the profiled set of information a BES publishes about itself?

In any case, the real interop case is #1 (specified existing dir). I raised the issue in #2 (specified non-existing dir) because requiring a fault precludes creation, which is behaviour useful for higher-level job managers, as it means that they don't need to take special action to create a shared directory for a group of related activities. Otherwise you end up having to do more workflow-dependency stuff in order to make the directory that the real tasks use.

The standards perspective is: each case is distinct and should either have mandated behaviour, or a requirement on implementations to state what they do so that clients can discover it prior to activity submission. With that, we can state that any client relying on a particular way of resolving this sort of issue (without first checking for it) is not itself strictly compliant with the profile. :-)

Donal.

Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative to home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime. Peter

Peter G. Lane wrote:
Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative to home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime.
This is true for the POSIXApplication---the filesystemName attribute allows you to do that for WorkingDirectory and Environment elements, for example. The HPC profile application draft removes these attributes for simplicity. I was curious if there would be some other defined mechanism to reference 'home' or not. I'll wait to see the minutes of the call. -- Andreas Savva Fujitsu Laboratories Ltd
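
As a rough illustration of the mechanism Peter describes, the Python sketch below resolves a path against a named file system whose mount point is only known to the service at runtime. The function name, the mount table, and the "HOME" entry are assumptions made for this example; they are not drawn from the JSDL or HPC Profile documents, and the HPC profile application draft discussed here drops the attribute altogether.

```python
import posixpath


def resolve_path(path, filesystem_name, mount_points):
    """Resolve a path against a named file system (illustrative only).

    mount_points is a hypothetical per-service table mapping file system
    names to the mount points the service determined at runtime.
    """
    if filesystem_name is None:
        # No file system named: the path is taken exactly as given.
        return path
    if filesystem_name not in mount_points:
        raise LookupError(f"unknown file system: {filesystem_name!r}")
    mount = mount_points[filesystem_name]
    relative = path.lstrip("/")
    # An empty path means the mount point itself.
    return posixpath.join(mount, relative) if relative else mount


# Example: a job asks for "runs/job1" under a file system named "HOME".
mounts = {"HOME": "/home/alice"}
working_dir = resolve_path("runs/job1", "HOME", mounts)
```

With a table like this, the same job description can name a subdirectory of 'home' without hard-coding where home lives on a given machine, which is the use case Andreas raises.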

On 01/9/06 08:40, "Peter G. Lane" <lane@mcs.anl.gov> wrote:
Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative to home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime.
This is true for the POSIXApplication extension, but not for the HPCProfileApplication extension, as the HPC Basic Profile doesn't allow the use of the FileSystem stuff (the "macro" aspect seems cool, but other aspects are hard to implement e.g. space available for the job). -- Chris

On 01/9/06 01:18, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
I'd prefer to keep it required for this extension. It's intended to be very simple, and to give little room for error. The intention is that more advanced use cases will need a different application extension for JSDL. A system could support both POSIXApplication and HPCProfileApplication, for instance. -- Chris

Hi all,
Sorry for my delay ... I'll be applying Donal's changes in the HPC Profile Application extension document today. I'll also try to address any further comments that came up in later email threads.
-- Chris

On 21/8/06 06:14, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
humphrey@cs.virginia.edu wrote:
OGSA HPC Profile Minutes Fri Aug 18 2006 [...] AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.
As promised, I've reviewed the HPC Profile Application extension document and applied a few changes (though fewer than I originally thought). I've uploaded a change-tracked version to GridForge (in the JSDL documentation tree) which has my changes in it together with comments detailing a few issues that I think could do with addressing before it goes final. (Some comments just describe reasoning behind one or two of the changes I've made though.)
https://forge.gridforum.org/sf/docman/do/viewDocument/projects.jsdl-wg/docma.... root.working_drafts/doc13749
The main thing of note I suppose is that I've beefed up the Security Considerations to state that it is important that the UserName element not be allowed to do end-runs around security. But that needs checking by someone else to ensure that I've not gone too far and required that implementations *have* a security context that could allow deduction of what username to use. That's just the sane thing to do... :-)
Cc:ing the JSDL WG so that they can get a chance to review the doc (plus my comments) too.
Donal.