Re: [ogsa-hpcp-wg] Minutes for OGSA HPC Profile telecon (Aug 18 2006)

humphrey@cs.virginia.edu wrote:
OGSA HPC Profile Minutes – Fri Aug 18 2006 [...] AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.
As promised, I've reviewed the HPC Profile Application extension document and applied a few changes (though fewer than I originally thought). I've uploaded a change-tracked version to GridForge (in the JSDL documentation tree) which has my changes in it together with comments detailing a few issues that I think could do with addressing before it goes final. (Some comments just describe reasoning behind one or two of the changes I've made though.)
https://forge.gridforum.org/sf/docman/do/viewDocument/projects.jsdl-wg/docma...
The main thing of note I suppose is that I've beefed up the Security Considerations to state that it is important that the UserName element not be allowed to do end-runs around security. But that needs checking by someone else to ensure that I've not gone too far and required that implementations *have* a security context that could allow deduction of what username to use. That's just the sane thing to do... :-)
Cc:ing the JSDL WG so that they can get a chance to review the doc (plus my comments) too.
Donal.

I've had a look through the document; here are some comments:
1. This definition is based on that of the POSIXApplication with the main change being the removal of the filesystem attribute and most of the limits. Could you add a section summarizing these changes, or highlight them in each element definition? I think this is very important given that the element names used are the same (even if in a different namespace). It may help to reduce confusion.
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element follow up?
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created. Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
5. Why was GroupName dropped?
6. Consider introducing a sub-section for "Security considerations" in each element definition. I think this makes more sense than mixing such information in the functional definition. It also makes more sense for this kind of spec than trying to list everything in a single Security considerations section at the back. (I'm also thinking of doing something like this in the errata release of 1.0).
(Not sure if I can make the HPCP call tonight, but I might turn up for the first few minutes.)
Andreas
Donal K. Fellows wrote:
humphrey@cs.virginia.edu wrote:
OGSA HPC Profile Minutes – Fri Aug 18 2006 [...] AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.
As promised, I've reviewed the HPC Profile Application extension document and applied a few changes (though fewer than I originally thought). I've uploaded a change-tracked version to GridForge (in the JSDL documentation tree) which has my changes in it together with comments detailing a few issues that I think could do with addressing before it goes final. (Some comments just describe reasoning behind one or two of the changes I've made though.)
https://forge.gridforum.org/sf/docman/do/viewDocument/projects.jsdl-wg/docma...
The main thing of note I suppose is that I've beefed up the Security Considerations to state that it is important that the UserName element not be allowed to do end-runs around security. But that needs checking by someone else to ensure that I've not gone too far and required that implementations *have* a security context that could allow deduction of what username to use. That's just the sane thing to do... :-)
Cc:ing the JSDL WG so that they can get a chance to review the doc (plus my comments) too.
Donal.
-- Andreas Savva Fujitsu Laboratories Ltd

Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element follow up?
The difference is that xsd:normalizedString trims spaces and xsd:string does not. For arguments, it is sometimes necessary to have leading or trailing spaces. Occasionally, you even need args that are nothing but whitespace. What this spec is supposed to do is to indicate that whatever the implementation does, it must get the argument quoting right so that (near[*]) arbitrary arguments can be given.
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created.
If the user doesn't give a working directory, it's up to the implementor to do some default way of handling this. Some systems will want to do this by running in the home directory, and others will want to create a new per-job directory. The only files that the user should count on being there are those that they have staged there themselves. (If the user does not specify any stagings, they must obviously not be relying on any relatively-named files at all. Their choice.) Whatever the implementor does, they should document what they do in that case. The HPC Profile may wish to restrict this. (I'd recommend that they do just that!)
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
5. Why was GroupName dropped?
Not portable to non-POSIX platforms I think.
Donal.
[* You'd still be limited by the XML 1.0 spec, and there's the whole separate business of understanding the character encodings actually used. Perhaps the spec should say something there, since it is likely that just copying the bytes from the input document won't work. ]
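
As background to the Argument type question: XML Schema Part 2 defines three whiteSpace behaviours, namely preserve (xsd:string), replace (xsd:normalizedString) and collapse (xsd:token). Read that way, normalizedString turns tabs and line breaks into spaces, while it is token that trims leading and trailing spaces; xsd:string is the only one of the three that leaves an argument untouched. A minimal Python sketch of the three rules follows. The function names are hypothetical and are not taken from any JSDL implementation.

```python
import re


def ws_preserve(value: str) -> str:
    """whiteSpace=preserve (xsd:string): the value is left untouched."""
    return value


def ws_replace(value: str) -> str:
    """whiteSpace=replace (xsd:normalizedString): tab, LF and CR each become a space."""
    return re.sub(r"[\t\n\r]", " ", value)


def ws_collapse(value: str) -> str:
    """whiteSpace=collapse (xsd:token): apply replace, squeeze runs of spaces, trim both ends."""
    return re.sub(r" +", " ", ws_replace(value)).strip(" ")


# An argument with leading spaces, an embedded tab and trailing spaces.
raw = "  a\tb  "
for rule in (ws_preserve, ws_replace, ws_collapse):
    print(rule.__name__, repr(rule(raw)))
```

Only the preserve rule hands the program exactly what the user wrote, which is the property argued for above.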

A few comments inline, just in case I can't make the call.
Donal K. Fellows wrote:
Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
Yes, I think it should be optional in the schema and made mandatory in the HPCP (for the 'final' submission to the container).
3. Donal changed the Argument type from normalizedstring to string. If I recall correctly there was a very long and heated discussion around this in the past (but, happily perhaps, I can't recall details). Could someone who was involved in the definition of this element follow up?
The difference is that xsd:normalizedString trims spaces and xsd:string does not. For arguments, it is sometimes necessary to have leading or trailing spaces. Occasionally, you even need args that are nothing but whitespace. What this spec is supposed to do is to indicate that whatever the implementation does, it must get the argument quoting right so that (near[*]) arbitrary arguments can be given.
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have if you change the argument type to a string is then why do you need multiple argument elements? Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
4. Working directory - In contrast to Donal's comment I think if this is not present it should not be created.
If the user doesn't give a working directory, it's up to the implementor to do some default way of handling this. Some systems will want to do this by running in the home directory, and others will want to create a new per-job directory. The only files that the user should count on being there are those that they have staged there themselves. (If the user does not specify any stagings, they must obviously not be relying on any relatively-named files at all. Their choice.)
Whatever the implementor does, they should document what they do in that case. The HPC Profile may wish to restrict this. (I'd recommend that they do just that!)
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine.
This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
5. Why was GroupName dropped?
Not portable to non-POSIX platforms I think.
Donal. [* You'd still be limited by the XML 1.0 spec, and there's the whole separate business of understanding the character encodings actually used. Perhaps the spec should say something there, since it is likely that just copying the bytes from the input document won't work. ]
-- Andreas Savva Fujitsu Laboratories Ltd

On Sep 01, Andreas Savva modulated:
... The question I have if you change the argument type to a string is then why do you need multiple argument elements?
The arguments to a POSIX command are a vector of strings. The strings may have any character in them, be empty strings, etc. You cannot trivially talk about encoding this vector in a string unless you assume a bunch of "quoting" conventions.
Don't mix up a shell programming language with the underlying command execution environment. The shell parses the string command-line into an array of strings according to its shell language rules, but the underlying commands receive a vector.
The JSDL implementation should map this vector into whatever form it requires, e.g. a properly escaped Bourne shell command if it is cooking up some "run script" for a scheduler. Things go horribly wrong if you start assuming a particular back-end parser or not escaping/quoting everything correctly. You must avoid having users work around implementation bugs by embedding implementation+platform specific quoting syntax inside their argument data!
That said, Donal's point about string encoding is important to remember. To do it right, the implementation needs to translate the XML info set string (Unicode I think, right?) to the locale in which the program will run. This might not be possible, unless we can assume that the implementation can select an appropriate locale for the characters being used.
karl
-- Karl Czajkowski karlcz@univa.com
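
The vector-versus-command-line distinction can be illustrated with a short Python sketch. It assumes a list that stands in for the contents of successive Argument elements (the sample values are made up), passes that list to a child process as a vector, and separately renders it as a quoted Bourne-shell command line of the kind a "run script" generator would need. It is an illustration only, not how any particular JSDL implementation works.

```python
import json
import shlex
import subprocess
import sys

# Hypothetical values of four Argument elements: an empty string, a string
# with leading and trailing spaces, shell metacharacters, and a lone tab.
args = ["", "  leading and trailing  ", "it's $HOME", "\t"]

# Child program that reports its argument vector (minus the program name) as JSON.
child = "import sys, json; print(json.dumps(sys.argv[1:]))"

# Vector form: no shell is involved, so no quoting is needed.
result = subprocess.run(
    [sys.executable, "-c", child, *args],
    capture_output=True,
    text=True,
    check=True,
)
received = json.loads(result.stdout)

# Script form: the implementation, not the user, adds the shell quoting.
command_line = shlex.join(args)

# Parsing the quoted line with POSIX shell rules recovers the original vector.
reparsed = shlex.split(command_line)

print(received == args, reparsed == args)
```

The point of the design is that quoting is applied in exactly one place, by the code that knows which back-end parser will read the result.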

Karl Czajkowski wrote:
On Sep 01, Andreas Savva modulated:
... The question I have if you change the argument type to a string is then why do you need multiple argument elements?
The arguments to a POSIX command are a vector of strings. The strings may have any character in them, be empty strings, etc. You cannot trivially talk about encoding this vector in a string unless you assume a bunch of "quoting" conventions. [...]
I agree completely with this point. The arguments are to contain the strings as seen in the argv argument to main(), excluding whatever well known leading arguments that are there first (i.e. the name of the executable).
That said, Donal's point about string encoding is important to remember. To do it right, the implementation needs to translate the XML info set string (Unicode I think, right?) to the locale in which the program will run. This might not be possible, unless we can assume that the implementation can select an appropriate locale for the characters being used.
This is probably going to end up as a requirement on the BES spec. The XML infoset is defined in terms of UNICODE abstract characters, so when mapping to running a real executable there needs to be some encoding employed. Most encodings cannot render all UNICODE characters. Hence, if there are characters which cannot be encoded there probably SHOULD be a fault thrown. (I hesitate to say MUST.)
I imagine that eventually all platforms will use UTF-8 as their native encoding - there are excellent reasons to do this shift - and that from then on, the encoding nasties will go away, but it's going to be a long time coming.
Donal.
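
The "fault if it cannot be encoded" behaviour can be sketched in a few lines of Python. The exception class and function name below are hypothetical, and the target encoding is passed in explicitly rather than read from the locale, so the sketch says nothing about how a BES container would actually choose it.

```python
class ArgumentEncodingFault(Exception):
    """Hypothetical fault raised when an argument cannot be represented."""


def encode_argument(arg: str, encoding: str) -> bytes:
    """Encode one argument for the target platform, refusing lossy conversions."""
    try:
        # errors="strict" raises instead of silently substituting characters.
        return arg.encode(encoding, errors="strict")
    except UnicodeEncodeError as exc:
        raise ArgumentEncodingFault(
            f"argument {arg!r} cannot be encoded as {encoding}"
        ) from exc


# UTF-8 can represent any Unicode string that XML 1.0 allows.
print(encode_argument("na\u00efve", "utf-8"))

# A narrower encoding rejects what it cannot render.
try:
    encode_argument("na\u00efve", "ascii")
except ArgumentEncodingFault as fault:
    print("fault:", fault)
```

Raising rather than substituting a placeholder keeps the job from running with arguments the user never wrote.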

Andreas Savva wrote:
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have if you change the argument type to a string is then why do you need multiple argument elements?
Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
I've tried looking up the word "collapsed" in relation to XML, and it seems to be fairly meaningless. Instead, these things are best defined in terms of XSD facets, and given the general requirements (no trimming, no internal replacement of whitespace sequences, no processing at all) xsd:string is the exact type we need. If the main JSDL spec says anything else, I think we can say that it's in error. :-)
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
Donal.

Donal,
Donal K. Fellows wrote:
Andreas Savva wrote:
I still can't remember the discussions on this but I believe this was the reason we allowed empty argument elements and said they must not be collapsed. The question I have if you change the argument type to a string is then why do you need multiple argument elements?
Also I don't have a big problem with restricting definitions (e.g., by removing the filesystem attribute in this application extension) but I do have a problem with changing types for identically named elements.
I've tried looking up the word "collapsed" in relation to XML, and it seems to be fairly meaningless. Instead, these things are best defined in terms of XSD facets, and given the general requirements (no trimming, no internal replacement of whitespace sequences, no processing at all) xsd:string is the exact type we need. If the main JSDL spec says anything else, I think we can say that it's in error. :-)
I don't think 'collapsed' was used in any XML-specific sense; rather it was probably used in the spec to mean that empty argument elements should not be discarded. Actually the spec should say clearly that an empty argument element maps to an empty string. It shows examples of this but it should say so in the definition. (One for the tracker.)
Though I tend to agree with you that 'string' is more appropriate(*) than 'normalizedstring' I would still like to know why we chose normalized string in the first place. I'll put this in the tracker as well and I'd like to ask that this part of the definition of argument in the HPC application extension be kept pending until we have a discussion of this issue, probably at GGF18. Is this agreeable?
(* if only because I can imagine cases where I'd want to pass a 'tab' character)
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
I don't agree that all existing batch systems create a non-existing directory for case 2, and I would not want to see a MUST or SHOULD for this in the document. A MAY might not be objectionable.
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
I might be willing to agree if the limitations are described adequately in the document.
-- Andreas Savva Fujitsu Laboratories Ltd

On 01/9/06 06:08, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
Donal, are you saying that if the directory is not present it should be up to the implementation to decide what to do? I read your comment as saying that the specification should state that the directory should be created.
There are 3 cases:
1: User specifies existing directory - must not be created!
2: User specifies non-existing directory - must be created!
3: User does not specify directory. This case is up to the implementor but will probably follow one of the two patterns:
3a: Use existing "well-known" directory, e.g. $HOME. Implementation must document what directory will be chosen.
3b: Create new directory with "random" name so that jobs are isolated. Implementation must state that this will happen.
I do not believe that the behaviour of all existing batch systems is the same in case 3 FWIW; lower-level ones will tend to 3a (as it assumes less) and higher-level ones will tend to 3b (as it isolates better).
I actually believe that in the case of '2' the implementation should raise a fault and not create the directory. First off, I don't believe that any of the existing low-level batch systems will do this creation for you, and supporting these batch systems is a primary goal of the HPC Profile. Second, I'm not a big supporter of "implied" behaviour. That is, there is a side-effect that a directory is created when this job is run. I'd prefer that a user needs to make sure the directory exists if they are specifying a working directory (using working directory seems to imply that the user knows a fair bit about the execution environment).
As for 3, I agree. It's up to the implementation to document its behaviour.
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
Not sure I agree. The HPC Profile stuff is for a very restricted case.
Again ... I think it's up to the implementation to define its behaviour if the working directory isn't specified. At this point, for the HPC Profile, I'd like to leave it at that.
-- Chris

Christopher Smith wrote:
I actually believe that in the case of '2' the implementation should raise a fault and not create the directory. First off, I don't believe that any of the existing low-level batch systems will do this creation for you, and supporting these batch systems is a primary goal of the HPC Profile. Second, I'm not a big supporter of "implied" behaviour. That is, there is a side-effect that a directory is created when this job is run. I'd prefer that a user needs to make sure the directory exists if they are specifying a working directory (using working directory seems to imply that the user knows a fair bit about the execution environment).
As for 3, I agree. It's up to the implementation to document its behaviour.
According to my perspective, everything's OK as long as what happens is all clearly documented. :-) Maybe we could even state that this is part of the profiled set of information a BES publishes about itself?
In any case, the real interop case is #1 (specified existing dir). I raised the issue in #2 (specified non-existing dir) because requiring a fault precludes creation, which is behaviour useful for higher-level job managers, as it means that they don't need to take special action to create a shared directory for a group of related activities. Otherwise you end up having to do more workflow-dependency stuff in order to make the directory that the real tasks use.
The standards perspective is: each case is distinct and should either have mandated behaviour, or a requirement on implementations to state what they do so that clients can discover prior to activity submission. With that, we can state that any client relying on a particular way of resolving this sort of issue (without first checking for it) is not itself strictly compliant with the profile. :-)
Donal.
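
The working-directory cases discussed in this thread can be written down as a small decision function. The Python sketch below is only an illustration: the function name and both policy flags are hypothetical, and they exist precisely because the thread does not settle whether case 2 creates the directory or faults, nor whether case 3 uses a well-known directory or a fresh one. FileNotFoundError stands in for whatever fault an implementation would really raise.

```python
import os
import tempfile
from typing import Optional


def resolve_working_directory(
    requested: Optional[str],
    create_missing: bool,
    isolate_unspecified: bool,
) -> str:
    """Return the directory a job should run in, under a documented policy."""
    if requested is not None:
        if os.path.isdir(requested):
            # Case 1: the directory exists, so use it and create nothing.
            return requested
        if create_missing:
            # Case 2, creation policy: make the directory on the user's behalf.
            os.makedirs(requested)
            return requested
        # Case 2, fault policy: refuse rather than create as a side effect.
        raise FileNotFoundError(f"working directory does not exist: {requested}")
    if isolate_unspecified:
        # Case 3b: a fresh, uniquely named directory isolates the job.
        return tempfile.mkdtemp(prefix="job-")
    # Case 3a: fall back to a well-known directory, here the user's home.
    return os.path.expanduser("~")
```

Whichever flags an implementation fixes, the requirement argued for above is that the choice is published so that clients can discover it before submitting an activity.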

Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime.
Peter

Peter G. Lane wrote:
Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime.
This is true for the POSIXApplication---the filesystemName attribute allows you to do that for WorkingDirectory and Environment elements, for example. The HPC profile application draft removes these attributes for simplicity. I was curious if there would be some other defined mechanism to reference 'home' or not. I'll wait to see the minutes of the call.
-- Andreas Savva Fujitsu Laboratories Ltd
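
The mount-point indirection described for POSIXApplication amounts to joining a path onto a directory that the service looks up by name at run time. The Python sketch below assumes a plain dictionary from file system names to mount points; the function name, the "HOME" key and the example paths are all hypothetical, and the lookup is not taken from the JSDL specification text.

```python
import posixpath
from typing import Mapping, Optional


def resolve_path(
    path: str,
    filesystem_name: Optional[str],
    mount_points: Mapping[str, str],
) -> str:
    """Resolve a path against a named file system's mount point, if one is given."""
    if filesystem_name is None:
        # No indirection requested: the path is used exactly as written.
        return path
    if filesystem_name not in mount_points:
        raise KeyError(f"unknown file system name: {filesystem_name}")
    # The service, not the user, supplies the machine-specific prefix.
    return posixpath.join(mount_points[filesystem_name], path)


# The same job description resolves differently on two hypothetical machines.
print(resolve_path("run42", "HOME", {"HOME": "/home/alice"}))
print(resolve_path("run42", "HOME", {"HOME": "/users/a/alice"}))
```

This is the "macro" aspect only; it leaves out the other properties a file system description can carry, such as available space.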

On 01/9/06 08:40, "Peter G. Lane" <lane@mcs.anl.gov> wrote:
Andreas Savva wrote:
Btw, I am really curious here, how is a user going to specify that they want a job to run in their home directory, wherever that place may be? Surely the exact location may differ from machine to machine. This spec (and more particularly the HPC Profile) isn't about defining portable jobs anyway. After all, requiring an executable pathname isn't even close to portable, and nor is there any mechanism for dealing with differences in path separator between platforms. (At least nobody is trying to suggest that MacOS 9 should be used as a HPC platform; the directory separator scheme there was *much* different...)
Yes, but it is still a common use case to run a job with 'home' being the working directory, or in a subdirectory relative of home. I think a way to specify that without giving the full path name isn't too much to ask.
FWIW, I asked about using variables in path names at one point, and the answer seemed to be that one can specify file systems with mount points that are variables. So then all you need to do is reference this in your WorkingDirectory element and the service will resolve the mount point at runtime.
This is true for the POSIXApplication extension, but not for the HPCProfileApplication extension, as the HPC Basic Profile doesn't allow the use of the FileSystem stuff (the "macro" aspect seems cool, but other aspects are hard to implement e.g. space available for the job).
-- Chris

On 01/9/06 01:18, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
Andreas Savva wrote:
I've had a look through the document; here are some comments:
I'm answering these as I understand them. Points I don't answer are ones where I'd just be agreeing with Andreas. :-)
2. Executable is defined as a mandatory element in the same way as in the JSDL PosixApplication. Given the discussions we've had in the context of the EMS Scenarios this should probably be optional (0-1) since the location may not be known until deployment is finished. (This isn't an issue for the HPC profile.)
Perhaps the right thing to do would be to leave it optional in this spec and then profile it to be mandatory in the HPC Profile. The advanced cases we were considering today lie outside the scope of the HPCP, but this spec is still quite possibly useful for them.
I'd prefer to keep it required for this extension. It's intended to be very simple, and to give little room for error. The intention is that more advanced use cases will need a different application extension for JSDL. A system could support both POSIXApplication and HPCProfileApplication, for instance.
-- Chris

Hi all,
Sorry for my delay ... I'll be applying Donal's changes in the HPC Profile Application extension document today. I'll also try to address any further comments that came up in later email threads.
-- Chris
On 21/8/06 06:14, "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk> wrote:
humphrey@cs.virginia.edu wrote:
OGSA HPC Profile Minutes – Fri Aug 18 2006 [...] AI-HPCP-0818a: Donal Fellows to review HPC Profile Application. Specifically to ensure that normative text use is correct. Also, adding normative text. Hopefully done before the next JSDL call on Wed.
As promised, I've reviewed the HPC Profile Application extension document and applied a few changes (though fewer than I originally thought). I've uploaded a change-tracked version to GridForge (in the JSDL documentation tree) which has my changes in it together with comments detailing a few issues that I think could do with addressing before it goes final. (Some comments just describe reasoning behind one or two of the changes I've made though.)
https://forge.gridforum.org/sf/docman/do/viewDocument/projects.jsdl-wg/docma.... root.working_drafts/doc13749
The main thing of note I suppose is that I've beefed up the Security Considerations to state that it is important that the UserName element not be allowed to do end-runs around security. But that needs checking by someone else to ensure that I've not gone too far and required that implementations *have* a security context that could allow deduction of what username to use. That's just the sane thing to do... :-)
Cc:ing the JSDL WG so that they can get a chance to review the doc (plus my comments) too.
Donal.
participants (5)
- Andreas Savva
- Christopher Smith
- Donal K. Fellows
- Karl Czajkowski
- Peter G. Lane