Hi;

Coming from the point-of-view of the HPC Profile working group, I have several questions about BES (including recent discussions on the mailing list), as well as some straw man thoughts about how BES should relate to the HPC profile spec.

Based on the BES-1.3 spec that Andrew Grimshaw recently sent out, at an abstract level, there seem to be the following aspects to BES:

· A core set of operations around activities:

· CreateActivityFromJSDL

· GetActivityStatus

· RequestActivityStateChange

· GetActivityJSDLDocuments

· A set of BES factory-specific system management operations and resource properties (RPs):

· StartAcceptingNewActivities

· StopAcceptingNewActivities

· IsAcceptingNewActivities RP

· Support for notifications.

· Support for various resource properties (or their equivalent in a non-WSRF version) having to do with an information model for describing various things about a BES factory, the associated container it represents, and any activities it is currently running.

· An extensible activity state model.

Things explicitly NOT in the BES specification are:

· Generic system management interface.

· Security design.

· Interface for directly controlling/manipulating an activity once it has been created.

Things that used to be in the BES spec but now seem to be extensions (please correct me if I’m wrong here!):

· Data staging

· Suspension

I have the following questions about BES and the various discussions that have recently occurred (including the ESI integration):

· Extensibility:

· Given that BES has bought into the notion of an extensible activity state diagram, it needs to also normatively define how clients can learn of the extensions that a given BES service supports. Is that something that will be added to the BES specification? Or will the specification point to some other place where notions of extensibility are defined more generically? (Personally, I’d vote for the former approach.)

· Is the “base case” for BES now fig.2, which shows states of {new, pending, running, canceled, failed, finished}?

· Previously included states, such as Execution-Pending, will presumably be defined in suitable extension profiles?

· Assuming that data staging and suspension are now extensions to the base BES spec, will they be defined as such in an appendix of the spec, or as a separate extension profile?

· The original BES spec describes a fairly sophisticated data staging design that supports parallelism. Is there any interest in defining a second, simpler data staging extension that avoids the complexity of the parallelism support?

· Will the suspension extension be the simple one that is currently presented in sec. 4 as an example? Or do people feel that a more complicated version, such as the ESI one, is necessary/important? Can/should we define both?

· Given that suspension is no longer in the base design, presumably the createInSuspendedState parameter to CreateActivityFromJSDL should disappear?

· RequestActivityStateChange: I believe this operation will pose challenges in an extensible design. The current design is imperative by nature: it specifies an explicit state to move an activity to. However, a client who does not know of all the extensions that a BES service implements may not know how to pick the appropriate state to transition to. It seems better to introduce a more declarative approach in which clients specify “actions” they wish to occur, such as ‘CancelActivity’. This approach would allow the BES service to make the appropriate state transition in response to a desired action requested by a client.
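
To make the imperative-versus-declarative distinction above concrete, here is a rough Python sketch. It is not taken from the BES spec: the `State` enum, the `SUSPENDED` extension state, and the `cancel_activity` function are all hypothetical names chosen for illustration. The point is only that the client names an action and the service chooses the transition, so the client never needs to know about extension states.

```python
from enum import Enum


class State(Enum):
    # Base states, as listed in fig. 2 of the spec.
    NEW = "new"
    PENDING = "pending"
    RUNNING = "running"
    CANCELED = "canceled"
    FAILED = "failed"
    FINISHED = "finished"
    # Hypothetical extension state that a base-only client would not know about.
    SUSPENDED = "suspended"


TERMINAL = {State.CANCELED, State.FAILED, State.FINISHED}


class Activity:
    def __init__(self, state=State.NEW):
        self.state = state


def cancel_activity(activity):
    """Action-style request: the service, not the client, picks the transition."""
    if activity.state in TERMINAL:
        # Already finished one way or another; treated as a no-op in this sketch.
        return activity.state
    # Works from any non-terminal state, including extension states.
    activity.state = State.CANCELED
    return activity.state
```

With the imperative form, a client facing a suspended activity would have to know that "suspended" exists and which target states are legal from it; with the action form it simply asks for cancellation.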

· Information model:

· JSDL seems to inherently be focused on describing a single job or a single computational resource. For example, it has no notion of describing all the differing compute nodes of a (heterogeneous) compute cluster. By incorporating JSDL elements into the BES information model, it seems that BES is foreclosing the ability to describe things like compute clusters. This issue also affects what can get returned from GetActivityJSDLDocuments. If I’m wrong about this, then it seems like it would be worth having an explicit explanation about how to achieve this functionality somewhere in the specification.

· The BES information model now includes various posix-specific elements of JSDL. How would other systems – such as a Windows system – be described?

· The spec requires that all BES services “support” all the various attributes listed in sec. 5, but they don’t have to implement them. What exactly does that mean? For example, if a JSDL doc specifies a CPU-Speed requirement and a particular BES service doesn’t implement it (meaning it doesn’t keep track of it), then does the associated CreateActivityFromJSDL request have to fail? If so, then do clients have to figure out what the minimal set of implemented attributes is in a system and then only use those in job descriptions? Is there a notion of “optional” attributes that can be ignored, that specify desired attribute values rather than required ones?

· Is there any notion of specifying that all compute nodes should have the same value for some attribute (e.g. CPU architecture, CPU speed, NIC)? This seems to be missing from the JSDL specification, but seems very important for BES if it is to support things like compute clusters.

· Some of the elements seem either incompletely specified, have definitions that are open to multiple interpretations, or have definitions that would be very difficult to implement in practice. In particular:

· CPU architecture seems like it can’t describe all the variations, let alone all the peripherals such as GPUs, that a computing resource (or a whole cluster) might have.

· CPU speed seems like the tip of an iceberg having to do with characterizing the performance of a system, which will depend on all manner of things like details of the processor chip used, cache sizes, bus used, etc.

· Network bandwidth: is this the theoretical maximum of the NIC on a compute node or is it the current bandwidth actually available in a (shared) system? Note that the latter is difficult to measure in a practically useful way. Note also that network bandwidth only describes one aspect of communications performance and that several others are arguably equally important (e.g. latency).

All this leads to the question of whether BES will have a notion of extending the information model that is supplied. If so, then that leads to the question of what the base case should be and whether it should include a smaller set of things than is currently listed in the spec.

Are there any plans to tighten the definitions of some of the more vague information elements? (I guess this really is an issue more for the JSDL WG than for BES.)

· GetActivityJSDLDocuments returns a JSDL document for each specified activity. Is this sufficient to capture the entire “provenance” for what has happened to the activity? In particular, would it be sufficient to allow someone to (a) run the same activity on another BES service (assuming same hardware and software) and get the same results and (b) debug what has happened to an errant activity? I would argue that both capabilities have proven to be important in actual systems.
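
One possible reading of the “supported but not implemented” question above can be sketched as follows. This is only an assumption about what the semantics could be, not what the spec says: the `check_request` function, the split into `required` and `desired` constraints, and the attribute names `CPUCount` and `CPUSpeed` are all hypothetical.

```python
def check_request(implemented, required, desired):
    """Return (accepted, ignored) for a hypothetical CreateActivity check.

    implemented: dict of attribute name -> value the service actually tracks
    required:    dict of attribute name -> minimum value the job must have
    desired:     dict of attribute name -> preferred minimum value
    """
    for name, minimum in required.items():
        # A required constraint on an untracked or insufficient attribute fails.
        if name not in implemented or implemented[name] < minimum:
            return False, []
    # Desired constraints on untracked attributes are reported, not fatal.
    ignored = sorted(name for name in desired if name not in implemented)
    return True, ignored


implemented = {"CPUCount": 8}  # this service does not track CPUSpeed
print(check_request(implemented, {"CPUCount": 4}, {"CPUSpeed": 2.0e9}))
print(check_request(implemented, {"CPUSpeed": 2.0e9}, {}))
```

Under this reading, the first request is accepted with `CPUSpeed` reported as ignored, while the second fails because a required attribute is not tracked. Whether BES intends anything like this is exactly the open question.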

· System management operations:

· Currently BES supports two specific system management operations: the commands to start and stop accepting new activities. Most schedulers support a variety of scheduling-specific system management operations and I’m wondering why these two operations were singled out in particular to be part of the base case?

· These operations seem to require a different set of authorization credentials than the other interface operations since they should be invoked by system administrators rather than random users. How will that work, given that these operations are in the same WSDL as the other operations? Wouldn’t this argue for moving these operations to a separate system management interface?

· Array operations:

· Currently one can create a single activity, but all other operations accept an array of AEDs as input. Was there some reason why an array creation operation wasn’t included so that, for example, parameter sweep applications can be created with a single request instead of N requests (where N can be in the thousands)?

· Given that BES seems to have bought into the notion of extensibility, should the base case be a “non-array” one? For example, currently if you want to handle a fault for a RequestActivityStateChange operation on a single activity you need to look inside the returned array of results to see if a fault infoset was returned. All the exception handling machinery that modern tooling provides can’t get used because RequestActivityStateChange never returns an actual fault message (as compared to a fault infoset for the appropriate array elements that are returned).
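
The fault-handling difference described above can be illustrated with a small Python sketch. Everything here is hypothetical (the function names, the `ActivityFault` exception, the `UnknownActivity` fault label, and the dict-based result shape); it is meant only to contrast the two calling styles, not to model the actual BES messages.

```python
class ActivityFault(Exception):
    """Hypothetical fault raised by a single-activity operation."""


def request_state_change_array(activity_ids, known):
    """Array style: always returns normally; faults are buried per element."""
    results = []
    for aid in activity_ids:
        fault = None if aid in known else "UnknownActivity"
        results.append({"activity": aid, "fault": fault})
    return results


def cancel_activity(aid, known):
    """Single style: failure surfaces as an ordinary exception."""
    if aid not in known:
        raise ActivityFault(f"UnknownActivity: {aid}")


known = {"a1"}

# Array style: the caller has to dig into the result to notice the failure.
result = request_state_change_array(["a2"], known)[0]
array_failed = result["fault"] is not None

# Single style: the normal exception machinery does the work.
try:
    cancel_activity("a2", known)
    single_failed = False
except ActivityFault:
    single_failed = True
```

Both styles detect the same failure, but only the second lets generated client stubs map the failure onto the language's native error handling.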

· Other questions:

· An entire (small) section is devoted to talking about the optional use of WS-Names. However, since the specification doesn’t require them, it’s unclear to me whether BES needs to say anything about WS-Names. As far as I understand things, whether an EPR is a WS-Name or not can be determined by inspecting it. Hence the only reason to have a special property on a BES service that indicates what kind of AEDs it returns is to alert potential clients ahead of time about this feature of the service. But it’s not clear to me what a client would do with that information, as compared to deciding opportunistically to exploit a WS-Name AED (e.g. for resolution) at the time it becomes necessary. Is there a use case that describes how clients would exploit the AED-type resource property?

· Since JSDL documents are self-describing, a BES service can figure out by inspection whether the job description infoset parameter to CreateActivityFromJSDL is JSDL or something else. This would seem to imply that naming the operation CreateActivity would lose no information and would allow for transparent extension to other job description infosets simply by using them (assuming they are self-describing).

· Container attributes that I have questions about:

· LocalResourceManagerType: where do these get defined normatively?

· Job Credential Service and File Credential Service: these imply a specific security model. Given that security is undefined in the BES spec, is this appropriate – especially given the rather vague definition of both?

Given these questions, as well as the mandate for the HPC profile to define a simple base interface, I would like to present the following straw man proposal for a modified BES specification for feedback from this community:

· Operations:

· CreateActivity(jsdlDoc) → EPR

· GetActivity(EPR) → activityState

· GetActivityProvenance(EPR) → either JSDL doc (if that can describe all the necessary provenance info) or JSDL+

· CancelActivity(EPR)

· For non-WSRF versions: QueryResources() → schedulerResourcesInfoset

· ‘schedulerResourcesInfoset’ is essentially the union of the RPs that would be exported in a WSRF-based version for describing the resources that are available for use at this BES service. Note that a BES service might also want to expose other kinds of information that would not be returned from this operation – this operation is there so that clients can determine whether or not a BES service could potentially meet their needs and is necessary for meta-scheduling scenarios.

· One might argue that one could use WS-Transfer for this operation. However, since a BES service might want to export other kinds of information, this would require an extra level of indirection so that the BES service could expose which EPRs to use for retrieving which kinds of information.
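
To give a feel for how small this operation set is, here is an in-memory Python toy that mirrors the straw-man operations listed above. It is a sketch under stated assumptions, not a proposed binding: the class name `StrawManBES`, the `urn:example:` strings standing in for EPRs, the plain dict standing in for schedulerResourcesInfoset, and the choice to start activities in the "new" state are all made up for illustration.

```python
import itertools


class StrawManBES:
    """In-memory toy mirroring the straw-man operations; not a real service."""

    TERMINAL = ("canceled", "failed", "finished")

    def __init__(self, resources):
        self._resources = dict(resources)
        self._activities = {}
        self._ids = itertools.count(1)

    def create_activity(self, jsdl_doc):
        # The returned string is only a stand-in for an EPR.
        epr = f"urn:example:activity:{next(self._ids)}"
        self._activities[epr] = {"jsdl": jsdl_doc, "state": "new"}
        return epr

    def get_activity(self, epr):
        return self._activities[epr]["state"]

    def get_activity_provenance(self, epr):
        # Here the provenance is just the submitted job description.
        return self._activities[epr]["jsdl"]

    def cancel_activity(self, epr):
        # The service decides the transition; terminal states are left alone.
        if self._activities[epr]["state"] not in self.TERMINAL:
            self._activities[epr]["state"] = "canceled"

    def query_resources(self):
        # Return a copy so callers cannot mutate the service's view.
        return dict(self._resources)


bes = StrawManBES({"CPUCount": 8})
epr = bes.create_activity("<JobDefinition/>")
bes.cancel_activity(epr)
```

A meta-scheduler would call `query_resources()` first to decide whether this service could meet a job's needs, and only then call `create_activity`.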

· Additional topics/summary:

· Simple state diagram and no notion of array operations, data staging, suspension, or notifications in base BES case.

· Extensions defined as separate profiles for array operations, data staging, suspension, and notifications.

· RequestActivityStateChange replaced by operations specifying desired actions rather than states. Base case supports activity cancellation; extensions can define additional operations (e.g. SuspendActivity).

· Information model: small base set plus extensions model (which ones to include in the base set TBD)

· All system management functions moved out to a separate interface.

Thanks for any and all feedback on these questions and this straw man proposal,

Marvin.