
Ian,

My two cents.

On 22 Jun 2006, at 23:11, Ian Foster wrote:
Hi,
I'd like to do some more on the BES spec. However, I have a few questions that I want to raise first:
* The extensible state model: this seems a great idea. However, I feel concerned that there are difficulties lurking there that we haven't discovered yet. Is there any prior work here that we can refer to in order to make ourselves (and others) feel more confident in the approach?
I too am pretty happy with this. The only thing I might change is to move the three terminal states to a substate model and only expose "Done" at the top level, but I won't cry if we leave it the way it is.

We had very good luck with a nested state model in Unicore. It wasn't extensible, and not all detailed substates were implemented by all platforms, but with the nesting we could always say something about the state. It worked well both for servers that didn't implement all substates and for clients that only understood the higher states. The attached java file describes the Unicore state model, for those interested.

There is one thing I am concerned about, and that is how we express these substates. For example, we want to be able to combine the concepts of StagingIn and Held without building a complex matrix of all possible combinations. At the same time, we want StagingIn_Held to be a substate of StagingIn and restricted accordingly, e.g. limiting what the next states can be. What does this look like in XML? Any ideas folks?

The simplest would be to use the schema extension model in XML. Define a type State, then restrict it to New, Pending, Running, Cancelled, Failed, and Finished. Then we can restrict Running to get Running_StagingIn, Running_Executing, and Running_StagingOut. But then when we add the notion of a held substate on each of these, we get Running_StagingIn_Held, Running_Executing_Held, and Running_StagingOut_Held. I am happy with this, but it does create a large number of state names.

Note, this substate approach only works if the substates really are specializations of their super state. In particular there is a two-part rule that MUST be followed: "All transitions to or from a substate must either be within the super state or reachable through the super state's possible transitions." So a transition from Running_Executing_Held to Running_StagingOut would be possible because there is a transition from Running_Executing to Running_StagingOut. However, going from Running_Executing_Held to Pending would not be allowed, because Running has no transition to Pending.

Note also, although this has nothing to do with substates, that the transition from Failed to Pending in Figure 3 is not legal, since the super state model has no transition from Failed to Pending. The mistake is repeated in Figure 4.
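To make the substate rule concrete, here is a minimal sketch in Python of one possible reading of it. It assumes that state names encode nesting with underscores, as in the names above, and that a transition between two states is judged at the level where their names first differ. The transition table is an illustrative assumption, not the table from the spec: only the Running_Executing to Running_StagingOut entry, and the absence of Running to Pending and Failed to Pending, are taken from this discussion. The function name is_consistent is hypothetical.

```python
# Assumed transitions between sibling states (illustrative only, not the
# BES table). Nesting is encoded in the underscore-separated names.
TRANSITIONS = {
    ("New", "Pending"),
    ("Pending", "Running"),
    ("Pending", "Cancelled"),
    ("Running", "Finished"),
    ("Running", "Failed"),
    ("Running", "Cancelled"),
    ("Running_StagingIn", "Running_Executing"),
    ("Running_Executing", "Running_StagingOut"),
}


def is_consistent(src, dst):
    """Check a transition against the substate rule (one interpretation).

    Both states are lifted to the nesting level where they first differ,
    and the pair of super states at that level is looked up in TRANSITIONS.
    """
    s, d = src.split("_"), dst.split("_")

    # Length of the common prefix of the two state paths.
    k = 0
    while k < len(s) and k < len(d) and s[k] == d[k]:
        k += 1

    # One state is the other, or is nested inside it: the move stays
    # within the super state (e.g. entering or leaving Held).
    if k == len(s) or k == len(d):
        return True

    # Otherwise the super states at the point of divergence must
    # themselves have a transition.
    lifted_src = "_".join(s[: k + 1])
    lifted_dst = "_".join(d[: k + 1])
    return (lifted_src, lifted_dst) in TRANSITIONS


if __name__ == "__main__":
    print(is_consistent("Running_Executing_Held", "Running_StagingOut"))  # True
    print(is_consistent("Running_Executing_Held", "Pending"))             # False
    print(is_consistent("Failed", "Pending"))                             # False
```

With this reading, adding a new substate never requires new table entries unless it introduces transitions among its own siblings, which is what keeps the number of explicit rules small even when the number of state names grows.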
* Information model: I certainly think we need an information model. On the other hand, I wonder if we might want to think about pulling the details out as a "profile." What we have is just what we need for our current HPC use cases; on the other hand, if one is modeling say a Java container, they might be less appropriate. Thoughts?
I think the information model we have is fairly simple and universal, and I would like to see it remain part of the spec. As most elements are optional, I don't think this overburdens non-HPC implementations. The extensibility allows for other profiles, of course.
* Management operations: we currently have a couple of operations ("start/stop accepting requests") that are really container management rather than job operations. Could we separate those out in a distinct port-type?
We just resolved an issue to do exactly this.
* I know there was discussion about how to handle single-job vs. multi-job operations in a consistent manner, but I don't know how that was reconciled.
A pending issue still.

-- 
Take care:

Dr. David Snelling < David . Snelling . UK . Fujitsu . com >
Fujitsu Laboratories of Europe
Hayes Park Central
Hayes End Road
Hayes, Middlesex UB4 8FE

+44-208-606-4649 (Office)
+44-208-606-4539 (Fax)
+44-7768-807526 (Mobile)