Re: [OGSA-BES-WG] [Fwd: [ogsa-wg] BES Comments]

28 Nov 2006

...
ii) Why does the state model not end in one terminated state?  Are
there any real differences between the 3 enumerated end states other
than how the activity ended up in the state?
iii) Also it seems somewhat counterintuitive that if one is going to
have three termination states, one should end up in the one state which
reflects both success and failure (Finished) and a separate one for
failures that prevent the activity from exiting in a controlled
fashion.  It seems like the transitions to the terminated state should
be something like -
a. Completed Successfully
b. Failed
c. Cancelled

The three states are intended to convey how the activity completes. The
Finished state indicates that the activity itself completed (without outside
intervention), but makes no statement about whether the computation that the
activity was engaged in was successful or not. The intent was for this to be
refined through extension sub-states. One of the reasons is that it's hard
(generically) to determine whether an activity completes successfully or
not. Using something like an exit code is traditional, but it doesn't always
help you distinguish between a failure ("file doesn't exist") and a problem
in the running of the activity ("my calculated result is wrong").

Failed means that some "outside event" (e.g. a host failure) caused the
activity to finish prematurely, so that you can't say whether it completed
or not. And Cancelled is used to indicate a manual intervention that stops
the activity from completing.

At one point we discussed having one terminal state, but we felt that three
was quite useful.
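
As a rough illustration of the distinction described above, here is a
minimal Python sketch. The function and parameter names are hypothetical
and are not taken from the BES specification, and the precedence of
cancellation over an outside failure is an assumption made only for the
example.

```python
from enum import Enum
from typing import Optional


class TerminalState(Enum):
    FINISHED = "Finished"    # activity ended on its own; says nothing about success
    FAILED = "Failed"        # an outside event (e.g. a host failure) ended it early
    CANCELLED = "Cancelled"  # manual intervention stopped it


def terminal_state(cancelled_by_user: bool, outside_failure: bool) -> TerminalState:
    """Pick the terminal state from how the activity ended, not from its result."""
    # Assumption for this sketch: a manual cancel takes precedence.
    if cancelled_by_user:
        return TerminalState.CANCELLED
    if outside_failure:
        return TerminalState.FAILED
    return TerminalState.FINISHED


def describe(state: TerminalState, exit_code: Optional[int] = None) -> str:
    """Report the exit code as extra detail only; it never changes the state."""
    if state is TerminalState.FINISHED and exit_code is not None:
        return f"{state.value} (exit code {exit_code})"
    return state.value
```

In this sketch a non-zero exit code still yields Finished; whether that
counts as success would be left to an extension sub-state.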
...
iv) Is there any notion of a retry, given that the state machine
captures failure?  Or perhaps rescheduling?  For example one could
imagine additional useful transitions -
a. Pending to Pending - i.e. rescheduling
b. Running to Pending - i.e. failed/retrying
Having these in the base profile or basic model would enable greater
flexibility in implementing future profiles where one could request that
the BES container implement retries or allow rescheduling.  Or are you
categorically stating that users will never be able to reschedule or
request automatic retries?  Again if this is the case, it should
probably be explicitly stated.  I would actually urge you to make the
base specification as broad as possible in terms of the acceptable state
changes at this level to avoid unnecessarily constraining yourself down
the line.  I think this is especially important given what is stated in
4.2.

I don't think pending to pending is necessary, as you can sub-state this to
indicate any internal pending state transitions.

Good point on running to pending ... maybe we should consider this one? How
about failed to pending, or finished to pending?
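
To make the question concrete, here is a small Python sketch of a
transition check with an optional Running to Pending retry. The base
transition set is an assumption inferred from the states named in this
thread (Pending, Running, Finished, Failed, Cancelled), not copied from
the specification, and the allow_retry flag is hypothetical.

```python
# Assumed base transitions between the standard states (illustrative only).
BASE_TRANSITIONS = {
    ("Pending", "Running"),
    ("Pending", "Cancelled"),
    ("Pending", "Failed"),
    ("Running", "Finished"),
    ("Running", "Failed"),
    ("Running", "Cancelled"),
}

# The extra transition under discussion: Running back to Pending (retry).
RETRY_TRANSITION = ("Running", "Pending")


def is_allowed(src: str, dst: str, allow_retry: bool = False) -> bool:
    """Return True if src -> dst is permitted under the assumed model."""
    if (src, dst) in BASE_TRANSITIONS:
        return True
    return allow_retry and (src, dst) == RETRY_TRANSITION
```

Failed to Pending and Finished to Pending are deliberately left out of
the sketch, since allowing them would mean those states are no longer
terminal.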
...
vi) 4.2 - State Specialization really seems to be indicating that this
specification makes no assumptions with respect to sub-states that may
be incorporated within an implementation but that such sub-states are
acceptable as long as they do not introduce new state transitions
between the standard, specified states.  It would be very helpful to
simply state this before all of the examples.

Yes ... this should be made very clear.
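
One way to read that rule, sketched in Python: a move between
sub-states is acceptable if both sub-states share a parent, or if the
move between their parents is already a standard transition. The
sub-state names and the base transition set here are hypothetical
examples, not taken from the specification.

```python
# Assumed transitions between the standard states (illustrative only).
BASE_TRANSITIONS = {
    ("Pending", "Running"),
    ("Pending", "Cancelled"),
    ("Pending", "Failed"),
    ("Running", "Finished"),
    ("Running", "Failed"),
    ("Running", "Cancelled"),
}

# Hypothetical sub-states mapped to the standard state they specialize.
PARENT = {
    "Held": "Pending",
    "Staging-In": "Running",
    "Executing": "Running",
    "Staging-Out": "Running",
}


def parent_of(state: str) -> str:
    """Standard states are their own parent; sub-states map to one."""
    return PARENT.get(state, state)


def substate_move_is_valid(src: str, dst: str) -> bool:
    """A sub-state move must not imply a new transition between standard states."""
    p_src, p_dst = parent_of(src), parent_of(dst)
    return p_src == p_dst or (p_src, p_dst) in BASE_TRANSITIONS
```

Under these assumptions, Staging-In to Executing is fine (both are
inside Running), while Executing to Held would be rejected because it
implies Running to Pending.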

It's definitely time for another BES conference call.....

-- Chris

Christopher Smith