OCCI MC - State Machine Diagram - occi-wg

OCCI MC - State Machine Diagram


Sam Johnston

18 Apr 2009

11:40 a.m.

Afternoon all,

I have created a diagram (attached) of what I think the absolute minimum core states need to be... essentially boiling them down to "STOPPED" and "ACTIVE" with "START" and "STOP" being the only requisite actuators. Transitional "STOPPING" and "STARTING" states are optional.

I believe states should be completely unambiguous so I don't particularly like the DMTF model ("stopped" and "active" machines are also "defined" but this is a separate state) and that also rules out vague terminology like "inactive" (which could mean either "stopped" or "suspended").

I've got an optional "RESTART" actuator which takes you from "ACTIVE" to "STARTING" as well as an optional "SUSPEND" and "RESUME" cycle which takes you via the transitional "SUSPENDING" and "RESUMING" states between "ACTIVE" and "SUSPENDED". DMTF have another "paused" state but I wonder whether this needs to be a state in its own right or if it's an attribute of "suspended".

That's a rhetorical question - we could spend all week discussing nuances but for now I want to make sure we agree on the absolutely minimalist core functionality. Other states can be added via a live registry which we can pre-populate with a view to guiding innovation without stifling it.

This has been uploaded to the wiki:
http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/StateModel

Sam
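The states and actuators described above can be sketched as a small transition table. This is only an illustration of the message, not the attached diagram or any OCCI specification: the lowercase actuator strings and the helper names `fire`, `settle` and `actuators` are assumptions made for the example.

```python
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    ACTIVE = "active"
    STARTING = "starting"      # optional transitional state
    STOPPING = "stopping"      # optional transitional state
    SUSPENDING = "suspending"  # optional transitional state
    SUSPENDED = "suspended"    # optional state
    RESUMING = "resuming"      # optional transitional state


# (current state, actuator) -> state entered when the actuator fires.
TRANSITIONS = {
    (State.STOPPED, "start"): State.STARTING,
    (State.ACTIVE, "stop"): State.STOPPING,
    (State.ACTIVE, "restart"): State.STARTING,     # optional actuator
    (State.ACTIVE, "suspend"): State.SUSPENDING,   # optional actuator
    (State.SUSPENDED, "resume"): State.RESUMING,   # optional actuator
}

# Transitional states eventually settle into a stable state.
SETTLES_TO = {
    State.STARTING: State.ACTIVE,
    State.STOPPING: State.STOPPED,
    State.SUSPENDING: State.SUSPENDED,
    State.RESUMING: State.ACTIVE,
}


def fire(state, actuator):
    """Return the state entered by firing `actuator`, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, actuator)]
    except KeyError:
        raise ValueError(f"{actuator!r} is not valid in state {state.name}") from None


def settle(state):
    """Return the stable state a transitional state ends in (stable states map to themselves)."""
    return SETTLES_TO.get(state, state)


def actuators(state):
    """List the actuators that may be fired from `state`."""
    return sorted(a for (s, a) in TRANSITIONS if s is state)
```

A provider that skips the optional transitional states would simply report `settle(fire(state, actuator))` straight away.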

Attachments:

attachment.html (text/html — 1.7 KB)
OCCIMCStateMachine.png (image/png — 36.2 KB)
OCCIMCStateMachine.svg (image/svg+xml — 9.8 KB)


Chris Webb

18 Apr 2009

11:43 a.m.

Sam Johnston <samj@samj.net> writes:

...

I have created a diagram (attached) of what I think the absolute minimum core states need to be... essentially boiling them down to "STOPPED" and "ACTIVE" with "START" and "STOP" being the only requisite actuators. Transitional "STOPPING" and "STARTING" states are optional.

+1

I strongly agree with simplifying things in this way. Good stuff!

Cheers,

Chris.

Sam Johnston

1:14 p.m.

On Sat, Apr 18, 2009 at 1:43 PM, Chris Webb <chris.webb@elastichosts.com> wrote:

...

Sam Johnston <samj@samj.net> writes:

...
I have created a diagram (attached) of what I think the absolute minimum core states need to be... essentially boiling them down to "STOPPED" and "ACTIVE" with "START" and "STOP" being the only requisite actuators. Transitional "STOPPING" and "STARTING" states are optional.

+1

I strongly agree with simplifying things in this way. Good stuff!

I subsequently realised that in fact infrastructure like Amazon, Mosso and ElasticHosts don't actually have a "stopped" state - "stop" for these guys is more like "destroy".

It then occurred to me that there was no point making "stopped" optional: if you don't need to start/stop/restart machines then you just don't implement the machine control extension. It's far easier to create (HTTP PUT) and then destroy (HTTP DELETE) a server than it is to parse the response to fire an actuator.

So basically the machine control extension becomes optional (but possibly still interesting to indicate transitional states like "starting" and "stopping" as Chris pointed out privately).

Sam

Krishna Sankar (ksankar)

3:22 p.m.

Sam/Chris,

Good work. A couple of points:

a) The diagram needs entry and exit point circles (actually concentric circles).

b) The Stopping and Starting states are important. For example, if the state is Starting, clients could retry; schedulers could add the VM to their pool et al. The Stopping state will mean that the system is not accepting service requests anymore. It might have to stay in this state until all pending requests are completed.

c) There is another state, Aborting - don't know if we want to add this.

d) The stopped state, while not important during run time, will be useful for account keeping, auditing et al. For example, a log entry with Stopped state and a timestamp.

e) Remember, monitoring and billing is an important plane for Clouds and so we should also have states that are relevant from that perspective ...

Cheers

<k/>

Sam Johnston

4:04 p.m.

Hi Krishna,

Thanks for the feedback.

On Sat, Apr 18, 2009 at 5:22 PM, Krishna Sankar (ksankar) <ksankar@cisco.com> wrote:

...

a) The diagram needs entry and exit point circles (actually concentric circles)

Resources can enter or leave the matrix at any point (e.g. you can import or migrate a suspended workload) so I think adding this, while technically correct, might impair readability (as it does in the DMTF diagram). The formal specification might want to include a formal state diagram however.

...

b) The Stopping and Starting states are important. For example if the state is Starting, clients could retry; schedulers could add the VM in their pool et al. Stopping state will mean that the system is not accepting service requests anymore. It might have to stay in this state until all pending requests are completed.

Sure, but it's really up to the provider as to whether they want to implement these. Some workloads (e.g. slices) start atomically so the transition doesn't make sense. We'll cater for the need but I'm a big fan of giving implementors maximum flexibility.
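A client can cope with both kinds of provider by treating transitional states as "poll again" and anything else as final. The sketch below assumes a hypothetical `get_state` callable that returns the state as a lowercase string; neither the function name nor the string values come from the thread.

```python
import time

# Transitional states a provider may or may not expose.
TRANSITIONAL = {"starting", "stopping"}


def wait_until_stable(get_state, attempts=10, delay=0.0):
    """Poll get_state() until it reports a non-transitional state."""
    for _ in range(attempts):
        state = get_state()
        if state not in TRANSITIONAL:
            return state
        time.sleep(delay)
    raise TimeoutError("resource is still in a transitional state")


# A provider that exposes the optional "starting" state ...
slow = iter(["starting", "starting", "active"])
# ... and one whose workloads start atomically, so no transition is ever seen.
atomic = iter(["active"])

print(wait_until_stable(lambda: next(slow)))
print(wait_until_stable(lambda: next(atomic)))
```

The same client code works whether or not the provider implements the transitional states, which is the flexibility argued for above.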

...

c) There is another state Aborting – don’t know if we want to add this.

Interesting idea - perhaps we'll include it in the registry as one of those "optional" states. Another interesting one is "paused", where no new requests will be accepted but all those in progress will be finished - a load balancer shouldn't send any new requests but it shouldn't terminate any existing ones either.
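To make the "paused" semantics concrete, here is a minimal sketch of a load balancer that stops sending new requests to a paused backend while letting its in-flight requests finish. The `Backend` and `LoadBalancer` classes and the least-loaded selection rule are assumptions for illustration only.

```python
class Backend:
    def __init__(self, name, state="active"):
        self.name = name
        self.state = state
        self.in_flight = 0  # requests currently being served


class LoadBalancer:
    def __init__(self, backends):
        self.backends = backends

    def dispatch(self):
        """Send a new request to the least-loaded backend that accepts new work."""
        candidates = [b for b in self.backends if b.state == "active"]
        if not candidates:
            raise RuntimeError("no backend is accepting new requests")
        target = min(candidates, key=lambda b: b.in_flight)
        target.in_flight += 1
        return target

    def complete(self, backend):
        """Record a finished request; paused backends may still finish theirs."""
        backend.in_flight -= 1


a, b = Backend("a"), Backend("b")
lb = LoadBalancer([a, b])
lb.dispatch()          # goes to "a" (first of the equally loaded backends)
a.state = "paused"     # "a" keeps its in-flight request but gets no new ones
lb.dispatch()          # goes to "b"
lb.complete(a)         # the request on "a" is allowed to finish
```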

...

d) The stopped state, while not important during run time, will be useful for account keeping, auditing et al. For example a log entry with Stopped state with a timestamp

Perhaps, but simpler systems might operate without billing or have a simple meter based approach. If the infrastructure doesn't already maintain information about stopped resources then we don't want to force them to in order to implement the API.

...

e) Remember, monitoring and billing is an important plane for Clouds and so we should also have states that are relevant from that perspective …

Agreed. I don't think we should get too deep into this, but we should bear in mind that the ability to see/compare costs in the clients delivers huge value. I've included some simple metering examples to get the creative juices flowing.

Sam


Krishna Sankar (ksankar)

4:13 p.m.

Sam,

Just for clarity, resources shouldn't be able to enter the state matrix at just any point. For example, they cannot enter the matrix in the Stopping state or leave while in the Active/Running state. That is why the entry and exit points are important. But, of course, we will discuss these in detail as we progress.

Yep, a Pause state is required. Good catch.

Cheers

<k/>
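One way to read the entry/exit argument is as a pair of allow-lists checked when a resource is imported or removed. Which states belong in each set was not settled in this thread; the sets below are assumptions chosen to match the two examples given (no entry in Stopping, no exit from Active) and the earlier remark that a suspended workload can be imported. The `Inventory` class is hypothetical.

```python
# Assumed for illustration only; the thread does not define these sets.
ENTRY_STATES = {"stopped", "suspended"}
EXIT_STATES = {"stopped", "suspended"}


class Inventory:
    """Tracks resources and enforces where they may enter or leave the state model."""

    def __init__(self):
        self.resources = {}

    def admit(self, resource_id, state):
        if state not in ENTRY_STATES:
            raise ValueError(f"cannot enter the state model in state {state!r}")
        self.resources[resource_id] = state

    def release(self, resource_id):
        state = self.resources[resource_id]
        if state not in EXIT_STATES:
            raise ValueError(f"cannot leave the state model from state {state!r}")
        del self.resources[resource_id]


inventory = Inventory()
inventory.admit("vm-1", "suspended")   # importing a suspended workload is allowed
inventory.resources["vm-1"] = "active"  # ... it is later resumed
# inventory.release("vm-1") would now raise: leaving from "active" is not allowed
```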

Sam Johnston

5:20 p.m.

I've added both to the registry:
http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/Registries

Sam

On Sat, Apr 18, 2009 at 6:13 PM, Krishna Sankar (ksankar) <ksankar@cisco.com> wrote:

...

Sam,

Just for clarity, resources shouldn’t enter the state matrix at any point. For example they cannot enter the matrix in Stopping stage or leave while in Active/Running state. That is why the entry and exit points are important. But, of course, we will discuss these in detail as we progress.

Yep, a Pause state is required. Good catch.

Cheers

<k/>


Andre Merzky

5:30 p.m.

Hi Sam,

I am not sure I understand how you expect extensions to the state model to work.

For example, assume that I have a client which implements the core specification only, and thus only knows the STOPPED, ACTIVE and SUSPENDED states (your original figure). What is that client supposed to do if the backend reports a PAUSED state?

It is a standards compliant client, so I would expect it to work with a standards compliant backend. However, the extension registry as it exists right now would break backward compatibility.

One solution would be to register new states as substates of existing states. PAUSED, for example, could be registered as a substate of SUSPENDED. What is the difference? Well, the state reported by the backend would be SUSPENDED, which the client understands. The client would also know that the resume() operation is valid for that state. Other clients which implement the extension would obtain the state SUSPENDED and the state_detail PAUSED, and learn that way that the resource does not accept new requests. Voila: backward compatibility.

Of course, that model poses limitations: it does not allow extensions to add transitions which are not present within the top level state diagram. E.g., no extension could implement a direct transition from PAUSED to STOPPED, as this is not in your original state diagram. I, however, do not consider that to be a bug, but a feature: it guarantees that the top level state model is preserved even if extensions are present.

BTW, I agree with Krishna's point that ENTRY and EXIT points are useful.

Cheers, Andre

-- Nothing is ever easy.
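The substate proposal above can be sketched as follows. The `state` and `state_detail` field names follow the wording of the message; the dictionary message shape, the lowercase state strings and the function names are assumptions made for this example, not part of any specification.

```python
# Core states every compliant client understands (from the original figure).
CORE_STATES = {"stopped", "active", "suspended"}

# Hypothetical registry entry: extension state -> core state it refines.
SUBSTATE_OF = {"paused": "suspended"}


def report(actual_state):
    """Build what a backend would report: a core state plus optional detail."""
    if actual_state in CORE_STATES:
        return {"state": actual_state}
    return {"state": SUBSTATE_OF[actual_state], "state_detail": actual_state}


def core_client_view(message):
    """A core-only client ignores state_detail and still sees a state it knows."""
    return message["state"]


def extended_client_view(message):
    """A client that implements the extension prefers the finer-grained detail."""
    return message.get("state_detail", message["state"])


message = report("paused")
print(core_client_view(message))      # the core client sees the parent state
print(extended_client_view(message))  # the extended client sees the substate
```

Because every registered state maps onto a core state, the set of transitions a core-only client relies on stays valid, which is the limitation (and guarantee) described in the message.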

Krishna Sankar (ksankar)

5:34 p.m.

Good point. I think we will need conformance levels (say 1, 2 and 3) so that we can introspect and infer what an implementation will support. Agreed, at the instance level, we do need to be deterministic.

Cheers

<k/>

Sam Johnston

5:45 p.m.

Hi,

I'm really quite sensitive about prematurely throwing a "wet blanket" over innovation (it is, after all, fairly early to be talking about standards). That's one of the main reasons for keeping the state engine on the server side and exposing possible transitions via links:

  <atom:link title="Start" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/start" rel="http://purl.org/occi/state#start"/>
  <atom:link title="Stop" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/stop" rel="http://purl.org/occi/state#stop"/>
  <atom:link title="Restart" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/restart" rel="http://purl.org/occi/state#restart"/>
  <atom:link title="Suspend" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/suspend" rel="http://purl.org/occi/state#suspend"/>

We can't presume to know what weird and wonderful things people will want to do in the future so the registry is more for the server side implementors (to encourage them to use existing terms rather than coming up with their own).

And before you ask, the "title" field is UI friendly and easily localised using content negotiation (so you can have a "Vaporize" transition over there which will appear as "Vaporise" over here).

Cheers,

Sam

On Sat, Apr 18, 2009 at 7:34 PM, Krishna Sankar (ksankar) <ksankar@cisco.com> wrote:

...

Good point. I think we will need conformance levels (say 1,2 and 3) so that we can introspect and infer what an implementation will support. Agreed, at the instance level, we do need to be deterministic.

Cheers <k/>
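With the state engine on the server side, a client only needs to read the links it is given. The sketch below parses links like the ones in Sam's message with the standard library and lists the transitions currently on offer. The enclosing `atom:entry` element and the `available_actuators` helper are assumptions for the example; the `rel` prefix and URLs are taken from the message.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
STATE_REL = "http://purl.org/occi/state#"
BASE = "http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/"

# Hypothetical response for a resource that is currently active.
ENTRY = f"""
<atom:entry xmlns:atom="{ATOM}">
  <atom:link title="Stop" href="{BASE}stop" rel="{STATE_REL}stop"/>
  <atom:link title="Restart" href="{BASE}restart" rel="{STATE_REL}restart"/>
  <atom:link title="Suspend" href="{BASE}suspend" rel="{STATE_REL}suspend"/>
</atom:entry>
""".strip()


def available_actuators(entry_xml):
    """Map actuator name -> (UI title, URL) for every state link in the entry."""
    root = ET.fromstring(entry_xml)
    found = {}
    for link in root.iter(f"{{{ATOM}}}link"):
        rel = link.get("rel", "")
        if rel.startswith(STATE_REL):
            name = rel[len(STATE_REL):]
            found[name] = (link.get("title"), link.get("href"))
    return found


for name, (title, href) in sorted(available_actuators(ENTRY).items()):
    print(f"{name}: {title} -> {href}")
```

A client written this way never hard-codes the state diagram: an unfamiliar `rel` simply shows up as another entry, with its `title` available for display.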


Krishna Sankar (ksankar)

6:26 p.m.

Excellent. Agreed. Am a fan of orthogonal extensibility as well.

In which case, let us make the supported states meta-discoverable via an appropriate policy plane interface. Of course, just discovering will not solve the problems - one has to figure out the semantics - how to use the discovered states effectively.

BTW, have we started elaborating the canonical model of the different planes - data, control, management, policy and monitoring/billing? Or more precisely is there a placeholder? I will look around the Wiki - I assume that is the focal point of our efforts.

Cheers

<k/>

From: Sam Johnston [mailto:samj@samj.net]
Sent: Saturday, April 18, 2009 10:46 AM
To: Krishna Sankar (ksankar)
Cc: Andre Merzky; occi-wg@ogf.org
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi,

I'm really quite sensitive about prematurely throwing a "wet blanket" over innovation (it is, after all, fairly early to be talking about standards). That's one of the main reasons for keeping the state engine on the server side and exposing possible transitions via links:

<atom:link title="Start" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/start" rel="http://purl.org/occi/state#start"/>

<atom:link title="Stop" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/stop" rel="http://purl.org/occi/state#stop"/>

<atom:link title="Restart" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/restart" rel="http://purl.org/occi/state#restart"/>

<atom:link title="Suspend" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/suspend" rel="http://purl.org/occi/state#suspend"/>

We can't presume to know what weird and wonderful things people will want to do in the future so the registry is more for the server side implementors (to encourage them to use existing terms rather than coming up with their own).

Sam Johnston

6:34 p.m.

On Sat, Apr 18, 2009 at 8:26 PM, Krishna Sankar (ksankar) <ksankar@cisco.com> wrote:

> Excellent. Agreed. Am a fan of orthogonal extensibility as well.

Great!

> In which case, let us make the supported states meta-discoverable via an appropriate policy plane interface. Of course, just discovering will not solve the problems – one has to figure out the semantics – how to use the discovered states effectively.

I've been thinking about a metadata API but don't want to overcomplicate. Perhaps a template object exposing all the various options would be the easiest thing to do here. I figured our needs would become clearer as we start to think about implementation (something I'm already starting to do btw).

> BTW, have we started elaborating the canonical model of the different planes – data, control, management, policy and monitoring/billing? Or more precisely is there a placeholder? I will look around the Wiki – I assume that is the focal point of our efforts.

Yes, the wiki's where it's at. I've put in some placeholders for some of these as extensions but again trying to be minimalist... already the storage one may be able to go (since the main operations are create, delete and updating the size attribute - outside of that a snapshot actuator is the primary/only concern).

I'm going to get away from the computer for a bit so I'll leave you to it.

Sam

> Cheers
>
> <k/>

From: Sam Johnston [mailto:samj@samj.net]
Sent: Saturday, April 18, 2009 10:46 AM
To: Krishna Sankar (ksankar)
Cc: Andre Merzky; occi-wg@ogf.org
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi,

I'm really quite sensitive about prematurely throwing a "wet blanket" over innovation (it is, after all, fairly early to be talking about standards). That's one of the main reasons for keeping the state engine on the server side and exposing possible transitions via links:

<atom:link title="Start" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/start" rel="http://purl.org/occi/state#start"/>

<atom:link title="Stop" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/stop" rel="http://purl.org/occi/state#stop"/>

<atom:link title="Restart" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/restart" rel="http://purl.org/occi/state#restart"/>

<atom:link title="Suspend" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/suspend" rel="http://purl.org/occi/state#suspend"/>

We can't presume to know what weird and wonderful things people will want to do in the future so the registry is more for the server side implementors (to encourage them to use existing terms rather than coming up with their own).

And before you ask, the "title" field is UI friendly and easily localised using content negotiation (so you can have a "Vaporize" transition over there which will appear as "Vaporise" over here).
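As a rough illustration of the "server-side state engine, transitions exposed as links" idea, a client could discover what it is allowed to do simply by reading the links on a resource. The sketch below is only an assumption about how that might look: the wrapping `<entry>` element, the `rel="self"` link and the function name `available_transitions` are hypothetical, and only the two `atom:link` state elements are taken from the message.

```python
import xml.etree.ElementTree as ET
from typing import Dict

ATOM_NS = "http://www.w3.org/2005/Atom"
STATE_REL_PREFIX = "http://purl.org/occi/state#"

# Hypothetical resource representation: the wrapping <entry> and the
# "self" link are assumed; the two state links come from the message.
ENTRY_XML = """\
<entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:link rel="self" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63"/>
  <atom:link title="Start" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/start" rel="http://purl.org/occi/state#start"/>
  <atom:link title="Stop" href="http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/state/stop" rel="http://purl.org/occi/state#stop"/>
</entry>
"""


def available_transitions(entry_xml: str) -> Dict[str, Dict[str, str]]:
    """Map each advertised transition name to its UI title and actuator URL."""
    root = ET.fromstring(entry_xml)
    transitions = {}
    for link in root.iter(f"{{{ATOM_NS}}}link"):
        rel = link.get("rel", "")
        # Only links whose rel lives under the state namespace are transitions.
        if rel.startswith(STATE_REL_PREFIX):
            name = rel[len(STATE_REL_PREFIX):]
            transitions[name] = {
                "title": link.get("title", name),
                "href": link.get("href", ""),
            }
    return transitions


if __name__ == "__main__":
    for name, info in available_transitions(ENTRY_XML).items():
        print(f"{name}: {info['title']} -> {info['href']}")
```

The client never hard-codes a state diagram here; an unknown transition registered later would simply show up as one more entry, with its `title` available for display.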

Cheers,

Sam

On Sat, Apr 18, 2009 at 7:34 PM, Krishna Sankar (ksankar) <ksankar@cisco.com> wrote:

Good point. I think we will need conformance levels (say 1, 2 and 3) so that we can introspect and infer what an implementation will support. Agreed, at the instance level, we do need to be deterministic.

Cheers <k/>

|-----Original Message-----
|From: Andre Merzky [mailto:andremerzky@gmail.com] On Behalf Of Andre Merzky
|Sent: Saturday, April 18, 2009 10:30 AM
|To: Sam Johnston
|Cc: Krishna Sankar (ksankar); occi-wg@ogf.org
|Subject: Re: [occi-wg] OCCI MC - State Machine Diagram
|
|Hi Sam,
|
|I am not sure I understand how you expect extensions to the
|state model to work.
|
|For example, assume that I have a client which implements
|the core specification only, thus only knows the STOPPED,
|ACTIVE and SUSPENDED states (your original figure). What is
|that client supposed to do if the backend reports an PAUSED
|state?
|
|It is a standards compliant client, so I would expect it to
|work with a standards compliant backend. However, the
|extension registry as it exists right now would break
|backward compatibility.
|
|One solution would be to register new states as substates of
|existing states. PAUSED for example could be registered as
|substate for SUSPENDED. What is the difference? Well, the
|state reported by the backend would be SUSPENDED, which the
|client understands. The client would also know that the
|resume() operation is valid for that state. Other clients
|which implement the extension would obtain the state
|SUSPENDED and the state_detail PAUSED, and learn that way
|that the client does not accept new requests. Voila:
|backward compatibility.
|
|Of course, that model poses limitations: it does not allow
|extensions which allow transitions which are not present
|within the top level state diagram. e.g., no extension
|could implement a direct transition from PAUSED to STOPPED,
|as this is not in your original state diagram. I, however,
|do not consider that to be a bug, but a feature: it
|guarantees that the top level state model is preserved even
|if extensions are present.
|
|BTW, I agree with Krishna's point that ENTRY and EXIT
|points are useful.
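Andre's substate proposal can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the `ReportedState` type, the `SUBSTATE_PARENT` registry and the two client helper functions are hypothetical; only the state names and the `state` / `state_detail` split come from his message.

```python
from dataclasses import dataclass
from typing import Optional

# Core states named in the message above.
CORE_STATES = {"STOPPED", "ACTIVE", "SUSPENDED"}

# Hypothetical registry: each extension state refines exactly one core state.
SUBSTATE_PARENT = {"PAUSED": "SUSPENDED"}


@dataclass(frozen=True)
class ReportedState:
    state: str                          # always one of CORE_STATES
    state_detail: Optional[str] = None  # registered substate, if any


def report(actual: str) -> ReportedState:
    """Backend side: express any known state as core state plus detail."""
    if actual in CORE_STATES:
        return ReportedState(actual)
    if actual in SUBSTATE_PARENT:
        return ReportedState(SUBSTATE_PARENT[actual], actual)
    raise ValueError(f"unregistered state: {actual}")


def core_client_view(reported: ReportedState) -> str:
    """A core-only client ignores state_detail and still sees a known state."""
    return reported.state


def extended_client_view(reported: ReportedState) -> str:
    """An extension-aware client prefers the detail when one is present."""
    return reported.state_detail or reported.state


if __name__ == "__main__":
    r = report("PAUSED")
    print(core_client_view(r), extended_client_view(r))
```

Because every substate maps to a single core parent, a core-only client can keep applying the core transitions (such as resume from SUSPENDED) without knowing the extension exists, which is the backward-compatibility property the message argues for.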

Ignacio Martin Llorente

19 Apr

10:16 a.m.

Hi,

I am not sure what the state of this discussion is, but I would like to support the addition of ENTRY and EXIT points to the diagram. Moreover, unless we assume that submitted VMs have to run immediately or not at all, I would like to suggest that we need a "Pending" state in the diagram. Newly submitted VMs remain in this state until there are resources available. That could be the "DEFINED" state or the DMTF "INITIAL" state.

Cheers,
Ignacio

--
Ignacio M. Llorente, Full Professor (Catedratico): web http://dsa-research.org/llorente and blog http://imllorente.dsa-research.org/
DSA Research Group: web http://dsa-research.org and blog http://blog.dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

On 18/04/2009, at 20:26, Krishna Sankar (ksankar) wrote:

> Excellent. Agreed. Am a fan of orthogonal extensibility as well.
>
> In which case, let us make the supported states meta-discoverable via an appropriate policy plane interface. Of course, just discovering will not solve the problems – one has to figure out the semantics – how to use the discovered states effectively.
>
> BTW, have we started elaborating the canonical model of the different planes – data, control, management, policy and monitoring/billing? Or more precisely is there a placeholder? I will look around the Wiki – I assume that is the focal point of our efforts.
>
> Cheers <k/>
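To make Ignacio's suggestion at the top of this message concrete, here is a minimal sketch of a state table with explicit ENTRY and EXIT points and a "PENDING" state. It is an assumption-laden illustration, not a proposal from the thread: the actuator names `submit`, `schedule` and `destroy`, and the choice of which states may reach EXIT, are hypothetical (the latter loosely follows Krishna's earlier remark that a resource should not leave while Active).

```python
ENTRY, EXIT = "ENTRY", "EXIT"  # pseudo-states marking where resources enter and leave

# Hypothetical transition table: (current state, actuator) -> next state.
TRANSITIONS = {
    (ENTRY, "submit"): "PENDING",       # newly submitted VM waits for resources
    ("PENDING", "schedule"): "ACTIVE",  # resources became available
    ("PENDING", "destroy"): EXIT,
    ("ACTIVE", "stop"): "STOPPED",
    ("STOPPED", "start"): "ACTIVE",
    ("STOPPED", "destroy"): EXIT,
}


def step(state: str, actuator: str) -> str:
    """Return the next state, or raise if the actuator is not valid here."""
    try:
        return TRANSITIONS[(state, actuator)]
    except KeyError:
        raise ValueError(f"{actuator!r} is not valid in state {state}") from None


def run(actuators):
    """Apply a sequence of actuators starting from the ENTRY point."""
    state = ENTRY
    for actuator in actuators:
        state = step(state, actuator)
    return state


if __name__ == "__main__":
    print(run(["submit"]))
    print(run(["submit", "schedule", "stop", "destroy"]))
```

With a single table like this, the only way into the diagram is through ENTRY and the only ways out are the rows that lead to EXIT, so an attempt such as destroying an ACTIVE resource is rejected rather than silently allowed.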



Sam Johnston

11:10 a.m.

On Sun, Apr 19, 2009 at 12:16 PM, Ignacio Martin Llorente <llorente@dacya.ucm.es> wrote:

...

I am not sure what is the state of this discussion, but I would like to support the addition of ENTRY and EXIT points to the diagram.

Done. Attached, along with OmniGraffle source.

...

Moreover, if we do not assume that the submitted VMs have to run immediately or not at all, I would like to suggest that we need a "Pending" state in the diagram. The new submitted VMs remain in this state until there are resources available. That could be the "DEFINED" state or the DMTF "INITIAL" state.

"Pending" is ambiguous. "Resuming" is pending, so is "suspending". There's various takes on "defined" and "initial" too... for example if you were to implement something like the Q-Layer^WSun Cloud API web interface on top of OCCI then you would likely start by creating and linking a bunch of objects, none of which would be useful until associated with actual resources - it would be like a shell/skeleton. Perhaps this is something worth taking into consideration, though as I'd rather not impose too much on implementors it probably belongs in the registry. Sam

Krishna Sankar (ksankar)

4:15 p.m.

Same logic applies to stop and suspend as well ;o) I think stop and suspend could be semantically the same, except maybe in how state is handled. Is the difference that stop erases all internal state while suspend maintains it? But that concerns the RESTART and RESUME actions, not STOP/SUSPEND themselves. One could RESUME after STOP, couldn't one?

I also think a pending state after entry is a good idea. To be correct, as Ignacio mentions, a VM has to pass through a pending or resuming state before becoming active. The same goes for exit: exit after stopping or suspending.

Cheers
<k/>

Thijs Metsch

20 Apr

7:31 a.m.

Hi, Can you upload those files to the wiki? Cheers, -Thijs On Sun, 2009-04-19 at 13:10 +0200, Sam Johnston wrote:

...


--
Thijs Metsch
Tel: +49 (0)941 3075-122 (x60122)
http://blogs.sun.com/intheclouds
Software Engineer Grid Computing
Sun Microsystems GmbH
Dr.-Leo-Ritter-Str. 7
D-93049 Regensburg
mailto:thijs.metsch@sun.com
http://www.sun.com

Sam Johnston

7:50 a.m.

On Mon, Apr 20, 2009 at 9:31 AM, Thijs Metsch <Thijs.Metsch@sun.com> wrote:

...

Hi,

Can you upload those files to the wiki?

Sure, they're already hanging off this page: http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/StateModel Sam

...


Thijs Metsch

7:57 a.m.

Ah sorry, my fault - then it was only that the picture was not updated... It was missing the entry-points :-) -Thijs On Mon, 2009-04-20 at 09:50 +0200, Sam Johnston wrote:

...


Sam Johnston

8:19 a.m.

On Mon, Apr 20, 2009 at 9:57 AM, Thijs Metsch <Thijs.Metsch@sun.com> wrote:

...

Ah sorry, my fault - then it was only that the picture was not updated... It was missing the entry-points :-)

Finally worked out how to remove/update attachments. Done. Sam

Edmonds, AndrewX

9:58 a.m.

Mornin!

After looking at the state model with a colleague (Victor), it was brought up that there are no exception states captured. Might it be appropriate to insert an error state such as “ERROR” or “CRASHED”? The state model currently assumes that no failures occur, but we should design with failure in mind.

Now perhaps failure is mitigated by provider-internal strategies, in which case exceptions are not revealed to a client. But if a provider does not have advanced recovery strategies, there’s no way to signal exceptions to a client (we should cater for the lowest common denominator).

Andy

-------------------------------------------------------------
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

Alexis Richardson

10:05 a.m.

Good thinking. I am interested in how EH and GG deal with exceptions. Chris? On Mon, Apr 20, 2009 at 10:58 AM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

...


Tino Vazquez

10:35 a.m.

Howdy everyone,

Excellent thread. My three cents:

1) I think we should clearly define the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help implementations that rely on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

What do you think?

-Tino

On Mon, Apr 20, 2009 at 12:05 PM, Alexis Richardson <alexis.richardson@gmail.com> wrote:

...


-- Constantino Vázquez, Grid Technology Engineer/Researcher: http://www.dsa-research.org/tinova DSA Research Group: http://dsa-research.org Globus GridWay Metascheduler: http://www.GridWay.org OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

Sam Johnston

10:56 a.m.

Hi Tino,

A lot of this is covered in the state registry <http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/Registries> (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

...

Howdy everyone,

Excellent thread. My three cents:

1) I think we should define clearly the semantics of the states. for instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

...

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help in implementation relying on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines will wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

...

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted". What do you think?

...

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions

State control

| State       | Transitions                           | Description                                            |
|-------------|---------------------------------------|--------------------------------------------------------|
| aborting    | aborted                               | The resource encountered an error and is aborting      |
| *aborted*   | n/a                                   | The resource encountered an error and has aborted      |
| *active*    | *pause*, *restart*, *stop*, *suspend* | The resource is active                                 |
| resuming    | aborting, active                      | The resource is becoming active and restoring state    |
| pausing     | aborting, paused                      | The resource is preparing to refuse new requests       |
| *paused*    | aborting, *resume*                    | The resource is refusing new requests                  |
| starting    | aborting, active                      | The resource is becoming active                        |
| *stopped*   | *start*                               | The resource is inactive and has no saved state        |
| stopping    | stopped, aborting                     | The resource is becoming inactive and destroying state |
| *suspended* | *resume*, *stop*                      | The resource is inactive and has saved state           |

*Note*: Stable states and user transitions in *bold*.
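For readers who want to experiment with the registry, the table can be transcribed as plain data. The following is only a sketch under stated assumptions: the state and transition names are copied from the table, while the variable names and the helper `allowed_actuators` are hypothetical and not part of any OCCI document. The table does not say which transitional state each user transition leads to, so the sketch does not model that either.

```python
# Registry table transcribed as data. Names mirror the table above; the
# variables and helper below are illustrative, not part of any OCCI spec.

# Stable states (bold in the table).
STABLE_STATES = {"aborted", "active", "paused", "stopped", "suspended"}

# User transitions (bold in the table), keyed by the state they start from.
USER_TRANSITIONS = {
    "active": {"pause", "restart", "stop", "suspend"},
    "paused": {"resume"},
    "stopped": {"start"},
    "suspended": {"resume", "stop"},
}

# Non-bold entries: states the resource may move to without a user action.
AUTOMATIC_TRANSITIONS = {
    "aborting": {"aborted"},
    "resuming": {"aborting", "active"},
    "pausing": {"aborting", "paused"},
    "paused": {"aborting"},
    "starting": {"aborting", "active"},
    "stopping": {"stopped", "aborting"},
}

ALL_STATES = STABLE_STATES | set(AUTOMATIC_TRANSITIONS)


def allowed_actuators(state):
    """Return the user transitions the table lists for a state, sorted."""
    if state not in ALL_STATES:
        raise ValueError(f"unknown state: {state!r}")
    return sorted(USER_TRANSITIONS.get(state, set()))


print(allowed_actuators("suspended"))
```

A client could use such a lookup to decide which controls to offer for a resource in a given state; a state outside the registry (for example the proposed "pending") is rejected rather than guessed at.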

Edmonds, AndrewX

12:29 p.m.

I updated the state model (attached) to include an error state. Couldn’t see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

Andy

Sam Johnston

12:39 p.m.

On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

...

I updated the state model (attached) to include an error state. Couldn’t see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example. Sam
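One way to keep the diagram clean while still honouring "reachable from anywhere" is to derive the error edges rather than draw them. The sketch below assumes a two-state core model and a hypothetical state name `error` (the thread has not settled on a name); the function `with_error_edges` is likewise made up for illustration.

```python
def with_error_edges(transitions, error_state="error"):
    """Return a copy of the transition map in which every state can also
    move to error_state. The input map is left unmodified."""
    augmented = {
        state: set(targets) | {error_state}
        for state, targets in transitions.items()
    }
    # The error state gets no outgoing edges in this sketch; whether and how
    # a resource recovers from it is not decided here.
    augmented[error_state] = set()
    return augmented


# Hypothetical minimal core: only the two stable states and their targets.
CORE = {"stopped": {"active"}, "active": {"stopped"}}
MODEL = with_error_edges(CORE)

for state in sorted(MODEL):
    print(state, "->", sorted(MODEL[state]))
```

The hand-drawn model stays at two states and two edges, and the STOPPED->ERROR case mentioned above falls out of the derivation without an extra arrow per state.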

...


Thijs Metsch

12:41 p.m.

As I stated before, maybe we can add another exit-point instead of a real state. Cheers, -Thijs On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...


Edmonds, AndrewX

12:47 p.m.

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

...

On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote: I updated the state model (attached) to include an error state. Couldn’t see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston Sent: 20 April 2009 11:56 To: Tino Vazquez Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch

Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should clearly define the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help in implementation relying on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines will wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions State control

State     | Transitions                   | Description
aborting  | aborted                       | The resource encountered an error and is aborting
aborted   | n/a                           | The resource encountered an error and has aborted
active    | pause, restart, stop, suspend | The resource is active
resuming  | aborting, active              | The resource is becoming active and restoring state
pausing   | aborting, paused              | The resource is preparing to refuse new requests
paused    | aborting, resume              | The resource is refusing new requests
starting  | aborting, active              | The resource is becoming active
stopped   | start                         | The resource is inactive and has no saved state
stopping  | stopped, aborting             | The resource is becoming inactive and destroying state
suspended | resume, stop                  | The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
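The registry quoted above can be written down as a simple lookup structure for anyone who wants to experiment with it. What follows is only a minimal Python sketch, not part of the proposal: the names REGISTRY, user_actions, next_states and is_allowed are made up for illustration, and treating every entry that is not itself a state name as a user-triggered action is an assumption, since the table only distinguishes the two by bold type.

```python
# State registry as quoted in the thread: state -> transitions listed for it.
REGISTRY = {
    "aborting": ["aborted"],
    "aborted": [],  # listed as "n/a" in the table
    "active": ["pause", "restart", "stop", "suspend"],
    "resuming": ["aborting", "active"],
    "pausing": ["aborting", "paused"],
    "paused": ["aborting", "resume"],
    "starting": ["aborting", "active"],
    "stopped": ["start"],
    "stopping": ["stopped", "aborting"],
    "suspended": ["resume", "stop"],
}


def user_actions(state):
    """Entries that are not state names are assumed to be user actions."""
    return [t for t in REGISTRY[state] if t not in REGISTRY]


def next_states(state):
    """Entries that are state names are assumed to be automatic successors."""
    return [t for t in REGISTRY[state] if t in REGISTRY]


def is_allowed(state, transition):
    """True if the registry lists this transition for the given state."""
    return transition in REGISTRY[state]


if __name__ == "__main__":
    for name in REGISTRY:
        print(name, "actions:", user_actions(name), "next:", next_states(name))
```

Written this way, it is easy to see that "active" offers a suspend action while the table has no "suspending" row for it to lead to, which is the kind of gap such a check can surface.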


Sam Johnston

12:52 p.m.

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

...

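Sam's point that an implementation can tell its clients what the next states are could look roughly like this on the client side. This is a hedged Python sketch under stated assumptions: the dictionary layout, the "actions", "target" and "description" keys, and the /compute/123 paths are all hypothetical and are not taken from any OCCI document.

```python
# Hypothetical resource representation as a server might return it.
# Every key name and path below is an assumption for illustration only.
sample_resource = {
    "state": "suspended",
    "actions": [
        {"action": "resume", "target": "/compute/123?action=resume",
         "description": "Restore saved state and become active"},
        {"action": "stop", "target": "/compute/123?action=stop",
         "description": "Discard saved state and stop"},
    ],
}


def available_actions(resource):
    """Return the names of the actions the server advertised."""
    return [link["action"] for link in resource.get("actions", [])]


def target_for(resource, action):
    """Return where a client would send the request for an advertised action."""
    for link in resource.get("actions", []):
        if link["action"] == action:
            return link["target"]
    raise ValueError(f"{action!r} is not offered in state {resource['state']!r}")


if __name__ == "__main__":
    print(available_actions(sample_resource))
    print(target_for(sample_resource, "resume"))
```

A client written like this never needs the full diagram: it only offers what the server lists, and the description field carries the plain-language meaning for humans.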

Edmonds, AndrewX

1:02 p.m.

Sounds like a good compromise ☺

From: Sam Johnston [mailto:samj@samj.net] Sent: 20 April 2009 13:52 To: Edmonds, AndrewX Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

...


Thijs Metsch

1:07 p.m.

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...


Edmonds, AndrewX

1:15 p.m.

Updated...

-----Original Message----- From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM] Sent: 20 April 2009 14:07 To: Edmonds, AndrewX Cc: Sam Johnston; occi-wg@ogf.org; Molino, VictorX M Subject: RE: [occi-wg] OCCI MC - State Machine Diagram

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...


Andre Merzky

13 May 13 May

10:09 a.m.

Hi all,

I would like to follow up on the state model thread. Even if people seem to have mostly agreed upon the version documented in the wiki (http://forge.ogf.org/short/occi-wg/states), I am really unhappy with a number of things:

- too many states
- too many state transitions with no actions (i.e. methods) attached
- unclear entry and exit points

Please allow me to elaborate.

(1) Too many states:

All the XXX-ING states (STARTING, SUSPENDING, RESUMING, STOPPING) seem to have no real benefit, apart from informing the client that a state transition is in progress. State transitions, however, should, IMHO, not be part of a state diagram, unless specific actions can be performed during that transition. Take the STOPPING state, for example: there are no actions attached to that state, and the client can do nothing but wait until the state reaches STOPPED. So why not go to STOPPED immediately, and add the transition process to the semantics of the STOPPED state? (e.g., "resources can still be utilized iff they are required to store or retrieve the VM state").

(2) Too many internal state transitions:

That is related to (1) of course: the diagram has 11 internal state transitions (no actions attached), and 5 external transitions (actions attached). Well, that is not exactly incorrect or anything, but it is confusing matters. A state model should help to clarify what actions cause what state change.

(3) Unclear entry and exit points:

There is no state where the client can be sure that no remote resources are utilized anymore - i.e., a 'Destroyed' state is missing. At some point, a client needs to have the ability to free remote resources (well, unless one has garbage collection, but we don't want to go there, right?). STOPPED is not really the same - a STOPPED state can be moved to STARTING/ACTIVE via start(), so the backend cannot be allowed to free the resources.

I do understand that the XXX-ING states are useful for informative purposes, for example, to allow a GUI to inform the user that a start() request has been received, and the backend is spinning up the machine. Those types of information can, however, easily be provided by additional attributes, or sub-states.

Attached is an alternate state diagram. Its main purpose is to illustrate that, by limiting the number of states and internal state transitions, one can derive a much simpler (and IMHO cleaner) state diagram.

Cheers, Andre.

PS.: Sorry for not using graffle. Sources are in xfig, so I don't attach them - don't want to get laughed at for using such ancient technology ;-)

Quoting [Edmonds, AndrewX] (Apr 20 2009):

...

Updated...

-----Original Message-----

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...
Sounds like a good compromise ☺

From: Sam Johnston [mailto:samj@samj.net] Sent: 20 April 2009 13:52 To: Edmonds, AndrewX Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

-----Original Message----- From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM] Sent: 20 April 2009 13:42 To: Sam Johnston Cc: Edmonds, AndrewX; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

As I stated before, maybe we can add another exit-point instead of a real state.

Cheers,

-Thijs

On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...
On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote: I updated the state model (attached) to include an error state. Couldn't see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston Sent: 20 April 2009 11:56 To: Tino Vazquez Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch

Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should clearly define the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help implementations that rely on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines wait for a host to become available to run on. Also, I don't really think that a machine entering its life cycle in the SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions

State control

State      Transitions                    Description
---------  -----------------------------  ------------------------------------------------------
aborting   aborted                        The resource encountered an error and is aborting
aborted    n/a                            The resource encountered an error and has aborted
active     pause, restart, stop, suspend  The resource is active
resuming   aborting, active               The resource is becoming active and restoring state
pausing    aborting, paused               The resource is preparing to refuse new requests
paused     aborting, resume               The resource is refusing new requests
starting   aborting, active               The resource is becoming active
stopped    start                          The resource is inactive and has no saved state
stopping   stopped, aborting              The resource is becoming inactive and destroying state
suspended  resume, stop                   The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
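The registry rows above can be read as a lookup table that an implementation consults to tell a client what may happen next. The Python sketch below is one possible encoding, not part of any OCCI document: the `REGISTRY` dict and the `next_states` and `user_actions` helpers are hypothetical names, and the rule used to tell the two kinds of entry apart (an entry that is itself a registry state is an automatic next state, anything else is a user action) is an assumption made for this example.

```python
# State -> entries from the "Transitions" column of the registry.
REGISTRY = {
    "aborting": ["aborted"],
    "aborted": [],  # listed as "n/a" in the registry
    "active": ["pause", "restart", "stop", "suspend"],
    "resuming": ["aborting", "active"],
    "pausing": ["aborting", "paused"],
    "paused": ["aborting", "resume"],
    "starting": ["aborting", "active"],
    "stopped": ["start"],
    "stopping": ["stopped", "aborting"],
    "suspended": ["resume", "stop"],
}


def next_states(state):
    """Entries that name another registry state (automatic transitions)."""
    return [t for t in REGISTRY[state] if t in REGISTRY]


def user_actions(state):
    """Entries that do not name a state, assumed to be user-invoked actions."""
    return [t for t in REGISTRY[state] if t not in REGISTRY]
```

Under this reading none of the "-ing" states in the table offers a user action, while every state that does offer one is a stable state.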

-------------------------------------------------------------
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland
Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

-- Thijs Metsch Tel: +49 (0)941 3075-122 (x60122) http://blogs.sun.com/intheclouds Software Engineer Grid Computing Sun Microsystems GmbH Dr.-Leo-Ritter-Str. 7 mailto:thijs.metsch@sun.com D-93049 Regensburg http://www.sun.com

-- Nothing is ever easy.

Sam Johnston

10:24 a.m.

Andre,

The diagram was previously simple enough to be comprehensible, but the (IMO implicit) error transitions if anything detract from its meaning (again IMO).

We are not in any position to lock this down, as we simply don't know (a) what implementors will want to do (some machines only ever really exist in a running state, for example) and (b) what tomorrow's machines will be capable of. That's why we have a registry and a todo item to review it in the context of interoperability.

Sam

On Wed, May 13, 2009 at 12:09 PM, Andre Merzky <andre@merzky.net> wrote:

...

Andre Merzky

12:41 p.m.

Quoting [Sam Johnston] (May 13 2009):

...

Andre, The diagram was previously simple enough to be comprehensible but the (IMO implicit) error transitions if anything detract from its meaning (again IMO). We are not in any position to lock this down as we simply don't know what a> implementors will want to do (some machines only ever really exist in a running state for example) and b> what tomorrow's machines will be capable of. That's why we have a registry and a todo item to review it in the context of interoperability.

Hi Sam,

sure, you don't know what tomorrow's systems will do. But doesn't that argument lead to a discussion about extensibility, which is, IMHO, orthogonal?

I may have missed some mails about the issue, my apologies if the following has been discussed before:

The registry is one possible way to allow for extensibility of the state model. But (a) it does not help to keep the model clean (a cluttered registry is nothing you want your clients to handle), and (b) I thought that for the OCCI core, the set of states would be finite and well defined. Am I mistaken in (b)?

Thanks, Andre.
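One way to picture the distinction in (a) and (b) is a fixed core set of states with a separate registry for extensions. The Python sketch below is an illustration under assumptions, not an agreed OCCI design: `CORE_STATES` uses the "stopped" and "active" pair proposed as the minimal core at the start of this thread, and the `StateRegistry` class with its `register` and `classify` methods is a hypothetical name invented for this example.

```python
# Assumption: the core is the minimal pair proposed earlier in the thread.
CORE_STATES = frozenset({"stopped", "active"})


class StateRegistry:
    """Holds extension states; the core set stays fixed and finite."""

    def __init__(self):
        self._extensions = {}

    def register(self, name, description):
        """Add an extension state; core states cannot be redefined."""
        name = name.lower()
        if name in CORE_STATES:
            raise ValueError(f"{name!r} is a core state and cannot be redefined")
        self._extensions[name] = description

    def classify(self, name):
        """Return 'core', 'extension' or 'unknown' for a state name."""
        name = name.lower()
        if name in CORE_STATES:
            return "core"
        if name in self._extensions:
            return "extension"
        return "unknown"
```

A client written against such a split only has to understand the core states, and can treat anything classified as "extension" or "unknown" as optional information.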

...

Alexis Richardson

12:44 p.m.

Andre,

I (personally) would certainly agree with (b) **** for the draft **** so that people can start implementing and prototyping, thus creating a feedback loop so that we can go from draft to review to final.

alexis

On Wed, May 13, 2009 at 1:41 PM, Andre Merzky <andre@merzky.net> wrote:

...

-- Nothing is ever easy.

occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Andre Merzky

12:53 p.m.

Quoting [Alexis Richardson] (May 13 2009):

...

Andre

I (personally) would certainly agree with (b) **** for the draft **** so that people can start implementing and prototyping, thus creating a feedback loop so that we can go from draft to review to final.

Sure, I fully understand that. Then please consider my comments to apply to the final design, and feel free to shelve them for the time being... :)

Best, Andre.

...

alexis

On Wed, May 13, 2009 at 1:41 PM, Andre Merzky <andre@merzky.net> wrote:

...
Quoting [Sam Johnston] (May 13 2009):

...
Andre,

The diagram was previously simple enough to be comprehensible, but the (IMO implicit) error transitions if anything detract from its meaning (again IMO). We are not in any position to lock this down as we simply don't know (a) what implementors will want to do (some machines only ever really exist in a running state, for example) and (b) what tomorrow's machines will be capable of. That's why we have a registry and a todo item to review it in the context of interoperability.

Hi Sam,

Sure, you don't know what tomorrow's systems will do. But doesn't that argument lead to a discussion about extensibility, which is, IMHO, orthogonal?

I may have missed some mails about the issue, my apologies if the following has been discussed before:

The registry is one possible way to allow for extensibility of the state model. But (a) it does not help to keep the model clean (a cluttered registry is nothing you want your clients to handle), and (b) I thought that for the OCCI core, the set of states would be finite and well defined. Am I mistaken in (b)?

Thanks, Andre.

...
Sam

On Wed, May 13, 2009 at 12:09 PM, Andre Merzky <andre@merzky.net> wrote:

...

-- Nothing is ever easy.

Roger Menday

11:06 a.m.

Andre,

I agree with your email. Your state diagram looks simpler than the one on the wiki. And I believe your proposal is a lot closer to existing state modeling activities in the OGF (?)

This might be an opportunity to express some concerns I have about the verb part of noun-attribute-verb. I see the various states as sub-resources (attached as a collection to the main resource). I can read the collection to discover the complete history of the resource (something which currently is not possible?). Furthermore, using both 201 CREATED and 202 ACCEPTED, I am able to distinguish between the resource currently in the process of moving to that state, and the resource already in that state. The resource publishes which states it can transition to at any point (in this respect quite similar to the changing list of actuator URIs), but the state transition is triggered by creating a new resource in the states collection.

Thoughts?

regards, Roger
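A minimal sketch of the states-as-sub-resources idea, assuming an in-memory model rather than a real HTTP service. The class names, the `ALLOWED` transition map, and the use of 409 for a rejected request are illustrative assumptions, not part of the proposal; only the 201/202 distinction comes from the message above.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical transition map, for illustration only (not from the OCCI wiki).
ALLOWED = {
    "stopped": {"active"},
    "active": {"stopped", "suspended"},
    "suspended": {"active", "stopped"},
}


@dataclass
class StateEntry:
    name: str
    done: bool  # False while the resource is still moving to this state


@dataclass
class Resource:
    # The states collection; append order doubles as the resource history.
    states: List[StateEntry] = field(
        default_factory=lambda: [StateEntry("stopped", True)]
    )

    def current(self) -> StateEntry:
        return self.states[-1]

    def next_states(self) -> List[str]:
        """States the resource advertises as reachable right now."""
        cur = self.current()
        return sorted(ALLOWED[cur.name]) if cur.done else []

    def post_state(self, name: str, immediate: bool) -> int:
        """Simulate creating a new entry in the states collection.

        Returns 201 if the state was reached at once, 202 if the
        transition was only accepted, and 409 (an assumption) otherwise.
        """
        if name not in self.next_states():
            return 409
        self.states.append(StateEntry(name, immediate))
        return 201 if immediate else 202

    def complete(self) -> None:
        """Backend callback: the pending transition has finished."""
        self.current().done = True
```

Because entries are only ever appended, reading the collection in order gives the history, and a state that is entered twice simply appears twice.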

...

I would like to follow up on the state model thread, even if people seem to have mostly agreed upon the version documented in the wiki (http://forge.ogf.org/short/occi-wg/states)

I am really unhappy with a number of things:

- too many states
- too many state transitions with no actions (i.e. methods) attached
- unclear entry and exit points

Please allow me to elaborate

(1) Too many states:

All the XXX-ING states (STARTING, SUSPENDING, RESUMING, STOPPING) seem to have no real benefit, apart from informing the client that a state transition is in progress. State transitions, however, should, IMHO, not be part of a state diagram, unless specific actions can be performed during that transition.

For example the STOPPING state: there are no actions attached to that state, the client can do nothing but wait until the state reaches STOPPED.

So, why not go to STOPPED immediately, and add the transition process to the semantics of the STOPPED state? (e.g., "resources can still be utilized iff they are required to store or retrieve the VM state").
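Point (1) can be sketched as follows, under the assumption that the in-progress information lives in an attribute next to the state. The `settling` attribute, the class names, and the GUI labels are hypothetical; the message does not prescribe them.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    ACTIVE = "active"


@dataclass
class Machine:
    state: State = State.STOPPED
    settling: bool = False  # hypothetical attribute: backend still working

    def start(self) -> None:
        if self.state is not State.STOPPED:
            raise ValueError("start() requires STOPPED")
        # The state changes immediately; only the attribute says "in progress".
        self.state, self.settling = State.ACTIVE, True

    def stop(self) -> None:
        if self.state is not State.ACTIVE:
            raise ValueError("stop() requires ACTIVE")
        self.state, self.settling = State.STOPPED, True

    def settled(self) -> None:
        """Backend callback once the transition work is finished."""
        self.settling = False

    def label(self) -> str:
        """Text a GUI might show; the -ING names survive only as labels."""
        if not self.settling:
            return self.state.value
        return {State.ACTIVE: "starting", State.STOPPED: "stopping"}[self.state]
```

The state diagram then needs only the stable states and the client actions, while a GUI can still report that the backend is busy.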

(2) Too many internal state transitions

That is related to (1) of course: the diagram has 11 internal state transitions (no actions attached), and 5 external transitions (actions attached)....

Well, that is not exactly incorrect or anything, but it is confusing matters. A state model should help to clarify what actions cause what state change.

(3) unclear entry and exit points

There is no state where the client can be sure that no remote resources are utilized anymore - i.e., a 'Destroyed' state is missing. At some point, a client needs to have the ability to free remote resources (well, unless one has garbage collection, but we don't want to go there, right?)

STOPPED is not really the same - a STOPPED state can be moved to STARTING/ACTIVE via start(), so the backend cannot be allowed to free the resources.
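A small sketch of the exit point asked for in (3), assuming a `destroy` action and a terminal `DESTROYED` state. The action names and the transition table are illustrative assumptions, not the contents of the attached diagram.

```python
from enum import Enum


class State(Enum):
    STOPPED = "stopped"
    ACTIVE = "active"
    DESTROYED = "destroyed"


# Hypothetical action table: (state, action) -> next state.
TRANSITIONS = {
    (State.STOPPED, "start"): State.ACTIVE,
    (State.ACTIVE, "stop"): State.STOPPED,
    (State.STOPPED, "destroy"): State.DESTROYED,
    (State.ACTIVE, "destroy"): State.DESTROYED,
}


def apply(state: State, action: str) -> State:
    """Return the next state, or raise if the action is not valid here."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"{action!r} is not valid in state {state.value}") from None


def holds_remote_resources(state: State) -> bool:
    """Only DESTROYED guarantees that the backend may free everything."""
    return state is not State.DESTROYED
```

Since no entry leads out of `DESTROYED`, a client that observes it knows the resource cannot be restarted.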

I do understand that the XXX-ING states are useful for informative purposes, for example, to allow a GUI to inform the user that a start() request has been received, and the backend is spinning up the machine. Those types of information can, however, easily be provided by additional attributes, or sub-states.

Attached is an alternate state diagram. Its main purpose is to illustrate that, by limiting the number of states and internal state transitions, one can derive a much simpler (and IMHO cleaner) state diagram.

Cheers, Andre.

PS.: Sorry for not using graffle. Sources are in xfig, so I don't attach them - don't want to get laughed at for using such ancient technology ;-)

Quoting [Edmonds, AndrewX] (Apr 20 2009):

...
Updated...

-----Original Message-----

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...
Sounds like a good compromise :-)

From: Sam Johnston [mailto:samj@samj.net]
Sent: 20 April 2009 13:52
To: Edmonds, AndrewX
Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

-----Original Message-----
From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM]
Sent: 20 April 2009 13:42
To: Sam Johnston
Cc: Edmonds, AndrewX; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

As I stated before, maybe we can add another exit-point instead of a real state.

Cheers,

-Thijs

On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...
On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

I updated the state model (attached) to include an error state. Couldn't see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.
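One way to read this is that the error edge stays out of the drawn diagram and is added implicitly to every state. A minimal sketch, assuming a small illustrative subset of states and a single `error` state (both assumptions):

```python
# Explicit edges only; this subset of states is chosen for illustration.
EXPLICIT = {
    "stopped": {"starting"},
    "starting": {"active"},
    "active": {"stopping"},
    "stopping": {"stopped"},
    "error": set(),
}


def reachable(state: str) -> set:
    """Explicit successors plus the implicit error edge."""
    if state == "error":
        return set()
    return EXPLICIT[state] | {"error"}
```

With this shape, the STOPPED->ERROR case mentioned above needs no edge of its own in the table.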

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston
Sent: 20 April 2009 11:56
To: Tino Vazquez
Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should clearly define the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help implementations that rely on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions
State control

State      Transitions                    Description
aborting   aborted                        The resource encountered an error and is aborting
aborted    n/a                            The resource encountered an error and has aborted
active     pause, restart, stop, suspend  The resource is active
resuming   aborting, active               The resource is becoming active and restoring state
pausing    aborting, paused               The resource is preparing to refuse new requests
paused     aborting, resume               The resource is refusing new requests
starting   aborting, active               The resource is becoming active
stopped    start                          The resource is inactive and has no saved state
stopping   stopped, aborting              The resource is becoming inactive and destroying state
suspended  resume, stop                   The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
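The registry rows quoted above can be written down as a lookup table. This is only a sketch: the `REGISTRY` structure and the helper names are assumptions, while the states, transitions, and descriptions are copied from the rows.

```python
# state -> (transitions, description), copied from the registry rows above.
REGISTRY = {
    "aborting": (["aborted"], "The resource encountered an error and is aborting"),
    "aborted": ([], "The resource encountered an error and has aborted"),
    "active": (["pause", "restart", "stop", "suspend"], "The resource is active"),
    "resuming": (["aborting", "active"],
                 "The resource is becoming active and restoring state"),
    "pausing": (["aborting", "paused"],
                "The resource is preparing to refuse new requests"),
    "paused": (["aborting", "resume"], "The resource is refusing new requests"),
    "starting": (["aborting", "active"], "The resource is becoming active"),
    "stopped": (["start"], "The resource is inactive and has no saved state"),
    "stopping": (["stopped", "aborting"],
                 "The resource is becoming inactive and destroying state"),
    "suspended": (["resume", "stop"], "The resource is inactive and has saved state"),
}


def transitions(state: str) -> list:
    """Transitions the registry lists for a state."""
    return list(REGISTRY[state][0])


def non_state_transitions() -> list:
    """Transition names that are not themselves registry states."""
    names = {t for targets, _ in REGISTRY.values() for t in targets}
    return sorted(names - set(REGISTRY))
```

The names returned by `non_state_transitions()` are the ones that do not name a state in the table, so they presumably correspond to the user-triggered transitions the note refers to.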

-------------------------------------------------------------
Intel Ireland Limited (Branch)
Collinstown Industrial Park, Leixlip, County Kildare, Ireland
Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.


--
Thijs Metsch
Software Engineer Grid Computing
Sun Microsystems GmbH
Dr.-Leo-Ritter-Str. 7
D-93049 Regensburg
Tel: +49 (0)941 3075-122 (x60122)
http://blogs.sun.com/intheclouds
mailto:thijs.metsch@sun.com
http://www.sun.com

-- Nothing is ever easy.

<occi-states.pdf>

Roger Menday (PhD) <roger.menday@uk.fujitsu.com>
Senior Researcher, Fujitsu Laboratories of Europe Limited
Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE, U.K.
Tel: +44 (0) 208 606 4534
______________________________________________________________________
Fujitsu Laboratories of Europe Limited
Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE
Registered No. 4153469

This e-mail and any attachments are for the sole use of addressee(s) and may contain information which is privileged and confidential. Unauthorised use or copying for disclosure is strictly prohibited. The fact that this e-mail has been scanned by Trendmicro Interscan and McAfee Groupshield does not guarantee that it has not been intercepted or amended nor that it is virus-free.

Tino Vazquez

12:21 p.m.

Roger,

I quite like your approach (201 CREATED and 202 ACCEPTED); it helps model the (now implicit) "pending" state (between Entry point and Active) that I was worried about. Also, the state history looks promising, but I am not quite sure how we can order the states (which state happened before which, or what happens with duplicates). Am I missing something?

Regards,

-Tino

--
Constantino Vázquez, Grid Technology Engineer/Researcher: http://www.dsa-research.org/tinova
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

On Wed, May 13, 2009 at 1:06 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

Quoting [Edmonds, AndrewX] (Apr 20 2009):

...
Updated...

-----Original Message-----

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...
Sounds like a good compromise :-)

From: Sam Johnston [mailto:samj@samj.net]
Sent: 20 April 2009 13:52
To: Edmonds, AndrewX
Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

-----Original Message----- From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM] Sent: 20 April 2009 13:42 To: Sam Johnston Cc: Edmonds, AndrewX; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

As I stated before, maybe we can add another exit-point instead of a real state.

Cheers,

-Thijs

On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...
On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote: I updated the state model (attached) to include an error state. Couldn't see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston Sent: 20 April 2009 11:56 To: Tino Vazquez Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch

Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should define clearly the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help implementations that rely on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines wait for a host to become available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions: State control

State      Transitions                    Description
---------  -----------------------------  -----------------------------------------------------
aborting   aborted                        The resource encountered an error and is aborting
aborted    n/a                            The resource encountered an error and has aborted
active     pause, restart, stop, suspend  The resource is active
resuming   aborting, active               The resource is becoming active and restoring state
pausing    aborting, paused               The resource is preparing to refuse new requests
paused     aborting, resume               The resource is refusing new requests
starting   aborting, active               The resource is becoming active
stopped    start                          The resource is inactive and has no saved state
stopping   stopped, aborting              The resource is becoming inactive and destroying state
suspended  resume, stop                   The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
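As a rough illustration only, the registry above can be written down as a lookup table that a client or server consults before attempting a transition. This is a minimal sketch, not part of any OCCI draft: the names `REGISTRY`, `allowed` and `is_terminal` are hypothetical, and the table simply copies the "Transitions" column as posted (which mixes user actions such as `stop` with follow-on states such as `aborting`).

```python
# Hypothetical encoding of the state registry quoted above.
# Keys are states; values are the entries of the "Transitions" column,
# copied as posted (user actions and follow-on states are mixed).
REGISTRY = {
    "aborting": {"aborted"},
    "aborted": set(),  # listed as "n/a" in the registry
    "active": {"pause", "restart", "stop", "suspend"},
    "resuming": {"aborting", "active"},
    "pausing": {"aborting", "paused"},
    "paused": {"aborting", "resume"},
    "starting": {"aborting", "active"},
    "stopped": {"start"},
    "stopping": {"stopped", "aborting"},
    "suspended": {"resume", "stop"},
}


def allowed(state):
    """Return the registry's transitions for a state, sorted for display."""
    return sorted(REGISTRY[state])


def is_terminal(state):
    """A state with no listed transitions cannot be left."""
    return not REGISTRY[state]


if __name__ == "__main__":
    for name in sorted(REGISTRY):
        print(name, "->", ", ".join(allowed(name)) or "n/a")
```

Written this way, the open questions in the thread become easy to see: `aborted` is the only state without an exit, and there is no entry state such as "NEW" or "PENDING" yet.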

-------------------------------------------------------------

...
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg -- Thijs Metsch Tel: +49 (0)941 3075-122 (x60122) http://blogs.sun.com/intheclouds Software Engineer Grid Computing Sun Microsystems GmbH Dr.-Leo-Ritter-Str. 7 mailto:thijs.metsch@sun.com D-93049 Regensburg http://www.sun.com

-- Nothing is ever easy. <occi-states.pdf>_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Roger Menday (PhD) <roger.menday@uk.fujitsu.com>

Senior Researcher, Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE, U.K. Tel: +44 (0) 208 606 4534

______________________________________________________________________

Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE Registered No. 4153469

This e-mail and any attachments are for the sole use of addressee(s) and may contain information which is privileged and confidential. Unauthorised use or copying for disclosure is strictly prohibited. The fact that this e-mail has been scanned by Trendmicro Interscan and McAfee Groupshield does not guarantee that it has not been intercepted or amended nor that it is virus-free. _______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Roger Menday

12:30 p.m.

Hi Tino,

...

Roger,

I quite like your approach (201 CREATED and 202 ACCEPTED), it helps modeling the (now implicit) "pending" state (between Entry point and Active) that I was worried about.

Also, the state history looks promising, but I am not quite sure how we can order them (which state happens before which, or what happens with duplicates).

Each new state just gets added to the end of a vector, and so I think you could have duplicates. As for the ordering, I suppose that depends on the rendering, e.g. it fits nicely as an Atom feed (I could subscribe to it). (?)

regards, Roger

...

Am I missing something?

Regards,

-Tino

-- Constantino Vázquez, Grid Technology Engineer/Researcher: http://www.dsa-research.org/tinova DSA Research Group: http://dsa-research.org Globus GridWay Metascheduler: http://www.GridWay.org OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

On Wed, May 13, 2009 at 1:06 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...
Andre,

I agree with your email. Your state diagram looks simpler compared to the one on the wiki. And I believe your proposal is a lot closer to existing state modeling activities in the OGF (?)

This might be an opportunity to express some concerns I have for the verb part of noun-attribute-verb. I see the various states as sub-resources (attached as a collection to the main resource). I can read the collection to discover the complete history of the resource (something which currently is not possible?). Furthermore, using both 201 CREATED and 202 ACCEPTED, I am able to distinguish between the resource currently in the process of moving to that state, and the resource already in that state. The resource publishes which states it can transition to at any point (in this respect quite similar to the changing list of actuator URIs), but the state transition is triggered by creating a new resource in the states collection.

Thoughts ?

regards, Roger
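One way to picture the "states as a collection" idea from the two messages above is the following in-memory sketch. It is an assumption-laden illustration, not a proposed API: the class `Resource`, its methods and the `immediate` flag are all hypothetical, and the only things taken from the thread are the append-only history (duplicates allowed) and the use of 201 versus 202 to distinguish "already in that state" from "moving to that state".

```python
from dataclasses import dataclass, field
from typing import List, Optional

CREATED = 201   # the resource is already in the requested state
ACCEPTED = 202  # the resource is still moving to the requested state


@dataclass
class Resource:
    # Append-only state history; the same state may appear more than once.
    history: List[str] = field(default_factory=list)
    # State that has been requested but not yet reached, if any.
    pending: Optional[str] = None

    def request_state(self, state: str, immediate: bool) -> int:
        """Model 'creating a new resource in the states collection'.

        `immediate` stands in for whatever the backend decides; it is a
        hypothetical switch used only to keep this sketch self-contained.
        """
        if immediate:
            self.history.append(state)
            self.pending = None
            return CREATED
        self.pending = state
        return ACCEPTED

    def complete(self) -> None:
        """Called when the backend finishes a pending transition."""
        if self.pending is not None:
            self.history.append(self.pending)
            self.pending = None


if __name__ == "__main__":
    vm = Resource()
    print(vm.request_state("active", immediate=False))
    vm.complete()
    print(vm.request_state("stopped", immediate=True))
    print(vm.history)
```

Because the history is just an ordered list, the ordering question reduces to insertion order, and rendering it as a feed is a presentation choice layered on top.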

Andre Merzky

12:52 p.m.

Roger, Quoting [Roger Menday] (May 13 2009):

...

Andre,

I agree with your email. Your state diagram looks simpler compared to the one on the wiki. And I believe your proposal is a lot closer to existing state modeling activities in the OGF (?)

This might be an opportunity to express some concerns I have for the verb part of noun-attribute-verb. I see the various states as sub-resources (attached as a collection to the main resource). I can read the collection to discover the complete history of the resource (something which currently is not possible?). Furthermore, using both 201 CREATED and 202 ACCEPTED, I am able to distinguish between the resource currently in the process of moving to that state, and the resource already in that state.

Yes, that mechanism basically defines substates. Similar would be ACTIVE:INITIALIZE and ACTIVE:COMPLETED etc. That would allow to have a clean state model with a small number of meaningful states

    if ( state == 2xx )
    if ( state == ACTIVE:* ) echo 'resource is active'

and would also allow to inform (human) clients about internal details

    if ( state == *:COMPLETED ) echo 'transition to active completed'

There are several ways to express state details. It has been noted before that BES defines one possible way to do that, too.
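The wildcard checks in the pseudocode above could be sketched as follows, assuming states are plain strings of the form STATE:SUBSTATE. The helper `describe` is hypothetical; the matching itself uses `fnmatch.fnmatchcase` from the Python standard library, which is one convenient way to get shell-style `*` patterns and is not something the thread prescribes.

```python
from fnmatch import fnmatchcase


def describe(state):
    """Return the messages a client might show for a STATE:SUBSTATE string."""
    messages = []
    # Coarse check: any substate of ACTIVE counts as "active".
    if fnmatchcase(state, "ACTIVE:*"):
        messages.append("resource is active")
    # Fine-grained check, useful mainly for informing human users.
    if fnmatchcase(state, "*:COMPLETED"):
        messages.append("transition to active completed")
    return messages


if __name__ == "__main__":
    for s in ("ACTIVE:INITIALIZE", "ACTIVE:COMPLETED", "STOPPED"):
        print(s, describe(s))
```

Tools that only care about the small set of top-level states can ignore everything after the colon, which is the point of keeping the detail in substates.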

...

The resource publishes which states it can transition to at any point, (in this respect quite similar to the changing list of actuator URIs), but the state transition is triggered by creating a new resource in the states collection.

Thoughts ?

I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

Cheers, Andre.

...

regards, Roger

-- Nothing is ever easy.

Roger Menday

1:40 p.m.

...

...
The resource publishes which states it can transition to at any point, (in this respect quite similar to the changing list of actuator URIs), but the state transition is triggered by creating a new resource in the states collection.

Thoughts ?

I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

I don't share your pessimism, but then again, I also think we are getting our wires crossed. I think there is a lot of help here from HATEOAS (hypermedia as the engine of application state), which does a lot to address your concerns above (??)

Roger
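To make the HATEOAS argument concrete, here is a minimal sketch of a client that enables a button only when the resource currently advertises the matching transition. Everything about the representation is an assumption for illustration: the dictionary shape, the `links`/`rel`/`href` keys, the example URIs and the function names are hypothetical and not taken from any OCCI rendering.

```python
# Actions a hypothetical GUI knows how to draw buttons for.
KNOWN_ACTIONS = ("start", "stop", "suspend", "resume")


def available_actions(representation):
    """Collect the link relations the resource advertises right now."""
    return [link["rel"] for link in representation.get("links", [])]


def render_buttons(representation, known=KNOWN_ACTIONS):
    """Map each known action to True (enabled) or False (greyed out)."""
    offered = set(available_actions(representation))
    return {action: action in offered for action in known}


if __name__ == "__main__":
    # Hypothetical representation of an active machine.
    vm = {
        "state": "active",
        "links": [
            {"rel": "stop", "href": "/vm/1/stop"},
            {"rel": "suspend", "href": "/vm/1/suspend"},
        ],
    }
    print(render_buttons(vm))
```

Under this reading the suspend button always exists in the tool; the resource merely tells the client when pressing it makes sense, which speaks to the first of the three questions quoted above.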

...

Cheers, Andre.

...
regards, Roger

...
I would like to follow up on the state model thread, even if people seem to have mostly agreed upon the version documented in the wiki (http://forge.ogf.org/short/occi-wg/states)

I am really unhappy with a number of things:

- too many states - too many state transitions with no actions (i.e. methods) attached - unclear entry and exit points

Please allow me to elaborate

(1) Too many states:

All the XXX-ING states (STARTING, SUSPENDING, RESUMING, STOPPING) seem to have no real benefit, apart from informing the client that a state transition is in progress. State transitions, however, should, IMHO, not be part of a state diagram, unless specific actions can be performed during that transition.

For example, the STOPPING state: there are no actions attached to that state, so the client can do nothing but wait until the state reaches STOPPED.

So why not go to STOPPED immediately, and add the transition process to the semantics of the STOPPED state? (e.g., "resources can still be utilized iff they are required to store or retrieve the VM state").

(2) Too many internal state transitions

That is related to (1) of course: the diagram has 11 internal state transitions (no actions attached), and 5 external transitions (actions attached)....

Well, that is not exactly incorrect or anything, but it is confusing matters. A state model should help to clarify what actions cause what state change.

(3) unclear entry and exit points

There is no state where the client can be sure that no remote resources are utilized anymore - i.e., a 'Destroyed' state is missing. At some point, a client needs to have the ability to free remote resources (well, unless one has garbage collection, but we don't want to go there, right?)

STOPPED is not really the same - a STOPPED state can be moved to STARTING/ACTIVE via start(), so the backend cannot be allowed to free the resources.

I do understand that the XXX-ING states are useful for informative purposes, for example, to allow a GUI to inform the user that a start() request has been received, and the backend is spinning up the machine. That type of information can, however, easily be provided by additional attributes, or sub-states.

Attached is an alternate state diagram. Its main purpose is to illustrate that, by limiting the number of states and internal state transitions, one can derive a much simpler (and IMHO cleaner) state diagram.
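The idea of moving straight to the stable state and carrying the "in progress" information in a sub-state attribute can be sketched roughly as follows. This is only an illustration: the attached diagram is not reproduced here, and the class name, the state names and the `substate` attribute are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Hypothetical resource with a small fixed state set plus a sub-state.

    The state names and the 'substate' attribute are illustrative only.
    """

    state: str = "ACTIVE"
    substate: Optional[str] = None  # informational, e.g. "stopping"

    def stop(self):
        # Go to STOPPED immediately; the ongoing work is only a sub-state.
        self.state = "STOPPED"
        self.substate = "stopping"

    def backend_finished(self):
        # The backend has finished the transition; clear the sub-state.
        self.substate = None

    def label(self):
        # What a GUI could show to the user.
        if self.substate is None:
            return self.state
        return f"{self.state} ({self.substate})"
```

A client that only cares about the stable state reads `state`; a GUI that wants to show progress can additionally read `substate`.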

Cheers, Andre.

PS.: Sorry for not using graffle. Sources are in xfig, so I don't attach them - don't want to get laughed at for using such ancient technology ;-)

Quoting [Edmonds, AndrewX] (Apr 20 2009):

...
Updated...

-----Original Message-----

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...
Sounds like a good compromise :-)

From: Sam Johnston [mailto:samj@samj.net]
Sent: 20 April 2009 13:52
To: Edmonds, AndrewX
Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

-----Original Message-----
From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM]
Sent: 20 April 2009 13:42
To: Sam Johnston
Cc: Edmonds, AndrewX; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

As I stated before, maybe we can add another exit-point instead of a real state.

Cheers,

-Thijs

On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...
On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

I updated the state model (attached) to include an error state. Couldn't see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston
Sent: 20 April 2009 11:56
To: Tino Vazquez
Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should clearly define the semantics of the states. For instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help implementations that rely on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines will wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions

State control

State      Transitions                    Description
---------  -----------------------------  ------------------------------------------------------
aborting   aborted                        The resource encountered an error and is aborting
aborted    n/a                            The resource encountered an error and has aborted
active     pause, restart, stop, suspend  The resource is active
resuming   aborting, active               The resource is becoming active and restoring state
pausing    aborting, paused               The resource is preparing to refuse new requests
paused     aborting, resume               The resource is refusing new requests
starting   aborting, active               The resource is becoming active
stopped    start                          The resource is inactive and has no saved state
stopping   stopped, aborting              The resource is becoming inactive and destroying state
suspended  resume, stop                   The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
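One way to make a registry like this usable by code is to hold it as plain data and look transitions up rather than hard-wiring them. The sketch below is only an illustration: the dictionary copies the State and Transitions columns of the registry quoted above, while the names `STATE_REGISTRY`, `transitions_from` and `is_listed` are hypothetical.

```python
# Hypothetical encoding of the registry: each state maps to the entries in
# its "Transitions" column. The registry mixes user actuators (start, stop,
# ...) with follow-on states (aborting, active, ...); they are kept as listed.
STATE_REGISTRY = {
    "aborting": ["aborted"],
    "aborted": [],  # listed as "n/a" in the registry
    "active": ["pause", "restart", "stop", "suspend"],
    "resuming": ["aborting", "active"],
    "pausing": ["aborting", "paused"],
    "paused": ["aborting", "resume"],
    "starting": ["aborting", "active"],
    "stopped": ["start"],
    "stopping": ["stopped", "aborting"],
    "suspended": ["resume", "stop"],
}


def transitions_from(state):
    """Return a copy of the transitions the registry lists for a state."""
    if state not in STATE_REGISTRY:
        raise KeyError(f"state not in registry: {state!r}")
    return list(STATE_REGISTRY[state])


def is_listed(state, transition):
    """Return True if the registry lists this transition for this state."""
    return transition in STATE_REGISTRY.get(state, [])
```

Because the registry is data, adding a new state later means adding an entry, not changing the lookup code.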


_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

-- Thijs Metsch Tel: +49 (0)941 3075-122 (x60122) http://blogs.sun.com/intheclouds Software Engineer Grid Computing Sun Microsystems GmbH Dr.-Leo-Ritter-Str. 7 mailto:thijs.metsch@sun.com D-93049 Regensburg http://www.sun.com

------------------------------------------------------------- Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.


-- Nothing is ever easy.

<occi-states.pdf>

Roger Menday (PhD) <roger.menday@uk.fujitsu.com>


Sam Johnston

2:13 p.m.

On Wed, May 13, 2009 at 3:40 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

...
I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

I don't share your pessimism, but then again, I also think we are getting our wires crossed. I think there is a lot of help here from hateoas which does a lot to address your concerns above (??)

That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

The temptation is to assume that infrastructure is a simple problem with a fixed domain, but I can assure you this is not the case - without allowing for such flexibility, each implementor will find themselves with a good chance of running into functions they are not able to expose via the API, or which the API expects but which are not present (for example, if "stop" implicitly results in "destroy", should we offer "stop" at all?).

Like it or not, there will be rapid evolution in this space, and you only need to look at what weird and wonderful things people like Google and Cisco are doing with hardware to realise that we pretty much don't know what we're talking about today, let alone over the life of the protocol.

Sam
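A "dumb" client of this kind can be sketched in a few lines: it knows nothing about the state model and simply renders whatever transitions the resource advertises. The representation below is hypothetical; the field names ("state", "actions", "name", "href", "target"), the URLs, the "hibernate" action and the use of POST are assumptions for illustration, not an agreed format.

```python
# Hypothetical representation of a compute resource as a server might
# return it. Nothing here is an agreed OCCI format.
EXAMPLE_RESOURCE = {
    "state": "active",
    "actions": [
        {"name": "stop", "href": "/compute/123/stop", "target": "stopped"},
        {"name": "suspend", "href": "/compute/123/suspend", "target": "suspended"},
        # An action added server side after the client shipped.
        {"name": "hibernate", "href": "/compute/123/hibernate"},
    ],
}


def available_actions(resource):
    """Return (name, href, target) for each action the server advertises."""
    return [
        (action["name"], action["href"], action.get("target"))
        for action in resource.get("actions", [])
    ]


def render_menu(resource):
    """Build a text menu purely from what the resource advertises."""
    lines = [f"State: {resource['state']}"]
    for name, href, target in available_actions(resource):
        expected = f" (expected target: {target})" if target else ""
        lines.append(f"  [{name}] POST {href}{expected}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_menu(EXAMPLE_RESOURCE))
```

The client never mentions "hibernate" in its own code, yet the action still shows up in the menu because the server advertised it.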

Roger Menday

2:20 p.m.

On 13 May 2009, at 15:13, Sam Johnston wrote:

...

On Wed, May 13, 2009 at 3:40 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

I don't share your pessimism, but then again, I also think we are getting our wires crossed. I think there is a lot of help here from hateoas which does a lot to address your concerns above (??)

That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

...

The temptation is to assume that infrastructure is a simple problem with a fixed domain but I can assure you this is not the case - without allowing for such flexibility each implementor will find themselves having a good chance of running into functions they are not able to expose via the API, or which the API expects but which are not present (for example, if "stop" implicitly results in "destroy" should we offer "stop" at all?).

My point was that I wanted to talk about the "states as verbs" model which is currently on the wiki ...

Roger

...

Like it or not there will be rapid evolution in this space and you only need to look at what weird and wonderful things people like Google and Cisco are doing with hardware to realise that we pretty much don't know what we're talking about today let alone the life of the protocol.

Sam

Sam Johnston

2:36 p.m.

On Wed, May 13, 2009 at 4:20 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

On 13 May 2009, at 15:13, Sam Johnston wrote:

On Wed, May 13, 2009 at 3:40 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...
...
I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

I don't share your pessimism, but then again, I also think we are getting our wires crossed. I think there is a lot of help here from hateoas which does a lot to address your concerns above (??)

That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

Great. Does anyone disagree with the need for [and proposed solution offering] flexibility in the state model?

...

The temptation is to assume that infrastructure is a simple problem with a fixed domain but I can assure you this is not the case - without allowing for such flexibility each implementor will find themselves having a good chance of running into functions they are not able to expose via the API, or which the API expects but which are not present (for example, if "stop" implicitly results in "destroy" should we offer "stop" at all?).

My point was that I wanted to talk about the "states as verbs" model which is currently on the wiki ...

So the question is: do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state, or do you update its status from "RUNNING" to "STOPPED"? To me the latter is unclean, because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all? Updating the status of a "STOPPED" machine to "RUNNING" makes no sense if, for example, there's an operating system error that causes the boot to fail.

I certainly prefer pressing a "START" button, having a transitional "STARTING" state with an accurate progress indicator and eventually reaching a "RUNNING" state. There's some sense in a parametrised "transition-to" function of some sort with an array of potential target states, but I still think I prefer the actuator approach that we currently have.

Sam
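The actuator approach described here can be sketched as a small simulation: pressing "START" only requests the transition, the resource then reports "STARTING" with a progress value, and it ends up in "RUNNING" or, if the boot fails, in an error state. The class, the `tick` method, the 50% progress steps and the use of "ABORTED" as the failure state are assumptions made for this illustration.

```python
import enum


class State(enum.Enum):
    STOPPED = "STOPPED"
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    ABORTED = "ABORTED"  # assumed name for the failure outcome


class Machine:
    """Toy machine; 'boot_ok' simulates whether the guest OS boots."""

    def __init__(self, boot_ok=True):
        self.state = State.STOPPED
        self.progress = None  # percentage while a transition is in progress
        self._boot_ok = boot_ok

    def start(self):
        """Actuator: request a start. The caller does not choose the outcome."""
        if self.state is not State.STOPPED:
            raise ValueError(f"cannot start from {self.state.value}")
        self.state = State.STARTING
        self.progress = 0

    def tick(self):
        """Simulate the backend making progress on the boot."""
        if self.state is not State.STARTING:
            return
        if not self._boot_ok:
            self.state = State.ABORTED
            self.progress = None
            return
        self.progress = min(100, self.progress + 50)
        if self.progress == 100:
            self.state = State.RUNNING
            self.progress = None
```

Note that the client never writes the `state` attribute itself; it only presses the button and then observes what the resource reports.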

Chris Webb

2:50 p.m.

Sam Johnston <samj@samj.net> writes:

...

So the question is do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state or do you update its status from "RUNNING" to "STOPPED". To me the latter is unclean because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all

Indeed. We have a classic example of this in our own public cloud. For us, guests can go away by being 'destroyed' (hard kill) or because the operating system inside has executed an ACPI power-down, essentially asking to be destroyed.

We have an action 'shutdown' which sends an ACPI power-button event to the guest OS. This may result in a successful shutdown (leading to an ACPI power-down and guest destruction), it may be ignored, or it may trigger something completely different. (I've used it for server-wide SIGHUP-type behaviour before.)

Because of this, even ignoring the delay in state change, it's not clear that our 'shutdown' event meaningfully maps to any particular state change: from outside the VM abstraction, we don't know what effect on state the power-button event will actually have!

Cheers,

Chris.

Roger Menday

4:35 p.m.

On 13 May 2009, at 15:50, Chris Webb wrote:

...

Sam Johnston <samj@samj.net> writes:

...
So the question is do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state or do you update its status from "RUNNING" to "STOPPED". To me the latter is unclean because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all

Indeed. We have a classic example of this in our own public cloud. For us, guests can go away by being 'destroyed' (hard kill) or because the operating system inside has executed an ACPI power-down, essentially asking to be destroyed.

We have an action 'shutdown' which sends an ACPI power-button event to the guest OS. This may result in a successful shutdown (leading to an ACPI power-down and guest destruction), it may be ignored, or it may trigger something completely different. (I've used it for server-wide SIGHUP-type behaviour before.)

Hi Chris,

Maybe I'm missing something, but given the above, and supposing it does go wrong (doesn't end up where you expected), how do you discover, a while later, why that is so?

Roger

...

Because of this, even ignoring the delay in state change, it's not clear that our 'shutdown' event meaningfully maps to any particular state change because from outside the vm abstraction: we don't know what effect on state the power-button event will actually have!

Cheers,

Chris.

Sam Johnston

4:45 p.m.

On Wed, May 13, 2009 at 6:35 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

On 13 May 2009, at 15:50, Chris Webb wrote:

Sam Johnston <samj@samj.net> writes:

...
So the question is do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state or do you update its status from "RUNNING" to "STOPPED". To me the latter is unclean because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all

Indeed. We have a classic example of this in our own public cloud. For us, guests can go away by being 'destroyed' (hard kill) or because the operating system inside has executed an ACPI power-down, essentially asking to be destroyed.

We have an action 'shutdown' which sends an ACPI power-button event to the guest OS. This may result in a successful shutdown (leading to an ACPI power-down and guest destruction), it may be ignored, or it may trigger something completely different. (I've used it for server-wide SIGHUP-type behaviour before.)

Maybe I'm missing something, but given the above, and supposing it does go wrong (doesn't end up where you expected), how do you discover, a while later, why that is so?

Capturing errors of any asynchronous action when we're relying heavily on HTTP response codes is both difficult and necessary. I'd envisaged a Windows-style EventLog extension, which would IMO be the best place for it (that is where it usually ends up, right?)

...

...
Because of this, even ignoring the delay in state change, it's not clear that our 'shutdown' event meaningfully maps to any particular state change because from outside the vm abstraction: we don't know what effect on state the power-button event will actually have!

True, so I think we agree to stick with the verbs. It's conceivable that we have verbs like "SYNC" (for storage devices) that don't result in state changes too.

Sam

Roger Menday

4:51 p.m.

...

...
...
I am not feeling comfortable with having a state model which will change on the fly, depending what a resource will answer. For one thing, it will make user interfaces and tools really difficult to design:

- should I add a suspend button, even if it works only sometimes?
- if it worked once, will it work again?
- what can I do if it does not work? I want to suspend, dammit! ;-)

I don't share your pessimism, but then again, I also think we are getting our wires crossed. I think there is a lot of help here from hateoas which does a lot to address your concerns above (??)

That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

Great. Does anyone disagree with the need for [and proposed solution offering] flexibility in the state model?

...
The temptation is to assume that infrastructure is a simple problem with a fixed domain but I can assure you this is not the case - without allowing for such flexibility each implementor will find themselves having a good chance of running into functions they are not able to expose via the API, or which the API expects but which are not present (for example, if "stop" implicitly results in "destroy" should we offer "stop" at all?).

My point was that I wanted to talk about the "states as verbs" model which is currently on the wiki ...

So the question is do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state or do you update its status from "RUNNING" to "STOPPED". To me the latter is unclean because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all - updating the status of a "STOPPED" machine to "RUNNING" makes no sense if there's an operating system error that causes the boot to fail for example. I certainly prefer pressing a "START" button, having a transitional "STARTING" state with an accurate progress indicator and eventually reaching a "RUNNING" state. There's some sense in a parametrised "transition-to" function of some sort with an array of potential target states but I still think I prefer the actuator approach that we currently have.

Hi Sam,

My thoughts on both your post and that of Chris too. I hope I'm reading your email correctly.

I like posting to a collection, because it gives an obvious holder for the state collection, which is then my state history, should I need it. I can also publish a membership policy (which changes) of what 'types' of new resources can go into a collection. This is useful to build the payload. It is also a record of failed transitions.

I don't think this ignores the possibility of a delay or failure in state change. After POSTing to the collection, returning a 202 means that it has been accepted for processing (if I understand correctly). So there is an (implied) "in-process" sub-state. And the accepted-for-processing resource is then a good one to ask why it went wrong, if something did go wrong ... (?)

That's something I don't see in the actuator approach (again, I might be missing something), as I don't see that it is creating new resources.

Roger
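The "POST to a states collection" idea can be sketched with an in-memory stand-in for the collection. Every request becomes an entry that is kept as history, an accepted request is answered with 202 and sits in an "in-process" sub-state, and a failed request keeps its reason. The class and method names, the entry fields and the use of 409 for a request outside the membership policy are assumptions; only the 202 response comes from the message above.

```python
class StateCollection:
    """In-memory stand-in for a resource's states collection (hypothetical)."""

    def __init__(self, initial, policy):
        self.current = initial
        self.policy = policy  # membership policy: state -> allowed targets
        self.history = []     # every request ever posted, including failures

    def post(self, target):
        """Mimic POSTing a new state resource; return (status_code, entry)."""
        entry = {
            "id": len(self.history) + 1,
            "target": target,
            "status": "rejected",
            "reason": None,
        }
        self.history.append(entry)
        if target not in self.policy.get(self.current, ()):
            entry["reason"] = f"{target} not permitted from {self.current}"
            return 409, entry  # assumed status code for a policy violation
        entry["status"] = "in-process"
        return 202, entry      # accepted for processing, not yet complete

    def complete(self, entry, ok, reason=None):
        """Called when the backend reports how an accepted request ended."""
        if entry["status"] != "in-process":
            raise ValueError("entry is not in process")
        if ok:
            entry["status"] = "done"
            self.current = entry["target"]
        else:
            entry["status"] = "failed"
            entry["reason"] = reason
```

Asking "why did it go wrong?" then amounts to reading the `reason` field of the entry that was returned with the 202.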

...

Sam


Andre Merzky

8:08 p.m.

Quoting [Sam Johnston] (May 13 2009):

...

...
...
That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

Great. Does anyone disagree with the need for [and proposed solution offering] flexibility in the state model?

Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's OK. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

Best, Andre.

...

The temptation is to assume that infrastructure is a simple problem with a fixed domain but I can assure you this is not the case - without allowing for such flexibility each implementor will find themselves having a good chance of running into functions they are not able to expose via the API, or which the API expects but which are not present (for example, if "stop" implicitly results in "destroy" should we offer "stop" at all?).

My point was that I wanted to talk about the "states as verbs" model which is currently on the wiki ...

So the question is do you ask a "RUNNING" resource to "STOP" by pressing a button in order to get it to the "STOPPED" state or do you update its status from "RUNNING" to "STOPPED". To me the latter is unclean because who are you to say you're going to get to that state immediately, or indeed that you'll even get there at all - updating the status of a "STOPPED" machine to "RUNNING" makes no sense if there's an operating system error that causes the boot to fail for example. I certainly prefer pressing a "START" button, having a transitional "STARTING" state with an accurate progress indicator and eventually reaching a "RUNNING" state. There's some sense in a parametrised "transition-to" function of some sort with an array of potential target states but I still think I prefer the actuator approach that we currently have. Sam


-- Nothing is ever easy.

Sam Johnston

8:23 p.m.

On Wed, May 13, 2009 at 10:08 PM, Andre Merzky <andre@merzky.net> wrote:

...

Quoting [Sam Johnston] (May 13 2009):

...
...
...
That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

Great. Does anyone disagree with the need for [and proposed solution offering] flexibility in the state model?

Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's OK. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't think anyone knows every possible thing that users are going to want to do with the API (I certainly don't have the confidence to say I do anyway) but we may need to revisit this point in the name of interop... Atom categories would be one way to achieve this (e.g. "Cold Reboot" and "Warm Reboot" might go in the "restart" category).

Sam
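The category idea can be sketched as follows: the server advertises actions with free-form titles, each tagged with a well-known category, so a client that only understands "restart" can still find something to invoke. The field names ("title", "category", "href"), the URLs and the helper names are assumptions for illustration, loosely modelled on Atom link and category metadata.

```python
# Hypothetical list of advertised actions; not an agreed OCCI format.
ADVERTISED = [
    {"title": "Cold Reboot", "category": "restart", "href": "/compute/123/cold-reboot"},
    {"title": "Warm Reboot", "category": "restart", "href": "/compute/123/warm-reboot"},
    {"title": "Stop", "category": "stop", "href": "/compute/123/stop"},
]


def group_by_category(actions):
    """Group action titles under the category each one is tagged with."""
    groups = {}
    for action in actions:
        groups.setdefault(action["category"], []).append(action["title"])
    return groups


def first_in_category(actions, category):
    """Pick the first advertised action in a category, or None if absent."""
    for action in actions:
        if action["category"] == category:
            return action
    return None
```

A richer client can present both reboot variants to the user by title, while a simple interoperable one falls back to `first_in_category(actions, "restart")`.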

Gary Mazz

14 May 14 May

6:45 a.m.

New subject: OCCI MC - State Machine Diagram Please clarify

I'm in agreement with Sam on this point.

I have a hard time reconciling the initial state and final state as I'm here writing some code.

The initial state is prior to creation (instantiation) of the entity. The final state is the state after destruction. The error state also leads to a non-existence state without a destruction phase. Additionally, there is a destroyed state but no instantiated state. Once destroyed, aren't you at the final state, non-existence? You can't have a destroyed state if you don't exist to maintain the state. This state table makes sense if you are looking at it from the perspective of a log file, but it's confusing from the perspective of an object lifecycle, especially in the case of destroyed without parity to instantiated. Trivial as it may appear.

How does restart on error occur in this model? It appears restart can only occur when you're active? When does the active get the start trigger, or can't you have an inactive event? Does the active/restart depict the life cycle of the object contents?

How does the state diagram reconcile restart on error and hold on error? Or some other action on error, like snapshot?

Second, it may not be appropriate for one type of user to gain access to details. A customer may only need to know their "service" is stuck in a specific state, while a support engineer may see a more detailed view of the state and a development engineer may see another, more detailed view of the state. You may not want the customer to see all the underlying details. A service provider may only want customers to see 3 states: loading, running, stopped.

-gary

Sam Johnston wrote:

...

On Wed, May 13, 2009 at 10:08 PM, Andre Merzky <andre@merzky.net> wrote:

Quoting [Sam Johnston] (May 13 2009):
>
>>> That was exactly the point of introducing both together - given that
>>> most of the innovation is going to happen server side, clients
>>> should be as dumb as possible. That is, it doesn't matter if a new
>>> state comes along after a client has shipped because it will be
>>> advertised as a potential transition (HATEOAS), perhaps even with
>>> the expected target state.
>>
>> I totally agree.
>
> Great. Does anyone disagree with the need for [and proposed solution
> offering] flexibility in the state model?

Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's OK. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't think anyone knows every possible thing that users are going to want to do with the API (I certainly don't have the confidence to say I do anyway) but we may need to revisit this point in the name of interop... Atom categories would be one way to achieve this (e.g. "Cold Reboot" and "Warm Reboot" might go in the "restart" category).

Sam

------------------------------------------------------------------------

_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Sam Johnston

10:40 a.m.

New subject: OCCI MC - State Machine Diagram Please clarify

On Thu, May 14, 2009 at 8:45 AM, Gary Mazz <garymazzaferro@gmail.com> wrote:

...

I'm in agreement with Sam on this point.

I have a hard time reconciling the initial state and final state as I'm here writing some code.

Great, so there's [at least] two of us writing code now - the more the merrier.

...

The initial state is prior to creation (instantiation) of the entity. The final state is the state after destruction. The error state also leads to a non-existence state without a destruction phase. Additionally, there is a destroyed state but no instantiated state. Once destroyed, aren't you at the final state, non-existence? You can't have a destroyed state if you don't exist to maintain the state. This state table makes sense if you are looking at it from the perspective of a log file, but it's confusing from the perspective of an object lifecycle, especially in the case of destroyed without parity to instantiated. Trivial as it may appear.

How does restart on error occur in this model? It appears restart can only occur when you're active? When does the active get the start trigger, or can't you have an inactive event? Does the active/restart depict the life cycle of the object contents?

How does the state diagram reconcile restart on error and hold on error? Or some other action on error, like snapshot?

These sound like great examples as to why we shouldn't try to beat reality into a fixed state diagram (or vice versa), except perhaps at a very high level (and even then I have my doubts).

...

Second, it may not be appropriate for every type of user to gain access to details. A customer may only need to know their "service" is stuck in a specific state, while a support engineer may see a more detailed view of the state and a development engineer may see another, more detailed view of the state. You may not want the customer to see all the underlying details. A service provider may only want customers to see 3 states: loading, running, stopped.

This is a very good point and something I think is well covered by our existing (extremely simple) security model: implementors decide what to show their clients based on who is connected. This is an example of one of those asynchronous errors not handled by HTTP - something we're going to have to have a look at in more detail for all asynchronous actions in due course. Sam
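A minimal sketch of the "implementors decide what to show" idea, assuming a provider keeps one internal state and derives what each audience sees from it. The role names, the customer-facing mapping, and the `detail` string are all hypothetical; only the state names and the three customer states (loading, running, stopped) come from the thread.

```python
# Hypothetical mapping from internal states to the three states a customer sees.
CUSTOMER_VIEW = {
    "STARTING": "loading",
    "RESUMING": "loading",
    "ACTIVE": "running",
    "STOPPING": "stopped",
    "STOPPED": "stopped",
    "SUSPENDING": "stopped",
    "SUSPENDED": "stopped",
}


def visible_state(internal_state: str, role: str, detail: str = "") -> str:
    """Return the state string a given (hypothetical) role is allowed to see."""
    if role == "customer":
        # Customers only ever see the coarse three-state view.
        return CUSTOMER_VIEW[internal_state]
    if role == "support":
        # Support engineers see the internal state name as-is.
        return internal_state
    if role == "developer":
        # Developers additionally see any diagnostic detail attached to the state.
        return f"{internal_state} ({detail})" if detail else internal_state
    raise ValueError(f"unknown role: {role}")


print(visible_state("RESUMING", "customer"))
print(visible_state("RESUMING", "developer", "restoring memory image"))
```

The internal state machine stays the same for everyone; only the projection differs per connected user, so nothing in the protocol itself has to change.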


Andre Merzky

8:23 a.m.

Quoting [Sam Johnston] (May 13 2009):

That was exactly the point of introducing both together - given that most of the innovation is going to happen server side, clients should be as dumb as possible. That is, it doesn't matter if a new state comes along after a client has shipped because it will be advertised as a potential transition (HATEOAS), perhaps even with the expected target state.

I totally agree.

Great. Anyone doesn't agree with the need for [and proposed solution offering] flexibility in the state model?

Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's ok. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't see that. If I want to write a client tool which starts a resource, I want to make sure the resource is in RUNNING state when the client reports success. But if that client (a) has to infer the available states from a registry, it cannot possibly know which state has the semantic meaning of RUNNING attached. Further (b), if the client only sees those state transitions it is allowed in its current state, how does it know what transition path to take to reach that target state? Is it (I am making those up obviously):

INITIAL -> create() -> CREATED -> elevate() -> ELEVATED -> run() -> RUNNING

or

INITIAL -> create() -> CREATED -> init() -> INITIALIZED -> run() -> RUNNING

Or should the tool simply fail because it cannot see a run() transition in its INITIAL state?

I think HATEOAS works pretty well if a human is in the loop who can parse the available transition description, and deduce a semantic meaning. I don't think it makes for simple tooling, really.

Then again, I may misunderstand the proposed usage of HATEOAS in OCCI. So, can you help me out: what mechanism will avoid the confusion from the example above, if a vendor can provide init() and elevate() transitions on the fly, with no predefined semantics attached? How would my tool deduce the transition path it needs to enact?

Many thanks, Andre.
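The path question above can be made concrete with a small sketch. It assumes the client holds the whole transition table (here the second made-up path from the message, written as a plain dictionary) and searches it breadth-first; `TRANSITIONS` and `find_path` are illustrative names, not part of any OCCI draft.

```python
from collections import deque

# The second made-up path, as a static table: state -> {transition name: target state}.
TRANSITIONS = {
    "INITIAL": {"create": "CREATED"},
    "CREATED": {"init": "INITIALIZED"},
    "INITIALIZED": {"run": "RUNNING"},
    "RUNNING": {},
}


def find_path(transitions, start, goal):
    """Breadth-first search; returns the list of transition names, or None if unreachable."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, target in transitions.get(state, {}).items():
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [name]))
    return None


# With the full table the path is computable up front.
full = find_path(TRANSITIONS, "INITIAL", "RUNNING")

# A client that only sees the transitions of its current state cannot plan ahead.
partial = find_path({"INITIAL": {"create": "CREATED"}}, "INITIAL", "RUNNING")
```

With the full table the search yields `create`, `init`, `run`; with only the current state's transitions visible it returns `None`, so such a client would have to take one step, re-read the resource, and search again.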


-- Nothing is ever easy.

Sam Johnston

9:55 a.m.

On Thu, May 14, 2009 at 10:23 AM, Andre Merzky <andre@merzky.net> wrote:

...

Quoting [Sam Johnston] (May 13 2009):

...
...
Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's ok. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't see that. If I want to write a client tool which starts a resource, I want to make sure the resource is in RUNNING state when the client reports success. But if that client (a) has to infer the available states from a registry, it cannot possibly know which state has the semantic meaning of RUNNING attached. Further (b), if the client only sees those state transitions it is allowed in its current state, how does it know what transition path to take to reach that target state? Is it (I am making those up obviously):

INITIAL -> create() -> CREATED -> elevate() -> ELEVATED -> run() -> RUNNING

or

INITIAL -> create() -> CREATED -> init() -> INITIALIZED -> run() -> RUNNING

Or should the tool simply fail because it cannot see a run() transition in its INITIAL state?

The client must at least know how to create a resource, and once it has done so successfully a "start" actuator will appear, perhaps with a target state of "running" (TBD). In that case it knows that if it pulls the "start" handle, eventually the resource should end up "running". Otherwise it could know (from the registry) that "start" is the right button to push, but that's starting to break HATEOAS principles. We have options - it's just a matter of finding the right one.
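A minimal sketch of the "actuator with a target state" option, assuming the server returns a representation that lists its currently available actuators. The dictionary layout and the field names (`actuators`, `name`, `target`, `href`) are hypothetical, since the thread leaves the wire format TBD.

```python
def choose_actuator(representation, wanted_state):
    """Return the advertised actuator whose declared target is wanted_state, else None."""
    for actuator in representation.get("actuators", []):
        if actuator.get("target") == wanted_state:
            return actuator
    return None


# Hypothetical representation of a freshly created resource.
created = {
    "state": "stopped",
    "actuators": [
        {"name": "start", "target": "running", "href": "/compute/123/start"},
    ],
}

to_running = choose_actuator(created, "running")
to_suspended = choose_actuator(created, "suspended")
```

Here the client never hardwires the name "start"; it only needs to know which target state it wants. The catch raised earlier still applies: this finds a single step, not a multi-step path, and it relies on "running" carrying an agreed meaning.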

...

I think HATEOAS works pretty well if a human is in the loop who can parse the available transition description, and deduce a semantic meaning. I don't think it makes for simple tooling, really.

I agree that humans are better at this stuff than computers but I'm unconvinced this translates to complex tooling.

...

Then again, I may misunderstand the proposed usage of HATEOAS in OCCI. So, can you help me out: what mechanism will avoid the confusion from the example above, if a vendor can provide init() and elevate() transitions on the fly, with no predefined semantics attached? How would my tool deduce the transition path it needs to enact?

The semantics for common functions will be in the registry. It's ones that are uncommon and impossible to predict like "translate" and "migrate" that we're catering for here, and generally there will need to be some kind of client side support for these.

As I said below, "we may need to revisit this point in the name of interop", and I suggested categories as one possible solution (e.g. a "starting" vs a "stopping" transition)... parametrised transition calls are another... for example, how do I tell something to start *without* saved state if saved state is present (ala cold start vs resume)?

Sam
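The two options mentioned above, categories and parametrised transition calls, might look roughly like this. Everything here is an assumption for illustration: the `category` and `params` fields, the `discard_saved_state` parameter, and the URLs are invented; only the "Cold Reboot"/"Warm Reboot" in a "restart" category example comes from the thread.

```python
def actuators_in_category(actuators, category):
    """Return the advertised actuators tagged with the given category."""
    return [a for a in actuators if a.get("category") == category]


def build_call(actuator, **params):
    """Pair an actuator's link with parameters, rejecting ones it did not advertise."""
    unknown = set(params) - set(actuator.get("params", ()))
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {"href": actuator["href"], "params": params}


# Hypothetical actuators a server might advertise on one resource.
advertised = [
    {"name": "Cold Reboot", "category": "restart", "href": "/compute/123/cold-reboot"},
    {"name": "Warm Reboot", "category": "restart", "href": "/compute/123/warm-reboot"},
    {"name": "start", "category": "start", "href": "/compute/123/start",
     "params": ["discard_saved_state"]},
]

restarts = actuators_in_category(advertised, "restart")
start = actuators_in_category(advertised, "start")[0]
# Cold start: ask the server to ignore any saved state (hypothetical parameter).
cold_start = build_call(start, discard_saved_state=True)
```

A client that understands only the category can still offer every vendor-specific restart variant, while a parameter keeps "cold start vs resume" on a single actuator instead of multiplying transitions.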


Alexis Richardson

9:59 a.m.

+1 to Sam's "we may need to revisit this point in the name of interop"

At this stage we are shooting for a draft. The draft will let people implement prototypes which will let us debug interop and refine the model.

Roger Menday

10:52 a.m.

On 14 May 2009, at 10:59, Alexis Richardson wrote:

...

+1 to Sam's "we may need to revisit this point in the name of interop"

I'm not sure if this is *just* an interop thing ...

I thought my suggestions yesterday on how to transition state, error reporting, handling 'processing' states, etc ... were reasonable.

Kind of disappointed this morning that I didn't get some feedback from you guys ... :(

Roger

...

At this stage we are shooting for a draft. The draft will let people implement prototypes which will let us debug interop and refine the model.


Alexis Richardson

10:55 a.m.

Roger

What specific points did you most want feedback on?

a

Roger Menday

11:17 a.m.

On 14 May 2009, at 11:55, Alexis Richardson wrote:

...

Roger

What specific points did you most want feedback on?

Maybe I am getting actuators wrong ... it could be. I like the Sun API, and they use actuators. But, actually, actuators don't seem that great, and they have filtered down into the verb part of the noun-attribute-verb model. And so, I think it is interesting to explore other ways of doing it, which address (these overlap)

- async concerns
- error reporting
- offering potential of something more than just pressing a button (move a few dials then press a button, for ex)

A different perspective is that the state should be part of the model, not an enumerable defined in a registry.

Roger


Roger Menday (PhD) <roger.menday@uk.fujitsu.com>

Senior Researcher, Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE, U.K. Tel: +44 (0) 208 606 4534

______________________________________________________________________ Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE Registered No. 4153469

This e-mail and any attachments are for the sole use of addressee(s) and may contain information which is privileged and confidential. Unauthorised use or copying for disclosure is strictly prohibited. The fact that this e-mail has been scanned by Trendmicro Interscan and McAfee Groupshield does not guarantee that it has not been intercepted or amended nor that it is virus-free.

Sam Johnston

11:34 a.m.

On Thu, May 14, 2009 at 1:17 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

On 14 May 2009, at 11:55, Alexis Richardson wrote:

Roger

...
What specific points did you most want feedback on?

Maybe I am getting actuators wrong ... it could be, I like the Sun API, and they use actuators. But, actually, actuators don't seem that great, and they have filtered down into the verb part of the noun-attribute-verb model. And so, I think it is interesting to explore other ways of doing it, which address (these overlap)

- async concerns

Status/error fields would be one simple solution to this problem - we already have HTTP 20x status codes to indicate that something is indeed async. An eventlog extension is another.

...

- error reporting

As above.

...

- offering potential of something more than just pressing a button (move a few dials then press a button, for example)

This is likely to be an absolute requirement - I don't want to have to press the start button 50 times to get 50 instances so there will need to be some way to create-en-masse for a start (probably with the flexibility of ranges ala amazon). I may also need to specify cold start vs warm start (e.g. with or without state). I've not really even started thinking about this yet, beyond knowing that there are a few potential solutions.

...

A different perspective is that the state should be part of the model, not an enumeration defined in a registry.

Sure, this is the more formal approach but it's also more rigid/brittle. As we can't hope to predict every state that will ever exist, least of all in a fashion that is understandable/acceptable to everyone, we need to be flexible. State diagrams are as much a personal preference, as evidenced by the many that exist for different projects - Microsoft vs VMware vs DMTF vs OGF being the ones I've looked at. Sam

Roger Menday

12:58 p.m.

Sam

...

Maybe I am getting actuators wrong ... it could be, I like the Sun API, and they use actuators. But, actually, actuators don't seem that great, and they have filtered down into the verb part of the noun-attribute-verb model. And so, I think it is interesting to explore other ways of doing it, which address (these overlap)

- async concerns

Status/error fields would be one simple solution to this problem - we already have HTTP 20x status codes to indicate that something is indeed async.

I think that is a good way. What I understand about the actuator approach, or rather what I have not seen documented anywhere, is that POSTing to an actuator URI doesn't seem to create a resource anywhere. So, we can't do that right now. OK, maybe that is something for OCCI 2.0, but why not do it now ?

...

An eventlog extension is another.

- error reporting

As above.

- offering potential of something more than just pressing a button (move a few dials then press a button, for example)

This is likely to be an absolute requirement - I don't want to have to press the start button 50 times to get 50 instances so there will need to be some way to create-en-masse for a start (probably with the flexibility of ranges ala amazon). I may also need to specify cold start vs warm start (e.g. with or without state). I've not really even started thinking about this yet, beyond knowing that there are a few potential solutions.

So, actuators don't seem to cover this (?) Modeling state discovery *and* request as part of a model does seem to be a good alternative.

...

A different perspective is that the state should be part of the model, not an enumeration defined in a registry.

Sure, this is the more formal approach but it's also more rigid/brittle. As we can't hope to predict every state that will ever exist, least of all in a fashion that is understandable/acceptable to everyone, we need to be flexible.

Yes, but, the people need to conform to some extent. That's why we are having this JSON vs ATOM discussion, right ? They also need to conform to a basic state model, and then allow states to be specialized if necessary. Don't you actually get this if you have state modeled as part of a class model; a simple one that everyone can live with - rather like the one in the OGF Reference Model - together with sub-classing for more specialist cases ? So, generically processed by everything at some level. That takes care of brittleness, right ?

Roger

ps. Actually, I would've thought that having the state of something as a stream of updates inside a feed would appeal to an "atom/atompp for occi" proponent ?

...

State diagrams are as much a personal preference, as evidenced by the many that exist for different projects - Microsoft vs VMware vs DMTF vs OGF being the ones I've looked at.

...

Sam

Sam Johnston

1:14 p.m.

On Thu, May 14, 2009 at 2:58 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

Sam

...
Maybe I am getting actuators wrong ... it could be, I like the Sun API, and they use actuators. But, actually, actuators don't seem that great, and they have filtered down into the verb part of the noun-attribute-verb model. And so, I think it is interesting to explore other ways of doing it, which address (these overlap)

- async concerns

Status/error fields would be one simple solution to this problem - we already have HTTP 20x status codes to indicate that something is indeed async.

I think that is a good way.

What I understand about the actuator approach, or rather what I have not seen documented anywhere, is that POSTing to an actuator URI doesn't seem to create a resource anywhere. So, we can't do that right now.

OK, maybe that is something for OCCI 2.0, but why not do it now ?

You need a resource to find an actuator so just poll the same resource for status updates - that is, we can have a verbose text status field in addition to the enum. To clarify:

- Actuators that act immediately will return 200 OK.
- Those that don't will return 202 Accepted.
- Those that create a resource will return 201 Created with a Location: header pointing at the newly created resource (where that could presumably be a feed of resources if you asked for multiple instances).

Ok by you?
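As a rough illustration of the three status codes described above, a client could branch on the response like this. This is only a sketch: the helper name `next_step` and its return values are made up for the example and are not part of any OCCI draft.

```python
from typing import Optional, Tuple


def next_step(status: int, location: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Decide what a client does after POSTing to an actuator (hypothetical helper)."""
    if status == 200:
        # The actuator acted immediately; nothing more to do.
        return ("done", None)
    if status == 202:
        # Accepted but not finished: poll the resource that exposed the actuator.
        return ("poll", None)
    if status == 201:
        # A resource was created; the Location header says where it lives.
        if not location:
            raise ValueError("201 Created without a Location header")
        return ("follow", location)
    raise ValueError(f"unexpected status {status}")
```

For a 202 the client keeps polling the same resource it found the actuator on, which matches the "poll the same resource for status updates" suggestion.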

...

An eventlog extension is another.

...
- error reporting

As above.

...
- offering potential of something more than just pressing a button (move a few dials then press a button, for example)

This is likely to be an absolute requirement - I don't want to have to press the start button 50 times to get 50 instances so there will need to be some way to create-en-masse for a start (probably with the flexibility of ranges ala amazon). I may also need to specify cold start vs warm start (e.g. with or without state). I've not really even started thinking about this yet, beyond knowing that there are a few potential solutions.

So, actuators don't seem to cover this (?)

Modeling state discovery *and* request as part of a model does seem to be a good alternative.

It just means you need to pass parameters to the actuator via a GET or POST request (GET comes to mind first but POST works with more/larger parameters)... these exceptional cases can go in the registry... for start it might be 'min-instances' and 'max-instances' for example.
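A parametrised actuator call of this kind might be built as follows. This is a sketch under assumptions: the form encoding, the URL and the function name `build_start_request` are invented for illustration, and only the parameter names 'min-instances' and 'max-instances' come from the message. It uses POST, one of the two methods mentioned, and only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_start_request(actuator_url: str, min_instances: int, max_instances: int) -> Request:
    """Build (but do not send) a parametrised POST to a 'start' actuator."""
    if not 1 <= min_instances <= max_instances:
        raise ValueError("need 1 <= min_instances <= max_instances")
    # Parameter names follow the 'min-instances'/'max-instances' idea from the message.
    body = urlencode({
        "min-instances": min_instances,
        "max-instances": max_instances,
    }).encode("ascii")
    return Request(
        actuator_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Passing the result to `urllib.request.urlopen` would actually press the button, so the sketch stops short of that.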

...

...
A different perspective is that the state should be part of the model, not an enumeration defined in a registry.

Sure, this is the more formal approach but it's also more rigid/brittle. As we can't hope to predict every state that will ever exist, least of all in a fashion that is understandable/acceptable to everyone, we need to be flexible.

Yes, but, the people need to conform to some extent. That's why we are having this JSON vs ATOM discussion, right ? They also need to conform to a basic state model, and then allow states to be specialized if necessary. Don't you actually get this if you have state modeled as part of a class model; a simple one that everyone can live with - rather like the one in the OGF Reference Model - together with sub-classing for more specialist cases ?

I'm still having trouble making this reconcile - force people into your model and you are sure to break something for someone, unless you make it so high level as to be meaningless. Sub-classing and categories have the same result - multiple actuators bundled under one heading... if I want to stop the device do I press "shutdown", "halt", "acpioff", "poweroff" or just choose one at random? Better bet is to let implementors make sensible decisions like "try shutdown, wait a minute then do acpioff followed by poweroff if we're still not where we want to be"... think of the "Force Quit" and "End Task" options in Windows. States that don't fit are things like "archiving", "restoring", "backing up", "cloning", "transferring", "transforming"... who's to say there's not going to be a "teleporting" state one of these days? It won't be anarchy and even if it were we'd just add some constraints... in any case we won't know until we suck it and see.
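The "try shutdown, wait a minute, then acpioff, then poweroff" idea could look roughly like this on the implementor's side. It is a sketch only: `press` and `is_stopped` are hypothetical callbacks supplied by the caller, and waiting after every step (rather than only after the first) is an assumption of the example.

```python
import time
from typing import Callable, Optional, Sequence


def stop_with_escalation(
    press: Callable[[str], None],
    is_stopped: Callable[[], bool],
    actuators: Sequence[str] = ("shutdown", "acpioff", "poweroff"),
    wait_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Press progressively harsher actuators until the resource reports stopped."""
    for name in actuators:
        press(name)          # push this button
        sleep(wait_seconds)  # give the resource time to react
        if is_stopped():
            return name      # the actuator that got us there
    return None              # nothing worked; the caller decides what to do next
```

The client only ever asks for "stop"; which of the concrete actuators ends up being used stays an implementation detail.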

...

So, generically processed by everything at some level. That takes care of brittleness, right ?

Roger

ps. Actually, I would've thought that having the state of something as a stream of updates inside a feed would appeal to an "atom/atompp for occi" proponent ?

Providing a link to a feed of syslog/eventlog style entries is something that I had envisaged, and doing something similar for state changes/requests (even as a subset of same) does sound sensible. Sam

Roger Menday

1:22 p.m.

...

...
This is likely to be an absolute requirement - I don't want to have to press the start button 50 times to get 50 instances so there will need to be some way to create-en-masse for a start (probably with the flexibility of ranges ala amazon). I may also need to specify cold start vs warm start (e.g. with or without state). I've not really even started thinking about this yet, beyond knowing that there are a few potential solutions.

So, actuators don't seem to cover this (?)

Modeling state discovery *and* request as part of a model does seem to be a good alternative.

It just means you need to pass parameters to the actuator via a GET or POST request (GET comes to mind first but POST works with more/larger parameters)...

I hope I don't get a visit from the REST police for this comment, but wouldn't the GET here be frowned upon ... (??)

Roger

Sam Johnston

1:38 p.m.

On Thu, May 14, 2009 at 3:22 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

...
It just means you need to pass parameters to the actuator via a GET or POST request (GET comes to mind first but POST works with more/larger parameters)...

I hope I don't get a visit from the REST police for this comment, but wouldn't the GET here be frowned upon ... (??)

Meh, it doesn't matter if you push a button, poke it with a stick, kick it or use the force - the result is the same. If anyone is to be getting a visit from the REST police I guess it's me but on reviewing Tim's insightful RESTful Casuistry <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry> post (and the comments) I think we're on the right track here... he talks about GET but I think either would do. I don't mind this being an exception - as Peter Keene points out <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry#c1237598640.454904> this is "not really an appropriately decomposed task for a REST architecture". Seth Ladd makes an interesting suggestion <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry#c1237592841.237066> about modeling activities as nouns like "running" and "backing up", which could coexist. Elegant but complicated - let's KISS with what we have. Sam

Roger Menday

2:24 p.m.

...

It just means you need to pass parameters to the actuator via a GET or POST request (GET comes to mind first but POST works with more/larger parameters)...

I hope I don't get a visit from the REST police for this comment, but wouldn't the GET here be frowned upon ... (??)

Meh, it doesn't matter if you push a button, poke it with a stick, kick it or use the force - the result is the same. If anyone is to be getting a visit from the REST police I guess it's me but on reviewing Tim's insightful RESTful Casuistry post (and the comments) I think we're on the right track here... he talks about GET but I think either would do.

Wouldn't GET be prone to some kind of spider GETting all links it came across (for indexing) causing data centre havoc ?

...

I don't mind this being an exception - as Peter Keene points out this is "not really an appropriately decomposed task for a REST architecture". Seth Ladd makes an interesting suggestion about modeling activities as nouns like "running" and "backing up",

So, Seth Ladd's comment is essentially how I see it too. A collection of states.

...

which could coexist. Elegant but complicated

... so, a server might be of the "commissioned" and "currently_unavailable" classes ?? That is a nice suggestion. An implementor could choose to sub-class "currently_unavailable" with "teleporting" - some extra information that a client could use, or just ignore.

Roger

...

- let's KISS with what we have.

Sam Johnston

2:55 p.m.

On Thu, May 14, 2009 at 4:24 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

Meh, it doesn't matter if you push a button, poke it with a stick, kick it or use the force - the result is the same. If anyone is to be getting a visit from the REST police I guess it's me but on reviewing Tim's insightful RESTful Casuistry <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry> post (and the comments) I think we're on the right track here... he talks about GET but I think either would do.

Wouldn't GET be prone to some kind of spider GETting all links it came across (for indexing) causing data centre havoc ?

Details - not if we don't include authentication information, check user agents, etc. but if we feel it necessary to keep sharp implements out of the way of the children then we can.

...

I don't mind this being an exception - as Peter Keene points out <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry#c1237598640.454904> this is "not really an appropriately decomposed task for a REST architecture". Seth Ladd makes an interesting suggestion <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry#c1237592841.237066> about modeling activities as nouns like "running" and "backing up",

So, Seth Ladd's comment is essentially how I see it too. A collection of states.

A collection of states might be useful where machines are regularly in multiple states (e.g. "running" and "backing up") but I'm unconvinced there's sufficient demand/justification for the additional complexity. Most importantly, it doesn't prohibit any useful functionality that I can conceive (e.g. you can still backup and get progress via some indicator).

...

... so, a server might be of the "commissioned" and "currently_unavailable" classes ?? That is a nice suggestion. An implementor could choose to sub-class "currently_unavailable" with "teleporting" - some extra information that a client could use, or just ignore.

Yes it (kind of) makes sense, but I'm having to exercise brain cells to understand something that should be child's play. Make the case for it (e.g. use cases that can't be satisfied without it) and we'll dig deeper. Sam

Tim Bray

4:33 p.m.

On May 14, 2009, at 7:55 AM, Sam Johnston wrote:

...

...
Meh, it doesn't matter if you push a button, poke it with a stick, kick it or use the force - the result is the same. If anyone is to be getting a visit from the REST police I guess it's me but on reviewing Tim's insightful RESTful Casuistry post (and the comments) I think we're on the right track here... he talks about GET but I think either would do.

Wouldn't GET be prone to some kind of spider GETting all links it came across (for indexing) causing data centre havoc ?

Details - not if we don't include authentication information, check user agents, etc. but if we feel it necessary to keep sharp implements out of the way of the children then we can.

Here's the rule (http://www.w3.org/TR/webarch/#safe-interaction): "Agents do not incur obligations by retrieving a representation." That is to say, spiders and pre-fetching proxies and so on are perfectly entitled to GET any link they see, and if mayhem ensues, it's the server's fault, not the client. People often say "GET should not cause a change in state" which is wrong, because it writes logfile entries and sets cookies and all sorts of stuff. -Tim

Sam Johnston

4:53 p.m.

On Thu, May 14, 2009 at 6:33 PM, Tim Bray <Tim.Bray@sun.com> wrote:

...

Wouldn't GET be prone to some kind of spider GETting all links it came across (for indexing) causing data centre havoc ?

Details - not if we don't include authentication information, check user agents, etc. but if we feel it necessary to keep sharp implements out of the way of the children then we can.

Here's the rule (http://www.w3.org/TR/webarch/#safe-interaction): "Agents do not incur obligations by retrieving a representation."

That is to say, spiders and pre-fetching proxies and so on are perfectly entitled to GET any link they see, and if mayhem ensues, it's the server's fault, not the client. People often say "GET should not cause a change in state" which is wrong, because it writes logfile entries and sets cookies and all sorts of stuff.

Ok so I guess I mis-parsed the part of your post <http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry> where you said "*The reboot and halt buttons don’t really have any state, so you shouldn’t expect anything useful from a GET*". I'm guessing a "POST" is the best way to "push" a button then? Sam

Tim Bray

5:07 p.m.

On May 14, 2009, at 9:53 AM, Sam Johnston wrote:

...

Here's the rule (http://www.w3.org/TR/webarch/#safe-interaction): "Agents do not incur obligations by retrieving a representation."

That is to say, spiders and pre-fetching proxies and so on are perfectly entitled to GET any link they see, and if mayhem ensues, it's the server's fault, not the client. People often say "GET should not cause a change in state" which is wrong, because it writes logfile entries and sets cookies and all sorts of stuff.

Ok so I guess I mis-parsed the part of your post where you said "The reboot and halt buttons don’t really have any state, so you shouldn’t expect anything useful from a GET". I'm guessing a "POST" is the best way to "push" a button then?

Right. I was getting whining from people about the notion of a "write-only resource" and I was pushing back, saying "What's the problem with that? If there's a red button on the side of a machine that makes it restart when you push it, the button has no meaningful state, you can only push it"

So yes, I think POST is a good choice for an actuator. If you look at the discussion in http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry and then the comments, you'll see that people proposed a bunch of other ideas, some of them quite clever and some verging on the metaphysical, but I think that just doing a POST hits a nice 80/20 point both in terms of comprehensibility for the client and tractability for the implementor.

-Tim
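One way a server could treat an actuator as a write-only resource is to accept POST and refuse everything else, so that a spider following links with GET cannot press the button. The sketch below is an assumption-laden illustration: answering other methods with 405 and an Allow header, and answering POST with 202, are choices made for the example, not something the thread agreed on, and `actuator_response` is a hypothetical name.

```python
from http import HTTPStatus
from typing import Dict, Tuple


def actuator_response(method: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to a write-only actuator."""
    if method.upper() == "POST":
        # Pressing the button: accept the request for (possibly async) processing.
        return (HTTPStatus.ACCEPTED.value, {})
    # GET and anything else must not trigger the action.
    return (HTTPStatus.METHOD_NOT_ALLOWED.value, {"Allow": "POST"})
```

A cron job using curl can still press the button with a plain POST, while a crawler's GET is harmless.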

Sam Johnston

5:14 p.m.

On Thu, May 14, 2009 at 7:07 PM, Tim Bray <Tim.Bray@sun.com> wrote:

...

On May 14, 2009, at 9:53 AM, Sam Johnston wrote:

Here's the rule (http://www.w3.org/TR/webarch/#safe-interaction): "Agents do not incur obligations by retrieving a representation."

That is to say, spiders and pre-fetching proxies and so on are perfectly entitled to GET any link they see, and if mayhem ensues, it's the server's fault, not the client. People often say "GET should not cause a change in state" which is wrong, because it writes logfile entries and sets cookies and all sorts of stuff.

Ok so I guess I mis-parsed the part of your post where you said "The reboot and halt buttons don’t really have any state, so you shouldn’t expect anything useful from a GET". I'm guessing a "POST" is the best way to "push" a button then?

Right. I was getting whining from people about the notion of a "write-only resource" and I was pushing back, saying "What's the problem with that? If there's a red button on the side of a machine that makes it restart when you push it, the button has no meaningful state, you can only push it"

So yes, I think POST is a good choice for an actuator. If you look at the discussion in http://www.tbray.org/ongoing/When/200x/2009/03/20/Rest-Casuistry and then the comments, you'll see that people proposed a bunch of other ideas, some of them quite clever and some verging on the metaphysical, but I think that just doing a POST hits a nice 80/20 point both in terms of comprehensibility for the client and tractability for the implementor.

A key use case for this is having brain-dead user agents like curl (from cron) and failover scripts being able to push buttons routinely or in an emergency. Putting a clear plastic lid over such things is standard practice in engineering circles... same may as well apply here. Do you have any comments as to the suitability of states-as-a-collection-of-nouns per Seth's suggestion on your post? As I said above, sounds elegant but complicated and probably unnecessary in this context (can't think of a use case that needs this, let alone an important one)... I have no problems whatsoever with write-only resources. Sam

Sam Johnston

11:24 a.m.

On Thu, May 14, 2009 at 12:52 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

On 14 May 2009, at 10:59, Alexis Richardson wrote:

+1 to Sam's "we may need to revisit this point in the name of interop"

...
I'm not sure if this is *just* an interop thing ...

I thought my suggestions yesterday on how to transition state, error reporting, handling 'processing' states, etc ... were reasonable.

Kind of disappointed this morning that I didn't get some feedback from you guys ... :(

Roger, I was working on OCCI until 5am this morning and while this is by far the most interesting part of the work it's only half of the problem. The other half, adoption/marketing, is boring grunt work that keeps us organisers very much on our toes, preventing us from being responsive at times. In particular the absence of consensus around formats has put me in a fairly awkward position for my scheduled talk at Prague on Tuesday (which was due yesterday) and is jeopardising previously agreed deadlines that I have been advertising heavily in my own name. It doesn't help that I don't really share your concerns about states being a problem and am confident both that what we have will work and that it will invariably be refined in due course. Thanks for your understanding - I think you would be surprised to see how much behind-the-scenes work goes on in constantly driving this kind of initiative forward, which is why one has to be 100% committed to, and believe in, the cause. Sam

...

...
At this stage we are shooting for a draft. The draft will let people implement prototypes which will let us debug interop and refine the model.

On Thu, May 14, 2009 at 10:55 AM, Sam Johnston <samj@samj.net> wrote:

...
On Thu, May 14, 2009 at 10:23 AM, Andre Merzky <andre@merzky.net> wrote:

...
Quoting [Sam Johnston] (May 13 2009):

...
Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's ok. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't see that. If I want to write a client tool which starts a resource, I want to make sure the resource is in RUNNING state when the client reports success. But if that client (a) has to infer the available states from a registry, it cannot possibly know which state has the semantic meaning of RUNNING attached. Further (b), if the client only sees those state transitions it is allowed in its current state, how does it know what transition path to take to reach that target state? Is it (I am making those up obviously):

INITIAL -> create() -> CREATED -> elevate() -> ELEVATED -> run() -> RUNNING

or

INITIAL -> create() -> CREATED -> init() -> INITIALIZED -> run() -> RUNNING

Or should the tool simply fail because it cannot see a run() transition in its INITIAL state?
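To make question (b) concrete: if a client did have the whole transition table up front, finding the path would be a plain breadth-first search. The table in the usage note copies the second made-up path from the message, and `find_path` is a hypothetical helper. A client that only sees the transitions available in its current state does not have this table, which is exactly the concern being raised.

```python
from collections import deque
from typing import Dict, List, Optional


def find_path(
    transitions: Dict[str, Dict[str, str]], start: str, goal: str
) -> Optional[List[str]]:
    """Breadth-first search for the shortest list of transitions from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for action, next_state in transitions.get(state, {}).items():
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, path + [action]))
    return None  # goal not reachable from start
```

With `{"INITIAL": {"create": "CREATED"}, "CREATED": {"init": "INITIALIZED"}, "INITIALIZED": {"run": "RUNNING"}}` as the table, searching from `"INITIAL"` to `"RUNNING"` yields the transitions create, init, run in that order.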

The client must at least know how to create a resource and when it has done so successfully a "start" actuator will appear, perhaps with a target state of "running" (TBD). In that case it knows that if it pulls the "start" handle eventually the resource should end up "running". Otherwise it could know (from the registry) that "start" is the right button to push, but that's starting to break HATEOAS principles. We have options - it's just a matter of finding the right one.
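A client following this approach might pick its next action by the advertised target state rather than by a hard-wired actuator name. The representation below (a dict with an "actuators" list carrying "name", "href" and "target") is purely an assumption for the sketch, since the message itself marks the target-state idea as TBD.

```python
from typing import Optional


def find_actuator(resource: dict, target_state: str) -> Optional[dict]:
    """Return the first advertised actuator whose target state matches, if any."""
    for actuator in resource.get("actuators", []):
        if actuator.get("target") == target_state:
            return actuator
    return None


# Hypothetical representation of a freshly created, stopped resource.
resource = {
    "state": "stopped",
    "actuators": [
        {"name": "start", "href": "/compute/42/start", "target": "running"},
    ],
}

chosen = find_actuator(resource, "running")
```

Here the client never needs to know the name "start" in advance; it only needs to know that it wants the resource to end up "running".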

...
I think HATEOAS works pretty well if a human is in the loop who can parse the available transition description, and deduce a semantic meaning. I don't think it makes for simple tooling, really.

I agree that humans are better at this stuff than computers but I'm unconvinced this translates to complex tooling.

...
Then again, I may misunderstand the proposed usage of HATEOAS in OCCI. So, can you help me out: what mechanism will avoid the confusion from the example above, if a vendor can provide init() and elevate() transitions on the fly, with no predefined semantics attached? How would my tool deduce the transition path it needs to enact?

The semantics for common functions will be in the registry. It's ones that are uncommon and impossible to predict like "translate" and "migrate" that we're catering for here, and generally there will need to be some kind of client side support for these.

As I said below, "we may need to revisit this point in the name of interop", and I suggested categories as one possible solution (e.g. a "starting" vs a "stopping" transition)... parametrised transition calls are another... for example, how do I tell something to start *without* saved state if saved state is present (ala cold start vs resume)?

Sam

...
I don't think anyone knows every possible thing that users are going to want to do with the API (I certainly don't have the confidence to say I do anyway) but we may need to revisit this point in the name of interop... Atom categories would be one way to achieve this (e.g. "Cold Reboot" and "Warm Reboot" might go in the "restart" category). Sam


Roger Menday

12:07 p.m.

On 14 May 2009, at 12:24, Sam Johnston wrote:

...

On Thu, May 14, 2009 at 12:52 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

On 14 May 2009, at 10:59, Alexis Richardson wrote:

+1 to Sam's "we may need to revisit this point in the name of interop"

I'm not sure if this is *just* an interop thing ...

I thought my suggestions yesterday on how to transition state, error reporting, handling 'processing' states, etc ... were reasonable.

Kind of disappointed this morning that I didn't get some feedback from you guys ... :(

Roger, I was working on OCCI until 5am this morning and while this is by far the most interesting part of the work it's only half of the problem. The other half, adoption/marketing, is boring grunt work that keeps us organisers very much on our toes, preventing us from being responsive at times.

Hi Sam, I really wasn't questioning your commitment ! I'm interested in learning/participating/influencing - why I was hopeful for a reply. Roger

...

In particular the absence of consensus around formats has put me in a fairly awkward position for my scheduled talk at Prague on Tuesday (which was due yesterday) and is jeopardising previously agreed deadlines that I have been advertising heavily in my own name. It doesn't help that I don't really share your concerns about states being a problem and am confident both that what we have will work and that it will invariably be refined in due course.

...

Thanks for your understanding - I think you would be surprised to see how much behind-the-scenes work goes on in constantly driving this kind of initiative forward, which is why one has to be 100% committed to, and believe in, the cause.

Sam

At this stage we are shooting for a draft. The draft will let people implement prototypes which will let us debug interop and refine the model.

On Thu, May 14, 2009 at 10:55 AM, Sam Johnston <samj@samj.net> wrote: On Thu, May 14, 2009 at 10:23 AM, Andre Merzky <andre@merzky.net> wrote:

Quoting [Sam Johnston] (May 13 2009):

Yes, me, I don't think HATEOAS should be applied in this context. But I realise/accept that I may be the only one with that opinion - that's ok. So I'll say it here one last time, for the record, and then will shut up: "a static simple state model allows for very simple clients. Extensions can be defined via substates, or additional transitions."

I would counterargue that HATEOAS allows for even simpler clients because they don't have to worry about hardwiring even a simple state model. Using HTTP we can even feed them plain $LANG descriptions of what the transitions and targets are - it doesn't get any easier than that and you don't have to worry about updating clients to implement new goodies.

I don't see that. If I want to write a client tool which starts a resource, I want to make sure the resource is in RUNNING state when the client reports success. But if that client (a) has to infer the available states from a registry, it cannot possibly know which state has the semantic meaning of RUNNING attached. Further (b), if the client only sees those state transitions it is allowed in its current state, how does it know what transition path to take to reach that target state? Is it (I am making those up obviously):

INITIAL -> create() -> CREATED -> elevate() -> ELEVATED () -> run() -> RUNNING

or

INITIAL -> create() -> CREATED -> init() -> INITIALIZED -> run() -> RUNNING

Or should the tool simply fail because it cannot see a run() transition in its INITIAL state?

The client must at least know how to create a resource and when it has done so successfully a "start" actuator will appear, perhaps with a target state of "running" (TBD). In that case it knows that if it pulls the "start" handle eventually the resource should end up "running". Otherwise it could know (from the registry) that "start" is the right button to push, but that's starting to break HATEOAS principles. We have options - it's just a matter of finding the right one.
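As a rough illustration of the mechanism described above (not anything the group has agreed), here is a minimal Python sketch of a client that looks only at the actuators advertised for the current state and picks the one whose target matches the state it wants. The dictionary layout and the names `pick_actuator`, `actuators` and the target value "running" are assumptions made for this example; the target-state idea itself is still marked TBD above.

```python
# Hypothetical view of a resource as a HATEOAS-style client might receive it:
# only the current state and the actuators valid in that state are exposed.
# Each actuator name maps to the state it is advertised to lead to.

def pick_actuator(resource, wanted_state):
    """Return the name of an advertised actuator targeting wanted_state, or None."""
    for name, target in resource["actuators"].items():
        if target == wanted_state:
            return name
    return None


# A freshly created resource advertises "start" with a target of "running".
created = {"state": "stopped", "actuators": {"start": "running"}}

# A resource that has not been created yet only advertises "create".
initial = {"state": "initial", "actuators": {"create": "stopped"}}

print(pick_actuator(created, "running"))  # start
print(pick_actuator(initial, "running"))  # None
```

The second call returns None, which is the "should the tool simply fail" case from Andre's example: a single advertised hop does not tell the client how to plan a multi-step path.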

I think HATEOAS works pretty well if a human is in the loop who can parse the available transition description, and deduce a semantic meaning. I don't think it makes for simple tooling, really.

I agree that humans are better at this stuff than computers but I'm unconvinced this translates to complex tooling.

Then again, I may misunderstand the proposed usage of HATEOAS in OCCI. So, can you help me out: what mechanism will avoid the confusion from the example above, if a vendor can provide init() and elevate() transitions on the fly, with no predefined semantics attached? How would my tool deduce the transition path it needs to enact?

The semantics for common functions will be in the registry. It's ones that are uncommon and impossible to predict like "translate" and "migrate" that we're catering for here, and generally there will need to be some kind of client side support for these.

As I said below, "we may need to revisit this point in the name of interop", and I suggested categories as one possible solution (e.g. a "starting" vs a "stopping" transition)... parametrised transition calls are another... for example, how do I tell something to start *without* saved state if saved state is present (ala cold start vs resume)?

Sam

I don't think anyone knows every possible thing that users are going to want to do with the API (I certainly don't have the confidence to say I do anyway) but we may need to revisit this point in the name of interop... Atom categories would be one way to achieve this (e.g. "Cold Reboot" and "Warm Reboot" might go in the "restart" category). Sam
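To make the category idea a little more concrete, here is a small Python sketch in which the client knows only a fixed set of category names and lets the server supply whatever actuators it likes inside them. The list-of-dicts layout and the function name `actuators_in_category` are assumptions for illustration; only "Cold Reboot", "Warm Reboot" and the "restart" category come from the message above.

```python
# Hypothetical actuator list as a server might advertise it. The client is
# assumed to understand category names only, not individual actuator names.
advertised = [
    {"name": "Cold Reboot", "category": "restart"},
    {"name": "Warm Reboot", "category": "restart"},
    {"name": "Stop", "category": "stop"},
]


def actuators_in_category(actuators, category):
    """Return the names of all advertised actuators in the given category."""
    return [a["name"] for a in actuators if a["category"] == category]


print(actuators_in_category(advertised, "restart"))
```

A tool that just wants "some kind of restart" could then take any entry from the returned list, while a human-facing client could show all of them.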

-- Nothing is ever easy.

_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg


Roger Menday (PhD)

<roger.menday@uk.fujitsu.com>

Senior Researcher, Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE, U.K. Tel: +44 (0) 208 606 4534

______________________________________________________________________ Fujitsu Laboratories of Europe Limited Hayes Park Central, Hayes End Road, Hayes, Middlesex, UB4 8FE Registered No. 4153469

This e-mail and any attachments are for the sole use of addressee(s) and may contain information which is privileged and confidential. Unauthorised use or copying for disclosure is strictly prohibited. The fact that this e-mail has been scanned by Trendmicro Interscan and McAfee Groupshield does not guarantee that it has not been intercepted or amended nor that it is virus-free.


Sam Johnston

12:12 p.m.

On Thu, May 14, 2009 at 2:07 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

On 14 May 2009, at 12:24, Sam Johnston wrote:

On Thu, May 14, 2009 at 12:52 PM, Roger Menday < roger.menday@uk.fujitsu.com> wrote:

...
On 14 May 2009, at 10:59, Alexis Richardson wrote:

+1 to Sam's "we may need to revisit this point in the name of interop"

...
I'm not sure if this is *just* an interop thing ...

I thought my suggestions yesterday on how to transition state, error reporting, handling 'processing' states, etc ... were reasonable.

Kind of disappointed this morning that I didn't get some feedback from you guys ... :(

Roger, I was working on OCCI until 5am this morning and while this is by far the most interesting part of the work it's only half of the problem. The other half, adoption/marketing, is boring grunt work that keeps us organisers very much on our toes, preventing us from being responsive at times.

Hi Sam,

I really wasn't questioning your commitment!

I'm interested in learning/participating/influencing - why I was hopeful for a reply.

My point is just that it's not always possible to address every point immediately. On this one I tend to agree with Alexis - implementation will iron out the little wrinkles provided we take care of the big ones. Sam

Alexis Richardson

12:16 p.m.

Roger,

On Thu, May 14, 2009 at 1:07 PM, Roger Menday <roger.menday@uk.fujitsu.com> wrote:

...

I'm interested in learning/participating/influencing - why I was hopeful for a reply.

You are extremely welcome to influence and participate - it is vital. I'm hoping that you'll get some feedback once the US wakes up. Please keep up the commentary, it does help even if people don't reply. alexis


Edmonds, AndrewX

13 May 13 May

1:17 p.m.

This is a nice refinement of the current state model and I don't believe we lose anything, in the context of core finite states, with this new representation :-) As you say Andre, it's clearer and simpler.

Andy

-----Original Message-----
From: Andre Merzky [mailto:andremerzky@gmail.com] On Behalf Of Andre Merzky
Sent: 13 May 2009 11:10
To: Edmonds, AndrewX
Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M
Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi all,

I would like to follow up on the state model thread, even if people seem to have mostly agreed upon the version documented in the wiki (http://forge.ogf.org/short/occi-wg/states).

I am really unhappy with a number of things:

- too many states
- too many state transitions with no actions (i.e. methods) attached
- unclear entry and exit points

Please allow me to elaborate.

(1) Too many states

All the XXX-ING states (STARTING, SUSPENDING, RESUMING, STOPPING) seem to have no real benefit, apart from informing the client that a state transition is in progress. State transitions, however, should, IMHO, not be part of a state diagram, unless specific actions can be performed during that transition.

For example the STOPPING state: there are no actions attached to that state, the client can do nothing but wait until the state reaches STOPPED. So, why not go to STOPPED immediately, and add the transition process to the semantics of the STOPPED state? (e.g., "resources can still be utilized iff they are required to store or retrieve the VM state").

(2) Too many internal state transitions

That is related to (1) of course: the diagram has 11 internal state transitions (no actions attached), and 5 external transitions (actions attached)... Well, that is not exactly incorrect or anything, but it is confusing matters. A state model should help to clarify what actions cause what state change.

(3) Unclear entry and exit points

There is no state where the client can be sure that no remote resources are utilized anymore - i.e., a 'Destroyed' state is missing. At some point, a client needs to have the ability to free remote resources (well, unless one has garbage collection, but we don't want to go there, right?). STOPPED is not really the same - a STOPPED state can be moved to STARTING/ACTIVE via start(), so the backend cannot be allowed to free the resources.

I do understand that the XXX-ING states are useful for informative purposes, for example, to allow a GUI to inform the user that a start() request has been received, and the backend is spinning up the machine. Those types of information can, however, easily be provided by additional attributes, or sub-states.

Attached is an alternate state diagram. Its main purpose is to illustrate that, by limiting the number of states and internal state transitions, one can derive a much simpler (and IMHO cleaner) state diagram.

Cheers, Andre.

PS.: Sorry for not using graffle. Sources are in xfig, so I don't attach them - don't want to get laughed at for using such ancient technology ;-)

Quoting [Edmonds, AndrewX] (Apr 20 2009):
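One possible reading of Andre's suggestion, sketched in Python: the model keeps only stable states, an action moves the resource to its target state immediately, and the fact that the backend is still working is carried by a separate informational attribute. The class name `Resource`, the attribute `in_transition` and the method `settle` are hypothetical names for this sketch and do not come from the attached diagram.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Stable states only; progress is an attribute, not a state of its own."""
    state: str = "STOPPED"
    in_transition: bool = False  # hypothetical informational sub-state

    def _move(self, source, target):
        # Every state change is caused by an explicit action (external transition).
        if self.state != source:
            raise ValueError(f"cannot go to {target} from {self.state}")
        self.state = target
        self.in_transition = True  # backend is still working on it

    def start(self):
        self._move("STOPPED", "ACTIVE")

    def stop(self):
        self._move("ACTIVE", "STOPPED")

    def settle(self):
        # Called when the backend has finished the work behind the last action.
        self.in_transition = False


vm = Resource()
vm.start()
print(vm.state, vm.in_transition)  # ACTIVE True
vm.settle()
print(vm.state, vm.in_transition)  # ACTIVE False
```

A GUI could still show "starting" by combining `state` and `in_transition`, while a simple client only ever has to understand the stable states.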

...

Updated...

-----Original Message-----

I agree - this is fine with me.

On Mon, 2009-04-20 at 14:02 +0100, Edmonds, AndrewX wrote:

...
Sounds like a good compromise :-)

From: Sam Johnston [mailto:samj@samj.net] Sent: 20 April 2009 13:52 To: Edmonds, AndrewX Cc: Thijs.Metsch@Sun.COM; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

I'd tend to side with Thijs on this one - having a complete state machine is a lot less important when the implementation can tell its clients what the next states are (and the humans what they mean in plain $LANGUAGE). That said, if you want to add a dotted loop back to the start then that might help the perfectionists sleep easy.

Sam

On Mon, Apr 20, 2009 at 2:47 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

Then conceptually, how do we recover from an error state if we've exited? Also if we want to support error state recovery, should the state model reflect this?

-----Original Message----- From: Thijs.Metsch@Sun.COM [mailto:Thijs.Metsch@Sun.COM] Sent: 20 April 2009 13:42 To: Sam Johnston Cc: Edmonds, AndrewX; occi-wg@ogf.org; Molino, VictorX M Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

As I stated before, maybe we can add another exit-point instead of a real state.

Cheers,

-Thijs

On Mon, 2009-04-20 at 14:39 +0200, Sam Johnston wrote:

...
On Mon, Apr 20, 2009 at 2:29 PM, Edmonds, AndrewX <andrewx.edmonds@intel.com> wrote:

I updated the state model (attached) to include an error state. Couldn't see how to delete attachments so maybe someone here has better luck? I also began to add identifiers to each process transition but the diagram started to become cluttered so left them out.

I would suggest that it's better to keep this clean (though I'm impressed that you got as far as you did!). You can reach an error state from anywhere - if a machine is found to be corrupt even while stopped then a STOPPED->ERROR transition makes sense for example.

Sam

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston Sent: 20 April 2009 11:56 To: Tino Vazquez Cc: occi-wg@ogf.org; Molino, VictorX M; Thijs Metsch

Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

Hi Tino,

A lot of this is covered in the state registry (which I'll copy below for you) but comments inline nonetheless.

On Mon, Apr 20, 2009 at 12:35 PM, Tino Vazquez <tinova@fdi.ucm.es> wrote:

Howdy everyone,

Excellent thread. My three cents:

1) I think we should define clearly the semantics of the states. for instance, what is the difference between STOPPED and SUSPENDED? Is it that with SUSPENDED the state is saved and not with STOPPED?

Yes, it's exactly that. From the state registry: STOPPED = "The resource is inactive and has no saved state" and SUSPENDED="The resource is inactive and has saved state"

2) I really think we need an entry state like "PENDING" or "DEFINED". It will help in implementation relying on a best-effort scheduler to match VMs and hosts, like EC2 does. This will be the state where machines will wait for a host to be available to run on. Also, I don't really think that a machine entering its life cycle in SUSPENDED state is a good idea.

I tend to agree, but I'd like the terminology to be completely unambiguous... something like "NEW" [for this AS].

3) +1 to the "CRASHED", "ERROR" or "FAILED" state.

ABORT[ING|ED] = "The resource encountered an error and is aborting/has aborted".

What do you think?

Some of these transitions take a while so some way of indicating progress (especially interesting for long tasks like live migrations) would be useful. Would prefer a mechanism that worked universally for the API.

Sam

Extensions

State control

State     | Transitions                   | Description
aborting  | aborted                       | The resource encountered an error and is aborting
aborted   | n/a                           | The resource encountered an error and has aborted
active    | pause, restart, stop, suspend | The resource is active
resuming  | aborting, active              | The resource is becoming active and restoring state
pausing   | aborting, paused              | The resource is preparing to refuse new requests
paused    | aborting, resume              | The resource is refusing new requests
starting  | aborting, active              | The resource is becoming active
stopped   | start                         | The resource is inactive and has no saved state
stopping  | stopped, aborting             | The resource is becoming inactive and destroying state
suspended | resume, stop                  | The resource is inactive and has saved state

Note: Stable states and user transitions in bold.
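For anyone wanting to experiment, the registry table above can be transcribed into a plain Python mapping and queried. This is only a sketch: the names `REGISTRY` and `allowed` are made up for the example, and the "Transitions" column is copied as-is, so it mixes user actuators (e.g. start, stop) with the next states of transitional entries (e.g. aborting, active).

```python
# State registry transcribed from the table above: state -> listed transitions.
REGISTRY = {
    "aborting": ["aborted"],
    "aborted": [],  # listed as n/a
    "active": ["pause", "restart", "stop", "suspend"],
    "resuming": ["aborting", "active"],
    "pausing": ["aborting", "paused"],
    "paused": ["aborting", "resume"],
    "starting": ["aborting", "active"],
    "stopped": ["start"],
    "stopping": ["stopped", "aborting"],
    "suspended": ["resume", "stop"],
}


def allowed(state, transition):
    """Return True if the registry lists this transition for the given state."""
    return transition in REGISTRY.get(state, [])


print(allowed("active", "suspend"))   # True
print(allowed("stopped", "suspend"))  # False
```

Unknown states simply yield False here; a real client would probably want to treat them differently from a known state with a disallowed transition.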

-------------------------------------------------------------
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.


-- Thijs Metsch Tel: +49 (0)941 3075-122 (x60122) http://blogs.sun.com/intheclouds Software Engineer Grid Computing Sun Microsystems GmbH Dr.-Leo-Ritter-Str. 7 mailto:thijs.metsch@sun.com D-93049 Regensburg http://www.sun.com


Tino Vazquez

20 Apr 20 Apr

12:46 p.m.

Hi Sam,

Thanks for pointing out the registry, I missed it somehow. I like the extensions idea, quite neat.

Also, I'm happy with a new "NEW" state and having "ABORTED" as the failed state, although I'm not sure if we should use this for both 1) the case that the machine has crashed due to some error and 2) the user cancels the machine.

-Tino

On Mon, Apr 20, 2009 at 12:56 PM, Sam Johnston <samj@samj.net> wrote:

...


Chris Webb

1:16 p.m.

Alexis Richardson <alexis.richardson@gmail.com> writes:

...

I am interested in how EH and GG deal with exceptions. Chris?

We try to flag all the significant errors synchronously during the create call. If you get success back, a VM exists and is running with the requested drives and network interfaces. Conversely you'll always get an immediate error back if (say) you try to specify an IP that doesn't belong to you, or use a drive which doesn't exist, or has exclusive locking enabled and is already mounted elsewhere, or whatever.

Our API only operates at the virtual machine level. As far as we're concerned, measuring or interfering with the guest OS other than by providing virtual hardware for it would be a gross layering violation. Since like Amazon we have no concept of stopped servers at the API level (they exist in the web interface for convenience), this means we only have one user-visible guest state: if a guest exists at all, it is active and running.

Migration of storage and guests within our infrastructure is only allowed if it is completely transparent to users, so again this isn't signalled to unprivileged users. Thus the only cases we have to deal with are when a guest exits (ACPI power down) or if an infrastructure host explodes and a guest must be revived (which looks like a hard reset from outside).

At infrastructure level, our API between the management system and the individual hosts is the same API our users use, but with extra 'privileged' features. There we have HTTP callbacks ('callback:exit' key) available to signal when a guest disappears or is revived following a host crash, which are used internally for billing amongst other things. Although I don't think these are exposed to our end users through our unprivileged API yet, they will be.

(However, for what it's worth, real users seem to prefer to do their "I've just booted" and "I'm going to shut down now" notifications within their guest OSes where they get a free choice of mechanism, and usually monitor their virtual machines over IP like they would physical servers, so we never had anyone ask for the callback stuff.)

Cheers,

Chris.

Thijs Metsch

10:08 a.m.

Good idea! We can either add it as a state or at least as a possible exit-point (next to the existing one).

Cheers,

-Thijs

On Mon, 2009-04-20 at 10:58 +0100, Edmonds, AndrewX wrote:

...

Mornin!

After looking at the state model with a colleague (Victor), it was brought up that there are no exception states captured. Might it be appropriate to insert an error state such as “ERROR” or “CRASHED”? The assumption of the state model currently is that no failures occur but we should design with this in mind.

Now perhaps failure is mitigated by provider internal strategies in which case exceptions are not revealed to a client but if a provider does not have advanced recovery strategies there’s no way to signal exceptions to a client (we should cater for lowest common denominator).

Andy

From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Sam Johnston Sent: 20 April 2009 09:20 To: Thijs Metsch Cc: occi-wg@ogf.org Subject: Re: [occi-wg] OCCI MC - State Machine Diagram

On Mon, Apr 20, 2009 at 9:57 AM, Thijs Metsch <Thijs.Metsch@sun.com> wrote:

Ah sorry, my fault - then it was only that the picture was not updated... I was missing the entry-points :-)

Finally worked out how to remove/update attachments. Done.

Sam


Michael Richardson

1:09 p.m.

"Andre" == Andre Merzky <andre@merzky.net> writes:

Andre> For example, assume that I have a client which implements the
Andre> core specification only, thus only knows the STOPPED, ACTIVE
Andre> and SUSPENDED states (your original figure). What is that
Andre> client supposed to do if the backend reports a PAUSED state?

How did it get to that state? Since the client didn't put it into that state, it means that an operator or error must have forced this, so it's an error. Life does not always go as planned.

If the client does not do PAUSED state, the system shouldn't get into that state. The client, upon seeing a state that it does not understand, could attempt to recover by requesting a transition to a primary state that it does understand.

Andre> BTW, I agree with Krishna's point that ENTRY and EXIT points
Andre> are useful.

Where are they attached?

--
] Y'avait une poule de jammé dans l'muffler!!!!!!!!! | firewalls [
] Michael Richardson, Sandelman Software Works, Ottawa, ON |net architect[
] mcr@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [
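The recovery rule suggested here can be sketched in a few lines of Python. The set of known states is taken from Andre's example (STOPPED, ACTIVE, SUSPENDED); the function name `recovery_target` and the choice of STOPPED as the default primary state to fall back to are assumptions for illustration only.

```python
# States a core-only client understands, per Andre's example above.
KNOWN_STATES = {"STOPPED", "ACTIVE", "SUSPENDED"}


def recovery_target(reported_state, primary_state="STOPPED"):
    """Return the state to request if reported_state is not understood.

    Returns None when the reported state is known and no recovery is needed.
    """
    if reported_state in KNOWN_STATES:
        return None
    # Unknown state: treat it as an error and ask for a state we do understand.
    return primary_state


print(recovery_target("PAUSED"))  # STOPPED
print(recovery_target("ACTIVE"))  # None
```

Whether the server would actually offer a transition from the unknown state to the chosen primary state is a separate question that this sketch does not answer.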

David Snelling

18 Apr 18 Apr

7:40 p.m.

Folks,

I write this note as a fellow OGF working group chair. In the Reference Model working group we have been working toward a Reference Model, which includes a life cycle state model. Please have a look at the sections in our draft document that pertain to life cycle and see if you can map your states sensibly onto/into these.

The idea is that your states, which map directly, are aliases for those in the Reference Model. Any of yours that don't map directly, should be sub-states of one from the reference model. If this is not possible, please let us know. Ours is a work in progress, so it is possible that there are faults in our model.

Please come talk to us at OGF26.

See: http://forge.gridforum.org/sf/go/doc14766?nav=1

On 18 Apr 2009, at 12:40, Sam Johnston wrote:

...


Take care:

Dr. David Snelling < David . Snelling . UK . Fujitsu . com >
Fujitsu Laboratories of Europe Limited
Hayes Park Central
Hayes End Road
Hayes, Middlesex UB4 8FE
Reg. No. 4153469
+44-7590-293439 (Mobile)

Alexis Richardson

8:12 p.m.

David,

One of our side goals is to explain the relationship between whatever simple OCCI lifecycle we settle on and the OVF lifecycle. Can you tell us if the OGF RM group has cross-checked the RM states and transitions against OVF?

BTW ... one simplifying assumption for OCCI *may* be that API operands are single slices (VMs).

alexis

On Sat, Apr 18, 2009 at 8:40 PM, David Snelling <David.Snelling@uk.fujitsu.com> wrote:

...

Folks,

I write this note as a fellow OGF working group chair. In the Reference Model working group we have been working toward a Reference Model, which includes a life cycle state model. Please have a look at the sections in our draft document that pertain to life cycle and see if you can map your states sensibly onto/into these.

The idea is that your states which map directly are aliases for those in the Reference Model. Any of yours that don't map directly should be sub-states of one from the Reference Model.

If this is not possible, please let us know. Ours is a work in progress, so it is possible that there are faults in our model.

Please come talk to us at OGF26.

See: http://forge.gridforum.org/sf/go/doc14766?nav=1


Sam Johnston

9:57 p.m.

Evening,

I've trawled through this document (ref model from it attached for convenience) and am not sure that it (and others like it, for that matter) gives the flexibility our implementors are going to need. It's early days to be standardising, and even if it weren't, trying to foist a fixed lifecycle on people is either (a) not going to work or (b) going to limit them unnecessarily.

Specific issues include the entry/exit points (live migrations start active, for example, and in some systems active is the only valid state), terminology (abstraction is fine but I had to hit the dictionary to confirm that "extant" meant what I thought it meant), and the requirement for a commissioning step, which is still unclear to me (though it will be soon when we start looking at templates).

That said I'd happily work with you guys to bring our model into line with yours or vice versa.

Cheers,

Sam

On Sat, Apr 18, 2009 at 10:12 PM, Alexis Richardson <alexis.richardson@gmail.com> wrote:

...

David,

One of our side goals is to explain the relationship between whatever simple OCCI lifecycle we settle on and the OVF lifecycle. Can you tell us if the OGF RM group has cross-checked the RM states and transitions against OVF?

BTW ... one simplifying assumption for OCCI *may* be that API operands are single slices (VMs).

alexis


alexis.richardson＠gmail.com

10:09 p.m.

As far as I can tell, commissioning appears to consist of the marshalling of components of a larger whole before work can commence. I'm not sure if we need that (cf. my comment about operands in my last email).

On Apr 18, 2009 10:57pm, Sam Johnston <samj@samj.net> wrote:

Evening,

I've trawled through this document (ref model from it attached for convenience) and am not sure that it (and others like it for that matter) give the flexibility our implementors are going to need. It's early days to be standardising and even if it weren't then trying to foist a fixed lifecycle on people is either a> not going to work or b> limit them unnecessarily.

Specific issues include the entry/exit points (live migrations start active for example and in some systems active is the only valid state), terminology (abstraction is fine but I had to hit the dictionary to confirm that "extant" meant what I thought it meant), and the requirement for a commissioning step is still unclear to me (though it will be soon when we start looking at templates).

That said I'd happily work with you guys to bring our model into line with yours or vice versa.

Cheers,

Sam


Michael Richardson

20 Apr 20 Apr

1:02 p.m.

Missing are error states.

For instance, "starting" may well lead to an error state such as "failed to boot". The VM may well be spinning cycles at "Kernel panic", "db>", or "Failure on Drive C, abort,retry,fail". The failed state is important for clients to know, and possibly also for billing.

Another state that I can think of that might be important is network disconnect --- you may have a VM that is ACTIVE, but the networks are disconnected. I'd call this state "STANDBY" (or "not yet connected"). I often do this in my virtualized infrastructure: bring up a VM with a new version of software, on a new IP, with VRRP/CARP configured, but I'm not going to connect it on the live side until it's confirmed to be up, at which point I enable the networking, and it takes over the virtual IP.

--
] Y'avait une poule de jammé dans l'muffler!!!!!!!!! | firewalls [
] Michael Richardson, Sandelman Software Works, Ottawa, ON |net architect[
] mcr@sandelman.ottawa.on.ca http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [
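A state such as "STANDBY" is the kind of addition that the "live registry" idea from earlier in the thread could absorb without changing the core. The sketch below is a rough illustration under stated assumptions: `StateRegistry`, its methods, and the description strings are hypothetical names invented for this example, and only the state names themselves ("STOPPED", "ACTIVE", "STANDBY") come from the discussion.

```python
# Core states from the minimal model discussed in the thread.
CORE_STATES = frozenset({"STOPPED", "ACTIVE"})


class StateRegistry:
    """Hypothetical registry of extension states layered on the core states."""

    def __init__(self):
        self._extensions = {}

    def register(self, name, description):
        """Add an extension state; core and already-registered names are rejected."""
        name = name.upper()
        if name in CORE_STATES or name in self._extensions:
            raise ValueError(f"state {name} is already defined")
        self._extensions[name] = description

    def is_known(self, name):
        """True for core states and for any registered extension state."""
        name = name.upper()
        return name in CORE_STATES or name in self._extensions

    def describe(self, name):
        """Return the description of an extension state, or None if there is none."""
        return self._extensions.get(name.upper())


registry = StateRegistry()
registry.register("STANDBY", "running, but networks not yet connected")
```

A client that only understands the core states could check `registry.is_known(...)` and fall back to treating unfamiliar extension states as opaque labels.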

Edmonds, AndrewX

23 Apr 23 Apr

1:13 p.m.

We discussed whether to include an error state or an error exit, and it was agreed that an error exit would be used. You can see this here: http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/StateModel

Andy

Randy Bias

9:10 p.m.

I would personally shy away from trying to specify multiple error states. My past experience is that this is an endless bucket. Systems fail in lots of ways. Once you start specifying error states you won't stop.

--Randy

On 4/20/09 6:02 AM, "Michael Richardson" <mcr@sandelman.ca> wrote:

...

Missing are error states.

For instance, "starting" may well lead to an error state such as "failed to boot". The VM may well be spinning cycles at "Kernel panic", "db>", or "Failure on Drive C, abort,retry,fail". The failed state is important for clients to know, and possibly also for billing.

Another state that I can think of that might be important is network disconnect --- you may have a VM that is ACTIVE, but the networks are disconnected. I'd call this state "STANDBY" (Or not yet connected). I often do this in my virtualized infrastructure: bring up a VM with a new version of software, on a new IP, with VRRP/CARP configured, but I'm not going to connect it on the live side until it's confirmed to be up, at which point, I enable the networking, and it takes over the virtual IP.

-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

Chris Webb

10:18 p.m.

Randy Bias <randyb@gogrid.com> writes:

...

I would personally shy away from trying to specify multiple error states. My past experience is that this is an endless bucket. Systems fail in lots of ways. Once you start specifying error states you won't stop.

Seconded. Cheers, Chris.

Michael Richardson

11:41 p.m.

"Chris" == Chris Webb <chris.webb@elastichosts.com> writes:

Chris> Randy Bias <randyb@gogrid.com> writes:

>> I would personally shy away from trying to specify multiple error states. My past experience is that this is an endless bucket. Systems fail in lots of ways. Once you start specifying error states you won't stop.

It's not about multiple error states. It's about having an error state that optionally includes details.

Randy Bias

24 Apr 24 Apr

12:17 a.m.

Sorry. That wasn't clear in your initial email on this. I think that's a great idea.

--Randy

On 4/23/09 4:41 PM, "Michael Richardson" <mcr@sandelman.ca> wrote:

"Chris" == Chris Webb <chris.webb@elastichosts.com> writes:

Chris> Randy Bias <randyb@gogrid.com> writes:

>> I would personally shy away from trying to specify multiple error states. My past experience is that this is an endless bucket. Systems fail in lots of ways. Once you start specifying error states you won't stop.

It's not about multiple error states. It's about having an error state that optionally includes details.

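The suggestion in this exchange, a single error state that optionally carries details rather than a growing list of specific error states, can be sketched as a status record with one optional free-form field. This is a minimal illustration, not an agreed design: the `Status` class, its field names, and the `"ERROR"` state name are hypothetical, and the wiki page referenced earlier in the thread records that the group settled on an error exit instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Status:
    """Hypothetical status record: a state name plus optional free-form detail."""

    state: str                    # e.g. "ACTIVE", "STARTING", or a hypothetical "ERROR"
    detail: Optional[str] = None  # optional; clients are not required to parse it

    def summary(self):
        """Render the state, appending the detail only when one was supplied."""
        if self.detail is None:
            return self.state
        return f"{self.state}: {self.detail}"


failed = Status("ERROR", "Kernel panic")
running = Status("ACTIVE")
```

Here `failed.summary()` gives `"ERROR: Kernel panic"` and `running.summary()` gives `"ACTIVE"`. Because the detail is free text, the set of states stays fixed no matter how many ways a system can fail.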

Sam Johnston

25 Apr 25 Apr

1:03 a.m.

IMO all states should give details where possible. If a machine is sending or receiving I want to see % complete. If it's starting I want to see where we're at compared to the average. In any case, giving the user good feedback should be helped rather than hindered.

Sam on iPhone

On 4/24/09, Randy Bias <randyb@gogrid.com> wrote:

...

Sorry. That wasn't clear in your initial email on this. I think that's a great idea.

--Randy


Sam Johnston

23 Apr 23 Apr

10:51 p.m.

On 4/23/09, Randy Bias <randyb@gogrid.com> wrote:

...

I would personally shy away from trying to specify multiple error states. My past experience is that this is an endless bucket. Systems fail in lots of ways. Once you start specifying error states you won't stop.

+1

Sam on iPhone



89 comments

17 participants

participants (17)

Alexis Richardson
alexis.richardson＠gmail.com
Andre Merzky
Chris Webb
David Snelling
Edmonds, AndrewX
Gary Mazz
Ignacio Martin Llorente
Krishna Sankar (ksankar)
Michael Richardson
Randy Bias
Roger Menday
Sam Johnston
Thijs Metsch
Tim Bray
Tino Vazquez
Tino Vazquez