new state machine

Tomohiro Kudoh

8 Feb 2012

3:19 p.m.

Hi all,

Here is a slide of a new state machine. Here, prov.cf/rel.cf does not relate to data plane failure; it means confirmation of delivery of a prov/rel request to all the children.

I am not sure how we can handle prov.fl (and rel.fl) messages. Since they are not related to the data plane, they are almost fatal. Going back to the previous state may not make sense. (In most cases, such a .fl message will not be sent back; a timeout will occur instead.)

Tomohiro

Attachments:

NSI-SM-Single-Diagram-Feb_8_2012.pptx (application/vnd.openxmlformats-officedocument.presentationml.presentation — 129.3 KB)


Henrik Thostrup Jensen

15 Feb 2012

10:24 a.m.

Hi Tomohiro

(long email, sorry).

On Thu, 9 Feb 2012, Tomohiro Kudoh wrote:

...

Here is a slide of a new state machine.

Thanks for making this.

I am a bit puzzled by the introduction of the auto-provisioning state, but I can see it being useful in places where there is an NRM which can do auto-start, with auto-provisioning representing that we are interacting with the NRM to set this up. If this is what you meant, I think it makes sense.

It is still not possible to release an auto-provisioned connection (i.e., going back to reserved from auto-provision), which I think is needed. It will probably not be the most used feature, but since we have the ability to release a provisioned connection, it should also be possible to stop an auto-provisioning from happening, which is currently not possible. If we introduce this and the auto-provisioning state, we will also need a state for when auto-provisioning is being cancelled.

Even when a connection has been auto-provisioned it should still, IMHO, go through the provision state, as this is when the hardware brings up the connection - it is not something that can be skipped. This would cause provisionConfirmed to be emitted twice, which is not what we want. The new state machine is also slightly inconsistent wrt. provisionConfirmed: in auto-provision it is emitted when going into auto-provision (i.e., long before the hardware is activated), and in non-auto-provision it is emitted when the hardware is activated. Could we consider introducing a new primitive, e.g., autoProvisionConfirmed or linkActivated, which would be emitted when the hardware is brought up? I know this somewhat breaks the nice symmetry of the model.

Finally: I have difficulty separating the cleaning and terminating states. They seem mostly equivalent to me, but I hadn't joined NSI at the time they were created, so can someone fill me in? I am also not quite sure how to respond to a forcedEnd message.

...

I am not sure how we can handle prov.fl (and rel.fl) messages. Since they are not related to data plane, they are almost fatal. Going back to the previous state may not make sense. (In most cases, such .fl message will not be sent back, but time out will happen).

I think we have to realize that the current four primitives and associated responses don't cut it. These failures can be either fatal (as in: the connection is terminated) or non-fatal (as in: try again later and it might work). However, this is not possible to express in the current provisionFailed / releaseFailed primitives. The protocol is designed such that the requester can expect the remote connection to be in a certain state when one of these primitives is received, but that is not the case here. This is also the case with terminateFailed, which I don't think any of us really know how to handle :-).

A solution is to use the forcedEnd for fatal events, and have provisionFailed / releaseFailed indicate a move back to scheduled - however then there is no need for a releaseFailed.

I've attached a picture of the state machine as I envision it. There are probably still some things wrong with it, and it is mostly meant as a basis for future discussion :-).

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net> Software Developer, NORDUnet
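The failure handling proposed in the message above (forcedEnd for fatal events, provisionFailed for a move back to scheduled) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `State` enum, the `on_failure` function and the `fatal` flag are hypothetical names, and the set of states is reduced to what the text mentions rather than taken from the attached diagram.

```python
from enum import Enum, auto


class State(Enum):
    # Reduced, hypothetical set of states; the attached diagram may differ.
    SCHEDULED = auto()
    PROVISIONING = auto()
    TERMINATED = auto()


def on_failure(state: State, fatal: bool) -> tuple[State, str]:
    """Return (next_state, primitive_to_emit) for a failed provision.

    Assumption: the caller can classify a failure as fatal or non-fatal.
    """
    if fatal:
        # Fatal: the connection is gone, so signal forcedEnd.
        return State.TERMINATED, "forcedEnd"
    # Non-fatal: go back to scheduled; the requester may try again later.
    return State.SCHEDULED, "provisionFailed"


print(on_failure(State.PROVISIONING, fatal=False))
```

With this split the requester always knows the remote state from the primitive it receives, which is the property the message above says the current primitives lack.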

Tomohiro Kudoh

2:54 p.m.

Hi Henrik,

Thank you for your comments and state machine proposal. I've put my comments inline.

On Wed, 15 Feb 2012 11:24:34 +0100 (CET) Henrik Thostrup Jensen <htj@nordu.net> wrote:

...

Hi Tomohiro

(long email, sorry).

On Thu, 9 Feb 2012, Tomohiro Kudoh wrote:

...
Here is a slide of a new state machine.

Thanks for making this.

I am a bit puzzled by the introduction of the auto-provisioning state, but I can see it being useful in places where there is an NRM which can do auto-start, with auto-provisioning representing that we are interacting with the NRM to set this up. If this is what you meant, I think it makes sense.

The "Auto Provisioning" state is required in order to wait for the confirm messages sent back from all the children.
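A minimal sketch of what "waiting for confirm messages from all the children" could look like, assuming a hypothetical `ProvisionTracker` helper and string identifiers for the child NSAs (neither is part of NSI itself):

```python
from dataclasses import dataclass, field


@dataclass
class ProvisionTracker:
    """Hypothetical helper: tracks prov.cf messages received from child NSAs."""

    children: frozenset
    confirmed: set = field(default_factory=set)

    def on_child_confirm(self, child: str) -> bool:
        """Record a prov.cf from one child.

        Returns True once every child has confirmed, i.e. when the parent
        could leave the waiting state and send its own prov.cf upward.
        """
        if child not in self.children:
            raise ValueError(f"unknown child: {child}")
        self.confirmed.add(child)
        return len(self.confirmed) == len(self.children)


tracker = ProvisionTracker(children=frozenset({"nsa-a", "nsa-b"}))
print(tracker.on_child_confirm("nsa-a"))  # one child still outstanding
print(tracker.on_child_confirm("nsa-b"))  # all children have confirmed
```

Note that in this reading the confirm says nothing about the data plane; it only records that the request reached every child.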

...

It is still not possible to release an auto-provisioned connection (i.e., going back to reserved from auto-provision), which I think is needed. It will probably not be the most used feature, but since we have the ability to release a provisioned connection it should also be possible to stop an auto-provisioning from happening, which is currently not possible. If we introduce this and the auto-provisioning state, we will also need a state for when auto-provisioning is being cancelled.

OK. I added them.

...

Even when a connection has been auto-provisioned it should still, IMHO, go through the provision state, as this is when the hardware brings up the connection - it is not something that can be skipped. This would cause provisionConfirmed to be emitted twice, which is not what we want. The new state machine is also slightly inconsistent wrt. provisionConfirmed: in auto-provision it is emitted when going into auto-provision (i.e., long before the hardware is activated), and in non-auto-provision it is emitted when the hardware is activated. Could we consider introducing a new primitive, e.g., autoProvisionConfirmed or linkActivated, which would be emitted when the hardware is brought up? I know this somewhat breaks the nice symmetry of the model.

This is a matter of whether to separate data plane errors from NSI base messages. At Baton Rouge, I discussed this with John and Chin, and we thought it is better to separate them, i.e., prov.cf/rel.cf just mean that prov.rq/rel.rq have already been propagated to all the children. I think we need to discuss this matter first.

...

Finally: I have difficulty separating the cleaning and terminating states. They seem mostly equivalent to me, but I hadn't joined NSI at the time they were created, so can someone fill me in? I am also not quite sure how to respond to a forcedEnd message.

...

...
I am not sure how we can handle prov.fl (and rel.fl) messages. Since they are not related to data plane, they are almost fatal. Going back to the previous state may not make sense. (In most cases, such .fl message will not be sent back, but time out will happen).

I think we have to realize that the current four primitives and associated responses don't cut it. These failures can be either fatal (as in: the connection is terminated) or non-fatal (as in: try again later and it might work). However, this is not possible to express in the current provisionFailed / releaseFailed primitives. The protocol is designed such that the requester can expect the remote connection to be in a certain state when one of these primitives is received, but that is not the case here. This is also the case with terminateFailed, which I don't think any of us really know how to handle :-).

A solution is to use the forcedEnd for fatal events, and have provisionFailed / releaseFailed indicate a move back to scheduled - however then there is no need for a releaseFailed.

OK. I think I agree with you.

...

I've attached a picture of the state machine as I envision it. There are probably still some things wrong with it, and it is mostly meant as a basis for future discussion :-).

Your state machine does not allow for skew in prov.rq message delivery. If prov.rq is sent by an ultimate RA just before the start_time, some NSAs will receive prov.rq before the start_time and some will receive it after the start_time.

In the attached slides, slide 2 is an SM which supports release during auto-provisioning, but does not transition to the provisioning state in auto-provision. Slide 3 is an SM which transitions to the provisioned state by using non-symmetrical messages. (The last two slides are the state machines in which release is not supported.)
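To make the delivery-skew problem concrete (a prov.rq sent just before start_time reaching some NSAs before it and others after it), here is a small sketch. The function name, the NSA names and the millisecond delays are made-up values for illustration only.

```python
def arrival_sides(sent_at_ms: int, start_time_ms: int, delays_ms: dict) -> dict:
    """For each NSA, report whether prov.rq arrives before or after start_time.

    delays_ms maps a (hypothetical) NSA name to its delivery delay in ms.
    """
    sides = {}
    for nsa, delay in delays_ms.items():
        arrival = sent_at_ms + delay
        sides[nsa] = "before" if arrival < start_time_ms else "after"
    return sides


# prov.rq is sent 5 ms before start_time; the two NSAs see different delays.
print(arrival_sides(995, 1000, {"nsa-a": 2, "nsa-b": 8}))
# {'nsa-a': 'before', 'nsa-b': 'after'}
```

A state machine that handles prov.rq differently on each side of start_time therefore has to tolerate the NSAs along one path ending up on different sides.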

...

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net> Software Developer, NORDUnet

Thanks, Tomohiro

Henrik Thostrup Jensen

16 Feb 2012

12:52 p.m.

Hi Tomohiro

Replies inline.

On Wed, 15 Feb 2012, Tomohiro Kudoh wrote:

...

...
I am a bit puzzled by the introduction of the auto-provisioning state, but I can see it being useful in places where there is an NRM which can do auto-start, with auto-provisioning representing that we are interacting with the NRM to set this up. If this is what you meant, I think it makes sense.

The "Auto Provisioning" state is required in order to wait for the confirm messages sent back from all the children.

Right. This makes sense (whether there is an NRM involved or not is a secondary issue).

...

...
Even when a connection has been auto-provisioned it should still, IMHO, go through the provision state, as this is when the hardware brings up the connection - it is not something that can be skipped.

This is a matter of whether to separate data plane errors from NSI base messages. At Baton Rouge, I discussed this with John and Chin, and we thought it is better to separate them, i.e., prov.cf/rel.cf just mean that prov.rq/rel.rq have already been propagated to all the children.

I think we need to discuss this matter first.

Yes. I tend to think that we need to separate the replies (one for control plane, one for data plane), as we more or less agreed upon yesterday (we just haven't quite figured out how yet). The same discussion can be had for release and terminate: should the reply be returned once the release has been propagated all the way down and back up the tree, or once all the links have been torn down? E.g., a linkActivated, linkDeactivated, connectionTerminated (don't get too hung up on wording).

...

...
Finally: I have difficulty separating the cleaning and terminating states. They seem mostly equivalent to me, but I hadn't joined NSI at the time they were created, so can someone fill me in? I am also not quite sure how to respond to a forcedEnd message.

There were some blank lines here. Did you intend to write something? If no one knows why we have separate cleaning and terminating states, it might be time to reconsider them :-).

...

...
A solution is to use the forcedEnd for fatal events, and have provisionFailed / releaseFailed indicate a move back to scheduled - however then there is no need for a releaseFailed.

OK. I think I agree with you.

Well, I'm not quite sure what to think of these things myself :-/. But the releaseFailed and terminateFailed primitives seem quite artificial to me.

...

...
I've attached a picture of the state machine as I envision it. There are probably still some things wrong with it, and it is mostly meant as a basis for future discussion :-).

Your state machine does not allow for skew in prov.rq message delivery. If prov.rq is sent by an ultimate RA just before the start_time, some NSAs will receive prov.rq before the start_time and some will receive it after the start_time.

You are right, good catch.

...

In the attached slides, slide 2 is an SM which supports release during auto-provisioning, but does not transition to the provisioning state in auto-provision. Slide 3 is an SM which transitions to the provisioned state by using non-symmetrical messages.

The model you present on slide 3 has a problem similar to mine: if a release is sent around start time, the state machine can return either reserved or scheduled. This causes a problem, as the requester doesn't know whether an arm or a provision command should be sent.

I think it is reasonably clear that the way we respond to provision/release, and to some extent terminate, is a bit artificial/clumsy. I tried to come up with a new state machine, but I need to think a bit first.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net> Software Developer, NORDUnet

Henrik Thostrup Jensen

17 Feb 2012

2:07 p.m.

Hi again

On Thu, 16 Feb 2012, Henrik Thostrup Jensen wrote:

...

I tried to come up with a new state machine, but I need to think a bit first.

I've come up with a new suggestion. Two states are introduced denoting that a control plane message has been received (but not replied to yet), along with two primitives to emit when a link is activated / deactivated. The names are probably not the final ones.

Again, I'm not really sure how this should look, and I am not suggesting we adopt this one immediately. However, the state machine is important, as it shows how we interact between the NSAs and control a connection, and what commands can be accepted depending on the current connection state.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net> Software Developer, NORDUnet
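The general shape described in the message above (intermediate "message received" states plus link activated / deactivated primitives, with the accepted commands depending on the current state) can be sketched as a transition table. This is not a reproduction of the attached diagram: every state name, event name and transition below is a hypothetical placeholder chosen for illustration.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical names; the attached diagram may use different ones.
    RESERVED = auto()
    PROVISION_RECEIVED = auto()  # prov.rq received, not yet replied to
    PROVISIONED = auto()
    RELEASE_RECEIVED = auto()    # rel.rq received, not yet replied to


# (current state, event) -> (next state, primitive to emit or None)
TRANSITIONS = {
    (State.RESERVED, "prov.rq"): (State.PROVISION_RECEIVED, None),
    (State.PROVISION_RECEIVED, "link_up"): (State.PROVISIONED, "linkActivated"),
    (State.PROVISIONED, "rel.rq"): (State.RELEASE_RECEIVED, None),
    (State.RELEASE_RECEIVED, "link_down"): (State.RESERVED, "linkDeactivated"),
}


def step(state: State, event: str):
    """Apply one event; reject events that are not accepted in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} not accepted in state {state.name}") from None


state, emitted = step(State.RESERVED, "prov.rq")
state, emitted = step(state, "link_up")
print(state.name, emitted)
```

Keeping the table explicit makes it easy to see which commands are accepted in which state, which is the point the message makes about why the state machine matters.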

Henrik Thostrup Jensen

3:17 p.m.

Hi again

On Fri, 17 Feb 2012, Henrik Thostrup Jensen wrote:

...

I've come up with a new suggestion. Two states are introduced denoting that a control plane message has been received (but not replied to yet), along with two primitives to emit when a link is activated / deactivated. The names are probably not the final ones.

And another one. This one is starting to look a bit better, but I still don't think it is there yet.

Best regards, Henrik

Henrik Thostrup Jensen <htj at nordu.net> Software Developer, NORDUnet
