Hi John,
I am very sorry, but somehow I missed your email. My apologies.
A- Some points that I think we agree on (I could be wrong) -
1) An NSA will have a state machine (or equivalent). That state machine will have a set of states including reserved and active. The issue we are talking about has to do with going between the reserved and active state.
Right, agreed.
2) A triggering event of some sort will be what causes the NSA to change states (and perhaps do some other things).
Right also.
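To make points 1) and 2) concrete, here is a minimal sketch in Python. It assumes only the two states named above and two hypothetical events, "provision" and "release". The names ConnectionState, NsaStateMachine and handle_event are illustrative and not taken from any NSI document, and the idea that "release" returns a connection to reserved is an assumption made for the example.

```python
from enum import Enum


class ConnectionState(Enum):
    RESERVED = "reserved"
    ACTIVE = "active"


# Hypothetical transition table: (current state, event) -> next state.
# Only the reserved <-> active pair discussed in this thread is modelled.
TRANSITIONS = {
    (ConnectionState.RESERVED, "provision"): ConnectionState.ACTIVE,
    (ConnectionState.ACTIVE, "release"): ConnectionState.RESERVED,
}


class NsaStateMachine:
    """Toy per-connection state machine held by one NSA."""

    def __init__(self):
        self.state = ConnectionState.RESERVED

    def handle_event(self, event):
        """Apply a triggering event; return True if the state changed."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return False  # event is not valid in the current state
        self.state = next_state
        return True
```

The sketch deliberately says nothing about who generates the event (requester, local timer or external agent), which is exactly the open question in B.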
B- The discussion of ways to instantiate a connection seems to me to have two aspects
1) who generates the signal to the state machine
2) what other actions does the signal cause to happen
Exactly. Depending on who signals the instantiation, the service workflow may be affected in one way or another. The discussion is about which option is best to include in v1.0.
C- There are questions about how to signal state between NSAs
1) is there a "master" that initiates the signal or is it independently done
2) if one NSA fails how do others know and keep in sync
Right.
Of A, B and C - I hope we agree about A.
I think B maybe needs some discussion, but I think we could agree pretty quickly
C is harder, and I think needs more discussion of how NSAs keep state. Some of this might be put off till after A and B are done and in the doc (in my opinion).
Right also
fully aligned!
--
Joan A. García-Espín
CTX, i2CAT Foundation
On 6 April 2010, at 23:59, John Vollbrecht wrote:
I am a bit confused by the discussion on this. So I suggest abstracting out the issues a bit to see where we agree or disagree.
A- Some points that I think we agree on (I could be wrong) -
1) An NSA will have a state machine (or equivalent). That state machine will have a set of states including reserved and active. The issue we are talking about has to do with going between the reserved and active state.
2) A triggering event of some sort will be what causes the NSA to change states (and perhaps do some other things).
B- The discussion of ways to instantiate a connection seems to me to have two aspects
1) who generates the signal to the state machine
2) what other actions does the signal cause to happen
C- There are questions about how to signal state between NSAs
1) is there a "master" that initiates the signal or is it independently done
2) if one NSA fails how do others know and keep in sync
Of A, B and C - I hope we agree about A.
I think B maybe needs some discussion, but I think we could agree pretty quickly
C is harder, and I think needs more discussion of how NSAs keep state. Some of this might be put off till after A and B are done and in the doc (in my opinion).
Does this make any sense?
John
On Apr 6, 2010, at 9:57 AM, Joan A. García-Espín wrote:
Hi Jerry, thanks for your comments!
I've extracted some of your comments and added my view (just to avoid a long in-line discussion).
By definition, #3 above is out of scope for the NSI protocol. If some other agent (or control plane) wants to initiate the provisioning, it does so thru a mechanism that is defined within the NSI protocol. I expect #1 will suffice if some external agent needs to kick it off.
As for #3, I was thinking of a scenario where the NRM automatically (internally) triggers the provisioning. In this case, the service plane could let the NSA in charge of that NRM poll it for the reservation status and then notify the rest of the NSAs in the provisioning tree (an upwards approach).
Hmm...I think I have to disagree strongly.
Good :)
When we expect these NSAs to start based on independent clocks and without verification that the service/control plane associated with this connection is even functioning, I think we introduce a myriad of complexities that need to be discussed in detail. Mostly, blind provisioning is different in a tree structured service plane than in OSCARS or AutoBahn or a GMPLS style service plane. We need to understand these issues better before we wave our hands and say it's simple.
My view: the Service Plane is an abstract layer where NSAs interact to allow network service provisioning (we all agree on this). One of the key factors here is that NSI aims to support not only reservations (where a time dimension is desirable) but also advance reservations, or book-ahead (where a time dimension is compulsory). Therefore, we know that time synchronisation will come up sooner or later. The NSI cannot force implementers to use a given time sync method, but we can recommend its use as a way to provision services precisely and to lower the rate of unsatisfied services (error handling gains importance).
- The start time may be weeks or months after the request was reserved. How do we know if the service tree for that connection is still functioning? For instance, if the service tree is broken, how do we expect the provisioning to complete? (At least with requester initiated provisioning we get a request sent down the tree that verifies the service tree is still in place.)
NS status query is a primitive, afaik. An NSA can use it to verify the service tree periodically or at specific moments. Which NSA? I'd say the one that first initiated the service provisioning (the root NSA).
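As a rough illustration of the root NSA checking the service tree, here is a minimal Python sketch. The Nsa class, its reachable flag (a stand-in for "answers a status query") and find_broken are all hypothetical names for this example and are not NSI primitives.

```python
class Nsa:
    """Toy NSA node in a service tree."""

    def __init__(self, name, children=None, reachable=True):
        self.name = name
        self.children = children or []
        self.reachable = reachable  # stand-in for "answers a status query"


def find_broken(nsa):
    """Return names of NSAs that fail to answer, searching from `nsa` down.

    Children of an unreachable NSA cannot be queried through it, so the
    walk stops there and reports only that NSA.
    """
    if not nsa.reachable:
        return [nsa.name]
    broken = []
    for child in nsa.children:
        broken.extend(find_broken(child))
    return broken
```

Calling find_broken on the root before the start time would return an empty list for a healthy tree, and otherwise the highest unreachable NSA on each broken branch.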
- With independent provisioning, i.e. independent state changes, we don't have any means of determining the state of the connection unless we communicate that state up and down the service tree. We know what our state is, but we have no way of predicting the state of our parent/child NSAs and so we have no way to validate that the protocol is functioning properly.
For option #3 to work well, people implementing it must implement a robust notification mechanism. I agree with you that independent, unnotified changes at a given point in the tree produce undesired situations. I don't think we have to worry about future implementations; we just need to provide a flexible interface for them.
- If we flood Provision() requests up and/or down the tree to communicate to our parent/children NSAs that we auto-induced a state transition for a connection, how do we handle collisions of Provision() requests? Are there any race conditions or hysteresis?
Again, I think this is an implementation problem and is useful for the people working on error handling.
- If we send Provision() requests up the tree...is that any different than sending them down the tree? Are they handled any differently? If not, would it simplify the protocol if only the root initiated the Provision() requests based on a timer? i.e. does having all NSAs independently kick off timers cause unnecessary complexity?
We faced this problem in the Phosphorus project some time ago, when designing the Inter-Domain Broker communication protocol for our Harmony tool. After long discussions, we concluded this could be easily handled by a proper security infrastructure. Any NSA should be able to know who its neighbours are and what they are allowed to request. In the case of managing a GMPLS control plane with manual instantiation of the connection at the UNI, we needed a mechanism to notify the service plane that the connection was provisioned, and thus we implemented the notification service. Alternatively, if notification is not available, we could use a polling mechanism and later flood the reservation state among all service plane entities involved. I agree that it adds complexity, but I mentioned it based on a use case from Phosphorus.
- if we do have a protocol that handles mid-tree flooding of state changes, could this also be useful for the release() function or for possible error or failure mode handling scenarios?
I was not considering mid-tree changes, only changes flooded from the root NSA or a network-attached NSA.
- What happens if the wall clocks in the NSA are not "adequately" synchronized? Can we assume that wall clock synchronization is a non-issue? If not, what does "adequate" mean in this context?
As I mentioned before, since we are considering calendars on a per-network (in general, per-resource) basis for advance reservations, we should at least recommend that time sync is used.
- How does a requester determine if/when a connection is in-service? Do we simply fire data at it until we see it pop out the other end? How do we know that? What can we say definitively about the connection state from the user's service perspective as we move from reserved to in-service? (Even a few milliseconds could result in megabytes of information being leaked out or lost from a connection. This is both very poor service quality as well as a security risk.)
This problem is present with *any* provisioning strategy and should be discussed. The trivial solution (inject data and expect it to be transmitted) can cause severe data loss, as you mention. In my experience with provisioning systems, one cannot be 100% sure the path is provisioned until the data plane is tested (e.g. with a ping). Assuming the network hardware works well, we need the NRM to control path setup. Then we have several options, but the most typical are (i) implement a way for the NRM to tell its parent NSA that the connection is UP, and then let the NSA propagate this information upwards along the service tree, or (ii) implement a polling loop in the NSA in charge of the NRM until "connection UP" is obtained, and then proceed with the flooding. In either case, "connection UP" messages from the NRMs should be aggregated to build up "path segment UP" until a whole "path UP" is ready. Then the original requester can be notified. Please note these are only some options; many others can be designed. The NSI protocol should not commit to any of them, but should consider them in order to make NSI flexible enough (which will ease adoption).
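Option (ii) and the aggregation step could be sketched roughly as follows in Python. This is only an illustration under my own assumptions: poll_until_up and path_is_up are hypothetical helpers, the NRM is reduced to a callable that returns True once its segment is up, and the service tree is reduced to nested lists.

```python
def poll_until_up(poll, max_attempts):
    """Option (ii): poll an NRM until it reports "connection UP".

    `poll` is a hypothetical zero-argument callable returning True once
    the NRM's segment is up. Returns True on success, or False if all
    `max_attempts` polls fail. A real loop would sleep between polls.
    """
    for _ in range(max_attempts):
        if poll():
            return True
    return False


def path_is_up(node):
    """Aggregate segment status upwards through the service tree.

    `node` is either a bool (a leaf: one NRM's "connection UP" result) or
    a list of child nodes (an NSA aggregating its children). A subtree is
    up only when every segment below it is up.
    """
    if isinstance(node, bool):
        return node
    return all(path_is_up(child) for child in node)
```

Only when path_is_up holds at the root would the original requester be told that the whole path is UP. Option (i) would replace the polling with a notification from the NRM, and the aggregation step would stay the same.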
- Is it adequate to have provisioned() segments in-service before all segments are in-service? Does this cause a timing issue in the tree that could result in mis-routed data? Do we need to stipulate some atomic function that allows data to begin flowing that is different than simply configuring the path?
This can happen, given that the networks cooperating in the connection provisioning are independent. Thinking about this, I get the impression we need some message from the service plane to the app saying that the whole path is in-service.
Finally, I assert that there is no significant legacy application base that should dictate how NSI protocol functions just so that they don't have to change. If there are some requirements that those applications need that NSI needs to support, ok...but I don't think it's a good idea to make a future protocol act in some way just because that's the way we've always done it. (That logic says let's not change anything...:-)
I fully agree, Jerry. I hope you did not understand otherwise; sorry if I expressed myself badly.
My comments were two-fold:
- First, we should make the first release of the NSI implement the options we think are easier to adopt, but in any case limit it to them.
- Second, a future-proof NSI protocol first needs to be present-proof. Bad early adoption means hard future adoption.
My proposal:
1) I think we should do a "requester initiated provisioning" request down the tree as a first, simplest provisioning initiation process.
2a) If we can work through the mid-tree messaging and state transition in a prompt manner, we could include that also in V1.0, but I think it may be more complex.
2b) A simple interim solution would be to have the first PA that sees a "provider initiated provisioning" request assume that responsibility and pass a "requester initiated provisioning" request to all children. Then that first PA sends provision() requests down the tree to start the process just as with a requester initiated provisioning.
If we go for requester-initiated provisioning, we need to control both reservation and provisioning calendars, don't we? I mean, reservation requests will only serve as a "medium access control" mechanism, purely signalling, not really influencing the provisioning. In this case, the user has to take care of both reservation and instantiation.
What about the time lapse between user requesting provisioning and data plane effectively up...?
I know I can be a bit brusque with some of this - hope it all makes sense and was not too curt.
It all makes sense. Please bear in mind that I always want to broaden the applicability of the NSI and to strengthen its flexibility, since for me they are the key to easy adoption (of course, without becoming too generic).
My best regards,
--
Joan A. García-Espín
CTX, i2CAT Foundation
On 2 April 2010, at 22:39, Jerry Sobieski wrote:
Hi Joan-
I have some reservations to what you suggest below...please let me know what you think...
Joan A. García-Espín wrote:
<<Some long discussions on how best to initiate the ‘provisioning’ phase of operation. Several methods were discussed, but none agreed:
1. Provisioning initiated by signal i.e. a message on the NSI interface.
2. Provisioning initiated by a timer local to the provider NSA
3. Provisioning initiated by a signal not on the NSI interface e.g. on a separate control plane
>>
By definition, #3 above is out of scope for the NSI protocol. If some other agent (or control plane) wants to initiate the provisioning, it does so thru a mechanism that is defined within the NSI protocol. I expect #1 will suffice if some external agent needs to kick it off.
My vision is that ANY of the procedures should be supported in the NSI. Therefore, NSI must implement the required flag(s) in the service characterisation or specific signalling message(s) for automatic activation, but all of them might be considered optional for allowing either one or the other model to be used by requester.
The remaining #1 and #2 amount to "Requester initiated" and "Provider initiated". And either could be initiated via a timer expiring, or by some other event in a work flow, etc. It seems to me the *simplest* thing to do is to have the requester initiate the provisioning. This is just like we have always done it with UNIs or RSVP or any other protocol, so it is well understood. Because it is initiated at the root of the request tree, it flows down the tree and back up in a very deterministic fashion. We know when each segment is fully provisioned. There is no special handling, and it does not add complexity to have the provision request slide down the request tree just like the reservation() request did initially.
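A minimal Python sketch of that down-and-back-up flow, under stated assumptions: the tree is a hypothetical dict mapping each NSA name to its children's names, the message names are borrowed from this thread, and children are visited one after another (a real implementation would likely send the requests in parallel).

```python
def provision(tree, name, log):
    """Send provision() down from `name`; return after ProvisionComplete.

    `tree` maps each NSA name to the list of its children's names (a
    hypothetical representation). `log` collects the messages in order.
    """
    log.append(("provision", name))
    for child in tree.get(name, []):
        provision(tree, child, log)
    # Every child has completed, so this NSA can report completion upwards.
    log.append(("ProvisionComplete", name))
```

Because an NSA logs its own ProvisionComplete only after all of its children have done so, the completion at the root implies that every segment below it has completed.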
So, it is my proposal that we have at least #1 in version1: the connection is provisioned on requester sending a "provision()" request/command to the provider. And when a "ProvisionComplete()" message is received, the connection is
From a pragmatic viewpoint, option 2 is the best candidate for NSI release 1.0, since it does not assume that either the requester app or the control plane is able to perform an automatic activation/instantiation of the connection. Basically, this means that legacy compatibility is ensured in early implementations, at the cost of having to implement time sync in the NSA. This is a good point in general because (i) most apps/middlewares are not used to dealing with a generic GMPLS UNI and thus don't implement automatic signalling in the app <-> control plane interface, and (ii) not all control planes or provisioning/management systems implement calendars, and consequently they cannot natively support advance reservations (which would be emulated at the service plane).
Hmm...I think I have to disagree strongly.
When we expect these NSAs to start based on independent clocks and without verification that the service/control plane associated with this connection is even functioning, I think we introduce a myriad of complexities that need to be discussed in detail. Mostly, blind provisioning is different in a tree structured service plane than in OSCARS or AutoBahn or a GMPLS style service plane. We need to understand these issues better before we wave our hands and say it's simple.
Here is a list of some issues in no particular order:
- The start time may be weeks or months after the request was reserved. How do we know if the service tree for that connection is still functioning? For instance, if the service tree is broken, how do we expect the provisioning to complete?
(At least with requester initiated provisioning we get a request sent down the tree that verifies the service tree is still in place.)
- With independent provisioning, i.e. independent state changes, we don't have any means of determining the state of the connection unless we communicate that state up and down the service tree. We know what our state is, but we have no way of predicting the state of our parent/child NSAs and so we have no way to validate that the protocol is functioning properly.
- If we flood Provision() requests up and/or down the tree to communicate to our parent/children NSAs that we auto-induced a state transition for a connection, how do we handle collisions of Provision() requests? Are there any race conditions or hysteresis?
- If we send Provision() requests up the tree...is that any different than sending them down the tree? Are they handled any differently? If not, would it simplify the protocol if only the root initiated the Provision() requests based on a timer? i.e. does having all NSAs independently kick off timers cause unnecessary complexity?
- if we do have a protocol that handles mid-tree flooding of state changes, could this also be useful for the release() function or for possible error or failure mode handling scenarios?
- What happens if the wall clocks in the NSA are not "adequately" synchronized? Can we assume that wall clock synchronization is a non-issue? If not, what does "adequate" mean in this context?
- How does a requester determine if/when a connection is in-service? Do we simply fire data at it until we see it pop out the other end? How do we know that? What can we say definitively about the connection state from the user's service perspective as we move from reserved to in-service? (Even a few milliseconds could result in megabytes of information being leaked out or lost from a connection. This is both very poor service quality as well as a security risk.)
- Is it adequate to have provisioned() segments in-service before all segments are in-service? Does this cause a timing issue in the tree that could result in mis-routed data? Do we need to stipulate some atomic function that allows data to begin flowing that is different than simply configuring the path?
We currently have no notion of what these NSI protocol requirements entail.
I do not differentiate between a timer event kicking off provisioning or some other non-calendar related event (say a work flow task completing) that defines the kickoff point. Whatever event occurs generates a Provision() request within that NSA which causes the state machine to transition.
Finally, I assert that there is no significant legacy application base that should dictate how NSI protocol functions just so that they don't have to change. If there are some requirements that those applications need that NSI needs to support, ok...but I don't think it's a good idea to make a future protocol act in some way just because that's the way we've always done it. (That logic says let's not change anything...:-)
My proposal:
1) I think we should do a "requester initiated provisioning" request down the tree as a first, simplest provisioning initiation process.
2a) If we can work through the mid-tree messaging and state transition in a prompt manner, we could include that also in V1.0, but I think it may be more complex.
2b) A simple interim solution would be to have the first PA that sees a "provider initiated provisioning" request assume that responsibility and pass a "requester initiated provisioning" request to all children. Then that first PA sends provision() requests down the tree to start the process just as with a requester initiated provisioning.
I know I can be a bit brusque with some of this - hope it all makes sense and was not too curt.
Happy Easter!
Jerry
Best regards, and have a nice Easter holiday,
--
Joan A. García-Espín
Network Technologies Cluster (CTX)
i2CAT Foundation
C/ Gran Capità 2-4, office 203, Nexus I building
08034 Barcelona, Catalonia (Spain)
T: +34 93 553 2518
F: +34 93 553 2520