[Nsi-wg] Fwd: NSI service instantiation or provisioning signalling

13 Apr 2010


      Apparently I forgot to send this to the NSI group.


Begin forwarded message:
...
From: "Joan A. García-Espín" <jage@i2cat.net>
Date: April 13, 2010 5:05:49 AM GMT-04:00
To: John Vollbrecht <jrv@internet2.edu>
Subject: Re: NSI service instantiation or provisioning signalling
Hi John,
I am very sorry, but somehow I missed your email. My apologies.
...
A- Some points that I think we agree on (I could be wrong) -
1) An NSA will have a state machine (or equivalent).  That state  
machine will have a set of states including reserved and active.   
The issue we are talking about has to do  with going between the  
reserved and active state.
Right, agreed.
...
2) A triggering event of some sort will be what causes the NSA to  
change states (and perhaps do some other things).
Right also.
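
As an illustration of points A.1 and A.2, here is a minimal Python sketch of such a state machine. Only the reserved and active states come from the discussion. The event names (PROVISION, RELEASE), the class name, and the assumption that a release returns the connection to reserved are hypothetical and not part of any NSI specification.

```python
from enum import Enum


class State(Enum):
    RESERVED = "reserved"
    ACTIVE = "active"


class Event(Enum):
    # Hypothetical trigger names; the thread only says
    # "a triggering event of some sort".
    PROVISION = "provision"
    RELEASE = "release"


# Allowed transitions: (current state, event) -> next state.
# RELEASE going back to RESERVED is an assumption made for this sketch.
TRANSITIONS = {
    (State.RESERVED, Event.PROVISION): State.ACTIVE,
    (State.ACTIVE, Event.RELEASE): State.RESERVED,
}


class ConnectionStateMachine:
    def __init__(self):
        self.state = State.RESERVED

    def handle(self, event):
        """Apply an event; raise ValueError if it is not valid in the current state."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event.name} not allowed in state {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state
```

Who generates the event (requester message, local timer, or an external signal) is deliberately left outside the machine, which is the question raised under B.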
...
B- The discussion of ways to instantiate a connection seems to me  
to have aspects
1) who generates the signal to the state machine
2) what other actions does the signal cause to happen
Exactly. Depending on who signals the instantiation, the service
workflow may be affected in one way or another. The discussion is
about which option is best to include in v1.0.
...
C- There are questions about how to signal state  between NSAs
1) is there a "master" that initiates the signal or is it  
independently done
2)  if one NSA fails how do others know and keep in sync
Right.
...
Of A, B and C - I hope we agree about A.
I think B maybe needs some discussion, but I think we could agree  
pretty quickly
C is harder, and I think needs more discussion of how NSAs keep  
state.  Some of this might be put off till after A and B are done  
and in the doc (in my opinion).
Right also
fully aligned!
--
Joan A. García-Espín
CTX, i2CAT Foundation
On 06/04/2010, at 23:59, John Vollbrecht wrote:
...
I am a bit confused by the discussion on this.  So I suggest
abstracting out the issues a bit to see where we agree or disagree.
A- Some points that I think we agree on (I could be wrong) -
1) An NSA will have a state machine (or equivalent).  That state  
machine will have a set of states including reserved and active.   
The issue we are talking about has to do  with going between the  
reserved and active state.
2) A triggering event of some sort will be what causes the NSA to  
change states (and perhaps do some other things).
B- The discussion of ways to instantiate a connection seems to me  
to have aspects
1) who generates the signal to the state machine
2) what other actions does the signal cause to happen
C- There are questions about how to signal state  between NSAs
1) is there a "master" that initiates the signal or is it  
independently done
2)  if one NSA fails how do others know and keep in sync
Of A, B and C - I hope we agree about A.
I think B maybe needs some discussion, but I think we could agree  
pretty quickly
C is harder, and I think needs more discussion of how NSAs keep  
state.  Some of this might be put off till after A and B are done  
and in the doc (in my opinion).
Does this make any sense?
John
On Apr 6, 2010, at 9:57 AM, Joan A. García-Espín wrote:
...
Hi Jerry, thanks for your comments!
I've extracted some of your comments and added my view (just to
avoid a long in-line discussion).
...
By definition,  #3 above is out of scope for the NSI protocol.     
If some other agent (or control plane) wants to initiate the  
provisioning, it does so thru a mechanism that is defined within  
the NSI protocol.     I expect #1 will suffice if some external  
agent needs to kick it off.
As for #3, I was thinking of a scenario where the NRM
automatically (internally) triggers the provisioning. In this
case, the service plane could consider letting the NSA in charge
of that NRM poll it for the reservation status and notify
the rest of the NSAs in the provisioning tree later on (upwards
approach).
...
Hmm...I think I have to disagree strongly.
Good :)
...
When we expect these NSAs to start based on independent clocks  
and without verification that the service/control plane  
associated with this connection is even functioning, I think we  
introduce a myriad of complexities that need to be discussed in  
detail.   Mostly, blind provisioning is different in a tree  
structured service plane than in OSCARS or AutoBahn or a GMPLS  
style service plane.   We need to understand these issues better  
before we wave our hands and say it's simple.
My view: the Service Plane is an abstract layer where NSAs
interact to enable network service provisioning (we all agree
on this). One of the key factors here is that NSI aims at
supporting not only reservations (where the time dimension is
desirable), but also advance reservations or book-ahead (where the
time dimension is compulsory). Therefore, we know that time
synchronisation will come up sooner or later. The NSI cannot
force implementers to use a given time sync method, but we can
recommend time sync as a way to provision services precisely and
to lower the rate of unsatisfied services (error handling
gains importance).
...
- The start time may be weeks or months after the request was  
reserved.  How do we know if the service tree for that connection  
is still functioning?  For instance, if the service tree is  
broken, how do we expect the provisioning to complete? (At least  
with requester initiated provisioning we get a request sent down  
the tree that verifies the service tree is still in place.)
NS status query is a primitive, afaik. An NSA can use it to
verify the service tree periodically or at specific moments. Which
NSA? I'd say the one that first initiated the service provisioning
(root NSA).
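
A rough Python sketch of how a root NSA might use a status query to check that the service tree is still in place before the start time. The NSA class, the reachable flag, and query_status() are hypothetical stand-ins for illustration; the thread does not define the actual primitive or its messages.

```python
class NSA:
    def __init__(self, name, children=None, reachable=True):
        self.name = name
        self.children = children or []
        # Stand-in for "this NSA answers the status query".
        self.reachable = reachable

    def query_status(self):
        """Return names of unreachable NSAs in this subtree (empty list = healthy)."""
        if not self.reachable:
            # An unreachable NSA cannot relay the query, so the state of
            # its own children is unknown and is not reported.
            return [self.name]
        broken = []
        for child in self.children:
            broken.extend(child.query_status())
        return broken


# Hypothetical tree: the root runs the check some time before the start time.
root = NSA("root", [
    NSA("a", [NSA("a1")]),
    NSA("b", [NSA("b1")], reachable=False),
])
broken_nsas = root.query_status()
```

If the returned list is not empty, the root knows which branch to repair or re-reserve before provisioning is due.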
...
- With independent provisioning, i.e. independent state changes,  
we don't have any means of determining the state of the  
connection unless we communicate that state up and down the  
service tree.  We know what our state is, but we have no way of  
predicting the state of our parent/child NSAs and so we have no  
way to validate that the protocol is functioning properly.
For the sake of proper functioning, people implementing option #3
must implement a robust notification mechanism. I agree with you
that independent, un-notified changes at a given point in the
tree produce undesired situations. I don't think we have to worry
about future implementations, just provide a flexible interface
for doing so.
...
- If we flood Provision() requests up and/or down the tree to  
communicate to our parent/children NSAs that we auto-induced  a  
state transition for a connection, how do we handle collisions of  
Provision() requests?  Are there any race conditions or hysteresis?
Again, I think this is an implementation problem and is useful for  
the people working on error handling.
...
-If we send Provision() requests up the tree...is that any  
different than sending them down the tree?  Are they handled any  
differently?   If not, would it simplify the protocol if only the  
root initiated the Provision() requests based on a timer?  i.e.  
does having all NSAs independently kick off timers cause  
unnecessary complexity?
We faced this problem in the Phosphorus project some time ago, when
designing the Inter-Domain Broker communication protocol for our
Harmony tool. After long discussions, we concluded this could be
easily handled by a proper security infrastructure. Any NSA should
be able to know who its neighbours are and what they are allowed
to request. In the case of managing a GMPLS control plane with
manual instantiation of the connection at the UNI, we needed a
mechanism to notify the service plane that the connection was
provisioned, and thus we implemented the notification service.
Alternatively, if notification is not available, we could use a
polling mechanism and later flood the reservation state
among all service plane entities involved. I agree that it adds
complexity, but I mentioned it based on a use case from Phosphorus.
...
- if we do have a protocol that handles mid-tree flooding of  
state changes, could this also be useful for the release()  
function or for possible error or failure mode handling scenarios?
In my mind, I was not considering mid-tree, only changes flooded  
from root NSA or network-attached NSA.
...
- What happens if the wall clocks in the NSA are not "adequately"  
synchronized.   Can we assume that wall clock synchronization is  
a non-issue?  If not, what does "adequate" mean in this context?
As I mentioned before, since we are considering calendars to be
used on a per-network (in general, per-resource) basis for advance
reservations, we should at least recommend that time sync is used.
...
- How does a requester determine if/when a connection is
in-service?  Do we simply fire data at it until we see it pop out
the other end?  How do we know that?  What can we say
definitively about the connection state from the user's service
perspective as we move from reserved to in-service?  (Even a few
milliseconds could result in megabytes of information being  
leaked out or lost from a connection.  This is both very poor  
service quality as well as a security risk.)
This problem is present in *any* provisioning strategy and should be
discussed. The trivial solution (inject data and expect it to be
transmitted) is something that can cause severe data losses, as
you mention. From my past experience, when using a provisioning
system, one cannot be 100% sure the path is provisioned until the
data plane is tested (e.g. with ping). Assuming the network hardware
works well, we need the NRM to control path setup. Then we have
several options, but the most typical are (i) implement a way for
the NRM to tell its parent NSA the connection is UP, and then let
the NSA propagate this information upwards along the service tree,
or (ii) implement a polling loop in the NSA in charge of the NRM,
until "connection UP" is obtained, and then proceed with the
flooding. In either case, "connection UP" messages from the NRMs
should be aggregated to build up "path segment UP", until a whole
"path UP" is ready. Then, the original requester can be notified.
Please note these are only some options; many others can be
designed. The NSI protocol should not stick to any of them, but
should consider them in order to make NSI flexible enough (=> will
ease adoption).
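
The aggregation step described above could look roughly like the following Python sketch. The SegmentAggregator class, its method name, and the identifiers ("nrm1", "segA", ...) are invented for illustration. How the UP reports arrive (notification as in option (i), or polling as in option (ii)) is left out on purpose.

```python
class SegmentAggregator:
    """Collects UP reports from the sources one NSA depends on."""

    def __init__(self, expected):
        # Identifiers of the NRMs or child segments that must all be UP.
        self.expected = set(expected)
        self.up = set()

    def report_up(self, source):
        """Record one UP report; return True once every expected source is UP."""
        if source not in self.expected:
            raise KeyError(f"unexpected source: {source}")
        self.up.add(source)
        return self.up == self.expected


# Two NRMs make up segment A; segments A and B make up the whole path.
seg_a = SegmentAggregator(["nrm1", "nrm2"])
path = SegmentAggregator(["segA", "segB"])

seg_a.report_up("nrm1")
if seg_a.report_up("nrm2"):          # "path segment UP" for segment A
    path.report_up("segA")
path_up = path.report_up("segB")     # True only when the whole path is UP
```

Only when the top-level aggregator returns True would the original requester be told that the path is in-service.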
...
- Is it adequate to have provisioned() segments in-service before  
all segments are in-service?  Does this cause a timing issue in  
the tree that could result in mis-routed data?   Do we need to  
stipulate some atomic function that allows data to begin flowing  
that is different than simply configuring the path?
This can happen, given that the networks cooperating in the
connection provisioning are independent. Thinking about this, I get
the impression we need a message from the service plane to the app
stating that the whole path is in-service.
...
Finally, I assert that there is no significant legacy application  
base that should dictate how NSI protocol functions just so that  
they don't have to change.   If there are some requirements that
those applications need that NSI needs to support, ok...but I
don't think it's a good idea to make a future protocol act in some
way just because that's the way we've always done it.  (That logic
says let's not change anything...:-)
I fully agree, Jerry. I hope you did not understand otherwise; sorry
if I expressed myself wrongly.
My comments were two-fold:
  - First, we should make the first release of the NSI implement  
the options we think are easier to adopt, but in any case limit it  
to them.
  - Second, a future-proof NSI protocol first needs to be
present-proof. Bad early adoption means hard future adoption.
...
My proposal:
1) I think we should do a "requester initiated provisioning"  
request down the tree as a first simplest provisioning initiation  
process.  2a) If we can work through the mid-tree messaging and  
state transition in a prompt manner, we could include that also  
in V1.0, but I think it may be more complex. 2b) A simple interim
solution would be to have the first PA that sees a "provider
initiated provisioning" request assume that responsibility and
pass a "requester initiated provisioning" request to all
children.  Then that first PA sends provision() requests down the  
tree to start the process just as with a requester initiated  
provisioning.
If we go for requester-initiated provisioning, we need to control
both reservation and provisioning calendars, don't we? I mean,
reservation requests will only serve as a "medium access control"
mechanism, purely signalling, without really influencing the
provisioning. In this case, the user has to take care of both
reservation and instantiation.
What about the time lapse between the user requesting provisioning
and the data plane effectively being up...?
...
I know I can be a bit brusque with some of this - hope it all
makes sense and was not too curt.
It all makes sense. Please bear in mind that I always want to
broaden the applicability of the NSI and to strengthen its
flexibility, since for me they're the key to easy adoption (of
course, without becoming too generic).
My best regards,
--
Joan A. García-Espín
CTX, i2CAT Foundation
On 02/04/2010, at 22:39, Jerry Sobieski wrote:
...
Hi Joan-
I have some reservations to what you suggest below...please let  
me know what you think...
Joan A. García-Espín wrote:
...
<<Some long discussions on how best to initiate the  
‘provisioning’ phase of operation.  Several methods were  
discussed, but none agreed:
1.     Provisioning initiated by signal i.e. a message on the  
NSI interface.
2.     Provisioning initiated by a timer local to the provider NSA
3.     Provisioning initiated by a signal not on the NSI  
interface, e.g. on a separate control plane
...
>
By definition,  #3 above is out of scope for the NSI protocol.     
If some other agent (or control plane) wants to initiate the  
provisioning, it does so thru a mechanism that is defined within  
the NSI protocol.     I expect #1 will suffice if some external  
agent needs to kick it off.
...
My vision is that ANY of the procedures should be supported in  
the NSI. Therefore, NSI must implement the required flag(s) in  
the service characterisation or specific signalling message(s)  
for automatic activation, but all of them might be considered  
optional for allowing either one or the other model to be used  
by requester.
The remaining #1 and #2 amount to "Requester initiated" and
"Provider initiated".   And either could be initiated via a timer  
expiring, or by some other event in a work flow, etc.   It seems  
to me the *simplest* thing to do is to have the requester  
initiate the provisioning.  This is just like we have always done  
it with UNIs or RSVP or any other protocol, so it is well  
understood.  Because it is initiated at the root of the request  
tree, it flows down the tree and back up in a very deterministic
fashion.  We know when each segment is fully provisioned.   There  
is no special handling, it does not add complexity to have the  
provision request slide down the request tree just like the  
reservation() request did initially.
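
The deterministic down-and-back-up flow described above can be sketched in Python as follows. This is only an illustration under assumptions: the NSA class, the boolean return value standing in for a ProvisionComplete() reply, and the log format are all hypothetical, and a real protocol would exchange messages asynchronously rather than through nested calls.

```python
class NSA:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.state = "reserved"

    def provision(self, log):
        """Forward the request to every child, then activate the local segment.

        Returns True, standing in for a ProvisionComplete() reply to the
        parent. Returns False as soon as any child fails to complete.
        """
        log.append(f"provision -> {self.name}")
        for child in self.children:
            if not child.provision(log):
                return False
        self.state = "active"
        log.append(f"complete <- {self.name}")
        return True


# The requester sends the request to the root of the request tree.
root = NSA("root", [NSA("a"), NSA("b")])
log = []
ok = root.provision(log)
```

Because the root only reports completion after every child has done so, the requester knows when each segment is fully provisioned, which is the property argued for here.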
So, it is my proposal that we have at least #1 in version 1:  the
connection is provisioned on requester sending a "provision()"  
request/command to the provider.   And when a  
"ProvisionComplete()" message is received, the connection is
...
From a pragmatic viewpoint, option 2 is the best candidate for
NSI release 1.0, since it does not assume that either the requester
app or the control plane has the capability to perform an
automatic activation/instantiation of the connection. Basically,
this means that legacy compatibility is ensured in early
implementations, at the cost of having to implement time sync in
the NSA. This is a good point in general because (i) most
apps/middlewares are not used to dealing with a generic GMPLS UNI
and thus don't implement automatic signalling in the app <-> control
plane interface, and (ii) not all control planes or
provisioning/management systems implement calendars, and
consequently they cannot natively support advance reservations
(which would be emulated at the service plane).
Hmm...I think I have to disagree strongly.
When we expect these NSAs to start based on independent clocks  
and without verification that the service/control plane  
associated with this connection is even functioning, I think we  
introduce a myriad of complexities that need to be discussed in  
detail.   Mostly, blind provisioning is different in a tree  
structured service plane than in OSCARS or AutoBahn or a GMPLS  
style service plane.   We need to understand these issues better  
before we wave our hands and say it's simple.
Here is a list of some issues in no particular order:
- The start time may be weeks or months after the request was  
reserved.  How do we know if the service tree for that connection  
is still functioning?  For instance, if the service tree is  
broken, how do we expect the provisioning to complete?
(At least with requester initiated provisioning we get a request  
sent down the tree that verifies the service tree is still in  
place.)
- With independent provisioning, i.e. independent state changes,  
we don't have any means of determining the state of the  
connection unless we communicate that state up and down the  
service tree.  We know what our state is, but we have no way of  
predicting the state of our parent/child NSAs and so we have no  
way to validate that the protocol is functioning properly.
- If we flood Provision() requests up and/or down the tree to  
communicate to our parent/children NSAs that we auto-induced  a  
state transition for a connection, how do we handle collisions of  
Provision() requests?  Are there any race conditions or hysteresis?
-If we send Provision() requests up the tree...is that any  
different than sending them down the tree?  Are they handled any  
differently?   If not, would it simplify the protocol if only the  
root initiated the Provision() requests based on a timer?  i.e.  
does having all NSAs independently kick off timers cause  
unnecessary complexity?
- if we do have a protocol that handles mid-tree flooding of  
state changes, could this also be useful for the release()  
function or for possible error or failure mode handling scenarios?
- What happens if the wall clocks in the NSA are not "adequately"  
synchronized.   Can we assume that wall clock synchronization is  
a non-issue?  If not, what does "adequate" mean in this context?
- How does a requester determine if/when a connection is
in-service?  Do we simply fire data at it until we see it pop out
the other end?  How do we know that?  What can we say
definitively about the connection state from the user's service
perspective as we move from reserved to in-service?  (Even a few
milliseconds could result in megabytes of information being  
leaked out or lost from a connection.  This is both very poor  
service quality as well as a security risk.)
- Is it adequate to have provisioned() segments in-service before  
all segments are in-service?  Does this cause a timing issue in  
the tree that could result in mis-routed data?   Do we need to  
stipulate some atomic function that allows data to begin flowing  
that is different than simply configuring the path?
We do not have currently any notion of what these NSI protocol  
requirements entail.
I do not differentiate between a timer event kicking off  
provisioning or some other non-calendar related event (say a work  
flow task completing) that defines the kickoff point.   Whatever  
event occurs generates a Provision() request within that NSA  
which causes the state machine to transition.
Finally, I assert that there is no significant legacy application  
base that should dictate how NSI protocol functions just so that  
they don't have to change.   If there are some requirements that
those applications need that NSI needs to support, ok...but I
don't think it's a good idea to make a future protocol act in some
way just because that's the way we've always done it.  (That logic
says let's not change anything...:-)
My proposal:
1) I think we should do a "requester initiated provisioning"  
request down the tree as a first simplest provisioning initiation  
process.  2a) If we can work through the mid-tree messaging and  
state transition in a prompt manner, we could include that also  
in V1.0, but I think it may be more complex. 2b) A simple interim
solution would be to have the first PA that sees a "provider
initiated provisioning" request assume that responsibility and
pass a "requester initiated provisioning" request to all
children.  Then that first PA sends provision() requests down the  
tree to start the process just as with a requester initiated  
provisioning.
I know I can be a bit brusque with some of this - hope it all
makes sense and was not too curt.
Happy Easter!
Jerry
...
Best regards and have nice Easter holidays,
-- 
Joan A. García-Espín
Network Technologies Cluster (CTX)
i2CAT Foundation
C/ Gran Capità 2-4, office 203, Nexus I building
08034 Barcelona, Catalonia (Spain)
T: +34 93 553 2518
F: +34 93 553 2520