It appears I accidentally neglected to send this to the NSI group.
Begin forwarded message:
> From: "Joan A. García-Espín" <jage(a)i2cat.net>
> Date: April 13, 2010 5:05:49 AM GMT-04:00
> To: John Vollbrecht <jrv(a)internet2.edu>
> Subject: Re: NSI service instantiation or provisioning signalling
>
> Hi John,
>
> I am very sorry, but somehow I missed your email. My apologies.
>
>> A- Some points that I think we agree on (I could be wrong) -
>> 1) An NSA will have a state machine (or equivalent). That state
>> machine will have a set of states including reserved and active.
>> The issue we are talking about has to do with going between the
>> reserved and active state.
> Right, agreed.
>
>> 2) A triggering event of some sort will be what causes the NSA to
>> change states (and perhaps do some other things).
> Right also.
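>
> For illustration only, here is a minimal Python sketch of point A as I read
> it: an NSA state machine with RESERVED and ACTIVE states whose transition is
> driven by a triggering event. The names are invented for the example and are
> not taken from the NSI documents.
>
>   from enum import Enum, auto
>
>   class NsaState(Enum):
>       RESERVED = auto()
>       ACTIVE = auto()
>
>   class Nsa:
>       """Toy NSA: holds one connection's state and reacts to trigger events."""
>       def __init__(self):
>           self.state = NsaState.RESERVED
>
>       def on_trigger(self, event):
>           # The triggering event (point A.2) moves the machine from
>           # RESERVED to ACTIVE; other actions could be hooked in here.
>           if self.state is NsaState.RESERVED and event == "provision":
>               self.state = NsaState.ACTIVE
>
>   nsa = Nsa()
>   nsa.on_trigger("provision")
>   assert nsa.state is NsaState.ACTIVE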
>
>> B- The discussion of ways to instantiate a connection seems to me
>> to have aspects
>> 1) who generates the signal to the state machine
>> 2) what other actions does the signal cause to happen
> Exactly; depending on who signals the instantiation, the service
> workflow may be affected in one way or another. The discussion is
> about which option is best to include in v1.0.
>
>> C- There are questions about how to signal state between NSAs
>> 1) is there a "master" that initiates the signal or is it
>> independently done
>> 2) if one NSA fails how do others know and keep in sync
> Right.
>
>> Of A, B and C - I hope we agree about A.
>> I think B maybe needs some discussion, but I think we could agree
>> pretty quickly.
>> C is harder, and I think needs more discussion of how NSAs keep
>> state. Some of this might be put off till after A and B are done
>> and in the doc (in my opinion).
>
> Right also.
>
> Fully aligned!
> --
> Joan A. García-Espín
> CTX, i2CAT Foundation
>
>
>
>
>
> On 06/04/2010, at 23:59, John Vollbrecht wrote:
>
>> I am a bit confused by the discussion on this, so I suggest
>> abstracting out the issues a bit to see where we agree or disagree.
>>
>> A- Some points that I think we agree on (I could be wrong) -
>> 1) An NSA will have a state machine (or equivalent). That state
>> machine will have a set of states including reserved and active.
>> The issue we are talking about has to do with going between the
>> reserved and active state.
>> 2) A triggering event of some sort will be what causes the NSA to
>> change states (and perhaps do some other things).
>>
>> B- The discussion of ways to instantiate a connection seems to me
>> to have aspects
>> 1) who generates the signal to the state machine
>> 2) what other actions does the signal cause to happen
>>
>> C- There are questions about how to signal state between NSAs
>> 1) is there a "master" that initiates the signal or is it
>> independently done
>> 2) if one NSA fails how do others know and keep in sync
>>
>> Of A, B and C - I hope we agree about A.
>> I think B maybe needs some discussion, but I think we could agree
>> pretty quickly.
>> C is harder, and I think needs more discussion of how NSAs keep
>> state. Some of this might be put off till after A and B are done
>> and in the doc (in my opinion).
>>
>> Does this make any sense?
>>
>> John
>>
>>
>>
>> On Apr 6, 2010, at 9:57 AM, Joan A. García-Espín wrote:
>>
>>> Hi Jerry, thanks for your comments!
>>>
>>> I've extracted some of your comments and added my view (just to
>>> avoid a large in-line discussion).
>>>
>>>> By definition, #3 above is out of scope for the NSI protocol.
>>>> If some other agent (or control plane) wants to initiate the
>>>> provisioning, it does so through a mechanism that is defined within
>>>> the NSI protocol. I expect #1 will suffice if some external
>>>> agent needs to kick it off.
>>>
>>> As for #3, I was thinking of a scenario where the NRM
>>> automatically (internally) triggers the provisioning. In this
>>> case, the service plane could consider letting the NSA in charge
>>> of that NRM poll it for the reservation status and then notify
>>> the rest of the NSAs in the provisioning tree (upwards
>>> approach).
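>>>
>>> As a rough sketch of that upwards approach (purely illustrative;
>>> poll_nrm and notify_parent are invented callables, not NSI
>>> primitives):
>>>
>>>   import time
>>>
>>>   def watch_nrm(poll_nrm, notify_parent, interval_s=5.0):
>>>       """Poll the local NRM until it reports the connection active,
>>>       then push the status change up the provisioning tree."""
>>>       while True:
>>>           status = poll_nrm()          # e.g. "reserved" or "active"
>>>           if status == "active":
>>>               notify_parent("active")  # upwards notification to the parent NSA
>>>               return
>>>           time.sleep(interval_s)       # poll again after a short delay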
>>>
>>>> Hmm...I think I have to disagree strongly.
>>>
>>> Good :)
>>>
>>>> When we expect these NSAs to start based on independent clocks
>>>> and without verification that the service/control plane
>>>> associated with this connection is even functioning, I think we
>>>> introduce a myriad of complexities that need to be discussed in
>>>> detail. Mostly, blind provisioning is different in a tree
>>>> structured service plane than in OSCARS or AutoBahn or a GMPLS
>>>> style service plane. We need to understand these issues better
>>>> before we wave our hands and say it's simple.
>>>
>>> My view: the Service Plane is an abstract layer where NSAs
>>> interact to allow network service provisioning (we all agree
>>> on this). One of the key factors here is that NSI aims at
>>> supporting not only reservations (where the time dimension is
>>> desirable), but also advance or book-ahead reservations (where the
>>> time dimension is compulsory). Therefore, time synchronisation is
>>> something that we know will, sooner or later, appear on stage. The
>>> NSI cannot force implementers to use a given time-sync method, but
>>> we can recommend its use as a way to achieve precise service
>>> provisioning and to lower the rate of unsatisfied services (error
>>> handling gains importance).
>>>
>>>> - The start time may be weeks or months after the request was
>>>> reserved. How do we know if the service tree for that connection
>>>> is still functioning? For instance, if the service tree is
>>>> broken, how do we expect the provisioning to complete? (At least
>>>> with requester initiated provisioning we get a request sent down
>>>> the tree that verifies the service tree is still in place.)
>>>
>>> NS status query is a primitive, afaik. An NSA can use it to
>>> verify the service tree, periodically or on demand. Which NSA? I'd
>>> say the one that first initiated the service provisioning (the root NSA).
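>>>
>>> As a hedged sketch of such a check by the root NSA (query_status
>>> stands in for the NS status-query primitive; the list of children is
>>> hypothetical):
>>>
>>>   def verify_service_tree(children, query_status):
>>>       """Root-NSA check: query every child NSA's reservation status and
>>>       report which branches of the service tree are unreachable."""
>>>       broken = []
>>>       for child in children:
>>>           try:
>>>               query_status(child)   # raises if the child cannot be reached
>>>           except ConnectionError:
>>>               broken.append(child)
>>>       return broken                 # empty list => tree still in place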
>>>
>>>> - With independent provisioning, i.e. independent state changes,
>>>> we don't have any means of determining the state of the
>>>> connection unless we communicate that state up and down the
>>>> service tree. We know what our state is, but we have no way of
>>>> predicting the state of our parent/child NSAs and so we have no
>>>> way to validate that the protocol is functioning properly.
>>>
>>> For the sake of correct operation, people implementing option #3
>>> must provide a robust notification mechanism. I agree with you
>>> that independent, unnotified changes at a given point in the
>>> tree produce undesired situations. I don't think we have to worry
>>> about future implementations; we just have to provide a flexible
>>> interface for doing so.
>>>
>>>> - If we flood Provision() requests up and/or down the tree to
>>>> communicate to our parent/children NSAs that we auto-induced a
>>>> state transition for a connection, how do we handle collisions of
>>>> Provision() requests? Are there any race conditions or hysteresis?
>>>
>>> Again, I think this is an implementation problem, and one that is
>>> relevant to the people working on error handling.
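>>>
>>> For example, an implementation could defuse colliding Provision()
>>> requests by making the handler idempotent. A sketch only, not part
>>> of the protocol itself:
>>>
>>>   import threading
>>>
>>>   def handle_provision(connection, lock):
>>>       """Idempotent Provision() handler: concurrent or duplicate requests
>>>       for the same connection collapse into one state transition."""
>>>       with lock:                       # serialise colliding requests
>>>           if connection["state"] == "active":
>>>               return "already-active"  # duplicate request: nothing to do
>>>           connection["state"] = "active"
>>>           return "provisioned"
>>>
>>>   conn = {"id": "c1", "state": "reserved"}
>>>   lk = threading.Lock()
>>>   handle_provision(conn, lk)
>>>   handle_provision(conn, lk)           # colliding call is a harmless no-op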
>>>
>>>> - If we send Provision() requests up the tree...is that any
>>>> different than sending them down the tree? Are they handled any
>>>> differently? If not, would it simplify the protocol if only the
>>>> root initiated the Provision() requests based on a timer? i.e.
>>>> does having all NSAs independently kick off timers cause
>>>> unnecessary complexity?
>>>
>>> We faced this problem in the Phosphorus project some time ago, when
>>> designing the Inter-Domain Broker communication protocol for our
>>> Harmony tool. After long discussions, we concluded this could be
>>> handled easily by a proper security infrastructure: any NSA should
>>> be able to know who its neighbours are and what they are allowed
>>> to request. In the case of managing a GMPLS control plane with
>>> manual instantiation of the connection at the UNI, we needed a
>>> mechanism to notify the service plane that the connection was
>>> provisioned, and thus we implemented the notification service.
>>> Alternatively, if notification is not available, we could use a
>>> polling mechanism and later flood the reservation state among
>>> all the service plane entities involved. I agree that this adds
>>> complexity, but I mention it based on a use case from Phosphorus.
>>>
>>>> - if we do have a protocol that handles mid-tree flooding of
>>>> state changes, could this also be useful for the release()
>>>> function or for possible error or failure mode handling scenarios?
>>>
>>> In my mind, I was not considering mid-tree flooding, only changes
>>> flooded from the root NSA or a network-attached NSA.
>>>
>>>> - What happens if the wall clocks in the NSAs are not "adequately"
>>>> synchronized? Can we assume that wall clock synchronization is
>>>> a non-issue? If not, what does "adequate" mean in this context?
>>>
>>> As I mentioned before, since we are considering calendars to be
>>> used on a per-network (in general, per-resource) basis for advance
>>> reservations, we should at least recommend that time sync is used.
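>>>
>>> To make "adequate" slightly more concrete, a toy sketch: if
>>> provisioning is timer-driven, an implementation could add a guard
>>> interval no smaller than the assumed clock-skew bound, so a fast
>>> clock cannot activate before its peers (MAX_SKEW_S is an assumed
>>> figure, not a recommendation):
>>>
>>>   import time
>>>
>>>   MAX_SKEW_S = 2.0   # assumed bound on NSA wall-clock skew (e.g. NTP-synced)
>>>
>>>   def sleep_until_start(start_epoch_s):
>>>       """Timer-driven provisioning: wait for the reserved start time plus
>>>       a guard interval, then raise the local provision trigger."""
>>>       delay = (start_epoch_s + MAX_SKEW_S) - time.time()
>>>       if delay > 0:
>>>           time.sleep(delay)
>>>       # ...the local Provision() trigger would fire here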
>>>
>>>> - How does a requester determine if/when a connection is in-
>>>> service? Do we simply fire data at it until we see it pop out
>>>> the other end? How do we know that? What can we say
>>>> definitively about the connection state from the user's service
>>>> perspective as we move from reserved to in-service...? (Even a few
>>>> milliseconds could result in megabytes of information being
>>>> leaked out or lost from a connection. This is both very poor
>>>> service quality as well as a security risk.)
>>>
>>> This problem is present in *any* provisioning strategy and should be
>>> discussed. The trivial solution (inject data and expect it to be
>>> transmitted) is something that can cause severe data losses, as
>>> you mention. From my past experience, when using a provisioning
>>> system, one cannot be 100% sure the path is provisioned until the
>>> data plane is tested (e.g. with ping). Assuming the network hardware
>>> works well, we need the NRM to control path set-up. Then we have
>>> several options, but the most typical are (i) implement a way for
>>> the NRM to tell its parent NSA the connection is UP, and then let
>>> the NSA propagate this information upwards along the service tree,
>>> or (ii) implement a polling loop in the NSA in charge of the NRM,
>>> until "connection UP" is obtained, and proceed with the flooding. In
>>> either case, "connection UP" messages from the NRMs should be
>>> aggregated to build up "path segment UP", until a whole "path
>>> UP" is ready. Then the original requester can be notified. Please
>>> note these are only some options; many others can be designed. The
>>> NSI protocol should not be tied to any of them, but should consider
>>> them to keep NSI flexible enough (which will ease adoption).
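>>>
>>> A minimal sketch of the aggregation step (the segment names are
>>> invented for the example):
>>>
>>>   def aggregate_path_state(segment_states):
>>>       """Fold per-NRM 'connection UP' reports into one path state:
>>>       the path is UP only when every segment reports UP."""
>>>       if all(state == "UP" for state in segment_states.values()):
>>>           return "PATH UP"      # safe to notify the original requester
>>>       return "PATH PENDING"
>>>
>>>   print(aggregate_path_state({"netA": "UP", "netB": "UP", "netC": "DOWN"}))
>>>   # -> PATH PENDING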
>>>
>>>> - Is it adequate to have provisioned() segments in-service before
>>>> all segments are in-service? Does this cause a timing issue in
>>>> the tree that could result in mis-routed data? Do we need to
>>>> stipulate some atomic function that allows data to begin flowing
>>>> that is different than simply configuring the path?
>>>
>>> This can happen, given that the networks cooperating in the connection
>>> provisioning are independent. Thinking about this, I get the
>>> impression we need some message from the service plane to the
>>> application saying that the whole path is in-service.
>>>
>>>> Finally, I assert that there is no significant legacy application
>>>> base that should dictate how the NSI protocol functions just so that
>>>> they don't have to change. If there are some requirements that
>>>> those applications need that NSI needs to support, OK...but I
>>>> don't think it's a good idea to make a future protocol act in some
>>>> way just because that's the way we've always done it. (That logic
>>>> says let's not change anything...:-)
>>>
>>> I fully agree, Jerry; I hope you did not understand otherwise, and
>>> I am sorry if I expressed myself poorly.
>>> My comments were two-fold:
>>> - First, we should make the first release of the NSI implement
>>> the options we think are easier to adopt, but not, in any case,
>>> limit it to them.
>>> - Second, a future-proof NSI protocol first needs to be present-
>>> proof. Bad early adoption means hard future adoption.
>>>
>>>> My proposal:
>>>> 1) I think we should do a "requester initiated provisioning"
>>>> request down the tree as the first, simplest provisioning initiation
>>>> process.
>>>> 2a) If we can work through the mid-tree messaging and state
>>>> transitions in a prompt manner, we could include that also in V1.0,
>>>> but I think it may be more complex.
>>>> 2b) A simple interim solution would be to have the first PA that
>>>> sees a "provider initiated provisioning" request assume that
>>>> responsibility and pass a "requester initiated provisioning" request
>>>> to all children. Then that first PA sends provision() requests down
>>>> the tree to start the process just as with a requester initiated
>>>> provisioning.
>>>
>>> If we go for requester-initiated provisioning, we need to control
>>> both reservation and provisioning calendars, don't we? I mean,
>>> reservation requests will only serve as a "medium access control"
>>> mechanism, purely signalling, not really influencing the
>>> provisioning. In this case, the user has to take care of both
>>> reservation and instantiation.
>>> What about the time lapse between the user requesting provisioning
>>> and the data plane effectively being up...?
>>>
>>>> I know I can be a bit brusque with some of this - hope it all
>>>> makes sense and was not too curt.
>>>
>>>
>>> It all makes sense. Please bear in mind that I always want to broaden
>>> the applicability of the NSI and to strengthen its flexibility, since
>>> for me they are the key to easy adoption (of course, without
>>> becoming too generic).
>>>
>>> My best regards,
>>> --
>>> Joan A. García-Espín
>>> CTX, i2CAT Foundation
>>>
>>>
>>>
>>>
>>>
>>> On 02/04/2010, at 22:39, Jerry Sobieski wrote:
>>>
>>>> Hi Joan-
>>>>
>>>> I have some reservations about what you suggest below...please let
>>>> me know what you think...
>>>>
>>>> Joan A. García-Espín wrote:
>>>>>
>>>>> <<Some long discussions on how best to initiate the
>>>>> ‘provisioning’ phase of operation. Several methods were
>>>>> discussed, but none was agreed upon:
>>>>>
>>>>> 1. Provisioning initiated by a signal, i.e. a message on the
>>>>> NSI interface.
>>>>> 2. Provisioning initiated by a timer local to the provider NSA.
>>>>> 3. Provisioning initiated by a signal not on the NSI
>>>>> interface, e.g. on a separate control plane.
>>>>> >>
>>>> By definition, #3 above is out of scope for the NSI protocol.
>>>> If some other agent (or control plane) wants to initiate the
>>>> provisioning, it does so through a mechanism that is defined within
>>>> the NSI protocol. I expect #1 will suffice if some external
>>>> agent needs to kick it off.
>>>>>
>>>>> My vision is that ANY of the procedures should be supported in
>>>>> the NSI. Therefore, NSI must implement the required flag(s) in
>>>>> the service characterisation, or specific signalling message(s)
>>>>> for automatic activation, but all of them might be considered
>>>>> optional, allowing either one model or the other to be used
>>>>> by the requester.
>>>> The remaining #1 and #2 amount to "Requester initiated" and
>>>> "Provider initiated". And either could be initiated via a timer
>>>> expiring, or by some other event in a work flow, etc. It seems
>>>> to me the *simplest* thing to do is to have the requester
>>>> initiate the provisioning. This is just like we have always done
>>>> it with UNIs or RSVP or any other protocol, so it is well
>>>> understood. Because it is initiated at the root of the request
>>>> tree, it flows down the tree and back up in a very deterministic
>>>> fashion. We know when each segment is fully provisioned. There
>>>> is no special handling; it does not add complexity to have the
>>>> provision request slide down the request tree just like the
>>>> reservation() request did initially.
>>>> So, it is my proposal that we have at least #1 in version 1: the
>>>> connection is provisioned upon the requester sending a "provision()"
>>>> request/command to the provider. And when a
>>>> "ProvisionComplete()" message is received, the connection is
>>>> in service.
>>>>>
>>>>> From a pragmatic viewpoint, option 2 is the best candidate for
>>>>> NSI release 1.0, since it does not assume that either the requester
>>>>> app or the control plane has the capability to perform an
>>>>> automatic activation/instantiation of the connection. Basically,
>>>>> this means that legacy compatibility is ensured in early
>>>>> implementations, although time sync has to be implemented in the
>>>>> NSA. This is a good point in general because (i) most
>>>>> apps/middlewares are not used to dealing with a generic GMPLS UNI
>>>>> and thus don't implement automatic signalling in the app <-> control
>>>>> plane interface, and (ii) not all control planes or
>>>>> provisioning/management systems implement calendars and
>>>>> consequently cannot natively support advance reservations (these
>>>>> would have to be emulated at the service plane).
>>>> Hmm...I think I have to disagree strongly.
>>>> When we expect these NSAs to start based on independent clocks
>>>> and without verification that the service/control plane
>>>> associated with this connection is even functioning, I think we
>>>> introduce a myriad of complexities that need to be discussed in
>>>> detail. Mostly, blind provisioning is different in a tree
>>>> structured service plane than in OSCARS or AutoBahn or a GMPLS
>>>> style service plane. We need to understand these issues better
>>>> before we wave our hands and say it's simple.
>>>>
>>>> Here is a list of some issues in no particular order:
>>>>
>>>> - The start time may be weeks or months after the request was
>>>> reserved. How do we know if the service tree for that connection
>>>> is still functioning? For instance, if the service tree is
>>>> broken, how do we expect the provisioning to complete?
>>>> (At least with requester initiated provisioning we get a request
>>>> sent down the tree that verifies the service tree is still in
>>>> place.)
>>>>
>>>> - With independent provisioning, i.e. independent state changes,
>>>> we don't have any means of determining the state of the
>>>> connection unless we communicate that state up and down the
>>>> service tree. We know what our state is, but we have no way of
>>>> predicting the state of our parent/child NSAs and so we have no
>>>> way to validate that the protocol is functioning properly.
>>>>
>>>> - If we flood Provision() requests up and/or down the tree to
>>>> communicate to our parent/children NSAs that we auto-induced a
>>>> state transition for a connection, how do we handle collisions of
>>>> Provision() requests? Are there any race conditions or hysteresis?
>>>> - If we send Provision() requests up the tree...is that any
>>>> different than sending them down the tree? Are they handled any
>>>> differently? If not, would it simplify the protocol if only the
>>>> root initiated the Provision() requests based on a timer? i.e.
>>>> does having all NSAs independently kick off timers cause
>>>> unnecessary complexity?
>>>>
>>>> - if we do have a protocol that handles mid-tree flooding of
>>>> state changes, could this also be useful for the release()
>>>> function or for possible error or failure mode handling scenarios?
>>>>
>>>> - What happens if the wall clocks in the NSAs are not "adequately"
>>>> synchronized? Can we assume that wall clock synchronization is
>>>> a non-issue? If not, what does "adequate" mean in this context?
>>>>
>>>> - How does a requester determine if/when a connection is in-
>>>> service? Do we simply fire data at it until we see it pop out
>>>> the other end? How do we know that? What can we say
>>>> definitively about the connection state from the user's service
>>>> perspective as we move from reserved to in-service...? (Even a few
>>>> milliseconds could result in megabytes of information being
>>>> leaked out or lost from a connection. This is both very poor
>>>> service quality as well as a security risk.)
>>>>
>>>> - Is it adequate to have provisioned() segments in-service before
>>>> all segments are in-service? Does this cause a timing issue in
>>>> the tree that could result in mis-routed data? Do we need to
>>>> stipulate some atomic function that allows data to begin flowing
>>>> that is different than simply configuring the path?
>>>>
>>>> We do not currently have any notion of what these NSI protocol
>>>> requirements entail.
>>>>
>>>> I do not differentiate between a timer event kicking off
>>>> provisioning and some other non-calendar-related event (say, a
>>>> workflow task completing) that defines the kickoff point. Whatever
>>>> event occurs generates a Provision() request within that NSA,
>>>> which causes the state machine to transition.
>>>> Finally, I assert that there is no significant legacy application
>>>> base that should dictate how the NSI protocol functions just so that
>>>> they don't have to change. If there are some requirements that
>>>> those applications need that NSI needs to support, OK...but I
>>>> don't think it's a good idea to make a future protocol act in some
>>>> way just because that's the way we've always done it. (That logic
>>>> says let's not change anything...:-)
>>>>
>>>> My proposal:
>>>> 1) I think we should do a "requester initiated provisioning"
>>>> request down the tree as the first, simplest provisioning initiation
>>>> process.
>>>> 2a) If we can work through the mid-tree messaging and state
>>>> transitions in a prompt manner, we could include that also in V1.0,
>>>> but I think it may be more complex.
>>>> 2b) A simple interim solution would be to have the first PA that
>>>> sees a "provider initiated provisioning" request assume that
>>>> responsibility and pass a "requester initiated provisioning" request
>>>> to all children. Then that first PA sends provision() requests down
>>>> the tree to start the process just as with a requester initiated
>>>> provisioning.
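>>>>
>>>> As a rough sketch of 1) / 2b): the root NSA (or the first PA that
>>>> accepts the responsibility) recurses a provision() request down the
>>>> request tree and reports completion back up. All names here are
>>>> invented for illustration:
>>>>
>>>>   def provision_down(node):
>>>>       """Requester-initiated provisioning: each NSA provisions its
>>>>       children first, then its own segment, and reports completion."""
>>>>       for child in node.get("children", []):
>>>>           provision_down(child)   # the request slides down the tree
>>>>       node["state"] = "active"    # local segment provisioned
>>>>       return "ProvisionComplete"  # flows back up to the requester
>>>>
>>>>   tree = {"name": "root", "children": [{"name": "domainA"},
>>>>                                        {"name": "domainB"}]}
>>>>   provision_down(tree)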
>>>>
>>>> I know I can be a bit brusque with some of this - hope it all
>>>> makes sense and was not too curt.
>>>>
>>>> Happy Easter!
>>>> Jerry
>>>>>
>>>>> Best regards, and have a nice Easter holiday,
>>>>> --
>>>>> Joan A. García-Espín
>>>>> Network Technologies Cluster (CTX)
>>>>> i2CAT Foundation
>>>>> C/ Gran Capità 2-4, office 203, Nexus I building
>>>>> 08034 Barcelona, Catalonia (Spain)
>>>>>
>>>>> T: +34 93 553 2518
>>>>> F: +34 93 553 2520
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>