Hi Jeff - glad to see you chime in!  See my responses inline...

Jeff W. Boote wrote:

On Sep 28, 2010, at 3:40 PM, Jerry Sobieski wrote:

Hi Radek - see comments below...

Radek Krzywania wrote:
Hi Jerry,
IMHO setup and tear down times should be considered global (all NSAs along a path). The user is not interested in partial results, as he/she is not even aware of (or interested in) which NSAs/domains are involved. The user doesn't care (if everything works fine ;) ).
But we cannot expect all NSA clocks to be exactly synchronized.  Clocks are critical to book-ahead scheduling, and independent, quasi-synchronized clocks (however slightly skewed) will cause problems.  Some of those problems are evident in this discussion.

Exact synchronization is not required. The protocol can (and probably should) define a reasonable synchronization requirement, e.g. NSAs MUST be synchronized to within 1 second (or even 10 seconds). That should be a relatively trivial requirement, and it bounds this problem.
I think the gotcha here is that even if we define a maximum skew between two interacting NSAs, that skew can be additive from one NSA to the next down/across the service tree.  And if the service tree contains a number of NSAs, that skew may become large.  Despite this, whether the skew is large or small, it must still be analyzed carefully.  The protocol can be defined to handle these effects - we just need to do a thorough analysis of the timing permutations.  This is careful work, but not particularly difficult - certainly not a roadblock.
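To make that concrete, here is a back-of-the-envelope sketch in Python (the 1-second pairwise bound and the tree depth are example values, not proposals):

    # If each adjacent pair of NSAs may differ by up to a pairwise
    # bound, the worst-case end-to-end skew grows with tree depth.
    PAIRWISE_SKEW_BOUND = 1.0   # seconds - example value only

    def worst_case_skew(tree_depth: int) -> float:
        # Worst-case offset between the root RA and the deepest NSA.
        return tree_depth * PAIRWISE_SKEW_BOUND

    print(worst_case_skew(5))   # 5 NSAs deep -> up to 5.0 seconds

So a tree only five NSAs deep could see five seconds of end-to-end skew even though every adjacent pair is "in spec".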

Further, if we require *any* type of clock synchronization for the protocol to work, we then need to define a mechanism within the protocol to either synchronize clocks or at least detect broken clocks.  And this check must be performed periodically during the NSA-NSA session to ensure the clocks don't drift out of conformance.  IMO, any required clock synchronization creates substantial complexity.  I think it safe to assume that NSA clocks will be "approximately" the same, but we still need to handle even slight skew effects rigorously in the protocol.
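For instance (a sketch only, not a proposal for the wire format - the constant and function are invented for illustration), each NSA-NSA message could carry the sender's timestamp, and the receiver could flag a peer whose clock appears out of tolerance:

    import time

    MAX_TOLERATED_SKEW = 10.0   # seconds - assumed protocol constant

    def peer_clock_ok(message_timestamp: float) -> bool:
        # Compare the sender's timestamp against our local clock.
        # Note that one-way network latency inflates the apparent
        # skew, so a real check would also have to bound or estimate
        # transit time.
        apparent_skew = abs(time.time() - message_timestamp)
        return apparent_skew <= MAX_TOLERATED_SKEW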

Question for discussion:  What happens if the time-of-day clocks are way off?  Effectively, the reservations may never be successful, or the provisioning may never succeed.  But the protocol should still work correctly even if someone's calendar is messed up.  Remember: the day clock may be messed up in one NSA, but the protocol agent must function for many possible service trees.  We can discuss this: I don't think we want to make it a function of the Connection Service to ensure that reservation clocks are right.  Or maybe we should?  Maybe the "scheduling" function - which uses a time-of-day clock that *should* be close to correct - does need some coordination... what do folks think?  IMO, the timers and protocol functions for the connection life cycle state machine should be able to function with independent clocks that are approximately correct - whether the error is a few milliseconds or a few days.



Open issue for discussion:  How do we address clock skew across a scheduled network?
 
For tear down, it does not matter where you start removing the configuration (at an end point, or at any point along the path). Once you remove a single configuration point, the service is not available any more. That is when the available time ends.
Actually, I would assert that the "available" time ends when the End Time elapses.  The "End Time" is by definition when the connection is no longer available, and the user should therefore not assume the resources are usable past that time.  Maybe the path resources will stay in place, but maybe not.  During reconfiguration, the state of cross connects along the path is indeterminate, and if/when they are reconfigured, user data can be [mis]routed to 3rd parties, and 3rd party data may be [mis]routed to the egress STP.

The real issue in my mind is "when is the actual End Time?"  Given that we cannot guarantee exactly when each NSA will reach its respective End Time, the End Time should be (IMHO:-) an Estimated End Time "plus or minus", and the user should consider the ramifications of this.

We do not know which NSA will reach End Time first and begin to tear down the connection.  Nor do we know the delta between this first NSA's clock and the user's clock.   

I don't think the NSA should be attempting to understand the delta of the user's clock. It can simply treat requests relative to 'true time'. And we should expect End Time to be handled similarly to Start Time. In other words, the user's requested time indicates when they expect to 'use' the circuit. If tear-down takes time, the resource time should add a delta to that. Tear-down should not start until 'after' the user's requested end time.
But who has the "True Time"?  This is the fundamental problem.  Every NSA thinks its time is the One True Time (:-).  And they all vary a little.  How does the protocol react when it discovers that some message or event has not occurred according to what it believes to be the proper time?  Clocks are either *exactly* synchronized, or they are not.  Since the latter is the real-world case, we need to make sure the protocol is designed to handle that.

IMO, we should treat time as relative.  I.e. each NSA maintains its own Time, but it must allow for others whose Time may be skewed.  So we need to consider what it actually means when an event occurs in each state, and design the protocol to react accordingly.
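One concrete way to "allow for" skewed peers (a sketch, assuming we can bound the worst-case accumulated skew) is to treat every deadline as a window rather than an instant:

    SKEW_BOUND = 5.0   # seconds - assumed worst-case accumulated skew

    def within_deadline(event_time: float, deadline: float) -> bool:
        # An event is "on time" if it lands anywhere inside the
        # deadline widened by the skew bound, not at one exact
        # local instant.
        return event_time <= deadline + SKEW_BOUND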

I think the ideal situation is that the user sees the Estimated End Time approaching (via their own clock) and stops sending some user-defined time prior to End Time.  The user lets the connection drain - still prior to End Time.  Once the user traffic is drained, the user RA issues a Release Request.  We can, I think, assume that if the user issues the manual Release Request before the scheduled availability has ended, then the user has verified that no important data remains in the pipe.  But due to clock skew and the estimated nature of the End Time, the user may substantially over-estimate how much time actually remains.  This is why I think a 2-minute warning might be useful - the user can request a warning of "n" seconds, and that warning will bubble up from the NSA whose clock is most advanced.  The user can then throttle down their traffic accordingly and then issue a Release Request.  While this might be nice, it is not fundamentally necessary for the protocol.  It helps the user, but the protocol must still be able to deterministically handle a user that ignores the warning and drives off the cliff.
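Something like the following sketch (the names and callback shape are hypothetical): each NSA arms the warning at End Time minus n by its *own* clock, so the NSA whose clock runs fastest fires first, and the RA simply acts on the first warning it receives:

    import threading, time

    def schedule_warning(local_end_time: float, n_seconds: float,
                         notify):
        # Fire the warning n_seconds before End Time as measured by
        # THIS NSA's clock.  Across the tree, the fastest clock fires
        # first; the RA should act on the first warning and ignore
        # the duplicates that follow.
        delay = max(0.0, (local_end_time - n_seconds) - time.time())
        threading.Timer(delay, notify).start()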

Fundamentally, we want to make sure the user isn't surprised by an earlier-than-expected release due to clock skew.  And we won't know until close to the End Time whose clock is going to trigger the Release first.  The warning reveals that NSA and gives the user fair warning.

This is reasonable and becomes manageable by the user if the protocol defines the maximum clock skew allowed.
Ah... but as noted above - even a maximum clock skew is additive.  And the skew is measured against... what?  And when is it measured?  How often?

Whatever the method for initiating the release, the network should ensure that any user data accepted at the ingress prior to the End Time is not misrouted - even after the End Time.  The network's only options are to try to deliver in-flight data properly or to drop it.  Since the End Time has been reached, the network can no longer assume that any segments are still usable, so delivering it is not really an option either.  The network must drop any stranded traffic.  Thus, we need some means of blocking new ingress data, and of ensuring bytes in flight get dropped asap.
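The ordering matters here.  A sketch of what I mean (the segment API is hypothetical): block the ingress first so nothing new is accepted, then remove - not reconfigure - the cross connects, so stranded bytes are dropped rather than handed to a 3rd party:

    def release_connection(segments):
        # Teardown that drops (never misroutes) stranded user data:
        # a removed cross connect discards traffic; only a
        # *re*configured one could deliver it to a 3rd party.
        segments[0].block_ingress()      # hypothetical segment API
        for seg in segments:
            seg.remove_cross_connect()   # drop, don't re-route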

One might take a different view if we hold the connection in place for some safety/guard time past the local End Time.  This would do several things: 1) it would make sure the End Time has elapsed for all NSAs, especially the user RA, thus allowing full use during the available timeframe, and 2) it would wait a few milliseconds longer (latency time) so that any data in flight is delivered.  At this point (after all NSAs have reached the End Time plus a latency factor) any remaining data in flight was definitely sent after the reservation ended.  Bad user, bad user.  In this case, any data in flight is no longer the network's concern.  Then we can reconfigure without regard to securing the user information.
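In other words, the local teardown trigger becomes something like this (a sketch - the skew bound and latency are assumed inputs):

    def safe_teardown_time(local_end_time: float,
                           max_accumulated_skew: float,
                           path_latency: float) -> float:
        # Earliest local time at which (1) End Time has provably
        # elapsed at every NSA, including the user RA, and (2) any
        # data sent before End Time has had one path latency to
        # land.  Anything still in flight after this was sent after
        # the reservation ended.
        return local_end_time + max_accumulated_skew + path_latency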

Finally, we might consider how to ensure that the connection is not torn down until *all* NSAs have reached the End Time.  This could be indicated by flooding an "End Time Alert" notification or some similar message along the tree.  When that message is acknowledged by all NSAs in the connection, then a Release can begin.  Of course, here again, if an acknowledgment is not received within a finite time, the connection is torn down unilaterally.
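A sketch of that coordination step (the message name and the peer interface are invented for illustration):

    import time

    def coordinated_release(peers, ack_timeout: float, release):
        # Flood an "End Time Alert" to every NSA in the tree, then
        # call release() once all have acknowledged - or once the
        # timeout expires, so one dead peer cannot pin the
        # connection forever.  'peers' is a hypothetical NSA handle
        # with send() and has_acked().
        for peer in peers:
            peer.send("END_TIME_ALERT")
        deadline = time.time() + ack_timeout
        while time.time() < deadline:
            if all(peer.has_acked() for peer in peers):
                break                 # everyone reached End Time
            time.sleep(0.1)
        release()                     # proceed either way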

I do, however, think we need to address End Time processing in V1.0.  This is important - we need a clearly defined lifecycle and primitives that do not promise something the protocol cannot deliver.  From this discussion, we cannot currently state clearly when the availability of the connection ends.

These are some very interesting and challenging nuances.  I hope these were useful musings...
br
Jerry
We can discuss whether it should be synchronized or signaled, but I would leave it for v2 (or v1.1, or whatever we decide). Once ALL segments of the connection have had their configuration removed, the resource time has ended. I agree that resource time is difficult to forecast, yet we need to fit it into a calendar full of other reservations and synchronize them.  Thus we need to estimate, guess, or use magic to get those values as realistic as possible. Overlapping is forbidden, and leaving gaps of unused resources would be a waste of resources and money in the end.
 
The "two minute warning" doesn't speak to me. I don't see a reason to warn a domain or user that the connection will be closed soon, when the user knows what was requested and the domain is tracking it with a calendar. We can discuss some internal notifiers, but that's implementation.
 
Best regards
Radek
 
________________________________________________________________________
Radoslaw Krzywania                      Network Research and Development
                                           Poznan Supercomputing and 
radek.krzywania@man.poznan.pl                   Networking Center
+48 61 850 25 26                             http://www.man.poznan.pl
________________________________________________________________________
 