Hello all -

Jerry and I had a discussion last week about the time issue. I think we developed a useful approach.

The idea is to define two times, which I think we all agree exist.

1) available time - time a connection is available to the application to communicate between devices
2) resource time - time a resource is reserved to support available time

To further define these -

Available time
- requested by the user for its application
- provided by the network

Resource time
- time a resource is allocated to a connection
- includes setup and teardown time, if any
- is the time in the reservation calendar for the resource

Available time as requested cannot be provided exactly by the network, because the network cannot predict exactly the length of setup and teardown. I believe we all agree with this. Therefore the provided available time can at best approximate the requested available time.

We agreed that when a user requests an automatic start connection it would request available time, and the provider would schedule resource time to get as close as possible. When a request is for a user initiated connection, the time would be the reserved time, and the user initiation can start anytime after the reserved time. Available time depends on the setup and teardown times of equipment.

-----------------
I think we agreed on the above definitions. The definitions of time seem useful in discussing what goes in connection service messages. We also talked about some possible implications of this.

The difference between available and resource time is setup and teardown time. While it is impossible to be sure exactly how long they will be, it may be possible to define something statistical - for example, setup takes an average of 17 sec with a standard deviation of N. If this can be defined for the resource, then one can make a prediction about when a connection will be available with a degree of confidence.

For example, this would allow one to request an automatic connection at 5pm and have it available 99% of the time. If the average setup time is 17 seconds and I add 10 seconds to be 99% sure, then the service would initiate the connection at 5:00:00 - 0:00:27, or 4:59:33.

We talked about including this "setup requirement" in the connection service definition of an NSA, and by implication including this in requests and replies. I think this is worth talking about in the group.

John
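[A minimal sketch of the calculation John describes above, assuming a normal model for setup time. The function, its parameters, and the specific standard deviation are illustrative only and are not part of any agreed NSI message or service definition.]

from datetime import datetime, timedelta
from statistics import NormalDist

def provisioning_start(requested_start, mean_setup_s, stddev_setup_s, confidence=0.99):
    """Return the time to begin setup so the connection is available by
    requested_start with the given confidence (normal model assumed)."""
    margin_s = NormalDist(mean_setup_s, stddev_setup_s).inv_cdf(confidence)
    return requested_start - timedelta(seconds=margin_s)

# John's example: mean setup of 17 s; with a standard deviation of roughly
# 4.3 s the 99th percentile is about 27 s, so setup would begin near 4:59:33.
print(provisioning_start(datetime(2010, 9, 27, 17, 0, 0), 17.0, 4.3))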
John and Jerry and all,

We had a lot of discussion on this earlier this year, before we took the connection services out of the architecture specification and when discussing error recovery. This is what I remember being discussed then - I just want to make sure it is aligned with the current conclusions, so please point out where the differences are:

1. The NSA-Requestor always asks for "Available Time".

2. The NSA-Provider may choose to add its own configurable "guard" time to the "Available time" to make it the "Resource time". This will be different and specific to the network operator providing the service. The resource time helps them maintain a reservation calendar of when resources are available and deals with resource overlaps, pre-emptions and all the good stuff.

3. The set of providers in the NSA service chain may each have their own "guard" times - there is no need to align them or come up with a predictable value. It depends on the kind of network that provider has built. There is no requirement to be close to or far from the start time requested.

4. Each of those NSA providers may have an "Auto-provision" or "Signaled provision" configuration. "Auto-provision" means that the provisioning start signal to the state machine is given by the NSA-Provider based on some timer. "Signaled provision" means that the NSA-Requestor may provision the start of connection setup (this allows connections to be instantiated for less than the total reservation time).

5. The original NSA-Requestor waits for the signal from the NSA-Provider telling it the connection is up. If the connection is not up before the requested "Available Start Time", the SLA takes over. The NSA-Requestor can then cancel the connection, query the connection, or notify the NSA-Provider that it is waiting for it. The NSA-Provider can independently choose to notify the NSA-Requestor that provisioning is in progress even though the start time has elapsed.

6. If there is an error in the connection setup, that is notified up the service chain to the requestor.

Hope this helps - happy to answer any clarification questions.

Inder
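[A rough sketch of point 2 above: a provider-local guard time expands the requested available time into resource time, which is then checked against a per-resource reservation calendar. Class and field names are hypothetical, not NSI syntax.]

from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Tuple

@dataclass
class ResourceCalendar:
    """Provider-local reservation calendar for one resource."""
    setup_guard: timedelta      # operator-configured, network specific
    teardown_guard: timedelta
    bookings: List[Tuple[datetime, datetime]] = field(default_factory=list)

    def resource_window(self, avail_start, avail_end):
        """Expand the requested available time by the local guard times."""
        return avail_start - self.setup_guard, avail_end + self.teardown_guard

    def reserve(self, avail_start, avail_end):
        """Book resource time unless it overlaps an existing booking."""
        start, end = self.resource_window(avail_start, avail_end)
        if any(start < b_end and b_start < end for b_start, b_end in self.bookings):
            return False            # overlap: resource already committed
        self.bookings.append((start, end))
        return True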
--- Inder Monga ANI Testbed imonga@es.net ESnet Blog (510) 499 8065 (c) (510) 486 6531 (o) "Whatever your mind can conceive and believe it can achieve." - Napoleon Hill
On Sep 27, 2010, at 6:13 PM, Inder Monga wrote:

John and Jerry and all,
We had a lot of discussion on this earlier this year before we took the connection services out of the architecture specification and when discussing the error recovery. This is what I remember being discussed then - just want to make sure it is aligned with the current conclusions and please point out where the differences are:
1. The NSA-Requestor always asks for "Available Time"

for automated request it asks for available time; for user initiated it asks for resource time, or asks for available and gets resource time back so it can request at appropriate time
2. The NSA-Provider may choose to add their own configurable "guard" time to "Available time" make it the "Resource time". This will be different and specific to the network operator providing the services. The resource time helps them maintain a reservation calendar of when resources are available and deals with the resource overlaps, pre-emptions and all the good stuff.
this is true for automated request. it may be true for user initiated request depending on whether that is what is requested
3. The set of providers in the NSA service chain may have their own "guard" times - there is no need to align it, come up with a predictable value etc. It depends on the kind of the network that provider has built. There is no requirement to be close or far from the start time requested.
I am not sure what this means. I assume there is some need to be at least close to requested available time.
4. Each of those NSA providers may have an "Auto-provision" or "Signaled provision" configuration. "Auto-provision" means that the provisioning start signal to the statemachine is given by NSA-provider based on some timer. "signaled provision" means that the NSA-requestor may provision the start of connection setup (this allows for connections to be instantiated at time < total reservation time).
signaled provision is what I have been calling user initiated
5. The original NSA-requestor waits for the signal from the NSA-provider telling it the connection is up. If the connection is not up before the "Available Start Time" requested - the SLA takes over. Either the NSA-requestor can then cancel the connection, or query the connection or Notify the NSA-provider that it is waiting for it. The NSA-provider can independently choose to Notify the NSA-requestor that provisioning is in progress even though the start time has elapsed.
perhaps. I don't think we agreed to anything like this, though the idea of SLA kicking in seems reasonable. How this happens might be included in the service instance description.
6. If there is an error in the connection setup, that is notified up the service chain to the requestor.
Up the service chain is something I am not clear about.

John
John,

Thanks for giving me the opportunity to try and clarify.

On Sep 27, 2010, at 5:41 PM, John Vollbrecht wrote:
On Sep 27, 2010, at 6:13 PM, Inder Monga wrote:
John and Jerry and all,
We had a lot of discussion on this earlier this year before we took the connection services out of the architecture specification and when discussing the error recovery. This is what I remember being discussed then - just want to make sure it is aligned with the current conclusions and please point out where the differences are:
1. The NSA-Requestor always asks for "Available Time"

for automated request it asks for available time; for user initiated it asks for resource time, or asks for available and gets resource time back so it can request at appropriate time
I think we should keep the user view consistent. The User ALWAYS asks for Available time. If it is user initiated, then the network does not need to add a "guard" time. The time the user requested is then "available" for the user to provision based on their request. Provisioning time is automatically included.
2. The NSA-Provider may choose to add their own configurable "guard" time to "Available time" make it the "Resource time". This will be different and specific to the network operator providing the services. The resource time helps them maintain a reservation calendar of when resources are available and deals with the resource overlaps, pre-emptions and all the good stuff.
this is true for automated request. it may be true for user initiated request depending on whether that is what is requested
I am suggesting we have it one way only for both. Maybe diagrams will help.
3. The set of providers in the NSA service chain may have their own "guard" times - there is no need to align it, come up with a predictable value etc. It depends on the kind of the network that provider has built. There is no requirement to be close or far from the start time requested.
I am not sure what this means. I assume there is some need to be at least close to requested available time.
Of course, as a best practice, it should be close to the requested available time. But we do not want to discuss it in the NSI protocol specification, other than to state that each network needs to attach a network-specific guard time to the Available time in order to make sure that the resource is available for provisioning ahead of the user-requested time. This will prevent double-booking of resources.
4. Each of those NSA providers may have an "Auto-provision" or "Signaled provision" configuration. "Auto-provision" means that the provisioning start signal to the statemachine is given by NSA-provider based on some timer. "signaled provision" means that the NSA-requestor may provision the start of connection setup (this allows for connections to be instantiated at time < total reservation time).
signaled provision is what I have been calling user initiated
Thanks.
5. The original NSA-requestor waits for the signal from the NSA-provider telling it the connection is up. If the connection is not up before the "Available Start Time" requested - the SLA takes over. Either the NSA-requestor can then cancel the connection, or query the connection or Notify the NSA-provider that it is waiting for it. The NSA-provider can independently choose to Notify the NSA-requestor that provisioning is in progress even though the start time has elapsed.
perhaps. I don't think we agreed to anything like this, though the idea of SLA kicking in seems reasonable. How this happens might be included in the service instance description.
Yes when we discuss SLA negotiation as part of the protocol. It may be part of the service agreement discussed out of band between the providers. Ensuring a consistent SLA across multiple networks is not going to be an easy problem to enforce or reason about within the scope of NSI v1.0.
6. If there is an error in the connection setup, that is notified up the service chain to the requestor.
Up the service chain is something I am not clear about.
What I mean by "service chain" is a set of NSI-Requestors, NSI-Providers associated in setting up a single connection. This relates to sub-segmentation of the connection service among multiple NSAs/domains.
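[A small illustration of "notified up the service chain", under the assumption that each provider simply forwards an error to the requestor above it until the original requestor is reached. The class and message names are hypothetical, not NSI protocol elements.]

from typing import Optional

class NSA:
    """One agent in the service chain; requestor is the NSA above it, if any."""
    def __init__(self, name: str, requestor: Optional["NSA"] = None):
        self.name = name
        self.requestor = requestor

    def notify_error(self, connection_id: str, reason: str) -> None:
        print(f"{self.name}: setup error on {connection_id}: {reason}")
        if self.requestor is not None:   # propagate up the chain
            self.requestor.notify_error(connection_id, reason)

# user <- aggregator <- transit provider where the setup error occurs
user = NSA("user-RA")
aggregator = NSA("aggregator-NSA", requestor=user)
transit = NSA("transit-NSA", requestor=aggregator)
transit.notify_error("conn-42", "cross-connect failed")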
--- Inder Monga ANI Testbed imonga@es.net ESnet Blog (510) 499 8065 (c) (510) 486 6531 (o) "Whatever your mind can conceive and believe it can achieve." - Napoleon Hill
On Sep 27, 2010, at 9:20 PM, Inder Monga wrote:
John, Thanks for giving me the opportunity to try and clarify
On Sep 27, 2010, at 5:41 PM, John Vollbrecht wrote:
On Sep 27, 2010, at 6:13 PM, Inder Monga wrote:
John and Jerry and all,
We had a lot of discussion on this earlier this year before we took the connection services out of the architecture specification and when discussing the error recovery. This is what I remember being discussed then - just want to make sure it is aligned with the current conclusions and please point out where the differences are:
1. The NSA-Requestor always asks for "Available Time"

for automated request it asks for available time; for user initiated it asks for resource time, or asks for available and gets resource time back so it can request at appropriate time
I think we should keep the user view consistent. The User ALWAYS asks for Available time. If it is user initiated, then the network does not need to add a "guard" time. The time the user requested is then "available" for the user to provision based on their request. Provisioning time is automatically included.

jrv: This confuses available and scheduled time. If one asks for available time in a user provisioned request, it seems the network should give at least a hint of setup time, and that the network adds this to its scheduled resource time.
2. The NSA-Provider may choose to add their own configurable "guard" time to "Available time" make it the "Resource time". This will be different and specific to the network operator providing the services. The resource time helps them maintain a reservation calendar of when resources are available and deals with the resource overlaps, pre-emptions and all the good stuff.
this is true for automated request. it may be true for user initiated request depending on whether that is what is requested
I am suggesting we have it one way only for both. Maybe diagrams will help.

The distinctions I suggest between available and resource time seem to me to make it easier to talk about the issues. They force one to be clear about the requirements for each and what is possible in mapping between them.
3. The set of providers in the NSA service chain may have their own "guard" times - there is no need to align it, come up with a predictable value etc. It depends on the kind of the network that provider has built. There is no requirement to be close or far from the start time requested.
I am not sure what this means. I assume there is some need to be at least close to requested available time.
Of course, as a best practice, it should be close to the requested available time. But we do not want to discuss it in the NSI protocol specification, other than to state that each network needs to attach a network-specific guard time to the Available time in order to make sure that the resource is available for provisioning ahead of the user-requested time. This will prevent double-booking of resources.

Again, I think the distinctions make it possible to talk about this issue more clearly. You might say each network schedules a resource time that attempts to make the actual available time as close as possible to the requested available time. This makes it clear that the actual and requested available times are not identical.
4. Each of those NSA providers may have an "Auto-provision" or "Signaled provision" configuration. "Auto-provision" means that the provisioning start signal to the statemachine is given by NSA-provider based on some timer. "signaled provision" means that the NSA-requestor may provision the start of connection setup (this allows for connections to be instantiated at time < total reservation time).
signaled provision is what I have been calling user initiated
Thanks.
5. The original NSA-requestor waits for the signal from the NSA-provider telling it the connection is up. If the connection is not up before the "Available Start Time" requested - the SLA takes over. Either the NSA-requestor can then cancel the connection, or query the connection or Notify the NSA-provider that it is waiting for it. The NSA-provider can independently choose to Notify the NSA-requestor that provisioning is in progress even though the start time has elapsed.
perhaps. I don't think we agreed to anything like this, though the idea of SLA kicking in seems reasonable. How this happens might be included in the service instance description.
Yes when we discuss SLA negotiation as part of the protocol. It may be part of the service agreement discussed out of band between the providers. Ensuring a consistent SLA across multiple networks is not going to be an easy problem to enforce or reason about within the scope of NSI v1.0.

One could imagine an SLA that says: I want the connection available at the requested start time 99% of the time. The provider would set its resource time to be sure setup was complete 99% of the time.
6. If there is an error in the connection setup, that is notified up the service chain to the requestor.
Up the service chain is something I am not clear about.
What I mean by "service chain" is a set of NSI-Requestors, NSI-Providers associated in setting up a single connection. This relates to sub-segmentation of the connection service among multiple NSAs/domains.
I think this needs to be better defined. In a hand waving way I think this is correct, but the details are not at all clear to me.
John
Hi John and everyone-

John has summarized our discussion quite well. I add a couple comments just to make me a slight bit happier with them :-) Please see them inline... Also, attached is a draft of what concerned me originally. Guy had asked me to document my concerns and recommendations. I think this document parallels John's summary below quite well - though I use some different terms.

John Vollbrecht wrote:
Hello all -
Jerry and I had a discussion last week about the time issue. I think we developed a useful approach.
The idea is to define two times, which I think we all agree exist.
1) available time - time a connection is available to the application to communicate between devices 2) resource time - time a resource is reserved to support available time
I called these In-Service Time and Provisioning Start Time...but I think John's terms are better.
To further define these -
Available time - requested by the user for its application - provided by the network.
Key Note: This is a *Requested* available time, and not a guaranteed time.
resource time - time a resource is allocated to a connection - includes setup and teardown time, if any - is time in reservation calendar for resource
Available time requested cannot be provided exactly by network because it cannot predict exactly length of setup and take down. I believe we all agree with this. Therefore provided available time can at best approximate requested available time.
We agreed that when a user requests automatic start connection it would request available time and the provider would schedule resource time to get as close as possible. When a request is for user initiated connection the time would be for reserved time, and the user initiation can start anytime after the reserved time. Available time depends on setup and take down times of equipment.
This is great John.
----------------- I think we agreed on the above definitions. The definition of time seem useful in discussing what goes in connection service messages. We also talked about some possible implications of this.
The difference between available and resource time is setup and takedown time. While it is impossible to be sure exactly how long they will be, it may be possible to define something statistical. For example setup takes an average of 17 sec with std deviation of N. If this is can be defined for the resource, then one can make a prediction about when a connection will be available with a degree of confidence.
For example, this would allow one to request an automatic connection at 5pm and have it available 99% of the time. If the average setup time is 17 seconds and I add 10 seconds to be 99% sure, then the service would initiate the connection at 5:00:00 - 0:00:27, or 4:59:33.
We talked about including this "setup requirement" in the connection service definition of and NSA, and by implication including this in requests and replies. I think this is worth talking about in the group.
John is correct in that we did indeed discuss these issues. However, we have disagreements about whether such statistical averages really change the fact that they are still just estimates - not deterministic, and not a guarantee. I think trying to formulate estimates of how often we are likely to fail on a committed parameter misses a bigger elephant hiding in the room: Why did we not meet our commitment? What needs to be fixed to ensure that our guarantees are solid?

Intelligent folks can disagree on the value of estimates and averages, but IMO incorporating this information into the protocol is not a good idea. We did discuss the prospect of making these estimates available to the user via the Service Definition document, which then becomes the prerogative of the service provider to offer - or not - and of the user to use - or not. They could give the user some idea of past performance. I think the SD is a more appropriate way to do this. But it is not something I believe the CS needs to deal with.

I think fundamentally the Connection Service should be a simple front end to the functional routines that make up the state machine. And the state machine is dependent on a clear and closed set of events that transition the connection from state to state. All service parameters are analyzed according to the Service Definition, the Path Finder, and the various genres of resources in the ResourceDB.

I am happy if, in the connection request, the "available time" is explicitly defined as a /Requested/ Available Time, and when returned in a Reservation Confirmation it is explicitly an /Estimated/ Available Time - a preference, or estimate, a "hint". The PA simply makes a /best effort/ to meet it.

Thanks John!
Jerry
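[One way to read Jerry's suggestion is that any past-performance estimates live in the provider-published Service Definition rather than in the protocol messages. A hedged sketch, with hypothetical field names; providers could omit the estimates entirely.]

from dataclasses import dataclass
from typing import Optional

@dataclass
class ServiceDefinition:
    """Provider-published service description (field names illustrative)."""
    service_name: str
    # Optional past-performance hints; purely informational for the requestor.
    mean_setup_seconds: Optional[float] = None
    stddev_setup_seconds: Optional[float] = None
    mean_teardown_seconds: Optional[float] = None

sd = ServiceDefinition("point-to-point-circuit",
                       mean_setup_seconds=17.0,
                       stddev_setup_seconds=4.3)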
Hi,

Just a short comment on the time definitions there - I kind of like them :) The point is that we should be able to estimate the difference between available and resource time - in other words, to be able to estimate setup and tear down time. That is mostly for purposes of SLA, but not only. Also, since the user is only requesting an available time, we need to map those times somehow and keep them synchronised. This is implementation work here, but doable (if we are able to estimate setup/tear down times).

Best regards
Radek

________________________________________________________________________
Radoslaw Krzywania
Network Research and Development
Poznan Supercomputing and Networking Center
radek.krzywania@man.poznan.pl
+48 61 850 25 26
http://www.man.poznan.pl
________________________________________________________________________
Hi Radek-

Hmmm... some musings...

It seems to me that the guard times are local to each NSA - the only real desire is to have the circuit available to the user at the desired time (or as close to it as our best effort can do), and to keep it available until the End Time (or as close as we can manage). So each NSA down the tree will prepend its own estimated guard time to the Available Time and hope for the best. I don't think there is a need to synchronize the times per se, just the state transitions. In order for the protocol to be deterministic, the easiest way IMO is to always send a Provision Complete msg up the tree whenever we complete the setup. This will attest that all children segments are in place and In Service.

As for tear down time... From the network perspective, the tear down can (should be?) almost instantaneous, as the only real thing that needs to be done is to shut off the ingress point. Typically there is no need to physically reconfigure the transit nodes back to some idle configuration - the ingress just gets shut and the resources marked as released. The resources are officially no longer available to the user after the End Time has passed. Even if *no reconfiguration* is done, and even if the ingress port is left open, the user's reservation has expired and they can no longer rely on its availability.

The real question is "When will the End Time occur??" (I sound like a metaphysical preacher :-) While each NSA down the tree will want to shut its respective ingress points at the End Time, the only one that really counts from a user perspective is the first NSA to reach the End Time. I.e. due to clock skew, a child NSA down the tree somewhere may reach the End Time before the top PA begins the 'proper' Release. Unless the Release is triggered deterministically by one particular NSA, there could be some unpredictable variance in the actual release time. From the user's perspective, with a slightly slower clock, the connection inexplicably goes away early...

This is why I think messaging should be the primary means for managing the circuit. I think the best way to handle this may be that each NSA sets a timer when its End Time is reached. A "manual stop" Release message is expected within that timer window. When that timer expires, the local NSA stops waiting for a manual stop and reverts to "auto stop" for the connection. Auto-stop handling sends appropriate messages or notifications up the tree and drops the connection. This End Time Release timer could be some locally defined safety (guard) time. Even in this case, the user's reservation still expires at the End Time - they are no longer guaranteed the resources. And there is no practical way to know when the first End Time Release timer may expire... if the clocks are skewed, and the release guard time is small enough, the connection may still drop before the user expects it.

For this reason, I think we should consider defining the End Time as a best effort Estimated End Time, just like the best effort Estimated Start Time (or Requested Available Time).

In general, I think synchronizing a global network of NSAs to accurately reflect a common time is almost impossible, or at least out of scope. Accuracy and skew will always be an issue. Therefore the protocol / state machine must be able to function correctly in the face of skew, and this may prevent us from making hard guarantees on availability times.
(Did I explain this coherently? :-) This is why auto-start and auto-stop are tricky - the clocks among the NSAs will never be exactly the same, so we'll always risk skewed independent clocks jumping the gun. But clock skew (as Radek suggests) is something we can handle; it just needs to be considered as part of the protocol/state machine, not just an implementation issue.

Auto-start and auto-stop capability will need to address protocol issues like this caused by clock skew. Perhaps we need some protocol feature that ensures a minimum maintained clock sync - and if an NSA's time/date clock is off by more than some standard minimum, that NSA is considered broken and not used. Remember also that skew can accumulate across multiple networks/nodes, so even a standard minimum may still accumulate to something unacceptable.

Another alternative we may consider is a "Two Minute Warning"... I.e. whenever a connection End Time approaches, say, 2 minutes from now, a message is flooded along the service tree to notify the NSAs on the path that at least one NSA is going to drop the connection very soon. The Prime Mover RA (the user) will ultimately get this message as well, notifying them that they should find a stopping point or risk losing connectivity abruptly. Indeed, a two-minute warning could be a connection-specific user-specified value with some Service Definition default.

Ugh... this is interesting...

Best regards
Jerry
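[A rough sketch of the End Time handling Jerry describes: at the local End Time the NSA starts a guard timer; if a manual-stop Release does not arrive before it expires, the NSA falls back to auto-stop. The class, method names, and the threading approach are illustrative assumptions, not part of the NSI state machine.]

import threading

class ConnectionEndHandler:
    """Wait a local guard window for a manual Release, then auto-stop."""
    def __init__(self, connection_id: str, release_guard_seconds: float):
        self.connection_id = connection_id
        self._released = threading.Event()
        self._timer = threading.Timer(release_guard_seconds, self._auto_stop)

    def on_end_time(self):
        """Called when this NSA's local clock reaches the connection End Time."""
        self._timer.start()

    def on_release_message(self):
        """A manual-stop Release arrived from the requestor in time."""
        self._released.set()
        self._timer.cancel()
        self._tear_down("released by requestor")

    def _auto_stop(self):
        if not self._released.is_set():
            # no Release within the guard window: revert to auto stop,
            # notify up the tree, and drop the connection
            self._tear_down("release guard timer expired (auto stop)")

    def _tear_down(self, reason: str):
        print(f"{self.connection_id}: shutting ingress and releasing resources ({reason})")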
I think this is useful discussion -- I would like to see an actual proposal for how time is carried in each of the possible request types [automatic or user initiated provisioning].

My musing/wondering is whether it would be good to have both times at least potentially in the connection messages. This might be done by carrying one in the Service Definition Instance field and one in the Path field. At a minimum it seems necessary to distinguish between the two times, either with some sort of flag or by carrying them in different fields.

Do others agree?

John
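[To make the question concrete, here is one hedged shape such messages could take, keeping the requested available time and the provider's estimated/resource times in separate, explicitly labelled fields. None of this is agreed NSI syntax; field and type names are illustrative only.]

from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class ProvisionMode(Enum):
    AUTO = "auto-provision"          # provider starts setup on its own timer
    SIGNALED = "signaled-provision"  # requestor sends the provision signal

@dataclass
class ReservationRequest:
    connection_id: str
    provision_mode: ProvisionMode
    requested_available_start: datetime   # what the user asks for
    requested_available_end: datetime

@dataclass
class ReservationConfirmation:
    connection_id: str
    estimated_available_start: datetime   # provider's best-effort estimate
    resource_start: datetime              # start of the booked resource time
    resource_end: datetime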
Hi Jerry,

IMHO setup and tear down times should be considered global (all NSAs along a path). The user is not interested in partial results, as he/she is not even aware of, or interested in, which NSAs/domains are involved. The user doesn't care (if everything works fine ;) ).

For tear down, it does not matter where you start removing the configuration (the end point, or any point along the path). Once you remove a single configuration point, the service is not available any more; that is when available time ends. We can discuss whether it should be synchronized or signaled, but I would leave even that for v2 (or v1.1, or whatever we decide). Once ALL segments of the connection have had their configuration removed, the resource time has ended.

I agree that resource time is difficult to forecast, yet we need to fit it into a calendar full of other reservations and synchronize them. Thus we need to estimate, guess, or use magic to get those values as realistic as possible. Overlapping is forbidden, and leaving gaps of unused resources would be a waste of resources and money in the end.

The "Two minute warning" is not speaking to me. I don't see a reason to warn a domain or user that the connection will be closed soon, since the user knows what was requested and the domain is tracking it with a calendar. We can discuss some internal notifiers, but that's implementation.

Best regards
Radek

________________________________________________________________________
Radoslaw Krzywania
Network Research and Development
Poznan Supercomputing and Networking Center
radek.krzywania@man.poznan.pl
+48 61 850 25 26
http://www.man.poznan.pl
________________________________________________________________________
(I sound like a metaphysical preacher :-) While each NSA down the tree will want to shut their respective ingress points at the End Time, the only one that really counts from a user perspective is the first NSA to reach the End Time. I.e. due to clock skew, a child NSA down the tree somewhere may reach the End Time before the top PA begins the 'proper' Release. Unless the Release is triggered deterministically by one particular NSA, there could be some unpredictable variance in the actual release time. From the user's perspective, with a slightly slower clock, the connection inexplicably goes away early...
This is why I think messaging should be the primary means for managing the circuit. I think the best way to handle this may be that each NSA sets a timer when its End Time is reached. A "manual stop" Release message is expected within that timer window. When that timer expires, the local NSA stops waiting for a manual stop and reverts to "auto stop" for the connection. Auto-stop handling sends appropriate messages or notifications up the tree and drops the connection. This End Time Release timer could be some locally defined safety (guard) time. Even in this case, the user's reservation still expires at the End Time - they are no longer guaranteed the resources. And there is no practical way to know when the first End Time Release timer may expire... if the clocks are skewed, and the release guard time is small enough, the connection may still drop before the user expects it.
For this reason, I think we should consider defining the End Time as a best effort Estimated End Time, just like the best effort Estimated Start Time (or Requested Available Time).
In general, I think synchronizing a global network of NSAs to accurately reflect a common time is almost impossible, or at least out of scope. Accuracy and skew will always be an issue. Therefore, the protocol / state machine must be able to function correctly in the face of skew, and this may prevent us from making hard guarantees on availability times. (Did I explain this coherently? :-) This is why auto-start and auto-stop are tricky - the clocks among the NSAs will never be exactly the same, and so we'll always risk skewed independent clocks jumping the gun. Clock skew (as Radek suggests) is something we can handle, but it needs to be considered as part of the protocol/state machine, not just an implementation issue. Auto-start and auto-stop capability will need to address protocol issues like this caused by clock skew. Perhaps we need some protocol feature that ensures a minimum maintained clock sync - and if an NSA's time/date clock is off by more than some standard minimum, that NSA is considered broken and not used. Remember also that skew can accumulate across multiple networks/nodes, so even a standard minimum may still accumulate to something unacceptable.
Another alternative we may consider is a "Two Minute Warning"... I.e. whenever a connection End Time approaches, say, 2 minutes from now, a message is flooded along the service tree to notify the NSAs on the path that at least one NSA is going to drop the connection very soon. The Prime Mover RA (the user) will ultimately get this message as well, notifying them that they should find a stopping point or risk losing connectivity abruptly. Indeed, a two-minute warning could be a connection-specific, user-specified value with some Service Definition default.
Ugh...this is interesting...
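To make the "guard times are local to each NSA" idea concrete, here is a minimal sketch (Python; the names and the guard values are illustrative assumptions, not anything defined by NSI) of an NSA prepending its own estimated setup guard to the requested Available Time, appending a teardown guard, and booking the resulting Resource Time in its local reservation calendar:

from datetime import datetime, timedelta

# Illustrative local estimates for this NSA only - other NSAs will differ.
SETUP_GUARD = timedelta(seconds=30)
TEARDOWN_GUARD = timedelta(seconds=5)

def book_resource_time(calendar, available_start, available_end):
    """Derive this NSA's Resource Time from the requested Available Time and
    check it against the local reservation calendar (a list of (start, end))."""
    resource_start = available_start - SETUP_GUARD    # begin configuring early
    resource_end = available_end + TEARDOWN_GUARD     # keep time to release
    for start, end in calendar:
        if resource_start < end and start < resource_end:
            return None                               # overlap: cannot book
    calendar.append((resource_start, resource_end))
    return resource_start, resource_end

The only point of the sketch is that the guard values stay private to this NSA; the Available Time carried in the request itself is left untouched.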
Best regards Jerry Radek Krzywania wrote: Hi, Just a short comment on time definitions there - I kind of like them :) The case is that we should be able to estimate the difference between available and resource time - so in other words, to be able to estimate setup and tear down time. That is mostly for purposes of SLA, but not only. Also, since user is requesting only to give an available time, we need to map those times somehow, and keep them synchronised. This is implementation work here, but doable (if we are able to estimate setup/tear down times). Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________ -----Original Message----- From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of John Vollbrecht Sent: Monday, September 27, 2010 11:51 PM To: NSI WG Subject: [Nsi-wg] time issue Hello all - Jerry and I had a discussion last week about the time issue. I think we developed a useful approach. The idea is to define two times, which I think we all agree exist. 1) available time - time a connection is available to the application to communicate between devices 2) resource time - time a resource is reserved to support available time To further define these - Available time - requested by the user for its application - provided by the network. resource time - time a resource is allocated to a connection - includes setup and teardown time, if any - is time in reservation calendar for resource Available time requested cannot be provided exactly by network because it cannot predict exactly length of setup and take down. I believe we all agree with this. Therefore provided available time can at best approximate requested available time. We agreed that when a user requests automatic start connection it would request available time and the provider would schedule resource time to get as close as possible. When a request is for user initiated connection the time would be for reserved time, and the user initiation can start anytime after the reserved time. Available time depends on setup and take down times of equipment. ----------------- I think we agreed on the above definitions. The definition of time seem useful in discussing what goes in connection service messages. We also talked about some possible implications of this. The difference between available and resource time is setup and takedown time. While it is impossible to be sure exactly how long they will be, it may be possible to define something statistical. For example setup takes an average of 17 sec with std deviation of N. If this is can be defined for the resource, then one can make a prediction about when a connection will be available with a degree of confidence. For example this would allow one to request an automatic connection, for example, at 5pm and have it available 99% of the time. If the average setup time is 17 seconds and I add 10 seconds to be 99% sure, then the service would initiate the connection at 5:00:00 - 0:00:25, or 4:59:35. We talked about including this "setup requirement" in the connection service definition of and NSA, and by implication including this in requests and replies. I think this is worth talking about in the group. 
John
Hi Radek - see comment below... Radek Krzywania wrote:
Hi Jerry,
IMHO setup and tear down times should be considered global (all NSA along a path). User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
But we cannot expect all NSAs' clocks to be exactly synchronized. Clocks are critical to book-ahead scheduling, and independent quasi-synchronized clocks (however slightly skewed) will cause problems. Some of those problems are evident in this discussion. Open issue for discussion: How do we address the issue of clock skew across a scheduled network?
For tear down, it does not matter where you start to removing the configuration (end point, or any point along the path). Since you remove single configuration point – the service is not available any more. That the time where available time ends.
Actually, I would assert that the "available" time ends when the End Time elapses. The "End Time" is by definition when the connection is no longer available, and the user should therefore not assume it is usable past that time. Maybe the path resources will stay in place, but maybe not. During reconfiguration, the state of cross connects along the path is indeterminate, and if/when they are reconfigured, user data can be [mis]routed to 3rd parties, and 3rd party data may be [mis]routed to the egress STP.
The real issue in my mind is "when is the actual End Time?" Given that we cannot guarantee exactly when each NSA may reach their respective End Time, the End Time should be (IMHO :-) an Estimated End Time "plus or minus", and the user should consider the ramifications of this. We do not know which NSA will reach End Time first and begin to tear down the connection. Nor do we know the delta between this first NSA's clock and the user's clock.
I think the ideal situation is that the user sees the Estimated End Time approaching (via their own clock) and stops sending some user-defined time prior to End Time. The user lets the connection drain - still prior to End Time. Once the user traffic is drained, the user RA issues a Release Request. We can, I think, assume that if the user issues the manual Release Request before the scheduled availability has ended, then the user has verified that no important data remains in the pipe. But due to clock skew and the estimated End Time, the user's estimate of how much time is remaining may be substantially under-estimated. This is why I think a 2-minute warning might be useful - the user can request a warning of "n" seconds, and that warning will bubble up from the NSA whose clock is most advanced. The user can then throttle down their traffic accordingly and then issue a Release Request. While this might be nice, it is not fundamentally necessary for the protocol. It helps the user, but the protocol must still be able to deterministically handle a user that ignores the warning and drives off the cliff.
Fundamentally, we want to make sure the user isn't surprised by an earlier than expected release due to clock skew. And we won't know until close to the End Time whose clock is going to trigger the Release first. The warning reveals that NSA and gives the user fair warning.
Whatever the method for initiating the release, the network should ensure that any user data accepted at the ingress prior to the End Time is not misrouted - even after the End Time. The network's only options are to try to deliver in-flight data properly or to drop it. Since the End Time has been reached, the network can no longer assume that any segments are still usable, so delivering it is not really an option either. The network must drop any stranded traffic. Thus, we need to have some means of blocking new ingress data, and of ensuring that bytes in flight get dropped asap.
One might take a different view if we hold the connection in place for some safety/guard time past the local End Time. This would do several things: 1) it would make sure the End Time has elapsed for all NSAs, especially the user RA, thus allowing full use during the available timeframe, and 2) it would wait a few milliseconds longer (latency time) so that any data in flight is delivered. At this point (after all NSAs have reached the End Time plus a latency factor) any remaining data in flight was definitely sent after the reservation. Bad user, bad user.
In this case, any data in flight is no longer the network's concern. Then, we can reconfigure without regard to securing the user information.
Finally, we might consider how to ensure that the connection is not torn down until *all* NSAs have reached the End Time. This could be indicated by flooding an "End Time Alert" notification or some similar message along the tree. When that message is acknowledged by all NSAs in the connection, then a Release can begin. Of course, here again, if an acknowledgement is not received in a finite time, the connection is torn down unilaterally.
I do, however, think we need to address End Time processing in V1.0. This is important - we need to have a clearly defined lifecycle and primitives that do not promise something the protocol cannot deliver. From this discussion, we cannot clearly state when the availability of the connection ends.
These are some very interesting and challenging nuances. I hope these were useful musings... br Jerry
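As a rough sketch of the end-of-reservation handling discussed above (Python; purely illustrative - the handler, the callback names, and the 120-second window are assumptions, not NSI primitives): at the local End Time the NSA starts a release guard timer, honours a manual Release that arrives within that window, and otherwise reverts to auto-stop:

import threading

RELEASE_GUARD_SECONDS = 120   # locally defined safety window past End Time (made-up value)

class ConnectionEndHandler:
    """Wait for a manual Release after the local End Time; auto-stop when the guard expires."""

    def __init__(self, notify_tree, drop_connection):
        self._notify_tree = notify_tree    # callback: send a notification up the service tree
        self._drop = drop_connection       # callback: shut the ingress / mark resources released
        self._timer = None
        self._released = False

    def on_end_time(self):
        # End Time reached: the reservation has expired, but hold the connection for a
        # short local guard window in case a manual-stop Release is already on its way.
        self._timer = threading.Timer(RELEASE_GUARD_SECONDS, self._auto_stop)
        self._timer.start()

    def on_release_request(self):
        # Manual stop arrived within the window: cancel the auto-stop path and tear down.
        if self._timer is not None:
            self._timer.cancel()
        self._released = True
        self._drop()

    def _auto_stop(self):
        # No Release arrived in time: notify up the tree and drop unilaterally.
        if not self._released:
            self._notify_tree("auto-stop: End Time release guard expired")
            self._drop()

Note that nothing here synchronizes clocks between NSAs; each NSA runs this against its own clock, which is exactly why the effective End Time remains an estimate.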
We can discuss whether it should be synchronized or signaled, but I would even left it for v2 (or v1.1, or whatever we decide). Since ALL segments of connection has configuration removed, the resource time is ended. I agree that resource time is difficult to forecast, yet we need to fit that into calendar full of other reservations and synchronize them. Thus we need to estimate, guess, or use magic to get those values as realistic as possible. Overlapping is forbidden, and leaving gaps of unused resources will be waste of resources and money at the end.
“Two minute warning” is not speaking to me. I don’t see a reason to warn a domain or user that the connection will be closed soon, while user knows what was requested and domain is tracking that with a calendar. We can discuss some internal notifiers, but that’s implementation.
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania Network Research and Development
Poznan Supercomputing and
radek.krzywania@man.poznan.pl Networking Center
+48 61 850 25 26 http://www.man.poznan.pl
________________________________________________________________________
On Sep 28, 2010, at 3:40 PM, Jerry Sobieski wrote:
Hi Radak - see comment below...
Radek Krzywania wrote:
Hi Jerry, IMHO setup and tear down times should be considered global (all NSA along a path). User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
But we cannot expect all NSAs clocks to be exactly synchronized. Clocks are critical to bookahead scheduling, and independent quasi- synchronized clocks (however slightly skewed) will cause problems. Some of those problems are evident in this discussion.
Exact synchronization is not required. The protocol can (and probably should) define a reasonable synchronization requirement. i.e. NSAs MUST be synchronized within 1 second. (Or even 10 seconds) That should be a relatively trivial requirement, and bounds this problem.
Open issue for discussion: How do we address the issue of clock skew across a scheduled network?
For tear down, it does not matter where you start to removing the configuration (end point, or any point along the path). Since you remove single configuration point – the service is not available any more. That the time where available time ends.
Actually, I would assert that the "available" time ends when the End Time elapses. The "End Time" is by definition when the connection is no longer available, and the user should therefore not assume they are usable past that time. Maybe the path resources will stay in place, but maybe not. During reconfiguration, the state of cross connects along the path is indeterminate, and if /when they are reconfigured, then user data can be [mis]routed to 3rd parties, and 3rd party data may be [mis]routed to the egress STP.
The real issue in my mind is "when is the actual End Time?" Given that we cannot guarantee exactly when each NSA may reach their repective End Time, the End Time should be (IMHO:-) an Estimated End Time "plus or minus", and the user should consider the ramifications of this.
We do not know which NSA will reach End Time first and begin to tear down the connection. Nor do we know the delta between this first NSA's clock and the user's clock.
I don't think the NSA should be attempting to understand the delta of the user's clock. It can simply treat requests relative to 'true time'. And we should expect End Time to be handled similarly to Start Time. In other words, the user's requested time indicates when they expect to 'use' the circuit. If tear-down takes time, the resource time should add a delta to that. Tear-down should not start until 'after' the user's requested end time.
I think the ideal situation is that the user sees the Estimated End Time approaching (via their own clock) and stops sending some user defined time prior to End Time. The user lets the connection drain - still prior to End TIme. Once the user traffic is drained, the user RA issues a Release Request. We can, I think, assume that if the user issues the manual Release Request before the scheduled availability has ended, then the user has verified that no important data remains in the pipe. But due to clock skew and due to the estimated End Time, the user's estimate of how much time is remaining may be substantially under-estimated. This is why I think maybe a 2-minute warning might be useful - the user can request a warning of "n" seconds, and that warning will bubble up from the NSA who's clock is most advanced. The user can then throttle down their traffic accordingly and then issue a Release Request. While this might be nice, it is not fundamentally necessary for the protocol. It helps the user, but the protocol must still be able to deterministically handle a user that ignores the warning and drives off the cliff.
Fundamentally, we want to make sure the user isn't surprised by an earlier than expected release due to clock skew. And we won't know until close to the End Time who's clock is going to trigger the Release first. The warning announces reveals that NSA and gives the user fair warning.
This is reasonable and becomes manageable by the user if the protocol defines the maximum clock skew allowed.
Whatever the method for initiating the release, the network should insure that any user data accepted at the ingress prior to the End Time is not misrouted - even after the end time. The network's only options are to try to deliver in-flight data properly or drop it. Since the End Time has been reached, the network can no longer assume that any segments are still usable, so delivering it is not really an option either. The network must drop any stranded traffic. Thus, we need to have some means of blocking new ingress data, and insuring bytes in flight get dropped asap.
One might take a different view if we hold the connection in place for some safety/guard time past the local End Time. This would do several things: 1) it would make sure the End Time has elapsed for all NSAs especially the user RA thus allowing full use during the available timeframe, and b) wait a few mils longer (latency time) so that any data in flight is delivered. At this point (after all NSAs have reached the end time plus a latency factor) any remaing data in flight was definately sent after the reservation. Bad user, bad user. In this case, any data in flight is no longer the network's concern. Then, we can reconfigure without regard to securing the user information.
Finally, we might consider how to insure that the connection is not torn down until *all* NSAs have reached the End Time. THis could be indicated by flooding a "End Time Alert" notification or some simlar message along the tree. When that message is acknowledge by all NSAs in the connection, then a Release can begin. Of course, here again, if an acknowledge is not received in a finite time, the connection is torn down unilaterally.
I do however, think we need to address End Time processing in V1.0 This is important - we need to have a clearly defined lifecycle and primitievs that do not promise something the protocol cannot deliver. >From this discussion, we cannot clearly state when the availability of the connection ends.
These are some very interesting and challenging nuances. I hope this was useful musings... br Jerry
We can discuss whether it should be synchronized or signaled, but I would even left it for v2 (or v1.1, or whatever we decide). Since ALL segments of connection has configuration removed, the resource time is ended. I agree that resource time is difficult to forecast, yet we need to fit that into calendar full of other reservations and synchronize them. Thus we need to estimate, guess, or use magic to get those values as realistic as possible. Overlapping is forbidden, and leaving gaps of unused resources will be waste of resources and money at the end.
“Two minute warning” is not speaking to me. I don’t see a reason to warn a domain or user that the connection will be closed soon, while user knows what was requested and domain is tracking that with a calendar. We can discuss some internal notifiers, but that’s implementation.
Best regards Radek
________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
Hi Jeff - glad to see you chime in! See my response in line... Jeff W. Boote wrote:
On Sep 28, 2010, at 3:40 PM, Jerry Sobieski wrote:
Hi Radak - see comment below...
Radek Krzywania wrote: Hi Jerry, IMHO setup and tear down times should be considered global (all NSAs along a path). User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
But we cannot expect all NSAs' clocks to be exactly synchronized. Clocks are critical to book-ahead scheduling, and independent quasi-synchronized clocks (however slightly skewed) will cause problems. Some of those problems are evident in this discussion.
Exact synchronization is not required. The protocol can (and probably should) define a reasonable synchronization requirement. i.e. NSAs MUST be synchronized within 1 second. (Or even 10 seconds) That should be a relatively trivial requirement, and bounds this problem.
I think the gotcha here is that even if we define a maximum skew between two interacting NSAs, that skew can be additive from one NSA to the next down/across the service tree. And if the service tree contains a number of NSAs, that skew may become large. Despite this, whether the skew is large or small, it must still be analyzed carefully. The protocol can be defined to handle these effects - we just need to do a thorough analysis of the timing permutations. This is careful work, but not particularly difficult - certainly not a roadblock.
Further, if we require *any* type of clock synchronization for the protocol to work, we then need to define a mechanism within the protocol to either synchronize clocks or at least detect broken clocks. And this check must be performed periodically during the NSA-NSA session to ensure the clocks don't drift out of conformance. IMO, any type of required clock synchronization creates substantial complexity. I think it is safe to assume that NSA clocks will be "approximately" the same, but we still need to handle even slight skew effects rigorously in the protocol.
*Question for discussion:* What happens if the time-of-day clocks are way off? Effectively, the reservations may never be successful, or the provisioning may never succeed. But the protocol should still work correctly even if someone's calendar is messed up. Remember: the time-of-day clock may be messed up in one NSA, but the protocol agent must function for many possible service trees. We can discuss this: I don't think we want to make it a function of the Connection Service to ensure that reservation clocks are right. Or maybe we should? Maybe the "scheduling" function - which uses a time-of-day clock that *should* be close to correct - does need some coordination... what do folks think? IMO, the timers and protocol functions for the connection life cycle state machine should be able to function with independent clocks that are approximately correct - whether that error is a few milliseconds or a few days.
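As a back-of-the-envelope illustration of the additive-skew concern (Python, with purely made-up numbers - the 1-second figure is just the per-pair bound floated earlier in the thread, not a spec value): if the protocol only bounds the skew between *adjacent* NSAs, the worst-case disagreement between the two ends of a service chain still grows with its depth:

PER_HOP_BOUND = 1.0   # seconds; hypothetical pairwise synchronization requirement

def worst_case_end_to_end_skew(num_nsas: int, per_hop_bound: float = PER_HOP_BOUND) -> float:
    """Worst-case clock disagreement between the first and last NSA in a chain,
    assuming each adjacent pair is allowed to differ by up to per_hop_bound."""
    return max(num_nsas - 1, 0) * per_hop_bound

for n in (2, 5, 10):
    print(n, "NSAs ->", worst_case_end_to_end_skew(n), "s worst-case skew")

With a 10-deep chain that is already 9 seconds of possible disagreement, which is why a per-pair bound alone does not make auto-start/auto-stop safe.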
Open issue for discussion: How do we address the issue of clock skew across a scheduled network?
For tear down, it does not matter where you start to removing the configuration (end point, or any point along the path). Since you remove single configuration point – the service is not available any more. That the time where available time ends.
Actually, I would assert that the "available" time ends when the End Time elapses. The "End Time" is by definition when the connection is no longer available, and the user should therefore not assume they are usable past that time. Maybe the path resources will stay in place, but maybe not. During reconfiguration, the state of cross connects along the path is indeterminate, and if /when they are reconfigured, then user data can be [mis]routed to 3rd parties, and 3rd party data may be [mis]routed to the egress STP.
The real issue in my mind is "when is the actual End Time?" Given that we cannot guarantee exactly when each NSA may reach their repective End Time, the End Time should be (IMHO:-) an Estimated End Time "plus or minus", and the user should consider the ramifications of this.
We do not know which NSA will reach End Time first and begin to tear down the connection. Nor do we know the delta between this first NSA's clock and the user's clock.
I don't think the NSA should be attempting to understand the delta of the users clock. It can simply treat requests relative to 'true time'. And, we should expect End Time to be done similar to Start Time. In other words, the users requested time indicates when they expect to 'use' the circuit. If tear-down takes time, the resource time should add a delta to that. Tear-down should not start until 'after' the users requested end time.
I think the ideal situation is that the user sees the Estimated End Time approaching (via their own clock) and stops sending some user defined time prior to End Time. The user lets the connection drain - still prior to End TIme. Once the user traffic is drained, the user RA issues a Release Request. We can, I think, assume that if the user issues the manual Release Request before the scheduled availability has ended, then the user has verified that no important data remains in the pipe. But due to clock skew and due to the estimated End Time, the user's estimate of how much time is remaining may be substantially under-estimated. This is why I think maybe a 2-minute warning might be useful - the user can request a warning of "n" seconds, and that warning will bubble up from the NSA who's clock is most advanced. The user can then throttle down their traffic accordingly and then issue a Release Request. While this might be nice, it is not fundamentally necessary for the protocol. It helps the user, but the protocol must still be able to deterministically handle a user that ignores the warning and drives off the cliff.
Fundamentally, we want to make sure the user isn't surprised by an earlier than expected release due to clock skew. And we won't know until close to the End Time who's clock is going to trigger the Release first. The warning announces reveals that NSA and gives the user fair warning.
This is reasonable and becomes manageable by the user if the protocol defines the maximum clock skew allowed.
Ah...but as noted above - even a maximum clock skew is additive. And the skew is measured against... what? And when is it measured? How often?
But who has the "True Time"? This is the fundamental problem. Every NSA thinks their time is the One True Time (:-). And they all vary a little. How does the protocol react when it discovers that some message or event has not occurred according to what it believes to be the proper time? Clocks are either *exactly* synchronized, or they are not. Since the latter is the real-world case, we need to just make sure the protocol is designed to handle that. IMO, we should treat time as relative. I.e. each NSA maintains its own Time, but it must allow for others whose Time may be skewed. So we need to consider what it actually /means/ when an event occurs in each state and design the protocol to react accordingly.
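One way to picture "treat time as relative" (an illustrative Python sketch; the tolerance value and function name are my own, not anything from the protocol) is to interpret every incoming event against the local clock with an explicit allowance for peer skew:

from datetime import datetime, timedelta

# Hypothetical allowance: how far another NSA's notion of "now" may differ from
# ours before we treat an event's timing as a genuine error rather than skew.
SKEW_ALLOWANCE = timedelta(seconds=10)

def classify_event(scheduled: datetime, local_now: datetime,
                   allowance: timedelta = SKEW_ALLOWANCE) -> str:
    """Interpret an event time against the local clock, allowing for peer skew."""
    if local_now < scheduled - allowance:
        return "early"     # the peer's clock may simply be ahead of ours
    if local_now > scheduled + allowance:
        return "late"      # beyond any allowed skew: treat as a real timing problem
    return "on-time"       # within the skew window: proceed normally

The state machine then needs a defined transition for each of the three outcomes, which is the "design the protocol to react accordingly" part.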
Whatever the method for initiating the release, the network should insure that any user data accepted at the ingress prior to the End Time is not misrouted - even after the end time. The network's only options are to try to deliver in-flight data properly or drop it. Since the End Time has been reached, the network can no longer assume that any segments are still usable, so delivering it is not really an option either. The network must drop any stranded traffic. Thus, we need to have some means of blocking new ingress data, and insuring bytes in flight get dropped asap.
One might take a different view if we hold the connection in place for some safety/guard time past the local End Time. This would do several things: 1) it would make sure the End Time has elapsed for all NSAs especially the user RA thus allowing full use during the available timeframe, and b) wait a few mils longer (latency time) so that any data in flight is delivered. At this point (after all NSAs have reached the end time plus a latency factor) any remaing data in flight was definately sent after the reservation. Bad user, bad user. In this case, any data in flight is no longer the network's concern. Then, we can reconfigure without regard to securing the user information.
Finally, we might consider how to insure that the connection is not torn down until *all* NSAs have reached the End Time. THis could be indicated by flooding a "End Time Alert" notification or some simlar message along the tree. When that message is acknowledge by all NSAs in the connection, then a Release can begin. Of course, here again, if an acknowledge is not received in a finite time, the connection is torn down unilaterally.
I do however, think we need to address End Time processing in V1.0 This is important - we need to have a clearly defined lifecycle and primitievs that do not promise something the protocol cannot deliver. >From this discussion, we cannot clearly state when the availability of the connection ends.
These are some very interesting and challenging nuances. I hope this was useful musings... br Jerry
We can discuss whether it should be synchronized or signaled, but I would even left it for v2 (or v1.1, or whatever we decide). Since ALL segments of connection has configuration removed, the resource time is ended. I agree that resource time is difficult to forecast, yet we need to fit that into calendar full of other reservations and synchronize them. Thus we need to estimate, guess, or use magic to get those values as realistic as possible. Overlapping is forbidden, and leaving gaps of unused resources will be waste of resources and money at the end.
“Two minute warning” is not speaking to me. I don’t see a reason to warn a domain or user that the connection will be closed soon, while user knows what was requested and domain is tracking that with a calendar. We can discuss some internal notifiers, but that’s implementation.
Best regards Radek
________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl> Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
Radek I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, about differences in setup times for MPLS vs optical lambdas, or about the choices an NSA/NRM will make in path-finding. In my opinion:
a. The user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by the NSI protocol; the NSI connection service architecture should discuss them as a suggested concept).
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue.
d. Similar semantics apply to the end-time as well.
Inder
--- Inder Monga ANI Testbed imonga@es.net ESnet Blog (510) 499 8065 (c) (510) 486 6531 (o) "Whatever your mind can conceive and believe it can achieve." - Napoleon Hill
Hi Inder- I am not sure I agree with all of this... Inder Monga wrote:
Radek
I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs optical lambdas, and concern itself with choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but /the protocol is between the RA and the PA/ and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API, and a different Network to Network API... but I don't think that's what we want to do (:-) We need, IMO, to think *not* about the user, but about the Requesting Agent - regardless of who it represents. Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept).
While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if, for instance, a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time? I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone... So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have on the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client.
Ah, but the rub here is that the "user" is an RA... but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue.
I agree: We cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis. While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's OK to identify constants that the protocol must use so that we can validate the protocol and simplify implementation and deployment. Indeed, oftentimes when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well.
Pretty much. Across the board, things like clock events, estimates, and service specific choices will create situations where we need to insure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity. br Jerry
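To make those three options concrete, here is a rough decision sketch (Python; the function, the Decision enum, and the default policy are illustrative assumptions, not anything from the NSI documents) of what a single NSA could do when prepending its setup guard to the requested start would collide with the previous reservation in its calendar:

from datetime import datetime, timedelta
from enum import Enum

class Decision(Enum):
    ACCEPT = "accept"        # guard fits: book the resource time as planned
    REJECT = "reject"        # option 1: refuse the request
    RETARD_START = "retard"  # option 2: push the Estimated Start Time later
    SHRINK_GUARD = "shrink"  # option 3: squeeze the guard into the free lead time

def plan_setup(requested_start: datetime,
               previous_reservation_end: datetime,
               setup_guard: timedelta,
               policy: Decision = Decision.REJECT):
    """What one NSA might do when its setup guard collides with an earlier booking."""
    resource_start = requested_start - setup_guard
    if resource_start >= previous_reservation_end:
        return Decision.ACCEPT, resource_start                      # no collision
    if policy is Decision.REJECT:                                    # option 1
        return Decision.REJECT, None
    if policy is Decision.RETARD_START:                              # option 2
        return Decision.RETARD_START, previous_reservation_end + setup_guard
    return Decision.SHRINK_GUARD, previous_reservation_end           # option 3

Gigi's argument in the next message is essentially that only the REJECT branch should ever be taken, which keeps the requested start time untouched everywhere in the tree.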
Jerry, For your question: "While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?" In my opinion, I think the answer here has to be #1) - each NSA must reject the request if its process to establish the requested connection cannot meet the Start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual guard time of an NSA is also non-negotiable, so option #3) is not an option. I agree with Radek, ONLY Start times and End times should be used in the protocol, and guard times are only private functions of each individual NSA. Kind regards, Gigi
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, the answer here has to be #1): each NSA must reject the request if its process to establish the requested connection cannot meet the Start Time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, and option #3) is not an option either.
I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across each NSA. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care, as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until the requested end time.

As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.

jeff
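[Editorial aside] To put Jeff's two points side by side numerically - guard times that combine as a maximum rather than a sum, and a modest synchronization error bound - here is a small illustrative sketch; all values and network names are invented for the example.

```python
# Illustrative only: the networks, guard times and sync bound below are made-up values.
guard_times = {"netA": 120, "netB": 17, "netC": 300}   # per-NSA provisioning guard, seconds
clock_error_bound = 10                                  # assumed NTP-level sync bound, seconds
requested_start = 1_285_772_400                         # user's requested start, epoch seconds

# Each NSA independently starts provisioning early enough for its own guard time plus
# the clock error; the lead time visible to the user is the maximum, not the sum.
provision_at = {nsa: requested_start - guard - clock_error_bound
                for nsa, guard in guard_times.items()}
user_visible_lead = max(guard_times.values()) + clock_error_bound

print(provision_at)         # when each NSA begins provisioning
print(user_visible_lead)    # 310 seconds before the requested start, in this example
```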
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote:
Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote:
Radek
I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs optical lambdas, and concern itself with choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA and and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API, and a different Network to Network API...but I don't think thats what we want to do (:-) We need to IMO *not* think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end-user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by the NSI protocol; the NSI connection service architecture should discuss them as a suggested concept). While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
Hi Jeff,
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
This can be recommended. Whether it should be a mandatory requirement is the question.
jeff
--- Inder Monga ANI Testbed imonga@es.net ESnet Blog (510) 499 8065 (c) (510) 486 6531 (o) "Whatever your mind can conceive and believe it can achieve." - Napoleon Hill
I abstract from this discussion some things that need to be agreed.

We are using the terms available time and reserved time. Available time is the requested time in a request and the estimated time in a response. Perhaps "estimated" is best in both cases.

There is a proposal that available time be used in all protocol messages. This certainly seems to work for the automatic provisioning case. For user provisioning it seems to me that some way of giving the user an estimate of startup time is needed. Also, for user provisioning the assumption is that teardown is initiated by the network to satisfy the reserved end time (if not torn down by the user before then). We need to decide how to deal with automated and user-initiated provisioning in the same protocol.

Time synchronization is a major issue. I note that time synchronization in reservations is a question of setting start and end times equivalently in all NSAs. Jeff suggests that we use NTP or some equivalent to sync time between NSAs, which can provide a bound. I wonder if there is a way to do this, for reservations at least, in the protocol - each NSA sharing its time with its neighbor in each request.

I would like to see a list of specific issues and proposed resolutions to discuss. Is someone able to develop such a list? Are there other issues that need to be resolved?

John
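[Editorial aside] John's idea of each NSA sharing its clock with its neighbor in each request amounts to a per-exchange offset estimate. A minimal sketch of that calculation follows; the field names and the single-timestamp exchange are assumptions, not NSI protocol elements, and the error bound is the usual SNTP-style half-round-trip argument.

```python
import time

def clock_offset_estimate(t_sent, t_peer, t_received):
    """Estimate a neighbor NSA's clock offset from one request/response exchange.

    t_sent     -- local clock when the request was sent
    t_peer     -- the peer's clock value carried back in the reply
    t_received -- local clock when the reply arrived
    Returns (offset, error_bound): offset is peer minus local, and the true value
    lies within +/- error_bound (half the round-trip time) of the estimate.
    """
    round_trip = t_received - t_sent
    offset = t_peer - (t_sent + t_received) / 2.0
    return offset, round_trip / 2.0

# Example exchange: the neighbor's clock is ~3 s ahead, round trip is 400 ms.
sent = time.time()
print(clock_offset_estimate(sent, sent + 3.2, sent + 0.4))
```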
Hi, On 09/29/2010 05:21 PM, John Vollbrecht wrote:
I abstract from this discussion some things that need to be agreed.
We are using the term available time and reserved time.
I would call it "requested time" (as it comes from the user request).
Available time is requested time in a request and estimated time in a response. Perhaps estimated is best in both cases.
There is a proposal that available time be used in all protocol messages. This certainly seems to work for automatic provisioning case.
IMO, this is the only thing that makes sense - reserved time will be a local parameter of each NSA.
For user provisioning it seems to me that some way of giving the user an estimate of startup time is needed. Also, for user provisioning the assumption is that teardown is initiated by the network to satisfy the reserved end time (if not torn down by the user before then). We need to decide how to deal with automated and user-initiated provisioning in the same protocol.
If my understanding is right, the "user initiated" provisioning will be done by the RA, so technically there should be no difference from "automated" provisioning if the RA is provided with the same info as any other NSA. Which it should be, unless we aim for a simplified UNI as described by Jerry in an earlier mail (and I agree we probably don't actually want that).

Btw, I never understood the need for user-initiated provisioning in the first place. The user has requested a resource, and it has been confirmed to him. He can safely assume it's there (modulo error conditions). The provisioning should be initiated by the "owning" NSA - probably the one that received the original user request - or by the RA itself. Some other thoughts in this respect:
- The resources are reserved anyway.
- The user is bound to initiate the provisioning at the requested/available time. He has no choice (otherwise we need to inform him about the resulting reserved time - in which domain?). So why ask him to do so? And what if he doesn't at all?
- The user should not need to be network aware - it's the network that has to provide the service to the specs.

So "automatic" provisioning is really the only reasonable option, I think, be it from a PA or RA. Initiation by a "dumb" user agent just opens a can of worms...
Time synchronization is a major issue. I note that time synchronization in reservations is a question of setting start and end times equivalently in all NSAs. Jeff suggests that we use NTP or some equivalent to sync time between NSAs, which can provide a bound. I wonder if there is a way to do this, for reservations at least, in the protocol - each NSA sharing its time with its neighbor in each request. I would like to see a list of specific issues and proposed resolutions to discuss. Is someone able to develop such a list?
I don't actually think this is an issue per se. All that is required is to start ntpd, which any reasonably configured server does anyway. This will give you (sub-)second synchronisation between NSAs and, if we make it a requirement, for the user as well. Now, if we are talking about "guard times" and such of minutes, time synchronisation to a second or so will certainly be sufficient. I'm not sure the NSAs need to synchronise with each other, at least in v1, if each is required to synchronise to a reasonably good time reference - that is what should be mandated.

I agree the list of specific issues is needed here; I would again recommend thinking of state machines in this context. With nested SMs, I believe the main issue is a timeout value to wait for any NSA not in the provisioned state at the user's requested/available time. But then again one would need to look through the tree and chain models in detail, so I'm sure there's more.

Cheers, Artur
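[Editorial aside] Artur's "timeout for any NSA not yet provisioned at the requested time" is easy to picture as a small check inside the parent NSA's state machine. The sketch below is purely illustrative: the state names, the grace period and the returned actions are assumptions, not the NSI state machine.

```python
# Illustrative sketch: what an aggregating NSA might do when the requested start time arrives.
PROVISIONED = "PROVISIONED"

def action_at_start_time(child_states, now, requested_start, grace_seconds):
    """Decide the parent NSA's next step once the user's requested start time is reached."""
    if all(state == PROVISIONED for state in child_states.values()):
        return "REPORT_CONNECTION_UP"        # signal the RA that the circuit is available
    if now < requested_start + grace_seconds:
        return "KEEP_WAITING"                # still inside the locally configured timeout
    return "RAISE_ERROR"                     # notify up the service chain / let the SLA take over

print(action_at_start_time({"netA": PROVISIONED, "netB": "PROVISIONING"},
                           now=1_285_772_430, requested_start=1_285_772_400,
                           grace_seconds=60))   # -> KEEP_WAITING (30 s into a 60 s grace period)
```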
Are there other issues that need to be resolved?
John
-- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Ok, I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable, because the Estimated Start Time is just an estimate; it's not binding. The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes, or b) give up and release the connection. An SLA might influence the bad NSA not to low-ball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.

This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e., if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and be able to tell, via protocol operation, whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but that could as easily have been a poor judgment call on the provisioning time - which is ok.

So, in the standard, we can only recommend that #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.

my $.02
Jerry
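[Editorial aside] The option-#1 behaviour Jerry describes - requested times passed down unchanged, each NSA privately prepending its own guard time and rejecting on a calendar collision - can be sketched roughly as below. The function and field names are invented for illustration and are not NSI message definitions.

```python
# Rough sketch of option #1 (names and structures are illustrative, not the NSI schema).
def handle_reserve(requested_start, requested_end, local_guard, local_teardown, calendar):
    """One NSA's reservation check: prepend the private guard time, reject on overlap."""
    resource_start = requested_start - local_guard      # private resource time, never exposed
    resource_end = requested_end + local_teardown
    overlaps = any(resource_start < booked_end and booked_start < resource_end
                   for booked_start, booked_end in calendar)
    if overlaps:
        return ("REJECT", None)                          # option #1: cannot make the start time
    calendar.append((resource_start, resource_end))
    # The request forwarded down the tree carries the *unchanged* requested times;
    # the Estimated Start Time is advisory only.
    return ("CONFIRM", {"requested_start": requested_start,
                        "requested_end": requested_end,
                        "estimated_start": requested_start})

calendar = [(900, 1000)]                                 # an existing booking (arbitrary units)
print(handle_reserve(1005, 1100, local_guard=20, local_teardown=10, calendar=calendar))
```

With these example numbers the 20-unit guard pushes the resource window back into the existing booking, so the request is rejected rather than the start time being moved.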
Hi,

It's getting hard to solve everything here, so let's not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, yet we are unable to guarantee the exact start time to a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can't guarantee exactly when (1:50, 2:01, 2:15). Let's take a fairly long timeout (e.g. 20 minutes), and start setting up the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach turns out to be fine as is :) ).

For #1 - it may be a problem for instant reservations, where the user wants a circuit ASAP. We define ASAP as (see the above approach) less than 20 minutes (typically 5-10 minutes probably, but that's my guess), or not at all. Users may or may not complain about that. In the first case we are good. In the second case we will need to design an upgrade for v2.

Synchronization IMHO is important, and out of scope at the same time. We can assume that agents' clocks are synchronized with a precision of, let's say, 10 seconds, which should be more than enough. The agents will use system clocks, so they need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue, it's a deployment issue. So let's put into the specification: "The NSI protocol requires time synchronization with a precision of no worse than 10 seconds." If we discover it's insufficient, let's upgrade it for v2.

We already have some features to implement, just to see if it works fine (works at all, actually). If the user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We cannot, however, deliver the service for a shorter period than the user-defined time. So we can agree (by voting, not discussing) on fixed time values. My proposal is as above:
- 20 minutes of setup time for the reservation
- the service availability time (e.g. 13 h)
- the service teardown time (it's not important from the user's perspective, since once any segment of the connection is removed the service is not available any more, but let's say 15 minutes)

In that way, the calendar booking needs to reserve resources for 13 h 35 minutes (this arithmetic is checked in the sketch following this mail). IMHO we can agree on that by a simple vote for v1 (Doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we have started a quite theoretical discussion based on assumptions and guessing "what if", instead of focusing on delivering any service (even with limited guarantees).

Best regards
Radek

________________________________________________________________________
Radoslaw Krzywania          Network Research and Development
radek.krzywania@man.poznan.pl    Poznan Supercomputing and Networking Center
http://www.man.poznan.pl         +48 61 850 25 26
________________________________________________________________________
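[Editorial aside] As a quick check of the numbers Radek proposes (these fixed values are his suggestion for v1, not agreed constants), a few lines of arithmetic:

```python
from datetime import timedelta

# Radek's proposed fixed values for v1 (suggestions, not agreed NSI constants).
setup_window = timedelta(minutes=20)      # circuit should be up within this window after start
service_time = timedelta(hours=13)        # example availability time requested by the user
teardown_window = timedelta(minutes=15)   # time to remove the circuit after the end time

calendar_booking = setup_window + service_time + teardown_window
print(calendar_booking)                   # 13:35:00 -- the "13 h 35 minutes" in the mail
```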
Hi Radek, All,

Hmmm, I for my part would be quite annoyed (to put it mildly) if I miss the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc. Be aware that users who complain are users quite quickly lost. You don't want that.

So let's consider two example users:
- High-volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. When that time comes, the application will just throw data onto the network; it might use a connection-less protocol or not, but it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless.
- Video conferencing/streaming: you reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participant that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )

In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like the network being down (and we'll talk about protection in v2 :-) )

I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical: the service needs to be useful to the end user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)

Cheers,
Artur

On 09/30/2010 03:37 PM, Radek Krzywania wrote:
Hi,
It’s getting hard to solve everything here, so let’s not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, but we are unable to guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can’t guarantee exactly when (1:50, 2:01, 2:15). Let’s take a fairly long timeout (e.g. 20 minutes) and start setting up the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-guess value). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering experience shows it is already fine enough :) ).
For #1 – it may be a problem for instant reservations, where the user wants a circuit ASAP. We define ASAP (following the approach above) as less than 20 minutes (typically 5-10 minutes, probably, but that’s my guess), or not at all. Users may or may not complain about that. In the first case we are good; in the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make the assumption that the agents’ clocks are synchronized with a precision of, let’s say, 10 seconds, which should be more than enough. The agents will use system clocks, so these need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue, it is a deployment issue. So let’s put into the specification: “The NSI protocol requires time synchronization with a precision of 10 seconds or better.” If we discover that is insufficient, let’s upgrade it for v2.
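[Editor's note: to make the proposed 10-second bound concrete, here is a minimal Python sketch with hypothetical names (MAX_CLOCK_SKEW, provisioning_trigger) of how an NSA might fold a bounded clock error into its local provisioning trigger. It only illustrates the deployment assumption described above; it is not a prescribed NSI mechanism.]

    from datetime import datetime, timedelta, timezone

    # Deployment assumption from the discussion above: agent clocks are kept
    # within 10 seconds of each other (e.g. via NTP). The bound is configured,
    # not discovered by the protocol.
    MAX_CLOCK_SKEW = timedelta(seconds=10)

    def provisioning_trigger(requested_start: datetime,
                             guard_time: timedelta) -> datetime:
        """Local wall-clock time at which this NSA should start provisioning.

        The guard time covers the expected setup duration; the skew bound
        covers the possibility that the requester's clock runs ahead of ours.
        """
        return requested_start - guard_time - MAX_CLOCK_SKEW

    # Example: a connection requested for 14:00 UTC with a 5-minute guard time
    # should be triggered locally no later than 13:54:50 UTC.
    start = datetime(2010, 9, 30, 14, 0, tzinfo=timezone.utc)
    print(provisioning_trigger(start, timedelta(minutes=5)))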
We already have some features to implement, just to see if this works well (works at all, actually). If a user is booking a circuit a week in advance, I guess he will not mind if we set it up as much as 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We cannot, however, deliver the service for a shorter period than the user requested. So we can agree (by voting, not discussing) on fixed time values. My proposal is as above:
- 20 minutes of set-up time for the reservation
- Service availability time (e.g. 13 h)
- Service tear-down time (not important from the user’s perspective, since once any segment of the connection is removed the service is no longer available, but let’s say 15 minutes)
In that way, the calendar booking needs to reserve resources for 13 h 35 minutes. IMHO we can agree on that by a simple vote for v1 (a Doodle poll, maybe) and collect more detailed requirements for v2 later on. I get the feeling we have started a rather theoretical discussion based on assumptions and “what if” guessing, instead of focusing on delivering a service (even with limited guarantees).
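[Editor's note: as a sanity check on the arithmetic above, a short sketch with hypothetical helper names of how a reservation calendar entry could be derived from the requested availability window plus the fixed set-up and tear-down allowances proposed here.]

    from datetime import datetime, timedelta

    SETUP_ALLOWANCE = timedelta(minutes=20)     # proposed fixed set-up time
    TEARDOWN_ALLOWANCE = timedelta(minutes=15)  # proposed fixed tear-down time

    def calendar_entry(available_start: datetime, available_end: datetime):
        """Return the (start, end) window the NRM should block in its calendar."""
        return (available_start - SETUP_ALLOWANCE,
                available_end + TEARDOWN_ALLOWANCE)

    # 13 hours of requested availability ...
    start = datetime(2010, 10, 1, 8, 0)
    end = start + timedelta(hours=13)
    booked_start, booked_end = calendar_entry(start, end)
    # ... blocks 13 h 35 min of resource time in the calendar.
    print(booked_end - booked_start)   # 13:35:00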
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania
Network Research and Development
Poznan Supercomputing and Networking Center
radek.krzywania@man.poznan.pl
+48 61 850 25 26
http://www.man.poznan.pl
________________________________________________________________________
From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of Jerry Sobieski
Sent: Wednesday, September 29, 2010 9:33 PM
To: Jeff W.Boote
Cc: nsi-wg@ogf.org
Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to lowball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
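[Editor's note: for illustration, a minimal sketch of that requester-side choice (wait, or give up and release), with hypothetical names (grace_period, connection_is_up, release); the grace period would come from local policy or an SLA, not from the NSI protocol itself.]

    import time
    from datetime import datetime, timedelta, timezone

    def await_connection(estimated_start: datetime,   # timezone-aware (UTC) estimate
                         grace_period: timedelta,
                         connection_is_up,            # callable () -> bool, e.g. a query to the PA
                         release):                    # callable () -> None, releases the reservation
        """RA-side policy: wait a bounded time past the estimated start, then give up."""
        deadline = estimated_start + grace_period
        while datetime.now(timezone.utc) < deadline:
            if connection_is_up():
                return True        # option a) provisioning completed while we waited
            time.sleep(5)          # poll interval; a real RA would react to PA notifications
        release()                  # option b) give up and release the connection
        return False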
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and tell via protocol operation whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could just as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
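[Editor's note: to make the recommended option #1 behaviour concrete, a sketch with hypothetical names (calendar.is_free, calendar.book, ReservationRejected): the NSA simply refuses the request when its own guard time no longer fits in front of the requested start time.]

    from datetime import datetime, timedelta

    class ReservationRejected(Exception):
        """Raised when the guard time cannot be prepended (option #1)."""

    def admit(calendar, requested_start: datetime, requested_end: datetime,
              guard_time: timedelta, now: datetime):
        """Option #1: reject rather than shift the requested start time."""
        provision_from = requested_start - guard_time
        if provision_from < now:
            raise ReservationRejected("not enough lead time for the guard time")
        if not calendar.is_free(provision_from, requested_end):
            raise ReservationRejected("guard time overlaps an existing reservation")
        # Book resource time (guard time included), while the requested start
        # time itself travels unchanged down the tree.
        calendar.book(provision_from, requested_end)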
Jeff W.Boote wrote:
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, I think the answer here has to be #1) each NSA must reject the request if its process to establish the requested connection cannot meet the Start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all kinds of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, and option #3) is not an option either.
I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across the NSAs. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care, as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until the requested end time.
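[Editor's note: assuming each NSA provisions its own segment in parallel, starting at (requested start minus its own guard time), the user-visible lead time is bounded by the largest guard time rather than their sum; a tiny sketch with made-up values.]

    from datetime import timedelta

    # Hypothetical per-NSA guard times along the service chain/tree.
    guard_times = [timedelta(minutes=5), timedelta(minutes=2), timedelta(seconds=90)]

    # If every NSA independently begins provisioning its segment at
    # (requested_start - its_guard_time), the lead time the user can observe
    # is bounded by the largest guard time, not by the sum of all of them.
    user_visible_lead = max(guard_times)
    print(user_visible_lead)   # 0:05:00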
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
jeff
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote:
Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote:
Radek
I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values or differences in setup times for MPLS vs optical lambdas, nor should it concern itself with the choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API, and a different Network to Network API...but I don't think that's what we want to do (:-) We need IMO *not* to think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module, and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion,
a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept).
While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone...So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have to the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set their own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client.
Ah, but the rub here is that the "user" is an RA...but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time, expected time to provision aka guard time is best handled/shared as a SLA/Service definition issue.
I agree: we cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's ok to identify constants that the protocol must use, so that we can validate the protocol and simplify implementation and deployment. Indeed, oftentimes even slightly skewed clocks introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well.
Pretty much. Across the board, things like clock events, estimates, and service-specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
--
Dr Artur Barczyk
California Institute of Technology
c/o CERN, 1211 Geneve 23, Switzerland
Tel: +41 22 7675801
Hi,

There is no, and will be no, mechanism to define a static, constant or even predictable connection activation time in a distributed environment. What we do is estimate. You have a VC at 2pm: ask for the circuit to be available at 1:40, since you were warned connection setup may take up to 20 minutes. Then you have a guarantee. Please mind, v1 can’t solve everything. Let’s just create something and improve it step by step.

Yes, in the above case you extend the reservation time, which means you pay more (in theory, depending on the payment model). But think in the reverse direction – how do you know how long you need a connection? You usually guess and add something just in case. So you don’t use it in an optimal way anyway. For v1 I would not care much. If we try to restrict it in detail, we will get stuck in discussion and in the complexity of the protocol and its mechanisms. I would rather keep it simple.

Best regards
Radek
I support Radek’s approach on this one – in the absence of deterministic setup times, let’s take a heuristic approach and rely on the operating experience gained from systems such as AutoBAHN. As we gain more operating experience we may be able to find a better solution for future NSI versions.

Guy
Why the rush? As I pointed out in the previous mail, there will not be many applications which will be able to benefit from this way of circuit scheduling. At least the one I have in mind won't... Cheers, Artur On 09/30/2010 06:06 PM, Guy Roberts wrote:
I support Radek’s approach on this one – in the absence of deterministic setup times, let’s take a heuristic approach and rely on the operating experience gained from systems such as AutoBAHN. As we gain more operating experience we may be able to find a better solution for future NSI versions.
Guy
*From:* Radek Krzywania [mailto:radek.krzywania@man.poznan.pl] *Sent:* 30 September 2010 16:52 *To:* 'Artur Barczyk' *Cc:* nsi-wg@ogf.org *Subject:* Re: [Nsi-wg] time issue
Hi,
There is no, and will be no such mechanism to define static, constant or even predictable connection activation time in distributed environment. What we do is estimates. You have a VC at 2pm, ask for circuit available at 1:40, as you were warned connection setup time may take up to 20 minutes. Then you have a guarantee. Please mind, v1 can’t solve everything. Let’s just create something and improve it step by step.
Yes, in above case you extend reservation time, which means you pay more (in theory, depending on pay model). But think in reverse direction – how do you know how long do you need a connection? You usually guess and adds something just in case. So you don’t use it in optimal way anyway. For v1 – I would not care much. If we try to restrict it in details, we will stack in discussion and complexity of the protocol and its mechanisms. I would rather keep it simple in contrary.
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania Network Research and Development
Poznan Supercomputing and
radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl> Networking Center
+48 61 850 25 26 http://www.man.poznan.pl
________________________________________________________________________
*From:* Artur Barczyk [mailto:Artur.Barczyk@cern.ch] *Sent:* Thursday, September 30, 2010 5:25 PM *To:* radek.krzywania@man.poznan.pl *Cc:* 'Jerry Sobieski'; 'Jeff W.Boote'; nsi-wg@ogf.org *Subject:* Re: [Nsi-wg] time issue
Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly), if I miss the first 15 minutes of todays HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users complaining are users quite quickly lost. You don't want that.
So let's consider two example users: - high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. This time comes, the application will just throw data on the network, it might use connection-less protocol, or not, but it will result in an error. It cannot wait "around" 15 minutes, as it will bring the transfer schedule in complete disorder. Such a "service" is just useless. - video conferencing/streaming. You reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participant that the network prevented the conference to start for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like network down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as argument while the underlying protocols are not yet correctly specified. This is not theoretical, the service needs to be useful to the end-user, if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote:
Hi,
It’s getting hard to solve everything here, so let’s don’t try to solve everything here at once. So how about defining a start time as a best effort for v1? So we promise to deliver the service, yet we are unable to guarantee the exact start time in precision of seconds. If user want connection to be available at 2pm, it will be around that time, but we can’t guarantee when exactly (1:50, 2:01, 2:15). Let’s take a quite long time as a timeout (e.g. 20 minutes), and start booking the circuit in 5 or 10 minutes in advance (no discussion for v1, just best feeling guess) . The result will be that in most cases we will deliver the service at AROUND specified time. For v1 is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it’s fine enough :) ).
For #1 – it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the approach above) less than 20 minutes (typically 5-10 minutes probably, but that’s my guess), or not at all. Users may or may not complain about that. In the first case we are good. In the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make an assumption that agents’ clocks are synchronized with a precision of, let’s say, 10 seconds, which should be more than enough. The agents will use system clocks, so they need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue but a deployment one. So let’s put into the specification: “The NSI protocol requires time synchronization to within 10 seconds.” If we discover that is insufficient, let’s upgrade it for v2.
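As a minimal sketch of what such a deployment assumption might look like inside an NSA implementation (the 10-second bound and the timestamp check are illustrative assumptions only, not part of any agreed NSI message set):

    import time

    MAX_CLOCK_SKEW = 10.0  # seconds; the deployment assumption proposed above

    def peer_clock_ok(message_timestamp, now=None):
        """Return True if a peer NSA's reported timestamp is within the assumed skew bound."""
        if now is None:
            now = time.time()
        return abs(now - message_timestamp) <= MAX_CLOCK_SKEW

    # Example: a message stamped 4 s ahead of local time is accepted,
    # one stamped 30 s off would be flagged to the operator.
    print(peer_clock_ok(time.time() + 4))   # True
    print(peer_clock_ok(time.time() + 30))  # False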
We already have some features to implement, just to see if it works fine (works at all, actually). If the user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify it in the protocol description). We cannot, however, deliver the service for a shorter time than the user requested. So we can agree (by voting, not discussing) on the fixed time values. My proposal is as above:
20 minutes of set-up time for the reservation
Service availability time (e.g. 13 h)
Service tear-down time (this is not important from the user's perspective, since as soon as any segment of the connection is removed the service is no longer available, but let's say 15 minutes)
In that way, the calendar booking needs to reserve resources for 13 h 35 minutes. IMHO we can agree on that simply by voting for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started a quite theoretical discussion based on assumptions and guessing "what if", instead of focusing on delivering any service (even with a limited guarantee).
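A minimal sketch of the booking arithmetic above, assuming the fixed 20-minute set-up and 15-minute tear-down guard times proposed here (the function and value names are illustrative, not agreed protocol constants):

    from datetime import datetime, timedelta

    SETUP_GUARD = timedelta(minutes=20)     # proposed fixed set-up guard time
    TEARDOWN_GUARD = timedelta(minutes=15)  # proposed fixed tear-down guard time

    def resource_window(available_start, available_end):
        """Resource time the reservation calendar must hold for a requested available window."""
        return available_start - SETUP_GUARD, available_end + TEARDOWN_GUARD

    # A 13-hour available window (14:00 to 03:00 the next day)...
    start, end = resource_window(datetime(2010, 10, 1, 14, 0),
                                 datetime(2010, 10, 2, 3, 0))
    print(end - start)  # 13:35:00 -> the calendar books 13 h 35 min of resource time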
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania Network Research and Development
Poznan Supercomputing and
radek.krzywania@man.poznan.pl Networking Center
+48 61 850 25 26 http://www.man.poznan.pl
________________________________________________________________________
*From:* nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] *On Behalf Of *Jerry Sobieski *Sent:* Wednesday, September 29, 2010 9:33 PM *To:* Jeff W.Boote *Cc:* nsi-wg@ogf.org *Subject:* Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to low-ball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and tell via protocol operation whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but that could just as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
Jeff W.Boote wrote:
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, the answer here has to be #1) - each NSA must reject the request if its process to establish the requested connection cannot meet the start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, and option #3) is not an option either.
I agree #1 seems the most deterministic.
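A minimal sketch of what option #1 could look like as a provider-side admission check, assuming the NSA keeps a simple list of (start, end) bookings in its reservation calendar (the names and the calendar representation are illustrative assumptions):

    from datetime import datetime, timedelta

    def admit_request(requested_start, requested_end, guard_time, calendar):
        """Option 1 sketch: reject unless the set-up guard time can be prepended
        to the requested start without overlapping an existing calendar entry."""
        resource_start = requested_start - guard_time  # earliest moment the resource is needed
        for booked_start, booked_end in calendar:
            if resource_start < booked_end and requested_end > booked_start:
                return False  # guard time cannot fit; reject rather than move the start time
        return True

    calendar = [(datetime(2010, 10, 1, 13, 0), datetime(2010, 10, 1, 13, 50))]
    print(admit_request(datetime(2010, 10, 1, 14, 0), datetime(2010, 10, 1, 15, 0),
                        timedelta(minutes=20), calendar))  # False: guard time collides, reject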
I agree with Radek that ONLY Start times and End times should be used in the protocol, and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across each NSA. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care, as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until the requested end time.
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
jeff
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote:
Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote:
Radek
I agree with your statements;
The user is not interested in partial results, as he/she is not even aware of/interested in which NSAs/domains are involved. The user doesn't care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values or differences in setup times for MPLS vs optical lambdas, nor does it concern itself with the choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but /the protocol is between the RA and the PA/ and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User-to-Network API and a different Network-to-Network API... but I don't think that's what we want to do (:-) We need, IMO, *not* to think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end-user API that simplifies the application's required interactions? This would allow an application to embed an RA in a runtime library/module, and the application itself would only have to deal with the basic connection requirements... just a thought.
In my opinion,
a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and book it in their schedules based on their own configured guard time (guard times are not specified by the NSI protocol; the NSI connection service architecture should discuss them as a suggested concept).
While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times is etched in stone... So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact that will have on the protocol and how it will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set their own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of connection setup/provisioning only. Hence, there is no protocol state transition when the start time passes, other than the messages that indicate the circuit is established end to end, or a teardown message initiated by the client.
Ah, but the rub here is that the "user" is an RA... but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must insure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
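A minimal sketch of such a local RA waiting policy, assuming a configurable wait interval and the two recourses mentioned earlier (keep waiting, or give up and release); the names and the 5-minute value are illustrative assumptions, not protocol constants:

    from datetime import datetime, timedelta

    WAIT_AFTER_START = timedelta(minutes=5)  # local RA policy, not a protocol constant

    def ra_action(requested_start, provision_complete_seen, now=None):
        """What the RA does once the requested start time has passed."""
        now = now or datetime.utcnow()
        if provision_complete_seen:
            return "use connection"
        if now < requested_start + WAIT_AFTER_START:
            return "keep waiting"        # option (a): wait for provisioning to complete
        return "release connection"      # option (b): give up and release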
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue.
I agree: We cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we insure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's ok to identify constants that the protocol must use, so that we can validate the protocol and simplify implementation and deployment. Indeed, often when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well.
Pretty much. Across the board, things like clock events, estimates, and service specific choices will create situations where we need to insure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
-- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Hi, On 09/30/2010 05:51 PM, Radek Krzywania wrote:
Hi,
There is no, and will be no such mechanism to define static, constant or even predictable connection activation time in distributed environment.
Please point me to the study which determined this... ;-) Anyway, let me clarify: the user asks for a connection at 2pm. It is the service provider's responsibility to make sure it is there. Technical details of setup times in domains etc. are, at best, just boring for the customer. It is simply bad practice to make the customer guess how much time he needs to add - in particular if you intend to make him pay for it.
What we do is estimates. You have a VC at 2pm, ask for circuit available at 1:40, as you were warned connection setup time may take up to 20 minutes. Then you have a guarantee.
You can guess as an implementation feature, but you cannot offer a service based on guesswork. Nor can you have a reasonable protocol definition based on it.
Please mind, v1 can’t solve everything. Let’s just create something and improve it step by step.
Yes, in the above case you extend the reservation time, which means you pay more (in theory, depending on the pay model). But think in the reverse direction – how do you know how long you need a connection for? You usually guess and add something just in case, so you don’t use it in an optimal way anyway. For v1 I would not care much. If we try to restrict this in detail, we will get stuck in discussion and in the complexity of the protocol and its mechanisms. I would rather keep it simple instead.
I don't see what complexity this adds. Cheers, Artur
Hi,
In other words, the user wants a connection at 2pm and we set it up at 1:40. How we charge the user for that is a different matter. The user doesn't care, as the connection is delivered at the specified time. It may even be delivered earlier - who cares? So, since the user is unaware of it, everyone is happy, right?
As for offering a service based on guesswork - we don't have (and will not have, until the service is at least prototyped) anything better than guessing right now. No historical data; dependency on technology, on the NMS, on the signaling mechanism, on the number of devices, on the number of domains along the path, even on the hardware running the agents and its network connection. If we can't say exactly, we approximate (or guess, if you wish) or leave the problem unresolved.
Re the question about complexity - can you imagine a protocol flow, messages and mechanisms that would guarantee a deterministic, and moreover static/predictable, activation time? This will not be just send request, receive response. You would need to guarantee that messages are always transferred with the same delays, that systems always process the messages in the same time, and that systems always respond within the same time. Even if the protocol is not more complex because of it (but it will be, due to synchronization messages, failure/timeout notifications, additional scenarios), the implementation of such agents would be horrible real-time systems engineering (they still use Ada 95 for that, don't they?). Sure, we can play with that. But let's start with something able to deliver a connection at all, and then discover how to do it more accurately.
Best regards
Radek
Hi Artur- I accept the challenge!
First, let me calm the nerves... The question of setup time - particularly the issue of taking 10 minutes or more - has mostly to do with provisioning all-optical systems, where amplification and attenuation across a mesh take a significant time. In most cases the provisioning will be a more conventional few seconds to a minute (or so), and for smaller domains with more conventional switching gear, maybe a few seconds at most. So we should all try to keep perspective here: much of this discussion has to do with insuring the protocol functions correctly, consistently, and reliably as service infrastructure, and much of it is driven by making sure it works even in the corner cases where it might take 15 minutes to provision, or where two NSAs might have clocks that differ by 10 seconds, etc. But the real-world fact is that nothing is perfect, and a globally distributed complex system such as our networks will never - practically if not theoretically - be perfectly synchronized or offer exact predictability. We are trying to provide more/better predictability, not perfect predictability.
You are right about the user's expectation that the connection will be available at the requested time. But nothing is exact. Even if we knew and could predict the setup time exactly, if something were broken in the network and we couldn't meet the committed start time, what would the user do?
Ok. Deep breath... exhale... feel better? Ok, good. Now let me defend our discussions...
To be blunt, it could be argued that any user application that blindly puts data into the pipe without getting some verification that the pipe *and* the application agent at the far end are ready has no real idea whether it is working AT ALL! If the agent at the other end is not functioning (not a network problem), this is fundamentally indistinguishable from a network connection not being available. How would the user be able to claim the network is broken? On the other hand, if there *is* out-of-band coordination going on between the local user agent and the destination user agent, then the application is trying to deal with an imperfect world in which it needs to determine and synchronize the state of the application agent on the far end before it proceeds. ---> Why would doing so with the network resource not be of equal importance?
In *general* (fuzzy logic alert) we will make the start time. Indeed, in most instances we will be ready *before* the start time. But if by chance we miss the start time by only 15 seconds, is that acceptable to you? Or to the application that just dumped 19 MBytes of data down a hole? What if it was the user application that had a slightly fast clock and started 10 seconds early? *His* clock said 1pm, mine said 12:59:50. Who is broken? The result is the same. What if the delta was 5 minutes, or 50 milliseconds? Where do we draw the line? Draw a line, and there will still be some misses...
The point here is that nothing is perfect and exact, and yet these systems function "correctly"! We need to construct a protocol that can function in the face of these minor (on a human scale) time deltas. But even seconds are not minor on the scale at which a computer agent functions, so we necessarily need to address these nuances so that it works correctly on a timescale of milliseconds and less.
In order to address the issue of [typically] slight variations in the actual start time, we are proposing that the protocol would *always* notify the originating RA when the circuit is ready - albeit after the fact, but it says deterministically "the circuit is now ready." And we are also proposing a means for the RA to determine the state if that ProvisionComplete message is not received when it was expected - whether there is a hard error or just a slow/late provisioning process still taking place.
But given the fact that we cannot *exactly* synchronize each and every agent and system around the world - and keep them that way - and that we cannot predict perfectly how long each task will take before the fact, we have to face facts: we need to be able to function correctly with these uncertainties. Without meaning to preach, the user application needs to do so too. Small is relative. (There is an old joke here about a prostitute and an old man... but I won't go into it. :-)
So we want to provide the service at the requested time. And we will make our best effort to do so. And in most cases we will succeed. But what will the application do if we miss it? What should the protocol do in an imperfect world? It truly cannot function on fuzzy logic. One approach to addressing this is to say the RA will always be notified when the connection goes into service. This is a positive sign that the connection is end-to-end.
Best regards
Jerry
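A minimal sketch of the RA-side behaviour described above, assuming hypothetical ProvisionComplete and query primitives on the PA interface (the object, method names, and state values are illustrative assumptions, not the agreed NSI message set):

    import time

    def await_in_service(pa, connection_id, start_epoch, grace=60, poll=5):
        """Wait for the ProvisionComplete notification; if it is late, query the PA
        to distinguish a slow provisioning process from a hard error."""
        while time.time() < start_epoch + grace:
            if pa.provision_complete_received(connection_id):  # hypothetical primitive
                return "in service"
            time.sleep(poll)
        state = pa.query(connection_id)                         # hypothetical state query
        if state == "provisioning":
            return "still provisioning - keep waiting"
        return "error - release the connection"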
Artur Barczyk wrote:
Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly) if I missed the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well-defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that complaining users are users quickly lost. You don't want that.
So let's consider two example users:
- high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. When this time comes, the application will just throw data onto the network; it might use a connection-less protocol or not, but either way it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless.
- video conferencing/streaming: You reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participants that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like the network being down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical: the service needs to be useful to the end-user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise, use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote:
Hi,
It's getting hard to solve everything here, so let's not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, yet we are unable to guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can't guarantee exactly when (1:50, 2:01, 2:15). Let's take a quite long time as a timeout (e.g. 20 minutes), and start booking the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it's fine enough :) ).
For #1 - it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the approach above) less than 20 minutes (typically 5-10 minutes probably, but that's my guess), or not at all. Users may or may not complain about that. In the first case we are good. In the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make the assumption that agents' clocks are synchronized with a precision of, let's say, 10 seconds, which should be more than enough. The agents will use system clocks, so they need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue - it is a deployment issue. So let's put into the specification: "The NSI protocol requires time synchronization with a precision of no worse than 10 seconds". If we discover it's insufficient, let's upgrade it for v2.
We already have some features to implement, just to see if it works fine (works at all, actually). If a user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We can't, however, deliver the service for less than the user-defined time. So we can agree (by voting, not discussing) on the fixed time values. My proposal is as above:
- 20 minutes of setup time for the reservation
- Service availability time (e.g. 13 h)
- Service tear-down time (it's not important from the user's perspective, since once any segment of the connection is removed the service is not available any more, but let's say 15 minutes)
That way, the calendar booking needs to reserve resources for 13 h 35 minutes (see the sketch below). IMHO we can agree on that by a simple vote for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started a quite theoretical discussion based on assumptions and guessing "what if", instead of focusing on delivering any service (even with limited guarantees).
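As a rough illustration of the booking arithmetic above (a sketch only - the fixed guard values are the ones proposed here as placeholders pending a vote, not agreed NSI constants):

    from datetime import datetime, timedelta

    # Illustrative guard values from the proposal above (not agreed figures).
    SETUP_GUARD = timedelta(minutes=20)     # booked before the requested start
    TEARDOWN_GUARD = timedelta(minutes=15)  # booked after the requested end

    def calendar_booking(requested_start, requested_end):
        """Return the interval to place in the reservation calendar."""
        return requested_start - SETUP_GUARD, requested_end + TEARDOWN_GUARD

    # A 13-hour service window becomes a 13 h 35 min calendar booking.
    start = datetime(2010, 10, 1, 8, 0)
    booked_start, booked_end = calendar_booking(start, start + timedelta(hours=13))
    print(booked_end - booked_start)  # 13:35:00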
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania
Network Research and Development
Poznan Supercomputing and Networking Center
radek.krzywania@man.poznan.pl
+48 61 850 25 26
http://www.man.poznan.pl
________________________________________________________________________
From: nsi-wg-bounces@ogf.org On Behalf Of Jerry Sobieski
Sent: Wednesday, September 29, 2010 9:33 PM
To: Jeff W.Boote
Cc: nsi-wg@ogf.org
Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change the requested start time as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable, because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes, or b) give up and release the connection. An SLA might influence the bad NSA not to lowball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and be able to tell via protocol operation whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but that could as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
Jeff W.Boote wrote:
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, the answer here has to be #1): each NSA must reject the request if its process to establish the requested connection cannot meet the start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all kinds of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, so option #3) is not an option.
I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across NSAs. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care, as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until their requested end time.
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
jeff
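Jeff's point about bounding the clock error rather than requiring tight synchronization could be pictured with a small check like the following (a minimal sketch; the 10-second figure is just the value mentioned in this thread, not an agreed bound):

    from datetime import datetime, timedelta, timezone

    # Assumed bound from this discussion: clocks synchronized (e.g. via NTP)
    # to within MAX_CLOCK_SKEW of each other.
    MAX_CLOCK_SKEW = timedelta(seconds=10)

    def within_skew(local_now, peer_reported_now):
        """True if a peer's reported 'current time' is consistent with the
        assumed synchronization bound; if not, its absolute timestamps
        should not be trusted for scheduling decisions."""
        return abs(local_now - peer_reported_now) <= MAX_CLOCK_SKEW

    local = datetime(2010, 9, 30, 13, 0, 0, tzinfo=timezone.utc)
    peer = datetime(2010, 9, 30, 12, 59, 53, tzinfo=timezone.utc)
    print(within_skew(local, peer))  # True: a 7 s offset is within the bound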
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote:
Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote:
Radek
I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs. optical lambdas, or the choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA, and it has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User-to-Network API and a different Network-to-Network API... but I don't think that's what we want to do (:-) We need, IMO, *not* to think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end-user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module, and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion,
a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept).
While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone... So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have on the protocol and how that will be carried through the service tree.
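To make the options concrete, here is a minimal sketch of option #1 (reject rather than shift), assuming a per-NSA configured guard time and a simple list-based reservation calendar; the names and types are illustrative, not taken from any NSI schema:

    from datetime import datetime, timedelta

    def can_accept(requested_start, requested_end, guard_before, guard_after,
                   calendar):
        """Option #1: accept only if the fully guarded interval fits without
        overlapping an existing booking; never move the Requested Start Time."""
        booked_start = requested_start - guard_before
        booked_end = requested_end + guard_after
        for existing_start, existing_end in calendar:
            if booked_start < existing_end and existing_start < booked_end:
                return False  # guard time collides with another reservation
        return True

    # Example: a prior booking ending at 13:55 forces rejection of a 14:00
    # request whose 10-minute guard time would need to start at 13:50.
    calendar = [(datetime(2010, 10, 1, 12, 0), datetime(2010, 10, 1, 13, 55))]
    print(can_accept(datetime(2010, 10, 1, 14, 0), datetime(2010, 10, 1, 15, 0),
                     timedelta(minutes=10), timedelta(minutes=5), calendar))  # False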
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a question of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client.
Ah, but the rub here is that the "user" is an RA... but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
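One way to picture the RA-side waiting policy being discussed (purely illustrative; the calls provision_complete_received, query, and release below are hypothetical stand-ins for whatever messages the protocol ends up defining, and the grace period is a local policy choice):

    import time

    GRACE_AFTER_START = 30.0  # seconds; a local RA policy, not protocol-defined

    def wait_for_connection(pa, connection_id, start_time_epoch):
        """Wait for the 'circuit ready' notification; if it has not arrived by
        the start time plus a grace period, query the PA and then decide."""
        deadline = start_time_epoch + GRACE_AFTER_START
        while time.time() < deadline:
            if pa.provision_complete_received(connection_id):  # hypothetical call
                return "in-service"
            time.sleep(1.0)
        state = pa.query(connection_id)                         # hypothetical call
        if state == "provisioning":
            return "still-provisioning"  # slow setup: keep waiting or notify
        pa.release(connection_id)                               # hypothetical call
        return "released"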
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue.
I agree: we cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's OK to identify constants that the protocol must use so that we can validate the protocol and simplify implementation and deployment. Indeed, often when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well.
Pretty much. Across the board, things like clock events, estimates, and service-specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
-- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Hi,
Setting up a circuit via the Alcatel NMS takes 2 minutes. This time is mostly consumed by the NMS finding a path through the domain and warming the room with CPU heat. A second or a minute is still a guess anyway :) I can agree to use those values (instead of 20 minutes), but according to my current experience a lot of timeouts will appear. I fully support the statement "We are trying to provide more/better predictability, not perfect predictability." This should be on the title page of the NSI design BTW :) The whole point is that everything is relative and not exact. The example of user clocks depicts it quite well (thanks Jerry for pointing that out).
Best regards
Radek
I'm probably oversimplifying, but it seems to me this problem becomes much easier with Jeff's idea about having all clocks synchronized to within no more than some number of seconds. If the clocks aren't synchronized, you run into a whole bunch of errors related to making absolute time-based reservations anyway.

The protocol mandates clock offsets of no more than X seconds. Each domain selects its own setup time of "no more than Y minutes" and a tear-down time of "Z minutes". If a user requests a reservation from time A to time B, the domain reserves time from A-X-Y through B+X+Z. When it comes to setting up the circuit, the domain starts setting it up at time A-X-Y. If the circuit isn't ready by time A-X, the domain throws a setup error and handles that error condition the way it would handle an actual error that occurred during circuit setup. The circuit remains active until time B+X, at which time the domain starts tearing it down.

If, while the circuit is running, the hosts become desynchronized, one of the domains will (from either the client's or the other domains' perspective) end the circuit earlier than expected and report the tear-down. The other domains/clients will handle that similarly to if a cancel had occurred.

Again, I may be vastly oversimplifying the problem.

Cheers,
Aaron
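A minimal sketch of the scheme Aaron describes, assuming per-domain constants X, Y, and Z; the example values below are illustrative only, not agreed figures:

    from datetime import datetime, timedelta

    def plan_reservation(a, b, x, y, z):
        """A: requested start, B: requested end, X: protocol clock-offset bound,
        Y: this domain's maximum setup time, Z: its tear-down time."""
        return {
            "calendar_start": a - x - y,           # reserve from A-X-Y ...
            "calendar_end": b + x + z,             # ... through B+X+Z
            "begin_setup_at": a - x - y,
            "setup_error_if_not_ready_by": a - x,
            "begin_teardown_at": b + x,
        }

    # Example with assumed values: X = 10 s, Y = 5 min, Z = 15 min.
    plan = plan_reservation(datetime(2010, 10, 1, 14, 0), datetime(2010, 10, 1, 17, 0),
                            timedelta(seconds=10), timedelta(minutes=5),
                            timedelta(minutes=15))
    print(plan["setup_error_if_not_ready_by"])  # 2010-10-01 13:59:50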
Hi, Setting up a circuit via Alcatel NMS takes 2 minutes. This time is mostly consumed by NMS to find a path through domain and warm the room with CPU heat. A seconds or minute is still a guess anyway :) I can agree to use those values (instead of 20 minutes) but according to my current experience – lot of timeouts will appear. I fully support the statement of “We are trying to provide more/better predictability, not perfect predictability.” This should be on the title page of NSI design BTW :) All the case is about everything is relevant and not exact. The example of user clocks depicts it quite well (thanks Jerry for pointing that).
Best regards Radek
________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
From: Jerry Sobieski [mailto:jerry@nordu.net] Sent: Thursday, September 30, 2010 6:54 PM To: Artur Barczyk Cc: radek.krzywania@man.poznan.pl; 'Jeff W.Boote'; nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Hi Artur- I accept the challenge!
First, let me calm the nerves... The questionof setup time - particularly the issue of taking 10 minutes or more has mostly to do with provisioning all optical systems where amplification and attenuation across a mesh takes a significant time. In most cases, the provisioning will be a more conventional few seconds to a minute (or so). And for smaller domains with more conventional switcing gear, maybe a few seconds at most.
So we should all try to keep perspective here that much of this discussion has to do with insuring the protocol functions correctly, consistently, and reliably as service infrastructure. And much of this is driven by making sure it works such even in the corner cases where it might take 15 minutes to provision, or two NSAs might have clocks that differ by 10 seconds. etc.
But the real world facts are that nothing is perfect, and a globally distributed complex system such as our networks are never practically if not theoretically going to be perfectly synchronized or offer exect predictability. We ar trying to provide more/better predictabiity, not perfect predictability.
You are right about the users expectations that the connection will be available at the requested time. But nothing is exact. Even if we knew and could predict exactly the setup time, if something was broken in the network and we couldn't meet the committed start time, what would the user do?
Ok. deep breath....exhale.... feel better? Ok good. Now let me defend our discussions...
To be blunt, it could be argued that any user application that blindly puts data into the pipe without getting some verification that the pipe *AND* the application agent at the far end is ready has no real idea if it is working AT ALL! If the agent at the other end is not functioning (not a network problem), this is fundamentally indistinguishable from a network connection not being available. How would the user be able to claim the network is broken?
On the other hand, if there *is* out of band coordination going on between the local user agent and the destination user agent, then the application is trying to deal with an imperfect world in which it needs to determine and synchronze the state of the application agent on the far end before it proceeds. ---> Why would doing so with the network resource not be of equal importance?
In *general* (Fuzzy logic alert) we will make the start time. Indeed, in most instances we will be ready *before* the start time. But if by chance we miss the start time by only 15 seconds, is that acceptable to you? Or to the application that just dumped 19 MBytes of data down a hole?
What if was the user application that had a slightly fast clock and started 10 seconds early? *His* clock said 1pm, mine said 12:59:50. Who is broken? The result is the same. What if the delta was 5 minutes, or 50 milliseconds? Where do we draw the line? Draw a line, and there will still be some misses...
The point here is that nothing is perfect and exact. And yet these systems function "correctly"! We need to construct a protocol that can function in the face of these minor (on a human scale) time deltas. But even seconds are not minor on the scale that a computer agent functions. So we necessarilly need to address these nuances so that it works correctly on a timescale of milliseconds and less.
In order to address the issue of [typically] slight variations of actual start time, we are proposing that the protocol would *always* notify the originating RA when the circuit is ready, albeit after the fact, but it says determinitistically "the circuit is now ready." And we are also proposing a means for the RA to determine the state if that ProvisionComplete message is not received when it was expected - if there is a hard error or just a slow/late provisioning process still taking place.
But given the fact that we cannot *exactly* synchronize each and every agent and system around the world- and keep them that way, and that we cannot predict perfectly how long each task will take before the fact, we have to face facts that we need to be able to function correctly with these uncertainties. Without meaning to preach, the user application needs to do so too.
Small is relative. (there is an old joke here about a prositute and an old man...but I won't go into it.:-)
Best regards Jerry
So we want to provide the service at the request time. And we will make our best effort to do so. And in most cases we will succeed. But what will the application do if we miss it? What should the protocol do in an imperfect world? It truly cannot function on fuzzy logic.
One approach to addressing this is to say the RA will always be notified when the connection goes into service. This is a positive sign that the connection is end-to-end.
Artur Barczyk wrote: Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly), if I miss the first 15 minutes of todays HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users complaining are users quite quickly lost. You don't want that.
So let's consider two example users: - high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. This time comes, the application will just throw data on the network, it might use connection-less protocol, or not, but it will result in an error. It cannot wait "around" 15 minutes, as it will bring the transfer schedule in complete disorder. Such a "service" is just useless. - video conferencing/streaming. You reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participant that the network prevented the conference to start for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like network down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as argument while the underlying protocols are not yet correctly specified. This is not theoretical, the service needs to be useful to the end-user, if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote: Hi, It’s getting hard to solve everything here, so let’s don’t try to solve everything here at once. So how about defining a start time as a best effort for v1? So we promise to deliver the service, yet we are unable to guarantee the exact start time in precision of seconds. If user want connection to be available at 2pm, it will be around that time, but we can’t guarantee when exactly (1:50, 2:01, 2:15). Let’s take a quite long time as a timeout (e.g. 20 minutes), and start booking the circuit in 5 or 10 minutes in advance (no discussion for v1, just best feeling guess) . The result will be that in most cases we will deliver the service at AROUND specified time. For v1 is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it’s fine enough :) ). For #1 – it may a problem for instant reservations. Here user want a circuit ASAP. We define ASAP as (see above approach) less than 20 minutes (typically 5-10 minutes probably, but that’s my guess), or not at all. Users may or may not complain on that. In the first case we are good. For the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make an assumption that agents times are synchronized with precision of let say 10 seconds, which should be far enough. The agents will use system clocks, so they need to be synchronized at the end (NTP or whatever), but that not even implementation but deployment issue. So let put into specification: “NSI protocol requires time synchronization with precision not less than 10seconds”. If we discover it’s insufficient, let’s upgrade it for v2.
We already have some features to implement, just to see if it works fine (works at all, actually). If user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after start time (user IS aware of that as we specify this in the protocol description). We can’t however deliver the service shorter than user defined time. So we can agree (by voting, not discussing) the fixed time values. My proposal is as above: 20 minutes for reservation as set up time Service availability time (e.g. 13 h) Service tear down time (it’s not important from user perspective, as since any segment of connection is removed, the service is not available any more, but let’s say 15 minutes) In that way, calendar booking needs to have reserve resources for 13h 35 minutes. IMHO we can agree on that by simply vote for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started quite theoretical discussion based on assumptions and guessing “what if”, instead of focusing on delivering any service (event with limited guarantee).
Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of Jerry Sobieski Sent: Wednesday, September 29, 2010 9:33 PM To: Jeff W.Boote Cc: nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change requested start time as the request goes down the tree. But you can't prevent #3 if thats what an NSA somewhere down the tree decides to do. The result would be a promise he may not be able to keep - but thats acceptable because the Estimated Start Time is just an estimate, its not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3 since its totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA to not low ball his provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not a protocol or a standards issue.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but any heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, what ever components are "required" or "must" items, must be verifiable in a conformance test. I.e. if someone comes up with an NSI imlementation, we should be able to put the reference implementation against the test implementation and we should be able to tell via protocol operation if the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could have as easily just been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
Jeff W.Boote wrote:
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, I think the answer here has to be # 1) each NSA must reject the request if their process to establish the connection requested can not meet the Start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so # 2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual guard time of an NSA is also nonnegotiable, so option # 3) is not an option.
I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across each NSA. The guard time from the perspective of the user will effectively be the maximum of each NSAa guard time in the chain. But, the user doesn't care as long as provisioning is accomplished by the users requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And, it shouldn't matter how long it takes to tear down the circuit either as long as the circuit is available until their requested end time.
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
jeff
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote: Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote: Radek
I agree with your statements; User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs optical lambdas, and concern itself with choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA and and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API, and a different Network to Network API...but I don't think thats what we want to do (:-) We need to IMO *not* think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end user API that simplifies the the application's required interactions ?? This would allow an application to embed an RA in a runtime library/module and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept). While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone... So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have on the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end, or a teardown message initiated by the client. Ah, but the rub here is that the "user" is an RA...but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue. I agree: we cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's OK to identify constants that the protocol must use so that we can validate the protocol and simplify implementation and deployment. Indeed, often when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end time as well. Pretty much. Across the board, things like clock events, estimates, and service-specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
Seconding this. On Sep 30, 2010, at 1:54 PM, Aaron Brown wrote:
I'm probably oversimplifying, but it seems to me this problem becomes much easier with Jeff's idea of having all clocks synchronized to within no more than some number of seconds. If the clocks aren't synchronized, you run into a whole bunch of errors related to making absolute time-based reservations anyway.
The protocol mandates clock offsets of no more than X seconds. Each domain selects its own setup time of "no more than Y minutes" and a tear-down time of "Z minutes". If a user requests a reservation from time A to time B, the domain reserves time from A-X-Y through B+X+Z. When it comes to setting up the circuit, the domain starts setting it up at time A-X-Y. If the circuit isn't ready by time A-X, the domain throws a setup error and handles that error condition the way it'd handle an actual error that occurred during circuit setup. The circuit remains active until time B+X, at which time the domain starts tearing it down. If, while the circuit is running, the hosts become desynchronized, one of the domains will (from either the client's or the other domains' perspectives) end the circuit earlier than expected and report the tear-down. The other domains/clients will handle that similarly to how they would handle a cancel.
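A worked sketch of that arithmetic (the X, Y, and Z values below are placeholders for the example; the real numbers would be per-domain choices):

from datetime import datetime, timedelta

X = timedelta(seconds=10)   # maximum permitted clock offset
Y = timedelta(minutes=5)    # this domain's setup time ("no more than Y")
Z = timedelta(minutes=2)    # this domain's tear-down time

def reservation_window(a, b):
    """The times a domain would act on for a user request running from A to B."""
    return {
        "reserve_from":      a - X - Y,  # calendar booking begins
        "start_setup_at":    a - X - Y,  # start provisioning
        "setup_deadline":    a - X,      # not ready by now -> raise a setup error
        "keep_active_until": b + X,      # circuit stays up at least this long
        "reserve_until":     b + X + Z,  # calendar booking ends after tear-down
    }

window = reservation_window(datetime(2010, 10, 1, 17, 0), datetime(2010, 10, 1, 19, 0))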
Again, I may be vastly oversimplifying the problem.
Cheers, Aaron
On Sep 30, 2010, at 1:31 PM, Radek Krzywania wrote:
Hi, Setting up a circuit via the Alcatel NMS takes 2 minutes. This time is mostly consumed by the NMS finding a path through the domain and warming the room with CPU heat. A second or a minute is still a guess anyway :) I can agree to use those values (instead of 20 minutes) but, according to my current experience, a lot of timeouts will appear. I fully support the statement “We are trying to provide more/better predictability, not perfect predictability.” This should be on the title page of the NSI design BTW :) The whole point is that everything is relative, not exact. The example of user clocks illustrates it quite well (thanks Jerry for pointing that out).
Best regards Radek
________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
From: Jerry Sobieski [mailto:jerry@nordu.net] Sent: Thursday, September 30, 2010 6:54 PM To: Artur Barczyk Cc: radek.krzywania@man.poznan.pl; 'Jeff W.Boote'; nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Hi Artur- I accept the challenge!
First, let me calm the nerves... The question of setup time - particularly the issue of it taking 10 minutes or more - has mostly to do with provisioning all-optical systems, where amplification and attenuation across a mesh take a significant time. In most cases, the provisioning will be a more conventional few seconds to a minute (or so). And for smaller domains with more conventional switching gear, maybe a few seconds at most.
So we should all try to keep perspective here: much of this discussion has to do with ensuring the protocol functions correctly, consistently, and reliably as service infrastructure. And much of this is driven by making sure it works even in the corner cases where it might take 15 minutes to provision, or two NSAs might have clocks that differ by 10 seconds, etc.
But the real-world fact is that nothing is perfect, and a globally distributed complex system such as our networks is never - practically, if not theoretically - going to be perfectly synchronized or to offer exact predictability. We are trying to provide more/better predictability, not perfect predictability.
You are right about the user's expectation that the connection will be available at the requested time. But nothing is exact. Even if we knew and could predict the setup time exactly, if something was broken in the network and we couldn't meet the committed start time, what would the user do?
Ok. deep breath....exhale.... feel better? Ok good. Now let me defend our discussions...
To be blunt, it could be argued that any user application that blindly puts data into the pipe without getting some verification that the pipe *AND* the application agent at the far end is ready has no real idea if it is working AT ALL! If the agent at the other end is not functioning (not a network problem), this is fundamentally indistinguishable from a network connection not being available. How would the user be able to claim the network is broken?
On the other hand, if there *is* out-of-band coordination going on between the local user agent and the destination user agent, then the application is trying to deal with an imperfect world in which it needs to determine and synchronize the state of the application agent on the far end before it proceeds. ---> Why would doing so with the network resource not be of equal importance?
In *general* (Fuzzy logic alert) we will make the start time. Indeed, in most instances we will be ready *before* the start time. But if by chance we miss the start time by only 15 seconds, is that acceptable to you? Or to the application that just dumped 19 MBytes of data down a hole?
What if it was the user application that had a slightly fast clock and started 10 seconds early? *His* clock said 1pm, mine said 12:59:50. Who is broken? The result is the same. What if the delta was 5 minutes, or 50 milliseconds? Where do we draw the line? Draw a line, and there will still be some misses...
The point here is that nothing is perfect and exact. And yet these systems function "correctly"! We need to construct a protocol that can function in the face of these minor (on a human scale) time deltas. But even seconds are not minor on the scale at which a computer agent functions. So we necessarily need to address these nuances so that it works correctly on a timescale of milliseconds and less.
In order to address the issue of [typically] slight variations in the actual start time, we are proposing that the protocol would *always* notify the originating RA when the circuit is ready - albeit after the fact, but it says deterministically "the circuit is now ready." And we are also proposing a means for the RA to determine the state if that ProvisionComplete message is not received when it was expected - whether there is a hard error or just a slow/late provisioning process still taking place.
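As a rough sketch of what that would look like from the RA's side (the message handling, the query fallback, and the patience value here are illustrative assumptions on my part, not protocol text):

import queue
from datetime import datetime, timedelta

PATIENCE = timedelta(seconds=30)  # how long this RA chooses to wait past the estimated start

def await_circuit(messages, estimated_start, query_provider):
    """Wait for a ProvisionComplete notification; fall back to querying the PA."""
    deadline = estimated_start + PATIENCE
    while True:
        remaining = (deadline - datetime.utcnow()).total_seconds()
        try:
            msg = messages.get(timeout=max(remaining, 0))  # messages: a queue.Queue of strings
        except queue.Empty:
            # No notification by the deadline: ask the provider whether this is a hard
            # error or just a slow/late provisioning process still taking place.
            return query_provider()
        if msg == "ProvisionComplete":
            return "circuit ready"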
But given the fact that we cannot *exactly* synchronize each and every agent and system around the world - and keep them that way - and that we cannot predict perfectly how long each task will take before the fact, we have to face the fact that we need to be able to function correctly with these uncertainties. Without meaning to preach, the user application needs to do so too.
Small is relative. (There is an old joke here about a prostitute and an old man...but I won't go into it. :-)
Best regards Jerry
So we want to provide the service at the requested time. And we will make our best effort to do so. And in most cases we will succeed. But what will the application do if we miss it? What should the protocol do in an imperfect world? It truly cannot function on fuzzy logic.
One approach to addressing this is to say the RA will always be notified when the connection goes into service. This is a positive sign that the connection is end-to-end.
Artur Barczyk wrote: Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly) if I miss the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well-defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users who complain are users you lose quite quickly. You don't want that.
So let's consider two example users:
- High-volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. When this time comes, the application will just throw data onto the network - it might use a connection-less protocol, or not, but either way it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless.
- Video conferencing/streaming: you reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participants that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanisms in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like the network being down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical: the service needs to be useful to the end user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote: Hi, It’s getting hard to solve everything here, so let’s not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, yet we are unable to guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can’t guarantee exactly when (1:50, 2:01, 2:15). Let’s take a quite long timeout (e.g. 20 minutes), and start booking the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it’s fine enough :) ). For #1 – it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the approach above) less than 20 minutes (typically 5-10 minutes probably, but that’s my guess), or not at all. Users may or may not complain about that. In the first case we are good. For the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make the assumption that agents' clocks are synchronized with a precision of, let's say, 10 seconds, which should be more than enough. The agents will use system clocks, so they need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue, it's a deployment issue. So let's put into the specification: “The NSI protocol requires time synchronization with a precision of no worse than 10 seconds”. If we discover it’s insufficient, let’s upgrade it for v2.
We already have some features to implement, just to see if it works fine (works at all, actually). If a user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We can’t, however, deliver the service for less than the user-defined time. So we can agree (by voting, not discussing) on fixed time values. My proposal is as above:
- 20 minutes of set-up time for the reservation
- service availability time (e.g. 13 h)
- service tear-down time (it’s not important from the user's perspective, since once any segment of the connection is removed the service is no longer available, but let’s say 15 minutes)
In that way, the calendar booking needs to reserve resources for 13 h 35 minutes. IMHO we can agree on that simply by voting for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started a quite theoretical discussion based on assumptions and guessing “what if”, instead of focusing on delivering any service (even with limited guarantees).
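A toy illustration of that booking arithmetic, using exactly the fixed values proposed above:

from datetime import timedelta

SETUP_GUARD    = timedelta(minutes=20)  # proposed fixed set-up time for v1
TEARDOWN_GUARD = timedelta(minutes=15)  # proposed fixed tear-down time for v1

def calendar_booking(available):
    """Resource time to book for a requested availability window."""
    return SETUP_GUARD + available + TEARDOWN_GUARD

print(calendar_booking(timedelta(hours=13)))  # -> 13:35:00, i.e. 13 h 35 min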
Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of Jerry Sobieski Sent: Wednesday, September 29, 2010 9:33 PM To: Jeff W.Boote Cc: nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still just a Requested Start Time, but NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise he may not be able to keep - but that's acceptable because the Estimated Start Time is just an estimate, it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to lowball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and be able to tell, via protocol operation, whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could just as easily have been a poor judgment call on the provisioning time - which is OK.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
++ jeff On Sep 30, 2010, at 12:02 PM, Evangelos Chaniotakis wrote:
Seconding this.
Very good, support. Cheers, Artur On 09/30/2010 07:54 PM, Aaron Brown wrote:
I'm probably oversimplifying, but it seems to me this problem becomes much easier with Jeff's idea about having all clocks synchronized within a period of no more than some number seconds. If the clocks aren't synchronized, you run into a whole bunch of errors related to making absolute time-based reservations anyway.
The protocol mandates clock offsets of no more than X seconds. Each domain selects its own setup time of "no more than Y minutes" and a tear down time of "Z minutes". If a user requests a reservation from time A to time B, the domain reserves time from A-X-Y through B+X+Z. When it comes to setting up the circuit, the domain starts setting it up at time A-X-Y. If the circuit isn't ready by time A-X, the domain throws a setup error and handles that error condition the way it'd handle an actual error occurred during circuit setup. The circuit remains active until time B+X, at which time the domain starts tearing it down. If, while the circuit is running, the hosts become desychronized, one of the domains will (from the either the clients or other domains' perspectives) end the circuit earlier than expected and report the tear down. The other domains/clients will handle that similar to if a cancel had occurred.
Again, I may be vastly oversimplifying the problem.
Cheers, Aaron
On Sep 30, 2010, at 1:31 PM, Radek Krzywania wrote:
Hi, Setting up a circuit via Alcatel NMS takes 2 minutes. This time is mostly consumed by NMS to find a path through domain and warm the room with CPU heat. A seconds or minute is still a guess anyway :) I can agree to use those values (instead of 20 minutes) but according to my current experience – lot of timeouts will appear. I fully support the statement of “We are trying to provide more/better predictability, not perfect predictability.” This should be on the title page of NSI design BTW :) All the case is about everything is relevant and not exact. The example of user clocks depicts it quite well (thanks Jerry for pointing that). Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl> Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________ *From:* Jerry Sobieski [mailto:jerry@nordu.net] *Sent:* Thursday, September 30, 2010 6:54 PM *To:* Artur Barczyk *Cc:* radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl>; 'Jeff W.Boote'; nsi-wg@ogf.org <mailto:nsi-wg@ogf.org> *Subject:* Re: [Nsi-wg] time issue Hi Artur- I accept the challenge!
First, let me calm the nerves... The questionof setup time - particularly the issue of taking 10 minutes or more has mostly to do with provisioning all optical systems where amplification and attenuation across a mesh takes a significant time. In most cases, the provisioning will be a more conventional few seconds to a minute (or so). And for smaller domains with more conventional switcing gear, maybe a few seconds at most.
So we should all try to keep perspective here that much of this discussion has to do with insuring the protocol functions correctly, consistently, and reliably as service infrastructure. And much of this is driven by making sure it works such even in the corner cases where it might take 15 minutes to provision, or two NSAs might have clocks that differ by 10 seconds. etc.
But the real world facts are that nothing is perfect, and a globally distributed complex system such as our networks are never practically if not theoretically going to be perfectly synchronized or offer exect predictability. We ar trying to provide more/better predictabiity, not perfect predictability.
You are right about the users expectations that the connection will be available at the requested time. But nothing is exact. Even if we knew and could predict exactly the setup time, if something was broken in the network and we couldn't meet the committed start time, what would the user do?
Ok. deep breath....exhale.... feel better? Ok good. Now let me defend our discussions...
To be blunt, it could be argued that any user application that blindly puts data into the pipe without getting some verification that the pipe *AND* the application agent at the far end is ready has no real idea if it is working AT ALL! If the agent at the other end is not functioning (not a network problem), this is fundamentally indistinguishable from a network connection not being available. How would the user be able to claim the network is broken?
On the other hand, if there *is* out of band coordination going on between the local user agent and the destination user agent, then the application is trying to deal with an imperfect world in which it needs to determine and synchronze the state of the application agent on the far end before it proceeds. ---> Why would doing so with the network resource not be of equal importance?
In *general* (Fuzzy logic alert) we will make the start time. Indeed, in most instances we will be ready *before* the start time. But if by chance we miss the start time by only 15 seconds, is that acceptable to you? Or to the application that just dumped 19 MBytes of data down a hole?
What if it was the user application that had a slightly fast clock and started 10 seconds early? *His* clock said 1pm, mine said 12:59:50. Who is broken? The result is the same. What if the delta was 5 minutes, or 50 milliseconds? Where do we draw the line? Draw a line, and there will still be some misses...
The point here is that nothing is perfect and exact. And yet these systems function "correctly"! We need to construct a protocol that can function in the face of these minor (on a human scale) time deltas. But even seconds are not minor on the scale at which a computer agent functions. So we necessarily need to address these nuances so that it works correctly on a timescale of milliseconds and less.
In order to address the issue of [typically] slight variations in actual start time, we are proposing that the protocol would *always* notify the originating RA when the circuit is ready - albeit after the fact - but it says deterministically "the circuit is now ready." And we are also proposing a means for the RA to determine the state if that ProvisionComplete message is not received when it was expected - whether there is a hard error or just a slow/late provisioning process still taking place.
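To make the behavior proposed above concrete, here is a minimal RA-side sketch in Python. The ProvisionComplete name comes from the proposal above; the grace period, the poll_notification/query_state calls, and the state strings are illustrative assumptions, not agreed NSI messages.

    import time

    PROVISIONING, IN_SERVICE, FAILED = "provisioning", "in_service", "failed"

    def await_provision_complete(pa, connection_id, start_time, grace_seconds=30):
        """Wait for the PA's ProvisionComplete; if it has not arrived by
        start_time + grace_seconds (start_time in epoch seconds), query the
        PA to distinguish a slow provisioning process from a hard error.
        The pa.* calls are assumed interfaces for illustration only."""
        deadline = start_time + grace_seconds
        while time.time() < deadline:
            msg = pa.poll_notification(connection_id, timeout=1.0)
            if msg == "ProvisionComplete":
                return IN_SERVICE
        # Deadline passed without a notification: query rather than guess.
        state = pa.query_state(connection_id)
        if state == PROVISIONING:
            return PROVISIONING   # still in progress; RA may keep waiting or release
        return FAILED             # hard error; RA can release the reservation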
But given the fact that we cannot *exactly* synchronize each and every agent and system around the world- and keep them that way, and that we cannot predict perfectly how long each task will take before the fact, we have to face facts that we need to be able to function correctly with these uncertainties. Without meaning to preach, the user application needs to do so too.
Small is relative. (There is an old joke here about a prostitute and an old man... but I won't go into it. :-)
Best regards Jerry
So we want to provide the service at the requested time. And we will make our best effort to do so. And in most cases we will succeed. But what will the application do if we miss it? What should the protocol do in an imperfect world? It truly cannot function on fuzzy logic.
One approach to addressing this is to say the RA will always be notified when the connection goes into service. This is a positive indication that the connection is up end-to-end.
Artur Barczyk wrote: Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly) if I missed the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users who complain are users quite quickly lost. You don't want that.
So let's consider two example users:
- high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. When this time comes, the application will just throw data onto the network; it might use a connectionless protocol or not, but either way it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless.
- video conferencing/streaming: you reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participants that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like network down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical; the service needs to be useful to the end-user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote: Hi, It's getting hard to solve everything here, so let's not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, yet we are unable to guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can't guarantee exactly when (1:50, 2:01, 2:15). Let's take a quite long timeout (e.g. 20 minutes), and start setting up the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it's fine as is :) ). For #1 - it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the above approach) less than 20 minutes (typically 5-10 minutes probably, but that's my guess), or not at all. Users may or may not complain about that. In the first case we are good. For the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make the assumption that agents' clocks are synchronized with a precision of, let's say, 10 seconds, which should be more than enough. The agents will use system clocks, so these need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue, it's a deployment issue. So let's put into the specification: "The NSI protocol requires time synchronization with a precision of no worse than 10 seconds". If we discover it's insufficient, let's upgrade it for v2.
We already have some features to implement, just to see if it works fine (works at all, actually). If the user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We cannot, however, deliver the service for less than the user-defined time. So we can agree (by voting, not discussing) on fixed time values. My proposal is as above:
- 20 minutes of set-up time for the reservation
- service availability time (e.g. 13 h)
- service tear-down time (it's not important from the user's perspective, since once any segment of the connection is removed the service is not available any more, but let's say 15 minutes)
In that way, the calendar booking needs to reserve resources for 13h 35min. IMHO we can agree on that by a simple vote for v1 (Doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started a quite theoretical discussion based on assumptions and guessing "what if", instead of focusing on delivering any service (even with limited guarantees).
Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl> Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
*From:* nsi-wg-bounces@ogf.org <mailto:nsi-wg-bounces@ogf.org> [mailto:nsi-wg-bounces@ogf.org] *On Behalf Of *Jerry Sobieski *Sent:* Wednesday, September 29, 2010 9:33 PM *To:* Jeff W.Boote *Cc:* nsi-wg@ogf.org <mailto:nsi-wg@ogf.org> *Subject:* Re: [Nsi-wg] time issue Ok. I can buy this approach of #1.
The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, and NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable, because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to lowball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and tell, via protocol operation, whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could just as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
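For illustration only, here is a sketch of what the option #1 behavior could look like inside an NSA: the Requested Start Time is left untouched, and the request is rejected if the NSA's private guard time cannot be prepended - either because that lead time has already passed or because the resulting resource window collides with an existing reservation. The guard/teardown values and the calendar interface (is_free, reserve) are assumptions; as noted above, the standard can only recommend this behavior, not mandate it.

    from datetime import datetime, timedelta

    def admit_request(requested_start, requested_end, guard_time, teardown_time,
                      calendar, now=None):
        """Option #1 admission check: never move the Requested Start Time;
        reject if the private guard time cannot be honored. 'calendar' is an
        assumed interface exposing is_free(a, b) and reserve(a, b)."""
        now = now or datetime.utcnow()
        resource_start = requested_start - guard_time     # provisioning lead time
        resource_end = requested_end + teardown_time      # teardown tail time
        if resource_start < now:
            return False   # not enough lead time left to prepend our guard time
        if not calendar.is_free(resource_start, resource_end):
            return False   # guard/teardown window collides with another reservation
        calendar.reserve(resource_start, resource_end)
        return True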
Jeff W.Boote wrote: On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, I think the answer here has to be #1) each NSA must reject the request if its process to establish the requested connection cannot meet the Start Time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, and option #3) is not an option either. I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol, and guard times are only private functions of each individual NSA. I agree with this. The guard times are not additive across NSAs. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care, as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until the requested end time. As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error. jeff
Kind regards, Gigi
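A small illustration of the point above that guard times are not additive when each domain provisions its own segment in parallel: the lead time the user effectively sees is the largest guard time in the chain, not the sum. The values are invented.

    # Per-domain guard times in seconds (invented values).
    guard_times = {"domainA": 30, "domainB": 120, "domainC": 45}

    # If each domain starts provisioning its segment at (start - its own guard),
    # the user-visible lead time is the maximum guard time, not the sum.
    effective_lead = max(guard_times.values())   # 120 s, not 195 s
    print(effective_lead)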
On 9/29/10 8:45 AM, Jerry Sobieski wrote: Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote: Radek I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs optical lambdas, or the choices an NSA/NRM will make in path-finding. The protocol designers can keep the user in mind, but /the protocol is between the RA and the PA/ and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User-to-Network API and a different Network-to-Network API... but I don't think that's what we want to do (:-) We need IMO to think *not* about the user, but about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end-user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module, and the application itself would only have to deal with the basic connection requirements.... just a thought.
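A hedged sketch of what that "RA embedded in a runtime library" idea might look like from the application's side. The class and method names (reserve, provision, wait_in_service, release) are hypothetical conveniences; the actual RA-PA protocol underneath is what the group is standardizing.

    from datetime import datetime

    class SimpleConnectionClient:
        """Hypothetical application-facing wrapper around an embedded RA.
        Illustrative only; none of these names are defined by NSI."""

        def __init__(self, requester_agent):
            self.ra = requester_agent   # an embedded RA implementation (assumed)

        def connect(self, src, dst, bandwidth_mbps, start: datetime, end: datetime):
            # The RA handles the tree/chain of PAs underneath this one call.
            reservation = self.ra.reserve(src, dst, bandwidth_mbps, start, end)
            # Request automatic provisioning and block until the RA reports the
            # end-to-end circuit in service (or raises on a hard error).
            self.ra.provision(reservation, auto=True)
            self.ra.wait_in_service(reservation)
            return reservation

        def release(self, reservation):
            self.ra.release(reservation)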
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept). While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone...So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have to the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a question of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client. Ah, but the rub here is that the "user" is an RA... but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue. I agree: We cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's ok to identify constants that the protocol must use, so that we can validate the protocol and simplify implementation and deployment. Indeed, oftentimes clocks that are only slightly skewed introduce race conditions that become more likely to occur and require more careful consideration.
d. Similar semantics apply to the end time as well. Pretty much. Across the board, things like clock events, estimates, and service-specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
-- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Aaron- IMO, there are two fundamental issues we are dancing around:
1) How do we deal with /estimated/ durations of certain tasks - specifically when these tasks must complete before a hard deadline?
2) How do we - given a finite but non-trivial set of service primitives, protocol states, timers, alarms, and messages, all occurring across independent agents arrayed in different relationships - ensure that the service protocol functions reliably, consistently, and predictably under all conditions?
This "timing" is really about synchronizing *states* among the NSAs when transitions are driven by asynchronous or only quasi-synchronized calendars, i.e. ensuring that the possible timing of these events does not break the state machine.
So Time-of-Day clock synchronization is important - especially and particularly for the scheduling of resources - and I do actually believe NTP will serve this purpose adequately in the background. However, even if all clocks were synchronized to nanoseconds, there are still propagation delays in processing the states, messages, different-speed servers, etc. that cause the timing of events to vary. These differences result in messages arriving at only approximately expected points in the protocol. From the perspective of the NSA state machine, a propagation delay with zero clock skew is indistinguishable from clock skew with zero propagation delay. And as long as we already have clock skew, we often blur the two as "timing" issues. This probably introduces confusion, but there it is. As long as the protocol is simply message driven, these events remain sequential and easily understood. But when clock events and timers are introduced, these events can occur at any point in the state machine and ripple through the service tree - sometimes colliding with processing from related asynchronous events in other NSAs. We have a much more complex state analysis to consider, and these "timing" issues need to be handled no matter how small they are.
This being said, what we seem to be hammering out is a realistic expectation of what it means to commit to a hard Start Time given certain soft dependencies. We seem to be coming to the realization that the committed Start Time is not actually a certainty. So we need to decide what, if anything, the protocol can or should do to better assure the user that the circuit will be there when we promised, or to more gracefully recover from a missed deadline.
One last comment :-) If we require all NSAs to synchronize to within some deltaT, then we need to introduce a mechanism to verify this. And then to do so again every once in a while. If we are in fact dependent upon a certain granularity of sync in the day clock, then we need to ensure that it is so. If we are not really dependent on it, then we should a) not require it, and b) make sure the protocol works regardless of the clock skew. I do not think we are really dependent on a synchronized time-of-day clock among NSAs.
Hope this helps Jerry
Aaron Brown wrote:
I'm probably oversimplifying, but it seems to me this problem becomes much easier with Jeff's idea about having all clocks synchronized to within no more than some number of seconds. If the clocks aren't synchronized, you run into a whole bunch of errors related to making absolute time-based reservations anyway.
The protocol mandates clock offsets of no more than X seconds. Each domain selects its own setup time of "no more than Y minutes" and a tear-down time of "Z minutes". If a user requests a reservation from time A to time B, the domain reserves time from A-X-Y through B+X+Z. When it comes to setting up the circuit, the domain starts setting it up at time A-X-Y. If the circuit isn't ready by time A-X, the domain throws a setup error and handles that error condition the way it'd handle an actual error that occurred during circuit setup. The circuit remains active until time B+X, at which time the domain starts tearing it down. If, while the circuit is running, the hosts become desynchronized, one of the domains will (from either the client's or the other domains' perspectives) end the circuit earlier than expected and report the tear-down. The other domains/clients will handle that much as if a cancel had occurred.
Again, I may be vastly oversimplifying the problem.
Cheers, Aaron
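For concreteness, here is the window arithmetic Aaron describes, written out as a small sketch: X is the mandated maximum clock offset, Y this domain's setup time, Z its teardown time. The sample values are invented.

    from datetime import datetime, timedelta

    X = timedelta(seconds=10)   # assumed maximum allowed clock offset
    Y = timedelta(minutes=5)    # this domain's setup time ("no more than Y")
    Z = timedelta(minutes=2)    # this domain's teardown time

    def resource_window(a, b):
        """User asks for service from A to B; the domain books A-X-Y to B+X+Z."""
        reserve_from = a - X - Y     # start provisioning here
        setup_deadline = a - X       # if not ready by A-X, raise a setup error
        teardown_start = b + X       # keep the circuit active until B+X
        reserve_until = b + X + Z
        return reserve_from, setup_deadline, teardown_start, reserve_until

    # Example: a 13:00-14:00 reservation books resources from 12:54:50 to 14:02:10.
    print(resource_window(datetime(2010, 10, 1, 13, 0), datetime(2010, 10, 1, 14, 0)))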
Hi Jerry, yes, it seems there is a mix-up of issues, I think :-) But they are related. I was under the impression the problem was the different and nondeterministic provisioning times. But those are local to the NSA and its domain, so when you talk about clock events and timers, these should not be subject to propagation delays? (There might be some processing jitter, but unless the server is under very heavy load, this should also be within bounds.) What is subject to them is the state-change notifications between NSAs, but I am not sure how critical that is in terms of server synchronisation. So I still think that what Aaron and Jeff described is a good approach, if the NSA has a reasonable idea of the provisioning time. What I mean by that is within a margin of some seconds to a minute, which should accommodate any propagation delays within the domain itself. Cheers, Artur On 09/30/2010 11:04 PM, Jerry Sobieski wrote:
Aaron-
IMO, There are two fundamental issues we are dancing around: 1) How do we deal with /estimated/ duration of certain tasks - specifically when these tasks must complete before a hard deadline? 2) How do we, given a finite but non-trivial set of service primitives, protocol states, timers, and alarms, and messages all ocuring across independent agents arrayed in different relationships, insure that the service protocol functions reliably, consistently, and predictably under all conditions This "timing" is really about synchronizing *states* among the NSAs when transitions are driven by asynchronous or only quasi-synchronized calendars, i.e. insuring that the possible timing of these events do not break the state machine.
So Time-of-Day clock synchronization is important - especially and particularly for the scheduling of resources and I do actually believe NTP will serve this ppurpose adequately in the background. However, even if all clocks were synchronized to nanoseconds, there are still propagation delays in processing the states, messages, differnet speed servers, etc that cause the timing of events to vary. These differences result in messages arriving at only approximately expected points in the protocol.
From the perspective of the NSA state machine, a propagation delay with zero clock skew is indistinguishable from clock skew with zero propagation delay. And as long as we already have clock skew, we often blur the two as "timing"issues. THis probably introduces confusion but there it is. As long as the protocol is simply message driven, these events remain sequencial and easily undersood. But when clock events and timers are introduced, these events can occur at any point in the state machine and ripple through the service tree - sometimes colliding with processing from related asynchronous events in other NSAs. We have a much more complex state analysis to consider, and these "timing" issues need to be handled no matter how small they are.
This being said, what we seem to be hammering out is a realistic expectation of what it means to commit to a hard Start Time given certain soft dependencies. We seem to be coming to the realization the the committed Start Time is not actually a certainty. So we need to decide what, if anything, the protocol can or should do to better assure the user that the circuit will be there when we promised, or to more gracefully recover from a missed deadline.
One last comment:-) If we require all NSAs to synchronize to within some deltaT, then we need to introduce a mechanism to verify this. And then to do so again every once in a while. If we are in fact dependent upon a certain grnaularity sync in the day clock, then we need to insure that it is so. If we are not really dependent on it, then we should a) not require it, and b) make sure the protcol works regardless of the clock skew. I do not think we are really dependent on a synchronized time-of-day clcok among NSAs.
Hope this helps Jerry
Aaron Brown wrote:
I'm probably oversimplifying, but it seems to me this problem becomes much easier with Jeff's idea about having all clocks synchronized within a period of no more than some number seconds. If the clocks aren't synchronized, you run into a whole bunch of errors related to making absolute time-based reservations anyway.
The protocol mandates clock offsets of no more than X seconds. Each domain selects its own setup time of "no more than Y minutes" and a tear down time of "Z minutes". If a user requests a reservation from time A to time B, the domain reserves time from A-X-Y through B+X+Z. When it comes to setting up the circuit, the domain starts setting it up at time A-X-Y. If the circuit isn't ready by time A-X, the domain throws a setup error and handles that error condition the way it'd handle an actual error occurred during circuit setup. The circuit remains active until time B+X, at which time the domain starts tearing it down. If, while the circuit is running, the hosts become desychronized, one of the domains will (from the either the clients or other domains' perspectives) end the circuit earlier than expected and report the tear down. The other domains/clients will handle that similar to if a cancel had occurred.
Again, I may be vastly oversimplifying the problem.
Cheers, Aaron
On Sep 30, 2010, at 1:31 PM, Radek Krzywania wrote:
Hi, Setting up a circuit via Alcatel NMS takes 2 minutes. This time is mostly consumed by NMS to find a path through domain and warm the room with CPU heat. A seconds or minute is still a guess anyway :) I can agree to use those values (instead of 20 minutes) but according to my current experience – lot of timeouts will appear. I fully support the statement of “We are trying to provide more/better predictability, not perfect predictability.” This should be on the title page of NSI design BTW :) All the case is about everything is relevant and not exact. The example of user clocks depicts it quite well (thanks Jerry for pointing that). Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl> Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________ *From:* Jerry Sobieski [mailto:jerry@nordu.net] *Sent:* Thursday, September 30, 2010 6:54 PM *To:* Artur Barczyk *Cc:* radek.krzywania@man.poznan.pl <mailto:radek.krzywania@man.poznan.pl>; 'Jeff W.Boote'; nsi-wg@ogf.org <mailto:nsi-wg@ogf.org> *Subject:* Re: [Nsi-wg] time issue Hi Artur- I accept the challenge!
First, let me calm the nerves... The questionof setup time - particularly the issue of taking 10 minutes or more has mostly to do with provisioning all optical systems where amplification and attenuation across a mesh takes a significant time. In most cases, the provisioning will be a more conventional few seconds to a minute (or so). And for smaller domains with more conventional switcing gear, maybe a few seconds at most.
So we should all try to keep perspective here that much of this discussion has to do with insuring the protocol functions correctly, consistently, and reliably as service infrastructure. And much of this is driven by making sure it works such even in the corner cases where it might take 15 minutes to provision, or two NSAs might have clocks that differ by 10 seconds. etc.
But the real world facts are that nothing is perfect, and a globally distributed complex system such as our networks are never practically if not theoretically going to be perfectly synchronized or offer exect predictability. We ar trying to provide more/better predictabiity, not perfect predictability.
You are right about the users expectations that the connection will be available at the requested time. But nothing is exact. Even if we knew and could predict exactly the setup time, if something was broken in the network and we couldn't meet the committed start time, what would the user do?
Ok. deep breath....exhale.... feel better? Ok good. Now let me defend our discussions...
To be blunt, it could be argued that any user application that blindly puts data into the pipe without getting some verification that the pipe *AND* the application agent at the far end is ready has no real idea if it is working AT ALL! If the agent at the other end is not functioning (not a network problem), this is fundamentally indistinguishable from a network connection not being available. How would the user be able to claim the network is broken?
On the other hand, if there *is* out of band coordination going on between the local user agent and the destination user agent, then the application is trying to deal with an imperfect world in which it needs to determine and synchronze the state of the application agent on the far end before it proceeds. ---> Why would doing so with the network resource not be of equal importance?
In *general* (Fuzzy logic alert) we will make the start time. Indeed, in most instances we will be ready *before* the start time. But if by chance we miss the start time by only 15 seconds, is that acceptable to you? Or to the application that just dumped 19 MBytes of data down a hole?
What if was the user application that had a slightly fast clock and started 10 seconds early? *His* clock said 1pm, mine said 12:59:50. Who is broken? The result is the same. What if the delta was 5 minutes, or 50 milliseconds? Where do we draw the line? Draw a line, and there will still be some misses...
The point here is that nothing is perfect and exact. And yet these systems function "correctly"! We need to construct a protocol that can function in the face of these minor (on a human scale) time deltas. But even seconds are not minor on the scale that a computer agent functions. So we necessarilly need to address these nuances so that it works correctly on a timescale of milliseconds and less.
In order to address the issue of [typically] slight variations of actual start time, we are proposing that the protocol would *always* notify the originating RA when the circuit is ready, albeit after the fact, but it says determinitistically "the circuit is now ready." And we are also proposing a means for the RA to determine the state if that ProvisionComplete message is not received when it was expected - if there is a hard error or just a slow/late provisioning process still taking place.
But given the fact that we cannot *exactly* synchronize each and every agent and system around the world- and keep them that way, and that we cannot predict perfectly how long each task will take before the fact, we have to face facts that we need to be able to function correctly with these uncertainties. Without meaning to preach, the user application needs to do so too.
Small is relative. (there is an old joke here about a prositute and an old man...but I won't go into it.:-)
Best regards Jerry
So we want to provide the service at the request time. And we will make our best effort to do so. And in most cases we will succeed. But what will the application do if we miss it? What should the protocol do in an imperfect world? It truly cannot function on fuzzy logic.
One approach to addressing this is to say the RA will always be notified when the connection goes into service. This is a positive sign that the connection is end-to-end.
Artur Barczyk wrote: Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly) if I miss the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users complaining are users quite quickly lost. You don't want that.
So let's consider two example users: - high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. This time comes, and the application will just throw data on the network; it might use a connectionless protocol or not, but it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless. - video conferencing/streaming: you reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participants that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like network down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical: the service needs to be useful to the end user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote:
Hi,
It's getting hard to solve everything here, so let's not try to solve everything at once. How about defining the start time as best effort for v1? We promise to deliver the service, but we cannot guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can't guarantee exactly when (1:50, 2:01, 2:15). Let's take a fairly long timeout (e.g. 20 minutes) and start setting up the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it's fine enough :) ).
For #1 - it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the approach above) less than 20 minutes (typically 5-10 minutes probably, but that's my guess), or not at all. Users may or may not complain about that. In the first case we are good. In the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can assume that agent clocks are synchronized with a precision of, let's say, 10 seconds, which should be good enough. The agents will use system clocks, so they need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue, it is a deployment issue. So let's put into the specification: "NSI protocol requires time synchronization with precision not worse than 10 seconds". If we discover it's insufficient, let's upgrade it for v2.
We already have enough features to implement just to see if it works fine (works at all, actually). If the user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify it in the protocol description). We cannot, however, deliver the service for a shorter time than the user requested. So we can agree (by voting, not discussing) on fixed time values. My proposal is as above:
- 20 minutes for reservation set up time
- Service availability time (e.g. 13 h)
- Service tear down time (not important from the user perspective, since once any segment of the connection is removed the service is no longer available, but let's say 15 minutes)
In that way, the calendar booking needs to reserve resources for 13 h 35 minutes. IMHO we can agree on that by a simple vote for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we started a quite theoretical discussion based on assumptions and "what if" guessing, instead of focusing on delivering any service (even with limited guarantees).
Best regards
Radek
________________________________________________________________________
Radoslaw Krzywania, Network Research and Development, Poznan Supercomputing and Networking Center, radek.krzywania@man.poznan.pl, +48 61 850 25 26, http://www.man.poznan.pl
________________________________________________________________________
From: nsi-wg-bounces@ogf.org On Behalf Of Jerry Sobieski
Sent: Wednesday, September 29, 2010 9:33 PM
To: Jeff W.Boote
Cc: nsi-wg@ogf.org
Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1.
The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change the requested start time as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3, since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to low-ball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not protocol or standards issues.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and be able to tell, via protocol operation, whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could just as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
Jeff W.Boote wrote: On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, I think the answer here has to be # 1) each NSA must reject the request if their process to establish the connection requested can not meet the Start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so # 2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual guard time of an NSA is also nonnegotiable, so option # 3) is not an option. I agree #1 seems the most deterministic.
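As a rough illustration of option #1 (reject rather than move any times), a provider-side admission check might look like the sketch below. The calendar structure and the guard-time value are hypothetical, and guard times stay a private, per-NSA matter as discussed here; this is not part of the protocol itself.

```python
from datetime import datetime, timedelta

def admit(requested_start: datetime, requested_end: datetime,
          guard: timedelta,
          calendar: list[tuple[datetime, datetime]]) -> bool:
    """Option #1: reject if the local guard time cannot be prepended.

    The resource must be held from (requested_start - guard) so that
    provisioning can finish by the Requested Start Time; the requested
    start itself is never moved or negotiated.
    """
    resource_start = requested_start - guard
    if resource_start < datetime.now():
        return False                      # no room left to provision before the start time
    for booked_start, booked_end in calendar:
        if resource_start < booked_end and requested_end > booked_start:
            return False                  # guard time collides with an existing reservation
    return True                           # reservation, including guard time, fits
```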
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA. I agree with this. The guard times are not additive across each NSA. The guard time from the perspective of the user will effectively be the maximum of each NSAa guard time in the chain. But, the user doesn't care as long as provisioning is accomplished by the users requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And, it shouldn't matter how long it takes to tear down the circuit either as long as the circuit is available until their requested end time. As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error. jeff
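Jeff's two points - the user-visible lead time being the maximum of the per-NSA guard times (assuming the NSAs provision in parallel), and the clock error being bounded rather than eliminated - fit in a couple of lines. The numbers here are made up:

```python
# Hypothetical per-NSA guard times (seconds) along one service chain.
guard_times = {"aggregator": 30, "domain_A": 120, "domain_B": 900, "domain_C": 45}
clock_error_bound = 10   # assumed worst-case skew between any two NSA clocks

# With parallel provisioning the lead time seen by the user is the maximum,
# not the sum, of the individual guard times, padded by the bounded clock error.
effective_lead = max(guard_times.values()) + clock_error_bound
print(effective_lead)    # 910 seconds before the requested start time
```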
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote: Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote: Radek I agree with your statements;
User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values, differences in setup times for MPLS vs optical lambdas, or the choices an NSA/NRM will make in path-finding. The protocol designers can keep the user in mind, but *the protocol is between the RA and the PA* and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API and a different Network to Network API... but I don't think that's what we want to do (:-) We need IMO to think *not* about the user, but about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end user API that simplifies the application's required interactions?? This would allow an application to embed an RA in a runtime library/module, and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept). While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone...So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have to the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of connection setup/provisioning only. Hence, there is no protocol state transition when the start time is passed, other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client. Ah, but the rub here is that the "user" is an RA... but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision (aka guard time) are best handled/shared as an SLA/Service Definition issue. I agree: we cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
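One concrete way to tolerate unsynchronized clocks, rather than depend on synchronization, is to bound the allowed skew and check it on every message - Artur suggests per-message timestamps later in this thread. The field name and threshold below are illustrative only, not part of any NSI specification:

```python
import time

MAX_SKEW = 10.0   # assumed bound on acceptable clock skew, in seconds

def check_peer_clock(message_timestamp, local_now=None):
    """Compare the timestamp carried in a peer NSA's message against the
    local clock, and flag a de-synchronization instead of silently acting
    on timers derived from a skewed time."""
    now = time.time() if local_now is None else local_now
    if abs(now - message_timestamp) > MAX_SKEW:
        return False   # raise an alarm / notification up the service chain
    return True
```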
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all possible timing permutations among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's ok to identify constants that the protocol must use, so that we can validate the protocol and simplify implementation and deployment. Indeed, oftentimes when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well. Pretty much. Across the board, things like clock events, estimates, and service specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
Hi Jerry, oh, wasn't my intention to raise the heat... :-) I understand the problem, and agree with you that making the protocol function correctly is the key issue here. Here's how I would see the main points (most of it appeared in one post or the other, not taking credit for it):
- the provisioning time has to be *subtracted* from the requested time
- each NSA knows how much provisioning time it needs
- if other NSAs need to be aware of it, this needs to be communicated. This would apply to the "user initiated" provisioning, if that's gonna stay - in this case, the longest provisioning time in all involved domains has to be subtracted (by all NSAs) to obtain the provisioning time
- each NSA is synchronised to a good time source (requirement). A few seconds are subtracted to get the definitive provisioning time - as in Aaron's very good mail.
- the NSA which received the original request receives the status from each NSA in the path, and once all are up, it notifies the user agent
- if the entire path is not provisioned within a timeout value (a minute could be acceptable), an error condition is declared, the user is notified, and provisioning is cancelled, including teardown of already provisioned segments. Or, if the status changes are always propagated to the user agent, we can get rid of the timeout altogether, since the user will know when the circuit comes live. If that doesn't happen within the user app's timeout, it can abort.
There's no explicit synchronisation between the NSAs. So one remaining key issue will be how to catch a de-synchronisation situation, and how to deal with it. Using timestamps in the messages between the NSAs would be an easy option for v1: when an NSA gets any message from another NSA, it checks it against its own clock, and raises an alarm if it detects a time skew of more than N seconds. For v2, we can discuss if it makes sense to add explicit sync messages. Makes sense?
Cheers, Artur
On 09/30/2010 06:53 PM, Jerry Sobieski wrote:
Hi Artur- I accept the challenge!
First, let me calm the nerves... The question of setup time - particularly the issue of taking 10 minutes or more - has mostly to do with provisioning all-optical systems, where amplification and attenuation across a mesh take a significant time. In most cases the provisioning will be a more conventional few seconds to a minute (or so). And for smaller domains with more conventional switching gear, maybe a few seconds at most.
So we should all try to keep perspective here: much of this discussion has to do with ensuring the protocol functions correctly, consistently, and reliably as service infrastructure. And much of it is driven by making sure it works even in the corner cases, where it might take 15 minutes to provision, or where two NSAs might have clocks that differ by 10 seconds, etc.
Hi Artur,
I guess you are making this request for guaranteed start times with your 'system-user' hat on? On balance I agree with your proposal that connections should have a known start time, otherwise they are not useful to the user.
The problem we are tackling here is that an unknown delay is inherent to the provisioning process - in particular I have experienced this with AutoBAHN. So the question is (as you say) how to manage this in a deterministic way. Personally I agree with Radek, in that we don't have enough system knowledge to be sure of removing these delays, so how do we handle this?
My preferred solution would be that the requestor NSA always asks for "Available Time" (as per John's definition); the onus should then be on the provider NSA to begin provisioning in advance to add some reliability to the system - whether this is in fact achievable is not at all clear to me.
Guy
Artur makes very clear why it is important to be able to predict starting available time and ending available time. Without it, users won't come back. I believe it is also clear that providers must schedule their resources in a way that includes setup and teardown time in addition to available time. It is impossible to know exactly how long setup will take, so it is impossible to be absolutely sure exactly when a connection will be available. This is where SLAs come in. A provider who wants users to come back promises to have the connection available at a given time and pays a penalty for being late. If the penalty is big the provider estimates the startup time to be large. If the penalty is small it might reduce the startup time. So there is a concept of startup time that is needed prior to available time. There are now 3 times - available time, resource time, and startup time. [teardown needs to be included as well, but I skip it for this example] The relationships are straightforward: resource time == startup time + available time. Assume the requestor always asks for available time. Then for automatic provision the provider schedules resource time, and initiates provisioning at the start of resource time. To this point I think everyone agrees -- The main issue that is still of concern is - for manual provision the provider schedules resource time (which is calculated from requested available time). For manual provisioning to provide the requested available time, the provider must return the setup time, either by returning the resource time as well as available time, or by including the (estimated) setup time. The alternative is to have manual provisioning request resource time. In that case it seems to me it would be good to return the (estimated) startup time so manual provisioning can be started at the time required to make the connection available. In this case the manual provision requests resource time, while the automatic provision requests available time. --- A second issue that confuses discussion is the concept of "chained" segments. The issue is that if a number of segments are in an end-to-end connection, each segment has its own, different, setup time. Somehow these need to be correlated to provide a setup time for the chain as a whole. This is a significant problem, but in my opinion it is a problem for interacting NSAs, not something for NSI to solve. We do need to have an idea of how NSA agents might solve this and be able to support their requirements in the protocol. It may be useful to define a "federating NSA agent" which uses the NSI interface to exchange information. Such fedNSA agents have different requirements than NSA agents that coordinate applications with computing, networking, display and storage. Both use NSI. --- I note that the concept of estimated setup time fits well with SLAs, and with Artur's requirement to be confident of when a connection is available. It is something that a provider sells to a user (for cash or maybe just to keep him happy). I think we should include setup (and teardown) in the framework. What do others think? John On Sep 30, 2010, at 1:02 PM, Guy Roberts wrote:
Hi Artur,
I guess you are making this request for a guaranteed start time with your ‘system-user’ hat on? On balance I agree with your proposal that connections should have a known start time, otherwise they are not useful to the user.
The problem we are tackling here is that an unknown delay is inherent to the provisioning process – in particular, I have experienced this with AutoBAHN. So the question is (as you say) how to manage this in a deterministic way. Personally I agree with Radek, in that we don’t have enough system knowledge to be sure of removing these delays, so how do we handle this?
My preferred solution would be that the requestor NSA always asks for "Available Time" (as per John’s definition); the onus should then be on the provider NSA to begin provisioning in advance to add some reliability to the system – whether this is in fact achievable is not at all clear to me.
Guy
From: Artur Barczyk [mailto:Artur.Barczyk@cern.ch] Sent: 30 September 2010 16:25 To: radek.krzywania@man.poznan.pl Cc: nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Hi Radek, All,
hmmmm, I for my part would be quite annoyed (to put it mildly) if I missed the first 15 minutes of today's HD conf call just because I reserved the resources a week in advance. "Around" has no place in a well-defined protocol. No fuzzy logic, please :-) Consider also the "bored child in a car" scenario: RA: are we there yet? PA: no... RA: are we there yet? PA: nooo.... RA: are we there yet? PA: NO! etc.
Be aware that users complaining are users quite quickly lost. You don't want that.
So let's consider two example users: - high volume data transfers through a managed system: a data movement scheduler has reserved some bandwidth at a given time. When this time comes, the application will just throw data onto the network; it might use a connectionless protocol or not, but either way it will result in an error. It cannot wait "around" 15 minutes, as that would throw the transfer schedule into complete disorder. Such a "service" is just useless. - video conferencing/streaming: you reserve the network resource for 3pm because your meeting starts then. How do you explain to the video conference participants that the network prevented the conference from starting for "around" 15 minutes? (Well, you can, but this will be the last time you'll see the user using your network :-) )
In short, the only reasonable thing to do is to put the right mechanism in place to guarantee the service is up when the user requested it (and you confirmed it). The only acceptable reason for failing this is an error condition like network down (and we'll talk about protection in v2 :-) )
I also think it is very dangerous to use "providing a service" as an argument while the underlying protocols are not yet correctly specified. This is not theoretical; the service needs to be useful to the end-user if you want some uptake. Fuzzy statements make it useless. The very reason people are interested in this is that it's deterministic - you know what you get and when. Otherwise use the routed network. :-)
Cheers, Artur
On 09/30/2010 03:37 PM, Radek Krzywania wrote: Hi, It's getting hard to solve everything here, so let's not try to solve everything here at once. So how about defining the start time as best effort for v1? We promise to deliver the service, yet we are unable to guarantee the exact start time with a precision of seconds. If the user wants the connection to be available at 2pm, it will be around that time, but we can't guarantee when exactly (1:50, 2:01, 2:15). Let's take a quite long time as a timeout (e.g. 20 minutes), and start booking the circuit 5 or 10 minutes in advance (no discussion for v1, just a best-feeling guess). The result will be that in most cases we will deliver the service at AROUND the specified time. For v1 that is enough, as we will be able to deliver a service, while in v2 we can discuss possible upgrades (unless our engineering approach discovers it's fine enough :) ). For #1 – it may be a problem for instant reservations. Here the user wants a circuit ASAP. We define ASAP as (see the above approach) less than 20 minutes (typically 5-10 minutes probably, but that's my guess), or not at all. Users may or may not complain about that. In the first case we are good. For the second case we will need to design an upgrade for v2.
Synchronization IMHO is important, and out of scope at the same time. We can make an assumption that the agents' clocks are synchronized with a precision of, let's say, 10 seconds, which should be more than enough. The agents will use system clocks, so these need to be synchronized in the end (NTP or whatever), but that is not even an implementation issue - it is a deployment issue. So let's put into the specification: "The NSI protocol requires time synchronization with precision not less than 10 seconds". If we discover it's insufficient, let's upgrade it for v2.
We already have some features to implement, just to see if it works fine (works at all, actually). If a user is booking a circuit a week in advance, I guess he will not mind if we set it up 15 minutes after the start time (the user IS aware of that, as we specify this in the protocol description). We can't, however, deliver the service for less than the user-defined duration. So we can agree (by voting, not discussing) on the fixed time values. My proposal is as above: - 20 minutes of setup time in the reservation - Service availability time (e.g. 13 h) - Service teardown time (it's not important from the user's perspective, since as soon as any segment of the connection is removed the service is not available any more, but let's say 15 minutes) In that way, the calendar booking needs to reserve resources for 13 h 35 minutes. IMHO we can agree on that by a simple vote for v1 (doodle maybe), and collect more detailed requirements for v2 later on. I get the feeling we have started a quite theoretical discussion based on assumptions and guessing "what if", instead of focusing on delivering any service (even with limited guarantees).
Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
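A minimal sketch of the calendar arithmetic in Radek's proposal above; the function name and the fixed guard values are illustrative only, not part of any NSI specification:

    from datetime import datetime, timedelta

    def booking_window(requested_start, requested_duration,
                       setup_guard=timedelta(minutes=20),
                       teardown_guard=timedelta(minutes=15)):
        # The calendar entry brackets the requested availability with the
        # fixed setup and teardown guards proposed above.
        calendar_start = requested_start - setup_guard
        calendar_end = requested_start + requested_duration + teardown_guard
        return calendar_start, calendar_end

    # Radek's example figures: 13 h of availability, 20 min setup, 15 min teardown.
    start = datetime(2010, 10, 1, 14, 0)
    cal_start, cal_end = booking_window(start, timedelta(hours=13))
    print(cal_end - cal_start)   # 13:35:00 - the 13 h 35 min calendar booking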
From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of Jerry Sobieski Sent: Wednesday, September 29, 2010 9:33 PM To: Jeff W.Boote Cc: nsi-wg@ogf.org Subject: Re: [Nsi-wg] time issue
Ok. I can buy this approach of #1. The Requested Start Time is immutable as the request goes down the tree (which disallows #2) - it is still a Requested Start Time, but NSAs are not allowed to change it as the request goes down the tree. But you can't prevent #3 if that's what an NSA somewhere down the tree decides to do. The result would be a promise it may not be able to keep - but that's acceptable because the Estimated Start Time is just an estimate; it's not binding.
The point is, the local NSA cannot tell whether a remote NSA is using #1 or #3 since it's totally up to the remote NSA to select the guard time appropriate for that request. Likewise, even if the remote NSA misses the Estimated Start Time, the requesting RA has no recourse other than to a) just wait until the provisioning completes or b) give up and release the connection. An SLA might influence the bad NSA not to lowball its provisioning guard time in the future, or it may provide a rebate for the jilted user, but these are not a protocol or a standards issue.
This goes to John's comment on the call today about what happens inside the NSA between the PA role and the RA role... These actions are captured in "state routines" that are invoked when protocol events occur. These actions are generalized in the standard, but any heuristics like these approaches to guard time cannot always be mandated. In a protocol standard, whatever components are "required" or "must" items must be verifiable in a conformance test. I.e. if someone comes up with an NSI implementation, we should be able to put the reference implementation against the test implementation and be able to tell via protocol operation whether the implementation under test is doing all the "must" items. If we say an NSA must use #1 above, there is no way to test it and confirm that it is doing so. If the test implementation uses #3, the only outward sign is that it may miss the start time on some connection(s), but it could just as easily have been a poor judgment call on the provisioning time - which is ok.
So, in the standard, we can only recommend #1 be used. Or we can say the NSA "should" use #1. But we cannot require it.
my $.02 Jerry
Jeff W.Boote wrote:
On Sep 29, 2010, at 7:31 AM, Gigi Karmous-Edwards wrote:
Jerry,
For your question : " While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?"
In my opinion, the answer here has to be #1) each NSA must reject the request if its process to establish the requested connection cannot meet the Start time. In my opinion an NSA should NOT be allowed to change the requested start time (this will cause all types of problems for other NSAs), so #2) is not an option. The guard time for each NSA will most likely be vastly different and very dependent on the tools used by that network domain to configure the network elements for the requested path, so an individual NSA's guard time is also non-negotiable, and option #3) is not an option either.
I agree #1 seems the most deterministic.
I agree with Radek, ONLY Start times and End times should be used in the protocol and that guard times are only private functions of each individual NSA.
I agree with this. The guard times are not additive across each NSA. The guard time from the perspective of the user will effectively be the maximum of each NSA's guard time in the chain. But the user doesn't care as long as provisioning is accomplished by the user's requested start time. That time would be in the protocol and would remain unchanged through each step of the chain. And, it shouldn't matter how long it takes to tear down the circuit either, as long as the circuit is available until their requested end time.
As to how to manage this time synchronization... I think it is totally reasonable to depend upon existing protocols. There are other protocols that already depend upon time synchronization, and many of them use NTP. We are not talking about needing very tight synchronization anyway. 1 second or even 10 seconds is plenty close enough. It is more about bounding that error.
jeff
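A small sketch of Jeff's point that guard times are not additive, assuming every NSA auto-provisions its own segment in parallel; the NSA names and values are purely illustrative:

    # Hypothetical per-NSA provisioning guard times, in seconds.
    guard_times = {"NSA-A": 30, "NSA-B": 120, "NSA-C": 45}

    # If each NSA starts provisioning its own segment its own guard time ahead
    # of the requested start, the lead time visible to the user is the largest
    # single guard time in the chain, not the sum of them.
    effective_lead_time = max(guard_times.values())
    print(effective_lead_time)   # 120 - set by the slowest NSA in the chain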
Kind regards, Gigi
On 9/29/10 8:45 AM, Jerry Sobieski wrote: Hi Inder- I am not sure I agree with all of this...
Inder Monga wrote: Radek
I agree with your statements; User is not interested in partial results, as he/she is not even aware/interested in which NSAs/domains are involved. User doesn’t care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values or differences in setup times for MPLS vs optical lambdas, nor does it concern itself with the choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User to Network API, and a different Network to Network API...but I don't think that's what we want to do (:-) We need, IMO, to *not* think about the user, but to think about the Requesting Agent - regardless of who it represents.
Perhaps once the RA-PA protocol is tightly defined in all its nuances, we can develop/recommend an end user API that simplifies the application's required interactions? This would allow an application to embed an RA in a runtime library/module and the application itself would only have to deal with the basic connection requirements.... just a thought.
In my opinion, a. the user should specify "Expected Start Time, Expected End Time". The NSAs/domains along the path determine resource availability and booking in their schedules based on their own configured guard time (guard times are not specified by NSI protocol. NSI connection service architecture should discuss them as a suggested concept). While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if for instance a provisioning guard time pushes a reservation forward into a previous reservation: Do we 1) reject the request since we can't prepend our guard time and still make the Requested Start Time? OR 2) Do we retard the Estimated Start Time to allow for the guard time? OR 3) do we reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone...So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have to the protocol and how that will be carried through the service tree.
b. Within reasonable limits, the connection should be up as close to the start time as possible. The user can set his own policy/configuration on how long to wait after the start time to accept a connection. Since the resources are guaranteed, this is a matter of setup/provisioning only. Hence, there is no protocol state transition when the start time is passed other than the messages that indicate the circuit is established end to end or a teardown message initiated by the client. Ah, but the rub here is that the "user" is an RA...but not all RAs are the end user. We are defining the actions of an RA, regardless of whether it is a user NSA or a network NSA. So we must ensure that if the RA gets tired of waiting for provisioning to complete, whatever actions it is allowed to take will be consistent and predictable throughout the service tree for all the RA/PA interactions. So the "user" actions are not irrelevant to the protocol.
c. We should not design a protocol that depends on time synchronization to work. In my opinion, the start time and the expected time to provision aka guard time are best handled/shared as an SLA/Service definition issue. I agree: We cannot expect perfectly/exactly synchronized clocks anywhere in the network. And therefore we cannot depend upon clock synchronization for any part of the protocol to work. Which implies that the protocol must work when the clocks are NOT synchronized. How do we ensure this? --> rigorous protocol analysis.
While the values of certain timers may be left to the Service Definition/SLA, as I stated before, we must make sure that the protocol can function predictably and consistently in the face of all timing permutations that are possible among NSAs. This rapidly gets very complex if we allow too many variables for the SD/SLA to define. Sometimes it's OK to identify constants that the protocol must use so that we can validate the protocol and simplify implementation and deployment. Indeed, oftentimes when clocks are only slightly skewed they introduce race conditions that become more likely to occur, requiring more careful consideration.
d. Similar semantics apply to the end-time as well. Pretty much. Across the board, things like clock events, estimates, and service-specific choices will create situations where we need to ensure the protocol and state machines will function properly across the full range of possible permuted values. This is in general why protocol designers say "make it only as complex as it needs to be, and no more" - options breed complexity.
br Jerry
-- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Hi Radek- Good post. I concur with it pretty well also. See comments inline... Radek Krzywania wrote:
Hi,
It's getting hard to solve everything here, so let's not try to solve everything here at once. So how about defining the start time as best effort for v1?
I suggest that both the Start Time and End Time are "best effort" times. We can bound a delayed start - see below.
So we promise to deliver the service, yet we are unable to guarantee the exact start time in precision of seconds. If user want connection to be available at 2pm, it will be around that time, but we can’t guarantee when exactly (1:50, 2:01, 2:15). Let’s take a quite long time as a timeout (e.g. 20 minutes), and start booking the circuit in 5 or 10 minutes in advance (no discussion for v1, just best feeling guess) . The result will be that in most cases we will deliver the service at AROUND specified time.
I suggest we word it something like this: "The Reserved Time should precede the Start Time sufficiently to allow auto-start provisioning to complete successfully at or before the committed Start Time. An auto-start Provisioning process that exceeds the Start Time by <<1 minute>> constitutes a failed provisioning process and a Provision Fault error is generated." This will bound the problem - the network will deliver within one minute of the scheduled start time, or declare a fault condition and fall on its sword. We can discuss if this is adequate for V1.0. And what the recovery action should be: release the connection and notify the user? notify the user and continue provisioning? ...? Likewise, in order to prevent a premature Release (:-), we should say something like this: "Upon receiving a calendar alarm indicating that the Stop Time for a connection has been reached, the NSA shall wait for a <<1 minute>> grace period before initiating auto-stop Release processing for the connection. However, upon receipt of a valid Release message from another NSA, the local NSA shall cancel the grace period and immediately proceed with Release processing. The Reservation Stop Time shall succeed the In Service Stop Time sufficiently to properly and completely bring local resources to a known idle and secure state, including the grace period." Thoughts? Jerry
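A minimal sketch of the two timer rules worded above, assuming the <<1 minute>> placeholders; the function and constant names are illustrative only, not NSI protocol elements:

    from datetime import datetime, timedelta

    PROVISION_SLACK = timedelta(minutes=1)   # allowed overrun past the committed Start Time
    RELEASE_GRACE = timedelta(minutes=1)     # grace period before auto-stop Release begins

    def auto_start_check(committed_start, now, in_service):
        # A Provision Fault is declared if auto-start provisioning has not
        # finished within the slack after the committed Start Time.
        if not in_service and now > committed_start + PROVISION_SLACK:
            return "PROVISION_FAULT"
        return "OK"

    def auto_stop_check(stop_time, now, release_received):
        # A valid Release message cancels the grace period; otherwise Release
        # processing starts only once the grace period has elapsed.
        if release_received or now >= stop_time + RELEASE_GRACE:
            return "RELEASE_NOW"
        return "WAIT"

    start = datetime(2010, 10, 1, 17, 0)
    print(auto_start_check(start, start + timedelta(seconds=90), in_service=False))   # PROVISION_FAULT
    print(auto_stop_check(start, start + timedelta(seconds=30), release_received=True))  # RELEASE_NOW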
Hi, Well, it requires discussion and "wording", but in general I would agree. There will now be the question of whether we should cancel the reservation if the start is delayed, or keep it best effort. I would insist on best effort for v1, however. IMHO we should say explicitly it's best effort: we try our best, we can't guarantee service timing. It may be delayed, but still delivered. What we should guard is the duration of the connection as requested by the user (so if the start is delayed, we should finish delayed), and timeouts (so a connection not ready in 20 minutes is out). We need to have those "buffers" counted in the calendars for resource booking, preventing resources from overlapping. Setting up a connection in advance (as I said before, e.g. 20 minutes) should solve most of the issues here, at least for now. I would like to see in practice how the protocol and agents work, and then decide (based on user comments maybe) if it's fine or needs to be tuned. Best regards Radek ________________________________________________________________________ Radoslaw Krzywania Network Research and Development Poznan Supercomputing and radek.krzywania@man.poznan.pl Networking Center +48 61 850 25 26 http://www.man.poznan.pl ________________________________________________________________________
Hi Jerry,
This will bound the problem - the network will deliver within one minute of the scheduled start time, or declare a fault condition and fall on its sword. We can discuss if this is adequate for V1.0. And what the recovery action should be: release the connection and notify the user? notify the user and continue provisioning? ...?
I think a related, just as important question is, do you extend the termination time? For an application which has to transfer N TBs of data, starting 20 minutes late might be useless if the requested reservation time is shortened by that... I admit a videoconferencing app will prefer to wait. From the protocol point of view, terminating would be the cleanest thing to do. With the right configuration of provisioning times in each domain, a timeout of a minute, say, would indicate a deeper problem, so waiting might be pointless. However, an alternative would be to keep the user (agent/application) in wait state until provisioning successful or termination time - whatever comes first. The user app can then abort if it thinks it's beyond repair, and try to reschedule. But this implies that there is always a status (change) notification to the user agent. Then we can get rid of the timeout, actually. (Note that's not contradicting my earlier statements - the goal remains to provision at the requested time.) Cheers, Artur -- Dr Artur Barczyk California Institute of Technology c/o CERN, 1211 Geneve 23, Switzerland Tel: +41 22 7675801
Hi Artur- Artur Barczyk wrote:
Hi Jerry,
This will bound the problem - the network will deliver within one minute of the scheduled start time, or declare a fault condition and fall on its sword. We can discuss if this is adequate for V1.0. And what the recovery action should be: release the connection and notify the user? notify the user and continue provisioning? ...?
I think a related, just as important question is, do you extend the termination time? For an application which has to transfer N TBs of data, starting 20 minutes late might be useless if the requested reservation time is shortened by that... I admit a videoconferencing app will prefer to wait.
This is IMO first a policy question - who is following you in the resource schedule? Are you more important than they are? From a purely technical point of view, the reservation only reserved the resources for the prearranged time. To extend that reservation would require changing the resource reservation all up and down the tree. Even if I said "sure, give him extra time", another NSA may say "whooaa hold on! - I did my part - I was on time...why should I extend his reservation just because some chump NSA downstream couldn't meet their commitment?" For this reason, I would say the simple thing is to NOT extend the reservation. At least in V1.0. The fact is that probably the vast majority of our Start Times will be met successfully, and that the very few that miss will probably only miss by a relative "little bit", and the cost of extending will not be worth the complexity and the arguments we will have figuring out how to do it :-). Frankly, I would recommend to users that if they need to transfer a known data quantity, they should add some buffer to handle such contingencies - some of which may not be network related at all. Another policy option is to try to encourage users to reserve MORE time than they need and Release early if they get done early. We all know better than to plan things too tightly - especially in complex interdependent systems. Always have some room to maneuver!
From the protocol point of view, terminating would be the cleanest thing to do. With the right configuration of provisioning times in each domain, a timeout of a minute, say, would indicate a deeper problem, so waiting might be pointless.
Exactly.
However, an alternative would be to keep the user (agent/application) in wait state until provisioning successful or termination time - whatever comes first. The user app can then abort if it thinks it's beyond repair, and try to reschedule. But this implies that there is always a status (change) notification to the user agent. Then we can get rid of the timeout, actually. (Note that's not contradicting my earlier statements - the goal remains to provision at the requested time.)
Yeah, a timeout is useful for the protocol more generally. You never leave a protocol in a non-terminal state. There should always be a way for the state machine to terminate. If we issue a Provision down the tree and nothing ever comes back...at some point we have to assume something is wrong (!) And we have to be able to recover if nothing is ever EVER going to come back (the NSA died somehow...) The timeout does this. When we wake up and realize nothing has happened from below, we can either check on progress, or just bail out. So, we can either issue a query() request down "Are we there yet?" "Nope, still working", or we can issue a notify up "Working, please stand by." "ok, thanks", after which the RA waits some more or bails out with a Release. This is why we need a timeout value for all transient states (Reserving, Provisioning, Releasing) so that we don't hang indefinitely on some delay from below. The other states will have the calendar alarm to kick them out of Scheduled and In Service. Each state's timeout may be different, and the action may be different. But this gets into error conditions... BR Jerry
Cheers, Artur
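A minimal sketch of per-state timeouts for the transient states Jerry describes above; the values and recovery-action names are placeholders, not NSI-mandated constants:

    from datetime import timedelta

    # Illustrative timeouts for the transient states; each state may differ.
    TRANSIENT_TIMEOUTS = {
        "RESERVING": timedelta(seconds=60),
        "PROVISIONING": timedelta(minutes=2),
        "RELEASING": timedelta(seconds=60),
    }

    def on_timeout(state, retries):
        # First expiry: check on progress ("are we there yet?"); after that,
        # bail out so the state machine always reaches a terminal state.
        if retries == 0:
            return "send_query_down"
        return "release_and_terminate"

    print(TRANSIENT_TIMEOUTS["PROVISIONING"], on_timeout("PROVISIONING", 0))
    print(on_timeout("PROVISIONING", 1))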
I would suggest that there are at least several issues that need to be resolved here. I try to define the issues and what my solution to each of them would be - at least right now. -- One is how to manage time across multiple NSAs - this is a problem when there are two NSAs, and more of a problem when there may be multiple NSAs involved in making a connection. 1) time synchronization Jeff has suggested that we require NSAs to run NTP to synchronize time between NSAs. The only downside I see of this is that it requires each NSA to run NTP. If we agree on this approach, then we might want to investigate if only provider NSAs need to run NTP, assuming a non-provider will talk to a provider - this might make it easier for applications to be NSAs. The upside is that a request for starting and ending time is understood the same way by all NSAs. This means that when a request goes to multiple NSAs in an authorization sequence, they will all see the same time and understand it the same way. *I vote for requiring NTP and working out the details - especially the potential difference in scheduling between segments because of possible skew in NTP time across NSAs. 2) Relationship between available and scheduled time 2a) available time definition I think we agree that start and end time (or duration) in a request will be for available time. Available time is time when the connection is actually available to carry traffic. This is the time that is sync'd using NTP. Note that available time is always an estimate because startup duration is variable. 2b) resource or scheduled time This is the time a resource is scheduled for by its own NRM. It calculates this time by inserting startup time for resources it controls before the requested start time, and inserting teardown time after the requested end time. This time is different for every NRM and may be different for different equipment within an NRM - that is an NRM implementation detail. In automated provisioning resource time is not shared with other NSAs. The impact of this will be that when trying to schedule connections close to each other, the connection reservations must be separated by startup and teardown times in participating NSAs. *I would like to see us accept these time concepts in talking about time issues. There are probably issues I haven't thought of yet that should be included in the descriptions. 3) Difference between automatic and manual provisioning 3a) Automatic provisioning is defined as having each NRM initiate the connection so that it is estimated to be available at the requested start time. The time is estimated. It would be good to put some sort of bound on this, similar to what is done for NTP. This requires that the provider be able to make an estimate of startup and teardown time such that provisioning occurs in a predictable way. Failure of a provider to do so would be an issue for the SLA. 3b) Manual provisioning is defined as having a provision message sent to an NSA to initiate provisioning. This capability requires that the requestor know when a connection is available to be provisioned. Presumably this time is the start time in the reservation minus startup time. How the requestor knows this time is not clear, though it is possible to build a protocol that would make it available. In addition, the method of determining start time when a provision request is sent through a sequence of NSAs is also difficult, though not impossible.
* I would vote to include automatic provisioning in the V1 architecture and protocol, and include manual provisioning as a future item in the architecture and not in the V1 protocol. This gives us time to explore manual provisioning and its use cases before trying to define how it is implemented. John
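A minimal sketch of the relationship John describes above (resource time brackets available time with startup and teardown); the setup and teardown estimates are purely illustrative - each NRM would use its own:

    from datetime import datetime, timedelta

    def resource_window(avail_start, avail_end, est_setup, est_teardown):
        # Resource (scheduled) time = startup estimate + available time +
        # teardown estimate, per the definitions in the summary above.
        return avail_start - est_setup, avail_end + est_teardown

    avail_start = datetime(2010, 10, 1, 17, 0)
    avail_end = datetime(2010, 10, 1, 19, 0)
    res_start, res_end = resource_window(avail_start, avail_end,
                                         timedelta(seconds=25), timedelta(seconds=40))
    print(res_start)   # 2010-10-01 16:59:35 - where auto-provisioning would be initiated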
Hi John- This is a pretty good summary. Just a few comments (I can never resist:-) inline... John Vollbrecht wrote:
I would suggest that there are at least several issues that need to be resolved here. I try to define the issues and what my solution to each of them would be - at least right now.
--
One is how to manage time across multiple NSAs - this is a problem when there are two NSAs, and more of a problem when there may be multiple NSAs involved in making a connection.
1) time synchronization Jeff has suggested that we require NSAs to run NTP to synchronize time between NSAs. The only downside I see of this is that it requires each NSA to run NTP. If we agree on this approach, then we might want to investigate if only provider NSAs need to run NTP, assuming a non-provider will talk to a provider - this might make it easier for applications to be NSAs.
The upside is that a request for starting and ending time is understood the same way by all NSAs. This means that when a request goes to multiple NSAs in an authorization sequence, they will all see the same time and understand it the same way.
*I vote for requiring NTP and working out the details - especially the potential difference in scheduling between segments because of possible skew in NTP time across NSAs.
IMO - if we *require* the NSAs to be synchronized around a common global time (which is what NTP does), we need to do three things: 1. state clearly and concisely *why* this is required (I do not think it is *required*) 2. specify the accuracy that is required to meet #1 3. come up with a means to verify time synchronization (since out-of-sync NSAs must not be tolerated.) Else, we can *recommend* that the NSAs *should* use a protocol such as NTP to synchronize their time. We can still say why this is useful, but we do not require such synchronization. (I think this is the right approach). Either way, the protocol must still be designed to handle discrepancies between NSAs. Synchronization is not perfect, nor does it address the non-deterministic processing times and propagation delays that exist in a distributed architecture such as NSI. These are "timing" issues, not "Time" issues (if you get my point:-). The protocol state machine in different NSAs should converge to a common state for a connection given enough time for messages to filter appropriately through the service tree and underlying network. This is what I think we need to consider in more detail. I vote that we *recommend* NTP or a similar protocol be used to synchronize day clocks so that coordinated scheduling across independent agents is most effective.
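A small sketch of this "recommend, don't require" stance: skew is measured and bounded, but an out-of-tolerance clock does not break the protocol, it only degrades scheduling accuracy. The 10-second bound and the names are illustrative assumptions:

    MAX_SKEW_SECONDS = 10.0

    def scheduling_margin(measured_offset_seconds):
        # measured_offset_seconds is the NSA's offset from a reference clock,
        # e.g. as reported by a local NTP daemon.
        skew = abs(measured_offset_seconds)
        if skew > MAX_SKEW_SECONDS:
            # Out-of-tolerance clock: warn, but keep operating - the protocol
            # itself must still converge.
            print("warning: clock skew %.1f s exceeds the recommended bound" % skew)
        return skew

    print(scheduling_margin(2.5))    # well inside the bound
    print(scheduling_margin(31.0))   # degraded accuracy, not a protocol failure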
2) Relationship between available and scheduled time 2a) available time definition I think we agree that start and end time (or duration) in a request will be for available time. Available time is time when the connection is actually available to carry traffic. This is the time that is sync'd using NTP. Note that available time is always an estimate because startup duration is variable.
No. The time that is sync'd using NTP is the *system clock* in each NSA. The Available Time or the Resource Time are simply static database entries in the ResourceDB, or in a "ConnectionDB". These times do not need synchronization; they are specified by users or NSAs. It's the system clocks that need to be synchronized. A "scheduler" process in each NSA periodically compares a scheduled time (say Resource Time) to the "current time" as represented by the system clock. If these two times match, a functional routine is called to process that event. It's important to recognize that any of the times we put in the ResourceDB represent allocation windows that we want to coincide, but it is the system clock - which is continually changing - that gets synchronized by NTP. I do concur that the Available Time represents the time at which the circuit is intended to be In-Service and carrying traffic. I disagree that the Available Time is an estimate. It is a target. It is the Requested Available Time (from the RA) or the Committed Available Time (from a PA). But it is not an estimate. The *estimated* time is the "Estimated Provisioning Duration" which cannot be known exactly in advance. The variance in Actual Provisioning Duration causes the Actual Available Time to vary - which is why you called it an estimate. But IMO we should be clear that there are a number of times being batted about regarding when the circuit is supposed to be usable. IMO there are three: Requested Available Time, Committed Available Time, and Actual Available Time.
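A minimal sketch of the distinction Jerry draws above between the synchronized system clock and the static per-connection times, plus the three flavours of Available Time; all names here are illustrative, not defined by NSI:

    from dataclasses import dataclass
    from datetime import datetime
    from typing import Optional

    @dataclass
    class ConnectionTimes:
        # Static entries in a connection record - these are not synchronized.
        requested_available: datetime                   # asked for by the RA
        committed_available: datetime                   # promised by the PA
        actual_available: Optional[datetime] = None     # set when the circuit comes up

    def scheduler_tick(connections, now, fire):
        # One pass of the scheduler: compare stored times against the current
        # system clock (the thing NTP keeps honest) and fire any due event.
        for conn_id, t in connections.items():
            if t.actual_available is None and now >= t.committed_available:
                fire(conn_id)   # e.g. check provisioning progress or raise an alarm

    conns = {"conn-1": ConnectionTimes(datetime(2010, 10, 1, 17, 0),
                                       datetime(2010, 10, 1, 17, 0))}
    scheduler_tick(conns, datetime(2010, 10, 1, 17, 0, 5), lambda c: print("due:", c))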
2b) resource or scheduled time This is the time a resource is scheduled for by its own NRM. It calculates this time by inserting startup time for resources it controls before the requested start time, and inserts tear down time after the requested end time. This time is different for every NRM and may be different for different equipment within an NRM - that is an NRM implementation. In automated provisioning resource time is not shared with other NSAs. The impact of this will be when trying to schedule connections close to each other, they connnection reservations must be separated by startup and teardown times in participating NSAs.
*I would like to see us accept these time concepts in talking about time issues. There are probably issues I haven't thought of yet that should be included in the descriptions.
With the caveats and nuances I mentioned above about sync and the Avail Time distinctions, I think you have captured the sentiment well.
3) Difference between automatic and manual provisioning 3a) Automatic provisioning is defined as having each NRM initiate connection so that it is estimated to be available at the requested start time. The time is estimated. It would be good to put some sort of bound on this, similar to what is done for NTP. This requires that the provider be able to make a estimate of startup and teardown time such that it occurs in a predicable way. Failure of a provider to do so would be an issue for SLA.
I would be pedantic and say the NSI does not specify what the NRM does. NSI *only* specifies what NSAs do. Auto Start is defined as each NSA initiating provisioning independently based upon a locally scheduled provisioning start time. Manual start is where each NSA awaits a ProvisionRequest message from another authorized NSA to initiate provisioning. I think trying to bound the error on "best estimates" is pointless, and frankly out of scope. The start time is either met, or it is not. It's like On-time Departure figures for an airline...they are estimates and do not guarantee anything. They are of questionable accuracy, self-stated, and not consistently computed. Is "On-Time" actually "On Time", or within 5 minutes of scheduled? Does an hour late count as a missed departure the same as a 5-minute-late departure? The way to improve this IMO is to simply say that, a priori, "Available Time" is either Requested Available Time or Committed Available Time. If the post facto Actual Available Time occurred after the Committed Available Time, tough. It happens, it's unavoidable, get over it. From a protocol standpoint, once the Committed Available Time has passed, the RA has a decision to make: Do I stay? Or do I go? i.e. should I wait a little while to see if things fall into place? Or should I give up and tear down the reservation? These are the only protocol options. The user may consult an SLA to see if there is recourse, but that is not part of the NSI protocol. The savvy user would institute an SLA before the fact to ensure that there is substantial incentive for the Provider to meet the Committed Available Time. Doing some studies to track on-time departures could be useful, but it is not part of the NSI protocol. It is way off track in my opinion. It is something another tool or other software (perfSonar?) should be doing.
3b) Manual provisioning is defined as having a provision message sent to an NSA to initiate provisioning. This capability requires that requestor know when a connection is available to be provisioned. Presumably this time is the start time in the reservation minus startup time. How the requestor know this time is not clear, though it is possible to build a protocol that would make it available. In addition, the method of determining startime when a provision request is sent through a sequence of NSAs is also difficult, though not impossible.
* I would vote to include automatic provisioning in V1 architecture and protocol, and include manual provisioning as a future in architecture and not in V1 protocol. This gives us time to explore manual provisioning and its uses cases before trying to define how it is implemented.
I agree. Leave Manual provisioning for the future. However, we still have ASAP requests to deal with. I think manual start and ASAP circuits may share many of the same challenges:-) This is due in large part to our requirement that reservation must precede all provisioning and estimated provisioning time (pure craziness:-) To elaborate a bit on the topic of manual provisioning... If the requested start time represented the start of the reservation (what we call Resource Time), then manual provisioning is a snap. The RA stipulates when the beginning of provisioning will take place, and then the RA initiates it at that time. This is in fact a very simple and deterministic process because it is entirely message driven from the top of the tree down to the NRMs. We only run into confusion when we cross purposes and say the requested start time from the RA represents a time when we want the provisioning to have completed - and then say we'll initiate the provisioning. These are IMHO counter to one another. IMO, for manual circuits, the Requested Start Time is the beginning of the Provisioning phase, or the "Resource Time" referenced above. And for auto-start circuits, the Requested Start Time can represent either the Resource Time or the Available Time (Provisioning Start or In-Service Start respectively). For autostart, there is no fundamental dependency that we prepend an estimated provisioning time to the requested start time...that is just a convention by which we agreed the network would assume responsibility for meeting the user's deadline. We could agree otherwise as well. Frankly, it would make for a simpler and more reliable protocol if the Requested Start Time always represented the beginning of the Resource Reservation, AND provisioning time was always considered part of the reservation. This would keep us out of the game of estimating future performance...which will never be exact, and is always wasteful. As a complementary suggestion, I would have the End Time always represent the end of the reservation, and any de-provisioning that takes place does so after that time. Finally, this would make it easy to implement either autostart or manual start, as all NSAs would have reserved the resources for a common start time. Jerry
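A minimal sketch contrasting the two provisioning triggers discussed above - a locally scheduled timer for auto-start versus an explicit Provision message for manual start. The function and parameter names are assumptions for illustration only:

    from datetime import datetime, timedelta

    def should_provision(mode, requested_start, est_setup, now, provision_msg_seen):
        # Auto-start: fires on a locally scheduled time (here, the convention of
        # prepending the setup estimate so the circuit is in service by the
        # requested start). Manual start: entirely message driven.
        if mode == "auto":
            return now >= requested_start - est_setup
        if mode == "manual":
            return provision_msg_seen
        raise ValueError("unknown provisioning mode")

    start = datetime(2010, 10, 1, 17, 0)
    now = datetime(2010, 10, 1, 16, 59, 45)
    print(should_provision("auto", start, timedelta(seconds=30), now, False))    # True
    print(should_provision("manual", start, timedelta(seconds=30), now, False))  # False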
Hi all I would like to point out several things. 1. Automatic provisioning cannot be used for an on-demand (or immediate) reservation in a tree. When you want to provision an inter-domain connection through multiple networks (domains), some networks will provision the requested intra-domain connections even if some other network denies the request. So, for on-demand, the requester must confirm that all participating NSAs agree to provide intra-network connections first, and then let all NSAs provision. So, explicit (manual) provisioning must be used for on-demand. For automatic, the start time should be sufficiently far in the future. (2PC solves this problem, but we decided not to use 2PC for ver. 1) 2. Service start time from the user's point of view is very difficult to define. For example, for an Ethernet connection, there may be L2 switches in the networks and/or at the edge. ARP exchange may take some time, and if STP (Spanning Tree Protocol, here) is working, it will take a longer time before the user can actually exchange packets. 3. For some applications, the application might not want a connection in service before/after a certain time for security reasons. Tomohiro
Hi Tomohiro - these are good points -- On Oct 6, 2010, at 11:28 AM, Tomohiro Kudoh wrote:
Hi all
I would like to point out several things.
1. Automatic provisioning cannot be used for an on-demand (or immediate) reservation in a tree. When you want to provision an inter-domain connection through multiple networks (domains), some networks will provision the requested intra-domain connections even if some other network denies the request. My understanding has been that on-demand actually had to do a reservation and then an immediate provision. I think this is implied here as well. So, the problem is that a) a reservation can be made and then backed out if a parent NSA is not successful in making all parallel reservations; and b) if provisioning is automatic then the reserved segments will be provisioned and then the reservation backed out - presumably requiring the provisioning to be backed out.
So, for on-demand, the requester must confirm all participating NSAs agree to provide intra-network connections first, and then let all NSAs to provision. So, explicit (manual) provisioning must be used for on-demand. This makes sense to me.
For automatic, the start time should be sufficiently far in the future. (2PC solves this problem, but we decided not to use 2PC for ver. 1) Defining "sufficiently in the future" is the problem here. Presumably this means far enough in the future that the reservation can be backed out if necessary by the parent before the NRM starts automatic provisioning.
2. Service start time from the user's point of view is very difficult to define. For example, for an Ethernet connection, there may be L2 switches in the networks and/or at the edge. ARP exchange may take some time, and if STP (Spanning Tree Protocol, here) is working, it will take a longer time before the user can actually exchange packets. This is true, and the "available time" on the network marks when the user can start trying to use the connection and when things like ARP and STP can set up. A question for me is whether the user should wait until it gets an ack (or notify) from the provider before starting to try the connection.
3. For some applications, application might not want a connection in service before/after certain time for security reasons. This is a good point. I wonder if this is something the network can provide or if the user should do this (make itself available) at specific times that match network available time in a predictable way
John
Tomohiro
Some comments below - I will include some of this in a revised version of my doc later today -- On Oct 6, 2010, at 10:15 AM, Jerry Sobieski wrote:
Hi John- This is a pretty good summary. Just a few comments (I can never resist:-) inline...
John Vollbrecht wrote:
I would suggest that there are at least several issues that need to be resolved here. I try to define the issues and what my solution to each of them would be - at least right now.
--
One is how to manage time across multiple NSAs - this is a problem when there are two NSAs, and more of a problem when there may be multiple NSAs involved in making a connection.
1) time syncrhonization Jeff has suggested that we require NSAs to run NTP to synchronize time between NSAs. The only down side I see of this is that it requires each NSA to run NTP. If we agree on this approach, then we might want to investigate if only provider NSAs need to run NTP, assuming a non provider will talk to a provider - this might make it easier for applications to be NSAs.
The up side is that a request for starting and ending time is understood the same way by all NSAs. This means that a request that goes to multiple lNSAs in an authorization sequence will all see the same time and understand it the same way.
*I vote for requiring NTP and working out the details - especially the potential difference in scheduling between segments because of possible skew in NTP time across NSAs.
IMO - if we *require* the NSAs to be synchronized around a common global time (which is what NTP does), we need to do three things: 1. state clearly and concisely *why* this is required (I do not think it is *required*) 2. specify the accuracy that is required to meet #1 3. come up with a means to verify time synchronization (since out-of-sync NSAs must not be tolerated.)
Else, we can *recommend* that the NSAs *should* use a protocol such as NTP to synchronize their time. We can still say why this is useful, but we do not require such synchronization. (I think this is the right approach).
Either way, the protocol must still be designed to handle discrepencies between NSAs. Synchronization is not perfect, nor does it address the non-deterministic processing times and propagation delays that exist in a distributed architectures such as NSI. These are "timing" issues, not "Time" issues (if you get my point:-). The protocol state machine in different NSAs should converge to common state for a connection given enough time for messages to filter appropriately thru the service tree and underlying network. This is what I think we need to consider in more detail.
I vote that we *recommend* NTP or a similar protocol be used to synchronize day clocks so that coordinated scheduling across independent agents is most effective.
I think discussing the purpose of synching clocks is important. In my view the main reason is to allow automatic provisioning to happen at the same time in all NRMs that are involved in creating a connection. I think NTP is a relatively simple way to be sure that all agents see the same time, though there are probably other ways. I think this is a topic that can be broken out from other discussions of time, if we assume that somehow all NRMs will have a "close enough" agreement on time. We can include "close enough" in the final timing diagram as a "time skew" whose size depends on how the time synching is done.
2) Relationship between available and scheduled time 2a) available time definition I think we agree that start and end time (or duration) in a request will be for available time. Available time is time when the connection is actually available to carry traffic. This is the time that is sync'd using NTP. Note that available time is always an estimate because startup duration is variable.
No. The time that is sync'd using NTP is the *system clock* in each NSA. I agree - I wasn't trying to imply anything else.
The Available Time or the Resource Time are simply static database entries in the ResourceDB, or in a "ConnectionDB". These times do not need synchronization, they are specified by users or NSAs. Its the system clocks that need to be synchronized. A "scheduler" process in each NSA periodically compares a scheduled process (say Resource Time) to the "current time" as represented by the system clock. If these two times match, a functional routine is called to process that event. Its important to recognize that any of the times we put in the ResourceDB represent allocation windows that we want to coincide, but it is the system clock - that is continually changing - that gets synchronozed by NTP.
I do concur that the Available Time represents the time at which the circuit is intended to be In-Service and carrying traffic.
I disagree that the Available Time is an estimate. It is a target. It is the Requested Available Time (from the RA) or the Committed Available Time (from a PA). Actually, I think it is the estimated (committed) time from the PA. This consists of an estimated start time, which is what causes the total available time to vary. The teardown time is committed - the network *will* start teardown at a specific time.
But it is not an estimate. The *estimated* time is the "Estimated Provisioning Duration" which cannot be known exactly in advance. The variance in Actual Provisioning Duration causes the Actual Available Time to vary - which is why you called it an estimate. But IMO we should be clear that there are a number of times being batted about regarding when the circuit is supposed to be usable. IMO there are three: Requested Available Time, [I would say estimated] Committed Available Time, and Actual Available Time.
2b) resource or scheduled time This is the time a resource is scheduled for by its own NRM. It calculates this time by inserting startup time for resources it controls before the requested start time, and inserts tear down time after the requested end time. This time is different for every NRM and may be different for different equipment within an NRM - that is an NRM implementation. In automated provisioning resource time is not shared with other NSAs. The impact of this will be when trying to schedule connections close to each other, they connnection reservations must be separated by startup and teardown times in participating NSAs.
*I would like to see us accept these time concepts in talking about time issues. There are probably issues I haven't thought of yet that should be included in the descriptions.
With the caveats and nuances I mentioned above about sync and the Avail Time distinctions, I think you have captured the sentiment well.
3) Difference between automatic and manual provisioning 3a) Automatic provisioning is defined as having each NRM initiate connection so that it is estimated to be available at the requested start time. The time is estimated. It would be good to put some sort of bound on this, similar to what is done for NTP. This requires that the provider be able to make a estimate of startup and teardown time such that it occurs in a predicable way. Failure of a provider to do so would be an issue for SLA.
I would be pedantic and say that NSI does not specify what the NRM does. NSI *only* specifies what NSAs do. Auto Start is defined as each NSA initiating provisioning independently based upon a locally scheduled provisioning start time. Manual start is where each NSA awaits a ProvisionRequest message from another authorized NSA to initiate provisioning.
I think trying to bound the error on "best estimates" is pointless, and frankly out of scope. The start time is either met, or it is not. It's like On-Time Departure figures for an airline... they are estimates and do not guarantee anything. They are of questionable accuracy, self-stated, and not consistently computed. Is "On-Time" actually on time, or within 5 minutes of scheduled? Does an hour late count as a missed departure the same as a 5-minutes-late departure? The way to improve this IMO is to simply say that, a priori, "Available Time" is either Requested Available Time or Committed Available Time. If the post facto Actual Available Time occurred after the Committed Available Time, tough. It happens, it's unavoidable, get over it. From a protocol standpoint, once the Committed Available Time has passed, the RA has a decision to make: Do I stay? Or do I go? i.e. should I wait a little while to see if things fall into place? Or should I give up and tear down the reservation? These are the only protocol options.
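A minimal sketch of that "stay or go" decision on the RA side (Python; the function and message names are hypothetical):

    def on_committed_time_elapsed(connection, elapsed_s, grace_period_s):
        # The circuit is not In-Service by the Committed Available Time.
        if elapsed_s < grace_period_s:
            return "WAIT"                     # stay: give the provider a little longer
        connection.send_release_request()     # go: give up and tear down the reservation
        return "RELEASED"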
The user may consult an SLA to see if there is recourse, but that is not part of the NSI protocol. The savvy user would institute an SLA before the fact to ensure that there is substantial incentive for the Provider to meet the Committed Available Time.
Doing some studies to track on-time departures could be useful, but it is not part of the NSI protocol. It is way off track in my opinion. It is something another tool or other software (perfSONAR?) should be doing.
3b) Manual provisioning is defined as having a provision message sent to an NSA to initiate provisioning. This capability requires that the requestor know when a connection is available to be provisioned. Presumably this time is the start time in the reservation minus startup time. How the requestor knows this time is not clear, though it is possible to build a protocol that would make it available. In addition, the method of determining the start time when a provision request is sent through a sequence of NSAs is also difficult, though not impossible.
* I would vote to include automatic provisioning in the V1 architecture and protocol, and to include manual provisioning as a future item in the architecture but not in the V1 protocol. This gives us time to explore manual provisioning and its use cases before trying to define how it is implemented.
I agree. Leave Manual provisioning for future.
However, we still have ASAP requests to deal with. I think manual start and ASAP circuits may share many of the same challenges :-) This is due in large part to our requirement that reservation must precede all provisioning, and to the estimated provisioning time (pure craziness :-) This is also the point I think Tomohiro makes in his note.
To elaborate a bit on the topic of manual provisioning... If the requested start time represented the start of the reservation (what we call Resource Time), then manual provisioning is a snap. The RA stipulates when the beginning of provisioning will take place, and then the RA initiates it at that time. This is in fact a very simple and deterministic process because it is entirely message driven from the top of the tree down to the NRMs. We only run into confusion when we cross purposes and say the requested start time from the RA represents a time when we want the provisioning to have completed - and then say we'll initiate the provisioning. These are IMHO counter to one another.
IMO, for manual circuits, the Requested Start Time is the beginning of the Provisioning phase, or the "Resource Time" referenced above. And for auto-start circuits, the Requested Start Time can represent either the Resource Time or the Available Time (Provisioning Start or In-Service Start respectively). For autostart, there is no fundamental dependency that we prepend an estimated provisioning time to the requested start time... that is just a convention we agreed to, whereby the network assumes responsibility for meeting the user's deadline. We could agree otherwise as well.
Frankly, it would make for a simpler and more reliable protocol if the Requested Start Time always represented the beginning of the Resource Reservation, AND provisioning time was always considered part of the reservation. This would keep us out of the game of estimating future performance... which will never be exact and is always wasteful.
This seems a network-centric view. From an application point of view it would be simpler if the time was always the available time, even if it is only a good estimate. I think this is where you have an issue with the SLA (I might be wrong of course). If the available time is something the network promises to provide (within some bound), then it can be spelled out in an SLA. If it doesn't meet that time, then the terms of the SLA kick in. If you need something in the protocol to determine whether the SLA has been met, then this should be included in the protocol. Saying one never knows when a connection will be available is a recipe for having no one ever use the service.
As a complementary suggestion, I would have the End Time always represent the end of the reservation, and any de-provisioning that takes place does so after that time. Finally, this would make it easy to implement either autostart or manual start as all NSAs would have reserved the resources for a common start time.
Jerry
Jerry, My comments in-line
I agree with your statements:
The user is not interested in partial results, as he/she is not even aware of or interested in which NSAs/domains are involved. The user doesn't care (if everything works fine ;) ).
The protocol should be designed with the user in mind. The user does not care about guard time values or differences in setup times for MPLS vs optical lambdas, and does not concern itself with the choices an NSA/NRM will make in path-finding.
The protocol designers can keep the user in mind, but the protocol is between the RA and the PA and has a specific purpose: to reserve and instantiate a connection across the globe. We need to keep in mind that the RA is not always the end user - it is by definition another NSA and could be an NSA in the tree/chain somewhere. If we want to differentiate between the user and the network, then we can create a simplified User-to-Network API and a different Network-to-Network API... but I don't think that's what we want to do (:-) We need IMO to *not* think about the user, but to think about the Requesting Agent - regardless of who it represents.
Of course, I understand the intermediate RA-PA interaction. The main purpose of highlighting the User here is to indicate that the Start Time and End Time come from the originating RA, and to differentiate these from those of the intermediate RAs.
While the guard times may be network specific, we do need to at least consider what we would like an NSA to do if, for instance, a provisioning guard time pushes the start of a reservation into a previous reservation. Do we:
1) reject the request, since we can't prepend our guard time and still make the Requested Start Time? OR
2) retard the Estimated Start Time to allow for the guard time? OR
3) reduce the guard time to fit the available lead time?
I think we now agree that the Start Time is just an estimate, due primarily to the guard time itself being just an estimate. So none of these times are etched in stone... So which option do we recommend or require? The protocol is sensitive to these various times - they cause timers to go off, messages to be sent, error handling to kick in... If they are adjusted during scheduling or provisioning, we MUST understand what impact they will have on the protocol and how that will be carried through the service tree.
I propose #1. Anything else falls under negotiation, which we have punted on.
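A minimal sketch of option 1 (Python; the reservation-calendar lookup is a hypothetical helper):

    def check_guard_time(calendar, requested_start, guard_s):
        # Reject the request if prepending the provisioning guard time would
        # overlap an existing reservation on this resource.
        provisioning_start = requested_start - guard_s
        if calendar.overlaps(provisioning_start, requested_start):
            raise ValueError("cannot prepend guard time and still meet the Requested Start Time")
        return provisioning_start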
Participants (11): Aaron Brown, Artur Barczyk, Evangelos Chaniotakis, Gigi Karmous-Edwards, Guy Roberts, Inder Monga, Jeff W. Boote, Jerry Sobieski, John Vollbrecht, Radek Krzywania, Tomohiro Kudoh