Nice distinction.

On Apr 14, 2010, at 9:54 AM, Artur Barczyk wrote:
Hi Jerry,
very good points...
On 04/14/2010 02:19 PM, Jerry Sobieski wrote:
Hey Radek-
Two points:
1) We should distinguish between "timing" issues, and "scheduling" issues. Most race conditions can be easily handled in the protocol, so I am not too worried about those other than to just be able to identify them all...
2) But where timing issues interact with scheduling we have this other problem of network time synchronization. And given today's protocol processing times, a difference of seconds may as well be a difference of days. For instance, if one NSA starts unilateral provisioning of a connection and passes a Provision() or ProvisionComplete() message to its neighbor NSA in the service tree, what if that neighbor has a clock that is different by 1 second? Is this an error? Should the later NSA wait for 1 second? Does it ignore its own clock? What if the delta was 30 seconds? Or 10 minutes? If we do not pin these issues down and bound them, then the protocol will not behave as we hope it will.
The point is IMO that we need to distinguish the time synchronisation of the NSAs and the state synchronisation of the segments:
I don't see a problem with the NSAs being synchronised to within a dt (seconds or even minutes) - that's just for bookkeeping (circuit database or whatever one wants to call it) of resources. Of course too large a desynchronisation will lead to effects like a resource believed to be available while it is not or vice versa, so we should aim at minimising this. Again, simple NTP should do for starters. Then, the circuits should be in service (IS) some time before the desired start time, say Dt. All one needs is Dt>dt (or '>>' to be on the safe side) to avoid disappointed users. That's arguably not a protocol/standard issue, but an implementation one; I agree with Radek.
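The Dt > dt condition above can be written down as a small check. This is only a sketch: the function names, the `safety_factor` parameter (standing in for the '>>' case) and the example values are assumptions made for illustration, not part of any NSI specification.

```python
from datetime import datetime, timedelta


def lead_time_is_safe(lead_time: timedelta,
                      max_clock_skew: timedelta,
                      safety_factor: float = 1.0) -> bool:
    """Return True when the in-service lead time Dt exceeds the skew bound dt.

    safety_factor > 1 is a hypothetical way to express Dt >> dt.
    """
    return lead_time > max_clock_skew * safety_factor


def in_service_deadline(start_time: datetime, lead_time: timedelta) -> datetime:
    """Latest moment the circuit should be in service: start time minus Dt."""
    return start_time - lead_time
```

For example, with NTP keeping NSAs within 30 seconds of each other, a lead time of 5 minutes passes the check even with a safety factor of 5.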
Then, the circuit segments' state in each domain has to be synchronised, and that should be event driven. That's where the protocol kicks in. Each segment will traverse the sequence of I->Res->Sch->Prov->IS->Rel (from Guy's email, and in the simple case everything's dandy, of course other states and transitions need to be taken into account in case of failures). What needs to be defined is the conditions for the transitions, and who does it. E.g. should the system wait for all domains to confirm Res before proceeding to Sch? Yes. Should it wait for all to be in Sch to proceed to Prov? Definitely. And here I wouldn't rely on time synchronisation between NSAs with a good-guess guard time, but on message exchange.
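A minimal sketch of that event-driven rule, assuming the simple failure-free sequence only: the connection advances to the next state when every domain has confirmed by message, never on a timer. The class and method names are hypothetical, and the state names are expansions of the abbreviations above.

```python
from enum import Enum


class State(Enum):
    INITIAL = "I"
    RESERVED = "Res"
    SCHEDULED = "Sch"
    PROVISIONED = "Prov"
    IN_SERVICE = "IS"
    RELEASED = "Rel"


# Enum iteration follows definition order, which is the happy-path sequence.
ORDER = list(State)


class Connection:
    """Tracks one multi-domain connection; failure states are not modelled."""

    def __init__(self, domains):
        self.state = State.INITIAL
        self.domains = set(domains)
        self.confirmed = set()

    def confirm(self, domain):
        """Record a domain's confirmation; advance only when all have confirmed."""
        if domain not in self.domains:
            raise ValueError(f"unknown domain: {domain}")
        if self.state is State.RELEASED:
            raise RuntimeError("connection already released")
        self.confirmed.add(domain)
        if self.confirmed == self.domains:
            self.state = ORDER[ORDER.index(self.state) + 1]
            self.confirmed = set()
        return self.state
```

With two domains, the first confirmation leaves the connection in I and the second one moves it to Res, so no part of the path can be in provisioning while another is still reserving.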
Hope that makes sense.
Cheers, Artur
To minimize these timing issues (not really race conditions) we could require a more accurate (GPS?) clock, but even this will not deterministically solve the issue, just reduce its likelihood of occurrence. As fast as processors are today, a clock would have to be synch'd to within a few milliseconds to make this a non-issue. We may be able to develop some sort of event timing easement as part of the protocol...an event that occurs within some delta of another event is considered simultaneous...these are not simple, but they could solve the protocol timing determinism issue...
I suspect we can solve some of these timing/scheduling interaction issues with simpler protocol assertions: e.g. "Provisioning begins at the Start Time". For some it will be ok to do this, and for others we'll want a more accommodating approach. (For the record, this example is IMO one we should be more accommodating with:-)
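The "timing easement" idea can be illustrated with a tolerance window. This is a sketch under the assumption that the tolerance is a single agreed value; the function name and the example numbers are hypothetical.

```python
from datetime import datetime, timedelta


def is_simultaneous(t_a: datetime, t_b: datetime, tolerance: timedelta) -> bool:
    """Treat two events as simultaneous when they fall within the tolerance."""
    return abs(t_a - t_b) <= tolerance
```

With a tolerance of 2 seconds, a neighbor whose clock is 1 second off would be treated as in step, while a 30-second delta would still need explicit handling.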
Jerry
Radek Krzywania wrote:
Hi, Indeed, I forgot about NTP. But still my opinion is that we are unable to assure time precision at the level of seconds. Minutes are far more probable.
Regarding race conditions, it's not the role of the protocol to prevent it. Protocol operates in the area of single service definitions (how to request and process the request), while software will deal with simultaneous requests at different states and distributed in time (also overlapping). That's my opinion, unless someone will convince me otherwise :)
Best regards Radek
________________________________________________________________________
Radoslaw Krzywania                      Network Research and Development
                                        Poznan Supercomputing and
radek.krzywania@man.poznan.pl           Networking Center
+48 61 858 20 28                        http://www.man.poznan.pl
________________________________________________________________________
-----Original Message-----
From: Artur Barczyk [mailto:Artur.Barczyk@cern.ch]
Sent: Tuesday, April 13, 2010 3:28 PM
To: radek.krzywania@man.poznan.pl
Cc: 'Inder Monga'; nsi-wg@ogf.org; 'Guy Roberts'
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf call minutes)
On 04/13/2010 03:14 PM, Radek Krzywania wrote:
Hi,
What is a hard deadline service? Any example? Is it synchronised with GPS? With what is it synchronised? What does it mean that I want a reservation at 14:34 GMT? Is it 14:34 on the requestor's clock, an atomic clock in e.g. Switzerland, synchronised GPS time (still ms of differences)? Different time zones, different clocks. If you do not synchronise domain clocks you can't talk about time in as exact a manner as I feel you want to. Which clock are we referencing?
I think it's not as bad as it sounds, NTP precision is enough at the time scales we will ever be able to aim at reaching. :-)
Being honest, I am not really against "thrashing", and especially not against race conditions. It will be an issue when the number of requests is quite high and competition for resources is high. For now, facing the current demand for dynamic services, it's not an issue at all. Not in version 1. Besides, how to solve race conditions is more an implementation issue (out of scope then), not a protocol one.
Radek, here I think you're wrong, sorry. In the context of multi-domain, the protocol has to be defined in a way to avoid pitfalls such as race conditions. (among other things)
Cheers, Artur
Best regards
Radek
From: Inder Monga [mailto:imonga@es.net]
Sent: Tuesday, April 13, 2010 2:49 PM
To: Artur Barczyk
Cc: radek.krzywania@man.poznan.pl; nsi-wg@ogf.org; 'Guy Roberts'
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf call minutes)
All,
I agree about deterministic behavior. That is what we are all shooting for :) I am thinking in terms of state machines as well.
What I am hearing both of you state is that "Start Time" is not really a "Start time"...it is ASAP after "Start time" in case things are not complete? This is fine for a data movement service without hard deadlines, but how will you ensure this for a video conf system that needs to start at a particular time? We have to think of all possible application services that can use NSI.
Radek, maybe Guard-time is being misunderstood - I am merely suggesting a gap before which Advanced Reservation Requests are not processed by the domain. There is nothing non-deterministic and immeasurable about that. It is a fixed, albeit arbitrary, value. This reduces the chances of the provisioning system across domains "thrashing" - i.e. reserving resources and maybe releasing them because the connection did not happen in time.
Regardless of the decision on guard-time, for deterministic behavior we need to define handling for many error conditions, including start time arriving while the reservation is incomplete and start time arriving while provisioning is incomplete.
Enjoying the discussion,
Inder
On Apr 13, 2010, at 5:26 AM, Artur Barczyk wrote:
Hi Radek,
agree, but just to note, it's not about deterministic time, but deterministic behaviour I am worried about. I don't see a stable system where one part can be in provisioning while another in reservation. Guard time will not solve this by itself, even if you make it 2 months :-)
Cheers, Artur
On 04/13/2010 02:15 PM, Radek Krzywania wrote:
Hi,
I tried to catch up on the discussion, hope I did not miss anything.
What is hard for me to understand is why we are trying to define measurable parameters (connection activation time) based on non-deterministic, immeasurable parameters (guard time). Even if we measured how much time it takes to reserve and activate a connection in a domain, we have only a statistical view of how much time it MAY (SHOULD) take. Any change to the network, NSI architecture, HW, or even SF may extend this time unexpectedly, without prior notification. This is not something we can measure (or we need to do that constantly, changing the guard time value every time, which in fact does not solve everything). IMHO we can't promise something we could not prove or be sure of. I am happy to measure guard time, add a safe value (e.g. res + activation takes 4 minutes, + 2 minutes safe time = 6 minutes) and say that we SHOULD deliver a connection in less than 6 minutes. If we say we MUST provide it in less than 6 minutes, we have an issue.
I am rather more familiar with the option where the connection is delivered as soon as possible, which means each domain performs reservation, then signalling is initialized immediately after resources are booked. Does the user care if he gets it now = current time, or = current time + "guard time or whatever"? I suppose not. If I want a circuit now, I expect to get it ASAP, which does not mean it's deterministic. I am fine with knowing I will get it in around 6 minutes (statistically), but I must be immediately notified about activation. If we want to go into time details, we will get into very funny things like GPS synchronisation between users, NSA agents, networks, and domains. This is not a real-time system, not everything is deterministic, and not everything can be guaranteed. We can reconsider the naming of the service, and change it from immediate to ASAP.
I am not sure if we should focus on this small issue, while facing resource guarantees in advance reservation mode. Try to guarantee anything there at 100% over a 2-month time period :) Even if you assume no network/HW failures.
Best regards
Radek
-----Original Message-----
From: nsi-wg-bounces@ogf.org [mailto:nsi-wg-bounces@ogf.org] On Behalf Of Artur Barczyk
Sent: Tuesday, April 13, 2010 10:11 AM
To: Inder Monga
Cc: nsi-wg@ogf.org; Guy Roberts
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf call
minutes)
Hi Inder,
I see, thanks for this clarification.
I still think we are introducing an artificial decision step here, which
will just be confusing to the end-user (and make the whole system
more complex), and I still wonder about the necessity of it.
Please see in-line:
On 04/12/2010 11:15 PM, Inder Monga wrote:
Hi All,
I feel there is a lot of confusion, so let me try to explain my
case/understanding.
1. Guard-time:
This concept was proposed for Advanced Scheduling only. This can be a
default value and it does not have to be an "exact" measurement of
provisioning times. It only handles path computation and reservation
times across domains.
What does it mean to a user?
A user CANNOT ask for an advanced reservation connection with Tstart <
Tnow + Guard-time. If a user asks with a Tstart lower than Tnow +
Guard-time, the scheduled request is rejected outright.
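The rejection rule just stated is a one-line comparison. A sketch, assuming the provider evaluates it against its own clock; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def accept_advance_request(t_start: datetime,
                           t_now: datetime,
                           guard_time: timedelta) -> bool:
    """Reject outright when Tstart < Tnow + Guard-time, accept otherwise."""
    return t_start >= t_now + guard_time
```

With a guard time of 5 minutes, a request made at 12:00 for 12:05 is accepted and one for 12:02 is rejected immediately.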
Imagine I try to make a connection "NOW", and it gets refused after
N minutes due to lack of resources. Then I try "2 minutes from now", and
it gets rejected straight off.
We shouldn't aim at having expert users who would understand this.
I think the system should behave in the same (and deterministic) way,
independent of what the user states in reservation time.
(Btw - that the reservation and provisioning time might vary does not
make it less deterministic.)
With an Advanced Scheduling function, provisioning initiation can happen
from either the user or the provider.
2. On-Demand Service: In my opinion, Guard-time does not prevent an
On-Demand service as specified by Jerry. They co-exist.
An on-demand service, with Tstart = ASAP can be implemented very easily.
The service starts when the "provisioning complete" message is received
by the user. If the user does not receive that message, it continues to
wait.
Exactly what I was aiming at - but the same logic can apply to any time
between "NOW" and the guard time, or doesn't it?
All you need to do, if the start time is reached before the reservation
is complete, is to wait for the latter.
Does this make more sense?
I will answer specifics below.
[...]
What I meant is that if that time has passed by the time the provider
NSA gets notified of the reservation acceptance along the path, it
should proceed directly to provisioning.
In advanced reservation, the open question is what should a domain do if
Tstart comes, and it has not got a reservation complete or provision
message? Should it delete the connection or provision its own set of
resources? Chin and I include this case in the error recovery document
to be published soon.
No, no - simply wait for the reservation to complete. Only then will
you know if it succeeded in the first place.
IMO, the provisioning and reservation systems cannot be completely
decoupled. The provisioning stage should actually never be reached
until a reservation is complete. It is dependent on the outcome of the
path computation as well as resource reservation. Never go to provisioning
before you know you can have the resources.
You have to do this anyway, to protect against the guard time being
set too short. In which case you can just as well set the guard time to 0.
That's just common sense, IMO, what it means when I would ask for
immediate
circuit provisioning. "Please give it to me as soon as you're able to,
I'm waiting."
The thing not to forget is that someone can ask for a circuit not only
"now",
I think the "now" case is actually, "as soon as possible" - which is the
on-demand case. Then it just waits for the right message from the
Provider Agent before it knows the connection is available to be used.
Yes, absolutely agree - that's a discussion terminology, which I'd be
happy to change :-)
However, we need to be precise on what we mean. An "ASAP" reservation,
from a user's point of view, could mean really "any time possible, starting
from now", i.e. also in 2 hours, if the resources will only then become
available.
I am not sure BoD does mean that.
Will in such a case a BoD reservation be converted into a scheduled one?
but "a minute from now", which would lead to the same problem if the
time to
A minute from now actually becomes a "scheduled connection" and there is
where the problem really starts.
I am sorry I have missed large parts of this discussion, being kept off with
other workload. Sorry if I am coming back to things which might be obvious
to you by now.
But I do not really understand where the problem really is.
You mention the provisioning system to have to decide what to do
if the reservation step is not complete - but I think the right design
decision
would be that the system should never actually be in such a state.
(Sorry, I am falling into thinking in terms of state machines here, but
well,
that's what I start to believe would be good here.)
Are there other reasons?
Cheers,
Artur
I feel we should support both Advanced Reservation with guard-time and
On-demand connection service.
Inder
process the reservation is longer than a minute (as it most probably
will be
in the near future).
So the "now" string as in your option 2) would only work for a singular
subset of the
problem.
Cheers,
Artur
Guy
-----Original Message-----
From: Artur Barczyk [mailto:Artur.Barczyk@cern.ch]
Sent: 12 April 2010 17:28
To: nsi-wg@ogf.org
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf
call minutes)
Hi,
I think guard time is a shaky concept, as who can tell how long it should
be - it can/will depend on the number of domains the circuit
contains, the
speed of each reservation/provisioning system as well as the load on the
system, and will be variable over time (hoping for faster
reservation/provisioning
systems in the future).
But: if in step 5, the "wait for start time" means t_start <= t_current,
then the
provider will immediately pass on to provisioning.
What needs to be done however is to have the duration of the reservation
reflect the time difference between desired start time and the effective
one.
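A sketch of this alternative to guard time, assuming provisioning simply begins at the later of the requested start and the moment the reservation completes. The function name is hypothetical, and how the resulting delay is applied to the reservation duration is left open, as it is in the text.

```python
from datetime import datetime, timedelta
from typing import Tuple


def effective_start(requested_start: datetime,
                    reservation_complete: datetime) -> Tuple[datetime, timedelta]:
    """Return the effective provisioning start and its delay past the request.

    If the requested start has already passed when the reservation completes
    (t_start <= t_current), provisioning starts immediately.
    """
    start = max(requested_start, reservation_complete)
    return start, start - requested_start
```

A request for 12:00 whose reservation completes at 12:03 starts at 12:03 with a 3-minute delay; a request for 12:05 whose reservation completes at 12:02 starts on time with zero delay.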
I am sure I am missing something..?
Cheers,
Artur
On 04/12/2010 11:12 AM, Guy Roberts wrote:
Jeroen,
Yes, that is correct. But the mechanism will be the same for
advance reservations, just a later start time.
Guy
-----Original Message-----
From: Jeroen van der Ham [mailto:vdham@uva.nl]
Sent: 12 April 2010 08:19
To: Guy Roberts
Cc: John Vollbrecht; nsi-wg@ogf.org
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf
call minutes)
To sum this up, this describes a situation where there is no prior
reservation and provisioning is started immediately because the
startTime is meant as a "now"?
Jeroen.
On 09/04/2010 18:56, Guy Roberts wrote:
John,
My thinking of how it could work is as follows (though the details
are really part of the protocol definition group's work):
StartTime= time when the provisioning is begun. This is the only
possible meaning for StartTime since we have no way of knowing how
long the provisioning will take in advance of the provisioning
being performed. i.e provisioning completion time is
non-deterministic. For consistency as an asynchronous system, the
completion of provisioning (in- service) is pushed by the NRM to the
Provider which in turn sends this to the Requestor as a notification.
Locally initiated provisioning:
1. The Requester NSA creates a request with a start time
(StartTime). StartTime= NSA's current time + Requester guard time.
E.g. 12:00pm + 5 minutes = 12:05pm.
2. Provider validates the start time as being at least the provider
guard time away from now. (note requester and provider guard times
could be a little different to allow for transmission delay of request)
3. Provider begins the reservation process (12:01pm)
4. Provider completes the reservation (12:02pm)
5. Provider waits for the startTime (12:05pm)
6. Provider starts provisioning locally (12:05pm)
7. Provider waits for confirmation of provisioning from NRM (12:06pm)
8. Provider sends a notification to the requestor NSA to notify
that the connection is in-service (12:06pm)
Provisioning signalled by Requester:
1. The Requester NSA creates a request with a start time
(StartTime). StartTime= NSA's current time + Requester guard time.
E.g. 12:00pm + 5 minutes = 12:05pm.
2. Provider validates the start time as being at least the provider
guard time away from now. (note requester and provider guard times
could be a little different to allow for transmission delay of request)
3. Provider begins the reservation process (12:01pm)
4. Provider completes the reservation (12:02pm)
5. Provider waits for the startTime (12:05pm)
6. Provider waits for the signal to provision (12:10pm)
7. Provider initiates provisioning of the Connection (12:10pm)
8. Provider waits for confirmation of provisioning from NRM (12:11pm)
9. Provider sends a notification to the requestor NSA to notify
that the connection is in-service (12:11pm)
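The two flows above can be replayed with a small timeline function. This is a sketch only: the function and parameter names are hypothetical, the provider guard time of 4 minutes is an assumed value ("a little different" from the requester's 5), and taking the maximum with the reservation completion time is an added assumption for the case where the reservation finishes late.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def provisioning_timeline(received_at: datetime,
                          start_time: datetime,
                          provider_guard: timedelta,
                          reservation_done: datetime,
                          provisioning_duration: timedelta,
                          provision_signal: Optional[datetime] = None
                          ) -> Tuple[datetime, datetime]:
    """Return (provisioning start, in-service time) for one provider."""
    # Step 2: startTime must be at least the provider guard time away.
    if start_time < received_at + provider_guard:
        raise ValueError("startTime is inside the provider guard time")
    # Steps 3-5: reserve, then wait for startTime (assumption: never earlier
    # than the reservation completing).
    begin = max(start_time, reservation_done)
    # Requester-signalled flow: also wait for the signal to provision.
    if provision_signal is not None:
        begin = max(begin, provision_signal)
    # Remaining steps: NRM confirms, in-service notification goes out.
    return begin, begin + provisioning_duration
```

With the times from the lists above (request received 12:01, startTime 12:05, reservation complete 12:02, one minute to provision), the locally initiated flow gives 12:05 and 12:06, and a requester signal at 12:10 gives 12:10 and 12:11.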
Guy
-----Original Message-----
From: John Vollbrecht [mailto:jrv@internet2.edu]
Sent: 09 April 2010 17:28
To: Guy Roberts
Cc: John Vollbrecht; Tomohiro Kudoh; Jeroen van der Ham;
nsi-wg@ogf.org
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf
call minutes)
I am still a bit confused. Perhaps someone could do a timing diagram
like the one Tomohiro did a while ago when we were discussing 2 phase
commits.
I will try to explain my confusion. My understanding has been that
we
agreed that provisioning would never be done without prior
reservation. So it would seem that the question being discussed is
"what is the time being requested in a reservation". If the
reservation succeeds then provisioning can happen.
It seems to me one question is how to define the start time being
requested. The options seem to be that it is either 1) the time the
circuit is actually provisioned and ready to use or 2) the time that
provisioning of the circuit starts. In one case the previous
connection may terminate sooner by the guard time and in the latter
it
may start later by the guard time. If it is (1) then a connection
scheduled for now must have been started at [now - (start time)].
A second question is whether it is possible to request a connection
that starts "now". This implies reserving a connection and
initiating
it as soon as it is reserved. Assume that start time is when
provisioning a circuit starts (case 2 above). It seems that main
issue with this is whether the time to reserve a connection is longer
than the requestor is willing to wait. The time it takes depends on
how many NSAs are "chained" to satisfy the request and how long each
NSA takes to reserve the connection. This time is "authorization
time" not guard time as I understand it.
There is another issue with defining authorization as "now" instead
of
a specific time. The problem is that each NSA in a chain will think
authorization happens at a slightly different time. I am not sure
how
important this is - it doesn't seem too important to me, but
perhaps I
am wrong. If provisioning starts after the reservation is complete,
then everything should be reserved, if at a slightly different time.
----------------------------------
I think Guy is suggesting that start time is when provisioning starts
(case 2) above. That seems simplest to me.
I am not sure the provisioning time is important, and if not I would
think it good to include "immediate" reservation
John
On Apr 9, 2010, at 11:15 AM, Guy Roberts wrote:
Tomohiro,
In this case, only some parts of inter-network connection will be
provisioned.
Right, I forgot about this reason - it is a good point. Again, I
think we are not complicating things too much if we have a rule that
the Requester NSA cannot send a start time sooner than now+guardtime.
I think we can solve the chain issue by not forcing any value for
the guard time. This can be a policy decision to suit the service
type, equipment and number of networks involved.
Guy
-----Original Message-----
From: Tomohiro Kudoh [mailto:t.kudoh@aist.go.jp]
Sent: 09 April 2010 09:04
To: Jeroen van der Ham
Cc: nsi-wg@ogf.org
Subject: Re: [Nsi-wg] Immediate/Advance reservation (Re: NSI conf
call minutes)
Hi Jeroen,
There is a problem for inter-network connection. During the
discussions
in some calls, the problem of synchronizing networks (managed by
different NSAs) was discussed.
If you use the "now" type request for inter-network connection
(without
complicated coordination), the actual provisioning time of networks
may
be different. Moreover, some networks may provision resources before
some other networks reply to the request, and such networks might deny
the request. In this case, only some parts of inter-network connection
will be provisioned.
The guard time is one of the simple solutions to solve this problem. I
understand there can be multiple ways to cope with this, but all of
them
will introduce some complication to some part (note that we decided
not
to use 2PC for the v1.0). This is a design choice matter.
Regards,
Tomohiro
On Thu, 08 Apr 2010 09:27:59 +0200
Jeroen van der Ham <vdham@uva.nl> wrote:
On 07/04/2010 15:02, Tomohiro Kudoh wrote:
If a requester wants resources to be provisioned as soon as
possible, it
can set the start time parameter in an advance request to:
(current time + guard time + a certain time required for message
delivery).
In this way, immediate provisioning can be requested by an advance
reservation request.
The procedure above seems overly complicated. Suppose I really am
pressed for time and I miscalculate the (current time + guard time +
delivery time) by a few seconds: denying the request means that I have
to do it all over again, making me even more pressed for time.
Why not keep things simple and always interpret a start time in the
past
as "now"? (provided the end-time is in the future too)
Would there be any problems associated with that?
Jeroen.
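The two proposals in this message can be put side by side. A sketch only: the function names are hypothetical, the first function is the requester-side calculation quoted from Tomohiro, and the second is the suggested rule of reading a past start time as "now".

```python
from datetime import datetime, timedelta


def earliest_advance_start(now: datetime,
                           guard_time: timedelta,
                           delivery_time: timedelta) -> datetime:
    """Requester side: current time + guard time + message delivery time."""
    return now + guard_time + delivery_time


def normalize_start(start: datetime, end: datetime, now: datetime) -> datetime:
    """Provider side: interpret a start time in the past as "now".

    The request is only valid when the end time is still in the future.
    """
    if end <= now:
        raise ValueError("end time is not in the future")
    return max(start, now)
```

Under the second rule a request whose start time is a few seconds stale is not denied; it is simply started at the provider's current time.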
_______________________________________________
nsi-wg mailing list
nsi-wg@ogf.org
http://www.ogf.org/mailman/listinfo/nsi-wg
--
Dr Artur Barczyk
California Institute of Technology
c/o CERN, 1211 Geneve 23, Switzerland
Tel: +41 22 7675801
---
Inder Monga            http://100gbs.lbl.gov
imonga@es.net
(510) 499 8065 (c)
(510) 486 6531 (o)
---
Inder Monga http://100gbs.lbl.gov imonga@es.net <mailto:imonga@es.net> http://www.es.net (510) 499 8065 (c) (510) 486 6531 (o)