Re: [SAGA-RG] missing(?) method reporting last modification time

1 Jun 2009

Thank you, Brother Andre, for setting us straight by quoting from the
book of SAGA.
Now go ahead in peace to love and serve the grid...

Amen.

-john

On May 30, 2009, at 9:25 AM, Andre Merzky wrote:
...
John, Thilo,
allow me to quote from the Holy Book of SAGA, Scripture 2.8
"Execution Semantics and Consistency Model":
SAGA API calls on a single service or server can occur
 concurrently with (a) other tasks from the same SAGA
 application, (b) tasks from other SAGA applications, or
 also (c) calls from other, independently developed
 (non-SAGA) applications.  This means that the user of
 the SAGA API should not rely on any specific execution
 order of concurrent API calls.  However,
 implementations MUST guarantee that a synchronous
 method is indeed finished when the method returns, and
 that an asynchronous method is indeed finished when the
 task instance representing this method is in a final
 state.  Further control of execution order, if needed,
 has to be enforced via separate concurrency control
 mechanisms, preferably provided by the services
 themselves, or on application level.
[... at most once ...]
Beyond this, the SAGA API specification does not
 prescribe any consistency model for its operations, as we
 feel that this would be very hard to implement across
 different middleware platforms.  A SAGA implementation MAY
 specify some consistency model, which MUST be documented.
 A SAGA implementation SHOULD always allow for application
 level consistency enforcement, for example by use of
 application level locks and mutexes.
Related to that is Scripture 2.6.4 "Concurrency Control":
Although limited, SAGA defines a de-facto concurrent
 programming model, via the task model and the asynchronous
 notification mechanism. Sharing of object state among
 concurrent units (e.g. tasks) is intentional and necessary
 for addressing the needs of various use cases. Concurrent
 use of shared state, however, requires concurrency control
 to avoid unpredictable behavior.
(Un)fortunately, a large variety of concurrency control
 mechanisms exist, with different programming languages
 lending themselves to certain flavors, like object locks
 and monitors in Java, or POSIX mutexes in C-like
 languages. For some use cases of SAGA, enforced
 concurrency control mechanisms might be both unnecessary
 and counter productive, leading to increased programming
 complexity and runtime overheads.
Because of these constraints, SAGA does not enforce
 concurrency control mechanisms on its implementations.
 Instead, it is the responsibility of the application
 programmer to ensure that her program will execute
 correctly in all possible orderings and interleavings of
 the concurrent units. The application programmer is free
 to use any concurrency control scheme (like locks,
 mutexes, or monitors) in addition to the SAGA API.
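
As an illustration of the application-level concurrency control the quoted section leaves to the programmer, here is a minimal Python sketch using a `threading.Lock`. The `FakeEntry` class and its logical `mtime` counter are hypothetical in-memory stand-ins invented for this example; they are not SAGA classes. Note that such a lock only orders tasks within one application, not calls from other applications.

```python
import threading


class FakeEntry:
    """Hypothetical in-memory stand-in for a remote file entry."""

    def __init__(self):
        self.content = ""
        self.mtime = 0  # logical clock, bumped on every write

    def write(self, data):
        self.content = data
        self.mtime += 1


entry = FakeEntry()
lock = threading.Lock()  # application-level lock, outside the API
observed = []


def write_and_check(data):
    # Holding the lock makes "read timestamp, write, read timestamp"
    # atomic with respect to the other tasks of this application.
    with lock:
        before = entry.mtime
        entry.write(data)
        after = entry.mtime
        observed.append((before, after))


threads = [
    threading.Thread(target=write_and_check, args=(f"v{i}",))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held, every task sees its own write reflected as exactly one timestamp step, regardless of how the threads are interleaved.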
Again related, Commandment 2.6.5 calls for thread safety
for all implementations which want to obtain the blessings
of the church of SAGA.
So, that may be enough from the big story book for today I
think.
Best, Andre.
Quoting [Thilo Kielmann] (May 29 2009):
...
Well, the intended semantics was reporting the remote state.
Of course, in the presence of asynchronous operations (or simply operations
done by other, e.g. non-SAGA, processes), all kinds of problems and race
conditions can occur. But that's just plain normal.
I see nothing special with time stamps that we would not have with anything
else we already have.
Thilo
On Fri, May 29, 2009 at 06:39:08AM -0700, John Shalf wrote:
...
Cc: saga-rg@ogf.org
From: John Shalf <JShalf@lbl.gov>
To: Thilo Kielmann <kielmann@cs.vu.nl>
Subject: Re: [SAGA-RG] missing(?) method reporting last modification time
It just makes the problems with an undefined consistency model clearer,
because the ordering of requests may result in incorrect timestamps. The
problem is more difficult when dealing with timestamps because that
hits the metadata server, which is a different data path than data
storage. Users will assume POSIX-like consistency, where after
  a = check_timestamp(file foo)
  write(file foo)
  b = check_timestamp(file foo)
b will always be greater than a. For remote connections, this is
more difficult to guarantee unless you are explicit about the
underlying consistency model. Are you going to claim POSIX
consistency? If so, then it doesn't have anything to say about the
async case.
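
The three-step check above can be sketched in Python against a local file. This is only a sketch under stated assumptions: `check_timestamp` is a hypothetical helper wrapping `os.stat`, and the temporary file stands in for "file foo". It shows the local, synchronous behaviour users tend to expect, not anything a remote service guarantees.

```python
import os
import tempfile


def check_timestamp(path):
    """Modification time in nanoseconds since the epoch (local stat call)."""
    return os.stat(path).st_mtime_ns


# Throwaway local file standing in for "file foo".
fd, foo = tempfile.mkstemp()
os.close(fd)

a = check_timestamp(foo)
with open(foo, "w") as f:  # synchronous write: done when the block exits
    f.write("new content")
b = check_timestamp(foo)

# Locally, b is not older than a. It may still be equal to a if the
# write lands within the file system's timestamp granularity.
not_older = b >= a
os.remove(foo)
```

Even in this local case only `b >= a` is safe to rely on; strict inequality depends on the timestamp resolution of the file system.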
I brought this up years ago, but reading the current spec I don't see
that info. I'm not sure if it has been fully addressed. It will be
easier to end up with absurd or seemingly incorrect situations when
relying on timestamp data.
-john
On May 29, 2009, at 6:29 AM, Thilo Kielmann wrote:
...
John,
what would be the special thing with a timestamp and everything else,
like file size, content, permissions... that we all have already?
Thilo
On Fri, May 29, 2009 at 06:21:40AM -0700, John Shalf wrote:
...
Cc: saga-rg@ogf.org
From: John Shalf <jshalf@lbl.gov>
To: Thilo Kielmann <kielmann@cs.vu.nl>
Subject: Re: [SAGA-RG] missing(?) method reporting last modification time
For the async version of the SAGA interface, what consistency model do
you propose for the modification time information? POSIX semantics do
not address this, which is precisely why POSIX is so damned slow on
distributed/remote filesystems. It seems we'd need to at least
propose an unambiguous consistency model for the time-stamps and how
this would interact with concurrent async read/write calls that might
be in progress.
(just making the consistency model based on current state at the
remote side is fine, but from the client side, you might end up with
absurd situations when you do an async timestamp request concurrent
with an async file open for example.)
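
To make the concern concrete, here is a minimal Python sketch of an asynchronous timestamp query racing with an asynchronous write. `FakeRemoteFile` and its logical `mtime` counter are hypothetical stand-ins invented for this example, and `concurrent.futures` merely plays the role of an async task model; none of this is SAGA API.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeRemoteFile:
    """Hypothetical stand-in for a remote file with a logical timestamp."""

    def __init__(self):
        self.mtime = 0

    def write(self):
        self.mtime += 1
        return self.mtime

    def get_mtime(self):
        return self.mtime


remote = FakeRemoteFile()

with ThreadPoolExecutor(max_workers=2) as pool:
    # Unordered: both tasks are in flight at once, so the query may see
    # the timestamp from before or after the write.
    write_task = pool.submit(remote.write)
    racy_query = pool.submit(remote.get_mtime)
    racy_result = racy_query.result()  # 0 or 1, depending on scheduling
    write_task.result()

    # Ordered: wait for the write task to finish before querying.
    pool.submit(remote.write).result()
    ordered_result = pool.submit(remote.get_mtime).result()
```

The first query has no defined relationship to the write; the second is ordered only because the client waited for the write task to complete before issuing it.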
-john
On May 29, 2009, at 6:15 AM, Thilo Kielmann wrote:
...
Folks,
within our group we are currently delving into issues with accessing
remote file systems. What strikes us is that such access is SLOW.
As such, it would be very beneficial if one could find out when (and
thus whether) a remote file or directory has been modified.
While returning this piece of information sounds "trivial", it strikes
us that the SAGA spec has no such call in the name space package (where
files reside).
In POSIX terms, this is the info returned by the stat system call
(see: man 2 stat), in the st_mtime field.
In Java, files have a method lastModified().
POSIX reports the time in seconds and Java in milliseconds since
01/01/1970 (epoch).
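
For comparison, a minimal Python sketch of the same information on a local file: `os.stat` exposes the POSIX `st_mtime` value. The helper names `last_modified_ms` and `has_changed_since` and the temporary demo file are assumptions made for this example, not part of any SAGA proposal.

```python
import os
import tempfile


def last_modified_ms(path):
    """Last modification time of path in milliseconds since the epoch."""
    # st_mtime_ns is the st_mtime value at nanosecond resolution.
    return os.stat(path).st_mtime_ns // 1_000_000


def has_changed_since(path, known_ms):
    """True if path was modified after the previously recorded timestamp."""
    return last_modified_ms(path) > known_ms


# Demo on a throwaway local file (stand-in for a remote entry).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

recorded_ms = last_modified_ms(demo_path)
unchanged = not has_changed_since(demo_path, recorded_ms)
os.remove(demo_path)
```

A cheap check like `has_changed_since` is what would let a client skip re-reading an entry that has not been modified since the last recorded timestamp.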
Of course, it looks like nobody has ever been thinking about such a use
case, but here we are! Our feeling is that the last modification time is
very essential meta data about files, so such a call should certainly be
there. Our current problem is certainly not the only use case for finding
out how old/new a given file or directory is...
Our favourite proposal is to add a method that returns the last
modification time to both ns_entry and ns_directory, as this makes sense
with physical as well as with logical (replicated) files.
Any reactions/objections?
Thilo Kielmann
-- 
Thilo Kielmann
http://www.cs.vu.nl/~kielmann/
--
saga-rg mailing list
saga-rg@ogf.org
http://www.ogf.org/mailman/listinfo/saga-rg
-- 
Nothing is ever easy.