missing(?) method reporting last modification time

Thilo Kielmann

29 May 2009

1:15 p.m.

Folks,

within our group we are currently delving into issues with accessing remote file systems. What strikes us is that such access is SLOW. As such, it would be very beneficial if one could find out when (and thus whether) a remote file or directory has been modified.

While returning this piece of information sounds "trivial", it strikes us that the SAGA spec has no such call in the name space package (where files reside).

In POSIX terms, this is the info returned by the stat system call (see: man 2 stat), in the st_mtime field.

In Java, files have a method lastModified().

Both report the time relative to 01/01/1970 (epoch): POSIX in seconds, Java in milliseconds.

Of course, it looks like nobody has ever thought about such a use case, but here we are! Our feeling is that the last modification time is essential metadata about files, so such a call should certainly be there. Our current problem is certainly not the only use case for finding out how old/new a given file or directory is...

Our favourite proposal is to add a method that returns the last modification time to both ns_entry and ns_directory, as this makes sense with physical as well as with logical (replicated) files.

Any reactions/objections ???

Thilo Kielmann
--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
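Editorial sketch of the use case described above: deciding whether a cached copy is stale by comparing modification times instead of re-reading the data. This is a minimal illustration using only the standard Java library on a local file; the class and method names (`ModTimeCheck`, `isStale`, and so on) are hypothetical and are not part of SAGA or any SAGA implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical helper, not part of SAGA: compares modification times so that
// a cached copy is re-fetched only when the source has changed.
class ModTimeCheck {

    // Converts a POSIX-style st_mtime (whole seconds since the epoch) to the
    // millisecond scale used by java.io.File.lastModified().
    static long secondsToMillis(long stMtimeSeconds) {
        return stMtimeSeconds * 1000L;
    }

    // True if the source was modified after the cached copy was taken.
    static boolean isStale(long sourceMillis, long cachedMillis) {
        return sourceMillis > cachedMillis;
    }

    // Reads the last modification time of a local path in epoch milliseconds.
    static long lastModifiedMillis(Path path) throws IOException {
        return Files.getLastModifiedTime(path).toMillis();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mtime-demo", ".txt");
        try {
            // Arbitrary epoch-seconds value used as the file's modification time.
            Files.setLastModifiedTime(tmp, FileTime.fromMillis(secondsToMillis(1_243_602_900L)));
            long mtime = lastModifiedMillis(tmp);
            System.out.println("stale relative to a cache taken at epoch 0: " + isStale(mtime, 0L));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A remote equivalent would replace `lastModifiedMillis` with whatever call the middleware offers; the comparison logic stays the same.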


John Shalf

29 May

1:21 p.m.

For the async version of the SAGA interface, what consistency model do you propose for the modification time information? POSIX semantics do not address this, which is precisely why POSIX is so damned slow on distributed/remote filesystems. It seems we'd need to at least propose an unambiguous consistency model for the timestamps and how this would interact with concurrent async read/write calls that might be in progress.

(Just making the consistency model based on current state at the remote side is fine, but from the client side, you might end up with absurd situations when you do an async timestamp request concurrent with an async file open, for example.)

-john

On May 29, 2009, at 6:15 AM, Thilo Kielmann wrote:

...

-- saga-rg mailing list saga-rg@ogf.org http://www.ogf.org/mailman/listinfo/saga-rg

Thilo Kielmann

1:29 p.m.

John,

what would be special about a timestamp compared with everything else, like file size, content, permissions... that we all have already?

Thilo

On Fri, May 29, 2009 at 06:21:40AM -0700, John Shalf wrote:

...


John Shalf

1:39 p.m.

It just makes problems with an undefined consistency model clearer, because the ordering of requests may result in incorrect timestamps. The problem is more difficult when dealing with timestamps because that hits the metadata server, which is a different data path than data storage. Users will assume POSIX-like consistency, where for

    a = check_timestamp(file foo)
    write(file foo)
    b = check_timestamp(file foo)

b will always be greater than a. For remote connections, this is more difficult to guarantee unless you are explicit about the underlying consistency model. Are you going to claim POSIX consistency? If so, then it doesn't have anything to say about the async case.

I brought this up years ago, but reading the current spec I don't see that info. I'm not sure if it has been fully addressed. It will be easier to end up with absurd or seemingly incorrect situations when relying on timestamp data.

-john

On May 29, 2009, at 6:29 AM, Thilo Kielmann wrote:

...

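Editorial sketch of the check/write/check sequence John describes, run synchronously against a local temporary file with the standard Java library. It only shows the expectation (b greater than a) that users carry over from local file systems; it says nothing about what a remote or asynchronous backend would guarantee. The class name `TimestampOrdering` is hypothetical, and the file is backdated first so that coarse timestamp granularity cannot make a and b equal.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical demo class; local file system only.
class TimestampOrdering {

    // Runs a = mtime(foo); write(foo); b = mtime(foo) and returns {a, b}
    // in epoch milliseconds.
    static long[] checkWriteCheck() {
        try {
            Path foo = Files.createTempFile("foo", ".dat");
            try {
                // Backdate the file by one minute so a and b cannot coincide.
                long past = System.currentTimeMillis() - 60_000L;
                Files.setLastModifiedTime(foo, FileTime.fromMillis(past));

                long a = Files.getLastModifiedTime(foo).toMillis();
                Files.write(foo, "new contents".getBytes(StandardCharsets.UTF_8));
                long b = Files.getLastModifiedTime(foo).toMillis();
                return new long[] {a, b};
            } finally {
                Files.delete(foo);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] ab = checkWriteCheck();
        System.out.println("a=" + ab[0] + " b=" + ab[1] + " b>a: " + (ab[1] > ab[0]));
    }
}
```

Here each call returns only after the operation has finished, which is what makes the ordering hold; with concurrent asynchronous requests that assumption has to be enforced separately.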

Thilo Kielmann

1:50 p.m.

Well, the intended semantics was reporting the remote state. Of course, in the presence of asynchronous operations (or simply operations done by other, e.g. non-SAGA, processes), all kinds of problems and race conditions can occur. But that's just plain normal.

I see nothing special with time stamps that we would not have with anything else we already have.

Thilo

On Fri, May 29, 2009 at 06:39:08AM -0700, John Shalf wrote:

...


Andre Merzky

30 May

4:25 p.m.

John, Thilo,

allow me to quote from the Holy Book of SAGA, Scripture 2.8 "Execution Semantics and Consistency Model":

SAGA API calls on a single service or server can occur concurrently with (a) other tasks from the same SAGA application, (b) tasks from other SAGA applications, or also (c) calls from other, independently developed (non-SAGA) applications. This means that the user of the SAGA API should not rely on any specific execution order of concurrent API calls. However, implementations MUST guarantee that a synchronous method is indeed finished when the method returns, and that an asynchronous method is indeed finished when the task instance representing this method is in a final state. Further control of execution order, if needed, has to be enforced via separate concurrency control mechanisms, preferably provided by the services themselves, or on application level.

[... at most once ...]

Beyond this, the SAGA API specification does not prescribe any consistency model for its operations, as we feel that this would be very hard to implement across different middleware platforms. A SAGA implementation MAY specify some consistency model, which MUST be documented. A SAGA implementation SHOULD always allow for application level consistency enforcement, for example by use of application level locks and mutexes.

Related to that is Scripture 2.6.4 "Concurrency Control":

Although limited, SAGA defines a de-facto concurrent programming model, via the task model and the asynchronous notification mechanism. Sharing of object state among concurrent units (e.g. tasks) is intentional and necessary for addressing the needs of various use cases. Concurrent use of shared state, however, requires concurrency control to avoid unpredictable behavior.

(Un)fortunately, a large variety of concurrency control mechanisms exist, with different programming languages lending themselves to certain flavors, like object locks and monitors in Java, or POSIX mutexes in C-like languages. For some use cases of SAGA, enforced concurrency control mechanisms might be both unnecessary and counterproductive, leading to increased programming complexity and runtime overheads.

Because of these constraints, SAGA does not enforce concurrency control mechanisms on its implementations. Instead, it is the responsibility of the application programmer to ensure that her program will execute correctly in all possible orderings and interleavings of the concurrent units. The application programmer is free to use any concurrency control scheme (like locks, mutexes, or monitors) in addition to the SAGA API.

Again related, Commandment 2.6.5 calls for thread safety for all implementations which want to obtain the blessings of the church of SAGA.

So, that may be enough from the big story book for today I think.

Best, Andre.

Quoting [Thilo Kielmann] (May 29 2009):

...


-- Nothing is ever easy.
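Editorial sketch of the point made in the quoted spec text: concurrent calls have no guaranteed order, an asynchronous call is only known to be finished once its task is in a final state, and any further ordering is up to the application. The code uses `java.util.concurrent.CompletableFuture` as a stand-in for a SAGA task and an in-memory counter as a stand-in for a remote entry; `AsyncOrdering` and `FakeRemoteEntry` are hypothetical names, not SAGA API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo; CompletableFuture stands in for a SAGA task.
class AsyncOrdering {

    // In-memory stand-in for a remote entry; tracks a logical modification counter.
    static final class FakeRemoteEntry {
        private final AtomicLong mtime = new AtomicLong(0);

        void write() { mtime.incrementAndGet(); }

        long lastModified() { return mtime.get(); }
    }

    // Unordered: the write and the query are independent tasks, so the query
    // may observe the state before or after the write.
    static long unorderedQuery(FakeRemoteEntry e) {
        CompletableFuture<Void> write = CompletableFuture.runAsync(e::write);
        CompletableFuture<Long> query = CompletableFuture.supplyAsync(e::lastModified);
        long seen = query.join();
        write.join(); // make sure the write task has reached a final state
        return seen;
    }

    // Ordered at application level: the query is chained so that it runs only
    // after the write task has completed.
    static long orderedQuery(FakeRemoteEntry e) {
        return CompletableFuture.runAsync(e::write)
                .thenApply(ignored -> e.lastModified())
                .join();
    }

    public static void main(String[] args) {
        System.out.println("ordered:   " + orderedQuery(new FakeRemoteEntry()));
        System.out.println("unordered: " + unorderedQuery(new FakeRemoteEntry()));
    }
}
```

On a fresh entry the ordered variant always observes the write, while the unordered variant may report either the old or the new value, which is the kind of surprise the thread is discussing.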

John Shalf

1 Jun

4:12 a.m.

Thank you, Brother Andre, for setting us straight by quoting from the book of SAGA. Now go ahead in peace to love and serve the grid...

Amen.

-john

On May 30, 2009, at 9:25 AM, Andre Merzky wrote:

...


Sylvain Reynaud

29 May

1:54 p.m.

A method "getLastModified" has already been added to our implementation (JSAGA) of the SAGA specification in order to fulfil a user request, but using it makes his code dependent on our implementation...

Consequently, I fully support your proposal.

Sylvain Reynaud

Thilo Kielmann wrote:

...


Thilo Kielmann

30 May

2:11 p.m.

Dear Sylvain,

interesting to hear! (It's a pity that you did not raise this issue before...)

I have 2 questions:

1. Where exactly have you added getLastModified? In ns_entry/ns_directory? or in file/directory? Is there a good reason for one or the other place in the hierarchy?

2. Did you add anything else? I mean, did you find any other omissions?

Thilo

On Fri, May 29, 2009 at 03:54:05PM +0200, Sylvain Reynaud wrote:

...

From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Thilo Kielmann <kielmann@cs.vu.nl> CC: saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

A method "getLastModified" has already been added to our implementation (JSAGA) of the SAGA specification in order to fulfil a user request, but using it makes his code dependent on our implementation...

Consequently, I fully support your proposal.

Sylvain Reynaud

Thilo Kielmann a écrit :

...
Folks,

within our group we are currently delving into issues with accessing remote file systems. What strikes us is that such access is SLOW. As such, it would be very beneficial if one could find out when (and thus whether) a remote file or directory has been modified.

While returning this piece of information sounds to be "trivial", it strikes us that the SAGA spec has no such call in the name space package (where files reside).

In POSIX terms, this is the info returned by the stat system call (see: man 2 stat), with the st_mtime parameter.

In Java, files have a method lastmodified().

Both POSIX and Java report the time in milliseconds since 01/01/1970 (epoch).

Of course, it looks like nobody has ever been thinking about such a use case, but here we are! Our feeling is that the last modification time is very essential meta data about files, so such a call should certainly be there. With our current problem certainly not being the only use case for finding out how old/new a given file or directory is...

Our favourite proposal is to add a method that returns the last modification time to both ns_entry and ns_directory as this makes sense with physical as well as with logical (replicated) files.

Any reactions/objections ???

Thilo Kielmann

-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
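To make the use case concrete, here is a minimal Python sketch of "has this entry changed since I last looked?" built on the stat call mentioned above. It only covers a local path; the function names are hypothetical and not part of any SAGA binding. POSIX st_mtime counts seconds, while Java's File.lastModified() returns milliseconds, so the sketch normalises to milliseconds.

```python
import os
import tempfile


def last_modified_ms(path):
    """Last modification time of `path` in milliseconds since the epoch."""
    # os.stat() wraps the POSIX stat call; st_mtime_ns is st_mtime in nanoseconds.
    return os.stat(path).st_mtime_ns // 1_000_000


def modified_since(path, reference_ms):
    """True if `path` was modified after `reference_ms` (ms since the epoch)."""
    return last_modified_ms(path) > reference_ms


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        stamp = last_modified_ms(tmp.name)
        # An unchanged file is never newer than its own timestamp.
        print(stamp, modified_since(tmp.name, stamp))
```

A client could cache the returned value and skip re-reading a remote entry as long as modified_since() stays false, which is the saving the proposal is after.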

Sylvain Reynaud

1 Jun

8:29 a.m.

Thilo Kielmann wrote:

...

Dear Sylvain,

Dear Thilo,

...

interesting to hear! (It's a pity that you did not raise this issue before...)

I raised it at OGF23 (see slide 34 of my talk).

...

I have 2 questions:

1. Where exactly have you added getLastModified? In ns_entry/ns_directory? or in file/directory? Is there a good reason for one or the other place in the hierarchy?

In the ns_entry, because this information is also available with most logical file/directory protocols.
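The placement argument can be illustrated with a small Python sketch: if the method lives on the common base entry, physical and logical files both inherit it and callers need not care which one they hold. All class and method names below are hypothetical stand-ins, not the JSAGA or SAGA API, and the URLs are made up.

```python
from abc import ABC, abstractmethod


class NSEntry(ABC):
    """Hypothetical stand-in for a SAGA ns_entry; not the real binding."""

    def __init__(self, url):
        self.url = url

    @abstractmethod
    def get_last_modified(self):
        """Return the modification time in ms since the epoch (assumed unit)."""


class File(NSEntry):
    """Physical file; a real adaptor would query the storage protocol."""

    def __init__(self, url, mtime_ms):
        super().__init__(url)
        self._mtime_ms = mtime_ms

    def get_last_modified(self):
        return self._mtime_ms


class LogicalFile(NSEntry):
    """Logical (replicated) file; a real adaptor would query the catalogue."""

    def __init__(self, url, catalogue_mtime_ms):
        super().__init__(url)
        self._mtime_ms = catalogue_mtime_ms

    def get_last_modified(self):
        return self._mtime_ms


def newest(entries):
    """Pick the most recently modified entry, whatever its concrete type."""
    return max(entries, key=lambda entry: entry.get_last_modified())
```

Had the method been added only to file/directory, newest() could not accept logical files without a type check.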

...

2. Did you add anything else? I mean, did you find any other omissions?

Our deviations from the SAGA specification are listed here:
http://grid.in2p3.fr/jsaga-dev/SAGA-delta.html

But this web page does not distinguish between what we could consider omissions and what are personal preferences... Comments are welcome.

Best regards,
Sylvain

...

Thilo


Thilo Kielmann

4 Jun

6:03 a.m.

Dear Sylvain,

...

...
I raised it at OGF23 (see slide 34 of my talk).

sorry, I must have missed this within your list of modifications.

...

Our deviations from the SAGA specification are listed here http://grid.in2p3.fr/jsaga-dev/SAGA-delta.html

I just had a look. This is quite a list. Would you, from your experience as of today, recommend more/some/all for inclusion in a "fixed" SAGA standard?

Curious,

Thilo

-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/

Andre Merzky

1:41 p.m.

Quoting [Thilo Kielmann] (Jun 04 2009):

...

Dear Sylvain,

...
...
I raised it at OGF23 (see slide 34 of my talk).

sorry, I must have missed this within your list of modifications.

...
Our deviations from the SAGA specification are listed here http://grid.in2p3.fr/jsaga-dev/SAGA-delta.html

I just had a look. This is quite a list. Would you, from your experience as of today, recommend more/some/all for inclusion in a "fixed" SAGA standard?

Dear Sylvain,

we should have given you more detailed feedback on your list earlier on *blush* Most of the items have been discussed at one point or another, however. Here is a detailed list. Thilo is right: we should resolve the items soonish.

AmbiguityException:

The spec does not define whether saga::exceptions can be extended. I guess legally that can only happen outside the saga namespace, so, IMHO, that is a conflict. OTOH, it's closely related to the implementation semantics (retry), so, to some extent, it's an implementation detail. Not sure how to handle it in the spec.

BTW: 'avoid account locking' is an interesting use case...

context.toString():

That's a language binding issue. In C++, we also have a get_string () on the URL, for example.

URL:

Query interpreted as adaptor parameters: that's implementation-specific behaviour: SAGA does not define URL semantics. It breaks application portability though, I assume.

FLAGS_BYPASSEXIST=4096

I would think this is better solved on configuration level, in Java as a saga.property for example? Again, that breaks portability.

additional method copyFrom():

I don't understand the use case for this. How is

  saga::file f (url_1);
  f.copy_from (url_2);

different from

  saga::file tmp (url_2);
  tmp.copy (url_1);
  saga::file f (url_1);

Is it about saving one line of code, and one temp object?

additional method getLastModified():

as discussed, should make it into the spec.

NSEntry.remove() allows !Recursive flag:

This is actually in the spec, as

  saga::directory d (url);
  d.remove (0); // 0: empty flag set, i.e. Recursive is not set

I think what you want is to change the semantics, from

- a 'BadParameter' exception is thrown if the entry is a directory and the 'Recursive' flag is not set.

to

- a 'BadParameter' exception is thrown if the entry is a *nonempty* directory and the 'Recursive' flag is not set.

Would that solve your use case?

Job description JobName attribute added

This is not in the spec. Well, we had to draw a line somewhere, and as it was optional in JSDL, and no one really had a compelling use case, it was left out. Do you have applications which require that one?

Job description CPUArchitecture attribute added

has been added in the spec meanwhile

Job description OperatingSystemType attribute added

has been added in the spec meanwhile

Job description JobStartTime and JobContact attributes not supported

That is no problem.

job attribute 'NativeJobDescription' added

I assume that is a read-only attribute? What is the use case for this one? Debugging?

job.sub_state metric added:

saga::job already has a metric 'job.state_detail' - what is the difference?

method allocateResource added:

Is that method on job, or on the job_service? What are the semantics?

I think that this would not go into the job management in SAGA, but rather into the resource discovery/management extension Thilo's group is working on (within XtreemOS). You might want to check with them if their extension proposal fits your requirements...

-- Nothing is ever easy.

Sylvain Reynaud

6:04 p.m.

Andre Merzky wrote:

...

Quoting [Thilo Kielmann] (Jun 04 2009):

...
Dear Sylvain,

...
...
I raised it at OGF23 (see slide 34 of my talk).

sorry, I must have missed this within your list of modifications.

...
Our deviations from the SAGA specification are listed here http://grid.in2p3.fr/jsaga-dev/SAGA-delta.html

I just had a look. This is quite a list. Would you, from your experience as of today, recommend more/some/all for inclusion in a "fixed" SAGA standard?

Dear Sylvain,

we should have given you more detailed feedback on your list earlier on *blush* Most of the items have been discussed at one point or the other, however. Here is a detailed list. Thilo is right: we should resolve the items soonish.

Dear Andre,

Thanks for this detailed feedback on our deviations from the SAGA specification. Here are some answers, inline in this mail. My proposals for a "fixed" SAGA standard will come in a separate mail.

...

AmbiguityException:

The spec does not define if saga::exceptions can be extended. I guess legally that can only happen outside the saga namespace, so, IMHO, that is a conflict.

I think the type of exception is not really a problem since users can catch it as a NoSuccessException, but the difference in behavior for authentication may break compatibility between SAGA implementations... (see next mail)

...

OTOH, it's closely related to the implementation semantics (retry), so, to some extent, it's an implementation detail. Not sure how to handle it in the spec.

BTW: 'avoid account locking' is an interesting use case...

context.toString():

That's a language binding issue. In C++, we also have a get_string () on the URL, for example.

I agree.

...

URL:

Query interpreted as adaptor parameters: that's implementation-specific behaviour: SAGA does not define URL semantics. It breaks application portability though, I assume.

Yes, I also think so. We try to avoid using this, but some protocols require additional information.

...

FLAGS_BYPASSEXIST=4096

I would think this is better solved on configuration level, in Java as a saga.property for example?

I don't think so because this depends on use-case, hence on user's code. Moreover, the user should be aware that he is changing the default behavior.

...

Again, that breaks portability.

Yes, that's why I want to propose it for inclusion in a "fixed" SAGA standard (see next mail)

...

additional method copyFrom():

I don't understand the use case for this. How is

saga::file f (url_1); f.copy_from (url_2);

different from

saga::file tmp (url_2); tmp.copy (url_1); saga::file f (url_1);

Is it about saving one line of code, and one temp object?

No, it was for optimizing file transfer from many-to-one locations without managing a pool of connections. But I plan to remove this method and do this another way.

...

additional method getLastModified():

as discussed, should make it into the spec.

OK.

...

NSEntry.remove() allows !Recursive flag:

This is actually in the spec, as

  saga::directory d (url);
  d.remove (0); // 0: empty flag set, i.e. Recursive is not set

I think what you want is to change the semantics, from

- a 'BadParameter' exception is thrown if the entry is a directory and the 'Recursive' flag is not set.

to

- a 'BadParameter' exception is thrown if the entry is a *nonempty* directory and the 'Recursive' flag is not set.

Would that solve your use case?

Yes (my use-case is: "how to implement rmdir").
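The relaxed rule agreed on here (BadParameter only for a non-empty directory without Recursive) could look roughly like the following Python sketch on a local file system. The function, the exception class and the numeric value of RECURSIVE are assumptions for illustration, not taken from a SAGA binding.

```python
import os
import shutil

RECURSIVE = 2  # assumed numeric value of the Recursive flag


class BadParameter(Exception):
    """Stand-in for the SAGA BadParameter exception."""


def remove(path, flags=0):
    """Remove an entry; directories need RECURSIVE only when non-empty."""
    if not os.path.isdir(path):
        os.remove(path)
    elif flags & RECURSIVE:
        shutil.rmtree(path)
    elif os.listdir(path):
        raise BadParameter(
            f"{path} is a non-empty directory and Recursive is not set")
    else:
        os.rmdir(path)  # plain rmdir: the use case named in the thread
```

With the original wording of the spec, the last branch would also raise, which is why rmdir could not be expressed.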

...

Job description JobName attribute added

This is not in the spec. Well, we had to draw a line somewhere, and as it was optional in JSDL, and no one really had a compelling use case, it was left out. Do you have applications which require that one?

I was using it with "allocateResource" (see end of this mail), but I am not sure I will still need it in the future...

...

Job description CPUArchitecture attribute added

has been added in the spec meanwhile

Job description OperatingSystemType attribute added

has been added in the spec meanwhile

That's not the point (see next mail).

...

Job description JobStartTime and JobContact attributes not supported

That is no problem.

(see next mail).

...

job attribute 'NativeJobDescription' added

I assume that is a read-only attribute?

Yes.

...

What is the use case for this one? Debugging?

Yes. I don't expect it to be included in the specification.

...

job.sub_state metric added:

saga::job already has a metric 'job.state_detail' - what is the difference?

job.state_detail cannot be used in a uniform way because its values depend on the middleware. (see next mail)

...

method allocateResource added:

Is that method on job, or on the job_service? What is the semantics?

I think that this would not go into the job management in SAGA, but rather into the resource discovery/management extension Thilo's group is working on (within XtreemOS). You might want to check with them if their extension proposal fits your requirements...

The goal was not to do resource discovery in SAGA, but to enable doing it outside of SAGA. But there are ways to do that without breaking portability, and I plan to remove this method.

Best regards,
Sylvain

Sylvain Reynaud

6:36 p.m.

Here is my proposal for inclusion in a "fixed" SAGA standard.

SAGA Job Management:
===================

* CPUArchitecture and OperatingSystemType should be scalar (instead of vector) attributes for consistency with the JSDL specification.

* page 166 says: "Attributes marked as 'not supported by JSDL' might disappear in future versions of the SAGA API".
  - Queue: this attribute makes the job description dependent on the targeted execution site; this information should be put in the URL instead.
  - JobStartTime and JobContact: IMHO, these attributes are not in the 80% of the 80/20 rule, they are not supported by most middlewares, and they can be implemented by the user application.
  - Interactive: although this attribute may not be in the 80% of the 80/20 rule, I think it is useful and should be kept.

* jobs are missing a state "QUEUED" in order to enable timeouts on jobs queued for a long time, synchronization with job start, or just because many users want to know if their jobs are running or queued. This cannot be done with the job.state_detail attribute because its content is not uniform. As was said at OGF23, some job services don't have any queue, but I think this is not the usual case.

SAGA Name Spaces:
================

* add a flag to disable checking the existence of the entry in constructor and open methods, because the cost of this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

SAGA Session:
============

* An exception should be thrown when several contexts of the session can be used for a specific method call. Otherwise, some files may be created with an unexpected owner, some jobs may fail at the end of execution because of unexpected permissions, and some accounts may be locked because of too many failed connection attempts.

* AFAIK, existing SAGA implementations support many technologies and may be used with several grids. The default session is very convenient, but it is not usable when several grids can be accessed, because a single default session for all is not enough. The argument of the session constructor could be an identifier (e.g. a grid identifier) instead of a boolean.

Thilo Kielmann wrote:

...

Dear Sylvain,

...
...
I raised it at OGF23 (see slide 34 of my talk).

sorry, I must have missed this within your list of modifications.

...
Our deviations from the SAGA specification are listed here http://grid.in2p3.fr/jsaga-dev/SAGA-delta.html

I just had a look. This is quite a list. Would you, from your experience as of today, recommend more/some/all for inclusion in a "fixed" SAGA standard?

Curious,

Thilo

Andre Merzky

8:47 p.m.

Dear Sylvain,

Quoting [Sylvain Reynaud] (Jun 04 2009):

...

SAGA Job Management: =================== * CPUArchitecture and OperatingSystemType should be scalar (instead of vector) attributes for consistency with JSDL specification.

Good find! These are indeed errors in the spec, and should be scalar attributes. Will fix.

...

* page 166 says: "Attributes marked as 'not supported by JSDL' might disappear in future versions of the SAGA API". - Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that it's hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

...

- JobStartTime and JobContact: IMHO, these attributes are not in the 80% of the 80/20 rule, they are not supported by most middlewares, and they can be implemented by the user application.

Those have been included because DRMAA has these, and the DRMAA folx found them dead useful.

JobStartTime I expect to get defined in JSDL when they get started on scheduling (which is not in the near future AFAIK). In SAGA, it probably should be moved to the resource management package, when that emerges.

JobContact is a tricky one: it seems dead useful to be able to specify a custom name to a job, and to identify the job thus in an easy way, but (a) I have yet to see a backend which supports it, (b) SAGA does not provide a way to actually *use* this name, e.g. as jobid, and (c) such a mapping can trivially be implemented on application level.

I am with you here. Anyway, I think it is kind of dangerous to remove stuff from the spec simply because we don't see a use at the moment. Not sure how representative we are...

...

- Interactive: although this attribute may not be in the 80% of the 80/20 rule, I think it is usefull and should be kept.

Many of our original use cases had a visualization background, where interactive jobs are more likely to be useful than in other fields...

...

* job are missing a state "QUEUED" in order to enable timeout on jobs queued for a long time, synchronization with job start, or just because many users want to know if their jobs are running or queued. This can not be done with the job.state_detail attribute because its content is not uniform. As it was said at OGF23, some job services don't have any queue, but I think this is not the usual case.

I don't see why job.state_detail can't help you here: your SAGA implementation should always be able to map the respective backend queue state to something like 'jsaga:queued'.

The spec says that state_detail SHOULD be formatted as 'model:state', but you seem to have a very valid use case for defining your own backend model for jsaga.

The reason why I am hesitant to agree with you, despite the validity of your use case, is: while defining the spec and the state model, we tried to expose at the top level only those states which are the result of a SAGA API call, or can be left with a SAGA API call.

There is no saga.job.queue() or saga::job::unqueue, so moving Queued into the top-level state model would break that rule of thumb.
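A minimal Python sketch of the mapping suggested here: backend-specific queue states are folded into a uniform 'jsaga:<state>' detail string in the 'model:state' form, which then supports the queued-timeout use case. The backend names, state codes and function names are illustrative assumptions, not part of the spec or of JSAGA.

```python
# Illustrative mapping only: backend names and state codes are assumptions.
BACKEND_TO_UNIFORM = {
    ("pbs", "Q"): "queued",
    ("pbs", "R"): "running",
    ("lsf", "PEND"): "queued",
    ("lsf", "RUN"): "running",
}


def uniform_state_detail(backend, backend_state):
    """Map a backend-specific state to a uniform 'jsaga:<state>' string."""
    state = BACKEND_TO_UNIFORM.get((backend, backend_state), "unknown")
    return f"jsaga:{state}"


def parse_state_detail(detail):
    """Split a 'model:state' string into (model, state)."""
    model, sep, state = detail.partition(":")
    if not sep or not model or not state:
        raise ValueError(f"not in 'model:state' form: {detail!r}")
    return model, state


def queued_too_long(detail, queued_for_s, timeout_s):
    """True if a job reported as queued has waited longer than the timeout."""
    is_queued = parse_state_detail(detail) == ("jsaga", "queued")
    return is_queued and queued_for_s > timeout_s
```

An application would then test queued_too_long(detail, waited, limit) on the state_detail value instead of relying on a separate top-level QUEUED state.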

...

SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning, though inverted: if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?
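The proposed behaviour can be sketched in Python as follows: the constructor skips the existence check when the flag is set, and a missing entry only shows up as IncorrectState on first use. The Entry class, the exception classes and the injectable exists argument are hypothetical; only the flag name and its value 4096 come from the list discussed in this thread.

```python
import os

FLAGS_BYPASSEXIST = 4096  # value quoted in the thread for the JSAGA flag


class DoesNotExist(Exception):
    """Stand-in for the SAGA DoesNotExist exception."""


class IncorrectState(Exception):
    """Stand-in for the SAGA IncorrectState exception."""


class Entry:
    """Toy local-file entry; `exists` is injectable so checks can be counted."""

    def __init__(self, path, flags=0, exists=os.path.exists):
        self.path = path
        # The potentially expensive round trip is skipped when the flag is set.
        if not flags & FLAGS_BYPASSEXIST and not exists(path):
            raise DoesNotExist(path)

    def get_size(self):
        try:
            return os.path.getsize(self.path)
        except FileNotFoundError as err:
            # The missing entry only surfaces on first real use.
            raise IncorrectState(self.path) from err
```

On a remote protocol the saved call is one network round trip per constructed object, which is where the cost mentioned in the proposal comes from.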

...

SAGA Session: ============ * An exception should be thrown when several contexts of the session can be used for a specific method call. Else, some files may be created with unexpected owner, some jobs may fail at the end of execution because of unexpected permissions, some accounts may be locked because of too many failed connection attempts.

Uh, this would badly break our use cases! In our implementation, you may very well copy a file instance with http, but read it with ftp, and delete it with nfs! And we need the respective contexts for these backends on the same session...

Yes, results are not completely predictable if multiple backends can be used, but the user always has the option to create a specific session with exactly one context attached.

So, that's a tough one.

...

* AFAIK, existing SAGA implementations support many technologies and may be used with several grids. Default session is very convenient, but it is not usable when several grids can be accessed, because a single default session for all is not enough. Argument of session constructor could be an identifier (e.g. a grid identifier) instead of a boolean.

Again, you can create a context for a Grid type, and attach just that one context to a session. Isn't that what you need? I may be missing something here...

Thanks, Andre.

-- Nothing is ever easy.

Sylvain Reynaud

5 Jun

11:49 a.m.

Andre Merzky wrote:

...

Dear Sylvain,

Dear Andre,

...

Quoting [Sylvain Reynaud] (Jun 04 2009):

...
SAGA Job Management: =================== * CPUArchitecture and OperatingSystemType should be scalar (instead of vector) attributes for consistency with JSDL specification.

Good find! These are indeed errors in the spec, and should be scalar attributes. Will fix.

...
* page 166 says: "Attributes marked as 'not supported by JSDL' might disappear in future versions of the SAGA API". - Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that it's hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such a URL is used internally; the user gives this URL:

wsgram://some.remote.host:9443/Fork

If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from the attributes of the job description to the arguments of the method job_service.create_job.

...

...
- JobStartTime and JobContact: IMHO, these attributes are not in the 80% of the 80/20 rule, they are not supported by most middlewares, and they can be implemented by the user application.

Those have been included because DRMAA has these, and the DRMAA folx found them dead useful.

JobStartTime I expect to get defined in JSDL when they get started on scheduling (which is not in the near future AFAIK). In SAGA, it probably should be moved to the resource management package, when that emerges.

When this feature is not available in the middleware, should SAGA implementations delay job submission themselves (i.e. on the client side) or throw a NotImplemented exception?

...

JobContact is a tricky one: it seems dead useful to be able to specify a custom name to a job, and to identify the job thus in an easy way, but (a) I have yet to see a backend which supports it, (b) SAGA does not provide a way to actually *use* this name, e.g. as jobid, and (c) such a mapping can trivially be implemented on application level.

I think you are talking about JobName, which is currently supported by JSAGA (I plan to remove it for portability) but is not defined in the SAGA specification. JobContact is "set of endpoints describing where to report job state transitions": I think your comments (a) and (c) also apply to this one.

...

I am with you here. Anyway, I think it is kind of dangerous to remove stuff from the spec simply because we don't see a use at the moment. Not sure how representative we are...

...
- Interactive: although this attribute may not be in the 80% of the 80/20 rule, I think it is usefull and should be kept.

Many of our original use cases had a visualization background, where interactive jobs are more likely to be useful than in other fields...

...
* job are missing a state "QUEUED" in order to enable timeout on jobs queued for a long time, synchronization with job start, or just because many users want to know if their jobs are running or queued. This can not be done with the job.state_detail attribute because its content is not uniform. As it was said at OGF23, some job services don't have any queue, but I think this is not the usual case.

I don't see why job.state_detail can't help you here: your SAGA implementation should always be able to map the respective backend queue state to something like 'jsaga:queued'.

The spec says that job_detail SHOULD be formatted as 'model:state', but you seem to have a very valid use case for defining your own backend model for jsaga.

OK. I was thinking that state_detail was supposed to contain only middleware-specific states, but indeed nothing in the specification is saying that... So I agree with you, I will remove my sub_state attribute and use the state_detail instead !

...

The reason why I am hesitant to agree with you, despite the validity of your use case, is: while defining the spec and the state model, we tried to expose at the top level only those states which are the result of a SAGA API call, or can be left with a SAGA API call.

There is no saga.job.queue() or saga::job::unqueue, so moving Queued into the top-level state model would break that rule of thumb.

...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

...
SAGA Session: ============ * An exception should be thrown when several contexts of the session can be used for a specific method call. Else, some files may be created with unexpected owner, some jobs may fail at the end of execution because of unexpected permissions, some accounts may be locked because of too many failed connection attempts.

Uh, this would badly break our use cases! In our implementation, you may very well copy a file instance with http, but read it with ftp, and delete it with nfs! And we need the respective contexts for these backends on the same session...

Yes, results are not completely predictable if multiple backends can be used, but the user has always the option to create a specific session with exactly one context attached.

So, that's a tough one.

OK. Can we consider that this is up to the implementation, or do you think these different behaviors would break portability ?

...

...
* AFAIK, existing SAGA implementations support many technologies and may be used with several grids. Default session is very convenient, but it is not usable when several grids can be accessed, because a single default session for all is not enough. Argument of session constructor could be an identifier (e.g. a grid identifier) instead of a boolean.

Again, you can create a context for a Grid type, and attach just that one context to a session. Isn't that what you need?

I think creating a default session is very convenient, and I was frustrated not being able to use this possibility with my use-cases! ;-) However, I can indeed create my own default session builder on top of the SAGA specification.

Best regards,
Sylvain

...

I may be missing something here...

Thanks, Andre.

Andre Merzky

2 p.m.

Hi again,

Quoting [Sylvain Reynaud] (Jun 05 2009):

...

...
...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point is, however, that we can't ensure that it does not break for other backends which require a path specification in the URL.

...

If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

That's also an option. What would be the difference, however, to keeping it in the job description? The info arrives at the same call, once in the description, once separately. I understand that having only JSDL-approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that does not make much of a difference, IMHO.

Well, I guess there are pros and cons for both versions. Maybe others on the list have a preference one way or the other?

...

...
...
- JobStartTime and JobContact: IMHO, these attributes are not in the 80% of the 80/20 rule, they are not supported by most middlewares, and they can be implemented by the user application.

Those have been included because DRMAA has these, and the DRMAA folx found them dead useful.

JobStartTime I expect to get defined in JSDL when they get started on scheduling (which is not in the near future AFAIK). In SAGA, it probably should be move to the resource management package, when that emerges.

When this feature is not available in middleware, should SAGA implementations delay job submission by itself (i.e. on client-side) or throw a NotImplemented exception ?

To throw or not to throw, both would be valid, IMHO, and up to the implementation. For example, specifying a startup time far in the future would most likely trigger a BadParameter exception, as it cannot be expected that the job service instance is still alive by then. Having a very short delay may be acceptable for some implementations, and not for others. Anyway, a BadParameter exception is more appropriate than a NotImplemented, as the latter would suggest that the method create_job() is not implemented at all. BadParameter suggests that the interpretation of the method parameters (the job description) failed, which is in fact what happens.
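As a rough Python sketch of the behaviour described here: an implementation without backend support either waits a short time on the client side or rejects the job description with BadParameter. The function name, the exception class and the 300 second limit are assumptions; the thread explicitly leaves the acceptable delay to each implementation.

```python
class BadParameter(Exception):
    """Stand-in for the SAGA BadParameter exception."""


def client_side_delay(start_time, now, backend_supports_start_time,
                      max_delay_s=300.0):
    """Seconds the client should wait before submitting the job.

    Times are seconds since the epoch; max_delay_s is an arbitrary assumption.
    Raises BadParameter when the requested start is too far in the future.
    """
    if backend_supports_start_time:
        return 0.0  # the middleware schedules the start itself
    delay = start_time - now
    if delay <= 0:
        return 0.0  # start time already reached
    if delay > max_delay_s:
        raise BadParameter(
            "JobStartTime is too far in the future for a client-side delay")
    return float(delay)
```

The exception is raised while interpreting the job description, which matches the argument that BadParameter fits better than NotImplemented.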

...

...
JobContact is a tricky one: it seems dead useful to be able to specify a custom name to a job, and to identify the job thus in an easy way, but (a) I have yet to see a backend which supports it, (b) SAGA does not provide a way to actually *use* this name, e.g. as jobid, and (c) such a mapping can trivially be implemented on application level.

I think you are talking about JobName, which is currently supported by JSAGA (I plan to remove it for portability) but is not defined in the SAGA specification.

Uhoh, bad mixup on my part - my apologies!

...

JobContact is "set of endpoints describing where to report job state transitions": I think your comments (a) and (c) also apply to this one.

Agree.

...

...
[Queued...]

OK. I was thinking that state_detail was supposed to contain only middleware-specific states, but indeed nothing in the specification is saying that... So I agree with you, I will remove my sub_state attribute and use the state_detail instead !

Great!

...

...
...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

...

...
...
SAGA Session: ============ * An exception should be thrown when several contexts of the session can be used for a specific method call. Else, some files may be created with unexpected owner, some jobs may fail at the end of execution because of unexpected permissions, some accounts may be locked because of too many failed connection attempts.

Uh, this would badly break our use cases! In our implementation, you may very well copy a file instance with http, but read it with ftp, and delete it with nfs! And we need the respective contexts for these backends on the same session...

Yes, results are not completely predictable if multiple backends can be used, but the user has always the option to create a specific session with exactly one context attached.

So, that's a tough one.

OK. Can we consider that this is up to the implementation, or do you think these different behaviors would break portability ?

I think it makes sense to leave that to the implementation, as (a) we have use cases for both versions, and (b) implementations are not required to bind to multiple backends: if they bind to a single backend, the behaviour would be very much like what you describe. I am still not sure about introducing an additional exception here, but that is another issue...

...

I think creating default session is very convenient, and I was frustrated not being able to use this possibility with my use-cases! ;-)

Hehe, I see. Good point. Might not justify an errata, but would be good to keep in mind for the next version... Cheers, Andre. -- Nothing is ever easy.

Sylvain Reynaud

5:26 p.m.

Andre Merzky wrote:

...

Hi again,

Hi again,

...

Quoting [Sylvain Reynaud] (Jun 05 2009):

...
...
...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point, however, is that we can't be sure it won't break for other backends which require a path specification in the URL.

Then it can be added to the query part of the URL... But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource. IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.
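As an illustration of the "query part" variant, here is a minimal Python sketch. The key name `queue` and the helper names `with_queue` and `queue_of` are assumptions made for this example only; neither SAGA nor JSAGA defines such a convention in this thread. An existing query string is kept verbatim, since the backend may already interpret it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def with_queue(url, queue, key="queue"):
    """Return url with the queue name appended to its query part.

    The key name "queue" is hypothetical; an existing query is left untouched.
    """
    parts = urlsplit(url)
    extra = urlencode({key: queue})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


def queue_of(url, key="queue"):
    """Return the queue name encoded in url, or None if there is none."""
    return parse_qs(urlsplit(url).query).get(key, [None])[0]


# The JSAGA-style URL from the thread, with a queue attached.
url = with_queue("wsgram://some.remote.host:9443/Fork", "short")
```

This keeps the job description free of site-specific data, but it only works if every backend agrees to reserve the chosen key, which is exactly the standardisation problem raised above.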

...

...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

Thats also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

...

I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API". But I agree, if their usefulness is confirmed, they must be kept.

...

Well, I guess there are pros and cons for both versions. Maybe others on the list have a preference one way or the other?

...
...
...
- JobStartTime and JobContact: IMHO, these attributes are not in the 80% of the 80/20 rule, they are not supported by most middlewares, and they can be implemented by the user application.

Those have been included because DRMAA has these, and the DRMAA folx found them dead useful.

JobStartTime I expect to get defined in JSDL when they get started on scheduling (which is not in the near future AFAIK). In SAGA, it probably should be moved to the resource management package, when that emerges.

When this feature is not available in middleware, should SAGA implementations delay job submission by itself (i.e. on client-side) or throw a NotImplemented exception ?

To throw or not to throw, both would be valid, IMHO, and up to the implementation. For example, specifying a startup time far in the future would most likely trigger a BadParameter exception, as it cannot be expected that the job service instance is still alive by then. Having a very short delay may be acceptable for some implementations, and not for others.

Anyway, a BadParameter exception is more appropriate than a NotImplemented, as the latter would suggest that the method create_job() is not implemented at all. BadParameter suggests that the interpretation of the method parameters (the job description) failed, which is in fact what happens.
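The policy described above could be sketched as follows in Python. This is only an illustration: `BadParameter` is a local stand-in for the SAGA exception, and `check_start_time` and the 60-second limit are assumptions, not part of the specification.

```python
import time


class BadParameter(Exception):
    """Local stand-in for the SAGA BadParameter exception."""


MAX_CLIENT_DELAY = 60.0  # seconds; an assumed, implementation-specific limit


def check_start_time(job_start_time, now=None, max_delay=MAX_CLIENT_DELAY):
    """Return the client-side delay in seconds, or raise BadParameter.

    job_start_time and now are seconds since the epoch. A start time in the
    past means "start immediately" (delay 0).
    """
    now = time.time() if now is None else now
    delay = max(0.0, job_start_time - now)
    if delay > max_delay:
        # The job description could not be honoured, so this is a
        # parameter problem rather than a missing create_job() method.
        raise BadParameter(
            f"JobStartTime is {delay:.0f}s ahead; cannot delay that long"
        )
    return delay
```

An implementation that cannot delay at all would simply use a limit of zero.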

OK.

...

...
...
JobContact is a tricky one: it seems dead useful to be able to specify a custom name to a job, and to identify the job thus in an easy way, but (a) I have yet to see a backend which supports it, (b) SAGA does not provide a way to actually *use* this name, e.g. as jobid, and (c) such a mapping can trivially be implemented on application level.

I think you are talking about JobName, which is currently supported by JSAGA (I plan to remove it for portability) but is not defined in the SAGA specification.

Uhoh, bad mixup on my part - my apologies!

...
JobContact is "set of endpoints describing where to report job state transitions": I think your comments (a) and (c) also apply to this one.

Agree.

...
...
[Queued...]

OK. I was thinking that state_detail was supposed to contain only middleware-specific states, but indeed nothing in the specification is saying that... So I agree with you, I will remove my sub_state attribute and use the state_detail instead !

Great!

...
...
...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

...

...
...
...
SAGA Session: ============ * An exception should be thrown when several contexts of the session can be used for a specific method call. Else, some files may be created with unexpected owner, some jobs may fail at the end of execution because of unexpected permissions, some accounts may be locked because of too many failed connection attempts.

Uh, this would badly break our use cases! In our implementation, you may very well copy a file instance with http, but read it with ftp, and delete it with nfs! And we need the respective contexts for these backends on the same session...

Yes, results are not completely predictable if multiple backends can be used, but the user has always the option to create a specific session with exactly one context attached.

So, that's a tough one.

OK. Can we consider that this is up to the implementation, or do you think these different behaviors would break portability ?

I think it makes sense to leave that to the implementation, as (a) we have use cases for both versions, and (b) implementations are not required to bind to multiple backends: if they bind to a single backend, the behaviour would be very much like what you describe.

OK.

...

I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) : << An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>
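A rough Python sketch of how such a rule might look inside an implementation. `AuthenticationFailed` is a local stand-in for the SAGA exception; `select_context` and its `strict` switch are hypothetical and only reflect the idea that the behaviour is left to the implementation.

```python
class AuthenticationFailed(Exception):
    """Local stand-in for the SAGA AuthenticationFailed exception."""


def select_context(contexts, usable, strict=True):
    """Pick the session context to use for one method call.

    contexts: the contexts attached to the session
    usable:   predicate telling whether a context fits this call
    strict:   if True, refuse to guess when several contexts fit;
              if False, fall back to the first usable one
    """
    candidates = [c for c in contexts if usable(c)]
    if not candidates:
        raise AuthenticationFailed("none of the available contexts could be used")
    if strict and len(candidates) > 1:
        raise AuthenticationFailed("cannot determine which context to use")
    return candidates[0]
```

With `strict=False` the multi-backend use case (http, ftp and nfs contexts on one session) keeps working; with `strict=True` the user gets an error instead of a silently chosen identity.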

...

...
I think creating default session is very convenient, and I was frustrated not being able to use this possibility with my use-cases! ;-)

Hehe, I see. Good point. Might not justify an errata, but would be good to keep in mind for the next version...

OK. Best regards, Sylvain

...

Cheers, Andre.

Andre Merzky

6:55 p.m.

Quoting [Sylvain Reynaud] (Jun 05 2009):

...

Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
...
...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point, however, is that we can't be sure it won't break for other backends which require a path specification in the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain, yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

...

...
...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

Thats also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations. Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% sure about the proposed solution, but that may be just me, being hesitant to change (I'm known for that, I'm afraid)...

...

...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...

...
...
...
...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal? I throw in 'FailIfExists' ...

...

...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think thats an excellent proposal. Best regards, Andre. -- Nothing is ever easy.

Sylvain Reynaud

6 Jun 6 Jun

6:29 p.m.

Andre Merzky wrote:

...

Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
...
...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point, however, is that we can't be sure it won't break for other backends which require a path specification in the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...

yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL). Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...

...
...
...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

Thats also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar : * If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B. * If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.
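To make the distinction concrete, here is a small Python sketch of the "queue as an argument of create_job" variant. All class names, fields, host names and queue names are hypothetical and only model the idea; they are not the SAGA API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDescription:
    """Site-independent job requirements only (no queue)."""
    executable: str
    operating_system: str = ""


@dataclass(frozen=True)
class JobService:
    """Stand-in for a job service bound to one execution site."""
    url: str

    def create_job(self, description, queue=None):
        # The queue belongs to the resource location, so it travels
        # with the call to this service, not with the description.
        return {"service": self.url, "queue": queue, "description": description}


# One description, reused unchanged on two grids with different queues.
jd = JobDescription(executable="/bin/date", operating_system="LINUX")
grid_a = JobService("wsgram://grid-a.example.org:9443/Fork")
grid_b = JobService("wsgram://grid-b.example.org:9443/Fork")
job_a = grid_a.create_job(jd, queue="short")
job_b = grid_b.create_job(jd, queue="vo_express")
```

The OperatingSystem requirement stays valid on both grids, while the queue name changes per site without touching the description object.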

...

Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% sure about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.

...

but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...

...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
...
...
...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...

I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-) Best regards, Sylvain

...

...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think thats an excellent proposal.

Best regards,

Andre.

Andre Merzky

23 Sep 23 Sep

1:47 p.m.

Dear Sylvain,

I dropped the ball on this thread, I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items
------------------------------------------------------------

- add a LastModified timestamp to namespace entries (in addition to Created timestamp)

  -> added as get_mtime() to namespace::entry

- IncorrectType for

    task t = f.get_size <Sync> ();
    size_t s = t.get_result <char> ();

  -> added to the spec

- context.toString():

  -> this is a language binding issue - no change in spec

- NSEntry.remove() should allow for rmdir

  -> corrected in spec (Recursive flag only required for non-empty dirs)

- CPUArchitecture and OperatingSystemType should be scalar

  -> needs to be fixed in spec

- jobs are missing a state "QUEUED"

  -> this is a state_detail of the Running state - no change in spec.

- removing the Queue attrib

  -> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first

- avoid check for existence on open/creation of ns entries

  -> two possible solutions

  (a) overload Exclusive

      // entry exists
      open (name, Create | Exclusive) : fail
      open (name, Create            ) : success
      open (name, Exclusive         ) : success
      open (name                    ) : success

      // entry does not exist
      open (name, Create | Exclusive) : success (creates)
      open (name, Create            ) : success
      open (name, Exclusive         ) : no check (later IncorrectState)
      open (name                    ) : fail

  (b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

      // entry exists
      open (name, Create | Exclusive) : fail
      open (name, Create            ) : success
      open (name, DNFIDNE           ) : success
      open (name                    ) : success

      // entry does not exist
      open (name, Create | Exclusive) : success (creates)
      open (name, Create            ) : success
      open (name, DNFIDNE           ) : no check (later IncorrectState)
      open (name                    ) : fail

  I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)

  Group should consider this to be a last call!

So, I hope I covered all items - let me know if not!

Best, Andre.

Quoting [Sylvain Reynaud] (Jun 06 2009):

...

Date: Sat, 06 Jun 2009 20:29:10 +0200 From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Andre Merzky <andre@merzky.net> CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point, however, is that we can't be sure it won't break for other backends which require a path specification in the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).

Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...
...
...
...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

Thats also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar : * If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B. * If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.

...
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% sure about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.

...
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...
...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
...
...
SAGA Name Spaces: ================ * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.

Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...
I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-)

Best regards, Sylvain

...
...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think thats an excellent proposal.

Best regards,

Andre.

-- Nothing is ever easy.

Sylvain Reynaud

24 Sep 24 Sep

10:21 a.m.

Hi,

Last attempt to propose a better name for flag 'DoNotFailIfDoesNotExist' (see last item: "avoid check for existence on open/creation of ns entries")...

What do you think about "MissingOK"?

- It gives a good idea of what it is supposed to do.
- It is short.
- It is already used, at least in the Linux world: by logrotate to continue with no error message when the log file does not exist, and in rpm spec files to continue with no error message when a package is not installed.

Best regards,
Sylvain

Andre Merzky wrote:

...

Dear Sylvain,

I dropped the ball on this thread, I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items ------------------------------------------------------------ - add a LastModified timestamp to namespace entries (in addition to Created timestamp)

-> added as get_mtime() to namespace::entry
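For illustration, this is roughly how a client could use such a timestamp to avoid re-reading slow remote files, which was the original motivation of the thread. The sketch is plain Python on local files: `MtimeCache` is a hypothetical helper, and `os.path.getmtime` merely stands in for a call like get_mtime() on a namespace entry. How fresh the timestamp is depends entirely on the backend.

```python
import os


class MtimeCache:
    """Re-read a file only when its last modification time has changed."""

    def __init__(self, get_mtime=os.path.getmtime):
        self._get_mtime = get_mtime   # cheap metadata query
        self._entries = {}            # path -> (mtime, content)
        self.reads = 0                # number of expensive full reads

    def read(self, path):
        mtime = self._get_mtime(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]          # unchanged: skip the expensive read
        with open(path, "rb") as f:
            content = f.read()
        self.reads += 1
        self._entries[path] = (mtime, content)
        return content
```

The cache trades one cheap metadata call per access for the chance to skip the full transfer, so it only pays off when querying the timestamp is much cheaper than reading the entry.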

- IncorrectType for

task t = f.get_size <Sync> ();
size_t s = t.get_result <char> ();

-> added to the spec

- context.toString():

-> this is a language binding issue - no change in spec

- NSEntry.remove() should allow for rmdir

-> corrected in spec (Recursive flag only required for non-empty dirs)

- CPUArchitecture and OperatingSystemType should be scalar

-> needs to be fixed in spec

- jobs are missing a state "QUEUED"

-> this is a state_detail of the Running state - no change in spec.

- removing the Queue attrib

-> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first

- avoid check for existence on open/creation of ns entries

-> two possible solutions

(a) overload Exclusive

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, Exclusive         ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, Exclusive         ) : no check (later IncorrectState)
    open (name                    ) : fail

(b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : no check (later IncorrectState)
    open (name                    ) : fail

I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)
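The two options can be written down as small decision functions. The Python sketch below only models the outcome tables of this summary; the `Flags` enum, `open_a` and `open_b` are illustrative names, `DNFIDNE` is the placeholder flag name used above, and the `exists` argument stands for the true state of the entry, which a real implementation would not check in the "no check" rows.

```python
from enum import IntFlag


class Flags(IntFlag):
    NONE = 0
    CREATE = 1
    EXCLUSIVE = 2
    DNFIDNE = 4  # placeholder name for the new flag of option (b)


def open_a(exists, flags=Flags.NONE):
    """Option (a): Exclusive without Create skips the existence check."""
    create = bool(flags & Flags.CREATE)
    exclusive = bool(flags & Flags.EXCLUSIVE)
    if create and exclusive:
        return "fail" if exists else "success (creates)"
    if create:
        return "success"
    if exclusive:
        return "success" if exists else "no check (later IncorrectState)"
    return "success" if exists else "fail"


def open_b(exists, flags=Flags.NONE):
    """Option (b): a dedicated flag skips the existence check.

    Exclusive keeps its current meaning and is ignored without Create.
    """
    create = bool(flags & Flags.CREATE)
    exclusive = bool(flags & Flags.EXCLUSIVE)
    if create and exclusive:
        return "fail" if exists else "success (creates)"
    if create:
        return "success"
    if flags & Flags.DNFIDNE:
        return "success" if exists else "no check (later IncorrectState)"
    return "success" if exists else "fail"
```

The two functions differ only in which flag selects the "no check" branch, which is why the remaining disagreement is essentially about naming.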

Group should consider this to be a last call!

So, I hope I covered all items - let me know if not!

Best, Andre.

Quoting [Sylvain Reynaud] (Jun 06 2009):

...
Date: Sat, 06 Jun 2009 20:29:10 +0200 From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Andre Merzky <andre@merzky.net> CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
- Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

Interesting point. The problem I see is that its hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.

For example, a globus job manager URL may well look like

https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...

Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point, however, is that we can't be sure it won't break for other backends which require a path specification in the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).

Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...
...
...
...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

That's also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar:

* If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B.
* If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.
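A minimal sketch of the "queue as an argument of create_job" variant, assuming hypothetical stand-in types (`JobDescription`, `Job`, `JobService`) rather than the real SAGA job package. It only shows that one description object can then be reused unchanged on two services whose queue names differ.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the SAGA job package; not the real API.
struct JobDescription {
    // Site-independent requirements only (e.g. OperatingSystemType); no "Queue".
    std::map<std::string, std::string> attributes;
};

struct Job {
    std::string endpoint;        // service the job was created on
    std::string queue;           // queue chosen for that service
    JobDescription description;  // copy of the shared description
};

class JobService {
public:
    explicit JobService(std::string url) : url_(std::move(url)) {}

    // The queue travels with the call, not with the description.
    Job create_job(const JobDescription& jd, const std::string& queue) const {
        return Job{url_, queue, jd};
    }

private:
    std::string url_;
};
```

With this shape, a caller builds one `JobDescription` and passes a different queue name to each `JobService`, for example a queue learned from service discovery for that endpoint.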

...
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.

...
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...
...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
...
>> SAGA Name Spaces:
>> ================
>> * add a flag to disable checking existence of entry in constructor
>>   and open methods, because the cost for this check is not negligible
>>   with some protocols (then subsequent method calls on this object may
>>   throw an IncorrectState exception if the entry does not exist).
>
> Makes sense. We could also overload 'Exclusive', which, at the moment,
> is only evaluated if 'Create' is specified. It has the same semantic
> meaning so (inversed): if 'Exclusive' is not specified on 'Create', an
> existing file is ignored.
>
> Would it make sense to allow Exclusive to be evaluated on all c'tors
> and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...
I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-)

Best regards, Sylvain

...
...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think that's an excellent proposal.

Best regards,

Andre.

Andre Merzky

12:36 p.m.

Hi Sylvain, Quoting [Sylvain Reynaud] (Sep 24 2009):

...

Hi,

Last attempt to propose a better name for flag 'DoNotFailIfDoesNotExist' (see last item: "avoid check for existence on open/creation of ns entries")... What do you think about "MissingOK" ?

It gives a good idea of what it is supposed to do. It is short. It is already used at least in the linux world : used by logrotate to continue with no error message when the log file does not exist, and used in rpm spec files to continue with no error message when a package is not installed.

I pondered about your problem again, and have a couple of questions (again). Sorry if I reopen the can of worms, after we converged pretty much already...

So, in fact what you want to achieve is a delayed initialization, because in your use case the additional round trip time for making sure the file exists is expensive.

I understand that - if you just read small amounts of data from many files, just opening the files can double the overall latency, for example.

The first question though is, why don't you use the asynchronous file constructor for performing the open operation? After all, the async operations have been introduced in particular to hide latencies.

Assuming that async ops do not help, for some reason:

Just allowing to delay the error on synchronous construction may set a bad precedent, really, as one could argue that we would need that on all operations. Like, one could create a job::service instance for some endpoint URL, but report a DoesNotExist error only when later trying to submit a job. Or one could write some data to a file, and report an error later on when trying to read that data back again, etc.

Is your use case different from those cases where one could delay error reporting, too? If so, how?

Or positively speaking:

If there is a reason why async operations won't work, and one still needs to have a flag to cover the use case, then in fact a 'DelayErrors' flag may be more appropriate, as it would allow us to use it in other situations, and not only in the specific one your use case met (file does not exist).

But then again, introducing a general flag for delaying errors is quite a significant semantic change, really. FWIW, Hartmut (and someone else, can't remember) brought that topic up a while ago, when wondering if SAGA calls should not be getting an optional additional parameter to be returned on any errors, like

  // standard SAGA call
  size_t s = file.get_size ();          // can throw
  std::cout << "size: " << s << "\n";

  // error ignoring SAGA call
  size_t s = file.get_size (0);         // never throws
  if ( s == 0 ) std::cout << "size: Unknown\n";

The proposed signature change would basically allow for SAGA calls which never throw, no matter the error condition.

Effectively, the 'DelayErrors' flag discussed above does the same for constructors.

I am not saying that we should consider those signatures, at least not for the current SAGA version (it is far too late in the specification roadmap to do so), but just wanted to mention it, as it seems to touch upon the same problem space.

Bottom line: An IgnoreDoesNotExistException (which your 'MissingOK' basically translates to) sounds a very specific flag, for a very specific use case. Do asynchronous operations help? If not, should we consider a DelayErrors flag, possibly for the next SAGA version?

Best, Andre.
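The "optional additional parameter" idea can be sketched as a pair of overloads. This is only an illustration under stated assumptions: the `File` class below is a hypothetical stand-in, not the SAGA file class, and `std::runtime_error` stands in for whatever SAGA exception the throwing call would raise.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a remote file handle; not the SAGA API.
class File {
public:
    File(bool exists, std::size_t size) : exists_(exists), size_(size) {}

    // Standard call: throws when the size cannot be determined.
    std::size_t get_size() const {
        if (!exists_) throw std::runtime_error("DoesNotExist");
        return size_;
    }

    // Error-ignoring call: returns 'fallback' instead of throwing.
    std::size_t get_size(std::size_t fallback) const noexcept {
        return exists_ ? size_ : fallback;
    }

private:
    bool exists_;
    std::size_t size_;
};
```

One consequence of this design is that a fallback of 0 cannot be distinguished from a genuinely empty file, so the caller has to pick a value that cannot occur as a real result.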

...

Best regards, Sylvain

Andre Merzky wrote:

...
Dear Sylvain,

I dropped the ball on this thread I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items
------------------------------------------------------------

- add a LastModified timestamp to namespace entries (in addition to Created timestamp)

-> added as get_mtime() to namespace::entry

- IncorrectType for

    task t = f.get_size <Sync> ();
    size_t s = t.get_result <char> ();

-> added to the spec

- context.toString():

-> this is a language binding issue - no change in spec

- NSEntry.remove() should allow for rmdir

-> corrected in spec (Recursive flag only required for non-empty dirs)

- CPUArchitecture and OperatingSystemType should be scalar

-> needs to be fixed in spec

- jobs are missing a state "QUEUED"

-> this is a state_detail of the Running state - no change in spec.

- removing the Queue attrib

-> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first

- avoid check for existence on open/creation of ns entries

-> two possible solutions

(a) overload Exclusive

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, Exclusive         ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, Exclusive         ) : no check (later IncorrectState)
    open (name                    ) : fail

(b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

    // entry exists
    open (name, Create | Exclusive) : fail
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : success
    open (name                    ) : success

    // entry does not exist
    open (name, Create | Exclusive) : success (creates)
    open (name, Create            ) : success
    open (name, DNFIDNE           ) : no check (later IncorrectState)
    open (name                    ) : fail

I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)

Group should consider this to be a last call!

So, I hope I covered all items - let me know if not!

Best, Andre.

Quoting [Sylvain Reynaud] (Jun 06 2009):

...
Date: Sat, 06 Jun 2009 20:29:10 +0200 From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Andre Merzky <andre@merzky.net> CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

>>> - Queue: this attribute makes the job description dependent on the
>>>   targeted execution site, this information should be put in the URL
>>>   instead.
>>
>> Interesting point. The problem I see is that its hard to define a
>> standard way on *how* to encode it in the URL, as each URL component
>> (host, path, query, ...) may already be interpreted by the backend.
>>
>> For example, a globus job manager URL may well look like
>>
>> https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...
>>
>> Where would you put the queue?
>
> In JSAGA, such URL is used internally, user gives this URL:
> wsgram://some.remote.host:9443/Fork

sure, that will mostly work. The point is however, that we can't ensure that it does not break for other backends which require a path specification on the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).

Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...
...
...
> If encoding the queue in the URL is not an acceptable solution, then I
> think the queue should be moved from attributes of job description to
> arguments of method job_service.create_job.

That's also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar:

* If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B.
* If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.

...
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.

...
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...
...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
>>> SAGA Name Spaces:
>>> ================
>>> * add a flag to disable checking existence of entry in constructor
>>>   and open methods, because the cost for this check is not negligible
>>>   with some protocols (then subsequent method calls on this object
>>>   may throw an IncorrectState exception if the entry does not exist).
>>
>> Makes sense. We could also overload 'Exclusive', which, at the moment,
>> is only evaluated if 'Create' is specified. It has the same semantic
>> meaning so (inversed): if 'Exclusive' is not specified on 'Create', an
>> existing file is ignored.
>>
>> Would it make sense to allow Exclusive to be evaluated on all c'tors
>> and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...
I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-)

Best regards, Sylvain

...
...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think that's an excellent proposal.

Best regards,

Andre.

-- Nothing is ever easy.

Sylvain Reynaud

1:55 p.m.

Andre Merzky wrote:

...

Hi Sylvain,

Hi Andre,

...

Quoting [Sylvain Reynaud] (Sep 24 2009):

...
Hi,

Last attempt to propose a better name for flag 'DoNotFailIfDoesNotExist' (see last item: "avoid check for existence on open/creation of ns entries")... What do you think about "MissingOK" ?

It gives a good idea of what it is supposed to do. It is short. It is already used at least in the linux world : used by logrotate to continue with no error message when the log file does not exist, and used in rpm spec files to continue with no error message when a package is not installed.

I pondered about your problem again, and have a couple of questions (again). Sorry if I reopen the can of worms, after we converged pretty much already...

So, in fact what you want to achieve is a delayed initialization, because in your use case the additional round trip time for making sure the file exists is expensive.

I understand that - if you just read small amounts of data from many files, just opening the files can double the overall latency, for example.

Yes. That's the point.

...

The first question though is, why don't you use the asynchronous file constructor for performing the open operation? After all, the async operations have been introduced in particular to hide latencies.

Assuming that async ops do not help, for some reason:

Just allowing to delay the error on synchronous construction may set a bad precedent, really, as one could argue that we would need that on all operations. Like, one could create a job::service instance for some endpoint URL, but report a DoesNotExist error only when later trying to submit a job.

Yes, this use-case is equivalent to mine, although the impact on the latency may not be as significant.

...

Or one could write some data to a file, and report an error later on when trying to read that data back again,

This one is a different use-case because SAGA implementation does not have to query the remote service to know if data is being written. Hence, user can implement this behavior by catching the exception raised.

...

etc.

Is your use case different from those cases where one could delay error reporting, too? If so, how?

Or positively speaking:

If there is a reason why async operations won't work,

It would work, but it is still useless messages sent to the server...

...

and one still needs to have a flag to cover the use case, then in fact a 'DelayErrors' flag may be more appropriate, as it would allow us to use it in other situations, and not only in the specific one your use case met (file does not exist).

I agree. 'DelayErrors' is a better name than 'MissingOK', even for my use case, because all these errors may be raised on later invocations of method objects.

...

But then again, introducing a general flag for delaying errors is quite a significant semantic change, really. FWIW, Hartmut (and someone else, can't remember) brought that topic up a while ago, when wondering if SAGA calls should not be getting an optional additional parameter to be returned on any errors, like

  // standard SAGA call
  size_t s = file.get_size ();          // can throw
  std::cout << "size: " << s << "\n";

  // error ignoring SAGA call
  size_t s = file.get_size (0);         // never throws
  if ( s == 0 ) std::cout << "size: Unknown\n";

The proposed signature change would basically allow for SAGA calls which never throw, no matter the error condition.

I don't see any reason for doing this, since the user can always catch the exception if he/she wants to ignore the error. What I would like to do is not to ignore errors that have already been detected, but to disable preliminary checks when needed.
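The difference between ignoring an error and skipping a preliminary check can be sketched as follows. Everything here is an assumption made for illustration: `Remote` is a fake service that counts round trips, `Entry` is a hypothetical namespace entry, and `DelayErrors` is the flag name floated in this thread, not an existing SAGA flag. The exception types are local stand-ins named after the SAGA exceptions mentioned above.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Local stand-ins named after the exceptions discussed in the thread.
struct IncorrectState : std::runtime_error { using std::runtime_error::runtime_error; };
struct DoesNotExist   : std::runtime_error { using std::runtime_error::runtime_error; };

// Fake remote service that counts the round trips it receives.
struct Remote {
    std::set<std::string> entries;
    int round_trips = 0;

    bool exists(const std::string& name) {
        ++round_trips;
        return entries.count(name) > 0;
    }
    std::string read(const std::string& name) {
        ++round_trips;
        if (entries.count(name) == 0) throw IncorrectState("no such entry: " + name);
        return "data";
    }
};

enum CtorFlags : unsigned { Default = 0, DelayErrors = 1 };  // hypothetical flag values

class Entry {
public:
    Entry(Remote& remote, std::string name, unsigned flags = Default)
        : remote_(remote), name_(std::move(name)) {
        // Preliminary existence check; skipped when the caller asks for delayed errors.
        if ((flags & DelayErrors) == 0 && !remote_.exists(name_)) throw DoesNotExist(name_);
    }

    // With DelayErrors, a missing entry surfaces here as IncorrectState.
    std::string read() { return remote_.read(name_); }

private:
    Remote& remote_;
    std::string name_;
};
```

In this sketch, opening and reading an existing entry costs two round trips by default and one with `DelayErrors`; catching the exception on the caller side would not save the first round trip, because the check has already been sent.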

...

Effectively, the 'DelayErrors' flag discussed above does the same for constructors.

I am not saying that we should consider those signatures, at least not for the current SAGA version (it is far too late in the specification roadmap to do so), but just wanted to mention it, as it seems to touch upon the same problem space.

Bottom line: An IgnoreDoesNotExistException (which your 'MissingOK' basically translates to) sounds a very specific flag, for a very specific use case. Do asynchronous operations help?

Better than synchronous, but still not optimal...

...

If not, should we consider a DelayErrors flag, possibly for the next SAGA version?

'DelayErrors' sounds good to me.

Best regards, Sylvain

...

Best, Andre.

...
Best regards, Sylvain


Andre Merzky

2:29 p.m.

Quoting [Sylvain Reynaud] (Sep 24 2009):

...

...
Is your use case different from those cases where one could delay error reporting, too? If so, how?

Or positively speaking:

If there is a reason why async operations won't work,

It would work, but it is still useless messages sent to the server...

Ah, ok - but then we are talking about bandwidth, and fairly small bandwidth, too.

...

...
But then again, introducing a general flag for delaying errors is quite a significant semantic change, really. FWIW, Hartmut (and someone else, can't remember) brought that topic up a while ago, when wondering if SAGA calls should not be getting an optional additional parameter to be returned on any errors, like

  // standard SAGA call
  size_t s = file.get_size ();          // can throw
  std::cout << "size: " << s << "\n";

  // error ignoring SAGA call
  size_t s = file.get_size (0);         // never throws
  if ( s == 0 ) std::cout << "size: Unknown\n";

The proposed signature change would basically allow for SAGA calls which never throw, no matter the error condition.

I don't see any reason for doing this, since the user can always catch the exception if he/she wants to ignore the error.

Well, the same can be said for your use case. The reason is caching, and delayed execution. The example above may be misleading, I agree. But consider a file.open(), or file.copy(), which basically has the same semantics as the file constructor.

...

What I would like to do is not to ignore errors that have already been detected, but to disable preliminary checks when needed.

It depends on the use case which checks are preliminary. For your UC it is file existence. For another use case it may be file read permissions. Yet another UC may be job submission permission. Or if a stream can in fact connect to a port. Or if an RPC endpoint really exists and can be used. These are all checks which are implied for various SAGA object constructors... That is why I am hesitant to add it to the spec, and then describe the flag as 'don't check for file existence', because someone else will come and ask for a different check to be disabled for sure. Bottomless pit ;-)

...

...
Bottom line: An IgnoreDoesNotExistException (which your 'MissingOK' basically translates to) sounds a very specific flag, for a very specific use case. Do asynchronous operations help?

Better than synchronous, but still not optimal...

...
If not, should we consider a DelayErrors flag, possibly for the next SAGA version?

'DelayErrors' sounds good to me.

Ok, great - then let's stick to that name. But I'd rather move that to the next version, unless it is a no-go for you.

Any opinion from the others?

Cheers, Andre.

-- Nothing is ever easy.

Sylvain Reynaud

3:11 p.m.

Andre Merzky wrote:

...

Quoting [Sylvain Reynaud] (Sep 24 2009):

...
...
Is your use case different from those cases where one could delay error reporting, too? If so, how?

Or positively speaking:

If there is a reason why async operations won't work,

It would work, but it is still useless messages sent to the server...

Ah, ok - but then we are talking about bandwidth, and fairly small bandwidth, too.

One use-case for this is a demo from a laptop in a conference for example.

...

...
...
But then again, introducing a general flag for delaying errors is quite a significant semantic change, really. FWIW, Hartmut (and someone else, can't remember) brought that topic up a while ago, when wondering if SAGA calls should not be getting an optional additional parameter to be returned on any errors, like

// standard SAGA call
size_t s = file.get_size ();   // can throw
std::cout << "size: " << s << "\n";

// error ignoring SAGA call
size_t s = file.get_size (0);  // never throws
if ( s == 0 ) std::cout << "size: Unknown\n";

The proposed signature change would basically allow for SAGA calls which never throw, no matter the error condition.

I don't see any reason for doing this, since the user can always catch the exception if he/she wants to ignore the error.

Well, the same can be said for your use case. The reason is caching, and delayed execution. The example above may be misleading, I agree. But consider a file.open(), or file.copy(), which basically has the same semantics as the file constructor.

Yes, file.open() has exactly the same semantics and would need the same flag. But file.copy() generally detects errors with no extra communication.

...

...
What I would like to do is not to ignore errors that have already been detected, but to disable preliminary checks when needed.

It depends on the use case which checks are preliminary. For your UC it is file existence. For another use case it may be file read permissions. Yet another UC may be job submission permission. Or if a stream can in fact connect to a port. Or if an RPC endpoint really exists and can be used. These are all checks which are implied for various SAGA object constructors...

That is why I am hesitant to add it to the spec, and then describe the flag as 'don't check for file existence', because someone else will come and ask for a different check to be disabled for sure. Bottomless pit ;-)

'DelayErrors' includes them all at least for namespace entries. Other packages do not have flags anyway.
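A minimal sketch of what such a flag could mean for a namespace entry, under stated assumptions: `demo_entry`, `fake_remote` and the flag value are hypothetical, and only the name 'DelayErrors' and the late IncorrectState come from this thread. With the flag set, the constructor skips the existence check (saving one round trip) and a missing entry only shows up on the first real operation.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical exception types, named after the SAGA exceptions discussed here.
struct does_not_exist : std::runtime_error { using std::runtime_error::runtime_error; };
struct incorrect_state : std::runtime_error { using std::runtime_error::runtime_error; };

// Toy "remote" namespace that counts round trips to the server.
struct fake_remote {
    std::set<std::string> entries;
    int round_trips = 0;

    bool exists(const std::string& name) {
        ++round_trips;
        return entries.count(name) > 0;
    }
    std::string read(const std::string& name) {
        ++round_trips;
        if (entries.count(name) == 0) throw incorrect_state(name + " does not exist");
        return "contents of " + name;
    }
};

// DelayErrors is the flag name proposed in the thread; it is not in the spec.
enum open_flag { None = 0, DelayErrors = 1 };

class demo_entry {
public:
    demo_entry(fake_remote& remote, std::string name, int flags = None)
        : remote_(remote), name_(std::move(name)) {
        // Preliminary check, skipped entirely when DelayErrors is set.
        if ((flags & DelayErrors) == 0 && !remote_.exists(name_))
            throw does_not_exist(name_ + " does not exist");
    }

    // First real operation; a delayed error surfaces here.
    std::string read() { return remote_.read(name_); }

private:
    fake_remote& remote_;
    std::string name_;
};
```

In this toy model an eager open followed by a read costs two round trips, a delayed open followed by a read costs one.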

...

...
...
Bottom line: An IgnoreDoesNotExistException (which your 'MissingOK' basically translates to) sounds a very specific flag, for a very specific use case. Do asynchronous operations help?

Better than synchronous, but still not optimal...

...
If not, should we consider a DelayErrors flag, possibly for the next SAGA version?

'DelayErrors' sounds good to me.

Ok, great - then let's stick to that name. But I'd rather move that to the next version, unless it is a no-go for you.

I guess some JSAGA users will probably prefer to continue using my non-standard flag rather than using asynchronous methods, but I think the compatibility break is not so big if SAGA implementations can just ignore non-standard flags... unless both of them have non-standard flags of course...

Cheers, Sylvain

...

Any opinion from the others?

Cheers, Andre.

Andre Merzky

29 Sep 29 Sep

12:27 p.m.

Dear Sylvain,

About the DelayError flag: could we leave that to a later version of the API? My impression is that

- the problem affects a rather small number of use cases
- the problem is (at least partially) solved by async calls
- the solution implies significant semantic changes, like state changes for ns_entries, limited ability to retrieve the complete error, etc.
- the solution picks a very specific call (open) to delay errors, but ignores the same problem on other calls

I think I understand the problem you are trying to solve by now, but I'd suggest not to use a stopgap solution.

It's a pity that others did not utter any preferences - that would make it easier to estimate how important people think this is... Anyway, I'll certainly try to get some feedback from people at OGF27.

Cheers, Andre.

Quoting [Sylvain Reynaud] (Sep 24 2009):

...

Date: Thu, 24 Sep 2009 17:11:46 +0200 From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Andre Merzky <andre@merzky.net> CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time


-- Nothing is ever easy.

Sylvain Reynaud

12:47 p.m.

Dear Andre,

OK for leaving that to a later version.

Until this later version, we will probably keep our non-standard flag, but also warn the users that it breaks application portability and propose asynchronous calls as an alternative.

Cheers, Sylvain

Andre Merzky wrote:

...


Andre Merzky

12:50 p.m.

Sounds reasonable. Thanks, Andre. Quoting [Sylvain Reynaud] (Sep 29 2009):

...


-- Nothing is ever easy.

Sylvain Reynaud

10:48 a.m.

Dear Andre,

I think we have missed one item: see the last one at the very end of this mail...

We chose to implement a different behavior for authentication because automatically selecting the security context to use could lead to problems that would be painful to recover, such as creating files with unexpected owner, submitting jobs that will not be allowed to run but not to store their result, locking accounts because of too many failed connection attempts.

We will replace our non-standard 'Ambiguity' exception with an 'AuthenticationFailed' exception, but the behavior of JSAGA when there are several context candidates is still different from what is currently described in the specification.

Should the description of 'AuthenticationFailed' be changed as I proposed in order to allow for this behavior?

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

Cheers, Sylvain

Andre Merzky wrote:

...

Dear Sylvain,

I dropped the ball on this thread I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items
------------------------------------------------------------

- add a LastModified timestamp to namespace entries (in addition to Created timestamp)

-> added as get_mtime() to namespace::entry

- IncorrectType for

task t = f.get_size <Sync> ();
size_t s = t.get_result <char> ();

-> added to the spec
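The IncorrectType case above can be illustrated with a small type-erased task. This is a sketch only: `demo_task` and `incorrect_type` are hypothetical names, not the SAGA C++ binding. The task remembers the type of its stored result and rejects a `get_result` request for any other type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <typeinfo>
#include <utility>

// Hypothetical exception, named after the IncorrectType exception discussed here.
struct incorrect_type : std::runtime_error { using std::runtime_error::runtime_error; };

// Hypothetical task holding a result of some type chosen at creation time.
class demo_task {
public:
    template <typename T>
    static demo_task make(T value) {
        demo_task t;
        t.type_ = &typeid(T);
        t.value_ = std::make_shared<T>(std::move(value));
        return t;
    }

    // Returns the stored result, or throws if T is not the stored type.
    template <typename T>
    T get_result() const {
        if (!(*type_ == typeid(T)))
            throw incorrect_type("requested result type does not match the task");
        return *std::static_pointer_cast<T>(value_);
    }

private:
    demo_task() = default;
    const std::type_info* type_ = nullptr;
    std::shared_ptr<void> value_;
};
```

A task created from a `size_t` result then accepts `get_result<std::size_t>()` and rejects `get_result<char>()`.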

- context.toString():

-> this is a language binding issue - no change in spec

- NSEntry.remove() should allow for rmdir

-> corrected in spec (Recursive flag only required for non-empty dirs)

- CPUArchitecture and OperatingSystemType should be scalar

-> needs to be fixed in spec

- jobs are missing a state "QUEUED"

-> this is a state_detail of the Running state - no change in spec.

- removing the Queue attrib

-> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first

- avoid check for existence on open/creation of ns entries

-> two possible solutions

(a) overload Exclusive

// entry exists
open (name, Create | Exclusive) : fail
open (name, Create            ) : success
open (name,          Exclusive) : success
open (name                    ) : success

// entry does not exist
open (name, Create | Exclusive) : success (creates)
open (name, Create            ) : success
open (name,          Exclusive) : no check (later IncorrectState)
open (name                    ) : fail

(b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

// entry exists
open (name, Create | Exclusive) : fail
open (name, Create            ) : success
open (name, DNFIDNE           ) : success
open (name                    ) : success

// entry does not exist
open (name, Create | Exclusive) : success (creates)
open (name, Create            ) : success
open (name, DNFIDNE           ) : no check (later IncorrectState)
open (name                    ) : fail

I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)

Group should consider this to be a last call!
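For reference, the outcome table of option (a) can be written down as a small decision function. This is only a sketch of the table in this mail: the enum values and the `open_outcome` function are hypothetical, and `unchecked` stands for "no check (later IncorrectState)".

```cpp
#include <cassert>

// Hypothetical flag values; only the names Create and Exclusive come from the spec.
enum open_flag : unsigned { None = 0, Create = 1, Exclusive = 2 };

enum class outcome { success, creates, fail, unchecked };

// Encodes the option (a) table: Exclusive without Create means
// "do not check for existence at open time".
outcome open_outcome(bool exists, unsigned flags) {
    const bool create = (flags & Create) != 0;
    const bool exclusive = (flags & Exclusive) != 0;

    if (create && exclusive) return exists ? outcome::fail : outcome::creates;
    if (create) return outcome::success;
    if (exclusive) return exists ? outcome::success : outcome::unchecked;
    return exists ? outcome::success : outcome::fail;
}
```

Option (b) would have the same shape, with the third branch keyed on the new flag instead of Exclusive.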

So, I hope I covered all items - let me know if not!

Best, Andre.

Quoting [Sylvain Reynaud] (Jun 06 2009):

...
Date: Sat, 06 Jun 2009 20:29:10 +0200 From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr> To: Andre Merzky <andre@merzky.net> CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
>> - Queue: this attribute makes the job description dependent on the targeted execution site, this information should be put in the URL instead.

> Interesting point. The problem I see is that it's hard to define a standard way on *how* to encode it in the URL, as each URL component (host, path, query, ...) may already be interpreted by the backend.
>
> For example, a globus job manager URL may well look like
>
> https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...
>
> Where would you put the queue?

In JSAGA, such URL is used internally, user gives this URL: wsgram://some.remote.host:9443/Fork

sure, that will mostly work. The point is however, that we can't assure that it breaks for other backends which require a path specification on the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).

Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...
...
...
...
If encoding the queue in the URL is not an acceptable solution, then I think the queue should be moved from attributes of job description to arguments of method job_service.create_job.

That's also an option. What would be the difference however to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar:

* If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B.
* If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.

...
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% sure about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.
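As a sketch of the create_job-argument variant mentioned above (one of several options, none of them agreed on in this thread): the types below are hypothetical and much simpler than the real SAGA job package, and the URLs and queue names are placeholders. The point is only that a description without a Queue attribute can be reused unchanged across sites.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, site-independent job description: no Queue attribute.
struct job_description {
    std::string executable;
    std::string operating_system;
};

// Hypothetical record of what was submitted where.
struct job {
    std::string service_url;
    std::string queue;
    std::string executable;
};

// Hypothetical job service: the queue is given per submission,
// next to the description rather than inside it.
class job_service {
public:
    explicit job_service(std::string url) : url_(std::move(url)) {}

    job create_job(const job_description& jd, const std::string& queue) const {
        return job{url_, queue, jd.executable};
    }

private:
    std::string url_;
};
```

The same `job_description` object can then be passed to services on grid A and grid B, each call naming whatever queue exists at that site.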

...
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...
...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that makes not much of the difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
...
>> SAGA Name Spaces:
>> ================
>> * add a flag to disable checking existence of entry in constructor and open methods, because the cost for this check is not negligible with some protocols (then subsequent method calls on this object may throw an IncorrectState exception if the entry does not exist).

> Makes sense. We could also overload 'Exclusive', which, at the moment, is only evaluated if 'Create' is specified. It has the same semantic meaning so (inversed): if 'Exclusive' is not specified on 'Create', an existing file is ignored.
>
> Would it make sense to allow Exclusive to be evaluated on all c'tors and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...
I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-)

Best regards, Sylvain

...
...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think that's an excellent proposal.

Best regards,

Andre.

Andre Merzky

10:55 a.m.

Dear Sylvain, Quoting [Sylvain Reynaud] (Sep 29 2009):

...

Dear Andre,

I think we have missed one item: see the last one at the very end of this mail...

We chose to implement a different behavior for authentication because automatically selecting the security context to use could lead to problems that would be painful to recover, such as creating files with unexpected owner, submitting jobs that will not be allowed to run but not to store their result, locking accounts because of too many failed connection attempts.

We will replace our non-standard 'Ambiguity' exception with an 'AuthenticationFailed' exception, but the behavior of JSAGA when there are several context candidates is still different from what is currently described in the specification.

Should the description of 'AuthenticationFailed' be changed as I proposed in order to allow for this behavior?

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

Right - sorry that I forgot about that one. I agree that the wording should be changed here - is fixed now in CVS. Thanks, Andre.
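A sketch of the behavior the revised wording is meant to allow, under stated assumptions: `context`, `select_context` and matching by a `type` string are hypothetical simplifications, not JSAGA's actual code. The selection refuses to guess: it fails both when no context fits and when more than one does.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception, named after the SAGA AuthenticationFailed exception.
struct authentication_failed : std::runtime_error { using std::runtime_error::runtime_error; };

// Hypothetical, minimal security context.
struct context {
    std::string type;
    std::string user_id;
};

// Picks the single context of the required type from the session.
// Throws if none matches, or if several match and the choice is ambiguous.
const context& select_context(const std::vector<context>& session,
                              const std::string& required_type) {
    const context* match = nullptr;
    for (const auto& c : session) {
        if (c.type != required_type) continue;
        if (match != nullptr)
            throw authentication_failed("can not determine which context to use");
        match = &c;
    }
    if (match == nullptr)
        throw authentication_failed("none of the available contexts could be used");
    return *match;
}
```

Failing early on ambiguity avoids the side effects listed above, such as files created under an unexpected owner or accounts locked after repeated failed attempts.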

...

Cheers, Sylvain

Andre Merzky wrote:

...
Dear Sylvain,

I dropped the ball on this thread I think. Also, I think we came to a conclusion about a number of issues already. So, let me try to summarize where we stand. I'd like to use this as a last call for the list for the closed items, and as a call for feedback for the items still open.

Closed items
------------------------------------------------------------

- add a LastModified timestamp to namespace entries (in addition to Created timestamp)

-> added as get_mtime() to namespace::entry

- IncorrectType for

task t = f.get_size <Sync> ();
size_t s = t.get_result <char> ();

-> added to the spec

- context.toString():

-> this is a language binding issue - no change in spec

- NSEntry.remove() should allow for rmdir

-> corrected in spec (Recursive flag only required for non-empty dirs)

- CPUArchitecture and OperatingSystemType should be scalar

-> needs to be fixed in spec

- jobs are missing a state "QUEUED"

-> this is a state_detail of the Running state - no change in spec.

- removing the Queue attrib

-> resolution unclear, possibly postponed to next JSDL version, or to a SAGA resource package, whichever comes first

- avoid check for existence on open/creation of ns entries

-> two possible solutions

(a) overload Exclusive

// entry exists
open (name, Create | Exclusive) : fail
open (name, Create            ) : success
open (name,          Exclusive) : success
open (name                    ) : success

// entry does not exist
open (name, Create | Exclusive) : success (creates)
open (name, Create            ) : success
open (name,          Exclusive) : no check (later IncorrectState)
open (name                    ) : fail

(b) add new flag 'DoNotFailIfDoesNotExist' (better name needed)

// entry exists
open (name, Create | Exclusive) : fail
open (name, Create            ) : success
open (name, DNFIDNE           ) : success
open (name                    ) : success

// entry does not exist
open (name, Create | Exclusive) : success (creates)
open (name, Create            ) : success
open (name, DNFIDNE           ) : no check (later IncorrectState)
open (name                    ) : fail

I vote for (a), because I think it's simpler and because I can't think of a good name for the new flag. Sylvain votes for (b) IIRC, but does not have a good name either ;-)

Group should consider this to be a last call!

So, I hope I covered all items - let me know if not!

Best, Andre.

Quoting [Sylvain Reynaud] (Jun 06 2009):

...
Date: Sat, 06 Jun 2009 20:29:10 +0200
From: Sylvain Reynaud <Sylvain.Reynaud@in2p3.fr>
To: Andre Merzky <andre@merzky.net>
CC: Thilo Kielmann <kielmann@cs.vu.nl>, saga-rg@ogf.org
Subject: Re: [SAGA-RG] missing(?) method reporting last modification time

Andre Merzky wrote:

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

...
Andre Merzky wrote:

...
Hi again,

Hi again,

...
Quoting [Sylvain Reynaud] (Jun 05 2009):

>>> - Queue: this attribute makes the job description dependent on the
>>>   targeted execution site, this information should be put in the URL
>>>   instead.
>>
>> Interesting point. The problem I see is that it's hard to define a
>> standard way on *how* to encode it in the URL, as each URL component
>> (host, path, query, ...) may already be interpreted by the backend.
>>
>> For example, a globus job manager URL may well look like
>>
>> https://some.remote.host:9443/wsrf/services/ManagedExecutableJobService?65e5...
>>
>> Where would you put the queue?
>
> In JSAGA, such a URL is used internally; the user gives this URL:
> wsgram://some.remote.host:9443/Fork

Sure, that will mostly work. The point is, however, that we can't be sure it doesn't break for other backends which require a path specification in the URL.

But anyway, I think that the main point is not to know if we should put it in the URL or not, it is rather to know if the queue is part of the job description or part of the targeted resource.

IMHO, the answer is "targeted resource", because if the service discovery extension does not provide this information (either in the URL or in the service_data object), you can not guess it by yourself.

Hi Sylvain,

Hi,

...
yes, excellent description of the problem: it should be part of the resource specification, not part of the job description. Alas, we don't have a resource description (yet). BTW, the same holds IMHO for CPUArchitecture for example, doesn't it?

I think CPUArchitecture and other resource specification attributes are part of the job description, since they describe the job requirements. But IMHO, attribute queue is not part of resource specification, it is part of *resource location* (like URL).

Although queues are often configured with names "short" or "long", they can be used for very different purposes (e.g. queues by VO, by SLA, by feature...), they can have different names even when used for the same purpose, and when discovering job services, the queue is always in the response rather than in the query.

...
...
...
> If encoding the queue in the URL is not an acceptable solution, then
> I think the queue should be moved from attributes of job description
> to arguments of method job_service.create_job.

That's also an option. What would be the difference, however, to keeping it in the job description? The info arrives at the same call, once in the description, once separate.

The difference is that other attributes in job description do not depend on a particular execution site or a particular grid. Hence the same job description object could be used to run jobs on different hosts (and even on different grids) if it has no attribute "Queue".

Ideally that may be true, but in practice, CPUArchitecture, OperatingSystem, and others pose similar limitations.

IMHO the limitations are not similar:

* If a job requires a specific OperatingSystem to run, then we can assume this requirement is the same for grid A and grid B.
* If the user wants to submit his job on a specific queue on grid A, he can not expect to have the same queue on grid B.
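As a rough illustration of the "queue as an argument of create_job" variant discussed here, the following C++ sketch keeps the description free of site-specific data and passes the queue per call. All type names, members and the example host names are hypothetical and are not the SAGA job package.

```cpp
#include <string>

// Hypothetical types for illustration only.
struct job_description {
    std::string executable;
    std::string operating_system;  // a requirement of the job itself
};

struct job {
    std::string     service_url;
    std::string     queue;
    job_description description;
};

class job_service {
public:
    explicit job_service(const std::string& url) : url_(url) {}

    // The queue names a location on this particular service, so it is
    // passed per call instead of being stored in the description.
    job create_job(const job_description& jd, const std::string& queue) const {
        job j;
        j.service_url = url_;
        j.queue       = queue;
        j.description = jd;
        return j;
    }

private:
    std::string url_;
};
```

With this shape, one `job_description` can be submitted unchanged to two services, e.g. `job_service("wsgram://grid-a.example.org")` with queue `"short"` and `job_service("wsgram://grid-b.example.org")` with queue `"vo_atlas"`; only the second argument differs.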

...
Anyway, don't get me wrong: I think I mostly agree with you about the problem statement, and the cause. I am not 100% about the proposed solution,

I have no preference on the proposed solution (URL, create_job argument, or other solution...), I just think queue must be removed from job description.

...
but that may be just me, being hesitant to change (I'm known for that I'm afraid)...

I think you are right to be hesitant; specifications must not change too much!

...
...
...
I understand that having only JSDL approved keys in the job description is a clean solution - but that is mostly for the benefit of the SAGA implementors. For the SAGA users, that does not make much of a difference, IMHO.

Since they are not in the JSDL specification, these attributes are likely to be put at stake... Moreover, the SAGA specification says these attributes "might disappear in future versions of the SAGA API".

But I agree, if their usefulness is confirmed, they must be kept.

I think, in the long run, further versions of JSDL, and JSDL extensions, will make our lives much easier...

...
...
>>> SAGA Name Spaces:
>>> ================
>>> * add a flag to disable checking existence of entry in constructor
>>>   and open methods, because the cost for this check is not negligible
>>>   with some protocols (then subsequent method calls on this object
>>>   may throw an IncorrectState exception if the entry does not exist).
>>
>> Makes sense. We could also overload 'Exclusive', which, at the
>> moment, is only evaluated if 'Create' is specified. It has the same
>> semantic meaning, only inverted: if 'Exclusive' is not specified on
>> 'Create', an existing file is ignored.
>>
>> Would it make sense to allow Exclusive to be evaluated on all c'tors
>> and open calls?

Any feedback on this one? :-)

Good idea IMHO, but then I think the name of this flag should be changed to one suitable for both use-cases : exclusive creation and no file existence check.

Ah, well, naming - you are opening a bottomless pit! ;-) Any proposal?

No proposal yet... I am thinking about it!

...
I throw in 'FailIfExists' ...

FailIfExists matches the first use-case (exclusive creation), the second use-case needs DoNotFailIfDoesNotExist ! ;-)

Best regards, Sylvain

...
...
...
I am still not sure about introducing an additional exception here, but that is another issue...

Maybe the right exception to be thrown is AuthenticationFailed. Then its description should be changed to something like this (page 40) :

<< An operation failed because session could not successfully be used for authentication (none of the available contexts could be used, or can not determine which context to use). >>

I think thats an excellent proposal.

Best regards,

Andre.

-- Nothing is ever easy.

Andre Merzky

30 May 30 May

4:10 p.m.

Hi Thilo, Quoting [Thilo Kielmann] (May 29 2009):

...

Date: Fri, 29 May 2009 15:15:23 +0200
From: Thilo Kielmann <kielmann@cs.vu.nl>
To: saga-rg@ogf.org
Subject: [SAGA-RG] missing(?) method reporting last modification time

Folks,

within our group we are currently delving into issues with accessing remote file systems. What strikes us is that such access is SLOW. As such, it would be very beneficial if one could find out when (and thus whether) a remote file or directory has been modified.

While returning this piece of information sounds to be "trivial", it strikes us that the SAGA spec has no such call in the name space package (where files reside).

In POSIX terms, this is the info returned by the stat system call (see: man 2 stat), with the st_mtime parameter.

In Java, files have a method lastmodified().

Both POSIX and Java report the time in milliseconds since 01/01/1970 (epoch).

Of course, it looks like nobody has ever been thinking about such a use case, but here we are!

This is exactly why it was left out: no explicit use case... The agreement back then was, IIRC, to introduce a full stat() method in version 2 if later use cases require it.

Thus, if you consider adding mtime, then please also consider adding the other time stamps. From lstat(2):

    struct timespec st_atimespec;     /* time of last access */
    struct timespec st_mtimespec;     /* time of last data modification */
    struct timespec st_ctimespec;     /* time of last status change */
    struct timespec st_birthtimespec; /* time of file creation (birth) */

Related to that discussion was, BTW, the question whether find() should support metadata criteria - at the moment, find only operates on entry names, and patterns thereof. I am not saying that this should be extended, but you may want to discuss it with respect to your use case...

...

Our feeling is that the last modification time is very essential meta data about files, so such a call should certainly be there. With our current problem certainly not being the only use case for finding out how old/new a given file or directory is...

Our favourite proposal is to add a method that returns the last modification time to both ns_entry and ns_directory as this makes sense with physical as well as with logical (replicated) files.

Probably makes sense. Not all namespace backends will be able to report timestamps I guess, but that's life.

I'd suggest keeping the method calls symmetric to file.get_size(), in terms of semantics, error conditions, etc. get_size() is the only stat info we have included in the core. Well, apart from permission flags, which are handled by the permissions interface though.

So, would something like

    get_atime ()
    get_mtime ()
    get_ctime ()
    get_btime ()

work for you folx? I realize that this differs from what Sylvain's group added to JSAGA... :-/

Hmm, sorry for letting this mail grow rather long, but please allow me to ramble on... We have been discussing the attribute interface again lately, in particular with respect to typed attributes. I'll post a more detailed proposal later next week, but for now, we think it might look similar to the task.get_result we have in C++, i.e. have type-templatized attribute setters/getters. For time attributes, this would actually allow for convenient conversions, like:

    long        ms_since_epoch = ns_entry.get_attribute <long>        ("mtime");
    struct tm   spec           = ns_entry.get_attribute <struct tm>   ("mtime");
    std::string time           = ns_entry.get_attribute <std::string> ("mtime");
    bool        check          = ns_entry.get_attribute <bool>        ("mtime");

The last one, bool, would throw because there is no sensible conversion. Time as string would return a ctime(3) formatted string, in C++.

Again, I am not saying that this is how it should be rendered (and also we are not sure if conversion is actually a good thing to have on the attrib interface), but you might want to consider using the attrib interface for stat-like information. File size would then move into the attribs as well, I guess.
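The typed-getter idea above can be sketched in plain C++ without any SAGA headers. The class `entry_attributes`, its `set_time` helper and the choice of `long long` for milliseconds are assumptions for illustration; the sketch only shows one way the conversions (and the throwing fallback for types such as bool) could behave, and it formats times in UTC rather than local time.

```cpp
#include <ctime>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an ns_entry with time attributes; not the SAGA API.
class entry_attributes {
public:
    void set_time(const std::string& key, std::time_t seconds) {
        times_[key] = seconds;
    }

    // Fallback for types without a sensible conversion (e.g. bool): throw.
    template <typename T>
    T get_attribute(const std::string& key) const {
        throw std::invalid_argument("no conversion for attribute '" + key + "'");
    }

private:
    std::time_t lookup(const std::string& key) const {
        std::map<std::string, std::time_t>::const_iterator it = times_.find(key);
        if (it == times_.end()) {
            throw std::out_of_range("no such attribute '" + key + "'");
        }
        return it->second;
    }

    std::map<std::string, std::time_t> times_;
};

// Milliseconds since the epoch (long long, to avoid overflow on 32-bit long).
template <>
long long entry_attributes::get_attribute<long long>(const std::string& key) const {
    return static_cast<long long>(lookup(key)) * 1000LL;
}

// Broken-down time in UTC.
template <>
std::tm entry_attributes::get_attribute<std::tm>(const std::string& key) const {
    const std::time_t secs = lookup(key);
    return *std::gmtime(&secs);
}

// asctime(3)-style string in UTC.
template <>
std::string entry_attributes::get_attribute<std::string>(const std::string& key) const {
    const std::time_t secs = lookup(key);
    return std::asctime(std::gmtime(&secs));
}
```

A caller would write `e.get_attribute<long long>("mtime")` or `e.get_attribute<std::string>("mtime")`; asking for `bool` reaches the fallback and throws, which mirrors the behaviour described in the mail.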

...

Any reactions/objections ???

About procedure: do you consider that an errata for GFD.90, or rather to be input to the next version? Note that typed attributes would be within the scope of the current spec, as that is left to the language bindings. However, attributes on ns_entry, and/or get_mtime etc., would need spec amendment. Best, Andre.

...

Thilo Kielmann

-- Nothing is ever easy.

Sylvain Reynaud

1 Jun 1 Jun

8:38 a.m.

Andre Merzky wrote:

...

So, would something like

get_atime () get_mtime () get_ctime () get_btime ()

work for you folx? I realize that this differs from what Sylvain's group added to JSAGA... :-/

Dear Andre,

I think this is not so different from what we added, it is rather more comprehensive. The reasons why we implemented only the "get_mtime" are:

* many (most ?) protocols provide only 1 date (last modified).
* our users were only lacking the last modification date.

Best regards, Sylvain

Andre Merzky

2 Jun 2 Jun

10:39 a.m.

Dear Sylvain, Quoting [Sylvain Reynaud] (Jun 01 2009):

...

Andre Merzky wrote:

...
So, would something like

get_atime () get_mtime () get_ctime () get_btime ()

work for you folx? I realize that this differs from what Sylvain's group added to JSAGA... :-/

Dear Andre,

I think this is not so different from what we added, it is rather more comprehensive.

Right.

...

The reasons why we implemented only the "get_mtime" are:

* many (most ?) protocols provide only 1 date (last modified).
* our users were only lacking the last modification date.

Good reasons IMHO. If everybody agrees on it, then we might indeed want to limit the change to mtime only. However, I really would hate to come back to the topic a year later to add one attribute, and another year later for the next one, etc ;-)

Let's see what the others think..

Thanks, Andre.

-- Nothing is ever easy.

Steve Fisher

11:53 a.m.

If any of the time functions are needed then we should provide them all. I presume that they are supported by all known operating systems? If not this would be a good reason to limit to the universally supported ones.

Andre Merzky

12:10 p.m.

Quoting [Steve Fisher] (Jun 02 2009):

...

If any of the time functions are needed then we should provide them all. I presume that they are supported by all known operating systems? If not this would be a good reason to limit to the universally supported ones.

It seems POSIX defines (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)

    time_t st_atime   Time of last access.
    time_t st_mtime   Time of last data modification.
    time_t st_ctime   Time of last status change.

st_birthtime/st_btime seems to be a BSD extension, AFAICS, so we may want to skip this one. It falls back to ctime though, if the system does not support it.

This is POSIX, however, and, as Sylvain said, not many middleware implementations will support the full set.

Best, Andre.
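For reference, the three POSIX fields can be read with a few lines of C++ on a POSIX system. This is only a local-filesystem sketch; the struct `entry_times` and the function `read_times` are names made up for this example. Note that `time_t` values from stat(2) count seconds since the epoch, whereas Java's `File.lastModified()` reports milliseconds.

```cpp
#include <sys/stat.h>

#include <ctime>
#include <stdexcept>
#include <string>

// Hypothetical container for the three POSIX time stamps (seconds since epoch).
struct entry_times {
    std::time_t access;         // st_atime
    std::time_t modification;   // st_mtime
    std::time_t status_change;  // st_ctime
};

// Reads the time stamps of a local path via stat(2); throws if stat fails.
entry_times read_times(const std::string& path) {
    struct stat sb;
    if (::stat(path.c_str(), &sb) != 0) {
        throw std::runtime_error("stat failed for " + path);
    }
    entry_times t;
    t.access        = sb.st_atime;
    t.modification  = sb.st_mtime;
    t.status_change = sb.st_ctime;
    return t;
}
```

A middleware backend that only knows the last modification date could fill `modification` alone, which is the situation Sylvain describes for most protocols.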

-- Nothing is ever easy.

Thilo Kielmann

9:40 p.m.

Looks like I have opened a nice can of worms ;-)))

I guess we should remind ourselves about the S of SAGA: simplicity. Also, the 80/20 rule might be worth reconsidering.

The last modification time is the one that (now two) applications are having a use case for. (This is the 20.) In the recent discussion, we had a few proposals for pushing these 20 more towards the 100...

On Tue, Jun 02, 2009 at 02:10:32PM +0200, Andre Merzky wrote:

...

It seem posix defines (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)

time_t st_atime Time of last access. time_t st_mtime Time of last data modification. time_t st_ctime Time of last status change.

st_birthtime/st_btime seems to be an BSD extension, AFAICS, so we may want to skip this one. It falls back to ctime though, if the system does not support it.

This is POSIX, however, and, as Sylvain said, not many middleware implementations will support the full set.

I may want to add that also not all languages support more than the last modification time, e.g. Java only has this one...

Time of last access and of last change to me seem like bits of information that typically systems software (not applications) would be interested in.

Having said this, I would be in favour of limiting any addition to mtime (unless the other ones won't come with complexity for the user).

About the way of inclusion I am not yet decided. The simplest way for the user is to have something like get_mtime. (And let the Java language binding possibly also have an alias lastModified (matching java.io.File).)

Using the attributes for the modification time (while having get_* for file size) seems odd to me. Also, I found this whole template-based type casting for attributes pure overkill for SAGA and certainly out of scope. (The type casting should NOT be part of SAGA, but if at all needed be part of the application itself, or of a language-level library that deals with converting types.)

So, what are we doing with this? Can we still smuggle this into the "errata" thingy???

Regards, Thilo

-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/

Steve Fisher

11:34 p.m.

2009/6/2 Thilo Kielmann <kielmann@cs.vu.nl>:

...

Looks like I have opened a nice can of worms ;-)))

I guess we should remind ourselves about the S of SAGA: simplicity. Also, the 80/20 rule might be worth reconsidering.

The last modification time is the one that (now two) applications are having a use case for. (This is the 20.) In the recent dicussion, we had a few proposals for pushing these 20 more towards the 100...

On Tue, Jun 02, 2009 at 02:10:32PM +0200, Andre Merzky wrote:

...
It seem posix defines (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)

time_t st_atime Time of last access. time_t st_mtime Time of last data modification. time_t st_ctime Time of last status change.

st_birthtime/st_btime seems to be an BSD extension, AFAICS, so we may want to skip this one. It falls back to ctime though, if the system does not support it.

This is POSIX, however, and, as Sylvain said, not many middleware implementations will support the full set.

I may want to add that also not all languages support more than the last modification time, e.g. Java only has this one...

In that case I agree that we should support at most the modification time. Presumably Java does not provide the others because they are not universally supported.

Andre Merzky

3 Jun 3 Jun

12:09 p.m.

Quoting [Thilo Kielmann] (Jun 02 2009):

...

Looks like I have opened a nice can of worms ;-)))

Yes, and I am sure you are enjoying it! ;-))

...

I guess we should remind ourselves about the S of SAGA: simplicity. Also, the 80/20 rule might be worth reconsidering.

The last modification time is the one that (now two) applications are having a use case for. (This is the 20.)

I hope you mean 80? ;-)

...

In the recent dicussion, we had a few proposals for pushing these 20 more towards the 100...

On Tue, Jun 02, 2009 at 02:10:32PM +0200, Andre Merzky wrote:

...
It seem posix defines (http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html)

time_t st_atime Time of last access. time_t st_mtime Time of last data modification. time_t st_ctime Time of last status change.

st_birthtime/st_btime seems to be an BSD extension, AFAICS, so we may want to skip this one. It falls back to ctime though, if the system does not support it.

This is POSIX, however, and, as Sylvain said, not many middleware implementations will support the full set.

I may want to add that also not all languages support more than the last modification time, e.g. Java only has this one...

That's interesting. Not relevant, but is that because Java meets the smallest common denominator here? Simple file systems such as FAT only support mtime, usually, AFAIK. Does Java have a stat call?

...

Time of last access and of last change to me seem like bits of information that typically systems software (not applications) would be interested in.

Having said this, I would be in favour of limiting any addition to mtime (unless the other ones won't come with complexity for the user).

Fine with me.

...

About the way of inclusion I am not yet decided. The simplest way for the user is to have something like get_mtime. (And let the Java language binding possibly also have an alias lastModified (matching java.io.file).)

If we limit ourself to mtime, then yes, get_mtime() may be a good way to do it.

...

Using the attributes for the modification time (while having get_* for file size) seems odd to me.

We should then move size into the attributes, too. I agree that this should be done in a consistent way. Again, the whole problem comes from our decision back then to reconsider a stat() call later - which we are doing now, basically.

...

Also, I found this whole template-based type casting for attributes pure overkill for SAGA and certainly out of scope. (The type casting should NOT be part of SAGA, but if at all needed be part of the application itself, or of a language-level library that deals with converting types.)

That is actually more a language binding issue, I think. There may be better ways to introduce typed attributes in other languages. Like, in Perl, the whole thing would be built in:

    my $value = $obj->get_attribute ($key);

would always work.

The use of the templatized accessors is motivated by the fact that we have it on task.get_result <type> () already, which was actually proposed by Ceriel, and back then you liked it a lot ;-) What other options would you see for doing typed attributes?

...

So, what are we doing with this? Can we still smuggle this into the "errata" thingy???

separate mail, to keep this one 'short'... Cheers, Andre, -- Nothing is ever easy.

Thilo Kielmann

9:42 p.m.

On Wed, Jun 03, 2009 at 02:09:25PM +0200, Andre Merzky wrote:

...

...
I may want to add that also not all languages support more than the last modification time, e.g. Java only has this one...

Thats interesting. Not relevant, but is that because Java meets the smallest common denominator here? Simple file systems such as FAT only support mtime, ususally, AFAIK.

Does Java have a stat call?

AFAIK, Java is considered to be an "application" programming language, as opposed to systems programming languages. You can find this back e.g. in the stream-based file API, where random seeks are "not done" by typical Java applications. (read/write/seek had only been added later, but "real Java programmers" (-tm) don't use it ;-) To the best of my knowledge, there is no stat either.

...

...
Using the attributes for the modification time (while having get_* for file size) seems odd to me.

We should then move size into the attributes, too. I agree that this should be done in a consistent way.

Hmm, I must admit that I am not too familiar with the proper use of attributes in SAGA. I know we have them, but I never got beyond a vague feeling that they could be useful "somewhere"... In SAGA's file (and ns_entry) classes, I can find all commonly used information via get_*() methods. Which information do files provide via attributes??? What I mean: I cannot see a single attribute in file, so why now start introducing them???

...

The use of the templetized accessors is motivated by the fact that we have it on task.get_result <type> () already, which was actually proposed by Ceriel, and back then you liked it a lot ;-)

I never (mentally) mapped task.get_result to typed attributes. Maybe some more background on them would help me getting a deeper understanding? Thilo -- Thilo Kielmann http://www.cs.vu.nl/~kielmann/

Andre Merzky

4 Jun 4 Jun

1:41 p.m.

Quoting [Thilo Kielmann] (Jun 03 2009):

...

On Wed, Jun 03, 2009 at 02:09:25PM +0200, Andre Merzky wrote:

...
...
I may want to add that also not all languages support more than the last modification time, e.g. Java only has this one...

Thats interesting. Not relevant, but is that because Java meets the smallest common denominator here? Simple file systems such as FAT only support mtime, ususally, AFAIK.

Does Java have a stat call?

AFAIK, Java is considered to be an "application" programming language, as opposed to systems programming languages. You can find this back e.g. in the stream-based file API, where random seeks are "not done" by typical Java applications. (read/write/seek had only been added later, but "real Java programmers" (-tm) don't use it ;-)

To the best of my knowledge, there is no stat either.

...
...
Using the attributes for the modification time (while having get_* for file size) seems odd to me.

We should then move size into the attributes, too. I agree that this should be done in a consistent way.

Hmm, I must admit that I am not too familiar with the proper use of attributes in SAGA. I know we have them, but I never got beyond a vague feeling that they could be useful "somewhere"...

Things which are considered 'properties' or 'meta data' (for whatever vague definition you want) of objects are mostly exposed as attributes. Usually, changing attributes does not directly change object state, nor trigger any remote operation, etc. Mostly, attributes are informational (and thus often read-only).

As you see, the above is rather vague, and involves a lot of handwaving. Nevertheless, a good rule of thumb is: if a SAGA object has a significant number of set_xxx/get_xxx methods, which just return simple data types, the attribute interface should be used. An excellent example is the job_description of course, but others are the job class (exposing memory usage, state, walltime etc. via attributes) and replicas (user defined meta data are attached to replica entries as attributes).

...

In SAGA's file (and ns_entry) classes, I can find all commonly used information via get_*() methods. Which information do file's provide via attributes??? What I mean: I cannot see a single attribute in file, so why now start introducing them???

The only getters on ns_entry are get_url(), get_path() and get_name(), the only additional one in file is get_size(). Back then, we did not think this would be enough to justify the use of the attribute interface. Now, if we add get_atime, get_mtime, get_ctime, and get_btime, then it may be worth to reconsider. If we only go for get_mtime (as seems likely), nothing changes really, IMHO.

...

...
The use of the templetized accessors is motivated by the fact that we have it on task.get_result <type> () already, which was actually proposed by Ceriel, and back then you liked it a lot ;-)

I never (mentally) mapped task.get_result to typed attributes. Maybe some more background on them would help me getting a deeper understanding?

Well, consider the following two tasks:

    saga::task t_1 = file.get_name <saga::task::ASync> ();
    saga::task t_2 = file.get_size <saga::task::ASync> ();

get_name would return a URL, get_size returns an int. Thus, you need to mark the get_result() function so that the correct type is used. We do that with templates in C++, and I thought you'd do the same in Java? Anyway, flags and different names are also possible, but more difficult to maintain, and, at least in C++, somewhat non-native:

    // method templates
    saga::url u = t_1.get_result <saga::url> ();
    int       s = t_2.get_result <int> ();

Alternatives:

    // names
    saga::url u = t_1.get_url_result ();
    int       s = t_2.get_int_result ();

    // flags
    saga::url u = t_1.get_result (saga::type::Url);
    int       s = t_2.get_result (saga::type::Int);

So, which is it in Java?

As for attributes, that would (again in C++) look very similar. At the moment, we have only strings:

    std::string bufsize_s = stream.get_attribute ("BufSize");

The typed version we intend to propose would likely look like:

    int bufsize = stream.get_attribute <int> ("BufSize");

So, it would be similar to get_result(). And also similar to task.get_object() btw, the only other templatized/typed method in the SAGA core spec.

While we are on that topic: the spec is actually silent on what happens if the types mismatch, e.g. if I try to call:

    // spec defines BufSize to be of type int!
    float bufsize = stream.get_attribute <float> ("BufSize"); // throws

In particular, we do not define an exception which semantically maps that case, apart from NoSuccess (which can always be used - that is what we do in C++ right now). Would people think that adding an 'IncorrectType' exception for that reason would make sense? It could also be resolved on language binding level, possibly. How does Java handle that?

Thanks, Andre.

-- Nothing is ever easy.
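The type-mismatch question raised above can be made concrete with a small, SAGA-independent C++ sketch. Everything here is hypothetical: `finished_task` merely stands in for a completed task holding one value, and `incorrect_type` is a made-up exception illustrating what an 'IncorrectType' error could signal. Note that the check is an exact type match, so even int versus char is rejected.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical exception; the thread only asks whether something like
// 'IncorrectType' should be added to the spec.
struct incorrect_type : std::runtime_error {
    explicit incorrect_type(const std::string& what) : std::runtime_error(what) {}
};

// Hypothetical stand-in for a finished task that holds a single result.
class finished_task {
public:
    template <typename T>
    explicit finished_task(const T& value)
        : type_(&typeid(T)), value_(std::make_shared<T>(value)) {}

    // Returns the stored result if T is exactly the stored type, else throws.
    template <typename T>
    T get_result() const {
        if (*type_ != typeid(T)) {
            throw incorrect_type(std::string("result is not of type ") +
                                 typeid(T).name());
        }
        return *static_cast<const T*>(value_.get());
    }

private:
    const std::type_info* type_;   // type the result was stored with
    std::shared_ptr<void> value_;  // type-erased result value
};
```

With `finished_task t(1024);`, the call `t.get_result<int>()` returns the value, while `t.get_result<char>()` throws `incorrect_type` instead of silently converting.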

Andre Merzky

3 Jun 3 Jun

12:09 p.m.

New subject: OGF 26 notes, errata process, and mtime...

Quoting [Thilo Kielmann] (Jun 02 2009):

...

So, what are we doing with this [mtime problem]? Can we still smuggle this into the "errata" thingy???

Well, I did not send around notes from OGF26 yet, so let me fix this right now. Apart from the usual stuff, there have been two major discussion points:

- In the sessions, we discussed approaches to a workflow package. The general feeling (among the 4 active participants :-P) has been that task dependencies are an excellent entry point. Branching and looping should also be rendered as special task types. Outside the session we got good input to the topic as well, and a number of external pointers: Legion, Unicore (AJO) and iRods all are good references, and are closer to our scheme of things than I would have thought. We are actually already implementing something like this in C++, so you can expect an explicit proposal soonish. If anybody is interested in an earlier discussion, on the list or by phone, please let me know.

- The GFSG discussed our errata procedure. It was agreed that the errata are well motivated (because mostly triggered by implementation work) and mostly backward compatible (there are two items which are not, but all implementations have them implemented the same way I think). However, GFSG wants to motivate us to finally produce our experience document for GFD.90, to get the Core API from 'proposed recommendation' state into full 'recommendation' state. It was thus proposed to delay the errata until that experimental document has been handed in, and to treat the errata as an accompanying spec update. It makes sense I guess, even if I don't like the delay myself. Anyway, I started to work on the experimental document, and will approach the individual implementation groups for input over the next weeks. Let's try to get it out by next OGF.

So, on the up side, we are still able to fiddle around with the errata for a while, and should try hard to use that time to get our implementations in sync.

To answer Thilo's question: I think mtime should go into the errata, no matter which way we decide to solve it. Semantically, it is a small change, and it is already implemented in JSAGA - which makes it an erratum IMHO.

The typed attributes should *not* be part of the errata, but rather go either into an extension package, or into v2.0, IMHO. It was a conscious decision to go for strings in v1.0, so now saying we need to 'fix' it does not make much sense. Maybe I shouldn't have brought it into the mtime thread, sorry for mixing up issues here...

Best, Andre.

-- Nothing is ever easy.

participants (6)

Andre Merzky
John Shalf
John Shalf
Steve Fisher
Sylvain Reynaud
Thilo Kielmann