
I have been looking at the Strawman SAGA API (v0.2+) with a view to developing a Java implementation of the API over multiple types of resource. Ultimately I am looking to develop SAGA implementations in Matlab and Jython as part of the GeodiseLab activity for the OMII (www.omii.ac.uk).

I have some comments about the current SAGA API as it is defined in SIDL from the Java perspective, and some questions about the SAGA approach. There are also some trivial points about the SIDL interface.

My key concern is whether the SAGA API (as defined in the SIDL interface in the strawman document) will produce 'nice' Java that will be simple to use. The problem stems from the use of the SIDL keywords 'out' and 'inout' to define method arguments that are passed by reference. This is perfectly acceptable in FORTRAN, C and C++; however, when translated to Java the result is clumsy.

In Java, arguments are (typically) passed by value. In particular, primitive types, such as 'long', and immutable types, such as 'String', are effectively passed by value. This URL describes how Java arguments are passed: http://www.yoda.arachsys.com/java/passing.html.

When the Babel tool is used to parse the following SIDL definition into Java:

  package Hello version 1.0 {
    class World {
      void getMsg(out int ppp);
    }
  }

the integer argument is placed within a mutable object that may be altered from within the 'getMsg' method. This solution may be opaque to many Java users of the API.

In most situations in the SAGA API, passing a single output argument by reference can be avoided by specifying a return argument (where it is currently void). For example:

  interface JobService {
    void submitJob (in JobDefinition jobDef, out Job job);
    ...
  }

becomes:

  interface JobService {
    Job submitJob (in JobDefinition jobDef);
    ...
  }

However, the problem is slightly more complex where multiple output arguments are returned by reference. For example:

  class File {
    void read (in long len_in, out string buffer, out long len_out);
    ...
  }

In this situation there are many alternatives to passing multiple output arguments by reference. For example: placing the output information in a single object, or, where appropriate, making the variables available as attributes of the class.

A similar problem exists in Matlab and Python, where arguments are passed by value. Incidentally, here multiple output arguments are supported by returning a vector (or tuple) of variables.

The 'pass by value' issue leads me to a more general question about the SAGA API. Is it your objective that the API is strictly consistent between languages? My concern, if this was the case, is that this may lead to a 'worst-of-all-worlds' solution rather than an API that is appropriate for each language. Certainly this issue would arise for object-oriented versus procedural languages. Andre suggested to me that the language bindings were yet to be standardised. Does the scope exist to vary the language bindings to play to the strengths of each target language?

Thanks,
Graeme Pound

Minor points
- ExceptionCategory does not have enum values specified
- SAGA.Task.Task.throw(); "throw" is a reserved word in Java
- Should TaskContainer, File and LogicalFile be classes or interfaces?
- Enums are frequently not capitalised; stylistically it would be nice to be consistent about this (from the Java POV these would be classes so should be capitalised)
- void return type is frequently not specified
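As an illustration of the single 'out' argument concern raised in this message, here is a minimal Java sketch. `IntHolder`, `World` and `HolderDemo` are hypothetical names invented for the example; they are not the classes Babel actually generates, and the value 42 is arbitrary. The sketch only contrasts a holder-style argument with a plain return value.

```java
class HolderDemo {
    public static void main(String[] args) {
        World world = new World();

        // Holder style: the caller must allocate a wrapper and read it back afterwards.
        IntHolder holder = new IntHolder();
        world.getMsg(holder);
        System.out.println("via holder: " + holder.value);

        // Return-value style: the result is simply the value of the call.
        System.out.println("via return: " + world.getMsg());
    }
}

// Hypothetical mutable wrapper standing in for a tool-generated holder type.
final class IntHolder {
    int value;
}

// Hypothetical stand-in for the Hello.World class of the SIDL snippet.
class World {
    // 'out int ppp' mapped onto a mutable holder argument.
    void getMsg(IntHolder ppp) {
        ppp.value = 42;
    }

    // The same information delivered as an ordinary return value.
    int getMsg() {
        return 42;
    }
}
```

Both calls deliver the same integer; the difference is only in how much ceremony the caller needs, which is the usability point being made.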

Hi Graeme,

thanks for the comments. For the questions, see inlined comments...

Cheers, Andre.

Quoting [Graeme Pound] (Nov 22 2005):
> I have been looking at the Strawman SAGA API (v0.2+) with a view to developing a Java implementation of the API over multiple types of resource. Ultimately I am looking to develop SAGA implementations in Matlab and Jython as part of the GeodiseLab activity for the OMII (www.omii.ac.uk).
>
> I have some comments about the current SAGA API as it is defined in SIDL from the Java perspective, and some questions about the SAGA approach. There are also some trivial points about the SIDL interface.
>
> My key concern is whether the SAGA API (as defined in the SIDL interface in the strawman document) will produce 'nice' Java that will be simple to use. The problem stems from the use of the SIDL keywords 'out' and 'inout' to define method arguments that are passed by reference. This is perfectly acceptable in FORTRAN, C and C++; however, when translated to Java the result is clumsy.
>
> In Java, arguments are (typically) passed by value. In particular, primitive types, such as 'long', and immutable types, such as 'String', are effectively passed by value. This URL describes how Java arguments are passed: http://www.yoda.arachsys.com/java/passing.html.
>
> When the Babel tool is used to parse the following SIDL definition into Java:
>
>   package Hello version 1.0 {
>     class World {
>       void getMsg(out int ppp);
>     }
>   }

We are currently fixing the strawman to avoid the 'void' signature of the calls - we would just leave:

  package Hello version 1.0 {
    class World {
      getMsg(out int ppp);
    }
  }

So, the signature in the language binding is not necessarily void.

There are two major issues with this:

a) as you mention, for multiple out parameters, the situation is less clear.

b) the asynchronous method calls are supposed to use the same signatures, but should return a task handle. That immediately raises problems for methods which have a single out param, as these are in most languages better expressed by returning that param as the return value.
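Point b) can be made concrete with a small Java sketch. Everything here is an assumption for illustration: `Job` and `JobService` are minimal stand-ins rather than SAGA types, the job definition is reduced to a plain string, and `java.util.concurrent.CompletableFuture` from the Java standard library plays the role of the task handle. It is not a proposal for the actual binding.

```java
import java.util.concurrent.CompletableFuture;

class AsyncDemo {
    public static void main(String[] args) {
        JobService service = new JobService();

        // Synchronous call: the single 'out' parameter is the return value.
        Job direct = service.submitJob("/bin/date");

        // Asynchronous call: the return slot holds the task handle,
        // so the Job has to be fetched from the handle instead.
        CompletableFuture<Job> task = service.submitJobAsync("/bin/date");
        Job viaTask = task.join();

        System.out.println(direct.executable + " / " + viaTask.executable);
    }
}

// Hypothetical, minimal stand-in for a SAGA job.
final class Job {
    final String executable;

    Job(String executable) {
        this.executable = executable;
    }
}

// Hypothetical service; the job definition is simplified to a string.
class JobService {
    Job submitJob(String executable) {
        return new Job(executable);
    }

    CompletableFuture<Job> submitJobAsync(String executable) {
        // Runs the synchronous variant on a background thread.
        return CompletableFuture.supplyAsync(() -> submitJob(executable));
    }
}
```

The sketch shows why the two variants cannot share one signature in Java: once the task handle occupies the return position, the result has to travel through the handle.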
> The integer argument is placed within a mutable object that may be altered from within the 'getMsg' method. This solution may be opaque to many Java users of the API.
>
> In most situations in the SAGA API, passing a single output argument by reference can be avoided by specifying a return argument (where it is currently void). For example:
>
>   interface JobService {
>     void submitJob (in JobDefinition jobDef, out Job job);
>     ...
>   }
>
> becomes:
>
>   interface JobService {
>     Job submitJob (in JobDefinition jobDef);
>     ...
>   }
>
> However, the problem is slightly more complex where multiple output arguments are returned by reference. For example:
>
>   class File {
>     void read (in long len_in, out string buffer, out long len_out);
>     ...
>   }
>
> In this situation there are many alternatives to passing multiple output arguments by reference. For example: placing the output information in a single object, or, where appropriate, making the variables available as attributes of the class.
>
> A similar problem exists in Matlab and Python, where arguments are passed by value. Incidentally, here multiple output arguments are supported by returning a vector (or tuple) of variables.

The Java binding should use whatever is most 'native' in Java - e.g. what is used in POSIX-like implementations, or in the standard libs/classes.
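One possible 'native' Java shape for the two-output read call quoted above is a small result object, the first alternative Graeme lists. The sketch below is an assumption, not part of any SAGA binding: `ReadResult` and `InMemoryFile` are hypothetical names, and the file is backed by a string so the example runs without real I/O.

```java
class ReadDemo {
    public static void main(String[] args) {
        InMemoryFile file = new InMemoryFile("hello world");
        ReadResult result = file.read(5);
        System.out.println(result.buffer + " (" + result.length + " characters)");
    }
}

// Bundles the two 'out' arguments of File.read into one immutable value.
final class ReadResult {
    final String buffer;
    final long length;

    ReadResult(String buffer, long length) {
        this.buffer = buffer;
        this.length = length;
    }
}

// Hypothetical in-memory stand-in for the SAGA File class.
class InMemoryFile {
    private final String content;
    private int position = 0;

    InMemoryFile(String content) {
        this.content = content;
    }

    // Reads up to lenIn characters from the current position.
    ReadResult read(long lenIn) {
        if (lenIn < 0) {
            throw new IllegalArgumentException("lenIn must not be negative");
        }
        int end = (int) Math.min((long) content.length(), position + lenIn);
        String chunk = content.substring(position, end);
        position = end;
        return new ReadResult(chunk, chunk.length());
    }
}
```

Making the result immutable keeps the call free of hidden state, at the cost of one extra small class per multi-output method.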
> The 'pass by value' issue leads me to a more general question about the SAGA API. Is it your objective that the API is strictly consistent between languages? My concern, if this was the case, is that this may lead to a 'worst-of-all-worlds' solution rather than an API that is appropriate for each language. Certainly this issue would arise for object-oriented versus procedural languages. Andre suggested to me that the language bindings were yet to be standardised. Does the scope exist to vary the language bindings to play to the strengths of each target language?

No, the language bindings are definitely NOT supposed to be strictly consistent - but as consistent as possible. E.g. method call names should be recognisable, flags should be similar/identical if possible, error codes should be the same etc. In particular the task model might look different in the various languages...

> Thanks, Graeme Pound
>
> Minor points
>
> - ExceptionCategory does not have enum values specified

Right, it's a TODO item already.

> - SAGA.Task.Task.throw(); "throw" is a reserved word in Java

Good point, probably also in other languages. We could use rethrow?
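A minimal sketch of how the suggested rename could look in Java, since a method literally named `throw` cannot be declared there. The `Task` class below is a hypothetical stand-in, not the SAGA task type; it stores an unchecked exception and raises it again on request.

```java
class RethrowDemo {
    public static void main(String[] args) {
        Task task = new Task();
        task.fail(new IllegalStateException("remote job failed"));
        try {
            task.rethrow();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for a SAGA task that may have failed.
class Task {
    // Recorded when the (simulated) asynchronous operation fails.
    private RuntimeException failure;

    void fail(RuntimeException e) {
        failure = e;
    }

    // 'void throw()' would not compile because 'throw' is a Java keyword;
    // 'rethrow' is an ordinary identifier.
    void rethrow() {
        if (failure != null) {
            throw failure;
        }
    }
}
```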
> - Should TaskContainer, File and LogicalFile be classes or interfaces?

Classes I think. Why should they be interfaces?

> - Enums are frequently not capitalised; stylistically it would be nice to be consistent about this (from the Java POV these would be classes so should be capitalised)

Right, style is inconsistent as of yet - too many authors ;-)

> - void return type is frequently not specified

It should be removed wherever it still _is_ specified - see above.

Thanks, Andre.

--
+-----------------------------------------------------------------+
| Andre Merzky                      | phon: +31 - 20 - 598 - 7759 |
| Vrije Universiteit Amsterdam (VU) | fax : +31 - 20 - 598 - 7653 |
| Dept. of Computer Science         | mail: merzky@cs.vu.nl       |
| De Boelelaan 1083a                | www: http://www.merzky.net  |
| 1081 HV Amsterdam, Netherlands    |                             |
+-----------------------------------------------------------------+

Andre,

In your reply you describe that you have been removing the return type of methods from the SIDL specification of the API. Is it the case that this is intended to introduce an ambiguity about how arguments are returned by different language bindings?

I have read the proposed charter of a SAGA working group. Could you clarify the deliverables of the research/working group for me? Is the SAGA API intended to be defined by the language bindings, or by the SIDL specification? It appears to me that ambiguities such as those described below could only be resolved by the language bindings; is this your understanding?

I am keen to develop a preliminary implementation of the current SAGA API in Java as part of the GeodiseLab activity. This will serve to educate me about the SAGA API, and will work through some of the issues I have described. This may inform the development of the API. What would be the appropriate way for me to feed this work back to the research group?

Thanks,
Graeme

Quoting [Graeme Pound] (Nov 23 2005):

> Andre,
>
> In your reply you describe that you have been removing the return type of methods from the SIDL specification of the API. Is it the case that this is intended to introduce an ambiguity about how arguments are returned by different language bindings?

Yes. In Perl for example, we might use exceptions for error handling, and use:

  my ($job, $stdin, $stdout, $stderr) = $job_server->run_job ($host, $exe);

Other languages might support multiple return values as well, and you mention the problems in Java - so prescribing the number of return values should be left to the language binding.

> I have read the proposed charter of a SAGA working group. Could you clarify the deliverables of the research/working group for me? Is the SAGA API intended to be defined by the language bindings, or by the SIDL specification? It appears to me that ambiguities such as those described below could only be resolved by the language bindings; is this your understanding?

Yes, that is my understanding. However, the language independent API spec will be most important, as it defines the _semantics_ of the API. The procedure will more or less be:

  - produce requirement document
  - produce language independent spec (beta)
  - iterate spec against middleware (compatibility)
  - produce language independent spec
  - produce language specific reference implementations
  - derive/iterate language bindings from reference implementations

> I am keen to develop a preliminary implementation of the current SAGA API in Java as part of the GeodiseLab activity. This will serve to educate me about the SAGA API, and will work through some of the issues I have described. This may inform the development of the API. What would be the appropriate way for me to feed this work back to the research group?

There are two major points:

  - if you meet semantic problems with the API, we need immediate feedback, as then the language independent spec needs discussion.

  - as you are the first writing a Java binding for SAGA, you might consider that as a reference implementation. In that case, you should keep the API absolutely free of any project specific code etc. Best see the implementation as a basis for all future implementations. For C++ for example, we intend the header files to be usable for other C++ implementations.

One more remark: we are pretty sure that, in one or two years from now, the SAGA spec will undergo major revisions. That includes firstly a widening of scope, but might secondly also affect look & feel. So any reference implementation should be easily extensible in scope, and should, if possible, be simple enough to change look and feel w/o the need for a complete rewrite.

Hope that helps, Andre.

May I largely contradict Andre:

On Wed, Nov 23, 2005 at 08:48:56AM -0600, Andre Merzky wrote:
> One more remark: we are pretty sure that, in one or two years from now, the SAGA spec will undergo major revisions. That includes firstly a widening of scope, but might secondly also affect look & feel. So any reference implementation should be easily extensible in scope, and should, if possible, be simple enough to change look and feel w/o the need for a complete rewrite.

We cannot afford major changes in the API. While scope extensions are envisioned, extensions alone must be (mostly) upward-compatible. Look and feel changes can also only affect small things.

We must take into account the fact that, in order to be useful and uptake-able by commercial parties (and even academic users), we have to provide stability of the API. Otherwise, the SAGA interface will be a toy rather than a standard.

If the message is that the API will change a lot in the future, I can only recommend that people use the GAT instead. In comparison to SAGA, the GAT has stable implementations and a lot of existing adaptors. What is the value of SAGA, if not that of a stable and standardized interface?

We have to get our act together, and we have to do so now. SAGA is not (supposed to be) an academic toy, it is meant for real. (So far my personal interpretation of the whole endeavour.)

Cheers,

Thilo
--
Thilo Kielmann http://www.cs.vu.nl/~kielmann/

I agree to some extent - I did not want to give the impression that all users need to re-code their applications for the next SAGA version.

However, I am pretty sure that e.g. the language bindings will get iterated after some experience with the reference implementations, and the scope extensions might introduce changes to the look and feel.

It will be a balancing act I guess... So it's a good thing that the standardization of the language bindings is pretty late on our time line - it gives us time to get it right...

A.

Quoting [Thilo Kielmann] (Nov 23 2005):
> Date: Wed, 23 Nov 2005 16:21:31 +0100
> From: Thilo Kielmann <kielmann@cs.vu.nl>
> To: saga-rg@ggf.org
> Subject: Re: [saga-rg] Java implementation issues

On Wed, 23 Nov 2005, Andre Merzky wrote:

> I agree to some extent - I did not want to give the impression that all users need to re-code their applications for the next SAGA version.
>
> However, I am pretty sure that e.g. the language bindings will get iterated after some experience with the reference implementations,

This will, though, be finalised in the two year implementation phase before it becomes an accepted standard. Once it becomes a 'proposed recommendation', we need to maintain backwards compatibility. I would hope, though, that with the current level of interest in producing implementations, we can have the requisite two independent, API-level interoperable, implementations much sooner and push it to be a standard well within the two year period.

> and the scope extensions might introduce changes to the look and feel.

Again, once it is a standard we need to maintain backwards compatibility.

> It will be a balancing act I guess... So it's a good thing that the standardization of the language bindings is pretty late on our time line - it gives us time to get it right...

That's the idea.

Tom

Andre,

> There are two major points:
>
> - if you meet semantic problems with the API, we need immediate feedback, as then the language independent spec needs discussion.
>
> - as you are the first writing a Java binding for SAGA, you might consider that as a reference implementation. In that case, you should keep the API absolutely free of any project specific code etc. Best see the implementation as a basis for all future implementations. For C++ for example, we intend the header files to be usable for other C++ implementations.
>
> One more remark: we are pretty sure that, in one or two years from now, the SAGA spec will undergo major revisions. That includes firstly a widening of scope, but might secondly also affect look & feel. So any reference implementation should be easily extensible in scope, and should, if possible, be simple enough to change look and feel w/o the need for a complete rewrite.

I will be developing the Java implementation to inform my understanding of the semantics. However, I will attempt to provide comments upon the language independent draft swiftly.

I envision a Java implementation that comprises two parts: an interface definition, and implementations of that interface for alternative resources. The interface definition would be suitably lightweight to provide a reference.

By developing an implementation at this stage I understand that there will be changes until the API is finalised. However, I hope that this would further Thilo's aim of speeding the delivery of a stable and usable API.

Graeme
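The two-part layout described in this message could be sketched in Java roughly as follows. All names are hypothetical (`SagaJobService`, `RecordingJobService`, the `job-N` identifiers), and the implementation only records submissions in memory instead of talking to a real resource; it is meant to show the separation, not an actual SAGA binding.

```java
import java.util.ArrayList;
import java.util.List;

class BindingDemo {
    public static void main(String[] args) {
        // Callers depend only on the interface, not on a particular resource.
        SagaJobService service = new RecordingJobService();
        System.out.println(service.submitJob("/bin/date"));
    }
}

// Part 1: a lightweight interface definition, free of project-specific code.
interface SagaJobService {
    // Returns an identifier for the submitted job.
    String submitJob(String executable);
}

// Part 2: one implementation per type of resource. This hypothetical one
// just records what was submitted and hands out sequential identifiers.
class RecordingJobService implements SagaJobService {
    private final List<String> submitted = new ArrayList<>();

    @Override
    public String submitJob(String executable) {
        submitted.add(executable);
        return "job-" + submitted.size();
    }

    List<String> submitted() {
        return submitted;
    }
}
```

Keeping the interface in its own unit is what would let other Java implementations reuse it, in the same way Andre describes for the C++ header files.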
participants (5)
- Andre Merzky
- G.E.POUND@soton.ac.uk
- Graeme Pound
- Thilo Kielmann
- Tom Goodale