
Hi, I have gone through the comments made in the public comment period and arranged them and subsequent discussions between Andre and myself into a page of HTML which is attached. For some issues - e.g. typos - the solution is obvious so I marked these as green. Other issues with a suggested resolution are marked yellow - and I would like feedback on these - i.e. do you agree or not. Others have no suggested resolution - typically where Andre and I disagree - we need input on those. I will send round the web page periodically as it is revised until it is all a delightful shade of green. I will then consult those who made the comments to make sure they are happy. Please send your comments to the list in plain text - I don't want to merge HTML files. Steve

All,
To agree on the naming convention for the (implementation-independent) specification of the Python to SAGA language binding, I am starting this thread. I am not a long-time Python programmer, so I would like some input on the following.
All names are taken from the GFD-R-P.90 document, "A Simple API for Grid Applications (SAGA)". Most naming rules are taken from the Python Style Guide: http://www.python.org/dev/peps/pep-0008/
Module name:
-saga
-Examples: saga.File, saga.Flags, saga.Object
-Rationale: Fewer keystrokes. Different class names do not conflict. (File.Flags, Namespace.Flags and Replica.Flags are the same. Job.State and Task.State are the same. Only Stream.State conflicts. This could become Stream.StreamState.) This means only one saga module, and no specific packages. (Could this be a problem in the future?)
Class name:
-Uppercase letters or CapCase convention. Abbreviations in capitals.
-Example: File(), TaskContainer(), URL(), SeekMode(), NSEntry()
Exception names:
-Same as for class names.
-Example: NotImplemented(), IncorrectURL(), AlreadyExists()
Enum types:
-GFD-R-P.90 specifies a number of enum types. (Standard) Python does not have enums, so just create classes with numbers within them.
-Example: The classes WaitMode, ObjectType, Permission, Flags, SeekMode, IOMode
    class State(object):
        new = 1
        running = 2
        done = 3
        canceled = 4
        failed = 5
Function names:
-Lowercase with words separated by an underscore.
-Rationale: I know, it seems strange, but it looks like it's Python practice in most cases. (Some Python modules like StringIO.py don't comply, but pickle.py does.)
-Example:
    Attributes.set_attribute(self, key, value)
    Context.set_defaults(self)
    Object.get_session(self)
Variable names:
-Same as function names.
-Example: State.done (see above), ObjectType().exception
Return types:
-All the methods used have a return type. Most are easily converted from the pseudo-code to Python code, but some need adapting:
    numbers        -> int, long, float
    boolean        -> True, False
    String         -> String
    array <String> -> a List/Tuple (e.g. Attributes.get_vector_attribute())
    array <byte>   -> String (Buffer.set_data())
I haven't found other cases yet, but the list is probably incomplete.
-Synchronous, Asynchronous, Task
In 3.10, page 142, GFD-R-P.90 states to use a template member method like: task.get_result <return_type> (void) to get a Sync, Async or Task as a return type. Since Python does not have templates (String templates are something else), I propose to add a jobType parameter to all the methods using this principle. jobType could be {sync=0, async=1, task=0}. jobType could default to jobType=sync, so:
    f = saga.File(someURL)
    len_out = f.read (len, buffer);
    OR
    task = f.read (len, buffer, jobType=sync)
Well, probably more to come but this is it for now.
Greetings, Paul van Zoolingen. Student Vrije Universiteit, Amsterdam, The Netherlands
=== Those who can, do. Those who can't, simulate.
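A minimal sketch of the "class with numbers" enum convention proposed in the message above, written in present-day Python syntax. The State names and values come from the message; the describe() helper is a hypothetical addition for illustration only and is not part of any SAGA binding.

```python
# Enum emulated as a plain class holding integer constants,
# following the State example in the proposal.
class State(object):
    new = 1
    running = 2
    done = 3
    canceled = 4
    failed = 5


def describe(state):
    """Hypothetical helper: map a State value back to its name."""
    for name, value in vars(State).items():
        # Skip class internals such as __module__ and __dict__.
        if not name.startswith("_") and value == state:
            return name
    return None


print(State.done)
print(describe(State.canceled))
```

Because the constants are ordinary integers, nothing stops a caller from passing an arbitrary number; the convention relies on discipline rather than type checking.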

Beste Paul, Quoting [PFA van Zoolingen] (Aug 08 2008):
All,
To agree on the naming convention for the (implementation independent) specification of the Python to SAGA language binding, I am starting this thread. I am not a long-time Python programmer so I would like some input about the following.
I am very likely much less of a Python programmer than you, so please excuse the potential ignorance of my answer! :-)
All names are taken from the GFD-R-P.90 document, "A Simple API for Grid Applications (SAGA)". Most name rules are taken from the Python Style Guide: http://www.python.org/dev/peps/pep-0008/
Module name: -saga -Examples: saga.File, saga.Flags, saga.Object -Rationale: Less keyboard strokes. Different class names do not conflict. (File.Flags, Namespace.Flags and Replica.Flags are the same. Job.State and Task.State are the same. Only Stream.State conflicts. This could become Stream.StreamState)
Fewer keystrokes sounds great :-) But is that a scalable approach? SAGA is organized into packages, with the rationale that additional packages can be defined later on. For example, the job package has the classes description, job, service, and self. We are in the process of adding a checkpoint and recovery package, which kind of extends the core job package, and has the same classes. We would then end up with saga.job_job, saga.job_service, ... saga.cpr_job, saga.cpr_service, ... the same amount of typing really. Well, you probably would leave off the prefix for the core package (saga.job, saga.service, ...), but still. Also, several classes/types would sound strange without the package as qualifier: saga.state - state of SAGA? saga.service - what does it do? saga.entry - entry of what?
This means only one saga module, and no specific packages. (Could this be a problem in the future?)
Class name: -Uppercase letters or CapCase convention. Abbreviations in capitals -Example: File(), TaskContainer(), URL(), SeekMode(), NSEntry()
Exception names: -Same as for class names. -Example: NotImplemented(), IncorrectURL(), AlreadyExists()
These are certainly things which should follow the typical language conventions...
Enum types: -GFD-R-P.90 specifies a number of enum types. (Standard) Python does not have enums. So just create classes with numbers within them. -Example: The classes WaitMode, ObjectType, Permission, Flags, SeekMode, IOMode
class State(object): new = 1 running = 2 done = 3 canceled = 4 failed = 5
Function names: -Lowercase with words separated by an underscore. - Rationale: I know, seems strange, but it looks like its Python practice, in most cases. (Some Python modules like StringIO.py don't comply, but pickle.py does.) -Example: Attributes.set_attribute(self, key, value) Context.set_defaults(self) Object.get_session(self)
same here: if that is the way to go in python...
Variable names: -Same as Function names. -Example: State.done (see above), ObjectType().exception
Return types: -All the methods used have a return type. Most are easily converted from the pseudo-code to Python code, but some need adapting: numbers -> int, long, float. boolean -> True, False, String -> String
array <String> -> a List/Tuple (I.E. Attributes.get_vector_attribute()) array <byte> -> String (Buffer.set_data())
I haven't found other cases yet, but the list is probably incomplete.
-Synchronous, Asynchronous, Task
In 3.10, page 142, GFD-R-P.90 states to use a template member method like: task.get_result <return_type> (void) to get a Sync, Async or Task as a return type. Since Python does not have templates (String templates are something else), I propose to add a jobType parameter to all the methods using this principle. jobType could be {sync=0, async=1, task=0}. jobType could default to jobType=sync, so:
f = saga.File(someURL) len_out = f.read (len, buffer); OR task = f.read (len, buffer, jobType=sync)
That is certainly an option. Just to give you some trouble, here are some others:
    len_out = f.read (len, buffer, sync); // that's your version
    task    = f.read (len, buffer, task);
    len_out = f.read_sync (len, buffer);
    task    = f.read_task (len, buffer);
    len_out = f.read (sync, len, buffer);
    task    = f.read (task, len, buffer);
    len_out = sync_f.read (len, buffer);
    task    = task_f.read (len, buffer);
And there are more... All options have pros and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running in circles, I can dig them out of the archive ;-)
Your model is nice, as it is simple. The only drawback is optional arguments:
    dir.copy (src, target, Overwrite, sync);
What happens without flags, e.g. when no Overwrite is wanted:
    dir.copy (src, target, 0, sync);
Basically, you cannot have default args anymore, and always have to specify all arguments. That's tedious for the programmer. No idea if Python provides some clever way to solve this - most other languages don't. Thus, please also consider putting the flag in the first place, or using some other qualifier (method name, etc). I can't recall at the moment which way Hartmut chose in his Python implementation, sorry.
Cheers, Andre.
Well, probably more to come but this is it for now. Greetings,
Paul van Zoolingen. Student Vrije Universiteit, Amsterdam, The Netherlands === Those who can, do. Those who can't, simulate. -- Nothing is ever easy.

Andre, Hartmut, You are right about the package names: it's better to leave them in. If a Python programmer does want to do it differently, he can always do:
    import saga
    from saga import url
    from saga.url import URL
    temp = URL()   # instead of saga.url.URL()
Class names Exception names Enum types Function names Variable names
check
Return types:
I think we should go a step further and add Python like interfaces to types as saga.attribute. saga.attribute is essentially a dictionary, and I would like to see the possibility to work with attributes the same way.
I.e.
m = saga.metric(...) attr = saga.attribute(m) print attr["type"] # print the type of the metric
or even:
print m["type"]
saga.monitoring.Metric() implements (or is a subclass of) saga.attributes.Attributes() in Python, so a metric should have all methods inherited from Attributes. It should be doable to store the attributes in a dictionary and make that easily accessible, e.g. m.dict["type"]. The inherited methods should then operate on the dictionary. m.list_attributes and m.find_attributes could return the complete dictionary or a subset of it. I will try to compile a list of classes, their methods and return values.
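The idea above, attribute methods and dictionary-style access sharing one underlying dict, could be sketched roughly as follows. This is an illustration under assumptions, not the binding's actual API: the class bodies, the constructor arguments of Metric, and the sample attribute values are all hypothetical.

```python
class Attributes(object):
    """Hypothetical base class: SAGA-style attribute methods backed by a dict."""

    def __init__(self):
        self._attributes = {}

    def set_attribute(self, key, value):
        self._attributes[key] = value

    def get_attribute(self, key):
        return self._attributes[key]

    def list_attributes(self):
        return list(self._attributes)

    # Native dictionary-style access, delegating to the same storage.
    def __getitem__(self, key):
        return self.get_attribute(key)

    def __setitem__(self, key, value):
        self.set_attribute(key, value)

    def __contains__(self, key):
        return key in self._attributes


class Metric(Attributes):
    """Hypothetical stand-in for saga.monitoring.Metric."""

    def __init__(self, name, type_):
        Attributes.__init__(self)
        self.set_attribute("name", name)
        self.set_attribute("type", type_)


m = Metric("file.size", "Int")
print(m["type"])                 # same value as m.get_attribute("type")
m["unit"] = "bytes"              # same effect as m.set_attribute("unit", "bytes")
print(sorted(m.list_attributes()))
```

Routing the bracket syntax through get_attribute and set_attribute keeps a single code path, so any checks a real binding adds to those methods would apply to both styles.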
Sync, Async, Task
That is certainly an option. Just to give you some trouble, here are some others:
len_out = f.read (len, buffer, sync); # thats your version task = f.read (len, buffer, task);
This one is also my version. In Python you can let the copy method return a task object or the integer. It's not bound to one return type.
len_out = f.read_sync (len, buffer); task = f.read_task (len, buffer);
len_out = f.read (sync, len, buffer); task = f.read (task, len, buffer);
These could also be my version if you explicitly named the parameters and did: f.read( type=sync, len=len, buffer=buffer)
len_out = sync_f.read (len, buffer); task = task_f.read (len, buffer);
And there are more... All options have pro and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running circles, I can dig them out of the archive ;-)
Do you happen to know on which mailinglist and the name of the discussion? Then I will certainly check them.
Your model is nice, as it is simple. The only drawback are optional arguments:
dir.copy (src, target, Overwrite, sync);
What happens without flags, e.g. no Overwrite is wanted:
dir.copy (src, target, 0, sync);
Basically, you cannot have default args anymore, and always have to specify all arguments. Thats tedious to the programmer.
The defaults are specified in the method, for example:
    def copy (src, target, flags=None, type=sync)
When you call the method, Python fills in the blanks. That's also why the default parameters are at the back of the parameter list. If you want to leave out the flags parameter but want to specify type:
    dir.copy (src, target, type=async)
Python will accept dir.copy(src, target, type) but then uses type as the flags parameter, which (should!) result in errors.
No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you? Hartmut, what do you have currently in your wrapper? Greetings, Paul van Zoolingen.
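The keyword-argument behaviour described in this message can be tried out with a small stand-in function. The sketch below is not SAGA code: copy() merely echoes its arguments, and the constants SYNC, ASYNC and TASK are hypothetical names (upper case partly because async has since become a reserved word in Python 3.7 and later).

```python
SYNC, ASYNC, TASK = 0, 1, 2   # hypothetical call-type constants


def copy(src, target, flags=None, type=SYNC):
    """Stand-in for dir.copy(): returns the arguments as Python bound them."""
    return {"src": src, "target": target, "flags": flags, "type": type}


# Leave out flags, but name the call type explicitly.
named = copy("a.txt", "b.txt", type=ASYNC)

# Positional third argument: Python binds it to flags, not to type.
positional = copy("a.txt", "b.txt", ASYNC)

print(named)
print(positional)
```

The second call shows the pitfall mentioned above: the value meant as a call type silently ends up in flags, so a real binding would want to validate the flags argument.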

Hi Paul, Quoting [PFA van Zoolingen] (Aug 12 2008):
I think we should go a step further and add Python like interfaces to types as saga.attribute. saga.attribute is essentially a dictionary, and I would like to see the possibility to work with attributes the same way.
I.e.
m = saga.metric(...) attr = saga.attribute(m) print attr["type"] # print the type of the metric
or even:
print m["type"]
saga.monitoring.Metric() implements (or is a subclass of) saga.attributes.Attributes() in Python so metric should have all methods inherited from Attributes. It should be doable to store the attributes in a dictionary and make that easily accessible -> m.dict["type"] The inherited methods should then operate on the dictionary. m.list_attributes and m.find_attributes could return (a subset or the complete) dictionary
Yes, agree: the native dictionary interface should be exposed for classes which have attributes.
Sync, Async, Task
That is certainly an option. Just to give you some trouble, here are some others:
len_out = f.read (len, buffer, sync); # thats your version task = f.read (len, buffer, task);
This one is also my version. In Python you can let the copy method return a task object or the integer. It's not bound to one return type.
Sorry for being imprecise: I meant both lines being your version...
len_out = f.read_sync (len, buffer); task = f.read_task (len, buffer);
len_out = f.read (sync, len, buffer); task = f.read (task, len, buffer);
These could also be my version if you explicitly named the parameters and did: f.read( type=sync, len=len, buffer=buffer)
Ah, I see - sorry, wasn't aware of the explicit parameter naming thing in python. Cute :-)
len_out = sync_f.read (len, buffer); task = task_f.read (len, buffer);
And there are more... All options have pro and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running circles, I can dig them out of the archive ;-)
Do you happen to know on which mailinglist and the name of the discussion? Then I will certainly check them.
Your model is nice, as it is simple. The only drawback are optional arguments:
dir.copy (src, target, Overwrite, sync);
What happens without flags, e.g. no Overwrite is wanted:
dir.copy (src, target, 0, sync);
Basically, you cannot have default args anymore, and always have to specify all arguments. Thats tedious to the programmer.
The defaults are specified in the method, for example: def copy (src, target, flags=None, type=sync)
When you call the method, Python fills in the blanks. That's also why the default parameters are at the back of the parameter list. If you want to leave out the flags parameter but want to specify type: dir.copy (src, target, type=async)
nice
Python will accept dir.copy(src, target, type) but then uses type as the flags parameter, which (should!) result in errors.
No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you?
Yes, it does, thanks. Cheers, Andre. -- Nothing is ever easy.

Quoting [Andre Merzky] (Aug 12 2008):
And there are more... All options have pro and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running circles, I can dig them out of the archive ;-)
Do you happen to know on which mailinglist and the name of the discussion? Then I will certainly check them.
Sorry, forgot to answer this one... Please have a look at
    http://www.ogf.org/pipermail/saga-rg/
In particular, in October/November 2005 we were discussing the task model (again). The discussion there is long, but may motivate some of our design choices. I could not find the original list of all task model options we came up with - that must date from sometime before that, and may well have been on a different mailing list...
Cheers, Andre. -- Nothing is ever easy.

Sync, Async, Task
This could also be my version if you explicitly named the parameters and did: f.read( type=sync, len=len, buffer=buffer)
Ah, I see - sorry, wasn't aware of the explicit parameter naming thing in python. Cute :-)
I don't know if cute is the right word for it. Before you know it, someone will never forgive you for calling a language feature cute :P
And there are more... All options have pro and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running circles, I can dig them out of the archive ;-)
I read it and it's indeed an 'interesting' (and long) discussion. I am not planning another round of the same discussion, but the C++ solution is not usable in Python. Your final summary of the problem (found at http://wiki.cct.lsu.edu/saga/space/Task+Models ) states four solutions. Templates (c) and internal task factories (b) are not usable. (a) and (d) are really about adding a little 'thing' to distinguish between doing a sync call and getting an async or task object. The same can be accomplished with a flag which defaults to 'sync'. It actually doesn't matter which solution is chosen. More keystrokes are always needed. Wait, I am adding another round to the discussion. :/
No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you?
Yes, it does, thanks.
I think I also saw something resembling a flag in Hartmut's Python wrapper apidoc, although I can't find it anymore. ('service' parameter?) Anyway, happy left-handers day (Aug 13th) everyone!
Paul.

No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you?
Yes, it does, thanks.
I think I also saw something resembling a flag in Hartmut's Python wrapper apidoc, although I can't find it anymore. ('service' parameter?)
We have it the other way around, currently. It's the first parameter, and we have two different overloads, i.e.:
    f.read(buffer, len)
    f.read(saga.task.Sync, buffer, len)
    f.read(saga.task.ASync, buffer, len)
    f.read(saga.task.Task, buffer, len)
but I agree that the usage of named parameters is better suited for Python and I'm ready to change this detail. This will move the method type to be the last parameter for all of the API functions:
    f.read(buffer, len, type=saga.task.Sync)
but we need to add a 4th method type saga.task.Plain to distinguish between
    bytes_read = f.read(buffer, len)
and
    task = f.read(buffer, len, type=saga.task.Sync)
But since we have a big release (V1.0) for our SAGA package planned for September, I'll probably address all these changes only after that. Currently all users of the Python bindings are using the plain API only, so nothing should change for them.
Regards Hartmut

Quoting [Hartmut Kaiser] (Aug 13 2008):
No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you?
Yes, it does, thanks.
I think I also saw something resembling a flag in Hartmut's Python wrapper apidoc, although I can't find it anymore. ('service' parameter?)
We have it the other way around, currently. It's the first parameter, and we have two different overloads, i.e.:
f.read(buffer, len) f.read(saga.task.Sync, buffer, len) f.read(saga.task.ASync, buffer, len) f.read(saga.task.Task, buffer, len)
but I agree that the usage of named parameters is better suited for Python and I'm ready to change this detail. This will move the method type to be the last parameter for all of the API functions:
f.read(buffer, len, type=saga.task.Sync)
but we need to add a 4th method type saga.task.Plain to distinguish between
bytes_read = f.read(buffer, len) and task = f.read(buffer, len, type=saga.task.Sync)
I take it that Plain would be the default then, and, also by default, not be visible? Thus you'd have:
    bytes_read = f.read (buffer, len)
    bytes_read = f.read (buffer, len, type=saga.task.Plain)
    task = f.read (buffer, len, type=saga.task.Sync)
    task = f.read (buffer, len, type=saga.task.Async)
    task = f.read (buffer, len, type=saga.task.Task)
with the first two being the same call (once with default parameter, once explicit)?
Thanks, Andre.
But since we have a big release (V1.0) for our SAGA package planned for September, I'll probably address all these changes after that only. Currently all users of the Python bindings are using the plain API only, so nothing should change for them.
Regards Hartmut
-- Nothing is ever easy.

I take that Plain would be the default then, and, also by default, not be visible? Thus you'd have:
bytes_read = f.read (buffer, len) bytes_read = f.read (buffer, len, type=saga.task.Plain) task = f.read (buffer, len, type=saga.task.Sync) task = f.read (buffer, len, type=saga.task.Async) task = f.read (buffer, len, type=saga.task.Task)
with the first two being the same call (once with default parameter, once explicit)?
Exactly. Regards Hartmut
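The agreed call shape, a trailing type keyword that defaults to Plain, could look roughly like the sketch below. It is an illustration under stated assumptions rather than the wrapper's code: File, Task and the four constants are hypothetical stand-ins, read() takes only a length instead of (buffer, len), and no background execution is modelled, so an Async or Task result simply runs when its result is first requested.

```python
PLAIN, SYNC, ASYNC, TASK = "Plain", "Sync", "Async", "Task"  # hypothetical


class Task(object):
    """Hypothetical task handle wrapping a deferred operation."""

    def __init__(self, operation):
        self._operation = operation
        self._done = False
        self._result = None

    def run(self):
        if not self._done:
            self._result = self._operation()
            self._done = True

    def get_result(self):
        self.run()
        return self._result


class File(object):
    """Hypothetical in-memory file used only to exercise the call shapes."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def _do_read(self, length):
        chunk = self._data[self._pos:self._pos + length]
        self._pos += len(chunk)
        return chunk

    def read(self, length, type=PLAIN):
        if type == PLAIN:
            return self._do_read(length)      # plain call: return the data
        task = Task(lambda: self._do_read(length))
        if type == SYNC:
            task.run()                        # Sync: task is already finished
        return task                           # Async/Task: deferred here


f = File(b"hello world")
data = f.read(5)                  # same as f.read(5, type=PLAIN)
task = f.read(6, type=SYNC)
print(data, task.get_result())
```

Existing callers that use only the plain form keep working unchanged, which matches the point that current users of the bindings would not be affected.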

Quoting [Hartmut Kaiser] (Aug 13 2008):
I take that Plain would be the default then, and, also by default, not be visible? Thus you'd have:
bytes_read = f.read (buffer, len) bytes_read = f.read (buffer, len, type=saga.task.Plain) task = f.read (buffer, len, type=saga.task.Sync) task = f.read (buffer, len, type=saga.task.Async) task = f.read (buffer, len, type=saga.task.Task)
with the first two being the same call (once with default parameter, once explicit)?
Cool :-)
Exactly. Regards Hartmut
-- Nothing is ever easy.

Hi,
Sorry to be late commenting. A Python module is a file and the package structure corresponds to a directory tree - so clearly we need multiple modules arranged in packages.
I would also suggest that enum-like variables be entirely in capitals.
I don't find that:
    bytes_read = f.read (buffer, len)
is very pythonesque. Surely it should be
    buffer = f.read()
and len(buffer) will tell you what you have got.
Is this a complete Python implementation or is it a wrapper around C++?
Steve

Quoting [Fisher, SM (Steve)] (Aug 14 2008):
Hi,
Sorry to be late commenting. A python module is a file and the package structure corresponds to a directory tree - so clearly we need multiple modules arranged in packages.
I would also suggest that enum like variables be entirely in capitals
I don't find that: bytes_read = f.read (buffer, len) is very pythonesque. Surely it should be buffer = f.read() and len(buffer) will tell you what you have got.
But how can you then specify that you want to read 20 bytes, instead of the whole 20GB file?
Is this a complete Python implementation or is it a wrapper around C++?
Our implementation wraps around C++. If I am not mistaken, Paul's implementation will wrap around Java (using Jython)? Cheers, Andre.
Steve
-- Nothing is ever easy.

Hi,
I would also suggest that enum like variables be entirely in capitals
By coincidence I was just discussing with Andre that in Permission, None is a Python keyword, and so is exec. So you have a point.
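A small sketch of the all-capitals suggestion applied to Permission. Upper-case names sidestep the clash with None (which cannot be used as an attribute name) and with exec (a statement keyword in the Python 2 of the time). The member names follow the thread; the numeric values are illustrative bit flags chosen for this example and are not quoted from GFD-R-P.90.

```python
import keyword


class Permission(object):
    # Upper-case constants avoid reserved words such as None.
    NONE = 0
    QUERY = 1
    READ = 2
    WRITE = 4
    EXEC = 8
    OWNER = 16
    ALL = 31


# Flags can be combined and tested with bitwise operators.
mode = Permission.READ | Permission.WRITE
can_write = (mode & Permission.WRITE) != 0
can_exec = (mode & Permission.EXEC) != 0

print(keyword.iskeyword("None"), can_write, can_exec)
```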
Is this a complete Python implementation or is it a wrapper around C++?
Our implementation wraps around C++. If I am not mistaken, Paul's implementation will wrap around Java (using Jython)?
Well, I am making a specification (which we all can use) and later on adding the Jython code to make it interact with the Java reference implementation.
<from my readme> ....concerning the specification of the Python-SAGA language binding. The result will be a Python specification which is independent of the SAGA reference implementation (Cpp or Java). Later on a Java-specific backend will be added with the help of Jython.. </end>
A user program imports saga.py. In saga.py an environment variable is read (SAGA_IMPLEMENTATION_NAME, final name not yet decided), and saga.py uses this name to import the various implementation-specific Python modules. In my case SAGA_IMPLEMENTATION_NAME is defined as sagajavaLB (still no final name), and in module sagajavaLB the Python files with the Java-specific code are imported.
By using 'from <name> import <class>' each time, the user can use the classes like saga.file.File() instead of saga.sagajavaLB.sagaFile.File(). Using this, the user could switch language binding implementation (python-to-java or python-to-cpp) by changing the environment variable.
In the cpp case, a sagacppLB.py could be added, which imports (maybe even maps functions to existing functions) and calls the already defined Python wrapper.
Paul
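The environment-variable mechanism described above might be sketched like this. It is a guess at the shape, not Paul's code: load_implementation() and the fabricated backend modules are hypothetical, and the two module names are simply the provisional ones mentioned in the message. The stand-in modules are registered in sys.modules only so that the example runs without any real SAGA backend installed.

```python
import importlib
import os
import sys
import types


def _make_backend(module_name, label):
    """Fabricate a stand-in backend module exposing a File class."""
    module = types.ModuleType(module_name)

    class File(object):
        backend = label

    module.File = File
    sys.modules[module_name] = module


_make_backend("sagajavaLB", "java")
_make_backend("sagacppLB", "cpp")


def load_implementation(default="sagajavaLB"):
    """Import the backend named by SAGA_IMPLEMENTATION_NAME."""
    name = os.environ.get("SAGA_IMPLEMENTATION_NAME", default)
    return importlib.import_module(name)


os.environ["SAGA_IMPLEMENTATION_NAME"] = "sagacppLB"
impl = load_implementation()
File = impl.File     # what the front-end module would re-export
print(File.backend)
```

User code refers only to the re-exported File, so switching backends means changing the environment variable rather than editing imports.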

<from my readme> ....concerning the specification of the Python-SAGA language binding. The result will be a Python specification which is independent of the SAGA reference implementation
Good!
A user program imports saga.py. In saga.py an environment variable is extracted (SAGA_IMPLEMENTATION_NAME, not decided on final name), and saga.py uses this name to import the various implementation specific python modules. In my case the SAGA_IMPLEMENTATION_NAME is defined as sagajavaLB (still no final name) and in module sagajavaLB, the python files with the java specific code are imported.
By using 'from <name> import <class>' each time, the user can use the classes like saga.file.File() instead of saga.sagajavaLB.sagaFile.File(). Using this, the user could switch language binding implementation (python-to-java or python-to-cpp) by changing the environment variable.
In the cpp case, a sagacppLB.py could be added, which imports, (maybe even maps functions to existing functions), and calls the already defined python wrapper.
This seems an unnecessary complication - why not just change the PYTHONPATH to change implementation? I like the way the email package is structured into modules. You import what you need and you don't have strange things going on behind your back - potentially polluting your namespace.
Steve

On Fri, 15 Aug 2008, Fisher, SM (Steve) wrote:
I don't find that: bytes_read = f.read (buffer, len) is very pythonesque. Surely it should be buffer = f.read() and len(buffer) will tell you what you have got.
But how can you then specify that you want to read 20 bytes, instead of the whole 20GB file?
buffer = f.read(20)
I feel that the proposed calling sequence looks too C like
But how is the distinction made between saga-implementation managed buffers and application managed buffers? If buffer is a saga managed one and you did a f.read(20), where does the data go?
This seems an unnecessary complication - why not just change the PYTHONPATH to change implementation. I like the way the email package is structured into modules. You import what you need and you don't have strange things going on behind your back - potentially polluting your namespace.
Well, I use the environment variable for the name of the library, so multiple python-saga language bindings can exist in the same path. And for some reason Jython does not use the PYTHONPATH variable. It uses a -Dpython.path flag which is given to the JVM. A separate variable is usable by everyone.
There is not much magic going on. It cuts the implementation name from saga.<imp_name>.package.class(), so you don't have to rewrite your Python program if you use a different implementation.
What do you mean by polluting the namespace? All the saga classes might (well, probably are going to) be defined and loaded, but they all stay behind saga. I don't expect any other saga library to be loaded.
Paul.

-----Original Message----- From: PFA van Zoolingen [mailto:pzn400@few.vu.nl] Sent: 15 August 2008 11:08 To: Fisher, SM (Steve) Cc: Andre Merzky; SAGA RG Subject: RE: [SAGA-RG] SAGA Python language binding naming.
On Fri, 15 Aug 2008, Fisher, SM (Steve) wrote:
I don't find that: bytes_read = f.read (buffer, len) is very pythonesque. Surely it should be buffer = f.read() and len(buffer) will tell you what you have got.
But how can you then specify that you want to read 20 bytes, instead of the whole 20GB file?
buffer = f.read(20)
I feel that the proposed calling sequence looks too C like
But how is the distinction made between saga-implementation managed buffers and application managed buffers?
I don't think there is any need for a saga managed buffer. The call f.read(20) returns a string of maximum length 20. In other words "f" behaves just like a normal python file object. I think it is important to make python libraries work like python rather than C. Also bear in mind that there may be places in the API where it is useful to take advantage of the ability to return a tuple rather than a single object.
If buffer is a saga managed one and you did a f.read(20), where does the data go?
It goes into a newly allocated python string.
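For comparison, this is how a standard Python file-like object behaves, which is the model being argued for here. The sketch uses io.BytesIO from the standard library as a stand-in for a SAGA file; it does not show any SAGA call.

```python
import io

# Stand-in for a 100-byte remote file.
f = io.BytesIO(b"0123456789" * 10)

buffer = f.read(20)   # at most 20 bytes, returned as a newly created object
rest = f.read()       # no size given: everything that is left
at_end = f.read(20)   # at end of file: an empty result, not an error

print(len(buffer), len(rest), len(at_end))
```

The caller never allocates or passes in a buffer; the size of the returned object says how much was actually read.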
This seems an unnecessary complication - why not just change the PYTHONPATH to change implementation. I like the way the email package is structured into modules. You import what you need and you don't have strange things going on behind your back - potentially polluting your namespace.
Well, I use the environment variable for the name of the library. So multiple python-saga language bindings can exist in the same path. And for some reason jython does not use the PYTHONPATH variable. It uses a -Dpython.path flag which is given to the jvm. A separate variable is usable by everyone.
The Jython behaviour is to be expected, as JVMs don't like to access environment variables. Whether you are using compiled Python or Jython, you have to tell it where to load your modules from. This can be modified in the code of course - but it is quite normal to set the path externally before running.
There is not much magic going on. It cuts the implementation name from saga.<imp_name>.package.class(), so you don't have to rewrite your Python program if you use a different implementation.
What do you mean with polluting the name space? All the saga classes might (well, probably are going to) be defined and loaded, but they all stay behind saga. I don't expect any other saga library to be loaded.
I get the feeling that implementation ideas are intruding into the API design. I would really like to see the description of the API for one small package and an example or two of its use - it makes things much clearer for me. It seems that I am a lone voice of dissent. Steve

Quoting [Fisher, SM (Steve)] (Aug 15 2008):
It seems that I am a lone voice of dissent.
You never know if the silent lurkers on the list, like me, agree, or disagree, or if they are ignorant of the topic at hand... ;-) Unless they post disagreement, you can safely assume one of the other two options: http://xkcd.com/386/ :-) Cheers, Andre. -- Nothing is ever easy.

Steve,
Quoting [Fisher, SM (Steve)] (Aug 15 2008):
It seems that I am a lone voice of dissent.
No, you're not. I'm just too tight on time to have a free hour to think everything through. My initial plan was to have a look at Paul's code before replying, because I don't feel very comfortable with the way he has been setting things up so far. But I'm not able to give a formal yay or nay (and my rationale) at this point. My feeling is that Paul is trying to build something like an abstract interface in Python, allowing the use of different implementations. But in my understanding this is not really Pythonesque; it smells more like Java to me (which is no surprise, as he is working in a Java environment at VU). I'll try to gather my thoughts over the weekend, ok?
You never know if the silent lurkers on the list, like me, agree, or disagree, or if they are ignorant to the topic at hand... ;-) Unless they post disagreement, you can safely assume one of the other two options: http://xkcd.com/386/ :-)
This one is really funny! Regards Hartmut

Hartmut, indeed, the idea is to have an abstract interface, independent of any implementation. This has nothing to do with Java. This is (going to be) a language binding for SAGA, and by design NOT a direct wrapper to any implementation. Regards, Thilo
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/

On Aug 15, 2008, at 8:48 AM, Andre Merzky wrote:
Quoting [Fisher, SM (Steve)] (Aug 15 2008):
It seems that I am a lone voice of dissent.
You never know if the silent lurkers on the list, like me, agree, or disagree, or if they are ignorant to the topic at hand... ;-) Unless they post disagreement, you can safely assume one of the other two options: http://xkcd.com/386/ :-)
Another lurker who hasn't had time to respond, but I've agreed with all you have been saying Steve. --keith
Keith R. Jackson, Lawrence Berkeley National Lab, MS: 50B-2239
email: KRJackson@lbl.gov phone: 510-486-4401 url: http://www-itg.lbl.gov/~kjackson/

On Fri, 2008-08-15 at 17:48 +0200, Andre Merzky wrote:
Quoting [Fisher, SM (Steve)] (Aug 15 2008):
It seems that I am a lone voice of dissent.
You never know if the silent lurkers on the list, like me, agree, or disagree, or if they are ignorant to the topic at hand... ;-) Unless they post disagreement, you can safely assume one of the other two options: http://xkcd.com/386/ :-)
No, you can't! Oh, wait...

Quoting [Mihael Hategan] (Aug 19 2008):
On Fri, 2008-08-15 at 17:48 +0200, Andre Merzky wrote:
Quoting [Fisher, SM (Steve)] (Aug 15 2008):
It seems that I am a lone voice of dissent.
You never know if the silent lurkers on the list, like me, agree, or disagree, or if they are ignorant to the topic at hand... ;-) Unless they post disagreement, you can safely assume one of the other two options: http://xkcd.com/386/ :-)
No, you can't!
Oh, wait...
LOL -- Nothing is ever easy.

Lurkers and lone voices, I moved the source files into separate packages and modules, documented them, and partially marked up the files with epydoc. The unfinished version of the (implementation independent) Python SAGA API can be found at: https://svn.cct.lsu.edu/repos/saga-projects/language-bindings/pysaga/apidoc.... Return values, types and optional parameters are not yet clear for all methods, but I am working on it. Although last time there was a discussion that buffers were "not pythonesque", they must be added since they are part of the specification. Of course,
>>> string = file.read(10)
will be part of pysaga, but
>>> file.read(10, buffer)
will be too.
Which brings me to the following: what is frequently used in Python to represent a mutable buffer? I have seen immutable strings being used, but also lists and arrays of chars. Python-3000 will have mutable buffer and bytes, but we are not there yet. I also can't seem to figure out how it is done in the C++ Python wrapper. If anyone has an idea about the buffers, don't be a silent lurker :) Paul.
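One possible answer to the buffer question, offered only as a sketch: bytearray (available in Python 2.6 and later) is a mutable byte sequence, and file-like objects from the io module can fill one in place through readinto(). The helper name read_into and the use of io.BytesIO as a stand-in for a SAGA file are assumptions for illustration, not part of pysaga.

```python
import io


def read_into(stream, length, buffer):
    """Read up to `length` bytes into a caller-supplied bytearray.

    Returns the number of bytes actually read, like the
    bytes_read = f.read(buffer, len) form in the specification.
    """
    view = memoryview(buffer)[:length]  # writable window onto the buffer
    return stream.readinto(view)


# io.BytesIO stands in for a SAGA file here.
stream = io.BytesIO(b"hello grid world")

data = stream.read(5)   # plain Python style: returns a new immutable object

buf = bytearray(16)     # application-managed, mutable buffer
n = read_into(stream, 10, buf)
```

Both calling styles can live side by side: the first hands back a fresh object, the second reuses memory owned by the caller and reports how much of it was filled.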

-----Original Message----- From: Andre Merzky [mailto:andre@merzky.net] Sent: 14 August 2008 19:26 To: Fisher, SM (Steve) Cc: Andre Merzky; Hartmut Kaiser; SAGA RG Subject: Re: [SAGA-RG] SAGA Python language binding naming.
Quoting [Fisher, SM (Steve)] (Aug 14 2008):
Hi,
Sorry to be late commenting. A python module is a file and the package structure corresponds to a directory tree - so clearly we need multiple modules arranged in packages.
I would also suggest that enum like variables be entirely in capitals
I don't find that: bytes_read = f.read (buffer, len) is very pythonesque. Surely it should be buffer = f.read() and len(buffer) will tell you what you have got.
But how can you then specify that you want to read 20 bytes, instead of the whole 20GB file?
buffer = f.read(20) I feel that the proposed calling sequence looks too C like
Is this a complete Python implementation or is it a wrapper around C++?
Our implementation wraps around C++. If I am not mistaken, Paul's implementation will wrap around Java (using Jython)?
Cheers, Andre.
Steve
-- Nothing is ever easy.

Quoting [PFA van Zoolingen] (Aug 13 2008):
Sync, Async, Task
This could also be my version if you explicitly named the parameters and did: f.read( type=sync, len=len, buffer=buffer)
Ah, I see - sorry, wasn't aware of the explicit parameter naming thing in python. Cute :-)
I don't know if cute is the right word for it. Before you know it, someone will never forgive you for calling a language feature cute :P
:P
And there are more... All options have pro and cons. We had that discussion for a long time as we designed the task model. If you are interested in gazillions of mails running circles, I can dig them out of the archive ;-)
I read it and it's indeed an 'interesting' (and long) discussion. I am not planning another round of the same discussion, but the C++ solution is not usable in Python. Your final summary of the problem (found at http://wiki.cct.lsu.edu/saga/space/Task+Models ) states four solutions. Templates (c) and internal task factories (b) are not usable. (a) and (d) are really about adding a little 'thing' to distinguish between doing a sync call and getting an async or task object. The same can be accomplished with a flag that defaults to 'sync'. It actually doesn't matter which solution is chosen. More keystrokes are always needed.
Wait, I am adding another round to discussion. :/
Good points as far as I am concerned - thanks. Andre.
No idea if python provides some clever way to solve this - most other languages don't. Thus, please do also consider to put the flag in the first place, or to use some other qualifiers (method name, etc). Can't recall at the moment what way Hartmut chose in his Python implementation, sorry.
If it is an optional parameter with a default specified, it has to go in the back of the parameter list. I don't know if this clarifies some issues for you?
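A small sketch of the flag idea: an optional keyword parameter with a default sits at the end of the parameter list, so plain synchronous calls need no extra keystrokes. The File and Task classes below are simplified stand-ins written for this example (the "async" case simply runs the call to completion at once), and the parameter name type is taken from the f.read(type=sync, ...) form quoted above; none of this is the actual pysaga API.

```python
import io

SYNC, ASYNC, TASK = "sync", "async", "task"


class Task(object):
    """Minimal stand-in for a SAGA task: wraps a deferred call."""

    def __init__(self, call):
        self._call = call
        self.state = "new"
        self.result = None

    def run(self):
        self.result = self._call()
        self.state = "done"
        return self


class File(object):
    def __init__(self, data):
        self._stream = io.BytesIO(data)

    # The flag has a default, so it must come after the required
    # parameters. Note that the name `type` shadows the builtin
    # inside this method.
    def read(self, length=-1, type=SYNC):
        call = lambda: self._stream.read(length)
        if type == SYNC:
            return call()
        task = Task(call)
        # ASYNC: hand back a task that has already been started.
        # TASK: hand back a task still in state "new".
        return task.run() if type == ASYNC else task


f = File(b"0123456789")
first = f.read(4)                    # synchronous, returns the data directly
task = f.read(length=3, type=TASK)   # nothing has been read yet
```

Calling task.run() later performs the deferred read; callers who never need tasks can ignore the flag entirely.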
Yes, it does, thanks.
I think I also saw something resembling a flag in Hartmut's Python wrapper apidoc, although I can't find it anymore. ('service' parameter?)
Anyway, happy left-handers day (aug 13th) everyone! Paul.
-- Nothing is ever easy.

Hi Steve, some comments to your comments :-)
1 (UC in intro) Yes, it may make sense to remove the long use case quotes and to replace them with summaries. Not necessarily to save space, but to have the language and terminology uniform.
3 (data filter) Your proposal to simply suggest using GLUE-based data filter attributes sounds great to me: no need to reinvent anything here. And no need to come up with any 'complete' list of attributes either, which would probably be impossible anyway...
11 (SQL portability) The proposed solution, to limit comparing of strings, sounds good to me. Not being very SQL literate: are we losing much here?
Quoting [Fisher, SM (Steve)] (Aug 06 2008):
Date: Wed, 6 Aug 2008 13:35:35 +0100 From: "Fisher, SM (Steve)" <S.M.Fisher@rl.ac.uk> To: "SAGA RG" <saga-rg@ogf.org> Subject: [SAGA-RG] Service Discovery revisions
Hi,
I have gone through the comments made in the public comment period and arranged them and subsequent discussions between Andre and myself into a page of HTML which is attached.
For some issues - e.g. typos - the solution is obvious so I marked these as green. Other issues with a suggested resolution are marked yellow - and I would like feedback on these - i.e. do you agree or not. Others have no suggested resolution - typically where Andre and I disagree - we need input on those.
I will send round the web page periodically as it is revised until it is all a delightful shade of green. I will then consult those who made the comments to make sure they are happy.
Please send your comments to the list in plain text - I don't want to merge HTML files.
Steve
Content-Description: SD.html
# | Issue | Author | Text | Discussion | Resolution
1 | Use Cases in the Introduction | lfield
I would suggest removing the use cases in the introduction as they do not really add to the document and make it more confusing.
I think use cases in the intro are good, as to motivate the package, and give some context for its usage. - Andre
We'd like a diagram showing how the API is invoked/used, ideally with a clear worked example, for the novices amongst us. Consequently we would support the use of actual use-cases in this specification and would resist any movement to get rid of them from the introduction, as suggested in another comment. - UWE
Laurence discussed this with me before making the original comment. He had read it quickly and had seen the quotes as part of the text. The difficulty is that the wording of the use cases is confusing (for example it talks about components rather than services). I would like to remove the quoted text but to summarise the use case in a few words of my own. The reference to the use case document will of course be retained. - Steve
I don't think we need a picture - but I can add a few more words with the examples at the end of the document. - Steve
Replace the quoted text with a summary. Add a few more words with the examples at the end of the document.
2 | vo_filter | sreynaud
Filtering by VO only is too restrictive compared to the security mechanisms supported by SAGA implementations. It can not be used with non-grid security contexts and it can not be used with non-VO attributes of grid security contexts (e.g. DN of proxy, VOMS role or group). I would suggest replacing 'VOFilter' with something like 'AuthzFilter'. I would also suggest using the same attribute names as for context attributes (e.g. 'UserVO', 'UserID') for consistency with the SAGA Core specification.
Yes - Andre
No - I think the abstract concept of a VO is bigger than VOMS. For a non-grid deployment the "service provider" can publish the name of a group of people that might use the service. I could change the attribute name from VO to UserVO - except that this might imply that it is the VO of the user making the query, which is not the case - Steve
3 | data_filter | sreynaud
The job management interfaces of the SAGA Core specification can be used for uniform access to grid infrastructures because it defines both common methods and common attribute names for job description. We need something equivalent for service discovery; it would help a lot if the specification could define a set of common service data attribute names (e.g. a subset of GLUE property names) per service type (e.g. 'job_service.RunningJobs').
Yes - Andre
I quite like the idea - perhaps it could go in an appendix as suggested attribute names. However I am not competent to choose this set of names and I fear it could cause long delays, so I would prefer not to do it. - Steve
4 | service_filter | sreynaud
A paragraph explaining why different service type names are defined for 'file' and for 'directory' is needed. Same comment for 'logical-file' and 'logical-directory'.
Yes - Andre
We could say that the File service is implemented by the file and directory classes and similarly for 'logical-file' and 'logical-directory'. - Steve
On page 10 say that the File service is implemented by the file and directory classes and similarly for 'logical-file' and 'logical-directory'
5 | discoverer | sreynaud
A SAGA implementation may support several discovery service end-points (with potentially different implementations) to enable simultaneous access to several grid infrastructures that do not interoperate. Hence I would suggest adding an argument of type URL to the constructor of class 'discoverer'. This argument could be optional.
Yes - Andre
Add an optional argument when creating a discoverer to guide the implementation in locating the underlying information system.
6 | discoverer not available | UWE
What happens if a discovery service instance being queried is not available? Can a user contact another instance? Or which fault tolerant mechanism should be considered to access services which are not listed on a particular information system?
That is up to the SAGA implementation. One implementation may just fail, another may fall back to another service, a third one may collect information from a set of available services (probably preferred). - Andre
I think this is also addressed by the issue above from "sreynaud"
Add an explanation to the document
7 | Access Control | UWE
Access control mechanism is not discussed. How can a user set access privileges on the discovery service contents? Different users have different privileges and there should be a mechanism where users can get the information based on their access rights.
The discoverer is created in a saga::session, thus has a set of saga::context's available. Only with these credentials can it perform queries. Also, these credentials identify the user to the backend, thus potentially allowing it to perform server side authorization on that ID. - Andre
Add an explanation to the document
8 | Stale services | UWE
Stale services can appear in user requests and are not discussed in the spec. Grid and distributed systems are dynamic. Services may come and go at any time. There should be a mechanism to identify stale services, or the services which are down due to any reason.
You can't avoid stale services: even if the API confirms a service is not stale, it might be on the next call, and vice versa. The SD API can only provide endpoints, but not guarantee availability, nor completeness. - Andre
Add an explanation to the document
9 | Time stamping | UWE
Time stamping of services is not mentioned. How may a user know when a particular service was registered, or query certain values for a particular time instance/interval?
Hmm, I wonder if there is a use case for that... Otherwise, this can very well be part of the free form service_data. - Andre
If the underlying info system is using GLUE II then everything has a creation time and a lifetime. The implementation may choose not to return "expired" services.
I would like to note that it is the responsibility of the implementation to apply any algorithm it chooses to return valid services - this might include ignoring services that have not refreshed their registration recently.
10 | Legacy services | UWE
What to do if the published services are not standard webservices, as is the case with most Grid Services? I do not think this API will be able to access legacy Grid applications or can identify non standard services.
The data model is not tied to web services, nor is the API. I don't see any reason why that should not work for non-web services. - Andre
I propose to say nothing
11 | SQL variations | UWE
SQL implementations are inconsistent and, usually, incompatible between vendors. In particular date and time syntax, string concatenation, nulls, and comparison case sensitivity often vary from vendor to vendor. How can application users be forced to use a single SQL syntax for all the implementations of the discovery service?
The document refers to SQL92 syntax, which is a standard. Any deviation from this standard, e.g. due to vendor specific backend SQL, needs either to be translated by the SAGA implementation (preferred), or to be well motivated and documented. Indeed, different SQL versions would break portability, and IMHO compliance with the document. - Andre
Data are all strings except that page 11 says: If values are specified as numeric values and not in single quotes the service data will be converted from string to numeric for comparison. So there is no problem with dates and times. Case sensitivity is normally in the hands of the user (or in this case the service discovery implementor) - at least it is for MySQL and HSQLDB. However there is a problem with comparison of strings. I suggest that we say that you cannot use > and < on strings. - Steve
Forbid comparison of Strings other than equality, inequality and LIKE.
12 | | UWE
Not clear whether the attributes of the discovered service bring the middleware information too.
Don't understand the point - I will go back and ask them - Steve
13 | Adaptor loading | UWE
Can we automatically load the adaptor, once the discovered service is selected?
That is up to the implementation. Not all implementations may be adaptor based. - Andre
I propose to say nothing
14 | Punctuation | UWE
Punctuation is poor. Authors should address the comma and the semi-colon. They aid readability.
I will get someone who is "good at punctuation" to give it a read through at the end.
15 | Typos | UWE
Typos:
'Intendeded' on page 1
'or a broker' should be 'or by using a broker' on page 4
'as they as' should be 'as they are' on page 6
'perfomed' should be 'performed' on page 11
These will be corrected
-- Nothing is ever easy.

-----Original Message----- From: Andre Merzky [mailto:andre@merzky.net] Sent: 08 August 2008 22:05 To: Fisher, SM (Steve) Cc: SAGA RG Subject: Re: [SAGA-RG] Service Discovery revisions
Hi Steve,
some comments to your comments :-)
1 (UC in intro)
Yes, it may make sense to remove the long use case quotes and to replace them with summaries. Not necessarily to save space, but to have the language and terminology uniform.
3 (data filter)
Your proposal to simply suggest using GLUE-based data filter attributes sounds great to me: no need to reinvent anything here. And no need to come up with any 'complete' list of attributes either, which would probably be impossible anyway...
11 (SQL portability)
The proposed solution, to limit comparing of strings, sounds good to me. Not being very SQL literate: are we losing much here?
I don't think so. I cannot see much use for selection based on string comparisons except for equality. We can always put it back later if it is really necessary *and* we find how to do it.
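To make the proposed restriction concrete, here is a deliberately naive sketch of a check that rejects ordering comparisons (<, >, <=, >=) against single-quoted string literals while still allowing =, <>, LIKE and numeric comparisons. The function name check_filter and the sample filter strings are made up for this example. It does not parse SQL, so a literal that itself contains comparison characters can be misjudged.

```python
import re

# An ordering operator: <, >, <= or >=. The lookarounds keep "<>"
# (inequality) from being treated as "<" or ">".
_ORDERING = r"(?:<=|>=|<(?!>)|(?<!<)>)"

# Ordering operator directly before or after a single-quoted literal.
_BAD = re.compile(r"%s\s*'|'\s*%s" % (_ORDERING, _ORDERING))


def check_filter(expr):
    """Return expr unchanged, or raise ValueError if a string literal
    is compared with an ordering operator."""
    if _BAD.search(expr):
        raise ValueError("strings may only be compared with =, <> or LIKE: %r" % expr)
    return expr


# Allowed: equality, inequality and LIKE on strings, ordering on numbers.
ok_1 = check_filter("type = 'file' AND name LIKE 'a%'")
ok_2 = check_filter("site <> 'ral' AND load < 5")
```

A filter such as "name > 'abc'" raises ValueError under this check, while unquoted numeric values remain free to use any comparison.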

Quoting [Fisher, SM (Steve)] (Aug 14 2008):
11 (SQL portability)
The proposed solution, to limit comparing of strings, sounds good to me. Not being very SQL literate: are we losing much here?
I don't think so. I cannot see much use for selection based on string comparisons except for equality. We can always put it back later if it is really necessary *and* we find how to do it.
Agree. That's even automatically backward-compatible :-D Andre. -- Nothing is ever easy.
participants (8): 'Andre Merzky'; Andre Merzky; Fisher, SM (Steve); Hartmut Kaiser; Keith Jackson; Mihael Hategan; PFA van Zoolingen; Thilo Kielmann