Re: [SAGA-RG] Fwd (mathijs@cs.vu.nl): Suboptimal things in SAGA

Hi Mathijs, we did not manage to discuss everything at OGF - the agenda was pretty packed already. Thus again by mail... Quoting [Thilo Kielmann] (Oct 15 2009):
more food...
----- Forwarded message from Mathijs den Burger <mathijs@cs.vu.nl> -----
Subject: Suboptimal things in SAGA
From: Mathijs den Burger <mathijs@cs.vu.nl>
To: Thilo Kielmann <kielmann@cs.vu.nl>
Hi,
Here's a list of things that I feel are not optimal in SAGA right now. Maybe interesting for (lunch) discussions at OGF?
1. Exception handling in engines with late binding is a pain. When multiple adaptors are tried automatically and all fail, it is hard to figure out what actually went wrong. JavaGAT also has this problem. Users run away screaming when seeing their first nested exception: it contains complaints from 10 backends they never heard of nor asked for, about stuff they do not understand.
Agree, it is painful. But what can you do? At best, the engine is able to employ some heuristics to extract the most relevant exception and push that to the top level. Your application should then only print that message to stderr, by default. The only real 'solution' would be to disable late binding... Or do you see any other way?
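To illustrate the heuristic idea above - a minimal, hypothetical sketch (the adaptor_error type and the relevance ranking are invented for illustration; no SAGA engine is claimed to work exactly like this):

#include <string>
#include <vector>

// One exception collected from one adaptor during late binding.
struct adaptor_error
{
  std::string adaptor;  // which adaptor threw
  int         rank;     // assumed relevance ranking, e.g.
                        // AuthorizationFailed=1 < BadParameter=2 < DoesNotExist=3
  std::string message;
};

// Pick the most relevant error to surface at top level; the others stay
// available as nested exceptions for debugging. Assumes errors is non-empty.
const adaptor_error & most_relevant (const std::vector<adaptor_error> & errors)
{
  const adaptor_error * best = &errors.front ();
  for ( const adaptor_error & e : errors )
    if ( e.rank > best->rank )
      best = &e;
  return *best;
}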
2. The any:// scheme is evil. It works well if the available backends change a lot, but that is not the case in practice. Users know very well what backends are used for what Grid sites: it's rather static info.
Well, then they should use the backend specific URLs, not 'any'! 'Any' is exactly for those cases where the backend is *not* known - in all other cases, it does not make sense to use it. If you think this is too much trouble for your users, simply disable 'any'. The spec says: "The SAGA API specification allows the use of the placeholder 'any' (as in any://host.net/tmp/file). A SAGA compliant implementation MAY be able to choose a suitable protocol automatically, but CAN decline the URL with an IncorrectURL exception."
A much cleaner design (e.g. as followed in IbisDeploy) to alleviate problems 1 and 2 is to define a number of Grid sites, each of which has certain backends and for which you have certain credentials. A SAGA engine can then, per site, only try the adaptors that make sense in the first place.
Hmm, isn't that what is happening? The default session should contain saga contexts for those backends you have security credentials for. As the adaptors live in that session, only those adaptors should become active for which a context exists. You probably mean that the other adaptors still throw an AuthorizationFailed exception if no context is available? Well, one can disable the adaptors. So, I guess what I am trying to say is that the saga::session can be used to specify the backends to use, via the contexts.
This limits the backends tried to the ones explicitly specified by the user, which makes it much more comprehensible what is going on. It is also faster, since not all adaptors have to be tried. Currently, there is no generic way in SAGA to limit which adaptors or credentials are used for a site. JavaGAT does have such functionality.
The generic way is to create a session with those contexts (aka credentials) attached which you want to use. Say you want to limit the set of active adaptors to the Globus adaptors; do:

saga::session s;
saga::context c ("globus");
s.add_context (c);
saga::filesystem::file f (s, url);

This should get you only the globus adaptor - all others will bail out, right? (sorry if my answer is a repetition from above)
3. Sessions with multiple contexts of the same type should be forbidden. Trying them all may have weird and unwanted side-effects (e.g. creating files as a different user, or a security lockout because you tried too many passwords). It confuses the user. This issue is related to point 2.
This is a tough one. The problem here is that a context type is not bound to a backend type. Like, both glite and globus use X509 certs. Both AWS and ssh use openssl keypairs. Both local and ftp use Username/Password, etc. I don't think this is something one can enforce. We had the proposal to have the context types not bound to the backend *technology* (x509), but to the backend *name* (teragrid). This was declined as it makes it difficult to run your stuff on a different deployment using the same cert.
4. URL schemes are ill-defined. Right now, knowing which schemes to use is implementation-dependent voodoo (e.g. what is the scheme for running local jobs? Java SAGA uses 'local://', C++ SAGA used 'fork://'). There is no generic way of knowing these schemes other than 'read the documentation', which people don't do. Essentially, these schemes create an untyped dependency of a SAGA app to a SAGA implementation, causing SAGA apps not to be portable across implementations unless they all have the same adaptors that recognize the same schemes.
Correct. Schema definition is not part of the spec. I argue it should not be either, as that can only be a restrictive specification, which would break use cases, too. The only solution right now is to create a registry - simply a web page which lists recommendations on what scheme to use for what backend. Would that make sense to you?
5. Bulk operations are hard to implement and clumsy to use. Better would be to include bulk operations directly in the API where they make sense. It's much simpler to implement adaptors for that, and much easier for users to use and comprehend.
Oops - bulk ops were designed to be easy to use! Hmmm...

About the hard to implement: true, but iff they are easy to use, then that does not matter (to the SAGA API spec).

Why bulk ops were not explicitly added to the spec is obvious: it would (roughly) double the number of calls, and would lead to some pretty complex call signatures:

list <list <url> > listings = dir.bulk_list (list <url>);
list <int>         results  = file.bulk_read (list <buffer>, list <sizes>);

Further, this would lead to even more complex error semantics (what happens if one op out of a bulk of ops fails?). This all is avoided by the current syntax:

foreach url in ( list<url> )
{
  tc.add_task (dir.list <Async> (url));
}
tc.wait (All);

Not that difficult to use, I believe?
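For illustration, per-task error handling in that model could look like the following sketch (using the spec's task interface - get_tasks, get_state, rethrow; the exact C++ spellings may differ per implementation). A failure stays local to the task that caused it, instead of poisoning a whole bulk call:

// after tc.wait (All), inspect each task individually
std::vector <saga::task> tasks = tc.get_tasks ();

for ( std::size_t i = 0; i < tasks.size (); ++i )
{
  if ( tasks[i].get_state () == saga::task::Failed )
  {
    try { tasks[i].rethrow (); }  // surfaces only this task's exception
    catch ( saga::exception const & e )
    {
      std::cerr << "task failed: " << e.what () << std::endl;
    }
  }
}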
Sorry for the rant :)
Hey, that's ok! :-) Also, I am very biased in my answers as you'll notice, and probably somewhat defensive, too. So, it would be nice to hear from others, too!

Cheers, and thanks, Andre.
-- Nothing is ever easy.

Hi Andre,
Pfew, these mails tend to get long. Here we go:
On Sat, 2009-10-17 at 22:25 -0600, Andre Merzky wrote:
1. Exception handling in engines with late binding is a pain.
Agree, it is painful. But what can you do? At best, the engine is able to employ some heuristics to extract the most relevant exception and push that to the top level. Your application should then only print that message to stderr, by default.
The only real 'solution' would be to disable late binding... Or do you see any other way?
Mainly: restrict the number of backends tried as much as possible (see below). Furthermore, catch generic errors in the engine instead of in each adaptor separately (e.g. illegal flags, negative port numbers etc.) so the user gets one exception instead of 10 identical ones from all adaptors tried.
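Such an engine-level check might look like this (a hypothetical sketch; the function, the flag mask, and the exact exception class spellings are made up for illustration - BadParameter is the spec's error, get_port() is the spec's URL call):

// Validate once in the engine, before any adaptor is tried, so a bad
// call yields a single BadParameter instead of one per adaptor.
void check_open_params (saga::url const & u, int flags)
{
  if ( u.get_port () < -1 )          // -1 commonly denotes "no port given"
    throw saga::bad_parameter ("negative port number");

  if ( flags & ~valid_open_flags )   // valid_open_flags: assumed mask of spec'd flags
    throw saga::bad_parameter ("illegal flags");
}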
2. The any:// scheme is evil.
Well, then they should use the backend specific URLs, not 'any'! 'Any' is exactly for those cases where the backend is *not* known - in all other cases, it does not make sense to use it.
Agreed.
A much cleaner design (e.g. as followed in IbisDeploy) to alleviate problems 1 and 2 is to define a number of Grid sites, each of which has certain backends and for which you have certain credentials. A SAGA engine can then, per site, only try the adaptors that make sense in the first place.
Hmm, isn't that what is happening? The default session should contain saga contexts for those backends you have security credentials for. As the adaptors live in that session, only those adaptors should become active for which a context exists.
You probably mean that the other adaptors still throw an AuthorizationFailed exception if no context is available? Well, one can disable the adaptors.
So, I guess what I am trying to say is that the saga::session can be used to specify the backends to use, via the contexts.
This limits the backends tried to the ones explicitly specified by the user, which makes it much more comprehensible what is going on. It is also faster, since not all adaptors have to be tried. Currently, there is no generic way in SAGA to limit which adaptors or credentials are used for a site. JavaGAT does have such functionality.
The generic way is to create a session with those contexts (aka credentials) attached which you want to use. Say, you want to limit the set of active adaptors to the globus adaptors, do
saga::session s;
saga::context c ("globus");
s.add_context (c);
saga::filesystem::file f (s, url);
This should get you only the globus adaptor - all others will bail out, right? (sorry if my answer is a repetition from above)
Not really. The other adaptors will still throw an exception in their constructor. Say the Globus adaptor fails for some reason: the user then still has to wade through all the other exceptions to find the one that matters. That's confusing and annoying.
3. Sessions with multiple contexts of the same type should be forbidden. Trying them all may have weird and unwanted side-effects (e.g. creating files as a different user, or a security lockout because you tried too many passwords). It confuses the user. This issue is related to point 2.
This is a tough one. The problem here is that a context type is not bound to a backend type. Like, both glite and globus use X509 certs. Both AWS and ssh use openssl keypairs. Both local and ftp use Username/Password, etc. I don't think this is something one can enforce.
We had the proposal to have the context types not bound to the backend *technology* (x509), but to the backend *name* (teragrid). This was declined as it makes it difficult to run your stuff on a different deployment using the same cert.
Hmm, in your adaptor-selecting example you do exactly that: using a context type specific to a single backend ("globus") to select a specific adaptor. If the context should have a type "x509", how do I then select only the Globus adaptor? And how do I differentiate between multiple Globus adaptors for different versions of Globus? There should be a better way of selecting adaptors...
4. URL schemes are ill-defined. Right now, knowing which schemes to use is implementation-dependent voodoo (e.g. what is the scheme for running local jobs? Java SAGA uses 'local://', C++ SAGA used 'fork://'). There is no generic way of knowing these schemes other than 'read the documentation', which people don't do. Essentially, these schemes create an untyped dependency of a SAGA app to a SAGA implementation, causing SAGA apps not to be portable across implementations unless they all have the same adaptors that recognize the same schemes.
Correct. Schema definition is not part of the spec. I argue it should not be either, as that can only be a restrictive specification, which would break use cases, too. The only solution right now is to create a registry - simply a web page which lists recommendations on what scheme to use for what backend. Would that make sense to you?
That would certainly help to bring the various SAGA implementations closer together.

However, the more general problem is that SAGA users should be able to limit the adaptors used in a late-binding implementation. The two main reasons are:

- speed (always trying 10 adaptors takes time)
- clarity (limit the number of exceptions)

The current two generic mechanisms are context types and URL schemes. Neither is very well suited. Each adaptor would have to recognize a unique context type and scheme to allow the selection of individual adaptors. Even then, selecting two adaptors is already hard: you cannot have two schemes in a URL, and using two contexts only works if both adaptors recognize a context in the first place.

A solution could be to add some extra functionality to a Session, as sketched below. A user should be able to specify which adaptors may be used, e.g. something similar to the Preferences object in JavaGAT. Ideally, you could also ask which adaptors are available. Specifying this in the API prevents each implementation from creating its own mechanism via config files, system properties, environment variables etc.
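For illustration, a hypothetical sketch of such a Session extension (set_preference and list_adaptors do not exist in the SAGA API; the names merely mimic the spirit of JavaGAT's Preferences):

saga::session s;

// only ever try the globus file adaptor, then ssh as a fallback
s.set_preference ("file.adaptor.name", "globus,ssh");

// introspection: which adaptors could serve the file package at all?
std::vector <std::string> candidates = s.list_adaptors ("file");

saga::filesystem::file f (s, url);  // engine consults the preferences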
5. Bulk operations are hard to implement and clumsy to use. Better would be to include bulk operations directly in the API where they make sense. It's much simpler to implement adaptors for that, and much easier for users to use and comprehend.
Oops - bulk ops were designed to be easy to use! Hmmm...
About the hard to implement: true, but iff they are easy to use, then that does not matter (to the SAGA API spec).
Why bulk ops were not explicitly added to the spec is obvious: it would (roughly) double the number of calls, and would lead to some pretty complex call signatures:
list <list <url> > listings = dir.bulk_list (list <url>);
list <int>         results  = file.bulk_read (list <buffer>, list <sizes>);
Further, this would lead to even more complex error semantics (what happens if one op out of a bulk of ops fails?).
This all is avoided by the current syntax
foreach url in ( list<url> )
{
  tc.add_task (dir.list <Async> (url));
}
tc.wait (All);
Not that difficult to use I believe?
First, how do I figure out which list came from which URL? The get_object() call of each task will only return the 'dir' object, but you need the 'url' parameter to make sense of the result. Doesn't this make the current bulk ops API useless for all methods that take parameters?

Second, does each bulk operation require the creation of another task container? If I want to do dir.get_size(url) and dir.is_directory(url) for all entries in a directory, can I put all these tasks in one container, or should I create two separate containers? The programming model does not restrict me in any way. An engine will have a hard time analyzing such task containers and converting them to efficient adaptor calls...

best regards, Mathijs

Quoting [Mathijs den Burger] (Oct 19 2009):
Hi Andre,
Pfew, these mails tend to get long. Here we go:
On Sat, 2009-10-17 at 22:25 -0600, Andre Merzky wrote:
1. Exception handling in engines with late binding is a pain.
Agree, it is painful. But what can you do? At best, the engine is able to employ some heuristics to extract the most relevant exception and push that to the top level. Your application should then only print that message to stderr, by default.
The only real 'solution' would be to disable late binding... Or do you see any other way?
Mainly: restrict the number of backends tried as much as possible (see below). Furthermore, catch generic errors in the engine instead of in each adaptor separately (e.g. illegal flags, negative port numbers etc.) so the user gets one exception instead of 10 identical ones from all adaptors tried.
Yes, that is on our todo list. Right now, the C++ engine does not have the infrastructure for doing parameter checks on package level (instead of adaptor level). Is that implemented in Java?
The generic way is to create a session with those contexts (aka credentials) attached which you want to use. Say, you want to limit the set of active adaptors to the globus adaptors, do
saga::session s;
saga::context c ("globus");
s.add_context (c);
saga::filesystem::file f (s, url);
This should get you only the globus adaptor - all others will bail out, right? (sorry if my answer is a repetition from above)
Not really. The other adaptors will still throw an exception in their constructor. Say the Globus adaptor fails for some reason: the user then still has to wade through all the other exceptions to find the one that matters. That's confusing and annoying.
Fair enough.
3. Sessions with multiple contexts of the same type should be forbidden. Trying them all may have weird and unwanted side-effects (e.g. creating files as a different user, or a security lockout because you tried too many passwords). It confuses the user. This issue is related to point 2.
This is a tough one. The problem here is that a context type is not bound to a backend type. Like, both glite and globus use X509 certs. Both AWS and ssh use openssl keypairs. Both local and ftp use Username/Password, etc. I don't think this is something one can enforce.
We had the proposal to have the context types not bound to the backend *technology* (x509), but to the backend *name* (teragrid). This was declined as it makes it difficult to run your stuff on a different deployment using the same cert.
Hmm, in your adaptor-selecting example you do exactly that: using a context type specific to a single backend ("globus") to select a specific adaptor. If the context should have a type "x509", how do I then select only the Globus adaptor? And how do I differentiate between multiple Globus adaptors for different versions of Globus? There should be a better way of selecting adaptors...
There is, but not on API level. If you know your backends in advance, or know that specific backends are preferred on some host, then you should configure your SAGA accordingly, i.e. disable all other backends by default. Most people compile and install all adaptors, and leave all enabled by default - it should be the task of the admin (the installing person) to make a sensible choice here. Well, that is my/our approach to adaptor pre-selection anyway...
4. URL schemes are ill-defined. Right now, knowing which schemes to use is implementation-dependent voodoo (e.g. what is the scheme for running local jobs? Java SAGA uses 'local://', C++ SAGA used 'fork://'). There is no generic way of knowing these schemes other than 'read the documentation', which people don't do. Essentially, these schemes create an untyped dependency of a SAGA app to a SAGA implementation, causing SAGA apps not to be portable across implementations unless they all have the same adaptors that recognize the same schemes.
Correct. Schema definition is not part of the spec. I argue it should not be either, as that can only be a restrictive specification, which would break use cases, too. The only solution right now is to create a registry - simply a web page which lists recommendations on what scheme to use for what backend. Would that make sense to you?
That would certainly help to bring the various SAGA implementations closer together.
However, the more general problem is that SAGA users should be able to limit the adaptors used in a late-binding implementation. The two main reasons are:
- speed (always trying 10 adaptors takes time)
- clarity (limit the number of exceptions)
The current two generic mechanisms are context types and URL schemes. Neither is very well suited. Each adaptor would have to recognize a unique context type and scheme to allow the selection of individual adaptors. Even then, selecting two adaptors is already hard: you cannot have two schemes in a URL, and using two contexts only works if both adaptors recognize a context in the first place.
A solution could be to add some extra functionality to a Session. A user should be able to specify which adaptors may be used, e.g. something similar to the Preferences object in JavaGAT. Ideally, you could also ask which adaptors are available. Specifying this in the API prevents each implementation from creating its own mechanism via config files, system properties, environment variables etc.
Yeah, I was expecting you to come up with JavaGAT preferences :-P

I myself really don't think it's a good idea to add backend inspection/control to the SAGA API (backend meaning SAGA implementation in this case). Also, we already threw this out of the API a couple of times.

I see your point of having an implementation-independent mechanism. For C++, there are not too many implementations around (or expected to be around) to have a real problem here. Don't fix it if it ain't broken, right? So, we can try to make that more formal when we in fact have multiple implementations.

For Java, you guys added properties already, and as far as I can see the exact properties which are available are undefined. I don't like this, to be honest, but that seems the Java way, right? So, do you already use that for adaptor pre-selection?

One thing I could see implemented universally is to require a context to be present for a backend to get activated at all: that would allow getting rid of the majority of exceptions, I think.
5. Bulk operations are hard to implement and clumsy to use. Better would be to include bulk operations directly in the API where they make sense. It's much simpler to implement adaptors for that, and much easier for users to use and comprehend.
Oops - bulk ops were designed to be easy to use! Hmmm...
About the hard to implement: true, but iff they are easy to use, then that does not matter (to the SAGA API spec).
Why bulk ops were not explicitly added to the spec is obvious: it would (roughly) double the number of calls, and would lead to some pretty complex call signatures:
list <list <url> > listings = dir.bulk_list (list <url>);
list <int>         results  = file.bulk_read (list <buffer>, list <sizes>);
Further, this would lead to even more complex error semantics (what happens if one op out of a bulk of ops fails?).
This all is avoided by the current syntax
foreach url in ( list<url> )
{
  tc.add_task (dir.list <Async> (url));
}
tc.wait (All);
Not that difficult to use I believe?
First, how do I figure out which list came from which URL? The get_object() call of each task will only return the 'dir' object, but you need the 'url' parameter to make sense of the result.
Yes, you need to track tasks on API level - but you need to do the same in the other case as well, explicitly or implicitly, via some list index or map.
Doesn't this make the current bulk ops API useless for all methods that take parameters?
No, not really, as that is rather simple on API level (pseudocode):

foreach url in ( list<url> )
{
  saga::task t = dir.list <Async> (url);
  tc.add_task (t);
  task_map[t] = url;
}

while ( tc.size () )
{
  saga::task t = tc.wait (Any);
  cout << "list result for " << task_map[t] << " : "
       << t.get_result <list <url> > ();
}

The code for explicit bulk operations would not look much different, I assume.
Second, does each bulk operation require the creation of another task container? If I want to do dir.get_size(url) and dir.is_directory(url) for all entries in a directory, can I put all these tasks in one container, or should I create two separate containers? The programming model does not restrict me in any way. An engine will have a hard time analyzing such task containers and converting them to efficient adaptor calls...
Again, it is not about ease of engine implementation. Also, we did implement it, and as long as you have task inspection (on implementation level), that analysis step is not too hard:

foreach task in task_container
{
  task_operation_type_map[task.operation_type].push_back (task);
}

foreach task_operation_type in task_operation_type_map
{
  task_operation_type.call_adaptor_bulk_op (task_operation_type);
}

If an adaptor can't do the complete bulk op, it returns (in our implementation) those tasks it cannot handle, so the next adaptor can try (IIRC). If all adaptors fail, the individual ops are done one-by-one. If the adaptor does not have a bulk interface, the ops are done one-by-one anyway. So, it's actually like (sorry for the long names, but you Java guys like that, don't you? ;-):

while ( ! task_operation_type_map.empty () )
{
  // try bulk ops for each adaptor
  foreach task_operation_type in task_operation_type_map
  {
    foreach adaptor in adaptor_list
    {
      task_container todo     = task_operation_type_map[task_operation_type];
      task_container not_done = adaptor.bulk_op (todo);
      task_operation_type_map[task_operation_type] = not_done;
    }
  }

  // handle all not_dones
  foreach task_operation_type in task_operation_type_map
  {
    task_container todo = task_operation_type_map[task_operation_type];
    foreach task in todo
    {
      foreach adaptor in adaptor_list
      {
        adaptor.serial_op (task) && break;
      }
    }
  }

  // all tasks are done, or cannot be done at all.
}

So, that is really it (modulo technical decorations, which can always be non-trivial of course). Supporting a complete set of bulk ops on implementation and adaptor level is not really a much simpler solution I think, and gives you less flexibility.

Cheers, Andre.
-- Nothing is ever easy.
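For concreteness, the grouping step above in compilable form - a minimal sketch, assuming an implementation-level task type that carries an operation-type tag (this is engine-internal bookkeeping, not part of the SAGA API):

#include <map>
#include <string>
#include <vector>

// Implementation-level view of a task: just enough for grouping.
struct task
{
  std::string operation_type;  // e.g. "dir.get_size", "dir.is_directory"
  // ... handle to object, parameters, result slot, state ...
};

// Bucket the tasks of a container by operation type, so each bucket can
// be offered to the adaptors as one bulk op.
std::map <std::string, std::vector <task> >
group_by_operation (std::vector <task> const & task_container)
{
  std::map <std::string, std::vector <task> > groups;
  for ( std::size_t i = 0; i < task_container.size (); ++i )
    groups[task_container[i].operation_type].push_back (task_container[i]);
  return groups;
}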

On Mon, 2009-10-19 at 11:38 -0500, Andre Merzky wrote:
1. Exception handling in engines with late binding is a pain.
Agree, it is painful. But what can you do? At best, the engine is able to employ some heuristics to extract the most relevant exception and push that to the top level. Your application should then only print that message to stderr, by default.
The only real 'solution' would be to disable late binding... Or do you see any other way?
Mainly: restrict the number of backends tried as much as possible (see below). Furthermore, catch generic errors in the engine instead of in each adaptor separately (e.g. illegal flags, negative port numbers etc.) so the user gets one exception instead of 10 identical ones from all adaptors tried.
Yes, that is on our todo list. Right now, the C++ engine does not have the infrastructure for doing parameter checks on package level (instead of adaptor level). Is that implemented in Java?
Yes, we recently moved such checks from the adaptors to the engine.
The generic way is to create a session with those contexts (aka credentials) attached which you want to use. Say, you want to limit the set of active adaptors to the globus adaptors, do
saga::session s;
saga::context c ("globus");
s.add_context (c);
saga::filesystem::file f (s, url);
This should get you only the globus adaptor - all others will bail out, right? (sorry if my answer is a repetition from above)
Not really. The other adaptors will still throw an exception in their constructor. Say the Globus adaptor fails for some reason: the user then still has to wade through all the other exceptions to find the one that matters. That's confusing and annoying.
Fair enough.
3. Sessions with multiple contexts of the same type should be forbidden. Trying them all may have weird and unwanted side-effects (e.g. creating files as a different user, or a security lockout because you tried too many passwords). It confuses the user. This issue is related to point 2.
This is a tough one. The problem here is that a context type is not bound to a backend type. Like, both glite and globus use X509 certs. Both AWS and ssh use openssl keypairs. Both local and ftp use Username/Password, etc. I don't think this is something one can enforce.
We had the proposal to have the context types not bound to the backend *technology* (x509), but to the backend *name* (teragrid). This was declined as it makes it difficult to run your stuff on a different deployment using the same cert.
Hmm, in your adaptor-selecting example you do exactly that: using a context type specific to a single backend ("globus") to select a specific adaptor. If the context should have a type "x509", how do I then select only the Globus adaptor? And how do I differentiate between multiple Globus adaptors for different versions of Globus? There should be a better way of selecting adaptors...
There is, but not on API level. If you know your backends in advance, or know that specific backends are preferred on some host, then you should configure your SAGA accordingly, i.e. disable all other backends by default. Most people compile and install all adaptors, and leave all enabled by default - it should be the task of the admin (the installing person) to make a sensible choice here.
Well, that is my/our approach to adaptor pre-selection anyway...
In general, the adaptors to enable differ per site. If you only use Globus for all your grid sites, it makes sense to only enable the Globus adaptor. If you use Globus at site A and Gridsam at site B, you want to enable both adaptors. However, a late-binding SAGA engine will then always try to use both adaptors for both sites, which is bad for speed and clarity. It should not be the case that to make SAGA usable, the number of installed adaptors has to be limited.
4. URL schemes are ill-defined. Right now, knowing which schemes to use is implementation-dependent voodoo (e.g. what is the scheme for running local jobs? Java SAGA uses 'local://', C++ SAGA used 'fork://'). There is no generic way of knowing these schemes other than 'read the documentation', which people don't do. Essentially, these schemes create an untyped dependency of a SAGA app to a SAGA implementation, causing SAGA apps not to be portable across implementations unless they all have the same adaptors that recognize the same schemes.
Correct. Schema definition is not part of the spec. I argue it should not be either, as that can only be a restrictive specification, which would break use cases, too. The only solution right now is to create a registry - simply a web page which lists recommendations on what scheme to use for what backend. Would that make sense to you?
That would certainly help to bring the various SAGA implementations closer together.
However, the more general problem is that SAGA users should be able to limit the adaptors used in a late-binding implementation. The two main reasons are:
- speed (always trying 10 adaptors takes time)
- clarity (limit the number of exceptions)
The current two generic mechanisms are context types and URL schemes. Neither is very well suited. Each adaptor would have to recognize a unique context type and scheme to allow the selection of individual adaptors. Even then, selecting two adaptors is already hard: you cannot have two schemes in a URL, and using two contexts only works if both adaptors recognize a context in the first place.
A solution could be to add some extra functionality to a Session. A user should be able to specify which adaptors may be used, e.g. something similar to the Preferences object in JavaGAT. Ideally, you could also ask which adaptors are available. Specifying this in the API prevents each implementation from creating its own mechanism via config files, system properties, environment variables etc.
Yeah, I was expecting you to come up with JavaGAT preferences :-P
I myself really don't think it's a good idea to add backend inspection/control to the SAGA API (backend meaning SAGA implementation in this case). Also, we already threw this out of the API a couple of times.
Ah, I didn't know that :)
I see your point of having an implementation-independent mechanism. For C++, there are not too many implementations around (or expected to be around) to have a real problem here. Don't fix it if it ain't broken, right? So, we can try to make that more formal when we in fact have multiple implementations.
Sure, but right now it increases the learning curve to use another SAGA implementation.
For Java, you guys added properties already, and as far as I can see the exact properties which are available are undefined. I don't like this to be honest, but that seems the Java way, right? So, do you already use that for adaptor pre-selection?
The properties are described in the User Guide. Basically, a user can specify system properties to set the adaptor loading order and which adaptors to include. For example, setting the property NSEntry.adaptor.name=local,javagat will make the engine try only the Local and the JavaGAT adaptor (in that order). Also possible: NSEntry.adaptor.name=!javagat, which will use all adaptors except the JavaGAT adaptor, in the default order. This is all very engine-specific, and other engines will probably do something completely different. I agree a generic approach is hard, but it is something every SAGA user will struggle with. BTW: putting these properties in a saga.properties file is not very much different from the saga.ini file in C++. It is also the standard way to configure Java software, as opposed to a home-brew .ini file format ;).
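A saga.properties file along those lines might then look like this (only NSEntry.adaptor.name is taken from the examples above; the commented line is an assumed analogue for another package, shown purely for illustration):

# saga.properties - adaptor pre-selection for the Java engine
NSEntry.adaptor.name = local,javagat
# assumed analogue for job adaptors (illustration only):
# JobService.adaptor.name = !javagat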
One thing I could see implemented universally is to require a context to be present for a backend to get activated at all: that would allow getting rid of the majority of exceptions, I think.
Hmm, I like that idea. The only two drawbacks I see are:
1. it blurs the notion of a 'security' context
2. it requires that ALL adaptors recognize a security context, including the local adaptors.
But adding a default 'local' security context, with only a UserID, to the default session would make sense. In fact, a 'local' security context with another UserID and UserPass could be used to run local jobs as a different user! Furthermore, the engine should then not include any nested exceptions from constructors that failed because no applicable security context was present; otherwise the user still sees too many nested exceptions.
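A sketch of that default 'local' context (the UserID/UserPass attribute names are the spec's context attributes; the 'local' context type itself is the proposal above, and the user name is hypothetical):

saga::context local ("local");
local.set_attribute ("UserID", "alice");  // hypothetical local user

saga::session s;
s.add_context (local);

// with UserPass set as well, a 'local' context could identify a
// *different* user for local operations:
//   other.set_attribute ("UserID",   "bob");
//   other.set_attribute ("UserPass", "...");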
5. Bulk operations are hard to implement and clumsy to use. Better would be to include bulk operations directly in the API where they make sense. It's much simpler to implement adaptors for that, and much easier for users to use and comprehend.
Oops - bulk ops were designed to be easy to use! Hmmm...
About the hard to implement: true, but iff they are easy to use, then that does not matter (to the SAGA API spec).
Why bulk ops were not explicitly added to the spec is obvious: it would (roughly) double the number of calls, and would lead to some pretty complex call signatures:
list <list <url> > listings = dir.bulk_list (list <url>);
list <int>         results  = file.bulk_read (list <buffer>, list <sizes>);
Further, this would lead to even more complex error semantics (what happens if one op out of a bulk of ops fails?).
This all is avoided by the current syntax
foreach url in ( list<url> )
{
  tc.add_task (dir.list <Async> (url));
}
tc.wait (All);
Not that difficult to use I believe?
First, how do I figure out which list came from which URL? The get_object() call of each task will only return the 'dir' object, but you need the 'url' parameter to make sense of the result.
Yes, you need to track tasks on API level - but you need to do the same in the other case as well, explicitly or implicitly, via some list index or map.
True
Doesn't this make the current bulk ops API useless for all methods that take parameters?
No, not really, as that is rather simple on API level (pseudocode):
foreach url in ( list<url> )
{
  saga::task t = dir.list <Async> (url);
  tc.add_task (t);
  task_map[t] = url;
}
while ( tc.size () )
{
  saga::task t = tc.wait (Any);
  cout << "list result for " << task_map[t] << " : "
       << t.get_result <list <url> > ();
}
The code for explicit bulk operations would not look much different I assume.
Second, does each bulk operation require the creation of another task container? If I want to do dir.get_size(url) and dir.is_directory(url) for all entries in a directory, can I put all these tasks in one container, or should I create two separate containers? The programming model does not restrict me in any way. An engine will have a hard time analyzing such task containers and converting them to efficient adaptor calls...
Again, it is not about ease of engine implementation. Also, we did implement it, and as long as you have task inspection (on implementation level), that analysis step is not too hard:
foreach task in task_container
{
  task_operation_type_map[task.operation_type].push_back (task);
}

foreach task_operation_type in task_operation_type_map
{
  task_operation_type.call_adaptor_bulk_op (task_operation_type);
}
If an adaptor can't do the complete bulk op, it returns (in our implementation) those tasks it cannot handle, so the next adaptor can try (IIRC). If all adaptors fail, the individual ops are done one-by-one. If the adaptor does not have a bulk interface, the ops are done one-by-one anyway. So, it's actually like (sorry for the long names, but you Java guys like that, don't you? ;-):
I guess everybody likes clear code, which does not necessarily imply long names :)
while ( ! task_operation_type_map.empty () )
{
  // try bulk ops for each adaptor
  foreach task_operation_type in task_operation_type_map
  {
    foreach adaptor in adaptor_list
    {
      task_container todo     = task_operation_type_map[task_operation_type];
      task_container not_done = adaptor.bulk_op (todo);
      task_operation_type_map[task_operation_type] = not_done;
    }
  }

  // handle all not_dones
  foreach task_operation_type in task_operation_type_map
  {
    task_container todo = task_operation_type_map[task_operation_type];
    foreach task in todo
    {
      foreach adaptor in adaptor_list
      {
        adaptor.serial_op (task) && break;
      }
    }
  }

  // all tasks are done, or cannot be done at all.
}
So, that is really it (modulo technical decorations, which can always be non-trivial of course).
Supporting a complete set of bulk ops on implementation and adaptor level is not really a much simpler solution, I think, and gives you less flexibility.
Hmm, njah, grumble. I guess you're technically right, although each adaptor still has to wade through many tasks and somehow figure out if they can be done more efficiently together. Any task can be put in a task container, also jobs etc. Would the analysis code simply skip those?

From the user's perspective: creating a task container of many subtasks only makes sense when that improves performance. A user will therefore (again) have to have intimate knowledge of the adaptor implementation to know which tasks he can throw together in a task container so that 'magically' they will be performed faster together.

A limited number of dedicated bulk methods in the API objects would be much more inflexible, but also much simpler to comprehend. And although 'not easy to implement' is not a valid argument, it should not be too hard either, otherwise adaptor implementors will simply skip bulk operations.

Related question: which C++ adaptors currently implement bulk ops? How easy was it to implement the analysis code in these adaptors? How easy is it to use these bulk ops in an application?

cheers, -Mathijs