Final call: GridRPC API for inclusion in SAGA 1.0
Dear all,

Hidemoto Nakada, Yusuke Tanimura, and I have met and we have re-worked the pending proposal for a GridRPC module for inclusion in the upcoming SAGA spec. The only thing we have changed is the way in which parameters to an RPC invocation are passed. Now, it is an array of parameters, where a parameter is a struct, consisting of buffer, size, and IN/OUT/INOUT mode. We have also modelled examples from the Ninf-G web page with this API. We think that this now has a "look and feel" of GridRPC.

Attached please find the proposed text with API and examples.

If this (or something similar) can be agreed upon quickly, then the GridRPC module can be included in the SAGA 1.0 spec. We see this API not as a competitor to the GridRPC API, as published in GFD-R.52. It is rather an alternative, embedded in the SAGA framework, to access existing GridRPC implementations, thus extending their user base.

We are now doing the final call for comments for the two groups involved. If you have comments or questions, please post them TO BOTH MAILING LISTS.

This final call is open until FRIDAY, AUGUST 18th.

Comments and objections made by then will happily be included in the proposal. By this deadline, we will add this text (with all modifications we got) into the SAGA API document.

Thanks to you all for your contributions.

Thilo Kielmann
-- 
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Hi Thilo, Some comments inlined. Cheers, Andre. Quoting [Thilo Kielmann] (Aug 09 2006):
Date: Wed, 9 Aug 2006 19:00:49 +0200
From: Thilo Kielmann <kielmann@cs.vu.nl>
To: gridrpc-wg@gridforum.org, saga-rg@gridforum.org
Subject: [gridrpc-wg] Final call: GridRPC API for inclusion in SAGA 1.0
Dear all,
Hidemoto Nakada, Yusuke Tanimura, and I have met and we have re-worked the pending proposal for a GridRPC module for inclusion in the upcoming SAGA spec. The only thing we have changed is the way in which parameters to an RPC invocation are passed. Now, it is an array of parameters, where a parameter is a struct, consisting of buffer, size, and IN/OUT/INOUT mode. We have also modelled examples from the Ninf-G web page with this API. We think that this now has a "look and feel" of GridRPC.
It should have the look & feel of SAGA, and the semantics of GridRPC :-) - but if it closes in on look & feel on both ends, so much the better of course...
Attached please find the proposed text with API and examples.
I felt free to change some part to adapt it to the spec look and feel (example coding conventions, indentation etc.). I'll convert that version to tex now, and add it to the CVS. Hope that's ok with you.
If this (or something similar) can be agreed upon quickly, then the GridRPC module can be included in the SAGA 1.0 spec. We see this API not as a competitor to the GridRPC API, as published in GFD-R.52. It is rather an alternative, embedded in the SAGA framework, to access existing GridRPC implementations, thus extending their user base.
We are now doing the final call for comments for the two groups involved. If you have comments or questions, please post them TO BOTH MAILING LISTS.
This final call is open until FRIDAY, AUGUST 18th.
There is one conflict: I hope that we can get the SAGA CORE spec into final mailing list call on Monday. So I suggest to NOT move that timeline backwards, but to include rpc already - if either of the final calls meets negative comments we remove it - does that make sense? More comments below
Comments and objections made by then will happily be included in the proposal. By this deadline, we will add this text (with all modifications we got) into the SAGA API document.
Thanks to you all for your contributions.
Thilo Kielmann
+-------------------------------------------------------------+
     ######  ######   #####
     #     # #     # #     #
     #     # #     # #
     ######  ######  #
     #   #   #       #
     #    #  #       #     #
     #     # #        #####
+-------------------------------------------------------------+
Summary:
========
GridRPC is one of the few high level APIs defined by the GGF. Including it into the SAGA API benefits both: SAGA gets more complete, and provides a better coverage of its use cases with a single look and feel; and GridRPC gets embedded into a set of other tools of similar scope, which opens it to a potentially wider user community, and ensures its further development.
The RPC package of the SAGA API described here is a one to one mapping from the GridRPC standard, equipped with the SAGA look and feel, error conventions, task model etc.
+-------------------------------------------------------------+
Specification:
==============

  package saga version 0.1
  {
    package rpc
    {
      enum io_mode
      {
        In    = 1,    // input parameter
        Out   = 2,    // output parameter
        InOut = 3     // both input and output parameter
      }

      struct parameter
      {
        long        size;      // number of bytes in buffer
        array<byte> buffer;    // data
        io_mode     mode;      // parameter mode
      }

      class rpc
      {
        CONSTRUCTOR (in    string           funcname  );
        call        (inout array<parameter> parameters);
      }
    }
  }
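For readers more at home in C++ than in SIDL, the two data types above might be rendered roughly as follows. This is only a sketch under assumptions: the namespace `sketch`, the choice of `std::vector<std::uint8_t>` for `array<byte>`, and the helper `make_in` are all hypothetical and are not part of the proposal or of any SAGA language binding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

namespace sketch  // hypothetical namespace, not the SAGA C++ binding
{
  // mirrors 'enum io_mode' from the SIDL above
  enum io_mode { In = 1, Out = 2, InOut = 3 };

  // mirrors 'struct parameter'; array<byte> is modelled as a byte vector
  struct parameter
  {
    long                      size;    // number of bytes in buffer
    std::vector<std::uint8_t> buffer;  // data
    io_mode                   mode;    // parameter mode
  };

  // hypothetical helper: wrap 'n' bytes at 'data' as an In parameter.
  // Note that this copies the data, so it is NOT zero-copy.
  inline parameter make_in (const void * data, std::size_t n)
  {
    assert (data != nullptr || n == 0);

    parameter p;
    p.size = static_cast<long> (n);
    p.buffer.resize (n);
    if (n > 0) std::memcpy (p.buffer.data (), data, n);
    p.mode = In;
    return p;
  }
}
```

A byte vector keeps ownership simple but gives up the zero-copy property the proposal aims for; a binding that wants zero-copy would more likely hold a pointer to caller-owned memory.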
+-------------------------------------------------------------+
#ifndef SHORT
Details:
========

class rpc:
----------
This class represents a remote function handle, which can be called (repeatedly), and returns the result of the respective remote procedure invocation.
The class offers one non trivial constructor, which initialises the remote function handle (see [1] for details). That process may involve connection setup, service discovery etc. The class further offers one method 'call()', which invokes the remote procedure, and returns the respective return data and values.
In the constructor, the remote procedure to be invoked is specified by a URL, with the syntax:
gridrpc://server.net:1234/my_function
with the elements corresponding to:

  gridrpc     - scheme - identifying a grid rpc operation
  server.net  - server - server host serving the rpc call
  1234        - port   - contact point for the server
  my_function - name   - name of the remote method to invoke
All elements but the scheme can be empty, which allows the implementation to fall back to some default remote method to invoke (minimal URL: gridrpc:///).
The description of the constructor says that the URL can be NULL, in some languages that will mean 'empty' (e.g. if they have no default args). That would mean that scheme can be empty, too.
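To make the URL rules discussed here concrete, the following sketch splits a `gridrpc://` URL into its four elements. It treats a totally empty string as "use all defaults", which is one possible reading of the NULL case raised above. The names `rpc_url` and `parse` are hypothetical, and requiring the `/` before the function name is an assumption of this sketch, not something the text prescribes.

```cpp
#include <cassert>
#include <string>

namespace sketch  // hypothetical namespace
{
  struct rpc_url
  {
    std::string scheme;  // "gridrpc", or empty for a totally empty URL
    std::string server;  // may be empty
    std::string port;    // may be empty
    std::string name;    // may be empty
  };

  // Returns false if 's' is non-empty but not of the form
  //   scheme://[server][:port]/[name]
  inline bool parse (const std::string & s, rpc_url & out)
  {
    out = rpc_url ();
    if (s.empty ()) return true;              // totally empty URL: all defaults

    const std::string sep = "://";
    std::string::size_type p = s.find (sep);
    if (p == std::string::npos || p == 0) return false;   // scheme is required
    out.scheme = s.substr (0, p);

    std::string rest = s.substr (p + sep.size ());        // "server:port/name"
    std::string::size_type slash = rest.find ('/');
    if (slash == std::string::npos) return false;         // assumption: '/' required

    std::string authority = rest.substr (0, slash);
    out.name              = rest.substr (slash + 1);

    std::string::size_type colon = authority.find (':');
    if (colon == std::string::npos)
    {
      out.server = authority;
    }
    else
    {
      out.server = authority.substr (0, colon);
      out.port   = authority.substr (colon + 1);
    }
    return true;
  }
}
```

With this reading, both the minimal URL `gridrpc:///` and the empty string parse successfully and leave server, port and name empty, so the implementation can fall back to its defaults in either case.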
The argument and return value handling is currently very basic, and reflects the traditional scheme for remote procedure calls: an array of parameters, for each of which the buffer, its size, and the input/output mode is described. On invocation of the 'call' method, the 'mode' value has to be initialized for each parameter; for parameters with mode 'In' or 'InOut', 'size' and 'buffer' must be initialized as well. For parameters with mode 'Out', 'size' might also have the value 0, in which case the 'buffer' is considered to be void, and has to be created (e.g., allocated) by the SAGA system upon arrival of the result data.
Not that I disagree, but it should be noted that even for RPC calls which require input parameters only, the params must be passed by reference. That implies, in some languages, to track the param memory for async invocations, where that is not the case for async invocations of other languages which have no output parameters. It can't be helped I guess, and should not be an issue really.
This argument handling scheme allows efficient (zero-copy) passing of parameters. For 'Out' parameters with a size value of 0, the implementation is required to allocate the data structure and to overwrite the size and buffer fields for the parameter.
It is the responsibility of the application programmer to free this memory I assume? The language bindings MUST prescribe how that memory is allocated, to allow the application to choose the appropriate de-allocation method. Alternatively we would need a 'dealloc' method, which would then require the implementation to alloc and de-alloc the params (and to keep track of the blocks).
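The ownership question can be illustrated with a small sketch. Here `fake_call` stands in for the remote invocation and simply writes a given result into every 'Out' parameter; for a parameter with size 0 it allocates the buffer and overwrites both fields, as the text requires. Everything in this block is hypothetical, and in particular the choice of `new[]` (and hence `delete[]` on the application side) is exactly the kind of decision a language binding would have to prescribe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

namespace sketch  // hypothetical namespace
{
  enum io_mode { In = 1, Out = 2, InOut = 3 };

  struct parameter
  {
    long            size;    // number of bytes in buffer
    unsigned char * buffer;  // raw pointer: zero-copy, but who frees it?
    io_mode         mode;    // parameter mode
  };

  // Stand-in for a remote call: copies 'n' result bytes into every Out
  // parameter. A real implementation would receive them from the server.
  inline void fake_call (std::vector<parameter> & params,
                         const unsigned char *    result,
                         long                     n)
  {
    for (std::size_t i = 0; i < params.size (); ++i)
    {
      parameter & p = params[i];
      if (p.mode != Out) continue;

      if (p.size == 0)
      {
        // buffer is "void": the implementation allocates it ...
        p.buffer = new unsigned char[n];   // assumption: new[] / delete[]
        p.size   = n;                      // ... and overwrites size and buffer
      }

      assert (p.size >= n);                // a caller-provided buffer must fit
      std::memcpy (p.buffer, result, static_cast<std::size_t> (n));
    }
  }
}
```

After such a call the application holds a pointer it did not allocate, so it can only release it correctly if the binding documents the allocation method, or if the API offers a matching 'dealloc'.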
  - constructor
    Purpose:  inits a remote function handle
    Format:   CONSTRUCTOR (in session session,
                           in string  funcname)
    Inputs:   session:  saga session to use
              funcname: name of remote method to initialize
    Outputs:  -
    Throws:   NoSuccess: server could not be contacted, or method is unknown
I assume a number of other exceptions would apply as well, such as

  AuthenticationFailed
  AuthorizationFailed
  PermissionDenied
  DoesNotExist (server contacted, but no such call available)
  IncorrectURL

Well, and some more I guess. Question is: NoSuccess is actually reserved as last resort, if no other exception really applies (it's the least specific exception, please have a look at the 'error handling' section in the spec). So, is NoSuccess really needed here, and in what conditions?
    Notes:    - see [1] for details
              - if funcname is NULL, a default handle is created

  - call
    Purpose:  call the remote procedure
    Format:   call (inout array<parameter> param);
    Inputs:   -
    In/Out:   param: argument/result values for call
    Outputs:  -
    Throws:   NoSuccess: remote operation failed
Same as above.
    Notes:    - see [1] for details
              - by passing an array of variable size, different numbers of parameters can be handled. No special variable argument list handling is required.
We discussed varargs at one of the last GGF, and came to the conclusion that language bindings COULD allow varargs. That does not make sense with the proposed scheme, in particular in respect to the memory allocation policy described. So, I guess we abstain from varargs in the language bindings then?

Other open questions we had from former RPC discussions, and which should be addressed in this spec, are:

- GridRPC takes a config file name on initialization. That config file needs to be user specific IIUC, and there was some discussion, but no conclusion about that. So, what is the approach on that? Is that spec implementable on top of GridRPC, and how? If that is an issue still: our decision was to include the config file URL as (optional) parameter to the CONSTRUCTOR. Does that make sense?

- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).

- Ninf-G allows to bind a handle to multiple calls. I assume that this is hidden in the implementation for now, and has no explicit reflection in the API? I think that is what we decided on anyway...

- should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?

Cheers, Andre.
+-------------------------------------------------------------+
Examples:
=========

  // c++ example
  // call a remote matrix multiplication A = A * B
  // (A and B are assumed to be fixed-size arrays declared by the
  //  caller, so that sizeof yields their size in bytes)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul1");

    std::vector <saga::rpc::parameter> params (2);

    params[0].buffer = A;              // ptr to matrix A
    params[0].size   = sizeof (A);
    params[0].mode   = saga::rpc::InOut;

    params[1].buffer = B;              // ptr to matrix B
    params[1].size   = sizeof (B);
    params[1].mode   = saga::rpc::In;

    rpc.call (params);

    // A now contains the result
  }
  catch ( const saga::exception & e)
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // call a remote matrix multiplication C = A * B
  // (A and B are assumed to be fixed-size arrays declared by the caller)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul2");

    std::vector <saga::rpc::parameter> params (3);

    params[0].buffer = NULL;           // buffer will be created
    params[0].size   = 0;              // buffer will be created
    params[0].mode   = saga::rpc::Out;

    params[1].buffer = A;              // ptr to matrix A
    params[1].size   = sizeof (A);
    params[1].mode   = saga::rpc::InOut;

    params[2].buffer = B;              // ptr to matrix B
    params[2].size   = sizeof (B);
    params[2].mode   = saga::rpc::In;

    rpc.call (params);

    // params[0].buffer now contains the result
  }
  catch ( const saga::exception & e)
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // asynchronous version of A = A * B
  // (A and B are assumed to be fixed-size arrays declared by the caller)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul1");

    std::vector <saga::rpc::parameter> params (2);

    params[0].buffer = A;              // ptr to matrix A
    params[0].size   = sizeof (A);
    params[0].mode   = saga::rpc::InOut;

    params[1].buffer = B;              // ptr to matrix B
    params[1].size   = sizeof (B);
    params[1].mode   = saga::rpc::In;

    saga::task t = rpc.call <saga::task::ASync> (params);

    t.wait ();
    // A now contains the result
  }
  catch ( const saga::exception & e)
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // parameter sweep example from
  // http://ninf.apgrid.org/documents/ng4-manual/examples.html
  //
  // Monte Carlo computation of PI
  //
  try
  {
    std::string uri[NUM_HOSTS];     // initialize...
    long        times;              // trials per host, initialize...
    long        seed [NUM_HOSTS];   // random seeds, must outlive the async calls
    long        count[NUM_HOSTS];   // hits reported by each host
    long        sum = 0;

    std::vector <saga::rpc::rpc> servers;

    // create the rpc handles for all URIs
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      servers.push_back (saga::rpc::rpc (uri[i]));
    }

    // create persistent storage for tasks and parameter structs
    saga::task_container tc;
    std::vector <std::vector <saga::rpc::parameter> > params;

    // fill parameter structs and start async rpc calls
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      std::vector <saga::rpc::parameter> param (3);

      seed[i] = i;                     // use as random seed

      param[0].buffer = &seed[i];
      param[0].size   = sizeof (seed[i]);
      param[0].mode   = saga::rpc::In;

      param[1].buffer = &times;
      param[1].size   = sizeof (times);
      param[1].mode   = saga::rpc::In;

      param[2].buffer = &count[i];
      param[2].size   = sizeof (count[i]);
      param[2].mode   = saga::rpc::Out;

      // start the async calls
      saga::task t = servers[i].call <saga::task::ASync> (param);

      // save the task
      tc.add (t);

      // save the parameter structs
      params.push_back (param);
    }

    // wait for all async calls to finish
    tc.wait (-1, saga::task::All);

    // compute and print pi
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      sum += count[i];
    }

    std::cout << "PI = "
              << 4.0 * ( sum / ((double) times * NUM_HOSTS))
              << std::endl;
  }
  catch ( const saga::exception & e)
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
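The arithmetic at the end of the parameter sweep can be checked locally without any RPC machinery. In the sketch below, `pi_trial` is a hypothetical local stand-in for the remote trial function (its name, signature and random number generator are assumptions, not taken from the Ninf-G example), and `estimate_pi` combines the per-host counts with the same formula the example prints.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

namespace sketch  // hypothetical namespace
{
  // Local stand-in for the remote trial function: counts how many of 'times'
  // pseudo-random points in the unit square fall inside the quarter circle.
  inline long pi_trial (unsigned seed, long times)
  {
    std::mt19937 gen (seed);
    const double scale = 1.0 / 4294967296.0;   // maps 32-bit output to [0, 1)
    long hits = 0;
    for (long k = 0; k < times; ++k)
    {
      double x = gen () * scale;
      double y = gen () * scale;
      if (x * x + y * y <= 1.0) ++hits;
    }
    return hits;
  }

  // Combine the per-host counts: pi ~ 4 * sum / (times * number of hosts).
  inline double estimate_pi (const std::vector<long> & count, long times)
  {
    assert (times > 0 && !count.empty ());

    long sum = 0;   // must start at zero before accumulating
    for (std::size_t i = 0; i < count.size (); ++i) sum += count[i];

    return 4.0 * (sum / (static_cast<double> (times) * count.size ()));
  }
}
```

Each host contributes `times` trials, so the denominator is the total number of points across all hosts; the accumulator has to be zeroed first, otherwise the estimate is undefined.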
+-------------------------------------------------------------+
Notes:
======

References:
-----------

[1] H. Nakada, S. Matsuoka, K. Seymour, J. Dongarra, C. Lee, H. Casanova: "A GridRPC Model and API for End-User Applications", Global Grid Forum Document GFD-R.52.
Comparison to the original GridRPC calls:
-----------------------------------------

initialization:
---------------

- grpc_initialize

  GridRPC: reads the configuration file and initializes the required modules.
  SAGA:    not needed, implicit

- grpc_finalize

  GridRPC: releases any resources being used
  SAGA:    not needed, implicit
handle management:
------------------

- grpc_function_handle_default

  GridRPC: creates a new function handle using the default server. This could be a pre-determined server name or it could be a server that is dynamically chosen by the resource discovery mechanisms of the underlying GridRPC implementation, such as the NetSolve agent.
  SAGA:    default constructor

- grpc_function_handle_init

  GridRPC: creates a new function handle with a server explicitly specified by the user.
  SAGA:    explicit constructor

- grpc_function_handle_destruct

  GridRPC: releases the memory associated with the specified function handle.
  SAGA:    destructor

- grpc_get_handle

  GridRPC: returns the handle corresponding to the given session ID (that is, corresponding to that particular non-blocking request).
  SAGA:    not possible right now. However, status of asynchronous operations can be checked via the corresponding task objects.
call functions:
---------------

- grpc_call

  GridRPC: makes a blocking remote procedure call with a variable number of arguments.
  SAGA:    has no variable number of arguments, this case is covered via the SAGA version of grpc_call_argstack.

- grpc_call_async

  GridRPC: makes a non-blocking remote procedure call with a variable number of arguments.
  SAGA:    done via task model and equivalent to grpc_call_argstack.

- grpc_call_argstack

  GridRPC: makes a blocking call using the argument stack
  SAGA:    call provides a parameter array of variable size

- grpc_call_argstack_async

  GridRPC: makes a non-blocking call using the argument stack.
  SAGA:    done via the task model and call
asynchronous control functions:
-------------------------------

- grpc_probe

  GridRPC: checks whether the asynchronous GridRPC call has completed.
  SAGA:    done via the task model

- grpc_cancel

  GridRPC: cancels the specified asynchronous GridRPC call.
  SAGA:    done via the task model
asynchronous wait functions:
----------------------------

- grpc_wait

  GridRPC: blocks until the specified non-blocking request has completed.
  SAGA:    done via the task model

- grpc_wait_and

  GridRPC: blocks until all of the specified non-blocking requests in a given set have completed.
  SAGA:    done via the task container

- grpc_wait_or

  GridRPC: blocks until any of the specified non-blocking requests in a given set has completed.
  SAGA:    done via the task container

- grpc_wait_all

  GridRPC: blocks until all previously issued non-blocking requests have completed.
  SAGA:    done via the task container

- grpc_wait_any

  GridRPC: blocks until any previously issued non-blocking request has completed.
  SAGA:    done via the task container
error reporting functions:
--------------------------

- grpc_perror

  GridRPC: prints the error string associated with the last GridRPC call.
  SAGA:    exceptions

- grpc_error_string

  GridRPC: returns the error description string, given a numeric error code.
  SAGA:    exceptions

- grpc_get_error

  GridRPC: returns the error code associated with a given non-blocking request.
  SAGA:    exceptions

- grpc_get_last_error

  GridRPC: returns the error code for the last invoked GridRPC call.
  SAGA:    exceptions
+-------------------------------------------------------------+
#endif // SHORT
-- "So much time, so little to do..." -- Garfield
Andre, thanks for your comments. More or less, we agree on this. more comments inlined (why should YOU not be forced to search for them? ;-) BTW: most deficiencies you rightly spotted are parts of the text that we simply inherited from your previous version. Feel free to fix ;-)
I felt free to change some part to adapt it to the spec look and feel (example coding conventions, indentation etc.). I'll convert that version to tex now, and add it to the CVS. Hope that's ok with you.
Actually, I was hoping for this.
There is one conflict: I hope that we can get the SAGA CORE spec into final mailing list call on Monday. So I suggest to NOT move that timeline backwards, but to include rpc already - if either of the final calls meets negative comments we remove it - does that make sense?
There is no conflict. Just, SAGA folks have a few days more to object...
  gridrpc     - scheme - identifying a grid rpc operation
  server.net  - server - server host serving the rpc call
  1234        - port   - contact point for the server
  my_function - name   - name of the remote method to invoke
All elements but the scheme can be empty, which allows the implementation to fall back to some default remote method to invoke (minimal URL: gridrpc:///).
The description of the constructor says that the URL can be NULL, in some languages that will mean 'empty' (e.g. if they have no default args). That would mean that scheme can be empty, too.
what does this mean? is this a problem?
Not that I disagree, but it should be noted that even for RPC calls which require input parameters only, the params must be passed by reference. That implies, in some languages, to track the param memory for async invocations, where that is not the case for async invocations of other languages which have no output parameters. It can't be helped I guess, and should not be an issue really.
fine, can you say this in SIDL, too? ;-)
This argument handling scheme allows efficient (zero-copy) passing of parameters. For 'Out' parameters with a size value of 0, the implementation is required to allocate the data structure and to overwrite the size and buffer fields for the parameter.
It is the responsibility of the application programmer to free this memory I assume? The language bindings MUST prescribe how that memory is allocated, to allow the application to choose the appropriate de-allocation method. Alternatively we would need a 'dealloc' method, which would then require the implementation to alloc and de-alloc the params (and to keep track of the blocks).
yep, we can add a sentence pointing out that this needs to be addressed in the language bindings.
I assume a number of other exceptions would apply as well, such as
  AuthenticationFailed
  AuthorizationFailed
  PermissionDenied
  DoesNotExist (server contacted, but no such call available)
  IncorrectURL

Well, and some more I guess. Question is: NoSuccess is actually reserved as last resort, if no other exception really applies (it's the least specific exception, please have a look at the 'error handling' section in the spec). So, is NoSuccess really needed here, and in what conditions?
this deficiency is nicely inherited from your old text. Can you please embellish this? (nobody really insists on 'NoSuccess')
    Notes:    - see [1] for details
              - by passing an array of variable size, different numbers of parameters can be handled. No special variable argument list handling is required.
We discussed varargs at one of the last GGF, and came to the conclusion that language bindings COULD allow varargs. That does not make sense with the proposed scheme, in particular in respect to the memory allocation policy described. So, I guess we abstain from varargs in the language bindings then?
by this definition, we can avoid varargs. so let's do that?
Other open questions we had from former RPC discussions, and which should be addressed in this spec, are:
- GridRPC takes a config file name on initialization. That config file needs to be user specific IIUC, and there was some discussion, but no conclusion about that. So, what is the approach on that? Is that spec implementable on top of GridRPC, and how? If that is an issue still: our decision was to include the config file URL as (optional) parameter to the CONSTRUCTOR. Does that make sense?
Hidemoto, Yusuke, and I agreed to leave the config file out of the API. We think this is better. (Hidemoto was even harsher with himself who once put the config file into the gridrpc spec ;-)
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
I am afraid, doing so would impose restrictions on the underlying gridrpc implementations. some of them might not meet them, but would still be useful if SAGA could access them. What you really would have in mind is a test "can execute" of the procedure? I am not sure if the constructor could already give a comprehensive answer??? Hidemoto, Yusuke, any ideas ???
- Ninf-G allows to bind a handle to multiple calls. I assume that this is hidden in the implementation for now, and has no explicit reflection in the API? I think that is what we decided on anyway...
looks like
- should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?
hmm, wouldn't the cancel() belong to the task ???

Thilo
-- 
Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Hi, Quoting [Thilo Kielmann] (Aug 10 2006):
Andre,
thanks for your comments. More or less, we agree on this. more comments inlined (why should YOU not be forced to search for them? ;-)
Fair enough :-P
BTW: most deficiencies you rightly spotted are parts of the text that we simply inherited from your previous version. Feel free to fix ;-)
Well, that was the reason why rpc was not included in the spec, yet. If I'd had the ability to fix the stuff myself I might have done it already, but I'm afraid I don't...
I felt free to change some part to adapt it to the spec look and feel (example coding conventions, indentation etc.). I'll convert that version to tex now, and add it to the CVS. Hope that's ok with you.
Actually, I was hoping for this.
hehe, worked...
There is one conflict: I hope that we can get the SAGA CORE spec into final mailing list call on Monday. So I suggest to NOT move that timeline backwards, but to include rpc already - if either of the final calls meets negative comments we remove it - does that make sense?
There is no conflict. Just, SAGA folks have a few days more to object...
*nod*
  gridrpc     - scheme - identifying a grid rpc operation
  server.net  - server - server host serving the rpc call
  1234        - port   - contact point for the server
  my_function - name   - name of the remote method to invoke
All elements but the scheme can be empty, which allows the implementation to fall back to some default remote method to invoke (minimal URL: gridrpc:///).
The description of the constructor says that the URL can be NULL, in some languages that will mean 'empty' (e.g. if they have no default args). That would mean that scheme can be empty, too.
what does this mean? is this a problem?
No, not a problem, but a contradiction: "All elements but the scheme can be empty ..." "An empty URL means ..." I would prefer to allow the URL to be totally empty.
Not that I disagree, but it should be noted that even for RPC calls which require input parameters only, the params must be passed by reference. That implies, in some languages, to track the param memory for async invocations, where that is not the case for async invocations of other languages which have no output parameters. It can't be helped I guess, and should not be an issue really.
fine, can you say this in SIDL, too? ;-)
It is implicitly expressed in SIDL already. We might want to add a verbose note though.
This argument handling scheme allows efficient (zero-copy) passing of parameters. For 'Out' parameters with a size value of 0, the implementation is required to allocate the data structure and to overwrite the size and buffer fields for the parameter.
It is the responsibility of the application programmer to free this memory I assume? The language bindings MUST prescribe how that memory is allocated, to allow the application to choose the appropriate de-allocation method. Alternatively we would need a 'dealloc' method, which would then require the implementation to alloc and de-alloc the params (and to keep track of the blocks).
yep, we can add a sentence pointing out that this needs to be addressed in the language bindings.
I assume a number of other exceptions would apply as well, such as
  AuthenticationFailed
  AuthorizationFailed
  PermissionDenied
  DoesNotExist (server contacted, but no such call available)
  IncorrectURL

Well, and some more I guess. Question is: NoSuccess is actually reserved as last resort, if no other exception really applies (it's the least specific exception, please have a look at the 'error handling' section in the spec). So, is NoSuccess really needed here, and in what conditions?
this deficiency is nicely inherited from your old text. Can you please embellish this? (nobody really insists on 'NoSuccess')
Hmm, someone needs to go through the RPC spec and check what errors are defined there, and map them to SAGA exceptions. Volunteers? ;-)
    Notes:    - see [1] for details
              - by passing an array of variable size, different numbers of parameters can be handled. No special variable argument list handling is required.
We discussed varargs at one of the last GGF, and came to the conclusion that language bindings COULD allow varargs. That does not make sense with the proposed scheme, in particular in respect to the memory allocation policy described. So, I guess we abstain from varargs in the language bindings then?
by this definition, we can avoid varargs. so let's do that?
Ok, fine with me.
Other open questions we had from former RPC discussions, and which should be addressed in this spec, are:
- GridRPC takes a config file name on initialization. That config file needs to be user specific IIUC, and there was some discussion, but no conclusion about that. So, what is the approach on that? Is that spec implementable on top of GridRPC, and how? If that is an issue still: our decision was to include the config file URL as (optional) parameter to the CONSTRUCTOR. Does that make sense?
Hidemoto, Yusuke, and I agreed to leave the config file out of the API. We think this is better. (Hidemoto was even harsher with himself who once put the config file into the gridrpc spec ;-)
hehe - ok, that makes things much easier.
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
I am afraid, doing so would impose restrictions on the underlying gridrpc implementations. some of them might not meet them, but would still be useful if SAGA could access them.
What you really would have in mind is a test "can execute" of the procedure? I am not sure if the constructor could already give a comprehensive answer???
Hidemoto, Yusuke, any ideas ???
- Ninf-G allows to bind a handle to multiple calls. I assume that this is hidden in the implementation for now, and has no explicit reflection in the API? I think that is what we decided on anyway...
looks like
Ok.
- should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?
hmm, wouldn't the cancel() belong to the task ???
No, I don't mean for async operations. For example (C++):

  {
    {
      saga::rpc rpc (url);
      rpc.call (); // BTW: are void calls allowed? i.e. can the param list
                   // be empty as default?
    }
    // the rpc handle is destroyed here.
  }

So, the handle is destroyed - the programmer can, however, not be sure that remote resources are really freed - that's up to the implementation. In previous discussion we hence came up with:

  {
    {
      saga::rpc rpc (url);
      rpc.call (); // BTW: are void calls allowed? i.e. can the param list
                   // be empty as default?
      rpc.destroy (-1);
      // remote resources are now freed
      rpc.call (); // this call would throw an IncorrectState exception
    }
    // the rpc handle is destroyed here.
  }

That cancel() allows to explicitly free remote resources, and can, if necessary, block until that is ensured. To have that ability was actually a request by the RPC folx IIRC, and we added it to several places in the SAGA spec (streams, name space entries). So I guess we should stick to it here?

Cheers, Andre.
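The difference between implicit destruction and an explicit release can be sketched with a purely local handle class. All names below (`rpc_handle`, `incorrect_state`, `calls`) are hypothetical stand-ins; only the shape of `destroy (timeout)` and the IncorrectState behaviour follow the example above, and nothing here talks to a real server.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch  // hypothetical namespace
{
  // stand-in for the SAGA IncorrectState exception
  struct incorrect_state : std::runtime_error
  {
    explicit incorrect_state (const std::string & msg)
      : std::runtime_error (msg) {}
  };

  class rpc_handle
  {
    public:
      explicit rpc_handle (const std::string & url)
        : url_ (url), open_ (true), calls_ (0) {}

      // implicit cleanup: best effort only, the caller gets no guarantee
      ~rpc_handle () { open_ = false; }

      void call ()
      {
        if (!open_) throw incorrect_state ("handle was already destroyed");
        ++calls_;   // a real implementation would invoke the remote procedure
      }

      // explicit release of remote resources;
      // timeout < 0 means "block until the release is complete"
      void destroy (double timeout)
      {
        (void) timeout;   // this sketch has nothing to wait for
        open_ = false;
      }

      int calls () const { return calls_; }

    private:
      std::string url_;
      bool        open_;
      int         calls_;
  };
}
```

The point of the explicit method is that it can report completion (or block until it) while the object is still alive, which a destructor cannot do.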
Thilo
-- "So much time, so little to do..." -- Garfield
Hi,
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
I am afraid, doing so would impose restrictions on the underlying gridrpc implementations. some of them might not meet them, but would still be useful if SAGA could access them.
What you really would have in mind is a test "can execute" of the procedure? I am not sure if the constructor could already give a comprehensive answer???
Hidemoto, Yusuke, any ideas ???
In the case of GridRPC, each implementation is free to decide when the connection to the remote server is made. In the constructor, the server will certainly be assigned for the RPC. The connection for data transfer may be established in the constructor (Ninf-G) or may be done in rpc.call (GridSolve).

-----------------------------------------------------
Yusuke Tanimura <yusuke.tanimura@aist.go.jp>
Grid Technology Research Center,
National Institute of AIST
1-1-1 Umezono, Tsukuba Central 2
Tsukuba City 305-8568, Japan
TEL: +81-29-862-6703 / FAX: +81-29-862-6601
Yusuke, Quoting [Yusuke Tanimura] (Aug 10 2006):
Hi,
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
I am afraid, doing so would impose restrictions on the underlying gridrpc implementations. some of them might not meet them, but would still be useful if SAGA could access them.
What you really would have in mind is a test "can execute" of the procedure? I am not sure if the constructor could already give a comprehensive answer???
Hidemoto, Yusuke, any ideas ???
In case of the GridRPC, when the connection to the remote server is made is free for each implementation. In the constructor, the server will be certainly assigned for the RPC. The connection for data transfer may be established in the constructor (Ninf-G) or may be done in rpc.call (GridSolve).
Right, that is what I remember from last GGF as well. That means effectively that the call() method must be able to throw all exceptions we have on the constructor of the handle.

Now, the question is how do we handle that for other SAGA packages, e.g. for files? Here the constructor implies an open(), and could return DoesNotExist. If SAGA would allow to delay the opening of the file, it would be possible to save a round trip time, and to optimize the implementation - but the drawback is that on _any_ other call (read(), write() etc) we could meet a DoesNotExist exception.

I guess we don't want that, and should limit that semantics to RPC. However, we then end up with an inconsistency in the spec (file constructor behaves differently than rpc constructor), which we need to document explicitly (at least). Does that make sense?

-- "So much time, so little to do..." -- Garfield
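The consequence described above (call() must be able to throw whatever the constructor can throw) can be illustrated with a small mock. This is a sketch under assumptions: `LazyRpc`, `Connector` and the local `DoesNotExist` type are hypothetical, and the `connect_eagerly` flag merely imitates the two behaviours mentioned in the thread (connection in the constructor as in Ninf-G, or in call() as in GridSolve).

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a SAGA DoesNotExist exception.
struct DoesNotExist : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Pretend connector: expected to throw DoesNotExist for unknown endpoints.
using Connector = std::function<void(const std::string&)>;

class LazyRpc {
public:
    LazyRpc(std::string url, Connector connect, bool connect_eagerly)
        : url_(std::move(url)), connect_(std::move(connect)) {
        // Eager variant: errors surface in the constructor.
        if (connect_eagerly) ensure_connected();
    }

    void call() {
        // Lazy variant: first contact may happen here, so call() can
        // raise exactly the same exceptions as the constructor.
        ensure_connected();
        ++calls_;
    }

    int calls() const { return calls_; }

private:
    void ensure_connected() {
        if (!connected_) {
            connect_(url_);     // may throw
            connected_ = true;  // only reached on success
        }
    }

    std::string url_;
    Connector connect_;
    bool connected_ = false;
    int calls_ = 0;
};
```

With the lazy variant, constructing a handle for a non-existing endpoint succeeds and the error only shows up on the first call(), which is the inconsistency with the file constructor discussed above.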
Dear Andre, Yusuke, Hidemoto, dear "groups", now that the SAGA CVS has lost its (recent) memory, let's sort out things via email. (Actually, when I did 'cvs update' yesterday, I wanted to fix this stuff directly in the repository :-( After some thinking, I believe that whatever we write into the RPC section of SAGA, it has to be implementable over the GridRPC spec, as of GFD.52. Otherwise, it wouldn't be "GridRPC" but merely SAGA RPC. While this might have benefits, I believe, for the sake of usefulness, we should stick to GFD.52. (Otherwise we might rule out some popular RPC implementations like Netsolve.) My following comments have to be seen in this light.
In the constructor, the remote procedure to be invoked is specified by a URL, with the syntax:
gridrpc://server.net:1234/my_function
with the elements corresponding to:

  gridrpc     - scheme - identifying a grid rpc operation
  server.net  - server - server host serving the rpc call
  1234        - port   - contact point for the server
  my_function - name   - name of the remote method to invoke
All elements but the scheme can be empty, which allows the implementation to fall back to some default remote method to invoke (minimal URL: gridrpc:///).
The description of the constructor says that the URL can be NULL, in some languages that will mean 'empty' (e.g. if they have no default args). That would mean that scheme can be empty, too.
I think we can allow completely empty URL's, too. (There is an obvious default value for the implementation to fill in, right? ;-)
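To make the URL layout discussed above concrete, here is a small parsing sketch. It is not taken from any SAGA implementation: `RpcUrl` and `parse_rpc_url` are hypothetical names, the port is kept as text, and IPv6 literals or user-info parts are deliberately not handled. A completely empty URL is reported as "not parsed", so that a caller could substitute its own default, as suggested in the thread.

```cpp
#include <string>

// The four elements of gridrpc://server.net:1234/my_function
struct RpcUrl {
    std::string scheme;  // e.g. "gridrpc"
    std::string server;  // may be empty
    std::string port;    // may be empty; kept as text in this sketch
    std::string name;    // may be empty
};

// Returns false if the "scheme://" separator or the '/' that starts
// the function name is missing; 'out' is only meaningful on success.
bool parse_rpc_url(const std::string& url, RpcUrl& out) {
    const std::string sep = "://";
    const std::string::size_type scheme_end = url.find(sep);
    if (scheme_end == std::string::npos) return false;

    const std::string::size_type auth_begin = scheme_end + sep.size();
    const std::string::size_type path_begin = url.find('/', auth_begin);
    if (path_begin == std::string::npos) return false;

    // "server.net:1234", "server.net", or "" for the minimal URL.
    const std::string authority =
        url.substr(auth_begin, path_begin - auth_begin);
    const std::string::size_type colon = authority.find(':');

    RpcUrl parsed;
    parsed.scheme = url.substr(0, scheme_end);
    parsed.server = authority.substr(0, colon);  // npos: whole string
    if (colon != std::string::npos) parsed.port = authority.substr(colon + 1);
    parsed.name = url.substr(path_begin + 1);
    out = parsed;
    return true;
}
```

For the minimal URL `gridrpc:///` the parse succeeds with empty server, port and name, which is the case where an implementation would fall back to its defaults.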
This argument handling scheme allows efficient (zero-copy) passing of parameters. For 'Out' parameters with a size value of 0, the implementation is required to allocate the data structure and to overwrite the size and buffer fields for the parameter.
It is the responsibility of the application programmer to free this memory I assume? Then the language bindings MUST prescribe how that memory is allocated, to allow the application to choose the appropriate de-allocation method. Alternatively we would need a 'dealloc' method, which would then require the implementation to alloc and de-alloc the params (and to keep track of the blocks).
we can add this to the text.
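The allocation question above can be sketched as follows. Everything here is an assumption made for illustration: the `Parameter` struct only mirrors the buffer/size/mode triple from the proposal, `fill_out` plays the role of the implementation returning a result, and the sketch simply decides that implementation-allocated buffers come from `std::malloc`, so that the application can release them with `std::free` (wrapped in `free_param`). A real language binding would have to prescribe this.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

enum class Mode { In, Out, InOut };

// One RPC argument: buffer, size, and direction.
struct Parameter {
    void*       buffer;
    std::size_t size;
    Mode        mode;
};

// Mock implementation side: copy a result into an Out/InOut parameter.
// If the caller passed size == 0, the buffer is allocated here and the
// buffer and size fields are overwritten. Returns false for In
// parameters, on allocation failure, or if a caller buffer is too small.
bool fill_out(Parameter& p, const void* result, std::size_t result_size) {
    if (p.mode == Mode::In) return false;
    if (result_size == 0) return true;  // nothing to deliver
    if (p.size == 0) {
        p.buffer = std::malloc(result_size);
        if (p.buffer == nullptr) return false;
        p.size = result_size;
    } else if (p.size < result_size) {
        return false;
    }
    std::memcpy(p.buffer, result, result_size);
    return true;
}

// Matching de-allocation for buffers allocated by fill_out().
// Must not be used on buffers the application allocated differently.
void free_param(Parameter& p) {
    std::free(p.buffer);
    p.buffer = nullptr;
    p.size = 0;
}
```

The alternative mentioned above, a 'dealloc' method on the implementation side, would correspond to making `free_param` part of the API instead of documenting the allocator.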
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
To my personal taste, I would prefer having the accessibility check in the constructor, enforced. However, this would contradict GFD.52 (see above). So we should have "all useful" exceptions (see below) both for the constructor and the "call" method.
- constructor
    Purpose: inits a remote function handle
    Format:  CONSTRUCTOR (in session session, in string funcname)
    Inputs:  session:  saga session to use
             funcname: name of remote method to initialize
    Outputs: -
I suggest the following mappings of GridRPC errors to SAGA exceptions:

  GRPC_SERVER_NOT_FOUND    IncorrectURL
  GRPC_FUNCTION_NOT_FOUND  IncorrectURL
  GRPC_RPC_REFUSED         AuthorizationFailed  (looks like the closest match)
  GRPC_OTHER_ERROR_CODE    NoSuccess            (default)

GRPC_NO_ERROR and GRPC_NOT_INITIALIZED are not applicable.
- call
    Purpose: call the remote procedure
    Format:  call (inout array<parameter> param);
    Inputs:  -
    In/Out:  param: argument/result values for call
    Outputs: -
    Throws:  NoSuccess: remote operation failed
same as constructor
- should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?
I gave it a thorough (I hope ;-) read and came to the following conclusion:

We have to rely on GFD.52, this means here on grpc_cancel(). grpc_cancel is meant to cancel an asynchronous rpc call. The description in GFD.52 does NOT mention any remote resource de-allocation. This means, prescribing in SAGA some form of remote resource de-allocation would deviate from GridRPC and would likely not be implementable.

This means, all we can do is cancel the SAGA task (possibly with timeout -1). And, if I understand SAGA's task cancel method correctly, it is supposed (or at least allowed) to "clean up resources", which I would translate to calling grpc_cancel underneath.

Summary: no explicit cancel method for SAGA's RPC package

Thilo
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Hi, Quoting [Thilo Kielmann] (Aug 13 2006):
- The RPC spec is silent about 'when' the connection to the remote server is made, on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).
To my personal taste, I would prefer having the accessibility check in the constructor, enforced. However, this would contradict GFD.52 (see above). So we should have "all useful" exceptions (see below) both for the constructor and the "call" method.
We should explicitly describe that this is different to other constructors then.
- constructor
    Purpose: inits a remote function handle
    Format:  CONSTRUCTOR (in session session, in string funcname)
    Inputs:  session:  saga session to use
             funcname: name of remote method to initialize
    Outputs: -
I suggest the following mappings of GridRPC errors to SAGA exceptions:
  GRPC_SERVER_NOT_FOUND    IncorrectURL
  GRPC_FUNCTION_NOT_FOUND  IncorrectURL
  GRPC_RPC_REFUSED         AuthorizationFailed  (looks like the closest match)
  GRPC_OTHER_ERROR_CODE    NoSuccess            (default)
GRPC_NO_ERROR and GRPC_NOT_INITIALIZED are not applicable.
IncorrectURL is reserved for non-parsable URLs, and URLs which cannot be handled because the scheme is unsupported. An invalid URL in the sense that the server does not exist, or points to a non-existing function handle etc. should cause a BadParameter exception, or a DoesNotExist exception (the latter if it fails because the described endpoint does, well, not exist).
- should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?
I gave it a thorough (I hope ;-) read and came to the following conclusion:
We have to rely on GFD.52, this means here on grpc_cancel(). grpc_cancel is meant to cancel an asynchronous rpc call. The description in GFD.52 does NOT mention any remote resource de-allocation. This means, prescribing in SAGA some form of remote resource de-allocation would deviate from GridRPC and would likely not be implementable.
This means, all we can do is cancel the SAGA task (possibly with timeout -1). And, if I understand SAGA's task cancel method correctly, it is supposed (or at least allowed) to "clean up resources", which I would translate to calling grpc_cancel underneath.
Hmm, I am somewhat confused now, as I thought (as said earlier) that the timeout on cancel was a GridRPC requirement. Well, never mind then. Should we consider removing cancel/close timeouts from the spec altogether?

Agree with the other points you made.

Cheers, Andre.
Summary: no explicit cancel method for SAGA's RPC package
Thilo -- "So much time, so little to do..." -- Garfield
On Sun, Aug 13, 2006 at 01:30:06PM +0200, Andre Merzky wrote:
To my personal taste, I would prefer having the accessibility check in the constructor, enforced. However, this would contradict GFD.52 (see above). So we should have "all useful" exceptions (see below) both for the constructor and the "call" method.
We should explicitly describe that this is different to other constructors then.
Yes, no problem.
IncorrectURL is reserved for non-parsable URLs, and URLs which cannot be handled because the scheme is unsupported.
Hmm, if that had been clear from the SAGA spec, I would not have chosen "IncorrectURL" :-( So, what about the following mapping:

  GRPC_SERVER_NOT_FOUND    DoesNotExist
  GRPC_FUNCTION_NOT_FOUND  DoesNotExist
  GRPC_RPC_REFUSED         AuthorizationFailed  (looks like the closest match)
  GRPC_OTHER_ERROR_CODE    NoSuccess            (default)

GRPC_NO_ERROR and GRPC_NOT_INITIALIZED are not applicable.
We have to rely on GFD.52, this means here on grpc_cancel(). grpc_cancel is meant to cancel an asynchronous rpc call. The description in GFD.52 does NOT mention any remote resource de-allocation. This means, prescribing in SAGA some form of remote resource de-allocation would deviate from GridRPC and would likely not be implementable.
This means, all we can do is cancel the SAGA task (possibly with timeout -1). And, if I understand SAGA's task cancel method correctly, it is supposed (or at least allowed) to "clean up resources", which I would translate to calling grpc_cancel underneath.
Hmm, I am somewhat confused now, as I thought (as said earlier) that the timeout on cancel was a GridRPC requirement. Well, never mind then. Should we consider removing cancel/close timeouts from the spec altogether?
Unless I have overlooked something in GFD.52 (GridRPC), then the 'enforced' clean up can not be implemented there.

Are there any other use cases for cancel/close timeouts?? I personally don't see much good in them: either the cancel manages to clean up, or it doesn't even after the timeout. This would only be interesting if the same app would request more of the same remote resources. But still, either the remote resource manages to clean up, or sooner-or-later it will overflow...

Thilo
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Quoting [Thilo Kielmann] (Aug 13 2006):
IncorrectURL is reserved for non-parsable URLs, and URLs which cannot be handled because the scheme is unsupported.
Hmm, if that had been clear from the SAGA spec, I would not have chosen "IncorrectURL" :-(
from the spec:

  A method was invoked with an URL argument which could not be handled. The error specifically indicates that an implementation can not handle the specified protocol, or that access the specified entity via the given protocol is impossible. The exception MUST NOT be used to indicate any other error condition. See also notes to 'The URL Problem' in the introduction.

Can you think of a wording which makes this more explicit? Is it confusing to refer to end point access here?
So, what about the following mapping:
  GRPC_SERVER_NOT_FOUND    DoesNotExist
  GRPC_FUNCTION_NOT_FOUND  DoesNotExist
  GRPC_RPC_REFUSED         AuthorizationFailed  (looks like the closest match)
  GRPC_OTHER_ERROR_CODE    NoSuccess            (default)
GRPC_NO_ERROR and GRPC_NOT_INITIALIZED are not applicable.
Yes, sounds sensible to me.
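The mapping agreed on above could be expressed in a binding roughly as follows. This is a sketch only: `GrpcCode` is a local stand-in for the GridRPC error codes named in the thread (a real binding would take them from the GridRPC headers of the implementation in use), and the exception classes are hypothetical placeholders for the SAGA exceptions of the same names. Routing GRPC_NOT_INITIALIZED to NoSuccess is an assumption of this sketch; the thread only says that code is not applicable.

```cpp
#include <stdexcept>
#include <string>

// Local stand-ins for the GridRPC error codes named in the thread.
enum class GrpcCode {
    NoError,           // GRPC_NO_ERROR
    NotInitialized,    // GRPC_NOT_INITIALIZED
    ServerNotFound,    // GRPC_SERVER_NOT_FOUND
    FunctionNotFound,  // GRPC_FUNCTION_NOT_FOUND
    RpcRefused,        // GRPC_RPC_REFUSED
    OtherErrorCode     // GRPC_OTHER_ERROR_CODE
};

// Hypothetical placeholders for the SAGA exception types.
struct SagaException : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct DoesNotExist : SagaException {
    using SagaException::SagaException;
};
struct AuthorizationFailed : SagaException {
    using SagaException::SagaException;
};
struct NoSuccess : SagaException {
    using SagaException::SagaException;
};

// Translate a GridRPC result into the agreed SAGA exception.
// Returns normally only for NoError.
void throw_for(GrpcCode code, const std::string& what) {
    switch (code) {
        case GrpcCode::NoError:
            return;
        case GrpcCode::ServerNotFound:
        case GrpcCode::FunctionNotFound:
            throw DoesNotExist(what);
        case GrpcCode::RpcRefused:
            throw AuthorizationFailed(what);  // closest match
        default:
            // OtherErrorCode, and (by assumption) NotInitialized.
            throw NoSuccess(what);
    }
}
```

Since the connection may be made either in the constructor or in call(), both would funnel their GridRPC results through the same translation.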
We have to rely on GFD.52, this means here on grpc_cancel(). grpc_cancel is meant to cancel an asynchronous rpc call. The description in GFD.52 does NOT mention any remote resource de-allocation. This means, prescribing in SAGA some form of remote resource de-allocation would deviate from GridRPC and would likely not be implementable.
This means, all we can do is cancel the SAGA task (possibly with timeout -1). And, if I understand SAGA's task cancel method correctly, it is supposed (or at least allowed) to "clean up resources", which I would translate to calling grpc_cancel underneath.
Hmm, I am somewhat confused now, as I thought (as said earlier) that the timeout on cancel was a GridRPC requirement. Well, never mind then. Should we consider removing cancel/close timeouts from the spec altogether?
Unless I have overlooked something in GFD.52 (GridRPC), then the 'enforced' clean up can not be implemented there.
Are there any other use cases for cancel/close timeouts?? I personally don't see much good in them: either the cancel manages to clean up, or it doesn't even after the timeout. This would only be interesting if the same app would request more of the same remote resources. But still, either the remote resource manages to clean up, or sooner-or-later it will overflow...
Apart from handling remote resources, you may want to release locks and handles before the object goes out of scope. That is in particular important for Java ;-), as you have no control over _when_ the garbage collection will destroy your object. So, a close()/cancel() is useful in these cases. However, I am not sure if we do have an explicit use case for the timeout on these calls then. Cheers, Andre..
Thilo -- "So much time, so little to do..." -- Garfield
participants (3): Andre Merzky, Thilo Kielmann, Yusuke Tanimura