Hi Thilo,

Some comments inlined.

Cheers, Andre.

Quoting [Thilo Kielmann] (Aug 09 2006):
Date: Wed, 9 Aug 2006 19:00:49 +0200
From: Thilo Kielmann <kielmann@cs.vu.nl>
To: gridrpc-wg@gridforum.org, saga-rg@gridforum.org
Subject: [gridrpc-wg] Final call: GridRPC API for inclusion in SAGA 1.0
Dear all,
Hidemoto Nakada, Yusuke Tanimura, and I have met and re-worked the pending proposal for a GridRPC module for inclusion in the upcoming SAGA spec. The only thing we have changed is the way in which parameters to an RPC invocation are passed. Now, it is an array of parameters, where a parameter is a struct consisting of buffer, size, and IN/OUT/INOUT mode. We have also modelled examples from the Ninf-G web page with this API. We think that this now has a "look and feel" of GridRPC.
It should have the look & feel of SAGA, and the semantics of GridRPC :-) - but if it closes in on look & feel on both ends, so much the better of course...
Attached please find the proposed text with API and examples.
I felt free to change some parts to adapt it to the spec look and feel (example coding conventions, indentation etc.). I'll convert that version to tex now, and add it to the CVS. Hope that's ok with you.
If this (or something similar) can be agreed upon quickly, then the GridRPC module can be included in the SAGA 1.0 spec. We see this API not as a competitor to the GridRPC API, as published in GFD-R.52. It is rather an alternative, embedded in the SAGA framework, to access existing GridRPC implementations, thus extending their user base.
We are now doing the final call for comments for the two groups involved. If you have comments or questions, please post them TO BOTH MAILING LISTS.
This final call is open until FRIDAY, AUGUST 18th.
There is one conflict: I hope that we can get the SAGA CORE spec into the final mailing list call on Monday. So I suggest we do NOT move that timeline backwards, but include rpc already - if either of the final calls meets negative comments, we remove it - does that make sense? More comments below.
Comments and objections made by then will happily be included in the proposal. By this deadline, we will add this text (with all modifications we got) into the SAGA API document.
Thanks to you all for your contributions.
Thilo Kielmann
+-------------------------------------------------------------+
Summary:
========

GridRPC is one of the few high level APIs defined by the GGF. Including it in the SAGA API benefits both: SAGA gets more complete, and provides a better coverage of its use cases with a single look and feel; and GridRPC gets embedded into a set of other tools of similar scope, which opens it to a potentially wider user community, and ensures its further development.

The RPC package of the SAGA API described here is a one-to-one mapping from the GridRPC standard, equipped with the SAGA look and feel, error conventions, task model etc.
+-------------------------------------------------------------+
Specification:
==============

  package saga version 0.1
  {
    package rpc
    {
      enum io_mode
      {
        In    = 1,  // input parameter
        Out   = 2,  // output parameter
        InOut = 3   // both input and output parameter
      }

      struct parameter
      {
        long         size;    // number of bytes in buffer
        array<byte>  buffer;  // data
        io_mode      mode;    // parameter mode
      }

      class rpc
      {
        CONSTRUCTOR (in    string            funcname  );
        call        (inout array<parameter>  parameters);
      }
    }
  }
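To make the shape of the specification concrete, here is one way it might look in C++. This is only a sketch: the namespace 'sketch', the raw 'void*' buffer, and the 'calls()' and 'funcname()' accessors are assumptions made for illustration, not part of the proposal, and the class performs no remote operation at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Mirrors 'enum io_mode' from the specification.
enum class io_mode { In = 1, Out = 2, InOut = 3 };

// Mirrors 'struct parameter'. Using a raw pointer for array<byte>
// is an assumption; a language binding may choose differently.
struct parameter {
    std::size_t size   = 0;             // number of bytes in buffer
    void*       buffer = nullptr;       // data
    io_mode     mode   = io_mode::In;   // parameter mode
};

// Mirrors 'class rpc'. This stand-in only records the function name
// and counts invocations; nothing is sent over the network.
class rpc {
public:
    explicit rpc(const std::string& funcname) : funcname_(funcname) {}

    void call(std::vector<parameter>& parameters) {
        (void)parameters;  // a real binding would marshal these
        ++calls_;
    }

    const std::string& funcname() const { return funcname_; }
    int calls() const { return calls_; }

private:
    std::string funcname_;
    int         calls_ = 0;
};

}  // namespace sketch
```

A handle is created once and can then be called repeatedly with the same parameter vector, which matches the description of the class in the details section.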
+-------------------------------------------------------------+
#ifndef SHORT
Details:
========

class rpc:
----------
This class represents a remote function handle, which can be called (repeatedly), and returns the result of the respective remote procedure invocation.
The class offers one non trivial constructor, which initialises the remote function handle (see [1] for details). That process may involve connection setup, service discovery etc. The class further offers one method 'call()', which invokes the remote procedure, and returns the respective return data and values.
In the constructor, the remote procedure to be invoked is specified by a URL, with the syntax:

  gridrpc://server.net:1234/my_function

with the elements corresponding to:

  gridrpc     - scheme - identifying a grid rpc operation
  server.net  - server - server host serving the rpc call
  1234        - port   - contact point for the server
  my_function - name   - name of the remote method to invoke
All elements but the scheme can be empty, which allows the implementation to fall back to some default remote method to invoke (minimal URL: gridrpc:///).
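The URL rules above can be sketched as a small parser. The type 'rpc_url' and the function 'parse_rpc_url' are hypothetical helpers, not part of the proposal; the sketch assumes that an empty element simply stays an empty string and that the implementation substitutes its defaults later.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical helper type: the three optional URL elements.
struct rpc_url {
    std::string server;  // empty: implementation default
    std::string port;    // empty: implementation default
    std::string name;    // empty: implementation default
};

// Splits "gridrpc://server:port/name". Every element except the
// scheme may be empty, so the minimal URL "gridrpc:///" is accepted.
inline rpc_url parse_rpc_url(const std::string& url) {
    const std::string prefix = "gridrpc://";
    if (url.compare(0, prefix.size(), prefix) != 0) {
        throw std::invalid_argument("expected scheme 'gridrpc'");
    }

    const std::string rest = url.substr(prefix.size());
    const std::string::size_type slash = rest.find('/');
    const std::string authority = rest.substr(0, slash);

    rpc_url out;
    if (slash != std::string::npos) {
        out.name = rest.substr(slash + 1);
    }

    const std::string::size_type colon = authority.find(':');
    out.server = authority.substr(0, colon);
    if (colon != std::string::npos) {
        out.port = authority.substr(colon + 1);
    }
    return out;
}

}  // namespace sketch
```

For the example URL this yields server 'server.net', port '1234' and name 'my_function'; for 'gridrpc:///' all three fields are empty. A URL without the 'gridrpc' scheme is rejected, which is the strict reading of the paragraph above.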
The description of the constructor says that the URL can be NULL; in some languages that will mean 'empty' (e.g. if they have no default args). That would mean that the scheme can be empty, too.
The argument and return value handling is currently very basic, and reflects the traditional scheme for remote procedure calls: an array of parameters, for each of which the buffer, its size, and the input/output mode are described. On invocation of the 'call' method, the 'mode' value has to be initialized for each parameter; for parameters with mode 'In' or 'InOut', 'size' and 'buffer' must be initialized as well. For parameters with mode 'Out', 'size' might also have the value 0, in which case the 'buffer' is considered to be void, and has to be created (e.g., allocated) by the SAGA system upon arrival of the result data.
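The initialization rules in this paragraph can be written down as a check that an implementation might run before sending a call. The names below are hypothetical, and the sketch assumes that "initialized" means a non-null buffer with a non-zero size; the text itself does not say whether an empty 'In' parameter is legal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

enum class io_mode { In = 1, Out = 2, InOut = 3 };

struct parameter {
    std::size_t size;    // number of bytes in buffer
    void*       buffer;  // data
    io_mode     mode;    // parameter mode
};

// Returns true if 'p' satisfies the rules for invoking call().
inline bool is_valid_for_call(const parameter& p) {
    switch (p.mode) {
    case io_mode::In:
    case io_mode::InOut:
        // size and buffer must both be initialized (assumed: non-empty)
        return p.size > 0 && p.buffer != nullptr;
    case io_mode::Out:
        // size 0 means "buffer is void, allocate it on arrival";
        // otherwise the caller must have provided the buffer
        return p.size == 0 || p.buffer != nullptr;
    }
    return false;  // mode was not set to one of the three values
}

// Checks a whole parameter array, as passed to call().
inline bool all_valid_for_call(const std::vector<parameter>& params) {
    for (const parameter& p : params) {
        if (!is_valid_for_call(p)) return false;
    }
    return true;
}

}  // namespace sketch
```

With this reading, an 'Out' parameter with size 0 and a NULL buffer passes the check, while an 'In' parameter without a buffer does not.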
Not that I disagree, but it should be noted that even for RPC calls which require input parameters only, the params must be passed by reference. That implies, in some languages, tracking the param memory for async invocations, which is not the case for async invocations in other languages which have no output parameters. It can't be helped I guess, and should not be an issue really.
This argument handling scheme allows efficient (zero-copy) passing of parameters. For 'Out' parameters with a size value of 0, the implementation is required to allocate the data structure and to overwrite the size and buffer fields for the parameter.
It is the responsibility of the application programmer to free this memory, I assume? Then the language bindings MUST prescribe how that memory is allocated, to allow the application to choose the appropriate de-allocation method. Alternatively we would need a 'dealloc' method, which would then require the implementation to alloc and de-alloc the params (and to keep track of the blocks).
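To illustrate the first option, here is a sketch of what an implementation might do when result data arrives, under the assumption that a C++ binding prescribes 'std::malloc' so that the application releases the buffer with 'std::free'. The function 'store_result' is hypothetical and stands for the receiving side of call(); it is not part of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

namespace sketch {

enum class io_mode { In = 1, Out = 2, InOut = 3 };

struct parameter {
    std::size_t size   = 0;
    void*       buffer = nullptr;
    io_mode     mode   = io_mode::In;
};

// Stores 'n' bytes of result data into parameter 'p'.
// Assumption: memory for void 'Out' buffers comes from std::malloc,
// so the application is expected to call std::free on it.
inline void store_result(parameter& p, const void* data, std::size_t n) {
    if (p.mode == io_mode::In || n == 0) {
        return;  // inputs are never overwritten; nothing to store
    }
    if (p.mode == io_mode::Out && p.size == 0) {
        // buffer is considered void: allocate and overwrite both fields
        p.buffer = std::malloc(n);
        if (p.buffer == nullptr) throw std::bad_alloc();
        p.size = n;
    }
    if (p.buffer == nullptr) {
        return;  // caller violated the initialization rules
    }
    // never write past the buffer the caller (or we) provided
    const std::size_t count = n < p.size ? n : p.size;
    std::memcpy(p.buffer, data, count);
}

}  // namespace sketch
```

The application then owns the block and frees it with 'std::free (param.buffer)'. With the alternative 'dealloc' method, the free call would move into the implementation instead.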
  - constructor
    Purpose:  inits a remote function handle
    Format:   CONSTRUCTOR (in session session,
                           in string  funcname)
    Inputs:   session:  saga session to use
              funcname: name of remote method to initialize
    Outputs:  -
    Throws:   NoSuccess: server could not be contacted, or method is unknown
I assume a number of other exceptions would apply as well, such as

  AuthenticationFailed
  AuthorizationFailed
  PermissionDenied
  DoesNotExist (server contacted, but no such call available)
  IncorrectURL

Well, and some more I guess. Question is: NoSuccess is actually reserved as last resort, if no other exception really applies (it's the least specific exception, please have a look at the 'error handling' section in the spec). So, is NoSuccess really needed here, and under what conditions?
    Notes:    - see [1] for details
              - if funcname is NULL, a default handle is created
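The exception question raised for the constructor can be illustrated with a small sketch in which specific exceptions are preferred and NoSuccess is thrown only when nothing more specific applies. The exception names are taken from the comment above; the 'failure' enum, the 'error' base class and the 'report' function are hypothetical, and the real SAGA exception hierarchy may differ.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical common base class for the sketch.
struct error : std::runtime_error {
    explicit error(const std::string& what) : std::runtime_error(what) {}
};

struct IncorrectURL         : error { using error::error; };
struct AuthenticationFailed : error { using error::error; };
struct DoesNotExist         : error { using error::error; };
struct NoSuccess            : error { using error::error; };

// Hypothetical failure conditions a backend might detect.
enum class failure { none, bad_url, auth_rejected, unknown_function, other };

// Maps a failure to the most specific exception available.
inline void report(failure f) {
    switch (f) {
    case failure::none:
        return;
    case failure::bad_url:
        throw IncorrectURL("URL could not be parsed");
    case failure::auth_rejected:
        throw AuthenticationFailed("server rejected the credentials");
    case failure::unknown_function:
        throw DoesNotExist("server contacted, but no such call available");
    case failure::other:
        break;
    }
    // last resort: no more specific exception applies
    throw NoSuccess("remote operation failed");
}

}  // namespace sketch
```

In this arrangement an unknown method name surfaces as DoesNotExist, and NoSuccess remains only for failures the backend cannot classify.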
  - call
    Purpose:  call the remote procedure
    Format:   call (inout array<parameter> param);
    Inputs:   -
    In/Out:   param: argument/result values for call
    Outputs:  -
    Throws:   NoSuccess: remote operation failed
Same as above.
    Notes:    - see [1] for details
              - by passing an array of variable size, different numbers of parameters can be handled. No special variable argument list handling is required.
We discussed varargs at one of the last GGFs, and came to the conclusion that language bindings COULD allow varargs. That does not make sense with the proposed scheme, in particular with respect to the memory allocation policy described. So, I guess we abstain from varargs in the language bindings then?

Other open questions we had from former RPC discussions, and which should be addressed in this spec, are:

  - GridRPC takes a config file name on initialization. That config file needs to be user specific IIUC, and there was some discussion, but no conclusion about that. So, what is the approach on that? Is that spec implementable on top of GridRPC, and how? If that is an issue still: our decision was to include the config file URL as (optional) parameter to the CONSTRUCTOR. Does that make sense?

  - The RPC spec is silent about 'when' the connection to the remote server is made: on creation of the handle, or on call(). We decided in other parts of the spec that, for example, the constructor opens a file, or remote connection. I propose to prescribe the same for RPC. Does that make sense? Do we need to loosen the semantics elsewhere in the spec? (IMHO not).

  - Ninf-G allows to bind a handle to multiple calls. I assume that this is hidden in the implementation for now, and has no explicit reflection in the API? I think that is what we decided on anyway...

  - Should we add a 'cancel(in float timeout)'? Explicit resource deallocation was an issue in our discussion at GGF, and we agreed on cancel - is that not needed anymore?

Cheers, Andre.
+-------------------------------------------------------------+
Examples:
=========

  // c++ example
  // call a remote matrix multiplication A = A * B
  // (A, B: pointers to the matrices; size_A, size_B: sizes in bytes)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul1");

    std::vector <saga::rpc::parameter> params (2);

    params[0].buffer = A;       // ptr to matrix A
    params[0].size   = size_A;
    params[0].mode   = saga::rpc::InOut;

    params[1].buffer = B;       // ptr to matrix B
    params[1].size   = size_B;
    params[1].mode   = saga::rpc::In;

    rpc.call (params);

    // A now contains the result
  }
  catch ( const saga::exception & e )
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // call a remote matrix multiplication C = A * B
  // (A, B: pointers to the matrices; size_A, size_B: sizes in bytes)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul2");

    std::vector <saga::rpc::parameter> params (3);

    params[0].buffer = NULL;    // buffer will be created
    params[0].size   = 0;       // buffer will be created
    params[0].mode   = saga::rpc::Out;

    params[1].buffer = A;       // ptr to matrix A
    params[1].size   = size_A;
    params[1].mode   = saga::rpc::InOut;

    params[2].buffer = B;       // ptr to matrix B
    params[2].size   = size_B;
    params[2].mode   = saga::rpc::In;

    rpc.call (params);

    // params[0].buffer now contains the result
  }
  catch ( const saga::exception & e )
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // asynchronous version of A = A * B
  // (A, B: pointers to the matrices; size_A, size_B: sizes in bytes)
  try
  {
    saga::rpc::rpc rpc ("gridrpc://fs0.das2.cs.vu.nl/matmul1");

    std::vector <saga::rpc::parameter> params (2);

    params[0].buffer = A;       // ptr to matrix A
    params[0].size   = size_A;
    params[0].mode   = saga::rpc::InOut;

    params[1].buffer = B;       // ptr to matrix B
    params[1].size   = size_B;
    params[1].mode   = saga::rpc::In;

    saga::task t = rpc.call <saga::task::ASync> (params);

    t.wait ();

    // A now contains the result
  }
  catch ( const saga::exception & e )
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
  // c++ example
  // parameter sweep example from
  // http://ninf.apgrid.org/documents/ng4-manual/examples.html
  //
  // Monte Carlo computation of PI
  //
  try
  {
    std::string uri[NUM_HOSTS];  // initialize...
    long times;                  // trials per host, initialize...
    long seed [NUM_HOSTS];       // one random seed per host
    long count[NUM_HOSTS];       // one result per host
    long sum = 0;

    std::vector <saga::rpc::rpc> servers;

    // create the rpc handles for all URIs
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      servers.push_back (saga::rpc::rpc (uri[i]));
    }

    // create persistent storage for tasks and parameter structs;
    // the structs must stay alive until the async calls finish
    saga::task_container tc;
    std::vector <std::vector <saga::rpc::parameter> > params
        (NUM_HOSTS, std::vector <saga::rpc::parameter> (3));

    // fill parameter structs and start async rpc calls
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      std::vector <saga::rpc::parameter> & param = params[i];

      seed[i] = i;               // use as random seed

      param[0].buffer = &seed[i];
      param[0].size   = sizeof (seed[i]);
      param[0].mode   = saga::rpc::In;

      param[1].buffer = &times;
      param[1].size   = sizeof (times);
      param[1].mode   = saga::rpc::In;

      param[2].buffer = &count[i];
      param[2].size   = sizeof (count[i]);
      param[2].mode   = saga::rpc::Out;

      // start the async call
      saga::task t = servers[i].call <saga::task::ASync> (param);

      // save the task
      tc.add (t);
    }

    // wait for all async calls to finish
    tc.wait (-1, saga::task::All);

    // compute and print pi
    for ( int i = 0; i < NUM_HOSTS; ++i )
    {
      sum += count[i];
    }

    std::cout << "PI = "
              << 4.0 * ( sum / ((double) times * NUM_HOSTS))
              << std::endl;
  }
  catch ( const saga::exception & e )
  {
    std::cerr << "SAGA error: " << e.what () << std::endl;
  }
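The final step of the Monte Carlo example, combining the per-host counts into an estimate of pi, can be checked locally without any RPC. The function name 'estimate_pi' is hypothetical; the sketch assumes every host runs the same number of trials and reports how many of them fell inside the quarter circle.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

namespace sketch {

// counts[i]: hits reported by host i; times: trials run on each host.
// Precondition: counts is not empty and times > 0.
inline double estimate_pi(const std::vector<long>& counts, long times) {
    long sum = 0;  // must start at zero before accumulating
    for (long c : counts) {
        sum += c;
    }
    // convert before dividing so no integer division takes place
    const double trials =
        static_cast<double>(times) * static_cast<double>(counts.size());
    return 4.0 * static_cast<double>(sum) / trials;
}

}  // namespace sketch
```

For instance, two hosts with 1000 trials each that report 785 and 786 hits give 4 * 1571 / 2000 = 3.142.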
+-------------------------------------------------------------+
Notes:
======

References:
-----------

  [1] H. Nakada, S. Matsuoka, K. Seymour, J. Dongarra, C. Lee, H. Casanova: "A GridRPC Model and API for End-User Applications", Global Grid Forum Document GFD-R.52.
Comparison to the original GridRPC calls:
-----------------------------------------

initialization:
---------------

  - grpc_initialize

    GridRPC: reads the configuration file and initializes the required modules.
    SAGA:    not needed, implicit

  - grpc_finalize

    GridRPC: releases any resources being used
    SAGA:    not needed, implicit
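One way "implicit" initialization might work is reference counting on the handles: the first handle initializes the backend and the last one to be destroyed finalizes it. This is only a sketch under that assumption. The 'backend' counters are hypothetical and exist just to make the behaviour observable, no GridRPC library is called, the code is not thread-safe, and copying is disabled to keep it short.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for the GridRPC library state.
struct backend {
    static int& users()       { static int n = 0; return n; }
    static int& initialized() { static int n = 0; return n; }
    static int& finalized()   { static int n = 0; return n; }
};

class rpc {
public:
    explicit rpc(std::string funcname) : funcname_(std::move(funcname)) {
        if (backend::users()++ == 0) {
            ++backend::initialized();  // grpc_initialize would go here
        }
    }

    ~rpc() {
        if (--backend::users() == 0) {
            ++backend::finalized();    // grpc_finalize would go here
        }
    }

    rpc(const rpc&) = delete;
    rpc& operator=(const rpc&) = delete;

    const std::string& funcname() const { return funcname_; }

private:
    std::string funcname_;
};

}  // namespace sketch
```

The application never sees the initialize and finalize steps; they happen as a side effect of creating and destroying handles.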
handle management:
------------------

  - grpc_function_handle_default

    GridRPC: creates a new function handle using the default server. This could be a pre-determined server name or it could be a server that is dynamically chosen by the resource discovery mechanisms of the underlying GridRPC implementation, such as the NetSolve agent.
    SAGA:    default constructor

  - grpc_function_handle_init

    GridRPC: creates a new function handle with a server explicitly specified by the user.
    SAGA:    explicit constructor

  - grpc_function_handle_destruct

    GridRPC: releases the memory associated with the specified function handle.
    SAGA:    destructor

  - grpc_get_handle

    GridRPC: returns the handle corresponding to the given session ID (that is, corresponding to that particular non-blocking request).
    SAGA:    not possible right now. However, status of asynchronous operations can be checked via the corresponding task objects.

call functions:
---------------

  - grpc_call

    GridRPC: makes a blocking remote procedure call with a variable number of arguments.
    SAGA:    has no variable number of arguments; this case is covered via the SAGA version of grpc_call_argstack.

  - grpc_call_async

    GridRPC: makes a non-blocking remote procedure call with a variable number of arguments.
    SAGA:    done via the task model and equivalent to grpc_call_argstack.

  - grpc_call_argstack

    GridRPC: makes a blocking call using the argument stack.
    SAGA:    call provides a parameter array of variable size

  - grpc_call_argstack_async

    GridRPC: makes a non-blocking call using the argument stack.
    SAGA:    done via the task model and call

asynchronous control functions:
-------------------------------

  - grpc_probe

    GridRPC: checks whether the asynchronous GridRPC call has completed.
    SAGA:    done via the task model

  - grpc_cancel

    GridRPC: cancels the specified asynchronous GridRPC call.
    SAGA:    done via the task model

asynchronous wait functions:
----------------------------

  - grpc_wait

    GridRPC: blocks until the specified non-blocking request has completed.
    SAGA:    done via the task model

  - grpc_wait_and

    GridRPC: blocks until all of the specified non-blocking requests in a given set have completed.
    SAGA:    done via the task container

  - grpc_wait_or

    GridRPC: blocks until any of the specified non-blocking requests in a given set has completed.
    SAGA:    done via the task container

  - grpc_wait_all

    GridRPC: blocks until all previously issued non-blocking requests have completed.
    SAGA:    done via the task container

  - grpc_wait_any

    GridRPC: blocks until any previously issued non-blocking request has completed.
    SAGA:    done via the task container
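The mapping of the wait functions onto a task container can be sketched without any grid middleware. Here a task is modelled by a 'probe' callback that returns true once the call has completed, which is the role grpc_probe plays in GridRPC. All names are hypothetical and this is not the SAGA task_container interface; the waits simply poll, where a real implementation would block or sleep.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

namespace sketch {

// Returns true once the underlying call has completed.
using probe_fn = std::function<bool()>;

class task_container {
public:
    // Adds a task and returns its index in the container.
    std::size_t add(probe_fn probe) {
        probes_.push_back(std::move(probe));
        return probes_.size() - 1;
    }

    // Counterpart of grpc_wait_and / grpc_wait_all.
    void wait_all() {
        for (probe_fn& done : probes_) {
            while (!done()) { /* poll */ }
        }
    }

    // Counterpart of grpc_wait_or / grpc_wait_any:
    // returns the index of the first task found to be complete.
    std::size_t wait_any() {
        if (probes_.empty()) {
            throw std::logic_error("wait_any on empty container");
        }
        for (;;) {
            for (std::size_t i = 0; i < probes_.size(); ++i) {
                if (probes_[i]()) return i;
            }
        }
    }

private:
    std::vector<probe_fn> probes_;
};

// Test helper: a fake task that completes after 'polls' probes.
inline probe_fn completes_after(int polls) {
    std::shared_ptr<int> remaining = std::make_shared<int>(polls);
    return [remaining]() -> bool {
        if (*remaining > 0) { --*remaining; return false; }
        return true;
    };
}

}  // namespace sketch
```

With two fake tasks that complete after 5 and 2 probes, wait_any returns the index of the second one, and after wait_all both report completion.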
error reporting functions:
--------------------------

  - grpc_perror

    GridRPC: prints the error string associated with the last GridRPC call.
    SAGA:    exceptions

  - grpc_error_string

    GridRPC: returns the error description string, given a numeric error code.
    SAGA:    exceptions

  - grpc_get_error

    GridRPC: returns the error code associated with a given non-blocking request.
    SAGA:    exceptions

  - grpc_get_last_error

    GridRPC: returns the error code for the last invoked GridRPC call.
    SAGA:    exceptions
+-------------------------------------------------------------+
#endif // SHORT
-- "So much time, so little to do..." -- Garfield