There are two ways to initialize a function handle (a short sketch is given after this list).
- first, we can do it with grpc_function_handle_init. In this case, the server is explicitly given by the client. As the data location is known (client, server or data repository), there is no problem binding the data to the server. All data management can be done by the client: placement, transfers or removal. The location will be given in the data management functions.
- second, the function handle can be initialized with grpc_function_handle_default. In that case, the GridRPC API document says: "This default could be a pre-determined server or it could be a server that is dynamically chosen by the resource discovery mechanisms of the underlying GridRPC implementation". Does that mean that the function handle will contain a server reference after the grpc_function_handle_default call? Or does it mean that the function handle will reference a default discovery mechanism (or GridRPC server), while the computational server will be chosen during grpc_call?
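To make the two initialization paths concrete, here is a minimal C sketch using the calls defined in the GridRPC end-user API; the configuration file name and the "serverA" host are assumptions, error checking is omitted, and 'data' and 'result' stand for the application arguments.

  grpc_function_handle_t h_explicit, h_default;

  grpc_initialize("client.conf");            /* implementation-specific configuration file */

  /* 1) the client names the server explicitly: the data location is known,
     so the client can stage its data to "serverA" before grpc_call        */
  grpc_function_handle_init(&h_explicit, "serverA", "kmc");

  /* 2) the implementation chooses the server, either immediately or lazily
     at grpc_call time: the client may not know the location yet           */
  grpc_function_handle_default(&h_default, "kmc");

  grpc_call(&h_explicit, data, &result);     /* server already known */
  grpc_call(&h_default,  data, &result);     /* server possibly chosen only here */

  grpc_function_handle_destruct(&h_explicit);
  grpc_function_handle_destruct(&h_default);
  grpc_finalize();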
Here is a short response to Hidemoto's questions. Hidemoto Nakada wrote:
It is totally implementation dependent, I believe. In theory, you have a better chance of choosing a 'better' server if you delay the selection of the server to the actual invocation time, since at that time you have more information about the invocation, such as the size of the data to be transferred.
Yes, you have a better chance of choosing a better server. But that means that after calling grpc_function_handle_default, you do not have a reference to the server: it will be chosen later. You only have a reference to the platform, and you cannot place any data before calling grpc_call since the server has not been selected. Please see the example further below.
If the function handle contains a reference to a server, then data management can be done in the same way as for grpc_function_handle_init. If the function handle does not reference the computational server, there is no way to know where to place data before issuing grpc_call. This is the way function handles are implemented in the Diet and NetSolve (2.0, any changes?) GridRPC interfaces.
However, we should provide a way to dynamically choose a server... Any comments ?
I cannot understand your concern. Could you explain it by giving examples?
Actually, the problem is to decide whether we always know where the computation will take place or not. If we always know it, then we can use a standard copy function to put the data on the server before computing (the client is then able to manage its data on its own; it does not need more platform support than data handle management). If we do not know where the computation will take place, then we need platform support: we need a way to tell the platform that we want to leave this data inside it, somewhere. This could be a persistency flag or a bind function, it does not matter, but we need it. After the computation, the server needs to know what to do with the data: send it back to the client or leave it on its host?
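As a rough illustration of the two cases, assuming such calls existed (grpc_data_copy, grpc_data_bind, GRPC_PERSISTENT and the "solve" service are hypothetical names, not part of the current GridRPC API):

  /* Case 1: the server is known (grpc_function_handle_init), so the client
     can place the data itself with an ordinary copy before the call       */
  grpc_function_handle_init(&h, "serverA", "solve");
  grpc_data_copy(data, "serverA");          /* hypothetical client-side copy */
  grpc_call(&h, data, &result);

  /* Case 2: the server is unknown (grpc_function_handle_default), so the
     client can only hand the data over to the platform and ask it to keep
     it somewhere until the server is chosen                                */
  grpc_function_handle_default(&h, "solve");
  grpc_data_bind(data, GRPC_PERSISTENT);    /* hypothetical bind / persistency flag */
  grpc_call(&h, data, &result);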
This example is taken from an application running under DIET. This application (kmc) simulates atom deposition on a substrate. To better see the result of the simulation, the data computed by the simulation program are sent to a ray tracing service (povray). To optimize performance, we plan to deploy both applications on two different PC clusters (1 and 2) managed by DIET. The GRPC client will do:

  grpc_initialize();
  grpc_function_handle_default( handleKmc, "kmc" );
  // data preparation
  grpc_call( handleKmc, data, &result );
  // At the time of the grpc_call, the client does not know on which
  // cluster it will execute. However, this is not very important as we
  // just use input data for kmc.
  grpc_function_handle_default( handlePovray, "povray" );
  grpc_call( handlePovray, result, &image );

When the client calls povray, it will not know where its image will be computed, i.e. which povray server will be used, on cluster 1 or 2. If the client gets the result back, there is no problem because it will use result in the call. But if we want to avoid useless transfers of result, we need to leave the result data (persistent) inside the platform and transfer it only if the povray computation is not done on the same cluster, once that server has been chosen. In that case, we need a way to indicate to the kmc server that the result data must be left on the server and not returned to the client. However, before calling (grpc_call) the kmc application, we do not know which cluster will be used, so there is no way to inform it. It is not possible to bind the data to this server; we can only bind it to the platform.
Is that example clearer for you?
Laurent
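As a follow-up to the example above, here is roughly what the client could look like if such a persistency mechanism existed; grpc_data_persist() is a hypothetical call (its name and signature are only illustrative) meaning "leave this output inside the platform and move it only when the next server is chosen":

  grpc_function_handle_t handleKmc, handlePovray;

  grpc_initialize("client.conf");
  grpc_function_handle_default(&handleKmc, "kmc");

  /* hypothetical: ask the platform to keep the kmc output where it is
     computed instead of shipping it back to the client                  */
  grpc_data_persist(&result);

  grpc_call(&handleKmc, data, &result);

  /* result stays inside the platform; it is transferred to the povray
     server (possibly on the other cluster) only once that server is chosen */
  grpc_function_handle_default(&handlePovray, "povray");
  grpc_call(&handlePovray, result, &image);

  grpc_finalize();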