questions about function handles
All,

After the last GridRPC meeting in Seoul, we promised to send a new proposal for data management in GridRPC and an answer to Hidemoto's comments. However, in our discussions we still face a problem with the definition of function handles.

There are two ways to initialize a function handle.

- First, we can do it with grpc_function_handle_init. In this case, the server is explicitly given by the client. As the data location is known (client, server or data repository), there is no problem binding the data to the server. All data management can be done by the client: placement, transfers or removal. The location will be given in the data management functions.

- Second, the function handle can be initialized by grpc_function_handle_default. In that case, the GridRPC API document says: "This default could be a pre-determined server or it could be a server that is dynamically chosen by the resource discovery mechanisms of the underlying GridRPC implementation". Does that mean the function handle will contain a server reference after the grpc_function_handle_default call? Or does it mean that the function handle will reference a default discovery (or GridRPC server) while the computational server is chosen during grpc_call?

If the function handle contains a reference to a server, then data management can be done in the same way as for grpc_function_handle_init. If the function handle does not reference the computational server, there is no way to know where to place data before issuing grpc_call. This is the way function handles are implemented in the Diet and Netsolve (2.0, any changes?) GridRPC interfaces.

However, we should provide a way to dynamically choose a server... Any comments?

L. Philippe & B. DelFabbro

--
Laurent PHILIPPE              http://lifc.univ-fcomte.fr/~philippe
philippe@lifc.univ-fcomte.fr  Laboratoire d'Informatique (LIFC)
tel: (33) 03 81 66 66 54      route de Gray
fax: (33) 03 81 66 64 50      25030 Besancon Cedex - FRANCE
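For concreteness, the two initialization paths discussed above look like this in client code, following the GridRPC draft C API; the server name is hypothetical, and declarations and error handling are omitted as in the examples later in this thread:

  grpc_function_handle_t handle;

  /* First way: the server is explicitly given by the client, so data
     can be placed on that server before the call. */
  grpc_function_handle_init(&handle, "serverA.example.org", "kmc");

  /* Second way: a default server; the open question above is whether
     a concrete server is bound here, or only later, during grpc_call. */
  grpc_function_handle_default(&handle, "kmc");

  grpc_call(&handle, data, &result);
  grpc_function_handle_destruct(&handle);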
Laurent, sorry for my lazy response.

Laurent Philippe wrote:
> ...
> - Second, the function handle can be initialized by grpc_function_handle_default. In that case, the GridRPC API document says: "This default could be a pre-determined server or it could be a server that is dynamically chosen by the resource discovery mechanisms of the underlying GridRPC implementation". Does that mean the function handle will contain a server reference after the grpc_function_handle_default call? Or does it mean that the function handle will reference a default discovery (or GridRPC server) while the computational server is chosen during grpc_call?
It is totally implementation dependent, I believe. In theory, you have a better chance of choosing a 'better' server if you delay selection of the server until the actual invocation time, since at that time you have more information about the invocation, such as the size of the data to be transferred.
> If the function handle contains a reference to a server, then data management can be done in the same way as for grpc_function_handle_init. If the function handle does not reference the computational server, there is no way to know where to place data before issuing grpc_call. This is the way function handles are implemented in the Diet and Netsolve (2.0, any changes?) GridRPC interfaces.
> However, we should provide a way to dynamically choose a server... Any comments?
I cannot understand your concern. Could you explain it by giving examples? -hidemoto
Here is a short response to Hidemoto's questions.

Hidemoto Nakada wrote:
> > ...
> > Does that mean the function handle will contain a server reference after the grpc_function_handle_default call? Or does it mean that the function handle will reference a default discovery (or GridRPC server) while the computational server is chosen during grpc_call?
> It is totally implementation dependent, I believe. In theory, you have a better chance of choosing a 'better' server if you delay selection of the server until the actual invocation time, since at that time you have more information about the invocation, such as the size of the data to be transferred.

Yes, you get a better chance to choose a better server. But that means that after calling grpc_function_handle_default you do not have a reference to the server: it will be chosen later. You just have a reference to the platform, and you cannot place any data before calling grpc_call, since the server has not been selected. Please see the example further down.
> > If the function handle does not reference the computational server, there is no way to know where to place data before issuing grpc_call. ...
> I cannot understand your concern. Could you explain it by giving examples?

Actually, the problem is deciding whether we always know where the computation will take place or not. If we always know it, then we can use a standard copy function to put the data on the server before computing (the client is then able to manage its data on its own; it does not need more platform support than data handle management). If we do not know where the computation will take place, then we need platform support: we need a way to tell the platform that we want to leave this data inside it, somewhere. This could be a persistency flag or a bind function, it does not matter, but we need it. After the computation, the server needs to know what to do with the data: send it back to the client or leave it on its host?
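For illustration, such platform support could look like this; the names grpc_data_t, grpc_data_bind and GRPC_PERSISTENT are purely hypothetical, not part of any GridRPC draft:

  /* Hypothetical sketch: ask the platform (not a particular server)
     to keep the call output, instead of returning it to the client. */
  grpc_data_t result_h;                        /* hypothetical data handle type */
  grpc_data_bind(&result_h, GRPC_PERSISTENT);  /* bind to the platform, wherever the data lands */
  grpc_call(handleKmc, data, &result_h);       /* the output stays inside the platform */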
This example is taken from an application running under DIET. This application (kmc) simulates atom deposition on a substrate. To better see the result of the simulation, the data computed by the simulation program is sent to a ray-tracing service (povray). To optimize performance, we plan to deploy the two applications onto different PC clusters (1 and 2) managed by DIET. The GridRPC client will do:

  grpc_initialize();
  grpc_function_handle_default( handleKmc, "kmc" );
  // data preparation
  grpc_call( handleKmc, data, &result );
  // At the time of the grpc_call, the client does not know on which
  // cluster it will execute. However, this is not very important as we
  // just use input data for kmc.
  grpc_function_handle_default( handlePovray, "povray" );
  grpc_call( handlePovray, result, &image );

When the client calls povray, it will not know where its image will be computed, i.e. which povray server will be used, on cluster 1 or 2. If the client gets the result back, there is no problem, because it will use result in the call. But if we want to avoid useless transfers of result, we need to leave the result data (persistent) inside the platform and transfer it only if the povray computation is not done on the same cluster, once this server has been chosen. In that case, we need a way to indicate to the kmc server that the result data must be left on the server and not returned to the client. However, before calling (grpc_call) the kmc application, we do not know which cluster will be used, so there is no way to inform it. It is not possible to bind the data to this server; we can only bind it to the platform.

Is that example more clear for you?

Laurent
Laurent,

Thank you for answering my question. I think this is going to be a fruitful discussion. Since we will not have enough time at the GGF F2F meeting, we should continue this on the ML.
> ...
> Is that example more clear for you?

Yes, thank you for taking the time for this.
But I think this is essentially because of the dynamic nature of GridRPC, and just defining the semantics of grpc_function_handle_default will not solve the problem. Users will have the freedom to dynamically create a povray handle *AFTER* the kmc process is invoked; even if users explicitly specify a server for povray, there is no way to know it before calling the kmc process.

So I think we should admit this is a very complicated problem and there is no simple answer. In my opinion there are two ways to solve the problem:

- Assuming some 'magical' global data management system behind, define a simple interface.
- Assuming no background support, define a set of explicit data transfer methods and explicit data management (maybe with soft-state lifetime management).

I love the former one, because A) there are several such 'magical' systems actually emerging, like AIST's gfarm, and B) the data transfer method is already defined (or at least on the way to its definition) in another WG and is clearly out of scope of our WG.

comments? -hidemoto
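As one illustration of the second option, explicit data management with soft-state lifetime management might look something like this; every name below is hypothetical:

  /* Hypothetical sketch: explicit placement with a soft-state lifetime,
     renewed by the client, so forgotten data eventually expires. */
  grpc_data_t result_h;
  grpc_data_init(&result_h);                   /* obtain a platform-wide handle */
  grpc_data_set_lifetime(&result_h, 3600);     /* seconds; expires unless renewed */
  grpc_call(handleKmc, data, &result_h);       /* output is left inside the platform */
  grpc_call(handlePovray, &result_h, &image);  /* platform transfers it only if needed */
  grpc_data_destroy(&result_h);                /* explicit early release */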
Hidemoto,

Hidemoto Nakada wrote:
> Thank you for answering my question. I think this is going to be a fruitful discussion. Since we will not have enough time at the GGF F2F meeting, we should continue this on the ML.
> ... Yes, thank you for taking the time for this.
> But I think this is essentially because of the dynamic nature of GridRPC, and just defining the semantics of grpc_function_handle_default will not solve the problem.

I agree. This was just to make sure of the possible semantics of grpc_function_handle_default, as it was not clear to me in the document. I had to check that I was not wrong about this dynamic choice of the server, and that this is not a "particular" feature.

> Users will have the freedom to dynamically create a povray handle *AFTER* the kmc process is invoked; even if users explicitly specify a server for povray, there is no way to know it before calling the kmc process.

You are right. Now, assume that we have several KMC servers. This is true: these physicists want to test several parameter sets on their KMC code. As one execution takes several days, they test several parameter sets in parallel and so use several KMC servers. When they want to submit a new parameter set, they call the GridRPC platform, and the platform chooses the best available server at the time of the call. So the client does not know at that time which server it will use, and it is not able to indicate where to leave the generated data before calling povray. For these reasons, we have to express where the data comes from (in the case of input data) and what to do with the data after the computation (for output data).

> So I think we should admit this is a very complicated problem and there is no simple answer. In my opinion there are two ways to solve the problem:
> - Assuming some 'magical' global data management system behind, define a simple interface.
> - Assuming no background support, define a set of explicit data transfer methods and explicit data management (maybe with soft-state lifetime management).

I think the two solutions are not at the same level, but they both need functions to access them. I think we should first complete the GridRPC interface with data handles and data management functions, without assuming anything about the way they are implemented. By just defining functions, we do not assume anything about the underlying support. These functions may be used to access/interface some 'magical' global data management system as well as a background support; this will depend on the platform. By defining data management functions in GridRPC, we will provide a homogeneous and complete API to clients.
> I love the former one, because
> A) there are several such 'magical' systems actually emerging, like AIST's gfarm,
> B) the data transfer method is already defined (or at least on the way to its definition) in another WG and is clearly out of scope of our WG.
Yes, we should not rewrite a data transfer service; we must just interface it with GridRPC. We also have some kind of magical system in DIET, called DTM. However, the work done in the GridRPC WG to normalize access to servers and define a common API for our platforms could also be done for data... If we agree on that, an API only, not a service, then we should start to discuss what data structures and what functions to put in this API.

Comments?

Laurent
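To seed that discussion, here is one possible shape for such an API; every name and signature below is hypothetical, and nothing here is in the GridRPC specification:

  /* Hypothetical data-management extension of the GridRPC C API. */
  typedef struct grpc_data_s grpc_data_t;  /* opaque, platform-wide data handle */

  grpc_error_t grpc_data_init(grpc_data_t *data, const char *uri);          /* identify data */
  grpc_error_t grpc_data_transfer(grpc_data_t *data, const char *dest_uri); /* explicit move */
  grpc_error_t grpc_data_persistent(grpc_data_t *data, int flag);           /* leave inside the platform */
  grpc_error_t grpc_data_destroy(grpc_data_t *data);                        /* release/remove */

Each function could be mapped onto whatever the underlying platform provides (DTM in DIET, a global file system elsewhere), which is exactly the point of defining only the API.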
Laurent,

Could you (and/or some other people from your team) join the WG at GGF 14 and give us a presentation on the topic?
> I think we should first complete the GridRPC interface with data handles and data management functions, without assuming anything about the way they are implemented. By just defining functions, we do not assume anything about the underlying support. ...
To define an interface, we do not have to assume specific underlying mechanisms, but we do have to assume underlying functionality. For example, if we can assume that the underlying system has an automatic data-caching and expiration mechanism, we do not have to have an API to control such things. So, I believe, we have to agree on the 'richness' of the underlying system to define an API. -hidemoto
Hidemoto,

Hidemoto Nakada wrote:
> Could you (and/or some other people from your team) join the WG at GGF 14 and give us a presentation on the topic?

Sorry, nobody from the team will join this GGF session. We must continue our discussion by mail until GGF 15.
> To define an interface, we do not have to assume specific underlying mechanisms, but we do have to assume underlying functionality. For example, if we can assume that the underlying system has an automatic data-caching and expiration mechanism, we do not have to have an API to control such things.

Yes, we do need it, at least to optimize data transfers: only the client knows whether the data must be left in the platform. We may assume an underlying system with an automatic data-caching and expiration mechanism, but with no data management support all the data (at least the output data) will be sent back to the client after a computation request. In the case of a request sequence the transfer may be useless; see the KMC example. For that reason, it could be better to provide the data management interface to the client. However, you are right when you say:
> So, I believe, we have to agree on the 'richness' of the underlying system to define an API.
For me, we can assume that the underlying system supports at least data identification and data transfers (based on already existing mechanisms; we do not have to rewrite them). Then we may have different 'richness' levels: with or without data persistency.

Laurent
Hello,
> > Could you (and/or some other people from your team) join the WG at GGF 14 and give us a presentation on the topic?
> Sorry, nobody from the team will join this GGF session. We must continue our discussion by mail until GGF 15.
This is really unfortunate. If you have something to put on the table, please send it to this mailing list. We'll discuss it at the F2F meeting and report back to you on the discussion.
> > To define an interface, we do not have to assume specific underlying mechanisms, but we do have to assume underlying functionality. ...
> Yes, we do need it, at least to optimize data transfers: only the client knows whether the data must be left in the platform. ... In the case of a request sequence the transfer may be useless; see the KMC example.
I do not think it is always necessary, if we can assume a really good backend system. Let us assume a real grid-wide file system that is capable of automatic caching and expiration. In your KMC example, all we need to do is let the povray server(s) know the reference of the data. The data itself might be on the KMC server, or elsewhere on the grid; there might even be several copies of the data. The povray server will access the data through the reference, and the grid file system will take care of dereferencing the data and of the data transfer for you.
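For instance, the client might then pass only a reference; the gfs:// scheme below is hypothetical:

  /* Hypothetical sketch: only a reference travels through GridRPC;
     the grid file system dereferences and moves the bytes. */
  char result_ref[] = "gfs://grid/kmc/run42/result";
  grpc_call(handleKmc, data, result_ref);       /* kmc stores its output under the reference */
  grpc_call(handlePovray, result_ref, &image);  /* povray reads it from wherever it runs */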
> For me, we can assume that the underlying system supports at least data identification and data transfers (based on already existing mechanisms; we do not have to rewrite them). Then we may have different 'richness' levels: with or without data persistency.
Yes; the problem is which level is suitable for the GridRPC API to assume. Let me draft the levels:

  level 0: data identification and data transfer
  level 1: global file system without caching and expiration
  level 2: global file system with caching and expiration

comments? -hidemoto
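To see what each level implies for the client, the same KMC sequence could be written as follows; all data-management names and the gfs:// scheme are hypothetical:

  /* level 0: the client identifies data and drives every transfer itself. */
  grpc_data_init(&d, "gfs://client/kmc/input");
  grpc_data_transfer(&d, "gfs://cluster1/");
  grpc_call(handleKmc, &d, result_ref);

  /* level 1: the global file system dereferences for the servers, but
     the client still removes data explicitly when it is done. */
  grpc_call(handleKmc, "gfs://grid/kmc/input", result_ref);
  grpc_data_destroy_ref(result_ref);

  /* level 2: caching and expiration are automatic; the client only
     passes references around and never cleans up. */
  grpc_call(handleKmc, "gfs://grid/kmc/input", result_ref);
  grpc_call(handlePovray, result_ref, &image);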