Dear All,
We'll present our current work on data management in GridRPC, mainly
the availability of a library implementing the standard and its performance.
Here is a summary of the talk.
Please do not hesitate to bring use cases so that we can discuss how
this work can be integrated and the benefits it may bring.
*****************************
Title: Transparent collaboration of GridRPC middleware using the OGF
standardized GridRPC data management API
In September 2011, the Open Grid Forum standardized the document "Data
Management API within the GridRPC", which describes an optional API
that extends the GridRPC standard. Used in a GridRPC middleware, it
provides a minimal set of functions to handle a wide range of data
operations, including movement, replication, migration, prefetching
and persistence.
We'll present a library implementing the API that has been integrated
into two different middleware systems, DIET and NINF. We have
conducted several experiments showing the substantial benefits that a
Grid user can expect: 1) in terms of resource usage compared to the
current GridRPC context, since useless transfers are avoided; 2) in
terms of application completion time, so that results are obtained
sooner (data can be prefetched and replicated, which lets computations
be submitted very early in a workflow analysis, in addition to the
possible overlap between computations and communications); 3) in terms
of code portability, since these examples show that the same GridRPC
code can at last be compiled and executed within two different GridRPC
middleware systems that implement the GridRPC data management API;
4) finally, in terms of middleware interoperability, obtained without
the explicit glue that is generally required: we show as a proof of
concept that resources spread across different administrative domains
can be used together without the underlying distributed data
management systems having any knowledge of the workflow and/or
computing resources. Computational servers of DIET and NINF
transparently collaborate on the same computation by sharing GridRPC
data!
The latest results have been published in [1,2], in which we explain
the API and describe the improvements obtained with this
implementation of the standard. Future work will address the
transparent management of protocols such as GridFTP, iRODS and
BitTorrent, the use of data catalogs, and security.
[1] Yves Caniou, Eddy Caron, Gaël Le Mahec, and Hidemoto Nakada.
Standardized Data Management in GridRPC Environments. In 6th
International Conference on Computer Sciences and Convergence
Information Technology, Jeju Island, Korea, November 29 - December 1, 2011.
[2] Yves Caniou, Eddy Caron, Gaël Le Mahec, and Hidemoto Nakada.
Transparent Collaboration of GridRPC Middleware using the OGF
Standardized GridRPC Data Management API. In The International
Symposium on Grids and Clouds (ISGC), February 26 - March 2, 2012.
Proceedings of Science.
--
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
http://graal.ens-lyon.fr/~ycaniou/