Re: [SAGA-RG] [GRIDCPR-WG] CPR Document - final call

All,
I am not sure what to do with this comment. (And I am not sure if we can do this, formally, at this stage of the process.)
Anyway, just for doing something reasonable, I am in favour of adding this use case.
Thilo
On Thu, Mar 29, 2007 at 03:23:58PM +0200, Eduardo Huedo Cuesta wrote:
From: Eduardo Huedo Cuesta <ehuedo@fdi.ucm.es>
To: Andre Merzky <andre@merzky.net>
Cc: gridcpr-wg@ogf.org, SAGA RG <saga-rg@ogf.org>
Subject: Re: [GRIDCPR-WG] CPR Document - final call
Dear All,
From the GridWay team, we would like to propose another consumer use case that we think is within the scope of GridCPR. See below.
Best Regards,
Eduardo Huedo.
--------------------------------------------------------------------
GridWay Metascheduler
=====================
The GridWay Metascheduler [1, 2], now a Globus project, adapts job execution to changing grid conditions by providing fault recovery mechanisms, dynamic scheduling, migration on request and opportunistic migration [3]. Migration is implemented by restarting the job on the new candidate host, so the job should generate restart files at regular intervals in order to continue execution from a given point. If no checkpoint files are provided, the job is restarted from the beginning. GridWay periodically retrieves the architecture-independent restart files to the client machine or to a checkpoint server (GridFTP URL).
Jobs submitted with GridWay could benefit from GridCPR systems providing standard and uniform APIs and services for portable checkpoint generation and storage.
Functional requirements
- API for application state writing and reading.
- Services for failure notification.
- Services for checkpoint data management.
[1] GridWay Metascheduler. http://www.gridway.org/.
[2] E. Huedo, R. S. Montero and I. M. Llorente: A framework for adaptive execution on grids. Software - Practice and Experience 34(7): 631-651, 2004.
[3] E. Huedo, R. S. Montero and I. M. Llorente: Evaluating the reliability of computational grids from the end user's point of view. Journal of Systems Architecture 52(12): 727-736, 2006.
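To make the restart-file pattern described in the use case concrete, here is a minimal sketch in Python. It is not GridWay or GridCPR code: the function names (load_state, save_state, run_job), the JSON file format and the fail_at test hook are all assumptions made for illustration. It only shows a job that writes an architecture-independent restart file at regular intervals and, when restarted, continues from the last saved point or from the beginning if no file exists.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the saved state, or None if no restart file exists."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def save_state(path, state):
    """Write the restart file atomically so a crash never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)  # JSON text is portable across architectures
    os.replace(tmp, path)    # atomic rename over the previous restart file


def run_job(path, total_steps, checkpoint_every, fail_at=None):
    """Sum the integers 0..total_steps-1, checkpointing at regular intervals.

    fail_at is a hypothetical hook that simulates a host failure
    just before the given step is executed.
    """
    # Resume from the restart file if present, otherwise start from scratch.
    state = load_state(path) or {"step": 0, "acc": 0}
    for step in range(state["step"], total_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated host failure")
        state["acc"] += step
        state["step"] = step + 1
        if state["step"] % checkpoint_every == 0:
            save_state(path, state)
    return state["acc"]
```

In this sketch the work done after the last restart file was written is lost on failure and simply recomputed after the restart. In the use case, a scheduler would copy the restart file to the client machine or a checkpoint server before restarting the job on a new host.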
Andre Merzky escribió:
Hi groups,
as discussed earlier, we put some effort into the GridCPR documents, to get them back into the editor pipeline. Thanks to Nathan and others, both the CPR use case document and the CPR architecture document now have all public comments addressed, and are to be submitted to the OGF editor.
The docs are supposed to represent group consensus after submission, so this mail is a one-week final call on the mailing list: please review the documents, and comment on them! "speak now or forever hold your peace ..." :-)
Cheers, Andre.
------------------------------------------------------------------------
-- gridcpr-wg mailing list gridcpr-wg@ogf.org http://www.ogf.org/mailman/listinfo/gridcpr-wg
--
GridWay, Meta-scheduling Technologies for the Grid! http://www.gridway.org
**************************************************
Dr. Eduardo Huedo Cuesta Departamento de Arquitectura de Computadores y Automática Facultad de Informática Universidad Complutense de Madrid C/ Prof. García Santesmases s/n 28040 Madrid Spain
Tel: +34 91 394 76 03 Fax: +34 91 394 75 27 Email: ehuedo@fdi.ucm.es
**************************************************
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/

I think it depends on what the OGF editor thinks about the next steps: if we can proceed in the pipeline w/o going into public comment, we would not be able to significantly alter the content of the document. If we have to go into public comment again, it does not matter...
I'll ask the editor about the procedure, that will help us to decide.
The use case itself fits, IMHO, perfectly into the scope of GridCPR, and would improve the document (by widening the potential user base).
Cheers, Andre.
-- "So much time, so little to do..." -- Garfield

The OGF editor sees no problem in including the use case w/o triggering another public comment. As Thilo, Nathan and I have already answered in favour of inclusion, I'll go ahead and do that.
Eduardo: many thanks! :)
Cheers, Andre.
-- "So much time, so little to do..." -- Garfield
participants (2)
- Andre Merzky
- Thilo Kielmann