"GES Strawman" document available
Dear all,

I have just uploaded a new document into the "Input Documents/Execution Service" folder:

http://forge.ogf.org/sf/go/doc15590?nav=1

This document is a somewhat polished version of the notes taken during the pre-PGI Geneva meeting. It is basically a "wish list" for what we called the "Geneva Execution Service (GES)". The name reflects that, in its current state, the document covers the requirements that emerged during the Geneva meeting. Please note that the document contains some odd artifacts introduced by the OpenOffice->Word conversion. Sorry for that; I hope we will be able to fix them in the next revisions.

We now ask people to have a look at the document, comment on it, and also propose additional requirements arising from their middlewares and use cases. The document will thus eventually evolve into a requirements document for the "PGI Execution Service" (and no longer the GES). Note that at this point we would like to concentrate on requirements only. How these requirements can be mapped onto existing or brand-new specifications/profiles will be the next step, after the requirements have been finalized.

We propose to have a teleconference on Wednesday, April 8th at 16:00 CET (note that Europe has now also entered daylight saving time, so we are back in sync). The agenda and call-in details will be circulated next week. Meanwhile, feel free to discuss the document on the mailing list.

Moreno.

--
Moreno Marzolla
INFN Sezione di Padova, via Marzolo 8, 35131 PADOVA, Italy
EMail: moreno.marzolla@pd.infn.it   Phone: +39 049 8277103
WWW: http://www.dsi.unive.it/~marzolla   Fax: +39 049 8756233
Moreno,

Concerning the work on DATA STAGING inside OGF PGI, I have carefully read:

- The GROMACS use case at http://forge.gridforum.org/sf/go/doc15580?nav=1
  From my point of view, it contains 2 different use cases of Input Data Staging.

- The section about 'Data Staging' of the 'Execution Service Strawman' document at http://forge.gridforum.org/sf/go/doc15590?nav=1
  I am afraid that it contains implicit assumptions which are too restrictive about:
  - the permanent location of a file,
  - the destination of Input Data Staging.

So I would like to clarify the following concepts:

Permanent location of a file
----------------------------
As far as I know, from the point of view of the grid Computing Resource, a permanent file can be stored on 5 different types of location:

CLIENT: A location accessible by the Client (grid User or Workflow Engine), but NOT by Grid Services.
        This requires Data Staging managed by the Client.

WEB: A location accessible by anybody, provided they have adequate credentials.
     In this case, Data Staging SHOULD be managed by the Client. If not, the Client must transmit (beware of security issues) or delegate the credentials to a Grid Service.

TAPE: A tape on a grid Storage Resource.
      This requires Data Staging (which can be managed by the Client or by a Grid Service).

DISK: A disk on a grid Storage Resource.
      If the network bandwidth is high enough, and if the file is accessed only sequentially, this does NOT require Data Staging. But the Client may prefer to perform Data Staging anyway.

LIST: A list of grid Storage Resources (storing replicas with identical content). The Client does NOT know in advance on which medium (tape or disk) each Storage Resource stores its replica.

According to best practices, a disk on the Computing Resource should NOT be a permanent location for a file.
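For concreteness, the 5 location types above, and the rule for when Data Staging is unavoidable, can be sketched in Python. This is a minimal illustration of the taxonomy as I describe it; all names (`LocationType`, `staging_required`) are mine, not from any PGI document.

```python
from enum import Enum

class LocationType(Enum):
    """The 5 permanent-location types listed above (illustrative names)."""
    CLIENT = "client"  # accessible only by the Client -> Client-managed staging
    WEB = "web"        # accessible with credentials -> staging preferably by the Client
    TAPE = "tape"      # tape on a grid Storage Resource -> staging always required
    DISK = "disk"      # disk on a grid Storage Resource -> staging sometimes avoidable
    LIST = "list"      # replica list; backing medium (tape or disk) unknown in advance

def staging_required(loc, sequential_only=True, high_bandwidth=True):
    """Return True when Data Staging is unavoidable for a file at 'loc'."""
    if loc is LocationType.DISK:
        # Per the DISK rule above: staging is avoidable only for purely
        # sequential access over a sufficiently fast network.
        return not (sequential_only and high_bandwidth)
    # CLIENT, WEB, TAPE and LIST all require staging. For LIST in particular,
    # the Client cannot know whether the chosen replica sits on tape,
    # so staging must be assumed necessary.
    return True
```

Note that only DISK ever avoids staging, which is consistent with the remark that a disk on the Computing Resource should not be a permanent location.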
Destination of INPUT Data Staging
---------------------------------
As far as I know, Input Data Staging can be performed towards 3 different types of Storage Resources:

- Disk on a (possibly remote) grid Storage Resource (from tape). This permits remote sequential reads.
- Disk on a close grid Storage Resource (from tape or from a remote Storage Resource). This permits close sequential reads.
- Local disk of the Computing Resource. This permits quick sequential or random reads and writes.

Destination of OUTPUT Data Staging
----------------------------------
As far as I know, Output Data Staging can be performed from the local disk of the Computing Resource to 4 different types of Storage Resources:

- A grid Storage Resource (the decision to choose disk or tape for permanent storage of the file is OUTSIDE the scope of PGI).
- LIST (see definition above).
- WEB (see definition above).
- CLIENT (see definition above).

USE CASES
---------
The possibilities listed above lead to numerous different use cases. I describe below 4 canonical use cases (of course, any combination is possible):

Pre and Post Staging by the Client (EGEE standard use case)
-----------------------------------------------------------
1) Only if the input files are NOT already on the same Storage Resource, the Client transfers the input files to disks on the same grid Storage Resource (pre stage in).
2) The Client submits a Job (with the grid locations of input and output files, all on the same grid Storage Resource) to a Computing Resource close to the Storage Resource.
3) The Job sequentially reads from and writes to files on the close Storage Resource, and uses the local disk only for temporary files.
4) The Client receives notification that the Job is finished.
5) Only if necessary, the Client pulls the desired output files from the Storage Resource (post stage out).

I appreciate this use case because it is very simple, and it does NOT REQUIRE any staging at all inside a well designed global workflow (like a Unix pipe).
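The client-managed use case above can be sketched as a toy simulation, with the Storage Resource modelled as a plain dictionary. Every name here is illustrative only; nothing in this sketch comes from a PGI or EGEE specification.

```python
def run_client_staged_job(job, client_files, storage, run):
    """Client-managed pre/post staging around a job, modelled as a callable.

    'storage' stands in for the close grid Storage Resource; 'run' stands in
    for the Job executing on a nearby Computing Resource.
    """
    # Step 1: pre stage in -- transfer only the inputs NOT already on the
    # Storage Resource.
    for name in job["inputs"]:
        if name not in storage:
            storage[name] = client_files[name]
    # Steps 2-3: the Job reads its inputs from and writes its outputs to the
    # close Storage Resource (local disk would hold only temporary files).
    results = run({name: storage[name] for name in job["inputs"]})
    storage.update(results)
    # Steps 4-5: on the 'finished' notification, post stage out only the
    # output files the Client actually wants.
    return {name: storage[name] for name in job["wanted_outputs"]}

# Usage: a job that concatenates two inputs; a.dat is already staged,
# b.dat must be pre-staged by the Client.
storage = {"a.dat": "AAA"}
client_files = {"a.dat": "AAA", "b.dat": "BBB"}
job = {"inputs": ["a.dat", "b.dat"], "wanted_outputs": ["out.dat"]}
out = run_client_staged_job(job, client_files, storage,
                            run=lambda ins: {"out.dat": ins["a.dat"] + ins["b.dat"]})
```

The conditional in step 1 is the point of the use case: inside a well designed workflow, where each Job's outputs already sit on the Storage Resource the next Job reads from, no transfer happens at all.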
Pre and Post Staging in a STORAGE Resource by the Execution Service
-------------------------------------------------------------------
1) The Client submits a Job (with the grid locations of input and output files) to the Execution Service.
2) Only if the input files are NOT already on the same Storage Resource, the Execution Service transfers the input files to disks on the same grid Storage Resource (pre stage in).
3) The Execution Service sends the Job for execution to a Computing Resource close to the Storage Resource.
4) The Job sequentially reads from and writes to files on the close Storage Resource, and uses the local disk only for temporary files.
5) The Execution Service receives (from the Computing Resource) the notification that the Job has finished on the Computing Resource.
6) Only if necessary, the Execution Service transfers the desired output files from the close Storage Resource to their final grid locations (post stage out).
7) The Client receives notification that the Job is finished.

If the Client is a grid User, they will appreciate this use case, because it is very simple for them, but it requires complex processing by the Execution Service.

Just In Time Staging in the COMPUTING Resource by the Execution Service
-----------------------------------------------------------------------
1) The Client submits a Job (with the grid locations of input and output files) to the Execution Service.
2) The Execution Service sends the Job with a 'do not start' attribute to a Computing Resource.
3) The Execution Service receives (from the Computing Resource) the locations of the input and output files inside the Computing Resource.
4) The Execution Service transfers the input files to their locations inside the Computing Resource (just in time stage in).
5) The Execution Service starts the Job.
6) The Job sequentially or randomly reads from and writes to files only on its local disk.
7) The Execution Service receives (from the Computing Resource) the notification that the Job has finished on the Computing Resource.
8) The Execution Service transfers the output files from the Computing Resource to their final grid locations (just in time stage out).
9) The Client receives notification that the Job is finished.

This use case requires some processing by the Execution Service, but it is the best one when the Job really needs random access to files.

Just In Time Staging in the COMPUTING Resource by the Client
------------------------------------------------------------
The Client:
1) submits a Job with a 'do not start' attribute,
2) receives the locations of the input and output files inside the Computing Resource,
3) pushes the input files to their locations inside the Computing Resource (just in time stage in),
4) starts the Job,
5) receives notification that the Job is finished,
6) pulls the output files from the Computing Resource (just in time stage out),
7) purges the Job (from the Computing Resource).

I personally do NOT appreciate this use case, because it is complicated, and the Computing Resource must keep the (possibly huge) output files until they are pulled by the Client.

Can you please:
- check whether the above concepts are relevant, and whether the associated lists are complete,
- check whether the above use cases are relevant, and whether there could be completely different use cases,
- propose, from the above concepts and use cases, which possibilities the PGI Working Group should take into account, and which should be considered out of scope?
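As an aside, the 'just in time' sequence above (Execution Service variant) can be sketched as a toy simulation, with the 'do not start' attribute modelled as submitting the Job held. All names here are illustrative; none of them come from any PGI document.

```python
class ToyComputingResource:
    """Minimal stand-in for a Computing Resource."""
    def __init__(self):
        self.local_disk = {}

    def submit_held(self, job):
        # Steps 2-3: accept the Job with a 'do not start' attribute and report
        # the local-disk locations its input and output files will use.
        self.job = job
        return {name: "/scratch/" + name
                for name in job["inputs"] + job["outputs"]}

    def start(self, run):
        # Steps 5-7: run the Job against the local disk only, then 'notify'.
        self.local_disk.update(run(self.local_disk))

def jit_staging(job, grid_storage, run):
    cr = ToyComputingResource()
    locations = cr.submit_held(job)       # Job held; locations now known
    for name in job["inputs"]:            # Step 4: just-in-time stage in
        cr.local_disk[name] = grid_storage[name]
    cr.start(run)                         # Steps 5-7
    for name in job["outputs"]:           # Step 8: just-in-time stage out
        grid_storage[name] = cr.local_disk[name]
    return locations

# Usage: stage in one file, run a job that upper-cases it, stage the result out.
grid = {"in.dat": "abc"}
job = {"inputs": ["in.dat"], "outputs": ["out.dat"]}
jit_staging(job, grid, run=lambda disk: {"out.dat": disk["in.dat"].upper()})
```

In the Client-driven variant, the Client itself would call `submit_held`, push and pull the files, and finally purge the Job; the sketch makes visible why that variant forces the Computing Resource to hold the output files until the Client pulls them.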
Thank you very much in advance.

Best regards,

----------------------------------
Etienne URBAH
IN2P3 - LAL, Bat 200
91898 ORSAY, France
Tel: +33 1 64 46 84 87
Mob: +33 6 22 30 53 27
Skype: etienne.urbah
mailto:urbah@lal.in2p3.fr
----------------------------------
participants (3)
- Etienne URBAH
- Moreno Marzolla
- Steven Newhouse