Next teleconference: tomorrow, wednesday july 15th

Dear all,
the next PGI teleconference will be held tomorrow, July 15th, at 16:00 CET (duration: 1 hour).
Call-in details are as follows:
via Skype: call +9900827049931906 (free of charge)
ordinary phone numbers (local rates) with the 9931906 conference number:
Austria      0820 401 15470
Belgium      0703 57 134
France       0826 109 071
Germany      +49 (0) 180 500 9527
Switzerland  0848 560 397
The agenda is similar to the last call (in particular, tomorrow I would like us to spend a few minutes on a status update of the security discussion):
1) State model
2) Status update on security
3) AOB
Feel free to propose additional topics for discussion.
Moreno.
--
Moreno Marzolla
INFN Sezione di Padova, via Marzolo 8, 35131 PADOVA, Italy
EMail: moreno.marzolla@pd.infn.it    Phone: +39 049 8277103
WWW  : http://www.dsi.unive.it/~marzolla    Fax  : +39 049 8756233

Moreno,
Once we figure out what we need I can start the process to get a new draft of BES out.
Besides the state model - where we seem to have reached a consensus, and the "vector" operations, there is also the issue of moving towards GLUE2 as the attribute schema which has been discussed several times. We will need to be specific as to what part of GLUE2, and how to get the info from endpoints (we could keep the current mechanism).
Finally, there is JSDL. It would be good if we could synchronize the transition to GLUE2 with the JSDL group. I am cc'ing the JSDL group as they may be able to better answer the question of JSDL and GLUE2 congruence (if there are any such plans.)
A
-----Original Message----- From: pgi-wg-bounces@ogf.org [mailto:pgi-wg-bounces@ogf.org] On Behalf Of Moreno Marzolla Sent: Tuesday, July 14, 2009 5:40 AM To: pgi-wg@ogf.org Subject: [Pgi-wg] Next teleconference: tomorrow, wednesday july 15th
_______________________________________________ Pgi-wg mailing list Pgi-wg@ogf.org http://www.ogf.org/mailman/listinfo/pgi-wg

Hi Andrew,
The most important aspects of GLUE2 for this activity are the GLUE execution environment and the GLUE application environment. They describe the hardware and software properties of the cluster respectively, and the Job Description should mirror this information.
Currently, this information is propagated around the Grid infrastructure via a parallel mechanism. Whether this needs to be part of BES or if there is an information service which must accompany BES is a matter for debate. Within the current EGEE/WLCG infrastructure there are no plans for the foreseeable future to move away from the current mechanism, so exposing this information via BES is not a requirement for it to be deployed on the infrastructure.
One of the problems with getting the information from BES is that this is then a service specific mechanism and other services such as the SRM, etc. will have a different method. Information system interoperation is indeed a problem which we must solve but we need to deal with things in manageable chunks. I would consider this problem to be a general Grid interoperability problem rather than something specific to BES, so we should try not to get too distracted from the good progress that has been made so far. The definition of GLUE 2.0 has solved more than half of the information system interoperability problem. As the information content is standardized, moving it around and translating from one data format to another is just a technical plumbing exercise. Having something standardized here would certainly help but it is probably out of scope for the BES discussion.
Anyway, my personal opinion on the information system standards is that a good one already exists, X.500 :)
Laurence

Laurence,
First, as you can tell, I am not a GLUE2 expert - nor am I keeping up with JSDL 2.0. :-) That said, as to your comment:
"One of the problems with getting the information from BES is that this is then a service specific mechanism and other services such as the SRM, etc. will have a different method. Information system interoperation is indeed a problem which we must solve but we need to deal with things in manageable chunks."
I agree completely. During the BES discussion we came to an impasse over this: some argued that we could use WS-RF resource properties ... and then have a single mechanism for all types of resources. Others, including but not limited to Microsoft, would have nothing to do with WS-RF. In the end, to get consensus, the WG decided on a separate function - very ugly. We in Genesis II support both the WS-RF mechanism and the OGSA-BES mechanism. The same thing, by the way, happened over notification, except in the end the WG basically punted.
I personally think that the BES endpoint should provide a mechanism to get the information, but that the spec should be mute on how that information is aggregated or used.
A
-----Original Message----- From: Laurence Field [mailto:Laurence.Field@cern.ch] Sent: Tuesday, July 14, 2009 3:22 PM To: Andrew Grimshaw Cc: Moreno Marzolla; pgi-wg@ogf.org; jsdl-wg@ogf.org Subject: Re: [Pgi-wg] Next teleconference: tomorrow, wednesday july 15th

Hi Andrew,
With the diverse types of services that we deal with in our infrastructure, I can't imagine a situation where they have all implemented an interface using the same technology. This is due to many factors including but not limited to: legacy, time scales, priorities, ideologies, trends, fads etc. However, we have to somehow link all these services together, which is why I believe that a parallel system is the most flexible option. If an agreed information interface emerges, the existing interfaces could be extended to provide this, but the only advantage I see is aesthetics rather than function.
Having said that, one of the advantages that I would see by having this added to BES is that developers of the interface would also have to worry about providing the information, which would save us the trouble :) We could then create a simple adaptor to extract the information and pull it into the parallel information system. In order to achieve this, a simple interface such as an XML document would suffice. Examples of such documents can be found on the GLUE 2.0 wiki page.
http://forge.gridforum.org/sf/wiki/do/viewPage/projects.glue-wg/wiki/GLUE2XMLSchema
Laurence
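The adaptor idea in the message above can be sketched in a few lines of Python. This is only an illustration, assuming the endpoint hands back one small XML document: the element and attribute names used here (Info, ExecutionEnvironment, ApplicationEnvironment, ID, Platform, and so on) are hypothetical placeholders and have not been checked against the GLUE 2.0 XML schema, and the function name extract_records is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document an endpoint might return.
# NOT the real GLUE 2.0 XML rendering; names are placeholders.
SAMPLE = """
<Info>
  <ExecutionEnvironment ID="env-1">
    <Platform>x86_64</Platform>
    <MainMemorySize>4096</MainMemorySize>
  </ExecutionEnvironment>
  <ApplicationEnvironment ID="app-1">
    <AppName>gcc</AppName>
    <AppVersion>4.3</AppVersion>
  </ApplicationEnvironment>
</Info>
"""


def extract_records(xml_text):
    """Flatten each top-level entity into a dict that a parallel
    information system could store in whatever form it likes."""
    root = ET.fromstring(xml_text)
    records = []
    for entity in root:
        record = {"type": entity.tag, "id": entity.get("ID")}
        for child in entity:
            record[child.tag] = (child.text or "").strip()
        records.append(record)
    return records


if __name__ == "__main__":
    for rec in extract_records(SAMPLE):
        print(rec)
```

The adaptor only reads the document; how the records are then pushed into the existing information system is deliberately left open, as in the discussion.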

Hi all,
FWIW, I have mixed feelings on this issue. 4 years ago I thought it would be just great to have exactly the same interface and end-point for job submission/management, for information query, and for file management. We even do have the same interface for job and file management: a job is largely characterized by a set of files, after all. And information persistency may well be realized as a set of files, too - why not. But then I killed a job by accident, being sure I deleted a file. So now I think a clear separation is a good thing to do.
Cheers, Oxana
2009-07-15 00:10, Laurence Field wrote:

Oxana,
I think you may have misinterpreted what I meant. I think having different interfaces for different functions is important. I was referring to a single interface for discovering resource properties|meta data|attributes, so that one can gather resource properties (perhaps including what interfaces the resource implements) in a uniform way - much like reflection interfaces in Java. Where and how the information needed to respond to resource property queries is obtained is irrelevant to me ... is it generated on the fly, is it variables in memory, is it stored in some database, I don't care. I just want to be able to ask for it.
Similarly, if someone wants to build an "information service" that logically keeps the resource properties of many resources and allows me to query that, that is great too. How they get that information is not my concern.
What that "information service" interface is I do care about. It has been debated quite a bit: many arguing against WS and XML representation as too slow and complex, instead arguing for a straight RDBMS interface with a well-defined schema; others arguing for the full flexibility of extensible XML schemas. While I think the information service interface is important, I think it will be difficult to reach consensus as infrastructures tend to have their own well developed techniques. I think therefore that we should keep it out of scope for our discussions on the execution service.
A
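The reflection-like idea above, one uniform way to ask any resource for its properties regardless of how it produces them, could look roughly like this in Python. It is a sketch under stated assumptions: the method name get_resource_properties, the two endpoint classes and the property keys are all hypothetical and are not taken from WS-RF, OGSA-BES or any other specification.

```python
from typing import Any, Dict, Protocol


class HasResourceProperties(Protocol):
    """Hypothetical uniform query interface (not from WS-RF or BES)."""

    def get_resource_properties(self) -> Dict[str, Any]: ...


class ExecutionEndpoint:
    """Computes its properties on the fly from live state."""

    def __init__(self, running_jobs):
        self._running_jobs = running_jobs

    def get_resource_properties(self):
        return {
            "interfaces": ["execution"],
            "running_jobs": len(self._running_jobs),
        }


class StorageEndpoint:
    """Keeps its properties in a plain in-memory dict."""

    def __init__(self, free_bytes):
        self._props = {"interfaces": ["storage"], "free_bytes": free_bytes}

    def get_resource_properties(self):
        return dict(self._props)


def describe(resource: HasResourceProperties) -> Dict[str, Any]:
    # The caller neither knows nor cares where the answer comes from.
    return resource.get_resource_properties()


if __name__ == "__main__":
    print(describe(ExecutionEndpoint(["j1", "j2"])))
    print(describe(StorageEndpoint(10)))
```

The job-management and file-management operations stay on separate interfaces; only the property query is shared.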
-----Original Message----- From: pgi-wg-bounces@ogf.org [mailto:pgi-wg-bounces@ogf.org] On Behalf Of Oxana Smirnova Sent: Tuesday, July 14, 2009 6:43 PM To: pgi-wg@ogf.org Cc: jsdl-wg@ogf.org Subject: Re: [Pgi-wg] Next teleconference: tomorrow, wednesday july 15th

Within the current EGEE/WLCG infrastructure there are no plans for the foreseeable future to move away from the current mechanism, the need to expose this information via BES is not a requirement for it to be deployed on the infrastructure.
Therefore, if any such interface is defined in a future version of BES, it should be encapsulated in its own optional port type.
Steven

Hi Laurence and Andrew,
I really would suggest that we make what is effectively a Basic Information Service. We have experience of trying to publish using GLUE-based XML as metadata on the MS HPC-BP service, and it ended up creating some performance issues. After building a separate service we also realised that, as Laurence suggested, this same service can be used for different types of resources.
David
On 14/07/2009 20:21, "Laurence Field" <Laurence.Field@cern.ch> wrote:

Moreno, Aleksandr and all,
PGI Job State Model
-------------------
Taking into account the remarks, comments and suggestions of the telephone conference of last week Wednesday 08 July 2009, I have made the following improvements to my Job State Model :
- Group 'Hold-*' substates inside second level 'Hold' substates,
- Add a third level 'Suspended' substate inside each 'Hold' substate,
- Clearly separate 'Finished', 'Failed' and 'Cancelled' states,
- Add an 'Automatic Resubmission' transition to the 'Submitted' state only from the 'Failed' state
Concerning the possibility of 'transitions between substates' proposed by Aleksandr KONSTANTINOV in his mail dated 08 July 2009 :
Even if a given implementation uses direct transitions between substates belonging to different states,
- It would NOT be easy to agree on any standard for such transitions,
- It would NOT be a good idea to force all implementations to simply understand them,
- It would NOT be easy to write clients able to understand them,
- For the normal flow, I have already designed :
  - the 'Outgoing' final substates as standard exit doors,
  - the 'Incoming' substates as standard entrance doors and branching points.
So, I fully agree with Andrew GRIMSHAW, we should NOT expose transitions between substates belonging to different states.
I have updated accordingly and uploaded inside GridForge the following documents :
- 'PGI Job State Model - Textual description' available at http://forge.gridforum.org/sf/go/doc15697?nav=1
- 'PGI Job State Model (available as ZARGO, XMI and PNG)' available under 5 formats at http://forge.gridforum.org/sf/go/doc15655?nav=1 :
  - ZARGO file created by ArgoUML
  - XMI file readable by any UML tool
  - PNG drawing of the Small Diagram
  - PNG drawing of the Big Diagram
  - PNG drawing of the Big Diagram with Comments
The model itself contains the whole information, smaller diagrams are easier to view and to print, bigger diagrams show more details.
Please review this model carefully before this telephone conference, so that we can quickly point out any remaining issue and reach a consensus : I will attend this telephone conference, but I will probably NOT be able to attend the telephone conference on Wednesday 22 July 2009.
I am still ready for further improvements, and if necessary, I will be available for telephone conferences on Fridays at 16h.
Status update on Security
-------------------------
I propose that before this telephone conference, you carefully review my 'PGI Security Model' available at http://forge.gridforum.org/sf/go/doc15584?nav=1
That would permit us to quickly point out any remaining issue and perhaps reach a consensus.
Best regards.
-----------------------------------------------------
Etienne URBAH         LAL, Univ Paris-Sud, IN2P3/CNRS
                      Bat 200   91898 ORSAY    France
Tel: +33 1 64 46 84 87      Skype: etienne.urbah
Mob: +33 6 22 30 53 27      mailto:urbah@lal.in2p3.fr
-----------------------------------------------------
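The top-level rules of the Job State Model described in this message (separate 'Finished', 'Failed' and 'Cancelled' states, and 'Automatic Resubmission' to 'Submitted' only from 'Failed') can be written down as a small transition table. This is a rough sketch, not the model itself: 'Running' is a hypothetical stand-in for whatever intermediate states the real model defines, the assumption that failure and cancellation are reachable from both non-terminal states is mine, and substates ('Hold', 'Suspended', 'Incoming', 'Outgoing') are left out entirely.

```python
# Top-level states only. 'Running' is a hypothetical placeholder for the
# intermediate states of the real model; substates are not represented.
ALLOWED = {
    "Submitted": {"Running", "Failed", "Cancelled"},  # assumption
    "Running": {"Finished", "Failed", "Cancelled"},   # assumption
    "Failed": {"Submitted"},  # 'Automatic Resubmission', only from 'Failed'
    "Finished": set(),
    "Cancelled": set(),
}


def can_transition(src, dst):
    """True if the table allows moving from state src to state dst."""
    return dst in ALLOWED.get(src, set())


def advance(state, new_state):
    """Return new_state, or raise if the transition is not in the table."""
    if not can_transition(state, new_state):
        raise ValueError(f"transition {state} -> {new_state} is not allowed")
    return new_state


if __name__ == "__main__":
    state = "Submitted"
    for nxt in ("Running", "Failed", "Submitted"):
        state = advance(state, nxt)
        print(state)
```

Because only top-level states appear in the table, a client written against it never sees transitions between substates of different states, which is the point argued above.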

Balazs, Morris and all,
PGI Job State Model
-------------------
Taking into account the remarks, comments and suggestions of the telephone conference of last week Wednesday 15 July 2009, I have made the following improvements to my Job State Model :
- Write down that 'Cancellation' and 'Failure' can happen in any substate,
- Mention time-out inside the 'Delegated:Hold' substate,
- Remove the 'Individual Job of a Job Collection' transition.
Therefore, I have renamed 'PGI Job State Model' as 'PGI Single Job State Model', and I have created a new overview document about 'PGI Execution Service' (see below).
I have updated accordingly and uploaded inside GridForge the following documents :
- 'PGI Single Job State Model - Textual description' available at http://forge.gridforum.org/sf/go/doc15697?nav=1
- 'PGI Single Job State Model (available as ZARGO, XMI and PNG)' available under 6 formats at http://forge.gridforum.org/sf/go/doc15655?nav=1 :
  - ZARGO file created by ArgoUML
  - XMI file readable by any UML tool
  - PNG drawing of the Small Diagram
  - PNG drawing of the Medium Diagram
  - PNG drawing of the Big Diagram
  - PNG drawing of the Big Diagram with Comments
The model itself contains the whole information, smaller diagrams are easier to view and to print, bigger diagrams show more details.
Please review this model carefully, so that we can quickly point out any remaining issue and reach a consensus.
PGI Execution Service Overview
------------------------------
Since the above 'Job State Model' covers only Single Jobs, I have created the attached plain text document named 'PGI Execution Service Overview'. It is also available at http://forge.gridforum.org/sf/go/doc15735?nav=1
I am absolutely certain that this document is very useful, but I am sure that it is incomplete, and there are probably points where you disagree. So please review this document thoroughly, and send me your remarks, comments and suggestions for improvement.
Best regards.
-----------------------------------------------------
Etienne URBAH         LAL, Univ Paris-Sud, IN2P3/CNRS
                      Bat 200   91898 ORSAY    France
Tel: +33 1 64 46 84 87      Skype: etienne.urbah
Mob: +33 6 22 30 53 27      mailto:urbah@lal.in2p3.fr
-----------------------------------------------------
+-----------------------------------------+
|  OGF : PGI Execution Service Overview   |
+-----------------------------------------+
1) Context
2) Types of grid Jobs processed by the PGI Execution Service
3) Single Jobs
4) Collection Jobs
5) Parameter Sweep Jobs
6) DAG Jobs
1) Context
==========
The PGI Execution Service, which processes grid Jobs, is NOT designed so that any instance of the Execution Service works as a standalone grid service, but so that each instance of the Execution Service works in cooperation with at least :
- Batch Systems really executing the Jobs, and other instances of the Execution Service to which it can delegate grid Jobs.
- An 'Information Service' implementing GLUE 2.0 as specified at http://www.ogf.org/documents/GFD.147.pdf
  So, for information on the existence, capability, status, load and end points of its entities, the Execution Service :
  - MUST publish it to the Information Service. Implementation (push or pull, WS or ActiveMQ or other) has to be defined in accordance with the Information Service,
  - CAN optionally publish it through a port type of its own Web Service, but that is NOT mandatory.
- An 'Accounting Service'.
  So, for accounting information, the Execution Service :
  - MUST publish it as specified by UR at http://www.ogf.org/documents/GFD.98.pdf
  - CAN optionally publish it through a port type of its own Web Service, but that is NOT mandatory.
- A 'Logging and Bookkeeping' Service keeping the history of all events for each Job. Standardization of this 'Logging and Bookkeeping' is under way inside OGF JSDL with the name 'Activity Instance' at http://forge.gridforum.org/sf/docman/do/listDocuments/projects.jsdl-wg/docma...
  So, for Job events, the Execution Service :
  - MUST publish as much as it can to 'Logging and Bookkeeping'. Implementation (push or pull, WS or ActiveMQ or other) has to be defined in accordance with the 'Logging and Bookkeeping' Service,
  - CAN optionally publish them through a port type of its own Web Service, but that is NOT mandatory.

For 'Accounting' and 'Logging and Bookkeeping', the 'Joint Security Policy Group' has published a draft of 'Grid Policy on the Handling of User-Level Job Accounting Data', available at http://www.jspg.org/wiki/Grid_Policy_on_the_Handling_of_User-Level_Job_Accou...

2) Types of grid Jobs processed by the PGI Execution Service
============================================================
Each grid Job is defined by a Job description. The standard for Job description is JSDL specified at http://www.ogf.org/documents/GFD.136.pdf

The PGI Execution Service can process the following types of grid Jobs :

- Single Jobs, whose description contains only 1 Job executed by only 1 Batch System,
- Collection Jobs, which are containers for a limited number of explicitly described independent Single Jobs,
- Parameter Sweep Jobs, which describe independent Single Jobs to be created dynamically,
- DAG Jobs, which describe a DAG workflow of a limited number of explicitly described Single Jobs.

3) Single Jobs
==============
Their description contains only 1 Job executed by only 1 Batch System.

The state model and behaviour INTERNALLY used by any implementation of the Execution Service to process Single Jobs is OUT OF SCOPE.
But the state model and behaviour exposed by the Execution Service to the Job Submitter for each Single Job is described in the following documents :

- 'PGI Single Job State Model - Textual description' at http://forge.gridforum.org/sf/go/doc15697?nav=1
- 'PGI Single Job State Model (available as ZARGO, XMI and PNG)' at http://forge.gridforum.org/sf/go/doc15655?nav=1

At any time after having received the Jobid (or EPR) of the Single Job, the Submitter CAN request the status of the Single Job from the Execution Service, which MUST return it.

4) Collection Jobs
==================
They are containers for a limited number of explicitly described independent Single Jobs.

The gLite implementation currently processes such jobs. They are also implicitly defined by 'Create multiple Activities' in chapter 3.1.2 of 'Strawman of the Geneva grid job Execution Service (GES)' at http://forge.gridforum.org/sf/go/doc15590?nav=1

Collection Jobs are NOT covered by the 'Single Job State Model'.

On arrival of a Collection Job, the Execution Service MUST immediately :

- Associate a Jobid (or EPR) to the Job Collection itself,
- For each Single Job described, submit it and get back the associated Jobid (or EPR, or Error message),
- Return the whole list of Jobids (or EPRs) and Error messages to the Submitter of the Job Collection.

With this behaviour, the Collection Job itself can be considered as stateless.

At any time after having received the list of Jobids (or EPRs), the Submitter of the Collection Job CAN request :

- The status of each Single Job (standard case for a Single Job),
- The status of the Collection Job itself. In this case, the Execution Service MUST return the status of every Single Job,
- To cancel the execution of the Collection Job. In this case, the Execution Service MUST cancel each submitted Single Job not already finished.
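The Collection Job behaviour above could be sketched as follows. This is only an illustration, not part of the proposal: the class ExecutionServiceSketch, its method names, the 'executable' check and the state names 'Submitted', 'Finished', 'Failed' and 'Cancelled' are all assumptions made for the example. The real state names are those of the Single Job State Model.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional, Tuple

# Placeholder state names; the real ones come from the Single Job State Model.
FINAL_STATES = {"Finished", "Failed", "Cancelled"}


@dataclass
class SubmissionResult:
    jobid: Optional[str] = None   # set when the Single Job was accepted
    error: Optional[str] = None   # set when the submission was rejected


class ExecutionServiceSketch:
    """Hypothetical in-memory stand-in for an Execution Service."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.single_status: Dict[str, str] = {}
        self.collections: Dict[str, List[SubmissionResult]] = {}

    def submit_single(self, description: dict) -> str:
        if "executable" not in description:   # assumed validation rule
            raise ValueError("description has no executable")
        jobid = f"job-{next(self._ids)}"
        self.single_status[jobid] = "Submitted"
        return jobid

    def submit_collection(
        self, descriptions: List[dict]
    ) -> Tuple[str, List[SubmissionResult]]:
        # Associate a Jobid to the collection, submit every Single Job,
        # and return the whole list of Jobids and Error messages at once.
        collection_id = f"coll-{next(self._ids)}"
        results: List[SubmissionResult] = []
        for description in descriptions:
            try:
                results.append(SubmissionResult(jobid=self.submit_single(description)))
            except ValueError as exc:
                results.append(SubmissionResult(error=str(exc)))
        self.collections[collection_id] = results
        return collection_id, results

    def collection_status(self, collection_id: str) -> Dict[str, str]:
        # The collection keeps no state of its own: its status is the
        # list of the statuses of its Single Jobs.
        return {r.jobid: self.single_status[r.jobid]
                for r in self.collections[collection_id] if r.jobid}

    def cancel_collection(self, collection_id: str) -> None:
        # Cancel each submitted Single Job that is not already finished.
        for r in self.collections[collection_id]:
            if r.jobid and self.single_status[r.jobid] not in FINAL_STATES:
                self.single_status[r.jobid] = "Cancelled"
```

Since the collection only stores the list of submission results, its status is always derived from the Single Jobs, which is what makes it effectively stateless.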
5) Parameter Sweep Jobs
=======================
They describe independent Single Jobs to be created dynamically, as specified by 'JSDL Parameter Sweep Job Extension' at http://www.ogf.org/documents/GFD.149.pdf

I propose that the Execution Service MUST expose to the Submitter of a Parameter Sweep Job the following state model and behaviour :

Submitted
---------
On arrival of a Parameter Sweep Job, the Execution Service immediately associates a Jobid (or EPR) to the Parameter Sweep Job itself, and immediately returns it to the Submitter.

Processing
----------
For each Single Job to be created dynamically, the Execution Service :

- generates the appropriate Job description,
- submits it,
- gets back the associated Jobid (or EPR, or Error message),
- notifies the Submitter of the Parameter Sweep Job of this Jobid (or EPR, or Error message).

Finished
--------
Reached as soon as the Execution Service has no more Single Jobs to create.

At any time after having received any Jobid (or EPR), the Submitter of the Parameter Sweep Job CAN request :

- The status of each Single Job (standard case for a Single Job),
- The status of the Parameter Sweep Job itself. In this case, the Execution Service MUST return its status, and CAN return the number of Single Jobs already successfully submitted, the number of Single Jobs already unsuccessfully submitted, and if possible the number of Single Jobs still to be submitted,
- To cancel the execution of the Parameter Sweep Job. In this case, the Execution Service MUST cancel the Parameter Sweep Job itself, but each already submitted Single Job continues its own life,
- To cancel the execution of the Parameter Sweep Job including already submitted Single Jobs. In this case, the Execution Service MUST cancel the Parameter Sweep Job itself, and each already submitted Single Job not already finished.

6) DAG Jobs
===========
They describe a DAG workflow of a limited number of explicitly described Single Jobs.
I propose that the Execution Service MUST expose to the Submitter of a DAG Job the following state model and behaviour :

Submitted
---------
On arrival of a DAG Job, the Execution Service immediately associates a Jobid (or EPR) to the DAG Job itself, and immediately returns it to the Submitter.

Processing
----------
At the appropriate time defined by the DAG workflow, the Execution Service :

- submits 1 appropriate Single Job,
- gets back the associated Jobid (or EPR, or Error message),
- notifies the Submitter of the DAG Job of this Jobid (or EPR, or Error message).

Finished
--------
Reached as soon as the Execution Service has no more Single Jobs to create.

At any time after having received any Jobid (or EPR), the Submitter of the DAG Job CAN request :

- The status of each Single Job (standard case for a Single Job),
- The status of the DAG Job itself. In this case, the Execution Service MUST return its status, the status of each Single Job already submitted, and if possible the number of Single Jobs still to be submitted,
- To cancel the execution of the DAG Job. In this case, the Execution Service MUST cancel the DAG Job itself, but each already submitted Single Job continues its own life,
- To cancel the execution of the DAG Job including already submitted Single Jobs. In this case, the Execution Service MUST cancel the DAG Job itself, and each already submitted Single Job not already finished.

To be thoroughly reviewed and criticized.
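As an aid to the review, the proposed DAG Job state model could be sketched as follows. Everything here is an assumption made for illustration: the class DagJobSketch, the 'notify' callback, the Jobid format, the Single Job state names, and the extra 'Cancelled' state (the proposal names only Submitted, Processing and Finished). The ordering uses the standard Python graphlib module (Python 3.9 or later). The Parameter Sweep Job proposal uses the same three states and the same two cancellation requests.

```python
from enum import Enum
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, Iterable, List

FINAL_STATES = {"Finished", "Failed", "Cancelled"}  # placeholder names


class DagState(Enum):
    SUBMITTED = "Submitted"
    PROCESSING = "Processing"
    FINISHED = "Finished"
    CANCELLED = "Cancelled"   # assumption: not named in the proposal


class DagJobSketch:
    """Hypothetical DAG Job as exposed to its Submitter."""

    def __init__(self, dag_id: str,
                 predecessors: Dict[str, Iterable[str]],
                 notify: Callable[[str], None]) -> None:
        self.dag_id = dag_id
        self.state = DagState.SUBMITTED
        self._notify = notify
        nodes = set(predecessors)
        for preds in predecessors.values():
            nodes.update(preds)
        self._total = len(nodes)
        self._sorter = TopologicalSorter(predecessors)
        self._sorter.prepare()   # raises graphlib.CycleError on a cycle
        self._node_of: Dict[str, str] = {}
        self.single_status: Dict[str, str] = {}

    def submit_ready(self) -> List[str]:
        """Submit every Single Job whose predecessors are all done."""
        if self.state in (DagState.FINISHED, DagState.CANCELLED):
            return []
        self.state = DagState.PROCESSING
        submitted = []
        for node in self._sorter.get_ready():
            jobid = f"{self.dag_id}/{node}"
            self._node_of[jobid] = node
            self.single_status[jobid] = "Running"
            self._notify(jobid)   # tell the Submitter about the new Jobid
            submitted.append(jobid)
        if len(self._node_of) == self._total:
            # No more Single Jobs to create: the DAG Job itself is Finished,
            # even though its last Single Jobs may still be running.
            self.state = DagState.FINISHED
        return submitted

    def single_job_done(self, jobid: str) -> None:
        self.single_status[jobid] = "Finished"
        self._sorter.done(self._node_of[jobid])

    def cancel(self, include_submitted: bool = False) -> None:
        if self.state is not DagState.FINISHED:
            self.state = DagState.CANCELLED
        if include_submitted:
            for jobid, status in self.single_status.items():
                if status not in FINAL_STATES:
                    self.single_status[jobid] = "Cancelled"
```

Calling cancel() stops further submissions but lets the already submitted Single Jobs continue their own life, while cancel(include_submitted=True) also cancels those that are not already finished.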

Dear Etienne,

Thanks for the updated documents and figures. Meanwhile, your draft was included in the strawman rendering document. Unfortunately, we took an earlier version of your text. It would be nice if you could directly update the State model section of the rendering document. Please make sure you use OpenOffice 3.1 and turn track changes on.

Direct link to the strawman document: http://forge.ogf.org/sf/docman/do/downloadDocument/projects.pgi-wg/docman.ro...

bye,
Balazs Konya

--
Balázs Kónya
NorduGrid Collaboration
http://www.nordugrid.org
Lund University
balazs.konya@hep.lu.se
High Energy Physics
phone: +46 46 222 8049
BOX 118, S - 221 00 LUND, Sweden
fax: +46 46 222 4015
participants (8)
- Andrew Grimshaw
- Balazs Konya
- David Wallom
- Etienne URBAH
- Laurence Field
- Moreno Marzolla
- Oxana Smirnova
- Steven Newhouse