Re: [Pgi-wg] OGF OGSA-BES - Requirements for an improved Basic Execution Service

Andrew, Steven and all OGSA-BES stakeholders,
I have finished writing a document named 'Requirements for an improved Basic Execution Service (BES)', available at http://forge.gridforum.org/sf/go/doc16306
I will use this requirements document to propose improvements to the current specification of BES 1.0 described in GFD.108, as agreed at the OGSA-BES session at OGF32 in Salt Lake City on 17 July 2011.
Thank you in advance for reading and criticizing my requirements document.
Best regards.
----------------------------------------------------- Etienne URBAH LAL, Univ Paris-Sud, IN2P3/CNRS Bat 200 91898 ORSAY France Tel: +33 1 64 46 84 87 Skype: etienne.urbah Mob: +33 6 22 30 53 27 mailto:urbah@lal.in2p3.fr -----------------------------------------------------
On Thu, 08/09/2011 12:51, Morris Riedel wrote:
Hi Etienne,
thanks for your input.
At the next OGF, we will have several PGI/BES/JSDL/GLUE2 sessions.
We will thus not only be able to clarify questions, but also make progress towards a commonly agreed spec based on all of our inputs.
Take care, Morris
-----Original Message-----
From: ogsa-bes-wg-bounces@ogf.org [mailto:ogsa-bes-wg-bounces@ogf.org] On Behalf Of Etienne URBAH
Sent: Wednesday, 7 September 2011 19:33
To: Andrew GRIMSHAW; steven.newhouse@egi.eu; ogsa-bes-wg@ogf.org
Cc: pgi-wg@ogf.org; edgi-na2@mail.edgi-project.eu; lodygens@lal.in2p3.fr
Subject: Re: [OGSA-BES-WG] OGF OGSA-BES - Requirements for an improved Basic Execution Service
Andrew, Steven and all OGSA-BES stakeholders,
At the OGSA-BES session at OGF32 in Salt Lake City on 17 July 2011, we agreed to propose, by the beginning of September 2011, improvements to the current specification of BES 1.0 described in GFD.108.
But I am unable to define specifications before having a clear picture of requirements.
Therefore, I have written down a document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306
This document is almost finished, with most chapters in a readable and consistent state. The only exception is the chapter about data management, which I hope to write down very soon.
Thank you in advance for reading and criticizing this document.
Best regards.

Andrew, Steven and all OGSA-BES stakeholders,
In the document named 'Requirements for an improved Basic Execution Service (BES)', available at http://forge.gridforum.org/sf/go/doc16306, I have made the following improvements:
- For each functionality, clearly separate:
  - the requirements about the corresponding Client interface (often 'MUST'),
  - the requirements about the implementation of the functionality (sometimes 'MAY').
- Add short titles in bold to improve readability.
Thank you in advance for reading and criticizing this requirements document.
Best regards.
Etienne URBAH
On Fri, 09/09/2011 22:51, Etienne URBAH wrote:
[...]

Hi Etienne,
thanks for creating this consolidated requirements document, it is very interesting to read. I have a few comments, mostly related to whether certain features are a "MUST" or not.
In many cases it was not clear to me whether you talk about the BES interface specification, or about the way a BES instance should be deployed and operated. It would be good to remove the operational and deployment concerns, so that only the specification parts remain.
I hope my comments can be used to make the document clearer and easier to digest.
So, in detail:
1.4 Methodology
* ref 4: glite user guide -> out of scope, gLite is too complex and too specific in its architecture (if there is such a thing) to be useful as a base for BES
2.4 Collaboration with other services
While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
4. BES non-functional requirements
4.1.2 Traceability - should be SHOULD not MUST
4.1.3 Security - how can a specification "implement" a policy? You probably meant "BES implementations SHOULD ..."
- availability/reliability: while I agree, you cannot enforce quality of an implementation via the specification.
So all of 4.1.3 is SHOULD in my opinion.
4.2 all of this is out of scope. For example the UNICORE service container hosts a number of services including an execution service. Probably you mean that the execution service SPECIFICATION should be limited to the execution service and MUST NOT specify accounting, security etc.
5 BES requirements applying to the info system
Introduction: this is a requirement on the grid, not on the BES.
6 BES requirements applying to security
Introduction: when you say "The Execution Service MUST NOT be overloaded by implementing a security framework", it is not clear what you mean by "service". Do you mean the service as defined by its interfaces, or do you mean the actual BES process or even machine which is running BES? For example, a BES may be one of many services hosted by a service container, which may include a full featured security framework. This should be clarified.
6.1
* "SSL certificates MUST be signed by a CA..." this is an operational decision, and has nothing to do with the BES spec. For example, a site may run an inhouse deployment of BES using an in-house CA. This requirement should be deleted.
6.3
* "For Client authentication, the Execution Service MUST accept all following authentication methods: Full X509, RFC-3820-compliant X509 Proxy"
This requirement is invalid. I agree that it would be nice to be able to specify authentication methods, but it is impossible. For example Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be valid authentication methods in some circumstances.
Furthermore, making proxies a MUST implies that nonstandard authentication libraries instead of TLS/SSL must be used, making the BES implementation insecure. For some implementors (like UNICORE) proxies on the transport level are very much a no-go.
6.4.
"This authorization mechanism MUST be consistent across all instances of the Execution Service"
This violates the autonomy of a site. Site administrators often wish to stay in control of their resources, and do not accept external authorisation decision points. And anyway, who cares? Since the AuthZ mechanism is internal to the BES, it cannot be specified in the BES spec as such.
6.5
These are requirements on the security layer (or framework) and should not be used as requirements on BES.
7 BES requirements related to "Application Repositories"
While I agree that BES should understand the notion of an "Application" (see e.g. JSDL ApplicationName), I don't agree that the BES should use these for Scheduling. Rather, this is the job of a broker.
8 BES requirements applying to Accounting
As a "MUST", these are out of scope, and should be made "SHOULD".
9 Logging/Bookkeeping
Same as 8.
10 Jobs
10.1 Types of job
Support for parallel jobs: it should be "MUST" :-)
Best regards, Bernd.
On Tue, 2011-09-13 at 20:51 +0200, Etienne URBAH wrote:
[...]
--
Dr. Bernd Schuller
Federated Systems and Data
Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc
Phone: +49 246161-8736 (fax -8556)
------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Entered in the commercial register of the district court of Dueren, No. HR B 3498
Chairman of the Supervisory Board: MinDirig Dr. Karl Eugen Huthmacher
Board of Directors: Prof. Dr. Achim Bachem (Chairman), Karsten Beneke (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------

Bernd,
Concerning the document named 'Requirements for an improved Basic Execution Service (BES)', available at http://forge.gridforum.org/sf/go/doc16306:
THANK YOU VERY MUCH for having taken the time to read this document and to provide comments. Such comments are very useful for improving documents: they permit convergence and prepare later agreement.
My answers are interspersed in your comments.
Please note that the chapter numbering has been modified.
Besides, in the chapter about 'Data Management', I have added a description of the context and the workflows for automatic and manual data staging.
The new version of the document is available at the location given above.
In order that we all make progress, please continue sending comments.
Best regards.
Etienne URBAH
On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
Hi Etienne,
thanks for creating this consolidated requirements document, it is very interesting to read. I have a few comments, mostly related to whether certain features are a "MUST" or not.
In many cases it was not clear to me whether you talk about the BES interface specification, or about the way a BES instance should be deployed and operated. It would be good to remove the operational and deployment concerns, so that only the specification parts remain.
I hope my comments can be used to make the document clearer and easier to digest.
So, in detail:
1.4 Methodology
* ref 4: glite user guide -> out of scope, gLite is too complex and too specific in its architecture (if there is such a thing) to be useful as a base for BES
Yes, this is a very large user guide for specific usage of gLite by users, and it does NOT provide a clear description of the architecture and of the functionalities.
But it is NOT so complex. As soon as I finished reading this guide, it was easy for me to perform reverse engineering and extract the effective architecture (SOA with internal interfaces) and the implied functionalities (which are to be improved).
In the text, I have prepended a few words to explain that.
2.4 Collaboration with other services
While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
My document is NOT targeted only to the specification of the BES Client interface, but to the clear and consistent description of BES context and functional + operational requirements which are really necessary for interoperability. As far as I know, OGSA does NOT take into account GLUE 2.0 yet. Therefore, an up to date 'grid architecture specification' is absolutely necessary. If OGSA members consider chapters 2.3 and 2.4 of my document as a 'grid architecture specification' which updates and improves OGSA, I thank them. If they consider that this 'grid architecture specification' does NOT comply with OGSA and competes with it, then I assert that it obsoletes OGSA.
Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
FAILURE of practical operations is often caused by LACK of early care about operational concerns during specification phase. As GIN-GC has proven and documented, this is even more true for interoperability on the field (as opposed to theoretical interoperability at the WSDL level). I confirm that care about operational concerns is REQUIRED for real operations and for practical interoperability on the field. Although operational concerns are NOT part of the BES Client interface, they are REQUIRED for the overall specifications of BES in its context. In the text, I have stressed that the document DOES take into account operational concerns.
4. BES non-functional requirements
4.1.2 Traceability - should be SHOULD not MUST
This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ? No traceability --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that traceability is a MUST.
4.1.3 Security - how can a specification "implement" a policy? You probably meant "BES implementations SHOULD ..."
The text now is 'BES specifications MUST fully take into account the Security Policies ...'
- availability/reliability: while I agree, you cannot enforce quality of an implementation via the specification.
YES. Thank you.
So all of 4.1.3 is SHOULD in my opinion.
Security policies remain a MUST.
4.2 all of this is out of scope. For example the UNICORE service container hosts a number of services including an execution service. Probably you mean that the execution service SPECIFICATION should be limited to the execution service and MUST NOT specify accounting, security etc.
'Well defined and narrow scope' is a general engineering requirement. It is a fundamental concern crossing requirements, design, specifications, fabrication, tests, operations, user experience and product maintenance for all types of products, even outside software engineering. I confirm that 'Well defined and narrow scope' is absolutely REQUIRED for sound software design, implementation, deployment and maintainability. From your comment, I assume that the Execution Service of UNICORE does have a 'Well defined and narrow scope', does have precise interfaces with other UNICORE services, and minimizes overlaps with them. The text now is 'The Execution Service is defined by its functionalities (efficiently manage Jobs, which are transient entities) and its operational constraints : For sound software design, implementation, deployment and maintainability, ...'
5 BES requirements applying to the info system
Introduction: this is a requirement on the grid, not on the BES.
This is NOT a requirement, this is a DESCRIPTION. The importance of the Information System as foundational block is underestimated, and precise knowledge of GLUE 2.0 is poor inside OGF. I am trying to improve the situation by providing clear explanations.
6 BES requirements applying to security
Introduction: when you say "The Execution Service MUST NOT be overloaded by implementing a security framework", it is not clear what you mean by "service". Do you mean the service as defined by its interfaces, or do you mean the actual BES process or even machine which is running BES? For example, a BES may be one of many services hosted by a service container, which may include a full featured security framework. This should be clarified.
The text now is 'The Execution Service is defined by its functionalities (efficiently manage Jobs, which are transient entities) and its operational constraints. For sound software design, implementation, deployment and maintainability, BES specifications and instances of the Execution Service ...'
6.1
* "SSL certificates MUST be signed by a CA..." this is an operational decision, and has nothing to do with the BES spec. For example, a site may run an inhouse deployment of BES using an in-house CA. This requirement should be deleted.
This operational concern is REQUIRED for practical interoperability on the field. I have prepended : * Authentication of Servers : The Execution Service SHOULD permit Clients to authenticate it. If an Execution Service authenticates itself to clients, it MUST permit Clients to really perform this authentication.
6.3
* "For Client authentication, the Execution Service MUST accept all following authentication methods: Full X509, RFC-3820-compliant X509 Proxy"
This requirement is invalid. I agree that it would be nice to be able to specify authentication methods, but it is impossible. For example Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be valid authentication methods in some circumstances.
There are 2 separate requirements : - 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy - 1 'MAY' for all other ones.
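The MUST/MAY split described in this answer can be illustrated with a minimal sketch. It assumes hypothetical method labels ("full-x509", "rfc3820-proxy") and a hypothetical helper name; neither the requirements document nor BES defines such identifiers.

```python
# Hypothetical labels for the two MUST-level methods named in the thread.
REQUIRED_METHODS = frozenset({"full-x509", "rfc3820-proxy"})


def missing_required_methods(accepted_methods):
    """Return the MUST-level methods that a service does not accept, sorted."""
    return sorted(REQUIRED_METHODS - set(accepted_methods))


def is_conformant(accepted_methods):
    """A service conforms if it accepts every MUST-level method.

    Any further method (Shibboleth, OAuth, ...) falls under 'MAY' and
    never affects the result.
    """
    return not missing_required_methods(accepted_methods)
```

Under this reading, a service accepting only Shibboleth and OAuth would fail the check, which is the case Bernd objects to.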
Furthermore, making proxies a MUST implies that nonstandard authentication libraries instead of TLS/SSL must be used, making the BES implementation insecure. For some implementors (like UNICORE) proxies on the transport level are very much a no-go.
I had clearly specified RFC-3820-compliant X509 Proxies, which ARE standard. Your criticisms are valid for GSI proxies, which I have taken care NOT to mention.
6.4.
"This authorization mechanism MUST be consistent across all instances of the Execution Service"
This violates the autonomy of a site. Site administrators often wish to stay in control of their resources, and do not accept external authorisation decision points. And anyway, who cares? Since the AuthZ mechanism is internal to the BES, it cannot be specified in the BES spec as such.
Interoperability requires a federation of independent administrative domains to agree on common functionalities, interfaces and operations. This DOES sometimes violate the autonomy of each individual site. The requirement is NOT that the AUTHZ decision point is external to any site, but that all participating sites MUST agree to install inside their site an instance of a commonly validated software implementing the decision point. The AUTHZ mechanism MUST NOT be internal to the BES : For example, in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package is described as 'The security subsystem of UNICORE/X', and is NOT internal to 'de.fzj.unicore.uas.impl.job'.
6.5
These are requirements on the security layer (or framework) and should not be used as requirements on BES.
These security requirements DO have impacts on the BES Client interface and on the Job Description document. In the text, I have made it clear.
8 BES requirements related to "Application Repositories"
While I agree that BES should understand the notion of an "Application" (see e.g. JSDL ApplicationName), I don't agree that the BES should use these for Scheduling. Rather, this is the job of a broker.
The text is now : * Resource selection : The Execution Service MUST use, among others, these references to 'Installed Applications' in order to select the most adequate computing resource for the Job.
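As an illustration only, the 'Resource selection' requirement quoted above could be sketched as a filter over computing resources. The data layout (a mapping from resource name to installed applications) and the function name are assumptions made for this example; they are not taken from the requirements document, JSDL or GLUE 2.0.

```python
def select_resources(resources, required_applications):
    """Return the names of resources that have every required application installed.

    resources: mapping of resource name -> iterable of installed application names
    required_applications: application names referenced by the Job Description
    """
    needed = set(required_applications)
    return [name for name, installed in resources.items()
            if needed <= set(installed)]


# Hypothetical resources and applications, for illustration only.
example_resources = {
    "cluster-a": ["gromacs", "blast"],
    "cluster-b": ["blast"],
}
candidates = select_resources(example_resources, ["gromacs"])
```

Whether this filtering belongs in the Execution Service or in a broker is exactly the point under discussion; the sketch takes no position on that.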
9 BES requirements applying to Accounting
As a "MUST", these are out of scope, and should be made "SHOULD".
No Accounting --> No precise reporting to funding agencies --> No funding --> Abrupt and very long shutdown of all services. I fully confirm that Accounting is a MUST.
10 Logging/Bookkeeping
Same as 9.
Same as 'Traceability' : This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ? No Logging and Bookkeeping --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that Logging and Bookkeeping is a MUST.
12 Jobs
12.1 Types of job
Support for parallel jobs: it should be "MUST" :-)
The text is now : - The concept of 'Single Job' includes Jobs running massively-parallel processes using MPI on one large-scale HPC System. The Execution Service MUST understand instructions for usage of MPI inside the Job Description document. The Execution Service SHOULD transmit these instructions to the Batch System, or return an explicit error message if not supported.
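The behaviour described in this revised text (understand MPI instructions, forward them to the Batch System, or return an explicit error) might look like the following sketch. The dictionary keys, the exception class and the function name are hypothetical; a real Job Description would be a JSDL document rather than a Python dictionary.

```python
class UnsupportedFeatureError(Exception):
    """Raised when a job requests a feature the batch system cannot provide."""


def build_batch_request(job_description, batch_supports_mpi):
    """Translate a (simplified) job description into a batch system request.

    MPI instructions are always parsed; they are forwarded when the batch
    system supports MPI and rejected with an explicit error otherwise.
    """
    request = {"executable": job_description["executable"]}
    mpi = job_description.get("mpi")
    if mpi is None:
        return request  # plain sequential job
    if not batch_supports_mpi:
        raise UnsupportedFeatureError(
            "MPI requested but not supported by the batch system")
    request["mpi_processes"] = mpi["processes"]
    return request
```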
Best regards, Bernd.
On Tue, 2011-09-13 at 20:51 +0200, Etienne URBAH wrote:
[...]

Hi Etienne, all,
On Thu, Sep 15, 2011 at 7:27 PM, Etienne URBAH <urbah@lal.in2p3.fr> wrote:
On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
In many cases it was not clear to me whether you talk about the BES interface specification, or about the way a BES instance should be deployed and operated. It would be good to remove the operational and deployment concerns, so that only the specification parts remain.
2.4 Collaboration with other services
While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
My document is NOT targeted only to the specification of the BES Client interface, but to the clear and consistent description of BES context and functional + operational requirements which are really necessary for interoperability.
I would like to put forward a motion:
"The discussion about BES interface specification should be separated from the discussion about interoperation of BES services."
Motivation: Both discussions are important and necessary to have. Discussing both topics at once, however, will convolute the BES interface specification, and will delay overall progress. I do not mean that interoperation can only be discussed after the BES interface spec is finished, not at all - but each argument should be clearly marked as belonging to *either* discussion, not both.
Some more comments inlined...
Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
FAILURE of practical operations is often caused by LACK of early care about operational concerns during specification phase. As GIN-GC has proven and documented, this is even more true for interoperability on the field (as opposed to theoretical interoperability at the WSDL level).
I confirm that care about operational concerns is REQUIRED for real operations and for practical interoperability on the field. Although operational concerns are NOT part of the BES Client interface, they are REQUIRED for the overall specifications of BES in its context.
"REQUIRED for the overall specifications of BES" - I assume that this does *not* mean the BES Service Interface specification (which I think you refer to as 'BES client interface', as it is consumed by a non-service / client)?
In the text, I have stressed that the document DOES take into account operational concerns.
'the document' - I assume you mean the present requirement document? If so, I agree - it is useful to capture operational requirements. I also agree with Bernd though, that those should not directly influence the BES service interface specification, but rather are a separate concern. You cannot foresee the requirements of all implementations, nor the boundary conditions of all deployments - adding operational features to the BES Service interface specification *will* limit its applicability. Thus, those issues must IMHO be addressed in a separate document.
And no, I don't expect EGI to use my service implementation ;-)
My $0.02,
Andre.
--
Nothing is ever easy...

Hi Etienne, thanks for the clarifications. So indeed your document is aimed at both: 1) providing requirements for the actual BES specification ("client interface" in your terminology) 2) the operation and deployment issues that have to be solved for interoperability on an infrastructure level (say EGI and EDGI). It would be very beneficial for further progress if these two distinct concerns could be separated, at least CLEARLY marked in your document. I have added some more comments inline. On Thu, 2011-09-15 at 19:27 +0200, Etienne URBAH wrote: > Concerning the document named 'Requirements for an improved Basic > Execution Service (BES)' and available at > http://forge.gridforum.org/sf/go/doc16306 : > > THANK YOU VERY MUCH for having taken the time to read this document, and > for having taken the time to provide comments : > > Such comments are very useful for the improvement of documents, permit > convergence and prepare later agreement. > [...] > On Thu, 15/09/2011 13:00, Bernd Schuller wrote: > > > >[...] > > > 1.4 Methodology > > > > * ref 4: glite user guide -> out of scope, gLite is too complex and > > too specific in its architecture (if there is such a thing) to be useful > > as a base for BES > Yes, this is a very large user guide for specific usage of gLite by > users, and it does NOT provide a clear description of the architecture > and of the functionalities. > But it is NOT so complex. As soon as I finished reading this guide, > it was easy for me to perform reverse engineering and extract the > effective architecture (SOA with internal interfaces) and the implied > functionalities (which are to be improved). Indeed it appears that you try to impose gLite specifics (like a logging & bookkeeping service or proxy certificates on the transport level) as requirements. This would severly limit the BES specification effort, and will not be accepted (I hope) by other stakeholders. > In the text, I have prepended a few words to explain that. 

Basically you imply that the "architecture and functionalities" of the gLite execution system (together with the PGI work) is somehow the guideline to be followed, which I fully disagree with.

> > 2.4 Collaboration with other services
> >
> > While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
>
> My document is NOT targeted only to the specification of the BES Client interface, but to the clear and consistent description of BES context and functional + operational requirements which are really necessary for interoperability.
> As far as I know, OGSA does NOT take into account GLUE 2.0 yet. Therefore, an up to date 'grid architecture specification' is absolutely necessary.

Glue2 is just an information model, not necessarily a perfect one nor the only one. However, I agree an information model has to be adopted for BES and any associated information systems.

> If OGSA members consider chapters 2.3 and 2.4 of my document as a 'grid architecture specification' which updates and improves OGSA, I thank them. If they consider that this 'grid architecture specification' does NOT comply with OGSA and competes with it, then I assert that it obsoletes OGSA.

I can't really say, the OGSA group stopped its work a long time ago and it's a long time since I looked at the documents.

> > Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
>
> FAILURE of practical operations is often caused by LACK of early care about operational concerns during specification phase.

Agreed.

> As GIN-GC has proven and documented, this is even more true for interoperability on the field (as opposed to theoretical interoperability at the WSDL level).
> I confirm that care about operational concerns is REQUIRED for real operations and for practical interoperability on the field. Although operational concerns are NOT part of the BES Client interface, they are REQUIRED for the overall specifications of BES in its context.
> In the text, I have stressed that the document DOES take into account operational concerns.
>
> > 4. BES non-functional requirements
> >
> > 4.1.2 Traceability - should be SHOULD not MUST
>
> This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ?

Isn't it already, powered by gLite and used by the wlcg botnet (just kidding of course) :-)

> No traceability --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services.
> I fully confirm that traceability is a MUST.

It is an internal detail which any good implementation will provide. If BES-A is much easier and more secure to operate than BES-B, admins can choose which to install.

> > 4.1.3 Security
> > - how can a specification "implement" a policy? You probably meant "BES implementations SHOULD ..."
>
> The text now is 'BES specifications MUST fully take into account the Security Policies ...'

Still no understanding here... let's take traceability. The relevant EGI policy <https://documents.egi.eu/public/ShowDocument?docid=81> says

"[...] software deployed in the Grid MUST include the ability to produce sufficient and relevant logging [...] The level of the logging MUST be configured by all service providers, including but not limited to the Sites, to produce the required information which MUST be retained for a minimum of 90 days."

For example, all UNICORE services can be made to comply with this policy by configuration of the logging library we use (Apache Log4j), and by not deleting log files for 90 days. So this is a feature of the implementation and the administrator in charge, not the specification. Thus, your sentence should read "BES implementations SHOULD ..." (it is a MUST, of course, only if they want to be deployed in EGI).

One does not try to specify implementation details, at least not in any specification I've ever seen (e.g. does the HTTP specification say anything about server logging or accepted CAs?).

> [...]
> > 4.2
> > all of this is out of scope. For example the UNICORE service container hosts a number of services including an execution service. Probably you mean that the execution service SPECIFICATION should be limited to the execution service and MUST NOT specify accounting, security etc.
>
> 'Well defined and narrow scope' is a general engineering requirement. It is a fundamental concern crossing requirements, design, specifications, fabrication, tests, operations, user experience and product maintenance for all types of products, even outside software engineering.

exactly.

> I confirm that 'Well defined and narrow scope' is absolutely REQUIRED for sound software design, implementation, deployment and maintainability.
> From your comment, I assume that the Execution Service of UNICORE does have a 'Well defined and narrow scope', does have precise interfaces with other UNICORE services, and minimizes overlaps with them.

of course. And UNICORE does not include a L&B service :-)

> [...]
> > 6.1
> >
> > * "SSL certificates MUST be signed by a CA..." this is an operational decision, and has nothing to do with the BES spec.
> > For example, a site may run an inhouse deployment of BES using an in-house CA. This requirement should be deleted.
>
> This operational concern is REQUIRED for practical interoperability on the field.
> I have prepended :
> * Authentication of Servers : The Execution Service SHOULD permit Clients to authenticate it. If an Execution Service authenticates itself to clients, it MUST permit Clients to really perform this authentication.

This sentence makes no sense to me, sorry. Maybe "Server and client SHOULD communicate via a secure channel (SSL/TLS)". Even this may not be true in the future, though it is for all(?) the Grid systems currently.

> > 6.3
> >
> > * "For Client authentication, the Execution Service MUST accept all following authentication methods: Full X509, RFC-3820-compliant X509 Proxy"
> >
> > This requirement is invalid. I agree that it would be nice to be able to specify authentication methods, but it is impossible. For example Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be valid authentication methods in some circumstances.
>
> There are 2 separate requirements :
> - 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy
> - 1 'MAY' for all other ones.
>
> > Furthermore, making proxies a MUST implies that nonstandard authentication libraries instead of TLS/SSL must be used, making the BES implementation insecure. For some implementors (like UNICORE) proxies on the transport level are very much a no-go.
>
> I had clearly specified RFC-3820-compliant X509 Proxy, which ARE standard.
> Your criticisms are valid for GSI proxies, which I have taken care NOT to mention.

By "standard" I did not mean that it is an RFC, but software support. As opposed to standard SSL/TLS, proxies are almost not supported by industry standard tools, for example Apache httpd or the Java JDK. One has to rely on custom code, which is notoriously buggy and error prone.

Since one important non-functional requirement (for me at least) is to be able to use standard (off-the-shelf) open source software, having to support proxies is a big limitation.

> > 6.4.
> >
> > "This authorization mechanism MUST be consistent across all instances of the Execution Service"
> >
> > This violates the autonomy of a site. Site administrators often wish to stay in control of their resources, and do not accept external authorisation decision points. And anyway, who cares? Since the AuthZ mechanism is internal to the BES, it cannot be specified in the BES spec as such.
>
> Interoperability requires a federation of independent administrative domains to agree on common functionalities, interfaces and operations.
> This DOES sometimes violate the autonomy of each individual site.
> The requirement is NOT that the AUTHZ decision point is external to any site, but that all participating sites MUST accept to install inside their site an instance of a commonly validated software implementing the decision point.

No. Each site may choose their own authz decision point, IMO.

> The AUTHZ mechanism MUST NOT be internal to the BES :

Maybe I was not clear. The authz mechanism is invisible for outside parties (like clients). It can be an external component, an internal component, whatever, it is up to the BES implementor and the site admin.

> For example, in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package is described as 'The security subsystem of UNICORE/X', and is NOT internal to 'de.fzj.unicore.uas.impl.job'.

In UNICORE site admins can choose what attribute sources and XACML decision points they want to use, but the clients (including other services) do not need to know this. That is what I meant by "internal".

> > 6.5
> >
> > These are requirements on the security layer (or framework) and should not be used as requirements on BES.
>
> These security requirements DO have impacts on the BES Client interface and on the Job Description document.
> In the text, I have made it clear.

Indeed, while preparing the EMI-ES specification, we came to the following conclusions:

1) when using proxies for delegation, it is necessary to map each data staging item to a delegated credential (you can check the EMI-ES job description for details)
2) the delegation operations are separate from the job management operations, so they do not necessarily have to be part of the BES client interface.

Also, there are existing implementations (UNICORE and Genesis come to my mind) that do not need this at all, because they do delegation without proxies.

So I disagree, 6.5 mostly describes features of the particular security framework that is used.

> > 8 BES requirements related to "Application Repositories"
> >
> > While I agree that BES should understand the notion of an "Application" (see e.g. JSDL ApplicationName), I don't agree that the BES should use these for Scheduling. Rather, this is the job of a broker.
>
> The text is now :
> * Resource selection : The Execution Service MUST use, among others, these references to 'Installed Applications' in order to select the most adequate computing resource for the Job.
>
> > 9 BES requirements applying to Accounting
> >
> > As a "MUST", these are out of scope, and should be made "SHOULD".
>
> No Accounting --> No precise reporting to funding agencies --> No funding --> Abrupt and very long shutdown of all services.
> I fully confirm that Accounting is a MUST.

... operational

> > 10 Logging/Bookkeeping
> >
> > Same as 9.
>
> Same as 'Traceability' :
> This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ?
> No Logging and Bookkeeping --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services.
> I fully confirm that Logging and Bookkeeping is a MUST.

An operational MUST maybe for some infrastructures, not all. E.g. a typical HPC site has its own accounting and its own logging systems, independent of the (Grid) software used to submit jobs to it.

> > 12 Jobs
> >
> > 12.1 Types of job
> >
> > Support for parallel jobs: it should be "MUST" :-)
>
> The text is now :
> - The concept of 'Single Job' includes Jobs running massively-parallel processes using MPI on one large-scale HPC System. The Execution Service MUST understand instructions for usage of MPI inside the Job Description document. The Execution Service SHOULD transmit these instructions to the Batch System, or return an explicit error message if not supported.

OK

Best regards,
Bernd.

--
Dr. Bernd Schuller
Federated Systems and Data
Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc
Phone: +49 246161-8736 (fax -8556)

------------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Registered in the commercial register of the local court (Amtsgericht) of Dueren, No. HR B 3498
Chairman of the Supervisory Board: MinDirig Dr. Karl Eugen Huthmacher
Board of Directors: Prof. Dr. Achim Bachem (Chairman), Karsten Beneke (Deputy Chairman), Prof. Dr.-Ing. Harald Bolt, Prof. Dr. Sebastian M. Schmidt
------------------------------------------------------------------------------------------------

Bernd and Andre,

Concerning the document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306 :

Thank you very much for your comments.

BES Client interface and BES operational requirements
-----------------------------------------------------
On Thu, 15/09/2011 20:43, Andre Merzky wrote:
> "The discussion about BES interface specification should be separated from the discussion about interoperation of BES services."

I agree, and I will create a separate chapter gathering all BES operational requirements having NO direct impact on the BES Client interface (this will take a little time).

BES requirements which seem to be specific to gLite
---------------------------------------------------
If a requirement corresponds to NO real functional or operational need, then we should simply remove it.

If a requirement corresponds to a real functional or operational need, but is expressed in a way too specific to gLite, then we should reformulate it in a more generic way. Please provide suggestions.

RFC-3820-compliant X509 Proxies
-------------------------------
The RFC-3820-compliant X509 proxies are fully supported by the jLite library written in Java by Oleg SUKHOROSLOV and available at http://code.google.com/p/jlite/

Dependency of BES on other grid software components / operational issues
------------------------------------------------------------------------
It is very good that we now agree on GLUE 2.0 as base for the Information System. Otherwise, we could NOT agree on the way to express references to grid entities in the Job Description document.

Your comments about chapter 6.5 confirm that the specification of the BES Client interface DEPENDS on whether BES supports X509 proxies for delegation of Security credentials or NOT.

Since delegation of Security credentials is a MUST for BES, my conclusion is that we MUST agree on SECURITY issues (even if some are operational issues) BEFORE trying to write down BES requirements.

In the same way, we know that Clients need to perform complex queries on Jobs. The BES Client interface DEPENDS on whether such queries are accepted by BES itself, or by a separate Logging and Bookkeeping service. So, I think that we have to agree on the existence or absence of a separate Logging and Bookkeeping service BEFORE trying to write down BES requirements about queries.

As a summary :
- Some BES requirements are quite independent from other grid components, and we can discuss them immediately.
- But some other BES requirements are DEPENDENT on foundational grid components or operational issues, in particular Information System, Security, Logging and Bookkeeping, ...
- Therefore, we have to agree on these other grid components or operational issues FIRST.

This is a critical issue, and I propose that we discuss it at OGF33.

I will answer your other comments later.

Best regards.

-----------------------------------------------------
Etienne URBAH
LAL, Univ Paris-Sud, IN2P3/CNRS
Bat 200 91898 ORSAY France
Tel: +33 1 64 46 84 87
Skype: etienne.urbah
Mob: +33 6 22 30 53 27
mailto:urbah@lal.in2p3.fr
-----------------------------------------------------

On Thu, 15/09/2011 22:44, Bernd Schuller wrote:
[...]

Hi Etienne,

On Fri, 2011-09-16 at 13:32 +0200, Etienne URBAH wrote:
> [...]
> RFC-3820-compliant X509 Proxies
> -------------------------------
> The RFC-3820-compliant X509 proxies are fully supported by the jLite library written in Java by Oleg SUKHOROSLOV and available at http://code.google.com/p/jlite/
You must be joking. I was talking about open source software like Apache httpd and Java JDK. jLite just wraps the cog-globus libraries and adds some gLite access APIs. No, thanks.
> Dependency of BES on other grid software components / operational issues
> ------------------------------------------------------------------------
> It is very good that we now agree on GLUE 2.0 as base for the Information System. Otherwise, we could NOT agree on the way to express references to grid entities in the Job Description document.
>
> Your comments about chapter 6.5 confirm that the specification of the BES Client interface DEPENDS on whether BES supports X509 proxies for delegation of Security credentials or NOT.
At least this was the EMI-ES v1.0 conclusion, which need not be the final word. The only dependency (in EMI-ES) is the specification which delegated credential is to be used for which data staging item. Delegation can be performed FULLY TRANSPARENT to the BES (for example on a message level as in UNICORE), and the BES interface specification is not dependent on it at all.
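To illustrate the single dependency described here, a minimal sketch in Python of a job description in which each data staging item names the delegated credential to use. The class and field names (`StagingItem`, `delegation_id`, `JobDescription`, `unknown_delegations`) are hypothetical and are not taken from the EMI-ES schema; the point is only that the reference is plain data inside the job description, while the delegation step itself can happen elsewhere.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class StagingItem:
    """One file to stage in or out (hypothetical structure, for illustration)."""
    name: str
    uri: str
    # Identifier of a previously delegated credential; None means that
    # no delegated credential is needed for this transfer.
    delegation_id: Optional[str] = None


@dataclass
class JobDescription:
    executable: str
    staging: List[StagingItem] = field(default_factory=list)


def unknown_delegations(job: JobDescription, known_ids: Set[str]) -> List[str]:
    """Return the names of staging items that reference a credential
    identifier the service has never received."""
    return [
        item.name
        for item in job.staging
        if item.delegation_id is not None and item.delegation_id not in known_ids
    ]


job = JobDescription(
    executable="/bin/simulate",
    staging=[
        StagingItem("input.dat", "gsiftp://se.example.org/input.dat", "deleg-1"),
        StagingItem("public.cfg", "http://www.example.org/public.cfg"),
        StagingItem("output.dat", "gsiftp://se.example.org/output.dat", "deleg-2"),
    ],
)

# Only "deleg-1" has been delegated so far, so output.dat cannot be staged yet.
missing = unknown_delegations(job, known_ids={"deleg-1"})
```

In such a layout the job management operations never see the credential itself, only its identifier, which is consistent with keeping delegation outside the BES client interface.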
> Since delegation of Security credentials is a MUST for BES, my conclusion is that we MUST agree on SECURITY issues (even if some are operational issues) BEFORE trying to write down BES requirements.
>
> In the same way, we know that Clients need to perform complex queries on Jobs. The BES Client interface DEPENDS on whether such queries are accepted by BES itself, or by a separate Logging and Bookkeeping service. So, I think that we have to agree on the existence or absence of a separate Logging and Bookkeeping service BEFORE trying to write down BES requirements about queries.
The BES client interface does not necessarily need to support complex queries; these should be part of a separate interface.
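A minimal sketch in Python of what such a separation could look like: a narrow job management interface, and a query interface that is composed with it rather than built into it. All names here (`JobManagement`, `JobQuery`, `InMemoryExecutionService`, `JobQueryService`) are hypothetical and do not correspond to the operations defined in GFD.108 or in any existing implementation.

```python
from typing import Callable, Dict, List, Protocol

JobRecord = Dict[str, str]


class JobManagement(Protocol):
    """Narrow interface: create, inspect and terminate single jobs."""
    def submit(self, description: str) -> str: ...
    def status(self, job_id: str) -> str: ...
    def terminate(self, job_id: str) -> None: ...


class JobQuery(Protocol):
    """Separate interface: arbitrary searches over job records."""
    def find(self, predicate: Callable[[JobRecord], bool]) -> List[str]: ...


class InMemoryExecutionService:
    """Toy implementation of JobManagement only; it offers no query operation."""

    def __init__(self) -> None:
        self._jobs: Dict[str, JobRecord] = {}

    def submit(self, description: str) -> str:
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"id": job_id, "description": description, "state": "PENDING"}
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]["state"]

    def terminate(self, job_id: str) -> None:
        self._jobs[job_id]["state"] = "TERMINATED"

    def records(self) -> List[JobRecord]:
        # Export copies so that the query side cannot modify job state.
        return [dict(record) for record in self._jobs.values()]


class JobQueryService:
    """Toy implementation of JobQuery, fed by any source of job records."""

    def __init__(self, source: Callable[[], List[JobRecord]]) -> None:
        self._source = source

    def find(self, predicate: Callable[[JobRecord], bool]) -> List[str]:
        return [record["id"] for record in self._source() if predicate(record)]


execution = InMemoryExecutionService()
first = execution.submit("run simulation A")
second = execution.submit("run simulation B")
execution.terminate(first)

queries = JobQueryService(execution.records)
terminated = queries.find(lambda record: record["state"] == "TERMINATED")
```

Whether the query side is a bookkeeping service, a database or something else is then a deployment choice; the job management interface stays the same in each case.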
> As a summary :
> - Some BES requirements are quite independent from other grid components, and we can discuss them immediately.
> - But some other BES requirements are DEPENDENT on foundational grid components or operational issues, in particular Information System, Security, Logging and Bookkeeping, ...
> - Therefore, we have to agree on these other grid components or operational issues FIRST.
>
> This is a critical issue, and I propose that we discuss it at OGF33.
Unfortunately I won't be in Lyon; I can only take part via phone.

Summarising, my main points for the BES interface specification :

* specifications must be narrowly scoped and composable (which is the JSDL/BES model anyway).
* do not try to specify specific authentication methods. Do not try to specify specific delegation methods. Security is a cross-cutting concern and should be dealt with separately.
* do not assume a special environment where a BES instance will run. Interactions with other services (except for data access) are optional and should be treated as such.
* leave operational aspects to operations. Recommendations for BES implementors may be given, of course.

Best regards,
Bernd.
I will answer your other comments later.
Best regards.
----------------------------------------------------- Etienne URBAH LAL, Univ Paris-Sud, IN2P3/CNRS Bat 200 91898 ORSAY France Tel: +33 1 64 46 84 87 Skype: etienne.urbah Mob: +33 6 22 30 53 27 mailto:urbah@lal.in2p3.fr -----------------------------------------------------
On Thu, 15/09/2011 22:44, Bernd Schuller wrote:
Hi Etienne,
thanks for the clarifications. So indeed your document is aimed at both:
1) providing requirements for the actual BES specification ("client interface" in your terminology)
2) the operation and deployment issues that have to be solved for interoperability on an infrastructure level (say EGI and EDGI).
It would be very beneficial for further progress if these two distinct concerns could be separated, at least CLEARLY marked in your document.
I have added some more comments inline.
On Thu, 2011-09-15 at 19:27 +0200, Etienne URBAH wrote:
Concerning the document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306 :
THANK YOU VERY MUCH for having taken the time to read this document, and for having taken the time to provide comments :
Such comments are very useful for the improvement of documents, permit convergence and prepare later agreement.
[...]
On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
[...]
1.4 Methodology
* ref 4: gLite user guide -> out of scope, gLite is too complex and too specific in its architecture (if there is such a thing) to be useful as a base for BES
Yes, this is a very large user guide for specific usage of gLite by users, and it does NOT provide a clear description of the architecture and of the functionalities. But it is NOT so complex. As soon as I finished reading this guide, it was easy for me to reverse-engineer it and extract the effective architecture (SOA with internal interfaces) and the implied functionalities (which are to be improved).
Indeed it appears that you try to impose gLite specifics (like a logging & bookkeeping service or proxy certificates on the transport level) as requirements. This would severely limit the BES specification effort, and will not be accepted (I hope) by other stakeholders.
In the text, I have prepended a few words to explain that.
Basically you imply that the "architecture and functionalities" of the gLite execution system (together with the PGI work) is somehow the guideline to be followed, which I fully disagree with.
2.4 Collaboration with other services
While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
My document is NOT targeted only to the specification of the BES Client interface, but to the clear and consistent description of BES context and functional + operational requirements which are really necessary for interoperability. As far as I know, OGSA does NOT take into account GLUE 2.0 yet. Therefore, an up to date 'grid architecture specification' is absolutely necessary.
Glue2 is just an information model, not necessarily a perfect one nor the only one. However, I agree an information model has to be adopted for BES and any associated information systems.
If OGSA members consider chapters 2.3 and 2.4 of my document as a 'grid architecture specification' which updates and improves OGSA, I thank them. If they consider that this 'grid architecture specification' does NOT comply with OGSA and competes with it, then I assert that it obsoletes OGSA.
I can't really say, the OGSA group stopped its work a long time ago and it has been a long time since I looked at the documents.
Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
FAILURE of practical operations is often caused by LACK of early care about operational concerns during specification phase.
Agreed.
As GIN-GC has proven and documented, this is even more true for interoperability in the field (as opposed to theoretical interoperability at the WSDL level). I confirm that care about operational concerns is REQUIRED for real operations and for practical interoperability in the field. Although operational concerns are NOT part of the BES Client interface, they are REQUIRED for the overall specifications of BES in its context. In the text, I have stressed that the document DOES take into account operational concerns.
4. BES non-functional requirements
4.1.2 Traceability - should be SHOULD not MUST
This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ?
Isn't it already, powered by gLite and used by the wlcg botnet (just kidding of course) :-)
No traceability --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that traceability is a MUST.
It is an internal detail which any good implementation will provide. If BES-A is much easier and more secure to operate than BES-B, admins can choose which to install.
4.1.3 Security - how can a specification "implement" a policy? You probably meant "BES implementations SHOULD ..."
The text now is 'BES specifications MUST fully take into account the Security Policies ...'
Still no understanding here... let's take traceability. The relevant EGI policy <https://documents.egi.eu/public/ShowDocument?docid=81> says:
"[...] software deployed in the Grid MUST include the ability to produce sufficient and relevant logging [...] The level of the logging MUST be configured by all service providers, including but not limited to the Sites, to produce the required information which MUST be retained for a minimum of 90 days."
For example, all UNICORE services can be made to comply with this by configuring the logging library we use (Apache Log4j), and by not deleting log files for 90 days. So this is a feature of the implementation and of the administrator in charge, not of the specification.
Thus, your sentence should read "BES implementations SHOULD ..." (it is a MUST, of course, only if they want to be deployed in EGI).
One does not try to specify implementation details, at least not in any specification I've ever seen (e.g. does the HTTP specification say anything about server logging or accepted CAs?).
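To make the "configuration plus administrator discipline" argument concrete, here is a minimal sketch of the retention side of that policy. It is not taken from UNICORE or from any BES specification: the function name, the log file names and the idea of feeding it last-modification times are all assumptions made for illustration. Only the 90-day minimum comes from the policy text quoted above.

```python
from datetime import datetime, timedelta

# Assumption: 90 days is the minimum retention period quoted from the EGI policy.
MIN_RETENTION = timedelta(days=90)


def deletable_logs(last_modified, now, retention=MIN_RETENTION):
    """Return the names of log files that are older than the retention period.

    last_modified maps a (hypothetical) log file name to the time of its last write.
    A file that is exactly `retention` old is still kept.
    """
    return sorted(name for name, ts in last_modified.items() if now - ts > retention)


now = datetime(2011, 9, 16)
logs = {
    "bes-2011-05-01.log": datetime(2011, 5, 1),  # 138 days old
    "bes-2011-08-01.log": datetime(2011, 8, 1),  # 46 days old
}
print(deletable_logs(logs, now))
```

A site could run such a check from its own housekeeping job without the execution service interface knowing anything about it, which is the point being made about implementation versus specification.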
[...]
4.2 all of this is out of scope. For example the UNICORE service container hosts a number of services including an execution service. Probably you mean that the execution service SPECIFICATION should be limited to the execution service and MUST NOT specify accounting, security etc.
'Well defined and narrow scope' is a general engineering requirement. It is a fundamental concern crossing requirements, design, specifications, fabrication, tests, operations, user experience and product maintenance for all types of products, even outside software engineering.
exactly.
I confirm that 'Well defined and narrow scope' is absolutely REQUIRED for sound software design, implementation, deployment and maintainability. From your comment, I assume that the Execution Service of UNICORE does have a 'Well defined and narrow scope', does have precise interfaces with other UNICORE services, and minimizes overlaps with them.
of course. And UNICORE does not include a L&B service :-)
[...]
6.1
* "SSL certificates MUST be signed by a CA..." this is an operational decision, and has nothing to do with the BES spec. For example, a site may run an in-house deployment of BES using an in-house CA. This requirement should be deleted.
This operational concern is REQUIRED for practical interoperability in the field. I have prepended :
* Authentication of Servers : The Execution Service SHOULD permit Clients to authenticate it. If an Execution Service authenticates itself to clients, it MUST permit Clients to really perform this authentication.
This sentence makes no sense to me, sorry. Maybe "Server and client SHOULD communicate via a secure channel (SSL/TLS)". Even this may not be true in the future, though it is for all(?) the Grid systems currently.
6.3
* "For Client authentication, the Execution Service MUST accept all following authentication methods: Full X509, RFC-3820-compliant X509 Proxy"
This requirement is invalid. I agree that it would be nice to be able to specify authentication methods, but it is impossible. For example Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be valid authentication methods in some circumstances.
There are 2 separate requirements :
- 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy
- 1 'MAY' for all other ones.
Furthermore, making proxies a MUST implies that nonstandard authentication libraries instead of TLS/SSL must be used, making the BES implementation insecure. For some implementors (like UNICORE) proxies on the transport level are very much a no-go.
I had clearly specified RFC-3820-compliant X509 Proxy, which ARE standard. Your criticisms are valid for GSI proxies, which I have taken care NOT to mention.
By "standard" I did not mean that it is an RFC, but software support. As opposed to standard SSL/TLS, proxies are hardly supported by industry standard tools, for example Apache httpd or the Java JDK. One has to rely on custom code, which is notoriously buggy and error prone.
Since one important non-functional requirement (for me at least) is to be able to use standard (off-the-shelf) open source software, having to support proxies is a big limitation.
6.4.
"This authorization mechanism MUST be consistent across all instances of the Execution Service"
This violates the autonomy of a site. Site administrators often wish to stay in control of their resources, and do not accept external authorisation decision points. And anyway, who cares? Since the AuthZ mechanism is internal to the BES, it cannot be specified in the BES spec as such.
Interoperability requires a federation of independent administrative domains to agree on common functionalities, interfaces and operations. This DOES sometimes violate the autonomy of each individual site. The requirement is NOT that the AUTHZ decision point be external to any site, but that all participating sites MUST agree to install inside their site an instance of commonly validated software implementing the decision point.
No. Each site may choose their own authz decision point, IMO.
The AUTHZ mechanism MUST NOT be internal to the BES :
Maybe I was not clear. The authz mechanism is invisible for outside parties (like clients). It can be an external component, an internal component, whatever, it is up to the BES implementor and the site admin.
For example, in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package is described as 'The security subsystem of UNICORE/X', and is NOT internal to 'de.fzj.unicore.uas.impl.job'.
In UNICORE site admins can choose what attribute sources and XACML decision points they want to use, but the clients (including other services) do not need to know this. That is what I meant by "internal".
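The sense of "internal" used here can be sketched as follows. This is an illustrative sketch only: `DecisionPoint`, `AllowList`, `OpenAccess` and `ExecutionService` are hypothetical names, not UNICORE classes and not part of any BES interface. The assumption is simply that the execution service talks to whatever decision point the site admin has configured, while the client-facing call stays the same.

```python
from typing import Protocol


class DecisionPoint(Protocol):
    """Hypothetical interface between an execution service and its authz backend."""

    def permits(self, subject: str, action: str) -> bool: ...


class AllowList:
    """One possible site policy: only explicitly listed subjects may act."""

    def __init__(self, subjects):
        self._subjects = frozenset(subjects)

    def permits(self, subject: str, action: str) -> bool:
        return subject in self._subjects


class OpenAccess:
    """Another possible site policy, e.g. for an in-house grid: everybody may act."""

    def permits(self, subject: str, action: str) -> bool:
        return True


class ExecutionService:
    def __init__(self, decision_point: DecisionPoint):
        # Chosen by the site admin at deployment time; never exposed to clients.
        self._decision_point = decision_point

    def submit(self, subject: str, job: str) -> str:
        if self._decision_point.permits(subject, "submit"):
            return "accepted"
        return "denied"


site_a = ExecutionService(AllowList(["alice"]))
site_b = ExecutionService(OpenAccess())
print(site_a.submit("bob", "job.jsdl"), site_b.submit("bob", "job.jsdl"))
```

The client calls `submit` identically at both sites; only the outcome differs. Whether all sites should be required to deploy the same decision point is exactly the disagreement in this thread, and the sketch does not settle it.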
6.5
These are requirements on the security layer (or framework) and should not be used as requirements on BES.
These security requirements DO have impacts on the BES Client interface and on the Job Description document. In the text, I have made it clear.
Indeed, while preparing the EMI-ES specification, we came to the following conclusions :
1) when using proxies for delegation, it is necessary to map each data staging item to a delegated credential (you can check the EMI-ES job description for details)
2) the delegation operations are separate from the job management operations, so they do not necessarily have to be part of the BES client interface.
Also, there are existing implementations (UNICORE and Genesis come to my mind) that do not need this at all, because they do delegation without proxies.
So I disagree, 6.5 mostly describes features of the particular security framework that is used.
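The first EMI-ES conclusion mentioned above (each data staging item is mapped to a delegated credential) can be sketched as a small data structure plus a consistency check. The field names `url` and `delegation_id` and the function `unmapped_items` are hypothetical and are NOT the EMI-ES job description schema; the sketch only assumes that delegation happens beforehand, through separate operations, and yields identifiers that staging items can refer to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagingItem:
    url: str            # where the data is fetched from or stored to
    delegation_id: str  # hypothetical reference to a previously delegated credential


def unmapped_items(items, known_delegations):
    """Return the URLs of staging items whose credential reference is unknown."""
    known = set(known_delegations)
    return [item.url for item in items if item.delegation_id not in known]


items = [
    StagingItem("gsiftp://se.example.org/in.dat", "deleg-1"),
    StagingItem("https://store.example.org/out.dat", "deleg-2"),
]
print(unmapped_items(items, ["deleg-1"]))
```

In this sketch the job description only carries references, so the delegation operations themselves can live outside the job management interface, in line with the second conclusion.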
8 BES requirements related to "Application Repositories"
While I agree that BES should understand the notion of an "Application" (see e.g. JSDL ApplicationName), I don't agree that the BES should use these for Scheduling. Rather, this is the job of a broker.
The text is now : * Resource selection : The Execution Service MUST use, among others, these references to 'Installed Applications' in order to select the most adequate computing resource for the Job.
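As an illustration of what "use references to Installed Applications to select a resource" could mean in practice, here is a minimal sketch. The function name and the shape of the `resources` mapping are assumptions for illustration; nothing here comes from JSDL, GLUE 2.0 or BES. The sketch is also neutral on whether this filtering runs inside the execution service or in a broker, which is the point under dispute.

```python
def candidate_resources(required_apps, resources):
    """Return the names of resources that have every required application installed.

    resources maps a (hypothetical) resource name to the set of application
    names installed on it.
    """
    needed = set(required_apps)
    return sorted(name for name, installed in resources.items() if needed <= installed)


resources = {
    "cluster-a": {"blast", "gromacs"},
    "cluster-b": {"blast"},
}
print(candidate_resources(["blast", "gromacs"], resources))
```

Installed applications act here as one filter "among others", as in the requirement text; ranking the remaining candidates is left out of the sketch.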
9 BES requirements applying to Accounting
As a "MUST", these are out of scope, and should be made "SHOULD".
No Accounting --> No precise reporting to funding agencies --> No funding --> Abrupt and very long shutdown of all services. I fully confirm that Accounting is a MUST.
... operational
10 Logging/Bookkeeping
Same as 9.
Same as 'Traceability' : This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ? No Logging and Bookkeeping --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that Logging and Bookkeeping is a MUST.
an operational MUST maybe for some infrastructures, not all.
E.g. a typical HPC site has its own accounting, its own logging systems independent of the (Grid) software used to submit jobs to it.
12 Jobs
12.1 Types of job
Support for parallel jobs: it should be "MUST" :-)
The text is now : - The concept of 'Single Job' includes Jobs running massively-parallel processes using MPI on one large-scale HPC System. The Execution Service MUST understand instructions for usage of MPI inside the Job Description document. The Execution Service SHOULD transmit these instructions to the Batch System, or return an explicit error message if not supported.
OK
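The agreed wording (understand MPI instructions, pass them to the batch system, or return an explicit error) can be sketched as below. The `mpi` key, the `processes` field, the directive string and the `UnsupportedFeature` exception are all hypothetical; they are not JSDL elements or real batch system syntax.

```python
class UnsupportedFeature(Exception):
    """Explicit error returned when a requested feature cannot be honoured."""


def batch_directives(job, batch_supports_mpi):
    """Translate the (hypothetical) MPI part of a job description into batch directives."""
    mpi = job.get("mpi")
    if mpi is None:
        return []  # not an MPI job, nothing to transmit
    if not batch_supports_mpi:
        raise UnsupportedFeature("MPI requested but not supported by the batch system")
    return [f"processes={mpi['processes']}"]


print(batch_directives({"executable": "sim", "mpi": {"processes": 64}}, True))
```

The important property is that an MPI request is never silently dropped: it is either transmitted or rejected with an explicit error.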
Best regards, Bernd.
-- Dr. Bernd Schuller Federated Systems and Data Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc Phone: +49 246161-8736 (fax -8556)

+1 on all points. Cheers, Andre. 2011/9/16 Bernd Schuller <b.schuller@fz-juelich.de>:
[...]
-- Nothing is ever easy...
participants (3)
- Andre Merzky
- Bernd Schuller
- Etienne URBAH