
Bernd,
Concerning the document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306 :
THANK YOU VERY MUCH for having taken the time to read this document, and for having taken the time to provide comments : Such comments are very useful for the improvement of documents, permit convergence and prepare later agreement.
My answers are interspersed in your comments.
Please note that the chapter numbering has been modified. Besides, in the chapter about 'Data Management', I have added a description of the context and the workflows for automatic and manual data staging.
The new version of the document is available at the location given above.
In order that we all make progress, please continue sending comments.
Best regards.
-----------------------------------------------------
Etienne URBAH LAL, Univ Paris-Sud, IN2P3/CNRS
Bat 200 91898 ORSAY France
Tel: +33 1 64 46 84 87 Skype: etienne.urbah
Mob: +33 6 22 30 53 27 mailto:urbah@lal.in2p3.fr
-----------------------------------------------------
On Thu, 15/09/2011 13:00, Bernd Schuller wrote:
Hi Etienne,
thanks for creating this consolidated requirements document, it is very interesting to read. I have a few comments, mostly related to whether certain features are a "MUST" or not.
In many cases it was not clear to me whether you talk about the BES interface specification, or about the way a BES instance should be deployed and operated. It would be good to remove the operational and deployment concerns, so that only the specification parts remain.
I hope my comments can be used to make the document clearer and easier to digest.
So, in detail:
1.4 Methodology
* ref 4: glite user guide -> out of scope, gLite is too complex and too specific in its architecture (if there is such a thing) to be useful as a base for BES
Yes, this is a very large user guide for specific usage of gLite by users, and it does NOT provide a clear description of the architecture and of the functionalities. But it is NOT so complex. As soon as I finished reading this guide, it was easy for me to perform reverse engineering and extract the effective architecture (SOA with internal interfaces) and the implied functionalities (which are to be improved). In the text, I have prepended a few words to explain that.
2.4 Collaboration with other services
While this is important for interoperability, it is unimportant for the specification of a BES. The BES spec should NOT try to specify all the interaction with the rest of the world. This is the task of a "grid architecture specification" like OGSA.
My document is NOT targeted only at the specification of the BES Client interface, but at a clear and consistent description of the BES context and of the functional and operational requirements which are really necessary for interoperability. As far as I know, OGSA does NOT take GLUE 2.0 into account yet. Therefore, an up-to-date 'grid architecture specification' is absolutely necessary. If OGSA members consider chapters 2.3 and 2.4 of my document as a 'grid architecture specification' which updates and improves OGSA, I thank them. If they consider that this 'grid architecture specification' does NOT comply with OGSA and competes with it, then I assert that it obsoletes OGSA.
Specifically, the interactions with security, monitoring, accounting and logging framework are OPERATIONAL concerns that MUST NOT be a mandatory part of a BES specification.
FAILURE of practical operations is often caused by a LACK of early care about operational concerns during the specification phase. As GIN-GC has proven and documented, this is even more true for interoperability in the field (as opposed to theoretical interoperability at the WSDL level). I confirm that care about operational concerns is REQUIRED for real operations and for practical interoperability in the field. Although operational concerns are NOT part of the BES Client interface, they are REQUIRED for the overall specifications of BES in its context. In the text, I have stressed that the document DOES take into account operational concerns.
4. BES non-functional requirements
4.1.2 Traceability - should be SHOULD not MUST
This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ? No traceability --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that traceability is a MUST.
4.1.3 Security - how can a specification "implement" a policy? You probably meant "BES implementations SHOULD ..."
The text now is 'BES specifications MUST fully take into account the Security Policies ...'
- availability/reliability: while I agree, you cannot enforce quality of an implementation via the specification.
YES. Thank you.
So all of 4.1.3 is SHOULD in my opinion.
Security policies stay a MUST.
4.2 all of this is out of scope. For example the UNICORE service container hosts a number of services including an execution service. Probably you mean that the execution service SPECIFICATION should be limited to the execution service and MUST NOT specify accounting, security etc.
'Well defined and narrow scope' is a general engineering requirement. It is a fundamental concern that cuts across requirements, design, specifications, fabrication, tests, operations, user experience and product maintenance for all types of products, even outside software engineering. I confirm that 'Well defined and narrow scope' is absolutely REQUIRED for sound software design, implementation, deployment and maintainability. From your comment, I assume that the Execution Service of UNICORE does have a 'Well defined and narrow scope', does have precise interfaces with other UNICORE services, and minimizes overlaps with them. The text now is 'The Execution Service is defined by its functionalities (efficiently manage Jobs, which are transient entities) and its operational constraints : For sound software design, implementation, deployment and maintainability, ...'
5 BES requirements applying to the info system
Introduction: this is a requirement on the grid, not on the BES.
This is NOT a requirement, this is a DESCRIPTION. The importance of the Information System as a foundational block is underestimated, and precise knowledge of GLUE 2.0 is poor inside OGF. I am trying to improve the situation by providing clear explanations.
6 BES requirements applying to security
Introduction: when you say "The Execution Service MUST NOT be overloaded by implementing a security framework", it is not clear what you mean by "service". Do you mean the service as defined by its interfaces, or do you mean the actual BES process or even machine which is running BES? For example, a BES may be one of many services hosted by a service container, which may include a full featured security framework. This should be clarified.
The text now is 'The Execution Service is defined by its functionalities (efficiently manage Jobs, which are transient entities) and its operational constraints. For sound software design, implementation, deployment and maintainability, BES specifications and instances of the Execution Service ...'
6.1
* "SSL certificates MUST be signed by a CA..." this is an operational decision, and has nothing to do with the BES spec. For example, a site may run an inhouse deployment of BES using an in-house CA. This requirement should be deleted.
This operational concern is REQUIRED for practical interoperability in the field. I have prepended :
* Authentication of Servers : The Execution Service SHOULD permit Clients to authenticate it. If an Execution Service authenticates itself to clients, it MUST permit Clients to really perform this authentication.
6.3
* "For Client authentication, the Execution Service MUST accept all following authentication methods: Full X509, RFC-3820-compliant X509 Proxy"
This requirement is invalid. I agree that it would be nice to be able to specify authentication methods, but it is impossible. For example Shibboleth, Username/password, OpenID, OAuth (e.g. for a REST interface over plain HTTP), or even NOTHING (e.g. in an inhouse grid) all can be valid authentication methods in some circumstances.
There are 2 separate requirements :
- 1 'MUST' for Full X509 and RFC-3820-compliant X509 Proxy
- 1 'MAY' for all other ones.
Furthermore, making proxies a MUST implies that nonstandard authentication libraries instead of TLS/SSL must be used, making the BES implementation insecure. For some implementors (like UNICORE) proxies on the transport level are very much a no-go.
I had clearly specified RFC-3820-compliant X509 Proxies, which ARE standard. Your criticisms are valid for GSI proxies, which I have taken care NOT to mention.
6.4.
"This authorization mechanism MUST be consistent across all instances of the Execution Service"
This violates the autonomy of a site. Site administrators often wish to stay in control of their resources, and do not accept external authorisation decision points. And anyway, who cares? Since the AuthZ mechanism is internal to the BES, it cannot be specified in the BES spec as such.
Interoperability requires a federation of independent administrative domains to agree on common functionalities, interfaces and operations. This DOES sometimes violate the autonomy of each individual site. The requirement is NOT that the AUTHZ decision point is external to any site, but that all participating sites MUST accept to install inside their site an instance of a commonly validated software implementing the decision point. The AUTHZ mechanism MUST NOT be internal to the BES : For example, in UNICORE atomic services, the 'de.fzj.unicore.uas.security' package is described as 'The security subsystem of UNICORE/X', and is NOT internal to 'de.fzj.unicore.uas.impl.job'.
6.5
These are requirements on the security layer (or framework) and should not be used as requirements on BES.
These security requirements DO have impacts on the BES Client interface and on the Job Description document. In the text, I have made it clear.
8 BES requirements related to "Application Repositories"
While I agree that BES should understand the notion of an "Application" (see e.g. JSDL ApplicationName), I don't agree that the BES should use these for Scheduling. Rather, this is the job of a broker.
The text is now : * Resource selection : The Execution Service MUST use, among others, these references to 'Installed Applications' in order to select the most adequate computing resource for the Job.
9 BES requirements applying to Accounting
As a "MUST", these are out of scope, and should be made "SHOULD".
No Accounting --> No precise reporting to funding agencies --> No funding --> Abrupt and very long shutdown of all services. I fully confirm that Accounting is a MUST.
10 Logging/Bookkeeping
Same as 9.
Same as 'Traceability' : This is an operational concern : Would you really take the risk that the whole EGI becomes a spambot or a scambot ? No Logging and Bookkeeping --> No post mortem analysis after attack --> Large infection --> Panic --> Abrupt and very long shutdown of all services. I fully confirm that Logging and Bookkeeping is a MUST.
12 Jobs
12.1 Types of job
Support for parallel jobs: it should be "MUST" :-)
The text is now : - The concept of 'Single Job' includes Jobs running massively-parallel processes using MPI on one large-scale HPC System. The Execution Service MUST understand instructions for usage of MPI inside the Job Description document. The Execution Service SHOULD transmit these instructions to the Batch System, or return an explicit error message if not supported.
Best regards, Bernd.
On Tue, 2011-09-13 at 20:51 +0200, Etienne URBAH wrote:
Andrew, Steven and all OGSA-BES stakeholders,
In the document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306 I have made the following improvements :
- For each functionality, clearly separate :
  - the requirements about the corresponding Client interface (often 'MUST'),
  - the requirements about the implementation of the functionality (sometimes 'MAY').
- Add short titles in bold to improve readability.
Thank you in advance for reading and criticizing this requirements document.
Best regards.
Etienne URBAH
On Fri, 09/09/2011 22:51, Etienne URBAH wrote:
Andrew, Steven and all OGSA-BES stakeholders,
I have finished writing a document named 'Requirements for an improved Basic Execution Service (BES)' and available at http://forge.gridforum.org/sf/go/doc16306
I will use this requirements document to propose improvements to the current specification of BES 1.0 described in GFD.108, as agreed at the OGSA-BES session at OGF32 in Salt Lake City on 17 July 2011.
Thank you in advance for reading and criticizing my requirements document.
Best regards.
[...]
-- Dr. Bernd Schuller Federated Systems and Data Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc Phone: +49 246161-8736 (fax -8556)