Definition of a Production Grid
Dear All,
The word "Grid" means different things to different people, and this quite frequently leads to misunderstandings and conflict. This is especially true within OGF, where people from many different domains and backgrounds discuss "Grids". While each definition is as valid as any other, trying to discuss a concept from two possibly unrelated perspectives is quite difficult.
Before we move forward in the PGI activity we therefore need to understand what we mean by "Grid" and hence the problem we are trying to solve. It seems that in some recent threads the word "production" has caused some disagreement, as this word is more commonly used to reflect quality rather than concepts.
From my perspective I can try to define the core properties that I see as key when describing the EGEE infrastructure. I would hope that similar infrastructures with which EGEE would like to interoperate would have a similar description.
I would like to introduce the term "Multi-institutional infrastructures for e-Science", which can be used to describe something with the following properties.
1) Multi-institutional. The EGEE infrastructure is composed by linking resources that reside at autonomous academic institutes. The key concepts here are the administrative domain, which maps to a real institute, and the fact that there is more than one institute in the infrastructure.
2) The Virtual Organization. The main aspect that links these institutes is the drive to collaborate in order to do science. The collaborations are defined by Virtual Organizations.
3) x509 Certificates and Proxies. Users are identified by x509 certificates, which are provided by CAs accredited by the IGTF. Proxies are typically used to interact with services.
4) VOMS. The VO to which a user belongs is identified by contacting VOMS. VOMS also supports Roles and Groups within the VO, which are implemented by adding attributes to the proxy.
5) Multiple Services. There are many different types of services, each of which can define its own interface, which may or may not be a Web Service.
6) Parallel Information System. The information about services is found by querying an information system, which is not necessarily related to the service interface.
7) Scale. Hundreds of administrative domains, thousands of users, thousands of services, millions of computing activities and petabytes of data.
This may be a can of worms, but if we can agree on a set of properties for what we mean by "Production Grid" it might help in the discussions. I don't mean that anything that has different properties is not a production grid; I just want to clarify what this group means when it talks about Grid infrastructures.
Laurence
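As an illustration only, the property list in the message above can be written down as a simple checklist. This is a minimal sketch: the class name, the field names and the example profile are assumptions made for this illustration and do not come from EGEE, OGF or any middleware.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class InfrastructureProfile:
    """One flag per proposed property (field names are illustrative)."""
    multi_institutional: bool          # more than one administrative domain
    virtual_organizations: bool        # collaborations defined by VOs
    igtf_x509_identities: bool         # x509 certificates from IGTF-accredited CAs
    voms_membership: bool              # VO membership, roles and groups via VOMS
    multiple_service_types: bool       # services may define their own interfaces
    separate_information_system: bool  # discovery independent of service interfaces
    large_scale: bool                  # the scale property from the list


def missing_properties(profile: InfrastructureProfile) -> list[str]:
    """Return the names of the properties that the profile does not satisfy."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]


# Hypothetical profile: an infrastructure that meets everything except scale.
campus = InfrastructureProfile(True, True, True, True, True, True, False)
print(missing_properties(campus))  # ['large_scale']
```

A checklist like this makes it explicit which single property excludes a given infrastructure from the proposed term, which is the kind of comparison the rest of the thread argues about.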
Hi Laurence,
I would be very uncomfortable with your definitions: they describe the status quo, and possibly not something that could exist in 18 months' time. For example, an internal Campus Grid running production systems (management, accounting, security, etc.) will require interoperability between institutions, not necessarily through national grids. Shibboleth is rapidly gaining ground for authentication and authorization and will be used, for example, by the UK NGI quite soon. I am uncomfortable with all of the others as well; as far as I can see, your criteria would, for example, not include DEISA.
Can I suggest that we just set performance, policy and procedure targets and go from there? I.e. your grid will have legally compliant accounting for utilisation by a number of users who are identified using a strong authentication and authorisation mechanism, across a set of physically separate resources that may or may not be legally owned by more than one legal entity. The services that these offer can be many and varied, but all should operate to a defined quality of service definition. That would be easier for everyone to sign up to.
David
PS The University of Oxford campus grid would, though, satisfy all of your criteria, except that we have ~100 users across 38 different independent legal entities within the university.
PPS For those who wonder what real production is, I would suggest looking at CycleComputing.com: there they provide grid services to a significant number of commercial organisations, with legal guarantees on data separation and legal consequences if things go wrong. I doubt that there would be real legal ramifications if EGEE/ARC/DEISA didn't fulfil their SLDs.
On 17/03/2009 13:58, "Laurence Field" <Laurence.Field@cern.ch> wrote:
[...]
_______________________________________________ Pgi-wg mailing list Pgi-wg@ogf.org http://www.ogf.org/mailman/listinfo/pgi-wg
David Wallom wrote:
Hi Laurence, [...] Can I suggest that we just set performance, policy and procedure targets and go from there? I.e. your grid will have legally compliant accounting for utilisation by a number of users who are identified using a strong authentication and authorisation mechanism, across a set of physically separate resources that may or may not be legally owned by more than one legal entity. The services that these offer can be many and varied, but all should operate to a defined quality of service definition.
Hi all,
I think that this definition is a bit generic, in the sense that it surely defines a "Grid", but I don't see how it addresses the term "Production" (which I agree is a term that is a bit elusive to quantify or qualify appropriately). In my mind I have always associated "production" grids with those large-scale infrastructures (how large?) that are used to get "real job" done (what does "real job" mean?). This is what I thought was the line dividing "production" grids from "non-production" ones.
Moreno.
--
Moreno Marzolla
INFN Sezione di Padova, via Marzolo 8, 35131 PADOVA, Italy
EMail: moreno.marzolla@pd.infn.it Phone: +39 049 8277103
WWW : http://www.dsi.unive.it/~marzolla Fax : +39 049 8756233
I have to firmly disagree. Production should refer to the quality and number of services that are available rather than to the specific size of the infrastructure. The scale of an infrastructure has, at the moment, nothing to do with whether its resources are interoperable. Your separation by size is a completely arbitrary one. A production grid should be able to display policies and procedures for the management services, and SLDs for the services that it provides to users.
Are you suggesting, for example, that a single national grid is not a production service? I can assure you that GLOW and other components of OSG, as well as the UK NGS etc., get an awful lot of work done, with many, many publications in high-value refereed journals as a direct result. Maybe we could use publication impact of the work done as a measure instead; it would be as arbitrary as 'real work'.
David
On 17/03/2009 14:50, "Moreno Marzolla" <moreno.marzolla@pd.infn.it> wrote:
[...]
I've never heard of the number of services being part of a definition of production. One service can be production, provided it meets the quality of service demands of the enterprise.
Perhaps the reason size matters in grid is that no single item on the grid can be depended upon, but after a certain size there are enough examples of each item that the grid itself can boast "production quality". Hence, a user's workstation "on the grid" may not be of production quality, but 100,000 such workstations can produce a "production quality" level of service.
Arnie
David Wallom wrote:
[...]
--
Arnie Miles
Grid Middleware Architect
Adjunct Assistant Professor of Computer Science
Georgetown University
3300 Whitehaven Street NW
Washington, DC 20007
202.687.9379
http://thebes.arc.georgetown.edu
"Great spirits have always encountered violent opposition from mediocre minds" Albert Einstein
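The argument that many undependable workstations can together deliver a dependable service can be sketched numerically. The following is a minimal illustration, assuming that every node is independently available with the same probability p; that independence assumption, the function name and the numbers used are choices made for this sketch and are not taken from the thread.

```python
import math


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k of n nodes are up), assuming each node is up
    independently with probability p (a simplifying assumption)."""
    if k <= 0:
        return 1.0
    if k > n or p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        # log of C(n, i) * p^i * (1 - p)^(n - i), computed in log space
        # so that large n does not overflow or underflow.
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_term)
    return min(total, 1.0)


# Illustrative numbers only: one node that is up half of the time,
# versus needing any 50 out of 200 such nodes.
single_node = prob_at_least(1, 1, 0.5)
pool = prob_at_least(50, 200, 0.5)
print(single_node, pool)
```

Under these assumptions the pool is far more dependable than any single member, which is the effect described above. Real nodes are rarely independent (shared networks, power and software stacks), so the result should be read as an upper bound on what size alone can buy.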
Hi Arnie,
Size is definitely a misleading concept. There are about 4 computers in the Top 500 list that have more cores than are available in EGEE.
Laurence
Arnie Miles wrote:
[...]
David Wallom wrote: [...]
Maybe we could use publication impact of the work done as a measure instead, it would be as arbitrary as 'real work'?
I do not think that publication impact is a good metric. A research/academic infrastructure used only, e.g., for testing and developing new programming models and/or technologies related to grid computing will surely have a large publication impact, as it will be used to try out new paradigms which will be the subject of many publications. But I would not, for this reason, qualify such an infrastructure as "production".
I was trying to look up the definition of "production system" in some dictionary, but so far I have failed to locate anything useful...
Moreno.
Hi,
I agree that "production" can be interpreted in many ways, and it may well be that the name of this group is misleading.
Let's check what Webster has to say of "production" as an attributive: "something not specially designed or customized and usually mass-produced <a production car> <production housing>"
Note "usually mass-produced". In my understanding, a "production grid" is like a "production car": your average Renault is a production car, as opposed to an F1 racing car by Ferrari. The F1 car performs much better, but not when it comes to packing skis and dogs and kids. F1 cars have to meet very tight standards, and a Megane is a pretty standard car, too, but the standards are different. It's much easier to buy a set of wheels or a mirror for a production car, because they usually are interchangeable.
Uh, sorry for going that far with the analogy; I hope you all got my point.
Cheers, Oxana
David Wallom wrote:
Maybe we could use publication impact of the work done as a measure instead, it would be as arbitrary as 'real work'?
Production is IMHO about infrastructure that enables you to do something, i.e. it's dependable. E.g. cars, buses and planes form part of a production transport infrastructure. Different users of an infrastructure will have different metrics that define their success. A transport infrastructure is only useful to me if it gets me where I want to go.
Moreno Marzolla wrote:
I do not think that the publication impact is a good metric. A research/academic infrastructure used only, e.g., for testing and developing new programming models and/or technologies related to grid computing will surely have a large publication impact, as it will be used to try out new paradigms which will be subject to many publications. But not for this reason I would qualify such infrastructure as "production".
What about the 'Nobel Prize for discovering the Higgs Boson' metric? That seems to be a publication metric I hear about a lot! It seems a bit focussed, and hardly generic, but it seems to work well for some communities!
Steven
Hi David,
Agreed, we should stop using the word production.
I think that the key point associated with "large" is that the infrastructure spans international boundaries, which presents a whole load of policy and legal issues. The IGTF is the body which helps to facilitate international trust relationships, and each country has its own CA. Maybe it should be "Multi-institutional International Infrastructures for e-Science".
Laurence
David Wallom wrote:
[...]
Laurence and all,
Concerning the definition of a Production Grid:
Many thanks to Laurence for proposing the first definition, and for proposing 'Multi-institutional International Infrastructures for e-Science'.
Following David WALLOM, I think that 'International' is too restrictive. The key point is that a Production Grid spans institutional boundaries, which presents a whole load of policy and legal issues.
So I propose 'Multi-institutional Infrastructure for e-Science'.
Today, there can be Production Grids which do NOT use IGTF as trust anchor. But for interoperability, they will have to migrate and use IGTF as trust anchor.
Inside the EDGeS project, we think that 'Production Grids' encompass both 'Service Grids' and 'Desktop Grids'.
In short:
- A Service Grid (SG) is a managed grid of managed computing clusters, offering a guaranteed QoS (Quality of Service). Typically, institutions with their managed clusters can join an SG if they sign a certain SLA (Service Level Agreement) with the leadership of the SG. Since participants in an SG are most often institutions, an SG is often called an 'Institutional Computing Grid'. Examples of such service grid infrastructures are EGEE, NorduGrid, OSG, DEISA, TeraGrid.
- A Desktop Grid (DG) is a loose opportunistic grid using idle resources. Inside desktop grids, computing and storage resources are typically owned by individual volunteer owners and not by institutes (therefore it is often called volunteer computing). Even if each single desktop computer provides a very low QoS, a desktop grid of reasonable size can, as a whole, provide a defined QoS and sign an SLA. Examples of such desktop grid systems are BOINC, XtremWeb, OurGrid, Xgrid.
You can find a full description with drawings in chapter 5 'Technological context of the EDGeS project' of EDGeS deliverable DNA3.1 at http://www.edges-grid.eu:8080/c/document_library/get_file?p_l_id=11065&folderId=27671&name=DLFE-1042.pdf
If you can NOT access this document, please let me know, and I will then upload it to Gridforge.
Best regards.
----------------------------------
Etienne URBAH
IN2P3 - LAL
Bat 200
91898 ORSAY
France
Tel: +33 1 64 46 84 87
Mob: +33 6 22 30 53 27
Skype: etienne.urbah
mailto:urbah@lal.in2p3.fr
----------------------------------
On Tue, 17 Mar 2009, Laurence Field wrote:
[...]
Hi Etienne,
I think we have all agreed to drop the word "production", as it implies something that is very subjective. What I hope that we are now doing is identifying types of Grids.
My proposal is that one type of Grid has the IGTF as trust anchor. As the I in IGTF stands for International, this is also a key property.
You have highlighted two other types of Grids, 'Service Grids' and 'Desktop Grids'.
So we have now identified 5 different types: Campus Grids, NGIs, Service Grids, Desktop Grids and Multi-institutional International Infrastructures for e-Science. The fact that we are using different words to describe these suggests that they are subtly different; otherwise we could just use the word Grid.
Laurence
Etienne URBAH wrote:
[...]
Hi, concerning NGIs: some of those I know are not substantially different from multi-institutional infrastructures, as they also use IGTF certificates and have all the other attributes except of the "international" one. NorduGrid CA case is very special: the CA itself is international and issues certificates to 5 different countries. Some other national Grids are using different AAA frameworks (e.g. no IGTF, no VOMS) - but this is true not just for NGIs but for Campus Grids as well. The conclusion is, shall we separate national infrastructures from international? While presence of borders inside a Grid practically implies IGTF, the reverse is not true. Cheers, Oxana Laurence Field пишет:
Hi Etienne,
I think we have all agreed to drop the word "production" as it implies something that is very subjective. What I hope that we are now doing is identifying types of Grids.
My proposal is that one type of Grid has the IGTF as trust anchor. As the I in IGTF stands for International, this is also a key property.
You have highlighted two other types of Grids, 'Service Grids' and 'Desktop Grids'.
So we have now identified 5 different types: Campus Grids, NGIs, Service Grids, Desktop Grids and Multi-institutional International Infrastructures for e-Science. The fact that we are using different words to describe these suggests that they are subtly different; otherwise we could just use the word Grid.
Laurence
Pgi-wg mailing list Pgi-wg@ogf.org http://www.ogf.org/mailman/listinfo/pgi-wg
On Tuesday 17 March 2009 18:39, Laurence Field wrote:
So we have now identified 5 different types: Campus Grids, NGIs, Service Grids, Desktop Grids and Multi-institutional International Infrastructures for e-Science. The fact that we are using different words to describe these suggests that they are subtly different; otherwise we could just use the word Grid.
IMHO those are rather attributes/properties of Grid infrastructures. Even a Desktop Grid may be built on top of the free cycles of a Service Grid. A.K.
On Tuesday 17 March 2009 17:04, David Wallom wrote:
I have to firmly disagree. Production should refer to the quality and number of services that are available rather than its specific size. The scaling of
Number of distinct services or total number of services? A.K.
Hi David,
What I would be interested to find out is what the similarities and differences are between an infrastructure like EGEE and a Campus Grid. I tried to give a definition of what I saw as the key properties of the EGEE infrastructure; what would you say are the key properties of a Campus Grid? This is important as we might be talking about different entities which address subtly different problems and are hence difficult to discuss in the same context.
I have a poster on my wall from Oracle which says "The Oracle Grid Just Keeps Running". I am sure it is production quality and could be classed as an infrastructure, but it is not what I have in mind when I use the word "Grid"; from my perspective it describes a clustered solution.
I am not an oracle (sorry for the pun), and if I could predict the future accurately I would be richer, or less poor :), than I am now. The current situation is that all the current users of EGEE and similar infrastructures have X509 proxies from CAs accredited by the IGTF and belong to VOs. I would be very surprised if this situation was different in 18 months. Migrating OSG, EGEE and ARC to a different paradigm is not something that can be done overnight, and until an alternative solution is presented and accepted, the time scale for doing such a migration cannot even be guessed.
The goal is not necessarily to define a list of properties to include everything. We might end up with a different set of properties for "multi-institutional infrastructures for e-Science" and "Campus Grids". I would see this as being an advantage because we could then focus on what was common rather than getting confused by what was different.
Unfortunately I don't have answers, just questions.
Laurence
David Wallom wrote:
Hi Laurence,
I would be very uncomfortable with your definitions; they describe the status quo and possibly not something that could exist in 18 months' time.
For example, an internal Campus Grid running production systems (management, accounting, security, etc.) will require interoperability between institutions, not necessarily through national grids.
Shibboleth is rapidly gaining ground for authentication and authorization and will be used for example by the UK NGI quite soon.
I am also uncomfortable with all of the others; as far as I can see, your criteria would, for example, not include DEISA?
Can I suggest that we just set performance, policy and procedure targets and go from there, i.e. your grid will have legally compliant accounting for utilisation by a number of users that are identified using a strong authentication and authorisation mechanism, across a set of physically separate resources that may or may not be legally owned by more than one legal entity. The services that these offer can be many and varied, but all should operate to a defined quality of service definition.
That would be easier for everyone to sign up to.
David
PS The University of Oxford campus grid would, though, satisfy all of your criteria, except that we have ~100 users across 38 different independent legal entities within the university.
PPS For those that wonder what real production is, I would suggest looking at CycleComputing.com; there they provide grid services to a significant number of commercial organisations, with legal guarantees on data separation and legal consequences if things go wrong. I doubt that there would be real legal ramifications if EGEE/ARC/DEISA didn't fulfill their SLDs.
Hi Laurence,
thanks for this mail; what you describe for EGEE is perfectly valid for NDGF and other ARC-based grids. I would still argue that X509 and VOMS are mere implementation details, but even these are common for these two kinds of grids.
I do not imply that other kinds of grids do not belong to PGI; however, we clearly cannot provide a single recipe for all possible setups on a short time scale. Let's take one step at a time, in the good tradition of the low-hanging fruits of GIN :-)
Cheers, Oxana
Hi Oxana,
I thought x509 is very much an implementation detail. The root of trust is with the IGTF and the VOs.
Laurence
These are interesting thoughts... more in defining "effective grids" rather than "production grids".
1. Multi-organisational (instead of multi-institutional)
2. The Virtual Organisation (composed of multiple real organisations)
3. X509 Proxy Certificate (as a means of encapsulating an identity with additional information that can be used for authentication; currently delivered for many of us through VOMS)
4. Multiple Services
5. An Independent (or Standalone) Information System (i.e. one that sits outside or is parallel to any particular service)
6. The ability to scale along functional (users, services, computing activities, data) and non-functional (encompassing different administrative, policy and legal) parameters.
Hopefully this is more encompassing.
Steven
Dr Steven Newhouse EGEE Technical Director http://cern.ch/Steven.Newhouse
participants (8)
- Aleksandr Konstantinov
- Arnie Miles
- David Wallom
- Etienne URBAH
- Laurence Field
- Moreno Marzolla
- Oxana Smirnova
- Steven Newhouse