Endpoints and ComputingService, relationships between services. Some thoughts

Hi all,
I am in the situation that I would like to have a simple Endpoint published within a ComputingService. However, although UML inheritance might allow this, the hierarchical XSD schema does not: it only allows ComputingEndpoints to be nested within ComputingServices.
I don't know if this is good or bad; for you to understand how I got here, I can give you this little problem to solve:
Suppose the same machine hosts two RELATED services, that is, one needs the other one for proper functionality. For example a delegation service is needed to submit a job to a job execution service.
Is there a way for an information consumer to infer/understand this relationship? Can a client understand that (1) the services are related and (2) that they are running on the same machine just by looking at the GLUE2 records?
One might think of associations, but it can easily be shown that they don't solve the problem. We don't really have service-to-service associations, just some kind of hierarchy between services.
What do you think?
-- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: Suppose the same machine hosts two RELATED services, that is, one needs the other one for proper functionality. For example a delegation service is needed to submit a job to a job execution service.
This sounds to me like *one* Service with two different types of Endpoint. For example a VOMS server can have voms endpoints to return credentials and voms-admin endpoints to manage VO membership, but they both talk to the same database so there is only one Service. This is the biggest structural change from GLUE 1 to 2 - in GLUE 1 there are only GlueService objects representing endpoints, but in GLUE 2 we introduced the Service concept exactly so they could be grouped together if they provide a shared functionality.
Stephen

On 2012-08-20 12:51, stephen.burke@stfc.ac.uk wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: Suppose the same machine hosts two RELATED services, that is, one needs the other one for proper functionality. For example a delegation service is needed to submit a job to a job execution service.
This sounds to me like *one* Service with two different types of Endpoint. For example a VOMS server can have voms endpoints to return credentials and voms-admin endpoints to manage VO membership, but they both talk to the same database so there is only one Service. This is the biggest structural change from GLUE 1 to 2 - in GLUE 1 there are only GlueService objects representing endpoints, but in GLUE 2 we introduced the Service concept exactly so they could be grouped together if they provide a shared functionality.
Stephen
Your observations are right, however the model of service you have in mind is monolithic and not distributed; I am in fact speaking about something that was not considered during the GLUE2 "making of". I am speaking about a service with respect to the following use case:
1) the service, by design, *has* distributed endpoints, that might run on different hosts.
2) a client would like to gather the endpoints belonging to that distributed service.
You can imagine an information system, like a resource-bdii, that runs independently from the resource, maybe on a different host. How to relate the resource-bdii endpoint with the resources it serves? I imagine a client collecting information from some index in which you only have GLUE2 endpoints and service records. How to relate these records and infer, for example, that they're on the same machine and/or serve the same resource? Parsing the EndpointURL would be a choice, but you don't need GLUE2 for that... All in all we could call these "service-service" relationships.
-- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
Your observations are right, however the model of service you have in mind is monolithic and not distributed; I am in fact speaking about something that was not considered during the GLUE2 "making of".
As a general question Service-Service relations weren't considered much because in practice we had no relevant use cases, and as far as I know we still haven't. The schema is not supposed to be able to represent any conceivable situation, just to satisfy the things we know about. The schema does have a Service-Service relation, but as far as I know we have no uses of it at the moment and the semantics would need to be defined for a specific service if we did have a need. It may also be the case that you can infer the relationships from other information - for example if multiple VOMS servers are related to the same VO.
1) the service, by design, *has* distributed endpoints, that might run on different hosts.
As far as the schema is concerned it doesn't matter what hosts the endpoints are on - it will often be the case that a Service with many Endpoints will involve many hosts. It may of course be a question of the information system technology as to whether such a configuration can easily be published in practice. In particular if you have endpoints on different sites it may be difficult to co-ordinate the publishing. However within a site it should be fairly easy - for example CREAM is done like that (in the CLUSTER mode). Each head node publishes its own Endpoint(s) and they get merged in the site BDII.
How to relate these records and infer, for example, that they're on the same machine and/or serve the same resource?
Why would you care if they run on the same machine? In general you can't get that even from the URL, because it may refer to a DNS alias and not the real host. If they serve the same Resource in the schema sense you can know because they will all have relations to the same Resource object, i.e. ExecutionEnvironment for computing services. Stephen

Hi Stephen, thanks again for your comments. Just a note on your last question: On 2012-08-20 14:06, stephen.burke@stfc.ac.uk wrote:
Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
Your observations are right, however the model of service you have in mind is monolithic and not distributed; I am in fact speaking about something that was not considered during the GLUE2 "making of".
As a general question Service-Service relations weren't considered much because in practice we had no relevant use cases, and as far as I know we still haven't. The schema is not supposed to be able to represent any conceivable situation, just to satisfy the things we know about. The schema does have a Service-Service relation, but as far as I know we have no uses of it at the moment and the semantics would need to be defined for a specific service if we did have a need. It may also be the case that you can infer the relationships from other information - for example if multiple VOMS servers are related to the same VO.
1) the service, by design, *has* distributed endpoints, that might run on different hosts.
As far as the schema is concerned it doesn't matter what hosts the endpoints are on - it will often be the case that a Service with many Endpoints will involve many hosts. It may of course be a question of the information system technology as to whether such a configuration can easily be published in practice. In particular if you have endpoints on different sites it may be difficult to co-ordinate the publishing. However within a site it should be fairly easy - for example CREAM is done like that (in the CLUSTER mode). Each head node publishes its own Endpoint(s) and they get merged in the site BDII.
Yes that's also the approach ARC will follow now, but I am not that happy with it.
How to relate these records and infer, for example, that they're on the same machine and/or serve the same resource?
Why would you care if they run on the same machine? In general you can't get that even from the URL, because it may refer to a DNS alias and not the real host. If they serve the same Resource in the schema sense you can know because they will all have relations to the same Resource object, i.e. ExecutionEnvironment for computing services.
Stephen
Well at the beginning I was publishing the Execution Service as a ComputingService separated from the Information Service as a plain Service. In the same box I had
- ComputingService
- Service (ldap information endpoints)
But then if one pushes these records in an endpoint index such as EMIR, that contains the full GLUE2 Service/ComputingService and Endpoint/ComputingEndpoint records, a client gathering these records needs to understand that these services are running on the same machine, or at least that one information service will ONLY serve information about a single execution service. If not, the client runs the risk of querying twice an information service for which it already has data. An example is the following:
*) in the index I have (a) a ComputingEndpoint of a ComputingService on machine X, with Capability executionmanagement.jobexecution, (b) an Endpoint of a Service on machine X with Capability information.discovery.resource, that also allows me to discover 1) on a local information system
*) a client wants to submit a job to machine X. It needs a ComputingEndpoint with executionmanagement.jobexecution and wants to decide where to submit.
How does the client know that (a) is already on machine X, and that it does not need to query (b), machine X's local information service? There's nothing in GLUE2 that tells me how (b) on machine X is related to (a) on machine X. Then the best is to merge those under the same Service as you suggested. At least I will have the EndpointServiceForeignKey relation with the same ID and I can check against that.
Cheers,
-- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project
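The check against a shared Service ID can be sketched in a few lines of Python. This is only an illustration under assumptions: the record layout (plain dicts with "id", "service" and "capability" keys) and the example IDs are hypothetical and are not any GLUE2 rendering; only the two Capability strings come from the example above.

```python
from collections import defaultdict

# Hypothetical, simplified endpoint records as a client might pull them from an index.
# The "service" field stands in for the Endpoint-to-Service foreign key.
endpoints = [
    {"id": "urn:example:ep:a", "service": "urn:example:svc:X",
     "capability": "executionmanagement.jobexecution"},
    {"id": "urn:example:ep:b", "service": "urn:example:svc:X",
     "capability": "information.discovery.resource"},
    {"id": "urn:example:ep:c", "service": "urn:example:svc:Y",
     "capability": "information.discovery.resource"},
]


def group_by_service(records):
    """Group endpoint records by the ID of the Service they belong to."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["service"]].append(rec)
    return dict(groups)


def redundant_info_endpoints(records):
    """IDs of information endpoints that share a Service with a job execution endpoint."""
    redundant = []
    for eps in group_by_service(records).values():
        capabilities = {e["capability"] for e in eps}
        if "executionmanagement.jobexecution" in capabilities:
            redundant += [e["id"] for e in eps
                          if e["capability"] == "information.discovery.resource"]
    return redundant


print(redundant_info_endpoints(endpoints))
```

With the sample records, only endpoint (b) is reported, because it shares a Service with the job execution endpoint (a); the information endpoint of the unrelated Service Y is left alone. Whether sharing a Service really means "no need to query" is a convention the publisher would have to document.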

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
Well at the beginning I was publishing the Execution Service as a ComputingService separated from the Information Service as a plain Service.
In the same box I had - ComputingService - Service (ldap information endpoints)
Your particular example is not applicable to the current BDII architecture - we *don't* publish resource BDIIs, they are just internal services inside the site. Only the site BDII is published as an externally visible service, containing all the information for all services at the site. But anyway the "same host" part is not directly relevant - resource BDIIs usually run on the service node but that isn't a requirement, they can run anywhere as long as they can collect the information to publish.
But then if one pushes these records in an endpoint index such as EMIR, that contains the full GLUE2 Service/ComputingService and Endpoint/ComputingEndpoint records, a client gathering these records needs to understand that these services are running on the same machine, or at least that one information service will ONLY serve information about a single execution service.
As I've said in other mails, this needs to be driven by the practical use cases, you need to work out specifically what you need to do in a particular situation and then publish enough information to make it possible. There is no "right answer" to any of these questions (although there may be wrong answers!). Stephen

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: I am in the situation that I would like to have a simple Endpoint published within a ComputingService. However, although UML inheritance might allow this, the hierarchical XSD schema does not: it only allows ComputingEndpoints to be nested within ComputingServices.
Also to answer that part, that is the way the schema was defined - if you have Endpoints which belong to a ComputingService they are ComputingEndpoints even if their type isn't directly related to computing, e.g. gridftp. But all the extra attributes in ComputingEndpoint are optional so it makes no practical difference, you can publish the same attributes as a plain Endpoint but give it the ComputingEndpoint type.
Stephen
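As a rough illustration of that suggestion, the Python sketch below builds an LDAP-style entry that carries only base Endpoint attributes but is typed as a ComputingEndpoint. The attribute and objectclass names are assumptions modelled on the GLUE2 LDAP naming pattern, the helper as_computing_endpoint is hypothetical, and the IDs, URL and capability value are made-up example values.

```python
def as_computing_endpoint(endpoint_id, url, interface_name, capability, service_id):
    """Build an LDAP-style entry with base Endpoint attributes and the ComputingEndpoint type.

    No ComputingEndpoint-specific attribute is set; per the discussion above
    they are all optional, so the entry differs from a plain Endpoint only
    in its objectclass list.
    """
    return {
        # Assumed objectclass names following the GLUE2 LDAP naming pattern.
        "objectClass": ["GLUE2Endpoint", "GLUE2ComputingEndpoint"],
        "GLUE2EndpointID": endpoint_id,
        "GLUE2EndpointURL": url,
        "GLUE2EndpointInterfaceName": interface_name,
        "GLUE2EndpointCapability": [capability],
        # Inherited reference name, pointing at the owning ComputingService.
        "GLUE2EndpointServiceForeignKey": service_id,
    }


entry = as_computing_endpoint(
    endpoint_id="urn:example:ep:gridftp1",      # example value
    url="gsiftp://ce.example.org:2811/jobs",    # example value
    interface_name="gridftp",
    capability="data.transfer",                 # example value
    service_id="urn:example:svc:ce1",           # example value
)

for key, value in entry.items():
    print(f"{key}: {value}")
```

A consumer that only reads the base Endpoint attributes would see nothing unusual in such an entry; the extra type matters only to tools that check the Endpoint/Service pairing.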

On 2012-08-20 12:58, stephen.burke@stfc.ac.uk wrote:
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On
Behalf Of Florido Paganelli said: I am in the situation that I would like to have a simple Endpoint published within a ComputingService. However, although UML inheritance might allow this, the hierarchical XSD schema does not: it only allows ComputingEndpoints to be nested within ComputingServices.
Also to answer that part, that is the way the schema was defined - if you have Endpoints which belong to a ComputingService they are ComputingEndpoints even if their type isn't directly related to computing, e.g. gridftp. But all the extra attributes in ComputingEndpoint are optional so it makes no practical difference, you can publish the same attributes as a plain Endpoint but give it the ComputingEndpoint type.
Stephen
Yes I will probably do that as there's no other choice at the moment. However, it is just sad, since the GLUE2 model document says on ComputingEndpoint: "The class represents an endpoint which is used to create, control and monitor computational activities", which in a way scopes what kind of capability the endpoint should offer. For example a typical non-computing endpoint would be a Delegation endpoint, which has nothing to do with creation, monitoring and computation IMHO. In principle it would be nicer to have an independent Delegation Service with its Endpoint. But then how do I know that the specific Delegation Service can only be used by one of my Services? -- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
For example a typical non-computing endpoint would be a Delegation endpoint, which has nothing to do with creation, monitoring and computation IMHO.
If it's really part of the computing service and can only be used with it then I think it does have something to do with managing computing tasks. On the other hand, if you have a general delegation service that can manage delegations for any other service then it should probably be published as a separate Service.
In principle it would be nicer to have an independent Delegation Service with its Endpoint. But then how do I know that the specific Delegation Service can only be used by one of my Services?
You would need to publish the information in some way and document what it means - for example as we have for the ComputingService - StorageService relationship ("close SE"). Stephen

Florido,
Concerning the publication of 2 RELATED services using the GLUE 2.0 schema and its XML rendering(s):
I suggest to:
- begin by searching solutions at the conceptual level using the GLUE 2.0 schema,
- and only then to look for practical solutions using an XML rendering.
GLUE 2.0 schema
---------------
It is easy to define 1 single Endpoint exposing 2 Services.
Any information consumer will then know that these 2 Services are somehow related, and the precise relationship may be inferred from:
- the respective values of the Capability Attribute for the 2 Services: For example 'security.delegation' and 'executionmanagement.jobmanager',
- the value of 1 occurrence of the Service.ID value of one Service: This value is the ID of the other Service.
So, the GLUE 2.0 schema seems to permit the above solution.
XML renderings
--------------
- A flat XML rendering, having NO presupposed hierarchy, should permit the practical implementation of the above solution.
- Any XML rendering having a presupposed hierarchy is a conceptual mistake. Your example is a clear Use Case proving this.
Best regards.
Etienne URBAH LAL, Univ Paris-Sud, IN2P3/CNRS Bat 200 91898 ORSAY France Tel: +33 1 64 46 84 87 Skype: etienne.urbah Mob: +33 6 22 30 53 27 mailto:urbah@lal.in2p3.fr

Etienne URBAH [mailto:urbah@lal.in2p3.fr] said:
It is easy to define 1 single Endpoint exposing 2 Services.
I think you mean 1 Service exposing 2 Endpoints - at least what you say is not true, one Endpoint can only belong to one Service.
- Any XML rendering having a presupposed hierarchy is a conceptual mistake. Your example is a clear Use Case proving this.
I think the comment was about the object types rather than a structural hierarchy, i.e. that ComputingServices must have ComputingEndpoints (and StorageServices must have StorageEndpoints). I'm not 100% sure that was the right choice but it is the way it's defined, and I don't think it does any particular harm. For LDAP we did decide to change the way the references are named - in the original rendering we had things like GLUE2ComputingEndpointComputingServiceForeignKey, but we changed it to use the inherited reference GLUE2EndpointServiceForeignKey. However it's still true that the reference should point to an object with an objectclass of ComputingService if it comes from an object with an objectclass of ComputingEndpoint and vice versa.
Stephen
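The pairing rule on references can be expressed as a small consistency check. The Python sketch below is an assumption-laden illustration: entries are plain dicts, the objectclass names (GLUE2ComputingService, GLUE2StorageEndpoint and so on) are assumed from the naming pattern of the LDAP rendering, and check_service_reference is a hypothetical helper, not part of any GLUE2 tool.

```python
# Endpoint objectclass -> Service objectclass it is expected to reference.
# Both columns are assumed names, not taken from a published schema file.
PAIRED_CLASSES = {
    "GLUE2ComputingEndpoint": "GLUE2ComputingService",
    "GLUE2StorageEndpoint": "GLUE2StorageService",
}


def check_service_reference(endpoint, services_by_id):
    """Return a list of problems with the endpoint's Service reference (empty if none)."""
    key = endpoint.get("GLUE2EndpointServiceForeignKey")
    service = services_by_id.get(key)
    if service is None:
        return [f"reference {key!r} does not resolve to a published Service"]
    problems = []
    for ep_class, svc_class in PAIRED_CLASSES.items():
        ep_has = ep_class in endpoint["objectClass"]
        svc_has = svc_class in service["objectClass"]
        if ep_has and not svc_has:
            problems.append(f"{ep_class} points to a Service that is not a {svc_class}")
        elif svc_has and not ep_has:
            problems.append(f"{svc_class} is referenced by an Endpoint that is not a {ep_class}")
    return problems


# Example entries with made-up IDs.
services = {
    "urn:example:svc:ce1": {"objectClass": ["GLUE2Service", "GLUE2ComputingService"]},
    "urn:example:svc:info1": {"objectClass": ["GLUE2Service"]},
}
good = {"objectClass": ["GLUE2Endpoint", "GLUE2ComputingEndpoint"],
        "GLUE2EndpointServiceForeignKey": "urn:example:svc:ce1"}
bad = {"objectClass": ["GLUE2Endpoint"],
       "GLUE2EndpointServiceForeignKey": "urn:example:svc:ce1"}

print(check_service_reference(good, services))
print(check_service_reference(bad, services))
```

The second call flags the case discussed in this thread: a plain Endpoint whose reference resolves to a ComputingService.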

Hi Etienne, thanks for sharing your opinion. More comments inline On 2012-08-20 13:17, Etienne URBAH wrote:
Florido,
Concerning the publication of 2 RELATED services using the GLUE 2.0 schema and its XML rendering(s) :
I suggest to :
- begin by searching solutions at the conceptual level using the GLUE 2.0 schema,
- and only then to look for practical solutions using an XML rendering.
GLUE 2.0 schema --------------- It is easy to define 1 single Endpoint exposing 2 Services.
The above is not my problem, unfortunately. I will try to summarize my problem this way.
1) I have two services that are related, that is, I need endpoints from both services for my task to be accomplished. These endpoints are different endpoints for each service (not just a single endpoint).
2) I want to know, just by looking at GLUE2 records, how these services are related and IF they are running in the same box.
One solution is to merge the two services, hide the fact that they are on different boxes, and publish a single Service with all Endpoints on a single box. In this way I can see the relationship between the endpoints by inspecting that they belong to the same service (the EndpointService association) and I can also infer they are on the same machine. Maybe it's not the same physical machine but for the purpose of the client it is fair enough. The issue with this solution is to fit ComputingEndpoints and Endpoints under the same Service (or ComputingService). Read later for comments on this.
Any information consumer will then know that these 2 Services are somehow related, and the precise relationship may be inferred from :
- the respective values of the Capability Attribute for the 2 Services : For example 'security.delegation' and 'executionmanagement.jobmanager',
The above is unfortunately not enough. Capabilities are meant to describe the purpose of a service, not its relationships with other services. In short, they tell something about a GLUE2 record itself and nothing about related records.
- the value of 1 occurrence of the Service.ID value of one Service : This value is the ID of the other Service.
Yes, I thought about service ID. But if I set the same ID on two services running on different boxes, then I am breaking universal uniqueness; in fact, service1 running in box1 is NOT service2 running in box2. Moreover, these services MIGHT be different if they have completely different endpoints.
XML renderings -------------- - A flat XML rendering, having NO presupposed hierarchy, should permit the practical implementation of the above solution.
- Any XML rendering having a presupposed hierarchy is a conceptual mistake. Your example is a clear Use Case proving this.
The comments I made about ComputingEndpoint and Endpoint with respect to ComputingService are not an XML issue, but a design issue. It was a structural requirement to have ComputingEndpoints inside a ComputingService rather than Endpoints. It does not have anything to do with hierarchical/flat XML renderings. Besides, the comment I made about UML inheritance might be wrong. Looking at the schema on page 22 I am not sure if an instance of an Endpoint object can be related to a ComputingService or not. The three entities (Endpoint, ComputingEndpoint, ComputingService) span across three packages, and it's the scope of the packages that tells how inheritance is realized by object instantiation. This aspect cannot be specified in UML AFAIK. By reading the descriptive text for ComputingService it is clear, as Stephen said, that a ComputingService can only have ComputingEndpoints. So to come back to the initial problems, if I have
Cheers, Florido
-- Florido Paganelli Lund University - Particle Physics ARC Middleware EMI Project

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:
Besides, the comment I made about UML inheritance might be wrong. Looking at the schema on page 22 I am not sure if an instance of an Endpoint object can be related to a ComputingService or not.
Conceptually an Endpoint can only have a relation to a Service, not to any derived class, because you can't know what classes may be derived in the future. For example if we decide to derive a new class FileTransferService, Endpoint can't possibly have a relation to it because we didn't know it existed when the schema was defined. Perhaps we could have defined the schema in a different way, but it was a decision right at the start to do it like this. And it does have some advantage for type checking, i.e. you can check that a reference points to the right kind of object in a more specific way. In LDAP that isn't really a natural way to do it because LDAP objects can easily have multiple types and it makes more sense to extend the base type than redefine it, so in LDAP the GLUE2ComputingEndpoint objectclass only has the additional attributes and not the inherited ones. In XML I don't know which way is best.
So to come back to the initial problems, if I have
? Stephen
participants (3): Etienne URBAH, Florido Paganelli, stephen.burke@stfc.ac.uk