GLUE 2.0 Specification - draft 41 - Working Group Last Call - glue-wg


Sergio Andreozzi

13 May 2008

1:26 a.m.

Dear all,

in our plan, we have the important milestone of presenting a public comment version of GLUE 2.0 at the coming OGF, in order to achieve broader exposure of our work and receive a broader review. Starting from the GLUE 2.0 specification available at the end of April (which is the result of a long series of teleconferences and email exchanges among many partners), we refactored the document to be consistent and completed all its parts. The updated renderings of the conceptual model to the XML Schema/LDAP/relational concrete data models will be issued this week in a separate document (possibly by Wednesday).

This is the official Working Group last call for GLUE 2.0 and we invite all of you to read the specification and to send your feedback by Friday, 12 AM CEST. By Monday 19 May, we will send the document + renderings to our area managers to be presented to the OGF Editor for promotion to public comment. All the comments that we receive by Friday 16 May and that can be easily incorporated will be included. The remaining comments, which may require discussion or major changes, will be logged in the public comment tracker of the GLUE 2.0 specification as soon as it is available, and will be resolved during the 60-day public comment period.

When sending your feedback, pay attention to aspects related to the readability and consistency of the document. We would like to stress the importance of going to OGF 23 with a coherent and complete document aligned with the renderings. This will enable the interested groups/projects to better evaluate the current work and create proofs of concept on real systems.

We thank all of you for your contribution.

Kind regards,
Sergio, Balazs and Laurence

PS: you can find attached the specification documents in both MS Word and PDF formats; the OGF forge portal is not reachable at the moment

--
Sergio Andreozzi
INFN-CNAF, Tel: +39 051 609 2860
Viale Berti Pichat, 6/2 Fax: +39 051 609 2746
40126 Bologna (Italy) Web: http://www.cnaf.infn.it/~andreozzi

Attachments:

ogf-glue-2_draft-41_WG-last-call.doc (application/msword — 2.4 MB)
ogf-glue-2_draft-41_WG-last-call.pdf (application/pdf — 521.1 KB)


Burke, S (Stephen)

13 May

5:06 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group LastCall

glue-wg-bounces@ogf.org

...

[mailto:glue-wg-bounces@ogf.org] On Behalf Of Sergio Andreozzi said: This is the official Working Group last call for GLUE 2.0 and we invite all of you to read the specification and to send your feedback by Friday, 12 AM CEST.

OK, I've started, but it's hard going! I'll just comment on the main entities for now and come back to the rest.

One general comment is that I think most things could do with more detailed explanation; even I find it difficult to understand what some things mean, and I think someone new to this will get lost fairly quickly. Can we give some examples in the text?

Page 5, I think there is still some problem with the multiplicities, or perhaps with my understanding of what they mean ... for example I can see "1"s near Endpoint, Share and Resource, and yet all of them also have *s - so they are optional in some directions and mandatory in others?! (Formally I think they should all be optional.)

All the definitions have a section labelled "Association End" - what does "End" mean there?

Now comments on the entities ...

Entity: I'm not sure why the attributes are mandatory - also they are missing in all the other entities even though they all inherit from Entity! In fact although the description says that everything inherits from Entity I suspect that excludes Extension? Also I'd be inclined to add Name and OtherInfo to Entity - at the moment not everything has a Name but I don't think it does much harm to include it.

Extension: despite the statement at the start that everything must have an opaque ID Extension doesn't, the ID is the Key. I think that's a mistake - among other things you might have multiple instances of the same Key. That relates to another question which is why is the Value multiplicity *? I would suggest it should be 1 but you could have multiple instances for the same Key.

Location: is missing OtherInfo, but it will get it automatically if it gets added to Entity.

AdminDomain: I think Owner needs to be defined better.

UserDomain: similarly I think UserManager and Member need more explanation and/or examples. Also it's not clear if what's there can work - for example if UserManager is supposed to be a VOMS URL the scheme would just be https, so how can you tell that it's VOMS and not something else? Probably it would need some kind of Type attribute.

Service: it's not entirely obvious to me why Capability is mandatory. Also I think the enumerated Type list in the appendix is inadequate, it's not really clear to me how this is supposed to work, given that every Service is supposed to have a unique Type (what about computing and storage services?!). Conversely many of the things which are listed (e.g. org.teragrid.gpfs) don't seem to represent things I would consider to be Services at all.

Endpoint: again I wonder why Capability is mandatory. And I don't understand why or how Type and Version are combined as a single Interface URI: at the very least this needs *much* more explanation since it's absolutely vital to using the endpoints! And the Type list needs to be enumerated (as it is now). Basically I don't understand why we haven't kept the Type and Version as we have them in 1.3. For QualityLevel, how do the Endpoint QLs relate to the one in the parent Service? For StartTime, is this different to Entity.CreationTime which should be inherited? I still think that TrustedCA belongs in AccessPolicy and not Endpoint - if we can't figure out how to fit it in there it probably means that we still don't have a good enough definition of the AP object. And I still think the Downtime attributes should be in a separate object, to me it makes no sense to put them there - among other things, if the service is down the Endpoint will probably not be published so you won't be able to read the Downtime information.

Share: the Endpoint and Resource relations are marked as mandatory, which doesn't seem right since the objects are supposed to be optional.

Manager: I'm not sure why it needs a globally unique ID? And again the Resource relation is mandatory which can't be right since the Resource itself is optional.

Resource: as above, the description and the relations say that Endpoints, Shares and Managers are mandatory, which doesn't seem right. And again I don't think it needs a global ID. Also there is no accompanying text, which should at the very least say that it's an abstract entity.

Activity: it looks odd to have two identical lines for the Activity-Activity relation.

Policy: I think we still have some problems with this ... for one thing I think the Rule has to have type "string", since we can't possibly specify all formats. And I'm not sure why the Rule is optional, shouldn't it be 1..*? Also I think we need to say something about the semantics in the case where you have multiple Policy objects, either with the same Scheme or with different Schemes. Also the UserDomain relation should be * - the Rules may relate to several VOs, and conversely the UD object itself is optional. We also still need feedback on whether this can meet all the use cases - for EGEE I think we should be OK, but it's certainly a long way from being generic (DENY rules!).

AccessPolicy: One basic point is that it seems to me that APs should be related to Service as well as Endpoint, otherwise service discovery will be a lot more complex and messy than it is now. Also as I said above I think TrustedCA should be here.

MappingPolicy: I'm not clear if the concept of the Default mapping can actually work as stated. As mentioned above I don't think the relation to UD can be mandatory, it may well not be published, and while the Rules implicitly relate to a VO (or perhaps more than one) it may not be in any simple way, e.g. it may involve groups and roles. What exactly does it mean to say that there MUST be only one default MP per UD, in a way which is guaranteed to be meaningful for all possible policy Schemes, all technologies, all service types, ...?

OK, that's all the main entities, I'll get to the Computing part tomorrow.

Stephen

Burke, S (Stephen)

14 May

4:45 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLastCall

glue-wg-bounces@ogf.org

...

[mailto:glue-wg-bounces@ogf.org] On Behalf Of Burke, S (Stephen) said: Share: the Endpoint and Resource relations are marked as mandatory, which doesn't seem right since the objects are supposed to be optional.

I'm still going through the document, but I just noticed another problem with Share: it's missing the relation to MappingPolicy, with the result that it's also missing for ComputingShare and StorageShare. Stephen

Sergio Andreozzi

6:03 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group LastCall

Hi Stephen, Burke, S (Stephen) wrote:

...

OK, I've started, but it's hard going! I'll just comment on the main entities for now and come back to the rest. One general comment is that I think most things could do with more detailed explanation; even I find it difficult to understand what some things mean, and I think someone new to this will get lost fairly quickly. Can we give some examples in the text?

in general normative documents do not have examples, but I agree that we should add more descriptive text. We may also add some examples, stating that they are non-normative

...

Page 5, I think there is still some problem with the multiplicities, or perhaps with my understanding of what they mean ... for example I can see "1"s near Endpoint, Share and Resource, and yet all of them also have *s - so they are optional in some directions and mandatory in others?! (Formally I think they should all be optional.)

the way you should think is: when I instantiate the endpoint, what associations must be instantiated? e.g.: if I create an instance of endpoint, then it is mandatory to have an association to service (and a service) but it is optional to have an association to share/accessPolicy/extension/activity

...

All the definitions have a section labelled "Association End" - what does "End" mean there?

it is borrowed from the UML world. Have a look at Section 8 (Template). Here is a better description: given the class X, such a class has an associated table (in the GLUE doc). In this table, the Association End section lists all the associations that start from this class. What is described is the "other side" of the association (association end) in terms of:
- what class is associated
- what is the key attribute
- what is the multiplicity (when I instantiate X, how many instances of the associated class can be associated to X via that association?)
- description of the association from the viewpoint of X

example: endpoint (*) ---- (1) service

when describing the endpoint, in the association end I have:
- Service.ID
- 1
- An endpoint is part of a Service

that means: "when I instantiate an Endpoint, then this must have an association to a Service and the specific service is identified by the ID property"
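The endpoint (*) ---- (1) service example can be sketched in a few lines of Python. This is only an illustration of the multiplicity reading given above; the class layout, attribute names and the example IDs are assumptions and do not come from the GLUE renderings.

```python
from typing import List, Optional


class Service:
    """Seen from Service, the association end is Endpoint.ID with multiplicity *."""

    def __init__(self, id: str) -> None:
        self.id = id
        self.endpoints: List["Endpoint"] = []  # zero or more endpoints


class Endpoint:
    """Seen from Endpoint, the association end is Service.ID with multiplicity 1."""

    def __init__(self, id: str, service: Optional[Service]) -> None:
        if service is None:
            # Multiplicity 1: an Endpoint cannot be instantiated without a Service.
            raise ValueError("an Endpoint must be associated with exactly one Service")
        self.id = id
        self.service = service
        # Keep the other end of the association consistent.
        service.endpoints.append(self)


# Hypothetical IDs, for illustration only.
svc = Service("urn:example:service:1")            # valid with no endpoints at all
ep = Endpoint("urn:example:endpoint:1", svc)      # valid only because a Service is given
```

A Service with an empty endpoint list is accepted, while `Endpoint("urn:example:endpoint:2", None)` raises `ValueError`, which is the "mandatory when you instantiate it" reading of the "1".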

...

Entity: I'm not sure why the attributes are mandatory - also they are missing in all the other entities even though they all inherit from Entity!

right, made them optional; before they were part of an independent class and optionality was expressed by the association; now these attributes are taken by inheritance and we need to make them optional

...

In fact although the description says that everything inherits from Entity I suspect that excludes Extension? Also I'd be inclined to add Name and OtherInfo to Entity - at the moment not everything has a Name but I don't think it does much harm to include it.

if class A inherits from class B, then class A inherits all the class B attributes *plus* all its associations; you'll see that all classes inheriting from class A have the Extension.Key association end in the "inherited association end" table section

about Name in Entity, I'm not sure; I'd like to hear from other people as well; for OtherInfo, we could do that

...

Extension: despite the statement at the start that everything must have an opaque ID Extension doesn't, the ID is the Key. I think that's a mistake - among other things you might have multiple instances of the same Key. That relates to another question which is why is the Value multiplicity *? I would suggest it should be 1 but you could have multiple instances for the same Key.

the extension is actually a special class; in fact it is the only one that does not inherit from Entity. I agree with your proposal; I can make the following changes:
- Rename Key to ID (unless we want ID, Key, Value)
- Value - multi = 1

...

Location: is missing OtherInfo, but it will get it automatically if it gets added to Entity.

ok, let's wait for other people's opinions

...

AdminDomain: I think Owner needs to be defined better.

what about: Identification of the person or legal entity which pays for the services and resources

...

UserDomain: similarly I think UserManager and Member need more explanation and/or examples. Also it's not clear if what's there can work - for example if UserManager is supposed to be a VOMS URL the scheme would just be https, so how can you tell that it's VOMS and not something else? Probably it would need some kind of Type attribute.

the UserManager is actually an Endpoint.ID and not the URL; therefore it identifies an endpoint class instance with all the info
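A small sketch of what this implies for a client: UserManager is followed like a reference to an Endpoint instance, and the kind of service is read from that instance rather than guessed from the URL scheme. All names, IDs, the URL and the interface value below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Endpoint:
    id: str          # opaque, globally unique ID
    url: str         # network location; the scheme alone does not say what it is
    interface: str   # hypothetical value naming the interface offered


@dataclass(frozen=True)
class UserDomain:
    id: str
    user_manager: str  # holds an Endpoint.ID, not a URL


def resolve_user_manager(domain: UserDomain, endpoints: Dict[str, Endpoint]) -> Endpoint:
    """Follow UserManager to the Endpoint instance it identifies."""
    return endpoints[domain.user_manager]


# Hypothetical published endpoints, indexed by ID.
endpoints = {
    "urn:example:endpoint:voms-1": Endpoint(
        id="urn:example:endpoint:voms-1",
        url="https://voms.example.org:8443/voms/myvo",
        interface="org.example.voms",
    )
}
vo = UserDomain(id="urn:example:userdomain:myvo", user_manager="urn:example:endpoint:voms-1")
manager = resolve_user_manager(vo, endpoints)
```

Here `manager.interface` tells the client what it is talking to, even though `manager.url` is plain https.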

...

Service: it's not entirely obvious to me why Capability is mandatory.

let's say that you want to search for all job execution services regardless of their interface/implementation; you can search for services having Service.Capability = executionmanagement.jobexecution (capacity of executing a job or set of jobs)

it is needed if you want to search for Grid capability regardless of the middleware interface/implementation; a different level of abstraction
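As a rough sketch of that kind of search, assuming services are available as simple in-memory records: the Service class, the Type values and the capability "data.transfer" are made up for the example; only executionmanagement.jobexecution and security.accounting appear in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    id: str
    type: str               # middleware-specific type (hypothetical values below)
    capability: List[str]   # one or more capability values


def find_by_capability(services: List[Service], capability: str) -> List[Service]:
    """Select services by what they can do, ignoring interface and implementation."""
    return [s for s in services if capability in s.capability]


services = [
    Service("svc-a", "org.example.ce-a", ["executionmanagement.jobexecution"]),
    Service("svc-b", "org.example.ce-b",
            ["executionmanagement.jobexecution", "security.accounting"]),
    Service("svc-c", "org.example.se", ["data.transfer"]),
]

matches = find_by_capability(services, "executionmanagement.jobexecution")
```

The two job execution services are returned although their Type values differ, which is the "different level of abstraction" being argued for.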

...

Also I think the enumerated Type list in the appendix is inadequate,

it is an open enumeration, therefore if values do not match, it is no problem to publish something extra; it is probably out of scope to provide a complete list;

...

it's not really clear to me how this is supposed to work, given that every Service is supposed to have a unique Type (what about computing and storage services?!). Conversely many of the things which are listed (e.g. org.teragrid.gpfs) don't seem to represent things I would consider to be Services at all.

here we need some more experience I guess;

...

Endpoint: again I wonder why Capability is mandatory.

I think that in the Grid context, there should be an effort of classifying services and endpoints in terms of the conceptual capability that is provided

...

And I don't understand why or how Type and Version are combined as a single Interface URI: at the very least this needs *much* more explanation since it's absolutely vital to using the endpoints! And the Type list needs to be enumerated (as it is now). Basically I don't understand why we haven't kept the Type and Version as we have them in 1.3.

this emerged during the revision process with Balazs; one of the most wanted query patterns is: I want to search for an endpoint complying with an interface name and version, plus some extensions as well. E.g.: ARC implements BES 1.0 + a number of extensions; how do you search for them?

Interface = urn:ogf:bes:1.0 and InterfaceExtension = urn:arc:bes++:1.0 and InterfaceExtension = urn:ogf:hpc:1.0

the problem sits mainly in the extensions; by coupling the name and version into one attribute, we can maintain simple properties
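The query pattern can be sketched as a simple filter, assuming Interface is single-valued and InterfaceExtension is multi-valued. The URNs are the ones quoted above; the Endpoint class, the function name and the endpoint IDs are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Endpoint:
    id: str
    interface: str                                   # name and version in one URI
    interface_extension: FrozenSet[str] = frozenset()


def find_endpoints(endpoints: Iterable[Endpoint], interface: str,
                   required_extensions: Iterable[str] = ()) -> List[Endpoint]:
    """Match the interface exactly and require every listed extension."""
    required = frozenset(required_extensions)
    return [e for e in endpoints
            if e.interface == interface and required <= e.interface_extension]


endpoints = [
    Endpoint("ep-arc", "urn:ogf:bes:1.0",
             frozenset({"urn:arc:bes++:1.0", "urn:ogf:hpc:1.0"})),
    Endpoint("ep-plain", "urn:ogf:bes:1.0"),
]

with_extensions = find_endpoints(
    endpoints, "urn:ogf:bes:1.0", ["urn:arc:bes++:1.0", "urn:ogf:hpc:1.0"])
any_bes = find_endpoints(endpoints, "urn:ogf:bes:1.0")
```

Because each value carries both name and version, every condition is a plain equality or membership test; no pairing of separate Type and Version values is needed for the extensions.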

...

For QualityLevel, how do the Endpoint QLs relate to the one in the parent Service?

we discussed this and found no good answer so far, therefore we do not enforce a relationship and leave it up to whoever configures the service; after some experience we may be able to define it better

...

For StartTime, is this different to Entity.CreationTime which should be inherited?

CreationTime is metadata about a representation of a GLUE class; it states when that data was created (basically when the info provider was run). StartTime is the last start time of the endpoint

...

I still think that TrustedCA belongs in AccessPolicy and not Endpoint - if we can't figure out how to fit it in there it probably means that we still don't have a good enough definition of the AP object.

I'm not strong on this; TrustedCA is somewhat independent from the policies you set (though it is related); and also, if you represent the same policy using different languages, then you have to replicate it when putting the attribute in MappingPolicy

...

And I still think the Downtime attributes should be in a separate object, to me it makes no sense to put them there - among other things, if the service is down the Endpoint will probably not be published so you won't be able to read the Downtime information.

this was done also for simplifying the query and because the relationship is 1--1. I understand your point. You envision that start time is not published by the endpoint itself but by some other service; I'll put a note in the doc

...

Share: the Endpoint and Resource relations are marked as mandatory, which doesn't seem right since the objects are supposed to be optional.

that means, if you instantiate a share, then you must instantiate an association to an existing resource and an existing endpoint; in this particular case, anyway, this is a kind of "pattern"; share and resource are abstract (and so the relationships) which means not meant to be instantiated, but subclassed

...

Manager: I'm not sure why it needs a globally unique ID?

about ID/LocalID probably we need more experience; apart from a few cases, I don't know when to lean in one direction or in the other

...

And again the Resource relation is mandatory which can't be right since the Resource itself is optional.

again, mandatory if you instantiate it; this is for consistency; an association does not live alone

...

Resource: as above, the description and the relations say that Endpoints, Shares and Managers are mandatory, which doesn't seem right.

here you raise a new possible use case: do we want to allow publishing a service with no endpoint and no share, but a manager and a resource? given the associations' multiplicity, at the moment we can have:
1. Service
2. Service + Endpoint
3. Service + Endpoint + Manager + Resource

do we need different patterns to be described?

...

And again I don't think it needs a global ID.

we need to better define when we want ID vs. LocalID

...

Also there is no accompanying text, which should at the very least say that it's an abstract entity.

added

...

Activity: it looks odd to have two identical lines for the Activity-Activity relation.

also to me... but I wanted to be consistent; since you can navigate from one end to another, then from the class you see two ends; unless we put directionality in the navigation (only from one end you can go to the other), then we need both in order to be consistent with the UML semantics

...

Policy: I think we still have some problems with this ... for one thing I think the Rule has to have type "string", since we can't possibly specify all formats. And I'm not sure why the Rule is optional, shouldn't it be 1..*?

right, changed

...

Also I think we need to say something about the semantics in the case where you have multiple Policy objects, either with the same Scheme or with different Schemes.

added a comment in the doc to be resolved

now I've to go... will answer to remaining part later. I've put the temporary new version (with track changes enabled) here: http://forge.ogf.org/sf/go/doc15213

Cheers, Sergio

...

Also the UserDomain relation should be * - the Rules may relate to several VOs, and conversely the UD object itself is optional. We also still need feedback on whether this can meet all the use cases - for EGEE I think we should be OK, but it's certainly a long way from being generic (DENY rules!).

AccessPolicy: One basic point is that it seems to me that APs should be related to Service as well as Endpoint, otherwise service discovery will be a lot more complex and messy than it is now. Also as I said above I think TrustedCA should be here.

MappingPolicy: I'm not clear if the concept of the Default mapping can actually work as stated. As mentioned above I don't think the relation to UD can be mandatory, it may well not be published, and while the Rules implicitly relate to a VO (or perhaps more than one) it may not be in any simple way, e.g. it may involve groups and roles. What exactly does it mean to say that there MUST be only one default MP per UD, in a way which is guaranteed to be meaningful for all possible policy Schemes, all technologies, all service types, ...?

OK, that's all the main entities, I'll get to the Computing part tomorrow.

Stephen

-- Sergio Andreozzi INFN-CNAF, Tel: +39 051 609 2860 Viale Berti Pichat, 6/2 Fax: +39 051 609 2746 40126 Bologna (Italy) Web: http://www.cnaf.infn.it/~andreozzi

Paul Millar

15 May

5:43 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group LastCall

Hi Sergio, Stephen, On Wednesday 14 May 2008 20:03:46 Sergio Andreozzi wrote:

...

Burke, S (Stephen) wrote:

...
Can we give some examples in the text?

in general normative documents do not have examples, but I agree that we should add more descriptive text. We may also add some examples, stating that they are non-normative

I agree that one should be careful about introducing examples in normative documents; specifically, be careful that one isn't relying on the examples to explain anything as the normative sections should be sufficient to completely specify expected behaviour. However, Stephen is right, I think some examples would help.

...

...
All the definitions have a section labelled "Association End" - what does "End" mean there?

it is borrowed from the UML world. Have a look at Section 8 (Template).

Hmmm, perhaps this information would be better nearer the beginning of the document. In my experience, it's common to have typographical conventions declared pretty early on in a document.

...

Here is a better description:

given the class X, such a class has an associated table (in the GLUE doc). In this table, the Association End section lists all the associations that start from this class. What is described is the "other side" of the association (association end) in terms of:
- what class is associated
- what is the key attribute
- what is the multiplicity (when I instantiate X, how many instances of the associated class can be associated to X via that association?)
- description of the association from the viewpoint of X

It might be an idea to also include a one-line definition of what is meant by an association: connects two entities, is uni-directional (but may have a reciprocal association), is defined with same multiplicity options as attributes, etc.

...

...
In fact although the description says that everything inherits from Entity I suspect that excludes Extension?

Yes, I believe the statement "everything inherits from Entity" is currently wrong.

...

...
Also I'd be inclined to add Name and OtherInfo to Entity - at the moment not everything has a Name but I don't think it does much harm to include it.

about Name in Entity, I'm not sure; I'd like to hear from other people as well;

Name has different multiplicity in different entities. Apart from the few entities that don't have a Name property, for Location it's a required property. If we allowed Name to be optional for Location then I think Name could (and should) go into Entity.

...

for OtherInfo, we could do that

Yes, putting OtherInfo into Entity would simplify things. There are a few entities that don't have OtherInfo, but these omissions look like mistakes. [...]

...

the extension is actually a special class; in fact it is the only one that does not inherit from Entity.

Is there a reason for this? If the attributes in Entity are optional (as is agreed elsewhere, I believe), then Extension could extend Entity.

...

I agree with your proposal; I can make the following changes: - Rename Key to ID (unless we want ID, Key, Value) - Value - multi = 1

Actually, I was going to argue against this change (sorry Stephen): a single key with multiple values may be logically distinguishable from multiple attributes with the same key (although LDAP doesn't allow this distinction, XML does). In fact, if anything, I would change the multiplicity of Value to (1..*). If Value isn't specified, the Key could be represented as an OtherInfo attribute in the corresponding Glue entity.
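The distinction can be made concrete with a short sketch. It compares one Extension holding two values with two Extension instances sharing a key, first in a nested form and then in a flat key/value form. The flat form is only a loose stand-in for a rendering that cannot keep the grouping; the class, the function and the sample key and values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Extension:
    key: str
    values: List[str]   # Value multiplicity 1..* in this sketch


def flatten(extensions: List[Extension]) -> List[Tuple[str, str]]:
    """Render extensions as flat (key, value) pairs, dropping the grouping."""
    return [(e.key, v) for e in extensions for v in e.values]


# One key carrying two values ...
one_key_many_values = [Extension("Queue", ["short", "long"])]
# ... versus two separate instances that happen to share the key.
many_instances_same_key = [Extension("Queue", ["short"]), Extension("Queue", ["long"])]

nested_forms_differ = one_key_many_values != many_instances_same_key
flat_forms_equal = flatten(one_key_many_values) == flatten(many_instances_same_key)
```

Both flags come out true: the nested form keeps the two cases apart, while the flat form yields the same pairs for both.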

...

...
Location: is missing OtherInfo, but it will get it automatically if it gets added to Entity.

ok, let's wait for other people's opinions

My vote is for adding OtherInfo to Entity.

...

...
AdminDomain: I think Owner needs to be defined better.

what about:

Identification of the person or legal entity which pays for the services and resources

The attribute has (*) mult., so "person" should be "people", also a Resource is made available through a Service (I think one cannot provide a service without also providing the necessary resources underneath). So, how about: Identification of the people or legal entities that pay for the services.

...

...
UserDomain: similarly I think UserManager and Member need more explanation and/or examples.

I too feel these attributes are not currently specified well enough. Please improve the explanation, though; examples are useful, but I'd prefer effort went into providing better explanations than went into providing examples.

...

...
Also it's not clear if what's there can work - for example if UserManager is supposed to be a VOMS URL the scheme would just be https, so how can you tell that it's VOMS and not something else? Probably it would need some kind of Type attribute.

the UserManager is actually an Endpoint.ID and not the URL; therefore it identifies an endpoint class instance with all the info

So UserManager should be an association rather than a property, right?

...

...
Service: it's not entirely obvious to me why Capability is mandatory.

let's say that you want to search for all job execution services regardless of their interface/implementation; you can search for services having Service.Capability = executionmanagement.jobexecution (capacity of executing a job or set of jobs)

it is needed if you want to search for Grid capability regardless of the middleware interface/implementation; a different level of abstraction

This is laudable, but a couple of points:

1. I believe usage (in terms of specialising some abstract entity) in Glue-2.0 is described by subclassing: a job-execution service is a ComputingService, so this field is of limited use in this case. From an OOP analogue, requiring users to submit information of a prescribed form is usually indicative of a problem with the class hierarchy. Perhaps this could be fixed by making Service an abstract entity and creating a new concrete entity (let's call it "OtherService" for now). OtherService instances describe services that one would otherwise be unable to advertise correctly. The OGSA Capability attribute would make more sense for these entities (IMHO).

2. Do these OGSA terms have specific semantics? In particular, I'm concerned that a client, on discovering something advertising itself with a specific Capability_t value (e.g., "security.accounting") will attempt to interact with that service using whatever OGSA services are defined for that Capability_t value. Since a Service might provide no OGSA compliant service, yet still be (meaningfully) advertised within GLUE, perhaps the multiplicity should be (0..*).
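A rough sketch of the class-hierarchy idea in point 1, combined with the (0..*) multiplicity from point 2. "OtherService" is the placeholder name used above; everything else (method names, constructor arguments, IDs) is an assumption made for the illustration and not part of the draft.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Optional


class Service(ABC):
    """Abstract entity: never instantiated directly, only subclassed."""

    def __init__(self, id: str) -> None:
        self.id = id

    @abstractmethod
    def kind(self) -> str:
        """Name of the concrete specialisation."""


class ComputingService(Service):
    """The subclass itself says this is a job-execution service."""

    def kind(self) -> str:
        return "computing"


class OtherService(Service):
    """Catch-all for services without a dedicated subclass."""

    def __init__(self, id: str, capability: Optional[Iterable[str]] = None) -> None:
        super().__init__(id)
        # Multiplicity 0..*: a service may advertise no capability at all.
        self.capability: List[str] = list(capability or [])

    def kind(self) -> str:
        return "other"


ce = ComputingService("urn:example:service:ce")
misc = OtherService("urn:example:service:misc")
acct = OtherService("urn:example:service:acct", ["security.accounting"])
```

With this layout `Service("x")` fails with `TypeError`, so only the specialised entities can be published, and the capability list is carried where it is actually informative.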

...

...
Endpoint: again I wonder why Capability is mandatory.

I think that in the Grid context, there should be an effort of classifying services and endpoints in terms of the conceptual capability that is provided

Again, same comment as with Service: there seems a confusion between the abstract concept of an end-point (that all end-points inherit from) and an "OtherEndpoint", that describes end-points that are not one of the specific End-point subclasses. (my 2c worth)

...

...
And I don't understand why or how Type and Version are combined as a single Interface URI: at the very least this needs *much* more explanation since it's absolutely vital to using the endpoints! And the Type list needs to be enumerated (as it is now). Basically I don't understand why we haven't kept the Type and Version as we have them in 1.3.

this emerged during the revision process with Balazs; one of the most wanted query patterns is: I want to search for an endpoint complying with an interface name and version, plus some extensions as well.

E.g.: ARC implements BES 1.0 + a number of extensions; how do you search for them?

Interface = urn:ogf:bes:1.0 and InterfaceExtension = urn:arc:bes++:1.0 and InterfaceExtension = urn:ogf:hpc:1.0

the problem sits mainly in the extensions; by coupling the name and version into one attribute, we can maintain simple properties

Could the documentation be updated to say that Interface should (must?) be a URN that encodes the type and version of the interface? Leaving it as "Type=URI" is potentially confusing.
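To show what such a statement would let a client do, here is a minimal parser. It assumes a four-field layout, urn:<namespace>:<name>:<version>, which is inferred only from the example values quoted in this thread (urn:ogf:bes:1.0 and similar); the draft does not define this layout, and the function and field names are hypothetical.

```python
from typing import NamedTuple


class Interface(NamedTuple):
    namespace: str
    name: str
    version: str


def parse_interface(urn: str) -> Interface:
    """Split an Interface value under the assumed urn:<namespace>:<name>:<version> layout."""
    parts = urn.split(":")
    if len(parts) != 4 or parts[0].lower() != "urn" or not all(parts[1:]):
        raise ValueError(f"not of the assumed form urn:<namespace>:<name>:<version>: {urn!r}")
    return Interface(namespace=parts[1], name=parts[2], version=parts[3])


bes = parse_interface("urn:ogf:bes:1.0")
arc = parse_interface("urn:arc:bes++:1.0")
```

A value such as "https://example.org/bes" is rejected by this sketch, which is the kind of ambiguity that a bare "Type=URI" leaves open.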

...

...
For QualityLevel, how do the Endpoint QLs relate to the one in the parent Service?

we discussed this and found no good answer so far, therefore we do not enforce a relationship and leave it up to whoever configures the service; after some experience we may be able to define it better

It might help to include a comment to this effect within the property's definition. [...]

...

...
I still think that TrustedCA belongs in AccessPolicy and not Endpoint - if we can't figure out how to fit it in there it probably means that we still don't have a good enough definition of the AP object.

I'm not strong on this; TrustedCA is somewhat independent from the policies you set (though it is related); and also, if you represent the same policy using different languages, then you have to replicate it when putting the attribute in MappingPolicy

I have no strong opinion, but feel I should point out that it might also be common across all Endpoints, so (in effect) a property of the Service. In practice, with the Globus/LCG/gLite/... software stack, trust is asserted by including the CA certificate in /etc/grid-security/certificates/ (or equiv. for Tomcat), so is often common across all Services on the same host. The trust is often common across a Site (since CA certs are often installed automatically to provide consistent service across the site). Perhaps it should live as a separate entity that objects can link against; but I think Stephen's right, the confusion probably stems from not having a good enough understanding of entities (and AP in particular). In the meantime having it optionally within the Endpoint seems a reasonable compromise.

...

...
And I still think the Downtime attributes should be in a separate object, to me it makes no sense to put them there - among other things, if the service is down the Endpoint will probably not be published so you won't be able to read the Downtime information.

this was done also for simplifying the query and because the relationship is 1--1.

I don't think the relationship is necessarily 1:1. A site might choose to publish a small calendar of the next few scheduled down-times (e.g., it's a Windows NT box that must be rebooted every weekend). [...]
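A sketch of the separate-object idea: downtimes are published as their own records that refer to an endpoint by ID, so several can exist per endpoint and they stay readable while the endpoint itself is not published. The Downtime class, its fields, the endpoint ID and the dates are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Downtime:
    endpoint_id: str   # reference to Endpoint.ID, not embedded in the Endpoint
    start: datetime
    end: datetime


def next_downtime(calendar: List[Downtime], endpoint_id: str,
                  now: datetime) -> Optional[Downtime]:
    """Return the earliest downtime for the endpoint that has not yet ended."""
    upcoming = [d for d in calendar if d.endpoint_id == endpoint_id and d.end > now]
    return min(upcoming, key=lambda d: d.start, default=None)


# A small calendar: the same endpoint is down for two hours on two Saturdays.
calendar = [
    Downtime("urn:example:endpoint:1", datetime(2008, 5, 17, 2, 0), datetime(2008, 5, 17, 4, 0)),
    Downtime("urn:example:endpoint:1", datetime(2008, 5, 24, 2, 0), datetime(2008, 5, 24, 4, 0)),
]

upcoming = next_downtime(calendar, "urn:example:endpoint:1", datetime(2008, 5, 18, 0, 0))
nothing_left = next_downtime(calendar, "urn:example:endpoint:1", datetime(2008, 5, 25, 0, 0))
```

The lookup needs only the endpoint ID, so whichever service publishes the calendar does not have to be the endpoint being described.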

...

...
Share: the Endpoint and Resource relations are marked as mandatory, which doesn't seem right since the objects are supposed to be optional.

that means, if you instantiate a share, then you must instantiate an association to an existing resource and an existing endpoint;

in this particular case, anyway, this is a kind of "pattern"; share and resource are abstract (and so the relationships) which means not meant to be instantiated, but subclassed

True, but the issue remains: (subclasses of) Endpoints and (subclasses of) Resources are meant to be optional for (subclasses of) Share. For example, describing a StorageShare without a corresponding StorageResource is supposed to be supported.

...

...
Resource: as above, the description and the relations say that Endpoints, Shares and Managers are mandatory, which doesn't seem right.

here you raise a new possible use case: do we want to allow to publish a service with no endpoint and no share, but a manager and a resource?

given the association multiplicities, at the moment we can have:

1. Service
2. Service + Endpoint
3. Service + Endpoint + Manager + Resource

do we need different patterns to be described?

I think either: Service + Endpoint + Share or Service + Endpoint + Share + Manager was one option we wanted to retain, but I'm hoping others can comment further.

...

now I've to go... will answer to remaining part later. I've put the temporary new version (with track changes enabled) here:

http://forge.ogf.org/sf/go/doc15213

Gridforge seems down just now :-( Cheers, Paul.

Burke, S (Stephen)

16 May

4:19 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group LastCall

Paul Millar [mailto:paul.millar@desy.de] said:

...

Name has different multiplicity in different entities. Apart from the few entities that don't have a Name property, for Location it's a required property.

I don't think there's a strong reason for that, I guess it was just that Location info would normally be used in some kind of display so a Name would be quite useful.

...

Actually, I was going to argue against this change (sorry Stephen): a single key with multiple values may be logically distinguishable from multiple attributes with the same key (although LDAP doesn't allow this distinction, XML does).

Well, OK, but then I'd wonder if you were trying to express semantics that would end up being technology-dependent ... the basic point is that in LDAP there is no ordering for multivalued attributes, so you can't think of them as columns in a table. Also any multivalued attribute creates a mess in a relational implementation so we should avoid them if possible.

...

In fact, if anything, I would change the multiplicity of Value to (1..*). If Value isn't specified, the Key could be represented as an OtherInfo attribute in the corresponding Glue entity.

Yes. Stephen

Burke, S (Stephen)

10:52 a.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group LastCall

Sergio Andreozzi [mailto:sergio.andreozzi@cnaf.infn.it] said:

...

- when I instantiate the endpoint, what associations must be instantiated?

I can see that can make sense in some cases, e.g. if there is a ShareCapacity there must be a Share to attach it to. However, for the main entities that isn't the case, e.g. you can have a Share without a Resource or a Resource without a Share, so the relation can't be mandatory.

...

...
All the definitions have a section labelled "Association End" - what does "End" mean there?

it is borrowed from the UML world.

Maybe it is, but I doubt that most readers will be UML experts ...can't it just be called "Association"?

...

...
In fact although the description says that everything inherits from Entity I suspect that excludes Extension?

[...]

...

if Class A inherits from class B, then class A inherits all the class B attributes *plus* all its associations;

I phrased that badly, what I meant was that Extension does *not* inherit from Entity even though the text says that everything does. Actually Paul suggested that maybe Extension should inherit from Entity and there could be some logic to that - if you did that then you could have Extensions to Extensions so it would be possible to build a more complex extension structure.

...

I agree with your proposal; I can make the following changes: - Rename Key to ID (unless we want ID, Key, Value)

Actually that is what I was suggesting, i.e. to have ID (or LocalID), Key and Value - we have a general rule that the IDs should not have any semantics so they shouldn't be keys. Secondly you would need that to allow the possibility of several Extension objects with the same Key.
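The ID/Key/Value split suggested above can be sketched as follows. This is only an illustration, not a rendering from the specification: the class name mirrors the Extension entity, but the field names and the sample keys and values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    local_id: str  # opaque identifier, carries no semantics (assumed field name)
    key: str       # may repeat across Extension objects
    value: str


def values_by_key(extensions):
    """Group the values of all Extension objects that share a Key."""
    grouped = defaultdict(list)
    for ext in extensions:
        grouped[ext.key].append(ext.value)
    return dict(grouped)


# Hypothetical sample data: two objects share one Key but have distinct IDs.
extensions = [
    Extension("ext-1", "SupportedCompiler", "gcc"),
    Extension("ext-2", "SupportedCompiler", "icc"),
    Extension("ext-3", "Rack", "B12"),
]
grouped = values_by_key(extensions)
```

Because the identifier is separate from the Key, repeated Keys do not collide, and no multivalued attribute is needed to hold several values.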

...

...
AdminDomain: I think Owner needs to be defined better.

what about: Identification of the person or legal entity which pays for the services and resources

It still isn't entirely clear what "identification" means - this is somewhere that examples would help.

...

the UserManager is actually an Endpoint.ID and not the URL; therefore it identifies an endpoint class instance with all the info

Well, then it's surely wrong, it should be an association and not an attribute. Also the structure gets more complicated, you will need a Service to attach the Endpoint(s) to ... I'd be interested to see what this would look like in practice. It's also worth remembering that there may be several VOMS endpoints for a given VO, maybe at different sites, used for failover.

...

...
Service: it's not entirely obvious to me why Capability is mandatory.

let's say that you want to search for all job execution services regardless of their interface/implementation;

I can see that you *may* want to search like that, but is this so vital that we say that everyone *must* publish this for *every* endpoint and service? For EGEE at least I would think that we'd want this to be optional.

...

...
Also I think the enumerated Type list in the appendix is inadequate,

it is an open enumeration, therefore if values do not match, no problem to publish something extra; it is probably out of scope to provide a complete list;

It can't be complete but I think we should at least cover the standard cases - what are people supposed to do when they start writing info providers? They will be stuck at the first step ...

...

...
And I don't understand why or how Type and Version are combined as a single Interface URI:

[...]

...

this emerged during the revision process with Balazs; one of the most wanted query patterns is: I want to search for an endpoint complying with an interface name and version, plus some extensions as well.

OK, but what's wrong with doing (&(Type=SRM)(Version=2.2.0)) or the equivalent in other technologies? The whole point of the schema implementations should be to make queries efficient; putting multiple items in one attribute just undermines that, effectively you have a mini-schema within the attribute. In the limit we could have a single attribute per object with a huge URI that concatenates everything!
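As a rough illustration of the point about separate attributes, the filter quoted above can be composed from independent attribute/value pairs. The helper name `ldap_and` is made up for this sketch, and it does not escape special characters in values as a real client would need to.

```python
def ldap_and(**attrs):
    """Build an LDAP AND filter from attribute=value pairs (no value escaping)."""
    clauses = "".join(f"({name}={value})" for name, value in attrs.items())
    return f"(&{clauses})"


# Interface type and version kept as separate attributes.
both = ldap_and(Type="SRM", Version="2.2.0")

# Dropping the version constraint is just a shorter filter.
type_only = "(Type=SRM)"
```

With a single concatenated attribute, the type-only query would instead depend on knowing the internal layout of the concatenated value.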

...

the problem sits mainly in the extensions; by coupling the name and version into one attribute, we can maintain simple properties

That doesn't make much sense to me, an attribute which concatenates several things is complex, not simple. What if I *don't* care about the version, just the type? What if I want to do Version>x? In any case, you didn't address my point that the Type values *must* continue to be an enumerated list as they are now - this is really the core of the entire schema, you can't do anything without knowing what the endpoint types are.
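The two queries raised here (type only, and Version>x) can be sketched over in-memory records. Everything below is illustrative: the record layout, the sample endpoints and the dotted-integer version format are assumptions, not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    type: str
    version: str


def version_key(version):
    """Turn '2.2.0' into (2, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical published endpoints.
endpoints = [
    Endpoint("SRM", "1.1.0"),
    Endpoint("SRM", "2.2.0"),
    Endpoint("gridftp", "2.3.0"),
]

# "I don't care about the version, just the type."
srm_any = [e for e in endpoints if e.type == "SRM"]

# "Version > x", here with x = 2.0.0.
srm_newer = [
    e for e in endpoints
    if e.type == "SRM" and version_key(e.version) > version_key("2.0.0")
]
```

Both queries are one-line conditions when Type and Version are separate fields; with a combined value, each query would first have to split it apart.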

...

I'm not strong on this; TrustedCA is somewhat independent from the policies you set (though it is related);

It is an access policy, i.e. it specifies who is allowed to access the endpoint/service. It's true that it's somewhat different in format to, say, a VOMS FQAN - but we want the Policy object to be generic, so it should be able to cope with that.

...

...
And I still think the Downtime attributes should be in a separate object,

[...]

...

this was done also for simplifying the query and because the relationship is 1--1.

The relationship *isn't* 1-1, there may be several upcoming downtimes for the same endpoint, you just forced it to be 1-1 (first downtime only) to make it fit!
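A minimal sketch of the one-to-many alternative, assuming Downtime is modelled as its own object attached to an Endpoint. The class and field names are hypothetical; the point is only that "first downtime only" becomes a query over the list rather than a limit of the schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Downtime:
    start: datetime
    end: datetime
    info: str = ""


@dataclass
class Endpoint:
    id: str
    downtimes: List[Downtime] = field(default_factory=list)

    def next_downtime(self, now: datetime) -> Optional[Downtime]:
        """Return the earliest downtime that has not yet ended, if any."""
        upcoming = [d for d in self.downtimes if d.end > now]
        return min(upcoming, key=lambda d: d.start, default=None)


# Hypothetical endpoint with a small calendar of weekend reboots.
utc = timezone.utc
endpoint = Endpoint(
    "urn:example:endpoint-1",
    [
        Downtime(datetime(2008, 5, 24, 2, tzinfo=utc), datetime(2008, 5, 24, 4, tzinfo=utc), "reboot"),
        Downtime(datetime(2008, 5, 17, 2, tzinfo=utc), datetime(2008, 5, 17, 4, tzinfo=utc), "reboot"),
    ],
)
first = endpoint.next_downtime(datetime(2008, 5, 16, tzinfo=utc))
second = endpoint.next_downtime(datetime(2008, 5, 18, tzinfo=utc))
```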

...

...
Share: the Endpoint and Resource relations are marked as mandatory, which doesn't seem right since the objects are supposed to be optional.

that means, if you instantiate a share, then you must instantiate an association to an existing resource and an existing endpoint;

I'm not sure what you mean by "existing". If you're saying that *if* the Endpoint and Share both exist then you should fill in the relations too then I guess that makes sense, but I think it should be clarified that that is the meaning. However, that doesn't seem to match what you have for other things, e.g. Service to Endpoint is marked as *, but the same argument would apply there, *if* Endpoint objects exist for a given Service then the relation from Service should be filled in, it isn't optional in that case.

...

in this particular case, anyway, this is a kind of "pattern"; share and resource are abstract (and so the relationships) which means not meant to be instantiated, but subclassed

But presumably the subclasses inherit the multiplicities for the associations?

...

about ID/LocalID probably we need more experience; apart from a few cases, I don't know when to lean in one direction or in the other

I think there are two cases: you need a unique ID when you have associations that span across services, or where some external service like a catalogue may need to store the ID. For Manager, I'm not sure that either of those apply because you don't access the manager software directly. For Resource there is in fact a use case that just arose: Steve Traylen is looking at having a GlueService (basically a gridftp server) linked to a SubCluster to let people modify the RunTimeEnvironment tags. In Glue 2 terms ... hmm, I would expect that to be in the ExecutionEnvironment, i.e. Resource, but in fact I don't see it - what is supposed to be the Glue 2 equivalent of the RTE?

...

here you raise a new possible use case: do we want to allow to publish a service with no endpoint and no share, but a manager and a resource?

At least, I'm not sure we should forbid it at the schema level. (A read-only classic SE which only supports NFS access?)

...

given the association multiplicities, at the moment we can have:

1. Service
2. Service + Endpoint
3. Service + Endpoint + Manager + Resource

I think we should allow Manager without Endpoint (classic SE with NFS, CE with local submission only, ...) Stephen

Burke, S (Stephen)

15 May

2:05 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLastCall

...

OK, that's all the main entities, I'll get to the Computing part tomorrow.

or the day after ...

For the Computing diagram, I'm unsure about the multiplicities. I can reasonably believe that you must have a Share and an EE, but I'm not entirely sure if the Endpoint should be mandatory at the schema level, e.g. would you ever want to publish something where you only had local submission? And it seems slightly strange that there is no direct relation from EE to Service, although it's the result of how the main entities are related - it will be interesting to see what this looks like in LDAP. Also I don't understand why there is a relation from Benchmark and ApplicationEnvironment to ComputingManager, I would expect them to only relate to the EE.

ComputingService: I think we should explicitly say what the constraints are on the *Jobs attributes, i.e. that Total is the sum of the others (I assume?). Also are these numbers supposed to include non-Grid jobs? In the table the relations to Endpoint and Share have multiplicity * but that doesn't correspond to the diagram (as above), i.e. it should be 1..*.

ComputingEndpoint: the description says that it may be used for things like reservation and proxy manipulation, but that seems a bit odd to me, e.g. the attributes (Staging and JobDescription) probably wouldn't be relevant for such things. Maybe this raises a more general point: can a ComputingService only be composed of ComputingEndpoints, or can it have generic Endpoints too?

ComputingShare: MaxMemory says that it's the maximum RAM, but is it in fact the total memory size whether RAM or swap? The Tag says that it's user-defined but it can presumably also be grid-defined. Anyway can this not just be OtherInfo? Also the association says ComputingResource and not ExecutionEnvironment.

ComputingManager: I think the Homogeneity attribute would be better to be called Homogeneous if it's a boolean, i.e. True means it is homogeneous. Why is there a NetworkInfo attribute here, surely this is a property of the EE and not the LRMS? Is there a relation between TmpDir and the WorkingArea variables - and again, is this a property of the LRMS or the EE? What is the difference between WorkingArea and Cache? Are the WorkingAreaTotal and Free supposed to be dynamic attributes? If so, what do they mean given that WN-local working areas could all have different sizes, and how could anyone write an info provider? Similarly for CacheTotal and CacheFree - presumably they are WN-local areas, so what does an LRMS-level value mean? (Basically I think the descriptions are not adequate here ...) The WorkingAreaLifetime is presumably a minimum value, i.e. the files might stay longer. In the associations, the EE is marked as multiplicity 1..* but the description says "zero or more".

Benchmark: as above, I don't understand why it's related to the ComputingManager, it's a hardware property. There is no explanatory text for this object, I think there should be something.

EE: Seems to be missing the attributes inherited from Resource. The marking of a few properties (Platform, MainMemorySize, OSFamily and ConnectivityIn/Out) as mandatory seems rather arbitrary - I can see an argument for OSFamily to be important enough to be mandatory, but not the others. I think it could be made clearer exactly what is meant by an instance of an EE. The multiplicity for the association to ComputingManager is 1 which seems wrong given that ComputingManager itself is optional, at least according to the diagram.

ApplicationEnvironment: I think the Name attribute should be named something different, given that everywhere else Name just means a human-readable description with no semantics (which appears to be the definition of the Description attribute here). Also as someone pointed out we may need more information about the application, e.g. compiler options. The explanation of Repository doesn't really enable me to understand what it means. Again the ComputingManager association is marked as mandatory when the object itself is optional, and as before I don't understand why this relation should exist at all since AE is a property of the EE and not the LRMS. In general I think this is one area where some examples would help a lot.

ApplicationHandle: there is no explanatory text, and I think it needs some.

ComputingActivity: is the Owner generally supposed to be a DN? If so I would suggest typing it as DN_t rather than String, and indicating anonymity with a Paul-style special DN like /O=Grid/CN=ANONYMOUS. UserMainMemory - again this says RAM but I suspect it means the total memory. ProxyExpirationTime: for VOMS proxies this is complex as every AC has its own expiration time as well as the one for the entire proxy. Should this be the smallest of all those times? For SubmissionHost, I don't understand why this would have a port, or indeed what the "e.g." part in the description means at all. The associations to Share and EE are marked as optional, shouldn't they be mandatory?

CS2SS: is missing any explanatory text.

That's it for computing, storage to follow ...

Stephen
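The constraint assumed above for the *Jobs attributes (Total equals the sum of the per-state counts) could be checked by an info provider along these lines. This is a sketch of that assumption only; the attribute names and numbers are hypothetical, and the specification may define the relationship differently.

```python
def total_matches_states(total_jobs, jobs_by_state):
    """Check the assumed rule: total equals the sum of the per-state counts."""
    return total_jobs == sum(jobs_by_state.values())


# Hypothetical per-state counts published for one ComputingService.
published = {"RunningJobs": 40, "WaitingJobs": 15, "SuspendedJobs": 2}

consistent = total_matches_states(57, published)
inconsistent = total_matches_states(60, published)
```

Whether non-Grid jobs are counted changes the inputs but not the check itself, which is one reason the text would need to say explicitly what is included.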

Burke, S (Stephen)

3:47 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLastCall

...

That's it for computing, storage to follow ...

OK ... on the diagram I again have doubts about the multiplicities: Share to Resource says 1..* at both ends but I think they should both be *. Conversely, Service to AccessProtocol says * but I'm inclined to think it should be 1..*, an SE with no way to access the data wouldn't be very useful ...

As a general comment, the text uses the word "capacity" (or "capacities") in many places to mean something like "the ability to store data". I think this is a bad choice of word, partly because of possible confusion with the Capacity objects and partly because I don't think it's really the right word anyway. Also in some places the text uses "storage extent" to mean what seems to be something similar, and again I don't think "extent" is a very good word. However, I'm not entirely sure what to recommend as an alternative - perhaps "capability"?

Service: four of the association descriptions use the word "offers", which I suspect isn't really the right word for most of them - for protocols it's OK, and maybe for Shares, but it doesn't really offer managers (they aren't externally visible) and it definitely doesn't offer Capacities, it just has them.

SSCapacity: Here the word "capacity" appears in the description, so it seems that a Capacity object tells you the capacity of a capacity ... not very good! I think the *Size attributes need a more complete description, as this always seems to confuse people. Indeed there is no general explanatory text for this object and I think there should be some, e.g. to say that this is a whole-SE summary of the more detailed Share-level information.

AP: The association to the SS2CS object (as shown in the diagram) is missing from the table. Again there is no explanatory text.

Endpoint: are we expecting access protocol endpoints (e.g. for a classic SE) to be published as Endpoints or StorageEndpoints? There is no text to indicate what the endpoint is expected to be.

Share: the Path is marked as mandatory - maybe that's OK and you should publish "/" if there is no specific path, but if so I think it should say so explicitly. Indeed it needs to be clearer exactly what the semantics are here - this is a SURL prefix to be used when files are written, and cannot in general be reverse-engineered, i.e. given a SURL you can't reliably deduce which Share it's in. Tag: I think we need a better description here, something which would correspond to the space token description for an SRM while being generic enough to be applicable to other technologies. In the associations, Endpoint and Resource are marked as mandatory when the objects are optional (and MappingPolicy is missing because it's missing from the main entities too). I think the text description for Share is not very clear.

ShareCapacity: the description says "size and state" but in fact it's only the size. Again I think the attributes need a clearer description. The text below the table could also be clearer, at least my mental parser is currently throwing an error :)

Manager: Type seems a slightly strange name for the attribute, things like "enstore" and "castor" aren't really types. Actually this applies to ComputingManager too although I didn't pick it up there, are "lsf" and "pbs" types? Also since both CM and SM have a Type attribute should it anyway be defined in the parent Manager entity? (ditto Version). For the StorageResource association it says 1..* for the multiplicity but the text says "zero or more" - I think the text is right. There is no explanatory text below the table, I think there should be some.

StorageResource: as you might guess I'm still inclined to prefer DataStore! Since we have ExecutionEnvironment and not ComputingResource this is evidently not a hard naming rule ... the description says "one or more endpoints/shares", it should be "zero or more". The Latency description is wrong, in this case it's the actual latency and not the maximum (tape is Nearline here even if it's part of a D1T1 Share). The Manager and Share associations say that they are mandatory, but I think the reality is a bit more complicated - you must have at least one of them otherwise the object will be completely detached, but logically either one of them could be missing - although possibly that would make a mess of the renderings. There is no explanatory text, this definitely needs some given all the trouble it causes ...

SS2CS: The AccessProtocol relation is marked as multiplicity 1, but I think it should be *. On one side you could have a "close SE" relation involving several protocols (e.g. rfio and file). On the other side you may want to have the network info regardless of protocol (e.g. for gridftp). Again there is no general explanatory text for this object.

That's everything for storage - there will be one more mail about the type definitions, when I find the energy to write it!

Stephen

Burke, S (Stephen)

4:59 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLastCall

...

That's everything for storage - there will be one more mail about the type definitions, when I find the energy to write it!

Right, last lap on appendix B ... one general thing is that I think it should say at the start exactly what "open enumeration" and "closed enumeration" mean, and indeed what the mechanism is for adding to the open ones. Also it looks as though most, but not all, of the enumerated values are in lower case - is there any particular reason for that? It looks a bit odd in some cases, and doesn't match our current practice. Also in some cases, e.g. ExpirationMode_t, the values are defined elsewhere (e.g. the SRM spec) and have a definite case structure already.

PolicyRule_t: I think this still needs more work. I don't understand how the basic scheme can be just RECOMMENDed, surely if we define it it must be mandatory? Also I still don't like the DENY in there, and if it is there I think the semantics needs to be defined much more explicitly. Indeed that's true in general, I don't think this is clear enough. It also doesn't mention wildcard matching rules, which we need at least in a simple form for EGEE. As more minor points we currently have VO: and VOMS: rather than vo: and fqan:, is there a good reason for changing the current practice? Also in the examples this is wrong in the EGEE practice, for example VOMS:/atlas would *not* match /atlas/higgs according to the EGEE-agreed matching rules. Should EGEE just ignore this and define its own scheme?

Capability_t: I find this pretty hard to grasp, and if we are seriously intending to make it mandatory (which I still think is a mistake) there needs to be a lot of guidance to implementors on how to assign these in practice.

ServiceType_t: as I already said I think this list needs to be extended at least to cover the cases we know about in EGEE, OSG and Nordugrid.

EndpointTechnology_t: What does "legacy" mean? What would e.g. http count as? I would suggest having a value "custom" to mean a service-specific protocol (e.g. the old NS interface to the WMS). Describing (nearly) anything that isn't a web service as "legacy" seems a bit extreme :)

DateTime_t: why restrict to GMT, is there any reason to disallow the generic format? A Grid operating in e.g. Japan might find that annoying ...

Staging_t: why is it an open enumeration when it seems to cover all possibilities?

ApplicationHandle_t: the description for softenv seems to be copied from module.

OSName_t: for EGEE, after a long discussion we ended up with this for the current schema: http://goc.grid.sinica.edu.tw/gocwiki/How_to_publish_the_OS_name If you plan to change that be prepared for some disagreements!

License_t: why is this a closed enumeration? The values seem quite restrictive.

Stephen
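The PolicyRule_t disagreement above comes down to two readings of a rule such as VOMS:/atlas. The sketch below only contrasts those two readings; it is not the EGEE-agreed algorithm (which also covers roles and wildcards), and the function names are made up for illustration.

```python
def group_of(rule):
    """Strip the scheme prefix, e.g. 'VOMS:/atlas' -> '/atlas'."""
    _scheme, _sep, group = rule.partition(":")
    return group


def matches_exact(rule, fqan):
    """Reading 1: the rule names exactly one group."""
    return group_of(rule) == fqan


def matches_subgroups(rule, fqan):
    """Reading 2: the rule also covers every subgroup of the named group."""
    group = group_of(rule)
    return fqan == group or fqan.startswith(group + "/")


exact_result = matches_exact("VOMS:/atlas", "/atlas/higgs")
subgroup_result = matches_subgroups("VOMS:/atlas", "/atlas/higgs")
```

Under the first reading /atlas/higgs is not matched, which is the behaviour described above for EGEE; under the second it is matched, which is what the draft's examples appear to imply.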

Maarten Litmaath

6:17 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLastCall

Burke, S (Stephen) wrote:

...

As a general comment, the text uses the word "capacity" (or "capacities") in many places to mean something like "the ability to store data". I think this is a bad choice of word, partly because of possible confusion with the Capacity objects and partly because I don't think it's really the right word anyway. Also in some places the text uses "storage extent" to mean what seems to be something similar, and again I don't think "extent" is a very good word. However, I'm not entirely sure what to recommend as an alternative - perhaps "capability"?

That term is also being used already for something else. What about "ability"? (I did not check if it makes sense everywhere.)

...

Service: four of the association descriptions use the word "offers", which I suspect isn't really the right word for most of them - for protocols it's OK, and maybe for Shares, but it doesn't really offer managers (they aren't externally visible) and it definitely doesn't offer Capacities, it just has them.

For managers: "is managed by". For capacities: "has".

...

Manager: Type seems a slightly strange name for the attribute, things like "enstore" and "castor" aren't really types. Actually this applies to ComputingManager too although I didn't pick it up there, are "lsf" and "pbs" types? Also since both CM and SM have a Type attribute should it anyway be defined in the parent Manager entity? (ditto Version). [...]

What about the good old Implementation?!

...

StorageResource: as you might guess I'm still inclined to prefer DataStore! Since we have ExecutionEnvironment and not ComputingResource this is evidently not a hard naming rule [...]

Either is fine with me.

Balazs Konya

12:45 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group Last Call

hi, Please find below some corrections spotted by our nordugrid colleagues. Balazs

...

-------- Original Message --------
Subject: Re: [NG-disc] GLUE 2.0 Specification - draft 41 - Working Group Last Call
Date: Wed, 14 May 2008 16:06:55 +0200
From: Mattias Ellert <mattias.ellert@fysast.uu.se>
Reply-To: Technical discussion about the middleware <nordugrid-discuss@nordugrid.org>
To: Technical discussion about the middleware <nordugrid-discuss@nordugrid.org>
References: <48299CD2.5070807@hep.lu.se>

Hi (especially Balázs),

here are a few comments on the last call draft.

Page 4: ‘MUST” → “MUST”

Page 4: fmultiple → multiple (or maybe multiples)

Page 11: see Section 0 → see Section 6.2

Page 13: In the table for the Policy entity, UserDomain.ID is marked <<abstract>> though it is not.

Page 15: Application Environment Handle → Application Handle

Page 16: single computing manager which execution environments → single computing manager whose execution environments

Page 20: see Section 0 → see Section 5.6

Page 21: The description for LogicalCPUDistribution doesn't make sense

General: OtherInfo attribute vs. Extension entity association. Do you need both? When to use which? You probably thought of this already.

General: Some of the tables lack the vertical dividers/edges.

Mattias

Paul Millar

3:50 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group Last Call

Hi Balazs, Thanks for the feedback. On Thursday 15 May 2008 14:45:46 Balazs Konya wrote:

...

...
General: OtherInfo attribute vs. Extension entity association. Do you need both? When to use which? You probably thought of this already.

This one stumped me for a bit. I believe the distinction is valid, but could (perhaps) be better explained.

My understanding is the Extension object is to allow Glue to represent completely new objects that cannot be expressed as any of the existing objects. The OtherInfo is for adding additional information to objects that can be expressed within Glue.

At the risk of opening a can-o-worms, suppose a grid wants to represent actual storage hardware at a site (actual disks and tape libraries). They could do this by instantiating Extension objects (modelled after how CIM schema does this, for example), one for each disk and one for each tape-library. To indicate the hardware relationship to StorageResources, they could add OtherInfo attributes to the StorageResource that provides the link(s).

HTH, Paul.

Burke, S (Stephen)

16 May

3:49 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working GroupLast Call

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said:

On Thursday 15 May 2008 14:45:46 Balazs Konya wrote:

...

My understanding is the Extension object is to allow Glue to represent completely new objects that cannot be expressed as any of the existing objects. The OtherInfo is for adding additional information to objects that can be expressed within Glue.

In effect it's about simplicity, OtherInfo is much easier to use both for people writing info providers and for clients trying to use it. However, for some things it may not be sufficient, and Extension can provide more structure if it's needed. Indeed, if we allowed Extensions to have Extensions of their own you could potentially build a whole custom schema within GLUE - although whether it would be a good idea might be questionable ... Stephen

Paul Millar

15 May

9:26 p.m.

New subject: GLUE 2.0 Specification - draft 41 - Working Group Last Call

Hi Sergio, On Tuesday 13 May 2008 03:26:04 Sergio Andreozzi wrote:

...

This is the official Working Group last call for GLUE 2.0 and we invite all of you to read the specification and to send your feedback by Friday, 12 AM CEST.

OK, here are my comments, restricted to comments for the whole document or for the main entities and Appendix A and B. Of course, feel free to ignore these suggestions. (when suggesting changes to text, I've used the following convention: "[aa]" means delete "aa", "/bb/" means add "bb", "/bb/[aa]" means replace "aa" with "bb") *** General comments (not page specific) Top left corner of each page has "GWD-R, GWD-I or GWD-C"; this is not consistent throughout the document. Bottom left corner of each page (except title page) has what looks like an email address. This is not consistent throughout the document. Top right corner of each page (except title page) has a date. This is not consistent throughout the document (contents pages vs document main-body). The RFC-2119 references (MAY, SHOULD, MUST, etc) are not typeset in a uniform fashion. They're capitalised for most part but I believe there's the occasional usage in lower-case (Appendix A has them in lower-case). A search-and-replace should identify all such problems. Could RFC-2119 terms be typeset in a slightly smaller font? I find this helps a surprisingly when dealing with all-caps acronyms and phrases. There are many places where "can" is used instead of an RFC-2119 term. Could you search for all instances of the word "can" and replace (almost) all of them with either "MAY" or "SHOULD"? References are not handled consistently through the document: in some places they are cited using a square-brackets notation ("GLUE Schema 1.x [glue-1x]" ), in other places URIs are placed in-line (e.g., "RFC 2119 (see http://www.ietf.org/rtc/rfc2119.txt)."). A consistent scheme should be used. My vote would be for square brackets. The start of the References section doesn't appear in the Table of Contents. The Appendices are currently referenced as additional numerical sections: Appendix A is Section 16, Appendix B is Section 17. 
In my experience, it is usual to have Appendices start as a separately "numbered" sections, where the enumeration is expressed in upper-case Alphabet (hence the "A" of Appendix A). Could the two Appendices be altered to follow this convention? Could we have consistent typesetting of the entity names? The capitalisation tends to vary throughout the document: sometimes capitalised, sometimes lower-case (e.g., "UserDomain" vs "userDomain" vs "userdomain" vs "user domain"). It might help when reading the document if the entity names stood out more; for example, they were all in italics. However, this is a style issue, so it's just a mild suggestion. The phrase "this is an abstract entity not meant to be instantiated" is repeated for different abstract entities. This is imprecise: instances of this phrase should be updated so they say that abstract entity "SHOULD NOT be instantiated" or "MUST NOT be instantiated". The OtherInfo property the word "example" should be plural ("examples"), perhaps a better phrase is "[...] are all examples of valid syntax". There are several places where the document repeats information; for example on page 4 it is emphasised that ID is of type URI, where this is clearly stated in the Entity definition. Do we really need to labour this point? Could these statements be replaced by a single blanket statement (in General Comments, for example) that implementors must ensure that they follow the correct types for each entity property. *** Page 1 (title page) Could my association be changed to "DESY" (i.e., all caps) ? Abstract: Some suggestions (feel free to ignore :) "[...] described /using/[in] natural language /and/ enriched with a graphical representation [...]" "As a conceptual model, /it/[this] is /designed to be/[meant to be] /independent of the underlying information system/[implementation-independent]." "Rendering to concrete data models such as XML Schema, LDAP /Schema/, and /SQL/[relational] are provided in [a] separate document/s/." 
See also changes within first para. of Introduction (page 4).

*** Page 4

Introduction

If the abstract is updated, the corresponding sentences in the first para. of the introduction should have similar changes applied.

The terms LDAP, XML Schema and "relational" (perhaps replaced by "SQL") are used without citing references for them.

On page 6 there is a reference to "an interoperability profile" without this term being defined. The closest to a definition is the last sentence of the first para. of the introduction, which hints towards an interoperability profile. The term "interoperability profile" should be defined somewhere, and the introduction seems like a natural place. This could be an additional paragraph, appearing after the first.

General Statements

Instead of "The ID MUST be compliant with the syntax of a URI.", how about simply "All ID property values must be valid URIs"?

Assuming ID is moved to Entity, perhaps some of the content of the paragraph about ID and LocalID could be moved to the section in Main Entities, where the "Entity" entity is defined.

The terms "URI" and "URN" are used without defining them. A simple citation of the relevant RFCs should be sufficient.

I believe SI is "Le Système International d'Unités", not "International System". ISO-2955 might be a more appropriate reference than Wikipedia, although the URI (from Wikipedia) is spectacularly ugly: http://isotc.iso.org/livelink/livelink/fetch/2000/2122/138351/138352/4446951...

I feel we should cite some reference for the binary SI prefix, but I'm not sure of the best source to reference.

Could we typeset 10^9 and 2^30 correctly: with the exponent numbers in superscript?

The "place-holder values" section is either Appendix A or Section 16, not "Appendix 16". I feel we should state whether implementors MUST or SHOULD follow the guidelines in Appendix A, rather than just introducing the Appendix.

There's a throw-away comment about "attributes" and "properties" being synonyms.
Could we simply replace all instances of "attribute" with "property" and remove this paragraph?

*** Page 5

The associations between Domain and Location, and Service and Location, are described as "primary located at"; this is perhaps better expressed as "primarily located at" or "has primary location".

*** Page 6

Do we really want to make the CreationTime and Validity properties required? How about these changes to the descriptions:

CreationTime: "Timestamp /describing/ when the Entity [...]"

Validity: The duration after CreationTime that the information presented in the Entity MAY be considered relevant. After that period has elapsed, the information SHOULD NOT be considered relevant.

CreationTime and Validity are not included in the inherited properties in the other entities.

Extension: "A key,value pair enabling /the/[to] /association of/[associate] extra information /with/[to] /an Entity/[a class] instance [which is] not capture by the model"

Location

Longitude: "The position of a place east or west of /the primary meridian (located in/ Greenwich, /UK/[England] /)/."

We should also mention that -180 degrees is not a valid meridian of longitude (it's +180 degrees instead).

The final para. of the page mentions "interoperability profile" without defining this term (see comments about page 4).

*** Page 7

Contact:

URL property: this name seems wrong. The property is a URI, which is something the description seems to contradict itself over (URL, no URI, ....). Perhaps it should be given a more generic name, although I'm struggling to come up with something better than "transport".

Domain:

The description reads more like a description of a User-domain: AdminDomain objects (as a subclass of Domain) are not assigned Roles, so this description is wrong.

Perhaps the "WWW" property should really be something like "FurtherInfo". The Type is URI, not URL, so it could express something other than a web-page.
For example, a grid might choose to express Domain further information using gopher, or by looking up the details through the French Minitel system, by querying some database service, or ...

*** Page 8

AdminDomain:

The description shouldn't use "can"; these words should be changed to MAY, and "should" should be "SHOULD".

Distributed Property: "admindomain" --> "AdminDomain"

"This structure /MAY/[can be] represent[ed] /a/[via the] "participates in" association."

UserDomain:

Description: "A collection of actors that /MAY/[can] be assigned [with] user roles and privileges to Service or Share entities via Policy entities."

I'm not sure why there is a Level property; isn't this described by the UserDomain--UserDomain hierarchy?

*** Page 9

The para. describing Virtual Organisations seems a little out of sequence. For example, "VO" is used in the second sentence, before it is defined. The term is defined in the third sentence, rather than where "Virtual Organisation" is first used. Someone should spend a little bit of time tidying up that para.

The final para. should be changed: "This structure /MAY/[can be] represent[ed] /a/[via the] "participates in" association."

Service:

Capacity property: this is the first time OGSA is used, but it isn't referenced.

"StatusPage" should be "StatusInfo" (it isn't necessarily a page). The description says it's a web page. It might not be: it could be via RFC-742 (finger protocol) or an automated/recorded phone message.

*** Page 10

Comma missing after "e.g." in the "The simplest Service [...]" sentence. And the final sentence is imprecise: "Endpoints, Shares, Managers and Resources /MUST/[can] belong to /precisely/[only] one Service."

Endpoint:

The name "URL" for the property doesn't seem correct: the type is URI and the endpoint might not be a URL. How about something like Target, Contact or just "URI"?

WSDL: the description says this is a URL. The Type is URI so the value might not be a URL.
SupportProfile: The description is useless: it doesn't specify what this is in any way.

ImplementationVersion: The description mentions the three-number representation (major-version.minor-version.patch-level) without specifying whether information MAY, SHOULD or MUST use this form.

*** Page 11

DowntimeStart description: "The [starting] timestamp /describing when/ [of] the next [scheduled] downtime /is scheduled to start/" (likewise for DowntimeEnd)

"For Grid services [...] (see Section 0)."

For the JMS example, the phrase "Java Messaging Service" should be capitalised.

*** Page 13

Policy: Neither the Scheme nor the Rule property appears to be particularly well defined.

UserDomain isn't capitalised correctly (last sentence of last para.)

*** Page 14

MappingPolicy

Default property: "user domain" --> "UserDomain"

"This entity can be used to express [...]" Is this "MAY be used" or "SHOULD be used"?

[skipping onto Appendices]

Appendix A:

Some general comments:

Various RFCs are mentioned without including references. Could the URI for RFC-2119 be adapted for these other references?

Some of the examples appear to be converted to external links within the document; for example, the "www.example.org" example (16.2.1 "Fully qualified domain name") appears in blue text with an underline. Could these be converted back to plain text?

There are many references to "unknown value" or similar. I feel these should all be changed to "place-holder value". I've tried to note where they occur, but searching should find them all.

The RFC-2119 terms (MAY, SHOULD, etc) are all lower-case.

*** Page 39

The first place-holder values should be "Simple strings". This is currently body-text rather than Section 16.2.1.

*** Page 40

FQDN: Could you update the indentation used for the examples? It appears to be inconsistent with the others.

*** Page 41

Integers: "For these reasons, information providers MUST use all-nines to indicate /a place-holder/[an unknown] value."
File path: "Software should accept either value as /a/[an unknown-value] place-holder /value/."

*** Page 42

URI: "Take care with the URI encoding. All /place-holder/[unknown] URI values MUST be [...]"

"For "mailto" URIs [...] /Place-holder/[Unknown] mailto /URI values/[URIs] MUST use [...]"

*** Page 43

FQAN: "Where VO is well-formed /FQDN/[DNS name]. Unlike /FQDNs/[DNS names], VO names must be lower-case. The [unknown] place-holder value for DQAN is derived from the /place-holder/[unknown] /FQDN/[DNS name] (see /section 16.2.x/[above])."

Geographical location: "(0,0) MUST be used to specify /a place-holder/[an unknown] location"

Appendix B

*** Page 47

Should be "UTC" rather than "GMT".

Should the OSName_t have entries like "windowsxp"? Should this be just "xp", as the Windows part is already spoken for in the OSFamily_t?

Could the xrootd protocol be added to StorageAccessProtocol_t?

AccessLatency_t

The descriptions seem wrong to me as they use technologies rather than describing the latency. Here's my attempt:

online: files with online latency are available for user activity with a low latency. The precise definition of "low" may be system-specific, but will typically be much less than ten seconds.

nearline: files with nearline latency will typically have latencies greater than those of online files, and requests are typically satisfied without human intervention. Average latency for a requested file will be implementation-specific and may depend on the available hardware, but a typical value is in excess of a minute. Storage systems may undertake optimisations so that, under special circumstances, nearline latency may approach that of online latency.

offline: storage that requires manual, human intervention to retrieve the data. Typical latency will depend on SLA, but typical values may exceed a day.

... and on to Storage Entities tomorrow!

Cheers,

Paul.

participants (5)

Balazs Konya
Burke, S (Stephen)
Maarten Litmaath
Paul Millar
Sergio Andreozzi