New Endpoint and Service types

stephen.burke＠stfc.ac.uk

3 Dec 2013

5:37 p.m.

Hi,

The QCG project (http://www.qoscosgrid.org/trac/qcg) is implementing GLUE 2 publication and would like to use the following EndpointInterfaceName values:

org.qcg.broker - QCG-Broker web service interface
org.oasis.notification - OASIS WS-Notification standard

and ServiceTypes:

org.qcg.computing - QosCosGrid computing element
org.qcg.notification - QosCosGrid notification broker
org.qcg.broker - QosCosGrid metascheduling service

Can we add them to the relevant lists? The people in CC should be able to provide more information if necessary.

Stephen

--
Scanned by iCritical.


Tomasz Piontek

23 Dec

12:46 p.m.

Hello,

Is there any decision concerning adding new EndpointInterfaceNames and ServiceTypes for the QCG stack?

Merry X-mas and Happy New Year ...

All the best,
Tomek

On 03.12.2013 18:37, stephen.burke@stfc.ac.uk wrote:

...

Hi,

The QCG project (http://www.qoscosgrid.org/trac/qcg) is implementing GLUE 2 publication and would like to use the following EndpointInterfaceName values:

org.qcg.broker - QCG-Broker web service interface
org.oasis.notification - OASIS WS-Notification standard

and ServiceTypes:

org.qcg.computing - QosCosGrid computing element
org.qcg.notification - QosCosGrid notification broker
org.qcg.broker - QosCosGrid metascheduling service

Can we add them to the relevant lists? The people in CC should be able to provide more information if necessary.

Stephen

-- *********************************************************** * Tomasz Piontek piontek@man.poznan.pl * * Poznan Supercomputing and Networking Center * * tel.(+48 61) 858-21-72 fax.(+48 61) 858-21-51 * ***********************************************************

Florido Paganelli

14 Jan

9:18 a.m.

Hi All,

sorry for my absence. Unfortunately my current tasks are making it more and more difficult for me to follow the working group promptly :(

Adding these values would be easy for me. The problem is that the group didn't decide on a procedure to _approve_ them to make them _official_, mostly because I promised I would propose one but I never did.

Since I lack the time to do it, I'd like to propose a quick-n-dirty way of handling these requests:

1. Create a <enumerationType>-requested.csv (e.g. ServiceType_t-requested.csv) to put online immediately, so that we speed up the addition process regardless of group approval. Enumerations in such files are to be considered unofficial but usable by users. Their meaning is: "A request for addition has come to the group and is pending approval, but they're not approved yet."

2. Each enumeration must always contain a pointer to who requested the addition. This should be private to the group and not public (we might add it to the documents folder, for example).

3. Twice a year the group meets and decides on merges between proposed and official.

What do you guys think about this? I feel very bad that we're not taking decisions about these matters, and it's currently mostly due to my lack of time.

Cheers,
Florido

On 2013-12-23 13:46, Tomasz Piontek wrote:

...

Hello,

Is there any decision concerning adding new EndpointInterfaceNames and ServiceTypes for QCG stack?

Merry X-mas and Happy New Year ...

All the best, Tomek

On 03.12.2013 18:37, stephen.burke@stfc.ac.uk wrote:

...
Hi,

The QCG project (http://www.qoscosgrid.org/trac/qcg) is implementing GLUE 2 publication and would like to use the following EndpointInterfaceName values:

org.qcg.broker - QCG-Broker web service interface
org.oasis.notification - OASIS WS-Notification standard

and ServiceTypes:

org.qcg.computing - QosCosGrid computing element
org.qcg.notification - QosCosGrid notification broker
org.qcg.broker - QosCosGrid metascheduling service

Can we add them to the relevant lists? The people in CC should be able to provide more information if necessary.

Stephen

-- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================
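Editorial sketch of the "requested vs. official" workflow proposed in this message: pending values live in a separate list and are merged into the official list only after the group approves them. The two-column CSV layout (value, description), the function names and the `org.example.service` entry are assumptions made for illustration; the thread does not define a file format.

```python
import csv
import io


def load_enumeration(csv_text):
    """Parse 'value,description' rows into a dict keyed by value (assumed layout)."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip(): row[1].strip() for row in reader if row}


def merge_approved(official, requested, approved):
    """Return (new_official, still_pending) after a periodic approval meeting."""
    new_official = dict(official)
    still_pending = {}
    for value, description in requested.items():
        if value in approved:
            new_official[value] = description
        else:
            still_pending[value] = description
    return new_official, still_pending


# Hypothetical official list with a single placeholder entry.
official = load_enumeration("org.example.service,Placeholder service type\n")

# Contents of a hypothetical ServiceType_t-requested.csv.
requested = load_enumeration(
    "org.qcg.computing,QosCosGrid computing element\n"
    "org.qcg.notification,QosCosGrid notification broker\n"
    "org.qcg.broker,QosCosGrid metascheduling service\n"
)

# Suppose the group approves two of the three requests.
official, pending = merge_approved(
    official, requested, {"org.qcg.computing", "org.qcg.broker"}
)
```

Values left in `pending` stay usable but unofficial until the next meeting, which matches the meaning given to the requested file in the proposal.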

stephen.burke＠stfc.ac.uk

12:45 p.m.

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli said:

Adding these values for me would be easy. The problem is that the group didn't decide a procedure to _approve_ them to make them _official_, mostly because I promised I would propose one but I never did.

This should be a lightweight process - the main point is to record the values which are in use. In general there's no problem with adding new types on request; the only issues to consider are:

1) Namespace - if it uses a DNS-style name, does the requestor reasonably justify relating it to that domain, i.e. do they speak for the relevant project or is it clearly related to a standard service provided by the project?

2) Checking that we don't already have something suitable in the existing list.

3) Sanity checking that the value is naming something that matches the conceptual purpose of the enumeration.

I don't see any reason that can't be done quickly with an email vote. Anyway the worst case is that you end up with an unsuitable value which has to be changed later, and that will generally only cause problems for whoever requested it.

In this particular case I don't see any problem with the QCG-specific names; QCG is a well-defined project so we can assume they know what they're doing, or will sort out their own problems if they don't. org.oasis.notification is more general so it would be useful if someone could do a quick check that it makes sense - personally I know nothing about it.

Stephen
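Editorial sketch of the first two checks listed in this message (namespace ownership and duplicates), which are the mechanical ones; the third check, whether a value fits the conceptual purpose of the enumeration, needs human judgement and is not modelled. The function name and the idea of passing in the namespaces a requester speaks for are assumptions for illustration, not an agreed procedure.

```python
def review_request(value, requester_namespaces, existing_values):
    """Return a list of issues to raise before accepting a new enumeration value."""
    issues = []

    # Check 1: a DNS-style name should fall under a namespace the requester
    # can reasonably claim to speak for.
    if "." in value:
        owned = any(
            value == ns or value.startswith(ns + ".")
            for ns in requester_namespaces
        )
        if not owned:
            issues.append("namespace: requester has not justified this prefix")

    # Check 2: the exact value must not already be in the list. Spotting a
    # differently named but equivalent entry still needs a human reviewer.
    if value in existing_values:
        issues.append("duplicate: value is already in the list")

    return issues


# The QCG-specific name passes; the OASIS name is flagged for a closer look.
qcg_issues = review_request("org.qcg.broker", {"org.qcg"}, set())
oasis_issues = review_request("org.oasis.notification", {"org.qcg"}, set())
```

A non-empty result is only a prompt for discussion on the list, in line with the lightweight email vote suggested above.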

Bartosz Bosak

3 Mar

10:37 a.m.

Dear Stephen, all

Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

Best Regards,
Bartek

2013-12-23 13:46 GMT+01:00 Tomasz Piontek <piontek@man.poznan.pl>:

...

Hello,

Is there any decision concerning adding new EndpointInterfaceNames and ServiceTypes for QCG stack?

Merry X-mas and Happy New Year ...

All the best, Tomek

On 03.12.2013 18:37, stephen.burke@stfc.ac.uk wrote:

Hi,

...
The QCG project (http://www.qoscosgrid.org/trac/qcg) is implementing GLUE 2 publication and would like to use the following EndpointInterfaceName values:

org.qcg.broker - QCG-Broker web service interface
org.oasis.notification - OASIS WS-Notification standard

and ServiceTypes:

org.qcg.computing - QosCosGrid computing element
org.qcg.notification - QosCosGrid notification broker
org.qcg.broker - QosCosGrid metascheduling service

Can we add them to the relevant lists? The people in CC should be able to provide more information if necessary.

Stephen


stephen.burke＠stfc.ac.uk

2:08 p.m.

Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...

Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

I haven't seen anyone disagree with them. None of the names are in use, and the org.qcg names are your own name space anyway. org.oasis.notification is something which could potentially be relevant to other people, but assuming that it's just a vanilla implementation of a standard I don't see that it would be controversial.

Unless anyone has a strong objection now I'd suggest that you go ahead with your implementation.

Stephen

Florido Paganelli

4:04 p.m.

Hi Bartek, Stephen

On 2014-03-03 15:08, stephen.burke@stfc.ac.uk wrote:

...

Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...
Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

I haven't seen anyone disagree with them. None of the names are in use, and the org.qcg names are your own name space anyway.

I agree with Stephen

...

org.oasis.notification is something which could potentially be relevant to other people, but assuming that it's just a vanilla implementation of a standard I don't see that it would be controversial. Unless anyone has a strong objection now I'd suggest that you go ahead with your implementation.

regarding this

...

...
org.oasis.notification - OASIS WS-Notification standard

If you refer to this one

https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn

then you should not request an interface name yourself; the people in OASIS should, or at least you should ask them for their blessing. At least this is how I see it, since there are no written rules.

In short, we can add it, but be aware that OASIS people might request to change it to their taste one day.

I guess I should add these myself to the lists online...

Cheers,
Florido

Maria Alandes Pradillo

5 Mar

8:44 a.m.

Dear all,

On behalf of the DPM team, could you please also consider adding:

- "DPM" to ServiceType
- "org.webdav" and "org.xrootd" to InterfaceName

Thanks a lot,
Maria

...

-----Original Message-----
From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli
Sent: 03 March 2014 17:04
To: glue-wg@ogf.org
Subject: Re: [glue-wg] New Endpoint and Service types

Hi Bartek, Stephen

...
Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...
Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

On 2014-03-03 15:08, stephen.burke@stfc.ac.uk wrote:

I haven't seen anyone disagree with them. None of the names are in use, and the org.qcg names are your own name space anyway.

I agree with Stephen

...
org.oasis.notification is something which could potentially be relevant to other people, but assuming that it's just a vanilla implementation of a standard I don't see that it would be controversial. Unless anyone has a strong objection now I'd suggest that you go ahead with your implementation.

regarding this

...
...
org.oasis.notification - OASIS WS-Notification standard

If you refer to this one

https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn

then you should not request an interface name yourself; the people in OASIS should, or at least you should ask them for their blessing. At least this is how I see it, since there are no written rules.

In short, we can add it but be aware that oasis people might request to change it to their taste one day.

I guess I should add these myself to the lists online...

Cheers,
Florido

_______________________________________________
glue-wg mailing list
glue-wg@ogf.org
https://www.ogf.org/mailman/listinfo/glue-wg

stephen.burke＠stfc.ac.uk

12:22 p.m.

Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...

On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...

- "DPM" to ServiceType

It isn't clear to me that DPM is a distinctive type - as far as I know it only implements standard interfaces like SRM and xroot, so why is it a different type of service to, say, dcache? What would you propose as a definition? DPM as a product name is published elsewhere. I think we probably need a type name that would be common between, say, DPM, dcache and Castor since they provide basically the same functionality.

...

- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

Stephen

Florido Paganelli

3:43 p.m.

Hi Maria, my thoughts inline

On 2014-03-05 13:22, stephen.burke@stfc.ac.uk wrote:

...

Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

Isn't it possible to assign an organization to DPM? I know nothing about it. org.<organizationname>.dpm? I would actually love everything lowercase, but people seem to like case... :(

...

It isn't clear to me that DPM is a distinctive type - as far as I know it only implements standard interfaces like SRM and xroot, so why is it a different type of service to, say, dcache? What would you propose as a definition? DPM as a product name is published elsewhere. I think we probably need a type name that would be common between, say, DPM, dcache and Castor since they provide basically the same functionality.

...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

I agree with Stephen, if there is no organization it doesn't make sense to have 'org.' there. If it's a simple standard webdav interface, it's perfect to have only webdav there.

But I also wonder what is the type of such interfacenames -- it seems to me that these two should be in Endpoint.Technology instead. The name of the interface should be something like org.<organizationname>.webdavenhanced if it's not just plain webdav.

If that is not enough then you should look into capabilities for EMI-ES and craft proper ones, like:

data.transfer.xrootd
data.transfer.webdav

Maria Alandes Pradillo

3:47 p.m.

Hi Florido,

...

...
Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

Isn't it possible to assign an organization to DPM? I know nothing about it.

org.<organizationname>.dpm?

Like for example org.cern.dpm? If this type of syntax is needed, please let me know and I will ask my DPM colleagues.

...

...
...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

I agree with Stephen, if there is no organization it doesn't make sense to have 'org.' there. If it's a simple standard webdav interface, it's perfect to have only webdav there.

But I also wonder what is the type of such interfacenames -- it seems to me that these two should be in Endpoint.Technology instead. The name of the interface should be something like org.<organizationname>.webdavenhanced if it's not just plain webdav

If that is not enough then you should look into capabilities for EMI-ES and craft proper ones, like:

data.transfer.xrootd
data.transfer.webdav

I don't have the answer to these questions since I'm not a Data Management expert. Is the WG normally following up this type of thing with the relevant experts? Or how do you normally proceed to take a decision?

Regards,
Maria

stephen.burke＠stfc.ac.uk

4:03 p.m.

Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...

I don't have the answer to these questions since I'm not a Data Management expert. Is the WG normally following up this type of thing with the relevant experts? Or how do you normally proceed to take a decision?

In general, new types should be proposed by the experts so they can explain what they mean! In the particular case of storage systems where we have several different implementations, at least DPM, dcache and storm publishing GLUE 2, it seems to me that it would be better if the experts from each of those could reach an agreement between themselves, rather than expecting the GLUE WG to arbitrate. At any rate most of the published values will be common between different implementations so it isn't simply a question of accepting proposals from one of them.

Stephen

Florido Paganelli

6:45 p.m.

Hi Maria,

On 2014-03-05 16:47, Maria Alandes Pradillo wrote:

...

Hi Florido,

...
...
Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

Isn't it possible to assign an organization to DPM? I know nothing about it.

org.<organizationname>.dpm?

Like for example org.cern.dpm? If this type of syntax is needed, please let me know and I will ask my DPM colleagues.

If it's developed by CERN, yes, that would be perfect. The organization name was meant as a way to remember who to ask in case of future changes.

...

...
...
...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

I agree with Stephen, if there is no organization it doesn't make sense to have 'org.' there. If it's a simple standard webdav interface, it's perfect to have only webdav there.

But I also wonder what is the type of such interfacenames -- it seems to me that these two should be in Endpoint.Technology instead. The name of the interface should be something like org.<organizationname>.webdavenhanced if it's not just plain webdav

If that is not enough then you should look into capabilities for EMI-ES and craft proper ones, like:

data.transfer.xrootd
data.transfer.webdav

I don't have the answer to these questions since I'm not a Data Management expert. Is the WG normally following up this type of thing with the relevant experts? Or how do you normally proceed to take a decision?

We have no defined procedure as I never had the time to write it down. This is just the way I think things should be, based on my understanding of how GLUE2 should be used by consumers. The WG usually discusses such matters on this mailing list.

Cheers,
Florido

Paul Millar

5:30 p.m.

Hi Stephen, Maria, everyone else,

On 05/03/14 13:22, stephen.burke@stfc.ac.uk wrote:

...

Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

It isn't clear to me that DPM is a distinctive type - as far as I know it only implements standard interfaces like SRM and xroot, so why is it a different type of service to, say, dcache? What would you propose as a definition? DPM as a product name is published elsewhere. I think we probably need a type name that would be common between, say, DPM, dcache and Castor since they provide basically the same functionality.

For comparison, dCache currently publishes a Service.Type of 'org.dcache.storage'.

I think we've already had a discussion on this point, without coming to a conclusion, as I recall.

The problem (IMHO) is that the level of detail depends on who's asking the question.

Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want a more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If all someone wants is to store some data using, say, WebDAV, then they can look for a WebDAV endpoint that they're authorised to use. They wouldn't even look at the StorageService Type.

That isn't to say publishing DPM (or 'org.dcache.storage') is the correct approach, but that saying "they're all storage" also isn't necessarily correct.

So, in the absence of other compelling reasons, I would suggest going for a *more* specific Type: the generic features may be published elsewhere (e.g., as endpoints) and searched for as such.

The only thing I would suggest is that instead of publishing DPM, you publish a reasonable DNS name. A three-letter acronym is perhaps generic enough that it might result in confusion.

...

...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

...

For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not. This is for Endpoint.URL ('http://' or 'https://'), Endpoint.InterfaceName and StorageAccessPoint.Type.

This is largely for historical reasons (dCache supported HTTP before WebDAV) -- not to say this is correct.

Since HTTP is a subset of WebDAV, it would be useful if someone searching for HTTP endpoints could also find any WebDAV endpoint.

Therefore, here's a concrete proposal:

---

A service that supports HTTP or WebDAV protocols MAY publish StorageAccessProtocol objects to represent this.

If a StorageAccessProtocol object is published to represent HTTP access then the Type SHOULD be 'http'.

If a StorageAccessProtocol object is published to represent WebDAV access then the Type SHOULD be 'webdav'.

Since HTTP is a subset of WebDAV, a service that publishes a WebDAV StorageAccessProtocol SHOULD publish a StorageAccessProtocol for HTTP.

When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

---

Cheers,
Paul.
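Editorial sketch of the proposal in this message, expressed as two small helper functions. The function names, the dictionary layout and the host name are assumptions for illustration only; they are not dCache code and not a GLUE 2 rendering.

```python
def http_endpoint(host, encrypted, webdav):
    """Build Endpoint attributes following the proposal's SHOULD rules."""
    scheme = "https" if encrypted else "http"
    endpoint = {
        "URL": f"{scheme}://{host}/",
        # The InterfaceName follows the URL scheme in this proposal.
        "InterfaceName": scheme,
    }
    if webdav:
        # WebDAV support is signalled through a profile, not the InterfaceName.
        endpoint["SupportedProfile"] = ["http://webdav.org/"]
    return endpoint


def access_protocol_types(webdav):
    """StorageAccessProtocol Types: a WebDAV service also publishes 'http'."""
    return ["webdav", "http"] if webdav else ["http"]


# Hypothetical door host name, in the style of the xrootd example above.
door = http_endpoint("webdav-door.example.org", encrypted=True, webdav=True)
plain = http_endpoint("http-door.example.org", encrypted=False, webdav=False)
```

Under this scheme a client searching for InterfaceName 'https' finds both plain HTTPS and WebDAV doors, and has to inspect SupportedProfile to tell them apart.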

stephen.burke＠stfc.ac.uk

6:10 p.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Not entirely - the object names are prefixed with "Storage" anyway, so simply publishing a Type of "storage" would be redundant. Also it seems to me that something like a standalone xrootd server or a "classic SE" as we used to have would reasonably be different types of storage service, even aside from the details of which protocols they support. However, we do have a family of SRM-based SEs which seem to me to represent a common type - indeed I thought that one of the goals of EMI for dcache, DPM and StoRM was precisely to make them interoperable! In the past I would have suggested "SRM" as a Type, but since we now seem to be making moves away from the use of SRM that may not be ideal as a name. From a dcache POV, what do you see as providing commonality with DPM and StoRM? (Beyond all being storage systems.)

...

Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If someone really wants to know the implementation they can look at the EndpointImplementationName or ManagerProductName - although of course it's undesirable to have anything which is implementation-specific. For me, to be a valid type it would have to be the case that a completely different vendor could potentially produce an independent product which could reasonably be described as "a DPM" or "a dcache" - even conceptually, can you see such a thing as being meaningful? If so, how would you define it? You use "dcache-like" above, but what does that mean (in terms of external interfaces)?

...

For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

What protocol name do you recognise in e.g. a getTURL operation to return an xroot TURL? Does it match what DPM and StoRM use? What about webdav?

...

For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not.

Bear in mind that the scheme name in the URL is not the same as the InterfaceName. I don't know a lot about webdav but my impression is that it's far from being identical with http as far as file access goes, so I would expect a different InterfaceName even if the URL is https:// (c.f. SRM vs. httpg://).

...

When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

If it's necessary to make that distinction I think I would prefer to publish both http and webdav endpoints; doing it your way would seem likely to be error-prone.

Stephen

Florido Paganelli

7:23 p.m.

Hi all,

I agree with Stephen on all the comments and questions.

Cheers,
Florido

On 2014-03-05 19:10, stephen.burke@stfc.ac.uk wrote:

...

Paul Millar [mailto:paul.millar@desy.de] said:

...
Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Not entirely - the object names are prefixed with "Storage" anyway, so simply publishing a Type of "storage" would be redundant. Also it seems to me that something like a standalone xrootd server or a "classic SE" as we used to have would reasonably be different types of storage service, even aside from the details of which protocols they support. However, we do have a family of SRM-based SEs which seem to me to represent a common type - indeed I thought that one of the goals of EMI for dcache, DPM and StoRM was precisely to make them interoperable! In the past I would have suggested "SRM" as a Type, but since we now seem to be making moves away from the use of SRM that may not be ideal as a name. From a dcache POV, what do you see as providing commonality with DPM and StoRM? (Beyond all being storage systems.)

...
Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If someone really wants to know the implementation they can look at the EndpointImplementationName or ManagerProductName - although of course it's undesirable to have anything which is implementation-specific. For me, to be a valid type it would have to be the case that a completely different vendor could potentially produce an independent product which could reasonably be described as "a DPM" or "a dcache" - even conceptually, can you see such a thing as being meaningful? If so, how would you define it? You use "dcache-like" above, but what does that mean (in terms of external interfaces)?

...
For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

What protocol name do you recognise in e.g. a getTURL operation to return an xroot TURL? Does it match what DPM and StoRM use? What about webdav?

...
For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not.

Bear in mind that the scheme name in the URL is not the same as the InterfaceName. I don't know a lot about webdav but my impression is that it's far from being identical with http as far as file access goes, so I would expect a different InterfaceName even if the URL is https:// (c.f. SRM vs. httpg://).

...
When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

If it's necessary to make that distinction I think I would prefer to publish both http and webdav endpoints; doing it your way would seem likely to be error-prone.

Stephen

_______________________________________________
glue-wg mailing list
glue-wg@ogf.org
https://www.ogf.org/mailman/listinfo/glue-wg

-- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

Paul Millar

6 Mar

11:34 a.m.

Hi Stephen, On 05/03/14 19:10, stephen.burke@stfc.ac.uk wrote:

...

Paul Millar [mailto:paul.millar@desy.de] said:

...
Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Not entirely - the object names are prefixed with "Storage" anyway, so simply publishing a Type of "storage" would be redundant.

Yup, fair point.

...

Also it seems to me that something like a standalone xrootd server or a "classic SE" as we used to have would reasonably be different types of storage service, even aside from the details of which protocols they support.

Possibly, but then perhaps DPM and dCache also provide sufficiently different storage behaviour to qualify. It's difficult without a benchmark to decide.

...

However, we do have a family of SRM-based SEs which seem to me to represent a common type - indeed I thought that one of the goals of EMI for dcache, DPM and StoRM was precisely to make them interoperable!

I think you continue to have this fallacy that dCache is somehow SRM-based. Yes, one can read data using SRM, but one can easily read the same data from the same dCache instance without using SRM: switch off SRM and dCache still works perfectly well. The interoperability has always been at the protocol level, not the implementation level, so should appear in the Endpoint or AccessProtocol.

...

In the past I would have suggested "SRM" as a Type, but since we now seem to be making moves away from the use of SRM that may not be ideal as a name.

From my pov, 'SRM' was never a good name: SRM is a protocol, not a storage system. dCache, at least, has never been "based on" SRM.

...

From a dcache POV, what do you see as providing commonality with DPM and StoRM? (Beyond all being storage systems.)

One commonality between dCache and DPM is the immutable nature of stored data: once written, data may only be modified by replacing the old data with completely new data. I think StoRM also provides an immutable filesystem, but a StoRM person would need to confirm. However, this immutable nature could change in the (not too distant) future. Other than that, I don't think there's much that's similar: they're rather different implementations, with different design choices.

...

...
Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If someone really wants to know the implementation they can look at the EndpointImplementationName or ManagerProductName - although of course it's undesirable to have anything which is implementation-specific.

Both certainly true: they can look at the Manager.ProductName and that tying behaviour to implementation is undesirable.

...

For me, to be a valid type it would have to be the case that a completely different vendor could potentially produce an independent product which could reasonably be described as "a DPM" or "a dcache" - even conceptually, can you see such a thing as being meaningful? If so, how would you define it? You use "dcache-like" above, but what does that mean (in terms of external interfaces)?

I think the problem with Type is in deciding the use-case for querying it. When would they query Type rather than, say, Manager.ProductName? AFAIK, we don't have concrete examples where this information is useful. In terms of "dCache-like", there are any number of behavioural characteristics that distinguish dCache from DPM; for example, hot-spot detection and mitigation, overload protection, ability to stage file from tape, ... A client may adjust its behaviour if it detects that the storage system is "dCache-like" (or if it isn't dCache-like). As you point out, this could be discovered through Manager.ProductName, so it goes back to the above point: what are the use-cases for querying StorageService.Type?

...

...
For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

What protocol name do you recognise in e.g. a getTURL operation to return an xroot TURL?

Currently it's 'root://'

...

Does it match what DPM and StoRM use?

I couldn't say: you would need to ask DPM and StoRM people.

...

What about webdav?

dCache SRM will return a TURL that starts 'http://' or 'https://'.

...

...
For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not.

Bear in mind that the scheme name in the URL is not the same as the InterfaceName. I don't know a lot about webdav but my impression is that it's far from being identical with http as far as file access goes, so I would expect a different InterfaceName even if the URL is https:// (c.f. SRM vs. httpg://).

I'm pretty sure that, for uploading and downloading data, the HTTP and WebDAV requests *are* identical. WebDAV is about adding the "missing file-system ideas", like the concept of directories.

...

...
When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

If it's necessary to make that distinction I think I would prefer to publish both http and webdav endpoints; doing it your way would seem likely to be error-prone.

Yes, but please bear in mind that there are many extensions that build on top of HTTP and that an endpoint may support many (into double-digits) of them concurrently. Publishing an endpoint for each results in (excessive?) duplication. While publishing multiple endpoints is possible, I was hoping we could come up with something better. Cheers, Paul.

Maria Alandes Pradillo

11:40 a.m.

Dear all,

...

As you point out, this could be discovered through Manager.ProductName, so it goes back to the above point: what are the use-cases for querying StorageService.Type?

People who don't know the schema so well will tend to query for Service.Type. I would say we can publish this in both places. I find it very useful for myself right now to be able to distinguish DPM, dCache and StoRM at Service Type level. Although in the end, most people care about endpoints, so we need to make sure all Storage Systems are publishing the protocols with the same names, etc. Regards, Maria

stephen.burke＠stfc.ac.uk

4:45 p.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

Possibly, but then perhaps DPM and dCache also provide sufficiently different storage behaviour to qualify. It's difficult without a benchmark to decide.

I think this is the only case where we have multiple service implementations which are largely interoperable, so it probably needs to be decided on its own merits. It may be that we just don't have a relevant use-case - I still think the only likely need to make a distinction would be between dcache, dpm, storm, castor and bestman as a group and other things like standalone xrootd and gridftp. Despite your protestations of differences the first group are used more-or-less interchangeably by LCG and EGI, we have had quite significant efforts (with varying degrees of success) to make them interoperable and we have a set of tools (GFAL/lcg-utils) which allow them to be used in a uniform way.

...

Other than that, I don't think there's much that's similar: they're rather different implementations, with different design choices.

But how does that affect the users? More specifically, what do you regard as a *contract* with your users beyond the specifications of the various protocols (and features which are already covered by GLUE attributes), i.e. documented behaviour that users can rely on? If dcache is a Type it needs a definition, which is presumably not just "whatever dcache happens to do right now" ...

...

I think the problem with Type is in deciding the use-case for querying it. When would they query Type rather than, say, Manager.ProductName? AFAIK, we don't have concrete examples where this information is useful.

ServiceType is only ever going to be useful in that sense where there is a need to identify sets of services which are bigger than a single implementation and smaller than the universe of services with a given set of endpoint types, or which are anyway identified as Computing or Storage Services (or any other specialisation). It seems possible that we have no instances at all where that's the case!

...

...
What protocol name do you recognise in e.g. a getTURL operation to return an xroot TURL?

Currently it's 'root://'

That isn't what I meant:

https://sdm.lbl.gov/srm-wg/doc/SRM.v2.2.html#_Toc241633071

typedef struct {
    TAccessPattern  accessPattern,
    TConnectionType connectionType,
    string[]        arrayOfClientNetworks,
    string[]        arrayOfTransferProtocols
} TTransferParameters

"arrayOfTransferProtocols" is a list of names of access protocols, but I don't believe that the SRM spec defines anywhere what those names should be (you'll see that it's only typed as "string[]" despite intrinsically being an enumerated type), hence each SE has its own internal list, which you can get with the GetTransferProtocols method:

https://sdm.lbl.gov/srm-wg/doc/SRM.v2.2.html#_Toc241633127

but that doesn't of course tell you explicitly what the names refer to. I tried a number of times to get some kind of official definition of them, but the best I ever got was a verbal agreement to follow the GLUE definition - which in the past was largely true (apart from the use of "rfio" in DPM which should have been "gsirfio" since it's completely different to the "rfio" known to Castor). However we're now getting support for new protocols, and I have no idea what names the various implementations are using for them.
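To make the naming problem concrete, here is a minimal sketch of how a client could map the names returned by GetTransferProtocols onto GLUE names. The function and table names are hypothetical, and the only alias taken from this discussion is the DPM "rfio" case; any other entries would have to be agreed between the storage implementors.

```python
# Hypothetical per-implementation aliases: SRM transfer-protocol name -> GLUE name.
# Only the DPM "rfio" entry comes from the discussion above.
ALIASES = {
    "DPM": {"rfio": "gsirfio"},
}


def to_glue_names(implementation, srm_names):
    """Map SRM transfer-protocol names to GLUE names, leaving unknown names as-is."""
    table = ALIASES.get(implementation, {})
    return [table.get(name, name) for name in srm_names]
```

A table like this only works if someone maintains it per implementation, which is exactly the coordination that is missing today.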

...

...
Does it match what DPM and StoRM use?

I couldn't say: you would need to ask DPM and StoRM people.

This seems to be a basic problem - we're trying to reach an agreement on standards between groups who don't talk to each other! I can see three main possibilities: a) all the storage implementors talk to each other and reach a common agreement; b) the most prevalent implementation (DPM) decides and everyone else follows; c) we give up and accept that we don't have interoperable standards.

...

Yes, but please bear in mind that there are many extensions that build on top of HTTP and that an endpoint may support many (into double-digits) of them concurrently. Publishing an endpoint for each results in (excessive?) duplication.

Maybe, but you'd only need to do it for cases where there is a practical need. Personally I don't know enough about http extensions to know what things people would commonly want to distinguish and query for.

Stephen

Florido Paganelli

5 Mar

7:19 p.m.

Hi Paul, On 2014-03-05 18:30, Paul Millar wrote:

...

Hi Stephen, Maria, everyone else,

On 05/03/14 13:22, stephen.burke@stfc.ac.uk wrote:

...
Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

It isn't clear to me that DPM is a distinctive type - as far as I know it only implements standard interfaces like SRM and xroot, so why is it a different type of service to, say, dcache? What would you propose as a definition? DPM as a product name is published elsewhere. I think we probably need a type name that would be common between, say, DPM, dcache and Castor since they provide basically the same functionality.

For comparison, dCache currently publishes a Service.Type of 'org.dcache.storage'

I think we've already had a discussion on this point, without coming to a conclusion, as I recall.

The problem (IMHO) is the level of detail depends on who's asking the question.

Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If all someone wants is to store some data using, say, WebDAV, then they can look for a WebDAV endpoint that they're authorised to use. They wouldn't even look at the StorageService Type.

That isn't to say publishing DPM (or 'org.dcache.storage') is the correct approach, but that saying "they're all storage" also isn't necessarily correct.

So, in the absence of other compelling reasons, I would suggest going for a *more* specific Type: the generic features may be published elsewhere (e.g., as endpoints) and searched for as such.

The only thing I would suggest is that instead of publishing DPM, you publish a reasonable DNS name. A three-letter acronym is perhaps generic enough that it might result in confusion.

...
...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

that sounds correct to me

...

...
For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not. This is for Endpoint.URL ('http://' or 'https://'), Endpoint.InterfaceName and StorageAccessPoint.Type. This is largely for historical reasons (dCache supported HTTP before WebDAV) -- not to say this is correct.

Since HTTP is a subset of WebDAV, it would be useful if someone searching for HTTP endpoints could also find any WebDAV endpoint.

Paul, I think this is very bad. http is not webdav. If one searches for http endpoints that should NOT be done with interfacename, but with Capability or Technology. This http there is a very misleading thing -- in short, so how is an information consumer supposed to infer that endpoint supports webdav?? I think we should simply add webdav. If the thing is how to discover that webdav is based on http, this is another kind of question... I think of webdav as a protocol itself. If you think of it as an extension of http, maybe you can play with InterfaceExtension in the Endpoint... But I think one should have a StorageAccessProtocol object just for that as well.

...

Therefore, here's a concrete proposal:

---

A service that supports HTTP or WebDAV protocols MAY publish StorageAccessProtocol objects to represent this. If a StorageAccessProtocol object is published to represent HTTP access then the Type SHOULD be 'http'. If a StorageAccessProtocol object is published to represent WebDAV access then the Type SHOULD be 'webdav'. Since HTTP is a subset of WebDAV, a service that publishes a WebDAV StorageAccessProtocol SHOULD publish a StorageAccessProtocol for HTTP.

I don't like the above. One should tell what the protocol is, not what are its close siblings. Type MUST be webdav. I am against giving such misleading recommendation as the above, they only generate confusion.

...

When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

In such a scenario, what algorithm should the consumer use to discover a webdav endpoint? I dislike this one as well -- for the same reasons as above. Moreover it would be nice to stick a domain name to those -- like org.ieee.webdav. I agree that for old standards like http there is no single organization, and it is hard to tell, so there can always be exceptions; plain http and https can stay. I agree with the SupportedProfile proposal. Cheers, Florido

Paul Millar

6 Mar

10:35 a.m.

Hi Florido, On 05/03/14 20:19, Florido Paganelli wrote:

...

Paul, I think this is very bad. http is not webdav.

True, but all HTTP requests are valid WebDAV requests, so all WebDAV endpoints are valid HTTP endpoints.

...

If one searches for http endpoints that should NOT be done with interfacename, but with Capability or Technology.

I'm not sure I completely agree. Technology doesn't seem appropriate. It's optional and talks more about the mechanical process of providing access -- for example, if dCache provides Windows/SMB access using a particular version of Samba then this could be published as Technology. Capability is better; however, the OGSA definitions are (currently) higher-level functionality; they don't (currently) specify which protocol the endpoint supports. Additional definitions could be added but I'm not sure this is the correct approach.

...

This http there is a very misleading thing -- in short, so how is an information consumer supposed to infer that endpoint supports webdav??

Easy, if they want WebDAV, they search for an endpoint that supports the webdav profile (however that is specified).

...

I think we should simply add webdav.

Consider a single endpoint that supports HTTP, WebDAV, CDMI, HTTP-3rd-party-copy, RFC-3230, CalDAV, CardDAV and GroupDAV. If we follow this approach this endpoint must be published 8 times! There are any number of extensions, based on HTTP. I don't think it scales to publish each one as a new object, duplicating almost all of the information.

...

If the thing is how to discover that webdav is based on http, this is another kind of question... I think of webdav as a protocol itself.

I think you misunderstood: I don't want to publish that webdav is based on http. I want that:

o we publish one StorageEndpoint object.
o someone searching for HTTP endpoints will find that object.
o someone searching for WebDAV endpoints will find that object.
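A minimal sketch of that behaviour, assuming a toy in-memory model rather than the real GLUE 2 rendering. The class, its field names and the example URLs are all hypothetical, and which GLUE attribute would carry the extra protocols (InterfaceExtension, SupportedProfile or something else) is left open here.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Toy stand-in for a published StorageEndpoint object."""
    url: str
    interface_name: str
    # Protocols layered on top of the base interface (hypothetical field).
    extensions: set = field(default_factory=set)

    def supports(self, protocol):
        return protocol == self.interface_name or protocol in self.extensions


def find(endpoints, protocol):
    """Return every endpoint that matches, directly or through an extension."""
    return [e for e in endpoints if e.supports(protocol)]


# One object is published for the WebDAV door, none is duplicated.
door = Endpoint("http://webdav-door.example.org/", "http", {"webdav"})
plain = Endpoint("http://www.example.org/", "http")
```

With these two objects, `find([door, plain], "http")` returns both endpoints and `find([door, plain], "webdav")` returns only `door`. The consumer still has to use a search that looks at both attributes.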

...

If you think it as an extension of http, maybe you can play with InterfaceExtension in the Endpoint... But I think one should have a StorageAccessProtocol object just for that as well.

...
Therefore, here's a concrete proposal:

---

A service that supports HTTP or WebDAV protocols MAY publish StorageAccessProtocol objects to represent this. If a StorageAccessProtocol object is published to represent HTTP access then the Type SHOULD be 'http'. If a StorageAccessProtocol object is published to represent WebDAV access then the Type SHOULD be 'webdav'. Since HTTP is a subset of WebDAV, a service that publishes a WebDAV StorageAccessProtocol SHOULD publish a StorageAccessProtocol for HTTP.

I don't like the above. One should tell what the protocol is, not what are its close siblings. Type MUST be webdav. I am against giving such misleading recommendation as the above, they only generate confusion.

I say 'webdav' SHOULD be published if WebDAV is supported. We can make this MUST if you like, but I think SHOULD is strong enough.

...

...
When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

In such a scenario, what algorithm should the consumer use to discover a webdav endpoint?

SupportedProfile=http://webdav.org/ (or however we choose to label WebDAV support).

...

Moreover it would be nice to stick a domain name to those -- like org.ieee.webdav.

Note that the SupportedProfile value is a URL, so something like:

SupportedProfile=http://www.ietf.org/rfc/rfc4918.txt

For us, the value is a constant. If the URL happens to be a pointer to where the semantics are described then all the better!

...

I agree that for old standards like http there is no single organization, and it is hard to tell, so there can always be exceptions; plain http and https can stay.

For HTTP, we can use the RFC as a reference, for example

SupportedProfile=http://www.ietf.org/rfc/rfc2616.txt

We could do the same for WebDAV:

SupportedProfile=http://www.ietf.org/rfc/rfc4918.txt
SupportedProfile=http://www.ietf.org/rfc/rfc5689.txt
...

Some thoughts...

Paul.

stephen.burke＠stfc.ac.uk

11:22 a.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

SupportedProfile=http://webdav.org/

(or however we choose to label WebDAV support).

I don't see that as an appropriate use of the attribute - that should be a URL you can follow to get information, not a selection parameter. Textual matching against a URL is vulnerable to both format variations, e.g. in the number of / characters, and changes in the URL itself. A URL is something which is basically a free-form string aside from syntax constraints, not a specifier that can be defined precisely. In any case, we already have an attribute to publish additional subsidiary types, namely InterfaceExtension.

...

For us, the value is a constant. If the URL happens to be a pointer to where the semantics are described then all the better!

This pretty much says that you agree that this is an abuse of the attribute!

Stephen

Paul Millar

12:13 p.m.

Hi Stephen, On 06/03/14 12:22, stephen.burke@stfc.ac.uk wrote:

...

Paul Millar [mailto:paul.millar@desy.de] said:

...
SupportedProfile=http://webdav.org/

(or however we choose to label WebDAV support).

I don't see that as an appropriate use of the attribute - that should be a URL you can follow to get information, not a selection parameter.

Later on, I proposed: http://www.ietf.org/rfc/rfc4918.txt This is a precise definition of the additional semantics supported by the endpoint. Would that be better?

...

Textual matching against a URL is vulnerable to both format variations, e.g. in the number of / characters, and changes in the URL itself.

Note that GLUE-2 defines the type of attribute as URI. If you don't like it, we need to change GLUE-2 to not use URI as a type! More practically, as LDAP doesn't support URI as a type then the LDAP binding must define a URI canonicalisation so that a text search is not ambiguous. That the LDAP binding doesn't define this canonicalisation is a problem that should be fixed.
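As a rough illustration of what such a canonicalisation could look like, the sketch below lower-cases the scheme and host and drops a trailing slash before comparing. These rules are an assumption made for the example; neither GLUE-2 nor the LDAP binding defines them.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_uri(uri):
    """Illustrative canonical form: lower-case scheme and host, no trailing slash."""
    parts = urlsplit(uri.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path,
                       parts.query, parts.fragment))


# Two spellings of the same profile URL that a plain text search treats as different.
a = "http://webdav.org/"
b = "HTTP://WebDAV.org"
```

Here `a != b` as raw strings, while `canonical_uri(a) == canonical_uri(b)`. A rule like this removes format variations only; it does nothing for a URL that moves or is mistyped.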

...

A URL is something which is basically a free-form string aside from syntax constraints,

(Isn't that an oxymoron? If there are syntax constraints then it isn't free-form ;-)

...

not a specifier that can be defined precisely.

I disagree. Taking what you say at face value, we should remove all URIs as attribute types in the specification! A URL is a perfectly valid identifier. What might be missing is how to describe a URL in LDAP, but that's a separate issue.

...

In any case, we already have an attribute to publish additional subsidiary types, namely InterfaceExtension.

I'm perfectly happy if we publish WebDAV as an InterfaceExtension rather than a SupportedProfile. GLUE-2 says nothing about how these should be used (it defines SupportedProfile as a profile and InterfaceExtension as an extension!)

...

...
For us, the value is a constant. If the URL happens to be a pointer to where the semantics are described then all the better!

This pretty much says that you agree that this is an abuse of the attribute!

I don't follow: which part of GLUE-2 spec. says this usage is an abuse? Cheers, Paul.

stephen.burke＠stfc.ac.uk

5:14 p.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

Later on, I proposed:

http://www.ietf.org/rfc/rfc4918.txt

This is a precise definition of the additional semantics supported by the endpoint.

Would that be better?

No, worse! My objection is to the use of any URL as a selection parameter: because the text of a URL can vary while resolving to the same document, because a URL can change, and especially in a case like the above because it's hugely error prone (very easy to type say "rcf4981").

...

...
A URL is something which is basically a free-form string aside from syntax constraints,

(Isn't that an oxymoron? If there are syntax constraints then it isn't free-form ;-)

That's nitpicking. What we want here is a *single* defined string which means "webdav", not a wide and undefinable range of strings.

...

I disagree. Taking what you say at face value, we should remove all URIs as attribute types in the specification!

First, my point applies to selection parameters used in queries, which is only a subset of all attributes. A URL as information returned by a query is fine. Secondly, a URI is not necessarily a URL; it's entirely possible to have a defined format for a URI which leads to a unique string in a given case, although I agree that we haven't been very good about doing it in practice.

...

...
In any case, we already have an attribute to publish additional subsidiary types, namely InterfaceExtension.

I'm perfectly happy if we publish WebDAV as an InterfaceExtension rather than a SupportedProfile.

GLUE-2 says nothing about how these should be used (it defines SupportedProfile as a profile and InterfaceExtension as an extension!)

I agree that these are not well-defined. At a fairly late stage in the schema process we had both primary and secondary interfaces defined as a URI which was supposed to be a formatted combination of a name and a version; it was me who argued to split the primary interface into separate Name and Version attributes. Here's part of Sergio's reply to a mail from me on 14/5/08: -----------------

...

And I don't understand why or how Type and Version are combined as a single Interface URI: at the very least this needs *much* more explanation since it's absolutely vital to using the endpoints! And the Type list needs to be enumerated (as it is now). Basically I don't understand why we haven't kept the Type and Version as we have them in 1.3.

this emerged during the revision process with Balazs; one of the most wanted query patterns is: I want to search for an endpoint complying with an interface name and version, plus some extensions as well. E.g.: ARC implements BES 1.0 + a number of extensions; how do you search for them?

Interface = urn:ogf:bes:1.0 and InterfaceExtension = urn:arc:bes++:1.0 and InterfaceExtension = urn:ogf:hpc:1.0

the problem sits mainly in the extensions; by coupling the name and version into one attribute, we can maintain simple properties

----------------

And here's part of a reply from *you* the following day!

----------------

Could the documentation be updated to say that Interface should (must?) be a URN that encodes the type and version of the interface. Leaving it as "Type=URI" is potentially confusing.

--------------

Unfortunately I don't think the documentation ever was updated or clarified, and in practice we've never used this up to now so it never came up. If we do want to treat webdav this way I think we need to define what the URI format is for InterfaceExtension. Basically we want a format with two variable parts: a protocol name which we should treat as an enumerated type like InterfaceName, and a version string which we should treat like InterfaceVersion.

Stephen
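A minimal sketch of a format with those two variable parts, assuming the 'urn:<authority>:<name>:<version>' layout suggested by the URNs in Sergio's example. That layout is not defined anywhere in GLUE 2, and the attribute names in the filter are placeholders taken from the query in that mail, not the names used in the LDAP rendering.

```python
from typing import NamedTuple


class InterfaceId(NamedTuple):
    authority: str
    name: str
    version: str


def parse_interface_urn(urn):
    """Split 'urn:<authority>:<name>:<version>' into its parts (assumed layout)."""
    prefix, authority, name, version = urn.split(":")
    if prefix != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    return InterfaceId(authority, name, version)


def ldap_filter(interface, extensions):
    """Build an LDAP AND-filter; the attribute names are placeholders."""
    terms = [f"(Interface={interface})"]
    terms += [f"(InterfaceExtension={ext})" for ext in extensions]
    return "(&" + "".join(terms) + ")"
```

For example, `parse_interface_urn("urn:arc:bes++:1.0")` yields the name `bes++` and the version `1.0`, which could then be checked against an enumerated list of names in the same way as InterfaceName.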

stephen.burke＠stfc.ac.uk

3:45 p.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

I think you misunderstood: I don't want to publish that webdav is based on http.

I want that:

o we publish one StorageEndpoint object.
o someone searching for HTTP endpoints will find that object.
o someone searching for WebDAV endpoints will find that object.

Part of the question is whether (and how) you expect users to know that webdav is a specialisation of http rather than a standalone protocol. I.e. your scheme, in whatever format it gets published, would mean that someone wanting webdav would do a basic search for an InterfaceName of http plus a subfilter on some other attribute to restrict the search to only http endpoints that support webdav. There's nothing intrinsically wrong with that, but it requires users who want webdav to know that they have to do that, rather than just a standard search for webdav as an InterfaceName. Also if the users are using a service discovery tool, e.g. ginfo, whatever method is used would need to be supported and not overly complex.

Stephen

(More replies after I get some coffee ...)

Florido Paganelli

25 Mar

11 a.m.

Hi Paul, After reading all the discussion with Stephen, I am convinced of one thing: On 2014-03-06 11:35, Paul Millar wrote:

...

Hi Florido,

[...]

Capability is better; however, the OGSA definitions are (currently) higher-level functionality; they don't (currently) specify which protocol the endpoint supports.

Additional definitions could be added but I'm not sure this is the correct approach.

This is the only correct approach, despite what OGSA definitions are. If Capabilities are open enumerations, we're free to set another route, and create specific ones for protocols. Every other solution is cumbersome and complex in my opinion -- and the number of email exchanges shows it. And it also solves the problem you mentioned later:

...

...
This http there is a very misleading thing -- in short, so how is an information consumer supposed to infer that endpoint supports webdav??

Easy, if they want WebDAV, they search for an endpoint that supports the webdav profile (however that is specified).

...
I think we should simply add webdav.

Consider a single endpoint that supports HTTP, WebDAV, CDMI, HTTP-3rd-party-copy, RFC-3230, CalDAV, CardDAV and GroupDAV.

If we follow this approach this endpoint must be published 8 times!

There are any number of extensions, based on HTTP. I don't think it scales to publish each one as a new object, duplicating almost all of the information.

The above means that publishing http (or any other value in the Protocol) only names the minimum common protocol, which tells nothing about the features of the endpoint. Whatever we put there will always be half the truth. So if we think it is wise to keep the minimum common protocol in the protocol attribute, we must have other standard means of exposing features. I don't like your proposed approach using profile, because it mainly applies to web services IMHO. I vote for creating better Capabilities. Cheers, Florido

Paul Millar

4 Apr

12:58 p.m.

Hi Florido, On 25/03/14 12:00, Florido Paganelli wrote:

...

After reading all the discussion with Stephen, I am convinced of one thing: [..]

...
Capability is better; however, the OGSA definitions are (currently) higher-level functionality; they don't (currently) specify which protocol the endpoint supports.

Additional definitions could be added but I'm not sure this is the correct approach.

This is the only correct approach, despite what OGSA definitions are. If Capabilities are open enumerations, we're free to set another route, and create specific ones for protocol. [...] I vote for creating better Capabilities.

Sounds good to me.

May I suggest we have a standard way of mapping an OGF and RFC specification to a Capability? This could be a URL or a URN.

That way, publishing the correct Capabilities becomes straight-forward and no further registration is needed.

Cheers, Paul.

Florido Paganelli

1:46 p.m.

Hi Paul, On 2014-04-04 14:58, Paul Millar wrote:

...

Hi Florido,

On 25/03/14 12:00, Florido Paganelli wrote:

...
After reading all the discussion with Stephen, I am convinced of one thing: [..]

...
Capability is better; however, the OGSA definitions are (currently) higher-level functionality; they don't (currently) specify which protocol the endpoint supports.

Additional definitions could be added but I'm not sure this is the correct approach.

This is the only correct approach, despite what OGSA definitions are. If Capabilities are open enumerations, we're free to set another route, and create specific ones for protocol. [...] I vote for creating better Capabilities.

Sounds good to me.

May I suggest we have a standard way of mapping an OGF and RFC specification to a Capability? This could be a URL or a URN.

We have two ways: EITHER we just insert the URN in the description, OR we want a machine to be able to parse it and hence we add an additional field in the Capability_t.csv. I think it would be reasonable to add a specific field for better machine-readability.

...

That way, publishing the correct Capabilities becomes straight-forward and no further registration is needed.

True indeed. But we still have to agree on names. I am thinking of something like (omitting some fields for readability):

Capability_t         | Description                                         |...| Reference documentation
=====================================================================================================================
data.transfer.http   | capacity of moving a file from one network location |...| https://tools.ietf.org/html/rfc6585
                     | to another using the HTTP protocol                  |...|
---------------------------------------------------------------------------------------------------------------------
data.transfer.https  | capacity of moving a file from one network location |...| https://tools.ietf.org/html/rfc2660
                     | to another using the HTTPS protocol                 |...|
---------------------------------------------------------------------------------------------------------------------
data.transfer.webdav | ....

Also, agreeing on which RFC is crucial in my opinion.

Cheers, Florido

Paul Millar

5:01 p.m.

Hi Florido, On 04/04/14 15:46, Florido Paganelli wrote:

...

On 2014-04-04 14:58, Paul Millar wrote:

...
May I suggest we have a standard way of mapping an OGF and RFC specification to a Capability? This could be a URL or a URN.

We have two ways: EITHER we just insert the URN in the description, OR we want a machine to be able to parse it and hence we add an additional field in the Capability_t.csv.

Let me give some concrete examples:

An endpoint that supports RFC-6585 publishes the capability 'org.ietf.rfc-6585'

An endpoint that supports RFC-2660 publishes the capability 'org.ietf.rfc-2660'

An endpoint that supports SRM v2.2 publishes the capability 'org.ogf.gfd-129'

An endpoint that supports GridFTP publishes the capability 'org.ogf.gfd-20'

An endpoint that supports GridFTP v2 publishes the capability 'org.ogf.gfd-47'

I hope you see how, given any RFC or any GFD, I know exactly how to publish a capability; and, given the capabilities of any endpoint, I know exactly which RFCs and GFDs it supports.

Cheers, Paul.
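The mapping in the examples above is purely mechanical, so it can be sketched in a few lines. The following is only an illustration under assumptions: the function names, the accepted input spellings and the prefix table are hypothetical, and nothing here is an agreed GLUE2 convention.

```python
import re

# Hypothetical table: document series -> reversed-domain prefix,
# taken from the example strings in the message above.
_PREFIXES = {"rfc": "org.ietf", "gfd": "org.ogf"}

# Accepts spellings such as 'RFC-6585', 'rfc2660' or 'GFD.129' (an assumption).
_DOC_RE = re.compile(r"^(rfc|gfd)[-. ]?(\d+)$", re.IGNORECASE)
_CAP_RE = re.compile(r"^(org\.ietf|org\.ogf)\.(rfc|gfd)-(\d+)$")


def to_capability(document: str) -> str:
    """Map a document reference to a string like 'org.ietf.rfc-6585'."""
    match = _DOC_RE.match(document.strip())
    if match is None:
        raise ValueError(f"not an RFC or GFD reference: {document!r}")
    series = match.group(1).lower()
    number = int(match.group(2))  # normalises leading zeros
    return f"{_PREFIXES[series]}.{series}-{number}"


def to_document(capability: str) -> str:
    """Inverse mapping: 'org.ogf.gfd-47' -> 'GFD-47'."""
    match = _CAP_RE.match(capability)
    # Reject mismatched pairs such as 'org.ietf.gfd-47'.
    if match is None or _PREFIXES[match.group(2)] != match.group(1):
        raise ValueError(f"not a document-based capability: {capability!r}")
    return f"{match.group(2).upper()}-{int(match.group(3))}"
```

Neither direction consults an enumeration; the only fixed data is the two-entry prefix table.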

Florido Paganelli

7 Apr

9:28 a.m.

Hi Paul, On 2014-04-04 19:01, Paul Millar wrote:

...

Hi Florido,

On 04/04/14 15:46, Florido Paganelli wrote:

...
On 2014-04-04 14:58, Paul Millar wrote:

...
May I suggest we have a standard way of mapping an OGF and RFC specification to a Capability? This could be a URL or a URN.

We have two ways: EITHER we just insert the URN in the description, OR we want a machine to be able to parse it and hence we add an additional field in the Capability_t.csv.

Let me give some concrete examples:

An endpoint that supports RFC-6585 publishes the capability 'org.ietf.rfc-6585'

An endpoint that supports RFC-2660 publishes the capability 'org.ietf.rfc-2660'.

An endpoint that supports SRM v2.2 publishes the capability 'org.ogf.gfd-129'

An endpoint that supports GridFTP publishes the capability 'org.ogf.gfd-20'

An endpoint that supports GridFTP v2 publishes the capability 'org.ogf.gfd-47'

I hope you see how, given any RFC or any GFD, I know exactly how to publish a capability; and, given the capabilities of any endpoint, I know exactly which RFCs and GFDs it supports.

I can see how, indeed.

These that you listed above make more sense for InterfaceName than for Capabilities. As Stephen pointed out, there is not much additional information in the reversed domain name there. But for me it is nice to have, to track down the origin, so I'd be happier to have InterfaceNames the way you listed them above.

But since each Endpoint can only have one InterfaceName, a service supporting multiple protocols should in principle have as many Endpoints as supported protocols.

We can overcome the above using capabilities, but then one must be more specific on what one can *do* with that protocol: Capabilities, the way I read them as they're described in GFD.147, are ways to discover functionalities, thus the namespace is not about the organization but tells about the functionality.

Hence if one has an Endpoint whose interface supports more than one protocol, one could publish:

data.transfer.rfc-2660
security.authentication.rfc-2660
data.transfer.gfd-47

and so on.

What do you think? What do the others think? It would be nice if this is discussed among storage services developers.

Regards, Florido

Paul Millar

10:51 a.m.

Hi Florido, On 07/04/14 11:28, Florido Paganelli wrote:

...

On 2014-04-04 19:01, Paul Millar wrote:

...
I hope you see how, given any RFC or any GFD, I know exactly how to publish a capability; and, given the capabilities of any endpoint, I know exactly which RFCs and GFDs it supports.

I can see how, indeed.

These that you listed above make more sense for InterfaceName than for Capabilities.

Sure, this is my point: currently, capabilities are too high-level and there is only one InterfaceName.

...

As Stephen pointed out, there is not much additional information in the reversed domain name there.

I don't really care about the reversed-domain in the name (see below).

...

[..] But since each Endpoint can only have one InterfaceName, then a service supporting multiple protocols should in principle have as many Endpoints as the supported protocols.

So we're back to the idea of publishing the same information many, many times. Just to be clear: this is, at best, an ugly work-around.

Here's an example to put the effect in context.

If a StorageService publishes O(10) Endpoints (not unreasonable) and each endpoint supports O(10) different RFCs (a tad high, but not unreasonable), then publishing an Endpoint for each RFC means publishing O(100) *additional* Endpoint objects, each mostly duplicating information already published.
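To make the order-of-magnitude argument concrete, here is a small sketch comparing the two publishing styles. The function name and the `multi_valued` flag are made up for illustration; the figures are the O(10) estimates from the paragraph above, not measurements.

```python
def endpoint_objects(endpoints: int, rfcs_per_endpoint: int, multi_valued: bool) -> int:
    """Number of Endpoint objects a service would have to publish."""
    if multi_valued:
        # One object per endpoint; supported RFCs go in a multi-valued attribute.
        return endpoints
    # One object per (endpoint, RFC) pair, since InterfaceName is single-valued.
    return endpoints * rfcs_per_endpoint


print(endpoint_objects(10, 10, multi_valued=False))  # 100
print(endpoint_objects(10, 10, multi_valued=True))   # 10
```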

...

We can overcome the above using capabilities, but then one must be more specific on what one can *do* with that protocol:

Capabilities, the way I read them as they're described in GFD.147, are ways to discover functionalities, thus the namespace is not about the organization but tells about the functionality.

Two points to consider:

By itself, each RFC document says *very precisely* what functionality is supported. The problem here is that the functionality described by an RFC could cover multiple items in the capability namespace and, when not, then the choice of capability isn't automatic.

There's nothing to prevent publishing the higher-level functionality in addition to the very specific RFC-based capability.

...

Hence if one has an Endpoint whose interface supports more than one protocol, one could publish:

data.transfer.rfc-2660 security.authentication.rfc-2660 data.transfer.gfd-47 and so on.

What do you think? what do the others think?

First off, having two representations for RFC-2660 (as "data.transfer.rfc-2660" and "security.authentication.rfc-2660") is bad.

Against which capability should a client query to find an endpoint that supports RFC-2660 (data.transfer, security.authentication, or both)?

What does it mean if a storage element publishes "data.transfer.rfc-2660" and not "security.authentication.rfc-2660"?

IMHO, there should be a single, canonical way of representing that an endpoint supports RFC-2660 and this should be somehow a single attribute that is published.

Adding "data.transfer" or "security.authentication" as a prefix is problematic as:

1. protocols often cover multiple high-level functionalities (GFD-47 isn't only about data-transfer)

2. the mapping (RFC-2660 --> "security.authentication") isn't automatic, so we're back to maintaining an enumeration.

My view is still that:

o An RFC or GFD describes (completely) a set of functionality.

o A client's desired interaction is very often predicated on an endpoint supporting the functionality described in one or more RFC or GFD documents.

o That a GLUE2 Endpoint supports multiple RFCs/GFDs concurrently is both natural and likely.

o There is considerable advantage to publishing the RFCs/GFDs that a GLUE2 Endpoint supports, such that:

* A single Endpoint may publish support for multiple RFCs/GFDs.

* Each RFC and each GFD has a single, canonical string value that, when published as part of Endpoint, represents support for that RFC or GFD.

* The canonical value for any RFC or GFD is knowable without consulting any enumeration (i.e., it is purely algorithmic).

* Given a canonical value, the corresponding RFC or GFD is knowable without consulting an enumeration.

I don't care so much whether the canonical value is published as Capabilities, InterfaceExtensions, SupportedProfiles or Semantics --- provided precisely one is chosen as the correct way of publishing this information.

Likewise, I don't care so much whether the canonical value for RFC2660 is 'rfc2660', 'RFC-2660', 'org.ietf.rfc-2660', 'Standards.From-IETF.RFC-2660' or 'http://tools.ietf.org/rfc/rfc2660', provided one purely algorithmic translation is chosen.
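A sketch of what the client side could look like if each Endpoint published a set of canonical document identifiers. The endpoint IDs, the dictionary layout and the 'rfc-NNNN' spelling are all assumptions made up for this example; the thread deliberately leaves the attribute and the string format open.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical published data: endpoint ID -> canonical identifiers it supports.
ENDPOINTS: Dict[str, FrozenSet[str]] = {
    "endpoint-a": frozenset({"rfc-6585", "rfc-2660"}),
    "endpoint-b": frozenset({"rfc-6585"}),
    "endpoint-c": frozenset({"gfd-47"}),
}


def find_endpoints(required: Iterable[str],
                   published: Dict[str, FrozenSet[str]] = ENDPOINTS) -> List[str]:
    """Return the IDs of endpoints that support every required document."""
    wanted = frozenset(required)
    # Subset test: one query per document set, no enumeration lookup needed.
    return sorted(eid for eid, docs in published.items() if wanted <= docs)


print(find_endpoints(["rfc-2660"]))              # ['endpoint-a']
print(find_endpoints(["rfc-6585"]))              # ['endpoint-a', 'endpoint-b']
print(find_endpoints(["rfc-6585", "gfd-47"]))    # []
```

With one canonical string per document, the question "which capability do I query against?" does not arise: the client tests membership of that one string.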

...

It would be nice if this is discussed among storage services developers.

There is a meeting coming up, but I'm not sure how much convergence there'll be. I think it is this group's responsibility to either accept or reject my view, as expressed above.

Cheers, Paul.

Florido Paganelli

12:26 p.m.

Hi Paul, Answers inline. On 2014-04-07 12:51, Paul Millar wrote:

...

Hi Florido,

On 07/04/14 11:28, Florido Paganelli wrote:

...
On 2014-04-04 19:01, Paul Millar wrote:

...
I hope you see how, given any RFC or any GFD, I know exactly how to publish a capability; and, given the capabilities of any endpoint, I know exactly which RFCs and GFDs it supports.

I can see how, indeed.

These that you listed above make more sense for InterfaceName than for Capabilities.

Sure, this is my point: currently, capabilities are too high-level and there is only one InterfaceName.

...
As Stephen pointed out, there is not much additional information in the reversed domain name there.

I don't really care about the reversed-domain in the name (see below).

...
[..] But since each Endpoint can only have one InterfaceName, then a service supporting multiple protocols should in principle have as many Endpoints as the supported protocols.

So we're back to the idea of publishing the same information many, many times. Just to be clear: this is, at best, an ugly work-around.

Here's an example to put the effect in context.

If a StorageService publishes O(10) Endpoints (not unreasonable) and each endpoint supports O(10) different RFCs (a tad high, but not unreasonable), then publishing an Endpoint for each RFC means publishing O(100) *additional* Endpoint objects, each mostly duplicating information already published.

...
We can overcome the above using capabilities, but then one must be more specific on what one can *do* with that protocol:

Capabilities, the way I read them as they're described in GFD.147, are ways to discover functionalities, thus the namespace is not about the organization but tells about the functionality.

Two points to consider:

By itself, each RFC document says *very precisely* what functionality is supported. The problem here is that the functionality described by an RFC could cover multiple items in the capability namespace and, when not, then the choice of capability isn't automatic.

There's nothing to prevent publishing the higher-level functionality in addition to the very specific RFC-based capability.

...
Hence if one has an Endpoint whose interface supports more than one protocol, one could publish:

data.transfer.rfc-2660 security.authentication.rfc-2660 data.transfer.gfd-47 and so on.

What do you think? what do the others think?

First off, having two representations for RFC-2660 (as "data.transfer.rfc-2660" and "security.authentication.rfc-2660") is bad.

is it?

...

Against which capability should a client query to find an endpoint that supports RFC-2660 (data.transfer, security.authentication, or both) ?

both -- but I agree it is cumbersome, more below

...

What does it mean if a storage element publishes "data.transfer.rfc-2660" and not "security.authentication.rfc-2660"?

that https is not fully implemented -- however, I just invented those, it does not mean that we shouldn't reason about those. I might agree it makes no sense to have one and not the other. We can just decide that data.transfer.rfc-2660 is enough. Mine was a quickly made up example.

...

IMHO, there should be a single, canonical way of representing that an endpoint supports RFC-2660 and this should be somehow a single attribute that is published.

ok, but let's try to keep consistency with the string format at least

...

Adding "data.transfer" or "security.authentication" as a prefix is problematic as:

1. protocols often cover multiple high-level functionalities (GFD-47 isn't only about data-transfer)

2. the mapping (RFC-2660 --> "security.authentication") isn't automatic, so we're back to maintaining an enumeration.

My view is still that:

o An RFC or GFD describes (completely) a set of functionality.

o A client's desired interaction is very often predicated on an endpoint supporting the functionality described in one or more RFC or GFD documents.

o That a GLUE2 Endpoint supports multiple RFCs/GFDs concurrently is both natural and likely.

o There is considerable advantage to publishing the RFCs/GFDs that a GLUE2 Endpoint supports, such that:

* A single Endpoint may publish support for multiple RFCs/GFDs.

* Each RFC and each GFD has a single, canonical string value that, when published as part of Endpoint, represents support for that RFC or GFD.

* The canonical value for any RFC or GFD is knowable without consulting any enumeration (i.e., it is purely algorithmic).

* Given a canonical value, the corresponding RFC or GFD is knowable without consulting an enumeration.

I don't care so much whether the canonical value is published as Capabilities, InterfaceExtensions, SupportedProfiles or Semantics --- provided precisely one is chosen as the correct way of publishing this information.

But you should care about consistency, that is the key to a winning model.

...

Likewise, I don't care so much whether the canonical value for RFC2660 is 'rfc2660', 'RFC-2660', 'org.ietf.rfc-2660', 'Standards.From-IETF.RFC-2660' or 'http://tools.ietf.org/rfc/rfc2660', provided one purely algorithmic translation is chosen.

I understand all the above comments. However, we are paving the way for GLUE2 to be an unsuccessful schema. The reason is that we keep contradicting what is written in the model, we add hacks and strings the way we like in a non-intuitive way, such that everything can be interpreted in many different ways.

If a capability is a feature then we should enforce that, and not start putting nonsense random strings (org.ogf.rcf-2660 does *not* make sense in that context.)

what about

protocol.support.rfc2660

...

...
It would be nice if this is discussed among storage services developers.

There is a meeting coming up, but I'm not sure how much convergence there'll be.

I wouldn't be surprised. The GLUE2 model as it stands is not even able to bring agreement among those who created it. Why that is, we should maybe ask ourselves (and I was not part of its making).

...

I think it is this group's responsibility to either accept or reject my view, as expressed above.

It's the group's responsibility to bring sanity to this string mess, not to accept or reject someone's opinion. We should rather try to converge to some consistency for the sake of interoperability.

What do you think about the above formulation, that is, generalize a capability string such as

protocol.support.<protocol name>
protocol.supported.<rfc name>
protocol.supported.<document name>

If you have better ideas I'd be happy to follow. I like

protocol.supported.<something>

where something is described in the Description field. No additional field in the CSV files.

Regards, Florido

Paul Millar

3:51 p.m.

Hi Florido, More comments in-line. On 07/04/14 14:26, Florido Paganelli wrote: [publishing as Capabilities]

...

...
...
Hence if one has an Endpoint whose interface supports more than one protocol, one could publish:

data.transfer.rfc-2660 security.authentication.rfc-2660 data.transfer.gfd-47 and so on.

What do you think? what do the others think?

First off, having two representations for RFC-2660 (as "data.transfer.rfc-2660" and "security.authentication.rfc-2660") is bad.

is it?

Well, in my humble opinion :)

...

...
Against which capability should a client query to find an endpoint that supports RFC-2660 (data.transfer, security.authentication, or both) ?

both -- but I agree it is cumbersome, more below

...
What does it mean if a storage element publishes "data.transfer.rfc-2660" and not "security.authentication.rfc-2660"?

that https is not fully implemented -- however, I just invented those, it does not mean that we shouldn't reason about those. I might agree it makes no sense to have one and not the other. We can just decide that data.transfer.rfc-2660 is enough. Mine was a quickly made up example.

OK, lets move on from this example.

...

...
IMHO, there should be a single, canonical way of representing that an endpoint supports RFC-2660 and this should be somehow a single attribute that is published.

ok, but let's try to keep consistency with the string format at least

Sure. At this stage, I'm happy if we can agree that publishing the set of RFCs an Endpoint supports is a reasonable thing to do :-) [..]

...

...
I don't care so much whether the canonical value is published as Capabilities, InterfaceExtensions, SupportedProfiles or Semantics --- provided precisely one is chosen as the correct way of publishing this information.

But you should care about consistency, that is the key to a winning model.

Firstly, consistent with GFD-147, right? Secondly, consistent with existing enumeration values.

In GFD-147, Capability is described explicitly in terms of OGSA architecture classification. OGSA v1.5 spec doesn't include references to RFCs. I would take this to mean we shouldn't publish RFCs under Capability.

The description of InterfaceExtension is almost semantically null: "the identification of an extension to the interface protocol supported by the Endpoint." -- there's nothing about what an extension *is*, but there is a hint that this has something to do with protocols.

SupportedProfile also has a semantically null description (again, what is a profile?), but I can guess what is meant. RFCs could be published as SupportedProfile legitimately, but this probably wasn't the intention of this attribute.

Semantics has the requirement of being human-readable. This isn't a problem for RFCs but, again, probably not what was intended.

To me, publishing RFCs as InterfaceExtensions is the option most consistent with GFD-147.

...

...
Likewise, I don't care so much whether the canonical value for RFC2660 is 'rfc2660', 'RFC-2660', 'org.ietf.rfc-2660', 'Standards.From-IETF.RFC-2660' or 'http://tools.ietf.org/rfc/rfc2660', provided one purely algorithmic translation is chosen.

I understand all the above comments. However, we are paving the way for GLUE2 to be an unsuccessful schema. The reason is that we keep contradicting what is written in the model, we add hacks and strings the way we like in a non-intuitive way, such that everything can be interpreted in many different ways.

At the risk of repeating myself: I'm advocating having a single, canonical representation for supporting an RFC. This is then not open to being "interpreted in many different ways." It should be defined once and all implementations should use it when publishing their support for an RFC. I'm not sure of this "keep contradicting ..." but for now my interests are in how we publish support for an RFC.

...

If a capability is a feature then we should enforce that, and not start putting nonsense random strings (org.ogf.rcf-2660 does *not* make sense in that context.)

I don't understand: what do you mean by "a feature"? A Capability is defined in terms of OGSA, so that's clear.

...

what about

protocol.support.rfc2660

For Capabilities we can't: this string isn't defined in the OGSA v1.5 document.

For InterfaceExtensions it's OK (subject to some minor changes).

However, for InterfaceExtension the value should be a URI. Technically "protocol.support.rfc2660" is a URL, so a URI; however, it would be better if the value was either a URL with a scheme part or a properly scoped URN.
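The difference between a bare dotted string and a value with an explicit scheme is easy to check mechanically. The sketch below uses Python's standard `urllib.parse`; the helper name and the sample 'urn:ietf:rfc:2660' value are illustrative assumptions, not a proposal the group has agreed on.

```python
from urllib.parse import urlparse


def has_scheme(value: str) -> bool:
    """True if the value carries an explicit URI scheme such as 'http' or 'urn'."""
    return bool(urlparse(value).scheme)


candidates = [
    "protocol.support.rfc2660",           # no scheme: at most a relative reference
    "http://tools.ietf.org/rfc/rfc2660",  # URL with a scheme part
    "urn:ietf:rfc:2660",                  # URN-style value (illustrative)
]
for value in candidates:
    print(value, has_scheme(value))
```

A publisher-side validator could reject InterfaceExtension values for which such a check fails, which would keep the attribute to scoped identifiers only.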

...

...
...
It would be nice if this is discussed among storage services developers.

There is a meeting coming up, but I'm not sure how much convergence there'll be.

I wouldn't be surprised. The GLUE2 model as it stands is not even able to bring agreement among those who created it. Why that is, we should maybe ask ourselves (and I was not part of its making).

...
I think it is this group's responsibility to either accept or reject my view, as expressed above.

It's the group's responsibility to bring sanity to this string mess, not to accept or reject someone's opinion. We should rather try to converge to some consistency for the sake of interoperability.

That's what I'm trying to help facilitate, by providing concrete examples and concrete expressions of what I'm trying to achieve.

...

What do you think about the above formulation, that is, generalize a capability string such as

protocol.support.<protocol name> protocol.supported.<rfc name> protocol.supported.<document name>

If you have better ideas I'd be happy to follow.

I think the fundamental problem with Capability is that it's explicitly tied to OGSA. [..]

...

protocol.supported.<something> where something is described in the Description field. No additional field in the CSV files.

Sure. Any algorithmic mechanism should also be easy to describe in the CSV file. We should keep this in mind but, at this stage, it shouldn't be such a problem to fulfil that requirement. HTH, Paul.

stephen.burke＠stfc.ac.uk

4:49 p.m.

Paul Millar [mailto:paul.millar@desy.de] said:

...

At this stage, I'm happy if we can agree that publishing the set of RFCs an Endpoint supports is a reasonable thing to do :-)

I'm not strongly opposed to this as a concept, but I would make a couple of points. Firstly, many grid protocols don't have RFCs or GFDs so this can't be a universal mechanism. Secondly, as far as I'm aware we so far have only one case, i.e. webdav/http, where there is a real issue, so we should beware of ending up with something which would complicate publishing and querying for the vast majority of cases which work perfectly well already.

...

The description of InterfaceExtension is almost semantically null: "the identification of an extension to the interface protocol supported by the Endpoint." -- there's nothing about what an extension *is*, but there is a hint that this has something to do with protocols.

I already explained the history of this - we have never defined the usage of InterfaceExtension up to now because it was never needed, but that's no reason not to do it now if we do need it.

...

SupportedProfile also has a semantically null description (again, what is a profile?), but I can guess what is meant. RFCs could be published as SupportedProfile legitimately, but this probably wasn't the intention of this attribute.

Again, I don't think we've ever had a use which required us to define it. I would conceptually see this as going in the opposite direction, in the sense that InterfaceExtensions would be something additional to the basic Interface, whereas a profile would be a restriction or specialisation.

Stephen

Paul Millar

5:14 p.m.

Hi Stephen, On 07/04/14 18:49, stephen.burke@stfc.ac.uk wrote:

...

Paul Millar [mailto:paul.millar@desy.de] said:

...
At this stage, I'm happy if we can agree that publishing the set of RFCs an Endpoint supports is a reasonable thing to do :-)

I'm not strongly opposed to this as a concept, but I would make a couple of points. Firstly, many grid protocols don't have RFCs or GFDs so this can't be a universal mechanism.

Agreed --- although I would hope the number of endpoints using non-standard protocols decreases over time.

...

Secondly, as far as I'm aware we so far have only one case, i.e. webdav/http, where there is a real issue, so we should beware of ending up with something which would complicate publishing and querying for the vast majority of cases which work perfectly well already.

This is in the back of my mind, too. Publishing RFCs seems a reasonable solution for HTTP and WebDAV (especially as WebDAV itself isn't really a single protocol), but does this make publishing other things harder? I don't think so.

...

...
The description of InterfaceExtension is almost semantically null: "the identification of an extension to the interface protocol supported by the Endpoint." -- there's nothing about what an extension *is*, but there is a hint that this has something to do with protocols.

I already explained the history of this - we have never defined the usage of InterfaceExtension up to now because it was never needed, but that's no reason not to do it now if we do need it.

OK.

...

...
SupportedProfile also has a semantically null description (again, what is a profile?), but I can guess what is meant. RFCs could be published as SupportedProfile legitimately, but this probably wasn't the intention of this attribute.

Again, I don't think we've ever had a use which required us to define it. I would conceptually see this as going in the opposite direction, in the sense that InterfaceExtensions would be something additional to the basic Interface, whereas a profile would be a restriction or specialisation.

Those were pretty much my thoughts too: that SupportedProfile describes additional constraints not required by the underlying protocol, whereas InterfaceExtension describes additional functionality not described in the primary protocol.

So, in summary, publishing RFCs in InterfaceExtension would be OK, bearing in mind your other concerns.

Cheers, Paul.

Florido Paganelli

5:28 p.m.

Hi Paul, Thanks for answering. I'll cut away parts that I think we discussed enough for the sake of readability. On 2014-04-07 17:51, Paul Millar wrote:

...

[...]

...
But you should care about consistency, that is the key to a winning model.

Firstly, consistent with GFD-147, right? Secondly, consistent with existing enumeration values.

In GFD-147, Capability is described explicitly in terms of OGSA architecture classification. OGSA v1.5 spec doesn't include references to RFCs. I would take this to mean we shouldn't publish RFCs under Capability.

Well, it says "list initially drafted from".

For me consistency with this statement means that the initial state must be preserved (soundness) and that all the values included in that GFD.80 document should be taken into account (completeness).

I didn't read all of GFD.80 and OMII-DJRA2.1, but from what I can see there is no trace of the strings in GFD.147 as such. So I assume those who wrote GFD.147 used all the Capabilities they could get from these two docs and shaped them like what we have now.

I don't see, however:

1) Why we cannot add new ones from outside these two documents in the GLUE2 spec

2) How one would benefit from a DNS-like structure that does not include functionalities and features (terminology used in GFD.80)

...

The description of InterfaceExtension is almost semantically null: "the identification of an extension to the interface protocol supported by the Endpoint." -- there's nothing about what an extension *is*, but there is a hint that this has something to do with protocols.

Agreed, very unfortunate: it was actually a matter of discussion in another email with Stephen.

...

SupportedProfile also has a semantically null description (again, what is a profile?), but I can guess what is meant. RFCs could be published as SupportedProfile legitimately, but this probably wasn't the intention of this attribute.

Semantics has the requirement of being human-readable. This isn't a problem for RFCs but, again, probably not what was intended.

To me, publishing RFCs as InterfaceExtensions is the option most consistent with GFD-147.

We can agree on consistency but not on longevity. The main issue here is that InterfaceExtension is NOT an Open Enumeration, which means it is not supposed to be kept in a defined list of items or a registry like Open Enumerations. I agree with Stephen here that this would expose us to a zoo of weird URLs there -- and you don't need an inexperienced person like me to tell you that this is a very common case with grid services. Do you know that it took a meeting of four hours to decide which ServiceType_t strings are the officially accepted ones?

...

...
...
Likewise, I don't care so much whether the canonical value for RFC2660 is 'rfc2660', 'RFC-2660', 'org.ietf.rfc-2660', 'Standards.From-IETF.RFC-2660' or 'http://tools.ietf.org/rfc/rfc2660', provided one purely algorithmic translation is chosen.
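As an illustration of what "one purely algorithmic translation" could look like, here is a minimal Python sketch. It assumes, purely for the example, that the canonical form is the lowercase `rfcNNNN` spelling; the thread does not settle on any of the candidates, and the function name `canonical_rfc` is hypothetical.

```python
import re

# Assumption for illustration only: the canonical spelling is "rfc<number>".
# The thread lists several candidates and does not pick one.
CANONICAL_TEMPLATE = "rfc{number}"

# Matches "rfc2660", "RFC-2660", "rfc_2660", "RFC 2660" inside a longer string.
_RFC_PATTERN = re.compile(r"rfc[-_ ]?(\d+)", re.IGNORECASE)


def canonical_rfc(value: str) -> str:
    """Translate any of the spellings mentioned above into one canonical string."""
    match = _RFC_PATTERN.search(value)
    if match is None:
        raise ValueError(f"no RFC number found in {value!r}")
    return CANONICAL_TEMPLATE.format(number=int(match.group(1)))


if __name__ == "__main__":
    for spelling in (
        "rfc2660",
        "RFC-2660",
        "org.ietf.rfc-2660",
        "Standards.From-IETF.RFC-2660",
        "http://tools.ietf.org/rfc/rfc2660",
    ):
        print(spelling, "->", canonical_rfc(spelling))
```

Whatever form is chosen, the point is that publishers and consumers apply the same deterministic rule, so no lookup table is needed.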

I understand all the above comments. However, we are paving the way for GLUE2 to be an unsuccessful schema. The reason is that we keep contradicting what is written in the model, we add hacks and strings the way we like in a non-intuitive way, such that everything can be interpreted in many different ways.

At the risk of repeating myself: I'm advocating having a single, canonical representation for supporting an RFC. This is then not open to being "interpreted in many different ways." It should be defined once and all implementations should use it when publishing their support for an RFC.

I'm not sure of this "keep contradicting ..." but for now my interests are in how we publish support for an RFC.

The above is just that I foresee that attributes used in discovery of Computing Services and Storage Services are becoming completely different, that for me is a contradiction, because it shows we have no unifying model.

...

...
If a capability is a feature then we should enforce that, and not start putting nonsense random strings (org.ogf.rfc-2660 does *not* make sense in that context.)

I don't understand: what do you mean by "a feature"? A Capability is defined in terms of OGSA, so that's clear.

I am using feature and functionality as they are used in that GFD.80 document. To me, the made-up strings in B.5 try to follow what a Capability is in that document, which describes features and functionalities of services. So I am completely with you here.

...

...
what about

protocol.support.rfc2660

For Capabilities we can't: this string isn't defined in the OGSA v1.5 document.

I don't see where we decided that we cannot add our own. Being consistent with something doesn't mean being limited to it, unless my understanding of English is off.

...

For InterfaceExtensions it's OK (subject to some minor changes)

However, for InterfaceExtension the value should be a URI. Technically "protocol.support.rfc2660" is a URL, so a URI; however, it would be better if the value was either a URL with a scheme part or a properly scoped URN.

Exactly. That's why I was actually suggesting it for InterfaceName.

...

...
...
...
It would be nice if this is discussed among storage services developers.

There is a meeting coming up, but I'm not sure how much convergence there'll be.

I wouldn't be surprised. The GLUE2 model as it stands is not even able to bring agreement among those who created it. Maybe we should ask ourselves why that is (and I was not part of its making).

...
I think it is this group's responsibility to either accept or reject my view, as expressed above.

It's the group's responsibility to bring sanity to this string mess, not to accept or reject someone's opinion. We should rather try to converge to some consistency for the sake of interoperability.

That's what I'm trying to help facilitate, by providing concrete examples and concrete expressions of what I'm trying to achieve.

I didn't say the opposite, I just think it needs more discussion -- unless we decide that the way one discovers and monitors storage is substantially different from computing services, which for me is a bit sad.

...

...
What do you think about the above formulation, that is, generalizing a capability string such as

protocol.support.<protocol name>
protocol.supported.<rfc name>
protocol.supported.<document name>

If you have better ideas I'd be happy to follow.
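A small Python sketch of how the generalized capability strings suggested above might be generated. The `protocol.support` prefix is taken from the proposal; the rule that the final component is lowercase alphanumeric (with hyphens allowed) is an assumption made for the example, and `protocol_capability` is a hypothetical helper name.

```python
import re

# Assumption: the last component is lowercase alphanumeric, hyphens allowed.
# The thread does not define a character set for these strings.
_COMPONENT = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def protocol_capability(name: str, prefix: str = "protocol.support") -> str:
    """Build a capability string of the form <prefix>.<protocol, RFC or document name>."""
    component = name.strip().lower()
    if not _COMPONENT.match(component):
        raise ValueError(f"{name!r} cannot be used as a capability component")
    return f"{prefix}.{component}"


if __name__ == "__main__":
    print(protocol_capability("RFC2660"))  # protocol.support.rfc2660
    print(protocol_capability("webdav"))   # protocol.support.webdav
```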

I think the fundamental problem with Capability is that it's explicitly tied to OGSA.

I don't see how and where. It is "Initially drafted from". Maybe I am missing part of the document?

...

[..]

I think we might end up discussing this forever. I trust your opinion as a storage service developer and the fact that you can bring up this topic as a member of the group to a more dedicated storage audience. Mine were just thoughts of the way I'd like to see it. If the community wants to go in some other way, well fine for me. I guess it's wise to wait for some decision by the Storage developers and then take decisions within the group on what to publish. Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

stephen.burke＠stfc.ac.uk

8 Apr 8 Apr

4:17 p.m.

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:

...

We can agree on consistency but not on longevity. The main issue here is that InterfaceExtension is NOT an Open Enumeration, which means it is not supposed to be kept in a defined list of items or a registry as Open Enumerations are.

InterfaceExtension is supposed to be a formatted combination of a Name and a Version. That's because the schema doesn't support tables, so if you made InterfaceName and Version multivalued you would have no way to know which Version applied to which Name. I would expect that the Name component of InterfaceExtension would be an open enumeration - it could either be taken directly from the existing list or maintained separately.

...

The above is just that I foresee that attributes used in discovery of Computing Services and Storage Services are becoming completely different, that for me is a contradiction, because it shows we have no unifying model.

I would expect that the standard way to discover Endpoints would continue to be selection on the InterfaceName as it is now. What we're trying to decide here is how to deal with the specific case where one interface (webdav) is an extension of another (http). Even for storage that's the exception, most data access protocols stand alone, and we already have agreed InterfaceNames for them which have been in use for many years. For example, the agreed name for xroot access is "xroot" - as far as I know there is no RFC or GFD for that anyway, it's a HEP-specific protocol. Stephen

Maria Alandes Pradillo

6 Mar 6 Mar

7:39 a.m.

Dear Paul, Thanks very much for your answer. I agree with everything you have said. I will pass this information to my DPM colleagues and get back to you. Regards, Maria

...

-----Original Message----- From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar Sent: 05 March 2014 18:30 To: glue-wg@ogf.org Subject: Re: [glue-wg] New Endpoint and Service types

Hi Stephen, Maria, everyone else,

On 05/03/14 13:22, stephen.burke@stfc.ac.uk wrote:

...
Maria Alandes Pradillo [mailto:Maria.Alandes.Pradillo@cern.ch] said:

...
On behalf of the DPM team, could you please also consider adding:

I think these do need some more discussion ...

...
- "DPM" to ServiceType

It isn't clear to me that DPM is a distinctive type - as far as I know it only implements standard interfaces like SRM and xroot, so why is it a different type of service to, say, dcache? What would you propose as a definition? DPM as a product name is published elsewhere. I think we probably need a type name that would be common between, say, DPM, dcache and Castor since they provide basically the same functionality.

For comparison, dCache currently publishes a Service.Type of 'org.dcache.storage'

I think we've already had a discussion on this point, without coming to a conclusion, as I recall.

The problem (IMHO) is the level of detail depends on who's asking the question.

Stephen, to you DPM, dCache and Castor provide the same functionality, so you would be happy with instances of all three published as Service.Type of 'storage' (or similar).

Somebody who needs some unique characteristic provided by dCache (or DPM, or ...) might want more detailed Type, specifically that the service provides the dCache-like facilities (or DPM-like or ...).

If all someone wants is to store some data using, say, WebDAV, then they can look for a WebDAV endpoint that they're authorised to use. They wouldn't even look at the StorageService Type.

That isn't to say publishing DPM (or 'org.dcache.storage') is the correct approach, but that saying "they're all storage" also isn't necessarily correct.

So, in the absence of other compelling reasons, I would suggest going for a *more* specific Type: the generic features may be published elsewhere (e.g., as endpoints) and searched for as such.

The only thing I would suggest is that instead of publishing DPM, you publish a reasonable DNS name. A three-letter acronym is perhaps generic enough that it might result in confusion.

...
...
- "org.webdav" and "org.xrootd" to InterfaceName

xroot has already been discussed extensively and we made a decision - I think the decision was to use "xroot" for the protocol name but we should check.

For the xrootd protocol, dCache currently publishes

Endpoint.URL: xroot://xrootd-door.example.org/
EndpointInterface.Name: xroot

and

StorageAccessProtocol.Type: xrootd

...
For webdav I think the "org." prefix isn't adding much here - probably we should just use "webdav" as it's a well-known protocol defined in an RFC so there's no issue of a name clash, but there may be other views. Anyway there are likely to be other interested parties - dcache at least - who should express a view.

For WebDAV, dCache is currently publishing as either 'http' or 'https', depending on whether SSL/TLS tunnelling is enabled or not. This is for Endpoint.URL ('http://' or 'https://'), Endpoint.InterfaceName and StorageAccessPoint.Type. This is largely for historical reasons (dCache supported HTTP before WebDAV) -- not to say this is correct.

Since HTTP is a subset of WebDAV, it would be useful if someone searching for HTTP endpoints could also find any WebDAV endpoint.

Therefore, here's a concrete proposal:

---

A service that supports HTTP or WebDAV protocols MAY publish StorageAccessProtocol objects to represent this. If a StorageAccessProtocol object is published to represent HTTP access then the Type SHOULD be 'http'. If a StorageAccessProtocol object is published to represent WebDAV access then the Type SHOULD be 'webdav'. Since HTTP is a subset of WebDAV, a service that publishes a WebDAV StorageAccessProtocol SHOULD publish a StorageAccessProtocol for HTTP.

When publishing an Endpoint object that describes an HTTP or a WebDAV endpoint with unencrypted access then the URL SHOULD start 'http://' and the InterfaceName SHOULD be 'http'. If the endpoint is encrypted then the URL SHOULD start 'https://' and the InterfaceName SHOULD be 'https'. If the endpoint supports WebDAV then a SupportedProfile of 'http://webdav.org/' SHOULD be published.

---
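The proposal above can be sketched in Python as a function that derives the published values from two facts about a door: whether it is encrypted and whether it supports WebDAV. This is only an illustration of the draft wording. The function name, the dictionary layout and the example host are assumptions, and the sketch always emits the StorageAccessProtocol types even though the proposal says a service MAY publish them.

```python
from typing import Dict, List

WEBDAV_PROFILE = "http://webdav.org/"  # SupportedProfile value from the proposal


def describe_http_service(host: str, encrypted: bool, webdav: bool) -> Dict[str, object]:
    """Return the attribute values the draft proposal would publish.

    Hypothetical layout: an "Endpoint" dict plus a list of
    StorageAccessProtocol Type values.
    """
    scheme = "https" if encrypted else "http"
    endpoint = {
        "URL": f"{scheme}://{host}/",
        "InterfaceName": scheme,
        "SupportedProfile": [WEBDAV_PROFILE] if webdav else [],
    }
    # A WebDAV service SHOULD also publish an HTTP StorageAccessProtocol,
    # so 'http' is always present and 'webdav' is added on top of it.
    protocol_types: List[str] = ["http"]
    if webdav:
        protocol_types.append("webdav")
    return {"Endpoint": endpoint, "StorageAccessProtocolTypes": protocol_types}


if __name__ == "__main__":
    print(describe_http_service("se.example.org", encrypted=True, webdav=True))
```

With this shape, a client searching on InterfaceName 'http' or 'https' finds WebDAV doors as well, and can narrow the result by checking SupportedProfile.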

Cheers,

Paul. _______________________________________________ glue-wg mailing list glue-wg@ogf.org https://www.ogf.org/mailman/listinfo/glue-wg

Paul Millar

11:36 a.m.

Hi Maria, On 06/03/14 08:39, Maria Alandes Pradillo wrote:

...

Thanks very much for your answer. I agree with everything you have said. I will pass this information to my DPM colleagues and get back to you.

My proposal was something that people on the list could hack away at. It wasn't meant as a final version. Sorry that I didn't make that clear! As we haven't reached consensus yet, I suggest you hold off until we're agreed what is the correct approach. Cheers, Paul.

Maria Alandes Pradillo

11:38 a.m.

Dear Paul, Sure, no problem. I have informed Oliver about this discussion and he will get in touch with you. Regards, Maria

...

-----Original Message----- From: Paul Millar [mailto:paul.millar@desy.de] Sent: 06 March 2014 12:37 To: Maria Alandes Pradillo; glue-wg@ogf.org Subject: Re: [glue-wg] New Endpoint and Service types

Hi Maria,

On 06/03/14 08:39, Maria Alandes Pradillo wrote:

...
Thanks very much for your answer. I agree with everything you have said. I will pass this information to my DPM colleagues and get back to you.

My proposal was something that people on the list could hack away at. It wasn't meant as a final version. Sorry that I didn't make that clear!

As we haven't reached consensus yet, I suggest you hold off until we're agreed what is the correct approach.

Cheers,

Paul.

Florido Paganelli

4 Apr 4 Apr

10:04 a.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Maria, all, I am coming back to this discussion because during the last GLUE2 group meeting we agreed on some proposals. Could you please pass these back to the developers? Answers inline. On 2014-03-05 09:44, Maria Alandes Pradillo wrote:

...

Dear all,

On behalf of the DPM team, could you please also consider adding:

- "DPM" to ServiceType

The group would like to have an organization name. It has been decided that if there is no organization name, then one can fall back to the group's reserved organization name, which can be used for orphan projects.

Therefore we suggest:

org.ogf.glue.dpm

What do people think about this?
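The fallback rule described here can be sketched as follows. Only the `org.ogf.glue` fallback namespace comes from the message above. The helper names and the idea of deriving the namespace by reversing an organization's DNS domain are assumptions for illustration.

```python
from typing import Optional

FALLBACK_NAMESPACE = "org.ogf.glue"  # group-reserved name for orphan projects


def reverse_dns(domain: str) -> str:
    """Turn 'example.org' into 'org.example'."""
    return ".".join(reversed(domain.lower().strip(".").split(".")))


def service_type_name(project: str, organization_domain: Optional[str] = None) -> str:
    """Build a ServiceType string, using the fallback namespace when no organization is known."""
    if organization_domain:
        namespace = reverse_dns(organization_domain)
    else:
        namespace = FALLBACK_NAMESPACE
    return f"{namespace}.{project.lower()}"


if __name__ == "__main__":
    print(service_type_name("DPM"))                 # org.ogf.glue.dpm
    print(service_type_name("DPM", "example.org"))  # org.example.dpm
```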

...

- "org.webdav" and "org.xrootd" to InterfaceName

This I completely fail to understand.

webdav and xrootd are protocol names. Why would they be InterfaceNames? An interface name is an "identification of the Interface" as GFD.147 says. "org." is not an organization.

Now, we have dCache publishing xroot as interface name, which I think is as bad as the above.

This needs severe sanitization. It would be better to have some organization name there. Since it seems people like to confuse concepts like protocol and InterfaceName, I might suggest something to meet this confusion halfway while trying to clean it up.

My suggestion for these two InterfaceNames would be:

org.ogf.glue.dpm.webdav
org.ogf.glue.dpm.xrootd

What about these? I feel they make more sense.

How they can be used in discovery I already said in several emails. InterfaceName should NOT be used to identify the protocol or the capabilities. For that there are existing attributes.

Moreover, I'd like to have some description for each of these values. Once we decide on these we can go for further approval. Cheers, Florido

...

Thanks a lot, Maria

...
-----Original Message----- From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli Sent: 03 March 2014 17:04 To: glue-wg@ogf.org Subject: Re: [glue-wg] New Endpoint and Service types

Hi Bartosz, Stephen

...
Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...
Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

On 2014-03-03 15:08, stephen.burke@stfc.ac.uk wrote:

I haven't seen anyone disagree with them. None of the names are in use, and the org.qcg names are your own name space anyway.

I agree with Stephen

...
org.oasis.notification is something which could potentially be relevant to other people, but assuming that it's just a vanilla implementation of a standard I don't see that it would be controversial. Unless anyone has a strong objection now I'd suggest that you go ahead with your implementation.

regarding this

...
...
org.oasis.notification - OASIS WS-Notification standard

If you refer to this one

https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn

then you should not request an InterfaceName yourself; the people at OASIS should, or at least you should ask them for their blessing. At least this is how I see it, since there are no written rules.

In short, we can add it but be aware that oasis people might request to change it to their taste one day.

I guess I should add these myself to the lists online...

Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ================================================== _______________________________________________ glue-wg mailing list glue-wg@ogf.org https://www.ogf.org/mailman/listinfo/glue-wg


Maarten Litmaath

10:27 a.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi all,

...

...
On behalf of the DPM team, could you please also consider adding:

- "DPM" to ServiceType

the group would like to have an organization name. It has been decided that if there is no organization name, then one can fall back to the group's reserved organization name, which can be used for orphan projects.

therefore we suggest:

org.ogf.glue.dpm

What do people think about this?

Stephen already pointed out that a "DPM" _ServiceType_ does not make sense, unless someone else can also implement their own version of "DPM"! DPM is a _product_ that implements various services (data storage, management and access), each accessible through one or more protocols (SRM, GridFTP, ...).

Florido Paganelli

10:55 a.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Maarten, Answers inline On 2014-04-04 12:27, Maarten Litmaath wrote:

...

Hi all,

...
...
On behalf of the DPM team, could you please also consider adding:

- "DPM" to ServiceType

the group would like to have an organization name. It has been decided that if there is no organization name, then one can fall back to the group's reserved organization name, which can be used for orphan projects.

therefore we suggest:

org.ogf.glue.dpm

What do people think about this?

Stephen already pointed out that a "DPM" _ServiceType_ does not make sense, unless someone else can also implement their own version of "DPM"!

DPM is a _product_ that implements various services (data storage, management and access), each accessible through one or more protocols (SRM, GridFTP, ...).

Thanks for pointing this out. I have now read Stephen's comment on that and I actually disagree. I might end up being pedantic, but I think that identifying a service IS what a GLUE2 ServiceType_t is about. Also ARC A-REX is a product that offers job submission interfaces supporting various protocols, and information system interfaces supporting various standards. Nevertheless it is a service as a whole.

Maybe words are used in a bad way, but let's stick to the definition in GFD.147:

An abstracted, logical view of actual software components that participate in the creation of an entity providing one or more functionalities useful in a Grid Environment. [...]

If we go to the StorageService definition:

An abstracted, logical view of actual software components that participate in the creation of a storage capability in Grid Environment [...]

If we look at the list of ServiceType_t in GFD.147, chapter B.31, you'll see that the third-level names there are all product names.

So, if Stephen thinks that we should see dCache and DPM as a single Storage ServiceType_t, I can now say that I don't think we should.

If the capabilities are the same, then discovery should be done on capabilities and NOT ServiceType_t IMHO. But this I have said in many other emails.

I think the real problem here is that I didn't figure out what this DPM thing is. Nobody sent a description. Nobody sent a link to relevant documentation. If you know about this, please help me understand this better. Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

Maarten Litmaath

11:24 a.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Florido, all,

...

If we look at the list of ServiceType_t in GFD.147, chapter B.31, you'll see that the third-level names there are all product names.

True. It appears we have a precedent...

...

So, if Stephen thinks that we should see dCache and DPM as a single Storage ServiceType_t, I can now say that I don't think we should.

If the capabilities are the same, then discovery should be done on capabilities and NOT ServiceType_t IMHO. But this I said in many other various emails.

I suspect using capabilities would be better indeed, but there exists plenty of legacy usage that selects on ServiceType. For example, an SE supporting the SRM protocol should publish a record with GlueServiceType=SRM, while _GlueSEImplementationName_ may be "DPM".

...

I think the real problem here is that I didn't figure out what this DPM thing is. Nobody sent a description. Nobody sent a link to relevant documentation. If you know about this, please help me understand this better.

http://www.eu-emi.eu/releases/emi-3-montebianco/products/-/asset_publisher/5...

Florido Paganelli

11:48 a.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Maarten, Thanks for bringing this up again. On 2014-04-04 13:24, Maarten Litmaath wrote:

...

Hi Florido, all,

...
If we look at the list of ServiceType_t in GFD.147, chapter B.31, you'll see that the third-level names there are all product names.

True. It appears we have a precedent...

...
So, if Stephen thinks that we should see dCache and DPM as a single Storage ServiceType_t, I can now say that I don't think we should.

If the capabilities are the same, then discovery should be done on capabilities and NOT ServiceType_t IMHO. But this I said in many other various emails.

I suspect using capabilities would be better indeed,

During the EMI project we suggested an alternative solution for discovery using capabilities, which is now implemented in the EMI-ES interface. If you look at the Capability_t-draft.csv file I uploaded a long time ago, not yet approved by the group, you'll see examples of how we implemented it:

https://github.com/OGF-GLUE/Enumerations/blob/master/Capability_t-draft.csv

It is very job-interface centric, but can be applied to anything, and I find it way better than filling tag names and using the Profile attribute to read an RFC as Paul proposed. Examples:

data.access.stageindir.gridftp
Capacity of offering clients a place from where to upload data by means of the gridftp protocol

information.query.xpath1
Capacity of answering information system queries specified in the XPath v 1.0 query language.

Maybe verbose, but difficult to interpret in a wrong way. And more than this, intuitive both for man and machine.
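To illustrate how discovery on such dotted capability strings might work for a machine, here is a minimal Python sketch. The two example strings come from the message above. Matching on whole dot-separated prefixes is an assumption made for this example and is not something the thread or the draft file specifies, and `has_capability` is a hypothetical helper.

```python
from typing import Iterable


def has_capability(published: Iterable[str], wanted: str) -> bool:
    """True if a published capability equals `wanted` or is a more specific form of it.

    Assumption: 'data.access.stageindir' should match
    'data.access.stageindir.gridftp', but partial components
    such as 'data.acc' should not match anything.
    """
    return any(cap == wanted or cap.startswith(wanted + ".") for cap in published)


if __name__ == "__main__":
    published = ["data.access.stageindir.gridftp", "information.query.xpath1"]
    print(has_capability(published, "data.access.stageindir"))
    print(has_capability(published, "information.query.xpath2"))
```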

...

but there exists plenty of legacy usage that selects on ServiceType. For example, an SE supporting the SRM protocol should publish a record with GlueServiceType=SRM, while _GlueSEImplementationName_ may be "DPM".

I think the above is part of the GLUE2 EGI profile. I found it perfectly sound, as so far there was no solution to discover such information. We actually agreed on using it, but I don't think this is a nice way of performing discovery. I think you're right, this is legacy stuff. This happened because Glue1 was NOT service-oriented but endpoint-oriented, and the service inherited the endpoint name SRM, or whatever it was in Glue1. The same goes for GOCDB: it's Service-Endpoint oriented, which means there is no difference between the service and the endpoints/interfaces/web services it serves.

I would suggest that SRM be deprecated as a ServiceType_t in favour of correct product names with proper reverse DNS. Implementation name is fine. The string SRM is very misleading. It's a protocol, not a service. It's written everywhere here: https://www.gridpp.ac.uk/wiki/SRM There's even a section with implementations. I think this should NOT be a ServiceType_t at all. As a matter of fact, it never appears in GFD.147 as such. Instead there is a clumsy ogf.srm InterfaceName_t there.

...

...
I think the real problem here is that I didn't figure out what this DPM thing is. Nobody sent a description. Nobody sent a link to relevant documentation. If you know about this, please help me understand this better.

http://www.eu-emi.eu/releases/emi-3-montebianco/products/-/asset_publisher/5...

thanks for the link! I still think the ServiceType should be different. That would make perfect sense to me. It does not really describe the product itself, but the collection of endpoints that the product MAY offer. Maybe ch.cern.dpm ? Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

Maarten Litmaath

2:44 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi all,

...

Maybe ch.cern.dpm ?

Possibly, but the DPM team should think about the desired name space, in particular since the DPM Collaboration comprises other institutes. To be continued...

stephen.burke＠stfc.ac.uk

2:13 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Maarten Litmaath [mailto:Maarten.Litmaath@cern.ch] said:

...

I suspect using capabilities would be better indeed, but there exists plenty of legacy usage that selects on ServiceType. For example, an SE supporting the SRM protocol should publish a record with GlueServiceType=SRM, while _GlueSEImplementationName_ may be "DPM".

The GLUE 1 ServiceType maps to the GLUE 2 EndpointInterfaceName, not the GLUE 2 ServiceType! So in GLUE 2 the *Endpoints* should still have a ImplementationName of SRM, but the ServiceType is new and hence open to definition. Stephen

Florido Paganelli

2:30 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Stephen, On 2014-04-04 16:13, stephen.burke@stfc.ac.uk wrote:

...

Maarten Litmaath [mailto:Maarten.Litmaath@cern.ch] said:

...
I suspect using capabilities would be better indeed, but there exists plenty of legacy usage that selects on ServiceType. For example, an SE supporting the SRM protocol should publish a record with GlueServiceType=SRM, while _GlueSEImplementationName_ may be "DPM".

The GLUE 1 ServiceType maps to the GLUE 2 EndpointInterfaceName, not the GLUE 2 ServiceType! So in GLUE 2 the *Endpoints* should still have a ImplementationName of SRM, but the ServiceType is new and hence open to definition.

Stephen

Ok, I misunderstood that. This is very nice. So we're free to suggest and set a new route. Cheers, Florido -- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Location: Fysikum, Hus B, Rum B313 Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

stephen.burke＠stfc.ac.uk

2:45 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:

...

the group would like to have an organization name. It has been decided that if there is no organization name, then one can fall back to the group's reserved organization name, which can be used for orphan projects.

therefore we suggest:

org.ogf.glue.dpm

What do people think about this?

As I've said before, the ServiceType is only going to be useful if there is a need to identify a set of services which is bigger than a single implementation and smaller than the universe of all Services (or all Storage Services in this case). It may be that there is no such need, in which case we may as well use a ServiceType based on the implementation name (we have to publish something, it's a mandatory attribute). In any case, there's a meeting of the storage providers organised by Maria to discuss this kind of thing in a couple of weeks (https://indico.cern.ch/event/311528/), so we should wait to see what comes out of that.

...

...
- "org.webdav" and "org.xrootd" to InterfaceName

This I completely fail to understand.

webdav and xrootd are protocol names. Why would they be InterfaceNames? An interface name is an "identification of the Interface" as GFD.147 says. "org." is not an organization.

I don't understand your point. InterfaceNames are often protocol names, although they may be more restricted to indicate a specialised use, for example the BDII uses LDAP as a protocol but the InterfaceNames are bdii_site and bdii_top (which aren't especially well-formed but have been in use for a long time). In general, if an Interface is specific to a particular product or project then we would prefer the Name qualified by a DNS-style prefix to avoid name clashes (not because the name has any significance). For standard well-known protocols like say http there is no need for any prefix as there's no likelihood of a clash. Arguably xroot is not such a general standard, but we long ago agreed on "xroot" as the name, it's been in use for many years and I think there's little chance of a clash. As we already discussed, webdav is a different case. It's certainly a standard, but it remains undecided whether we regard it as an interface in its own right or as a subset of http (and/or https?).

...

now, we have dCache publishing: xroot as interface name, which I think is as bad as the above.

Since that is what we specified, dcache is correct.

...

This needs severe sanitization.

No. For established names that have been in use for a long time I think changing them would be a very bad idea, it gains nothing and would be disruptive for a long time. Past experience with trying to rename things is that it's nearly impossible to remove all traces of the old name, so I would say that it should only be considered where there's an overriding reason. (Also note that the GOC DB has decided not to bring its names in line for the same reason.)

...

My suggestion for these two InterfaceNames would be:

org.ogf.glue.dpm.webdav org.ogf.glue.dpm.xrootd

No, that would be crazy - these are *standard* protocols, they are not in any way specific to DPM, so they need to have universal names.

...

How they can be used in discovery I already said in several emails. InterfaceName should NOT be used to identify the protocol or the capabilities. For that there are existing attributes.

This is nonsense - InterfaceName is precisely the agreed attribute to identify the protocol. Stephen

Florido Paganelli

3:42 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Hi Stephen, Thanks for the comments. Some thoughts: On 2014-04-04 16:45, stephen.burke@stfc.ac.uk wrote:

...

Florido Paganelli [mailto:florido.paganelli@hep.lu.se] said:

...
the group would like to have an organization name. It has been decided that if there is no organization name, then one can fall back to the group's reserved organization name, which can be used for orphan projects.

therefore we suggest:

org.ogf.glue.dpm

What do people think about this?

As I've said before, the ServiceType is only going to be useful if there is a need to identify a set of services which is bigger than a single implementation and smaller than the universe of all Services (or all Storage Services in this case). It may be that there is no such need, in which case we may as well use a ServiceType based on the implementation name (we have to publish something, it's a mandatory attribute).

In any case, there's a meeting of the storage providers organised by Maria to discuss this kind of thing in a couple of weeks (https://indico.cern.ch/event/311528/), so we should wait to see what comes out of that.

Ok, let's wait and see.

...

...
...
- "org.webdav" and "org.xrootd" to InterfaceName

This I completely fail to understand.

webdav and xrootd are protocol names. Why would they be InterfaceNames? An interface name is an "identification of the Interface" as GFD.147 says. org. is not an organization.

I don't understand your point. InterfaceNames are often protocol names, although they may be more restricted to indicate a specialised use, for example the BDII uses LDAP as a protocol but the InterfaceNames are bdii_site and bdii_top (which aren't especially well-formed but have been in use for a long time). In general, if an Interface is specific to a particular product or project then we would prefer the Name qualified by a DNS-style prefix to avoid name clashes (not because the name has any significance).

ok. But as a developer (I was not yet a member of the group) reading the GLUE2 spec, I thought: ok, any random name can go there -- as nowhere in the document is the word "protocol" ever coupled with "InterfaceName" -- again, in my opinion this is badly managed by the group and brings lots of confusion. It is a mandatory attribute!

...

For standard well-known protocols like say http there is no need for any prefix as there's no likelihood of a clash. Arguably xroot is not such a general standard, but we long ago agreed on "xroot" as the name, it's been in use for many years and I think there's little chance of a clash.

I see

...

As we already discussed, webdav is a different case. It's certainly a standard, but it remains undecided whether we regard it as an interface in its own right or as a subset of http (and/or https?).

Publishing the subset is wrong. It is misleading.

...

...
now, we have dCache publishing: xroot as interface name, which I think is as bad as the above.

Since that is what we specified, dcache is correct.

...
This needs severe sanitization.

No. For established names that have been in use for a long time I think changing them would be a very bad idea, it gains nothing and would be disruptive for a long time. Past experience with trying to rename things is that it's nearly impossible to remove all traces of the old name, so I would say that it should only be considered where there's an overriding reason. (Also note that the GOC DB has decided not to bring its names in line for the same reason.)

...
My suggestion for these two InterfaceNames would be:

org.ogf.glue.dpm.webdav org.ogf.glue.dpm.xrootd

No, that would be crazy - these are *standard* protocols, they are not in any way specific to DPM, so they need to have universal names.

I understand. But then in the next review of the spec let's put that this is a protocol name and not the fuzzy identification thing that is there now. We might keep the word "identification" but let's put examples at least. We will need to add rules for interfaces that support multiple protocols, i.e. these protocols must be specified as Capabilities? The most obvious thing is that one should publish an endpoint for each interfaceName.
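The "one endpoint for each InterfaceName" rule mentioned above can be illustrated with a minimal Python sketch. The `Endpoint` class, the `endpoints_for` helper, and the host name and URLs are all hypothetical; they are not part of any GLUE 2 publisher.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Endpoint:
    """Simplified stand-in for a GLUE 2 Endpoint (illustrative fields only)."""
    service_id: str
    interface_name: str
    url: str


def endpoints_for(service_id: str, interfaces: Dict[str, str]) -> List[Endpoint]:
    """Build one Endpoint per InterfaceName offered by a service.

    `interfaces` maps an InterfaceName to the URL where it is served.
    """
    return [
        Endpoint(service_id, name, url)
        for name, url in sorted(interfaces.items())
    ]


# A storage service speaking two protocols yields two endpoints.
published = endpoints_for(
    "se.example.org",  # hypothetical service identifier
    {
        "xroot": "root://se.example.org:1094/",
        "webdav": "https://se.example.org/dav/",
    },
)
for endpoint in published:
    print(endpoint.interface_name, endpoint.url)
```

Under this reading, a client looking for a protocol filters endpoints by `interface_name`, which is why a service offering several protocols ends up with several endpoints.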

...

...
How they can be used in discovery I already said in several emails. InterfaceName should NOT be used to identify the protocol or the capabilities. For that there are existing attributes.

This is nonsense - InterfaceName is precisely the agreed attribute to identify the protocol.

I never read it anywhere as a developer, I'm sorry Stephen. For me it is just another string. After this clarification and 3 years of working with this I just realized we still don't have a common point of view on concepts. The fact that communities using GLUE2 historically made some choices is not visible to an external reader.

So in conclusion:

1) We wait for this storage meeting.
2) We keep those protocol names as InterfaceNames for the reasons you explained.

Fair enough for me, but I just think that currently InterfaceName:

- is of no real use for discovery because of inconsistencies across implementors: some put a well-known protocol name, some a weird string; if you don't belong to the community that created the rendering you just don't understand what it is about
- imposes strict rules on implementations (multiple protocols => multiple endpoints)
- does not suggest a way to publish protocol information -- this is something we only knew because we belong to the community using it, but it is never mentioned in the spec.

Cheers,
--
Florido Paganelli
ARC Middleware Developer - NorduGrid Collaboration
System Administrator
Lund University, Department of Physics, Division of Particle Physics
BOX118, 221 00 Lund
Office Location: Fysikum, Hus B, Rum B313
Office Tel: 046-2220272
Email: florido.paganelli@REMOVE_THIShep.lu.se
Homepage: http://www.hep.lu.se/staff/paganelli

Maria Alandes Pradillo

8 Apr 8 Apr

12:57 p.m.

New subject: Enumerations for DPM and related InterfaceNames was: Re: New Endpoint and Service types

Dear all,

...

In any case, there's a meeting of the storage providers organised by Maria to discuss this kind of thing in a couple of weeks (https://indico.cern.ch/event/311528/), so we should wait to see what comes out of that.

Just to clarify that this is a first meeting where we will try to identify priorities and organise future regular meetings to discuss Information System issues that are common to the different Storage Systems. I'm not sure there will be time for a specific discussion next week, but we will for sure include this type of thing on the list of priorities. By the way, if you have a particular thing on which you need input from this meeting, please formulate it in a more precise way since I'm a bit lost with all the mails exchanged so far. Are we talking here about capabilities, interface names, ...? It would be good to have a request from the GLUE WG that I can bring to our meeting.

...

No. For established names that have been in use for a long time I think changing them would be a very bad idea, it gains nothing and would be disruptive for a long time. Past experience with trying to rename things is that it's nearly impossible to remove all traces of the old name, so I would say that it should only be considered where there's an overriding reason. (Also note that the GOC DB has decided not to bring its names in line for the same reason.)

Yes, please, take into account what is already in use. We can't ignore that completely; we should be pragmatic as well or nobody will make use of this information. For instance, I don't know how many people know what 'org.ogf.gfd-129' is, but most people know what SRM is.

Regards,
Maria

...

...
My suggestion for these two InterfaceNames would be:

org.ogf.glue.dpm.webdav org.ogf.glue.dpm.xrootd

No, that would be crazy - these are *standard* protocols, they are not in any way specific to DPM, so they need to have universal names.

...
How they can be used in discovery I already said in several emails. InterfaceName should NOT be used to identify the protocol or the capabilities. For that there are existing attributes.

This is nonsense - InterfaceName is precisely the agreed attribute to identify the protocol.

Stephen

Bartosz Bosak

10 Mar 10 Mar

11:50 a.m.

Dear Stephen,

We have implemented our preliminary providers for the QCG services on top of the scripts available in the emi-resource-information-service package. Now we are going to create the relevant RPMs. Meanwhile, I have a couple of questions:

1. Do we need to publish data into both glue1 and glue2 or just into glue2?
2. It seems that the capability "notification" is not present in the supported capabilities of the validator. If this is intentional, what capability should we select for the QCG-Notification service? If not, could we ask for the addition of the "notification" capability?
3. We have a doubt about publication of the site-ID. The scripts, e.g. glite-info-glue2-simple, require the site-ID to be provided, but I don't know how and from where this information can be acquired. Is there some instruction or standard way of doing this?
4. Can you estimate when the requested EndpointInterfaceName(s) and ServiceType(s) (and maybe a new capability for QCG-Notification) will be present in the appropriate places?

Best Regards,
Bartek

2014-03-03 15:08 GMT+01:00 <stephen.burke@stfc.ac.uk>:

...

Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...
Could you check if new ServiceTypes/EndpointInterfaceNames for the QCG middleware have been registered in GLUE 2? Now we have some time and we would like to move forward with the implementation of our publishers for BDII...

I haven't seen anyone disagree with them. None of the names are in use, and the org.qcg names are your own name space anyway. org.oasis.notification is something which could potentially be relevant to other people, but assuming that it's just a vanilla implementation of a standard I don't see that it would be controversial. Unless anyone has a strong objection now I'd suggest that you go ahead with your implementation.

Stephen


stephen.burke＠stfc.ac.uk

4:05 p.m.

Bartosz Bosak [mailto:bbosak@man.poznan.pl] said:

...

1. Do we need to publish data into both glue1 and glue2 or just into glue2?

GLUE 2 is sufficient as GLUE 1 is fairly close to being deprecated. On the other hand, there's no objection to publishing GLUE 1 for the time being.

...

2. It seems that the capability "notification" is not present in the supported capabilities of the validator? If this is a purposeful configuration, what capability should we select for the QCG-Notification service? If not, could we ask for addition of the "notification" capability?

Capabilities are treated in the same kind of way as service/endpoint types, i.e. if you want something new which isn't covered by the existing values you can propose it to the GLUE WG.
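Because these enumerations are open, a check against them is better thought of as a lookup with three outcomes than as a strict pass/fail test. The following Python sketch illustrates that idea under stated assumptions: the function name, the result labels, and the sample values are hypothetical, and the "requested" set only mirrors the pending-requests idea raised earlier in the thread. It is not how the actual validator works.

```python
from typing import AbstractSet


def check_open_enumeration(
    value: str,
    official: AbstractSet[str],
    requested: AbstractSet[str] = frozenset(),
) -> str:
    """Classify a published value against an open enumeration.

    Returns 'official' for approved values, 'requested' for values that
    have been proposed but not yet approved, and 'unknown' otherwise.
    """
    if value in official:
        return "official"
    if value in requested:
        return "requested"
    return "unknown"


# Illustrative sample values only, not the real Capability list.
official_capabilities = {"information.discovery", "data.transfer"}
requested_capabilities = {"notification"}

for capability in ("data.transfer", "notification", "my.custom.thing"):
    status = check_open_enumeration(
        capability, official_capabilities, requested_capabilities
    )
    print(capability, "->", status)
```

An 'unknown' result would then be a prompt to propose the value to the GLUE WG rather than a hard error.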

...

3. We have a doubt about publication of site-ID. The scripts, e.g. glite-info-glue2-simple, require provisioning of site-ID, but I don't know how and from which place, this information can be acquired... Is there some instruction or standard way of doing this?

In EGI, the site IDs are defined in the GOC DB: https://goc.egi.eu/portal/index.php?Page_Type=Sites The service manager should know their site ID, so if you provide a way to configure it that should be OK. A service doesn't have to be physically located at that site, but there needs to be some kind of relationship - in particular the service will normally need to be configured in the site BDII for that site.

...

4. Can you estimate when requested EndpointInterfaceName(s) and ServiceType(s) (and maybe new capability for QCG-Notification) will be present in the appropriate places?

Hopefully Maria can comment about the validator. The GLUE WG is planning to meet in the next couple of weeks to discuss the new values.

Stephen

Maria Alandes Pradillo

4:30 p.m.

Dear all,

...

Hopefully Maria can comment about the validator. The GLUE WG is planning to meet in the next couple of weeks to discuss the new values.

This has been released. To be included in the EMI repos next month: http://gridinfo.web.cern.ch/sys-admins/bdii-releases/glue-validator-2021 Regards, Maria

Bartosz Bosak

14 Mar 14 Mar

10:25 a.m.

Dear Maria, all

If I am right there is no capability "notification" defined in the Open Enumerations registry. Is it possible to add this capability to the set? This capability was defined by OGF in http://www.ogf.org/documents/GFD.80.pdf and it is particularly important for our QCG-Notification service.

Best Regards,
Bartek

2014-03-10 17:30 GMT+01:00 Maria Alandes Pradillo <Maria.Alandes.Pradillo@cern.ch>:

...

Dear all,

...
Hopefully Maria can comment about the validator. The GLUE WG is planning to meet in the next couple of weeks to discuss the new values.

This has been released. To be included in the EMI repos next month: http://gridinfo.web.cern.ch/sys-admins/bdii-releases/glue-validator-2021

Regards, Maria

Maria Alandes Pradillo

19 Mar 19 Mar

9:37 a.m.

Yes, I will add it in a next release.

From: bartosz.bosak@gmail.com [mailto:bartosz.bosak@gmail.com] On Behalf Of Bartosz Bosak
Sent: 14 March 2014 11:26
To: Maria Alandes Pradillo
Cc: Stephen Burke; glue-wg@ogf.org; qcg@plgrid.pl; Tomasz Piontek
Subject: Re: New Endpoint and Service types

Dear Maria, all

If I am right there is no capability "notification" defined in the Open Enumerations registry. Is it possible to add this capability to the set? This capability was defined by OGF in http://www.ogf.org/documents/GFD.80.pdf and it is particularly important for our QCG-Notification service.

Best Regards,
Bartek

2014-03-10 17:30 GMT+01:00 Maria Alandes Pradillo <Maria.Alandes.Pradillo@cern.ch>:

Dear all,

...

Hopefully Maria can comment about the validator. The GLUE WG is planning to meet in the next couple of weeks to discuss the new values.

This has been released. To be included in the EMI repos next month: http://gridinfo.web.cern.ch/sys-admins/bdii-releases/glue-validator-2021

Regards, Maria

Tomasz Piontek

11 Mar 11 Mar

8:19 a.m.

Hi Stephen, On 10.03.2014 17:05, stephen.burke@stfc.ac.uk wrote:

...

...
3. We have a doubt about publication of site-ID. The scripts, e.g. glite-info-glue2-simple, require provisioning of site-ID, but I don't know how and from which place, this information can be acquired... Is there some instruction or standard way of doing this?

In EGI, the site IDs are defined in the GOC DB:

https://goc.egi.eu/portal/index.php?Page_Type=Sites

The service manager should know their site ID, so if you provide a way to configure it that should be OK. A service doesn't have to be physically located at that site, but there needs to be some kind of relationship - in particular the service will normally need to be configured in the site BDII for that site.

Maybe this arises from a lack of detailed knowledge about the BDII on our side, but we thought that a "resource" does not have to know which "site" it belongs to, and that the relationship is built on the site side, where the resource is registered in the site BDII. We are just wondering whether it is possible to avoid any action from the site admin during the installation of the qcg-bdii RPMs. So far the only action needed is to provide the site name. It would be nice to eliminate it, but as I understand you, that is not possible.

All the best,
Tomek
--
Tomasz Piontek piontek@man.poznan.pl
Poznan Supercomputing and Networking Center
tel. (+48 61) 858-21-72 fax. (+48 61) 858-21-51

stephen.burke＠stfc.ac.uk

11:29 a.m.

Tomasz Piontek [mailto:piontek@man.poznan.pl] said:

...

Maybe this arises from lack of detailed knowledge about the BDII on our side but we thought that "resource" do not have to know which "site" it belongs to, and the relationship is built on site side where the resource is registered in the site bdii.

The Service object has a relation to the AdminDomain (site) so the publisher has to set it. Sites usually use some configuration tool so they only need to set the site name once and propagate it to all the services.

Stephen
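As a rough illustration of why the publisher needs the site name, here is a minimal Python sketch that builds a Service record carrying a reference to its AdminDomain. The function name and the example identifiers are hypothetical, and the attribute names are an assumption modelled on the GLUE 2 LDAP rendering; they should be checked against the actual schema before use.

```python
from typing import Dict


def service_entry(service_id: str, service_type: str, site_name: str) -> Dict[str, str]:
    """Build a simplified Service record that points at its AdminDomain.

    Attribute names are assumed, modelled on the GLUE 2 LDAP rendering.
    """
    if not site_name:
        # The relation cannot be published without a configured site name.
        raise ValueError("site name must be configured before publishing")
    return {
        "GLUE2ServiceID": service_id,
        "GLUE2ServiceType": service_type,
        # Relation from the Service to the AdminDomain (site).
        "GLUE2ServiceAdminDomainForeignKey": site_name,
    }


# The site name is set once in configuration and reused for every service.
SITE_NAME = "EXAMPLE-SITE"  # hypothetical value supplied by the site admin

entry = service_entry("qcg-broker.example.org", "org.qcg.broker", SITE_NAME)
for attribute, value in entry.items():
    print(f"{attribute}: {value}")
```

Failing early on a missing site name reflects the point made above: the value has to come from site configuration, since the service cannot derive it on its own.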


participants (7)

Bartosz Bosak
Florido Paganelli
Maarten Litmaath
Maria Alandes Pradillo
Paul Millar
stephen.burke＠stfc.ac.uk
Tomasz Piontek