Re: [glue-wg] New Endpoint and Service types

7 Apr 2014

Hi Paul,

Answers inline.

On 2014-04-07 12:51, Paul Millar wrote:
...
Hi Florido,
On 07/04/14 11:28, Florido Paganelli wrote:
...
On 2014-04-04 19:01, Paul Millar wrote:
...
I hope you see how, given any RFC or any GFD, I know exactly how to
publish a capability; and, given the capabilities of any endpoint, I
know exactly which RFCs and GFDs it supports.
I can see how, indeed.
These that you listed above make more sense for InterfaceName than for
Capabilities.
Sure, this is my point: currently, capabilities are too high-level and
there is only one InterfaceName.
...
As Stephen pointed out, there is not much additional
information in the reversed domain name there.
I don't really care about the reversed-domain in the name (see below).
...
[..]
But since each Endpoint can only have one InterfaceName, then a service
supporting multiple protocols should in principle have as many Endpoints
as the supported protocols.
So we're back to the idea of publishing the same information many, many
times.  Just to be clear: this is, at best, an ugly work-around.
Here's an example to put the effect in context.
If a StorageService publishes O(10) Endpoints (not unreasonable) and each
endpoint supports O(10) different RFCs (a tad high, but not
unreasonable), then publishing an Endpoint for each RFC means publishing
O(100) *additional* Endpoint objects, each mostly duplicating
information already published.
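To make the arithmetic concrete, here is a minimal sketch in Python. The
counts of 10 are just the O(10) estimates above taken at face value, and
the variable names are illustrative, not part of GLUE2.

```python
# Illustrative numbers only: the O(10) estimates taken literally.
endpoints_per_service = 10  # Endpoints published by one StorageService
rfcs_per_endpoint = 10      # RFCs/GFDs supported by each Endpoint

# Work-around: one Endpoint object per (Endpoint, RFC) pair, because
# each Endpoint carries a single InterfaceName.
objects_one_per_rfc = endpoints_per_service * rfcs_per_endpoint

# Alternative: one object per Endpoint, with the supported RFCs
# published in a multi-valued attribute.
objects_multi_valued = endpoints_per_service

extra_objects = objects_one_per_rfc - objects_multi_valued
print(objects_one_per_rfc, objects_multi_valued, extra_objects)
```

With these numbers the work-around publishes 100 Endpoint objects where
10 would do, that is, 90 objects that mostly repeat existing information.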
...
We can overcome the above using capabilities, but then one must be more
specific on what one can *do* with that protocol:
Capabilities, the way I read them as they're described in GFD.147, are
ways to discover functionalities, thus the namespace is not about the
organization but tells about the functionality.
Two points to consider:
By itself, each RFC document says *very precisely* what functionality is
supported.  The problem here is that the functionality described by an
RFC could cover multiple items in the capability namespace and, when
not, then the choice of capability isn't automatic.
There's nothing to prevent publishing the higher-level functionality in
addition to the very specific RFC-based capability.
...
Hence if one has an Endpoint whose interface supports more than one
protocol, one could publish:
data.transfer.rfc-2660
security.authentication.rfc-2660
data.transfer.gfd-47
and so on.
What do you think? What do the others think?
First off, having two representations for RFC-2660 (as
"data.transfer.rfc-2660" and "security.authentication.rfc-2660") is bad.
Is it?
...
Against which capability should a client query to find an endpoint that
supports RFC-2660 (data.transfer, security.authentication, or both) ?
Both -- but I agree it is cumbersome; more below.
...
What does it mean if a storage element publishes
"data.transfer.rfc-2660" and not "security.authentication.rfc-2660"?
That https is not fully implemented. However, I just invented those
strings; that does not mean we shouldn't reason about them. I might agree
it makes no sense to have one and not the other. We can just decide that
data.transfer.rfc-2660 is enough. Mine was a quickly made-up example.
...
IMHO, there should be a single, canonical way of representing that an
endpoint supports RFC-2660 and this should be somehow a single attribute
that is published.
OK, but let's at least try to keep the string format consistent.
...
Adding "data.transfer" or "security.authentication" as a prefix is
problematic as:
1. protocols often cover multiple high-level functionalities
    (GFD-47 isn't only about data-transfer)
2. the mapping (RFC-2660 --> "security.authentication") isn't
    automatic, so we're back to maintaining an enumeration.
My view is still that:
o    An RFC or GFD describes (completely) a set of functionality.
o    A client's desired interaction is very often predicated on
    an endpoint supporting the functionality described in one
    or more RFC or GFD documents.
o    That a GLUE2 Endpoint supports multiple RFCs/GFDs
    concurrently is both natural and likely.
o    There is considerable advantage to publishing the RFCs/GFDs
    that a GLUE2 Endpoint supports, such that:
    *   A single Endpoint may publish support for
        multiple RFCs/GFDs.
    *   Each RFC and each GFD has a single, canonical
        string value that, when published as part of
        Endpoint, represents support for that RFC or
        GFD.
    *   The canonical value for any RFC or GFD is
        knowable without consulting any enumeration
        (i.e., it is purely algorithmic).
    *   Given a canonical value, the corresponding RFC
        or GFD is knowable without consulting an
        enumeration.
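
The last two properties can be sketched as a pair of functions that
translate in both directions without any lookup table. This is only a
sketch: the "rfc2660" / "gfd47" spelling is an assumed format chosen for
illustration (the thread deliberately leaves the format open), and the
function names are hypothetical.

```python
import re
from typing import Tuple

# Assumed canonical form: lowercase series name followed directly by the
# document number, e.g. "rfc2660" or "gfd47". Not an agreed GLUE2 format.
_KNOWN_SERIES = ("rfc", "gfd")
_CANONICAL_RE = re.compile(r"(rfc|gfd)([1-9][0-9]*)")


def canonical_value(series: str, number: int) -> str:
    """Build the canonical string from a document series and number."""
    series = series.strip().lower()
    if series not in _KNOWN_SERIES:
        raise ValueError(f"unknown document series: {series!r}")
    if number < 1:
        raise ValueError("document number must be positive")
    return f"{series}{number}"


def parse_canonical(value: str) -> Tuple[str, int]:
    """Recover (series, number) from a canonical string."""
    match = _CANONICAL_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a canonical value: {value!r}")
    return match.group(1).upper(), int(match.group(2))


print(canonical_value("RFC", 2660))
print(parse_canonical("gfd47"))
```

Because both directions are pure string manipulation, a new RFC or GFD
needs no change to any enumeration, which is the point of the list above.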
I don't care so much whether the canonical value is published as
Capabilities, InterfaceExtensions, SupportedProfiles or Semantics ---
provided precisely one is chosen as the correct way of publishing this
information.
But you should care about consistency; that is the key to a winning model.
...
Likewise, I don't care so much whether the canonical value for RFC2660
is 'rfc2660', 'RFC-2660', 'org.ietf.rfc-2660',
'Standards.From-IETF.RFC-2660' or 'http://tools.ietf.org/rfc/rfc2660',
provided one purely algorithmic translation is chosen.
I understand all the above comments. However, we are paving the way for
GLUE2 to be an unsuccessful schema. The reason is that we keep
contradicting what is written in the model, we add hacks and strings the
way we like in a non-intuitive way, such that everything can be
interpreted in many different ways. If a capability is a feature then we
should enforce that, and not start putting nonsense random strings
(org.ogf.rfc-2660 does *not* make sense in that context.)

What about

protocol.support.rfc2660
...
...
It would be nice if this is discussed among storage
services developers.
There is a meeting coming up, but I'm not sure how much convergence
there'll be.
I wouldn't be surprised. The GLUE2 model, as it stands, cannot even
bring agreement among those who created it. Maybe we should ask
ourselves why that is (and I was not part of making it).
...
I think it is this group's responsibility to either accept or reject my
view, as expressed above.
It's the group's responsibility to bring sanity to this string mess, not
to accept or reject someone's opinion. We should rather try to converge
to some consistency for the sake of interoperability.

What do you think about the above formulation, that is, generalizing the
capability string as

protocol.supported.<protocol name>
protocol.supported.<rfc name>
protocol.supported.<document name>

If you have better ideas I'd be happy to follow.

I like

protocol.supported.<something>  where something is described in the
Description field. No additional field in the CSV files.
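
As a rough illustration of how a client could use such strings, here is
a minimal Python sketch. The dictionaries standing in for Endpoints, the
IDs, and the function names are all made up for the example; only the
"protocol.supported." prefix comes from the proposal above.

```python
from typing import Dict, List

PREFIX = "protocol.supported."


def capability_for(document: str) -> str:
    """Capability string for a protocol/RFC/document name, e.g. 'rfc2660'."""
    return PREFIX + document


def supported_documents(capabilities: List[str]) -> List[str]:
    """Extract the <something> part from every protocol.supported.* entry."""
    return [c[len(PREFIX):] for c in capabilities if c.startswith(PREFIX)]


def endpoints_supporting(endpoints: List[Dict], document: str) -> List[str]:
    """IDs of the endpoints whose capability list names the document."""
    wanted = capability_for(document)
    return [e["ID"] for e in endpoints if wanted in e["Capability"]]


# Hypothetical published data: one Endpoint listing several protocols.
endpoints = [
    {"ID": "endpoint-a",
     "Capability": ["data.transfer",
                    "protocol.supported.rfc2660",
                    "protocol.supported.gfd47"]},
    {"ID": "endpoint-b", "Capability": ["data.transfer"]},
]

print(endpoints_supporting(endpoints, "rfc2660"))
print(supported_documents(endpoints[0]["Capability"]))
```

A client then needs a single query per document, and one Endpoint can
list several protocols next to its higher-level capabilities.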

Regards,
Florido
-- 
==================================================
 Florido Paganelli
   ARC Middleware Developer - NorduGrid Collaboration
   System Administrator
 Lund University
 Department of Physics
 Division of Particle Physics
 BOX118
 221 00 Lund
 Office Location: Fysikum, Hus B, Rum B313
 Office Tel: 046-2220272
 Email: florido.paganelli@REMOVE_THIShep.lu.se
 Homepage: http://www.hep.lu.se/staff/paganelli
==================================================