
Freek/All;

On 11/7/10 2:16 PM, Freek Dijkstra wrote:
Hi all,
After writing our ideas on identifiers down, I still have five smaller questions.
Quotes are from the meeting notes (http://forge.gridforum.org/sf/docman/do/downloadDocument/projects.nml-wg/doc...):
Rough consensus on:
- http://schemas.ogf.org/nml/base/2013/10/ (Jason's proposal)

Question 1. Should the schema end with a / or #?
a) http://schemas.ogf.org/nml/base/2013/10 (common for XML)
b) http://schemas.ogf.org/nml/base/2013/10/ (current proposal)
c) http://schemas.ogf.org/nml/base/2013/10# (common for RDF)
For XML I don't think it makes any difference; for RDF, I think it should be b or c. (We may decide on a different namespace for XML and RDF, but I propose not to do that unless there are compelling reasons to do so).
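To make the RDF side of this concrete, here is a small sketch in Python. RDF tools generally expand a prefixed name by plain string concatenation of the namespace and the local name, which is why the trailing character matters there. The local name "Link" and the helper term_iri are made up for the illustration; they are not taken from the schema.

```python
# Illustration only: prefix expansion in RDF is string concatenation of the
# namespace IRI and the local name. "Link" is a hypothetical local name.
CANDIDATES = {
    "a": "http://schemas.ogf.org/nml/base/2013/10",
    "b": "http://schemas.ogf.org/nml/base/2013/10/",
    "c": "http://schemas.ogf.org/nml/base/2013/10#",
}


def term_iri(namespace, local_name):
    """Build the full term IRI the way prefix expansion does."""
    return namespace + local_name


for option, namespace in CANDIDATES.items():
    # Option a) runs the local name into the last path segment ("10Link").
    print(option, term_iri(namespace, "Link"))
```

With option a) the local name is glued onto the "10" path segment, while b) and c) keep it separate as a path segment or a fragment.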
b) is what we use currently in the perfSONAR/NMC world. Ex: https://svn.internet2.edu/svn/perfSONAR-PS/trunk/perfSONAR_PS-LookupService/...
Further recap from the meeting notes:
In Catania NML decided on the instance identifier format:
  urn:ogf:network:<domain part>:<local part>
<local part> is opaque, only processed by end parts.
GLIF also agreed to use this format.

Richard & Freek did put together a doc for the IETF RFC to define the URI. Freek has translated it to XML, but he needs to consult Joel on web site details.

Case insensitive: the RFC says we have to specify case sensitive/insensitive. So we need to define urn:ogf: at the OGF level, then :network: and the rest as case insensitive, i.e. we have to define the lexical equivalence.
Rough consensus on:
- Different objects, e.g. link and port, MUST have different identifiers
- instance identifiers are case insensitive
- instance identifiers are non-international (thus a URI instead of an IRI)
- URIs are not restricted in length, other than possible restrictions by RFC 2141 (the current GLIF recommendation is max 48 to 80 bytes)
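As a rough sketch of what "case insensitive" could mean for an implementation, the Python below compares two identifiers after lower-casing them and splits off the domain and local parts. Lower-casing as the canonical form is an assumption on my side (the lexical equivalence rule still has to be written down), and the function names are made up.

```python
PREFIX = "urn:ogf:network:"


def normalize(identifier):
    """Canonical form for comparison. Assumption: lower-case everything."""
    return identifier.lower()


def same_object(a, b):
    """True if both identifiers are lexically equivalent under normalize()."""
    return normalize(a) == normalize(b)


def split_identifier(identifier):
    """Return (domain part, local part) of a normalized identifier.

    The local part is kept as one opaque string, even if it contains ':'.
    """
    norm = normalize(identifier)
    if not norm.startswith(PREFIX):
        raise ValueError("not a urn:ogf:network identifier: " + identifier)
    domain, sep, local = norm[len(PREFIX):].partition(":")
    if not (domain and sep and local):
        raise ValueError("missing domain or local part: " + identifier)
    return domain, local


print(same_object("urn:ogf:network:ES.net:Link_A_to_B",
                  "URN:OGF:NETWORK:es.net:link_a_to_b"))
print(split_identifier("urn:ogf:network:es.net:link_A_to_B"))
```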
I forgot to mention in the notes that we had a discussion on how to refer to identifiers.
(see slides 14-18 in http://forge.gridforum.org/sf/go/doc16081)
- RDF uses the attributes rdf:about and rdf:resource
- NM-WG uses the attributes id and idref
- The BUILT-IN XML ID and IDREF attributes cannot be used, since they only work within a document.

We had a discussion on whether we should re-use the id and idref from the NM working group (formally: re-use the attributes in the http://ggf.org/ns/nmwg/base/2.0/ namespace) or redefine these attributes. I forgot what the consensus was.

Question 2. What attributes to use for references in XML?
a) existing id and idref in NM-WG namespace
b) redefine id and idref in NML namespace
c) create dedicated namespace for just id and idref
As I stated in person (but will restate for this list), it's uncommon to associate attributes with a namespace other than the one of the parent element. E.g.:
  <ns:element attribute="something" />
is normally read as 'attribute' belonging with the 'ns' element (strictly speaking, the XML Namespaces recommendation puts an unprefixed attribute in no namespace at all). It is uncommon to see this:
  <ns:element ns2:attribute="something" />
But it is possible. I think b) makes the most sense; we do this in NM/NMC now.
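For anyone who wants to see how a namespace-aware parser reports the two forms, here is a small sketch with Python's xml.etree.ElementTree. The namespace URIs are placeholders, not real NML or NM-WG namespaces. The parser reports the unprefixed attribute with a bare name and the prefixed one with its namespace in braces, so in practice the unprefixed form is simply handled together with its element.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, for illustration only.
NS = "http://example.org/ns"
NS2 = "http://example.org/ns2"

doc = (
    f'<ns:element xmlns:ns="{NS}" xmlns:ns2="{NS2}" '
    'attribute="something" ns2:attribute="other"/>'
)
elem = ET.fromstring(doc)

# ElementTree writes qualified names as "{namespace}local".
print(elem.tag)
for name, value in elem.attrib.items():
    print(name, "=", value)

# Lookup uses the same notation: bare name for the unprefixed attribute,
# "{namespace}name" for the prefixed one.
print(elem.get("attribute"))
print(elem.get(f"{{{NS2}}}attribute"))
```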
We decided on the urn:ogf:network:example.net:opaque-identifier syntax.
We have not yet defined what characters should be allowed in the opaque identifier part. We have the following options:
Allowed characters:
  GLIF:        A-Z a-z 0-9 - .
  RFC2141:     A-Z a-z 0-9 - . _ ( ) + , : = @ ; $ ! * ' %hex
  pchar:       A-Z a-z 0-9 - . _ ~ ( ) + , : = @ ; $ ! * ' & %hex
  unreserved:  A-Z a-z 0-9 - . _ ~
where %hex is a percent-encoding, e.g. %2E.
- unreserved and pchar are definitions from RFC 3986, which defines URIs
- GLIF is what is defined in the GLIF working group. This is extremely limited (: and _ are not allowed).
- RFC 2141 is what is currently allowed in a URN. (This list excludes 4 "reserved" characters which are in the definition for future use.)
- RFC 2141 is currently being revised. It is very likely that & and ~ will be allowed, making the definition equal to that of pchar.
- unreserved is similar to the current GLIF list.
- Note that the following characters are NEVER allowed: % (except as part of %hex) / ? # [ ] \ " < > ^ ` { | }
Question 3. What characters are allowed in <opaque string>?
a) GLIF: A-Z a-z 0-9 - .
b) unreserved: A-Z a-z 0-9 - . _ ~
c) RFC2141: A-Z a-z 0-9 - . _ ( ) + , : = @ ; $ ! * ' %hex
d) pchar: A-Z a-z 0-9 - . _ ~ ( ) + , : = @ ; $ ! * ' & %hex
I believe we should use the approach that will be most widely supported by parsing tools/libraries, and that most closely matches what GLIF and other standards bodies use.
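To compare the four options against real strings, a small Python sketch along these lines may help. The regular expressions are my transcription of the character lists in options a) to d), and the dictionary keys and the function name allowed_by are invented for the example; none of this is a reference implementation.

```python
import re

# %hex: a percent sign followed by exactly two hex digits.
PCT = r"%[0-9A-Fa-f]{2}"

# One pattern per option in Question 3 (transcribed from the lists above).
CHARSETS = {
    "glif": re.compile(r"[A-Za-z0-9.\-]+"),
    "unreserved": re.compile(r"[A-Za-z0-9._~\-]+"),
    "rfc2141": re.compile(r"(?:[A-Za-z0-9._()+,:=@;$!*'\-]|" + PCT + r")+"),
    "pchar": re.compile(r"(?:[A-Za-z0-9._~()+,:=@;$!*'&\-]|" + PCT + r")+"),
}


def allowed_by(opaque):
    """Return the names of the options that accept the whole opaque string."""
    return [name for name, pattern in CHARSETS.items()
            if pattern.fullmatch(opaque)]


for sample in ["link-A.1", "link_A_to_B", "a:b", "a~b", "a&b", "a%2Eb", "a/b"]:
    print(sample, allowed_by(sample))
```

Note that the example identifiers used in this thread, such as link_A_to_B, contain an underscore and so would not pass option a).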
The current schema states that ALL Network Objects MUST have an identifier.
This is very strict. For example, even a network object that is never referenced MUST still have an ID. Thus the following is NOT allowed:
<nml:bidirectionallink id="urn:ogf:network:es.net:bilink_A-C">
  <nml:link>
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
    </nml:relation>
  </nml:link>
  <nml:link>
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_C_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_A"/>
    </nml:relation>
  </nml:link>
</nml:bidirectionallink>
Instead, everything MUST be named, like so:
<nml:bidirectionallink id="urn:ogf:network:es.net:bilink_A-C">
  <nml:link id="urn:ogf:network:es.net:link_A_to_C"> <!-- ADDED id -->
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
    </nml:relation>
  </nml:link>
  <nml:link id="urn:ogf:network:es.net:link_C_to_A"> <!-- ADDED id -->
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_C_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_A"/>
    </nml:relation>
  </nml:link>
</nml:bidirectionallink>
Question 4. MUST all objects have an id?
a) All Network Objects MUST have an identifier.
b) All Network Objects SHOULD have an identifier.
"SHOULD" means that an identifier may be left out, but only if it is clear what the consequences are (in this case: the result can not be referred to.)
As a parallel to the perfSONAR/NMC world - all first order objects have an ID field (e.g. data, metadata, subject, parameters, key). Some do not (eventType, 'parameter' [lives inside of parameters], datum, time formats). I do not have a strong opinion on this, but I think that if you plan on ever referencing an object (e.g. in your 2nd example above, creating the serial compound A_C out of A_B and B_C) it should have an ID. If the relationship is temporal and will never be referenced it won't need the ID, but it doesn't seem like a stretch to just give it one anyway. I suppose I would prefer a) to be safe, but won't defend it to the death.
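Whichever of a) or b) wins, a checker is easy to write, so the choice is mostly about whether a missing id is an error or a warning. Below is a rough Python sketch that lists elements carrying neither id nor idRef. Several things in it are assumptions for the sake of the example: the namespace URI is the one proposed under Question 1, only link and bidirectionallink are treated as Network Objects (relation is skipped), and the function name is made up.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace proposed under Question 1.
NML_NS = "http://schemas.ogf.org/nml/base/2013/10/"

# Shortened version of the first example, with the namespace declared.
DOC = f"""
<nml:bidirectionallink xmlns:nml="{NML_NS}"
                       id="urn:ogf:network:es.net:bilink_A-C">
  <nml:link>
    <nml:relation type="serialcompound">
      <nml:link idRef="urn:ogf:network:es.net:link_A_to_B"/>
      <nml:link idRef="urn:ogf:network:es.net:link_B_to_C"/>
    </nml:relation>
  </nml:link>
</nml:bidirectionallink>
""".strip()

# Assumption: which elements count as Network Objects for this check.
OBJECT_TAGS = {f"{{{NML_NS}}}link", f"{{{NML_NS}}}bidirectionallink"}


def unnamed_objects(root):
    """Return object elements that have neither an id nor an idRef."""
    return [el for el in root.iter()
            if el.tag in OBJECT_TAGS
            and "id" not in el.attrib
            and "idRef" not in el.attrib]


root = ET.fromstring(DOC)
missing = unnamed_objects(root)
# Under a) each entry would be an error; under b) a warning at most.
print(len(missing), "object(s) without an identifier")
```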
The current schema states that the Syntax of the identifier MUST follow the urn:ogf:network syntax.
This might make future compatibility harder (e.g. when trying to combine it with other protocols; I can imagine that in the future other naming schemes may be developed).

Question 5. MUST urn:ogf:network syntax be used?
a) All identifiers MUST follow the urn:ogf:network syntax
b) All identifiers MUST be a URI, and SHOULD follow the urn:ogf:network syntax
c) All identifiers MUST be unique, and MAY follow the urn:ogf:network syntax
(some more variants are possible)
No strong preference. I think that using the urn syntax helps to guarantee uniqueness, but I would need to see examples of when it would be impossible to assign this type of ID to a given object.

-jason