
Hi,
Today, I was trying to create and improve an example topology file based on the RNC schema. Unfortunately, the current RNC schemata do not validate when used with a stricter parser. We tried last week with Jing-Trang, and that gave no errors. Today, I tried with http://validator.nu/ and got a few more errors.
Could someone answer my noob questions on RNC? (Either on-list or off-list.)
1) What is the difference between
    Lifetime = element lifetime { StartTime, (EndTime | Duration)? }
and
    Lifetime = element lifetime { StartTime & (EndTime | Duration)? }
and which one should I use? The goal is a lifetime element with a start element (defined in the StartTime rule) and optionally an end OR duration element (respectively defined in the EndTime and Duration rules).
2) I read on http://relaxng.org/compact-tutorial-20030326.html that the order is relevant in RNC. Thus
    <location><latitude>51.5155</latitude><longitude>-0.0922</longitude></location>
is different from
    <location><longitude>-0.0922</longitude><latitude>51.5155</latitude></location>
Is there a way to specify in the RNC schema that this order is irrelevant in the XML?
3) The NML Group is, by its current definition, recursive: a Group is an NML NetworkObject, and a Group can contain NML NetworkObjects, thus including other Groups. I have a problem with such recursive definitions in RNC. At least the validator complains about patterns defined later on in the document. Can't I do that, or am I just doing something wrong? (I'm happy to provide off-list the URLs of the RNC schema and example topology file I'm currently working on, so you can see the errors for yourself.)
4) In the current RNC schema, extensibility was ensured using the "anyElement" rule. E.g.
    BasePortContent = NetworkObject & element capacity { xsd:float }? & anyElement*
Unfortunately, the validator complained about this. When checking a document, it is unclear if a "location" element should be parsed according to the second rule (element capacity { xsd:float }?) or the third rule (anyElement*). When reading about this, it was suggested to remove the anyElement* from the BasePortContent, since it is still possible to add new allowed elements in the following way:
    BasePortContent = NetworkObject & element capacity { xsd:float }?
    # later extension:
    BasePortContent &= element my_extension { xsd:string }?
I have some more questions, but these were the most important ones. If some RNC expert could help me out or point me in the right way, GREAT!
Freek

Hi Freek; Answers inline. If you need a faster reference to RELAX, consider reading the online documentation: http://books.xmlschemata.org/relaxng/page2.html On 6/20/11 11:36 AM, thus spake Freek Dijkstra:
Hi,
Today, I was trying to create and improve an example topology file based on the RNC schema.
Unfortunately, the current RNC schemata do not validate when used with a stricter parser. We tried last week with Jing-Trang, and that gave no errors. Today, I tried with http://validator.nu/ and got a few more errors.
Look into using MSV (works for many different schema languages): http://msv.java.net/
We use this along with Trang/Jing. I have never used the website you speak of, so I can't comment on whether it's useful or not. Typically I have found that it's best to use Trang to convert the RNC schema into different forms (RNG, and then XSD) and then use one of the other schema languages for instance verification. I believe the workflow looks like this:
    Trang -> RNC to RNG
    Trang -> RNG to XSD
    MSV -> validate XML against RNG or XSD
Validating against the RNC can sometimes produce ambiguous parse errors for some of the items you note below (e.g. anyElement); converting can strengthen the meaning of the schema to remove ambiguous paths in the grammar.
Could someone answer my noob questions on RNC? (Either on-list or off-list).
1) What is the difference between Lifetime = element lifetime { StartTime, (EndTime | Duration)? } and Lifetime = element lifetime { StartTime & (EndTime | Duration)? } and which one should I use? The goal is a lifetime element with a start element (defined in the StartTime rule) and optionally an end OR duration element (respectively defined in the EndTime and Duration rules).
& = joining things, and not caring about the order.
, = joining things and enforcing ordering.
In your first example above you would only be able to do:
    <lifetime>
      <startTime></startTime>
      <endTime></endTime>
    </lifetime>
The second would also allow something like this (which the first would reject as out of order):
    <lifetime>
      <endTime></endTime>
      <startTime></startTime>
    </lifetime>
In our experience we try to avoid the comma when we can. Enforcing ordering in an XML document places a lot of emphasis on the tools (or humans) creating the XML exactly in the order the schema mandates, instead of allowing the XML to be 'structured' via nesting elements without caring about the specific order in which sibling elements appear. My personal opinion would be to use '&' always, and avoid the ordering attempt, since I still do not believe a 'list' element is required.
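To make the difference concrete, here is a minimal, self-contained sketch of the two variants. The element names startTime and endTime follow the instance examples above; the duration element name and the xsd:dateTime / xsd:duration datatypes are assumptions for illustration, not taken from the NML schema.

```rnc
start = Lifetime

# Ordered (","): startTime must come first, optionally followed by
# exactly one of endTime or duration.
Lifetime = element lifetime { StartTime, (EndTime | Duration)? }

# Unordered ("&") alternative; use it instead of the definition above:
# Lifetime = element lifetime { StartTime & (EndTime | Duration)? }

# Assumed element definitions (names and datatypes are illustrative).
StartTime = element startTime { xsd:dateTime }
EndTime   = element endTime { xsd:dateTime }
Duration  = element duration { xsd:duration }
```

With the first definition, only startTime followed by an optional endTime or duration is accepted. With the interleaved variant, the optional element may also come before startTime. In both cases endTime and duration remain mutually exclusive.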
2) I read on http://relaxng.org/compact-tutorial-20030326.html that the order is relevant in RNC. Thus <location><latitude>51.5155</latitude><longitude>-0.0922</longitude></location> is different from <location><longitude>-0.0922</longitude><latitude>51.5155</latitude></location>. Is there a way to specify in the RNC schema that this order is irrelevant in the XML?
See above (the use of & versus ,).
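Applied to the location example from question 2, an order-independent definition might look like the sketch below. The xsd:float content type is an assumption; the actual NML definition may differ.

```rnc
start = Location

# Interleave: latitude and longitude are both required, in either order.
Location = element location {
  element latitude { xsd:float }
  & element longitude { xsd:float }
}
```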
3) The NML Group is, by its current definition, recursive: a Group is an NML NetworkObject, and a Group can contain NML NetworkObjects, thus including other Groups. I have a problem with such recursive definitions in RNC. At least the validator complains about patterns defined later on in the document. Can't I do that, or am I just doing something wrong? (I'm happy to provide off-list the URLs of the RNC schema and example topology file I'm currently working on, so you can see the errors for yourself.)
I don't understand this question/problem. Is this a problem of not being able to validate something, or is this just a perception problem where a recursive definition personally bothers you? Messages from the parser (and which parser is being used and how it was invoked) would be helpful.
4) In the current RNC schema, extensibility was ensured using the "anyElement" rule. E.g. BasePortContent = NetworkObject & element capacity { xsd:float }? & anyElement* Unfortunately, the validator complained about this.
Was it a 'warning' or an 'error'? The two have different implications. For example, a common warning that we have seen, 'choice between attributes and children cannot be represented; approximating', is frequently caused by the use of anyElement. This will sometimes result in an ambiguous XSD being generated, but it can still be used to validate instance documents.
When checking a document, it is unclear if a "location" element should be parsed according to the second rule (element capacity { xsd:float }?) or the third rule (anyElement*). When reading about this, it was suggested to remove the anyElement* from the BasePortContent, since it is still possible to add new allowed elements in the following way:
    BasePortContent = NetworkObject & element capacity { xsd:float }?
    # later extension:
    BasePortContent &= element my_extension { xsd:string }?
This is one of the dangers/benefits of using anyElement. In practice, define it as 'late' as possible in any rule set and the parser is smart enough to choose the longest match (e.g. location) first. Without seeing what you have done, in terms of calling the tools and the error messages displayed, I won't be able to comment further.
Thanks;
-jason
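One possible source of the complaint (an assumption, since the actual error message is not shown in this thread): RELAX NG does not allow the two operands of an interleave (&) to contain element patterns whose name classes overlap, and an unrestricted wildcard overlaps with every explicitly named element. A sketch of a wildcard that excludes the declared name follows. NetworkObject is left out to keep the fragment self-contained, and anyOtherElement, anyContent and the port wrapper element are hypothetical names.

```rnc
start = element port { BasePortContent }

BasePortContent =
  element capacity { xsd:float }?
  & anyOtherElement*

# Wildcard that excludes the explicitly declared name, so the two
# operands of "&" can never match the same element.
anyOtherElement = element * - capacity { anyContent }

# Arbitrary well-formed content: any attributes, text and child elements.
anyContent = (attribute * { text } | text | anyElement)*
anyElement = element * { anyContent }
```

If the schema uses namespaces, excluding a whole namespace (for example element * - nml:* { anyContent }, with nml being a hypothetical prefix) may scale better than listing names one by one.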

Hi Jason, Thanks for the quick feedback. As usual, it is much appreciated. Jason Zurawski wrote:
My personal opinion would be to use '&' always, and avoid the ordering attempt since I still do not believe a 'list' element is required.
I'm fine with eliminating the list element, but to me this is unrelated to the ordering in RNC files. In fact, I'm a bit confused now. We introduced explicit lists in late October, after you commented that XML has no inherent order. I can't find an on-list quote, but this is what you wrote off-list:
E.g. XML is not ordered (you cannot be sure that the children will come out in any order). Most parsers respect 'document order', but this is not in the spec.
The list element was introduced afterwards because of this note. (We later decided on using "next" relations instead of a numbered list.) Why do you think a 'list' element is not required?
3) The NML Group is, by its current definition, recursive: a Group is an NML NetworkObject, and a Group can contain NML NetworkObjects, thus including other Groups. I have a problem with such recursive definitions in RNC. At least the validator complains about patterns defined later on in the document. Can't I do that, or am I just doing something wrong? (I'm happy to provide off-list the URLs of the RNC schema and example topology file I'm currently working on, so you can see the errors for yourself.)
I don't understand this question/problem. Is this a problem of not being able to validate something, or is this just a perception problem where a recursive definition personally bothers you?
It is a problem parsing the RNC file. The error is: "Reference to undefined pattern BasePort." Apparently because BasePort is defined later than it is used. I'll play a bit more; if I can't find a solution, I'll find some RNC-users mailing list, or contact you off-list.
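For what it's worth, RELAX NG itself does not care about the order of definitions in a grammar, and recursive references are permitted as long as the recursion passes through an element pattern. A minimal sketch with hypothetical content (the group and port element names and the id attribute are placeholders, not the real NML definitions):

```rnc
start = Group

# NetworkObject is referenced here before it is defined below.
Group = element group { NetworkObject* }

# Recursive: a NetworkObject may itself be a Group. This is allowed
# because the cycle goes through "element group".
NetworkObject = Group | BasePort

BasePort = element port { attribute id { xsd:anyURI }? }
```

If a validator still reports an undefined pattern for a grammar shaped like this, the definition is probably not in scope at all, for instance because it lives in a file that is not pulled in with include, or because the names differ in spelling or case.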
Messages from the parser (and which parser is being used/how it was invoked) would be helpful.
4) In the current RNC schema, extensibility was ensured using the "anyElement" rule. E.g. BasePortContent = NetworkObject & element capacity { xsd:float }? & anyElement* Unfortunately, the validator complained about this.
Was it a 'warning' or an 'error'? The two have different implications.
An error.
This is one of the dangers/benefits of using anyElement. In practice, define it as 'late' as possible in any rule set and the parser is smart enough to choose the longest match (e.g. location) first.
OK, so you suggest keeping the anyElements. I'll see if that works in my validator, or move to another validator.
Regards,
Freek

Hi Freek; On 6/20/11 2:40 PM, thus spake Freek Dijkstra:
Hi Jason,
Thanks for the quick feedback. As usual, it is much appreciated.
Jason Zurawski wrote:
My personal opinion would be to use '&' always, and avoid the ordering attempt since I still do not believe a 'list' element is required.
I'm fine with eliminating the list element, but to me this is unrelated to the ordering in RNC files.
In fact, I'm a bit confused now. We introduced explicit lists in late October, after you commented that XML has no inherent order. I can't find an on-list quote, but this is what you wrote off-list:
E.g. XML is not ordered (you cannot be sure that the children will come out in any order). Most parsers respect 'document order', but this is not in the spec.
The list element was introduced afterwards because of this note. (We later decided on using "next" relations instead of a numbered list.) Why do you think a 'list' element is not required?
Correct, XML has no inherent order, which is all the more reason not to use the ',' as a joining element. Why try to require ordering that is hard to achieve? :)
3) The NML Group is, by its current definition, recursive: a Group is an NML NetworkObject, and a Group can contain NML NetworkObjects, thus including other Groups. I have a problem with such recursive definitions in RNC. At least the validator complains about patterns defined later on in the document. Can't I do that, or am I just doing something wrong? (I'm happy to provide off-list the URLs of the RNC schema and example topology file I'm currently working on, so you can see the errors for yourself.)
I don't understand this question/problem. Is this a problem of not being able to validate something, or is this just a perception problem where a recursive definition personally bothers you?
It is a problem parsing the RNC file. The error is: "Reference to undefined pattern BasePort." Apparently because BasePort is defined later than it is used.
I'll play a bit more; if I can't find a solution, I'll find some RNC-users mailing list, or contact you off-list.
Good luck finding a list on that topic. I would assume that even if you could find one, it would not be very active (RNC is not the most popular of schema languages). You are better off directing your comments here, where someone who has used this for NMC/NML/NM work can most likely answer. The perfSONAR-dev list is also a good candidate.
Messages from the parser (and which parser is being used/how it was invoked) would be helpful.
4) In the current RNC schema, extensibility was ensured using the "anyElement" rule. E.g. BasePortContent = NetworkObject & element capacity { xsd:float }? & anyElement* Unfortunately, the validator complained about this.
Was it a 'warning' or an 'error'? The two have different implications.
An error.
This is one of the dangers/benefits of using anyElement. In practice, define it as 'late' as possible in any rule set and the parser is smart enough to choose the longest match (e.g. location) first.
OK, so you suggest keeping the anyElements. I'll see if that works in my validator, or move to another validator.
I suggest you think about whether they make sense: if you can't think of a situation where an arbitrary blob of XML that is not defined in some other schema is required, then you won't need them. anyElement was introduced for situations where some parsing code wouldn't care about the XML inside some target element, e.g. a data element with some structure that you didn't want to dig into, but that you knew had to be well-formed XML. If this is common in the NML work, it's a good fit. If everything is always well defined (e.g. has a schema associated with it), you don't need it.
Thanks;
-jason

Jason Zurawski wrote:
My personal opinion would be to use '&' always, and avoid the ordering attempt since I still do not believe a 'list' element is required.
I'm fine with eliminating the list element, but to me this is unrelated to the ordering in RNC files.
In fact, I'm a bit confused now. We introduced explicit lists in late October, after you commented that XML has no inherent order. I can't find an on-list quote, but this is what you wrote off-list:
E.g. XML is not ordered (you cannot be sure that the children will come out in any order). Most parsers respect 'document order', but this is not in the spec.
The list element was introduced afterwards because of this note. (We later decided on using "next" relations instead of a numbered list.) Why do you think a 'list' element is not required?
Correct, XML has no inherent order, which is all the more reason not to use the ',' as a joining element. Why try to require ordering that is hard to achieve? :)
Because network paths are ordered... (However, I don't think that elements in an XML file should be ordered in a particular way -- e.g. latitude and longitude can go in any order -- these are two unrelated issues to me. Sorry to spell out the obvious here; I'm just trying to make sure we're on the same page here -- I think we are.)
I'll play a bit more, if I can't find a solution, I'll find some RNC-users mailing list, or contact you off-list.
Good luck finding a list on that topic, I would assume that even if you could find one it would not be very active (RNC is not the most popular of schema languages). You are better directing your comments here, and someone who has used this for NMC/NML/NM work can most likely answer. The perfSONAR-dev list is also a good candidate.
Thanks for the offer. I may take you up on it ;)
OK, so you suggest keeping the anyElements. I'll see if that works in my validator, or move to another validator.
I suggest you think about whether they make sense: if you can't think of a situation where an arbitrary blob of XML that is not defined in some other schema is required, then you won't need them. anyElement was introduced for situations where some parsing code wouldn't care about the XML inside some target element, e.g. a data element with some structure that you didn't want to dig into, but that you knew had to be well-formed XML.
If this is common in the NML work, it's a good fit. If everything is always well defined (e.g. has a schema associated with it), you don't need it.
If you have any advice here, please tell. For me this is about backward compatibility. Either we make the current schema strict and later revisions looser by adding more allowed elements, or we make the current schema loose and later revisions stricter. I presume that the second approach is more common, since that allows an XML document to validate against both the old and the new schema. Thus: add wildcards (in RNC: "anyElement" rules) to the current schema. Is this your thinking as well?
Freek
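The two strategies can be put side by side. The sketch below shows the schema-side route mentioned earlier in the thread, where a later revision adds an element with &= (interleave combine). The port wrapper element is a placeholder, and NetworkObject is omitted to keep the fragment self-contained.

```rnc
start = element port { BasePortContent }

# Original revision.
BasePortContent = element capacity { xsd:float }?

# Later revision or extension: "&=" interleaves the new pattern with the
# existing definition, so my_extension may appear before or after capacity.
BasePortContent &= element my_extension { xsd:string }?
```

Note the trade-off: a document using my_extension validates only against the extended schema, whereas a wildcard in the original schema would let the same document validate against both revisions.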

On Jun 20, 2011, at 4:24 PM, Freek Dijkstra wrote:
For me this is about backward compatibility. Either we make the current schema strict and later revisions looser by adding more allowed elements, or we make the current schema loose and later revisions stricter. I presume that the second approach is more common, since that allows an XML document to validate against both the old and the new schema. Thus: add wildcards (in RNC: "anyElement" rules) to the current schema. Is this your thinking as well?
That's what I'd like. It'd be nice if outside folks could define their own attributes for the objects, and the objects would still be usable by someone who didn't know about the new attributes; those attributes would just be ignored. Cheers, Aaron
Freek _______________________________________________ nml-wg mailing list nml-wg@ogf.org http://www.ogf.org/mailman/listinfo/nml-wg
Summer 2011 ESCC/Internet2 Joint Techs Hosted by the University of Alaska-Fairbanks http://events.internet2.edu/2011/jt-uaf

Hi Freek; On 6/20/11 4:24 PM, thus spake Freek Dijkstra:
Jason Zurawski wrote:
My personal opinion would be to use '&' always, and avoid the ordering attempt since I still do not believe a 'list' element is required.
I'm fine with eliminating the list element, but to me this is unrelated to the ordering in RNC files.
In fact, I'm a bit confused now. We introduced explicit lists in late October, after you commented that XML has no inherent order. I can't find an on-list quote, but this is what you wrote off-list:
E.g. XML is not ordered (you cannot be sure that the children will come out in any order). Most parsers respect 'document order', but this is not in the spec.
The list element was introduced afterwards because of this note. (We later decided on using "next" relations instead of a numbered list.) Why do you think a 'list' element is not required?
Correct, XML has no inherent order, which is all the more reason not to use the ',' as a joining element. Why try to require ordering that is hard to achieve? :)
Because network paths are ordered...
(However, I don't think that elements in an XML file should be ordered in a particular way -- e.g. latitude and longitude can go in any order -- these are two unrelated issues to me. Sorry to spell out the obvious here; I'm just trying to make sure we're on the same page here -- I think we are.)
I think that the relations give you exactly what you need (what we discussed on the last call), but I am not as close to this as yourself, Aaron, and others right now. If there are compelling reasons why relations are out, I have not seen them. I want to avoid a solution that will be technically hard to realize, so I would rather stick with things that I know work and avoid things that don't. Enforcing ordering in the schema doesn't make sense to me, and therefore trying to have well-ordered lists of XML doesn't either. If you have definitive reasons for introducing some explicitly ordered list concepts, please share them, along with examples that show why they are required.
Thanks;
-jason
participants (3): Aaron Brown, Freek Dijkstra, Jason Zurawski