Fw: [dfdl-wg] Ambiguous XPaths to hidden elements

I think there is a wider issue here that we need to consider before we can answer this question: what do we plan to say in the DFDL standard about the output we create? Let me explain this further.

In IBM we have several different products that could exploit a DFDL parser, but each of these products has a different way of exposing the output to the user, normally some kind of DOM-like tree model with an accompanying API. This is almost certainly the case elsewhere too, because there is no equivalent standard to DOM for non-XML data. A DFDL parser must therefore provide a SAX-like interface and generate events, leaving it up to the caller to create the tree model (even if it also provides the capability to build an XML DOM model).

However, both the use of XPath during the parse and hidden elements pose problems for a DFDL parser.

- All the XPath implementations I have encountered are bound to the type of tree that is being created: DOM, SDO or WMB, for example. With SAX there is no standard tree.
- Hidden elements will not, by definition, appear in a DOM tree.

So even in DOM mode a DFDL parser needs some internal mechanism for retaining parsed values that could be referenced by XPath, and its XPath implementation must therefore use this mechanism. Note that it should retain only the values of those elements that it knows could be referenced by XPath, i.e., it is a sparse mechanism. Otherwise the amount of memory used for large messages becomes unacceptable.

An alternative is to force the caller to provide a handler that gets invoked when XPath needs to access the caller's tree (which implies disallowing XPath for hidden elements). I don't think a handler approach would be acceptable, even if a DOM alternative were provided.

Perhaps it is time to give some thought to the API aspect of DFDL?

(The implication above is that an existing XPath implementation can't be used as is. True, but it is only the tree/model interface that needs changing, and this is an internal detail. I'm still not happy with 2a/b/c below, which change the XPath externals. I prefer 1 below.)

Regards, Steve

Steve Hanson
WebSphere Message Brokers, IBM Hursley, England
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 19/01/2006 09:53 -----

From: Steve Hanson/UK/IBM
Date: 19/01/2006 09:43
To: "Westhead, Martin (Martin)" <westhead@avaya.com>
cc: dfdl-wg@ggf.org, owner-dfdl-wg@ggf.org
Subject: Re: [dfdl-wg] Ambiguous XPaths to hidden elements

As a DFDL parser implementor I do not want modifications to the XPath syntax. I want to be able to reuse existing XPath implementations. Modified syntax is also something else for the user to have to learn. So 2a/b/c are not attractive.

Regards, Steve

From: "Westhead, Martin (Martin)" <westhead@avaya.com>
Sent by: owner-dfdl-wg@ggf.org
Date: 18/01/2006 20:24
To: <dfdl-wg@ggf.org>
Subject: [dfdl-wg] Ambiguous XPaths to hidden elements

Hi folks,

This is to try to pick up on the issue identified by Suman in today's call.

The Issue

Consider the following example:

    <xs:element name="root">
      <xs:complexType>
        <xs:sequence>
          <xs:annotation><xs:appinfo source="http://dataformat.org">
            <hidden>
              <xs:element name="repeats" type="xs:integer"/>
            </hidden>
          </xs:appinfo></xs:annotation>
          <xs:element name="testElement" type="xs:integer"
                      minOccurs="0" maxOccurs="unbounded"
                      dfdl:repeatCount="../repeats"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

The problem is that the path "../repeats" can be broken by modifications to the logical model that introduce a name clash on "repeats", and cases can be constructed where this would not be obvious to a user.

Possible Solutions

Possible fixes to this include:

1. Disallow XPath references to hidden elements; the user is forced to place the material into the global context to refer to it.
2. Provide a special XPath operator to indicate we are referencing a hidden element. Possibilities include:
   a. "../hidden(repeats)"
   b. "hidden(../repeats)"
   c. "../dfdl:hidden/repeats"
3. Only allow hidden elements to be present in top-level global complex types. These can then be included where needed. (This is the solution that Suman was pushing, but I don't fully understand it; in particular I don't see how it resolves the ambiguity issue.)

I believe my preference here is 2a or 2b, followed by 1.

Comments/suggestions/opinions?

Thanks,
Martin
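
The name clash described in the original message can be illustrated with a small sketch. It assumes a toy in-memory tree rather than any real DFDL or XPath implementation: the Node class, its hidden flag, and the resolve_sibling helper are all hypothetical names introduced here, and the helper handles only the single path shape "../name".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One parsed element; `hidden` marks elements declared inside <hidden>.

    Hypothetical toy model, not a DFDL or DOM API.
    """
    name: str
    value: Optional[int] = None
    hidden: bool = False
    # Excluded from repr/compare to avoid recursing through parent <-> children.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach `child` under this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def resolve_sibling(context: Node, name: str) -> List[Node]:
    """Resolve the relative path '../<name>' starting from `context`."""
    if context.parent is None:
        return []
    return [c for c in context.parent.children if c.name == name]


# Original model: only the hidden 'repeats' exists, so '../repeats' is unambiguous.
root = Node("root")
root.add(Node("repeats", value=3, hidden=True))
test_element = root.add(Node("testElement"))
before = resolve_sibling(test_element, "repeats")

# Later change to the logical model: a visible element also named 'repeats'.
root.add(Node("repeats", value=7))
after = resolve_sibling(test_element, "repeats")

print(len(before), len(after))
```

In this sketch the same path matches one node before the model change and two nodes afterwards, one hidden and one visible, which is the ambiguity that options 1 to 3 are trying to rule out.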