It is certainly easier if we can just do
the same as XPath 2.0 stipulates. But I think that this misses the point
here.
The XPath error for statically detecting that an expression refers to
something that can never exist is XPST0008, which says:
"It is a static error if an expression refers to an element name,
attribute name, schema type name, namespace prefix, or variable name
that is not defined in the static context, except for an ElementName
in an ElementTest or an AttributeName in an AttributeTest."
The static context has the notion of "in-scope schema definitions",
which is "a generic term for all the element declarations, attribute
declarations, and schema type definitions that are in scope during
processing of an expression". It doesn't define exactly what is meant
by "in scope", but XPath assumes that it acts on a complete instance
of an XDM.
In DFDL we differ from typical XPath usage because we apply expressions
during parsing, when the document is incomplete. We can use that as the
justification for applying extra constraints, which is exactly why there
are additional rules in section 23.1.
So, if there are scenarios where a rule is
going to be restrictive then we should consider dropping it. If there are
not, but it makes the life of an implementer harder because it is hard
to code the rule, then we should consider dropping it. Otherwise keep it.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 11/04/2014 14:03
Subject: Re: [DFDL-WG] validating expressions on elements in a choice or unordered sequence
Sent by: dfdl-wg-bounces@ogf.org
I would be quite uncomfortable with DFDL not being a 'proper subset' of
XPath 2.0. I understand the motivation (having personally been involved
in coding a query engine for DFDL) but I think the cure would be worse
than the complaint. Consistent with that, I think I agree with Mark's
suggestion: a DFDL processor should just 'do what an XPath processor
would do'.
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Mark Frost/UK/IBM@IBMGB
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/04/2014 13:23
Subject: Re: [DFDL-WG] validating expressions on elements in a choice or unordered sequence
Sent by: dfdl-wg-bounces@ogf.org
Comments inline
On Fri, Apr 11, 2014 at 6:22 AM, Mark Frost <FROSTMAR@uk.ibm.com>
wrote:
When we were implementing unordered sequences, this raised some questions
around evaluating relative paths in expressions, for elements in a choice
or unordered sequence :
DFDL spec: (gwdrp-dfdl-v1.0.4 section 15)
"When processing a choice group the parser validates
any contained path expressions. If a path
expression contained inside a choice branch refers
to any other branch of the choice, then it is a
schema definition error."
1. I'm not clear what benefit this restriction
on path expressions gives.
It seems redundant since in any single instance of a choice group, if the
branch being processed exists, then by definition none of its sibling
branches exist. Any expression path referring to a non-existent branch
would correctly return <empty sequence>.
Typically in XPath, such paths would just be empty-sequence at runtime.
Making it an SDE hoists the error to (hopefully) compile time, and making
it SDE (non-recoverable) changes the way one must write expressions. You
can't write utter nonsense paths and have them be runnable.
If the choice group is inside a repeating structure, then expressions referring
to choice branches within other instances of the choice could be
useful.
Should an expression referring to branches in other instances of
a choice cause a schemadef error?
Should be no issue if you are looking at say, position() - n. If you reach
to something that doesn't exist, then you'll get empty sequence.
My experience so far with XPath is that this notion that nonexistence
returns empty sequence is painful at best and a nightmare at worst. Expressions
that are utter nonsense are accepted, executed, and silently fail by returning
empty sequence. The most common mistake is writing /a/b/c when you
needed /ns1:a/ns2:b/ns3:c.
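To make the silent-failure point concrete, here is a minimal sketch using
Python's xml.etree.ElementTree, which supports only a limited XPath subset
and is not a DFDL or XPath 2.0 processor. The document and the namespace
URIs are made up for illustration. The unqualified path is accepted and
simply matches nothing, while the qualified path finds the node.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the namespace URIs are illustrative only.
DOC = (
    '<ns1:a xmlns:ns1="urn:example:ns1" xmlns:ns2="urn:example:ns2" '
    'xmlns:ns3="urn:example:ns3">'
    "<ns2:b><ns3:c>42</ns3:c></ns2:b>"
    "</ns1:a>"
)
NS = {"ns2": "urn:example:ns2", "ns3": "urn:example:ns3"}

root = ET.fromstring(DOC)

# Paths are relative to the root element <ns1:a>.
# Forgetting the prefixes is not an error: the result is just empty.
unqualified = root.findall("b/c")

# The qualified path selects the intended element.
qualified = root.findall("ns2:b/ns3:c", NS)

print(len(unqualified), len(qualified))
```

Nothing in the first call signals that the path can never match, which is
the behaviour a schema definition error would surface earlier.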
Example
expression on el_b could be { fn:count(../../el_choice/el_a) }
- parent [sequence]
  - el_choice [minOccurs=5 maxOccurs=5] [choice]
    - el_a
    - el_b
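A rough sketch of how that count behaves while the document is still being
built, again with Python's xml.etree.ElementTree standing in for a real
DFDL infoset. The branch order chosen for the five occurrences is invented,
and the path is evaluated from parent because ElementTree cannot walk up
from the start element; it is meant to be equivalent to
fn:count(../../el_choice/el_a) evaluated on el_b.

```python
import xml.etree.ElementTree as ET

# Hypothetical branch taken by each of the 5 occurrences of el_choice.
branches = ["el_a", "el_b", "el_a", "el_b", "el_a"]

parent = ET.Element("parent")
counts_seen_by_el_b = []

for name in branches:
    # The parser appends one occurrence of el_choice at a time.
    el_choice = ET.SubElement(parent, "el_choice")
    ET.SubElement(el_choice, name)
    if name == "el_b":
        # Expression on el_b: count el_a in the occurrences parsed so far.
        counts_seen_by_el_b.append(len(parent.findall("el_choice/el_a")))

# Count once the whole document exists.
final_count = len(parent.findall("el_choice/el_a"))

print(counts_seen_by_el_b, final_count)
```

Each el_b only sees the el_a branches of earlier occurrences, so the value
depends on how much of the document has been parsed when the expression is
evaluated.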
2. Should an expression that potentially
refers to branches in the choice cause a schemadef error?
Example
identically named elements in and out of a choice
expression on el_c could be { fn:count(../el_a) }
- parent [sequence]
  - el_a
  - el_b
  - [embedded choice group]
    - el_a
    - el_c
I'd love to restrict this, because we're looking at having to create a
DFDL expression language implementation for performance reasons, and complex
things like this require a very complex implementation tantamount to a
query-engine.
I would claim that these two el_a elements are different, and we could
choose to restrict a DFDL path expression to return only nodes described
by the same schema component, with "same schema component" meaning
same path from document element to the schema component where an element
or group or type reference counts as part of that path. So two different
element references to the same global element would be two different schema
components.
But I suspect that this is too restrictive, and implementations are just
going to have to be sophisticated enough to execute queries like this one,
and a good implementation will optimize simpler cases for faster execution.
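For what it's worth, the "different schema components" distinction can be
sketched statically. The following is only an illustration in Python, not
how any DFDL implementation does it: the Node class, the index-tuple
"schema path" and both helper functions are hypothetical, and occurrence
counts are ignored. It resolves the name el_a against the content model of
parent from the second example and reports which candidate sits in another
branch of the choice that holds el_c.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Path = Tuple[int, ...]

@dataclass
class Node:
    kind: str  # "element", "sequence" or "choice"
    name: str = ""
    children: List["Node"] = field(default_factory=list)

def element_candidates(group: Node, path: Path = ()) -> Iterator[Tuple[Path, Node]]:
    """Yield (schema path, declaration) for elements that can appear as
    direct children of the enclosing element, looking through model groups."""
    for i, child in enumerate(group.children):
        if child.kind == "element":
            yield path + (i,), child
        else:
            yield from element_candidates(child, path + (i,))

def node_at(root: Node, path: Path) -> Node:
    node = root
    for i in path:
        node = node.children[i]
    return node

def mutually_exclusive(root: Node, a: Path, b: Path) -> bool:
    """True if a and b lie in different branches of a common choice."""
    common = 0
    while common < min(len(a), len(b)) and a[common] == b[common]:
        common += 1
    if common in (len(a), len(b)):
        return False  # same component, or one contains the other
    return node_at(root, a[:common]).kind == "choice"

# Content model of parent: sequence(el_a, el_b, choice(el_a, el_c)).
content = Node("sequence", children=[
    Node("element", "el_a"),
    Node("element", "el_b"),
    Node("choice", children=[Node("element", "el_a"), Node("element", "el_c")]),
])

host = (2, 1)  # schema path of el_c, which carries the expression
cands = [p for p, n in element_candidates(content) if n.name == "el_a"]
exclusive = [mutually_exclusive(content, host, p) for p in cands]

print(cands, exclusive)
```

Under these assumptions ../el_a has two candidate declarations, and only
the one inside the choice can never coexist with el_c.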
...mikeb
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU