Interesting thread.
I agree that the xsd is a valid DFDL
xsd. No problem with same-named siblings as long as they obey XML Schema
rules.
Steve has raised a related question: what is dfdl:occursCount() for?
How does it differ from XPath's fn:count()?
We could define it like this:
- fn:count() returns the count of all
occurrences from all element declarations.
- dfdl:occursCount() returns the number
of occurrences of the current DFDL array. That means the current element
declaration, excluding occurrences from same-named previous siblings.
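
To make the proposed distinction concrete, here is a minimal sketch in
Python. It is not taken from the DFDL spec or from any DFDL processor:
the Occurrence record, the decl_index field and both helper functions
are hypothetical names invented for illustration, and the sample data
assumes a sequence with two same-named "foo" declarations separated by
a "bar".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    name: str        # element name as it appears in the infoset
    decl_index: int  # hypothetical: position of the element declaration in the schema
    value: int


def count_by_name(siblings, name):
    """fn:count()-like: every occurrence with this name, from any declaration."""
    return sum(1 for o in siblings if o.name == name)


def occurs_count(siblings, name, decl_index):
    """Proposed dfdl:occursCount()-like: occurrences of one declaration only."""
    return sum(1 for o in siblings
               if o.name == name and o.decl_index == decl_index)


# Assumed data: declaration 0 is a "foo" array of 2, declaration 1 is "bar",
# declaration 2 is a second, scalar "foo".
siblings = [
    Occurrence("foo", 0, 1),
    Occurrence("foo", 0, 2),
    Occurrence("bar", 1, 3),
    Occurrence("foo", 2, 4),
]

print(count_by_name(siblings, "foo"))    # 3
print(occurs_count(siblings, "foo", 0))  # 2
print(occurs_count(siblings, "foo", 2))  # 1
```

The point of the sketch is only that the second function needs to know
which declaration produced each occurrence, which is information a plain
name-based count does not use.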
Why is this useful? I don't know. But
I do know that a DFDL array involves exactly one element declaration, whereas
fn:count() can involve two or more element declarations. Furthermore, the
DFDL properties (especially dfdl:occursCountKind) might be different
on the two same-named element declarations, so it is possible that the
author of the DFDL schema might want to treat them differently.
It may sound unlikely that two element
declarations will have the same name and different DFDL array properties
- but I reckon I could invent some plausible scenarios where it could happen.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: Suman Kalia <kalia@ca.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 01/03/2013 08:56
Subject: Re: [DFDL-WG] is this legal
Sent by: dfdl-wg-bounces@ogf.org
Additionally, UPA (Unique Particle Attribution) rules apply. Your example
is fine as long as the first "foo" and "bar" do not have minOccurs '0'.
Using your example, in standard XPath the path expression "foo"
would return a sequence of length 2.
A more interesting example is:
<sequence>
<element name="foo" type="int" dfdl:lengthKind="explicit"
dfdl:length="1" minOccurs="2" maxOccurs="2"/>
<element name="bar" type="int" dfdl:lengthKind="explicit"
dfdl:length="1" minOccurs="0"/>
<element name="foo" type="int" dfdl:lengthKind="explicit"
dfdl:length="1"/>
</sequence>
In standard XPath the path expression would now return a sequence of length
3, as it would just lift the 3 occurrences from the infoset. Note they
could all be adjacent if "bar" was not in the data.
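
As a rough check of that claim, here is a small Python sketch using the
standard library's xml.etree.ElementTree. Two caveats: ElementTree only
supports a limited XPath subset rather than XPath 2.0, and the two
infosets (the "root" wrapper and the values) are made up for
illustration, assuming the schema above with "bar" present and absent.

```python
import xml.etree.ElementTree as ET

# Hypothetical infosets for the schema above: two "foo" from the first
# declaration, an optional "bar", then one "foo" from the second declaration.
with_bar = ET.fromstring(
    "<root><foo>1</foo><foo>2</foo><bar>3</bar><foo>4</foo></root>"
)
without_bar = ET.fromstring(
    "<root><foo>1</foo><foo>2</foo><foo>4</foo></root>"
)

for label, root in (("with bar", with_bar), ("without bar", without_bar)):
    # The path "foo" selects every child named foo, with no regard to
    # which element declaration each occurrence came from.
    foos = root.findall("foo")
    print(label, len(foos), [f.text for f in foos])
```

In both cases the path selects three nodes, and nothing in the result
says which declaration an occurrence belongs to.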
Given the examples, I don't see how a DFDL path expression can distinguish
between the different occurrences of elements with the same name. There
is no way in XPath to ask for a count of the number of element occurrences
that match a specific element declaration, because there is no way in the
language to identify such an element.
The DFDL spec in section 23 says "DFDL expressions never return
node-sequences having more than one node. DFDL expressions either return
a simple value, a node sequence containing exactly one node/value, or an
empty node sequence." and "The result of evaluating the expression must
be a single atomic value of the type expected by the context, and it is
a schema definition error otherwise. Some XPath expressions naturally
return a sequence of values, and in this case it is also schema
definition error if an expression returns a sequence containing more
than one item".
That talks about what is ultimately returned by a DFDL expression. Later
it says "(Note that DFDL v1.0 does not support sequences of length > 1.)".
And says "DFDL implementations may use off-the-shelf XPath 2.0
processors, but will need to pre-process DFDL expressions to ensure that
the behaviour matches the DFDL specification: Wrap path locations in a
call to fn:exactly-one() except when the path location occurs within
certain functions which operate on arrays".
We also said on a recent WG call that dfdl:occursCount() is allowed on
non-arrays.
If the real requirement here is that a DFDL expression should not return
a sequence of length > 1, then is there a problem with allowing
intermediate steps to return sequences of length > 1, as long as the
final result is not of length > 1? Then, couldn't we drop
dfdl:occursCount() and just use fn:count()? Are we just making things
hard for implementers?
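
A toy Python sketch of the idea, not of any real XPath processor:
exactly_one() and count() below are stand-ins written for this note that
only imitate the behaviour of fn:exactly-one() and fn:count() on a plain
list, and the list values are assumed.

```python
def exactly_one(seq):
    """Imitate fn:exactly-one(): return the single item, else raise."""
    if len(seq) != 1:
        raise ValueError(f"expected exactly one item, got {len(seq)}")
    return seq[0]


def count(seq):
    """Imitate fn:count(): the number of items in the sequence."""
    return len(seq)


# Assumed intermediate result of the path "foo": a sequence of length 3.
foo = [1, 2, 4]

# The intermediate step has length 3, but the final result is one value.
result = count(foo)

# Wrapping the same path in exactly_one() rejects it outright.
try:
    exactly_one(foo)
    rejected = False
except ValueError:
    rejected = True

print(result, rejected)  # 3 True
```

So a rule applied only to the final result would accept count(foo),
while wrapping every path location in fn:exactly-one() would not.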
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Suman Kalia <kalia@ca.ibm.com>
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 01/03/2013 01:11
Subject: Re: [DFDL-WG] is this legal
Sent by: dfdl-wg-bounces@ogf.org
This is certainly allowed in XML Schema. In a sequence you can have
multiple elements with the same name as long as their types are
identical, which is the case in your example. I think from an XPath
perspective it would be treated like an array, and if so
dfdl:occursCount should return 2.
Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923 T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message broker
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 02/28/2013 07:44 PM
Subject: [DFDL-WG] is this legal
Sent by: dfdl-wg-bounces@ogf.org
I can't find clarity on this:
<sequence>
<element name="foo" type="int" dfdl:lengthKind="explicit"
dfdl:length="1"/>
<element name="bar" type="int" dfdl:lengthKind="explicit"
dfdl:length="1"/>
<element name="foo" type="int" dfdl:lengthKind="explicit"
dfdl:length="1"/>
<element name="bar" type="int" dfdl:lengthKind="explicit"
dfdl:length="1"/>
</sequence>
Is this allowed?
If so, then the XPath for accessing the 2nd foo would be foo[2], and the
path "foo" would either be ambiguous or could be treated as identifying
an array. In which case one could write the expression
dfdl:occursCount("foo") and get back 2?
Or am I completely missing the boat here?
--
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU