clarification needed: assert evaluation order and arrays

Question: If I write

  <element name="foo" minOccurs="5" maxOccurs="10" dfdl:occursCountKind='parsed'>
    <annotation><appinfo...>
      <dfdl:assert>{ dfdl:checkConstraints(.) }</dfdl:assert>
    </appinfo></annotation>
    <simpleType>
      <restriction base="xs:string">
        <pattern value="...some regex..."/>
      </restriction>
    </simpleType>
  </element>

I have two sources of constraints. One is the pattern, the other the min/max occurs.

Does that one assertion calling dfdl:checkConstraints mean both will be checked? That is, one check occurring as each element is parsed, and the other at the end of the array?

Assume I am not using any validation option, so the DFDL processor would not otherwise check the max/minOccurs because occursCountKind is parsed.

Will the checkConstraints fail as soon as we parse the 11th element (is it checking the min/max occurs for each element occurrence as it is parsed), or do we parse as many as we can, and fail only when we check and find out that the entire array has 36 elements that were successfully parsed?

What I would like the above to mean is this:

1) as each element occurrence is parsed we check the pattern and parse error (assertion failed) if there is no match.
2) Also after a successful parse of an occurrence, we check that the index is <=10, and parse-error (assertion failed) if not.
3) at the end of the array, we check that the number of occurrences is >= 5. If not we get a parse error (assertion failed).

Comments?

...mikeb

-- Mike Beckerle | OGF DFDL WG Co-Chair Tel: 781-330-0412
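
The three desired checks can be sketched outside DFDL as a small Python model. This is only an illustration of the order of the checks, not of how any DFDL processor works: the names parse_array and AssertionFailed are hypothetical, the occurrences are assumed to be already split into strings, and speculative parsing and backtracking are ignored.

```python
import re


class AssertionFailed(Exception):
    """Stands in for a parse error caused by a failed assert (hypothetical name)."""


def parse_array(occurrences, pattern, min_occurs, max_occurs):
    """Apply the three desired checks to already-split occurrences."""
    regex = re.compile(pattern)
    parsed = []
    for index, text in enumerate(occurrences, start=1):
        # 1) Pattern facet, checked as each occurrence is parsed.
        #    XSD patterns must match the whole value, hence fullmatch.
        if regex.fullmatch(text) is None:
            raise AssertionFailed(f"occurrence {index} does not match the pattern")
        # 2) Upper bound, checked per occurrence, so the failure
        #    comes at occurrence max_occurs + 1 and not at the end.
        if index > max_occurs:
            raise AssertionFailed(f"occurrence {index} exceeds maxOccurs={max_occurs}")
        parsed.append(text)
    # 3) Lower bound, which can only be checked once the array has ended.
    if len(parsed) < min_occurs:
        raise AssertionFailed(f"only {len(parsed)} occurrences, minOccurs={min_occurs}")
    return parsed
```

Under this model an input with 36 matching occurrences fails at the 11th, and an input with 4 matching occurrences fails only after the last one.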

You can't do that with one assert, I'm afraid. The main issue is that a path location must always return exactly one item, so its use within the context of an array must be in conjunction with a predicate, which must refer to a single instance. That's what dfdl:occursIndex() is for.

To check that an array meets its bounds, you need a separate assert that uses the dfdl:occursCount() function. You'd have to hard-code the min/max values, and place it on a containing object.

  <sequence>
    <annotation><appinfo...>
      <dfdl:assert>{ dfdl:occursCount(./foo) ge 5 and dfdl:occursCount(./foo) le 10 }</dfdl:assert>
    </appinfo></annotation>
    <element name="foo" minOccurs="5" maxOccurs="10" dfdl:occursCountKind='parsed'>
      <annotation><appinfo...>
        <dfdl:assert>{ dfdl:checkConstraints(.[dfdl:occursIndex()]) }</dfdl:assert>
      </appinfo></annotation>
      <simpleType>
        <restriction base="xs:string">
          <pattern value="...some regex..."/>
        </restriction>
      </simpleType>
    </element>
  </sequence>

I think that means our definition of dfdl:checkConstraints() in the spec is wrong. Table 34 and Section 5.2 together imply that minOccurs and maxOccurs are used in checkConstraints. I think that simply doesn't work. It should only be using 'fixed' and the facets (I think 'default' is pointless as well).

Worth noting that when a path is used in the context dfdl:occursCount(<path>) then it is not an error if more than one item is returned, and it should not be wrapped with fn:exactly-one().

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

In reply to: Mike Beckerle <mbeckerle.dfdl@gmail.com>, to dfdl-wg@ogf.org, 01/11/2012 16:45, Subject: [DFDL-WG] clarification needed: assert evaluation order and arrays (sent by dfdl-wg-bounces@ogf.org)

-- dfdl-wg mailing list dfdl-wg@ogf.org https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
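
The difference between an ordinary path and a path inside dfdl:occursCount() can be illustrated with a small Python model. This is a sketch under stated assumptions, not DFDL: a path result is modelled as a plain list, and the helper names exactly_one, occurs_count and sequence_assert are hypothetical. The bounds 5 and 10 are hard-coded, as in the assert on the containing sequence.

```python
def exactly_one(items):
    """Model of the rule that a path location must yield exactly one item."""
    if len(items) != 1:
        raise ValueError(f"path returned {len(items)} items, expected exactly one")
    return items[0]


def occurs_count(items):
    """Model of dfdl:occursCount(<path>): a multi-item result is allowed here."""
    return len(items)


def sequence_assert(foo_occurrences, low=5, high=10):
    """Model of the bounds assert placed on the containing sequence."""
    n = occurs_count(foo_occurrences)
    return low <= n <= high
```

In this model the bounds test sees the whole array at once, so an array of 36 occurrences is rejected only when the containing assert is evaluated, assuming that happens after all occurrences have been parsed.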

In reply to: Steve Hanson <smh@uk.ibm.com>, Thu, Nov 1, 2012 at 1:39 PM

Do I understand this right?

I have always thought that "." refers to the current occurrence (i.e., the [dfdl:occursIndex()] is implied if the current element is an array). Hence, if I have an array of a complex type with children named a and b, then "./a" is a valid expression referring to the current instance's child a.

Similarly, if I type "../bar", and my parent is an array, then ".." moves me up to the current instance within that array.

Upward paths are unambiguous. Current node paths are unambiguous. Only downward paths that encounter an array would need indexing brackets.

I think lots of things break if the above isn't the rule. E.g., you can't take an element and add minOccurs/maxOccurs to it to make it repeat without all the paths written on it being broken.

If I am right, then there is no problem with making dfdl:checkConstraints(.) take the node, ask if it is an array, and if so compare its occursIndex to maxOccurs.

I do take your point that the secondary check for minOccurs has to be expressed differently with a separate assert.

The fact that dfdl:occursCount(path) takes a node-set-valued path is ... well, really annoying, but inevitable. We need to watch for these exceptions.

-- Mike Beckerle | OGF DFDL WG Co-Chair Tel: 781-330-0412
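
The path rule argued for in this message (".", and ".." always name a single occurrence; only downward steps into an array need an index) can be sketched as a tiny Python tree model. It is an illustration of the proposed rule only: the Node class and the functions self_path, parent_path and child_path are hypothetical names, and nothing here reflects how a DFDL processor represents its infoset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    name: str
    parent: Optional["Node"] = None
    children: Dict[str, List["Node"]] = field(default_factory=dict)

    def add(self, name):
        """Append one more occurrence of a child called `name`."""
        child = Node(name, parent=self)
        self.children.setdefault(name, []).append(child)
        return child


def self_path(current):
    # "." is the occurrence being evaluated, so no index is needed.
    return current


def parent_path(current):
    # ".." is the one enclosing occurrence, even when the parent is an array.
    return current.parent


def child_path(current, name, index=None):
    # A downward step may reach an array; then a 1-based index is required.
    matches = current.children.get(name, [])
    if index is not None:
        if not 1 <= index <= len(matches):
            raise IndexError(f"{name}[{index}] is out of range")
        return matches[index - 1]
    if len(matches) != 1:
        raise ValueError(f"{name}: {len(matches)} items, index required")
    return matches[0]
```

With a root that has two occurrences of a child rec, each holding one child a, child_path(rec2, "a") resolves without an index, while child_path(root, "rec") is rejected until an index such as 2 is supplied.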
participants (2)
- Mike Beckerle
- Steve Hanson