033 | 04/03: Assert/Discriminator semantics. AP to document. TK to check uses of discriminator besides choice.
I believe the rules should be:
1. A point of uncertainty is any of:
   * an element which has minOccurs != maxOccurs
   * a choice
   * a sequence with sequenceKind="unordered"
2. Nested within the scope of a point of uncertainty, there might be other points of uncertainty.
3. A discriminator which evaluates to true resolves the nearest in-scope point of uncertainty.
4. An assertion which evaluates to false causes a processing error.
5. Any processing error (from an assertion failure or otherwise) will cause the parser to backtrack to the nearest unresolved point of uncertainty and try the next available branch, if any. If there are no more branches available, the parser will backtrack to the next nearest unresolved point of uncertainty.
6. A processing error which reaches the root tag is reported to the host application.
7. Assertions and discriminators are allowed on any point of uncertainty (not only on the branches of a choice).
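To make the interaction between rules 3, 4 and 5 concrete, here is a minimal sketch in Python. It is not DFDL and not taken from the specification: the names ProcessingError, PointOfUncertainty, check, try_branches and parse_field are all hypothetical, and the "N:" tag is an invented toy format. The rules above do not say what a discriminator that evaluates to false should do, so the sketch assumes it has no effect.

```python
class ProcessingError(Exception):
    """A parse failure, from a failed assertion or otherwise (rules 4 and 5)."""


class PointOfUncertainty:
    """Hypothetical model of a choice, optional/repeating element, etc."""

    def __init__(self, name):
        self.name = name
        self.resolved = False

    def discriminate(self, condition):
        # Rule 3: a discriminator that evaluates to true resolves this point.
        # Assumption: a false discriminator is a no-op in this sketch.
        if condition:
            self.resolved = True


def check(condition, message):
    """Assertion (rule 4): a false condition is a processing error."""
    if not condition:
        raise ProcessingError(message)


def try_branches(point, branches):
    """Rule 5: on error, try the next branch unless the point is resolved."""
    last_error = ProcessingError(f"no branches available for {point.name}")
    for branch in branches:
        try:
            return branch(point)
        except ProcessingError as err:
            if point.resolved:
                # Resolved points are skipped: the error travels outwards to
                # the next nearest unresolved point, or to the root (rule 6).
                raise
            last_error = err
    # No branches left: hand the last error to the enclosing point.
    raise last_error


def parse_field(text, use_discriminator):
    """Toy choice: either 'N:<integer>' or free text."""
    point = PointOfUncertainty("choice")

    def tagged_int(p):
        check(text.startswith("N:"), "tag 'N:' not found")
        p.discriminate(use_discriminator)  # we now know which branch this is
        value = text[2:]
        check(value.isdigit(), f"'{value}' is not an integer")
        return int(value)

    def free_text(p):
        return text

    return try_branches(point, [tagged_int, free_text])
```

With use_discriminator=False, the input "N:1Z" silently falls through to the free-text branch. With use_discriminator=True, the same input raises the "not an integer" error, because the choice was already resolved when the tag matched.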
Rationale:
If we only allow a discriminator on a choice branch, then it will be difficult to model this common style of message:
  Tagged header, minOccurs="1", maxOccurs="1"
  Untagged body, maxOccurs="unbounded"
  Tagged trailer, minOccurs="1", maxOccurs="1"
An example with 3 occurrences of the body would be:
HE,headerfield1,headerfield2,headerfield3
John Smith, 100, bodyfield3
John Brown, 200, bodyfield3
Elton John, 30Z, bodyfield3
TR,trailerfield1,trailerfield2,trailerfield3
And the DFDL schema would look something like this (excuse the almost inevitable errors, this is just for completeness):
...
<xs:element name="message">
  <xs:complexType dfdl:lengthKind="implicit">
    <xs:sequence dfdl:separator="\r\n">
      <!-- Tagged header: exactly one occurrence -->
      <xs:element name="header" dfdl:initiator="HE,">
        <xs:complexType>
          <xs:sequence dfdl:separator=",">
            <xs:element name="header1" type="xs:string"/>
            <xs:element name="header2" type="xs:string"/>
            <xs:element name="header3" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <!-- Untagged, repeating body: a point of uncertainty (minOccurs != maxOccurs) -->
      <xs:element name="body" maxOccurs="unbounded">
        <xs:complexType>
          <xs:sequence dfdl:separator=",">
            <xs:element name="body1" type="xs:string"/>
            <xs:element name="body2" type="xs:int"/>
            <xs:element name="body3" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <!-- Tagged trailer: exactly one occurrence -->
      <xs:element name="trailer" dfdl:initiator="TR,">
        <xs:complexType>
          <xs:sequence dfdl:separator=",">
            <xs:element name="trailer1" type="xs:string"/>
            <xs:element name="trailer2" type="xs:string"/>
            <xs:element name="trailer3" type="xs:string"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The '30Z' value for element body2 in the final occurrence of the body is incorrect. It is not a valid integer, and will trigger a processing error.
Without a discriminator, this failure will cause the parser to backtrack to the repeating body element (the nearest unresolved point of uncertainty) and try the next element (the trailer element) against the same line. The initiator will not be found, and the reported error will be "Initiator 'TR,' not found for element 'trailer'". The user would almost certainly prefer "Invalid value '30Z' for element 'body2'. Value could not be converted to simple type 'xs:int'".
For this example, the discriminator would need to detect unambiguously that it really was dealing with a Body element and not a Trailer element. Due to the message style (which is quite common), the only way to do this is to detect that it is *not* a Trailer. I cannot think of an elegant way to do that using the facilities in v0.33 of the specification. I have raised this with Alan and Steve.
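The two error reports discussed above can be reproduced with a small sketch in Python. This is only an illustration under stated assumptions: it hard-codes the header/body/trailer layout of the example rather than reading a DFDL schema, and the is_body argument is a hypothetical stand-in for a discriminator, written as an ordinary predicate because v0.33 offers no obvious way to express "not a Trailer". The names ProcessingError, parse_body, parse_trailer, parse_message and report are invented for the sketch.

```python
class ProcessingError(Exception):
    pass


MESSAGE = [
    "HE,headerfield1,headerfield2,headerfield3",
    "John Smith, 100, bodyfield3",
    "John Brown, 200, bodyfield3",
    "Elton John, 30Z, bodyfield3",
    "TR,trailerfield1,trailerfield2,trailerfield3",
]


def parse_body(line):
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 3:
        raise ProcessingError(
            f"Expected 3 fields for element 'body', found {len(fields)}")
    try:
        number = int(fields[1])
    except ValueError:
        raise ProcessingError(
            f"Invalid value '{fields[1]}' for element 'body2'. "
            "Value could not be converted to simple type 'xs:int'") from None
    return (fields[0], number, fields[2])


def parse_trailer(line):
    if not line.startswith("TR,"):
        raise ProcessingError("Initiator 'TR,' not found for element 'trailer'")
    return line[3:].split(",")


def parse_message(lines, is_body=None):
    """is_body: hypothetical discriminator predicate, or None for no discriminator."""
    if not lines or not lines[0].startswith("HE,"):
        raise ProcessingError("Initiator 'HE,' not found for element 'header'")
    header = lines[0][3:].split(",")
    bodies = []
    pos = 1
    while pos < len(lines):
        # A true discriminator resolves this occurrence of the body element.
        resolved = is_body is not None and is_body(lines[pos])
        try:
            bodies.append(parse_body(lines[pos]))
        except ProcessingError:
            if resolved:
                raise  # known to be a body: report the real error
            break      # unresolved: backtrack and try the trailer instead
        pos += 1
    if pos >= len(lines):
        raise ProcessingError("Initiator 'TR,' not found for element 'trailer'")
    return header, bodies, parse_trailer(lines[pos])


def report(lines, is_body=None):
    """Return the error text that would reach the host application, or None."""
    try:
        parse_message(lines, is_body)
    except ProcessingError as err:
        return str(err)
    return None
```

Calling report(MESSAGE) gives the unhelpful trailer error, while report(MESSAGE, is_body=lambda line: not line.startswith("TR,")) gives the body2 conversion error. The predicate is exactly the "detect that it is not a Trailer" test, which is what makes it inelegant: the body has to know about the trailer's initiator.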
regards,
Tim Kimber, Common Transformation Team, Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 246742