I have a long-standing concern about
the usability of dfdl:lengthKind, which others in IBM are also encountering
when modeling real-life formats such as EDI.
My main concern is illustrated below.
For example, what are the semantics of setting different values on the element
and the sequence?
<xs:element name="container"
            dfdl:lengthKind="implicit">
  <xs:complexType>
    <xs:sequence dfdl:separator="@"
                 dfdl:lengthKind="implicit">
      <xs:element name="one" type="xs:string" dfdl:lengthKind="delimited"/>
      <xs:element name="two" type="xs:string" dfdl:lengthKind="delimited"/>
      <xs:element name="three" type="xs:string" dfdl:lengthKind="delimited"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
The ambiguity becomes even more noticeable if I also set
a scoping dfdl:lengthKind on the complex type.
I propose that we limit dfdl:lengthKind
to elements only. This means that the length of an xs:sequence or xs:choice
is always implicitly given by its children, and if you want to provide
an explicit length or a length prefix you must use a complex element to
wrap the sequence or choice. We have looked at the implications for dfdl:choiceKind
on choices and dfdl:occursKind on arrays, and the proposal works well
in those scenarios.
There's an analogy here with not allowing
sequences and choices to repeat, only elements.
It also simplifies the grammar, in the
sense that any excess fill characters in a 'box' are always considered
part of the element when parsing.
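As a rough sketch of how the earlier example might look under this proposal:
the sequence carries no dfdl:lengthKind, and the wrapping element is the only
place a length is stated. The choice of dfdl:lengthKind="explicit" and the
dfdl:length value of 20 are assumptions made purely for illustration, and
namespace declarations are omitted as in the fragment above.

```xml
<!-- Illustrative only: the explicit length of 20 is an assumed value -->
<xs:element name="container"
            dfdl:lengthKind="explicit" dfdl:length="20">
  <xs:complexType>
    <!-- No dfdl:lengthKind on the sequence: its length is implied by its children -->
    <xs:sequence dfdl:separator="@">
      <xs:element name="one" type="xs:string" dfdl:lengthKind="delimited"/>
      <xs:element name="two" type="xs:string" dfdl:lengthKind="delimited"/>
      <xs:element name="three" type="xs:string" dfdl:lengthKind="delimited"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With this shape, data such as a@b@c would occupy 5 characters of the 20-character
'box', and the remaining 15 fill characters would belong to the container element
rather than to the sequence.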
I'd like to discuss this on today's
call.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848