From last week's call:
7. Recursive use of DFDL for variable markup
Use of a DFDL annotated element/type
to describe an initiator, length prefix, terminator, separator, etc. Steve
suggested the most important use of "variable markup-like mechanism"
in IBM's WTX product is to reference a location earlier in the bit stream
where a delimiter value is found. We handle this already by use of a
path expression. The additional variable markup mechanism was to avoid
proliferation of keywords for various corner cases on initiator, terminator
and separator. E.g., what if you want the initiator to be "Name"
or "name" only, not "NAME", "nAmE", etc.? A simple case-insensitive
setting is not expressive enough for that. This can always be modeled,
just not as an initiator tag. Feeling was to leave out variable markup
(other than for prefix lengths) for v1.0, and to propose the minimum set
of extra properties that can be used to address the common use cases, but
that IBM needed to see whether this satisfied all WTX use cases.
(Post-call update: it doesn't. There is a use case from WTX, which Steve
will mail out before the next call.)
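To illustrate the remark above that the "Name"/"name" case can be modeled as data rather than as an initiator, here is a rough sketch. The element name "tag" is hypothetical, and the dfdl properties are simply the ones used on the separator element in the header schema in this mail; this shows the idea only, not agreed syntax.

```xml
<!-- Hypothetical: the tag is an ordinary 4-character text element,
     and the enumeration restricts it to exactly the two allowed
     spellings. "NAME", "nAmE", etc. would fail validation. -->
<xs:element name="tag" dfdl:lengthKind="explicit"
            dfdl:length="4" dfdl:representation="text">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Name"/>
      <xs:enumeration value="name"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

The cost of this approach is that the tag appears in the infoset as data instead of being consumed as markup.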
The use case is from EDI. EDI transactions consist of an initial header
segment which defines, among other things, the separator that is used by
the data segments that follow. The problem is that EDI transactions may
be processed in their entirety, or individual data segments may be
processed without the header segment. In the former case DFDL works
fine: an XPath expression locates the separator, which is defined as an
element whose simple type enumerates the allowable values, enabling
validation. In the latter case the XPath expression won't resolve, as
there is no header. An explicit dfdl:separator property could be used
instead, being a space-separated list of all the allowable values, but
that duplicates the enumerations on the separator element and leaves a
maintenance problem.
<!-- Header segment: carries the separator used by the data segments -->
<xs:element name="header">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element name="separator" dfdl:lengthKind="explicit"
                  dfdl:length="1" dfdl:representation="text">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <!-- placeholder values for the allowable separators -->
            <xs:enumeration value="xxx"/>
            <xs:enumeration value="yyy"/>
            <xs:enumeration value="aaa"/>
            <xs:enumeration value="bbb"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<!-- Data segment: its separator is located by path in the header -->
<xs:element name="850">
  <xs:complexType dfdl:lengthKind="delimited">
    <xs:sequence dfdl:lengthKind="implicit"
                 dfdl:separator="../../header/separator">
      <xs:element name="one" type="xs:string"/>
      <xs:element name="two" type="xs:string"/>
      <xs:element name="three" type="xs:string"/>
      <xs:element name="four" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
<!-- Whole transaction: header followed by any number of segments -->
<xs:element name="transaction">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element ref="header"/>
      <xs:element name="segment" maxOccurs="unbounded">
        <xs:complexType>
          <xs:choice>
            <xs:element ref="800"/>
            <xs:element ref="810"/>
            <xs:element ref="850"/>
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
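For comparison, the explicit alternative mentioned earlier for the standalone-segment case might look roughly like the sketch below. It assumes the space-separated list form of dfdl:separator described above and reuses the placeholder values from the header's separator element; only the first two child elements are repeated here.

```xml
<!-- Sketch only: segment 850 processed without a header.
     The element name is kept as in the schema above; a real schema
     would need a name that is a valid XML name. -->
<xs:element name="850">
  <xs:complexType dfdl:lengthKind="delimited">
    <!-- The allowable separators are listed literally, so this list
         must be kept in step with the enumerations on
         header/separator by hand. -->
    <xs:sequence dfdl:lengthKind="implicit"
                 dfdl:separator="xxx yyy aaa bbb">
      <xs:element name="one" type="xs:string"/>
      <xs:element name="two" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Every segment type processed standalone would carry its own copy of that list, which is where the maintenance problem comes from.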
Note that WTX solves this by pointing at a global separator element by
name, rather than at a separator element in the data by path. At runtime,
the infoset value of the global element is used if it has been set;
otherwise the element's enumerations provide the list of possible
separator values.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848