From last week's call:

7. Recursive use of DFDL for variable markup
This is the use of a DFDL-annotated element/type to describe an initiator, length prefix, terminator, separator, etc. Steve suggested that the most important use of a "variable markup"-like mechanism in IBM's WTX product is to reference a location earlier in the bit stream where a delimiter value is found. We already handle this with a path expression. The additional variable markup mechanism was intended to avoid a proliferation of keywords for the various corner cases on initiator, terminator and separator. E.g., what if you want the initiator to be "Name" or "name" only, and not "NAME", "nAmE", etc.? Case insensitivity alone is not expressive enough for that. Such data can always be modeled, just not as an initiator tag. The feeling was to leave variable markup (other than for prefix lengths) out of v1.0 and to propose the minimum set of extra properties needed to address the common use cases, but IBM needed to check whether this would satisfy all WTX use cases.

(Post-call update: it doesn't. There is a use case from WTX, which Steve will mail out before the next call.)

The use case is from EDI. EDI transactions consist of an initial header segment which defines, among other things, the separator that is used by the data segments that follow. The problem is that EDI transactions may be processed in their entirety, or individual data segments may be processed without the header segment. DFDL supports the former case fine: an XPath expression locates the separator, which is defined as an element whose simple type enumerates the allowable values, enabling validation. In the latter case, though, the XPath expression won't resolve, as there is no header. An explicit dfdl:separator property could be used instead, holding a space-separated list of all the allowable values, but that duplicates the separator element's enums and leaves a maintenance problem.

<xs:element name="header">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element name="separator" dfdl:lengthKind="explicit" dfdl:length="1" dfdl:representation="text">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:enumeration value="xxx"/>
            <xs:enumeration value="yyy"/>
            <xs:enumeration value="aaa"/>
            <xs:enumeration value="bbb"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="850">
  <xs:complexType dfdl:lengthKind="delimited">
    <xs:sequence dfdl:lengthKind="implicit" dfdl:separator="../../header/separator">
      <xs:element name="one" type="xs:string" />
      <xs:element name="two" type="xs:string" />
      <xs:element name="three" type="xs:string" />
      <xs:element name="four" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="transaction">
  <xs:complexType>
    <xs:sequence dfdl:lengthKind="implicit">
      <xs:element ref="header"/>
      <xs:element name="segment" maxOccurs="unbounded">
        <xs:complexType>
          <xs:choice>
            <xs:element ref="800" />
            <xs:element ref="810" />
            <xs:element ref="850" />
          </xs:choice>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Note that WTX solves this by pointing at a global separator element by name, rather than at a separator element in the data by path. At runtime, the infoset value of the global element is used; if it is not set, the enums are used to provide a list of possible values.
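
To make the fallback concrete, here is a minimal Python sketch of the resolution logic as I understand it. It is not WTX or DFDL code. The names ALLOWED_SEPARATORS, candidate_separators and split_segment are hypothetical, the separator values are the placeholders from the schema above, and picking the candidate that occurs earliest in the data is a simplifying assumption of mine rather than anything either product specifies.

```python
from typing import List, Optional, Sequence

# Hypothetical stand-in for the enumeration facets on the separator element.
ALLOWED_SEPARATORS = ("xxx", "yyy", "aaa", "bbb")


def candidate_separators(header_value: Optional[str],
                         allowed: Sequence[str] = ALLOWED_SEPARATORS) -> List[str]:
    """Return the separators a parser should look for.

    If the header supplied a value, validate it against the enums and use it.
    If there is no header (standalone segment), fall back to every enum value.
    """
    if header_value is not None:
        if header_value not in allowed:
            raise ValueError("separator %r is not an allowed value" % header_value)
        return [header_value]
    return list(allowed)


def split_segment(data: str, header_value: Optional[str] = None) -> List[str]:
    """Split one segment into fields.

    Assumption: when several candidates are possible, the one that appears
    earliest in the data is taken to be the separator for the whole segment.
    """
    candidates = candidate_separators(header_value)
    hits = [(data.find(sep), sep) for sep in candidates if sep in data]
    if not hits:
        return [data]  # no separator found: treat the data as a single field
    _, separator = min(hits)
    return data.split(separator)
```

With a header present, split_segment("oneyyytwo", "yyy") uses the header's value only. Without one, split_segment("oneaaatwoaaathreeaaafour") tries all four enum values, so the allowable values are defined in one place and the duplication problem described above does not arise.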

Regards

Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848