On Friday's call Mike & Suman were tasked with creating a proposal for the number of array-specific properties that we expose in 1.0.

Spec v028 lists a core set:

occursKind
occurs
occursSeparator
occursStopValue
occursStopValueKind

The debate was whether array properties are needed for handling prefix/suffix regions, the motivating example being a leading skip count.

Here's a format I came across recently: JSON, meaning JavaScript Object Notation (http://json.org/), used in conjunction with REST-style web services.

For a model that logically looks like...

String name;
String[] address;
String zip;

...an example JSON instance could look like...

{"name":"Steve Hanson",
"address":["1 Woodlands","Awbridge","Romsey"],
"zip":"SO51 0GP"
}

... and a DFDL model (ignoring white space) would look like.

<xs:sequence dfdl:separator="," dfdl:initiator="{" dfdl:terminator="}" dfdl:lengthKind="implicit">
  <xs:element name="name" type="xs:string" dfdl:initiator="&quot;name&quot;:" dfdl:terminator="" dfdl:lengthKind="delimited"/>
  <xs:sequence dfdl:separator="" dfdl:initiator="&quot;address&quot;:[" dfdl:terminator="]" dfdl:lengthKind="implicit">
    <xs:element name="address" type="xs:string" dfdl:initiator="" dfdl:terminator="" dfdl:lengthKind="delimited"
                maxOccurs="unbounded" dfdl:occursKind="implicit" dfdl:occursSeparator=","/>
  </xs:sequence>
  <xs:element name="zip" type="xs:string" dfdl:initiator="&quot;zip&quot;:" dfdl:terminator=""/>
</xs:sequence>

Note the need to wrap the address element in a sequence to consume the array markup **. I could have used a complex element, but that would have added an extra infoset level and I wanted to conform as closely as possible to the logical view of the data above.
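
As a rough illustration of how those delimiters carve up the instance, here is a small Python sketch. It is not a DFDL processor and is not derived from the spec; it simply hard-codes the initiators, separators and terminators from the model above in the order they would be consumed. The helper names (expect, delimited, parse) are made up for this example, the white space is assumed to have been removed from the instance, and the zip element is assumed to have the initiator "zip":.

```python
def expect(data, pos, literal):
    """Consume `literal` at `pos` or fail; return the position after it."""
    if not data.startswith(literal, pos):
        raise ValueError(f"expected {literal!r} at offset {pos}")
    return pos + len(literal)


def delimited(data, pos, stops):
    """Read text from `pos` up to (not including) the nearest delimiter in `stops`."""
    hits = [i for i in (data.find(s, pos) for s in stops) if i != -1]
    if not hits:
        raise ValueError(f"none of {stops!r} found after offset {pos}")
    end = min(hits)
    return data[pos:end], end


def parse(data):
    pos = expect(data, 0, "{")                     # outer sequence initiator
    pos = expect(data, pos, '"name":')             # initiator of element name
    name, pos = delimited(data, pos, [",", "}"])   # delimited by outer separator
    pos = expect(data, pos, ",")                   # outer sequence separator
    pos = expect(data, pos, '"address":[')         # wrapping sequence initiator
    address = []
    while True:
        item, pos = delimited(data, pos, [",", "]"])
        address.append(item)
        if data.startswith("]", pos):              # wrapping terminator ends the array
            break
        pos = expect(data, pos, ",")               # occursSeparator
    pos = expect(data, pos, "]")                   # wrapping sequence terminator
    pos = expect(data, pos, ",")                   # outer sequence separator
    pos = expect(data, pos, '"zip":')              # assumed initiator of element zip
    zip_code, pos = delimited(data, pos, ["}"])
    expect(data, pos, "}")                         # outer sequence terminator
    return {"name": name, "address": address, "zip": zip_code}


instance = '{"name":"Steve Hanson","address":["1 Woodlands","Awbridge","Romsey"],"zip":"SO51 0GP"}'
result = parse(instance)
```

One thing the sketch makes visible: because the model treats only the key and the brackets as markup, each value comes back with its surrounding double quotes still attached. Stripping them would need extra initiators/terminators on the elements, which the model above leaves out.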

To do this with explicit array properties the DFDL model would look like:

<xs:element name="address" type="xs:string" dfdl:initiator="" dfdl:terminator="" dfdl:lengthKind="delimited"
            maxOccurs="unbounded" dfdl:occursKind="implicit" dfdl:occursSeparator=","
            dfdl:occursInitiator="&quot;address&quot;:[" dfdl:occursTerminator="]"/>

Note the need for dfdl:occursInitiator and dfdl:occursTerminator.

Personally, I don't think the extra properties increase the usability by much.

** The one thing that feels a bit odd is having dfdl:lengthKind="implicit" on the wrapping xs:sequence. I feel like I want to use dfdl:lengthKind="delimited", because the sequence is followed by the separator of the outer sequence. And in this instance I could, as there are no 'final unused' bytes in the data. I'm still getting used to length on sequences, I guess.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848