On Friday's call Mike & Suman were
tasked with creating a proposal for the number of array specific properties
that we expose in 1.0.
Spec v028 lists a core set:
occursKind
occurs
occursSeparator
occursStopValue
occursStopValueKind
The debate was whether array properties
are needed for handling prefix/suffix region, the motivating example being
leading skip count.
Here's a format I came across recently,
JSON, meaning JavaScript Object Notation (http://json.org/), used in conjunction
with REST style web services.
For a model that logically looks like...
String name;
String[] address;
String zip;
...an example JSON instance could look
like...
{"name":"Steve Hanson",
 "address":["1 Woodlands","Awbridge","Romsey"],
 "zip":"SO51 0GP"
}
... and a DFDL model (ignoring white
space) would look like:
<xs:sequence dfdl:separator="," dfdl:initiator="{" dfdl:terminator="}"
             dfdl:lengthKind="implicit">
  <xs:element name="name" type="xs:string"
              dfdl:initiator="&quot;name&quot;:" dfdl:terminator=""
              dfdl:lengthKind="delimited"/>
  <xs:sequence dfdl:separator="" dfdl:initiator="&quot;address&quot;:["
               dfdl:terminator="]" dfdl:lengthKind="implicit">
    <xs:element name="address" type="xs:string"
                dfdl:initiator="" dfdl:terminator=""
                dfdl:lengthKind="delimited"
                maxOccurs="unbounded"
                dfdl:occursKind="implicit" dfdl:occursSeparator=","/>
  </xs:sequence>
  <xs:element name="zip" type="xs:string"
              dfdl:initiator="&quot;zip&quot;:" dfdl:terminator=""/>
</xs:sequence>
Note the need to wrap the address element
in a sequence
to consume the array markup **. I could have used a complex element, but
that would have added an extra infoset level and I wanted to conform as
closely as possible to the logical view of the data above.
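For comparison, a sketch of the complex element alternative might look like the following. The element name "addresses" is hypothetical, and putting the array initiator and terminator on the complex element is an assumption that simply mirrors the wrapping sequence above.

```xml
<!-- Sketch only: "addresses" is a hypothetical wrapper element name. -->
<xs:element name="addresses"
            dfdl:initiator="&quot;address&quot;:[" dfdl:terminator="]"
            dfdl:lengthKind="implicit">
  <xs:complexType>
    <xs:sequence dfdl:separator="">
      <!-- The repeating element itself is unchanged from the sequence version. -->
      <xs:element name="address" type="xs:string"
                  dfdl:initiator="" dfdl:terminator=""
                  dfdl:lengthKind="delimited"
                  maxOccurs="unbounded"
                  dfdl:occursKind="implicit" dfdl:occursSeparator=","/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With this shape the infoset gains an "addresses" level above each address value, which is the extra level that the wrapping sequence avoids.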
To consume the array markup with explicit array properties
instead of a wrapping sequence, the DFDL model for address would look like:
<xs:element name="address" type="xs:string"
            dfdl:initiator="" dfdl:terminator=""
            dfdl:lengthKind="delimited"
            maxOccurs="unbounded" dfdl:occursKind="implicit"
            dfdl:occursSeparator=","
            dfdl:occursInitiator="&quot;address&quot;:["
            dfdl:occursTerminator="]"/>
Note the need for dfdl:occursInitiator,
dfdl:occursTerminator.
Personally, I don't think the extra
properties increase the usability by much.
** The one thing that feels a bit odd
is having dfdl:lengthKind="implicit" on the wrapping xs:sequence.
I feel like I want to use dfdl:lengthKind="delimited", because
the sequence is followed by the separator of the outer sequence. And
in this instance I could, as there are no 'final unused' bytes in the data.
I'm still getting used to length on sequences, I guess.
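As a sketch of the delimited variant described in this footnote, only dfdl:lengthKind on the wrapping sequence changes. This assumes every other property can stay as it is; whether the spec intends delimited length to be used this way on a sequence is the open question, and this fragment does not settle it.

```xml
<!-- Sketch only: same wrapping sequence, with lengthKind switched to "delimited". -->
<xs:sequence dfdl:separator="" dfdl:initiator="&quot;address&quot;:["
             dfdl:terminator="]" dfdl:lengthKind="delimited">
  <xs:element name="address" type="xs:string"
              dfdl:initiator="" dfdl:terminator=""
              dfdl:lengthKind="delimited"
              maxOccurs="unbounded"
              dfdl:occursKind="implicit" dfdl:occursSeparator=","/>
</xs:sequence>
```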
Regards, Steve
Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848