The rules outlined in Section 14.2.2 'Parsing
Sequence Groups with Separators' are not properly specified, and probably
cannot be consistently implemented.
The last paragraph of Section 14.2.1
says this: "In the sections that follow, it is important to remember
that the dfdl:separatorSuppressionPolicy property is carried on the sequence,
while the XSDL minOccurs, XSDL maxOccurs and dfdl:occursCountKind properties
are carried on an element in that sequence."
This is true, and this 'local overriding'
of separatorSuppressionPolicy (by arrays within the group) is the cause
of most of the problems.
Problem #1: Complexity
Consider a sequence group that has separatorSuppressionPolicy (SSP) 'never'
and a comma as its separator. Its members (A, B, C) must always be
represented as follows:
"a,b,c" or ",b,c"
or ",,c"
but never "b,c" because that
would imply that the separator for an empty A had been suppressed.
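
The SSP='never' rule above can be sketched in a few lines of Python. This
is only an illustration, not DFDL processor code: the function name
unparse_never and the use of None to stand for an empty member are
assumptions made for the example.

```python
def unparse_never(members, separator=","):
    """Join members positionally; an empty member (None) keeps its separator.

    Hypothetical helper: it models SSP='never' only, for a sequence whose
    members each occur exactly once.
    """
    return separator.join("" if m is None else m for m in members)


print(unparse_never(["a", "b", "c"]))     # a,b,c
print(unparse_never([None, "b", "c"]))    # ,b,c
print(unparse_never([None, None, "c"]))   # ,,c
```

Because every position is always emitted, "b,c" can never be produced for
a three-member group.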
Now suppose that B is an array with
minOccurs=0 and maxOccurs=3 and occursCountKind='implicit'. Acceptable
representations are now:
"a,b1,b2,b3,c" or "a,b1,,,c"
or even "a,,,c"
But if occursCountKind is changed to
'parsed' then the acceptable representations suddenly alter, and empty
occurrences of B can be completely omitted.
"a,b1,b2,b3,c" or "a,b1,c"
or even "a,c"
[ or should that be "a,,c"? ]
This seems wrong. The logic for separator suppression is hard enough to
implement already. Adding an extra layer of complexity around arrays
would make it so hard that most implementations would contain defects,
leading to interoperability issues.
Problem #2: Ambiguity
See the bracketed question under Problem #1:
[ or should that be "a,,c"? ]
It is far from obvious whether the group
should insist on having a separator for the array (because its SSP is
'never') or whether the array should be free to suppress the separators
for all of its members (as I assumed when I wrote this email). The text
of the specification is either silent or unclear on this point.
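
The two readings differ only when the array has no occurrences at all. A
small Python sketch shows this; the function unparse_parsed and the flag
group_ssp_wins are names invented for the example, and neither branch is
claimed to be what the specification intends.

```python
def unparse_parsed(a, b_occurrences, c, group_ssp_wins, separator=","):
    """Hypothetical unparser for (A, B[], C) with occursCountKind='parsed'."""
    slots = list(b_occurrences)
    if not slots and group_ssp_wins:
        # Reading 1: the group's SSP='never' demands a position for the array.
        slots = [""]
    # Reading 2 (group_ssp_wins=False): the array suppresses all separators.
    return separator.join([a, *slots, c])


print(unparse_parsed("a", [], "c", group_ssp_wins=True))       # a,,c
print(unparse_parsed("a", [], "c", group_ssp_wins=False))      # a,c
print(unparse_parsed("a", ["b1"], "c", group_ssp_wins=True))   # a,b1,c
print(unparse_parsed("a", ["b1"], "c", group_ssp_wins=False))  # a,b1,c
```

Two implementations that pick different readings would disagree on the
data for an empty array while agreeing on everything else.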
Possible resolution:
Rather than attempting to specify implied
behaviours for the various occursCountKind settings, I believe the
specification should:
a) prohibit the use of certain occursCountKinds
within positional sequences
b) require array occurrences to use
the same SSP as other sequence members.
After some discussion with the IBM team,
I believe a) will not generate too many prohibited combinations, and the
rationale for those prohibitions will be consistent with already-existing
schema definition errors.
b) will simplify the implementation
of separator suppression, thus addressing the complexity problem.
I expect we will need an action to be
opened so that this can be discussed in the working group meetings.
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742