I have a bunch of questions/issues relating
to dfdl:choiceKind. I'm not asking for changes in v0.40, but I expect there
will be changes required.
The issues that I want to raise are:
a) The description of the property
in v0.39 contains several typos and inaccuracies.
- 'implicit' is being used where 'fixedLength'
was intended.
- nothing is said about the units in
which the length is calculated.
- there's no need to discuss how the
choice is resolved when discussing the 'variableLength' enum
- we should use standard phraseology
when indicating whether a property can be computed from a DFDL expression.
b) Property name should be 'choiceLengthKind'
to accurately reflect its meaning
c) There is a need for a related
property 'choiceLengthUnits'
Consider the recursive algorithm for
calculating the length of each branch. It needs to know whether it is calculating
a length in bytes or characters. If the length is in bytes, then the length
cannot be calculated for variable-width encodings. If the length is in
characters, then the length cannot always be calculated reliably if there
are raw byte values in the markup.
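To make the point about units concrete, here is a minimal sketch of such
a recursive calculation in Python. The names Field, Group, ENCODING_WIDTH
and fixed_length are hypothetical, invented for this illustration; they
are not part of DFDL or of any implementation. The sketch ignores markup,
alignment and the other complications listed under (d), and it assumes
that a length declared in bytes can be restated in characters only when
the encoding has a fixed width that divides it exactly.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# Bytes per character for a few encodings; None marks a variable-width one.
# Illustrative table only, not taken from the DFDL specification.
ENCODING_WIDTH = {"ascii": 1, "utf-16be": 2, "utf-8": None}


@dataclass
class Field:
    length: int             # declared length, expressed in `units`
    units: str              # "bytes" or "characters"
    encoding: str = "ascii"


@dataclass
class Group:
    children: List[Union["Field", "Group"]] = field(default_factory=list)


def fixed_length(node, units: str) -> Optional[int]:
    """Return the length of `node` in `units`, or None if it cannot be known."""
    if isinstance(node, Group):
        total = 0
        for child in node.children:
            n = fixed_length(child, units)
            if n is None:        # one unknown child makes the whole branch unknown
                return None
            total += n
        return total
    if node.units == units:      # already in the requested units
        return node.length
    width = ENCODING_WIDTH.get(node.encoding)
    if width is None:            # variable-width or unknown encoding
        return None
    if units == "bytes":         # characters -> bytes
        return node.length * width
    if node.length % width:      # bytes -> characters must divide exactly
        return None
    return node.length // width
```

With this sketch, a branch holding four ASCII characters and two UTF-16
characters is 8 bytes or 6 characters long, while a branch containing a
UTF-8 field declared in characters has no calculable length in bytes.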
d) The rules for calculating the
max length of the choice are not provided. They are complex, and not at
all obvious. Consider these issues:
The length of a branch cannot be calculated
if
- there are any optional elements or
variable-length arrays anywhere in the branch
- any field in the branch has dfdl:alignment greater than "1" (at least,
I can't work out what the rules would be. The alignment of the parent
element would need to be factored in)
- any element or group in the branch
specifies its initiator, terminator or separator as a DFDL expression
- any element or group in the branch
specifies its length as a DFDL expression
if choiceLengthUnits='characters' then
the length cannot be calculated if
- any element or group in the branch
specifies a DFDL string literal containing DFDL mnemonics %NL; %WSP*; or
%WSP+;
- any element or group in the branch
uses a DFDL string literal that contains a sequence of raw byte values
whose length differs from the fixed character width
if choiceLengthUnits='bytes' then
the length cannot be calculated if
- any element in the branch specifies
a variable-width encoding, or specifies its encoding as a DFDL expression.
There are probably other rules which
need to be applied, but the above should illustrate the point. Calculating
the length is only possible under some *very* restrictive conditions.
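The rules listed under (d) can be sketched as a recursive check, shown
here in Python. This is only an illustration of how restrictive the
conditions are, not a proposed algorithm. The Node class, its attribute
names, the CHAR_WIDTH table and the blockers function are all
hypothetical. The sketch assumes raw byte values are written as %#rXX;
entities, and it applies the encoding rule when the units are bytes,
following the reasoning given under (c).

```python
import re
from dataclasses import dataclass, field
from typing import List

MNEMONICS = ("%NL;", "%WSP*;", "%WSP+;")            # variable-length mnemonics
RAW_BYTE_RUN = re.compile(r"(?:%#r[0-9A-Fa-f]{2};)+")  # run of raw byte entities
CHAR_WIDTH = {"ascii": 1, "utf-16be": 2, "utf-8": None}  # illustrative only


@dataclass
class Node:
    name: str
    optional: bool = False
    variable_array: bool = False
    alignment: int = 1
    markup_is_expression: bool = False   # initiator, terminator or separator
    length_is_expression: bool = False
    encoding: str = "ascii"
    encoding_is_expression: bool = False
    literals: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def blockers(node: Node, units: str) -> List[str]:
    """List the reasons why the length of this branch cannot be calculated."""
    found = []

    def add(reason: str) -> None:
        found.append(f"{node.name}: {reason}")

    if node.optional or node.variable_array:
        add("optional element or variable-length array")
    if node.alignment > 1:
        add("alignment greater than 1")
    if node.markup_is_expression:
        add("markup given as a DFDL expression")
    if node.length_is_expression:
        add("length given as a DFDL expression")
    width = CHAR_WIDTH.get(node.encoding)
    if units == "characters":
        for literal in node.literals:
            if any(m in literal for m in MNEMONICS):
                add("literal contains %NL;, %WSP*; or %WSP+;")
            for run in RAW_BYTE_RUN.findall(literal):
                # each %#rXX; entity is 6 characters of literal text
                if width is None or len(run) // 6 != width:
                    add("raw byte run differs from the character width")
    if units == "bytes" and (node.encoding_is_expression or width is None):
        add("encoding is variable-width or a DFDL expression")
    for child in node.children:
        found.extend(blockers(child, units))
    return found
```

A branch is a candidate for a fixed-length choice only if blockers
returns an empty list for it, and as noted above there are probably
further rules that this sketch does not cover.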
e) I think the property may not be
required
As far as I am aware, this property
was introduced to provide support for COBOL REDEFINES, and to allow MRM
message sets to be migrated to DFDL. If true, the problem gets a lot simpler:
- COBOL does not use initiators/terminators.
- The COBOL compiler contains code that
calculates the length of the structure ( it must, because COBOL has a rule
that a REDEFINES cannot be longer than the record that it is redefining
).
Presumably, it takes alignment into
account in some way, and handles issues relating to character width as
well.
- COBOL does not allow an anonymous
REDEFINES. If imported, a REDEFINES will always produce a complex element
whose content is a fixed-length choice.
Note : This means that the same will
be true of any MRM message set created by Message Broker's COBOL importer.
If those assumptions are correct, then
in all cases the same effect could be achieved by putting the precalculated
length of the REDEFINES onto the parent element. I think this merits serious
consideration. The cost of implementing choiceKind='fixedLength' is quite
high because of the complexity of the rules, and the fact that groups,
as well as complex elements, can have a fixed length. But it's not really
an implementation issue, it's a complexity issue. DFDL should not contain
a property with such complex implementation requirements unless there's
a strong case for it - otherwise potential implementers are going to be
put off.
The existing COBOL importer probably
does not set the precalculated length of a REDEFINES on the parent element.
That would be required if we wanted to remove the property - so we would
have to discuss that with the group that provides the importer technology.
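As a rough illustration of what the importer would have to emit, the
Python sketch below takes the branch lengths that the COBOL compiler or
importer has already worked out and produces properties for the parent
element. The function name parent_length_properties is hypothetical, and
the choice of dfdl:lengthKind, dfdl:length and dfdl:lengthUnits as the
properties to set is my assumption about how the precalculated length
would be expressed, not something the importer does today.

```python
from typing import Dict, Optional, Sequence


def parent_length_properties(
    branch_lengths: Sequence[Optional[int]], units: str = "bytes"
) -> Optional[Dict[str, str]]:
    """Return length properties for the parent of a REDEFINES, or None.

    branch_lengths holds the precalculated length of each branch;
    None means that branch's length could not be calculated.
    """
    if not branch_lengths or any(n is None for n in branch_lengths):
        return None
    return {
        "dfdl:lengthKind": "explicit",
        # the parent must be long enough to hold the longest branch
        "dfdl:length": str(max(branch_lengths)),
        "dfdl:lengthUnits": units,
    }
```

For example, branches of 8, 6 and 8 bytes would give the parent an
explicit length of 8 bytes, and the choice itself would need no
length-related property at all.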
regards,
Tim Kimber, Common Transformation Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 246742