I did not get as far as I wanted to on this issue. I would like to discuss this example:
<sequence>
  <element name="len" type="int"
           dfdl:fillByte="%#r0;"
           dfdl:outputValueCalc="{ dfdl:representation-output-length(../val) }" />
  ... many elements in between ....
  <element name="val" type="string"
           dfdl:encoding="utf-8"
           dfdl:lengthKind="explicit"
           dfdl:lengthUnits="bytes"
           dfdl:useLengthForOutput="false"
           dfdl:length="{ ../len }"
           dfdl:outputLength="{
             fn:ceiling(
               dfdl:representation-inherent-length(.)
               div 4
             ) * 4
           }"
           dfdl:textTrimKind="padChar"
           dfdl:textStringJustification="left"
           dfdl:textPadCharacter="%#r0;" />
</sequence>
You will notice I added a dfdl:outputLength property and two functions,
dfdl:representation-output-length() and dfdl:representation-inherent-length().
I am open to suggestions for better names for this property and these
functions. We need to distinguish these 3 concepts:
1) inherent length – the length of the infoset item's value, without
reference to any facets and without regard to escape sequences, padding, or
truncation.
(TBD: think about escape sequences? Is this right?)
2) output target length – the length of the box we’re filling in with the
data value representation. The box can be bigger than the inherent length,
which implies padding/filling, or smaller, which implies truncation.
3) input length – the length of the box we’re getting when parsing. The
inherent length of the value after parsing can be smaller than the length of
the box, because escape characters are removed and padding is trimmed.
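
To make the three lengths concrete, here is a rough Python sketch of how I
picture them for the "val" element. It is only an illustration, not proposed
spec text. The function names are made up for this sketch. It assumes UTF-8,
a NUL pad byte, left justification, and the round-up-to-a-multiple-of-4 rule
from the dfdl:outputLength expression. It ignores escape sequences and
truncation entirely.

```python
import math

# Assumption: the pad byte is NUL, as in dfdl:textPadCharacter="%#r0;".
PAD = b"\x00"


def inherent_length(value: str, encoding: str = "utf-8") -> int:
    """Concept 1: bytes in the encoded value alone (no padding, no truncation)."""
    return len(value.encode(encoding))


def output_target_length(value: str, multiple: int = 4) -> int:
    """Concept 2: size of the box on output, rounded up to a multiple of 4."""
    return math.ceil(inherent_length(value) / multiple) * multiple


def unparse_val(value: str) -> bytes:
    """Left-justify the encoded value in the box and pad on the right."""
    return value.encode("utf-8").ljust(output_target_length(value), PAD)


def parse_val(box: bytes) -> str:
    """Trim trailing pad bytes from the box and decode what remains."""
    return box.rstrip(PAD).decode("utf-8")


value = "hello"
box = unparse_val(value)

print(inherent_length(value))           # 5
print(output_target_length(value))      # 8
print(len(box))                         # 8, concept 3: the input length the parser sees
print(inherent_length(parse_val(box)))  # 5
```

Here the value that "len" would carry is 8, the size of the box, while the
inherent length of the string is 5 both before unparsing and after parsing.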