We have users with binary blobs whose sizes are given in bits and are not a multiple of 8 bits long.
Today the DFDL spec doesn't allow hexBinary to have lengthUnits 'bits'.
I am wondering if this restriction should be lifted.
XSD constrains hexBinary to always have an even number of hex digits, so we would have to do the same.
So, for example, a 17-bit-long hexBinary containing all 1 bits would be FFFF80: the 17 data bits followed by 7 zero padding bits to fill out the third byte.
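If this restriction were lifted, the declaration might look something like the sketch below (the element name and surrounding properties are illustrative, not taken from any real schema):

    <xs:element name="flags" type="xs:hexBinary"
                dfdl:lengthKind="explicit"
                dfdl:length="17"
                dfdl:lengthUnits="bits"/>

On parsing, the 17 bits would be padded out to whole bytes to form the logical hexBinary value; on unparsing, presumably only the first 17 bits of that value would be written.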
Erratum 5.15 extends the types that are allowed to have length in bits to include packed calendars, so there is precedent for opening up this restriction when the need arises.
I claim we need to:
(a) allow lengthUnits 'bits' for all types
(b) restrict the length to exactly 32 bits or 64 bits for types xs:float and xs:double when the representation is 'binary' (see the sketch after this list)
(c) restrict packed decimal to have lengths that are a multiple of 4 bits (when specified in units 'bits')
All other restrictions should be lifted, as they just cause problems in some formats.
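For (b), a 32-bit IEEE float whose length is stated in bits would then be writable as something like this (a sketch only; the element name is made up):

    <xs:element name="reading" type="xs:float"
                dfdl:representation="binary"
                dfdl:binaryFloatRep="ieee"
                dfdl:lengthKind="explicit"
                dfdl:length="32"
                dfdl:lengthUnits="bits"/>

Today the same thing has to be expressed as dfdl:length="4" with dfdl:lengthUnits="bytes", which is exactly the divide-by-8 awkwardness described below.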
For example, Section 12.3.7.2.5 specifies that binary calendars must be exactly 4 bytes or 8 bytes and cannot be specified in units 'bits'. This is just a mistake in DFDL. I have even seen binary calendars with a 33-bit length (seconds since 1970-01-01, aka binarySeconds); that additional bit doubles the range of representable times, extending the end time substantially.
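Under the proposed change, such a field could be described directly, roughly like this (a sketch; the element name and epoch value are illustrative):

    <xs:element name="timestamp" type="xs:dateTime"
                dfdl:representation="binary"
                dfdl:binaryCalendarRep="binarySeconds"
                dfdl:binaryCalendarEpoch="1970-01-01T00:00:00"
                dfdl:lengthKind="explicit"
                dfdl:length="33"
                dfdl:lengthUnits="bits"/>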
These restrictions were put into DFDL because, at the time, our experience with bit-granularity formats was limited.
What we've found is that there are plenty of data formats where the notion of a "byte" is simply absent. Nothing uses multiples of 8 bits for anything, and nothing is measured in those units; it's always measured in bits. Even for things like float and double, which have implicit lengths of 4 and 8 bytes respectively, many specifications will express those as 32 bits or 64 bits. Having to divide by 8 just makes the DFDL schema awkward. Similarly, in these formats strings are given their lengths in bits: 448 bits' worth of 7-bit packed ASCII characters is 64 characters occupying 56 bytes, but the format specification says 448 bits.
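As a sketch of how that string case would read if (a) were adopted, assuming the X-DFDL-US-ASCII-7-BIT-PACKED encoding for the 7-bit packed text (the element name is made up):

    <xs:element name="message" type="xs:string"
                dfdl:encoding="X-DFDL-US-ASCII-7-BIT-PACKED"
                dfdl:lengthKind="explicit"
                dfdl:length="448"
                dfdl:lengthUnits="bits"/>

The 448 comes straight from the format specification, with no conversion to 56 bytes or 64 characters needed.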