As an example of why I feel bitOrder and byteOrder apply when supporting
hexBinary with non-byte size lengths or starting on non-byte boundaries,
let's say we had the following data:

  11011111 11010001 = 0xDFD1

And we want to model this as one 12-bit unsigned int followed by one
4-bit unsigned int, all with bitOrder=LSBF and byteOrder=LE. We would
have a schema like so:

  <dfdl:format
    lengthKind="explicit"
    lengthUnits="bits"
    bitOrder="leastSignificantBitFirst"
    byteOrder="littleEndian" />

  <xs:sequence>
    <xs:element name="foo" dfdl:length="12" type="xs:unsignedInt" />
    <xs:element name="bar" dfdl:length="4" type="xs:unsignedInt" />
  </xs:sequence>

The above data would parse as:

  <foo>479</foo> <!-- binary: 000111011111, hex 0x1DF -->
  <bar>13</bar> <!-- binary: 1101, hex 0xD -->

This is because, due to the bit/byteOrder, "foo" is made up of the last
four bits of the second byte (0001) followed by all eight bits of the
first byte (11011111), resulting in a value of 479. The bitPosition
after "foo" is consumed is 12. Then "bar" consumes the remaining bits,
which are the first four of the second byte, resulting in a value of 13.
This all follows the specification as-is.
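
To make the bit selection concrete, here is a minimal Python sketch of an
LSBF/LE bit reader applied to the data above. The helper name read_lsbf_le
is hypothetical and the bit positions are 0-based for convenience; this
only illustrates the ordering rule and is not how any DFDL implementation
is actually written.

```python
def read_lsbf_le(data: bytes, bit_pos: int, nbits: int) -> int:
    """Read nbits starting at 0-based bit_pos with bitOrder=LSBF, byteOrder=LE.

    Bit position p lives in byte p // 8 and is the bit of weight 2**(p % 8)
    in that byte. The first bit read becomes the least significant bit of
    the result.
    """
    value = 0
    for i in range(nbits):
        p = bit_pos + i
        bit = (data[p // 8] >> (p % 8)) & 1
        value |= bit << i
    return value


data = bytes([0b11011111, 0b11010001])  # 0xDFD1

foo = read_lsbf_le(data, 0, 12)   # all of byte 1, then the low nibble of byte 2
bar = read_lsbf_le(data, 12, 4)   # the high nibble of byte 2

print(foo, bar)  # prints "479 13"
```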

Now, let's assume we instead wanted to represent "foo" as xs:hexBinary
that has a non-byte size length, e.g.:

  <xs:sequence>
    <xs:element name="foo" dfdl:length="12" type="xs:hexBinary" />
    <xs:element name="bar" dfdl:length="4" type="xs:unsignedInt" />
  </xs:sequence>

If we ignored bitOrder/byteOrder when parsing "foo" and just read the
first 12 bits (essentially BE MSBF), the result would be:

  <foo>0DFD</foo>

But just like before, the bitPosition after "foo" is consumed is 12. And
because the bit/byteOrder is LSBF LE, the bits that "bar" will consume
are again the first four of the second byte, with the result

  <bar>13</bar>

But this means that the last four bits in the data (0001) were never
consumed, and the first four bits in the second byte were consumed
twice, which must be wrong (a similar issue occurs when starting on a
non-byte boundary). So bitOrder/byteOrder must be taken into account
somehow in order to support hexBinary with non-byte size lengths or
starting on a non-byte boundary, primarily because of how bitOrder=LSBF
works (which I believe was the original use-case for non-byte size,
non-byte boundary hexBinary).
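
The double consumption can be checked mechanically. The sketch below is
only an illustration: the helper names msbf_bits and lsbf_bits are made
up, bit positions are 0-based, and each physical bit is identified as
(byte index, bit weight exponent). It lists which bits of the two bytes
are touched when "foo" is read MSBF and "bar" is then read LSBF from bit
position 12.

```python
def msbf_bits(bit_pos: int, nbits: int) -> set:
    # MSBF: position p is the bit of weight 2**(7 - p % 8) in byte p // 8.
    return {(p // 8, 7 - p % 8) for p in range(bit_pos, bit_pos + nbits)}


def lsbf_bits(bit_pos: int, nbits: int) -> set:
    # LSBF: position p is the bit of weight 2**(p % 8) in byte p // 8.
    return {(p // 8, p % 8) for p in range(bit_pos, bit_pos + nbits)}


foo_bits = msbf_bits(0, 12)   # "foo" read while ignoring bitOrder
bar_bits = lsbf_bits(12, 4)   # "bar" read as LSBF starting at position 12
all_bits = {(byte, w) for byte in range(2) for w in range(8)}

consumed_twice = sorted(foo_bits & bar_bits)
never_consumed = sorted(all_bits - foo_bits - bar_bits)

print(consumed_twice)  # [(1, 4), (1, 5), (1, 6), (1, 7)] -> high nibble of byte 2
print(never_consumed)  # [(1, 0), (1, 1), (1, 2), (1, 3)] -> low nibble of byte 2
```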

If instead we do not ignore bit/byteOrder, there must be some way to
determine how to get those bits into a hexBinary representation. There
are probably a few different ways to handle this, but after some
discussions and interpretations of the XSD spec, we determined that the
best way to handle it was to just read the bits as if they were a
nonNegativeInteger (which does take into account bit/byteOrder) and then
convert those bits to a hex representation. For BE MSBF the result is
exactly the same. For LE MSBF, it results in the hexBinary being
flipped, which is where the Daffodil implementation is inconsistent with
the spec.
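
As a rough sketch of that "read as nonNegativeInteger, then print in base
16" approach: the Python below uses a hypothetical helper, to_hex_binary,
and assumes the hex string is left-padded with zero digits to a whole
number of bytes (mirroring the shape of the 0DFD example above). The
padding rule is my assumption for illustration, not something taken from
the spec or from Daffodil.

```python
def to_hex_binary(value: int, nbits: int) -> str:
    # Assumption: pad with leading zero digits up to a whole number of bytes.
    ndigits = 2 * ((nbits + 7) // 8)
    return format(value, "0{}X".format(ndigits))


data = bytes([0xDF, 0xD1])

# Whole-byte lengths: int.from_bytes models the byteOrder-sensitive integer read.
be_hex = to_hex_binary(int.from_bytes(data, "big"), 16)     # same as the raw bytes
le_hex = to_hex_binary(int.from_bytes(data, "little"), 16)  # bytes appear flipped

# The 12-bit LSBF LE "foo" from the first example has the integer value 479.
foo_hex = to_hex_binary(479, 12)

print(be_hex, le_hex, foo_hex)  # prints "DFD1 D1DF 01DF"
```

With this approach "foo" covers exactly the same 12 bits as it did when it
was an unsigned int, so "bar" still gets the remaining four bits and no
bit is read twice.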

On 11/29/18 10:19 AM, Steve Hanson wrote:

> Mike
>
> I'm a bit lost on this now.  The concept of applying lengthUnits='bits' to
> xs:hexBinary is straightforward. It just counts bits. Bit order or byte order is
> irrelevant, in the same way that it is irrelevant when counting bytes for a hex
> binary. The only thing to note is that the fillByte needs to be used to make up
> whole bytes.
>
> I'm missing something here.

> 

> Regards
>
> Steve Hanson
>
> IBM Hybrid Integration, Hursley, UK
> Architect, IBM DFDL <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
> Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
> smh@uk.ibm.com <mailto:smh@uk.ibm.com>
> tel:+44-1962-815848
> mob:+44-7717-378890
> Note: I work Tuesday to Friday

> 

> 

> 

> From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
> To: DFDL-WG <dfdl-wg@ogf.org>
> Date: 20/11/2018 17:33
> Subject: [DFDL-WG] Action 292 - version 2 proposal for hexBinary with lengthUnits bits
> Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>

> 

> --------------------------------------------------------------------------------

> 

> 

> 

> Users want a way to express an arbitrary unaligned string of bits, with the
> appearance in the infoset being hexadecimal, not base 10.
>
> Right now the only way I can see to meet this requirement while retaining
> backward compatibility would be a new DFDL property.
>
> So here's the new idea:
>
> Property dfdl:hexBinaryRep with values 'bytes' or 'bits'. New property, so
> defaulting (with suppressible warning) to 'bytes' for backward compatibility in
> schemas not having the property.
>
> When set to 'bits', then type xs:hexBinary would behave just like
> xs:nonNegativeInteger, and all properties relevant to that type would be
> applicable, and any use of XSD length facets on such elements would be an SDE.
> The hexBinary string would be exactly same as if you took the numeric value for
> a nonNegativeInteger and instead of presenting it as base 10 digits, you use
> base 16 digits.

> 

> 

> Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
> www.tresys.com <http://www.tresys.com>
> Please note: Contributions to the DFDL Workgroup's email discussions are subject
> to the OGF Intellectual Property Policy
> <http://www.ogf.org/About/abt_policies.php>

> --
>   dfdl-wg mailing list
>   dfdl-wg@ogf.org
>   https://www.ogf.org/mailman/listinfo/dfdl-wg

> 

