type | representation | lengthKind | resulting length (in bytes) | other |
xs:hexBinary | binary (note: required - If 'text' specified it causes a schema definition error. This reserves the 'text' behavior for possible future use.) | implicit | xs:length facet | |
xs:hexBinary | binary | explicit | dfdl:length | Validation: xs:length facet must be equal to resulting length in bytes (TBD: similar range checks on xs:minLength, xs:maxLength) |
xs:hexBinary | binary | endOfData or delimited or nullTerminated | variable | Validation: xs:length facet must be equal to resulting length in bytes (TBD: similar range checks on xs:minLength, xs:maxLength) |
Steve Hanson <smh@uk.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org 11/19/2007 10:23 AM

Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org 16/11/2007 23:09

Property Name | Description |
binaryType | Enum. This specifies the encoding method for the binary. Valid values are 'unspecified', 'hexBinary', 'base64Binary', 'uuencoded'. Annotation: dfdl:element (simple type 'binary', 'opaque') |
This property speaks to the question: from what kinds of representations
can we interpret and construct logical hexBinary values (and similarly
base64Binary)? I believe the above is not clear, and it causes issues
with the xs:length facet of XSD.
I propose the 4 tables below which describe the 4 cases:
hexbinary - binary
hexbinary - text
base64binary - binary
base64binary - text
I have specified these so that the meaning of the xs:length facet is always
interpreted exactly as in XSD. It always refers to the number of bytes
of the unencoded binary data, and never to the number of characters in
the encoded form.
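
To make the distinction concrete, here is a small Python sketch (not DFDL, and not part of the proposal) showing that the same 4-byte value has a different character count in each text encoding, while the length the xs:length facet refers to stays at 4. The sample payload and the helper name facet_length are made up for illustration.

```python
import base64

# Hypothetical sample value: 4 bytes of unencoded binary data.
payload = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# Two text encodings of the same logical value.
hex_text = payload.hex().upper()                        # "DEADBEEF"
b64_text = base64.b64encode(payload).decode("ascii")    # "3q2+7w=="


def facet_length(value: bytes) -> int:
    """Length as xs:length counts it: bytes of the unencoded data."""
    return len(value)


# The encoded forms are 8 characters each, but the facet length is 4
# regardless of which encoding the value was decoded from.
print(len(hex_text), facet_length(bytes.fromhex(hex_text)))
print(len(b64_text), facet_length(base64.b64decode(b64_text)))
```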
type | representation | lengthKind | resulting length (in bytes) | other |
xs:hexBinary | binary | implicit | xs:length facet | |
xs:hexBinary | binary | explicit | dfdl:length | Validation: xs:length facet must be equal to resulting length in bytes (TBD: similar range checks on xs:minLength, xs:maxLength) |
xs:hexBinary | binary | endOfData or delimited or nullTerminated | variable | |
type | representation | lengthKind | resulting length (in characters) | other |
xs:hexBinary | text | implicit | 2 * xs:length facet | |
xs:hexBinary | text | explicit | dfdl:length | Validation: xs:length facet * 2 must be equal to resulting character length (after removing all non-hex characters) (TBD: similar range checks on xs:minLength, xs:maxLength) |
xs:hexBinary | text | endOfData, delimited, nullTerminated | variable | |
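
As a rough illustration of the validation rule in the hexBinary text table, the check could be sketched in Python as below. This is only a sketch of one reading of the rule: the function name is hypothetical, and it assumes "non-hex characters" means anything outside 0-9, a-f, A-F.

```python
import string

# Characters that count toward the encoded length (assumption: both cases allowed).
HEX_DIGITS = set(string.hexdigits)


def hex_text_matches_facet(text: str, facet_length: int) -> bool:
    """Return True if the hex characters in text number exactly 2 * xs:length.

    Non-hex characters (spaces, line breaks, and so on) are removed before
    counting, as the validation rule in the table describes.
    """
    hex_chars = [c for c in text if c in HEX_DIGITS]
    return len(hex_chars) == 2 * facet_length


# A 4-byte value written with separators still has 8 hex characters.
print(hex_text_matches_facet("DE AD\nBE EF", 4))
print(hex_text_matches_facet("DEADBEEF", 3))
```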
type | representation | dfdl:lengthKind | resulting length (in bytes) | other |
xs:base64Binary | binary | implicit | xs:length facet | |
xs:base64Binary | binary | explicit | dfdl:length | Validation: xs:length facet must be equal to resulting length in bytes (TBD: similar range checks on xs:minLength, xs:maxLength) |
xs:base64Binary | binary | endOfData or delimited or nullTerminated | variable | |
type | representation | lengthKind | resulting length (in characters) | other |
xs:base64Binary | text | implicit | 8/6 * xs:length facet | |
xs:base64Binary | text | explicit | dfdl:length | Validation: xs:length facet * 8/6 must be equal to resulting character length (after removing all non-base64-encoding characters) (TBD: similar range checks on xs:minLength, xs:maxLength) |
xs:base64Binary | text | endOfData, delimited, nullTerminated | variable | |
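
The base64 case is less tidy than the hex case because 8/6 * xs:length is only a whole number when the length is a multiple of 3. The Python sketch below shows one possible reading of the rule; it is an assumption, not something the table settles. It treats '=' padding as a non-base64-encoding character that is removed, and rounds the 8/6 product up. The function names are hypothetical.

```python
import base64
import string

# Assumption: only the 64 data characters count; '=' padding is stripped.
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/")


def expected_b64_chars(facet_length: int) -> int:
    """ceil(8/6 * facet_length), computed with integer arithmetic."""
    return (8 * facet_length + 5) // 6


def b64_text_matches_facet(text: str, facet_length: int) -> bool:
    """Compare the count of base64 data characters with the expected count."""
    data_chars = [c for c in text if c in B64_ALPHABET]
    return len(data_chars) == expected_b64_chars(facet_length)


# 4 bytes encode to 6 data characters plus 2 padding characters.
encoded = base64.b64encode(bytes([0xDE, 0xAD, 0xBE, 0xEF])).decode("ascii")
print(encoded, b64_text_matches_facet(encoded, 4))
```

If padding were instead counted as part of the encoded length, the expected count would be 4 * ceil(xs:length / 3), so the rule would need to say which interpretation applies.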
Looking at the above, one way to simplify things quite a bit is to disallow the xs:length, xs:minLength, and xs:maxLength facets on the hexBinary and base64Binary types in DFDL schemas.
Then the implicit lengthKind goes away for these types, and so does the complex validation check for the xs:length facet. I recommend this.
Another simplification alternative is to disallow representation text altogether, but I am concerned that people with data that does contain hex or base64 text will naturally want to use these types to model it. I don't recommend this.
...mikeb
Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan
priordan@us.ibm.com
508-599-7046
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU