
Well, your statement below exactly reflects the confusion I'd like to avoid by suggesting that we drop base64Binary.

You said the user is "forced to model some binary base64 data as hexBinary". But the data is binary bytes: nothing stored in any textual encoding, just regular binary bytes. So the concept of a user having "binary base64 data" doesn't make sense. base64 is a 'text' representation for binary data. It's fundamentally a text concept.

By dropping base64Binary, we can just explain that hexBinary means "unknown format binary data" in DFDL.

...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan priordan@us.ibm.com 508-599-7046


Steve Hanson <smh@uk.ibm.com>
11/19/2007 01:20 PM
To: Mike Beckerle/Worcester/IBM@IBMUS
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] DFDL hexBinary and base64Binary

Mike

I also considered suggesting we drop xs:base64Binary support. Clearly, if we weren't using the XSD type system we would have a single 'binary' logical type, but we are re-using the XSD type system. So would it be more confusing for a user, familiar with the XSD type system, to be forced to model some binary base64 data as xs:hexBinary?

Fyi MRM supports both xs:hexBinary and xs:base64Binary and treats them the same.

Bottom line: I'm happy to go with the majority view.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848


Mike Beckerle <beckerle@us.ibm.com>
19/11/2007 16:34
To: Steve Hanson/UK/IBM@IBMGB
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] DFDL hexBinary and base64Binary

Steve, (& team)

What you are suggesting is the simplest of the simple: no 'text' representation at all. Users who have actual hexadecimal strings in their data can always model them either as strings or, if they're small enough, as integers in base 16 text.
In this case the only difference between hexBinary and base64Binary is what happens if you coerce the infoset value to a string, and that is in the API space, which is outside the scope of DFDL. To me this suggests that we leave out base64Binary entirely for V1.0 to avoid confusion (it will be confusing to people to explain that hexBinary and base64Binary are synonymous in DFDL).

So the net functionality for DFDL v1.0 would be this only:

  type:            xs:hexBinary
  representation:  binary
                   (note: required. If 'text' is specified it causes a schema
                   definition error. This reserves the 'text' behavior for
                   possible future use.)

  lengthKind                                 resulting length (in bytes)
  implicit                                   xs:length facet
  explicit                                   dfdl:length
  endOfData or delimited or nullTerminated   variable

  other (explicit, and endOfData or delimited or nullTerminated):
    Validation: xs:length facet must be equal to resulting length in bytes
    (TBD: similar range checks on xs:minLength, xs:maxLength)

I'm very happy with this for V1.0.

Any further comments, or should we go with this for V1.0?

...mikeb


Steve Hanson <smh@uk.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
11/19/2007 10:23 AM
To: dfdl-wg@ogf.org
cc:
Subject: Re: [DFDL-WG] DFDL hexBinary and base64Binary

My view: The logical type is binary, so the data in the information item is binary, the length facets should always deal in bytes, and validation checks the length of the binary data in bytes.

From the above, of the two simplifications below, I would rather disallow the text representations of xs:hexBinary and xs:base64Binary.
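To make the "length facets always deal in bytes" view concrete, here is a minimal Python sketch. It is only an illustration: the function name `validate_length` and its parameters are hypothetical and do not come from DFDL or MRM. It shows that the same logical value can be rendered as hex or base64 text, while the facet check only ever looks at the byte count of the decoded value.

```python
import base64
import binascii
from typing import Optional


def validate_length(value: bytes,
                    length: Optional[int] = None,
                    min_length: Optional[int] = None,
                    max_length: Optional[int] = None) -> bool:
    """Check XSD-style length facets against the number of bytes in value.

    Hypothetical helper: the facets count bytes of the unencoded binary
    data, never characters of any textual rendering.
    """
    n = len(value)
    if length is not None and n != length:
        return False
    if min_length is not None and n < min_length:
        return False
    if max_length is not None and n > max_length:
        return False
    return True


# One logical value: four bytes.
value = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# Two string renderings of that same value. They differ only as text.
as_hex = binascii.hexlify(value).decode("ascii").upper()
as_b64 = base64.b64encode(value).decode("ascii")

# Both renderings decode back to the identical bytes, so a length check
# on the logical value gives the same answer either way.
from_hex = binascii.unhexlify(as_hex)
from_b64 = base64.b64decode(as_b64)

print(as_hex, as_b64, validate_length(from_hex, length=4))
```

The character counts of the two renderings differ, but the facet check does not see them, which is the reason the two types behave as synonyms once the text representation is excluded.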
Fyi MRM today:
- does not support text reps for binary
- has not had such a request from users
- uses length/minLength/maxLength facets to validate binary field length post-parse
- uses length/maxLength to populate the default for the physical length.

Regards, Steve


Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
16/11/2007 23:09
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] DFDL hexBinary and base64Binary

I'm trying to wrap up the opaque/hexBinary/base64Binary topic. I need opinions on this discussion.

Currently we have a property, dfdl:binaryType:

  Properties Specific to Binary Types (hexBinary, base64Binary)

  Property Name   Description
  binaryType      Enum
                  This specifies the encoding method for the binary.
                  Valid values are ‘unspecified’, ‘hexBinary’,
                  ‘base64Binary’, ‘uuencoded’
                  Annotation: dfdl:element (simple type ‘binary’, ‘opaque’)

This property speaks to the question: from what kinds of representations can we interpret and construct logical hexBinary values (and similarly base64Binary values)?

I believe the above is not clear, and causes issues with the xs:length facet of XSD.

I propose the 4 tables below, which describe the 4 cases:
- hexBinary - binary
- hexBinary - text
- base64Binary - binary
- base64Binary - text

I have specified these so that the meaning of the xs:length facet is always interpreted exactly as in XSD. It always refers to the number of bytes of the unencoded binary data, and never to the number of characters in the encoded form.
  type:            xs:hexBinary
  representation:  binary

  lengthKind                                 resulting length (in bytes)
  implicit                                   xs:length facet
  explicit                                   dfdl:length
  endOfData or delimited or nullTerminated   variable

  other (explicit):
    Validation: xs:length facet must be equal to resulting length in bytes
    (TBD: similar range checks on xs:minLength, xs:maxLength)


  type:            xs:hexBinary
  representation:  text

  lengthKind                                 resulting length (in characters)
  implicit                                   2 * xs:length facet
  explicit                                   dfdl:length
  endOfData, delimited, nullTerminated       variable

  other (explicit):
    Validation: xs:length facet * 2 must be equal to resulting character
    length (after removing all non-hex characters)
    (TBD: similar range checks on xs:minLength, xs:maxLength)


  type:            xs:base64Binary
  representation:  binary

  dfdl:lengthKind                            resulting length (in bytes)
  implicit                                   xs:length facet
  explicit                                   dfdl:length
  endOfData or delimited or nullTerminated   variable

  other (explicit):
    Validation: xs:length facet must be equal to resulting length in bytes
    (TBD: similar range checks on xs:minLength, xs:maxLength)


  type:            xs:base64Binary
  representation:  text

  lengthKind                                 resulting length (in characters)
  implicit                                   8/6 * xs:length facet
  explicit                                   dfdl:length
  endOfData, delimited, nullTerminated       variable

  other (explicit):
    Validation: xs:length facet * 8/6 must be equal to resulting character
    length (after removing all non-base64-encoding characters)
    (TBD: similar range checks on xs:minLength, xs:maxLength)

Looking at the above, one way to simplify things quite a bit is to disallow the xs:length, xs:minLength and xs:maxLength facets on hexBinary and base64Binary types in DFDL schemas. Then the implicit lengthKind goes away, and the complex validation check for the xs:length facet goes away. I recommend this.

Another simplification alternative is to disallow representation text altogether, but I am concerned that people whose data does contain hex or base64 data will naturally want to use these types to model it. I don't recommend this.
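The character-length arithmetic in the two 'text' tables can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the proposal: the function names are hypothetical, the 8/6 ratio is rounded up to a whole character, and '=' padding is treated as one of the "non-base64-encoding characters" that are removed before counting.

```python
import base64
import binascii
import string

# Assumption: the standard base64 alphabet, with '=' padding not counted.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/")
HEX_ALPHABET = set(string.hexdigits)


def hex_text_length(xs_length: int) -> int:
    """Characters of hex text for xs_length bytes: 2 * xs:length."""
    return 2 * xs_length


def base64_text_length(xs_length: int) -> int:
    """Base64 characters for xs_length bytes: 8/6 * xs:length, rounded up.

    Integer arithmetic avoids floating-point rounding surprises.
    """
    return (xs_length * 8 + 5) // 6


def check_hex_text(text: str, xs_length: int) -> bool:
    """Strip non-hex characters, then compare against 2 * xs:length."""
    significant = [c for c in text if c in HEX_ALPHABET]
    return len(significant) == hex_text_length(xs_length)


def check_base64_text(text: str, xs_length: int) -> bool:
    """Strip non-alphabet characters (including '='), then compare."""
    significant = [c for c in text if c in BASE64_ALPHABET]
    return len(significant) == base64_text_length(xs_length)


# Ten bytes of sample data, rendered both ways.
data = bytes(range(10))
hex_text = binascii.hexlify(data).decode("ascii")
b64_text = base64.b64encode(data).decode("ascii")

print(len(hex_text), check_hex_text(hex_text, len(data)))
print(len(b64_text), check_base64_text(b64_text, len(data)))
```

Note that the padded base64 string is longer than the 8/6 figure suggests, because padding rounds the output up to a multiple of four characters. That is one example of the complexity these validation rules would bring.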
...mikeb

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU