encoding="bytes" not needed? - was Re: Fw: Proposal - simplifying opaque types

Can anyone recall the use case driving the need for encoding="bytes"? I've been trying to reconstruct it and have not been able to.

It was supposedly for using string literal syntax to describe padding fields and constant data found surrounding binary data.

The spec now has % as the hex escape for string literals, and the bytes specified using it are not subject to character set translation. So, if you mean the data has an initiator that contains hex 6B followed by hex 2C, and you neither know nor care what characters those correspond to, then just write "%6B%2C". You don't have to deal with character set encodings at all.

So, I can't see why we need encoding="bytes" anymore.

...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan priordan@us.ibm.com 508-599-7046

Suman Kalia/Toronto/IBM@IBMCA
08/16/2007 08:54 AM
To: dfdl-wg@ogf.org
cc: Mike Beckerle/Worcester/IBM@IBMUS
Subject: Fw: [DFDL-WG] Proposal - simplifying opaque types

Sorry I could not call in yesterday, as I had a conflict with another meeting. Here are my comments on the proposal.

A byte string is described like this:
<element name="foo" type="string" dfdl:encoding="bytes" dfdl:lengthKind="fixed" dfdl:length="17"/>
I am not sure that dfdl:encoding="bytes" is the correct way to model an opaque blob of data, which may not be a string but a complex structure containing a mixture of text and binary data, and which may have alignment requirements within the memory layout. I believe the correct way to model a blob is really hexBinary or base64Binary: these types do describe the encoding, and by default you would specify the length as a number of bytes, which could be described through the length facet.

Regarding the hexBinary type example provided in section 7.2: when you model something as hexBinary, it is wrong to say the representation is text, because data represented through hexBinary may not be a string at all (it could be a complex structure), and the DFDL representation given in the example below would not be correct. The example provided by Simon was simple in that the data just happened to be string data.

Suppose we wanted to model this same data logically as a hexBinary 'blob'.
<element name="d" type="hexBinary" dfdl:representation="text" dfdl:encoding="utf-8" dfdl:length="11" dfdl:lengthUnitKind="characters"/>

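The objection that hexBinary data need not be text can be illustrated with a small Python sketch. The byte values and the helper name `is_valid_utf8` are hypothetical and chosen only for illustration; nothing here comes from the DFDL spec. It shows a blob that has a perfectly good hexBinary lexical form (two hex digits per byte) yet cannot be read as UTF-8 text.

```python
# Hypothetical 4-byte blob; the values are chosen only for illustration.
blob = bytes([0x00, 0xFF, 0x6B, 0x2C])

# The xs:hexBinary lexical form uses two hex digits per byte.
lexical = blob.hex().upper()


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The blob round-trips through its hexBinary form regardless of content.
round_tripped = bytes.fromhex(lexical)

# Binary content is not valid UTF-8, while data that "just happens to be
# string data" is.
blob_is_text = is_valid_utf8(blob)
string_is_text = is_valid_utf8(b"hello world")
```

Describing the blob's representation as text in a character encoding only works in the second case, which is the situation the comment attributes to the simple example.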
In some respects, by having dfdl:encoding="bytes" and making it insensitive to character encoding, we are simulating what hexBinary already specifies.

Question: Why promote an additional DFDL-specific construct when the need could be met by a built-in schema type?

PS: I am not against having dfdl:encoding="bytes" as specified in the spec, but promoting it as a mechanism to model opaque binary data will not sit well with XML Schema folks.

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel: 905-413-3923 T/L 969-3923
Fax: 905-413-4850 T/L 969-4850
Internet ID: kalia@ca.ibm.com

----- Forwarded by Suman Kalia/Toronto/IBM on 08/16/2007 08:49 AM -----

Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
08/10/2007 03:26 PM
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Proposal - simplifying opaque types

Please review for discussion on a future call.

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan priordan@us.ibm.com 508-599-7046

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg
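For readers following the argument at the top of this thread, the claim that %-escaped bytes bypass character set translation can be sketched in Python. This is a minimal illustration that assumes the escape syntax exactly as written in the message ('%' followed by two hex digits); the function name `literal_to_bytes` and its `encoding` parameter are hypothetical and not part of any DFDL implementation.

```python
import re

# Assumed syntax, taken from the message: '%' followed by two hex digits.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def literal_to_bytes(literal: str, encoding: str = "ascii") -> bytes:
    """Convert a string literal to bytes.

    Ordinary characters are translated through `encoding`; %XX escapes
    are emitted as raw bytes and never pass through the encoding.
    """
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(literal):
        # Plain text before the escape goes through the character set.
        out += literal[pos:match.start()].encode(encoding)
        # The escaped byte is appended untouched.
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += literal[pos:].encode(encoding)
    return bytes(out)


# The same escaped literal yields the same two bytes under any encoding.
ascii_bytes = literal_to_bytes("%6B%2C", "ascii")
ebcdic_bytes = literal_to_bytes("%6B%2C", "cp500")

# Only the unescaped character 'A' differs between the two encodings.
mixed_ascii = literal_to_bytes("A%6B", "ascii")
mixed_ebcdic = literal_to_bytes("A%6B", "cp500")
```

Under this reading, an initiator written as "%6B%2C" denotes bytes 6B 2C whatever the element's encoding is, which is the behaviour the message suggests makes encoding="bytes" unnecessary for constant data.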