Can anyone recall the use case driving the need for encoding="bytes"?

I've been trying to reconstruct it and have not been able to. It was supposedly for using string literal syntax to describe padding fields and constant data surrounding binary data.

The spec now has % as the hex escape for string literals, and the bytes specified using it are not subject to character set translation.

So, if you mean the data has an initiator that contains hex 6B followed by hex 2C, and you neither know nor care what characters those correspond to, then just write "%6B%2C". You don't have to deal with character set encodings at all.
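
To make the point concrete, here is a minimal Python sketch of how such a literal could be turned into bytes. It assumes that "%" followed by exactly two hex digits denotes one raw byte and that every other character goes through the chosen character set. The full escape grammar is not quoted in this thread, and the name literal_to_bytes is hypothetical, not part of any DFDL implementation.

```python
import re

# Hypothetical helper for illustration; not part of any DFDL implementation.
# Assumption: "%" followed by exactly two hex digits denotes one raw byte.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def literal_to_bytes(literal: str, encoding: str = "ascii") -> bytes:
    """Return the bytes denoted by a string literal containing %XX escapes."""
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(literal):
        # Ordinary characters before the escape go through the character set.
        out += literal[pos:match.start()].encode(encoding)
        # The escaped byte is copied as-is, bypassing character set translation.
        out.append(int(match.group(1), 16))
        pos = match.end()
    # Any trailing ordinary characters are encoded as well.
    out += literal[pos:].encode(encoding)
    return bytes(out)


print(literal_to_bytes("%6B%2C").hex())           # 6b2c
print(literal_to_bytes("A%6B", "cp037").hex())    # c16b
```

In the second call the "A" is translated to its EBCDIC (cp037) code point, while the escaped byte 6B stays the same whatever the encoding is.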

So, I can't see why we need encoding="bytes" anymore.

...mikeb



Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan  
                 priordan@us.ibm.com
                 508-599-7046




From: Suman Kalia/Toronto/IBM@IBMCA
Date: 08/16/2007 08:54 AM
To: dfdl-wg@ogf.org
cc: Mike Beckerle/Worcester/IBM@IBMUS
Subject: Fw: [DFDL-WG] Proposal - simplifying opaque types




Sorry I could not call in yesterday, as I had a conflict with another meeting. Here are my comments on the proposal.

>>  A byte string is described like this:

>>  <element name="foo" type="string"
>>         dfdl:encoding="bytes" dfdl:lengthKind="fixed" dfdl:length="17"/>

I am not sure that dfdl:encoding="bytes" is the correct way to model an opaque blob of data, which may not be a string at all but a complex structure containing a mixture of text and binary data, and which may have alignment requirements within the memory layout. I believe the correct way to model a blob is really hexBinary or base64Binary. These types already describe the encoding, and by default you would specify the length as a number of bytes, which could be expressed through the length facet.
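
As a rough illustration of this argument, the Python sketch below builds a made-up 17-byte record that mixes a text tag with binary fields. The field layout is purely hypothetical and is not taken from the proposal. The record cannot be decoded as UTF-8 text, but it always has a valid hexBinary lexical form, two hex digits per byte.

```python
import struct

# Hypothetical 17-byte record, for illustration only:
# 4-byte text tag, 4-byte unsigned int, 8-byte double, 1 flag byte (big-endian).
blob = struct.pack(">4sIdB", b"HDR1", 0xFFFE0001, 2.5, 0x80)

# Treating the blob as character data fails: 0xFF is never valid in UTF-8.
try:
    blob.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# The hexBinary lexical form works for any byte sequence.
hex_binary = blob.hex().upper()

print(len(blob))                          # 17
print(decodable)                          # False
print(len(hex_binary))                    # 34
print(bytes.fromhex(hex_binary) == blob)  # True
```

The length here is counted in bytes (17), which matches the idea of constraining a hexBinary value with the length facet rather than with a character count.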

Regarding the hexBinary type example provided in section 7.2:

When you model something as hexBinary, it is wrong to say the representation is text, because data represented through hexBinary may not be a string at all (it could be a complex structure), so the DFDL representation given in the example below would not be correct. The example provided by Simon was a simple one where the data just happened to be string data.

>> Suppose we wanted to model this same data logically as a hexBinary 'blob'.

>>  <element name="d" type="hexBinary" dfdl:representation="text"
>>         dfdl:encoding="utf-8"
>>         dfdl:length="11"
>>         dfdl:lengthUnitKind="characters"/>


In some respects, by having dfdl:encoding="bytes" and making it insensitive to character encoding, we are simulating what hexBinary already specifies.
Question: Why promote an additional DFDL-specific construct when this could be catered for through a built-in schema type?

PS: I am not against having dfdl:encoding="bytes" as specified in the spec, but promoting it as a mechanism to model opaque binary data will not sit well with XML Schema folks.


Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850 T/L  969-4850
Internet ID : kalia@ca.ibm.com

----- Forwarded by Suman Kalia/Toronto/IBM on 08/16/2007 08:49 AM -----
From: Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
Date: 08/10/2007 03:26 PM
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Proposal - simplifying opaque types






Please review for discussion on a future call.




--
 dfdl-wg mailing list
 dfdl-wg@ogf.org
 http://www.ogf.org/mailman/listinfo/dfdl-wg