re: Opaque or uninterpreted or raw fields.
These are sometimes called blobs, though database people reserve that term
for the acronym "BLOB", which stands for Binary Large Object and applies
to data too large for the smaller binary SQL types. I.e., there's no such
thing as a small BLOB in databases. I think on our mailing list we've used
"blob" to mean "opaque bytes" of any size at all.
I believe the use of the 'hexBinary' type is probably this same topic,
i.e., how to deal with data whose proper interpretation you don't know,
but whose size you can express, so that we can at least copy it from
place to place.
I think there are two choices here.
One is to just use "occurring" bytes, e.g., uninterpreted data of length
1234 bytes declared as a byte element that occurs 1234 times.
This is a basic binary byte array. I
think this works fine as a blob/opaque type. I believe we do not
need any other kind of raw/opaque type. If we had one, we'd have to have
a way to express its length, and be specific about the units of that length,
and the byte array accomplishes that with pretty much minimum baggage. You
name it what you want, e.g., "unused" or "dummy" or
"ignore" or whatever you want.
We might want an annotation to indicate
that this data should not be accessed, to distinguish this case from an
actual array of bytes that you DO want to access, but I'm not sure that's
worth it. Note that the OMG CAM model does have an access control attribute.
Perhaps we can use that. However, I doubt it allows distinguishing copy
from access.
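
A minimal sketch of the occurring-bytes choice, using plain XML Schema
only. The enclosing "record" element is hypothetical, and the choice of
xs:unsignedByte (rather than xs:byte) is an assumption; the element name
"ignoreMe" and the length 1234 are the ones used in this message:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: "record" is a hypothetical container element. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- Exactly 1234 occurrences of a one-byte element,
             i.e., an opaque array of 1234 bytes. -->
        <xs:element name="ignoreMe" type="xs:unsignedByte"
                    minOccurs="1234" maxOccurs="1234"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Here the length and its units come for free: the occurrence count is the
length, and the element type fixes the unit at one byte.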
The alternative is to use the "hexBinary"
type for this. In that case we need to express the size in the DFDL annotation.
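
A rough sketch of what the hexBinary alternative might look like. The
annotation syntax here is an assumption: it borrows the property names
lengthKind, length, and lengthUnits from the later-published DFDL 1.0
specification, which may differ from what is under discussion in this
thread. The "record" container is hypothetical, and a complete DFDL schema
would need further format properties that are omitted here:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: property names are taken from DFDL 1.0 as an assumption. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- One opaque field; its size is stated in the annotation. -->
        <xs:element name="ignoreMe" type="xs:hexBinary">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/">
              <dfdl:element lengthKind="explicit"
                            length="1234"
                            lengthUnits="bytes"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Unlike the occurring-bytes form, the size and its units have to be stated
explicitly, since hexBinary itself says nothing about length.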
I can think of one advantage of hexBinary
over the occurring bytes approach. Suppose you do want to use DFDL
in the obvious way to convert data into XML format. Never mind that DFDL
is supposed to let you avoid this; suppose it's what you want to do.
Then my byte array for the "ignoreMe" element ends up as one XML element
per byte, which is big compared to: <ignoreMe>000000000000...00</ignoreMe>
which is what we'd get if we allow hexBinary as a type.
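
To make the size difference concrete, here is a sketch of what the
occurring-bytes form might look like once converted to XML, assuming every
byte is zero and each occurrence is written as its own element. The
"record" wrapper is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: one element per byte, 1234 elements in total. -->
<record>
  <ignoreMe>0</ignoreMe>
  <ignoreMe>0</ignoreMe>
  <!-- ... 1232 more ignoreMe elements, one per remaining byte ... -->
</record>
```

Under those assumptions, and ignoring whitespace, each byte costs 22
characters (<ignoreMe>0</ignoreMe>), so 1234 bytes take 27148 characters.
The hexBinary form costs 2 hex digits per byte plus one pair of tags,
i.e., 2468 + 21 = 2489 characters.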
Note that if we add the hexBinary type,
you'll still be able to do it the other way, so the hexBinary notion is
not strictly speaking necessary or minimalist.
...mikeb
Mike Beckerle
Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA
Mike Beckerle/Worcester/IBM@IBMUS
Sent by: owner-dfdl-wg@ggf.org
09/02/2005 04:34 PM
To: "Robert E. McGrath" <mcgrath@ncsa.uiuc.edu>
cc: dfdl-wg@gridforum.org, owner-dfdl-wg@ggf.org
Subject: split into multiple topics - Re: [dfdl-wg] Issues: additional data types
I'd like to split this topic into several distinct ones:
Arrays - I have a placeholder for this in the doc.
Opaque and "code" types are separate. This is related also to
the concept of "open content".
Enums
Bitfields
Pointers
Mike Beckerle
"Robert E. McGrath"
<mcgrath@ncsa.uiuc.edu>
Sent by: owner-dfdl-wg@ggf.org
09/02/2005 03:13 PM
To: dfdl-wg@gridforum.org
cc:
Subject: [dfdl-wg] Issues: additional data types
Greetings,
Here is an "issue" for the DFDL: additional data types that should
be considered.
Please see attached.
---
Robert E. McGrath
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
Champaign, Illinois 61820
(217)-333-6549