Issues: Additional Data Types for DFDL

 

Robert E. McGrath

September 2, 2005

 

The current specification defines elements for standard numerical types, strings, date/time, and composites of these types. This note discusses six additional data types that may be added to DFDL.

 

1. Enum

 

This type has a set of <name, value> pairs, e.g., <"Red", 0>, <"Blue", 1>, etc. The values are stored in the data, with the name-value pairs stored in metadata.

 

Note: one use is for localization, using different maps to give localized strings.
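The enum idea can be sketched in Python as follows. This is only an illustration, not proposed DFDL syntax: the ENUM_MAPS table, the locale keys, and the one-byte storage width are all assumptions made for the example.

```python
# Hypothetical metadata: stored value -> name, one map per locale.
# Only the integer values would appear in the data itself.
ENUM_MAPS = {
    "en": {0: "Red", 1: "Blue"},
    "fr": {0: "Rouge", 1: "Bleu"},
}


def decode_enum(raw: bytes, locale: str = "en") -> str:
    """Interpret a one-byte stored value through the map for a locale."""
    value = raw[0]
    return ENUM_MAPS[locale][value]


def encode_enum(name: str, locale: str = "en") -> bytes:
    """Reverse lookup: turn a name back into the one-byte stored value."""
    for value, candidate in ENUM_MAPS[locale].items():
        if candidate == name:
            return bytes([value])
    raise ValueError(f"unknown enum name: {name!r}")
```

Swapping the map changes the strings a reader sees without touching the stored data, which is the localization use noted above.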

 

Difficulty: Low

Priority: Low

 

2. Opaque (tagged)

 

This is a non-numeric bit string with a length and an identifying tag.

 

This might be used, for example, for 1024-bit encryption keys. The type means "just pass through the bits".

 

More generally, it can be used to store any kind of "blob", such as an object that is meaningful only to specific software.

 

This can be simulated with unsigned integers, but it may be useful to know that the value is not really an integer.
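A minimal Python sketch of a tagged opaque value is given below. The class name Opaque, the string tag, and the helper functions are hypothetical; the point is only that the reader carries the bytes through unchanged and keeps the tag beside them instead of treating the field as a number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opaque:
    tag: str     # hypothetical label saying what the bits are, e.g. "key"
    data: bytes  # uninterpreted payload

    @property
    def bit_length(self) -> int:
        return len(self.data) * 8


def read_opaque(buf: bytes, offset: int, nbytes: int, tag: str) -> Opaque:
    """Copy nbytes out of buf without interpreting them."""
    chunk = buf[offset:offset + nbytes]
    if len(chunk) != nbytes:
        raise ValueError("buffer too short for opaque field")
    return Opaque(tag, chunk)


def write_opaque(value: Opaque) -> bytes:
    """Pass the bits through exactly as they were read."""
    return value.data
```

A 1024-bit key would be read with nbytes=128, and no byte-order conversion is ever applied to it.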

 

Difficulty: Low

Priority: Low

 

3. "Code"

 

How should "code" be marked up? It is usually stored in blobs, but it needs a tag so you know how to interpret it.

 

This is actually a special case of "opaque".

 

Difficulty: Low

Priority: Low

 

 

4. Bitfield / packed

 

This type consists of bit fields packed into bytes.
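As an illustration, the Python sketch below splits a packed byte string into consecutive unsigned fields and packs them again. The function names are hypothetical, and the bit order (most significant bit first, with any unused bits left at the end) is an assumption; a real description would have to state the order explicitly.

```python
def unpack_bitfields(data: bytes, widths: list[int]) -> list[int]:
    """Split data into consecutive unsigned fields, most significant bit first."""
    total = sum(widths)
    if total > len(data) * 8:
        raise ValueError("field widths exceed available bits")
    # Treat the bytes as one big integer and drop unused trailing bits.
    as_int = int.from_bytes(data, "big") >> (len(data) * 8 - total)
    fields = []
    remaining = total
    for width in widths:
        remaining -= width
        fields.append((as_int >> remaining) & ((1 << width) - 1))
    return fields


def pack_bitfields(values: list[int], widths: list[int]) -> bytes:
    """Inverse of unpack_bitfields; pads the last byte with zero bits."""
    acc = 0
    for value, width in zip(values, widths):
        if value < 0 or value >> width:
            raise ValueError(f"{value} does not fit in {width} bits")
        acc = (acc << width) | value
    total = sum(widths)
    nbytes = (total + 7) // 8
    return (acc << (nbytes * 8 - total)).to_bytes(nbytes, "big")
```

For example, the single byte 0b10110100 with widths 1, 3, and 4 yields the fields 1, 3, and 4.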

 

Difficulty: Low

Priority: Low

 

 

5. Pointer

 

Many times there will be pointers within the data, e.g., to offsets in the file, or to indexes in an array. This will be critical for storing objects such as lists or trees.

 

URLs and XPaths are not especially well suited for this.

 

This can be simulated with unsigned integers, but they need to be "swizzled" when translating, so they need to be tagged.
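To show why the tag matters, the Python sketch below translates a small linked list from one record layout to a wider one. Both layouts, the null markers, and the translate function are invented for the example. Because the records change size, every "next" offset must be rewritten, which the translator can only do if it knows that field is a pointer and not an ordinary integer.

```python
import struct

# Hypothetical layouts: each record is (value, next), where next is the
# byte offset of the following record or a null marker.
OLD = struct.Struct("<HH")  # 4-byte records
NEW = struct.Struct("<II")  # 8-byte records
OLD_NULL, NEW_NULL = 0xFFFF, 0xFFFFFFFF


def translate(old: bytes) -> bytes:
    """Rewrite records in the new layout, swizzling each pointer field."""
    count = len(old) // OLD.size
    # Map each old record offset to the same record's offset in the new layout.
    relocate = {i * OLD.size: i * NEW.size for i in range(count)}
    relocate[OLD_NULL] = NEW_NULL
    out = bytearray()
    for i in range(count):
        value, nxt = OLD.unpack_from(old, i * OLD.size)
        # value is copied as-is; nxt is a pointer, so it is relocated.
        out += NEW.pack(value, relocate[nxt])
    return bytes(out)
```

The value field is simply widened, while the pointer field goes through the relocation table; an untagged unsigned integer would give the translator no way to tell the two apart.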

 

Note that there might be several types of addressing within the data:

        Offset from zero

        Offset relative to "foo"

The offsets might be in different increments: bits, bytes, words, elements, etc.

 

There could be multi-part addresses, e.g., page + offset in page.
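The addressing variations above can be sketched in Python as follows. The Pointer class, its field names, and the choice to normalize everything to an absolute bit position are assumptions made for illustration, not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    offset: int         # the value actually stored in the data
    unit_bits: int = 8  # size of one increment: 1 = bit, 8 = byte,
                        # 32 = a 4-byte word or element, and so on
    base_bits: int = 0  # absolute bit position of the reference point;
                        # 0 means "offset from zero"


def resolve(p: Pointer) -> int:
    """Return the absolute bit position the pointer refers to."""
    return p.base_bits + p.offset * p.unit_bits


def resolve_paged(page: int, offset: int, page_size: int) -> int:
    """Turn a two-part address (page + offset in page) into a byte offset."""
    if not 0 <= offset < page_size:
        raise ValueError("offset lies outside the page")
    return page * page_size + offset
```

A pointer description would therefore need at least three pieces of metadata beyond the stored integer: the reference point, the increment size, and whether the address has more than one part.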

 

Difficulty: Medium

Priority: High

 

 

6. Array

 

This is a critical type and must be supported.

 

There are a lot of issues.

 

I am preparing a separate memo.

 

Difficulty: High

Priority: Very High