[DFDL-WG] Proposed Errata 2.95 change: Mandatory Alignment for Textual Data

12 Dec 2012

This proposed erratum either replaces/updates erratum 2.95, or cancels it
and stands as a new erratum.

Section 12.1.1 is amended.

The table of explicit alignments (Table 14) is modified: the column for
Text is dropped.

A new section is added: Mandatory Alignment for Textual Data.

We use the term textual data to describe data with
dfdl:representation="text", as well as data being matched to delimiters
(parsing) or output as delimiters (unparsing), and data being matched to
regular expressions (parsing only - as in a dfdl:assert with
testKind='pattern').

Textual data has a mandatory alignment that depends on the character set
encoding. That is, the requirement comes from the character set specified
by the dfdl:encoding property.

When processing textual data, it is a schema definition error if the
dfdl:alignment and dfdl:alignmentUnits properties are used to specify
alignment that is not a multiple of the encoding-required mandatory
alignment.
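
To make the multiple-of rule concrete, here is a minimal sketch of the
check in Python. It is only an illustration, not proposed spec text: the
names check_text_alignment and SchemaDefinitionError are hypothetical, the
mandatory alignment is passed in as a number of bits, and only an explicit
integer dfdl:alignment is handled (not 'implicit').

```python
# Illustrative sketch only; helper names are hypothetical, not from the spec.

# Size in bits of each dfdl:alignmentUnits value.
UNIT_BITS = {"bits": 1, "bytes": 8}


class SchemaDefinitionError(Exception):
    """Stands in for a DFDL schema definition error."""


def check_text_alignment(alignment, alignment_units, mandatory_bits):
    """Return the specified alignment in bits, or raise if it is not a
    multiple of the encoding's mandatory alignment (also given in bits)."""
    specified_bits = alignment * UNIT_BITS[alignment_units]
    if specified_bits % mandatory_bits != 0:
        raise SchemaDefinitionError(
            f"alignment of {specified_bits} bits is not a multiple of the "
            f"mandatory {mandatory_bits}-bit alignment for this encoding"
        )
    return specified_bits
```

Under these assumptions, dfdl:alignment="4" with dfdl:alignmentUnits="bits"
would be rejected for an encoding with 8-bit mandatory alignment, but
accepted for one with 1-bit mandatory alignment.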

If the data is not aligned to the proper boundary for the encoding when
textual data is processed, then bits are skipped (parsing) or filled from
dfdl:fillByte (unparsing) to achieve the mandatory alignment.
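
As a rough illustration of how many bits would be skipped or filled, the
computation can be sketched as below. The function name is hypothetical,
and the sketch assumes the current position is a 0-based bit offset from
the start of the data, which is a choice made for the example and not
something this erratum specifies.

```python
# Illustrative sketch only; assumes a 0-based bit offset.

def bits_to_next_boundary(bit_offset, mandatory_bits):
    """Number of bits to skip (parsing) or fill (unparsing) so that
    bit_offset lands on a multiple of mandatory_bits."""
    # Python's % yields a non-negative result for a positive modulus,
    # so this is 0 when the offset is already aligned.
    return (-bit_offset) % mandatory_bits
```

For example, at bit offset 3 with an 8-bit mandatory alignment, 5 bits are
skipped or filled; with a 1-bit mandatory alignment the result is always 0.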

All character set encodings except those listed specifically below have
mandatory alignment of 8-bit/1-byte.

For encoding US-ASCII-7bit-packed, the alignment is 1-bit (textual data in
this encoding may appear on any bit boundary, i.e., no byte alignment is
required).
TBD: Other encodings...ECMA-6bit, etc.
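
One way an implementation might represent the per-encoding rule above is a
small table of exceptions with an 8-bit default. This is a sketch under
assumptions: the table and function names are hypothetical, encoding names
are compared case-insensitively purely for convenience, and only the one
encoding named above is listed, since the others are still TBD.

```python
# Illustrative sketch only; names are hypothetical.

# Encodings whose mandatory alignment differs from the default,
# keyed by upper-cased encoding name, value in bits.
MANDATORY_ALIGNMENT_EXCEPTIONS = {
    "US-ASCII-7BIT-PACKED": 1,
}

DEFAULT_MANDATORY_ALIGNMENT_BITS = 8  # 8-bit/1-byte for everything else


def mandatory_alignment_bits(encoding):
    """Return the mandatory text alignment, in bits, for an encoding name."""
    return MANDATORY_ALIGNMENT_EXCEPTIONS.get(
        encoding.upper(), DEFAULT_MANDATORY_ALIGNMENT_BITS
    )
```

With this table, "UTF-8" maps to 8 bits and "US-ASCII-7bit-packed" maps to
1 bit; further exceptions would be added as entries once they are decided.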

-- 
Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
Tel:  781-330-0412