Mike, thanks for writing this up; I think
we are close. Comments in-line.
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 31/10/2012 19:08
Subject: Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)
Sent by: dfdl-wg-bounces@ogf.org
Revision 2 per workgroup call on 2012-10-31
This is a rewrite, not a set of edits.
---------------------------------------------
Clarification: At any single annotation point of the schema, there
can be only one format annotation (dfdl:format, dfdl:element, dfdl:sequence,
dfdl:choice, dfdl:group, dfdl:simpleType).
Glossary: DFDL Statement annotations, or just DFDL Statements, are
the annotation elements dfdl:assert, dfdl:discriminator, dfdl:setVariable,
and dfdl:newVariableInstance.
SMH: Nice idea. What about dfdl:defineVariable, is that a statement
annotation too? Where does that leave dfdl:defineFormat and
dfdl:defineEscapeScheme - they are not format annotations (that's their
content). Do we have 'global', 'statement' and 'format' annotations?
Glossary: Combined annotations: When annotations are combined
between a group reference and the sequence or choice of the referenced
global group, or among an element reference, an element declaration, and
its type definition, the resulting set of annotations is referred to as
the combined annotations.
DFDL Statement Annotation Placement
dfdl:assert and dfdl:discriminator can be placed as annotations on sequence,
choice, group references, local and global element declarations, element
references, and simple type definitions.
dfdl:setVariable may be placed as an annotation on sequence, choice, group
references, local and global element declarations for elements of simple
type, element references to elements of simple type, and simple type definitions.
dfdl:newVariableInstance can be placed as an annotation on sequence, choice,
and group references.
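The placement rules above can be read as a small lookup. The following is only a sketch of that reading, not normative text: the component labels ("sequence", "group_ref", "simple_type", and so on), the function name placement_allowed and the simple_type_element flag are all hypothetical names chosen for illustration.

```python
# Hypothetical labels for schema components; these names are assumptions,
# not identifiers from the DFDL specification.
GROUP_KINDS = {"sequence", "choice", "group_ref"}
ELEMENT_KINDS = {"local_element", "global_element", "element_ref"}


def placement_allowed(statement, component, simple_type_element=False):
    """Return True if `statement` may annotate `component` under the draft rules.

    `simple_type_element` only matters for element components and says
    whether the element is of simple type.
    """
    if statement in ("assert", "discriminator"):
        return component in (GROUP_KINDS | ELEMENT_KINDS | {"simple_type"})
    if statement == "setVariable":
        if component in ELEMENT_KINDS:
            # Elements may carry dfdl:setVariable only if they are of simple type.
            return simple_type_element
        return component in (GROUP_KINDS | {"simple_type"})
    if statement == "newVariableInstance":
        return component in GROUP_KINDS
    raise ValueError(f"unknown statement kind: {statement}")
```

A schema checker could call this once per statement annotation and raise a schema definition error whenever it returns False.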
The combined annotations for any schema component can contain either a single
dfdl:discriminator or any number of dfdl:assert statements, but not both
asserts and a discriminator. It is a schema definition error otherwise.
The combined annotations for any schema component can contain multiple
dfdl:setVariable annotations, but they must each refer to a different variable.
It is a schema definition error otherwise.
The combined annotations for any schema component can contain multiple
dfdl:newVariableInstance annotations, but they must each refer to a different
variable. It is a schema definition error otherwise.
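The three restrictions on combined annotations can be checked mechanically. The sketch below assumes a made-up representation in which each statement is a (kind, variable_name) tuple, with variable_name set to None for asserts and discriminators; the function name and the error wording are illustrative only.

```python
from collections import Counter


def check_combined_statements(statements):
    """Return schema definition error messages for one combined annotation set.

    `statements` is a list of (kind, variable_name) tuples. This shape is an
    assumption made for the example, not something the draft defines.
    """
    errors = []
    kinds = Counter(kind for kind, _ in statements)
    if kinds["discriminator"] > 1:
        errors.append("more than one dfdl:discriminator")
    if kinds["discriminator"] and kinds["assert"]:
        errors.append("dfdl:assert and dfdl:discriminator used together")
    # Each variable may be named at most once per statement kind.
    for kind in ("setVariable", "newVariableInstance"):
        names = Counter(name for k, name in statements if k == kind)
        for name, count in sorted(names.items()):
            if count > 1:
                errors.append(f"dfdl:{kind} refers to variable {name!r} {count} times")
    return errors
```

Note that, as written, a dfdl:newVariableInstance and a dfdl:setVariable naming the same variable in one combined set are not flagged, since the draft restricts only repeats within each statement kind.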
Evaluation Order for Statement Annotations
Assertions Before:
A dfdl:discriminator or dfdl:assert with testKind='pattern' is executed
before parsing the annotated construct.
SMH: Wording needs to cater for combined annotations.
Note that the pattern is used to match against the entire representation
of the component; hence, the framing (including initiators, etc.) is
visible to the pattern. The dfdl:encoding property is used when decoding
the data to characters before matching.
It is a schema definition error if alignment is not 1 and a dfdl:discriminator
or dfdl:assert with testKind='pattern' is used.
(TBD: Are restrictions needed for lengthKind='prefixed' as well? Are there
any other framing properties with which assertions with testKind='pattern'
are really incompatible?)
SMH: If alignment <> 1 is a schema definition error then leadingSkip <> 0
should be one too. I'd leave it there though.
It should also be a schema definition error if encoding is not set.
If there are multiple dfdl:assert statements with testKind='pattern' the
order of execution among them is not specified.
Schema authors can insert sequences to control the timing of evaluation
of statements more precisely.
Assertions After:
A dfdl:discriminator or dfdl:assert with testKind='expression' (the default)
is executed after parsing the annotated construct.
SMH: Wording needs to cater for combined annotations.
Furthermore, an attempt to evaluate a discriminator must be made even if
the parse of the annotated construct ended in a parse error. This is because
a discriminator could evaluate to true thereby resolving a point of uncertainty
even if the complete parsing of the construct ultimately caused a parse
error. Such discriminator evaluation has access to the DFDL Infoset of
the attempted parse as it existed immediately before detecting the parse
failure.
Implementations are free to optimize by recognizing and executing discriminators
or assertions earlier so long as the resulting behavior is consistent with
what results from the above description.
If there are multiple dfdl:assert statements with testKind='expression',
then the order of execution among them is not specified. Schema authors
can insert sequences to control the timing of evaluation of statements
more precisely.
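The requirement that a discriminator be attempted even after a parse error can be sketched as follows. This is a toy model under stated assumptions: the infoset is a plain dict filled in by the parse, ParseError and parse_then_discriminate are hypothetical names, and the sketch does not cover what happens if the discriminator expression itself cannot be evaluated.

```python
class ParseError(Exception):
    """Stand-in for a DFDL parse error (hypothetical name)."""


def parse_then_discriminate(parse, discriminator):
    """Run `parse`, then evaluate `discriminator` even if the parse failed.

    `parse(infoset)` fills the dict `infoset` as it goes and may raise ParseError.
    `discriminator(infoset)` returns a truth value.
    Returns (discriminator_result, parse_error_or_None).
    """
    infoset = {}
    parse_error = None
    try:
        parse(infoset)
    except ParseError as err:
        # The partial infoset is kept as it stood just before the failure.
        parse_error = err
    return bool(discriminator(infoset)), parse_error


def failing_parse(infoset):
    infoset["tag"] = "A"                     # first child parses successfully
    raise ParseError("body is malformed")    # a later child fails


# The discriminator still sees the "tag" value from the attempted parse.
resolved, error = parse_then_discriminate(
    failing_parse, lambda infoset: infoset.get("tag") == "A"
)
```

Here the caller would treat the point of uncertainty as resolved and report the parse error for the chosen branch, rather than backtracking to another alternative.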
The dfdl:newVariableInstance Statement
These statements are evaluated before the parsing of the annotated construct.
When there is more than one newVariableInstance statement the order of
execution among them is not specified. Schema authors can insert
sequences to control the timing of evaluation of statements more precisely.
All dfdl:newVariableInstance statements are executed before any dfdl:setVariable
statements on the same annotated construct.
SMH: Wording needs to cater for combined annotations.
The dfdl:setVariable Statement
When a dfdl:setVariable annotation is found on an element reference, element
declaration, or simple type definition, then it is executed after the parsing
of the element, which implies after the evaluation of expressions corresponding
to any computed format properties. That is, if an expression is used to
provide the value of a format property such as dfdl:terminator, the evaluation
of that expression occurs before any dfdl:setVariable annotation is executed;
hence, the expression providing the value of the format property may not
reference the variable.
When a dfdl:setVariable annotation is found in the combined set of annotations
for a sequence, choice, or group reference, then it is executed after any
dfdl:newVariableInstance statements in that same combined set, but it is
executed before the parsing of the sequence, choice, or group reference.
If there are multiple dfdl:setVariable statements in one combined set of
annotations, then the order of evaluation among them is not specified.
Schema authors can insert sequences to control the timing of evaluation
of statements more precisely.
SMH: Wording needs to cater for combined annotations.
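Pulling the timing rules for all four statements together, a minimal sketch might look like this. The component labels and both function names are assumptions for illustration. The sketch deliberately encodes only what the draft states: it says whether a statement runs before or after the parse of the annotated construct, and the one ordering constraint between statement kinds. It takes no position on the relative order of pattern assertions and variable statements, which the draft does not specify.

```python
# Hypothetical labels for model-group components (assumed names).
GROUP_KINDS = {"sequence", "choice", "group_ref"}


def runs_before_parse(statement, component, test_kind="expression"):
    """Return True if the statement executes before parsing the annotated construct."""
    if statement == "newVariableInstance":
        return True
    if statement in ("assert", "discriminator"):
        # Pattern tests run before the parse; expression tests run after it.
        return test_kind == "pattern"
    if statement == "setVariable":
        # Before the parse on sequence, choice and group references;
        # after the parse on elements and simple types.
        return component in GROUP_KINDS
    raise ValueError(f"unknown statement kind: {statement}")


def must_precede(first, second):
    """Return True if the draft fixes `first` before `second` in one combined set."""
    return first == "newVariableInstance" and second == "setVariable"
```

Statements of the same kind are left unordered, matching the draft's position that authors should insert sequences when they need a specific order.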