Mike, thanks for writing this up. I think we are close. Comments in-line.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        dfdl-wg@ogf.org,
Date:        31/10/2012 19:08
Subject:        Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)
Sent by:        dfdl-wg-bounces@ogf.org




Revision 2 per workgroup call on 2012-10-31

This is a rewrite, not a set of edits.


---------------------------------------------


Clarification:
At any single annotation point of the schema, there can be only one format annotation (dfdl:format, dfdl:element, dfdl:sequence, dfdl:choice, dfdl:group, dfdl:simpleType). 

Glossary
: DFDL Statement annotations, or just DFDL Statements, are the annotation elements dfdl:assert, dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.

SMH: Nice idea. What about dfdl:defineVariable, is that a statement annotation too?  Where does that leave dfdl:defineFormat and dfdl:defineEscapeScheme - they are not format annotations (that's their content). Do we have 'global', 'statement' and 'format' annotations?

Glossary
: Combined annotations: When annotations are combined between a group reference and the sequence or choice of the referenced global group, or among an element reference, an element declaration, and its type definition, the combined set of annotations is referred to as the combined annotations.

DFDL Statement Annotation Placement


dfdl:assert and dfdl:discriminator can be placed as annotations on sequence, choice, group references, local and global element declarations, element references, and simple type definitions.

dfdl:setVariable may be placed as an annotation on sequence, choice, group references, local and global element declarations for elements of simple type, element references to elements of simple type, and simple type definitions.

dfdl:newVariableInstance can be placed as an annotation on sequence, choice, and group references.

The combined annotations for any schema component can contain either a single dfdl:discriminator or any number of dfdl:assert statements, but not both asserts and a discriminator. It is a schema definition error otherwise.

The combined annotations for any schema component can contain multiple dfdl:setVariable annotations, but they must each refer to a different variable. It is a schema definition error otherwise.

The combined annotations for any schema component can contain multiple dfdl:newVariableInstance annotations, but they must each refer to a different variable. It is a schema definition error otherwise.
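
The three rules above can be sketched as a small checker. This is only an illustration, not proposed spec text: the function name check_combined and the (kind, variable_ref) tuple representation of a combined annotation set are assumptions made for the example.

```python
from collections import Counter


def check_combined(statements):
    """Return schema definition errors for one combined annotation set.

    `statements` is a list of (kind, variable_ref) tuples, where
    variable_ref is None for asserts and discriminators. This
    representation is hypothetical, chosen only for the sketch.
    """
    errors = []
    kinds = Counter(kind for kind, _ in statements)

    # At most one discriminator, and never mixed with asserts.
    if kinds["discriminator"] > 1:
        errors.append("more than one dfdl:discriminator")
    if kinds["discriminator"] and kinds["assert"]:
        errors.append("dfdl:assert and dfdl:discriminator both present")

    # Each setVariable (and each newVariableInstance) must name a
    # different variable within the combined set.
    for kind in ("setVariable", "newVariableInstance"):
        refs = Counter(ref for k, ref in statements if k == kind)
        for ref, count in sorted(refs.items()):
            if count > 1:
                errors.append(f"dfdl:{kind} refers to {ref} {count} times")
    return errors
```

Note that the sketch allows a dfdl:setVariable and a dfdl:newVariableInstance to name the same variable, since the rules above only constrain each statement kind against itself.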

Evaluation Order for Statement Annotations


Assertions Before:

A dfdl:discriminator or dfdl:assert with testKind='pattern' is executed before parsing the annotated construct.

SMH: Wording needs to cater for combined annotations.

Note that the pattern is matched against the entire representation of the component; hence, the framing (including initiators, etc.) is all visible to the pattern. The dfdl:encoding property is used when decoding the data to characters before matching.


It is a schema definition error if alignment is not 1 and a dfdl:discriminator or dfdl:assert with testKind='pattern' is used.
(TBD: restrictions on lengthKind='prefixed' as well? Are there any other framing-based properties with which assertions with testKind='pattern' are really incompatible?)
SMH: If alignment <> 1 is a schema definition error then leadingSkip <> 0 should be one too. I'd leave it there though.
It should also be a schema definition error if encoding is not set.
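
As a rough sketch of what a testKind='pattern' check involves: the first function collects the schema definition errors discussed above (the alignment rule is from the draft text, while the leadingSkip and encoding rules are only the suggestions made in the comment), and the second decodes the raw data with the encoding before matching, so that framing such as an initiator is visible to the pattern. The function names are hypothetical, and Python's re module stands in for whatever regular expression dialect DFDL ends up specifying.

```python
import re


def pattern_test_errors(alignment, leading_skip, encoding):
    """List schema definition errors for a testKind='pattern' statement."""
    errors = []
    if alignment != 1:
        errors.append("alignment must be 1")
    if leading_skip != 0:  # suggested in the comment, not in the draft
        errors.append("leadingSkip must be 0")
    if not encoding:  # suggested in the comment, not in the draft
        errors.append("encoding must be set")
    return errors


def pattern_matches(data, encoding, pattern):
    """Decode the component's representation, then match from its start."""
    text = data.decode(encoding)
    return re.match(pattern, text) is not None
```

For example, with data b"[REC]42" and encoding "ascii", the pattern r"\[REC\]\d+" matches because the initiator "[REC]" is part of what the pattern sees.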

If there are multiple dfdl:assert statements with testKind='pattern' the order of execution among them is not specified.
Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

Assertions After:


A dfdl:discriminator or dfdl:assert with testKind='expression' (the default) is executed after parsing the annotated construct.
SMH: Wording needs to cater for combined annotations.

Furthermore, an attempt to evaluate a discriminator must be made even if the parse of the annotated construct ended in a parse error. This is because a discriminator could evaluate to true thereby resolving a point of uncertainty even if the complete parsing of the construct ultimately caused a parse error. Such discriminator evaluation has access to the DFDL Infoset of the attempted parse as it existed immediately before detecting the parse failure.
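
A minimal sketch of this behaviour for a choice, under some assumptions: each branch is modelled as a hypothetical (parse, discriminator) pair of callables, the infoset is a plain dict, and a discriminator that cannot be evaluated against the partial infoset is treated as not having resolved anything. Re-raising the original parse error without trying further branches is one reading of "resolving a point of uncertainty", not something the text above spells out.

```python
class ParseError(Exception):
    """Raised by a branch parser when the data does not fit."""


def parse_choice(branches, data):
    """Try branches in order; each is a (parse, discriminator) pair.

    parse(data, infoset) fills `infoset` in place and may raise
    ParseError. discriminator(infoset) returns True or False.
    """
    for parse, discriminator in branches:
        infoset = {}
        failure = None
        try:
            parse(data, infoset)
        except ParseError as exc:
            failure = exc

        # The discriminator is attempted even after a failed parse,
        # against the infoset as it stood just before the failure.
        try:
            resolved = discriminator(infoset)
        except KeyError:
            resolved = False  # needed data the failed parse never produced

        if failure is None and resolved:
            return infoset
        if failure is not None and resolved:
            raise failure  # uncertainty resolved, so no backtracking
        # Otherwise this branch is abandoned and the next one is tried.
    raise ParseError("no choice branch matched")
```

The point of the sketch is the middle case: a branch whose discriminator is true but whose body is malformed reports its own parse error instead of silently falling through to a later branch.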

Implementations are free to optimize by recognizing and executing discriminators or assertions earlier so long as the resulting behavior is consistent with what results from the above description. 

If there are multiple dfdl:assert statements with testKind='expression', then the order of execution among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

The dfdl:newVariableInstance Statement


These statements are evaluated before the parsing of the annotated construct. When there is more than one dfdl:newVariableInstance statement, the order of execution among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

All dfdl:newVariableInstance statements are executed before any dfdl:setVariable statements on the same annotated construct.

SMH: Wording needs to cater for combined annotations.

The dfdl:setVariable Statement

When a dfdl:setVariable annotation is found on an element reference, element declaration, or simple type definition, then it is executed after the parsing of the element, which implies after the evaluation of expressions corresponding to any computed format properties. That is, if an expression is used to provide the value of a format property such as dfdl:terminator, the evaluation of that expression occurs before any dfdl:setVariable annotation is executed; hence, the expression providing the value of the format property may not reference the variable.

When a dfdl:setVariable annotation is found in the combined set of annotations for a sequence, choice, or group reference, then it is executed after any dfdl:newVariableInstance statements in that same combined set, but it is executed before the parsing of the sequence, choice, or group reference.

If there are multiple dfdl:setVariable statements in one combined set of annotations, then the order of evaluation among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

SMH: Wording needs to cater for combined annotations.
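
Pulling the timing rules of this proposal together, here is a sketch that splits one combined set of statements into those run before and those run after parsing the annotated construct. The function name, the "element"/"group" labels, and the (kind, test_kind) tuples are all assumptions made for illustration, and the sketch does not check placement rules (for example, that dfdl:newVariableInstance is only allowed on sequences, choices, and group references).

```python
def evaluation_phases(component, statements):
    """Split a combined statement set into (before, after) the parse.

    component: "element" (element reference, element declaration, or
    simple type definition) or "group" (sequence, choice, or group
    reference).
    statements: list of (kind, test_kind) tuples; test_kind is
    "pattern", "expression", or None for variable statements.
    """
    before, after = [], []
    for kind, test_kind in statements:
        stmt = (kind, test_kind)
        if kind in ("assert", "discriminator"):
            # Pattern tests run before the parse, expression tests after.
            (before if test_kind == "pattern" else after).append(stmt)
        elif kind == "newVariableInstance":
            before.append(stmt)
        elif kind == "setVariable":
            # Before the parse on groups, after the parse on elements.
            (before if component == "group" else after).append(stmt)
        else:
            raise ValueError(f"not a DFDL statement: {kind}")

    # The only ordering stated within a phase is that every
    # newVariableInstance runs before any setVariable. The stable sort
    # moves setVariable last and keeps everything else in input order,
    # which is an arbitrary choice where the text leaves order unspecified.
    before.sort(key=lambda s: s[0] == "setVariable")
    return before, after
```

Placing dfdl:setVariable after the pattern tests in the "before" phase is a choice made by the sketch; the text only fixes its position relative to dfdl:newVariableInstance.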

