
I don't have a problem with lengthKind 'prefixed'. It's no worse to me than initiator, and in a text format it will likely be text anyway, in which case it is easily consumed by a regex. Alignment and leadingSkip are the dodgy ones, as they are almost incompatibilities.

Maybe we should use the term 'resolved statement annotations for a schema component' when referring to the combined set. Then we can say, e.g.: "Resolved dfdl:discriminator or dfdl:assert annotations with testKind='pattern' for a component are executed before parsing the component."

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB,
Cc: dfdl-wg@ogf.org
Date: 01/11/2012 17:19
Subject: Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)

Clarification: At any single annotation point of the schema, there can be only one format annotation (dfdl:format, dfdl:element, dfdl:sequence, dfdl:choice, dfdl:group, dfdl:simpleType).

Glossary: DFDL Statement annotations, or just DFDL Statements, are the annotation elements dfdl:assert, dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.

SMH: Nice idea. What about dfdl:defineVariable, is that a statement annotation too? Where does that leave dfdl:defineFormat and dfdl:defineEscapeScheme - they are not format annotations (that's their content). Do we have 'global', 'statement' and 'format' annotations?

Yes, the idea is that there are "defining annotations", "format annotations", and "statement annotations" as the 3 distinct kinds.

Glossary: Combined annotations: When annotations are combined between a group reference and the sequence or choice of the referenced global group, or among an element reference, an element declaration, and its type definition, the combined set is referred to as the combined annotations.
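To make the "combined annotations" glossary entry above concrete, here is a minimal Python sketch of collecting statement annotations along a reference chain (element reference, then element declaration, then type definition). The Component class, its refers_to link and the combined_statements function are hypothetical names invented for this illustration; they are not part of DFDL or of any implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    """Hypothetical schema component: its own statement annotations plus an
    optional link to the component it refers to (assumption of this sketch)."""
    name: str
    statements: List[str] = field(default_factory=list)
    refers_to: Optional["Component"] = None


def combined_statements(component: Component) -> List[str]:
    """Collect the statements found along the reference chain, e.g.
    element reference -> element declaration -> type definition."""
    result: List[str] = []
    current: Optional[Component] = component
    while current is not None:
        result.extend(current.statements)
        current = current.refers_to
    return result


# Usage: an element reference whose declaration and type also carry statements.
type_def = Component("simple type definition", ["dfdl:assert"])
decl = Component("element declaration", ["dfdl:setVariable"], refers_to=type_def)
ref = Component("element reference", ["dfdl:assert"], refers_to=decl)
print(combined_statements(ref))
```

The same walk would apply to a group reference and the sequence or choice of the global group it refers to. The list order produced here is only an artifact of the walk and carries no meaning about evaluation order.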
DFDL Statement Annotation Placement

dfdl:assert and dfdl:discriminator can be placed as annotations on sequence, choice, group references, local and global element declarations, element references, and simple type definitions.

dfdl:setVariable may be placed as an annotation on sequence, choice, group references, local and global element declarations for elements of simple type, element references to elements of simple type, and simple type definitions.

dfdl:newVariableInstance can be placed as an annotation on sequence, choice, and group references.

The combined annotations for any schema component can contain only a single dfdl:discriminator, or any number of dfdl:assert statements, but not both asserts and a discriminator. It is a schema definition error otherwise.

The combined annotations for any schema component can contain multiple dfdl:setVariable annotations, but they must each refer to a different variable. It is a schema definition error otherwise.

The combined annotations for any schema component can contain multiple dfdl:newVariableInstance annotations, but they must each refer to a different variable. It is a schema definition error otherwise.

Evaluation Order for Statement Annotations

Assertions Before: dfdl:discriminator or dfdl:assert with testKind='pattern' are executed before parsing the annotated construct.

SMH: Wording needs to cater for combined annotations.

Another problem is "the annotated construct". I want to say "the thing we're talking about parsing here." What is the right term for that?

Note that the pattern is used to match against the entire representation of the component; hence, the framing (including initiators, etc.) is all visible to the pattern. The dfdl:encoding property is used when decoding the data to characters before matching.

It is a schema definition error if alignment is not 1 and a dfdl:discriminator or dfdl:assert with testKind='pattern' is used. (TBD: restrictions on lengthKind='prefixed' as well?
Any other framing-based incompatibilities where assertions with testKind='pattern' are really incompatible?)

SMH: If alignment <> 1 is a schema definition error, then so should leadingSkip <> 0 be. I'd leave it there though. Also a schema definition error if encoding is not set.

Good. Those are improvements. I would like to just say cannot have lengthKind="prefixed" also. (We can add it back, we can't take it away.)

If there are multiple dfdl:assert statements with testKind='pattern', the order of execution among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

Assertions After: dfdl:discriminator or dfdl:assert with testKind='expression' (the default) are executed after parsing the annotated construct.

SMH: Wording needs to cater for combined annotations.

Furthermore, an attempt to evaluate a discriminator must be made even if the parse of the annotated construct ended in a parse error. This is because a discriminator could evaluate to true, thereby resolving a point of uncertainty, even if the complete parsing of the construct ultimately caused a parse error. Such discriminator evaluation has access to the DFDL Infoset of the attempted parse as it existed immediately before detecting the parse failure.

Implementations are free to optimize by recognizing and executing discriminators or assertions earlier, so long as the resulting behavior is consistent with what results from the above description.

If there are multiple dfdl:assert statements with testKind='expression', then the order of execution among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

The dfdl:newVariableInstance Statement

These statements are evaluated before the parsing of the annotated construct. When there is more than one newVariableInstance statement, the order of execution among them is not specified.
Schema authors can insert sequences to control the timing of evaluation of statements more precisely. All dfdl:newVariableInstance statements are executed before any dfdl:setVariable statements on the same annotated construct.

SMH: Wording needs to cater for combined annotations.

The dfdl:setVariable Statement

When a dfdl:setVariable annotation is found on an element reference, element declaration, or simple type definition, then it is executed after the parsing of the element, which implies after the evaluation of expressions corresponding to any computed format properties. That is, if an expression is used to provide the value of a format property such as dfdl:terminator, the evaluation of that expression occurs before any dfdl:setVariable annotation is executed; hence, the expression providing the value of the format property may not reference the variable.

When a dfdl:setVariable annotation is found in the combined set of annotations for a sequence, choice, or group reference, then it is executed after any dfdl:newVariableInstance statements in that same combined set, but it is executed before the parsing of the sequence, choice, or group reference.

If there are multiple dfdl:setVariable statements in one combined set of annotations, then the order of evaluation among them is not specified. Schema authors can insert sequences to control the timing of evaluation of statements more precisely.

SMH: Wording needs to cater for combined annotations.

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
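As a rough illustration of the rules proposed in this message, the Python sketch below checks a combined set of statements for the three schema definition errors and reports whether each statement runs before or after the parse of its component. All names here (Statement, SchemaDefinitionError, check_combined, phase, GROUP_KINDS) are hypothetical and exist only for this sketch. It reflects the draft wording of this thread, not any implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

# Component kinds on which dfdl:setVariable runs before the parse (per this thread).
GROUP_KINDS = {"sequence", "choice", "group_ref"}


class SchemaDefinitionError(Exception):
    """Hypothetical stand-in for a DFDL schema definition error."""


@dataclass(frozen=True)
class Statement:
    kind: str                       # "assert", "discriminator", "setVariable", "newVariableInstance"
    test_kind: str = "expression"   # only meaningful for assert/discriminator
    variable: Optional[str] = None  # only meaningful for the variable statements


def check_combined(statements: List[Statement]) -> None:
    """Raise if the combined set breaks one of the rules stated in the message."""
    kinds = Counter(s.kind for s in statements)
    if kinds["discriminator"] > 1:
        raise SchemaDefinitionError("more than one dfdl:discriminator")
    if kinds["discriminator"] and kinds["assert"]:
        raise SchemaDefinitionError("both dfdl:assert and dfdl:discriminator present")
    for kind in ("setVariable", "newVariableInstance"):
        names = [s.variable for s in statements if s.kind == kind]
        if len(names) != len(set(names)):
            raise SchemaDefinitionError(f"two dfdl:{kind} statements refer to the same variable")


def phase(stmt: Statement, component_kind: str) -> str:
    """Return "before" or "after", relative to parsing the annotated component."""
    if stmt.kind in ("assert", "discriminator"):
        return "before" if stmt.test_kind == "pattern" else "after"
    if stmt.kind == "newVariableInstance":
        return "before"
    if stmt.kind == "setVariable":
        return "before" if component_kind in GROUP_KINDS else "after"
    raise ValueError(f"unknown statement kind: {stmt.kind}")
```

The sketch deliberately leaves out two points from the text: within the "before" phase, dfdl:newVariableInstance statements precede dfdl:setVariable statements, and a discriminator must still be attempted when the parse ends in a parse error. The order among statements of the same kind is left unspecified, as in the message.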