Mike

IBM DFDL already supports asserts and discriminators on complex elements, so that must remain.  There's a clear use case for this - a choice with branches that are element refs to complex global elements. Discrimination is possible at the time the choice is processed. You would put a discriminator on the element refs.  Also, asserts and discriminators are intended to be the equivalent of WTX component rules which are allowed on complex elements.
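
To make that use case concrete, here is a minimal sketch of a choice whose branches are element refs to complex global elements, each ref carrying a discriminator. The element names (orderRecord, invoiceRecord, recType, record), the tag values and the ex: namespace are hypothetical, and all dfdl:format properties are omitted, so treat it as an illustration of where the annotations sit rather than a complete schema.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
        xmlns:ex="http://example.com/records"
        targetNamespace="http://example.com/records">

  <!-- Hypothetical complex global elements; each begins with a record-type tag. -->
  <element name="orderRecord">
    <complexType>
      <sequence>
        <element name="recType" type="xs:string"/>
        <element name="orderId" type="xs:string"/>
      </sequence>
    </complexType>
  </element>

  <element name="invoiceRecord">
    <complexType>
      <sequence>
        <element name="recType" type="xs:string"/>
        <element name="amount" type="xs:decimal"/>
      </sequence>
    </complexType>
  </element>

  <element name="record">
    <complexType>
      <choice>
        <!-- Each branch is an element ref with its own discriminator.
             "." is the referenced element, so the test looks inside it. -->
        <element ref="ex:orderRecord">
          <annotation><appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:discriminator test="{ ./recType eq 'ORD' }"/>
          </appinfo></annotation>
        </element>
        <element ref="ex:invoiceRecord">
          <annotation><appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:discriminator test="{ ./recType eq 'INV' }"/>
          </appinfo></annotation>
        </element>
      </choice>
    </complexType>
  </element>
</schema>
```

The discriminators live on the refs rather than on the global declarations, so the same global elements can be reused elsewhere without dragging the choice logic along.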

The last point about WTX made me realise why we had disallowed asserts and discriminators on global elements. In XSD terms they are associated with a particle, and a global element declaration is not a particle. If we are going to allow them on global elements then we need to be clear that this is no longer the case.

Agree that only one setVariable annotation for a given variable can exist when annotations are combined from multiple objects. Same for newVariableInstance.
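
As an illustration of the combination rule, here is a sketch of a schema that would have to be rejected, because the element ref and the global declaration it points to both set the same variable. The variable name n, the element names and the ex: namespace are made up for the example, and dfdl:format properties are omitted.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
        xmlns:ex="http://example.com/vars"
        targetNamespace="http://example.com/vars">

  <annotation><appinfo source="http://www.ogf.org/dfdl/">
    <!-- Hypothetical variable used by both annotations below. -->
    <dfdl:defineVariable name="n" type="xs:int"/>
  </appinfo></annotation>

  <!-- Global declaration (simple type) sets ex:n ... -->
  <element name="count" type="xs:int">
    <annotation><appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:setVariable ref="ex:n" value="{ . }"/>
    </appinfo></annotation>
  </element>

  <element name="message">
    <complexType>
      <sequence>
        <!-- ... and the ref sets ex:n again. Once the annotations from the
             ref and the declaration are combined there are two setVariables
             for the same variable, which is what the rule disallows. -->
        <element ref="ex:count">
          <annotation><appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:setVariable ref="ex:n" value="{ . + 1 }"/>
          </appinfo></annotation>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
```

Removing either one of the two setVariable annotations makes the combination legal again.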

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        dfdl-wg@ogf.org
Date:        30/10/2012 18:26
Subject:        Re: [DFDL-WG] DFDL Statement Evaluation Timing (Assert, Discriminator, SetVariable, NewVariableInstance)
Sent by:        dfdl-wg-bounces@ogf.org




Revision 1 (based on discussion on WG call 2012-10-30)

Principles: disallow statements except where their scope and timing are clear and where the timing is easy to understand from the way it appears textually in the schema document
See changes in RED.

On Mon, Oct 29, 2012 at 6:54 PM, Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:
I'll write this up like an errata, but this is for discussion of whether we believe this is clear and complete.

-------------------------------------

Glossary: DFDL Statements are the annotation elements dfdl:assert, dfdl:discriminator, dfdl:setVariable, and dfdl:newVariableInstance.

Errata: Locations where dfdl:assert, dfdl:discriminator and dfdl:setVariable are allowed to appear are extended to also include Global Element Declarations for elements of simple type, and Simple Type Definitions.

Errata: dfdl:newVariableInstance may appear only as an annotation on a sequence or choice.


Errata: dfdl:setVariable, dfdl:assert, dfdl:discriminator may appear only as an annotation on a sequence, choice, or a simpleType definition, or an element declaration/reference having simpleType. Note this removes the ability for complex-typed element decls/refs to carry any DFDL Statement annotations, including asserts/discriminators.

(I'm trying to get really minimal here. Not even assert on complexType elements.)



Errata: Clarification about discriminators: Discriminators exclude Assertions even when combining across references.

Beyond the stipulation that there can be only one dfdl:discriminator at any annotation point of the DFDL schema, there are further constraints.

A single dfdl:discriminator annotation may appear on an element reference, or on the global element declaration it refers to, or on the simple type appearing immediately within or referenced from the global element declaration. But only one of those places. In addition, if a discriminator occupies one of those places, then no dfdl:assert annotations may appear in any of those locations.

A dfdl:discriminator annotation may appear on a group reference or on the model group within the global group definition it refers to. But only one of those places, and similarly, if a discriminator appears in any of those places, then no dfdl:assert annotations may appear in any of those locations.
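
A sketch of the element case, to show the three candidate locations side by side. The names tagType, tag and orderBranch, the pattern value and the ex: namespace are invented for the example, and dfdl:format properties are left out. Here the discriminator sits on the element reference, so the other two locations have to stay free of both discriminators and asserts.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
        xmlns:ex="http://example.com/disc"
        targetNamespace="http://example.com/disc">

  <!-- Location 3: the simple type referenced from the global declaration.
       No dfdl:discriminator or dfdl:assert may appear here, because the
       element ref below already carries the discriminator. -->
  <simpleType name="tagType">
    <restriction base="xs:string"/>
  </simpleType>

  <!-- Location 2: the global element declaration.
       Likewise, no dfdl:discriminator or dfdl:assert here. -->
  <element name="tag" type="ex:tagType"/>

  <element name="orderBranch">
    <complexType>
      <sequence>
        <!-- Location 1: the element reference. This is the one place
             chosen for the discriminator in this sketch. -->
        <element ref="ex:tag">
          <annotation><appinfo source="http://www.ogf.org/dfdl/">
            <dfdl:discriminator testKind="pattern" testPattern="ORD"/>
          </appinfo></annotation>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
```

Moving the discriminator to the global declaration or to tagType would be equally acceptable under the rule, as long as it is moved rather than duplicated.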


(TBD: constraints that you can't have multiple setVariable statements of the same variable in these places either, just as you can't have multiple setVariables of the same variable at one annotation point.)



Errata: Clarification about the execution order of DFDL Statements when they appear on an element reference or element declaration.

DFDL Statement annotations for a given element are executed as follows: (Keep in mind that this element will have simpleType, as complexType elements cannot carry statement annotations at all.)

1) All relevant DFDL statement annotations are gathered to form a single list which preserves schema-definition order.
2) Given the combined list, the annotations are executed as follows:
   1. Before any parsing of the element, a dfdl:discriminator with testKind="pattern" is executed.
   2. If there is no discriminator, then all dfdl:asserts (there could be several) with testKind="pattern" are executed in the order they appear in the list of DFDL statements.
   3. Any properties having runtime evaluation are evaluated (e.g., delimiters with expressions).
   4. The element itself is parsed, or its inputValueCalc property is evaluated to create its value.
   5. REMOVED (no longer allowed on elements at all): all newVariableInstance annotations are executed and new variables are placed into scope for the duration of these remaining steps. The statements are executed in the order they appear in the list of DFDL statements.
   6. All setVariable annotations are executed. The statements are executed in the order they appear in the list of DFDL statements.
   7. If a discriminator is present, it is executed.
   8. If no discriminator is present, then assert annotations can be present, and they are executed. If there are multiple assert annotations, the statements are executed in the order they appear in the list of DFDL statements.

If the element reference or local element declaration is an array, then this evaluation is repeated for each occurrence of the array.

A DFDL implementation that wishes to optimize is free to analyze the expressions used, and evaluate them sooner so long as the behavior is equivalent to the above description.
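
Here is a small sketch of how those steps would apply to one simple-typed element. The element name len, the variable lastLen and the ex: namespace are hypothetical, dfdl:format properties are omitted, and the comments assume that step 8 refers to the remaining (expression) asserts. The statements are deliberately written in an order that differs from their execution order.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
        xmlns:ex="http://example.com/order"
        targetNamespace="http://example.com/order">

  <annotation><appinfo source="http://www.ogf.org/dfdl/">
    <!-- Hypothetical variable set from the parsed value. -->
    <dfdl:defineVariable name="lastLen" type="xs:int"/>
  </appinfo></annotation>

  <element name="len" type="xs:int">
    <annotation><appinfo source="http://www.ogf.org/dfdl/">
      <!-- Written first, but as an expression assert it runs last (step 8),
           after the value exists and after the setVariable. -->
      <dfdl:assert test="{ . gt 0 }"/>

      <!-- Runs once len has been parsed (step 6, after step 4). -->
      <dfdl:setVariable ref="ex:lastLen" value="{ . }"/>

      <!-- Written last, but as a pattern assert it runs before len is
           parsed (step 2), looking at the raw data. -->
      <dfdl:assert testKind="pattern" testPattern="[0-9]+"/>
    </appinfo></annotation>
  </element>
</schema>
```

Schema-definition order only matters among statements that run in the same step, for example two pattern asserts or two setVariables on different variables.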


Discussion/Illustration: (this is all revised, so I'm switching back to black ink)

Suppose you have this situation:


<sequence>
  ...
  <element ref="foo"/> <!-- I want to add DFDL statement annotations before and after this -->
  ...
</sequence>

To add them before, so they are scoped over or visible to the parsing of the entire foo element:

<sequence>
  ...
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:newVariableInstance ref="myVar" default="{...}"/>
        <dfdl:setVariable ref="myOtherVar" value="{...}"/>
     </appinfo></annotation>
     <element ref="foo"/> <!-- I want to add DFDL statement annotations before this -->
  </sequence>
  ...
</sequence>

To add them after so they can reference downward into that element:

<sequence>
  ...

  <element ref="foo"/> <!-- I want to add DFDL statement annotations after this -->
  <sequence> <!-- inserted sequence -->
     <annotation><appinfo ...>
        <dfdl:assert>{ foo/bar/baz.... }</dfdl:assert>
        <dfdl:setVariable ref="yetAnotherVar" value="{ foo/bar[2]/baz + 1 }"/>
     </appinfo></annotation>
  </sequence>
  ...
</sequence>

The above illustration is about complex type elements, but note that the timing issue really doesn't care whether the element/element-ref has simple or complex type. Either way, by choosing which side of the element the inserted sequence goes on, you can perfectly control when evaluation occurs relative to the element itself.

The timing is then clear (to me anyway). The optimization opportunity to hoist the assert for earlier evaluation is clear, but it's also clear that the assert CAN look into foo however it wishes to make its decision.

If you put the annotation directly on the element (which then must be of simple type), then it is exactly equivalent to putting it in a sequence AFTER. (Modulo replacing "." in expressions with "foo")
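
A sketch of that equivalence, using two hypothetical complex types (formA, formB) that are meant to behave the same. The element name foo is reused from the illustration above; the ex: namespace and the test itself are made up, and dfdl:format properties are omitted.

```xml
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
        xmlns:ex="http://example.com/equiv"
        targetNamespace="http://example.com/equiv">

  <!-- Form A: the assert sits directly on the simple-typed element,
       so "." is foo itself. -->
  <complexType name="formA">
    <sequence>
      <element name="foo" type="xs:int">
        <annotation><appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:assert test="{ . gt 0 }"/>
        </appinfo></annotation>
      </element>
    </sequence>
  </complexType>

  <!-- Form B: the assert sits on an empty sequence inserted AFTER foo,
       so the expression names foo instead of using ".". -->
  <complexType name="formB">
    <sequence>
      <element name="foo" type="xs:int"/>
      <sequence>
        <annotation><appinfo source="http://www.ogf.org/dfdl/">
          <dfdl:assert test="{ foo gt 0 }"/>
        </appinfo></annotation>
      </sequence>
    </sequence>
  </complexType>
</schema>
```

Form B is the only option when foo has complex type under this proposal, which is why the two forms need to mean the same thing for simple types.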

