Hi Mike,

I hit a similar scenario today with an EDIFACT message. There was an optional array containing a bunch of initiated records. I used 'initiatedContent' for the array's sequence, but needed to add an extra discriminator to stop an error from within one of the records from causing the parser to backtrack and conclude that the array occurrence was not there. Not easy to know where to put it.

One possibility is an 'all' attribute that says it resolves all points of uncertainty that are in scope?
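
As a rough sketch of what I mean, applied to the initiated sequence in your message below. The 'all' attribute is purely hypothetical, it does not exist in DFDL today, and its name and placement are only assumptions for the sake of discussion:

```xml
<sequence dfdl:initiator="MSG1">
  <annotation>
    <appinfo source="http://www.ogf.org/dfdl/">
      <!-- HYPOTHETICAL: 'all' is not a DFDL attribute. The idea is that one
           discriminator resolves every point of uncertainty currently in
           scope (here, the enclosing choice and the enclosing array). -->
      <dfdl:discriminator all="true">{ true() }</dfdl:discriminator>
    </appinfo>
  </annotation>
  <!-- message-specific elements go here -->
</sequence>
```

That would replace the pair of stacked discriminators with a single one, at the cost of the schema author not seeing exactly which points of uncertainty are being resolved.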

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        dfdl-wg@ogf.org,
Date:        29/10/2012 01:25
Subject:        [DFDL-WG] discriminators near PoU or not
Sent by:        dfdl-wg-bounces@ogf.org




I thought this situation was worth discussing.

I have a bunch of messages. They all work like this:

<element name="messageName1">
<complexType>
<sequence>
   <element ref="boilerplateElt1" minOccurs="0"/>
   <element ref="boilerplateElt2" minOccurs="0"/>
   <sequence dfdl:initiator="MSG1">
       .... message specific elements go in here...
   </sequence>
   ... optional trailing boilerplates go here
</sequence>
</complexType>
</element>

There are hundreds of those.

These all are used in this way:

<element name="topLevel">
<complexType>
   <sequence>
     <element name="message" minOccurs="1" maxOccurs="unbounded">
       <complexType>
          <choice>
              <element ref="messageName1"/>
              <element ref="messageName2"/>
              ....
          </choice>
        </complexType>
    </element>
  </sequence>
</complexType>
</element>

So, there are two points of uncertainty here. The choice, and the unbounded array surrounding it.

For each message, it is not until the sequence carrying the initiator, which is down inside the message structure, that we know for sure we've got this kind of message.

So currently each message element is designed to be used ONLY in the above context of two surrounding points of uncertainty, like so (same message format as above, but with TWO discriminators added):

<element name="messageName1">
<complexType>
<sequence>
   <element ref="boilerplateElt1" minOccurs="0"/>
   <element ref="boilerplateElt2" minOccurs="0"/>
   <sequence dfdl:initiator="MSG1">
      <annotation><appinfo...>
          <!-- discriminate enclosing choice -->
          <dfdl:discriminator>{ true() }</dfdl:discriminator>
          <!-- discriminate enclosing array -->
          <dfdl:discriminator>{ true() }</dfdl:discriminator>
      </appinfo></annotation>
       .... message specific elements go in here...
   </sequence>
   ... optional trailing boilerplates go here
</sequence>
</complexType>
</element>

I'd love to have a better solution to this issue. But in the absence of one, this works and achieves what I want: once the parser hits the initiator, we discriminate both which message we have and the fact that we do have a message.

Interestingly, I'd really prefer to discriminate them in the opposite order of their nesting. I could discriminate the array at the very start of the message. But I have no way to do that, because discriminators apply in inside-out nesting order and the nearest enclosing PoU is the choice. So I have to discriminate that one first, and I can't discriminate that one until I see the initiator, which comes later in the message.

Possible fixes/improvements: allow a label/id on each PoU, and allow referencing that label from a discriminator to say exactly which PoU you are discriminating.  This makes the outward reference implied by a discriminator explicit. You still have the context issue, but it's not implied anymore, it is explicit. (This makes the problem like expressions with "../../.." that reach up and out of a construct. They are reaching up and out, but at least you can see directly that they are doing so.)
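
To make that concrete, here is a rough sketch of what labeled PoUs might look like. The attribute names dfdl:pouId and pou are hypothetical, invented only for this illustration; nothing like them exists in DFDL today:

```xml
<!-- In topLevel: each PoU carries a (hypothetical) label. -->
<element name="message" minOccurs="1" maxOccurs="unbounded"
         dfdl:pouId="messageArray">
  <complexType>
    <choice dfdl:pouId="messageChoice">
      <element ref="messageName1"/>
      <element ref="messageName2"/>
    </choice>
  </complexType>
</element>

<!-- Inside messageName1: each discriminator names the PoU it resolves
     (hypothetical 'pou' attribute), instead of relying on nesting order. -->
<sequence dfdl:initiator="MSG1">
  <annotation>
    <appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:discriminator pou="messageChoice">{ true() }</dfdl:discriminator>
      <dfdl:discriminator pou="messageArray">{ true() }</dfdl:discriminator>
    </appinfo>
  </annotation>
  <!-- message-specific elements go here -->
</sequence>
```

With explicit labels, the order of the two discriminators would no longer carry meaning, and a reader could see at a glance that the message element depends on being used inside PoUs with those labels.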

--
Mike Beckerle | OGF DFDL WG Co-Chair 
Tel:  781-330-0412