Consider this data of 4 characters:


foo;


Now consider this schema, where the default format is the basic, general-purpose set of text-oriented defaults.


<xs:element name="ex_infix" dfdl:lengthKind="implicit">
  <xs:complexType>
    <xs:sequence dfdl:separator=";" dfdl:separatorSuppressionPolicy="anyEmpty" dfdl:separatorPosition="infix">
      <xs:element name="x" type="xs:string" dfdl:lengthKind="delimited"/>
      <xs:element name="y" type="xs:string" minOccurs="0"
          dfdl:lengthKind="delimited"
          dfdl:occursCountKind="implicit"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

This is in a current Daffodil unit test, and produces this infoset:


<ex_infix><x>foo</x><y/></ex_infix>


That is, an empty string element is created for element 'y'.


I'd like to know what IBM DFDL produces as the infoset for this example.


I believe the DFDL spec is actually self-contradictory here, and therefore ambiguous about what the right behavior is.






Unless I'm missing another place in the DFDL spec that clarifies this, I think we need to revise this area to make things clearer.


But first we have to decide which semantics is intended. In the example above, which infoset do we want?


    <ex_infix><x>foo</x><y/></ex_infix> (empty string as normal representation takes priority over optionality)

or

    <ex_infix><x>foo</x></ex_infix> (optionality takes priority over empty string as normal representation)
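
To make the two candidate readings concrete, here is a toy Python sketch. It is not Daffodil or IBM DFDL code, and it models only this one schema (required 'x', optional 'y', infix ';'). The names parse_ex_infix, to_xml, and the empty_string_wins flag are invented purely for illustration.

```python
from xml.sax.saxutils import escape


def parse_ex_infix(data, empty_string_wins):
    """Toy model of the ex_infix sequence: required x, optional y, infix ';'.

    empty_string_wins is a hypothetical switch between the two readings:
      True  -> a zero-length 'y' is a normal (empty string) value
      False -> a zero-length 'y' means the optional element is absent
    """
    parts = data.split(";")
    # 'x' is required, so its content is always kept, even if empty.
    infoset = [("x", parts[0])]
    if len(parts) > 1:
        y = parts[1]
        # A separator was found, so there is a position for 'y'.
        # The open question is what zero-length content there means.
        if y != "" or empty_string_wins:
            infoset.append(("y", y))
    return infoset


def to_xml(infoset):
    """Render the toy infoset in the same compact form used in this message."""
    body = "".join(
        f"<{name}>{escape(value)}</{name}>" if value else f"<{name}/>"
        for name, value in infoset
    )
    return f"<ex_infix>{body}</ex_infix>"


print(to_xml(parse_ex_infix("foo;", empty_string_wins=True)))
# <ex_infix><x>foo</x><y/></ex_infix>
print(to_xml(parse_ex_infix("foo;", empty_string_wins=False)))
# <ex_infix><x>foo</x></ex_infix>
```

The only difference between the two outcomes is the single test on zero-length 'y', which is exactly the decision the spec needs to state unambiguously.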


Either way I think this change is needed:

But a bunch of other clarifications are also needed.

Today Daffodil 2.1.0 implements the first behavior: <ex_infix><x>foo</x><y/></ex_infix>, with the empty 'y' element.

What does IBM DFDL do?









Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy