As there are no initiators or terminators, and your example infoset calls everything 'field', I am assuming that the element looks logically like:

<xs:element name="field" type="xs:string" minOccurs="0" maxOccurs="unbounded" />

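You want to preserve the position of the occurrences in the infoset so that they re-appear on output. The agreed way to do this is to make the element nillable with an empty-string nil value, so that each zero-length occurrence becomes a nil item in the infoset: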

<xs:element name="field" type="xs:string" minOccurs="0" maxOccurs="unbounded" nillable="true" dfdl:nilKind="literalValue" dfdl:nilValue="%ES;" />
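
To make the intended behaviour concrete, here is a rough sketch in Python (not DFDL, and not how any DFDL processor is implemented). It assumes a "/" separator as in the original example, and uses None to stand in for a nil occurrence. The helper names parse_line and unparse_line are hypothetical.

```python
from typing import List, Optional

SEPARATOR = "/"  # assumed separator, taken from the example data


def parse_line(line: str) -> List[Optional[str]]:
    """Split a line into fields; a zero-length field becomes None (stand-in for nil)."""
    return [text if text != "" else None for text in line.split(SEPARATOR)]


def unparse_line(fields: List[Optional[str]]) -> str:
    """Write each field back out; a None (nil) field is written as zero length."""
    return SEPARATOR.join("" if field is None else field for field in fields)


fields = parse_line("data//data")
print(fields)                # the middle occurrence is kept, as None
print(unparse_line(fields))  # the original text is reproduced
```

Because the nil occurrence still occupies a slot in the list, the second separator is written on output and the line round trips.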

Regards
 
Steve Hanson

IBM Hybrid Integration, Hursley, UK
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        DFDL-WG <dfdl-wg@ogf.org>
Date:        26/09/2019 19:11
Subject:        Re: [DFDL-WG] Problem: simple format that is impossible to model
Sent by:        "dfdl-wg" <dfdl-wg-bounces@ogf.org>






To start discussion on my own issue.....

The problem here may be that for a string (or hexBinary), if there is no initiator/terminator, there is no way to distinguish EmptyRep from NormalRep. I.e., an empty string is a "normal" value for a string.

Sections 9.2.3 and 9.2.4 seem to define EmptyRep and NormalRep such that an empty string will be an EmptyRep, not a NormalRep.

However section 9.2.5 on zero-length says:

   "The normal representation can be a zero-length representation if the type is xs:string or xs:hexBinary and there is no framing."

That suggests that when there is no framing, a zero-length string is NormalRep, not EmptyRep, which is the opposite conclusion from what is in sections 9.2.3 and 9.2.4.

If this latter clarification is correct, then my format *should* work as I expect, because the empty string elements will be considered NormalRep and infoset values will be created for them.
It simply doesn't work because of a bug in Daffodil, which has not interpreted this correctly.

...mikeb



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy



On Thu, Sep 26, 2019 at 1:47 PM Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:
I have a dead-simple little format:

    data/data/data/data
    data/data/data/data

It is lines of "/"-separated strings. All elements are optional.

I simply want this:

   data//data

to round trip. For that to happen I need it to parse into
 
   <field>data</field><field></field><field>data</field>

That is, I require that empty field element in the middle to be created and put into the infoset.
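
A small Python sketch of why the empty occurrence matters for round-tripping. This is only an illustration of the requirement, not of any DFDL processor; the function names are hypothetical and the "/" separator is taken from the example above.

```python
SEPARATOR = "/"  # separator from the example format


def parse_dropping_empty(line):
    """Parse a line but add nothing to the result for zero-length fields."""
    return [field for field in line.split(SEPARATOR) if field != ""]


def parse_keeping_empty(line):
    """Parse a line and keep zero-length fields as empty strings."""
    return line.split(SEPARATOR)


def unparse(fields):
    """Write the fields back out with an infix separator."""
    return SEPARATOR.join(fields)


original = "data//data"
print(unparse(parse_dropping_empty(original)))  # loses the middle position
print(unparse(parse_keeping_empty(original)))   # reproduces the input
```

Dropping the empty occurrence yields two fields and unparses to data/data, so the position of the second "data" changes. Keeping it yields three fields and reproduces the input.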

I can find no way to do this.

The strings have no initiator/terminator, so dfdl:emptyValueDelimiterPolicy is not relevant. All the elements are optional, so default values aren't relevant.

The spec states:

9.4.2.2 Simple element (xs:string or xs:hexBinary)
Required occurrence: If the element has a default value then an item is added to the infoset using the default value, otherwise an item is added to the Infoset using empty string (type xs:string) or empty hexBinary (type xs:hexBinary) as the value.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none'[12] then an item is added to the Infoset using empty string (type xs:string) or empty hexBinary (type xs:hexBinary) as the value, otherwise nothing is added to the Infoset.


There are errata/actions to clarify the wording here about whether dfdl:emptyValueDelimiterPolicy is in effect or not (it may not be in effect because there is no initiator/terminator for it to use, as opposed to the property itself simply being 'none').
But that doesn't change anything about this issue.

If this very simple format is not possible, then we need a property or new property enum value that makes it possible.

Thoughts?

