I tested this against IBM DFDL, and the infoset matches what I suggested. So I think I have answered my own question.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Owl Cyber Defense | www.owlcyberdefense.com

Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy

On Wed, Apr 21, 2021 at 12:08 PM Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:

An earlier email I sent had this schema:

<xs:element name="file1">
  <xs:complexType>
    <xs:sequence dfdl:separator="," dfdl:separatorPosition="infix"
                 dfdl:separatorSuppressionPolicy="never">
      <xs:element name="given-name" type="xs:string" minOccurs="0" maxOccurs="3" />
      <xs:element name="surname" type="xs:string" minOccurs="0" />
      <xs:element name="phone" type="xs:string" minOccurs="0" maxOccurs="6" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

Given that, and a default dfdl:format with occursCountKind='implicit', I expect this data:

"madonna,,,,,,,,,"

To produce this DFDL Infoset (cast as XML):

<file1>
  <given-name>madonna</given-name>
</file1>

I don't expect to see any empty elements like <given-name></given-name>, because all of the elements are optional: a zero-length representation of an xs:string is the empty representation, and an optional element with the empty representation is never added to the infoset.
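
To make that reasoning concrete, here is a rough Python sketch of the parse as I understand it. It is only a toy model, not a real DFDL processor or any library's API: the names SLOTS and parse are hypothetical, and it assumes that with separatorSuppressionPolicy="never" every potential occurrence (3 + 1 + 6 = 10) has its own position between the infix separators.

```python
# Hypothetical model of the sequence in the schema: (element name, maxOccurs).
# All three elements have minOccurs="0", so every occurrence is optional.
SLOTS = [("given-name", 3), ("surname", 1), ("phone", 6)]


def parse(data, sep=","):
    """Toy stand-in for the parse; returns the infoset as (name, value) pairs."""
    fields = data.split(sep)
    total = sum(max_occurs for _, max_occurs in SLOTS)
    # Assumption: with separatorSuppressionPolicy="never", all positions are
    # present in the data, so we expect exactly `total` fields.
    if len(fields) != total:
        raise ValueError(f"expected {total} fields, got {len(fields)}")

    infoset = []
    pos = 0
    for name, max_occurs in SLOTS:
        for value in fields[pos:pos + max_occurs]:
            # A zero-length xs:string is the empty representation; an optional
            # occurrence with the empty representation is not added.
            if value != "":
                infoset.append((name, value))
        pos += max_occurs
    return infoset


# "madonna" followed by nine separators: ten positions, only the first is non-empty.
data = "madonna" + "," * 9
print(parse(data))  # [('given-name', 'madonna')]
```

Only the first position carries a non-zero-length value, so the model yields a single given-name element and nothing else.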

Furthermore, this should "round trip" i.e., unparse back to the original input data.
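
A rough sketch of the unparse direction, again as a toy Python model rather than a real DFDL implementation (SLOTS and unparse are hypothetical names). It assumes that, with separatorSuppressionPolicy="never", the unparser emits a position for every potential occurrence and leaves the absent ones zero-length.

```python
# Hypothetical model of the sequence in the schema: (element name, maxOccurs).
SLOTS = [("given-name", 3), ("surname", 1), ("phone", 6)]


def unparse(infoset, sep=","):
    """Toy stand-in for the unparse; infoset is a list of (name, value) pairs."""
    fields = []
    for name, max_occurs in SLOTS:
        values = [value for elem, value in infoset if elem == name]
        if len(values) > max_occurs:
            raise ValueError(f"too many occurrences of {name}")
        # Assumption: absent optional occurrences still get a (zero-length)
        # position, because separators are never suppressed.
        fields.extend(values + [""] * (max_occurs - len(values)))
    return sep.join(fields)


original = "madonna" + "," * 9
print(unparse([("given-name", "madonna")]) == original)  # True
```

Note that this model pads absent occurrences at the end of each array, so it only reproduces inputs whose non-empty occurrences come first within each array, as they do in this example.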

Am I correct here?
