It is true that we do not cover all possible nuances of separator suppression. The policies in DFDL 1.0 are essentially equivalent to what WTX supports. The closest to your requirement is 'anyEmpty', which suppresses adjacent separators when unparsing but tolerates adjacent separators when parsing. At IBM we have not yet found concrete requirements for any others.
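
As a rough illustration of that asymmetry, here is a toy Python sketch for a single repeating element of simple text values. It is not how any DFDL processor is implemented; the function names and the comma separator are assumptions made up for the example, and escaping, nil and default handling are ignored.

```python
def unparse_any_empty(values, sep=","):
    """Toy unparser: zero-length occurrences are dropped along with their
    separators, so the output never contains adjacent separators."""
    return sep.join(v for v in values if v != "")


def parse_any_empty(data, sep=","):
    """Toy parser: adjacent (or trailing) separators are tolerated; the
    zero-length fields between them are skipped rather than rejected."""
    return [field for field in data.split(sep) if field != ""]
```

With these definitions, unparsing ["a", "", "b"] writes a single separator between a and b, while parsing "a,,b," still succeeds and yields the two non-empty occurrences.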

IBM does not yet implement 'trailingEmptyStrict', which if I recall was something Stephanie Fetzer said was needed for X12 'validation'.

Re your additional thought. I've not noticed an ambiguity.
1) An element of type int, say, which is not nillable with zero length and not empty with default value with zero length, can still have a zero-length representation: the absent representation, which is by definition zero-length.
2) The absent rep only arises when parsing, when we encounter adjacent delimiters (so no content) and there is no zero-length nil or empty rep.
3) The unparser never explicitly outputs an absent rep; it outputs nothing. But when the next thing that is output is a delimiter, what you then parse could be the absent rep.
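
The three points above can be sketched as a small decision function for what a zero-length occurrence between delimiters means when parsing. This is only a reading of the points as stated, not text from the spec; the function name and the two boolean flags are hypothetical, and checking nil before empty is an assumption of the sketch.

```python
def zero_length_rep(has_zero_length_nil: bool, has_zero_length_empty: bool) -> str:
    """Classify a zero-length occurrence found between adjacent delimiters.

    has_zero_length_nil:   the element is nillable with a zero-length nil rep
    has_zero_length_empty: the element has a zero-length empty rep
                           (for example, one backed by a default value)
    """
    if has_zero_length_nil:
        return "nil"
    if has_zero_length_empty:
        return "empty"
    # Neither applies, so the only zero-length reading left is absent.
    return "absent"
```

For the int element in point 1, both flags are False, so the zero-length content is classified as the absent rep.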
If you could be more specific with spec section references, then maybe any ambiguity will become clearer.

I should also add that IBM DFDL has not implemented all the empty/missing/absent stuff from the erratum that arose from action 140. We do not make a clear distinction between missing and empty. The main effect this has is that we can't supply a default value when parsing - so we currently give a parse-time schema definition error if we find a zero-length required occurrence for an element with a default value.

Regards
 
Steve Hanson

IBM Hybrid Integration, Hursley, UK
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday




From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 20/07/2018 21:06
Subject: Re: [DFDL-WG] Clarification needed: stricter separator suppression policy
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>





An additional thought here. The DFDL spec says that to be potentially trailing an element must have a possible zero-length representation.

So an element of type int, say, which is not nillable with zero length and not empty with default value with zero length, cannot have a zero-length representation.

If one of these elements is in an all-optional minOccurs=0 array at the end of a sequence, then trailing extra separators would NOT be acceptable regardless of trailingEmpty being lax, because the element is not potentially trailing.

However, elsewhere it says that absent (therefore missing) elements are never created for optional elements.

Zero length for such an element means "absent" and so missing. That means it is not put into the infoset, which suggests that such occurrences are acceptable and ignored (though counted towards maxOccurs positions in a positional sequence).

This seems completely ambiguous to me.



Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy


On Fri, Jul 20, 2018 at 1:03 PM, Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:

I have been trying to rationalize the definitions in the DFDL spec for separated sequences.

I cannot find a way to express something very simple: a repeating element where minOccurs non-zero-length occurrences, with separators, are required, and up to maxOccurs (or unbounded) non-zero-length occurrences, with separators, are allowed. This means separators can never be adjacent anywhere in the corresponding data stream (except if escaped, or hidden inside say a fixed-length string inside a complex type element). Adjacent delimiters would be a parse error.

I expected this to be occursCountKind 'implicit' with separatorSuppressionPolicy 'never', but that appears to mean that maxOccurs must be bounded and there are always exactly maxOccurs separators, the latter of which (maxOccurs - minOccurs of them) can be empty strings, meaning optional elements will not be created for them. 

All the other 3 separator suppression policies absorb adjacent separators, except that trailingEmptyStrict doesn't absorb them at the end of the group.

There doesn't seem to be a way to be strict about the format and speculatively parse only non-zero-length elements, requiring each optional occurrence to appear with its associated separator. I.e., no trailing adjacent separators, and no adjacent separators in the middle or at the beginning either.

Are we missing separatorSuppressionPolicy='neverEmpty' or 'anyEmptyStrict' perhaps?
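
To make the behavior I am after concrete, here is a toy Python sketch of the strict parse for a single repeating element of simple text values. It is only an illustration under simplifying assumptions (one-character separator, no escaping, no nested complex types); the function name parse_strict and its parameters are made up for the example, and neither 'neverEmpty' nor 'anyEmptyStrict' exists in DFDL 1.0.

```python
def parse_strict(data, sep=",", min_occurs=1, max_occurs=None):
    """Toy strict parse: every occurrence must be non-zero-length, so
    leading, adjacent, and trailing separators are all parse errors."""
    fields = data.split(sep) if data else []
    for position, field in enumerate(fields):
        if field == "":
            raise ValueError(f"zero-length occurrence at position {position}")
    if len(fields) < min_occurs:
        raise ValueError(f"found {len(fields)} occurrences, need at least {min_occurs}")
    # max_occurs=None models an unbounded array.
    if max_occurs is not None and len(fields) > max_occurs:
        raise ValueError(f"found {len(fields)} occurrences, at most {max_occurs} allowed")
    return fields
```

Under this sketch "a,b,c" parses with min_occurs=1 and max_occurs=3, while "a,,b", ",a" and "a," are each rejected.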

Comments?

...mikeb


