It is true that we do not cover all possible
nuances of separator suppression. The policies in DFDL 1.0 are essentially
equivalent to what is supported by WTX. The closest to your requirement
is 'anyEmpty', which will suppress adjacent separators when unparsing but
will tolerate adjacent separators when parsing. At IBM we have not
yet found concrete requirements for any others.
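To illustrate the 'anyEmpty' behaviour just described, here is a toy Python model. It is only a sketch under stated assumptions: a single comma separator, no escaping, no initiators or terminators, and occurrences held as plain strings. The function names are hypothetical and do not come from the DFDL spec or any implementation.

```python
SEP = ","  # assumed separator for this sketch


def unparse_any_empty(values):
    """Unparse: skip zero-length occurrences, so no adjacent separators are written."""
    return SEP.join(v for v in values if v != "")


def parse_any_empty(data):
    """Parse: tolerate adjacent separators by ignoring the zero-length pieces."""
    return [piece for piece in data.split(SEP) if piece != ""]


if __name__ == "__main__":
    print(unparse_any_empty(["1", "", "3"]))
    print(parse_any_empty("1,,3,"))
```

The asymmetry is the point: the unparser never emits adjacent separators, while the parser accepts them without error.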
IBM does not yet implement 'trailingEmptyStrict',
which if I recall was something Stephanie Fetzer said was needed for X12
'validation'.
Re your additional thought: I've not noticed
an ambiguity.
1) An element such as one of type int, which
is not nillable with a zero-length nil representation and has no zero-length
empty representation with a default value, can still have a zero-length
representation: the absent representation, which is by definition zero-length.
2) The absent rep only arises when parsing
and we encounter adjacent delimiters (so no content) and there is no zero-length
nil or empty rep.
3) The unparser never explicitly outputs
an absent rep, it outputs nothing, but when the next thing that is output
is a delimiter then what you parse could be the absent rep.
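The three points above can be sketched as a small classifier. This is a simplified illustration, not the spec's algorithm: the property flags and function names are hypothetical, and checking nil before empty is an assumption made for the sketch.

```python
def classify_occurrence(content, nil_zero_length=False, empty_zero_length=False):
    """Classify what was found between two delimiters for one occurrence."""
    if content:
        return "not zero-length"
    # Zero-length content: assumed order is nil first, then empty.
    if nil_zero_length:
        return "nil"
    if empty_zero_length:
        return "empty"
    # No zero-length nil or empty rep applies, so this is the absent rep.
    return "absent"


def classify_separated(data, sep=",", **props):
    """Classify each piece of a separated string (no escaping handled)."""
    return [classify_occurrence(piece, **props) for piece in data.split(sep)]


if __name__ == "__main__":
    print(classify_separated("1,,3"))
```

For an int element with neither flag set, the zero-length piece between adjacent separators falls through to "absent".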
If you could be more specific with spec
section references, then maybe any ambiguity will become clearer.
I should also add that IBM DFDL has not implemented
all the empty/missing/absent stuff from the erratum that arose from action
140. We do not make a clear distinction between missing and empty. The
main effect this has is that we can't supply a default value when parsing
- so we currently give a parse-time schema definition error if we find
a zero-length required occurrence for an element with a default value.
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 20/07/2018 21:06
Subject: Re: [DFDL-WG] Clarification needed: stricter separator suppresssion policy
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>
An additional thought here. The DFDL spec says that to
be potentially trailing an element must have a possible zero-length representation.
So, an element such as one of type int, which is not nillable
with zero length and has no zero-length empty representation with a default
value, cannot have a zero-length representation.
If one of these elements is in an all-optional minOccurs=0
array at the end of a sequence, then trailing extra separators would NOT
be acceptable regardless of trailingEmpty being lax, because the element
is not potentially trailing.
However, elsewhere it says that absent (therefore missing)
elements are never created for optional elements.
Zero length for such an element means "absent"
and so missing. That means it is not put into the infoset, which suggests
that such occurrences are acceptable and ignored (though counted towards
maxOccurs positions in a positional sequence).
This seems completely ambiguous to me.
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
On Fri, Jul 20, 2018 at 1:03 PM, Mike Beckerle <mbeckerle.dfdl@gmail.com>
wrote:
I have been trying to rationalize the definitions in the
DFDL spec for separated sequences.
I cannot find a way to express something very simple:
a repeating element where at least minOccurs non-zero-length occurrences,
each with its separator, are required, and up to maxOccurs (or unbounded)
non-zero-length occurrences with separators are allowed. This means
separators can never be adjacent anywhere in the corresponding data stream
(except if escaped, or hidden inside, say, a fixed-length string inside a
complex type element). Adjacent delimiters would be a parse error.
I expected this to be occursCountKind 'implicit' with
separatorSuppressionPolicy 'never', but that appears to mean that maxOccurs
must be bounded and there are always exactly maxOccurs separators, the
latter of which (maxOccurs - minOccurs of them) can be empty strings, meaning
optional elements will not be created for them.
All the other 3 separator suppression policies absorb
adjacent separators, except that trailingEmptyStrict doesn't absorb them
at the end of the group.
There doesn't seem to be a way to be strict about the
format and speculatively parse only non-zero-length elements, requiring
each optional occurrence to appear with its associated separator. I.e., no
trailing adjacent separators, and no adjacent separators in the middle or
at the beginning either.
Are we missing separatorSuppressionPolicy='neverEmpty'
or 'anyEmptyStrict' perhaps?
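To make the requested behaviour concrete, here is a minimal Python sketch of what such a strict policy might do when parsing. It is purely hypothetical: no such policy exists in DFDL 1.0, the names parse_strict and ParseError are invented for illustration, and escaping, initiators and terminators are ignored.

```python
class ParseError(ValueError):
    """Raised when the data violates the hypothetical strict policy."""


def parse_strict(data, min_occurs, max_occurs=None, sep=","):
    """Parse separated occurrences, rejecting any zero-length occurrence."""
    pieces = data.split(sep) if data != "" else []
    for index, piece in enumerate(pieces):
        if piece == "":
            # Leading, trailing or adjacent separators all show up as "".
            raise ParseError(f"zero-length occurrence at position {index}")
    if len(pieces) < min_occurs:
        raise ParseError(f"found {len(pieces)} occurrences, need {min_occurs}")
    if max_occurs is not None and len(pieces) > max_occurs:
        raise ParseError(f"found {len(pieces)} occurrences, limit {max_occurs}")
    return pieces


if __name__ == "__main__":
    print(parse_strict("1,2,3", min_occurs=1, max_occurs=5))
```

Under this sketch "1,2,3" is accepted, while "1,,3", "1,2," and ",1" are all parse errors, which is the behaviour described above.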
Comments?
...mikeb