As a general point, section 9 was constructed
so that the definitions of the representations are in 9.2 and then 9.3
can simply refer to them. We shouldn't be re-stating detail from 9.2 in
9.3 as it over-complicates the text and makes it hard to read.
Section 9.3.2.2 makes it clear that
a complex type must be traversed if it is not nil rep.
Delimited scanning. All you can do when
scanning is look for a list of in-scope delimiters. If you don't find a
terminator when it is needed for normal rep but not for nil/empty rep,
then you can deduce this because the delimiter you found was not the terminator.
If you can't reliably deduce this then the format is ambiguous and not
parse-able. I don't believe that IBM DFDL does anything more than that,
and it works for us with the formats we encounter. If you think that
this is not obvious and that more words are needed in the spec to convey
this, then we can add them, but we must be careful to maintain the readability
of section 9.3.
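To make the deduction above concrete, here is a minimal Python sketch of scanning for a list of in-scope delimiters and then checking which one was found. It is only an illustration, not how IBM DFDL or any other implementation does it. The function names are hypothetical, and preferring the longest delimiter when several match at the same position is an assumption of the sketch.

```python
from typing import Optional, Sequence, Tuple


def scan_for_delimiter(
    data: str, start: int, in_scope: Sequence[str]
) -> Optional[Tuple[int, str]]:
    """Return (position, delimiter) for the first in-scope delimiter at or
    after `start`, or None if no in-scope delimiter occurs.

    Assumption of this sketch: when several delimiters match at the same
    position, the longest one wins.
    """
    by_length = sorted(in_scope, key=len, reverse=True)
    for pos in range(start, len(data)):
        for delim in by_length:
            if delim and data.startswith(delim, pos):
                return pos, delim
    return None


def terminator_present(
    data: str, start: int, terminator: str, enclosing: Sequence[str]
) -> bool:
    """True if the first delimiter found is the element's own terminator.

    If some enclosing delimiter (for example a parent separator) is found
    first, the terminator is deduced to be absent for this occurrence.
    """
    found = scan_for_delimiter(data, start, [terminator, *enclosing])
    return found is not None and found[1] == terminator
```

For an element with terminator ";" inside a sequence with separator ",", `terminator_present("abc;,def", 0, ";", [","])` is True, while `terminator_present(",def", 0, ";", [","])` is False, because the delimiter found was the separator and not the terminator.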
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: dfdl-wg@ogf.org
Date: 01/08/2018 19:18
Subject: Re: [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second list missing "empty" representation
Ah,
So 9.3.2 talks only about the content region being zero length. I get that
now. I was not reading the word "content" specifically enough.
I also get that the second numbered list under 9.3.2.1
can't discuss empty representation, because the content region has been
identified as non-zero length.
I think there are still some problems here.
In 9.3.2, I believe its point about delimited lengthKind
is problematic. You don't even know whether you are scanning for a terminator
or not unless you entertain whether nillable, empty/defaultable, or normal
- because of nilValueDelimiterPolicy, emptyValueDelimiterPolicy, and initiator/terminator.
It's not correct to just look for a separator (if there even is one) here.
I think if the lengthKind is 'delimited', then you must consider the nil
representation with framing, empty representation with framing, and normal
representation with framing (for string and hexBinary), in order to even
find, positively, the content region in order to say that it is zero length.
This is particularly true if dfdl:initiator and dfdl:terminator are the
same string.
Delimiter "immediately found" isn't even a valid
concept for a non-nillable complexType, as one must recurse into the type.
So I think we need to be clear that a non-nillable complex type is simply
not ever considered "trivially zero length".
I think the clearest way to fix 9.3.2 for 'delimited'
is to provide the case analysis:
'delimited' =>
  if nillable - is content zero length after initiator and before terminator framing as required by nilValueDelimiterPolicy
  if simpleType - is content zero length after initiator and before terminator framing as required by emptyValueDelimiterPolicy
  if simpleType xs:string or xs:hexBinary - is content zero length after initiator and before terminator framing (if such framing is defined)
  otherwise - not trivially zero length.
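The case analysis above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not proposed spec text and not any implementation's algorithm: the `Element` class, its field names, and `zero_length_candidates` are hypothetical, and the sketch treats a representation that has no terminator as zero length only when the data ends or an enclosing delimiter follows immediately.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Element:
    # Hypothetical model of the properties the case analysis needs.
    is_simple: bool
    xs_type: Optional[str] = None
    nillable: bool = False
    initiator: str = ""
    terminator: str = ""
    nil_value_delimiter_policy: str = "both"    # none|initiator|terminator|both
    empty_value_delimiter_policy: str = "both"  # none|initiator|terminator|both


def framing(elem: Element, policy: str) -> Tuple[str, str]:
    """Initiator/terminator actually required under a delimiter policy."""
    init = elem.initiator if policy in ("initiator", "both") else ""
    term = elem.terminator if policy in ("terminator", "both") else ""
    return init, term


def zero_length_candidates(
    data: str, pos: int, elem: Element, enclosing: Sequence[str] = ()
) -> List[str]:
    """Names of the representations whose framing surrounds zero-length
    content at `pos`. An empty list means "not trivially zero length"."""
    cases = []
    if elem.nillable:
        cases.append(("nil", framing(elem, elem.nil_value_delimiter_policy)))
    if elem.is_simple:
        cases.append(("empty", framing(elem, elem.empty_value_delimiter_policy)))
        if elem.xs_type in ("xs:string", "xs:hexBinary"):
            cases.append(("normal", (elem.initiator, elem.terminator)))
    matches = []
    for name, (init, term) in cases:
        if not data.startswith(init, pos):
            continue
        after = pos + len(init)
        if term:
            ok = data.startswith(term, after)
        else:
            # No terminator required: the content is zero length only if the
            # data ends here or an enclosing delimiter follows immediately.
            ok = after == len(data) or any(
                d and data.startswith(d, after) for d in enclosing
            )
        if ok:
            matches.append(name)
    return matches
```

With a nillable xs:string element using initiator "[", terminator "]", nilValueDelimiterPolicy "none" and emptyValueDelimiterPolicy "both", the data "[],x" gives the candidates "empty" and "normal", while ",x" gives only "nil". A non-nillable complex element always yields an empty list, which corresponds to the "otherwise" case.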
I think 9.3.2.1 needs to state explicitly the implied points that it is
about the content only, and that the corresponding required framing must
be present, e.g., the first sentence could change to:
"If the content
is length zero as described above, the representation is then established
by checking, in order for:"
After the numbered lists, a concluding sentence can say
"In all cases above establishing a representation includes that it
must be surrounded by its corresponding initiator/terminator if required
for that representation's framing based on dfdl:nilValueDelimiterPolicy,
dfdl:emptyValueDelimiterPolicy, dfdl:initiator, and dfdl:terminator."
9.3.2.2 similarly needs to specify that "Establishing
nil representation includes that it must be surrounded by its corresponding
initiator/terminator if required based on dfdl:nilValueDelimiterPolicy,
dfdl:initiator, and dfdl:terminator."
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
On Wed, Aug 1, 2018 at 7:45 AM, Steve Hanson <smh@uk.ibm.com> wrote:
Mike
9.3.2.1
First numbered list. It's implicit, as is the non-reference to nilValueDelimiterPolicy
for nil rep.
Second numbered list. Should not mention empty - can't be empty as not
zero length.
Don't understand what you mean by empty rep is non-zero length. We are
talking about the content region only here.
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 17/07/2018 15:11
Subject: [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second list missing "empty" representation
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>
In section 9.3.2.1, there are two numbered lists.
The first numbered list should qualify its mention of "empty representation"
with "(when the emptyValueDelimiterPolicy and initiator and terminator
policies are defined such that zero-length is allowed as the empty representation.)"
The second numbered list should mention nil (literal), nil (logical), empty,
and normal representations.
It is missing discussion of non-zero-length "empty" representations
(due to emptyValueDelimiterPolicy not none, and initiator and terminator
defined such that the empty representation is non-zero length).
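As a small illustration of this point, the Python sketch below builds the empty representation (zero-length content plus whatever framing the policy requires) for each value of emptyValueDelimiterPolicy. The function name is hypothetical and the sketch ignores everything else that affects a real parse.

```python
def empty_representation(initiator: str, terminator: str, policy: str) -> str:
    """Empty representation: zero-length content surrounded by the framing
    that emptyValueDelimiterPolicy requires."""
    if policy not in ("none", "initiator", "terminator", "both"):
        raise ValueError(f"unknown emptyValueDelimiterPolicy: {policy!r}")
    init = initiator if policy in ("initiator", "both") else ""
    term = terminator if policy in ("terminator", "both") else ""
    return init + term  # the content region itself contributes nothing
```

With initiator "[" and terminator "]", the policy "both" gives "[]" (length 2) and "none" gives "" (length 0). In every case the content region is zero length; only the framing makes the overall representation non-zero length.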
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU