Hi Mike
That's not correct. Trailing suppression
occurs for positional sequences. It's 'anyEmpty' that means a group
is non-positional.
I think this is sufficient:
"Separators
occur in the data either before, between or after all occurrences of the
elements or groups that are the children of the sequence, in accordance
with dfdl:separatorPosition and dfdl:separatorSuppressionPolicy. Elements
with dfdl:inputValueCalc have no representation in the data stream, and
so never have an associated separator."
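To make the wording above concrete, here is a minimal Python sketch of how separators might be written when unparsing. It is an illustration only, not taken from any DFDL implementation: the function name `write_separated` and the use of `None` to mark a child with no representation (such as an element with dfdl:inputValueCalc) are assumptions, and it ignores occurrence counts, nested groups and the parsing side. The policy and position names are the enum values of dfdl:separatorSuppressionPolicy and dfdl:separatorPosition.

```python
def write_separated(children, separator=",", position="infix", policy="never"):
    """Join child representations with separators (simplified sketch).

    children: list of str, or None for a child that has no representation
              in the data stream (hypothetical marker for inputValueCalc).
    """
    # Children with no representation never have an associated separator.
    reps = [c for c in children if c is not None]

    if policy == "anyEmpty":
        # Non-positional: any zero-length child is dropped with its separator.
        reps = [r for r in reps if r != ""]
    elif policy in ("trailingEmpty", "trailingEmptyStrict"):
        # Positional: only trailing zero-length children are suppressed.
        while reps and reps[-1] == "":
            reps.pop()
    elif policy != "never":
        raise ValueError(f"unknown separatorSuppressionPolicy: {policy}")

    if position == "infix":
        return separator.join(reps)
    if position == "prefix":
        return "".join(separator + r for r in reps)
    if position == "postfix":
        return "".join(r + separator for r in reps)
    raise ValueError(f"unknown separatorPosition: {position}")
```

For example, `write_separated(["a", "", "c", ""], policy="trailingEmpty")` keeps the separator for the empty second child but drops the trailing one.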
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM
DFDL
Co-Chair, OGF
DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: DFDL-WG <dfdl-wg@ogf.org>
Date: 27/06/2019 19:24
Subject: Re: [DFDL-WG] Consolidated Notes including both DFDL WG Call 2018-08-07 along with other recent clarification email threads
Reawakening this thread from last August as I'm trying
to distill this into tracker issues.
I believe now we know that the suggestion below about
Section 14.2 isn't correct. Pulling that discussion up to the top here,
we suggested this:
Section 14.2
For property dfdl:separator. The sentence: "Separators
occur in the data either before, between or after all occurrences of the
elements or groups that are the children of the sequence." replaced
with "Separators occur in the data either before, between or after
all occurrences of represented elements (that is, elements without
the dfdl:inputValueCalc property) or model groups that are the children
of the sequence. Elements with dfdl:inputValueCalc have no representation
in the data stream, and so never have separators. Children of a sequence
that are model groups are always separated, even if they are empty (meaning
have no children of their own - which is allowed for sequence groups),
or both the model group child and its contained children occupy zero-length
in the data stream."
From tests IBM did, and experience with the DFDL schema
for EDIFACT, I think we neglected the potentially-trailing group case in
the above description. I've revised it; the new phrasing (shown in blue in
the original message) is the "if the sequence is positional" condition and
the final sentence.
Section 14.2
For property dfdl:separator. The sentence: "Separators
occur in the data either before, between or after all occurrences of the
elements or groups that are the children of the sequence." replaced
with "Separators occur in the data either before, between or after
all occurrences of represented elements (that is, elements without
the dfdl:inputValueCalc property) or model groups that are the children
of the sequence. Elements with dfdl:inputValueCalc have no representation
in the data stream, and so never have separators. Children of a sequence
that are model groups are separated if the
sequence is positional, even if they are empty (meaning have no children
of their own - which is allowed for sequence groups), or both the model
group child and its contained children occupy zero-length in the data stream.
If the sequence is not positional, then separators are suppressed for trailing
groups that are zero-length according to the dfdl:separatorSuppressionPolicy."
This accommodates the common situation where a trailing
sequence group contains an entirely optional array element. If none of
the array elements exists we do not want a separator for the sequence group
at all.
If this makes sense, I will distill these to one or more
tracker items.
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology
| www.tresys.com
Please note: Contributions to the DFDL Workgroup's email
discussions are subject to the OGF
Intellectual Property Policy
On Wed, Aug 15, 2018 at 11:51 AM Mike Beckerle <mbeckerle.dfdl@gmail.com>
wrote:
I'm fine with your suggested revised wording. The point
is just to make a broader statement about empty representation than the
one there which suggests it is *only* used for deciding default values
to be used or not, when it is in fact used more broadly to determine two
different things about absent/missing - defaulting, and optional-element
occurrence.
On Wed, Aug 15, 2018 at 10:12 AM Steve Hanson <smh@uk.ibm.com>
wrote:
Mike, thanks for writing this up.
I am in the process of incorporating it into the minutes of the WG call.
There is one paragraph that on re-reading and comparing with what is in
the spec already, I am not so sure about.
Section 9.2.2
The sentence: "The empty
representation is special in DFDL, because when
parsing it is this condition that can trigger the creation of a default
value for an element occurrence." replace with: "The empty representation
is special in DFDL because when parsing it is used to determine when
default values are created in the Infoset, and when optional recurring
elements are omitted from the Infoset. The empty representation can require
initiators or terminators be present so as to enable data formats to explicitly
distinguish empty-string/hexBinary values (which might cause default values
to be used) from emptiness meaning the absence of any representation."
(This is to clarify an error of omission - prior
language suggested that EVDP is only relevant when the element has a default
value, because only that need was mentioned.)
What is the significance of 'optionally recurring elements'? I would have
thought that is just 'optional occurrences' as it applies to (0,1) elements
too.
Actually I'm not convinced that clause is needed at all. The point of this
paragraph is to call out defaulting. Not adding occurrences to the infoset
happens for absent and missing occurrences too, as described in later paragraphs.
I would prefer:
"The empty representation is special in DFDL because when parsing
it is used to determine when default values are created in the Infoset.
The empty representation can require initiators or terminators be present
so as to
enable data formats to explicitly distinguish occurrences with empty
string/hexBinary values from occurrences that are missing or are absent."
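As a rough illustration of the distinction this wording draws, the following Python sketch classifies one occurrence's representation. It is a simplification under stated assumptions: the function name `classify_representation` is hypothetical, `data` stands for the occurrence's whole representation including any delimiters, and nil representation, escaping and lengthKind are ignored. The `evdp` values are the enum values of dfdl:emptyValueDelimiterPolicy.

```python
def classify_representation(data, initiator="", terminator="", evdp="none"):
    """Return 'empty', 'absent' or 'normal' for one occurrence (sketch)."""
    # Delimiters that must be present for the empty representation.
    expected_empty = {
        "none": "",
        "initiator": initiator,
        "terminator": terminator,
        "both": initiator + terminator,
    }[evdp]

    if data == expected_empty:
        # Zero-length content plus the delimiters the policy requires.
        return "empty"
    if data == "":
        # Zero-length, but the required delimiters are not there.
        return "absent"
    return "normal"
```

With an initiator of `[`, a terminator of `]` and a policy of 'both', the data `[]` is an explicit empty value, while zero-length data is an absent occurrence.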
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: DFDL-WG <dfdl-wg@ogf.org>
Date: 07/08/2018 18:46
Subject: [DFDL-WG] Consolidated Notes including both DFDL WG Call 2018-08-07 along with other recent clarification email threads
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>
This message includes two parts.
Part 1 is the things we discussed on the DFDL WG Call of 2018-08-07.
Part 2 is the other recent emails where the conclusion from the email thread
is repeated here for finalization/refinement, and for consolidation so
all these changes can be considered together.
================================
Part 1 - Discussed on the call.
Re: [DFDL-WG] Clarification discussion points for next call(s) - DFDL Spec
issues around nil, empty, normal, absent, defaulting, and separator suppression.
Dated August 2nd
Conclusions:
Section 9.3.1.1
Delete phrase "...this of course implies that....".
Note that there is already a correction to create numbered bullets of 3
sentences. The sentence containing this phrase will be bullet #3 of that
list.
Section 9.4
Item 2 under "For elements and element refs:" Change
to: "dfdl:element following property scoping rules, which includes
establishing representation as described in Section 9.3.2 and conversion
to element type for simple types."
Section 9.3.2
The phrase "The first step is to see if the content is trivially
of length zero." Change to: "The first step is to see if the
SimpleContent or ComplexContent region is of length zero as a first approximation."
The bullet "delimited => length is zero (delimiter is immediately
encountered)" Insert "in scope" after the open parenthesis.
Section 9.4.2.3.
We agreed that the paragraphs beginning with "For both required
and optional..." need to be better tied to the material above. Wording
TBD - pending Steve Hanson doing some tests on IBM DFDL.
============================
Part 2 - Below are conclusions from the prior email threads. This is for
email review in lieu of discussion on this week's call.
Re: [DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we
keep variable assignments?
Of Aug 6 (last date in the thread)
These corrections apply:
Sections 9.4.2.2 and 9.4.2.3
The phrase "Optional occurrence: If dfdl:emptyValueDelimiterPolicy
is not 'none'[12],"
Change to "Optional occurrence: if dfdl:emptyValueDelimiterPolicy
is applicable and is not 'none',...." (retaining the footnote)
Section 9.4.2
Before the final phrase "There are three main cases to
consider:" Insert this sentence: "The sections below indicate
when an item is added to the infoset, and whether it has a default or other
value. If there is no processing error then regardless of whether an item
is added to the infoset or not, any side-effects due to dfdl:discriminator
statements evaluating to true, or dfdl:setVariable statements, are retained."
Section 12.2
For property emptyValueDelimiterPolicy, before the phrase
"It is a schema definition error if...", insert this sentence:
"The value of dfdl:emptyValueDelimiterPolicy
should only be checked if there is a dfdl:initiator or dfdl:terminator
in scope. If so, and dfdl:emptyValueDelimiterPolicy is not set, it is a
schema definition error. If dfdl:initiator is not "" and dfdl:terminator
is "" and dfdl:emptyValueDelimiterPolicy is 'terminator' it is
a schema definition error. If dfdl:terminator is not "" and dfdl:initiator
is "" and dfdl:emptyValueDelimiterPolicy is 'initiator' it is
a schema definition error."
Section 13.16
For property nilValueDelimiterPolicy, before the phrase
"It is a schema definition error if...", insert this sentence:
"The value of dfdl:nilValueDelimiterPolicy
should only be checked if there is a dfdl:initiator or dfdl:terminator
in scope. If so, and dfdl:nilValueDelimiterPolicy is not set, it is a schema
definition error. If dfdl:initiator is not "" and dfdl:terminator
is "" and dfdl:nilValueDelimiterPolicy is 'terminator' it is
a schema definition error. If dfdl:terminator is not "" and dfdl:initiator
is "" and dfdl:nilValueDelimiterPolicy is 'initiator' it is a
schema definition error."
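The checks proposed for dfdl:emptyValueDelimiterPolicy and dfdl:nilValueDelimiterPolicy have the same shape, so they can be sketched as one Python function. This is only an illustration of the proposed wording: the function name `delimiter_policy_errors` is hypothetical, a policy of `None` stands for "not set", and treating a non-empty dfdl:initiator or dfdl:terminator as "in scope" is an assumption about how that phrase is meant.

```python
def delimiter_policy_errors(initiator, terminator, policy,
                            prop="dfdl:emptyValueDelimiterPolicy"):
    """Return a list of schema definition error messages (sketch)."""
    errors = []

    # Assumption: the policy is only checked when a delimiter is non-empty.
    if initiator == "" and terminator == "":
        return errors

    if policy is None:
        errors.append(f"{prop} is not set but a delimiter is in scope")
        return errors

    # Policy names a terminator, but only an initiator is defined.
    if initiator != "" and terminator == "" and policy == "terminator":
        errors.append(f"{prop} is 'terminator' but dfdl:terminator is \"\"")

    # Policy names an initiator, but only a terminator is defined.
    if terminator != "" and initiator == "" and policy == "initiator":
        errors.append(f"{prop} is 'initiator' but dfdl:initiator is \"\"")

    return errors
```

Passing `prop="dfdl:nilValueDelimiterPolicy"` gives the Section 13.16 variant; an empty list means no schema definition error under these rules.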
Section 9.2.2
The phrase "the occurrence's content in the
data..." replace with "the occurrence's SimpleContent or ComplexContent
region in the data..."
The sentence: "The empty
representation is special in DFDL, because when
parsing it is this condition that can trigger the creation of a default
value for an element occurrence." replace with: "The empty representation
is special in DFDL because when parsing it is used to determine when
default values are created in the Infoset, and when optional recurring
elements are omitted from the Infoset. The empty representation can require
initiators or terminators be present so as to enable data formats to explicitly
distinguish empty-string/hexBinary values (which might cause default values
to be used) from emptiness meaning the absence of any representation."
(This is to clarify an error of omission - prior
language suggested that EVDP is only relevant when the element has a default
value, because only that need was mentioned.)
Re: [DFDL-WG] Clarification needed: separator for empty sequence
Of Aug 2
Section 14.2
For property dfdl:separator. The sentence: "Separators
occur in the data either before, between or after all occurrences of the
elements or groups that are the children of the sequence." replaced
with "Separators occur in the data either before, between or after
all occurrences of represented elements (that is, elements without
the dfdl:inputValueCalc property) or model groups that are the children
of the sequence. Elements with dfdl:inputValueCalc have no representation
in the data stream, and so never have separators. Children of a sequence
that are model groups are always separated, even if they are empty (meaning
have no children of their own - which is allowed for sequence groups),
or both the model group child and its contained children occupy zero-length
in the data stream."
(note: Some of the above is redundant with stipulations in
the dfdl:inputValueCalc property description, but I believe it is wise
to have this little redundancy.)
======================
These email threads are mentioned here to indicate that they are resolved
by one or another of the above corrections:
Re: [DFDL-WG] clarification needed - ambiguity about empty string and optional
element
Of Aug 2
Re: [DFDL-WG] Spec correction ? - Section 9.3.2.1 - second list missing
"empty" representation
Of Aug 2
-----------
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU