There are only 2 cases. Section 22 (property
precedence) makes it clear that EVDP and NVDP are only ever examined if
there is an initiator and/or terminator.
Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: dfdl-wg@ogf.org
Date: 01/08/2018 20:53
Subject: Re: [DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we keep variable assignments?
I spoke too soon.
I'm good with this:
- If EVDP is 'none', then empty strings have no representation
syntax at all. So we don't create optional elements for them, period.
But this next statement I made is not correct:
- If EVDP is not 'none', then empty string requires some
syntax to be there, and based on this syntax appearing an empty string
value is added to the infoset for the optional element.
Suppose EVDP is not 'none', but initiator and terminator
are both "". Then the empty string does *not* require syntax
to be there. So the above statement is simply wrong.
When EVDP is 'both' but neither initiator nor terminator
are defined, then since EVDP is not 'none', a zero-length string would
still cause an optional element to be added to the infoset. In a separated
sequence, this would mean one could control how many empty strings go into
optional values by providing more separators.
So "a,b,,,,,," would add 2 non-empty and 6 empty
strings to the infoset, regardless of whether they are required or optional.
If the element is named 'x' and has minOccurs="3", default="c",
maxOccurs="12" and occursCountKind 'implicit',
then the first empty would trigger defaulting, and the data would be
<x>a</x><x>b</x><x>c</x><x/><x/><x/><x/><x/>
So we would get defaulting at the required index locations,
but creation of elements with empty string values at optional index locations.
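The example above can be walked through with a small Python sketch. It is only an illustration of the behaviour described in this message, not how Daffodil or IBM DFDL implement it; the function names `parse_occurrences` and `to_xml` are made up for this sketch, the separator is assumed to be an infix ",", and maxOccurs is not enforced.

```python
def parse_occurrences(data, separator=",", min_occurs=3, default="c"):
    """Model the described behaviour for element 'x' in a separated sequence.

    Assumption: EVDP is not 'none' and initiator/terminator are both "",
    so a zero-length field counts as an empty string value.
    """
    values = []
    for index, field in enumerate(data.split(separator), start=1):
        if field != "":
            values.append(field)        # normal, non-empty value
        elif index <= min_occurs:
            values.append(default)      # required occurrence: apply the default
        else:
            values.append("")           # optional occurrence: empty string value
    return values


def to_xml(values, name="x"):
    """Render the values the way the message writes the infoset."""
    return "".join(
        f"<{name}>{v}</{name}>" if v != "" else f"<{name}/>" for v in values
    )


if __name__ == "__main__":
    print(to_xml(parse_occurrences("a,b,,,,,,")))
```

The seven separators in "a,b,,,,,," yield eight fields, so the sketch produces two non-empty values, one defaulted value at the third (required) position, and five empty strings at the optional positions.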
This allows us to construct an XSD-invalid document. I
suppose that is no big deal; there are many ways to construct data by parsing
with DFDL where the data proves to be invalid per XSD rules.
So we really have 3 cases:
1. EVDP is 'none'
2. EVDP is not 'none', but initiator/terminator are such that the empty
representation has no syntax
3. EVDP is not 'none', but initiator/terminator are such that the empty
representation DOES have syntax.
Case 1 = optional elements never populated from ZL strings
Case 2 = optional elements are always populated from ZL
strings
Case 3 = optional elements are populated from the non-ZL
empty representation, optional elements are not populated from ZL strings.
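As a rough sketch of the three cases as stated in this message (and only as stated here; the reply at the top of the thread disputes that Case 2 can arise), the classification could be written as follows. The names `classify` and `ZL_POPULATES_OPTIONAL` are hypothetical, and the sketch assumes EVDP takes the values 'none', 'initiator', 'terminator' or 'both', with "" meaning the delimiter is not defined.

```python
def classify(evdp, initiator, terminator):
    """Return 1, 2 or 3 for the three cases listed in the message."""
    if evdp == "none":
        return 1
    # Does the empty representation actually carry any delimiter syntax?
    uses_initiator = evdp in ("initiator", "both") and initiator != ""
    uses_terminator = evdp in ("terminator", "both") and terminator != ""
    return 3 if (uses_initiator or uses_terminator) else 2


# Whether a zero-length (ZL) string populates an optional element, per case.
ZL_POPULATES_OPTIONAL = {
    1: False,  # Case 1: never populated from ZL strings
    2: True,   # Case 2: always populated from ZL strings
    3: False,  # Case 3: populated only from the non-ZL empty representation
}


if __name__ == "__main__":
    for args in [("none", "[", "]"), ("both", "", ""), ("both", "[", "]")]:
        case = classify(*args)
        print(args, "-> case", case, "ZL populates:", ZL_POPULATES_OPTIONAL[case])
```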
Daffodil has heretofore been crushing Cases 1 and 2 together
with behavior of Case 1.
Does IBM DFDL distinguish all 3 cases?
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
On Wed, Aug 1, 2018 at 2:33 PM, Mike Beckerle <mbeckerle.dfdl@gmail.com>
wrote:
ok, so here's the way I understand this now.
if EVDP is 'none', then empty strings have no representation
syntax at all. So we don't create optional elements for them period.
If EVDP is not 'none', then empty string requires some
syntax to be there, and based on this syntax appearing an empty string
value is added to the infoset for the optional element.
Seems simple in retrospect....
On Wed, Aug 1, 2018 at 11:07 AM, Steve Hanson <smh@uk.ibm.com>
wrote:
9.4.2.2 Simple element (xs:string or xs:hexBinary)
Required occurrence: If the element has
a default value then an item is added to the infoset using the default
value, otherwise an item is added to the Infoset using empty string (type
xs:string) or empty hexBinary (type xs:hexBinary) as the value.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy
is not 'none' then an item is added to the Infoset using empty string (type
xs:string) or empty hexBinary (type xs:hexBinary) as the value, otherwise
nothing is added to the Infoset.
Note: To prevent unwanted empty strings
or empty hexBinary values from being added to the Infoset, use XSD minLength
> '0' and a dfdl:assert that uses the dfdl:checkConstraints() function,
to raise a processing error.
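The rule quoted above for a zero-length xs:string occurrence can be summarised in a short Python sketch. This is a paraphrase of the quoted text for illustration only; the function name `infoset_value_for_zero_length` is invented here, `None` stands for "nothing is added to the Infoset", and xs:hexBinary would follow the same logic with an empty byte sequence.

```python
def infoset_value_for_zero_length(required, default=None, evdp="none"):
    """Value added to the Infoset for a zero-length xs:string occurrence.

    Returns None when nothing is added to the Infoset.
    """
    if required:
        # Required occurrence: use the default if there is one, else "".
        return default if default is not None else ""
    # Optional occurrence: only emptyValueDelimiterPolicy decides.
    return "" if evdp != "none" else None


if __name__ == "__main__":
    print(repr(infoset_value_for_zero_length(required=True, default="c")))
    print(repr(infoset_value_for_zero_length(required=False, evdp="both")))
    print(repr(infoset_value_for_zero_length(required=False, evdp="none")))
```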
9.2.2 Empty Representation
An element occurrence has an empty representation
if the occurrence does not have a nil representation and it conforms to
the grammar for SimpleEmptyElementRep or ComplexEmptyElementRep. Specifically,
the EmptyElementInitiator and EmptyElementTerminator
regions must be conformant with dfdl:emptyValueDelimiterPolicy and the
occurrence's content in the data stream is of length zero.
(If non-conformant it is not a processing error and the representation
is not empty). LeadingAlignment,
TrailingAlignment, PrefixLength regions may be present.
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: dfdl-wg@ogf.org
Date: 01/08/2018 15:13
Subject: Re: [DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we keep variable assignments?
To follow up then,
I have assumed that dfdl:emptyValueDelimiterPolicy isn't even examined
unless the element has default="..." i.e., a non-zero-length
default value specified. I.e., without a default value, there is no concept
of emptiness as distinct from "normal" representation.
Are you suggesting that it is also used to control when an empty string
(or empty hexBinary) is accepted as a normal representation value for an
optional element, vs. treated as a missing value? That's a reasonable interpretation
that I would support, but I don't know that the spec says that anywhere,
so we need to add a sentence. (Unless I'm missing where this is stated.)
I have also thought that dfdl:emptyValueDelimiterPolicy must be combined
with dfdl:initiator and dfdl:terminator. If the combination of these is
such that the empty representation is zero-length, that is what creates
the situation of interest here, where it is ambiguous whether the value
is the official empty representation or is the normal representation that
just so happens to be of zero length. That is, there's no special
significance to the 'none' EVDP property value.
For example, if dfdl:emptyValueDelimiterPolicy is 'both', but dfdl:initiator=""
and dfdl:terminator="", then that's just as good as dfdl:emptyValueDelimiterPolicy='none'
in terms of whatever effect this has on a decision about normal vs. missing.
Does this match your understanding?
On Wed, Aug 1, 2018 at 9:26 AM, Steve Hanson <smh@uk.ibm.com>
wrote:
Whether to add a zero-length string or hexBinary to the infoset for an
optional element depends on the setting of emptyValueDelimiterPolicy. A
setting of 'none' stops it from being added.
Either way, the zero-length parse is not a processing error, so the element is
known-to-exist. It therefore does not cause backtracking, and discriminators
and variable assignments are preserved.
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 24/07/2018 15:15
Subject: [DFDL-WG] clarification: on suppressed ZL string/hexBinary - do we keep variable assignments?
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>
In some situations we parse and get a successful zero-length parse for
a string or hexBinary.
But because the occurrence is optional, we do NOT add an element to the
infoset.
In that case, what happens to side-effects that occurred during the successful
parse? There are two possible kinds of side-effects: variables can be set,
and a discriminator can be set to true.
It seems to me that if a discriminator is set, then that *must* be preserved,
and in that case it would seem the variable settings should be retained
as well.
Comments?
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU