IBM Hybrid Integration, Hursley, UK
Architect, IBM
DFDL
Co-Chair, OGF
DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday
From: Steve Hanson/UK/IBM
To: DFDL-WG <dfdl-wg@ogf.org>
Cc: "Mike Beckerle" <mbeckerle@tresys.com>, "Michele Zundo" <michele.zundo@esa.int>, Bradd Kadlecik/Poughkeepsie/IBM@IBMUS
Date: 03/04/2019 12:04
Subject: Action 306 - IBM DFDL behaviour when parsing empty strings
Action 306: Confirm IBM DFDL behaviour when parsing empty strings (Steve)
7/8: IBM DFDL has not fully implemented the behaviour changes arising from action 140 with respect to empty string elements. Daffodil is about to do so. IBM DFDL users have complained about the lack of defaults when parsing, but other than that appear happy. Are the rules in the spec for empty strings overcomplicated? Steve to document the behaviour for IBM DFDL to inform the discussion.
...
1/11: In progress - there are a lot of subtle scenarios
15/11: Not discussed
...
7/2/19: No further progress
9.4.2.2 Simple element (xs:string or xs:hexBinary)
Required occurrence: If the element has a default value then an item is added to the infoset using the default value, otherwise an item is added to the Infoset using empty string (type xs:string) or empty hexBinary (type xs:hexBinary) as the value.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none' then an item is added to the Infoset using empty string (type xs:string) or empty hexBinary (type xs:hexBinary) as the value, otherwise nothing is added to the Infoset.
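To make the two spec rules above easier to compare, here is a minimal Python sketch of the decision for an empty xs:string occurrence. The function name and its parameters are hypothetical and only model the rules as quoted; this is not code from any DFDL processor. The hexBinary case is the same with an empty byte sequence in place of the empty string.

```python
from typing import Optional


def spec_empty_string_result(required: bool,
                             default: Optional[str],
                             empty_value_delimiter_policy: str) -> Optional[str]:
    """Model of spec section 9.4.2.2 for an empty xs:string occurrence.

    Returns the value added to the infoset, or None if nothing is added.
    All names here are illustrative assumptions, not a real API.
    """
    if required:
        # Required occurrence: default value if there is one, else empty string.
        return default if default is not None else ""
    # Optional occurrence: added only if dfdl:emptyValueDelimiterPolicy is not 'none'.
    if empty_value_delimiter_policy != "none":
        return ""
    return None
```

For example, spec_empty_string_result(False, None, "both") gives an empty string, while the same call with "none" gives None (nothing added).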
IBM DFDL behaviour:
Required. IBM DFDL does not implement default values when parsing, so an empty occurrence with a default value gives a Schema Definition Error (SDE), to prevent backtracking. An empty occurrence with no default gives a Processing Error. If you need to add an empty string to the infoset, you can specify default="" (once default values are implemented, of course).
Optional. IBM DFDL adds nothing to the infoset regardless of the presence of an initiator and/or terminator. There is no way to get an empty string into the infoset.
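The IBM DFDL behaviour just described can be sketched in the same style. This is only a model of the description in this note; the exception classes and the function name are hypothetical stand-ins, not part of the IBM DFDL API.

```python
from typing import Optional


class SchemaDefinitionError(Exception):
    """Hypothetical stand-in for a DFDL Schema Definition Error (SDE)."""


class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL Processing Error."""


def ibm_empty_string_result(required: bool, has_default: bool) -> Optional[str]:
    """Model of the IBM DFDL behaviour for an empty xs:string occurrence.

    Returns None when nothing is added to the infoset; a required empty
    occurrence always ends in an error, as described in this note.
    """
    if not required:
        # Optional: nothing is added, whatever initiator/terminator is present.
        return None
    if has_default:
        # Defaults are not implemented when parsing, so this is an SDE.
        raise SchemaDefinitionError("default values not supported when parsing")
    raise ProcessingError("empty occurrence of required element with no default")
```

Note that the function takes no delimiter policy argument at all, which reflects the point that initiator/terminator presence makes no difference in IBM DFDL today.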
9.4.2.3 Complex element
Required occurrence: An item is added to the Infoset.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none' then an item is added to the Infoset, otherwise nothing is added to the Infoset.
For both required and optional occurrences, the Infoset item may also have a child item.
1. If the first child element of the complex type is a required simple element, then an empty string (type xs:string), empty hexBinary (type xs:hexBinary), or default value will also be added to the Infoset.
2. If the first child element of the complex type is a required complex element, then an item is added to the Infoset (which may itself have a child via (1))
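Rules (1) and (2) above are recursive, which is easier to see in code. The sketch below models them for string children only; the Element class, its fields, and both function names are hypothetical and exist purely for illustration. The infoset item for a complex element is represented as a dict of its children.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical, minimal description of an element declaration."""
    name: str
    required: bool = True
    is_complex: bool = False
    default: Optional[str] = None          # only meaningful for simple elements
    children: List["Element"] = field(default_factory=list)


def first_child_items(parent: Element) -> dict:
    """Apply rules (1) and (2) to the first child of an empty complex element."""
    if not parent.children:
        return {}
    child = parent.children[0]
    if not child.required:
        return {}                          # neither rule applies to an optional child
    if child.is_complex:
        # Rule 2: add an item, which may itself get a child via rule 1.
        return {child.name: first_child_items(child)}
    # Rule 1: add the default value, or empty string if there is no default.
    return {child.name: child.default if child.default is not None else ""}


def spec_empty_complex_result(elem: Element,
                              empty_value_delimiter_policy: str) -> Optional[dict]:
    """Model of spec section 9.4.2.3; None means nothing is added to the infoset."""
    if not elem.required and empty_value_delimiter_policy == "none":
        return None
    return first_child_items(elem)
```

As a usage example, an optional complex element whose first child is a required complex element containing a required string "s" yields {"inner": {"s": ""}} when the policy is "initiator", and None when the policy is "none".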
IBM DFDL behaviour:
Required. IBM DFDL follows the spec, except that in case (1) it throws an error wherever its 9.4.2.2 behaviour would have thrown one.
Optional. IBM DFDL follows the spec, with the same exception for case (1), as per its 9.4.2.2 behaviour.
So ...
The spec today is consistent in one way: for both complex and string elements, a) a required empty occurrence always adds to the infoset; b) an optional empty occurrence adds to the infoset if an initiator/terminator is present; and c) an optional empty occurrence does not add to the infoset if no initiator/terminator is present.
If the simple string behaviour were to change to match IBM DFDL, then that consistency would be lost, but the string behaviour would then match that for other simple types. Section 9.4.2.2 would disappear, as the behaviour is the same as 9.4.2.1. Section 9.4.2.3 would become as below. We would lose the ability to get an empty string into the infoset for an optional string with initiator/terminator.
9.4.2.3 Complex element
Required occurrence: An item is added to the Infoset.
Optional occurrence: If dfdl:emptyValueDelimiterPolicy is not 'none' then an item is added to the Infoset, otherwise nothing is added to the Infoset.
For both required and optional occurrences, the Infoset item may also have a child item.
1. If the first child element of the complex type is a required simple element, then a default value will also be added to the Infoset.
2. If the first child element of the complex type is a required complex element, then an item is added to the Infoset (which may itself have a child via (1))
We also need to be sure that no other implementation has already implemented the current spec behaviour. We need to check with DFDL4S and IBM TPF.
To be discussed on next WG call ...
Regards
Steve Hanson
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU