15.13
When we get 'empty content' from
an element, and the element is optional, then it is not present and is
not added to the infoset.
When we get empty content from an
element, and the element is required, then we start to look at nil handling
and default handling properties.
- If the properties are such that
the empty string is a nil value then the infoset value is the special value
nil.
- If the properties are such that
there is a default value specified then the infoset value is the default
value.
- Otherwise, if the empty string is valid
for the type (i.e., the type is xs:string or derived from it) then the
infoset value is a zero-length string.
So we know what empty content is and
how it is applied to simple elements. We still need to define when it is
possible to get empty content, and what it means for elements of complex
type or of non-string simple type.
Proposal:
1. Parsing
Simple elements
1) It is neither a schema definition
error nor a processing error if a length is being used to extract
data and that length is zero. This covers dfdl:lengthKind implicit, explicit,
prefixed and endOfParent (when parent length is known). The result is
'empty content'. (Note that for implicit, XSDL allows the maxLength/length
facet to be 0, so disallowing a zero length for the other kinds would be
inconsistent.)
2) It is not a processing error
if data is being scanned for and the length of the returned bytes is zero.
This applies to dfdl:lengthKind delimited, pattern and endOfParent (when
parent length is not known). The result is 'empty content'. (This is just
stating the obvious.)
(The above two rules ensure that it
is possible to apply empty content to trigger optional, nil value or default
value processing regardless of data type and dfdl:lengthKind).
3) Optional, nil and default processing
are applied as per spec.
4) If the element is required, and nil
value or default value is not used, and empty string is not in the lexical
space of the element's type, then it is a processing error.
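Rules 1) to 4) can be sketched roughly as follows. This is only an
illustration of the decision order described above, not a real DFDL
processor: the SimpleElement class, its field names, the NIL and ABSENT
markers and the parse_simple function are all hypothetical, and the two
initiator policy properties are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

NIL = object()     # stand-in for the special infoset value nil
ABSENT = object()  # stand-in for "element not added to the infoset"


class ProcessingError(Exception):
    """Raised where the proposal calls for a processing error."""


@dataclass
class SimpleElement:
    # Hypothetical model of the properties the rules consult.
    required: bool
    empty_is_nil: bool = False            # nil properties treat empty string as nil
    default: Optional[str] = None         # default value, if one is specified
    empty_in_lexical_space: bool = False  # True for xs:string and derived types


def parse_simple(elem: SimpleElement, content: bytes):
    """Map extracted or scanned content to an infoset value."""
    if len(content) > 0:
        # Real content; conversion to the element's type is omitted here.
        return content.decode("ascii")
    # Rules 1 and 2: zero bytes is 'empty content', not an error in itself,
    # whichever dfdl:lengthKind produced it.
    if not elem.required:
        return ABSENT
    # Rule 3: nil handling first, then default handling, then empty string.
    if elem.empty_is_nil:
        return NIL
    if elem.default is not None:
        return elem.default
    if elem.empty_in_lexical_space:
        return ""
    # Rule 4: required, no nil, no default, empty string not valid for type.
    raise ProcessingError("empty content for required element")
```

For example, a required decimal with a default of 0 and zero-length
content, parse_simple(SimpleElement(required=True, default="0"), b""),
returns "0".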
The two initiator-related properties
dfdl:nilValueInitiatorPolicy and dfdl:defaultValueInitiatorPolicy define
whether nils and defaults are applied when initiated empty content is found.
They don't affect the definition of empty content or what it means for
the type.
[Note: If you recall, this discussion
was triggered by a customer who was using an expression to calculate the
length of a standard text decimal. He wanted a length of 0 to mean that
the value 0 ended up in the infoset. He can achieve this by making the
element required with a default value of 0.]
Complex elements
It is possible for empty content to be
returned for a complex element in cases 1) and 2) above.
1) If the complex element is optional
then it is not added to the infoset.
2) If the complex element does not have
an initiator specified & is required then it is added to the infoset.
3) If the element has an initiator specified
then dfdl:defaultValueInitiatorPolicy applies
- required => element is added to infoset only if initiator is present
(processing error if no initiator & empty content)
- prohibited => element is added to infoset only if initiator is not present
(initiator implies real content follows so processing error if initiator
& empty content)
4) If the complex element is added to
the infoset, then the parser processes the child content of the complex
type. This may or may not cause a processing error. If it doesn't
then default value processing applies for required child elements. If we
don't do this then we will not create default values for all missing required
simple elements, and that would be wrong.
5) If the contained sequence or choice
has an initiator or terminator then it is a processing error.
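As a rough illustration, rules 1) to 3) for a complex element with empty
content could be written as below. The function name and its parameters
are hypothetical, and checking "optional" before the initiator policy is
an assumption, since the proposal does not state an order. Rules 4) and 5)
(processing the child content, and the group initiator/terminator check)
are not modelled.

```python
from typing import Optional


class ProcessingError(Exception):
    """Raised where the proposal calls for a processing error."""


def add_empty_complex(required: bool,
                      initiator_defined: bool,
                      initiator_found: bool = False,
                      policy: Optional[str] = None) -> bool:
    """Decide whether a complex element with empty content joins the infoset.

    policy stands for dfdl:defaultValueInitiatorPolicy and is only consulted
    when the element has an initiator specified.
    """
    if not required:
        return False  # rule 1: optional element is not added
    if not initiator_defined:
        return True   # rule 2: required, no initiator specified
    # Rule 3: an initiator is specified, so the policy decides.
    if policy == "required":
        if initiator_found:
            return True
        raise ProcessingError("no initiator and empty content")
    if policy == "prohibited":
        if not initiator_found:
            return True
        raise ProcessingError("initiator found but content is empty")
    raise ValueError(f"unexpected policy: {policy!r}")
```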
2. Unparsing
Simple elements
Data in the infoset can result in empty
content being added to the bit stream (i.e., nothing), with an accompanying
0 value in any length prefix or length expression field, if appropriate
to the dfdl:lengthKind.
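For the prefixed case, a minimal sketch might look like this. The 2-byte
big-endian unsigned prefix is purely an assumption for illustration (the
real prefix format comes from the schema), and unparse_prefixed is a
hypothetical name.

```python
import struct


def unparse_prefixed(content: bytes) -> bytes:
    """Emit an assumed 2-byte big-endian length prefix, then the content."""
    # Empty content contributes nothing to the stream, but the prefix
    # still carries the value 0.
    return struct.pack(">H", len(content)) + content
```

With empty content only the prefix is written, so unparse_prefixed(b"")
gives b"\x00\x00".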
Complex elements
The absence from the infoset of a required
complex element will cause any specified initiator to be output, plus if
there are required children then default values will be output for those
children. If we don't do this then we will not create default values for
nested missing required simple elements, and that would be wrong. This
enables creation of a sparse infoset containing just the elements with
explicit values, with the rest defaulting regardless of nesting.
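The sparse infoset idea can be sketched as a recursive fill. The nested
dict schema model used here ("required", "default", "children") and the
fill_defaults function are hypothetical, and initiator output is not
shown.

```python
def fill_defaults(schema: dict, infoset: dict) -> dict:
    """Return a copy of infoset with missing required elements defaulted."""
    out = dict(infoset)
    for name, decl in schema.items():
        present = name in out
        if "children" in decl:
            # Complex element: recurse if it is present, or if it is
            # required and therefore has to be created.
            if present or decl["required"]:
                out[name] = fill_defaults(decl["children"], out.get(name, {}))
        elif not present and decl["required"]:
            # Missing required simple element: take its default value.
            if "default" not in decl:
                raise ValueError(f"required element {name!r} has no default")
            out[name] = decl["default"]
    return out


# Hypothetical schema: a required complex 'header' and a required simple 'id'.
schema = {
    "header": {
        "required": True,
        "children": {
            "version": {"required": True, "default": "1"},
            "note": {"required": False},
        },
    },
    "id": {"required": True, "default": "0"},
}

# Only 'id' is supplied; 'header' and its required child are defaulted.
result = fill_defaults(schema, {"id": "42"})
```

Here result holds id "42" as given, plus a header whose version is the
default "1"; the optional note is not created.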
3. Choices
It is worth noting that the concept of 'required'
does not apply to the elements of a choice, even if minOccurs > 0.
4. Outstanding Issues
Is it ok to reuse dfdl:defaultValueInitiatorPolicy
for complex elements? Should it be renamed? Should we add a separate property
for complex elements?