I'm trying to tighten up my understanding of the section 9 material on establishing representation, known to/not-to exist, and defaulting, particularly as they interact with separators and separator suppression for "absent" representations.
Section 9.2.5
This sentence
The nil representation can be a zero-length representation
if dfdl:nilValue is "%ES;", and there is no framing or framing is
suppressed by dfdl:nilValueDelimiterPolicy.
Should say
".... if dfdl:nilValue is a list containing either %ES; alone, or %WSP*; alone, and ...."
This matches existing erratum 5.32. It's just another place that needs the same update.
Sections 9.2.1 to 9.2.4 do not say what happens if an assertion failure occurs while trying to establish the representation. In particular, can an assertion failure cause establishing of nil, empty, or normal representation to fail, resulting in absent representation? For example, if an assert requires the length of the value to be greater than 1 character, then a zero-length string cannot be the normal representation.
But do we parse to nil representation and run assertions - and if fail, parse for empty representation, then re-run assertions, and if fail parse for normal representation, then re-run assertions, and if fail and the rep was "trivially ZL", then it is absent?
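To make the question concrete, here is a minimal Python sketch of one possible reading, in which the assertion is re-run after each candidate representation. Nothing in it comes from the specification: the names `Rep` and `establish_representation`, the modelling of `%ES;` as the empty string, and the reduction of an assert to a single predicate on the content are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Rep(Enum):
    NIL = "nil"
    EMPTY = "empty"
    NORMAL = "normal"
    ABSENT = "absent"


def establish_representation(
    content: str,
    nil_values: Sequence[str],
    empty_allowed: bool,
    assertion: Optional[Callable[[str], bool]] = None,
) -> Rep:
    """Hypothetical reading: try nil, then empty, then normal, re-running
    the assertion after each candidate. Fall through to absent only when
    the content is zero length and every candidate was rejected."""
    check = assertion or (lambda _value: True)

    # Candidate 1: nil representation (content equals one of the nil values).
    if content in nil_values and check(content):
        return Rep.NIL
    # Candidate 2: empty representation (zero-length content, if permitted).
    if content == "" and empty_allowed and check(content):
        return Rep.EMPTY
    # Candidate 3: normal representation.
    if check(content):
        return Rep.NORMAL
    # Every candidate failed; only zero-length content may become absent.
    if content == "":
        return Rep.ABSENT
    raise ValueError("processing error: assertion failed on non-empty content")
```

Under this reading, `establish_representation("", [""], True, lambda v: len(v) > 1)` yields `Rep.ABSENT`, which is exactly the outcome the question above asks about. Whether the specification intends this ordering is the open point.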
Section 9.2.3 doesn't say what happens if a type conversion error occurs when trying to establish normal representation, e.g., if a non-defaultable text integer is parsed from a zero-length string.
A section 9.2.5 should be added describing "No representation" - Computed elements (dfdl:inputValueCalc) have no representation at all. Not zero length, not anything. They have no implications for parsing of, or unparsing of, delimiters of any kind.
Section 9.3
Section 9.3.1.1 says that when establishing known-to-exist, no processing error can occur.
For example, an element can have zero-length representation if it is nillable and nilValue=%ES; and there are no delimiters nor framing. But an assert can fail for this by specifically excluding dfdl:valueLength(.) eq 0. Or an element of type string can have zero length, but an assert can insist the string contain 1 or more characters.
Section 9.3.1.3 says an occurrence is known not to exist if it is Missing, and absent representation is one way of being missing. So an assert that causes nil, empty, and normal representation to fail (assuming asserts are evaluated and contribute to the decision about representation) can cause the occurrence to be absent, hence missing. The section also says a processing error when parsing the component means it is known not to exist.
Section 9.3.2 Establishing Representation
Section 9.3.2.1 Simple Element
(1) already has an erratum allowing WSP* alone as well as ES.
(2) empty representation - does not say whether this is qualified by "empty representation must be able to be of zero-length".
(3) normal representation - does not say if asserts can fail and prevent this zero-length from being acceptable. (Actually this point about assertions applies to all of 1, 2, and 3.)
Section 9.3.2.2 Complex Element
If a complex type element has lengthKind 'delimited', do we still have to recurse into the type before deciding if it is trivially zero length, or can we look at the data stream without recursing in? Section 9.3.2, 5th bullet, says that we can check whether a delimiter is immediately encountered, and does not say this is for simple types only.
A complex type element can have empty representation if the content region is empty and the delimiters with EVDP specify what is found in the data stream.
Absent representation for a complex element can only occur if the representation is zero length after recursing through the type tree. This implies that if EVDP indicates delimiters for empty value, then ZL means absent.
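As a rough illustration of the reading in the two paragraphs above, the Python sketch below classifies a complex element from how far the data position moved during the recursive descent. The function name `classify_complex` and its parameters are hypothetical. The first and third branches encode what is stated above; the handling of EVDP 'none' (zero length treated as empty, delimiters around empty content treated as normal) is my own assumption and may not match the specification.

```python
def classify_complex(
    position_before: int,
    position_after: int,
    evdp_requires_delimiters: bool,
    content_is_empty: bool,
) -> str:
    """Classify a complex element after recursing through its type.

    position_before/position_after bracket the whole representation,
    delimiters included. This is an illustrative reading only.
    """
    consumed = position_after - position_before
    if consumed == 0:
        # Nothing was consumed. If EVDP calls for delimiters on an empty
        # value, zero length cannot be the empty representation, so absent.
        # ASSUMPTION: with no delimiters required, zero length is empty.
        return "absent" if evdp_requires_delimiters else "empty"
    if content_is_empty:
        # Only delimiters were consumed around an empty content region.
        # ASSUMPTION: these are exactly the delimiters EVDP calls for;
        # if EVDP calls for none, this is treated as normal.
        return "empty" if evdp_requires_delimiters else "normal"
    return "normal"
```

For example, `classify_complex(10, 10, True, True)` gives `"absent"`, while `classify_complex(10, 12, True, True)` gives `"empty"`.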
Section 9.3.3
StopValue seems like it has a point of uncertainty for every occurrence. The fact that a stop value must exist doesn't mean there are no points of uncertainty. It is uncertain if the logical value will be the stop value or not.
But another way of thinking about it is that the stopvalue parser does not need to establish points for backtracking. All elements MUST succeed until it parses a successful stop value, and if any failure occurs we backtrack the entire array, not just an element.
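The "no per-element backtracking" view can be sketched in Python as follows. This is an assumption-laden illustration, not the specified algorithm: `parse_stop_value_array` and its parameters are hypothetical names, the input is modelled as a list of already-separated raw items, and a failed occurrence is modelled as `ValueError`.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def parse_stop_value_array(
    raw_items: Sequence[str],
    parse_item: Callable[[str], T],
    stop_value: T,
) -> Optional[Tuple[List[T], int]]:
    """Parse occurrences until the stop value is seen.

    Returns (values, number_of_items_consumed) on success. Returns None
    if any occurrence fails or the data runs out before a stop value,
    meaning the whole array is backtracked, not just one element.
    """
    values: List[T] = []
    for index, raw in enumerate(raw_items):
        try:
            value = parse_item(raw)
        except ValueError:
            # No per-occurrence point of uncertainty: discard everything.
            return None
        if value == stop_value:
            # The stop value is consumed but is not part of the array.
            return values, index + 1
        values.append(value)
    # Ran out of data without ever seeing the stop value.
    return None
```

With `parse_item=int` and `stop_value=0`, the items `["1", "2", "0", "9"]` produce `([1, 2], 3)`, whereas `["1", "x", "0"]` fails the array as a whole.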
Use of stopValue with type xs:string creates lots of ambiguities. E.g., ZL can be a valid normal representation, but the stop value may be "stop", i.e., non-ZL. In that case, since all elements are optional according to minOccurs 0, then when a ZL is parsed is the optional element suppressed? Or not - meaning you get an array full of empty string "normal" values?
Section 9.4.2
Says a complex type must have descended into the type and returned with no processing error, but does not say whether processing errors signaled by asserts on simple type elements also disable empty representation from being established.
Section 9.4.2.2
This covers the case where EVDP is not none. Can an assert insist on something that subverts establishment of empty representation, such as requiring that the length be > 0?
Or the assert can test something orthogonal - entirely unrelated like some variable is set to a certain value?
E.g., <defineVariable name="disallowEmptyValues" type="xs:boolean".../>
Then an assert on the element says { if ($disallowEmptyValues) then fn:false() else fn:true() }.
For optional occurrence, if EVDP is not none, then empty representation is established by the presence of some positive syntax - the representation is not ZL. However, this says an empty string or empty hexBinary becomes the value, not the default value of the element. Is that correct? I would think having positive syntax that matches the empty representation would satisfy known-to-exist, and then that would trigger assigning the default value. However, I suppose the rationale is that such an empty value for an optional element means there will be no defaulting, hence, the empty representation corresponds to empty string or empty hexBinary as "normal" representation.
The suggestion to use an assert that checks a non-zero minLength facet only makes sense if the processing error will cause the end of the array (occursCountKind parsed or implicit). If the OCK is such that there is no point of uncertainty, then this processing error would cause the whole array-element to fail. That is, the assert suggested here really does too much to be used only to filter empty strings/hexBinary from going into the infoset.
Or, for a delimited simpleType value where the text is ZL but the type conversion fails or an assert fails, does that failure create Absent representation, resulting in no empty string going into the infoset?
I suspect this issue is tied up with separator suppression policy and when a ZL thing is suppressed, the separator absorbed, and nothing goes into the infoset.
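The three outcomes discussed above can be compared side by side with a small Python sketch. Everything in it is hypothetical: the function `parse_string_array`, the `ProcessingError` class, and the two flags are illustration only, the minLength assert is modelled as a simple length check, and the input is a list of fields already split on the separator.

```python
from typing import List, Sequence


class ProcessingError(Exception):
    """Stands in for a DFDL processing error in this sketch."""


def parse_string_array(
    fields: Sequence[str],
    min_length: int,
    has_point_of_uncertainty: bool,
    zl_failure_is_absent: bool = False,
) -> List[str]:
    """Collect string occurrences subject to a minLength-style assert."""
    values: List[str] = []
    for field in fields:
        if len(field) >= min_length:
            values.append(field)
            continue
        if field == "" and zl_failure_is_absent:
            # Reading A: the failed ZL occurrence is absent, its separator
            # is absorbed, and nothing is added to the infoset.
            continue
        if has_point_of_uncertainty:
            # Reading B: the failure is caught at the point of uncertainty
            # and simply ends the array.
            break
        # Reading C: no point of uncertainty, so the whole array fails.
        raise ProcessingError(f"assert failed for occurrence {field!r}")
    return values
```

For the fields of `"a,,b"` with `min_length=1`: reading A gives `["a", "b"]`, reading B gives `["a"]`, and reading C raises `ProcessingError`. Only reading A filters empty strings without also ending or failing the array.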
9.4.2.3 - suggests that processors must keep track of the "all empty flag" for every infoset node and recursively all child nodes.
This section should say that a complex type has empty representation if it is known to exist, and the position in the data doesn't change after a recursive traversal.
But the last paragraph of the section (maybe) contradicts what is said in the 2nd sentence of the section. The point the example is making - the principle it is illustrating - does not seem to be explicitly stated.