From: Alan Powell/UK/IBM@IBMGB
To: Tim Kimber/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 19/11/2009 17:30
Subject: Re: [DFDL-WG] Omitted array occurrences
Sent by: dfdl-wg-bounces@ogf.org
Need more discussion on this.
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 19/11/2009 12:07
Subject: [DFDL-WG] Omitted array occurrences
· to determine if an element declaration or reference is scalar or array
· to determine the required minimum number of occurrences of an array both when parsing and unparsing
Section 16.13
( Note: this definition of 'required' repeats the definition in
section 3 )
Definition: 'required'
We define the term 'required' as follows:
· A scalar element is required.
· An element of a fixed-occurrence array is required.
· An element of a variable-occurrence array is required if its index is less
than or equal to the value of minOccurs.
All other elements are not required.
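The definition above can be sketched as a small predicate. This is only an illustration in Python: the names Kind and is_required are hypothetical and do not come from the specification, and the index is assumed to be 1-based, since the definition compares it directly with minOccurs.

```python
from enum import Enum


class Kind(Enum):
    """How an element declaration or reference occurs (hypothetical names)."""
    SCALAR = "scalar"
    FIXED_ARRAY = "fixed-occurrence array"
    VARIABLE_ARRAY = "variable-occurrence array"


def is_required(kind: Kind, index: int = 1, min_occurs: int = 1) -> bool:
    """Return True if the occurrence at 1-based `index` is 'required'."""
    # Scalars and elements of fixed-occurrence arrays are always required.
    if kind in (Kind.SCALAR, Kind.FIXED_ARRAY):
        return True
    # Variable-occurrence array: required only up to and including minOccurs.
    return index <= min_occurs
```

For example, with minOccurs=2 the call is_required(Kind.VARIABLE_ARRAY, index=3, min_occurs=2) returns False, while index=2 returns True.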
...
On unparsing, if an element is required, and is not part of the logical data and the element has a default value specified then it is used, otherwise it is a processing error.
Section 17.3.1 : Sequence groups and separators
re: the combination of separatorPolicy="suppressAtEnd" and sequenceKind="ordered":
All separators must be found in the data except that when the sequence
has trailing optional items, the separators are suppressed for any final
missing items. Note suppressAtEnd can only be used when there is no clash
with delimiters from the containing structure.
My interpretation of the specification is:
a) if separatorPolicy="require" then the unparser should output
a separator for all missing required elements ( whether array members or
not )
Is this an additional definition of a 'required' element? If so, the default value should be output. (Interestingly, because default is a schema property rather than a DFDL property, you cannot set a default default.)
SMH: The definition of 'required' relates to the data. Here we are talking about whether to output syntax. Strike 'required' from Tim's interpretation and you have the correct interpretation.
b) if separatorPolicy="suppressAtEnd" then the unparser should output a separator for all non-trailing missing required elements
The unparser should set the default for any required element, so it won't be missing:
"On unparsing, if an element is required, and is not part of the logical data and the element has a default value specified then it is used, otherwise it is a processing error."
SMH: Tim's interpretation is not complete. The correct interpretation is "...then the unparser should output a separator for all missing elements in the sequence up to and including the last required element." It is only optional elements beyond the last required element that benefit from this property.
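One possible reading of a) and b), sketched in Python for discussion. The Item class and the positions_represented function are hypothetical, and the cut-off used for "suppressAtEnd" (the last item that is either required or present in the infoset) is an assumption that combines the "non-trailing" wording in b) with the correction above. The sketch only decides which positions in the sequence are represented in the output; it does not model infix, prefix or postfix separator placement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    name: str
    required: bool  # per the definition of 'required'
    present: bool   # True if the item is in the infoset


def positions_represented(items: List[Item], policy: str) -> List[str]:
    """Names of the sequence positions that get output (data or separator only)."""
    if policy == "require":
        # a) every position is represented, whether or not the item is missing
        return [it.name for it in items]
    if policy == "suppressAtEnd":
        # b) keep everything up to the last required or present item;
        # only trailing missing optional items are suppressed (assumption)
        last = -1
        for i, it in enumerate(items):
            if it.required or it.present:
                last = i
        return [it.name for it in items[: last + 1]]
    raise ValueError("unknown separatorPolicy: " + policy)
```

With items a (required, present), b (optional, missing), c (required, missing), d and e (optional, missing), "suppressAtEnd" keeps positions a, b and c, while "require" keeps all five.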
c) separators for missing elements must be output regardless of whether the element is required/optional, simple/complex, does/does not have a default value etc. I assume this because the term 'missing' is used rather than the very clearly-defined term 'required'.
'Missing' just means not in the infoset, and is orthogonal to
optional/required. If you accept that this is an additional definition of
'required', then no. But it then forces you to set defaults for minOccurs=0
elements, which will only be used in this circumstance. I'm not sure what
the default for a complex element would be: must all the children have a
default?
SMH: If c) is trying to
say that once you have decided, via a) and b), that a separator is needed,
then whether it is simple/complex, does/does not have a default, is irrelevant,
then I agree.
Reading between the lines, I also infer
the following rules:
d) if an array has maxOccurs="unbounded" and it is missing from
the infoset then the unparser will not output any separators for the array
If minOccurs > 0 then use the default. If minOccurs = 0 then output nothing. I don't think maxOccurs has any effect.
SMH: Agree.
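The agreed answer to d) can be sketched as follows, in Python. The function and exception names are hypothetical; the processing error for a missing default follows the Section 16.13 text quoted earlier, and maxOccurs is deliberately not a parameter because it is thought to have no effect here.

```python
from typing import List, Optional


class ProcessingError(Exception):
    """Stands in for a DFDL processing error (hypothetical name)."""


def occurrences_for_missing_array(min_occurs: int,
                                  default: Optional[str]) -> List[str]:
    """Values to unparse when an array is entirely absent from the infoset."""
    if min_occurs == 0:
        # No required occurrences: output nothing.
        return []
    if default is None:
        # A required occurrence is missing and there is no default to use.
        raise ProcessingError("required occurrence missing and no default")
    # Each required occurrence is filled with the default value.
    return [default] * min_occurs
```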
e) if an array has maxOccurs!="unbounded" and it is missing from the infoset then the unparser will output a separator for each missing occurrence ( so it will output maxOccurs separators ).
See d)
f) if an element contains a child group,
and none of the group members are present in the infoset, then the group
is 'missing' and the unparser will output a separator for it.
Not sure
SMH: This is establishing
'missing' for a local group. Sounds right to me. The separator will be
output according to a) and b). But because a local group is (1:1) in DFDL,
in practice you will always get a separator.
Suggested changes to the specification:
- As a minimum, I think it would be useful for the specification to include
a definition of 'missing'.
'Not in the infoset'
SMH: That's fine for unparsing only.
- DFDL does not allow min/maxOccurs on groups, so they implicitly have
cardinality 1:1. Specification should specify the behaviour of the unparser
when none of a group's members are present in the infoset.
Agree.
- The wording in 17.3.1 could be more accurate. I don't think the word
'optional' should be there ( if validation is off then the unparser will
tolerate missing required elements ).
No. 'required' is not part of validation.
I think the words 'trailing' and 'final' are intended to mean the same
- we should standardize on 'trailing'.
SMH: I agree the words could be improved. See my wording for b) above, for example.
regards,
Tim Kimber, Common Transformation Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 246742
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg