
I prefer choice (a) for two reasons:

* It is more restrictive and therefore more conservative (preserving the freedom to change in future if needed).
* If a user has a positional data format, you don't want them to even have to understand the concept of speculation in order to model their data. So choice (a) allows a simpler description that doesn't need to introduce the notion that the parser might be speculating.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com

Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy <http://www.ogf.org/About/abt_policies.php>

On Wed, Jun 25, 2014 at 5:20 AM, Steve Hanson <smh@uk.ibm.com> wrote:
*260* *Positional and non-positional sequences (All)*
10/6: Spec defines the above but also allows different occursCountKinds within the same sequence which may have different (implied) separatorSuppressionPolicy, which results in a sequence which is a mixture of both. Should this be allowed? If so what are the rules? Can certain combinations be disallowed?
17/6: IBM have discussed internally and will submit a proposal.
In the spec we define Positional Sequence and Non-Positional Sequence:
*Positional sequence* - *Each occurrence in the sequence can be identified by its position in the data. Typically the components of such a sequence do not have an initiator. In some such sequences, the separators for optional zero-length occurrences may or must be omitted when at the end of the group. A positional sequence can be modelled by setting dfdl:separatorSuppressionPolicy to 'never', 'trailingEmptyStrict' or 'trailingEmpty'.*
*Non-positional sequence* - *Occurrences in the sequence cannot be identified by their position in the data alone. Typically the components of such a sequence have an initiator. Such sequences allow the separator to be omitted for optional zero-length occurrences anywhere in the sequence. Speculative parsing is employed by the parser to identify each occurrence. A non-positional sequence can be modelled by setting dfdl:separatorSuppressionPolicy to 'anyEmpty'.*
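As a rough illustration of the two definitions (not how a DFDL processor works internally), the sketch below contrasts identifying occurrences by counting separators with identifying them by initiator. The function names, the sample data and the `X:`/`Y:`/`Z:` initiators are all made up for this example, and simple prefix matching stands in for the speculative parsing a real parser would do.

```python
def parse_positional(data, separator, names):
    """Identify each occurrence purely by counting separators."""
    fields = data.split(separator)
    # Field i always belongs to names[i]; an empty field means the optional
    # occurrence is absent but its separator is still present in the data.
    return {name: value for name, value in zip(names, fields) if value}


def parse_non_positional(data, separator, initiators):
    """Identify each occurrence by its initiator rather than its position."""
    result = {}
    for field in data.split(separator):
        # Hypothetical stand-in for speculation: try each known initiator.
        for name, initiator in initiators.items():
            if field.startswith(initiator):
                result[name] = field[len(initiator):]
                break
    return result


# Positional: the separator for the missing middle occurrence is kept.
print(parse_positional("A,,C", ",", ["x", "y", "z"]))
# Non-positional: the separator for the missing occurrence is omitted.
print(parse_non_positional("X:A,Z:C", ",", {"x": "X:", "y": "Y:", "z": "Z:"}))
```

In the first call an observer can tell that `C` is the third element just by counting commas; in the second, only the initiator `Z:` reveals which element `C` belongs to.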
The problem is that the setting of dfdl:separatorSuppressionPolicy is only examined for child elements with dfdl:occursCountKind 'implicit'. For other dfdl:occursCountKinds, there is the concept of an 'implied' dfdl:separatorSuppressionPolicy:
*When dfdl:occursCountKind is 'fixed' then ... the implied behaviour is 'never'.*
*When dfdl:occursCountKind is 'expression' ... the implied behaviour is 'never'.*
*When dfdl:occursCountKind is 'parsed' ... the implied behaviour is 'anyEmpty'.*
*When dfdl:occursCountKind is 'stopValue' ... the implied behaviour is 'anyEmpty'.*
So if a Positional sequence as defined above contains children with dfdl:occursCountKind 'parsed' or 'stopValue' then surely it is no longer a Positional sequence.
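To make the mixing problem concrete, here is a small sketch that encodes the four implied behaviours quoted above and reports when a sequence declared as positional ends up with children that behave as 'anyEmpty'. The names `IMPLIED_POLICY`, `effective_policy` and `is_mixed` are invented for this illustration and are not taken from the spec or from any implementation.

```python
# Implied separatorSuppressionPolicy per occursCountKind, as quoted from the spec.
IMPLIED_POLICY = {
    "fixed": "never",
    "expression": "never",
    "parsed": "anyEmpty",
    "stopValue": "anyEmpty",
}

POSITIONAL_POLICIES = {"never", "trailingEmptyStrict", "trailingEmpty"}


def effective_policy(occurs_count_kind, sequence_policy):
    """The sequence's own policy is only examined for 'implicit' children."""
    if occurs_count_kind == "implicit":
        return sequence_policy
    return IMPLIED_POLICY[occurs_count_kind]


def is_mixed(sequence_policy, child_kinds):
    """True if a sequence declared positional has a child that implies 'anyEmpty'."""
    if sequence_policy not in POSITIONAL_POLICIES:
        return False
    return any(
        effective_policy(kind, sequence_policy) == "anyEmpty"
        for kind in child_kinds
    )


print(is_mixed("trailingEmpty", ["implicit", "fixed"]))   # False
print(is_mixed("trailingEmpty", ["implicit", "parsed"]))  # True
```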
A solution to this is to prevent the appearance of certain values of dfdl:occursCountKind within a Positional sequence. However, precisely which values to outlaw is subject to interpretation of the phrase "*Each occurrence in the sequence can be identified by its position in the data*". Is this intended to mean:
*a) an observer of the raw data can identify an occurrence of an element in the sequence solely by counting separators*
=> SDE (Schema Definition Error) if 'parsed', 'stopValue' or 'expression' (see note) appears in a Positional sequence;
Note: although 'expression' would appear to be like 'fixed', it actually breaks a), so it must be included in the SDE list.
or
*b) a parser does not have to speculate to identify an occurrence of an element in the sequence*
=> SDE only if 'parsed' appears in a Positional sequence.
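The difference between the two interpretations can be sketched as a schema check that differs only in which occursCountKind values it rejects. This is just an illustration of the proposal as worded here; `DISALLOWED_KINDS` and `sde_kinds` are hypothetical names, and the check only looks at the direct children of one sequence.

```python
POSITIONAL_POLICIES = {"never", "trailingEmptyStrict", "trailingEmpty"}

# occursCountKind values that would raise an SDE under each interpretation.
DISALLOWED_KINDS = {
    "a": {"parsed", "stopValue", "expression"},
    "b": {"parsed"},
}


def sde_kinds(sequence_policy, child_kinds, interpretation):
    """Return the child occursCountKinds that would be an SDE, in schema order."""
    if sequence_policy not in POSITIONAL_POLICIES:
        # Non-positional ('anyEmpty') sequences are not restricted by this rule.
        return []
    disallowed = DISALLOWED_KINDS[interpretation]
    return [kind for kind in child_kinds if kind in disallowed]


children = ["fixed", "expression", "parsed"]
print(sde_kinds("never", children, "a"))     # ['expression', 'parsed']
print(sde_kinds("never", children, "b"))     # ['parsed']
print(sde_kinds("anyEmpty", children, "a"))  # []
```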
Note that it is possible to wrap a 'parsed' (or similar) element in a local sequence or another element to avoid an SDE. But this could still be seen as a violation of a) if the separators of both are the same, as the observer cannot count the separators. So should the rule be applied recursively, i.e., a Positional sequence cannot contain a non-Positional sequence unless the separators are different?
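One possible reading of the recursive rule in that question is sketched below. It is an assumption about how such a rule might be worded, not an agreed design: the `Sequence` class and `find_violations` are made up for the example, a nested sequence counts as non-positional only when its policy is 'anyEmpty', and each parent is compared only with its direct child sequences.

```python
from dataclasses import dataclass, field

POSITIONAL_POLICIES = {"never", "trailingEmptyStrict", "trailingEmpty"}


@dataclass
class Sequence:
    """Hypothetical model of a sequence; children holds nested sequences only."""
    name: str
    separator: str
    policy: str
    children: list = field(default_factory=list)


def find_violations(seq):
    """List (parent, child) pairs where a positional sequence directly contains
    a non-positional sequence that uses the same separator."""
    violations = []
    for child in seq.children:
        if (
            seq.policy in POSITIONAL_POLICIES
            and child.policy == "anyEmpty"
            and child.separator == seq.separator
        ):
            violations.append((seq.name, child.name))
        # Apply the same rule to every nested level.
        violations.extend(find_violations(child))
    return violations


outer = Sequence("outer", ",", "never", [
    Sequence("inner_same", ",", "anyEmpty"),
    Sequence("inner_diff", ";", "anyEmpty"),
])
print(find_violations(outer))  # [('outer', 'inner_same')]
```

Only `inner_same` is flagged, because an observer counting commas in `outer` could not tell which commas belong to the nested sequence.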
Regards
Steve Hanson
Architect, *IBM DFDL* <http://www.ibm.com/developerworks/library/se-dfdl/index.html>
Co-Chair, *OGF DFDL Working Group* <http://www.ogf.org/dfdl/>
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg