SSP (1) \ OCK | fixed | implicit | expression | parsed | stopValue (2) |
never | ok | ok | ok | x (5) | ok (6) |
trailingEmpty / trailingEmptyStrict | ok (3) | ok | ok (4) | x (5) | ok (6) |
anyEmpty | ok (3) | ok | ok (4) | ok | ok |
Non-positional sequence - Occurrences in the sequence cannot be identified by their position in the data alone. Often the components of such a sequence have an initiator. Such sequences sometimes allow the separator to be omitted for optional zero-length occurrences anywhere in the sequence. Speculative parsing might need to be employed by the parser to identify each occurrence. In DFDL, a sequence is non-positional if it contains any optional or array elements that have dfdl:occursCountKind 'parsed' or 'stopValue', and/or it has dfdl:separatorSuppressionPolicy 'anyEmpty'.
See parallel email for action 261 that ensures 'expression' behaves itself.
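To make the proposed wording concrete, here is a minimal sketch of the classification rule in Python. The `Child` class, its field names and the reading of "optional or array" as "anything other than minOccurs = maxOccurs = 1" are assumptions made for illustration only; they are not taken from the spec or from any DFDL implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Child:
    """Hypothetical stand-in for an element declaration in a sequence."""
    name: str
    min_occurs: int
    max_occurs: Optional[int]  # None stands for 'unbounded'
    occurs_count_kind: str     # 'fixed', 'implicit', 'expression', 'parsed', 'stopValue'

    def is_optional_or_array(self) -> bool:
        # Assumption: a required scalar is exactly minOccurs = maxOccurs = 1.
        return not (self.min_occurs == 1 and self.max_occurs == 1)


def is_non_positional(ssp: str, children: List[Child]) -> bool:
    """Apply the proposed rule: 'anyEmpty' on the sequence, and/or any
    optional or array child with occursCountKind 'parsed' or 'stopValue'."""
    if ssp == "anyEmpty":
        return True
    return any(
        c.is_optional_or_array() and c.occurs_count_kind in ("parsed", "stopValue")
        for c in children
    )
```

For example, a sequence with separatorSuppressionPolicy 'never' that contains an unbounded 'parsed' array would be classed as non-positional under this rule, while the same sequence with only 'implicit' children would not.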
One behaviour is missing from the spec. For a sequence with separators, what is expected in the data stream if occursCountKind = 'fixed' / 'implicit' and maxOccurs = '0', or occursCountKind = 'expression' and occursCount evaluates to 0? We believe that no separator should be expected when parsing and none output when unparsing (same behaviour as inputValueCalc).
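The unparsing side of that suggestion can be sketched as follows. This is only an illustration under stated assumptions: an infix separator, 'never'-style padding of missing occurrences with empty strings, and a hypothetical `unparse_positional` helper that takes `(name, count, values)` tuples, where `count` is the resolved number of occurrences (maxOccurs, or the evaluated occursCount).

```python
from typing import List, Sequence, Tuple


def unparse_positional(
    children: Sequence[Tuple[str, int, List[str]]], separator: str = ","
) -> str:
    """Join occurrence slots with an infix separator.

    An element whose resolved count is 0 contributes no slot at all,
    so no separator is written for it.
    """
    slots: List[str] = []
    for name, count, values in children:
        if len(values) > count:
            raise ValueError(f"{name}: {len(values)} values but count is {count}")
        if count == 0:
            continue  # proposed behaviour: element is invisible in the data
        # Pad missing occurrences with empty strings so positions are kept.
        slots.extend(list(values) + [""] * (count - len(values)))
    return separator.join(slots)
```

With children `a` (count 1, value "A"), `b` (count 0) and `c` (count 2, one value "C1"), the sketch writes `A,C1,`: nothing marks the position of `b`, whereas the missing second occurrence of `c` still gets its separator.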
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 06/08/2014 12:42 -----
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB,
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 30/06/2014 10:04
Subject: Re: [DFDL-WG] Action 260
260 | Positional and non-positional sequences (All) | 10/6: Spec defines the above but also allows different occursCountKinds within the same sequence which may have different (implied) separatorSuppressionPolicy, which results in a sequence which is a mixture of both. Should this be allowed? If so what are the rules? Can certain combinations be disallowed? 17/6: IBM have discussed internally and will submit a proposal. |
In the spec we define Positional Sequence and Non-Positional Sequence:
Positional sequence - Each occurrence in the sequence can be identified by its position in the data. Typically the components of such a sequence do not have an initiator. In some such sequences, the separators for optional zero-length occurrences may or must be omitted when at the end of the group. A positional sequence can be modelled by setting dfdl:separatorSuppressionPolicy to 'never', 'trailingEmptyStrict' or 'trailingEmpty'.
Non-positional sequence - Occurrences in the sequence cannot be identified by their position in the data alone. Typically the components of such a sequence have an initiator. Such sequences allow the separator to be omitted for optional zero-length occurrences anywhere in the sequence. Speculative parsing is employed by the parser to identify each occurrence. A non-positional sequence can be modelled by setting dfdl:separatorSuppressionPolicy to 'anyEmpty'.
The problem is that the setting of dfdl:separatorSuppressionPolicy is only examined for child elements with dfdl:occursCountKind 'implicit'. For other dfdl:occursCountKinds, there is the concept of an 'implied' dfdl:separatorSuppressionPolicy:
When dfdl:occursCountKind is 'fixed' then ... the implied behaviour is 'never'.
When dfdl:occursCountKind is 'expression' ... the implied behaviour is 'never'.
When dfdl:occursCountKind is 'parsed' ... the implied behaviour is 'anyEmpty'.
When dfdl:occursCountKind is 'stopValue' ...the implied behaviour is 'anyEmpty'.
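The four quoted rules, together with the observation that the declared policy is only examined for 'implicit', amount to a small lookup. A minimal sketch (the names `IMPLIED_SSP` and `effective_ssp` are made up for illustration):

```python
# Implied dfdl:separatorSuppressionPolicy per dfdl:occursCountKind,
# as quoted from the spec above. 'implicit' is absent on purpose:
# it is the only kind for which the declared policy is examined.
IMPLIED_SSP = {
    "fixed": "never",
    "expression": "never",
    "parsed": "anyEmpty",
    "stopValue": "anyEmpty",
}


def effective_ssp(occurs_count_kind: str, declared_ssp: str) -> str:
    """Return the policy that actually governs a child element."""
    if occurs_count_kind == "implicit":
        return declared_ssp
    return IMPLIED_SSP[occurs_count_kind]
```

So `effective_ssp("parsed", "never")` gives 'anyEmpty' even though the enclosing sequence declares 'never', which is the mismatch this action is about.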
So if a Positional sequence as defined above contains children with dfdl:occursCountKind 'parsed' or 'stopValue' then surely it is no longer a Positional sequence.
A solution to this is to prevent the appearance of certain values of dfdl:occursCountKind within a Positional sequence. However, precisely which values to outlaw is subject to interpretation of the phrase "Each occurrence in the sequence can be identified by its position in the data". Is this intended to mean:
a) an observer of the raw data can identify an occurrence of an element in the sequence solely by counting separators
=> SDE if 'parsed', 'stopValue' or 'expression' ** appeared in a Positional sequence;
** Although 'expression' would appear to be like 'fixed' it actually breaks a) so must be included in the SDE list.
or
b) a parser does not have to speculate to identify an occurrence of an element in the sequence
=> SDE only if 'parsed' appeared in a Positional sequence.
Note that it is possible to wrap a 'parsed' (etc.) element in a local sequence or another element to avoid an SDE. But this could still be seen as a violation of a) if the separators of both are the same, as the observer cannot count the separators. So should the rule be applied recursively, i.e. a Positional sequence cannot contain a non-Positional sequence unless the separators are different?
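As a way of comparing the two readings, the check each one implies can be sketched like this. The function name and data shapes are hypothetical, and the sketch looks at direct children only; it deliberately leaves the recursive question open.

```python
from typing import Iterable, List

# occursCountKind values that would raise an SDE inside a Positional
# sequence, under interpretation a) and interpretation b) respectively.
DISALLOWED = {
    "a": {"parsed", "stopValue", "expression"},
    "b": {"parsed"},
}

# Policies that model a Positional sequence according to the spec text.
POSITIONAL_SSP = {"never", "trailingEmptyStrict", "trailingEmpty"}


def sde_violations(ssp: str, child_kinds: Iterable[str], interpretation: str) -> List[str]:
    """List the child occursCountKinds that would be flagged as an SDE."""
    if ssp not in POSITIONAL_SSP:
        return []  # the restriction only applies to Positional sequences
    banned = DISALLOWED[interpretation]
    return [kind for kind in child_kinds if kind in banned]
```

For a 'never' sequence with children of kind 'fixed', 'expression' and 'stopValue', reading a) flags 'expression' and 'stopValue', while reading b) flags nothing.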
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg