Some thoughts on this...

I agree that the definition of positional sequence in the spec needs tightening: as it stands it is ambiguous and could be read as either a) or b). If we adopted b), that would appear to allow 'expression' to appear in a positional sequence, but wouldn't it also allow 'stopValue'?

occursCountKind 'expression' is analogous to lengthKind 'explicit' with an expression and to lengthKind 'prefixed'. Both of these lengthKinds are classified as 'specified length' when parsing but 'variable length' when unparsing. Similarly, we are observing that occursCountKind 'expression' behaves like 'fixed' when parsing but less like 'fixed' when unparsing, which is why section 16 groups 'expression' with 'parsed' for unparsing.

When unparsing occursCountKind 'expression' you don't always have the calculated array length N. If the infoset was derived from XML, there is likely no 'count' element, just a series of elements with the same name that make up the 'array'. DFDL gives you the choice of setting the count element manually or having the unparser set it automatically via outputValueCalc. In the former case, you can create a document that cannot be parsed. The unparser could check that the 'count' element matches the infoset, but that would involve reverse engineering an arbitrarily complex expression, which is why the specification does not require it. Here's a real example of such an expression (albeit with lengthKind 'explicit', but the principle is the same):

        dfdl:length="{xs:nonNegativeInteger(fn:floor((../Length + 1) div 2))}"
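To make the "reverse engineering" point concrete, here is a small Python sketch (not DFDL; the function name length_from_expression is purely illustrative) that mirrors the arithmetic of the expression above for a few values of ../Length. Evaluating it forwards is easy, but two different inputs give the same result, so the result alone does not determine the input:

```python
import math


def length_from_expression(length_field: int) -> int:
    """Mirror of xs:nonNegativeInteger(fn:floor((../Length + 1) div 2)).

    Hypothetical helper for illustration only; it is not part of any
    DFDL implementation.
    """
    return math.floor((length_field + 1) / 2)


# Forward evaluation: one result per input value of ../Length.
results = {n: length_from_expression(n) for n in range(1, 7)}

# Group the inputs by result to show the mapping is many-to-one,
# so it cannot be inverted uniquely.
inverse = {}
for n, value in results.items():
    inverse.setdefault(value, []).append(n)

for value, inputs in inverse.items():
    print(f"result {value} <- ../Length in {inputs}")
```

This is a simple case; an arbitrary expression may reference several elements, which makes inversion harder still.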

Alex brought up the case where the expression evaluates to 0. In a positional sequence, would you still expect a delimiter in this case? If 'yes', then the resulting zero-length string must be treated as the 'absent representation' and ignored. If 'no', then is the sequence still positional?

Regards
 
Steve Hanson
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Tim Kimber/UK/IBM@IBMGB
To:        dfdl-wg@ogf.org,
Date:        10/06/2014 21:22
Subject:        [DFDL-WG] Action 261
Sent by:        dfdl-wg-bounces@ogf.org




Implied separatorSuppressionPolicy for occursCountKind 'expression' (All)
10/6: Spec says it is 'never' (positional sequence) but you have to parse to identify the position, so isn't that non-positional?


I think there are two alternative definitions of 'positional':

a) the identity of every delimited field is known before parsing of the sequence group begins

b) the identity of every delimited field is known before parsing of the field begins


As an implementer, b) is sufficient because it means that the parser never needs to backtrack while parsing the group.
a) allows the field identities to be statically known, but that is less important: it does not allow optimised extraction of a particular field, as would be the case for a fixed-length group (the possibility of escaped separators/terminators means that every character will need to be scanned anyway).


It may sound like a small point, but it affects two decisions:

1. whether ock='expression' should be allowed within a positional sequence group (action 261)

2. what the behaviour of the unparser should be w.r.t. ock='expression'.


My own feeling is that ock='expression' should be treated almost exactly like ock='fixed', except that the calculated array length N is used instead of maxOccurs.

- When parsing a positional sequence group it should cause N delimiters to be expected for the array.

- When unparsing a positional sequence group it should cause N delimiters to be written.

These rules are consistent and straightforward to describe and implement. The current rule (the unparser outputs only the occurrences that are in the infoset) allows the unparser to write a document that cannot be parsed using the same schema.
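The difference between the two unparser rules can be sketched with a toy comma-separated format in Python. This is not a DFDL processor: the layout (a count field, then the array, then one trailing field), the function names, and the choice to pad missing occurrences with empty strings are all assumptions made for illustration, and escaping is ignored.

```python
SEP = ","


def parse_positional(text):
    """Parse 'count,item1..itemN,trailer', expecting exactly N array slots."""
    fields = text.split(SEP)
    count = int(fields[0])
    if len(fields) != count + 2:
        raise ValueError(f"expected {count} array slots, got {len(fields) - 2}")
    return {"count": count, "items": fields[1:1 + count], "trailer": fields[-1]}


def unparse_infoset_only(infoset):
    """Rule described as current: write only the occurrences in the infoset."""
    return SEP.join([str(infoset["count"]), *infoset["items"], infoset["trailer"]])


def unparse_n_slots(infoset):
    """Proposed rule: always write N slots, where N is the count value.

    Padding with empty strings is an assumption of this sketch.
    """
    n = infoset["count"]
    items = infoset["items"]
    if len(items) > n:
        raise ValueError("more occurrences in the infoset than the count allows")
    padded = list(items) + [""] * (n - len(items))
    return SEP.join([str(n), *padded, infoset["trailer"]])


# A count of 3 set by hand, but only two occurrences in the infoset.
infoset = {"count": 3, "items": ["a", "b"], "trailer": "z"}

written_current = unparse_infoset_only(infoset)   # "3,a,b,z"
written_proposed = unparse_n_slots(infoset)       # "3,a,b,,z"

try:
    parse_positional(written_current)
    current_round_trips = True
except ValueError:
    current_round_trips = False

proposed_round_trip = parse_positional(written_proposed)
```

In this sketch the first output cannot be parsed back with the same rules, because the parser expects three array slots and finds two, whereas the second output parses with the third slot as an empty string.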


regards,

Tim Kimber,


----- Forwarded by Tim Kimber/UK/IBM on 10/06/2014 20:34 -----


From:        Steve Hanson/UK/IBM@IBMGB
To:        dfdl-wg@ogf.org,
Date:        10/06/2014 17:57
Subject:        [DFDL-WG] OGF DFDL WG Call Minutes 2014-06-10
Sent by:        dfdl-wg-bounces@ogf.org




Please find minutes from the above call at
http://redmine.ogf.org/dmsf_files/13263?download=

Regards

Steve Hanson
Architect, IBM DFDL,
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU