Actually, I think my interpretation was incorrect and there is no problem here.

Zero length processing (i.e., checking for nil, checking for default) is independent of type. It takes place as soon as the rep is extracted from the data. Any rep may be extracted with zero length, and could then be nilled in the infoset, defaulted in the infoset if it is a required occurrence with a default, or not put in the infoset if it is an optional occurrence. It's only when none of those are true that we see whether length 0 is ok for the logical type. For all Numbers and Calendars, length 0 is not ok and is a processing error, because you can't turn the rep into a logical value.
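
To make the ordering concrete, here is a rough Python sketch of the decision sequence described above. It is illustrative only and not taken from the spec or any implementation: the function name, the parameter names, and the `NO_ZERO_LENGTH` set are all hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    NIL = "nilled in the infoset"
    DEFAULTED = "defaulted in the infoset"
    ABSENT = "not put in the infoset"
    PROCESSING_ERROR = "processing error"
    VALUE = "converted to a logical value"


# Hypothetical grouping: logical types that cannot be built from a zero-length rep.
NO_ZERO_LENGTH = {"number", "calendar"}


def zero_length_outcome(type_group, nil_matches_zero_length, required, has_default):
    """Decide what happens to a rep that was extracted with length 0.

    The nil/default/optional checks run first and ignore the logical type;
    the type is consulted only when none of them applies.
    """
    if nil_matches_zero_length:
        return Outcome.NIL
    if required and has_default:
        return Outcome.DEFAULTED
    if not required:
        return Outcome.ABSENT
    # Only now does the logical type matter.
    if type_group in NO_ZERO_LENGTH:
        return Outcome.PROCESSING_ERROR
    return Outcome.VALUE
```

Under these assumptions, a required number with a default is defaulted, while a required number without a default and without a nil match is a processing error.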

So please ignore my original email.  Both the first and second DFDL examples that I included are fine.

Regards
 
Steve Hanson
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Mike Beckerle <mbeckerle.dfdl@gmail.com>
To:        Steve Hanson/UK/IBM@IBMGB,
Cc:        "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date:        08/07/2014 17:37
Subject:        Re: [DFDL-WG] Spec section 12.3.7.2 - binary reps and zero length




This is one more among a number of problematic issues in the way the DFDL Spec describes length. 

I did a bunch of work on this a while back, but we decided that the required changes to improve on what we have would be too extensive for this draft of the spec.

I do believe we need to specify where length of zero is allowed or not. In your analysis below you suggest that currently we are allowing packed numbers of length zero. Since we allow prefixed and delimited for packed numbers I think allowing length of zero is ok, and probably needed to avoid all sorts of corner case checking.

It doesn't bother me that type/property combinations that disallow length 0 implicitly disallow use of default values. It does suggest a need for a schema-definition error, however, if a default value is specified but the types and properties are such that it cannot be defaulted when parsing. Can the default value end up being used when unparsing? I would think in the case of an array with minOccurs > 0 one might use it when unparsing.

The intention of Table 22 is that it gives the minimum length in bits, but if length units is bytes, then one must divide by 8 and round up. So if length units is 'bytes' then the minimum length for type xs:short is 1, and the maximum length is 2.
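
As a quick check of that arithmetic, here is a small Python sketch of the bits-to-bytes conversion. The helper name is made up, and the bit lengths used for xs:short (minimum 2, maximum 16) are an assumption on my part that should be checked against Table 22.

```python
def bits_to_bytes(bits: int) -> int:
    """Divide a length in bits by 8 and round up to whole bytes."""
    return (bits + 7) // 8


# Assumed bit lengths for xs:short; verify against Table 22 before relying on them.
SHORT_MIN_BITS = 2
SHORT_MAX_BITS = 16

short_min_bytes = bits_to_bytes(SHORT_MIN_BITS)  # minimum length when length units is bytes
short_max_bytes = bits_to_bytes(SHORT_MAX_BITS)  # maximum length when length units is bytes
```

With those assumed values the result is 1 byte minimum and 2 bytes maximum, which matches the xs:short figures above.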

To fix this, probably the simplest thing is to add two more columns to the table giving minimum length when length units is bytes, and maximum length when length units is bytes. That will be clearer than whatever prose we try to come up with to explain this.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy



On Tue, Jul 8, 2014 at 10:55 AM, Steve Hanson <smh@uk.ibm.com> wrote:
Section 9 of the specification talks about the empty and related representations, where the meanings of a length of zero are defined.  However, section 12.3.7.2 has sub-sections for certain binary reps where it is illegal to have a length of zero.  Is this deliberate?  The cases where 0 is allowed are also those which can have lengthKind 'delimited', so that might be the reason. Note that disallowing length of zero prevents use of a default value.

12.3.7.2.1: Binary numbers. Table 22 has minimum length in bits => 0 not allowed.

12.3.7.2.2: Floating point numbers. Says length must be exactly 4 bytes or 8 bytes => 0 not allowed.

12.3.7.2.3: Packed numbers. No minimum length stated => 0 allowed.

12.3.7.2.4: Binary booleans. Refers to 12.3.7.2.1 => 0 not allowed.

12.3.7.2.5: Binary calendars. Says length must be exactly 4 bytes or 8 bytes => 0 not allowed.

12.3.7.2.6: Packed calendars. No minimum length stated => 0 allowed.

12.3.7.2.7: Opaque binary. No minimum length stated => 0 allowed.


Noticed this when modelling an example of a header with user-defined information at the end with the length given by a binary count.


    <xs:complexType name="Type_Header">
      <xs:sequence>
        <xs:element dfdl:length="75" name="Filler" type="xs:hexBinary"/>
        <xs:element dfdl:length="2" name="HeaderLength" type="xs:nonNegativeInteger"/>
        <xs:element dfdl:length="30" name="Filler2" type="xs:hexBinary"/>
        <xs:element dfdl:length="{../HeaderLength - 108}" name="UserInfo" type="xs:hexBinary"
                    minOccurs="0" dfdl:occursCountKind="implicit"/>
      </xs:sequence>
    </xs:complexType>


This could equally be modelled using occursCountKind 'expression' but that seems unnecessary if the length is 0:


    <xs:complexType name="Type_Header">
      <xs:sequence>
        <xs:element dfdl:length="75" name="Filler" type="xs:hexBinary"/>
        <xs:element dfdl:length="2" name="HeaderLength" type="xs:nonNegativeInteger"/>
        <xs:element dfdl:length="30" name="Filler2" type="xs:hexBinary"/>
        <xs:element dfdl:length="{../HeaderLength - 108}" name="UserInfo" type="xs:hexBinary"
                    minOccurs="0" dfdl:occursCountKind="expression"
                    dfdl:occursCount="{if (../HeaderLength - 108 eq 0) then 0 else 1}"/>
      </xs:sequence>
    </xs:complexType>


Also, what do the minimum lengths in Table 22 mean when lengthUnits is bytes?


Regards
 
Steve Hanson
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


--
  dfdl-wg mailing list
 
dfdl-wg@ogf.org
 
https://www.ogf.org/mailman/listinfo/dfdl-wg

