Fw: Nulls and Defaults

I have made the changes discussed yesterday, plus the following:

- Added "useNilForDefault is specified" to the definition of "has default value specified" in 13.9.
- Changed null to nil, nullable to nillable, etc., except for LengthKind=NullTerminated. Agree?

I am concerned about one of the changes:

nilIndicatorPath (Path Expression)
Path to a logical Boolean field which indicates if this element is null.
For nullKind='nullIndicator', a path expression referencing another element, which must be of type Boolean, that indicates if this element is null.
On input, the element value is null if the provided value is true.
When null, on input the element is parsed as normal. If the element length is known then the value is skipped; otherwise the value must be scannable.
When null, on output the value is set based on the fillByte or padCharacter properties and the referenced value is set to true.
If non-null then the element is parsed or output normally and the referenced value is set to false.
Annotation: dfdl:element (all simple types)

By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also, unless we relax the expression rules, the indicator bit must be before the element.

Please review sections 13.8-13.10.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM  email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073  Fax: +44 (0)1962 816898

----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 -----

From: Alan Powell/UK/IBM
To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM>
Date: 01/04/2008 18:35
Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Steve, Mike

I have finally got around to finishing this off, and it turned out to be a lot more work than I expected as the default and nulls information was all in the wrong places.

Changes

13.8 Properties for Nullable Elements
Updated as requested.
nullKind=xpath changed to nullIndicator, as xpath is also used in nullValue so it was confusing.

13.9 Properties for Default Value Control
Moved from most of 17.1.1.1 and 17.2, so this is now the main description of defaults.

13.10 Nulls, Defaults, and Initiators
Moved from 14.2.1. Updated as requested.

17.1.1.1 Repeating and Variable-Occurrence Items and Default Values
Remainder of discussion of variable occurrences.

Outstanding issues

5) Is the list style syntax for dfdl:nullValues acceptable?
Yes because you can use
<dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property>
So what is the syntax? It has to include expressions.

7) Consistent use of nil versus null.
=> I'm wondering that we should standardise on nil to match xsd? (standardize on nil, not null).
Does everyone agree to this, as it is a significant change to the document?

9) nullIndicatorPath (Expression)
Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues.
On input, the element value is null if the provided value matches a value in nullValues.
When null, if the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output. Is this correct? Should it set element to Null?
When null, if the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?).
When null, if the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output.
If non-null then the element is parsed or output normally.
Annotation: dfdl:element (all simple types)

10) useNullValueForDefault (Boolean)
Ignored on input. Is this correct? Shouldn't it set null if element is required?
On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null.
Annotation: dfdl:element (all simple types)

Can you make sure you are happy with the changes.

[attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM]

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM  email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073  Fax: +44 (0)1962 816898

From: Alan Powell/UK/IBM
To: Steve Hanson/UK/IBM@IBMGB
Cc: Alan Powell/UK/IBM
Date: 07/02/2008 17:13
Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Steve

I have done most of this update. See below. Will complete in next rev.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM  email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073  Fax: +44 (0)1962 816898

Steve Hanson/UK/IBM
06/02/2008 09:26
To: Alan Powell/UK/IBM
cc:
Subject: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Hi Alan - nulls and defaults changes below.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 -----

"Mike Beckerle" <mbeckerle@OCO-INC.COM>
05/02/2008 21:17
To: Steve Hanson/UK/IBM@IBMGB
cc:
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Looks good.

From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Tuesday, February 05, 2008 12:22 PM
To: Mike Beckerle
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Hi Mike

Looks good, small corrections in blue. With those made we can send to Alan I think.
Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

"Mike Beckerle" <mbeckerle@OCO-INC.COM>
05/02/2008 14:57
To: Steve Hanson/UK/IBM@IBMGB
cc:
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Issues this raises

1) How can you represent empty string as a) a null value? b) a default value (not sure you can)?

1 Proposal: Input Defaulting for Empty Strings

This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead.

The following makes this corner case unambiguous:

Type string with minLength of zero and default value are incompatible. It is a schema definition error if a variable length string where zero length is valid also has a default value specified.

Not yet added. Wasn't sure where this should go as the Nulls, Default info is scattered around.

This eliminates complexities around the issue of "empty" content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value.

We also need the same for null values:

Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified.

Not yet added. Wasn't sure where this should go as the Nulls, Default info is scattered around.

2) Why are nullIndicatorPath and nullIndicatorIndex separate properties?
Convenience. So you can scope the nullIndicatorPath, and have local indices.

3) What does 'missing' mean when initiators are involved?
=> Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2
=> I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this?

Changes to this definition:

defaultValueInitiatorPolicy (Enum)
Valid values are 'required' or 'prohibited'.
Ignored unless dfdl:initiator is specified and is not "" (empty string).
Ignored unless the element declaration has a default attribute specified.
'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used.
'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow.
'prohibited' implies an ordered sequence. Use of defaultValueInitiatorPolicy='prohibited' in an initiated element of an unordered group is a schema definition error.
This property applies only on input. (On output, for a required element an initiator is always output regardless of the default value.)

Added

1.1.1.1 Initiators and Output

This table describes the output direction logic for an initiated element that is a required element. We assume here that dfdl:initiator is specified and not equal to the empty string.

Logical value                           | nullValueInitiatorPolicy | useNullValueForDefault | Initiator region contains | Content region contains
----------------------------------------+--------------------------+------------------------+---------------------------+------------------------
nil                                     | prohibited               | don't care             | nothing                   | representation of nil based on nullKind, nullValues, etc.
(as above)                              | required                 | (as above)             | initiator string          | (as above)
"" (empty string) (see note)            | don't care               | don't care             | initiator string          | empty string
a non-nil non-empty-string value        | don't care               | don't care             | initiator string          | the representation of the logical value
Not supplied (element is not nillable)  | don't care               | don't care             | initiator string          | the representation of the default value (no default value implies processing error)
Not supplied (nillable)                 | prohibited               | true                   | nothing                   | representation of nil based on nullKind, nullValues, etc.
(as above)                              | required                 | (as above)             | initiator string          | (as above)
(as above)                              | don't care               | false                  | initiator string          | the representation of the default value (no default value implies processing error)

Note: a logical value of "" (empty string) implies that the element type is xs:string and no default value can be specified.

Added, but had trouble with table format as couldn't copy/paste.

4) What controls null versus default for a missing element on output?
=> Extra property dfdl:useNullValueForDefault. See above.

5) Is the list style syntax for dfdl:nullValues acceptable?
Yes because you can use
<dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property>
Which avoids quoting hell. (There's still some issue of list-valued expressions.)

6) Error cases - need to enumerate these
=> Input. Required element missing and no default value. (processing error)
=> Output. Required element missing and no default value or null value. (processing error)
=> Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.)
=> ?

7) Consistent use of nil versus null.
=> I'm wondering that we should standardise on nil to match xsd? (standardize on nil, not null).

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 -----

Hi Mike

In preparation for our discussion on nulls and defaults tomorrow.....

First of all I'd like to restate what I see as the requirements:

Uncontentious core properties
xs:default
xs:fixed
dfdl:nullKind
dfdl:nullValues
dfdl:nullIndicatorPath
dfdl:nullIndicatorIndex

Assumptions
- 'Required' below is as defined in section 17.1.1.1.
- The term 'default value' below actually means 'xs:default or xs:fixed'
- Both default values and null values only apply to simple elements

Input
- If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element
- If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null

Output
- If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element
- If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element
- If an element is nillable and has an infoset value null, the value in the data stream will be the first of the list of null values

Issues this raises

1) How can you represent empty string as a) a null value? b) a default value (not sure you can)?
2) Why are nullIndicatorPath and nullIndicatorIndex separate properties?
3) What does 'missing' mean when initiators are involved?
=> Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2
=> I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this?
4) What controls null versus default for a missing element on output?
=> Extra property dfdl:useNullValueForDefault
5) Is the list style syntax for dfdl:nullValues acceptable?
6) Error cases - need to enumerate these
=> Input. Required element missing and no default value.
=> Output. Required element missing and no default value or null value.
=> Output. Element is null and is not nillable.
=> ?
7) Consistent use of nil versus null.
=> I'm wondering that we should standardise on nil to match xsd?
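
The Input and Output requirements restated above, together with the error cases under issue 6, can be sketched roughly as follows. This is only an illustration in Python, not proposed spec text: ElementDecl, parse_value, unparse_value, the NIL sentinel and ProcessingError are hypothetical names, and a Python None stands in for "missing from the data stream / infoset".

```python
from dataclasses import dataclass
from typing import Optional, Sequence

NIL = object()  # hypothetical sentinel for the special infoset value null/nil


class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


@dataclass
class ElementDecl:
    # Hypothetical, simplified view of a simple element declaration.
    name: str
    required: bool = True
    nillable: bool = False
    null_values: Sequence[str] = ()
    default: Optional[str] = None          # xs:default or xs:fixed
    use_null_value_for_default: bool = False


def parse_value(decl: ElementDecl, raw: Optional[str]):
    """Input direction. raw is None when the element is missing from the data."""
    if raw is None:
        if not decl.required:
            return None                    # optional and absent: nothing to add
        if decl.default is None:
            raise ProcessingError(f"{decl.name}: required, missing, no default")
        return decl.default
    if decl.nillable and raw in decl.null_values:
        return NIL
    return raw


def unparse_value(decl: ElementDecl, value=None):
    """Output direction. value is None when the element is missing from the infoset."""
    if value is None:
        if not decl.required:
            return None
        if decl.nillable and decl.use_null_value_for_default:
            value = NIL
        elif decl.default is not None:
            value = decl.default
        else:
            raise ProcessingError(f"{decl.name}: required, missing, no default or null")
    if value is NIL:
        if not decl.nillable or not decl.null_values:
            raise ProcessingError(f"{decl.name}: null value but element is not nillable")
        return decl.null_values[0]         # first of the list of null values
    return value
```

For example, with null_values=("", "null", "NULL"), parsing "NULL" gives the nil sentinel, and unparsing nil writes "" because that is the first entry in the list.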
Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Mike Beckerle <beckerle@us.ibm.com>
06/12/2007 13:50
To: Steve Hanson/UK/IBM@IBMGB
cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

I tend to trust your instincts about things Steve,

I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value, and whether or not this matches people's past experience we should support it if we're going to overload nillable at all.

To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea).

...mikeb

Steve Hanson <smh@uk.ibm.com>
12/06/2007 04:59 AM
To: Mike Beckerle/Worcester/IBM@IBMUS
cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now.

But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. For those features of XML Schema that we do support, the rules should be the same.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
05/12/2007 23:21
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

OGF DFDL WG minutes 2007-12-05 call

Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else? - was someone else on also)

We discussed Output issues in the DFDL expression language:

E.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item. We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here. The expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this.

Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.)

Nillable and Default:

We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data is to create a null value. Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal.
...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan priordan@us.ibm.com 508-599-7046

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
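
The "Initiators and Output" table quoted earlier in this thread can be read as a small decision function. The Python sketch below is one possible reading only: output_regions and its parameters are hypothetical names, None stands for "not supplied", and the rows where the flattened table leaves cells blank (policy 'required') are assumed to inherit the cells from the row above.

```python
from typing import Optional, Tuple

NIL = object()  # hypothetical sentinel for a nil logical value


class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


def output_regions(
    value,
    *,
    nillable: bool,
    null_value_initiator_policy: str,      # 'required' or 'prohibited'
    use_null_value_for_default: bool,
    initiator: str,
    nil_rep: str,                          # representation of nil (nullKind, nullValues, etc.)
    default_rep: Optional[str] = None,     # representation of the default value, if any
) -> Tuple[str, str]:
    """Return (initiator region, content region) for a required, initiated element."""

    def nil_regions() -> Tuple[str, str]:
        # 'prohibited': no initiator is written before the nil representation.
        if null_value_initiator_policy == "prohibited":
            return "", nil_rep
        return initiator, nil_rep          # assumed reading of the 'required' rows

    if value is NIL:
        return nil_regions()
    if value is not None:
        # Covers both "" (empty string) and any other non-nil value.
        return initiator, value
    # From here on the element was not supplied in the logical data.
    if nillable and use_null_value_for_default:
        return nil_regions()
    if default_rep is None:
        raise ProcessingError("required element not supplied and no default value")
    return initiator, default_rep
```

With initiator "X:" and nil representation "NULL", a nil value under policy 'prohibited' yields ("", "NULL"), while an empty string yields ("X:", "").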

Hi Alan

That's just one example of unparsing behaviour that impacts streaming. There's the target of length XPaths and occurs XPaths as well. I think this is something we need to discuss and ratify. We either have the principle that the content of the infoset wins and sets the target fields, or the target field wins and determines how the infoset is interpreted. IBM's MRM unparser follows the second of these, DFDL follows the first.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces@ogf.org
03/04/2008 12:59
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Fw: Nulls and Defaults

I have made the changes discussed yesterday, plus the following:

- Added "useNilForDefault is specified" to the definition of "has default value specified" in 13.9.
- Changed null to nil, nullable to nillable, etc., except for LengthKind=NullTerminated. Agree?

I am concerned about one of the changes:

nilIndicatorPath (Path Expression)
Path to a logical Boolean field which indicates if this element is null.
For nullKind='nullIndicator', a path expression referencing another element, which must be of type Boolean, that indicates if this element is null.
On input, the element value is null if the provided value is true.
When null, on input the element is parsed as normal. If the element length is known then the value is skipped; otherwise the value must be scannable.
When null, on output the value is set based on the fillByte or padCharacter properties and the referenced value is set to true.
If non-null then the element is parsed or output normally and the referenced value is set to false.
Annotation: dfdl:element (all simple types)

By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also, unless we relax the expression rules, the indicator bit must be before the element.
Please review sections 13.8-13.10 Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 ----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 ----- From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM> Date: 01/04/2008 18:35 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Steve, Mike I have finally got around to finishing this off and it turned out to be a lot more work than I expected as the default and nulls information as all in the wrong places. Changes 13.8 Properties for Nullable Elements Updated as requested. nullKind=xpath changed to nullIndicator as it was xpath is also used in nullValue so it was confusing. 13.9 Properties for Default Value Control Moved from most of 17.1.1.1 and 17.2 so is now the main description of defaults. 13.10 Nulls, Defaults, and Initiators Moved from 14.2.1 Updated as requested 17.1.1.1 Repeating and Variable-Occurrence Items and Default Values Remainder of discussion of variable occurrences. Outstanding issues 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name=?nullValues?>?? ?null? ?NULL?</dfdl:property> So what is the syntax and it has to include expressions. 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). Does everyone agree to this as it is a significant change to the document.? 9) nullIndicatorPath Expression Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues On input, the element value is null if the provided value matches in nullValues. When null, If the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output.. Is this correct??? 
Should it set element to Null? When null If the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?). When null If the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output. If non-null then the element is parsed or output normally. Annotation: dfdl:element (all simple types) 10) useNullValueForDefault Boolean Ignored on input. IS this correct. Shouldn't it set null if element is required? On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null. Annotation: dfdl:element (all simple types) Can you make sure you are happy with the changes. [attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM] Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB Cc: Alan Powell/UK/IBM Date: 07/02/2008 17:13 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Steve I have done most of this update. See below Will co,plete in next rev Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 Steve Hanson/UK/IBM 06/02/2008 09:26 To Alan Powell/UK/IBM cc Subject Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Alan - nulls and defaults changes below. 
Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 ----- "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 21:17 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Looks good. From: Steve Hanson [mailto:smh@uk.ibm.com] Sent: Tuesday, February 05, 2008 12:22 PM To: Mike Beckerle Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Mike Looks good, small corrections in blue. With those made we can send to Alan I think. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 14:57 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 1 Proposal: Input Defaulting for Empty Strings This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead. The following makes this corner case unambiguous: Type string with minLength of zero and default value are incompatible. It is a schema definition error if a variable length string where zero length is valid also has a default value specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around This eliminates complexities around the issue of ?empty? content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value. 
We also need the same for null values Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? Convenience. So you can scope the nullIndicatorPath, and have local indices. 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? Changes to this definition: defaultValueInitiatorPolicy Enum Valid values are 'required' or 'prohibited' Ignored unless dfdl:initiator is specified and is not "" (empty string). Ignored unless the element declaration has a default attribute specified. 'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used. 'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow. ?prohibited? implies an ordered sequence. Use of defaultValueInitiatorPolicy=?prohibited? in an initiated element of an unordered group is a schema definition error. This property applies only on input. (On output, for a required output an initiator is always output regardless of the default value.) Added 1.1.1.1 Initiators and Output This table describes the output direction logic for an initiated element that is a required element. 
We assume here that dfdl:initiator is specified and not equal to the empty string. Logical Value nullValueInitiatorPolicy useNullValueForDefault initiator region contains content region contains nil prohibited don't care nothing representation of nil based on nullKind, nullValues, etc. required initiator string "" (empty string) Note that this implies that the element type is xs:string and no default value can be specified. don't care initiator string empty string a non-nil non-empty-string value don't care initiator string The representation of the logical value Not supplied (element is not nillable) Don?t care Don?t care Initiator string The representation of the default value. (No default value implies processing error.) Not supplied (nillable) Prohibited True Nothing Representation of nil basd on nullKind, nullValues, etc. Required Initiator string Don?t care False Initiator String The representation of the default value. (No default value implies processing error.) Added but had trould with table format as couldn't copy/paste. 4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault See above. 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name=?nullValues?>?? ?null? ?NULL?</dfdl:property> Which avoids quoting hell. (there?s still some issue of list-valued expressions.) 6) Error cases - need to enumerate these => Input. Required element missing and no default value. (processing error) => Output. Required element missing and no default value or null value. (processing error) => Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.) => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). 
Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 ----- Hi Mike In preparation for our discussion on nulls and defaults tomorrow..... First of all I'd like to restate what I see as the requirements: Uncontentious core properties xs:default xs:fixed dfdl:nullKind dfdl:nullValues dfdl:nullIndicatorPath dfdl:nullIndicatorIndex Assumptions - 'Required' below is as defined in section 17.1.1.1. - The term 'default value' below actually means 'xs:default or xs:fixed' - Both default values and null values only apply to simple elements Input - If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element - If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null Output - If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element - If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element - If an element is nillable and has an infoset value null , the value in the data stream will be the first of the list of null values Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? 
4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault 5) Is the list style syntax for dfdl:nullValues acceptable? 6) Error cases - need to enumerate these => Input. Required element missing and no default value. => Output. Required element missing and no default value or null value. => Output. Element is null and is not nillable. => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> 06/12/2007 13:50 To Steve Hanson/UK/IBM@IBMGB cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call I tend to trust your instincts about things Steve, I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value and whether or not this matches people's past experience we should support it if we're going to overload nillable at all. To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea). ...mikeb Steve Hanson <smh@uk.ibm.com> 12/06/2007 04:59 AM To Mike Beckerle/Worcester/IBM@IBMUS cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now. But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. 
For those features of XML Schema that we do support, the rules should be the same. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> Sent by: dfdl-wg-bounces@ogf.org 05/12/2007 23:21 To dfdl-wg@ogf.org cc Subject [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call OGF DFDL WG minutes 2007-12-05 call Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else? - was someone else on also) We discussed Output issues in the DFDL expression language: E.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item. We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here. Expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this. Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them. (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.) Nillable and Default: We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data, is to create a null value.
Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal. ...mikeb Mike Beckerle STSM, Architect, Scalable Computing IBM Software Group Information Platform and Solutions Westborough, MA 01581 direct: voice and FAX 508-599-7148 assistant: Pam Riordan priordan@us.ibm.com 508-599-7046 -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg [attachment "ogf-dfdl-v1.0-Core-032.2.doc" deleted by Steve Hanson/UK/IBM] -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

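The input and output requirements restated in the thread above (defaulting a missing required element, matching dfdl:nullValues, and choosing null versus default on output via dfdl:useNullValueForDefault) can be sketched as follows. This is only an illustration under stated assumptions: the names ElementDecl, parse_value, unparse_value, NIL and MISSING are hypothetical and do not come from the DFDL draft, and the handling of optional elements is a guess, since the thread only discusses required ones.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class ProcessingError(Exception):
    """Stands for the processing errors enumerated under issue 6."""


class _Marker:
    """Small sentinel type so NIL and MISSING print readably."""

    def __init__(self, label: str) -> None:
        self.label = label

    def __repr__(self) -> str:
        return self.label


NIL = _Marker("NIL")          # the special infoset value null/nil
MISSING = _Marker("MISSING")  # element absent from the infoset


@dataclass(frozen=True)
class ElementDecl:
    """Hypothetical model of a simple element declaration."""
    required: bool = True
    nillable: bool = False
    default: Optional[str] = None       # stands for xs:default or xs:fixed
    null_values: Tuple[str, ...] = ()   # dfdl:nullValues
    use_null_value_for_default: bool = False


def parse_value(decl: ElementDecl, raw: Optional[str]) -> object:
    """Input direction; raw is None when the element is missing from the data."""
    if raw is None:
        if not decl.required:
            return MISSING  # assumption: optional elements are simply absent
        if decl.default is not None:
            return decl.default
        raise ProcessingError("required element missing and no default value")
    if decl.nillable and raw in decl.null_values:
        return NIL
    return raw


def unparse_value(decl: ElementDecl, value: object) -> Optional[str]:
    """Output direction; returns the text to write, or None to write nothing."""
    if value is MISSING:
        if not decl.required:
            return None
        if decl.nillable and decl.use_null_value_for_default:
            value = NIL
        elif decl.default is not None:
            value = decl.default
        else:
            raise ProcessingError(
                "required element missing and no default value or null value")
    if value is NIL:
        if not decl.nillable:
            raise ProcessingError("element is null and is not nillable")
        if not decl.null_values:
            raise ProcessingError("no null values to write")
        return decl.null_values[0]  # first of the list of null values
    return str(value)
```

For example, with ElementDecl(nillable=True, default="0", null_values=("", "null")), parse_value(decl, "null") yields NIL and unparse_value(decl, NIL) writes the first null value, the empty string.
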
I don't understand your two notions of infoset wins vs target field wins. It would be useful to discuss these cases. There are plenty of data formats that require buffering in order to deal with. I think we simply have to say that DFDL implementations may have to buffer data when formats require it. Note that DFDL implementations might buffer data unnecessarily, i.e., when other more clever implementations can figure out how NOT to buffer, but the converse is not true. There are formats where every DFDL implementation must do some buffering. _____ From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Steve Hanson Sent: Monday, April 07, 2008 11:41 AM To: Alan Powell Cc: dfdl-wg@ogf.org Subject: Re: [DFDL-WG] Fw: Nulls and Defaults Hi Alan That's just one example of unparsing behaviour that impacts streaming. There's the target of length XPaths and occurs XPaths as well. I think this is something we need to discuss and ratify. We either have the principle that the content of the infoset wins and sets the target fields, or the target field wins and determines how the infoset is interpreted. IBM's MRM unparser follows the second of these, DFDL follows the first. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Alan Powell/UK/IBM@IBMGB Sent by: dfdl-wg-bounces@ogf.org 03/04/2008 12:59 To dfdl-wg@ogf.org cc Subject [DFDL-WG] Fw: Nulls and Defaults I have made the changes discussed yesterday plus the following * added "useNilForDefault is specified" to the definition of "has default value specified" in 13.9 * Changed null to nil, nullable to nillable etc except for lengthKind=NullTerminated. Agree? I am concerned about one of the changes nilIndicatorPath Path Expression Path to a logical Boolean field which indicates if this element is null.
For nullKind='nullIndicator', a path expression referencing another element that must be of type Boolean which indicates if this element is null. On input, the element value is null if the provided value is true. When null, on input the element is parsed as normal. If the element length is known then the value is skipped otherwise the value must be scannable. When null, on output the value is set based on fillByte or padCharacter properties and the referenced value set to true. If non-null then the element is parsed or output normally and the referenced value set to false. Annotation: dfdl:element (all simple types) By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also unless we relax the expression rules the indicator bit must be before the element. Please review sections 13.8-13.10 Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 ----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 ----- From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM> Date: 01/04/2008 18:35 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) _____ Steve, Mike I have finally got around to finishing this off and it turned out to be a lot more work than I expected as the default and nulls information was all in the wrong places. Changes 13.8 Properties for Nullable Elements Updated as requested. nullKind=xpath changed to nullIndicator as xpath is also used in nullValue, so it was confusing. 13.9 Properties for Default Value Control Moved from most of 17.1.1.1 and 17.2 so is now the main description of defaults.
13.10 Nulls, Defaults, and Initiators Moved from 14.2.1 Updated as requested 17.1.1.1 Repeating and Variable-Occurrence Items and Default Values Remainder of discussion of variable occurrences. Outstanding issues 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> So what is the syntax and it has to include expressions. 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). Does everyone agree to this as it is a significant change to the document.? 9) nullIndicatorPath Expression Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues On input, the element value is null if the provided value matches in nullValues. When null, If the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output.. Is this correct??? Should it set element to Null? When null If the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?). When null If the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output. If non-null then the element is parsed or output normally. Annotation: dfdl:element (all simple types) 10) useNullValueForDefault Boolean Ignored on input. IS this correct. Shouldn't it set null if element is required? On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null. Annotation: dfdl:element (all simple types) Can you make sure you are happy with the changes. 
[attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM] Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB Cc: Alan Powell/UK/IBM Date: 07/02/2008 17:13 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) _____ Steve I have done most of this update. See below Will complete in next rev Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 Steve Hanson/UK/IBM 06/02/2008 09:26 To Alan Powell/UK/IBM cc Subject Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Alan - nulls and defaults changes below. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 ----- "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 21:17 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Looks good. From: Steve Hanson [mailto:smh@uk.ibm.com] Sent: Tuesday, February 05, 2008 12:22 PM To: Mike Beckerle Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Mike Looks good, small corrections in blue. With those made we can send to Alan I think. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 14:57 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)?
1 Proposal: Input Defaulting for Empty Strings This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead. The following makes this corner case unambiguous: * Type string with minLength of zero and default value are incompatible. It is a schema definition error if a variable length string where zero length is valid also has a default value specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around This eliminates complexities around the issue of "empty" content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value. We also need the same for null values * Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? Convenience. So you can scope the nullIndicatorPath, and have local indices. 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? Changes to this definition: defaultValueInitiatorPolicy Enum Valid values are 'required' or 'prohibited' Ignored unless dfdl:initiator is specified and is not "" (empty string). 
Ignored unless the element declaration has a default attribute specified. 'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used. 'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow. 'prohibited' implies an ordered sequence. Use of defaultValueInitiatorPolicy='prohibited' in an initiated element of an unordered group is a schema definition error. This property applies only on input. (On output, for a required output an initiator is always output regardless of the default value.) Added 1.1.1.1 Initiators and Output This table describes the output direction logic for an initiated element that is a required element. We assume here that dfdl:initiator is specified and not equal to the empty string. Logical Value nullValueInitiatorPolicy useNullValueForDefault initiator region contains content region contains nil prohibited don't care nothing representation of nil based on nullKind, nullValues, etc. required initiator string "" (empty string) Note that this implies that the element type is xs:string and no default value can be specified. don't care initiator string empty string a non-nil non-empty-string value don't care initiator string The representation of the logical value Not supplied (element is not nillable) Don't care Don't care Initiator string The representation of the default value. (No default value implies processing error.) Not supplied (nillable) Prohibited True Nothing Representation of nil based on nullKind, nullValues, etc. Required Initiator string Don't care False Initiator String The representation of the default value. (No default value implies processing error.) Added but had trouble with table format as couldn't copy/paste. 4) What controls null versus default for a missing element on output?
=> Extra property dfdl:useNullValueForDefault See above. 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> Which avoids quoting hell. (there's still some issue of list-valued expressions.) 6) Error cases - need to enumerate these => Input. Required element missing and no default value. (processing error) => Output. Required element missing and no default value or null value. (processing error) => Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.) => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 ----- Hi Mike In preparation for our discussion on nulls and defaults tomorrow..... First of all I'd like to restate what I see as the requirements: Uncontentious core properties xs:default xs:fixed dfdl:nullKind dfdl:nullValues dfdl:nullIndicatorPath dfdl:nullIndicatorIndex Assumptions - 'Required' below is as defined in section 17.1.1.1. 
- The term 'default value' below actually means 'xs:default or xs:fixed' - Both default values and null values only apply to simple elements Input - If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element - If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null Output - If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element - If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element - If an element is nillable and has an infoset value null , the value in the data stream will be the first of the list of null values Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? 4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault 5) Is the list style syntax for dfdl:nullValues acceptable? 6) Error cases - need to enumerate these => Input. Required element missing and no default value. => Output. Required element missing and no default value or null value. => Output. Element is null and is not nillable. => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? 
Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> 06/12/2007 13:50 To Steve Hanson/UK/IBM@IBMGB cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call I tend to trust your instincts about things Steve, I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value and whether or not this matches people's past experience we should support it if we're going to overload nillable at all. To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea). ...mikeb Steve Hanson <smh@uk.ibm.com> 12/06/2007 04:59 AM To Mike Beckerle/Worcester/IBM@IBMUS cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now. But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. For those features of XML Schema that we do support, the rules should be the same. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> Sent by: dfdl-wg-bounces@ogf.org 05/12/2007 23:21 To dfdl-wg@ogf.org cc Subject [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call OGF DFDL WG minutes 2007-12-05 call Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else? 
- was someone else on also) We discussed Output issues in the DFDL expression language: E.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item. We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here. Expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this. Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them. (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.) Nillable and Default: We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data, is to create a null value. Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal.
...mikeb Mike Beckerle STSM, Architect, Scalable Computing IBM Software Group Information Platform and Solutions Westborough, MA 01581 direct: voice and FAX 508-599-7148 assistant: Pam Riordan priordan@us.ibm.com 508-599-7046 -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg [attachment "ogf-dfdl-v1.0-Core-032.2.doc" deleted by Steve Hanson/UK/IBM] -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg _____ Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
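
The schema definition errors proposed in the message above (a zero-length-capable string with a default value, a zero-length-capable nillable string with the empty string among its dfdl:nullValues, and defaultValueInitiatorPolicy='prohibited' inside an unordered group) could be checked along these lines. This is a rough sketch, not the draft's wording: StringElementDecl, its fields and schema_definition_errors are hypothetical names, and min_length == 0 is used as a stand-in for "a variable length string where zero length is valid".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StringElementDecl:
    """Hypothetical model of an xs:string element declaration."""
    name: str
    min_length: int = 0
    default: Optional[str] = None
    nillable: bool = False
    null_values: Tuple[str, ...] = ()                     # dfdl:nullValues
    initiator: str = ""                                   # dfdl:initiator
    default_value_initiator_policy: Optional[str] = None  # 'required' or 'prohibited'
    in_unordered_group: bool = False


def schema_definition_errors(decl: StringElementDecl) -> List[str]:
    """Return one message per proposed schema definition error that applies."""
    errors: List[str] = []

    # Empty content always triggers the default, so the empty string
    # cannot also be a legal value of the element.
    if decl.min_length == 0 and decl.default is not None:
        errors.append(f"{decl.name}: minLength 0 is incompatible with a default value")

    # The same reasoning applied to null values.
    if decl.min_length == 0 and decl.nillable and "" in decl.null_values:
        errors.append(f"{decl.name}: minLength 0 is incompatible with an empty-string null value")

    # The policy is ignored unless there is a non-empty initiator and a default.
    policy_applies = decl.initiator != "" and decl.default is not None
    if (policy_applies
            and decl.default_value_initiator_policy == "prohibited"
            and decl.in_unordered_group):
        errors.append(f"{decl.name}: 'prohibited' policy is not allowed in an unordered group")

    return errors
```

Returning a list rather than raising on the first problem is a design choice of this sketch only; it lets a schema author see every conflict in one pass.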

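The two unparsing principles debated in the thread above ("infoset wins" versus "target field wins") can be illustrated with a small sketch. The function names are hypothetical, the 'count'/'data' layout is only an illustrative case of an occurs target, and the error raised when 'count' exceeds the number of available items is an assumption, since the thread does not say what should happen there.

```python
from typing import List, Tuple


def unparse_infoset_wins(count: int, data: List[str]) -> Tuple[int, List[str]]:
    """Principle a): the infoset content decides what is written.

    The supplied 'count' is overridden by the actual number of 'data' items,
    so 'count' cannot be written until every item is known.
    """
    return len(data), list(data)


def unparse_target_field_wins(count: int, data: List[str]) -> Tuple[int, List[str]]:
    """Principle b): the referenced field decides what is written.

    Only the first 'count' items of 'data' are written.
    """
    if count > len(data):
        # Assumption: asking for more occurrences than exist is an error.
        raise ValueError("'count' exceeds the number of 'data' items in the infoset")
    return count, list(data[:count])


if __name__ == "__main__":
    items = [f"item{i}" for i in range(10)]
    print(unparse_infoset_wins(8, items)[0])       # count written under a)
    print(unparse_target_field_wins(8, items)[0])  # count written under b)
```

Under a) the writer has to hold back 'count' until all of 'data' has been seen, which is the buffering concern raised for streaming unparsers; under b) 'count' can be written immediately.
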
Hi Mike

I'll expand the sentence. We either have the principle that
a) the content of the infoset determines what is written, and the referenced field is set according to the infoset, or
b) the referenced field determines what is written, and the content of the infoset is adjusted if necessary.

Example: I have a 'data' field which is an array and a preceding 'count' field. The infoset items for 'data' number 10. The infoset item for 'count' has value 8.
a) 10 occurrences of 'data' written, value of 'count' written as 10
b) 8 occurrences of 'data' written, value of 'count' written as 8

Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle.dfdl@gmail.com> 09/04/2008 01:48 Please respond to <mbeckerle.dfdl@gmail.com> To Steve Hanson/UK/IBM@IBMGB, Alan Powell/UK/IBM@IBMGB cc <dfdl-wg@ogf.org> Subject RE: [DFDL-WG] Fw: Nulls and Defaults I don't understand your two notions of infoset wins vs target field wins. It would be useful to discuss these cases. There are plenty of data formats that require buffering in order to deal with. I think we simply have to say that DFDL implementations may have to buffer data when formats require it. Note that DFDL implementations might buffer data unnecessarily, i.e., when other more clever implementations can figure out how NOT to buffer, but the converse is not true. There are formats where every DFDL implementation must do some buffering. From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Steve Hanson Sent: Monday, April 07, 2008 11:41 AM To: Alan Powell Cc: dfdl-wg@ogf.org Subject: Re: [DFDL-WG] Fw: Nulls and Defaults Hi Alan That's just one example of unparsing behaviour that impacts streaming. There's the target of length XPaths and occurs XPaths as well. I think this is something we need to discuss and ratify.
We either have the principle that the content of the infoset wins and sets the target fields, or the target field wins and determines how the infoset is interpreted. IBM's MRM unparser follows the second of these, DFDL follows the first. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Alan Powell/UK/IBM@IBMGB Sent by: dfdl-wg-bounces@ogf.org 03/04/2008 12:59 To dfdl-wg@ogf.org cc Subject [DFDL-WG] Fw: Nulls and Defaults I have made the changes discussed yesterday plus the following added "useNilForDefault is specified" to the definition of "has default value specified" in 13.9 Changed null to nil, nullable to nillable etc except for lengthKind=NullTerminated. Agree? I am concerned about one of the changes nilIndicatorPath Path Expression Path to a logical Boolean field which indicates if this element is null. For nullKind='nullIndicator', a path expression referencing another element that must be of type Boolean which indicates if this element is null. On input, the element value is null if the provided value is true. When null, on input the element is parsed as normal. If the element length is known then the value is skipped otherwise the value must be scannable. When null, on output the value is set based on fillByte or padCharacter properties and the referenced value set to true. If non-null then the element is parsed or output normally and the referenced value set to false. Annotation: dfdl:element (all simple types) By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also unless we relax the expression rules the indicator bit must be before the element.
Please review sections 13.8-13.10 Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 ----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 ----- From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM> Date: 01/04/2008 18:35 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Steve, Mike I have finally got around to finishing this off and it turned out to be a lot more work than I expected as the default and nulls information was all in the wrong places. Changes 13.8 Properties for Nullable Elements Updated as requested. nullKind=xpath changed to nullIndicator as xpath is also used in nullValue, so it was confusing. 13.9 Properties for Default Value Control Moved from most of 17.1.1.1 and 17.2 so is now the main description of defaults. 13.10 Nulls, Defaults, and Initiators Moved from 14.2.1 Updated as requested 17.1.1.1 Repeating and Variable-Occurrence Items and Default Values Remainder of discussion of variable occurrences. Outstanding issues 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> So what is the syntax and it has to include expressions. 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). Does everyone agree to this as it is a significant change to the document.? 9) nullIndicatorPath Expression Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues On input, the element value is null if the provided value matches in nullValues. When null, If the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output.. Is this correct???
Should it set element to Null? When null If the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?). When null If the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output. If non-null then the element is parsed or output normally. Annotation: dfdl:element (all simple types) 10) useNullValueForDefault Boolean Ignored on input. IS this correct. Shouldn't it set null if element is required? On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null. Annotation: dfdl:element (all simple types) Can you make sure you are happy with the changes. [attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM] Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB Cc: Alan Powell/UK/IBM Date: 07/02/2008 17:13 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Steve I have done most of this update. See below Will complete in next rev Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 Steve Hanson/UK/IBM 06/02/2008 09:26 To Alan Powell/UK/IBM cc Subject Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Alan - nulls and defaults changes below.
Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 ----- "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 21:17 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Looks good. From: Steve Hanson [mailto:smh@uk.ibm.com] Sent: Tuesday, February 05, 2008 12:22 PM To: Mike Beckerle Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Mike Looks good, small corrections in blue. With those made we can send to Alan I think. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 14:57 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 1 Proposal: Input Defaulting for Empty Strings This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead. The following makes this corner case unambiguous: Type string with minLength of zero and default value are incompatible. It is a schema definition error if a variable length string where zero length is valid also has a default value specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around This eliminates complexities around the issue of "empty" content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value.
We also need the same for null values Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? Convenience. So you can scope the nullIndicatorPath, and have local indices. 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? Changes to this definition: defaultValueInitiatorPolicy Enum Valid values are 'required' or 'prohibited' Ignored unless dfdl:initiator is specified and is not "" (empty string). Ignored unless the element declaration has a default attribute specified. 'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used. 'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow. 'prohibited' implies an ordered sequence. Use of defaultValueInitiatorPolicy='prohibited' in an initiated element of an unordered group is a schema definition error. This property applies only on input. (On output, for a required output an initiator is always output regardless of the default value.) Added 1.1.1.1 Initiators and Output This table describes the output direction logic for an initiated element that is a required element.
We assume here that dfdl:initiator is specified and not equal to the empty string. Logical Value nullValueInitiatorPolicy useNullValueForDefault initiator region contains content region contains nil prohibited don't care nothing representation of nil based on nullKind, nullValues, etc. required initiator string "" (empty string) Note that this implies that the element type is xs:string and no default value can be specified. don't care initiator string empty string a non-nil non-empty-string value don't care initiator string The representation of the logical value Not supplied (element is not nillable) Don't care Don't care Initiator string The representation of the default value. (No default value implies processing error.) Not supplied (nillable) Prohibited True Nothing Representation of nil based on nullKind, nullValues, etc. Required Initiator string Don't care False Initiator String The representation of the default value. (No default value implies processing error.) Added but had trouble with table format as couldn't copy/paste. 4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault See above. 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> Which avoids quoting hell. (there's still some issue of list-valued expressions.) 6) Error cases - need to enumerate these => Input. Required element missing and no default value. (processing error) => Output. Required element missing and no default value or null value. (processing error) => Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.) => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null).
Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 ----- Hi Mike In preparation for our discussion on nulls and defaults tomorrow..... First of all I'd like to restate what I see as the requirements: Uncontentious core properties xs:default xs:fixed dfdl:nullKind dfdl:nullValues dfdl:nullIndicatorPath dfdl:nullIndicatorIndex Assumptions - 'Required' below is as defined in section 17.1.1.1. - The term 'default value' below actually means 'xs:default or xs:fixed' - Both default values and null values only apply to simple elements Input - If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element - If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null Output - If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element - If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element - If an element is nillable and has an infoset value null , the value in the data stream will be the first of the list of null values Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? 
4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault 5) Is the list style syntax for dfdl:nullValues acceptable? 6) Error cases - need to enumerate these => Input. Required element missing and no default value. => Output. Required element missing and no default value or null value. => Output. Element is null and is not nillable. => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> 06/12/2007 13:50 To Steve Hanson/UK/IBM@IBMGB cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call I tend to trust your instincts about things Steve, I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value and whether or not this matches people's past experience we should support it if we're going to overload nillable at all. To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea). ...mikeb Steve Hanson <smh@uk.ibm.com> 12/06/2007 04:59 AM To Mike Beckerle/Worcester/IBM@IBMUS cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now. But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. 
For those features of XML Schema that we do support, the rules should be the same. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> Sent by: dfdl-wg-bounces@ogf.org 05/12/2007 23:21 To dfdl-wg@ogf.org cc Subject [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call OGF DFDL WG minutes 2007-12-05 call Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else? - was someone else on also) We discussed Output issues in the DFDL expression language: E.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item. We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here. Expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this. Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them. (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.) Nillable and Default: We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data, is to create a null value.
Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal. ...mikeb Mike Beckerle STSM, Architect, Scalable Computing IBM Software Group Information Platform and Solutions Westborough, MA 01581 direct: voice and FAX 508-599-7148 assistant: Pam Riordan priordan@us.ibm.com 508-599-7046 -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg [attachment "ogf-dfdl-v1.0-Core-032.2.doc" deleted by Steve Hanson/UK/IBM] Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
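The output-side rules discussed in this thread (item 4 on useNullValueForDefault and the error cases under item 6) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ElementDecl`, `resolve_output_value`, `NIL` and `ProcessingError` are hypothetical names invented here, not part of any DFDL draft or implementation, and the ordering of the checks is one plausible reading of the thread.

```python
from dataclasses import dataclass
from typing import Optional

NIL = object()  # stands in for the infoset's special nil value (assumption)


class ProcessingError(Exception):
    """Raised for the output error cases listed under item 6."""


@dataclass
class ElementDecl:
    # Hypothetical, simplified view of a required simple element declaration.
    nillable: bool = False
    default: Optional[str] = None            # xs:default or xs:fixed
    use_null_value_for_default: bool = False


def resolve_output_value(decl, supplied, value=None):
    """Pick the logical value to write for a required simple element."""
    if supplied:
        # Output: element is nil and is not nillable -> processing error.
        if value is NIL and not decl.nillable:
            raise ProcessingError("element is nil but is not nillable")
        return value
    # The element is required but missing from the infoset.
    if decl.nillable and decl.use_null_value_for_default:
        return NIL
    if decl.default is not None:
        return decl.default
    # Output: required element missing and no default value or nil value.
    raise ProcessingError("required element missing, no default or nil value")
```

With this reading, a missing nillable element only becomes nil when useNullValueForDefault is true; otherwise the default value is used if one exists.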

Agreed. Nul termination actually is the ASCII NUL character, spelled NUL, not NULL. But I really don't care. NUL may just be a truncation of NULL to 3 characters since they wanted all 3-letter or less abbreviations like ESC, TAB, FF, LF, CR, etc. As for nilIndicatorPath, unfortunately, the key examples I've seen for this have the null bits after the data. Basically, null flag bits pack really nicely at the end of a record where they don't cause any alignment padding to be needed. I am concerned about this streaming/non-streaming restriction stuff. We're not considering the use cases properly. If I use DFDL and I describe a data buffer with it, I would like a DFDL library to give me random access to the data in the buffer, if the format allows it. That is a big "IF" though. We need some notion of finite fixed distance between fields. I.e., if two fields are at fixed finite distance, then you can do forward reference, otherwise not. This is really all about binary data formats. ...mikeb From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Alan Powell Sent: Thursday, April 03, 2008 8:00 AM To: dfdl-wg@ogf.org Subject: [DFDL-WG] Fw: Nulls and Defaults I have made the changes discussed yesterday plus the following * added "useNilForDefault is specified" to the definition of "has default value specified" in 13.9 * Changed null to nil, nullable to nillable etc except for LengthKind=NullTerminated. Agree? I am concerned about one of the changes nilIndicatorPath Path Expression Path to a logical Boolean field which indicates if this element is null. For nullKind='nullIndicator', a path expression referencing another element that must be of type Boolean which indicates if this element is null. On input, the element value is null if the provided value is true. When null, on input the element is parsed as normal. If the element length is known then the value is skipped otherwise the value must be scannable.
When null, on output the value is set based on fillByte or padCharacter properties and the referenced value set to true. If non-null then the element is parsed or output normally and the referenced value set to false. Annotation: dfdl:element (all simple types) By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also unless we relax the expression rules the indicator bit must be before the element. Please review sections 13.8-13.10 Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 ----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 ----- From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM> Date: 01/04/2008 18:35 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) _____ Steve, Mike I have finally got around to finishing this off and it turned out to be a lot more work than I expected as the default and nulls information as all in the wrong places. Changes 13.8 Properties for Nullable Elements Updated as requested. nullKind=xpath changed to nullIndicator as it was xpath is also used in nullValue so it was confusing. 13.9 Properties for Default Value Control Moved from most of 17.1.1.1 and 17.2 so is now the main description of defaults. 13.10 Nulls, Defaults, and Initiators Moved from 14.2.1 Updated as requested 17.1.1.1 Repeating and Variable-Occurrence Items and Default Values Remainder of discussion of variable occurrences. Outstanding issues 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> So what is the syntax and it has to include expressions. 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? 
(standardize on nil, not null). Does everyone agree to this as it is a significant change to the document.? 9) nullIndicatorPath Expression Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues On input, the element value is null if the provided value matches in nullValues. When null, If the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output.. Is this correct??? Should it set element to Null? When null If the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?). When null If the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output. If non-null then the element is parsed or output normally. Annotation: dfdl:element (all simple types) 10) useNullValueForDefault Boolean Ignored on input. IS this correct. Shouldn't it set null if element is required? On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null. Annotation: dfdl:element (all simple types) Can you make sure you are happy with the changes. [attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM] Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB Cc: Alan Powell/UK/IBM Date: 07/02/2008 17:13 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) _____ Steve I have done most of this update. 
See below Will complete in next rev Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 Steve Hanson/UK/IBM 06/02/2008 09:26 To Alan Powell/UK/IBM cc Subject Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Alan - nulls and defaults changes below. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 ----- "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 21:17 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Looks good. From: Steve Hanson [mailto:smh@uk.ibm.com] Sent: Tuesday, February 05, 2008 12:22 PM To: Mike Beckerle Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Hi Mike Looks good, small corrections in blue. With those made we can send to Alan I think. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle@OCO-INC.COM> 05/02/2008 14:57 To Steve Hanson/UK/IBM@IBMGB cc Subject RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 1 Proposal: Input Defaulting for Empty Strings This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead. The following makes this corner case unambiguous: * Type string with minLength of zero and default value are incompatible.
It is a schema definition error if a variable length string where zero length is valid also has a default value specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around This eliminates complexities around the issue of "empty" content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value. We also need the same for null values * Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified. Not yet added. Wasn't sure where this should go as the Nulls,Default info is scattered around 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? Convenience. So you can scope the nullIndicatorPath, and have local indices. 3) What does 'missing' mean when initiators are involved? => Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? Changes to this definition: defaultValueInitiatorPolicy Enum Valid values are 'required' or 'prohibited' Ignored unless dfdl:initiator is specified and is not "" (empty string). Ignored unless the element declaration has a default attribute specified. 'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used. 'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow. 'prohibited' implies an ordered sequence. 
Use of defaultValueInitiatorPolicy='prohibited' in an initiated element of an unordered group is a schema definition error. This property applies only on input. (On output, for a required output an initiator is always output regardless of the default value.) Added 1.1.1.1 Initiators and Output This table describes the output direction logic for an initiated element that is a required element. We assume here that dfdl:initiator is specified and not equal to the empty string. Logical Value nullValueInitiatorPolicy useNullValueForDefault initiator region contains content region contains nil prohibited don't care nothing representation of nil based on nullKind, nullValues, etc. required initiator string "" (empty string) Note that this implies that the element type is xs:string and no default value can be specified. don't care initiator string empty string a non-nil non-empty-string value don't care initiator string The representation of the logical value Not supplied (element is not nillable) Don't care Don't care Initiator string The representation of the default value. (No default value implies processing error.) Not supplied (nillable) Prohibited True Nothing Representation of nil based on nullKind, nullValues, etc. Required Initiator string Don't care False Initiator String The representation of the default value. (No default value implies processing error.) Added but had trouble with table format as couldn't copy/paste. 4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault See above. 5) Is the list style syntax for dfdl:nullValues acceptable? Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property> Which avoids quoting hell. (there's still some issue of list-valued expressions.) 6) Error cases - need to enumerate these => Input. Required element missing and no default value. (processing error) => Output. Required element missing and no default value or null value.
(processing error) => Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.) => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? (standardize on nil, not null). Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 ----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 ----- Hi Mike In preparation for our discussion on nulls and defaults tomorrow..... First of all I'd like to restate what I see as the requirements: Uncontentious core properties xs:default xs:fixed dfdl:nullKind dfdl:nullValues dfdl:nullIndicatorPath dfdl:nullIndicatorIndex Assumptions - 'Required' below is as defined in section 17.1.1.1. - The term 'default value' below actually means 'xs:default or xs:fixed' - Both default values and null values only apply to simple elements Input - If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element - If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null Output - If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element - If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element - If an element is nillable and has an infoset value null , the value in the data stream will be the first of the list of null values Issues this raises 1) How can you represent empty string as a) a null value? b) a default value (not sure you can)? 2) Why are nullIndicatorPath and nullIndicatorIndex separate properties? 3) What does 'missing' mean when initiators are involved? 
=> Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2 => I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this? 4) What controls null versus default for a missing element on output? => Extra property dfdl:useNullValueForDefault 5) Is the list style syntax for dfdl:nullValues acceptable? 6) Error cases - need to enumerate these => Input. Required element missing and no default value. => Output. Required element missing and no default value or null value. => Output. Element is null and is not nillable. => ? 7) Consistent use of nil versus null. => I'm wondering that we should standardise on nil to match xsd ? Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> 06/12/2007 13:50 To Steve Hanson/UK/IBM@IBMGB cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call I tend to trust your instincts about things Steve, I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value and whether or not this matches people's past experience we should support it if we're going to overload nillable at all. To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea). ...mikeb Steve Hanson <smh@uk.ibm.com> 12/06/2007 04:59 AM To Mike Beckerle/Worcester/IBM@IBMUS cc dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Subject Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now. 
But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. For those features of XML Schema that we do support, the rules should be the same. Regards, Steve Steve Hanson WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 Mike Beckerle <beckerle@us.ibm.com> Sent by: dfdl-wg-bounces@ogf.org 05/12/2007 23:21 To dfdl-wg@ogf.org cc Subject [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call OGF DFDL WG minutes 2007-12-05 call Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else? - was someone else on also) We discussed Output issues in the DFDL expression language: E.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item. We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here. Expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this. Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them. (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.)
Nillable and Default: We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data, is to create a null value. Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal. ...mikeb Mike Beckerle STSM, Architect, Scalable Computing IBM Software Group Information Platform and Solutions Westborough, MA 01581 direct: voice and FAX 508-599-7148 assistant: Pam Riordan priordan@us.ibm.com 508-599-7046 -- dfdl-wg mailing list dfdl-wg@ogf.org http://www.ogf.org/mailman/listinfo/dfdl-wg Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
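To make the "fixed finite distance" point in the reply above concrete, here is a minimal sketch of a record whose nil flag bits are packed after the data. The layout is purely hypothetical (two big-endian 32-bit integers followed by one flag byte, bit 0 for field a and bit 1 for field b), and the `fill` argument only stands in for fillByte-style filling; none of this is taken from the DFDL draft.

```python
import struct

# Hypothetical fixed-length record: fields a and b, then one trailing flag byte.
# Because every field sits at a fixed offset, the flags can be read before the
# data even though they follow it in the byte stream.
RECORD = struct.Struct(">iiB")


def parse_record(buf):
    """Parse one record, mapping flagged fields to None (nil)."""
    a, b, flags = RECORD.unpack_from(buf, 0)
    return {
        "a": None if flags & 0b01 else a,
        "b": None if flags & 0b10 else b,
    }


def unparse_record(rec, fill=0):
    """Write one record; nil fields are filled and their flag bit is set."""
    flags = (0b01 if rec["a"] is None else 0) | (0b10 if rec["b"] is None else 0)
    a = fill if rec["a"] is None else rec["a"]
    b = fill if rec["b"] is None else rec["b"]
    return RECORD.pack(a, b, flags)
```

Here the unparser can emit the data first and the flags last, so trailing indicators do not by themselves force buffering. The forward reference only becomes a problem when the distance between a field and its indicator is not fixed.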

Hi Alan It was agreed on the DFDL call to revisit the forward reference restriction, based on the two issues that have surfaced recently: - Nil indicator bits packed into trailing parts of structures - TX example of component rules on a parent being able to reference a child Regards, Steve Steve Hanson Programming Model Architect WebSphere Message Brokers Hursley, UK Internet: smh@uk.ibm.com Phone (+44)/(0) 1962-815848 "Mike Beckerle" <mbeckerle.dfdl@gmail.com> Sent by: dfdl-wg-bounces@ogf.org 09/04/2008 01:31 Please respond to mbeckerle.dfdl@gmail.com To Alan Powell/UK/IBM@IBMGB, <dfdl-wg@ogf.org> cc Subject Re: [DFDL-WG] Fw: Nulls and Defaults Agreed. Nul termination actually is the ASCII NUL character, spelled NUL, not NULL. But I really don't care. NUL may just be a truncation of NULL to 3 characters since they wanted all 3-letter or less abbreviations like ESC, TAB, FF, LF, CR, etc. As for nilIndicatorPath, unfortunately, the key examples I've seen for this have the null bits after the data. Basically, null flag bits pack really nicely at the end of a record where they don't cause any alignment padding to be needed. I am concerned about this streaming/non-streaming restriction stuff. We're not considering the use cases properly. If I use DFDL and I describe a data buffer with it, I would like a DFDL library to give me random access to the data in the buffer, if the format allows it. That is a big "IF" though. We need some notion of finite fixed distance between fields. I.e., if two fields are at fixed finite distance, then you can do forward reference, otherwise not. This is really all about binary data formats. ...mikeb From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Alan Powell Sent: Thursday, April 03, 2008 8:00 AM To: dfdl-wg@ogf.org Subject: [DFDL-WG] Fw: Nulls and Defaults I have made the changes discussed yesterday plus the following added "useNilForDefault is specified" to the definition of "has default value specified"
in 13.9 Changed null to nil, nullable to nillable etc except for LengthKind=NullTerminated. Agree? I am concerned about one of the changes nilIndicatorPath Path Expression Path to a logical Boolean field which indicates if this element is null. For nullKind='nullIndicator', a path expression referencing another element that must be of type Boolean which indicates if this element is null. On input, the element value is null if the provided value is true. When null, on input the element is parsed as normal. If the element length is known then the value is skipped otherwise the value must be scannable. When null, on output the value is set based on fillByte or padCharacter properties and the referenced value set to true. If non-null then the element is parsed or output normally and the referenced value set to false. Annotation: dfdl:element (all simple types) By setting the referenced nil indicator we have made it impossible/difficult to implement a streaming unparser. I'm not sure that is a good idea. Also unless we relax the expression rules the indicator bit must be before the element. Please review sections 13.8-13.10 Alan Powell MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898 ----- Forwarded by Alan Powell/UK/IBM on 03/04/2008 12:20 ----- From: Alan Powell/UK/IBM To: Steve Hanson/UK/IBM@IBMGB, "Mike Beckerle" <mbeckerle@OCO-INC.COM> Date: 01/04/2008 18:35 Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call) Steve, Mike I have finally got around to finishing this off and it turned out to be a lot more work than I expected as the default and nulls information as all in the wrong places. Changes 13.8 Properties for Nullable Elements Updated as requested. nullKind=xpath changed to nullIndicator as it was xpath is also used in nullValue so it was confusing.
13.9 Properties for Default Value Control
Moved from most of 17.1.1.1 and 17.2, so this is now the main description of defaults.

13.10 Nulls, Defaults, and Initiators
Moved from 14.2.1. Updated as requested.

17.1.1.1 Repeating and Variable-Occurrence Items and Default Values
Remainder of discussion of variable occurrences.

Outstanding issues

5) Is the list style syntax for dfdl:nullValues acceptable?
Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property>
So what is the syntax? It has to include expressions.

7) Consistent use of nil versus null.
=> I'm wondering whether we should standardise on nil to match xsd? (standardize on nil, not null).
Does everyone agree to this, as it is a significant change to the document?

9) nullIndicatorPath
Expression
Used when nullKind='nullIndicator'. A path expression referencing another element that provides the logical value to compare with nullValues.
On input, the element value is null if the provided value matches a value in nullValues.
When null, if the element is fixed length then it will be skipped on input, filled with (TBD: fillbyte?) on output. Is this correct??? Should it set element to Null?
When null, if the element is variable length with minimum length > 0, then a minimum length item will be skipped over, or on output filled (TBD with fillbyte?).
When null, if the element is variable length with minimum length 0, then a length zero object is expected on input, and a length 0 object will be generated on output.
If non-null then the element is parsed or output normally.
Annotation: dfdl:element (all simple types)

10) useNullValueForDefault
Boolean
Ignored on input. Is this correct? Shouldn't it set null if element is required?
On output, if an element is not in the logical model, but it is required, the element is nillable, and has dfdl:useNullValueForDefault="true", then the logical value is defaulted to null.
Annotation: dfdl:element (all simple types)

Can you make sure you are happy with the changes.

[attachment "ogf-dfdl-v1.0-Core-032.doc" deleted by Alan Powell/UK/IBM]

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM
email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898

From: Alan Powell/UK/IBM
To: Steve Hanson/UK/IBM@IBMGB
Cc: Alan Powell/UK/IBM
Date: 07/02/2008 17:13
Subject: Re: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Steve

I have done most of this update. See below. Will complete in next rev.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM
email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898

Steve Hanson/UK/IBM
06/02/2008 09:26
To: Alan Powell/UK/IBM
cc:
Subject: Fw: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Hi Alan - nulls and defaults changes below.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 06/02/2008 09:25 -----

"Mike Beckerle" <mbeckerle@OCO-INC.COM>
05/02/2008 21:17
To: Steve Hanson/UK/IBM@IBMGB
cc:
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Looks good.

From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Tuesday, February 05, 2008 12:22 PM
To: Mike Beckerle
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Hi Mike

Looks good, small corrections in blue. With those made we can send to Alan I think.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

"Mike Beckerle" <mbeckerle@OCO-INC.COM>
05/02/2008 14:57
To: Steve Hanson/UK/IBM@IBMGB
cc:
Subject: RE: Nulls and Defaults (was [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call)

Issues this raises

1) How can you represent empty string as
a) a null value?
b) a default value (not sure you can)?

1 Proposal: Input Defaulting for Empty Strings

This is a corner case for strings. If an element is of type string, and has a default value specified, it is not clear whether the empty string should be an allowed value or if the empty string, when found in the representation, should trigger use of the default value instead. The following makes this corner case unambiguous:

Type string with minLength of zero and default value are incompatible. It is a schema definition error if a variable length string where zero length is valid also has a default value specified.

Not yet added. Wasn't sure where this should go, as the Nulls, Default info is scattered around.

This eliminates complexities around the issue of "empty" content. Empty content always triggers use of the default value. If the type is string and empty string is a legal value then there cannot be a default value.

We also need the same for null values:

Type string with minLength of zero and nillable with empty string as one of the dfdl:nullValues are incompatible. It is a schema definition error if a variable length string where zero length is valid also is nillable and has a null value of empty string specified.

Not yet added. Wasn't sure where this should go, as the Nulls, Default info is scattered around.

2) Why are nullIndicatorPath and nullIndicatorIndex separate properties?
Convenience. So you can scope the nullIndicatorPath, and have local indices.

3) What does 'missing' mean when initiators are involved?
=> Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2
=> I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this?
Changes to this definition:

defaultValueInitiatorPolicy
Enum
Valid values are 'required' or 'prohibited'.
Ignored unless dfdl:initiator is specified and is not "" (empty string).
Ignored unless the element declaration has a default attribute specified.
'required' indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that a default value will be used.
'prohibited' indicates that empty content triggers the use of a default value, and the presence of an initiator implies that a non-default value representation must follow.
'prohibited' implies an ordered sequence. Use of defaultValueInitiatorPolicy='prohibited' in an initiated element of an unordered group is a schema definition error.
This property applies only on input. (On output, for a required element an initiator is always output regardless of the default value.)

Added

1.1.1.1 Initiators and Output

This table describes the output direction logic for an initiated element that is a required element. We assume here that dfdl:initiator is specified and not equal to the empty string.

Logical value                          | nullValueInitiatorPolicy | useNullValueForDefault | Initiator region contains | Content region contains
nil                                    | prohibited               | don't care             | nothing                   | representation of nil based on nullKind, nullValues, etc.
nil                                    | required                 | don't care             | initiator string          | representation of nil based on nullKind, nullValues, etc.
"" (empty string) [1]                  | don't care               | don't care             | initiator string          | empty string
a non-nil non-empty-string value       | don't care               | don't care             | initiator string          | the representation of the logical value
Not supplied (element is not nillable) | don't care               | don't care             | initiator string          | the representation of the default value [2]
Not supplied (nillable)                | prohibited               | true                   | nothing                   | representation of nil based on nullKind, nullValues, etc.
Not supplied (nillable)                | required                 | true                   | initiator string          | representation of nil based on nullKind, nullValues, etc.
Not supplied (nillable)                | don't care               | false                  | initiator string          | the representation of the default value [2]

[1] Note that this implies that the element type is xs:string and no default value can be specified.
[2] No default value implies processing error.
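The output-direction table above can also be read as a small decision function. The following is only a minimal Python sketch of that reading, not text from the draft: the function name, the NIL and NOT_SUPPLIED sentinels, and the idea of passing the nil and default representations in as ready-made strings are all assumptions made for illustration. The "is null and is not nillable" check comes from the error cases listed under issue 6.

```python
NIL = object()           # stand-in for the logical nil value (assumption)
NOT_SUPPLIED = object()  # stand-in for "element absent from the logical model"


def output_regions(value, *, initiator, nillable, null_value_initiator_policy,
                   use_null_value_for_default, nil_rep, default_rep=None):
    """Return (initiator_region, content_region) for a required, initiated element."""
    if null_value_initiator_policy not in ("required", "prohibited"):
        raise ValueError("policy must be 'required' or 'prohibited'")

    if value is NOT_SUPPLIED:
        if nillable and use_null_value_for_default:
            value = NIL  # "Not supplied (nillable)" rows with useNullValueForDefault true
        elif default_rep is None:
            raise ValueError("processing error: no default value")
        else:
            return initiator, default_rep  # default value rows

    if value is NIL:
        if not nillable:
            raise ValueError("processing error: element is null and is not nillable")
        if null_value_initiator_policy == "prohibited":
            return "", nil_rep  # initiator region contains nothing
        return initiator, nil_rep

    # Empty string and ordinary values: initiator followed by the value's representation.
    return initiator, value
```

In this sketch an ordinary value is assumed to be its own text representation; a real unparser would convert the logical value according to the element's other properties.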
Added, but had trouble with the table format as I couldn't copy/paste.

4) What controls null versus default for a missing element on output?
=> Extra property dfdl:useNullValueForDefault
See above.

5) Is the list style syntax for dfdl:nullValues acceptable?
Yes because you can use <dfdl:property name="nullValues">"" "null" "NULL"</dfdl:property>
Which avoids quoting hell. (There's still some issue of list-valued expressions.)

6) Error cases - need to enumerate these
=> Input. Required element missing and no default value. (processing error)
=> Output. Required element missing and no default value or null value. (processing error)
=> Output. Element is null and is not nillable. (processing error at least. It may be possible for some implementations to detect this error sooner.)
=> ?

7) Consistent use of nil versus null.
=> I'm wondering whether we should standardise on nil to match xsd? (standardize on nil, not null).

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 04/02/2008 16:06 -----

Hi Mike

In preparation for our discussion on nulls and defaults tomorrow..... First of all I'd like to restate what I see as the requirements:

Uncontentious core properties
xs:default
xs:fixed
dfdl:nullKind
dfdl:nullValues
dfdl:nullIndicatorPath
dfdl:nullIndicatorIndex

Assumptions
- 'Required' below is as defined in section 17.1.1.1.
- The term 'default value' below actually means 'xs:default or xs:fixed'
- Both default values and null values only apply to simple elements

Input
- If a required element is missing from the data stream and it has a default value, that will be used as the infoset value of the element
- If an element is nillable and has a value in the data stream which matches one of a list of null values, the infoset value of the element will be the special value null

Output
- If a required element is missing from the infoset and it has a default value, optionally that will be used as the infoset value of the element
- If a required element is missing from the infoset, optionally the special value null will be used as the infoset value of the element
- If an element is nillable and has an infoset value null, the value in the data stream will be the first of the list of null values

Issues this raises

1) How can you represent empty string as
a) a null value?
b) a default value (not sure you can)?

2) Why are nullIndicatorPath and nullIndicatorIndex separate properties?

3) What does 'missing' mean when initiators are involved?
=> Covered by extra properties dfdl:nullValueInitiatorPolicy & dfdl:defaultValueInitiatorPolicy, as given by tables in 14.2.1.1 and 14.2.1.2
=> I think the bottom row of the table in 14.2.1.2 is incorrect - in the infoset, empty string and missing element are two distinct cases - how do/did we resolve this?

4) What controls null versus default for a missing element on output?
=> Extra property dfdl:useNullValueForDefault

5) Is the list style syntax for dfdl:nullValues acceptable?

6) Error cases - need to enumerate these
=> Input. Required element missing and no default value.
=> Output. Required element missing and no default value or null value.
=> Output. Element is null and is not nillable.
=> ?

7) Consistent use of nil versus null.
=> I'm wondering whether we should standardise on nil to match xsd?
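The Input and Output requirements restated above, together with the error cases in issue 6, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a DFDL API: ElementDecl, parse_value, unparse_value and the NIL/MISSING sentinels are hypothetical names, values are treated as plain strings, and a nillable element is assumed to have at least one null value. The sketch also assumes that useNullValueForDefault, when true on a nillable element, wins over the default value, which is how issue 4 is answered above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

NIL = object()      # stand-in for the special infoset value null
MISSING = object()  # stand-in for "element absent"


@dataclass
class ElementDecl:
    """Hypothetical, simplified simple-element declaration."""
    required: bool = True
    nillable: bool = False
    null_values: Sequence[str] = ()
    default: Optional[str] = None  # xs:default or xs:fixed
    use_null_value_for_default: bool = False


def parse_value(decl, raw):
    """Input: data-stream text (or MISSING) -> infoset value."""
    if raw is MISSING:
        if not decl.required:
            return MISSING
        if decl.default is None:
            raise ValueError("processing error: required element missing, no default")
        return decl.default
    if decl.nillable and raw in decl.null_values:
        return NIL
    return raw


def unparse_value(decl, value):
    """Output: infoset value (or MISSING) -> data-stream text, or None if nothing is written."""
    if value is MISSING:
        if not decl.required:
            return None
        if decl.nillable and decl.use_null_value_for_default:
            value = NIL
        elif decl.default is not None:
            value = decl.default
        else:
            raise ValueError("processing error: required element missing, "
                             "no default value or null value")
    if value is NIL:
        if not decl.nillable:
            raise ValueError("processing error: element is null and is not nillable")
        return decl.null_values[0]  # first of the list of null values
    return value
```

For example, with null values "NULL" and "null" and a default of "0", parsing "NULL" gives the nil sentinel, and unparsing that sentinel writes "NULL" again.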
Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Mike Beckerle <beckerle@us.ibm.com>
06/12/2007 13:50
To: Steve Hanson/UK/IBM@IBMGB
cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

I tend to trust your instincts about things Steve,

I would summarize it as this: regardless of how people think nulls *should* work, in XSD nillables are orthogonal to value and whether or not this matches people's past experience we should support it if we're going to overload nillable at all.

To me this reasoning is pretty compelling, so I withdraw my suggestion (the "either nillable or default value but not both" idea).

...mikeb

Steve Hanson <smh@uk.ibm.com>
12/06/2007 04:59 AM
To: Mike Beckerle/Worcester/IBM@IBMUS
cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

Unfortunately I have been roped into something else which will likely occupy me full time until middle of next week, so I can't look at the defaults/nulls issue in detail right now.

But my first reaction to the proposal below is that elements should be allowed to have both null and default values. They are separate concepts in XML Schema, so why are we making the DFDL logical model different? IMHO subtle differences like this cause more issues with customers than the odd extra DFDL property. The DFDL subset of XML Schema should be just that - a subset. For those features of XML Schema that we do support, the rules should be the same.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Mike Beckerle <beckerle@us.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
05/12/2007 23:21
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] OGF DFDL WG minutes 2007-12-05 call

OGF DFDL WG minutes 2007-12-05 call

Suman Kalia, Simon Parker, Alan Powell, Mike Beckerle (who else?
- was someone else on also)

We discussed Output issues in the DFDL expression language: e.g., an outputValueCalc for a field in the header of a data stream may contain information that requires you to know the rep, or length of the rep, of the whole data item.

We concluded that this kind of thing can't be ruled out. Some formats just require buffering and are not streamable; however, implementations can vary on just how large a data item they're able to cope with here.

Expression language section will include a subsection highlighting this issue and that implementations can vary here. Alan will update his expression language proposal and include this.

Also suggested was a path length-from-to function that takes 2 path expressions and gives you the size of the representation between them (start of first, to last bit before start of 2nd). (I don't think we discussed a clear use case motivating this, but there may be one. We did discuss applications trying to fit data into limited size boxes, but the use case is not clear. Also note that all representation lengths are subject to change due to different starting alignments.)

Nillable and Default: We also discussed the interaction of nillable and having a default. The sense of the group on the call is that we can restrict these so that if something is nillable it cannot also have a default value, and that the behavior of DFDL on output for a required element that is nillable but not in the logical data is to create a null value. Everyone agreed that there is no need for a property useNullValueForDefault because this should always be the behavior. Mike will forward a proposal.
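The buffering point in the outputValueCalc discussion above is easy to see in a toy case. The following Python sketch assumes a made-up record layout (a 4-byte big-endian length header followed by an ASCII body); neither the layout nor the function name comes from the minutes. It only shows why a header value that depends on the length of the body's representation cannot be written before the body has been rendered.

```python
import struct


def unparse_record(fields):
    """Serialize a record whose header carries the byte length of its body."""
    # The body must be fully rendered (buffered) first, because the header
    # value depends on the length of the body's representation.
    body = b"".join(field.encode("ascii") for field in fields)
    header = struct.pack(">I", len(body))  # assumed layout: 4-byte big-endian length
    return header + body
```

A streaming unparser would have to emit the header before seeing the fields, which is not possible here without either buffering the body or seeking back to patch the header.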
...mikeb

Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Platform and Solutions
Westborough, MA 01581
direct: voice and FAX 508-599-7148
assistant: Pam Riordan priordan@us.ibm.com 508-599-7046

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
participants (3)
- Alan Powell
- Mike Beckerle
- Steve Hanson