Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, April-08-2009

Attendees

Suman Kalia (IBM)
Steve Hanson (IBM)
Mike Beckerle (Oco)

Apologies
Alan Powell (IBM)
Dave Glick (drac)


1. Escape Schemes
Alan has mailed the latest refinement.
Overall
Agreed with the scope of the escape scheme support, i.e. support three well-known variants and do not provide overly complex, open-ended support.
Annotation structure.
Why dfdl:defineEscapeScheme and dfdl:escapeScheme, instead of just dfdl:escapeScheme with an optional name attribute? For consistency with dfdl:defineFormat, dfdl:defineNumberFormat, etc., and because it makes clear that the named definition is scoped at the top level as a peer of dfdl:defineFormat, not nested inside one.
Annotation properties.
These need a careful review to make sure that they behave in the expected manner.
For example, should escape start/end bracketing be at the start/end of the field, or anywhere in the field?
Action raised to review in detail for next call.
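
To make the open question on bracketing concrete, here is a minimal Python sketch of the two readings: escape start/end recognised only when they wrap the whole field, or recognised anywhere in the field. The function name strip_escape_block and its anywhere parameter are hypothetical and are not DFDL properties; the sketch only contrasts the two behaviours and does not reflect a decision.

```python
def strip_escape_block(field, start, end, anywhere=False):
    """Remove escape-block brackets from `field`.

    anywhere=False: brackets count only when they wrap the whole field.
    anywhere=True:  every start...end pair inside the field is unwrapped.
    """
    if not anywhere:
        wraps = (
            len(field) >= len(start) + len(end)
            and field.startswith(start)
            and field.endswith(end)
        )
        return field[len(start):len(field) - len(end)] if wraps else field

    out = []
    pos = 0
    while True:
        s = field.find(start, pos)
        if s == -1:
            break
        e = field.find(end, s + len(start))
        if e == -1:
            break  # unmatched start bracket: leave the rest untouched
        out.append(field[pos:s])               # text before the block
        out.append(field[s + len(start):e])    # block content, brackets dropped
        pos = e + len(end)
    out.append(field[pos:])
    return "".join(out)
```

With start and end both set to a double quote, the field x "a,b" y is left unchanged under the first reading and becomes x a,b y under the second.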

2. Validation ranges
Need to decide whether to allow restricted use of xs:union to model this.
Agreed that this should be supported. For such a union:
- The member types must all be derived from the same schema simple type
- Any DFDL annotations on member types are a schema definition error
Will be added to draft 0.34.
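
The two rules above can be sketched in Python as follows. This is an illustration only: RangeType, check_union and is_valid are hypothetical names, member types are reduced to inclusive integer ranges, and nothing here is taken from the draft text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeType:
    """Hypothetical stand-in for a simple type restricted to an inclusive range."""
    base: str                      # schema simple type the member derives from
    min_inclusive: int
    max_inclusive: int
    dfdl_annotations: tuple = ()   # DFDL annotations carried by this member type


def check_union(members):
    """Apply the two rules agreed for a union that models validation ranges."""
    if not members:
        raise ValueError("schema definition error: union has no member types")
    if len({m.base for m in members}) != 1:
        raise ValueError("schema definition error: member types differ in base type")
    if any(m.dfdl_annotations for m in members):
        raise ValueError("schema definition error: DFDL annotation on a member type")


def is_valid(value, members):
    """A value validates if it falls inside any member range of a legal union."""
    check_union(members)
    return any(m.min_inclusive <= value <= m.max_inclusive for m in members)
```

For example, a union of the ranges 1 to 10 and 100 to 199, both derived from xs:int, accepts 5 and 150 but rejects 50.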

3. Specialized annotations
Need to decide whether to drop specialized annotations altogether, or use the scheme below, which does not affect scoping in any way but which makes it clear what is allowed where.

        Specialized annotations on xsd objects only, dfdl:format on scoping constructs only
                dfdl:defineFormat            => dfdl:format
                xs:complexType               => dfdl:format
                xs:sequence                  => dfdl:sequence
                xs:choice                    => dfdl:choice
                xs:group ref                 => dfdl:group
                xs:element or xs:element ref => dfdl:element
                xs:any                       => dfdl:any
                xs:simpleType                => dfdl:simpleType

        dfdl:format is exactly as specified in draft 0.33; its properties apply to all relevant objects
        Scoping rules as specified in draft 0.33

Agreed that this scheme provided the best balance between simplicity and validation capability.
Will be added to draft 0.34.

4. Exclusion lists.
XML Schema only allows an inclusion list of enumerations. Does DFDL need to support an exclusion list of enumerations? It would be nice if it did, but if we provided a DFDL property that said 'treat enums as exclusion instead of inclusion', removing the DFDL annotations would change the validation semantics. Agreed that exclusion is something DFDL would inherit from XML Schema, if and when it gets added there.

5. Consuming extraneous data that occurs at the end of the stream
This is where the DFDL model matches input data ok, except that there is some extra data in the stream. This can be explicitly modelled, using a hidden optional element. Agreed that whether such a hidden optional element is needed, or whether the data is simply ignored, is up to individual DFDL implementations. The spec will not take a position.

6. 'Floating' definitions
A known element whose position can be anywhere in a sequence of other elements: is this something DFDL needs to support? The capability is offered by IBM's WTX product. It can be used for comments, but DFDL plans to handle comments post 1.0, using an explicit mechanism or layering. The real purpose of a floating component is for older EDI formats, where there is a segment that can appear anywhere and any number of times. Action raised for IBM to provide a concrete example for discussion. The issue for DFDL is how a floating component appears in the DFDL infoset, and how it validates in the sequence. One possibility is a property dfdl:floating=yes/no: if an element has that property set, it can be expected anywhere when parsing, but appears at the correct point in the sequence in the parsed infoset. On unparsing it must appear at the correct point in the sequence, and is output in that place.
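
The dfdl:floating possibility can be sketched in Python as below. This is only a sketch under simplifying assumptions that the minutes do not state: the data is already split into (name, value) items, every non-floating element occurs exactly once, and a floating element may occur zero or more times. The function name parse_with_floating is hypothetical.

```python
def parse_with_floating(sequence, floating, items):
    """Place floating elements at their declared position in the infoset.

    sequence: element names in schema order.
    floating: names from `sequence` treated as if dfdl:floating were "yes".
    items:    (name, value) pairs in the order they occur in the data.
    Returns the infoset as (name, value) pairs in schema order.
    """
    fixed = [name for name in sequence if name not in floating]
    found = {name: [] for name in sequence}
    next_fixed = 0
    for name, value in items:
        if name in floating:
            found[name].append(value)          # accepted anywhere in the data
        elif next_fixed < len(fixed) and name == fixed[next_fixed]:
            found[name].append(value)          # must follow schema order
            next_fixed += 1
        else:
            raise ValueError(f"unexpected element {name!r}")
    if next_fixed != len(fixed):
        raise ValueError(f"missing element {fixed[next_fixed]!r}")
    # Emit in schema order, so floating values land at their declared position.
    return [(name, value) for name in sequence for value in found[name]]
```

Given the sequence header, note, body, trailer with note floating, input in the order header, body, note, trailer, note yields an infoset in the order header, note, note, body, trailer. Unparsing would then write the elements in that infoset order.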

7. Recursive use of DFDL for variable markup
Use of a DFDL annotated element/type to describe an initiator, length prefix, terminator, separator, etc. Steve suggested that the most important use of a "variable markup"-like mechanism in IBM's WTX product is to reference a location earlier in the bit stream where a delimiter value is found. We handle this already by use of a path expression. The additional variable markup mechanism was intended to avoid a proliferation of keywords for various corner cases on initiator, terminator and separator. E.g., what if you want the initiator to be "Name" or "name" only, not "NAME", "nAmE", etc.? Case insensitivity alone is not expressive enough. This can always be modeled, just not as an initiator tag. The feeling was to leave out variable markup (other than for prefix lengths) for v1.0, and to propose the minimum set of extra properties needed to address the common use cases, but IBM needed to see whether this satisfied all WTX use cases.

(Post-call update: it doesn't. There is a use case from WTX, which Steve will mail out before the next call.)
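
The "Name" or "name" example can be illustrated with a short Python sketch that contrasts a single case-insensitive initiator with an explicit list of alternatives. The function match_initiator and its ignore_case parameter are hypothetical and are not proposed DFDL properties; a list of alternatives is just one conceivable way of covering this corner case.

```python
def match_initiator(data, initiators, ignore_case=False):
    """Return the length of the initiator found at the start of `data`, or None."""
    # Try longer candidates first so a short initiator cannot mask a longer one.
    for candidate in sorted(initiators, key=len, reverse=True):
        head = data[:len(candidate)]
        if head == candidate or (ignore_case and head.lower() == candidate.lower()):
            return len(candidate)
    return None
```

With the alternatives "Name" and "name", the data Name:Bob and name:Bob match while NAME:Bob does not. A single initiator "name" with ignore_case=True would accept NAME:Bob as well, which is the over-acceptance described above.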

Actions updated below.

Next call 15 April 14:00 UK

Meeting closed, 15:05

Actions raised at this meeting
No
Action
035
AP: Add validation ranges to spec, update specialized annotations in spec.
08/04: Raised. For draft 0.34
036
SH: Provide use case for floating component in a sequence
08/04: Raised

Current Actions:
No
Action
012
AP/SH: Update decimalCalendarScheme
10/9: Not allocated yet
17/9: No update
24/9: Add calendar binary formats to actions
22/10: No progress
16/1: proposal distributed and discussed. Will be redistributed
21/1: add locale,
04/02: changed from locale to specific properties
18/2: Need more investigation of ICU strict/lax behaviour.
08/04: Not discussed
020
SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
22/10: No progress
10/12: added how to decide to overpunch and sign position
11/02: proposal largely agreed. SH to make minor changes
18/02: AP to document unsigned type behaviour
25/02: no progress
08/04: Not discussed
023
MB: Review Schema 1.1
29/1: AP and SH to talk to Sandy Gao
04/02: Call arranged for Friday
11/02: Call took place. Identified useful changes. Consolidate with previous list.
04/03: decided to stay on Schema 1.0.
08/04: Not discussed
024
String XML type
08/04: Not discussed
025
Escape schemes
21/1: discussed requirements
04/02: AP/SH to describe behaviour for known length text fields. Need to discuss if comment escapes should be supported.
11/02: new draft distributed
18/02: SH to document concerns
25/02: SH and AP have refined proposal ready for approval.
04/03: SH and AP have further refined proposal.
11/03: discussed. suggested a simplified proposal be evaluated.
18/03: SH and AP had further discussions on simplified proposal
08/04: See minutes, review in detail for next call
026
SH: Envelopes and Payloads
08/04: Not discussed explicitly, but recursive use of DFDL is tied up with this
027
Property precedence tables
08/04: Not discussed
028
SH: Variable markup
08/04: Discussed briefly at end of call. IBM to see whether there are any use cases that require recursive use of DFDL.
029
valueCalc (output length calculation)
08/04: Not discussed
032
DG: Investigate compatibility between DFDL infoset and XDM
08/04: No update
033
AP/TK: Assert/Discriminator semantics. AP to document. TK to check uses of discriminator besides choice.
08/04: In progress within IBM
034
AP: Remove redundant properties, correct old examples
08/04: No update

Closed actions:
031
DG: Review dfdl v033
11/02:  Initial comments received
18/02: Will include work items 5 and 12.
11/03: complete

Work items:
No
Item
001
String XML type (Ian P) - Apr 30, 2008
002
Escape schemes (Ian P) - Apr 30, 2008
003
Variables - ??, 2008 (Mike)
005
Improvements on property descriptions - ??, 2008 (All - split TBD)
006
Envelopes and Payloads (Steve) - Apr 30, 2008
007
(from draft 32) valueCalc (Mike) - ??, 2008   mostly complete
008
(from draft 32) Property precedence for writing (Steve) - under review
009
(from draft 32) Variable markup (Steve) - Mar 31, 2008   proposal needs writing up
010
(from draft 32) Assertions, discriminators and choice, including discussion of timing option (Suman) - Mar 31, 2008 * in progress *
011
(from draft 32) How speculative parsing works (combining choice and variable-occurrence - currently these are separate) ??, 2008 (IBM)  in progress
012
(from draft 32) Reordering the properties discussion: move representation earlier, improve flow of topics ??, 2008 (Alan) * not started *
025
Augmented infoset and unparsing (Alan)   added but needs work
026
Remove duration


Regards

Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848




