Minutes: OGF DFDL Working Group Call, April-08-2009

Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, April-08-2009

Attendees
  Suman Kalia (IBM)
  Steve Hanson (IBM)
  Mike Beckerle (Oco)

Apologies
  Alan Powell (IBM)
  Dave Glick (drac)

1. Escape Schemes

Alan has mailed latest refinement.

Overall. Agreed with the scope of the escape scheme support, i.e., support three well-known variants, and not provide overly complex open-ended support.

Annotation structure. Why dfdl:defineEscapeScheme and dfdl:escapeScheme, instead of just dfdl:escapeScheme and an optional name attribute? For consistency with dfdl:defineFormat, dfdl:defineNumberFormat, etc., and it makes it clear that the top-level scope of the naming is a peer to dfdl:defineFormat, not inside them.

Annotation properties. These need a careful review to make sure that they behave in the expected manner. For example, should escape start/end bracketing be at the start/end of the field, or anywhere in the field? Action raised to review in detail for next call.

2. Validation ranges

Need to decide whether to allow restricted use of xs:union to model this. Agreed that this should be supported. For such a union:
- The member types must all be derived from the same schema simple type
- Any DFDL annotations on member types are a schema definition error
Will be added to draft 0.34.

3. Specialized annotations

Need to decide whether to drop specialized annotations altogether, or use the scheme below, which does not affect scoping in any way but which makes it clear what is allowed where.
Specialized annotations on xsd objects only, dfdl:format on scoping constructs only:

    dfdl:defineFormat            => dfdl:format
    xs:complexType               => dfdl:format
    xs:sequence                  => dfdl:sequence
    xs:choice                    => dfdl:choice
    xs:group ref                 => dfdl:group
    xs:element or xs:element ref => dfdl:element
    xs:any                       => dfdl:any
    xs:simpleType                => dfdl:simpleType

- dfdl:format is exactly as specified in draft 0.33; its properties apply to all relevant objects
- Scoping rules as specified in draft 0.33

Agreed that this scheme provided the best balance between simplicity and validation capability. Will be added to draft 0.34.

4. Exclusion lists

XML Schema only allows an inclusion list of enumerations. Does DFDL need to support an exclusion list of enumerations? It would be nice if it did, but if we provided a DFDL property that said 'treat enums as exclusion instead of inclusion', removal of the DFDL annotations would change the validation semantics. Agreed that exclusion provision is something that DFDL would inherit from XML Schema, when and if it gets added there.

5. Consuming extraneous data that occurs at the end of the stream

This is where the DFDL model matches the input data OK, except that there is some extra data in the stream. This can be explicitly modelled, using a hidden optional element. Agreed that whether such a hidden optional element is needed, or whether the data is simply ignored, is up to individual DFDL implementations. The spec will not take a position.

6. 'Floating' definitions

A known element, the position of which can be anywhere in a sequence of other elements: is this something DFDL needs to support? The capability is offered by IBM's WTX product. It can be used for comments, but DFDL plans to handle comments post 1.0 using an explicit mechanism or using layering. The real purpose of a floating component is for older EDI formats where there is a segment that can appear anywhere, and can appear any number of times. Action raised for IBM to provide a concrete example for discussion.
The issue for DFDL is how a floating component appears in the DFDL infoset, and how it validates in the sequence. One possibility is a property dfdl:floating=yes/no. If an element has that property set, it can be expected anywhere when parsing, but appears at the correct point in the sequence in the parsed infoset. On unparsing it must appear at the correct point in the sequence, and is output in that place.

7. Recursive use of DFDL for variable markup

Use of a DFDL annotated element/type to describe an initiator, length prefix, terminator, separator, etc.

Steve suggested the most important use of a "variable markup-like mechanism" in IBM's WTX product is to reference a location earlier in the bit stream where a delimiter value is found. We handle this already by use of a path expression.

The additional variable markup mechanism was to avoid proliferation of keywords for various corner cases on initiator, terminator and separator. E.g., what if you want the initiator to be "Name" or "name" only, not "NAME", "nAmE", etc.? So case insensitive is not expressive enough. This can always be modeled, just not as an initiator tag.

Feeling was to leave out variable markup (other than for prefix lengths) for v1.0, and to propose the minimum set of extra properties that can be used to address the common use cases, but that IBM needed to see whether this satisfied all WTX use cases. (Post-call update: it doesn't. There is a use case from WTX; Steve will mail this out before next call.)

Actions updated below.

Next call 15 April 14:00 UK

Meeting closed, 15:05

Actions raised at this meeting

No   Action
035  AP: Add validation ranges to spec, update specialized annotations in spec.
     08/04: Raised. For draft 0.34
036  SH: Provide use case for floating component in a sequence
     08/04: Raised

Current Actions:

No   Action
012  AP/SH: Update decimalCalendarScheme
     10/9: Not allocated yet
     17/9: No update
     24/9: Add calendar binary formats to actions
     22/10: No progress
     16/1: proposal distributed and discussed. Will be redistributed
     21/1: add locale
     04/02: changed from locale to specific properties
     18/2: Need more investigation of ICU strict/lax behaviour.
     08/04: Not discussed
020  SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
     22/10: No progress
     10/12: added how to decide to overpunch and sign position
     11/02: proposal largely agreed. SH to make minor changes
     18/02: AP to document unsigned type behaviour
     25/02: no progress
     08/04: Not discussed
023  MB: Review Schema 1.1
     29/1: AP and SH to talk to Sandy Gao
     04/02: Call arranged for Friday
     11/02: Call took place. Identified useful changes. Consolidate with previous list.
     04/03: decided to stay on Schema 1.0.
     08/04: Not discussed
024  String XML type
     08/04: Not discussed
025  Escape schemes
     21/1: discussed requirements
     04/02: AP/SH to describe behaviour for known length text fields. Need to discuss if comment escapes should be supported.
     11/02: new draft distributed
     18/02: SH to document concerns
     25/02: SH and AP have refined proposal ready for approval.
     04/03: SH and AP have further refined proposal.
     11/03: discussed. Suggested a simplified proposal be evaluated.
     18/03: SH and AP had further discussions on simplified proposal
     08/04: See minutes, review in detail for next call
026  SH: Envelopes and Payloads
     08/04: Not discussed explicitly, but recursive use of DFDL is tied up with this
027  Property precedence tables
     08/04: Not discussed
028  SH: Variable markup
     08/04: Discussed briefly at end of call, IBM to see whether there are any use cases that require recursive use of DFDL.
029  valueCalc (output length calculation)
     08/04: Not discussed
032  DG: Investigate compatibility between DFDL infoset and XDM
     08/04: No update
033  AP/TK: Assert/Discriminator semantics. AP to document. TK to check uses of discriminator besides choice.
     08/04: In progress within IBM
034  AP: Remove redundant properties, correct old examples
     08/04: No update

Closed actions:

031  DG: Review dfdl v033
     11/02: Initial comments received
     18/02: Will include work items 5 and 12.
     11/03: complete

Work items:

No   Item
001  String XML type (Ian P) - Apr 30, 2008
002  Escape schemes (Ian P) - Apr 30, 2008
003  Variables - ??, 2008 (Mike)
005  Improvements on property descriptions - ??, 2008 (All - split TBD)
006  Envelopes and Payloads (Steve) - Apr 30, 2008
007  (from draft 32) valueCalc (Mike) - ??, 2008 mostly complete
008  (from draft 32) Property precedence for writing (Steve) - under review
009  (from draft 32) Variable markup (Steve) - Mar 31, 2008 proposal needs writing up
010  (from draft 32) Assertions, discriminators and choice, including discussion of timing option (Suman) - Mar 31, 2008 * in progress *
011  (from draft 32) How speculative parsing works (combining choice and variable-occurrence - currently these are separate) ??, 2008 (IBM) in progress
012  (from draft 32) Reordering the properties discussion: move representation earlier, improve flow of topics ??, 2008 (Alan) * not started *
025  Augmented infoset and unparsing (Alan) added but needs work
026  Remove duration

Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU