DFDL: Minutes from OGF WG call, 21 January 2009

Open Grid Forum: Data Format Description Language Working Group

Weekly Working Group Conference Call
14:00 GMT, 21 January 2009

Attendees
  Alan Powell (IBM)
  Mike Beckerle (Oco)

Apologies
  Steve Hanson (IBM)

1. XSD 1.1

Deferred to next call.

2. Calendar formats

Discussed the updated (v4) supplement emailed by AP.

Agreed that millisec/secSinceEpoc cannot be implied by the length of the logical data, so separate enumerations are needed. Observed that these options are really a combination of three properties: binary, length and sec/millisec. Suggested renaming them to binarySeconds and binaryMilliseconds.

Packed calendars: decided that it must be possible to specify at least the packedDecimalSignCodes property rather than assuming a default, so a reference will be added to the calendar description.

Locale needs to be specified for numberformats and calendarFormats (no other areas were identified) because it modifies the behaviour of ICU. Decided to add locale to numberFormat and CalendarFormat.

3. Escape Schemes

Agreed there is a need for multiple escape delimiter pairs, but not nested ones.

An escape is needed for the escape character itself. In most cases this will be the same character, eg /n //, but some formats use a different escape, eg /n &/. Only single escape characters and one level of escape characters are needed.

Discussed how to deal with comments of the form /* comment */, where the escape delimiters are also the initiator and terminator of the field. The semantic needed is 'only look for field terminator not any parent terminator or any other syntax elements'. This may fall out naturally from the speculative parsing rules. Needs further discussion.

4. AOB

Next call 28 January 14:00

Meeting closed, 15:00 GMT

Actions raised at this meeting

No   Action
031

Current Actions:

No   Action
012  AP/SH: Update decimalCalendarScheme
     10/9: Not allocated yet
     17/9: No update
     24/9: Add calendar binary formats to actions
     22/10: No progress
     16/1: proposal distributed and discussed. Will be redistributed
     21/1: add locale,
020  SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
     22/10: No progress
     10/12: added how to decide to overpunch and sign position
023  MB: Review Schema 1.1
024  String XML type
025  Escape schemes
     21/1: discussed requirements
026  SH: Envelopes and Payloads
027  Property precedence tables
028  Variable markup
029  valueCalc (output length calculation)
030  AP: confirm with WTX that can drop duration
     21/6: WTX confirm that they do not have a duration type so do not need it in dfdl. Will drop from spec.
     Closed

Closed actions:

No   Action
030  AP: confirm with WTX that can drop duration
     21/6: WTX confirm that they do not have a duration type so do not need it in dfdl. Will drop from spec.
     Closed
034

Work items:

No   Item
001  String XML type (Ian P) - Apr 30, 2008
002  Escape schemes (Ian P) - Apr 30, 2008
003  Variables - ??, 2008 (Mike)
005  Improvements on property descriptions - ??, 2008 (All - split TBD)
006  Envelopes and Payloads (Steve) - Apr 30, 2008
007  (from draft 32) valueCalc (Mike) - ??, 2008 mostly complete
008  (from draft 32) Property precedence for writing (Steve) - under review
009  (from draft 32) Variable markup (Steve) - Mar 31, 2008 proposal needs writing up
010  (from draft 32) Assertions, discriminators and choice, including discussion of timing option (Suman) - Mar 31, 2008 * in progress *
011  (from draft 32) How speculative parsing works (combining choice and variable-occurrence - currently these are separate) ??, 2008 (IBM) in progress
012  (from draft 32) Reordering the properties discussion: move representation earlier, improve flow of topics ??, 2008 (Alan) * not started *
025  Augmented infoset and unparsing (Alan) added but needs work complete - specification updated

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM
email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
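
The escape-scheme requirements in item 3 (a single escape character, a separate escape for the escape character itself, and only one level of escaping) can be illustrated with a small scanner. This is a sketch only, not text from any proposal: the function name read_field, its parameters, and the reading of "/n" as "escape character followed by an escaped character" are assumptions made for the example.

```python
def read_field(data, terminator, escape_char="/", escape_escape_char="/"):
    """Scan one field up to the first unescaped terminator.

    Hypothetical helper: returns (unescaped_value, index_after_terminator).
    escape_char makes the next character literal; escape_escape_char
    followed by escape_char yields a literal escape_char. The two may be
    the same character ("//") or different ("&/").
    """
    out = []
    i = 0
    while i < len(data):
        ch = data[i]
        nxt = data[i + 1] if i + 1 < len(data) else None
        if ch == escape_escape_char and nxt == escape_char:
            # Escaped escape character: emit one literal escape_char.
            out.append(escape_char)
            i += 2
        elif ch == escape_char and nxt is not None:
            # Escaped character: emit it literally, even if it starts a terminator.
            out.append(nxt)
            i += 2
        elif data.startswith(terminator, i):
            # Unescaped terminator: the field ends here.
            return "".join(out), i + len(terminator)
        else:
            out.append(ch)
            i += 1
    # No terminator found: the field runs to the end of the data.
    return "".join(out), len(data)


print(read_field("a/,b,c", ","))            # ('a,b', 5)
print(read_field("a//b,c", ","))            # ('a/b', 5)
print(read_field("a&/b,c", ",", "/", "&"))  # ('a/b', 5)
```

Because only one level of escaping is handled, the scanner never needs to look more than one character ahead. It does not address the /* comment */ case, where the delimiters double as initiator and terminator.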

Action 020:

020  SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
     22/10: No progress
     10/12: added how to decide to overpunch and sign position

a) Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy

Add new property to section 15.4 Properties Specific to Number with Binary representation.

binaryNumberCheckPolicy    Enum
    Values are 'strict' and 'lax'.
    Indicates how lenient to be when parsing binary numbers.
    If 'lax' then the parser tolerates all valid alternatives where such alternatives exist. Specifically, for binaryNumberRepresentation = 'packed' the sign nibble for positive, negative, unsigned and zero is allowed to be any of the valid respective values.
    On unparsing, the specified value is always used.

Also suggest changing some of the other property names in 15.4:
    "decimalVirtualPoint" -> "binaryDecimalVirtualPoint"
    "packedDecimalSignCodes" -> "binaryPackedSignCodes"
And changing binaryNumberRepresentation enumeration:
    "BCD" -> "bcd"

b) Zoned decimals: How to decide to overpunch and sign position

Spec assumes that overpunching of the rightmost character always takes place. IBM architecture allows no overpunching (ie, Fx instead of Cx/Dx) - this is supported by IBM MRM & WTX parsers. Additionally IBM MRM parser allows separate sign byte, and sign byte on left. Let's deal with these separately:

i) No overpunching. The IBM architecture allows the rightmost byte to have a zone (Fx) or a sign (Cx/Dx) as the left nibble. I don't see why we can't base what to expect when parsing, and output when unparsing, on the logical xsd type.
- If it is an unsigned type then DFDL expects the rightmost byte to have a zone nibble when parsing, and outputs a zone nibble when unparsing.
- If it is a signed type then DFDL expects it to have a sign nibble when parsing, and outputs a sign nibble when unparsing.
For analogy with DFDL packed decimals, it seems at first glance that we should also extend the numberCheckPolicy 'lax' setting to treat a zone nibble as a +ve sign nibble for a signed type. However, IBM iSeries always outputs Fx to mean +ve but accepts both Fx & Cx on input. It is perhaps better therefore that DFDL always tolerates Fx when parsing a signed zoned decimal, otherwise iSeries users would always have to set numberCheckPolicy to 'lax', which might have other implications in the future.

ii) Separate sign byte. I don't believe the IBM architecture allows this. I'm not sure where MRM got this from. I don't think DFDL needs to support it.

iii) Sign byte on left. I don't believe the IBM architecture allows this. I'm not sure where MRM got this from. I don't think DFDL needs to support it.

Conclusion: No new DFDL properties needed, but words need adding to explain zoned parse/unparse behaviour better.

Also suggest changing property names:
    "zonedDecimalSignStyle" -> "numberZonedSignStyle"
    "zeroNumberRep" -> "numberZeroRep"

Should also make clear that any explicit negative pattern in numberPattern will be ignored if the xsd type is unsigned. (We could make this an error but it precludes creation of a textNumberFormat that works with both signed and unsigned types, plus pattern "##0.0" implicitly is equivalent to "##0.0;(##0.0)".)

Regards

Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces@ogf.org
23/01/2009 13:36
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] DFDL: Minutes from OGF WG call, 21 January 2009

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg
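
The sign-nibble rules discussed under Action 020 can be sketched in code. This is an illustration under stated assumptions, not specification text: the function names and parameters are hypothetical, the 'lax' nibble sets (A, C, E, F positive; B, D negative) follow common IBM packed-decimal usage rather than anything fixed in the note above, and the separate unsigned and zero sign codes are left out to keep the sketch short.

```python
# Assumed 'lax' alternatives, based on common IBM packed-decimal usage.
LAX_POSITIVE = {0xA, 0xC, 0xE, 0xF}
LAX_NEGATIVE = {0xB, 0xD}


def decode_packed(data, positive=0xC, negative=0xD, policy="strict"):
    """Decode packed decimal: two nibbles per byte, sign in the last nibble.

    policy='strict' accepts only the specified sign nibbles;
    policy='lax' accepts any of the assumed valid alternatives.
    """
    nibbles = [n for b in data for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    value = int("".join(map(str, digits)))
    pos = LAX_POSITIVE if policy == "lax" else {positive}
    neg = LAX_NEGATIVE if policy == "lax" else {negative}
    if sign in pos:
        return value
    if sign in neg:
        return -value
    raise ValueError(f"sign nibble {sign:X} not allowed under policy {policy!r}")


def decode_zoned(data, signed):
    """Decode EBCDIC zoned decimal, choosing the expected nibble from the type.

    Unsigned: the rightmost byte must carry a zone nibble (Fx).
    Signed: the rightmost byte carries a sign nibble (Cx/Dx), and Fx is
    always tolerated as positive, as suggested for iSeries data.
    """
    digits = [b & 0x0F for b in data]
    zones = [b >> 4 for b in data]
    if any(d > 9 for d in digits) or any(z != 0xF for z in zones[:-1]):
        raise ValueError("invalid zoned decimal byte")
    value = int("".join(map(str, digits)))
    last = zones[-1]
    if last == 0xF:
        return value
    if signed and last == 0xC:
        return value
    if signed and last == 0xD:
        return -value
    raise ValueError(f"unexpected left nibble {last:X} in rightmost byte")


print(decode_packed(bytes([0x12, 0x3F]), policy="lax"))       # 123
print(decode_zoned(bytes([0xF1, 0xF2, 0xD3]), signed=True))   # -123
print(decode_zoned(bytes([0xF1, 0xF2, 0xF3]), signed=True))   # 123
```

On unparsing the situation is simpler, since the specified sign nibble is always written regardless of the check policy.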