Minutes for OGF DFDL Working Group Call, January-26 & 27-2010

Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, January-26 & 27-2010

Attendees

Mike Beckerle (Oco) (27)
Steve Hanson (IBM) (26 & 27)
Alan Powell (IBM) (26 & 27)
Steve Marting (Progeny) (27)
Suman Kalia (IBM) (27)
Peter Lambros (IBM) (26)
Tim Kimber (IBM) (26 & 27)

Apologies

Stephanie Fetzer (IBM)

1. Discriminators

There was a long discussion of Alan's document, which merges the 'Resolving Uncertainty' and 'Discriminators' documents. It proposes that discriminators 'confirm the existence of their parent sequence', and it does not allow a discriminator on a simple element to confirm the existence of that element based on its content or on other data. Mike pointed out that this does not allow the use case in which a previous field in the data stream (e.g. a flag) indicates the existence of a field, unless that field is wrapped in an extra sequence.

The use cases are:
- A sequence is 'known to exist' by testing one or more of its children
- A sequence is 'known to exist' by testing some previous data
- A simple element is 'known to exist' by testing some previous data
- A simple element is 'known to exist' by testing its content

An alternative proposal was discussed in which discriminators are allowed on particles and 'confirm the existence of a component'. Alan will write up both proposals.

The meaning of minOccurs=0 on a choice branch was also raised. Steve will ask Sandy Gao.

2. Unparsing lengthKind=Pattern

The table below was modified to make the unparsing length for text representations the minLength/outputMinLength. Mike suggested that lengthKind pattern should be limited to text fields, but the TLog requirements need to be understood first.
                      Representation
Type                  text              binary
String                minLength         Not applicable
Float                 outputMinLength   32
Double                outputMinLength   64
Decimal/Integer       outputMinLength   Minimum number of bytes to represent significant digits and sign
Long, UnsignedLong    outputMinLength   packed/bcd: as decimal; binary: 64
Int, UnsignedInt      outputMinLength   packed/bcd: as decimal; binary: 32
Short, UnsignedShort  outputMinLength   packed/bcd: as decimal; binary: 16
Byte, UnsignedByte    outputMinLength   packed/bcd: as decimal; binary: 8
DateTime              outputMinLength   packed/bcd: as decimal; binarySeconds: 32; binaryMilliseconds: 64
Date                  outputMinLength   packed/bcd: as decimal; binarySeconds: 32; binaryMilliseconds: 64
Time                  outputMinLength   packed/bcd: as decimal; binarySeconds: 32; binaryMilliseconds: 64
Boolean               outputMinLength   32
HexBinary             Not applicable    Length of infoset value

3. TLog

Reviewed Steve's proposal.

Proposal

1) The 'variable length with a maximum' can not be handled using a post-timing assertion, because assertions only apply on parsing. Given that IBM MRM allows you to model this case, I think we should allow dfdl:length & dfdl:lengthUnits to be specified when dfdl:lengthKind is 'delimited' or 'pattern'. Extraction during parse is still by scanning. If the physical representation exceeds that length after extraction, it is a processing error. Similarly when unparsing, if the physical representation exceeds that length prior to output, it is a processing error.

2) dfdl:binaryNumberRep is extended with another value 'tlog'. Associated property dfdl:binaryDecimalVirtualPoint is applicable. Associated property dfdl:binaryPackedSignCodes is not applicable, there being only xD to indicate negative numbers, and no distinct nibbles for unsigned or zero.

3) dfdl:lengthKind 'delimited' is permitted for numbers when dfdl:representation is 'binary' and dfdl:binaryNumberRep is 'packed' or 'bcd' or 'tlog' because it is possible to know in advance the range of bytes being used, and therefore to choose suitable delimiters.
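Item 1 of the proposal can be sketched roughly as follows. This is only an illustration of the proposal as worded, not an agreed design and not the API of any DFDL processor. The function names, the ProcessingError class, the use of bytes as the length unit, and the treatment of a missing delimiter as an error are all assumptions made for the example.

```python
from typing import Tuple


class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


def parse_delimited(data: bytes, start: int, delimiter: bytes,
                    max_length: int) -> Tuple[bytes, int]:
    """Extract a field by scanning, then check it against a maximum length.

    Returns the field content and the position just after the delimiter.
    """
    # Extraction is still by scanning for the delimiter.
    end = data.find(delimiter, start)
    if end == -1:
        # Assumption: a missing delimiter is simply treated as an error here.
        raise ProcessingError("delimiter not found")
    value = data[start:end]
    # The length check happens only after extraction.
    if len(value) > max_length:
        raise ProcessingError(
            f"parsed length {len(value)} exceeds maximum {max_length}")
    return value, end + len(delimiter)


def unparse_delimited(value: bytes, delimiter: bytes,
                      max_length: int) -> bytes:
    """Check the physical representation before output, then add the delimiter."""
    if len(value) > max_length:
        raise ProcessingError(
            f"output length {len(value)} exceeds maximum {max_length}")
    return value + delimiter
```

With these definitions, parse_delimited(b"abc,defgh,", 0, b",", 4) returns (b"abc", 4), while parsing the second field from position 4 raises ProcessingError because "defgh" is five bytes long.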
Comments on the proposal:

1) This can be achieved on parsing with an assert, but cannot be enforced on unparsing.
2) Discussed the desirability of properties/enumerations specific to particular formats versus providing more extensibility. Decided neither was needed for DFDL v1.
3) Steve H felt that this was needed. It was agreed to add it to the spec (I need to confirm this).

4. Action 071 Semantics of length=0, nil handling and defaults

Not discussed.

5. Go through Actions

6. Draft 037 review issues

- Case of enumerations. We should follow the XSDL convention, which is that enumerations are case sensitive.
- dfdl:lengthKind='Pattern' scannability: A complex element with lengthKind=Pattern will use its dfdl:encoding property as the encoding when scanning its children, irrespective of each child's encoding property.

Go through unanswered issues in Mike's comments document.

7. Review Schedule

OGF pre-review is confirmed to take about 4 weeks, assuming no document updates are required. We are behind schedule to be available for public review by March. Draft 038 will be available at the end of this week.

Activity                                    Schedule               Who
Complete Action items                       - 18 Dec 2009          WG
Complete Spec
  Write up work items                       - 23 Dec 2009          AP
  Restructure and complete specification    - 23 Dec 2009          AP
  Issue Draft 038                           23 Dec 2009
WG review
  WG review                                 7 Dec - 08 Jan 2010    WG
  Incorporate review comments               4 Jan - 29 Jan 2010    AP +
  Issue Draft 039                           15 Jan 2010
  Incorporate review comments               4 Jan - 29 Jan 2010    AP +
  Issue Draft 040                           29 Jan 2010
Initial OGF Editor Review
  Initial Editor review                     1 Feb - 1 Mar 2010     OGF
  Initial GFSG review                       1 Feb - 1 Mar 2010
  Issue Draft 041                           1 Mar 2010
OGF Public Comment period (60 days)         1 Mar - 30 Apr 2010    OGF
OGF 28 Munich                               15-19 March 2010
Incorporate comments
  Incorporate comments                      28 May 2010
  Issue Draft 042                           28 May 2010
Final OGF Editor Review
  Final Editor review                       June 2010              OGF
  Final GFSG review                         June 2010
  Issue Final specification                 30 June 2010
Publish proposed recommendation             1 July 2010
Grid recommendation process                 1 Jan - 1 April 2011

Meeting closed, 15:10

Next call Friday 29 January 2010 13:00 UK

Next action: 077

Actions raised at this meeting

No    Action
076   SH: semantics of minOccurs=0 on choice branches

Current Actions:

No    Action
045   20/05 AP: Speculative Parsing
      27/05: Pseudo code has been circulated. Review for next call
      03/06: Comments received and will be incorporated
      09/06: Progress but not discussed
      17/06: Discussed briefly
      24/06: No Progress
      01/07: No Progress
      15/07: No progress. MB not happy with the way the algorithm is documented, need to find a better way.
      29/07: No Progress
      05/08: No Progress. Will document behaviour as a set of rules.
      12/08: No Progress
      ...
      16/09: no progress
      30/09: AP distributed proposal and others commented. Brief discussion. AP to incorporate updates and reissue
      07/10: Updated proposal was discussed. Comments will be incorporated into the next version.
      14/10: Alan to update proposal to include array scenario where minOccurs > 0
      21/10: Updated proposal reviewed
      28/10: Updated proposal reviewed, see minutes
      04/11: Discussed semantics of discriminators on arrays. MB to produce examples
      11/11: Absorbing action 033 into 045. Maybe decorated discriminator kinds are needed after all. MB and SF to continue with examples.
      18/11: Went through WTX implementation of example.
      SF to gather more documentation about WTX discriminator rules.
      25/11: Further discussion. Will get more WTX documentation. Need to confirm that no changes are needed to the Resolving Uncertainty doc.
      04/12: Further discussion about arrays.
      09/12: Reviewed proposed discriminator semantic.
      16/12: Reviewed discriminator examples and WTX semantic.
      23/12: SF to provide better description of WTX behaviour and invite B Connolley to next call
      06/01: B Connolly not available. SF to provide more complete description.
      13/01: Stephanie took us through a description of WTX identifiers. Mike agreed to write up in DFDL terms.
      20/01: Mike will write up
      27/01: Further discussion of discriminators
049   20/05 AP: Built-in specification description and schemas
      03/06: not discussed
      24/06: No Progress
      24/06: No Progress (hope to get these from test cases)
      15/07: No progress. Once available, the examples in the spec should use the dfdl:defineFormat annotations they provide.
      ...
      14/10: no progress
      21/10: Discussed the real need for this being in the specification. It seemed that the main value is that it defines a schema location for downloading 'known' defaults from the web.
      28/10: no progress
      04/11: no progress
      11/11: no update
      18/11: no update
      25/11: Agreed to try to produce for CSV and fixed formats
      04/12: no update
      09/12: no update
      16/12: no update
      23/12: no update
      06/01: no progress. If there is no resource to complete this action it can be deferred
      13/01: no progress
      20/01: no progress
      27/01: no progress
064   MB/SH: Request WG presentation at OGF 28
      25/11: Session requested
      04/12: no update
      09/12: no update
      16/12: SH has changed request to a general session rather than WG in the hope of attracting more people.
      23/12: no update
      06/01: not heard anything yet
      13/01: no update
      20/01: no update
      27/01: Session confirmed
066   Investigate format for defining test cases
      25/11: IBM to see if it is possible to publish its test case format.
      04/12: no update
      09/12: no update
      16/12: reminder sent to project manager
      23/12: SH will send another reminder.
      06/01: Another reminder will be sent
      13/01: no update
      20/01: no update
      27/01: no progress
071   Semantics of length=0, nil handling and defaults.
      23/12: SH no update
      06/01: SH has started
      13/01: SH proposal reviewed. Minor updates to be made
      20/01: Reviewed updated proposal. Need to agree on unparsing empty choices.
      27/01: Steve H had sent update but not discussed due to lack of time
074   SH: Proposal for parsing TLog
      27/01: Proposal discussed and agreed to allow delimited for binary packed/bcd fields
075   SH: rewrite empty sequences section
      27/01: Steve provided written section
076   SH: semantics of minOccurs=0 on choice branches

Closed actions

No    Action

Work items:

No    Item                                              Target version   Status
005   Improvements on property descriptions                              not started
012   Reordering the properties discussion: move                         not started
      representation earlier, improve flow of topics
036   Update dfdl schema with change properties                          ongoing
042   Mapping of the DFDL infoset to XDM                none             not required for V1 specification
069   ICU fractional seconds                            039
070   Write DFDL primer
071   Write test cases.
072   It is a processing error if the number of         039
      occurrences in the data does not match the
      value of the expression or prefix
073   Rename dfdl:separatorPolicy="required" to         039              Deferred until action 071 agreed
      "always".
078   Document UPA checks                               039

Regards

Alan Powell
Development - MQSeries, Message Broker, ESB
IBM Software Group, Application and Integration Middleware Software
--------------------------------------------------
IBM MP211, Hursley Park
Hursley, SO21 2JN
United Kingdom
Phone: +44-1962-815073
e-mail: alan_powell@uk.ibm.com

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU