
Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, March 17, 2010

Attendees
Suman Kalia (IBM)
Steve Hanson (IBM)
Alan Powell (IBM)
Steve Marting (Progeny)
Stephanie Fetzer (IBM)
Mike Beckerle (Oco)

Apologies
Tim Kimber (IBM)

Steve H reported on the DFDL presentation at OGF 28

Overall OGF registered attendees thought to be around 250. Number at DFDL session: 7. Apart from Erwin (Data AD), DFDL was new to everyone.

Notes from session (from minutes taken by Erwin Laure based on questions asked):

Validation on input and output?
    Validation means conforming to the schema as defined (e.g. an integer between 0 and 100). dfdl:assert could be used for asserting correctness of the data. Further complex validation would be via an external step (e.g. Schematron invocation).

Using DFDL to model data structures generated by OO code, particularly use of inheritance?
    Could be a use case for allowing complex type inheritance in DFDL in the future.

Data often comes with units; the infoset would want to reflect this?
    Could be achieved using expressions, dfdl:inputValueCalc and dfdl:hidden to apply a scaling factor based on units.

What is the efficiency of DFDL? In particular, can tests be turned off for fast read/write?
    The spec defines the behavior of the parser but not how to implement it. For instance, validation is not mandatory.

Open source reference implementation would be good to have. (Lots of nodding.)

Scientific floating point data compresses badly. Knowing the data structure can allow more intelligent compression.
    DFDL is not intended to do transformations, but that could be done on top. Encryption/compression could also be a use case for multi-layers, or for additional functions in the expression language.

Need to reach out to DAIS-WG and DR-WG for (public) comments and to see whether DFDL will be actively used by those groups. An open source implementation would obviously help here.
Comment on the spec:
    Should be a proposed recommendation (GFD-P-R), not informational (GFD-I).

Draft 040 has been submitted to OGF and will be reviewed by the technical committee on March 30th. It is expected to go out for public comment in early April.

There needs to be some publicity to ensure that enough comments are made.
    Mike suggested a press release from his company with IBM comment. Action raised.
    May be able to get IBM representatives on other standards bodies, such as ACCORD and OMG, to interest their members.
    Should contact all the contributors to the WG to ask them to review.
    Mike will update the information he sent in response to an RFI from OMG.

1. Process for dealing with internal issues during Public comment phase

The public comments process will be used to make updates.

2. Nils and Defaults during unparsing

Table 17 in section 13.16.2 was corrected in draft 40 but there are still some ambiguities, for example when nil is the default.

Logical Value | nilValueInitiatorPolicy | Has default value specified | missingValueInitiatorPolicy | initiator region contains | content region contains
Nil (implies nillable) | prohibited | don't care | don't care | empty | representation of nil based on nilKind, nilValue, etc.
    required | initiator string
"" (empty string). Note that this implies that the element type is xs:string | don't care | prohibited | empty | empty string
    required | initiator string
a non-nil non-empty-string value | don't care | don't care | initiator string | The representation of the logical value
Not supplied | don't care | Yes (non-empty string) | don't care | Initiator string | The representation of the default value.
    Yes (empty string) | prohibited | empty | empty
    required | initiator string | empty

Alan will update the table.

3. dfdl:choiceKind

The main issues are:
a) The calculation of the length of the longest branch is not obvious.
b) The length units to use - the dfdl:lengthUnits property does not exist on a choice.
c) The name could be better.

Proposal is therefore to retain the property but to:
i) State the conditions that must apply to use this property, and enforce them in the validator => schema definition error otherwise.
ii) Decouple the choice from its parent by calculating the length of each branch based solely on the properties of the branch's components, irrespective of any parent dfdl:lengthKind.

Alan to document the problem. Name change agreed.

4. DFDL time functions

Function | Meaning
fn:timezone-from-dateTime | Returns the timezone from an xs:dateTime value.
fn:timezone-from-date | Returns the timezone from an xs:date value.
fn:timezone-from-time | Returns the timezone from an xs:time value.
fn:adjust-dateTime-to-timezone | Adjusts an xs:dateTime value to a specific timezone, or to no timezone at all.
fn:adjust-date-to-timezone | Adjusts an xs:date value to a specific timezone, or to no timezone at all.
fn:adjust-time-to-timezone | Adjusts an xs:time value to a specific timezone, or to no timezone at all.

All return an xs:duration. Do we need these functions?
As there is no known use case for these functions, they will be dropped.

Meeting closed, 14:10

Next call Wednesday 17 March January 2010 13:00 UK (9:00 ET)
Call will be for one hour only.
NOTE: East coast is 4 hours behind the UK for the next two weeks.

Next action: 087

Actions raised at this meeting

No | Action
085 | ALL: publicize Public comments phase to ensure a good review.
086 | AP: Nils and Defaults during unparsing - update table

Current Actions:

No | Action
066 | Investigate format for defining test cases
    25/11: IBM to see if it is possible to publish its test case format.
    04/12: no update
    09/12: no update
    16/12: reminder sent to project manager
    23/12: SH will send another reminder.
    06/01: Another reminder will be sent
    13/01: no update
    20/01: no update
    27/01: no progress
    29/01: no progress
    03/02: IBM is still investigating
    10/02: IBM is still investigating
    17/02: IBM is willing in principle to publish the test case format and some of the test cases. May need some time to build a 'compliance suite'.
    24/02: No progress
    03/03: Discussions have been taking place on the subset of tests that will be provided.
    10/03: work is progressing
    17/03: work is progressing
084 | Check behaviour of dfdl:inputValueCalc and outputValueCalc.
085 | ALL: publicize Public comments phase to ensure a good review.
086 | AP: Nils and Defaults during unparsing - update table

Closed actions

No | Action

Work items:

No | Item | target version | status
005 | Improvements on property descriptions | | not started
012 | Reordering the properties discussion: move representation earlier, improve flow of topics | | not started
036 | Update dfdl schema with change properties | | ongoing
042 | Mapping of the DFDL infoset to XDM | none | not required for V1 specification
070 | Write DFDL primer | |
071 | Write test cases. | |
083 | Implement RFC2116 | |
097 | Remove functions that return duration | |

Regards
Alan Powell
Development - MQSeries, Message Broker, ESB
IBM Software Group, Application and Integration Middleware Software
IBM MP211, Hursley Park
Hursley, SO21 2JN
United Kingdom
Phone: +44-1962-815073
e-mail: alan_powell@uk.ibm.com