Open Grid Forum: Data Format Description Language Working Group

OGF DFDL Working Group Call, January-29-2010

Attendees

Mike Beckerle (Oco)  
Steve Hanson (IBM)
Alan Powell (IBM)  
Suman Kalia (IBM)
Peter Lambros (IBM)  
Tim Kimber (IBM)

Apologies
Stephanie Fetzer (IBM)
Steve Marting (Progeny)


1. Review Schedule

The WG acknowledged that the specification will not be available for public comment by OGF 28. Agreed to complete the specification as soon as possible.

OGF pre-review is confirmed to take about 4 weeks, assuming no document updates are required. We are behind schedule for the specification to be available for public review by March.
Activity                                | Schedule             | Who
Complete Action items                   | - 18 Dec 2009        | WG
Complete Spec Write up work items       | - 23 Dec 2009        | AP
Restructure and complete specification  | - 23 Dec 2009        | AP
Issue Draft 038                         | 23 Dec 2009          |
WG review                               | 7 Dec - 08 Jan 2010  | WG
Incorporate review comments             | 4 Jan - 29 Jan 2010  | AP +
Issue Draft 039                         | 15 Jan 2010          |
Incorporate review comments             | 4 Jan - 29 Jan 2010  | AP +
Issue Draft 040                         | 29 Jan 2010          |
Initial OGF Editor Review               | 1 Feb - 1 Mar 2010   | OGF
Initial GFSG review                     | 1 Feb - 1 Mar 2010   |
Issue Draft 041                         | 1 Mar 2010           |
OGF Public Comment period (60 days)     | 1 Mar - 30 Apr 2010  | OGF
OGF 28 Munich                           | 15-19 March 2010     |
Incorporate comments                    | 28 May 2010          |
Issue Draft 042                         | 28 May 2010          |
Final OGF Editor Review                 | June 2010            | OGF
Final GFSG review                       | June 2010            |
Issue Final specification               | 30 June 2010         |
Publish proposed recommendation         | 1 July 2010          |
Grid recommendation process             | 1 Jan - 1 April 2011 |




2. Go through Actions

Updated below

Action 077: Suman said that he had been mapping COBOL structures to DFDL and it didn't look as though the way text numbers are defined is very usable. He will document this for the next call.

3. TLog

TLOG

The individual fields are a mixture of ASCII strings, proprietary packed decimals, and occasional pure binary data. All fields are delimited by a separator. Fields of all types can be fixed length or variable length with a maximum. Pure binary data is preceded by a field giving the actual length. All lengths are in bytes.

Packed decimals. These are like packed decimals in the IBM sense. They can carry negative numbers, but use a leading xD sign nibble. There is no sign nibble if the number is positive or unsigned. A value with an odd number of nibbles (including the sign nibble if present) is padded with a leading xF nibble. This is best illustrated using examples.

1234     =>     x12x34

123      =>     xF1x23

-1234    =>     xFDx12x34

-123     =>     xD1x23
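
The four examples above can be reproduced with a short sketch. This is only an illustration of the layout as described in these minutes, not an agreed DFDL mechanism; the function names encode_tlog_packed and decode_tlog_packed are hypothetical.

```python
def encode_tlog_packed(value: int) -> bytes:
    """Encode an integer using the TLOG packed layout described above."""
    # One nibble per decimal digit, most significant digit first.
    nibbles = [int(d) for d in str(abs(value))]
    if value < 0:
        nibbles.insert(0, 0xD)  # leading sign nibble, negative values only
    if len(nibbles) % 2:
        nibbles.insert(0, 0xF)  # pad an odd nibble count with a leading xF
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def decode_tlog_packed(data: bytes) -> int:
    """Inverse of encode_tlog_packed; raises ValueError on malformed input."""
    nibbles = []
    for b in data:
        nibbles += [b >> 4, b & 0x0F]
    if nibbles and nibbles[0] == 0xF:  # drop the pad nibble
        nibbles.pop(0)
    negative = bool(nibbles) and nibbles[0] == 0xD
    if negative:                       # drop the sign nibble
        nibbles.pop(0)
    if not nibbles or any(n > 9 for n in nibbles):
        raise ValueError("not a valid TLOG packed decimal")
    value = int("".join(str(n) for n in nibbles))
    return -value if negative else value


for n in (1234, 123, -1234, -123):
    print(n, "=>", encode_tlog_packed(n).hex())
```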


Proposal

1) The 'variable length with a maximum' will be handled using a post-timing assertion. Note this only applies on parsing. **


2) dfdl:lengthKind 'delimited' is permitted for numbers when dfdl:representation is 'binary' and dfdl:binaryNumberRep is 'packed' or 'bcd' because it is possible to know in advance the range of bytes being used, and therefore to choose suitable delimiters.


3) Core DFDL 1.0 will not be enhanced to handle the TLOG packed decimal type.  A future version of DFDL will provide an extensibility mechanism that allows user-defined types to be handled.  In the 1.0 timeframe IBM may implement its own proprietary extension to handle this type.


** While this can result in output from a DFDL unparser that cannot be re-parsed, that is a problem general to the use of assertions, and a future version of DFDL may choose to change this by enhancements to the assertion annotation.
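
Items 1 and 2 of the proposal can be illustrated with a rough sketch. It assumes the usual nibble layouts: in 'bcd' every nibble is a digit 0-9, and in IBM-style 'packed' every high nibble is a digit while the low nibble of the last byte carries the sign. Under that assumption some byte values can never occur in the data and are safe as single-byte delimiters. The helper names and the single-byte delimiter are hypothetical simplifications, and the maximum length is checked only after the fields have been split, mirroring an assertion evaluated after parsing.

```python
from typing import List


def is_safe_delimiter_byte(b: int, rep: str) -> bool:
    """True if byte b can never appear inside a number of the given rep."""
    hi, lo = b >> 4, b & 0x0F
    if rep == "bcd":
        # Assumption: every nibble of BCD data is a digit 0-9.
        return hi > 9 or lo > 9
    if rep == "packed":
        # Assumption: high nibbles are always digits; only the low nibble
        # of the last byte may hold a sign value (xA-xF).
        return hi > 9
    raise ValueError("unknown binaryNumberRep: " + rep)


def split_delimited(data: bytes, delimiter: int, rep: str,
                    max_len: int) -> List[bytes]:
    """Split delimited binary fields, then check the maximum length."""
    if not is_safe_delimiter_byte(delimiter, rep):
        raise ValueError("delimiter byte could occur inside the data")
    fields = data.split(bytes([delimiter]))
    for field in fields:
        # Checked after the field has been extracted (parsing only).
        if len(field) > max_len:
            raise ValueError("field exceeds maximum length")
    return fields


fields = split_delimited(bytes([0x12, 0x3C, 0xFF, 0x04, 0x5D]), 0xFF, "packed", 2)
print([f.hex() for f in fields])
```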


4. Action 071
Semantics of length=0, nil handling and defaults.

Changed unparsing behaviour: we must honour the property. The existing behaviour of always writing the initiator means we cannot successfully re-parse when writing empty content and the enum is 'suppress'.  When reading, assume that section 15.13 has been updated to include complex as well as simple elements.

No change to enums.
missingValueInitiatorPolicy Enum

Valid values: 'required', 'prohibited'

Specifies whether to expect an initiator when an element is missing. Ignored unless dfdl:initiator is specified and is not "" (empty string).

'required'  - Indicates that the dfdl:initiator followed by empty content is the required syntax to indicate that the element is missing.  

'prohibited' - Indicates that empty content is the required syntax to indicate that the element is missing. The presence of an initiator implies that real content must follow.

Use of 'prohibited' implies an ordered sequence. If used on an initiated element of an unordered group it is a schema definition error.

If the element is required, defaulting occurs as defined above.

This property also applies on unparsing, when the data to be written (after nil value and default value processing) is empty content.

Annotation: dfdl:element
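
The two enum values can be illustrated with a small sketch of how a missing element might be recognised and written. This is one reading of the property text above, not agreed specification behaviour: the function names are hypothetical, "slot" stands for the text found where the element is expected, and the handling of an empty initiator (where the property is ignored) is an assumption.

```python
def classify_slot(slot: str, initiator: str, policy: str) -> str:
    """Return 'missing' or 'present' for the text found in an element's slot."""
    if policy not in ("required", "prohibited"):
        raise ValueError("invalid missingValueInitiatorPolicy: " + policy)
    if initiator == "":
        # Property is ignored; assumption: empty content means missing.
        return "missing" if slot == "" else "present"
    if policy == "required":
        # Initiator followed by empty content marks a missing element.
        if slot == initiator:
            return "missing"
    else:
        # 'prohibited': empty content alone marks a missing element, and
        # an initiator implies that real content must follow.
        if slot == "":
            return "missing"
        if slot == initiator:
            raise ValueError("initiator must be followed by real content")
    if slot.startswith(initiator):
        return "present"
    raise ValueError("initiator not found")


def unparse_missing(initiator: str, policy: str) -> str:
    """Text written when the data to be written is empty content."""
    if policy not in ("required", "prohibited"):
        raise ValueError("invalid missingValueInitiatorPolicy: " + policy)
    return initiator if policy == "required" else ""


print(classify_slot("ID:", "ID:", "required"))
print(classify_slot("", "ID:", "prohibited"))
print(classify_slot("ID:42", "ID:", "prohibited"))
```

With this reading, whatever unparse_missing writes is classified as 'missing' again on parsing, which is the re-parse property the change to unparsing behaviour is meant to preserve.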

Unparsing. When a complex element is required but missing from the infoset, the branch of a choice that is output is the first branch of the choice that does not result in a processing error.
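
The branch selection rule can be sketched as a simple first-success loop. The ProcessingError class, the (name, callable) representation of branches, and the error raised when every branch fails are assumptions made for illustration only.

```python
class ProcessingError(Exception):
    """Stands in for a DFDL processing error (hypothetical class)."""


def unparse_missing_choice(branches):
    """Try each branch in schema order; return the first that succeeds.

    branches is a list of (name, unparse) pairs, where unparse() returns
    the bytes written for that branch or raises ProcessingError.
    """
    for name, unparse in branches:
        try:
            return name, unparse()
        except ProcessingError:
            continue  # this branch cannot be output, try the next one
    # Assumption: if no branch can be output, the choice itself fails.
    raise ProcessingError("no branch of the choice could be unparsed")


def failing_branch():
    raise ProcessingError("required child has no default")


branches = [
    ("a", failing_branch),
    ("b", lambda: b"B"),
    ("c", lambda: b"C"),
]
print(unparse_missing_choice(branches))
```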

5. Go through Actions


6. Discriminators

Not discussed




7. Draft 037 review issues

Not discussed

- Case of enumerations. We should follow the XSDL convention, which is that enumerations are case sensitive.

- dfdl:lengthKind='Pattern' scannability: A complex element with lengthKind='Pattern' will use its dfdl:encoding property as the encoding when scanning its children, irrespective of the child's encoding property.


Go through unanswered issues in Mike's comments document

Meeting closed, 14:10

Next call: Tuesday 02 February 2010, 13:00 UK

Next action: 078

Actions raised at this meeting
No
Action
077
SKK:  mapping of COBOL numbers to textNumberFormats.

Current Actions:
No
Action
045
20/05 AP: Speculative Parsing
27/05: Pseudocode has been circulated. Review for next call
03/06: Comments received and will be incorporated
09/06: Progress but not discussed
17/06: Discussed briefly
24/06: No Progress
01/07: No Progress
15/07: No progress. MB not happy with the way the algorithm is documented, need to find a better way.
29/07: No Progress
05/08: No Progress. Will document behaviour as a set of rules.
12/08: No Progress
...
16/09: no progress
30/09: AP distributed proposal and others commented. Brief discussion. AP to incorporate updates and reissue
07/10: Updated proposal was discussed. Comments will be incorporated into the next version.
14/10: Alan to update proposal to include array scenario where minOccurs > 0
21/10: Updated proposal reviewed
28/10: Updated proposal reviewed see minutes
04/11: Discussed semantics of discriminators on arrays. MB to produce examples
11/11: Absorbing action 033 into 045.  Maybe decorated discriminator kinds are needed after all. MB and SF to continue with examples.
18/11: Went through WTX implementation of example. SF to gather more documentation about WTX discriminator rules.
25/11: Further discussion. Will get more WTX documentation. Need to confirm that no changes are needed to the Resolving Uncertainty doc.
04/12: Further discussion about arrays.
09/12: Reviewed proposed discriminator semantic.
16/12: Reviewed discriminator examples and WTX semantic.
23/12: SF to provide better description of WTX behaviour and invite B Connolley to next call
06/01: B Connolly not available. SF to provide more complete description.
13/01: Stephanie took us through a description of WTX identifiers. Mike agreed to write up in DFDL terms.
20/01: Mike will write up
27/01: further discussion of discriminators
29/01: Alan had emailed both proposals but there was not enough time to discuss
049
20/05 AP Built-in specification description and schemas
03/06: not discussed
24/06: No Progress
24/06: No Progress (hope to get these from test cases)
15/07: No progress. Once available, the examples in the spec should use the dfdl:defineFormat annotations they provide.
...
14/10: no progress
21/10: Discussed the real need for this being in the specification. It seemed that the main value is that it defines a schema location for downloading 'known' defaults from the web.
28/10: no progress
04/11: no progress
11/11: no update
18/11: no update
25/11: Agreed to try to produce for CSV and fixed formats
04/12: no update
09/12: no update
16/12: no update
23/12: no update
06/01: no progress. If there is no resource to complete this action it can be deferred
13/01: no progress
20/01: no progress
27/01: no progress
29/01: No progress.  The predefined formats do not need to be available when the spec is published.
066
Investigate format for defining test cases
25/11: IBM to see if it is possible to publish its test case format.
04/12: no update
09/12: no update
16/12: reminder sent to project manager
23/12: SH will send another reminder.
06/01: Another reminder will be sent
13/01: no update
20/01: no update
27/01: no progress
29/01: no progress
077
SKK:  mapping of COBOL numbers to textNumberFormats.

Closed actions
No
Action
064
MB/SH Request WG presentation at OGF 28
25/11: Session requested
04/12: no update
09/12: no update
16/12: SH has changed request to a general session rather than WG in the hope of attracting more people.
23/12: no update
06/01: not heard anything yet
13/01: no update
20/01: no update
27/01: Session confirmed
Closed
071
Semantics of length=0, nil handling and defaults.
23/12: SH no update
06/01: SH has started
13/01: SH proposal review. Minor updates to be made
20/01: Reviewed updated proposal. Need to agree on unparsing empty choices.
27/01: Steve H had sent update but not discussed due to lack of time
29/01: See minutes.  Update 15.3 for complex. Document missingValueInitiatorPolicy. Unparsing. When a complex element is required but missing from the infoset, the branch of a choice that is output is the first branch of the choice that does not result in a processing error.
Closed
074
SH: Proposal for parsing TLog
27/01:  Proposal discussed and agreed to allow delimited for binary packed/bcd fields
29/01: See minutes.  Confirmed delimited for packed/bcd
Closed
075
SH: rewrite empty sequences section
27/01: Steve provided written section
29/01: section rewritten
Closed
076
SH: semantics of minOccurs=0 on choice branches
29/01: Steve confirmed that XSDL allows minOccurs=0 for branches of a choice, which means that the empty sequence is a valid result. WG decided that DFDL will not allow minOccurs=0 on branches of a choice.
Closed

Work items:
No  | Item | Target version | Status
005 | Improvements on property descriptions | | not started
012 | Reordering the properties discussion: move representation earlier, improve flow of topics | | not started
036 | Update dfdl schema with change properties | | ongoing
042 | Mapping of the DFDL infoset to XDM | none | not required for V1 specification
069 | ICU fractional seconds | 039 |
070 | Write DFDL primer | |
071 | Write test cases. | |
072 | it is a processing error if the number of occurrences in the data does not match the value of the expression or prefix | 039 |
073 | Rename dfdl:separatorPolicy="required" to "always". | 039 | Deferred until action 071 agreed
078 | document UPA checks | 039 |
079 | Semantics of length=0, nil handling and defaults. (A071) | 039 |
080 | Tlog: Allow LengthKind delimited for packed/bcd (A074) | 039 |
081 | Update empty sequence section (A075) | 039 |
082 | semantics of minOccurs= 0 on choice branches (A076) | 039 |


 
Regards
 
Alan Powell
 