Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008

I have reworked the example in which the choice branch condition is specified outside the choice contents. We don't have to mention the switch/case programming paradigm in DFDL; on reflection, I don't believe it is relevant here. The reworked example is attached.

I also created another example by importing a COBOL copybook (containing a REDEFINES clause) using the IBM Message Broker Toolkit. Here the branch condition is described inside the choice. In this case the discriminator contains a forward reference to check the condition value; implementations may want to restrict the forward reference to the first token within the contents of the choice branch.

Note: Discriminators are useful when the user is dealing with binary or text data which is not tagged.

Your comments and suggestions are most welcome and appreciated.

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel: 905-413-3923 T/L 969-3923
Fax: 905-413-4850 T/L 969-4850
Internet ID: kalia@ca.ibm.com

----- Forwarded by Suman Kalia/Toronto/IBM on 02/14/2008 02:47 PM -----

Ian W Parkinson <PARKIW@uk.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
02/14/2008 01:34 PM
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] DFDL: Minutes from OGF WG call, 13 Feb 2008

Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
17:00 GMT, 13 Feb 2008

Attendees
  Mike Beckerle (Oco)
  Steve Hanson (IBM)
  Suman Kalia (IBM)
  Simon Parker (PolarLake)
  Ian Parkinson (IBM)

Apologies
  Alan Powell (IBM)

1. Specification Draft Status

Alan has distributed draft 31 of the DFDL specification. The meeting reviewed the plan for the next few versions of the drafts and updated it to reflect progress and revised target dates - a copy of the updated plan is attached to these minutes. Simon's UML diagrams describing the schema components are now planned for inclusion in draft 32 but require further discussion.
This discussion will be added to the agenda for the DFDL-WG call on 27th February, but Simon would appreciate comments via e-mail before then. These diagrams are intended to set a conceptual model for DFDL and to show where annotations may be attached, but will not be used more formally, e.g. to automatically generate APIs.

The work on nulls/defaults/optionals is complete, except for some small details, which will be included in draft 32. The 'valueCalc' work has progressed but is not complete, and is also now targeted for draft 32. Other items originally planned for draft 31 are complete and have been included in the draft.

2. Assertions, Discriminators and Choice

Suman has distributed an example showing the use of discriminators in choice constructs in DFDL schemas. The meeting discussed the distinction between discriminators and assertions. Mike described an assertion as simply a predicate which, if encountered within a choice, can cause backtracking. In contrast, a successful discriminator expression would lock the choice into a particular branch. If no discriminator matches, then the parse would fail - unless, as Simon pointed out, the choice itself was optional. Simon also suggested that the last branch of a choice could be left without a discriminator to act as a catch-all, but felt that the purpose of a discriminator should be more to help disambiguate between the possible branches than to form such a "switch" construct.

Steve asked about the timing attribute on assertions, and in particular whether we'd need a similar attribute for discriminators. Mike suggested that the timing attribute might have been included simply to make implementation easier, as without it an implementation would need to perform a significant amount of static analysis. The motivation for discriminators was to allow a choice to be resolved by data encountered before the choice.
However, Steve and Suman thought there were use cases where a discriminator might need to refer to elements inside the choice, and so a timing option would be useful. Suman will prepare such an example, and Mike will schedule a further discussion of this topic.

Simon distributed an alternative example, showing the use of fixed fields instead of discriminators or assertions, which he felt might form a useful starting point for a full description of choice disambiguation.

(Steve left the meeting)

3. Presentation for next OGF conference

Simon suggested that Mike highlight the recent discussion topics of the working group, and items which have recently been added to the specification, and asked whether it would be useful to include the UML diagrams. Mike would like to display the diagrams and see whether they trigger a discussion amongst the delegates. He would also like to present work on variable markup and valueCalc.

Meeting closed, 18:00 GMT

Attachment: revised plan for specification drafts

Draft 31:
  Improve (finish?) nulls/defaults/optionals (Mike, with input from Steve) - Done, apart from minor edit task
  Expression language (Alan) - Done
  Property precedence for parsing (Steve) - Done
  Entities, including basic white space (Alan) - Done

Draft 32 ("vX+2"):
  valueCalc (Mike) - Feb 27, 2008
  Remaining aspects of null/default/optionals (Alan) - Mar 5, 2008
  2-level description of schema components, including UML (Simon) - Feb 27, 2008
  Property precedence for writing (Steve) - Feb 15, 2008
  Variable markup (Steve) - Feb 29, 2008
  Regular expressions for lengths (Alan)
  Bring supplements up-to-date (Steve) - Mar 7, 2008
  Assertions, discriminators and choice, including discussion of timing option (Suman) - Feb 19, 2008
  How speculative parsing works (combining choice and variable-occurrence - currently these are separate) (TX person)
  Reordering the properties discussion: move representation earlier, improve flow of topics (Alan)

Draft 33 ("vX+3"):
  Escape schemes (Ian P) - Mar 21, 2008
  String XML type (Ian P) - Mar 21, 2008
  Variables (Mike)
  Selectors (Suman) - Mar 3, 2008
  Improvements on property descriptions (All - split TBD)
  Envelopes and Payloads (Steve) - Mar 5, 2008

Extraneous to spec:
  Develop Schema for DFDL xsd (Suman) - Mar 15, 2008
  Develop Schema for Schema DFDL Subset xsd (Suman) - Mar 30, 2008 (might not be needed)

Ian Parkinson
WebSphere ESB Development
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

Hi Suman,

The DFDL expression language does not permit forward references when parsing. See Alan's document which has been rolled into draft 031. You'll have to push the discriminator down inside the group.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Suman Kalia <kalia@ca.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
14/02/2008 20:04
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
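
The rule Steve describes in this reply can be illustrated with a small sketch. It assumes a parser that records each value as it is parsed and lets expressions look up only those recorded values. The names `ParseState`, `ForwardReferenceError`, `discriminator_on_branch` and `discriminator_pushed_down` are hypothetical and do not come from any DFDL draft; the sketch only shows why a test on Addr_rec has to be evaluated on (or after) Addr_rec itself.

```python
class ForwardReferenceError(Exception):
    """Raised when an expression refers to an element that is not parsed yet."""


class ParseState:
    """Hypothetical parse state: a cursor plus the values parsed so far."""

    def __init__(self, data: str):
        self.data = data
        self.pos = 0
        self.parsed = {}  # element name -> value, filled in document order

    def parse_fixed(self, name: str, length: int) -> str:
        """Consume `length` characters and record them under `name`."""
        value = self.data[self.pos:self.pos + length]
        if len(value) < length:
            raise ValueError(f"not enough data for {name}")
        self.pos += length
        self.parsed[name] = value
        return value

    def value_of(self, name: str) -> str:
        """Look up an element for an expression; only backward references work."""
        if name not in self.parsed:
            raise ForwardReferenceError(f"{name} has not been parsed yet")
        return self.parsed[name]


def discriminator_on_branch(state: ParseState) -> bool:
    """Test placed on AddressRecord: Addr_rec is still ahead of the cursor."""
    return state.value_of("Addr_rec") == "addr"


def discriminator_pushed_down(state: ParseState) -> bool:
    """Test placed on Addr_rec: evaluated right after the field is parsed."""
    state.parse_fixed("Addr_rec", 4)
    return state.value_of("Addr_rec") == "addr"
```

Calling `discriminator_on_branch` on a fresh state raises `ForwardReferenceError`, while `discriminator_pushed_down` parses the four-character field first and then evaluates the same comparison as a backward reference.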

Steve - here are some of my thoughts and the rationale for a forward reference in this particular case.

For the example I put together, if I push the discriminator from AddressRecord down to Addr-rec and from NameRecord down to Name-rec, which are inside their respective complex types, the model will look somewhat awkward. It will be difficult for the user to understand how the choices are being resolved, bearing in mind that the complex type containing Addr-rec and Name-rec could potentially be in a different XSD file.

From an implementation perspective, I am specifically restricting the forward reference to the very first token; maybe we can make that more explicit. Note that this forward reference is required only when the choice branch element is of complex type: it adds an entry in the XSD model, but the data in the bitstream actually corresponds to the very first element of simple type within the complex element. From a parsing perspective, the cursor is positioned essentially at the data in the bitstream for which we are evaluating the branch condition.

One can argue: what if the first element of the complex type is itself a complex element (e.g. USAddress in the example below)? Then the forward reference becomes multiple tokens. Even in this case the cursor would still be positioned correctly in the bitstream; it only changes the logical model.

  05 AddressRecord redefines common-rec.     reference becomes AddressRecord\USAddress\Addr-rec
     07 USAddress.
        10 Addr-rec pic x(4).
        10 Street   pic x(10).
        10 Pin      pic x(6).
        10 filler   pic x(70).

From a purely technical standpoint, and with a strict restriction that forward references are forbidden/invalid, we could move the discriminator annotations to the respective elements whose values need to be checked, but at the cost of making the model look somewhat unnatural. Maybe we want to go with the strict approach first and relax the restriction later on a case-by-case basis.
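
The cursor argument above can be sketched as follows. This is only an illustration under stated assumptions: the tuple-based layout, `first_token` and `branch_matches` are hypothetical helpers, and group items are assumed to occupy no bytes of their own, so wrapping Addr-rec in USAddress lengthens the logical path but does not move the data.

```python
# A field is (name, length); a group is (name, [children]).
# Lengths are taken from the PIC clauses in the COBOL fragment.
FLAT_ADDRESS_RECORD = ("AddressRecord", [
    ("Addr-rec", 4), ("Street", 10), ("Pin", 6), ("filler", 70),
])

NESTED_ADDRESS_RECORD = ("AddressRecord", [
    ("USAddress", [
        ("Addr-rec", 4), ("Street", 10), ("Pin", 6), ("filler", 70),
    ]),
])


def first_token(node, offset=0, path=()):
    """Return (path, offset, length) of the first simple field under `node`."""
    name, body = node
    path = path + (name,)
    if isinstance(body, int):
        return path, offset, body
    # Assumption: a group adds a level to the path but no bytes to the data.
    return first_token(body[0], offset, path)


def branch_matches(record, node, expected):
    """Compare the first simple field of a branch with the expected tag."""
    _, offset, length = first_token(node)
    return record[offset:offset + length] == expected


if __name__ == "__main__":
    for layout in (FLAT_ADDRESS_RECORD, NESTED_ADDRESS_RECORD):
        path, offset, length = first_token(layout)
        print("\\".join(path), "-> offset", offset, "length", length)
```

Both layouts resolve to offset 0 and length 4; only the path differs (`AddressRecord\Addr-rec` versus `AddressRecord\USAddress\Addr-rec`).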
--------------------------------------------------------------------------------

This is how the XSD will look if the discriminator annotations are pushed down. I am not sure whether it makes the parsing any easier.

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://www.example.org/choiceDiscriminators"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://dataformat.org/"
            xmlns="http://www.example.org/choiceDiscriminators">

  <!--
    01 Multi-Record.
       05 common-rec Pic x(100).
       05 AddressRecord redefines common-rec.
          10 Addr-rec pic x(4).
          10 Street pic x(10).
          10 Pin pic x(6).
          10 filler pic x(70).
       05 NameRecord redefines common-rec.
          10 Name-rec pic x(4).
          10 first-name pic x(10).
          10 last-name pic x(10).
          10 filler pic x(76).
  -->

  <!-- Case: Where the branch choice condition is specified inside the choice branches -->
  <!-- Schema definition for the COBOL copy generated from Message broker toolkit -->

  <xsd:element name="msg_MultiRecord" type="MultiRecord" />

  <xsd:complexType name="MultiRecord">
    <xsd:group ref="RedefinedElement_multirecord_common__rec_addressrecord_namerecord" />
  </xsd:complexType>

  <xsd:group name="RedefinedElement_multirecord_common__rec_addressrecord_namerecord">
    <xsd:choice>
      <xsd:element name="common_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- discriminator annotation on choice branch (in this case element
                 declaration) defines the condition value to be checked for pinning
                 down the branch. The condition could be a complex XPATH expression
                 using XPATH functions to check for a particular pattern of data
                 (e.g. data starting with xyz, etc.). -->
            <dfdl:discriminator test=".='oth'" /> <!-- identifies common_rec -->
            <!-- Note: COBOL importer can provide an option to derive discriminator
                 condition from the VALUE clause and/or 88 level clause specified
                 on the element. -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="100" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="AddressRecord" type="multirecord_addressrecord" />
      <xsd:element name="NameRecord" type="multirecord_namerecord" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="multirecord_namerecord">
    <xsd:group ref="multirecord_namerecord" />
  </xsd:complexType>

  <xsd:complexType name="multirecord_addressrecord">
    <xsd:group ref="multirecord_addressrecord" />
  </xsd:complexType>

  <xsd:group name="multirecord_addressrecord">
    <xsd:sequence>
      <xsd:element name="Addr_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- XPATH expression defined for discriminator annotation to pin down
                 a choice branch could be a forward or backward path reference.
                 Forward path reference is required when the condition to be checked
                 is inside the choice branch.
                 Note: Implementations may want to restrict the forward path
                 reference to checking the very first token within the choice
                 branch contents. -->
            <dfdl:discriminator test=".='addr'" /> <!-- identifies address record -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="4" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Street">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Pin">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="6" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="fill_0">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:group>

  <xsd:group name="multirecord_namerecord">
    <xsd:sequence>
      <xsd:element name="Name_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- Forward path reference to Name_rec -->
            <dfdl:discriminator test=".='name'" /> <!-- identifies name record -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="4" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="first_name">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="last_name">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="fill_1">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="76" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:group>

</xsd:schema>

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel: 905-413-3923 T/L 969-3923
Fax: 905-413-4850 T/L 969-4850
Internet ID: kalia@ca.ibm.com

Steve Hanson <smh@uk.ibm.com>
02/14/2008 03:22 PM
To: Suman Kalia/Toronto/IBM@IBMCA
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
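
One way to read the pushed-down schema in this message is as a speculative parse, sketched here under several assumptions. Field lengths come from the PIC clauses in the COBOL comment. The common_rec branch is left out because evaluating `.='oth'` against a 100-character field depends on trimming rules the thread does not state. The "lock in after a true discriminator" behaviour follows the description in the minutes. `BRANCHES`, `DISCRIMINATORS`, `ParseError` and `parse_choice` are hypothetical names.

```python
class ParseError(Exception):
    """Raised when a branch, or the whole choice, cannot be parsed."""


# Branches are tried in schema order; each lists its fixed-length fields.
BRANCHES = {
    "AddressRecord": [("Addr_rec", 4), ("Street", 10), ("Pin", 6), ("fill_0", 70)],
    "NameRecord": [("Name_rec", 4), ("first_name", 10), ("last_name", 10), ("fill_1", 76)],
}

# Discriminators pushed down onto the first field of each branch.
DISCRIMINATORS = {"Addr_rec": "addr", "Name_rec": "name"}


def parse_choice(data, start=0):
    """Return (branch name, field values, end position) for the matching branch."""
    for branch, fields in BRANCHES.items():
        pos = start          # backtracking: every branch restarts at the same cursor
        values = {}
        committed = False    # becomes True once a discriminator evaluates to true
        try:
            for name, length in fields:
                chunk = data[pos:pos + length]
                if len(chunk) < length:
                    raise ParseError(f"short data in {name}")
                pos += length
                values[name] = chunk
                # The discriminator is evaluated after its field has been parsed.
                if name in DISCRIMINATORS:
                    if chunk != DISCRIMINATORS[name]:
                        raise ParseError(f"discriminator on {name} is false")
                    committed = True
        except ParseError:
            if committed:
                raise        # the choice is locked to this branch: no backtracking
            continue         # not committed yet: try the next branch
        return branch, values, pos
    raise ParseError("no branch of the choice matched")
```

A 100-character record beginning with `name` falls through AddressRecord (its discriminator is false) and resolves to NameRecord. A record beginning with `addr` that is too short fails outright instead of being retried as a NameRecord, because the discriminator has already locked the choice.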

Hi Suman,

As discussed on the call today, let's go with the strict approach for DFDL 1.0, as it is still capable of expressing what we want. We can factor usability improvements into subsequent releases.

For the record, we also recalled why there is no timing attribute on dfdl:discriminator. It is always evaluated after the field has been parsed, because any failure parsing the field implies the wrong branch of the choice, in which case the discriminator could never have evaluated to true. There is nothing to be gained by evaluating it before the field is parsed.

After the call Ian and I discussed the use of a discriminator when multiple nested choices are involved. I submit that this does not cause an issue, because when a discriminator evaluates to true it must by definition mean that all branches within which it is nested are resolved.

Draft 031, I think, only mentions the use of discriminators with choices. They are equally applicable at other points of uncertainty: optional fields and unbounded repeats.

As part of your action, please could you revise the wording of the current section on discriminators so that it covers all the points above?

Thanks, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Suman Kalia <kalia@ca.ibm.com>
14/02/2008 21:37
To: Steve Hanson/UK/IBM@IBMGB
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008

Steve - here are some of my thoughts and the rationale for a forward reference in this particular case.

For the example I put together, if I push the discriminator from AddressRecord down to Addr-rec and from NameRecord down to Name-rec, which are inside their respective complex types, the model will look somewhat awkward. It will be difficult for the user to understand how the choices are being resolved, bearing in mind that the complex type containing Addr-rec and Name-rec could potentially be in a different XSD file.

From an implementation perspective, I am specifically restricting the forward reference to the very first token; maybe we can make that more explicit. Note that this forward reference is required only when the choice branch element is of complex type: it adds an entry in the XSD model, but the data in the bitstream actually corresponds to the very first element of simple type within the complex element. From a parsing perspective, the cursor is positioned essentially at the data in the bitstream for which we are evaluating the branch condition.

One can argue: what if the first element of the complex type is itself a complex element (e.g. USAddress in the example below)? Then the forward reference becomes multiple tokens. Even in this case the cursor would still be positioned correctly in the bitstream; it only changes the logical model.

  05 AddressRecord redefines common-rec.     reference becomes AddressRecord\USAddress\Addr-rec
     07 USAddress.
        10 Addr-rec pic x(4).
        10 Street   pic x(10).
        10 Pin      pic x(6).
        10 filler   pic x(70).

From a purely technical standpoint, and with a strict restriction that forward references are forbidden/invalid, we could move the discriminator annotations to the respective elements whose values need to be checked, but at the cost of making the model look somewhat unnatural. Maybe we want to go with the strict approach first and relax the restriction later on a case-by-case basis.
------------------------------------------------------------------------

This is how the XSD will look if the discriminator annotations are pushed down. Not sure if it makes the parsing any easier.

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema targetNamespace="http://www.example.org/choiceDiscriminators"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://dataformat.org/"
            xmlns="http://www.example.org/choiceDiscriminators">

  <!--
    01 Multi-Record.
       05 common-rec Pic x(100).
       05 AddressRecord redefines common-rec.
          10 Addr-rec pic x(4).
          10 Street pic x(10).
          10 Pin pic x(6).
          10 filler pic x(70).
       05 NameRecord redefines common-rec.
          10 Name-rec pic x(4).
          10 first-name pic x(10).
          10 last-name pic x(10).
          10 filler pic x(76).
  -->

  <!-- Case: Where the branch choice condition is specified inside the choice branches -->
  <!-- Schema definition for the COBOL copybook, generated from the Message Broker toolkit -->
  <xsd:element name="msg_MultiRecord" type="MultiRecord" />

  <xsd:complexType name="MultiRecord">
    <xsd:group ref="RedefinedElement_multirecord_common__rec_addressrecord_namerecord" />
  </xsd:complexType>

  <xsd:group name="RedefinedElement_multirecord_common__rec_addressrecord_namerecord">
    <xsd:choice>
      <xsd:element name="common_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- The discriminator annotation on a choice branch (in this case
                 an element declaration) defines the condition value to be
                 checked for pinning down the branch. The condition could be a
                 complex XPATH expression using XPATH functions to check for a
                 particular pattern of data (e.g. data starting with xyz). -->
            <dfdl:discriminator test=".='oth'" /> <!-- identifies common_rec -->
            <!-- Note: The COBOL importer can provide an option to derive the
                 discriminator condition from the VALUE clause and/or 88 level
                 clause specified on the element. -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="100" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="AddressRecord" type="multirecord_addressrecord" />
      <xsd:element name="NameRecord" type="multirecord_namerecord" />
    </xsd:choice>
  </xsd:group>

  <xsd:complexType name="multirecord_namerecord">
    <xsd:group ref="multirecord_namerecord" />
  </xsd:complexType>

  <xsd:complexType name="multirecord_addressrecord">
    <xsd:group ref="multirecord_addressrecord" />
  </xsd:complexType>

  <xsd:group name="multirecord_addressrecord">
    <xsd:sequence>
      <xsd:element name="Addr_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- The XPATH expression defined for a discriminator annotation to
                 pin down a choice branch could be a forward or backward path
                 reference. A forward path reference is required when the
                 condition to be checked is inside the choice branch.
                 Note: Implementations may want to restrict the forward path
                 reference to checking the very first token within the choice
                 branch contents. -->
            <dfdl:discriminator test=".='addr'" /> <!-- identifies address record -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="4" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Street">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="Pin">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="6" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="fill_0">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="70" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:group>

  <xsd:group name="multirecord_namerecord">
    <xsd:sequence>
      <xsd:element name="Name_rec">
        <xsd:annotation>
          <xsd:appinfo source="http://dataformat.org/">
            <!-- Forward path reference to Name_rec -->
            <dfdl:discriminator test=".='name'" /> <!-- identifies name record -->
          </xsd:appinfo>
        </xsd:annotation>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="4" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="first_name">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="last_name">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="10" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="fill_1">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:maxLength value="76" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:group>
</xsd:schema>

Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com

Steve Hanson <smh@uk.ibm.com>
02/14/2008 03:22 PM
To Suman Kalia/Toronto/IBM@IBMCA
cc dfdl-wg@ogf.org
Subject Re: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008

Hi Suman,

The DFDL expression language does not permit forward references when parsing. See Alan's document which has been rolled into draft 031. You'll have to push the discriminator down inside the group.

Regards, Steve

Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Suman Kalia <kalia@ca.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
14/02/2008 20:04
To dfdl-wg@ogf.org
cc
Subject [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008

I have reworked the example where the choice branch condition was specified outside the choice contents. We don't have to specifically mention the switch/case statement programming paradigm in DFDL; thinking more about this, I believe the switch/case statement paradigm is not relevant here. Attached is the reworked example. I also created another example by importing a COBOL copy book (containing a redefine clause) using the IBM message broker toolkit. Here the branch condition is described inside the choice. In this case the discriminator contains a forward reference to check the condition value; implementations may want to restrict the forward reference to the first token within the contents of the choice branch.

Note: Discriminators are useful when the user is dealing with binary or text data which is not tagged.

Your comments/suggestions are most welcome and appreciated.
Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com

----- Forwarded by Suman Kalia/Toronto/IBM on 02/14/2008 02:47 PM -----

Ian W Parkinson <PARKIW@uk.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
02/14/2008 01:34 PM
To dfdl-wg@ogf.org
cc
Subject [DFDL-WG] DFDL: Minutes from OGF WG call, 13 Feb 2008

Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
17:00 GMT, 13 Feb 2008

Attendees
  Mike Beckerle (Oco)
  Steve Hanson (IBM)
  Suman Kalia (IBM)
  Simon Parker (PolarLake)
  Ian Parkinson (IBM)

Apologies
  Alan Powell (IBM)

1. Specification Draft Status

Alan has distributed draft 31 of the DFDL specification. The meeting reviewed the plan for the next few versions of the drafts and updated it to reflect progress and revised target dates - a copy of the updated plan is attached to these minutes.

Simon's UML diagrams describing the schema components are now planned for inclusion in draft 32 but require further discussion. This discussion will be added to the agenda for the DFDL-WG call on 27th February, but Simon would appreciate comments via e-mail before then. These diagrams are intended to set a conceptual model for DFDL and to show where annotations may be attached, but will not be used more formally, e.g. to automatically generate APIs.

The work on nulls/defaults/optionals is complete, except for some small details, which will be included in draft 32. The 'valueCalc' work has been progressed but is not complete, and is also now targeted for draft 32. Other items originally planned for draft 31 are complete and have been included in the draft.

2. Assertions, Discriminators and Choice

Suman has distributed an example showing the use of discriminators in choice constructs in DFDL schemas.

The meeting discussed the distinction between discriminators and assertions - Mike described an assertion as simply a predicate which, if encountered within a choice, can cause backtracking. In contrast, a successful discriminator expression would lock the choice into a particular branch. If no discriminator matches, then the parse would fail - unless, as Simon pointed out, the choice itself was optional. Simon also suggested that the last branch of a choice could be left without a discriminator to act as a catch-all, but felt that the purpose of a discriminator should be more to help disambiguate between the possible branches rather than form such a "switch" construct.

Steve asked about the timing attribute on assertions, and in particular whether we'd need a similar attribute for discriminators. Mike suggested that the timing attribute might have been included simply to make implementation easier, as without it an implementation would need to perform a significant amount of static analysis. The motivation for discriminators was to allow a choice to be resolved by data encountered before the choice. However, Steve and Suman thought there were use cases where a discriminator might need to refer to elements inside the choice, and so a timing option would be useful. Suman will prepare such an example, and Mike will schedule a further discussion of this topic.

Simon distributed an alternative example, showing the use of fixed fields instead of discriminators or assertions, which he felt might form a useful starting point for a full description of choice disambiguation.

(Steve left the meeting)

3. Presentation for next OGF conference

Simon suggested that Mike highlight the recent discussion topics of the working group, and items which have recently been added to the specification, and asked whether it would be useful to include the UML diagrams. Mike would like to display the diagrams and see whether they trigger a discussion amongst the delegates. He would also like to present work on variable markup and valueCalc.

Meeting closed, 18:00 GMT

Attachment: revised plan for specification drafts

Draft 31:
  Improve (finish?) nulls/defaults/optionals (Mike, with input from Steve) - Done, apart from minor edit task
  Expression language (Alan) - Done
  Property precedence for parsing (Steve) - Done
  Entities, including basic white space (Alan) - Done

Draft 32 ("vX+2"):
  valueCalc (Mike) - Feb 27, 2008
  Remaining aspects of null/default/optionals (Alan) - Mar 5, 2008
  2-level description of schema components, including UML (Simon) - Feb 27, 2008
  Property precedence for writing (Steve) - Feb 15, 2008
  Variable markup (Steve) - Feb 29, 2008
  Regular expressions for lengths (Alan)
  Bring supplements up-to-date (Steve) - Mar 7, 2008
  Assertions, discriminators and choice, including discussion of timing option (Suman) - Feb 19, 2008
  How speculative parsing works (combining choice and variable-occurrence - currently these are separate) (TX person)
  Reordering the properties discussion: move representation earlier, improve flow of topics (Alan)

Draft 33 ("vX+3"):
  Escape schemes (Ian P) - Mar 21, 2008
  String XML type (Ian P) - Mar 21, 2008
  Variables (Mike)
  Selectors (Suman) - Mar 3, 2008
  Improvements on property descriptions (All - split TBD)
  Envelopes and Payloads (Steve) - Mar 5, 2008

Extraneous to spec:
  Develop Schema for DFDL xsd (Suman) - Mar 15, 2008
  Develop Schema for Schema DFDL Subset xsd (Suman) - Mar 30, 2008 (might not be needed)

Ian Parkinson
WebSphere ESB Development
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg
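
As a rough illustration of the pushed-down discriminators discussed in this thread, the sketch below (in Python, not DFDL) resolves the choice for the 100-character Multi-Record data by checking each branch's first 4-character field immediately after that field is parsed, so no forward reference is needed. The branch table, the function names and `ParseError` are hypothetical names invented for this sketch, the common_rec branch is left out for brevity, and nothing here is meant to describe how an actual DFDL processor is implemented.

```python
from typing import Dict, List, Optional, Tuple

# (field name, width in characters); widths follow the copybook in the schema.
Field = Tuple[str, int]

ADDRESS_RECORD: List[Field] = [("Addr_rec", 4), ("Street", 10), ("Pin", 6), ("fill_0", 70)]
NAME_RECORD: List[Field] = [("Name_rec", 4), ("first_name", 10), ("last_name", 10), ("fill_1", 76)]

# Hypothetical table: branch name -> (fields, element carrying the discriminator, expected value).
BRANCHES: Dict[str, Tuple[List[Field], str, str]] = {
    "AddressRecord": (ADDRESS_RECORD, "Addr_rec", "addr"),
    "NameRecord": (NAME_RECORD, "Name_rec", "name"),
}


class ParseError(Exception):
    """Raised when no branch of the choice can be selected."""


def parse_branch(data: str, fields: List[Field], disc_field: str,
                 expected: str) -> Optional[Dict[str, str]]:
    """Parse one branch left to right; return None if its discriminator is false."""
    result: Dict[str, str] = {}
    pos = 0
    for name, width in fields:
        value = data[pos:pos + width]
        pos += width
        # The discriminator is evaluated right after its own element is parsed,
        # so it only inspects data already consumed (no forward reference).
        if name == disc_field and value != expected:
            return None
        result[name] = value.rstrip()  # drop trailing padding for readability
    return result


def parse_choice(data: str) -> Tuple[str, Dict[str, str]]:
    """Try branches in schema order; the first true discriminator selects the branch."""
    for branch, (fields, disc_field, expected) in BRANCHES.items():
        parsed = parse_branch(data, fields, disc_field, expected)
        if parsed is not None:
            return branch, parsed
    # As described in the minutes: if no discriminator matches, the parse fails.
    raise ParseError("no discriminator matched; choice cannot be resolved")
```

For example, a record built as `"name" + "Suman".ljust(10) + "Kalia".ljust(10) + " " * 76` is rejected by the AddressRecord branch after its first four characters and then accepted by the NameRecord branch. A record whose first four characters are neither `addr` nor `name` raises `ParseError`, which corresponds to the "parse would fail" case when the choice is not optional.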
participants (2)
- Steve Hanson
- Suman Kalia