Steve - here are some of my thoughts on, and rationale for, the forward
reference in this particular case.
For the example I put together, if I push the discriminator from
AddressRecord down to Addr-rec, and from NameRecord down to Name-rec,
each of which sits inside its respective complex type, the model will
look somewhat awkward. It will be difficult for the user to understand
how the choices are being resolved, given that the complex types
containing Addr-rec and Name-rec could be in a different xsd file. From
an implementation perspective, I am specifically restricting the forward
reference to the very first token; maybe we can make that more explicit.
Note that this forward reference is required only when the choice branch
element is of complex type. The complex element adds an entry in the XSD
model, but the data in the bitstream actually corresponds to the very
first element of simple type within the complex element. From a parsing
perspective, the cursor is positioned pretty much at the data in the
bitstream for which we are evaluating the branch condition.
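To illustrate the point about cursor position, here is a minimal Python
sketch (not DFDL, and not a description of any real implementation). It
peeks at the first token at the current position and then parses the
chosen branch from that same position. The branch names and field widths
come from the copybook in this thread; BRANCHES, TAG_WIDTH, parse_record
and the decision to raise on an unknown tag are assumptions made purely
for illustration.

```python
# Illustrative only: field widths follow the copybook in this thread;
# BRANCHES, TAG_WIDTH and parse_record are hypothetical names.
TAG_WIDTH = 4  # width of Addr-rec / Name-rec, the first simple field

BRANCHES = {
    "addr": ("AddressRecord",
             [("Addr_rec", 4), ("Street", 10), ("Pin", 6), ("fill_0", 70)]),
    "name": ("NameRecord",
             [("Name_rec", 4), ("first_name", 10), ("last_name", 10),
              ("fill_1", 76)]),
}


def parse_record(data, pos=0):
    """Pick a branch from the first token at the cursor, then parse it."""
    tag = data[pos:pos + TAG_WIDTH]  # peek only; pos is not advanced here
    if tag not in BRANCHES:
        raise ValueError("no branch matches tag %r" % tag)
    branch, layout = BRANCHES[tag]
    fields = {}
    cursor = pos  # the branch starts exactly where the tag was read
    for name, width in layout:
        fields[name] = data[cursor:cursor + width]
        cursor += width
    return branch, fields


record = "addr" + "Main St".ljust(10) + "123456" + " " * 80
branch, fields = parse_record(record)
```

The value tested for the branch condition and the first field of the
branch are the same bytes, so no look-ahead beyond the first token is
needed in this sketch.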
One can argue: what if the first element of the complex type is itself a
complex element (e.g. USAddress in the example below)? Then the forward
reference becomes multiple tokens. Even in this case the cursor would
still be positioned correctly in the bitstream; only the logical model
changes.
05 AddressRecord redefines common-rec.    (reference becomes AddressRecord\USAddress\Addr-rec)
07 USAddress.
10 Addr-rec pic x(4).
10 Street pic x(10).
10 Pin pic x(6).
10 filler pic x(70).
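As a rough check of the claim that nesting changes only the logical
path and not the position in the data, the following Python sketch walks
a nested layout and reports each simple field's path and byte offset.
The layout mirrors the copybook fragment above; the function name leaves
and the list-of-pairs representation are hypothetical, chosen only for
this illustration.

```python
# Illustrative only: a complex element is a list of (name, child) pairs,
# a simple element is just its width in characters.
ADDRESS_RECORD = [
    ("USAddress", [
        ("Addr-rec", 4),
        ("Street", 10),
        ("Pin", 6),
        ("filler", 70),
    ]),
]


def leaves(name, node, base=0, prefix=""):
    """Return ([(path, offset, width), ...], next_offset) for all simple fields."""
    path = prefix + name
    if isinstance(node, int):  # simple field: node is its width
        return [(path, base, node)], base + node
    out = []
    for child_name, child in node:
        sub, base = leaves(child_name, child, base, path + "\\")
        out.extend(sub)
    return out, base


fields, total = leaves("AddressRecord", ADDRESS_RECORD)
first_path, first_offset, first_width = fields[0]
```

Here first_path is the three-token reference AddressRecord\USAddress\Addr-rec,
while first_offset is still 0: the extra level adds a token to the path
but does not move the data.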
From a purely technical standpoint, and under a strict rule that forward
references are forbidden/invalid, we could move the discriminator
annotations to the respective elements whose values need to be checked,
but at the cost of making the model look somewhat unnatural. Maybe we
want to go with the strict approach first and relax the restriction
later on a case-by-case basis.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
This is how the XSD will look if the discriminator annotations are pushed
down. I am not sure it makes the parsing any easier.
<?xml
version="1.0"
encoding="UTF-8"?>
<xsd:schema
targetNamespace="http://www.example.org/choiceDiscriminators"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:dfdl="http://dataformat.org/"
xmlns="http://www.example.org/choiceDiscriminators">
<!--
01 Multi-Record.
05 common-rec Pic x(100).
05 AddressRecord redefines common-rec.
10 Addr-rec
pic x(4).
10 Street
pic x(10).
10 Pin
pic x(6).
10 filler
pic x(70).
05 NameRecord redefines common-rec.
10 Name-rec
pic x(4).
10 first-name
pic x(10).
10 last-name
pic x(10).
10 filler
pic x(76).
-->
<!--
Case: Where the branch choice condition is specified inside the choice
branches -->
<!--
Schema definition for the COBOL copy generated from Message
broker toolkit -->
<xsd:element
name="msg_MultiRecord"
type="MultiRecord"
/>
<xsd:complexType
name="MultiRecord">
<xsd:group
ref="RedefinedElement_multirecord_common__rec_addressrecord_namerecord">
</xsd:group>
</xsd:complexType>
<xsd:group
name="RedefinedElement_multirecord_common__rec_addressrecord_namerecord">
<xsd:choice>
<xsd:element
name="common_rec">
<xsd:annotation>
<xsd:appinfo
source="http://dataformat.org/">
<!--
The discriminator annotation on a choice branch (in this case an
element declaration) defines the condition value to be checked for
pinning down the branch. The condition could be a complex XPath
expression using XPath functions to check for a particular pattern
of data (e.g. data starting with xyz).
-->
<dfdl:discriminator
test=".='oth'"
/>
<!--
identifies common_rec -->
<!--
Note: COBOL importer can provide an option to derive discriminator
condition from the VALUE clause and/or
88 level clause
specified on the element.
-->
</xsd:appinfo>
</xsd:annotation>
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="100"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="AddressRecord"
type="multirecord_addressrecord"></xsd:element>
<xsd:element
name="NameRecord"
type="multirecord_namerecord"></xsd:element>
</xsd:choice>
</xsd:group>
<xsd:complexType
name="multirecord_namerecord">
<xsd:group
ref="multirecord_namerecord"
/>
</xsd:complexType>
<xsd:complexType
name="multirecord_addressrecord">
<xsd:group
ref="multirecord_addressrecord"
/>
</xsd:complexType>
<xsd:group
name="multirecord_addressrecord">
<xsd:sequence>
<xsd:element
name="Addr_rec">
<xsd:annotation>
<xsd:appinfo
source="http://dataformat.org/">
<!--
XPATH expression defined for discriminator annotation to pin down
a choice branch could be a forward or backward path reference.
Forward path reference is required when the condition to be checked
is inside the choice branch.
Note: Implementations may want to restrict the forward path reference
to checking the very first token within the choice branch contents.
-->
<dfdl:discriminator
test=".='addr'"
/>
<!--
identifies address record -->
</xsd:appinfo>
</xsd:annotation>
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="4"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="Street">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="10"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="Pin">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="6"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="fill_0">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="70"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
</xsd:sequence>
</xsd:group>
<xsd:group
name="multirecord_namerecord">
<xsd:sequence>
<xsd:element
name="Name_rec">
<xsd:annotation>
<xsd:appinfo
source="http://dataformat.org/">
<!--
Forward path reference to Name_rec -->
<dfdl:discriminator
test=".='name'"
/>
<!--
identifies name record -->
</xsd:appinfo>
</xsd:annotation>
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="4"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="first_name">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="10"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="last_name">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="10"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element
name="fill_1">
<xsd:simpleType>
<xsd:restriction
base="xsd:string">
<xsd:maxLength
value="76"
/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
</xsd:sequence>
</xsd:group>
</xsd:schema>
Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com
Steve Hanson <smh@uk.ibm.com>
02/14/2008 03:22 PM
To: Suman Kalia/Toronto/IBM@IBMCA
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
Hi Suman,
The DFDL expression language does not permit forward references when parsing.
See Alan's document which has been rolled into draft 031. You'll have to
push the discriminator down inside the group.
Regards, Steve
Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
Suman Kalia <kalia@ca.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
14/02/2008 20:04
To: dfdl-wg@ogf.org
Subject: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
I have reworked the example where the choice branch condition was specified
outside the choice contents. We don't have to specifically mention the
switch/case statement programming paradigm in DFDL; thinking more about
this, I believe the switch/case paradigm is not relevant here. Attached
is the reworked example.
I also created another example by importing a COBOL copybook (containing
a REDEFINES clause) using the IBM Message Broker toolkit. Here the branch
condition is described inside the choice. In this case the discriminator
contains a forward reference to check the condition value; implementations
may want to restrict the forward reference to the first token within the
contents of the choice branch.
Note: Discriminators are useful when the user is dealing with binary or
text data which is not tagged.
Your comments/suggestions are most welcome and appreciated.
Suman Kalia
----- Forwarded by Suman Kalia/Toronto/IBM on 02/14/2008 02:47 PM -----
Ian W Parkinson <PARKIW@uk.ibm.com>
Sent by: dfdl-wg-bounces@ogf.org
02/14/2008 01:34 PM
To: dfdl-wg@ogf.org
Subject: [DFDL-WG] DFDL: Minutes from OGF WG call, 13 Feb 2008
Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
17:00 GMT, 13 Feb 2008
Attendees
Mike Beckerle (Oco)
Steve Hanson (IBM)
Suman Kalia (IBM)
Simon Parker (PolarLake)
Ian Parkinson (IBM)
Apologies
Alan Powell (IBM)
1. Specification Draft Status
Alan has distributed draft 31 of the DFDL specification. The meeting reviewed
the plan for the next few versions of the drafts and updated it to reflect
progress and revised target dates - a copy of the updated plan is attached
to these minutes.
- Simon's UML diagrams describing the schema components are now planned
for inclusion in draft 32 but require further discussion. This discussion
will be added to the agenda for the DFDL-WG call on 27th February, but
Simon would appreciate comments via e-mail before then. These diagrams
are intended to set a conceptual model for DFDL and to show where
annotations may be attached, but will not be used more formally, e.g. to
automatically generate APIs.
- The work on nulls/defaults/optionals is complete, except for some small
details, which will be included in draft 32.
- The 'valueCalc' work has been progressed but is not complete, and is
also now targeted for draft 32.
- Other items originally planned for draft 31 are complete and have been
included in the draft.
2. Assertions, Discriminators and Choice
Suman has distributed an example showing the use of discriminators in choice
constructs in DFDL schemas.
The meeting discussed the distinction between discriminators and assertions.
Mike described an assertion as simply a predicate which, if encountered
within a choice, can cause backtracking. In contrast, a successful
discriminator expression would lock the choice into a particular branch.
If no discriminator matches, then the parse would fail - unless, as Simon
pointed out, the choice itself was optional. Simon also suggested that the
last branch of a choice could be left without a discriminator to act as a
catch-all, but felt that the purpose of a discriminator should be more to
help disambiguate between the possible branches rather than to form such a
"switch" construct.
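The distinction discussed above can be sketched in a few lines of
Python. This is only one reading of the minutes, not the DFDL
specification: a failed assertion lets the parser backtrack to the next
branch, a true discriminator locks the choice into its branch so that
later failures are final, and a choice with no matching branch fails
unless it is optional. All names here (ParseError, resolve_choice,
parse_address, parse_other) are hypothetical.

```python
class ParseError(Exception):
    """Raised when a branch fails (for example, an assertion is false)."""


def resolve_choice(branches, data, optional=False):
    """branches: list of (name, discriminator_or_None, parse_function)."""
    for name, discriminator, parse in branches:
        if discriminator is not None:
            if not discriminator(data):
                continue  # discriminator is false: move on to the next branch
            # Discriminator is true: the choice is locked into this branch,
            # so a ParseError from parse() is not caught and ends the parse.
            return name, parse(data)
        try:
            return name, parse(data)
        except ParseError:
            continue  # failed assertion: backtrack and try the next branch
    if optional:
        return None  # an optional choice may legitimately match nothing
    raise ParseError("no branch of the choice matched")


def parse_address(data):
    if len(data) < 20:  # stands in for an assertion inside the branch
        raise ParseError("address record too short")
    return data[4:14].rstrip()


def parse_other(data):
    return data


branches = [
    ("AddressRecord", lambda d: d.startswith("addr"), parse_address),
    ("common_rec", None, parse_other),  # catch-all without a discriminator
]
```

With these branches, input beginning with "addr" is committed to
AddressRecord even if that branch later fails, whereas any other input
falls through to the catch-all, which is the "switch"-like usage Simon
cautioned against relying on.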
Steve asked about the timing attribute on assertions, and in particular
whether we'd need a similar attribute for discriminators. Mike suggested
that the timing attribute might have been included simply to make
implementation easier, as without it an implementation would need to
perform a significant amount of static analysis. The motivation for
discriminators was to allow a choice to be resolved by data encountered
before the choice. However, Steve and Suman thought there were use cases
where a discriminator might need to refer to elements inside the choice,
and so a timing option would be useful. Suman will prepare such an
example, and Mike will schedule a further discussion of this topic.
Simon distributed an alternative example, showing the use of fixed fields
instead of discriminators or assertions, which he felt might form a useful
starting point for a full description of choice disambiguation.
(Steve left the meeting)
3. Presentation for next OGF conference
Simon suggested that Mike highlight the recent discussion topics of the
working group, and items which have recently been added to the specification,
and asked whether it would be useful to include the UML diagrams. Mike
would like to display the diagrams and see whether they trigger a discussion
amongst the delegates. He would also like to present work on variable markup
and valueCalc.
Meeting closed, 18:00 GMT
Attachment: revised plan for specification drafts
Draft 31:
- Improve (finish?) nulls/defaults/optionals (Mike, with input from Steve) - Done, apart from minor edit task
- Expression language (Alan) - Done
- Property precedence for parsing (Steve) - Done
- Entities, including basic white space (Alan) - Done
Draft 32 ("vX+2"):
- valueCalc (Mike) - Feb 27, 2008
- Remaining aspects of null/default/optionals (Alan) - Mar 5, 2008
- 2-level description of schema components, including UML (Simon) - Feb 27, 2008
- Property precedence for writing (Steve) - Feb 15, 2008
- Variable markup (Steve) - Feb 29, 2008
- Regular expressions for lengths (Alan)
- Bring supplements up-to-date (Steve) - Mar 7, 2008
- Assertions, discriminators and choice, including discussion of timing option (Suman) - Feb 19, 2008
- How speculative parsing works (combining choice and variable-occurrence - currently these are separate) (TX person)
- Reordering the properties discussion: move representation earlier, improve flow of topics (Alan)
Draft 33 ("vX+3"):
- Escape schemes (Ian P) - Mar 21, 2008
- String XML type (Ian P) - Mar 21, 2008
- Variables (Mike)
- Selectors (Suman) - Mar 3, 2008
- Improvements on property descriptions (All - split TBD)
- Envelopes and Payloads (Steve) - Mar 5, 2008
Extraneous to spec:
- Develop Schema for DFDL xsd (Suman) - Mar 15, 2008
- Develop Schema for Schema DFDL Subset xsd (Suman) - Mar 30, 2008 (might not be needed)
Ian Parkinson
WebSphere ESB Development
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg