Steve - here are some of my thoughts on, and rationale for, using a forward
reference in this particular case.
For the example I put together, if I push the discriminator down from
AddressRecord to Addr-rec and from NameRecord to Name-rec, which sit inside
their respective complex types, the model will look somewhat awkward. It
will be difficult for the user to understand how the choice is being
resolved, bearing in mind that the complex types containing Addr-rec and
Name-rec could be defined in a different xsd file. From an implementation
perspective, I am specifically restricting the forward reference to the
very first token; maybe we can make that more explicit.
Note that this forward reference is required only when the choice branch
element is of complex type. The complex type adds an entry to the XSD model,
but the data in the bitstream actually corresponds to the very first element
of simple type within the complex element. From a parsing perspective, the
cursor is already positioned at the data in the bitstream for which we are
evaluating the branch condition.
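To make the cursor argument concrete, here is a minimal Python sketch of resolving the choice by peeking at the first token without consuming it. The tag values "ADDR" and "NAME" and the function name resolve_branch are assumptions made for illustration; the thread does not say what Addr-rec or Name-rec actually contain.

```python
# Hypothetical tag values: the thread never states what Addr-rec or
# Name-rec actually contain.
BRANCH_TAGS = {
    b"ADDR": "AddressRecord",
    b"NAME": "NameRecord",
}
TAG_LENGTH = 4  # Addr-rec and Name-rec are both pic x(4)


def resolve_branch(data: bytes, cursor: int) -> str:
    """Pick a choice branch by peeking at the first token at the cursor.

    The cursor is not advanced: the check only looks at the bytes that the
    first simple element of the chosen branch will consume later.
    """
    token = data[cursor:cursor + TAG_LENGTH]
    if token not in BRANCH_TAGS:
        raise ValueError(f"no choice branch matches token {token!r}")
    return BRANCH_TAGS[token]


# A 100-byte record laid out like NameRecord (4 + 10 + 10 + 76 bytes).
record = b"NAME" + b"John".ljust(10) + b"Smith".ljust(10) + b" " * 76
print(resolve_branch(record, 0))  # prints NameRecord
```

Because the branch element itself occupies no bytes, the peek reads exactly the bytes that the branch's first simple field starts with.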
One can argue: what if the first element of the complex type is itself a
complex element (e.g. USAddress in the example below)? The forward
reference then becomes a path of multiple tokens. Even in this case the
cursor would still be positioned correctly in the bitstream; only the
logical model changes.
05 AddressRecord redefines common-rec.
   07 USAddress.
      10 Addr-rec pic x(4).
      10 Street   pic x(10).
      10 Pin      pic x(6).
      10 filler   pic x(70).
Here the reference becomes AddressRecord\USAddress\Addr-rec.
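As a rough illustration of why nesting changes only the logical path, the Python sketch below walks a nested structure down to its first simple field. The tuple representation and the name first_leaf are hypothetical and not part of any DFDL or Message Broker API.

```python
# Hypothetical in-memory form of the copybook fragment: a group is
# (name, [children]); a simple field is (name, length_in_bytes).
ADDRESS_RECORD = (
    "AddressRecord",
    [("USAddress", [("Addr-rec", 4), ("Street", 10), ("Pin", 6), ("filler", 70)])],
)


def first_leaf(node, path=()):
    """Return (path, length) of the first simple field under node.

    Group items contribute no bytes of their own, so this field starts at
    the same bitstream offset as the outermost group.
    """
    name, body = node
    path = path + (name,)
    if isinstance(body, int):
        return path, body
    return first_leaf(body[0], path)


path, length = first_leaf(ADDRESS_RECORD)
print("\\".join(path), length)  # prints AddressRecord\USAddress\Addr-rec 4
```

However many group levels are added, the walk ends at a field that begins where the branch begins, so the parser's cursor does not move.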
From a purely technical standpoint, under a strict rule that forward
references are forbidden, we could move the discriminator annotations onto
the elements whose values need to be checked, but at the cost of making the
model look somewhat unnatural. Maybe we want to go with the strict approach
first and relax the restriction later on a case-by-case basis.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
This is how the XSD will look if the discriminator annotations are pushed
down. I am not sure it makes the parsing any easier.
<?xml version="1.0" encoding="UTF-8"?>
http://www.example.org/choiceDiscriminators"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:dfdl=
"http://dataformat.org/"
xmlns="http://www.example.org/choiceDiscriminators">
<!--
01 Multi-Record.
   05 common-rec Pic x(100).
   05 AddressRecord redefines common-rec.
      10 Addr-rec   pic x(4).
      10 Street     pic x(10).
      10 Pin        pic x(6).
      10 filler     pic x(70).
   05 NameRecord redefines common-rec.
      10 Name-rec   pic x(4).
      10 first-name pic x(10).
      10 last-name  pic x(10).
      10 filler     pic x(76).
-->
<!-- Case: Where the branch choice condition is specified inside the choice branches -->
<!-- Schema definition for the COBOL copy generated from Message broker toolkit -->
xsd:choice
xsd:annotation
http://dataformat.org/">
<!--
  The discriminator annotation on a choice branch (in this case an element
  declaration) defines the condition value to be checked for pinning down
  the branch. The condition could be a complex XPath expression that uses
  XPath functions to check for a particular pattern of data (e.g. data
  starting with xyz).
-->
<!-- identifies common_rec -->
<!-- Note: COBOL importer can provide an option to derive discriminator
     condition from the VALUE clause and/or 88 level clause specified on
     the element. -->
xsd:simpleType
xsd:sequence
xsd:annotation
http://dataformat.org/">
<!--
  XPATH expression defined for discriminator annotation to pin down a
  choice branch could be a forward or backward path reference. Forward
  path reference is required when the condition to be checked is inside
  the choice branch.
  Note: Implementations may want to restrict the forward path reference to
  checking the very first token within the choice branch contents.
-->
<!-- identifies name record -->
xsd:simpleType
xsd:simpleType
xsd:simpleType
xsd:simpleType
xsd:sequence
xsd:annotation
http://dataformat.org/">
<!-- Forward path reference to Name_rec -->
<!-- identifies name record -->
xsd:simpleType
xsd:simpleType
xsd:simpleType
xsd:simpleType
Suman Kalia
IBM Toronto Lab
WebSphere Business Integration Application Connectivity Tools
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com
Steve Hanson
02/14/2008 03:22 PM
To: Suman Kalia/Toronto/IBM@IBMCA
cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
Hi Suman,
The DFDL expression language does not permit forward references when
parsing. See Alan's document which has been rolled into draft 031. You'll
have to push the discriminator down inside the group.
Regards, Steve
Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
Suman Kalia
Sent by: dfdl-wg-bounces@ogf.org
14/02/2008 20:04
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Fw: DFDL: Minutes from OGF WG call, 13 Feb 2008
I have reworked the example where the choice branch condition was
specified outside the choice contents. We don't have to mention the
switch/case programming paradigm specifically in DFDL; thinking more about
this, I believe that paradigm is not relevant here.
Attached is the reworked example.
I also created another example by importing a COBOL copybook (containing a
REDEFINES clause) using the IBM Message Broker Toolkit. Here the branch
condition is described inside the choice. In this case the discriminator
contains a forward reference to check the condition value; implementations
may want to restrict the forward reference to the first token within the
contents of the choice branch.
Note: discriminators are useful when the user is dealing with binary or
text data which is not tagged.
Your comments and suggestions are most welcome and appreciated.
Suman Kalia
----- Forwarded by Suman Kalia/Toronto/IBM on 02/14/2008 02:47 PM -----
Ian W Parkinson
Sent by: dfdl-wg-bounces@ogf.org
02/14/2008 01:34 PM
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] DFDL: Minutes from OGF WG call, 13 Feb 2008
Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
17:00 GMT, 13 Feb 2008
Attendees
Mike Beckerle (Oco)
Steve Hanson (IBM)
Suman Kalia (IBM)
Simon Parker (PolarLake)
Ian Parkinson (IBM)
Apologies
Alan Powell (IBM)
1. Specification Draft Status
Alan has distributed draft 31 of the DFDL specification. The meeting
reviewed the plan for the next few versions of the drafts and updated it
to reflect progress and revised target dates - a copy of the updated plan
is attached to these minutes.
Simon's UML diagrams describing the schema components are now planned for
inclusion in draft 32 but require further discussion. This discussion will
be added to the agenda for the DFDL-WG call on 27th February, but Simon
would appreciate comments via e-mail before then. These diagrams are
intended to set a conceptual model for DFDL and to show where annotations
may be attached, but will not be used more formally, e.g. to
automatically generate APIs.
The work on nulls/defaults/optionals is complete, except for some small
details, which will be included in draft 32.
The 'valueCalc' work has been progressed but is not complete, and is also
now targeted for draft 32.
Other items originally planned for draft 31 are complete and have been
included in the draft.
2. Assertions, Discriminators and Choice
Suman has distributed an example showing the use of discriminators in
choice constructs in DFDL schemas.
The meeting discussed the distinction between discriminators and
assertions - Mike described an assertion as simply a predicate which, if
encountered within a choice, can cause backtracking. In contrast, a
successful discriminator expression would lock the choice into a particular
branch. If no discriminator matches, then the parse would fail - unless,
as Simon pointed out, the choice itself was optional. Simon also suggested
that the last branch of a choice could be left without a discriminator to
act as a catch-all, but felt that the purpose of a discriminator should be
more to help disambiguate between the possible branches rather than form
such a "switch" construct.
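The distinction described above can be sketched in Python. This is only one reading of the minutes, not behaviour taken from the DFDL specification; the names parse_choice and ParseError, the branch tuples, and the "ADDR" tag value are all hypothetical.

```python
class ParseError(Exception):
    """Raised when a branch, or an assertion inside it, fails."""


def parse_choice(data, branches, optional=False):
    """Resolve a choice over branches of (name, discriminator, parse_body).

    A discriminator that evaluates true locks the choice into its branch,
    so a later failure propagates instead of trying other branches. A
    branch with no discriminator is tried speculatively; a ParseError
    raised inside it backtracks to the next branch.
    """
    for name, discriminator, parse_body in branches:
        if discriminator is not None:
            if not discriminator(data):
                continue
            return name, parse_body(data)  # locked in: errors propagate
        try:
            return name, parse_body(data)
        except ParseError:
            continue  # failed assertion: backtrack to the next branch
    if optional:
        return None
    raise ParseError("no branch of the choice matched")


def parse_address(data):
    return {"street": data[4:14].strip()}


def parse_catch_all(data):
    return {"raw": data}


branches = [
    ("AddressRecord", lambda d: d.startswith("ADDR"), parse_address),
    ("Unknown", None, parse_catch_all),  # last branch as a catch-all
]
print(parse_choice("ADDR" + "1 Main St".ljust(10), branches))
print(parse_choice("????", branches))
```

The second call falls through to the branch without a discriminator, which is the catch-all arrangement Simon mentioned; dropping that branch makes the same input fail unless optional=True is passed.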
Steve asked about the timing attribute on assertions, and in particular
whether we'd need a similar attribute for discriminators. Mike suggested
that the timing attribute might have been included simply to make
implementation easier, as without it an implementation would need to
perform a significant amount of static analysis. The motivation for
discriminators was to allow a choice to be resolved by data encountered
before the choice. However, Steve and Suman thought there were use cases
where a discriminator might need to refer to elements inside the choice,
and so a timing option would be useful. Suman will prepare such an
example, and Mike will schedule a further discussion of this topic.
Simon distributed an alternative example, showing the use of fixed fields
instead of discriminators or assertions, which he felt might form a
useful starting point for a full description of choice disambiguation.
(Steve left the meeting)
3. Presentation for next OGF conference
Simon suggested that Mike highlight the recent discussion topics of the
working group, and items which have recently been added to the
specification, and asked whether it would be useful to include the UML
diagrams. Mike would like to display the diagrams and see whether they
trigger a discussion amongst the delegates. He would also like to present
work on variable markup and valueCalc.
Meeting closed, 18:00 GMT
Attachment: revised plan for specification drafts
Draft 31:
Improve (finish?) nulls/defaults/optionals (Mike, with input from Steve) -
Done, apart from minor edit task
Expression language (Alan) - Done
Property precedence for parsing (Steve) - Done
Entities, including basic white space (Alan) - Done
Draft 32 ("vX+2"):
valueCalc (Mike) - Feb 27, 2008
Remaining aspects of null/default/optionals (Alan) - Mar 5, 2008
2-level description of schema components, including UML (Simon) - Feb 27,
2008
Property precedence for writing (Steve) - Feb 15, 2008
Variable markup (Steve) - Feb 29, 2008
Regular expressions for lengths (Alan)
Bring supplements up-to-date (Steve) - Mar 7, 2008
Assertions, discriminators and choice, including discussion of timing
option (Suman) - Feb 19, 2008
How speculative parsing works (combining choice and variable-occurrence -
currently these are separate) (TX person)
Reordering the properties discussion: move representation earlier, improve
flow of topics (Alan)
Draft 33 ("vX+3"):
Escape schemes (Ian P) - Mar 21, 2008
String XML type (Ian P) - Mar 21, 2008
Variables (Mike)
Selectors (Suman) - Mar 3, 2008
Improvements on property descriptions (All - split TBD)
Envelopes and Payloads (Steve) - Mar 5, 2008
Extraneous to spec:
Develop Schema for DFDL xsd (Suman) - Mar 15, 2008
Develop Schema for Schema DFDL Subset xsd (Suman) - Mar 30, 2008 (might
not be needed)
Ian Parkinson
WebSphere ESB Development
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg