I've thought further on this. I think any subsets need to be fairly wide
ranging and independent, in order for implementations (runtime and
tooling) to be able to sensibly offer support. If it becomes too
fragmented then users will find it difficult to know which constructs may
be used when. I realise that this means it takes more effort to implement a
subset in terms of content, but I think it will be easier to understand
how to do it. Accordingly I've revised the strawman. Note that I have
introduced choices and unordered sequences at the same point as
initiators, as that provides a way of resolving uncertainty without
speculation, which is introduced under an advanced expression subset.
Regards
Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 04/08/2010 13:50 -----
From: Steve Hanson/UK/IBM
To: Suman Kalia
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 07/07/2010 18:34
Subject: Re: [DFDL-WG] Subsetting the DFDL spec
Hi Suman
I added hidden elements as it allows things to be omitted from the
infoset, which is a very useful technique. I removed it from the
expression subset because you only need hidden + expressions when, for
example, using a hidden complex element to return a synthesised simple
value. Hidden on its own just to skip things is useful, easy to implement,
and does no harm in core.
Defaults are a core capability. Otherwise you can't create a sparse
infoset on output. If we can separate out nils then perhaps nils could be
in a separate subset. I started off that way then changed my mind but we
can revisit.
I originally had choices in core but I removed it because without
initiators or expressions how can you resolve a choice? You can't.
Choices are not as common as you might think in the non-XML world, for
precisely this reason. However, as I write I've realised that I've not
allocated uncertainty (i.e. choice or 'optionality') to any of the subsets,
a major omission on my part. I was intending core to be fixed occurrences
thereby avoiding the need to implement backtracking, a significant item in
any implementation. I'll think more on this.
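The point about initiators resolving a choice without speculation can be
sketched roughly as follows. This is only an illustration in Python, not
anything from the spec: the function name resolve_choice and the branch
names are hypothetical, and a real processor would also have to deal with
encodings and with one initiator being a prefix of another.

```python
def resolve_choice(data, offset, branches):
    """Pick a choice branch by looking at the initiator found at `offset`.

    `branches` is a list of (branch_name, initiator_bytes) pairs.
    Hypothetical helper: no branch is parsed speculatively, so nothing
    ever has to be undone (no backtracking).
    """
    for name, initiator in branches:
        # bytes.startswith accepts a start position as its second argument
        if data.startswith(initiator, offset):
            return name
    # No initiator matched: the choice cannot be resolved this way
    return None


branches = [("header", b"HDR:"), ("trailer", b"TRL:")]
print(resolve_choice(b"HDR:abc", 0, branches))
print(resolve_choice(b"xxTRL:9", 2, branches))
```

Without initiators (or expressions) the only alternative is to try each
branch in turn and back out on failure, which is the backtracking cost
mentioned above.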
My rationale for omitting delimiters from core was to keep core for fixed
length data. Many scientific users will never need delimiter support -
and they are the OGF folk most likely to write an implementation.
Once you add in separators you pull in a huge amount of implementation -
all the scanning, escaping, etc. However, the uncertainty issue could
well force us to split initiators from the other delimiters because of
their role in uncertainty resolution.
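To give a feel for the difference in implementation effort, here is a
rough Python sketch of the two cases. It is an assumption-laden
illustration only: parse_fixed and parse_delimited are hypothetical
names, the separator and escape are single characters, and real DFDL
delimiter handling (multiple delimiters, escape blocks, encodings) is
considerably more involved.

```python
def parse_fixed(record, widths):
    """Fixed length: slice fields at known positions, no scanning needed."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields


def parse_delimited(record, separator=",", escape="\\"):
    """Delimited: scan every character, honouring an escape character."""
    fields, current, i = [], [], 0
    while i < len(record):
        ch = record[i]
        if ch == escape and i + 1 < len(record):
            # Escaped character is taken literally, even if it is a separator
            current.append(record[i + 1])
            i += 2
        elif ch == separator:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


print(parse_fixed("AB123XYZ", [2, 3, 3]))
print(parse_delimited("a\\,b,c"))
```

Even this toy version of the delimited case needs a scanner and escape
handling, whereas the fixed length case is plain arithmetic on offsets.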
Thanks for your input though. I'll have a think and send out an update
before the next call.
Regards
Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848
From: Suman Kalia
To: Steve Hanson/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 07/07/2010 17:50
Subject: Re: [DFDL-WG] Subsetting the DFDL spec
Steve - some comments
I suggest we create a category "DFDL Advanced features" and put support for
hidden elements under it, as not many users would need or implement it.
One can also make the case for putting "Nils and defaults" under the
DFDL advanced features, as this is one of the more complex parts of the
specification.
Core should support the choice construct, as this is the most common
building block. I would also like to see support for delimited data; the
basic and most widely used form is comma-separated records, which would
require lengthKind=delimited and separators to be moved to the core
specification.
Suman Kalia
IBM Toronto Lab
WebSphere Message Broker Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht...
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com
From: Steve Hanson
To: dfdl-wg@ogf.org
Date: 07/07/2010 09:47 AM
Subject: [DFDL-WG] Subsetting the DFDL spec
Sent by: dfdl-wg-bounces@ogf.org
Some thoughts about subsetting the DFDL spec to make it more consumable
for readers and implementors.
We need to decide how the use of a subset is indicated in a DFDL xsd. It
can be implicit by the properties referenced, or explicit up front. The
difference is best illustrated by an example. Let's say Bidi support is a
subset and I don't want to use Bidi. If using the implicit method, then I
still need the dfdl:textBidi property to be set to 'no' even when in
subset mode because the same xsd could be used by a full DFDL processor
and it will expect a value. If using explicit, then I don't need to set
the dfdl:textBidi property at all, because the DFDL processor will never
look for it unless the xsd is switched to include that subset.
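The difference between the two methods can be sketched in Python as
below. This is purely illustrative: the subset names ("core", "bidi"),
the grouping of properties into subsets, and both function names are
assumptions made for the example, and no mechanism for declaring subsets
in the xsd has been decided.

```python
# Hypothetical grouping of DFDL properties into subsets
SUBSET_PROPERTIES = {
    "core": {"dfdl:representation", "dfdl:lengthKind"},
    "bidi": {"dfdl:textBidi"},
}


def required_properties(mode, enabled_subsets):
    """Return the properties a processor would expect the xsd to set."""
    if mode == "implicit":
        # A full processor may read the same xsd, so every property
        # must carry a value regardless of which subsets are in use
        return set().union(*SUBSET_PROPERTIES.values())
    if mode == "explicit":
        # Only properties of the subsets switched on are ever looked at
        required = set()
        for name in enabled_subsets:
            required |= SUBSET_PROPERTIES[name]
        return required
    raise ValueError("mode must be 'implicit' or 'explicit'")


def missing_properties(schema_props, mode, enabled_subsets):
    """List expected properties that the schema has not set."""
    return sorted(required_properties(mode, enabled_subsets) - set(schema_props))


schema = {"dfdl:representation": "text", "dfdl:lengthKind": "explicit"}
print(missing_properties(schema, "implicit", ["core"]))
print(missing_properties(schema, "explicit", ["core"]))
```

With the implicit method the schema above is incomplete until
dfdl:textBidi is set to 'no'; with the explicit method it is complete as
it stands, because the Bidi subset was never switched on.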
Here's a straw man for some subsets.
Regards
Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
#### Subset_proposal_v1.ppt moved to MyAttachments Repository V3.8 () on
13 July 2010 by Steve Hanson.