Fw: Subsetting the DFDL spec

4 Aug 2010

I've thought further on this.  I think any subsets need to be fairly 
wide-ranging and independent, in order for implementations (runtime and 
tooling) to be able to sensibly offer support. If it becomes too 
fragmented then users will find it difficult to know what construct may be 
used when.  I realise that this means it takes more effort to implement a 
subset in terms of content, but I think it will be easier to understand 
how to do it. Accordingly I've revised the strawman.   Note that I have 
introduced choices and unordered sequences at the same point as 
initiators, as that provides a way of resolving uncertainty without 
speculation. Speculation itself is introduced under an advanced 
expression subset. 
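
To illustrate why initiators remove the need for speculation, here is a minimal sketch in Python. It is not taken from the strawman or from any DFDL implementation: the branch names, the initiator strings and the resolve_choice function are all hypothetical. Each branch of a choice is identified by a literal initiator, so the parser can commit to a branch by inspecting the data at the current position, and never has to parse a branch body and then back out of it.

```python
# Hypothetical choice branches, each identified by a literal initiator.
# The initiator strings and branch names are made up for illustration.
BRANCHES = {
    "HDR:": "header",
    "DTL:": "detail",
    "TRL:": "trailer",
}


def resolve_choice(data, offset=0):
    """Pick the branch whose initiator matches the data at `offset`.

    Returns (branch_name, offset_after_initiator). No branch body is
    parsed speculatively, so there is nothing to undo on a mismatch.
    """
    for initiator, branch in BRANCHES.items():
        if data.startswith(initiator, offset):
            return branch, offset + len(initiator)
    raise ValueError(f"no initiator matches at offset {offset}")


branch, next_offset = resolve_choice("DTL:0042")
print(branch, next_offset)
```

Without initiators (or expressions), the only way to make the same decision would be to try each branch in turn and backtrack on failure, which is the speculation that the advanced subset covers.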

Regards

Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 04/08/2010 13:50 -----

From: Steve Hanson/UK/IBM
To: Suman Kalia <kalia@ca.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 07/07/2010 18:34
Subject: Re: [DFDL-WG] Subsetting the DFDL spec

Hi Suman

I added hidden elements as they allow things to be omitted from the 
infoset, which is a very useful technique. I removed them from the 
expression subset because you only need hidden + expressions when, for 
example, using a hidden complex element to return a synthesised simple 
value. Hidden on its own, just to skip things, is useful, easy to 
implement, and does no harm in core.

Defaults are a core capability. Otherwise you can't create a sparse 
infoset on output.  If we can separate out nils then perhaps nils could be 
in a separate subset. I started off that way then changed my mind but we 
can revisit.

I originally had choices in core but I removed them because, without 
initiators or expressions, how can you resolve a choice?  You can't. 
Choices are not as common as you might think in the non-XML world, for 
precisely this reason.  However, as I write I've realised that I've not 
allocated uncertainty (i.e., choice or 'optionality') to any of the subsets, 
a major omission on my part.  I was intending core to be fixed occurrences 
thereby avoiding the need to implement backtracking, a significant item in 
any implementation. I'll think more on this.

My rationale for omitting delimiters from core was to keep core for 
fixed-length data.  Many scientific users will never need delimiter 
support, and they are the OGF folk most likely to write an implementation. 
Once you add in separators you pull in a huge amount of implementation - 
all the scanning, escaping, etc.  However, the uncertainty issue could 
well force us to split initiators from the other delimiters because of 
their role in uncertainty resolution. 

Thanks for your input though, I'll have a think and send out an update 
before next call.

Regards

Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848

From: Suman Kalia <kalia@ca.ibm.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 07/07/2010 17:50
Subject: Re: [DFDL-WG] Subsetting the DFDL spec

Steve - some comments 

I suggest we create a category, DFDL Advanced Features, and put support for 
hidden elements under it, as not many users would need it or implement 
it. One can also make the case for putting "Nils and defaults" under 
DFDL Advanced Features, as this is one of the more complex parts of the 
specification. 

Core should have support for the choice construct, as this is the most 
common building block.  I would also like to see support for delimited 
data; the basic and most widely used form is comma-separated records, 
which would require lengthKind=delimited and separators to be moved to 
the core specification. 

Suman Kalia
IBM Toronto Lab
WebSphere Message Broker Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools 

http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht...

Tel : 905-413-3923  T/L  969-3923
Fax : 905-413-4850 T/L  969-4850
Internet ID : kalia@ca.ibm.com 

From:        Steve Hanson <smh@uk.ibm.com> 
To:        dfdl-wg@ogf.org 
Date:        07/07/2010 09:47 AM 
Subject:        [DFDL-WG] Subsetting the DFDL spec 
Sent by:        dfdl-wg-bounces@ogf.org 

Some thoughts about subsetting the DFDL spec to make it more consumable 
for readers and implementors. 

We need to decide how the use of a subset is indicated in a DFDL xsd.  It 
can be implicit, inferred from the properties referenced, or explicit, 
declared up front.  The 
difference is best illustrated by an example. Let's say Bidi support is a 
subset and I don't want to use Bidi.  If using the implicit method, then I 
still need the dfdl:textBidi property to be set to 'no' even when in 
subset mode because the same xsd could be used by a full DFDL processor 
and it will expect a value.  If using explicit, then I don't need to set 
the dfdl:textBidi property at all, because the DFDL processor will never 
look for it unless the xsd is switched to include that subset. 
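
The difference between the two methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subset names ("core", "bidi"), the mapping of properties to subsets, and both function names are hypothetical and not part of the spec or the strawman. The model simply asks which properties a processor would insist on finding in the xsd under each method.

```python
# Hypothetical mapping of subsets to the DFDL properties they own.
# The subset names and groupings are assumptions for illustration only.
SUBSET_PROPERTIES = {
    "core": {"representation", "lengthKind"},
    "bidi": {"textBidi"},
}


def required_properties(enabled_subsets, explicit):
    """Return the properties the xsd must set.

    Implicit: the same xsd may reach a full DFDL processor, so every
    property is required whichever subsets the author intends to use.
    Explicit: only properties of the switched-on subsets are looked for.
    """
    if not explicit:
        return set().union(*SUBSET_PROPERTIES.values())
    required = set()
    for name in enabled_subsets:
        required |= SUBSET_PROPERTIES[name]
    return required


def missing_properties(schema_properties, enabled_subsets, explicit):
    """Properties the processor expects but the xsd does not set."""
    return required_properties(enabled_subsets, explicit) - set(schema_properties)


# An xsd that sets core properties only and never mentions textBidi.
schema = {"representation": "text", "lengthKind": "explicit"}
print(missing_properties(schema, ["core"], explicit=False))  # {'textBidi'}
print(missing_properties(schema, ["core"], explicit=True))   # set()
```

Under the implicit method the xsd is reported as missing textBidi even though Bidi is not wanted; under the explicit method nothing is missing until the "bidi" subset is switched on.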

Here's a straw man for some subsets. 

Regards

Steve Hanson
Strategy, Common Transformation & DFDL
Co-Chair, OGF DFDL WG
IBM SWG, Hursley, UK,
smh@uk.ibm.com,
tel +44-(0)1962-815848

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 
741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU 

--
 dfdl-wg mailing list
 dfdl-wg@ogf.org
 http://www.ogf.org/mailman/listinfo/dfdl-wg 


#### Subset_proposal_v1.ppt moved to MyAttachments Repository V3.8 () on 
13 July 2010 by Steve Hanson.
