All,

I have attached my first draft of recommendations for the DFDL Data Model. It is based heavily on XDM and should be mostly compatible. Where there are incompatibilities, they are matters of convention and not related to the actual representation, so I believe a system that is set up to work with XDM should be easily adaptable to work with DFDL. This was important to me: in my project I would be using DFDL as part of a pipeline process that also involved transforming the results of parsing using XSLT and then unparsing the results of the transformation. I imagine many other potential users of DFDL will be engaging in similar XML pipeline scenarios. In these situations, the closer the DFDL model is to those used by other XML technologies, the better. Hopefully the recommendation meets the requirements Steve mentions below; it turns out most of the existing infoset could be mapped directly to XDM concepts. I introduced a new node type for the unresolvable concept to match the existing infoset, but discuss how it is related to XDM and other XML technologies.

This exercise brought two major questions to mind:

- Must an instance of the XML Infoset (and XML document) be valid against a given DFDL Schema (as determined by an XML Schema validation engine) to be available for unparsing? If not, is it up to the DFDL implementation to determine the suitability of the XML Infoset Character Information Items for their given unparsed data type? The real question is: for the unparsing process, can a DFDL Data Model be constructed from an XML Infoset directly, or only from a PSVI (where does the data for unparsing really come from)? It would seem to me that if the input XML isn't valid against the DFDL Schema, then it probably can't be unparsed; otherwise, how would the invalid portions be handled (such as strings that should be numeric or a structure that doesn't match)?

- I am confused by the notion of the "augmented infoset". The regular infoset appears to be based on the logical structure of the data post-parsing. In other words, choices are resolved and the result looks something like what an XML Infoset, PSVI, or XDM tree might look like following something like an XSLT transformation. The augmented infoset, on the other hand, appears to be based on the logical structure of the DFDL Schema being used for processing and therefore contains branches for all choice possibilities, etc. It is "filled in" as parsing takes place. This doesn't make a lot of sense to me: what about the branches for which there was no data to "fill in" (such as choice branches that weren't followed)? Are they dropped following parsing? If not, then there are a lot of information items in the final tree that have no value. It made more sense to me to consider the DFDL Data Model as being constructed during parsing, so that at any given time in the parsing process a portion of the model (that which has already been parsed) is available.
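
As a rough illustration of the contrast in the second question, the sketch below models both readings for a single choice with two branches. Everything in it is an assumption made for illustration: the `Node` class, the function names, and the `asInt`/`asText` branch names come from neither the DFDL draft nor XDM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical data-model node; not a type defined by DFDL or XDM."""
    name: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def augmented_skeleton() -> Node:
    # "Augmented" reading: one slot per schema branch exists before parsing.
    return Node("record", children=[Node("asInt"), Node("asText")])


def fill_in(skeleton: Node, branch: str, value: str) -> Node:
    # Parsing fills in only the branch that matched; the other stays empty.
    for child in skeleton.children:
        if child.name == branch:
            child.value = value
    return skeleton


def build_incrementally(branch: str, value: str) -> Node:
    # Incremental reading: a node exists only once its data has been parsed.
    root = Node("record")
    root.children.append(Node(branch, value))
    return root


augmented = fill_in(augmented_skeleton(), "asInt", "42")
incremental = build_incrementally("asInt", "42")

print([(c.name, c.value) for c in augmented.children])
print([(c.name, c.value) for c in incremental.children])
```

In this sketch the filled-in skeleton still carries an `asText` node with no value, while the incrementally built tree contains only the branch that was actually parsed.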

Hopefully those questions made sense... I should (finally) be on the call this Wednesday to discuss.

Dave

From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Wednesday, May 06, 2009 11:15 AM
To: Dave Glick
Cc: Alan Powell; dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] Agenda for OGF WG call 6 May 2009

Dave

Two intents of the infoset were that it should be a) simple and b) easily related to the grammar in 11.3, so whatever you come up with needs to take those requirements into account.

"Parts of XDM that have no relevance to DFDL but are also not conflicting should probably be left in for conciseness and compatibility." - a) above would imply the opposite.

The XDM spec defines the rules for how an XDM can be created from an XML Infoset or a PSVI. We can do a similar exercise for the DFDL Infoset, for those users who want to use XSL for any post-DFDL transformation.

Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

Dave Glick <dglick@dracorp.com> 06/05/2009 13:55

All, 
  
My apologies, but I will be unable to make the call again this week. I was hoping to have some suggestions regarding the infoset/data model for discussion today, but it's not quite ready (I still have a little more digging to do through the rest of the spec to make sure what I'm suggesting can adequately capture all the representation cases in DFDL). I'll try to get something out by the end of the day for review and discussion on next week's call.
  
In general, it appears to me (and I'm admittedly not as versed in the various XML standards as the other members of the group) that we can bring the DFDL Infoset very closely in line with the XDM. Specifically, I've been looking at the way XSLT 2.0 treats XDM as its data model. It states clearly that XDM is the model for XSLT with certain explicit caveats and additions. This follows the XDM guidance on how it should be used by other standards (specifically in XDM Section 7 and Appendix A). The task for DFDL therefore consists of two parts: determining which parts of the XDM are in conflict with DFDL and should be explicitly excluded, and which parts of DFDL have no corresponding support in XDM and should be appended. Parts of XDM that have no relevance to DFDL but are also not conflicting should probably be left in for conciseness and compatibility.
  
My biggest concern is over the use of two different types of Element Information Items in the DFDL specification, as this seems so contrary to convention in XDM. My recommendations include treating all element nodes as complex, as XDM does; element nodes that actually contain only simple content should have a single child of the XDM text node type or of a new DFDL value node type (I'm not sure of the best way to go here).
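
A minimal sketch of that single-element-kind idea follows, assuming hypothetical `ElementNode` and `TextNode` classes (neither is defined by DFDL or XDM; `TextNode` stands in for either the XDM text node or a possible DFDL value node). The `string_value` helper concatenates descendant text in document order, in the spirit of the XDM string-value of an element.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class TextNode:
    """Stand-in for an XDM text node (or a new DFDL value node)."""
    content: str


@dataclass
class ElementNode:
    """Single element node kind used for both simple and complex content."""
    name: str
    children: List[Union["ElementNode", TextNode]] = field(default_factory=list)


def simple_element(name: str, content: str) -> ElementNode:
    # Simple content is represented as exactly one text-node child.
    return ElementNode(name, [TextNode(content)])


def string_value(node: Union[ElementNode, TextNode]) -> str:
    # Concatenate all descendant text nodes in document order.
    if isinstance(node, TextNode):
        return node.content
    return "".join(string_value(child) for child in node.children)


record = ElementNode(
    "record",
    [simple_element("id", "42"), simple_element("name", "abc")],
)

print(string_value(record.children[0]))  # 42
print(string_value(record))              # 42abc
```

With one element kind, code that walks the tree does not need a separate case for simple elements; it only distinguishes elements from their text (or value) children.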
  
In any case, I'll pass along a full recommendation soon.
  
Dave 
  
From: dfdl-wg-bounces@ogf.org [dfdl-wg-bounces@ogf.org] On Behalf Of Alan Powell [alan_powell@uk.ibm.com]
Sent: Wednesday, May 06, 2009 6:01 AM
To: dfdl-wg@ogf.org
Subject: [DFDL-WG] Agenda for OGF WG call 6 May 2009

Agenda:

1. Go through actions.
2. LengthKind on sequences and choices.
   LengthKind on sequences and choices and their parent element has proved confusing to new users of DFDL. It is proposed that lengthKind be removed from groups and allowed only on the parent element. See email from SH.
3. Discuss UnorderedInitiated email from SH.
4. Infoset codepage and encoding.
   The spec does not say what codepage and encoding are used for string fields.
5. AOB

Next version (034)

Current Actions: 
| No  | Action |
| 012 | AP/SH: Update decimalCalendarScheme |
| 020 | SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy |
| 024 | <No owner> String XML type |
| 026 | SH: Envelopes and Payloads |
| 027 | SH: Property precedence tables |
| 028 | SH: Variable markup |
| 029 | MB: valueCalc (output length calculation) |
| 032 | DG: Investigate compatibility between DFDL infoset and XDM |
| 033 | AP/TK: Assert/Discriminator semantics. AP to document. TK to check uses of discriminator besides choice. |
| 036 | SH: Provide use case for floating component in a sequence |
| 037 | All: Approach for XML Schema 1.0 UPA checks. |
| 038 | MB: Submit response to OMG RFI for non-XML standardization |
| 039 | SKK: Approach for creating Schema-For-DFDL xsds. |

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM     email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073     Fax: +44 (0)1962 816898

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg