Jonathan
Yes, that's the principle, but it goes
further than that. The DFDL infoset is typed, whereas the XML infoset isn't.
The result of parsing a DFDL-described document and then applying DFDL
validation to the resultant DFDL infoset is the same as the result of parsing
the equivalent XML document and applying XML Schema 1.0 validation, where
'result' means 'the validation errors that are detected'.
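As a rough illustration of the typed/untyped distinction only (this is not a DFDL processor; the record layout, the element name "count" and the dict standing in for a typed infoset are all hypothetical), here is a small Python sketch:

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record: a single 32-bit big-endian integer called "count",
# shown once as binary data and once as the equivalent XML document.
binary_data = struct.pack(">i", 42)
xml_data = "<record><count>42</count></record>"

# XML infoset: without schema validation the parser only sees character
# data, so the value comes back as a string.
xml_value = ET.fromstring(xml_data).find("count").text

# Stand-in for a DFDL parse: the schema would declare "count" as xs:int,
# so the infoset item carries a typed value rather than text.
typed_infoset = {"count": struct.unpack(">i", binary_data)[0]}

print(type(xml_value).__name__, repr(xml_value))
print(type(typed_infoset["count"]).__name__, typed_infoset["count"])
```

The first value is a string and the second an integer. Validation then closes that gap on the XML side, which is why the two routes are expected to report the same validation errors.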
This principle has guided the WG in
the design of several features. Hiding elements from the DFDL infoset
requires the use of a 'hidden group' - simply doing the obvious thing and
adding a dfdl:hidden property to an element would break the principle.
Getting an assert to fail without throwing a processing error meant inventing
recoverable errors - re-using validation errors would break the principle.
For your inspection and sanitization
capability, I would recommend looking at XDM, the data model used by XPath 2.0,
XSLT 2.0 and XQuery. I think this is the natural higher-level model to
adopt for a common DFDL and XML framework. I created this OGF document
to describe how to map DFDL infoset to/from XDM. http://redmine.ogf.org/dmsf_files/8111?download=.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: "Cranford, Jonathan W." <jcranford@mitre.org>
To: Steve Hanson/UK/IBM@IBMGB
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 23/07/2013 18:23
Subject: could you clarify statement made on the call today
Steve,
Could you clarify a statement made during today's DFDL WG call?
I didn't quite catch the whole statement, but it sounded like you were
saying a design goal of the WG was that parsing a binary
format using DFDL would result in a DFDL infoset roughly equivalent to
the XML infoset obtained by parsing the same data in an XML format. I
don't think I quite captured that correctly, but it sounds like an important
point, and I'd like to understand it further.
For context, I've been asked to look at building an inspection and sanitization
capability on top of DFDL, so I'm weighing the differences between DFDL
Infoset and XML Infoset at the moment, and your comparison caught my attention.
Thanks in advance,
--
Jonathan W. Cranford
Senior Information Systems Engineer
The MITRE Corporation (http://www.mitre.org)