One could certainly imagine a case where a text integer in the XML is
required for parsing subsequent data. I don't know that we have to
support this case directly, or from the start, but I think we should
consider the following use case. Joe User is very excited by the power
of DFDL to deal with his complex information, and has now run across a
case where info within an XML fragment in his doc is required for
further parsing. He wants to use DFDL to get the fragment, has an XML
parser he could connect (assuming a DFDL parser that supported
extensions), and then needs a way to use one of the nodes identified as
a source for further DFDL processing: the /mydata/length node contains a
string representing the length of a subsequent array, and he needs to
get that string, convert it to an int using standard DFDL, and use that
as the size of the array. Without extensibility, the use case is that
Joe waits for DFDL 2.0, when we add an XML parser, and would then like
to do the same thing, using his knowledge of DFDL 1.0 to convert the
string to an int and use it as a repeat count for an array.
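
To make the data dependency concrete, here is a minimal Python sketch of what Joe would otherwise do by hand. It is not DFDL and not a proposed design. The record layout is an assumption made up for illustration: a 2-byte big-endian length prefix for the XML fragment, the fragment itself, then an array of 2-byte big-endian unsigned integers. The function name parse_record is hypothetical as well; only the /mydata/length node comes from the use case above.

```python
import struct
import xml.etree.ElementTree as ET


def parse_record(data: bytes) -> list:
    """Parse one record of the assumed (hypothetical) layout.

    Layout: 2-byte big-endian XML length, XML fragment, then an array of
    2-byte big-endian unsigned ints whose count is given by /mydata/length.
    """
    # Step 1: carve out the XML fragment using the assumed length prefix.
    (xml_len,) = struct.unpack_from(">H", data, 0)
    fragment = data[2:2 + xml_len]

    # Step 2: hand the fragment to an XML parser (the "extension" step).
    root = ET.fromstring(fragment)  # root element is expected to be <mydata>
    length_text = root.findtext("length")
    if root.tag != "mydata" or length_text is None:
        raise ValueError("fragment has no /mydata/length node")

    # Step 3: convert the text to an int and use it as the repeat count.
    count = int(length_text.strip())
    return list(struct.unpack_from(f">{count}H", data, 2 + xml_len))


if __name__ == "__main__":
    xml = b"<mydata><length>3</length></mydata>"
    record = struct.pack(">H", len(xml)) + xml + struct.pack(">3H", 10, 20, 30)
    print(parse_record(record))
```

The point of the sketch is step 3: the value that controls the array comes out of the XML parser as a string, so something has to feed it back into the parse of the remaining bytes.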
Jim
At 07:41 AM 2/2/2006, Mike Beckerle wrote:
I would second this approach. A payload string of XML data is just a
string of value content to us.
Note however that in our proposed set of properties there is one,
"isXML", which is intended to facilitate the usage pattern of XML
payload strings. This property is a boolean you can set to say that the
string's content is a well-formed XML document or a well-formed fragment
of XML. This is just a shorthand for what would otherwise be a large set
of quoting/escaping conventions, the use of a dynamic character set
selected based on the encoding attribute in the
<?xml version="1.0" encoding="US-ASCII"?> slug line (if present), etc.
(We would need to specify what the concept of "well-formed fragment of
XML" means. I think intuitively people know what this means, something
intelligible to an XML parser, but we need to be explicit. It means a
fragment of XML that begins and ends between two elements. Hence, it is
not a fragment that starts in the middle of any quoting construct, nor
in the middle of a tag or attribute, etc.)
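
One way to approximate this definition in practice is to ask an XML parser directly. The sketch below is only an illustration in Python, not a proposed definition: it accepts a string if it parses as a complete document, or if it parses once wrapped in a synthetic root element. The function name and the wrapper element name are made up for the example. Note the limitation: a string cut in the middle of a tag can still pass if the leftover characters happen to be legal character data, so this check is weaker than the intent described above.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text: str) -> bool:
    """Return True if text parses as an XML document or as element content.

    Approximation only: the fragment case is tested by wrapping the text
    in a synthetic root element (hypothetical name) and parsing the result.
    """
    candidates = (
        text,                                # a complete document
        "<_wrapper>" + text + "</_wrapper>",  # a fragment between elements
    )
    for candidate in candidates:
        try:
            ET.fromstring(candidate)
            return True
        except ET.ParseError:
            continue
    return False


if __name__ == "__main__":
    print(is_well_formed_xml("<a>1</a><b/>"))  # two sibling elements
    print(is_well_formed_xml("<a><b></a>"))    # mismatched tags
```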
Mike Beckerle
STSM, Architect, Scalable Computing
IBM Software Group
Information Integration Solutions
Westborough, MA 01581
voice and FAX 508-599-7148
home/mobile office 508-915-4767
Steve Hanson <smh@uk.ibm.com>
Sent by: owner-dfdl-wg@ggf.org
02/01/2006 01:27 PM
To: dfdl-wg@ggf.org
Subject: Re: [dfdl-wg] More documents
I'll see what I can come up with.
As far as the embedded XML goes, I put it there as we will be asked this
question. Thinking it through, maybe we should simply treat it as a BLOB
and leave it to the user to take and parse using an XML parser as an
independent operation. This is symmetric with an XML document containing
a non-XML BLOB as CDATA that needed to be parsed using DFDL.
Regards, Steve
Steve Hanson
WebSphere Message Brokers,
IBM Hursley, England
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
"Robert E. McGrath" <mcgrath@ncsa.uiuc.edu>
Sent by: owner-dfdl-wg@ggf.org
01/02/2006 15:03
To: dfdl-wg@ggf.org
Subject: Re: [dfdl-wg] More documents
On Wednesday 01 February 2006 07:03, Steve Hanson wrote:
> - A portion of the data is encrypted, with fields in the message prior to
> the encrypted section providing the decryption keys etc. (X12 security
> segment motivates this)
> - Data where some XML is embedded in the middle
> - Data where decimal fields (say) are in a wacky encoding not supported by
> stock DFDL properties (TLOG retail standard motivates here)
>
These are great examples. Can someone give me fully documented
data files from which to try to construct such examples?
By the way, I'm not sure whether embedded XML is within the scope
of DFDL--it gets insanely hairy.
But the others are exactly the kinds of things that the core standard
must either cover or have an extension mechanism that covers.
--
---
Robert E. McGrath, Ph.D.
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
1205 West Clark
Urbana, Illinois 61801
(217)-333-6549
mcgrath@ncsa.uiuc.edu
James D. Myers
Associate Director, Collaborative Systems, NCSA
1205 W. Clark St, MC-257
Urbana, IL 61801
217-244-1934
jimmyers@ncsa.uiuc.edu