The key point for me is this: what happens when a DFDL expression (which
looks exactly like an XPath 2.0 expression) gets lifted out of the DFDL xsd
and used by a non-DFDL XPath processor?
If we allow the DFDL entities to be used like XML entities then the
expression will appear to be a valid XPath expression, but it will
fail in some unpredictable way. On the other hand, if DFDL entities can
only be used in conjunction with a DFDL function then (presumably) the
non-DFDL XPath engine will report that an unknown extension function
is being used.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Andrew Coleman/UK/IBM@IBMGB
Cc: Steve Hanson/UK/IBM@IBMGB, Tim Kimber/UK/IBM@IBMGB, dfdl-wg@ogf.org
Date: 05/12/2012 13:17
Subject: Re: [DFDL-WG] DFDL character entities in DFDL expressions
Well, yes, I think we're discussing exactly how that should
work.
No matter what, we do need to be clear that string values created in DFDL's
expression language can contain these XML-illegal characters, since they
are allowed in DFDL's infoset. This means that a DFDL implementation can
re-use an existing XPath implementation for its DFDL expression language
only to the extent that the XPath implementation does NOT enforce XML's
restrictions on illegal characters.
I am currently working with standard Saxon-B XPath and will report back.
But let's be optimistic. The question then is how to create a string
literal that includes these characters. That cannot be done without some
beyond-XML mechanism. DFDL has a string-literal notation for expressing
these characters, so we either say that string literals in the expression
language can use the DFDL character and numeric entities, or we do
something more 'library like' and provide a function which interprets the
string-literal notation, isolating the implementation concerns a bit.
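To illustrate why a beyond-XML mechanism is needed, here is a minimal Python sketch. It assumes the standard library's expat-based xml.etree.ElementTree parser as a stand-in for whatever XML parser reads the DFDL schema; the helper name parses_as_xml and the element name e are made up for the example. An ordinary character reference is well-formed, whereas a character reference to NUL is not a legal XML 1.0 character, so the schema document itself cannot carry it.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document: str) -> bool:
    """Return True if the standard-library XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A character reference to 'A' is an ordinary, legal XML 1.0 character.
legal = '<e v="&#65;"/>'

# A character reference to NUL falls outside the XML 1.0 Char production,
# so a conforming parser is expected to reject the whole document.
illegal = '<e v="&#0;"/>'

print(parses_as_xml(legal))
print(parses_as_xml(illegal))
```

So even though the DFDL schema is XML, the host document cannot supply NUL through an XML character reference, which is why the choice comes down to DFDL entities in literals or a DFDL function.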
As a language embedded in XML schema, we already straddle the fence of
two somewhat inconsistent language environments.
E.g., the literals one can use as the value of the default attribute on
an element declaration cannot use DFDL character entities, as this is a
purely XML Schema construct.
Similarly, the regular expressions one can use for the XML schema pattern
facet are more restrictive than the DFDL regular expressions one can use
in a dfdl:assert, or a dfdl:lengthKind='pattern'.
So, it's acceptable to me to say that expressions also have some split
where the DFDL-specific aspects, like the DFDL character and numeric entity
notation, are isolated in a sub-construct.
On Wed, Dec 5, 2012 at 6:37 AM, Andrew Coleman <andrew_coleman@uk.ibm.com>
wrote:
No, casting a hexBinary to a string
will just write out the octets - i.e. the string will be '00'.
XPath itself has no mechanism for interpreting entity references or character
references. Its hosting language (XQuery or XSLT/XML) provides this.
Since DFDL is XML, wouldn't that provide a mechanism?
Regards,
- Andy
__________________________________________
Andrew Coleman
WebSphere Message Broker Development
IBM Hursley Park
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Mike Beckerle <mbeckerle.dfdl@gmail.com>, Andrew Coleman/UK/IBM@IBMGB
Date: 05/12/2012 11:06
Subject: Re: [DFDL-WG] DFDL character entities in DFDL expressions
Aren't XPath facilities sufficient here?
outputValueCalc="{ if (fn:string-length(../s) lt 64) then fn:concat(../s,
xs:string(xs:hexBinary('00'))) else ../s }"
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM@IBMGB
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 05/12/2012 10:51
Subject: Re: [DFDL-WG] DFDL character entities in DFDL expressions
Sent by: dfdl-wg-bounces@ogf.org
I think the restriction was aimed at avoiding things like this:
outputValueCalc="{ if (fn:string-length(../s) lt 64) then fn:concat(../s,
'%#rFF;') else ../s }"
I agree that a total ban is too restrictive. My personal preference would
be for the dfdl:string() function because it makes the usage of DFDL-specific
features obvious in the DFDL expression. But what would be the return type
of dfdl:string()? If it returned a sequence of characters then the raw
byte entity (%#rnn;) would still need to be disallowed.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 04/12/2012 23:36
Subject: [DFDL-WG] DFDL character entities in DFDL expressions
Sent by: dfdl-wg-bounces@ogf.org
We currently have this language in the spec:
"Within an expression, a string is never interpreted as a DFDL string
literal."
To me this means one cannot use DFDL character entities in an expression.
However, I need to do this:
outputValueCalc="{ if (fn:string-length(../s)
lt 64) then fn:concat(../s, '%NUL;') else ../s }"
Basically, I need to append a NUL on the end of the string in the output
value case.
Unless I can put a %NUL; into an expression and have it interpreted as
a DFDL String literal, I am not sure how I can achieve this.
At minimum I need a new DFDL function which might be an alternate string
constructor, such as dfdl:string('....'), which scans its argument for DFDL
character entities and substitutes them, so that the resulting string can
contain characters that are disallowed in XML (like NUL).
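As a rough illustration of what such a constructor might do, here is a minimal Python sketch. It is not the DFDL specification's definition: the function name dfdl_string, the small NAMED table (only a handful of the named entities) and the decision to reject the raw byte entity are all assumptions made for the example. Rejecting %#rnn; reflects the concern that a function returning a sequence of characters has no way to represent a raw byte.

```python
import re

# Illustrative subset of DFDL named character entities (assumption: a real
# implementation would cover the full set defined by the DFDL specification).
NAMED = {"NUL": "\x00", "HT": "\t", "LF": "\n", "CR": "\r", "SP": " "}

# Matches, in order: %% (escaped percent), %#rHH; (raw byte),
# %#xH...; (hex code point), %#D...; (decimal code point), %NAME; (named).
_ENTITY = re.compile(
    r"%(?:(%)|#r([0-9A-Fa-f]{2});|#x([0-9A-Fa-f]+);|#([0-9]+);|([A-Z][A-Z0-9]*);)"
)


def dfdl_string(literal: str) -> str:
    """Hypothetical dfdl:string(): replace DFDL entities with characters."""

    def substitute(match: re.Match) -> str:
        percent, raw, hex_digits, dec_digits, name = match.groups()
        if percent:
            return "%"
        if raw is not None:
            # A character string cannot hold a raw byte, so reject it.
            raise ValueError("raw byte entity %#r" + raw + "; is not a character")
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        if dec_digits is not None:
            return chr(int(dec_digits))
        if name not in NAMED:
            raise ValueError("unknown DFDL entity %" + name + ";")
        return NAMED[name]

    # Text that does not match any entity form is left unchanged.
    return _ENTITY.sub(substitute, literal)


padded = dfdl_string("abc%NUL;")
print(repr(padded))
```

With a function along these lines, the expression above could be written as fn:concat(../s, dfdl:string('%NUL;')), keeping the DFDL-specific notation visible at the call site.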
--
Mike Beckerle | OGF DFDL WG Co-Chair | Tresys Technologies
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU