Mike

IBM DFDL compiles XPath expressions into an instruction set. When it processes the instruction set and encounters an OR, the first true operand causes the expression to return true. The rest of the instruction set is still processed, but the results (including any errors) are ignored. The same applies to AND with the first false operand. That's our Java behaviour. But it looks like our C behaviour is different, in that when the rest of the instruction set is processed, any errors are not ignored! Which is taking 'implementation-dependent' a little too literally :)
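
To make the difference concrete, here is a rough sketch of the two behaviours described above. It is not IBM DFDL code: the names eval_or, ExprError and ignore_errors_after_result are made up for illustration, and operands are modelled as plain Python callables rather than compiled instructions.

```python
class ExprError(Exception):
    """Hypothetical stand-in for an XPath dynamic error."""


def eval_or(operands, ignore_errors_after_result):
    """Evaluate every operand of an OR, never stopping early.

    operands: zero-argument callables that return a bool or raise ExprError.
    ignore_errors_after_result: True models the Java behaviour described
    above (errors after the first true operand are discarded); False models
    the C behaviour (those errors still surface).
    """
    result = False
    for operand in operands:
        try:
            value = operand()
        except ExprError:
            if result and ignore_errors_after_result:
                continue  # outcome already decided, so the error is dropped
            raise
        result = result or value
    return result


def always_true():
    return True


def always_error():
    raise ExprError("operand failed")
```

With these definitions, eval_or([always_true, always_error], True) returns True, whereas the same call with False raises ExprError, which is the divergence between the two runtimes in miniature.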

The XPath 2.0 spec (http://www.w3.org/TR/xpath20/#id-errors-and-opt) says that the results of expressions are, to a degree, implementation-dependent, which is admitting that two implementations might return different results. In DFDL we already acknowledge that the behaviour of implementations might differ, in the paragraph of section 23 that talks about dynamic versus static XPTY0004 errors.

I don't think we should force implementations to implement their XPath processor to guarantee portability, because that has a knock-on effect on other criteria like performance, but we can certainly publish guidelines for schema authors so that they can author portable schemas.

Regards

Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

---- Reply ----

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 17/09/2014 09:05
Subject: [DFDL-WG] OR operator in DFDL Expressions
Sent by: dfdl-wg-bounces@ogf.org

This is really a question about how IBM's DFDL works, but I have to introduce the topic:

XPath 2.0 says that implementations have freedom about whether these expressions cause errors or not:

true() or error() = true()   (let's call this the sequential semantics)

true() or error() = error    (let's call this the parallel semantics)

error() or true() = true()   (let's call this the bizarre semantics)

There are analogous cases for 'and' with false()
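
For anyone who wants to experiment with the three cases, the following is a small sketch that models them with Python callables. It is only one reading of the three labels, not anything taken from the XPath spec or from an actual processor; XPathError, true_, error_ and the three function names are invented for the example.

```python
class XPathError(Exception):
    """Placeholder for an XPath dynamic error such as error()."""


def true_():
    return True


def error_():
    raise XPathError("error()")


def sequential_or(left, right):
    # Left to right; the right operand is never evaluated once left is true.
    return left() or right()


def parallel_or(left, right):
    # Both operands are always evaluated, so any error propagates.
    left_value = left()
    right_value = right()
    return left_value or right_value


def lenient_or(left, right):
    # The "bizarre" case: both operands are evaluated with errors captured,
    # and a true operand wins over an error in the other operand.
    outcomes = []
    for operand in (left, right):
        try:
            outcomes.append(operand())
        except XPathError as exc:
            outcomes.append(exc)
    if any(outcome is True for outcome in outcomes):
        return True
    for outcome in outcomes:
        if isinstance(outcome, XPathError):
            raise outcome
    return False
```

Under this model sequential_or(true_, error_) returns True, parallel_or(true_, error_) raises, and lenient_or(error_, true_) returns True, while sequential_or(error_, true_) still raises.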

Saxon, which dates from XPath 1.0, implements the sequential semantics, which XPath 1.0 requires, so that's what the various TDML tests and such in the Daffodil project have come to depend on. Quite often we have things like:

dfdl:occursIndex() = 1
or
../r[dfdl:occursIndex() - 1]/flag

where that 2nd operand is effectively an error if the first operand is true.

Strictly by the new XPath 2.0 rules, the only portable way to write this expression is with an if-then-else. But I am reluctant to change all these tests we have.
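
As an illustration of why the if-then-else form is safe, here is a sketch in Python rather than XPath. In XPath terms the rewrite would presumably look something like: if (dfdl:occursIndex() = 1) then true() else ../r[dfdl:occursIndex() - 1]/flag, though that exact form is my assumption and has not been run through any DFDL processor. The helper names, the list of flags, and the choice to treat position 0 as an error are all hypothetical.

```python
class XPathError(Exception):
    """Placeholder for an XPath dynamic error."""


def previous_flag(flags, occurs_index):
    # Models ../r[dfdl:occursIndex() - 1]/flag using 1-based positions.
    # Treating a position below 1 as an error is an assumption made for
    # illustration only.
    position = occurs_index - 1
    if position < 1:
        raise XPathError("no previous occurrence")
    return flags[position - 1]


def guarded(flags, occurs_index):
    # Only the selected branch is evaluated, as with XPath if-then-else,
    # so the lookup never runs for the first occurrence.
    return True if occurs_index == 1 else previous_flag(flags, occurs_index)
```

Here guarded(flags, 1) never calls previous_flag, so the result does not depend on how a given implementation orders or prunes the operands of an OR.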

Part of me says the most conservative thing is the parallel semantics - because it prevents you from writing an OR statement like the one I have above that depends on the sequentiality.

It is probably more important that the initial implementations are consistent so that schemas are more likely to interoperate.

What does IBM DFDL do for the semantics of the OR expressions?

Thanks

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
