Regarding Mike's subsequent question about testKind 'pattern', my initial reaction is that the logic behind section 9.5.2 also applies here. It is just more restricted, in that the only thing that could possibly fail ahead of the discriminator being evaluated is one of the asserts (which must also have testKind 'pattern').

If that is so, then we don't have to say anything further about relative order of assert and discriminator execution.

Have a think and we'll discuss further on the next call.

Regards

Steve

On Thu, Jan 9, 2025 at 3:50 PM Steve Hanson <smhdfdl@gmail.com> wrote:

Hi Mike

Spec section 7.5.1 says the following.

"If the resolved set of statement annotations for a schema component contains multiple dfdl:assert statements, then those with testKind 'pattern' are executed before those with testKind 'expression' (the default). However, within each group the order of execution among them is not specified.
If one of the resolved set of asserts for a schema component is unsuccessful, and the failureType of the assert is ‘processingError’, then no further asserts in the set are executed."
That seems clear to me. Schema authors should not rely on the ordering of asserts.
Spec section 9.5.1 says
"Implementations are free to optimize by recognizing and executing discriminators or asserts with testKind 'expression' earlier so long as the resulting behavior is consistent with what results from the description above."
Spec section 9.5.2, as you indicate, says
"When parsing, an attempt to evaluate a discriminator MUST be made even if preceding statements or the parse of the schema component ended in a Processing Error.
This is because a discriminator's expression can evaluate to true thereby resolving a point of uncertainty even if the complete parsing of the construct ultimately caused a Processing Error."
So, an attempt to answer your questions:
1) Yes
2) No
3) No, because C might be evaluated first and fail
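
To make those answers concrete, here is a rough Python sketch of the rules quoted above. It is only an illustration under stated assumptions, not text from the spec or any implementation: the Statement class and run_statements function are hypothetical names, each test is stood in for by a Python callable, the implementation is assumed to do no hoisting, and the discriminator is simply attempted after the asserts.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Statement:
    # Hypothetical model of a dfdl:assert or dfdl:discriminator.
    name: str
    test: Callable[[], bool]               # stands in for the test expression or pattern
    test_kind: str = "expression"          # or "pattern"
    failure_type: str = "processingError"  # or "recoverableError" (asserts only)


def run_statements(
    asserts: List[Statement],
    discriminator: Optional[Statement] = None,
) -> Tuple[List[str], bool, bool]:
    """Return (names evaluated in order, discriminator result, processing error seen)."""
    trace: List[str] = []
    error = False
    # Pattern asserts go before expression asserts. sorted() is stable, so the
    # order inside each group is whatever the caller passed in; the spec leaves
    # that order unspecified.
    for a in sorted(asserts, key=lambda s: s.test_kind != "pattern"):
        trace.append(a.name)
        if not a.test() and a.failure_type == "processingError":
            error = True
            break  # no further asserts in the set are executed
    resolved = False
    if discriminator is not None:
        # Attempted even when an assert has already failed (section 9.5.2).
        trace.append(discriminator.name)
        resolved = discriminator.test()
        error = error or not resolved
    return trace, resolved, error


a = Statement("A", lambda: False)  # assert A fails
b = Statement("B", lambda: True)   # discriminator B succeeds
c = Statement("C", lambda: True)   # assert C would succeed
print(run_statements([a, c], b))   # (['A', 'B'], True, True)
```

In this model, when A fails C is never run, but B is still attempted and can resolve the point of uncertainty. Passing the asserts as [c, a] with a failing C would stop before A, which is why A's message cannot be relied on.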

Regards
Steve

On Wed, Jan 8, 2025 at 6:48 PM Mike Beckerle <mbeckerle@apache.org> wrote:
Given this DFDL annotation on an xs:sequence

<xs:appinfo source="http://www.ogf.org/dfdl/">
<dfdl:assert test="..." />
<dfdl:discriminator test="..." />
<dfdl:assert test="..." />
</xs:appinfo>

The spec seems silent about the evaluation order among these 3
statement annotations. Let's call them A, B, C.

First, the spec makes it clear that the discriminator B could be
evaluated earlier than either assertion, and even before some of the
sequence content.
This is to allow optimization by a DFDL implementation. The spec is
also clear that even if the parse of the sequence content fails,
discriminator B is evaluated (with the infoset being the state at the
time of the failure). Again this is to ensure the behavior matches
that of an optimizing DFDL implementation.

But let's assume an implementation does no such optimization.

I believe these evaluation orders are legal for the 3 statements after
the sequence content has been parsed:

A, B, C - if A fails, we do know B must still be evaluated.

B, A, C - this is the minimum sort of hoist/optimization, doing the
discriminator before the asserts.

Questions:

1) Are any other orders of evaluation allowed?

2) If evaluation of A fails, do we still evaluate assertion C? (My
hope is the answer here is no, because that allows consecutive asserts
to build on each other's assumptions. But the spec is unclear.)

3) Can users depend on the failure of A to generate a message output?
(ex: if the assert has a message attribute, can we state that this
message will somehow be exhibited or logged by the implementation,
unless the failure is suppressed by backtracking at a point of
uncertainty)

If A fails, the spec does say that B must still be evaluated.

But if A fails, will C be evaluated?

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com