Clarification needed: multiple assert statements evaluation order

Given this DFDL annotation on an xs:sequence

<xs:appinfo source="http://www.ogf.org/dfdl/">
  <dfdl:assert test="..." />
  <dfdl:discriminator test="..." />
  <dfdl:assert test="..." />
</xs:appinfo>

The spec seems silent about the evaluation order among these 3 statement annotations. Let's call them A, B, C.

First, the spec makes it clear that the discriminator B could be evaluated earlier than either assertion, and even before some of the sequence content. This is to allow optimization by a DFDL implementation. The spec is also clear that even if the parse of the sequence content fails, discriminator B is evaluated (with the infoset being the state at the time of the failure). Again this is to ensure the behavior matches that of an optimizing DFDL implementation.

But let's assume an implementation does no such optimization.

I believe these evaluation orders are legal for the 3 statements after the sequence content has been parsed:

A, B, C - if A fails, we do know B must still be evaluated.
B, A, C - this is the minimum sort of hoist/optimization, doing the discriminator before the asserts.

Questions:

1) Are any other orders of evaluation allowed?
2) If evaluation of A fails, do we still evaluate assertion C? (My hope is the answer here is no, because that allows consecutive asserts to build on each other's assumptions. But the spec is unclear.)
3) Can users depend on the failure of A to generate a message output? (ex: if the assert has a message attribute, can we state that this message will somehow be exhibited or logged by the implementation, unless the failure is suppressed by backtracking at a point of uncertainty)

If A fails, the spec does say that B must still be evaluated.
But if A fails, will C be evaluated?

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com

Another question: For asserts and discriminators with testKind 'pattern', if a given annotation point in the schema has both, do we execute the pattern discriminators first, or do we use schema definition order? If the latter, do we still evaluate the pattern discriminators even if the pattern asserts fail?

(Follow-up to Mike Beckerle's message of Wed, Jan 8, 2025, 1:36 PM.)

Hi Mike

Spec section 7.5.1 says the following.

"If the resolved set of statement annotations for a schema component contains multiple dfdl:assert statements, then those with testKind 'pattern' are executed before those with testKind 'expression' (the default). However, within each group the order of execution among them is not specified.

If one of the resolved set of asserts for a schema component is unsuccessful, and the failureType of the assert is ‘processingError’, then no further asserts in the set are executed."

That seems clear to me. Schema authors should not rely on the ordering of asserts.

Spec section 9.5.1 says

"Implementations are free to optimize by recognizing and executing discriminators or asserts with testKind 'expression' earlier so long as the resulting behavior is consistent with what results from the description above."

Spec section 9.5.2, as you indicate, says

"When parsing, an attempt to evaluate a discriminator MUST be made even if preceding statements or the parse of the schema component ended in a Processing Error.

This is because a discriminator's expression can evaluate to true thereby resolving a point of uncertainty even if the complete parsing of the construct ultimately caused a Processing Error."

So an attempt to answer your questions

1) Yes
2) No
3) No, because C might be evaluated first and fail

Regards
Steve

(In reply to Mike Beckerle's message of Wed, Jan 8, 2025, 6:48 PM.)
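The rules quoted above can be sketched as a small simulation. This is only an illustrative model, not how any DFDL implementation is written: the names Assert, Discriminator and evaluate_statements are hypothetical, the shuffle stands in for "order not specified", and placing the discriminator attempt after the asserts is just one choice a non-optimizing implementation might make.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Assert:  # hypothetical model of a dfdl:assert
    name: str
    test: Callable[[], bool]
    test_kind: str = "expression"          # or "pattern"
    failure_type: str = "processingError"  # or "recoverableError"


@dataclass
class Discriminator:  # hypothetical model of a dfdl:discriminator
    name: str
    test: Callable[[], bool]


def evaluate_statements(
    asserts: List[Assert],
    discriminator: Optional[Discriminator] = None,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], Optional[str], Optional[bool]]:
    """Return (names evaluated in order, first failed assert, discriminator result)."""
    if rng is None:
        rng = random.Random()
    pattern = [a for a in asserts if a.test_kind == "pattern"]
    expression = [a for a in asserts if a.test_kind == "expression"]
    # Order within each group is unspecified; shuffling models that freedom.
    rng.shuffle(pattern)
    rng.shuffle(expression)

    evaluated: List[str] = []
    failed: Optional[str] = None
    # Pattern asserts run before expression asserts.
    for a in pattern + expression:
        evaluated.append(a.name)
        if not a.test() and a.failure_type == "processingError":
            failed = a.name
            break  # no further asserts in the set are executed

    result: Optional[bool] = None
    if discriminator is not None:
        # The discriminator attempt is made even if an assert failed.
        evaluated.append(discriminator.name)
        result = discriminator.test()
    return evaluated, failed, result


# A and C both fail; B is the discriminator from the original question.
a = Assert("A", lambda: False)
c = Assert("C", lambda: False)
b = Discriminator("B", lambda: True)
outcomes = {evaluate_statements([a, c], b, random.Random(seed))[1] for seed in range(50)}
print(outcomes)  # which assert is reported as failed depends on the order chosen
```

In this model, whichever of A or C happens to run first is the one that fails and stops the others, which is why the message attached to A cannot be relied upon to appear.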

Regarding Mike's subsequent question about testKind 'pattern', my initial reaction is that the logic behind section 9.5.2 also applies here; it's just more restricted in that the only thing that could possibly fail (ahead of the discriminator being evaluated) is one of the asserts (which must also have testKind 'pattern'). If that is so, then we don't have to say anything further about relative order of assert and discriminator execution.

Have a think and we'll discuss further on next call.

Regards
Steve

(Follow-up to Steve Hanson's message of Thu, Jan 9, 2025, 3:50 PM.)
Participants (2): Mike Beckerle, Steve Hanson