Re: [DFDL-WG] simpleType cannot contain assert?

The example is pretty simple. I have strings that need asserts to verify proper format. Nothing like working on a real format to put the small but critical issues into focus.

I overlooked that you cannot put an assert on a global element either. I read that too quickly, but there it is. I'm not sure what we were thinking here. Perhaps it was as simplistic as avoiding global constructs with ".." in the paths of expressions. But this flies in the face of creating tidy schemas that avoid repeating asserts or big complex expressions all over the place. Asserts are among the untidiest things because they contain regular expressions, which are unwieldy, difficult to maintain, and badly need to be centralized in any reasonable schema.

So I'll amend my proposal: allow asserts on global element decls and on simple type defs. That is, whether an assert is allowed is orthogonal to whether the site is local or global.

Here's the example. If I can put these as asserts on a simpleType, then I can abstract over them in a schema, using the same regex assert for many fields with different names. If I cannot, then I have no choice but to nest them inside a complex type, and all these simple string fields become complex types, which adds a whole tier of elements to the schema. E.g.

    <simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
      <restriction base="xs:string"/>
    </simpleType>

then all over the schema.....
    <sequence dfdl:ref="dFieldListFormat">
      <element name="Foo" type="ex:dField">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
            <dfdl:assert testKind="pattern" testPattern="((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)(((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)| )*((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)|(\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)" message="Assertion failed for data_code" />
          </xs:appinfo>
        </xs:annotation>
      </element>
      ....

repeat ad nauseam for all other elements of type dField, which is practically EVERYTHING.

This absolutely flies in the face of any good principles of abstraction. I want that regex, and the characteristics of the data that have to obey it, captured in one place. I also really do not want to have to turn every element like Foo, which is logically just a string, into a complex type.

What I need to be able to do is move that assertion over into the definition of dField. I can't do it with a pattern facet, because that doesn't affect parsing, and this assertion needs to guide speculative parsing. (Note: XML Schema allows patterns on simpleType defs and so allows proper abstraction of simple types that use pattern validation. DFDL is currently not consistent with this way of abstracting.)

Now, currently the spec says I can't put an assert on a global element either, so I can't fix the above issue by doing this:

    <element name="dValue" type="xs:string">
      ....big mombo assert with regex ... NOT ALLOWED HERE EITHER
    </element>

    <complexType name="dField">
      <sequence>
        <element ref="ex:dValue"/>
      </sequence>
    </complexType>

    <sequence dfdl:ref="dFieldListDefaults">
      <element name="Foo" type="dField"/>
      ....
But I could push the assert down inside the complex type definition, where it would be on a local element decl, and if I share that complex type everywhere, I can centralize the regex. What is this restriction possibly achieving? The assertion is still "trying to be on a global definition"; we've just prevented it from being in a convenient place for the modeler.

Also, as soon as you force me to use a complex type, I'm stuck with the complex type restrictions on nillability, which are insufficient for my needs, so I can't take advantage of mapping data containing non-empty nil indicators to xsi:nil nillability. I have to create an <element name="noValue" ....> and get into using choices, etc.

I should note that the spec already allows asserts on a sequence/choice that is a global group definition, and on global group references. I believe this is just a happy accident of the XML Schema object model, i.e., the sequence/choice object isn't the same object as the global definition object; it is contained inside the global group def object. The point: there is already a precedent for having to combine the lists of assertions from both definitions and references to them. I think the rule is simple: asserts on references go after the asserts on the definitions, but otherwise it is just merging the lists onto the referencing schema component, to be executed at the reference.

The concern about global objects, that is, about things having expressions/properties that can't be resolved, isn't one that keeping assertions off global defs/decls will help. A global def/decl can contain a subset of DFDL properties such that it is useless in DFDL outside of some complementary referencing context. This is inherent in DFDL, and in any system that allows separation of concerns when describing something complex, including XML Schema itself, which has global types, groups, etc., all of which are useless without their referencing contexts.
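As a rough sketch of what the amended proposal would permit (this placement is the thing being proposed, not something the spec text under discussion allows), the shared assert could sit on the simpleType once, with an element adding its own assert on top. The `ex` namespace URI, the `record` and `Bar` names, and the two short stand-in patterns are assumptions made for illustration; the real regex and all DFDL format properties are left out to keep the fragment short.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:ex="http://example.com/dField"
           targetNamespace="http://example.com/dField">

  <!-- PROPOSED placement: the assert lives on the simple type, written once. -->
  <xs:simpleType name="dField">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- Stand-in pattern (hypothetical), not the real data_code regex. -->
        <dfdl:assert testKind="pattern" testPattern="\S(.*\S)?"
                     message="Assertion failed for dField"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <!-- Foo stays a plain string element and picks up the type's assert. -->
        <xs:element name="Foo" type="ex:dField"/>

        <!-- Bar adds its own assert. Under the suggested merge rule, the
             type's assert would run first, then this one. -->
        <xs:element name="Bar" type="ex:dField">
          <xs:annotation>
            <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
              <dfdl:assert testKind="pattern" testPattern="[A-Z].*"
                           message="Bar must start with an upper-case letter"/>
            </xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Neither Foo nor Bar has to become a complex type here, which is the point of the proposal; the ordering comment reflects only the rule suggested in this message.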
I would close by noting that this change (allowing these additional locations for asserts) is backward compatible with our current spec, because it only adds new places where these assertions are allowed.

The right fix here: DFDL 'statements' (setVariable, assert, discriminator, even newVariableInstance) should be allowed on simpleType and on global element decls.

Even newVariableInstance is useful there. A new variable is created, used in expressions in asserts/discriminators/setVariable/other newVariableInstance statements, and then it goes out of scope immediately at the end of the element. newVariableInstance is in fact our only way of really controlling the complexity of an expression. It lets you build a big complicated expression out of several variables, each bound to a sub-calculation, so it is useful even when the scope is just the asserts/discriminators/setVariables/newVariableInstance statements of a single simple-typed element. newVariableInstance basically lets us create local variables for use in our expression language. Normal XPath doesn't have this.

...mike

On Fri, Oct 26, 2012 at 4:11 AM, Steve Hanson <smh@uk.ibm.com> wrote:
Mike, it's not an oversight; I'm sure it is for the same reason that you can't put an assert on a global element. I think the rationale is that asserts (and discriminators) are only allowed at annotation points where all properties can be resolved.
Please can you show an example of what you want to achieve?
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group <http://www.ogf.org/dfdl/>
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 25/10/2012 17:47
Subject: [DFDL-WG] simpleType cannot contain assert?
Sent by: dfdl-wg-bounces@ogf.org
Is this just an oversight?
I find it very tedious to model if I can't put the asserts on the simpleTypes so that I don't have to repeat them over and over in the model.
The composition rule here is simple. If an element has asserts, and its simple type has asserts, both are executed, with the element's asserts run after the simpleType's asserts.
...mike
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
-- Mike Beckerle | OGF DFDL WG Co-Chair Tel: 781-330-0412
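
The newVariableInstance point in the message above can be sketched roughly as follows: a variable scoped to one simple-typed global element, bound to a sub-calculation and then used in an assert. This illustrates the proposed placement, not what the spec text under discussion allows. The variable name `len`, the `ex` namespace URI, the 1 to 40 length bound, and the use of setVariable to bind the value after the element is parsed are all assumptions made for the example; format properties are omitted.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:fn="http://www.w3.org/2005/xpath-functions"
           xmlns:ex="http://example.com/dField"
           targetNamespace="http://example.com/dField">

  <!-- Variables are declared once at the top level of the schema. -->
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <dfdl:defineVariable name="len" type="xs:integer"/>
    </xs:appinfo>
  </xs:annotation>

  <!-- PROPOSED placement: statements on a global, simple-typed element. -->
  <xs:element name="Foo" type="xs:string">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- Fresh instance that goes out of scope at the end of Foo. -->
        <dfdl:newVariableInstance ref="ex:len"/>
        <!-- Bind the sub-calculation; assumed to run once Foo's value is known. -->
        <dfdl:setVariable ref="ex:len" value="{ fn:string-length(.) }"/>
        <!-- The assert reads the variable instead of repeating the sub-expression. -->
        <dfdl:assert testKind="expression"
                     test="{ $ex:len ge 1 and $ex:len le 40 }"
                     message="Foo must be 1 to 40 characters long"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>
```

With more variables bound the same way, a long test expression could be split into named pieces, which is the "local variables" use described in the message.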

Mike - I see your point about asserts being specified on simple types; this is like a glorified pattern facet. The question I have is: can an XML Schema pattern facet on a simple type achieve the same objective? If you start allowing relative paths in an assert on a simple type, it becomes a lot more complex. From a validation perspective, you cannot validate the global simple type on its own, and relative paths would also restrict reuse, since the type could only be used where the preceding elements in the structure are identical.

Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923 T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message broker http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht...

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 10/28/2012 11:29 PM
Subject: Re: [DFDL-WG] simpleType cannot contain assert?
Sent by: dfdl-wg-bounces@ogf.org

Suman,

The XML Schema pattern facet does not always achieve the same objective. In DFDL it is allowed only on xs:string, because it matches the lexical value of the infoset value. The assert testKind 'pattern' applies directly to the physical data and is allowed for all types. But that's not really the point - Mike wants to keep the assert with the type from an encapsulation perspective.

The relative path issue just means that the path can't be validated when the simple type is validated. But that's true of any property on a simple type that can take an expression. IBM DFDL avoids this by validating simple types at their point of use.

Regards
Steve Hanson

From: Suman Kalia <kalia@ca.ibm.com>
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Steve Hanson/UK/IBM@IBMGB
Date: 29/10/2012 12:59
Subject: Re: [DFDL-WG] simpleType cannot contain assert?

I see Mike and your points and have no issue as long as the values in the assert are scalar as shown in Mike's example.. The primary motive of putting these asserts ( or pattens etc) on types , is that can be reused at many different places.. My main concern is with relative paths specified on type, this constraints re-usability. I would like to see some real motivating examples / use case.. Suman Kalia IBM Canada Lab WMB Toolkit Architect and Development Lead Tel: 905-413-3923 T/L 313-3923 Email: kalia@ca.ibm.com For info on Message broker http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht... From: Steve Hanson <smh@uk.ibm.com> To: Suman Kalia/Toronto/IBM@IBMCA, Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Mike Beckerle <mbeckerle.dfdl@gmail.com> Date: 10/29/2012 09:06 AM Subject: Re: [DFDL-WG] simpleType cannot contain assert? Suman XML schema pattern facet does not always achieve same objective. In DFDL it is allowed only on xs:string, because it is matching the lexical value of the infoset value. The assert testKind 'pattern' applies directly to the physical data and is allowed for all types. But that's not really the point - Mike wants to keep the assert with the type from an encapsulation perspective. The relative path issue just means that the path can't be validated when the simple type is validated. But that's true about any property on a simple type that can take an expression. IBM DFDL avoids this by validating simple types at their point of use. Regards Steve Hanson Architect, Data Format Description Language (DFDL) Co-Chair, OGF DFDL Working Group IBM SWG, Hursley, UK smh@uk.ibm.com tel:+44-1962-815848 From: Suman Kalia <kalia@ca.ibm.com> To: Mike Beckerle <mbeckerle.dfdl@gmail.com>, Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Steve Hanson/UK/IBM@IBMGB Date: 29/10/2012 12:59 Subject: Re: [DFDL-WG] simpleType cannot contain assert? 
Mike - I see your point for asserts being specified on simple types, this is like glorified pattern facet. Question that I have is " Can XML Schema pattern facet on simple type achieve the same objective" . If you start allowing relative paths on the simple type in assert, then it becomes lot more complex; from validation perspective you cannot validate the global simple type on its own and also having relative paths would restrict reuse as the subject type could only be used where the preceding elements in the structures are identical.. Suman Kalia IBM Canada Lab WMB Toolkit Architect and Development Lead Tel: 905-413-3923 T/L 313-3923 Email: kalia@ca.ibm.com For info on Message broker http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht... From: Mike Beckerle <mbeckerle.dfdl@gmail.com> To: Steve Hanson <smh@uk.ibm.com>, Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org Date: 10/28/2012 11:29 PM Subject: Re: [DFDL-WG] simpleType cannot contain assert? Sent by: dfdl-wg-bounces@ogf.org The example is pretty simple. I have strings that need asserts to verify proper format. Nothing like working on a real format to put the small but critical issues into focus. I overlooked that you cannot put an assert on a global element. I read that too quickly, but there it is. I'm not sure what we were thinking here. Perhaps it was as simplistic as avoiding global constructs with ".." in the paths of expressions. But this just flies in the face of creating tidy schemas that avoid repetition of asserts or big complex expressions all over the place. And asserts are one of the very untidy things because they contain regular expressions, which are very unweildy and difficult to maintain and badly badly need to be centralized in any reasonable schema. So I'll amend my proposal: allow asserts on global element decls, and on simple type defs. That is, they are orthogonal in placement to whether the site is local or global. 
Here's the example: If I can put these as asserts on a simpleType, then I can abstract over them in a schema, using the same regex assert for many fields with different names. If I cannot, then I have no choice but to nest them inside a complex type, and all these simple string fields become complex types, which is adding a whole tier of elements to the schema. E.g. <simpleType name="dField" dfdl:ref="ex:dFieldDefaults"> <restriction base="xs:string"> </simpleType> then all over the schema..... <sequence dfdl:ref="dFieldListFormat"> <element name="Foo" type="ex:dField> <xs:annotation> <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/"> <dfdl:assert testKind="pattern" testPattern="((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)(((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)| )*((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)|\-)|(\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&|\*|=|_|\+|\[|\]|\{|\}|\\|"|'|;|<|>|~|\|)" message="Assertion failed for data_code" /> </xs:appinfo> </xs:annotation> </element> .... repeat ad nauseum for all other elements of type dField, which is EVERYTHING practically. This absolutely flies in the face of any good principles of abstraction. I want that regex, and the characteristics of data that have to obey it, captured in one place. I also really do not want to have to turn every element like Foo, which is logically just a string, into a complex type. What I need to be able to do is rotate that assertion over into the definition of dField. I can't do it with a pattern facet, because that doesn't affect parsing, and this assertion needs to guide speculative parsing. (Note: XML Schema allows patterns on simpleType defs and so allows proper abstraction of simple types that use pattern validation. DFDL is currently not consistent with this way of abstracting.) 
Now, currently the spec says I can't put an assert on a global element either, so I can't fix the above issue by doing this:

    <element name="dValue" type="xs:string">
      ....big mombo assert with regex ... NOT ALLOWED HERE EITHER
    </element>

    <complexType name="dField">
      <sequence>
        <element ref="ex:dValue"/>
      </sequence>
    </complexType>

    <sequence dfdl:ref="dFieldListDefaults">
      <element name="Foo" type="dField"/>
      ....

But I could push the assert down inside the complex type definition, where it would be on a local element decl, and if I share that complex type everywhere, then I can centralize the regex. What is this restriction possibly achieving? The assertion is still "trying to be on a global definition"; we've just prevented it from being in a convenient place for the modeler.

Also, as soon as you force me to use a complex type, I'm stuck with the complex type restrictions on nillability, which are insufficient for my needs, so I can't take advantage of mapping data containing non-empty nil indicators to xsi:nil nillability. I have to create an <element name="noValue" ....> and get into using choices, etc.

I should note that precedent in the spec already allows asserts on a sequence/choice that is a global group definition, and on global group references. I believe this is just a happy accident of the XML Schema object model, i.e., the fact that the sequence/choice object isn't the same object as the global definition object; it is contained inside the global group def object.

The point: there is already a precedent for having to combine the lists of assertions from both definitions and references to them. I think the rule is simple: asserts on references go after the asserts on the definitions, but otherwise it is just merging the lists onto the reference schema component, to be executed at the reference.
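The merging rule described above could be pictured roughly as follows. This is only a sketch: the group name recordBody, the type name record, the element code, and both test patterns are invented for illustration, the xs, dfdl and ex prefixes are assumed to be bound in the enclosing schema, and the ordering noted in the comments is the rule proposed in this message rather than wording taken from the spec.

```xml
<!-- Global group definition: the assert is attached to the sequence inside it. -->
<xs:group name="recordBody">
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- Under the proposed ordering, this assert would run first. -->
        <dfdl:assert testKind="pattern" testPattern="[A-Z]{2}"
                     message="record must start with two capital letters"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="code" type="xs:string"/>
  </xs:sequence>
</xs:group>

<!-- A reference to that group, carrying an assert of its own. -->
<xs:complexType name="record">
  <xs:sequence>
    <xs:group ref="ex:recordBody">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <!-- Under the proposed ordering, this assert would run second. -->
          <dfdl:assert testKind="pattern" testPattern="HD"
                       message="this use site expects an HD record"/>
        </xs:appinfo>
      </xs:annotation>
    </xs:group>
  </xs:sequence>
</xs:complexType>
```

Both lists end up on the reference, definition asserts first. The same merge would apply to a simpleType and the elements that use it.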
The concern about global objects, that is, about things having expressions/properties that can't be resolved, isn't one that keeping assertions off global defs/decls will help. A global def/decl can contain a subset of DFDL properties such that it is useless in DFDL outside of some complementary referencing context. This is inherent in DFDL, and in any system that allows separation of concerns when describing something complex, including XML Schema itself, which has global types, groups, etc., all of which are useless without their referencing contexts.

I would close by noting that this change (allowing these additional locations for asserts) is backward compatible with our current spec, because it only adds new places where these assertions are allowed.

The right fix here: DFDL 'statements' (setVariable, assert, discriminator, even newVariableInstance) should be allowed on simpleType and on global element decls.

Even newVariableInstance is useful there. A new variable is created, used in expressions in asserts/discriminators/setVariable/other newVariableInstance statements, and then it goes out of scope immediately at the end of the element. The newVariableInstance is in fact our only way of really controlling the complexity of an expression. It lets you build a big complicated expression out of several variables, each of which is bound to a sub-calculation, and so it is useful even when the scope is just the asserts/discriminators/setVariables/newVariableInstance statements of a single simple-typed element. newVariableInstance basically lets us create local variables for use in our expression language. Normal XPath doesn't have this.

...mike

On Fri, Oct 26, 2012 at 4:11 AM, Steve Hanson <smh@uk.ibm.com> wrote:

Mike, it's not an oversight; I'm sure it is for the same reason that you can't put an assert on a global element. I think the rationale is that asserts (and discriminators) are only allowed at annotation points where all properties can be resolved.
Please can you show an example of what you want to achieve?

Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org,
Date: 25/10/2012 17:47
Subject: [DFDL-WG] simpleType cannot contain assert?
Sent by: dfdl-wg-bounces@ogf.org

Is this just an oversight? I find it very tedious to model if I can't put the asserts on the simpleTypes so that I don't have to repeat them over and over in the model.

The composition rule here is simple. If an element has asserts, and its simple type has asserts, both are executed, with the element's asserts run after the simpleType's asserts.

...mike

--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
participants (3)
- Mike Beckerle
- Steve Hanson
- Suman Kalia