Mike - I see your point about asserts
being specified on simple types; this is like a glorified pattern facet.
The question I have is: "Can the XML Schema pattern facet
on a simple type achieve the same objective?" If you start allowing
relative paths in an assert on a simple type, it becomes a lot more
complex. From a validation perspective you cannot validate the global simple
type on its own, and relative paths would also restrict reuse, as
the type could only be used where the preceding elements in the
structure are identical.
Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923 T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message Broker: http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 10/28/2012 11:29 PM
Subject: Re: [DFDL-WG] simpleType cannot contain assert?
Sent by: dfdl-wg-bounces@ogf.org
The example is pretty simple. I have strings that need
asserts to verify proper format. Nothing like working on a real format
to put the small but critical issues into focus.
I overlooked that you cannot put an assert on a global element. I read
that too quickly, but there it is.
I'm not sure what we were thinking here. Perhaps it was as simplistic as
avoiding global constructs with ".." in the paths of expressions.
But this just flies in the face of creating tidy schemas that avoid repeating
asserts or big complex expressions all over the place. Asserts are
among the untidiest things in a schema because they contain regular expressions,
which are unwieldy, difficult to maintain, and badly need
to be centralized in any reasonable schema.
So I'll amend my proposal: allow asserts on global element decls, and on
simple type defs. That is, they are orthogonal in placement to whether
the site is local or global.
Here's the example:
If I can put these as asserts on a simpleType, then I can abstract over
them in a schema, using the same regex assert for many fields with different
names.
If I cannot, then I have no choice but to nest them inside a complex type,
and all these simple string fields become complex types, which is adding
a whole tier of elements to the schema.
E.g.
<simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
  <restriction base="xs:string"/>
</simpleType>
then all over the schema.....
<sequence dfdl:ref="dFieldListFormat">
  <element name="Foo" type="ex:dField">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <dfdl:assert
          testKind="pattern"
          testPattern="((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)(((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)| )*((\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)|\-)|(\p{L}|\d|\.|,|\(|\)|\?|!|@|#|\$|%|\^|&amp;|\*|=|_|\+|\[|\]|\{|\}|\\|&quot;|'|;|&lt;|>|~|\|)"
          message="Assertion failed for data_code" />
      </xs:appinfo>
    </xs:annotation>
  </element>
  .... repeat ad nauseam for all other elements of type dField,
  which is practically EVERYTHING.
This absolutely flies in the face of any good principles of abstraction.
I want that regex, and the characteristics of data that have to obey it,
captured in one place. I also really do not want to have to turn every
element like Foo, which is logically just a string, into a complex type.
What I need to be able to do is rotate that assertion over into the definition
of dField.
I can't do it with a pattern facet, because that doesn't affect parsing,
and this assertion needs to guide speculative parsing.
(Note: XML Schema allows patterns on simpleType defs and so allows proper
abstraction of simple types that use pattern validation. DFDL is currently
not consistent with this way of abstracting.)
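For concreteness, here is a sketch of what the amended proposal would permit, with the assert moved onto the simpleType definition. This placement is the proposal under discussion, not something the spec as described in this thread allows. The short testPattern is a hypothetical stand-in for the long regex above, the element Bar is made up for illustration, and the xs: and ex: prefixes are assumed to be bound as in the earlier fragments.

```xml
<!-- Proposed placement (not currently allowed): the assert lives on the type, once. -->
<xs:simpleType name="dField" dfdl:ref="ex:dFieldDefaults">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <!-- Placeholder pattern only; the real data_code regex would go here. -->
      <dfdl:assert testKind="pattern"
                   testPattern="[A-Z0-9]+"
                   message="Assertion failed for dField" />
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string"/>
</xs:simpleType>

<!-- Each use site stays a plain simple-typed element with no repeated regex. -->
<xs:sequence dfdl:ref="ex:dFieldListFormat">
  <xs:element name="Foo" type="ex:dField"/>
  <xs:element name="Bar" type="ex:dField"/>
</xs:sequence>
```

Because Foo and Bar remain simple-typed elements, they would also keep simple-type nillability, which is the other motivation given in this message.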
Now, currently the spec says I can't put an assert on a global element
either, so I can't fix the above issue by doing this:
<element name="dValue" type="xs:string">
  ....big mombo assert with regex ... NOT ALLOWED HERE EITHER
</element>
<complexType name="dField">
  <sequence>
    <element ref="ex:dValue"/>
  </sequence>
</complexType>
<sequence dfdl:ref="dFieldListDefaults">
  <element name="Foo" type="dField"/>
  ....
But I could push the assert down inside the complex type definition, where
it would be on a local element decl, and if everywhere I share use of that
complex type, then I can centralize the regex.
What is this restriction possibly achieving? The assertion is still "trying
to be on a global definition", we've just prevented it from being
in a convenient place for the modeler.
Also, as soon as you force me to use a complex type, I'm stuck with
the complex type restrictions on nillability, which are insufficient for
my needs, so I can't map data containing non-empty
nil indicators to xsi:nil. I have to create an <element name="noValue"
....> and get into using choices, etc.
I should note that the spec already allows asserts both on a sequence/choice
inside a global group definition and on global group references. I believe
this is just a happy accident of the XML Schema object model, i.e., the
fact that the sequence/choice object isn't the same object as the global
definition object; it is contained inside the global group def object.
The point: there is already a precedent for having to combine the lists
of assertions from both definitions and references to them. I think the rule
is simple: asserts on references go after the asserts on the definitions, but
otherwise it is just merging the lists onto the referencing schema component,
to be executed at the reference.
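The existing precedent can be illustrated with a global group, where asserts may appear both on the sequence inside the definition and on the reference. The group name, element name, and test expressions below are hypothetical, and the ordering noted in the comments is the rule suggested in this message rather than wording taken from the spec.

```xml
<!-- Global group definition: the assert sits on the sequence inside it. -->
<xs:group name="dHeader">
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- From the definition: would run first under the suggested rule. -->
        <dfdl:assert test="{ ./len gt 0 }" message="len must be positive" />
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="len" type="xs:int"/>
  </xs:sequence>
</xs:group>

<!-- A reference to the group from inside some complex type's content. -->
<xs:sequence>
  <xs:group ref="ex:dHeader">
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
        <!-- From the reference: would run after the definition's assert. -->
        <dfdl:assert test="{ ./len le 80 }" message="len must not exceed 80" />
      </xs:appinfo>
    </xs:annotation>
  </xs:group>
</xs:sequence>
```

The same merge, definition asserts first and then use-site asserts, is what the proposal would apply to a simpleType and the elements that use it.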
The concern about global objects, namely that they would carry expressions or properties
that can't be resolved, isn't one that keeping assertions off global defs/decls
will help. A global def/decl can already contain a subset of DFDL properties
such that it is useless in DFDL outside of some complementary referencing
context. This is inherent in DFDL and in any system that allows separation
of concerns when describing something complex, including XML Schema itself,
which has global types, groups, etc., all of which are useless without their
referencing contexts.
I would close by noting that this change (allowing these additional locations
for asserts) is backward compatible with our current spec, because it
only adds new places where these assertions are allowed.
The right fix here: DFDL 'statements' (setVariable, assert, discriminator,
even newVariableInstance) should be allowed on simpleType and on global
element decls.
Even newVariableInstance is useful there. A new variable is created, used
in expressions in asserts/discriminators/setVariable/ other newVariableInstance,
and then it goes out of scope immediately at the end of the element. The
newVariableInstance is in fact our only way of really controlling the complexity
of an expression. It lets you create a big complicated expression out of
several variables each of which is bound to a sub-calculation, and so it
is useful even when the scope is just the asserts/discriminators/setVariables/newVariableInstance
statements of a single simple-typed element. newVariableInstance basically
lets us create local variables for use in our expression language. Normal
XPath doesn't have this.
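A rough sketch of the "local variable" idea follows. It puts newVariableInstance on a sequence, which is where I understand the current spec to allow it; the proposal would permit the same pair of statements directly on an element or simpleType. The variable name ex:area, the element names, the bounds, and the assumption that path expressions on the inner sequence resolve relative to the enclosing rect element are all illustrative and should be checked against the spec.

```xml
<!-- Top-level declaration of the variable (hypothetical name). -->
<xs:annotation>
  <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
    <dfdl:defineVariable name="area" type="xs:int" />
  </xs:appinfo>
</xs:annotation>

<xs:element name="rect">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="width" type="xs:int"/>
      <xs:element name="height" type="xs:int"/>
      <xs:sequence>
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
            <!-- The new instance is in scope for this inner sequence only. -->
            <dfdl:newVariableInstance ref="ex:area"
                defaultValue="{ xs:int(./width * ./height) }" />
            <!-- The sub-calculation is named once and reused in the test. -->
            <dfdl:assert test="{ ($ex:area gt 0) and ($ex:area le 10000) }"
                message="area out of range" />
          </xs:appinfo>
        </xs:annotation>
        <xs:element name="label" type="xs:string"/>
      </xs:sequence>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

The variable holds an intermediate result so the assert stays short, and the instance goes out of scope at the end of the inner sequence.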
...mike
On Fri, Oct 26, 2012 at 4:11 AM, Steve Hanson <smh@uk.ibm.com>
wrote:
Mike, it's not an oversight. I'm sure
it is for the same reason that you can't put an assert on a global element.
I think the rationale is that asserts (and discriminators) are only allowed
at annotation points where all properties can be resolved.
Please can you show an example of what you want to achieve?
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 25/10/2012 17:47
Subject: [DFDL-WG] simpleType cannot contain assert?
Sent by: dfdl-wg-bounces@ogf.org
Is this just an oversight?
I find it very tedious to model if I can't put the asserts on the simpleTypes
so that I don't have to repeat them over and over in the model.
The composition rule here is simple. If an element has asserts, and its
simple type has asserts, both are executed, with the element's asserts
run after the simpleType's asserts.
...mike
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU