To me, no properties apply to a complex type; rather, they apply to the model group (sequence or choice) that gives the complex type its meaning.
That is, we don't have to distinguish a complex type from the model group that defines it.
...mike
Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2125 | 100 Fifth Ave., 4th Floor, Waltham MA 02451 | mbeckerle.dfdl@gmail.com
Mike,
That looks reasonable.
However, as you must still be able to specify dfdl:initiator/terminator on the complexType for scoping, we need to somehow make it clear that the grammar describes where the properties APPLY, not where they are SPECIFIED.
Do any properties APPLY to a complexType?
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM
email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898
From: "Mike Beckerle" <mbeckerle.dfdl@gmail.com>
To: <dfdl-wg@ogf.org>
Date: 13/05/2009 20:09
Subject: [DFDL-WG] Grammar issue - simple and complex asymetry
The draft 034 grammar productions do not allow for a separate prefix/suffix for a simple type as distinguished from the element having that type.
Draft 034 does allow for an element of complex type to have a separate prefix and suffix for the element itself and another one for the sequence or choice inside it.
I've come to believe this is a mistake and I suggest a fix below.
Right now the grammar is:
    Element = SimpleElement | ComplexElement
    SimpleElement = Prefix SimpleContent Suffix
    SimpleContent = StringText // terminal. No more prefixes/suffixes
    ComplexElement = Prefix ComplexContent Suffix
    ComplexContent = Sequence | Choice
    Sequence = Prefix SequenceContent Suffix
    Choice = Prefix ChoiceContent Suffix
So, if I do:
    <complexType dfdl:initiator="[" dfdl:terminator="]">
      ...
      <element name="y">
        <complexType>
          <sequence dfdl:separator=",">
            <element name="x" type="int"/>
            <element name="z" type="int"/>
          </sequence>
        </complexType>
      </element>
      ...
    </complexType>
I have two prefix opportunities. I can flatten the productions above to:
    ComplexElement = Prefix Prefix SequenceContent Suffix Suffix
An instance of this type would look like [[[5],[6]]]. That is, for complex types, there are separate prefix and suffix regions for the element, and for the model group which makes up its content.
The first [ initiates element y.
The second [ initiates the sequence.
The third [ initiates element x.
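As a rough illustration (not part of draft 034), the flattened production can be mimicked with plain string concatenation. The function names below are hypothetical, and the delimiters are hard-coded from the example schema rather than resolved through any real DFDL scoping rules; this is a sketch of the bracket counting only.

```python
# Illustrative sketch only: mimics the draft 034 productions by string
# concatenation. All function names are hypothetical; the delimiter values
# are copied from the example schema above.
INITIATOR = "["   # dfdl:initiator in scope
TERMINATOR = "]"  # dfdl:terminator in scope
SEPARATOR = ","   # dfdl:separator on the sequence


def simple_element(value):
    # SimpleElement = Prefix SimpleContent Suffix
    return INITIATOR + str(value) + TERMINATOR


def sequence(children):
    # Sequence = Prefix SequenceContent Suffix
    return INITIATOR + SEPARATOR.join(children) + TERMINATOR


def complex_element_draft034(children):
    # ComplexElement = Prefix ComplexContent Suffix
    # The element adds its own prefix/suffix around the sequence's pair.
    return INITIATOR + sequence(children) + TERMINATOR


# Element y with children x = 5 and z = 6.
y = complex_element_draft034([simple_element(5), simple_element(6)])
print(y)  # [[[5],[6]]]
```

The outermost bracket pair comes from element y, the middle pair from the sequence, and the innermost pairs from the simple elements x and z.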
This same behavior is not true for simple types:
    <complexType dfdl:initiator="[" dfdl:terminator="]">
      ...
      <element name="y">
        <simpleType>
          <restriction base="int"/>
        </simpleType>
      </element>
      ...
    </complexType>
This can only mean [5]. The grammar, as formulated in draft 034, does not allow for more than one prefix or suffix.
The [ is the initiator of element y.
I believe we should fix this as follows.
New grammar:
    Element = SimpleElement | ComplexElement
    SimpleElement = Prefix SimpleContent Suffix
    SimpleContent = StringText
    ComplexElement = ComplexContent // Note: no more surrounding prefix suffix.
    ComplexContent = Sequence | Choice
    Sequence = Prefix SequenceContent Suffix
    Choice = Prefix ChoiceContent Suffix
The above grammar arranges for an element of complex type and its model group, taken together, to specify a single prefix and suffix.
Revisiting our example (just repeating it here):
    <complexType dfdl:initiator="[" dfdl:terminator="]">
      ...
      <element name="y">
        <complexType>
          <sequence dfdl:separator=",">
            <element name="x" type="int"/>
            <element name="z" type="int"/>
          </sequence>
        </complexType>
      </element>
      ...
    </complexType>
An instance now would look like [[5],[6]].
The first [ is the initiator of element y, which is the same as the initiator of the sequence that is its type.
The second [ is the initiator of element x (which is the same as the initiator of the int that is its type).
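The effect of the proposed productions can be sketched in the same informal way. Again, the function names are hypothetical and the delimiters are simply hard-coded from the example; the only point is that a complex element no longer contributes a prefix/suffix pair of its own, while a simple element is unaffected.

```python
# Illustrative sketch only: mimics the proposed productions by string
# concatenation. All function names are hypothetical; the delimiter values
# are copied from the example schema.
INITIATOR = "["   # dfdl:initiator in scope
TERMINATOR = "]"  # dfdl:terminator in scope
SEPARATOR = ","   # dfdl:separator on the sequence


def simple_element(value):
    # SimpleElement = Prefix SimpleContent Suffix (unchanged by the proposal)
    return INITIATOR + str(value) + TERMINATOR


def sequence(children):
    # Sequence = Prefix SequenceContent Suffix
    return INITIATOR + SEPARATOR.join(children) + TERMINATOR


def complex_element_proposed(children):
    # ComplexElement = ComplexContent
    # No surrounding prefix/suffix: the element shares the sequence's pair.
    return sequence(children)


# Element y with children x = 5 and z = 6.
y = complex_element_proposed([simple_element(5), simple_element(6)])
print(y)                  # [[5],[6]]
print(simple_element(5))  # [5]
```

Under this reading, simple and complex elements each carry exactly one prefix/suffix pair, which is the symmetry the proposal is after.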
I believe this is more sensible, as it makes the behavior of simple and complex types more similar.
It raises the question of how one combines conflicting properties on an element with the properties on the type, and even the model group inside the type in the complex case, because all these properties describe the same syntax fields in the grammar.
That's a separate topic in a subsequent email.