Suman
Please can you review the proposal below? We would like to close on this
on the next WG call.
We are particularly interested in why you chose xs:string for DFDL string
literal but xs:token for List of DFDL string literal. That gives different
whitespace behaviour for the same DFDL string literal depending on which
property it is used in, which does not sound right.
<xsd:simpleType name="DFDLStringLiteral">
  <xsd:restriction base="xsd:string">
  </xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ListOfDFDLStringLiteral">
  <xsd:list itemType="xsd:token"/>
</xsd:simpleType>
I was expecting to see:
<xsd:simpleType name="DFDLStringLiteral">
  <xsd:restriction base="xsd:token">
  </xsd:restriction>
</xsd:simpleType>
<xsd:simpleType name="ListOfDFDLStringLiteral">
  <xsd:list itemType="dfdl:DFDLStringLiteral"/>
</xsd:simpleType>
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 10/07/2013 09:49 -----
From: Steve Hanson/UK/IBM
To: dfdl-wg@ogf.org
Date: 09/07/2013 12:02
Subject: [DFDL-WG] Action 205: whitespace in DFDL annotations
For discussion on today's WG call. Action
205 was raised to ensure that DFDL 'property types' are declared with XML
Schema types that provide the correct whitespace handling behaviour. The
XML Schema types of the various DFDL 'property types' are given in Part
1 of IBM's Schemas-for-DFDL.
The question boils down to whether a
'property type' should be an xs:string or xs:token. The former preserves
whitespace, the latter normalizes and trims. (Note that xs:NMTOKEN is intended
for attributes only, so should not be used for DFDL properties as they
can be expressed in attribute or element forms.)
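The difference between the two base types can be illustrated with a minimal Python sketch of the three whiteSpace facet values defined in XML Schema Part 2: preserve (xs:string), replace (xs:normalizedString) and collapse (xs:token). The function name normalize_whitespace is an assumption made for illustration only; it is not part of any DFDL or XSD API.

```python
import re


def normalize_whitespace(value: str, mode: str) -> str:
    """Apply an XSD whiteSpace facet value to a lexical value (illustrative helper)."""
    if mode == "preserve":
        # xs:string: the value is left exactly as written.
        return value
    # "replace": each tab, line feed and carriage return becomes a single space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        # xs:normalizedString stops here.
        return replaced
    if mode == "collapse":
        # xs:token: runs of spaces shrink to one, leading/trailing spaces are removed.
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace mode: {mode!r}")


raw = "  %NL;\t%CR;  "
as_string = normalize_whitespace(raw, "preserve")   # what an xs:string property sees
as_token = normalize_whitespace(raw, "collapse")    # what an xs:token property sees
```

With these definitions, the same raw value keeps its surrounding spaces and its tab as an xs:string but arrives trimmed and single-spaced as an xs:token, which is the inconsistency at issue for DFDL string literal.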
My recommendation is:
- Enumeration changed from xs:string to xs:token (reason: to match XSDL
  enums and trim leading/trailing whitespace)
- DFDL regular expression stays as xs:string (reason: regex may contain
  literal white space)
- DFDL string literal changed from xs:string to xs:token (reason: currently
  inconsistent with List of DFDL string literal)
- List of DFDL string literal stays as list of xs:token
- DFDL expression changed from xs:token to xs:string (reason: XPath may
  contain non-ignorable whitespace)
Further:
- DFDL regular expression should not trim leading/trailing whitespace
- DFDL expression should trim leading whitespace before { and trailing
  whitespace after }
- The enum of DFDL property names should be based on xs:token
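The trimming rule proposed for DFDL expressions could be sketched in Python as follows: whitespace outside the enclosing braces is dropped, while whitespace inside them is left exactly as written. The helper name trim_dfdl_expression is hypothetical, and the sketch assumes the value is already known to be an expression; it does not attempt to handle escaped leading braces or to parse the XPath content.

```python
# Whitespace characters recognised by XML Schema: space, tab, line feed, carriage return.
XSD_WHITESPACE = " \t\n\r"


def trim_dfdl_expression(value: str) -> str:
    """Drop whitespace before '{' and after '}', keeping the inner text untouched."""
    trimmed = value.strip(XSD_WHITESPACE)
    if not (trimmed.startswith("{") and trimmed.endswith("}")):
        raise ValueError("not a DFDL expression: expected text wrapped in { }")
    return trimmed


# Inner double spaces survive because the type itself would stay xs:string.
expr = trim_dfdl_expression("  { ../a  eq  'x  y' }\n")
```

A regular expression property would skip this step entirely, since the recommendation is that its leading and trailing whitespace is significant.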
The xs:unions for DFDL properties
that can be two or more of the above may/will need the member ordering
reviewed.
Example:
<xsd:simpleType name="BinaryFloatRepEnum_Or_DFDLExpression">
  <xsd:union>
    <xsd:simpleType>
      <xsd:restriction base="dfdl:DFDLExpression"/>
    </xsd:simpleType>
    <xsd:simpleType>
      <xsd:restriction base="dfdl:BinaryFloatRepEnum"/>
    </xsd:simpleType>
  </xsd:union>
</xsd:simpleType>
Usually in a union, the most restrictive
member is placed first. With the current types, the above has xs:token
followed by xs:string, in accordance with this practice. But the recommendation
changes the types of both members, so that the above becomes xs:string
followed by xs:token.
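To make the ordering concern concrete, here is a rough Python sketch of how a union selects its member, on the understanding that members are tried in declaration order, each attempt normalizes the value with that member's own whitespace rule, and the first member to accept the value wins. All class, function and variable names are hypothetical, the enum values are illustrative, and the acceptance checks are simplified stand-ins for real schema validation.

```python
import re
from typing import Callable, List, NamedTuple, Optional, Tuple


class Member(NamedTuple):
    name: str
    collapse: bool                    # True models xs:token, False models xs:string
    accepts: Callable[[str], bool]    # simplified validity check for the member


def collapse_ws(value: str) -> str:
    """Collapse whitespace runs to one space and trim, as xs:token does."""
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def validate_union(value: str, members: List[Member]) -> Optional[Tuple[str, str]]:
    """Return (member name, normalized value) for the first member that accepts."""
    for member in members:
        candidate = collapse_ws(value) if member.collapse else value
        if member.accepts(candidate):
            return member.name, candidate
    return None


# Expression modelled on xs:string; enum modelled on xs:token (as recommended).
expression = Member(
    "DFDLExpression", False,
    lambda v: v.strip().startswith("{") and v.strip().endswith("}"),
)
enum_member = Member("BinaryFloatRepEnum", True, lambda v: v in {"ieee", "ibm390Hex"})
# An unrestricted xs:string member, used only to show why ordering matters.
any_string = Member("AnyString", False, lambda v: True)

picked_enum = validate_union("  ieee  ", [expression, enum_member])
picked_expr = validate_union(" { ../kind } ", [expression, enum_member])
swallowed = validate_union("  ieee  ", [any_string, enum_member])
```

In this sketch the xs:string member can safely come first only because its check rejects anything that is not wrapped in braces; an unrestricted xs:string member in first position would capture the enum value with its whitespace intact.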
Regards
Steve Hanson
From: Steve Hanson/UK/IBM
To: Suman Kalia/Toronto/IBM@IBMCA
Cc: dfdl-wg@ogf.org, Mike Beckerle <mbeckerle.dfdl@gmail.com>
Date: 27/03/2013 16:38
Subject: Re: [DFDL-WG] whitespace in DFDL annotations: right now regex is
xs:string, expression is xs:token
Suman
Looking at the XML schema-for-schemas, and doing a test in the XSD editor
in Eclipse, XSD enumeration facets are modelled as xs:NMTOKEN, not as
xs:string as DFDL enums currently are. XSD is perfectly happy to
strip/collapse white space. I think therefore that we should be doing the
same for DFDL enum properties. I don't see any harm in this: an enum is a
contiguous sequence of non-whitespace characters anyway, so any
leading/trailing whitespace is harmless.
It looks like the XSD pattern facet is modelled as xs:string, preserving
white space. We should do the same for DFDL regex properties.
Regards
Steve Hanson
From: Steve Hanson/UK/IBM
To: Suman Kalia <kalia@ca.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Mike Beckerle
<mbeckerle.dfdl@gmail.com>
Date: 19/03/2013 17:41
Subject: Re: [DFDL-WG] whitespace in DFDL annotations: right now regex is
xs:string, expression is xs:token
The type called DFDLExpressionOrPatternOrNothing
only makes sense for use in one place - the element value of an assert
or discriminator.
Regards
Steve Hanson
From: Suman Kalia <kalia@ca.ibm.com>
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 19/03/2013 17:29
Subject: Re: [DFDL-WG] whitespace in DFDL annotations: right now regex is
xs:string, expression is xs:token
Sent by: dfdl-wg-bounces@ogf.org
Mike - I am not sure, but my gut feeling is that it would start with the
most restrictive one first, i.e. an empty string (assuming it has a length
facet of 0) would match Nothing, then xsd:token, which is a restricted form
of xs:string. I think you are going to get a string with white space
collapsed (xsd:token) if it is not an empty string. You can run a few tests
to see the behavior.
Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923 T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message broker
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 03/19/2013 12:43 PM
Subject: [DFDL-WG] whitespace in DFDL annotations: right now regex is
xs:string, expression is xs:token
Sent by: dfdl-wg-bounces@ogf.org
This came up on the call. The schemas I have for DFDL annotations have
DFDLRegularExpression as an xs:string, and DFDLExpression as an xs:token.
I have no clue how a union of these types behaves. But we have a union
called DFDLExpressionOrPatternOrNothing, which is a 3-way union of
DFDLExpression, DFDLRegularExpression, and EmptyString (which is also
derived from xs:string but has a length facet of 0 as well).
--
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU