James
It's a balance. One of the design goals
for DFDL was to keep the simple cases simple. So forcing all users
to understand and specify a host of properties that only exist to handle
esoteric use cases when they have a straightforward use case is counter
to that. For example, in many years of modeling data I have only
encountered one format where the encoding for an element value was different
to the encoding of its initiator, and until your case below I had not encountered
a use case where the case sensitivity of an element's initiator differed from that
of its terminator. Such cases clearly exist, but they are not commonplace. That is
why the WG adopted the design it did, with encoding-related properties
applied on a per-object basis.
Another example is that DFDL has a single
separator property, and not a separator for sequences and a separator for
arrays. This often forces you to wrap an array element in a sequence in
order to carry the separator, but it reduces the overall number of properties,
behaviours and interactions that need to be understood.
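To illustrate the array point, here is a minimal sketch rather than anything from a real schema. The element names and the ',' and ';' delimiters are made up, and it assumes every other required property comes from a default format elsewhere in the schema. The inner sequence exists only to carry the separator for the occurrences of the array element.

```xml
<!-- Hypothetical record: header,item;item;item -->
<xs:sequence dfdl:separator="," dfdl:separatorPosition="infix">
  <xs:element name="header" type="xs:string" dfdl:lengthKind="delimited"/>
  <!-- Wrapper sequence: its only job is to supply ';' between occurrences of item -->
  <xs:sequence dfdl:separator=";" dfdl:separatorPosition="infix">
    <xs:element name="item" type="xs:string" maxOccurs="unbounded"
                dfdl:lengthKind="delimited" dfdl:occursCountKind="implicit"/>
  </xs:sequence>
</xs:sequence>
```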
For common use cases, we have allowed
extra properties. For example, there are separate justification and padCharacter
properties for each simple type category, because it is very common to
have a mix of such types in data, and the justification and pad character
invariably differ from one type to another.
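As a sketch of what that looks like in a format block (the values are illustrative assumptions, not recommendations, and a real format needs many more properties than shown here):

```xml
<!-- Strings padded on the right with spaces, numbers padded on the left with zeros -->
<dfdl:format textPadKind="padChar" textTrimKind="padChar"
             textStringJustification="left"  textStringPadCharacter="%SP;"
             textNumberJustification="right" textNumberPadCharacter="0"/>
```

The calendar and boolean categories have their own pairs of justification and pad properties in the same style.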
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 20/06/2013 12:50
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive delimiters
Sent by: dfdl-wg-bounces@ogf.org
So which is worse, Steve,
a bunch of extra properties in GeneralPurposeFormat, or making your schema
designers resort to workarounds like this:
<xs:sequence dfdl:separator="" dfdl:terminator="end" dfdl:ignoreCase="no">
  <xs:element name="weird" type="xs:string" dfdl:initiator="start"
              dfdl:ignoreCase="yes" dfdl:lengthKind="delimited"/>
</xs:sequence>
Or this:
- wrap the element in a group
- set dfdl:ignoreCase to 'no' on the group.
- set dfdl:ignoreCase to 'yes' on the element.
- put the terminator on the group and the initiator on the element.
Do you really want your schema
designers to have to do things like this? It seems to me that a
good design principle is to make life simpler for your users, not harder.
From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Thursday, June 20, 2013 7:33 AM
To: Garriss Jr., James P.
Cc: dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org; Tim Kimber
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive
delimiters
If we adopt that approach we end up with
an explosion of properties. It's not just ignoreCase. There's encoding
and its related properties too. We decided that encoding related properties
applied per object, not per delimiter.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: Tim Kimber/UK/IBM@IBMGB, "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 20/06/2013 12:21
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive delimiters
Sent by: dfdl-wg-bounces@ogf.org
All the various solutions proposed seem so convoluted. Why not do
the simple and obvious? Something like:
dfdl:ignoreInitiatorCase="yes" dfdl:ignoreTerminatorCase="no"
dfdl:ignoreSeparatorCase="no" dfdl:ignoreElementCase="yes"
From: dfdl-wg-bounces@ogf.org
[mailto:dfdl-wg-bounces@ogf.org]
On Behalf Of Tim Kimber
Sent: Thursday, June 20, 2013 5:34 AM
To: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] how to do mix of case sensitive and insensitive
delimiters
The other way to do this is
- wrap the element in a group
- set dfdl:ignoreCase to 'no' on the group.
- set dfdl:ignoreCase to 'yes' on the element.
- put the terminator on the group and the initiator on the element.
Or the other way round if it works better that way.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 19/06/2013 23:42
Subject: [DFDL-WG] how to do mix of case sensitive and insensitive delimiters
Sent by: dfdl-wg-bounces@ogf.org
I have a weird case where the initiator needs case-insensitive matching,
but the terminator needs case-sensitive matching.
The only way I can think of dealing with this is to use the initiator,
but handle the length via lengthKind='pattern' to grab the value, doing
lookahead so it will stop before the terminator.
Then an empty sequence with a case sensitive terminator to pick off that
part of the data stream.
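A rough, untested sketch of that idea follows. The element name, the 'start' and 'end' delimiters, and the regular expression are assumptions made for illustration, and the remaining properties are assumed to come from a default format.

```xml
<xs:sequence>
  <!-- Case-insensitive initiator; the pattern takes the value up to,
       but not including, a literal lower-case "end" -->
  <xs:element name="weird" type="xs:string"
              dfdl:initiator="start" dfdl:ignoreCase="yes"
              dfdl:lengthKind="pattern" dfdl:lengthPattern=".*?(?=end)"/>
  <!-- Empty sequence whose only purpose is to consume the case-sensitive terminator -->
  <xs:sequence dfdl:terminator="end" dfdl:ignoreCase="no"/>
</xs:sequence>
```

The lookahead keeps the terminator out of the element's value, so the empty sequence that follows can match it with ignoreCase set to 'no'.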
--
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU