Regarding Mike's second point:
Here's a link to an ICU ticket on this subject: http://bugs.icu-project.org/trac/ticket/9659
- currently targeted for ICU 52.1
Here are some more details from IBM's calls with ICU on this subject:
- Exponent character / ignoreCase: Exponent char is not case sensitive. Is this intentional?
* Priority: Medium
ICU sees two options for this:
Option 1: Provide an API call to set a flag on the DecimalFormat object.
Option 2: Make it a global policy settable via a config switch. This would allow other 'site
policies' to be made settable using the same mechanism.
There would be one set of policy flags, including this flag, per address space.
There are differences in date/time processing between C and
Java that could be dealt with using this mechanism.
DFDL needs some of these flags to be configurable at runtime.
2012/10/19 Hit an issue where case handling was inconsistent. The fix needs care to avoid
changing default behaviour and thus breaking existing users of the API.
Currency and prefix/suffix may need a separate switch, so a global switch for the
DecimalFormat is not appropriate.
Could provide a patch to ICU for setting the exponent char for now.
2013/1/17 - ICU external ticket #9659
ICU had an issue with the new API being specific to just case sensitivity of
exponent (and not other regions).
DFDL clarified the requirement is for an API to change global case sensitivity
(not just exponent).
This is targeted at ICU 51
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 14/08/2013 16:55 -----
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 14/08/2013 14:14
Subject: [DFDL-WG] DFDL ICU Challenges for Implementation
Sent by: dfdl-wg-bounces@ogf.org
There are a couple of features in DFDL that ICU doesn't
support, yet where all or nearly all the related functionality is supported
by ICU. Perhaps these aspects of the spec can be revisited?
1) List of Decimal Separators
The textStandardDecimalSeparator property is a list of characters.
However, ICU only supports a single character.
I see lots of potential for error here, confusing diagnostics,
etc. It is not consistent with textStandardGroupingSeparator, which allows
only a single character.
Is there a use case where we know we need more than one decimal separator?
The only thing I can think of is a blend of, say, classic European-style
decimal numbers like "1 234 567,89" and USA-style "1,234,567.89",
but ICU won't deal with different grouping separators either.
In any case if there are multiple decimal and grouping
separators we really don't have these properties right in DFDL. We should
require them to be specified not as two separate lists, but as a list of
pairs, because grouping separators match up with specific decimal separator
values in a format.
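
To make the "list of pairs" idea concrete, here is a minimal sketch, not a proposal for the spec or for any implementation. It uses java.text.DecimalFormat from the JDK as a stand-in for ICU's DecimalFormat, and the class and method names (SeparatorPairs, Pair, parse) are hypothetical. Each pair ties one decimal separator to one grouping separator, and the pairs are tried in order until one consumes the whole input.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

public class SeparatorPairs {

    // Hypothetical holder: one decimal separator tied to one grouping separator.
    static final class Pair {
        final char decimal;
        final char grouping;
        Pair(char decimal, char grouping) {
            this.decimal = decimal;
            this.grouping = grouping;
        }
    }

    // Try each pair in order; accept the first one that consumes all of the text.
    static Number parse(String text, Pair[] pairs) {
        for (Pair pair : pairs) {
            DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.US);
            symbols.setDecimalSeparator(pair.decimal);
            symbols.setGroupingSeparator(pair.grouping);
            DecimalFormat format = new DecimalFormat("#,##0.###", symbols);

            ParsePosition pos = new ParsePosition(0);
            Number result = format.parse(text, pos);
            // A partial parse means this pair does not describe the input.
            if (result != null && pos.getIndex() == text.length()) {
                return result;
            }
        }
        return null; // no pair matched the whole input
    }

    public static void main(String[] args) {
        Pair[] pairs = { new Pair(',', ' '), new Pair('.', ',') };
        System.out.println(parse("1 234 567,89", pairs));
        System.out.println(parse("1,234,567.89", pairs));
    }
}
```

Note that the order of the pairs matters: a short input such as "1,234" is accepted by the first pair as a decimal fraction and would be read as a grouped integer by the second, which is the kind of ambiguity two independent lists cannot even express.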
2) Case Insensitivity
Some properties that we use to configure ICU are affected by
ignoreCase="yes", but ICU does not support case insensitivity.
The properties are:
textStandardExponentRepCharacter
textStandardInfinityRep
textStandardNaNRep
I can certainly imagine a need for case insensitivity
here, and even for multiple values for these (though we allow only one
for Infinity and NaN). For the infinity and NaN reps that isn't so problematic,
as one can easily do a pre-check before calling ICU, but the exponent
rep is needed down in the detailed number format parsing. I can see
no certain algorithm other than creating separate number format parsers
for each exponent rep character in the provided case and the opposite case, and
then using them one by one until a parse succeeds.
Is this ok or do we consider this a mistake?
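
The workaround described above can be sketched as follows. This is only an illustration under stated assumptions: it uses java.text.DecimalFormat from the JDK, which matches the exponent string case-sensitively, as a stand-in for ICU; the class name ExponentFallbackParser and its constructor arguments are hypothetical; and only the all-upper and all-lower variants of each exponent rep are tried. Negative infinity is left out for brevity.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class ExponentFallbackParser {
    private final String[] exponentReps;
    private final String infinityRep;
    private final String nanRep;
    private final boolean ignoreCase;

    public ExponentFallbackParser(String[] exponentReps, String infinityRep,
                                  String nanRep, boolean ignoreCase) {
        this.exponentReps = exponentReps;
        this.infinityRep = infinityRep;
        this.nanRep = nanRep;
        this.ignoreCase = ignoreCase;
    }

    private boolean matches(String text, String rep) {
        return ignoreCase ? text.equalsIgnoreCase(rep) : text.equals(rep);
    }

    // Each rep as given, plus its upper- and lower-case forms when ignoreCase is on.
    private Set<String> candidates() {
        Set<String> out = new LinkedHashSet<>();
        for (String rep : exponentReps) {
            out.add(rep);
            if (ignoreCase) {
                out.add(rep.toUpperCase(Locale.ROOT));
                out.add(rep.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    public Number parse(String text) {
        // Pre-check the infinity and NaN reps before touching the number parser.
        if (matches(text, infinityRep)) return Double.POSITIVE_INFINITY;
        if (matches(text, nanRep)) return Double.NaN;

        // One parser per candidate exponent string, tried one by one.
        for (String rep : candidates()) {
            DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.US);
            symbols.setExponentSeparator(rep);
            DecimalFormat format = new DecimalFormat("0.###E0", symbols);

            ParsePosition pos = new ParsePosition(0);
            Number result = format.parse(text, pos);
            // Only a parse that consumes the whole text counts as success.
            if (result != null && pos.getIndex() == text.length()) {
                return result;
            }
        }
        return null; // no candidate parser accepted the whole input
    }
}
```

The cost is one parser per candidate (they could be built once and cached rather than per call), and the first full match wins, so the result can depend on candidate order if two exponent reps overlap.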
3) We are not very consistent in these properties.
We allow multiple textStandardZeroRep values, but only
a single textStandardInfinityRep, and only a single textStandardNaNRep.
We allow multiple textStandardExponentRepCharacter, and
multiple textStandardDecimalSeparator, but only a single textStandardGroupingSeparator.
This kind of inconsistency is always problematic for users.
Comments?