I see your point about the zoned decimals.
If there really are COBOL applications that use Shift_JIS or UTF-8 with
zoned decimals, then I think we have to support those scenarios.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 07/06/2013 11:33
Subject: [DFDL-WG] Zoned decimals: spec errata 2.92 & 2.88
Sent by: dfdl-wg-bounces@ogf.org
2.92. Section 13.6. When property textNumberRep is ‘zoned’, the property
description should state that ‘zoned’ is only allowed for SBCS encodings
(schema definition error otherwise).
When I came to implement this for IBM DFDL, I noticed there were already
tests for UTF-8 and Shift_JIS which succeeded. The reason is that both
character sets are ASCII compatible for the first 128 code points
(Shift_JIS has two code points that differ, x5C and x7E). I am wondering
whether this erratum is therefore too strict. I am particularly concerned
that there might be Japanese users who have COBOL data in Shift_JIS
or MS_Kanji.
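
The compatibility point can be checked with a short Python sketch. It is only an illustration, not how IBM DFDL tests this: the helper name and the ZONED_BYTES table are hypothetical, the byte ranges are the ones listed under 2.88 below, and the codec names are Python's, which may not match a DFDL processor's encoding names. None of the listed ranges includes x5C or x7E.

```python
# Byte values this thread lists for ASCII-based zoned decimals (see 2.88).
ZONED_BYTES = bytes(
    list(range(0x30, 0x3A))               # digits, positive sign
    + list(range(0x70, 0x7A))             # standard overpunch
    + [0x7B] + list(range(0x41, 0x4A))    # translated EBCDIC overpunch
    + list(range(0x20, 0x2A))             # CA Realia overpunch
)


def ascii_compatible_for_zoned(encoding: str) -> bool:
    """True if every listed byte decodes to the same character as in ASCII."""
    try:
        return ZONED_BYTES.decode(encoding) == ZONED_BYTES.decode("ascii")
    except UnicodeDecodeError:
        return False


for name in ("utf-8", "shift_jis", "iso-8859-1", "cp037"):
    print(name, ascii_compatible_for_zoned(name))
```

An EBCDIC code page such as cp037 is included only as a contrast case, since its digits do not sit at x30-x39.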
2.88. Section 13.5. Add support for HP NonStop Tandem zoned
decimals. In this architecture, the negative sign is incorporated in the
last byte of the number in the usual manner, but the overpunching occurs
on the highest bit (i.e., value 8) of the nibble. Consequently, a new enum
value ‘asciiTandemModified’ is added to property textZonedSignStyle.
The range of ASCII code points that are used in a zoned number is x30-x39
and either x70-x79 (standard overpunch) or x7B, x41-x49 (translated EBCDIC
overpunch) or x20-x29 (CA Realia overpunch). This erratum adds x80-x89.
But these are not code points in standard ASCII, so the modeller
must specify something like ISO-8859-1 in order for this to parse without
an encoding error. The wording in the spec for this erratum alludes to this
but could be clearer.
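
As a small illustration of the encoding error, using Python's codecs rather than any DFDL processor: the field value below is made up, and the variable names are hypothetical.

```python
# Hypothetical field: digits 1, 2, 3 followed by the byte 0x85.
raw = b"123\x85"

# Strict ASCII has no code point for 0x85, so decoding fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# ISO-8859-1 maps every byte value to a code point, so decoding succeeds.
latin1_text = raw.decode("iso-8859-1")

print(ascii_ok)           # False
print(len(latin1_text))   # 4
```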
asciiTandemModified: In this style the ascii characters ‘0-9’ represent
positive sign and digits
0 to 9, but bytes from 0x80 to 0x89 are used to represent overpunched negative
sign and a digit. There are no corresponding character codepoints in the
standard ASCII encoding since these values are all above 128 (decimal).
(Note that neither ISO-8859-1 encoding nor Unicode have assigned glyphs
for these codepoints. They are considered control characters.)
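
A minimal Python sketch of how a value in this style might be decoded, working on raw bytes. It assumes the sign is carried only by the last byte, that bytes 0x30-0x39 mean positive and 0x80-0x89 mean negative as the wording above states, and that the low nibble is the digit. The function name is hypothetical and this is not taken from any DFDL implementation.

```python
def decode_tandem_zoned(data: bytes) -> int:
    """Decode a zoned decimal in the asciiTandemModified style (illustrative).

    Assumption: only the last byte carries the sign; 0x30-0x39 is positive,
    0x80-0x89 is negative, and the low nibble of each byte is the digit.
    """
    if not data:
        raise ValueError("empty field")
    *body, last = data
    digits = []
    for b in body:
        if not 0x30 <= b <= 0x39:
            raise ValueError(f"unexpected byte 0x{b:02X} before the sign position")
        digits.append(b & 0x0F)
    if 0x30 <= last <= 0x39:
        negative = False
    elif 0x80 <= last <= 0x89:
        negative = True
    else:
        raise ValueError(f"unexpected sign byte 0x{last:02X}")
    digits.append(last & 0x0F)
    value = int("".join(str(d) for d in digits))
    return -value if negative else value


print(decode_tandem_zoned(b"1235"))       # 1235
print(decode_tandem_zoned(b"123\x85"))    # -1235
```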
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg