Thanks Mike. That sounds right to me too.
Suman Kalia
IBM Toronto Lab
WMB Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia@ca.ibm.com
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Suman Kalia/Toronto/IBM@IBMCA
Cc: Steve Hanson <smh@uk.ibm.com>, dfdl-wg@ogf.org
Date: 02/02/2010 11:30 AM
Subject: Re: [DFDL-WG] Action Item 049: Built-in specification description and schemas
I think the summary is to flatten the calendarFormat and
numberFormat objects, putting their properties back on element and simpleType,
but to leave escapeScheme as is.
This sounds right to me.
...mike
On Tue, Feb 2, 2010 at 11:06 AM, Suman Kalia <kalia@ca.ibm.com>
wrote:
Thanks Steve for your note. Comments below.
- I agree there will be a handful of escape schemes and the
opportunity for their reuse is very high.
Looking at the calendar format, the attributes that would vary most are calendarPattern
followed by calendarTimeZone. calendarPatternKind goes along with calendarPattern;
it tells whether to use the calendar pattern from the schema date/time type or
from the DFDL properties. The rest of the attributes are likely to be the same for
a particular format.
For consistency with textNumberFormat, I am fine with adding all attributes
defined in calendarFormat to dfdl:element and dfdl:simpleType.
From: Steve Hanson/UK/IBM@IBMGB
To: Suman Kalia/Toronto/IBM@IBMCA
Cc: dfdl-wg@ogf.org
Date: 02/02/2010 04:59 AM
Subject: Re: Action Item 049: Built-in specification description and schemas
Thanks for highlighting this Suman.
The reason for hiving off the properties for text numbers into a separate
named annotation was reuse. It was considered that a given data format
might have a large number of text number fields, but that they could be
described by a far smaller number of annotations, because only a limited set
of 'number patterns' is used. In Suman's example that's clearly not the
case, but it is an artificial one. We need to consider real-world formats.
I've had a look through example COBOL copybooks, and while there is a large
variation in text number fields, reuse of 'number patterns' would be a
benefit. For example, a set of related values might be declared the same:
15 ORIGINAL-PRICE PIC 9(013)V99.
15 DISCOUNTED-PRICE PIC 9(013)V99.
15 SALE-PRICE PIC 9(013)V99.
15 STAFF-PRICE PIC 9(013)V99.
15 TOTAL-PRICE PIC 9(013)V99.
The question then becomes what is the best way to achieve this reuse. If
you look at a dfdl:textNumberFormat annotation, it is the number pattern
that varies. Everything else would be defined once in a dfdl:format annotation
and scoped. So it does seem overkill to have a dfdl:textNumberFormat for
every number pattern, because the contained properties cannot be scoped
and must be redeclared each time.
I suggest the best reuse mechanism for this scenario is the simple type.
In the above example I could declare a PRICE simple type and put the number
pattern on that.
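A minimal sketch of the simple-type approach, assuming the flattened
design in which numberPattern can be carried directly on a simple type.
The target namespace, the enclosing Prices element and the pattern value
are illustrative assumptions, not taken from the spec draft; the remaining
text number properties are assumed to come from a scoped dfdl:format.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
            xmlns:tns="urn:example:prices"
            targetNamespace="urn:example:prices">

  <!-- PIC 9(013)V99: 13 integer digits, implied decimal point, 2 fraction digits.
       The pattern is declared once here (assumed property name: numberPattern). -->
  <xsd:simpleType name="PRICE" dfdl:numberPattern="0000000000000V00">
    <xsd:restriction base="xsd:decimal"/>
  </xsd:simpleType>

  <!-- Hypothetical container; each price field reuses the PRICE type -->
  <xsd:element name="Prices">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="ORIGINAL-PRICE" type="tns:PRICE"/>
        <xsd:element name="DISCOUNTED-PRICE" type="tns:PRICE"/>
        <xsd:element name="SALE-PRICE" type="tns:PRICE"/>
        <xsd:element name="STAFF-PRICE" type="tns:PRICE"/>
        <xsd:element name="TOTAL-PRICE" type="tns:PRICE"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With this shape, changing the price layout means editing one simple type
rather than five element declarations.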
I therefore agree with Suman. Remove dfdl:textNumberFormat and dfdl:defineTextNumberFormat,
add all the properties to dfdl:element and dfdl:simpleType. In practice
most will be set in a dfdl:format and scoped, only the number pattern will
vary per element or simple type.
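As a rough illustration of that split, the sketch below sets the stable
text number properties once in a schema-level dfdl:format and leaves only
the pattern on each element. The property names are the ones used in this
thread; the element names, property values and pattern strings are
hypothetical.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Set once and scoped: properties that rarely vary within a data format -->
      <dfdl:format representation="text"
                   numberDecimalSeparator="."
                   numberGroupingSeparator=","
                   numberCheckPolicy="lax"
                   numberRoundingMode="roundUp"/>
    </xsd:appinfo>
  </xsd:annotation>

  <!-- Only the pattern varies per element (illustrative patterns) -->
  <xsd:element name="quantity" type="xsd:int" dfdl:numberPattern="#,##0"/>
  <xsd:element name="amount" type="xsd:decimal" dfdl:numberPattern="#,##0.00"/>
</xsd:schema>
```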
We should also consider whether the same issue applies to dfdl:calendarFormat
and dfdl:escapeScheme. For both of these the reuse opportunity is high. There
is likely to be just one escape scheme per data format. There is likely
to be a small number of calendar formats per data format (e.g. one for a
date, one for a time, one for a timestamp). But in the latter case, it
is typically just the calendarPattern that would vary; the rest of the
properties would be set once.
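The calendar case could look much the same. This sketch assumes the
calendar properties are flattened onto dfdl:format and dfdl:simpleType;
the type names, the "explicit" and "UTC" values and the pattern strings
are assumptions made for illustration only.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">

  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Calendar properties that are set once for the whole data format -->
      <dfdl:format representation="text"
                   calendarPatternKind="explicit"
                   calendarTimeZone="UTC"/>
    </xsd:appinfo>
  </xsd:annotation>

  <!-- One simple type per calendar format; only calendarPattern differs -->
  <xsd:simpleType name="DateText" dfdl:calendarPattern="yyyy-MM-dd">
    <xsd:restriction base="xsd:date"/>
  </xsd:simpleType>

  <xsd:simpleType name="TimeText" dfdl:calendarPattern="HH:mm:ss">
    <xsd:restriction base="xsd:time"/>
  </xsd:simpleType>

  <xsd:simpleType name="TimestampText" dfdl:calendarPattern="yyyy-MM-dd'T'HH:mm:ss">
    <xsd:restriction base="xsd:dateTime"/>
  </xsd:simpleType>
</xsd:schema>
```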
I suggest that whatever we adopt for text numbers we also adopt for calendars,
for consistency.
Regards
Steve Hanson
Programming Model Architect, WebSphere Message Broker,
OGF DFDL WG Co-Chair,
Hursley, UK,
Internet: smh@uk.ibm.com,
Phone (+44)/(0) 1962-815848
From: Suman Kalia/Toronto/IBM@IBMCA
To: Alan Powell/UK/IBM@IBMGB, Steve Hanson/UK/IBM@IBMGB, Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: dfdl-wg@ogf.org
Date: 02/02/2010 00:21
Subject: Action Item 049: Built-in specification description and schemas
I am trying to create a DFDL definition for a COBOL copybook and have run into
a usability issue with TextNumberFormat, which has to be named and referenced
from dfdl:element and dfdl:simpleType annotations. Consider the sample COBOL
copybook below, where I have 3 elements with a PIC 9999 DISPLAY
clause (a.k.a. zoned decimal) and 2 external (standard) decimals. They all
have the same length, and the main difference between them is the sign,
which could be leading or trailing. As per the V.38 spec, I would have to
create a named textNumberFormat for each of the picture clauses. The key
difference between the named textNumberFormats for these definitions would be
the numberPattern; the rest of the attributes for standard decimal and
zoned decimal are going to be the same for a particular platform or data definition
format. The generated DFDL schema will contain many occurrences of
TextNumberFormat, in the worst case one for each element defined
in the COBOL copybook. This is not very usable, and the user would also have
to choose the names for these formats carefully so that he can easily identify
and distinguish them if he wants to reuse them, something like
TextNumberStandardLength5SignLeading etc.
01 CobolTypes.
* External decimal (Zoned decimal)
05 elem9 PIC 9999 DISPLAY.
05 elem9Signed PIC S9999 DISPLAY.
05 elem9SignedLeading PIC S9999 DISPLAY SIGN LEADING.
* in DFDL - modeled as standard decimal
05 elem9SignedLeadingSeparate PIC S9999 DISPLAY SIGN LEADING SEPARATE.
05 elem9SignedTrailingSeparate PIC S9999 DISPLAY SIGN TRAILING SEPARATE.
Number Format
When textNumberRepresentation is ‘zoned’ only the pattern for positive numbers
is used. Only the following pattern characters may be used: '+' to indicate
whether the leading or trailing digit carries the overpunched sign, 'V' to
indicate the location of an implied decimal point and '0' to indicate the
number of digits (including overpunched). The number of '0' characters must
match the number of digits in the representation otherwise it is a schema
definition error.
A better approach would be:
- Add numberPattern to the dfdl:element and dfdl:simpleType annotations, and
add the rest of the attributes from the TextNumberFormat block to either
(a) dfdl:format only or (b) dfdl:format, dfdl:element and dfdl:simpleType.
Let's discuss this in the DFDL workgroup call tomorrow.
Attached below is a schema coded with assumption (a) listed above.
<xsd:complexType name="CobolTypes">
  <xsd:sequence>
    <!-- External Decimal -->
    <xsd:element name="elem9"
                 dfdl:ref="dfdlCobolFmt:CobolZonedDecimalFormat"
                 dfdl:length="4"
                 dfdl:representation="text"
                 dfdl:numberPattern="0000">
      <xsd:simpleType>
        <xsd:restriction base="xsd:short">
          <xsd:minInclusive value="0"/>
          <xsd:maxInclusive value="9999"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
    <xsd:element name="elem9Signed"
                 dfdl:ref="dfdlCobolFmt:CobolZonedDecimalFormat"
                 dfdl:length="4"
                 dfdl:representation="text"
                 dfdl:numberPattern="0000+">
      <xsd:simpleType>
        <xsd:restriction base="xsd:short">
          <xsd:minInclusive value="-9999"/>
          <xsd:maxInclusive value="9999"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
    <xsd:element name="elem9SignedLeading"
                 dfdl:ref="dfdlCobolFmt:CobolZonedDecimalFormat"
                 dfdl:length="4"
                 dfdl:representation="text"
                 dfdl:numberPattern="+0000">
      <xsd:simpleType>
        <xsd:restriction base="xsd:short">
          <xsd:minInclusive value="-9999"/>
          <xsd:maxInclusive value="9999"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
    <xsd:element name="elem9SignedLeadingSeparate"
                 dfdl:ref="dfdlCobolFmt:CobolStandardDecimalFormat"
                 dfdl:length="5"
                 dfdl:representation="text"
                 dfdl:numberPattern="+0000;-00000">
      <xsd:simpleType>
        <xsd:restriction base="xsd:short">
          <xsd:minInclusive value="-9999"/>
          <xsd:maxInclusive value="9999"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
    <xsd:element name="elem9SignedTrailingSeparate"
                 dfdl:ref="dfdlCobolFmt:CobolStandardDecimalFormat"
                 dfdl:length="5"
                 dfdl:representation="text"
                 dfdl:numberPattern="0000+;00000-">
      <xsd:simpleType>
        <xsd:restriction base="xsd:short">
          <xsd:minInclusive value="-9999"/>
          <xsd:maxInclusive value="9999"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>
----- Data format definitions
<dfdl:defineFormat name="CobolStandardDecimalFormat">
  <dfdl:format ref="tns:BaseTextNumberStandardDecimal"
               lengthKind="explicit"
               lengthUnits="bytes"
               alignment="1"
               alignmentUnits="bytes"
               leadingSkipBytes="0"
               trailingSkipBytes="0"/>
</dfdl:defineFormat>
<dfdl:defineFormat name="CobolZonedDecimalFormat">
  <dfdl:format ref="tns:BaseTextNumberZonedDecimal"
               lengthKind="explicit"
               lengthUnits="bytes"
               alignment="1"
               alignmentUnits="bytes"
               leadingSkipBytes="0"
               trailingSkipBytes="0"/>
</dfdl:defineFormat>
----- Text number formats (added here for reference, to identify the applicable
attributes for standard and zoned decimal)
<dfdl:defineTextNumberFormat name="ZonedDecimalNumberFormat">
  <dfdl:textNumberFormat numberCheckPolicy="lax"
                         numberRoundingMode="roundUp"
                         numberZonedSignStyle="asciiStandard"/>
</dfdl:defineTextNumberFormat>
<dfdl:defineTextNumberFormat name="StandardDecimalFormat">
  <dfdl:textNumberFormat numberGroupingSeparator=","
                         numberDecimalSeparator="."
                         numberExponentCharacter="E"
                         numberCheckPolicy="lax"
                         numberInfinityRep="\u221E"
                         numberNanRep="\uFFFD"
                         numberRoundingMode="roundUp"
                         numberZeroRep=""/>
</dfdl:defineTextNumberFormat>
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg