
Further to the discussion on the call, here is what the IBM COBOL manual says about PIC P:

An assumed decimal scaling position. It is used to specify the location of an assumed decimal point when the point is not within the number that appears in the data item. The scaling position character P is not counted in the size of the data item. Scaling position characters are counted in determining the maximum number of digit positions (63) in numeric-edited items or in items that appear as arithmetic operands.

The scaling position character P may appear only as a continuous string of Ps in the leftmost or rightmost digit positions within a PICTURE character-string. Because the scaling position character P implies an assumed decimal point (to the left of the Ps, if the Ps are leftmost PICTURE characters; to the right of the Ps, if the Ps are rightmost PICTURE characters), the assumed decimal point symbol, V, is redundant as either the leftmost or rightmost character within such a PICTURE description.

In certain operations that reference a data item whose PICTURE character-string contains the symbol P, the algebraic value of the data item is used rather than the actual character representation of the data item. This algebraic value assumes the decimal point in the prescribed location and zero in place of the digit position specified by the symbol P. The size of the value is the number of digit positions represented by the PICTURE character-string. These operations are any of the following:

- Any operation requiring a numeric sending operand.
- A MOVE statement where the sending operand is numeric and its PICTURE character-string contains the symbol P.
- A MOVE statement where the sending operand is numeric-edited and its PICTURE character-string contains the symbol P and the receiving operand is numeric or numeric-edited.
- A comparison operation where both operands are numeric.

In all other operations the digit positions specified with the symbol P are ignored and are not counted in the size of the operand.

This implies that the scaling should be applied as a lexical operation on the data. In other words, two COBOL fields, one with PIC PP9 and value '2' and one with PIC PP999 and value '002', do not result in the same logical number: the first is .002 and the second is .00002.

There is an equivalence between V and P: PP999 == V99999 and 999PP == 99999V == 99999. If we consider things in these terms the reasoning is simpler. To prevent zero suppression by the # symbol from changing the value, rule a) from my earlier note (forwarded below) must apply: there must be no # to the right of the V. That restates our rules as:

a) A pattern with a V symbol must not have # symbols to the right of the V symbol.
b) A pattern with P symbols at the left end must have no # symbols in the pattern.
c) A pattern with P symbols at the right end has no restrictions.

There is another problem, though. The number can be trimmed of the pad character at either or both ends, depending on justification, before the number pattern is applied. If the pad character is 0, trimming can also cause 0s to be lost, so that the V and P symbols are misapplied. I'm not sure there is much we can do about this. Modelers need to be careful when padding/trimming that they get the justification correct. For example, we typically think of numbers as being right justified, but a number with Ps on the left is effectively left justified and should be modeled as such.

We added erratum 2.25, which prevents trimming from leaving an empty string. I am thinking that this erratum should actually say that trimming must leave at least the minimum number of digits implied by the pattern, as an extra safeguard? We mustn't disallow trimming/padding altogether, as it is used to remove spaces.

The ICU pad character symbol * is used to provide a pad character when the data is shorter than the pattern. It is only used to pad when unparsing; it is not used to trim.
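
To make the V/P equivalence, the restated rules a) to c), and the trimming hazard above concrete, here is a minimal Python sketch. It is illustrative only and not how any DFDL processor is implemented: the function names rule_violations and logical_value are hypothetical, patterns are assumed to contain only the symbols P, V, 0, 9 and #, and the data is assumed to be an unsigned digit string that has already been trimmed.

```python
from decimal import Decimal


def rule_violations(pattern: str) -> list[str]:
    """Return the labels of the restated rules that `pattern` violates."""
    violations = []
    # a) no '#' to the right of the V symbol
    if "V" in pattern and "#" in pattern.split("V", 1)[1]:
        violations.append("a")
    # b) leading Ps imply a V at the far left, so no '#' anywhere
    if pattern.startswith("P") and "#" in pattern:
        violations.append("b")
    # c) trailing Ps: no restriction
    return violations


def logical_value(digits: str, pattern: str) -> Decimal:
    """Scale `digits` lexically, using the P or V symbols in `pattern`."""
    n = Decimal(int(digits))
    leading = len(pattern) - len(pattern.lstrip("P"))
    trailing = len(pattern) - len(pattern.rstrip("P"))
    if leading:
        # Point is left of the Ps, so every digit present is a fraction
        # digit. Using len(digits) is what makes this a lexical operation.
        return n.scaleb(-(leading + len(digits)))
    if trailing:
        # Point is right of the Ps: each P stands for an assumed zero.
        return n.scaleb(trailing)
    if "V" in pattern:
        # Fraction width comes from the pattern, which is only safe
        # because rule a) forbids '#' to the right of the V.
        frac = sum(ch in "09#" for ch in pattern.split("V", 1)[1])
        return n.scaleb(-frac)
    return n
```

With these definitions, logical_value("2", "PP9") gives 0.002 while logical_value("002", "PP999") gives 0.00002. If a 0 pad character trims '002' down to '2', the PP999 field silently reads as 0.002, which is the mis-application described above. rule_violations("PP#00") reports rule b) and rule_violations("#0V#0") reports rule a).
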
Even so, it might be safer to disallow P and V symbols when the * pad symbol is used?

Reading the ICU description of the significant digit symbol @, explicit decimal points are disallowed. I think we should disallow P and V symbols when the @ symbol is used. Erratum 2.28 should be updated.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 22/03/2012 09:22 -----

From: Steve Hanson/UK/IBM
To: dfdl-wg@ogf.org
Date: 21/03/2012 10:34
Subject: Action 167: textNumberPatterns with P,V, # - allowable combinations

Last week we agreed that disallowing text number patterns that contained a # symbol and either a P or V symbol was too restrictive. Accordingly, the following rules are proposed to control when # may be used in the same pattern as P or V, to ensure an unambiguous pattern.

a) Pattern must not have # symbols to the right of the V symbol.
b) If pattern has P symbols at the left end, then there must be as many 0 symbols adjacent to the rightmost P symbol as there are P symbols.
c) If pattern has P symbols at the right end, there are no restrictions.

If a) or b) is violated, it is a schema definition error.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU