clarification needed: textNumberCheckPolicy lax includes lax-ness about plus signs.

An excerpt from the Daffodil users mailing list raises the question of how "lax" textNumberCheckPolicy="lax" is w.r.t. plus signs on numbers.
------------------------------
If you set textNumberCheckPolicy="lax", then
we do ignore leading plus signs in the data
The DFDL specification doesn't seem to say that a leading plus sign is ignored. Here's what it says:

"If 'lax' and dfdl:textNumberRep is 'standard' then grouping separators are ignored, leading and trailing whitespace is ignored, leading zeros are ignored and quoted characters may be omitted."

Nothing about ignoring plus signs in that.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com

Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy <http://www.ogf.org/About/abt_policies.php>
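To make the literal reading of that sentence concrete, here is a minimal Python sketch. It is not code from Daffodil, IBM DFDL, or ICU; parse_lax_integer and its grouping_sep parameter are hypothetical names chosen for illustration. It applies only the leniencies the quoted text lists (grouping separators, surrounding whitespace, leading zeros), assumes a plain '-' as the only sign the pattern allows, and does not model the "quoted characters may be omitted" clause. Under that reading a leading '+' is simply not covered.

```python
import re


def parse_lax_integer(text, grouping_sep=","):
    """Hypothetical helper: parse an integer using only the leniencies
    listed in the quoted spec sentence (an illustration, not a DFDL parser)."""
    s = text.strip()                  # leading and trailing whitespace is ignored
    s = s.replace(grouping_sep, "")   # grouping separators are ignored
    # A leading '+' is not among the listed leniencies, so it is rejected here.
    if not re.fullmatch(r"-?[0-9]+", s):
        raise ValueError(f"not accepted under this reading of 'lax': {text!r}")
    return int(s)                     # int() discards leading zeros
```

With this sketch, " 1,234 " and "007" parse, while "+123" raises ValueError, which is the behaviour the spec text seems to imply if plus signs are not meant to be ignored.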

Reference is at http://icu-project.org/apiref/icu4j/com/ibm/icu/text/DecimalFormat.html#setP... and gives an example where using a + in the data but not the pattern gives an error when strict. The implication would be that this is not an error when lax, but testing with IBM DFDL does not bear this out. IBM DFDL behaviour matches the DFDL spec.

Looking at our code, we do some pre-processing before passing data & pattern to ICU, but not plus sign checking, so it's ICU behaviour.

Data \ Pattern    +000 & +###    000 & ###
+123              Parsed         Failed
123               Failed         Parsed

I'm pretty sure I've hit this with EDIFACT in the past. A particular field had an explicit sign and needed to be modelled with a pattern that included the sign. Having said that, having a field that sometimes included a + and sometimes didn't feels like it should be a common occurrence ...

Regards
Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: DFDL-WG <dfdl-wg@ogf.org>
Date: 27/08/2019 23:52
Subject: [DFDL-WG] clarification needed: textNumberCheckPolicy lax includes lax-ness about plus signs.
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
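The data/pattern results reported in the reply above reduce to one rule: the parse succeeds only when the data and the pattern agree on whether a leading '+' is present. The following Python sketch restates that rule and nothing more; it does not call ICU or any DFDL processor, and sign_matches and outcomes are hypothetical names used only for this illustration.

```python
def sign_matches(data: str, pattern: str) -> bool:
    """Model only the reported sign rule: a leading '+' must appear
    in both the data and the pattern, or in neither."""
    return data.startswith("+") == pattern.startswith("+")


PATTERNS = ("+000", "+###", "000", "###")

# Rebuild the reported results as a (data, pattern) -> outcome mapping.
outcomes = {
    (data, pattern): "Parsed" if sign_matches(data, pattern) else "Failed"
    for data in ("+123", "123")
    for pattern in PATTERNS
}
```

A field that sometimes carries a '+' and sometimes does not cannot satisfy this rule with a single pattern, which is the case the reply flags as likely to be common.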