Re: [DFDL-WG] need clarification on unparsing with textPadKind='padChar'

Mike,

Apologies - missed this when compiling the agenda for yesterday's call - will add it to agenda for next call.

Regards
Steve Hanson
IBM Integration Bus, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890

From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 23/08/2016 13:19
Subject: Re: [DFDL-WG] need clarification on unparsing with textPadKind='padChar'

Erratum 5.18 is not yet incorporated into the spec. When it is, then dfdl:lengthKind 'explicit' with an expression becomes the same as dfdl:lengthKind 'explicit' with a literal. I believe that means the expression result is the fixed length for unparsing, and minLength is only used for validation. However, I would like to run a test with IBM DFDL to make sure everything is as expected.

fn:string-length() always returns the length of the value in the infoset item. It has no schema awareness.

Regards
Steve Hanson

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 10/08/2016 17:27
Subject: [DFDL-WG] need clarification on unparsing with textPadKind='padChar'
Sent by: "dfdl-wg" <dfdl-wg-bounces@ogf.org>

The spec for textPadKind has this sentence about unparsing:

"When dfdl:lengthKind is 'explicit' (and dfdl:length is an expression), 'delimited', 'prefixed', 'pattern' the data value is padded to the length given by the XSD minLength facet for type 'xs:string' or dfdl:textOutputMinLength property for other types."

My concern is the case where dfdl:lengthKind is 'explicit' and dfdl:length is an expression.

Example 1:

Suppose my infoset contains an element at path '../x' of type xs:string containing 10 characters, "1234567890".
Suppose the length expression evaluates to 20, and lengthUnits is 'characters'.
Suppose the XSD minLength facet is 100.
Suppose the textPadKind is padChar.

The above sentence says the target length would be 100 and we would pad to length 100, not 20, and there is no error.

I want to be sure this is correct and what we intended. To me this seems inconsistent with what we have stated in Erratum 5.18, which says that whether the length is a constant or an expression, this element is treated as fixed length, and when unparsing, the length expression is evaluated, not ignored. Erratum 5.18 is not technically incorrect, because it doesn't deal with this minLength complexity. It's just not terribly clear. We may want to update this sentence of the spec, and the erratum, to both use the term target length, which is in our glossary.

Furthermore, in this case:
The dfdl:contentLength for this element should return 100 for units 'characters'.
The dfdl:valueLength for this element should return 10 for units 'characters'.
Expression fn:string-length(../x) should return 10.

The length expression value of 20 is effectively being ignored because it is less than the minLength. The length expression *would* be used if it was greater than the minLength.

Example 2:

My infoset contains an element at path "../x" of type xs:string of length 200.
The length expression evaluates to 20.
dfdl:truncateSpecifiedLengthString is true.
XSD minLength is 40.

In this case I believe we would have a target length of 40, and we would truncate the string to length 40.
dfdl:contentLength of this element is 40.
dfdl:valueLength of this element is 40.
Expression fn:string-length(../x) should return 200.

Is this the correct interpretation?

Example 3:

Same as example 2, except XSD minLength is 0.

Now the target length would be 20 and we would truncate the string to length 20.
dfdl:contentLength of this element is 20 characters.
dfdl:valueLength of this element is 20 characters.
Expression fn:string-length(../x) should return 200.

Is this the correct interpretation?

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
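
As a way to check the three examples in this thread, here is a minimal Python sketch of the two readings under discussion. It is not taken from the spec or from any DFDL implementation: the names unparse_lengths, UnparseResult and the min_length_sets_target flag are hypothetical, all lengths are in characters, textPadKind is assumed to be 'padChar', and raising an error when the value is too long and truncation is off is an assumption of the sketch. With min_length_sets_target=True the target length is the larger of the length expression result and minLength (Mike's reading of the current spec sentence); with False the expression result alone is the target length (what Steve believes follows from Erratum 5.18, pending his test with IBM DFDL).

```python
from dataclasses import dataclass


@dataclass
class UnparseResult:
    target_length: int   # length the representation is padded or truncated to
    content_length: int  # what dfdl:contentLength would report (characters)
    value_length: int    # what dfdl:valueLength would report (characters)
    string_length: int   # fn:string-length() of the infoset value


def unparse_lengths(value, length_expr, min_length,
                    truncate=False, min_length_sets_target=True):
    """Hypothetical model of unparsing an xs:string with lengthKind 'explicit'."""
    if min_length_sets_target:
        # Reading 1: minLength wins whenever it exceeds the expression result.
        target = max(length_expr, min_length)
    else:
        # Reading 2: the expression result is the fixed length;
        # minLength would only matter for validation.
        target = length_expr

    if len(value) > target:
        if not truncate:
            # Assumption of this sketch: treat an over-long value as an error.
            raise ValueError("value exceeds target length and truncation is off")
        value_length = target      # value is cut down to the target length
    else:
        value_length = len(value)  # padding is not part of the value length

    # With padding, the content always occupies the full target length.
    # fn:string-length() sees only the infoset value, so it never changes.
    return UnparseResult(target, target, value_length, len(value))


# The three examples under reading 1.
example1 = unparse_lengths("1234567890", length_expr=20, min_length=100)
example2 = unparse_lengths("x" * 200, length_expr=20, min_length=40, truncate=True)
example3 = unparse_lengths("x" * 200, length_expr=20, min_length=0, truncate=True)

# Example 1 again under reading 2.
example1_fixed = unparse_lengths("1234567890", length_expr=20, min_length=100,
                                 min_length_sets_target=False)
```

Under reading 1 the sketch gives content lengths of 100, 40 and 20 and value lengths of 10, 40 and 20 for the three examples, matching the figures in Mike's message. Under reading 2, Example 1 comes out with a content length of 20 and a value length of 10. Which reading is intended is exactly the open question here.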