1) For xs:string, if dfdl:lengthKind is 'implicit' then xs:maxLength is used to extract N units from the data. If dfdl:lengthUnits is 'bytes' then N bytes are extracted. If validation is switched on, xs:maxLength is also used to validate that no more than N characters appear in the infoset. This seems problematic where the dfdl:encoding is non-SBCS, because N bytes need not correspond to N characters.

2) For xs:string, if dfdl:lengthKind implies a variable length on output and dfdl:textPadKind is not 'none', then xs:minLength is used to ensure that at least N units are output. If dfdl:lengthUnits is 'bytes' then N bytes are written to the data. If validation is switched on, xs:minLength is also used to validate that at least N characters appear in the infoset. Again this seems problematic where the dfdl:encoding is non-SBCS, for the same reason.
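
To make the mismatch in both cases concrete, here is a rough Python sketch. It is not DFDL processor code: the helper extract_fixed_bytes, the value N = 4, and the sample strings are all hypothetical, chosen only to show how a byte count and a character count diverge once the encoding is multi-byte.

```python
def extract_fixed_bytes(data: bytes, n: int, encoding: str) -> str:
    """Hypothetical helper: take the first n bytes and decode them,
    roughly what an 'implicit' length measured in bytes would do."""
    return data[:n].decode(encoding)


N = 4  # assumed value standing in for xs:maxLength / xs:minLength

# Case 1, single-byte encoding: N bytes always give N characters.
sbcs_data = "héllo".encode("iso-8859-1")
sbcs_value = extract_fixed_bytes(sbcs_data, N, "iso-8859-1")

# Case 1, UTF-8: 'é' occupies two bytes, so N bytes give fewer than
# N characters, and a valid N-character value may not fit in N bytes.
utf8_data = "héllo".encode("utf-8")
utf8_value = extract_fixed_bytes(utf8_data, N, "utf-8")

# A byte boundary can also fall inside a character.
try:
    extract_fixed_bytes(utf8_data, 2, "utf-8")
    split_decodes = True
except UnicodeDecodeError:
    split_decodes = False

# Case 2: a value that already fills N bytes on output (so no padding
# is added) can still be shorter than N characters in the infoset.
short_value = "éé"
short_value_bytes = len(short_value.encode("utf-8"))

print(len(sbcs_value), len(utf8_value), split_decodes,
      len(short_value), short_value_bytes)
```

In this sketch the UTF-8 extraction yields three characters from four bytes, and the two-character value "éé" already occupies four bytes, so it would be written without padding yet fail an xs:minLength check of 4 characters.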

Should we disallow the combinations that actually cause a problem?

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848
