WG call 3rd June: The DFDL spec will change so that an escape block end does not have to be the last thing in the data (after trimming). The escape block end must still always be present. A new erratum will be raised.

Regards
 
Steve Hanson
Architect,
IBM DFDL
Co-Chair,
OGF DFDL Working Group
IBM SWG, Hursley, UK

smh@uk.ibm.com
tel:+44-1962-815848




From:        Steve Hanson/UK/IBM
To:        dfdl-wg@ogf.org,
Date:        20/05/2014 17:50
Subject:        Re: [DFDL-WG] Action 259 - Consider allowing more flexible escapeBlock schemes



As discussed on the call, there is an import case that is not covered in the table, namely where quotes surround a delimiter but the opening quote is not at the start of the data. I imported the following text string into Excel:

        This is "," two separate fields

And indeed two columns were created, meaning the comma was treated as a delimiter and not escaped. This matches DFDL behaviour, so that is good.

Interestingly, the first column was as expected...

        This is "

...but the second was not:

        two separate fields

Notice the leading quote was removed without error, meaning that the absence of the closing quote is permitted!
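
The splitting behaviour observed above can be sketched in a few lines of Python. This is only an illustrative model of the observation, not Excel's or DFDL's actual algorithm: the function name split_record and its parameters are hypothetical, and it assumes a quote starts an escape block only when it is the first character of a field.

```python
def split_record(line, delimiter=",", quote='"'):
    """Split a line into raw fields (hypothetical helper, not a real API).

    Assumption: a quote opens an escape block only at the very start of
    a field; anywhere else it is ordinary data.
    """
    fields = []
    start = 0          # index where the current field begins
    in_block = False   # True while inside an escape block
    i = 0
    while i < len(line):
        ch = line[i]
        if in_block:
            if ch == quote:
                if line[i + 1:i + 2] == quote:
                    i += 1            # doubled quote: skip its second half
                else:
                    in_block = False  # escape block end
        elif ch == quote and i == start:
            in_block = True           # block start recognised only here
        elif ch == delimiter:
            fields.append(line[start:i])
            start = i + 1
        i += 1
    fields.append(line[start:])       # last field, even if a block is still open
    return fields


print(split_record('This is "," two separate fields'))
# ['This is "', '" two separate fields']
```

The second raw field opens an escape block that is never closed. Excel accepted that and simply dropped the leading quote; the sketch leaves the raw field untouched so the unclosed block stays visible.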

Regards
 
Steve Hanson





From:        Tim Kimber/UK/IBM@IBMGB
To:        dfdl-wg@ogf.org,
Date:        13/05/2014 15:37
Subject:        Re: [DFDL-WG] Action 259 - Consider allowing more flexible escapeBlock schemes
Sent by:        dfdl-wg-bounces@ogf.org




That looks fairly conclusive to me. DFDL should fall into line with established practice.

regards,

Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet:  kimbert@uk.ibm.com
Tel. 01962-816742  
Internal tel. 37246742





From:        Steve Hanson/UK/IBM@IBMGB
To:        dfdl-wg@ogf.org,
Date:        13/05/2014 11:50
Subject:        [DFDL-WG] Action 259 - Consider allowing more flexible escapeBlock schemes
Sent by:        dfdl-wg-bounces@ogf.org




Action 259 was raised last call to decide what to do about the following, as minuted:


Steve has an example of an escape block where the escape block end is not at the end of the un-trimmed data. This gives a processing error. Another IBM product accepts this usage. Should DFDL allow this? Or should there be a new escapeKind that allows escapeBlockStart/End anywhere?


I tried importing these values from a CSV file into an Excel spreadsheet and a Symphony spreadsheet (i.e., the successor to 1-2-3), and also accessing them via ODBC using a Microsoft driver, to compare with IBM DFDL and IBM Cast Iron behaviour.

Test  Data                  IBM DFDL           IBM Cast Iron      MS Excel           Lotus Symphony     ODBC
1     This is normal        This is normal     This is normal     This is normal     This is normal     This is normal
2     "This is OK"          This is OK         This is OK         This is OK         This is OK         This is OK
3     "This| is expected"   This| is expected  This| is expected  This| is expected  This| is expected  This| is expected
4     This too "is OK"      This too "is OK"   This too "is OK"   This too "is OK"   This too "is OK"   This too
5     Even "this" is OK     Even "this" is OK  Even "this" is OK  Even "this" is OK  Even "this" is OK  Even
6     "This" is NOT OK      PARSE FAILED       This is NOT OK     This is NOT OK     This is NOT OK     This
7     "This"" is still OK"  This" is still OK  This" is still OK  This" is still OK  This" is still OK  This" is still OK




The data under discussion is test 6. It looks like DFDL is out of step with the behaviour of Excel / Symphony spreadsheets, and Cast Iron has adopted that behaviour too.
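
To make the difference on test 6 concrete, here is a minimal Python sketch of the two behaviours. It is a model of the results reported above, not the code of any of the products tested: the function unescape_field and its allow_trailing flag are hypothetical names, the quote is assumed to be both escapeBlockStart and escapeBlockEnd, and a doubled quote is assumed to be the escaped form of a quote.

```python
QUOTE = '"'


def unescape_field(text, allow_trailing=False):
    """Remove an escape block from an already-delimited field (sketch only).

    allow_trailing=False models the IBM DFDL result: the block end must be
    the last character of the data.
    allow_trailing=True models the Cast Iron / Excel / Symphony results:
    anything after the block end is kept as literal data.
    """
    if not text.startswith(QUOTE):
        # A quote that is not at the start of the field is ordinary data.
        return text
    out = []
    i = 1
    while i < len(text):
        if text[i] == QUOTE:
            if text[i + 1:i + 2] == QUOTE:
                out.append(QUOTE)      # doubled quote is an escaped quote
                i += 2
                continue
            rest = text[i + 1:]        # data following the block end
            if rest and not allow_trailing:
                raise ValueError("escape block end is not at end of data")
            return "".join(out) + rest
        out.append(text[i])
        i += 1
    # Assumption: a missing block end is an error in both modes.
    raise ValueError("escape block end not found")
```

With the default setting, the value from test 6 raises an error, which corresponds to the PARSE FAILED entry; with allow_trailing=True it yields This is NOT OK, as the spreadsheets do. The other six values give the same result in both modes.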


Out of interest I also checked the output behaviour from Excel. That escaped all instances of embedded quotes in the same way as DFDL, so no issues there.


Regards

Steve Hanson

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
