Fw: Spec question about extraEscapedCharacters

Mike

Fyi, from 2009.

Regards

Steve Hanson
IBM Hybrid Integration, Hursley, UK
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
smh@uk.ibm.com
tel:+44-1962-815848
mob:+44-7717-378890
Note: I work Tuesday to Friday

----- Forwarded by Steve Hanson/UK/IBM on 25/05/2021 08:56 -----

From: Steve Hanson/UK/IBM
To: Alan Powell/UK/IBM@IBMGB
Cc: Dragan Besevic/Boca Raton/IBM@IBMUS, Stephanie Fetzer/Charlotte/IBM@IBMUS
Date: 04/11/2009 11:14
Subject: Re: Spec question about extraEscapedCharacters

Alan

On parsing, MRM just looks for the escape character (like DFDL). On writing, MRM needs to know what to escape, and the reserved characters list is used for that (markup is not automatically included, as you say).

So why does DFDL need the extraEscapedCharacters property? Let's say I serialise an infoset corresponding to a structure. That will escape any in-scope markup characters in the data. But if I then include that serialised data as a BLOB in some envelope structure that also has terminating markup, I won't have escaped any of the envelope's markup characters in my BLOB. To do that I need the extraEscapedCharacters property.

I think the spec needs clarifying to say that extraEscapedCharacters is an output-only control.

Regards

Steve Hanson
Programming Model Architect, WebSphere Message Brokers,
OGF DFDL WG Co-Chair,
Hursley, UK,
Internet: smh@uk.ibm.com,
Phone (+44)/(0) 1962-815848

From: Alan Powell/UK/IBM
To: Stephanie Fetzer/Charlotte/IBM@IBMUS, Steve Hanson/UK/IBM@IBMGB
Cc: Dragan Besevic/Boca Raton/IBM@IBMUS
Date: 03/11/2009 10:48
Subject: Re: Spec question about extraEscapedCharacters

Stephanie

extraEscapedCharacters comes from the WMB Reserved Characters property. It isn't exactly the same, as WMB doesn't automatically look for markup. Also, the documentation says that Reserved Characters aren't used on parsing, which doesn't seem consistent. Steve, do you know why?

As currently specified, your example wouldn't be a parsing error, but it may cause the data to be interpreted incorrectly if ^ is in fact markup.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM
email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898

From: Stephanie Fetzer/Charlotte/IBM@IBMUS
To: Alan Powell/UK/IBM@IBMGB
Cc: Dragan Besevic/Boca Raton/IBM@IBMUS
Date: 02/11/2009 21:05
Subject: Spec question about extraEscapedCharacters

Alan:

Real quick question about extraEscapedCharacters. The current definition of the term is:

    String
    A space separated list of single characters that must be escaped in addition to markup.
    Annotation: dfdl:escapeScheme

So - my interpretation is that someone wants to enforce...even though the ^ is not an initiator, terminator, delimiter, or any other type of markup in this document - I want to force the sender of the data to always escape it.

So extraEscapedCharacters="^" and the data to be parsed looks like:

    "NAME#ADDRESS#PHONENUMB^ER"

That would be an error (unless the B is an escape char). We would expect the ^ to always be escaped in the data. So even though we don't need it to be escaped from a parsing standpoint, would we expect it to be a parsing error? Or is the property merely a suggestion on parsing?

Any hints that you can provide on the reasoning for this property would be appreciated.

Cheers,
-Steph
WebSphere Transformation Extender Industry Packs - Software Engineer

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
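
The behaviour described in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MRM or DFDL code: the function names, the choice of "!" as the escape character, and the rule that the escape character also escapes itself are all hypothetical. It shows that writing needs a list of characters to protect (in-scope markup plus any extra escaped characters), while parsing only looks for the escape character, so an unescaped "^" is accepted without error.

```python
def escape_on_output(text, escape_char, markup_chars, extra_escaped=""):
    """Writing side: prefix each protected character with escape_char.

    Protected characters are the in-scope markup, the extra escaped
    characters, and (an assumption of this sketch) the escape character.
    """
    to_escape = set(markup_chars) | set(extra_escaped) | {escape_char}
    return "".join(escape_char + ch if ch in to_escape else ch for ch in text)


def unescape_on_parse(data, escape_char):
    """Parsing side: only the escape character matters.

    The character after escape_char is taken literally; every other
    character, including an unescaped '^', passes through unchanged.
    """
    out = []
    chars = iter(data)
    for ch in chars:
        if ch == escape_char:
            out.append(next(chars, ""))  # keep the escaped character as-is
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    field = "PHONENUMB^ER"
    # Inner structure only knows about its own '#' separator.
    print(escape_on_output(field, "!", "#"))
    # An envelope terminated by '^' needs '^' escaped in the BLOB as well.
    print(escape_on_output(field, "!", "#", extra_escaped="^"))
    # Parsing accepts both the escaped and the unescaped form.
    print(unescape_on_parse("PHONENUMB!^ER", "!"))
    print(unescape_on_parse("PHONENUMB^ER", "!"))
```

In this sketch the extra escaped characters change only what is written; the parsing function never consults them, which matches the suggestion above that the property is an output-only control.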