My views on these 3 weighty issues:
 
1. Should data containing the escapeEscapeCharacter cause escaping to be used, and if so, how should it be escaped?
 
No. I think the EEC alone isn't an active character; it has to be followed by the EC to be interpreted at all. That said, if the pair EEC EC appears in the data, then yes, we must escape the EC with another EEC to avoid it being misinterpreted at read time, which gives EEC EEC EC in the data stream that we output. When we read that back, the first EEC is not followed by an EC, so it is a literal EEC; the second EEC is followed by the EC, so we get a literal EC.
 
Trick: if the EEC and the EC are the same character, then you have to escape both of them, with themselves... er, ah. So, taking "\" as an example: if "\" is in the data item, then we must output "\\", and if "\\" is in the data item, then we must output "\\\\" (which for some reason Microsoft Outlook keeps removing my surrounding quotes from... must be some sort of escape sequence for them!)
 
The rule is consistent though. The above "trick" isn't really a special case. Just apply the rule uniformly: wherever the EC appears in the data, precede it with an EEC on output.
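To make the rule concrete, here is a minimal Python sketch of it. The function names escape_ec and unescape_ec are hypothetical, and the sketch assumes the EC and the EEC are single characters and ignores delimiters altogether; it only shows how the EEC and the EC interact, not a full escape scheme.

```python
def escape_ec(data: str, ec: str, eec: str) -> str:
    """Output side: precede every EC found in the data with an EEC."""
    return "".join(eec + ch if ch == ec else ch for ch in data)


def unescape_ec(stream: str, ec: str, eec: str) -> str:
    """Read side: an EEC is only active when immediately followed by the EC."""
    out = []
    i = 0
    while i < len(stream):
        if stream[i] == eec and i + 1 < len(stream) and stream[i + 1] == ec:
            out.append(ec)  # EEC EC -> literal EC
            i += 2
        else:
            # A lone EEC, or any other character, is passed through as literal.
            # (A real parser would treat a bare EC as an active escape here.)
            out.append(stream[i])
            i += 1
    return "".join(out)


# Distinct characters (assumed for illustration): EEC = "/", EC = "\"
assert escape_ec("a/\\b", "\\", "/") == "a//\\b"      # EEC EC  -> EEC EEC EC
assert unescape_ec("a//\\b", "\\", "/") == "a/\\b"    # reads back unchanged

# Same character for both: "\" -> "\\" and "\\" -> "\\\\"
assert escape_ec("\\", "\\", "\\") == "\\\\"
assert escape_ec("\\\\", "\\", "\\") == "\\\\\\\\"
```

The same two functions cover both the distinct-character case and the same-character case, which is the point: no special-case branch is needed.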
 
2. Should we only look for escapeStartString at the beginning of the data?
 
I'd prefer that we respect an escapeStartString anywhere in the data, although the canonical form when generated puts it at the beginning of the data. However, if we want to be more restrictive/conservative for v1.0, I'm fine with that.
 
3. Property names (everyone has their own favourite, so let's just pick one).
 
Don't care. (Recall - I wanted to call these things quoting schemes....) 


Alan Powell

MP 211, IBM UK Labs, Hursley,  Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM     email: alan_powell@uk.ibm.com  
Tel: +44 (0)1962 815073                  Fax: +44 (0)1962 816898



From: Steve Hanson/UK/IBM
To: Alan Powell/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 19/04/2009 12:24
Subject: Re: [DFDL-WG] Simplified Escape Scheme V3




Alan

Comments:

- I think escapeBlockStart and escapeBlockEnd are better names, that way you can immediately see they are for use with escapeBlock.

- escapeKind. Clarification to escapeBlock parsing behaviour: "On parsing, the escapeStartString is removed from the beginning of the data, the escapeEndString is removed from the end of the data, and any escapeEscapeCharacters are removed when they precede any other occurrences of the escapeEndString in the data."

- extraEscapedCharacters. Clarification: "A space-separated list of single characters that must be escaped in addition to in-scope markup."

- generateEscape. The behaviour when escapeKind = escapeCharacter and value is 'always' is not defined. I would prefer that:
a) The descriptions of 'whenNeeded' behaviour are moved into the escapeKind property to keep all the rules in one place.
b) generateEscape is renamed generateEscapeBlock and only applies to escapeKind = escapeBlock, as that is the only case in which it has an effect.
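
One way to read the escapeBlock parsing wording proposed above is the following Python sketch. The function name parse_escape_block is hypothetical, and returning the data unchanged when it is not wrapped in the block strings is an assumption for illustration. The sketch also does not try to resolve awkward cases such as an EEC sitting directly before the closing escapeEndString.

```python
def parse_escape_block(data: str, start: str, end: str, eec: str) -> str:
    """Strip the block strings and un-escape embedded escapeEndStrings."""
    wrapped = (
        data.startswith(start)
        and data.endswith(end)
        and len(data) >= len(start) + len(end)
    )
    if not wrapped:
        return data  # assumption: data outside an escape block is left alone
    # Remove escapeStartString from the beginning and escapeEndString from the end.
    body = data[len(start):len(data) - len(end)]
    # Remove each EEC that precedes another occurrence of escapeEndString.
    return body.replace(eec + end, end)


# Block strings are both '"' and the EEC is '\' (values assumed for illustration).
assert parse_escape_block('"say \\"hi\\""', '"', '"', "\\") == 'say "hi"'
assert parse_escape_block("plain", '"', '"', "\\") == "plain"
```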

Regards

Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848




From: Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces@ogf.org
To: dfdl-wg@ogf.org
Date: 17/04/2009 15:22
Subject: [DFDL-WG] Simplified Escape Scheme V3






Attached is the latest version of the escape schemes. It includes Steve's and Mike's comments (although not the renaming of properties), removes escapeBlock2, and adds use cases in section 5, which you might like to start with.



The use cases confirm that the syntax works, with some minor clarifications, but highlight two questions:

1. Should data containing the escapeEscapeCharacter cause escaping to be used, and if so, how should it be escaped?
2. Should we only look for escapeStartString at the beginning of the data?

 

Alan Powell






Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU





[attachment "ggf-dfdl-simplified-escape-scheme-v3.doc" deleted by Alan Powell/UK/IBM]
--
 dfdl-wg mailing list
 dfdl-wg@ogf.org
 
http://www.ogf.org/mailman/listinfo/dfdl-wg







