My views on these 3 weighty issues:
1. Should data containing the escapeEscapeCharacter cause escaping to be used,
and if so, how should it be escaped?
No. I think the EEC alone isn't an active character; it has to be followed by
the EC to be interpreted at all. That said, if the pair EEC EC appears in the
data, then yes, we must escape the EC with another EEC to avoid it being
misinterpreted at read time. This results in EEC EEC EC in the final data
stream that we output. When we read it back, the first EEC is not followed by
an EC, so it is a literal EEC; the second EEC is followed by an EC, so we get a
literal EC.
Trick: if the EEC and EC are the same character, then you have to escape both
of them, with themselves. So, taking "\" as an example, if "\" is in the data
item, then we must output "\\", and if "\\" is in the data item, then we must
output "\\\\" (which for some reason Microsoft Outlook keeps removing my
surrounding quotes from... must be some sort of escape sequence for them!)
The rule is consistent though. The above "trick" isn't really a special case.
Just apply the rule uniformly: if you find the EC in the data, you must precede
it with the EEC on output.
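To make the rule above concrete, here is a minimal Python sketch. The function names escape_ec and unescape_ec and the parameters ec and eec are made up for illustration; none of them come from the proposal. It covers only the EC/EEC interplay described here and deliberately ignores how the EC itself escapes delimiters.

```python
def escape_ec(data: str, ec: str, eec: str) -> str:
    """On output, precede every EC found in the data with the EEC."""
    out = []
    for ch in data:
        if ch == ec:
            out.append(eec)  # the EEC goes in front of each EC
        out.append(ch)
    return "".join(out)


def unescape_ec(stream: str, ec: str, eec: str) -> str:
    """On input, an EEC is only active when it is immediately followed by an EC."""
    out = []
    i = 0
    while i < len(stream):
        if stream[i] == eec and i + 1 < len(stream) and stream[i + 1] == ec:
            out.append(ec)  # EEC EC reads as a literal EC
            i += 2
        else:
            out.append(stream[i])  # anything else, including a lone EEC, is literal
            i += 1
    return "".join(out)
```

With this sketch the same two functions handle both the case where the EC and EEC differ and the case where they are the same character, which matches the point that the "trick" is not a special case.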
2. Should we only look for escapeStartString at the beginning of the data?
I'd prefer that we respect it anywhere, but the canonical form when generated
is at the beginning of the data. However, if we want to be more
restrictive/conservative for v1.0, I'm fine with that.
3. Property names (everyone has their own favourite, so let's just pick one.)
Don't care. (Recall - I wanted to call these things quoting schemes....)
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898
From: Steve Hanson/UK/IBM
To: Alan Powell/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 19/04/2009 12:24
Subject: Re: [DFDL-WG] Simplified Escape Scheme V3
Alan
Comments:
- I think escapeBlockStart and escapeBlockEnd are better names; that way you
can immediately see they are for use with escapeBlock.
- escapeKind. Clarification to escapeBlock parsing behaviour: "On parsing, the
escapeStartString is removed from the beginning of the data and the
escapeEndString is removed from the end of the data, and any
escapeEscapeCharacters are removed when they precede any other occurrences of
the escapeEndString in the data."
- extraEscapedCharacters. Clarification: "A space separated list of single
characters that must be escaped in addition to in-scope markup"
- generateEscape. The behaviour when escapeKind = escapeCharacter and the value
is 'always' is not defined. I would prefer that:
a) The descriptions of 'whenNeeded' behaviour are moved into the escapeKind
property to keep all the rules in one place.
b) generateEscape is renamed generateEscapeBlock and only applies to
escapeKind = escapeBlock, as that is the only case where it has an effect.
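The escapeBlock parsing clarification in the list above can be sketched roughly as follows in Python. The function name parse_escape_block and its parameter names are hypothetical, and the sketch assumes the data item is already isolated and that the block markers appear only at its very beginning and end; it does not settle what happens when the closing escapeEndString is itself preceded by an escapeEscapeCharacter.

```python
def parse_escape_block(data: str, start: str, end: str, eec: str) -> str:
    """Strip the block markers and drop any EEC that precedes an embedded end string."""
    wrapped = (
        len(data) >= len(start) + len(end)
        and data.startswith(start)
        and data.endswith(end)
    )
    if not wrapped:
        return data  # not an escape block, so the data is returned unchanged
    inner = data[len(start):len(data) - len(end)]
    # An EEC directly before an occurrence of the end string is removed,
    # leaving the end string as literal data.
    return inner.replace(eec + end, end)
```

For instance, with '"' as both block markers and "\" as the escapeEscapeCharacter, the input "say \"hi\" now" (including the outer quotes) would come back as say "hi" now under these assumptions.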
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces@ogf.org
17/04/2009 15:22
To: dfdl-wg@ogf.org
cc:
Subject: [DFDL-WG] Simplified Escape Scheme V3
Attached is the latest version of escape schemes. It includes Steve and Mike's
comments (although not the property renaming), removes escapeBlock2 and adds
use cases in section 5, which you might like to start with.
The use cases confirm that the syntax works with some minor clarifications, but
they highlight two questions:
1. Should data containing the escapeEscapeCharacter cause escaping to be used,
and if so, how should it be escaped?
2. Should we only look for escapeStartString at the beginning of the data?
Unless stated otherwise above:
IBM United
Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
[attachment
"ggf-dfdl-simplified-escape-scheme-v3.doc" deleted by Alan Powell/UK/IBM]
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg