While working in the escapeBlock area, I seem to have found a gap in the definition for DFDL's behaviour when using escape blocks.

From section 13.2.1, relevant extracts are:

When 'escapeBlock': On unparsing the entire data are escaped by adding dfdl:escapeBlockStart to the beginning and dfdl:escapeBlockEnd to the end of the data. The data is either always escaped or escaped when needed as specified by dfdl:generateEscapeBlock. If the data is escaped and contains the dfdl:escapeBlockEnd then first character of each appearance of the dfdl:escapeBlockEnd is escaped by the dfdl:escapeEscapeCharacter.

and

On parsing the dfdl:escapeBlockStart string must be the first characters in the (trimmed) data in order to activate the escape scheme. The dfdl:escapeBlockStart string is removed from the beginning of the data. Until a matching dfdl:escapeBlockEnd string (that is, one not preceded by the dfdl:escapeEscapeCharacter) is found in the data, any in-scope terminating delimiter encountered in the data is not interpreted as such, and any dfdl:escapeEscapeCharacters are removed when they precede an dfdl:escapeBlockEnd string.

Now consider a model where:
escapeBlockStart="start"
escapeBlockEnd="end"
escapeEscapeCharacter="#"

Then take a logical value of:
A hash is a #

When we serialize the value, we wrap it with the escapeBlockStart and escapeBlockEnd, and we precede any instance of the escapeBlockEnd within the data with an escapeEscapeCharacter. The value contains no escapeBlockEnd, so nothing is escaped, and this gives us the physical value "startA hash is a #end". If we were to parse that data, we would see the "#end" as an escaped escapeBlockEnd, treat it as data, and then report that there is no escapeBlockEnd.
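
To make the round trip concrete, here is a minimal Python sketch of the two extracts as I read them. It is not taken from any DFDL implementation: the function names unparse and parse are made up for illustration, it assumes the data is always escaped, and it ignores trimming and in-scope delimiters.

```python
# Hypothetical model values from the example above.
START, END, ESC = "start", "end", "#"


def unparse(value: str) -> str:
    """Wrap the value in an escape block, escaping each embedded END with ESC."""
    # Assumption: the data is always escaped (no "when needed" check).
    return START + value.replace(END, ESC + END) + END


def parse(data: str) -> str:
    """Strip the escape block, removing ESC where it precedes END."""
    if not data.startswith(START):
        raise ValueError("escape scheme not activated")
    body = data[len(START):]
    out = []
    i = 0
    while i < len(body):
        if body.startswith(ESC + END, i):
            # Escaped END: drop the ESC and keep END as data.
            out.append(END)
            i += len(ESC) + len(END)
        elif body.startswith(END, i):
            # Unescaped END closes the block.
            return "".join(out)
        else:
            out.append(body[i])
            i += 1
    raise ValueError("no unescaped escapeBlockEnd found")


if __name__ == "__main__":
    print(parse(unparse("the end")))      # a value containing END round-trips
    physical = unparse("A hash is a #")   # trailing ESC is left as-is
    print(physical)
    try:
        parse(physical)
    except ValueError as err:
        print("parse failed:", err)
```

Under those assumptions a value that merely contains "end" round-trips, while a value that ends in "#" cannot be parsed back, because the trailing "#" combines with the real escapeBlockEnd.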

The gap in the behavioural definition seems to be that the specification says nothing about escaping an instance of the escapeEscapeCharacter itself when serializing. There is nothing to catch the case of an escapeEscapeCharacter that is not meant to escape an escapeBlockEnd but ends up doing so because it happens to sit immediately before one.

Andy

Andy Edwards - IBM Integration Bus - DFDL