When 'escapeBlock': On unparsing the entire data are escaped by adding dfdl:escapeBlockStart to the beginning and dfdl:escapeBlockEnd to the end of the data. The data is either always escaped or escaped when needed as specified by dfdl:generateEscapeBlock. If the data is escaped and contains the dfdl:escapeBlockEnd then first character of each appearance of the dfdl:escapeBlockEnd is escaped by the dfdl:escapeEscapeCharacter.
and
On parsing the dfdl:escapeBlockStart string must be the first characters in the (trimmed) data in order to activate the escape scheme. The dfdl:escapeBlockStart string is removed from the beginning of the data. Until a matching dfdl:escapeBlockEnd string (that is, one not preceded by the dfdl:escapeEscapeCharacter) is found in the data, any in-scope terminating delimiter encountered in the data is not interpreted as such, and any dfdl:escapeEscapeCharacters are removed when they precede an dfdl:escapeBlockEnd string.
Now consider a model where:
escapeBlockStart="start"
escapeBlockEnd="end"
escapeEscapeCharacter="#"
Then take a logical value of:
A hash is a #
When we serialize the value, we wrap it in the escapeBlockStart and escapeBlockEnd, and we precede any instance of the escapeBlockEnd within the data with an escapeEscapeCharacter. The value contains no escapeBlockEnd, so nothing is escaped, and this gives us the physical value "startA hash is a #end". If we were to parse that data, we would see the "#end" as an escaped escapeBlockEnd and report that there is no escapeBlockEnd.
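The round trip described above can be sketched in a few lines of Python. This is only an illustration of the rules as I read them from the quoted text, not how any DFDL processor is implemented; the function names unparse_escape_block and parse_escape_block are made up for this example, and the sketch ignores trimming, dfdl:generateEscapeBlock and in-scope delimiters.

```python
def unparse_escape_block(value, start="start", end="end", esc="#"):
    """Wrap value in the escape block, escaping embedded escapeBlockEnd strings.

    Assumption: the block is always generated, and only occurrences of the
    escapeBlockEnd are escaped (the escapeEscapeCharacter itself is not).
    """
    escaped = value.replace(end, esc + end)
    return start + escaped + end


def parse_escape_block(data, start="start", end="end", esc="#"):
    """Recover the logical value from data that begins with escapeBlockStart."""
    if not data.startswith(start):
        return data  # escape scheme not activated
    body = data[len(start):]
    out = []
    i = 0
    while i < len(body):
        if body.startswith(esc + end, i):
            # Escaped escapeBlockEnd: drop the escape character, keep the text.
            out.append(end)
            i += len(esc) + len(end)
        elif body.startswith(end, i):
            # Unescaped escapeBlockEnd: the block is closed here.
            return "".join(out)
        else:
            out.append(body[i])
            i += 1
    raise ValueError("no escapeBlockEnd found")


# A value containing the escapeBlockEnd round-trips as expected.
assert parse_escape_block(unparse_escape_block("the end")) == "the end"

# A value ending in the escapeEscapeCharacter does not.
physical = unparse_escape_block("A hash is a #")
assert physical == "startA hash is a #end"
try:
    parse_escape_block(physical)
except ValueError as err:
    print("parse failed:", err)
```

With the model above, the first value survives the round trip, while the second produces the physical value "startA hash is a #end" and then fails to parse, because the trailing "#" is taken as escaping the real escapeBlockEnd.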
The gap in the behavioural definition seems to be that the specification makes no provision for escaping an instance of an escapeEscapeCharacter when serializing. There is nothing to catch the case of an escapeEscapeCharacter that isn't meant to escape an escapeBlockEnd but ends up doing so by circumstance.
Andy
Andy Edwards - IBM Integration Bus - DFDL