258 | Consider allowing more flexible escapeCharacter schemes (Mike)
6/5: Motivated by the example of an escape character which is active when in front of an in-scope delimiter, but not when in front of another character.
20/5: Can't model Mike's example with current facilities, but Mike's example is a generalisation of a particular MITRE example. Do we really need this? Jonathan to follow up.
3/6:
17/6: Re-opened. vCard 3.0 (http://tools.ietf.org/html/rfc2426) is an example of a format that exhibits the need for this. Need a proposal that handles this case and fits in with the existing extraEscapedCharacters and escapeEscapeCharacter properties. Noted that using lengthKind 'pattern' is sometimes a way of working round this kind of thing.
...
15/7: No progress.
22/7: Steve has started to write up a proposal. |
New property dfdl:applyEscapeCharacter added. The description of dfdl:escapeKind is updated. No changes to dfdl:generateEscapeBlock, but I've added it below by way of comparison.
Property Name | Description |
escapeKind | Enum
Valid values: 'escapeCharacter', 'escapeBlock'.
The type of escape mechanism defined in the escape scheme.
When 'escapeCharacter': On unparsing, a single character of the data is escaped by adding a dfdl:escapeCharacter or dfdl:escapeEscapeCharacter immediately before it. The characters to escape are determined by property dfdl:applyEscapeCharacter. On parsing, any in-scope terminating delimiter encountered in the data is not interpreted as such when it is immediately preceded by the dfdl:escapeCharacter (when that dfdl:escapeCharacter is not itself preceded by the dfdl:escapeEscapeCharacter). Occurrences of the dfdl:escapeCharacter and dfdl:escapeEscapeCharacter are removed from the data as determined by property dfdl:applyEscapeCharacter, with two exceptions: a dfdl:escapeCharacter is not removed when it is preceded by the dfdl:escapeEscapeCharacter, and a dfdl:escapeEscapeCharacter is not removed when it does not precede the dfdl:escapeCharacter.
When 'escapeBlock': On unparsing, the entire data is escaped by adding dfdl:escapeBlockStart to the beginning and dfdl:escapeBlockEnd to the end of the data. The data is either always escaped or escaped when needed, as specified by dfdl:generateEscapeBlock. If the data is escaped and contains the dfdl:escapeBlockEnd, then the first character of each appearance of the dfdl:escapeBlockEnd is escaped by the dfdl:escapeEscapeCharacter. On parsing, the dfdl:escapeBlockStart string must be the first characters in the (trimmed) data in order to activate the escape scheme. The dfdl:escapeBlockStart string is removed from the beginning of the data. Until a matching dfdl:escapeBlockEnd string (that is, one not preceded by the dfdl:escapeEscapeCharacter) is found in the data, any in-scope terminating delimiter encountered in the data is not interpreted as such, and any dfdl:escapeEscapeCharacters are removed when they precede a dfdl:escapeBlockEnd string. The matching dfdl:escapeBlockEnd string is removed from the data. The matching dfdl:escapeBlockEnd does not have to be the last characters in the (trimmed) data in order to de-activate the escape scheme. A dfdl:escapeBlockStart occurring anywhere in the data other than the first characters has no significance.
Annotation: dfdl:escapeScheme |
applyEscapeCharacter | Enum
Valid values: 'whenNeeded', 'delimiters'.
Controls when escape characters are removed during parsing, and output during unparsing, when dfdl:escapeKind is 'escapeCharacter'.
When 'whenNeeded': During unparsing, the following are escaped as described in dfdl:escapeKind when they are in the data:
· any in-scope terminating delimiter, by escaping its first character
· dfdl:escapeCharacter (escaped by dfdl:escapeEscapeCharacter)
· any dfdl:extraEscapedCharacters
During parsing, occurrences of dfdl:escapeCharacter and dfdl:escapeEscapeCharacter are interpreted and removed from the data as described in dfdl:escapeKind.
When 'delimiters': During unparsing, the following are escaped as described in dfdl:escapeKind when they are in the data:
· any in-scope terminating delimiter, by escaping its first character
· dfdl:escapeCharacter (escaped by dfdl:escapeEscapeCharacter)
During parsing, occurrences of dfdl:escapeCharacter and dfdl:escapeEscapeCharacter are interpreted and removed from the data as described in dfdl:escapeKind, except that dfdl:escapeCharacter is only removed when it immediately precedes an in-scope terminating delimiter.
Annotation: dfdl:escapeScheme |
generateEscapeBlock | Enum
Valid values: 'always', 'whenNeeded'.
Controls when escaping is used on unparsing when dfdl:escapeKind is 'escapeBlock'.
If 'always', then escaping always occurs as described in dfdl:escapeKind.
If 'whenNeeded', then escaping occurs as described in dfdl:escapeKind when the data contains any of the following:
· any in-scope terminating delimiter
· dfdl:escapeBlockStart at the start of the data
· any dfdl:extraEscapedCharacters
Annotation: dfdl:escapeScheme |
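
To make the difference between the two proposed dfdl:applyEscapeCharacter values concrete, here is a rough Python sketch of the parsing side when dfdl:escapeKind is 'escapeCharacter'. It is not taken from any DFDL implementation. The function name `parse_field`, its parameters, and the simplification to a single in-scope terminating delimiter are all assumptions made for illustration, as is the choice to silently drop a trailing escape character in 'whenNeeded' mode.

```python
def parse_field(data, delimiter, escape_char="\\", escape_escape_char="\\",
                apply_mode="whenNeeded"):
    """Scan one field up to the first unescaped delimiter (illustrative only).

    Returns (field_value, index_where_scanning_stopped).
    apply_mode mirrors the proposed dfdl:applyEscapeCharacter values.
    """
    if apply_mode not in ("whenNeeded", "delimiters"):
        raise ValueError("apply_mode must be 'whenNeeded' or 'delimiters'")

    out = []
    i = 0
    while i < len(data):
        if data.startswith(escape_escape_char + escape_char, i):
            # Escaped escape character: drop the escapeEscapeCharacter,
            # keep the escapeCharacter as ordinary data.
            out.append(escape_char)
            i += len(escape_escape_char) + len(escape_char)
        elif data.startswith(escape_char, i):
            after = i + len(escape_char)
            if data.startswith(delimiter, after):
                # Escaped delimiter: the escape is removed in both modes
                # and the delimiter becomes part of the data.
                out.append(delimiter)
                i = after + len(delimiter)
            elif apply_mode == "whenNeeded":
                # Escape in front of any other character: remove the escape
                # and keep the following character (assumption: a trailing
                # escape with nothing after it is simply dropped).
                out.append(data[after:after + 1])
                i = min(after + 1, len(data))
            else:
                # 'delimiters': an escape not in front of a delimiter
                # stays in the data untouched.
                out.append(escape_char)
                i = after
        elif data.startswith(delimiter, i):
            # Unescaped delimiter terminates the field.
            break
        else:
            out.append(data[i])
            i += 1
    return "".join(out), i


if __name__ == "__main__":
    sample = "a\\;b\\nc;rest"  # the characters: a\;b\nc;rest
    for mode in ("whenNeeded", "delimiters"):
        print(mode, parse_field(sample, ";", apply_mode=mode))
```

With the sample input `a\;b\nc;rest` and `;` as the delimiter, 'whenNeeded' yields the value `a;bnc`, while 'delimiters' yields `a;b\nc`, leaving the backslash in front of `n` in the data. The latter is the behaviour the 6/5 entry describes: an escape character that is active in front of an in-scope delimiter but not in front of another character.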