The use cases that led us to consider the recursive use of DFDL to define
markup or other DFDL properties are:
a) Case insensitivity of data (e.g., true & TRUE for text boolean)
b) Case insensitivity of markup (e.g., hdr & HDR for initiator)
c) Different possible values for non-white space markup (e.g., @ and # for separator)
d) Different possible values for data (e.g., true & yes for text boolean)
e) Encoding of markup different to encoding of data (e.g., initiator and terminator different to data)
The proposal is to use various existing mechanisms to handle all of these
use cases, which removes the need to include recursive use of DFDL in 1.0.
a) Case insensitivity of data (e.g., true & TRUE for text boolean)
- Use a single flag, dfdl:valueIgnoreCase, to cover all affected properties
- Properties:
  - dfdl:occursStopValue
  - dfdl:numberZeroRep
  - dfdl:nilValues
  - dfdl:textBooleanTrue
  - dfdl:textBooleanFalse
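
As a rough sketch of how the single flag might look on an element: the fragment below assumes that dfdl:valueIgnoreCase takes "yes" or "no" (the proposal does not state its value set), that the xs and dfdl prefixes are already bound, and that text representation is in scope. The element name "active" is purely illustrative.

```xml
<!-- Sketch only. dfdl:valueIgnoreCase is the flag proposed above;
     its "yes"/"no" values are an assumption, not part of the proposal. -->
<xs:element name="active" type="xs:boolean"
            dfdl:textBooleanTrue="true"
            dfdl:textBooleanFalse="false"
            dfdl:valueIgnoreCase="yes"/>
```

Under these assumptions, "true", "TRUE" and "True" in the data would all be read as boolean true without listing each spelling.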
b) Case insensitivity of markup (e.g., hdr & HDR for initiator)
- Use a single flag, dfdl:valueIgnoreCase, to cover all affected properties
- Properties:
  - dfdl:initiator
  - dfdl:terminator
  - dfdl:separator
c) Different possible values for non-white space markup (e.g., @ and # for separator)
- Use a multi-value property. Propose that the property name remains singular.
- Properties:
  - dfdl:initiator
  - dfdl:terminator
  - dfdl:separator
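
A minimal sketch of a multi-value markup property, assuming the values are written as a whitespace-separated list (the proposal does not fix the list syntax) and that the xs and dfdl prefixes are already bound. The element names "a" and "b" are illustrative only.

```xml
<!-- Sketch only. The whitespace-separated list syntax is an assumption. -->
<xs:sequence dfdl:separator="@ #">
  <xs:element name="a" type="xs:string"/>
  <xs:element name="b" type="xs:string"/>
</xs:sequence>
```

With the same assumed syntax, an initiator that may be "Name" or "name" but not "NAME" could be written as dfdl:initiator="Name name" with case insensitivity left off.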
d) Different possible values for data (e.g., true & yes for text boolean)
- Use a multi-value property. Propose that the property name remains singular, so dfdl:nilValues becomes dfdl:nilValue.
- Properties:
  - dfdl:occursStopValue
  - dfdl:numberZeroRep
  - dfdl:nilValues
  - dfdl:textBooleanTrue
  - dfdl:textBooleanFalse
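
The same idea applied to data values might look like the fragment below. It assumes a whitespace-separated list syntax and already-bound xs and dfdl prefixes; the element name "flag" is hypothetical.

```xml
<!-- Sketch only. The list syntax is an assumption; the proposal above
     states only that these properties become multi-valued. -->
<xs:element name="flag" type="xs:boolean"
            dfdl:textBooleanTrue="true yes"
            dfdl:textBooleanFalse="false no"/>
```

The proposal does not say which of the listed values would be written on output, so that remains an open point for this sketch.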
e) Encoding of markup different to encoding of data (e.g., initiator and terminator different to data)
- Use <xs:sequence> to wrap the element and carry the markup, for example:

    <xs:sequence dfdl:encoding="ascii" dfdl:separator=":">
      <xs:sequence dfdl:encoding="ebcdic" dfdl:initiator="VAL" dfdl:terminator="END">
        <xs:element name="val" type="..." dfdl:encoding="ascii"/>
      </xs:sequence>
    </xs:sequence>

- This should handle all cases of what is a rare occurrence anyway, and it still allows the speculative parsing rules to apply.
- The alternative is to treat the markup as a value (the EDI scenario). This is the subject of a separate action, 026, which will be solved using variables or another technique, but not by using DFDL recursively.
There are some other properties to which cases a), b), c) and d) could apply.
We need to decide whether case insensitivity and/or multiple values are
appropriate for these:
- dfdl:textPadChar
- dfdl:escapeCharacter
- dfdl:escapeForEscapeCharacter
- dfdl:escapeBlockStart
- dfdl:escapeBlockEnd
- dfdl:numberGroupSeparator
- dfdl:numberDecimalSeparator
- dfdl:numberExponentCharacter
- dfdl:numberInfinityRep
- dfdl:numberNanRep

Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 09/06/2009 13:27 -----

Steve Hanson/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces@ogf.org
15/04/2009 13:47
To: dfdl-wg@ogf.org
Subject: [DFDL-WG] Recursive use of DFDL for variable markup - use case

From last week's call:

7. Recursive use of DFDL for variable markup

Use of a DFDL annotated element/type to describe an initiator, length prefix,
terminator, separator, etc. Steve suggested that the most important use of a
"variable markup-like mechanism" in IBM's WTX product is to reference a
location earlier in the bit stream where a delimiter value is found. We handle
this already by use of a path expression. The additional variable markup
mechanism was intended to avoid a proliferation of keywords for various corner
cases on initiator, terminator and separator. E.g., what if you want the
initiator to be "Name" or "name" only, not "NAME", "nAmE", etc.? In that case
case insensitivity is not expressive enough. This can always be modeled, just
not as an initiator tag. The feeling was to leave out variable markup (other
than for prefix lengths) for v1.0, and to propose the minimum set of extra
properties that can be used to address the common use cases, but IBM needed to
check whether this satisfies all WTX use cases.