dfdl-wg
Alan/Steve - Changes after our meeting this morning (June 16, 2009)
Define a variable format block just like we define textNumberFormat. The
variable block could be at schema level but preferably should be defined
within the defineFormat definition, so that variables are scoped to the format
definition and do not pollute the schema namespace.
The variable block is referenced from the dfdl:format annotation just like
the textNumberFormat block is referenced through the textNumberFormatRef
annotation.
It addresses the problem highlighted today in the meeting related to the
semantics of the dfdl:ref="myFormat" annotation on an xsd construct. The net
effect of this annotation is as if the dfdl:format annotation (along with the
attribute settings) is specified explicitly on the xsd construct. By
providing a level of indirection and referencing the variable block through
varRef annotations, we have permitted visibility to variables but not to
the definition of variables.
Excerpts from MyEnvelopeFormat.xsd - attached for perusal. Also attached
is the PI for the example (see updated xsd files under the folder
enveloper_skk).
<xsd:annotation>
  <xsd:appinfo source="http://dataformat.org/">
    <dfdl:defineFormat name="MyEnvelopeFormat"
                       baseFormat="dfdlDefaultFormat:defaultFormat">
      <!-- Statically override properties specific to the envelope
           wrapped messages -->
      <dfdl:format byteOrder="big-Endian"
                   varRef="./MyEnvelopeFormatVariables"/>
      <!--
        Identify variables for processing the messages wrapped in an
        envelope, i.e. the set of property values that I need to take
        from input data and use to dynamically set the values of dfdl
        properties during the processing of the message.
        Assumption: names of variables defined in the format are unique.
      -->
      <!-- Define the variables needed for the format locally in the
           format block -->
      <!-- They could also be defined at the schema level and referenced
           through varRef -->
      <!-- as its definition is a QName just like textFormatBlock or
           numberFormatBlock -->
      <!-- Best practice would be to constrain the variable block
           locally -->
      <dfdl:defineVariableBlock name="MyEnvelopeFormatVariables">
        <dfdl:defineVariable name="sep" type="string" />
        <dfdl:defineVariable name="enc" type="string" />
        <dfdl:defineVariable name="outputDirPathSep" type="string"
                             default="/" use="output" />
        <dfdl:defineVariable name="outputMsgKind" type="string"
                             use="output" />
        <dfdl:defineVariable name="inputMsgKind" type="string"
                             use="input" />
      </dfdl:defineVariableBlock>
    </dfdl:defineFormat>
  </xsd:appinfo>
</xsd:annotation>
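To make the indirection concrete, here is a minimal Python sketch of how a tool might resolve the variables visible through varRef. It is an illustration only: the namespace URI is a placeholder, the helper name visible_variables is hypothetical, and reading the "./" prefix as "a block defined inside the same defineFormat" is an assumption drawn from the excerpt, not an agreed rule.

```python
import xml.etree.ElementTree as ET

DFDL_NS = "urn:example:dfdl"  # placeholder namespace URI, not taken from the thread

# Cut-down copy of the excerpt, with a namespace declaration added so it parses.
SNIPPET = f"""
<dfdl:defineFormat xmlns:dfdl="{DFDL_NS}" name="MyEnvelopeFormat">
  <dfdl:format byteOrder="big-Endian" varRef="./MyEnvelopeFormatVariables"/>
  <dfdl:defineVariableBlock name="MyEnvelopeFormatVariables">
    <dfdl:defineVariable name="sep" type="string"/>
    <dfdl:defineVariable name="outputDirPathSep" type="string" default="/" use="output"/>
    <dfdl:defineVariable name="inputMsgKind" type="string" use="input"/>
  </dfdl:defineVariableBlock>
</dfdl:defineFormat>
"""


def visible_variables(define_format):
    """Return the variables reachable through the format's varRef attribute.

    Assumption: a leading "./" means the block is local to this defineFormat.
    """
    ns = {"dfdl": DFDL_NS}
    fmt = define_format.find("dfdl:format", ns)
    if fmt is None or fmt.get("varRef") is None:
        return {}
    ref = fmt.get("varRef")
    block_name = ref[2:] if ref.startswith("./") else ref
    for block in define_format.findall("dfdl:defineVariableBlock", ns):
        if block.get("name") == block_name:
            return {
                var.get("name"): {
                    "type": var.get("type"),
                    "default": var.get("default"),
                    "use": var.get("use"),
                }
                for var in block.findall("dfdl:defineVariable", ns)
            }
    return {}


variables = visible_variables(ET.fromstring(SNIPPET))
```

A construct that refers to MyEnvelopeFormat would see the names in variables (with their type, default and use) without the definitions themselves being copied onto it.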
Suman Kalia
IBM Toronto Lab
WMB Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.h…
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia(a)ca.ibm.com

Fw: Minute for OGF DFDL Working Group Calls, June 8-10 2009 ( Action item 042)
by Suman Kalia 15 Jun '09
Alan,
>>> SKK proposed scoping by putting dfdl:defineVariable within a
dfdl:defineFormat. Will update the file path example to illustrate the
design.
Attached is the original example, reworked by defining variables in the
define format.
The directory enveloper_skk contains the xsds based on the proposal.
Suman Kalia
IBM Toronto Lab
WMB Toolkit Architect and Development Lead
WebSphere Business Integration Application Connectivity Tools
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.h…
Tel : 905-413-3923 T/L 969-3923
Fax : 905-413-4850 T/L 969-4850
Internet ID : kalia(a)ca.ibm.com
----- Forwarded by Suman Kalia/Toronto/IBM on 06/15/2009 04:40 PM -----
From:
Steve Hanson <smh(a)uk.ibm.com>
To:
Alan Powell <alan_powell(a)uk.ibm.com>
Cc:
dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date:
06/12/2009 09:30 AM
Subject:
Re: [DFDL-WG] Minute for OGF DFDL Working Group Calls, June 8-10 2009
Alan
Thanks for writing up the three days' worth of discussions.
A couple of small corrections:
1) Peter Lambros & Tim Kimber took part in some of the calls.
2) Action 028 Variable Markup
The use cases for variable markup are:
a) Case insensitivity of data (eg, true & TRUE for text boolean)
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
c) Different possible values for non-white space markup (eg, @ and # for
separator)
d) Different possible values for data (eg, true & yes for text boolean)
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
SH proposed a solution to each of these that did not require
recursive use of DFDL (see email 9/6/2009).
After discussion some minor changes were agreed and will be documented.
Recursive use of DFDL will not be supported for markup in
DFDL v1 (it is still used for the dfdl:prefixLengthType property).
3) Action 043 Types in the infoset.
It was agreed that the types in the infoset will be the schema built-in
types. When parsing, sufficient checking occurs to ensure the
data can be converted to the built-in type. Rules have been defined for
the valid data representations for each built-in type.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces(a)ogf.org
11/06/2009 15:43
To
dfdl-wg(a)ogf.org
cc
Subject
[DFDL-WG] Minute for OGF DFDL Working Group Calls, June 8-10 2009
Open Grid Forum: Data Format Description Language Working Group
OGF DFDL Working Group Calls, June 8-10 2009
These minutes cover the series of meetings and calls which took place June
8-10 2009
Meeting opened, 14:00 UK
Attendees
Steve Hanson (IBM)
Mike Beckerle (Oco)
Suman Kalia (IBM)
Alan Powell (IBM)
Stephanie Fetzer (IBM)
Apologies
Agenda:
Action 026 Envelopes and payloads
It is expected that an xs:choice will be the way that payloads are
defined in an envelope. xs:any support, as currently defined is not
sufficient for this purpose (see action 050). There will be no support for
dynamic binding of envelope and payload at runtime.
Markup defined in the data is covered under action 042 Variables.
Action 027 Property Precedence
Tables are being updated to incorporate recent decisions.
Action 028 Variable Markup
The use cases for variable markup are:
a) Case insensitivity of data (eg, true & TRUE for text boolean)
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
c) Different possible values for non-white space markup (eg, @ and # for
separator)
d) Different possible values for data (eg, true & yes for text boolean)
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
SH proposed a solution to each of these that did not require variable
markup (see email 9/6/2009).
After discussion some minor changes were agreed and will be documented.
Variable markup will not be supported in DFDL v1.
Action 029 valueCalc
SH and AP proposed a new XPath function (or a new parameter on
dfdl:length) that gives the representation length of an element without
padding or truncation being applied. This would allow all the use cases to
be supported.
The meaning of the various lengthKinds is unchanged. In particular
lengthKind='explicit' always gets the length from the dfdl:length property
on parsing and unparsing even when it is an expression. If the expression
gets the length from a previous data field then the new function can be
used to ensure that the field contains the correct length.
MB to update description
MB to update grammar diagrams to add padding fields.
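As a rough illustration of the valueCalc idea above, the following Python sketch models a record made of a two-digit length field followed by a data field whose explicit length comes from that field. The function names (representation_length, unparse_fixed, unparse_record, parse_record) and the record layout are assumptions made for the example; the minutes do not name the new function.

```python
def representation_length(value: str, encoding: str = "ascii") -> int:
    """Length in bytes of the encoded value, before padding or truncation."""
    return len(value.encode(encoding))


def unparse_fixed(value: str, length: int, pad: bytes = b" ",
                  encoding: str = "ascii") -> bytes:
    """Write a field of exactly `length` bytes: truncate, then pad on the right."""
    raw = value.encode(encoding)
    return raw[:length].ljust(length, pad)


def unparse_record(data: str) -> bytes:
    """Unparse: the length field is calculated from the unpadded data length."""
    n = representation_length(data)
    return f"{n:02d}".encode("ascii") + unparse_fixed(data, n)


def parse_record(buf: bytes) -> str:
    """Parse: the length field is read first and then drives the data length."""
    n = int(buf[:2].decode("ascii"))
    return buf[2:2 + n].decode("ascii")
```

Because the length field is computed from the unpadded representation, a round trip such as parse_record(unparse_record("hello")) returns the original value, while unparse_fixed alone still pads or truncates to whatever explicit length it is given.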
Action 042 Variables.
The use cases for variables are:
1. Extracting syntax data fields
2. As an indicator to identify Payload
3. As an easier way to set bits in bitmap
4. As the way dfdl properties can be set from outside the parser.
And the following aspects need to be defined
Scoping of defineVariable
naming/namespaces - include/import
Unparsing
Variable Type - enums
Multiple setVariables in loops etc
SKK proposed scoping by putting dfdl:defineVariable within a
dfdl:defineFormat. Will update the file path example to illustrate the
design.
Action 043 Types in the infoset.
It was agreed that the types in the infoset will be the schema built-in
types. Sufficient validation occurs to ensure the data can be converted to
the built-in type. Rules have been defined for the valid data representations
for each built-in type.
Action 047 Scoping for non-format annotations
Scoping rules have been defined and agreed. Awaiting the resolution of the
defineVariable discussion.
Action 050 xs:any Support
The support for xs:any is currently limited to allowing initiated string
fields which is very limited. Support should be either dropped or extended
to allow complex content. It is proposed that xs:any is dropped from dfdl
V1.
Implementation concerns with current scoping rules.
Concerns have been raised that the current scoping rules make it difficult
to design good implementations.
In particular, it is difficult for an editor to show which
dfdl properties are set on an element without extensive tree walking. A
dfdl schema validator would not be able to tell the context in which global
components were reused, so global components would have to be valid in
their own right.
A number of possible solutions were discussed including removing scoping
and introducing dfdl:schema wide defaults. A proposal to keep scoping but
modify the rules was the favoured option. MB to document.
Next call 16 and 17th June 14:00 UK Scheduled for 2 hours
Meeting closed, 16:30 UK
Actions raised at this meeting
No
Action
051
Scoping rules.
MB: to document change to scoping rules to satisfy implementation concerns
Current Actions:
No
Action
012
AP/SH: Update decimalCalendarScheme
10/9: Not allocated yet
17/9: No update
24/9: Add calendar binary formats to actions
22/10: No progress
16/1: proposal distributed and discussed. Will be redistributed
21/1: add locale,
04/02: changed from locale to specific properties
18/2: Need more investigation of ICU strict/lax behaviour.
08/04: Not discussed
22/04: AP to complete asap once the ICU strict/lax behaviour is
understood.
29/04: No progress
06/05: No progress
13/05: Calendar has been added to latest spec version v034 but still a few
details to clarify.
20/05: No Progress
27/05: No Progress
03/06: No Progress (low priority)
09/06: No Progress (low priority)
026
SH: Envelopes and Payloads
08/04: Not discussed explicitly, but recursive use of DFDL is tied up with
this
22/04: Two aspects. Firstly compositional - do sufficient mechanisms exist
to model an envelope with a payload that varies. Secondly markup syntax -
this might be defined in the envelope.
The second of these is very much tied up with the variable markup action
028, so will be considered there. SH to verify the composition aspect.
29/04: SH and AP working on proposal. related to Action 028
06/05: No progress
06/05: No progress
20/05: No Progress
27/05: Still a number of aspects to be decided.
- Composition - Do the envelope and payload need to be defined in the
same schema or should they be dynamically bound at runtime?
- Composition - How is a variable payload specified? Choice or xs:any; new
action raised to discuss xs:any
- Extracting dynamic syntax from data. Covered by action 029 valueCalc.
03/06: Dynamic runtime binding will not be supported.
SH investigating use of variables to enable standalone and use in envelope
of global element.
09/06: Payload should be specified using a choice rather than xs:any
027
SH: Property precedence tables
08/04: Not discussed
22/04: Two things missing from the existing precedence trees. Firstly,
does not show alternates (eg, initiator v initiatorkind). Secondly, need a
tree per concrete DFDL object (eg, element). SH to update.
29/04: No progress
06/05: SH is updating tables which will be ready for next call
13/05: SH emailed updated version. AP commented. See minutes for issues
and property changes.
20/05: Updated version circulated. Review before next call and be ready
for vote.
27/05: Updated version circulated. more comments raised.
03/06: Further updates to clarify 'core'. Also identified missing design
for outputMinLength
09/06:
028
SH: Variable markup
08/04: Discussed briefly at end of call, IBM to see whether there any use
cases that require recursive use of DFDL.
15/04: Use case was distributed and will be discussed on next call.
22/04: The use case in question is EDI where the terminating markup for
the payload segments is defined in the ISA envelope segment. The markup is
modelled as an element of simple type where the allowable markup values
are defined as enums on the type. But we need to handle two cases -
firstly where the envelope is present, so the value used by the payload is
taken from the envelope. Secondly where only the payload is present. Here
we need a way of scanning for all the enum values, and adopting the one we
actually find, when parsing. And using a default when unparsing. SH to
explore use of a DFDL variable, where the variable has a default, but also
has a type that is the same as the markup element - that way we get to use
the enums without defining everything twice.
29/04: SH and AP working on proposal.
06/05: No progress
13/05: No progress
20/05: No Progress
27/05: Progress made and will tie to other actions
03/06: General desire to avoid having to introduce variable markup in V1.
Proposed having a property to control case behaviour of all syntax
(initiator, terminator,separator) rather than separate ones for each.
Similar property to 'values' (textZeroRep, textBooleanTrueRep, etc). and
allowing lists of values. SH needs to solve the remaining use case as
described in action 026
09/06: SH proposal discussed. ICU questions to be researched
029
MB: valueCalc (output length calculation)
08/04: Not discussed
22/04: Action allocated to MB, this is to complete the work started at the
Hursley WG F2F meeting.
29/04: No progress
06/05: MB will have update for next call
13/05: MB will have update for next call
20/05: Some progress. will be circulated this week
27/05: MB circulated proposal and got comments. Will update and review on
next call
03/06: Discussed proposal. MB to update dealing with use cases raised.
Options include a new lengthKind='Reference' to make it easier to
distinguish from fixed length case. Or use outputLengthCalc to separate
calculation of parsing and unparsing length.
09/06: SH/AP proposal discussed and MB to document
033
AP/TK: Assert/Discriminator semantics. AP to document. TK to check uses of
discriminator besides choice.
08/04: In progress within IBM
22/04: Waiting for TK to return from leave to complete.
29/04: TK has sent examples showing the need for discriminators beyond choice.
Agreed. MB to respond to TK
06/05: Discussed suggestion of adding type indicator to discriminator. MB
to provide examples.
15/03: Semantics documented in v034. MB to provide examples of need for
scope indicator on discriminator
20/05: MB to provide examples of need for scope indicator on discriminator
(but lower priority than action 029)
27/05: No Progress (lower priority)
03/06: No Progress (lower priority)
09/06: No Progress (lower priority)
037
All: Approach for XML Schema 1.0 UPA checks.
22/04: Several non-XML models, when expressed in their most obvious DFDL
Schema form, would fail XML Schema 1.0 Unique Particle Attribution checks
that police model ambiguity. And even re-jigging the model sometimes
fails to fix this. Note this is equally applicable to XML Schema 1.1 and
1.0. While the DFDL parser/unparser can happily resolve the ambiguities,
the issue is one of definition. If an XSD editor that implements UPA
checks is used to create DFDL Schema, then errors will be flagged. DFDL
may have to adopt the position that:
a)DFDL parser/unparser will not implement some/all UPA checks (exact
checks tbd)
b) XML Schema editors that implement UPA checks will not be suitable for
all DFDL models
c) If DFDL annotations are removed, the resulting pure XSD will not always
be valid (ie, the equivalent XML is ambiguous and can't be modelled by XML
Schema 1.0)
Ongoing in case another solution can be found.
29/04: Will ask DG and S Gao for opinion before closing
06/05: Discussed S Gao email and suggestions. Decided need to review all
XML UPA rules and decide which apply to dfdl.
20/05: SH or SKK to investigate
27/05: No Progress
03/06: The concern is that some dfdl schemas will fail UPA check when
validation is turned on or when edited using tooling that enforces UPA
checks. Renaming fields will resolve some/most issues. Need documentation
that describes issue and best practice.
038
MB: Submit response to OMG RFI for non-XML standardization
22/04: First step is for MB to mail the OGF Data Area chair to say that we
want to submit
29/04: MB has been in contact with OMG and will submit dfdl.
06/05: MB has prepared response to OMG. Will send DFDL spec v033
20/05: Response has been sent to OMG based on v034
27/05: Awaiting response from OMG.
03/06: On hold
042
MB: Complete variable specification.
To include how properties such as encoding can be set externally. Must be
a known variable name.
06/05: No progress
20/05: AP to make proposal
27/05: MB proposed differentiating between input and output variables to
avoid unnecessary evaluations during parse and unparse. Need to complete
rest of variable specification.
03/06: Pointed out problem of declaring variables input or output when
used to define syntax which is used both times. MB to update proposal to
include how variables are set externally and how specific properties such
as encoding are set.
09/06: SKK to use example to document his proposal
043
13/05: Types in the infoset. Currently infoset types have defined value
space but that implies a parser would have to validate input. Is this
correct?
20/05: SH No progress
27/05: No Progress
03/06: No Progress
09/06: SH proposed staying with XML built-in types. Closed
044
13/05: Bidi
20/05: AP: will check what IBM products support.
27/05: Bidi is supported so will be needed in dfdl v1
03/06: No Progress
09/06: No Progress
045
20/05 AP: Speculative Parsing
27/05: Pseudocode has been circulated. Review for next call
03/06: Comments received and will be incorporated
09/06: Progress but not discussed
047
20/05 AP: Scoping for non-format annotations
27/05: Discussed briefly. AP to distribute
03/06: Proposal discussed briefly. Will be updated.
09/06: Doc emailed. Awaiting outcome of variable to define/setvariable
rules.
048
20/05: AP investigate Restart
27/05: Suggest RESTART is not part of the scope for DFDL.
03/06: not discussed
09/06: Closed
049
20/05 AP Built-in specification description and schemas
03/06: not discussed
050
27/05: xs:any currently limited to initiated text element. Is this
sufficient? Should xs:any in its current form be deferred?
03/06: not discussed
09/06: Proposed dropping xs:any support
051
Scoping rules.
MB: to document change to scoping rules to satisfy implementation concerns
Closed actions:
Work items:
No  | Item | Target version | Status
003 | Variables - ??, 2008 (Mike) | |
005 | Improvements on property descriptions - ??, 2008 (All - split TBD) | |
006 | Envelopes and Payloads (Steve) - Apr 30, 2008 | |
007 | (from draft 32) valueCalc (Mike) - ??, 2008 | | mostly complete
008 | (from draft 32) Property precedence for writing (Steve) - | | under review
009 | (from draft 32) Variable markup (Steve) - Mar 31, 2008 | | proposal needs writing up
011 | (from draft 32) How speculative parsing works (combining choice and variable-occurrence - currently these are separate) ??, 2008 (IBM) | | in progress
012 | (from draft 32) Reordering the properties discussion: move representation earlier, improve flow of topics ??, 2008 (Alan) | | not started
027 | Calendar schemes | 034 |
032 | Floating components | |
033 | Changes from action 020 and 027 - renaming properties etc | |
035 | Remove unorderedInitiated, add initiated content (a041) | |
036 | Update dfdl schema with change properties (Suman) | |
037 | Infoset text codepage | |
038 | Improve length section | |
039 | Change scoping of simple types (A 046) | |
040 | Document outputMinLength (A027) | |
042 | mapping of the dfdl infoset to XDM | | Not required for V1 specification
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg(a)ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

15 Jun '09
The use cases for considering the inclusion of the recursive use of DFDL
to define markup or other DFDL properties are:
a) Case insensitivity of data (eg, true & TRUE for text boolean)
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
c) Different possible values for non-white space markup (eg, @ and # for
separator)
d) Different possible values for data (eg, true & yes for text boolean)
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
The proposal is to use various existing mechanisms to handle all these use
cases, and negate the need to include recursive use of DFDL in 1.0.
a) Case insensitivity of data (eg, true & TRUE for text boolean)
- Use a single flag dfdl:ignoreCase to cover all affected properties
- Properties:
- dfdl:occursStopValue
- dfdl:numberZeroRep **
- dfdl:nilValues
- dfdl:textBooleanTrue
- dfdl:textBooleanFalse
- dfdl:numberInfinityRep **
- dfdl:numberNanRep **
- dfdl:numberExponentCharacter **
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
- Use same flag dfdl:ignoreCase to cover all affected properties
- Properties:
- dfdl:initiator
- dfdl:terminator
- dfdl:separator
c) Different possible values for non-white space markup (eg, @ and # for
separator)
- Use multi-value property. Propose that property name remains singular.
- Properties:
- dfdl:initiator
- dfdl:terminator
- dfdl:separator
d) Different possible values for data (eg, true & yes for text boolean)
- Use multi-value property. Propose that property name remains singular,
so dfdl:nilValues becomes dfdl:nilValue singular.
- Properties:
- dfdl:occursStopValue
- dfdl:numberZeroRep **
- dfdl:nilValues
- dfdl:textBooleanTrue
- dfdl:textBooleanFalse
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
- Use <xs:sequence> to wrap the element and carry the markup, for example:
<sequence dfdl:encoding="ascii" dfdl:separator=":">
  <sequence dfdl:encoding="ebcdic" dfdl:initiator="VAL"
            dfdl:terminator="END">
    <element name="val" type="..." dfdl:encoding="ascii" />
  </sequence>
</sequence>
- This should be able to handle all cases of what is a rare occurrence
anyway, and still allows speculative parsing rules to apply.
- This technique also allows you to change the dfdl:ignoreCase property
between markup and data.
- Alternative is to treat the markup as a value (the EDI scenario) - this
is the subject of a separate action 026, which will be solved using
variables or another technique, but not by using DFDL recursively.
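The combined effect of proposals a) to d), a single ignoreCase flag plus multi-valued properties, can be sketched in Python as below. This is only an illustration: match_markup is a hypothetical helper, and trying the longest candidate first is an assumption of the sketch, not something the proposal specifies.

```python
from typing import Optional, Sequence


def match_markup(data: str, pos: int, values: Sequence[str],
                 ignore_case: bool = False) -> Optional[str]:
    """Return the text at `pos` that matches one of `values`, or None.

    `values` plays the role of a multi-valued property such as
    dfdl:separator; `ignore_case` plays the role of dfdl:ignoreCase.
    """
    # Assumption: prefer the longest candidate so "##" wins over "#".
    for candidate in sorted(values, key=len, reverse=True):
        if not candidate:
            continue  # an empty value can never identify markup
        window = data[pos:pos + len(candidate)]
        if window == candidate:
            return window
        if ignore_case and window.lower() == candidate.lower():
            return window
    return None
```

For example, match_markup("hdr:abc", 0, ["HDR"], ignore_case=True) finds the initiator from case b), and match_markup("a@b#c", 1, ["@", "#"]) finds the separator from case c); the same call shape covers data values such as "true" and "yes" for a text boolean.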
There are some other properties to which cases a), b), c), d) could apply,
but it has been decided that the flexibility is not needed in practice.
- dfdl:textPadCharacter
- dfdl:escapeCharacter
- dfdl:escapeForEscapeCharacter
- dfdl:escapeBlockStart
- dfdl:escapeBlockEnd
- dfdl:numberGroupSeparator
- dfdl:numberDecimalSeparator
** ICU assumes a single char for nan, infinity, and exponent. That's too
restrictive for us, so propose using the DFDL nan, infinity and exponent
properties like the zero rep property - they are used to pre-process the
data for ICU on parsing, and applied to the ICU output on unparsing.
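A minimal Python sketch of that pre-processing and post-processing step follows. Python's float() and repr() stand in for the ICU number parser and formatter here, and the function names and the example representations ("zilch", "NotANumber", "+oo") are invented for the illustration.

```python
import math
from typing import Sequence


def _is_rep(text: str, reps: Sequence[str], ignore_case: bool) -> bool:
    if ignore_case:
        return text.lower() in {rep.lower() for rep in reps}
    return text in reps


def preprocess(text: str, zero_reps: Sequence[str] = (),
               nan_reps: Sequence[str] = (), inf_reps: Sequence[str] = (),
               ignore_case: bool = False) -> str:
    """Map the configured representations to text the number parser accepts."""
    if _is_rep(text, zero_reps, ignore_case):
        return "0"
    if _is_rep(text, nan_reps, ignore_case):
        return "nan"
    if _is_rep(text, inf_reps, ignore_case):
        return "inf"
    return text


def parse_number(text: str, **reps) -> float:
    # float() is a stand-in for the ICU parser.
    return float(preprocess(text, **reps))


def unparse_number(value: float, zero_reps: Sequence[str] = (),
                   nan_reps: Sequence[str] = (),
                   inf_reps: Sequence[str] = ()) -> str:
    """Format the number, then apply the first configured representation."""
    out = repr(float(value))  # stand-in for the ICU formatter output
    if out == "nan" and nan_reps:
        return nan_reps[0]
    if out == "inf" and inf_reps:
        return inf_reps[0]
    if out in ("0.0", "-0.0") and zero_reps:
        return zero_reps[0]
    return out
```

The sketch only handles positive infinity and whole-field matches; how signs and an embedded exponent character would be treated is left open, as it is in the note above.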
For date/time support, the comparisons made by ICU when checking days,
months, etc are case-insensitive, so DFDL does not need to provide any
extra behaviour.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 09/06/2009 13:27 -----
Steve Hanson/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces(a)ogf.org
15/04/2009 13:47
To
dfdl-wg(a)ogf.org
cc
Subject
[DFDL-WG] Recursive use of DFDL for variable markup - use case
From last week's call:
7. Recursive use of DFDL for variable markup
Use of a DFDL annotated element/type to describe an initiator, length
prefix, terminator, separator, etc. Steve suggested the most important use
of "variable markup-like mechanism" in IBM's WTX product is to reference a
location earlier in the bit stream where a delimiter value is found. We
handle this already by use of a path expression. The additional variable
markup mechanism was to avoid proliferation of keywords for various corner
cases on initiator, terminator and separator. Eg., what if you want the
initiator to be "Name" or "name" only, not "NAME", "nAmE", etc. So case
insensitive is not expressive enough. This can always be modeled, just not
as an initiator tag. Feeling was to leave out variable markup (other than
for prefix lengths) for v1.0, and to propose the minimum set of extra
properties that can be used to address the common use cases, but that IBM
needed to see whether this satisfied all WTX use cases.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
Open Grid Forum: Data Format Description Language Working Group
OGF DFDL Working Group Calls, June 8-10 2009
These minutes cover the series of meetings and calls which took place June
8-10 2009
Meeting opened, 14:00 UK
Attendees
Steve Hanson (IBM)
Mike Beckerle (Oco)
Suman Kalia (IBM)
Alan Powell (IBM)
Stephanie Fetzer (IBM)
Apologies
Agenda:
Action 026 Envelopes and payloads
It is expected that an xs:choice will be the way that payloads are
defined in an envelope. xs:any support, as currently defined is not
sufficient for this purpose (see action 050). There will be no support for
dynamic binding of envelope and payload at runtime.
Markup defined in the data is covered under action 042 Variables.
Action 027 Property Precedence
Tables are being updated to incorporate recent decisions.
Action 028 Variable Markup
The use cases for variable markup are:
a) Case insensitivity of data (eg, true & TRUE for text boolean)
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
c) Different possible values for non-white space markup (eg, @ and # for
separator)
d) Different possible values for data (eg, true & yes for text boolean)
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
SH proposed a solution to each of these that did not require variable
markup (see email 9/6/2009).
After discussion some minor changes were agreed and will be documented.
Variable markup will not be supported in DFDL v1.
Action 029 valueCalc
SH and AP proposed a new XPath function (or a new parameter on
dfdl:length) that gives the representation length of an element without
padding or truncation being applied. This would allow all the use cases to
be supported.
The meaning of the various lengthKinds is unchanged. In particular
lengthKind='explicit' always gets the length from the dfdl:length property
on parsing and unparsing even when it is an expression. If the expression
gets the length from a previous data field then the new function can be
used to ensure that the field contains the correct length.
MB to update description
MB to update grammar diagrams to add padding fields.
Action 042 Variables.
The use cases for variables are:
1. Extracting syntax data fields
2. As an indicator to identify Payload
3. As an easier way to set bits in bitmap
4. As the way dfdl properties can be set from outside the parser.
And the following aspects need to be defined
Scoping of defineVariable
naming/namespaces - include/import
Unparsing
Variable Type - enums
Multiple setVariables in loops etc
SKK proposed scoping by putting dfdl:defineVariable within a
dfdl:defineFormat. Will update the file path example to illustrate the
design.
Action 043 Types in the infoset.
It was agreed that the types in the infoset will be the schema built-in
types. Sufficient validation occurs to ensure the data can be converted to
the built-in type. Rules have been defined for the valid data representations
for each built-in type.
Action 047 Scoping for non-format annotations
Scoping rules have been defined and agreed. Awaiting the resolution of the
defineVariable discussion.
Action 050 xs:any Support
Support for xs:any currently allows only initiated string fields, which is
very restrictive. Support should be either dropped or extended to allow
complex content. It is proposed that xs:any is dropped from DFDL V1.
Implementation concerns with current scoping rules.
Concerns have been raised that the current scoping rules make it difficult
to design good implementations.
In particular it is difficult for an editor to show which dfdl properties
are set on an element without extensive tree walking. A dfdl schema
validator would not be able to tell the context where global components
were reused, so global components would have to be valid in their own
right.
A number of possible solutions were discussed including removing scoping
and introducing dfdl:schema wide defaults. A proposal to keep scoping but
modify the rules was the favoured option. MB to document.
Next call 16 and 17th June 14:00 UK Scheduled for 2 hours
Meeting closed, 16:30 UK
Actions raised at this meeting
No
Action
051
Scoping rules.
MB: to document change to scoping rules to satisfy implementation concerns
Current Actions:
No
Action
012
AP/SH: Update decimalCalendarScheme
10/9: Not allocated yet
17/9: No update
24/9: Add calendar binary formats to actions
22/10: No progress
16/1: proposal distributed and discussed. Will be redistributed
21/1: add locale,
04/02: changed from locale to specific properties
18/2: Need more investigation of ICU strict/lax behaviour.
08/04: Not discussed
22/04: AP to complete asap once the ICU strict/lax behaviour is
understood.
29/04: No progress
06/05: No progress
13/05: Calendar has been added to latest spec version v034 but still a few
details to clarify.
20/05: No Progress
27/05: No Progress
03/06: No Progress (low priority)
09/06: No Progress (low priority)
026
SH: Envelopes and Payloads
08/04: Not discussed explicitly, but recursive use of DFDL is tied up with
this
22/04: Two aspects. Firstly compositional - do sufficient mechanisms exist
to model an envelope with a payload that varies. Secondly markup syntax -
this might be defined in the envelope.
The second of these is very much tied up with the variable markup action
028, so will be considered there. SH to verify the composition aspect.
29/04: SH and AP working on proposal. related to Action 028
06/05: No progress
06/05: No progress
20/05: No Progress
27/05: Still a number of aspects to be decided.
- Composition - Do the envelope and payload need to be defined in the
same schema or should they be dynamically bound at runtime?
- Composition - How is a variable payload specified? Choice or xs:any; new
action raised to discuss xs:any
- Extracting dynamic syntax from data. Covered by action 029 valueCalc.
03/06: Dynamic runtime binding will not be supported.
SH investigating use of variables to enable standalone and use in envelope
of global element.
09/06: Payload should be specified using a choice rather than xs:any
027
SH: Property precedence tables
08/04: Not discussed
22/04: Two things missing from the existing precedence trees. Firstly,
does not show alternates (eg, initiator v initiatorkind). Secondly, need a
tree per concrete DFDL object (eg, element). SH to update.
29/04: No progress
06/05: SH is updating tables which will be ready for next call
13/05: SH emailed updated version. AP commented. See minutes for issues
and property changes.
20/05: Updated version circulated. Review before next call and be ready
for vote.
27/05: Updated version circulated. more comments raised.
03/06: Further updates to clarify 'core'. Also identified missing design
for outputMinLength
09/06:
028
SH: Variable markup
08/04: Discussed briefly at end of call, IBM to see whether there are any
use cases that require recursive use of DFDL.
15/04: Use case was distributed and will be discussed on next call.
22/04: The use case in question is EDI where the terminating markup for
the payload segments is defined in the ISA envelope segment. The markup is
modelled as an element of simple type where the allowable markup values
are defined as enums on the type. But we need to handle two cases -
firstly where the envelope is present, so the value used by the payload is
taken from the envelope. Secondly where only the payload is present. Here
we need a way of scanning for all the enum values, and adopting the one we
actually find, when parsing. And using a default when unparsing. SH to
explore use of a DFDL variable, where the variable has a default, but also
has a type that is the same as the markup element - that way we get to use
the enums without defining everything twice.
29/04: SH and AP working on proposal.
06/05: No progress
13/05: No progress
20/05: No Progress
27/05: Progress made and will tie to other actions
03/06: General desire to avoid having to introduce variable markup in V1.
Proposed having a single property to control case behaviour of all syntax
(initiator, terminator, separator) rather than separate ones for each,
a similar property for 'values' (textZeroRep, textBooleanTrueRep, etc.), and
allowing lists of values. SH needs to solve the remaining use cases as
described in action 026.
09/06: SH proposal discussed. ICU questions to be researched
029
MB: valueCalc (output length calculation)
08/04: Not discussed
22/04: Action allocated to MB, this is to complete the work started at the
Hursley WG F2F meeting.
29/04: No progress
06/05: MB will have update for next call
13/05: MB will have update for next call
20/05: Some progress. will be circulated this week
27/05: MB circulated proposal and got comments. Will update and review on
next call
03/06: Discussed proposal. MB to update to deal with the use cases raised.
Options include a new lengthKind='Reference' to make it easier to
distinguish from the fixed length case, or use of outputLengthCalc to
separate calculation of parsing and unparsing length.
09/06: SH/AP proposal discussed and MB to document
033
AP/TK: Assert/Discriminator semantics. AP to document. TK to check uses of
discriminator besides choice.
08/04: In progress within IBM
22/04: Waiting for TK to return from leave to complete.
29/04: TK has sent examples showing the need for discriminators beyond
choice. Agreed. MB to respond to TK
06/05: Discussed suggestion of adding type indicator to discriminator. MB
to provide examples.
13/05: Semantics documented in v034. MB to provide examples of need for
scope indicator on discriminator
20/05: MB to provide examples of need for scope indicator on discriminator
(but lower priority than action 029)
27/05: No Progress (lower priority)
03/06: No Progress (lower priority)
09/06: No Progress (lower priority)
037
All: Approach for XML Schema 1.0 UPA checks.
22/04: Several non-XML models, when expressed in their most obvious DFDL
Schema form, would fail XML Schema 1.0 Unique Particle Attribution checks
that police model ambiguity. And even re-jigging the model sometimes
fails to fix this. Note this is equally applicable to XML Schema 1.1 and
1.0. While the DFDL parser/unparser can happily resolve the ambiguities,
the issue is one of definition. If an XSD editor that implements UPA
checks is used to create DFDL Schema, then errors will be flagged. DFDL
may have to adopt the position that:
a) DFDL parser/unparser will not implement some/all UPA checks (exact
checks tbd)
b) XML Schema editors that implement UPA checks will not be suitable for
all DFDL models
c) If DFDL annotations are removed, the resulting pure XSD will not always
be valid (ie, the equivalent XML is ambiguous and can't be modelled by XML
Schema 1.0)
Ongoing in case another solution can be found.
29/04: Will ask DG and S Gao for opinion before closing
06/05: Discussed S Gao email and suggestions. Decided need to review all
XML UPA rules and decide which apply to dfdl.
20/05: SH or SKK to investigate
27/05: No Progress
03/06: The concern is that some dfdl schemas will fail UPA checks when
validation is turned on or when edited using tooling that enforces UPA
checks. Renaming fields will resolve some/most issues. Need documentation
that describes the issue and best practice.
038
MB: Submit response to OMG RFI for non-XML standardization
22/04: First step is for MB to mail the OGF Data Area chair to say that we
want to submit
29/04: MB has been in contact with OMG and will submit dfdl.
06/05: MB has prepared response to OMG. Will send DFDL spec v033
20/05: Response has been sent to OMG based on v034
27/05: Awaiting response from OMG.
03/06: On hold
042
MB: Complete variable specification.
To include how properties such as encoding can be set externally. Must be
a known variable name.
06/05: No progress
20/05: AP to make proposal
27/05: MB proposed differentiating between input and output variables to
avoid unnecessary evaluations during parse and unparse. Need to complete
rest of variable specification.
03/06: Pointed out problem of declaring variables input or output when
used to define syntax which is used both times. MB to update proposal to
include how variables are set externally and how specific properties such
as encoding are set.
09/06: SKK to use example to document his proposal
043
13/05: Types in the infoset. Currently infoset types have defined value
space but that implies a parser would have to validate input. Is this
correct?
20/05: SH No progress
27/05: No Progress
03/06: No Progress
09/06: SH proposed staying with XML built-in types. Closed
044
13/05: Bidi
20/05: AP: will check what IBM products support.
27/05: Bidi is supported so will be needed in dfdl v1
03/06: No Progress
09/06: No Progress
045
20/05 AP: Speculative Parsing
27/05: Pseudocode has been circulated. Review for next call
03/06: Comments received and will be incorporated
09/06: Progress but not discussed
047
20/05 AP: Scoping for non-format annotations
27/05: Discussed briefly. AP to distribute
03/06: Proposal discussed briefly. Will be updated.
09/06: Doc emailed. Awaiting outcome of the variable discussion to settle
the defineVariable/setVariable rules.
048
20/05: AP investigate Restart
27/05: Suggest RESTART is not part of the scope for DFDL.
03/06: not discussed
09/06: Closed
049
20/05 AP Built-in specification description and schemas
03/06: not discussed
050
27/05: xs:any currently limited to initiated text element. Is this
sufficient? Should xs:any in its current form be deferred?
03/06: not discussed
09/06: Proposed dropping xs:any support
051
Scoping rules.
MB: to document change to scoping rules to satisfy implementation concerns
Closed actions:
Work items:
No
Item
target version
status
003
Variables - ??, 2008 (Mike)
005
Improvements on property descriptions - ??, 2008 (All - split TBD)
006
Envelopes and Payloads (Steve) - Apr 30, 2008
007
(from draft 32) valueCalc (Mike) - ??, 2008
mostly
complete
008
(from draft 32) Property precedence for writing (Steve) -
under review
009
(from draft 32) Variable markup (Steve) - Mar 31, 2008
proposal needs writing up
011
(from draft 32) How speculative parsing works (combining choice and
variable-occurrence - currently these are separate) ??, 2008 (IBM)
in progress
012
(from draft 32) Reordering the properties discussion: move representation
earlier, improve flow of topics ??, 2008 (Alan)
not started
027
Calendar schemes
034
032
Floating components
033
Changes from action 020 and 027 - renaming properties etc
035
Remove unorderedInitiated, add initiated content (a041)
036
Update dfdl schema with change properties (Suman)
037
Infoset text codepage
038
Improve length section
039
Change scoping of simple types (A 046)
040
Document outputMinLength (A027)
042
mapping of the dfdl infoset to XDM
Not required for V1 specification
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
This comment appears in draft 034 with respect to leading/trailing skip
bytes. Since I am in there fixing up the grammar and associated pictures I'd
like to fix this also if a change is needed. Unfortunately, MS Word change
tracking says this comment is by "Lab User" ... so I have no idea who said
this, but I assume it's an IBMer.
In our MRM model, for a repeating element we state that a leading skipCount
applies only to the first occurrence of the array element and a trailing
skipCount applies to all elements of the array. The model in that sense is
not symmetrical. I am wondering whether we should model this properly for
arrays, where we state that an array construct as such has a prefix region
and a suffix region which carry these alignment attributes.
I will try to find the COBOL test case which demonstrates this issue.
Right now leading/trailing skip bytes are symmetric, and would apply to
every element of an "array". There is no asymmetry as per this comment.
Can someone check into this so I can fix these grammars/diagrams including
this change (if needed)?
...mike
Agenda:
1. Go through actions.
2. Implementation concerns
The implementation of a dfdl editor, validator and parser has highlighted
some concerns. SH to send email.
3. AOB
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
The use cases for considering the inclusion of the recursive use of DFDL
to define markup or other DFDL properties are:
a) Case insensitivity of data (eg, true & TRUE for text boolean)
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
c) Different possible values for non-white space markup (eg, @ and # for
separator)
d) Different possible values for data (eg, true & yes for text boolean)
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
The proposal is to use various existing mechanisms to handle all these use
cases, and negate the need to include recursive use of DFDL in 1.0.
a) Case insensitivity of data (eg, true & TRUE for text boolean)
- Use a single flag dfdl:valueIgnoreCase to cover all affected properties
- Properties:
- dfdl:occursStopValue
- dfdl:numberZeroRep
- dfdl:nilValues
- dfdl:textBooleanTrue
- dfdl:textBooleanFalse
b) Case insensitivity of markup (eg, hdr & HDR for initiator)
- Use a single flag dfdl:valueIgnoreCase to cover all affected properties
- Properties:
- dfdl:initiator
- dfdl:terminator
- dfdl:separator
c) Different possible values for non-white space markup (eg, @ and # for
separator)
- Use multi-value property. Propose that property name remains singular.
- Properties:
- dfdl:initiator
- dfdl:terminator
- dfdl:separator
d) Different possible values for data (eg, true & yes for text boolean)
- Use multi-value property. Propose that property name remains singular,
so dfdl:nilValues becomes dfdl:nilValue singular.
- Properties:
- dfdl:occursStopValue
- dfdl:numberZeroRep
- dfdl:nilValues
- dfdl:textBooleanTrue
- dfdl:textBooleanFalse
e) Encoding of markup different to encoding of data (eg, initiator and
terminator different to data)
- Use <xs:sequence> to wrap the element and carry the markup, for example:
  <sequence dfdl:encoding="ascii" dfdl:separator=":">
    <sequence dfdl:encoding="ebcdic" dfdl:initiator="VAL"
              dfdl:terminator="END">
      <element name="val" type="..." dfdl:encoding="ascii" />
    </sequence>
  </sequence>
- This should be able to handle all cases of what is a rare occurrence
anyway, and still allows speculative parsing rules to apply.
- Alternative is to treat the markup as a value (the EDI scenario) - this
is the subject of a separate action 026, which will be solved using
variables or another technique, but not by using DFDL recursively.
There are some other properties to which cases a), b), c), d) could apply.
We need to decide whether or not case sensitivity and/or multi-values are
appropriate to these:
- dfdl:textPadChar
- dfdl:escapeCharacter
- dfdl:escapeForEscapeCharacter
- dfdl:escapeBlockStart
- dfdl:escapeBlockEnd
- dfdl:numberGroupSeparator
- dfdl:numberDecimalSeparator
- dfdl:numberExponentCharacter
- dfdl:numberInfinityRep
- dfdl:numberNanRep
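To make cases a) to d) concrete, here is a minimal Python sketch of how a parser might test for markup when a property carries a list of values and an ignore-case flag. The function name match_markup and the longest-first matching order are assumptions made for illustration; the proposal above does not define either.

```python
def match_markup(data, pos, values, ignore_case=False):
    """Return the candidate from `values` found at data[pos:], or None.

    `values` models a multi-value property such as a separator list.
    Longest candidates are tried first (an assumption of this sketch)
    so that a short value cannot shadow a longer one.
    """
    for candidate in sorted(values, key=len, reverse=True):
        window = data[pos:pos + len(candidate)]
        if ignore_case:
            # Compare case-insensitively, as an ignore-case flag would.
            if window.casefold() == candidate.casefold():
                return candidate
        elif window == candidate:
            return candidate
    return None


# Case b): one initiator value, matched regardless of case.
print(match_markup("hdr:abc", 0, ["HDR"], ignore_case=True))
# Case c): either "@" or "#" is accepted as the separator.
print(match_markup("a#b", 1, ["@", "#"]))
```

A list of exact values also covers the "Name" or "name" but not "NAME" situation from the forwarded note, which a plain ignore-case flag cannot express.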
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 09/06/2009 13:27 -----
Steve Hanson/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces(a)ogf.org
15/04/2009 13:47
To
dfdl-wg(a)ogf.org
cc
Subject
[DFDL-WG] Recursive use of DFDL for variable markup - use case
From last week's call:
7. Recursive use of DFDL for variable markup
Use of a DFDL annotated element/type to describe an initiator, length
prefix, terminator, separator, etc. Steve suggested the most important use
of "variable markup-like mechanism" in IBM's WTX product is to reference a
location earlier in the bit stream where a delimiter value is found. We
handle this already by use of a path expression. The additional variable
markup mechanism was to avoid proliferation of keywords for various corner
cases on initiator, terminator and separator. Eg., what if you want the
initiator to be "Name" or "name" only, not "NAME", "nAmE", etc. So case
insensitive is not expressive enough. This can always be modeled, just not
as an initiator tag. Feeling was to leave out variable markup (other than
for prefix lengths) for v1.0, and to propose the minimum set of extra
properties that can be used to address the common use cases, but that IBM
needed to see whether this satisfied all WTX use cases.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
I did not get as far as I wanted to on this issue. I would like to discuss
this example:
<sequence>
  <element name="len" type="int"
           dfdl:fillByte="%#r0;"
           dfdl:outputValueCalc=
             "{
               dfdl:representation-output-length(../val)
             }" />
  ... many elements in between ....
  <element name="val" type="string"
           dfdl:encoding="utf-8"
           dfdl:lengthKind="explicit"
           dfdl:lengthUnits="bytes"
           dfdl:useLengthForOutput="false"
           dfdl:length="{ ../len }"
           dfdl:outputLength="{
             fn:ceiling(
               dfdl:representation-inherent-length(.) div 4
             ) * 4
           }"
           dfdl:textTrimKind="padChar"
           dfdl:textStringJustification="left"
           dfdl:textPadCharacter="%#r0;"
  />
</sequence>
You will notice I added a dfdl:outputLength property, and a
dfdl:representation-output-length() function and
dfdl:representation-inherent-length().
I am accepting candidates for better names for these properties and
functions. We need to distinguish these 3 concepts:
1) inherent length - of the infoset item without reference to any facets,
and without respect to escape sequences, padding or truncation.
(TBD: think about escape sequences. Is this right?)
2) output target length - the length of the box we're filling in with the
data value representation. The box can be bigger or smaller than the
inherent length, which implies use of padding/filling, or truncation.
3) input length - length of the box we're getting when parsing. The inherent
length of the value after parsing can be smaller than the length of the box
due to removal of escape characters, and the trimming of padding.
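The three lengths can be illustrated with a small Python sketch that mirrors the example above: a UTF-8 string, left-justified, padded with zero bytes to the next multiple of 4. The function names are hypothetical, and escape sequences and truncation are ignored here since their treatment is still TBD.

```python
PAD = b"\x00"  # mirrors the zero pad byte used in the example above


def inherent_length(value):
    """Length in bytes of the value itself, before any padding."""
    return len(value.encode("utf-8"))


def output_length(value, multiple=4):
    """Length of the box to fill: inherent length rounded up to a multiple."""
    n = inherent_length(value)
    return -(-n // multiple) * multiple  # integer ceiling division


def unparse(value):
    """Left-justify the encoded value in its box, padding on the right."""
    return value.encode("utf-8").ljust(output_length(value), PAD)


def parse(box):
    """Input length is len(box); trimming padding recovers the value."""
    return box.rstrip(PAD).decode("utf-8")


box = unparse("hello")
print(inherent_length("hello"), output_length("hello"), len(box))
print(parse(box))
```

In this sketch the value written to the len field would be output_length(value), which is also the input length seen when the same data is parsed back.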
Firstly, there had been a debate about what simple types should appear in
the infoset. Various proposals below. We've decided not to go with any
of those, but to stick with existing infoset behaviour (reproduced below). It means
that the DFDL parser will do enough to convert input data to the (nearest)
schema built-in type. This gives better interop with XML Schema (eg,
ecore) based trees. There is an important implication when speculatively
parsing - the parser will use the schema built-in type to distinguish
data, but will not use user-defined restrictions.
[datatype] String. The name of the XML Schema 1.0 built-in simple
type to which the value corresponds. DFDL supports a subset of these types
listed in the specification at section 4.1.
[dataValue] The value in the value space of the [datatype] member
or special value nil.
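The speculative-parsing implication mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the branch names, the MAX_COUNT facet and the helper functions are hypothetical, and Python's int and str stand in for the schema built-in types.

```python
MAX_COUNT = 99  # hypothetical user-defined restriction (a maxInclusive facet)


def parse_choice(text, branches):
    """Try each branch in order; keep the first whose built-in type accepts the data."""
    for name, convert in branches:
        try:
            return name, convert(text)
        except ValueError:
            continue  # data cannot be converted to this type: try the next branch
    raise ValueError("no branch of the choice matched")


def validation_errors(name, value):
    """Facet checks run separately and never influence branch selection."""
    if name == "count" and value > MAX_COUNT:
        return ["count exceeds maxInclusive"]
    return []


BRANCHES = [("count", int), ("label", str)]

name, value = parse_choice("123", BRANCHES)
print(name, value, validation_errors(name, value))
print(parse_choice("abc", BRANCHES))
```

Here "123" still selects the count branch even though it violates the restriction, because only the built-in type is used to distinguish data.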
Secondly, given the above decision, we can complete action 020. On parse,
if the physical data cannot be handled by the logical type, it is a
processing error. On unparsing, data must conform to the infoset type, by
definition.
Behaviour by logical type and representation (numbers in brackets refer to
the notes below):

Logical type: Signed (decimal, integer, and user restrictions thereof)
  textNumberRepresentation=text (4)
    Parse: OK
    Unparse: OK
  textNumberRepresentation=zoned (2) (6)
    Parse: Unpunched data => +ve
    Unparse: Data always punched with sign
  binaryNumberRepresentation=packed (5)
    Parse: Unsigned nibble => +ve
    Unparse: Data signed as per +ve/-ve nibble specifiers, unsigned nibble
    specifier never used
  binaryNumberRepresentation=bcd (1)
    Parse: Data always +ve
    Unparse: -ve data is processing error
  binaryNumberRepresentation=binary
    N/A

Logical type: Signed (long, int, short, byte, and user restrictions thereof)
  textNumberRepresentation=text (4)
    Ditto
  textNumberRepresentation=zoned (2) (6)
    Ditto
  binaryNumberRepresentation=packed (5)
    Ditto
  binaryNumberRepresentation=bcd (1)
    Ditto
  binaryNumberRepresentation=binary
    Parse: Data assumed 2's complement binary
    Unparse: Data output as 2's complement binary

Logical type: Unsigned (unsigned long, unsigned int, unsigned short,
unsigned byte, and user restrictions thereof) (3)
  textNumberRepresentation=text (4)
    Parse: +ve data => OK, -ve data is processing error
    Unparse: Data output according to pattern
  textNumberRepresentation=zoned (2) (6)
    Parse: +ve punched data => OK, -ve data is processing error
    Unparse: Data never punched with sign
  binaryNumberRepresentation=packed (5)
    Parse: +ve nibble & unsigned nibble => OK, -ve nibble is processing error
    Unparse: Unsigned nibble specifier always used
  binaryNumberRepresentation=bcd (1)
    Parse: OK
    Unparse: OK
  binaryNumberRepresentation=binary
    Parse: Data assumed unsigned binary
    Unparse: Data output as unsigned binary

Notes
(1) Cannot physically carry a sign
(2) Some systems omit to punch for +ve, but accept punched on input (eg,
IBM iSeries)
(3) Assumes that on unparsing, the infoset can not present a -ve value
(4) The -ve sign is indicated by numberPattern property
(5) The exact sign nibbles are given by the packedDecimalSignCodes
property
(6) The punching style to use is given by the numberZonedSignStyle
property
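As an illustration of the binary representation entries above, the following Python sketch reads the same bytes as 2's complement for a signed logical type and as unsigned binary for an unsigned one. The ProcessingError class, the function names and the big-endian byte order are assumptions of this sketch; the guard against negative values for unsigned types follows note (3), which assumes the infoset never presents one.

```python
class ProcessingError(Exception):
    """Hypothetical stand-in for a DFDL processing error."""


def parse_binary(data, signed):
    """Signed types read 2's complement; unsigned types read unsigned binary."""
    return int.from_bytes(data, byteorder="big", signed=signed)


def unparse_binary(value, width, signed):
    """Write `value` into `width` bytes, rejecting values the type cannot hold."""
    if not signed and value < 0:
        raise ProcessingError("negative value for an unsigned logical type")
    try:
        return value.to_bytes(width, byteorder="big", signed=signed)
    except OverflowError as exc:
        raise ProcessingError("value does not fit in the field") from exc


raw = b"\xff\xfe"
print(parse_binary(raw, signed=True))   # read as xs:short
print(parse_binary(raw, signed=False))  # read as xs:unsignedShort
```

The same two bytes give -2 for the signed type and 65534 for the unsigned type, which is why the logical type alone is enough to infer the binary integer representation.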
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 08/06/2009 11:27 -----
Steve Hanson/UK/IBM
13/05/2009 13:25
To
Dave Glick
cc
dfdl-wg(a)ogf.org
Subject
Fw: Action 020 completion
Hi Dave
We should also bear the problem below in mind when thinking about DFDL
Infoset & XDM. XDM assumes that an element with a concrete type-name has
a typed-value conforming to the type-name, ie, it has been 'validated'. If
this is not the case then the type-name is set to xs:untyped or
xs:untypedAtomic (extra types added to XDM for this purpose). In DFDL
Infoset we had been assuming that the [dataType] would be set to that
implied by the DFDL xsd, regardless of whether validation succeeded or not
- though there are issues with this as explained below.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 13/05/2009 12:51 -----
Steve Hanson/UK/IBM
29/04/2009 15:54
To: Alan Powell, Dave Glick, Mike Beckerle (Work)
cc: Suman Kalia/Toronto/IBM@IBMCA, Tim Kimber/UK/IBM
Subject: Fw: Action 020 completion
Hi Dave
We discussed this on the call and agreed that the unsigned types are just
range restrictions so treating negative numbers in the data as parse
errors instead of validation errors seemed inconsistent.
Various options discussed:
1) Remove the unsigned types altogether.
- Means we'd need an extra property to describe binary integer
representation, as we could no longer infer the rep from the logical type
- Loses type information for applications where the fact that data is
unsigned is important.
- Means DFDL modelers would have to create their own duplicate
restrictions for common C etc data types.
2) Change [dataType] to point to the XML Schema primitive type instead of
the XML Schema built-in type.
- Means that the value and the type would be xs:decimal which is too
general
3) Change [dataValue] to say "The value in the value space of the
underlying XML Schema primitive type for the [datatype] member or special
value nil"
- Allows the infoset to carry integer data that is invalid due to range
regardless of value.
- Means that the value would be a decimal even though the data type was
(say) xs:unsignedLong, ie, the datatype and datavalue are no longer in
step unless validated
4) Option 2) with the modification that the primitive type for all integer
types was xs:integer and not xs:decimal.
5) Option 3) with the modification that the primitive type for all
integer types was xs:integer and not xs:decimal.
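As a rough illustration of option 5), the Python sketch below keeps the declared built-in type name alongside a value held as a plain integer, and treats the range check as a separate validation step. The class name, field names and the RANGES table are assumptions made for this sketch; nothing here is defined by the DFDL spec.

```python
from dataclasses import dataclass

# Value ranges of some XML Schema built-in unsigned integer types.
RANGES = {
    "xs:unsignedByte": (0, 2**8 - 1),
    "xs:unsignedShort": (0, 2**16 - 1),
    "xs:unsignedInt": (0, 2**32 - 1),
    "xs:unsignedLong": (0, 2**64 - 1),
}


@dataclass
class InfosetItem:
    # Hypothetical stand-in for an infoset element under option 5).
    data_type: str   # [dataType]: the built-in type named by the schema
    data_value: int  # [dataValue]: any xs:integer, not yet range-checked

    def is_valid(self) -> bool:
        # Range check deferred to validation instead of being a parse error.
        low, high = RANGES[self.data_type]
        return low <= self.data_value <= high
```

Under this reading, InfosetItem("xs:unsignedInt", -1) can be constructed, and only is_valid() reports the mismatch between the type and the value.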
We agreed not to close on this until you had reported back on your action
032.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 29/04/2009 15:32 -----
Steve Hanson/UK/IBM
27/04/2009 13:42
To: Alan Powell/UK/IBM
cc: dglick(a)dracorp.com, mbeckerle(a)oco-inc.com
Subject: Re: Action 020 completion
Thanks Alan
I've added correct property names, see below. But I've omitted floats
deliberately for clarity: the logical type is always signed and the
physical type is always signed, so there's no issue.
However, there is a problem with what I have stated, as you pointed out on
Friday. On parsing I am effectively validating the input data, but on
unparsing I am not assuming the data has been validated. This is not
consistent and needs correcting.
But as I looked into this, I realised we have a problem with how we have
described the DFDL infoset. The spec says "There is no requirement for
DFDL-described data to be valid in order to have a DFDL information set.",
which is in accordance with our agreed position on validation being
optional. But further on it also says:
[datatype] String. The name of the XML Schema 1.0 built-in simple
type to which the value corresponds. DFDL supports a subset of these types
listed in the specification at section 4.1.
[dataValue] The value in the value space of the [datatype] member
or special value nil.
This says to me that the DFDL parser must have done enough validation to
ascertain that the value matched the underlying built-in type. For
example, I have a user-defined simple type that adds a max/min range of
+100-+200 to an xs:unsignedInt. If the input data has value 99, the value
will be accepted into the infoset, but will not validate if validation is
switched on. If the input data is a packed decimal with value -1, the
value will not be accepted into the infoset. Given that xs:unsignedInt is
itself just a range restriction of xs:integer (via xs:nonNegativeInteger),
this seems a bit arbitrary.
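The two cases in this example can be sketched in Python as below. The function names, the exception classes and the default facet values are assumptions made for illustration only; they model the spec wording as read above, not any actual DFDL processor.

```python
UNSIGNED_INT_MAX = 2**32 - 1  # upper bound of the xs:unsignedInt value space


class ProcessingError(Exception):
    """Raised when a value cannot be placed in the infoset at all."""


class ValidationError(Exception):
    """Raised only when optional validation is switched on."""


def to_infoset_value(value: int) -> int:
    # Built-in type check: effectively done by the parser under the
    # current wording of [dataValue].
    if not 0 <= value <= UNSIGNED_INT_MAX:
        raise ProcessingError(f"{value} is outside xs:unsignedInt")
    return value


def validate(value: int, min_inclusive: int = 100, max_inclusive: int = 200) -> int:
    # User-defined facets: checked only if validation is enabled.
    if not min_inclusive <= value <= max_inclusive:
        raise ValidationError(f"{value} is outside {min_inclusive}..{max_inclusive}")
    return value
```

Here 99 passes to_infoset_value() but fails validate(), while -1 never reaches the infoset, even though both checks are range restrictions on xs:integer.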
Dave - given your action item looking at DFDL Infoset versus XDM, I'd be
interested in your opinion here.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
Alan Powell/UK/IBM
24/04/2009 15:13
To: Steve Hanson/UK/IBM@IBMGB
cc: dglick(a)dracorp.com, mbeckerle(a)oco-inc.com
Subject: Re: Action 020 completion
Steve
Looks OK
But can you use the correct property name eg binaryNumberRepresentation
and for completeness add binaryFloatRepresentation (even though it may be
obvious)
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
From: Steve Hanson/UK/IBM
To: mbeckerle(a)oco-inc.com, Alan Powell/UK/IBM, dglick(a)dracorp.com
Date: 23/04/2009 12:50
Subject: Action 020 completion
Here's my proposal for the behaviour when a logical type is signed and the
physical data has no sign (either because it is not capable of carrying a
sign, or it carries an unsigned indicator), and when the logical type is
unsigned and the physical data has a sign. The principle is to be
flexible and only to give errors when things are clearly mismatched.
Logical type: Signed (decimal, integer)
  textNumberRepresentation=text (4)
    Parse: OK
    Unparse: OK
  textNumberRepresentation=zoned (2) (6)
    Parse: Unsigned data => +ve
    Unparse: Data always punched with sign
  binaryNumberRepresentation=packed (5)
    Parse: Unsigned data => +ve
    Unparse: Data signed as per +ve/-ve nibble specifiers, unsigned nibble
    specifier never used
  binaryNumberRepresentation=bcd (1)
    Parse: Data always +ve
    Unparse: -ve data is error
  binaryNumberRepresentation=binary
    N/A
Logical type: Signed (long, int, short, byte)
  textNumberRepresentation=text (4)
    Ditto
  textNumberRepresentation=zoned (2) (6)
    Ditto
  binaryNumberRepresentation=packed (5)
    Ditto
  binaryNumberRepresentation=bcd (1)
    Ditto
  binaryNumberRepresentation=binary
    Parse: Data assumed 2's complement binary
    Unparse: Data output as 2's complement binary
Logical type: Unsigned (unsigned long, unsigned int, unsigned short, unsigned byte) (3)
  textNumberRepresentation=text (4)
    Parse: -ve data is error
    Unparse: -ve data is error
  textNumberRepresentation=zoned (2) (6)
    Parse: +ve data => OK, -ve data is error
    Unparse: Sign never punched, -ve data is error
  binaryNumberRepresentation=packed (5)
    Parse: +ve data => OK, -ve data is error
    Unparse: Unsigned nibble specifier always used, -ve data is error
  binaryNumberRepresentation=bcd (1)
    Parse: OK
    Unparse: -ve data is error
  binaryNumberRepresentation=binary
    Parse: Data assumed unsigned binary
    Unparse: Data output as unsigned binary
(1) Can not physically carry a sign
(2) Some systems output unsigned for +ve, but accept +ve on input (eg, IBM
iSeries)
(3) Assumes that on unparsing, the infoset could still present a -ve value
(4) The -ve sign is indicated by numberPattern property
(5) The exact sign nibbles are given by the packedDecimalSignCodes
property
(6) The punching style to use is given by the numberZonedSignStyle
property
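For the binaryNumberRepresentation=binary entries, the proposed behaviour can be sketched in Python as below. The helper names and the big-endian byte order are assumptions made for illustration; the proposal itself does not fix a byte order.

```python
def parse_binary(data: bytes, signed: bool) -> int:
    # Signed logical types: data assumed two's complement binary.
    # Unsigned logical types: data assumed unsigned binary.
    return int.from_bytes(data, byteorder="big", signed=signed)


def unparse_binary(value: int, length: int, signed: bool) -> bytes:
    # int.to_bytes raises OverflowError when the value does not fit,
    # which includes a negative value for an unsigned logical type.
    return value.to_bytes(length, byteorder="big", signed=signed)
```

The same two bytes b"\xff\xff" parse as -1 for a signed type and as 65535 for an unsigned type, and unparse_binary(-1, 2, signed=False) raises OverflowError, in line with "-ve data is error".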
Mail back any comments before next week's call.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 23/04/2009 12:04 -----
Steve Hanson/UK/IBM
18/02/2009 16:55
To: Mike Beckerle (Work)
cc: Alan Powell/UK/IBM, dglick(a)dracorp.com
Subject: DFDL: Packed & zoned decimals - more thoughts (was Action 020)
Hi Mike
While we are on the subject of how to handle signs, the spec does not
fully define what happens for a number if the logical type is unsigned. We
need to say what is expected in the physical data and what happens if the
data contains a sign. For example, we say that for an unsigned integer, if
the rep is binary then we treat the data as 'unsigned binary' and not twos
complement. And we say that BCD is only allowed for unsigned logical
types. That is good. But we don't do the same for packed, text, zoned. I
think we need to say that no explicit sign is expected in the data (eg,
packed should have only F or 0, no A,B,C,D) and if it does:
Alternatives:
i) Error
ii) Positive sign discarded, negative sign gives error
iii) Sign discarded
iv) As per i) if 'strict' set, as per ii) if 'lax' set
v) As per i) if 'strict' set, as per iii) if 'lax' set
Personally I vote for i)
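The five alternatives can be written down as a small decision function. This is only a sketch in Python: the function name, the '+'/'-' encoding of the sign found in the data, and the 'discard'/'error' return values are assumptions made for illustration.

```python
def resolve_sign(sign: str, alternative: str, check_policy: str = "strict") -> str:
    """Decide what to do with an explicit sign found in data for an unsigned type.

    sign: '+' or '-' (an explicit sign was present in the data).
    alternative: one of 'i', 'ii', 'iii', 'iv', 'v' from the list above.
    Returns 'discard' or 'error'.
    """
    # Alternatives iv) and v) delegate to i), ii) or iii) based on the policy.
    if alternative == "iv":
        alternative = "i" if check_policy == "strict" else "ii"
    elif alternative == "v":
        alternative = "i" if check_policy == "strict" else "iii"

    if alternative == "i":
        return "error"
    if alternative == "ii":
        return "discard" if sign == "+" else "error"
    if alternative == "iii":
        return "discard"
    raise ValueError(f"unknown alternative: {alternative!r}")
```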
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 18/02/2009 16:41 -----
Steve Hanson/UK/IBM
12/02/2009 13:18
To: <mbeckerle.dfdl(a)gmail.com>
cc: dfdl-wg(a)ogf.org
Subject: RE: [DFDL-WG] Fw: DFDL OGF WG call - Action 020
Hi Mike
I think it's a simplification too far. Many people especially those with a
mainframe or COBOL background know what a zoned decimal is. The wikipedia
entry for binary coded decimal explicitly covers the BCD, packed & zoned
'variants'. MRM and WTX both explicitly support zoned too. And it's easier
to say that the 'decimalSignStyle' property applies to zoned decimals than
to say it applies to any patterns that happen to have a P in them. On
balance I would keep zoned as a representation.
So we need to decide whether zoned is only allowed for a signed decimal.
There's no harm in allowing it for unsigned, just some redundancy, and it
makes validation of the pattern against the rep easier (if something is
zoned it can only have a subset of pattern chars).
Btw we don't need leading overpunched sign, only trailing - see my case
for this below.
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
"Mike Beckerle" <mbeckerle.dfdl(a)gmail.com>
12/02/2009 12:51
Please respond to: <mbeckerle.dfdl(a)gmail.com>
To: Steve Hanson/UK/IBM@IBMGB, <dfdl-wg(a)ogf.org>
cc:
Subject: RE: [DFDL-WG] Fw: DFDL OGF WG call - Action 020
This does suggest another simplification.
Zoned is so close to text....Suppose we scrap the concept of "zoned"
altogether, and just add a character to our number pattern language to
allow one to specify an overpunched sign digit. E.g.,
"+00000" is text
"P0000" same with overpunched leading sign.
"00000+" text
"0000P" same with overpunched trailing sign
The decimal point would normally be implied in these, (I still like
having a cobol-style "V" to position this instead of separate properties
stating the position - one of the few good features about cobol is the
number patterns. I still think we could quite easily pre-process the "V"
out of these strings and then hand the rest through to an ICU library as
an implementation - however the "P" probably does need to be a change in
that library.)
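A minimal Python sketch of the "V" pre-processing idea follows. The function names are hypothetical, and counting only '0' and '#' as digit positions is an assumption of the sketch; the resulting pattern would then be handed to whatever number-format library is in use.

```python
from decimal import Decimal


def split_virtual_point(pattern: str):
    """Remove a single COBOL-style 'V' and report the implied decimal scale.

    Returns (pattern_without_v, digits_after_v).
    """
    if pattern.count("V") > 1:
        raise ValueError("at most one 'V' is allowed")
    head, sep, tail = pattern.partition("V")
    if not sep:
        return pattern, 0
    # Count digit positions to the right of the virtual point.
    scale = sum(1 for ch in tail if ch in "0#")
    return head + tail, scale


def apply_scale(digits: str, scale: int) -> Decimal:
    # Interpret a plain digit string with an implied decimal point.
    return Decimal(int(digits)).scaleb(-scale)
```

For example, split_virtual_point("000V00") gives ("00000", 2), and apply_scale("12345", 2) gives Decimal("123.45").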
Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2100 | 504 Totten Pond Road, Waltham MA 02451 |
mbeckerle.dfdl(a)gmail.com
From: dfdl-wg-bounces(a)ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf
Of Steve Hanson
Sent: Wednesday, February 11, 2009 1:34 PM
To: dfdl-wg(a)ogf.org
Subject: [DFDL-WG] Fw: DFDL OGF WG call - Action 020
It was noted on the call this week that there is an alternative to my
zoned decimal overpunching proposal i) below.
I said:
- If it is an unsigned type then DFDL expects the rightmost byte to have a
zone nibble when parsing, and outputs a zone nibble when unparsing.
- If it is a signed type then DFDL expects it to have a sign nibble when
parsing, and outputs a sign nibble when unparsing.
But my unsigned type behaviour could be achieved by specifying a rep of
text instead of zoned. If that is the case, the alternative is to only
allow zoned rep for signed decimal logical types.
Thoughts?
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 05/02/2009 10:17 -----
Steve Hanson/UK/IBM
28/01/2009 13:54
To: DFDL Working Group
cc:
Subject: DFDL OGF WG call - Action 020
Action 020:
020  SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
     22/10: No progress
     10/12: added how to decide to overpunch and sign position
a) Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
Add new property to section 15.4 Properties Specific to Number with Binary
representation.
binaryNumberCheckPolicy
Enum
Values are “strict” and “lax”.
Indicates how lenient to be when parsing binary numbers.
If “lax” then the parser tolerates all valid alternatives where such
alternatives exist. Specifically, for binaryNumberRepresentation =
'packed' the sign nibble for positive, negative, unsigned and zero is
allowed to be any of the valid respective values.
On unparsing, the specified value is always used.
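The strict/lax distinction for packed data might look like the Python sketch below. The function name and the default nibble values (C, D, F) are assumptions, as are the sets of "valid" positive and negative sign nibbles, which follow common IBM usage. The separate sign code for zero is left out to keep the sketch short.

```python
# Sign nibbles commonly treated as valid in IBM packed decimal data.
# These sets are an assumption of this sketch.
LAX_POSITIVE = {0xA, 0xC, 0xE, 0xF}
LAX_NEGATIVE = {0xB, 0xD}


def parse_packed(data: bytes, positive: int = 0xC, negative: int = 0xD,
                 unsigned: int = 0xF, policy: str = "strict") -> int:
    """Decode packed decimal: two nibbles per byte, sign in the last nibble."""
    if not data:
        raise ValueError("empty packed decimal")
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if any(n > 9 for n in nibbles):
        raise ValueError("non-decimal digit nibble")
    magnitude = int("".join(str(n) for n in nibbles))

    if policy == "lax":
        # Any valid alternative is tolerated.
        is_pos, is_neg = sign in LAX_POSITIVE, sign in LAX_NEGATIVE
    else:
        # Only the specified nibbles are accepted.
        is_pos, is_neg = sign in (positive, unsigned), sign == negative
    if is_neg:
        return -magnitude
    if is_pos:
        return magnitude
    raise ValueError(f"unexpected sign nibble 0x{sign:X}")
```

With the defaults, the bytes 0x12 0x3A are rejected under 'strict' but parse as 123 under 'lax'.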
Also suggest changing some of the other property names in 15.4:
"decimalVirtualPoint" -> "binaryDecimalVirtualPoint"
"packedDecimalSignCodes" -> "binaryPackedSignCodes"
And changing binaryNumberRepresentation enumeration:
"BCD" -> "bcd"
b) Zoned decimals: How to decide to overpunch and sign position
Spec assumes that overpunching of the rightmost character always takes
place. IBM architecture allows no overpunching (ie, Fx instead of Cx/Dx) -
this is supported by IBM MRM & WTX parsers. Additionally IBM MRM parser
allows separate sign byte, and sign byte on left. Let's deal with these
separately:
i) No overpunching.
The IBM architecture allows the rightmost byte to have a zone (Fx) or a
sign (Cx/Dx) as the left nibble. I don't see why we can't base what to
expect when parsing, and output when unparsing, on the logical xsd type.
- If it is an unsigned type then DFDL expects the rightmost byte to have a
zone nibble when parsing, and outputs a zone nibble when unparsing.
- If it is a signed type then DFDL expects it to have a sign nibble when
parsing, and outputs a sign nibble when unparsing.
For analogy with DFDL packed decimals, it seems at first glance that we
should also extend the numberCheckPolicy 'lax' setting to treat a zone
nibble as a +ve sign nibble for a signed type. However, IBM iSeries always
outputs Fx to mean +ve but accepts both Fx & Cx on input. It is perhaps
better therefore that DFDL always tolerates Fx when parsing a signed zoned
decimal, otherwise iSeries users would always have to set
numberCheckPolicy to 'lax', which might have other implications in the
future.
ii) Separate sign byte.
I don't believe the IBM architecture allows this. I don't think DFDL needs
to support it. MRM has this, but I think it's because early on MRM did not
explicitly support text decimals as such, just COBOL variations, and it
was easier just to call them all zoned.
iii) Sign byte on left.
I don't believe the IBM architecture allows this. I don't think DFDL needs
to support it. MRM has this, but for the same reason as ii)
Conclusion: No new DFDL properties needed, but words need adding to
explain zoned parse/unparse behaviour better.
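The parse side of b) i) can be sketched in Python as below, for EBCDIC zoned data with the sign carried in the rightmost byte. The function name is hypothetical, and raising an error for a sign nibble on an unsigned type is one reading of "expects a zone nibble", not settled behaviour.

```python
def parse_zoned_ebcdic(data: bytes, signed: bool) -> int:
    """Decode an EBCDIC zoned decimal with an optional trailing overpunch."""
    if not data:
        raise ValueError("empty zoned decimal")
    digits = []
    for i, byte in enumerate(data):
        zone, digit = byte >> 4, byte & 0x0F
        if digit > 9:
            raise ValueError("non-decimal digit nibble")
        is_last = i == len(data) - 1
        if not is_last and zone != 0xF:
            raise ValueError("only the rightmost byte may carry a sign")
        digits.append(str(digit))
    magnitude = int("".join(digits))

    last_zone = data[-1] >> 4
    if last_zone == 0xF:
        # Zone nibble (no overpunch): always tolerated and read as +ve.
        return magnitude
    if not signed:
        raise ValueError("sign nibble found for an unsigned type")
    if last_zone == 0xC:
        return magnitude
    if last_zone == 0xD:
        return -magnitude
    raise ValueError(f"unexpected zone nibble 0x{last_zone:X}")
```

The bytes F1 F2 C3 and F1 F2 F3 both parse as 123 for a signed type, which is the iSeries case described above, while F1 F2 D3 parses as -123.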
Also suggest changing property names:
"zonedDecimalSignStyle" -> "numberZonedSignStyle"
"zeroNumberRep" -> "numberZeroRep"
Should also make clear that any explicit negative pattern in numberPattern
will be ignored if the xsd type is unsigned. (We could make this an error
but it precludes creation of a textNumberFormat that works with both
signed and unsigned types, plus pattern "##0.0" implicitly is equivalent
to "##0.0;-##0.0" ).
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces(a)ogf.org
23/01/2009 13:36
To: dfdl-wg(a)ogf.org
cc:
Subject: [DFDL-WG] DFDL: Minutes from OGF WG call, 21 January 2009
Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
14:00 GMT, 21 January 2009
Attendees
Alan Powell (IBM)
Mike Beckerle(Oco)
Apologies
Steve Hanson (IBM)
1. XSD 1.1
Deferred to next call
2. Calendar formats
Discussed updated (v4) supplement emailed by AP
Agreed millisec/secSinceEpoc cannot be implied by length of logical data
so need separate enumerations. Observed that these options were really
combination of 3 properties binary, length and sec/millisec. Suggested
renaming to binarySeconds and binaryMilliseconds
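A short Python sketch of why the unit cannot be implied by length: the same four bytes give different instants under the two suggested enumerations. The function name, the 1970-01-01 UTC epoch, the big-endian byte order and the signed interpretation are all assumptions of the sketch; the minutes do not state them.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the epoch is 1970-01-01T00:00:00Z; the minutes do not say.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_binary_calendar(data: bytes, representation: str) -> datetime:
    # The count is read the same way in both cases; only the unit differs.
    count = int.from_bytes(data, byteorder="big", signed=True)
    if representation == "binarySeconds":
        return EPOCH + timedelta(seconds=count)
    if representation == "binaryMilliseconds":
        return EPOCH + timedelta(milliseconds=count)
    raise ValueError(f"unknown representation: {representation!r}")
```

The four bytes encoding 86400 are one day after the epoch as binarySeconds, but only 86.4 seconds after it as binaryMilliseconds.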
Packed calendars: decided that need to be able to specify at least the
packedDecimalSignCodes property rather than assuming a default so
reference will be added to calendar description
Locale needs to be specified for numberformats and calendarFormats
(didn't identify any other areas) as it modifies the behaviour of ICU.
Decided to add locale to numberFormat and CalendarFormat
3. Escape Schemes
Agreed need for multiple escape delimiter pairs but not nested.
Need an escape for escape character even though in most cases this will be
the same character, eg /n //, There are some formats that have a different
escape, eg /n &/. Only need single escape characters and one level of
escape characters.
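One level of single-character escaping, with a separate escape for the escape character, can be sketched in Python as follows. The function and parameter names are hypothetical, and the ';' delimiter in the usage note is an arbitrary choice for illustration.

```python
def unescape(text: str, escape: str, escape_escape: str) -> str:
    """Remove one level of single-character escaping."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else None
        if ch == escape_escape and nxt == escape:
            out.append(escape)   # escaped escape character: keep it literally
            i += 2
        elif ch == escape and nxt is not None:
            out.append(nxt)      # escaped character: drop the escape
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With escape "/" and escape-escape "/", the text "a//b" becomes "a/b" and "a/;b" becomes "a;b". With escape "/" and escape-escape "&", the text "a&/b" becomes "a/b".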
Discussed how to deal with comments of the form /* comment */ where
the escape delimiters are also the initiator and terminator of the field.
Semantic needed is 'only look for field terminator not any parent
terminator or any other syntax elements'. May fall out naturally from the
speculative parsing rules. Need further discussion.
4. AOB
Next call 28 January 14:00
Meeting closed, 15:00 GMT
Actions raised at this meeting
No   Action
031
Current Actions:
No   Action
012  AP/SH: Update decimalCalendarScheme
     10/9: Not allocated yet
     17/9: No update
     24/9: Add calendar binary formats to actions
     22/10: No progress
     16/1: proposal distributed and discussed. Will be redistributed
     21/1: add locale,
020  SH: Resolve packedDecimalSignCodes behaviour depends on NumberCheckPolicy
     22/10: No progress
     10/12: added how to decide to overpunch and sign position
023  MB: Review Schema 1.1
024  String XML type
025  Escape schemes
     21/1: discussed requirements
026  SH: Envelopes and Payloads
027  Property precedence tables
028  Variable markup
029  valueCalc (output length calculation)
030  AP: confirm with WTX that can drop duration
     21/6: WTX confirm that they do not have a duration type so do not need it
     in dfdl. Will drop from spec. Closed
Closed actions:
030  AP: confirm with WTX that can drop duration
     21/6: WTX confirm that they do not have a duration type so do not need it
     in dfdl. Will drop from spec. Closed
034 Work items:
No   Item
001  String XML type (Ian P) - Apr 30, 2008
002  Escape schemes (Ian P) - Apr 30, 2008
003  Variables - ??, 2008 (Mike)
005  Improvements on property descriptions - ??, 2008 (All - split TBD)
006  Envelopes and Payloads (Steve) - Apr 30, 2008
007  (from draft 32) valueCalc (Mike) - ??, 2008
     mostly complete
008  (from draft 32) Property precedence for writing (Steve) -
     under review
009  (from draft 32) Variable markup (Steve) - Mar 31, 2008
     proposal needs writing up
010  (from draft 32) Assertions, discriminators and choice, including
     discussion of timing option (Suman) - Mar 31, 2008 * in progress *
011  (from draft 32) How speculative parsing works (combining choice and
     variable-occurrence - currently these are separate) ??, 2008 (IBM)
     in progress
012  (from draft 32) Reordering the properties discussion: move representation
     earlier, improve flow of topics ??, 2008 (Alan) * not started *
025  Augmented infoset and unparsing (Alan)
     added but needs work
     complete - specification updated
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg(a)ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg
Attached is an updated proposal which incorporates Steve's comments.
The main issue to resolve: given that a variable can only be set once,
should a setVariable for the same variable on an element or element
reference override the setVariable on a simpleType of element declaration,
or is it an error to have a setVariable for the same variable?
Other points of interest
Non-format annotations cannot be put in scope on a complex type.
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
From: Steve Hanson/UK/IBM
To: Alan Powell/UK/IBM@IBMGB
Cc: dfdl-wg(a)ogf.org
Date: 03/06/2009 10:20
Subject: Re: [DFDL-WG] Non-format annotation scoping rules.
Alan - looks sensible - a couple of comments in the updated doc below:
[attachment "ogf-dfdl-annotation-scoping-v1.doc" deleted by Alan
Powell/UK/IBM]
Regards
Steve Hanson
Programming Model Architect
WebSphere Message Brokers
Hursley, UK
Internet: smh(a)uk.ibm.com
Phone (+44)/(0) 1962-815848
Alan Powell/UK/IBM@IBMGB
Sent by: dfdl-wg-bounces(a)ogf.org
29/05/2009 15:57
To: dfdl-wg(a)ogf.org
cc:
Subject: [DFDL-WG] Non-format annotation scoping rules.
Attached are the proposed rules for non-format annotation scoping.
As part of the exercise I had to clarify which annotations are permitted
on each schema object so please review that table.
Of particular interest are
1. Annotations put in scope on a xs:complexType
2. Assert/Discriminator on xs:sequence, xs:choice and xs:any
3. Hidden on an empty sequence only
4. DefineVariable on schema and sequence only to define the scope of
the variable.
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell(a)uk.ibm.com
Tel: +44 (0)1962 815073 Fax: +44 (0)1962 816898
[attachment "ogf-dfdl-annotation-scoping-v1.doc" deleted by Steve
Hanson/UK/IBM]