Thanks Alan and Steve for some agenda topics:
Follow up the items from last week
- Specification drafts - I need updates from everyone to produce next spec draft
- Expression language - Comments from only Steve H. so far
- Property precedence - Any more comments/discussion
- UML for DFDL schema - status update
- Entity proposal updates?
Discussion for this call
- White space
- Steve’s items (??)
- OGF presentation
Other Topics?
From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Tuesday, January 29, 2008 1:10 PM
To: Mike Beckerle
Cc: Alan Powell
Subject: Agenda for DFDL WG call
Hi Mike - possible agenda items for tomorrow.
Regards, Steve
Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 29/01/2008 17:55 -----
Alan Powell/UK/IBM 29/01/2008 17:25
Steve
I will try to make the WG call tomorrow but may still be on a course.
We need to follow up the items from last week
- Specification drafts - I need updates from everyone to produce next spec draft
- Expression language - Comments from only you so far
- Property precedence - Any more comments/discussion
- UML for DFDL schema - status update
- Entity proposal - I should have updated it as a result of last week's discussion but haven't had time
Discussion for this call
- White space
- Your items. We did discuss them a bit but mostly in the context of white space.
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898
----- Forwarded by Alan Powell/UK/IBM on 29/01/2008 17:19 -----
Steve Hanson/UK/IBM@IBMGB 28/01/2008 08:51
Sorry I couldn't make the call. Some comments:
a) we need both WSP and OWSP if DFDL delimiter properties can only specify a
single value. If they can specify a list of values then you can get away with
only needing WSP
eg, dfdl:terminator="@ @%WSP;"
b) if we make WSP mean a single white space character, we need a second entity
for multiple white space characters.
It doesn't look like you got round to discussing the other items I sent in (below)?
Let's do that next call.
1) One way to handle the situation where the terminator can vary is to allow
the DFDL markup properties (dfdl:terminator, dfdl:separator, etc) to be lists,
just like we already do for dfdl:nullValues. (IBM's WTX has this capability).
2) We've allowed the prefix of a prefixed length to be explicitly described as
a non-event field using dfdl:lengthPrefixType. Should we permit this for markup
properties? Instead of supplying a list of possible values, you supply a
simple type with enums for the values. This could be viewed as an
alternative/complementary to 1). There is a limitation - because we are using
XSDL enumeration facet, we are constrained by its syntax so I don't see how we
could use our own entity scheme or expressions. Also, I suspect that enums are
inherently unordered so we'd need a way of saying which to use on output (use
an element of simple type and use XSDL default attribute?). Lastly, we
should not force a user to model an initiator as an element/type - most users
just see it as a piece of text so just entering the value must still be
allowed.
3) Let's say my delimiter is dynamically defined at the start of the data, like
EDI allows. We would handle that in DFDL using an expression or variable.
However, EDI also allows random white space to appear after the delimiter. Can
our expression/entity syntaxes handle this? Does this preclude use of 1)
or 2)?
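To make item 1) concrete, here is a minimal sketch of how a parser might scan for a terminator when the property holds a list of candidate values. The function name find_terminator and the tie-breaking rule (earliest position wins, then the longest candidate at that position) are assumptions for illustration only; nothing in the discussion so far fixes either.

```python
def find_terminator(data, terminators, start=0):
    """Return (position, matched_terminator) for the first terminator
    found in data at or after start, or None if none occurs.

    Assumption: when two candidates match at the same position, the
    longer one is preferred.
    """
    best = None
    for term in terminators:
        pos = data.find(term, start)
        if pos == -1:
            continue
        # Earlier position sorts first; at equal position, longer sorts first.
        key = (pos, -len(term))
        if best is None or key < best[0]:
            best = (key, term)
    if best is None:
        return None
    return best[0][0], best[1]
```

For example, find_terminator("abc;\r\ndef", ["\n", ";\r\n", ";"]) picks the three-character candidate at position 3 rather than the bare ";" at the same position.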
Regards, Steve
Steve Hanson
WebSphere Message Brokers
Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848
Ian W Parkinson/UK/IBM@IBMGB 24/01/2008 16:19
A small correction, with thanks to Simon - it was Steve (rather than Simon) who
had previously attracted a reasonable audience at the OGF conference.
Ian
Open Grid Forum: Data Format Description Language Working Group
Weekly Working Group Conference Call
17:00 GMT, 23 Jan 2008
Attendees
Mike Beckerle (Oco)
Simon Parker (PolarLake)
Ian Parkinson (IBM)
Alan Powell (IBM)
Apologies
Steve Hanson (IBM), Suman Kalia (IBM)
1. OGF22
The DFDL session at OGF22 is now booked for the Monday afternoon, and Mike has
registered to attend. Mike will present our updated status, and Alan promised
to upload the last set of presented slides to GridForge so that Mike can update
them. Alan asked whether we should attempt to drum up interest in the DFDL
session to encourage attendance; Simon thought that advertising may not make
much difference and that Steve had a reasonable audience when he presented.
2. Specification drafts
Steve and Alan had previously assigned ownership of individual items from
Mike's plan of contents for the next few drafts. Alan will assemble the next
draft, due at the end of the month, and asked for input as soon as possible.
Looking at the plan for the next, "vX+1", draft, the group reported
the following status:
The plan calls for subsequent versions of the specification, including the
following items with status:
3. UML diagrams
Simon is revising the UML diagrams which describe the DFDL schema components.
The previous meeting minutes included a number of comments on these diagrams,
and the group took this opportunity to look at some of those comments:
"...I think it would be better to use the open source XML schema model as
source model and show relationship of DFDL Annotations attached to the XSD
schema model"
- Mike noted that DFDL makes use of annotations on objects which are absent
from the XSD schema model, and hence that it may be unnatural to base the DFDL
schema model directly on the XSD model. Simon suggested that it would be
cleanest to describe a modified version of the XSD model including those XSD
elements that we need to annotate, and use this as a basis for the DFDL model.
"The current diagram suggests that 'variable definition' can both be part
of a format base or as a standalone annotation (outside of a format). Is this
true?"
- Mike suggested that variable definitions don't have to be part of a format
block: so, yes, this is true.
Mike agreed to respond further to the set of comments by email.
4. Review of Entities proposal
Alan has distributed a proposal covering entities in DFDL, intended to allow
characters which are disallowed by XML1.0 (or XML1.1) to be included in DFDL
schemas. These follow a similar syntax to XML, using % instead of & as an
escape, with an additional mechanism for specifying raw data. This latter is
intended to supplant the escaping mechanism described in current versions of
the specification (which also uses % as an escape).
The group felt that the description of the raw data entities should not be cast
in terms of characters and character sets, but rather in terms of bytes. If
treated as characters, schemas may need to be rewritten when moving from
single-byte to double-byte character sets; further, this incorrectly implies
some codepage conversion is involved.
The proposal also introduces a list of predefined names for certain common
control characters. Mike asked whether these are the existing XML names - Alan
replied that XML does not define names for control characters.
Ian asked how we should represent the literal % character in strings given this
form of escaping. The present draft of the specification uses "%%" to
handle this; Simon suggested a string like "%pc;". The meeting felt
that %% might be marginally preferable.
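As a rough illustration of the escaping rule discussed above, the sketch below expands %-entities in a string and treats "%%" as a literal percent sign. The entity names in NAMED and the numeric forms %#NN; and %#xHH; are assumptions modelled on XML character references; the proposal's actual list and syntax are not reproduced here.

```python
import re

# Hypothetical named entities, for illustration only.
NAMED = {"NL": "\n", "TAB": "\t", "SP": " "}

# "%%" is tried first so that it always yields a literal percent sign.
_TOKEN = re.compile(r"%%|%#x([0-9A-Fa-f]+);|%#([0-9]+);|%([A-Za-z]+);|%")

def expand(text):
    """Replace %-entities in text; a doubled %% becomes a single %."""
    def repl(match):
        if match.group(0) == "%%":
            return "%"
        if match.group(1) is not None:      # hexadecimal form, e.g. %#x0A;
            return chr(int(match.group(1), 16))
        if match.group(2) is not None:      # decimal form, e.g. %#10;
            return chr(int(match.group(2)))
        if match.group(3) is not None:      # named form, e.g. %NL;
            name = match.group(3)
            if name not in NAMED:
                raise ValueError("unknown entity: " + name)
            return NAMED[name]
        raise ValueError("stray % (write %% for a literal percent sign)")
    return _TOKEN.sub(repl, text)
```

With this scheme "100%%" expands to "100%", and "%%NL;" yields the five characters "%NL;" rather than a newline, which is the property the group wanted from the "%%" form.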
Finally, the proposal defines some labels which aim to reduce the complexity of
dealing with whitespace and newlines. The %NL; entity represents a newline on
"the target platform" - Mike observed that DFDL presently does not
have a concept of a target platform. Alan felt it important that a single DFDL
schema be able to generate output documents targeted at different platforms.
Mike proposed that we introduce a new property, "generatedNewLine",
which describes the meaning of %NL; during unparse, and that %NL; should be
tolerant of any common new line representation during parse. The group
discussed whether this could instead be handled using a list of optional new
line values, however this would not support schema portability. Simon suggested
we introduce another new property to mean that %NL; should be the conventional
new line representation on the platform on which an engine is running, however
Mike pointed out that this simply requires appropriate configuration of the
generatedNewLine property.
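The asymmetry Mike proposed (tolerant on parse, explicit on unparse) could be sketched as follows. The set of newline forms accepted on parse and the function names are assumptions; only the property name generatedNewLine comes from the discussion.

```python
import re

# Newline forms accepted when parsing %NL;. This particular set is an
# assumption; CRLF is listed first so it is consumed as one newline.
_ANY_NEWLINE = re.compile(r"\r\n|\n|\r")

def split_records(data):
    """Parse side: split on any common newline representation."""
    return _ANY_NEWLINE.split(data)

def join_records(records, generated_new_line="\n"):
    """Unparse side: emit exactly the configured generatedNewLine value."""
    return generated_new_line.join(records)
```

A schema written this way would read "a\r\nb\nc" as three records regardless of origin, and would write CRLF only when generated_new_line is set to "\r\n", which is how the platform-specific case reduces to configuration.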
%WSP; and %OWSP; are introduced to mean any whitespace, and optional
whitespace. This will be useful in describing some formats which allow
arbitrary whitespace, such as MIME. Mike pointed out that we could model such
whitespace using hidden fields, but that these entities may make a schema
clearer. PolarLake have found that only one such label is necessary, which
means, "one or more whitespace characters", and that this needs only
to be made available as a delimiter - Mike agreed that this label may represent
a special type of delimiter rather than a general purpose entity. Alan would
like to work through the potential use cases to see if we can restrict it in
this fashion, and will update the proposal to specify that these relate to just
one character. Simon suggested we could introduce an extra label, perhaps
%WSP*; to match multiple whitespace characters.
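One way to picture the three whitespace labels when used as delimiters is to translate a delimiter string into a regular expression. This is a sketch under stated assumptions: whitespace is taken to be space, tab, CR and LF; %WSP; is exactly one such character, %OWSP; is zero or one, and the suggested multi-character label (written here as %WSP*;) is read as one or more. None of these readings has been agreed.

```python
import re

_WS = r"[ \t\r\n]"  # assumed definition of a whitespace character

ENTITY_PATTERNS = {
    "%WSP;": _WS,         # exactly one whitespace character
    "%OWSP;": _WS + "?",  # optional: zero or one
    "%WSP*;": _WS + "+",  # assumed meaning: one or more
}

# The capturing group makes re.split keep the entities in its result.
_SPLIT = re.compile(r"(%WSP\*;|%OWSP;|%WSP;)")

def delimiter_regex(delimiter):
    """Compile a delimiter string containing whitespace entities."""
    pieces = []
    for part in _SPLIT.split(delimiter):
        if part in ENTITY_PATTERNS:
            pieces.append(ENTITY_PATTERNS[part])
        else:
            pieces.append(re.escape(part))  # literal text is matched verbatim
    return re.compile("".join(pieces))
```

Under these assumptions delimiter_regex("@%WSP;") matches "@ " but not a bare "@", while delimiter_regex("@%OWSP;") matches both.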
Meeting closed, 18:15
Ian Parkinson
WebSphere ESB Development
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg