Mike,
The augmented infoset should include hidden elements. Although these almost
certainly have an outputValueCalc, including hidden elements would let us use
the same concept on parsing to describe the infoset used by the expression
language.
When unparsing, an element declaration and the infoset are considered as
follows:
0) If the element declaration has a dfdl:inputValueCalc property then the
infoset value is ignored and nothing is output.
a) If the element declaration has a dfdl:outputValueCalc property then the
expression which is the dfdl:outputValueCalc property value is evaluated and
the resulting value becomes the value of the element item in the augmented
infoset. Any pre-existing value for the infoset item is superseded by this
new value. References to other augmented infoset items from within the
outputValueCalc expression must obtain their values from the augmented
infoset directly (when the value is already present) or by recursively using
these methods (a) and (b) as needed.
b) If the element declaration has no corresponding value in the augmented
infoset, and the element declaration is for a required item, and it has a
default value specified, then an element item having the default value is
created in the augmented infoset.
c) If any infoset item’s value is requested recursively as a part of (a)
above and (a) does not apply, and the corresponding value is not present, and
(b) does not apply then it is a processing error.
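A minimal Python sketch of rules (0) through (c), to make the evaluation order concrete. The names ElementDecl, augment and ProcessingError are hypothetical, an outputValueCalc expression is modelled as a plain callable that receives a lookup function, and treating a reference to an inputValueCalc element as an error is an assumption the rules above do not settle.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


class ProcessingError(Exception):
    """Rule (c): a requested value cannot be obtained."""


@dataclass
class ElementDecl:
    name: str
    required: bool = True
    default: Any = None                  # None means "no default specified"
    has_input_value_calc: bool = False
    # Hypothetical stand-in for a dfdl:outputValueCalc expression.
    output_value_calc: Optional[Callable[[Callable[[str], Any]], Any]] = None


def augment(decls: Dict[str, ElementDecl], infoset: Dict[str, Any]) -> Dict[str, Any]:
    # Rule (0): infoset values of inputValueCalc elements are ignored.
    augmented = {n: v for n, v in infoset.items()
                 if not decls[n].has_input_value_calc}

    def fill(name: str) -> bool:
        """Apply rule (a) or (b); report whether a value is now present."""
        decl = decls[name]
        if decl.has_input_value_calc:
            return False                 # assumption: never part of the output infoset
        if decl.output_value_calc is not None:
            # Rule (a): the computed value supersedes any pre-existing value.
            augmented[name] = decl.output_value_calc(lookup)
        elif name not in augmented and decl.required and decl.default is not None:
            # Rule (b): a required item with no value takes its default.
            augmented[name] = decl.default
        return name in augmented

    def lookup(name: str) -> Any:
        """Recursive reference made from inside an outputValueCalc expression."""
        if not fill(name):
            # Rule (c): no value present and neither (a) nor (b) supplied one.
            raise ProcessingError(f"no value available for {name!r}")
        return augmented[name]

    for name in decls:
        fill(name)
    return augmented
```

In this sketch a supplied value for an element with an outputValueCalc is overwritten, which mirrors the "superseded" wording in rule (a). Cyclic references between expressions are not detected.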
Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073
Fax: +44 (0)1962 816898
From: "Mike Beckerle" <mbeckerle.dfdl@gmail.com>
To: <mbeckerle.dfdl@gmail.com>, <dfdl-wg@ogf.org>
Date: 21/06/2008 13:18
Subject: Re: [DFDL-WG] Unparsing and outputValueCalc
Revised per our discussions
last week on the call.
Unparsing
Definition: augmented infoset. When unparsing, one begins with the DFDL
schema and, conceptually, with the logical infoset. As the values of items
are filled in by defaulting and by use of the DFDL outputValueCalc property,
these new item values augment the infoset. The resulting infoset is called
the augmented infoset.
Definition: an element declaration
in the schema describes a potentially represented item if
that element declaration does not have an inputValueCalc property. Whether
the element declaration describes an item that is actually represented
or not depends on whether the element declaration is for a required or
optional element, and whether the element has a corresponding value in
the augmented infoset.
When unparsing, an element declaration and the infoset are considered as
follows:
a) If the element declaration has a dfdl:outputValueCalc property then the
expression which is the dfdl:outputValueCalc property value is evaluated and
the resulting value becomes the value of the element item in the augmented
infoset. Any pre-existing value for the infoset item is superseded by this
new value. References to other augmented infoset items from within the
outputValueCalc expression must obtain their values from the augmented
infoset directly (when the value is already present) or by recursively using
these methods (a) and (b) as needed.
b) If the element declaration has no corresponding value in the augmented
infoset, and the element declaration is for a required item, and it has a
default value specified, then an element item having the default value is
created in the augmented infoset.
c) If any infoset item’s value is requested recursively as a part of (a)
above and (a) does not apply, and the corresponding value is not present, and
(b) does not apply then it is a processing error.
Given this augmented infoset, if a potentially represented element
declaration has a corresponding infoset item, then that item is serialized
according to its DFDL properties. If the element declaration is for a
required item and there is no value in the augmented infoset, then it is a
processing error.
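The final pass over the augmented infoset could look roughly like the following Python sketch. The dictionary keys 'name', 'required' and 'has_input_value_calc', and the serialize callback, are hypothetical stand-ins for the schema's element declarations and their DFDL properties.

```python
class ProcessingError(Exception):
    """A required element has no value in the augmented infoset."""


def serialize_all(decls, augmented, serialize):
    """Serialize every potentially represented element that has a value.

    decls      -- list of dicts, in schema order (hypothetical shape)
    augmented  -- dict mapping element name to its augmented-infoset value
    serialize  -- callable (decl, value) -> bytes, standing in for the
                  element's DFDL representation properties
    """
    out = []
    for decl in decls:
        if decl.get("has_input_value_calc"):
            continue                      # not potentially represented
        name = decl["name"]
        if name in augmented:
            out.append(serialize(decl, augmented[name]))
        elif decl.get("required", True):
            raise ProcessingError(f"required element {name!r} has no value")
        # An optional element with no value is simply not represented.
    return b"".join(out)
```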
Because rule (a) above is used
even if the augmented infoset item already exists and has a value, it is
possible for an outputValueCalc expression to be evaluated multiple times.
DFDL implementations are free to cache values and avoid this repeated evaluation
for efficiency, as the semantics of DFDL require that the outputValueCalc
expression return the same value every time it is evaluated.
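One way an implementation might avoid the repeated evaluation is to memoize each expression's result, which is safe only because the expression must return the same value every time. A hedged Python sketch follows; make_cached_lookup and the callable-per-element representation of expressions are assumptions for illustration.

```python
from typing import Any, Callable, Dict

Lookup = Callable[[str], Any]


def make_cached_lookup(expressions: Dict[str, Callable[[Lookup], Any]],
                       infoset: Dict[str, Any]) -> Lookup:
    """Return a lookup that evaluates each outputValueCalc at most once."""
    cache: Dict[str, Any] = {}

    def value_of(name: str) -> Any:
        if name in expressions:
            if name not in cache:
                # First request: evaluate and remember the result.
                cache[name] = expressions[name](value_of)
            return cache[name]
        # No expression: take the value supplied in the infoset.
        return infoset[name]

    return value_of
```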
In expressions, the function
dfdl:length() can be called to determine the representation length of an
item. If an element declaration is not potentially represented, then dfdl:length()
is defined to return 0.
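As a small illustration of that rule, here is a Python stand-in for dfdl:length(). The function name dfdl_length, the dict-shaped declaration and the serialize callback are hypothetical, and measuring the length in bytes of the serialized form is an assumption.

```python
def dfdl_length(decl, value, serialize):
    """Representation length of an item, in bytes (assumed unit).

    decl      -- dict with an optional 'has_input_value_calc' flag
    serialize -- callable (decl, value) -> bytes
    """
    if decl.get("has_input_value_calc"):
        # Not potentially represented: the length is defined to be 0.
        return 0
    return len(serialize(decl, value))
```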
Mike Beckerle | OGF DFDL
WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2100 | 504 Totten Pond Road, Waltham MA 02451
| mbeckerle.dfdl@gmail.com
From: Mike Beckerle [mailto:mbeckerle.dfdl@gmail.com]
Sent: Thursday, June 12, 2008 12:01 AM
To: dfdl-wg@ogf.org
Subject: Unparsing and outputValueCalc
Based on our discussions on
the call today, I was thinking about the definition of output “unparsing”
and how outputValueCalc is dealt with.
I believe the following language
is sufficient to explain how such expressions are evaluated.
When unparsing, an element
declaration in the schema must have a corresponding value in the infoset.
If one exists then that value
is serialized based on its properties. If there is no corresponding value
in the infoset then a value is computed as follows:
a) If the element declaration is required, and has a default value
specified, then an element item having the default value is created in the
infoset.
b) If the element declaration has an outputValueCalc property then the
expression which is the property value is evaluated and the resulting value
becomes the value of the element item in the infoset. References to other
infoset elements from within the outputValueCalc expression must obtain their
values from the infoset directly (when the value is already present) or by
recursively using these methods (a) and (b) as needed.
c) If any infoset element’s value is requested and neither (a) nor (b)
applies, then it is a processing error.
Seems ok to me. This is the
ordinary stuff of language specification.
The function dfdl:length()
needs some additional discussion.
I think we can restrict dfdl:length()
to accept only paths to element info items. I.e., the first argument must
be an explicit path.
(Alternatively, we can make dfdl:length() a member of an info item, as in
../x.dfdl:length('bytes'), or however we want to notate obtaining the length
from a path, instead of dfdl:length(../x, 'bytes'). Either notation style is
ok with me.)
The path for dfdl:length() must be to an element which has a representation.
That is, it cannot have the inputValueCalc property. This ensures that it is
meaningful to ask for the dfdl:length, that is, the representation length of
the item measured in the requested units.
There is already an XPath function count() which returns the number of
occurrences of an item. Both count() and dfdl:length() potentially imply
buffering in the unparser implementation.
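To illustrate why buffering is implied, consider a length field that precedes the item it measures. The Python sketch below assumes a 2-byte big-endian length prefix and a UTF-8 body; both choices, and the function name, are hypothetical.

```python
import struct


def unparse_length_prefixed(body_text: str, encoding: str = "utf-8") -> bytes:
    """Emit a length field followed by the body it describes."""
    # The body has to be serialized (buffered) before anything is written,
    # because the length field comes first in the output but its value
    # depends on the body's representation length.
    body = body_text.encode(encoding)
    prefix = struct.pack(">H", len(body))   # assumed 2-byte big-endian length
    return prefix + body
```

A streaming unparser that wrote each element as soon as it was visited could not produce this output without holding the body back or revisiting the length field.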
Comments?