Revised per our discussions on the call last week.
Unparsing
Definition: augmented infoset.
When unparsing, one begins with the DFDL schema and, conceptually, with the
logical infoset. As the values of items are filled in by defaulting and by use
of the DFDL outputValueCalc property, these new item values augment the infoset.
The resulting infoset is called the augmented infoset.
Definition: an element declaration in the
schema describes a potentially represented item if that element
declaration does not have an inputValueCalc property. Whether the element
declaration describes an item that is actually represented or not depends on
whether the element declaration is for a required or optional element, and
whether the element has a corresponding value in the augmented infoset.
When unparsing, an element declaration and
the infoset are considered as follows:
a) If the element declaration has a dfdl:outputValueCalc property then
the expression which is the dfdl:outputValueCalc property value is evaluated
and the resulting value becomes the value of the element item in the augmented infoset.
Any pre-existing value for the infoset item is superseded by this new value.
References to other
augmented infoset items from within the outputValueCalc expression must obtain
their values from the augmented infoset directly (when the value is already
present) or by recursively using these methods (a) and (b) as needed.
b) If the element declaration has no corresponding value in the
augmented infoset, and the element declaration is for a required item, and it has a default value specified, then an
element item having the default value is created in the augmented infoset.
c) If any infoset item’s value is requested recursively as
part of (a) above, and (a) does not apply to that item, the corresponding value
is not present, and (b) does not apply, then it is a processing error.
Given this augmented infoset, if a potentially
represented element declaration has a corresponding infoset item, then that item
is serialized according to its DFDL properties. If the element declaration is
for a required item and there is no value in the augmented infoset, then it is
a processing error.
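Rules (a) through (c) and the serialization check can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the names ElementDecl, augment and items_to_serialize are hypothetical, an outputValueCalc expression is modelled as a Python callable that receives a lookup function, every referenced name is assumed to be declared, and the cycle guard is an added assumption that the rules above do not state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


class ProcessingError(Exception):
    """Stands in for the 'processing error' named in the rules."""


@dataclass
class ElementDecl:
    required: bool = True
    default: Optional[object] = None
    # Hypothetical stand-in for a dfdl:outputValueCalc expression: a callable
    # that is handed a lookup function for other augmented infoset items.
    output_value_calc: Optional[Callable[[Callable[[str], object]], object]] = None
    # True means the declaration is not potentially represented.
    has_input_value_calc: bool = False


def augment(decls: Dict[str, ElementDecl], infoset: Dict[str, object]) -> Dict[str, object]:
    """Build the augmented infoset from the logical infoset."""
    augmented = dict(infoset)
    in_progress = set()  # cycle guard: an assumption, not part of the rules

    def apply_rules(name: str) -> bool:
        """Apply (a) or (b); return True if the item now has a value."""
        decl = decls[name]
        if decl.output_value_calc is not None:
            # Rule (a): evaluate and supersede any pre-existing value.
            if name in in_progress:
                raise ProcessingError(f"circular reference through {name!r}")
            in_progress.add(name)
            try:
                augmented[name] = decl.output_value_calc(request)
            finally:
                in_progress.discard(name)
        elif name not in augmented and decl.required and decl.default is not None:
            # Rule (b): a required item with no value takes its default.
            augmented[name] = decl.default
        return name in augmented

    def request(name: str) -> object:
        # Rule (c): a recursive request that yields no value is an error.
        if not apply_rules(name):
            raise ProcessingError(f"no value available for {name!r}")
        return augmented[name]

    for name in decls:
        apply_rules(name)
    return augmented


def items_to_serialize(decls: Dict[str, ElementDecl],
                       augmented: Dict[str, object]) -> List[Tuple[str, object]]:
    """Select the items that are actually represented, in declaration order."""
    out = []
    for name, decl in decls.items():
        if decl.has_input_value_calc:
            continue  # not potentially represented
        if name in augmented:
            out.append((name, augmented[name]))
        elif decl.required:
            raise ProcessingError(f"required item {name!r} has no value")
    return out


decls = {
    "len": ElementDecl(output_value_calc=lambda get: len(get("payload"))),
    "payload": ElementDecl(),
    "version": ElementDecl(default=1),
    "comment": ElementDecl(required=False),
}
augmented = augment(decls, {"len": 99, "payload": "abc"})
print(items_to_serialize(decls, augmented))
```

In this example the pre-existing value 99 for "len" is superseded under rule (a), "version" is defaulted under rule (b), and the optional "comment" is simply not represented.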
Because rule (a) above is used even if the
augmented infoset item already exists and has a value, it is possible for an
outputValueCalc expression to be evaluated multiple times. DFDL implementations
are free to cache values and avoid this repeated evaluation for efficiency, as
the semantics of DFDL require that the outputValueCalc expression return the
same value every time it is evaluated.
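One possible way to avoid the repeated evaluation is to memoize each computed value, which is safe only because the expression must return the same value every time. The sketch below is an assumption about how an implementation might do this; make_cached_lookup and the callable form of the expressions are hypothetical names used for illustration.

```python
from typing import Callable, Dict

Lookup = Callable[[str], object]


def make_cached_lookup(calcs: Dict[str, Callable[[Lookup], object]]) -> Lookup:
    """Return a lookup that evaluates each expression at most once."""
    cache: Dict[str, object] = {}

    def lookup(name: str) -> object:
        if name not in cache:
            # First request: evaluate the expression and remember the result.
            cache[name] = calcs[name](lookup)
        return cache[name]

    return lookup


evaluations = {"total": 0}


def calc_total(get: Lookup) -> int:
    evaluations["total"] += 1  # count how often the expression really runs
    return 10


calcs = {
    "total": calc_total,
    "header": lambda get: get("total") + 1,
    "trailer": lambda get: get("total") + get("header"),
}
lookup = make_cached_lookup(calcs)
result = lookup("trailer")
```

Here "total" is referenced twice while computing "trailer" (once directly and once through "header"), but its expression runs only once.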
In expressions, the function dfdl:length()
can be called to determine the representation length of an item. If an element
declaration is not potentially represented, then dfdl:length() is defined to
return 0.
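As a rough sketch of that definition, assuming an item whose representation has already been produced as bytes: the Item type, the dfdl_length name and the units argument (including the 'bits' option) are hypothetical and only illustrate the zero-length case.

```python
from dataclasses import dataclass


@dataclass
class Item:
    representation: bytes = b""         # assumed already-serialized form
    has_input_value_calc: bool = False  # True: not potentially represented


def dfdl_length(item: Item, units: str = "bytes") -> int:
    """Representation length of an item in the requested units."""
    if item.has_input_value_calc:
        # Not potentially represented, so the length is defined to be 0.
        return 0
    if units == "bytes":
        return len(item.representation)
    if units == "bits":
        return 8 * len(item.representation)
    raise ValueError(f"unsupported units: {units!r}")
```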
Tel: 781-810-2100 |
From:
Sent: Thursday, June 12, 2008 12:01 AM
To: dfdl-wg@ogf.org
Subject: Unparsing and outputValueCalc
Based on our discussions on the call
today, I was thinking about the definition of output “unparsing” and
how outputValueCalc is dealt with.
I believe the following language is
sufficient to explain how such expressions are evaluated.
When unparsing, an element declaration in
the schema must have a corresponding value in the infoset.
If one exists then that value is
serialized based on its properties. If there is no corresponding value in the
infoset then a value is computed as follows:
a) If the element declaration is required, and has a default value
specified, then an element item having the default value is created in the
infoset.
b) If the element declaration has an outputValueCalc property then the
expression which is the property value is evaluated and the resulting value
becomes the value of the element item in the infoset. References to other
infoset elements from within the outputValueCalc expression must obtain their
values from the infoset directly (when the value is already present) or by
recursively using these methods (a) and (b) as needed.
c) If any infoset element’s value is requested and neither (a) nor (b)
applies, then it is a processing error.
Seems ok to me. This is the ordinary stuff
of language specification.
The function dfdl:length() needs some
additional discussion.
I think we can restrict dfdl:length() to
accept only paths to element info items. I.e., the first argument must be an
explicit path.
(Alternatively, we can make
dfdl:length() a member of an info item, as in
../x.dfdl:length(‘bytes’), or however we want to notate
obtaining the length from a path, instead of dfdl:length(../x,
‘bytes’). Either notation style is ok with me.)
The path for dfdl:length() must be to an
element which has a representation. That is, it cannot have the inputValueCalc
property. This ensures that it is meaningful to ask for the dfdl:length,
that is, the representation length of the item measured in the requested units.
There is already an XPath function count()
which returns the number of occurrences of an item. Both count() and
dfdl:length() potentially imply buffering in the unparser implementation.
Comments?