DFDL: output and choices/discriminators

I'd like to discuss this example in email and/or on our call this week.

It illustrates that discriminators must be evaluated both on output and on input. We proposed at the F2F that assertions are only about parsing, but the same cannot be said of discriminators.

Because email often line-wraps in ways that break things, I've also attached the same example as a file.

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- 80 column ruler 2345678901234567890123456789012345678901234567890123456 -->
<!--
***************************************************************************
Illustration of use of layering with choices and discriminators

Lessons from this example:

Discriminators must be evaluated on output as well as input in order to
decide choices.
***************************************************************************
-->
<xs:schema targetNamespace="http://dataformat.org/tests"
  elementFormDefault="qualified"
  xsi:schemaLocation=
    "http://www.ogf.org/dfdl/dfdl-0.1 ../../xsd/dfdl.xsd
     http://www.w3.org/2001/XMLSchema ../../xsd/XMLSchema.xsd"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="http://www.ogf.org/dfdl/tests"
  xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-0.1">

<!--
*************************************************************************
Our default format will be binary, without delimiters, with bits for
alignment units, and implicit length kind which we'll override where needed.
*************************************************************************
-->
<xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/dfdl-0.1">
  <dfdl:defineFormat name="default">
    <dfdl:format representation="binary"
                 lengthKind="implicit"
                 initiator="" separator="" terminator=""
                 alignmentUnits="bits" />
  </dfdl:defineFormat>
</xs:appinfo></xs:annotation>

<!--
*************************************************************************
Example: a string with smart one byte or four byte length preceding it.

Modeled as a single bit flag, followed by a 7 bit or 31 bit integer after it.
*************************************************************************
-->
<xs:complexType name="smartLengthString"
                dfdl:ref="default"
                dfdl:lengthKind="explicit"
                dfdl:lengthUnits="bits">
  <xs:sequence>
    <xs:annotation><xs:appinfo source="...">
      <dfdl:hidden>
        <xs:element name="lengthFlag" type="xs:byte"
                    dfdl:length="1" dfdl:alignment="8"
                    dfdl:outputValueCalc='{ if ( ../logicalLength > 127 ) then "1" else "0" }' />
        <xs:choice dfdl:choiceKind='variable' dfdl:choiceResolvable="true">
          <!-- First choice alternative: one byte -->
          <xs:sequence>
            <xs:annotation><xs:appinfo source="http://www.ogf.org/dfdl/dfdl-0.1">
              <dfdl:discriminator test="{ ../lengthFlag != '1' }" />
            </xs:appinfo></xs:annotation>
            <xs:element name="oneByte" type="xs:byte"
                        dfdl:alignment="1" dfdl:length="7"
                        dfdl:outputValueCalc="{ ../logicalLength }" />
          </xs:sequence>
          <!-- Second choice alternative: four byte -->
          <xs:element name="fourByte" type="xs:int"
                      dfdl:alignment="1" dfdl:length="31"
                      dfdl:outputValueCalc="{ ../logicalLength }" />
        </xs:choice>
        <!-- this logicalLength element below isn't strictly speaking needed
             in this example. It's here to illustrate something having both
             input and output value calculation, and makes things a bit more
             readable. -->
        <xs:element name="logicalLength" type="xs:int"
                    dfdl:inputValueCalc="{ if (../lengthFlag = '1') then ../fourByte else ../oneByte }"
                    dfdl:outputValueCalc="{ dfdl:length(../str, 'characters') }" />
      </dfdl:hidden>
    </xs:appinfo></xs:annotation>
    <xs:element name="str" type="xs:string"
                dfdl:length="{ ../logicalLength }"
                dfdl:lengthUnits="characters" />
  </xs:sequence>
</xs:complexType>
</xs:schema>
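For readers who want to see the byte layout the schema above describes, here is a minimal Python sketch written outside of any DFDL processor. It assumes the flag sits in the most significant bit of the first byte, big-endian byte order for the 31-bit length, and one byte per character (ISO-8859-1); none of these are stated in the schema, and the function names are hypothetical. What it tries to show is that on output the flag must first be computed from the string length, and only then can the `../lengthFlag != '1'` test select the choice alternative.

```python
def unparse_smart_length_string(s: str) -> bytes:
    """Serialize s as: 1 flag bit, a 7-bit or 31-bit length, then the characters."""
    body = s.encode("iso-8859-1")          # assumption: one byte per character
    logical_length = len(s)                # dfdl:length(../str, 'characters')
    if logical_length >= 2 ** 31:
        raise ValueError("length does not fit in 31 bits")

    # outputValueCalc of lengthFlag
    length_flag = 1 if logical_length > 127 else 0

    # Discriminator of the first alternative, evaluated on output
    if length_flag != 1:
        header = bytes([logical_length])   # flag bit 0 followed by a 7-bit length
    else:
        # flag bit 1 followed by a 31-bit length (assumed big-endian)
        header = (0x80000000 | logical_length).to_bytes(4, "big")
    return header + body


def parse_smart_length_string(data: bytes) -> str:
    """Inverse of unparse_smart_length_string under the same assumptions."""
    length_flag = data[0] >> 7
    # The same discriminator, evaluated on input
    if length_flag != 1:
        logical_length = data[0] & 0x7F
        start = 1
    else:
        logical_length = int.from_bytes(data[:4], "big") & 0x7FFFFFFF
        start = 4
    return data[start:start + logical_length].decode("iso-8859-1")
```

In this sketch the same test appears in both directions; the only difference is where the flag comes from (the data on input, the computed length on output).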

Mike

I think there is an easier solution to your example, using an expression for the dfdl:length property of the length field instead of the choice.

<xs:element name="OneOrfourByte" type="xs:int"
            dfdl:alignment="1"
            dfdl:length='{ if ( ../logicalLength > 127 ) then "31" else "7" }'
            dfdl:outputValueCalc="{ ../logicalLength }" />

A simple infoset element that can have multiple physical representations is really support for unions, which we have excluded from V1.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM  email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073  Fax: +44 (0)1962 816898

From: "Mike Beckerle" <mbeckerle.dfdl@gmail.com>
To: <dfdl-wg@ogf.org>
Date: 09/06/2008 17:42
Subject: [DFDL-WG] DFDL: output and choices/discriminators

[attachment "choice-discriminator-example.xml" deleted by Alan Powell/UK/IBM]

--
dfdl-wg mailing list
dfdl-wg@ogf.org
http://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

I follow your simplification point. This is clever.

The point of my example was to motivate how choices work on output, though. So the fact that this trick works in this specific case is beside the point, I hope.

.mike

From: Alan Powell [mailto:alan_powell@uk.ibm.com]
Sent: Wednesday, June 11, 2008 11:46 AM
To: mbeckerle.dfdl@gmail.com
Cc: dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] DFDL: output and choices/discriminators

Based on our discussions on the call today, I was thinking about the definition of output "unparsing" and how outputValueCalc is dealt with. I believe the following language is sufficient to explain how such expressions are evaluated.

When unparsing, an element declaration in the schema must have a corresponding value in the infoset. If one exists then that value is serialized based on its properties. If there is no corresponding value in the infoset then a value is computed as follows:

a) If the element declaration is required, and has a default value specified, then an element item having the default value is created in the infoset.

b) If the element declaration has an outputValueCalc property then the expression which is the property value is evaluated and the resulting value becomes the value of the element item in the infoset. References to other infoset elements from within the outputValueCalc expression must obtain their values from the infoset directly (when the value is already present) or by recursively using these methods (a) and (b) as needed.

c) If any infoset element's value is requested and neither (a) nor (b) applies, then it is a processing error.

Seems ok to me. This is the ordinary stuff of language specification.

The function dfdl:length() needs some additional discussion.

I think we can restrict dfdl:length() to accept only paths to element info items. I.e., the first argument must be an explicit path. (Alternatively, we can make dfdl:length() a member of an info item, as in ../x.dfdl:length('bytes') - or however we want to notate obtaining the length from a path, instead of dfdl:length(../x, 'bytes'). Either notation style is ok with me.)

The path for dfdl:length() must be to an element which has representation. That is, it cannot have the inputValueCalc property. This ensures that it is meaningful to ask for the dfdl:length, that is, the representation length of the item measured in the requested units.

There is already an XPath function count() which returns the number of occurrences of an item.

Both count() and dfdl:length() potentially imply buffering in the unparser implementation.

Comments?

Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2100 | 504 Totten Pond Road, Waltham MA 02451 | mbeckerle.dfdl@gmail.com
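Rules (a) to (c) above can be read as a small recursive lookup. The following Python sketch is only an illustration of that reading, not an implementation of any DFDL processor: `ElementDecl`, `value_of`, and `ProcessingError` are hypothetical names, an outputValueCalc expression is modeled as a plain Python function that receives a lookup callback, and the check for circular references is an added assumption that the text above does not mention.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


class ProcessingError(Exception):
    """Raised where the rules call for a processing error."""


@dataclass
class ElementDecl:
    name: str
    required: bool = True
    default: Optional[object] = None
    # Hypothetical stand-in for dfdl:outputValueCalc: a function that is
    # handed a callback for looking up other elements' values.
    output_value_calc: Optional[Callable[[Callable[[str], object]], object]] = None


def value_of(name: str, decls: Dict[str, ElementDecl],
             infoset: Dict[str, object], _in_progress=frozenset()) -> object:
    """Return the infoset value for name, computing it if it is missing."""
    if name in infoset:                       # value already present
        return infoset[name]
    if name in _in_progress:                  # assumption: cycles are errors
        raise ProcessingError(f"circular reference through {name!r}")
    decl = decls[name]
    if decl.required and decl.default is not None:        # rule (a)
        infoset[name] = decl.default
    elif decl.output_value_calc is not None:               # rule (b)
        def lookup(other: str) -> object:
            return value_of(other, decls, infoset, _in_progress | {name})
        infoset[name] = decl.output_value_calc(lookup)
    else:                                                  # rule (c)
        raise ProcessingError(f"no value available for {name!r}")
    return infoset[name]


# Usage, loosely following the smartLengthString example from this thread.
decls = {
    "lengthFlag": ElementDecl(
        "lengthFlag",
        output_value_calc=lambda get: "1" if get("logicalLength") > 127 else "0"),
    "logicalLength": ElementDecl(
        "logicalLength",
        output_value_calc=lambda get: len(get("str"))),
    "str": ElementDecl("str"),
}
infoset = {"str": "hello"}
flag = value_of("lengthFlag", decls, infoset)
```

Asking for `lengthFlag` pulls in `logicalLength`, which in turn reads `str` from the infoset; both computed values end up stored in `infoset`.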

Revised per our discussions last week on the call.

Unparsing

Definition: augmented infoset. When unparsing one begins with the DFDL schema and conceptually with the logical infoset. As the values of items are filled in by defaulting, and by use of the DFDL outputValueCalc property, these new item values augment the infoset. The resulting infoset is called the augmented infoset.

Definition: an element declaration in the schema describes a potentially represented item if that element declaration does not have an inputValueCalc property. Whether the element declaration describes an item that is actually represented or not depends on whether the element declaration is for a required or optional element, and whether the element has a corresponding value in the augmented infoset.

When unparsing, an element declaration and the infoset are considered as follows:

a) If the element declaration has a dfdl:outputValueCalc property then the expression which is the dfdl:outputValueCalc property value is evaluated and the resulting value becomes the value of the element item in the augmented infoset. Any pre-existing value for the infoset item is superseded by this new value. References to other augmented infoset items from within the outputValueCalc expression must obtain their values from the augmented infoset directly (when the value is already present) or by recursively using these methods (a) and (b) as needed.

b) If the element declaration has no corresponding value in the augmented infoset, and the element declaration is for a required item, and it has a default value specified, then an element item having the default value is created in the augmented infoset.

c) If any infoset item's value is requested recursively as a part of (a) above and (a) does not apply, and the corresponding value is not present, and (b) does not apply, then it is a processing error.

Given this augmented infoset, if the potentially represented element declaration has a corresponding infoset item then that item is serialized according to its DFDL properties. If the element declaration is for a required item, and there is no value in the augmented infoset, then it is a processing error.

Because rule (a) above is used even if the augmented infoset item already exists and has a value, it is possible for an outputValueCalc expression to be evaluated multiple times. DFDL implementations are free to cache values and avoid this repeated evaluation for efficiency, as the semantics of DFDL require that the outputValueCalc expression return the same value every time it is evaluated.

In expressions, the function dfdl:length() can be called to determine the representation length of an item. If an element declaration is not potentially represented, then dfdl:length() is defined to return 0.

Mike Beckerle | OGF DFDL WG Co-Chair | CTO | Oco, Inc.
Tel: 781-810-2100 | 504 Totten Pond Road, Waltham MA 02451 | mbeckerle.dfdl@gmail.com

From: Mike Beckerle [mailto:mbeckerle.dfdl@gmail.com]
Sent: Thursday, June 12, 2008 12:01 AM
To: dfdl-wg@ogf.org
Subject: Unparsing and outputValueCalc
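One possible reading of the revised rules, as a minimal Python sketch. Everything named here (`ElementDecl`, `augment`, `items_to_serialize`, `ProcessingError`) is hypothetical, outputValueCalc expressions are modeled as plain Python functions, and the presence of dfdl:inputValueCalc is reduced to a boolean. The sketch caches each calculated value, which the text above permits, and it does not try to detect circular references.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class ProcessingError(Exception):
    """Raised where the rules call for a processing error."""


@dataclass
class ElementDecl:
    name: str
    required: bool = True
    default: Optional[object] = None
    # Hypothetical stand-ins for the DFDL properties of the same names.
    output_value_calc: Optional[Callable[[Callable[[str], object]], object]] = None
    has_input_value_calc: bool = False


def augment(decls: Dict[str, ElementDecl],
            logical: Dict[str, object]) -> Dict[str, object]:
    """Build the augmented infoset from the logical infoset."""
    augmented = dict(logical)
    calculated = set()          # names whose outputValueCalc has already run

    def value_of(name: str) -> object:
        decl = decls[name]
        if decl.output_value_calc is not None:             # rule (a)
            if name not in calculated:                     # cache: evaluate once
                # Supersedes any pre-existing value for this item.
                augmented[name] = decl.output_value_calc(value_of)
                calculated.add(name)
            return augmented[name]
        if name in augmented:
            return augmented[name]
        if decl.required and decl.default is not None:     # rule (b)
            augmented[name] = decl.default
            return augmented[name]
        raise ProcessingError(f"no value available for {name!r}")  # rule (c)

    for name, decl in decls.items():
        if decl.output_value_calc is not None:
            value_of(name)
        elif name not in augmented:
            if decl.required and decl.default is not None:
                augmented[name] = decl.default
            elif decl.required and not decl.has_input_value_calc:
                raise ProcessingError(f"required item {name!r} has no value")
    return augmented


def items_to_serialize(decls: Dict[str, ElementDecl],
                       augmented: Dict[str, object]) -> List[str]:
    """Names of potentially represented items that have a value, in schema order."""
    return [name for name, decl in decls.items()
            if not decl.has_input_value_calc and name in augmented]


# Usage, loosely following the smartLengthString example from this thread.
decls = {
    "lengthFlag": ElementDecl(
        "lengthFlag",
        output_value_calc=lambda get: "1" if get("logicalLength") > 127 else "0"),
    "logicalLength": ElementDecl(
        "logicalLength",
        output_value_calc=lambda get: len(get("str")),
        has_input_value_calc=True),
    "str": ElementDecl("str"),
}
# The stale lengthFlag value in the logical infoset is superseded by rule (a).
augmented = augment(decls, {"str": "hello", "lengthFlag": "9"})
```

Because `logicalLength` carries an inputValueCalc in the example schema, it appears in the augmented infoset but is not among the items this sketch would serialize.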

Mike

The augmented infoset should include hidden elements. Although these almost certainly have an outputValueCalc, by including hidden elements we could use the same concept on parsing to describe the infoset used by the expression language.

When unparsing, an element declaration and the infoset are considered as follows:

0) If the element declaration has a dfdl:inputValueCalc property then the infoset value is ignored and nothing is output.

a) If the element declaration has a dfdl:outputValueCalc property then the expression which is the dfdl:outputValueCalc property value is evaluated and the resulting value becomes the value of the element item in the augmented infoset. Any pre-existing value for the infoset item is superseded by this new value. References to other augmented infoset items from within the outputValueCalc expression must obtain their values from the augmented infoset directly (when the value is already present) or by recursively using these methods (a) and (b) as needed.

b) If the element declaration has no corresponding value in the augmented infoset, and the element declaration is for a required item, and it has a default value specified, then an element item having the default value is created in the augmented infoset.

c) If any infoset item's value is requested recursively as a part of (a) above and (a) does not apply, and the corresponding value is not present, and (b) does not apply, then it is a processing error.

Alan Powell
MP 211, IBM UK Labs, Hursley, Winchester, SO21 2JN, England
Notes Id: Alan Powell/UK/IBM  email: alan_powell@uk.ibm.com
Tel: +44 (0)1962 815073  Fax: +44 (0)1962 816898

From: "Mike Beckerle" <mbeckerle.dfdl@gmail.com>
To: <mbeckerle.dfdl@gmail.com>, <dfdl-wg@ogf.org>
Date: 21/06/2008 13:18
Subject: Re: [DFDL-WG] Unparsing and outputValueCalc
participants (2): Alan Powell, Mike Beckerle