
For discussion on today's WG call...

Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

----- Forwarded by Steve Hanson/UK/IBM on 24/09/2013 14:09 -----

From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/09/2013 15:36
Subject: Re: [DFDL-WG] questions on setVariable

Mike,

I have several email threads that track the original design discussion for action 186. You originally wrote down an algorithm for combining statement annotations into an ordered list, but then the next version scrapped that and proposed the current scheme. I think there must have been some reason for this?

Are we just talking setVariable here, or also newVariableInstance and assert as well? If so, would the ordering rules be the same within, and across, annotation points?

A couple of users have asked about short form versions of statement annotations. Imposing an execution order is not compatible with this. This was discussed in one of the action 186 emails.

Regards
Steve Hanson

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/09/2013 14:48
Subject: Re: [DFDL-WG] questions on setVariable

It is when there is more than one setVariable at one single annotation point that I believe they should definitely execute in lexical order. I am much less worried about ordering across different annotation points that get resolved together, which is what the Errata was clarifying.
My preference would be this: If an element ref has 2 setVariable statements for variables A, B, and the element declaration has setVariable statements for variables X, Y, then ABXY, AXBY, AXYB, XYAB are all possible interleavings, but A is always before B, and X is always before Y, so execution is always consistent with lexical order.

I want to just clarify the hole we've left by not being tight enough about specifying the order of this evaluation of setVariables within or across annotation points in a resolved set.

I can build a completely non-deterministic schema:

    <defineVariable name="x" defaultValue="1"/>
    <defineVariable name="y" defaultValue="2"/>
    <defineVariable name="z" defaultValue="0"/>

    <group name="coinFlip">
      <choice>
        <sequence>
          <annotation><appinfo>
            <!-- div by zero to create a proc error -->
            <dfdl:setVariable name="x" value="{ if ($y = 2) then 5/$z else 3 }"/>
            <dfdl:setVariable name="y" value="1"/>
          </appinfo></annotation>
          <element name="nonLexicalOrder" type="string" dfdl:inputValueCalc="{ '' }"/>
        </sequence>
        <element name="lexicalOrder" type="string" dfdl:inputValueCalc="{ '' }"/>
      </choice>
    </group>

The above group definition illustrates how we can test what order a particular implementation executes these setVariable statements. You get <lexicalOrder/> or <nonLexicalOrder/> depending on what the implementation does. It could even vary from run to run for one implementation, if the implementation uses a hash table with different random seeds at different times. It could even vary within a run, if this group is reused several times in the schema.

Now, why would somebody write this, I don't know, but it does illustrate the hole in the spec. Certainly it would be better if DFDL could not express things like this.

Computing x reads the value of y. If the assignment to y has not yet happened, then we get a processing error when we divide by zero.
This can cause backtracking, and the schema will effectively dodge the SDE that is supposed to happen to prevent this sort of assignment-timing stuff.

This doesn't fix the problem entirely. I think our current 'non-specification' of the precise order here, even across annotation points, was a bit of a cop-out.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy

On Wed, Sep 11, 2013 at 6:26 AM, Steve Hanson <smh@uk.ibm.com> wrote:

"I think the execution order needs to be specific within the annotations at one annotation point."

The problem is that action 186 allowed scenarios like setVariable annotations simultaneously on an element ref, its global element and the element's simple type (*). So there is no longer 'one annotation point', there are multiple. That's when we formalised the concept of a 'resolved set of annotations' for a component, and stated that for all annotations of the same kind (**) in that set, the order of execution is not defined. We also state schema authors can insert sequences to provide more precise control over when annotations are executed.

Several reasons for this. Is the lexical order of elements in appinfos, or of multiple appinfos, guaranteed in XSDL? If not, then an editor could serialize appinfos or their content in any order. We also wanted to reserve the right to add explicit timing control in the future. And it made implementations easier.

IBM DFDL 1.0.3 has been out in the field with this support since 1Q 2013. Tim, can you say what the execution order is for a resolved set of setVariables please? Or can we not tell because of the model?

(*) Must be different variables, else SDE.
(**) Asserts and discriminators with testKind 'pattern' are a different 'kind' from those with 'expression', for this purpose.
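
The "insert sequences" approach mentioned above might look roughly like the fragment below. This is only a sketch under assumptions: the variable names ex:first and ex:second, the ex prefix and the element name payload are hypothetical and do not come from any schema in this thread, and both variables are assumed to be declared elsewhere with dfdl:defineVariable. Each setVariable sits on its own sequence, so each is at a separate annotation point. As section 9.5 is generally read, the statements on a sequence are evaluated before the content of that sequence, so the outer assignment should happen before the inner one.

```xml
<!-- Hypothetical names throughout: ex:first, ex:second and "payload" are illustrative only. -->
<xs:sequence>
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- Outer annotation point: evaluated before the nested sequence is processed. -->
      <dfdl:setVariable ref="ex:first" value="{ 1 }"/>
    </xs:appinfo>
  </xs:annotation>
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <!-- Inner annotation point: reads ex:first only after it has been set. -->
        <dfdl:setVariable ref="ex:second" value="{ $ex:first + 1 }"/>
      </xs:appinfo>
    </xs:annotation>
    <xs:element name="payload" type="xs:string"/>
  </xs:sequence>
</xs:sequence>
```

With this nesting the order no longer depends on how a resolved set at a single component is evaluated, at the cost of an extra sequence in the model.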
Regards
Steve Hanson

From: Suman Kalia <kalia@ca.ibm.com>
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, "dfdl-wg-bounces@ogf.org" <dfdl-wg-bounces@ogf.org>, Steve Hanson/UK/IBM@IBMGB
Date: 10/09/2013 19:23
Subject: Re: [DFDL-WG] questions on setVariable

Mike - In the use case highlighted by Jonathan, where 2 annotations are defined at the same point, I would suggest that the execution order be based on lexical order, i.e. the order in which the annotations appear.

Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923 T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message broker http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.ht...

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson <smh@uk.ibm.com>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, "dfdl-wg-bounces@ogf.org" <dfdl-wg-bounces@ogf.org>
Date: 09/10/2013 02:14 PM
Subject: Re: [DFDL-WG] questions on setVariable
Sent by: dfdl-wg-bounces@ogf.org

If the first setVariable executes first, that will read variable baz, which will cause the second setVariable to SDE because you cannot set a variable after it has been read. I believe this is the right semantics for this schema, because both these setVariable annotations are written at the same annotation point.

If the second setVariable executes first, that will assign variable baz, which is OK; then when the first setVariable executes, it will get the value that was just assigned.

While the spec says you shouldn't depend on this kind of ordering stuff, I think it is pretty unsatisfactory if the order is unspecified.

Section 9.5 lays out the order of the different kinds of statement annotations relative to each other, but doesn't say what happens between two setVariable statements that appear at the same annotation point.
I think the execution order needs to be specific within the annotations at one annotation point.

Mike Beckerle

On Tue, Sep 10, 2013 at 1:50 PM, Steve Hanson <smh@uk.ibm.com> wrote:

1) I'll let Tim comment more on this, but a relative expression uses as its context the current element on the stack. So using '.' on a simple element or its type resolves to the simple element, and using '.' on a complex element or a group child (recursively) of a complex element resolves to the complex element. Same deal when using the '..' parent axis, so beware - as this example shows:

    <xs:element name="root">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="delim" type="xs:string"/>
          <xs:element name="body">
            <xs:complexType>
              <xs:sequence dfdl:separator="{../delim}">

...and not {../../delim} !

This is all implied by the way XPath works. XPath only knows about elements.

2) I don't see why Jonathan's example could give an SDE. The defineVariables will always be executed first. However the result is ambiguous, depending on the order each setVariable is executed, as 'bar' could end up as 'biff' or 'fooey'. "Schema authors can insert sequences to provide more precise control over when variables are set" - quote from section 9.5.3 (errata 3.25).

3) To Mike's comment, the order of execution of setVariable and newVariableInstance on the same annotation is strictly defined, see section 9.5 (errata 3.25).

Regards
Steve Hanson

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: "Cranford, Jonathan W."
<jcranford@mitre.org>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 10/09/2013 18:23
Subject: Re: [DFDL-WG] questions on setVariable
Sent by: dfdl-wg-bounces@ogf.org

Those are both public-comment-class issues.

The issue of what "." means in various places is interesting. It is referring to an infoset node, but how to explain exactly which one it is....

The out-of-order issue with evaluation that you mention is definitely an issue. The schema you gave will work or SDE depending on the order of evaluation of the statements, and the language is not precise enough to say what the order is.

I think the order of the setVariable and newVariableInstance statements at any one annotation point must be linear, first to last. The setVariable and newVariableInstance statements that are combined into the resolved set of annotations from multiple annotation points (e.g., an element and a global type definition it references) can be interleaved in any order.

Mike Beckerle

On Tue, Sep 10, 2013 at 12:53 PM, Cranford, Jonathan W. <jcranford@mitre.org> wrote:

These questions may go towards public comment if the answers aren't straightforward. Section references are against the latest editor's draft.

· Section 7.9 - 5th paragraph after first example - "A dfdl:setVariable value expression may refer to the value of this element using a relative path value '.'." Which element? If dfdl:setVariable is attached to a simpleType, then the "element" referred to here is the element of that type, correct? But when it's attached to a group reference, sequence, or choice, is it the parent element which contains the group reference, sequence, or choice?

· Section 7.9 - next-to-last paragraph: "However, the order of execution among them is not specified." Wouldn't this be ambiguous, then?

    <xs:schema>
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <xs:defineVariable name="bar" />
          <xs:defineVariable name="baz" defaultValue="biff"/>
        </xs:appinfo>
      </xs:annotation>
      ...
      <xs:element name="foo">
        <xs:annotation>
          <xs:appinfo source="http://www.ogf.org/dfdl/">
            <xs:setVariable ref="bar" value="{$tns:baz}"/>
            <xs:setVariable ref="baz" value="fooey"/>
          </xs:appinfo>
        </xs:annotation>
      </xs:element>
      ...
    </xs:schema>

If the order of execution isn't specified, isn't it ambiguous whether tns:bar gets the value of "biff" (defaultValue of tns:baz) or "fooey"?

Respectfully,
--
Jonathan W. Cranford
Senior Information Systems Engineer
The MITRE Corporation (http://www.mitre.org)

--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU