I can see where you're going with this,
but I don't find either option very intuitive. My preference would be 'choiceDispatchKey'
on the choice group and 'choiceBranchKey' on the choice branch. I'm
not too worried about using the word 'key' - I don't think users will get
confused by it when it is combined with 'dispatch' and 'branch' in the
property names.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 02/09/2013 18:24
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications (action 219)
Sent by: dfdl-wg-bounces@ogf.org
I am leaning away from the use of 'key'
because XML schema has xs:key and xs:keyRef components and people might
think they are related somehow.
I think a good analogy is with the switch statement in languages. Suggestions:
1) 'dfdl:switch' on the choice and 'dfdl:case' or 'dfdl:switchCase' on
the element - con is clash with dfdl:ignoreCase.
2) 'dfdl:switch' on the choice and 'dfdl:switchWhen' on the element.
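As a rough illustration of the switch analogy, the sketch below (Python, not DFDL) treats the value computed on the choice as the switch operand and the string on each branch as its case label. The function name `select_branch`, the sample labels, and the decision to raise an error when nothing matches are all assumptions made for the example.

```python
def select_branch(switch_value, cases):
    """Return the branch whose case label equals switch_value.

    `cases` maps a case label (a plain string) to a branch name.
    """
    if switch_value not in cases:
        # Assumption: an unmatched value is an error; the thread does not
        # say what should happen in this situation.
        raise LookupError(f"no branch matches {switch_value!r}")
    return cases[switch_value]


cases = {"HDR": "header", "DTL": "detail", "TRL": "trailer"}
print(select_branch("DTL", cases))  # detail
```

The point of the analogy is that selection is a single lookup on a string, with no attempt to parse each branch in turn.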
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl@gmail.com>
Date: 30/08/2013 15:06
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications (action 219)
Updated errata 3.15 to answer the open comments against that errata. Also
updated errata 2.126 to add the properties to the list of non-representation
properties.
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Date: 30/08/2013 00:10
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications (action 219)
Updated errata 3.15, so please review that in the errata doc also. Corresponding
changes are marked with OPEN bubbles, except in the Property precedence
part. (Did those by global search/replace).
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF
Intellectual Property Policy
On Wed, Aug 21, 2013 at 12:21 PM, Steve Hanson <smh@uk.ibm.com>
wrote:
Discussed this some more on the WG call.
The original proposal from Steve envisaged a single property, choiceBranchRef,
which was an expression that returned a QName. The attraction of this is
that no property is needed on the element. It supports the use case where
the 'tag' in the data corresponds closely to the element name, which is
the case for something like SWIFT. Further, it is easy to extend
this idea to xs:any wildcards in a future DFDL release.
However if the 'tag' is very different from the element name then the expression
can become a big 'if' statement and some of the performance benefit is
lost. Hence the current proposal for choiceBranchRef expression returning
a simple string, and the need for a property on elements. This gets
us into a problem with global elements and uniqueness.
What is needed for DFDL 1.0 is a mechanism that gives good performance
for the known use cases, and is local to a choice, and does not preclude
the provision of a QName based solution in a future DFDL release. What
we have now is a halfway house. So the following is proposed:
- Change the name choiceBranchRef to choiceKey (or choiceTag or TBD).
  Note that the use of 'Ref' is dropped as that implies QName in all other
  properties that use 'Ref'.
- Change the name elementID to branchKey (or branchTag or TBD).
- Disallow branchKey on a global element, so it is allowed only on local
  elements and element refs. This removes the need for the special rule
  about branchKey on an element ref overriding that on a global element.
The mechanism is now entirely local to a choice, and will not clash with any
future QName based scheme.
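The placement rule in the last bullet could be checked along these lines. This is only a sketch: the component labels and the function `branch_key_problems` are invented for the example, and the property name branchKey is itself still marked TBD above.

```python
ALLOWED_ON = {"local element", "element ref"}


def branch_key_problems(component_kind, branch_key):
    """Return a list of problems for one schema component.

    component_kind: "global element", "local element" or "element ref"
                    (illustrative labels, not DFDL terms of art).
    branch_key:     the property value, or None if the property is absent.
    """
    if branch_key is None:
        return []
    if component_kind not in ALLOWED_ON:
        return [f"branchKey is not allowed on a {component_kind}"]
    return []


print(branch_key_problems("global element", "A"))
print(branch_key_problems("element ref", "A"))
```

With the property banned from global elements, there is never a global value for an element ref to override, which is why the override rule becomes unnecessary.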
Regards
Steve Hanson
From: Steve Hanson/UK/IBM
To: dfdl-wg@ogf.org
Date: 19/08/2013 15:07
Subject: Fw: [DFDL-WG] Direct dispatch choice clarifications
OK, I can go with changing elementID to something less suggestive of XSDL.
I've spoken with Suman and changing it is not a big deal in our model.
Regards
Steve Hanson
----- Forwarded by Steve Hanson/UK/IBM on 19/08/2013 14:35 -----
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 17/08/2013 11:51
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
I'm in general agreement with that.
Just to avoid any possible confusion:
I think the important thing is to stop thinking of elementID as any sort
of XSD/DFDL language ID. It is not. It is a DFDL String Literal, but it must
describe a simple string value, and its value is matched against the
return value of the choiceBranchRef expression.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, "dfdl-wg-bounces@ogf.org" <dfdl-wg-bounces@ogf.org>, Tim Kimber/UK/IBM@IBMGB
Date: 16/08/2013 18:54
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
I am ok with leaving the name elementID as is purely due to our schedule.
However, I really do think it would be good to change it now.
I think it is that name which is creating all the confusion, and we really
ought to be more careful.
I think the important thing is to stop thinking of elementID as any sort
of XSD/DFDL language ID. It is not. It is a DFDL String Literal, meaning
its value is matched against data.
For example: It could be dfdl:elementID="%NUL;%NUL;%NUL;%NUL;"
meaning 4 null chars. So that when the choiceBranchRef="{ ../hdr/tag
}" evaluates to a 4 character string, then if they are all nul the
branch is taken by direct dispatch. This character NUL isn't allowed in
any namespaced identifiers in XSD or DFDL. It isn't even allowed in XML.
If I had dfdl:terminator="%NUL;" you wouldn't ask "what namespace
is that %NUL; in?"
So I think elementID is not, in any way, a namespace-qualified identifier
any more than a delimiter is.
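To illustrate why the value behaves like data rather than an identifier, here is a rough Python sketch that expands a small subset of DFDL character entities and compares the result with a tag value. Only %NUL;, %SP;, %CR;, %LF; and the numeric %#x...; and %#...; forms are handled; `expand_entities` is a hypothetical helper, not part of any DFDL processor, and raw byte entities, character classes and %% escaping are deliberately left out.

```python
import re

# Named entities covered by this sketch (a subset only).
NAMED = {"NUL": "\x00", "SP": " ", "CR": "\r", "LF": "\n"}
_ENTITY = re.compile(r"%(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Z]+));")


def expand_entities(literal):
    """Replace the supported character entities with the characters they name."""
    def repl(match):
        hex_digits, dec_digits, name = match.groups()
        if hex_digits:
            return chr(int(hex_digits, 16))
        if dec_digits:
            return chr(int(dec_digits))
        return NAMED[name]  # KeyError for names outside the subset
    return _ENTITY.sub(repl, literal)


element_id = expand_entities("%NUL;%NUL;%NUL;%NUL;")
tag_from_data = "\x00" * 4  # what the choiceBranchRef expression might return
print(element_id == tag_from_data)  # True
```

The comparison is plain string equality on characters that could never appear in a QName, which is the heart of the argument above.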
So I think uniqueness of the elementID within the alternatives of a choice
should be the only requirement. This stuff about unique within a namespace
if on a global element should be dropped.
The other argument for changing the name from elementID to something more
choice-dispatch specific is that this workgroup seems to be speculating
about using elementID for some sort of future wildcard-oriented feature.
I think we should reserve names we want for that future by NOT using them
now. If we use them now, we lock down part of the semantics in a way which
may just not work properly with some future addition to DFDL. Do we really
want elementID which is about direct dispatch choices but is not an identifier,
and then in the future have dfdl:wildcardElementID which is about wildcards
and IS an identifier? That's a mess.
If you think elementID is a really good name for some future wildcard feature,
then you should be advocating to NOT use it now for direct dispatch choices.
That is, unless you have a complete design in mind for the wildcard stuff
and are confident that elementID can be overloaded in a backward compatible
way. I've seen nothing like this articulated.
...mike
On Fri, Aug 16, 2013 at 12:32 PM, Steve Hanson <smh@uk.ibm.com>
wrote:
IBM has started to implement choiceBranchRef, in so far as the XML schemas
for DFDL have been updated to include elementID and we have regenerated
all our model code. I'd like to stick with elementID please.
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: Tim Kimber/UK/IBM@IBMGB, "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, "dfdl-wg-bounces@ogf.org" <dfdl-wg-bounces@ogf.org>
Date: 16/08/2013 17:09
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
I think we need to stay out of identifiers and namespaces and qnames
for this feature.
The elementID should have to be locally unique within the alternatives
of the choice that is dispatching to it. If that elementID lives on a global
element declaration, that means nothing at all alone. In the context of
a choice that has an element ref to that global element, then it has to
be unique within the arms of that choice.
This means it is possible to have two global element decls which have elementID="X",
and there is no conflict unless they are both used via element refs from
the same choice.
I think this handles every use case I know of. The "namespace"
requirement for elementID is only "unique within alternatives of a
choice".
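The "unique within alternatives of a choice" rule can be sketched in a few lines of Python. The element names and the `duplicate_ids` helper are made up for the example; the only thing taken from the discussion is that two global elements may share elementID="X" without conflict until one choice refers to both.

```python
from collections import Counter


def duplicate_ids(choice_branches):
    """Return elementIDs used by more than one branch of a single choice."""
    counts = Counter(element_id for _name, element_id in choice_branches)
    return sorted(eid for eid, n in counts.items() if n > 1)


# Two hypothetical global elements that both carry elementID="X".
g1 = ("ns1:Order", "X")
g2 = ("ns2:Invoice", "X")
other = ("ns1:Note", "Y")

choice_a = [g1, other]  # only one of the "X" elements is referenced
choice_b = [g1, g2]     # both are referenced from the same choice

print(duplicate_ids(choice_a))  # []
print(duplicate_ids(choice_b))  # ['X']
```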
Since nobody has implemented this feature yet, I would posit that we should
change elementID to something less suggestive of an XSD identifier/QName-ish
thing. Such as choiceDispatchTag.
...mike
On Fri, Aug 16, 2013 at 5:18 AM, Steve Hanson <smh@uk.ibm.com>
wrote:
You are right about the xs:QName constructor, it takes a string which is
prefix plus name. If we supplied a dfdl:QName constructor that took a URI
and a name, that would simplify things.
For example, the choiceBranchRef expression for SWIFT would be as below.
(The element name is always Document and the namespace is used to distinguish
the different messages).
{dfdl:QName(fn:concat(fn:concat('urn:swift:xsd:fin.', /FinMessage/Block2/MessageType),
".2011"), 'Document')}
If we stick with elementID for DFDL 1.0, then I agree with your 3 bullets,
but not the defaulting to element name as it is setting a behaviour that
may not be where we want to go for 2.0.
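For readers less used to XPath, the SWIFT expression above amounts to the following string building, shown here as a Python sketch. The function name `swift_branch_qname` and the sample message type "103" are assumptions for illustration; dfdl:QName itself is only a suggested constructor in this message, not an existing function.

```python
def swift_branch_qname(message_type, year="2011"):
    """Return (namespace URI, local name) for a SWIFT FIN message type.

    Mirrors the nested fn:concat calls: the namespace varies with the
    message type, while the local name is always 'Document'.
    """
    namespace = "urn:swift:xsd:fin." + message_type + "." + year
    return (namespace, "Document")


print(swift_branch_qname("103"))
# ('urn:swift:xsd:fin.103.2011', 'Document')
```

Because the tag maps onto the namespace by simple concatenation, no per-message lookup table is needed in this use case.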
Regards
Steve Hanson
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 15/08/2013 20:34
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
The other problem with using a QName is that it involves using namespace
prefixes. That means that there needs to be a mapping between prefixes
and namespace URIs. I can see that getting very problematic if the choice
group is located in a different xsd from the global elements that it references.
I think we should
- keep the elementID as a simple string
- insist that all branches of a choice have different elementIDs
- remove the global uniqueness constraint, for the reasons explained in
this email chain
I think it would be easier for modellers if the elementID defaulted to
the local name of the element. I understand that name clashes can, in principle,
occur. If users want to avoid that then they can be explicit about elementIDs
and they could even define a naming convention for their elementIDs
to make them look very much like QNames. Sounds like a lot of work, but
DFDL models that are complex enough to need that approach will often be
code-genned anyway.
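A small Python sketch of the defaulting idea, under the assumption that an absent elementID simply falls back to the local name. The helper names and the QName-like convention ("msgA:Document") are invented for the example.

```python
def effective_element_id(local_name, element_id=None):
    """Explicit elementID if given, otherwise the element's local name."""
    return local_name if element_id is None else element_id


def has_clash(branches):
    """True if two branches of one choice end up with the same effective ID."""
    ids = [effective_element_id(name, eid) for name, eid in branches]
    return len(ids) != len(set(ids))


# Same local name from two namespaces, no explicit IDs: the defaults clash.
defaulted = [("Document", None), ("Document", None)]
# Explicit IDs following a QName-like naming convention avoid the clash.
explicit = [("Document", "msgA:Document"), ("Document", "msgB:Document")]

print(has_clash(defaulted), has_clash(explicit))  # True False
```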
regards,
Tim Kimber, DFDL Team,
Hursley, UK
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 15/08/2013 18:37
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
The more I think it through, the more I see the use of a string elementID
(or local element name) causing problems when/if we extend to support xs:any
in the future. In my original proposal for direct dispatch choice
I proposed that choiceBranchRef returned a QName which therefore automatically
selected the element, and coped with any namespace issues. The problem
with QName though is that the expression to build it can become a big case
statement negating some of the performance gain, if there is no automap
way of getting from the 'tag' to the QName. Hence why we introduced elementID.
Regards
Steve Hanson
From: Steve Hanson/UK/IBM
To: Suman Kalia <kalia@ca.ibm.com>
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org, Tim Kimber/UK/IBM@IBMGB
Date: 15/08/2013 17:20
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Suman, comments to yours in pink
Regards
Steve Hanson
From: Suman Kalia <kalia@ca.ibm.com>
To: Tim Kimber/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 15/08/2013 15:31
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
comments in green
Suman Kalia
IBM Canada Lab
WMB Toolkit Architect and Development Lead
Tel: 905-413-3923
T/L 313-3923
Email: kalia@ca.ibm.com
For info on Message broker
http://www.ibm.com/developerworks/websphere/zones/businessintegration/wmb.html
From: Tim Kimber <KIMBERT@uk.ibm.com>
To: dfdl-wg@ogf.org
Date: 08/15/2013 09:58 AM
Subject: Re: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
See comment in <TK> tags.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg@ogf.org
Date: 15/08/2013 12:57
Subject: [DFDL-WG] Direct dispatch choice clarifications
Sent by: dfdl-wg-bounces@ogf.org
Looking at this in more detail prior to writing up behaviour for section
9, there are a couple of things missing from the spec or that need clarification:
1) Description of elementID property should say that empty string is not
allowed (this was in the erratum).
2) Should say that an elementID on an elementRef overrides any elementID
on the global element (this was in the erratum).
3) Section 15.1.2 says that it is a schema definition error if the elementID
values of global elements are not unique within a given namespace. I don't
see where namespace comes into this, the elementID is just a string so
surely it needs to be unique across namespaces? (Strictly elementID needs
only to be unique across the global elements involved in each specific
choice, but it was minuted that global uniqueness was desirable
to allow future xs:any support).
<TK>
In XML Schema, an xs:any does not, in general, match all global elements.
The 'namespace' attribute can narrow the set to elements from a specified
list of namespaces. There is no way in XML Schema 1.0 to further narrow
the xs:any. So the rule is designed to ensure that future usage of
xs:any when a single namespace is specified and processContents!='skip'
does not throw up schema definition errors. However, I note that XML
Schema 1.1 allows a new way to narrow the scope of an xs:any (by specifying
a list of not-included QNames). My feeling is that the unique-within-namespace
check is fragile.
</TK>
<SK>
As I recall, we put the uniqueness rules across namespaces to
accommodate chameleon namespaces. Consider a global element E1
in no target namespace, having elementID E1_ID, that
is included in two schemas with different target namespaces, say
TNS1 and TNS2. Consider a choice containing element references
TNS1:E1 and TNS2:E1. In order to disambiguate these elements in the
context of the choice, the elementID has to be unique in the context of
the namespace. This is somewhat of an edge case but can become more
prevalent when support for xs:any is provided.
</SK>
SMH: In the choice, the element refs to TNS1:E1 and TNS2:E1 both
have an elementID string 'E1_ID' from the original E1 global element. In
the choice, this is an error because the elementID is not unique in the
choice (we match the result of the choiceBranchRef expression, which returns
a string not a QName). The only way round this is to override the elementID
on one of the element refs (see 2 above) and set a value that is unique.
That then works. But that does not help the (future) xs:any scenario, where
there is no element ref to carry the override. I think the chameleon namespace
scenario will always cause a problem with xs:any because our elementIDs
are strings not QNames.
I think we should leave a global element uniqueness check out of DFDL 1.0.
It doesn't actually future proof anything, as once I use xs:any the whole
nature of the xsd changes.
4) Spec does not explicitly say that when choiceBranchRef is present each
branch of the choice must have an elementID. This must be the case, as
otherwise a choice branch will never be accessible.
5) Tim has suggested that if an element was silent about elementID, the
local name of the element could be used instead. So conceptually
an element would have an 'effective elementID'. This makes modelling
easier if the 'tag' in the data is the same as the element name.
<TK>...or if the element name is derivable from the 'tag' using a
simple XPath expression</TK>
SMH: True.
The validation checks would need to ensure that the set of 'effective elementIDs'
was unique; for the global element check as currently specified (see 3)
this would mean that all global elements must have unique local names,
unless an elementID is carried - I think this is too limiting.
SMH: While defaulting to the local name sounds attractive, I can't convince
myself that it won't cause problems if we add xs:any in DFDL 2.0 and multiple/chameleon
namespaces are involved.
SMH: Conclusion: For DFDL 1.0 we take the conservative position and say
that you must specify an elementID on an element that is used in a choice
with choiceBranchRef and it must be unique in the context of the choice
only. No global uniqueness check is made.
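The chameleon scenario from point 3 and the conclusion above can be sketched together in Python. `resolve_element_id` and `check_choice` are hypothetical names, and the tuple layout (ref name, elementID on the global element, override on the ref) is made up for illustration; the rules applied are the ones stated here: a ref's elementID overrides the global one, every branch must have an elementID, and IDs must be unique within the choice only.

```python
def resolve_element_id(global_id, ref_override=None):
    """An elementID on an element ref overrides the global element's one."""
    return global_id if ref_override is None else ref_override


def check_choice(branches):
    """branches: list of (ref name, global elementID, override or None).

    Returns error strings; an empty list means the choice is acceptable.
    """
    errors, seen = [], {}
    for ref, global_id, override in branches:
        eid = resolve_element_id(global_id, override)
        if eid is None:
            errors.append(f"{ref}: no elementID")
        elif eid in seen:
            errors.append(f"{ref}: elementID {eid!r} already used by {seen[eid]}")
        else:
            seen[eid] = ref
    return errors


# Chameleon include: both refs inherit 'E1_ID' from the same global element.
clash = [("TNS1:E1", "E1_ID", None), ("TNS2:E1", "E1_ID", None)]
# Overriding the elementID on one element ref resolves the clash.
fixed = [("TNS1:E1", "E1_ID", None), ("TNS2:E1", "E1_ID", "E1_ID_2")]

print(check_choice(clash))
print(check_choice(fixed))
```

Note that nothing in the check looks outside the single choice being validated, so no global uniqueness pass is involved.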
From minutes of 17th April 2012.
145 | Provide a 'dispatch' way of discriminating a choice for better performance of the
envelope/payload use case (Steve, Mike, Suman)
12/7: See minutes. Need to choose a proposal and flesh out.
19/07: Waiting for proposals
26/07: Waiting for proposals
16/08: Waiting for proposals. Suman added to action.
...
1/11: Steve to send a proposal
...
21/03: Steve has sent a proposal. Mike has sent a counter proposal. Steve
to respond.
28/03: Steve has sent a revised proposal. Review for discussion next
week. Ensure proposal handles Mike's scenario where tag value to branch
mapping is not 1-1.
05/04: Discussed Mike's review comments and Suman's concerns. Agreed that
name should be elementID, should be a single DFDL String Literal value,
and that matching of choiceBranchRef expression result should only be against
elementID to avoid QName v String confusion. Steve to recirculate with
a schema example.
17/04: Closed. Discussion on whether the choiceBranchRef expression
should return xs:string or something else. Agreed on xs:string. Discussed
whether elementID should be a pure xs:string or a DFDL String Literal. For
consistency with other DFDL properties it should be a string literal, but
raw byte entities and character classes should be disallowed to avoid complications.
Discussed scope of uniqueness of elementIDs. Agreed that uniqueness is
both local to a choice, and across all global elements in the same namespace
(the latter is not strictly needed right now but accommodates any future
addition of xs:any). Agreed that elementID should be on global element,
local element, and element ref (in which case it overrides any elementID
on the global element, which is ok as the property does not follow the
usual scoping rules). Errata taken. |
Regards
Steve Hanson
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg