All:
Here are the WTX identification/discrimination
behaviours written down 'on paper' with the help of Bob Connolly, our WTX core
engine expert. I had attempted to create a single description of
the concept but ended up splitting the core behaviour from the modelling
options so that some of the wrinkles could be discussed.
The aim here is to share
information on the use of this concept within WTX with the DFDL WG (outside of
IBM) without going into the specifics of the
WTX implementation.
___________________________________________________________________________________________________
WTX use of IDENTIFIERS for distinguishing data
This is a brief description of
the use of WTX identifiers to describe limitations that could be imposed
on the implementation of DFDL discriminators. We will attempt to
word this in terms that are not specific to WTX and are a bit more XML-centric
than those we would normally use to describe WTX processing. This may dilute some
of the specificity of the descriptions – but the spirit of the concepts
is the important thing here.
WHAT IS A DISCRIMINATOR?
Discriminators allow us to evaluate
a data component and determine whether it is 'known to exist'. The distinction
matters because there are situations during parsing where we need
to determine whether some data is an 'invalid instance' of a data component
as opposed to 'not that data component' at all.
This concept has multiple advantages:
it lets us model our data more specifically than we otherwise could,
and it allows us to terminate a branch of speculative parsing
sooner than we otherwise could.
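To make the 'invalid instance' versus 'not that data component' distinction concrete, here is a minimal Python sketch. It is not WTX or DFDL code: the names Outcome, resolve_choice, parse_order and parse_invoice are hypothetical, the "ORD"/"INV" tags are made-up data, and committing to the first branch that is known to exist is an assumption made for illustration.

```python
from enum import Enum
from typing import Callable, Optional, Sequence, Tuple


class Outcome(Enum):
    NOT_THIS_COMPONENT = "known to not exist"
    INVALID_INSTANCE = "known to exist but invalid"
    VALID = "known to exist and valid"


# Hypothetical branch parser: takes the data and returns an Outcome.
BranchParser = Callable[[str], Outcome]


def resolve_choice(
    data: str, branches: Sequence[Tuple[str, BranchParser]]
) -> Tuple[Optional[str], Outcome]:
    """Try each branch in order; stop at the first one known to exist."""
    for name, parse in branches:
        outcome = parse(data)
        if outcome is Outcome.NOT_THIS_COMPONENT:
            continue  # wrong branch: abandon it and keep speculating
        # Known to exist: commit to this branch, valid or not (assumption).
        return name, outcome
    return None, Outcome.NOT_THIS_COMPONENT


def parse_order(data: str) -> Outcome:
    if not data.startswith("ORD"):
        return Outcome.NOT_THIS_COMPONENT
    # Toy validity check: the body after the tag must be numeric.
    return Outcome.VALID if data[3:].isdigit() else Outcome.INVALID_INSTANCE


def parse_invoice(data: str) -> Outcome:
    if not data.startswith("INV"):
        return Outcome.NOT_THIS_COMPONENT
    return Outcome.VALID if data[3:].isdigit() else Outcome.INVALID_INSTANCE


BRANCHES = [("order", parse_order), ("invoice", parse_invoice)]
```

With these toy parsers, resolve_choice("ORDxyz", BRANCHES) reports the order branch as an invalid instance rather than falling through to the invoice branch, which is the behaviour the two-way distinction is meant to enable.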
HOW ARE DISCRIMINATORS USED IN WTX CORE?
The core of WTX uses discriminators
in the parsing of data in the following situations:
1-We are parsing a group which
is identifiable in some other way – for example, it is a group partitioned
by initiator. In this case we functionally have a choice between two or
more groups. So the decision here is which path or partition/choice
group (if any) will be found to exist.
If we don’t find the initiator
in the data then this partition/choice group is not ‘known to exist’.
If we find the initiator in the
data, one of three cases applies:
-If we find the initiator in the
data and we do not have a discriminator on a component of the partition/choice
group then this group is ‘known to exist’
-If we find the initiator in the
data and
if we have a discriminator
set on a component of the partition group and
if the rule on the discriminator
and all rules on earlier components evaluate to true then this group is
'known to exist'
Note: There is no requirement
in WTX that the component on which the identifier attribute is set has
to have a specified rule - all mandatory instances of components have an
implied rule of PRESENT($) in WTX. In this document, when I say that the rule
evaluates to true, there is really more to the concept - we could say that
everything up to that point is 'valid', meaning that all rules, cardinality
constraints and other enforced facets (restrictions, size, presentation
checks) pass. In WTX the lines between validation and parsing are
blurred, so these distinctions may be a bit different in a DFDL implementation.
At minimum we will need to handle this in such a way that we can
specify not only that the rules evaluate to true but also that the cardinality
checks pass. In DFDL we may want to make having a rule a requirement
if we choose to use the currently documented discriminator construct.
-If we find the initiator in the
data and
if we have a discriminator
set on a component of the partition group and
if the rule on the discriminator
evaluates to false or if any rule on a component prior to the component
carrying the discriminator evaluates to false then this group is ‘known
to not exist’
2-We are evaluating a group that
has a discriminator, and we have determined that something is wrong with
the group before we have found the discriminator (or evaluated its rule)
- so we say that the group is ‘known to not exist’.
3-A rule on a component in the
group has evaluated to false and that rule is on or above the component
with the discriminator. This group is ‘known to not exist’.
Note: this is similar to number
2, but number 2 covers other reasons for failure, such as a missing mandatory
component.
4-A rule on a component in the
group has evaluated to true and that rule is on the component with the
discriminator. This group is ‘known to exist’.
Note: since we are processing the
component with the discriminator, the assumption is that we would not be
processing this rule unless all earlier checks and rules
had evaluated to true.
5-A rule on a component in the
group has evaluated to false and that rule is after the component with
the discriminator. The rules on the component with the discriminator and
on the components above it evaluated to true. This group is ‘known to exist’ but
is invalid.
6-All rules on all components in
the group have evaluated to true including the rule on the component with
the discriminator. This group is ‘known to exist’ and is valid.
7-All rules on all components in
the group have evaluated to true – there is no component with a discriminator.
This group is ‘known to exist’ and is valid. Without the identifier
we process the group to the end (last component) before determining that
it is ‘known to exist’.
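The cases above can be summarised in a small Python sketch. This is only an illustration of the decision logic as described in this note, not the WTX engine: Component, classify and the result strings are hypothetical names, and each component's checks (rule, cardinality and other facets) are collapsed into a single pass/fail flag. The result for a failing group that has neither an initiator nor a discriminator is not covered by the text and is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

NOT_KNOWN = "not known to exist"
KNOWN_NOT = "known to not exist"
EXISTS_INVALID = "known to exist, invalid"
EXISTS_VALID = "known to exist, valid"


@dataclass
class Component:
    name: str
    passes: bool  # True if its rule, cardinality and other checks all pass


def classify(
    components: List[Component],
    discriminator_index: Optional[int] = None,
    initiator: Optional[bool] = None,
) -> str:
    """Classify a group once parsing of its components has finished.

    `initiator` is None when the group has no initiator, otherwise
    True or False for found or not found in the data.
    """
    if initiator is False:
        return NOT_KNOWN  # item 1: initiator not in the data
    # Position of the first component whose checks fail, if any.
    first_failure = next(
        (i for i, c in enumerate(components) if not c.passes), None
    )
    if first_failure is None:
        return EXISTS_VALID  # items 6 and 7
    if discriminator_index is None:
        # Item 1: an initiator alone makes the group 'known to exist'.
        # The no-initiator result is an assumption, not from the text.
        return EXISTS_INVALID if initiator else NOT_KNOWN
    if first_failure <= discriminator_index:
        return KNOWN_NOT  # items 2 and 3: failure at or above the discriminator
    return EXISTS_INVALID  # item 5: failure below the discriminator


group = [Component("a", True), Component("b", True), Component("c", False)]
```

For the example group, placing the discriminator on "b" (index 1) gives 'known to exist, invalid', while placing it on "c" (index 2) gives 'known to not exist', which shows why the position of the discriminator is significant.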
HOW ARE DISCRIMINATORS IMPLEMENTED IN WTX MODELS?
WTX imposes a number of limitations on
how identifiers can be expressed in the model.
-Discriminators may only be placed
on the physical representation of a group. That is why we see them
on partition groups and sequence groups but not on choice groups (or unordered
groups – covered below).
In partitioned groups we have a
subtype of each possible group – so each possible group may have a discriminator.
When WTX expresses choice groups
it expresses them as a group containing all of the possible child groups,
so at the top-level ‘choice group’ there is no component of the actual
group content and therefore no use for a discriminator. But each choice, which may
itself be a group, may have a discriminator. Choice groups are special
in that the choice model construct simply lists the components and only
one may occur, so at this level a discriminator on one of the choices may
not be very useful. Inside each choice’s components a discriminator
could be used to indicate the existence of that choice.
-The WTX UI does not allow discriminators
on the components of Unordered Groups. This may be due to the fact
that the position of the discriminator has significance (all rules at or
above the discriminator must evaluate to true). If the group is unordered
this would be difficult to enforce. We will need to discuss this for DFDL.
-A group may have either zero or
one discriminators. No group may have more than one discriminator.
-The discriminator may have two
significant parts
o its location (mandatory). The discriminator is placed on a component
of a group and makes all of the cardinality and rules at that point and
above become part of its concept.
o its rule (optional)
A group with a component which
has a discriminator should have some ‘rule’ associated with it. In WTX
if there is no explicit rule then the implicit rule is ‘PRESENT($)’.
We will need to decide if such implied rules will be allowed in DFDL.
-A group may only have a discriminator
on a mandatory component. Once again, this impacts a choice group, where
by definition all components are optional, so it will not have a discriminator.
This has been an issue of debate
in WTX. We could have implemented checking on optional elements quite easily.
Over the years this has been questioned (as our UI allows discriminators to
be placed on optional elements), but once we explained the way the engine
worked no customers perceived this as a deficiency. In DFDL we will
need to determine if this is needed.
-In WTX we do allow a discriminator
to be placed on a mandatory fixed-size array (a repeating mandatory component
with n:n cardinality). Its component rule can either refer to the
entirety of the array (PRESENT($) meaning the whole of the array is present)
or can call out a specific rule against one of the iterations. This
is not done often in practice.
-In WTX it is common to have multiple
levels of discriminators when we are working with nested groups.
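As a rough illustration of the modelling limitations listed above, the following Python sketch checks a group definition against them. It is not how WTX stores or validates its models: ComponentDef, GroupDef, check_discriminators and the "kind" strings are hypothetical names chosen for this example, and only the restrictions stated in this note are checked.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ComponentDef:
    name: str
    min_occurs: int = 1  # 0 means the component is optional
    max_occurs: int = 1
    discriminator: bool = False
    rule: Optional[str] = None  # None means no explicit rule

    def effective_rule(self) -> str:
        # With no explicit rule, the implied rule is PRESENT($).
        return self.rule if self.rule is not None else "PRESENT($)"


@dataclass
class GroupDef:
    name: str
    kind: str  # "sequence", "choice" or "unordered" (hypothetical labels)
    components: List[ComponentDef] = field(default_factory=list)


def check_discriminators(group: GroupDef) -> List[str]:
    """Return a list of violations of the limitations described above."""
    errors = []
    marked = [c for c in group.components if c.discriminator]
    if len(marked) > 1:
        errors.append("a group may have at most one discriminator")
    if marked and group.kind in ("choice", "unordered"):
        errors.append(f"no discriminator on components of a {group.kind} group")
    for c in marked:
        if c.min_occurs < 1:
            errors.append(f"{c.name}: discriminator needs a mandatory component")
    return errors


header = GroupDef(
    "Header",
    "sequence",
    [ComponentDef("tag", discriminator=True), ComponentDef("note", min_occurs=0)],
)
```

Here check_discriminators(header) returns an empty list, and header.components[0].effective_rule() falls back to the implied PRESENT($) rule because no explicit rule was given.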
Stephanie Fetzer
WebSphere Common Transformation
Industry Packs - Software Engineer