dfdl-wg
December 2014
- 3 participants
- 25 discussions
Suggest adding this wording to the end of section 23.4 as an additional
note.
5. NameTest - These QNames are path steps that refer to elements in the
DFDL infoset. If such an element is in a namespace, then the NameTest QName
must have a prefix which is bound to the namespace. Specifically, any
default namespace is not used to implicitly qualify these NameTest QNames.
This behavior is consistent with XPath expression usage in XML Schema
[footnote to: Definitive XML Schema (Walmsley, ISBN 0-13-065567-8) page
390, Section 17.8, Table 17-6 says "A child element-type name which must be
prefixed if it is in a namespace".] such as in the xpath attribute of the
xs:selector and xs:field elements within xs:key and xs:unique constraints, and
in related XML standards such as XSLT. Note however, that this behavior is
different from the way QNames are used in other places in XML and DFDL
Schemas such as the ref property of an element reference, or the dfdl:ref
property of a DFDL format annotation. There a QName with no prefix must
always be referring to a global declaration or definition, and so is
augmented with the default namespace when needed.
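The same rule can be seen outside DFDL. As a rough illustration only, the sketch below uses Python's standard xml.etree.ElementTree, whose path steps also ignore a document's default namespace unless a prefix mapping is supplied. The element names, the namespace URI and the "ex" prefix are made up for the example and do not come from the DFDL spec.

```python
import xml.etree.ElementTree as ET

# A document whose elements are all in a default namespace (hypothetical URI).
DOC = '<record xmlns="http://example.com/ns"><len>3</len><data>abc</data></record>'

# Prefix binding used by the path expression; "ex" is an arbitrary choice.
NS = {"ex": "http://example.com/ns"}

root = ET.fromstring(DOC)

# Unprefixed step: looks for "len" in no namespace, so nothing matches.
unprefixed = root.find("len")

# Prefixed step: the prefix is bound to the namespace, so the element is found.
prefixed = root.find("ex:len", NS)

print(unprefixed)      # None
print(prefixed.text)   # 3
```

The unprefixed step finds nothing even though the document declares a default namespace, which is the behaviour the note describes for NameTest QNames.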
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
<http://www.ogf.org/About/abt_policies.php>

Re: [DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a discriminator scenario)
by Steve Hanson 02 Dec '14
Action 248 closed with no change in behaviour.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
Cc: DFDL-WG <dfdl-wg(a)ogf.org>
Date: 26/11/2014 17:43
Subject: Re: [DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a
discriminator scenario)
The EDIFACT schemas on GitHub and elsewhere use a couple of discriminators
that exploit the current behaviour.
In EDIFACT, an Interchange is a UNA, a UNB, either one or more Functional
Groups or one or more Messages, and a UNZ.
A Functional Group is a UNG, one or more Messages, and a UNE.
A Message is a UNH, a bunch of other segments, and a UNT.
Here's an edited copy to illustrate. The elements in blue (FunctionGroup,
Message) are the 1..unbounded elements. The elements in green (UNG, UNH)
have a complex type that contains a discriminator fn:true() once the
initiator for the element has been found.
Example parse: Let's say my Interchange has two functional groups. The
parser enters the choice in red (the xsd:choice directly inside
Interchange). It tries to parse the FunctionGroup
branch. It finds a UNG and its discriminator is true. That resolves the
choice branch (because FunctionGroup minOccurs is '1') and so stops the
parser from trying the other branch if a failure occurs. The next time
round the loop the UNG discriminator is again true. That resolves the
optional occurrence of the FunctionGroup. Same deal for the Message
branch of the choice with its UNH.
(Note that when parsing Message within FunctionGroup, the first time round
the Message loop the UNH discriminator has no effect as there is no PoU in
scope. Other times round it resolves the optional occurrences of Message).
<xsd:element name="Interchange">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element dfdl:initiator="UNA" dfdl:length="6"
          dfdl:terminator="%WSP*;" minOccurs="0" name="UNA" type="srv:UNA"/>
      <xsd:element dfdl:initiator="UNB"
          dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNB"
          type="srv:UNB-InterchangeHeader"/>
      <!-- Content is either Functional Groups or independent Messages,
           never a mixture -->
      <xsd:choice>
        <xsd:element maxOccurs="unbounded" name="FunctionGroup"
            dfdl:occursCountKind="implicit">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element dfdl:initiator="UNG"
                  dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNG"
                  type="srv:UNG-GroupHeader"/>
              <xsd:element maxOccurs="unbounded" ref="D03B:Message"
                  dfdl:occursCountKind="implicit"/>
              <xsd:element dfdl:initiator="UNE"
                  dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNE"
                  type="srv:UNE-GroupTrailer"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:element maxOccurs="unbounded" ref="D03B:Message"
            dfdl:occursCountKind="implicit"/>
      </xsd:choice>
      <xsd:element dfdl:initiator="UNZ"
          dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNZ"
          type="srv:UNZ-InterchangeTrailer"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
<xsd:element name="Message">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element dfdl:initiator="UNH"
          dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNH"
          type="srv:UNH-MessageHeader"/>
      <xsd:choice>
        ....
      </xsd:choice>
      <xsd:element dfdl:initiator="UNT"
          dfdl:ref="ibmEdiFmt:EDISegmentFormat" name="UNT"
          type="srv:UNT-MessageTrailer"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
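The parse walk-through above can be modelled as a small sketch. It assumes, as the example describes, that a discriminator which evaluates to true resolves the nearest enclosing point of uncertainty that is an actual PoU. The function name and the (name, is_actual) pairs are invented for illustration and are not part of any DFDL implementation.

```python
def resolved_pou(enclosing):
    """Return the name of the PoU that a true discriminator resolves.

    `enclosing` lists the unresolved potential PoUs around the
    discriminator, outermost first, as (name, is_actual) pairs.
    The innermost *actual* PoU is the one that gets resolved.
    """
    for name, is_actual in reversed(enclosing):
        if is_actual:
            return name
    return None  # no PoU in scope: the discriminator has no effect


# First FunctionGroup: minOccurs is 1, so occurrence 1 is not an actual PoU
# and the UNG discriminator resolves the choice branch instead.
first_group = resolved_pou([("choice branch", True), ("FunctionGroup[1]", False)])

# Second FunctionGroup: the choice branch is already resolved, and the
# optional occurrence is itself an actual PoU.
second_group = resolved_pou([("FunctionGroup[2]", True)])

# First Message inside a FunctionGroup that has already been resolved:
# the required occurrence is not an actual PoU and nothing else is in scope.
first_message = resolved_pou([("Message[1]", False)])

print(first_group, second_group, first_message)
```

The three calls correspond to the three cases in the walk-through: the first UNG, the second UNG, and the first UNH inside a FunctionGroup.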
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: DFDL-WG <dfdl-wg(a)ogf.org>
Date: 25/11/2014 17:37
Subject: Re: [DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a
discriminator scenario)
As mentioned on the call, this is one way of dealing with the situation
where an array has 'implicit' OCK and minOccurs >= 1, but also needs a
discriminator for the optional elements.
I suggested on the call that this baggage be in a hidden group, but as
there are no elements involved, I think a hidden group is not advisable
here.
<xs:element name="a" dfdl:occursCountKind="implicit"
    minOccurs="1" maxOccurs="unbounded">
  <xs:complexType>
    <xs:sequence>
      <!-- This choice is DFDL's way of expressing this logic: -->
      <!-- IF the occursIndex is for the optional part of the array -->
      <!-- THEN evaluate the array-element discriminator -->
      <!-- ELSE don't evaluate discriminator. -->
      <xs:choice>
        <xs:sequence>
          <xs:annotation><xs:appinfo ...>
            <!-- IF occursIndex gt 1.... -->
            <dfdl:discriminator>{ dfdl:occursIndex() gt 1 }</dfdl:discriminator>
            <!-- THEN discriminate the optional array elements -->
            <dfdl:discriminator>{ ....optional array element discriminator... }</dfdl:discriminator>
          </xs:appinfo></xs:annotation>
        </xs:sequence>
        <xs:sequence>
          <!-- ELSE this is the occursIndex eq 1 case, we have no discriminator -->
          <!-- for the array element, since it is required. -->
        </xs:sequence>
      </xs:choice>
      .... array content goes here...
    </xs:sequence>
  </xs:complexType>
</xs:element>
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
On Tue, Nov 25, 2014 at 10:50 AM, Steve Hanson <smh(a)uk.ibm.com> wrote:
I think some of your wording changes have changed my intent, which was
that all arrays are potential PoUs. The table now says that fixed,
expression and stopValue are not potential PoUs, which implies that the
discriminator never acts on the array but always on a higher PoU. I was
trying to avoid this, because it means that changing OCK can change the
behaviour of the schema. But I guess it's no different to changing the
array to a scalar, which would have the same effect.
Regarding the failure of the discriminator. The intent was it should
behave just like any assert failure or processing error. But I think your
point is then right - it means that the phrase 'a discriminator only ever
resolves that point of uncertainty' should actually be 'a discriminator
only ever positively resolves that point of uncertainty' - which is an
asymmetric behaviour. Are we comfortable with that?
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: DFDL-WG <dfdl-wg(a)ogf.org>
Date: 25/11/2014 15:35
Subject: Re: [DFDL-WG] Fw: Fw: Action 248 (was Thoughts on a
discriminator scenario)
My suggested additional wording in Red below. There is an issue with this
where it was unclear to me whether we've defined exactly what happens.
If you have say, an array with occursCountKind 'implicit', minOccurs '1',
and the discriminator on the element evaluates to false for that required
first element, what happens? Do we fail the whole array? This sounds
contradictory to the notion that the discriminator "only resolves that
element". But having the discriminator be ignored doesn't seem right
either.
...mike
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
On Mon, Nov 17, 2014 at 8:07 AM, Steve Hanson <smh(a)uk.ibm.com> wrote:
This action was raised because of concern with the behaviour of the
discriminator in the following example. Because OCK is 'implicit' the 1st
occurrence is not an actual PoU but the other 9 occurrences are. This
means that for 1st occurrence, the discriminator actually acts on a higher
PoU if one exists.
<xs:element name="Type1" maxOccurs="10"
    dfdl:occursCountKind="implicit">
  <dfdl:discriminator test="{fn:exists(A)}" />
  <xs:complexType>
    <xs:sequence>
      <xs:element name="A" dfdl:initiator="A:" ... />
      <xs:element name="B" dfdl:initiator="B:" ... />
      <xs:element name="C" dfdl:initiator="C:" ... />
    </xs:sequence>
  </xs:complexType>
</xs:element>
This led to the suggestion that a discriminator should not 'leak' beyond a
potential PoU, regardless of whether it is an actual PoU. The argument for
this is contained in the thread below, and on re-reading I still think it
is the best solution to this, so that is what I propose.
There were also issues about the wording in section 9.3.3.
Sections 9.3.3 and 7.4 are reproduced below, and updated to address the
wording and leaking issues.
-------------------------------------------------
9.3.3 Points of Uncertainty
A point of uncertainty occurs when parsing a schema component when an
occurrence of that schema component might not be the next item encountered
in the data stream. Points of uncertainty can be nested.
Any one of the following schema constructs is a potential point of
uncertainty:
· A branch of xs:choice
· All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
'unordered')
· An optional xs:element
· An array xs:element.
· All xs:elements in an xs:sequence containing one or more floating
xs:elements.
The parser resolves these points of uncertainty by way of a set of
construct-specific rules given below along with determining whether schema
components are known-to-exist or known-not-to-exist. For some of these
constructs, there are situations where while there is the potential for
uncertainty, the circumstances are such that there isn't any actual
uncertainty; hence, potential points of uncertainty are distinguished from
actual points of uncertainty below.
A branch of xs:choice is always an actual point of uncertainty. A choice
is resolved sequentially, or by direct dispatch. Sequential choice
resolution occurs by parsing each choice branch in schema definition order
until one is known-to-exist. It is a processing error if none of the
choice branches are known-to-exist. Direct-dispatch choice resolution
occurs by matching the value of the dfdl:choiceDispatchKey property to the
value of the dfdl:choiceBranchKey property of one of the choice
branches. It is a processing error if none of the choice branches have a
matching value in their dfdl:choiceBranchKey property.
An element in an unordered xs:sequence is always an actual point of
uncertainty. It is resolved by parsing for the child components of the
sequence in schema definition order at each point in the data stream where
a component can exist until the required number of occurrences of each
child component is known-to-exist or the sequence is terminated by
delimiters or specified length.
An element in a sequence with one or more floating elements is always an
actual point of uncertainty. It is resolved by parsing for the expected
element at that point in the data stream. If the expected element is
known-not-to-exist then an occurrence of each floating element is parsed
in schema definition order.
When parsing an array, points of uncertainty only occur for certain values
of occursCountKind, as follows:
occursCountKind  Details of Potential and Actual Points of Uncertainty
---------------  -----------------------------------------------------
fixed            No potential point of uncertainty (maxOccurs occurrences
                 expected).
implicit         All occurrences are potential points of uncertainty. An
                 actual point of uncertainty exists after minOccurs
                 occurrences have been found and until maxOccurs occurrences
                 have been found.
parsed           All occurrences are actual points of uncertainty.
expression       No potential point of uncertainty (dfdl:occursCount
                 occurrences expected).
stopValue        No potential point of uncertainty (the stopValue must
                 always be present, even when minOccurs is 0).
Table 11: Points of Uncertainty and dfdl:occursCountKind
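As a reading aid, the table can be written as a small function. This is only a sketch of the proposed wording above, not spec text. The function name, its parameters and the 1-based occurrence index are assumptions made for the example.

```python
# occursCountKind values for which the table lists no potential PoU.
NO_POU = {"fixed", "expression", "stopValue"}


def classify_occurrence(ock, index, min_occurs, max_occurs=None):
    """Return (potential, actual) for the occurrence at 1-based `index`.

    `max_occurs=None` stands for maxOccurs="unbounded".
    """
    if ock in NO_POU:
        return (False, False)
    if ock == "parsed":
        return (True, True)
    if ock == "implicit":
        within_max = max_occurs is None or index <= max_occurs
        # Actual uncertainty only once minOccurs occurrences have been found.
        return (True, index > min_occurs and within_max)
    raise ValueError(f"unknown occursCountKind: {ock!r}")


# The Type1 example from this message: 'implicit', minOccurs 1 (the XSD
# default), maxOccurs 10.
type1 = [classify_occurrence("implicit", i, 1, 10) for i in range(1, 11)]
actual_count = sum(1 for _, actual in type1 if actual)

print(type1[0])       # (True, False)
print(actual_count)   # 9
```

For Type1 this reproduces the observation that the 1st occurrence is not an actual PoU but the other 9 occurrences are.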
An optional element point of uncertainty is resolved by parsing the
element until it is either known-to-exist or known-not-to-exist. Whether
an optional element is an actual point of uncertainty depends on property
dfdl:occursCountKind as described above. (Property dfdl:occursCountKind is
defined in Section 16.1 dfdl:occursCountKind property.)
For an array element, the point of uncertainty is resolved for each
occurrence separately by parsing the occurrence until it is either
known-to-exist or known-not-to-exist.
Discriminators resolve potential points of uncertainty. A discriminator
defined on, or contained by, a schema construct that is a potential point
of uncertainty, will only ever resolve that point of uncertainty. This
holds regardless of whether there is any actual uncertainty.
For example, if a discriminator is defined on an array element which is
contained within the branch of a choice, the discriminator will only
resolve the existence of occurrences of the array element, and never the
existence of the occurrence of the choice branch. As another example,
consider an array element with dfdl:occursCountKind 'implicit' and
minOccurs '1'. The first element of such an array must exist, so there is
no actual uncertainty. A discriminator on such an element is redundant,
but often must be expressed so as to discriminate the existence of the
second and any subsequent array elements. If a discriminator evaluates to
'false' or causes a processing error on a potential point of uncertainty
where there is no actual uncertainty, ..... TBD
(I think this causes a processing error which will fail the whole
array.....but that sounds like it contradicts the statement above that
says "it only ever resolves that point of uncertainty" ?)
------------------------------------
7.4 The dfdl:discriminator Statement Annotation Element
DFDL discriminators are used during parsing to resolve points of
uncertainty that cannot be resolved by speculative parsing. Discriminators
are not used during unparsing. They can also be used to force a
resolution earlier during the parsing of a group so that subsequent
parsing errors are treated as processing errors of a known component
rather than a failure to find a component.
A discriminator determines the existence or non-existence of a component.
If the discriminator is successful then the component is known to exist
and any subsequent errors will not cause backtracking at points of
uncertainty. If a discriminator is unsuccessful then the component is
known not to exist and backtracking occurs immediately.
If the complex type of an element contains a sequence group as its content
model then if the sequence group is known not to exist, then the element
is known not to exist.
Examples of dfdl:discriminator annotation are below :
<dfdl:discriminator>
{ ../recType eq 0 }
</dfdl:discriminator>
<dfdl:discriminator test="{ ../recType eq 0}" />
When the discriminator's expression evaluates to "false", then it causes a
processing error, and the discriminator is said to fail.
A discriminator defined on, or contained by, a schema construct that is a
potential point of uncertainty, will only ever resolve that point of
uncertainty.
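A minimal sketch of the non-leaking rule in the paragraph above, contrasted with letting the discriminator act on a higher PoU. It only models a discriminator that evaluates to true. The function, the `leak` flag and the (name, is_actual) pairs are illustrative assumptions, not spec text.

```python
def discriminator_target(enclosing, leak):
    """Name of the PoU resolved by a true discriminator, or None if no effect.

    `enclosing`: potential PoUs around the discriminator, outermost first,
    as (name, is_actual) pairs.
    leak=False: only the innermost potential PoU is ever considered.
    leak=True:  the search continues outwards to the nearest actual PoU.
    """
    for name, is_actual in reversed(enclosing):
        if is_actual:
            return name
        if not leak:
            return None
    return None


# Array with occursCountKind 'implicit' and minOccurs 1 inside a choice branch.
first = [("choice branch", True), ("array occurrence 1", False)]
second = [("choice branch", True), ("array occurrence 2", True)]

proposed_first = discriminator_target(first, leak=False)
leaking_first = discriminator_target(first, leak=True)
proposed_second = discriminator_target(second, leak=False)

print(proposed_first)    # None
print(leaking_first)     # choice branch
print(proposed_second)   # array occurrence 2
```

The two rules differ only for the required first occurrence: without leaking the discriminator has no effect there, with leaking it resolves the enclosing choice branch.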
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 17/11/2014 11:54 -----
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB
Cc: dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date: 15/05/2014 10:48
Subject: Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
discriminator scenario)
Tim - I've responded to your specific comments below in blue font.
All - You will see that I have some concerns over the words used in the
definition of a PoU, as we seem to be unclear as to whether a PoU is a
point in the data stream or a point in the model. I am wondering whether
the concepts of 'potential PoU' and 'actual PoU' can be better expressed
as 'PoU in the model' and 'PoU in the data'. I want to mull this over for
a while. I'm not changing the rules by this, just how we express them.
So please let me run with this before replying.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg(a)ogf.org,
Date: 14/05/2014 23:31
Subject: Re: [DFDL-WG] Fw: Action 248 (was Thoughts on a
discriminator scenario)
Sent by: dfdl-wg-bounces(a)ogf.org
I agree that the wording is not easy to get right. However, I think the
current wording needs some adjustment so I'm going to make some
suggestions and see where it leads.
"A point of uncertainty occurs in the data stream when there is more than
one schema component
that might occur at that point."
I don't think this is precise enough.
SMH: Agree. I need to think about this sentence. There are several things
potentially wrong. It is defining a PoU as occurring in the data stream,
whereas elsewhere PoU is equated to a position in the model. It says 'more
than one schema component that might occur' - maybe it should say 'a
schema component may or may not occur'. And schema components don't occur
in the data stream anyway - occurrences of them do.
- if an optional element occurs at the end of the input data then there is
only *one* schema component that might occur at that point. The end of the
data stream might occur instead.
SMH: Yes, but I raised some similar arguments earlier in the thread,
about the last branch of a choice not being a PoU, or the last element in
an unordered sequence when all the others had been found not being a PoU.
We agreed that these are still all treated as PoUs for clarity. This is
another example.
- if an optional element occurs before the last required element in a
sequence AND the separatorSuppressionPolicy is not 'anyEmpty' then there
is exactly one schema component that can occur at that point in the data
stream. But it might be 'empty', in which case it will not be put into the
info set.
This is not pedantry. The parser will never need to backtrack in either of
these cases and in the second case it is obvious in advance which schema
component the parser should select for parsing.
SMH: We have agreed in the past that the presence of a separator is not
enough to infer 'known-to-exist', so separators should not be brought into
this definition. You are right that in a positional sequence the parser is
looking for an occurrence of a component or its empty rep, and never an
occurrence of the next schema component, so the parser can certainly
optimise here. Let's take any discussion of separators out of this for the
moment, and raise a separate action if needed.
Points of uncertainty can be nested.
Any one of the following constructs is a potential point of uncertainty:
1. An xs:choice
2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
'unordered')
3. An optional xs:element
4. An array xs:element.
5. All xs:elements in an xs:sequence containing one or more floating
xs:elements.
1. should say 'A member of an xs:choice' because it is the member, not the
group itself, that is the point of uncertainty. I think the confusion has
arisen because only one member of a choice group can exist in the data. So
if any member exists, it automatically ends any speculation about the
content of the choice group. But I insist that the real point of
uncertainty is the member. A choice group is always 'known to exist'
because according to DFDL rules it must have minOccurs=maxOccurs=1. FWIW,
I have no problem with talking about 'resolving a choice', provided that
we define that as 'Determining which member of a choice group ( if any )
is known to exist in the data'.
SMH: I agree that it should say member.
2. Should say 'All members of an unordered xs:sequence' to keep the
language consistent with 1. The section on unordered groups clearly
restricts members to elements only.
SMH: No. Using 'xs:element' is consistent with optionals & arrays in 3 and
4, which are also always elements. so xs:element is more consistent.
3. See above - an optional element is not always a 'point of uncertainty'
according to the literal definition that we are currently using.
SMH: Right, but the bullets are defining potential PoUs, so it is correct
as it stands.
4. Should say 'An optional occurrence of an array element, unless the
separator properties make it a positional array and the occurrence is
required in the data'
SMH: No. All occurrences can be PoUs, it depends on OCK. And separators do
not resolve PoUs as noted. This definition 4 is the one that is key for
Action 248, which is ultimately what led to this discussion and what needs
to be resolved. The question is whether 4 should say a) all arrays are
potential PoUs as it does now, or b) just some arrays are potential PoUs
depending on OCK. Whatever we choose, a discriminator within that array
must not leak beyond the array as explained below in bold red font. I
think a) is clearer and we can then make a general statement about
discriminators not leaking outside of any potential PoU. If we adopt b)
then we need a separate statement about discriminators and arrays, which
seems more bitty.
5. Should say 'All members...' for consistency.
SMH: See 2.
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert(a)uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: ,
Date: 13/05/2014 10:28
Subject: [DFDL-WG] Fw: Action 248 (was Thoughts on a discriminator
scenario)
Sent by: dfdl-wg-bounces(a)ogf.org
This will be discussed on today's call. Please have a position on the
paragraph below that ends 'What do others think?'
Thanks
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 13/05/2014 10:19 -----
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB,
Cc: dfdl-wg(a)ogf.org
Date: 30/04/2014 12:25
Subject: Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
Tim
Responses below.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg(a)ogf.org,
Date: 11/04/2014 14:03
Subject: Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
Sent by: dfdl-wg-bounces(a)ogf.org
"2. If a potential point of uncertainty is sometimes an actual point of
uncertainty (ock 'implicit') then a discriminator that applies to it will
only ever resolve, or have no effect on, that point of uncertainty. It
never has an effect on any enclosing point of uncertainty."
This could be misinterpreted. The discriminator could evaluate to 'false'
and thus cause the POU to be resolved negatively (the component would be
'known not to exist').
SMH: Agree, and I can improve the words here.
1. and 3. will both apply if an element with ock='fixed' appears as a
choice branch. Is the POU always an actual POU or never?
SMH: No. There are two independent points of uncertainty, the choice
branch and the array.
The wording of 3. reads very strangely. 'If a potential point of
uncertainty is never an actual point of uncertainty' begs the question
'why is it even a potential point of uncertainty?'. The current wording
follows from our definition of the term 'point of uncertainty':
"A point of uncertainty occurs in the data stream when there is more than
one schema component
that might occur at that point." Points of uncertainty can be nested.
Any one of the following constructs is a potential point of uncertainty:
1. An xs:choice
2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind is
'unordered')
3. An optional xs:element
4. An array xs:element.
5. All xs:elements in an xs:sequence containing one or more floating
xs:elements.
I think this definition is too broad. It forces us to discuss potential
POUs that will never be actual POUs according to the first sentence.
SMH: Yes it does read a bit strangely, but there's a reason for this. If
we said that ock 'fixed', 'expression' or 'stopValue' are never POUs then
what does it mean if a discriminator is placed on such an element? A
discriminator gets evaluated for each occurrence of an array. For that
reason we can not let a discriminator within an array leak beyond the
array - regardless of whether it is a POU or not - otherwise what does
that mean to enclosing POUs? So even if we said that ock 'fixed',
'expression' or 'stopValue' are never POUs we would still need the spec to
state that a discriminator never leaks beyond them. I think it is clearer
to say that a discriminator never leaks beyond a potential POU and keep
the existing definition. What do others think?
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert(a)uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg(a)ogf.org,
Date: 11/04/2014 11:44
Subject: Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
Sent by: dfdl-wg-bounces(a)ogf.org
248
Discriminators and potential points of uncertainty (Steve)
28/1: Steve to write up a proposal to prevent a discriminator from
behaving in a non-obvious manner when used with a potential point of
uncertainty that turns out not to be an actual point of uncertainty.
5/2: Steve sent an email to check whether choice branches, unordered
elements and floating elements should always be actual points of
uncertainty, as there are times when there is no uncertainty, eg, last
choice branch; all floating elements found. It was decided that they are
always actual points of uncertainty. To do otherwise will complicate
implementations and result in fragile schemas. Steve will proceed with the
proposal on that basis.
Based on the above, which reflects the email discussion below, here is
what I propose to resolve this action.
1. If a potential point of uncertainty is always an actual point of
uncertainty (choice branch, element in unordered sequence, floating
element, ock 'parsed') then a discriminator that applies to it will only
ever resolve that point of uncertainty. It never has an effect on any
enclosing point of uncertainty.
2. If a potential point of uncertainty is sometimes an actual point
of uncertainty (ock 'implicit') then a discriminator that applies to it will
only ever resolve, or have no effect on, that point of uncertainty. It
never has an effect on any enclosing point of uncertainty.
3. If a potential point of uncertainty is never an actual point of
uncertainty (ock 'fixed', 'expression', 'stopValue') then a discriminator
that applies to it will never have an effect on that point of uncertainty.
Nor does it ever have an effect on any enclosing point of uncertainty.
I think 1 and 2 are not controversial, but there is an alternative for 3:
3. If a potential point of uncertainty is never an actual point of
uncertainty (ock 'fixed', 'expression', 'stopValue') then a discriminator
that applies to it will never have an effect on that point of uncertainty.
Instead the discriminator is applied to any enclosing point of
uncertainty.
The alternative means that changing an element from (say) ock 'parsed' to
ock 'expression' has the same effect on a discriminator as changing the
element to (1,1). The discriminator that applied to it now applies to any
enclosing pou.
SMH: Afternote: The alternative does not work for the reason given in my
reply to Tim above.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB,
Cc: dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date: 05/02/2014 12:04
Subject: Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
Thanks Tim, all good points. Comments to your comments.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM
To: Steve Hanson/UK/IBM@IBMGB,
Cc: dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date: 05/02/2014 11:01
Subject: Re: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
A couple of comments below.
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert(a)uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM@IBMGB
To: dfdl-wg(a)ogf.org,
Date: 05/02/2014 10:50
Subject: [DFDL-WG] Action 248 (was Thoughts on a discriminator
scenario)
Sent by: dfdl-wg-bounces(a)ogf.org
248
Discriminators and potential points of uncertainty (Steve)
28/1: Steve to write up a proposal to prevent a discriminator from
behaving in a non-obvious manner when used with a potential point of
uncertainty that turns out not to be an actual point of uncertainty.
5/2: With Steve
I started on this by reading section 9.3.3 on points of uncertainty, which
lists the potential PoUs. Here's the list to save getting the spec out.
1. An xs:choice branch
2. All xs:elements in an unordered xs:sequence (dfdl:sequenceKind
is 'unordered')
3. An optional xs:element
4. An array xs:element
5. All xs:elements in an xs:sequence containing one or more
floating xs:elements.
The section then looks at each in turn and gives the circumstances when it
is an actual PoU or not. As currently written, it is only 3 and 4 where a
potential PoU might not be an actual PoU. For 1, 2 and 5 it says they are
always actual PoUs.
But I'm not sure that's correct. A deeper analysis of what is actually
going on with 1, 2 and 5 says to me that there are times when there might
not be an actual PoU.
1. Given that there is no concept in DFDL of optional choice branches,
then if the last branch is reached then there is no longer a PoU. It must
be that branch else it is a processing error.
TK: I think of it slightly differently. It is a PoU, even if the branch is
the only remaining branch. If we say that the final choice branch is not a
PoU then diagnostics become confused - the parser reports the error code
as 'error while parsing root/choice/lastBranch/field1' when the correct
error code would be 'none of the branches of root/choice were found in the
data'.
SMH: I see your point. My thinking was that choices have finite branches
and a choice is (1,1). If I have got to the last branch then I am not one
of the other branches so I must be this one. If there is any other
possibility then the model is missing a branch, even if it is just one
that contains an empty sequence with an assert {fn:false()}. In practice
of course users forget to add that last branch (there's no XSDL equivalent
to the 'default' branch of a switch/case statement), so yes they could end
up with an unclear diagnostic.
2. There can come a point in an unordered sequence when all that can be
encountered is one element, and if that is (1,1) then there is no longer a
PoU.
TK: It's still a PoU. The specification says that occursCountKind is
'parsed' for all members of an unordered group, so min/maxOccurs do not
come into play.
SMH: Interesting. The spec says that if a member is optional or an array
then it must be 'parsed'. If it is (1,1) though it does not have an
occursCountKind. The specific case I was thinking of is when all members
are (1,1), so when you have one element to go there is no PoU. However,
the rewrite into a repeating choice has the effect of making everything
'parsed', which is really the point you are making. So I agree with you,
it is easier to say that everything is an actual PoU else it complicates
the rewrite semantic.
5. If all floating elements are (1,1) and all are encountered, then from
that point on there are no longer any PoUs due to floating elements.
TK: I suspect that floating elements are somewhat like unordered branches
- most users will not want min/maxOccurs to affect the parsing of the
group. Schema validation (or more complex validation applied in the
receiving application) will deal with non-conformances.
SMH: Possibly yes. With something like X12 NTE segments, that is the case.
But we don't express the floating semantic as a rewrite of the whole
sequence like we do for unordered, it's more of a per element thing. And
if that is done dynamically as we go through the sequence, having no PoU
can result.
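The explicit "catch-all" last branch that Steve mentions under item 1 (an
empty sequence carrying an assert of {fn:false()}) might look roughly like
the sketch below. The branch names, initiators and message text are
assumptions made for illustration; the thread does not give a concrete
schema for this case.

```xml
<xs:choice xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Hypothetical real branches. -->
  <xs:element name="branchA" type="xs:string" dfdl:initiator="A:" />
  <xs:element name="branchB" type="xs:string" dfdl:initiator="B:" />
  <!-- Catch-all branch: an empty sequence whose assert always fails, so a
       failure here is attributed to the choice as a whole rather than to a
       field inside the last real branch. -->
  <xs:sequence>
    <xs:annotation>
      <xs:appinfo source="http://www.ogf.org/dfdl/">
        <dfdl:assert test="{ fn:false() }"
                     message="None of the branches of the choice were found in the data" />
      </xs:appinfo>
    </xs:annotation>
  </xs:sequence>
</xs:choice>
```

This plays the role of the 'default' branch of a switch/case statement that
the thread notes XSDL lacks.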
I'd like us to get straight on this before I proceed with the action
proper.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 05/02/2014 10:12 -----
From: Steve Hanson/UK/IBM
To: dfdl-wg(a)ogf.org,
Date: 27/01/2014 17:39
Subject: Fw: Thoughts on a discriminator scenario
Been thinking some more on the discriminator scenario below that I mailed
out before xmas, and discussing it with the IBM DFDL team.
The 'confusing' aspect of the behaviour is that a discriminator within a
potential PoU will act on a higher level PoU if the potential PoU is not
an actual PoU. In the example, the array element 'Type1' is not an actual
PoU for occurrence 1, only for occurrences 2+. So when the discriminator
fires for occurrence 1 it will resolve a higher level unresolved PoU if
one exists.
Perhaps the spec should say that a discriminator can't 'leak' beyond the
potential PoU that encloses it? If so, then for occurrence 1 the
discriminator has no effect, and only has an effect for occurrences 2+.
This makes for more predictable and robust schemas.
We'd need to go through spec section 9.3.3 carefully to see if this does
not break any of the potential PoUs that are listed.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 16/01/2014 09:55 -----
From: Steve Hanson/UK/IBM
To: dfdl-wg(a)ogf.org,
Date: 20/12/2013 13:20
Subject: Thoughts on a discriminator scenario
Take the following schema (simplified) for element Type1 (1,10) being a
loop for elements A,B,C. Type1 does not have an initiator so I need to
use a discriminator to establish the existence of an occurrence of Type1
so that incorrect backtracking does not occur after an error. Because
occursCountKind is 'implicit', the 1st occurrence is not a point of
uncertainty so the discriminator acts instead on any enclosing point of
uncertainty, but for 2nd and subsequent occurrences it acts on Type1.
That is all working as designed, but I think users will find the 1st
occurrence behaviour a bit confusing. There are workarounds to avoid the
problem, e.g. use occursCountKind 'parsed' or split Type1 into two as (1,1)
and (0,9). I think this is worth documenting in a tutorial as this is
quite subtle stuff.
<xs:element name="Type1" maxOccurs="10" dfdl:occursCountKind="implicit">
  <dfdl:discriminator test="{fn:exists(A)}" />
  <xs:complexType>
    <xs:sequence>
      <xs:element name="A" dfdl:initiator="A:" ... />
      <xs:element name="B" dfdl:initiator="B:" ... />
      <xs:element name="C" dfdl:initiator="C:" ... />
    </xs:sequence>
  </xs:complexType>
</xs:element>
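For comparison, the first workaround mentioned above (occursCountKind
'parsed') could be sketched as follows. This is a minimal illustration, not
the schema from the original scenario: the xs:string types and the
annotation wrapper are filled in as assumptions so that the fragment is
well-formed, and other format properties are omitted.

```xml
<xs:element name="Type1" maxOccurs="10" dfdl:occursCountKind="parsed"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- With 'parsed', the intent is that every occurrence, including the
           first, is a point of uncertainty for this discriminator to resolve. -->
      <dfdl:discriminator test="{ fn:exists(A) }" />
    </xs:appinfo>
  </xs:annotation>
  <xs:complexType>
    <xs:sequence>
      <!-- Types are placeholders; the original elides these details. -->
      <xs:element name="A" type="xs:string" dfdl:initiator="A:" />
      <xs:element name="B" type="xs:string" dfdl:initiator="B:" />
      <xs:element name="C" type="xs:string" dfdl:initiator="C:" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```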
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg(a)ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg

Fw: [Editor - document #166] (author action) Data Format Description Language (DFDL) v1.0 Experience Document 3
by Steve Hanson 02 Dec '14
Need to discuss this on today's call. There are a couple of public
comments against the MIL-STD-2045 document that need to be processed.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 02/12/2014 12:27 -----
From: redmine(a)ogf.org
To: undisclosed-recipients:;
Date: 25/11/2014 17:06
Subject: [Editor - document #166] (author action) Data Format
Description Language (DFDL) v1.0 Experience Document 3
Issue #166 has been updated by Greg Newby.
Status changed from public comment to author action
Assignee changed from Greg Newby to Andre Merzky
Authors/editors: Public comment is complete. Please respond to comments
in the tracker, or this tracker, or via updates to your document. Once
completed, set the tracker back to Greg Newby, and we'll proceed with the
next step.
Greg
----------------------------------------
document #166: Data Format Description Language (DFDL) v1.0 Experience
Document 3
https://redmine.ogf.org/issues/166#change-654
Author: Steve Hanson
Status: author action
Priority: Normal
Assignee: Andre Merzky
Category:
Target version:
Document Type: Experimental
This document provides experience information to the OGF community on the
original Data Format Description Language (DFDL) 1.0 specification
(GFD-P-R.174).
Modeling a MIL-STD-2045 header in DFDL v1.0 is not possible without the
addition of new capabilities for specifying bit order and non-standard
encodings. There are many related military-standard binary data formats
which are similar, and so cannot be modeled in DFDL. This document
describes the new properties and property values that are required to
successfully model this format.
All resulting errata have been incorporated into a revised Data Format
Description Language (DFDL) 1.0 specification (GFD-P-R.207) which
obsoletes GFD-P-R.174.
--
You have received this notification because you have either subscribed to
it, or are involved in its topic.
To change your notification preferences, please click here:
http://redmine.ogf.org/my/account
Forwarding to WG. Agenda item on today's call.
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
----- Forwarded by Steve Hanson/UK/IBM on 02/12/2014 10:11 -----
From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
Cc: Alex Wood1/UK/IBM@IBMGB
Date: 28/11/2014 16:45
Subject: Re: [DFDL-WG] dfdl github - mil-std-2045 schema updated
Mike
Some more. The hiddenGroupRefs revealed another bug in IBM DFDL (not
surprising as we don't support them yet but have a lot of code in the
editor to display hidden groups nonetheless). Fixing that caused some more
validation to take place which revealed some more errors. I've added them
below.
I also notice that your XPath comparisons are using '=' rather than 'eq',
that is, a general comparison rather than a value comparison. My reading
of section 23.4 of the DFDL spec is that DFDL expressions do not support
general comparisons.
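A minimal sketch of the distinction, using invented element names (header,
version, body) rather than anything from the mil-std-2045 schema: the
general comparison is shown only in a comment, and the value comparison is
the form that, on the reading of section 23.4 given above, DFDL expressions
would accept.

```xml
<xs:element name="body" type="xs:string"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- General comparison (the form being questioned):
             test="{ ../header/version = 1 }"
           Value comparison (XPath 2.0 'eq'): -->
      <dfdl:assert test="{ ../header/version eq 1 }" />
    </xs:appinfo>
  </xs:annotation>
</xs:element>
```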
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
Date: 28/11/2014 15:26
Subject: Re: [DFDL-WG] dfdl github - mil-std-2045 schema updated
Mike
Taken a quick look at these, the following properties need adding to
defineFormat "thesePropertiesShouldNotMatter" to stop errors:
truncateSpecifiedLengthString="no"
textPadKind="none"
binaryNumberCheckPolicy="lax"
fillByte="%#r00;"
textBidi="no"
floating="no"
choiceLengthKind="implicit"
useNilForDefault="no"
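Placed in the named format, those additions would look roughly like this.
Only the eight properties listed above are shown; whatever properties the
existing defineFormat already carries are assumed to stay as they are.

```xml
<dfdl:defineFormat name="thesePropertiesShouldNotMatter"
                   xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Existing properties of this format are omitted from the sketch. -->
  <dfdl:format
      truncateSpecifiedLengthString="no"
      textPadKind="none"
      binaryNumberCheckPolicy="lax"
      fillByte="%#r00;"
      textBidi="no"
      floating="no"
      choiceLengthKind="implicit"
      useNilForDefault="no" />
</dfdl:defineFormat>
```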
I am seeing several instances of the error below. It occurs because you
are using lengthUnits 'bits' with xs:nonNegativeInteger (your common type
tBigIntField). I don't believe the spec allows this.
CTDV1532E : DFDL property 'lengthUnits' can only be 'bits' if the
representation is binary and the type is boolean, byte, unsignedByte,
short, unsignedShort, int, unsignedInt, long or unsignedLong. Element:
#xmlns(p="urn:milstd2045DFDL")xscd(/type::p:future_use_group_type/model::sequence/schemaElement::future_use_group_data).
There are several things not yet supported by IBM DFDL, which we will get
to eventually, but fyi they are:
hiddenGroupRef
encodingErrorPolicy 'replace'
bitOrder
encoding 'US-ASCII-7-bit-packed'
Plus a bug in IBM DFDL was revealed where if a1.xsd in tns 'a' includes
a2.xsd in tns 'a' to pull in defineFormat name='xxx' then dfdl:format
ref="tns:xxx" is not resolving. It works if a2.xsd does not have a tns
(chameleon include), and it works if a2.xsd is tns 'b' (import).
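The failing arrangement can be sketched as two files. The namespace URI
'urn:a' stands in for tns 'a', and the single property inside the format is
a placeholder; neither is taken from the actual schemas.

```xml
<!-- a2.xsd: same target namespace as a1.xsd, defines format 'xxx' -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:tns="urn:a" targetNamespace="urn:a">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:defineFormat name="xxx">
        <dfdl:format textBidi="no" />
      </dfdl:defineFormat>
    </xs:appinfo>
  </xs:annotation>
</xs:schema>

<!-- a1.xsd: includes a2.xsd and refers to the format by QName -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
           xmlns:tns="urn:a" targetNamespace="urn:a">
  <xs:include schemaLocation="a2.xsd" />
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <!-- This is the reference reported as not resolving. -->
      <dfdl:format ref="tns:xxx" />
    </xs:appinfo>
  </xs:annotation>
</xs:schema>
```

Dropping targetNamespace from a2.xsd gives the chameleon include case, and
giving a2.xsd a different namespace and using xs:import gives the import
case; both are reported above as working.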
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
To: "dfdl-wg(a)ogf.org" <dfdl-wg(a)ogf.org>
Date: 19/11/2014 23:48
Subject: [DFDL-WG] dfdl github - mil-std-2045 schema updated
Sent by: dfdl-wg-bounces(a)ogf.org
I finally got around to enhancing the mil-std-2045 schema using the ideas
that came out of review of the original work.
It is vastly improved in terms of complexity of the schema, and other
schemas that have to generate this sort of thing are greatly simplified by
the techniques illustrated here which avoid the need to generate any
top-level group definitions.
I did have to put in the workaround of using "WSP*" in a terminator
instead of "ES" because Daffodil doesn't support ES in terminators yet
(there's a bug).
I also left out the defaults, which are needed for unparsing, due to a bug
in Daffodil.
https://github.com/DFDLSchemas/mil-std-2045
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
Please find agenda for call on Redmine at
http://redmine.ogf.org/dmsf_files/13379?download=
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848