dfdl-wg
November 2012
- 5 participants
- 41 discussions

01 Nov '12
From DFDL WG call 2012-10-30:
Action 190: Clarify rules for assert/discriminator testKind 'pattern' (All)
23/10: Need to be clear on data position and whether it is just for text
representations.
30/10: Closed. To comply with the timing rules being proposed in action
186, where these things are executed first before a 'format' annotation,
the data position must be the beginning of the representation (note: a
warning is useful when alignment is present). As these things can be used on
various objects, the only rule regarding text is that dfdl:encoding must
have a value in scope. Errata taken.
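
A minimal sketch of the closed rule, in Python rather than DFDL, for illustration only. The function name pattern_test and its parameters are hypothetical and not taken from the spec or from any DFDL implementation. It assumes the test simply decodes the bytes from the start of the representation using the in-scope encoding and matches the regular expression at that position; how decode errors are treated is also an assumption here.

```python
import re
from typing import Optional


def pattern_test(data: bytes, start: int, test_pattern: str,
                 encoding: Optional[str]) -> bool:
    """Match test_pattern against data, beginning at byte offset start.

    start stands for the beginning of the representation of the
    annotated object. encoding stands for the dfdl:encoding value in
    scope; None models "no encoding in scope".
    """
    if encoding is None:
        # The only text-related rule recorded in the minutes: an
        # encoding must be in scope, whatever the representation is.
        raise ValueError("testKind 'pattern' needs a dfdl:encoding value in scope")
    # Assumption: undecodable bytes are replaced rather than rejected.
    text = data[start:].decode(encoding, errors="replace")
    # re.match anchors the match at the start of the decoded text.
    return re.match(test_pattern, text) is not None
```

For example, pattern_test(b"\x00\x01:50K:/IT60", 2, r":50K:", "ascii") matches, while the same call with start 0 does not, because the match must begin at the stated data position.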
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: Tim Kimber/UK/IBM@IBMGB,
Cc: dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date: 16/10/2012 17:47
Subject: Re: [DFDL-WG] question/clarification - asserts using test
patterns
Agree that testKind 'pattern' is all about matching the physical data, and
not logical value.
I am thinking that we should impose a similar restriction on testKind
'pattern' to that imposed on lengthKind 'pattern', specifically (from the
latest spec draft):
Any element (complex or simple type) may have a dfdl:lengthKind of
'pattern' as long as the data in the content region (which can be either
the SimpleContent region or the ComplexContent region defined in Section
9.2) of the element is legal in the stated encoding of that element. Data
which satisfies this is referred to as scannable data.
Specifically, data is scannable and length kind ‘pattern’ can be used only
for:
o elements of simple type with representation 'text'
o elements of complex type where:
1. all simple child elements must have representation 'text' and have
the same encoding as the parent complex element, and
2. all complex child elements must themselves follow 1 and 2
(recursively).
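
The recursive rule quoted above can be sketched as a small check, written in Python for illustration only. The Element class and the is_scannable function are hypothetical names, not part of the spec or of any DFDL processor, and the sketch reads the rule literally: a complex child is checked against its own encoding, not its parent's.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    name: str
    encoding: str
    # 'text' or 'binary' for a simple element; None marks a complex element.
    representation: Optional[str] = None
    children: List["Element"] = field(default_factory=list)


def is_scannable(elem: Element) -> bool:
    """Return True if lengthKind 'pattern' could be used on elem."""
    if elem.representation is not None:
        # Simple element: must have text representation.
        return elem.representation == "text"
    for child in elem.children:
        if child.representation is not None:
            # Rule 1: simple children are text and share the parent's encoding.
            if child.representation != "text" or child.encoding != elem.encoding:
                return False
        elif not is_scannable(child):
            # Rule 2: complex children must satisfy rules 1 and 2 themselves.
            return False
    return True
```

A complex element with a single binary descendant anywhere below it fails the check, which is the case the restriction was meant to exclude.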
This restriction was added because of problems trying to apply patterns to
binary data. I don't think it will restrict the utility of testKind
'pattern' in practice. What do you think?
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh(a)uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM@IBMGB
To: dfdl-wg(a)ogf.org, dfdl-wg-bounces(a)ogf.org
Date: 15/10/2012 20:54
Subject: Re: [DFDL-WG] question/clarification - asserts using test
patterns
Sent by: dfdl-wg-bounces(a)ogf.org
Good questions.
The intent of this DFDL feature is as follows:
- a point of uncertainty (POI) cannot be resolved using an initiator (the
simplest option). The data format just doesn't work that way.
- it could be resolved using an assert or a discriminator, but that would
be too heavyweight.
- a simple inspection of the data format reveals that the discrimination
can be done by testing the first few characters of each branch of the POI.
Example: SWIFT 50K ( multi-line address field):
:32A:060929EUR25,36<CR><LF>
:33B:EUR56,78<CR><LF>
:50K:/IT60X0542811101000000123456
ABC Corporation Times Square 7 NY 1
LINE 2
LINE 3
:52A:/<etc>
Note that field 50K contains lines of address data, but the actual number
of lines is not known. So how will the DFDL parser know when the 50K field
has completed? Answer: it encounters a line that starts with a colon.
Now, the most natural way to model SWIFT field 50K is as a series of
lines. The SWIFT XML format defines it this way.
If you work through the possibilities, it turns out that the only way to
achieve this using discriminators is:
- cause the parser to parse each line and put it into the info set
- add a discriminator to the repeating 'addressLine' element. The DFDL
expression would be something like this:
{ if ( fn:exists(./NameAddress_Line) ) then
(fn:not(fn:starts-with(./NameAddress_Line, ':'))) else xs:boolean("true")
}
That's a very expensive way to achieve the intended goal, which is 'treat
the data as another addressLine if the next character is not a colon'.
So that was the motivation for the feature.
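
The cheaper approach can be sketched as follows, in Python rather than DFDL and for illustration only. The function parse_address_lines and the pattern [^:] are assumptions made for this sketch, not taken from the SWIFT model or the spec. The idea is to peek at the next character before parsing each line, and to stop as soon as that character is a colon, without first putting the line into the infoset.

```python
import re

# Hypothetical test pattern: there is a next character and it is not a colon.
ADDRESS_LINE_TEST = re.compile(r"[^:]")


def parse_address_lines(data: str, pos: int = 0):
    """Collect CRLF-terminated lines from pos until a line starts with ':'.

    Returns the list of lines and the position where parsing stopped.
    """
    lines = []
    # The pattern is tested at the current position before each line is
    # parsed, which plays the role of the pattern discriminator.
    while pos < len(data) and ADDRESS_LINE_TEST.match(data, pos):
        end = data.find("\r\n", pos)
        if end == -1:
            # Last line has no terminator: take the rest of the data.
            lines.append(data[pos:])
            pos = len(data)
        else:
            lines.append(data[pos:end])
            pos = end + 2
    return lines, pos
```

Applied to the address lines of the 50K example followed by ":52A:/", the loop collects three lines and stops at the colon that begins field 52A.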
To answer the questions:
- not intended to be limited to xs:string only
- not intended to be limited to elements with text representation
(because dfdl:representation only applies to simple elements, and the POI
might be a group or an element.)
- is intended to be matched against text or binary data, starting at the
POI's byte offset. If the element's representation is binary then the
'encoding' property will be required.
Sounds as if the spec needs to be clarified in this area.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert(a)uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl(a)gmail.com>
To: dfdl-wg(a)ogf.org,
Date: 15/10/2012 19:26
Subject: [DFDL-WG] question/clarification - asserts using test
patterns
Sent by: dfdl-wg-bounces(a)ogf.org
Question: Is an assertion using a regular expression pattern allowed on
(a) xs:string type elements
(b) any data with text representation
(c) any data with text or binary representation
and, does the regular expression apply to the representation or the
logical data value?
(a) is the only case that is not ambiguous, because the representation and
the logical value are the same thing.
For everything else, there’s the question of whether the test is on the
representation or the logical value. If it's the logical value, then how
is a regex made meaningful on a logical value that is, for example, a
number, without defining a canonical representation to which the logical
value is converted?
If it's to apply to the representation, then exactly what data (e.g., what
grammar region) is subject to the regex?
...mikeb
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg(a)ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU