
I wrote this complicated regex today and it works in Daffodil.

The question is this: is (?x), which turns on regex free-spacing mode, officially supported in DFDL?

You can see from below that it is VERY desirable that it works.

<xs:simpleType name="frontMatterType">
  <xs:annotation>
    <xs:appinfo source="http://www.ogf.org/dfdl/">
      <dfdl:simpleType lengthKind="pattern" terminator="%FF;">
        <dfdl:property name="lengthPattern"><![CDATA[(?x) # regex free spacing mode
          #
          # match the front matter of the document
          #
          .{1,8192}? # up to 8K of front matter content
          #
          # front matter ends at the first message description page
          #
          (?= # lookahead (followed by but not including...)
            \f # a formfeed character
            (?> \s | \x08 ){1,100}? # whitespace or backspace (x08)
            MESSAGE\ DESCRIPTION\r # this literal text
            \s{1,100}? # up to 100 whitespaces
            -{19}\r # exactly 19 hyphens and a CR
          ) # end lookahead
        ]]></dfdl:property>
      </dfdl:simpleType>
    </xs:appinfo>
  </xs:annotation>
  <xs:restriction base="xs:string" />
</xs:simpleType>

-- Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
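For readers who want to try the pattern outside a DFDL processor, here is a minimal sketch using java.util.regex. It is an illustration only, not Daffodil code: the class name, the helper frontMatterLength, and the sample data are hypothetical, and the comment-only lines of the original pattern are left out for brevity. It assumes lookingAt() is a fair stand-in for how lengthKind="pattern" matches from the current position. Note that in Java "." does not match line terminators by default, so the sample front matter contains none.

```java
import java.util.Collections;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FreeSpacingDemo {
    // The lengthPattern from the message, one pattern line per string.
    // Joining with "\n" matters: in free-spacing mode a '#' comment
    // runs to the end of the line.
    static final String LENGTH_PATTERN = String.join("\n",
        "(?x)                        # regex free spacing mode",
        ".{1,8192}?                  # up to 8K of front matter content",
        "(?=                         # lookahead (followed by but not including...)",
        "  \\f                       # a formfeed character",
        "  (?> \\s | \\x08 ){1,100}? # whitespace or backspace (x08)",
        "  MESSAGE\\ DESCRIPTION\\r  # this literal text",
        "  \\s{1,100}?               # up to 100 whitespaces",
        "  -{19}\\r                  # exactly 19 hyphens and a CR",
        ")                           # end lookahead");

    // Length of the front matter at the start of data, or -1 if the
    // pattern does not match there.
    static int frontMatterLength(String data) {
        Matcher m = Pattern.compile(LENGTH_PATTERN).matcher(data);
        return m.lookingAt() ? m.end() : -1;
    }

    // Hypothetical sample input: front matter, then a formfeed, some
    // whitespace and a backspace, the literal heading, and 19 hyphens.
    static String sampleData() {
        String hyphens = String.join("", Collections.nCopies(19, "-"));
        return "FRONT MATTER" + "\f" + " \b " + "MESSAGE DESCRIPTION\r"
             + "\n  " + hyphens + "\r" + "rest of document";
    }

    public static void main(String[] args) {
        System.out.println(frontMatterLength(sampleData()));
    }
}
```

The lookahead is not consumed, so the reported length covers only the front matter itself.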

To clarify, errata v13 has this in the table for erratum 3.29 in the list of non-portables:

  (?imsx-imsx:X)
  X, as a non-capturing group with the given flags. Note that the flags i,s,m,x are valid, but appending :X to the flag is not.
  Java 7 only

I interpret this as meaning that only the so-called modifier-span notation (the : suffix) is disallowed, and that plain (?x) is still allowed, but I wanted to be sure that was the correct interpretation.

On Wed, Jun 26, 2013 at 1:13 PM, Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:
> I wrote this complicated regex today and it works in Daffodil.
> Question is this. Is the (?x) which turns on regex free-spacing mode, officially supported in DFDL?

-- Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com
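To make the distinction between the two notations concrete, here is a small sketch in java.util.regex. It only shows how the Java engine treats the plain flag (?x) versus the modifier-span form (?x:X); it says nothing about what a given DFDL processor accepts. The class and field names are made up for the example.

```java
import java.util.regex.Pattern;

public class FlagNotationDemo {
    // Plain flag form: (?x) turns free-spacing mode on for the rest
    // of the pattern, so the spaces and the trailing comment are ignored.
    static final Pattern PLAIN_FLAG =
        Pattern.compile("(?x) a b c  # spaces and this comment are ignored");

    // Modifier-span form: the flag applies only inside the group.
    // After the closing parenthesis the space in "d e" is literal again.
    static final Pattern MODIFIER_SPAN =
        Pattern.compile("(?x: a b c )d e");

    public static void main(String[] args) {
        System.out.println(PLAIN_FLAG.matcher("abc").matches());
        System.out.println(PLAIN_FLAG.matcher("a b c").matches());
        System.out.println(MODIFIER_SPAN.matcher("abcd e").matches());
        System.out.println(MODIFIER_SPAN.matcher("abcde").matches());
    }
}
```

Under the interpretation given above, only the second form would fall under the non-portable entry in erratum 3.29.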

Mike,

I believe that is the case, but I have copied Andy Edwards, who is the person in the IBM DFDL team who added our regex support.

Regards

Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 26/06/2013 18:56
Subject: Re: [DFDL-WG] regex free-spacing mode
Sent by: dfdl-wg-bounces@ogf.org

> I interpret this as meaning that only the so-called modifier-span notation (the : suffix) is disallowed, but not just plain (?x), but I wanted to be sure that was the correct interpretation.