
I agree with Tim's opinion, but I would add that this is *NOT* the default behavior of the Java regex library we're currently using in Daffodil. I believe one must prefix every regex with (?s) to get the non-default behavior, in which '.' also matches line endings.
On Wed, Nov 14, 2012 at 11:15 AM, Tim Kimber <KIMBERT@uk.ibm.com> wrote:
I would vote for this feature to be switched off by default in DFDL processors. It is mainly useful when dealing with lines of text, but DFDL formats are not always lines of text. So to be 100% clear, I think the '.' wildcard should match all characters, including line endings.
regards,
Tim Kimber, DFDL Team, Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 14/11/2012 12:53
Subject: [DFDL-WG] Clarification needed: regular expressions - does '.' match newlines by default?
Sent by: dfdl-wg-bounces@ogf.org
------------------------------
A key behavior distinction in regular expressions is whether the '.' wildcard matches line endings or not.
Regular expression libraries can usually be configured either way, typically through some sort of expression modifier: so that '.' does not match a line ending, or so that it does.
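To illustrate the two configurations, here is a minimal sketch using Java's java.util.regex, picked only as one example of such a library. The class name DotAllDemo and the helper matchesAll are hypothetical names made up for this illustration. It compares the default behavior with the inline (?s) modifier and the equivalent Pattern.DOTALL flag.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not taken from any DFDL implementation.
class DotAllDemo {
    // Returns true if the entire input matches the regex under the given flags.
    static boolean matchesAll(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "line1\nline2";

        // Default configuration: '.' does not match the line terminator.
        System.out.println(matchesAll("line1.line2", 0, input));              // false

        // Inline modifier (?s): '.' matches any character, including '\n'.
        System.out.println(matchesAll("(?s)line1.line2", 0, input));          // true

        // Same effect, requested through a compile-time flag.
        System.out.println(matchesAll("line1.line2", Pattern.DOTALL, input)); // true
    }
}
```

Other regex libraries expose the same switch under different names, so the default would need to be pinned down per dialect.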
The question is: how is '.' configured by default in DFDL regular expressions?
This is part of the overall issue of tightening up regular expressions in DFDL, i.e., what exactly is the regex dialect, and how is it configured by default?
...mike
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
-- Mike Beckerle | OGF DFDL WG Co-Chair Tel: 781-330-0412