> assuming you have dfdl:separator '%SP;' on the parent sequence, change that to '%SP;%WSP*;'
This would not be an option, because the other elements do not display this bizarre “extra space” behavior. Just the Day element.
Thanks for the ideas.
From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Thursday, June 13, 2013 11:00 AM
To: Garriss Jr., James P.
Cc: dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] optional whitespace entity
Or, assuming you have dfdl:separator '%SP;' on the parent sequence, change that to '%SP;%WSP*;'
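A minimal sketch of that change, assuming the date fields sit in a space-separated sequence. The child element names and types are hypothetical, since the actual parent sequence was not posted, and the xsd and dfdl prefixes are assumed to be bound as in the schema under discussion.

```xml
<!-- Sketch only: the children below are hypothetical placeholders. -->
<!-- Before: dfdl:separator="%SP;" (exactly one space between children). -->
<!-- After: one space followed by zero or more further whitespace characters. -->
<xsd:sequence dfdl:separator="%SP;%WSP*;" dfdl:separatorPosition="infix">
  <xsd:element name="Day" type="xsd:unsignedInt" dfdl:lengthKind="delimited"/>
  <xsd:element name="Month" type="xsd:string" dfdl:lengthKind="delimited"/>
  <xsd:element name="Year" type="xsd:unsignedInt" dfdl:lengthKind="delimited"/>
</xsd:sequence>
```

Because the separator is declared on the sequence, the relaxed match applies between every pair of children, not only in front of Day.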
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: "Garriss Jr., James P." <jgarriss@mitre.org>,
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, dfdl-wg-bounces@ogf.org
Date: 13/06/2013 13:50
Subject: Re: [DFDL-WG] optional whitespace entity
You could use dfdl:textTrimKind 'padChar', dfdl:textStringPadCharacter '%SP;' and dfdl:textStringJustification 'right', and trim off the excess space. I'm guessing that your pad char is a '0' right now, but the '0' is harmless.
Regards
Steve Hanson
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>,
Date: 13/06/2013 13:20
Subject: Re: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
> can you show us the scenario where you want to apply this?
Sure.
Consider the Date header. Note that the day of the month is *always* 2 digits (that is, it’s 04 instead of just 4):
Date: Fri, 04 Feb 2013 08:54:52 -0500
Now consider the Received header, which finishes with a date. Sometimes the day of the month is 2 digits, when the day is 10 or higher:
Received: by mail-wi0-f178.google.com with SMTP id hj6so339193wib.11
for <jgarriss@mitre.org>; Thu, 30 May 2013 22:28:57 -0700 (PDT)
Sometimes it is 2 characters, but instead of a leading 0 (like the Date header above), there is a blank space preceding the day. If you look closely at this example, you will see that there are 2 spaces between “Tue,” and “4 Jun”:
Received: from 131.28.34.56 ([131.28.34.56]) by
VFOHMLAO03.Enterprise.afmc.ds.af.mil ([131.28.34.43]) via Exchange Front-End
Server webmail.afmc.af.mil ([131.28.34.85]) with Microsoft Exchange Server
HTTP-DAV ; Tue,  4 Jun 2013 18:02:13 +0000
And sometimes it is 1 digit:
Received: from smtpksrv1.mitre.org (129.83.31.51) by IMCCAS03.MITRE.ORG
(129.83.29.80) with Microsoft SMTP Server id 14.2.342.3; Tue, 4 Jun 2013
09:31:47 -0400
The problem is how to model the Day element when it’s part of the Received header. If the length is simply set to 2 characters, then the value can be “ 4”, and Daffodil complains that it can’t convert that into an integer. So I settled on this:
<!-- Day is set to delimited instead of explicit (w/ length = 2) b/c the date in Received can be 1 character -->
<xsd:element name="Day" dfdl:lengthKind="delimited"
             dfdl:initiator="%WSP*;">
  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <dfdl:assert test="{ dfdl:checkConstraints(.) }"
                   message="There cannot be more than 31 days in a month"/>
    </xsd:appinfo>
  </xsd:annotation>
  <xsd:simpleType>
    <xsd:restriction base="xsd:unsignedInt">
      <xsd:maxInclusive value="31"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
(I have a different version of the Day element for the Date header.)
I started this thread because I wanted to make sure that this initiator would work whether the extra space is present or absent. BTW, happy ending: it works as expected in Daffodil 0.10.
> I am wondering whether %WSP*; on its own should be allowed as a delimiter?
If not, then please provide some solution for the above problem.
From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Thursday, June 13, 2013 5:30 AM
To: Garriss Jr., James P.
Cc: dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] optional whitespace entity
James, please can you show us the scenario where you want to apply this?
I ask because I think it is the only example in DFDL where you can specify a DFDL delimiter and yet have nothing for that delimiter in the data. I suspect this might have some ramifications for things like delimiter scanning and dfdl:initiatedContent.
It clearly solves a problem for James, but I am wondering whether %WSP*; on its own should be allowed as a delimiter?
Regards
Steve Hanson
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/06/2013 16:19
Subject: Re: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
You know, Tim, that is what I meant, but I just copied the syntax directly from Table 4. Upon further review, I see that this table doesn’t give the complete syntax. I wonder how many other people will just cut-and-paste directly from these tables. It might be a good idea to put the complete syntax there.
In any case, thanks for the explanation. That’s what I hoped it would do.
From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Tim Kimber
Sent: Tuesday, June 11, 2013 11:02 AM
To: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] optional whitespace entity
I assume that you meant to write dfdl:initiator="%WSP*;"
That will match zero or more whitespace characters. It will match and consume any leading white space before the element, and it will never fail to match.
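As a rough illustration of the difference, here is a sketch of the three DFDL whitespace entities used as an initiator. The element names DayExactlyOne, DayOneOrMore and DayOptional are hypothetical, and the xsd and dfdl prefixes are assumed to be bound as in the schema under discussion.

```xml
<!-- Sketch only: element names are hypothetical. -->

<!-- %WSP; : exactly one whitespace character must precede the value. -->
<xsd:element name="DayExactlyOne" type="xsd:unsignedInt"
             dfdl:lengthKind="delimited" dfdl:initiator="%WSP;"/>

<!-- %WSP+; : one or more whitespace characters must precede the value. -->
<xsd:element name="DayOneOrMore" type="xsd:unsignedInt"
             dfdl:lengthKind="delimited" dfdl:initiator="%WSP+;"/>

<!-- %WSP*; : zero or more whitespace characters, so the match cannot fail. -->
<xsd:element name="DayOptional" type="xsd:unsignedInt"
             dfdl:lengthKind="delimited" dfdl:initiator="%WSP*;"/>
```

Only the last form accepts a value with no leading whitespace at all.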
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/06/2013 15:48
Subject: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
<xsd:element name="Day" dfdl:lengthKind="delimited" dfdl:initiator="WSP*">
Will this element match if the initiator is not found (that is, even if there is no space before the element)? TIA
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU