Or, assuming you have dfdl:separator '%SP;'
on the parent sequence, change that to '%SP;%WSP*;'
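
A minimal sketch of what that change might look like, assuming a parent sequence along these lines. The element names (ReceivedDate, DayName, Month, Year) and the comma terminator are hypothetical, and a real schema would also need a dfdl:format block supplying encoding, representation and the other required properties:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Hypothetical parent element; all names here are assumptions. -->
  <xsd:element name="ReceivedDate">
    <xsd:complexType>
      <!-- Separator: one space, then zero or more further whitespace characters -->
      <xsd:sequence dfdl:separator="%SP;%WSP*;" dfdl:separatorPosition="infix">
        <xsd:element name="DayName" type="xsd:string"
                     dfdl:lengthKind="delimited" dfdl:terminator=","/>
        <xsd:element name="Day" type="xsd:unsignedInt" dfdl:lengthKind="delimited"/>
        <xsd:element name="Month" type="xsd:string" dfdl:lengthKind="delimited"/>
        <xsd:element name="Year" type="xsd:unsignedInt" dfdl:lengthKind="delimited"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With the separator absorbing the optional extra whitespace, the Day element itself would presumably not need an initiator.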
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF
DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Steve Hanson/UK/IBM
To: "Garriss Jr., James P." <jgarriss@mitre.org>
Cc: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>, dfdl-wg-bounces@ogf.org
Date: 13/06/2013 13:50
Subject: Re: [DFDL-WG] optional whitespace entity
You could use dfdl:textTrimKind 'padChar',
dfdl:textStringPadCharacter '%SP;' and dfdl:textStringJustification 'right',
and trim off the excess space. I'm guessing that your pad char is
a '0' right now but the '0' is harmless.
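
A rough sketch of that idea, assuming the Day element keeps an explicit length of 2 characters. Because Day is an xsd:unsignedInt, this sketch uses the number counterparts (dfdl:textNumberPadCharacter and dfdl:textNumberJustification) of the string properties named above; that substitution is an assumption on my part, and the other required properties are omitted:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- Fixed-width Day: a leading space is treated as padding and trimmed on parse. -->
  <xsd:element name="Day" type="xsd:unsignedInt"
               dfdl:lengthKind="explicit" dfdl:length="2"
               dfdl:lengthUnits="characters"
               dfdl:textTrimKind="padChar"
               dfdl:textNumberPadCharacter="%SP;"
               dfdl:textNumberJustification="right"/>
</xsd:schema>
```

This only addresses the two-character, space-padded form; a day written as a single digit with no padding would still need separate handling.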
Regards
Steve Hanson
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 13/06/2013 13:20
Subject: Re: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
> can you show us the scenario where you want to apply this?
Sure.
Consider the Date header.
Note that the day of the month is *always* 2 digits (that is, it’s 04
instead of just 4):
Date: Fri, 04 Feb 2013 08:54:52 -0500
Now consider the Received header, which finishes with a date. Sometimes
the day of the month is 2 digits, when the day is 10 or higher:
Received: by mail-wi0-f178.google.com with SMTP id hj6so339193wib.11
        for <jgarriss@mitre.org>; Thu, 30 May 2013 22:28:57 -0700 (PDT)
Sometimes it is 2 characters, but instead of a leading 0 (like the Date
header above), there is a blank space preceding the day. If you look
closely at this example, you will see that there are 2 spaces between
“Tue,” and “4 Jun”:
Received: from 131.28.34.56 ([131.28.34.56]) by
        VFOHMLAO03.Enterprise.afmc.ds.af.mil ([131.28.34.43]) via Exchange
        Front-End Server webmail.afmc.af.mil ([131.28.34.85]) with Microsoft
        Exchange Server HTTP-DAV ; Tue,  4 Jun 2013 18:02:13 +0000
And sometimes it is 1 digit:
Received: from smtpksrv1.mitre.org (129.83.31.51) by IMCCAS03.MITRE.ORG
        (129.83.29.80) with Microsoft SMTP Server id 14.2.342.3;
        Tue, 4 Jun 2013 09:31:47 -0400
The problem is how to model the Day element when it’s part of the Received
header. If the length is merely set to 2 characters, then the value can be
“ 4”, and Daffodil complains that it can’t convert that into an integer.
So I settled on this:
<!-- Day is set to delimited instead of explicit (w/ length = 2) b/c the date
     in Received can be 1 character -->
<xsd:element name="Day" dfdl:lengthKind="delimited" dfdl:initiator="%WSP*;">
  <xsd:annotation>
    <xsd:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
      <dfdl:assert test="{ dfdl:checkConstraints(.) }"
                   message="There cannot be more than 31 days in a month"/>
    </xsd:appinfo>
  </xsd:annotation>
  <xsd:simpleType>
    <xsd:restriction base="xsd:unsignedInt">
      <xsd:maxInclusive value="31"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
(I have a different version
of the Day element for the Date header.)
I started this thread because
I wanted to make sure that this initiator would work whether the extra
space is present or absent. BTW, happy ending: it works as
expected in Daffodil 0.10.
> I am wondering whether %WSP*; on its own should be allowed as a delimiter?
If not, then please provide some solution for the above problem.
From: Steve Hanson [mailto:smh@uk.ibm.com]
Sent: Thursday, June 13, 2013 5:30 AM
To: Garriss Jr., James P.
Cc: dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org
Subject: Re: [DFDL-WG] optional whitespace entity
James, please can you show us the scenario
where you want to apply this?
I ask because I think this is the only case in DFDL where you can specify
a delimiter and yet have nothing in the data for that delimiter. I suspect
this might have some ramifications for things like delimiter scanning and
dfdl:initiatedContent. It clearly solves a problem for James, but I am
wondering whether %WSP*; on its own should be allowed as a delimiter?
Regards
Steve Hanson
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/06/2013 16:19
Subject: Re: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
You know, Tim, that is what I meant, but I just copied the syntax directly
from Table 4. Upon further review, I see that this table doesn’t
give the complete syntax. I wonder how many other people will just
cut-and-paste directly from these tables. It might be a good idea
to put the complete syntax there.
In any case, thanks for the explanation. That’s what I hoped it
would do.
From: dfdl-wg-bounces@ogf.org
[mailto:dfdl-wg-bounces@ogf.org]
On Behalf Of Tim Kimber
Sent: Tuesday, June 11, 2013 11:02 AM
To: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] optional whitespace entity
I assume that you meant to write dfdl:initiator="%WSP*;"
That will match zero or more whitespace characters. It will match and consume
any leading white space before the element, and it will never fail to match.
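
For comparison, here is a short sketch of the three whitespace entity forms as I understand them from the DFDL 1.0 specification. The element names DayA, DayB and DayC are hypothetical, and the other required properties are omitted:

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
  <!-- %WSP;  matches exactly one whitespace character -->
  <xsd:element name="DayA" type="xsd:unsignedInt"
               dfdl:lengthKind="delimited" dfdl:initiator="%WSP;"/>
  <!-- %WSP+; matches one or more whitespace characters -->
  <xsd:element name="DayB" type="xsd:unsignedInt"
               dfdl:lengthKind="delimited" dfdl:initiator="%WSP+;"/>
  <!-- %WSP*; matches zero or more, so it also matches when no whitespace is present -->
  <xsd:element name="DayC" type="xsd:unsignedInt"
               dfdl:lengthKind="delimited" dfdl:initiator="%WSP*;"/>
</xsd:schema>
```

Note the leading % and trailing ; in each case; without them the value is treated as literal text.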
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert@uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: "Garriss Jr., James P." <jgarriss@mitre.org>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 11/06/2013 15:48
Subject: [DFDL-WG] optional whitespace entity
Sent by: dfdl-wg-bounces@ogf.org
<xsd:element name="Day" dfdl:lengthKind="delimited"
dfdl:initiator="WSP*">
Will this element match if the initiator is not found (that is, even if
there is not a space before the element)? TIA
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU