Yes, that sounds like the best way to do it.
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: Steve Hanson/UK/IBM@IBMGB
Cc: dfdl-wg@ogf.org, dfdl-wg-bounces@ogf.org
Date: 03/12/2012 13:35
Subject: Re: [DFDL-WG] Puzzle: unparsing format involving lengthKind='pattern'
Actually, that helps.
I think that, plus an inputValueCalc/outputValueCalc pair will fix it.
The inputValueCalc strips any trailing NUL, the outputValueCalc adds one
if the length is less than 64.
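The strip/append rule can be sketched outside DFDL to check that it round-trips. This is only an illustration in Python, not DFDL syntax; the names input_value, output_value, NUL and MAX_LEN are made up for the example, and it assumes the raw string has already been extracted with any NUL still attached.

```python
NUL = "\x00"   # assumed single-character early-end-of-data marker
MAX_LEN = 64   # strings of exactly this length carry no NUL

def input_value(raw: str) -> str:
    """Parse direction: drop one trailing NUL if the raw string has it."""
    return raw[:-1] if raw.endswith(NUL) else raw

def output_value(value: str) -> str:
    """Unparse direction: append a NUL only when the string is shorter than 64."""
    if len(value) > MAX_LEN:
        raise ValueError("string longer than 64 characters")
    return value + NUL if len(value) < MAX_LEN else value

# Round trip for a short string and for a full-length one.
short_raw = output_value("abc")        # NUL appended
full_raw = output_value("x" * 64)      # left unchanged
short_back = input_value(short_raw)    # NUL stripped again
full_back = input_value(full_raw)      # nothing to strip
```

The application only ever sees the stripped value; the NUL exists only in the raw form.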
On Mon, Dec 3, 2012 at 7:21 AM, Steve Hanson <smh@uk.ibm.com> wrote:
Afraid not. The problem is that the NUL is not a terminator in the DFDL
sense of the word (i.e., mandatory) but is an early-end-of-data indicator.
I can't think of an elegant way to handle this, so I would simply model
the data as a single string with a dfdl:lengthPattern that consumes either
0-63 chars plus NUL or 64 chars. This puts the NUL in the infoset and puts
the onus on the user to trim the NUL when reading the infoset, and supply
the NUL when creating the infoset.
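As a rough illustration of such a pattern, here is a sketch using Python's re module rather than a DFDL processor. The exact regex spelling is an assumption (it is not given in this thread), and PATTERN_WITH_NUL and consume are hypothetical names.

```python
import re

# Assumed spelling: 0-63 non-NUL characters plus the NUL, or exactly 64 non-NUL characters.
PATTERN_WITH_NUL = re.compile(rb"[^\x00]{0,63}\x00|[^\x00]{64}")

def consume(data: bytes) -> bytes:
    """Return the bytes the pattern consumes from the start of data."""
    m = PATTERN_WITH_NUL.match(data)
    if m is None:
        raise ValueError("data does not start with a valid string")
    return m.group(0)

with_nul = consume(b"abc\x00rest")          # the NUL is part of the match
full_length = consume(b"x" * 64 + b"rest")  # 64 characters, no NUL
```

Because the NUL is inside the match, it ends up in the element value, which is why the user has to trim it on reading and supply it on writing.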
Regards
Steve Hanson
From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org
Date: 30/11/2012 21:02
Subject: [DFDL-WG] Puzzle: unparsing format involving lengthKind='pattern'
Sent by: dfdl-wg-bounces@ogf.org
I have a file format where strings follow an unusual termination discipline.
A string is either exactly 64 characters long, or it is 0 to 63 characters
long followed by a NUL terminator.
I can parse this like so:
<xs:complexType name="myStringType">
  <xs:sequence>
    <xs:element name="s" type="xs:string" dfdl:lengthKind="pattern">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/dfdl-1.0/">
          <!-- 0 to 63 non-NUL characters followed by a NUL (the final NUL is
               not captured in the pattern match result), OR exactly 64
               non-NUL characters -->
          <dfdl:element>
            <dfdl:property name="lengthPattern"><![CDATA[([^\x00]{0,63})(?=\x00)|[^\x00]{64}]]></dfdl:property>
          </dfdl:element>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    <xs:element name="term" type="xs:string"
                dfdl:lengthKind="explicit" dfdl:length="0"
                dfdl:initiator="%NUL;"
                dfdl:outputValueCalc="{ '' }"
                minOccurs="0" maxOccurs="1"
                dfdl:occursCountKind="expression"
                dfdl:occursCount="{ if (fn:string-length(../tns:s) lt 64) then 1 else 0 }" />
  </xs:sequence>
</xs:complexType>
What I did is use a lengthKind pattern to pick off the content, excluding
the NUL, and then model the NUL explicitly as the initiator of an empty
element which is optionally occurring depending on the length of the string.
That will work. I'd like to hide the "term" element in a hidden
group, but other than that it will work for parsing.
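To see what the lengthPattern above picks off, here is a small check using Python's re module. This is only a sketch: it assumes Python's regex semantics agree with the DFDL processor's for this pattern, and match_content is a hypothetical helper name.

```python
import re

# Same regex as the lengthPattern above, compiled for bytes input.
LENGTH_PATTERN = re.compile(rb"([^\x00]{0,63})(?=\x00)|[^\x00]{64}")

def match_content(data: bytes) -> bytes:
    """Return the string content the pattern matches at the start of data."""
    m = LENGTH_PATTERN.match(data)
    if m is None:
        raise ValueError("data does not start with a valid string")
    return m.group(0)

short = match_content(b"abc\x00rest")      # lookahead leaves the NUL unconsumed
empty = match_content(b"\x00rest")         # zero-length string before the NUL
full = match_content(b"x" * 64 + b"rest")  # 64 characters, no NUL expected
```

In the short and empty cases the NUL is still in the data after the match, which is what the "term" element's initiator then consumes.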
Question: How can I unparse this?
I want the schema and DFDL processor to put the "term" element
into the infoset by itself depending on the length of the 's' element that
I will place into the infoset. So I put an outputValueCalc on there to
assign it a value of empty string, but that will only happen if the occursCount
expression causes the optional element to exist at all.
Will this work for unparsing?
The spec currently says that the occursCount expression is used only when
parsing; otherwise the number of occurrences in the infoset is used. But I
don't want to have to put this syntax-modeling element into the infoset from
the application. I just want the application to create a string, and then,
based on whether it is 0-63 characters long, I want the NUL added or not.
Other ways I've tried to model these strings include as a choice of two
different elements. That has a similar issue that I then have to assemble
my infoset for output using a length dependent element. I can't just put
a string into the infoset and have it decide when unparsing which of the
two choice branches is the right one. (asserts and discriminators are parse-only
also.)
Anyone have better ideas?
...mike
--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU