I am not aware of a formal comparison of DFDL with
Smooks.
For partial parsing you use a loop of
parseNext() calls instead of a single parseAll() call. That puts you in
control of how far down the data you parse. In the Java sample provided
with IBM DFDL, the code to do this is present but commented out.
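To illustrate the shape of such a loop, here is a minimal sketch. It does not use the IBM DFDL API: StepParser and FieldParser are hypothetical stand-ins invented for this example, and the boolean return value of parseNext() is an assumption of the sketch, so check the IBM DFDL Java sample for the real class names and signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PartialParseSketch {

    /** Hypothetical stand-in for a parser that does one unit of work per call. */
    interface StepParser {
        /** Parses the next item; returns false once the input is exhausted. */
        boolean parseNext();
    }

    /** Toy implementation: each step consumes one comma-separated field. */
    static final class FieldParser implements StepParser {
        private final Iterator<String> fields;
        private final List<String> seen;

        FieldParser(String data, List<String> seen) {
            this.fields = Arrays.asList(data.split(",")).iterator();
            this.seen = seen;
        }

        @Override
        public boolean parseNext() {
            if (!fields.hasNext()) {
                return false;
            }
            seen.add(fields.next());
            return true;
        }
    }

    /** Drives the parser step by step and stops after at most maxFields fields. */
    static List<String> parseFirst(String data, int maxFields) {
        List<String> seen = new ArrayList<>();
        StepParser parser = new FieldParser(data, seen);
        while (seen.size() < maxFields && parser.parseNext()) {
            // The caller decides on every iteration whether to continue.
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(parseFirst("a,b,c,d", 2)); // prints [a, b]
    }
}
```

The point is only that the stop condition lives in the calling code, not in the parser, which is what a single parseAll() call cannot give you.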
Two things can happen when a processing
error occurs. If inside a point of uncertainty, such as a choice branch
or an optional element, then the processing error is taken to indicate
that the component does not exist, and the parser then backtracks and tries
an alternative (so the error is suppressed). If not inside a point of uncertainty,
then IBM's parser currently treats that as a fatal error and stops the
parse. The spec (section 2.1) allows for more creative behaviour:
"It is expected that DFDL implementations
will provide additional mechanisms for dealing with effective processing
errors, such as the means of specifying retry points or the means of skipping
some data so as to recover from the error in some way. The DFDL specification
language does not provide features for specifying such mechanisms."
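The two outcomes of a processing error described above can be sketched in a few lines of Java. This is a toy model, not IBM's parser: ProcessingError, intBranch, wordBranch and choice are hypothetical names made up for the illustration, and the choice here has just two branches.

```java
final class BacktrackSketch {

    /** Hypothetical exception standing in for a DFDL processing error. */
    static final class ProcessingError extends RuntimeException {
        ProcessingError(String message) {
            super(message);
        }
    }

    /** Branch that accepts only an integer. */
    static String intBranch(String text) {
        try {
            return "int:" + Integer.parseInt(text);
        } catch (NumberFormatException e) {
            throw new ProcessingError("not an integer: " + text);
        }
    }

    /** Branch that accepts only a non-empty run of letters. */
    static String wordBranch(String text) {
        if (!text.isEmpty() && text.chars().allMatch(Character::isLetter)) {
            return "word:" + text;
        }
        throw new ProcessingError("not a word: " + text);
    }

    /**
     * A choice is a point of uncertainty: an error in the first branch is
     * suppressed and the next branch is tried from the same position.
     * If the last branch also fails, its error propagates to the caller.
     */
    static String choice(String text) {
        try {
            return intBranch(text);
        } catch (ProcessingError suppressed) {
            return wordBranch(text);
        }
    }

    public static void main(String[] args) {
        System.out.println(choice("42"));  // prints int:42
        System.out.println(choice("abc")); // prints word:abc (first branch failed, error suppressed)
        try {
            intBranch("abc"); // no enclosing point of uncertainty
        } catch (ProcessingError fatal) {
            System.out.println("fatal: " + fatal.getMessage()); // prints fatal: not an integer: abc
        }
    }
}
```

The same intBranch failure is harmless inside choice() and fatal when called directly, which mirrors the difference between being inside and outside a point of uncertainty.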
If the data is well-formed (i.e., no processing
error occurs), then switching on validation will report all validation
errors (section 2.5).
It is also possible to use an assert to raise
a recoverable error, after which the parser will continue (section 2.5).
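As a rough illustration of the difference between a recoverable error and a fatal one, here is a small Java sketch. It is a toy model and not the DFDL assert mechanism itself: the class, the "value must not be negative" rule and the error message format are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class RecoverableErrorSketch {

    /**
     * Parses comma-separated integers. A negative value fails the (made-up)
     * assertion and is recorded as a recoverable error, but parsing continues.
     * A field that is not an integer throws NumberFormatException, which plays
     * the role of a fatal processing error and stops the parse.
     */
    static List<Integer> parse(String data, List<String> recoverable) {
        List<Integer> values = new ArrayList<>();
        for (String field : data.split(",")) {
            int value = Integer.parseInt(field.trim());
            if (value < 0) {
                recoverable.add("assertion failed: " + value + " is negative");
            }
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        System.out.println(parse("1,-2,3,-4", errors)); // prints [1, -2, 3, -4]
        System.out.println(errors.size());              // prints 2
    }
}
```

All four values are parsed and both failed assertions are reported at the end, whereas an input such as "1,x,3" would stop at the second field.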
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Bing Lu <mfcplus@yahoo.com>
To: "dfdl-wg@ogf.org" <dfdl-wg@ogf.org>
Date: 30/04/2014 02:11
Subject: [DFDL-WG] questions
Sent by: dfdl-wg-bounces@ogf.org
Just curious, has anyone in the group compared
DFDL with Smooks, pros and cons? Also, if I were to parse the data using the IBM
DFDL code, is there a setting that I can use to do partial parsing? And
when parsing happens, does it stop on the first parsing error, or does it try
to parse as much as possible and include all the errors encountered? Thanks
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg