Mike

A while back we accepted the following erratum (it is in my under-construction v011 doc). It effectively treats an element with lengthKind 'explicit' whose length is an expression as variable-length data, handled the same way as 'prefixed'. In other words, the expression would not be used. I think that prevents your scenario from working even with pre-processed lengths.

2.100. Section 12.3.1. State that when unparsing an element with lengthKind ‘explicit’ and where length is an expression, then the data in the Infoset is treated as variable length and not fixed length. The behaviour is the same as lengthKind ‘prefixed’.

At least, that is what I minuted. I'll forward the email thread that discussed this, as it has some examples.

Regards

Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848

From: Mike Beckerle <mbeckerle.dfdl@gmail.com>
To: dfdl-wg@ogf.org,
Date: 14/11/2012 18:23
Subject: [DFDL-WG] Unparsing - Pad to "big enough length to fit..." - can it be achieved in DFDL?
Sent by: dfdl-wg-bounces@ogf.org

I have a format where fields are supposed to be padded so they all have the length of the longest one.

I.e., something like this with fields terminated by | chars

FIELD1|FIELD2  |ST|PH          |
ABC   |CAPE COD|MA|888-999-0000|
DEFG  |BOSTON  |MA|-           |

When parsing, I believe I can warn if the fields in a column of this table don't all have the same length. I would set a variable to the field's length in the first row; then, on subsequent rows (parsed with a different element in the schema), I would use an assertion to verify that the length equals the value stored in the variable. I believe this is what the recoverableError kind is for: warning about redundant information that turns out to be inconsistent.

The problem I have is unparsing. How, in DFDL, could I determine how wide to pad each field before unparsing it? The format's spec requires me to measure how long each field in a column is (including the quasi-header row, which comes first) and then pad them all to that maximum.

I believe there is no way to do this currently in DFDL, because there is no way that an expression can refer to more than one record, nor any way to iterate or operate over sets of data.

Instead an application must be written which can use DFDL v1.0 but the application has to provide some of the smarts.

In this case, the application would have to walk the infoset and determine the maximum length for each such field, then provide these length maximums to the DFDL schema by way of externally set variables.
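To make that pre-processing step concrete, here is a minimal sketch in Python. It assumes the application has already flattened the infoset into a list of rows, each a list of field strings; the function names are hypothetical, and how the resulting widths are handed to a DFDL processor as external variables is implementation-specific and not shown.

```python
def column_widths(rows):
    """Return the maximum field length for each column across all rows."""
    if not rows:
        return []
    n = len(rows[0])
    if any(len(row) != n for row in rows):
        raise ValueError("all rows must have the same number of fields")
    return [max(len(row[i]) for row in rows) for i in range(n)]


def pad_row(row, widths, terminator="|"):
    """Right-pad each field with spaces to its column width, then terminate it."""
    return "".join(field.ljust(width) + terminator
                   for field, width in zip(row, widths))


# Stand-in for the infoset: the quasi-header row plus two data rows.
rows = [
    ["FIELD1", "FIELD2", "ST", "PH"],
    ["ABC", "CAPE COD", "MA", "888-999-0000"],
    ["DEFG", "BOSTON", "MA", "-"],
]

widths = column_widths(rows)  # one maximum per column
padded = [pad_row(row, widths) for row in rows]
```

The widths list is what would be supplied to the schema, one externally set variable per column. The padding itself is shown only to illustrate the intended output; in practice the DFDL unparser would do it.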

Is there any other way?

Second issue: the spec of my format also requires the total length of all the fields in a row to be at most 72 characters. When unparsing, we don't evaluate assertions. So how can I test this?
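If the application is already computing the column widths before unparsing, the 72-character limit could be checked at the same point, outside DFDL. The following is only a sketch in Python: the function name is hypothetical, and whether the field terminators count towards the 72 characters is an assumption left as a parameter, since the description above does not say.

```python
MAX_ROW_CHARS = 72  # limit taken from the format's spec as described above


def check_row_budget(widths, limit=MAX_ROW_CHARS, terminator_len=0):
    """Return the total row length, or raise ValueError if it exceeds the limit.

    terminator_len is the number of characters each field terminator adds;
    the default of 0 assumes terminators do not count towards the limit.
    """
    total = sum(widths) + terminator_len * len(widths)
    if total > limit:
        raise ValueError(
            f"row would be {total} characters, exceeding the limit of {limit}"
        )
    return total


# Column widths for the sample table (6, 8, 2 and 12 characters).
sample_widths = [6, 8, 2, 12]
fields_only = check_row_budget(sample_widths)
with_terminators = check_row_budget(sample_widths, terminator_len=1)
```

Because every row is padded to the same widths, checking the widths once covers all rows.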

--
Mike Beckerle | OGF DFDL WG Co-Chair
Tel: 781-330-0412
--
dfdl-wg mailing list
dfdl-wg@ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg
