Thanks for your note, Sampo.

I am pretty sure DFDL cannot help you today. DFDL v1.0 has not grown the ability to handle recursive structures.

You asked why:

DFDL v1.0 is a standard formed by taking existing industry data-handling tools, finding the union of their functionality, and standardizing that.

Topics that advance the state-of-the-art beyond that of any existing commercial data-handling software have been postponed to what we're informally calling DFDL v2.0.

We have found that research advancing the state of the art doesn't work in the context of a standards process. The two are a terrible mismatch, as "creative process" and "committee compromise" don't usually go together smoothly.

My rejoinder to anyone asking "can DFDL do X?" has been "Are there any products in the marketplace that do that, so that we can derive a standard from what they have done?" I am not aware of any commercial software anywhere that takes a declarative description of rich pointer-based graph data and uses it to write that data to a file or parse it from a file. To my knowledge, such formats are always read and written by purpose-built programs, not from declarative descriptions. Perhaps some companies or people have created such things but keep them for internal use only. I know of none that are published. I would love to hear otherwise.

All that said, we do realize there is a large community of people hoping to take the central DFDL "idea", which is declarative description of data formats, and apply it to new and richer problems like the graph problem you have described. So far the big features people want are:

* recursion - needed to declaratively describe binary document formats and "container" file formats that have arbitrarily deep nesting.
* layering - also known as multi-pass. Needed because many formats are conceptually layered.
* transformation - some kinds of transforms want to go right on the data format schema because they don't change the 'shape' of the data.

I would add your graphs problem to this list, as it adds additional complexity beyond recursion due to node sharing and cycles.
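To make the difference from plain recursion concrete, here is a minimal Python sketch. It is only an illustration, not DFDL and not anything Daffodil does; the names `Node`, `flatten`, and `rebuild` are hypothetical. A purely recursive walk of this data would never terminate because of the cycle, and would duplicate the shared node, so the writer has to track node identity and the reader has to resolve references in a second step.

```python
class Node:
    """A node in a pointer-linked graph (hypothetical illustration type)."""
    def __init__(self, name):
        self.name = name
        self.links = []


def flatten(root):
    """Turn a graph into a list of (name, [target ids]) records.

    Each node is written once; references become integer ids, so
    shared nodes and cycles survive the round trip.
    """
    ids = {}      # id(node) -> integer id assigned on first visit
    order = []    # nodes in order of first visit
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in ids:
            continue  # already written: a shared node or a cycle
        ids[id(node)] = len(order)
        order.append(node)
        stack.extend(reversed(node.links))
    return [(n.name, [ids[id(t)] for t in n.links]) for n in order]


def rebuild(records):
    """Two passes: create every node, then resolve ids back to references."""
    nodes = [Node(name) for name, _ in records]
    for node, (_, targets) in zip(nodes, records):
        node.links = [nodes[t] for t in targets]
    return nodes[0]


# 'shared' is reachable twice (like a hard-linked file), and b points back to a.
a, b, shared = Node("a"), Node("b"), Node("shared")
a.links = [b, shared]
b.links = [shared, a]

records = flatten(a)
copy = rebuild(records)
```

The identity table in `flatten` and the second pass in `rebuild` are exactly the parts that a nesting-only, recursive description has no way to express.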

Since the Daffodil implementation of DFDL is open-source, we're hoping to use that as a research/investigation vehicle to try out approaches to many of these features that advance the state of the art. Once we have created such a feature and we believe it works and is useful, we can ship it in Daffodil to build some de facto momentum behind it, and then propose it for DFDL v2.0 standardization. That's the idea anyway.

This is not years out, as the existing sponsors of my work on Daffodil at Tresys are interested in these DFDL v2.0 extensions as well, and they need them in near-term products. But the priority has been on finishing DFDL v1.0, both the specification and the implementations. Of course, as Daffodil solidifies, it's open source and anyone can grab it and start running with these research topics.

I recently created a wiki page within the Daffodil open-source project to serve as a parking lot for the DFDL v2.0 wish list. The page is at https://opensource.ncsa.illinois.edu/confluence/display/DFDL/DFDL+2.0+Wishlist and I have just added a section at the end about pointer-linked graph structures.

Best regards and I hope you will keep "watching this space".

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com


On Thu, Jul 11, 2013 at 8:37 PM, Sampo Syreeni <decoy@iki.fi> wrote:
The DFDL spec has been growing quite a bit over the past two years or so. Mostly because of the handling of arcane details. So...

Has it also grown in the wider, descriptive (complexity) sense? Can it now also describe e.g. the arbitrary link structure a typical *NIX file system image can contain, with its hard links?

If so, I might just now have an application for that. If not, why not? It isn't as though you can't develop clean semantics for that, if only as an option. And it's clearly warranted because formats utilizing such constructs constitute a sizable proportion of data both stored and actively passed around.
--
Sampo Syreeni, aka decoy - decoy@iki.fi, http://decoy.iki.fi/front
+358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
--
 dfdl-wg mailing list
 dfdl-wg@ogf.org
 https://www.ogf.org/mailman/listinfo/dfdl-wg