Steve - a correction below for the IBM implementation.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh@uk.ibm.com
tel:+44-1962-815848
From: Steve Lawrence <slawrence@tresys.com>
To: DFDL-WG <dfdl-wg@ogf.org>
Date: 04/10/2013 16:40
Subject: [DFDL-WG] Call For Abstract: NIST Data Science Symposium
Sent by: dfdl-wg-bounces@ogf.org
I'm just letting the working group know that we are submitting an
abstract today to give a presentation on DFDL at the NIST Data Science
Symposium. Below is what we plan to submit.
- Steve
----------------------------------------------
Title: Stop Writing Custom Data Parsers -- Write DFDL Instead!
This talk gives an introduction to the Data Format Description
Language (DFDL), how it can be used to parse both textual and binary
data in a standardized way, and how this leads to less time spent on
custom data parser development and, consequently, more time spent on
data processing and analysis. The talk will then describe the
current DFDL implementations, with a focus on the open-source Daffodil
project and its design. It will conclude with a brief walkthrough of
real DFDL examples, including commercial and scientific formats, and
a demonstration of the parsing capabilities of Daffodil.
The DFDL specification, which has completed a second round of public
comments as part of the Open Grid Forum (OGF), is a modeling
language for describing general text and binary data using a subset
of XML Schema augmented with data format annotations. DFDL allows
data to be read from its native format and presented as an instance
of an information set or an XML document. DFDL also allows the
reverse, through conversion of an information set back to its native
format. Because it works through the information set, DFDL integrates
cleanly with common XML utilities (e.g. XProc, XSLT, XQuery) for data
processing and analysis, regardless of the format of the native data.
Two implementations of DFDL exist, as the OGF requires for a
specification to become a standard. The first, created by IBM and
already shipped in several IBM products (such as IBM Integration Bus
v9), is written in both Java and C and includes graphical tools for
modeling, running, and debugging DFDL schemas. The second
implementation, Daffodil, is an open-source project written in
Scala, with a design focused on speed and correctness. With the two
implementations making great strides, and the DFDL specification
nearing standardization, DFDL is becoming a promising tool that will
ease data parsing, processing, and analysis.
Biography:
Stephen Lawrence has worked as a software engineer at Tresys
Technology since 2007, while contributing to the open-source
Daffodil project as a core maintainer for almost two years. He works
alongside Michael Beckerle, the co-chair of the DFDL Working Group,
to develop Daffodil and improve the DFDL specification. Outside of
Daffodil, he focuses on computer security applications, including
file inspection and sanitization, Security Enhanced Linux (SELinux),
and cross domain solutions.