After looking into the implementation complexity, I've come up with simplifications that don't reduce expressive power at all but greatly reduce the implementation (and documentation, and testing...) burden for this proposed feature.

https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Data+Layering+for+base64+-+Simplified

Feedback is very welcome.

Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology | www.tresys.com

Please note: Contributions to the DFDL Workgroup's email discussions are subject to the OGF Intellectual Property Policy

On Thu, Feb 15, 2018 at 10:44 AM, Mike Beckerle <mbeckerle.dfdl@gmail.com> wrote:

We have a great deal of demand for the ability to describe data formats that include layered transformations - that is, regions within the data that need to be algorithmically transformed before parsing and after unparsing.

The IETF data formats make extensive use of line folding, base64, etc. Many formats allow compressed payloads. All of these are examples of "layering", where a region of the data stream is identified, algorithmically transformed, and then parsed further. (Unparsing applies the inverse process.)
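
To make the layering idea concrete, here is a minimal Python sketch. It is not DFDL and not the Daffodil API; the wire format (a 2-byte length prefix, a base64 region, and a 5-byte inner record) and the function names `parse_layered` and `unparse_layered` are assumptions invented purely for illustration.

```python
import base64
import struct


def parse_layered(data: bytes) -> dict:
    """Parse a hypothetical record whose body is a base64-encoded layer."""
    # Outer layer: a 2-byte big-endian length identifies the layered region.
    (length,) = struct.unpack_from(">H", data, 0)
    region = data[2:2 + length]
    # Layer transform: decode the base64 region into raw bytes.
    payload = base64.b64decode(region, validate=True)
    # Inner layer: parse the decoded bytes (1-byte id, 4-byte big-endian value).
    ident, value = struct.unpack(">BI", payload)
    return {"id": ident, "value": value}


def unparse_layered(infoset: dict) -> bytes:
    """Inverse process: unparse the inner data, then apply the encoding."""
    payload = struct.pack(">BI", infoset["id"], infoset["value"])
    region = base64.b64encode(payload)
    return struct.pack(">H", len(region)) + region


if __name__ == "__main__":
    wire = unparse_layered({"id": 7, "value": 1000})
    print(parse_layered(wire))
```

The point is only the ordering: when parsing, the region is located first, then transformed, then parsed; unparsing runs the same steps in reverse.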

Our draft proposal, which we hope to implement in Daffodil, is here:

https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Data+Streaming+for+Base64+and+other+Layered+Transformations

Any feedback on this is very welcome and can be sent to the dfdl-wg mailing list.
