This is not a meeting summary. I'll work on that on the plane today and send it subsequently, but people asked about having these notes. This is truly random note taking.

...mikeb

---------------------------------------------------------------

Starting Points:

* xml xsd - what I already have, but not what I want
- what I want

* Cobol/C-struct + system and compiler details

* data dictionary (ad-hoc spreadsheet or text description)

* example data files only

Some degree of structural similarity required. Reasonably compatible.

You could express any transformation at all, but the intent is to make
easy those where structural similarity is present.

Q: is this just a taste and style issue?

------------------------------------------------

- desirable to have symmetric read/write capability given a DFDL descriptor
- all built-in types and reps should implement both directions
- choices and use of runtimeValue expressions can create non-invertible parsers
- mechanisms can allow the output formatting properties that are needed to be introduced explicitly, symmetric with the input parsing properties
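
A minimal Python sketch of the symmetry point above. The function names and the padding example are hypothetical (not from the meeting). The integer reader/writer pair inverts cleanly. The text reader throws away padding, so its writer needs an explicitly supplied output property (here `width`) to reproduce the original bytes.

```python
def read_int32(data: bytes, byte_order: str = "big") -> int:
    """Reader: first 4 bytes -> logical integer."""
    return int.from_bytes(data[:4], byteorder=byte_order, signed=True)


def write_int32(value: int, byte_order: str = "big") -> bytes:
    """Writer: exact inverse of read_int32 for the same byte_order."""
    return value.to_bytes(4, byteorder=byte_order, signed=True)


def read_trimmed_text(data: bytes) -> str:
    """Reader that discards trailing padding; the width is lost."""
    return data.decode("ascii").rstrip(" ")


def write_padded_text(value: str, width: int) -> bytes:
    """Writer that needs an explicit output property (width) to be symmetric."""
    return value.ljust(width).encode("ascii")


raw = b"\x00\x00\x01\x02"
round_trip_int = write_int32(read_int32(raw))

padded = b"abc   "
logical = read_trimmed_text(padded)
round_trip_text = write_padded_text(logical, width=len(padded))
```

Without `width`, the writer has no way to know how much padding the reader removed.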

-----------------------------------------------

Ways to refer to one element from the dfdl annotations of another:
1) use value of other element in runtimeValue expression.
2) use value of other element as source for dfdl read conversion
3) use value of other element as a parameter for dfdl read conversion
(note 2 and 3 are the same if the source is just another parameter)

-----------------------------------------------

Hypothesis: no XSD syntax is needed inside DFDL rep annotations. Can instead reference a type name, and name elements within it.

---------------------------------------------

Choice groups imply the need to have an additional element to provide a name for the choice. This is required only when the alternatives of the choice contain a single value.

--------------------------------------------

Key topic: binding and passing mechanisms for property values, aka parameters to read/write conversions

Parameterization and binding examples

1) Mime type - image format
logical model looks like bmp
black box read conversion

2) complex number with 2 possible component orders, realFirst or imaginaryFirst
white box
complexComponentOrder is the parameter
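
A sketch of example 2 as a white-box conversion pair in Python. The binary layout (two big-endian IEEE doubles) and the function names are assumptions made for illustration. Only the parameter name and its two values come from the notes.

```python
import struct


def read_complex(data: bytes, component_order: str = "realFirst") -> complex:
    """Reader parameterized by complexComponentOrder (assumed layout: 2 doubles)."""
    first, second = struct.unpack(">dd", data[:16])
    if component_order == "realFirst":
        return complex(first, second)
    if component_order == "imaginaryFirst":
        return complex(second, first)
    raise ValueError("unknown complexComponentOrder: " + component_order)


def write_complex(value: complex, component_order: str = "realFirst") -> bytes:
    """Writer using the same parameter, so read and write stay symmetric."""
    if component_order == "realFirst":
        return struct.pack(">dd", value.real, value.imag)
    if component_order == "imaginaryFirst":
        return struct.pack(">dd", value.imag, value.real)
    raise ValueError("unknown complexComponentOrder: " + component_order)


data = struct.pack(">dd", 1.0, 2.0)
as_real_first = read_complex(data, "realFirst")
as_imag_first = read_complex(data, "imaginaryFirst")
```

The same bytes yield different logical values depending on the bound parameter, which is why the binding mechanism matters.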

-------------------------------------------------------

Agreed: "transforms" will be called readers and writers, collectively converters and conversions

--------------------------------------------------------

Proposal for parking lot: discontiguous representations. E.g., file full of variable length strings where the length fields are all first, then all the contents.
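
To make the discontiguous case concrete, here is a small Python sketch. The layout is hypothetical: one count byte, then one length byte per string, then all the string contents back to back in ASCII.

```python
from typing import List


def read_discontiguous_strings(data: bytes) -> List[str]:
    """Read strings whose length fields all precede their contents."""
    count = data[0]                      # assumed: 1-byte string count
    lengths = list(data[1:1 + count])    # assumed: 1-byte length per string
    pos = 1 + count                      # contents start after all lengths
    strings = []
    for n in lengths:
        strings.append(data[pos:pos + n].decode("ascii"))
        pos += n
    return strings


example = bytes([2, 3, 2]) + b"abcde"
strings = read_discontiguous_strings(example)
```

The length of each string is stored far from its content, so a description that ties each element to one contiguous region of the data cannot express this directly.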

--------------------------------------------------------

XSLT - has variables and things we can use as constructs. E.g., they use this idiom

<xsl:variable name="variable" select="...."/> and, roughly equivalently, <xsl:variable name="variable">....</xsl:variable>

---------------------------------------------------------

Issue: when annotations are added on to an element, can we validate that only relevant properties are asserted for that element?
Is it desirable to ensure that only relevant properties are asserted, or should irrelevant properties simply be ignored?

Position - ruling out irrelevant attributes improves validity checking and catches errors earlier.
E.g., I keep changing the byteOrder setting, but nothing changes in the data I'm reading. (It turns out byte order is irrelevant here, but if nothing checks for that, nothing will help you find it out.)

Position - tolerating irrelevant attributes improves flexibility (e.g., if you change the overall representation, you don't have to edit all the other properties that no longer apply). A single file of DFDL can capture characteristics of more than one representation (at least one text and one binary flavor, though this doesn't generalize).
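
The two positions can be sketched as a single check with a strictness switch. This is only an illustration in Python. The table of which properties are relevant to which representation is made up for the example, not an agreed list.

```python
# Hypothetical relevance table: representation type -> properties that apply.
RELEVANT = {
    "text": {"charset", "terminator", "separator", "numBase"},
    "binary": {"byteOrder", "numberOfBits"},
}


def check_properties(rep_type, props, strict=True):
    """Return the irrelevant property names; raise instead if strict."""
    irrelevant = sorted(set(props) - RELEVANT[rep_type])
    if irrelevant and strict:
        raise ValueError(
            "properties not relevant to %s: %s" % (rep_type, ", ".join(irrelevant))
        )
    return irrelevant


props = {"charset": "utf-8", "byteOrder": "bigEndian"}

# Tolerant position: byteOrder is reported but otherwise ignored.
ignored = check_properties("text", props, strict=False)

# Strict position: the same annotation is rejected early.
try:
    check_properties("text", props, strict=True)
    rejected = False
except ValueError:
    rejected = True
```

A middle ground would be to keep the tolerant behavior but surface the returned list as warnings.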

-----------------------------------------------------------

Issue: parameterization of transforms

seems like the OMG DT model and the transform descriptions (alan's proposal) are very very close conceptually, but exactly how isn't entirely clear.

-----------------------------------------------------------

Preprocessing

an attribute called source (and presumably another called target)

----------------------------------------------------------

5 kinds of operations

reader
writer
filter
change filter

function

known signatures
we can chain them together

conceptually think of this as pull model, or perhaps the DFDL expressions don't take any position on whether the implementation is pull or push.

should be a way to create pull-model code in a programming language and use it as an augmentation of the DFDL system.
could be ways to also adapt push model code, or other schemes like stateful threads.

Where can these go in DFDL?

- readers and writers go on elements
- filters go on a special construct for creating sources or targets from other sources or targets

I/O asymmetries - using filters you are discarding information, so it affects the ability to exactly reproduce output.

Box and arrow diagrams using these function types can be used to provide a semantics for DFDL.
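
One possible reading of the pull model, sketched with Python generators. Each box is a function with a known signature that pulls from its source, and the boxes chain together. All function names here are hypothetical. The regexp filter is simplified to buffer the whole stream so that matches spanning chunk boundaries are found.

```python
import codecs
import re
from typing import Iterable, Iterator, List


def bytes_to_chars(source: Iterable[bytes], charset: str = "utf-8") -> Iterator[str]:
    """Filter: byte source -> character source, decoding incrementally."""
    decoder = codecs.getincrementaldecoder(charset)()
    for chunk in source:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def replace_regexp(source: Iterable[str], pattern: str, replacement: str) -> Iterator[str]:
    """Filter: character source -> character source with matches replaced."""
    text = "".join(source)  # simplification: buffer everything before matching
    yield re.sub(pattern, replacement, text, flags=re.DOTALL)


def read_ints(source: Iterable[str]) -> List[int]:
    """Reader: pulls characters and produces logical integer values."""
    return [int(token) for token in "".join(source).split()]


# Nothing is decoded or filtered until the reader at the end pulls.
byte_stream = iter([b"3 /* cou", b"nt */\n10 20 30\n"])
chars = bytes_to_chars(byte_stream)
stripped = replace_regexp(chars, r"/\*.*?\*/", "")
values = read_ints(stripped)
```

The comment text removed by the filter cannot be recovered on output, which is the I/O asymmetry noted above.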

-----------------------------------------------------

<element name="charstream" type="dfdl:sourceStream">
<annotation><appinfo source="...">
   <dfdl:sourceStreamTD>
    <charset>utf-8</charset>
    <source>byteStream</source>
    <filter>bytesToChars</filter>
   </dfdl:sourceStreamTD>
</appinfo></annotation>
</element>

<element name="s" type="dfdl:sourceStream">
<annotation><appinfo source="...">
   <dfdl:sourceStreamTD>
    <filter>replaceRegexp("...regexp for C-comments...", "")</filter>
    <source>charstream</source>
   </dfdl:sourceStreamTD>
</appinfo></annotation>
</element>

<element name="t" type="dfdl:targetStream">
<annotation><appinfo source="...">
   <dfdl:targetStreamTD>
    <charset>utf-8</charset>
    <target>outbyteStream</target>
    <filter>charsToBytes</filter>
   </dfdl:targetStreamTD>
</appinfo></annotation>
</element>

<element name="toplevel">
<annotation><appinfo source="...">
   <dfdl:instanceTD>
    <source>s</source>
    <target>t</target>
    <repType>text</repType>
   </dfdl:instanceTD>
</appinfo></annotation>
<sequence>
      <element name="len" type="int">
         <annotation><appinfo source="...">
           <intTD>
            <terminator>\p{newline}</terminator>
           </intTD>
         </appinfo></annotation>
      </element>
      <element name="val" type="int" minOccurs="0" maxOccurs="unbounded">
         <annotation><appinfo source="...">
            <intTD>
              <arrayTD>
                <storedLength>../len</storedLength>
                <terminator>\p{newline}</terminator>
                <separator>\p{space}</separator>
              </arrayTD>
              <numBase>10</numBase>
              <reader name="myIntReader">
                <numberOfBits>13</numberOfBits>
              </reader>
            </intTD>
         </appinfo></annotation>
      </element>
</sequence>
</element>
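
A rough Python sketch of what the "toplevel" description above might mean for the text representation, under several assumptions: \p{newline} is taken as "\n", \p{space} as a single space, the storedLength reference ../len gives the number of val items, and the myIntReader/numberOfBits detail is ignored. The function name is hypothetical.

```python
def read_toplevel(text: str) -> dict:
    """Read 'len' (newline-terminated), then 'val' items (space-separated)."""
    # len: a base-10 integer terminated by a newline.
    len_text, _, rest = text.partition("\n")
    length = int(len_text, 10)

    # val: array terminated by a newline, items separated by a space.
    line, _, _ = rest.partition("\n")
    tokens = line.split(" ") if line else []

    # storedLength ../len: the count comes from the previously read element.
    if len(tokens) != length:
        raise ValueError("expected %d values, found %d" % (length, len(tokens)))
    return {"len": length, "val": [int(token, 10) for token in tokens]}


result = read_toplevel("3\n10 20 30\n")
```

Here the array annotation refers to the value of another element (len), which is the first of the cross-element reference styles listed earlier in these notes.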

--------------------------------------------------------

Still open issues:

1) scoping of property definitions. Useful or source of bad interactions?

2) how to organize model of the properties for the types - suman and mike in rough agreement.

----------------------------------------------