DFDL-WG F2F Notes by Mike Beckerle 2005-05-11

SumanK - properties model

dimensionStorageOrder
- not on ArrayTD because it would be a modifier on the relationship. Simpler to keep it on the DFDLCommonPropertiesBase.

storedLengthRef
- SteveH: can it be any XPath? What about field + 1?
- MikeB suggested: want it to be invertible for writing. Some other, more general mechanism that is not storedLengthRef would be for a more complicated case.
- SteveH: ambiguity of stored length for the element itself (e.g., a string) vs. for an array of them (about dimension).
- Proposed: repeatRef (which is specific about arrays).
- (How to be explicit for non-invertible stored-length calculations? E.g., length is the sum of two fields.)

relativeOffset
- JimM: allow negative and get rid of "before".
- MikeB: does "after" mean afterMe, or after the one I refer to? (precedingElement, followingElement - suggested names)
- MikeB: restrict to just "../peerName" for v1.0? Simplify semantics for v1.0.

Text Representation
- need "matches" property that takes a regular expression that the value must match (SteveH said SWIFT message parsing uses this.) (? pattern on the rep or pattern on the logical value? Answer: pattern on the rep) (? need specific proposal for regexp hook - SteveH)
- need regexp for initiator/terminator/separator (any text delimiter)
- need all these things for binary as well as text data (in which case the units are bytes)
- extend repType enumeration to include "both", meaning the data has both text and binary representation characteristics - turns on "initiators" for binary data.
- non-translated string literal specified as hex (already in the charset of the data, not the charset of the DFDL schema)
- absent element delimiters: handle the case of optional fields at the end of a record. Do their delimiters have to be present, or not?
- delimiter suppression can occur with arrays, not just groups. (?
specific proposal needed for adequate coverage - examples)

Quote Schemes
- are quoting schemes associated with the field, not the delimiter?
- same for escapes, or are those for the delimiter?
- what about translation, i.e., does an escape sequence include the concept of translation or do we exclude that?
- outbound translation: there are formats where you aren't allowed to write out certain sequences. How to specify this so that those are flagged (and either escaped/quoted, or errors are given)? (Not sure there is any issue about outbound translation.)
- preserve-quoting property needed for cases where you want to keep the quote marks in the data (ditto escapes)

StringTD has properties which will conflict with TextPropertiesExtension. Need to resolve. Not a problem for NumberTD: since those can have a text rep, the TextPropertiesExtension has to have full generality; but for StringTD we have the conflict.

Ambiguity of parsing: the '#' character in Suman's example appears at two levels, as array element separator and as enclosing group separator. How do we avoid ambiguity? (The regexp "matches" property is one approach to fixing this.)

Issue: New model vs. extending the TD model

Concrete proposal for DFDL annotations based on the model is needed.

* Wed May 11 13:07:26 2005: Minutes from Mike's demo of our code examples.
Discussion re: translating DFDL "short form" to expanded form and how that doesn't support a model-based DFDL.
Suman: This will only work for simple type annotations.
Steve: This will only work with single annotations (e.g., won't work for "about=" or other discriminator).

** testSimpleTextIntegers1
There were no comments explicitly on testSimpleTextIntegers1.

** testSimpleBinaryIntegers1
Suman commented that some of the defaults were embedded in the parser rather than being a configuration set.

** testCobol1
Discussion regarding calculating repLength from the XML facets. Steve's implementation picks up defaults from the XML facets and this seems to be a good idea.
Steve's system has a separate "fixedLength" flag and thus the length can default from xs:maxLength. He stated that customers complain when they've entered some data and it didn't default properly.
Suman brought up scoping issues. He questioned which set of properties would apply if the COBOL example were put into another element.
[ Issue: Review specific proposal on scoping from Kris Rose ]
Suman's impl has all of the default values kept separately, outside of the top-level element. Mike mentioned the concept of naming sets of reps and Suman seemed to agree that was a good idea. Suman has some IBM confidential information that applies and should be discussed later. Steve seems to have some issue with this.

** testSimpleVector1
No input.

** testVectorExpression1
Questions on the expression, but those seemed to be satisfied. Steve noted that SDO (Service Data Objects) has a good syntax for allowing 0-based subscripts in a way compatible with XPath. Steve questions the applicability of this example. Suman raised the issue of serialization with computed elements. Some brief discussion occurred where everyone was generally in agreement.

** testValueCalc1
No comments.

** testChoice1
Steve brought up the issues of CodePages with the int literal. Jim pointed out that choiceTagLocation really isn't the location but a processed result.
[ Note: rename choiceTagLocation ]
There was some discussion of how the layer is hidden, sequences without elements, and so on. Jim stated that he still believes that not touching the user's schema is a good idea. Steve stated his experience is that people do not tend to put choice tags in their data -- they actually switch on data type as the first field of their code.
[ Note: Steve was confused by layers and this might not be an issue. ]
Steve stated that you could use a regular expression to discriminate choices. Steve's implementation handles unresolved choices implicitly by first access, i.e.,
if you use it a certain way then that is the choice that will be taken. Mike reviewed that layers cannot contain anonymous complex types, etc., but only immediate type references.

** testCleverStringChoice1
Suman described how his model would handle cleverString. The model would reference the "cleverString" type. Mike discussed our "repDef" construct. Suman suggested that we define a common pattern on what may return values. Mike explained that we have approximately that. Further discussion occurred re: repDef.
Jim led some discussion suggesting that in the cleverString case we could make it a simple type. See testCleverStringChoice1-Jim.dfdl.xsd
Steve pointed out that this now allows annotation on simple types, which we did not previously allow; we had previously made a conscious choice that *only* complex types and elements had annotations.
Suman suggested that we could just add an annotation to the *use* of the clever string element that indicates repDef="cleverString". It might be possible to avoid the extra layer for simple types.
Discussion summary: We need to revisit this because we might be worrying about hiding things which just do not occur.
Mike's summary: is layering just about hiding length fields?
Steve later brought up the concept of using a group reference to constructs containing hidden layers. Suman suggested that we define a group and model it as a group ref instead of repeating the whole block.
Clarify: valueCalc fields are not serialized for output. They have no representation.

Comments on PNNL multi-layer example - lots of discussion, but no clear conclusion. We need a motivating example. MikeB said the examples of "data source indirection" he knew of all end up creating an intermediate form that is either a byte-stream or a character-stream. A good example which motivates streams of higher-level objects like strings or numbers is needed. (?
Revisit where we should leave this before Friday)

SteveH: notes on repDef
- tlog rep needs layers because the numbers are encoded cleverly in something like packed decimal, but different.
- white box vs. black box handling of this. (Need black box. How far does white box have to be able to go?)
- enumeration repType should allow repDef as a setting, since none of the properties that are germane for binary or text are relevant for repDef.

null properties
- nullpad - if the field is all pad char then it's null. (Steve/Suman to provide list)
- Literal nullReservedValues as well as logicalNullReservedValues needed.
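
A minimal sketch of the nullpad rule as stated in the notes (a field that consists entirely of the pad character is null). This is only an illustration, not a proposed DFDL mechanism: the function names is_null_by_pad and read_field are hypothetical, and the single-byte pad, the ASCII decoding, and the treatment of a zero-length field as non-null are all assumptions that the notes do not settle.

```python
from typing import Optional


def is_null_by_pad(raw: bytes, pad: bytes = b" ") -> bool:
    """Return True when a non-empty field consists only of the pad byte.

    Assumption: the pad character is a single byte.
    """
    if len(pad) != 1:
        raise ValueError("pad must be a single byte")
    # Assumption: a zero-length field is not treated as null here.
    return len(raw) > 0 and raw == pad * len(raw)


def read_field(raw: bytes, pad: bytes = b" ") -> Optional[str]:
    """Map an all-pad field to None; otherwise strip trailing pad and decode.

    Assumption: the data is ASCII text padded on the right.
    """
    if is_null_by_pad(raw, pad):
        return None
    return raw.rstrip(pad).decode("ascii")
```

For example, read_field(b"AB  ") yields "AB", while read_field(b"    ") yields None. A reserved-value check (nullReservedValues) would be a separate comparison against a list of literals and is not shown.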