
The attached document is the first draft of the revised set of DFDL properties. The following should be borne in mind when reviewing:

- I have divided the set of properties in two: those that are concerned with the physical representation of the data (rep properties), and those that aren't (non-rep properties).
- A number of physical rep types have been identified (text, binaryStream, binaryInteger, binaryFloat, zonedDecimal, packedDecimal, binaryCodedDecimal, xml).
- XML Schema logical types have been gathered into type groups (number, string, binary, boolean, calendar). For each logical type group, only certain physical rep types are allowed (eg, for string, only text & xml).
- There is a group of rep properties that are common to many physical rep types (eg, length). The majority of rep properties are specific to a physical rep type, and I have decorated such property names accordingly (eg, integerSigned). Some of these are further specific to a logical type group, and I have further decorated such property names accordingly (eg, textCalendarScheme).
- I have assumed for now that all physical rep types are part of the core standard. We need to revisit this, as clearly some of the decimal reps are not universally used.
- I have not (yet) attempted to assign properties to conversions. I'd rather we reviewed the properties first in terms of the general approach to organisation, naming, and so on, before starting this exercise.
- The issue of defaults poses some questions. We have agreed that hard-wired model defaults are not desirable, as it means the absence of a property implies a behaviour. If we later wish to change this behaviour then we are stuck, because existing DFDL Schemas would then behave differently. So the proposal in the scoping document is that all properties that are used during a parse/serialise must have an explicit value defined somewhere in the DFDL Schema, typically in a dfdl:defineFormat annotation.
However, there are properties where we don't want a one-value-fits-all value, but at the same time do not want to specify a value on every element. Example: justification, where typically all strings are left justified and all numbers are right justified. Example: calendarPattern, where the patterns for a date, dateTime, time, monthDay, etc are different. One approach is to duplicate the properties, so we would have textStringJustification, textNumberJustification, and so on. You will see that this is what I have done in the document for justification. An alternative approach is to have one property, but to add an additional enum 'schema', which means derive the default from the logical type. You will see that this is what I have done for calendarPattern. So calendarPatternKind set to 'schema' for xsd:date would yield "yyyy-MM-dd" and for xsd:time would yield "hh:mm:ss.sss". Of course, 'schema' can be considered a form of hard-wiring, so maybe we also need some properties that define what these defaults are (they'd only ever be set at dfdl:defineFormat level)? We need to decide which of these approaches is preferred, whether they both make sense, or if there is a better way.
- I have not attempted to define properties for multi-dimensional arrays; I have been leaving this until the tracker for this is resolved.
- I have not attempted to define properties for physical rep type xml. I think we need a broader discussion on how XML is handled within DFDL first.
- Certain properties come with a whole bunch of related properties. Examples are patterns for text numbers, patterns for text calendars, separators, initiators, terminators, occurs. For the patterns I have created schemes under which to group the related properties, as has been proposed for escapes/quotes. I have not done this for separators etc, partly so you can see the two different approaches. We need to come up with a consistent design for when to use schemes and when not to use schemes.
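To make the defaults discussion above concrete, here is a minimal Python sketch of how the two ideas might combine: every property must resolve to an explicit value, and calendarPatternKind set to 'schema' derives the pattern from the logical type. It is an illustration under stated assumptions, not part of the draft. The function names, the dict-based scopes, the rule that an element-level value wins over dfdl:defineFormat, and the 'explicit' enum value are all hypothetical. Only the two patterns in the table come from the text above.

```python
# Illustrative sketch only: function names, the dict-based scopes, and the
# 'explicit' enum value are assumptions, not part of the draft properties.

# Patterns the text says 'schema' would yield for these two logical types.
SCHEMA_CALENDAR_PATTERNS = {
    "xsd:date": "yyyy-MM-dd",
    "xsd:time": "hh:mm:ss.sss",
}


class MissingPropertyError(Exception):
    """Raised when a property needed for parse/serialise has no explicit value."""


def lookup(name, element_props, format_props):
    # Assumed scoping: a value on the element wins over dfdl:defineFormat.
    for scope in (element_props, format_props):
        if name in scope:
            return scope[name]
    # No hard-wired model default: absence is an error, not a behaviour.
    raise MissingPropertyError(name)


def resolve_calendar_pattern(logical_type, element_props, format_props):
    kind = lookup("calendarPatternKind", element_props, format_props)
    if kind == "schema":
        # Derive the pattern from the logical type.
        if logical_type not in SCHEMA_CALENDAR_PATTERNS:
            raise ValueError(f"no schema-derived pattern known for {logical_type}")
        return SCHEMA_CALENDAR_PATTERNS[logical_type]
    if kind == "explicit":  # hypothetical enum value
        return lookup("calendarPattern", element_props, format_props)
    raise ValueError(f"unsupported calendarPatternKind: {kind!r}")


if __name__ == "__main__":
    fmt = {"calendarPatternKind": "schema"}
    print(resolve_calendar_pattern("xsd:date", {}, fmt))
    print(resolve_calendar_pattern("xsd:time", {}, fmt))
```

In this sketch the 'schema' table is still a form of hard-wiring, which is the concern raised above. Moving that table into properties set at dfdl:defineFormat level would amount to passing it in as data rather than defining it as a module constant.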
- For each property I've suggested which DFDL annotations are applicable; this also requires checking.
- My escape scheme is deliberately simple. It may well need to be improved, and we need to decide whether alternate and/or nested schemes are required.
- I've tried to be consistent with property names. For example, properties that control how another property is interpreted are suffixed with 'Kind' (eg, length and lengthKind). Similarly, enum values that have the same meaning are shared across properties (eg, always, never, output, input, for properties that say whether something is applicable to input and/or output).
- Mike, Geoff, Suman and I spent some time prior to the F2F designing some of the properties. They may notice that I have deviated in places from what was proposed. This is invariably due to the discovery of some scenario not covered by our previous discussion. For example, it turns out we need to have a separate control for trimming fixed length text, instead of deducing a behaviour from justification (because MRM has a little-known but useful property that controls this). They may also notice that not all property behaviour has been stated. This is simply down to a desire to keep the property descriptions concise at this stage.
- I've incorporated Geoff's latest thinking on default value and null value properties, although the document describing the theory behind these has not yet been reviewed by anyone other than Geoff and myself, and there are some open questions on things like null indicator fields. I expect there to be considerable discussion in this area.
- For some properties I've indicated where they can take either a literal value or an XPath. The list of such properties requires revision, as I have not been exhaustive in this, mainly allowing XPath where I know that IBM's models allow it.
- Although I have tried to make sure that DFDL properties encompass IBM's models, I cannot guarantee this at this point in time.
Review is required by IBMers beyond the DFDL WG IBMers. I have initiated this process.

(See attached file: DFDL_Properties_v004.doc)

Regards, Steve

Steve Hanson
WebSphere Message Brokers,
IBM United Kingdom Ltd, Hursley, UK
Internet: smh@uk.ibm.com
Phone (+44)/(0) 1962-815848