All-
 
This is the conclusion that I've come to, as well, and I was going to make some of the same points Mike made (much better than I could) in tomorrow's weekly call.
 
I'm still mostly looking at this from the perspective of a user of DFDL, not someone attempting to write a "reference implementation" or validator.  While all these points of view need to be considered, I think the most important factor is to make sure the DFDL spec is useful, usable, and actually used by people.  If the choice in scoping comes down to "ease of use" vs. "ease of writing a validator," ease of use should be the default unless there's a very good reason to do otherwise.
 
The major problem here seems to be validating a DFDL schema as "correct and complete."  If I'm not mistaken, any top-level element can be verified as "correct" (there are no errors in the DFDL declarations or schema) on its own, but verifying an element as "complete" requires one of two things: a) every top-level element has all necessary DFDL properties defined (the current spec seems to require this), or b) the validator knows which top-level element will be the "root" of the schema.  For usability's sake, I would much prefer b) over a).  There already seems to be a decision that the indication of the top-level document element will not be part of the DFDL schema itself, which is consistent with the way XML Schema works, so I would suggest an alternative.  A validator can verify that a DFDL schema is "correct" without any additional information, but to verify "completeness," it would need to be given the top-level document element as an argument.  This way, DFDL properties can be inherited by reference rather than being lexically scoped, which would make DFDL much more usable.
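
To make option b) concrete, here is a rough sketch in Python of how I picture the two checks.  This is only an illustration under my own assumptions: the dictionary layout, the REQUIRED set, and the function names are all made up for this note and are not DFDL syntax or anything from the spec.  "Inherited by reference" is modeled as a referenced element picking up the effective properties of whatever refers to it.

```python
# Toy model, not DFDL syntax: each top-level element has locally declared
# properties ("props") and references to other top-level elements ("refs").
REQUIRED = {"byteOrder", "encoding"}  # assumed set, for illustration only

SCHEMA = {
    "message": {"props": {"byteOrder": "bigEndian", "encoding": "ascii"},
                "refs": ["header", "body"]},
    "header": {"props": {}, "refs": []},
    "body": {"props": {"encoding": "utf-8"}, "refs": ["header"]},
}


def is_correct(schema):
    """Correctness needs no root: every reference must resolve."""
    return all(ref in schema for el in schema.values() for ref in el["refs"])


def missing_properties(schema, root, inherited=None, path=()):
    """Completeness needs a root: walk the references starting from it.

    Returns {element name: set of required properties it still lacks}.
    """
    if root in path:  # guard against recursive references
        return {}
    # Local declarations override whatever was inherited from the referrer.
    effective = {**(inherited or {}), **schema[root]["props"]}
    gaps = REQUIRED - set(effective)
    missing = {root: gaps} if gaps else {}
    for ref in schema[root]["refs"]:
        found = missing_properties(schema, ref, effective, path + (root,))
        for name, props in found.items():
            missing.setdefault(name, set()).update(props)
    return missing
```

With "message" given as the root, nothing is missing, because "header" and "body" pick up what they lack through the references.  With "header" given as the root, the same schema is still correct but not complete.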
 
The other issue Mike has brought up is which properties must be specified to have a "complete" definition.  He does not seem to want default values for the properties, which is understandable when you consider the case of byteOrder or other platform-specific information.  A default that is the opposite of what the platform running DFDL uses would be confusing at best, and defaulting to "whatever the current platform uses" would make the DFDL schema ambiguous and far less useful for cross-platform communications.  I would be in favor of the rule he proposes, that every property necessary for the implementation of the particular schema has to be declared, on one condition: the spec needs to clearly state what that set of properties is.  Anyone who uses DFDL will need to know this.  If I am defining a text format, I need to know whether byteOrder is necessary in my dfdl:format.  This is especially true if (as in Mike's example) it depends on the character encoding.
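
As an example of the kind of rule I would like the spec to spell out, here is a small Python sketch.  The rule itself is my assumption, not something taken from the spec: a text format always needs an encoding, and needs byteOrder only when the encoding uses multi-byte code units and its name does not already fix the order.  The names below are hypothetical.

```python
# Hypothetical rule table; the actual required set would come from the spec.
BASE_TEXT_PROPERTIES = {"encoding"}
# Encodings with multi-byte code units whose name does not fix the byte order.
BYTE_ORDER_SENSITIVE = {"utf-16", "utf-32"}


def required_text_properties(encoding):
    """Properties a text format must declare, given its encoding."""
    required = set(BASE_TEXT_PROPERTIES)
    if encoding.lower() in BYTE_ORDER_SENSITIVE:
        required.add("byteOrder")
    return required


def undeclared(declared):
    """Required properties that are absent from a dict of declared ones."""
    if "encoding" not in declared:
        return {"encoding"}
    return required_text_properties(declared["encoding"]) - set(declared)
```

Under this assumed rule, a format declaring only encoding "ascii" is complete, while one declaring only encoding "UTF-16" still has to declare byteOrder.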
 
Please let me know if I've made any errors, or if I've left something unclear.  Otherwise, I'm looking forward to discussing this tomorrow morning (for me).
 
Thanks,
-Steve

--
Steve Marting, Progeny Systems Corp.
Manassas, VA (World HQ)
703-368-6107 x162 / smarting@progeny.net



From: dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org] On Behalf Of Mike Beckerle
Sent: Tuesday, September 29, 2009 8:52 AM
To: Alan Powell
Cc: dfdl-wg@ogf.org
Subject: Re: [DFDL-WG] New scoping rules

Alan,

I've done some thinking on the scoping, and I think we've talked ourselves into a bad position.

From the note on scoping:

The proposal currently under consideration is:

The above is problematic. This breaks referential transparency.

This last bullet is an unreasonable requirement, depending on how you define validity. This was put in to simplify a tooling requirement of some sort that I believe is likely not a good goal for us to accept.