
Before I left Seoul, Stephen showed me a simple solution that was structurally along the lines of what I imagined for the range-value element type.

RANGE-VALUE COMPLEX TYPE PSEUDO-SCHEMA
----------------------------------------

It is very straightforward to write in XSD. With some changes I recommend, it would look like this as a sequence:

  <...>
    <lowerBound>xsd:integer</lowerBound> ?
    <upperBound>xsd:integer</upperBound> ?
    <exact>xsd:integer</exact> *
    <range>
      <lowerBound>xsd:integer</lowerBound>
      <upperBound>xsd:integer</upperBound>
    </range> *
  </...>

I have changed the names of the pieces from what I think Stephen showed me, in order to make their meaning as obvious as possible.

The whole element is more or less a disjunction of its expression parts, but please note below the special treatment of the optional semi-space constraints (the top-level lower/upper bounds): they are treated identically to a regular range element when both are specified, and only act as true semi-spaces when one is omitted. All other exact and range expressions are treated independently (as a true disjunction).

The alternatives to this that I came up with all seemed unsatisfactory:

1) make it a complete disjunction, meaning any value over the lower bound OR under the upper bound is in range;

2) other hybrid conjunctions, meaning multiple bounds, ranges, and/or exact expressions have to match.

INTEGER OR FLOATING-POINT REPRESENTATION
-----------------------------------------

We had discussed using xsd:integer, which implies that any use of this type must define the "base" units that are being counted. The only problem among our base terms is (cpu-)time, where I wonder if we are better off using a floating-point type to allow the base units to be seconds and still allow fractional-second specification. Choosing some specific fractional second as the base unit seems unappealing to me.
If we provide a floating-point/fractional version, I think there needs to be an optional attribute on the exact element to specify a precision or epsilon value for equality tests, e.g.

  <exact jsdl:precision="0.001">3.1415927...</exact>

could match anything in the range (3.1405927..., 3.1425927...). Or, the consumer could treat it as undefined and do something appropriate if a precision is specified which it cannot support.

Perhaps a simpler solution is to stick with arbitrary-length integers and add an optional divisor attribute that defaults to 1:

  <exact jsdl:divisor="100000">314159</exact>

which would be exactly 3.14159 in decimal fractions? Of course, a different divisor like "1024" could be used for binary fractions. I am not a numerical analyst, so I would prefer we bounce any such proposal off of several before adopting it.

SEMANTICS
------------------------------------------

The matching semantics would be as follows for an element of this type:

  let booleans L and U be whether lower/upper bound values are specified;
  let integers l and u be the lower/upper bound values, respectively;
  let E be { e | e is specified in an exact element };
  let R be { <l,u> | <l,u> is specified in a range element };

  in_range(x) =    (!L || l <= x) && (!U || x <= u)
                || there exists e in E such that x = e
                || there exists <l,u> in R such that l <= x <= u.

INCLUSIVE VERSUS EXCLUSIVE RANGES
-----------------------------------------

Also, I suggest that we place an optional attribute on the boundary elements:

  jsdl:exclusiveBound=xsd:boolean

with default false, meaning that by default the range of acceptable values includes the boundary value in the element body. Setting it to true would mean that the boundary value is not part of the range. This supports any meaning captured before in the operator enumeration, I think.
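To make the evaluation rule concrete, here is a minimal sketch of the matching semantics in Python. All names are illustrative, not part of any agreed JSDL schema, and I have assumed one detail the formula leaves open: when neither top-level bound is specified, the bounds term drops out of the disjunction (read literally, the first disjunct would otherwise be vacuously true and every value would match).

```python
# Hypothetical sketch of the in_range(x) semantics; integer values,
# inclusive bounds (the proposed default). Names are illustrative only.
from typing import Optional, Sequence, Tuple

def in_range(x: int,
             lower: Optional[int] = None,       # top-level lowerBound (L/l)
             upper: Optional[int] = None,       # top-level upperBound (U/u)
             exacts: Sequence[int] = (),        # E: values of exact elements
             ranges: Sequence[Tuple[int, int]] = ()  # R: <l,u> range pairs
             ) -> bool:
    # The two top-level bounds act as a conjunction: both semi-spaces
    # must admit x. An omitted bound (None) places no constraint.
    bounds_ok = ((lower is None or lower <= x) and
                 (upper is None or x <= upper))
    # Assumption (not stated in the formula): with both bounds omitted,
    # the bounds term contributes nothing to the disjunction.
    if lower is None and upper is None:
        bounds_ok = False
    return (bounds_ok
            or any(x == e for e in exacts)              # exact disjuncts
            or any(l <= x <= u for l, u in ranges))     # range disjuncts
```

Note how the top-level bounds are the only conjunctive part; every exact and range element is an independent disjunct, exactly as described above.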
A minor note is that the elements have to appear in the sequence order, which I argue is a good thing for machine-machine communication as the parse tree will yield three monomorphic arrays of values with clear meanings, rather than one polymorphic array that the consumer has to traverse.

karl

--
Karl Czajkowski
karlcz@univa.com

Karl Czajkowski wrote:
It is very straightforward in terms of writing XSD. With some changes I recommend, it would look like this as a sequence:

  <...>
    <lowerBound>xsd:integer</lowerBound> ?
    <upperBound>xsd:integer</upperBound> ?
    <exact>xsd:integer</exact> *
    <range>
      <lowerBound>xsd:integer</lowerBound>
      <upperBound>xsd:integer</upperBound>
    </range> *
  </...>
Looks mostly good, except we should use xsd:nonNegativeInteger or xsd:double instead of xsd:integer. I can't currently think of a use case for negative bounds, but there might be a case to be made for allowing floating-point representations. (They become more useful when you start using extension resources, perhaps because they are modelling some scientific instrument, and we want to encourage the reuse of our types so that they can be tooled just once.) Your semantics section is exactly right IMO, modulo the exclusiveBound attribute discussion (which mostly just expands it out with lots of minor variations).
We had discussed using xsd:integer and this implies that any use of this type must just define the "base" units that are being counted. The only problem in our base terms is (cpu-)time, where I wonder if we are better off using a floating-point type to allow the base units to be seconds and still allow fractional second specification. Choosing some specific fractional second as the base unit seems unappealing to me.
If we provide a floating point/fractional version, I think there needs to be an optional attribute on the exact element to specify a precision or epsilon value for equality tests, e.g. <exact jsdl:precision="0.001">3.1415927...</exact>
Actually, the correct thing to do is for the caller to always use bounded intervals with floats unless they really know exactly what they are after, since only some floats (e.g. 1.25) are actually exactly representable in IEEE arithmetic. OK, it's punting the problem to the document creator (the JSDL processor just checks what it is told), but that's the right thing to do in my experience with processing floats.
Perhaps a simpler solution is to stick with arbitrary length integers and add an optional divisor attribute that defaults to 1: <exact jsdl:divisor="100000">314159</exact>
Ick. That really makes handling floats much nastier! There's no need to do this; xsd:double will be handled right (and tooled nicely) as long as callers don't have unrealistic expectations of float math. (OK, many people do have those unrealistic expectations, but that's not our fault and we can't fix the world.)
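Donal's point about exact representability can be demonstrated in a few lines; the behaviour below is standard IEEE-754 double arithmetic (which is also what xsd:double specifies), shown in Python purely for illustration.

```python
# 1.25 is a finite sum of powers of two (1 + 1/4), so an IEEE-754
# double stores it exactly; 0.1 has no finite binary expansion, so
# decimal-looking arithmetic drifts.
assert 1.25 + 1.25 == 2.5   # exact: binary fraction
assert 0.1 + 0.2 != 0.3     # inexact: classic rounding artifact

# Consequence for this proposal: an <exact> test on a double is only
# reliable for exactly-representable values; otherwise a bounded
# interval (lowerBound/upperBound) is the robust encoding.
```

This is why pushing the choice onto the document creator, as suggested above, is a reasonable division of labour: only the creator knows whether the value they intend is exactly representable.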
Also, I suggest that we place an optional attribute in the boundary elements: jsdl:exclusiveBound=xsd:boolean with default false, meaning that by default the range of acceptable values includes the boundary value in the element body. Setting it to true would mean that the boundary value is not part of the range. This supports any meaning captured before in the operator enumeration, I think.
I have absolutely no objection to adding this to UpperBound and LowerBound, and it makes working with real-like numbers much easier (there are some subtleties in there to do with the difference between open and closed interval bounds).
A minor note is that the elements have to appear in the sequence order, which I argue is a good thing for machine-machine communication as the parse tree will yield three monomorphic arrays of values with clear meanings, rather than one polymorphic array that the consumer has to traverse.
I'm not sure I'd enforce that, but as we don't need to define an algorithm for minimization or testing equivalence of range types, I'd just not bother. Say that doing it is recommended, not required. :^) Donal.

On Mar 18, Donal K. Fellows loaded a tape reading:
We had discussed using xsd:integer and this implies that any use of this type must just define the "base" units that are being counted. The only problem in our base terms is (cpu-)time, where I wonder if we are better off using a floating-point type to allow the base units to be seconds and still allow fractional second specification. Choosing some specific fractional second as the base unit seems unappealing to me.
If we provide a floating point/fractional version, I think there needs to be an optional attribute on the exact element to specify a precision or epsilon value for equality tests, e.g. <exact jsdl:precision="0.001">3.1415927...</exact>
Actually, the correct thing to do is for the caller to always use bounded intervals with floats unless they really know exactly what they are after, since only some floats (e.g. 1.25) are actually exactly representable in IEEE arithmetic. OK, it's punting the problem to the document creator (the JSDL processor just checks what it is told), but that's the right thing to do in my experience with processing floats.
Well, if we're going to support floats, my point was moot. My only question is whether float/integer is a choice made by the schema author for a term or a runtime choice made by the document creator. I'd rather see two versions of our type, e.g. jsdl:integerRangeValueType and jsdl:floatingRangeValueType, and a term element definition has to pick one. I could see offering more variants while we are at it, to capture non-negative types etc. Because I do not feel confident I understand all future resource ontologies, I am not comfortable saying that resource-selection metrics never use negative values. So I think we need to offer signed integer (and float) in a core set of types.
Ick. That really makes handling floats much nastier! There's no need to do this; xsd:double will be handled right (and tooled nicely) as long as callers don't have unrealistic expectations of float math. (OK, many people do have those unrealistic expectations, but that's not our fault and we can't fix the world.)
I'm happy to have floating point variants.
A minor note is that the elements have to appear in the sequence order, which I argue is a good thing for machine-machine communication as the parse tree will yield three monomorphic arrays of values with clear meanings, rather than one polymorphic array that the consumer has to traverse.
I'm not sure I'd enforce that, but as we don't need to define an algorithm for minimization or testing equivalence of range types, I'd just not bother. Say that doing it is recommended, not required. :^)
Donal.
Well, it is actually a matter of having to write a more complicated schema to support reordering! The basic one Stephen and I discussed is (adjusted for my latest proposal):

  complexType
    sequence
      lowerBound ?
      upperBound ?
      exact *
      range *

which exactly captures the cardinality requirements associated with the intended evaluation semantics. On the other hand, the mixed-order one (which results in nastier parse trees) is something more like:

  complexType
    sequence
      lowerBound ?
      upperBound ?
      choice *
        exact
        range

and it is even harder (or impossible?) to allow the two boundary elements to be reordered without relaxing the [0,1] cardinality constraint.

karl

--
Karl Czajkowski
karlcz@univa.com
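For concreteness, the ordered variant sketched above might be written out in full XSD roughly as follows. This is only an illustrative sketch: the type name and namespace conventions are assumptions, not anything the group has agreed on.

```xml
<!-- Hypothetical XSD for the ordered variant; names are illustrative. -->
<xsd:complexType name="rangeValueType">
  <xsd:sequence>
    <xsd:element name="lowerBound" type="xsd:integer" minOccurs="0"/>
    <xsd:element name="upperBound" type="xsd:integer" minOccurs="0"/>
    <xsd:element name="exact" type="xsd:integer"
                 minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="range" minOccurs="0" maxOccurs="unbounded">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="lowerBound" type="xsd:integer"/>
          <xsd:element name="upperBound" type="xsd:integer"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>
```

The mixed-order variant would replace the two trailing particles with a single `<xsd:choice minOccurs="0" maxOccurs="unbounded">` containing exact and range, which is what produces the polymorphic parse tree discussed above.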