Here's the basis for the info/data item on the Oct 19 agenda.
Jay and Ellen
-------------------------------------------------------------------------------------
Original XQuery Usage Proposal
We proposed XQuery as a means of expressing
job requirements and matching those requirements against resource capabilities.
It was expected that job submission would actually contain XQuery
logic/language as the means of expressing those job requirements. This
was proposed because such a comprehensive, robust, and expressive language
would permit very detailed and powerful matching to take place. Also,
this was consistent with projection of the Condor Class-Ad concept and
approach into the more modern and open XML space. Class-Ads also
have a very programming-language-like syntax for writing job requirements
specifications.
Oct. 2 OGSA call
Michel showed a particular pattern of
usage of XQuery that we believe is very limiting. Additionally, Michel
stated that while he agreed that XQuery was very powerful, it was also
potentially very open-ended and therefore might be dangerous to the integrity
of a system.
Response
Michel suggests a model in which XQuery
logic/functions would be used as an intermediate mechanism for comparing
a more value-static XML job requirements document to collections of similar
value-static resource capabilities documents. Michel’s model would
create several problems that would lead to a much more fragile and brittle
system. Any extension of the requirements or capabilities elements
in these documents would require not only that new schema (XSD) be written, but
also that comparator XQuery functions be supplied. Further, no
matter how well such comparator functions might be crafted, there would
always be combinations of comparisons, ranges, and boundary conditions
that they couldn't express without further extension. This means
that even moderately complex job requirements documents might require extensions
to the comparator functions in order to be accommodated.
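
To make the brittleness concrete, here is a minimal Python sketch of the
value-static model with a fixed comparator table. It is not Michel's actual
design; the element names (minMemoryMB, memoryMB, os, maxLoad) and the
functions are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical comparator table: one fixed function per known requirement element.
COMPARATORS = {
    "minMemoryMB": lambda required, offered: float(offered) >= float(required),
    "os": lambda required, offered: offered == required,
}

# Hypothetical mapping from a requirement element to the capability it is checked against.
CAPABILITY_FOR = {"minMemoryMB": "memoryMB", "os": "os"}


class NoComparator(Exception):
    """Raised when a requirement element has no comparator function."""


def matches(requirements_xml, capabilities_xml):
    """Return True if every requirement element is satisfied by the capabilities."""
    requirements = ET.fromstring(requirements_xml)
    capabilities = ET.fromstring(capabilities_xml)
    for element in requirements:
        compare = COMPARATORS.get(element.tag)
        if compare is None:
            # A new requirement element needs new comparator code, not just new schema.
            raise NoComparator(element.tag)
        offered = capabilities.findtext(CAPABILITY_FOR[element.tag])
        if offered is None or not compare(element.text, offered):
            return False
    return True
```

In this sketch every requirement is implicitly AND-ed. A requirement such as
"at least 2048 MB, or at least 1024 MB if the OS is Linux" cannot be written
in the requirements document at all; it would need another comparator
extension.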
As a powerful query language, XQuery
does provide opportunities for abuse, either accidental or intentional.
It is possible to create queries that would either consume a great
deal of resources (compute power or I/O bandwidth) or exhibit malicious
behavior and damage the environment the query runs in. As with any
interface, these are situations that need to be protected against
both in design and in implementation. One primary mechanism that
could cause XQuery programs to run unchecked is the ability to reference
external sources: additional XQuery syntax (Imbeds, prologs etc.), documents
to operate on (accessor functions), or schemas. While the mechanisms
specified in the XQuery standard do not limit the scope of such accesses,
implementations are free to (and expected to) protect their environment.
Similarly, our use of XQuery in job submission and resource matching
should not explicitly place limits on the scope of either documents or
schemas referenced, nor define specifically whether the resource capabilities
documents are collected by some "repository" or whether queries
are distributed, executed by distributed engines, and their results returned
to a scheduler. We should, however, recommend that implementations of
collections of schedulers limit the scope of such references.
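
One way an implementation might limit the scope of references is an
allowlist applied before any external document or schema is fetched. The
following Python sketch assumes a policy expressed as (scheme, host, path
prefix) triples; the function name, exception, and policy format are
hypothetical and not part of XQuery or any OGSA specification.

```python
import posixpath
from urllib.parse import urlsplit


class ReferenceNotAllowed(Exception):
    """Raised when a query references a source outside the configured scope."""


def check_reference(uri, allowed):
    """Return uri if it falls under an allowed (scheme, host, path prefix) triple."""
    parts = urlsplit(uri)
    # Normalize the path so "/a/../b" cannot slip past a prefix check on "/a/".
    path = posixpath.normpath(parts.path)
    for scheme, host, path_prefix in allowed:
        if (parts.scheme == scheme
                and parts.hostname == host
                and path.startswith(path_prefix)):
            return uri
    raise ReferenceNotAllowed(uri)
```

A scheduler would call such a check from whatever hook its XQuery engine
offers for resolving documents, modules, and schemas, so the policy stays
with the implementation rather than in the job submission itself.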
Like most programming environments,
XQuery does provide a means to "write" external functions that
operate outside the XQuery specification but that can be invoked by any
XQuery. We should not explicitly disallow the use of such external
functions since they might prove useful, for example, in probing dynamic
information like the present utilization of an endpoint, but we should
alert implementers of schedulers that some limitations on external function
definition are warranted to protect the system.
XQuery also permits functions to be
defined in XQuery syntax for reuse, convenience, and expression brevity/clarity.
While this type of function is less troublesome than an external function
not written in XQuery syntax, it is still possible to define looping,
long-running, or non-terminating recursive functions that could consume excess
resources. The traditional means of protecting against this kind
of pathological behavior, whether it is introduced by accident (bugs) or
by malice, is to provide timeouts that terminate long-running queries, especially
those that don't appear to be generating any usable output. Our specifications
should not dictate what the limits on query execution should be, but rather
recommend that implementations will need such limits to protect system
resources.
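
As an illustration of such limits, the Python sketch below enforces both a
wall-clock limit and a "no usable output" limit on a query. It assumes the
query engine can be driven as an iterator of evaluation steps, each yielding
either a result item or None; that interface and all names here are
hypothetical. The check is cooperative, so a single step that never returns
would still need a preemptive mechanism (for example, a separate process
that can be killed).

```python
import time


class QueryTimeout(Exception):
    """Raised when a query exceeds its execution limits."""


def run_with_limits(steps, max_seconds, max_idle_steps):
    """Collect results from steps, aborting on elapsed time or lack of output.

    steps yields one value per evaluation step: a result item, or None when
    the step produced no output.
    """
    deadline = time.monotonic() + max_seconds
    results = []
    idle_steps = 0
    for item in steps:
        if time.monotonic() > deadline:
            raise QueryTimeout("wall-clock limit exceeded")
        if item is None:
            idle_steps += 1
            if idle_steps > max_idle_steps:
                raise QueryTimeout("no output produced for too many steps")
        else:
            idle_steps = 0
            results.append(item)
    return results
```

The actual limit values are deliberately left as parameters, in keeping with
the point that the specification should recommend limits without dictating
them.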
OGSA use of XQuery as the basis of our
scheduler job requirements and resource capabilities matching should permit
most of the expressivity and power of the language to be exploited. We
can explicitly profile such features as externally defined functions, restricting
them to implementation-controlled prologs (or built-in definitions), and thereby
control their impact. Where profiling out a language feature would
be unwise because it would limit flexibility, we can point out in implementation
notes what restrictions should be applied to protect the system. Lastly,
there are already some features in XQuery that are implementation-dependent
and create the possibility of users creating non-portable query expressions
that would not work in some implementations. We should identify
and profile any such particular feature that we think needs to be supported
in a particular interoperable fashion for our usage or simply profile it
out to avoid problems.
-------------------------------------------------------------------------------------
5) Information/data modeling (Ellen Stokes, 30 min)
> - Discussion on if a requirements XML document could be constructed
> with powerful comparative expressions and at same time avoids the
> XPath expressions/joins on collecting input documents? Perhaps just
> the relevant subset of a query in job. - Agreed to discuss more in
> second hour on Next Call - Oct 19th