Here's the basis for the info/data item on the Oct 19 agenda.
Jay and Ellen

-------------------------------------------------------------------------------------

Original XQuery Usage Proposal

We proposed XQuery as a means of expressing job requirements and matching those requirements against resource capabilities. The expectation was that a job submission would itself contain XQuery logic/language as the means of expressing those job requirements. We proposed this because such a comprehensive, robust, and expressive language would permit very detailed and powerful matching to take place. It is also consistent with projecting the Condor Class-Ad concept and approach into the more modern and open XML space: Class-Ads likewise have a very programming-language-like syntax for writing job requirements specifications.

Oct. 2 OGSA call

Michel showed a particular pattern of XQuery usage that we believe is very limiting. Additionally, Michel stated that while he agreed that XQuery was very powerful, it was also potentially very open-ended and therefore might be dangerous to the integrity of a system.

Response

Michel suggests a model in which XQuery logic/functions would be used as an intermediate mechanism for comparing a more value-static XML job requirements document to collections of similar value-static resource capabilities documents. We believe this model would create several problems and lead to a much more fragile and brittle system. Any extension of the requirements or capabilities elements in the documents would require not only that schema (XSD) be written, but also that comparator XQuery functions be supplied. Further, no matter how well such comparator functions were crafted, there would always be combinations of comparisons, ranges, and boundary conditions that they could not express without further extension. This means that even moderately complex job requirements documents might require extensions to the comparator functions in order to be accommodated.
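
To make the concern concrete, here is a minimal sketch of the two models. It is written in Python rather than XQuery so that it can be run as-is, and everything in it is an assumption made for illustration: the element names (cpus, memoryGB, os), the comparator names (eq, ge, between), and the functions match_static and match_expression are hypothetical and are not taken from Michel's proposal or from any OGSA schema. A Python function stands in for the XQuery expression a job would carry.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource capabilities document (element names are illustrative).
RESOURCE_XML = """
<resource name="nodeA">
  <cpus>8</cpus>
  <memoryGB>12</memoryGB>
  <os>linux</os>
</resource>
"""

# Fixed comparator table, standing in for the supplied comparator functions.
COMPARATORS = {
    "eq": lambda actual, wanted: actual == wanted,
    "ge": lambda actual, wanted: float(actual) >= float(wanted),
}


def match_static(requirements, resource):
    """Value-static model: each requirement is (element, comparator, value)."""
    for element, op, wanted in requirements:
        if op not in COMPARATORS:
            # The requirement cannot be evaluated until someone extends the table.
            raise KeyError(f"no comparator function for {op!r}")
        actual = resource.findtext(element)
        if actual is None or not COMPARATORS[op](actual, wanted):
            return False
    return True


def match_expression(predicate, resource):
    """Expression model: the job carries the matching logic itself."""
    return bool(predicate(resource))


resource = ET.fromstring(RESOURCE_XML)

# Simple conjunctions work in the static model.
static_ok = match_static([("os", "eq", "linux"), ("cpus", "ge", "4")], resource)

# A range needs a comparator that was never supplied.
try:
    match_static([("memoryGB", "between", "4,16")], resource)
    range_supported = True
except KeyError:
    range_supported = False


# The same range, plus an alternative, written directly as an expression.
def job_predicate(r):
    mem = float(r.findtext("memoryGB"))
    cpus = int(r.findtext("cpus"))
    return 4 <= mem <= 16 or cpus >= 16


expr_ok = match_expression(job_predicate, resource)
```

In this sketch the range requirement fails in the static model only because no "between" comparator exists yet, which is the kind of extension-on-demand the paragraph above describes; the expression form needs no such extension.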

As a powerful query language, XQuery does provide opportunities for abuse, either accidental or intentional. It is possible to create queries that would either consume a great deal of resources (compute power or I/O bandwidth) or exhibit malicious behavior and damage the environment the query was run in. As with any interface, these are situations that need to be protected against in both design and implementation. One primary mechanism that could cause XQuery programs to run unchecked is the ability to reference external sources: additional XQuery syntax (imbeds, prologs, etc.), documents to operate on (accessor functions), or schemas. While the mechanisms specified in the XQuery standard do not limit the scope of such accesses, implementations are free to (and expected to) protect their environment. Similarly, our use of XQuery in job submission and resource matching should not explicitly place limits on the scope of either documents or schemas referenced. Nor should it define specifically whether the resource capabilities documents are collected by some "repository" or whether queries are distributed and executed by distributed engines, with results returned to a scheduler. We should, however, recommend that implementations of collections of schedulers limit the scope of such references.
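
One way an implementation might limit the scope of external references is to check every document or schema URI against a local policy before resolving it. The sketch below is Python and purely illustrative: ReferencePolicy, resolve_document, the host registry.example.org, and the in-memory FAKE_STORE are all hypothetical names invented here, and a real XQuery engine would apply such a check inside its own document and schema resolution hooks.

```python
from urllib.parse import urlparse


class ReferencePolicy:
    """Hypothetical allow-list for external document and schema references."""

    def __init__(self, allowed_schemes, allowed_hosts):
        self.allowed_schemes = set(allowed_schemes)
        self.allowed_hosts = set(allowed_hosts)

    def permits(self, uri):
        parts = urlparse(uri)
        return (parts.scheme in self.allowed_schemes
                and parts.hostname in self.allowed_hosts)


def resolve_document(uri, policy, fetch):
    """Resolve a reference only if the implementation's policy permits it."""
    if not policy.permits(uri):
        raise PermissionError(f"reference outside permitted scope: {uri}")
    return fetch(uri)


# Stand-in for a real fetch, so the sketch runs without network access.
FAKE_STORE = {
    "https://registry.example.org/caps/nodeA.xml": "<resource name='nodeA'/>",
}

policy = ReferencePolicy(allowed_schemes=["https"],
                         allowed_hosts=["registry.example.org"])

allowed_doc = resolve_document(
    "https://registry.example.org/caps/nodeA.xml", policy, FAKE_STORE.__getitem__)


def is_blocked(uri):
    try:
        resolve_document(uri, policy, FAKE_STORE.__getitem__)
        return False
    except PermissionError:
        return True


blocked_local_file = is_blocked("file:///etc/passwd")
blocked_other_host = is_blocked("https://elsewhere.example.com/big.xml")
```

The policy lives entirely in the implementation, so the specification itself stays silent on which documents or schemas may be referenced, as recommended above.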

Like most programming environments, XQuery provides a means to "write" external functions that operate outside the XQuery specification but that can be invoked by any XQuery. We should not explicitly disallow the use of such external functions, since they might prove useful, for example, in probing dynamic information like the present utilization of an endpoint. We should, however, alert implementers of schedulers that some limitations on external function definition are warranted to protect the system.

XQuery also permits functions to be defined in XQuery syntax for reuse, convenience, and expression brevity/clarity. While this type of function is less troublesome than external functions not written in XQuery syntax, it is still possible to define looping, long-running, or non-terminating recursive functions that could consume excess resources. The traditional means of protecting against this kind of pathological behavior, whether it is introduced by accident (bugs) or by malice, is to provide timeouts that terminate long-running queries, especially those that don't appear to be generating any usable output. Our specifications should not dictate what the limits on query execution should be, but rather recommend that implementations will need such limits to protect system resources.
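
As an illustration of such limits, the following Python sketch runs a "query" under an implementation-chosen budget and terminates it when the budget is exhausted. It is a sketch under stated assumptions, not a description of any real XQuery engine: Budget, QueryLimitExceeded, run_with_limits, and the default limits are hypothetical, and the queries are plain Python functions that call budget.tick() cooperatively, whereas a real engine would enforce the limit inside its evaluator.

```python
import time


class QueryLimitExceeded(Exception):
    """Raised when a query exhausts its time or step budget."""


class Budget:
    """Hypothetical per-query budget chosen by the implementation."""

    def __init__(self, max_seconds, max_steps):
        self.deadline = time.monotonic() + max_seconds
        self.steps_left = max_steps

    def tick(self):
        # Called once per evaluation step (loop iteration or function call).
        self.steps_left -= 1
        if self.steps_left < 0:
            raise QueryLimitExceeded("step limit exceeded")
        if time.monotonic() > self.deadline:
            raise QueryLimitExceeded("time limit exceeded")


def run_with_limits(query, max_seconds=1.0, max_steps=10_000):
    """Run a query under a budget; report termination instead of hanging."""
    budget = Budget(max_seconds, max_steps)
    try:
        return ("ok", query(budget))
    except QueryLimitExceeded as exc:
        return ("terminated", str(exc))


def well_behaved(budget):
    total = 0
    for n in range(100):
        budget.tick()
        total += n
    return total


def runaway(budget):
    # Stands in for a non-terminating recursive or looping function.
    while True:
        budget.tick()


good_result = run_with_limits(well_behaved)
bad_result = run_with_limits(runaway)
```

The actual limit values are left as parameters here for the same reason given above: the specification would recommend that limits exist, while each implementation decides what they are.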

OGSA use of XQuery as the basis of our scheduler job requirements and resource capabilities matching should permit most of the expressivity and power of the language to be exploited. We can explicitly profile such features as externally defined functions to implementation-controlled prologs (or built-in definitions) and thereby control their impact. Where profiling out a language feature would be unwise because it would limit flexibility, we can point out in implementation notes what restrictions should be applied to protect the system. Lastly, there are already some features in XQuery that are implementation dependent, which makes it possible for users to write non-portable query expressions that would not work in some implementations. We should identify any such feature and either profile it so that it is supported in a particular interoperable fashion for our usage, or simply profile it out to avoid problems.

-------------------------------------------------------------------------------------

5) Information/data modeling (Ellen Stokes, 30 min)
> - Discussion on if a requirements XML document could be constructed
>   with powerful comparative expressions and at same time avoids the
>   XPath expressions/joins on collecting input documents? Perhaps just
>   the relevant subset of a query in job.
> - Agreed to discuss more in second hour on Next Call - Oct 19th