Re: [ogsa-rss-wg] RE: [ogsa-wg] Teleconference minutes - 2 November 2005

12 Dec 2005

      Donal K. Fellows wrote:
...
First, at this year's London F2F people began talking about the need to
do scheduling over all services, not just execution services. I.e. we
may need to schedule access to data services and network services.  It
would be desirable to have a framework generic enough to handle all
sorts of resources (as long as they in turn provide the necessary
information).  Has anyone considered this w.r.t. RSS?
It's not being actively considered as such, but it is certainly at the
back of my mind and I'm trying to not take any decisions that would
close off such use.
...
Second, there are lots of ways to implement scheduling.  Would it be
possible just to specify the interfaces and allow implementations of the
services to use whichever algorithms they want?  E.g. in the data
architecture we say almost nothing about the functioning of a data
federation service, because we can abstract from most of the inner
workings.  Do we clearly understand what the output of the RSS service
should be?
We're most certainly not saying much about the innards of the services!
That would be a mistake as it is clear from the pre-existing work (done
by the GSA-RG) that there are many approaches in this space. From my own
experience, I suspect that it is likely that there will be many
different "brokering" systems about, many of which are specialized to
dealing with a particular domain.

At the moment, I'm thinking about the abstract output of the RSS services
as ordered sets of candidate execution plans, probably encapsulated
within WS-Agreement Templates. The ground elements of the plans (or at
least those parts that relate to computational activity, of course) will
likely be JSDL documents suitable for submission to BES containers. When
it comes to data stuff, I don't understand the requirements well enough
to say much, but I believe that the suggested outer structure (ordered
set of agreement templates) will extend to that sort of thing nicely; it
is just the leaves of such a tree that I don't understand.
...
Third, you mention the idea of passing in a ranking function parameter.
This sounds similar to a question I raised recently on this list about
selecting data sources, in which I suggested passing in a policy
argument.  Admittedly I was making that suggestion in the particular
context of passing a policy argument to a reference renewing service,
but the two cases do seem sufficiently similar to merit a comparison.
The replies I received then expressed a strong preference for letting
the selection be done by the client rather than attempting to
parameterise a service.  I think it's worth drilling down on this issue,
and this is what I attempt to do for the rest of this message.
The problem with the other way (caller gets all offers and then chooses)
is that it doesn't scale well when the number of potential candidates
goes up. By supplying a ranking function to the generator side and using
a service concretization based on something like WS-Enumeration, you can
come up with protocols that take reasonably high quality decisions with
very little network traffic or local computation.
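
A minimal sketch of that pattern, under assumptions not in the original
message: the candidate attributes (site, cost, wait_minutes), the
weighted_cost function, and the pull() helper are all hypothetical, and
a plain Python generator stands in for a WS-Enumeration context. The
generator side orders candidates with the caller's ranking function and
the caller pulls only as many as it wants to look at.

```python
import heapq
from itertools import islice


def ranked_enumeration(candidates, rank):
    """Generator side: yield candidates best-first (lowest rank value first).

    A heap avoids fully sorting candidates the caller never asks for.
    The index is a tie-breaker so candidates themselves are never compared.
    """
    heap = [(rank(c), i, c) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)[2]


def pull(enumeration, max_elements):
    """Caller side: fetch the next batch, loosely modelled on an enumeration Pull."""
    return list(islice(enumeration, max_elements))


# Hypothetical candidate offers; the attribute names are illustrative only.
offers = [
    {"site": "A", "cost": 12.0, "wait_minutes": 30},
    {"site": "B", "cost": 8.0, "wait_minutes": 90},
    {"site": "C", "cost": 9.5, "wait_minutes": 10},
    {"site": "D", "cost": 20.0, "wait_minutes": 5},
]


def weighted_cost(candidate):
    """Hypothetical ranking function supplied by the caller (lower is better)."""
    return candidate["cost"] + 0.25 * candidate["wait_minutes"]


enum = ranked_enumeration(offers, weighted_cost)
first_batch = pull(enum, 2)  # only two candidates cross the "network"
print([c["site"] for c in first_batch])
```

The caller can stop after the first batch if it is good enough, or pull
again for the next-best candidates; nothing forces it to fetch them all.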

Note that I'm looking for "good enough" decisions and not optimal ones.
I think that optimality is something of a chimera, and that you can get
within a few percent of it for far less effort. This will produce an
overall system that works very well, and cheaply, most of the time. In
those cases where the costs are so massive that even a small difference
is expensive, it will be possible to use a non-standard service that
tries harder for optimality (while probably charging extra for the
privilege, of course).

[...]
...
If we have parameterisation, then the question arises of how expressive
a language we need to specify the parameterisation.  Do we need a
general-purpose programming language (e.g. Python script) or will the
draft WS-Agreement language be sufficient?
As far as I can see from reading the version that went out to public
comment, WS-Agreement currently just punts on this issue. I'm hoping
that it will be possible to avoid embedding a general programming
language expression, though; it's remarkably hard to secure such things,
and mandating one language over another (necessary for interoperability)
would trigger a lot of arguments.

This is an area that's definitely going to need more work. (If we can't
specify something XMLish, we need to choose a suitable language based on
capabilities and safety properties.)
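
To illustrate what a restricted, data-only ranking specification might
look like (this is purely a hypothetical shape, not anything defined by
WS-Agreement or RSS), the sketch below treats the ranking parameter as a
list of attribute/weight pairs. The service evaluates it as a weighted
sum and never executes caller-supplied code. The names make_rank,
"attribute" and "weight" are assumptions made for the example.

```python
def make_rank(spec):
    """Build a ranking function from a declarative spec.

    `spec` is plain data (hypothetical format): a list of
    {"attribute": name, "weight": number} entries. Values are coerced
    to str/float up front, so nothing from the caller is ever executed.
    """
    terms = [(str(t["attribute"]), float(t["weight"])) for t in spec]

    def rank(candidate):
        # Weighted sum of the named attributes; lower is better.
        return sum(weight * float(candidate[name]) for name, weight in terms)

    return rank


# Hypothetical spec: price matters fully, queue wait at a quarter weight.
spec = [
    {"attribute": "cost", "weight": 1.0},
    {"attribute": "wait_minutes", "weight": 0.25},
]
rank = make_rank(spec)
score = rank({"site": "B", "cost": 8.0, "wait_minutes": 90})
print(score)
```

Such a format is far less expressive than a script, which is the
trade-off: it is easy to validate and to render in XML, but anything
beyond a linear combination would need an extension.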
...
The replies I received suggested both that the parameterisation would
require a general programming language and that the client may have
privileged knowledge that it does not wish to share, and therefore it
would be better to leave decisions to the client.  I'm rather sceptical
of the first point but it would be good to examine real systems to
determine what is really needed.
I think that as long as you are just ordering and not selecting, you
have far less of a problem. The final decision rests with the client
still; all they're doing is exporting the first stage of their selection
process (the initial sort) to the server side. But you can only make
that efficient if you use something like WS-Enumeration. The other
advantage of doing this is that you end up with the equivalent of a
distributed merge sort when you start doing tricky things with delegated
candidate set generation.
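
The merge step can be sketched as follows, assuming each delegated
generator already returns its candidates ordered by the same ranking
function. The broker names, the "score" attribute, and merge_ranked are
hypothetical; heapq.merge from the Python standard library does the
lazy k-way merge.

```python
import heapq
from itertools import islice


def merge_ranked(streams, rank):
    """Lazily merge several best-first candidate streams into one.

    Each input stream must already be ordered by `rank` (lowest first);
    the result is then ordered too, and candidates are only consumed
    from the inputs as the caller asks for more.
    """
    return heapq.merge(*streams, key=rank)


def score(candidate):
    """Hypothetical shared ranking function (lower is better)."""
    return candidate["score"]


# Hypothetical results from two delegated generators, each pre-sorted.
broker_x = [{"site": "X1", "score": 1.0}, {"site": "X2", "score": 4.0}]
broker_y = [
    {"site": "Y1", "score": 2.0},
    {"site": "Y2", "score": 3.0},
    {"site": "Y3", "score": 9.0},
]

# The caller takes only the three best overall candidates.
top = list(islice(merge_ranked([iter(broker_x), iter(broker_y)], score), 3))
print([c["site"] for c in top])
```

Because the merge is lazy, the tail of each delegated stream (here X2
and Y3) need never be transferred if the caller is satisfied early.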
...
There are clearly many possible answers in this space - which brings us
back to Ian's list of schedulers.  One question might be whether we can
identify an interface, or a small number of interfaces, that generalises
a significant number of practical use cases.
I thought that (modulo that RSS isn't working on reservation; out of
scope according to the current charter) was what we were working on. :-)

Donal.