[ogsa-bes-wg] comments on scoping... (london f2f minutes)

8 Jun 2005

      Dear OGSA-BES-WG at large:

After reading the minutes from the May London f2f, I have some
concerns about scoping and terminology.  It seems that discussions
went all over the map compared to what I gathered from the last
telecon's quick summary...

First off, there are (I think) three conceptual tiers or planes
in the whole execution management problem as it relates to
"simple" targets like BES:

  1. The allocable resource.  This is the pool of capabilities which
     are consumed by executing processes and which are, to varying
     extents, managed/scheduled/etc.

     For example, the pool of compute nodes w/ their respective CPUs,
     RAM, and interconnection hardware.

  2. The "container-level" resource management service.  This is the
     logical entity which manages (1) by accepting requests with
     embedded parameters such as the JSDL job description and enacting
     the described tasks.

     For example, GRAM is a Globus Toolkit container-level service
     that is implemented by mapping to one of several local resource
     managers.  Many of these local managers have service interfaces
     in their own right.

  3. The "application" or "activity". This is the domain-specific
     process that comes to life as a result of the execution, i.e. it
     is _hosted_ by (2) and consumes (1).

     For example, an HPC linear algebra code, or a service like
     NetSOLVE, or a web server or any other service that consists of a
     program running for some length of time.

I am belaboring this point because I think the minutes show a
confusion between these different tiers.  There is not really a
factory/child relationship between the layers, but rather each layer
can be conceptually decomposed into groups and instances which have
mappings between the layers:

   Resource Layer     Container Layer     Application Layer

     Pool <--manages--- Manager <--requests-- (Application Manager)
      A                   A                      A
      | partOf            | hostedBy?            | controlledBy?
      |                   |                      |
     Allocation <-uses- Job/Activity -XYZ---> App. Service

   XYZ=instantiates/hosts/something like that...
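
As a rough illustration of the diagram above, the three layers and their
mappings could be written down as plain data types.  This is only a
sketch: every class, field, and method name below is hypothetical and
comes from neither BES, GRAM, nor JSDL, and the CPU count stands in for
whatever capabilities a real pool would track.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pool:
    """Resource layer: the allocable capabilities (here just CPUs)."""
    name: str
    total_cpus: int
    allocations: List["Allocation"] = field(default_factory=list)

    def free_cpus(self) -> int:
        return self.total_cpus - sum(a.cpus for a in self.allocations)


@dataclass
class Allocation:
    """Resource layer: a slice carved out of a Pool (the partOf arrow)."""
    # Back-reference is excluded from repr/eq to avoid recursion.
    pool: Pool = field(repr=False, compare=False)
    cpus: int = 0


@dataclass
class AppService:
    """Application layer: the domain-specific process that comes to life."""
    description: str


@dataclass
class Activity:
    """Container layer: the per-job container that uses an Allocation."""
    manager: "Manager" = field(repr=False, compare=False)
    allocation: Optional[Allocation] = None
    app: Optional[AppService] = None


@dataclass
class Manager:
    """Container layer: manages a Pool and hosts a set of Activities."""
    pool: Pool
    activities: List[Activity] = field(default_factory=list)

    def create_activity(self, cpus: int, description: str) -> Activity:
        # Carve an allocation out of the pool, then wrap it in an activity.
        if cpus > self.pool.free_cpus():
            raise ValueError("request exceeds free capacity")
        alloc = Allocation(self.pool, cpus)
        self.pool.allocations.append(alloc)
        activity = Activity(self, alloc, AppService(description))
        self.activities.append(activity)
        return activity
```

Note that nothing here is a factory/child hierarchy across layers: the
Manager and Activity live in the container layer and merely point at
objects in the other two.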

It seems to me that most BES discussions are calling the
container-level manager above "the container", but not really naming
the container-level job/activity representation.  This second item is
what GRAM calls the Job, but it is in fact a sort of container for the
single job, while the manager has a set of such containers that it has
created as virtualizations of the allocations that are carved out of
the resource pool.  For completeness, this container instance would be
represented by an Agreement if BES used WS-Agreement combined with
JSDL.

Often, the discussions in BES seem to confusingly treat "the
activity" as equal to "the application", e.g. saying that the interfaces
of the activity are domain-specific.  I think this is wrong.  The
application service that is realized by the activity, e.g. by running
the executable referred to in a JSDL POSIXApplication, certainly has
domain-specific interfaces.  However, there is a generic
container-level activity which always has the same generic container
interface, e.g. "POSIX activity".  BES must define the interface for
these activities, whether or not they are rendered as separate WSRF
resources.
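
To make the distinction concrete, here is a minimal sketch that keeps
the generic container-level activity interface apart from the
domain-specific interface of the application it hosts.  All names
(ContainerActivity, PosixActivity, LinearSolverService, and the state
strings) are assumptions made up for illustration; BES defines none of
them.

```python
from abc import ABC, abstractmethod


class ContainerActivity(ABC):
    """Generic container-level interface: the same for every activity."""

    @abstractmethod
    def state(self) -> str:
        """Return the container's view of the activity state."""

    @abstractmethod
    def cancel(self) -> None:
        """Ask the container to stop the activity."""


class PosixActivity(ContainerActivity):
    """Hypothetical "POSIX activity": tracks one launched executable."""

    def __init__(self, executable: str) -> None:
        self.executable = executable
        self._state = "running"  # assumed state name

    def state(self) -> str:
        return self._state

    def cancel(self) -> None:
        self._state = "cancelled"  # assumed state name


class LinearSolverService:
    """Domain-specific application interface, opaque to the container."""

    def solve(self, a: float, b: float) -> float:
        # Solve a*x = b for x; stands in for any domain-specific call.
        return b / a
```

The container only ever sees ContainerActivity; whatever interface the
running application exposes is outside its concern.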

It seems to me that BES should be rigorously observing these
distinctions and seeing itself as a service for provisioning and
management of application services.  In other words, BES should stick
to a container-level abstraction for all its stateful semantics and
only reflect on the resource or application layers in its
advertisement and introspection interface:

  A. It should have the metamodels necessary to reflect on the
     resource pool and allocations as they relate to discovery,
     selection, and monitoring of BES service instances.

       i. Acceptable job configurations/resource requests, e.g.
          basic resource pool info PLUS policies.

      ii. Availability/load, e.g. dynamic restrictions on (A.i)

     iii. Allocation plans reflecting the current/future assignment of
          resources to activities, e.g. the complement of (A.ii).

      iv. Detailed "usage records" reflecting the consumption of
          resource layer capabilities by current/past activities,
          e.g. specialization of (A.iii) w/ monitor/accounting info.

  B. It should define the protocol for provisioning and managing
     application services.

       i. The createActivity pattern.

      ii. In-scope state-changing operations on activities,
          e.g. signals.  Specialization of BES may add things like
          suspend/resume and checkpoint-migration controls.

     iii. The metamodel for introspecting on existing activities,
          e.g. the "job states".

      iv. The cancel/destroy activity pattern(s).

  C. It might want to have a metamodel or metadata path for reflecting
     on application-specific status:

       i. Return codes passed from application to container on exit.

      ii. Heartbeat or other optional reporting channels.

     iii. (Domain-specific) rendezvous/contact info optionally
          registered by application.

  D. It should NOT define a management interface for managing the BES
     service instance itself.

       i. NOT handling deploy/start/stop of BES.

      ii. This could come from WSDM or similar related specs.

     iii. Or recursive application of a "static" BES service to manage
          the provisioning of dynamic BES instances!  A big can of
          worms here though, in separately provisioning the service
          logic and its own resource pool... sounds like a job for
          WS-Agreement to me. :-)
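
The scope in (A) through (D) can be sketched as a single toy service
interface.  This is an illustration under stated assumptions, not a
proposed BES rendering: the class name, method names, and the four state
names are all hypothetical, activities start running immediately, and a
CPU count stands in for the resource model.

```python
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    # Assumed state names; the text only refers to "job states".
    RUNNING = "running"
    DONE = "done"
    CANCELLED = "cancelled"


class BesSketch:
    def __init__(self, total_cpus: int) -> None:
        self._total_cpus = total_cpus
        self._states: Dict[int, State] = {}
        self._cpus: Dict[int, int] = {}
        self._exit_codes: Dict[int, int] = {}
        self._next_id = 1

    # (A.ii) availability/load, reflected from the resource layer
    def available_cpus(self) -> int:
        used = sum(cpus for aid, cpus in self._cpus.items()
                   if self._states[aid] is State.RUNNING)
        return self._total_cpus - used

    # (B.i) the createActivity pattern
    def create_activity(self, cpus: int) -> int:
        if cpus > self.available_cpus():
            raise ValueError("request exceeds available capacity")
        aid = self._next_id
        self._next_id += 1
        self._cpus[aid] = cpus
        self._states[aid] = State.RUNNING
        return aid

    # (B.iii) introspection on existing activities
    def activity_state(self, aid: int) -> State:
        return self._states[aid]

    # (B.iv) the cancel pattern; finished activities are left alone
    def cancel_activity(self, aid: int) -> None:
        if self._states[aid] is State.RUNNING:
            self._states[aid] = State.CANCELLED

    # (C.i) return code passed from application to container on exit
    def report_exit(self, aid: int, code: int) -> None:
        if self._states[aid] is State.RUNNING:
            self._states[aid] = State.DONE
            self._exit_codes[aid] = code

    def exit_code(self, aid: int) -> Optional[int]:
        return self._exit_codes.get(aid)

    # (D) deliberately absent: no deploy/start/stop of the service itself.
```

All stateful operations act on container-level activities; the resource
and application layers show up only through read-only reflection.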

Is this a useful map for defining the scope of BES?  Am I wrong in
feeling that the BES discussions are still getting lost in what these
entities are, or where the boundaries lie?

karl

-- 
Karl Czajkowski
karlcz@univa.com