New subject: [ogsa-bes-wg] Additional Input into BES w.r.t ESI document

11 May 2006

      All,

 From the OMII-Europe project (which is engaging with GGF in 
implementing grid standards) here are some comments from one of the 
teams that will be looking to implement the BES specification.

Please respond to the following comments. The attached document has 
comments on the proposed ESI state model and information/resource model.

Steven

About the Job Factory Interface:
*    The list of defined operations does not cover administrative
operations like “rejectJobSubmissions(Policy)” and
“allowJobSubmission(Policy)”, which would be useful for disabling/enabling
new job submissions based on a policy defined by the CE administrator (e.g.
when the CE has to be shut down for maintenance, or to disable new
submissions when the number of active jobs is > 3000). Do you plan to
provide them?

*    The current proposed JobFactory interface allows users to create new
jobs AND (optionally) subscribe for notifications. We believe that the job
management service should be better decoupled from the notification service,
as they provide different functionalities. We suggest that Figure 1 be
extended with a new box (“SubscriptionFactory”) which exposes an interface
for creating, modifying and removing notification requests. In this way,
notification management can be decoupled from job management, allowing a
greater degree of flexibility. For example, it would be possible for users
to subscribe to notifications after a job has been created (with the current
proposed interface, this would not be possible). Moreover, it would allow
users to submit to the notification service requests for “cumulative”
subscriptions (i.e., in order to receive notifications related to all jobs
submitted by the same user, or by members of the same Virtual Organization).

*    We propose to include a “JobAssess” operation on the JobFactory
Interface. This operation provides the user with an estimate of the start
running time, e.g. taking into account the current state of the Computing
Element, the number of running/queued jobs and other parameters.

*    Related to the previous issue, it would be interesting to include
an additional operation for estimating the Quality of Service (QoS) for the
service instance. The exact meaning of QoS in general depends on user
requirements (users may assume different weights for different parameters).

*    This is probably a “cosmetic” adjustment, but we believe that the
name of the “Release” operation makes sense only for reactivating jobs
from “Held” states; the transitions from “Start Pending” and “Staging In”
could probably be called something like “Activate”.

*    The proposed interface does not provide mechanisms for handling
capabilities. By this we mean the possibility for a user to authorize
other user(s) to perform certain operations on his/her jobs. For example, a
user may want to allow another user to monitor his/her jobs, or to interrupt
and abort jobs and so on. Perhaps this functionality is not strictly related
to job management, but is rather a security issue (which, according to the
draft, is still to be discussed). We may keep it for future discussions.
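
To make the “SubscriptionFactory” suggestion above more concrete, here is a
minimal Python sketch of how notification management could be separated from
job creation. Every class, method and parameter name in it is hypothetical
(nothing is taken from the BES draft), and the state names other than
“Start Pending” and “Held” are only illustrative. It shows a subscription
created after the job already exists, and a “cumulative” subscription that
matches all jobs of one Virtual Organization.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, Dict, Optional

# All names below are hypothetical illustrations, not part of the BES draft.


@dataclass
class Job:
    job_id: str
    owner: str
    vo: str
    state: str = "Start Pending"


@dataclass
class Subscription:
    callback: Callable[[Job, str], None]
    job_id: Optional[str] = None  # restrict to a single job
    owner: Optional[str] = None   # or to all jobs of one user
    vo: Optional[str] = None      # or to all jobs of one Virtual Organization

    def matches(self, job: Job) -> bool:
        # Every filter that is set must match; unset filters are ignored,
        # so a subscription with no filters matches every job.
        wanted = ((self.job_id, job.job_id),
                  (self.owner, job.owner),
                  (self.vo, job.vo))
        return all(want is None or want == have for want, have in wanted)


class SubscriptionFactory:
    """Creates, modifies and removes notification requests."""

    def __init__(self) -> None:
        self._subs: Dict[int, Subscription] = {}
        self._ids = count(1)

    def create(self, callback: Callable[[Job, str], None], **filters) -> int:
        sub_id = next(self._ids)
        self._subs[sub_id] = Subscription(callback, **filters)
        return sub_id

    def modify(self, sub_id: int, **filters) -> None:
        sub = self._subs[sub_id]
        for name, value in filters.items():
            if name not in ("job_id", "owner", "vo"):
                raise ValueError(f"unknown filter: {name}")
            setattr(sub, name, value)

    def remove(self, sub_id: int) -> None:
        self._subs.pop(sub_id, None)

    def publish(self, job: Job, new_state: str) -> None:
        # Deliver the state change to every matching subscription.
        for sub in list(self._subs.values()):
            if sub.matches(job):
                sub.callback(job, new_state)


class JobFactory:
    """Creates jobs only; it knows nothing about who is subscribed."""

    def __init__(self, notifier: SubscriptionFactory) -> None:
        self._notifier = notifier
        self._ids = count(1)
        self.jobs: Dict[str, Job] = {}

    def create_job(self, owner: str, vo: str) -> Job:
        job = Job(f"job-{next(self._ids)}", owner, vo)
        self.jobs[job.job_id] = job
        return job

    def set_state(self, job_id: str, new_state: str) -> None:
        job = self.jobs[job_id]
        job.state = new_state
        self._notifier.publish(job, new_state)


events = []
subs = SubscriptionFactory()
factory = JobFactory(subs)
job = factory.create_job("alice", "cms")
# The subscription is created after the job, and covers the whole VO.
sid = subs.create(lambda j, s: events.append((j.job_id, s)), vo="cms")
factory.set_state(job.job_id, "Running")
```

In this sketch the JobFactory only publishes state changes, so subscriptions
can be added, changed or dropped at any point in a job's lifetime without
touching the job management interface.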

About the staging of files:
*    While it is clear that users might explicitly “push” files from
their storage space to the Grid while a job is in “Start Pending” state, it
is unclear how users might explicitly “pull” the resulting files by hand from
the Grid after a job has completed, and before everything gets cleaned up.

About the Job Interface:
*    The description of the states of Fig. 2 should be expanded with more
details, including details on state transitions (basically, we suggest
putting a complete description of the states in section 3.2.1).

*    Section 5.1: The meaning of the “Log” property in Table 3 is not
clear: what does it mean?

*    From Table 3 we see a JobState property which represents the current
state of the job. We think that it would be useful to provide the user with
the history of all job status changes with the associated timestamps. We also
support the need for exitCode and failureReason attributes (see Section 8.2,
issue 9) to describe the job return code and job failure reason 
respectively.

*    It may be useful to provide an additional property (we may call
it “CommandList”) representing the list of all commands issued for a given
job.
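
As an illustration of the last two points, the following Python sketch shows
one possible shape for a job record that keeps a timestamped state history,
the exitCode and failureReason attributes, and a “CommandList”. The class and
field names are our own assumptions and are not taken from Table 3; the state
names other than “Start Pending” and “Held” are only illustrative.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


def _now() -> datetime:
    # Timezone-aware UTC timestamp for history entries.
    return datetime.now(timezone.utc)


@dataclass
class JobRecord:
    # Hypothetical record; field names are illustrative only.
    job_state: str = "Start Pending"
    state_history: List[Tuple[str, datetime]] = field(default_factory=list)
    command_list: List[Tuple[str, datetime]] = field(default_factory=list)
    exit_code: Optional[int] = None
    failure_reason: Optional[str] = None

    def __post_init__(self) -> None:
        # Record the initial state unless a history was supplied.
        if not self.state_history:
            self.state_history.append((self.job_state, _now()))

    def record_command(self, command: str) -> None:
        # Append every command issued for this job, with its timestamp.
        self.command_list.append((command, _now()))

    def change_state(self, new_state: str) -> None:
        self.job_state = new_state
        self.state_history.append((new_state, _now()))

    def finish(self, exit_code: int,
               failure_reason: Optional[str] = None) -> None:
        # Store the return code and, on failure, the reason.
        self.exit_code = exit_code
        self.failure_reason = failure_reason
        self.change_state("Failed" if failure_reason else "Finished")


record = JobRecord()
record.change_state("Held")
record.record_command("Release")
record.change_state("Running")
record.finish(1, "executable not found")
```

After these calls, state_history holds every state the job went through in
order, each with its timestamp, while job_state still gives the current state
as it does today.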

About the Application Interface:
*    The status of this interface is unclear: is this section going to be
discussed? Are you going to consider the problem of user interaction with
running jobs?

-- 
Moreno Marzolla
INFN Sezione di Padova,    via Marzolo 8,   35100 PADOVA,  Italy
EMail: moreno.marzolla@pd.infn.it         Phone: +39 049 8277047
WWW  : http://www.pd.infn.it/~marzolla    Fax  : +39 049 8756233
