Steve:
You can indeed put the ManagedJob WS-Resource on a different
host.
You might find this URL relevant:
http://www-unix.globus.org/toolkit/docs/development/3.9.3/execution/wsgram/WS_GRAM_Approach.html
Regards -- Ian.
At 01:12 PM 12/6/2004 +0000, Steve Loughran wrote:
Ian Foster wrote:
Steve:
A variety of semantics and connections are possible between a
"WS-Resource" and the "entity that the WS-Resource
represents", including both your (a) and (b) below. I don't believe
that the implied resource pattern mandates any one particular
approach.
The following are some rough notes on how we have chosen to handle things
in the GT4 GRAM service; perhaps they are relevant to your problem.
The approach that we take in GT4 GRAM is as follows:
1) A GRAM ManagedJobFactory defines a "create job" operation
that:
a) creates a job, and also
b) creates a ManagedJob WS-Resource, which represents the resource
manager's view of the job.
2) The ManagedJob WS-Resource and the job are then linked as
follows:
a) Destroying the ManagedJob WS-Resource kills the job
b) State changes in the job are reflected in the ManagedJob
WS-Resource
c) Termination of the job also destroys the ManagedJob WS-Resource, but
not immediately: we find that you typically want to leave the ManagedJob
state around for "a while" after the job terminates, to allow
clients to figure out what happened to the job after the fact
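[Editorial sketch: the factory and lifecycle linkage above can be modelled roughly as follows. This is not the GT4 API; every class and method name here is invented for illustration, and a real service would return a WS-Addressing EPR rather than a bare key.]

```python
import time
import uuid

class Job:
    """Hypothetical stand-in for a resource manager's job."""
    def __init__(self):
        self.state = "Active"
    def kill(self):
        self.state = "Failed"

class ManagedJobResource:
    """Models the ManagedJob WS-Resource: the manager's view of a job."""
    def __init__(self, job, linger_seconds=60.0):
        self.key = str(uuid.uuid4())   # opaque resource key carried in the EPR
        self.job = job
        self.state = job.state
        self.linger_seconds = linger_seconds
        self.destroy_at = None         # set once the job terminates

    def destroy(self):
        # (2a) Destroying the WS-Resource kills the job.
        if self.job.state == "Active":
            self.job.kill()

    def on_job_state_change(self, new_state):
        # (2b) Job state changes are reflected in the WS-Resource.
        self.state = new_state
        if new_state in ("Done", "Failed"):
            # (2c) Schedule destruction "a while" later, not immediately,
            # so clients can still query what happened to the job.
            self.destroy_at = time.time() + self.linger_seconds

class ManagedJobFactory:
    """(1) The create-job operation makes both the job and its resource."""
    def __init__(self):
        self.resources = {}
    def create_job(self):
        job = Job()                    # (1a) create the job
        res = ManagedJobResource(job)  # (1b) create the ManagedJob WS-Resource
        self.resources[res.key] = res
        return res.key
```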
Regards -- Ian.
Ian,
What is your fault tolerance strategy here?
Is every ManagedJob WS-Resource hosted on the same host (and perhaps the
same process) as the job itself?
This would mean that there is no way for the ManagedJob EPR to fail
without the job itself failing, but it would require the entire set of job
hosts to be visible to inbound SOAP messages. It would also prevent moving
a job from one node to another without some difficulty (the classic CORBA
object-moved problem, I believe, though HTTP 301 redirects would work if
only SOAP stacks processed them reliably)
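[Editorial sketch: the object-moved problem can be modelled as an EPR-like (address, key) pair that dangles after a migration. Everything here is invented for illustration; the MovedError stands in for the redirect (cf. HTTP 301) that, as noted above, SOAP stacks rarely surface to the caller.]

```python
class MovedError(Exception):
    """Stands in for a redirect the client stack would need to honour."""
    def __init__(self, new_address):
        self.new_address = new_address

# A cluster: each host maps resource keys to job state.
hosts = {
    "node1": {"job-42": "Active"},
    "node2": {},
}
forwarding = {}  # old (host, key) -> new host, left behind after a migration

def get_state(address, key):
    """Dereference an EPR-like (address, key) pair."""
    if key in hosts[address]:
        return hosts[address][key]
    if (address, key) in forwarding:
        raise MovedError(forwarding[(address, key)])
    raise KeyError("unknown resource")

def migrate(key, src, dst):
    """Move a job between hosts, leaving a forwarding entry behind."""
    hosts[dst][key] = hosts[src].pop(key)
    forwarding[(src, key)] = dst
```

A client that catches MovedError and retries at new_address recovers cleanly; one whose stack swallows the redirect is left holding a dead EPR.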
I am trying to do a design which would enable (though not require)
only a subset of nodes (call them portal nodes) to be visible to outside
callers, with the rest of the nodes accessible only to the portal itself.
Once I assume this architecture, modelling the resources gets complex, as
EPRs contain routing info that may become invalid if a portal node
fails.
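[Editorial sketch: one way to keep routing info out of client-held EPRs is to hand out only the portal's address plus an opaque key, and keep the key-to-backend mapping server-side. All names below are invented; this is a model of the indirection, not of any WSRF implementation.]

```python
class Portal:
    """Externally-visible node: EPRs name the portal plus an opaque key;
    the key-to-host routing table lives here, not in the EPR, so moving
    a job between backends does not invalidate anything the client holds."""
    def __init__(self, backends):
        self.backends = backends   # host -> {key: state}
        self.routes = {}           # key -> host

    def register(self, key, host):
        self.routes[key] = host

    def rehome(self, key, new_host):
        """Move routing server-side; client EPRs are unchanged."""
        old = self.routes[key]
        self.backends[new_host][key] = self.backends[old].pop(key)
        self.routes[key] = new_host

    def get_state(self, key):
        return self.backends[self.routes[key]][key]
```

The remaining weak point is the one raised above: if the portal node holding the routing table fails, the keys dangle unless the table is replicated across portal nodes.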
-steve
_______________________________________________________________
Ian Foster                       www.mcs.anl.gov/~foster
Math & Computer Science Div.     Dept of Computer Science
Argonne National Laboratory      The University of Chicago
Argonne, IL 60439, U.S.A.        Chicago, IL 60637, U.S.A.
Tel: 630 252 4619                Fax: 630 252 1997
Globus Alliance, www.globus.org