All,
I thought we had a very productive OGF: a number of
interesting and exciting results in GIN, solid progress on PGI requirements,
and a discussion of how to meet those requirements going forward.
A couple of things stuck in my mind, so I thought I'd
get them out there and get a response.
First, with respect to GIN.
I thought the demos, particularly the use of the BES endpoints
in SAGA, were very informative, especially when one keeps in mind that at
least the UNICORE/FutureGrid resources were not even available until Friday
afternoon (Eastern time!).
I agree with the sentiment that we should build on this very
successful demo. In particular, most groups have either made simple extensions
to JSDL or used existing mechanisms in a particular way. Ditto with BES factory
attributes. Several ideas came up (Morris, chime in; I think you were keeping notes).
Profiling we can do fairly easily:
File system
JSDL specifies how file systems can be defined (Mark, correct me on this). We
use one we call SCRATCH as a place to cache files that are large and constant,
often including zip files that contain applications. We tell JSDL to mount it,
then use the "don't overwrite on stage-in" JSDL flag and
the "don't delete when you're done" flag. This is a
huge help.
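A minimal sketch of what such a cached stage-in entry might look like, built with Python's standard library. It assumes the JSDL 1.0 namespace and the DataStaging element names (FileName, FilesystemName, CreationFlag, DeleteOnTermination, Source/URI) as I recall them from the spec; please check against the spec before relying on it. The file name and URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed JSDL 1.0 namespace; verify against the specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
ET.register_namespace("jsdl", JSDL_NS)


def q(tag):
    """Qualify a tag name with the JSDL namespace."""
    return f"{{{JSDL_NS}}}{tag}"


def cached_stage_in(file_name, source_uri, filesystem="SCRATCH"):
    """Build a DataStaging element that caches a file on a named file system."""
    staging = ET.Element(q("DataStaging"))
    ET.SubElement(staging, q("FileName")).text = file_name
    ET.SubElement(staging, q("FilesystemName")).text = filesystem
    # "Don't overwrite on stage-in": keep an existing cached copy.
    ET.SubElement(staging, q("CreationFlag")).text = "dontOverwrite"
    # "Don't delete when you're done": leave the file for later jobs.
    ET.SubElement(staging, q("DeleteOnTermination")).text = "false"
    source = ET.SubElement(staging, q("Source"))
    ET.SubElement(source, q("URI")).text = source_uri
    return staging


# Hypothetical file and URL, for illustration only.
element = cached_stage_in("blast-db.zip", "http://example.org/data/blast-db.zip")
print(ET.tostring(element, encoding="unicode"))
```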
Extra attributes
We have found a need for several extra BES factory
attributes and for the use of those in JSDL matching. I'm guessing many of
you have done this as well.
These include whether an application is installed and whether
static linking is required.
I also think that, while we are waiting for PGI to figure out
how to deploy applications, we should have a simple convention for applications,
something like
HAS_APPX_INSTALLED
If the BES factory has that attribute, then by convention
APPX_PATH and APPX_LIB_PATH are defined and can be used as
some sort of prefix.
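To make the convention concrete, here is a small sketch of how a client might check it. The function name, the dictionary representation of the factory attributes, and the BLAST values are all hypothetical; only the HAS_APPX_INSTALLED / APPX_PATH / APPX_LIB_PATH naming pattern comes from the proposal above.

```python
def resolve_application(factory_attributes, app):
    """Return (path, lib_path) if the factory advertises the application, else None.

    factory_attributes is assumed to be a plain dict of BES factory
    attribute names to values (a hypothetical representation).
    """
    key = app.upper()
    if not factory_attributes.get(f"HAS_{key}_INSTALLED"):
        return None
    # By convention both paths must be defined once the flag is present;
    # a KeyError here means the endpoint violates the convention.
    return factory_attributes[f"{key}_PATH"], factory_attributes[f"{key}_LIB_PATH"]


# Made-up attribute values for illustration.
factory = {
    "HAS_BLAST_INSTALLED": True,
    "BLAST_PATH": "/opt/blast/bin",
    "BLAST_LIB_PATH": "/opt/blast/lib",
}

print(resolve_application(factory, "blast"))
print(resolve_application(factory, "gromacs"))
```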
File staging: demo of sites that support FSE at SC.
We discussed that several implementations support FSE. Let's test that
interop. It really changes the set of applications you can run if you can copy
them onto the machine. If we can get a set of endpoints that support FSE,
we (GenesisII/XCG) will set up a meta-scheduler that takes JSDL documents and manages
them through the life cycle on the endpoints of the different implementations.
This might also form the basis for some cycle trading ...
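As a rough illustration of the placement half of such a meta-scheduler, the sketch below filters endpoints by FSE support and spreads jobs across them round-robin. The Endpoint class, the supports_fse flag, and the round-robin policy are assumptions made for the example; real life-cycle management (submission, polling, cancellation) is not shown.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Endpoint:
    """Hypothetical description of a BES endpoint."""
    name: str
    supports_fse: bool
    jobs: list = field(default_factory=list)


def dispatch(jobs, endpoints):
    """Assign each job to an FSE-capable endpoint, round-robin."""
    eligible = [e for e in endpoints if e.supports_fse]
    if not eligible:
        raise RuntimeError("no endpoint supports FSE")
    placement = {}
    for job, endpoint in zip(jobs, cycle(eligible)):
        endpoint.jobs.append(job)
        placement[job] = endpoint.name
    return placement


# Made-up endpoint names for illustration.
endpoints = [
    Endpoint("site-a", True),
    Endpoint("site-b", False),
    Endpoint("site-c", True),
]
print(dispatch(["job1", "job2", "job3"], endpoints))
```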
Re: FutureGrid. I also want to repeat that if groups
want to put up standards-based implementations of services on FutureGrid (OGSA-BES,
RNS, WS-Trust, ByteIO, GridFTP, SRM) we will work with them to make it so. One
of the goals of FutureGrid is to provide PERSISTENT standards-based endpoints
with which to test.
Second, with respect to PGI
Application deployment/management. This came across as the
number one requirement. This came as a surprise to me, though I agree it is an
important next step to make execution services (OGSA-BES) more useful. There
are at least two ways to handle this problem: one I'll call
manual installation, configuration, and reification (MICR), and the other I'll
call automatic deployment, configuration, and reification (ADCR).
MICR
This is definitely the easy case.
First, an application is deployed at/on a site/compute
resource. Appropriate metadata is added to the resource metadata or an
information service is updated with information on the local bindings for the
executable, paths, libraries, etc.
Second, once a placement decision is made by a job manager
(I'm using the OGSA-EMS term here; think meta-scheduler), the JSDL
document that contains an abstract application name is reified (transformed) to
have the site-appropriate values for the application.
Third, the JSDL is sent to the compute resource.
I believe this is basically how UNICORE 6 does this now.
ADCR
ADCR is similar to MICR except that the deployment step of setting
the application up and configuring the metadata is done automatically by a
service. For simple applications (a single executable, or maybe a zip
file that needs to be unzipped) this is easy. It can get complicated really
fast. We at Genesis II defined a very simple deployment service combined with a
reification service that worked for the easy case: an executable and a zip file.
I can dig up the documentation on this if anyone is interested.
I believe that the general case for this is a tarpit for a
standards body. I have a PhD student working on it :-).
**** A note on reification
From Wikipedia: reification generally refers to bringing into being or turning
concrete.
In this context I mean taking an abstract JSDL document that
gives an abstract name for an application, e.g.,
<JSDL-abstract-application-name>/bin/biology/blastp-v5</JSDL-abstract-application-name>
(THIS IS MADE-UP SYNTAX),
and then making it concrete
for a particular endpoint by transforming the JSDL to have the actual local
path name and defining any environment variables needed. We discussed this
problem extensively in the OGSA, OGSA-EMS, and OGSA-BES working groups. A
strawman interface was even defined.
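A toy sketch of the reification step, using the made-up element name from the example above. The site_bindings table, the replacement element name "Executable", and the local paths are hypothetical; this is not the strawman interface from the working groups, just an illustration of the transformation.

```python
import xml.etree.ElementTree as ET

# Made-up element name, taken from the made-up syntax in the text above.
ABSTRACT_TAG = "JSDL-abstract-application-name"


def reify(jsdl_text, site_bindings):
    """Replace abstract application names with site-local paths.

    site_bindings is a hypothetical per-site table mapping each abstract
    name to a dict with a local "path" and optional "env" variables.
    Returns the transformed document and the environment to define.
    """
    root = ET.fromstring(jsdl_text)
    environment = {}
    # Materialize the list first, since tags are rewritten in the loop.
    for element in list(root.iter(ABSTRACT_TAG)):
        binding = site_bindings[element.text.strip()]
        element.tag = "Executable"  # hypothetical concrete element name
        element.text = binding["path"]
        environment.update(binding.get("env", {}))
    return ET.tostring(root, encoding="unicode"), environment


abstract_job = (
    "<Job><JSDL-abstract-application-name>/bin/biology/blastp-v5"
    "</JSDL-abstract-application-name></Job>"
)
# Made-up local bindings for one endpoint.
bindings = {
    "/bin/biology/blastp-v5": {
        "path": "/opt/ncbi/blast/bin/blastp",
        "env": {"BLASTDB": "/scratch/blastdb"},
    }
}
concrete_job, env = reify(abstract_job, bindings)
print(concrete_job)
print(env)
```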
MPI
Once again the importance of this came as a somewhat welcome
surprise. I believe this is of considerable interest to UNICORE. Perhaps they
could take the lead on this one. Is it simply a problem of the JSDL not being expressive
enough to both select appropriate resources and pass topology information down
to MPI? In other words, is this an easy problem? Or is there something else I am
not seeing?
Security
The single greatest impediment to the GIN demonstrations was
working out the security stuff. Andre was pretty clear about this. In other
words, the functional aspects of the standards were not the problem; the
certificates and other stuff were. It has been suggested that it may be time to
dive back into this. We spent quite a bit of time on it 18 months ago. The
result was a DRAFT profile document by Duane Merrill that bifurcated the
profile into a transport-level security profile based on delegated X.509
certificates and a message-level security profile primarily aimed at
SAML (http://forge.gridforum.org/sf/docman/do/downloadDocument/projects.pgi-wg/docman.root.input_documents.security_material.comm_profile/doc15575).
I know that there was a lively debate about how to proceed.
PGI profile
This is the big question. We've gone through use
cases, requirements, counting up the use cases that require each requirement,
and two approaches to meeting those requirements (see slides).
How are we to proceed? Should we ask each proposer to
provide more detail? Should we solicit more proposals? Should we work
through both paths? Should we vote?
My own interest is in extending the existing specs and
working with GIN to drive spec modifications. If we go with extending the
existing specs, BES in particular, should we modify BES just a little
and leave the basic factory and management pattern the same? Or go with a
cleaner model more consistent with what Bernd presented (return an activity
handle <EPR?> and then manage the activity directly)?
Either way we need to make some decisions and move forward.
Third, with respect to Cycle sharing
At first I thought that the cycle sharing BOF did not lead
to any interest in trying this out. As the week wore on, my initial impression
proved mistaken. Johannes followed up with several other groups and we may
indeed have sufficient critical mass to proceed. I will ask Joel to set up a
SourceForge site and email list. Once that is set up I will email these two
mailing lists with subscription information. My thought is to leverage the
efforts of the GIN group and perhaps start with all agreements and SLAs
being handled by humans through verbal discussions, and see if we can actually
use others' cycles effectively using the existing technology before we leap into
anything too complex.