[glue-wg] Issue with several ServiceTypes, "modal" operation

5 Jul 2012

Hi all,

I would like to raise what I think is an issue with some ServiceTypes we
decided on during the last enumeration call, a long time ago. I think
this list is the correct place, as my observation questions the sanity
of these open enumerations.

So let me briefly set the scene with an ARC example. An ARC Computing
Element (or ARC CE) can be configured to support different clustering
systems, or LRMS, and its corresponding Execution Service has a
well-defined GLUE2 ServiceType, org.nordugrid.execution.arex.

Now, an ARC CE can be configured to act as a gateway for Desktop Grids
simply by using a different LRMS called DGBridge. The CE works exactly
the same way as usual, but will only allow jobs designed for Desktop
Grids, discarding or failing all the others.

From an operational point of view, GOCDB suggested using a different
ServiceType for the above configuration, namely dg.ARC-CE.

So what happens now is that the ARC CE Desktop Grid Gateway publishes
org.nordugrid.execution.arex as its ServiceType, while GOCDB calls it
dg.ARC-CE.

I see a consistency problem here. It would be strange to have a GLUE2
ServiceType in GOCDB that is _not_ published by any Resource out there:
what happens when a client asks GOCDB for information? It will find a
ServiceType that doesn't match any of the data published by the resource
it is pointing to. Weird, isn't it?
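
To make the mismatch concrete, here is a minimal Python sketch of the
check a client could run. It assumes the registry view and the published
view have already been fetched into plain dictionaries; the host name
and the function name are hypothetical, and no real GOCDB or infosystem
API is used.

```python
def find_type_mismatches(registry, published):
    """Return (host, registry_type, published_types) for every host whose
    registry ServiceType is not among the types the resource publishes."""
    mismatches = []
    for host, registry_type in registry.items():
        published_types = published.get(host, set())
        if registry_type not in published_types:
            mismatches.append((host, registry_type, sorted(published_types)))
    return mismatches


# Hypothetical data: what the registry says vs. what the CE publishes.
registry = {"ce.example.org": "dg.ARC-CE"}
published = {"ce.example.org": {"org.nordugrid.execution.arex"}}

mismatches = find_type_mismatches(registry, published)
```

With the data above, the gateway shows up as a mismatch because the
registry type is never published by the resource itself.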

When I stumbled upon GLUE2 for the first time I dreamt of an automatic
discovery system where ServiceTypes were automatically fetched by some
monitoring system that would automatically understand what-is-what by
looking at infosystem data. (The repetition of automatic* is intended ;) )

I foresee that such a "modal" setup for a CE, or for any resource, will
be common in the future, and we have to suggest a sane way of doing
these things that meets operational needs (like those of GOCDB) and
doesn't make the CE sysadmin's job messy (not much manual data entry
into a database).

Solution 1 would be for ARC (and gLite and Unicore) to change its
ServiceType, or to add another ComputingService entry with
ServiceType=dg.ARC-CE, when operating as a Desktop Grid gateway. In a
dream world, this could also allow autodiscovery of Desktop Grid
Gateways, which is not done today (the gateway URLs are hardcoded in
clients).

The bad thing about the above is that it hides the fact that the ARC CE
is actually acting exactly the same way as any other ARC CE; it's just
using a different backend.

So Solution 2 would be for monitoring/discovery clients to check
ComputingManager to see what kind of LRMS is running behind it. But do
all systems use an LRMS to achieve this? This is something we cannot
assume.
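
A rough Python sketch of what Solution 2 asks of a client, under the
assumption that the ComputingService record has been parsed into a
dictionary and that the LRMS name appears as the ProductName of a
ComputingManager. The dictionary layout and the function name are
illustrative only, not a real client API.

```python
def is_desktop_grid_gateway(service):
    """Guess gateway mode from the LRMS behind the service.

    Returns False when no ComputingManager is published, which is exactly
    the weak point: a gateway that does not use an LRMS is not detected.
    """
    managers = service.get("ComputingManagers", [])
    return any(
        m.get("ProductName", "").lower() == "dgbridge" for m in managers
    )


# Hypothetical records, as a client might see them after parsing.
gateway_ce = {
    "Type": "org.nordugrid.execution.arex",
    "ComputingManagers": [{"ProductName": "DGBridge"}],
}
gateway_without_lrms = {"Type": "org.nordugrid.execution.arex"}
```

Here is_desktop_grid_gateway(gateway_ce) is True, while the second
record is not recognized even if it serves Desktop Grid jobs by other
means.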

There might be a Solution 3 to this problem, that is, to add this "modal
operation" info somewhere else. Where? OtherInfo? Capability? I have no
clue...
Perhaps some reserved string like OtherInfo=DGGateway in all the CEs
serving as gateways...
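
For illustration, a small Python sketch of how a client might use such
a reserved string. It assumes OtherInfo is available as a list of
strings per service; the service IDs and the function name are made up
for the example, and DGGateway is only the string suggested above, not
an agreed value.

```python
DG_GATEWAY_MARKER = "DGGateway"  # reserved string suggested above


def select_gateways(services, marker=DG_GATEWAY_MARKER):
    """Return the IDs of services whose OtherInfo carries the marker."""
    return [s["ID"] for s in services if marker in s.get("OtherInfo", [])]


# Hypothetical services: same ServiceType, different mode of operation.
services = [
    {
        "ID": "ce1.example.org",
        "Type": "org.nordugrid.execution.arex",
        "OtherInfo": ["DGGateway"],
    },
    {
        "ID": "ce2.example.org",
        "Type": "org.nordugrid.execution.arex",
        "OtherInfo": [],
    },
]

gateways = select_gateways(services)
```

The ServiceType stays the same for both CEs; only the extra string
distinguishes the gateway.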

Solution 1 is the easiest one. It gives developers the responsibility to
publish relevant information in their renderings if they want to be
discovered properly, with little manual intervention from sysadmins.
Information clients could filter their searches depending on ServiceTypes.
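
A minimal Python sketch of Solution 1 in its "add another entry"
variant, assuming the middleware knows its configured LRMS as a string.
The function names are hypothetical and this is not actual ARC code; it
only shows the publishing rule and the client-side filter.

```python
AREX_TYPE = "org.nordugrid.execution.arex"
DG_TYPE = "dg.ARC-CE"


def published_service_types(lrms):
    """ServiceTypes a CE would publish, depending on its configured LRMS."""
    types = [AREX_TYPE]
    if lrms.lower() == "dgbridge":
        # Gateway mode: publish an additional entry with the GOCDB type.
        types.append(DG_TYPE)
    return types


def filter_by_type(services, wanted):
    """Client side: names of the services that publish the wanted type."""
    return [name for name, types in services.items() if wanted in types]


# Hypothetical hosts and LRMS settings.
services = {
    "ce1.example.org": published_service_types("DGBridge"),
    "ce2.example.org": published_service_types("fork"),
}
dg_gateways = filter_by_type(services, DG_TYPE)
```

Note that the publishing rule depends directly on the LRMS setting, so
the ServiceType ends up tied to the backend.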

In the ARC case, the code changes needed to publish a different
ServiceType when operating as a Desktop Grid gateway are trivial.
However, we would be strongly coupling ComputingService and
ComputingManager, which I am not sure is sane...

So the bottom line is: what is the meaning and the use of ServiceType_t
if not to summarize such complex setups? So far we have mostly used
endpoints for discovery. I think ServiceType_t SHOULD be used to
summarize complex setups.

I think it is a GLUE2 WG task to understand, if not describe and
propose, how these values can be used in a nice way.

What's your opinion on that?

-- 
Florido Paganelli
Lund University - Particle Physics
ARC Middleware
EMI Project