Issue with several ServiceTypes, "modal" operation

Hi all,

I would like to raise what I think is an issue with some of the ServiceTypes we decided on during the last enumeration call a long time ago. I think this list is the correct place, as my observation questions the sanity of these open enumerations.

Let me briefly set the scene with an ARC example. An ARC Computing Element (ARC CE) can be configured to support different clustering systems, or LRMS, and its corresponding Execution Service has a well-defined GLUE2 name: org.nordugrid.execution.arex.

Now, an ARC CE can be configured to act as a gateway for Desktop Grids simply by using a different LRMS called DGBridge. The CE works exactly the same way as usual, but will only allow jobs designed for Desktop Grids, discarding or failing all the others.

From an operational point of view, GOCDB suggested using a different ServiceType for the above configuration, namely dg.ARC-CE. So what happens now is that the ARC CE Desktop Grid gateway publishes org.nordugrid.execution.arex as its ServiceType, and GOCDB calls it dg.ARC-CE.

I see a consistency problem here. It would be strange to have a GLUE2 ServiceType in GOCDB that is _not_ published by any resource out there. What happens when a client asks GOCDB for information? It will find a ServiceType that doesn't match any of the data published by the resource it is pointing to. Weird, isn't it?

When I stumbled upon GLUE2 for the first time I dreamt of an automatic discovery system where ServiceTypes were automatically fetched by some monitoring system that would automatically understand what-is-what by looking at infosystem data. (The repetition of automatic* is intended ;) )

I foresee such a "modal" setup for a CE, or any resource, becoming common in the future, and we have to suggest a sane way of doing these things that meets operational needs (like GOCDB's) and doesn't make the CE sysadmin's job messy (not much manual data entry into a database).
Solution 1 would be for ARC (and gLite and Unicore) to change its ServiceType, or add another ComputingService entry with ServiceType=dg.ARC-CE, when operating as a Desktop Grid gateway. In a dream world, this could also allow autodiscovery of Desktop Grid gateways, which is not done today (the gateway URLs are hardcoded in clients). The bad thing about this is that it hides the fact that the ARC CE is actually acting exactly the same way as any ARC CE; it's just using a different backend.

So Solution 2 would be for monitoring/discovery clients to check ComputingManager to see what kind of LRMS is running behind it. But do all systems use an LRMS to achieve this? This is something we cannot assume.

There might be a Solution 3 to this problem, that is, to add this "modal operation" info somewhere else. Where? OtherInfo? Capability? I have no clue... Perhaps some reserved string like OtherInfo=DGGateway in all the CEs serving as gateways...

Solution 1 is the easiest one, and gives developers the responsibility to publish relevant information in their renderings if they want to be discovered properly, with little manual intervention from sysadmins. Information clients could filter their searches depending on ServiceTypes. In the ARC case, the code changes needed to have a different ServiceType when operating as a Desktop Grid gateway are trivial. However, we would be strongly coupling ComputingService and ComputingManager, which I am not sure is sane...

So the bottom line is: what is the meaning and the use of ServiceType_t if not summarizing such complex setups? So far we have mostly used endpoints for discovery. I think it SHOULD be used to summarize complex setups. I think it is a GLUE2 WG task to understand, if not describe and propose, how these values can be used in a nice way. What's your opinion on that?

--
Florido Paganelli
Lund University - Particle Physics
ARC Middleware
EMI Project
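The three options in the message above can be compared side by side with a small sketch. It is only an illustration under stated assumptions: the `PublishedService` record and its field names are hypothetical and do not correspond to any real GLUE2 rendering or ARC API, and whether a given CE actually exposes its LRMS name through ComputingManager is not established in this thread. Only the strings dg.ARC-CE, DGBridge, DGGateway and org.nordugrid.execution.arex come from the message itself.

```python
from dataclasses import dataclass, field
from typing import List

# Values taken from the thread; the record layout below is hypothetical.
DG_SERVICE_TYPE = "dg.ARC-CE"
DG_LRMS_NAME = "DGBridge"
DG_OTHER_INFO = "DGGateway"


@dataclass
class PublishedService:
    """Hypothetical flattened view of what one CE publishes."""
    service_type: str                      # type of the ComputingService
    manager_product: str = ""              # LRMS name, if the manager exposes one
    other_info: List[str] = field(default_factory=list)


def by_service_type(svc: PublishedService) -> bool:
    """Solution 1: the gateway publishes a dedicated ServiceType."""
    return svc.service_type == DG_SERVICE_TYPE


def by_manager(svc: PublishedService) -> bool:
    """Solution 2: the client inspects the LRMS behind the service."""
    return svc.manager_product.lower() == DG_LRMS_NAME.lower()


def by_other_info(svc: PublishedService) -> bool:
    """Solution 3: a reserved string marks the modal operation."""
    return DG_OTHER_INFO in svc.other_info


def detect(svc: PublishedService) -> dict:
    """Report which of the three checks would flag this record as a gateway."""
    return {
        "solution1_type": by_service_type(svc),
        "solution2_manager": by_manager(svc),
        "solution3_other_info": by_other_info(svc),
    }


# A record that keeps the normal ARC ServiceType and only tags OtherInfo.
tagged = PublishedService(
    service_type="org.nordugrid.execution.arex",
    other_info=["DGGateway"],
)
print(detect(tagged))
```

A client that filters only on ServiceType would miss the `tagged` record, which is the coupling question raised above: each option moves the knowledge of the "mode" to a different place, and consumers have to agree on where to look.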

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli said:
> I think it is a GLUE2 WG task to understand, if not describe and propose, how these values can be used in a nice way. What's your opinion on that?
So far I don't think we have a good idea of what use cases the ServiceType should support, i.e. under what circumstances would you query for a specific Type? For a computing service you can get every instance with objectclass=GLUE2ComputingService (or the equivalent in XML), and if you want to use a specific protocol you can query by the EndpointInterfaceName, so I'm not sure when you would want something intermediate. If the GOC DB does have a practical use case, it may be that that should drive what we publish. I agree that it would seem strange if the GOC DB uses different Types.

Stephen
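The two query styles mentioned in the reply above can be written as LDAP search filter strings. This is a minimal sketch that builds only the filter text and contacts no server. The object class GLUE2ComputingService is quoted from the message; the names `GLUE2Endpoint` and `GLUE2EndpointInterfaceName` are assumed here to follow the same LDAP naming pattern and should be checked against the schema actually deployed. The interface name in the usage line is a placeholder, not a real value.

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that are special in LDAP filters (RFC 4515)."""
    out = []
    for ch in value:
        if ch in "\\*()\x00":
            out.append("\\%02x" % ord(ch))   # e.g. '*' becomes \2a
        else:
            out.append(ch)
    return "".join(out)


def all_computing_services() -> str:
    """Filter matching every published computing service."""
    return "(objectClass=GLUE2ComputingService)"


def endpoints_with_interface(interface_name: str) -> str:
    """Filter matching endpoints that speak one specific interface.

    Attribute and class names are assumptions, see the note above.
    """
    return "(&(objectClass=GLUE2Endpoint)(GLUE2EndpointInterfaceName=%s))" % (
        escape_filter_value(interface_name)
    )


print(all_computing_services())
print(endpoints_with_interface("example.interface.name"))  # placeholder value
```

Neither filter looks at the ServiceType, which is the point of the reply: discovery by object class or by interface name already works without it, so a Type-based query needs its own use case.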

On 07/05/2012 12:06 PM, stephen.burke@stfc.ac.uk wrote:
> glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli said:
>> I think it is a GLUE2 WG task to understand, if not describe and propose, how these values can be used in a nice way. What's your opinion on that?
> So far I don't think we have a good idea of what use cases the ServiceType should support, i.e. under what circumstances would you query for a specific Type? For a computing service you can get every instance with objectclass=GLUE2ComputingService (or the equivalent in XML), and if you want to use a specific protocol you can query by the EndpointInterfaceName, so I'm not sure when you would want something intermediate. If the GOC DB does have a practical use case, it may be that that should drive what we publish. I agree that it would seem strange if the GOC DB uses different Types.
I think it will be interesting for your EGI profiling to see how to cope with such situations. In the ARC example I gave, there is _no hint_ in the information system about the fact that the service is running as a DG gateway.

Getting the GLUE2 ServiceType and Endpoints as you explained above is irrelevant for Desktop Grid clients at the moment. The operating behavior is exactly the same as for any client submitting to a predefined ARC CE: just submit to a URL.

Currently the problem arises when the operational/monitoring view comes into play: GOCDB knows, and wants to keep track, that such a CE is operating as a special-purpose gateway, and that's why they came up with different names. Ideally clients should also care about it, if we enforce a sane infosys-based approach.

The easy path is to force services to publish specific ServiceTypes, but my fear is: is that the correct way of using these GLUE2 concepts?

--
Florido Paganelli
Lund University - Particle Physics
ARC Middleware
EMI Project

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Florido Paganelli said:
> I think it will be interesting for your EGI profiling to see how to cope with such situations. In the ARC example I gave, there is _no hint_ in the information system about the fact that the service is running as a DG gateway.
I think you need to look at the details of the situation. For example, is it a binary choice, where either the whole Service is a DG gateway or it isn't, or can one Service serve multiple backends? Is it a simple flag, yes or no, or are there properties associated with it? Does the job submission need to know what the backend is? ... Basically everything should be driven by use cases; there is no absolute right answer and no need to publish information if there are no consumers.
> Getting GLUE2 serviceType and Endpoints as you explained above is irrelevant for Desktop Grid clients at the moment.
There may also be a question of exclusion, if you want to prevent standard jobs going there. Also you say "at the moment", but if you can foresee things changing in the future it may be worth trying to plan for it. On the other hand there are always things we didn't foresee, and we then have to find the best solution at the time.
> Currently the problem arises when the operational/monitoring view comes into play: GOCDB knows, and wants to keep track, that such a CE is operating as a special-purpose gateway, and that's why they came up with different names.
Monitoring is also a valid use case - but it doesn't necessarily need to use the Type. On the other hand, if the GOC DB only implements a subset of the schema (e.g. no Manager) then that may restrict the options.
> The easy path is to force services to publish specific ServiceTypes, but my fear is: is that the correct way of using these GLUE2 concepts?
There isn't really a correct way; we can define Types according to our needs as long as the definitions are clear. However, once we start to have clients making use of the Type it will become harder to change.

Stephen
participants (2): Florido Paganelli, stephen.burke@stfc.ac.uk