Hi, Quoting [Peter Tröger] (Mar 23 2010):
Hi,
As long as the MonitoringSession::drmsQueueNames is nothing more than an opaque set of strings that are the valid values for JobTemplate::queueName, I can live with it. I can see where that would be useful for a portal. I thought, however, that we had come to the conclusion previously that portals and user interfaces were not really our target applications. (Anyone remember what feature spawned that conclusion?) I thought that DRMAA was specifically focused on applications integrating with clusters. If so, a list of opaque strings is useless.
We dropped the portal example, that's true. The most convincing DRMAA applications at the moment are high-level APIs and meta-schedulers on top of / with DRMAA support.
I did a small field study to get the picture right. LSF, PBS, SGE, LoadLeveler, SAGA, Globus and GridWay can submit jobs to particular queues. In LoadLeveler, queues are called "classes". Condor, JSDL and OGSA-BES seem to have no queue concept - correct me if I am wrong. Retrieving the list of queue names is only supported in:
LSF: bqueues (http://www.vub.ac.be/BFUCC/LSF/man/bqueues.1.html)
PBS: qstat -q (http://linux.die.net/man/1/qstat)
SGE: qstat -f
LoadLeveler: llclass -l (http://www.ccs.ornl.gov/Cheetah/LL.html#Classes)
So if we add the monitoring facility, an empty return value must still be valid.
By the way, you'll also have to give a little thought to reconciling the 1:1 queue/host model with the 1:n and n:m models, as far as identifying them in a list goes.
This is the true counter-argument. If DRMAA monitoring gives no additional hints here, invalid combinations of individually valid machine / queue names could occur in the job template. Let's wait and see whether any defender of queue list monitoring stands up. Otherwise, I propose to keep only the queue name attribute in the job template.
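To make that concrete, here is a minimal sketch (plain Python with made-up names, not taken from any real DRMAA binding) of how an n:m queue/host model produces pairs that are invalid even though each name on its own is valid:

```python
# Hypothetical n:m mapping from queue names to the hosts they serve.
# Both the queue names and the host names are illustrative only.
QUEUE_HOSTS = {
    "short": {"node01", "node02"},
    "long":  {"node02", "node03"},
}

def is_valid_combination(queue: str, host: str) -> bool:
    """Return True only if the queue actually dispatches to the host."""
    return host in QUEUE_HOSTS.get(queue, set())

# "short" is a valid queue and "node03" is a valid host, yet the
# combination is invalid -- exactly the trap a job template user
# could fall into without additional hints from monitoring:
print(is_valid_combination("short", "node01"))  # True
print(is_valid_combination("short", "node03"))  # False
```

In a 1:1 model this check degenerates to simple name equality, which is why reconciling the models matters for any list-based monitoring interface.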
It's not a hard requirement from our side, but I think it's genuinely useful. In general, our end users complain more often than not when they have to manually retrieve resource details like service contact URLs or queue names from some obscure web page, instead of being able to retrieve that information programmatically. The manual way is simply too error-prone, tedious and static. Cheers, Andre.
/Peter.
Daniel
On 3/23/10 10:27 AM, Peter Tröger wrote:
As I said in the email I just wrote, I'm willing to be convinced of the value of adding queues to the job submission side of things. I am, however, fundamentally opposed to adding queues to the monitoring side.
I will heavily insist on queue support in DRMAA v2. This is a long-demanded feature, which also popped up again in the survey.
The various concepts of queues are too different for that to make any sense. There is absolutely no way we will be able to model both LSF and SGE queues in a way that is abstract enough to be consistent and still specific enough to be meaningful and accurate. We'll talk on the next call. :)
The intention of the current model is that JobTemplate::queueName and MonitoringSession::drmsQueueNames act as counterparts. DRMAA would promise that all strings that show up in MonitoringSession::drmsQueueNames are valid input for JobTemplate::queueName. Nothing more. The use cases are DRMAA-based portals and command-line applications. The interpretation of what a queue is can be left to the library implementation - in the end, the user has to reason about the meaning of queue names anyway.
We could relax the conditions so that other values are also allowed in JobTemplate::queueName. This would allow MonitoringSession::drmsQueueNames to return nothing in SGE. This must be possible anyway - Condor has no queue concept at all.
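A sketch of that relaxed contract, using Python stub classes that merely stand in for the proposed IDL interfaces (no real DRMAA library is involved, and the attribute names mirror the spec draft only loosely): every advertised name is guaranteed valid, the list may be empty, and other values are not rejected up front:

```python
class MonitoringSession:
    """Stub: a real implementation would query the DRM system."""
    def __init__(self, queue_names=()):
        self._queues = list(queue_names)

    @property
    def drms_queue_names(self):
        # May legitimately be empty, e.g. for Condor,
        # which has no queue concept at all.
        return list(self._queues)

class JobTemplate:
    def __init__(self):
        self.queue_name = None

# Every advertised name is valid input for JobTemplate.queue_name;
# if nothing is advertised, a site-specific value is still allowed.
session = MonitoringSession(["all.q", "batch"])
jt = JobTemplate()
if session.drms_queue_names:
    jt.queue_name = session.drms_queue_names[0]
else:
    jt.queue_name = "site-specific-queue"  # not advertised, still legal
print(jt.queue_name)  # all.q
```

The point of the sketch is the asymmetry: monitoring gives a safe subset of valid values, not an exhaustive one.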
I could also agree to remove MonitoringSession::queueMaxWallclockTime and MonitoringSession::queueMaxSlotsAllowed, since these two attributes are the ones that demand a particular understanding of what a queue is.
Best, Peter.
-- Nothing is ever easy.