
Hi Ole,

Quoting [Ole Christian Weidner] (Dec 10 2009):
Some of the concerns may get addressed by resource discovery and reservation, but not all. So, we would like to propose the following replacement list for these attributes:
  SPMDVariation
  NumberOfProcesses
  ProcessesPerMachine
  ProcessesPerSlot
  ProcessesPerCore
(Slot and Machine are the DRMAA terms for CPU and Host).
Let's go with CPU and host then. Especially slot is IMHO a rather weird term - cpu is so much easier to understand.
Peter corrected me: it's socket, not slot. Anyway, I am by no means sold on those terms, but rather want to nail down concept and hierarchy. Machine and CPU would be fine with me, as would be node/host/machine and cpu/socket/processor.
The NumberOfProcesses is to be interpreted as an exact number by the backend. The ProcessesPerXYZ are to be interpreted as upper bounds by the backend.
So, for example, one could specify a 16-process MPICH job as
  SPMDVariation     = MPICH
  NumberOfProcesses = 16
  ProcessesPerCore  = 2
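To make the proposed semantics concrete, here is a minimal Python sketch of how a backend might check a placement against that job: the total must match NumberOfProcesses exactly, while ProcessesPerCore is only an upper bound. The dict layout and the function name placement_ok are hypothetical and not part of any SAGA or DRMAA API.

```python
# Hypothetical illustration only; not a SAGA or DRMAA interface.
job = {
    "SPMDVariation": "MPICH",
    "NumberOfProcesses": 16,   # exact number
    "ProcessesPerCore": 2,     # upper bound
}

def placement_ok(job, procs_per_core):
    """Check a placement, given one process count per allocated core."""
    # NumberOfProcesses must be met exactly.
    exact_total = sum(procs_per_core) == job["NumberOfProcesses"]
    # ProcessesPerCore is an upper bound, so fewer per core is fine.
    within_bound = all(n <= job["ProcessesPerCore"] for n in procs_per_core)
    return exact_total and within_bound
```

Under this reading, 8 cores with 2 processes each and 16 cores with 1 process each are both acceptable, while 4 cores with 4 processes each is not.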
Ok, but what if the backend doesn't understand these things? RSL, for example, only understands number of nodes and number of processes. In this case the adaptor itself would have to (a) figure out the number of cores per CPU, (b) figure out the number of CPUs per node, and (c) make a reservation for (16/2/#cores_per_cpu/#cpus_per_node) nodes. The Globus GRAM adaptor already does something similar to this due to the shortcomings of RSL. Not pretty.
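The arithmetic in step (c) can be sketched as follows, assuming the adaptor has somehow obtained the cores per CPU and CPUs per node, and that it packs processes as densely as the upper bound allows. The function name nodes_needed and the hardware figures in the usage note are made up for illustration; this is not how any existing adaptor is known to do it.

```python
def nodes_needed(num_processes, procs_per_core, cores_per_cpu, cpus_per_node):
    """Minimum number of nodes to reserve when every core is filled
    up to the ProcessesPerCore bound (hypothetical helper)."""
    procs_per_node = procs_per_core * cores_per_cpu * cpus_per_node
    # Integer ceiling division: a partially filled node still has to be reserved.
    return (num_processes + procs_per_node - 1) // procs_per_node
```

For the 16-process job with ProcessesPerCore = 2, a made-up node with 2 CPUs of 4 cores each would need nodes_needed(16, 2, 4, 2) = 1 node, while single-CPU dual-core nodes would need nodes_needed(16, 2, 2, 1) = 4.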
Right - but it is even worse if the user has to do it in application space. So, I guess it is like with all other JD attributes: if it is specified, it must be honored - which may effectively limit the number of available backends...
And the more possible attributes you have, the more options you give the user to define the same thing. E.g
eg? :-)

Thanks! Andre.

--
Nothing is ever easy.