
Chris, thanks for kick-starting this discussion. I'm commenting somewhat out of order, so I've extracted some of your points here and left the full text of your original email at the bottom:
ProcessCount (total number of _processes_ that the job will use to execute the job)
As I mention in section 8 of the spec, ProcessCount is misnamed. (The use of the term 'process' is unfortunate.) It isn't intended to control how many actual processes the job will use at execution time, but to indicate the aggregate processing power that the application needs. So I see it as mapping 1-to-1 to a CPU, and I have translated it as such in the examples of section 8.
Note: I'm making an assumption here that CPUCount is the number of cpus that I want from the resource, rather than an expression of how many cpus the host needs to have configured. If it is the latter, then we do need TileSize, and replace CPUCount in my examples below with TileSize.
CPUCount is defined as the number of CPUs you want the resource to have. So it is the latter.
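Under that reading, a job wanting hosts configured with at least 8 CPUs, allocated in tiles of 8, might be written roughly as follows. This is a sketch using the draft terms as discussed in this thread; the element names and nesting are illustrative, not normative syntax:

```xml
<!-- Sketch only: draft-JSDL terms from this discussion, not final spec syntax -->
<Resource>
  <!-- CPUCount read as "CPUs the host must have configured" -->
  <CPUCount>
    <LowerBoundedRange>8.0</LowerBoundedRange>
  </CPUCount>
  <!-- TileSize then carries the per-host allocation unit -->
  <TileSize>8</TileSize>
  <ResourceCount>
    <Exact>4.0</Exact>
  </ResourceCount>
</Resource>
```

The point is that with this semantics CPUCount alone cannot express the allocation unit, which is why TileSize is still needed.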
4. A "homegrown" master/slave parallel job (say a ligand docking job). Wants 32 processors. No tiling constraints at all.

4. ResourceCount == 32, CPUCount == 1 -> oops ... it doesn't care about tiling
   ResourceCount == 1, CPUCount == 32 -> hmm ... artificial constraint ... would suck on a blade cluster
   ResourceCount == 1-32, CPUCount == 1-32 -> oops again ... I might get a total allocation of 32*32 cpus
* there seems to be a gap!
If we had a term called "TotalCPUCount" for the entire job, I could do:
4. TotalCPUCount == 32 -> LSF : "-n 32" -> PBS : "not sure how to express"
Using the current terminology (bear with me) I would translate it as

<Application>
  ...
  <jsdl-posix:POSIXApplication>
    ...
    <ProcessCount>32</ProcessCount>
  </jsdl-posix:POSIXApplication>
</Application>
<Resource>
  ...
  <ResourceCount>
    <LowerBoundedRange>1.0</LowerBoundedRange>
  </ResourceCount>
</Resource>

I translate "No tiling constraints" as meaning TileSize=1, and since that is the default value I have omitted it.
So ... in summary ... I propose:
CPUCount (as is, if it's allocated cpus per resource)
TileSize (iff CPUCount is an expression of configured cpus in a host)
ResourceCount (as is ... hmmm ... maybe the default value needs to change)
TotalCPUCount (how many cpus this job needs to run in total)
We need TileSize. I agree that the default ResourceCount=1 definition should be changed to 'undefined', as you mention in a subsequent email.

So there are two issues:
1. Whether the topology requirements should be in the Application section or not. If they are in the Application section, then the terms used should not be resource-flavored, i.e., not TotalCPUCount but something else.
2. How to rename 'ProcessCount' to eliminate the confusion with the term 'process'.

My answer to (1) is to keep these in the Application section. I am not sure how to rename ProcessCount, though.

Could I also ask you to let me know whether the examples in section 8 of the spec actually make sense?

Andreas

Christopher Smith wrote:
Ok ... here are my thoughts on process topology and what's currently expressible in JSDL.
First, I'll list some use cases (they're all parallel jobs):
1. Simple MPI job. Wants 32 processors with 1 processor per resource (in JSDL, a host is a "resource").
2. OpenMPI job. Wants 32 processors with 8 processors per resource.
3. An OpenMP job. Wants 32 processors. Shared mem of course, so one resource.
4. A "homegrown" master/slave parallel job (say a ligand docking job). Wants 32 processors. No tiling constraints at all.
* Note that I'm specifically leaving out the Naregi "coupled simulation" use case (sorry guys), since we determined at the last GGF that it was a case which could be decomposed into multiple JSDL documents.
Second ... what is process topology? It provides the user a way to express how resources should be _allocated_ given the characteristics of the job (usually in terms of IO patterns ... e.g. network communication, disk IO channel contention, etc.). Thus, it's used when the resource manager is _allocating_ the resources, not when the job is being started/launched. Therefore, none of the elements used to express process topology should be in the POSIXApplication section.
What we have in JSDL now:
ResourceCount (how many "resources", i.e. hosts, I want)
CPUCount (how many processors _per resource_)
TileSize (how many processors to allocate per resource as a unit)
ProcessCount (total number of _processes_ that the job will use to execute the job)
I will argue that ProcessCount is useless for the purposes of process topology, since a) it isn't about allocation, and b) there isn't enough information to tell me how to start/launch a parallel job. It isn't about allocation since it is irrelevant to the scheduler whether I'll be computing using threads or processes. It isn't useful for launching because it doesn't tell me how to spread the ProcessCount processes given a particular allocated topology.
So that leaves the rest of them.
TileSize and CPUCount are pretty much the same thing. At least for 80% (or more) of the uses I've seen. The only thing that might cause them to differ is that I could possibly allocate more than one tile on a host. Given that CPUCount is a range and that we could express step values in the range (we can express step values in the range, right?), we don't need TileSize any more.
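For instance, a tile size of 8 could in principle be emulated by a CPUCount range that enumerates the multiples of 8. This is a sketch in the draft terms, and it assumes the range type allows a list of exact values:

```xml
<!-- Sketch: emulating TileSize=8 via enumerated CPUCount values -->
<CPUCount>
  <Exact>8.0</Exact>
  <Exact>16.0</Exact>
  <Exact>24.0</Exact>
  <Exact>32.0</Exact>
</CPUCount>
```

If the range type supports this, requesting any whole number of 8-cpu tiles per host no longer needs a separate TileSize element.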
Note: I'm making an assumption here that CPUCount is the number of cpus that I want from the resource, rather than an expression of how many cpus the host needs to have configured. If it is the latter, then we do need TileSize, and replace CPUCount in my examples below with TileSize.
So let's see how these map to the use cases.
1. ResourceCount == 32, CPUCount == 1 -> LSF : "-n 32 -R span[ptile=1]" -> PBS : "-l nodes=32:ppn=1" (ppn=1 might be the default)
2. ResourceCount == 4, CPUCount == 8 -> LSF : "-n 32 -R span[ptile=8]" -> PBS : "-l nodes=4:ppn=8"
3. ResourceCount == 1, CPUCount == 32 -> LSF : "-n 32 -R span[hosts=1]" (hosts=1 equivalent to ptile=<-n val>) -> PBS : "-l nodes=1:ppn=32"
4. ResourceCount == 32, CPUCount == 1 -> oops ... it doesn't care about tiling
   ResourceCount == 1, CPUCount == 32 -> hmm ... artificial constraint ... would suck on a blade cluster
   ResourceCount == 1-32, CPUCount == 1-32 -> oops again ... I might get a total allocation of 32*32 cpus
* there seems to be a gap!
If we had a term called "TotalCPUCount" for the entire job, I could do:
4. TotalCPUCount == 32 -> LSF : "-n 32" -> PBS : "not sure how to express"
It basically means to grab 32 cpus, regardless of how they are spread; I just need cpus. This is used a whole hell of a lot within our customer base.
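In draft form, the proposed element might look something like this. This is purely hypothetical: TotalCPUCount is a proposal in this thread, not an existing JSDL element, and the nesting mirrors the other Resource examples only for illustration:

```xml
<!-- Hypothetical: the proposed TotalCPUCount, covering use case 4 -->
<Resource>
  <TotalCPUCount>
    <Exact>32.0</Exact>
  </TotalCPUCount>
  <!-- No ResourceCount or TileSize: how the cpus are spread doesn't matter -->
</Resource>
```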
So ... in summary ... I propose:
CPUCount (as is, if it's allocated cpus per resource)
TileSize (iff CPUCount is an expression of configured cpus in a host)
ResourceCount (as is ... hmmm ... maybe the default value needs to change)
TotalCPUCount (how many cpus this job needs to run in total)
-- Chris
--
Andreas Savva
Fujitsu Laboratories Ltd