
Ok ... here are my thoughts on process topology and what's currently expressible in JSDL.

First, I'll list some use cases (they're all parallel jobs):

1. Simple MPI job. Wants 32 processors, with 1 processor per resource (in JSDL, a host is a "resource").
2. OpenMPI job. Wants 32 processors, with 8 processors per resource.
3. An OpenMP job. Wants 32 processors. Shared mem of course, so one resource.
4. A "homegrown" master/slave parallel job (say a ligand docking job). Wants 32 processors. No tiling constraints at all.

* Note that I'm specifically leaving out the Naregi "coupled simulation" use case (sorry guys), since we determined at the last GGF that it was a case which could be decomposed into multiple JSDL documents.

Second ... what is process topology? It provides the user a way to express how resources should be _allocated_ given the characteristics of the job (usually in terms of IO patterns ... e.g. network communication, disk IO channel contention, etc). Thus, it's used when the resource manager is _allocating_ the resources, not when the job is being started/launched. Therefore, none of the elements used to express process topology should be in the POSIXApplication section.

What we have in JSDL now:

ResourceCount (how many "resources", i.e. hosts, I want)
CPUCount (how many processors _per resource_)
TileSize (how many processors to allocate per resource as a unit)
ProcessCount (total number of _processes_ that will be used to execute the job)

I will argue that ProcessCount is useless for the purposes of process topology, since a) it isn't about allocation, and b) it doesn't carry enough information to tell me how to start/launch a parallel job. It isn't about allocation since it is irrelevant to the scheduler whether I'll be computing using threads or processes. It isn't useful for launching because it doesn't tell me how to spread the ProcessCount processes over a particular allocated topology. So that leaves the rest of them.
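For concreteness, here is a rough sketch of how the allocation-time elements might look in a document for use case 2 (32 processors, 8 per host). The element names are the ones under discussion; the Resources wrapper and the jsdl: prefix are my guesses at the surrounding schema, not gospel:

```xml
<jsdl:Resources>
  <!-- allocation-time topology: belongs here, not in POSIXApplication -->
  <jsdl:ResourceCount>4</jsdl:ResourceCount>  <!-- hosts wanted -->
  <jsdl:CPUCount>8</jsdl:CPUCount>            <!-- processors per host -->
  <!-- ProcessCount deliberately absent: it's launch-time info, not allocation-time -->
</jsdl:Resources>
```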
TileSize and CPUCount are pretty much the same thing, at least for 80% (or more) of the uses I've seen. The only thing that might cause them to differ is that I could possibly allocate more than one tile on a host. Given that CPUCount is a range and that we could express step values in the range (we can express step values in the range, right?), we don't need TileSize any more.

Note: I'm making an assumption here that CPUCount is the number of cpus that I want from the resource, rather than an expression of how many cpus the host needs to have configured. If it is the latter, then we do need TileSize; replace CPUCount in my examples below with TileSize.

So let's see how these map to the use cases:

1. ResourceCount == 32, CPUCount == 1
   -> LSF : "-n 32 -R span[ptile=1]"
   -> PBS : "-l nodes=32:ppn=1" (ppn=1 might be the default)

2. ResourceCount == 4, CPUCount == 8
   -> LSF : "-n 32 -R span[ptile=8]"
   -> PBS : "-l nodes=4:ppn=8"

3. ResourceCount == 1, CPUCount == 32
   -> LSF : "-n 32 -R span[hosts=1]" (hosts=1 is equivalent to ptile=<-n val>)
   -> PBS : "-l nodes=1:ppn=32"

4. ResourceCount == 32, CPUCount == 1
   -> oops ... it doesn't care about tiling
   ResourceCount == 1, CPUCount == 32
   -> hmm ... artificial constraint ... would suck on a blade cluster
   ResourceCount == 1-32, CPUCount == 1,32
   -> oops again ... I might get a total allocation of 32*32 cpus

* There seems to be a gap! If we had a term called "TotalCPUCount" for the entire job, I could do:

4. TotalCPUCount == 32
   -> LSF : "-n 32"
   -> PBS : "not sure how to express"

It basically means to grab 32 cpus, regardless of how they are spread. Basically I just need cpus. This is used a whole hell of a lot within our customer base.

So ... in summary ... I propose:

CPUCount (as is, if it's allocated cpus per resource)
TileSize (iff CPUCount is an expression of configured cpus in a host)
ResourceCount (as is ... hmmm ... maybe the default value needs to change)
TotalCPUCount (how many cpus this job needs to run in total)

-- Chris
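P.S. A sketch of what use case 4 could look like under this proposal (again, the Resources wrapper and jsdl: prefix are my guesses; only the TotalCPUCount element itself is what's being proposed):

```xml
<jsdl:Resources>
  <!-- "just give me 32 cpus, spread however" - no tiling constraint -->
  <jsdl:TotalCPUCount>32</jsdl:TotalCPUCount>
</jsdl:Resources>
```

which would map straight to LSF's "-n 32" with no span requirement.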