
On 19/4/05 22:05, "Andreas Savva" <andreas.savva@jp.fujitsu.com> wrote:
If we had a term called "TotalCPUCount" for the entire job, I could do:
4. TotalCPUCount == 32 -> LSF : "-n 32" -> PBS : "not sure how to express"
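To make use case 4 concrete, here is a minimal sketch of how a job-wide CPU total might be turned into scheduler arguments. `TotalCPUCount` is only the term proposed in this thread, and `lsf_args` and `pbs_args` are hypothetical helper names. The PBS side is deliberately left unimplemented, because the thread does not settle how to express it.

```python
from typing import List


def lsf_args(total_cpu_count: int) -> List[str]:
    """Translate a job-wide CPU total into bsub arguments (hypothetical helper)."""
    if total_cpu_count < 1:
        raise ValueError("total_cpu_count must be a positive integer")
    # Use case 4: TotalCPUCount == 32 -> LSF "-n 32"
    return ["-n", str(total_cpu_count)]


def pbs_args(total_cpu_count: int) -> List[str]:
    """The thread gives no PBS mapping, so this sketch refuses to guess one."""
    raise NotImplementedError("PBS expression for a job-wide CPU total is undecided")


print(" ".join(lsf_args(32)))  # -n 32
```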
Using the current terminology (bear with me) I would translate it as:
<Application>
  ...
  <jsdl-posix:POSIXApplication>
    ...
    <ProcessCount>32</ProcessCount>
  </jsdl-posix:POSIXApplication>
</Application>
<Resource>
  ...
  <ResourceCount>
    <LowerBoundedRange>1.0</LowerBoundedRange>
  </ResourceCount>
</Resource>
This works ... I didn't think of using the range on ResourceCount. :-)
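For anyone who wants to generate that translation programmatically, the following is a rough sketch using Python's standard `xml.etree.ElementTree`. It builds only the elements shown in the message and leaves out the elided ("...") parts. The namespace URI, the `Fragment` wrapper element, and the `build_fragment` helper are assumptions made for illustration; none of them comes from the spec.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only; substitute the
# URI from whichever JSDL draft is being targeted.
POSIX_NS = "urn:example:jsdl-posix"
ET.register_namespace("jsdl-posix", POSIX_NS)


def build_fragment(process_count: int, min_resources: float) -> ET.Element:
    """Build only the elements shown in the message; elided parts are omitted."""
    root = ET.Element("Fragment")  # hypothetical wrapper so both sections share a root
    app = ET.SubElement(root, "Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, "ProcessCount").text = str(process_count)
    resource = ET.SubElement(root, "Resource")
    count = ET.SubElement(resource, "ResourceCount")
    # Only a lower bound is given, so the number of resources is open-ended above it.
    ET.SubElement(count, "LowerBoundedRange").text = str(min_resources)
    return root


fragment = build_fragment(32, 1.0)
print(ET.tostring(fragment, encoding="unicode"))
```

The point of the lower-bounded range is that the process count (32) is fixed while the number of resources it is spread over is left open.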
I translate "No tiling constraints" as meaning TileSize=1, and since that is the default value, I have omitted it.
Nope ... "no tiling constraints" means "no tiling constraints" (i.e. TileSize undefined). TileSize=1 is a tiling constraint.
We need TileSize. I agree that the default ResourceCount=1 definition should be changed to 'undefined' as you mention in a subsequent email.
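The difference between an undefined value and an explicit 1 is easy to lose once a default of 1 is baked in. A minimal sketch of the distinction follows, assuming a hypothetical `TopologyRequest` record in which `None` stands for "undefined". The field and method names are illustrative, not spec terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopologyRequest:
    """Hypothetical record; None means 'undefined', not 'default to 1'."""
    tile_size: Optional[int] = None
    resource_count: Optional[int] = None

    def has_tiling_constraint(self) -> bool:
        # TileSize=1 is still a constraint; only an absent value is unconstrained.
        return self.tile_size is not None


unconstrained = TopologyRequest()
tiled = TopologyRequest(tile_size=1)
print(unconstrained.has_tiling_constraint())  # False
print(tiled.has_tiling_constraint())          # True
```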
So there are two issues:
1. Whether the topology requirements should be in the Application section or not. If they are in the Application section then the terms used should not be resource-flavored, i.e., not TotalCPUCount but something else.
2. How to rename 'ProcessCount' to eliminate the confusion with the term 'process'.
My answer to (1) is to keep these in the Application section. I am not sure how to rename ProcessCount though.
Ha ... so my answer to (1) is to put them somewhere else (near the Resource section). My view on this is that TotalCPUCount and TileSize are resource requirements on the global allocation, and not really tied to the application at all (i.e. they apply equally to POSIX applications, a clustered service instance, etc.).
I basically like to categorize things based on whether they are associated with allocating resources, or whether they are associated with binding the "work unit" to the allocation, since these are often two separate phases of getting work done in a batch system or in other execution management systems.
You can also subcategorize allocation requirements based on whether they apply globally to the entire allocation (e.g. TotalCPUCount) or whether they apply at an individual resource level (e.g. CPUCount). I don't think we have the notion of the former, do we?
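The subcategorization described above can be sketched as a small lookup table. The assignment of the three terms follows the message, while the `Scope` enum, the `REQUIREMENT_SCOPE` table, and the `by_scope` helper are hypothetical names introduced only for illustration.

```python
from enum import Enum
from typing import List


class Scope(Enum):
    GLOBAL_ALLOCATION = "applies to the entire allocation"
    PER_RESOURCE = "applies to each individual resource"


# Classification as stated in the message; the table itself is an assumption.
REQUIREMENT_SCOPE = {
    "TotalCPUCount": Scope.GLOBAL_ALLOCATION,
    "TileSize": Scope.GLOBAL_ALLOCATION,
    "CPUCount": Scope.PER_RESOURCE,
}


def by_scope(scope: Scope) -> List[str]:
    """Return the requirement names with the given scope, sorted by name."""
    return sorted(name for name, s in REQUIREMENT_SCOPE.items() if s is scope)


print(by_scope(Scope.GLOBAL_ALLOCATION))  # ['TileSize', 'TotalCPUCount']
print(by_scope(Scope.PER_RESOURCE))       # ['CPUCount']
```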
Could I also ask you to let me know if the examples in section 8 of the spec actually make sense or not?
Sort of? I won't be sure until we agree on the terminology changes. I actually think that my 4 use cases cover it pretty well (from the allocation point of view for parallel jobs), although some examples could be used to illustrate the use of CPUCount ... perhaps in conjunction with an "Exclusive" flag.
-- Chris