
On Apr 19, Christopher Smith loaded a tape reading:
If we need to specify different mechanisms for starting up tasks of a parallel job a la the RSL jobType, then I'd like that to be separate from the description of the resource allocation required.
For what it's worth, queuing systems like LSF/PBS/SGE don't handle this startup phase (it's up to the job), so I'd like to see some example terms describing job process topology (basically simple|multi|mpi use cases), since I'm not too sure what they would look like, or what semantics would be required.
Allocate "as a unit" just means that if I'm going to allocate any cpus from a resource, I have to allocate "tileSize" cpus.
-- Chris
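The "allocate as a unit" rule Chris describes can be sketched as a small helper that rounds a CPU request up to a whole number of tiles. This is purely illustrative; the function name and shape are not part of any JSDL or scheduler API.

```python
# Sketch of the "allocate as a unit" rule: if any CPUs are allocated from a
# resource, they must come in whole multiples of tileSize. Names here are
# invented for illustration.
import math

def cpus_to_allocate(requested_cpus: int, tile_size: int) -> int:
    """Round a CPU request up to the nearest multiple of tile_size."""
    if requested_cpus <= 0:
        return 0
    return math.ceil(requested_cpus / tile_size) * tile_size

# e.g. requesting 6 CPUs from a resource with tileSize=4 allocates 8
```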
Well, I am struggling because I do not want to propose creeping featurism for JSDL... if possible, I think the startup mechanism should be left to extensions because it is such a rich and messy thing, as I will try to describe below.

What I am struggling to understand w.r.t. JSDL is whether there is some aspect of job layout that is a meaningful part of the job definition but not as simple as the resource topology stuff you were discussing. Because of the RSL legacy, I keep wanting to see some generic concepts for process count etc. that are orthogonal to the specific startup mechanism but which in essence parameterize both allocation and job startup.

Perhaps if resource topology is precise enough, there is nothing more needed? Maybe a precise description of allocated resources defines a "job shaped hole" into which an implied job topology would fit? :-) The constellation of resource requirements and posix limits (and any other extensions?) is what defines the virtual resource or "job shaped hole" within which the executable is activated.

A practical runtime environment feature might be for a job system like GRAM to expose a "resource map" in the form of JSDL resource syntax in a file or environment variable, so the job can introspect on the actual allocation it received... this is a different but related portability/interop problem for job execution systems when you include runtime middleware in the executable. For example, if a future MPICH-Gx release supports the dynamic task features of MPI, the runtime implementation might require this sort of information from the scheduler so it can work within its allocation?

karl

OK, here is the messy stuff I hope can remain somehow out of scope but still feasible. Basically, GRAM is a higher-level job submission model than what you describe for LSF/PBS/SGE, where we try to provide a more generic user-oriented job model instead of the very low-level "job script" model of the local scheduler.
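The "resource map" idea above might look something like the following sketch: the job system publishes the actual allocation as a JSDL-like resource fragment in an environment variable, and the job introspects on it at startup. The variable name `JSDL_RESOURCE_MAP` and the element names are invented for this example; no GRAM release defines them.

```python
# Hedged sketch of the proposed "resource map" feature. The env var name and
# the XML vocabulary are hypothetical, invented for illustration only.
import os
import xml.etree.ElementTree as ET

def read_resource_map(env=os.environ):
    """Parse a hypothetical JSDL-like allocation description from the env."""
    doc = env.get("JSDL_RESOURCE_MAP")
    if doc is None:
        return None  # scheduler did not publish an allocation map
    root = ET.fromstring(doc)
    return {host.get("name"): int(host.get("cpus"))
            for host in root.findall("Host")}

# Example: a two-host allocation that runtime middleware (say, a future
# MPI implementation with dynamic tasks) could use to size its layout.
example = ('<Resources>'
           '<Host name="n0" cpus="4"/><Host name="n1" cpus="4"/>'
           '</Resources>')
print(read_resource_map({"JSDL_RESOURCE_MAP": example}))
```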
Job types in RSL are different activation methods:

single: one instance of the executable is activated, and it must, through site-specific means, do whatever else is needed; for example, read a scheduler-specific HOSTS file and use some site-specific launch mechanism to start tasks on each allocated host.

multi: all "count" instances of the executable are activated, so the job need not do anything but compute; for example, GRAM generates a job script that does the site-specific work described under "single".

mpirun: the parameters are mapped through to an 'mpirun' invocation to launch the runtime required for the job. In practice, I think this is a wrapped form of "single" where the user executable is mapped to an argument and the scheduled executable is mpirun, but I'd need to check to be sure.

condor: the job is submitted to a condor flock, if my memory serves me.

Missing is a funny SMP-aware hybrid one can imagine:

spreadsingle: one instance of the executable is activated on each allocated resource, but it must start additional parallel tasks itself if it wants parallelism on a resource. So: handle site-specific resource activation for the job, but leave the job to "expand" on each host (node).

-- Karl Czajkowski karlcz@univa.com