
Peter G. Lane wrote:
I'm confused about many of the limits in POSIXApplication. Some seem redundant with respect to the Resources requirement elements. Others simply don't seem clear in their spec descriptions.
It depends on the job manager in question whether the limits are implied by the resources (and with some apps, I understand that the limits have to be specified separately. The person that told me this was not at all happy with the situation BTW!) Yes, it's a mess, but that's the truth on the ground anyway. All limits were discovered by examining the output of ulimit across a wide variety of systems. There's a core set of them that are probably universal on Unix, but some of the others seem more specialized. But don't worry. The _correct_ way of handling the limits is pretty clear: if you don't have a system that supports a particular limit in a natural way, throw out documents that ask to set that limit. It's strongly suggested that JSDL authors do not set limits unless they really need them. FYI, jobs compliant with the HPC Profile will not set limits at all.
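To make the "throw out documents that ask for limits you can't map" rule concrete, here is a minimal sketch in Python. The table, the exception, and the function name `resolve_limits` are all hypothetical, and the pairing of element names with rlimit constants is an assumption for illustration rather than anything the spec mandates. Wall-clock time is marked unmappable because setrlimit has no wall-time resource.

```python
import resource

# Hypothetical table: POSIXApplication limit element -> rlimit constant name.
# The pairings are assumptions for illustration, not taken from the spec.
LIMIT_TO_RLIMIT = {
    "DataSegmentLimit": "RLIMIT_DATA",
    "VirtualMemoryLimit": "RLIMIT_AS",
    "MemoryLimit": "RLIMIT_RSS",
    "WallTimeLimit": None,  # setrlimit has no wall-clock resource
}


class UnsupportedLimit(Exception):
    """Raised when a document asks for a limit this system cannot map."""


def resolve_limits(requested):
    """Translate {element name: value} into {rlimit constant: value}.

    Rejects the whole request if any single limit cannot be mapped,
    instead of silently dropping it.
    """
    resolved = {}
    for name, value in requested.items():
        const_name = LIMIT_TO_RLIMIT.get(name)
        # getattr guards against constants missing on this platform.
        const = getattr(resource, const_name, None) if const_name else None
        if const is None:
            raise UnsupportedLimit(name)
        resolved[const] = value
    return resolved
```

A request containing only limits the platform knows about comes back as a dictionary ready for setrlimit; anything else is refused before the job starts.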
Is it then impossible to specify a total job CPU limit for all processes, or is that what TotalCPUTime in Resources is meant for? But then it's not so clear why MemoryLimit is not redundant with TotalPhysicalMemory.
The limits are purely things to splat into ulimit, with all that implies.
Here are some more specific questions:
WallTimeLimit -- Why isn't there a hard wall time limit if this is a soft limit?
We thought about this, but decided that we'd only ever set soft limits.
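For what "only ever set soft limits" might look like in practice, here is a rough sketch using Python's `resource` module. The helper name `set_soft_limit` is hypothetical, and this is one possible reading of the policy, not a description of any particular job manager.

```python
import resource


def set_soft_limit(res, new_soft):
    """Change only the soft limit for `res`; the hard limit is left as found."""
    _, hard = resource.getrlimit(res)
    inf = resource.RLIM_INFINITY
    # A soft limit may never exceed a finite hard limit.
    if hard != inf and (new_soft == inf or new_soft > hard):
        raise ValueError("soft limit would exceed hard limit")
    resource.setrlimit(res, (new_soft, hard))
    return resource.getrlimit(res)


if __name__ == "__main__":
    # Example: disable core dumps via the soft limit only.
    print(set_soft_limit(resource.RLIMIT_CORE, 0))
```

Because the hard limit is untouched, the job itself can still raise the soft limit again up to the hard ceiling if it needs to.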
DataSegmentLimit -- What is meant by a data segment?
Some Unix systems let you limit this. I've never needed it myself. If you can't map it (and your question implies that you can't), throw it out. We won't complain!
VirtualMemoryLimit -- Similar to MemoryLimit, what's the difference between this and an UpperBoundedRange value for Resources' TotalVirtualMemory? Is there some reason someone would want to request nodes with a total virtual memory capacity greater than they are limiting their job to use?
If I give an upper bounded range for the VM, the resource manager might choose to allocate an amount of memory less than the upper bound described in the UBR. Also, the RM might have ways other than limits of imposing constraints on the resources open to me. On the other hand, setting the VM limit just means that the OS will send me a signal when (or sometime after) I exceed it (or is VM one of the ones where it makes some syscalls fail?) so it's at least potentially recoverable. By contrast, the resource allocation may be implemented in a way (e.g. a virtualized OS or, depending on underlying architecture, processor affinities) that cannot be defeated. The take-home message is that POSIXApp limits are specifically to do with ulimit settings, but resource management may be done in other ways, and probably would have to be on clustered systems anyway.
Donal.