
Karl Czajkowski wrote:
I think one decision to make is whether BES services are homogeneous or not. I think Donal is advocating homogeneity. However, I do not think this is the main source of complexity. In either case, I agree with you that JSDL ought to be usable as a core syntax for describing the "resources available from a BES instance" as well as the "resources required for an activity". As you describe it, this is sort of a "class ad" in the Condor sense of the word. The problem comes from trying to advertise a resource that can handle multiple jobs simultaneously.
I'd largely agree with this paragraph, except that I'd note that I'm only advocating that a BES instance export a (maximal) homogeneous view of itself, and that more complex configurations be modelled as multiple containers. This helps keep the reasoning for resource selection simple, and that's good because the reasoning is already quite complex.
The tricky part is that this is not just "nodes free", but must be intersected with policies such as maximum job size. Should there be a vocabulary for listing the total free resources and the job sizing policies directly? Or should the advertisement list a set of jobs that can be supported simultaneously, e.g. I publish 512 nodes as quantity 4 128-node job availability slots? The latter is easier to match, but probably doesn't work in the simple case because of the combinatoric problem of grouping jobs which are not maximal. How does a user know that they can have quantity 8 64-node jobs or not?
Firstly, a container should not publish a capacity that exceeds the max size of job it is willing to accept. The size of the physical resource is uninteresting; the queue capacity is what matters. Secondly, the ability to run simultaneous isolated jobs is independent of that. That's instead something that is discovered by talking to some reservation service and discovering that you can get two (or more) overlapping reservations for the same container. I don't think anyone is working on standardising reservation services at the moment (and nor can they really be described as being part of the simple HPC profile; there are lots of resources out there without reservation capability and they still work fine). In short, in the example above the user *doesn't* know that they can have that number of jobs running at once, but they can know that they can submit that many jobs to that resource and have them run eventually (according to system policy). Of course, a site might choose to publish additional information about their system policies allowing users to really know such things. But again, they might not; local autonomy rules after all.
[eliding more points that are interesting, but a bit imponderable]
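As a rough illustration of the first point, here is a minimal Python sketch using the 512-node example from above. The function names and the 128-node job limit are hypothetical, chosen only for illustration; nothing here is taken from BES or JSDL.

```python
def published_capacity(pool_nodes: int, max_job_nodes: int) -> int:
    """Capacity a container advertises: never more than the largest job it accepts."""
    if pool_nodes < 0 or max_job_nodes < 0:
        raise ValueError("node counts must be non-negative")
    # The physical pool size only matters if it is smaller than the job limit.
    return min(pool_nodes, max_job_nodes)


def may_submit(job_nodes: int, advertised_nodes: int) -> bool:
    """True if a job of this size fits within the advertised capacity."""
    return 1 <= job_nodes <= advertised_nodes


# Hypothetical site: 512 physical nodes, jobs capped at 128 nodes.
advertised = published_capacity(pool_nodes=512, max_job_nodes=128)
print(advertised)                   # 128
print(may_submit(64, advertised))   # True
print(may_submit(256, advertised))  # False
```

Note what the sketch deliberately does not answer: whether eight 64-node jobs would run at the same time. It only says that each of them may be submitted and will run eventually under local policy.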
If we say all of this is too "researchy" for standardization, then I am not sure what the standard will really support. Perhaps the best approach is the first one I mentioned, where relatively raw data is exposed on several extensible axes (subject to authorization checks): overall resource pool descriptions, job sizing policies, user rights information, etc. The simple users may only receive a simple subset of this information which requires minimal transformation to tell them what they can submit. The middleware clients receive more elaborate data (if trusted) and can do more elaborate transformation of the data to help their planning.
I advocate a simple model. If a BES publishes a resource description to me, then it should accept a job from me that asks for the maximal resources from that description. It need not execute that job straight away, but should do so as soon as reasonably possible given workload (and other things like maintenance periods, etc.). Like this, there is no need to publish further policy information; any relevant policies have already been applied by the time an invitation to offer is sent. Now, while it is possible that there are some configurations that will not be captured by this (e.g. ascribing a polynomial scoring function to each resource type and capping the maximum total score), what I describe is going to capture what most people do, I think. It's also pretty easy to implement and roll out.
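The acceptance rule in this model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resource names and values are invented, a description is modelled as a plain mapping of per-resource maxima rather than as JSDL, and `accepts` is a hypothetical helper, not part of any BES interface.

```python
from typing import Mapping


def accepts(description: Mapping[str, int], request: Mapping[str, int]) -> bool:
    """True if every requested amount is within the published maximum."""
    for name, amount in request.items():
        # A resource the container never advertised cannot be promised.
        if name not in description:
            return False
        if amount > description[name]:
            return False
    return True


# Hypothetical published description (policies already applied by the site).
description = {"nodes": 128, "memory_gb_per_node": 16, "wall_hours": 24}

# A job asking for the maximum of everything must be accepted.
print(accepts(description, dict(description)))  # True
print(accepts(description, {"nodes": 129}))     # False
```

Acceptance here says nothing about when the job starts; it only states that the container will take the job and run it as workload and policy allow.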
The only alternative I can imagine, right now, would be a very elaborate resource description language utilizing the JSDL "range value" concept to expose some core policy limits, as well as a number of extensions to express overall constraints which define the outer bounds of the combinatoric solution space. This DOES seem pretty "researchy" to me... but maybe someone else sees a more appealing middle ground?
I think I'm probably with Marvin on this. Let's get something that's workable for the 90% case without closing off the 10% from being tackled in the future (though maybe by a different route). That's got to be the way with the best payoff in the next 6-12 months.

Donal.