
Hi,

Comments in-lined below.

Marvin Theimer wrote:
<snip>
· Information model:
· JSDL seems to inherently be focused on describing a single job or a single computational resource. For example, it has no notion of describing all the differing compute nodes of a (heterogeneous) compute cluster. By incorporating JSDL elements into the BES information model it seems that BES is foreclosing the ability to describe things like compute clusters. This issue also affects what can get returned from GetActivityJSDLDocuments. If I'm wrong about this, then it seems like it would be worth having an explicit explanation about how to achieve this functionality somewhere in the specification.
I should say here that JSDL 1.0 is not meant to be a final answer. The resource model used within it is meant as a "rough place holder": we expect the CIM people to provide something much more robust and appropriate, and we simply wanted something simple to get things going. We hope that when this model is available we can roll it into a later release of JSDL. If you're interested in following this up, I'd suggest getting involved with the OGSA information model people and (obviously) the JSDL people.
· The BES information model now includes various posix-specific elements of JSDL. How would other systems -- such as a Windows system -- be described?
The POSIX elements appear only inside the POSIXApplicationType. The best way to describe another kind of system would be to define a WindowsApplicationType (or something similar) alongside it. The JSDL group would be very interested in this.
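As a rough illustration of what that could look like (a sketch only, not anything the JSDL group has defined): the Python below builds the same job twice, once with the existing POSIXApplication element and once with a hypothetical WindowsApplication element. The JSDL and JSDL-POSIX namespace URIs are the ones I believe JSDL 1.0 uses; the Windows namespace, the WindowsApplication element and its children are pure assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs believed to match JSDL 1.0; check against the specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
# Hypothetical namespace: no Windows extension is defined by JSDL 1.0.
WIN_NS = "urn:example:jsdl-windows"


def build_job(app_ns, app_element, executable, arguments=()):
    """Build a JobDefinition whose Application holds one application-specific element."""
    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    description = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    application = ET.SubElement(description, f"{{{JSDL_NS}}}Application")
    # The platform-specific part lives entirely inside this one child element.
    specific = ET.SubElement(application, f"{{{app_ns}}}{app_element}")
    ET.SubElement(specific, f"{{{app_ns}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(specific, f"{{{app_ns}}}Argument").text = argument
    return job


posix_job = build_job(POSIX_NS, "POSIXApplication", "/bin/echo", ["hello"])

# Hypothetical Windows counterpart; element names simply mirror the POSIX ones.
windows_job = build_job(
    WIN_NS, "WindowsApplication", r"C:\Windows\System32\cmd.exe", ["/c", "echo hello"]
)

if __name__ == "__main__":
    print(ET.tostring(windows_job, encoding="unicode"))
```

The point is only that the generic parts of the document stay the same, and the platform-specific description is swapped in as a different child of Application.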
· Is there any notion of specifying that all compute nodes should have the /same/ value for some attribute (e.g. CPU architecture, CPU speed, NIC card)? This seems to be missing from the JSDL specification, but seems very important for BES if it is to support things like compute clusters.
Again, this will hopefully come in the future.
· Some of the elements seem either incompletely specified, have definitions that are open to multiple interpretations, or have definitions that would be very difficult to implement in practice. In particular:
· CPU architecture seems like it can't describe all the variations -- let alone all the peripherals such as GPUs -- that a computing resource might have (let alone a cluster).
· CPU speed seems like the tip of an iceberg having to do with characterizing the performance of a system, which will depend on all manner of things like details of the processor chip used, cache sizes, bus used, etc.
· Network bandwidth: is this the theoretical maximum of the NIC on a compute node or is it the current bandwidth actually available in a (shared) system? Note that the latter is difficult to measure in a practically useful way. Note also that network bandwidth only describes one aspect of communications performance and that several others are arguably equally important (e.g. latency).
All this leads to the question of whether BES will have a notion of extending the information model that is supplied. If so, then that leads to the question of what the base case should be and whether it should include a smaller set of things than is currently listed in the spec.
Are there any plans to tighten the definitions of some of the more vague information elements? (I guess this really is an issue more for the JSDL WG than for BES.)
Again, this is where we are now planning to go with JSDL. JSDL 1.0 should be seen as a starting point and not the end. We hope most of these things can be handled through extensions to JSDL 1.0. Those that can't, we'll need to add in future versions.
· GetActivityJSDLDocuments returns a JSDL document for each specified activity. Is this sufficient to capture the entire "provenance" for what has happened to the activity? In particular, would it be sufficient to allow someone to (a) run the same activity on another BES service (assuming same hardware and software) and get the same results and (b) debug what has happened to an errant activity? I would argue that both capabilities have proven to be important in actual systems.
A JSDL document is (by the definition of our charter) a Job Submission document. As such, things like provenance were ruled out of scope (not by us but by GGF in general; it was felt that this was too much to do all in one go). However, there is now scope to go back and re-address these issues. There were some interesting discussions at the last GGF meeting about where non-submission information could be placed. The suggestions included putting it in a wrapper around the JSDL document or (from my recollection, the more popular option) placing it at the outermost level of the JSDL document.

Hope this helps,

steve..
--
------------------------------------------------------------------------
Dr A. Stephen McGough                       http://www.doc.ic.ac.uk/~asm
------------------------------------------------------------------------
Technical Coordinator, London e-Science Centre,
Imperial College London, Department of Computing,
180 Queen's Gate, London SW7 2BZ, UK
tel: +44 (0)207-594-8409
fax: +44 (0)207-581-8024
------------------------------------------------------------------------