Questions about JSDL specification

Hello, I have been reading the JSDL specification, and I have a few questions about it.
1) I want to make sure that I understand "support" and "satisfy" as defined on page 7. If a system can parse a JSDL document, then it supports it, and if it can do everything requested by the JSDL document, then it satisfied it. Is that correct?
2) Just to make sure I understand support, when it says "The JSDL core element set contains the semantics for elements that are defined by JSDL 1.0. All elements MUST be supported by JSDL 1.0 compliant consuming systems.", that simply means that they need to be parsed, not that they need to be accepted, correct?
3) ApplicationName identifies the executable. I don't understand where the executable comes from. Do we expect that the underlying system can figure out which program to run from the name and version, and that it is pre-staged? Can applications bring their executable with them by staging it? Am I supposed to use the POSIXApplication element? I'm confused how this fits together.
3) When I specify the operating system: how do I specify Linux version? Kernel version? Distro + version? Is it dependent on the underlying system?
3) IndividualCPUCount allows me to specify a range in terms of double: what does this mean? For example, if I specify that I want 3.14 CPUs, it's a legal specification, but I don't know what it means. Ditto for IndividualPhysicalMemory and IndividualVirtualMemory and IndividualDiskSpace.
3) I don't understand IndividualNetworkBandwidth: bandwidth to where? Does this refer just to the local NIC? What if there are multiple NICs?
4) I'm confused how IndividualDiskSpace interacts with the FileSystem element. The FileSystem element specifies how much disk space is needed on a particular file system: the IndividualDiskSpace says something about disk space, but not about where the disk space is located. Which disk space is it? What does it mean if I specify a FileSystem and IndividualDiskSpace?
5) I don't understand the difference between IndividualCPUCount and TotalCPUCount. Can I think of it as the number of CPUs on a single node, and the total number needed across all nodes? Or does it mean something different?
6) When I stage files, the destination might be a filesystem on NFS. If I'm running many jobs at the same time, the files might clobber each other unless I give the files unique names: is the user responsible for doing so, or is there some way to specify a unique identifier in the file name? For example, in Condor I can say something like File.$(Process) to get a unique filename based on the job id.
Thanks for any assistance in understanding! -alain

Hi Alain, please find my answers inlined below. Alain Roy wrote:
Hello,
I have been reading the JSDL specification, and I have a few questions about it.
1) I want to make sure that I understand "support" and "satisfy" as defined on page 7. If a system can parse a JSDL document, then it supports it, and if it can do everything requested by the JSDL document, then it satisfied it. Is that correct?
Yes.
2) Just to make sure I understand support, when it says "The JSDL core element set contains the semantics for elements that are defined by JSDL 1.0. All elements MUST be supported by JSDL 1.0 compliant consuming systems.", that simply means that they need to be parsed, not that they need to be accepted, correct?
Yes, in the sense that "accepted" here means that the semantics attached to that XML element will be actually carried out. "Support" means that the consuming entity can parse that XML element, and inherently knows its semantics. "Satisfy" means that the consuming entity is able to (and will) execute the semantics attached to a particular JSDL XML element.
3) ApplicationName identifies the executable. I don't understand where the executable comes from. Do we expect that the underlying system can figure out which program to run from the name and version, and that it is pre-staged? Can applications bring their executable with them by staging it? Am I supposed to use the POSIXApplication element? I'm confused how this fits together.
Consider jsdl:ApplicationName and jsdl:ApplicationVersion together. The use case is that, for popular applications such as BLAST, the user just needs to give a "well known" application name, i.e. a name that the consuming system recognises, and an appropriate version number if this is of any relevance. The consuming system can then figure out the path to the concrete BLAST executable and use it accordingly. With more than one consuming system, this mechanism allows the user to submit her JSDL job to another system without modifications, provided that both understand the same "well known" application name. Another benefit is that this separates concerns: the user only needs to be concerned about which application she wants to run, and the execution system needs to be concerned with whether it can provide this application, in the correct version, and what the path to that application actually is.
If you really need to run your own executable (which is, honestly, quite common), then you use the jsdl:POSIXApplication extension, which allows you to specify the direct path to that executable along with necessary additional information. Then you do not have to specify the jsdl:ApplicationName and jsdl:ApplicationVersion elements; if you do, the consuming entity would simply ignore them in the presence of a jsdl:POSIXApplication element.
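The lookup described here could be pictured roughly as follows. This is only a minimal Python sketch, not how any particular consuming system works: the KNOWN_APPS registry, its path, the version string "2.2", the example executable path and the resolve_executable helper are all hypothetical. Only the element names and the two JSDL 1.0 namespace URIs are taken from the specification.

```python
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

# Hypothetical site-local registry of "well known" applications.
KNOWN_APPS = {("BLAST", "2.2"): "/opt/blast-2.2/bin/blastall"}

def resolve_executable(jsdl_text):
    """Return the path the consuming system would run for this document."""
    app = ET.fromstring(jsdl_text).find(".//jsdl:Application", NS)
    if app is None:
        raise ValueError("no jsdl:Application element")
    # An explicit POSIXApplication executable wins; name/version are ignored.
    exe = app.find("posix:POSIXApplication/posix:Executable", NS)
    if exe is not None and exe.text:
        return exe.text.strip()
    # Otherwise fall back to the site's own name/version lookup.
    name = app.findtext("jsdl:ApplicationName", default="", namespaces=NS).strip()
    version = app.findtext("jsdl:ApplicationVersion", default="", namespaces=NS).strip()
    if (name, version) not in KNOWN_APPS:
        raise LookupError(f"unknown application {name!r} version {version!r}")
    return KNOWN_APPS[(name, version)]

# Job that names a well-known application and leaves the path to the site.
BY_NAME = f"""<jsdl:JobDefinition xmlns:jsdl="{NS['jsdl']}">
  <jsdl:JobDescription><jsdl:Application>
    <jsdl:ApplicationName>BLAST</jsdl:ApplicationName>
    <jsdl:ApplicationVersion>2.2</jsdl:ApplicationVersion>
  </jsdl:Application></jsdl:JobDescription>
</jsdl:JobDefinition>"""

# Job that brings its own executable path via the POSIX extension.
OWN_EXE = f"""<jsdl:JobDefinition xmlns:jsdl="{NS['jsdl']}" xmlns:posix="{NS['posix']}">
  <jsdl:JobDescription><jsdl:Application>
    <jsdl:ApplicationName>BLAST</jsdl:ApplicationName>
    <posix:POSIXApplication>
      <posix:Executable>/home/alain/bin/myblast</posix:Executable>
    </posix:POSIXApplication>
  </jsdl:Application></jsdl:JobDescription>
</jsdl:JobDefinition>"""

print(resolve_executable(BY_NAME))
print(resolve_executable(OWN_EXE))
```

The same BY_NAME document could be sent to a second site with a different registry, which is the portability benefit mentioned above.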
3) When I specify the operating system: how do I specify linux version? Kernel version? Distro + version? Is it dependent on the underlying system?
Specifying a particular distribution is a very hairy topic. As you know, Linux distributions are more or less all the same in the sense that they provide a kernel (Linux) and "all the stuff around". I think Linux distributions, or rather their default installation, can be seen as profiles on the same space of software available for Linux. In that sense I think the best effort is to try to specify the kernel version. But this is not satisfactory enough, I know. :-/
3) IndividualCPUCount allows me to specify a range in terms of double: what does this mean? For example, if I specify that I want 3.14 CPUs, it's a legal specification, but I don't know what it means. Ditto for IndividualPhysicalMemory and IndividualVirtualMemory and IndividualDiskSpace.
Without checking the spec, I guess this is an artefact of how ranges are given in JSDL: to be able to use one Range type for all sorts of resources, we decided to use xsd:double, as this includes integers. So you may want to specify 3.14 for jsdl:IndividualCPUCount, but the consuming system then may
- throw the JSDL doc back at you nagging about silly values, or
- accept the document and use 3.0 instead, or
- do something else, e.g. cause a kernel panic in the underlying OS. ;-)
3) I don't understand IndividualNetworkBandwidth: bandwidth to where? Does this refer just to the local NIC? What if there are multiple NICs?
4) I'm confused how IndividualDiskSpace interacts with the filesystem element. The FileSystem element specifies how much disk space is needed on a particular file system: the IndividualDiskSpace says something about disk space, but not about where the disk space is located. Which disk space is it? What does it mean if I specify a FileSystem and IndividualDiskSpace?
5) I don't understand the difference between IndividualCPUCount and TotalCPUCount. Can I think of it as the number of CPUs on a single node, and the total number needed across all nodes? Or does it mean something different?
Consider the jsdl:Individual* and jsdl:Total* elements together with the jsdl:ResourceCount element. They are used to express a tiled topology. I hope somebody else can step in here and give a more detailed description.
6) When I stage files, the destination might be a filesystem on NFS. If I'm running many jobs at the same time, the files might clobber each other unless I give the files unique names: is the user responsible for doing so, or is there some way to specify a unique identifier in the file name? For example, in Condor I can say something like File.$(Process) to get a unique filename based on the job id.
No. In JSDL, the user is responsible for unique file names. On the other hand, you can specify whether the system shall overwrite any existing file with that name or not.
Cheers, Michel
--
Michel <dot> Drescher <at> uk <dot> fujitsu <dot> com
Fujitsu Laboratories of Europe
+44 20 8606 4834
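Since uniqueness is left to the user, whatever generates the JSDL document has to build the name itself. A minimal Python sketch of that, under some assumptions: the unique_name and staging_element helpers and the example URI are hypothetical, and the overwrite behaviour is expressed with the JSDL 1.0 jsdl:CreationFlag element, here set to dontOverwrite so that a name clash is not silently clobbered.

```python
import uuid
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"

def unique_name(base, job_tag=None):
    """Append a caller-supplied tag, or a random one, to the base name."""
    tag = job_tag if job_tag is not None else uuid.uuid4().hex
    return f"{base}.{tag}"

def staging_element(base, source_uri, job_tag=None):
    """Build a jsdl:DataStaging element that stages in to a unique file name."""
    staging = ET.Element(f"{{{JSDL_NS}}}DataStaging")
    ET.SubElement(staging, f"{{{JSDL_NS}}}FileName").text = unique_name(base, job_tag)
    # Do not overwrite an existing file if the name clashes after all.
    ET.SubElement(staging, f"{{{JSDL_NS}}}CreationFlag").text = "dontOverwrite"
    source = ET.SubElement(staging, f"{{{JSDL_NS}}}Source")
    ET.SubElement(source, f"{{{JSDL_NS}}}URI").text = source_uri
    return staging

# Hypothetical usage: one element per job, tagged with a client-side counter.
element = staging_element("input.dat", "http://example.org/input.dat", job_tag="7")
print(ET.tostring(element, encoding="unicode"))
```

Passing a client-side counter as job_tag gives roughly the effect of Condor's File.$(Process), except that the substitution happens before submission rather than inside the job system.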

At 11:25 PM 8/2/2006 -0700, Michel Drescher wrote:
Hi Alain,
please find my answers inlined below.
Thank you for your thoughtful comments. I have one or two responses.
3) IndividualCPUCount allows me to specify a range in terms of double: what does this mean? For example, if I specify that I want 3.14 CPUs, it's a legal specification, but I don't know what it means. Ditto for IndividualPhysicalMemory and IndividualVirtualMemory and IndividualDiskSpace.
Without checking the spec, I guess this is an artefact of how ranges are given in JSDL: to be able to use one Range type for all sorts of resources, we decided to use xsd:double, as this includes integers.
So you may want to specify 3.14 for jsdl:IndividualCPUCount, but the consuming system then may - throw the JSDL doc back at you nagging about silly values, or - accept the document and use 3.0 instead, or - do something else, e.g. cause a kernel panic in the underlying OS. ;-)
You're right: it is because of how ranges are specified. Let me make the small suggestion that if there is a future version of JSDL, you consider adding a way to specify this in positive integers, so it's harder for people to specify something meaningless.
3) I don't understand IndividualNetworkBandwidth: bandwidth to where? Does this refer just to the local NIC? What if there are multiple NICs? 4) I'm confused how IndividualDiskSpace interacts with the FileSystem element. The FileSystem element specifies how much disk space is needed on a particular file system: the IndividualDiskSpace says something about disk space, but not about where the disk space is located. Which disk space is it? What does it mean if I specify a FileSystem and IndividualDiskSpace? 5) I don't understand the difference between IndividualCPUCount and TotalCPUCount. Can I think of it as the number of CPUs on a single node, and the total number needed across all nodes? Or does it mean something different?
Consider the jsdl:Individual* and jsdl:Total* elements together with the jsdl:ResourceCount element. They are used to express a tiled topology. I hope somebody else can step in here and give a more detailed description.
I don't think I understand, so if anyone else can step in, that would be great. Thanks! -alain

Alain Roy wrote:
At 11:25 PM 8/2/2006 -0700, Michel Drescher wrote:
Hi Alain,
please find my answers inlined below.
Thank you for your thoughtful comments. I have one or two responses.
3) IndividualCPUCount allows me to specify a range in terms of double: what does this mean? For example, if I specify that I want 3.14 CPUs, it's a legal specification, but I don't know what it means. Ditto for IndividualPhysicalMemory and IndividualVirtualMemory and IndividualDiskSpace.
Without checking the spec, I guess this is an artefact of how ranges are given in JSDL: to be able to use one Range type for all sorts of resources, we decided to use xsd:double, as this includes integers.
So you may want to specify 3.14 for jsdl:IndividualCPUCount, but the consuming system then may - throw the JSDL doc back at you nagging about silly values, or - accept the document and use 3.0 instead, or - do something else, e.g. cause a kernel panic in the underlying OS. ;-)
You're right: it is because of how ranges are specified.
Let me make the small suggestion that if there is a future version of JSDL, you consider adding a way to specify this in positive integers, so it's harder for people to specify something meaningless.
3) I don't understand IndividualNetworkBandwidth: bandwidth to where? Does this refer just to the local NIC? What if there are multiple NICs? 4) I'm confused how IndividualDiskSpace interacts with the FileSystem element. The FileSystem element specifies how much disk space is needed on a particular file system: the IndividualDiskSpace says something about disk space, but not about where the disk space is located. Which disk space is it? What does it mean if I specify a FileSystem and IndividualDiskSpace? 5) I don't understand the difference between IndividualCPUCount and TotalCPUCount. Can I think of it as the number of CPUs on a single node, and the total number needed across all nodes? Or does it mean something different?
Consider the jsdl:Individual* and jsdl:Total* elements together with the jsdl:ResourceCount element. They are used to express a tiled topology. I hope somebody else can step in here and give a more detailed description.
I don't think I understand, so if anyone else can step in, that would be great.
I think Michel is in a sense confirming what you already supposed. jsdl:Individual* are for individual compute resource requirements while jsdl:Total* are used to specify requirements that apply to the entire job. The jsdl:ResourceCount element merely expresses how many compute resources with those requirements are needed for your job.
Peter
Thanks! -alain
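The reading given in this message can be written down as a small check. This is only a sketch of that one interpretation, in Python: the allocation_ok helper is hypothetical, ranges are simplified to inclusive (low, high) pairs, and only the CPU dimension is shown.

```python
def allocation_ok(node_cpus, individual, total, resource_count):
    """Check a candidate allocation against the three requirements.

    node_cpus      -- CPUs on each allocated resource, e.g. [2, 2, 2, 2]
    individual     -- (low, high) bounds every single resource must meet
    total          -- (low, high) bounds on the sum over the whole job
    resource_count -- how many resources the job asked for
    """
    if len(node_cpus) != resource_count:
        return False
    if not all(individual[0] <= cpus <= individual[1] for cpus in node_cpus):
        return False
    return total[0] <= sum(node_cpus) <= total[1]

# Four resources with two CPUs each, eight CPUs for the job as a whole.
print(allocation_ok([2, 2, 2, 2], individual=(2, 2), total=(8, 8), resource_count=4))
# Two four-CPU resources reach the same total but miss the per-resource shape.
print(allocation_ok([4, 4], individual=(2, 2), total=(8, 8), resource_count=4))
```

Under this reading the Individual* values describe one tile, and the count and Total* values describe how the tiles add up to the whole job.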

Alain Roy wrote:
Michel Drescher wrote:
So you may want to specify 3.14 for jsdl:IndividualCPUCount, but the consuming system then may - throw the JSDL doc back at you nagging about silly values, or - accept the document and use 3.0 instead, or - do something else, e.g. cause a kernel panic in the underlying OS. ;-)
You're right: it is because of how ranges are specified.
Let me make the small suggestion that if there is a future version of JSDL, you consider adding a way to specify this in positive integers, so it's harder for people to specify something meaningless.
Why? Any sensible value for the fundamentally-integral ranges is exactly representable using a double. On the other hand, introducing a separate range typing scheme for integers greatly increases the amount of work needed to write matching engines *and* it also would have made the schema longer and more complicated. It's big enough already.
Anyway, if people want to specify meaningless things, they will do so anyway. (For example, they can write a document that uses a mailto: URL as a data staging source, which is totally meaningless as the mailto: scheme doesn't work that way round.) Given that, it is 100% expected that not all syntactically-valid JSDL documents will describe things that are possible and sensible. You have to check for yourself.
In my experience, attempting to restrict the space of legal documents so that only meaningful things may be said is likely to lead to there being some meaningful things that can't be said. (Are you *sure* that the hardware people will never figure out how to do a fractional CPU? I've seen some strange things in my time...)
Of course, for the specific example I would imagine that if someone asks for exactly 3.14 CPUs, the job manager will be unable to allocate that quantity of resources (e.g. only having 3.0 or 4.0 available) and the match will fail. Which would not be our fault, but rather poor job composition by the user (or their agent). End of problem.
Donal.
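Both points, that whole numbers survive the trip through a double and that the consumer has to check values for itself, can be illustrated in a few lines of Python. The as_cpu_count helper is hypothetical, and rejecting the value outright is just one of the possible consumer behaviours discussed in this thread.

```python
def as_cpu_count(value):
    """Return value as an int, or raise if it is not a positive whole number."""
    if value <= 0 or not float(value).is_integer():
        raise ValueError(f"{value!r} is not a usable CPU count")
    return int(value)

# Whole numbers are exact in an IEEE 754 double up to 2**53, far beyond
# any realistic CPU count.
assert float(2**53) == 2**53
assert float(2**53 + 1) != 2**53 + 1  # first whole number a double cannot hold

print(as_cpu_count(3.0))
try:
    as_cpu_count(3.14)
except ValueError as err:
    print("rejected:", err)
```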

At 09:54 AM 8/16/2006 +0100, Donal K. Fellows wrote:
Alain Roy wrote:
Michel Drescher wrote:
So you may want to specify 3.14 for jsdl:IndividualCPUCount, but the consuming system then may - throw the JSDL doc back at you nagging about silly values, or - accept the document and use 3.0 instead, or - do something else, e.g. cause a kernel panic in the underlying OS. ;-) You're right: it is because of how ranges are specified. Let me make the small suggestion that if there is a future version of JSDL, you consider adding a way to specify this in positive integers, so it's harder for people to specify something meaningless.
Why?
JSDL has gone down the route of specifying quite carefully what one can say about a job. (As opposed to something like Condor's ClassAds, which I'm rather partial to. I really do like the idea of allowing users to specify what they want and leave the semantics up to them, not the creators of the language.) But given that you've specified things so carefully, it was a surprise to me to see something that I couldn't figure out how to interpret as a JSDL consumer.
Any sensible value for the fundamentally-integral ranges is exactly representable using a double. On the other hand, introducing a separate range typing scheme for integers greatly increases the amount of work needed to write matching engines *and* it also would have made the schema longer and more complicated.
If that's the general feeling in the JSDL group, then I accept it. It's not a big deal to me. -alain

Alain Roy wrote:
JSDL has gone down the route of specifying quite carefully what one can say about a job. (As opposed to something like Condor's ClassAds, which I'm rather partial to. I really do like the idea of allowing users to specify what they want and leave the semantics up to them, not the creators of the language.)
The Condor approach has different problems, namely that it is trivial to produce two systems that cannot interoperate because they use wholly different vocabularies of terms. That sort of thing is a real problem for people trying to build higher-level middleware, especially if you are going to try to translate from other legacy job descriptions. (The last I heard, there still isn't a standard set of terms for things in Condor. Without that, how can I possibly know that there is any kind of correspondence between, say, "CPU" and "Processor"?) The "nailed down hard"[*] approach has its disadvantages, especially in the fact that it isn't particularly easy to extend, but it gains in other ways. At least our schemas are extensible; terms that are truly missing (e.g., stuff to do with software licenses) can be added without great trauma. FWIW, I suspect there isn't a perfect solution to this balance.
But given that you've specified things so carefully, it was a surprise to me to see something that I couldn't figure out how to interpret as a JSDL consumer.
It's easy to interpret. If someone is so foolish as to specify something that it is impossible to allocate, throw the job request out on the grounds of "user error". :-)
The other part of the matching is this: think of the space of resources available as a set of values, and the content of the IndividualCPUCount also specifies a set; if the sets have an intersection, you can allocate the resources as being *any* member of that intersection (and I advise doing it in such a way as to optimize things from your own perspective at that point, whatever that means). The aim is that users should specify what they need in as much detail as they want. If they over-specify, their choices get restricted but that's their problem.
I'm also working on a spec for a service that will, among other things, be able to do virtual->concrete resource mapping. Once such things are added, users can start to ignore how many CPUs are used entirely and focus instead on important stuff (like "when is the job going to run" and "how much is the bill going to be"). But that's outside the scope of the JSDL-WG. :-)
Donal.
[* I suppose it would be more formally correct to describe this as an ontological approach, in that it is based on defining a set of terms with precise meanings. Indeed, JSDL is certainly stimulating interest from the ontologists that I know... ]
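The set-intersection view of matching can be sketched as follows. This is a simplified Python model, not the JSDL range schema itself: the RangeValue class and match function are hypothetical, a range is reduced to exact values plus inclusive intervals, and "optimize from your own perspective" is arbitrarily taken to mean handing out the smallest acceptable value.

```python
from dataclasses import dataclass, field

@dataclass
class RangeValue:
    """Simplified stand-in for a JSDL range: exact values plus intervals."""
    exact: list = field(default_factory=list)
    intervals: list = field(default_factory=list)  # inclusive (low, high) pairs

    def contains(self, value):
        if any(value == e for e in self.exact):
            return True
        return any(low <= value <= high for low, high in self.intervals)

def match(requested, available):
    """Pick one member of the intersection, or None if it is empty."""
    candidates = [value for value in available if requested.contains(value)]
    if not candidates:
        return None  # nothing allocatable: reject the request as user error
    return min(candidates)  # site policy here: give away as little as possible

available_cpus = [1, 2, 3, 4]  # what this hypothetical site can hand out
print(match(RangeValue(intervals=[(2.0, 3.5)]), available_cpus))
print(match(RangeValue(exact=[3.14]), available_cpus))
```

A request for exactly 3.14 CPUs needs no special handling in this model: its intersection with what the site offers is empty, so the match simply fails.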

[Sorry if this is too late. Just catching up with some old email.]
Michel Drescher wrote:
Alain Roy wrote:
3) I don't understand IndividualNetworkBandwidth: bandwidth to where? Does this refer just to the local NIC? What if there are multiple NICs?
This just refers to the local NIC. The focus of JSDL 1.0 is on homogeneous resource descriptions, so you can't ask for multiple NICs of different types. You can't really specify how many NICs should be present either. Describing network requirements is important and more work is needed. Many people expressed interest in doing more at one of the sessions a few GGFs back. This element is just an initial attempt to allow a very simple description of network requirements.
4) I'm confused how IndividualDiskSpace interacts with the filesystem element. The FileSystem element specifies how much disk space is needed on a particular file system: the IndividualDiskSpace says something about disk space, but not about where the disk space is located. Which disk space is it? What does it mean if I specify a FileSystem and IndividualDiskSpace?
They don't interact. IndividualDiskSpace is meant to describe space that is not configured and usable as a FileSystem. The use case here is that you may ask for a certain amount of 'raw' resources (disk space etc) that you will configure as a separate step before doing something else with them. You shouldn't need to use this element if all you are doing is a batch job submission.
5) I don't understand the difference between IndividualCPUCount and TotalCPUCount. Can I think of it as the number of CPUs on a single node, and the total number needed across all nodes? Or does it mean something different?
Consider the jsdl:Individual* and jsdl:Total* elements together with the jsdl:ResourceCount element. They are used to express a tiled topology. I hope somebody else can step in here and give a more detailed description.
I didn't see any followup to this. I can give you more details if you want.
--
Andreas Savva
Fujitsu Laboratories Ltd
participants (5): Alain Roy, Andreas Savva, Donal K. Fellows, Michel Drescher, Peter G. Lane