New subject: Questions about JSDL schema

7 Jan 2005

Dear All,

Some time ago I wrote to the list to say that we are going to use JSDL to
implement uniform access to several Grid brokers, and we asked some
questions about the specification. Since the JSDL schema has changed, you
can find the next batch of questions and comments below... :-)

First of all, we must admit that the new schema is much better designed than
the previous version. Many of the comments and questions we had before have
become outdated. Nevertheless, the XML schema and specification have changed
considerably, so our first questions are:

What is your versioning policy, and what are your plans for further
development? Is the current version number 1.1? Are you going to introduce
any changes to this version, or do you plan to release subsequent versions?
Do you think we can base our interfaces on this version of the JSDL schema?

We also noticed that the JSDL specification is not always consistent with
the XML schema. I suppose we should rely on the XML schema, which is more
up to date, rather than on the specification? Are there any elements in the
specification that haven't been put into the schema yet? When do you expect
to finish the specification (or at least to synchronize it with the current
version of the schema)?

We also have questions concerning some details in the schema:

1. We don't see where we could specify the type of application distribution,
e.g. MPI, OpenMPI. Do you think this is too specific to add to the
application element?

2. How would you suggest specifying a logical file name: using a predefined
syntax or a special attribute on the FileName element? Can a whole
directory be specified as well?

3. Why are Source and Target in DataStaging part of one element? How can we
use different names for an input and an output file? We would suggest adding
a choice element above Source and Target, or defining separate elements,
e.g. SourceFiles and TargetFiles (probably of the same type).

4. Why is the CreationFlag element mandatory? I think this feature is not
available in many systems. Furthermore, a reasonable default value could be
chosen.

5. Definition 5.6.6.1:
"A Source element contains the location and may contain a user on the remote
system. This file MUST be staged in from the location specified by the URL
as the user on the remote host before the job has terminated."
Do you really mean "terminated", or should it be "started"?

6. What is expected as operatingSystemDesc? Is it a human-readable
description?

7. I saw the use-cases containing descriptions of a job consisting of
multiple processes and threads requiring multiple resources and/or
processors. I guess that specifying alternative configurations is not
possible, e.g. 4 nodes with 4 processors each OR 1 node with 16 processors?

8. I didn't find a way to specify the queue name in the new schema. I think
it might sometimes be useful, much like an implicit specification of a
specific host.

9. Regarding your question in the spec about the webService value of the
ApplicationTypeEnumeration, we think this is definitely needed (invoking an
existing WebService).

10. Do limits mean that if they are exceeded during application execution
the application must be terminated?

11. Within the FileSystem element there is a sub-element MountPoint. Who
should specify this? The user? I think a more common use-case is that users
rely on predefined variables (e.g. home, tmp) and specify paths relative to
these variables, whereas mount points depend on the local system.

12. In my previous email I asked about defining software dependencies, e.g.
necessary libraries, and you requested some use-cases. Simple examples are
as follows:
- an interactive application may need additional software (e.g. VNC) so
that users can access the application's user interface remotely
- a graphical application may need the OpenGL library to run
- Java applications need a Java Virtual Machine installed
Therefore, in my opinion such an element would be useful and general. Of
course, we can add it using the extension mechanism.
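
To make points 3 and 4 above more concrete, here is a minimal Python sketch
of how a broker might read staging entries if Source and Target lived under
separate SourceFiles and TargetFiles elements and CreationFlag were optional.
Everything beyond the element names discussed above is an assumption made
for illustration only: the JobDescription wrapper, the example URLs, the
"overwrite" default, the "append" value and the staging_entries helper are
hypothetical and are not taken from the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the SourceFiles/TargetFiles grouping is only the
# suggestion from point 3, and the wrapper element and URLs are made up.
DOC = """
<JobDescription>
  <SourceFiles>
    <DataStaging>
      <FileName>input.dat</FileName>
      <Source>gsiftp://example.org/data/input.dat</Source>
    </DataStaging>
  </SourceFiles>
  <TargetFiles>
    <DataStaging>
      <FileName>result.out</FileName>
      <CreationFlag>append</CreationFlag>
      <Target>gsiftp://example.org/results/result.out</Target>
    </DataStaging>
  </TargetFiles>
</JobDescription>
"""

# Assumed default, used when CreationFlag is omitted (point 4).
DEFAULT_CREATION_FLAG = "overwrite"


def staging_entries(xml_text):
    """Return one dict per DataStaging element, tagged with its direction."""
    root = ET.fromstring(xml_text)
    groups = (("SourceFiles", "in", "Source"), ("TargetFiles", "out", "Target"))
    entries = []
    for group, direction, url_tag in groups:
        for staging in root.findall(f"./{group}/DataStaging"):
            entries.append({
                "direction": direction,
                "file": staging.findtext("FileName"),
                "url": staging.findtext(url_tag),
                # Fall back to the assumed default if the flag is absent.
                "creation_flag": staging.findtext(
                    "CreationFlag", default=DEFAULT_CREATION_FLAG),
            })
    return entries


for entry in staging_entries(DOC):
    print(entry)
```

With separate groups, an input file and an output file can carry different
names without any ambiguity about which direction a DataStaging entry
describes.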

Best regards & thanks,
Ariel Oleksiak

----- Original Message ----- 
From: "Donal K. Fellows" <donal.k.fellows@manchester.ac.uk>
To: "Ariel Oleksiak" <ariel@man.poznan.pl>
Cc: "JSDL WG" <jsdl-wg@gridforum.org>
Sent: Tuesday, October 26, 2004 4:21 PM
Subject: Re: [jsdl-wg] Questions about JSDL schema
...
Ariel Oleksiak wrote:
...
> 1. How do we define an MPI job that needs N processors - can we use
> JobCategory + ProcessTopology? We just need to set the job type to MPI and
> to specify the number of processes. BTW, have you already defined a final
> structure for the ProcessTopology element (I have seen discussions
> concerning this on the mailing list)?

This is a definite use-case, and we intend to nail down the
ProcessTopology syntax at the up-coming face-to-face.
...
> 2. We couldn't find in the schema the "operator" attribute (described in
> the specification) to express the equal, min, and max operators.
> Moreover, we found in the spec: "<PhysicalMemory>4000 Units=M
> Constraint=Min</PhysicalMemory>". Which method should be used? The latter
> is not structured enough for an XML schema, and units and constraint
> should probably be defined as attributes.

That's one of the spots that remains to be tidied up. It's likely to end
up as something like:
   <PhysicalMemory units="MB">4000</PhysicalMemory>
With the default operator for numeric properties being "Min" (or
whatever it is called this week.)
...
> 3. We found the Executable complex element in the schema. Are you going
> to use this element anywhere? Or do you think ExecutionName is enough and
> Executable was left in the schema by mistake?

I've got use-cases where I need things other than just a path to an
executable binary. (One case is where we provide access to an
application but do not really wish to expose to users just what magic
incantation they use to invoke it on the grounds that it is likely to
change from time to time anyway and is really just a sysadmin thing.)

Experience shows that you can't build a scalable grid system if every
node has to understand masses of details in order to run a job; these
abstract Executable elements are a significant part of the strategy to
push these details closer to the resource providers.

Endpoint JSDL-accepting job engines might not understand these
higher-level things of course. But then they can just throw out the jobs
they don't understand; I'd expect other JSDL engines layered on top of
them (*cough*resource brokers*cough*) to handle those sorts of aspects.
...
> 4. How do we express required software dependencies, e.g. necessary
> libraries? Should they be inserted in the extend section in software
> requirements?

Something like that. Use-cases here would be helpful.
...
> 5. Where can extensions be added (so as to remain compatible with the
> JSDL spec)? Only in the parts of the schema where the "Extend" element
> is added? What about the remaining parts?

What parts remain?  :^D
...
> 6. Have you considered specifying the type of local resource
> management system (e.g. LSF, PBS, fork) to which the user wants to submit
> a job? This could be helpful for a resource broker.

On one level, I don't consider that sort of information to actually be
useful to a resource broker (wanting to run under a particular type of
batch queue is an unusual requirement) but if such information was
required it would be a software resource requirement. Possibly an
extension though.
...
> 7. We also need a uniform description of information about a job after
> submission, e.g. status etc., but as far as I understand this is out of
> the scope of the JSDL spec?

Yes, it's out of our scope (and much more in the scope of JSIM).
...
> 8. Should "objectives", which are used to direct the search for the best
> resources, be a part of the JSDL document (e.g. as an extend element), or
> are they totally out of the scope of JSDL and should be placed in a
> separate document?

Out of scope, but it would be really cool to hear about how such things
are done. I'd actually expect to be putting them in a document with a
JSDL document as component of that outer document, just as we'd also do
that sort of thing if we were expressing a workflow or some kind of
advanced scheduling requirements. JSDL is very much only part of the
overall picture (but we hope it will be a very useful part!) and we hope
that we'll be able to work on these broader specifications in the
future; there's definitely a need for such things, even if agreement on
them is going to be more difficult to come by.

Donal.
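
As an illustration of the attribute-based form sketched in the reply to
question 2 above, here is a minimal Python example of how a consumer might
read such a requirement and apply the default "Min" operator. It is only a
sketch under stated assumptions: the form itself was described as "likely"
rather than final, and the unit table, the use of binary multiples, the
treatment of a missing units attribute as bytes, the "Max"/"Equal" operator
spellings and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed unit names and binary multiples; the thread only shows "MB".
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_requirement(xml_text):
    """Parse one numeric requirement into (name, operator, value_in_bytes)."""
    elem = ET.fromstring(xml_text)
    units = elem.get("units", "B")          # assumption: no units means bytes
    operator = elem.get("operator", "Min")  # default operator per the reply
    value = int(elem.text.strip()) * UNIT_FACTORS[units]
    return elem.tag, operator, value


def satisfies(requirement, available_bytes):
    """Check an offered amount against a parsed requirement."""
    _, operator, value = requirement
    if operator == "Min":
        return available_bytes >= value
    if operator == "Max":
        return available_bytes <= value
    if operator == "Equal":
        return available_bytes == value
    raise ValueError(f"unknown operator: {operator}")


req = parse_requirement('<PhysicalMemory units="MB">4000</PhysicalMemory>')
print(req, satisfies(req, 8 * 1024 ** 3))
```

Keeping the units and the operator in attributes leaves the element content
a plain number, which is the structural point raised in question 2.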
