Missing exceptions and thread safety
Hi,

while scanning the wiki I found the following:

The Job object methods should throw the following exceptions:

- "JobAlreadySuspendedException" from the suspend method when the job is already suspended. The DRMAA implementation has to make sure that suspend is called only once for a job. It is not enough for the DRMAA implementation to rely on its own state; it should check the state automatically in order to avoid problems when the state is set outside of DRMAA. Should DRMAA deal with such cases?
- "JobNotSuspendedException" from the resume method (like above).
- "JobTerminatedException" when calling a method on a job that is already terminated. This applies to "suspend", "resume", "hold", "release", "terminate" and "waitStarted".

Obvious synchronization problems:

- Accessing an already "deleted" JobTemplate: the same as for a destroyed Session should apply (InvalidJobTemplateException).
- Accessing a job template while "deleting" (running a job or accessing otherwise): the same as for a destroyed Session should apply (InvalidJobTemplateException).
- Write access to job templates must be synchronized by the DRMAA implementation.
- Is there a need to expose the invalid state of a JobTemplate (that is, when a JobSession has been closed) as an accessible field, or should every problem be covered by the "InvalidJobTemplateException"?

Regards
Daniel
Quoting [Daniel Gruber] (Mar 01 2010):
Hi, while scanning the wiki I found the following: The Job object methods should throw the following exceptions: - "JobAlreadySuspendedException" from the suspend method when the job is already suspended. The DRMAA implementation has to make sure that suspend is called only once for a job. It is not enough for the DRMAA implementation to rely on its own state; it should check the state automatically in order to avoid problems when the state is set outside of DRMAA. Should DRMAA deal with such cases?
*Can* DRMAA deal with such cases? These are two operations which are usually not atomic (1: check the state, 2: suspend) - so how can a DRMAA client-side library ensure that the remote state does not change between these two calls, e.g. due to a 3rd-party API call?

I guess it's OK to throw when the backend replies with that error (job already suspended) - but requiring the DRMAA implementation to ensure atomicity is most likely futile.

my $0.02, Andre.
-- Nothing is ever easy.
On 03/01/10 11:17, Andre Merzky wrote:
*Can* DRMAA deal with such cases? These are two operations which are usually not atomic (1: check the state, 2: suspend) - so how can a DRMAA client-side library ensure that the remote state does not change between these two calls, e.g. due to a 3rd-party API call?
I guess it's ok to throw when the backend replies with that error (job already suspended) - but requiring the DRMAA implementation to ensure atomicity is most likely futile.
my $0.02, Andre.
You're right - atomicity seems not to be possible.

Another important thing to know is whether all DRMs throw such an exception, or whether some silently ignore the second request and simply report again that the job is suspended (i.e. suspend is idempotent). Do we then have a problem with the spec saying that there is an exception when some implementations will never throw it? If so, we should make the exception optional.

Cheers
Daniel
The atomicity has to be managed at the DRM layer. The suspend call has to operate like a test-and-set operation. If the suspend operation doesn't report whether the job was already suspended, then the DRMAA implementation can't report it via an exception.

There is, however, no need to make the exception explicitly optional. Exceptions are by definition optional.

Daniel

On 03/01/10 02:38, Daniel Gruber wrote:
You're right - atomicity seems not to be possible.
Another important thing to know is whether all DRMs throw such an exception, or whether some silently ignore the second request and simply report again that the job is suspended (i.e. suspend is idempotent). Do we then have a problem with the spec saying that there is an exception when some implementations will never throw it? If so, we should make the exception optional.
Cheers
Daniel
-- drmaa-wg mailing list drmaa-wg@ogf.org http://www.ogf.org/mailman/listinfo/drmaa-wg
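As a rough illustration of the test-and-set point above, here is a minimal Java sketch. `FakeDrm`, `DrmaaJob` and the two-state `JobState` are hypothetical stand-ins and not part of any DRMAA binding; only the exception name is taken from this thread. The DRM performs the check and the transition in one atomic step, and the library merely translates the DRM's answer into an exception.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified job states (not the DRMAA state model).
enum JobState { RUNNING, SUSPENDED }

// Exception name as proposed in this thread; the class body is an assumption.
class JobAlreadySuspendedException extends Exception {
    JobAlreadySuspendedException(String message) { super(message); }
}

// Hypothetical stand-in for the DRM system, which owns the job state.
class FakeDrm {
    private final AtomicReference<JobState> state =
            new AtomicReference<>(JobState.RUNNING);

    // Test-and-set: check and transition happen as one atomic step.
    // Returns false if the job was not RUNNING, i.e. already SUSPENDED.
    boolean suspend() {
        return state.compareAndSet(JobState.RUNNING, JobState.SUSPENDED);
    }

    JobState query() { return state.get(); }
}

// Hypothetical stand-in for the DRMAA library: it keeps no job state of its
// own and only translates the DRM's answer into an exception.
class DrmaaJob {
    private final FakeDrm drm;

    DrmaaJob(FakeDrm drm) { this.drm = drm; }

    void suspend() throws JobAlreadySuspendedException {
        if (!drm.suspend()) {
            throw new JobAlreadySuspendedException("job already suspended");
        }
    }
}

class SuspendDemo {
    public static void main(String[] args) {
        DrmaaJob job = new DrmaaJob(new FakeDrm());
        try {
            job.suspend(); // accepted by the DRM
            job.suspend(); // rejected by the DRM, surfaces as an exception
        } catch (JobAlreadySuspendedException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

If a DRM treats a repeated suspend as a silent no-op, `FakeDrm.suspend()` would simply return true both times and the wrapper would have nothing to throw on.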
Hi Dan, Quoting [Daniel Templeton] (Mar 01 2010):
There is, however, no need to make the exception explicitly optional. Exceptions are by definition optional.
This may be true in this case, but in general I expect exceptions to be guaranteed on specific circumstances. For example, I would expect that a suspend() on an invalid jobid will *always* cause an exception (mandatory), not only sometimes (optional). Best, Andre. -- Nothing is ever easy.
Hi all,

1. Can we merge JobAlreadySuspendedException, JobNotSuspendedException and JobTerminatedException into a single exception, something like CantApplyToCurrentStateException (the OGSA-BES approach), and state that the error message should carry the current job state?
2. Guaranteeing atomicity (with respect to operations that come from outside) in DRMAA is almost impossible for the DRMSs I know, as there is usually no "lock on job" operation available in the public API.

Cheers,

On 1 March 2010 15:27, Andre Merzky <andre@merzky.net> wrote:
Hi Dan,
This may be true in this case, but in general I expect exceptions to be guaranteed on specific circumstances. For example, I would expect that a suspend() on an invalid jobid will *always* cause an exception (mandatory), not only sometimes (optional).
Best, Andre.
-- Mariusz
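To make the merged-exception idea concrete, here is a minimal Java sketch. The exception name follows the suggestion above; `JobState`, `JobControl.checkResume` and the extra `getCurrentState()` accessor are assumptions of this sketch (the proposal itself only asks for the state in the error message).

```java
// Hypothetical, simplified job states (not the DRMAA state model).
enum JobState { RUNNING, SUSPENDED, TERMINATED }

// Single merged exception; carrying the state as a field in addition to the
// message is an assumption of this sketch.
class CantApplyToCurrentStateException extends Exception {
    private final JobState currentState;

    CantApplyToCurrentStateException(String operation, JobState currentState) {
        super("cannot " + operation + " a job in state " + currentState);
        this.currentState = currentState;
    }

    JobState getCurrentState() { return currentState; }
}

// Hypothetical helper: decides from the state reported by the DRM whether
// resume is applicable.
class JobControl {
    static void checkResume(JobState reported)
            throws CantApplyToCurrentStateException {
        if (reported != JobState.SUSPENDED) {
            throw new CantApplyToCurrentStateException("resume", reported);
        }
    }
}

class MergedExceptionDemo {
    public static void main(String[] args) {
        try {
            JobControl.checkResume(JobState.TERMINATED);
        } catch (CantApplyToCurrentStateException e) {
            System.out.println(e.getMessage()
                    + " (state=" + e.getCurrentState() + ")");
        }
    }
}
```

With the state available on the exception, a caller can still tell a terminated job from a merely running one even though only one exception type exists.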
Sounds reasonable to me. At a minimum, all of these very specific state exceptions should inherit from a more general exception in languages that support it.

Daniel

On 03/01/10 22:24, Mariusz Mamoński wrote:
Hi all,
1. Can we merge JobAlreadySuspendedException, JobNotSuspendedException and JobTerminatedException into a single exception, something like CantApplyToCurrentStateException (the OGSA-BES approach), and state that the error message should carry the current job state?
2. Guaranteeing atomicity (with respect to operations that come from outside) in DRMAA is almost impossible for the DRMSs I know, as there is usually no "lock on job" operation available in the public API.
Cheers,
The only thing is that we lose information in languages that do not support exception inheritance. For example, when CantApplyToCurrentStateException is thrown while resuming, it is unclear whether that is because the job was terminated or because it was not suspended. Maybe we should make the more general exceptions part of the language bindings?

Daniel

On 03/02/10 15:19, Daniel Templeton wrote:
Sounds reasonable to me. At a minimum, all of these very specific state exceptions should inherit from a more general exception in languages that support it.
Daniel
Sounds reasonable to me. At a minimum, all of these very specific state exceptions should inherit from a more general exception in languages that support it.
A mandatory exception hierarchy was declined a long time ago; the reasoning even became part of the DRMAA IDL 1.0 spec:

"Language bindings MAY decide to introduce a hierarchical ordering of the DRMAA exceptions through class derivation. In this case it MAY also happen that new exceptions are introduced for behavior aggregation. In this case, those exceptions SHALL be marked as abstract, to prevent them from being thrown."
1. Can we merge JobAlreadySuspendedException, JobNotSuspendedException and JobTerminatedException into a single exception, something like CantApplyToCurrentStateException (the OGSA-BES approach), and state that the error message should carry the current job state?
I might have missed something, but where did you get these exceptions from?

Another point is that your proposal is already reality. The decision was made at the F2F meeting in July 2009:

"The former HoldInconsistentStateException, ReleaseInconsistentStateException, ResumeInconsistentStateException, and SuspendInconsistentStateException from DRMAA v1.0 are now expressed as single InconsistentStateException with different meaning per function"

I would like to ask everybody to reason ONLY about the DRMAAv2 spec in the wiki. Every other document is (very likely) outdated.

Thanks and best regards,
Peter.
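For a language binding that chooses to use the optional hierarchy described in the IDL quote, a minimal Java sketch could look like the following. `JobStateException` and `HierarchyDemo` are hypothetical names; the two concrete exception names are the ones proposed earlier in this thread and are not taken from the DRMAAv2 spec.

```java
// Abstract aggregate: it can be caught, but it cannot be instantiated and
// therefore is never thrown itself.
abstract class JobStateException extends Exception {
    protected JobStateException(String message) { super(message); }
}

class JobNotSuspendedException extends JobStateException {
    JobNotSuspendedException() { super("job is not suspended"); }
}

class JobTerminatedException extends JobStateException {
    JobTerminatedException() { super("job is already terminated"); }
}

class HierarchyDemo {
    // Hypothetical resume() that always fails; the flag stands in for the
    // DRM's answer about the job.
    static void resume(boolean terminated) throws JobStateException {
        if (terminated) {
            throw new JobTerminatedException();
        }
        throw new JobNotSuspendedException();
    }

    public static void main(String[] args) {
        try {
            resume(true);
        } catch (JobStateException e) {
            // One handler for every state-related failure; the concrete
            // class still says which one occurred.
            System.out.println(e.getClass().getSimpleName()
                    + ": " + e.getMessage());
        }
    }
}
```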
GFD.133 has a good statement in the description of the control() function:

"This routine SHALL return once the action has been acknowledged by the DRM system, but does not necessarily wait until the action has been completed."

This underlines Dan's argument: the point of synchronization, or atomicity, is the DRMS itself. This was no problem in DRMAAv1, since we carefully avoided demanding any kind of state saving in the library. That changed with the new persistency features. We discussed possible new race conditions in Hamburg, but couldn't find anything unsolvable. So far the new concept demands only the storage of identifiers, for sessions (if supported by the DRM) and for jobs. The state must still be retrieved from the DRM on every use.
The Job object methods should throw the following exceptions: - "JobAlreadySuspendedException" from the suspend method when the job is already suspended. The DRMAA implementation has to make sure that suspend is called only once for a job. It is not enough for the DRMAA implementation to rely on its own state; it should check the state automatically in order to avoid problems when the state is set outside of DRMAA. Should DRMAA deal with such cases?
Can you provide a link for this text? I cannot find it. It also makes no real sense - job state should NEVER EVER be persisted in the DRMAA library itself.
*Can* DRMAA deal with such cases? These are two operations which are usually not atomic (1: check the state, 2: suspend) - so how can a DRMAA client-side library ensure that the remote state does not change between these two calls, e.g. due to a 3rd-party API call?
It cannot, and that is no problem. "Test-and-set" semantics are not expected from the library here. The DRMS should tell the library that suspend() is not allowed in the current state. In other words, we expect the job control functions of the DRM system to act (more or less) like their DRMAA equivalents. So far, this has worked out.

Best,
Peter.
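The "identifiers only" rule can be sketched in Java as follows. `FakeDrm`, `JobHandle` and the two-state `JobState` are hypothetical stand-ins, not DRMAA types: the handle stores nothing but the job identifier and asks the DRM for the state on every call, so a change made outside DRMAA is visible immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified job states (not the DRMAA state model).
enum JobState { RUNNING, SUSPENDED }

// Hypothetical stand-in for the DRM system, the single owner of job state.
class FakeDrm {
    private final Map<String, JobState> jobs = new ConcurrentHashMap<>();

    void submit(String jobId) { jobs.put(jobId, JobState.RUNNING); }

    JobState query(String jobId) { return jobs.get(jobId); }

    // Models a state change made outside DRMAA, e.g. by a command-line tool.
    void suspendExternally(String jobId) {
        jobs.replace(jobId, JobState.SUSPENDED);
    }
}

// Hypothetical library-side handle: it persists only the identifier.
class JobHandle {
    private final String jobId;
    private final FakeDrm drm;

    JobHandle(String jobId, FakeDrm drm) {
        this.jobId = jobId;
        this.drm = drm;
    }

    // No cached copy: the state is fetched from the DRM on every use.
    JobState getState() { return drm.query(jobId); }
}

class StatelessHandleDemo {
    public static void main(String[] args) {
        FakeDrm drm = new FakeDrm();
        drm.submit("job-1");
        JobHandle handle = new JobHandle("job-1", drm);
        System.out.println(handle.getState());
        drm.suspendExternally("job-1");
        System.out.println(handle.getState());
    }
}
```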
participants (5)
- Andre Merzky
- Daniel Gruber
- Daniel Templeton
- Mariusz Mamoński
- Peter Tröger