Conference call - Mar 31st - 16:00 UTC
Dear all,

the bi-weekly DRMAA call is scheduled for Mar 31st, 2009. The meeting starts at 16:00 UTC. For time zone conversion, check this: http://tinyurl.com/cf27fb

The phone conference line is sponsored by Sun:
Phone number (toll-free from US): +001-866-545-5227
Access code: 5988285

Preliminary meeting agenda:
1. Meeting secretary for this meeting?
2. Job sub-state data type - final voting: http://www.ogf.org/pipermail/drmaa-wg/2009-February/001007.html
3. Partial time stamp replacement: http://www.ogf.org/pipermail/drmaa-wg/2009-February/001008.html
4. TERMINATED vs. FAILED state discussion: http://www.ogf.org/pipermail/drmaa-wg/2009-March/001012.html
5. Discussion Kick-Off: Remodeling the JobInfo interface

Please prepare for the open discussion points. drmaa_v2_draft2.pdf is attached.

/Peter.
Participants in the call: Hrabri, Daniel T., Peter
Meeting minutes from Feb 17th were accepted.
1. Meeting secretary for this meeting?
Peter.
2. Job sub-state data type - final voting: http://www.ogf.org/pipermail/drmaa-wg/2009-February/001007.html
Decision for object / void pointer approach. The returned data structure can be defined by the language binding or the implementation, as long as the jobStatus() signature contains a generic pointer type.
3. Partial time stamp replacement: http://www.ogf.org/pipermail/drmaa-wg/2009-February/001008.html
Decision for replacement by RFC822 strings.
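To illustrate what a time stamp looks like once it is carried as an RFC822-style string, here is a minimal Python sketch. It is not part of the DRMAA draft; it only uses the standard library's email.utils, which emits the RFC 2822/5322 form (four-digit year) that superseded RFC 822. The example date is simply the date of this call.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Example value only: the start time of this call, as an aware UTC datetime.
call_start = datetime(2009, 3, 31, 16, 0, 0, tzinfo=timezone.utc)

# Serialize to an RFC 2822-style date string (successor format of RFC 822).
stamp = format_datetime(call_start)
print(stamp)

# Parse the string back into an aware datetime; the round trip is lossless
# at one-second resolution.
parsed = parsedate_to_datetime(stamp)
print(parsed == call_start)
```

A string of this form carries its own UTC offset, so a consumer does not need out-of-band knowledge of the time zone in which the DRM system recorded the value.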
4. TERMINATED vs. FAILED state discussion: http://www.ogf.org/pipermail/drmaa-wg/2009-March/001012.html
Option 2 from the original mail is now highly preferred. TERMINATED state should express that an external entity (e.g. user or DRM system) stopped the job before it finished. For POSIX-aligned systems, this could be formulated as reception of a signal by "the job". In contrast, FAILED state now expresses that the application stopped on its own before finishing. For POSIX-aligned systems, this could be formulated as reception of a signal "by the job's application process".
We ask for comments from PBS and LSF experts (FedStage ?!?). Do these systems provide enough error information to distinguish between these two states? For SGE and Condor, Dan and Peter already agreed.
This decision also has some implications for the JobInfo structure, the job state flow and the error conditions for job templates.
5. Discussion Kick-Off: Remodeling the JobInfo interface
New attribute for job state at the time of querying, since "terminatingSignal" now only makes sense in the FAILED state.
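As a rough sketch of what this could mean for a language binding, the following Python fragment models a job info record that carries the job state at query time and only accepts a terminating signal in the FAILED state. All names here (JobState, JobInfo, job_id, terminating_signal, and the RUNNING and DONE members) are hypothetical illustrations, not taken from the DRMAA draft.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class JobState(Enum):
    # RUNNING and DONE are placeholder states for this sketch;
    # TERMINATED and FAILED are the two states discussed in item 4.
    RUNNING = "RUNNING"
    DONE = "DONE"
    TERMINATED = "TERMINATED"  # stopped by an external entity (user, DRM system)
    FAILED = "FAILED"          # application stopped on its own before finishing


@dataclass(frozen=True)
class JobInfo:
    job_id: str
    state: JobState                           # job state at the time of querying
    terminating_signal: Optional[str] = None  # only meaningful for FAILED

    def __post_init__(self) -> None:
        # Reject a terminating signal for any state other than FAILED.
        if self.terminating_signal is not None and self.state is not JobState.FAILED:
            raise ValueError("terminating_signal is only valid in the FAILED state")


failed = JobInfo("42", JobState.FAILED, "SIGSEGV")
stopped = JobInfo("43", JobState.TERMINATED)
print(failed.state.value, failed.terminating_signal)
print(stopped.state.value, stopped.terminating_signal)
```

Keeping the state inside the record means a caller can tell whether the signal field applies without a second query.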
Did you also want to add the discussion at the end about extending drmaa_wait()?

Daniel
What I meant was to record that we said (roughly):
o We expressed an intention to extend the capabilities of the drmaa_wait() call to allow waiting for arbitrary job states
o In that case, the job state information in the job info object becomes a representation of the job at the time of the state change

Daniel
participants (2): Daniel Templeton, Peter Tröger