Conference call - Jun 23rd - 19:00 UTC
Dear all,

the next DRMAA phone conference is scheduled for Jun 23rd, at 19:00 UTC.

The phone conference line is sponsored by Oracle. Please consult the following page for dial-in numbers from your country:

http://www.intercall.com/oracle/access_numbers.htm

The conference code is 6513037. The security code is DRMAA (37622).

Preliminary meeting agenda:

1. Meeting secretary ?
2. Monitoring jobs not submitted by DRMAA - final decision
5. Cleaning up the spreadsheet: http://spreadsheets.google.com/ccc?key=rrAIK9utkSoDQXF8kasYCLQ

Due to urgent issues of national interest, I am not able to participate in the call ;-) I would kindly ask either Dan or Mariusz to lead the discussion.

Best regards,
Peter.

--
drmaa-wg mailing list
drmaa-wg@ogf.org
http://www.ogf.org/mailman/listinfo/drmaa-wg
As promised on the call, here's the summary of the monitoring discussion. Please add your comments and corrections.

o The monitoring session should provide parallel functionality to the job session's job monitoring for monitoring all jobs in the cluster.
o The job monitoring capability for the monitoring session should support a query filter.
o The job monitoring capability for the monitoring session will return a set of job objects that contain a set of core information and a dictionary of optional information that will be "standardized" through the drmaa.org site.
o The job monitoring capability should be able to return jobs from any/all users.
o The DRM or DRMAA implementation is at liberty to restrict the set of returned jobs based on site or system policies, such as security settings.
o The job monitoring model will assume that the DRM system has three job information states: running, buffered, purged. Only information for jobs that are still running or are still held in the buffer of finished job information will be reported. Jobs that have been purged out to accounting will be ignored.
o Exit status will be an optional component of the job object. It will only be available for jobs still held in the DRM system's buffer.

Does that sound accurate and complete? I'll leave it to Peter or Mariusz to generate the IDL. :)

One thing that we didn't discuss to completion was scalability. With the job session monitoring, the set of jobs is likely to be significantly smaller than the complete set of jobs, especially when including the DRM's finished job buffer. In both cases, however, there should be some facility to deal with massive job counts. Filtering is certainly one option. An application could query only a limited set at a time, but that places a fairly large burden on the app developers. The ivory tower solution would be some sort of cursor, but I don't think we really want to go there. Thoughts?

Daniel

On 6/23/10 6:00 AM, Peter Tröger wrote:
> the next DRMAA phone conference is scheduled for Jun 23rd, at 19:00 UTC.
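The "query only a limited set at a time" option from the scalability paragraph could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `fetch_page`, `offset`, `limit`, and the in-memory `ALL_JOB_IDS` list are hypothetical stand-ins and not part of any DRMAA binding or proposal in this thread.

```python
from typing import Callable, Iterator, List

# Hypothetical stand-in for the DRM system's job table (assumption).
ALL_JOB_IDS: List[str] = [f"job-{n}" for n in range(10)]


def fetch_page(offset: int, limit: int) -> List[str]:
    """Hypothetical query: return at most `limit` job ids starting at `offset`."""
    return ALL_JOB_IDS[offset:offset + limit]


def iter_jobs(fetch: Callable[[int, int], List[str]], page_size: int) -> Iterator[str]:
    """Yield every job id while holding only one page in memory at a time."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch(offset, page_size)
        if not page:
            return  # an empty page means the listing is exhausted
        yield from page
        offset += len(page)


# Usage: the caller iterates as if over one sequence; paging stays hidden.
for job_id in iter_jobs(fetch_page, page_size=4):
    print(job_id)
```

Wrapping the paging loop in an iterator moves the burden from the application developer into a helper. Note that offset-based paging can skip or repeat entries when the job set changes between requests, which is the problem a real cursor would have to solve.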
Oh wow, I missed this one.

Seems like we have a dedicated job monitoring part in the MonitoringSession now - which is good. I wonder how the current JobSession design maps to this. Good topic for the call ....

Best,
Peter.

On 23.06.2010 at 22:20, Daniel Templeton wrote:
> As promised on the call, here's the summary of the monitoring discussion. Please add your comments and corrections.
Here is the latest update status for the Wiki:

> As promised on the call, here's the summary of the monitoring discussion. Please add your comments and corrections.
>
> o The monitoring session should provide parallel functionality to the job session's job monitoring for monitoring all jobs in the cluster.

sequence<Job> MonitoringSession::getAllJobs(JobInfo filter);

> o The job monitoring capability for the monitoring session should support a query filter

See above. Will be discussed on the call.

> o The job monitoring capability for the monitoring session will return a set of job objects that contain a set of core information and a dictionary of optional information that will be "standardized" through the drmaa.org site.

This is modeled similarly to the JobTemplate struct:

struct JobInfo {
    ...
    readonly attribute Dictionary drmsSpecific;
}
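To make the filter idea concrete, here is a minimal Python sketch of a job information record with a dictionary of implementation-specific entries, plus a filter that treats unset fields as wildcards. The core fields `job_id` and `owner` and the wildcard semantics are assumptions for illustration; the thread leaves the actual field list elided and the filter behaviour open for the call. The sketch also returns `JobInfo` values directly rather than `Job` objects, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)  # frozen mirrors the "readonly attribute" idea
class JobInfo:
    # Hypothetical core fields; the real list is not given in the thread.
    job_id: Optional[str] = None
    owner: Optional[str] = None
    # Optional, implementation-specific entries, mirroring drmsSpecific.
    drms_specific: Dict[str, str] = field(default_factory=dict)


def matches(job: JobInfo, flt: JobInfo) -> bool:
    """A job matches when every field set in the filter equals the job's value."""
    if flt.job_id is not None and job.job_id != flt.job_id:
        return False
    if flt.owner is not None and job.owner != flt.owner:
        return False
    return all(job.drms_specific.get(k) == v for k, v in flt.drms_specific.items())


def get_all_jobs(jobs: List[JobInfo], flt: JobInfo) -> List[JobInfo]:
    """Return the jobs selected by the filter; an empty filter selects all."""
    return [job for job in jobs if matches(job, flt)]
```

For example, `get_all_jobs(jobs, JobInfo(owner="alice"))` would select only the jobs owned by `alice`, while `get_all_jobs(jobs, JobInfo())` returns every job.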
> o The job monitoring capability should be able to return jobs from any/all users.
> o The DRM or DRMAA implementation is at liberty to restrict the set of returned jobs based on site or system policies, such as security settings.

The description of MonitoringSession was updated accordingly.

> o The job monitoring model will assume that the DRM system has three job information states: running, buffered, purged. Only information for jobs that are still running or are still held in the buffer of finished job information will be reported. Jobs that have been purged out to accounting will be ignored.
> o Exit status will be an optional component of the job object. It will only be available for jobs still held in the DRM system's buffer.

The description of JobInfo was updated accordingly.

Best,
Peter.
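The three job information states from the summary (running, buffered, purged) and the rule for exit status can be sketched as follows. This is a hedged Python illustration, not the wiki's IDL: `InfoState`, `JobRecord`, and `reportable` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class InfoState(Enum):
    """Job information states as described in the summary."""
    RUNNING = "running"
    BUFFERED = "buffered"  # finished, still held in the DRM's buffer
    PURGED = "purged"      # moved out to accounting


@dataclass
class JobRecord:
    # Hypothetical record layout (assumption, not from the thread).
    job_id: str
    info_state: InfoState
    exit_status: Optional[int] = None


def reportable(records: List[JobRecord]) -> List[JobRecord]:
    """Report running and buffered jobs only; purged jobs are ignored.

    Exit status is passed through only for buffered jobs.
    """
    result = []
    for rec in records:
        if rec.info_state is InfoState.PURGED:
            continue
        status = rec.exit_status if rec.info_state is InfoState.BUFFERED else None
        result.append(JobRecord(rec.job_id, rec.info_state, status))
    return result
```

A monitoring call built this way never exposes accounting data, which keeps the model independent of how long a given DRM system retains finished jobs.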
participants (2): Daniel Templeton, Peter Tröger