2011/4/27 Peter Tröger <peter@troeger.eu>:
Participants: Daniel, Mariusz, Roger, Andre (SAGA), Peter
Quick check of last week's decisions, all agreed
Line 530 - checkpointability attribute in job template
- Grid Engine expresses checkpointability as string reference to checkpointing environment
- would be boolean flag in Condor, indicating standard universe
- From SAGA perspective, no real use case
- Decision: Dropped
Line 578 - optional eMail attribute - accepted by group
Line 609 - Staging support
- reformulate to allow submission and execution host being the same machine
- denote support for 'hierarchical copying' as implementation-specific
- reformulate to state that with parallel jobs, copy must target at least the master node, and may also copy the files to other hosts
- clarify relationship between job working directory and relative paths
Line 707 - Reaction on reaching soft / hard limits
- Grid Engine: Signal depends on particular limit type
- Agreement that crossing a hard limit should lead to FAILED state of the DRMAA job
- Agreement to remove softResourceLimits completely, since DRMAA cannot promise any kind of common semantics, and since the attribute is not important enough to add it as opaque concept (as with slots)
I promised to do some research, so: we are mixing different resources whose limits have different purposes and thus different associated policies:

  enum ResourceLimitType { CORE_FILE_SIZE, CPU_TIME, DATA_SEG_SIZE, FILE_SIZE, OPEN_FILES, STACK_SIZE, VIRTUAL_MEMORY, WALLCLOCK_TIME };

Let's take the first one, CORE_FILE_SIZE. The Grid Engine man page queue_conf says:

  "The remaining parameters in the queue configuration template specify per job soft and hard resource limits as implemented by the setrlimit(2) ..."

and man setrlimit says:

  "RLIMIT_CORE Maximum size of core file. When 0 no core dump files are created. When non-zero, larger dumps are truncated to this size."

The difference between soft and hard limit is defined as follows:

  "The hard limit acts as a ceiling for the soft limit: an unprivileged process may only set its soft limit to a value in the range from 0 up to the hard limit, and (irreversibly) lower its hard limit."

Exceeding other limits such as OPEN_FILES would just result in errors on calls like open(), which the application can handle and still exit with 0. So the agreement that "crossing a hard limit should lead to FAILED" should be valid only for some of the limits, e.g. WALLCLOCK_TIME and CPU_TIME.
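To make the OPEN_FILES case concrete, here is a minimal Python sketch (my illustration, not taken from Grid Engine or the DRMAA draft). It assumes a Unix system and uses the standard `resource` module, which wraps getrlimit(2) / setrlimit(2). The function name `demo_open_files_limit` and the limit value 16 are made up for the example. The process lowers its own soft limit, runs into it, handles the error from open() and carries on normally, so nothing here would have to end in a failed job.

```python
import errno
import os
import resource


def demo_open_files_limit(soft_limit=16):
    """Lower the soft RLIMIT_NOFILE, run into it, and recover.

    Returns the errno of the open() call that was refused.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # An unprivileged process may set its soft limit anywhere in [0, hard].
    if hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))

    handles = []
    refused_with = None
    try:
        # Keep opening descriptors until the kernel refuses one.
        while True:
            handles.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # The application sees an ordinary error and can handle it.
        refused_with = exc.errno
    finally:
        for fd in handles:
            os.close(fd)
        # Raising the soft limit back up to its old value is allowed,
        # because the hard limit was never lowered.
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return refused_with


if __name__ == "__main__":
    code = demo_open_files_limit()
    print("open() refused with:", errno.errorcode.get(code, code))
```

On a typical Linux system the refused call should report EMFILE (per-process descriptor limit reached). The point for the discussion is that the process survives and can still exit with 0, unlike a CPU_TIME or WALLCLOCK_TIME overrun.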
Section 9.2.4 / 9.2.7:
- reservedSlots should be mandatory information, reservedMachines should be optional information
Agreement to specify possible error codes per method after some implementations were done
Line 751 - Reservation without time frame
- Makes no sense, since it might be way too short for the user -> raise invalid argument exception on UNSET/UNSET/UNSET
- add rationale why startTime=UNSET is not equal to startTime=NOW
- handy concept supported by some, but not all DRM systems
- Emulation in the DRMAA library is not a valid option, since this would lead to situations where the reservation already arrives 'too late' in the DRM system
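The UNSET/UNSET/UNSET check could look roughly like the following Python sketch. This is only an illustration under assumptions: the minutes do not name the three attributes, so `start_time`, `end_time` and `duration` are my guess at what is meant, and `ReservationTemplate`, `InvalidArgumentException` and `validate_time_frame` are hypothetical names, not an existing DRMAA binding. UNSET is modelled as `None`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Stand-in for the DRMAA UNSET marker (assumption for this sketch).
UNSET = None


class InvalidArgumentException(ValueError):
    """Hypothetical counterpart of the invalid argument exception."""


@dataclass
class ReservationTemplate:
    # Assumed attribute names; all three default to UNSET.
    start_time: Optional[datetime] = UNSET
    end_time: Optional[datetime] = UNSET
    duration: Optional[timedelta] = UNSET


def validate_time_frame(template: ReservationTemplate) -> None:
    """Reject a reservation request that carries no time frame at all."""
    if (
        template.start_time is UNSET
        and template.end_time is UNSET
        and template.duration is UNSET
    ):
        raise InvalidArgumentException(
            "reservation needs at least one of start time, end time, duration"
        )
    # Note: start_time stays UNSET here on purpose. It is not replaced
    # by "now", since the library must not emulate that on its own.


if __name__ == "__main__":
    validate_time_frame(ReservationTemplate(duration=timedelta(hours=2)))
    try:
        validate_time_frame(ReservationTemplate())
    except InvalidArgumentException as exc:
        print("rejected:", exc)
```

The check only covers the all-UNSET case from the minutes. Which partial combinations a given DRM system accepts is left to the implementation.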
Best regards, Peter.
On 27.04.11 00:57, Peter Tröger wrote:
Dear all,
the next DRMAA conf call is scheduled for Apr 27th, 19:00 UTC. We meet on Skype, please find me under my user name "potsdam_pit".
Preliminary meeting agenda:
1. Meeting secretary for this meeting?
2. Solving remaining issues in DRMAAv2 Draft 3 (see attachment, starting from page 18)
Sorry, I didn't have the time to prepare a new draft.
Best regards, Peter.
-- drmaa-wg mailing list drmaa-wg@ogf.org http://www.ogf.org/mailman/listinfo/drmaa-wg
-- Mariusz