Piotr Domagalski wrote:
On Tue, Jan 20, 2009 at 7:50 PM, Daniel Gruber <D.Gruber@sun.com> wrote:
Undetermined - is it a valid job state? -> Yes! Undetermined = Error -> Condor: it is permanent some time -> Need to clarify if this means "don't try again" or "try it again"
But does that mean that the undetermined state will go away and the function will return an error?
It probably means that UNDETERMINED becomes one state, and an exception becomes the other.
Distinction between state "failed" and "terminated" -> "Failed" := user can fix it (through changes on job template for example) -> "Terminated" := error the user can't fix
I thought of Terminated as the state the job gets into if it was drmaa_control()'ed or possibly deleted locally in the DRMS (by admin or user), but the latter may be optional functionality.
Exactly. A failed job needs to be fixed before being resubmitted. A terminated job could succeed as-is if resubmitted.
Why should we support extensible state? -> basically for reporting -> problem: difficult to implement in C
It might be modelled similarly to BES, so that there are standard states that one can additionally inherit from to get more detailed states. In C it might be done in the following way (a kind of OOP in C):
typedef struct { int standard_state; } drmaa_state_t;
That would be standardised. But the implementation might want to extend it and then it might actually return:
typedef struct { drmaa_state_t super; int my_own_specific_state; } drmaa_sge_state_t;
If the "client" wants to use only standard states, it uses a pointer to the first structure and thus doesn't see the detailed state (e.g. general hold state + user/admin hold implementation specific). But when he knows he's using a specific DRMAA implementation it may cast the general structure to the impl-specific one. Kind of a hack, but AFAIR it is C standards compliant. Pointers to these two structures should be interchangeable, because they point to the same place in memory.
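To make the struct-embedding idea concrete, here is a minimal sketch of how a client might use it. The two typedefs are the ones proposed above; the state constants (DRMAA_STATE_HOLD, SGE_HOLD_ADMIN, ...) and the helper functions generic_state() and sge_substate() are made up for illustration and are not part of any DRMAA binding. The downcast relies on the C rule that a pointer to a struct, suitably converted, points to its first member and vice versa, so it is only valid when the object really is a drmaa_sge_state_t.

```c
#include <stddef.h>

/* Standardised part, as proposed in the thread. */
typedef struct {
    int standard_state;
} drmaa_state_t;

/* Implementation-specific extension. The standard struct must be the
 * FIRST member so that both pointers refer to the same address. */
typedef struct {
    drmaa_state_t super;
    int my_own_specific_state;
} drmaa_sge_state_t;

/* Hypothetical state codes, invented for this sketch. */
enum { DRMAA_STATE_RUNNING = 1, DRMAA_STATE_HOLD = 2 };
enum { SGE_HOLD_USER = 1, SGE_HOLD_ADMIN = 2 };

/* A portable client only ever looks at the standard state. */
int generic_state(const drmaa_state_t *s)
{
    return s->standard_state;
}

/* A client that KNOWS it is talking to the SGE implementation may
 * cast back to the extended struct to read the detailed state. */
int sge_substate(const drmaa_state_t *s)
{
    const drmaa_sge_state_t *sge = (const drmaa_sge_state_t *)s;
    return sge->my_own_specific_state;
}
```

A portable client would call only generic_state(); calling sge_substate() on a state object that came from a different implementation would be undefined behaviour, which is the main risk of this approach.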
I was thinking about something more along the lines of:
typedef struct { int state; void *substate; } drmaa_state_t;
There is then no confusion for the caller about what he gets back. He only needs to check if the substate is non-null *if* he knows enough about the DRMAA implementation to be able to understand it. It also leaves the substate implementation open for the implementation to decide. Maybe an int just isn't enough data. Or maybe the substate is a string message.
Daniel
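For comparison, a minimal sketch of the state-plus-opaque-substate variant, assuming an implementation that chooses to store a string message in substate. The constant DRMAA_STATE_HOLD and the helper describe_substate() are hypothetical names used only for this example; a real implementation would have to document what substate points to and who owns the memory.

```c
#include <stddef.h>

/* State with an opaque, implementation-defined substate. */
typedef struct {
    int state;
    void *substate;   /* NULL when there is no extra detail */
} drmaa_state_t;

/* Hypothetical state code, invented for this sketch. */
enum { DRMAA_STATE_HOLD = 2 };

/* Assumes an implementation whose substate is a NUL-terminated string.
 * Returns the caller-supplied fallback when no substate is present. */
const char *describe_substate(const drmaa_state_t *s, const char *fallback)
{
    if (s->substate == NULL) {
        return fallback;
    }
    return (const char *)s->substate;
}
```

A caller that does not know the implementation simply ignores substate and reads state; only a caller that knows the substate is a string would use a helper like describe_substate().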