On Tue, Jan 20, 2009 at 7:50 PM, Daniel Gruber <D.Gruber@sun.com> wrote:
Undetermined - is it a valid job state? -> Yes! Undetermined = Error -> Condor: it is sometimes permanent -> Need to clarify if this means "don't try again" or "try it again"
But does that mean that the undetermined state will go away and the function will return an error instead?
Distinction between state "failed" and "terminated" -> "Failed" := user can fix it (through changes on job template for example) -> "Terminated" := error the user can't fix
I thought of Terminated as the state the job gets into if drmaa_control() was called on it, or possibly if it was deleted locally in the DRMS (by an admin or user), but the latter may be optional functionality.
Why should we support extensible state? -> basically for reporting -> problem: difficult to implement in C
It might be modelled similarly to BES, so that there are standard states from which one can additionally inherit to get more detailed states. In C it might be done in the following way (a kind of OOP in C):

    typedef struct {
        int standard_state;
    } drmaa_state_t;

That would be standardised. But an implementation might want to extend it, and then it might actually return:

    typedef struct {
        drmaa_state_t super;
        int my_own_specific_state;
    } drmaa_sge_state_t;

If the "client" wants to use only standard states, it uses a pointer to the first structure and thus doesn't see the detailed state (e.g. a general hold state plus an implementation-specific user/admin hold). But when the client knows it is using a specific DRMAA implementation, it may cast the pointer to the general structure to a pointer to the impl-specific one. Kind of a hack, but AFAIR it is C standards compliant. Pointers to these two structures should be interchangeable, because the first member sits at the very start of the enclosing structure, so both point to the same place in memory.

--
Piotr Domagalski
FedStage Systems Ltd.
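
P.S. A minimal sketch of the struct-embedding idea above, in case it helps the discussion. Only drmaa_state_t, drmaa_sge_state_t and their members come from the proposal; the state constants (DRMAA_STATE_HOLD, SGE_HOLD_ADMIN, ...) and the helper functions (sge_job_state, is_on_hold, sge_hold_kind) are hypothetical names made up for illustration and are not part of any DRMAA binding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standard states (illustrative values only). */
enum { DRMAA_STATE_RUNNING = 1, DRMAA_STATE_HOLD = 2 };

/* Hypothetical SGE-specific refinements of the hold state. */
enum { SGE_HOLD_NONE = 0, SGE_HOLD_USER = 1, SGE_HOLD_ADMIN = 2 };

/* The standardised part: every implementation exposes at least this. */
typedef struct {
    int standard_state;
} drmaa_state_t;

/* Implementation-specific extension. The standard part must be the
 * FIRST member, otherwise the pointer conversion below is not valid. */
typedef struct {
    drmaa_state_t super;
    int my_own_specific_state;
} drmaa_sge_state_t;

/* Stand-in for a library call (assumption): fills the extended
 * structure and hands back only the generic view of it. */
static drmaa_state_t *sge_job_state(drmaa_sge_state_t *storage)
{
    storage->super.standard_state = DRMAA_STATE_HOLD;
    storage->my_own_specific_state = SGE_HOLD_ADMIN;
    return &storage->super;
}

/* Portable client code: looks at the standard state only. */
static int is_on_hold(const drmaa_state_t *state)
{
    return state->standard_state == DRMAA_STATE_HOLD;
}

/* Client code that knows the state came from the SGE implementation:
 * converts back to the extended type to read the detailed state. */
static int sge_hold_kind(const drmaa_state_t *state)
{
    /* The first member of a struct has offset 0, so both pointers
     * refer to the same address. */
    assert(offsetof(drmaa_sge_state_t, super) == 0);
    const drmaa_sge_state_t *sge = (const drmaa_sge_state_t *)state;
    return sge->my_own_specific_state;
}
```

The downcast in sge_hold_kind() is only safe when the object really is a drmaa_sge_state_t; a portable client that cannot be sure of that should stick to is_on_hold()-style access through the base structure.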