OK. So non-zero means the exit status is available. Good. We still have two meanings for zero, though. Zero means that the job exited abnormally, i.e. without exit status, and more info is available through drmaa_wifsignaled() and drmaa_wifaborted(). Zero also means that the job exited normally, but the exit status was lost somewhere along the way. Do we want to differentiate these two cases? In the former, one of the drmaa_wif*() functions will return true. In the latter, all drmaa_wif*() functions will return false. Is that differentiation enough? This appears to be the use case supported by the C binding example.
Daniel
Roger Brobst wrote:
In a previous e-mail, Daniel Templeton wrote:
I'm now working on the drmaa_wait() function and its helpers, and I've run into an inconsistency. In the language-independent and former C specs, the drmaa_wifexited() function is defined as returning non-zero if the job has ended normally and zero if the job has ended normally but has no exit status available. So far so good.
drmaa_wifexited should return non-zero only when an exit status can be obtained from drmaa_wexitstatus. An exit status can only be obtained if the job ended normally.
It then goes on to say that if drmaa_wifexited() returns non-zero (non-zero == normal exit), then more information is available from drmaa_wifsignaled() and drmaa_wifaborted(). Huh?
Therein is the problem. It should read something like: if drmaa_wifexited returns zero, then more information is available from drmaa_wifsignaled() or drmaa_wifaborted().
Nice catch.
Signaling and Aborting are not normal exit methods. Those are abnormal. However, according to the spec, there's no way to say that the job exited abnormally. The example in the C binding spec treats a return of zero from drmaa_wifexited() as meaning the job exited abnormally. That's also how I interpreted it in the Java language binding spec. What was the actual intention here?
Daniel
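
To make the differentiation proposed at the top of this thread concrete, here is a minimal sketch in C. It does not call the DRMAA library. The struct wait_flags, the enum job_outcome, and the function classify() are hypothetical names introduced only for illustration; the three fields stand in for the values that drmaa_wifexited(), drmaa_wifsignaled(), and drmaa_wifaborted() would report for a single job. The order in which the flags are checked is also an assumption, not something the spec text quoted here defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome categories for a finished job. */
typedef enum {
    JOB_EXITED,      /* normal exit, exit status available        */
    JOB_SIGNALED,    /* abnormal exit caused by a signal          */
    JOB_ABORTED,     /* abnormal exit, job was aborted            */
    JOB_STATUS_LOST  /* all flags zero: exit status not available */
} job_outcome;

/* Hypothetical container for the three flags discussed in the thread.
 * Each field mirrors what the corresponding drmaa_wif*() call reports. */
typedef struct {
    int exited;
    int signaled;
    int aborted;
} wait_flags;

/* Classify a job under the interpretation discussed above:
 * non-zero "exited" means an exit status can be obtained; otherwise
 * the signaled/aborted flags explain an abnormal exit, and if every
 * flag is zero the exit status was lost along the way. */
static job_outcome classify(const wait_flags *f)
{
    assert(f != NULL);
    if (f->exited)   return JOB_EXITED;
    if (f->aborted)  return JOB_ABORTED;
    if (f->signaled) return JOB_SIGNALED;
    return JOB_STATUS_LOST;
}
```

Under this reading, a caller would only ask for the exit status when classify() returns JOB_EXITED, and would treat JOB_STATUS_LOST as the "exited normally but no status" case.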