On 4/2/06 11:18, "Andre Merzky"
I think that diagram is wrong, isn't it? Well, here are my questions:
- if we submit a job, it's immediately Queued - is that right? Should it be pending before (e.g. as long as the queuing request travels the middleware layers)?
To me, Queued is the same as Pending. Pending is probably a better word for this. Can't remember where the Queued name came from, as LSF uses PEND.
- can the hold and suspend states be reached only from 'Running', or from elsewhere as well?
You can only go into a Hold state from Pending, I think, or directly into Hold on submission.
- What is the difference between 'Hold' and 'Suspend'?
A Hold state tells the scheduler/broker not to consider this job for scheduling/dispatch until the hold is explicitly released.
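To make the Hold semantics concrete, here is a minimal sketch in Python. It is not taken from any scheduler; the names `State`, `Job`, `hold`, `release` and `dispatch_candidates` are hypothetical, and the only rules it encodes are the ones stated above: Hold is entered from Pending, and a held job is skipped for dispatch until it is explicitly released.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    PENDING = auto()
    HOLD = auto()
    RUNNING = auto()


@dataclass
class Job:
    name: str
    # A job may also be submitted directly into HOLD by passing state=State.HOLD.
    state: State = State.PENDING


def hold(job: Job) -> None:
    """Move a pending job into Hold; other source states are rejected."""
    if job.state is not State.PENDING:
        raise ValueError(f"cannot hold {job.name} from {job.state.name}")
    job.state = State.HOLD


def release(job: Job) -> None:
    """Explicitly release a held job back to Pending."""
    if job.state is not State.HOLD:
        raise ValueError(f"{job.name} is not held")
    job.state = State.PENDING


def dispatch_candidates(jobs: list[Job]) -> list[Job]:
    """Jobs the scheduler may consider: held jobs are never included."""
    return [job for job in jobs if job.state is State.PENDING]
```

With two pending jobs, calling `hold` on one leaves only the other in `dispatch_candidates`; after `release` both are eligible again.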
- Are there signals defined (apart from KILL) which change the job state? I guess that is not as simple as saying SUSP does suspend - that state is probably defined by the scheduler, not by the OS...
Right ... this is implementation dependent on the mechanism used to suspend a job (might be a signal, might be some other mechanism). What is important is that there is an operation to initiate the state transition.
- What is the use case for distinguishing between UserHold and SystemHold, or between UserSuspend and SystemSuspend?
If I preempt workload, the system will put it into a SystemSuspend state that a user cannot switch it out of; otherwise a system may become oversubscribed due to the preempted and preempting jobs running at the same time. A UserSuspend can be entered and exited by the user, and is often used to hold processing to check progress, etc.

By the way ... I believe that the state diagram should at least be a subset of the BES state diagram ... we should adopt the same names.

-- Chris
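The UserSuspend/SystemSuspend distinction can be sketched as a permission check on the resume operation. This is only an illustration: `State`, `Actor`, `suspend` and `resume` are hypothetical names, entering suspend only from Running is an assumption (the thread does not settle it), and refusing a system resume of a UserSuspend is likewise an assumption made to keep the sketch symmetric.

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    USER_SUSPEND = auto()
    SYSTEM_SUSPEND = auto()


class Actor(Enum):
    USER = auto()
    SYSTEM = auto()


def suspend(state: State, actor: Actor) -> State:
    """Suspend a running job; the resulting state records who suspended it."""
    # Assumption: suspend is only reachable from RUNNING.
    if state is not State.RUNNING:
        raise ValueError(f"cannot suspend from {state.name}")
    return State.USER_SUSPEND if actor is Actor.USER else State.SYSTEM_SUSPEND


def resume(state: State, actor: Actor) -> State:
    """Resume a suspended job, but only if the actor owns the suspension."""
    if state is State.SYSTEM_SUSPEND and actor is Actor.USER:
        # A user resuming a preempted job could oversubscribe the system.
        raise PermissionError("user cannot leave SystemSuspend")
    if state is State.USER_SUSPEND and actor is Actor.SYSTEM:
        # Assumption: the system does not resume a user-suspended job.
        raise PermissionError("system does not leave UserSuspend")
    if state not in (State.USER_SUSPEND, State.SYSTEM_SUSPEND):
        raise ValueError(f"cannot resume from {state.name}")
    return State.RUNNING
```

Keeping the two suspend states separate is what lets `resume` reject the user's request for a preempted job without needing any extra bookkeeping about who triggered the suspension.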