They are not. Grid is a richer environment.

Hrabri

-----Original Message-----
From: owner-saga-rg@ggf.org [mailto:owner-saga-rg@ggf.org] On Behalf Of Thilo Kielmann
Sent: Sunday, February 12, 2006 1:09 AM
To: Andre Merzky
Cc: Christopher Smith; Simple API for Grid Applications WG
Subject: Re: [saga-rg] job states...
Curious question:
SAGA should align job states with both BES and DRMAA? Are they the same to begin with?
Thilo
On Sat, Feb 11, 2006 at 03:54:16AM +0100, Andre Merzky wrote:
Date: Sat, 11 Feb 2006 03:54:16 +0100
From: Andre Merzky
To: Christopher Smith
Cc: Andre Merzky, Simple API for Grid Applications WG
Subject: Re: [saga-rg] job states...

I agree - the file transfer state models are not needed for SAGA. We don't have any actions on these states anyway.
Andre.
Quoting [Christopher Smith] (Feb 11 2006):
Sure.
As mentioned ... I think maybe supporting a subset of BES is ok. Much of the state model wrt file transfer state modelling I think is not required for SAGA.
-- Chris
On 10/2/06 18:46, "Andre Merzky" wrote:
Ok, then I'll do that in the strawman. I would appreciate it if you could glance over it after commit, for a sanity check.
Thanks, Andre.
Quoting [Christopher Smith] (Feb 11 2006):
It makes sense to keep the state models in sync.
-- Chris
On 10/2/06 18:26, "Andre Merzky" wrote:
Quoting [Christopher Smith] (Feb 11 2006):
>
> What I meant by that comment is that where it is a subset, it should
> reflect the BES terminology. I think that the number of states
> represented is enough already. ;-)
Would it make sense to just copy the BES state diagram?
It did not exist when we (== you ;-) drafted the SAGA job states - if it had been around then, we might have copied it already.
Apart from the SystemXXX/UserXXX states, and from Hold, it is not that much different from the SAGA model anyway.
Cheers, Andre.
> -- Chris
>
>
> On 10/2/06 17:30, "Andre Merzky" wrote:
>
>> Hi Chris,
>>
>> many thanks for the answers! :-)
>>
>>> By the way ... I believe that the state diagram should at least be a
>>> subset of the BES state diagram ... we should adopt the same names.
>>
>> I agree, kind of - I would say that the SAGA job state
>> diagram should at _most_ be a subset of the BES state diagram.
>> It could be _S_implier :-)
>>
>> Cheers, Andre.
>>
>>
>> Quoting [Christopher Smith] (Feb 10 2006):
>>> Date: Fri, 10 Feb 2006 13:41:18 -0800
>>> Subject: Re: [saga-rg] job states...
>>> From: Christopher Smith
>>> To: Simple API for Grid Applications WG
>>>
>>> On 4/2/06 11:18, "Andre Merzky" wrote:
>>>
>>> Ok ... I'll try to answer these, at least from my viewpoint.
>>>
>>>>
>>>> I think that diagram is wrong, isn't it? Well, here are my
>>>> questions:
>>>>
>>>> - if we submit a job, it's immediately Queued - is that
>>>>   right? Should it be pending before (e.g. as long as
>>>>   queuing request travels the middleware layers)?
>>>>
>>> To me, Queued is the same as Pending. Pending is probably a better word
>>> for this. Can't remember where the Queued name came from, as LSF uses PEND.
>>>
>>>> - can the hold and suspend states be reached only from
>>>>   'Running', or from elsewhere as well?
>>>>
>>> You can only go into a Hold state from Pending, I think, or directly into
>>> Hold on submission.
>>>
>>>> - What is the difference between 'Hold' and 'Suspend'?
>>>>
>>> A Hold state tells the scheduler/broker not to consider this job for
>>> scheduling/dispatch until the hold is explicitly released.
>>>
>>>> - Are there signals defined (apart from KILL) which change
>>>>   the job state? I guess that is not as simple as saying
>>>>   SUSP does suspend - that state is probably defined by
>>>>   the scheduler, not by the OS...
>>>>
>>> Right ... this is implementation dependent on the mechanism used to
>>> suspend a job (might be a signal, might be some other mechanism). What is
>>> important is that there is an operation to initiate the state transition.
>>>
>>>> - What is the use case for distinguishing between UserHold
>>>>   and SystemHold, or between UserSuspend and
>>>>   SystemSuspend?
>>>>
>>> If I preempt workload, the system will put it into a SystemSuspend state
>>> that a user cannot cause a switch out of, otherwise a system may become
>>> oversubscribed due to the preempted and preempting jobs running at the
>>> same time. A UserSuspend can be entered and exited by the user, and is
>>> often used to hold processing to check progress, etc.
>>>
>>>
>>> By the way ... I believe that the state diagram should at least be a
>>> subset of the BES state diagram ... we should adopt the same names.
>>>
>>> -- Chris
>>
>>
-- "So much time, so little to do..." -- Garfield
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
Pardon me, BES is not for Grids???

Thilo

On Mon, Feb 13, 2006 at 10:14:07AM -0800, Rajic, Hrabri wrote:
They are not. Grid is a richer environment.
Hrabri
-----Original Message-----
From: owner-saga-rg@ggf.org [mailto:owner-saga-rg@ggf.org] On Behalf Of Thilo Kielmann
Sent: Sunday, February 12, 2006 1:09 AM
To: Andre Merzky
Cc: Christopher Smith; Simple API for Grid Applications WG
Subject: Re: [saga-rg] job states...
Curious question:
SAGA should align job states with both BES and DRMAA? Are they the same to begin with?
Thilo
On Sat, Feb 11, 2006 at 03:54:16AM +0100, Andre Merzky wrote:
Date: Sat, 11 Feb 2006 03:54:16 +0100
From: Andre Merzky
To: Christopher Smith
Cc: Andre Merzky, Simple API for Grid Applications WG
Subject: Re: [saga-rg] job states...

I agree - the file transfer state models are not needed for SAGA. We don't have any actions on these states anyway.
Andre.
Quoting [Christopher Smith] (Feb 11 2006):
Sure.
As mentioned ... I think maybe supporting a subset of BES is ok. Much of the state model wrt file transfer state modelling I think is not required for SAGA.
-- Chris
On 10/2/06 18:46, "Andre Merzky" wrote:
Ok, then I'll do that in the strawman. I would appreciate it if you could glance over it after commit, for a sanity check.
Thanks, Andre.
Quoting [Christopher Smith] (Feb 11 2006):
It makes sense to keep the state models in sync.
-- Chris
On 10/2/06 18:26, "Andre Merzky" wrote:
> Quoting [Christopher Smith] (Feb 11 2006):
>>
>> What I meant by that comment is that where it is a subset, it should
>> reflect the BES terminology. I think that the number of states
>> represented is enough already. ;-)
>
> Would it make sense to just copy the BES state diagram?
>
> It did not exist when we (== you ;-) drafted the SAGA job
> states - if it had been around then, we might have
> copied it already.
>
> Apart from the SystemXXX/UserXXX states, and from Hold,
> it is not that much different from the SAGA model anyway.
>
> Cheers, Andre.
>
>
>> -- Chris
>>
>>
>> On 10/2/06 17:30, "Andre Merzky" wrote:
>>
>>> Hi Chris,
>>>
>>> many thanks for the answers! :-)
>>>
>>>> By the way ... I believe that the state diagram should at least be a
>>>> subset of the BES state diagram ... we should adopt the same names.
>>>
>>> I agree, kind of - I would say that the SAGA job state
>>> diagram should at _most_ be a subset of the BES state diagram.
>>> It could be _S_implier :-)
>>>
>>> Cheers, Andre.
>>>
>>>
>>> Quoting [Christopher Smith] (Feb 10 2006):
>>>> Date: Fri, 10 Feb 2006 13:41:18 -0800
>>>> Subject: Re: [saga-rg] job states...
>>>> From: Christopher Smith
>>>> To: Simple API for Grid Applications WG
>>>>
>>>> On 4/2/06 11:18, "Andre Merzky" wrote:
>>>>
>>>> Ok ... I'll try to answer these, at least from my viewpoint.
>>>>
>>>>>
>>>>> I think that diagram is wrong, isn't it? Well, here are my
>>>>> questions:
>>>>>
>>>>> - if we submit a job, it's immediately Queued - is that
>>>>>   right? Should it be pending before (e.g. as long as
>>>>>   queuing request travels the middleware layers)?
>>>>>
>>>> To me, Queued is the same as Pending. Pending is probably a better word
>>>> for this. Can't remember where the Queued name came from, as LSF uses PEND.
>>>>
>>>>> - can the hold and suspend states be reached only from
>>>>>   'Running', or from elsewhere as well?
>>>>>
>>>> You can only go into a Hold state from Pending, I think, or directly into
>>>> Hold on submission.
>>>>
>>>>> - What is the difference between 'Hold' and 'Suspend'?
>>>>>
>>>> A Hold state tells the scheduler/broker not to consider this job for
>>>> scheduling/dispatch until the hold is explicitly released.
>>>>
>>>>> - Are there signals defined (apart from KILL) which change
>>>>>   the job state? I guess that is not as simple as saying
>>>>>   SUSP does suspend - that state is probably defined by
>>>>>   the scheduler, not by the OS...
>>>>>
>>>> Right ... this is implementation dependent on the mechanism used to
>>>> suspend a job (might be a signal, might be some other mechanism). What is
>>>> important is that there is an operation to initiate the state transition.
>>>>
>>>>> - What is the use case for distinguishing between UserHold
>>>>>   and SystemHold, or between UserSuspend and
>>>>>   SystemSuspend?
>>>>>
>>>> If I preempt workload, the system will put it into a SystemSuspend state
>>>> that a user cannot cause a switch out of, otherwise a system may become
>>>> oversubscribed due to the preempted and preempting jobs running at the
>>>> same time. A UserSuspend can be entered and exited by the user, and is
>>>> often used to hold processing to check progress, etc.
>>>>
>>>>
>>>> By the way ... I believe that the state diagram should at least be a
>>>> subset of the BES state diagram ... we should adopt the same names.
>>>>
>>>> -- Chris
>>>
>>>
>
>
-- "So much time, so little to do..." -- Garfield
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
-- Thilo Kielmann http://www.cs.vu.nl/~kielmann/
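
For readers following the state discussion quoted above, here is a rough sketch of how the Hold/Suspend distinction could be modelled. It is not the SAGA, BES, or DRMAA state diagram. The state names Pending, UserHold, SystemHold, UserSuspend and SystemSuspend come from the thread; the Done state, the Job class, the move() method, the "user"/"system" actor labels, and the rule that Suspend is entered only from Running are assumptions made for illustration (the thread leaves that last question open).

```python
from enum import Enum


class JobState(Enum):
    PENDING = "Pending"              # called "Queued" in the older SAGA diagram
    USER_HOLD = "UserHold"
    SYSTEM_HOLD = "SystemHold"
    RUNNING = "Running"
    USER_SUSPEND = "UserSuspend"
    SYSTEM_SUSPEND = "SystemSuspend"
    DONE = "Done"                    # assumed terminal state, not from the thread


USER, SYSTEM = "user", "system"      # hypothetical actor labels
S = JobState

# (from_state, to_state) -> actors allowed to trigger the transition.
# Hold is entered from Pending (or at submission); Suspend from Running (assumed).
TRANSITIONS = {
    (S.PENDING, S.USER_HOLD): {USER},
    (S.USER_HOLD, S.PENDING): {USER},
    (S.PENDING, S.SYSTEM_HOLD): {SYSTEM},
    (S.SYSTEM_HOLD, S.PENDING): {SYSTEM},
    (S.PENDING, S.RUNNING): {SYSTEM},          # scheduler dispatches the job
    (S.RUNNING, S.USER_SUSPEND): {USER},
    (S.USER_SUSPEND, S.RUNNING): {USER},
    (S.RUNNING, S.SYSTEM_SUSPEND): {SYSTEM},   # e.g. preemption
    (S.SYSTEM_SUSPEND, S.RUNNING): {SYSTEM},   # a user cannot switch out of this
    (S.RUNNING, S.DONE): {SYSTEM},
}


class Job:
    """Hypothetical job object that only tracks its state."""

    def __init__(self, hold_on_submit=False):
        # A job may go directly into Hold on submission.
        self.state = S.USER_HOLD if hold_on_submit else S.PENDING

    def move(self, target, actor):
        allowed = TRANSITIONS.get((self.state, target))
        if allowed is None:
            raise ValueError(f"no transition {self.state.value} -> {target.value}")
        if actor not in allowed:
            raise PermissionError(
                f"{actor} may not trigger {self.state.value} -> {target.value}"
            )
        self.state = target


if __name__ == "__main__":
    job = Job()
    job.move(S.RUNNING, SYSTEM)
    job.move(S.SYSTEM_SUSPEND, SYSTEM)
    try:
        job.move(S.RUNNING, USER)    # user tries to resume a preempted job
    except PermissionError as err:
        print(err)
```

Keeping the permitted actors in the transition table, rather than in the states themselves, is one way to express why UserSuspend and SystemSuspend are distinct states: the target state is the same (Running), but who may trigger the exit differs.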