Re: [DRMAA-WG] Load average interval ?

25 Mar 2010

I would also vote for the total number of cores and sockets :)

We could also think about reporting the number of concurrent
threads supported by the hardware (hyperthreading in the
case of Intel, or chip multithreading in the case of Sun T2 processors).
This would spare the user from puzzling out what is meant by
a core (is it a real one, or the hyperthreading/CMT thing?).

If not we should at least define that a core is really a physical core.
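
To make the core/thread distinction concrete, here is a minimal sketch of how the two counts relate. The class name CpuTopology and its fields are hypothetical and not part of any DRMAA draft; the example values assume a single-socket Sun T2 with 8 cores and 8 hardware threads per core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuTopology:
    """Hypothetical container; none of these names come from the DRMAA spec."""
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # 1 when the hardware has no hyperthreading/CMT

    @property
    def physical_cores(self) -> int:
        # "Real" cores only, independent of hyperthreading/CMT.
        return self.sockets * self.cores_per_socket

    @property
    def hardware_threads(self) -> int:
        # Concurrent threads the hardware can run, i.e. what many
        # operating systems report as "CPUs".
        return self.physical_cores * self.threads_per_core


# Illustrative values: one UltraSPARC T2 socket, 8 cores, 8 threads per core.
t2 = CpuTopology(sockets=1, cores_per_socket=8, threads_per_core=8)
print(t2.physical_cores, t2.hardware_threads)
```

Reporting both numbers lets the caller decide which one matters for scheduling, instead of guessing what "core" means on a given DRM.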

Daniel

On 03/25/10 15:44, Daniel Templeton wrote:
...
I would tend to agree that total core count is more useful.  SGE also 
reports socket count as of 6.2u5, by the way.  (That's actually thanks 
to our own Daniel Gruber.)
Daniel
On 03/25/10 07:03, Mariusz Mamoński wrote:
...
Also for me. As we are talking about the monitoring interface, I propose
two more changes to the machine monitoring interface:
1. Having a new data struct called "MachineInfo" with attributes like
Load, PhysMemory, ... and a getMachineInfo(in String machineName) method
in the Monitoring interface. Rationale: the same as for JobInfo
(consistency; fetching all of a machine's attributes at once is more
natural in DRMS APIs than querying for each attribute separately).
2. Change machineCoresPerSocket to machinesCores. If one has
machineSockets, he or she can easily determine
machineCoresPerSocket. The problem with the current API is that if the
DRM does not support "machineSockets" (as far as I checked, only LSF provides
this two-level granularity, @see Google Doc), we lose the most
essential information: "how many single processing units do we have on
a single machine?"
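
The two proposals above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Python names (MachineInfo fields, Monitoring, get_machine_info, the in-memory table, the host names) are hypothetical stand-ins for the IDL names in the proposal, and the memory unit is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MachineInfo:
    """Hypothetical struct bundling all machine attributes in one fetch."""
    name: str
    cores: int                          # total cores; always reported
    sockets: Optional[int] = None       # None if the DRM has no socket data
    load: Optional[float] = None
    phys_memory: Optional[int] = None   # unit is an assumption (e.g. KB)

    def cores_per_socket(self) -> Optional[int]:
        # Derivable only when the DRM reports sockets; the total core
        # count survives either way, which is the point of proposal 2.
        if not self.sockets:
            return None
        return self.cores // self.sockets


class Monitoring:
    """Stand-in for the Monitoring interface, backed by a static table."""

    def __init__(self, machines: Dict[str, MachineInfo]) -> None:
        self._machines = machines

    def get_machine_info(self, machine_name: str) -> MachineInfo:
        return self._machines[machine_name]


monitoring = Monitoring({
    "lsf-node": MachineInfo(name="lsf-node", cores=8, sockets=2),
    "other-node": MachineInfo(name="other-node", cores=8),
})
print(monitoring.get_machine_info("lsf-node").cores_per_socket())
print(monitoring.get_machine_info("other-node").cores)
```

With this shape, a DRM that lacks socket information still answers the "how many processing units" question, and a DRM that has it loses nothing.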
Cheers,
On 23 March 2010 23:00, Daniel Templeton<daniel.templeton@oracle.com>  wrote:
...
That's fine with me.
Daniel
On 03/23/10 13:51, Peter Tröger wrote:
...
...
Any non-SGE opinion ?
Here is mine:
I could find only a single source that explains where Condor gets
its load average :)
http://www.patentstorm.us/patents/5978829/description.html
Condor provides only the 1-minute load average from the uptime command.
Same holds for Moab:
http://www.clusterresources.com/products/mwm/docs/commands/checknode.shtml
And PBS:
http://wiki.egee-see.org/index.php/Installing_and_configuring_guide_for_MonA...
And MAUI:
https://psiren.cs.nott.ac.uk/projects/procksi/wiki/JobManagement
I vote for reporting only the 1-minute load average.
/Peter.
...
And BTW, by using the uptime(1) load semantics, we lose Windows
support. There is no such attribute there; load is measured as the
percentage of non-idle time, and has no direct relationship to the
ready queue length.
Best,
Peter.
Am 22.03.2010 um 16:02 schrieb Daniel Templeton:
...
SGE tends to look at the 5-minute average, although any can be
configured.  You could solve it the same way we did for SGE -- offer
three: machineLoadShort, machineLoadMed, machineLoadLong.
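
For illustration, the three-value variant could be read on a Unix host roughly like this. It is a sketch, not how SGE does it: MachineLoad and read_machine_load are hypothetical names, and mapping Short/Med/Long to the 1/5/15-minute averages is an assumption. os.getloadavg() is the standard Python call and is only available on Unix, which matches the Windows concern raised elsewhere in this thread.

```python
import os
from typing import NamedTuple, Optional


class MachineLoad(NamedTuple):
    """Hypothetical mapping of the three proposed attributes."""
    short: float  # assumed: 1-minute average (machineLoadShort)
    med: float    # assumed: 5-minute average (machineLoadMed)
    long: float   # assumed: 15-minute average (machineLoadLong)


def read_machine_load() -> Optional[MachineLoad]:
    """Return the uptime(1)-style load averages, or None if unavailable."""
    getloadavg = getattr(os, "getloadavg", None)
    if getloadavg is None:
        # Platforms such as Windows do not provide this function.
        return None
    try:
        return MachineLoad(*getloadavg())
    except OSError:
        # The OS could not supply the load average.
        return None


load = read_machine_load()
if load is None:
    print("load average not available on this platform")
else:
    print("1-minute load:", load.short)
```

Reporting only the 1-minute value, as also proposed in this thread, would amount to exposing just the first field.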
Daniel
On 03/22/10 06:05, Peter Tröger wrote:
> Hi,
>
> next remaining thing from OGF28:
>
> We support the determination of machineLoad average in the
> MonitoringSession interface. At OGF, we could not agree on which of
> the typical intervals (1/5/15 minutes) we want to use here. Maybe
> all of them ?
>
> Best,
> Peter.
>
>
>
>
> --
>    drmaa-wg mailing list
>    drmaa-wg@ogf.org
>    http://www.ogf.org/mailman/listinfo/drmaa-wg
>               
--
drmaa-wg mailing list
drmaa-wg@ogf.org
http://www.ogf.org/mailman/listinfo/drmaa-wg