
glue-wg-bounces@ogf.org on behalf of Burke, S (Stephen) said:
This is a known issue - it may be in one of Sergio's trackers already; I know I've mentioned it before.
This was my reply last time this came up (May), probably still about right ... Like many things this is about as clear as mud :)

The way the schema is written, the RAMSize comes out of the physical host description and hence was supposed to be the memory for the whole WN. However, for job submission purposes it's more useful to know how much memory a job will get, so many sites divide by the number of job slots per CPU. Of course, in general there is no real guarantee anyway about how much memory a job will actually get, the values are set by hand so may be wrong, and many clusters will have WNs with varying amounts of memory ...

The last time this was discussed I think the conclusion was that people should publish the memory per WN since a) it's what the schema says, b) it is a fairly well-defined number, and c) a job might get that much if it happens to get the node to itself or if it shares with a small-memory job. If we ever had the passing of requirements to the LRMS it could also be used to ensure that jobs could get memory up to the physical amount in the machine.

If you wanted to estimate what you get per job on average, the right value to divide by is not LogicalCPUs but ArchitectureSMPSize - except that that was defined before we had multi-core and hyperthreading so it's not entirely clear how to deal with them (personally I would count them) ... and of course sites don't necessarily configure one slot per CPU, although it's pretty common. SMPSize does at least look to be set vaguely sensibly ...

     44 GlueHostArchitectureSMPSize: 1
      1 GlueHostArchitectureSMPSize: 16
    231 GlueHostArchitectureSMPSize: 2
     32 GlueHostArchitectureSMPSize: 4
      1 GlueHostArchitectureSMPSize: 8

Stephen
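As a concrete illustration of the division described above, here is a minimal sketch in Python. It assumes the whole-WN memory (GlueHostMainMemoryRAMSize, in MB) and the published GlueHostArchitectureSMPSize have already been read from the information system for a given subcluster; the function name and example values are made up for illustration only.

    def estimate_memory_per_job(ram_size_mb: int, smp_size: int) -> float:
        """Rough average per-job memory estimate from published GLUE 1.x values.

        Divide the whole-WN memory by ArchitectureSMPSize, not LogicalCPUs.
        This is only an estimate: slots per CPU, hand-set values and mixed
        WN types all make the real figure uncertain, as noted above.
        """
        if smp_size <= 0:
            raise ValueError("SMPSize must be positive")
        return ram_size_mb / smp_size

    # e.g. a 16 GB WN published with SMPSize 8 -> roughly 2048 MB per slot
    print(estimate_memory_per_job(16384, 8))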