
glue-wg-bounces@ogf.org on behalf of Burke, S (Stephen) said:
This is a known issue - it may be in one of Sergio's trackers already; I know I've mentioned it before.
This was my reply last time this came up (May), probably still about right ... Like many things this is about as clear as mud :)

The way the schema is written, the RAMSize comes out of the physical host description and hence was supposed to be the memory for the whole WN. However, for job submission purposes it's more useful to know how much memory a job will get, so many sites divide by the number of job slots per CPU. Of course, in general there is no real guarantee anyway about how much memory a job will actually get, the values are set by hand so may be wrong, and many clusters will have WNs with varying amounts of memory ...

The last time this was discussed I think the conclusion was that people should publish the memory per WN since a) it's what the schema says, b) it is a fairly well-defined number, and c) a job might get that much if it happens to get the node to itself or if it shares with a small-memory job. If we ever had the passing of requirements to the LRMS it could also be used to ensure that jobs could get memory up to the physical amount in the machine.

If you wanted to estimate what you get per job on average, the right value to divide by is not LogicalCPUs but ArchitectureSMPSize - except that that was defined before we had multi-core and hyperthreading so it's not entirely clear how to deal with them (personally I would count them) ... and of course sites don't necessarily configure one slot per CPU, although it's pretty common. SMPSize does at least look to be set vaguely sensibly ...

     44 GlueHostArchitectureSMPSize: 1
      1 GlueHostArchitectureSMPSize: 16
    231 GlueHostArchitectureSMPSize: 2
     32 GlueHostArchitectureSMPSize: 4
      1 GlueHostArchitectureSMPSize: 8

Stephen
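As a concrete illustration of the division described above, here is a minimal sketch in Python. It assumes the whole-WN memory (GlueHostMainMemoryRAMSize, in MB) and the published GlueHostArchitectureSMPSize have already been read from the information system for a given subcluster; the function name and example values are made up for illustration only.

    def estimate_memory_per_job(ram_size_mb: int, smp_size: int) -> float:
        """Rough average per-job memory estimate from published GLUE 1.x values.

        Divide the whole-WN memory by ArchitectureSMPSize, not LogicalCPUs.
        This is only an estimate: slots per CPU, hand-set values and mixed
        WN types all make the real figure uncertain, as noted above.
        """
        if smp_size <= 0:
            raise ValueError("SMPSize must be positive")
        return ram_size_mb / smp_size

    # e.g. a 16 GB WN published with SMPSize 8 -> roughly 2048 MB per slot
    print(estimate_memory_per_job(16384, 8))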