Hi,

in the MonitoringSession we have machineSockets and coresPerSocket at machine level. To be consistent we should also add threadsPerCore. At least OGE/SGE supports this. I added it to our spreadsheet. If a DRM/OS does not support it, it could return 0 as the value for unknown.

0 is not allowed for coresPerSocket and machineSockets, since we should define coresPerSocket*machineSockets=="processors" in case a DRM or OS does not support this kind of architectural information. I suggest leaving it open to the DRMAA implementation whether it maps the "processors" information to coresPerSocket or to machineSockets when architectural details are missing.

If there is no objection I'll take this as accepted.

Cheers

Daniel
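The fallback rule proposed above can be sketched as follows. This is only an illustration under assumptions: the `MachineTopology` class and the `topology_from_processors` helper are hypothetical names, not part of DRMAA, and putting all processors into coresPerSocket with a single socket is just one of the two mappings the proposal leaves to the implementation.

```python
from dataclasses import dataclass

UNKNOWN = 0  # value proposed in this message for an unsupported threadsPerCore


@dataclass(frozen=True)
class MachineTopology:
    """Hypothetical container for the three machine-level attributes."""
    machine_sockets: int
    cores_per_socket: int
    threads_per_core: int = UNKNOWN

    def __post_init__(self):
        # 0 is not allowed for sockets or cores, so their product stays meaningful.
        if self.machine_sockets < 1 or self.cores_per_socket < 1:
            raise ValueError("machineSockets and coresPerSocket must be >= 1")
        if self.threads_per_core < 0:
            raise ValueError("threadsPerCore must be >= 0 (0 means unknown)")

    @property
    def processors(self) -> int:
        # Invariant from the proposal: coresPerSocket * machineSockets == processors
        return self.machine_sockets * self.cores_per_socket


def topology_from_processors(processors: int) -> MachineTopology:
    """Fallback when the DRM/OS only reports a flat processor count.

    Assumption: everything is mapped onto coresPerSocket with one socket.
    """
    return MachineTopology(machine_sockets=1, cores_per_socket=processors)
```

An implementation could equally map the count onto machineSockets with one core per socket; the invariant holds either way.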
+1

On Mon, Nov 8, 2010 at 10:55 AM, Daniel Gruber <daniel.x.gruber@oracle.com> wrote:
> If there is no objection I'll take this as accepted.

--
drmaa-wg mailing list
drmaa-wg@ogf.org
http://www.ogf.org/mailman/listinfo/drmaa-wg
-- Nothing is ever easy...
Hi,

I can agree to the new "threadsPerCore" attribute, but would prefer to have "1" as the default value. From our understanding of a core, each one can always execute at least one thread. It would also allow computing an estimate of the number of parallel threads without looking at the specific numbers.

Best, Peter.

On 08.11.2010 at 10:55, Daniel Gruber wrote:
> If this is not supported by a DRM/OS it could return 0 as value for unknown.
Ok. For simplicity we take 1 as the default value, with the drawback that we lose the information whether the SMT value is available (and correct) or not.

Regards,
Daniel

On 11/08/10 15:14, Peter Tröger wrote:
> I can agree to the new "threadsPerCore" attribute, but would prefer to have "1" as default value.
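The trade-off agreed on here can be made concrete with a short sketch. The function name `estimated_parallel_threads` is an assumption for illustration only; the thread itself just talks about multiplying the three attributes.

```python
def estimated_parallel_threads(machine_sockets: int,
                               cores_per_socket: int,
                               threads_per_core: int = 1) -> int:
    """Estimate of the hardware threads on one machine (hypothetical helper)."""
    return machine_sockets * cores_per_socket * threads_per_core


# A host whose DRM cannot report SMT falls back to the default of 1 ...
no_smt_info = estimated_parallel_threads(2, 4)
# ... and is indistinguishable from a host that really runs one thread per core.
smt_disabled = estimated_parallel_threads(2, 4, threads_per_core=1)
# With the earlier "0 means unknown" convention the product would collapse to 0,
# so callers would have to special-case the value before multiplying.
with_zero_sentinel = estimated_parallel_threads(2, 4, threads_per_core=0)
```

With 1 as the default the product is always usable as an estimate, at the price of no longer knowing whether the value was actually reported.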
Hi all,

If we are talking about the monitoring session... what do you think about the idea of:

1. creating a new data struct MachineInfo with all the predefined machine attributes (e.g. threadsPerCore) plus "readonly attribute Dictionary drmsSpecific;" (an extension point), and providing one method, "MachineInfo getMachineInfo(in string machineName)", for accessing all of them
2. adding a new attribute "slotsCount", which denotes the maximum number of single-process jobs that can run concurrently on a given machine (use case: the system administrator may choose a configuration where one process runs per physical core or per hardware thread, or choose any number that is totally independent of the hardware configuration)

Cheers,

2010/11/8 Daniel Gruber <daniel.x.gruber@oracle.com>:
> Ok. For simplicity we take 1 as default value with the drawback that we lose information if the SMT value is available (and correct) or not.
-- Mariusz
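A rough sketch of what point 1 could look like, translated from the IDL fragments in the message into Python. Everything beyond the names MachineInfo, getMachineInfo, drmsSpecific and the attribute names is an assumption: `ToyMonitoringSession` is a hypothetical in-memory stand-in, not a DRMAA interface, and the field selection is only a subset.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Iterable, Mapping


@dataclass(frozen=True)
class MachineInfo:
    """Predefined machine attributes plus a read-only extension point."""
    name: str
    machine_sockets: int
    cores_per_socket: int
    threads_per_core: int = 1  # default agreed on earlier in the thread
    # Read-only dictionary for DRM-specific values ("drmsSpecific").
    drms_specific: Mapping[str, str] = field(
        default_factory=lambda: MappingProxyType({})
    )


class ToyMonitoringSession:
    """Hypothetical stand-in that serves MachineInfo records from memory."""

    def __init__(self, machines: Iterable[MachineInfo]):
        self._machines = {m.name: m for m in machines}

    def get_machine_info(self, machine_name: str) -> MachineInfo:
        # One call returns all attributes of the machine at once.
        return self._machines[machine_name]
```

A caller would fetch the record once and then read `info.cores_per_socket`, `info.threads_per_core` and so on locally.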
The slotsCount attribute would be challenging for something like OGE, where the number of slots available on a machine is taken from the sum of the slots in all the enabled queue instances on that machine, modulo the active resource quota sets, modulo any host-level slots settings, modulo the queue subordination rules. It's also not very meaningful, because many of those things can and do change dynamically.

Daniel

On 11/8/10 07:01 AM, Mariusz Mamoński wrote:
> 2. adding a new attribute "slotsCount", which denotes the maximum number of single-process jobs that can run on given machine concurrently
Adding slotsCount to MachineInfo would destroy the separation between queue-level and host-level information. Different users could have a different slotsCount on the same machine. The information must be retrieved at queue level (we have the queueMaxSlotsAllowed() method for that).

Cheers
Daniel

On 11/08/10 16:01, Mariusz Mamoński wrote:
> 2. adding a new attribute "slotsCount", which denotes the maximum number of single-process jobs that can run on given machine concurrently
Can we split the discussion into two paths? ;-)

Does anyone vote against point 1 (MachineInfo)?

Regarding point 2: We could make maxSlotsCount optional (like threadsPerCore) and define it only as an upper bound (and only for the purpose of computing host/cluster "capacity").

Alternatively, what do you think about the idea of:
- changing threadsPerCore -> maxThreads
- relaxing the meaning of threadsPerCore/maxThreads to "maximal number of threads that can run *simultaneously* on a given machine", without saying whether this is imposed by hardware configuration or system policy.

Cheers,

2010/11/8 Daniel Gruber <daniel.x.gruber@oracle.com>:
> Adding slotsCount to machineInfo would destroy the separation between queue level and host level information. Different users could have different slotCount on the same machine. The information must be retrieved on queue level (we have the queueMaxSlotsAllowed()) method for that.

but in this case the slots can span multiple hosts
-- Mariusz
On 11/08/10 16:58, Mariusz Mamoński wrote:
> Can we split the discussion into two paths? ;-)
> Does anyone vote against 1. point (MachineInfo)?

When there is a method "getMachineInfo(in string machineName)" we would also require "getQueueInfo(in string queueName)" for consistency. In total we would then have "getQueueNames", "getMachineNames", "getMachineInfo", "getQueueInfo", and "getAllJobs", which would result in a pretty clear interface. But the current approach has its strength in simplicity when accessing the values. I would keep the interface simple and portable in the case of monitoring, since this is IMHO not our core competence. Hence I would keep the current approach and vote against it.

> Regarding 2. point:
> We could make maxSlotsCount optional (like the threadsPerCore) and define it only as the upper bound (and only for the purpose of computing host/cluster "capacity"). Alternatively: what do you think about the idea of:
> - changing threadsPerCore -> maxThreads
> - relaxing meaning of the threadsPerCore/maxThreads: as "maximal number of threads that can run *simultaneously* on given machine" without saying if this is imposed by hardware configuration or system policy.

The other machine values (load, sockets, cores, threads, physical mem, virtual mem, machine OS, OS version, machine arch) are not user dependent, hence it would break consistency. Queue values should be user dependent.

Daniel
2010/11/8 Daniel Gruber <daniel.x.gruber@oracle.com>:
> On 11/08/10 16:58, Mariusz Mamoński wrote:
>> Can we split the discussion into two paths? ;-)
>> Does anyone vote against 1. point (MachineInfo)?
>
> When there is a method "getMachineInfo(in string machineName)" we would also require "getQueueInfo(in string queueName)" for consistency.

agree.

> Then in total we would have "getQueueNames", "getMachineNames", "getMachineInfo", "getQueueInfo", and "getAllJobs". Which would result in a pretty clear interface. But the current approach has its strengths in simplicity when accessing the values. I would keep the interface simple and portable in case of monitoring since this is IMHO not our core competence.

what about efficiency? In some systems the cost of fetching one host attribute is equal to the cost of fetching all of them, and IMHO the user is usually interested in at least a few of them.

> Hence I would keep the current approach and vote against it.
-- Mariusz
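The efficiency argument can be illustrated with a toy model. The sketch assumes a hypothetical backend in which every query returns the full host record, as described for "some systems" above; `CountingBackend`, `per_attribute` and `bulk` are made-up names and do not correspond to any DRMAA or DRM call.

```python
class CountingBackend:
    """Hypothetical DRM backend: every query returns the full host record."""

    def __init__(self, hosts):
        self._hosts = hosts
        self.queries = 0  # number of round trips to the DRM

    def fetch_host(self, name):
        self.queries += 1
        return dict(self._hosts[name])


def per_attribute(backend, name, attributes):
    # One backend query per attribute, as with separate getter methods.
    return {a: backend.fetch_host(name)[a] for a in attributes}


def bulk(backend, name, attributes):
    # One backend query, then local lookups, as with a MachineInfo struct.
    record = backend.fetch_host(name)
    return {a: record[a] for a in attributes}
```

Under this assumption, reading three attributes costs three round trips with individual getters and one with the struct; a DRM that can serve single attributes cheaply would not show the same difference.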
Hi,

On 08.11.2010 at 18:59, Mariusz Mamoński wrote:
> 2010/11/8 Daniel Gruber <daniel.x.gruber@oracle.com>:
>> On 11/08/10 16:58, Mariusz Mamoński wrote:
>>> Can we split the discussion into two paths? ;-)
>>> Does anyone vote against 1. point (MachineInfo)?
>>
>> When there is a method "getMachineInfo(in string machineName)" we would also require "getQueueInfo(in string queueName)" for consistency.
>
> agree.
>
>> Then in total we would have "getQueueNames", "getMachineNames", "getMachineInfo", "getQueueInfo", and "getAllJobs". Which would result in a pretty clear interface. But the current approach has its strengths in simplicity when accessing the values. I would keep the interface simple and portable in case of monitoring since this is IMHO not our core competence.
>
> what about efficiency? in some systems the cost of fetching one of the host attribute is equal to fetching all of them and IMHO user is usually interested in at least few of them.

The whole discussion already took place during several phone calls, and Mariusz never managed to get his 'consistency' proposal through - even though he tried hard ;-). I vote against opening new API structure discussions at this point. There was enough time for such objections in the past.

>>> We could make maxSlotsCount optional (like the threadsPerCore) and define it only as the upper bound (and only for the purpose of computing host/cluster "capacity"). Alternatively: what do you think about the idea of:
>>> - changing threadsPerCore -> maxThreads
>>> - relaxing meaning of the threadsPerCore/maxThreads: as "maximal number of threads that can run *simultaneously* on given machine" without saying if this is imposed by hardware configuration or system policy.
>>
>> The other machine values (load, sockets, cores, threads, physical mem, virtual mem, machine os, OS version, machine arch) are not user dependent hence it would break consistency. Queue values should be user dependent.

Same counter argument from my side. We had long and painful slot-related discussions during the phone calls. The overall agreement was that DRMAA cannot apply any meaning to the slot concept, so we just treat it as opaque monitoring data. Check the meeting minutes.

Best, Peter.
participants (5)
- Andre Merzky
- Daniel Gruber
- Daniel Templeton
- Mariusz Mamoński
- Peter Tröger