Draft/suggestion for Storage Accounting Record (2010/10/07)
Hi Henrik,

I think this is a really nice step forward. I have a few quick comments, but I hope to have a more detailed look at some point before the next OGF meeting (which I am sorry to say I cannot attend).

It would be nice to see how it fits with other (non-grid) storage systems.

Regarding creating a stand-alone storage accounting record, I think there are good and bad points here.

Good: It may allow you to get a prototype with less delay, which you could then use, feeding your experience back into the evolution of a full standard. Maybe it is even enough for your EMI work.

Bad: Splitting away from the UR standard causes issues when you want to combine compute/storage accounting. The initial idea from OGF, dating back to OGF 21, was that the UR would be formed from different elements, e.g. compute/storage/network. The record would effectively aggregate the different parts of the accounting information. I attached the UR2 talk from Donal Fellows at OGF 21; I think the UR2 Zoo slide is a nice vision to have. Now maybe this is doable and maybe it's a pain, but I think it would be nice to try. Much of the information regarding the users/VO (community)/site would be the same and would fit into the core.

One other (smaller) issue I had was replicas. In some storage systems you can have the same file replicated multiple times to improve access; dCache can do this, and I would be surprised if iRODS couldn't do it too. How do you think we should account for this: simply treat the files as separate, or flag this storage as a different type? It may not be so pressing quite yet, since I don't think it is done that frequently, so it could be a very small fraction of the storage.

We should also start to think about what is required on the backend from the storage solution providers to allow us to gather and use the accounting info you want. I think many systems would be able to answer "what is in your system now", but I am not sure how it is when we ask "what was in your system between X and Y in 2007". I think you have good contacts with several providers and can ask, right? This also opens up the question of what we account for: bytes vs byte-minutes? If you are really talking about an integration over a time period, that is what it would amount to. We had some discussion regarding the snapshot/integration question, and I think we may have some more :-)

As I said, just first quick thoughts.

cheers
johnk

On 10/07/2010 01:52 PM, Henrik Thostrup Jensen wrote:
Hi
(this is sent both to the ogf ur list and the emi sar list, sorry for duplicates).
Jon and I happened to be at the same meeting this week, so we took a couple of hours out of the programme to create a draft suggestion for a storage accounting record. The suggestion is mainly meant as a way to kick off the meeting in Brussels.
1. Discrete vs continuous
First we discussed how storage differs from jobs: a job is a discrete unit, whereas storage has a more continuous nature, where the usage varies more or less constantly (bytes used), though typically on a relatively small scale.
A storage record can, however, only describe something discrete, so the continuous nature of storage will have to be split into discrete pieces (similar to integration). Of course, the granularity of this process should not be dictated by the standard. The first suggestion is therefore to have a start and an end time for the interval over which a measurement is taken. This allows per-day or per-hour resolution, or anything else, depending on the needs.
Result: "StartTime" and "EndTime" elements for describing the time interval for which the record is valid. Both of them are DateTime values.
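As a small sketch of the discretisation idea (Python is ours, purely illustrative; the draft only fixes the two DateTime elements, not how often a measurement is taken), per-day record intervals could be cut like this:

```python
from datetime import datetime, timedelta

def daily_intervals(first_day, num_days):
    """Yield (StartTime, EndTime) pairs for per-day storage records.

    The granularity is a site choice; swapping timedelta(days=1) for
    timedelta(hours=1) would give per-hour records instead.
    """
    start = first_day
    for _ in range(num_days):
        end = start + timedelta(days=1)
        yield start, end
        start = end

# Three consecutive per-day intervals starting 2010-10-01.
intervals = list(daily_intervals(datetime(2010, 10, 1), 3))
```

Each interval's end is the next interval's start, so the continuous usage is covered without gaps.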
2. What to measure
The most obvious metric here is the amount of used space. Another metric is the amount of reserved space, i.e., space not taken by actual files/objects, but which cannot be used by other parties. A third metric is the amount of space which is not allocated and can be used. Initially we called this "free space", but this term is not exactly accurate, as it refers more to a file system metric, which can be very different from what can actually be used due to quotas. Instead we chose to have an element describing how much unallocated space there is, i.e., how much space one can expect to be able to use. The actual free space is a metric that can be very tricky to measure, and it is typically only of interest to the storage system administrator.
A secondary issue is how to report the numbers. We quickly decided on using bytes, as it is the fundamental unit, and it saves us from deciding whether to use 1000 or 1024 as a base. Having multiple options (say KB, MB) for reporting would complicate the standard unnecessarily without any real benefits, so we suggest keeping it to bytes, and bytes only. This also ensures that the number is always an integer (with KB and up, floats would be needed to be exact) and saves silly conversion routines when parsing the record.
We briefly considered whether reporting should be per file, but this was quickly shot down, as it would make the records unreasonably large without providing any real value. We did end up with an element describing the number of files using the reported space.
Result: "UsedSpace", "ReservedSpace", "UnAllocatedSpace" metrics for describing how space is used, reserved, and available for use. Reserved and unallocated are probably not overlapping (as reserved space is technically used). The measurement is in bytes. "FileCount" for describing the number of files using the space.
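A hedged sketch of these metrics (the element names come from the draft; the helper function and its checks are our invention): keeping everything as integer bytes means no unit conversion is ever needed when producing or parsing a record.

```python
def space_metrics(used, reserved, unallocated, file_count):
    """Build the space-metric part of a record; all sizes in bytes.

    Integer bytes only, as the draft suggests: no KB/MB variants,
    so no floats and no conversion routines.
    """
    for v in (used, reserved, unallocated, file_count):
        assert isinstance(v, int) and v >= 0
    return {
        "UsedSpace": used,                # bytes taken by actual files
        "ReservedSpace": reserved,        # bytes held but not yet written
        "UnAllocatedSpace": unallocated,  # bytes one can expect to use
        "FileCount": file_count,          # files occupying UsedSpace
    }

# 10 GiB used, 2 GiB reserved, 50 GiB unallocated, 12345 files.
m = space_metrics(10 * 1024**3, 2 * 1024**3, 50 * 1024**3, 12345)
```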
3. Site and Storage Concepts
The issue of how to describe the site and the storage on which the accounted data is stored is perhaps the most complex issue in defining this format. The discussion to get there was very non-linear, so I will just describe the result:
A site is considered a top-level container for storage. The site name should be globally unique (which probably means an FQDN).
A storage system is an independent system at a site.
A storage system partition is a part of a storage system (similar to dCache pools).
A storage type describes the kind of storage where the data is stored (disk, tape).
None of these elements are mandatory (though we couldn't find a valid use case for a record without a site). E.g., it is perfectly valid to use just site; site and storage type; or site and storage system. If several of the elements exist, they are considered to form a hierarchical, tree-like structure, i.e., site -> storage system -> storage system partition -> storage type. How to structure this is described in the "Record Structure" section.
This allows a site with a simple setup to report just for the site, whereas more complex installations can report per storage system, including how much is placed on tape and disk respectively for different storage systems.
A single institution can easily report multiple sites; we do not interfere in this. But it is very likely that a site would have several storage elements/systems and would like to be able to aggregate them under a single site.
Finally, we also add the possibility of a storage class, which describes the class of storage accounted for. This could be "precious", "deletable", "pinned", etc. It is not considered a part of the hierarchy described above.
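The topology described in this section could be sketched as follows (a Python illustration of ours, not part of the draft): the four optional elements keep a fixed outermost-to-innermost order, any subset is allowed, and StorageClass deliberately stays outside the hierarchy.

```python
# Order of the optional topology elements, outermost first
# (per the draft: site -> storage system -> partition -> type).
HIERARCHY = ("Site", "StorageSystem", "StorageSystemPartition", "StorageType")

def topology(**fields):
    """Return the topology part of a record in hierarchy order.

    Any subset of the elements is allowed; unknown names are rejected.
    StorageClass is intentionally not listed here, since it is not
    part of the hierarchy.
    """
    unknown = set(fields) - set(HIERARCHY)
    if unknown:
        raise ValueError("not a topology element: %s" % unknown)
    return {k: fields[k] for k in HIERARCHY if k in fields}

# Keys come out in hierarchy order regardless of call order.
t = topology(StorageType="tape", Site="example.org")
```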
4. Identity
To describe who is using the space, an identity block or group similar to the one in the usage record must be supported. As with the usage record, it must be possible to specify both a local user name and a global user identity. However, the common case is likely to be a VO or a VO group, so being able to describe this is extremely important. Another use case is a local group at the site that owns the data (not everyone uses grid). How exactly to describe the virtual organization is not quite clear, but the following elements are needed as a base: VO name, VO issuer (DN of the VO issuer, somewhat VOMS-specific), VO group, and VO role. There might be use cases for having multiple VO blocks (though I suspect that would be messy).
Resulting elements:
VO Name
VO Issuer
VO Group
VO Role
Global User Identity
Local User Name
Local User Group
All elements are strings. The Global User Identity is to be considered globally unique. Furthermore, the combination of VO issuer and VO name should be globally unique.
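As a sketch of the identity block (Python is our choice for illustration; the draft defines only the element names, all optional strings):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IdentityBlock:
    """Identity block sketch; every field is an optional string.

    GlobalUserIdentity is meant to be globally unique, as is the
    combination of vo_issuer and vo_name.
    """
    vo_name: Optional[str] = None
    vo_issuer: Optional[str] = None
    vo_group: Optional[str] = None
    vo_role: Optional[str] = None
    global_user_identity: Optional[str] = None
    local_user_name: Optional[str] = None
    local_user_group: Optional[str] = None

# Common case: a VO plus a VO group, nothing else filled in.
ident = IdentityBlock(vo_name="atlas", vo_group="/atlas/production")
```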
5. Record Structure
Having identified the identity block and the site/storage structure, we tried to find a good way to structure the individual records. We considered the possibility of a tree structure in the storage accounting record, with the site, storage system, storage system partition, and storage type as possible levels in the tree. The leaves would then each have an identity and space usage block. This, however, would be quite complicated to construct and parse, so a simpler, flatter structure was quickly preferred. Here each record would have at most one site, storage system, storage system partition, and storage type (all still optional). If a system needs to report for multiple storage systems, storage system partitions, or storage types, several records would have to be generated. This format should still be easy to aggregate per site or storage system to get complete numbers. The flat format is also much easier to explain, which probably means that it should be preferred. The two structures can describe exactly the same thing, so nothing is lost by choosing the flat format.
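The aggregation property of the flat format can be sketched like this (our illustration with invented record values; element names from the draft): several flat records, each with at most one value per topology level, sum cleanly up to site or storage-system totals.

```python
from collections import defaultdict

def aggregate(records, level="Site"):
    """Sum UsedSpace over flat records, grouped by one topology level.

    This works precisely because each flat record carries at most one
    value per level; tree-structured records would need a traversal.
    """
    totals = defaultdict(int)
    for rec in records:
        totals[rec.get(level)] += rec["UsedSpace"]
    return dict(totals)

# Two per-storage-system records from the same (hypothetical) site.
records = [
    {"Site": "site.example.org", "StorageSystem": "dcache1", "UsedSpace": 100},
    {"Site": "site.example.org", "StorageSystem": "dcache2", "UsedSpace": 250},
]
site_totals = aggregate(records, "Site")
```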
6. Record Overview
Given the previous section, we can now present an overview of the elements in a storage accounting record. We reuse the recordId and createTime elements from the usage record standard (just the element names, not the namespaces).
StorageAccountingRecord
    recordId (considered globally unique)
    createTime (timestamp when the record was created)
    StartTime (DateTime value)
    EndTime (DateTime value)
    Site
    StorageSystem
    StorageSystemPartition
    StorageType
    StorageClass
    IdentityBlock
        VO Block
            VO Name
            VO Issuer
            VO Group
            VO Role
        GlobalUserIdentity
        LocalUserName
        LocalGroupName
    UsedSpace
    ReservedSpace
    UnallocatedSpace
    FileCount
The only enforced element is the recordId.
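To make the overview concrete, a minimal sketch (in Python, purely illustrative; the draft defines no serialization) of building such a record, with only recordId enforced and everything else optional:

```python
import uuid
from datetime import datetime, timezone

def make_record(**optional):
    """Build a storage accounting record; only recordId is enforced."""
    rec = {
        "recordId": str(uuid.uuid4()),  # must be globally unique
        "createTime": datetime.now(timezone.utc).isoformat(),
    }
    rec.update(optional)  # any subset of the remaining elements
    return rec

# A hypothetical per-storage-system record for a disk installation.
rec = make_record(
    Site="site.example.org",
    StorageSystem="dcache1",
    StorageType="disk",
    UsedSpace=10_000_000_000,
    FileCount=4321,
)
```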
7. Name Issues
Someone brought up that SAR is not the most fortunate name. It sounds a bit like SARS, and the phonetic sound is apparently close to a not-too-fortunate Hungarian word.
We could consider SR (storage record) or SUR (storage usage record). This really doesn't matter much.
8. Sharing Elements with Usage Record Standard
Some of the elements are identical in both name and semantics to the ones in the usage record. We do not suggest sharing the elements as such (same namespace), as it would make the standard rely on the UR standard, and hence make it less self-contained. The UR standard is only used in a few systems and is likely to be replaced with a new standard at some point. Furthermore, the implementation gains of sharing the names are very small, if they exist at all.
We have definitely missed something in this, but we hope this can be a start for the discussion in Brussels. If you see problems or issues with this record, please let us know.
Best regards, Henrik Thostrup Jensen & Jon Kerr Nilsen
-- ur-wg mailing list ur-wg@ogf.org http://www.ogf.org/mailman/listinfo/ur-wg
--
Dr. John Alan Kennedy
Rechenzentrum Garching (RZG)
Boltzmannstrasse 2, 85748 Garching
Mail: jkennedy@rzg.mpg.de
Phone: +49 89 3299 2694
Fax: +49 89 3299 1301
Hi John

On Wed, 13 Oct 2010, john alan kennedy wrote:
I think this is a really nice step forward.
Thanks.
It would be nice to see how it fits with other (non-grid) storage systems.
Yes. With the exception of the VO information, most of it should be usable, especially with fields for local user name and groups, though this might be an oversimplification in some places. It is certainly also possible to imagine systems where the record would not be usable, but the question is more what to create records for, and how, in systems which have a more dynamic structure and perhaps less clear ownership of data.

Last week we had a meeting in the EMI SAR group, and it became very clear that one of the most important things in creating this record format is a vision of what should be in the record and what shouldn't.

More specifically, we got into a discussion of whether the record should just report used/reserved/available storage, or whether it should include transfer statistics, and if so, how detailed. Having transfer statistics for a storage element (or per storage partition) could be very interesting. Having them per file could be even more interesting, but would create a gigantic record or a very large number of them.
Regarding creating a stand-alone storage accounting record, I think there are good and bad points here.
Good: It may allow you to get a prototype with less delay, which you could then use, feeding your experience back into the evolution of a full standard. Maybe it is even enough for your EMI work.
Bad: Splitting away from the UR standard causes issues when you want to combine compute/storage accounting. The initial idea from OGF, dating back to OGF 21, was that the UR would be formed from different elements e.g. Compute/Storage/Network. The record would effectively aggregate the different parts of the accounting information. I attached the UR2 talk from Donal Fellows at OGF21, I think the UR2 Zoo slide is a nice vision to have.
You are right about both. Jon and I considered this when we created the suggestion, but decided against creating something unified, as it would most likely push out the schedule significantly. EMI needs a record format at the end of this year. Getting UR2 ready in two months is probably not going to happen.
Now maybe this is doable and maybe it's a pain but I think it'd be nice to try. Much of the information regarding the users/vo(community)/site would be the same and would fit into the core.
Yes. An idea that surfaced at the meeting was to create a separate standard for an "identity block" and use it in the storage record. Furthermore, this identity block could be adopted by the UR standard, creating a UR 1.1 standard. This would fix what is probably the greatest Achilles heel of the UR standard: that it doesn't provide a good way to describe VO information. Anyway, it is an idea.
One other (smaller) issue I had was replicas. In some storage systems you can have the same file replicated multiple times to improve access; dCache can do this, and I would be surprised if iRODS couldn't do it too. How do you think we should account for this: simply treat the files as separate, or flag this storage as a different type? It may not be so pressing quite yet, since I don't think it is done that frequently, so it could be a very small fraction of the storage.
The suggestion should be able to describe this using the "StorageClass" attribute, which can denote that something is a replica. However, it is not always easy to say what is an original and what is a replica (and gets marked as such). This is somewhat similar to reserved space, which could be freed, but is somehow occupied. Something that would also be nice to know is whether the replica is there for fault tolerance or because of high traffic to the file (i.e., why the file is replicated).
We should also start to think about what is required on the backend from the storage solution providers to allow us to gather and use the accounting info you want. I think many systems would be able to answer "what is in your system now", but I am not sure how it is when we ask "what was in your system between X and Y in 2007". I think you have good contacts with several providers and can ask, right? This also opens up the question of what we account for: bytes vs byte-minutes? If you are really talking about an integration over a time period, that is what it would amount to. We had some discussion regarding the snapshot/integration question, and I think we may have some more :-)
I have a (very) good connection to a dCache developer: in dCache, it is not possible to ask how much a person/project used at a certain time. However, there are very detailed logs of how much has been written, read, and deleted per pool. It is, however, written to log files, which must be parsed in order to acquire the information. For inquiries about other storage systems, Paul Millar from the EMI SAR group is probably the best person to poke, as he has contacts with the groups and is collecting requirements from them.
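John's bytes vs byte-minutes question amounts to integrating over the record intervals; a minimal sketch (illustrative field names from the draft, assuming usage is constant within each interval, which is exactly the discretisation the draft proposes):

```python
from datetime import datetime

def byte_hours(records):
    """Integrate UsedSpace over the record intervals, in byte-hours.

    Each record's UsedSpace is assumed constant from StartTime to
    EndTime; finer intervals give a better approximation.
    """
    total = 0.0
    for rec in records:
        hours = (rec["EndTime"] - rec["StartTime"]).total_seconds() / 3600
        total += rec["UsedSpace"] * hours
    return total

# Two hypothetical per-day records: 24 h at 1000 bytes, 24 h at 2000.
recs = [
    {"StartTime": datetime(2010, 10, 1), "EndTime": datetime(2010, 10, 2),
     "UsedSpace": 1000},
    {"StartTime": datetime(2010, 10, 2), "EndTime": datetime(2010, 10, 3),
     "UsedSpace": 2000},
]
```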
As I said - just first quick thoughts.
Thanks for the feedback. We now have more open questions :-).

Best regards, Henrik

Henrik Thostrup Jensen <htj at ndgf.org>
Software Developer, Nordic Data Grid Facility. WWW: www.ndgf.org
Hi Henrik,
It would be nice to see how it fits with other (non-grid) storage systems.
Yes. With the exception of the VO information, most of it should be usable, especially with fields for local user name and groups, though this might be an oversimplification in some places.
It is certainly also possible to imagine systems where the record would not be usable, but the question is more what to create records for, and how, in systems which have a more dynamic structure and perhaps less clear ownership of data.
In those cases maybe the owner wouldn't be present, but the storage area, or something like that, would be. Somehow, someone is paying for the space utilized. It could be problematic if storage is shared between multiple groups without clear boundaries.
Last week we had a meeting in the EMI SAR group, and it became very clear that one of the most important things in creating this record format is a vision of what should be in record and what shouldn't be.
More specifically, we got into a discussion of whether the record should just report used/reserved/available storage, or whether it should include transfer statistics, and if so, how detailed. Having transfer statistics for a storage element (or per storage partition) could be very interesting. Having them per file could be even more interesting, but would create a gigantic record or a very large number of them.
I think it would be very interesting to also have the possibility to include transfer details. It's true that it could lead to a huge number of records, but, also in this case, they could be aggregated on a per-file, per-area, or per-group basis.
Regarding creating a stand-alone storage accounting record, I think there are good and bad points here.
Good: It may allow you to get a prototype with less delay, which you could then use, feeding your experience back into the evolution of a full standard. Maybe it is even enough for your EMI work.
Bad: Splitting away from the UR standard causes issues when you want to combine compute/storage accounting. The initial idea from OGF, dating back to OGF 21, was that the UR would be formed from different elements e.g. Compute/Storage/Network. The record would effectively aggregate the different parts of the accounting information. I attached the UR2 talk from Donal Fellows at OGF21, I think the UR2 Zoo slide is a nice vision to have.
You are right about both. Jon and I considered this when we created the suggestion, but decided against creating something unified, as it would most likely push out the schedule significantly. EMI needs a record format at the end of this year. Getting UR2 ready in two months is probably not going to happen.
At least, as I was suggesting, it could use, where possible, the same names and descriptions for fields that have the same meaning. If EMI decides to proceed with a separate UR, this might at least help compatibility with, say, a UR2 that includes computation and storage.
Now maybe this is doable and maybe it's a pain but I think it'd be nice to try. Much of the information regarding the users/vo(community)/site would be the same and would fit into the core.
Yes. An idea that surfaced at the meeting was to create a separate standard for an "identity block" and use it in the storage record. Furthermore, this identity block could be adopted by the UR standard, creating a UR 1.1 standard. This would fix what is probably the greatest Achilles heel of the UR standard: that it doesn't provide a good way to describe VO information. Anyway, it is an idea.
One other (smaller) issue I had was replicas. In some storage systems you can have the same file replicated multiple times to improve access; dCache can do this, and I would be surprised if iRODS couldn't do it too. How do you think we should account for this: simply treat the files as separate, or flag this storage as a different type? It may not be so pressing quite yet, since I don't think it is done that frequently, so it could be a very small fraction of the storage.
The suggestion should be able to describe this using the "StorageClass" attribute, which can denote that something is a replica. However, it is not always easy to say what is an original and what is a replica (and gets marked as such). This is somewhat similar to reserved space, which could be freed, but is somehow occupied. Something that would also be nice to know is whether the replica is there for fault tolerance or because of high traffic to the file (i.e., why the file is replicated).
I think that replicas, as you said, might be well described with the "StorageClass" attribute. Then it is up to the site to decide what to do with them. I would say that replicas are like normal files, so they should be treated as such, but it's left to the site to decide. Depending on which storage they are found in, they can also be treated differently. Also, the reason why the replica exists is, I think, not a concern of the UR. Maybe it will help to have information about accesses to the file, so that the existence of the replicas can become apparent or not.
We should also start to think about what is required on the backend from the storage solution providers to allow us to gather and use the accounting info you want. I think many systems would be able to answer "what is in your system now", but I am not sure how it is when we ask "what was in your system between X and Y in 2007". I think you have good contacts with several providers and can ask, right? This also opens up the question of what we account for: bytes vs byte-minutes? If you are really talking about an integration over a time period, that is what it would amount to. We had some discussion regarding the snapshot/integration question, and I think we may have some more :-)
I have a (very) good connection to a dCache developer:
In dCache, it is not possible to ask how much a person/project used at a certain time. However, there are very detailed logs of how much has been written, read, and deleted per pool. It is, however, written to log files, which must be parsed in order to acquire the information.
I remember I've read that dCache could also write that information to a database. If this is true, it might be even easier to extract the information required to create a UR.

Andrea
For inquiries about other storage systems, Paul Millar from the EMI SAR group is probably the best person to poke as he has contacts for the groups and is collecting requirements from them.
As I said - just first quick thoughts.
Thanks for the feedback. We now have more open questions :-).
Best regards, Henrik
Software Developer, Henrik Thostrup Jensen <htj at ndgf.org> Nordic Data Grid Facility. WWW: www.ndgf.org -- ur-wg mailing list ur-wg@ogf.org http://www.ogf.org/mailman/listinfo/ur-wg
-- Andrea Cristofori INFN-CNAF Viale Berti Pichat 6/2 40127 Bologna Italy Tel. : +39-051-6092920 Skype: andrea-cnaf
Guys, there is a workshop today on the use of GPGPUs in EGI. I am looking at accounting. Has anyone considered how we include this in UR2? Options I thought of are:
1. Extend the ComputeUsageBlock - add a second field for CPU used on the attached processor. This would also need some way to identify the type. Currently we have a count of processors and the sum of CPU use. We can't add CPU+GPU, so we would need to duplicate several fields.
2. Repeat the ComputeUsageBlock - there is no identifier that would distinguish between two instances of the block. If there were, then we could have two instances, one for CPU and one for GPU. Fields like CPU count and total CPU would be different in each and could be treated separately or together, because they are recorded together in the same UR. Another problem is that we do not identify the type of processor, so we could not parse records to distinguish between multiple blocks.
3. A new block, AttachedProcUsageBlock.
Any other thoughts? John -- Scanned by iCritical.
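John's option 2 could be sketched like this - note that the `ProcessorType` attribute and the element names are hypothetical, since UR currently has no such identifier (which is precisely the gap being discussed):

```python
import xml.etree.ElementTree as ET

def usage_record_with_gpu(cpu_seconds, gpu_seconds):
    """Build a record with two compute-usage blocks, one per processor
    type.  Both the ComputeUsageBlock shape and the ProcessorType
    attribute are hypothetical sketches, not part of UR 2.0."""
    record = ET.Element("UsageRecord")
    for ptype, seconds in (("CPU", cpu_seconds), ("GPU", gpu_seconds)):
        block = ET.SubElement(record, "ComputeUsageBlock",
                              {"ProcessorType": ptype})
        ET.SubElement(block, "Duration").text = str(seconds)
    return record
```

With a type identifier in place, a consumer can treat the two blocks separately or sum them, which is the flexibility option 2 is after.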
John,
We will also need to consider the case of systems deployed with the Intel Phi co-processor, which currently has 62 processors on the card. Applications can run on anywhere from 0 to 62 processors.
-Keith.
At 10:36 AM +0000 4/10/13,
Guys, there is a workshop today on use of GPGPU in EGI. I am looking at accounting. Has anyone considered how we include this in UR2?
Options I thought of are:
1. Extend the ComputeUsageBlock - add a second field for CPU used on the attached processor. This would also need some way to identify the type. Currently we have a count of processors and the sum of CPU use. We can't add CPU+GPU, so we would need to duplicate several fields.
2. Repeat the ComputeUsageBlock - there is no identifier that would distinguish between two instances of the block. If there were, then we could have two instances, one for CPU and one for GPU. Fields like CPU count and total CPU would be different in each and could be treated separately or together, because they are recorded together in the same UR. Another problem is that we do not identify the type of processor, so we could not parse records to distinguish between multiple blocks.
3. A new block AttachedProcUsageBlock
Any other thoughts?
John
Hi John, Unfortunately I am only reading this email now. From an organizational point of view, I think we could start the discussion of this topic and then maybe work on a UR 2.1 definition; 2.0 was sent for public comment one month ago, so I think that version should continue on its track. Among the options you are proposing, I would choose one of the first two, but I have to think a bit more about it. Maybe we can discuss it during the next phone meeting. If we keep our schedule, it should be next Tuesday, the 16th of April, at 15:00. As some of us are in fact busy at EGI CF 2013, I would suggest fixing the next phone meeting on either the 23rd or the 30th of April at the same time. What are your preferences? Regards, Andrea On 10/04/2013 12:36, john.gordon@stfc.ac.uk wrote:
Guys, there is a workshop today on use of GPGPU in EGI. I am looking at accounting. Has anyone considered how we include this in UR2?
Options I thought of are:
1. Extend the ComputeUsageBlock - add a second field for CPU used on the attached processor. This would also need some way to identify the type. Currently we have a count of processors and the sum of CPU use. We can't add CPU+GPU, so we would need to duplicate several fields.
2. Repeat the ComputeUsageBlock - there is no identifier that would distinguish between two instances of the block. If there were, then we could have two instances, one for CPU and one for GPU. Fields like CPU count and total CPU would be different in each and could be treated separately or together, because they are recorded together in the same UR. Another problem is that we do not identify the type of processor, so we could not parse records to distinguish between multiple blocks.
3. A new block AttachedProcUsageBlock
Any other thoughts?
John
Hi Henrik, Following are some of my personal comments, which we may discuss, agree or disagree with, etc...
1. Discrete vs continuous
First we discussed how storage differs from jobs in the sense that a job is a discrete unit, whereas storage has a more continuous nature, where the usage varies more or less constantly (bytes used), but typically on a relatively small scale.
A storage record can, however, only describe something discrete, so the continuous nature of storage will have to be split into discrete pieces (similar to integration). Of course the granularity of this process should not be dictated by the standard. The first suggestion is then to have a start and an end time for the interval over which the storage measurement applies. This allows creating per-day or per-hour resolution, or something else entirely, depending on the needs.
Result: "StartTime" and "EndTime" elements describing the time interval for which the record is valid. Both are DateTime values.
In the discussion that we had in Munich at OGF we discussed something like this, and I think it is the right approach. Even if we account for a single file, it should be possible to create a UR that specifies a start and an end date, so that you can calculate the integral you mentioned, even if the files are still present. The site could, for example, create those records regularly.
2. What to measure
The most obvious metric here is the amount of used space. Another metric is the amount of reserved space, i.e., space not taken by actual files/objects, but which cannot be used by other parties. A third metric is the amount of space which is not allocated and can be used. Initially we called this "free space", but this term is not exactly accurate, as it refers more to a file system metric, which can be very different from what can actually be used due to quotas. Instead we chose to have an element describing how much unallocated space there is, i.e., how much space one can expect to be able to use. The actual free space is a metric which can be very tricky to measure, and is typically only of interest to the storage system administrator.
Here I'm not so sure that we really need to define attributes for a record that describes free space. In fact, what we want to know is only the used space over time. System administrators are then free to analyse the URs and compare them with the available space. I think that the UR should focus on the used space, not on the available space.
A secondary issue is how to report the numbers. We quickly decided on using bytes, as it is the fundamental unit, and it saves us from deciding whether to use 1000 or 1024 bytes as a base. Having multiple options (say KB, MB) for reporting would complicate the standard unnecessarily without any real benefit, so we suggest keeping it to bytes, and bytes only. This also ensures that the number is always an integer (with KB and up, floats would be needed to be exact) and saves silly conversion routines when parsing the record.
I agree.
We briefly considered whether reporting should be per file, but this was quite quickly shot down, as it would make the records unreasonably large without providing any real value. We did end up with an element describing the number of files that use the reported space.
Result: "UsedSpace", "ReservedSpace", and "UnallocatedSpace" metrics describing how much space is used, reserved, and available for use. Reserved and unallocated probably do not overlap (as reserved space is technically used). The measurements are in bytes. "FileCount" describes the number of files using the space.
I wouldn't discard per-file accounting, as I think it's the only way to account precisely for the used space (we have also had some requests on this point, where people are interested in knowing how often a specific file is accessed). I agree that it can make the number of records huge, but this problem might be solved using aggregated records, where you lose the per-file detail. This is not something that must be done, but I think giving the choice to also account per file is important. I agree that reserved space might be accounted for, because even if it isn't really used, it might be "locked".
3. Site and Storage Concepts
The issue of how to describe the site and the storage where the accounted data is stored is perhaps the most complex issue in defining this format. The discussion to achieve this was very non-linear, so I will just describe the result:
A site is considered a top-level container for storage. The site name should be globally unique (which probably means an FQDN). A storage system is an independent system on a site. A storage system partition is a part of a storage system (similar to dCache pools). A storage type describes the type of storage on which the data is stored (disk, tape).
None of these elements are mandatory (though we couldn't find a valid use case for a record without a site). E.g., it is perfectly valid to use just the site, site and storage type, or site and storage system. If multiple of the elements exist, they are considered to have a hierarchical, tree-like structure, i.e., site -> storage system -> storage system partition -> storage type. How to structure this is described in the "Record Structure" section.
This allows a site with a simple setup to report just for the site, whereas more complex installations can report per storage system, including how much is placed on tape and disk respectively for different storage systems.
A single institution can easily report multiple sites; we do not interfere in this. But it is very likely that a site would have several storage elements/systems and would like to be able to aggregate them under a single site.
Finally, we also add the possibility of a storage class, which describes the class of storage accounted for. This could be "precious", "deletable", "pinned", etc. It is not considered a part of the hierarchy described above.
4. Identity
To describe who is using the space, an identity block or group, similar to the one in the usage record, must be supported. As with the usage record, it must be possible to specify both a local user name and a global user identity. However, the common case is likely to be a VO or a VO group, so being able to describe this is extremely important. Another use case is a local group at the site which owns the data (not everyone uses grid). How exactly to describe the virtual organization is not quite clear, but the following elements are needed as a base: VO name, VO issuer (DN of the VO issuer, somewhat VOMS specific), VO group, and VO role. There might be use cases for having multiple VO blocks (though I suspect that will be messy).
Resulting elements: VO Name, VO Issuer, VO Group, VO Role, Global User Identity, Local User Name, Local User Group.
All elements are strings. The Global User Identity is to be considered globally unique. Furthermore, the combination of VO issuer and VO name should be globally unique.
In general I agree with these points. Maybe the names could be more general, without referring to VOs, which are perhaps too Grid-specific. To identify a user I would use the same system used in the current UR, to keep everything uniform.
5. Record Structure
Having identified the identity block and the site/storage structure, we tried to find a good way to structure the individual records. We considered the possibility of having a tree structure in the storage accounting record, with the site, storage system, storage system partition, and storage type as possible levels in the tree. The leaves would then each have an identity and space usage block. This, however, would be quite complicated to construct and parse, so a simpler - more flat - structure was quickly preferred. Here each record would have at most one site, storage system, storage system partition, and storage type (all still optional). If a system needs to report for multiple storage systems / storage system partitions / storage types, several records would have to be generated. This format should still be easy to aggregate per site or storage system to get complete numbers. The flat format is also much easier to explain, which probably means that it should be preferred. The two structures can describe exactly the same things, so there is no limitation in choosing the flat format.
I agree also with this.
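The aggregation property of the flat format can be sketched as follows - records are treated as plain dicts keyed by the draft's element names, which is an assumption about the eventual encoding, not something the draft specifies:

```python
from collections import defaultdict

def aggregate_by_site(records):
    """Sum UsedSpace (bytes) over flat records, grouped by site.

    Each record is a dict with an optional 'Site' key and a
    'UsedSpace' value in bytes, mirroring the draft's flat structure
    where each record carries at most one site."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec.get("Site")] += rec["UsedSpace"]
    return dict(totals)
```

The same one-pass grouping works at any level of the hierarchy (storage system, partition, type), which is exactly why the flat format stays easy to aggregate.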
6. Record Overview
Given the previous section, we can now present an overview of the elements in a storage accounting record. We reuse the recordId and createTime elements from the usage record standard (just the element names, not the namespaces).
StorageAccountingRecord
  recordId (considered globally unique)
  createTime (timestamp when the record was created)
  StartTime (datetime value)
  EndTime (datetime value)
  Site
  StorageSystem
  StorageSystemPartition
  StorageType
  StorageClass
  IdentityBlock
    VO Block
      VO Name
      VO Issuer
      VO Group
      VO Role
    GlobalUserIdentity
    LocalUserName
    LocalGroupName
  UsedSpace
  ReservedSpace
  UnallocatedSpace
  FileCount
The only enforced element is the recordId.
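As a rough illustration, a minimal record with only some of the optional elements filled in might be built like this (element names follow the overview above; the absence of an XML namespace, and the specific timestamp formats, are simplifications of mine):

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

def make_record(site, vo_name, used_bytes, start, end):
    """Build a minimal StorageAccountingRecord following the draft's
    element list.  Only recordId is mandatory; the rest shown here
    are optional elements chosen for the example."""
    rec = ET.Element("StorageAccountingRecord")
    ET.SubElement(rec, "recordId").text = str(uuid.uuid4())
    ET.SubElement(rec, "createTime").text = (
        datetime.now(timezone.utc).isoformat())
    ET.SubElement(rec, "StartTime").text = start
    ET.SubElement(rec, "EndTime").text = end
    ET.SubElement(rec, "Site").text = site
    identity = ET.SubElement(rec, "IdentityBlock")
    vo = ET.SubElement(identity, "VOBlock")
    ET.SubElement(vo, "VOName").text = vo_name
    ET.SubElement(rec, "UsedSpace").text = str(used_bytes)  # bytes only
    return rec
```

Note how little is needed for the simple case: a site, an owner, an interval, and a byte count.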
7. Name Issues
Someone brought up that SAR is not the most fortunate name. It sounds a bit like SARS, and the phonetic sound is apparently close to a not-too-fortunate Hungarian word.
We could consider SR (storage record) or SUR (storage usage record). This really doesn't matter much.
8. Sharing Elements with Usage Record Standard
Some of the elements are identical in both name and semantics to the ones in the usage record. We do not suggest sharing the elements as such (same namespace), as it would make the standard rely on the UR standard, and hence make it less self-contained. The UR standard is only used in a few systems, and is likely to be replaced with a new standard at some point. Furthermore, the implementation gains of sharing the names are very small, if they exist at all.
Even if it's decided to have two separate URs, one for resources used and one for storage, I think that, where possible, we should reuse the same naming conventions - like, for example, "Global User Identity", "Local User", etc. Andrea
We have definitely missed something in this, but we hope this can be a start for the discussion in Brussels. If you see problems or issues with this record please let us know.
Best regards, Henrik Thostrup Jensen & Jon Kerr Nilsen
-- Andrea Cristofori INFN-CNAF Viale Berti Pichat 6/2 40127 Bologna Italy Tel. : +39-051-6092920 Skype: andrea-cnaf
I know this is a bit last minute but here is my feedback. This mail might save some time tomorrow. John
-----Original Message----- From: ur-wg-bounces@ogf.org [mailto:ur-wg-bounces@ogf.org] On Behalf Of Henrik Thostrup Jensen Sent: 07 October 2010 12:53 To: ur-wg@ogf.org Subject: [UR-WG] Draft/suggestion for Storage Accounting Record (2010/10/07)
Hi
(this is sent both to the OGF UR list and the EMI SAR list, sorry for duplicates).
Jon and I happened to be at the same meeting this week, so we took a couple of hours out of the programme to create a draft suggestion for a storage accounting record. The suggestion is mainly meant as a way to kick off the meeting in Brussels.
1. Discrete vs continuous
First we discussed how storage differs from jobs in the sense that a job is a discrete unit, whereas storage has a more continuous nature, where the usage varies more or less constantly (bytes used), but typically on a relatively small scale.
A storage record can, however, only describe something discrete, so the continuous nature of storage will have to be split into discrete pieces (similar to integration). Of course the granularity of this process should not be dictated by the standard. The first suggestion is then to have a start and an end time for the interval over which the storage measurement applies. This allows creating per-day or per-hour resolution, or something else entirely, depending on the needs.
Result: "StartTime" and "EndTime" elements describing the time interval for which the record is valid. Both are DateTime values.
I don't understand. A measurement is taken at a single point in time. If I look at the space used in a storage system (say a filesystem), what information do I have on the period of validity of the measurement? I suppose if one is creating a summary record in place of many single measurements, then one could understand the meaning of start time and end time, but then one would have to think about what one was doing with the other quantities measured - min, max, average over the time span?
2. What to measure
The most obvious metric here is the amount of used space. Another metric is the amount of reserved space, i.e., space not taken by actual files/objects, but which cannot be used by other parties. A third metric is the amount of space which is not allocated and can be used. Initially we called this "free space", but this term is not exactly accurate, as it refers more to a file system metric, which can be very different from what can actually be used due to quotas. Instead we chose to have an element describing how much unallocated space there is, i.e., how much space one can expect to be able to use. The actual free space is a metric which can be very tricky to measure, and is typically only of interest to the storage system administrator.
A secondary issue is how to report the numbers. We quickly decided on using bytes, as it is the fundamental unit, and it saves us from deciding whether to use 1000 or 1024 bytes as a base. Having multiple options (say KB, MB) for reporting would complicate the standard unnecessarily without any real benefit, so we suggest keeping it to bytes, and bytes only. This also ensures that the number is always an integer (with KB and up, floats would be needed to be exact) and saves silly conversion routines when parsing the record.
Don't you risk overflowing an integer in some systems?
We briefly considered whether reporting should be per file, but this was quite quickly shot down, as it would make the records unreasonably large without providing any real value. We did end up with an element describing the number of files that use the reported space.
I presume you mean the number of records would be large, not the individual records?
Result: "UsedSpace", "ReservedSpace", and "UnallocatedSpace" metrics describing how much space is used, reserved, and available for use. Reserved and unallocated probably do not overlap (as reserved space is technically used). The measurements are in bytes. "FileCount" describes the number of files using the space.
Filecount should be optional.
3. Site and Storage Concepts
The issue of how to describe the site and the storage where the accounted data is stored is perhaps the most complex issue in defining this format. The discussion to achieve this was very non-linear, so I will just describe the result:
A site is considered a top-level container for storage. The site name should be globally unique (which probably means an FQDN).
What is the FQDN of your site? Isn't an FQDN a host? In EGI we have a site name defined in GOCDB which is also used in the GLUE schema. I assume other users of the GLUE schema have an equivalent. I know OGF does.
A storage system is an independent system on a site. A storage system partition is a part of a storage system (similar to dCache pools). A storage type describes the type of storage on which the data is stored (disk, tape).
WLCG uses Storage Area. Definition: The Glue Storage Area (SA) class describes a logical view of a portion of physical space that can include disk and tape resources. SAs MAY overlap. Shared portions of storage MUST be represented with a single GlueSA object, with multiple GlueSAAccessControlBaseRule attributes and optionally with multiple VOInfo objects pointing to it.
None of these elements are mandatory (though we couldn't find a valid use case for a record without a site). E.g., it is perfectly valid to use just the site, site and storage type, or site and storage system. If multiple of the elements exist, they are considered to have a hierarchical, tree-like structure, i.e., site -> storage system -> storage system partition -> storage type. How to structure this is described in the "Record Structure" section.
This allows a site with a simple setup to report just for the site, whereas more complex installations can report per storage system, including how much is placed on tape and disk respectively for different storage systems.
A single institution can easily report multiple sites; we do not interfere in this. But it is very likely that a site would have several storage elements/systems and would like to be able to aggregate them under a single site.
Finally, we also add the possibility of a storage class, which describes the class of storage accounted for. This could be "precious", "deletable", "pinned", etc. It is not considered a part of the hierarchy described above.
4. Identity
To describe who is using the space, an identity block or group, similar to the one in the usage record, must be supported. As with the usage record, it must be possible to specify both a local user name and a global user identity. However, the common case is likely to be a VO or a VO group, so being able to describe this is extremely important. Another use case is a local group at the site which owns the data (not everyone uses grid). How exactly to describe the virtual organization is not quite clear, but the following elements are needed as a base: VO name, VO issuer (DN of the VO issuer, somewhat VOMS specific), VO group, and VO role. There might be use cases for having multiple VO blocks (though I suspect that will be messy).
I think FQAN is the term for this. I don't see the need for VO Issuer. VO names should be unique in any infrastructure, and now that we typically register fully qualified VO names, they should be globally unique.
Resulting elements: VO Name, VO Issuer, VO Group, VO Role, Global User Identity, Local User Name, Local User Group.
All elements are strings. The Global User Identity is to be considered globally unique. Furthermore, the combination of VO issuer and VO name should be globally unique.
5. Record Structure
Having identified the identity block and the site/storage structure, we tried to find a good way to structure the individual records. We considered the possibility of having a tree structure in the storage accounting record, with the site, storage system, storage system partition, and storage type as possible levels in the tree. The leaves would then each have an identity and space usage block. This, however, would be quite complicated to construct and parse, so a simpler - more flat - structure was quickly preferred. Here each record would have at most one site, storage system, storage system partition, and storage type (all still optional). If a system needs to report for multiple storage systems / storage system partitions / storage types, several records would have to be generated. This format should still be easy to aggregate per site or storage system to get complete numbers. The flat format is also much easier to explain, which probably means that it should be preferred. The two structures can describe exactly the same things, so there is no limitation in choosing the flat format.
6. Record Overview
Given the previous section, we can now present an overview of the elements in a storage accounting record. We reuse the recordId and createTime elements from the usage record standard (just the element names, not the namespaces).
StorageAccountingRecord
  recordId (considered globally unique)
  createTime (timestamp when the record was created)
  StartTime (datetime value)
  EndTime (datetime value)
  Site
  StorageSystem
  StorageSystemPartition
  StorageType
  StorageClass
  IdentityBlock
    VO Block
      VO Name
      VO Issuer
      VO Group
      VO Role
    GlobalUserIdentity
    LocalUserName
    LocalGroupName
  UsedSpace
  ReservedSpace
  UnallocatedSpace
  FileCount
The only enforced element is the recordId.
7. Name Issues
Someone brought up that SAR is not the most fortunate name. It sounds a bit like SARS, and the phonetic sound is apparently close to a not-too-fortunate Hungarian word.
We could consider SR (storage record) or SUR (storage usage record). This really doesn't matter much.
8. Sharing Elements with Usage Record Standard
Some of the elements are identical in both name and semantics to the ones in the usage record. We do not suggest sharing the elements as such (same namespace), as it would make the standard rely on the UR standard, and hence make it less self-contained. The UR standard is only used in a few systems, and is likely to be replaced with a new standard at some point. Furthermore, the implementation gains of sharing the names are very small, if they exist at all.
Since there is an existing UR, I do not see a problem using the same names where the meaning is shared. This would not tie us to syncing with any new versions of the UR.
We have definitely missed something in this, but we hope this can be a start for the discussion in Brussels. If you see problems or issues with this record please let us know.
Best regards, Henrik Thostrup Jensen & Jon Kerr Nilsen
Hi John. On Wed, 27 Oct 2010, john.gordon@stfc.ac.uk wrote:
Result: "StartTime" and "EndTime" elements describing the time interval for which the record is valid. Both are DateTime values.
I don't understand. A measurement is taken at a single point in time. If I look at the space used in a storage system (say a filesystem) What information do I have on the period of validity of the measurement.
I suppose if one is creating a summary record in place of many single measurements then one could understand the meaning of start time and end time but then one would have to think about what one was doing with the other quantities measured - min, max, average over the time span?
Well, you are right that these things are measured at a point in time. The idea with start and end time was to allow arbitrary time resolution, but this is possible with timestamps as well. Performing multiple samples to create min, max, and avg would also be possible, but the gains are fairly low IMO. I don't really have any strong opinions on which way to go here. Both should work fine.
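For what it's worth, the interval approach makes the integration trivial for a consumer. A sketch, assuming records are dicts carrying the draft's element names with ISO 8601 timestamps (both assumptions of mine, not the draft's):

```python
from datetime import datetime

def byte_seconds(records):
    """Integrate used space over time: the sum of UsedSpace multiplied
    by interval length, assuming usage is constant within each
    record's [StartTime, EndTime) interval."""
    total = 0
    for rec in records:
        start = datetime.fromisoformat(rec["StartTime"])
        end = datetime.fromisoformat(rec["EndTime"])
        total += rec["UsedSpace"] * (end - start).total_seconds()
    return total
```

With timestamps instead of intervals, the consumer would have to infer interval lengths from consecutive samples; the arithmetic is the same either way.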
We quickly decided on using bytes, as it is the fundamental unit, and it saves us from deciding whether to use 1000 or 1024 bytes as a base.
Don't you risk overflowing an integer in some systems?
I don't think we should worry about that in a standard. Most languages and databases support 64-bit integers; don't be afraid to use them. And if an accounting system wants to translate the number into something else internally, I'm perfectly fine with that.
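A quick sanity check on the ranges involved (Python integers are arbitrary precision, so the arithmetic here is exact): a signed 64-bit integer holds about 9.2 exabytes, so plain byte counts are safe, but a byte-seconds integral over long periods could overflow a naive int64 accumulator:

```python
INT64_MAX = 2**63 - 1   # 9223372036854775807, roughly 9.2 * 10**18

# Even a petabyte-scale byte count fits with enormous headroom:
petabyte = 10**15
assert petabyte < INT64_MAX

# But integrating an exabyte over a year's worth of seconds exceeds
# int64, which is worth keeping in mind for summed/integrated values:
exabyte = 10**18
year_seconds = 365 * 24 * 3600
assert exabyte * year_seconds > INT64_MAX
```

So the choice of bytes-only units is safe for the record fields themselves; only downstream aggregation code needs to watch its integer width.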
We briefly considered if reporting should be per file, but this was quite quickly shoot down, as it would make the records unreasonably large, without providing any real value. We did end with an element describing the number of files, which are using the space reported.
I presume you mean the number of records would be large, not the individual records?
Yes.
Result: "UsedSpace", "ReservedSpace", "UnAllocatedSpace" metrics for describing how space is used, reserved, and available for use. Reserved and unallocated are probably not overlapping (as reserved space is technically used). The measurement is in bytes. "FileCount" for describing the number of files using the space.
Filecount should be optional.
Absolutely (and so should reserved space and unallocated/available space be). Most of the suggested fields in the draft should be optional.
A site is considered a top-level container for storage. The site name should be globally unique (which probably means an FQDN).
What is the FQDN of your site? Isn't a FQDN a host?
In EGI we have a site name defined in GOCDB which is also used in the GLUE schema. I assume other users of the GLUE schema have an equivalent. I know OGF does.
Yes, an FQDN is a full hostname. The important thing is that it globally identifies your host or site. The idea was that if you had a host name like spacebucket1.mysite.org, you could use mysite.org. This would enable numbers from different SEs on a site to be added together. Perhaps it should be considered to split this into two fields: site and host.
A storage system is an independent system on a site. A storage system partition is a part of a storage system (similar to dCache pools). A storage type describes the type of storage on which the data is stored (disk, tape).
WLCG uses Storage Area. Definition:
The Glue Storage Area (SA) class describes a logical view of a portion of physical space that can include disks and tape resources. SAs MAY overlap. Shared portions of storage MUST be represented with a single GlueSA object, with multiple GlueSAAccessControlBaseRule attributes and optionally with multiple VOInfo objects pointing to it.
A somewhat complex approach, but it might be a possible strategy.
How exactly to describe the virtual organization is not quite clear, but the following elements are needed as a base: VO name, VO issuer (DN of the VO issuer, somewhat VOMS specific), VO group, and VO role. There might be use cases for being able to have multiple VO blocks (though I suspect that will be messy).
I think FQAN is the term for this. I don't see the need for VO Issuer. VO names should be unique in any infrastructure, and now that we typically register fully qualified VO names, they should be globally unique.
"Typically" is not quite good enough here. Nothing prevents two VOMS servers from creating VOs with the same name, so I think the issuer is absolutely necessary. It won't be needed in most cases, but if a clash does happen, you want it. I think we should also try to make it usable by non-VOMS users as a way of globally identifying projects.
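Henrik's point can be illustrated with a tiny sketch: if two VOMS servers happen to host VOs with the same name, only the (issuer, name) pair tells them apart. The field names follow the draft's elements; the DNs below are invented for the example:

```python
def vo_key(record):
    """Globally unique VO identifier: the (issuer, name) pair.
    The record fields follow the draft's element names; VOIssuer is
    the DN of the issuing VOMS server."""
    return (record.get("VOIssuer"), record["VOName"])

# Two VOs with the same name on different (invented) VOMS servers:
a = {"VOIssuer": "/DC=org/DC=grid-a/CN=voms.site-a.org", "VOName": "atlas"}
b = {"VOIssuer": "/DC=org/DC=grid-b/CN=voms.site-b.org", "VOName": "atlas"}

# The name alone collides; the (issuer, name) key does not:
assert a["VOName"] == b["VOName"]
assert vo_key(a) != vo_key(b)
```

Non-VOMS users could put any stable project-registry identifier in the issuer slot and get the same global-uniqueness guarantee.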
8. Sharing Elements with Usage Record Standard
Some of the elements are identical in both name and semantics to the ones in the usage record. We do not suggest sharing the elements as such (same namespace), as it would make the standard rely on the UR standard, and hence make it less self-contained. The UR standard is only used in a few systems, and is likely to be replaced with a new standard at some point. Furthermore, the implementation gains of sharing the names are very small, if they exist at all.
Since there is an existing UR, I do not see a problem using the same names where the meaning is shared. This would not tie us to syncing with any new versions of the UR.
Not quite sure I'm following here. Are you suggesting just reusing the element names where it makes sense (which I'm perfectly fine with), or reusing the QNames (namespace + element name) from UR? The latter just seems an unnecessary complication. Thanks for the feedback. Hopefully people will have time to read it before the meeting. Best regards, Henrik. Software Developer, Henrik Thostrup Jensen <htj at ndgf.org> Nordic Data Grid Facility. WWW: www.ndgf.org
Participants (5):
- Andrea Cristofori
- Henrik Thostrup Jensen
- john alan kennedy
- john.gordon@stfc.ac.uk
- Keith Chadwick