
Hi Henrik,

I think this is a really nice step forward. I have a few quick comments, but I hope to have a more detailed look at some point before the next OGF meeting (which I am sorry to say I cannot attend).

It would be nice to see how it fits with other (non-grid) storage systems.

Regarding creating a stand-alone storage accounting record: I think there are good and bad points here.

Good: It may allow you to get a prototype with less delay, which you could then use, feeding your experience back into the evolution of a full standard. Maybe it is even enough to do this for your EMI work.

Bad: Splitting away from the UR standard causes issues when you want to combine compute/storage accounting. The initial idea from OGF, dating back to OGF 21, was that the UR would be formed from different elements, e.g. Compute/Storage/Network. The record would effectively aggregate the different parts of the accounting information. I attached the UR2 talk from Donal Fellows at OGF21; I think the UR2 Zoo slide is a nice vision to have. Now maybe this is doable and maybe it's a pain, but I think it'd be nice to try. Much of the information regarding the users/VO (community)/site would be the same and would fit into the core.

One other (smaller) issue that I had was replicas. In some storage systems you can have the same file replicated multiple times to improve access; dCache can do this and I would be surprised if iRODS couldn't do it too. How do you think we should account for this: simply treat the files as separate, or flag this storage as a different type? It may not be so pressing quite yet, since I don't think it is done so frequently and so could be a very small fraction of the storage.

We should also start to think about what is required on the backend from the storage solution providers to allow us to gather/use the accounting info you want. I think many systems would be able to give an answer to "what is in your system now", but I am not sure how it is when we ask "what was in your system between X and Y 2007". I think you have good contact with several providers and can ask, right?

This also opens up the question of what we account for: bytes vs. byte-mins? If you are really talking about an integration over a time period, this is what it would amount to. We had some discussion regarding the snapshot/integration, and I think we may have some more :-)

As I said, just first quick thoughts.

cheers
johnk

On 10/07/2010 01:52 PM, Henrik Thostrup Jensen wrote:
Hi
(this is sent both to the ogf ur list and the emi sar list, sorry for duplicates).
Jon and I happened to be at the same meeting this week, so we took a couple of hours out of the programme to create a draft suggestion for a storage accounting record. The suggestion is mainly meant as a way to kick off the meeting in Brussels.
1. Discrete vs continuous
First we discussed how storage differs from jobs in the sense that a job is a discrete unit whereas storage has a more continuous nature, where the usage (bytes used) varies more or less constantly, but typically on a relatively small scale.
A storage record can however only describe something discrete, so the continuous nature of storage will have to be split into something discrete (similar to integration). Of course the granularity of this process should not be dictated by the standard. The first suggestion is then to have a start and an end time for the interval over which a storage measurement is taken. This allows per-day or per-hour resolution, or something else, depending on the needs.
Result: "StartTime" and "EndTime" element for describing the time interval for when the record is valid. Both of them are DateTime values.
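A minimal sketch of the splitting described above, assuming the storage system can be sampled at arbitrary times. The sample list, the function name make_interval_records and the dict layout are hypothetical and not part of the proposal. The value reported per interval is here the time-weighted average of used bytes, which is only one possible choice; a plain snapshot taken at EndTime would be another.

```python
from datetime import datetime, timedelta

def make_interval_records(samples, start, end, step):
    """Split [start, end) into step-sized intervals and report, per interval,
    the time-weighted average of used bytes.

    samples: list of (datetime, used_bytes), sorted by time. A sample value
    is assumed to hold until the next sample (step function). Time before
    the first sample counts as zero usage.
    """
    records = []
    t0 = start
    while t0 < end:
        t1 = min(t0 + step, end)
        byte_seconds = 0.0
        for i, (ts, used) in enumerate(samples):
            # A sample is valid from its own timestamp to the next sample's.
            nxt = samples[i + 1][0] if i + 1 < len(samples) else end
            lo, hi = max(ts, t0), min(nxt, t1)
            if hi > lo:
                byte_seconds += used * (hi - lo).total_seconds()
        records.append({
            "StartTime": t0.isoformat(),
            "EndTime": t1.isoformat(),
            # Integer number of bytes, averaged over the interval.
            "UsedSpace": int(byte_seconds / (t1 - t0).total_seconds()),
        })
        t0 = t1
    return records

day = datetime(2010, 10, 7)
samples = [(day, 1000), (day + timedelta(hours=12), 3000)]
# One per-day record; pass timedelta(hours=1) for per-hour resolution.
records = make_interval_records(samples, day, day + timedelta(days=1), timedelta(days=1))
```

The granularity is just the step argument, so the same routine can produce per-day or per-hour records without the format having to prescribe either.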
2. What to measure
The most obvious metric here is the amount of used space. Another metric is the amount of reserved space, i.e., space not taken by actual files/objects, but which cannot be used by other parties. A third metric is the amount of space which is not allocated, and can be used. Initially we called this "free space", but this term is not exactly accurate as it refers more to a file system metric, which can be very different from what can actually be used due to quotas. Instead we chose to have an element describing how much unallocated space there is, i.e., how much space one can expect to be able to use. The actual free space is a metric which can be very tricky to measure, and is typically only of interest to the storage system administrator.
A secondary issue is how to report the numbers. We quickly decided on using bytes, as it is the fundamental unit, and saves us from deciding on whether to use 1000 or 1024 bytes as a base. Having multiple options (say KB, MB) for reporting would complicate the standard unnecessarily without any real benefits, so we suggest keeping it to bytes, and bytes only. This would also ensure that the number is always an integer (when using KB and up, floats would be needed to be exact) and saves silly conversion routines when parsing the record.
We briefly considered whether reporting should be per file, but this was quite quickly shot down, as it would make the records unreasonably large without providing any real value. We did end up with an element describing the number of files which are using the space reported.
Result: "UsedSpace", "ReservedSpace", "UnallocatedSpace" metrics for describing how space is used, reserved, and available for use. Reserved and unallocated are probably not overlapping (as reserved space is technically used). The measurement is in bytes. "FileCount" for describing the number of files using the space.
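A small illustration of the base ambiguity that bytes-only reporting avoids: the same "1.5 KB" figure maps to two different byte counts depending on the base, whereas an integer byte count has only one reading. The helper name kb_to_bytes is purely illustrative.

```python
def kb_to_bytes(value_kb, base):
    """Convert a KB figure to bytes for a given base (1000 or 1024)."""
    return int(value_kb * base)

decimal_reading = kb_to_bytes(1.5, 1000)  # 1500 bytes if KB means 1000 bytes
binary_reading = kb_to_bytes(1.5, 1024)   # 1536 bytes if KB means 1024 bytes
```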
3. Site and Storage Concepts
The issue of how to describe the site and on what storage the accounted data is stored is perhaps the most complex issue in defining this format. The discussion to achieve this was very non-linear, so I will just describe the result:
A site is considered a top level container for storage. The site name should be globally unique (which probably means an FQDN).
A storage system is an independent system on a site.
A storage system partition is a part of a storage system (similar to dCache pools).
A storage type describes the type of storage where the data is stored (disk, tape).
None of these elements are mandatory (though we couldn't find a valid use case for a record without a site). E.g., it is perfectly valid to use just site, site and storage type, or site and storage system. If several of the elements exist, they are considered to have a hierarchical tree-like structure, i.e., site -> storage system -> storage system partition -> storage type. How to structure this is described in the "Record Structure" section.
This allows a site with a simple setup to report just for the site, whereas more complex installations can report per storage system, and how much is placed on tape and disk respectively for different storage systems.
A single institution can easily report multiple sites; we do not interfere in this. It is however very likely that a site would have several storage elements / systems and would like to be able to aggregate them under a single site.
Finally we also add the possibility of a storage class, which describes the class of storage accounted for. This could be "precious", "deletable", "pinned", etc. It is not considered a part of the hierarchy described above.
4. Identity
To describe who is using the space, an identity block or group, similar to the one in usage record must be supported. As with usage record it must be possible to specify both a local user name and global user identity. However the common case is likely to be a VO or a VO group, so being able to describe this is extremely important. Another use case is a local group at the site who owns the data (not everyone uses grid). How exactly to describe the virtual organization is not quite clear, but the following elements are needed as a base: VO name, VO issuer (DN of the VO issuer, somewhat VOMS specific), VO group, and VO role. There might be use cases for being able to have multiple VO blocks (though I suspect that will be messy).
Resulting elements:
VO Name
VO Issuer
VO group
VO role
Global User Identity
Local User Name
Local User Group
All elements are strings. The Global User Identity is to be considered globally unique. Furthermore the combination of VO issuer and VO name should be globally unique.
5. Record Structure
Having identified the identity block and the site/storage structure, we tried to find a good way to structure the individual records. We considered the possibility of having a tree structure in the storage accounting record, with the site, storage system, storage system partition, and storage type as possible levels in the tree. The leaves would then each have an identity and space usage block. This however would be quite complicated to construct and parse, so a simpler, flatter structure was quickly preferred. Here each record would have at most one site, storage system, storage system partition, and storage type (but all still optional). If a system needs to report on multiple storage systems / storage system partitions / storage types, several records would have to be generated. Records in this format should still be easy to aggregate per site or per storage system to get complete numbers. The flat format is also much easier to explain, which probably means that it should be preferred. The two structures can describe exactly the same information, so there is no limitation in choosing the flat format.
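A minimal sketch of how flat records could be aggregated per site or per storage system. The dict layout, the example values and the aggregate helper are assumptions made for illustration; the sketch also assumes that all records cover the same time interval and that no record already contains a sum of the others.

```python
from collections import defaultdict

# Flat records: at most one value per level, all levels optional.
records = [
    {"Site": "example.org", "StorageSystem": "dcache-1", "StorageType": "disk", "UsedSpace": 500},
    {"Site": "example.org", "StorageSystem": "dcache-1", "StorageType": "tape", "UsedSpace": 2000},
    {"Site": "example.org", "StorageSystem": "dcache-2", "StorageType": "disk", "UsedSpace": 300},
]

def aggregate(records, levels):
    """Sum UsedSpace over records grouped by the given hierarchy levels."""
    totals = defaultdict(int)
    for rec in records:
        # Missing (optional) levels simply group under None.
        key = tuple(rec.get(level) for level in levels)
        totals[key] += rec["UsedSpace"]
    return dict(totals)

per_site = aggregate(records, ["Site"])
per_system = aggregate(records, ["Site", "StorageSystem"])
```

Grouping by a prefix of the hierarchy gives the same totals a tree-structured record would carry at that level.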
6. Record Overview
Given the previous section, we can now present an overview of the elements in a storage accounting record. We reuse the recordId and createTime elements from the usage record standard (just the element names, not the namespaces).
StorageAccountingRecord
  recordId (considered globally unique)
  createTime (timestamp when the record was created)
  StartTime (datetime value)
  EndTime (datetime value)
  Site
  StorageSystem
  StorageSystemPartition
  StorageType
  StorageClass
  IdentityBlock
    VO Block
      VO Name
      VO Issuer
      VO Group
      VO Role
    GlobalUserIdentity
    LocalUserName
    LocalGroupName
  UsedSpace
  ReservedSpace
  UnallocatedSpace
  FileCount
The only enforced element is the recordId.
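A hedged sketch of what building such a record could look like, assuming an XML serialization in the style of the usage record. The draft does not fix a serialization or a namespace, so both are left out here. The element names without spaces (VOBlock, VOName, VOIssuer, VOGroup, VORole) and the helper build_record are assumptions made so that the names are valid XML; all values are made up.

```python
import uuid
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

HEADER = ["StartTime", "EndTime", "Site", "StorageSystem",
          "StorageSystemPartition", "StorageType", "StorageClass"]
VO = ["VOName", "VOIssuer", "VOGroup", "VORole"]  # assumed spellings
IDENTITY = ["GlobalUserIdentity", "LocalUserName", "LocalGroupName"]
METRICS = ["UsedSpace", "ReservedSpace", "UnallocatedSpace", "FileCount"]

def _add(parent, names, values):
    """Append one child element per name that is present in values."""
    for name in names:
        if name in values:
            ET.SubElement(parent, name).text = str(values[name])

def build_record(values):
    """Build a StorageAccountingRecord; only recordId is always required."""
    rec = ET.Element("StorageAccountingRecord")
    ET.SubElement(rec, "recordId").text = values.get("recordId") or str(uuid.uuid4())
    ET.SubElement(rec, "createTime").text = datetime.now(timezone.utc).isoformat()
    _add(rec, HEADER, values)
    if any(name in values for name in VO + IDENTITY):
        ident = ET.SubElement(rec, "IdentityBlock")
        if any(name in values for name in VO):
            _add(ET.SubElement(ident, "VOBlock"), VO, values)
        _add(ident, IDENTITY, values)
    _add(rec, METRICS, values)
    return rec

rec = build_record({
    "StartTime": "2010-10-07T00:00:00Z",
    "EndTime": "2010-10-08T00:00:00Z",
    "Site": "example.org",
    "StorageType": "disk",
    "VOName": "example-vo",
    "UsedSpace": 1000,   # bytes, always an integer
    "FileCount": 42,
})
xml_text = ET.tostring(rec, encoding="unicode")
```

Optional elements that are not supplied are simply omitted, which matches the rule that only recordId is enforced.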
7. Name Issues
Someone brought up that SAR is not the most fortunate name. It sounds a bit like SARS, and the phonetic sound is apparently close to a not-too-fortunate Hungarian word.
We could consider SR (storage record) or SUR (storage usage record). This really doesn't matter.
8. Sharing Elements with Usage Record Standard
Some of the elements are identical in both name and semantics to the ones in the usage record. We do not suggest sharing the elements as such (same namespace), as it would make the standard rely on the UR standard, and hence make it less self-contained. The UR standard is only used in a few systems, and is likely to be replaced with a new standard sometime. Furthermore the implementation gains of sharing the names are very small, if they even exist.
We have definitely missed something in this, but we hope this can be a start for the discussion in Brussels. If you see problems or issues with this record please let us know.
Best regards,
Henrik Thostrup Jensen & Jon Kerr Nilsen
--
Dr. John Alan Kennedy
Rechenzentrum Garching (RZG)
Boltzmannstrasse 2
85748 Garching
Mail: jkennedy@rzg.mpg.de
Phone: +49 89 3299 2694
Fax: +49 89 3299 1301