
Hi, most of the things I have commented on should apply not only to this [StorageStorage|StorageDatastore] discussion but also to the general context of the GLUE 2.0 storage schema.
-----Original Message-----
From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Burke, S (Stephen)
Sent: Friday, 11 April 2008 23:42
To: Paul Millar; glue-wg@ogf.org
Subject: Re: [glue-wg] Datastore proposal
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said:
Name: Human-readable name (maybe indicating the technology, e.g. StorageTek)
Yes, in principle; although I'd be wary of suggesting that people put technology names into Name.
Yes, this is always difficult, and I think the schema document should make a general statement about IDs and Names that they should never contain any metadata to be parsed by a program. However, in practice they always do contain things that humans will recognise, so it may as well be something that helps a human understand what's going on - especially for Names which are explicitly supposed to be for human consumption, e.g. in monitoring displays. In practice sites probably have some internal name for many of these things anyway.
Type (disk, tape, ... - open enumeration) (or maybe call this attribute Medium?)
This is definitely nit-picking, but for many instances this would more properly be "media". Using "type" would avoid the singular/plural issue (I think).
I think this is just because computer people misuse language - it's correct to describe tape as a storage medium, singular, and the fact that people refer to "media" meaning "a bunch of tapes" is incorrect.
I don't know what computer people are, but I agree that 'media' is the appropriate word to use in such a context.
Latency: Enumeration {Online, Nearline, Offline} (probably no need to make this open?)
Do we define what online, nearline and offline mean somewhere?
SRM does - probably we should copy it. However, I think it isn't really an SRM-specific term, it should be general enough to use with other things, hence my suggestion that the enumeration can be closed. In theory I suppose you could have levels of nearline-ness according to how long the latency really is, but I doubt that we need to worry about it.
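Just to make the enumeration concrete, here is a minimal sketch of how a closed Latency attribute could be modelled; the class and value names are mine and the definitions are paraphrased from typical SRM usage, not agreed GLUE wording:

    from enum import Enum

    class AccessLatency(Enum):
        # Closed enumeration sketch for the DataStore Latency attribute.
        # Definitions paraphrased from common SRM usage; illustrative only.
        ONLINE = "online"      # data can be read without any preparatory action
        NEARLINE = "nearline"  # data must first be staged, e.g. recalled from tape
        OFFLINE = "offline"    # operator intervention needed, e.g. mounting a shelved tape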
Perhaps the "total amount of data that can stored without operator intervention and when operating correctly". Would that be sufficient?
Something like that. In general we should spell things out as much as possible, people can be very creative in misinterpreting definitions :)
Aye. We should record the storage actually available to end-users; the ten hot-spare disks shouldn't be included in that number.
Indeed.
Yupp.
Stephen, your description seems to map to precious + precious&sticky + cache + cache&sticky. However, for most systems this should be ~100% of totalSize most of the time, so I'm not sure how useful that number is.
I'm inclined to think that's correct here: if we're describing the real hardware and it really is full of files then that's fine. The hardware doesn't care what the SRM thinks the files are for (or who they belong to). This kind of distinction is more of a problem for the other places we use Capacities, and indeed is the reason we get so much argument about what to publish.
Perhaps we should publish two numbers?
No, I don't think so; if we want this information at all it should be attached to the SRM-level objects like Share and Environment, because the SRM is what knows that one file is in a cache and another is precious.
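To illustrate the split (the per-class byte counts below are invented, and the dictionary is just a stand-in for whatever the SRM database actually holds): at the DataStore level everything on the medium counts as used, while a per-class breakdown would belong on the SRM-level objects such as Share.

    # Hypothetical per-class byte counts taken from an SRM database (made up here).
    srm_bytes = {
        "precious": 40e12,
        "precious+sticky": 5e12,
        "cache": 30e12,
        "cache+sticky": 10e12,
    }

    # The hardware doesn't care what the SRM thinks the files are for:
    # the DataStore UsedSize is simply the sum over all classes (~85 TB here).
    datastore_used_size = sum(srm_bytes.values())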
FreeSize: TotalSize - UsedSize, i.e. the free space at the filesystem level.
Is this an axiomatic relationship? If so, it probably isn't worth recording it.
That's another perennial debate - traditionally we've gone with publishing the complete set even if you can derive one of them from the others, rather than forcing clients to do the sums. You can see the same kind of thing on the CE side, e.g. TotalJobs = RunningJobs + WaitingJobs. As always it's optional so a given grid or info provider may not in fact publish all of them.
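As a minimal sketch of what that looks like for an info provider (the attribute names are taken from the proposal, the function itself is purely hypothetical):

    # Publish FreeSize explicitly even though it is derivable from the other two,
    # mirroring TotalJobs = RunningJobs + WaitingJobs on the CE side. All of these
    # are optional, so a given provider may omit some of them.
    def datastore_capacity(total_size, used_size):
        return {
            "TotalSize": total_size,
            "UsedSize": used_size,
            "FreeSize": total_size - used_size,  # redundant, but saves clients doing the sum
        }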
[Hardware compression]
OK, I think this is a can-o-worms that we don't want to open.
Well, we have to open it at least part way, otherwise we leave the definition ambiguous, and you can bet that different people would make different choices :)
2. For some tape systems, it would be very difficult to obtain the actual storage usage (the "tape occupancy"?).
Maybe, but then you just wouldn't publish it. The reason I went for that definition is that the alternative would create numbers which are hard to interpret, e.g. UsedSize > TotalSize - the compression factor is variable so it doesn't make sense to scale the TotalSize.
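To make the arithmetic concrete (all the numbers here are invented purely for illustration):

    # With a variable hardware compression factor, publishing user-domain bytes
    # as UsedSize against a native-capacity TotalSize can give UsedSize > TotalSize,
    # which is hard to interpret; and since the factor varies with the data,
    # scaling TotalSize by it makes no sense either.
    native_capacity = 100.0        # TB of raw tape capacity (TotalSize)
    user_bytes_written = 130.0     # TB of user-domain data written so far
    average_compression = 1.6      # varies with the data being written

    occupancy = user_bytes_written / average_compression   # ~81 TB actually on the medium
    # 130 > 100: the "user-domain" number exceeds TotalSize, which is why the
    # definition sticks to the actual occupancy instead.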
Also it seems to me that something in the system must know what the occupancy is, otherwise how can it decide whether a new file can be written to a given tape?
For CASTOR this information is available and kept in the "VolumeManager" tables. I would strongly assume that this information can somehow be obtained from storage systems which have a tape backend (see also the next comment).
3. Sometimes a file store operation can fail. If so, the tape software may retry, but some (potentially unknown) fraction of the file has been written to tape. Does this count towards the actual occupancy?
In principle I'd argue that it should reduce the TotalSize, in practice you'd probably just ignore it - you can never expect to fill your storage 100%.
This tape space is used and is not available until the tape has been repacked. But I agree with Stephen, and I am sure it is fine that in GLUE we don't care about this number. The alternative would be to publish the 'lost' space (due to errors) of such a system in GLUE - ugly! Also, HSMs with a tape backend do have monitoring tools to see how much tape space is left (I don't think that e.g. CASTOR Tape Operations considers the lost space as 'theoretically free'). This number can then be published into GLUE.
4. I believe Castor had an issue when deleting files (leading to "repacking"?). If we're attempting to account for the actual yardage of tape used, how would this be accounted for?
Well, it's the nature of tapes and of how data is written. Other HSMs with tape should have the same problems (except they will seek over the tape to find suitable space for a given file).
I think that's the same kind of thing, it may be that some of your "free" space is not actually usable in practice. I think it would be much too complicated to try to represent that explicitly - bear in mind that this is just supposed to be a simple definition of an object we probably don't need at all!
I think the only thing we can publish is the (user-domain) file size that has been recorded to tape. I believe this is the actual number people are interested in.
It's what they're interested in when they look at e.g. the Share, and what they should find there. If they want to look at a hardware description (which they may well not) they should see the hardware numbers ...
As you say: we want GLUE to cover the large majority of use cases. This hardware-level view - especially from the users' side - doesn't appear to me to be a main one. Please correct me if I am wrong; I'll then incorporate it into the Use-Case document. Cheers, Felix