
Hi,

I'll do this while it's fresh in my mind ... for those not in the meeting, the background is that we have a long-running debate about whether to represent the storage hardware explicitly in the schema. In the original Glue we had the Storage Library, which we then obsoleted because no-one used it. In the 1.3 discussion there was the proposed Storage Component, which we left out because we didn't get agreement on whether we needed it. Now we have the same discussion coming round again - the current draft has a Storage Resource to describe the software which manages some storage, e.g. Enstore or GPFS, but still nothing to represent the hardware it manages.

The problem is that we never seem to have a clear use case that requires a hardware description, but it keeps coming back in discussions, perhaps because it's a natural way to think about storage systems. (Also LCG has specific hardware restrictions, e.g. that Custodial must imply tape storage, which are not mandated by SRM or described in the schema.)

My proposal is to shortcut this discussion by putting a simple representation of the hardware in the schema, which would be optional for anyone to publish, so the debate can at least be pushed off to implementation time.

I propose an object which I tentatively call a Datastore (or we could go back to the old StorageLibrary name if we want everything to start with "Storage"). A Datastore would represent some set of uniform managed storage hardware, e.g. a tape robot plus all the tapes, or a set of disk servers managed for the same purpose. For clarity, disk servers allocated e.g. to different VOs would still just constitute a single Datastore, but disk servers used for completely different purposes would be separate, e.g. the disk cache in front of the tape robot would be a different Datastore if managed independently of the Disk1 storage.
The Datastore would have fairly few attributes:

  UniqueID  (as usual)
  Name      (human-readable name, maybe indicating the technology, e.g. StorageTek)
  Type      (disk, tape, ... - open enumeration)
  Capacity  (NB this is in the schema as a separate object for technical reasons but is really just an attribute)
  OtherInfo (as usual)

It might perhaps be useful to give the technology, e.g. RAIDn for disk systems, but I think that should go in OtherInfo as it's likely to be hard to standardise it.

This would be linked to the existing StorageResource object with a one-to-many relation, i.e. one Resource could manage many Datastores (Castor manages tape and disk) but not vice versa: one Datastore can only be managed by one Resource. If there are e.g. multiple sets of disk servers managed by several different software systems, they would constitute multiple Datastores.

The relation to StorageEnvironment and/or StorageShare remains open for discussion as there are other issues there (e.g. whether we want the Environment at all), but conceptually you would want a relation between the Share and whichever Datastore(s) store the data for that Share. That relation can be one-to-many, e.g. if Custodial/Online uses disk+tape, as in WLCG.

Comments?

Stephen
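
For illustration only, the proposed object and its one-to-many relation to StorageResource could be sketched roughly as below. This is a minimal sketch under stated assumptions, not part of any schema draft: the Python class and field names, the example IDs, the capacity value and the check_single_manager helper are all hypothetical, Capacity is modelled as a plain attribute although the schema holds it as a separate object, and no unit for it is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Datastore:
    """One set of uniform managed storage hardware (hypothetical model)."""
    unique_id: str
    name: str                       # human-readable, e.g. "StorageTek"
    type: str                       # open enumeration: "disk", "tape", ...
    capacity: Optional[int] = None  # assumption: plain attribute, unit unspecified
    other_info: List[str] = field(default_factory=list)


@dataclass
class StorageResource:
    """The software managing some storage; may manage many Datastores."""
    unique_id: str
    name: str
    datastores: List[Datastore] = field(default_factory=list)


def check_single_manager(resources: List[StorageResource]) -> Dict[str, str]:
    """Map each Datastore ID to its managing Resource ID.

    Raises ValueError if a Datastore is listed under more than one
    Resource, which the proposed one-to-many relation forbids.
    """
    owner: Dict[str, str] = {}
    for res in resources:
        for ds in res.datastores:
            previous = owner.setdefault(ds.unique_id, res.unique_id)
            if previous != res.unique_id:
                raise ValueError(
                    f"Datastore {ds.unique_id} is managed by both "
                    f"{previous} and {res.unique_id}"
                )
    return owner


# Example: one Resource (Castor) managing a tape and a disk Datastore.
# All identifiers and values here are made up for illustration.
castor = StorageResource(
    unique_id="castor-1",
    name="Castor",
    datastores=[
        Datastore("ds-tape-1", "StorageTek", "tape", capacity=1000),
        Datastore("ds-disk-1", "Disk cache", "disk", other_info=["RAID5"]),
    ],
)
owners = check_single_manager([castor])
```

The technology detail (RAID level here) sits in other_info, in line with the suggestion above that it is too hard to standardise as its own attribute.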