
Hi all,

As an exercise, I've tried to jot down as precise and complete a description of each GLUE storage object as possible, also describing how they relate to each other. I've also tried to do this without any forward-references (so, in theory, the document is readable in a single pass). In almost all cases, I've left out the attributes.

I don't know how useful this is. It's just my point-of-view of things as they stand now. I'm sure there are bits that are "wrong" (either I've misunderstood and/or this description breaks a use-case), but if so, hopefully people can point out which bits are wrong and (perhaps) it will stimulate some discussion.

BTW, I'm implicitly assuming that StorageEnvironment.RetentionPolicy can be multivalued. If this isn't true and we have the use-case of the same physical disks being part of, for example, both Custodial and Output storage, then it starts to get complicated.

As always, comments appreciated.

Cheers,

Paul.

---

UserDomain:

A collection of one or more end-users; a VO is an instance of a UserDomain. All end-users that interact with the physical storage are members of a UserDomain and, in general, derive their authorisation from that membership.

StorageCapacity:

A StorageCapacity object describes the ability to store data within a homogeneous storage technology. This storage technology provides a common access latency.

All StorageCapacity objects are specified within a certain context. The context is determined by an association between the StorageCapacity object and precisely one other higher-level object. These associations are not listed here, but are described in later sections.

In general, a StorageCapacity object will record some context-specific information. Examples of such information include the total storage capacity of the underlying technology and how much of that total has been used.

The underlying storage technology may affect which of the context-specific attributes are available.
For example, tape storage may be considered semi-infinite, so the total and free attributes have no meaning. If this is so, then it affects all StorageCapacity objects with the same underlying technology, independent of their context.

Different contexts may also affect which context-specific attributes are recorded. This is a policy decision when implementing GLUE, as recording all possible information may be costly and provide no great benefit.

[Aside: these two reasons are why many of the attributes within StorageCapacity are optional. Rather than explicitly subclassing the objects and making the values required, it is left deliberately vague which attributes are published.]

A StorageCapacity may represent a logical aggregation of multiple underlying storage technology instances; for example, a StorageCapacity might represent many disk storage nodes, or many tapes stored within a tape silo. GLUE makes no effort to record information at this deeper level; but by not doing so, it requires that the underlying storage technology be homogeneous. Homogeneous means that the underlying storage technology is either identical or sufficiently similar that the differences don't matter.

In most cases, the homogeneity is fairly obvious (e.g., tape storage vs disk-based storage), but there may be times when this distinction becomes contentious and judgement is required; for example, the quality of disk-based storage might indicate that one subset is useful for a higher-quality service. If this is so, then it may make sense to represent the different classes of disk by different StorageCapacities.

StorageEnvironment:

A StorageEnvironment is a collection of one or more StorageCapacities with a set of associated (enforced) storage management policies. Examples of these policies are Type (Volatile, Durable, Permanent) and RetentionPolicy (Custodial, Output, Replica).
StorageEnvironments act as a logical aggregation of StorageCapacities, so each StorageEnvironment must have at least one associated StorageCapacity. It is the associated StorageCapacities that allow a StorageEnvironment to store data with its advertised policies; for example, to act as (Permanent, Custodial) storage of data.

Since a StorageEnvironment may contain multiple StorageCapacities, it may describe a heterogeneous environment. An example of this is "tape storage", which has both a tape back-end and a disk front-end into which users can pin files. Such a StorageEnvironment would have two associated StorageCapacities: one describing the disk storage and another describing the tape.

If a StorageCapacity is associated with a StorageEnvironment, it is associated with only one. A StorageCapacity may not be shared between different StorageEnvironments.

StorageCapacities associated with a StorageEnvironment must be non-overlapping with any other such StorageCapacity, and the set of all such StorageCapacities must represent the complete storage available to end-users. Each physical storage device (e.g., individual disk drive or tape) that an end-user can utilise must be represented by (some part of) precisely one StorageCapacity associated with a StorageEnvironment.

Nevertheless, the StorageCapacities associated with StorageEnvironments may be incomplete, as a site may deploy physical storage devices that are not directly under end-user control; for example, disk storage used to cache incoming transfers. GLUE makes no effort to record information about such storage.

StorageResource:

A StorageResource is an aggregation of one or more StorageEnvironments and describes the hardware that a particular software instance has under its control.

A StorageResource must have at least one StorageEnvironment, otherwise there wouldn't be much point publishing information about it. [This isn't a strict requirement, but I think it makes sense to include it.]
All StorageEnvironments must be part of precisely one StorageResource. StorageEnvironments may not be shared between StorageResources. This means that all physical hardware must be published under precisely one StorageResource.

StorageShare:

A StorageShare is a logical partitioning of one or more StorageEnvironments.

Perhaps the simplest example of a StorageShare is one associated with a single StorageEnvironment with a single associated StorageCapacity, and that represents all the available storage of that StorageCapacity. An example of storage that could be represented by this trivial StorageShare is the classic-SE.

StorageShares must have one or more associated StorageCapacities. These StorageCapacities provide a complete description of the different homogeneous underlying technologies that are available under the share.

In general, the number of StorageCapacities associated with a StorageShare is the sum of the number of StorageCapacities associated with each of the StorageShare's associated StorageEnvironments.

Following from this, there is an implicit association between the StorageCapacity associated with a StorageShare and the corresponding StorageCapacity associated with a StorageEnvironment. Intuitively, this association arises from the fact that the two StorageCapacities share the same underlying physical storage. This implicit association is not recorded in GLUE.

StorageShares may overlap. Specifically, given a StorageCapacity (SC_E) that is associated with some StorageEnvironment and which has totalSize TS_E, let TS_S be the sum of the totalSize attributes for all StorageCapacities that are:

  1. associated with a StorageShare, and
  2. implicitly associated with SC_E.

If the StorageShares are covering, then TS_S = TS_E. If the StorageShares overlap, then TS_S > TS_E. [sorry, I couldn't easily describe this with just words without it sounding awful!]

StorageShares may be incomplete. Following the same definitions as above, this is when TS_S < TS_E.
Intuitively, this happens if the site-admin has not yet assigned all available storage.

End-users within a UserDomain may wish to store or retrieve files. The StorageShares provide a complete, abstract description of the underlying storage at their disposal. No member of a UserDomain may interact with the physical hardware except through a StorageShare.

The partitioning is persistent through file creation and deletion. The totalSize attributes (of a StorageShare's associated StorageCapacities) do not change as a result of file creation or deletion. [Does GLUE need to stipulate this, or should we leave this vague?]

A single StorageShare may allow multiple UserDomains to access storage; if so, the StorageShare is "shared" between the different UserDomains. Such a shared StorageShare is typical if a site provides storage described by the trivial StorageShare (one that covers a complete StorageEnvironment) whilst supporting multiple UserDomains.

StorageMappingPolicy:

The StorageMappingPolicy describes how a particular UserDomain is allowed to access a particular StorageShare. No member of a UserDomain may interact with a StorageShare except as described by a StorageMappingPolicy.

A StorageMappingPolicy may contain information that is specific to that UserDomain, such as one or more associated StorageCapacities. If provided, these give a UserDomain-specific view of how much of the underlying physical storage technology that UserDomain is using through the StorageShare. If StorageCapacities are associated with a StorageMappingPolicy, there will be the same number as are associated with the corresponding StorageShare.

StorageEndpoint:

A StorageEndpoint specifies that storage may be controlled through a particular interface. The SRM protocol is an example of such an interface, and a StorageEndpoint would be advertised for each instance of SRM.

The access policies describing which users of a UserDomain may use the StorageEndpoint are not published.
On observing that a site publishes a StorageEndpoint, one may deduce only that it is valid for at least one user of one supported UserDomain.

StorageAccessProtocol:

A StorageAccessProtocol describes how data may be sent or received. The presence of a StorageAccessProtocol indicates that data may be fetched or stored using this interface.

Access to the interface may be localised; that is, only available from certain computers. It may also be restricted to specified UserDomains. However, neither of these policy restrictions is published in GLUE. On observing a StorageAccessProtocol, one may deduce only that it is valid for at least one user of one supported UserDomain.

StorageService:

A StorageService is an aggregation of StorageEndpoints, StorageAccessProtocols and StorageResources. It is the top-level description of the ability to transfer files to and from a site, and manipulate the files once stored.
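
[Illustration only, not part of GLUE: a minimal Python sketch of the TS_S versus TS_E comparison from the StorageShare section. The class layout, the `technology` label, the `classify_shares` name and the "unknown" result are all assumptions made for this example; only totalSize, TS_E and TS_S come from the description above. Sizes are in arbitrary units.]

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageCapacity:
    """One homogeneous storage technology, seen in one context."""
    technology: str                    # hypothetical label, e.g. "disk" or "tape"
    total_size: Optional[int] = None   # None when totalSize is not published


def classify_shares(env_capacity: StorageCapacity,
                    share_capacities: List[StorageCapacity]) -> str:
    """Compare TS_S (sum over share capacities) with TS_E (environment capacity).

    share_capacities are assumed to be the StorageCapacities of StorageShares
    that are implicitly associated with env_capacity (same physical storage).
    """
    sizes = [sc.total_size for sc in share_capacities]
    if env_capacity.total_size is None or any(s is None for s in sizes):
        # totalSize is optional (e.g. semi-infinite tape), so no comparison is possible
        return "unknown"
    ts_e = env_capacity.total_size
    ts_s = sum(sizes)
    if ts_s == ts_e:
        return "covering"
    if ts_s > ts_e:
        return "overlapping"
    return "incomplete"


if __name__ == "__main__":
    disk_e = StorageCapacity("disk", total_size=100)
    shares = [StorageCapacity("disk", 60), StorageCapacity("disk", 60)]
    print(classify_shares(disk_e, shares))
```

The sketch classifies purely on the totals, exactly as the TS_S / TS_E wording does; it cannot tell apart, say, two overlapping shares whose totals happen to sum to TS_E from a genuine covering partition.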