
Hi all, I am not sure if I can make it to today's telcon. Therefore, I've updated the agenda page. Thanks, Stephen, for the input! I've added some comments:
-----Original Message----- From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Burke, S (Stephen) Sent: Tuesday, 8 April 2008 22:46 To: Paul Millar; GLUE WG Subject: Re: [glue-wg] Updated thoughts...
glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Paul Millar said: Sorry if I've missed out someone's comment, please yell.
I haven't made comments yet, but this seems like a good place to start ... sorry there are quite a lot but I think it's worth trying to nail things down as much as possible.
UserDomain:
A collection of one or more end-users. All end-users that interact with the physical storage are a member of a UserDomain.
Perhaps opening a can of worms, but it may also be possible for a UserDomain to include services, i.e. you might have services registered in VOMS as well as users (even with delegated credentials you may want to give privileges to services which the users don't have).
I strongly assume that this implies changing the main entities (service -> userdomain relation). I'm not sure if we can do this before the public comments deadline.
StorageCapacity:
A StorageCapacity object describes the ability to store data within a homogeneous storage technology. Each object provides a view of that physical storage medium with a common access latency.
It isn't necessarily just the latency that matters, for example it may be useful to publish the Capacity of the disk cache in front of a tape system (see further comments below) - the latency is Online but the functionality is different from Disk1 Online storage. (Similarly a Disk1 storage system might make extra cache copies to help with load balancing.) I think the phraseology should be something like "a common category of storage" (although maybe "category" still isn't the right word).
I'd also like to go back to the question I posed in one of the meetings ... say that a site implements Custodial/Online by ensuring three distinct disk copies, how would we represent that? What about mirrored RAID, how much space do we record?
If the storage system needs to keep more than one instance of a file to ensure a certain storage quality of service, then I don't feel this should be published into the grid information system. Why should the user care if a system theoretically has 100GB but, due to the QoS agreement (3 copies of a file), he can only use ~30GB? In this case the system MAY publish 100GB at the beginning but would then decrease the free space by about 3*file_size when a file has been pushed into the system. (See two paragraphs below.)
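To make the accounting in this exchange concrete, here is a minimal Python sketch. It assumes a hypothetical pool that keeps a fixed number of copies of every file; the class and attribute names are illustrative only and are not part of the GLUE schema.

```python
from dataclasses import dataclass


@dataclass
class ReplicatedPool:
    """Hypothetical pool that keeps `copies` instances of every file."""
    raw_total_gb: float
    copies: int = 3
    raw_used_gb: float = 0.0

    def store(self, file_size_gb: float) -> None:
        # Each stored file consumes copies * file_size of raw space.
        needed = self.copies * file_size_gb
        if needed > self.raw_free_gb:
            raise ValueError("not enough raw space for all copies")
        self.raw_used_gb += needed

    @property
    def raw_free_gb(self) -> float:
        # The figure published if the site advertises raw space.
        return self.raw_total_gb - self.raw_used_gb

    @property
    def usable_free_gb(self) -> float:
        # What a user can actually still write, given the copy count.
        return self.raw_free_gb / self.copies


pool = ReplicatedPool(raw_total_gb=100.0)
pool.store(10.0)
print(pool.raw_free_gb, pool.usable_free_gb)
```

The two properties correspond to the two figures a site could publish: the raw free space, which shrinks by 3*file_size per file, or the space a user can effectively consume.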
Another thing is that I think there is some mission creep going on in the Capacity concept. When I suggested introducing it it was really as a complex data type, i.e. as an alternative to putting maybe 20 separate attributes into each object that can have a size you would effectively have one multivalued "attribute" with type "Capacity" rather than int. However, your descriptions suggest that you're thinking more in terms of a Capacity representing a real thing (a bunch of storage units) which indeed have sizes but may have other attributes too. That isn't necessarily a bad thing, but we should probably be clear in our minds about what we intend.
The context is determined by an association between the StorageCapacity object and precisely one other higher-level object.
I agree with your idea of having this thing as a complex data type enriching higher-level entities with accounting/status data. I still see it the way you intended it, and I think the description should rather go in this direction: "The StorageCapacity enriches a higher-level entity with status/accounting information and must not be instantiated without this relation."
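One way to picture the "complex data type" reading is the following Python sketch. It assumes Capacity is an immutable value that only ever appears embedded in a parent object; the field names (type, total_gb, used_gb, local_id) are hypothetical and chosen for illustration, not taken from the draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Capacity:
    """Value-type view: a bundle of size attributes with no identity of its own."""
    type: str                       # e.g. "online", "nearline", "cache"
    total_gb: Optional[int] = None  # optional: a site may choose not to publish it
    used_gb: Optional[int] = None

    @property
    def free_gb(self) -> Optional[int]:
        # Free space is only defined when both sizes were published.
        if self.total_gb is None or self.used_gb is None:
            return None
        return self.total_gb - self.used_gb


@dataclass
class Share:
    """Higher-level entity that owns its Capacity values."""
    local_id: str
    capacities: List[Capacity] = field(default_factory=list)


share = Share("example-share", [Capacity("online", total_gb=100, used_gb=40)])
print(share.capacities[0].free_gb)
```

In this reading a Capacity is never published on its own; it is effectively a multivalued attribute of the Share (or Environment) that holds it.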
What was the decision about Shares for different VOs which share the same physical space? (I haven't really read all the mails yet so this may already be answered ... actually there is more on this further down.)
I must admit that I may have pushed this idea. I have the feeling (although not necessary for now) that it should be possible to share a 'Share' among VOs.
| The underlying storage technology may affect which of the | context-specific attributes are available. For example, tape storage | may be considered semi-infinite, so the total and free attributes have | no meaning. If this is so, then it affects all StorageCapacity objects with | the same underlying technology, independent of their context.
I'm not quite sure what you're saying here. It seems to me that the schema itself should not be defining this - I would still maintain that tape systems do in fact have a finite capacity at any given time so it isn't conceptually absurd (and "nearline" may not necessarily mean "tape" anyway). Individual Grids may wish to make their own decisions about what to publish, and equally it seems possible that, say, dcache may decide not to publish something but Castor may. All the schema should do is say that the attributes are optional, but *if* they are published the meaning should be well-defined and common across all Grids/implementations/... (and maybe we also want a special value to mean quasi-infinite?)
I think Paul wants to point out with this _example_ that some attributes may not have a sensible meaning in a Grid infrastructure and that in this case the related Capacity objects should not publish these values. I'm not sure if we can define in GLUE that a tape-based storage system should have finite space (although I would like to have it!) because I expect resistance from the community. Please correct me if I am wrong. If not, then we need to add a line to the schema about this definition.
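As a rough illustration of "optional, but well-defined if published", consider the Python sketch below. The attribute names TotalSize and FreeSize and the QUASI_INFINITE marker are assumptions made for this example; the schema defines no such special value at this point.

```python
import math
from typing import Dict, Optional

# Hypothetical marker for "quasi-infinite"; not defined by the schema.
QUASI_INFINITE = math.inf


def published_attributes(total_gb: Optional[float],
                         free_gb: Optional[float]) -> Dict[str, float]:
    """Return only the size attributes a site chose to publish."""
    candidates = {"TotalSize": total_gb, "FreeSize": free_gb}
    # Unpublished attributes are simply absent rather than set to a dummy value.
    return {name: value for name, value in candidates.items() if value is not None}


print(published_attributes(None, None))
print(published_attributes(QUASI_INFINITE, None))
```

A consumer can then distinguish three cases: attribute absent (not published), attribute set to the marker (published as quasi-infinite), and attribute set to a real number.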
A StorageEnvironment is a collection of one or more StorageCapacities with a set of associated (enforced) storage management policies. A StorageEnvironment describes non-overlapping physical storage quality offered by a StorageResource and must be referenced by StorageShares which are deployed fully or partly in a StorageEnvironment. It may additionally offer accounting information represented by one (or more) StorageCapacity objects, which themselves should reflect the (quality) type of the environment they give the capacity info for.
Hmm ... I could suggest that the Environment now also looks more like a data type than a real object (and is also rather SRM2-specific as it stands). And why are the attributes optional, i.e. what would it mean if one or both is missing?
Well, I assume that nobody would publish something which has no meaning except a localID. I could be wrong. But I recommend adding text which states that at least one attribute (AccessLatency or RetentionPolicy) must be given.
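The proposed rule could be checked along the following lines. This is only a sketch under the assumption that an Environment carries a local ID plus two optional attributes; the Python names are hypothetical stand-ins for the schema attributes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageEnvironment:
    local_id: str
    access_latency: Optional[str] = None    # e.g. "online", "nearline"
    retention_policy: Optional[str] = None  # e.g. "custodial", "replica"

    def __post_init__(self) -> None:
        # Proposed rule: an Environment with neither attribute says nothing useful.
        if self.access_latency is None and self.retention_policy is None:
            raise ValueError(
                "at least one of access_latency or retention_policy must be given"
            )


env = StorageEnvironment("env-1", access_latency="online")
print(env)
```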
Should there be an OtherInfo attribute? What would we do for classic SEs, or SRB, or for that matter SRM 1?
Fortunately, here in Taipei there is a developer from iRODS (enhanced SRB). I was smiling when I saw that they use the same terminology. He described the SRB as a StorageResource. I haven't had the opportunity to talk to him, but I'll do so soon.
[What actually seems to have happened here is that things have gradually turned inside out. We started with the SA as the main representation of a set of hardware, with size, policy and ACL information embedded in it and subsequently with the VOInfo added as a dependent object. Now the size (Capacity), ACL (MappingPolicy) and VOInfo (Share) are getting carved out as separate objects with an independent "life" and most of the policy attributes have been obsoleted, so we're left with something that carries almost no information and a role which, to me at least, is not totally clear. I'm not saying there's anything wrong with this, but it may lead to misconceptions derived from trying to relate the Glue 2 objects to their Glue 1 equivalents.]
VOInfo information went into MappingPolicy.
Yes, I agree with you. But from our discussions I have the feeling that the current solution can satisfy most of the use cases (e.g. see NorduGrid #1), and I somehow also ran out of ideas for new entities and, much more importantly, meaningful names for these.
| Examples of these policies are Type (Volatile, Durable, Permanent) | and RetentionPolicy (Custodial, Output, Replica).
Except that Type (or ExpirationMode) doesn't seem to be an attribute in the current draft ... what about other policies,
Type has been deprecated in Environment; ExpirationMode went into Share.
| In general, a StorageEnvironment may have one or more | RetentionPolicy values.
Not what it says in the current draft (0..1). Does this correspond with SRM usage, i.e. can you have spaces with multiple RPs?
Sorry, I may have missed this. But to be sure, I put it onto the agenda for the next telcon on 9.4.2008.
| GLUE does not | record a default RetentionPolicy.
Should it? What about defaults for other things, e.g. ExpirationMode?
I'm also happy with having no retention policy as a default. (This, however, may not be the case for WLCG. But GLUE shouldn't define this.)
| It is the associated StorageCapacities that allow a | StorageEnvironment to store data with its advertised policies; for | example, to act as (Permanent, Custodial) storage of data.
But can you tell how that works, i.e. which Capacity serves which policy? This is another case where our mind tends to think Custodial -> tape -> Nearline, but intrinsically it doesn't have to be like that.
| Since a StorageEnvironment may contain multiple StorageCapacities, | it may describe a heterogeneous environment. An example of this is | "tape storage", which has both tape back-end and disk front-end into | which users can pin files. Such a StorageEnvironment would have two | associated StorageCapacities: one describing the disk storage and | another describing the tape.
But can you have more than one Capacity of the same type? (see the comments earlier). Anyway I think we removed the storage type from the Capability so at the moment you can't really tell what it is. Maybe we should look back at the proposal for Storage Components made by Flavia, Maarten et al in the 1.3 discussion, or has someone already done that?
No, the Type is still in the Capacity and there were no plans in the last sessions to remove it. There is no restriction on publishing another Capacity object with the same type for e.g. an environment. I wonder whether it would make sense to have two online capacities for one environment. But I wouldn't consider this an error or a major problem.
| Nevertheless, the StorageCapacities associated with | StorageEnvironments may be incomplete as a site may deploy physical | storage devices that are not directly under end-user control; for | example, disk storage used to cache incoming transfers. GLUE makes | no effort to record information about such storage.
Fine with me.
Actually part of my reason to introduce Capacity objects is that they can do just that if people want them to (as they may since it can be useful to know about cache usage). For such cases the CapacityType would be Cache, or maybe something else if you wanted to distinguish more than one kind of cache. As always there's no compulsion to publish that if you don't want it, but the schema makes it possible.
| GLUE makes no attempt to record which physical storage (as | represented by StorageCapacity objects) is under control of which | StorageResource.
Should it? As it stands you might not care, but if you wanted to consider monitoring use cases (whether the software is running at the most basic!) it would probably be useful to know how that relates to the actual storage.
StorageShare:
A StorageShare is a logical partitioning of one or more StorageEnvironments.
Maybe I'm missing something, but how could you have more than one Environment for a single Share? Certainly our current structure doesn't allow it (one SA per many VOInfos but not vice versa), although as I said above that might be misleading.
| The StorageCapacities within the StorageShare context need not | describe all storage: the number of StorageCapacities associated | with a StorageShare may be less than the sum of the number of | StorageCapacities associated with each of the StorageShare's | associated StorageEnvironments.
Err, why? As always you may choose not to publish everything, but conceptually the space is all there somewhere ...
Fine with me
| A pair of StorageShares may be partially shared, that is, they have | at least one pair StorageCapacities that are shared and at least one | that is not. Partially shared StorageCapacities could represent two | UserDomain's access to a tape store, where they share a common set | of disk pools but the tape storage is distinct.
I'm not sure I like this bit. In general I would assume that storage (SAs in the current parlance) is either shared or not - allowing the disk part of a custodial/online space to be shared and the tape part not sounds rather weird to me, and I don't think that's how SRM works. Do we really have such cases? Bear in mind that the point is not about sharing the physical disks, but having a shared allocation (and for Disk1/Online permanent storage, not cache). If the system is guaranteeing to store, say, 100 Tb on both disk and tape (custodial/online) there is no way it can do that if the disk part of the reservation is shared, and if it doesn't guarantee it overall then having a reserved tape pool is pointless, in general it would just mean that some tapes are unusable.
Another question, what do we do about hierarchical spaces? At the moment we at least have the case of the "base space" or whatever you call it from which the space tokens are reserved, and in future I believe we're considering being able to reserve spaces inside spaces. How could that be represented? (There are also questions we've discussed in the past about things like dynamic spaces and default spaces which tend to produce more heat than light :)
I fear that there are no plans to consider this concept for now. And I don't think that we are able to implement it before end of April.
StorageMappingPolicy:
The StorageMappingPolicy describes how a particular UserDomain is allowed to access a particular StorageShare.
Should we say how this relates to the AccessPolicy? (which doesn't seem to appear explicitly in either the Computing or Storage diagrams but is presumably there anyway.)
The StorageMappingPolicy describes how a UserDomain may utilize a Share. Its main purpose is to publish access control but it may also keep additional information for accounting as well as UserDomain specific namespace access information for the associated share.
| No member of a UserDomain may interact with a StorageShare except as | described by a StorageMappingPolicy.
As stated I don't think that can really be true, the SRM could potentially allow all kinds of things not explicitly published. The things which should be true are that there is an agreed set of things (maybe per grid?) which are published, and that the published values should be a superset of the "real" permissions - i.e. the SRM may in fact not authorise me even if the published value says that it will, but the reverse shouldn't be true.
Yes, I agree.
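The superset condition agreed on here can be written down as a simple set check. The sketch below is in Python and assumes permissions can be modelled as sets of UserDomain names; the function name and the example VO names are illustrative only.

```python
from typing import Set


def published_is_superset(published: Set[str], actual: Set[str]) -> bool:
    """True if every UserDomain the SRM really authorises is also published.

    The reverse is allowed to fail: a published UserDomain may still be
    refused by the SRM at access time.
    """
    return actual <= published


print(published_is_superset({"atlas", "cms"}, {"atlas"}))
print(published_is_superset({"atlas"}, {"atlas", "cms"}))
```

The first call is the acceptable case (something is published that the SRM may still refuse); the second is the case that should not occur (the SRM authorises a UserDomain that was never published).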
| The StorageMappingPolicies may contain information that is specific | to that UserDomain, such as one or more associated | StorageCapacities. If provided, these provide a UserDomain-specific | view of their usage of the underlying physical storage technology as | a result of their usage within the StorageShare.
I don't think I understand how this can be different from the Share to Capacity relation ... if you are saying that the Share can be multi-VO then I think something has gone wrong somewhere given that the Path and Tag can be VO-specific. In the 1.3 schema the whole point of the VOInfo (which has become the Share) was to split out the information specific to each mapping policy (ACBR) from the generic information in the SA ...
| The access policies describing which users of a UserDomain may use | the StorageEndpoint are not published.
Are you sure? (see comment above)
Please check my mail sent to the list today ("RE: [glue-wg] Some thoughts on storage objects").
A StorageAccessProtocol describes one method by which end-users may sent data to be stored, received stored data, or undertake both operations.
sent -> send, received -> retrieve
| Access to the interface may be localised; that is, only available | from certain computers. It may also be restricted to specified | UserDomains.
It might also only apply to certain storage components ...
Fine with me.
Phew .. I spent over two hours writing that, I hope someone reads it :)
Stephen
I did! But lucky you, it took me longer interrupted by lecture breaks and interesting presentations. Cheerio, Felix