
Hi, Going through the GLUE 2 schema I just noticed that StorageShare has an attribute called AccessMode with type AccessMode_t which doesn't appear to be defined. The text is "An identifier for the type of access and usage allowed for this Share." which doesn't immediately ring any bells. Does anyone know what this was for? Stephen -- Scanned by iCritical.

Hi Stephen, On 31/07/12 02:58, stephen.burke@stfc.ac.uk wrote:
Going through the GLUE 2 schema I just noticed that StorageShare has an attribute called AccessMode with type AccessMode_t which doesn't appear to be defined. The text is "An identifier for the type of access and usage allowed for this Share." which doesn't immediately ring any bells. Does anyone know what this was for?
The dCache info-provider currently doesn't publish this attribute, pretty much for the reasons you state. From memory, the StoRM info-provider publishes something ... ah, but not a terribly helpful value, though:

  paul@vedrfolnir:~$ ldapsearch -LLL -x -H ldap://lcg-bdii.cern.ch:2170 -b o=glue '(&(objectClass=GLUE2StorageShare)(GLUE2StorageShareAccessMode=*))' GLUE2StorageShareAccessMode|grep ^G|sort|uniq
  GLUE2StorageShareAccessMode: to be defined
  paul@vedrfolnir:~$

To be honest, I've forgotten what was the intended use for this attribute. Could it be so a pool that is meant only for staging data from tape, or as a read-only cache, may be marked as such?

Unfortunately, this doesn't quite work for dCache. dCache has four potential uses for each disk-pool:

  o providing end-users with read access to files,
  o providing storage space for end-users to upload files,
  o storing files that are to be written to tape,
  o storing files that have been read from tape.

Each pool can have zero or more of these uses.

Where it gets complicated is that the usage can depend on many things: from which IP address the client is connecting, which protocol is being used and in which directory the file is located. Selection based on the end-user's VO is supported from the last point; for example, directories that only ATLAS users may access (due to ownership and permissions in the filesystem) may be tagged, which allows for selecting special pools as "ATLAS pools".

However, there's a problem here. The ability of an end-user to use a pool in any of these four modes depends on that user's VO (actually, it's rather more complicated than that, but let's keep things simple). For example, a StorageShare might allow ATLAS and CMS to read data from the pool, but allow only CMS to write into it.

In GLUE v1.3, I published this information as dCache-specific GlueSACapability attributes. A list of VOs that have specific access rights. The attribute values have the form dCacheCan<op>=VO:<vo-name> (e.g., dCacheCanStage=VO:ops).

In GLUE v2.0, I hadn't made up my mind how to publish this information. Given that AccessMode is multi-valued, it would be possible to publish multiple values, but this would lose the information about who is allowed to write into the StorageShare (since it isn't necessarily all VOs that have access to this StorageShare).

Another possibility is to publish multiple StorageShares for the same collection of pools, one for each set of end-users. If all users of this pool can undertake the same operations then only one StorageShare would be published; otherwise, multiple StorageShare objects are needed.

Of course, I can take the GLUE v1.3 route and publish the information as OtherInfo attributes (at least, for now :-)

In summary, dCache doesn't currently publish any AccessMode and I'm not sure if anything useful could be published here.

Thinking about the problem more generally, a better solution would be to publish this information in the MappingPolicy. End-users access the storage system, so "AccessMode" might be something more appropriate to MappingPolicy. This would "get around" the problem where the same StorageShare may be accessed in different fashions by different end-user groups.

In general, what's missing in MappingPolicy is the ability to specify more predicates. These are conditions that must be satisfied for the MappingPolicy to take effect. MappingPolicy already has limited support for this: the associations to one or more UserDomain objects. This association describes how only end-users that are members of a UserDomain may use the StorageShare.

So, in MappingPolicy the info-provider might like to describe how a mapping only takes effect if the end-user satisfies certain criteria. Here are some that would be useful for dCache:

  o certain IP address(es) -- not necessarily a single net-mask.
  o for certain end-users -- already supported with associations to UserDomain objects.
  o for certain protocols -- could be supported by introducing associations to AccessProtocols?
  o for certain portions of the namespace -- this would be quite tricky to express accurately, since it may depend on the access protocol (e.g., adding a prefix or removing a portion of the namespace) and it might not be a complete sub-tree.
  o for certain operations -- for dCache, the operations are read, write, flush and stage.

HTH,

Paul.
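
To make the predicate idea above concrete, here is a minimal Python sketch of how such conditions might be evaluated. Nothing in it is part of GLUE 2 or dCache: the class names (Request, PolicyPredicates), the field names and the convention that an empty predicate means "no restriction" are all assumptions made for illustration. The namespace predicate is left out because, as noted above, it is hard to express accurately.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Request:
    """What a client intends to do (hypothetical structure, not in GLUE 2)."""
    vo: str
    client_ip: str
    protocol: str
    operation: str  # for dCache: read, write, flush or stage


@dataclass
class PolicyPredicates:
    """Hypothetical predicates on a MappingPolicy; an empty one means 'any'."""
    user_domains: set = field(default_factory=set)
    networks: list = field(default_factory=list)  # CIDR strings, may be several
    protocols: set = field(default_factory=set)
    operations: set = field(default_factory=set)

    def matches(self, req: Request) -> bool:
        # Every non-empty predicate must be satisfied for the policy to apply.
        if self.user_domains and req.vo not in self.user_domains:
            return False
        if self.networks and not any(
            ip_address(req.client_ip) in ip_network(net) for net in self.networks
        ):
            return False
        if self.protocols and req.protocol not in self.protocols:
            return False
        if self.operations and req.operation not in self.operations:
            return False
        return True


# The example from the text: ATLAS and CMS may read, only CMS may write.
read_policy = PolicyPredicates(user_domains={"atlas", "cms"}, operations={"read"})
write_policy = PolicyPredicates(user_domains={"cms"}, operations={"write"})
```

With these two policies attached to the same StorageShare, an ATLAS read request matches read_policy while an ATLAS write request matches neither, so the per-VO distinction that a bare AccessMode value would lose is preserved.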

Paul Millar [mailto:paul.millar@desy.de] said:
GLUE2StorageShareAccessMode: to be defined
That seems unlikely to be a valid value!
To be honest, I've forgotten what was the intended use for this attribute. Could it be so a pool that is meant only for staging data from tape, or as a read-only cache, may be marked as such?
In Flavia's document for glue 1 we have Capabilities of scratch and stage, so perhaps that's what we had in mind, but I'm not sure why we didn't at least define a few values. I searched the wiki and my mail archive and found nothing, so it seems we just overlooked it. If we have no uses at the moment I suppose we can just leave it for future definition, but it might be nice to add some kind of comment.
Where it gets complicated is that the usage can depend on many things: from which IP address the client is connecting, which protocol is being used and in which directory the file is located. Selection based on the end-user's VO is supported from the last point; for example, directories that only ATLAS users may access (due to ownership and permissions in the filesystem) may be tagged, which allows for selecting special pools as "ATLAS pools".
At some point we have to decide what's practical to publish and use. If the setup is too complex it may just have to be agreed out-of-band that a particular VO needs a particular configuration, or else you publish a single tag for a whole bundle of attributes that a VO can interpret. That's analogous to the fact that we don't attempt to publish every installed rpm for CEs - VOs can either ask for certain things to be installed as a condition of supporting a VO at all, or they can publish RunTimeEnvironment tags to identify CEs with support for specific things of interest to them.
For example, a StorageShare might allow ATLAS and CMS to read data from the pool, but allow only CMS to write into it.
Does that kind of thing happen in practice? I would have thought that usually ATLAS and CMS pools would be kept separate.
In GLUE v1.3, I published this information as dCache-specific GlueSACapability attributes. A list of VOs that have specific access rights. The attribute values have the form dCacheCan<op>=VO:<vo-name> (e.g., dCacheCanStage=VO:ops).
Does any client use that information?
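
For reference, a client that did want to consume those values could parse them roughly as follows. This is only a sketch in Python: the function name rights_by_operation is hypothetical, and the only value taken from the thread is dCacheCanStage=VO:ops. Any other operation name, such as dCacheCanRead, is an assumption by analogy with the four dCache operations.

```python
import re
from collections import defaultdict

# Matches values of the form dCacheCan<op>=VO:<vo-name>.
_CAPABILITY = re.compile(r"^dCacheCan(?P<op>[A-Za-z]+)=VO:(?P<vo>.+)$")


def rights_by_operation(capabilities):
    """Group VO names by operation; values in other formats are ignored."""
    rights = defaultdict(set)
    for value in capabilities:
        match = _CAPABILITY.match(value)
        if match:
            rights[match.group("op").lower()].add(match.group("vo"))
    return dict(rights)


# "dCacheCanStage=VO:ops" is from the thread; the second value is assumed.
example = rights_by_operation(["dCacheCanStage=VO:ops", "dCacheCanRead=VO:atlas"])
```

The result maps each lower-cased operation name to the set of VOs allowed to perform it, which is the per-VO information the GlueSACapability values were meant to carry.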
In general, what's missing in MappingPolicy is the ability to specify more predicates. These are conditions that must be satisfied for the MappingPolicy to take effect.
You can potentially define a more complex syntax for the PolicyRules, as long as you can get it all into one line. However, I don't think it's ideal to have dCache-specific things; the general idea is for all SRM systems to be treated in the same way. Anyway, I think this should be driven by practical need, rather than trying to publish everything just in case it might be useful. Also, I think you should at least have a toy algorithm for how a client might use it - if it's more than a few lines of pseudo-code it's probably too complicated to be usable.

Stephen
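
As an illustration of the "few lines" test, here is one possible toy algorithm in Python for the case where each published StorageShare carries a set of permitted VOs per operation. The data layout (the "id" and "access" keys) and the share names are invented for this sketch and are not part of the GLUE 2 schema.

```python
def usable_shares(shares, vo, operation):
    """Return the IDs of shares on which the given VO may perform the operation."""
    return [s["id"] for s in shares if vo in s["access"].get(operation, ())]


# Hypothetical published data: ATLAS and CMS may read, only CMS may write.
shares = [
    {"id": "share-a", "access": {"read": {"atlas", "cms"}, "write": {"cms"}}},
    {"id": "share-b", "access": {"read": {"ops"}}},
]

cms_writable = usable_shares(shares, "cms", "write")
atlas_writable = usable_shares(shares, "atlas", "write")
```

If selection also had to take the client IP address, protocol and namespace into account, the function would grow well beyond a single line, which is one way of judging whether that extra information is practical to publish.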