
Hi Felix,

today I won't be able to connect due to phone/network unavailability here at CNAF because of our data center renovation. Tomorrow I plan to stay at home so I can call/connect. Anyway, I'm sending you a number of thoughts on the current storage entities model:

- "Shared" Storage Share

The solution of relating a storage capacity to a storage mapping policy in order to advertise used space per policy seems to over-complicate the model; the storage capacity concept is being adopted by different entities with slightly different meanings, and this leads to a model that is not very intuitive or usable. Our proposal is to act as follows:

* For Storage Share:
1- add a Shared attribute in the storage share whose type is boolean; for "shared" shares, the value should be true
2- add an AggregationLocalID attribute; for the "shared" shares within the same storage service, this attribute should be assigned the same value

In this way, we avoid the creation of one more level of hierarchy, and visualization tools that want to show summary info can avoid double counting by checking the two attributes that we propose.

* For Storage Environment: when we mapped the current model to our T1 use case, we found out that the storage environment is homogeneous; therefore there is no need (at least for our scenario) for the capacity to be associated with the storage environment; the attributes of the storage capacity can be added to the storage environment.

* For Storage Resource: since information about free/used/total/reserved space is provided by the storage environment, we could avoid having summary info at the storage resource level; information consumers can aggregate it.

If the above considerations fit the use cases of other partners, then the storage capacity would be related only to the storage share.

As regards today's agenda, I removed the following issues since they do not properly reflect our scenario:

** consequence of overlapping StorageResource entities
*** GPFS 3.1 and GPFS 3.2 share the same disks
*** if wished to be expressed explicitly -> each GPFS is represented as its own StorageResource
*** BUT then: a higher aggregation of capacity numbers must be given in the service (again: if wished)
*** OR (easier): express GPFS 3.1 and 3.2 in the OtherInfo field

In our mapping choice, we have decided to model the three storage systems managed by GPFS 3.1, GPFS 3.2 and TSM respectively using the storage environment concept. They do not logically overlap. (See here http://glueman.svn.sourceforge.net/viewvc/*checkout*/glueman/tags/glue-xsd/d...) In our scenario, we have one global storage resource composed of three storage environments.

As a final comment, my opinion is that we should privilege simplicity and the meta-scheduling use cases over the monitoring ones. If we do not manage to converge shortly on a common vision for the storage resource/storage environment, we should probably postpone the definition of these entities to a future GLUE revision and concentrate on the storage endpoint/storage share consolidation.

Cheers, Sergio

--
Sergio Andreozzi
INFN-CNAF,                  Tel: +39 051 609 2860
Viale Berti Pichat, 6/2     Fax: +39 051 609 2746
40126 Bologna (Italy)       Web: http://www.cnaf.infn.it/~andreozzi
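To make the proposed shape of the entities concrete, here is a minimal sketch in Python; apart from the Shared flag, the AggregationLocalID, and the size/policy attributes mentioned in this thread, the class layout and defaults are illustrative assumptions rather than the actual GLUE 2.0 definitions.

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StorageShare:
    LocalID: str
    Shared: bool = False                      # proposed boolean flag: True for "shared" shares
    AggregationLocalID: Optional[str] = None  # same value for all "shared" shares of one service
    TotalSize: int = 0                        # capacity attached directly to the share
    UsedSize: int = 0

@dataclass
class StorageEnvironment:
    LocalID: str
    RetentionPolicy: str    # e.g. "custodial" or "replica"
    AccessLatency: str      # e.g. "online" or "nearline"
    TotalSize: int = 0      # capacity attributes incorporated, since the environment is homogeneous
    UsedSize: int = 0

@dataclass
class StorageResource:
    LocalID: str
    # no summary sizes here: consumers aggregate over the (non-overlapping) environments
    Environments: List[StorageEnvironment] = field(default_factory=list)

A consumer that wants a resource-level total would simply sum over Environments.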

Ciao Sergio,
* For Storage Share:
1- add a Shared attribute in the storage share whose type is boolean; for "shared" shares, the value should be true
2- add an AggregationLocalID attribute; for the "shared" shares within the same storage service, this attribute should be assigned the same value
In this way, we avoid the creation of one more level of hierarchy, and visualization tools that want to show summary info can avoid double counting by checking the two attributes that we propose.
So, you would publish such a shared Share multiple times, once per VO. Each such instance then gives a VO view of that Share. I do not see a problem for the info provider to cook up the correct values for the boolean flag and the AggregationLocalID, but I do note that compared to the proposal by Felix we lose some functionality: if each of the VOs has a _quota_ in the Share, we would publish that number as, say, its online TotalSize --> this means we no longer have the _physical_ TotalSize of the Share published anywhere. Maybe not a big loss...
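As a rough sketch of how a consumer could avoid double counting under this proposal (the dictionary keys mirror the proposed attributes; the data is invented for illustration), keeping in mind the caveat above that per-VO quotas would turn the sum into a total of quotas rather than the physical size:

def summed_total_size(shares):
    """Sum TotalSize over published share instances, counting each shared
    aggregation group only once (per-VO instances of a shared share carry
    the same AggregationLocalID)."""
    counted_groups = set()
    total = 0
    for s in shares:                       # each s is one published share instance
        group = s.get("AggregationLocalID")
        if s.get("Shared") and group is not None:
            if group in counted_groups:
                continue                   # this shared space was already counted
            counted_groups.add(group)
        total += s["TotalSize"]
    return total

# Example: two VO views of one shared share plus one dedicated share.
shares = [
    {"Shared": True,  "AggregationLocalID": "agg-1", "TotalSize": 100},
    {"Shared": True,  "AggregationLocalID": "agg-1", "TotalSize": 100},
    {"Shared": False, "AggregationLocalID": None,    "TotalSize": 50},
]
print(summed_total_size(shares))  # 150, not 250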
* For Storage Environment: when we mapped the current model to our T1 use case, we found out that the storage environment is homogeneous; therefore there is no need (at least for our scenario) for the capacity to be associated with the storage environment; the attributes of the storage capacity can be added to the storage environment.
An Environment can have both online and nearline components, and we would like to be able to publish sizes for both: if the sizes are incorporated, we have to put Online and Nearline in their names, like in GLUE 1.3. Fine with me, but I thought there were objections to that?
* For Storage Resource: since information about free/used/total/reserved space is provided by the storage environment, we could avoid having summary info at the storage resource level; information consumers can aggregate it.
The assumption then is that Environments will not overlap: probably OK.
If the above considerations fit the use cases of other partners, then the storage capacity would be related only to the storage share.
I think we should handle sizes the same way for Share and Environment: either incorporate them, or have them in Capacity objects.
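To make the two options concrete, here is a small sketch of the alternatives; the GLUE 1.3-style prefixed names and the Capacity fields shown are assumptions for illustration only.

from dataclasses import dataclass, field
from typing import List

# Option A: sizes incorporated into the entity, GLUE 1.3-style prefixed names.
@dataclass
class ShareWithSizes:
    LocalID: str
    OnlineTotalSize: int = 0
    OnlineUsedSize: int = 0
    NearlineTotalSize: int = 0
    NearlineUsedSize: int = 0

# Option B: sizes factored out into Capacity objects, one per type.
@dataclass
class Capacity:
    Type: str            # "online" or "nearline"
    TotalSize: int = 0
    UsedSize: int = 0

@dataclass
class ShareWithCapacities:
    LocalID: str
    Capacities: List[Capacity] = field(default_factory=list)

# Whichever option is chosen, the same pattern would apply to Environment,
# so that Share and Environment publish their sizes consistently.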
As regards today's agenda, I removed the following issues since they do not properly reflect our scenario:
** consequence of overlapping StorageResource entities
*** GPFS 3.1 and GPFS 3.2 share the same disks
*** if wished to be expressed explicitly -> each GPFS is represented as its own StorageResource
*** BUT then: a higher aggregation of capacity numbers must be given in the service (again: if wished)
*** OR (easier): express GPFS 3.1 and 3.2 in the OtherInfo field
In our mapping choice, we have decided to model the three storage systems managed by GPFS 3.1, GPFS 3.2 and TSM respectively using the storage environment concept. They do not logically overlap.
Note: you do not publish the actual implementation names and versions, which we want at least for WLCG (see below). Furthermore, as far as WLCG is concerned you cannot build your T1D1 setup out of a replica-online and a custodial-nearline Environment! In WLCG, combinations of RetentionPolicy and AccessLatency have _extra_ meaning that cannot be deduced from those attributes. Such combinations are called Storage Classes:

Custodial-Nearline == T1D0 --> disk managed by system
Custodial-Online   == T1D1 --> disk managed by client
Replica-Online     == T0D1 --> disk managed by client

A Storage Class always has disk, i.e. an online component, while the Custodial classes also have tape or some other high-quality storage; if it is tape/dvd/... there is a corresponding nearline component. What is more, the disk component is managed by the system for T1D0, while it is managed by the client (VO) for T1D1 and T0D1. WLCG needs to have it clear from the schema which Storage Class applies to a particular Share. In principle one could come up with this rule:

Custodial-Nearline + Replica-Online == Custodial-Online
T1D0 + T0D1 == T1D1

But then a client that is interested in T1D1 has to query for Shares that either are linked to a Custodial-Online Environment, or linked to a Custodial-Nearline _and_ a Replica-Online Environment: not nice! Furthermore, a client interested in T1D0 (T0D1) has to ensure that the matching Shares are _not_ also linked to T0D1 (T1D0).

I would quite prefer having a Share always linked to a _single_ Environment, which itself will have an online component and may also have a nearline component. If we want to have separate implementation names and versions for those components, it would seem natural to introduce the split at the Resource level instead: an Environment can be linked to an online Resource (e.g. with GPFS 3.2) and a nearline Resource (TSM X.Y.Z).

Whichever way, we would like to publish such back-end implementation names and versions explicitly. Flavia has a use case: the name and version of the back-end storage system should be available, such that it can be deduced from the information system which sites are likely to suffer from which open issues. This is important information for debugging operational problems in WLCG (and other grids).
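A small sketch of how a client could derive the Storage Class once a Share is linked to a single Environment; the lower-case attribute values are an assumption about how RetentionPolicy and AccessLatency would be published:

# WLCG Storage Classes from (RetentionPolicy, AccessLatency) of the single
# Environment linked to a Share, following the correspondence above.
STORAGE_CLASSES = {
    ("custodial", "nearline"): "T1D0",  # disk managed by the system
    ("custodial", "online"):   "T1D1",  # disk managed by the client (VO)
    ("replica",   "online"):   "T0D1",  # disk managed by the client (VO)
}

def storage_class(retention_policy: str, access_latency: str) -> str:
    """Return the WLCG Storage Class for one Environment."""
    key = (retention_policy.lower(), access_latency.lower())
    if key not in STORAGE_CLASSES:
        raise ValueError(f"no WLCG Storage Class for {key}")
    return STORAGE_CLASSES[key]

print(storage_class("Custodial", "Online"))  # T1D1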
(See here http://glueman.svn.sourceforge.net/viewvc/*checkout*/glueman/tags/glue-xsd/d...) In our scenario, we have one global storage resource composed of three storage environments.
As a final comment, my opinion is that we should privilege simplicity and the meta-scheduling use cases more than the monitoring ones. If we do not manage to converge shortly on a common vision for the storage resource/storage environment, we should probably postpone the definition of these entities to a future GLUE revision and concentrate on the storage endpoint/storage share consolidation.
I still think we are approaching convergence. Thanks, Maarten

Hi all,
In WLCG combinations of RetentionPolicy and AccessLatency have _extra_ meaning that cannot be deduced from those attributes.
Such combinations are called Storage Classes:
Custodial-Nearline == T1D0 --> disk managed by system
Custodial-Online   == T1D1 --> disk managed by client
Replica-Online     == T0D1 --> disk managed by client
In the "TnDm" notation 'n' indicates the guaranteed number of copies on tape (or other high-quality media) and 'm' the guaranteed number of copies on disk. Cheers, Maarten

On Sun, 30 Mar 2008, Maarten Litmaath wrote:
Ciao Sergio,
* For Storage Share:
1- add a Shared attribute in the storage share whose type is boolean; for "shared" shares, the value should be true
2- add an AggregationLocalID attribute; for the "shared" shares within the same storage service, this attribute should be assigned the same value
In this way, we avoid the creation of one more level of hierarchy, and visualization tools that want to show summary info can avoid double counting by checking the two attributes that we propose.
So, you would publish such a shared Share multiple times, once per VO. Each such instance then gives a VO view of that Share. I do not see a problem for the info provider to cook up the correct values for the boolean flag and the AggregationLocalID, but I do note that compared to the proposal by Felix we lose some functionality: if each of the VOs has a _quota_ in the Share, we would publish that number as, say, its online TotalSize --> this means we no longer have the _physical_ TotalSize of the Share published anywhere. Maybe not a big loss...
I would like to remind you that NorduGrid has declared this accounting per tape/disk as a use case. For WLCG both solutions are fine, since they have one Share/UserDomain. I personally don't see any design problems, and I think that this is more than 'nice-to-have'. However, if there are problems in implementing/using such a mechanism that _outweigh_ the advantages, then we should drop it.
* For Storage Resource: since information about free/used/total/reserved space is provided by the storage environment, we could avoid having summary info at the storage resource level; information consumers can aggregate it.
The assumption then is that Environments will not overlap: probably OK.
The Capacity->Resource link has been removed (v.32) and there will be a note in the description of the Environment about the fact that they are non-overlapping.
I would quite prefer having a Share always linked to a _single_ Environment, which itself will have an online component and may also have a nearline component.
We will discuss this in the next storage discussion (Thursday, 03.04). From the main entities' point of view we don't have any restrictions. One reason I remember for this current relation was to be aligned with the computing model (Share-ExecutionEnvironment).
If we want to have separate implementation names and versions for those components, it would seem natural to introduce the split at the Resource level instead: an Environment can be linked to an online Resource (e.g. with GPFS 3.2) and a nearline Resource (TSM X.Y.Z).
Whichever way, we would like to publish such back-end implementation names and versions explicitly. Flavia has a use case: the name and version of the back-end storage system should be available, such that it can be deduced from the information system which sites are likely to suffer from which open issues. This is important information for debugging operational problems in WLCG (and other grids).
I agree. But this can also happen in an OtherInfo field, and I tend towards this rather than making the entities too complicated. Sergio, should I postpone this issue until you're back with us (9.04.2008)? Cheerio, Felix

On Sun, 30 Mar 2008 05:51:00 +0200 <Maarten.Litmaath@cern.ch> wrote:
Whichever way, we would like to publish such back-end implementation names and versions explicitly. Flavia has a use case: the name and version of the back-end storage system should be available, such that it can be deduced from the information system which sites are likely to suffer from which open issues. This is important information for debugging operational problems in WLCG (and other grids).
Dear all,

Could someone please explain why the back-end storage system is important for GLUE? The back-end tape system, which is only applicable to a handful of sites (fewer than 14 sites, I would guess), does not need to be discovered, as we tend to hand-code special provision for such sites in our data flow models. We have a whiteboard at DESY where we track the deployment of all Tier-1s. The data is not so large that it demands discovery for bug fixing.

For me GLUE is mostly about endpoint discovery for Tier-2/3 sites, which have low resources and even lower manpower and yet will be expected to fill in some (not all) of this relational information by hand. I feel that modelling a Tier-1 or a Tier-0 is exactly what we should not be focusing upon, since these sites are already known by name and have FTEs who can fill in GLUE schema values by hand (and our grid-specific extensions). Tier-2/3 sites should ideally do less hand coding, as manpower is an issue there.

I'm in favour of a catalogue mapping service so that catalogue synchronisation can occur; if such a service were added to the LCG framework, it would make the tiny amount of "monitoring" data in GLUE useless, as the catalogues would know what data was used at each site and, far more usefully, could delete files rather than just know they need to be deleted.

Patrick has told me the dCache position is to accept all requests we can deliver and fight against all we can't do. Paul now represents dCache for GLUE; I just drop an email in now and again. My comments are not dCache related; I just fear we are missing a rare opportunity to remove attributes from an inter-grid standard. I am much happier if we extend GLUE to resolve missing services for LCG/OSG/NorduGrid within our grids and MOUs, but to add these as fields all grids should use seems bloated and bad form, when their function is just to resolve architectural flaws in our current grids' ability to manage their catalogues. Its consequence is to prevent us focusing on missing data and synchronisation services, and to make interoperation harder by adding what I see as non-use-cases, such as data we should just know (the back-end tape system is not a job selection issue), and dynamic trivia such as poor-quality monitoring (which does not include enough data to resolve running out of space anyway, which seems the only use case) - but that's a longer email.

Remember that all fields (and, worse, relations) we add to GLUE won't go away for years, and the costs are most apparent at Tier-2/3 sites. I feel we are not being hard enough on people bringing use cases, and not looking hard enough at alternative solutions. I feel that we could resolve all the SE service discovery with just two (or three, to be consistent with the rest of GLUE) entities if we removed what I currently see as counter-productive use cases.

Since I fear I am in the minority, I will stop this email now.

Owen

Opinions expressed here are mine, and not necessarily shared by Paul or Patrick or the dCache team.

Hi Owen,
Whichever way, we would like to publish such back-end implementation names and versions explicitly. Flavia has a use case: the name and version of the back-end storage system should be available, such that it can be deduced from the information system which sites are likely to suffer from which open issues. This is important information for debugging operational problems in WLCG (and other grids).
Dear all,
Could someone please explain why the back-end storage system is important for GLUE?
The back-end tape system, which is only applicable to a handful of sites
If e.g. StoRM becomes popular, there may be more than a handful. StoRM relies a lot on GPFS to provide the desired functionality (e.g. space reservation and just-in-time ACLs). Various implementations may come to rely on some NFSv4 release that we might want to publish explicitly. The upshot of this discussion is that it seems at least undesirable to have the schema constrain us to a single implementation name and version.
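Purely as an illustration of not constraining the schema to a single implementation, the Resource (or Environment) could carry a list of back-end name/version pairs; all identifiers below are hypothetical:

from dataclasses import dataclass, field
from typing import List

@dataclass
class BackendImplementation:
    Name: str       # e.g. "GPFS", "TSM", "StoRM"
    Version: str

@dataclass
class StorageResource:
    LocalID: str
    Implementations: List[BackendImplementation] = field(default_factory=list)

# Example: an online back-end (GPFS 3.2) plus a nearline back-end (TSM,
# version kept as a placeholder, as in the thread).
resource = StorageResource(
    LocalID="example-resource",
    Implementations=[
        BackendImplementation("GPFS", "3.2"),
        BackendImplementation("TSM", "X.Y.Z"),
    ],
)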
(fewer than 14 sites, I would guess), does not need to be discovered, as we tend to hand-code special provision for such sites in our data flow models. We have a whiteboard at DESY where we track the deployment of all Tier-1s. The data is not so large that it demands discovery for bug fixing.
For me GLUE is mostly about endpoint discovery for Tier-2/3 sites, which have low resources and even lower manpower and yet will be expected to fill in some (not all) of this relational information by hand. I feel that modelling a Tier-1 or a Tier-0 is exactly what we should not be focusing upon, since these sites are already known by name and have FTEs who can fill in GLUE schema values by hand (and our grid-specific extensions). Tier-2/3 sites should ideally do less hand coding, as manpower is an issue there.
Tier-2 and -3 sites can use YAIM to handle most of the configuration issues. These days the documentation is in fairly good shape and the code explicitly requires the necessary parameters to be filled in, so it should become easier for sites to publish sensible stuff.
I'm in favour of a catalogue mapping service so that catalogue synchronisation can occur; if such a service were added to the LCG framework, it would make the tiny amount of "monitoring" data in GLUE useless, as the catalogues would know what data was used at each site and, far more usefully, could delete files rather than just know they need to be deleted.
Each of the LHC VOs probably will implement its own catalogue mapping or consistency service, but there will be no general service for other VOs on EGEE for quite a while, if ever. Other grids may have the same problem. It is true that having "monitoring" data in the information system may lead to false expectations, and can only do a superficial job w.r.t. accounting. Still, we have experience with implementing GLUE schema info providers and those we have some control over, while a proper accounting service has no clear timeline, AFAIK. Furthermore, just about everything is optional, so we may decide later on that some attributes are not worth the effort after all: cf. the storage library in GLUE 1.x, finally deprecated in 1.3.
Patrick has told me the dCache position is to accept all requests we can deliver and fight against all we can't do. Paul now represents dCache for GLUE; I just drop an email in now and again.
My comments are not dCache related; I just fear we are missing a rare opportunity to remove attributes from an inter-grid standard. I am much happier if we extend GLUE to resolve missing services for LCG/OSG/NorduGrid within our grids and MOUs, but to add these as fields all grids should use seems bloated and bad form, when their function is just to resolve architectural flaws in our current grids' ability to manage their catalogues. Its consequence is to prevent us focusing on missing data and synchronisation services, and to make interoperation harder by adding what I see as non-use-cases, such as data we should just know (the back-end tape system is not a job selection issue), and dynamic trivia such as poor-quality monitoring (which does not include enough data to resolve running out of space anyway, which seems the only use case) - but that's a longer email. Remember that all fields (and, worse, relations) we add to GLUE won't go away for years, and the costs are most apparent at Tier-2/3 sites. I feel we are not being hard enough on people bringing use cases, and not looking hard enough at alternative solutions. I feel that we could resolve all the SE service discovery with just two (or three, to be consistent with the rest of GLUE) entities if we removed what I currently see as counter-productive use cases.
I agree with your concerns, but here is a possible counter argument: if GLUE 2.0 does not include all the 1.3 functionality that users and developers appreciate, it may be a long time before we start seeing 2.0 used for anything real. We can take a very purist view of what should be in the information system and find that grids immediately have to hack it, so that they can get their work done. It has happened with GLUE 1.x, it can happen again. Cheers, Maarten

Hi Maarten,
* For Storage Share:
1- add a Shared attribute in the storage share whose type is boolean; for "shared" shares, the value should be true
2- add an AggregationLocalID attribute; for the "shared" shares within the same storage service, this attribute should be assigned the same value
In this way, we avoid the creation of one more level of hierarchy, and visualization tools that want to show summary info can avoid double counting by checking the two attributes that we propose.
So, you would publish such a shared Share multiple times, once per VO. Each such instance then gives a VO view of that Share. I do not see a problem for the info provider to cook up the correct values for the boolean flag and the AggregationLocalID, but I do note that compared to the proposal by Felix we lose some functionality: if each of the VOs has a _quota_ in the Share, we would publish that number as, say, its online TotalSize --> this means we no longer have the _physical_ TotalSize of the Share published anywhere. Maybe not a big loss...
So, you mean that within a "Shared Share" a VO could have a Quota? This reminds me of a really "old" discussion we had in 2003. Have a look at this page, almost at the end, under "Contributions": http://www.cnaf.infn.it/~andreozzi/datatag/glue/working/SE/index.html What do you think of it? Cheers, Sergio
--
Sergio Andreozzi
INFN-CNAF,                  Tel: +39 051 609 2860
Viale Berti Pichat, 6/2     Fax: +39 051 609 2746
40126 Bologna (Italy)       Web: http://www.cnaf.infn.it/~andreozzi
participants (4):
- Felix Nikolaus Ehm
- Maarten.Litmaath@cern.ch
- owen.synge@desy.de
- Sergio Andreozzi