New attribute for GLUE 2.1: StorageShare.ViewID

Hi all,

After playing some more with GLUE 2.0 (and getting feedback from someone trying to use it!), there seems to be a missing concept in StorageShare, which I'll call "views".

The idea is to allow publishing two (or more) groups of Storage Shares such that, within such a group, the Storage Shares can be non-overlapping but shares from different groups can overlap. Each group of non-overlapping shares provides a different "view" on the storage system. It's likely that the underlying hardware is the same (with the sum of StorageShareCapacity.Total giving the same value), but the different views (different groups of storage shares) would partition this storage differently.

Here's a more concrete example: consider a storage service that allows capacity to be reserved but doesn't bind those reservations to particular hardware. This is difficult to publish currently; but, these could be published as two different views: one view shows the hardware, the other shows the reservations.

I believe we can add views while remaining backwards compatible with GLUE 2.0 attributes and semantics; in particular, we can keep the semantics of SharingID unmodified for existing publishers.

Here's the proposal:

Attribute: ViewID
Type: LocalID_t
Mult: 0..1 ("optional")
Unit: <none>
Description: A local identifier that groups together similar Storage Shares. Two shares are part of the same view if and only if both do not publish a ViewID or both publish the same ViewID value. Two Storage Shares that are part of the same view and have either distinct SharingID values or the same "dedicated" SharingID value are non-overlapping, otherwise they may overlap.

Since the attribute is optional, existing publishers will continue to work, but will refrain from publishing a ViewID attribute. Since not publishing ViewID places the Storage Share objects in the same view, the GLUE 2.0 SharingID semantics is preserved.

Cheers,
Paul.
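
As a rough illustration of the Description above, the overlap rule could be read as the following Python sketch. The class, field and function names are hypothetical and are not part of GLUE; only ViewID, SharingID and the "dedicated" value come from the proposal. A share that does not publish ViewID is modelled with view_id=None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageShare:
    """Hypothetical stand-in for a published Storage Share."""
    name: str                       # illustrative label only
    sharing_id: str                 # GLUE 2.0 SharingID
    view_id: Optional[str] = None   # proposed ViewID; None = not published


def same_view(a: StorageShare, b: StorageShare) -> bool:
    # Same view iff both omit ViewID or both publish the same value.
    return a.view_id == b.view_id


def may_overlap(a: StorageShare, b: StorageShare) -> bool:
    # Shares in different views may overlap.
    if not same_view(a, b):
        return True
    # Same view, distinct SharingID values: non-overlapping.
    if a.sharing_id != b.sharing_id:
        return False
    # Same view, same SharingID: non-overlapping only if it is "dedicated".
    return a.sharing_id != "dedicated"
```

With this reading, two shares that publish no ViewID fall back to the SharingID-only comparison, which is the backwards-compatibility argument made in the proposal.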

Paul Millar [mailto:paul.millar@desy.de] said:
Since the attribute is optional, existing publishers will continue to work, but will refrain from publishing a ViewID attribute. Since not publishing ViewID places the Storage Share objects in the same view, the GLUE 2.0 SharingID semantics is preserved.
What about the client side - will a client which doesn't know about the new attribute be able to make sense of data which uses it? Stephen

On 24/09/14 13:28, stephen.burke@stfc.ac.uk wrote:
Paul Millar [mailto:paul.millar@desy.de] said:
Since the attribute is optional, existing publishers will continue to work, but will refrain from publishing a ViewID attribute. Since not publishing ViewID places the Storage Share objects in the same view, the GLUE 2.0 SharingID semantics is preserved.
What about the client side - will a client which doesn't know about the new attribute be able to make sense of data which uses it?
Certainly if the schema is updated but no info-provider publishes a ViewID then it is completely backwards compatible, even with a mixture of GLUE 2.0 and GLUE 2.1 clients.

If there are GLUE 2.1 info-providers publishing ViewID values then GLUE 2.0 clients will treat all Storage Shares as belonging to the same view. This might or might not cause a problem, but only for clients that are doing accounting by aggregating shares.

The update I have in mind specifically for dCache would publish the same set of shares, but with different ViewID values. Therefore, at least for dCache, GLUE 2.0 clients would not be affected by the change.

If a GLUE 2.1 info-provider wanted to make use of Views then there is a risk of confusing GLUE 2.0 clients that aggregate shares to do accounting. I think this scenario is pretty unlikely.

To put this in context, storage accounting is currently broken in dCache as we must publish both SRM reservations and the underlying storage. I think adding views is the best way of fixing the problem.

Cheers,
Paul.
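
The aggregation risk described above can be sketched in Python as follows. This is only an illustration under stated assumptions: shares are reduced to (ViewID, Total) pairs, the capacity figures are invented, and the function names are hypothetical rather than taken from any existing client.

```python
from collections import defaultdict


def totals_by_view(shares):
    """Sum capacity per view, as a ViewID-aware client might do."""
    totals = defaultdict(int)
    for view_id, total in shares:
        totals[view_id] += total
    return dict(totals)


def total_ignoring_views(shares):
    """Sum every share, as a client unaware of ViewID would do."""
    return sum(total for _, total in shares)


# Invented figures: the same storage partitioned in two different ways.
published = [
    ("hardware", 60), ("hardware", 40),
    ("reservations", 70), ("reservations", 30),
]
```

Here totals_by_view(published) reports the same total for each view, while total_ignoring_views(published) counts the same storage once per view.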

Dear Paul,
Paul Millar [mailto:paul.millar@desy.de] said:
Since the attribute is optional, existing publishers will continue to work, but will refrain from publishing a ViewID attribute. Since not publishing ViewID places the Storage Share objects in the same view, the GLUE 2.0 SharingID semantics is preserved.
What about the client side - will a client which doesn't know about the new attribute be able to make sense of data which uses it?
Certainly if the schema is updated but no info-provider publishes a ViewID then it is completely backwards compatible, even with a mixture of GLUE 2.0 and GLUE 2.1 clients.
If there are GLUE 2.1 info-providers publishing ViewID values then GLUE 2.0 clients will treat all Storage Shares as belonging to the same view. This might or might not cause a problem, but only for clients that are doing accounting by aggregating shares.
The update I have in mind specifically for dCache would publish the same set of shares, but with different ViewID values. Therefore, at least for dCache, GLUE 2.0 clients would not be affected by the change.
If a GLUE 2.1 info-provider wanted to make use of Views then there is a risk of confusing GLUE 2.0 clients that aggregate shares to do accounting. I think this scenario is pretty unlikely.
To put this in context, storage accounting is currently broken in dCache as we must publish both SRM reservations and the underlying storage. I think adding views is the best way of fixing the problem.
If we need to write GLUE 2.1 info providers this is not in my opinion a backwards compatible change. One thing is to add new entities for Cloud, which is something that requires new info providers in any case and it's mostly affecting a few sites right now, and another thing is to add a new attribute that requires new info providers to make use of it. This will require a coordinated deployment to be able to make use of the View concept. Could you please explain why suddenly this is an issue, who has requested this use case, how many VOs or users are affected otherwise and why it can't be addressed with the existing GLUE 2.0 schema? This is not clear to me. Before proposing the change to the GLUE WG, I think we need further discussion also among the rest of the Storage developers. What happens with DPM or StoRM? If they don't publish this attribute, will those sites or users who have requested that use case be happy? Will they target only dCache installations? I would like all Storage systems to be aligned. I don't like the idea that we introduce an optional attribute that will be mandatory for dCache but not for the others. And why do you say that storage accounting is broken in dCache only now? This is the first time I hear this, when I have spent the whole summer trying to improve storage info providers and the feedback I got from dCache is that everything was OK for GLUE 2 publishing. Thanks! Maria

Hi Maria, On 25/09/14 09:04, Maria Alandes Pradillo wrote:
If we need to write GLUE 2.1 info providers this is not in my opinion a backwards compatible change.
No, please re-read my explanation: nobody needs to write a GLUE 2.1 info-provider. Some groups (like dCache) may *choose* to write a GLUE 2.1 info-provider to better describe the storage; however, existing (GLUE 2.0) info-providers will continue to work fine.
Could you please explain why suddenly this is an issue,
I *think* what has happened is this: dCache published some Storage Shares. Broadly speaking, there are two types: ones that show SRM space-reservations and ones that show physical hardware.

The Storage Shares representing SRM space-reservations are correct and easily selected as they have a non-empty Tag attribute. This satisfies use-cases, like ATLAS, who do space accounting through SRM space-reservations.

The Storage Shares representing physical hardware were intended for "installed capacity" calculations. I know that, for at least one Tier-1, the installed capacity numbers are provided as a spreadsheet and are calculated manually, NOT from information published by GLUE. I suspect this is true for other sites, but I could be wrong here.

The problem is that, for dCache, the Storage Shares representing SRM space-reservations and those representing physical hardware are two views on the same storage which overlap. The Storage Shares in the "SRM space-reservations" view do not overlap. Similarly, the Storage Shares in the "physical hardware" view do not overlap.

In GLUE-2.0, there is no way to represent this. In GLUE-2.0, pairs of Storage Shares either overlap or they are non-overlapping.
who has requested this use case,
The ultimate use-case is for calculating the installed capacity, which is a long established use-case in WLCG. I believe introducing an extra attribute is needed to provide the necessary information for a dCache instance.
how many VOs or users are affected otherwise
I believe all WLCG VOs that take part in the installed capacity calculations are affected.
and why it can't be addressed with the existing GLUE 2.0 schema?
Because GLUE 2.0 provides only the SharingID attribute, which describes whether or not two Storage Shares overlap. This is not a rich enough language to describe a dCache instance. (SharingID may have some other problems, but those we can work-around.)
Before proposing the change to the GLUE WG, I think we need further discussion also among the rest of the Storage developers.
I thought the GLUE WG was the correct place to discuss a proposed change to GLUE.
What happens with DPM or StoRM?
They need do nothing: the proposed change is backwards compatible for GLUE 2.0 info-providers.
If they don't publish this attribute, will those sites or users who have requested that use case be happy?
If they (DPM and StoRM info-providers) don't publish this attribute, sites and users will continue to be as happy as they are now.
Will they target only dCache installations?
I'm not sure what you mean by "they" and "target"; however, the proposal was carefully crafted to be backwards compatible for GLUE 2.0 info-providers. The overall effect is to unify how storage systems are represented by allowing a client to discover information about any kind of storage system.
I would like all Storage systems to be aligned.
Me too. Currently they are not. I would like to change this; hence the proposal.
I don't like the idea that we introduce an optional attribute that will be mandatory for dCache but not for the others.
Storage systems are different and behave differently. Publishing different subsets of attributes is a natural consequence of this. The point here is that GLUE 2.0 is incapable of describing a dCache site in such a way that the installed capacity may be calculated from the published data. GLUE 2.0 is "broken" in the sense that it cannot describe a dCache site such that the installed capacity may be calculated. I would like to "fix" GLUE 2.0 so an info-provider can describe a dCache instance.
And why do you say that storage accounting is broken in dCache only now?
Simple: because I didn't realise, until now, that it was broken.
This is the first time I hear this, when I have spent the whole summer trying to improve storage info providers and the feedback I got from dCache is that everything was OK for GLUE 2 publishing.
Sure, dCache publishes some wonderful GLUE 2.0 information. One can do many things with that information. However, it turns out that calculating the installed-capacity isn't one of them. Like many things, it's only when there are users that the problems are found. However, please view this as a normal bug-fixing process: a problem has been discovered. We are in the process of understanding it. A solution has been proposed, which will be discussed based on its merits. Cheers, Paul.

Dear Paul,
Some groups (like dCache) may *choose* to write a GLUE 2.1 info-provider to better describe the storage; however, existing (GLUE 2.0) info-providers will continue to work fine.
I guess your intention would be that all dCache instances move to the new GLUE 2.1 info provider if this is finally written, right?
The Storage Shares representing physical hardware were intended for "installed capacity" calculations.
I know that, for at least one Tier-1, the installed capacity numbers are provided as a spreadsheet and are calculated manually, NOT from information published by GLUE. I suspect this is true for other sites, but I could be wrong here.
This is my understanding as well. In many cases I think the BDII is not used.
The problem is that, for dCache, the Storage Shares representing SRM space-reservations and those representing physical hardware are two views on the same storage which overlap. The Storage Shares in the "SRM space-reservations" view do not overlap. Similarly, the Storage Shares in the "physical hardware" view do not overlap.
Do you mean SRM shares could overlap with Physical Shares or do you mean that SRM shares overlap among themselves and the same with Physical Shares? In any case, if only Physical Shares are intended to be used for installed capacity, these should be the ones used by sites who may want to rely on the BDII for installed capacities, right? What is then published in GLUE2StorageServiceCapacity? The total of Physical Shares?
The ultimate use-case is for calculating the installed capacity, which is a long established use-case in WLCG. I believe introducing an extra attribute is needed to provide the necessary information for a dCache instance.
But who has asked for this? My understanding is that nobody is interested in relying on the BDII for this. This is the reason why REBUS now offers an interface for sites to manually enter this information for the monthly WLCG reports. I'm not sure it is a use case we would like to address. I made a comparison in February between WLCG Accounting reports and what the BDII and REBUS were publishing in terms of installed capacities, and I think there was not a single match for any T1: http://wlcg-mon.cern.ch/dashboard/request.py/siteview#currentView=WLCG+Accou... Obviously, the sites are reporting things that somehow are not reflected in the BDII. But obviously this is not an issue, since for WLCG what matters is what they put in their monthly reports manually.
Because GLUE 2.0 provides only the SharingID attribute, which describes whether or not two Storage Shares overlap. This is not a rich enough language to describe a dCache instance.
Why is it not enough for dCache? Is it enough for the other storage services? I read the definition of your new ViewID attribute and I find it complex to understand. It looks like a SharingID attribute but with more complexity added. Moreover, the capacities are published in the StorageShareCapacity object, so how would sites know which StorageShareCapacity object they have to use to calculate the installed capacity, and how will the ViewID attribute help to make this correct in the future, if now it's not correct?
I thought the GLUE WG was the correct place to discuss a proposed change to GLUE.
Well, as I said, I would like all storage systems to be aligned. In this working group there are no representatives of DPM or StoRM, for instance.
They need do nothing: the proposed change is backwards compatible for GLUE 2.0 info-providers.
Well, it would be good to understand whether what you want to change could be also useful for them. If we introduce a new attribute, maybe this could also be useful for them.
I would like all Storage systems to be aligned.
Me too. Currently they are not. I would like to change this; hence the proposal.
Sorry, I don't see how this proposal makes storage systems aligned. Moreover, it is trying to address a use case that is nowadays implemented in REBUS through manual editing of the installed capacities. For sure the WLCG management hasn't asked for this; I still wonder who is now asking for it.
The point here is that GLUE 2.0 is incapable of describing a dCache site in such a way that the installed capacity may be calculated from the published data.
Have you got in touch with, e.g., FZK, IN2P3, NDGF, SARA, TRIUMF or pic, just to mention some of the T1s using dCache, and asked them how they provide their installed capacities to WLCG and whether they actually need this feature you want to implement?
GLUE 2.0 is "broken" in the sense that it cannot describe a dCache site such that the installed capacity may be calculated. I would like to "fix" GLUE 2.0 so an info-provider can describe a dCache instance.
As I have already said, I'm not sure this is an actual problem. But I guess it's also up to you to decide which new features are available in dCache. Regards, Maria

Paul Millar [mailto:paul.millar@desy.de] said:
The Storage Shares representing physical hardware were intended for "installed capacity" calculations.
As far as I remember the Capacity objects have a Type attribute which is supposed to be e.g. "online" for the standard uses and "installedonline" for installed capacity publication (and since it's an enumerated type, other values can be defined). Is it not possible to get the effect you want using that? Stephen

On 26/09/14 11:12, stephen.burke@stfc.ac.uk wrote:
Paul Millar [mailto:paul.millar@desy.de] said:
The Storage Shares representing physical hardware were intended for "installed capacity" calculations.
As far as I remember the Capacity objects have a Type attribute which is supposed to be e.g. "online" for the standard uses and "installedonline" for installed capacity publication (and since it's an enumerated type, other values can be defined). Is it not possible to get the effect you want using that?
The short answer: no.

The longer answer: yes.

The slightly longer answer still: No, because these are different concepts: SSC.Type is about what kind of storage you are accounting for, not whether or not two Storage Shares overlap.

The more accurate answer: Yes, by having explicit knowledge of how the WLCG clients behave, I can construct information such that one set of clients see one thing and the other set of clients see something else.

Specifically, I "know" (suspect) that WLCG clients querying about SRM reservations are interested in StorageShareCapacity where Type=ONLINE. I also "know" (strongly suspect) that those clients doing accounting are interested in Type=INSTALLEDONLINE. Therefore, I can publish SRM shares with only SSCs with Type=ONLINE and Physical shares with only SSCs with Type=INSTALLEDONLINE.

Magic! Problem solved!

But the solution is fragile, requires undocumented knowledge, and excludes other clients from working. What if ATLAS wants to query INSTALLEDONLINE values for their space reservations?

Also, this is the kind of name-convention-style brokenness we wanted to get away from in the transition from GLUE 1.3 to GLUE 2.0.

I say let's fix this with GLUE 2.1, since we are making another release. (A ViewID *may* even be useful outside of the Storage model.)

As usual, my 2 Euro-cents worth ;-)

Paul.
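
To make the fragility of the Type-based workaround concrete, here is a minimal Python sketch. The share names, the dictionary layout and the capacity figures are invented for illustration, and the helper is hypothetical; only the "online" and "installedonline" Type values come from the discussion above.

```python
# Invented publication following the workaround: SRM shares carry only
# "online" capacities, physical shares carry only "installedonline".
published = [
    {"share": "atlas-reservation", "capacities": {"online": 70}},
    {"share": "cms-reservation", "capacities": {"online": 30}},
    {"share": "pool-group-1", "capacities": {"installedonline": 100}},
]


def capacities_of_type(shares, capacity_type):
    """Return {share name: Total} for shares publishing that capacity Type."""
    return {
        s["share"]: s["capacities"][capacity_type]
        for s in shares
        if capacity_type in s["capacities"]
    }


reservations = capacities_of_type(published, "online")
installed = capacities_of_type(published, "installedonline")
```

A client that filters on "online" sees only the reservations and one that filters on "installedonline" sees only the hardware, but a client asking for the "installedonline" capacity of a reservation finds nothing, which is the exclusion noted above.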