Re: [glue-wg] Extending GLUE 2.0 for Cloud services

Dear GLUE WG mailing list,
in the context of the EGI Federated Cloud <http://www.egi.eu/solutions/fed-cloud/>, we have come across the need for a standard for information discovery in the Cloud ecosystem. We looked at the existing standards of the Cloud/Grid world and, considering that no current standard seems to meet our requirements, we decided to propose an extension to the GLUE 2.0 schema to cover Cloud Compute and Storage services.
The first proposed draft is attached. We would really appreciate your comments and maybe to add a point for discussion in the next WG meeting.
Thank you a lot in advance for your support.
Best Regards, Salvatore Pinto.
-- Salvatore Pinto Cloud Technologist, EGI.eu e-mail: salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands
On 16/10/2013 22:24, JP Navarro wrote:
Hello Sergio,
Sounds interesting.
We'll put it on the agenda for a future meeting once the working group has seen the document and had a chance to review it. Perhaps your colleagues can answer questions as well as propose how to integrate the extensions into GLUE 2.0.
Regards,
JP
On Oct 10, 2013, at 10:59 AM, Sergio Andreozzi <sergio.andreozzi@egi.eu> wrote:
Hi Shiraz, JP,
My colleagues (in copy) have produced extensions to GLUE 2.0 for Cloud in the context of the EGI Federated Cloud activities (http://www.egi.eu/solutions/fed-cloud/). They would like to submit a document to the group so that it could be discussed, agreed and then submitted for publication.
They will join the mailing list and liaise with you for the process. I guess this activity will be of interest given the growing activities around federated clouds.
Cheers, Sergio
-- Sergio Andreozzi - Strategy and Policy Manager - EGI.eu http://www.egi.eu -skype:sergio.andreozzi.egi

Hi Salvatore, On 04/11/13 13:58, Salvatore Pinto wrote:
The first proposed draft is attached. We would really appreciate your comments and maybe to add a point for discussion in the next WG meeting.
Thanks for posting the document. I haven't really gone through it in detail, so these are just some initial observations.

First, it looks like you have inserted the word "Cloud" as a prefix to every class in the document. This isn't an extension, but a complete rewrite!

Second, GLUE 2 contains extension points to allow communities to gain experience with new information without requiring the change of LDAP (or other) schemas on deployment machines. If you haven't already got operational experience of the cloud extensions then you really should consider using these extension points first.

Third, some of the objects show no additional information beyond the existing GLUE 2.0 definitions; for example, the CloudStorageAccessProtocol seems identical (from memory) to the StorageAccessProtocol.

Fourth, the inheritance model seems broken. Many classes show considerable overlap with their non-Cloud equivalents: CloudStorageServiceCapacity has TotalSize, FreeSize, etc, but also StoredObjects. As the cloud versions do not extend from the non-Cloud classes, a publishing system would have to publish twice as many objects (the Cloud ones and the non-Cloud ones). NB that inheritance isn't the solution to this problem; rather, you should have non-overlapping attributes.

Fifth, have you considered what is current state-of-art within the cloud ecosystem? Talking specifically about storage, you should look at what already exists. In particular, CDMI provides a standards-based interface for interacting with cloud storage. It includes specific metadata about a storage system. I suggest you look at this metadata as a source of inspiration. NB. I'm NOT suggesting you copy all the metadata from CDMI into GLUE 2!

Finally, please be careful that any additional publishing fits in with existing storage systems. Grid and cloud are not two mutually exclusive worlds; with dCache, for example, we have software where a single storage instance can provide both grid- and cloud-like storage. It should be possible for such systems to publish themselves without doubling the number of objects.

HTH, Paul.

Hi Paul, answers inline. On 04/11/2013 18:15, Paul Millar wrote:
Hi Salvatore,
On 04/11/13 13:58, Salvatore Pinto wrote:
The first proposed draft is attached. We would really appreciate your comments and maybe to add a point for discussion in the next WG meeting.
Thanks for posting the document.
thank you a lot for the comments :)
I haven't really gone through it in detail, so these are just some initial observations.
First, it looks like you have inserted the word "Cloud" as a prefix to every class in the document. This isn't an extension, but a complete rewrite!
I named it an extension because it inherits the same main entities and a similar conceptual model from GLUE 2.0, so I considered it an extension to the standard. Sorry if I was wrong on that; maybe it is indeed an "addition" more than an "extension".
Second, GLUE 2 contains extension points to allow communities to gain experience with new information without requiring the change of LDAP (or other) schemas on deployment machines. If you haven't already got operational experience of the cloud extensions then you really should consider using these extension points first.
We considered this option but, for Compute, we also have some changes in the relationships between the objects, and there are mandatory attributes in the Grid elements which make no sense in the Cloud world. Anyway, the main reason for not using the extension points was that, considering the different kinds of services the Cloud and the Grid offer to users, we wanted to keep the entities separate.
Third, some of the objects show no additional information beyond the existing GLUE 2.0 definitions; for example, the CloudStorageAccessProtocol seems identical (from memory) to the StorageAccessProtocol.
That is true, the Storage schema is very close to the Grid version and can indeed be rewritten by inheriting the Storage elements or just modifying them. From my point of view, the big difference between Cloud and Grid storage is that Cloud storage is not file-oriented like the grid one but more object-oriented (where an object can be a file, a disk image, a stream or a generic object). So, in this view, the "Grid" storage is a specialization of the "Cloud" one (where object type=files) and it would probably be better to change the GLUE 2.0 Storage element to consider objects instead of files and have one single entity. What do you think of that?
Fourth, the inheritance model seems broken. Many classes show considerable overlap with their non-Cloud equivalents: CloudStorageServiceCapacity has TotalSize, FreeSize, etc, but also StoredObjects. As the cloud versions do not extend from the non-Cloud classes, a publishing system would have to publish twice as many objects (the Cloud ones and the non-Cloud ones).
as I said before, Storage objects are taken completely from the non-Cloud ones and we wanted to have twice the objects to keep Cloud and Grid services separated.
NB that inheritance isn't the solution to this problem; rather, you should have non-overlapping attributes.
Fifth, have you considered what is current state-of-art within the cloud ecosystem? Talking specifically about storage, you should look at what already exists. In particular, CDMI provides a standards-based interface for interacting with cloud storage. It includes specific metadata about a storage system. I suggest you look at this metadata as a source of inspiration. NB. I'm NOT suggesting you copy all the metadata from CDMI into GLUE 2!
CDMI metadata are mostly related to file-level options, for example ACLs, file redundancy, file encryption, etc. (source: http://cdmi.sniacloud.com/cdmi_spec/16-metadata/16-metadata.htm). We could try to extend these attributes to system level and assign them to the storage service or other entities, but with this we would break one of the main features of Cloud storage, which is the freedom for the user to encrypt one file and not another, or to share one file with the world, one only with colleagues, and keep another restricted. We could anyway add the "Support for Data system Metadata", for example FileEncryptionSupported=[yes/no], etc. I will think about that.
Finally, please be careful that any additional publishing fits in with existing storage systems. Grid and cloud are not two mutually exclusive worlds; with dCache, for example, we have software where a single storage instance can provide both grid- and cloud-like storage. It should be possible for such systems to publish themselves without doubling the number of objects.
again, one point for modifying the original Storage elements to consider objects instead of files.
HTH,
Paul.
_______________________________________________ glue-wg mailing list glue-wg@ogf.org https://www.ogf.org/mailman/listinfo/glue-wg
Cheers (and thanks again for the comments), Salvatore
-- Salvatore Pinto Cloud Technologist, EGI.eu e-mail: salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Salvatore Pinto said:
That is true, the Storage schema is very close to the Grid version and can indeed be rewritten by inheriting the Storage elements or just modifying them. From my point of view, the big difference between Cloud and Grid storage is that Cloud storage is not file-oriented like the grid one but more object-oriented (where an object can be a file, a disk image, a stream or a generic object). So, in this view, the "Grid" storage is a specialization of the "Cloud" one (where object type=files) and it would probably be better to change the GLUE 2.0 Storage element to consider objects instead of files and have one single entity. What do you think of that?
I don't think the current schema really represents files - the word "file" may appear in the text but you could replace it with "data object" without changing anything. It isn't feasible to publish information about individual files, all we have is summary information about larger blocks of storage. Also storage systems presumably don't have the biggest difference between cloud and grid computing services, namely that they aren't persistent and can be created and destroyed - storage has to be persistent to be useful!
CDMI metadata are mostly related to file-level options, for example ACLs, file redundancy, file encryption, etc. (source: http://cdmi.sniacloud.com/cdmi_spec/16-metadata/16-metadata.htm). We could try to extend these attributes to system level and assign them to the storage service or other entities, but with this we would break one of the main features of Cloud storage, which is the freedom for the user to encrypt one file and not another, or to share one file with the world, one only with colleagues, and keep another restricted.
You certainly wouldn't want to publish the state of individual files, but you may want to advertise the capability to do various things, similarly to the support for different access protocols. Stephen -- Scanned by iCritical.

Hi Stephen,
yes, replacing "files" with "data object" in the text is fine but, in my opinion, we should also make the following changes:
* Add to the StorageEndpoint element the following attribute:
  SupportedObjects (type CloudStorageT_t, multiplicity *): supported data object formats (e.g. disk images, files, generic objects, etc.)
  For the user it is important to know which kind of storage the service provides: storage for files, disk images, EBS attached storage, etc.
* Add to StorageShare the following attributes:
  MaxObjectSize (type UInt64, multiplicity 0..1, unit MB): maximum size of a data object that can be stored in this share
  MinObjectSize (type UInt64, multiplicity 0..1, unit MB): minimum size of a data object that can be stored in this share
  This is very important for the user (especially for EBS storage, where the storage object is a disk attached to a VM) to choose which storage service best fits their needs, and for automatic systems to perform auto-scaling of the storage.
Of course, we could use the extensions to add these attributes and not change the standard...
Regarding the capabilities for the storage taken from CDMI or other cloud standards (e.g. support for file encryption, single-file ACLs, etc.), they may go into the Capability attribute of the StorageEndpoint, which is an open enumeration, so we need to add nothing for that.
Cheers, Salvatore.
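As a rough illustration of the two size attributes proposed above, the following Python sketch models a share with optional MinObjectSize and MaxObjectSize values (in MB, multiplicity 0..1) and checks whether an object of a given size could be stored. The class name, field names and the `accepts` helper are hypothetical and not part of GLUE 2.0; treating an unpublished bound as "no limit" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageShare:
    """Hypothetical, minimal stand-in for a GLUE 2 StorageShare."""
    share_id: str
    min_object_size_mb: Optional[int] = None  # proposed MinObjectSize (0..1)
    max_object_size_mb: Optional[int] = None  # proposed MaxObjectSize (0..1)


def accepts(share: StorageShare, size_mb: int) -> bool:
    """Return True if an object of size_mb fits the published bounds.

    Assumption: a bound that is not published imposes no constraint.
    """
    if share.min_object_size_mb is not None and size_mb < share.min_object_size_mb:
        return False
    if share.max_object_size_mb is not None and size_mb > share.max_object_size_mb:
        return False
    return True


# Example values only; 1 GB is taken as 1024 MB here.
volumes = StorageShare("block-volumes", min_object_size_mb=1024,
                       max_object_size_mb=1024 * 1024)
files = StorageShare("files")  # publishes no bounds

print(accepts(volumes, 512))   # below the published minimum
print(accepts(volumes, 2048))  # within the published bounds
print(accepts(files, 512))     # no bounds published
```

A client choosing between storage services could apply such a check to every share it discovers before attempting to create an object.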

On Tue, Nov 5, 2013 at 1:49 PM, Salvatore Pinto <salvatore.pinto@egi.eu> wrote:
Hi Stephen, yes, replacing "files" with "data object" in the text is fine but, in my opinion, we should also make the following changes:
* Add to the StorageEndpoint element the following attribute:
  SupportedObjects (type CloudStorageT_t, multiplicity *): supported data object formats (e.g. disk images, files, generic objects, etc.)
  For the user it is important to know which kind of storage the service provides: storage for files, disk images, EBS attached storage, etc.
Why wouldn't that be 'StorageT_t'? The same argument can be made for non-cloudy storage resources.
* Add to StorageShare the following attributes:
  MaxObjectSize (type UInt64, multiplicity 0..1, unit MB): maximum size of a data object that can be stored in this share
  MinObjectSize (type UInt64, multiplicity 0..1, unit MB): minimum size of a data object that can be stored in this share
  This is very important for the user (especially for EBS storage, where the storage object is a disk attached to a VM) to choose which storage service best fits their needs, and for automatic systems to perform auto-scaling of the storage.
Are there systems where MinObjSize is *not* zero?
My $0.02, Andre.
-- Nothing is really difficult.

Hi Andre, On 05/11/2013 14:08, Andre Merzky wrote:
On Tue, Nov 5, 2013 at 1:49 PM, Salvatore Pinto <salvatore.pinto@egi.eu> wrote:
Hi Stephen, yes, replacing "files" with "data object" in the text is fine but, in my opinion, we should also make the following changes:
* Add to the StorageEndpoint element the following attribute:
  SupportedObjects (type CloudStorageT_t, multiplicity *): supported data object formats (e.g. disk images, files, generic objects, etc.)
  For the user it is important to know which kind of storage the service provides: storage for files, disk images, EBS attached storage, etc.
Why wouldn't that be 'StorageT_t'? The same argument can be made for non-cloudy storage resources.
Yes, of course, StorageT_t, my mistake :)
* Add to StorageShare the following attributes:
  MaxObjectSize (type UInt64, multiplicity 0..1, unit MB): maximum size of a data object that can be stored in this share
  MinObjectSize (type UInt64, multiplicity 0..1, unit MB): minimum size of a data object that can be stored in this share
  This is very important for the user (especially for EBS storage, where the storage object is a disk attached to a VM) to choose which storage service best fits their needs, and for automatic systems to perform auto-scaling of the storage.
Are there systems where MinObjSize is *not* zero?
Yes, for example Amazon EBS (storage volumes attached to the VMs) supports sizes from 1 GB to 1 TB. Actually, the allowed object sizes may not even form a continuous range: for example, Interoute EBS allows only 250 GB, 500 GB, 1 TB, 1.5 TB and 2 TB.
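The point about non-continuous sizes can be made concrete with a small Python sketch. It compares a plain min/max range check with a check against an explicit list of sizes, using the Interoute figures quoted above. The `allowed_sizes_gb` list is a hypothetical representation used only for illustration (nothing like it is proposed in this thread), and 1 TB is taken as 1000 GB for simplicity.

```python
# Sizes in GB from the Interoute example; 1 TB assumed to be 1000 GB.
allowed_sizes_gb = [250, 500, 1000, 1500, 2000]


def in_range(size_gb, minimum_gb, maximum_gb):
    """Check a size against a Min/Max pair only."""
    return minimum_gb <= size_gb <= maximum_gb


def in_allowed(size_gb, allowed):
    """Check a size against an explicit list of permitted sizes."""
    return size_gb in allowed


request_gb = 300  # arbitrary example request

# A Min/Max pair derived from the list accepts the request...
print(in_range(request_gb, min(allowed_sizes_gb), max(allowed_sizes_gb)))
# ...although the provider would not actually offer a volume of that size.
print(in_allowed(request_gb, allowed_sizes_gb))
```

In other words, MinObjectSize and MaxObjectSize give a necessary condition for such a service, not a sufficient one.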
My $0.02,
Andre.
Cheers, Salvatore.

Salvatore Pinto [mailto:salvatore.pinto@egi.eu] said:
* Add to StorageEndpoint element the following attribute:
Is that really a property of the endpoint rather than the service as a whole or the storage itself (Share)?
* Add to StorageShare the following attributes:
In fact we had min and max sizes in GLUE 1, but they were removed because they were never used - if there's a clear requirement there should be no problem in restoring them.
Stephen

Hi Stephen,
On 05/11/2013 14:20, stephen.burke@stfc.ac.uk wrote:
Salvatore Pinto [mailto:salvatore.pinto@egi.eu] said:
* Add to StorageEndpoint element the following attribute:
Is that really a property of the endpoint rather than the service as a whole or the storage itself (Share)?
Yes, I was also not sure where to place this. The underlying physical storage may be shared by different types of objects, while the endpoints are usually dedicated to a particular object type (Grid Endpoints usually serve files and streams, EBS is served via OCCI and generic objects via CDMI), so I thought it was better to place it in the Endpoint.
* Add to StorageShare the following attributes:
In fact we had min and max sizes in GLUE 1, but they were removed because they were never used - if there's a clear requirement there should be no problem in restoring them.
Perfect :)
Stephen
Cheers, Salvatore.
-- Salvatore Pinto Cloud Technologist, EGI.eu e-mail: salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands

Salvatore Pinto [mailto:salvatore.pinto@egi.eu] said:
The underlying physical storage may be shared by different types of objects, while the endpoints are usually dedicated to a particular object type (Grid Endpoints usually serve files and streams, EBS is served via OCCI and generic objects via CDMI), so I thought it was better to place it in the Endpoint.
To the extent that the type is implicit in the protocol, e.g. SRM = file, you don't necessarily need an extra attribute at all, and you also have the Capability attribute for things which are relatively generic - we already have data.access.flatfiles for example. Also in the Share we have an AccessMode attribute - for which I think we don't actually have a use at the moment!
Stephen
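A minimal Python sketch of the "type is implicit in the protocol" idea follows. The mapping uses only the examples mentioned in this thread (SRM for files, OCCI for EBS-style volumes, CDMI for generic objects) and is illustrative, not normative. The fallback to capability strings with a 'data.supportedobject.' prefix borrows a naming suggestion made elsewhere in the thread; the function name and the prefix handling are assumptions.

```python
from typing import List

# Illustrative mapping taken from examples in this thread; not normative.
IMPLIED_OBJECT_TYPE = {
    "SRM": "file",
    "OCCI": "disk volume",
    "CDMI": "generic object",
}

# Hypothetical capability prefix, following a suggestion in this thread.
CAPABILITY_PREFIX = "data.supportedobject."


def object_types(interface_name: str, capabilities: List[str]) -> List[str]:
    """Return the object types an endpoint appears to serve.

    The protocol is consulted first; only if it implies nothing are the
    capability strings inspected.
    """
    implied = IMPLIED_OBJECT_TYPE.get(interface_name.upper())
    if implied is not None:
        return [implied]
    return [c[len(CAPABILITY_PREFIX):] for c in capabilities
            if c.startswith(CAPABILITY_PREFIX)]


print(object_types("srm", []))
print(object_types("example-protocol",
                   ["data.supportedobject.diskimage", "data.access.flatfiles"]))
```

With this approach no extra attribute is needed for well-known protocols, and the open Capability enumeration covers the remaining cases.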

Hi Stephen, sorry for the late answer. On 05/11/2013 15:58, stephen.burke@stfc.ac.uk wrote:
Salvatore Pinto [mailto:salvatore.pinto@egi.eu] said:
The underlying physical storage may be shared by different types of objects, while the endpoints are usually dedicated to a particular object type (Grid Endpoints usually serve files and streams, EBS is served via OCCI and generic objects via CDMI), so I thought it was better to place it in the Endpoint.
To the extent that the type is implicit in the protocol, e.g. SRM = file, you don't necessarily need an extra attribute at all, and you also have the Capability attribute for things which are relatively generic - we already have data.access.flatfiles for example. Also in the Share we have an AccessMode attribute - for which I think we don't actually have a use at the moment!
Stephen
I think the Capability attribute in the endpoint is the best, as Florido proposed. Cheers, Salvatore. -- Salvatore Pinto Cloud Technologist, EGI.eu e-mail:salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands

Hi all,
I have not had time to read the document yet, but here are some comments at first glance on these proposed changes.
On 2013-11-05 13:49, Salvatore Pinto wrote:
Hi Stephen, yes, replacing "files" with "data object" in the text is fine but, in my opinion, we should also make the following changes:
* Add to the StorageEndpoint element the following attribute:
  SupportedObjects (type CloudStorageT_t, multiplicity *): supported data object formats (e.g. disk images, files, generic objects, etc.)
  For the user it is important to know which kind of storage the service provides: storage for files, disk images, EBS attached storage, etc.
I think it's bad design to add one more attribute for this purpose. The question "what kind of storage is provided" should be answered by looking at capabilities. I think capabilities are the key to knowing "what a service does" or "what an endpoint does". Moreover, since they are open enumerations, one is free to define values for whatever is needed, and they are very extensible. Examples:
  data.management.diskimage
  data.management.file
  data.management.genericobject
If you don't like 'management' you can change it to something else, but I feel it fits... something like 'data.supportedobject.diskimage'.
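To show how a client might answer "what kind of storage is provided" from capabilities alone, here is a short Python sketch. The endpoint identifiers and the dictionary layout are hypothetical; the capability strings are the example values suggested above (plus data.access.flatfiles, mentioned elsewhere in this thread) and are not agreed enumeration values.

```python
# Hypothetical endpoint records; only the capability strings come from the thread.
endpoints = [
    {"id": "ep-cdmi", "capability": ["data.management.genericobject"]},
    {"id": "ep-occi", "capability": ["data.management.diskimage"]},
    {"id": "ep-srm", "capability": ["data.management.file",
                                    "data.access.flatfiles"]},
]


def with_capability(records, wanted):
    """Return the ids of the endpoints that advertise the wanted capability."""
    return [r["id"] for r in records if wanted in r["capability"]]


print(with_capability(endpoints, "data.management.diskimage"))
print(with_capability(endpoints, "data.management.file"))
```

Because Capability is an open enumeration, new object kinds only require new strings, not a schema change.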
* Add to StorageShare the following attributes:
  MaxObjectSize (type UInt64, multiplicity 0..1, unit MB): maximum size of a data object that can be stored in this share
  MinObjectSize (type UInt64, multiplicity 0..1, unit MB): minimum size of a data object that can be stored in this share
  This is very important for the user (especially for EBS storage, where the storage object is a disk attached to a VM) to choose which storage service best fits their needs, and for automatic systems to perform auto-scaling of the storage.
I can see the benefit of these attributes. However, maybe the best place to put these two attributes is StorageShareCapacity and not StorageShare.
Of course, we could use the extensions to add these attributes and not change the standard...
I think you can use Extensions until we change the standard :) but since you envisage these will be used extensively in clouds, I see no problem changing the standard.
Cheers, Florido
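One way to read "use Extensions until we change the standard" is sketched below in Python: the proposed values travel as free-form key/value pairs attached to the share, and a consumer reads them defensively. The dictionary layout, the key names and the `extension_int` helper are assumptions made for illustration; they do not reproduce any particular GLUE 2 rendering.

```python
from typing import Optional

# Hypothetical published share; extension values are carried as strings.
share = {
    "ID": "example-share",
    "Extension": {"MaxObjectSize": "1048576", "MinObjectSize": "1024"},
}


def extension_int(entity: dict, key: str) -> Optional[int]:
    """Read an integer extension value, or None if absent or malformed."""
    raw = entity.get("Extension", {}).get(key)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


print(extension_int(share, "MaxObjectSize"))
print(extension_int(share, "StoredObjects"))  # not published in this example
```

If the attributes were later added to the schema proper, only the lookup would need to change, not the consumers of the values.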
-- ================================================== Florido Paganelli ARC Middleware Developer - NorduGrid Collaboration System Administrator Lund University Department of Physics Division of Particle Physics BOX118 221 00 Lund Office Tel: 046-2220272 Email: florido.paganelli@REMOVE_THIShep.lu.se Homepage: http://www.hep.lu.se/staff/paganelli ==================================================

Hi Florido, inline my answers. On 05/11/2013 17:46, Florido Paganelli wrote:
Hi all,
I had no time to read the document yet, but here's some comments at a first glance on these proposed changes.
On 2013-11-05 13:49, Salvatore Pinto wrote:
Hi Stephen, yes, replacing "files" with "data objects" in the text is fine but, in my opinion, we should also make the following changes:
* Add the following attribute to the StorageEndpoint element:
  Attribute        | Type            | Mult. | Unit | Description
  SupportedObjects | CloudStorageT_t | *     |      | Supported data object formats (e.g. disk images, files, generic objects, etc.)
  For the user it is important to know which kind of storage the service provides: storage for files, disk images, EBS attached storage, etc.
I think it's bad design to add one more attribute for this purpose. The question "what kind of storage is provided" should be answered by looking at capabilities. I think capabilities are the key to knowing "what a service does" or "what an endpoint does". Moreover, since they are open enumerations, one can use all one's imagination to make them mean whatever is needed, and they are very extensible.
Examples:
data.management.diskimage
data.management.file
data.management.genericobject
If you don't like 'management' you can change it to something else, but I feel it fits... something like 'data.supportedobject.diskimage'
agreed, the Capabilities field is probably the best for that.
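As a rough illustration of the capabilities approach agreed here, the following Python sketch shows how a client might select endpoints by capability string. The `StorageEndpoint` class below is only a stand-in for a published record, the endpoint IDs are made up, and the capability values are the ones suggested in this thread, not agreed enumeration values.

```python
from dataclasses import dataclass, field


@dataclass
class StorageEndpoint:
    """Minimal stand-in for a published StorageEndpoint record (hypothetical)."""
    endpoint_id: str
    capabilities: set = field(default_factory=set)


def endpoints_supporting(endpoints, capability):
    """Return the IDs of the endpoints that advertise the given capability."""
    return [e.endpoint_id for e in endpoints if capability in e.capabilities]


# Made-up endpoints; the capability strings are the ones proposed in this thread.
endpoints = [
    StorageEndpoint("ep-block", {"data.management.diskimage"}),
    StorageEndpoint("ep-files", {"data.management.file"}),
    StorageEndpoint("ep-mixed", {"data.management.file",
                                 "data.management.genericobject"}),
]

file_endpoints = endpoints_supporting(endpoints, "data.management.file")
print(file_endpoints)
```

Because Capability is an open enumeration, a new kind of data object only needs a new string; the client-side filter does not change.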
* Add the following attributes to StorageShare:
  Attribute     | Type   | Mult. | Unit | Description
  MaxObjectSize | UInt64 | 0..1  | MB   | Maximum size of a data object that can be stored in this share
  MinObjectSize | UInt64 | 0..1  | MB   | Minimum size of a data object that can be stored in this share
This is very important for users (especially for EBS storage, where the storage object is a disk attached to a VM) to choose which storage service best fits their needs, and for automatic systems to perform auto-scaling of the storage.
I can see the benefit of these attributes. However, maybe the best place to put these two attributes is StorageShareCapacity and not StorageShare.
I think Stephen said that these attributes were already in GLUE 1.3. I checked the old specification and the parameters were in the StorageArea entity, which I think is now the StorageShare; that's why I placed them there.
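To make the intended use of the two proposed attributes concrete, here is a small Python sketch of how a client or an auto-scaling system might filter shares by object size. It assumes the attributes end up on StorageShare with the proposed multiplicity (0..1) and unit (MB); the class, the field names and the share IDs are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageShare:
    """Stand-in for a StorageShare record carrying the two proposed attributes."""
    share_id: str
    min_object_size_mb: Optional[int] = None  # proposed MinObjectSize (0..1, MB)
    max_object_size_mb: Optional[int] = None  # proposed MaxObjectSize (0..1, MB)


def accepts(share, object_size_mb):
    """True if the published limits allow an object of this size.

    An unpublished limit (None) is treated as "no constraint known".
    """
    if share.min_object_size_mb is not None and object_size_mb < share.min_object_size_mb:
        return False
    if share.max_object_size_mb is not None and object_size_mb > share.max_object_size_mb:
        return False
    return True


# Made-up shares for illustration only.
shares = [
    StorageShare("ebs-volumes", min_object_size_mb=1024, max_object_size_mb=1048576),
    StorageShare("small-objects", max_object_size_mb=5120),
    StorageShare("unlimited"),
]

# Which shares could hold a 10 GB (10240 MB) disk?
usable = [s.share_id for s in shares if accepts(s, 10240)]
print(usable)
```

Since both attributes are optional, the sketch has to decide what a missing value means; treating it as "unknown, so not excluded" is one possible choice, not something the proposal specifies.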
Cheers, Florido
Cheers, Salvatore. -- Salvatore Pinto Cloud Technologist, EGI.eu e-mail:salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands

Hi Salvatore, I think your work is very interesting: GLUE-2 is trying to be extensible and technology agnostic. Trying to describe cloud services as GLUE-2 objects is an excellent test of how well we've achieved that goal. Some further replies, in-line: On 05/11/13 10:05, Salvatore Pinto wrote:
We considered this option but, for Compute, we also have some changes in the relationships between the objects, and mandatory attributes in the Grid elements which make no sense in the Cloud world.
If there are attributes that are mandatory which make no sense in other kinds of services then we should certainly consider relaxing those requirements. Relationships between objects should also be expressible as extensions without requiring schema changes. If this isn't the case (for whatever reason) then that, too, is something that should be fixed.
Anyway, the main reason for not using the extension was that, considering the different kinds of services the Cloud and the Grid give to users, we wanted to keep the entities separated.
This, I think, is the point which I really disagree with.
Let me give you a concrete example.
We (dCache.org) have just started running a dCache instance that is very much a cloud storage service. Client software (of which there are various) currently access it via WebDAV, but we anticipate adding support for CDMI. We want to make this storage (or a similar cloud service) available to grid users in the future.
IMHO, we *will* have storage systems that provide cloud-like and grid-like services; it would be a good idea not to exclude this from the start.
[...]
CDMI metadata are mostly related to file-level options, for example ACLs, file redundancy, file encryption, etc... (source: http://cdmi.sniacloud.com/cdmi_spec/16-metadata/16-metadata.htm).
Sorry, I didn't explain myself clearly; what you're describing is the CDMI concept of metadata (as per Chapter 16). I meant the more general metadata: "data about something".
The CDMI protocol also supports a client discovering information about the service itself; the main motivation is to allow the client to discover if a service is appropriate for its needs. This matches one of the core uses of GLUE-2: to select an appropriate storage service without querying each service individually.
For examples of this, see section 6.2 "Discovering the Capabilities of a Cloud Storage" and section 12.2 "Cloud Storage System-Wide Capabilities".
The "Exported Protocols" (see Section 13) is broadly similar to the concept of StorageEndpoints or StorageAccessProtocol in GLUE-2, so this would also be an interesting avenue for mapping.
There's also explicit support in CDMI for representing links to OCCI services. This holds a similar role to the ToComputingService objects in GLUE-2.
(you see how these two worlds are not really *that* different)
again, one point for modifying the original Storage elements to consider objects instead of files.
IMHO this isn't a good reason.
In most cases "objects" ("data-objects" in CDMI) are really just a synonym for files and a "bucket" ("container" in CDMI) is a synonym for a directory.
Sure, in S3, there are some limitations on the interactions; that's an implementation-specific detail. If those limitations are important to clients then we can describe them (as per CDMI).
Whether GLUE-2 talks about 'files' or 'objects' (or Feile, Fichier or ファイル) really doesn't matter if the underlying concept is the same.
HTH,
Paul.

Hi Paul, sorry for the late answers. On 05/11/2013 19:09, Paul Millar wrote:
Hi Salvatore,
I think your work is very interesting: GLUE-2 is trying to be extensible and technology agnostic. Trying to describe cloud services as GLUE-2 objects is an excellent test of how well we've achieved that goal.
Some further replies, in-line:
On 05/11/13 10:05, Salvatore Pinto wrote:
We considered this option but, for Compute, we also have some changes in the relationships between the objects, and mandatory attributes in the Grid elements which make no sense in the Cloud world.
If there are attributes that are mandatory which make no sense in other kinds of services then we should certainly consider relaxing those requirements.
Relationships between objects should also be expressible as extensions without requiring schema changes. If this isn't the case (for whatever reason) then that, too, is something that should be fixed.
OK, I did not think about the possibility of defining new relationships as extensions; I thought the option was only for attributes. Anyway, there are other reasons to create new entities. We can discuss that in the teleconference today.
Anyway, the main reason for not using the extension was that, considering the different kinds of services the Cloud and the Grid give to users, we wanted to keep the entities separated.
This, I think, is the point which I really disagree with.
Let me give you a concrete example.
We (dCache.org) have just started running a dCache instance that is very much a cloud storage service. Client software (of which there are various) currently access it via WebDAV, but we anticipate adding support for CDMI. We want to make this storage (or a similar cloud service) available to grid users in the future.
IMHO, we *will* have storage systems that provide cloud-like and grid-like services; it would be a good idea not to exclude this from the start.
[...]
For storage I agree; for Computing I do not. We can discuss that in today's teleconference.
CDMI metadata are mostly related to file-level options, for example ACLs, file redundancy, file encryption, etc... (source: http://cdmi.sniacloud.com/cdmi_spec/16-metadata/16-metadata.htm).
Sorry, I didn't explain myself clearly; what you're describing is the CDMI concept of metadata (as per Chapter 16). I meant the more general metadata: "data about something".
The CDMI protocol also supports a client discovering information about the service itself; the main motivation is to allow the client to discover if a service is appropriate for its needs. This matches one of the core uses of GLUE-2: to select an appropriate storage service without querying each service individually.
For examples of this, see section 6.2 "Discovering the Capabilities of a Cloud Storage" and section 12.2 "Cloud Storage System-Wide Capabilities".
The "Exported Protocols" (see Section 13) is broadly similar to the concept of StorageEndpoints or StorageAccessProtocol in GLUE-2, so this would also be an interesting avenue for mapping.
There's also explicit support in CDMI for representing links to OCCI services. This holds a similar role to the ToComputingService objects in GLUE-2.
(you see how these two worlds are not really *that* different)
again, one point for modifying the original Storage elements to consider objects instead of files.
IMHO this isn't a good reason.
In most cases "objects" ("data-objects" in CDMI) are really just a synonym for files and a "bucket" ("container" in CDMI) is a synonym for a directory.
No, I do not agree. Block storage is widely used in the Cloud today, and a disk image (which is mounted by the hypervisor and exposed as a "hardware" storage disk resource) is quite a different concept from a container or a file. The fact that almost all CDMI implementations currently support only file objects is a separate point and should not be a concern for GLUE, since implementations and interface technology may change in the future. The important point, in my view, is that users will be misled if they see "files" in the specification.
Sure, in S3, there are some limitations on the interactions, that's an implementation-specific detail. If those limitations are important to clients then we can describe them (as per CDMI).
Whether GLUE-2 talks about 'files' or 'objects' (or Feile, Fichier or ファイル) really doesn't matter if the underlying concept is the same.
HTH,
Paul.
Cheers, Salvatore. -- Salvatore Pinto Cloud Technologist, EGI.eu e-mail: salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands

I expect to be on the call in a bit, but I thought I'd send a note that I haven't had any problem representing OpenStack and Nimbus IaaS clouds in the existing GLUE2 model. So far, I've primarily focused on the compute side, but I thought it was as straightforward to map GLUE2 entities to IaaS concepts as it was to map GLUE2 entities to cluster concepts.
I haven't done too much with the storage entities, but so far, it seems to me like the mapping to existing GLUE2 doesn't seem that much more difficult than for clusters...
A final thought is if we want to open the GLUE 2 model for changes, I'll have some suggestions based on my experiences so far. However, I don't feel a huge need to open the model for changes right now.
Warren
-----Original Message----- From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Salvatore Pinto Sent: Tuesday, December 03, 2013 2:30 AM To: Paul Millar Cc: glue-wg@ogf.org; Peter Solagna Subject: Re: [glue-wg] Extending GLUE 2.0 for Cloud services

hi folks,
Regarding the cloud entity sub-type specialisations:
I'm with Warren in that I don't think the conceptual model should be incremented (2.1) unless there are strong and compelling reasons to change the core model and introduce structural updates.
Rather, as was originally intended, shouldn't the cloud specialisations extend the existing abstract base-types and add extra attributes, types, enums etc as required by those specialisations?
These newly derived (cloud) types should then be presented in a separate extension doc that does not need to repeat the existing model, but instead describes how the new derived types extend and supplement the base model.
cheers, David
-------------------------------------------------------------- David Meredith STFC eScience Centre Daresbury Laboratory (A32) Warrington, Cheshire WA4 4AD
email: david.meredith@stfc.ac.uk tel: +44 (0)1925 603762
________________________________________ From: glue-wg-bounces@ogf.org [glue-wg-bounces@ogf.org] on behalf of Warren Smith [wsmith@tacc.utexas.edu] Sent: 03 December 2013 14:18 To: Salvatore Pinto; Paul Millar Cc: glue-wg@ogf.org; Peter Solagna Subject: Re: [glue-wg] Extending GLUE 2.0 for Cloud services

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of david.meredith@stfc.ac.uk said:
I'm with Warren in that I don't think the conceptual model should be incremented (2.1) unless there are strong and compelling reasons to change the core model and introduce structural updates.
Rather, as was originally intended, shouldn't the cloud specialisations extend the existing abstract base-types and add extra attributes, types, enums etc as required by those specialisations?
These newly derived (cloud) types should then be presented in a separate extension doc that does not need to repeat the existing model, but instead describes how the new derived types extend and supplement the base model.
I think there are several different things here:
o Should we call an update GLUE 2.1? I would say yes, because it will be a new version of the complete schema - at least in LDAP we have the whole thing in one file so it gets updated as a unit. Anyway I think it would get messy to keep track of updates otherwise - if the cloud part is separate we will need a separate version for it. And if we did then update the base classes at a later date what would we do with the cloud part, since any inherited attributes could change?
o Should we put the whole thing in a single document or have an extension/update document? Either would be possible, but it seems cleaner to me to have a single document with a new major section for the cloud entities as we already have for compute and storage. Then you have one place to look for everything. Also we have some known mistakes in the existing document and this gives a chance to fix them, and there may be some places where the language can be improved, e.g. Salvatore wants to change "files" to "data objects". We should of course have a detailed change log.
o Should we change anything in the existing entity definitions? That's probably the hardest thing as it risks opening a big can of worms. On the other hand, there may be some justified changes, e.g. to put back the min and max file - oops, data object - sizes in the StorageShare. Also the schema was mostly defined in 2007 and is untouched since early 2009 - the entire history of GLUE 1 evolution took five years! - so it wouldn't be surprising to have some justified changes as a result of subsequent experience or developments. In general terms it's relatively safe/easy to add new optional attributes to existing objects, as it doesn't affect the existing deployed system - that's what we always did for GLUE 1 evolution. However, we have the usual problem that updates take a long time to be deployed so no-one should expect to use new attributes any time soon, and experience suggests that agreeing even a single attribute can take a lot of discussion. I would therefore suggest that any update should be restricted to at most a handful of new optional attributes with a clear justification which has a broad acceptance.
Stephen

On Dec 4, 2013, at 7:31 AM, <stephen.burke@stfc.ac.uk> <stephen.burke@stfc.ac.uk> wrote:
<snip> I think there are several different things here:
o Should we call an update GLUE 2.1? I would say yes, because it will be a new version of the complete schema - at least in LDAP we have the whole thing in one file so it gets updated as a unit. Anyway I think it would get messy to keep track of updates otherwise - if the cloud part is separate we will need a separate version for it. And if we did then update the base classes at a later date what would we do with the cloud part, since any inherited attributes could change?
In a change/version summary we can even highlight that 2.1 introduces Cloud entities and doesn't (significantly) change previous 2.0 entities.
o Should we put the whole thing in a single document or have an extension/update document? Either would be possible, but it seems cleaner to me to have a single document with a new major section for the cloud entities as we already have for compute and storage. Then you have one place to look for everything. Also we have some known mistakes in the existing document and this gives a chance to fix them, and there may be some places where the language can be improved, e.g. Salvatore wants to change "files" to "data objects". We should of course have a detailed change log.
It is a single schema.
o Should we change anything in the existing entity definitions? That's probably the hardest thing as it risks opening a big can of worms. On the other hand, there may be some justified changes, e.g. to put back the min and max file - oops, data object - sizes in the StorageShare. Also the schema was mostly defined in 2007 and is untouched since early 2009 - the entire history of GLUE 1 evolution took five years! - so it wouldn't be surprising to have some justified changes as a result of subsequent experience or developments. In general terms it's relatively safe/easy to add new optional attributes to existing objects, as it doesn't affect the existing deployed system - that's what we always did for GLUE 1 evolution. However, we have the usual problem that updates take a long time to be deployed so no-one should expect to use new attributes any time soon, and experience suggests that agreeing even a single attribute can take a lot of discussion. I would therefore suggest that any update should be restricted to at most a handful of new optional attributes with a clear justification which has a broad acceptance.
Agreed, and making sure none of the changes cause incompatibilities with 2.0 based implementations. JP
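The compatibility argument for optional attributes can be sketched in a few lines of Python. This assumes a consumer written against 2.0 reads only the attributes it knows and ignores the rest, which is an assumption about client behaviour rather than something the schema guarantees. The dictionary layout and the reader function are hypothetical; MaxObjectSize is the optional attribute proposed earlier in this thread.

```python
# Attributes a hypothetical 2.0-era consumer knows about.
KNOWN_2_0_ATTRIBUTES = ("ID", "ServingState")


def read_share_2_0(record):
    """Read a share record the way a consumer written against 2.0 might:
    pick out the attributes it knows and ignore everything else."""
    return {name: record[name] for name in KNOWN_2_0_ATTRIBUTES}


# The same share published without and with a new optional attribute.
record_2_0 = {"ID": "share-1", "ServingState": "production"}
record_2_1 = {"ID": "share-1", "ServingState": "production", "MaxObjectSize": 5120}

# The old consumer sees identical data in both cases.
unchanged = read_share_2_0(record_2_0) == read_share_2_0(record_2_1)
print(unchanged)
```

The same reasoning does not hold for a new mandatory attribute or a changed type, which is why the suggestion is limited to optional additions.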

On Dec 4, 2013, at 12:44 AM, <david.meredith@stfc.ac.uk> <david.meredith@stfc.ac.uk> wrote:
hi folks, Regarding the cloud entity sub-type specialisations:
I'm with Warren in that I don't think the conceptual model should be incremented (2.1) unless there are strong and compelling reasons to change the core model and introduce structural updates.
Given that we have a 2.0 version of the _document_ that describes the conceptual model, are you suggesting a more minor version increment? Perhaps 2.0.1? My view, which may not match how others see things, is that two version numbers is all we need, and incremental numbers within the 2.x series simply signify sequential document version and do not indicate the significance of the changes between each version. Significant and possibly incompatible changes would produce 3.0. In other words, as long as the core model doesn't change significantly, we increment within the 2.x series.
Rather, as was originally intended, shouldn't the cloud specialisations extend the existing abstract base-types and add extra attributes, types, enums etc as required by those specialisations?
I thought that was what we agreed to in our last teleconference: we would not abstract the Compute specializations but rather add new Cloud specializations.
These newly derived (cloud) types should then be presented in a separate extension doc that does not need to repeat the existing model, but instead describes how the new derived types extend and supplement the base model.
The existing model document describes current specializations, why should cloud specializations be treated differently and put in a separate document? JP
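The structure being discussed, new Cloud specialisations alongside the existing ones rather than derived from them, can be sketched with Python dataclasses. `Share` and `ComputingShare` stand in for the existing entities with almost all attributes omitted; `CloudComputeShare` and `instance_types` are hypothetical names used only for illustration and are not taken from the draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Share:
    """Stand-in for the abstract Share entity (attributes trimmed)."""
    id: str
    name: Optional[str] = None


@dataclass
class ComputingShare(Share):
    """Existing Grid specialisation, left untouched."""
    max_wall_time: Optional[int] = None  # seconds


@dataclass
class CloudComputeShare(Share):
    """Hypothetical Cloud specialisation: derived from Share directly,
    so it inherits none of the Grid-specific attributes."""
    instance_types: List[str] = field(default_factory=list)


cloud_share = CloudComputeShare(id="cloud-1", instance_types=["small", "large"])
print(isinstance(cloud_share, Share), isinstance(cloud_share, ComputingShare))
```

A client that only understands the abstract `Share` can still handle both kinds of object, while Grid-only attributes such as the wall-time limit never appear on the Cloud type.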
cheers, David
-------------------------------------------------------------- David Meredith STFC eScience Centre Daresbury Laboratory (A32) Warrington, Cheshire WA4 4AD
email: david.meredith@stfc.ac.uk tel: +44 (0)1925 603762
________________________________________ From: glue-wg-bounces@ogf.org [glue-wg-bounces@ogf.org] on behalf of Warren Smith [wsmith@tacc.utexas.edu] Sent: 03 December 2013 14:18 To: Salvatore Pinto; Paul Millar Cc: glue-wg@ogf.org; Peter Solagna Subject: Re: [glue-wg] Extending GLUE 2.0 for Cloud services
I expect to be on the call in a bit, but I thought I'd send a note that I haven't had any problem representing OpenStack and Nimbus IaaS clouds in the existing GLUE2 model. So far, I've primarily focused on the compute side, but I thought it was as straightforward to map GLUE2 entities to IaaS concepts as it was to map GLUE2 entities to cluster concepts.
I haven't done too much with the storage entities, but so far, it seems to me like the mapping to existing GLUE2 doesn't seem that much more difficult than for clusters...
A final thought is if we want to open the GLUE 2 model for changes, I'll have some suggestions based on my experiences so far. However, I don't feel a huge need to open the model for changes right now.
Warren
-----Original Message----- From: glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of Salvatore Pinto Sent: Tuesday, December 03, 2013 2:30 AM To: Paul Millar Cc: glue-wg@ogf.org; Peter Solagna Subject: Re: [glue-wg] Extending GLUE 2.0 for Cloud services
Hi Paul, sorry for the late answers.
On 05/11/2013 19:09, Paul Millar wrote:
Hi Salvatore,
I think your work is very interesting: GLUE-2 is trying to be extensible and technology agnostic. Trying to describe cloud services as GLUE-2 objects is an excellent test of how well we've achieved that goal.
Some further replies, in-line:
On 05/11/13 10:05, Salvatore Pinto wrote:
we considered this option, but, for Compute, we also have some changes in the relationships between the objects, and some mandatory attributes in the Grid elements make no sense in the Cloud world.
If there are attributes that are mandatory which make no sense in other kinds of services then we should certainly consider relaxing those requirements.
Relationships between objects should also be expressible as extensions without requiring schema changes. If this isn't the case (for whatever reason) then that, too, is something that should be fixed.
ok, I had not thought about the possibility of defining new relationships as extensions; I thought the option was only for attributes. Anyway, there are other reasons to create new entities. We can discuss that at the conference today.
Anyway, the main reason for not using the extension was that, considering the different kinds of services the Cloud and the Grid are giving to the users, we wanted to keep the entities separated.
This, I think, is the point which I really disagree with.
Let me give you a concrete example.
We (dCache.org) have just started running a dCache instance that is very much a cloud storage service. Client software (of which there are various) currently accesses it via WebDAV, but we anticipate adding support for CDMI. We want to make this storage (or a similar cloud service) available to grid users in the future.
IMHO, we *will* have storage systems that provide cloud-like and grid-like services; it would be a good idea not to exclude this from the start.
[...] for storage, I agree; for Computing, I do not. We can discuss that in today's teleconference.
CDMI metadata are mostly related to file-level options, for example ACLs, file redundancy, file encryption, etc... (source: http://cdmi.sniacloud.com/cdmi_spec/16-metadata/16-metadata.htm).
Sorry, I didn't explain myself clearly; what you're describing is the CDMI concept of metadata (as per Chapter 16). I meant the more general metadata: "data about something".
The CDMI protocol also supports a client discovering information about the service itself; the main motivation is to allow the client to discover if a service is appropriate for its needs. This matches one of the core uses of GLUE-2: to select an appropriate storage service without querying each service individually.
For examples of this, see section 6.2 "Discovering the Capabilities of a Cloud Storage" and section 12.2 "Cloud Storage System-Wide Capabilities".
The "Exported Protocols" (see Section 13) is broadly similar to the concept of StorageEndpoints or StorageAccessProtocol in GLUE-2, so this would also be an interesting avenue for mapping.
There's also explicit support in CDMI for representing links to OCCI services. This holds a similar role to the ToComputingService objects in GLUE-2.
(you see how these two worlds are not really *that* different)
again, one point for modifying the original Storage elements to consider objects instead of files.
IMHO this isn't a good reason.
In most cases "objects" ("data-objects" in CDMI) are really just a synonym for files and a "bucket" ("container" in CDMI) is a synonym for a directory.
no, I do not agree. Block storage is widely used in the Cloud today, and a disk image (which is mounted by the hypervisor and exposed as a "Hardware" storage disk resource) is quite a different concept from a container or a file. The fact that almost all current CDMI implementations support only file objects is a separate point and should not be a concern of GLUE, since implementations and interface technology may change in the future. The important point, in my view, is that users will be misled if they see "files" in the specification.
Sure, in S3, there are some limitations on the interactions, that's an implementation-specific detail. If those limitations are important to clients then we can describe them (as per CDMI).
Whether GLUE-2 talks about 'files' or 'objects' (or Feile, Fichier or ファイル) really doesn't matter if the underlying concept is the same.
HTH,
Paul.
Cheers, Salvatore.
-- Salvatore Pinto Cloud Technologist, EGI.eu e-mail: salvatore.pinto@egi.eu skype: salvatore.pinto0 Science Park 140, Amsterdam, The Netherlands
_______________________________________________ glue-wg mailing list glue-wg@ogf.org https://www.ogf.org/mailman/listinfo/glue-wg

In other words, as long as the core model doesn't change significantly, we increment within the 2.x series.
Ok.
Rather, as was originally intended, shouldn't the cloud specialisations extend the existing abstract base-types and add extra attributes, types, enums etc as required by those specialisations?
I thought that was what we agreed to in our last teleconference: we would not abstract the Compute specializations but rather add new Cloud specializations.
Good - I must have missed that. This was my biggest concern.
The existing model document describes current specializations, why should cloud specializations be treated differently and put in a separate document?
Fair enough.
JP
cheers, David
-------------------------------------------------------------- David Meredith STFC eScience Centre Daresbury Laboratory (A32) Warrington, Cheshire WA4 4AD
email: david.meredith@stfc.ac.uk tel: +44 (0)1925 603762

glue-wg-bounces@ogf.org [mailto:glue-wg-bounces@ogf.org] On Behalf Of JP Navarro said:
My view, which may not match how others see things, is that two version numbers is all we need, and incremental numbers within the 2.x series simply signify sequential document version and do not indicate the significance of the changes between each version. Significant and possibly incompatible changes would produce 3.0.
That's certainly what we did with GLUE 1. It takes so long to agree and deploy any update at all that there isn't really any concept of a minor change. We also need to bear that in mind for the cloud discussion - whatever is agreed needs to be good for at least 2-3 years, people shouldn't think that we can easily change it if it turns out to have flaws, and the only changes which can be made at all are pretty much restricted to new attributes or completely new entities.
Stephen
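The "new attributes or completely new entities" constraint discussed in this thread can be illustrated with a minimal Python sketch. It is not taken from the GLUE 2.0 specification or from the proposed draft: `Service` is a heavily simplified stand-in for the GLUE 2 Service entity, while `CloudComputingService`, `total_vm` and `running_vm` are hypothetical names invented for the example. The idea shown is only that a cloud specialisation can be added next to the existing model, so that a consumer written against the base type keeps working.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Service:
    """Simplified stand-in for the GLUE 2 Service entity (a few attributes only)."""
    id: str
    name: str
    type: str
    # Stand-in for key/value extension data attached to an entity.
    extensions: Dict[str, str] = field(default_factory=dict)


@dataclass
class CloudComputingService(Service):
    """Hypothetical new entity: adds attributes, changes nothing in the base."""
    # Both attribute names are assumptions made for this illustration.
    total_vm: int = 0
    running_vm: int = 0


def describe(service: Service) -> str:
    """A consumer that only knows the base model and ignores anything newer."""
    return f"{service.id} ({service.type}): {service.name}"


if __name__ == "__main__":
    cloud = CloudComputingService(
        id="urn:example:cloud1",
        name="Example cloud",
        type="org.example.iaas",
        total_vm=10,
        running_vm=4,
    )
    # The base-model consumer still accepts the new specialisation.
    print(describe(cloud))
```

Because the change is purely additive, nothing published against the base type needs to be redeployed, which is the property the discussion above relies on.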
participants (8)
- Andre Merzky
- david.meredith@stfc.ac.uk
- Florido Paganelli
- JP Navarro
- Paul Millar
- Salvatore Pinto
- stephen.burke@stfc.ac.uk
- Warren Smith