Finished first pass over FeatureMatrix

Morning all,

I've just finished the first pass of the OCCI FeatureMatrix <http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/FeatureMatrix>, which compares APIs on a functional (rather than per-function) basis. If we want to be an implementation candidate for these APIs then we're going to have to be capable of exposing as much of the existing functionality as possible - which is to say that we're going to end up with something of a superset (hence the solid ticks in the OCCI column, except for a few things we haven't thought about yet).

Some of the suggested APIs were dropped because they are not comparable or simply don't exist (e.g. Mosso CloudSites). We'll add new rows as interesting features are raised and new columns for new APIs.

Cheers,
Sam

Hi Sam, Thanks - that's a great matrix! I think there are some features missing, like DNS and network-level partitioning/security features, as well as low-level features like MAC addresses, without which VMs cannot be instantiated. Some terms may require further definition; for example, does "multiple network resources" mean virtual network adapters (EH, SC), operating system interfaces (SC) or virtual networks (FS, OpenNebula)? -gary
_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Hi Sam, Good work - thanks! I think this is a very good level at which to be comparing the various APIs, and it highlights the differences well. One question - you've said that all public clouds lack enterprise features vs. VMware, etc. Is it worth adding a VMware column and exposing these extra features on a few new rows? I can see a few already (request signing, resource categories and search), but suspect there are others (e.g. role hierarchies?) and also that someone with a more enterprisey background would do better than me in adding these. Cheers, Richard.

A cloud that supports persistent resources can easily emulate ephemeral resources (just stop and immediately destroy the resource), so should we give the persistent clouds credit for this "feature"? On Jun 14, 2009, at 2:37 PM, Richard Davies wrote:
One question - you've said that all public clouds lack enterprise features vs. VMware, etc. Is it worth adding a VMware column and exposing these extra features on a few new rows? I can see a few already (request signing, resource categories and search), but suspect there are others (e.g. role hierarchies?) and also that someone with a more enterprisey background would do better than me in adding these.
This is a pet peeve of mine, so I would be willing to do it. I added a VMware column, but I don't feel comfortable adding rows unless there is some consensus on what they should be. VMware has a lot of small but IMO essential features that most clouds don't have (e.g. full virtualization, shared block storage, thin provisioning, HA, VLANs, reverse ARP, etc.) but it's not clear to me that we want the feature matrix to get into that level of detail. Wes Felter - wesley@felter.org - http://felter.org/wesley/

Hi Wes, Thanks for your contributions - it's good to have you on board. On Mon, Jun 15, 2009 at 12:20 AM, Wes Felter <wesley@felter.org> wrote:
A cloud that supports persistent resources can easily emulate ephemeral resources (just stop and immediately destroy the resource), so should we give the persistent clouds credit for this "feature"?
We're assessing the APIs themselves rather than the underlying implementations - which is to say that such features should be natively (or at least sensibly) supported. Users shouldn't get nasty surprises like finding that pressing "stop" on a compute resource results in various storage resources being implicitly destroyed (rather ephemeral storage should be specified at boot or at least exposed in the attributes). It's conceivable that some implementations would want to support both (where ephemeral storage might be faster and/or cheaper than more reliable alternatives).
This is a pet peeve of mine, so I would be willing to do it. I added a VMware column, but I don't feel comfortable adding rows unless there is some consensus on what they should be. VMware has a lot of small but IMO essential features that most clouds don't have (e.g. full virtualization, shared block storage, thin provisioning, HA, VLANs, reverse ARP, etc.) but it's not clear to me that we want the feature matrix to get into that level of detail.
The feature matrix should be concise and readable but still carry enough information to be useful - I'd guess about 20 rows. In terms of specific features:
- Raw hardware vs. paravirtualization vs. full virtualization vs. containers vs. emulation is useful/interesting (along with the specific container type, for drivers etc.)
- Shared block storage (e.g. for clustering) is already possible with OCCI - just connect the same storage resource(s) to multiple compute resources
- Thin provisioning is particularly interesting (if not essential) for public cloud providers (but may or may not be exposed to the user via attributes)
- Some types of HA can be modeled already by linking compute resources together - I'm not sure how much more we want/need
- VLANs have already been mentioned and can be as simple as having an 802.1q attribute on the network resources
- RARP is more of an internal concern
My main concern is to make the API sufficiently extensible that its users can achieve a lot of this themselves (because much of it falls outside of our initial scope). Sam
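Sam's point about shared block storage - that it falls out of the model for free once the same storage resource can be linked to multiple compute resources - can be illustrated with a minimal sketch. The `Resource`, `StorageResource` and `ComputeResource` classes below are hypothetical stand-ins, not part of any published OCCI specification:

```python
import uuid


class Resource:
    """Generic OCCI-style resource with a unique id and a set of links
    (an illustrative model, not the real OCCI schema)."""

    def __init__(self, title):
        self.id = uuid.uuid4()
        self.title = title
        self.links = []


class StorageResource(Resource):
    pass


class ComputeResource(Resource):
    def attach(self, storage):
        # Linking is many-to-many: the same storage resource may be
        # attached to several compute resources, which is exactly the
        # shared-block-storage case (e.g. a cluster quorum disk).
        self.links.append(storage)
        storage.links.append(self)


# Two compute nodes sharing one block device, e.g. for clustering:
disk = StorageResource("shared-quorum-disk")
node_a = ComputeResource("node-a")
node_b = ComputeResource("node-b")
node_a.attach(disk)
node_b.attach(disk)
```

No dedicated "shared storage" feature is needed in this model; sharing is just an emergent property of many-to-many links.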

On Jun 14, 2009, at 6:02 PM, Sam Johnston wrote:
- VLANs have already been mentioned and can be as simple as having an 802.1q attribute on the network resources
Please be careful on this one. Don't tie too closely to 802.1q, which is broken, having only a 12-bit field (4096 values) for VLAN IDs. It's extremely likely that a successor to this or a replacement technology will be used to identify VLAN uniqueness. Bottom line: whatever you use, make sure it's larger than a 12-bit field. Randy Bias, Cloud Strategist +1 (415) 939-8507 [m], randyb@neotactics.com BLOG: http://cloudscaling.com

On Mon, Jun 15, 2009 at 5:16 AM, Randy Bias <randyb@neotactics.com> wrote:
On Jun 14, 2009, at 6:02 PM, Sam Johnston wrote:
- VLANs have already been mentioned and can be as simple as having an 802.1q attribute on the network resources
Please be careful on this one. Don't tie too closely to 802.1q, which is broken, having only a 12-bit field (4096 values) for VLAN IDs. It's extremely likely that a successor to this or a replacement technology will be used to identify VLAN uniqueness. Bottom line: whatever you use, make sure it's larger than a 12-bit field.
Any "vlan" field would likely be dependent on the local infrastructure, so if you prefer to use UUIDs then that's up to you as an implementor. That means the contents would be opaque, except in so far as identical values mean the networks are connected. Sam
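Sam's opaque-identifier approach can be sketched in a few lines. The `same_segment` helper and the dict-based network resources are hypothetical illustrations of the idea, not an OCCI API:

```python
import uuid


def same_segment(net_a, net_b):
    """Two network resources are on the same segment if and only if
    their (opaque) 'vlan' attribute values are identical. The value
    might be a 12-bit 802.1q tag on one provider and a UUID on another;
    the API only compares for equality and never interprets it."""
    return net_a.get("vlan") is not None and net_a.get("vlan") == net_b.get("vlan")


# A provider-chosen identifier - a UUID here, so not limited to 12 bits:
segment = str(uuid.uuid4())
frontend = {"title": "frontend", "vlan": segment}
backend = {"title": "backend", "vlan": segment}
dmz = {"title": "dmz", "vlan": str(uuid.uuid4())}
```

Because the field is opaque, a future replacement for 802.1q changes only what implementors put in it, not the API itself.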

A cloud that supports persistent resources can easily emulate ephemeral resources (just stop and immediately destroy the resource), so should we give the persistent clouds credit for this "feature"?
We're assessing the APIs themselves rather than the underlying implementations - which is to say that such features should be natively (or at least sensibly) supported. Users shouldn't get nasty surprises like finding that pressing "stop" on a compute resource results in various storage resources being implicitly destroyed (rather ephemeral storage should be specified at boot or at least exposed in the attributes).
Absolutely - as Sam says, the behaviour is very different when the server shuts down (persistent servers -> stopped state, ephemeral servers -> gone; persistent storage -> stays, ephemeral storage -> gone). So users of the OCCI API need an attribute or some such on servers and storage so that they can explicitly request ephemeral or persistent.
It's conceivable that some implementations would want to support both (where ephemeral storage might be faster and/or cheaper than more reliable alternatives)
There's a strong argument for clouds to support both. Persistent storage is typically on SAN where it can be accessed from all virtualization hosts when the server is stopped and restarted. Ephemeral storage is typically on local disk, which is much faster but unavailable after the server is stopped and restarted on a different machine. This is most obvious in Amazon AMIs vs. EBS. Other providers hide it a bit better! Cheers, Richard.

On Mon, Jun 15, 2009 at 5:29 PM, Richard Davies <richard@daviesmail.org> wrote:
There's a strong argument for clouds to support both. Persistent storage is typically on SAN where it can be accessed from all virtualization hosts when the server is stopped and restarted. Ephemeral storage is typically on local disk, which is much faster but unavailable after the server is stopped and restarted on a different machine.
Right, there are many different possibilities for storage that should arguably be specified by user requirement ("what", e.g. "geographically redundant") rather than technology ("how", e.g. "RAID 10"). The main options I see for "local" storage are:
- Ephemeral storage, which is lost when the machine stops (thus simplifying infrastructure and reducing costs)
- Local storage, which may or may not exist next time you need it (good for caching configurations, calculations and so on, but not much else)
- Redundant storage, which will probably survive failures (e.g. SAN)
- HA storage, which is geographically distributed, backed up, etc. and which you can rely on
Of course the different levels would carry different costs and be useful for different purposes. We just need to find a simple way for users to get and set their preferences (whether expressed as "what" or "how"). Sam
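One way to express Sam's "what, not how" idea is an ordered durability scale that a provider matches offerings against. The `Durability` names and the catalogue below are purely illustrative, not part of any published OCCI attribute set:

```python
from enum import Enum


class Durability(Enum):
    # Ordered from weakest to strongest guarantee; names are illustrative.
    EPHEMERAL = 1  # lost when the machine stops
    LOCAL = 2      # may or may not survive a restart on another host
    REDUNDANT = 3  # survives single failures (e.g. SAN-backed)
    HA = 4         # geographically distributed, backed up


def offers_satisfying(requested, catalogue):
    """Return provider offerings whose durability meets or exceeds what
    the user asked for - a 'what' request, not a 'how' request."""
    return [o for o in catalogue if o["durability"].value >= requested.value]


catalogue = [
    {"name": "scratch", "durability": Durability.EPHEMERAL},
    {"name": "san-vol", "durability": Durability.REDUNDANT},
    {"name": "geo-vol", "durability": Durability.HA},
]

matches = offers_satisfying(Durability.REDUNDANT, catalogue)
```

The user never mentions RAID levels or SANs; the provider is free to satisfy the requested level with whatever technology it likes.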

I like Sam's approach, especially the "expressed as what". The whole idea of SANx or RAIDxx can be overwhelming and confusing to customers, especially when RAID or SANs have very little to do with data integrity or performance. I would prefer to see something that falls in line with SLAs as a way to describe storage services. -gary

Whoa... RAID has everything to do with data integrity and performance. I will buy that on NAS or SAN. Chuck Wegrzyn Gary Mazz wrote:
I like Sam's approach, especially the "expressed as what". The whole idea of SANx or RAIDxx can be overwhelming and confusing to the customers. Especially, when raid or sans have very little to do with data integrity or performance.
I would prefer to see something that falls in line with SLAs as way to describe storage services.
-gary

Chuck, RAID is for availability when there is a disk failure. If there is data corruption on the disk due to the RAID controller, multi-bit errors on the interconnect, or corruption due to a non-failure-mode disk defect, RAID has no hope of saving your skin - this is a common misconception. As for interconnects, my two favourites are SAS and InfiniBand (now clocked at 40Gb/s). It is not uncommon to see SCSI read-command-to-data latencies on a RAID system in excess of 1s. You will see much better performance results if you load up your system with 4GB of ECC disk cache. ;) -g Chuck Wegrzyn wrote:
Whoa...RAID has all to do with data integrity and performance. I will buy that on NAS or SAN.
Chuck Wegrzyn

On Jun 14, 2009, at 3:20 PM, Wes Felter wrote:
This is a pet peeve of mine, so I would be willing to do it. I added a VMware column, but I don't feel comfortable adding rows unless there is some consensus on what they should be. VMware has a lot of small but IMO essential features that most clouds don't have (e.g. full virtualization, shared block storage, thin provisioning, HA, VLANs, reverse ARP, etc.) but it's not clear to me that we want the feature matrix to get into that level of detail.
Hrm... I think that's an inaccurate assertion, and it seems like you probably missed some of the more interesting VMware features. I'm not sure what you mean by 'full virtualization' - I'm guessing you mean guests with unmodified operating systems. Your initial list:
- Full virtualization (unmodified guests): GoGrid, ElasticHosts, and probably FlexiScale
- Shared block storage: not really sure what you mean... between hosts or guests?
- Thin provisioning: not a feature specific to VMware at all; doable with any reasonable backing store, and many clouds already do this
- High availability: VMware does this best, but FlexiScale already claims hot-standby failover
- VLANs: GoGrid and others
- Reverse ARP: what for?
Missed, but of particular interest:
- Memory overcommit: the end user won't care, but worth mentioning since no one else does it yet
- DRS: dynamic movement of live VMs based on resource/workload needs
- dmotion: live migration of VM backing storage files; still mostly a VMware-only feature
--Randy Randy Bias, Cloud Strategist +1 (415) 939-8507 [m], randyb@neotactics.com BLOG: http://cloudscaling.com

On 15.06.2009, at 05:14, Randy Bias wrote:
Missed, but of particular interest:
- memory overcommit: end user won't care, but worth mentioning since no one else does it yet
- DRS: dynamic movement of live VMs based on resource/workload needs
- dmotion: live migration of VM backing storage files; still mostly a VMware-only feature
I remember that this particular thing was marked out of scope some time ago. Probably this also holds for the others. -Alexander -- Alexander Papaspyrou alexander.papaspyrou@tu-dortmund.de
participants (7)
- Alexander Papaspyrou
- Chuck Wegrzyn
- Gary Mazz
- Randy Bias
- Richard Davies
- Sam Johnston
- Wes Felter