Semantics of OCCI API: nouns and verbs

Chris Webb

16 Apr 2009

9:32 a.m.

The single strongest point we want to make is that an API should be simple. We're convinced that an API covering infrastructure can be formulated on a handful of straightforward objects with no more than around fifteen or twenty operations in total. We're also convinced that these operations can be very simple and straightforward, with syntax which could easily be typed by hand, but this is a later issue once we agree on the nouns and verbs themselves.

In the spirit of 'put up or shut up', here's a suggestion of what the complete set of nouns and verbs might be, based on what we have in our existing API.

Nouns:

drives - block devices to attach to virtual machines. The general case here is clearly behaviour like Amazon EBS - drives are persistent and exist independently of virtual machines. A virtual machine can mount several drives, and conversely a drive (like a CD image) can in principle be mounted from several virtual machines. GoGrid 'one drive per server' behaviour is clearly a subset of this, and Amazon's filesystem initialised from an AMI at boot can be represented by throwaway copies of a drive (or copy-on-write) instead of writing back to the drive.

resources - things like static IP addresses and VLANs which need to be reserved for use by virtual machines.

guests - virtual servers booted from and accessing drives. Our guests exist as objects only when they are running, similar to Amazon's instances, but it may be more general to allow guests in stopped and suspended states in addition, as GoGrid currently do.

Verbs:

drives
  list (list accessible drives)
  ---
  info (list drive properties)
  create (takes properties including size; give parent drive for snapshot)
  destroy (delete existing drive)
  set (takes properties including size to resize)
  read (read data of given size with given offset in drive)
  write (write data at given offset in drive)
  image (copy data from one drive to another)

resources
  list (list allocated resources)
  ---
  info (detail on an allocated resource)
  create (to allocate a resource such as a static IP address)
  destroy (to release a resource)

guests
  list (list virtual machines)
  ---
  info (list virtual machine properties)
  create (takes simple description, e.g. including attached drives and network interfaces)
  set (updates configuration from new description where possible)
  destroy (hard-kill virtual machine)
  reset (send reset to virtual machine)
  shutdown (gracefully shut down virtual machine)
  [perhaps also stop, start, suspend for stopped guests?]

Does anyone think we've missed anything?

Comparison with draft API on the wiki
=====================================

The draft API on the OCCI wiki currently appears to offer nouns for servers, storage devices, network interfaces with the ability to create, retrieve, update and delete any noun and 6 verbs for machine control.

Our suggestion above is very similar, but more explicit and descriptive. In particular:

- list, info, create, destroy, set are equivalent to CRUD;
- added drive operations: read, write, image
- resize operation handled by passing the new size to drive set;
- nouns for network resources (e.g. static IPs, VLANs) rather than network interfaces[1];
- similar set of machine operations, assuming we want to handle stopped guests.

[1] We believe that interfaces are simply one aspect of a server's configuration, and that the nouns that matter are the objects such as static IPs which are "owned" by a customer and hence can be configured onto the server.

Cheers,
Chris.
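As a rough check on the "fifteen or twenty operations" figure, the proposal above can be written down as a plain mapping from nouns to verbs. This is only an illustrative sketch: the name `PROPOSED_API` and the dictionary layout are assumptions made here, and the tentative stop/start/suspend verbs are left out because the message marks them as optional.

```python
# Hypothetical tabulation of the proposed nouns and verbs.
# The verb names come from the message; the structure is an assumption.
PROPOSED_API = {
    "drives": ["list", "info", "create", "destroy", "set", "read", "write", "image"],
    "resources": ["list", "info", "create", "destroy"],
    "guests": ["list", "info", "create", "set", "destroy", "reset", "shutdown"],
}


def count_operations(api):
    """Return the total number of (noun, verb) pairs in the mapping."""
    return sum(len(verbs) for verbs in api.values())


if __name__ == "__main__":
    for noun, verbs in PROPOSED_API.items():
        print(f"{noun}: {', '.join(verbs)}")
    print("total operations:", count_operations(PROPOSED_API))
```

Counted this way the proposal comes to 19 operations (8 for drives, 4 for resources, 7 for guests), which sits inside the range the message argues for.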


Richard Davies

16 Apr

9:55 a.m.

A few more detailed comments to flesh out Chris' overview:

...

drives - block devices to attach to virtual machines. The general case here is clearly behaviour like Amazon EBS - drives are persistent and exist independently of virtual machines. A virtual machine can mount several drives, and conversely a drive (like a CD image) can in principle be mounted from several virtual machines. GoGrid 'one drive per server' behaviour is clearly a subset of this, and Amazon's filesystem initialised from an AMI at boot can be represented by throwaway copies of a drive (or copy-on-write) instead of writing back to the drive.

The variants which we see today are:

Amazon EC2 root filesystem
- one drive per server, not exposed separately in API
- non-persistent, initialized at boot from template (AMI)
- single filesystem
- does not include OS kernel or boot loader

Amazon EBS
- any number of drives per server, exposed separately in API
- persistent across reboots and server deletion
- block device which can be partitioned into multiple filesystems
- does not include OS kernel or boot loader, since a secondary drive

GoGrid
- one drive per server, not exposed separately in API
- persistent across reboots, but not server deletion
- block device which can be partitioned into multiple filesystems
- includes OS kernel and boot loader

ElasticHosts
- any number of drives per server, exposed separately in API
- persistent across reboots and server deletion
- block device which can be partitioned into multiple filesystems
- includes OS kernel and boot loader

We suggest taking the most general approach in the standard:
- any number of drives per server, exposed as first-class API objects
- persistent across reboots and server deletion
- AMI-like templates implemented as imaging/duplicating one drive from another
- block device which can be partitioned into multiple filesystems (just like a hard disk)
- includes OS kernel and boot block (just like a hard disk)

...

guests - virtual servers booted from and accessing drives. Our guests exist as objects only when they are running, similar to Amazon's instances, but it may be more general to allow guests in stopped and suspended states in addition, as GoGrid currently do.

The variants which we see today are:

EC2 instances and ElasticHosts servers (as seen from the API) are ephemeral and no longer exist once they are stopped. GoGrid servers persist in a stopped state, and can be restarted from that state.

We suggest taking the more general approach in the standard:

Servers would include non-running servers in the same way as GoGrid. Perhaps whether a server persists or not when it is shut down is an option when creating a server?

...

guests create (takes simple description, e.g. including attached drives and network interfaces)

It's worth commenting here on the granularity of VM specification. Both GoGrid servers and EC2 instances are available in a small number of fixed sizes, whereas ElasticHosts servers are continuously variable in CPU and RAM, and our drives are continuously variable in size. Again, taking the more general approach, we suggest that servers should be specified in terms of continuous quantities of CPU, RAM and disk, with a provider 'rounding up' to the nearest available specification if their granularity is coarser than the standard API.
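The 'rounding up' rule can be sketched as a small selection function: given a requested specification and the fixed sizes a provider offers, pick the smallest offer that covers the request on every dimension. This is a minimal sketch under stated assumptions; the `Spec` type, its field names and units, and the tie-breaking order (CPU, then RAM, then disk) are all hypothetical and not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Spec:
    """Hypothetical server specification; field names and units are assumptions."""
    cpu_mhz: int
    ram_mb: int
    disk_gb: int

    def covers(self, other: "Spec") -> bool:
        # An offer covers a request if it is at least as large on every dimension.
        return (self.cpu_mhz >= other.cpu_mhz
                and self.ram_mb >= other.ram_mb
                and self.disk_gb >= other.disk_gb)


def round_up(requested: Spec, offered: Sequence[Spec]) -> Optional[Spec]:
    """Return the smallest offered size that covers the request, or None."""
    candidates = [s for s in offered if s.covers(requested)]
    if not candidates:
        return None
    # "Nearest" is ambiguous across three dimensions; this ordering is an assumption.
    return min(candidates, key=lambda s: (s.cpu_mhz, s.ram_mb, s.disk_gb))
```

A provider with continuously variable sizes would simply return the request unchanged; the function only matters when the provider's granularity is coarser than the request.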

Alexis Richardson

10 a.m.

Richard, Chris,

Thank you both very much for this contribution!

Patrick Kerpan from CohesiveFT has kindly created a spreadsheet on Google, intended for public consumption, that summarises the API calls from known cloud provider APIs in a matrix format. I hope to be able to share a link to that with this group shortly.

It is not a substitute for work to create the OCCI API. Instead, it forms an important reference point for checking OCCI operations against current providers.

alexis

On Thu, Apr 16, 2009 at 10:55 AM, Richard Davies <richard.davies@elastichosts.com> wrote:

...

A few more detailed comments to flesh out Chris' overview:

...
drives - block devices to attach to virtual machines. The general case here is clearly behaviour like Amazon EBS - drives are persistent and exist independently of virtual machines. A virtual machine can mount several drives, and conversely a drive (like a CD image) can in principle be mounted from several virtual machines. GoGrid 'one drive per server' behaviour is clearly a subset of this, and Amazon's filesystem initialised from an AMI at boot can be represented by throwaway copies of a drive (or copy-on-write) instead of writing back to the drive.

The variants which we see today are:

Amazon EC2 root filesystem
- one drive per server, not exposed separately in API
- non-persistent, initialized at boot from template (AMI)
- single filesystem
- does not include OS kernel or boot loader

Amazon EBS
- any number of drives per server, exposed separately in API
- persistent across reboots and server deletion
- block device which can be partitioned into multiple filesystems
- does not include OS kernel or boot loader, since a secondary drive

GoGrid
- one drive per server, not exposed separately in API
- persistent across reboots, but not server deletion
- block device which can be partitioned into multiple filesystems
- includes OS kernel and boot loader

ElasticHosts
- any number of drives per server, exposed separately in API
- persistent across reboots and server deletion
- block device which can be partitioned into multiple filesystems
- includes OS kernel and boot loader

We suggest taking the most general approach in the standard:
- any number of drives per server, exposed as first-class API objects
- persistent across reboots and server deletion
- AMI-like templates implemented as imaging/duplicating one drive from another
- block device which can be partitioned into multiple filesystems (just like a hard disk)
- includes OS kernel and boot block (just like a hard disk)

...
guests - virtual servers booted from and accessing drives. Our guests exist as objects only when they are running, similar to Amazon's instances, but it may be more general to allow guests in stopped and suspended states in addition, as GoGrid currently do.

The variants which we see today are:

EC2 instances and ElasticHosts servers (as seen from the API) are ephemeral and no longer exist once they are stopped. GoGrid servers persist in a stopped state, and can be restarted from that state.

We suggest taking the more general approach in the standard:

Servers would include non-running servers in the same way as GoGrid. Perhaps whether a server persists or not when it is shut down is an option when creating a server?

...
guests create (takes simple description, e.g. including attached drives and network interfaces)

It's worth commenting here on the granularity of VM specification. Both GoGrid servers and EC2 instances are available in a small number of fixed sizes, whereas ElasticHosts servers are continuously variable in CPU and RAM, and our drives are continuously variable in size.

Again, taking the more general approach, we suggest that servers should be specified in terms of continuous quantities of CPU, RAM and disk, with a provider 'rounding up' to the nearest available specification if their granularity is coarser than the standard API.

_______________________________________________
occi-wg mailing list
occi-wg@ogf.org
http://www.ogf.org/mailman/listinfo/occi-wg

Sam Johnston

12:01 p.m.

On Thu, Apr 16, 2009 at 11:55 AM, Richard Davies <richard.davies@elastichosts.com> wrote:

...

The variants which we see today are: <snip>

ElasticHosts
- any number of drives per server, exposed separately in API
- persistent across reboots and server deletion
- block device which can be partitioned into multiple filesystems
- includes OS kernel and boot loader

We suggest taking the most general approach in the standard:
- any number of drives per server, exposed as first-class API objects

Agreed we should support all possible configurations including zero or more storage resources (e.g. caters for netboot) and zero or more network resources (e.g. caters for offline batch jobs). Not sure about the need for the resources to be appearing as "first-class API objects" (that is, being burnt into urls etc.) though.

...

- persistent across reboots and server deletion

If a machine is stopped it would be up to the implementation to decide whether it remains visible in a "stopped" state (e.g. persistent) or whether it vanishes from view (e.g. ephemeral). I'm not sure what I make of machines that can't be stopped (seems just like normal hosting because you can't stop using the service without dropping your bundle).

...

- AMI-like templates implemented as imaging/duplicating one drive from another

Sure, I had planned to expose templates as machines that cannot be started, only cloned. Same could apply to public storage devices (e.g. appliances) and private storage devices (e.g. SOE images).

...

- block device which can be partitioned into multiple filesystems (just like a hard disk)

Sure, but sounds like we're getting down to the nitty gritty - there's an argument for the block devices themselves being opaque and I'd almost rather steer clear of details like MBR vs GPT.

...

- includes OS kernel and boot block (just like a hard disk)

See above re: opaque block devices, though Amazon's separation of AKI (kernel images), ARI (ramdisk images) and AMI (machine images) is something we'll probably want to look at supporting if we expect them to implement it and/or if we ever want to see an adapter.

...

...
guests - virtual servers booted from and accessing drives. Our guests exist as objects only when they are running, similar to Amazon's instances, but it may be more general to allow guests in stopped and suspended states in addition, as GoGrid currently do.

The variants which we see today are:

EC2 instances and ElasticHosts servers (as seen from the API) are ephemeral and no longer exist once they are stopped. GoGrid servers persist in a stopped state, and can be restarted from that state.

We suggest taking the more general approach in the standard:

Servers would include non-running servers in the same way as GoGrid. Perhaps whether a server persists or not when it is shut down is an option when creating a server?

I think it's more often than not a capability dictated by the service provider. Where workloads are only ephemeral there's no choice but for them to vanish when stopped. Where workloads are persistent I don't see a problem with having a two-phase "stop" and "destroy" step, or exposing a "destroy" actuator for a running instance and having the "stop" being implicit. I think that makes more sense - these decisions are more useful at end-of-life than start-of-life (what if I change my mind?). Perhaps the default could be an optional parameter or somesuch.

...

...
guests create (takes simple description, e.g. including attached drives and network interfaces)

It's worth commenting here on the granularity of VM specification. Both GoGrid servers and EC2 instances are available in a small number of fixed sizes, whereas ElasticHosts servers are continuously variable in CPU and RAM, and our drives are continuously variable in size.

Right, this is an important point. Basically I was thinking that you could grab a template, set your values to suit, and POST it to the API. This could be simplified for dumb clients by exposing a "clone", "deploy" or "instantiate" actuator that would return the handle of the running machine. Optionally specifying min and max instances à la Amazon should probably be supported (nobody wants to do 1,000 "clone" calls now, do they - performance is important for this API too).

...

Again, taking the more general approach, we suggest that servers should be specified in terms of continuous quantities of CPU, RAM and disk, with a provider 'rounding up' to the nearest available specification if their granularity is coarser than the standard API.

I'm not sure about rounding up as such - feels a bit messy (exceed the limit by 1 byte and your cost will double for example). If the provider has templates these should be advertised but ultimately it's up to the implementor to decide whether to reject a request for an unsupported configuration with an error or whether to satisfy it anyway with a larger device.

I think the key thing for most of these points is giving implementors flexibility without sacrificing interoperability, and fortunately I think we can have our cake and eat it here.

Sam

Andre Merzky

12:29 p.m.

Quoting [Sam Johnston] (Apr 16 2009):

...

...
Again, taking the more general approach, we suggest that servers should be specified in terms of continuous quantities of CPU, RAM and disk, with a provider 'rounding up' to the nearest available specification if their granularity is coarser than the standard API.

I'm not sure about rounding up as such - feels a bit messy (exceed the limit by 1 byte and your cost will double for example). If the provider has templates these should be advertised but ultimately it's up to the implementor to decide whether to reject a request for an unsupported configuration with an error or whether to satisfy it anyway with a larger device.

I think you should not leave that to the implementor, really - there is a significant semantic uncertainty in the API when you leave 'minimal resource requirements' versus 'exact resource requirements' unspecified.

Exact requirements avoid the pricing pitfalls you mention. Minimal requirements are more flexible, and probably more interoperable.

As I assume the user has to control pricing out of band (i.e. outside of this API) anyway, I'd vote for the minimal spec approach.

My $0.02, Andre.
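The difference between the two semantics can be made concrete with a small predicate. This is a sketch only: the function name `accepts`, the `mode` values, and the dictionary keys are assumptions for illustration, not anything defined by the draft API.

```python
def accepts(offer, request, mode):
    """Decide whether a provider offer satisfies a request.

    offer and request are dicts such as {"cpu_mhz": 2000, "ram_mb": 2048}.
    mode "exact" requires every requested value to match exactly;
    mode "minimal" treats every requested value as a lower bound.
    """
    if mode == "exact":
        return all(offer.get(key) == value for key, value in request.items())
    if mode == "minimal":
        return all(offer.get(key, 0) >= value for key, value in request.items())
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    offer = {"cpu_mhz": 2000, "ram_mb": 2048}
    request = {"cpu_mhz": 1500, "ram_mb": 2048}
    print(accepts(offer, request, "exact"))    # the CPU values differ
    print(accepts(offer, request, "minimal"))  # the offer is at least as large
```

Under "exact" a provider with coarse sizes has to reject the request above, while under "minimal" it may satisfy it with the larger machine, which is where the pricing concern comes from.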

...

I think the key thing for most of these points is giving implementors flexibility without sacrificing interoperability, and fortunately I think we can have our cake and eat it here.

Sam

-- Nothing is ever easy.

Sam Johnston

12:45 p.m.

On Thu, Apr 16, 2009 at 2:29 PM, Andre Merzky <andre@merzky.net> wrote:

...

...
I'm not sure about rounding up as such - feels a bit messy (exceed the limit by 1 byte and your cost will double for example). If the provider has templates these should be advertised but ultimately it's up to the implementor to decide whether to reject a request for an unsupported configuration with an error or whether to satisfy it anyway with a larger device.

I think you should not leave that to the implementor, really - there is a significant semantic uncertainty in the API when you leave 'minimal resource requirements' versus 'exact resource requirements' unspecified.

Exact requirements avoid the pricing pitfalls you mention. Minimal requirements are more flexible, and probably more interoperable.

As I assume the user has to control pricing out of band (i.e. outside of this API) anyway, I'd vote for the minimal spec approach.

My $0.02, Andre.

Hmm... I'm not sure... if you look at Amazon EC2's pricing you'll see that if you crack the limit on a standard small instance you'll be silently bumped from 10c/hr to 40c/hr (that is, from $72 per instance per month to $288 per instance per month). I for one would have a problem with being hit with a bill 4x bigger than what I expect, especially given the provider would be free to move the cutoff points up and down as they see fit.

In fact I see little point in retrieving the specs of a "template" only to send them back verbatim to request one... I'd suggest instead that these be cloned (e.g. by GETting http://example.com/<id>/ops/clone). That's keeping the bar low for simple clients and neatly sidestepping the problem.

It's always going to be more complex to specify custom configurations (though perhaps even these linear offerings will offer sensible templates for cloning while allowing subsequent tweaking of values), so I don't think it would hurt to do both. If a provider wants to accept "fluid" allocations and silently upsize their clients' requests then there's nothing technically stopping them from doing so...

Sam
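The monthly figures quoted above follow from the hourly rates if one assumes an instance that runs continuously for a 30-day month. The sketch below just reproduces that arithmetic; the 30-day month and the function name are assumptions made here.

```python
# Assumption: the instance runs 24 hours a day for a 30-day month.
HOURS_PER_MONTH = 24 * 30


def monthly_cost_dollars(cents_per_hour):
    """Convert an hourly price in cents to a monthly price in dollars."""
    return cents_per_hour * HOURS_PER_MONTH / 100


if __name__ == "__main__":
    small = monthly_cost_dollars(10)   # 10c/hr
    bumped = monthly_cost_dollars(40)  # 40c/hr
    print(small, bumped, bumped / small)
```

With 720 hours in the month, 10c/hr comes to $72 and 40c/hr to $288, a factor of four.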

Randy Bias

9:10 p.m.

Playing catch-up on this big thread today, and trying to respond selectively given how much there is here to dig through.

Regarding this particular issue below (handling VM sizing requests), I also vote for the provider kicking back an error for unsupported sizes rather than rounding. It will be easy enough for folks to have templates for different providers, or to provide some kind of optional templating within the spec that providers can surface in a machine-consumable way.

--Randy

On 4/16/09 5:01 AM, "Sam Johnston" <samj@samj.net> wrote:

...

If the provider has templates these should be advertised but ultimately it's up to the implementor to decide whether to reject a request for an unsupported configuration with an error or whether to satisfy it anyway with a larger device.

-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

Sam Johnston

11:44 a.m.

Chris,

Agreed re: "*The single strongest point we want to make is that an API should be simple*", and the requirement for 15-20 operations. We currently have somewhat fewer than that.

On Thu, Apr 16, 2009 at 11:32 AM, Chris Webb <chris.webb@elastichosts.com> wrote:

...

Comparison with draft API on the wiki
=====================================

The draft API on the OCCI wiki currently appears to offer nouns for servers, storage devices, network interfaces with the ability to create, retrieve, update and delete any noun and 6 verbs for machine control.

Our suggestion above is very similar, but more explicit and descriptive. In particular:

- list, info, create, destroy, set are equivalent to CRUD;

The CRUD operations map to HTTP verbs which were designed for making certain actions on a resource - I don't see that there's a need to repeat ourselves here by burning this information into the URL/request syntax. Quoting Tim again from The Sun Cloud <http://www.tbray.org/ongoing/When/200x/2009/03/16/Sun-Cloud>:

"*If you’re going to do bits-on-the-wire, Why not use HTTP? <http://timothyfitz.wordpress.com/2009/02/12/why-http/> And if you’re going to use HTTP, use it right.*"

Non-CRUD operations such as start, stop, restart, clone, snapshot, etc. can be exposed by "actuator" URLs, which fits nicely with HATEOAS.

...

- added drive operations: read, write, image

I've deliberately steered clear of this for the minute while we reach out to SNIA to see what they're up to in this area. Last thing we want is people having to implement a second API for storage and/or overlapping too much (but agreed we need to do some basic operations).

...

- resize operation handled by passing the new size to drive set;

Sure, retrieving the object, tinkering with it and putting it back is the best way to make such changes. I am not sold on the idea of changing "running state" variables as a mechanism for managing state machines - the "push button" approach is cleaner and better for really simple clients.

...

- nouns for network resources (e.g. static IPs, VLANs) rather than network interfaces[1];

We have network resources (e.g. virtual networks) and machines can be linked to any number of networks (including none). Operations such as setting subnet details, gateways, DHCP servers, etc. can be done by setting attributes on the network resource (but these are rare - usually you just want to instantiate a machine and connect it to storage and network resources).

...

- similar set of machine operations, assuming we want to handle stopped guests.

I think we'll be wanting to handle both persistent and ephemeral workloads.

...

[1] We believe that interfaces are simply one aspect of a server's configuration, and that the nouns that matter are the objects such as static IPs which are "owned" by a customer and hence can be configured onto the server.

Interesting point - perhaps any static IPs available to that particular user could be exposed in the network resource itself (along with an option for DHCP if that is indeed an option).

Sam

Richard Davies

1:05 p.m.

Sam Johnston wrote:

...

...
list, info, create, destroy, set are equivalent to CRUD;

The CRUD operations map to HTTP verbs which were designed for making certain actions on a resource - I don't see that there's a need to repeat ourselves here by burning this information into the URL/request syntax.

Agreed in principle, but there's an implementation issue that many common HTTP libraries will only do POST/GET, not PUT/DELETE, so we need URL versions of 'UD' anyway.

Perhaps the compromise position is to work both ways:

POST/GET/PUT/DELETE /<object>          accepts all 4 CRUD operations
POST                /<object>/set      alternative form of 'U' (if cannot PUT)
POST                /<object>/destroy  alternative form of 'D' (if cannot DELETE)
POST                /<object>/create   alternative form of 'C' (for symmetry)
GET                 /<object>/info     alternative form of 'R' (for symmetry)
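One way to read the compromise is as a single dispatch rule that accepts either form and resolves it to the same CRUD operation. The following is a minimal sketch of that reading, not an agreed design: the function name, the returned operation labels, and the rule that a trailing `set`/`destroy`/`create`/`info` segment is always treated as an actuator are assumptions.

```python
# Plain HTTP verbs mapped onto CRUD operations.
CRUD_BY_METHOD = {"POST": "create", "GET": "retrieve", "PUT": "update", "DELETE": "delete"}

# Alternative URL forms for clients that can only issue GET and POST.
ALTERNATIVE_FORMS = {
    ("POST", "set"): "update",
    ("POST", "destroy"): "delete",
    ("POST", "create"): "create",
    ("GET", "info"): "retrieve",
}


def resolve_operation(method, path):
    """Return (object_path, crud_operation) for a request.

    Assumption: an object is never itself named set, destroy, create or info,
    so a matching trailing segment can safely be read as an actuator.
    """
    method = method.upper()
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError("empty path")
    if len(segments) >= 2 and (method, segments[-1]) in ALTERNATIVE_FORMS:
        return "/".join(segments[:-1]), ALTERNATIVE_FORMS[(method, segments[-1])]
    if method in CRUD_BY_METHOD:
        return "/".join(segments), CRUD_BY_METHOD[method]
    raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    print(resolve_operation("DELETE", "/guests/abc"))
    print(resolve_operation("POST", "/guests/abc/destroy"))
```

Both calls in the usage example resolve to a delete on `guests/abc`, which is the point of offering the two forms side by side.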

...

Non-CRUD operations such as start, stop, restart, clone, snapshot, etc. can be exposed by "actuator" URLs, which fits nicely with HATEOAS.

The actuator URLs for these can then fit in alongside the alternative actuator forms of the CRUD operations themselves. Cheers, Richard.

Sam Johnston

1:19 p.m.

On Thu, Apr 16, 2009 at 3:05 PM, Richard Davies <richard.davies@elastichosts.com> wrote:

...

Agreed in principle, but there's an implementation issue that many common HTTP libraries will only do POST/GET, not PUT/DELETE, so need url versions of 'UD' anyway.

Wow, you guys have really thought about it :) This problem manifests itself more in enterprise firewalls and proxies but agreed, support for all the HTTP verbs is sadly limited. That's exactly what the X-HTTP-Method-Override header was invented for.
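A server honouring X-HTTP-Method-Override typically works out the effective method before dispatching the request. The sketch below shows one plausible rule; the function name and the policy of honouring the header only on POST, and only for PUT and DELETE, are assumptions rather than anything specified in this thread.

```python
ALLOWED_OVERRIDES = {"PUT", "DELETE"}  # assumption: only these may be tunnelled


def effective_method(method, headers):
    """Return the HTTP method a request should be treated as.

    headers is a plain dict of header name to value. Header names are
    compared case-insensitively, as HTTP requires.
    """
    method = method.upper()
    override = None
    for name, value in headers.items():
        if name.lower() == "x-http-method-override":
            override = value.strip().upper()
    # Only tunnel through POST, so a cacheable GET can never turn into a write.
    if method == "POST" and override in ALLOWED_OVERRIDES:
        return override
    return method


if __name__ == "__main__":
    print(effective_method("POST", {"X-HTTP-Method-Override": "DELETE"}))
    print(effective_method("GET", {"X-HTTP-Method-Override": "DELETE"}))
```

This keeps a single URL per object while still letting a POST-only client or proxy path reach the update and delete operations.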

...

Perhaps the compromise position is to work both ways:

POST/GET/PUT/DELETE /<object>          accepts all 4 CRUD operations
POST                /<object>/set      alternative form of 'U' (if cannot PUT)
POST                /<object>/destroy  alternative form of 'D' (if cannot DELETE)
POST                /<object>/create   alternative form of 'C' (for symmetry)
GET                 /<object>/info     alternative form of 'R' (for symmetry)

I'd be more inclined to go with one or the other, and in case it's not obvious yet I've got my sights set on the myriad Google GData clients <http://code.google.com/apis/gdata/clientlibs.html> (Java, .NET, PHP, Python, Objective-C and Javascript, all under Apache licensing), and of course the army of developers who are already familiar with their APIs.

...

Non-CRUD operations such as start, stop, restart, clone, snapshot, etc. can be exposed by "actuator" URLs, which fits nicely with HATEOAS.

The actuator URLs for these can then fit in alongside the alternative actuator forms of the CRUD operations themselves.

Right, the format for the URLs should be specified too, but the client can tell what methods to expose based on which are present (thus saving the user from pushing buttons that are guaranteed to result in errors, like "start"ing an abstract template or "stop"ping a stopped machine). Sam
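The idea that a client offers only the buttons whose actuators are present can be sketched as follows. The representation used here, a dictionary with an `actuators` mapping from action name to URL, is purely hypothetical; the thread does not define how actuator links are rendered, and the example.com URLs are placeholders.

```python
def available_actions(resource):
    """Return the action names a client should offer for this resource.

    Assumption: the server lists an actuator only when the action is valid
    in the resource's current state.
    """
    return sorted(resource.get("actuators", {}))


# Hypothetical representations of a running machine and an abstract template.
running_machine = {
    "id": "vm-1",
    "actuators": {
        "stop": "http://example.com/vm-1/ops/stop",
        "restart": "http://example.com/vm-1/ops/restart",
    },
}
template = {
    "id": "tmpl-1",
    "actuators": {"clone": "http://example.com/tmpl-1/ops/clone"},
}

if __name__ == "__main__":
    print(available_actions(running_machine))
    print(available_actions(template))
```

Because the template carries no "start" actuator, a client built this way never shows a start button for it, which is the behaviour described above.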

eprparadocs@gmail.com

5:07 p.m.

Once again I am beginning to wonder if we are getting too lost in the forest. This is clearly a protocol issue and as such we should be supporting multiple ways.

I prefer a simple XML rendering (and Atom seems way too heavy) for all the requests. That includes putting in things like set, destroy, etc. The protocol step should be to map the definitions into something bitsy!

Chuck Wegrzyn

Sam Johnston wrote:

...

On Thu, Apr 16, 2009 at 3:05 PM, Richard Davies <richard.davies@elastichosts.com> wrote:

Agreed in principle, but there's an implementation issue that many common HTTP libraries will only do POST/GET, not PUT/DELETE, so need url versions of 'UD' anyway.

Wow, you guys have really thought about it :) This problem more manifests itself in enterprise firewalls and proxies but agreed, support for all the HTTP verbs is sadly limited. That's exactly what the X-HTTP-Method-Override header was invented for.

Perhaps the compromise position is to work both ways:

POST/GET/PUT/DELETE /<object>          accepts all 4 CRUD operations
POST                /<object>/set      alternative form of 'U' (if cannot PUT)
POST                /<object>/destroy  alternative form of 'D' (if cannot DELETE)
POST                /<object>/create   alternative form of 'C' (for symmetry)
GET                 /<object>/info     alternative form of 'R' (for symmetry)

I'd be more inclined to go with one or the other, and in case it's not obvious yet I've got my sights set on the myriad Google GData clients <http://code.google.com/apis/gdata/clientlibs.html> (Java, .NET, PHP, Python, Objective-C and Javascript, all under Apache licensing), and of course the army of developers who are already familiar with their APIs.

> Non-CRUD operations such as start, stop, restart, clone, snapshot, etc.
> can be exposed by "actuator" URLs, which fits nicely with HATEOAS.

The actuator URLs for these can then fit in alongside the alternative actuator forms of the CRUD operations themselves.

Right, the format for the URLs should be specified too, but the client can tell what methods to expose based on which are present (thus saving the user from pushing buttons that are guaranteed to result in errors, like "start"ing an abstract template or "stop"ping a stopped machine).

Sam


Sam Johnston

8:17 p.m.

On Thu, Apr 16, 2009 at 7:07 PM, <eprparadocs@gmail.com> wrote:

...

I prefer a simple XML rendering (and Atom seems way too heavy) for all the requests. That includes putting in things like set, destroy, etc. The protocol step should be to map the definitions into something bitsy!

If you have to parse XML anyway (which I agree can be a PITA, hence adopting Chris & Richard's suggestions for text/json equivalents) then it's incrementally more difficult to use an existing schema. Corners cut now will certainly come back to bite us later, and as we flesh things out I'm sure you will agree that Atom more than pulls its weight in this context. The value of existing developers and client implementations is not to be understated either. Sam

Richard Davies

4:16 p.m.

...

list, info, create, destroy, set are equivalent to CRUD;

Let me raise an additional point of detail here, specifically on the (R)etrieve operation.

At present, the API draft specifies a single central entry point, on which an HTTP GET returns a complete description of all objects that a user has access to.

For a user with a large number of servers, this could be a very expensive operation - the cloud will have to check with each individual virtualization host to get values such as the performance monitoring.

It is probably good to offer this option, but the API should definitely offer separate 'retrieve' operations (which are much cheaper) such as:

- List UUIDs of all a user's objects, but do not describe them fully
- Describe a single object fully given its UUID.

At ElasticHosts, we provide all of these varieties of retrieve via:

/guests/list       - list all UUIDs without full descriptions
/guests/UUID/info  - describe a single object fully
/guests/info       - describe all objects fully

I think that the standard API needs to offer similar capabilities.
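The cost argument can be illustrated with a toy in-memory store that counts how many times a virtualization host would have to be consulted. Everything here is a sketch under assumptions: the class, its method names, and the idea that listing UUIDs needs no host query while each full description needs exactly one are illustrative only.

```python
class GuestStore:
    """Toy model of the three retrieve varieties and their relative cost."""

    def __init__(self, guests):
        self._guests = dict(guests)  # uuid -> properties
        self.host_queries = 0        # how often a host had to be asked

    def list_uuids(self):
        # Like /guests/list: answered from the central index, no host query.
        return sorted(self._guests)

    def info(self, uuid):
        # Like /guests/UUID/info: one host query for live values.
        self.host_queries += 1
        return dict(self._guests[uuid])

    def info_all(self):
        # Like /guests/info: one host query per guest.
        return {uuid: self.info(uuid) for uuid in self.list_uuids()}


if __name__ == "__main__":
    store = GuestStore({"a": {"cpu": 1000}, "b": {"cpu": 2000}, "c": {"cpu": 500}})
    store.list_uuids()
    print("after list:", store.host_queries)
    store.info("b")
    print("after single info:", store.host_queries)
    store.info_all()
    print("after info on all:", store.host_queries)
```

The full listing scales with the number of guests, whereas the UUID list and the single-object lookup stay cheap, which is the reason for offering them separately.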

Sam Johnston

7:01 p.m.

On Thu, Apr 16, 2009 at 6:16 PM, Richard Davies <richard.davies@elastichosts.com> wrote:

...

At present, the API draft specifies a single central entry point, on which a HTTP GET returns a complete description of all objects that a user has access to.

For a user with a large number of servers, this could be a very expensive operation - the cloud will have to check with each individual virtualization host to get values such as the performance monitoring.

It is probably good to offer this option, but the API should definitely offer separate 'retrieve' operations (which are much cheaper) such as:

- List UUIDs of all a user's objects, but do not describe them fully
- Describe a single object fully given its UUID.

Again, good to see you guys are thinking about the details. The use cases I am working on range from a single user workstation running an SOE image from a secure hypervisor burnt into the motherboard to a great global grid containing millions of objects. In the former case there is no need for additional complexity because there should only ever be a single workload. The latter case is provided for by way of a flexible search service. It's not yet fully defined but here's a taste (inspired by GData which I can assure you is a pleasure to use even for arbitrarily large numbers of objects):

...

Search service (SS)

Attribute search

It is possible to search for arbitrary attributes by specifying them in the full text search query string (e.g. "?q=cpu.cores:2")

Category search

To search for members of a given category (that is, any resource with a given Atom "term" or "label") do a HTTP GET request to "<entrypoint>/-/cat1/cat2":

- "/-/" denotes that the following fragments are to be interpreted as categories
- multiple categories result in a logical AND query (e.g. "/-/cat1/cat2" means "cat1 AND cat2")
- use pipe (|, URL encoded as %7C) for logical OR (e.g. "/-/cat1|cat2" means "cat1 OR cat2")
- prefixing "-" negates a category (e.g. "/-/cat1/-cat2" means "cat1 AND NOT cat2")
- Atom schemes can be specified using curly braces (e.g. "/-/{urn: example.com}public")
- any of these selectors can be combined (e.g. "/-/cat1%7C-cat2/-cat3" means "(cat1 OR (NOT cat2)) AND (NOT cat3)")
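The category rules above could be read as a small grammar: path fragments are ANDed, pipe-separated terms inside a fragment are ORed, and a leading "-" negates a term. The following Python sketch is one possible interpretation, not part of the draft. The function names `parse_category_path` and `matches` are hypothetical, and Atom schemes in curly braces are deliberately not handled.

```python
from urllib.parse import unquote


def parse_category_path(path):
    """Turn '/-/cat1%7Ccat2/-cat3' into a list of OR-groups that are ANDed.

    Each group is a list of (negated, category) pairs.
    """
    marker = "/-/"
    start = path.find(marker)
    if start < 0:
        return []
    groups = []
    # Split on "/" before decoding so an encoded slash stays inside a term.
    for fragment in path[start + len(marker):].split("/"):
        if not fragment:
            continue
        group = []
        for term in unquote(fragment).split("|"):
            negated = term.startswith("-")
            group.append((negated, term[1:] if negated else term))
        groups.append(group)
    return groups


def matches(categories, groups):
    """True if a resource's set of categories satisfies every OR-group."""
    return all(
        any((name in categories) != negated for negated, name in group)
        for group in groups
    )
```

For example, `parse_category_path("/-/cat1/-cat2")` yields two single-term groups, and a resource tagged only `cat1` satisfies them while one tagged with both `cat1` and `cat2` does not.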

Full text search

For full text search append "?q=<query>" to the entry point URI and make a HTTP GET request. An XML feed containing all matching resources will be returned.

The search service should also allow you to search on attributes so you can do things like search for all servers tagged "linux" with over 4 GB RAM. Search is important - it needs to be powerful (at least for full implementations).

Sam

Richard Davies

8:21 p.m.

...

Search service (SS)

Sam - looks good - I had seen it and thought it was likely your solution for this area. How does it actually handle the two cases which I mentioned?

1) Return all properties for a single known object (e.g. a single server with known UUID). I assume this is somehow done with category search?

2) Return a list of all objects (possibly for a given type), without their properties (since these might be expensive to read). For example, simply list the UUIDs of all servers?

Sam Johnston

8:33 p.m.

On Thu, Apr 16, 2009 at 10:21 PM, Richard Davies < richard.davies@elastichosts.com> wrote:

...

...
Search service (SS)

Sam - looks good - I had seen it and thought it was likely your solution for this area. How does it actually handle the two cases which I mentioned?

1) Return all properties for a single known object (e.g. a single server with known UUID). I assume this is somehow done with category search?

2) Return a list of all objects (possibly for a given type), without their properties (since these might be expensive to read). For example, simply list the UUIDs of all servers?

Ok sorry I forgot to mention that this was actually a requirement I had not considered (that is the performance and scalability cost of unnecessarily calling on extensions).

1. Actually it's even easier than that. If you know a UUID you can just GET <entrypoint>/<uuid> and you'll get the lot. Given categories can be arbitrarily large I'd suggest that they also be sparse.

2. By default we should return just the basic content that's already in the database... so not bothering the performance monitor counters or asking the billing system to sum up usage (we have to hit the database anyway so this is no cost). If we need information from the extensions we should explicitly ask for it. Exactly how this might be done I'm not sure but for simplicity certainly some sort of representation in the query string, e.g. "?ext=br,pm".

There's a Google Data overview <http://code.google.com/apis/gdata/overview.html>, Protocol Basics <http://code.google.com/apis/gdata/docs/2.0/basics.html> and Reference <http://code.google.com/apis/gdata/docs/2.0/reference.html> that are great resources for getting up to speed with Google's approach, and the Sun Cloud APIs <http://kenai.com/projects/suncloudapis> at Kenai (as well as Tim Bray's blog <http://www.tbray.org/ongoing/>) are also worth a look. You know where your API <http://www.elastichosts.com/products/api> is but I've been looking at the GoGrid API <http://wiki.gogrid.com/wiki/index.php/API> (among others) too.

Sam
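To make the "?ext=br,pm" idea concrete, here is a small Python sketch of a retrieve handler that returns only core content by default and calls an extension only when it is named in the query string. Everything beyond the `ext` parameter itself is an assumption for illustration: the `CORE` store, the `retrieve` function, the dummy values, and the reading of "br" and "pm" as a billing and a performance-monitoring extension.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical core attributes already held centrally (dummy data).
CORE = {"uuid-1": {"title": "web01", "state": "started"}}

calls = []  # records which extensions were actually invoked


def billing_report(uuid):
    calls.append(("br", uuid))
    return {"usage": 42}  # placeholder value


def performance_monitor(uuid):
    calls.append(("pm", uuid))
    return {"cpu": 0.5}  # placeholder value


# Assumed mapping from short names in "?ext=" to extension callables.
EXTENSIONS = {"br": billing_report, "pm": performance_monitor}


def retrieve(url):
    """GET <entrypoint>/<uuid>[?ext=a,b]: core data plus requested extensions."""
    parts = urlsplit(url)
    uuid = parts.path.rstrip("/").rsplit("/", 1)[-1]
    requested = parse_qs(parts.query).get("ext", [""])[0]
    result = dict(CORE[uuid])
    for name in requested.split(","):
        if name in EXTENSIONS:
            # Extensions are consulted only when explicitly asked for.
            result[name] = EXTENSIONS[name](uuid)
    return result
```

With this shape, a plain `retrieve("http://example.com/uuid-1")` never bothers the extensions, while appending `?ext=pm` triggers only the performance monitor.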

Richard Davies

17 Apr

8:32 a.m.

...

...
1) Return all properties for a single known object (e.g. a single server with known UUID). I assume this is somehow done with category search?

2) Return a list of all objects (possibly for a given type), without their properties (since these might be expensive to read). For example, simply list the UUIDs of all servers?

Ok sorry I forgot to mention that this was actually a requirement I had not considered (that is the performance and scalability cost of unnecessarily calling on extensions).

1. Actually it's even easier than that. If you know a UUID you can just GET <entrypoint>/<uuid> and you'll get the lot. Given categories can be arbitrarily large I'd suggest that they also be sparse.

That's good and simple.

...

2. By default we should return just the basic content that's already in the database... so not bothering the performance monitor counters or asking the billing system to sum up usage (we have to hit the database anyway so this is no cost).

You may be overestimating how much is in the database (in our case at least!).

To avoid any risk of inconsistency, we typically don't cache any server attributes on the central management servers where these are best mastered on the virtualization hosts themselves. That includes things like server titles, memory sizes, etc, etc. Pretty much just the UUIDs and which hosts they're on is permanently remembered centrally.

Richard.

Sam Johnston

8:54 a.m.

On Fri, Apr 17, 2009 at 10:32 AM, Richard Davies < richard.davies@elastichosts.com> wrote:

...

You may be overestimating how much is in the database (in our case at least!). To avoid any risk of inconsistency, we typically don't cache any server attributes on the central management servers where these are best mastered on the virtualization hosts themselves. That includes things like server titles, memory sizes, etc, etc. Pretty much just the UUIDs and which hosts they're on is permanently remembered centrally.

This is an interesting point even if it's already starting to get down to implementation details. There's little interest in a client retrieving a list of bare UUIDs, except as a precursor to iterating through them individually and requesting more information (which sounds like a source of more significant performance problems to me, especially for remote infrastructure). Nonetheless we should try to avoid such expensive calls (e.g. querying every server)... or at least allow them to be completed in an inexpensive way.

If I were to write a client today I would probably start by asking for categories with which to build a top level tree view. Clicking on that would result in a category search (e.g. http://example.com/-/category) and I'd ask for relevant details such as running state (e.g. starting, started, stopped, etc.) which I would use to select icons etc. Bringing up a property page would result in a retrieval of all the relevant details for a single UUID.

Overall I think that giving the ability to specify which non-core extensions are queried solves most of the problem. Another optimisation might be to have extensions return "cheap" information (such as the rate, currency, etc.) while saving really expensive operations (such as totalling billing logs) for single-resource calls.

Thanks for drawing attention to this need Richard - great feedback!

Sam
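The split between "cheap" and expensive extension data could look something like the following Python sketch. It is only one way to read the suggestion: the `BillingExtension` class, its `cheap` and `expensive` methods, the two helper functions and the sample values are all hypothetical names and data introduced here for illustration.

```python
class BillingExtension:
    """Hypothetical extension offering a cheap and an expensive view."""

    def __init__(self, rate, currency, log):
        self.rate = rate
        self.currency = currency
        self.log = log            # list of (uuid, amount) billing entries
        self.totals_computed = 0  # counts expensive operations performed

    def cheap(self, uuid):
        # Static facts that need no aggregation.
        return {"rate": self.rate, "currency": self.currency}

    def expensive(self, uuid):
        # Totalling the billing log is the costly step.
        self.totals_computed += 1
        total = sum(amount for who, amount in self.log if who == uuid)
        return {**self.cheap(uuid), "total": total}


def category_listing(uuids, states, ext):
    """Collection call: running state plus cheap extension data only."""
    return [{"uuid": u, "state": states[u], **ext.cheap(u)} for u in uuids]


def property_page(uuid, states, ext):
    """Single-resource call: the only place expensive data is computed."""
    return {"uuid": uuid, "state": states[uuid], **ext.expensive(uuid)}
```

A tree view built from `category_listing` can then pick icons from `state` without any billing totals being computed, and the total is only summed when a single property page is opened.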

Tino Vazquez

10:13 a.m.

Hi everyone,

I completely agree with Sam. We had a similar design challenge when building the OpenNebula API: we gathered that offering a call for retrieving all IDs for any set of resources (physical servers and VMs) would be very expensive and would degrade the performance of OpenNebula's cache. BUT, we also found out that many people demanded such a feature.

If we are going to define such a call (querying every server), I vouch for a minimalistic approach (e.g. returning just the UUID and some ID string maybe). We will then have to figure out a way to give such information in a way that doesn't impact (too much) on performance.

Regards,

-Tino

P.S.: Very nice list indeed

On Fri, Apr 17, 2009 at 10:54 AM, Sam Johnston <samj@samj.net> wrote:

...

On Fri, Apr 17, 2009 at 10:32 AM, Richard Davies <richard.davies@elastichosts.com> wrote:

...
You may be overestimating how much is in the database (in our case at least!). To avoid any risk of inconsistency, we typically don't cache any server attributes on the central management servers where these are best mastered on the virtualization hosts themselves. That includes things like server titles, memory sizes, etc, etc. Pretty much just the UUIDs and which hosts they're on is permanently remembered centrally.

This is an interesting point even if it's already starting to get down to implementation details. There's little interest in a client retrieving a list of bare UUIDs, except as a precursor to iterating through them individually and requesting more information (which sounds like a source of more significant performance problems to me, especially for remote infrastructure). Nonetheless we should try to avoid such expensive calls (e.g. querying every server)... or at least allow them to be completed in an inexpensive way.

If I were to write a client today I would probably start by asking for categories with which to build a top level tree view. Clicking on that would result in a category search (e.g. http://example.com/-/category) and I'd ask for relevant details such as running state (e.g. starting, started, stopped, etc.) which I would use to select icons etc. Bringing up a property page would result in a retrieval of all the relevant details for a single UUID.

Overall I think that giving the ability to specify which non-core extensions are queried solves most of the problem. Another optimisation might be to have extensions return "cheap" information (such as the rate, currency, etc.) while saving really expensive operations (such as totalling billing logs) for single-resource calls.

Thanks for drawing attention to this need Richard - great feedback!

Sam

_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

Tino Vazquez

10:29 a.m.

Hi everyone,

I completely agree with Sam. We had a similar design challenge when building the OpenNebula API: we gathered that offering a call for retrieving all IDs for any set of resources (physical servers and VMs) would be very expensive and would degrade the performance of OpenNebula's cache. BUT, we also found out that many people demanded such a feature.

If we are going to define such a call (querying every server), I vouch for a minimalistic approach (e.g. returning just the UUID and some ID string maybe). We will then have to figure out a way to give such information in a way that doesn't impact (too much) on performance.

Regards,

-Tino

P.S.: Very nice list indeed

On Fri, Apr 17, 2009 at 10:54 AM, Sam Johnston <samj@samj.net> wrote:

...


Edmonds, AndrewX

10:33 a.m.

+1 Absolutely - makes for good design and complementary to REST styles.

-----Original Message-----
From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Tino Vazquez
Sent: 17 April 2009 11:30
To: Sam Johnston
Cc: occi-wg@ogf.org
Subject: Re: [occi-wg] Semantics of OCCI API: nouns and verbs

Hi everyone,

I completely agree with Sam. We had a similar design challenge when building the OpenNebula API: we gathered that offering a call for retrieving all IDs for any set of resources (physical servers and VMs) would be very expensive and would degrade the performance of OpenNebula's cache. BUT, we also found out that many people demanded such a feature.

If we are going to define such a call (querying every server), I vouch for a minimalistic approach (e.g. returning just the UUID and some ID string maybe). We will then have to figure out a way to give such information in a way that doesn't impact (too much) on performance.

Regards,

-Tino

P.S.: Very nice list indeed

On Fri, Apr 17, 2009 at 10:54 AM, Sam Johnston <samj@samj.net> wrote:

...


--
Constantino Vázquez, Grid Technology Engineer/Researcher: http://www.dsa-research.org/tinova
DSA Research Group: http://dsa-research.org
Globus GridWay Metascheduler: http://www.GridWay.org
OpenNebula Virtual Infrastructure Engine: http://www.OpenNebula.org

20 comments

10 participants

participants (10)

Alexis Richardson
Andre Merzky
Chris Webb
Edmonds, AndrewX
eprparadocs＠gmail.com
Randy Bias
Richard Davies
Sam Johnston
Tino Vazquez
Tino Vazquez