
Hello again,

To get the ball rolling I've spent some time over the weekend working on an OCCI Walkthrough <http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/Walkthrough> (following in the footsteps of Sun's excellent HelloCloud <http://kenai.com/projects/suncloudapis/pages/HelloCloud> document). It captures the latest thinking and feedback received from many of you over the past few weeks and is intended as a simple (if technical) introduction for new users.

As you can see we are about as close to raw HTTP as we'll ever get, and the use of HTML forms for many operations trivialises implementation on both the server and client side. Shifting further towards a document-based "Resource Oriented Architecture" <http://www.infoq.com/resource/articles/richardson-ruby-restful-ws/en/resources/04.pdf> results in an API that "just makes sense" and is extremely (horizontally) scalable. Multiple representations also allow us to lock down what we need to for interoperability (e.g. the simple OCCI descriptor format) while enabling providers to innovate and differentiate (safely) by supporting various formats (e.g. VMD, VMDK, VMX, OVF, etc.) and features (e.g. auditing, change control, security, etc.).

Anyway, without further ado, the current version is copied below and the latest version is in the wiki <http://forge.ogf.org/sf/wiki/do/viewPage/projects.occi-wg/wiki/Walkthrough>.

Sam

OCCI Walkthrough
----------------

Overview
--------

The Open Cloud Computing Interface (OCCI) is an API for managing cloud infrastructure services (also known as Infrastructure as a Service or IaaS) which strictly adheres to REpresentational State Transfer (REST) principles and is closely tied to HyperText Transfer Protocol (HTTP). For simplicity and scalability reasons it specifically avoids Remote Procedure Call (RPC) style interfaces and can essentially be implemented as a horizontally scalable document repository with which both nodes and clients interact.
This document describes a step-by-step walkthrough of performing various tasks as at the time of writing.

Getting started
---------------

Connecting
----------

Each implementation has a single OCCI end-point URL (we'll use http://example.com/) and everything you need to know is linked from this point - configuring clients is just a case of providing this parameter. In the simplest case the end-point may contain only a single resource or type of resource (e.g. a hypervisor burnt into the BIOS of a motherboard exposing compute resources, a network switch/router exposing network resources or a SAN exposing storage resources) and at the other end of the spectrum it may provide access to a global cloud infrastructure (e.g. the "Great Global Grid" or GGG).

You will only ever see those resources to which you have access (typically all of them for a private cloud or a small subset for a public cloud) and flexible categorisation and search provide fine-grained control over which resources are returned, allowing OCCI to handle the largest of installations.

You will always connect to this end-point over HTTP(S) and given the simplicity of the interface most user-agents are suitable, including libraries (e.g. urllib2, LWP), command line tools (e.g. curl, wget) and full blown browsers (e.g. Firefox).

Authenticating
--------------

When you connect you will normally be challenged to authenticate via HTTP (this is not always the case - in secure/offline environments it may not be necessary) and will need to do so via the specified mechanism. It is anticipated that most implementations will require HTTP Basic Authentication over SSL/TLS so at the very least you should support this (fortunately almost all user-agents already do), but more advanced mechanisms such as NTLM or Kerberos may be deployed. Certain types of accesses (such as a compute resource querying OCCI for introspection and configuration) may be possible anonymously (having already been authenticated by interface and/or IP address).
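
To make the connecting and authenticating steps concrete, here is a minimal Python sketch of a client preparing an HTTP Basic request for the end-point. It is only an illustration: the helper names (basic_auth_header, build_request) and the credentials are made up for this example, and the refusal to send credentials over plain HTTP is an assumption about sensible client behaviour, not something the walkthrough mandates.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(endpoint, username, password, accept="text/html"):
    """Prepare (but do not send) an authenticated GET for the end-point.

    Hypothetical helper: the only configuration a client needs is the
    end-point URL plus one set of credentials.
    """
    if not endpoint.startswith("https://"):
        # Basic credentials are only base64-encoded, so insist on SSL/TLS.
        raise ValueError("refusing to send Basic credentials without TLS")
    request = urllib.request.Request(endpoint, method="GET")
    request.add_header("Authorization", basic_auth_header(username, password))
    request.add_header("Accept", accept)
    return request


if __name__ == "__main__":
    req = build_request("https://example.com/", "alice", "secret")
    print(req.get_method(), req.full_url)
    # Actually sending it would be: urllib.request.urlopen(req)
```

The request is built but never sent, so the sketch can be tried without a live OCCI implementation.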
Should you be redirected by the API to a node, storage device, etc. (for example, to retrieve a large binary representation) then you should either be able to transparently authenticate or a signed URL should be provided. That is, a single set of credentials is all that is required to access the entire system from any point.

Representations
---------------

As the resource itself (e.g. a physical machine, storage array or network switch) cannot be transferred over HTTP (at least not yet!) we instead make available one or more representations of that resource. For example, an API modeling a person might return a picture, fingerprints, identity document(s) or even a digitised DNA sequence, but not the person themselves. A circle might be represented by SVG drawing primitives or any three distinct points on the curve.

For cloud infrastructure there are many useful representations, and while OCCI standardises a number of them for interoperability purposes, an implementation is free to implement others in order to best serve the specific needs of its users and to differentiate itself from other offerings. Examples include:

- Open Cloud Computing Interface (OCCI) descriptor format (application/occi+xml)
- Open Virtualisation Format (OVF) file (application/ovf+xml?)
- Open Virtualisation Archive (OVA) file (application/x-ova?)
- Screenshot of the console (image/png)
- Access to the console (application/x-vnc)

The client indicates which representation(s) it desires by way of the URL and/or HTTP Accept headers (e.g. HTTP Content Negotiation <http://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html>) and if the server is unable to satisfy the request then it should return HTTP 406 Not Acceptable.

Descriptors
-----------

In addition to the protocol itself, OCCI defines a simple key/value based descriptor format for cloud infrastructure resources:

compute
    Provides computational services, ranging from dedicated physical machines (e.g. Dedibox) to virtual machines (e.g. Amazon EC2) to slices/zones/containers (e.g. Mosso Cloud Servers).

network
    Provides connectivity between machines and the outside world. Usually virtual and may or may not be connected to a physical segment.

storage
    Provides storage services, typically via magnetic mass storage devices (e.g. hard drives, RAID arrays, SANs).

Given the simplicity of the format it is trivial to translate between wire formats including plain text, JSON, XML and others. For example:

    compute.cores 2
    compute.speed 3200
    compute.memory 2048

Identifiers
-----------

Each resource is identified by its dereferenceable URL which is by definition unique, giving information about the origin and type of the resource as well as a local identifier (the combination of which forms a globally unique compound key). The primary drawback is that the more information that goes into the key (and therefore the more transparent it is), the more likely it is to change. For example, if you migrate a resource from one implementation to another then its identifier will change (though in this instance the source should provide an HTTP 301 Moved Permanently response along with the new location, assuming it is known, or HTTP 410 Gone otherwise).

In order to realise the benefit of transparent, dereferenceable identifiers while still being able to track resources through their entire lifecycle, an immutable UUID attribute should be allocated which will remain with the resource throughout its life. This is particularly important where the same resource (e.g. a network) appears in multiple places. New implementations should use version 4 (random) UUIDs anyway, as these can be safely allocated by any node without consulting a register/sequence, but where existing identifiers are available they should be used instead (e.g. http://amazon.com/compute/ami-ef48af86).

Operations
----------

Create
------

To create a resource simply POST it to the appropriate collection (e.g. /compute, /network or /storage) as an HTML form (supported by virtually all user agents) or in another supported format (e.g. OVF):

    POST /compute HTTP/1.1
    Host: example.com
    Content-Length: 35
    Content-Type: application/x-www-form-urlencoded

    compute.cores=2&compute.memory=2048

If this was successful the server will automatically allocate an identifier and return HTTP 200 OK along with a Location: header pointing at the newly created resource. You may also PUT the resource directly in place if you already know its identifier or will be generating it on the client side:

    PUT /compute/b10fa926-41a6-4125-ae94-bfad2670ca87 HTTP/1.1
    Host: example.com
    Content-Length: 35
    Content-Type: application/occi+text

    compute.cores 2
    compute.memory 2048

Rather than generating the new resource from scratch you may also be given the option to GET a template and POST or PUT it back (for example, where "small", "medium" and "large" instances or pre-configured appliances are offered).

Retrieve
--------

The simplest command is to retrieve a single resource by conducting an HTTP GET on its URL (which doubles as its identifier):

    GET /compute/b10fa926-41a6-4125-ae94-bfad2670ca87 HTTP/1.1
    Host: example.com

This will return an HTTP 300 Multiple Choices response containing a list of available representations for the resource as well as a suggestion, in the form of an HTTP Location: header, of the default rendering, which should be HTML (thereby allowing standard browsers to access the API directly). An arbitrary number of alternatives may also be returned by way of HTTP Link: headers.

If you just need to know what representations are available you should make a HEAD request instead of a GET - this will return the metadata in the headers without the default rendering.

Collections
-----------

Some requests (such as searches) will need to return a collection of resources.
There are two options:

Pass-by-reference
    A plain text or HTML list of links is provided but each needs to be retrieved separately, resulting in O(n+1) performance.

Pass-by-value
    A wrapper format such as Atom is used to deliver [links to] the content as well as the metadata (e.g. links, associations, caching information, etc.), resulting in O(1) performance.

Update
------

Updating resources is trivial - simply GET the resource, modify it as necessary and PUT it back where you found it.

Delete
------

Simply DELETE the resource:

    DELETE /compute/b10fa926-41a6-4125-ae94-bfad2670ca87 HTTP/1.1
    Host: example.com

Sub-resource Collections
------------------------

(For want of a better name.) Each resource may expose collections for functions such as logging, auditing, change control, documentation and other operations (e.g. http://example.com/compute/123/log/456) in addition to any required by OCCI. As usual CRUD operations map to HTTP verbs (as above) and clients can either PUT entries directly if they know or will generate the identifiers, or POST them to the collection if this will be handled on the server side (using POST Once Exactly (POE) <http://tools.ietf.org/draft/draft-nottingham-http-poe/draft-nottingham-http-poe-00.txt> to ensure idempotency).

Requests
--------

Requests are used to trigger state changes and other operations such as backups, snapshots, migrations and invasive reconfigurations (such as storage resource resizing). Those that do not complete immediately (returning HTTP 200 OK or similar) must be handled asynchronously (returning HTTP 202 Accepted or similar).

    POST /compute/123/requests HTTP/1.1
    Host: example.com
    Content-Length: 27
    Content-Type: application/x-www-form-urlencoded

    state=shutdown&type=acpioff

The actual operation may not start immediately (for example, backups which are only handled daily at midnight) and may take some time to complete (for example a secure erase which requires multiple passes over the disk).
Clients can poll for status periodically or use server push <http://en.wikipedia.org/wiki/Push_technology> (or a non-HTTP technology such as XMPP) to monitor for events.
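
As a rough illustration of how little is involved in moving the key/value descriptor between wire formats and turning it into a create request, here is a minimal Python sketch using only the standard library. The helper names (parse_text_descriptor, form_post) are hypothetical and the plain text layout (one "key value" pair per line) is assumed from the descriptor example in the walkthrough; nothing here is a reference implementation.

```python
import json
from urllib.parse import urlencode


def parse_text_descriptor(text):
    """Parse 'key value' lines into a dict (values are kept as strings)."""
    attributes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        attributes[key] = value.strip()
    return attributes


def form_post(path, host, attributes):
    """Render a form-encoded POST as raw HTTP/1.1 text (illustration only)."""
    body = urlencode(attributes)
    headers = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        # urlencode output is pure ASCII, so characters == bytes here.
        f"Content-Length: {len(body)}",
        "Content-Type: application/x-www-form-urlencoded",
    ]
    return "\r\n".join(headers) + "\r\n\r\n" + body


if __name__ == "__main__":
    descriptor = parse_text_descriptor("compute.cores 2\ncompute.memory 2048")
    print(json.dumps(descriptor, sort_keys=True))  # same data as JSON
    print(form_post("/compute", "example.com", descriptor))
```

Computing Content-Length from the encoded body, rather than typing it by hand, keeps the header in step with the payload when attributes change.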

Sam, Andy et al,

Chris and I have been reviewing the latest API design document, walkthrough, and process. We see an enormous step forward, and are very excited by this. With the latest design, OCCI looks like it'll be an API which ElasticHosts will be enthusiastic to implement. Thank you to Sam and the committee for working through some challenging times and also for taking on board much of our feedback.

I've made some updates to the Features Matrix, Use Cases and API Review on the website. I'm writing out my feedback on format issues in the design and walkthrough below. I'll update the Wiki along these lines once people agree.

Cheers,

Richard.

Descriptor formats
------------------

I see 3 descriptor formats (text/plain, application/json and application/xml) and 1 extra descriptor-like format for creation (application/x-www-form-urlencoded). All of these make sense and are based around the same flat key-value pairs, which look reasonable. I have a few points of consistency:

- Various mime types are discussed, e.g. text/plain on the API Design, but application/occi+text on the Walkthrough. This needs to be consistent.

- Qualification of attributes varies, e.g. 'cores' on the API Design, but 'compute.cores' on the Walkthrough. Again this needs to be consistent. We'd favour plain 'cores', but are not too bothered.

- Which of the descriptor formats are compulsory and hence required for interoperability? The API Design states that the first 3 are all compulsory. The Walkthrough then adds an application/x-www-form-urlencoded format specifically for the initial create POST and for Requests. Would application/occi+xml be acceptable in the create POST? Would application/x-www-form-urlencoded be acceptable for a PUT? Are we really saying that there are 4 compulsory descriptor formats throughout?

- For the Retrieve operation on the Walkthrough, we should support the HTTP Accept: header to select an output descriptor format.
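
To illustrate the last point, here is a small Python sketch of how a server might pick a descriptor format from the Accept: header and fall back to 406 Not Acceptable when nothing matches. It is deliberately simplified: it honours q-values and wildcards but ignores the specificity precedence rules of full HTTP content negotiation, the function names are invented for this example, and the list of available types is just the three descriptor formats mentioned above (whose final mime types are still to be agreed).

```python
def parse_accept(header):
    """Return (media_range, q) pairs sorted by descending q-value."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep the client's order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def matches(media_range, media_type):
    """True if media_type falls within media_range (wildcards allowed)."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def choose_format(accept_header, available):
    """Pick a type from `available`, or None (the caller should send 406)."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return None


if __name__ == "__main__":
    formats = ["text/plain", "application/json", "application/xml"]
    print(choose_format("application/xml;q=0.5, text/plain;q=0.9", formats))
    print(choose_format("image/png", formats))
```

The same selection function would serve for collections, with an Atom media type added to the list of available formats.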
Collection formats
------------------

I very much like the way that there is now a clear separation between the descriptor formats and AtomPub as a collection format (see API Design). This makes it clear that all of the OCCI work is in the descriptor format for the payloads, and these can be wrapped in Atom according to the standard Atom RFCs in exactly the same way that another payload such as OVF could equally be wrapped in Atom. This is probably a good way of transferring potentially mixed sets of OCCI descriptors, OVF, etc., all wrapped in a single feed, with updated dates, etags, etc.

However, I do wonder if a separate collection format is necessary at all for basic use. Each of the descriptor formats itself can very naturally transfer multiple resources (if we add the id as an attribute to each resource):

    text/plain: put a blank line between resources
    application/json: put the resources inside a JSON list
    application/xml: have multiple <occi>...</occi> stanzas

I would like to support returning collections in the native descriptor formats, as above. We should support the HTTP Accept: header when requesting a collection to determine which of the descriptor formats or Atom is used.
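
A quick Python sketch of what the text/plain and application/json variants of this proposal could look like in practice. The attribute names ('id', 'cores') and the function names are only illustrative, following the unqualified naming favoured above; the exact layout would be for the group to agree.

```python
import json


def dump_text(resources):
    """Serialise resources as 'key value' lines, blank line between each."""
    blocks = []
    for resource in resources:
        blocks.append("\n".join(f"{key} {value}" for key, value in resource.items()))
    return "\n\n".join(blocks) + "\n"


def load_text(text):
    """Parse the blank-line-separated form back into a list of dicts."""
    resources = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the resource currently being read.
            if current:
                resources.append(current)
                current = {}
            continue
        key, _, value = line.strip().partition(" ")
        current[key] = value
    if current:
        resources.append(current)
    return resources


def dump_json(resources):
    """Serialise the same resources as a JSON list of objects."""
    return json.dumps(resources, indent=2)


if __name__ == "__main__":
    collection = [
        {"id": "http://example.com/compute/123", "cores": "2"},
        {"id": "http://example.com/compute/456", "cores": "4"},
    ]
    print(dump_text(collection))
    print(dump_json(collection))
```

Because every resource carries its own id, both forms round-trip without needing an Atom wrapper.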
participants (2): Richard Davies, Sam Johnston