ElasticHosts introduction

Quick intro: I'm Chris Webb, CTO of ElasticHosts, and I designed the ElasticHosts API (see http://www.elastichosts.com/products/api for docs), which has been well received by users and integrators for its clarity and simplicity. See for example Ignacio's comments on ElasticHosts integration with OpenNebula:
http://www.ogf.org/OGF25/materials/1567/OpenNebula.pdf
We've been discussing API standardisation with Randy Bias from GoGrid for a little while, and Alexis and Sam have been encouraging us to get involved now the OCCI group is properly underway.
One of our key design goals was that it be possible to describe our complete API in a couple of pages, and interface to it in a couple of lines of shell script without proprietary tools. I strongly believe that producing a heavyweight, overengineered API which doesn't satisfy this aim completely fails end users. We wrote some thoughts on this on our website in a New Year's blog posting:
http://www.elastichosts.com/blog/2009/01/01/designing-a-great-http-api/
and some more general points in a submission to the working group for cloud infrastructure API standardization at OGF25:
http://www.elastichosts.com/blog/2009/03/03/cloud-api-standardization-at-ogf...
We're looking forward to discussing these issues with the newly formed group, and I'm about to follow up on some of the work the group has done to date and some of the major issues.
Cheers,
Chris.

In reply to Chris Webb <chris.webb@elastichosts.com>, Thu, Apr 16, 2009 at 11:32 AM:
Hi Chris,
Thanks for the intro and the useful links. I've read your "designing a great HTTP API" post before and agree that we should not be reinventing the wheel (e.g. rely on SSL/TLS and HTTP for auth), though I'm not sure (as in I haven't formed an opinion) whether we should support "choice of syntax"/multiple formats (e.g. XML vs JSON vs ???). Tim Bray (who we'd very much like to see get on board) says <http://twitter.com/timbray/statuses/1396042066> he "*Really [doesn't] like the JSON-*or*-XML trend. Adds work, no benefit, pick 1*". Perhaps the answer is to require one and provide for others.
Sun Cloud APIs are, so far as I can tell, straight from the Q-Layer acquisition (which provided an Ajax virtual data center development tool), so the choice of JSON was obvious. It's less obvious when your clients are likely to be anything from simple HTTP utilities (curl/wget) to command line tools, "thick" management clients and automated agents (MS System Center, RightScale, etc.). The same applies when you want a standard that doesn't stifle innovation and is thus easily and infinitely extensible, while providing a common denominator for interoperability.
What I've done so far is base it on an extremely simple, well specified and common XML structure (Atom - RFC 4287) and (following Sun's example) embedded links for operations:
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
These links can obviously be actuated by any HTTP client and they could obviously be embedded in a format of your choice (e.g. JSON).
Sam
_______________________________________________
occi-wg mailing list
occi-wg@ogf.org
http://www.ogf.org/mailman/listinfo/occi-wg

Sam Johnston <samj@samj.net> writes:
Tim Bray (who we'd very much like to see get on board) says <http://twitter.com/timbray/statuses/1396042066> he "*Really [doesn't] like the JSON-*or*-XML trend. Adds work, no benefit, pick 1*". Perhaps the answer is to require one and provide for others.
Having implemented a cloud infrastructure platform with support for several data formats, I dispute his claim with evidence! We originally supported only the simple text/plain KEY VALUE format in our API. Here is the diffstat from our infrastructure code for the changeset that implemented JSON:
  $ hg export 7c87001589f5 | diffstat
   bin/apiserver | 50 +++++++++++++++++++++++++++++-------------------
   1 file changed, 31 insertions(+), 19 deletions(-)
As you can see, it's a microscopic amount of extra code. I expect an even smaller diff to be required to add XML. (Nobody's asked for it yet, so we've prioritised other development: all our test users seemed to prefer plaintext or JSON at present.)
Of course, it's only easy for us because we have designed our API to be so simple and direct without excessive abstraction...
Cheers,
Chris.
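As an illustration of why a second wire format can be this cheap when the data model is flat, here is a minimal Python sketch. It is not the ElasticHosts apiserver code, which is not shown in this thread; the `render` function and the field names `name`, `cpu` and `mem` are assumptions made up for the example.

```python
import json


def render(record: dict[str, str], fmt: str = "text") -> str:
    """Serialise one flat record as KEY VALUE lines or as JSON."""
    if fmt == "json":
        return json.dumps(record, sort_keys=True)
    if fmt == "text":
        # One "key value" pair per line; keys are assumed to contain no spaces.
        return "\n".join(f"{key} {value}" for key, value in record.items())
    raise ValueError(f"unsupported format: {fmt}")


# Hypothetical server description used only for this example.
server = {"name": "TestServer", "cpu": "2000", "mem": "1024"}
print(render(server, "text"))
print(render(server, "json"))
```

Both outputs come from the same dictionary, so supporting another syntax is a matter of adding one more branch rather than a second data model.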

In reply to Chris Webb:
Having implemented "cloud storage" in SOAP, XMLRPC, JSON and even Sun RPC, I can see no specific benefit to preferring one over another except for pure performance (if that is an issue). Certainly the amount of code to handle one over another is pretty minuscule nowadays.
I think the approach Sam has taken, very agnostic, is the right approach.
Chuck Wegrzyn
Twisted Storage

In reply to <eprparadocs@gmail.com>, Thu, Apr 16, 2009 at 1:53 PM:
Ok so basically the request syntax (e.g. URLs, HTTP verbs) will be common, as well as the attributes and actuators (e.g. start, stop, restart).
I don't have a problem with supporting the most brain dead of clients (after all, I wrote the cush <http://code.google.com/p/cush/> cloud computing shell a while back with this in mind), but I'd like to keep some middle ground (e.g. for web interfaces like Q-Layer, so JSON?) and support unfettered extension for handling advanced functionality like billing and reporting, SLAs, security management, transparent embedding of supporting formats like OVF (all of which requires the additional flexibility of XML).
Sam

In reply to Sam Johnston:
Sam,
Perhaps I didn't quite understand your approach. I would think we should enumerate the parts of each API and the interpretation and leave to another section the actual bits over the wire. That way we could support anything, including binary XML!
Chuck

In reply to <eprparadocs@gmail.com>, Thu, Apr 16, 2009 at 2:20 PM:
Actually you're spot on - I'm just pointing out what's common and conceding that we should be flexible about the 1's and 0's.
Sam

eprparadocs@gmail.com writes:
Perhaps I didn't quite understand your approach. I would think we should enumerate the parts of each API and the interpretation and leave to another section the actual bits over the wire. That way we could support anything, including binary XML!
Sam Johnston <samj@samj.net> writes:
Actually you're spot on - I'm just pointing out what's common and conceding that we should be flexible about the 1's and 0's.
Great, I agree with both of you!
Our supporting a variety of API formats has given a significant win for those end-users not using heavyweight XML-friendly languages, e.g. writing shell scripts in the way one might when working with traditional infrastructure and the contents of one's virtual machines. I don't want to lose that user-convenience.
Cheers,
Chris.

Quoting [Chris Webb] (Apr 16 2009):
eprparadocs@gmail.com writes:
Perhaps I didn't quite understand your approach. I would think we should enumerate the parts of each API and the interpretation and leave to another section the actual bits over the wire. That way we could support anything, including binary XML!
Sam Johnston <samj@samj.net> writes:
Actually you're spot on - I'm just pointing out what's common and conceding that we should be flexible about the 1's and 0's.
Great, I agree with both of you!
Our supporting a variety of API formats has given a significant win for those end-users not using heavyweight XML-friendly languages, e.g. writing shell scripts in the way one might when working with traditional infrastructure and the contents of one's virtual machines. I don't want to lose that user-convenience.
+1
--
Nothing is ever easy.

On Thu, Apr 16, 2009 at 2:36 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
Actually you're spot on - I'm just pointing out what's common and conceding that we should be flexible about the 1's and 0's.
Great, I agree with both of you!
Our supporting a variety of API formats has given a significant win for those end-users not using heavyweight XML-friendly languages, e.g. writing shell scripts in the way one might when working with traditional infrastructure and the contents of one's virtual machines. I don't want to lose that user-convenience.
Great! So another bonus of having XML available at the top end is that it can be trivially transformed into whatever you like. For example, did you know that behind the scenes google.com renders search results in XML <http://www.google.com/coop/docs/cse/resultsxml.html#wsXMLExample> before using XSLT to transform them into HTML?
I don't want to talk too much about what the future holds for OCCI (for fear of accusation of boiling the ocean et al) but there would be nothing stopping an XSLT god (e.g. not me) from writing an XSLT based web interface with all the ontological, taxonomical & semantic goodness their heart desired ;)
Sam

Sam Johnston wrote:
What I've done so far is based it on an extremely simple, well specified and common XML structure (Atom - RFC 4287) and (following Sun's example) embedded links for operations:
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
These links can obviously be actuated by any HTTP client and they could obviously be embedded in a format of your choice (e.g. json).
Hi Sam,
The links for operations look good - a simple HTTP POST to a straight url.
I'm more concerned about the Atom XML. I agree that what you've done is a decent implementation of standard Atom XML, but am worried about the amount of overhead which this has introduced.
Looking at the sample code on the wiki at present, this is currently 2778 characters spread over 53 lines, using 5 xml namespaces and around 20-30 xml tags with up to 7 levels of indentation. And it's explicitly abbreviated with a '...'.
What this code actually does is return a handful of details about a single virtual machine. If I only list the actual data fields then we have:
<server>
 <id>decca5a5-8952-4004-9793-cdbbf05c3c63</id>
 <title>Debian GNU/Linux 5.0 Virtual Appliance</title>
 <summary>Base installation of Debian GNU/Linux 5.0</summary>
 <cpu>2</cpu>
 <mem>4Gb</mem>
 <disk id="file1" href="virtual-disk.vmdk" size="148251374"/>
 <nic>2</nic>
 <state>RUNNING</state>
 <meter rate="0.10" currency="USD" unit="hours">35.27</meter>
 <monitor type="cpu">75.2%</monitor>
 <monitor type="mem">1059374258</monitor>
 <storage id="4696b561-a253-42b4-bd27-7aa4950e0a60"/>
 <network id="45a73b80-c957-4ae1-97c6-b70652eba1d1"/>
</server>
That's 575 characters over 15 lines with 14 tags in 1 namespace and 1 level of indentation. It is now simple enough that I could trivially parse it or generate it from any programming language without any libraries if necessary. The equivalent JSON is also obvious and simple.
Both the ElasticHosts and GoGrid APIs are written much more in this second style of syntax - which is what we mean by a "design pattern B" - work out exactly what data is needed for each operation and write down exactly this with the absolute minimum of syntax overhead.
Cheers,
Richard.
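To make the "trivially parse it" point concrete, here is a small Python sketch that reads an abbreviated copy of the simplified <server> document above using only the standard library. The `parse_server` helper and the shape of the dictionary it returns are assumptions for illustration, not part of any proposed OCCI schema.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the simplified document from the message above.
SERVER_XML = """\
<server>
 <id>decca5a5-8952-4004-9793-cdbbf05c3c63</id>
 <title>Debian GNU/Linux 5.0 Virtual Appliance</title>
 <cpu>2</cpu>
 <mem>4Gb</mem>
 <state>RUNNING</state>
 <monitor type="cpu">75.2%</monitor>
 <monitor type="mem">1059374258</monitor>
 <network id="45a73b80-c957-4ae1-97c6-b70652eba1d1"/>
</server>
"""

SIMPLE_FIELDS = ("id", "title", "cpu", "mem", "state")


def parse_server(xml_text: str) -> dict:
    """Return the server's simple fields, monitors and network ids."""
    root = ET.fromstring(xml_text)
    info = {child.tag: child.text for child in root if child.tag in SIMPLE_FIELDS}
    # Repeated elements are keyed by their distinguishing attribute.
    info["monitors"] = {m.get("type"): m.text for m in root.findall("monitor")}
    info["networks"] = [n.get("id") for n in root.findall("network")]
    return info


server = parse_server(SERVER_XML)
print(server["state"], server["monitors"])
```

With a single namespace and one level of nesting, the whole parser is a handful of lines, which is the property Richard is arguing for.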

On Thu, Apr 16, 2009 at 2:44 PM, Richard Davies <richard.davies@elastichosts.com> wrote:
The links for operations look good - a simple HTTP POST to a straight url.
I didn't think there would be much contention here. I'm surprised I didn't get more abuse about using UUID4's but that's a no brainer when you understand the concurrency issues (and don't want to expose any secrets, particularly about the size of your operation). I'm certainly not the first to do this either.
I'm more concerned about the Atom XML. I agree that what you've done is a decent implementation of standard Atom XML, but am worried about the amount of overhead which this has introduced.
When you get down to the details (as I did over the weekend) you'll see that Atom fits like a glove for this purpose and gives us many things for free (like tagging, categories and embedding of arbitrary content types like OVF) that we would otherwise have to implement ourselves. It also critically provides a clean, simple way to link between resources and specify attributes on the links themselves (e.g. network X is connected to virtual machine Y on interface eth0).
Looking at the sample code on the wiki at present, this is currently 2778 characters spread over 53 lines, using 5 xml namespaces and around 20-30 xml tags with up to 7 levels of indentation. And it's explicitly abbreviated with a '...'.
Right, sure XML is more verbose. It's also easy to extend, sign, encrypt and perhaps most importantly, validate. I actually think Atom is a nice middle ground between mickey mouse plain text APIs and WS-[death]star. As a bonus it seamlessly and transparently allows demanding enterprise users to transport whatever esoteric XML-based SLAs and other cruft they like. We can also natively support (or at least transport) OVF too which I think will be increasingly important for interoperability.
What this code actually does is return a handful of details about a single virtual machine. If I only list the actual data fields then we have:
<server>
 <id>decca5a5-8952-4004-9793-cdbbf05c3c63</id>
 <title>Debian GNU/Linux 5.0 Virtual Appliance</title>
 <summary>Base installation of Debian GNU/Linux 5.0</summary>
 <cpu>2</cpu>
 <mem>4Gb</mem>
 <disk id="file1" href="virtual-disk.vmdk" size="148251374"/>
 <nic>2</nic>
 <state>RUNNING</state>
 <meter rate="0.10" currency="USD" unit="hours">35.27</meter>
 <monitor type="cpu">75.2%</monitor>
 <monitor type="mem">1059374258</monitor>
 <storage id="4696b561-a253-42b4-bd27-7aa4950e0a60"/>
 <network id="45a73b80-c957-4ae1-97c6-b70652eba1d1"/>
</server>
That's 575 characters over 15 lines with 14 tags in 1 namespace and 1 level of indentation. It is now simple enough that I could trivially parse it or generate it from any programming language without any libraries if necessary. The equivalent JSON is also obvious and simple.
Right, and when someone adds a "flux capacitor" to the list of available resources we need to go back to the drawing board (or they'll just add it themselves and we're back to the old embrace and extend approach). I envisage that we could [have IANA] maintain a similar registry of resource types as they do for link relations <http://www.iana.org/assignments/link-relations/link-relations.xhtml>, but I think the need for such a thing will become clearer as we delve further into the problem.
Both the ElasticHosts and GoGrid APIs are written much more in this second style of syntax - which is what we mean by a "design pattern B" - work out exactly what data is needed for each operation and write down exactly this with the absolute minimum of syntax overhead.
Sure, and you'll get that too. Let me whip up some examples to go alongside the XML.
Sam

On Thu, Apr 16, 2009 at 3:12 PM, Sam Johnston <samj@samj.net> wrote:
Sure, and you'll get that too. Let me whip up some examples to go alongside the XML.
Ok so thanks to your feedback today, here's a first pass at flattening the Atom into INI file <http://en.wikipedia.org/wiki/Initialization_file> format (basically what you had but with "=" for human & computer readability):
[decca5a5-8952-4004-9793-cdbbf05c3c63]
category = server
title = Debian GNU/Linux 5.0 Virtual Appliance
summary = Base installation of Debian GNU/Linux 5.0
content.cpu = 2
content.memory = 4Gb
link.disk[0].id = 4696b561-a253-42b4-bd27-7aa4950e0a60
link.disk[0].dev = sda
link.network[0].id = 45a73b80-c957-4ae1-97c6-b70652eba1d1
link.network[0].dev = eth0
mc.state = RUNNING
br.meter.rate = 0.10
br.meter.currency = USD
br.meter.unit = hours
br.meter.total = 35.27
pm.monitor.cpu = 75.2
pm.monitor.mem = 1059374258
mc.ops.start = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
mc.ops.stop = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop
mc.ops.restart = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart
mc.ops.suspend = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/suspend
[4696b561-a253-42b4-bd27-7aa4950e0a60]
category = storage
content.size = 148251374
link.self = virtual-disk.vmdk
[45a73b80-c957-4ae1-97c6-b70652eba1d1]
category = network
content.vlan = 4095
content.dhcp = true
content.subnet = 192.168.0.0
content.netmask = 255.255.0.0
content.gateway = 192.168.0.1
av.com.cisco.cdp = true
Note in the last line that vendors can safely extend the schema with attribute/value pairs for things like CDP.
Anyway I have to scurry off to the CloudCampParis organisers' meeting - back around 7.
Cheers,
Sam
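For readers who want to try the INI rendering above, the following Python sketch loads an abbreviated copy with the standard `configparser` module and follows a disk link from the server to its storage resource. The `load_resources` helper is a name invented for this example, and restricting the delimiter to "=" is an assumption made so that the ":" in the operation URLs is not treated as a separator.

```python
import configparser

# Abbreviated copy of the INI rendering from the message above.
INI_TEXT = """\
[decca5a5-8952-4004-9793-cdbbf05c3c63]
category = server
title = Debian GNU/Linux 5.0 Virtual Appliance
content.cpu = 2
link.disk[0].id = 4696b561-a253-42b4-bd27-7aa4950e0a60
link.disk[0].dev = sda
mc.state = RUNNING
mc.ops.start = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start

[4696b561-a253-42b4-bd27-7aa4950e0a60]
category = storage
content.size = 148251374
"""


def load_resources(text: str) -> dict[str, dict[str, str]]:
    """Map each section name (resource id) to its key/value pairs."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


resources = load_resources(INI_TEXT)
server_id = "decca5a5-8952-4004-9793-cdbbf05c3c63"
disk_id = resources[server_id]["link.disk[0].id"]
print(resources[disk_id]["category"], resources[disk_id]["content.size"])
```

Every value comes back as a string, so units and types (for example whether `content.size` is in bytes) still have to be agreed separately.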

Sam Johnston wrote:
Here's a first pass at flattening the Atom into INI file format (basically what you had but with "=" for human & computer readability):
Great stuff - I think this is a big step forward to be able to express everything as a simple list of objects, each specified by simple key-value pairs.
Hopefully we can also similarly add a JSON version using the same simple data structures, e.g.:
{"category":"server", "title":"Debian...", "mc.state":"running", ... }
I've got two specific comments on the example you give:
1) I'm not sure INI format is actually the best text format for key-value. I'd prefer something easier to parse from Unix shell, which is where I imagine most simple scripts will be written. ElasticHosts went with
"key" (without spaces), <space>, "value" (any characters including spaces)
since this can be parsed with
cat file | while read key value ; do ... ; done
2) Going through the keys and values in detail:
[decca5a5-8952-4004-9793-cdbbf05c3c63]
I like UUIDs and ElasticHosts also uses them, but I might loosen the requirement to any unique string of hex and dashes (since other vendors may prefer to number sequentially, etc.)
category = server
title = Debian GNU/Linux 5.0 Virtual Appliance
summary = Base installation of Debian GNU/Linux 5.0
Do we need both a title ('name' with ElasticHosts at present) and a summary or can we just have one of these?
content.cpu = 2
content.memory = 4Gb
We need to agree units here! Presumably memory would be specified in 'GB' or alternatively 'MB', 'kB' or nothing. Is CPU the speed quota or the number of virtual cores? I recommend cores=<integer> and an additional key for speed quota (ElasticHosts uses cpu=<total MHz to divide across all cores>)
Can we cut the namespace and just write:
cores = 2
cpu = 4000MHz
mem = 4GB
link.disk[0].id = 4696b561-a253-42b4-bd27-7aa4950e0a60
link.disk[0].dev = sda
link.network[0].id = 45a73b80-c957-4ae1-97c6-b70652eba1d1
link.network[0].dev = eth0
This is good - a mapping between hardware devices and uuids of the storage or network objects.
We don't need the [0] indices, since the 'dev' specifiers are already fully unique. Taking those out and cutting the namespace gives something like:
disk.sda = 4696b561-a253-42b4-bd27-7aa4950e0a60
network.eth0 = 45a73b80-c957-4ae1-97c6-b70652eba1d1
mc.state = RUNNING
br.meter.rate = 0.10
br.meter.currency = USD
br.meter.unit = hours
br.meter.total = 35.27
pm.monitor.cpu = 75.2
pm.monitor.mem = 1059374258
All look reasonable, but again I would cut the namespaces:
state = RUNNING
br.rate = 0.10
br.currency = USD
br.unit = hours
br.total = 35.27
pm.cpu = 75.2
pm.mem = 1059374258
mc.ops.start = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
mc.ops.stop = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop
mc.ops.restart = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart
mc.ops.suspend = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/suspend
Do we need these at all? Surely these will always be the operations which are possible on a RUNNING server, and so can always be constructed based on the UUID.
Also, why have 'ops' in the URLs? Why not just
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/start
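The suggestion that operation links can be derived from the UUID alone could look like the Python sketch below. The URL layout `<base>/<uuid>/<operation>` follows the shortened form proposed just above; the function names, the fixed operation list and the empty POST body are assumptions for illustration. The request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import quote

# Operations named in the thread; a real API might expose a different set.
OPERATIONS = ("start", "stop", "restart", "suspend")


def operation_url(base_url: str, server_id: str, operation: str) -> str:
    """Build the operation URL for a server from its identifier."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return f"{base_url.rstrip('/')}/{quote(server_id, safe='')}/{operation}"


def build_request(base_url: str, server_id: str, operation: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST that would trigger the operation."""
    url = operation_url(base_url, server_id, operation)
    return urllib.request.Request(url, data=b"", method="POST")


request = build_request(
    "http://example.com", "decca5a5-8952-4004-9793-cdbbf05c3c63", "start"
)
print(request.get_method(), request.full_url)
```

The trade-off is that a client building URLs itself has to know which operations are valid in which state, whereas embedded links let the server advertise exactly what is currently possible.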
[4696b561-a253-42b4-bd27-7aa4950e0a60]
I guess storage needs a 'title' (or 'name') too?
category = storage
content.size = 148251374
Why not just 'size'?
link.self = virtual-disk.vmdk
Not sure what this is?
[45a73b80-c957-4ae1-97c6-b70652eba1d1]
Again, maybe a 'name'?
category = network
content.vlan = 4095
content.dhcp = true
content.subnet = 192.168.0.0
content.netmask = 255.255.0.0
content.gateway = 192.168.0.1
Once again, I'd take the 'content' prefix off all of these.
The keys you list here work when the network interface is on a private VLAN, but are the wrong set when it is on the public internet. On the public internet, the cloud vendor, not the user, defines most of these parameters and needs to be able to prevent the customer VM from "stealing" IPs from other customers. The customer has access to a defined set of static IPs which they have purchased or alternatively a free dynamic IP assigned at boot, and all they should be able to specify is which of these they want on this particular interface, and whether they want to receive it by DHCP.
For instance, ElasticHosts currently specifies this as:
ip = <specified static IP address or 'auto' to assign dynamically at boot>
dhcp = <ip address to send by dhcp or 'auto'; no dhcp if not present>
Given that the customer will have a set of static IPs which they have purchased (common concept across Amazon, ElasticHosts, GoGrid, etc.), the API also needs an ability for them to list what these are!
av.com.cisco.cdp = true
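The space-separated key/value format described in point 1 of this message (key without spaces, one space, then a value that may contain spaces) is as easy to read outside the shell as inside it. The Python sketch below mirrors the `while read key value` loop and emits the same record as JSON; the sample keys and values are taken from the simplified names suggested in the message, and the function name is an assumption.

```python
import json


def parse_key_value(text: str) -> dict[str, str]:
    """Parse "key value" lines; the value is everything after the first space."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        record[key] = value.lstrip()
    return record


SAMPLE = """\
name Debian GNU/Linux 5.0 Virtual Appliance
cores 2
cpu 4000MHz
mem 4GB
state RUNNING
"""

record = parse_key_value(SAMPLE)
print(json.dumps(record, indent=1))
```

Because the structure is a flat mapping, the JSON form falls out directly from the parsed dictionary without any extra schema.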

I'd +1 moving back to simplified XML from INI.
Chuck Wegrzyn
Richard Davies wrote:
Sam Johnston wrote:
Here's a first pass at flattening the Atom into INI file format (basically what you had but with "=" for human & computer readability):
Great stuff - I think this is a big step forward to be able to express everything as a simple list of objects, each specified by simple key-value pairs. Hopefully we can also similarly add a JSON version using the same simple data structures, e.g.:
{"category":"server", "title":"Debian...", "mc.state":"running", ... }
I've got two specific comments on the example you give:
1) I'm not sure INI format is actually the best text format for key-value. I'd prefer something easier to parse from Unix shell, which is where I imagine most simple scripts will be written. ElasticHosts went with
"key" (without spaces), <space>, "value" (any characters including spaces)
since this can be parsed with
cat file | while read key value ; do ... ; done
2) Going through the keys and values in detail:
[decca5a5-8952-4004-9793-cdbbf05c3c63]
I like UUIDs and ElasticHosts also uses them, but I might loosen the requirement to any unique string of hex and dashes (since other vendors may prefer to number sequentially, etc.)
category = server title = Debian GNU/Linux 5.0 Virtual Appliance summary = Base installation of Debian GNU/Linux 5.0
Do we need both a title ('name' with ElasticHosts at present) and a summary or can we just have one of these?
content.cpu = 2 content.memory = 4Gb
We need to agree units here! Presumably memory would be specified in 'GB' or alternatively 'MB', 'kB' or nothing. Is CPU the speed quota or the number of virtual cores? I recommend cores=<integer> and an additional key for speed quota (ElasticHosts uses cpu=<total MHz to divide across all cores>)
Can we cut the namespace and just write:
cores = 2 cpu = 4000MHz mem = 4GB
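To make the units question concrete, here is a rough Python sketch of how a client might normalise such values. The unit table (binary multiples, case-sensitive suffixes) is purely an assumption for illustration, since the units have not been agreed; note that under this assumption the original '4Gb' is rejected rather than guessed at:

```python
import re

# Assumed multipliers (binary multiples for memory); nothing here is agreed.
MEMORY_UNITS = {"": 1, "kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def parse_memory(value):
    """Convert a string such as '4GB' or '512MB' to a byte count."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]*)", value.strip())
    if match is None or match.group(2) not in MEMORY_UNITS:
        raise ValueError(f"unrecognised memory value: {value!r}")
    return int(match.group(1)) * MEMORY_UNITS[match.group(2)]

def parse_mhz(value):
    """Convert '4000MHz' (or a bare number) to an integer number of MHz."""
    match = re.fullmatch(r"(\d+)\s*(?:MHz)?", value.strip())
    if match is None:
        raise ValueError(f"unrecognised CPU value: {value!r}")
    return int(match.group(1))
```

Whatever table is eventually chosen, rejecting unknown suffixes rather than guessing keeps 'Gb' (gigabits) from being silently read as 'GB' (gigabytes).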
link.disk[0].id = 4696b561-a253-42b4-bd27-7aa4950e0a60 link.disk[0].dev = sda link.network[0].id = 45a73b80-c957-4ae1-97c6-b70652eba1d1 link.network[0].dev = eth0
This is good - a mapping between hardware devices and uuids of the storage or network objects.
We don't need the [0] indices, since the 'dev' specifiers are already fully unique. Taking those out and cutting the namespace gives something like:
disk.sda = 4696b561-a253-42b4-bd27-7aa4950e0a60 network.eth0 = 45a73b80-c957-4ae1-97c6-b70652eba1d1
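The rewrite from indexed links to device-keyed attributes is mechanical. A small Python sketch of the transformation, assuming every indexed entry carries both an 'id' and a 'dev' field (the function name is hypothetical):

```python
import re

# Matches keys such as 'link.disk[0].id' or 'link.network[0].dev'.
INDEXED = re.compile(r"link\.(\w+)\[(\d+)\]\.(id|dev)$")

def flatten_links(attrs):
    """Turn link.<kind>[n].id / .dev pairs into '<kind>.<dev> = <id>'."""
    grouped = {}
    for key, value in attrs.items():
        match = INDEXED.match(key)
        if match:
            kind, index, field = match.groups()
            grouped.setdefault((kind, int(index)), {})[field] = value
    flat = {}
    for (kind, _index), fields in sorted(grouped.items()):
        # Assumes both fields are present; a missing one raises KeyError.
        flat[f"{kind}.{fields['dev']}"] = fields["id"]
    return flat

links = {
    "link.disk[0].id": "4696b561-a253-42b4-bd27-7aa4950e0a60",
    "link.disk[0].dev": "sda",
    "link.network[0].id": "45a73b80-c957-4ae1-97c6-b70652eba1d1",
    "link.network[0].dev": "eth0",
}
flat = flatten_links(links)
```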
mc.state = RUNNING br.meter.rate = 0.10 br.meter.currency = USD br.meter.unit = hours br.meter.total = 35.27 pm.monitor.cpu = 75.2 pm.monitor.mem = 1059374258
All look reasonable, but again I would cut the namespaces:
state = RUNNING br.rate = 0.10 br.currency = USD br.unit = hours br.total = 35.27 pm.cpu = 75.2 pm.mem = 1059374258
mc.ops.start = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start mc.ops.stop = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop mc.ops.restart = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart mc.ops.suspend = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/suspend
Do we need these at all? Surely these will always be the operations which are possible on a RUNNING server, and so can always be constructed based on the UUID.
Also, why have 'ops' in the URLs? Why not just
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/start
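If the operation URLs really can be derived from the UUID and the current state, a client could build them itself. A Python sketch under that assumption; the state-to-operations table below is invented for illustration and is not taken from any specification:

```python
BASE_URL = "http://example.com"

# Assumed mapping of server state to the verbs valid in that state.
OPS_BY_STATE = {
    "RUNNING": ("stop", "restart", "suspend"),
    "STOPPED": ("start",),
}

def operation_urls(uuid, state, base_url=BASE_URL):
    """Build '<base>/<uuid>/<op>' for every operation valid in this state."""
    return {op: f"{base_url}/{uuid}/{op}" for op in OPS_BY_STATE.get(state, ())}

urls = operation_urls("decca5a5-8952-4004-9793-cdbbf05c3c63", "RUNNING")
```

The trade-off is that the state table then lives in every client, whereas embedded links let the server change it without breaking anyone.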
[4696b561-a253-42b4-bd27-7aa4950e0a60]
I guess storage needs a 'title' (or 'name') too?
category = storage content.size = 148251374
Why not just 'size'?
link.self = virtual-disk.vmdk
Not sure what this is?
[45a73b80-c957-4ae1-97c6-b70652eba1d1]
Again, maybe a 'name'?
category = network content.vlan = 4095 content.dhcp = true content.subnet = 192.168.0.0 content.netmask = 255.255.0.0 content.gateway = 192.168.0.1
Once again, I'd take the 'content' prefix off all of these.
The keys you list here work when the network interface is on a private VLAN, but are the wrong set when it is on the public internet.
On the public internet, the cloud vendor, not the user, defines most of these parameters and needs to be able to prevent the customer VM from "stealing" IPs from other customers.
The customer has access to a defined set of static IPs which they have purchased or alternatively a free dynamic IP assigned at boot, and all they should be able to specify is which of these they want on this particular interface, and whether they want to receive a DHCP for it.
For instance, ElasticHosts currently specifies as:
ip = <specified static IP address or 'auto' to assign dynamically at boot> dhcp = <ip address to send by dhcp or 'auto'; no dhcp if not present>
Given that the customer will have a set of static IPs which they have purchased (common concept across Amazon, ElasticHosts, GoGrid, etc.), the API also needs an ability for them to list what these are!
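As a rough illustration of the vendor-side check this implies, here is a Python sketch that accepts either 'auto' or one of the customer's own static IPs. The function name and the idea of passing the owned set in as a list are assumptions; how the list is obtained is exactly the enumeration call the API would need:

```python
import ipaddress

def choose_ip(requested, owned_static_ips):
    """Return the IP setting for an interface, refusing IPs the customer does not own."""
    if requested == "auto":
        return "auto"  # dynamic IP assigned at boot
    address = ipaddress.ip_address(requested)  # raises ValueError if malformed
    owned = {ipaddress.ip_address(ip) for ip in owned_static_ips}
    if address not in owned:
        raise PermissionError(f"{requested} is not one of this customer's static IPs")
    return str(address)

# Hypothetical purchased addresses (documentation range).
owned = ["192.0.2.10", "192.0.2.11"]
```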
av.com.cisco.cdp = true
occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg

On Thu, Apr 16, 2009 at 7:09 PM, <eprparadocs@gmail.com> wrote:
I'd +1 moving back to simplified XML from INI.
We didn't move as such - a plain text version of the API for dumb clients (like shell scripts and humans armed with curl/wget) makes sense, as does a web-friendly JSON/YAML option. XML can be reserved for the hardcore stuff but if you've got to parse XML anyway you may as well parse Atom - trust me, a lot of the stuff you end up trying to deal with (such as categories, links, etc.) are already done for you and there's a heap of code and developers who already grok it (e.g. Google GData APIs). Basically I'm trying to extract the best points of all of the APIs and leave the cruft (such as in-band authentication) behind. Sam

So, what this discussion basically boils down to is, that this group should do two steps:
A) define the nouns and verbs for the API, and nail down semantics for them
B) do different bindings for the result of (A)
Looks sensible (to me), and there seem to be enough people around who can check the process in (A) for implementability in the various options of (B).
Is that what it is going to be?
Andre
Quoting [Sam Johnston] (Apr 16 2009):
On Thu, Apr 16, 2009 at 7:09 PM, <eprparadocs@gmail.com> wrote:
I'd +1 moving back to simplified XML from INI.
We didn't move as such - a plain text version of the API for dumb clients (like shell scripts and humans armed with curl/wget) makes sense, as does a web-friendly JSON/YAML option. XML can be reserved for the hardcore stuff but if you've got to parse XML anyway you may as well parse Atom - trust me, a lot of the stuff you end up trying to deal with (such as categories, links, etc.) are already done for you and there's a heap of code and developers who already grok it (e.g. Google GData APIs). Basically I'm trying to extract the best points of all of the APIs and leave the cruft (such as in-band authentication) behind. Sam
-- Nothing is ever easy.

Andre Merzky <andre@merzky.net> writes:
So, what this discussion basically boils down to is, that this group should do two steps:
A) define the nouns and verbs for the API, and nail down semantics for them
B) do different bindings for the result of (A)
Looks sensible (to me), and there seem to be enough people around who can check the process in (A) for implementability in the various options of (B).
I certainly think this is a good plan. I split my original posting into these two halves for exactly this reason! Cheers, Chris.

On Thu, Apr 16, 2009 at 9:41 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
Andre Merzky <andre@merzky.net> writes:
So, what this discussion basically boils down to is, that this group should do two steps:
A) define the nouns and verbs for the API, and nail down semantics for them
B) do different bindings for the result of (A)
Looks sensible (to me), and there seem to be enough people around who can check the process in (A) for implementability in the various options of (B).
I certainly think this is a good plan. I split my original posting into these two halves for exactly this reason!
Nouns and verbs would be very helpful for me, for sure. Use cases (even one liners) would be great too... e.g. "Disaster recovery where an off-site virtual replica of physical infrastructure is regularly taken for rapid failover" or "Large scale public cloud service with multi-tenancy for an arbitrary number of users and workloads". I'm guessing that we haven't collected so many of these because the template is [too?] complex. Sam

Definitely agree with this approach. For (A), I think it's actually nouns (e.g. servers), verbs (e.g. start) and noun attributes (e.g. memory). All of these can be consistent between different bindings. For (B), as per my many posts I'm very keen that the text and JSON versions are in extremely simple format - just a flat list of key-value attributes for each noun. I'm much less concerned about the XML syntax. Richard. Andre Merzky wrote:
So, what this discussion basically boils down to is, that this group should do two steps:
A) define the nouns and verbs for the API, and nail down semantics for them
B) do different bindings for the result of (A)
Looks sensible (to me), and there seem to be enough people around who can check the process in (A) for implementability in the various options of (B).
Is that what it is going to be?
Andre
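The split into (A) and (B) can be sketched in a few lines of Python: one definition of a noun with its attributes and verbs, and interchangeable renderers for each binding. The class, field names and sample values are all hypothetical; this is only meant to show that the bindings need share nothing but the data:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Noun:
    """Part (A): a noun with its attributes and the verbs it supports."""
    kind: str
    attributes: dict = field(default_factory=dict)
    verbs: tuple = ()

def render_text(noun):
    """Part (B), plain text binding: flat 'key = value' lines."""
    lines = [f"category = {noun.kind}"]
    lines += [f"{key} = {value}" for key, value in sorted(noun.attributes.items())]
    return "\n".join(lines)

def render_json(noun):
    """Part (B), JSON binding: the same flat key-value list as one object."""
    return json.dumps({"category": noun.kind, **noun.attributes}, sort_keys=True)

BINDINGS = {"text": render_text, "json": render_json}

# Hypothetical server; attribute names follow the shortened keys discussed above.
server = Noun("server", {"cores": 2, "mem": 4294967296}, verbs=("start", "stop"))
```

Adding an XML or Atom binding would mean adding one more entry to `BINDINGS`, with no change to the noun definitions.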

Wow! :-) I'm unfortunately late to all these discussions, so I'm in catch-up mode... With regard to the segmented approach, it's very wise. However, there's a binding aspect to the approach in general that is missing, certainly for part (A). That is (and yes, I'm surprised myself to say this) there's a model missing which unifies all these concepts. Now I'm not advocating MDA or anything elaborate in that vein, but I would advocate something that could be used, by those wanting to, as a means to minimise confusion, communicate effectively, isolate possible architectural & technical dependencies, hell, even validate requests (they're just a model containing verbs, nouns and attributes ;-)). By having such a model, the issue of part (B) then somewhat fades to a non-issue - part (B) is then only concerned with various renderings of the model, be it in JSON, XML etc. I believe that if we can begin to express the {noun{verb{attribute}}} vector for each valid entity that should be exposed on an external interface of an infrastructure provider, then many of the technical discussions will be all but settled by reference to the model. It is this {noun{verb{attribute}}} vector that can be easily mapped to other modelling methodologies - imagine it in OO terms or relational modelling terms etc. Not only that, but a model can be accessed/exposed using various "architectural styles", for example REST, RPC, message-oriented. By choosing such an approach, the communication of our collective ideas will also be efficient and accurate. I am, however, wary of such an effort ending up reinventing something like OVF or any of the VMAN efforts being carried out in DMTF. As such, having a reference model that describes our "concerns" ({noun{verb{attribute}}}) will allow us to see the wood for the trees and hopefully beyond the forest, giving us an edge and, with fingers crossed, complementing existing standards.
Not wanting to divert the original conversation on the core entities within the OCCI, I've renamed (forked) the subject of the mail.
Regards,
Andy
PS: I'm digging the energy here in the group!
Andy Edmonds skype: andy.edmonds tweets: @dizz tel: +353 (0)1 6069232
IT Research - IT Innovation Centre - Intel Ireland Ltd.
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
-----Original Message-----
From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Richard Davies
Sent: 16 April 2009 20:50
To: occi-wg@ogf.org
Subject: Re: [occi-wg] Syntax of OCCI API
Definitely agree with this approach. For (A), I think it's actually nouns (e.g. servers), verbs (e.g. start) and noun attributes (e.g. memory). All of these can be consistent between different bindings. For (B), as per my many posts I'm very keen that the text and JSON versions are in extremely simple format - just a flat list of key-value attributes for each noun. I'm much less concerned about the XML syntax.
Richard.
Andre Merzky wrote:
So, what this discussion basically boils down to is, that this group should do two steps:
A) define the nouns and verbs for the API, and nail down semantics for them
B) do different bindings for the result of (A)
Looks sensible (to me), and there seem to be enough people around who can check the process in (A) for implementability in the various options of (B).
Is that what it is going to be?
Andre
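One way to picture the {noun{verb{attribute}}} vector is as a nested mapping that requests can be checked against. A Python sketch, in which every noun, verb and attribute listed is an assumption chosen for illustration rather than an agreed part of the model:

```python
# {noun: {verb: set of attribute names that verb accepts}}; all entries assumed.
MODEL = {
    "server": {
        "create": {"cores", "cpu", "mem"},
        "start": set(),
        "stop": set(),
    },
    "storage": {
        "create": {"size"},
    },
}

def validate(noun, verb, attributes):
    """Return a list of problems with a request; an empty list means it is valid."""
    if noun not in MODEL:
        return [f"unknown noun: {noun}"]
    if verb not in MODEL[noun]:
        return [f"noun {noun!r} has no verb {verb!r}"]
    allowed = MODEL[noun][verb]
    return [f"unexpected attribute: {name}"
            for name in sorted(attributes) if name not in allowed]
```

Because the model is plain data, the same structure could be rendered as documentation or mapped onto classes or tables without touching the validation logic.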

On reflection, I think I'll take my comments on the network objects a little further - I don't think that a separate object is needed here at all, and believe that the configuration should be folded into the server (as network.eth0.vlan, network.eth0.dhcp, etc.). My logic is that API objects should only exist where something exists with state which is persistent and independent of other objects. server, drives, and ownership of resources such as static IPs or VLANs all fulfill this. Network configuration does not - essentially this is just configuration of a network interface on a server. As such, it's much simpler to fold the configuration into the server object itself, rather than splitting it out into a separate "configuration object" and linking to it from the main "server object". Richard Davies wrote, commenting on the network section:
[45a73b80-c957-4ae1-97c6-b70652eba1d1]
Again, maybe a 'name'?
category = network content.vlan = 4095 content.dhcp = true content.subnet = 192.168.0.0 content.netmask = 255.255.0.0 content.gateway = 192.168.0.1
Once again, I'd take the 'content' prefix off all of these.
The keys you list here work when the network interface is on a private VLAN, but are the wrong set when it is on the public internet.
On the public internet, the cloud vendor, not the user, defines most of these parameters and needs to be able to prevent the customer VM from "stealing" IPs from other customers.
The customer has access to a defined set of static IPs which they have purchased or alternatively a free dynamic IP assigned at boot, and all they should be able to specify is which of these they want on this particular interface, and whether they want to receive a DHCP for it.
For instance, ElasticHosts currently specifies as:
ip = <specified static IP address or 'auto' to assign dynamically at boot> dhcp = <ip address to send by dhcp or 'auto'; no dhcp if not present>
Given that the customer will have a set of static IPs which they have purchased (common concept across Amazon, ElasticHosts, GoGrid, etc.), the API also needs an ability for them to list what these are!
av.com.cisco.cdp = true
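Folding the network configuration into the server, as proposed at the top of this message, amounts to replacing each link with the linked object's attributes under the interface's key. A Python sketch of that rewrite; the function name is hypothetical and the sample values come from the example under discussion:

```python
def fold_networks(server, networks):
    """Inline each linked network object's settings under 'network.<dev>.'."""
    folded = {}
    for key, value in server.items():
        if key.startswith("network.") and value in networks:
            # Replace the link with the linked object's own attributes.
            for name, setting in networks[value].items():
                folded[f"{key}.{name}"] = setting
        else:
            folded[key] = value
    return folded

networks = {"45a73b80-c957-4ae1-97c6-b70652eba1d1": {"vlan": "4095", "dhcp": "true"}}
server = {"cores": "2", "network.eth0": "45a73b80-c957-4ae1-97c6-b70652eba1d1"}
folded = fold_networks(server, networks)
```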

On Thu, Apr 16, 2009 at 8:29 PM, Richard Davies <richard.davies@elastichosts.com> wrote:
On reflection, I think I'll take my comments on the network objects a little further - I don't think that a separate object is needed here at all, and believe that the configuration should be folded into the server (as network.eth0.vlan, network.eth0.dhcp, etc.).
My logic is that API objects should only exist where something exists with state which is persistent and independent of other objects.
server, drives, and ownership of resources such as static IPs or VLANs all fulfill this.
Network configuration does not - essentially this is just configuration of a network interface on a server. As such, it's much simpler to fold the configuration into the server object itself, rather than splitting it out into a separate "configuration object" and linking to it from the main "server object".
Yup, been here too (I've been working on this problem for a while before getting sucked into this). For a public cloud like EH things are pretty simple - you just give an IP to a machine and away you go... they can talk to each other and if you must then you can pop a firewall/load balancer in front. For anything more complex though you do need to keep track of your [virtual] networks separately, even if only because the client needs to be able to enumerate them in order to give the users something to strap their VMs to and (on the system management side) to link to a physical segment. I'm not sure what really needs to be configured here beyond a label/description, but it's conceivable that a network management extension would be interesting for vendors like Cisco. Sam

Sam Johnston <samj@samj.net> writes:
For anything more complex though you do need to keep track of your [virtual] networks separately, even if only because the client needs to be able to enumerate them in order to give the users something to strap their VMs to and (on the system management side) to link to a physical segment.
You need to track virtual networks as first class objects (what Richard and I tend to call VLANs), not network interfaces which attach to them. Cheers, Chris.

On Thu, Apr 16, 2009 at 9:33 PM, Chris Webb <chris.webb@elastichosts.com> wrote:
Sam Johnston <samj@samj.net> writes:
For anything more complex though you do need to keep track of your [virtual] networks separately, even if only because the client needs to be able to enumerate them in order to give the users something to strap their VMs to and (on the system management side) to link to a physical segment.
You need to track virtual networks as first class objects (what Richard and I tend to call VLANs), not network interfaces which attach to them.
Agreed, but Richard was just saying "*I don't think that a separate object is needed here at all*". I can see where that point of view comes from for public clouds and maybe we can cater for both views by just assuming network interfaces mean "Internet" unless otherwise specified. I've been steering clear of the term VLAN because it means something to network engineers (in the 802.1q tagging sense) - "virtual network" works for me. We discussed having tags like "internet" and "vlan-4095" which meant something in terms of termination... that translates nicely to categories and I'm starting to think a category for VLAN 1..4095 is justified (thus giving network engineers a well defined demarcation point). Sam

Sam Johnston <samj@samj.net> writes:
Agreed, but Richard was just saying "*I don't think that a separate object is needed here at all*". I can see where that point of view comes from for public clouds and maybe we can cater for both views by just assuming network interfaces mean "Internet" unless otherwise specified.
Ah, you see I have the advantage here in knowing what he meant because I'm having the discussion with him directly! You'll notice he explicitly mentioned VLAN ownership as a resource which exists independent of a running guest, whereas a network configuration [== interface] does not. We are well aware of the need for virtual networking as well as internet facing interfaces. Customers ask for that quite frequently, as well as interconnect between our 'virtual VLANs' and Cisco-style 802.1q VLANs that can be brought out on a switch port to their physical machines.
I've been steering clear of the term VLAN because it means something to network engineers (in the 802.1q tagging sense) - "virtual network" works for me.
Yes, it's probably a less overloaded term. Cheers, Chris.

How would I create a "virtual lan network" called FOO and have the VMs I create connect to it? I want to do this in certain cases where I have multi-homed VMs in which one of the NICs is connected to the general network and the other NICs are used to route traffic amongst themselves. Chuck Wegrzyn Twisted Storage Sam Johnston wrote:
On Thu, Apr 16, 2009 at 8:29 PM, Richard Davies <richard.davies@elastichosts.com> wrote:
On reflection, I think I'll take my comments on the network objects a little further - I don't think that a separate object is needed here at all, and believe that the configuration should be folded into the server (as network.eth0.vlan, network.eth0.dhcp, etc.).
My logic is that API objects should only exist where something exists with state which is persistent and independent of other objects.
server, drives, and ownership of resources such as static IPs or VLANs all fulfill this.
Network configuration does not - essentially this is just configuration of a network interface on a server. As such, it's much simpler to fold the configuration into the server object itself, rather than splitting it out into a separate "configuration object" and linking to it from the main "server object".
Yup, been here too (I've been working on this problem for a while before getting sucked into this).
For a public cloud like EH things are pretty simple - you just give an IP to a machine and away you go... they can talk to each other and if you must then you can pop a firewall/load balancer in front.
For anything more complex though you do need to keep track of your [virtual] networks separately, even if only because the client needs to be able to enumerate them in order to give the users something to strap their VMs to and (on the system management side) to link to a physical segment.
I'm not sure what really needs to be configured here beyond a label/description, but it's conceivable that a network management extension would be interesting for vendors like Cisco.
Sam

eprparadocs@gmail.com writes:
How would I create a "virtual lan network" called FOO and have the VMs I create connect to it? I want to do this in certain cases where I have multi-homed VMs in which one of the NICs is connected to the general network and the other NICs are used to route traffic amongst themselves.
As an example of how this might work, in our (EH) API you do /resources/vlan/create to create a VLAN (virtual network) and get its identifier, and then define your network card as attached to it. For example, for a machine with a public facing interface with dynamic IP and a VLAN attached interface, nic:0:dhcp auto nic:1:vlan VLAN-ID in the guest config. Cheers, Chris.
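For anyone generating such guest configs from a script, the two NIC lines above can be produced from a simple list. A Python sketch that only builds the text; it does not call the ElasticHosts API, the helper name is made up, and 'VLAN-ID' remains a placeholder for whatever identifier the create call returns:

```python
def nic_config(nics):
    """Render one 'nic:<index>:<key> <value>' line per NIC setting."""
    lines = []
    for index, settings in enumerate(nics):
        for key, value in settings.items():
            lines.append(f"nic:{index}:{key} {value}")
    return "\n".join(lines)

# Public interface with a dynamic IP, plus an interface on the new virtual network.
config = nic_config([{"dhcp": "auto"}, {"vlan": "VLAN-ID"}])
```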

That works nicely. Peace, Chuck Chris Webb wrote:
eprparadocs@gmail.com writes:
How would I create a "virtual lan network" called FOO and have the VMs I create connect to it? I want to do this in certain cases where I have multi-homed VMs in which one of the NICs is connected to the general network and the other NICs are used to route traffic amongst themselves.
As an example of how this might work, in our (EH) API you do
/resources/vlan/create
to create a VLAN (virtual network) and get its identifier, and then define your network card as attached to it. For example, for a machine with a public facing interface with dynamic IP and a VLAN attached interface,
nic:0:dhcp auto nic:1:vlan VLAN-ID
in the guest config.
Cheers,
Chris.

Lightweight virtual network (as in virtual hub/switch) creation will be an important requirement for anything but the most basic of cloud [infrastructure] architectures... Sam On Thu, Apr 16, 2009 at 10:02 PM, <eprparadocs@gmail.com> wrote:
That works nicely.
Peace, Chuck
Chris Webb wrote:
eprparadocs@gmail.com writes:
How would I create a "virtual lan network" called FOO and have the VMs I create connect to it? I want to do this in certain cases where I have multi-homed VMs in which one of the NICs is connected to the general network and the other NICs are used to route traffic amongst themselves.
As an example of how this might work, in our (EH) API you do
/resources/vlan/create
to create a VLAN (virtual network) and get its identifier, and then define your network card as attached to it. For example, for a machine with a public facing interface with dynamic IP and a VLAN attached interface,
nic:0:dhcp auto nic:1:vlan VLAN-ID
in the guest config.
Cheers,
Chris.

We should start with it simpler, but eventually it will matter and network objects will be required. --Randy On 4/16/09 11:29 AM, "Richard Davies" <richard.davies@elastichosts.com> wrote:
Network configuration does not - essentially this is just configuration of a network interface on a server. As such, it's much simpler to fold the configuration into the server object itself, rather than splitting it out into a separate "configuration object" and linking to it from the main "server object".
-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

I apologize if this is already in some of Sam's work. I haven't had a chance to dig into the latest in detail. I just want to bring up one of my hot buttons, which is vendor specific extensions. As Richard/Chris brought up it's good to start with something simple that covers all vendors, but that certainly won't be enough to cover all current use cases. And certain functionality only exists in certain clouds. For example, I'm pretty sure we're still the only cloud with load balancers and billing data provided via the API. So, I'm keen to make sure that we start with a clean common subset of functionality with the minimal set of nouns+verbs AND have a way to extend the functionality on a per-vendor basis such that the baseline API is not broken. I will give some more ideas on specifics of how we could do this after I have a chance to look a bit more at the latest proposals from Sam. Best, --Randy -- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

Quoting [Randy Bias] (Apr 17 2009):
For example, I'm pretty sure we're still the only cloud with load balancers and billing data provided via the API.
So, I'm keen to make sure that we start with a clean common subset of functionality with the minimal set of nouns+verbs AND have a way to extend the functionality on a per-vendor basis such that the baseline API is not broken.
FWIW, extensions on per-functionality basis would be more useful than on per-vendor basis. But that may well be wishful thinking... Andre. -- Nothing is ever easy.

Andre, That would be great, but I can envision vendors implementing things in a different way initially and as one version 'wins' we can roll that into the baseline as the API is iterated over time. --Randy On 4/16/09 8:18 PM, "Andre Merzky" <andre@merzky.net> wrote:
Quoting [Randy Bias] (Apr 17 2009):
For example, I'm pretty sure we're still the only cloud with load balancers and billing data provided via the API.
So, I'm keen to make sure that we start with a clean common subset of functionality with the minimal set of nouns+verbs AND have a way to extend the functionality on a per-vendor basis such that the baseline API is not broken.
FWIW, extensions on per-functionality basis would be more useful than on per-vendor basis. But that may well be wishful thinking...
Andre.
-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

On Fri, Apr 17, 2009 at 5:48 AM, Randy Bias <randyb@gogrid.com> wrote:
That would be great, but I can envision vendors implementing things in a different way initially and as one version 'wins' we can roll that into the baseline as the API is iterated over time.
We're going to have to evolve fairly quickly and on an ongoing basis I would say. You've probably already seen that the core is absolutely minimalist, telling you how to authenticate and interact with the endpoint. Everything, including what you would think is basic functionality (search, machine control) is implemented as an extension, and the use of namespaces throughout (something Richard@EH isn't particularly convinced about) ensures that these extensions don't have to worry about standing on each other's toes. Not only does this mean that vendors can write their own extensions (and extend existing ones with attributes, actuators, etc.), but when we come to adding more functionality above/below the infrastructure layer we'll just be talking about writing extension(s). An example of where this might be useful is $networking_vendor wanting to manage the underlying fabric. Rather than having them roll something completely proprietary and have two points of contact with the infrastructure they can extend OCCI into areas that we aren't particularly interested in covering (just yet). Given that working groups tend to have "start" and "end" points we'll have to look at setting up something like an attribute registry with e.g. IANA (see the Atom link relations <http://www.iana.org/assignments/link-relations/link-relations.xhtml> for example). This is a great way to play nice with other standards efforts too. Sam
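The role namespaces play here can be illustrated with a toy registry: each extension claims a prefix, and any attribute key can then be traced back to its owner. This Python sketch is an assumption about how such a registry might behave, not a description of the draft; the prefixes are taken from the examples earlier in the thread and the owner labels are invented:

```python
class ExtensionRegistry:
    """Toy attribute registry: one owner per namespace prefix."""

    def __init__(self):
        self._owners = {}

    def register(self, namespace, owner):
        """Claim a namespace; a second owner for the same prefix is refused."""
        current = self._owners.get(namespace)
        if current is not None and current != owner:
            raise ValueError(f"namespace {namespace!r} already belongs to {current}")
        self._owners[namespace] = owner

    def owner_of(self, key):
        """Find the owner of the longest registered prefix of a dotted key."""
        parts = key.split(".")
        for end in range(len(parts) - 1, 0, -1):
            namespace = ".".join(parts[:end])
            if namespace in self._owners:
                return self._owners[namespace]
        return None  # un-namespaced core attribute

registry = ExtensionRegistry()
registry.register("mc", "machine control extension")
registry.register("av.com.cisco", "vendor extension")
```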

Hi,
Not only does this mean that vendors can write their own extensions (and extend existing ones with attributes, actuators, etc.), but when we come to adding more functionality above/below the infrastructure layer we'll just be talking about writing extension(s).
I definitely like this approach here. Is this currently clearly stated in the wiki? I know there is something about extensions... -Thijs

On Fri, Apr 17, 2009 at 3:08 AM, Randy Bias <randyb@gogrid.com> wrote:
We should start with it simpler, but eventually it will matter and network objects will be required.
In the most basic case we still need enumeration (short of reverse engineering the information by retrieving all objects and doing something like a 'select distinct' over them). I'm thinking the network resources (and there shouldn't be many of them, or at least not many shared between all users) should start with little more than a name/description for UI purposes. Public cloud installations may just have a single shared "Internet" network resource and that might be something we want to allocate a well known UUID and/or alias to. Sam
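The difference between reverse engineering the list and enumerating real network resources shows up as soon as a network has nothing attached to it. A small Python sketch with invented data (the aliases 'internet', 'backend' and 'spare' are placeholders, not proposed well-known names):

```python
def networks_in_use(servers):
    """The 'select distinct' approach: collect every network a server links to."""
    return sorted({value
                   for server in servers
                   for key, value in server.items()
                   if key.startswith("network.")})

# Hypothetical servers and first-class network resources.
servers = [
    {"title": "web", "network.eth0": "internet"},
    {"title": "db", "network.eth0": "internet", "network.eth1": "backend"},
]
network_resources = {
    "internet": {"title": "Internet"},
    "backend": {"title": "Backend network"},
    "spare": {"title": "Unused network"},
}

# Networks that only an explicit enumeration can reveal.
missing = set(network_resources) - set(networks_in_use(servers))
```

In this example `missing` contains only the unattached network, which is the case a client needs when offering users something to attach a new VM to.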
On 4/16/09 11:29 AM, "Richard Davies" <richard.davies@elastichosts.com> wrote:
Network configuration does not - essentially this is just configuration of a network interface on a server. As such, it's much simpler to fold the configuration into the server object itself, rather than splitting it out into a separate "configuration object" and linking to it from the main "server object".
-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

Richard,
Just when you thought you had enough mail from me already it seems I missed one...
On Thu, Apr 16, 2009 at 6:05 PM, Richard Davies <richard.davies@elastichosts.com> wrote:
Sam Johnston wrote:
Here's a first pass at flattening the Atom into INI file format (basically what you had but with "=" for human & computer readability):
Great stuff - I think this is a big step forward to be able to express everything as a simple list of objects, each specified by simple key-value pairs. Hopefully we can also similarly add a JSON version using the same simple data structures, e.g.:
{"category":"server", "title":"Debian...", "mc.state":"running", ... }
JSON/YAML's on my todo list for this morning.
I've got two specific comments on the example you give:
1) I'm not sure INI format is actually the best text format for key-value. I'd prefer something easier to parse from Unix shell, which is where I imagine most simple scripts will be written. ElasticHosts went with
"key" (without spaces), <space>, "value" (any characters including spaces)
since this can be parsed with
cat file | while read key value ; do ... ; done
I've found the tinydns-data <http://cr.yp.to/djbdns/tinydns-data.html> format a pleasure to work with as well, but in any case INI files are simple, standard across platforms, well defined, etc. You can parse them in shell like this <http://www.debian-administration.org/articles/55#comment_24>:
#!/bin/sh
[ -z "$1" ] || [ -z "$2" ] && exit 1
sed -e 's/[[:space:]]*\=[[:space:]]*/=/g' \
    -e 's/;.*$//' \
    -e 's/[[:space:]]*$//' \
    -e 's/^[[:space:]]*//' \
    -e "s/^\(.*\)=\([^\"']*\)$/\1=\"\2\"/" \
    < $1 \
    | sed -n -e "/^\[$2\]/,/^\s*\[/{/^[^;].*\=.*/p;}"
For Python you have ConfigParser <http://docs.python.org/library/configparser.html>, PHP has parse_ini_file <http://fr3.php.net/parse_ini_file>, Perl (as per usual) has a dozen or so options <http://win32.perl.org/wiki/index.php?title=INI-file_Modules>, then there's libini <http://sourceforge.net/projects/libini/> and its ilk.
We need a way to group lines together:
- INI style headers (e.g. [decca5a5-8952-4004-9793-cdbbf05c3c63])
- ID prefixes (e.g. decca5a5-8952-4004-9793-cdbbf05c3c63.content.cpu.cores = 2)
- Blank line separators, with ID specified as an attribute (e.g. id = decca5a5-8952-4004-9793-cdbbf05c3c63)
Except in the case where you retrieve a single object this is always going to add parsing complexity... but perhaps it's worth it just for the (common) case of dealing with a single object.
2) Going through the keys and values in detail:
[decca5a5-8952-4004-9793-cdbbf05c3c63]
I like UUIDs and ElasticHosts also uses them, but I might loosen the requirement to any unique string of hex and dashes (since other vendors may prefer to number sequentially, etc.)
There's that "enough rope" problem again, and the alias option discussed elsewhere. Another (significant) bonus is that they allow you to migrate resources, collections or even merge entire clouds without re-mapping, breaking any object references, etc. There really is huge value here.
category = server
title = Debian GNU/Linux 5.0 Virtual Appliance
summary = Base installation of Debian GNU/Linux 5.0
Do we need both a title ('name' with ElasticHosts at present) and a summary or can we just have one of these?
Most collections tend to have an official title and an additional (optional) explanation. If you don't use it then that's fine too (actually the title/summary terminology comes from Atom).
content.cpu = 2
content.memory = 4Gb
We need to agree units here! Presumably memory would be specified in 'GB' or alternatively 'MB', 'kB' or nothing. Is CPU the speed quota or the number of virtual cores? I recommend cores=<integer> and an additional key for speed quota (ElasticHosts uses cpu=<total MHz to divide across all cores>)
Sure, or we just say everything's in bytes/megahertz/etc. and worry about how to render it in the UI (where it arguably belongs). Internally I'd say we should deal with raw numbers (that's how it will be represented in databases anyway) and do the mapping as close as possible to the surface. Defining units is (probably) acceptable though... assuming there's not a standard for this we can refer to (surely there is somewhere).
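To make the "raw numbers internally, units at the surface" idea concrete, here is a minimal Python sketch that normalises a size with an optional unit suffix to bytes. The function name, the accepted suffixes and the binary multipliers (1 kB = 1024 bytes) are all assumptions for illustration; the thread has not agreed on any of them.

```python
import re

# Assumed multipliers (binary, 1 kB = 1024 bytes); the thread leaves this open.
BYTE_UNITS = {"": 1, "kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(text):
    """Convert a value such as '4GB' or '148251374' to a raw byte count."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]*)", text.strip())
    if match is None or match.group(2) not in BYTE_UNITS:
        raise ValueError("unrecognised size: %r" % text)
    return int(match.group(1)) * BYTE_UNITS[match.group(2)]


print(to_bytes("4GB"))
print(to_bytes("148251374"))
```

Note that under these assumed rules the spelling "4Gb" from the example is rejected, which is exactly the kind of ambiguity that agreeing units up front would avoid.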
Can we cut the namespace and just write:
cores = 2
cpu = 4000MHz
mem = 4GB
Dispensing with ambiguous terminology is a good idea, but the namespaces are actually quite important for e.g. extensibility.
link.disk[0].id = 4696b561-a253-42b4-bd27-7aa4950e0a60
link.disk[0].dev = sda
link.network[0].id = 45a73b80-c957-4ae1-97c6-b70652eba1d1
link.network[0].dev = eth0
This is good - a mapping between hardware devices and uuids of the storage or network objects.
We don't need the [0] indices, since the 'dev' specifiers are already fully unique. Taking those out and cutting the namespace gives something like:
disk.sda = 4696b561-a253-42b4-bd27-7aa4950e0a60
network.eth0 = 45a73b80-c957-4ae1-97c6-b70652eba1d1
Good point and nice optimisation, but what if we want to capture other information like "starting state = disconnected" etc?
mc.state = RUNNING
br.meter.rate = 0.10
br.meter.currency = USD
br.meter.unit = hours
br.meter.total = 35.27
pm.monitor.cpu = 75.2
pm.monitor.mem = 1059374258
All look reasonable, but again I would cut the namespaces:
state = RUNNING
br.rate = 0.10
br.currency = USD
br.unit = hours
br.total = 35.27
pm.cpu = 75.2
pm.mem = 1059374258
Sure, namespaces within extensions can be safely dropped. Top level namespaces less so.
mc.ops.start = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
mc.ops.stop = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop
mc.ops.restart = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart
mc.ops.suspend = http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/suspend
Do we need these at all? Surely these will always be the operations which are possible on a RUNNING server, and so can always be constructed based on the UUID.
HATEOAS <http://www.stucharlton.com/blog/archives/000141.html> is a carry over from the Sun Cloud API (as explained by Sun here<http://blogs.sun.com/craigmcc/entry/why_hateoas>). I like it because from the single entry point you can obtain every URL you should ever need to use, and those that you can't you don't even see (e.g. because you can't "start" an abstract template, or simply because as a disaster recovery operator you're only allowed to start but not stop machines). If you don't like these you can always ignore them, but your users will probably get bored of receiving errors when they try to conduct invalid operations.
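A small Python sketch of how a client might use the embedded links: it only offers the operations the server actually advertised, rather than constructing URLs from the UUID. The `mc.ops.` prefix is taken from the example in this thread; the helper name and the sample resource (a running server that is not offered "start") are assumptions.

```python
def available_ops(resource):
    """Return {operation: url} for every 'mc.ops.*' key in a parsed resource."""
    prefix = "mc.ops."
    return {
        key[len(prefix):]: url
        for key, url in resource.items()
        if key.startswith(prefix)
    }


# Hypothetical parsed resource: the provider advertises stop and restart only.
server = {
    "mc.state": "RUNNING",
    "mc.ops.stop": "http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop",
    "mc.ops.restart": "http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart",
}

ops = available_ops(server)
if "start" not in ops:
    print("start is not offered for this resource, so the client never attempts it")
```

A client written this way needs no knowledge of the state machine or the caller's permissions; it follows whatever links are present.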
Also, why have 'ops' in the URLs? Why not just
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/start
Interesting question. This was another carry over from Sun but a better approach is to leave it to the extension:
http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/mc/start
The question is, are you starting the machine? Its firewall? Billing? Backup? Failover? Disaster recovery?
[4696b561-a253-42b4-bd27-7aa4950e0a60]
I guess storage needs a 'title' (or 'name') too?
You're probably right... these are common for all resources.
category = storage
content.size = 148251374
Why not just 'size'?
The "content" namespace is from Atom... it serves to bundle the "payload" of the resource together without interfering with other elements of it. OVF could well have a "title" for example, and what if your attribute clashes with the name of an extension? Let's try to keep the core nice and clean.
link.self = virtual-disk.vmdk
Not sure what this is?
It's a link to itself (e.g. a storage resource pointing at its VMDK). I'd suggest a pass over Atom (RFC 4287 <http://tools.ietf.org/html/rfc4287>) to see how links work (and how flexible they are).
[45a73b80-c957-4ae1-97c6-b70652eba1d1]
Again, maybe a 'name'?
No problem.
category = network
content.vlan = 4095
content.dhcp = true
content.subnet = 192.168.0.0
content.netmask = 255.255.0.0
content.gateway = 192.168.0.1
Once again, I'd take the 'content' prefix off all of these.
See above... we need to work out how/if this can be done safely (and whether it's worth doing).
The keys you list here work when the network interface is on a private VLAN, but are the wrong set when it is on the public internet.
It's just an example, but I do wonder how much detail we're going to want to get into here. We should probably support arbitrary attributes for whatever cruft the network guys want to carry (e.g. frame sizes, etc.) but treat it as opaque for now.
On the public internet, the cloud vendor, not the user, defines most of these parameters and needs to be able to prevent the customer VM from "stealing" IPs from other customers.
The customer has access to a defined set of static IPs which they have purchased or alternatively a free dynamic IP assigned at boot, and all they should be able to specify is which of these they want on this particular interface, and whether they want to receive it via DHCP.
For instance, ElasticHosts currently specifies this as:
ip = <specified static IP address or 'auto' to assign dynamically at boot>
dhcp = <ip address to send by dhcp or 'auto'; no dhcp if not present>
Given that the customer will have a set of static IPs which they have purchased (common concept across Amazon, ElasticHosts, GoGrid, etc.), the API also needs an ability for them to list what these are!
I would suggest that these be advertised in the "network" resources so the customer can choose one that's already allocated (assuming they don't just rely on DHCP for this).

Another interesting use case incidentally is that of machines doing introspection - a machine (authenticated by IP?) should be able to hit OCCI for information about itself (such as its name? IP address? SSH keys? application configuration?). Even basic attribute-value pairs being settable via management interfaces would be incredibly powerful (and we get this for free already).

Sam

Sam Johnston <samj@samj.net> writes:
but in any case INI files are simple, standard across platforms, well defined, etc.
You can parse them in shell like this<http://www.debian-administration.org/articles/55#comment_24> :
#!/bin/sh
[ -z "$1" ] || [ -z "$2" ] && exit 1
sed -e 's/[[:space:]]*\=[[:space:]]*/=/g' \
    -e 's/;.*$//' \
    -e 's/[[:space:]]*$//' \
    -e 's/^[[:space:]]*//' \
    -e "s/^\(.*\)=\([^\"']*\)$/\1=\"\2\"/" \
    < $1 \
    | sed -n -e "/^\[$2\]/,/^\s*\[/{/^[^;].*\=.*/p;}"
You've completely missed the point again. (Perhaps understandably, since you're not in the target audience for this format.)

while read K V; do FOO; done

is something you can type in a shell one-liner, which is the most interesting 'use case' (sorry, horrible jargon!) for shell people. This complex hack is not, and I note that you had to google for it rather than writing it off the top of your head, which should have been enough to make the point clear. (Incidentally, it doesn't get INI quoting right. Doing it correctly and allowing for backslash escapes is somewhat harder.)

Whitespace separated KEY VALUE is better-defined than INI[1], can be parsed by shell read, by C strtok() or strsep(), and is generally pleasant to work with in languages without sophisticated string handling or data structures. Here's your sample translated to that format:

id decca5a5-8952-4004-9793-cdbbf05c3c63
category server
title Debian GNU/Linux 5.0 Virtual Appliance
summary Base installation of Debian GNU/Linux 5.0
content.cpu 2
content.memory 4Gb
link.disk[0].id 4696b561-a253-42b4-bd27-7aa4950e0a60
link.disk[0].dev sda
link.network[0].id 45a73b80-c957-4ae1-97c6-b70652eba1d1
link.network[0].dev eth0
mc.state RUNNING
br.meter.rate 0.10
br.meter.currency USD
br.meter.unit hours
br.meter.total 35.27
pm.monitor.cpu 75.2
pm.monitor.mem 1059374258
mc.ops.start http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/start
mc.ops.stop http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/stop
mc.ops.restart http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/restart
mc.ops.suspend http://example.com/decca5a5-8952-4004-9793-cdbbf05c3c63/ops/suspend

id 4696b561-a253-42b4-bd27-7aa4950e0a60
category storage
content.size 148251374
link.self virtual-disk.vmdk

id 45a73b80-c957-4ae1-97c6-b70652eba1d1
category network
content.vlan 4095
content.dhcp true
content.subnet 192.168.0.0
content.netmask 255.255.0.0
content.gateway 192.168.0.1
vnd.com.cisco.cdp true

I'll not comment on the key space or conflation of objects into one namespace with a category key in this post.

Cheers,
Chris.

[1] See http://en.wikipedia.org/wiki/Initialization_file for details on some of the variation seen in the wild. There's no formal spec to disambiguate.
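When several objects are returned at once, the whitespace-separated format needs a grouping rule. The Python sketch below assumes one KEY VALUE pair per line, a blank line between records, and an `id` line in every record; those layout rules and the helper name `parse_records` are assumptions for illustration, since the thread has not fixed them.

```python
def parse_records(text):
    """Parse blank-line-separated 'key value' records into {id: {key: value}}."""
    resources = {}
    current = {}
    # The appended "" acts as a final blank line so the last record is flushed.
    for line in text.splitlines() + [""]:
        line = line.strip()
        if line:
            parts = line.split(None, 1)
            current[parts[0]] = parts[1] if len(parts) > 1 else ""
        elif current:
            # Assumes every record carries an 'id' line; a KeyError signals bad input.
            resources[current.pop("id")] = current
            current = {}
    return resources


sample = """\
id decca5a5-8952-4004-9793-cdbbf05c3c63
category server
title Debian GNU/Linux 5.0 Virtual Appliance

id 4696b561-a253-42b4-bd27-7aa4950e0a60
category storage
content.size 148251374
"""

resources = parse_records(sample)
print(resources["decca5a5-8952-4004-9793-cdbbf05c3c63"]["title"])
```

The parser has to carry state between lines (the current record), which is the extra complexity that multi-object responses introduce compared with the single-object case.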

Once we get the noun/verb/attribute part settled, there is no harm in doing an ini and a key/val binding. In fact, a translator would be trivial...

You can argue endlessly about the better format: there are too many PROs and CONs for both of them to come to a conclusive answer, IMHO.

My $0.02, Andre.

Quoting [Chris Webb] (Apr 17 2009):
<snip>
_______________________________________________
occi-wg mailing list
occi-wg@ogf.org
http://www.ogf.org/mailman/listinfo/occi-wg
-- Nothing is ever easy.
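As a rough check on the claim that a translator between the two bindings would be trivial, here is a Python sketch that turns an INI rendering (section header = resource ID) into the whitespace-separated KEY VALUE rendering with blank-line separators. It uses the standard-library `configparser` module (the Python 3 name of ConfigParser); the function name and the output layout are assumptions, and INI quoting or escaping is not handled.

```python
import configparser


def ini_to_keyvalue(ini_text):
    """Translate INI sections into blank-line-separated 'key value' records."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys exactly as written instead of lower-casing
    parser.read_string(ini_text)
    blocks = []
    for section in parser.sections():
        lines = ["id " + section]  # the section header becomes an explicit id line
        lines += ["%s %s" % (key, value) for key, value in parser.items(section)]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


ini_sample = """\
[decca5a5-8952-4004-9793-cdbbf05c3c63]
category = server
title = Debian GNU/Linux 5.0 Virtual Appliance

[4696b561-a253-42b4-bd27-7aa4950e0a60]
category = storage
content.size = 148251374
"""

print(ini_to_keyvalue(ini_sample), end="")
```

The translation is mechanical because both renderings carry the same list of objects with flat key-value pairs; only the grouping convention differs.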

+1 Is there any way to put the FORMAT discussion to one side for now? I really liked Chris' example, and can say that we've found similar approaches very powerful in our work with RabbitMQ. But as Andre says, the issue is secondary to noun/verb. Right? On Fri, Apr 17, 2009 at 10:02 AM, Andre Merzky <andre@merzky.net> wrote:
<snip>

+1

-----Original Message-----
From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Alexis Richardson
Sent: 17 April 2009 10:08
To: Andre Merzky
Cc: occi-wg@ogf.org
Subject: Re: [occi-wg] Syntax of OCCI API

<snip>

-------------------------------------------------------------
Intel Ireland Limited (Branch) Collinstown Industrial Park, Leixlip, County Kildare, Ireland Registered Number: E902934

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.

Exactly my point in my last mail, Andre. If we can even begin to suggest the nouns that'll be a big help. In the wiki the central entity is a "Resource" (in fact we could name this "Noun" to focus discussion - thoughts, Sam?). That "Resource"/"Noun" can be abstractly sub-classed as virtual or physical, and beneath that the concrete entities could be "Server", "VM" and, in the case of extension (re: Randy), "Loadbalancer".

Andy

-----Original Message-----
From: occi-wg-bounces@ogf.org [mailto:occi-wg-bounces@ogf.org] On Behalf Of Andre Merzky
Sent: 17 April 2009 10:03
To: Chris Webb
Cc: occi-wg@ogf.org
Subject: Re: [occi-wg] Syntax of OCCI API

<snip>

+1 We should concentrate on the semantics and get them clear, rather than on the concrete rendering.

-Alexander

On 17.04.2009 at 11:02, Andre Merzky wrote:
<snip>
-- Alexander Papaspyrou alexander.papaspyrou@tu-dortmund.de

On Fri, Apr 17, 2009 at 10:48 AM, Chris Webb <chris.webb@elastichosts.com> wrote:
Sam Johnston <samj@samj.net> writes:
You've completely missed the point again. (Perhaps understandably, since you're not in the target audience for this format.)
Don't be so sure - I started life as a sysadmin (like many of us here I guess) :)
while read K V; do FOO; done
Granted that's trivial, but it rapidly gets more interesting when you have multiple resources. As I said before though "*perhaps it's worth it just for the (common) case of dealing with a single object*". I take it you're saying that it is... YAML's another interesting format, and as of 1.2 it's a perfect superset of JSON <http://en.wikipedia.org/wiki/JSON#YAML> (that is, every JSON file is also a YAML file), but it also fails the "one-liner parseability test" miserably.
is something you can type in a shell one-liner, which is the most interesting 'use case' (sorry, horrible jargon!) for shell people. This complex hack is not, and I note that you had to google for it rather than writing it off the top of your head, which should have been enough to make the point clear. (Incidentally, it doesn't get INI quoting right. Doing it correctly and allowing for backslash escapes is somewhat harder.)
Whitespace separated KEY VALUE is better-defined than INI[1], can be parsed by shell read, by C strtok() or strsep(), and is generally pleasant to work with in languages without sophisticated string handling or data structures. Here's your sample translated to that format:
<snip>
Ok I've updated the wiki because I'm mostly sold on this. Things do get rather more complicated when you have to parse multiple resources though, which is my main sticking point now... I'm leaning towards the time/space tradeoff of including the ID in each row somehow (in which case parsing into a hash of hashes is trivial again).
I'll not comment on the key space or conflation of objects into one namespace with a category key in this post.
Ok but don't forget to give us some feedback about this...
[1] See http://en.wikipedia.org/wiki/Initialization_file for details on some of the variation seen in the wild. There's no formal spec to disambiguate.
Yeah I agree that INI files are not perfect, and for more advanced formatting there's JSON/YAML so there's no point complicating text/plain to the point where it's uninteresting for both audiences. Sam

On Fri, Apr 17, 2009 at 11:23 AM, Chris Webb <chris.webb@elastichosts.com> wrote:
Sam Johnston <samj@samj.net> writes:
I'm leaning towards the time/space tradeoff of including the ID in each row somehow (in which case parsing into a hash of hashes is trivial again).
That works.
<snip>
I've found the tinydns-data <http://cr.yp.to/djbdns/tinydns-data.html> format a pleasure to work with as well
Yes, DJB knows how to design a decent data format, and doesn't succumb to the over-engineering fetish predominant as one moves up the software stack. I wouldn't be upset by KEY:VALUE in place of KEY VALUE. That's also easily parseable by read or strsep().
Ok so taking this a little further to tie off the formats discussion, combining the two ideas (tinydns-data w/ id on every line) gives us:

decca5a5-8952-4004-9793-cdbbf05c3c63:category:server
decca5a5-8952-4004-9793-cdbbf05c3c63:title:Debian GNU/Linux 5.0 Virtual Appliance

Having worked with this format for what... a decade now... I can tell you that it is an absolute dream, and even things that weren't conceived of at the time (e.g. SRV records) are easily supported... the whole while avoiding annoying/dangerous parsing problems due to greedy regexps (which are surprisingly common) and the like. It also allows us to cater for simple structures like arrays later if need be:

decca5a5-8952-4004-9793-cdbbf05c3c63:interfaces:eth0:eth1:eth2

Perhaps more importantly though it trivialises both generation and parsing of content by allowing you to do it in any order. This is particularly important for scalability (allowing for multiple threads querying multiple servers and feeding back into a shared writer).

I think then that the formats discussion is pretty much done, at least for the time being. On with the verbs and nouns...

Sam
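To illustrate the "hash of hashes" point, here is a rough Python sketch of parsing ID:KEY:VALUE rows when they arrive in any order. It assumes colon-separated fields, treats a single trailing field as a scalar and several as a list, and does not handle values that themselves contain a colon (such as URLs), which would need an escaping rule the thread has not defined. The name parse_rows and the row set are hypothetical.

```python
from collections import defaultdict


def parse_rows(lines):
    """Group ID:KEY:VALUE[:VALUE...] rows into {id: {key: value}}.

    Assumption: fields never contain a literal ':'; an escaping rule
    would be needed for values such as URLs.
    """
    resources = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        resource_id, key, *values = line.split(":")
        # One trailing field is a scalar; several form a simple array.
        resources[resource_id][key] = values[0] if len(values) == 1 else values
    return dict(resources)


rows = [
    "decca5a5-8952-4004-9793-cdbbf05c3c63:title:Debian GNU/Linux 5.0 Virtual Appliance",
    "4696b561-a253-42b4-bd27-7aa4950e0a60:category:storage",
    "decca5a5-8952-4004-9793-cdbbf05c3c63:category:server",
    "decca5a5-8952-4004-9793-cdbbf05c3c63:interfaces:eth0:eth1:eth2",
]

parsed = parse_rows(rows)
print(parsed["decca5a5-8952-4004-9793-cdbbf05c3c63"]["category"])
print(parsed["decca5a5-8952-4004-9793-cdbbf05c3c63"]["interfaces"])
```

Because every row carries its own ID, interleaved rows from several writers produce the same structure as rows grouped by resource, which is the property the message above relies on.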

Sam Johnston <samj@samj.net> writes:
I've found the tinydns-data <http://cr.yp.to/djbdns/tinydns-data.html> format a pleasure to work with as well
Yes, DJB knows how to design a decent data format, and doesn't succumb to the over-engineering fetish predominant as one moves up the software stack. I wouldn't be upset by KEY:VALUE in place of KEY VALUE. That's also easily parseable by read or strsep(). Best wishes, Chris.

So, I'm totally on this page... Sort of. I used UUIDs exclusively with CloudScale. But...

The only downside of UUIDs is they aren't people-friendly. For example, trying to remember or type in and verify a UUID by eye is very error-prone. So if you were a sysadmin writing some bourne shell scripts to help with automation or just using some command line tools, it's going to be painful. You can imagine:

clitool delete disk disk-id <some-really-long-uuid-string-is-here> -server <another-really-long-uuid-string-is-here> ... <yet-another-really-long-uuid-string>

Another example would be listings of any kind. If strewn with UUIDs they are going to be super hard for people to parse visually.

I'd prefer that folks use UUIDs internally, that we have the spec say string(256) for most identifiers, and that providers can choose to use UUIDs or something more friendly (e.g. the canonical AMI id: ami-abcd1234) as appropriate.

--Randy

On 4/16/09 6:12 AM, "Sam Johnston" <samj@samj.net> wrote:
I'm surprised I didn't get more abuse about using UUID4's but that's a no brainer when you understand the concurrency issues (and don't want to expose any secrets, particularly about the size of your operation).
-- Randy Bias, VP Technology Strategy, GoGrid randyb@gogrid.com, (415) 939-8507 [mobile] BLOG: http://neotactics.com/blog, TWITTER: twitter.com/randybias

On Fri, Apr 17, 2009 at 2:37 AM, Randy Bias <randyb@gogrid.com> wrote:
The only downside of UUIDs is they aren’t people-friendly. For example, trying to remember or type in and verify a UUID by eye is very error-prone. So if you were a sysadmin writing some bourne shell scripts to help with automation or just using some command line tools, it’s going to be painful. You can imagine:
clitool –delete disk –disk-id <some-really-long-uuid-string-is-here> -server <another-really-long-uuid-string-is-here> ... <yet-another-really-long-uuid-string>
Another example would be listings of any kind. If strewn with UUIDs they are going to be super hard for people to parse visually.
I’d prefer that folks use UUIDs internally, that we have the spec say string(256) for most identifiers, and that providers can choose to use UUIDs or something more friendly (e.g. the canonical AMI id: ami-abcd1234) as appropriate.
This is something I had considered too but it runs counter to the "great global grid" use case. I don't think moving away from UUIDs for the core is sensible (if you give people enough rope they *will* hang themselves, as they have proven time and time again)...

But, if the category/tagging functionality does not satisfy this use case then my suggestion would be to assign optional aliases to frequently used resources (e.g. Amazon's "small", "medium" and "large" templates). This would add implementation headaches (e.g. having to implement a central registry which would need to be available at creation time) but write operations would be few and far between (at least compared to normal operations).

So for tags/categories you have:

http://example.com/-/linux/web (to retrieve all linux web servers you have access to)

And for aliases you might have:

http://example.com/myserver (rather than a long, ugly UUID)

Sam
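One way to picture the optional-alias idea is a small in-memory registry that keeps UUIDs as the canonical identifiers and maps friendly names onto them. The Python sketch below is an assumption-laden illustration only: the AliasRegistry class, its method names and the in-memory dict are hypothetical, and a real provider would need a shared, persistent registry as the message above notes.

```python
import uuid


class AliasRegistry:
    """Hypothetical alias -> UUID registry; UUIDs remain canonical."""

    def __init__(self):
        self._aliases = {}

    def register(self, alias, resource_id):
        # Validate and normalise the UUID; raises ValueError if malformed.
        canonical = str(uuid.UUID(resource_id))
        if alias in self._aliases:
            raise ValueError("alias already taken: " + alias)
        self._aliases[alias] = canonical

    def resolve(self, name):
        """Return the canonical UUID for either a UUID or an alias."""
        try:
            return str(uuid.UUID(name))  # already a UUID: pass it through
        except ValueError:
            return self._aliases[name]  # KeyError if the alias is unknown


registry = AliasRegistry()
registry.register("myserver", "decca5a5-8952-4004-9793-cdbbf05c3c63")

print(registry.resolve("myserver"))
print(registry.resolve("DECCA5A5-8952-4004-9793-CDBBF05C3C63"))
```

Writes (register) happen only when an alias is created, while lookups (resolve) accept either form, so clients that only ever use UUIDs are unaffected.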

Here is the solution I use and have used in the past. I have created UUIDs of the form <My-name>-<location>-<my-id>. So long as we aren't sticking to a particular format for UUIDs this would work (and has).

Chuck Wegrzyn

Sam Johnston wrote:
On Fri, Apr 17, 2009 at 2:37 AM, Randy Bias <randyb@gogrid.com> wrote:
The only downside of UUIDs is they aren’t people-friendly. For example, trying to remember or type in and verify a UUID by eye is very error-prone. So if you were a sysadmin writing some bourne shell scripts to help with automation or just using some command line tools, it’s going to be painful. You can imagine:
clitool –delete disk –disk-id <some-really-long-uuid-string-is-here> -server <another-really-long-uuid-string-is-here> ... <yet-another-really-long-uuid-string>
Another example would be listings of any kind. If strewn with UUIDs they are going to be super hard for people to parse visually.
I’d prefer that folks use UUIDs internally, that we have the spec say string(256) for most identifiers, and that providers can choose to use UUIDs or something more friendly (e.g. the canonical AMI id: ami-abcd1234) as appropriate.
This is something I had considered too but it runs counter to the "great global grid" use case. I don't think moving away from UUIDs for the core is sensible (if you give people enough rope they /will/ hang themselves, as they have proven time and time again)...
But, if the category/tagging functionality does not satisfy this use case then my suggestion would be to assign optional aliases to frequently used resources (e.g. Amazon's "small", "medium" and "large" templates). This would add implementation headaches (e.g. having to implement a central registry which would need to be available at creation time) but write operations would be few and far between (at least compared to normal operations).
So for tags/categories you have:
http://example.com/-/linux/web (to retrieve all linux web servers you have access to)
And for aliases you might have:
http://example.com/myserver (rather than a long, ugly UUID)
Sam
------------------------------------------------------------------------
_______________________________________________ occi-wg mailing list occi-wg@ogf.org http://www.ogf.org/mailman/listinfo/occi-wg