Here are options for metadata used in some of the major storage clouds FWIW:

S3, Rackspace, EMC Atmos, Azure - Headers
Nirvanix - query params in, xml entity out
Mezeo - entity

Thanks - this is great information.

Of the ones using headers, S3, Rackspace and Azure use a prefix with
values stored as-is. Atmos joins all metadata together into one
header, making parsing trivial (split /,/), but necessary.
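
To make the difference concrete, here is a rough sketch of both styles in Python. The header names ("x-amz-meta-" as the S3 prefix, a single joined value for Atmos) and the sample values are assumptions for illustration only, and the function names are made up.

```python
def metadata_from_prefixed(headers, prefix):
    """One header per key (S3/Rackspace/Azure style): strip the prefix."""
    prefix = prefix.lower()
    return {
        name[len(prefix):].lower(): value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


def metadata_from_joined(header_value):
    """One header for everything (Atmos style): split on ',' then '='."""
    metadata = {}
    for pair in header_value.split(","):
        if not pair.strip():
            continue  # ignore empty list members
        key, _, value = pair.partition("=")
        metadata[key.strip()] = value.strip()
    return metadata


# Hypothetical sample data, not taken from any real response.
s3_headers = {
    "Content-Type": "text/plain",
    "x-amz-meta-owner": "sam",
    "x-amz-meta-cores": "2",
}
atmos_value = "owner=sam, cores=2"

prefixed = metadata_from_prefixed(s3_headers, "x-amz-meta-")
joined = metadata_from_joined(atmos_value)
```

Either way the client ends up with the same dictionary; the joined form just needs the extra split.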

If you use something like "Attribute: name=value" then HTTP specifies that repeated headers of this kind can be collapsed into a single "Attribute: name1=value1, name2=value2" header (',' separates the combined values while ';' separates the components within a value).
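
A small sketch of what that means for a client, assuming the "Attribute" header name from the proposal and using Python's email.message.Message purely as a stand-in container for repeated headers. The attributes() helper is hypothetical, and the naive split would need quoting rules if values could contain commas.

```python
from email.message import Message


def attributes(message, name="Attribute"):
    """Gather name=value pairs from every occurrence of the header."""
    result = {}
    for field in message.get_all(name, []):
        # A single field may itself carry several comma-separated pairs.
        for item in field.split(","):
            key, sep, value = item.partition("=")
            if sep:
                result[key.strip()] = value.strip()
    return result


# Two separate headers...
separate = Message()
separate["Attribute"] = "name1=value1"
separate["Attribute"] = "name2=value2"

# ...and the collapsed equivalent.
collapsed = Message()
collapsed["Attribute"] = "name1=value1, name2=value2"
```

Both messages yield the same pairs, so an intermediary that collapses the headers does not change what the client sees.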

The most expensive option of the above is entity, where each metadata
value is a separate GET. However, entities allow binary metadata and
zero restrictions on it, which may be useful.

In such cases it is probably better to use Link: headers. For example, we can advertise a console screenshot in an image/* format using something like:

Link: </myvm.png>; rel="http://purl.org/occi/relation#console"

The same approach is currently used to advertise SSH/RDP/etc. access too.
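
Picking such a link apart on the client side takes very little code. The following is only a sketch for a single link value like the one above; parse_link is a made-up helper, and it assumes no ';' appears inside a quoted parameter and that the header carries only one link.

```python
def parse_link(value):
    """Split '<target>; key="value"; ...' into (target, params)."""
    target, _, params = value.partition(";")
    target = target.strip()
    if not (target.startswith("<") and target.endswith(">")):
        raise ValueError("link target must be enclosed in <>")
    attrs = {}
    for param in params.split(";"):
        key, sep, val = param.partition("=")
        if sep:
            # Parameter names are case-insensitive; drop surrounding quotes.
            attrs[key.strip().lower()] = val.strip().strip('"')
    return target[1:-1], attrs


target, attrs = parse_link(
    '</myvm.png>; rel="http://purl.org/occi/relation#console"'
)
```

A client would then compare attrs["rel"] against the relations it understands and fetch the target only if it wants the binary content.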

In jclouds, we time parsing of response values. A simple XML doc with
only a few elements takes a few ms to parse with a SAX parser. My log
files are not precise enough to measure the overhead of parsing headers:
they always start and finish within the same millisecond.
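
For anyone wanting to repeat the comparison with finer resolution, something along these lines would do. This is not the jclouds code (which is Java), just a Python sketch with a made-up document, a made-up header value and hypothetical helper names; it averages over many runs using a nanosecond clock and makes no claim about what numbers come out.

```python
import time
import xml.sax

XML_DOC = b"<metadata><cores>2</cores><memory>4096</memory></metadata>"
HEADER_VALUE = "cores=2, memory=4096"


class Collector(xml.sax.ContentHandler):
    """Collect the text content of each leaf element by element name."""

    def __init__(self):
        super().__init__()
        self.values = {}
        self._name = None

    def startElement(self, name, attrs):
        self._name = name

    def characters(self, content):
        if self._name and content.strip():
            self.values[self._name] = self.values.get(self._name, "") + content

    def endElement(self, name):
        self._name = None


def parse_xml():
    handler = Collector()
    xml.sax.parseString(XML_DOC, handler)
    return handler.values


def parse_header():
    pairs = (item.partition("=") for item in HEADER_VALUE.split(","))
    return {key.strip(): value.strip() for key, _, value in pairs}


def average_ns(func, repeat=1000):
    """Average wall-clock time per call, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(repeat):
        func()
    return (time.perf_counter_ns() - start) / repeat


if __name__ == "__main__":
    print("xml    ns/call:", average_ns(parse_xml))
    print("header ns/call:", average_ns(parse_header))
```

Both functions return the same dictionary, so the timings compare like with like.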

While unsurprising, it's good to have some numbers to back up the assumption that headers are more performant... I haven't pushed this point before now because I didn't have the evidence.

I hope this background helps, and also helps explain why I'm vocal on
topics such as headers vs entities :)

Sure - it's great to have you on board.

Sam

On Mon, Oct 19, 2009 at 4:21 PM, Sam Johnston <samj@samj.net> wrote:
> On Tue, Oct 20, 2009 at 12:57 AM, gary mazzaferro <garymazzaferro@gmail.com>
> wrote:
>
>> The HTTP header and key/value pairs need to be parsed also, there is no free
>> ride here.
>
> Every HTTP library I have ever used parses HTTP headers and puts them in a
> nice hash for you ready to consume. If we go for "Name: Value" then that's
> all there is to it. If we go for "Attribute: name=value" as is currently
> proposed (which is arguably cleaner, follows cookies' "prior art" and avoids
> Amazon's prefix hack) then you just have to split on '='.
>
> To illustrate how clean this is by example:
>
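>> #!/usr/bin/python3
>> import urllib.request
>> # Fetch the resource; the library parses the response headers for us
>> response = urllib.request.urlopen('http://cloud.example.com/myvm')
>> representation = response.read()
>> # info() returns a dict-like object with case-insensitive header lookup
>> metadata = response.info()
>> print(metadata['occi-compute-cores'])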
>
> As soon as you start talking about payloads you have to fire up a parser
> (JSON/XML/Atom/etc.) or write your own (previous text rendering) which is
> significantly more work to do at both design and run times. Not to mention
> more work for us to do now and more scope for interoperability problems.
>
> Sam
>
>

> _______________________________________________
> occi-wg mailing list
> occi-wg@ogf.org
> http://www.ogf.org/mailman/listinfo/occi-wg
>
>