[occi-wg] Horizontal & vertical scalability in the cloud

26 Oct 2009

Morning all,

So I'm looking at how best to handle horizontal & vertical scalability
following an hour or two on the phone with Andy earlier, which got me
thinking. Currently scaling in services like Amazon EC2 involves having a
service like RightScale monitor your application (e.g. response time, load
average, etc.) and start new instances when (or ideally before) the figures
reach unacceptable levels, track them while they're running, and kill them
off again when the load returns to the baseline. This is pretty inefficient
because each instance is its own first-class citizen and there is no
connection between them (aside from the fact that they were started from the
same image); you have to handle them individually, which in itself generates
flurries of unnecessary API calls.

A better approach to scalability is to have a single object which you can
both adjust the resources of (vertical scalability) and adjust the number of
instances of (horizontal scalability). That is, you start a single instance
with 1 core and 1 GB, then while it's running you crank it up to 2 cores and
2 GB. Eventually you max out at, say, 8 cores and 16 GB, so you need to go
horizontal at some point. Rather than create new unlinked instances the idea
is that you would simply adjust the number of requested instances and let
the infrastructure take care of making reality match the request (if the
available resources allow for it). When things calm down you can back off
the number of requested instances and watch the (read-only) number of actual
instances fall back as machines are gracefully shut down. Similarly one can
tweak the allocated resources (RAM, CPU, etc.) and, provided the
infrastructure supports it, the changes will be applied to all the "shadow"
instances.
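
To make the idea a bit more concrete, here is a rough Python sketch of such an object. It is only an illustration under my own assumptions: the class name ScalableCompute, its attribute names, and the reconcile() step (standing in for whatever the infrastructure does internally) are all hypothetical and not part of any OCCI draft. The client writes the requested values, while the actual instance count is read-only and only ever changed by the provider side.

```python
from dataclasses import dataclass, field


@dataclass
class ScalableCompute:
    """Hypothetical single object standing for a group of identical instances."""

    cores: int = 1                 # vertical: applied to every instance
    memory_gb: int = 1             # vertical: applied to every instance
    requested_instances: int = 1   # horizontal: what the client asks for
    _actual_instances: int = field(default=0, init=False)

    @property
    def actual_instances(self) -> int:
        """Read-only for the client; only the infrastructure changes it."""
        return self._actual_instances

    def reconcile(self, capacity: int) -> int:
        """Provider-side step (assumed): move one instance closer to the
        request, never exceeding the capacity currently available."""
        target = min(self.requested_instances, capacity)
        if self._actual_instances < target:
            self._actual_instances += 1   # start one more instance
        elif self._actual_instances > target:
            self._actual_instances -= 1   # gracefully shut one down
        return self._actual_instances
```

A client would then scale up by setting cores and memory_gb, scale out by setting requested_instances, and simply watch actual_instances converge as the provider calls reconcile(). If the request exceeds the available capacity, the actual count stops at the capacity rather than failing.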

Ultimately the "governor" (e.g. RightScale) could constantly tune the
application based on the amount and type of load (a fair amount of
artificial intelligence could go into these decisions but that's a subject
for a different forum) and wouldn't have to worry about the mechanics of
managing the resources (which arguably should be done locally anyway). Of
course the infrastructure would also have the option of creating many
separate instances if it wanted to (e.g. so providers like Amazon can still
use OCCI if they want to).
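
As a sketch of what the simplest possible governor decision might look like (deliberately dumb, no AI involved), the function below scales up until the per-instance ceiling is reached and only then scales out. The function name next_request, the load thresholds (0.8 and 0.3), and the doubling step are assumptions of mine for illustration; the 8 core / 16 GB ceiling is just the example figure used above.

```python
def next_request(load, cores, memory_gb, instances,
                 max_cores=8, max_memory_gb=16, high=0.8, low=0.3):
    """Return the (cores, memory_gb, instances) the governor should request.

    load is a normalised utilisation figure between 0.0 and 1.0 (assumed).
    """
    if load > high:
        # Go vertical first, while there is still headroom per instance.
        if cores < max_cores or memory_gb < max_memory_gb:
            return (min(cores * 2, max_cores),
                    min(memory_gb * 2, max_memory_gb),
                    instances)
        # Per-instance resources are maxed out, so go horizontal.
        return cores, memory_gb, instances + 1
    if load < low and instances > 1:
        # Load has calmed down: back off the number of requested instances.
        return cores, memory_gb, instances - 1
    # Load is within the acceptable band: leave the request unchanged.
    return cores, memory_gb, instances
```

The point is that the governor only produces a new request; a single update of the one object replaces the flurry of per-instance start and stop calls, and the mechanics are left to the infrastructure.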

If anyone has any thoughts about this then I'd be interested to hear them...

Sam

Sam Johnston