Sam,
This area is always fun. IMHO the name should be less
important than the definition of the thing that we name :o) But that
doesn't stop the debate.
One potential problem I see with workload is that it is
commonly used in reference to an application or service or to types of such
things, as in transactional workloads, compute intensive workloads and so
forth. Is this a problem? Probably not, as long as we define it
properly, along with the context(s) within which to use it.
However, as you say, in the grand scheme of things one
person's workload can be another's container. How do you intend to capture
this, i.e. that a physical server hosts a hypervisor (a kind of operating
system) that hosts a number of virtual servers that each host an operating
system (which could itself be a hypervisor) which may host one or more JVMs,
which host etc. etc.? The challenge we have is to create a model and set of
terms which is simple and clear within the context of the current problem we are
solving, which is extensible (enough) as the use case set grows and which
is reasonably consistent with other models/definitions. Attached is
a diagram we found useful in the OGF reference model working group.
It attempts to capture the various conceptual layers. You can see this is
old because N1 Grid Containers (c. 2003) = Solaris Containers (Zones plus Solaris
CPU, memory and network I/O controls).
:o)
Anyway, in the OGF reference model we have chosen not to
explicitly separate containers and workloads, or services and resources, but
rather we have the notion of (managed) components that can have structural
relationships (hosts, is composed of) and interaction relationships (network
traffic flow, transaction flow) with other components. These components
are specialized into Servers (physical or logical), operating systems (which
include hypervisors and traditional OSes) and so forth. Avoiding
giving the components names that imply relationships with other components
has provided some conceptual flexibility :o) One of our
(eBay) internal tools uses this model as its basis and renders it as
RDF, allowing us to model, capture and query various patterns and
relationships within our infrastructure.
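
For illustration only, here is a minimal Python sketch of the component idea
described above: generic components linked by a "hosts" relationship, flattened
into subject/predicate/object triples in the spirit of an RDF rendering. The
class, method and component names are hypothetical and are not taken from the
OGF reference model or the eBay tool; only the structural "hosts" relationship
is sketched, not "is composed of" or the interaction relationships.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Component:
    """A managed component; 'kind' is a label, not a fixed role (hypothetical)."""
    name: str
    kind: str
    host: Optional["Component"] = field(default=None, repr=False)
    hosted: List["Component"] = field(default_factory=list, repr=False)

    def hosts(self, child: "Component") -> "Component":
        """Record that this component hosts 'child' and return the child."""
        child.host = self
        self.hosted.append(child)
        return child

    def hosting_chain(self) -> List[str]:
        """Names from this component up to the outermost host."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.name)
            node = node.host
        return chain

    def triples(self) -> Iterator[Tuple[str, str, str]]:
        """Yield (subject, predicate, object) tuples for every 'hosts' edge."""
        for child in self.hosted:
            yield (self.name, "hosts", child.name)
            yield from child.triples()


# The nesting from the example: server -> hypervisor -> virtual server -> OS -> JVM.
server = Component("server-1", "physical server")
hypervisor = server.hosts(Component("hypervisor-1", "operating system"))
vm = hypervisor.hosts(Component("vm-1", "logical server"))
guest_os = vm.hosts(Component("guest-os-1", "operating system"))
jvm = guest_os.hosts(Component("jvm-1", "runtime"))
```

Because no component is named "container" or "workload", the same object can be
the hosted end of one edge and the hosting end of the next, which is the
flexibility mentioned above.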
Actually this would be a good point to engage with the OGF
Ref Model group. They (including me) would be very interested in using OCCI to
help drive the ref model forward, to improve/correct it (ref
model) and so forth. Our goal is something that is not necessarily
exhaustive, but rather something that is definitely useful :o) We
want to capture the lifecycle of the components (container, workload, whatever)
as well as their structure/relationships.
Anyway you get the idea I am sure. It would perhaps be useful to hold a
joint discussion or two with the ref model folks (mainly Dave Snelling and me)
to see whether we can help each other at all.
Cheers
Paul
Evening all,
I've attached some notes Andy took from a call at the weekend as well as a
diagram I whipped up today which I hope will help us to use common terminology
and avoid the ambiguous term "virtual machine" (which can refer to the host,
the guest, or both together - as distinct from what we mean when we say
"Java virtual machine"). The proposed terminology is also generic and thus
compatible with any work we do in the future at the platform and/or application
layers (as deployed applications look just like virtual machines in that they
can be started, stopped, etc.).
- Container refers to the host of an individual workload (e.g. an
empty virtual machine [host], runtime, interpreter, etc.)
- Workload refers to a generic load that the user wishes to execute
in the cloud (e.g. virtual machine files [guest], RoR app, JAR/WAR, etc.)
- Template refers to a COPYable workload that cannot be run (e.g. a
public AMI)
- Instance refers to a workload that is currently allocated and
consuming resources (e.g. a running or suspended virtual machine)
Some of you may recall similar terminology back when we were writing the
charter but our model ended up going in a different direction. The reason it's
come back up now is that we're getting down to the details (like running
instances vs the [possibly immutable] template from which they were started) and
not using common language causes confusion from time to time.
In terms of how we model these things for cloud infrastructure:
- Containers are like reservations (though the resources may or may
not be actually reserved). If you create a blank server in VMware vCloud
express for example there's an entity that you will be billed for regardless
of whether you start it or even point it at an image. Similarly you can pay
Amazon for a "reserved instance" and then get cheaper hourly rates for it.
Think of it like a virtual dedicated server. These can be modeled via "empty"
compute resources - that is, ones which have metadata such as allocated cores
and memory, but no entity-body (e.g. OVF payload).
- Workloads are whatever the user wants to run in your cloud. This
could be anything from a script to a complex, multi-VM OVF file ("vApp" in
VMware parlance). These are generally referred to as templates or instances:
  - Templates are resources that cannot be directly started (e.g.
    don't advertise start, stop, restart, etc. actions), rather needing a COPY
    to a new location as an "instance" first. Rather than having to reverse
    engineer the available actions these are identified by a predetermined
    "template" category.
  - Instances are resources that are allocated (e.g. can be started,
    stopped, restarted, etc.). These are the default and can be identified by
    the presence of an entity-body (e.g. OVF payload) and absence of the
    "template" category.
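
The identification rules above can be sketched roughly as follows. This is an
illustration under stated assumptions, not a proposed API: the ComputeResource
class, its field names, and the classify and copy_template functions are all
hypothetical. Only the rules themselves (a "template" category marks a
template; an entity-body without that category marks an instance; metadata with
no entity-body marks an "empty" container) come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

TEMPLATE_CATEGORY = "template"  # the predetermined category named above


@dataclass
class ComputeResource:
    """Hypothetical compute resource: metadata, categories, optional payload."""
    metadata: Dict[str, object]
    categories: Set[str] = field(default_factory=set)
    entity_body: Optional[bytes] = None  # e.g. an OVF payload


def classify(resource: ComputeResource) -> str:
    """Return 'template', 'instance' or 'container' using the rules above."""
    if TEMPLATE_CATEGORY in resource.categories:
        return "template"
    if resource.entity_body:
        return "instance"
    # Metadata (cores, memory, ...) but no entity-body: an "empty" resource.
    return "container"


def copy_template(template: ComputeResource) -> ComputeResource:
    """COPY a template to a new resource that no longer carries the category."""
    if classify(template) != "template":
        raise ValueError("only templates need a COPY before they can be started")
    return ComputeResource(
        metadata=dict(template.metadata),
        categories=set(template.categories) - {TEMPLATE_CATEGORY},
        entity_body=template.entity_body,
    )
```

Checking the category first means a client never has to reverse engineer the
advertised actions to tell a template from an instance.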
We discussed this on the call today and it wasn't contentious but if you
have feedback then fire away,
Sam