Composed versus Abstracted Resources: Why it’s Critical

The blog I did yesterday on Cisco’s approach to edge or universal distributed computing drew some email questions and some LinkedIn comments.  Together they suggest that spending a little time on the issue of resource abstraction or “virtualization” versus infrastructure composition might be useful.  As always, it seems, we’re hampered by a lack of consistency in definition.

The “composable infrastructure” or “infrastructure as code” movement seeks to let something much like a real server be assembled from memory, CPU, storage, and so forth.  This presupposes very tight coupling among the elements you’re composing from, so that the result of your composition is an efficient, almost-real machine.  “Abstract infrastructure”, “resource abstraction”, or “virtualization” aims at a different target.  The goal there is to define a model, an abstraction, a virtual something, that has consistent properties but can be mapped to a variety of actual hardware platforms.

It’s my view that composable infrastructure has specialized value while resource abstraction has general value.  Cloud computing, containers, virtual machines, and virtual networks all prove my point.  The reason is that if you define the right resource abstraction and map it properly, you can build applications whose components exploit the properties of the system you’ve created.  The requirements at the “composable” or real-server level are then uniform (which is why you can map an abstraction to a general pool of resources), so you don’t need to compose anything.

I’m not dismissing the value of composable infrastructure here, but I am suggesting that it’s not as broadly critical a movement as resource abstraction.  You could use composable infrastructure inside a resource pool, of course, but if the goal is resource and operations efficiency, it’s probably better to have resources that are more uniform in capability, so utilization can be optimized.  Differences among resources, which composable infrastructure enables and even encourages, work against the resource equivalence that is the foundation of an efficient resource pool.

There also seem to be two different approaches to resource abstraction.  OpenStack epitomizes the first and oldest approach, which abstracts through process generalization.  The lifecycle tasks associated with deploying an application or service feature on infrastructure are well known.  If each task is given an API, and if the logic for that task is then connected to a “plug-in” that specializes it to a given resource or class of resources, then invoking that task with the proper plug-in will control the resource(s) you’ve assigned.  The newer approach is based on a virtual model and layer.  With this approach, you define a virtual resource class, like a server with memory and storage, and you define an API or APIs to do things with it.  Your processes then act directly on the model’s APIs; there are no visible plug-ins or adaptations.  Under the model sits a new layer of logic that invisibly maps the virtual model to real resources.
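To make the contrast concrete, here’s a minimal Python sketch of the two styles.  It’s illustrative only; the class and method names (ComputePlugin, VirtualServer, and so on) are hypothetical, not OpenStack’s or any real product’s API.

```python
# Sketch of the two abstraction styles; all names are hypothetical.

class ComputePlugin:
    """Process-generalized style: each lifecycle task gets an API,
    and a plug-in specializes that task to a class of real resources."""
    def deploy(self, image, flavor):
        raise NotImplementedError

class BareMetalPlugin(ComputePlugin):
    def deploy(self, image, flavor):
        # The process that invokes this task knows which plug-in it chose.
        return f"provisioned {image} on bare metal ({flavor})"

class VirtualServer:
    """Virtual-model style: processes see only the model's API; a hidden
    mapping layer decides what real resources stand behind it."""
    def __init__(self, mapping_layer):
        self._layer = mapping_layer   # invisible to the calling processes

    def deploy(self, image, flavor):
        return self._layer.map_and_deploy(image, flavor)
```

In the first style, the choice of plug-in is visible to the lifecycle process; in the second it isn’t, which is the whole point.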

The difference between these approaches can be understood by thinking for a moment about a resource pool that consists of servers in very different locations, perhaps some in the cloud, some at the edge, and some in the data center.  When you want to assign a resource to a given application/feature, the tasks associated with deciding what resource to assign and making the assignment are likely at least somewhat (and often radically) different depending on where the resource is.  Network connectivity in particular may be different.  In process-generalized virtualization, this may mean that your processes have to know about those differences, which means lifecycle automation has to test for the conditions that could change how you deploy.  In virtual-model resource abstraction, you make all these decisions within the new layer, so your processes are simple and truly resource-independent.
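Here’s a hedged sketch of what that lower layer might do, continuing the hypothetical MappingLayer from the example above.  The inventory fields and selection logic are invented for illustration; the point is only that the location-dependent tests live below the abstraction rather than in your lifecycle processes.

```python
# Hypothetical mapping layer: location-dependent decisions live here,
# so the processes above the abstraction never see them.

class MappingLayer:
    def __init__(self, inventory):
        # inventory entries are dicts like
        # {"id": "edge-3", "site": "edge", "free_cpu": 8, "latency_ms": 5}
        self.inventory = inventory

    def map_and_deploy(self, image, flavor, max_latency_ms=None):
        # flavor is assumed here to be a dict like {"cpu": 4}
        candidates = [r for r in self.inventory
                      if r["free_cpu"] >= flavor.get("cpu", 1)]
        if max_latency_ms is not None:
            candidates = [r for r in candidates
                          if r["latency_ms"] <= max_latency_ms]
        if not candidates:
            raise RuntimeError("no suitable resource available")
        # Site-specific differences (connectivity, deployment mechanics)
        # would be handled per site right here, not by the caller.
        target = min(candidates, key=lambda r: r["latency_ms"])
        return f"deployed {image} on {target['id']} ({target['site']})"
```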

You can also explore the differences between these two abstraction approaches by looking at the current cloud infrastructure platform software space.  There are two proven-at-scale approaches to “hosting”.  The first is the Apache Mesos and DC/OS combination (Mesosphere offers a commercially supported version of the combination, along with the Marathon orchestrator), which is very explicitly a virtual-model-and-layer tool.  The second is what we could call the “Kubernetes ecosystem”, a container-based strategy that has evolved piece by piece, and it’s what goes into that ecosystem that’s interesting.

Kubernetes itself is an orchestrator.  It gives you a plug-in point for a virtual-network tool.  You need an additional component to provide distributed load balancing and workflow control; Google-sponsored Istio is a prime example.  You need a strategy to distribute the Kubernetes control plane, like Stackpoint.  Maybe you use a combination tool like Juke from HTBASE, which Juniper is now acquiring, or Rancher.  You can probably see the point here: all this ecosystemic stuff has to be added on, there are multiple options in most of the areas, and your overall process of lifecycle automation is likely to be impacted by each selection you make.  If you simply push everything into a lower layer and make it invisible, your operational life is much easier.
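A small, purely illustrative sketch of why those selections matter operationally: each ecosystem choice tends to become a branch in your automation unless it’s hidden below an abstraction.  The stack choices and step strings here are hypothetical descriptions, not the actual commands any of these tools use.

```python
# Hypothetical: every ecosystem selection becomes a condition your
# lifecycle automation has to carry, unless it's pushed below the line.

STACK = {"network_plugin": "your-cni-choice",
         "service_mesh": "istio",
         "control_plane_federation": "stackpoint"}

def deployment_steps(manifest, stack=STACK):
    steps = [f"apply {manifest} via the orchestrator"]
    if stack["service_mesh"] == "istio":
        steps.append("inject the service-mesh sidecar")        # mesh-specific
    if stack["control_plane_federation"] == "stackpoint":
        steps.append("register with the federated control plane")
    # ...and so on for every other selection the operations team made
    return steps
```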

This is why I said, in yesterday’s blog, that Cisco’s “stuff-Anywhere” expansion was a good idea as far as it went, but didn’t include a critical piece.  Any resource abstraction strategy should promote vendor independence, efficient use of resources, and hybrid and multi-cloud operation.  It also has to support efficient and error-free operations practices, and if you don’t push the variables down into the lower layer beneath the abstraction, they rise up to bite your operations staff in the you-know-where.  That bites your business case in the same place.

You might wonder why we don’t see vendors touting ecosystemic resource abstraction if it’s so clear this is the long-term winning strategy.  The answer is simple; it’s easier to get somebody to climb to the next ledge than to confront them with the challenge of getting all the way to the summit.  Vendors aren’t in this for the long term; they’re in it for the next quarter’s earnings.  They push the stuff that can be sold quickly, which tends to mean that buyers are moved to a short-term goal, held there until they start to see the limitations of where they are, and then presented with the next step.

There’s a downside to the model-and-layer approach to resource abstraction; two, in fact.  The first is that the underlying layer is more complex and has to be kept up to date with respect to the best tools and practices.  The second is that you are effectively creating an intent model, which means you have to supply the new layer with parametric data about the stuff you’re deploying, parametric data about your resources, and policies that relate the two.  This stuff is easier for some organizations to develop, but for those used to hands-on operation, it’s harder to relate to than process steps would be.
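As a rough illustration of what that parametric data might look like (the field names are invented for this sketch, not any product’s schema): a description of what you’re deploying, a description of a resource, and a policy relating the two.

```python
# Hedged sketch of intent-model inputs; field names are illustrative.

from dataclasses import dataclass

@dataclass
class DeploymentIntent:          # parametric data about what you're deploying
    name: str
    cpu: int
    memory_gb: int
    max_latency_ms: int

@dataclass
class ResourceDescriptor:        # parametric data about a resource
    id: str
    site: str                    # "cloud", "edge", "datacenter"
    cpu: int
    memory_gb: int
    latency_ms: int

def policy_allows(intent: DeploymentIntent, res: ResourceDescriptor) -> bool:
    """A trivial policy relating workload parameters to resource parameters;
    the new layer enforces relations like this on your behalf."""
    return (res.cpu >= intent.cpu
            and res.memory_gb >= intent.memory_gb
            and res.latency_ms <= intent.max_latency_ms)
```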

This same argument has been used to justify prescriptive (script-like) DevOps over declarative (model-like) DevOps, and I think it’s clear that the model-based approach is winning anyway.  It’s very easy to recast any parameter set to a different format as long as all the necessary data is represented in some way.  It’s easy to get parametric values incorporated in a repository to support what the development community is calling “GitOps” and “single source of truth”.  In short, I think it’s easy to say that the industry is really moving toward the resource abstraction model.
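To show how low that bar is, here’s a tiny sketch: one hypothetical parameter set, kept as the single source of truth, rendered into whatever format a given tool wants.

```python
# Illustration only: the same parameter set recast into another format.
import json

params = {
    "service": "catalog",   # hypothetical workload parameters
    "replicas": 3,
    "cpu": 2,
    "memory_gb": 4,
}

# Render for a tool that expects JSON; a YAML or properties rendering is
# the same one-way transformation from the same data.  Committing the
# source file to a repository is what gives you the "single source of truth".
with open("catalog-params.json", "w") as f:
    json.dump(params, f, indent=2)
```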

Which is where the media, including people like me who blog regularly, should step in.  Vendors may not want to do pie-in-the-sky positioning, but if enough people tell the market which way is up, we can at least hope to get our ledge-climbers headed in the right direction.