A Structure for Abstraction and Virtualization in the Telco Cloud

It is becoming clear that the future of networking, the cloud, and IT in general lies in abstraction.  We have an increasing number of choices in network technology, equipment vendors, servers, operating systems (and distros), middleware…you get the picture.  We have open-source software and open hardware initiatives, and of course open standards.  With this multiplicity of options comes more buyer choice and power, but multiplicity has its downsides.  It’s hard to prevent vendor desires for differentiation from diluting choice, and differences in implementation make it difficult to create efficient and agile operations.

Abstraction is the accepted way of addressing this.  “Virtualization” is a term often used to describe the process of creating an abstraction that can be mapped to a number of different options.  A virtual machine is mapped to a real server, a virtual network to real infrastructure.  Abstraction plus mapping equals virtualization, in other words.
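
To make that equation concrete, here’s a minimal Python sketch (all the names are hypothetical, purely for illustration): the VirtualMachine class is the abstraction, and the virtualize function supplies the mapping that binds it to a real server.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Server:                       # a real resource
    name: str
    cores_free: int

@dataclass
class VirtualMachine:               # the abstraction
    vcpus: int
    host: Optional[Server] = None   # the mapping, filled in when committed

def virtualize(vm, pool):
    """Abstraction plus mapping: bind the abstract VM to any server that fits."""
    for server in pool:
        if server.cores_free >= vm.vcpus:
            server.cores_free -= vm.vcpus
            vm.host = server        # this binding is the "virtualization"
            return vm
    raise RuntimeError("no capacity in pool")

pool = [Server("rack1-s1", cores_free=8), Server("rack1-s2", cores_free=16)]
vm = virtualize(VirtualMachine(vcpus=4), pool)
print(vm.host.name)                 # the abstract VM now maps onto a real server
```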

The challenge we have isn’t acceptance of the notion of abstraction/virtualization, but the growing number of things that need to be virtualized and the even-faster-growing number of ways of looking at them.  Complex virtualization really demands a modeling system to express the relationships of parts to the whole.  In my ExperiaSphere work on service lifecycle automation, I proposed that we model a service in two layers, “service” and “resource”, and I think we are starting to see some sense of structure in virtualization overall.
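
As a rough illustration of the two-layer idea (the names and structure here are invented for this post, not the actual ExperiaSphere models), a service/resource model might look like this:

```python
# Hypothetical two-layer model: the "service" layer captures what is sold,
# the "resource" layer captures how infrastructure realizes it.
service_model = {
    "service": {                                   # intent/commercial view
        "name": "business-vpn",
        "features": ["connectivity", "firewall"],
    },
    "resource": {                                  # realization view
        "connectivity": {"realized-by": "mpls-vpn", "sites": 12},
        "firewall": {"realized-by": "hosted-vnf", "pool": "edge-cloud-east"},
    },
}
```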

The best way to look at anything these days is through cloud-colored glasses, and the cloud offers us some useful insights into that broader virtualization vision.  “Infrastructure” in the cloud has two basic features, the ability to host application components or service features, and the ability to connect elements of applications and services to create a delivered experience.  We could visualize these two things as being the “services” offered by, or the “features” of, infrastructure.

If you decompose infrastructure, you end up with systems of devices, and here we see variations in how the abstraction/virtualization stuff might work.  On the network side, the standard structure is that a network is made up of a cooperative community of devices/elements, and that networks are committed to create connection services.  Thus, the progression is devices > networks > connection services.  On the hosting or computing side, you really have a combination of network devices and servers that collectively frame a data center hardware system, which in turn hosts a set of platform software tools that combine to create the hosting.
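
A quick sketch of those two progressions, with purely illustrative names, might look like this:

```python
# Network side: devices > networks > connection services.
network_side = {
    "connection-service": "ethernet-vpn",
    "network": {
        "name": "metro-core",
        "devices": ["router-a", "router-b", "switch-1"],
    },
}

# Hosting side: devices (servers plus network gear) > data center >
# platform software > hosting.
hosting_side = {
    "hosting": "container-platform",
    "platform": ["linux", "kubernetes", "container-runtime"],
    "data-center": {
        "name": "dc-east-1",
        "devices": ["server-01", "server-02", "tor-switch-1"],
    },
}
```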

There are already a couple of complicating factors entering the picture.  First, “devices” at the network and hosting levels can be virtualized themselves.  A “router” might be a software feature hosted in a virtual machine assigned to a pool of servers.  Second, the virtual machine hosting (or container hosting) might be based on a pool of resources that don’t align with data center boundaries, so the virtual division of resources would differ from the physical division.  Container pods or clusters or swarms are examples; they might cross data center boundaries.
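
Here’s a hypothetical sketch of both complications at once: a resource pool whose membership cuts across data centers, hosting a “router” that’s really just a software feature placed somewhere in the pool.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    data_center: str

# The pool's membership cuts across physical facilities.
edge_pool = [
    Server("s-101", data_center="dc-east"),
    Server("s-102", data_center="dc-east"),
    Server("s-201", data_center="dc-west"),
]

def host_virtual_router(pool):
    """Place a software 'router' on any pool member; the consumer of the
    router never sees which facility it landed in."""
    target = pool[0]                # trivial placement policy, for illustration
    return {"vnf": "router", "host": target.name, "site": target.data_center}

print(host_virtual_router(edge_pool))
```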

What we end up with is a slightly more complicated set of layers, which I offer HERE as a graphic to make things easier to follow.  I’ve also noted the parts of the structure covered by MANO and ONAP, and by the Apache Mesos and DC/OS combination that I think bears consideration by the ONAP people.

At the bottom of the structure, we have a device layer made up of real, atomic hardware elements.  On top of this is a virtual-infrastructure layer, and this layer is responsible for mapping between the real device elements available and any necessary or useful abstraction thereof.  One such abstraction might be geographical/facility-oriented, meaning data centers or interconnect farms.  Another might be resource-pool-oriented, meaning that the layer creates an abstract pool from which higher layers can draw resources.
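
To illustrate, here’s a sketch (hypothetical names again) of those two kinds of abstraction drawn over a single device inventory:

```python
from collections import defaultdict

# One device inventory, two abstractions drawn over it.
devices = [
    {"name": "s-01", "dc": "dc-east", "kind": "server"},
    {"name": "s-02", "dc": "dc-west", "kind": "server"},
    {"name": "r-01", "dc": "dc-east", "kind": "router"},
]

# Geographical/facility-oriented view: group by data center.
by_facility = defaultdict(list)
for d in devices:
    by_facility[d["dc"]].append(d["name"])

# Resource-pool-oriented view: pools by device kind, ignoring facilities.
by_pool = defaultdict(list)
for d in devices:
    by_pool[d["kind"]].append(d["name"])

print(dict(by_facility))   # {'dc-east': ['s-01', 'r-01'], 'dc-west': ['s-02']}
print(dict(by_pool))       # {'server': ['s-01', 's-02'], 'router': ['r-01']}
```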

One easy illustration of this layer and what it abstracts is the decision by an operator or cloud provider to add a data center.  That data center has a collection of real devices in it, and the process of adding the data center would involve some “real” and “virtual” changes.  On the real side, we’d have to connect that data center network into the WAN that connects the other data centers.  On the virtual side, we would need to make the resources of that data center available to the abstractions that are hosted by the virtual-infrastructure layer, such as cloud resource pools.  The “mapping processes” for this layer might contain policies that would automatically augment some of the virtual-infrastructure abstractions (the resource pools, for example) with resources from the new data center.
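
A minimal sketch of such a mapping process, assuming a made-up “auto-join” policy attached to each pool, might look like this:

```python
# Each pool carries a policy; onboarding a data center automatically
# augments any pool whose policy says to absorb new facilities.
pools = {
    "general-compute": {"members": ["dc-east"], "policy": "auto-join-all"},
    "gpu-compute":     {"members": [],          "policy": "manual"},
}

def onboard_data_center(dc_name, pools):
    # The "real" side (joining the DC network into the WAN) is out of
    # scope here; this models only the virtual-side augmentation.
    for name, pool in pools.items():
        if pool["policy"] == "auto-join-all":
            pool["members"].append(dc_name)
            print(f"pool {name} augmented with {dc_name}")

onboard_data_center("dc-south", pools)   # only general-compute grows
```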

Above the virtual-infrastructure layer is the layer that commits virtual resources, which I’ll call the “virtual resource” layer.  This layer would add whatever platform software (OS and middleware, hypervisor, etc.) and parameterization is needed to transform a resource pool into a “virtual element”: a virtual component of an application or service, a virtual device, or something else that has explicit functionality.  Virtual elements are the building blocks for services, which are made up of feature components hosted in virtual elements, or of the coerced behavior of devices or device systems.
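
As a sketch (the function and field names are invented for illustration), the virtual-resource layer is essentially a composition step:

```python
def make_virtual_element(pool_grant, platform, params):
    """Compose a raw pool allocation with platform software and the
    parameters that fix its role, yielding a functional virtual element."""
    return {
        "allocation": pool_grant,  # what the virtual-infrastructure layer gave us
        "platform": platform,      # OS, middleware, hypervisor, etc.
        "parameters": params,      # configuration that gives it explicit function
    }

firewall_element = make_virtual_element(
    pool_grant={"pool": "edge-cloud", "vcpus": 2, "memory_gb": 4},
    platform=["linux", "kvm"],
    params={"function": "firewall", "ruleset": "default-deny"},
)
print(firewall_element["parameters"]["function"])   # firewall
```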

If we accept this model as at least one possible layered framework for abstraction, we can also map some current projects to the layers.  ONAP and NFV MANO operate at the very top, converting virtual resources into functional components, represented in MANO by Virtual Infrastructure Managers and Virtual Network Functions.  ONAP operates higher as well, in service lifecycle management processes.
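
Loosely sketched in code, and emphatically not the actual ETSI MANO interfaces, the division of labor looks something like this: a VIM-like object fronts the resource layers below, and MANO-level logic turns allocated resources into a running VNF.

```python
class Vim:
    """Stand-in for a Virtual Infrastructure Manager: it fronts the
    virtual-resource layers below and hands out resource grants."""
    def allocate(self, vcpus, memory_gb):
        return {"vcpus": vcpus, "memory_gb": memory_gb, "pool": "telco-cloud"}

def instantiate_vnf(vim, descriptor):
    """MANO-level logic: convert a resource grant into a running VNF."""
    grant = vim.allocate(descriptor["vcpus"], descriptor["memory_gb"])
    return {"vnf": descriptor["name"], "resources": grant, "state": "running"}

vnf = instantiate_vnf(Vim(), {"name": "vFirewall", "vcpus": 2, "memory_gb": 4})
print(vnf)
```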

Below the ONAP/MANO activities are the layers that my ExperiaSphere stuff calls the “resource-layer models”.  In my view, the best current framework for this set of features is found in the DC/OS project, which is based on Apache Mesos.  There are things needed at this level that Mesos and DC/OS don’t provide, but I think they could be added without too much hassle.

Let’s go back now to DC/OS and Mesos.  Mesos is an Apache cluster management tool, and DC/OS adds features that abstract a resource cloud to look like a single computer, which is certainly a big step toward my bottom-layer requirements.  It’s also something that I think the telcos should have been looking at (so is Marathon, a mass-scale orchestration tool).  But even if you don’t think the combination is a critical piece of virtualization and the telco cloud, it demonstrates that the cloud community has been thinking about this problem for a long time.
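
As a small taste of what the cloud community has already built, here’s how an application could be submitted to Marathon through its REST API (POST /v2/apps); the host address and app definition below are placeholders, not a working deployment.

```python
import json
import urllib.request

# Placeholder app definition; Marathon keeps "instances" copies running
# and restarts them if they fail.
app = {
    "id": "/demo/hello",
    "cmd": "python3 -m http.server 8080",
    "cpus": 0.25,
    "mem": 128,
    "instances": 2,
}
req = urllib.request.Request(
    "http://marathon.example.com:8080/v2/apps",   # placeholder master address
    data=json.dumps(app).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)   # uncomment to submit to a real Marathon master
```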

Where I think DC/OS and Mesos could use some help is in defining non-server elements, in resource commissioning, and in data center assignment and onboarding.  The lower layer of my model, the Device Layer, is a physical pool of stuff.  It would be essential to be able to represent network resources in this layer, and it would be highly desirable to support the reality that you onboard entire data centers or racks, not just individual servers or boxes.  Finally, the management processes that sustain resources should be defined here, and coupled upward from here so they can be associated with higher-layer elements.
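
Here’s a hypothetical sketch of the kind of Device Layer additions I mean: non-server elements with management hooks, onboarded a rack at a time rather than a box at a time.

```python
inventory = []

def onboard_rack(rack_id, servers, switches):
    """Commission a whole rack at once, not just individual boxes, and
    record a management hook for each non-server element."""
    for name in servers:
        inventory.append({"id": name, "rack": rack_id, "kind": "server"})
    for name in switches:
        inventory.append({"id": name, "rack": rack_id, "kind": "switch",
                          "mgmt": f"snmp://{name}"})   # coupled upward later

onboard_rack("rack-7", servers=["s-701", "s-702"], switches=["tor-7"])
print(len(inventory))   # 3 elements commissioned in one operation
```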

I think this is a topic that needs to be explored, by the ONAP people, the NFV ISG, and perhaps the Open Compute Project, as well as Apache.  We need to have a vertically integrated model of virtualization, not a bunch of disconnected approaches, or we’ll not be able to create a uniform cloud hosting environment that’s elastic and composable at all levels.  And we shouldn’t settle for less.