Identifying the UFO of Service Modeling and Virtualization

One of the nice things about UFOs is that until they land and present themselves, you can say what you like about them.  Abstraction in networking and the cloud is similar in many ways; because an abstraction is a representation, there’s no limit to what one can represent.  No limit, perhaps, but a potentially huge difference in value.  We can’t afford to leave our “abstraction UFO” in an unidentified state, so we’re going to explore the issues here.

A “virtual network” is an abstraction of a network, a “virtual host” an abstraction of a host.  Virtual private networks abstract private networks, so they’re a subset of virtual networks.  Software-defined networks are also abstractions of networks, and so are another form of virtual network.  The point here is that virtualization is all about abstraction (defining something that appears real but isn’t) and then mapping it to resources so that it becomes real.  That two-step process makes virtualization very powerful and also very complicated.

What you abstract/virtualize has major implications because the agility of implementation and deployment/topology that virtualization provides tends to focus at and below the point you’re abstracting.  An easy example is in “hosting”.  You can abstract hosting by presenting hosting/orchestration tools with a single abstraction that represents a huge virtual host, and then map the specifics of deployment and management below that abstraction.  The details of the servers, the networking, even the orchestration are invisible inside that abstract virtual host.  On the other hand, if you abstract the network by creating a virtual network, you can define connectivity as you like but you have to explicitly decide how and where to host something, and then connect it.

We have examples of this today in the container space.  Docker and Linux container technology virtualize the hosting points.  Kubernetes extends this by realizing those hosting points from a cluster rather than from a single host.  Mesos virtualizes the entire resource pool to make it appear as a single host, and we’re clearly heading that way in the mainstream of Kubernetes.  In networking we have SDN and SD-WAN products that create virtual networks that can connect cloud elements, as well as service meshes that abstract added elements like load-balancing and scaling.

One obvious problem this elasticity of meaning creates is that the term “virtualization” tends to be used indiscriminately, disconnected from the specific level of abstraction that’s being proposed or described.  More abstraction is different from less abstraction in some important ways, and we need to understand what they are to understand what imprecise terminology might be costing us.

If we looked at modern cloud infrastructure, including carrier cloud, we’d see a complex hierarchy of software, servers, network devices, and fiber/copper/wireless connections.  There are many different ways to build the structure, many different vendors and equipment models.  Deploying applications or services on such a structure creates a serious risk of “brittle” behavior.  That means that the process of deployment, if aware of the details of how the infrastructure hierarchy is built, will break if a part of it (even sometimes a small part) is changed.  This brittleness issue deters effective service and application lifecycle automation because it’s hard to define processes that don’t break when you change the technologies they operate on.

Abstraction came along as a solution to the problem.  To understand how, we need to start at the bottom, with an example of a network device.  Suppose we have five router vendors and two different models each for two of the vendors.  That’s a total of seven different “routers”, all of which have different properties.  Instead of requiring our management software layer to contend with knowing which router is where, why not have an abstraction called “Router” that’s a software plugin?  On the north side, the abstraction presents a standard set of router properties, and the plugin then harmonizes the seven different router management information bases and control language inputs to that abstraction.  Now one management toolkit can manage all these routers, and any others that are harmonized via a plugin.
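To make the plugin idea concrete, here is a minimal Python sketch.  Everything in it is an assumption for illustration: the two vendor classes, the raw field names (“ifOperUp”, “alarm”, “links”, “state”), and the two standard properties are invented, not taken from any real router MIB.  The point is only that the management logic is written once, against the abstraction.

```python
from abc import ABC, abstractmethod


class Router(ABC):
    """The abstraction: the uniform, north-side view of any router."""

    @abstractmethod
    def status(self) -> dict:
        """Return state using one standard set of property names."""


class VendorARouter(Router):
    """Hypothetical plugin for a device that reports an 'ifOperUp' count."""

    def __init__(self, raw: dict):
        self.raw = raw  # stands in for data read from the device

    def status(self) -> dict:
        return {"ports_up": self.raw["ifOperUp"],
                "healthy": self.raw["alarm"] == 0}


class VendorBRouter(Router):
    """Hypothetical plugin for a device that lists per-port link states."""

    def __init__(self, raw: dict):
        self.raw = raw

    def status(self) -> dict:
        links = self.raw["links"]
        return {"ports_up": sum(1 for s in links if s == "up"),
                "healthy": self.raw["state"] == "OK"}


def unhealthy(routers: dict) -> list:
    """Management logic above the abstraction; it never sees vendor detail."""
    return [name for name, r in routers.items() if not r.status()["healthy"]]


fleet = {
    "edge-1": VendorARouter({"ifOperUp": 4, "alarm": 0}),
    "edge-2": VendorBRouter({"links": ["up", "down", "up"], "state": "DEGRADED"}),
}
print(unhealthy(fleet))
```

Adding an eighth router type in this sketch means writing one more plugin class; the unhealthy() function does not change.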

We can take this to the next level now.  Suppose we have, instead of a physical router model, a hosted instance of a router.  We can now frame that router instance to work with our plugin, and it now is managed like any real router.  However, we have to deploy it, so what we really need is a management tool that recognizes all of the phases of deployment and management needed for both routers and router instances.  Such a tool, for example, might have a “Deploy” command to host a router in a given location, and that would be ignored for a physical device already there.
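A rough sketch of that “Deploy” behavior, in Python, might look like the following.  The class names and the “no-op” and “deployed” return values are my own assumptions, and the hosted case only flips a flag where a real tool would place and connect an instance.

```python
class PhysicalRouter:
    """A device that is already installed; there is nothing to host."""

    def __init__(self, site: str):
        self.site = site
        self.active = True

    def deploy(self) -> str:
        return "no-op"  # Deploy is accepted but ignored


class HostedRouter:
    """A router instance that must be placed on a server before use."""

    def __init__(self, site: str):
        self.site = site
        self.active = False

    def deploy(self) -> str:
        self.active = True  # stands in for the real hosting and connection steps
        return "deployed"


def bring_up(routers: list) -> list:
    """One lifecycle procedure for both kinds of router."""
    return [(r.site, r.deploy()) for r in routers]


result = bring_up([PhysicalRouter("NYC"), HostedRouter("LA")])
print(result)
```

The caller issues the same command everywhere and lets what sits below the abstraction decide whether there is any work to do.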

In these examples, we have two sets of “management behaviors”, one that’s above the abstraction and always operates on the virtual image that abstraction presents, and another below it that represents the real-world stuff.  In our router instance example, our below-the-line behaviors not only have to map real-to-abstraction management properties, they have to handle things that happen in the hosted-instance world that don’t happen in the real-router world, like deployment and redeployment and perhaps scaling.  Hosting of functions and application components is already based on abstraction, so we have in virtual routers an example of abstractions that contain lower-level abstractions.

We’re not done.  A collection of routers can be abstracted too, just as a single router can.  Suppose we collect the router abstractions in each of our metro areas into one abstraction per metro, as “NYC-Router”, “LA-Router” and so forth.  Since our routers are already abstracted, we have another example of an abstraction of abstractions, a hierarchy.

What’s the value of this kind of hierarchy?  Think of a service order process.  Somebody wants to buy an IP VPN that has terminations in a couple dozen cities.  There might be literally hundreds of cities in which the service could be provided, so do we build service descriptions for each possible combination of cities an order might include?  It would make more sense to say that an IP VPN was a “transit VPN” connected by metro router abstractions to each served city.  Now, an order is just a list of those metro areas, and when we select a metro area for service the abstraction for that area will “decompose” to deploy the service based on whatever routers, router instances, or even SD-WAN instances might be needed there.
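The order-as-a-list idea can be sketched in a few lines of Python.  The inventory table, the city names, and the step strings below are all hypothetical; a real model would decompose into deployment actions rather than text.

```python
# Hypothetical inventory: what each metro abstraction holds below the line.
METRO_INVENTORY = {
    "NYC": ["physical-router"],
    "LA": ["router-instance"],
    "Austin": ["sd-wan-instance"],
}


def decompose_metro(metro: str) -> list:
    """Metro abstraction: turn 'serve this city' into local deployment steps."""
    return [f"{metro}: configure {kind}" for kind in METRO_INVENTORY[metro]]


def decompose_vpn(order: list) -> list:
    """Service abstraction: a transit VPN plus one metro element per city."""
    steps = ["transit-vpn: provision core"]
    for metro in order:
        steps.extend(decompose_metro(metro))
    return steps


plan = decompose_vpn(["NYC", "Austin"])
for step in plan:
    print(step)
```

Nothing at the service level in this sketch knows whether a city is served by a physical router or an SD-WAN instance; that choice lives inside the metro abstraction.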

Speaking of deployments and operations tasks, one other thing this makes clear is that the nature of the hierarchy you build tends to separate high- and low-level operations tasks.  What happens at the service level has to be reflected in what’s visible at the service level, so if you let people order SD-WAN or MPLS VPN services explicitly, you probably need to have your metro and transit VPN elements divided by implementation.  If you don’t, it still could be possible to separate the options by pricing or other parametric data that you pass to each of the service-level objects, but having orderable items directly visible to service orchestration means that if they’re not uniformly available you can tell that at order time.

Each abstraction in a model is responsible for presenting a uniform view to what’s above and harmonizing everything that is or could be below.  If we assume a hierarchy like this, and if we assume intent-model principles apply to the definition of all the abstractions, then we can assume that the management of an abstraction could be based on general, model-centric principles as long as we’re dealing with an abstraction of abstractions.  When an abstraction decomposes into something other than a model set based on the same principles, we have to assume that management at that point has to be defined by the deployment process.  In other words, all common-structured abstraction models can be managed the same way, but when what’s inside a model uses different principles (or none at all, because it’s a direct interface to another management system) then different operations practices will prevail there, and deeper.
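One way to picture that boundary is the Python sketch below.  It assumes a single generic rule (an element is “ok” only if all of its children are) for every abstraction of abstractions, and a hypothetical probe callable wherever an element wraps another management system.  Both the rule and the names are my illustration, not a defined standard.

```python
class ModelElement:
    """An abstraction whose children follow the same modeling principles."""

    def __init__(self, name, children):
        self.name = name
        self.children = children

    def state(self) -> str:
        # Generic, model-centric rule: an element is only as good as its parts.
        states = [c.state() for c in self.children]
        return "ok" if all(s == "ok" for s in states) else "degraded"


class ExternalLeaf:
    """An element that wraps another management system (hypothetical probe)."""

    def __init__(self, name, probe):
        self.name = name
        self.probe = probe  # callable standing in for an EMS or cloud API query

    def state(self) -> str:
        return "ok" if self.probe() else "degraded"


service = ModelElement("vpn", [
    ModelElement("NYC-Router", [ExternalLeaf("nyc-ems", lambda: True)]),
    ModelElement("LA-Router", [ExternalLeaf("la-cloud", lambda: False)]),
])
print(service.state())
```

Everything above the leaves is managed by the same few lines; only the leaves need to know how their particular management system reports health.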

The property of having abstractions isolate management tools and practices is a blessing and a curse.  Where there are established processes for lifecycle management, as with cloud components, you can use those within the model elements that represent cloud-hosted components and use something else more general elsewhere.  The problem is that this can lead to anarchy in the use of management tools.  It would be better to use common tools where possible, and adopt specialized and local solutions only when common tools won’t serve.

But how far do you virtualize?  The answer, in the case of a hierarchy, is that it may not matter much.  Any set of objects at a given level represents a virtual infrastructure at that level.  Abstract networks and you have network operations, abstract devices and you have device operations or element management.  That’s what makes abstractions very UFO-like but also very useful, and it’s also where they can have unexpected impacts.

I’ve been playing with the abstraction issue in networking since 2008, and what I’ve found is that it’s possible to support abstractions at almost any level, as long as you pick a place where you have a significant operations investment you’re trying to protect.  That’s probably why the NFV ISG decided to do what’s effectively device-level abstraction and present virtual devices to legacy EMS/NMS and OSS/BSS systems.  But what makes this only “possible” as a strategy is the failure to adopt a modeling hierarchy that makes it inherently flexible, and then use that modeling (as the TMF proposed long ago) to steer events to operations processes.
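Steering events to operations processes through the model can be sketched as a state/event table attached to a model element.  The Python below is only my illustration of the idea: the states, events, and process names are invented, and it is not drawn from any TMF specification.

```python
def start_deploy(element: dict) -> str:
    return "deploying"  # a real process would launch deployment here


def mark_active(element: dict) -> str:
    return "active"


def start_repair(element: dict) -> str:
    return "repairing"  # a real process would redeploy or reroute here


# (current state, event) -> operations process to run
STATE_EVENT_TABLE = {
    ("ordered", "activate"): start_deploy,
    ("deploying", "deployed"): mark_active,
    ("active", "fault"): start_repair,
    ("repairing", "deployed"): mark_active,
}


def handle(element: dict, event: str) -> str:
    """Look up the process for this element's state and the incoming event."""
    process = STATE_EVENT_TABLE.get((element["state"], event))
    if process is not None:  # events with no table entry are ignored
        element["state"] = process(element)
    return element["state"]


metro = {"name": "NYC-Router", "state": "ordered"}
history = [handle(metro, e) for e in ["activate", "deployed", "fault", "deployed"]]
print(history)
```

Because the table belongs to the model element rather than to a particular tool, the same event can reach different processes at different levels of the hierarchy.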

The best approach, I think, is to understand that a continuum of hierarchical models, each representing an abstraction of the structure below, can be adapted to operations tools at any level.  I’ve called this “derived operations” because you derive an abstract picture of infrastructure by selecting a model segment where the tools match what the abstraction represents.  This is why modeling is so important to effective service operations automation, no matter what specific set of tasks you’re trying to automate.