Here’s What I Mean by Top-Down NFV

I’ve talked in previous blogs about the value of a top-down approach to things like NFV, and I don’t want to appear to be throwing stones without offering a constructive example.  What I therefore propose to do now is to look at NFV in a top-down way, the way I contend a software architect would naturally approach a project that in the end is a software design project.

Top-down starts with the driving benefits and goals.  The purpose of NFV, the goal, is to permit the substitution of hosted functionality on virtualized resources for the functionality of traditional network devices.  This substitution must lower costs and increase service agility, and so it must be suitable for automated deployment and support.  A software person would see this goal in four pieces.

First and foremost, we have to define the functional platform on which our hosted functionality will run.  I use the qualifier “functional” because it’s not necessary that virtual network functions (the hosted functionality in NFV terms) run on the same OS or physical hardware, but only that they have some specific functional resources that support them.

I contend that the goal of NFV can be achieved only if we can draw on the enormous reservoir of network features already available on popular platforms like Linux.  Therefore, I contend that the functional platform for VNFs has to be directed at replicating the connection and management framework that such a network feature would expect to have, and harnessing its capabilities to create services.

Second, we have to define a compositional abstraction that permits the creation of this functional platform.  A functional platform would be represented by a set of services offered to the VNFs, like the service of connectivity and the service of management.  These services have to be defined in abstract terms so that we can build them from whatever explicit resources we have on hand.  This is the approach taken by OpenStack’s Neutron, for example, and also by the OASIS TOSCA orchestration abstraction.

A compositional abstraction also represents what we expect the end service to be.  A “service” to the user is a black box with properties determined by its interfaces and behavior.  That’s the same thing that a service to a VNF would be, so the compositional abstraction process is both a creator of “services” and a consumer of its own abstractions at a lower level.

We host application or service components inside an envelope of connectivity, and so I think it’s obvious that we have to recognize that compositional abstractions have to include the network models that are actually used by applications today.  We build subnets, Ethernet VLANs, IP domains, and so forth, so we have to be able to define those models.  However, we shouldn’t limit the scope of our solution to the stuff we already have; a good abstraction strategy says that I could define a network model called WidgetsForward that has any forwarding properties and any addressing conventions I find useful, then map it to elements that will produce it.  A compositional abstraction, then, is a name that can be used to describe a service that’s involved in some way with a functional abstraction.
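To make the WidgetsForward idea a bit more concrete, here’s a minimal Python sketch of a compositional abstraction as a named model that gets mapped to whatever produces it.  Everything in it is my own illustration, not part of any NFV specification: `NetworkModel`, `ModelCatalog`, `generic_builder`, and the forwarding and addressing labels are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class NetworkModel:
    """A named abstraction: what the network does, not how it is built."""
    name: str
    forwarding: str   # hypothetical label for the forwarding behavior
    addressing: str   # hypothetical label for the addressing convention


# A builder maps a model plus its endpoints onto elements that produce it.
Builder = Callable[[NetworkModel, Iterable[str]], dict]


@dataclass
class ModelCatalog:
    """Looks up a model by name and hands it to its registered builder."""
    _entries: Dict[str, Tuple[NetworkModel, Builder]] = field(default_factory=dict)

    def register(self, model: NetworkModel, builder: Builder) -> None:
        self._entries[model.name] = (model, builder)

    def compose(self, name: str, endpoints: Iterable[str]) -> dict:
        model, builder = self._entries[name]
        return builder(model, endpoints)


def generic_builder(model: NetworkModel, endpoints: Iterable[str]) -> dict:
    # Stand-in for real element mapping; it only records what was asked for.
    return {
        "model": model.name,
        "forwarding": model.forwarding,
        "addressing": model.addressing,
        "members": sorted(endpoints),
    }


catalog = ModelCatalog()
# A familiar model and an invented one are registered the same way.
catalog.register(NetworkModel("Subnet", "l3-routed", "ipv4"), generic_builder)
catalog.register(
    NetworkModel("WidgetsForward", "widget-tag-match", "widget-id"),
    generic_builder,
)

service = catalog.compose("WidgetsForward", ["vnf-b", "vnf-a"])
```

The point of the sketch is that the composition step only ever names a model; a subnet and a made-up WidgetsForward network go through the same path, and only the builder behind the name knows how either is realized.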

The third thing we have to define is a resource abstraction.  We have resources available, like servers or VMs, and we need to be able to define them abstractly so we can manipulate them to create our compositional abstractions.  If we have a notion of DeployVNF, that functional abstraction will have to operate using whatever cloud hosting and connectivity facilities a particular cloud infrastructure makes available.  But we can’t let the specific capabilities of that infrastructure rise to the point where they’re visible to our composition process, or we’ll have to change our service compositions for every resource variation.

Here we have to watch out for specific traps, one of which is to focus on device-level modeling of resources as our first step.  I don’t have anything against YANG and NETCONF in their respective places, but I think that place is in defining how you do some abstract resource thing like BuildSubnet on a specific network.  You can’t let the “how” replace the “what”.  Another specific trap is presuming that everything is virtual just because some stuff will be.  Real devices will have to be a part of any realistic service, likely for decades to come, and so the goal of resource abstraction is linked to the goal of functional abstraction in that what we create with VNFs has to look like what we’d create with legacy boxes.
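Here’s one way the “what” versus “how” split could look in code, as a rough Python sketch.  BuildSubnet is the abstract operation from the text; the classes `SubnetBuilder`, `LegacyRouterBuilder`, and `CloudNetworkBuilder`, and the function `compose_service`, are hypothetical names I’ve made up, and neither driver talks to a real device or cloud API.

```python
from abc import ABC, abstractmethod


class SubnetBuilder(ABC):
    """The "what": an abstract BuildSubnet that composition depends on."""

    @abstractmethod
    def build_subnet(self, name: str, cidr: str) -> dict:
        ...


class LegacyRouterBuilder(SubnetBuilder):
    """One "how": a real device. Device-level modeling would live here."""

    def build_subnet(self, name: str, cidr: str) -> dict:
        # A real driver might push device configuration at this point;
        # this stand-in only records the request.
        return {"name": name, "cidr": cidr, "realized_by": "legacy-router"}


class CloudNetworkBuilder(SubnetBuilder):
    """Another "how": a virtual network on some cloud infrastructure."""

    def build_subnet(self, name: str, cidr: str) -> dict:
        # A real driver would call the cloud's own networking service.
        return {"name": name, "cidr": cidr, "realized_by": "cloud-network"}


def compose_service(builder: SubnetBuilder) -> dict:
    # Composition sees only the abstract operation, never the driver type.
    return builder.build_subnet("tenant-a-net", "10.0.0.0/24")


on_legacy = compose_service(LegacyRouterBuilder())
on_cloud = compose_service(CloudNetworkBuilder())
```

Under these assumptions the composition logic is identical for the legacy box and the virtual resource, which is the property the resource abstraction is supposed to protect.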

The final thing you need is a management abstraction.  We’re forgetting, in many NFV implementations, something that operators learned years ago with router networks.  Any time you have shared resources, you have to acknowledge that service management and resource management are not the same thing.  Composing services based in whole or in part on virtual resources is only going to make this more complicated.  How we manage services we’ve composed without collaterally composing a management view is something I don’t understand, largely because I don’t believe it’s possible.

Management abstractions are critical to functional platforms because you have to be able to provide real devices and real software elements with the management connections they expect, just as you need to provide them with their real inter-component or user connections.  But the connection between a VNF or router and its management framework has to be consistent with the security and stability needs of a multi-tenant infrastructure, which is what we’ll have.
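A small Python sketch may help show what I mean by composing a management view alongside the service.  It assumes a simple model in which each shared resource records which tenants use it; `ResourceStatus`, `ManagementView`, and the resource and tenant identifiers are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceStatus:
    """Resource management view: one record per (possibly shared) resource."""
    resource_id: str
    healthy: bool
    tenants: frozenset  # every tenant whose services use this resource


class ManagementView:
    """Derives a per-service status without exposing raw resource state."""

    def __init__(self, statuses):
        self._by_id = {s.resource_id: s for s in statuses}

    def service_status(self, tenant: str, resource_ids) -> dict:
        states = []
        for rid in resource_ids:
            status = self._by_id[rid]
            # Multi-tenant safety: refuse resources this tenant does not use.
            if tenant not in status.tenants:
                raise PermissionError(f"{tenant} has no view of {rid}")
            states.append(status.healthy)
        return {"tenant": tenant, "state": "up" if all(states) else "degraded"}


view = ManagementView([
    ResourceStatus("vm-1", True, frozenset({"tenant-a"})),
    ResourceStatus("vswitch-1", True, frozenset({"tenant-a", "tenant-b"})),
    ResourceStatus("vm-2", False, frozenset({"tenant-b"})),
])

status_a = view.service_status("tenant-a", ["vm-1", "vswitch-1"])
status_b = view.service_status("tenant-b", ["vm-2", "vswitch-1"])
```

Both tenants share `vswitch-1`, but each sees only a service-level answer for its own composition, and a request that reaches into another tenant’s resource is rejected rather than answered.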

If you look at just this high-level view, you can see that the thing we’re missing in most discussions about NFV is the high-level abstractions.  We have come close to making a mistake that should be obvious even semantically.  “Virtualization” is the process of turning abstractions into instantiations, and yet we’re skipping the specific abstractions.  I contend that we have to fix that problem, and fix it decisively, to make NFV work.  I contend that we’ve not fixed it with the ISG E2E conception as yet, nor have we defined fixing it as a goal of OPNFV.  This isn’t rocket science, folks.  There’s no excuse for not getting it right.