Achieving Openness in NFV

What operators want most from their next-gen infrastructure (whether SDN or NFV or the cloud) is openness.  They feel, with some justification, that equipment vendors want to lock them in and force them down migration paths that help the vendor and compromise operator goals.  Networks in the past, built on long-established standards that defined the role of devices, were considered fairly open.  It’s far from clear that goal can be achieved with the next generation.

You can make anything “open” if you’re prepared to spend a boatload of cash on professional services and suffer long delays when you commission a piece of hardware or software.  That all flies in the face of a transformation that’s supposed to be driven by efficiency and agility.  What you need is for pieces to fit because they were designed to fit, and specifications that ensure the fit you designed for is actually realized.

Everyone I’ve talked to in the operator community (CxOs and “literati” alike) believes that an open environment has to be based on three things.  First, an architecture model that makes the business case for the migration.  Second, a series of functional models that define open elements that can then be made interchangeable by vendors/operators through “onboarding”.  Finally, validation through testing, plug fests, etc.

The problem we have today in realizing openness is that we have neither of the first two of these, and without them there’s little value in validating an approach because there’s no useful standard to validate against.  There doesn’t seem to be much of a chance that a standards group or even an open-source activity is going to develop either of the missing pieces.  Vendors, even the half-dozen who actually have a complete model, don’t seem to be promoting their architectures effectively, so what we’re now seeing is a set of operator-driven architecture initiatives that might result in a converging set of models, or might not.  Fortunately, we can learn something from them, and in particular learn why that second requirement for openness is so critical.

“Open,” in IT and networking, means “admitting to the substitution of components without impacting the functionality of the whole.”  That almost demands a series of abstractions that represent classes of components, and a requirement that any component or set of components representing such a class be interchangeable with any other within that class.  I think that this divides what’s come to be called “orchestration” and “modeling” into two distinct areas.  One area builds services from these functional models or component classes, and the other implements the classes based on any useful collection of technology.

Let’s return now to the bidirectional view of these functional models.  Above, you recall, they’re assembled to create services that meet an operator’s business needs.  Below, they’re decomposed into infrastructure-specific implementations.  With this approach, a service that’s defined as a set of functions (“network functions” perhaps in NFV terms) could be deployed on anything that could properly decompose those functions.  If infrastructure changes, a change to the lower-layer decomposition would update the service—no changes would be needed at the service level.
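To make the two directions concrete, here is a minimal Python sketch.  It is only an illustration under my own assumptions: the names `Decomposer`, `RouterFirewall`, `VnfFirewall`, `Service`, and `deploy` are hypothetical and don’t come from any standard or product.  The service is defined purely as a list of function classes, and the catalog of decomposers stands in for the infrastructure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class Decomposer(ABC):
    """Lower-layer implementation of one function class (hypothetical interface)."""

    @abstractmethod
    def decompose(self, params):
        """Return a list of infrastructure-specific deployment steps."""


class RouterFirewall(Decomposer):
    """Realizes the "Firewall" function on legacy routers (illustrative only)."""

    def decompose(self, params):
        return [f"configure ACL on router for {params['site']}"]


class VnfFirewall(Decomposer):
    """Realizes the same function as a hosted virtual function (illustrative only)."""

    def decompose(self, params):
        return [f"deploy firewall VNF for {params['site']}",
                f"connect firewall VNF to {params['site']}"]


@dataclass
class Service:
    """Service level: nothing but (function class, parameters) pairs."""
    name: str
    functions: list = field(default_factory=list)


def deploy(service, catalog):
    """Decompose every function using whatever the catalog maps its class to."""
    steps = []
    for function_class, params in service.functions:
        steps.extend(catalog[function_class].decompose(params))
    return steps


service = Service("branch-security", [("Firewall", {"site": "Austin"})])
legacy_steps = deploy(service, {"Firewall": RouterFirewall()})
nfv_steps = deploy(service, {"Firewall": VnfFirewall()})
```

The `service` object is identical in both calls; only the catalog entry changes, which is the sense in which an infrastructure change stays below the service level.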

The service structure could be defined using TOSCA, where my functions are analogous to high-level application descriptions.  It could also be defined using the TMF’s SID, where my network functions would be analogous to either customer-facing or resource-facing services.  That means it should be largely accommodating to OSS/BSS as long as we frame the role of OSS/BSS to be the management of the CFS and RFS and not of “virtual devices” or real ones.

Decomposing a function requires a bit more attention.  Networks and services are often multi-domain or multi-jurisdictional.  That means that the first step in decomposing a function is to make a jurisdictional separation, and that’s complicated, so let’s use a VPN as an example.

Let’s say I have a North American VPN that’s supported by AT&T in the US, Bell Canada in Canada, and TelMex in Mexico.  My first-level decomposition would be to define three administrative VPNs, one for each area, and assign sites to each based on geography.  I’d then define the interconnection among providers, either as a gateway point they had in common or a series thereof.  In the complex case I’d have six definitions (three area VPNs and three gateways), and these are then network functions too.
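The first-level split can be sketched in a few lines of Python.  This is a rough illustration rather than anybody’s actual tooling: the site names, the country codes, and the `decompose_vpn` helper are all assumptions I’ve made up for the example, and it models only the “complex case” of one gateway per pair of operators.

```python
from itertools import combinations

# Assumed mapping of country code to the operator named in the example above.
OPERATOR_BY_COUNTRY = {"US": "AT&T", "CA": "Bell Canada", "MX": "TelMex"}


def decompose_vpn(sites):
    """Split a VPN into per-operator VPNs plus pairwise gateway definitions.

    `sites` maps a (hypothetical) site name to a country code.
    """
    area_vpns = {}
    for site, country in sites.items():
        operator = OPERATOR_BY_COUNTRY[country]
        area_vpns.setdefault(operator, []).append(site)

    # Complex case: one gateway definition for each pair of operators.
    gateways = list(combinations(sorted(area_vpns), 2))
    return area_vpns, gateways


sites = {"Dallas": "US", "Chicago": "US", "Toronto": "CA", "Monterrey": "MX"}
area_vpns, gateways = decompose_vpn(sites)
```

With three operators there are three pairs, so the result holds three area VPNs and three gateways, the six definitions counted above.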

For each of these network functions, I’d then decompose further.  If a given operator had a single management API from which all the endpoints in their geography could be provisioned, I’d simply exercise that API.  If there were multiple domains, technology or otherwise, inside one of these second-level functions, I’d then have to decompose first to identify the proper domain(s) and then decompose within each to deployment instructions.
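That further decomposition is naturally recursive, which a short Python sketch can show.  Everything named here is hypothetical (`FunctionNode`, the domain names, the API labels); the only idea taken from the text is that a node either has a single management API or has to be broken into domains first.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionNode:
    """One node in a layered decomposition (all names are illustrative)."""
    name: str
    api: str = ""  # set when a single management API covers the whole node
    children: list = field(default_factory=list)  # domains to decompose further


def deployment_instructions(node):
    """Walk the model depth-first and return one instruction per leaf domain."""
    if node.api:
        return [f"call {node.api} for {node.name}"]
    instructions = []
    for child in node.children:
        instructions.extend(deployment_instructions(child))
    return instructions


# One operator has two internal technology domains, the other a single API.
us_vpn = FunctionNode("US-VPN", children=[
    FunctionNode("US-MPLS-core", api="mpls-manager"),
    FunctionNode("US-metro-Ethernet", api="metro-ems"),
])
canada_vpn = FunctionNode("Canada-VPN", api="single-operator-api")
service = FunctionNode("NorthAmericaVPN", children=[us_vpn, canada_vpn])

plan = deployment_instructions(service)
```

Nothing in the walk limits the depth, so the same structure would cover a model with any number of layers.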

This description exposes three points.  First, there’s a fuzzy zone of network function decomposition between the top “function” level and the decomposition into resource-specific deployment instructions.  Is my administrative separation, for example, a “service” function or a “resource” function?  It could be either or both.  Second, it’s not particularly easy to map this kind of layered decomposition to the ETSI processes or even to traditional SDN.  Third, the operator architectures, like AT&T’s and in particular Verizon’s, call out this middle layer of decomposition but treat it as a “model” and not specifically as a potentially n-layer model structure.

All of which says that we’re not there yet, but it gets a bit easier if we look at this now from the bottom or resource side.

A network’s goal is to provide a set of services.  In a virtual, complex infrastructure these resource-side services are not the same as the retail services; think of the TMF Resource-Facing Services as an example.  I’ve called these intrinsic cooperative network-layer structures “behaviors” because they’re how the infrastructure behaves intrinsically or as you’ve set it up.  SDN, NFV, and legacy management APIs all create behaviors, and behaviors are then composed upward into network functions (and, of course, network functions are decomposed downward into behaviors).

Put this way, you can see that, for example, I could get a “VPN” behavior in one of three ways: as a management-driven cooperative behavior of a system of routers, as an explicit deployment of forwarding paths via an SDN controller, or by deploying the associated virtual functions with NFV.  In fact, my middle option could subdivide: OpenDaylight could control a white-box OpenFlow switch or a traditional router via the proper “southbound API”.

The point here is that open implementations of network functions depend on connecting the functions to a set of behaviors that are exposed from the infrastructure below.  To the extent that functions can be standardized by some body (like the OMG) using intent-model principles, you could then assemble and disassemble them as described here.  If we could also define “behaviors” as standard classes, we could carry that assembly/decomposition down a layer.

For example, a behavior called “HostVNF” might represent the ability to deploy a VNF in a virtual machine or container and provide the necessary local connections.  That behavior could be a part of any higher-layer behavior that’s composed into a service—“Firewall” or even “VPN”.  Anything that can provide HostVNF can host any VNF in the catalog, let’s say.
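As a rough sketch of that idea in Python, assume HostVNF boils down to a single `host_vnf` operation.  That is my simplification, not a defined standard, and the class names, image names, and catalog below are invented for illustration.

```python
class VmHost:
    """Offers HostVNF by placing the function in a virtual machine (illustrative)."""

    def host_vnf(self, image):
        return f"vm:{image}"


class ContainerHost:
    """Offers HostVNF by placing the function in a container (illustrative)."""

    def host_vnf(self, image):
        return f"container:{image}"


# Hypothetical catalog: higher-layer behavior -> VNF image that realizes it.
VNF_CATALOG = {"Firewall": "fw-image", "VPN": "vrouter-image"}


def deploy_behavior(behavior, host):
    """Compose a higher-layer behavior on anything that offers HostVNF."""
    if not hasattr(host, "host_vnf"):
        raise TypeError("infrastructure does not provide HostVNF")
    return host.host_vnf(VNF_CATALOG[behavior])


# Every catalog entry can land on every HostVNF provider.
placements = [deploy_behavior(behavior, host)
              for behavior in VNF_CATALOG
              for host in (VmHost(), ContainerHost())]
```

The catalog and the hosting providers never reference each other directly; HostVNF is the only thing they share.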

The notion of functional behaviors is the foundation for the notion of an open VNF framework too.  All virtual network functions, grouped into “network function types”, would be interchangeable if all of them were required to implement the common model of the function type they represented.  It would be the responsibility of the VNF provider or the NFV software framework provider to offer the tools that would support this, which is a topic I’ll address more in a later blog.
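One way to picture the onboarding side of this is a conformance check against the function type’s common model.  The Python below is a sketch under loud assumptions: the operation names in `FUNCTION_TYPES` and the two vendor classes are invented, and a real common model would cover far more than method names.

```python
# Hypothetical common model: the operations every "Firewall" VNF must offer.
FUNCTION_TYPES = {"Firewall": {"add_rule", "remove_rule", "status"}}


class VendorAFirewall:
    """Implements the whole assumed Firewall model."""
    function_type = "Firewall"

    def add_rule(self, rule): ...

    def remove_rule(self, rule): ...

    def status(self):
        return "up"


class VendorBFirewall:
    """Omits remove_rule, so it is not interchangeable with VendorAFirewall."""
    function_type = "Firewall"

    def add_rule(self, rule): ...

    def status(self):
        return "up"


def missing_operations(vnf_class):
    """Return the operations of the declared function type the class lacks."""
    required = FUNCTION_TYPES[vnf_class.function_type]
    return sorted(op for op in required
                  if not callable(getattr(vnf_class, op, None)))


vendor_a_gaps = missing_operations(VendorAFirewall)
vendor_b_gaps = missing_operations(VendorBFirewall)
```

A VNF with an empty gap list could be substituted for any other of its type; one with gaps would need an adapter from its provider before it could be onboarded.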

Openness at the infrastructure level, the equipment level, is the most critical openness requirement for NFV for the simple reason that this is where most of the money will get spent as well as where most of the undepreciated assets are found today.  We can secure that level of openness without sacrificing either efficiency or agility, simply by extending what we already know from networking, IT, and the cloud.