Is the CNTT Backing Into a Useful Vision of Telecom Virtualization?

I did not attend the meeting of the Common NFVI Telco Task Force (CNTT), but I spent a good bit of yesterday sorting through emails from those who did.  Without revealing any confidences, it’s pretty clear to me that, like many telecom virtualization debates, the CNTT discussion demonstrates that we are trying to impose a low-level approach on something when we haven’t agreed on the high-level approach, or perhaps even on the thing we’re approaching.

To start with, most of you will recognize that I’ve been blogging about both the CNTT and the issues that are being raised by the first meeting.  I didn’t talk specifically about the CNTT model because it wasn’t public until this meeting, and I don’t use private information as the basis for blogs.  However, my own views, as contributed to clients and others, are mine to share and I’ve been doing that.  Some of what’s covered here reprises those views, adding in the specific CNTT context now publicly available.

So let’s get going.  One operator contact of mine described the discussion as the “battle of the box natives versus the cloud natives”.  NFV, according to both this operator and my own experience in the ISG, focused itself from the first on creating virtual analogs for physical devices.  That approach was perhaps reasonable given the fact that the original “Call for Action” industry paper was focused on replacing appliances with hosted virtual functions.  The discussion of a common NFVI is hardly the place to debate whether this was a smart approach, but because it’s the thing we’re debating now, it’s the place where all discussions on how NFV is going necessarily have to find a home.

The “cloud natives” part of this reflects the truth that a small but growing number of operators think that the original approach of NFV was wrong, and that virtual functions should really be something more “functional” than “device” focused.  This would take a cloud-native approach to implementing the functions, and that in turn would strongly suggest (if not mandate) a cloud-centric model for hosting and deployment.

According to my operator friends, this box-native-versus-cloud-native debate is embodied in a basic point, which is the contrast between “virtual network functions” (VNFs) and “cloud native functions” (CNFs).  VNFs are what NFV is explicitly about, and thus what we’d expect NFVI to be hosting.  A CNF, according to a rather large and actually nice presentation of the point by Cisco, is actually a cloud component that happens to be pressed into a networking mission.

The CNTT meeting wasn’t supposed to be addressing this issue set at all; it was aimed at creating a specific set of NFVI missions, to which specific classes of hosting resources could be assigned.  However, one step in doing that was to define what VNF missions were out there, and another to use those missions and their hosting requirements to size the resource classes in NFVI’s pool.  It was this task, according to many operators, that exposed some interesting information.

If you look at the CNTT paper, you find a description of VNF Requirements and Analysis where it says that the list of “network functions” it provides covers almost 95% of the telco workload.  What’s interesting about the list is that almost nothing on it has anything to do with virtual CPE or service chaining, and in fact almost every item is not a per-subscriber deployment but rather a deployment of a functional element shared at the service level across service customers.

It’s pretty obvious that we’re seeing a kind of emergence here, a vision (even if it’s only an implicit vision) of the CNF (which some operators say stands for “Cloud-Native Function” and some say for “Cloud-Native Network Function”) as a specific service element.  The emergence is driven by the obvious fact that most of the VNF missions don’t relate to most of the VNF work that’s been done.  Instead, they relate better to the CNF view.  The question that my operator contacts say is being raised in the CNTT is whether CNFs are a subset of, a supplement to, or a replacement for the VNF.  My view is that no matter which of these three is intended, the fact is that CNFs will replace VNFs.

The great majority of the listed features fall into the “control plane” or “management plane” categories, as I’ve discussed them in the past.  CNTT uses this same separation, and it’s pretty clear that functions of this type that are expected to be persistent and support multiple users (callers, for example) in parallel are actually cloud applications and not components of user-specific services.  What CNTT’s categorization does is make it clear that the most useful “VNFs” are actually “CNFs”.

The bad news is that the CNTT doesn’t actually talk about CNFs, and this is where the operators I’ve talked with break from the sense of the meeting.  These operators believe that what the CNTT has done is create two distinct classes of virtual functions.  Functions in the VNF class are per-user in nature, focus on replacing a physical device with a virtual instance, and are primarily designed to be part of a chain of features in the data plane.  Functions in the CNF class are multi-user, cloud-scalable, and service-feature oriented.

This simpler classification is more useful than the complex tiers the CNTT came up with.  It reflects reality, which is a good starting point for any classification system.  It also admits to the role of “NFV” and VNFs while at the same time making it clear that most of the stuff operators actually intend to do with hosted functions isn’t NFV at all, and wouldn’t be served optimally by the application of NFV orchestration and management principles.

This classification would be helpful in specifying what you really need in NFVI too, whether it’s hosting VNFs or CNFs.  VNFs require data-plane optimization, and so would likely be implemented on virtual machines running on servers with optimized hardware for network throughput, hardware like custom NICs.  CNFs are much more like generalized application components, designed as microservices and expected to scale dynamically under load to accommodate more (or fewer) users.  You need some degree of VNF-to-NFVI specialization, but I don’t think that most of the NFVI categories the CNTT has proposed really make sense for containerized microservices, meaning CNFs.
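The two-class split, and the hosting choice that follows from it, can be pictured with a small sketch.  This is only an illustration under stated assumptions: the FunctionProfile type, its two flags, the sample function names, and the rule that anything not both per-user and data-plane counts as a CNF are hypothetical simplifications, not anything taken from the CNTT document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionProfile:
    """Hypothetical description of a hosted function (not a CNTT construct)."""
    name: str
    per_user: bool    # deployed per subscriber rather than shared across customers
    data_plane: bool  # sits in the forwarding path as part of a feature chain


def classify(profile: FunctionProfile) -> str:
    # Assumed rule: only per-user, data-plane functions count as VNFs;
    # everything else is treated as a CNF.
    return "VNF" if profile.per_user and profile.data_plane else "CNF"


# Hosting styles paraphrased from the text above.
HOSTING = {
    "VNF": "virtual machine on a server with throughput-optimized NICs",
    "CNF": "containerized microservice, scaled dynamically under load",
}


def hosting_for(profile: FunctionProfile) -> str:
    return HOSTING[classify(profile)]


# Illustrative, made-up examples of each class.
vcpe_firewall = FunctionProfile("vcpe-firewall", per_user=True, data_plane=True)
session_control = FunctionProfile("session-control", per_user=False, data_plane=False)

for profile in (vcpe_firewall, session_control):
    print(profile.name, "->", classify(profile), "->", hosting_for(profile))
```

The point of the sketch is that two questions about a function are enough to pick a hosting style, which is a far smaller decision space than a multi-tier NFVI catalog.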

CNFs have become the conceptual framework for virtual functions built on cloud principles rather than on NFV principles.  They run in containers, they’re orchestrated by Kubernetes, and they’re designed to be fully scalable.  When the literati among the telecom industry’s cloud-native advocates use the “cloud-native” term, it’s CNFs they’re really describing.

Which is why CNFs will ultimately eat everything else, meaning NFV and VNFs as formally described.  VNFs and NFV’s software framework are the wrong way to do virtual functions overall, and at best only an OK way to do the limited data-plane-centric subset of virtual functions the ISG focused on.  However, you can do almost every VNF mission using CNFs, and it’s clear from the CNTT’s own VNF classification that most VNF missions aren’t even in the VNF wheelhouse to start with.

One could use the CNTT document as the basis for advocating an effort to converge the VNF concept with the CNF concept, but I’m not going to propose that for the simple reason that it would likely create a couple of years of infighting.  The better path would be to let nature take its course; the strategy that can be implemented best and fastest will win, and CNFs are surely that.  That operators raised important questions in a comparatively unimportant forum shows that at least some are on the right track.

The CNTT is lost between two worlds, as my operator friend suggested.  It has one foot in the box and the other in the cloud, and nobody has legs long enough to span that distance without losing their balance or splitting themselves down the middle.  I think the industry should be pleased that something useful has been raised in the CNTT debates, even if the thing raised was inappropriate to the group, and even if it’s questionable whether the group was a useful place to raise any high-level issues at all.  Count your blessings, NFV; there are precious few to enumerate.

It’s hard for an industry to toss five years of work, and five years of not-insignificant spending, and admit it did the wrong thing.  It’s much easier to simply do the right thing under a different name, and perhaps eventually merge the two terms to cover the retreat from past mistakes.  That’s what I believe is happening, starting now, starting with the CNTT efforts.

What’s a good place to start?  There are three critical steps that, if not taken, will hamper everything else.  The first is that the industry needs to adopt the model-driven event-to-process steering that the TMF proposed with NGOSS Contract over a decade ago.  The second is that we need to address data-plane handling in CNFs, meaning in containers.  The third is that we need a model that lets us encapsulate current infrastructure and management APIs in the same model structure as we use to define service deployments in CNF/VNF form.  Missing any of these things will prevent us from fully realizing service lifecycle automation.  If we miss them all…it could be ugly.
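For readers who haven’t run into the first of those steps, event-to-process steering can be pictured as a state/event table carried with each modeled service element: an incoming event is looked up against the element’s current state, and the table names the process to run.  The sketch below is a minimal illustration of that idea only.  The state names, event names, and handler functions are all hypothetical, and none of it is taken from the TMF NGOSS Contract material or from the CNTT document.

```python
from typing import Callable, Dict, Tuple

# A process receives the model element (a plain dict here) and returns
# the state the element should move to.
Process = Callable[[dict], str]


def start_deploy(element: dict) -> str:
    # Hypothetical handler: kick off deployment of the element.
    return "deploying"


def mark_active(element: dict) -> str:
    # Hypothetical handler: deployment finished, element is in service.
    return "active"


def start_redeploy(element: dict) -> str:
    # Hypothetical handler: a fault was reported, so count it and redeploy.
    element["retries"] = element.get("retries", 0) + 1
    return "deploying"


# The steering table: (current state, event) -> process to run.
# In a model-driven design this table would live in the service model,
# not in code; it is inlined here to keep the sketch self-contained.
STEERING: Dict[Tuple[str, str], Process] = {
    ("ordered", "activate"): start_deploy,
    ("deploying", "deploy_ok"): mark_active,
    ("active", "fault"): start_redeploy,
}


def handle_event(element: dict, event: str) -> str:
    """Steer one event to a process based on the element's current state."""
    process = STEERING.get((element["state"], event))
    if process is None:
        # The event has no meaning in this state, so the state is unchanged.
        return element["state"]
    element["state"] = process(element)
    return element["state"]


service = {"name": "example-service", "state": "ordered"}
for event in ("activate", "deploy_ok", "fault"):
    print(event, "->", handle_event(service, event))
```

Because the processes keep nothing between calls and take everything they need from the model element, any instance of a process could handle any event, which is what would let this kind of lifecycle handling scale the way cloud applications do.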