Which of the Many NFVs are Important?

Sometimes words trip us up, and that’s particularly true in tech these days.  Say we start with a new term, like NFV.  It has a specific technical meaning, but we have an industry-wide tendency to overhype things in their early stages, and vendors jump onto the concept with offerings and announcements that really aren’t strongly related to the original.  Over time, it gets harder to know what’s actually going on with that original concept, and the question arises whether the “original concept” is really important, or whether we should accept the dynamic process of the market as the relevant source of the definition.  So it is with NFV.

The specific technical meaning of NFV would be “the implementation of virtual function hosting in conformance with the ETSI NFV ISG specifications.”  Under this narrow definition there is relatively little deployment and, frankly, IMHO little opportunity, but there are some important forks in the road that are already established and will probably be very important.  In fact, NFV will leave a mark on networking forever through one or more of these forks in the evolutionary pathway, and that would be true even if there were never a single fully ETSI-compliant deployment.

One fork is the “agile CPE” fork.  This emerged from the NFV notion of “virtual CPE”, which initially targeted cloud-hosted instances of virtual functions to replace premises-based appliances.  You could frame a virtual premises device around any set of features that were useful, and change them at will.  Vendors quickly realized two things.  First, you needed to have something on premises just to terminate the service.  Second, there were sound financial reasons to do the virtual hosting in an on-premises device, especially given that first point.

The result, which I’ll call “aCPE”, is a white-box appliance designed to accept loaded features.  These features may be “VNFs” in the ETSI sense, or they may simply be something that can be loaded easily, perhaps following a cloud or container model.  aCPE using a simple feature-loading concept could easily be a first step in deploying vCPE; you could upgrade to the ETSI spec as it matured.  The aCPE-to-vCPE transformation would also prepare you to use the cloud instead of that agile device, or a hybrid of the two.
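
To make the feature-loading idea concrete, here’s a minimal sketch in Python.  It’s purely illustrative; the class and method names are my own invention, not any vendor’s interface or anything from the ETSI specs.

```python
# A minimal sketch of aCPE "feature loading": the premises box exposes a
# small load/unload interface, and each feature is just a packaged image
# (container-style) plus the parameters it needs.  All names are hypothetical.

from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str              # e.g. "firewall", "sd-wan", "dns-cache"
    image: str             # container-style image reference
    params: dict = field(default_factory=dict)


class AgileCPE:
    """A white-box premises device that accepts loadable features."""

    def __init__(self, device_id: str):
        self.device_id = device_id
        self.loaded: dict[str, Feature] = {}

    def load(self, feature: Feature) -> None:
        # A real device would pull the image and start it locally; here we
        # just record it to show the feature lifecycle.
        self.loaded[feature.name] = feature
        print(f"{self.device_id}: loaded {feature.name} from {feature.image}")

    def unload(self, name: str) -> None:
        self.loaded.pop(name, None)
        print(f"{self.device_id}: unloaded {name}")


if __name__ == "__main__":
    box = AgileCPE("site-42-cpe")
    box.load(Feature("firewall", "registry.example/fw:1.2"))
    box.load(Feature("sd-wan", "registry.example/sdwan:3.0", {"uplink": "fiber"}))
    box.unload("firewall")   # features change at will, without a truck roll
```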

Most of what we call “NFV” today is a form of aCPE.  Since it would be fairly wasteful to demand all of the cloud-hosted elements, including “service chaining,” when all your functions are in the same physical box, most of it doesn’t conform to the ETSI spec.  I suspect that over time it might, provided that a base of fully portable VNFs emerges from the ongoing ETSI activity.

Another form is the “virtual device instance”, which I’ll call vDI.  Virtual functions are features that are presumably somewhat dynamic.  The vDI is a replacement for a network device, not an edge feature, and so it’s much more like a hosted instance of device software.  A “virtual router” is usually a vDI, because it’s taking the place of a physical router, and once it’s instantiated it behaves pretty much like the physical equivalent.

Perhaps the most significant attribute of the vDI is that it’s multi-service and multi-tenant.  You don’t spin up a set of Internet routers for every Internet user; you share them.  The same is true of vDIs that replace the real routers.  There are massive differences between the NFV model of service-and-tenant-specific function instantiation and a multi-tenant vDI model, and you can’t apply service-specific processes to multi-tenant applications unless you really do want to build a separate Internet for everyone.
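
To illustrate why service-specific processes don’t translate to the multi-tenant case, here’s a toy contrast between the two models.  Again, every name is hypothetical and the scale is arbitrary.

```python
# NFV/vCPE model: each service order spins up its own function instance.
# vDI model: one shared instance carries many tenants as configuration.

class PerTenantFunction:
    """NFV-style: one instance per customer/service."""
    def __init__(self, tenant: str):
        self.tenant = tenant


class MultiTenantRouter:
    """vDI-style: one shared instance; tenants are just table entries."""
    def __init__(self):
        self.tenants: set[str] = set()

    def add_tenant(self, tenant: str) -> None:
        self.tenants.add(tenant)


# NFV model: 10,000 customers means 10,000 deployments to orchestrate.
nfv_instances = [PerTenantFunction(f"cust-{i}") for i in range(10_000)]

# vDI model: one deployment, 10,000 provisioning operations against it.
shared_router = MultiTenantRouter()
for i in range(10_000):
    shared_router.add_tenant(f"cust-{i}")

print(len(nfv_instances), "per-tenant instances versus 1 shared instance with",
      len(shared_router.tenants), "tenants")
```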

Issues notwithstanding, we’re starting to see some activity in the vDI space, after a number of tentative beginnings.  Brocade’s Vyatta router software (since acquired by AT&T) was an early vDI, to some extent subsumed into the NFV story.  However, a vDI is really more like a “cloud router” than a virtual network function.  I believe that most of the IMS/EPC/5G software instances of functionality will really be vDIs because they’ll deploy in a static configuration.  In the 5G space, the NFV ISG seems to be taking up some multi-tenant issues in the context of its 5G network slicing work, but it’s too early to say what it will call for.

The real driver of vDI, and also perhaps a major driver for NFV, depends on reshaping the role of the lower OSI layers.  The original OSI model (the Basic Reference Model for Open Systems Interconnection) was created in an age where networking largely depended on modems, and totally depended on error-prone electrical paths.  Not surprisingly, it built reliable communications on a succession of layers that dealt with their own specific issues (link-level error control was Layer 2’s job, for example).  In TCP/IP and the Internet, a different approach was taken, one that presumed a lot of lower-level disorder.  Neither really fits a world of fiber and virtual paths.

If we were to create, using agile optics and virtual paths/wires, a resilient and flexible lower-layer architecture, then most of the conditions that we now handle at Levels 2 and 3 would never arise.  We could segment services and tenants below, too, and that combination would allow us to downsize the Level 2/3 functionality needed for a given user service, perhaps even for the Internet.  This would empower the vDI model.  Even a less-radical rethinking of VPN services as a combination of tunnel-based vCPE and network-resident routing instances could do the same, and if any of that happens we could have an opportunity explosion here.  If the applications were dynamic enough, we could even see an evolution from vDI to VNFs, and to NFV.

Another emerging version of NFV is what could be called “multi-layer orchestration”, which I’ll call “MLO” here.  NFV pioneered the notion of orchestrated software automation of the virtual function deployment lifecycle, which was essential if virtual network functions were to be manageable in the same way as physical network functions (PNFs).  However, putting VNFs on the same operational plane as PNFs doesn’t reduce opex, since the overall management and operations processes are (because the VNFs mimic the PNFs in management) the same.  The best you can hope for is to keep opex where it is.  To improve opex, you have to automate more of the service lifecycle than just the VNFs.

MLO is an add-on to NFV’s orchestration, an elevation of the principle of NFV MANO to the broader mission of lifecycle orchestration for everything.  A number of operators, notably AT&T with ECOMP, have accepted the idea that higher-layer operations orchestration is necessary.  Some vendors have created an orchestration model that works both for VNFs and PNFs, and others have continued to offer only limited-scope orchestration, relying on MLO features from somewhere else.  Some OSS/BSS vendors have some MLO capability too.
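
To show what “lifecycle orchestration for everything” might look like structurally, here’s a rough sketch of a single lifecycle interface that drives hosted and physical elements alike.  The names are invented for illustration; this isn’t ECOMP, ONAP, or any standard MANO interface.

```python
# A sketch of the MLO idea: one lifecycle abstraction that covers both
# VNFs and PNFs, so the service model doesn't care which it gets.

from abc import ABC, abstractmethod


class ManagedElement(ABC):
    @abstractmethod
    def deploy(self) -> None: ...

    @abstractmethod
    def heal(self) -> None: ...


class VNF(ManagedElement):
    def deploy(self) -> None:
        print("VNF: spin up a hosted instance on the carrier cloud")

    def heal(self) -> None:
        print("VNF: redeploy the instance somewhere else")


class PNF(ManagedElement):
    def deploy(self) -> None:
        print("PNF: push configuration to the installed physical device")

    def heal(self) -> None:
        print("PNF: run the alarm/dispatch workflow for the device")


def orchestrate(service: list[ManagedElement]) -> None:
    """One automated lifecycle process for the whole service, VNF or PNF alike."""
    for element in service:
        element.deploy()


orchestrate([PNF(), VNF(), VNF()])
```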

NFV plus MLO can make a business case.  MLO, even without NFV, could also make a business case and in fact deliver a considerably better ROI in the first three or four years.  Add that to the fact that there is no standard set of MLO capabilities and no common mechanism to coordinate between MLO and NFV MANO, and you have what’s obviously fertile ground for confusion and misinformation.  You also have a classic “tail-or-dog” dilemma.

NFV drove our current conception of orchestration and lifecycle management, but it didn’t drive it far enough, which is why we need MLO.  It’s also true that NFV is really a pathway to carrier cloud, not an end in itself, and that carrier cloud is likely to follow the path of public cloud services.  That path leads to event-driven systems, functional programming, serverless computing, and other stuff that’s totally outside the realm of NFV as we know it.  So, does NFV have to evolve under MLO and carrier cloud pressure, or do we need to rethink NFV in those terms?  Is virtual function deployment and lifecycle management simply a subset of MLO?  This may be something that the disorderly market process I opened with ends up deciding.

Perhaps it’s not a bad thing that we end up with an “NFV” that doesn’t relate all that much to the original concept.  Certainly, it would be better to have that than to have something that stuck to its technical roots and didn’t generate any market-moving utility.  I can’t help but think that it would still be better to have a formal process create the outcome we’re looking for, though.  I’m pretty sure it would be quicker, and less expensive.  Maybe we need to think about that for future tech revolutions.

Open source seems to be the driver of a lot of successes, and perhaps of a lot of the good things circulating around the various NFV definitions.  Might we, as an industry, want to think about what kind of formal process is needed to launch and guide open-source initiatives?  Just saying.