What Went Wrong with NFV: The Operator View

Most would agree NFV has not met expectations.  I’ve blogged about the things I believe are the issues, and the feedback I got from operators on those blogs has included operators’ own views on the subject.  They’re not always congruent with my own (in some cases they’re almost contrary), but they are always interesting, given that operators are the ones who will have to make NFV a success.

What is the number one problem with NFV?  The narrow first choice of operators who contacted me is that VNF licensing fees are much higher than expected, which means the cost of a service based on hosted virtual functions is higher, and operator profits are lower.

On the surface, you can see both sides of this issue.  On the one hand, operators expected that if device vendors spun out their software features as one or more virtual functions, the function licensing would be significantly cheaper than buying the boxes would have been.  That seems logical.  On the other hand, the device vendors say that the cost and value of their products are tied more to things like the R&D for the software and the support of software-hosted features than to the boxes, which in many cases are simply OEMed in some form.

Still, this is the operator complaint where I part company.  You don’t build a successful service model by setting the prices for others’ participation.  If operators thought that device vendors would make sweetheart deals for their functionality, they had absolutely no justification for that thinking.  From the very first, it’s been clear this would happen.  It was clear to anyone who thought about it that open-source software should be prioritized as a source of VNFs.  Such a move would give operators a guarantee of a lower cost point, and would also provide a free-market incentive for device vendors to manage their VNF costs.  The notion never really took off, but a major initiative to be more inclusive of open-source could still be mounted, and could still benefit NFV overall.

The close second in the area of operator problems with NFV is that onboarding VNFs has proven to be much more difficult than expected, amounting to a one-off for every VNF targeted.  This problem has many roots, which makes it more difficult to address at this late date.

In my response to the Industry Call for Action paper released in late 2012, I said that one of the critical points in NFV implementation was to define what I called an “NFV Platform-as-a-Service” (NFVPaaS), a framework of APIs into which all VNFs would integrate.  This move would enable operators to create, contract for, or require as a condition of use, a standardized “adapter” that would expose all control, parametric, and management APIs and data in a common way.  There have been a few attempts at this kind of approach, but the ETSI NFV ISG has not defined the NFVPaaS.
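To make the “adapter” idea concrete, here is a minimal sketch of what such a common framework might look like.  None of this comes from an ETSI spec: the `VnfAdapter` interface, the `VendorFirewall` stand-in, and every method name are hypothetical, chosen only to show how one onboarding routine could work against any VNF wrapped this way.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class VnfAdapter(ABC):
    """Hypothetical common interface that every VNF would be wrapped in."""

    @abstractmethod
    def control(self, action: str) -> None:
        """Lifecycle control, e.g. 'start' or 'stop'."""

    @abstractmethod
    def set_parameters(self, params: Dict[str, Any]) -> None:
        """Apply parametric (configuration) data."""

    @abstractmethod
    def get_management_data(self) -> Dict[str, Any]:
        """Return management data in a common, vendor-neutral shape."""


class VendorFirewall:
    """Stand-in for a vendor VNF with its own proprietary calls (assumed)."""

    def __init__(self) -> None:
        self.running = False
        self.rules: List[str] = []

    def boot(self) -> None:
        self.running = True

    def halt(self) -> None:
        self.running = False

    def load_rules(self, rules: List[str]) -> None:
        self.rules = list(rules)

    def stats(self) -> Dict[str, Any]:
        return {"up": self.running, "rule_count": len(self.rules)}


class VendorFirewallAdapter(VnfAdapter):
    """Maps the common interface onto the vendor-specific calls."""

    def __init__(self, vnf: VendorFirewall) -> None:
        self._vnf = vnf

    def control(self, action: str) -> None:
        if action == "start":
            self._vnf.boot()
        elif action == "stop":
            self._vnf.halt()
        else:
            raise ValueError(f"unsupported action: {action}")

    def set_parameters(self, params: Dict[str, Any]) -> None:
        self._vnf.load_rules(params.get("rules", []))

    def get_management_data(self) -> Dict[str, Any]:
        raw = self._vnf.stats()
        return {"operational": raw["up"], "config_items": raw["rule_count"]}


def onboard(adapter: VnfAdapter, params: Dict[str, Any]) -> Dict[str, Any]:
    """One onboarding routine that works for any adapter, not a one-off per VNF."""
    adapter.set_parameters(params)
    adapter.control("start")
    return adapter.get_management_data()


status = onboard(VendorFirewallAdapter(VendorFirewall()),
                 {"rules": ["allow 443", "deny all"]})
print(status)
```

The point of the sketch is that `onboard` never touches a vendor-specific call.  Adding a second vendor would mean writing one more adapter class, not another integration project.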

Part of the reason is that operators, in an effort to minimize their role in a consultative sale of VNF-based functionality, chose to emphasize well-known vendors, and to prioritize their integration.  The combination of “use what we know” and “do things fast” encouraged a customized approach, something that the early vendor-led proof-of-concept program accentuated.  With no common goal, these separate PoCs didn’t develop a common model for integration.

Issue number three from the operators was that most NFV applications are really not about “cloud-hosting” functions at all (as the Call for Action paper said NFV would be), but rather are simply a shift from a proprietary device to “universal CPE”.  This is true, but in my view it’s really the fault of operators in the ISG and not the specs or the vendors.

The real issue here is the box-centricity of operators themselves.  Operators, for a variety of (not very good) reasons, fixated on NFV as the substitution of virtual functions for physical devices on a 1:1 basis.  That was an early, conscious, decision made by committee heads who were uniformly operator personnel.  Part of the reason was that operators wanted to be able to do NFV management like they’ve always done management, meaning as an element-to-network-to-service progression.  The “elements” used to be physical devices, so they are now “virtual” devices.  Another reason is that operators built networks from boxes for literally decades and just couldn’t see any other way.  Look at “service chaining”, which implies the whole of VNF connection is the emulation of physical interfaces and cables between boxes.

A broader vision for NFV, including a vision where cloud-hosting of more general functions was the goal, would have exposed a bunch of shortcomings in management and service modeling, and dealing with them would have slowed progress.  To avoid those, the ISG elected to focus work where there were fewer exposed issues, which is how we got to uCPE.  Those issues have still not been addressed, and so it’s going to be a lot of work and take a lot of time to fix this one.

The next operator issue is, in my own view, related to the last one and even to some of the others as well.  NFV benefits have proven difficult to obtain at the pace and level needed to justify the investment in the new technology.  This is the problem that CFO teams tend to focus on, but of course NFV is still largely driven by the CTO organization, so it falls down the list in number of mentions.

By the fall of 2013, it was clear to nearly every operator that capex reduction, the justification for NFV cited in the 2012 white paper, would not be sufficient to drive NFV adoption on a large scale.  In a meeting I had with most of the operators who signed that white paper, the sentiment was “If we want a 25% reduction in capex, we’ll just beat Huawei up on price!”  Clearly, the group was recognizing that significant opex benefits were needed to augment the modest capex reductions expected.

As it turned out, even those modest capex reductions were problematic.  First, as already noted, VNF licensing costs were higher than expected.  Second, it’s more operationally complex to host a function and keep it running in the cloud than to have it live in a box somewhere, which means opex for VNFs might be higher, offsetting some capex reductions.  This truth was reflected at least a bit in the second industry paper, released later that fall.
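A back-of-the-envelope sketch shows how quickly the offset can bite.  Only the 25% capex figure comes from the operator comment above; the device cost, the licensing overrun, and the added hosting opex are made-up placeholders for illustration, not operator data, and `net_saving` is a hypothetical helper.

```python
def net_saving(device_cost: float,
               capex_reduction: float,
               license_overrun: float,
               added_opex: float) -> float:
    """Net saving per service instance when a device is replaced by a hosted VNF.

    device_cost      cost of the physical device being replaced
    capex_reduction  hoped-for capex saving as a fraction of device_cost
    license_overrun  VNF licensing cost above what was planned for
    added_opex       extra cost of hosting and operating the function
    """
    return device_cost * capex_reduction - license_overrun - added_opex


# Illustrative inputs only: a 1000-unit device, the 25% target,
# and assumed overruns of 150 (licensing) and 120 (opex).
result = net_saving(1000, 0.25, 150, 120)
print(result)  # -20.0, i.e. the "saving" has turned into a small loss
```

With these assumed inputs the expected saving of 250 is wiped out entirely, which is the shape of the problem operators described even if the real numbers differ.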

By that time, though, the ISG had agreed to make service lifecycle management out of scope for NFV efforts, hamstringing any innovative way of dealing with what was clearly more a cloud-related problem than a box-management problem.  The end-to-end (E2E) functional diagram of NFV’s architecture had also been approved, and while it was supposed to be a “functional” model, it was interpreted literally by almost every one of the open-source NFV software initiatives.

I found the last of the widely held negative impressions of operators on NFV the most interesting.  NFV is too complex.  Darn straight it is, because virtually every decision that could have been made to simplify it was made in another direction.  You start with a low-level architecture, and then you start testing it with use cases?  That’s what happened, and how likely is it that an architecture done without regard for what it was supposed to support would end up supporting what was needed?  So you glue on little pieces to fix tactically what you should have prevented strategically.

This is still the most important issue, though, because it’s one we’re still facing with zero-touch automation and ONAP, with 5G Core, and with future standards and specifications.  We’ve had clear evidence, clear even to operators, that the NFV process went awry.  I’m not harping on that to lay blame, but to prevent the same kind of process thinking from going wrong in the next thing we try to do.  You can’t retrofit past initiatives, but you can realign current ones.