Three Steps to Prove NFV is Justified

NFV is all about hosting virtual functions, and justifying it means showing that hosting functions is somehow better than creating network services by interconnecting fixed devices and custom appliances.  The question is whether that’s true, and it’s a question that’s becoming increasingly important to service provider executives who have to decide whether to take NFV beyond technical trials into some real fieldwork.

Three benefits have been touted for NFV.  First, that it reduces capex by substituting inexpensive software running on commodity hardware for more expensive custom network devices.  Second, that it improves service agility, meaning the ability of operators to quickly offer new services.  Third, that it improves overall operations efficiencies, lowering opex.  To prove NFV at the business level, you have to get enough of these benefits to drive the NFV ROI up above the targets set by carrier CFOs.  To see where we are, I’ll look at each of the benefits more closely, applying carrier data and my financial modeling.

The notion that NFV will reduce capital costs is generally valid where the legacy solution isn’t itself based on commodity hardware.  Where the legacy device is already cheap, the savings are thin; consumer broadband gateway devices, for example, cost so little that it’s hard for a hosted alternative to be much of an improvement.  The challenge is that if you go to applications with more expensive gadgets involved, you find fewer of the gadgets.

Capital cost savings also depend on achieving reasonable economy of scale, which means a data center large enough to host functionality past the tapering knee of the Erlang curve.  Operators know that this likely involves pre-positioning cloud assets when NFV deploys and hoping they can capture enough function hosting to justify the cost.  Thus, NFV generally creates a first-cost risk for operators.
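To make the economy-of-scale point concrete, here is a minimal sketch that assumes the “Erlang curve” can be approximated by the classical Erlang B loss model, with hosting slots standing in for trunks.  The 1% blocking target and the pool sizes are illustrative assumptions of mine, not operator data.  It shows how the utilization a resource pool can sustain climbs steeply for small pools and then flattens, which is the knee a data center has to get past.

```python
def erlang_b(servers, offered_load):
    """Blocking probability for `servers` slots at `offered_load` Erlangs.

    Uses the standard numerically stable recursion for Erlang B.
    """
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


def sustainable_utilization(servers, target_blocking=0.01):
    """Per-slot utilization achievable while staying at the blocking target.

    Bisects on offered load, since blocking rises monotonically with load.
    The 1% target is an assumed figure for illustration only.
    """
    low, high = 0.0, 2.0 * servers
    for _ in range(60):
        mid = (low + high) / 2.0
        if erlang_b(servers, mid) > target_blocking:
            high = mid
        else:
            low = mid
    carried = low * (1.0 - erlang_b(servers, low))
    return carried / servers


if __name__ == "__main__":
    # Hypothetical pool sizes; larger pools run "hotter" at the same service level.
    for size in (10, 100, 1000):
        print(size, round(sustainable_utilization(size), 3))
```

A small pool has to sit mostly idle to meet the same service level a large pool meets while nearly full, which is why a small initial NFV deployment carries the first-cost risk described above.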

The net here is that operator executives I’ve surveyed are not of the view that capex savings will be enough to drive NFV.  They point out that NFV alternatives to existing boxes are often more complex than single-box solutions, which means that even modest capex benefits could easily be wiped out by the added complexity and the operations cost increases that come with it.

The service agility angle for NFV is similarly complicated.  There are really two “agility dimensions” here with very different needs and impacts.  One is agility that improves time to revenue per customer and the other is agility that improves time to market per opportunity.

When a customer orders something, the presumption is that they’d like it right now.  If there’s a delay in fulfilling the order then the carrier loses the money it could have billed during that delay.  Operators report that on the average the time to revenue is a bit less than two weeks, so eliminating the delay would recover about a half-month’s revenue, a gain of about 4% on a year’s billing.  Nothing to sneeze at, right?

The problem is that this applies only to new orders, and it presumes instant provisioning via NFV.  Most operators report that less than 10% of their orders would qualify as “new”.  They also report that about half of those involve some physical provisioning step.  Overall they think that time to revenue would likely result in less than a half-percent gain.  Nothing to sneeze at, but not a big windfall.
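The arithmetic behind those figures can be sketched in a few lines.  This is my own back-of-the-envelope reading of the operator numbers, assuming the gain is measured against one year of billing; the function and parameter names are hypothetical.

```python
def time_to_revenue_gain(delay_months, new_order_share, instant_share):
    """Fraction of annual revenue recovered by eliminating the order delay.

    delay_months    -- average wait between order and billing, in months
    new_order_share -- fraction of orders that are genuinely new
    instant_share   -- fraction of those with no physical provisioning step
    """
    gain_per_order = delay_months / 12.0  # share of a year's billing lost to the wait
    return gain_per_order * new_order_share * instant_share


if __name__ == "__main__":
    # Best case: every order is new and can be provisioned instantly.
    best_case = time_to_revenue_gain(0.5, 1.0, 1.0)
    # Operator-reported limits: under 10% of orders are new, about half
    # of those still need a physical provisioning step.
    realistic = time_to_revenue_gain(0.5, 0.10, 0.50)
    print(f"best case: {best_case:.2%}, realistic ceiling: {realistic:.2%}")
```

Since operators say fewer than 10% of orders are new, the second figure is an upper bound, and it sits comfortably under the half-percent estimate.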

The time-to-market agility is a bit more interesting.  Right now, operators’ own estimates are that an OTT can introduce a new “overlay” service in about 60 days, and operators take about ten months for the same process, five times as long.  That might effectively take an operator completely out of the market for a given service; they’d be so far behind competition it would be difficult to recover.

NFV is credible as a time-to-market accelerator provided that NFV’s service creation process and service deployment process are very easy to drive and very responsive.  There have been no proof-of-concept trials that have demonstrated a full NFV lifecycle in what I believe is a credible way.  I believe that NFV can in fact generate benefits here, likely benefits significant enough to drive as much as a 5% revenue gain for operators, but I can’t guarantee that the tools to address this opportunity are out there.

That leaves the most complicated of the issues, operations efficiency.  If you read through the material produced by the ETSI NFV ISG, it’s hard to see an NFV service as anything other than highly complex.  Even in simple topology terms, a given function now performed by a single device might have to be decomposed into a chain of functions described in a forwarding graph, deployed on separate VMs and linked by network paths created at the time of deployment.

How complex the picture might be depends on the nature of the function and the placement of the components relative to the placement of the original device.  My example of the consumer broadband gateway is a good one.  I can stick a box in the customer prem and (according to operators) it’s likely to stay there for five to seven years.  It’s more likely to be returned because the customer moved than because it broke.  Operationally it presents no cost once it’s installed.  If I virtualize most of the functions, I can make the box a little cheaper, but I always need a box to terminate the service, and my virtual version of the features will have to be maintained operationally.  If I use COTS servers to replace any box, I have to look at the MTBF and MTTR of those servers and the connecting network elements, compared to the same data on the real box.  Sure I can use redundancy to make an NFV version of something more available, but I can’t do that without creating complexity.  Two components means a load-balancer.  What if that breaks?  You get the picture.
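The MTBF and MTTR comparison can be sketched with the textbook steady-state availability formula, MTBF / (MTBF + MTTR), multiplying availabilities for elements in series and multiplying unavailabilities for redundant elements.  Every figure below is a hypothetical placeholder I picked for illustration, not a measurement from any operator or vendor.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one element."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def in_series(*parts):
    """All elements must work: availabilities multiply."""
    result = 1.0
    for part in parts:
        result *= part
    return result


def in_parallel(*parts):
    """Any one element suffices: unavailabilities multiply."""
    unavailable = 1.0
    for part in parts:
        unavailable *= (1.0 - part)
    return 1.0 - unavailable


# Hypothetical figures, chosen only to show the shape of the comparison.
box = availability(100_000, 4)     # the single fixed device
server = availability(50_000, 4)   # one COTS server hosting one function
path = availability(200_000, 4)    # one network path between functions
lb = availability(100_000, 4)      # the load-balancer in front of a redundant pair

# One device decomposed into three chained functions on separate servers.
chain = in_series(server, path, server, path, server)

# Two such chains for redundancy, fronted by a load-balancer that can itself fail.
protected = in_series(lb, in_parallel(chain, chain))

if __name__ == "__main__":
    for name, value in (("box", box), ("chain", chain), ("protected", protected)):
        print(f"{name}: {value:.6f}")
```

With these assumed numbers the unprotected chain is less available than the box it replaces, and the redundant version can never be more available than its load-balancer, which is the “what if that breaks?” problem in numeric form.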

This is, in my view, the practical where-we-are with respect to NFV.  We are proving that most of the technical principles needed to deploy NFV are valid.  We are proving that our current approach is valid for the test cases we’ve selected.  We have yet to prove the business case because we have yet to prove that all the benefits we can secure, net of incremental costs of any sort, can rise above our ROI targets.

I believe that we will prove the case for NFV, but I think getting to the proof points is going to come only when we accept the need to test not the technology itself but the application of that technology at the scale of real services.  We’re probably not going to even start doing that until well into 2015.

I also believe that we have to explore things like edge-hosting of functions on custom devices.  If service agility is the goal, and if business customers normally keep their service makeup constant for a long period of time, there’s nothing wrong with deploying firewall or other features at the edge.  Such a decision scales cost with revenue (you deploy an agile box only when you sell service), it’s operationally simpler (no cloud data center, no virtual connections), and it doesn’t require all the MANO technology that’s needed for full virtualization.  Maybe there’s an on-ramp to full NFV through this sort of thing.

NFV is a good idea.  SDN is a good idea.  Realizing either means more than proving it can work; it means proving that it’s justified.  Validating the three net benefit opportunities is the only way to get there.