My recent blog on NFV performance has generated a long thread of comments (for which I thank everyone who’s participated), and from the thread I see a point emerging that’s really important to NFV. The point is one I’ll call scope of benefits.
Operators build networks to sell services from. If you presume that the network of the future is based in part on hosted resources that substitute for network components, then the evolution to that future network will occur by adding in network components, either to fulfill new opportunities or as an alternative way of fulfilling current ones. If I want to sell a managed security service I need the components thereof, and I could get those by selling a purpose-built box on premises, a generalized premises box/host with associated software, or a hosted software package “in the cloud” or in an NFV resource pool. NFV, early on, was based on the presumption that hosting higher-level functions like security on a COTS platform rather than on a custom appliance would lower costs.
I’ve made the point before, including in that blog, that operators now tell me NFV overall could have no more than about a 24% impact on capex, which is in the same range as the discounts they expect they could obtain from vendors (as one operator puts it, by “beating up Huawei on price”). In the LinkedIn comments on the blog, a number of people pointed out examples where capex savings were much higher, two thirds or even more. The question is whether this means the 24% number is wrong and, if not, what it does mean.
Obviously, operators say what they say and it’s not helpful to assume they’re wrong about their own NFV drivers, but I can’t defend their position directly because I don’t know how they’ve arrived at it. However, I did my own modeling on this and came up with almost exactly the same number (25% with a margin of plus-or-minus two percent for simple substitution, up to 35% with roughly the same range of uncertainty if you incorporated assumptions about multiplication of devices to support horizontal scaling). That number I understand, and so I can relate how those 66% savings examples fit in this picture. The answer is that scope-of-benefits thing.
Suppose you have a food truck and sell upscale sandwiches. Somebody comes along and tells you they have an automatic mayo creator that can make mayo at a third of the commercial cost. Does that mean your costs fall by 66%? No, only your mayo cost does. The point here is that a given strategy will impact an operator’s overall capex only in proportion to how much of total capex that strategy touches. Security appliances represent less than 5% of capex for even the most committed operator in my survey, and across the board their contribution to capex wasn’t even high enough to reach statistical significance. So if I cut capex for these gadgets to zero, you’d not notice the difference.
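The mayo arithmetic can be sketched in a few lines. This is only an illustration of the proportionality argument, not the model behind my numbers; `overall_capex_impact` and its parameter names are made up for the example, and the 5% share is used as an upper bound because the survey figure is "less than 5%."

```python
def overall_capex_impact(share_of_capex: float, savings_rate: float) -> float:
    """Fraction of TOTAL capex saved when one category gets cheaper.

    share_of_capex: the category's share of total capex (0.0 to 1.0)
    savings_rate:   the fraction saved within that category (0.0 to 1.0)
    """
    if not 0.0 <= share_of_capex <= 1.0:
        raise ValueError("share_of_capex must be between 0 and 1")
    if not 0.0 <= savings_rate <= 1.0:
        raise ValueError("savings_rate must be between 0 and 1")
    return share_of_capex * savings_rate


# Upper bound: security appliances at 5% of capex (the text says "less than 5%").
two_thirds_saving = overall_capex_impact(0.05, 2 / 3)
appliances_free = overall_capex_impact(0.05, 1.0)

print(f"{two_thirds_saving:.1%}")  # 3.3%
print(f"{appliances_free:.1%}")    # 5.0%
```

Even a two-thirds saving on the category moves total capex by only about three percent at most, and making the appliances free caps out at the category's own share.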
If you want to make a big difference in capex you have to impact big areas of capex, most of which are actually not even NFV targets. Virtual access lines? Virtual fiber transport? I don’t think so, nor is virtual radio for mobile very likely. Yes, we can virtualize some functions of access or transport or radio, but we need real bits, real RAN. Where we find opportunities for real capex reduction at a systemic level is in L2/L3 infrastructure. It’s the highest layer that we see a lot of, and the lowest that we can reasonably expect to virtualize. Every access device is L2/L3, as well as most aggregation components, points-of-presence, EPC, and so forth.
I’m not advocating that we replace everything at L2 and L3 with virtual devices, though. The problem is that capex alone can’t be used as a measure of cost reduction anywhere. We can only use total cost of ownership, and as I’ve said, TCO is more and more opex. The question that any strategy for capex substitution has to address is whether opex can at a minimum be held at prior levels through the transition. If not, some of the capex benefits will be lost to opex increases. And since we have, at this moment, no hard information on how most NFV vendors propose to operationalize anything, we have great difficulty pinning down opex numbers. That, my friends, is also something the operators tell me, and I know from talking to vendors that operators are telling most of them the same thing.
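Here is a minimal sketch of why capex by itself is a misleading yardstick. Every figure in it is hypothetical and chosen only to show the mechanics; none of it comes from operator data, and `net_tco_change` is an illustrative name.

```python
def net_tco_change(capex_before: float, capex_after: float,
                   opex_before: float, opex_after: float) -> float:
    """Change in total cost of ownership; a negative result means money saved."""
    return (capex_after + opex_after) - (capex_before + opex_before)


# Hypothetical annual figures in arbitrary currency units (assumptions, not data).
# Capex drops from 100 to 75, a 25% saving, but opex climbs from 150 to 180
# because the operations side of the transition was never worked out.
change = net_tco_change(capex_before=100, capex_after=75,
                        opex_before=150, opex_after=180)

print(change)  # 5, so TCO went UP despite the capex saving
```

With these made-up numbers the capex win is real, yet the opex increase more than cancels it.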
One of the key points about opex is the silo issue. We are looking at NFV one VNF at a time, which is fine if we want to prove the technical validity of a VNF-for-PNF substitution. However, the whole IP convergence thing was driven by the realization that you can’t have service-specific infrastructure. We can’t have VNF-specific NFV for the same reason. There has to be a pool of resources, a pool of functional elements, a pool of diverse VNFs from which we draw features. If there isn’t, then every new service starts from scratch operationally, services share resources, tools, and practices inefficiently, and we end up with NFV costing rather than saving money.
Service agility goes out the window with this situation too. What good is it to say that NFV provides us the ability to launch future services quickly if we have VNFs that all require their own platforms and tools? We need an architecture here, and if we want operators to spend money on that architecture we need to prove it as an architecture and not for one isolated VNF example. There is no such thing as operations, or resource pools, in a vacuum.
Where we start is important too, but there is no pat answer. We could pick a target like security and expect to sell it to Carrier Ethernet customers, for example. But how many of them already have security appliances that haven’t been written off? Will they toss them just because security is offered as a service? We could virtualize CPE like STBs, but at least some box is needed just to terminate the service, and the scale of replacing real CPE with a virtual element even in part would be daunting without convincing proof we could save money overall. One operator told me their amortized annual capital cost of a home gateway was five bucks. One service call would eat up twenty years of savings even if virtual CPE cost nothing at all.
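The home gateway example works out as follows. The five-dollar annual figure is the operator's; the service call cost of about $100 is my inference from the "twenty years" remark and was not stated by the operator, so treat it as an assumption. The function name is illustrative.

```python
def years_of_savings_consumed(service_call_cost: float, annual_saving: float) -> float:
    """How many years of capex savings a single service call wipes out."""
    if annual_saving <= 0:
        raise ValueError("annual_saving must be positive")
    return service_call_cost / annual_saving


ANNUAL_GATEWAY_CAPEX = 5.0         # from the operator: amortized annual cost of a home gateway
ASSUMED_SERVICE_CALL_COST = 100.0  # ASSUMPTION: implied by "twenty years", not a quoted figure

# Best case for virtual CPE: it costs nothing, so the whole $5 per year is saved.
years = years_of_savings_consumed(ASSUMED_SERVICE_CALL_COST, ANNUAL_GATEWAY_CAPEX)

print(years)  # 20.0
```

If virtual CPE generates even one extra truck roll or support call over the life of the service, the capex case is gone.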
I’ve said this before, and I want to repeat it here. I believe in NFV, and I believe that every operator can make a business case for it. That’s not the same thing as saying that I believe every business case that’s been presented, and operators are telling me they don’t believe those presented business cases either, at least not enough to bet a big trial on them. So my point isn’t to forget NFV; it’s to forget the hype and face the real issues, all of which can be resolved.