Do operators think cloud-native is better than NFV? Do they think NFV could be done in a cloud-native way? What do they think they’ll end up deploying most? These are questions I tried to get answered in my recent exchanges with network operators. I was a bit less successful than I’d hoped, but I think what I learned was very useful.
To the first point, two of every three operators saw cloud-native as “more important than NFV.” The advantage, in their minds, came from the fact that NFV doesn’t fully exploit the operations tools that the cloud increasingly relies on. It’s not that they didn’t think NFV would work, but rather that they thought it was too much work. They think containers and Kubernetes are the right approach, and that NFV has taken another approach.
I’d expected a different response, not that cloud-native was less important but that it was important for another reason. It’s been my view that function virtualization under NFV is too device-centric, too inclined to preserve the old device-network problems. I still believe that to be true, but operators are focusing on a short-term issue, which is how to keep virtualization from creating a killing increase in opex.
“NFV is too complicated,” said one operator. “Everything is a one-off,” said another. When you press on these points, it’s clear that the concerns are that VNF onboarding is almost like a systems integration task, and “management and orchestration” in NFV doesn’t do enough managing and orchestrating. To me, these things are simply expected consequences of a poor overall architecture, one that didn’t consider the full scope of NFV impact because the ISG kept operations out of scope at the time the critical architectural model was devised.
If you expect to manage virtual functions with device management tools and principles, you end up replicating the real devices via virtual devices. That means that your management and operations focus is on the virtualization of each box, not on the collective operational efficiency of the network. NFV wasn’t designed to make network operations better, only to make them device-like. What worries operators, though, is the failure to fully operationalize.
The same thing is true with onboarding. They see the process as being too resource-intensive, but they don’t see that as a failure of the architecture (which it is), but as a set of missing tools. It’s possible to make the onboarding of VNFs, meaning their integration into a functional and operationally stable network, less labor-intensive, but it’s not as easy as it should be. There was no mechanism provided to abstract VNF functionality in a systematic way, to make all “firewalls” look, operationally, like a single class of VNFs with common deployment and operations rules. That would have solved the onboarding problem.
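To make the “single class of VNFs” idea concrete, here’s a minimal sketch in Python of what that kind of abstraction might look like. Everything in it is hypothetical and for illustration only: the `Firewall` class, the `Rule` type, the two vendor adapters, the configuration syntax they emit, and the `onboard` function are my own assumptions, not anything defined by the NFV ISG or any vendor.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A vendor-neutral firewall rule (hypothetical, for illustration)."""
    action: str  # "allow" or "deny"
    port: int


class Firewall(ABC):
    """The one operational class every firewall VNF would have to implement."""

    @abstractmethod
    def apply_rules(self, rules: list[Rule]) -> list[str]:
        """Translate neutral rules into vendor-specific configuration lines."""

    @abstractmethod
    def healthy(self) -> bool:
        """Report health in the same way for every vendor."""


class VendorAFirewall(Firewall):
    """Hypothetical vendor adapter; the config syntax is invented."""

    def apply_rules(self, rules: list[Rule]) -> list[str]:
        return [f"set policy {r.action} dport {r.port}" for r in rules]

    def healthy(self) -> bool:
        return True


class VendorBFirewall(Firewall):
    """A second hypothetical adapter with a different invented syntax."""

    def apply_rules(self, rules: list[Rule]) -> list[str]:
        return [f"{r.action.upper()}:{r.port}" for r in rules]

    def healthy(self) -> bool:
        return True


def onboard(fw: Firewall, rules: list[Rule]) -> dict:
    """One onboarding procedure for the whole class, whatever the vendor."""
    return {
        "vendor": type(fw).__name__,
        "config": fw.apply_rules(rules),
        "healthy": fw.healthy(),
    }


if __name__ == "__main__":
    block_telnet = [Rule("deny", 23)]
    for fw in (VendorAFirewall(), VendorBFirewall()):
        print(onboard(fw, block_telnet))
```

The point of the sketch is that the vendor-specific work is confined to the adapter, and the deployment and operations logic in `onboard` is written once for the class. Without that abstraction, each firewall VNF needs its own version of `onboard`, which is the “everything is a one-off” complaint.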
On the second question, operators do believe that NFV could be done via cloud-native principles. The problem is that they think it would take a long time. Half the operators said it would take “years”, a third “one year”, and the rest “more than six months”. This illustrates two things. First, that most operators think cloud-native NFV would be almost a redo of the original effort, gaining perhaps a bit of execution efficiency from experience, but still a long slog. Second, that cloud-native principles would result in something totally different. That point is what separates the operators in their responses, in fact. The “years” camp think that you need to start over, and the “more than six months” camp think that you could retrofit cloud-native principles on NFV without making fundamental architectural changes.
The fact that cloud-native NFV is expected to take a long time means that a lot of operators think they’ll end up deploying both “NFV” and “cloud-native”, which means they don’t think the two will converge technically in the next couple of years. Over three-quarters of operators say that “cloud-native” deployment of network service features would be possible using the tools the cloud community has already developed. A slight majority of this group think that some policies and practices would be needed to augment these tools to address the special concerns of network operators. The rest think that you could use cloud tools “in the same way cloud providers do”. They see the OTTs and cloud providers as models for how cloud-native should be used.
What are the “special concerns” of operators? Primarily reliability and availability, say the operators. Almost all operators say that transformation to virtual elements would be possible only if all current network SLAs could be met. They hold this position even though more and more user communication takes place over best-effort services, and the services that aren’t totally OTT, like voice calls, are declining in importance. I think this is important, because it suggests that not only are operators fixated on virtualization as the creation of 1:1-mapped-to-devices virtual boxes, they’re fixated on a service model that’s out of sync with current market reality.
This view of cloud-native means that operators do see it as separate from NFV in deployment terms. For things like virtual CPE, operators think NFV will prevail. For implementing new services “above” the network (meaning OTT-like services) they think cloud-native will prevail. They also think that, over time, there will be more and more of those OTT-like services, and thus that NFV will deploy less often. About a third of operators think cloud-native will naturally displace NFV over time, even in vCPE applications.
You might see these views as validations of many of the things I’ve said in past blogs, but I have to admit that there’s a big hole in the story that’s troubling me. Operators are unhappy with the symptoms of NFV’s issues as they are visible today. They have a very tactical view of what’s wrong with NFV, which means that they really don’t have a proper planning mindset to formulate a successor cloud-native model, or even to assess one. That’s bad, because outside a few technical specialists in the cloud-native world, the cloud-native model isn’t well understood anywhere.
We do have a sort-of-ad-hoc vision of the cloud-native story, and that vision is getting clearer as major vendors start assembling and marketing collections of open-source technology that collectively address all the cloud-native requirements. Still, the vision is so hazy that most enterprises (to whom the efforts of vendors are largely targeted) don’t understand it. Operator planners have even less exposure than enterprise planners, and more ossified thought processes too. This is why operators think a full cloud-native transformation will take a long time.
This is also likely why so many operators are looking to partnerships with the cloud providers as a means of realizing their own near-term opportunities for cloud-hosting of features and functions. They may learn enough from those to get their own infrastructure right. But while they’re learning, much of what they do with cloud-native technology, whether it’s in something old like NFV, something current like 5G, or something future like IoT, is likely to flounder.
The biggest risk we face in transformation is this planning-lags-behind-markets problem. Operators need to understand what the future will look like, a future they know instinctively has been shaped by the cloud providers. If they don’t, then they’ll keep doing NFV-like projects and making the same old mistakes. We can fix the mistakes of the past only if we don’t keep repeating them.