Divergence of Operator Visions of NFV Shows Inadequacies in Our Approach

Transformation of a trillion-dollar infrastructure base isn’t going to be easy, and that’s what network operators are facing.  Some don’t think it’s even possible, and a Light Reading story outlines Telecom Italia’s thinking on the matter.  We seem to be hearing other viewpoints from other operators, so what’s really going on here?

There’s always been uncertainty about the way that virtualization technologies (SDN, NFV, SD-WAN) are accommodated in operations systems.  If we were to cast the debate in terms of the Founding Fathers, we could say that there are “loose constructionists” and “strict constructionists” regarding the role of the OSS/BSS in virtualization.

In the “loose construction” school, the thinking is that orchestration technology is applied almost seamlessly across virtualization processes, management processes, and operations processes.  A service is made up of truly abstract elements, each of which represents a collection of features that might be supplied by a device, a software function, or a management system representing a collection of devices.  Each of the elements has its own states and events, and operations is integrated on a per-element basis.  It’s all very flexible.
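To make the per-element idea concrete, here is a minimal sketch in Python of an abstract service element that carries its own state/event table, with an operations process hooked in for that element alone.  Every name in it (Element, build_firewall, the states and events) is a hypothetical illustration, not something drawn from any NFV specification or product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical; this is an illustration, not a standard.
Handler = Callable[["Element"], str]  # a handler returns the element's next state


@dataclass
class Element:
    """An abstract service element with its own states and events."""
    name: str
    state: str = "ordered"
    table: Dict[Tuple[str, str], Handler] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def on(self, state: str, event: str, handler: Handler) -> None:
        # Bind a process to one (state, event) intersection.
        self.table[(state, event)] = handler

    def handle(self, event: str) -> str:
        handler = self.table.get((self.state, event))
        if handler is None:
            # Events with no entry for the current state are recorded and ignored.
            self.log.append(f"ignored {event} in {self.state}")
            return self.state
        self.state = handler(self)
        return self.state


def deploy(el: Element) -> str:
    el.log.append("deploy")
    return "active"


def open_ops_ticket(el: Element) -> str:
    # Operations is integrated here, at the element level.
    el.log.append("ops ticket opened")
    return "degraded"


def restore(el: Element) -> str:
    el.log.append("restored")
    return "active"


def build_firewall() -> Element:
    fw = Element("firewall")
    fw.on("ordered", "activate", deploy)
    fw.on("active", "fault", open_ops_ticket)
    fw.on("degraded", "repaired", restore)
    return fw
```

The element could be backed by a device, a software function, or a management system; the table is the same in each case, which is the flexibility the loose-construction school is after.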

In the “strict construction” view, all this is complicated and disruptive.  This group believes that since the general practice of the past was to manage devices with operations systems, virtualization should be used to build a specific set of “virtual devices”, most of which would probably be software-hosted equivalents of real things like firewalls or routers.  These would then be managed in pretty much the same way as the old stuff was, which means that operations wouldn’t really be seriously impacted by virtualization at all.

Both these approaches still need “orchestration” to handle the automation of service lifecycle tasks, or you can’t achieve the benefits of SDN, NFV, or SD-WAN.  Arguably, the big difference is in the extent to which orchestration, abstraction, and virtualization are integrated across the boundary between “services” and “networks” and between “networks” and “devices”.

With the strict-construction virtual-device approach, virtualization has to live inside the virtual device, which becomes a kind of black box or intent-model that abstracts the implementation of device features to hide the specifics from the operations layer.  You don’t need to change the OSS/BSS other than (possibly) to recognize the set of virtual devices you’ve created.  However, the OSS plane is isolated from the new tools and processes.  This point is what I think gives rise to the different operator views on the impact of things like NFV on OSS/BSS.
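A rough sketch of the virtual-device black box, again in Python and again with purely hypothetical names (VirtualDevice, ApplianceFirewall, HostedFirewall, oss_poll), might look like the following.  The assumption is that the operations layer sees only a device-style status, while the hosted implementation folds its internal resource conditions into that one answer.

```python
from abc import ABC, abstractmethod
from typing import Dict


class VirtualDevice(ABC):
    """The only surface the OSS/BSS sees (hypothetical interface)."""

    @abstractmethod
    def status(self) -> str:
        ...


class ApplianceFirewall(VirtualDevice):
    """A legacy physical box."""

    def __init__(self, link_up: bool) -> None:
        self.link_up = link_up

    def status(self) -> str:
        return "up" if self.link_up else "down"


class HostedFirewall(VirtualDevice):
    """A software-hosted equivalent; VM and vSwitch details stay hidden inside."""

    def __init__(self, vm_ok: bool, vswitch_ok: bool) -> None:
        self.vm_ok = vm_ok
        self.vswitch_ok = vswitch_ok

    def status(self) -> str:
        # Internal conditions collapse into a single device-style status.
        return "up" if self.vm_ok and self.vswitch_ok else "down"


def oss_poll(devices: Dict[str, VirtualDevice]) -> Dict[str, str]:
    # The operations layer cannot tell which implementation it is polling.
    return {name: dev.status() for name, dev in devices.items()}


inventory = {
    "fw-legacy": ApplianceFirewall(link_up=True),
    "fw-hosted": HostedFirewall(vm_ok=True, vswitch_ok=False),
}
```

Note what the sketch leaves out: nothing in oss_poll can say why the hosted firewall is down, which is the isolation of the OSS plane from the new tools and processes.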

If you have an NFV mission that targets virtual CPE (vCPE), you have a model that easily translates to a virtual-device model.  The boxes you displaced were chained on premises, and you service-chain them inside a premises device or in the cloud.  The feature looks to the buyer like a box, so it makes sense to manage it like a box, and if you do adopt a premises-hosted model there are no real shared resources used, so there’s no need for fault correlation across a resource pool.

If you have a broader vision of NFV, one that imagines services created by dynamically linking cloud-hosted features in both the data plane and control plane, then it’s difficult to see how this dynamism could be represented as a static set of virtual devices.  There are also concerns that, to prevent the virtual device from becoming a hard barrier to extending operations automation, these virtual devices would all have to be redefined to more generally model network functions—a true intent model.  That would then require that traditional devices somehow be made to support the new features.

Both existing infrastructure and existing operations tools and practices act as an inertial brake on transformation, and that’s especially true when nothing much has been done to address how the legacy elements fit into a transformation plan.  We don’t really understand the evolution to NFV, and in particular the way that we can achieve operations and agility savings with services that still (necessarily) include a lot of legacy pieces.  We also don’t understand exactly how the future benefits will be derived, or even what areas they might come from.

When you have uncertainty in execution, you either have to expand your knowledge to fit your target or you have to contract your target to fit your knowledge.  Some operators, like AT&T, Verizon, Telefonica, and (now, with their trials of AT&T’s ECOMP) Orange, seem to have committed to attempting the former course, and Telecom Italia may believe that we’re just not ready to support that evolution at this point.

The underlying problem is going to be difficult.  A service provider network and the associated craft/operational processes form a complex and interdependent ecosystem.  Yet every technology change we support is (necessarily) specific, and we tend to focus it on a very limited area of advance in order to build consensus and promote progress.  That means that, taken alone, none of these technology changes is likely to do more than create a new boundary point where we have interactions that we don’t fully understand, and that don’t fully support the goals of technology transformation.  Certainly we have that in SDN and NFV.

The end result of this is that we ask network operators to do something that the equipment vendors doing the asking would never do themselves.  We ask them to bet on a technology without the means of fully justifying it, or even understanding what credible paths toward justification exist.  They’re not going to do that, and it’s way past time that we stop criticizing them for being backward or culturally deficient, and face business reality.

NFV has been around since October 2012 when the Call for Action paper was first published.  I’ve been involved in it since then; I responded to that paper when there was no ISG at all.  In my view, every useful insight we’ve had in NFV was exposed by the summer of 2013.  Most of them have now been accepted, but it’s been a very long road even to that limited goal, and it will take even more time before we have a workable framework to implement the insights.  We need to nudge this along.

I’d like to see two things happen.  First, I’d like to see the ISG take the combination of the Verizon and AT&T frameworks and look at them with the goal of harmonizing them and drawing from them a conception of end-to-end, top-to-bottom NFV, without making any attempt to validate the work already done.  We need to use formal frameworks that address the whole problem to develop a whole-problem architecture.  Second, I’d like to see the TMF stop diddling on their own operations modernization tasks (ZOOM) and come up with a useful model, again without trying to justify their current approach or their own business model.

If we do these things, I think we can get all the operators onto the same—right—page.