How Many “NFV Benefits” are Really Specific to NFV?

One thing that is probably clear to everyone who reads about SDN or NFV these days is that there is no real consensus on what either actually does, or should do.  The confusion extends to what SDN and NFV can do, separately or together, and to what other things might be done around them, especially in the area of service management.  I want to look at service management today, and also look at a first hint of how operators might step beyond NFV to make a business case for change.

The service management problem has been around for almost 15 years, believe it or not.  Any “virtual network” poses a management challenge because the functional view of the network/service is different from the resource view.  In VPNs or VLANs, users get what the name suggests: something that looks like a private network.  It isn’t one; it’s produced by segmenting a multi-tenant network.  So when users exercise “management” you have to be able to recognize the difference in views and ensure that users see the state of their own service and don’t interfere with the services of others.

The more virtual you get, the more issues you create with this view dichotomy.  SDN introduces a wider range of “virtual devices” because it permits very detailed forwarding control, enough to define even different L2/L3 services.  NFV creates virtual devices by hosting software on various server/device types.  In both cases, we have that same functional/resource separation.

This service management challenge has grown up at a time when established principles of service management are being questioned.  It’s all about SLAs, right?  Well, the Internet is by far the largest data service globally, and it’s best-efforts.  VPNs over the Internet have been growing as users discover the lower cost of the service makes up for the fact that there’s no solid SLA.  Applications can be made more resilient to QoS variations, and best-efforts can be good enough for all practical purposes.  So do you need to manage services at all, or might you simply plan a network for the traffic you think you’ll carry and then fix problems at the resource level, or through capacity planning?

This is how we manage the Internet, and in truth many IP and Ethernet services as well.  You do some capacity planning, you exercise traffic management based on aggregate user flows, and you have what could be called a “statistical SLA”, one where there’s a goodly number of “9’s” (but not all five!) over a long-ish period like a month, but no real near-term guarantees.  Remember T1 lines and “error-free seconds” or “severely errored seconds?”  Forget that level of granularity these days.  We’ve already accepted lower levels of availability and QoS, and lower prices would likely induce even further trade-offs here.
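To put rough numbers on a “statistical SLA”, here is a minimal Python sketch of how much downtime a given number of nines allows over a measurement period.  The function name and the 30-day month are illustrative assumptions, not figures from any operator’s contract.

```python
def allowed_downtime_minutes(nines: int, period_days: float = 30.0) -> float:
    """Downtime budget, in minutes, for an availability of `nines` nines.

    Assumption: availability is measured over the whole period, so the
    budget can be spent in one long outage or in many short ones.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** (-nines)      # e.g. 3 nines -> 0.001
    period_minutes = period_days * 24 * 60
    return period_minutes * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines over 30 days: {allowed_downtime_minutes(n):.2f} minutes")
```

Three nines over a 30-day month allows about 43 minutes of downtime, while five nines allows only about 26 seconds.  Note that a monthly budget says nothing about when the outage lands, which is the “no real near-term guarantees” point.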

If we were to view services in the future as being totally best-efforts, if we believed that they never required us to associate customer-experience with resource-state, we could solve SDN and NFV’s management challenges easily.  And there may be those who believe that, and they may be right.  I don’t dispute the fact that this view could be rational, only the notion that we can accept the independence of service and resource management without accepting the baggage.

With SDN and NFV, but primarily with the latter, we add yet another factor to the mix.  The business case for NFV is based on a mixture of three benefits—capex reduction, opex reduction, and “service agility” meaning improved service-to-market adaptation in features and timing.  NFV was not targeted at replacing everything with virtual functions hosted on something, though.  The targets have really been “higher-layer” features like security.  Virtual CPE is such an obsession within the NFV community that they tend to frame all their examples in those terms.  Yet you can’t eliminate devices to terminate a service, only simplify them, and you can’t address enough cost through capex replacement alone under those conditions.  That means NFV’s business case relies on operations-related factors.  That’s also true for SDN.

Neither SDN nor NFV considered service management or operations to be in-scope.  As a result, neither has defined a “new” operations or service management relationship.  That leaves us with the old one, which was that OSS/BSS systems talked to devices or device management systems.  If that’s the case, then SDN and NFV should only generate “virtual devices”, and it is at this point that all the various service management forces collide.

If we consider SDN and NFV to be builders of virtual devices, then we’re saying that they are the technologies that make the function-to-resource mapping, which means that whatever we know about the relationship between functional or service management and resource management has to come from SDN or NFV.  Where, you recall, operations management in all forms is out of scope.
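The function-to-resource mapping can be made concrete with a small Python sketch.  Everything here is hypothetical: the class names, the three-level state model, and the “worst resource wins” rule are assumptions chosen for illustration, not anything defined by the SDN or NFV specifications.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List


class State(IntEnum):
    """Illustrative health levels; a higher value means worse."""
    UP = 0
    DEGRADED = 1
    DOWN = 2


@dataclass(frozen=True)
class Resource:
    """A shared piece of infrastructure (server, switch, link)."""
    name: str
    state: State


@dataclass
class VirtualDevice:
    """What the OSS/BSS or customer sees: a function owned by one tenant."""
    name: str
    tenant: str
    resources: List[Resource] = field(default_factory=list)

    def state(self) -> State:
        # Assumption: the function is only as healthy as its worst resource.
        # A device with no mapped resources is reported DOWN.
        if not self.resources:
            return State.DOWN
        return max(r.state for r in self.resources)


def tenant_view(devices: List[VirtualDevice], tenant: str) -> Dict[str, str]:
    """Functional view for one tenant; shared resources are never exposed."""
    return {d.name: d.state().name for d in devices if d.tenant == tenant}
```

A tenant asking for its view gets only its own virtual devices and their derived states.  The shared servers and links behind them stay hidden, which is the functional/resource separation that some layer, whether SDN, NFV, or operations, has to own.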

Virtual-device management is easiest where the functional/resource relationship is simple, but the problem is that “basic business vCPE” is a small-potatoes application for anyone but business MSPs.  If operators want to make larger changes in costs or revenues and can’t broaden the vCPE base, they might find it useful to mingle vCPE with SDN.   vCPE is a successful early application of NFV, and it can be linked to an SDN overlay (as AT&T does with its on-demand switched Ethernet) to build a service based on virtualization.  With vCPE we have tenant-specific hosting at the edge, which makes the management connection easy.

The AT&T initiative is my topic for the “think outside the NFV box” award.  The question, which AT&T and others are working to answer, is whether you can gain satisfactory operations efficiency and service agility using a mix of SDN, NFV, and legacy infrastructure changes, and do so on a broad enough scale to impact costs.  This could be an example of building an NFV justification by stepping out of NFV, by assigning the potential benefits to something else.

So AT&T’s Ethernet service, using SDN to partition low-level network services, could be a giant step toward a broader simplification.  Note that AT&T has been clear that the “interior” of this service doesn’t involve virtualizing functions at all.  It’s clear, though, that if you were to add VNF-hosted edge routing to it, you’d transition to a VPN service.  It’s a step on what might be a road to radical change.  You could also segment IP or even pure optics with SDN, creating virtual wires that are then combined with edge-hosted (or even selectively centralized instances of) routing and switching to build services, provided you circle back to the management model.  Service management changes in operations, coupled with a management model for the new configuration, would realize all the benefits that every technology that proposes to change infrastructure must realize to make a business case.

How would this model, combining OSS/BSS changes and edge-hosted alternatives to traditional L2/L3 infrastructure, impact SDN?  Highly positively; there’d be a lot more of it.  NFV?  This approach doesn’t yet address applications beyond business virtual private network/LAN services.  It doesn’t yet harmonize usefully with mobile infrastructure.  There are a lot of “yets” here, a lot of potential to shift the focus of operators from simply “deploying NFV” to making much broader network change that NFV would be only a piece of.  It is possible that something like AT&T’s service plans could pull a lot of business drivers out of NFV, limiting it to vCPE, and add them into SDN and operations systems.

It is really too early to say that something like AT&T’s Ethernet service evolution is a signal that operators are expecting less from NFV.  The problem is that “real” NFV has to build a highly efficient resource pool and a highly efficient pool of operations processes, both of which demand convergence in approach even though early service trials are all being done per-service.  Would we have committed more to NFV had we resolved all of the business case issues last year?  I think so.  Will we commit less to NFV if we don’t solve them in 2016?  I think the proof of that is already happening, at AT&T and elsewhere.

On the vendor side, a shift from pure NFV to opportunistic marriages of NFV and collateral SDN-based virtualization of the service layer of carrier networks would certainly generate less NFV hosting.  My model says that an optimally efficient NFV deployment would create about a hundred thousand incremental data centers worldwide.  The SDN-and-vCPE model would create only about 14% of that, or roughly fourteen thousand.  That says that IT vendors with NFV aspirations will need to try to frame something more impactful with NFV, or risk a major loss of opportunity.