Device Models for Future Open Network Infrastructure

What will our next-generation network devices really look like?  I know we’ve heard a lot of stories about the future, but are they all true?  Are any of them, in fact?  I had some interesting operator discussions on this point, and they yielded some surprising information.  The best way to review it is to list the device models operators saw as part of future infrastructure and discuss the benefits and risks of each.

The first and most obvious of the device models is the traditional vendor-proprietary appliance.  Despite the fact that everyone has been predicting the death of these devices, every operator I talked with believed they would have these devices in their networks five and even ten years out.  The primary reasons are a tie: the devices are already there and their use and operations tools are understood, and the proprietary model offers a known risk in terms of both support and technology.

Operators do think their spending on this type of device will decline, both as a percentage of total spending and in absolute terms.  The reason for the decline is a combination of pricing pressure they’ll place on vendors and displacement of appliances in some missions.  Operators expect to rely on the traditional boxes deeper in their networks, but not as much in access/edge and metro.  There, because of the number of boxes needed, they predict they’ll adopt another model.

The model they think they’ll adopt is the open traditional-protocols device, meaning an open switch/router platform combined with hosted open-source functionality.  The best-known example of this is the commitment AT&T has made to its DANOS software platform for open devices, which it expects to deploy in the tens of thousands in its mobile infrastructure, notably at or near cell sites.  These devices aren’t replacements for routing as a function, but rather for vendor-proprietary routers.  They’d still run routing protocols and work with other routers elsewhere in their networks.

Nearly every operator thinks that this is the class of device that will experience the fastest growth and attain the largest total deployment over time.  A bit over half of operators said they thought that within 10 years they’d be spending more on this class of device than any other.  That, again, is owing to the expectation that this class will dominate metro/edge deployments.

The next device class is roughly tied with the fourth in terms of expectations, but it’s already experiencing deployment growth.  That class is the OpenFlow SDN device, which operators think will deploy predominantly in carrier cloud data centers, and somewhat later as the foundation for electrical-level grooming of transport trunks, replacing traditional core routers.
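
To make the grooming mission a little more concrete, here’s a minimal sketch of an OpenFlow rule that simply cross-connects an ingress port to a trunk port, written as a controller app for the open-source Ryu framework.  The port numbers and priority are illustrative assumptions, not anything drawn from an operator deployment.

```python
# Sketch: an OpenFlow "grooming" rule that cross-connects one port to a
# trunk port, installed from a Ryu controller app when a switch connects.
# Port numbers (1 -> 2) and the priority value are illustrative assumptions.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class TrunkGroomer(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp = dp.ofproto
        parser = dp.ofproto_parser
        # Everything arriving on port 1 is steered onto trunk port 2;
        # there is no routing decision, just a forwarding rule.
        match = parser.OFPMatch(in_port=1)
        actions = [parser.OFPActionOutput(2)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))
```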

Slightly less than half of my operator contacts think that OpenFlow SDN will see action in business services, but this view doesn’t seem firmly held.  Their issue is a combination of lack of proof of scalability to VPN levels and concern over the operations impact—both on practices and on cost.  That suggests that were there a strong service lifecycle management framework in place, and were that framework to have proven itself with OpenFlow, greater deployment could be expected.

The fourth device class is the overlay-edge device, the foundation of an overlay network.  We have these today both in the “SDN” space (VMware’s NSX and Nokia/Nuage) and in the SD-WAN space, and both have various uses in operator infrastructure.  Operators tend to see SD-WAN as a pure business service framework, a supplement to or replacement for MPLS VPNs.  They see overlay SDN as a potential alternative to OpenFlow SDN in the data center, and some even see value in transport grooming missions.  The big opportunity for the “SDN” part of overlay devices is believed to be 5G network slicing.
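
For readers who want a picture of what “overlay” means at the packet level, the sketch below uses the Scapy library to build a VXLAN-encapsulated frame: an inner Ethernet/IP packet riding inside an outer IP/UDP/VXLAN envelope.  The addresses and VNI are made-up values; a real overlay-edge device would do this in its data plane, not in Python.

```python
# Sketch of overlay encapsulation: an inner Ethernet/IP frame carried
# inside an outer IP/UDP/VXLAN header. Addresses and the VNI (100) are
# illustrative assumptions, not values from any real deployment.
from scapy.all import Ether, IP, UDP
from scapy.layers.vxlan import VXLAN

inner = Ether(src="02:00:00:00:00:01", dst="02:00:00:00:00:02") / \
        IP(src="10.1.1.1", dst="10.1.1.2")

outer = Ether() / \
        IP(src="192.0.2.1", dst="192.0.2.2") / \
        UDP(dport=4789) / \
        VXLAN(vni=100) / \
        inner

# The underlay sees only the outer headers; the service rides inside.
outer.show()
```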

This device class sees the greatest uncertainty in deployment terms.  Almost two-thirds of operators say they expect to see most of their business services delivered via an overlay model within 5 years, owing in no small part to the ease with which MSPs or even enterprises themselves could create SD-WAN services.  Almost 15% see no use of overlay technology at all, and a small number see it as their primary grooming and service-layer architecture.  It’s clear that operator perspectives are evolving here, and that the slow pace of evolution can be linked to the limited vision vendors are presenting in the space.

The fifth class of device is the hosted/virtual function.  There are two subclasses in this group, one being the virtual functions of Network Functions Virtualization (NFV’s VNFs) and the other being simply hosted instances of software.  Operators here are sharply divided, with about a sixth saying they think NFV will deploy for CPE in all classes of users (including consumers) and throughout 5G and carrier cloud, a sixth thinking that strict NFV will never amount to anything, and a sixth believing that they’ll adopt cloud technology in 5G and elsewhere, but without the formal NFV trappings.  The rest hold no firm view at this point.

One reason for the confusion in this class is the “virtual CPE” or vCPE concept.  Most NFV today is really a limited application of the ETSI NFV ISG specifications aimed at hosting functions in a universal CPE device (uCPE).  These devices may be simple Linux machines or use an embedded OS that supports at least a limited Linux/POSIX API set.  The functions can be dynamically loaded and changed, and the deployment doesn’t require a cloud data center or formal “service chaining” of multiple functions across hosts.  Most operators recognize that this model isn’t ETSI NFV at all, but they’re grappling with how standardization and openness can be achieved without a formal framework.
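
As a very rough sketch of what “dynamically loaded and changed” could mean on a Linux-based uCPE box, the fragment below swaps hosted functions in and out as ordinary Python modules.  The module names, the handle() interface, and the single-box “chaining” are assumptions for illustration only; nothing here comes from the ETSI specifications.

```python
# Sketch: a uCPE "function host" that loads, replaces, and unloads hosted
# functions as plain Python modules, with no VMs, no cloud data center,
# and no cross-host service chaining. Module names are hypothetical.
import importlib


class FunctionHost:
    def __init__(self):
        self.active = {}  # slot name -> loaded module

    def load(self, slot, module_name):
        """Load (or replace) the function occupying a named slot."""
        self.active[slot] = importlib.import_module(module_name)

    def unload(self, slot):
        self.active.pop(slot, None)

    def process(self, packet):
        # Run traffic through whatever functions are currently loaded, in
        # insertion order: local "chaining" inside one box. Each module is
        # assumed to expose a handle(packet) function.
        for mod in self.active.values():
            packet = mod.handle(packet)
        return packet


# Hypothetical usage; "firewall" and "sdwan_edge" are stand-in module names.
# host = FunctionHost()
# host.load("fw", "firewall")
# host.load("wan", "sdwan_edge")
```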

That point opens the question of the physical devices involved in these models.  If we look at the hardware foundation for the classes, we find there are three primary hardware platforms deemed credible by operators, besides the proprietary appliances.  One is the server, the second the open network device, and the third the “universal CPE” device.

Most of the operators I’ve talked with believe that while it would be possible to host routing/switching instances on general-purpose computers, the trend is toward open but specialized devices.  These would have custom ASICs or other specialized chips (GPUs included) to handle the heavy lifting of forwarding.  Servers are best for hosting persistent, multi-tenant functions, which includes most of what operators expect would be involved with 5G.

Open network devices are the inheritors of the special-device expectations.  There are already quite a few open-platform switch/router products that don’t depend on OpenFlow, and operators like these devices as the basis for at least edge missions in the future.  They’re more reserved about using open-model hardware in deeper missions, but that may change as they become more comfortable with the approach.

uCPE devices have inherited most of the “NFV” expectations, and this may be at the core of the general trend to use “NFV” to mean any method of hosting network functions, not necessarily the specific approach mandated by the NFV ISG.  If you’re going to put functions into a box on prem, much of the ISG stuff isn’t relevant.

The mission of uCPE is potentially a mixture of open-switch-router and function-hosting server, which makes this device class difficult to pin down.  It is possible to support the flow-programming language P4 on a server by building a P4 interpreter or “virtual switch” that runs under a standard operating system, so perhaps that would be the ideal approach for uCPE.  However, open and agile deployment of virtual functions implies that anything that hosts a function should present a standard model/configuration, so that any function can run on any host.
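
To illustrate the kind of abstraction a P4 program expresses, without writing actual P4, here’s a toy match-action table in Python.  The header fields, table entries, and actions are invented; they stand in for what a real P4 target or virtual switch would compile and execute.

```python
# Toy match-action pipeline in the spirit of P4: a table of (match, action)
# entries applied to a packet's header fields. This is not P4 itself, just
# an illustration of the flow-programming model a uCPE "virtual switch"
# might implement. Field names and entries are invented.

def forward(port):
    return lambda pkt: {**pkt, "egress_port": port}

def drop():
    return lambda pkt: None

# Lookups are simplified to exact matches; a real table would support
# longest-prefix and ternary matching.
table = [
    ({"in_port": 1, "dst_ip": "10.0.0.2"}, forward(2)),
    ({"in_port": 2, "dst_ip": "10.0.0.1"}, forward(1)),
    ({}, drop()),  # default entry: no match, drop
]

def apply(pkt):
    for match, action in table:
        if all(pkt.get(k) == v for k, v in match.items()):
            return action(pkt)
    return None

print(apply({"in_port": 1, "dst_ip": "10.0.0.2"}))
# -> {'in_port': 1, 'dst_ip': '10.0.0.2', 'egress_port': 2}
```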

There’s a bit too much variability in the uCPE notion, in my view.  There is a presumption in standard NFV that you’re going to put functions into a VM, but that might not be the best approach with uCPE given the excessive resource requirements for per-function VMs.  A better model, still standardized, might be a container, but if you wanted optimum efficiency you’d probably not bother with any sort of isolation of functions from each other, given that uCPE is sitting on one customer’s premises.

Perhaps we need something even lighter-weight, given that even container technology presumes a level of complexity in application distributability and lifecycle processes that sticking all your function eggs in a uCPE box doesn’t easily justify.  That’s especially true if we presume that IoT and other event-driven applications will draw function hosting and microservices to the network edge, and presumably that means either the operator or user side of the service demarc.  Functions are simple pieces of code, requiring little in the way of middleware or operating system features.  If function-hosting uCPE is the future, then function-based VNFs should be considered.
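
Here’s a small sketch of what a “function-based VNF” might look like in that lighter-weight world: a plain handler registered against an event, with no VM, container, or orchestration machinery around it.  The event names and the dispatcher are hypothetical, made up purely for illustration.

```python
# Sketch: function-based VNFs as simple event handlers. There is no VM or
# container per function; the "platform" is just a dispatcher that maps
# events to registered functions. Event names are invented for illustration.
handlers = {}

def vnf(event):
    """Register a plain function as the handler for an event type."""
    def register(fn):
        handlers[event] = fn
        return fn
    return register

@vnf("flow.new")
def classify(evt):
    # A trivial classification "function": tag the flow by destination port.
    return {"flow": evt["flow"], "class": "web" if evt.get("dport") == 443 else "other"}

@vnf("link.down")
def reroute(evt):
    return {"action": "reroute", "link": evt["link"]}

def dispatch(event, payload):
    return handlers[event](payload)

print(dispatch("flow.new", {"flow": "f1", "dport": 443}))
```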

Any open model for networking demands true openness in hardware and software, and a pretty high level of consistency in terms of operating system and middleware features, at least within the appropriate device class and mission.  I think that some operators, notably AT&T, are pushing to achieve that, and if they succeed I think they’ll drive an open model of networking forward much faster than we’re currently seeing.