White Boxes: What, Where, and How Much?

One of the most important questions in network-building today is the role “white-box” technology will play.  It’s a fundamental precept of operators and even many enterprises that they’re being gouged and locked in by network vendors.  The obvious solution is to adopt an “open” technology, meaning something that doesn’t include proprietary elements and is subject to the pricing pressure inherent in a competitive market.  How good an idea is “white box”, and is it something that’s limited to network devices or inclusive even of large-scale server systems?

The fundamental basis for the white-box movement, whatever kind of box we’re talking about, is the notion that there is a baseline hardware architecture available that could serve as the basis for creating an open device.  This can be true only if there are no proprietary hardware enhancements that would significantly augment value to the user, and if there’s some source of standards that could be expected to drive competitive solutions that stayed interchangeable.

In the PC space of old, the IBM PC created such an architecture, and we have had many desktop and laptop products based on that architecture, all of which are open and fairly equivalent.  The PC space, and later the server space, introduced a second architectural reference more generally useful—“platform compatibility”.  The purpose of these open boxes is increasingly tied to running open software, since software creates the real features of any device technology.

Platform compatibility means that a device (switch/router or server) need not have a cookie-cutter-identical hardware configuration, but must offer a software platform (operating system and middleware) that adapts the traditional platform APIs (like Linux) to the hardware, erasing any visible differences.  Hardware plus platform equals white box.

In the white-box switch and routing space, there are a number of embedded operating systems designed to host switch/router feature software.  The problem with a pure hardware model is illustrated by this multiplicity.  It’s like having a standard PC hardware framework with a half-dozen different operating systems, each with its own applications.  We know from industry history that doesn’t work out.

Fortunately, we have an emerging platform-compatibility framework based on the P4 language.  If you can host a P4 interpreter or “virtual machine”, you can run forwarding programs written in P4.  I think that P4-based stuff, including the Linux Foundation’s DANOS (evolved from AT&T’s dNOS) and the ONF’s Stratum are likely to be the leaders in the white-box forwarding device space.
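To make the platform-compatibility idea concrete, here’s a minimal Python sketch (my own illustration, not DANOS or Stratum code) of the match-action table abstraction that P4 forwarding programs express.  The forwarding logic is defined against the table, not against any particular chip, which is the whole point: the same program should run on any box whose platform can host it.

```python
from dataclasses import dataclass, field

@dataclass
class MatchActionTable:
    """Toy model of the match-action stage a P4 program defines.
    The actual forwarding hardware (ASIC, NPU, or software switch)
    is hidden behind apply()."""
    entries: dict = field(default_factory=dict)  # match key -> (action, params)
    default: tuple = ("drop", None)              # miss behavior

    def add_entry(self, key, action, params=None):
        self.entries[key] = (action, params)

    def apply(self, packet):
        action, params = self.entries.get(packet["dst"], self.default)
        if action == "forward":
            packet["egress_port"] = params
        else:  # "drop"
            packet["egress_port"] = None
        return packet

# A controller populates the table at runtime; the "program" itself
# stays portable across white boxes.
ipv4_table = MatchActionTable()
ipv4_table.add_entry("10.0.0.1", "forward", params=2)

hit = ipv4_table.apply({"dst": "10.0.0.1"})
miss = ipv4_table.apply({"dst": "10.0.0.9"})
```

Real P4 adds parsers, longest-prefix matching, and per-target compilation, but the separation of portable program from hardware-specific execution is what makes a P4 virtual machine a plausible platform-compatibility layer.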

In all the furor of white-boxing network devices, we kind of forgot the server side.  NFV proposed the substitution of “commercial off-the-shelf servers” (COTS) hosting virtual functions for proprietary network devices.  At the time, most of the COTS focus was on Linux platform-compatible systems, the stuff already widely used in enterprise and cloud provider data centers.  These systems are “commercial” and even “open”, but are they really conforming to the spirit of white-box?  No.

More than a quarter of all servers sold today are commodity white-box implementations of a de facto Linux platform-compatible architecture.  Many are built around (or from) a reference architecture that was defined by the Open Compute Project, but my definition of platform compatibility doesn’t require that hardware be identical, only that it be provided with a platform software kit that has a standard set of APIs for the mission it’s targeting.  For white-box switch/routers, I think that platform is P4.  For servers, it’s Linux…and perhaps more.

One truth about today’s white-box stuff is that it would be rare to see no differences whatsoever in hardware, because of the wide range of missions the devices are expected to support.  We know, for example, that some processor chips (Intel’s in particular) are valuable where third-party software compatibility is critical.  There are better options in price/performance terms when the goal is simply pushing bits around, and of course things like GPUs are critical for high-performance video operations and even some encryption.  All of these can be accommodated provided the software platform makes the hardware differences invisible to the applications/functions being run.

Platform compatibility is taking over, in my view, because it addresses these issues and the real goals of openness—no lock-in, compatibility of application/feature software, and compatibility of management tools and practices.  We are likely to find things like content-addressable memories and GPUs as optional features on white boxes, and where these are provided there should be a standard interface to the hardware through an open API.  It would be helpful to have a standard API set to represent all of the platform and hardware features, with an emulator if there were no hardware support for a given capability.
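A hedged sketch of what such an API-with-emulation layer might look like (all names here are hypothetical, not any standard’s actual API): feature software asks the platform for a capability, gets a hardware-backed implementation if the box has one, and a software emulator with the identical interface otherwise.

```python
import zlib

class SoftwareCrc:
    """Emulator: compute CRC32 on the CPU when no offload hardware exists."""
    def checksum(self, data: bytes) -> int:
        return zlib.crc32(data)

class HardwareCrc(SoftwareCrc):
    """Stand-in for a driver that would hand the same call to an offload
    engine; a real driver would invoke the hardware here."""
    pass

def get_capability(name: str, hw_inventory: set):
    """Hypothetical platform call: return hardware support if present,
    otherwise an emulator exposing the identical API."""
    if name == "crc" and "crc_offload" in hw_inventory:
        return HardwareCrc()
    return SoftwareCrc()

# Application code is identical on a fully equipped box and a bare one:
crc = get_capability("crc", hw_inventory=set())  # no offload -> emulated
result = crc.checksum(b"white box")
```

The design choice that matters is that the application never branches on what hardware is installed; the platform does, which is exactly how optional CAMs or GPUs could be offered without fragmenting the software ecosystem.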

White-box servers and OCP got their start with Facebook, who decided it would be a lot cheaper for them to build their own servers than to buy a commercial product.  Most users won’t find that to be true, obviously, and so we have already seen white-box companies offering both bare metal (switch/routers and servers) and platform-equipped devices.  These are the kinds of boxes that most white-box prospects would look for; they’re cheaper than big-name stuff and support open functional software of various types.

It seems inevitable to me that there’s going to be a kind of two-level shakeout in the white box space.  The first level shakes out a small number (probably two or three at most in both the switch/router and server categories) of players based on architecture.  Did they pick the right hardware features and configurations and the right platform software to attract the most useful applications?  If “No!” then they die off.  If “Yes!” then they will compete on price and service with others who also made the architecture cut.

Will this eventually shake the big-name providers?  Surely, at least where those providers are selling to buyers like network operators, cloud providers, and large enterprises.  Those players can afford to have the technical planning and support staff needed to deploy and maintain products that, let’s face it, aren’t going to have the vendor and community support of a switch/router or server giant.

The support angle is driving a trend toward a “white box ecosystem” rather than an a la carte approach.  One example I like is SYMKLOUD from Kontron, which offers both bare metal devices/servers and open platforms, and that also supplies open data center switches.  With a data center package in white-box form, buyers have fewer integration worries, which promotes the white box concept even to enterprise buyers.

Because network operators are such big buyers, they’re an early focus for the white-box crowd.  I think that initiatives like VMware’s Virtual Cloud Network are aimed at operators because they are the ones most likely to move aggressively to escape proprietary hardware/software strategies.  So far, white box players have been slow to see how networking has to change to adapt to the cloud.  Many are simply talking about supporting SDN and NFV, which any open model would likely do automatically.

Which white-box trend, switch/routers or servers, will have the biggest impact?  In the long run it may well be that servers will, because virtualization and hosted functions make no sense if you assume you’re transitioning from a proprietary appliance to a proprietary server/cloud platform.  If openness is good, it’s good everywhere.

Carrier cloud is the promised land for anyone chasing the big operators, and it’s surely going to be worth the effort.  In the near term, it’s also going to create a lot of confusion and insecurity among both sellers and buyers.  There are just too many drivers to carrier cloud, operating in different ways in different areas, and at different adoption rates.  SDN, NFV, cloud computing, video and ad optimization, IoT, 5G…the list goes on.  It would be easy for a white-box vendor to position to the wrong driver, and just as easy for their big-name competitors to do the same.  Right now, none of the white, black, gray, or branded boxes seem to be chasing the optimum story.  There’s still time for one camp to take the lead.