Tracking the White-Box Revolution

Sometimes the real story runs deeper than the apparent news.  Nobody should be surprised by AT&T’s decision to stop accepting new DSL broadband connections.  This is surely proof that DSL is dead, even for the skeptics, but DSL isn’t the real issue.  The real issue is what’s behind the AT&T decision, and what it means for the market overall.  AT&T is telling us a more complex, and more important, story.

The fundamental truth about DSL is that, like a lot of telecom technology, it was focused on leveraging rather than innovating.  “Digital subscriber line” says it all; the goal was to turn the twisted-pair copper loop plant built for plain old telephone service (POTS) into a data delivery conduit.  That was a tall order at best.  Improvements to DSL technology pushed its potential bandwidth into the 25 Mbps range, but many of the current loops couldn’t achieve even that, because of excessive loop length or the use of old technology to feed the digital subscriber line access multiplexers (DSLAMs).

The biggest problem DSL faced was that its technology limits collided with consumers’ growing appetite for access bandwidth.  Early attempts to push live TV over DSL involved complex systems (now largely replaced by streaming), and 25 Mbps wasn’t fast enough to support multiple HD/UHD feeds to the same household, at a time when that was becoming a routine requirement.  Competition from cable and from fiber-fed alternatives (including, today, millimeter-wave 5G) means there’s little value in trying to keep those old copper loops chugging along.
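To see how quickly the arithmetic breaks down, here’s a back-of-the-envelope sketch; the per-stream bitrates are rough assumptions I’m using for illustration, not measurements:

```python
# Back-of-the-envelope: why a 25 Mbps DSL cap can't carry a multi-screen household.
# Per-stream bitrates are rough assumptions, not measurements.
HD_MBPS = 6    # typical compressed HD stream
UHD_MBPS = 18  # typical compressed 4K/UHD stream

def household_demand(hd_streams: int, uhd_streams: int, overhead: float = 0.1) -> float:
    """Aggregate downstream demand in Mbps, with a margin for protocol/burst overhead."""
    raw = hd_streams * HD_MBPS + uhd_streams * UHD_MBPS
    return raw * (1 + overhead)

DSL_CAP_MBPS = 25
demand = household_demand(hd_streams=2, uhd_streams=1)
print(f"Demand: {demand:.1f} Mbps vs DSL cap: {DSL_CAP_MBPS} Mbps")
# Demand: 33.0 Mbps vs DSL cap: 25 Mbps -> one UHD plus two HD feeds already overruns the loop.
```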

OK, that’s it for DSL.  Let’s move on to AT&T and its own issues.  AT&T had the misfortune of being positioned as the only telco competitor to Verizon in wireline broadband.  As I’ve noted in past blogs, Verizon’s territory has about seven times AT&T’s potential to recover costs on access infrastructure.  Early on, cable was delivering home video and data and AT&T was not, which forced them to try to provide video, leading eventually to their DirecTV deal (which AT&T is trying to sell off, and which has attracted only low bids so far).  They’re now looking to lay off people at WarnerMedia, their recent acquisition, to cut costs.

AT&T has no choice but to cut costs, because its low demand density (see THIS blog for more) has already pushed it to or near the critical point in profit-per-bit shrinkage.  Whatever their business issues, though, AT&T has an aggressive white-box strategy.  Their recent announcement of an open-model white-box core using DriveNets software (HERE) is one step, but if they can’t address the critical issue of customer access better than DSL can, they’re done.  The only thing that can do that is 5G, so I think it’s clear that no operator will be more committed to 5G in 2021 than AT&T, and that’s going to have a significant impact on the market.

Recall from my white-box blog reference that AT&T’s view is that an “open” network is one that doesn’t commit an operator to proprietary devices.  The AT&T talk at a Linux Foundation event suggests that their primary focus is on leveraging white boxes everywhere (disaggregated routing is part of the strategy).  That means that AT&T is going to be a major Tier One adopter of things like OpenRAN and open, hosted, 5G options overall.

There couldn’t be a better validation of the technology shift, though as I’ve noted, AT&T’s demand density places it closer to the point of no return (no ROI) on the profit-compression curve than most other operators.  Those other operators will either have more time before they must face the music, or will need another decision driver to make them move faster, but I think they’ll all see the handwriting on the wall.

For the major telco network vendors, this isn’t good news, and in fact it’s bad news for every major network vendor of any sort, even the OSS/BSS players.  As nearly everyone in the industry knows, vendor strategy has focused on getting a camel-nose into every buyer tent and then fattening up so there’s no room for anyone else.  The problem with open-model networking is that it admits everyone by default, because the hardware investment and its depreciation period have been the anchor point for our camel.  Take that away and it’s easy for buyers to go best-of-breed, which means a lot more innovation and a lot less account control.

We’ve already seen signs of major-vendor angst over open-model networking.  Cisco’s weak comment to Scott Raynovich on the DriveNets deal, that “IOS-XR is already deployed in a few AT&T networks as white box software,” is hardly a stirring defense of proprietary routers.  Ericsson published a blog post attacking OpenRAN security.  The fact is that no matter what the major vendors say, the cat is out of the bag, and its escape raises some key questions.

The first of these questions is how much of a role open-source network software will play.  AT&T has demonstrated that it sees open hardware as the key; either open-source or proprietary software is fine as long as it runs on open hardware.  That would seem to admit open-source solutions for everything and perhaps kill off proprietary stuff in all forms—like a technology version of the Permian Extinction.  The problem with that exciting (at least for the lifeforms that survive) theory is that there really aren’t any open-source solutions to the broad network feature problem.  Yes, there are tools like Kubernetes and service mesh and Linux, but those are the platforms that run the virtual features, not the features themselves.  That virtual-feature space is wide open.

Can open-source fill it?  Not unless a whole generation of startups collectively sticks its head in the sand.  Consensus advance into a revolutionary position is difficult; it’s easier to achieve revolution through infiltration.  Startups with their own vision, and with no committees spending days on questions like “When we say ‘we believe…’ in a paper, who are ‘we?’” (a question I actually heard on a project), can drive things forward very quickly.

The second of these questions is whether there are any fundamental truths that this new open-model wave will have to embrace.  Obviously, the follow-on question would be “What are they?”, so let’s stipulate that the answer to the first is “Yes!” and move to the follow-on.

The first of our specific fundamental truths is that all open-model software strategies, particularly those that exploit white-box technology, have to separate the control and data planes.  The control plane of a network is totally different from the data plane.  The data plane has to be an efficient packet-pusher, something analogous to the flow switches in the ONF OpenFlow SDN model; AT&T talked about those boxes and their requirements in the Linux Foundation talk.  The control plane is where pushed packets combine with connectivity behavior to become services.  It’s the petri dish where the network’s future value propositions are grown.
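To make the separation concrete, here’s a toy, OpenFlow-flavored sketch; all the names and the crude prefix match are my own illustration, not AT&T’s or the ONF’s design.  The data plane only matches headers against installed rules; all decision-making lives in a separate control-plane object:

```python
# Toy sketch of control/data-plane separation (all names illustrative).
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRule:
    match_dst_prefix: str   # e.g. "10.1." -- crude prefix match for illustration
    out_port: int

class DataPlane:
    """Dumb, fast packet-pusher: no routing logic, just table lookups."""
    def __init__(self):
        self.table: list[FlowRule] = []

    def install(self, rule: FlowRule):           # called only by the control plane
        self.table.append(rule)

    def forward(self, dst_ip: str) -> int | None:
        for rule in self.table:
            if dst_ip.startswith(rule.match_dst_prefix):
                return rule.out_port
        return None                              # miss -> punt to the control plane

class ControlPlane:
    """Where connectivity behavior becomes service: computes and pushes rules."""
    def __init__(self, dp: DataPlane):
        self.dp = dp

    def add_route(self, prefix: str, port: int):
        self.dp.install(FlowRule(prefix, port))

dp = DataPlane()
ControlPlane(dp).add_route("10.1.", port=3)
print(dp.forward("10.1.0.42"))    # 3
print(dp.forward("192.168.0.9"))  # None -> control plane must decide
```

The point of the split shows in the miss case: the data plane never decides anything, it punts to the control plane, which is exactly where new service behavior can grow without touching the packet-pushers.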

The second of our specific truths is that the concept of control planes, now defined in multiple ways by multiple bodies and initiatives, has to somehow converge into a cooperative system.  What 5G calls the “control plane” and “user plane” places a control plane above the IP control plane, which is itself actually part of 5G’s user plane.  The common nomenclature is, at one level, unfortunate; everything can’t have the same name or names have no value.  At another level, it’s evocative, a step toward an important truth.
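As a purely illustrative way to see the naming collision, consider how the layers nest; the structure below is my own simplification, not anyone’s formal model:

```python
# Purely illustrative nesting (my simplification, not a formal model):
# 5G's control plane rides above a "user plane" that contains the whole
# IP stack, including IP's own control plane (BGP, OSPF, and friends).
plane_stack = {
    "5G control plane": ["session management", "mobility signaling"],
    "5G user plane": {
        "IP control plane": ["BGP", "OSPF", "route exchange"],
        "IP data plane": ["packet forwarding"],
    },
}

def describe(stack, indent=0):
    """Print the nesting so the name collision is visible."""
    for name, contents in stack.items():
        print(" " * indent + name)
        if isinstance(contents, dict):
            describe(contents, indent + 2)
        else:
            for item in contents:
                print(" " * (indent + 2) + "- " + item)

describe(plane_stack)
```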

Networking used to be about connecting things; now it’s about delivery.  What the user wants isn’t the network, it’s what the network gets them to.  Thus, the true data plane of a network is subordinate to the set of service-and-experience-coordinating things that live above it, in that “control plane”.  The term could in fact be taken to mean the total set of coordinating technologies that turn bit-pushing into real value.  Because that broader control plane is all about content and experience, it’s also all about the cloud in terms of implementation.

Who and how, though?  The ONF approach uses an SDN controller with a set of northbound APIs that feed a series of service-specific implementations above it, a layered approach.  But while you can use this to show a BGP “service” controlling the network via the SDN controller, BGP there is almost a feature of the interfaces to the real BGP networks, not a central element.  Where do the packets originate, and how are they steered to the interfaces?  In any event, the central SDN controller concept is dead.  What still lives?
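Here’s a minimal sketch of that layered ONF idea, with every name hypothetical: a BGP “service” sitting above a controller’s northbound API, turning learned routes into forwarding intents instead of running as a protocol engine in each box:

```python
# Minimal sketch (all names hypothetical) of a service layered above an
# SDN controller's northbound API.
class SdnController:
    """Central controller exposing a northbound API to services above it."""
    def __init__(self):
        self.intents: dict[str, str] = {}   # prefix -> egress interface

    def program_path(self, prefix: str, egress: str):  # northbound call
        self.intents[prefix] = egress
        print(f"controller: forward {prefix} via {egress}")

class BgpService:
    """Speaks BGP only at the network edge; internally it's just a tenant
    of the controller, not a per-router protocol engine."""
    def __init__(self, controller: SdnController):
        self.controller = controller

    def on_route_advertisement(self, prefix: str, peer_interface: str):
        # Policy could be applied here; then the route becomes an intent.
        self.controller.program_path(prefix, peer_interface)

ctl = SdnController()
BgpService(ctl).on_route_advertisement("203.0.113.0/24", "edge-if-1")
```

Notice that in this model BGP lives only at the edges, as the author’s point suggests; the centralized controller in the middle is exactly the piece whose survival is in question.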

This raises what might be the most profound competitive-race question of the entire open-model network area: is this “control plane” a set of layers, as it’s currently implicitly built, or is it a floating web of microfeatures from which services are composed?  Why should we think of 5G and IP services as layers, when in truth the goal of both is simply to control forwarding, a data-plane function?  Is this new supercontrol-plane where all services now truly live?  Are IP and 5G user-plane services both composed from naked forwarding, rather than the 5G user plane being composed from IP?
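To contrast the two models, here’s a hypothetical sketch of the “floating web” alternative, in which both an IP service and a 5G user plane are composed from the same flat pool of microfeatures over naked forwarding (the feature names are invented):

```python
# Hypothetical sketch of services composed from a flat pool of microfeatures
# over naked forwarding, rather than stacked as 5G-over-IP layers.
FEATURE_POOL = {
    "forwarding": lambda pkt: f"forwarded({pkt})",
    "mobility-anchor": lambda pkt: f"anchored({pkt})",
    "traffic-meter": lambda pkt: f"metered({pkt})",
}

def compose(feature_names):
    """Chain microfeatures into a service pipeline."""
    def service(pkt):
        for name in feature_names:
            pkt = FEATURE_POOL[name](pkt)
        return pkt
    return service

# Both an "IP service" and a "5G user plane" draw from the same pool;
# neither is built on top of the other.
ip_service = compose(["traffic-meter", "forwarding"])
fiveg_user_plane = compose(["mobility-anchor", "traffic-meter", "forwarding"])
print(fiveg_user_plane("pkt-001"))
```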

These questions now raise the third question: what are the big cloud players going to do in this new open-model network situation?  By “big cloud players” I of course mean the cloud providers (Amazon, Google, IBM, Microsoft, and Oracle), but also the cloud-platform players (IBM/Red Hat, VMware, and HPE), and even players like Intel, which is eager for 5G revenue and whose former Wind River subsidiary offers a platform-hosting option.  And, finally, I’d include Cisco, whose recent M&A seems aimed at least in part at the cloud rather than the network.

It’s in the cloud provider community that we see what might be the current best example of that supercontrol-plane, something that realizes a lot of the ONF vision and avoids centralized SDN controllers.  Google Andromeda is very, very close to the goal line here.  It’s a data-plane fabric tightly bound to a set of servers hosting virtual features that could live anywhere in the application stack, from IP control-plane support to application components.  Google sees everything as a construct of microfeatures, connected by low-latency, high-speed data paths that can be spun up and maintained easily and cheaply.  These microfeatures are themselves resilient.  Google says “NFV” a lot, but their NFV is a long way past the ISG NFV.  In fact, it’s pretty close to what ISG NFV should have been.

Andromeda’s NFV is really “NFV as a service”, which seems to mean that microfeatures implemented as microservices can be hosted within Google’s cloud and bound into an experience as a separate feature, rather than being orchestrated into each service instance.  That means that each microfeature is scalable and resilient in and of itself.  This sure sounds like supercontrol-plane functionality to me, and it could give Google an edge.  Of course, other cloud providers know about Google Andromeda (it dates back over five years), so they may have similar stuff of their own waiting in the wings.
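Here’s a sketch of that distinction as I read it; this reflects my assumptions, not Google’s actual design.  Classic NFV orchestrates a fresh feature copy into every service instance, while NFV-as-a-service binds each instance to one shared, self-scaling microfeature:

```python
# A sketch (my assumptions, not Google's actual design) of the
# "NFV as a service" distinction: one shared, self-scaling microfeature is
# *bound* into each experience, instead of a fresh copy being orchestrated
# into every service instance.
class SharedMicrofeature:
    """One multi-tenant feature instance; scaling is its own problem."""
    def __init__(self, name):
        self.name = name
        self.bindings = set()

    def bind(self, service_id):           # lightweight binding, not deployment
        self.bindings.add(service_id)
        return f"{self.name} bound to {service_id}"

# Classic NFV: deploy-per-instance (heavy).
def orchestrate_classic(service_id):
    return f"deployed new firewall VM for {service_id}"

# NFVaaS: bind-per-instance (light).
firewall_service = SharedMicrofeature("firewall-as-a-service")
print(orchestrate_classic("svc-17"))
print(firewall_service.bind("svc-18"))
print(firewall_service.bind("svc-19"))  # same feature, second tenant
```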

The cloud-platform vendors (IBM/Red Hat and VMware, plus the server players like Dell and HPE) would probably love to build the supercontrol-plane stuff.  They’d also love to host 5G features.  So far, though, these platform giants are unable to field the specific network microfeature solutions that ride on top of the platforms and justify the whole stack.  Instead, most cite NFV as the source of hostable functionality.  NFV, besides not being truly useful, is a trap for these vendors because it leads them to depend on an outside ecosystem to contribute the essential functionality that would justify their entry into the new open-network space.  They might as well wait for an open-source solution.

And this raises the final question:  Can any of this happen without the development of a complete, sellable ecosystem that fully realizes the open-model network?  The answer to that, IMHO, is “No!”  There is no question that if we open up the “control plane” of IP, consolidate it with other control planes like the one in 5G, frame out the ONF vision of a programmable network and network-as-a-service, and then stick it all in the cloud, we’ve created something with a lot of moving parts.  Yes, integration is one issue, but the greater issue is that there are so many moving parts that operators don’t even know what they all are, or whether anyone provides them.  For operations focus, I recommend keeping an eye on models and TOSCA (the OASIS Topology and Orchestration Specification for Cloud Applications) and the tools related to it.
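To show why model-driven operations help here, consider a toy, TOSCA-flavored template (simplified structure, not valid TOSCA; the node names are invented): the orchestrator derives deployment order from declared dependencies instead of a hand-maintained script, which is how you keep track of moving parts you didn’t build yourself:

```python
# Toy, TOSCA-flavored model (simplified; not a valid TOSCA file).
# Deployment order falls out of declared dependencies, not a script.
service_template = {
    "packet_core": {"requires": ["db", "monitoring"]},
    "db": {"requires": []},
    "monitoring": {"requires": []},
    "ran_controller": {"requires": ["packet_core"]},
}

def deploy_order(template):
    """Topological sort over the 'requires' relationships."""
    order, placed = [], set()
    def visit(node):
        if node in placed:
            return
        for dep in template[node]["requires"]:
            visit(dep)
        placed.add(node)
        order.append(node)
    for node in template:
        visit(node)
    return order

print(deploy_order(service_template))
# ['db', 'monitoring', 'packet_core', 'ran_controller']
```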

Big vendors with established open-source experience (all of the vendors I named above qualify) will do the logical thing and assemble an ecosystem, perhaps developing the critical supercontrol-plane tool and perhaps contributing it as an open-source project.  They’ll then brand and sell the entire ecosystem, because they already do that, and they already know how it’s done.

This will be the challenge for the startups that could innovate this new open-model space into final victory.  No startup I’m aware of even knows all the pieces of that complete and sellable ecosystem, much less has a ghost of a chance of getting the VC funding (and the time) needed to build it.  Can they somehow coalesce to assemble the pieces?  That’s an interesting question, given that many of them will see the others as arch-rivals.  Can they find a big-player partner among my list of cloud or cloud-platform vendors?  We’ll see.

Another interesting question is what Cisco will do.  They’re alone among the major network vendors in having a software/server position to exploit.  Could Cisco implement the supercontrol-plane element as cloud software, and promote it both for basic OpenFlow boxes and for Cisco routers?  We’ll get to Cisco in a later blog.

I think AT&T’s DSL move is both a symptom and a driver of change.  It’s necessitated by the increasingly dire situation AT&T is in with respect to profit per bit, just as AT&T’s white-box strategy is driven by that same issue.  But it drives 5G, and 5G in turn drives major potential changes in the control-to-data-plane relationship, changes that could also impact white-box networks.  Everything that goes around comes around to going around again, I guess.  I think we’ll see where this complex game settles out pretty quickly, likely by 1Q21.