An ONF-Sponsored Event on Open-Source 5G

How much can open source do for 5G?  That’s a question a lot of operators are asking (and a lot of vendors, too).  The ONF, which is taking a much bigger and broader role these days, thinks it has some answers, so it’s well worth looking at them.  This is (topically speaking) a kind of follow-on to my blog yesterday on the TelecomTV cloud-native conference, so if you’ve not read that, you might want to read it first!

Nobody doubts the motivation behind the open movement in 5G.  Network operators have faced declining profit per bit for almost 15 years now.  Mobile was, for a time, the only space that was immune, because usage-based pricing in some form is still available in mobile services.  Now, even mobile services are seeing the same problems, and since 5G represents at least a major build-out, it’s both a major transformation opportunity and a major business risk.

If operators cannot build 5G using an open model, then what’s likely the last major network service transformation of this decade will reinforce the closed, proprietary model.  Incremental transformation is a contradiction in terms, so getting out of that reinforced and renewed lock-in will be difficult.  The opportunity to create openness is now.

The presentations in the ONF session are a mixture of vendor and operator views, with the former obviously representing a positioning the vendors think they can win at.  Still, they’re useful in laying out what an open model would look like, and they’re also interesting, in that they almost uniformly illustrate the continued risk of what I’ll call the “functional-to-architectural” transformation, the thing that likely doomed NFV to failure.  I won’t call out individual vendors here because the issue is universal, but I will explain the problem.

We’ve been layer-obsessed in networking ever since the Basic Reference Model for Open Systems Interconnection (the “OSI model”) was published in 1984.  When we draw 5G, we draw layers, and these layers are useful in that they define a building-up of functionality starting with the very basics and moving to the (hopefully) sublime experiences.  There’s no question that in 5G, “services” and “management” and so forth are at a higher functional layer than the user and control planes of 5G Core.

The problem is in how literally we take the diagrams.  How dramatic a leap is it for some poor architect to look at a diagram showing the function they’re trying to design as a box in a stack of boxes, and then think of designing a box?  Monolithic implementations tend to develop out of simple functional diagrams taken too literally.  That was a fatal risk for NFV, and it’s still a risk for 5G, open or otherwise.  The universality of this form of explaining things in the material is proof we have to guard against that risk.

Probably a reasonable way to look at the material, which is obviously diverse in both target and approach, is to frame it as a kind of point-and-counterpoint between what I’ll describe as the “operator vision” and the “vendor vision”.  As I said, I’m not going to name people or firms here because I don’t want to personalize comments, only make observations.

Let’s start with an operator seen almost universally as an innovator in 5G, whose presentation is a reasonable one to set the stage.  The first architecture diagram shows the high-stakes game we’re in; they see the future 5G infrastructure as being a combination of edge and central data centers.  That’s why “carrier cloud” had/has the potential to build out a hundred thousand incremental data centers, according to my model.  These map, in functional terms, to a resource pool for NFVI, showing how reluctant operators are to abandon the NFV notion.

The critical business claim made here is that the open approach creates a 40% reduction in capex and a 30% reduction in opex.  Neither of these stated benefits is consistent with the NFV experience data I’ve seen globally; capex reductions have rarely exceeded 25% and opex has been the same or even a bit higher in the open approach.  Since the data cited was for OpenRAN, I think the specialized application elements may offer better savings than the NFV average.  That may be true for 5G on a broader scale too.
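
To see how much the gap between the claimed and the observed percentages matters, here is a rough sketch of the arithmetic.  The baseline figures (100 units each of capex and opex) are purely hypothetical placeholders, not data from any presentation; only the percentages come from the discussion above, and the “observed” case assumes the best-case 25% capex reduction with opex unchanged.

```python
def total_cost(capex, opex, capex_cut, opex_cut):
    """Return total cost after applying fractional reductions to capex and opex."""
    return capex * (1 - capex_cut) + opex * (1 - opex_cut)

# Hypothetical baseline: equal capex and opex, in arbitrary units.
BASE_CAPEX = 100.0
BASE_OPEX = 100.0

baseline = total_cost(BASE_CAPEX, BASE_OPEX, 0.0, 0.0)
claimed = total_cost(BASE_CAPEX, BASE_OPEX, 0.40, 0.30)  # the operator's claim
observed = total_cost(BASE_CAPEX, BASE_OPEX, 0.25, 0.0)  # NFV best case, opex flat

claimed_saving = 1 - claimed / baseline
observed_saving = 1 - observed / baseline

print(f"claimed total saving:  {claimed_saving:.1%}")
print(f"observed total saving: {observed_saving:.1%}")
```

Under that assumed 50/50 split, the claim works out to roughly a 35% total saving against about 12.5% for the NFV-style experience.  A different capex/opex mix would shift both numbers, but the gap stays wide as long as opex doesn’t fall.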

Why OpenRAN might generate at least capital savings, and other benefits as well, is explained by a second operator presentation.  They cite four “disruptors” in the telecom sector.  First is competition from the hyperscalers (cloud providers, predominantly), second the evolution of technology overall toward software and the cloud, third the more-demanding customer experience expectations, and finally the lack of innovation created by vendor consolidation and loss of competition.

All these factors are interesting, but the last one may be especially so.  As network operators have responded to declining profit per bit with pressure on infrastructure pricing and projects, vendors have suffered.  This suffering leads to consolidation.  One of today’s vendors, Nokia, is the sum of three previous major network vendors and many smaller ones.

The same thing that’s leading to consolidation and loss of vendor innovation is also leading to “incrementalism” in network infrastructure.  A massive change in a network requires a massive business case.  If vendor innovation is indeed being stifled, there is little or no chance that the kind of technical shift that creates a massive business case would be presented by any vendor.  That, I think, is the real justification for looking for another model, something to replace a competitive field of proprietary giants.

The same second operator cites three key ingredients in a solution to their problems.  The first is disaggregation, which they’ve taken to mean the breaking up of monolithic devices into virtualized functions.  Second is orchestration to automate the inevitably more complicated operations associated with disaggregated functions, and third is open APIs to expose critical capabilities for exploitation by evolving services and techniques.

The final point this operator makes is that the “edge cloud” is going to be a key point in differentiating telcos from hyperscalers/cloud providers.  This raises the question of why so many operators are partnering with public cloud providers, and seem to be stalling on making any progress in carrier cloud at all, much less mobile edge.  It also suggests that either the operator believes that hyperscalers will enter the “carrier cloud” market, perhaps offering 5G-as-a-service, or that the telcos will inevitably have to climb up the value chain to compete on the hyperscalers’ own turf.

The question of why operators partner with public cloud providers is particularly pointed when a third operator has a public-cloud partner at the center of its own architecture.  Fortunately, this operator may offer an explanation, too.  They show the edge cloud, presumably owned by them, connecting to a public cloud.  This would suggest that operators are almost universally interested in public cloud as a supplementary or transitional resource, which of course would be good news for vendors if it’s true.

Speaking of vendors, this is a good place to start thinking and talking about them, and about their approach to the open 5G theme.  As I noted above, there’s still what is to me a disquieting tendency for vendors to hunker down on the NFV story, despite the fact that in another of the recent online events, Google admitted that NFV had succeeded primarily in changing the mindset of telcos, not through adoption.  One operator did retain NFV in their diagrams, but the others were more “cloud” oriented, generalizing their goal as “exploiting cloud technology” or even “cloud-native”.  I think there are a lot of people who don’t want to face up to the fact that NFV was a seven-year boondoggle, but they’ll quietly accept that something beyond it is needed.  One vendor presentation implies that with a platform layer that hosts “containers” and “NFV”.

The ONF presented its own view of what at least part of that “something beyond” might be, which is an SDN-centric vision of routing.  They have an SDN controller talking to a bunch of P4 Stratum switches, running applications like BGP and perhaps even 5G.  This is surely a step in a different direction, but I have concerns about whether it’s a better one, because of the implicit centralization.

I’m all for control/data-plane separation, as readers of my blog surely know.  I’m all for specializing forwarding devices for the data plane.  But I’m not for centralizing the control plane, because we have absolutely no convincing information to prove that central forwarding control can work at a network scale.  You need hierarchies or federation, and those would need some work to get defined.  We may well not have time for that work to be done.
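
The scaling concern can be made concrete with a toy count of how many control sessions each controller has to hold directly.  This is only an illustrative sketch: the switch count and domain size are hypothetical, the two-level hierarchy is just one possible arrangement, and real controller load depends on far more than session count.

```python
import math

def central_fanout(num_switches):
    """A single central controller holds one control session per switch."""
    return num_switches

def hierarchical_fanout(num_switches, domain_size):
    """Two-level hierarchy: each domain controller manages up to domain_size
    switches, and one root controller manages the domain controllers.
    Returns (max sessions per domain controller, sessions at the root)."""
    num_domains = math.ceil(num_switches / domain_size)
    return min(domain_size, num_switches), num_domains

switches = 10_000  # hypothetical network size

print(central_fanout(switches))            # 10000
print(hierarchical_fanout(switches, 100))  # (100, 100)
```

The hierarchy caps what any one controller has to track, but it also creates exactly the work noted above: somebody has to define how the domain controllers and the root exchange topology and status.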

I’m also concerned about later elements of the ONF presentation, in particular the way they seem to be coupling 5G to the picture.  They introduce policy control and enforcement, which to me makes no sense if you assume you have complete and central control of forwarding.  An SDN-like mechanism, or any mechanism designed to provide dynamic forwarding control, should present its capabilities as network-as-a-service, to be consumed by the higher elements, including 5G.
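
As a minimal sketch of what “network-as-a-service, consumed by the higher elements” could look like, the fragment below has a 5G-side function ask for connectivity through a narrow interface without knowing anything about how forwarding is controlled.  Every name here (PathRequest, NetworkService, ToyForwardingControl, attach_session, the site labels) is hypothetical and invented for illustration; none of it comes from the ONF material or any real API.

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass(frozen=True)
class PathRequest:
    """What the consumer asks for: endpoints and a latency bound."""
    src: str
    dst: str
    max_latency_ms: float

class NetworkService(Protocol):
    """The only thing higher layers see of the forwarding-control layer."""
    def connect(self, request: PathRequest) -> str: ...

class ToyForwardingControl:
    """Stand-in for whatever actually controls forwarding (SDN or otherwise)."""
    def __init__(self):
        self._paths = {}

    def connect(self, request):
        # A real implementation would compute and install a path here.
        path_id = f"path-{len(self._paths) + 1}"
        self._paths[path_id] = request
        return path_id

def attach_session(network: NetworkService, ran_site: str, user_plane: str) -> str:
    """Hypothetical 5G-side consumer: requests connectivity, nothing more."""
    return network.connect(PathRequest(ran_site, user_plane, max_latency_ms=10.0))

net = ToyForwardingControl()
print(attach_session(net, "ran-site-a", "upf-edge-1"))  # path-1
```

The point of the design is that policy lives in what the consumer asks for, not in a separate enforcement layer bolted onto the forwarding controller.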

What I see at the vendor (or “source”) level overall is a tendency to draw diagrams and propose platforms rather than to define complete solutions.  It’s easy to show the future as being a set of indefinite (and therefore presumably infinitely flexible) APIs leading up to a limitless set of services and refinements.  There is a sense that there has to be a “fabric” or “mesh” of some sort that lives above the forwarding process and hosts the control plane(s) (both the IP and 5G ones), but there is no proposed open-source solution for those elements.

The thread that ties all the material together is a thread of omission.  We don’t have a specific structure for hosting a separate control plane in a cloud-native, practical way.  We don’t have an architecture to define how that control plane would work, how its elements would combine to do its job of controlling forwarding and keeping track of topology and status.  Google has done some of this in Andromeda, its SDN/BGP4 core, but it’s not a general solution and Google has never said it was.

Innovation, the innovation that the second operator said had been lost, is needed in this very area.  Without specificity in the framework of that cloud-native universe of floating functionality that lives above forwarding, we’re not going to have a practical transformed 5G, or much of anything else.

We also may have to get specific with respect to “open” in networks.  Does every piece of hardware have to be based on off-the-shelf stuff?  Does all the software have to be open-source?  We cannot achieve this today, consistent with having something that’s actually working and working within reasonable performance and cost limits.  There’s still a lot of room for innovation, and just because the giants of the past won’t or can’t innovate doesn’t mean the startups of today, and the future, shouldn’t be allowed to give it a go.  It may prove that they should have their chance.

Looming in the background here is the growing interest of public cloud providers in offering 5G.  Hosting 5G in the cloud could still rely on open implementations of 5G, but since the cloud can already host almost anything, such a basic approach wouldn’t offer a cloud provider much differentiation.  They’re all clearly angling for a role in supplying “5G Core-as-a-Service”, which is why Microsoft recently bought Metaswitch, a vendor that has a 5G software stack.  Can the cloud providers’ as-a-service approach defeat the open-source movement, or will operators see it as replacing lock-in to traditional mobile infrastructure vendors with lock-in to cloud providers?

An open network doesn’t lock you in.  That’s the simple definition I think we have to accept, at least for now.  Since it’s the cost of the physical devices, and the contribution of any annual-subscription software, that creates lock-in, we have to match approaches to the test of controlling these two factors if we want to preserve openness…and that still leaves the question of innovation.

The ONF presentation showed innovation, but in a direction that’s likely to raise a serious risk in performance and scalability.  Yesterday, we heard about the DriveNets win in AT&T’s core.  DriveNets has, at least within one of their clusters, a cloud-hosted and separate control plane.  Could this spread between clusters, become something like a distributed form of SDN controller, and resolve the problems with the future model of networking that shows the most promise?  I hope to blog more on them, as soon as I can get the information I need.  This might be a critical innovation, even if DriveNets software isn’t open-source.

If operators want to open everything, to eliminate any vendor profit motive in building network equipment, they’ll need to accept that the innovation baton will necessarily pass to them.  Right now, by their own admission, they are completely unprepared to provide the technical and architectural support needed to play that role.  That means that their vision of the network of the future has to acknowledge not just the loss of the old proprietary innovators, but also the fact that new ones will be needed, along with new visions of “openness” to accommodate them.