Is Rakuten’s Indictment of Telecom Software on Target?

Here’s a bold statement for you: “There is no magic that an Amazon Web Services, Google, and Microsoft could enable because the underlying software architecture is absolutely flawed. It needs to evolve.” This, from the CTO of Rakuten Mobile, as quoted in an SDxCentral piece. I agree, of course, and in fact I’ve been trying for months to get a reading on what senior planners and architects in the telecom space think about “cloud-native”.

Let’s start with the “why” question. Why is underlying telecom software absolutely flawed? The operator planners/architects offered three reasons. First, operations software (the OSS/BSS) is core business software that in telecom, as in other verticals, tends to change very slowly. One good reason is that vendors don’t want to throw the space open to new competitors with massive changes. Second, telecom standards still tend to drive the formulation of technology concepts, and standards bodies are neither anxious to start over in design, nor particularly equipped with the skills needed to do it. Third, the network operators are unable to push for change because they don’t understand the new model.

One operator gave me an interesting example that crosses into all of these points. The example was 5G, and the operator noted that the 5G architectural model has a distinct device orientation. There are functional elements that are represented by boxes in the diagrams, and those boxes connect with each other through lines that represent APIs or interfaces. In the models, operations systems are either totally ignored or represented as what the planner said were “boxes in the sky”, high-level elements that were represented only as general capabilities.

I’ve noted in earlier blogs that when you draw a functional diagram that’s based on boxes connected by interfaces, you tend to guide implementations along those lines. As my planner contact noted, it’s difficult to draw a representation of a microservice-based implementation of 5G that conveys any sense of functional behavior. You end up with one big box that says “5G” or “O-RAN”, and that’s decomposed inside into microservices. Not only that, to make that structure “open”, you have to define a lot more APIs and message flow relationships. The old functional-box model is appealing because it can be grasped easily. It’s just not easy to turn it into a true cloud-native structure.

The Rakuten CTO’s point is that, for a variety of unspecified reasons, we’ve not turned much of anything in telecom software into a cloud-native structure, and because of that we don’t have what he calls “elasticity”, which means scalability, agility, and resilience. The properties of cloud computing don’t push through bad software design to somehow empower the end result; you have to design the software to exploit the cloud. But who does that?

Every incumbent wants evolution rather than revolution, for obvious reasons, like the fact that it preserves your win. That’s particularly true in the OSS/BSS space, because incumbency there is almost like an annuity; win and you reap the benefits for your career lifetime. A less-obvious reason is that box-in-the-sky point. Network management and operations have to link to service and business management and operations, and those linkages tend to be preserved through migrations and technology changes, because preserving them limits the scope of things you need to change when something new comes along. That creates the problem of the “sea anchor effect”.

A sea anchor is something you toss over the stern to create a consistent backward pull to counter weather conditions. When you have a monolithic legacy OSS/BSS framework that presents interfaces to network operations that you need/want to preserve, those interfaces are a sea anchor. Not only do they slow you down, they also impose a direction on you. In particular, if those interfaces are the classic transactional, polling-for-status sort of thing, it’s difficult to couple them to event-driven systems in network operations. Thus, legacy architectures for OSS/BSS, and legacy OSS/BSS thinking, tend to act as a brake on cloud-native designs for lifecycle management.
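To make the coupling problem concrete, here is a minimal Python sketch of the kind of adapter you end up writing when a polled status interface has to feed an event-driven consumer. Everything in it is hypothetical: `LegacyStatusAPI` stands in for a transactional OSS/BSS interface, and `PollingToEventAdapter`, `StatusEvent`, and the resource names are illustrative, not taken from any real product or standard.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class StatusEvent:
    """A state change, synthesized from two successive polls."""
    resource: str
    old: Optional[str]
    new: str


class LegacyStatusAPI:
    """Hypothetical stand-in for a polled, transactional status interface."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def set_status(self, resource: str, status: str) -> None:
        self._status[resource] = status

    def get_status(self, resource: str) -> str:
        return self._status.get(resource, "unknown")


class PollingToEventAdapter:
    """Turns repeated polls into change events for subscribers."""

    def __init__(self, api: LegacyStatusAPI, resources: Iterable[str]) -> None:
        self._api = api
        self._resources = list(resources)
        self._last: Dict[str, str] = {}
        self._subscribers: List[Callable[[StatusEvent], None]] = []

    def subscribe(self, handler: Callable[[StatusEvent], None]) -> None:
        self._subscribers.append(handler)

    def poll_once(self) -> List[StatusEvent]:
        """Poll every resource once; emit an event only where status changed."""
        events: List[StatusEvent] = []
        for resource in self._resources:
            current = self._api.get_status(resource)
            previous = self._last.get(resource)
            if current != previous:
                self._last[resource] = current
                event = StatusEvent(resource, previous, current)
                events.append(event)
                for handler in self._subscribers:
                    handler(event)
        return events
```

Even in this toy form, the drag is visible. The event-driven side learns about a change no sooner than the next poll, and a status that flips and flips back between two polls is never seen at all.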

Operations systems are database-intensive, and the fact that Rakuten has invested in Robin.io, a specialist in cloud databases, likely indicates that they’re thinking a lot about a cloud-native implementation even at the OSS/BSS level. Not only would this make operations systems more resilient and agile, it would make it easier to couple them to event-driven, cloud-native, network orchestration and management.

Just making OSS/BSS cloud-native wouldn’t necessarily cure the problem of the old-style interfaces. Operations interfaces contributed to the problem that the NFV ISG had; they ruled operations systems out of scope, so they had to support those interfaces. They also drew up the traditional functional-box diagram, which was then taken as a literal structure, creating a monolithic box network rather than a cloud-native one. What they should have done was to adopt cloud technology from the first.

Software inertia bites even here, though. Most “network functions” that are virtualized are derived from appliances, which NFV calls “physical network functions” (PNFs). Appliance vendors immediately showed their own versions of OSS/BSS legacy-think. First, they wanted minimal adaptation of their device software to virtualize it, and second, they wanted most of the money they’d have gotten for an appliance. The latter goal tended to price the popular versions of PNFs like firewalls right out of the market, and also to increase the “onboarding” cost and difficulty. The former goal meant that the virtual functions were derived from monolithic, non-cloud-native, versions.

So does this mean that all of telecom is doomed to “monolithism” forever? No, but it does probably mean that telecom is going to have to start thinking not just about “cloud-native” but about the way that legacy core business software (OSS/BSS in this case) relates to new software. Enterprises today are increasing their use of the cloud, but not by making legacy applications cloud-native. Instead, they’re front-ending legacy applications with new cloud technology.
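A rough Python sketch of the front-ending idea follows, under stated assumptions. `LegacyBilling` is a hypothetical stand-in for a core application that stays unchanged, and `CloudFrontEnd` is an illustrative front-end layer that validates requests and serves repeat reads itself, so only real transactions reach the legacy system. None of these names come from a real OSS/BSS product, and real front ends would differ in many details.

```python
from typing import Dict


class LegacyBilling:
    """Hypothetical legacy core application; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0
        self._balances: Dict[str, int] = {"acct-1": 100}

    def query_balance(self, account: str) -> int:
        self.calls += 1
        return self._balances[account]

    def post_charge(self, account: str, amount: int) -> int:
        self.calls += 1
        self._balances[account] -= amount
        return self._balances[account]


class CloudFrontEnd:
    """Illustrative front end: validates input and caches reads, so the
    legacy system is touched only when a transaction truly needs it."""

    def __init__(self, legacy: LegacyBilling) -> None:
        self._legacy = legacy
        self._cache: Dict[str, int] = {}

    def get_balance(self, account: str) -> int:
        # Serve repeat reads from the front end's own copy.
        if account not in self._cache:
            self._cache[account] = self._legacy.query_balance(account)
        return self._cache[account]

    def charge(self, account: str, amount: int) -> int:
        # Reject bad requests before they ever reach the legacy system.
        if amount <= 0:
            raise ValueError("amount must be positive")
        new_balance = self._legacy.post_charge(account, amount)
        self._cache[account] = new_balance
        return new_balance
```

The design choice here is that the legacy application is left alone. Scalability and agility are added in the layer in front of it, which is where the new cloud software lives.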

5G and O-RAN represent an opportunity for telecom, a chance to look at software in a modern way and to apply the cloud front-end strategy used by enterprises to their own OSS/BSS. Today, though, we don’t have a consistently cloud-native model for 5G or O-RAN, both of which still have the OSS/BSS interface ties and both of which still present structure diagrams that boil down to box networks with box-friendly interfaces, rather than microservices and distributed features.

Rakuten’s CTO is right, and in fact isn’t going quite far enough in his indictment. Not only is telco software flawed, the telco software process is flawed, which makes it much harder to create a new model for software, a cloud-native model. Consensus processes like standards activities require a broadly held industry vision, and in the telco world, we don’t have it. Will some vendor or cloud provider step up on this one? Among the cloud providers, Microsoft seems the most likely to try to fix the cloud-native problem, and I’ll look at how their 5G/O-RAN solution seems to be evolving in my next blog.