A Cloud-Native 5G/O-RAN Model

Why do I make so big a point about “box-centric” specifications for network virtualization? If somebody virtualizes a box, isn’t it the same thing as virtualizing everything that’s in it? In a blog last week, I looked at the 5G O-RAN specification and talked about some of my issues in abstraction of the functionality. I want to dig deeper today, looking at how you’d do O-RAN in an optimum cloud-native way. As always, I want to start from the top.

We get online from our desktop and laptop computers all the time. We sign on to WiFi, we’re assigned an IP address, and we’re there. Let’s call the “service” that this represents “Service”. Why doesn’t this simple approach work for 5G? The answer is in the word “mobile”.

In a mobile network, we have a device that we move around with, and the connection is made from a series of cell sites. We move through these sites as we walk or drive. If each of these sites were like WiFi, we’d get a new IP address in each of them as we entered, which would be fine if we never expected to have a voice or data session active while we were moving. Since we do, we’d expect to get an IP address from the cell site when we turned on our phone, and we’d need to keep that address as we moved. Otherwise, any web activity we had ongoing would break when the server replied to an address we no longer had, one that would no longer reach us.

What makes mobile connectivity work? We need to define another service, which I’ll call “Mobility Management”. This service somehow finds out where we are and ensures that our traffic is directed to the right cell. The Mobility Management service needs to have a “sub-service” we’ll call “Registration”, that’s invoked when a user appears in a cell, to ensure they have a right to service there. The figure below shows this simple functional structure.

A mobile device needs to support four services, then, meaning four interfaces exist between the device and the mobile (5G) network. We could draw 5G as a black box and show four service interfaces to the device, period. If the mobile device and the network conformed to the same spec (including the radio spec, of course) for these interfaces, any implementation would suit the device and the services. This sure makes 5G look simpler, right?

A High-Level Functional View of 5G
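To make the Mobility Management and Registration services a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not something taken from a 5G spec: the `MobilityManager` class, the `subscribers` set used as the "right to service" check, and the toy `10.0.0.x` address assignment. The point is only that the device keeps one IP address while the record of its current cell changes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class RegistrationError(Exception):
    """Raised when a device has no right to service in a cell."""


@dataclass
class MobilityManager:
    # Hypothetical subscriber list standing in for a real entitlement check.
    subscribers: set[str]
    # device_id -> fixed IP address, assigned once at first registration.
    addresses: dict[str, str] = field(default_factory=dict)
    # device_id -> cell currently serving the device.
    locations: dict[str, str] = field(default_factory=dict)

    def register(self, device_id: str, cell_id: str) -> str:
        """Registration sub-service: invoked when a device appears in a cell."""
        if device_id not in self.subscribers:
            raise RegistrationError(device_id)
        if device_id not in self.addresses:
            # Toy address assignment, purely illustrative.
            self.addresses[device_id] = f"10.0.0.{len(self.addresses) + 1}"
        self.locations[device_id] = cell_id
        return self.addresses[device_id]

    def cell_for(self, ip: str) -> str | None:
        """Find the cell that traffic for a fixed IP should be directed to."""
        for device_id, address in self.addresses.items():
            if address == ip:
                return self.locations[device_id]
        return None


mm = MobilityManager(subscribers={"phone-1"})
ip = mm.register("phone-1", "cell-A")          # phone turned on in cell A
same_ip = mm.register("phone-1", "cell-B")     # phone moves to cell B
```

After the second call the address is unchanged, but `mm.cell_for(ip)` now points at the new cell, which is the behavior a server replying to that address depends on.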

My diagram is a functional view, whereas the specs focus on the implementation of the services, and that’s my first quarrel with most telecom-related specs. They either don’t produce a functional diagram at all, or they produce one by defining how the function could be performed. We draw these diagrams with boxes representing functions, and we show workflows between interface points that we label. The diagram below represents O-RAN as the Alliance shows it, and you can see the difference.

O-RAN Architecture (from O-RAN Alliance)

What you see here is focused so much on implementation it doesn’t even show those high-level interfaces at all. That’s because O-RAN is really focused on supporting the Mobility and Registration services, and those services are driven from the network side. What you do see are the internal interfaces, things like A1, E1, E2, and F1. You also see functional blocks like CU-CP and CU-UP or O-DU and O-RU. You might wonder where the user is in all of this (nowhere) and where and how the whole process represented in the diagram kicks off.

The temptation is to declare this to be an “application”, something that’s running to supply those Mobility and Registration services, which are triggered by things that are found by the “Application Layer” of the near-RT RIC. The Mobility and Registration services created in that layer interact with features in the device, visible to the user only in the way mobility works. Actually, mobility interacts with deeper 5G standards, 5G Core, for the whole of the service, and if you were to look at a picture of 5G NR (the RAN piece) and 5G Core combined, you could identify users, cells, and the Internet.

The problem here is that declaring something to be an application doesn’t provide a hint about the software architecture it uses, and thus doesn’t provide any insight into how it could be made “cloud-native”. If we could travel back in time, I’d suggest that the O-RAN people start with a functional model similar to the one I showed earlier, but of course that’s not going to happen. We’re also not likely to see any changes in O-RAN that would violate the Alliance’s architecture diagram above. What we’ll have to do is try to interpret the O-RAN diagram in light of the functional model diagram.

If we take the functional diagram as a starting point, and assume the “service” structure I described above, we can say that a “real-world cloud-native” O-RAN implementation starts with two event sources, the User Devices and the Access Network. The user device initiates or participates in registration, and it surely initiates requests for Services like voice/SMS or Internet access. The Access Network initiates/participates in roaming between cells and registration.

Events from these two sources would pass to the near-real-time RIC for handling, and we could visualize this function as being a variety of things, in cloud terms. At the superficial level, it might look like an API broker that pops an event queue and initiates a request for an appropriate microservice to handle the event. It could also be a service mesh with the same goal. Finally, we could presume that since something, somewhere, has to know about a User Device relationship to both 5G infrastructure and Services, it could be a state/event-based microservice set linked to a “relationship record” associated with the device. I’d favor this approach.
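The state/event approach could look something like the sketch below. It is a hedged illustration, not an O-RAN design: the event kinds (`attach`, `session_request`, `handover`, `release`), the state names, and the `RelationshipRecord` fields are all hypothetical. An event queue is popped, the device's relationship record is looked up, and the combination of current state and event kind selects the handler, which in a real deployment would be a microservice rather than a local function.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    source: str            # "device" or "access" (the two event sources)
    kind: str              # hypothetical event kind, e.g. "attach"
    device_id: str
    cell_id: str | None = None


@dataclass
class RelationshipRecord:
    device_id: str
    state: str = "idle"
    cell_id: str | None = None


# Each handler stands in for a microservice; it returns the next state.
def on_attach(record, event):
    record.cell_id = event.cell_id
    return "registered"


def on_session_request(record, event):
    return "in_session"


def on_handover(record, event):
    record.cell_id = event.cell_id
    return record.state      # moving between cells does not change the state


def on_release(record, event):
    return "registered"


# State/event table: (current state, event kind) -> handler.
TRANSITIONS = {
    ("idle", "attach"): on_attach,
    ("registered", "session_request"): on_session_request,
    ("registered", "handover"): on_handover,
    ("in_session", "handover"): on_handover,
    ("in_session", "release"): on_release,
}


def dispatch(queue, records):
    """Pop events and run the handler chosen by the relationship record."""
    rejected = []
    while queue:
        event = queue.popleft()
        record = records.setdefault(
            event.device_id, RelationshipRecord(event.device_id)
        )
        handler = TRANSITIONS.get((record.state, event.kind))
        if handler is None:
            rejected.append(event)   # event is not valid in this state
            continue
        record.state = handler(record, event)
    return rejected


records = {}
queue = deque([
    Event("device", "attach", "ue-1", "cell-A"),
    Event("device", "session_request", "ue-1"),
    Event("access", "handover", "ue-1", "cell-B"),
    Event("device", "attach", "ue-1", "cell-C"),   # invalid while in session
])
rejected = dispatch(queue, records)
```

Because all of the context lives in the record, the handlers themselves hold no state, which is what would let them be scaled or replaced freely in a cloud-native deployment.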

The “orchestration” function of the near-real-time RIC is a deeper question. Near-real-time suggests that there’s a need to minimize latency, which to me suggests that we don’t want to use serverless hosting of functions. In any event, I think that the 5G O-RAN instance is likely going to see enough events that there’s little value in loading a microservice on demand and then aging it out; there’s little chance it wouldn’t be needed almost immediately.

Containers seem to be a logical approach for hosting, which might make RIC orchestration (for both the near- and non-real-time RICs) a Kubernetes application, perhaps with the addition of a federation strategy like Anthos. I’m more convinced of the need for federation here than of the need for Kubernetes in the near-real-time RIC. It may be that resources within an O-RAN “enclave” would be dedicated to O-RAN, in which case generalized container orchestration could be overkill. However, if you believe in edge computing, then we need to generalize as much of the RIC processing as possible, even including event-handling, to make it a platform software component for the edge. In that case, a RIC would be truly an edge application.

If we were to accept this model, then the boxes in the O-RAN diagram become microservices or flows of microservices driven by the relationship record. That would tend to converge boxes that the diagram shows separately (the CU and DU, for example), since these do much the same thing but at different points in an event flow.

The “user plane” of the 5G flow, my “Services” block, is something that I think should be more abstracted. In 5G, the user plane starts with the RU-to-DU “front-haul”, then DU-CU for the “mid-haul”, and finally CU-5GC (5G Core) for “backhaul”. Most of the heavy lifting is done in the 5G Core Packet Core, which itself has a pair of Gateways, but all of the elements perform only basic functions that could easily be viewed as hosted either on servers or on disaggregated white-box components of a router (like the DriveNets model). Rather than specify a bunch of discrete functions, I think we should view the entire user plane as a “service” as my diagram suggests, with an interface facing the cell access and user device portions on one end, and the service networks on the other.

There are two reasons for this view. First, there are multiple ways in which a fixed IP address associated with a mobile device could be “found” in the right cell, and we should be open to all of them. Second, I think that the abstraction of the user plane as a “Service” promotes loose coupling between a 5G O-RAN or Core implementation and the data plane, which reduces the risk of having an open model become “closed” in some implementations. In 4G Evolved Packet Core, we have “admission control” elements that regulate call and data traffic to reduce congestion risk; the equivalent 5G Core functions would be represented by APIs rather than boxes in my model.
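The loose coupling argued for here can be sketched as an abstract interface with interchangeable implementations. This is only an illustration under stated assumptions: `UserPlaneService`, `TunnelAnchor`, and `HostRoutes` are hypothetical names, and the two classes are toy stand-ins for "a tunnel from a fixed anchor" and "a per-device route update", not implementations of any 3GPP mechanism. The control-side `handover` function sees only the interface.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class UserPlaneService(ABC):
    """The user plane viewed as a service, not as a set of boxes."""

    @abstractmethod
    def attach(self, device_ip: str, cell_id: str) -> None:
        """Tell the user plane which cell now serves a fixed device IP."""

    @abstractmethod
    def forward(self, device_ip: str, payload: bytes) -> tuple[str, bytes]:
        """Return the target cell and the bytes sent toward it."""


class TunnelAnchor(UserPlaneService):
    """Toy approach 1: a fixed anchor re-points a tunnel to the device."""

    def __init__(self):
        self.tunnels = {}            # device_ip -> cell at the tunnel's far end

    def attach(self, device_ip, cell_id):
        self.tunnels[device_ip] = cell_id

    def forward(self, device_ip, payload):
        # Illustrative encapsulation marker, not a real tunnel header.
        return self.tunnels[device_ip], b"TUN|" + payload


class HostRoutes(UserPlaneService):
    """Toy approach 2: a per-device route is rewritten on each move."""

    def __init__(self):
        self.routes = {}             # device_ip -> next-hop cell

    def attach(self, device_ip, cell_id):
        self.routes[device_ip] = cell_id

    def forward(self, device_ip, payload):
        return self.routes[device_ip], payload


def handover(user_plane: UserPlaneService, device_ip: str, new_cell: str) -> None:
    """Control-side logic depends only on the abstract service interface."""
    user_plane.attach(device_ip, new_cell)


results = []
for user_plane in (TunnelAnchor(), HostRoutes()):
    user_plane.attach("10.0.0.1", "cell-A")
    handover(user_plane, "10.0.0.1", "cell-B")
    results.append(user_plane.forward("10.0.0.1", b"hello"))
```

Both implementations deliver traffic to the new cell after the same `handover` call, even though they differ internally. A function like admission control could be exposed the same way, as one more method on the interface rather than as a separate box.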

This is how I think a true cloud-native 5G model could be derived. I’m sure there are other approaches, but I think this does a fair job of balancing the goals of cloud-native software, the future edge computing evolution, and the structure and interfaces specified in O-RAN and 5G Core. Let’s hope it at least generates some discussion on these points, before we end up turning O-RAN into one more virtual-boxes structure.