Taking the Path to Cloud EPC

The announcement that Affirmed Networks EPC is now available as a hosted AWS service is interesting in two dimensions.  The obvious one is that Amazon itself might be trying to get into the mobile network business, to take advantage of the hosting opportunity that might otherwise drive carrier cloud.  The less obvious one is that the move demonstrates the value of thinking of future services in two layers, and that second dimension will influence where Amazon itself might go with the first.

EPC stands for “Evolved Packet Core”, which is the 3GPP model for supporting mobility and mobility management.  EPC (simplified) routes normal IP traffic from a “Packet Data Network Gateway” (connecting to the Internet) to a “Serving Gateway” that connects to the cell sites, via tunnels that can be redirected as the user moves between cells.  In the old days, the PGW and SGW were appliances, basically enhanced routers.  The trend of course has been to turn them into hosted instances, which is where the “virtual” in “Virtual EPC” comes in.

Virtual EPCs aren’t unique, of course.  Affirmed has made one available for some time, and you can also get virtual EPCs from Core Network Dynamics and other sources.  They’re already fairly widely deployed, and sure to be even more widely used as we evolve mobile infrastructure.  Their normal use is to let operators build vendor-independent, scalable, EPC deployments using their own data centers and servers.  A vEPC hosted in the cloud would eliminate the need to deploy those data centers, making the strategy a good one for serving “thin” areas.  The strategy is particularly helpful for mobile virtual network operators (MVNOs) who ride on one or several mobile operators and may have some geographies where they just don’t have the business/traffic to justify their own hosting.

In theory, almost any vEPC implementation can be ported to the cloud, but Affirmed has done the necessary customization to make their stuff work on AWS.  It can be run anywhere Amazon has cloud hosting, scaled as needed to support the traffic load, and linked to virtually any radio network, including WiFi and 5G.  The downside is that it creates an ongoing expense for the hosting, which if the traffic is high enough could exceed the cost of self-hosting.  In other words, like any cloud application, the vEPC has to be cost-managed as activity increases, and perhaps eventually displaced by a self-hosted vEPC (which of course can also be Affirmed’s product).
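To make that trade-off concrete, here is a minimal sketch of the break-even arithmetic.  It assumes a deliberately simple model: cloud hosting costs a flat amount per terabyte carried, while self-hosting has a fixed monthly cost plus a smaller per-terabyte cost.  The function names and every price in it are hypothetical illustrations, not Affirmed or AWS figures.

```python
# Hypothetical cost model: all prices are made-up illustrations,
# not Affirmed or AWS pricing.

def hosted_monthly_cost(traffic_tb, hosted_per_tb):
    """Cloud-hosted vEPC: cost scales purely with traffic carried."""
    return traffic_tb * hosted_per_tb


def self_hosted_monthly_cost(traffic_tb, fixed_monthly, self_per_tb):
    """Self-hosted vEPC: fixed data center cost plus a per-TB cost."""
    return fixed_monthly + traffic_tb * self_per_tb


def breakeven_traffic_tb(hosted_per_tb, fixed_monthly, self_per_tb):
    """Monthly traffic (TB) at which the two options cost the same.

    Returns None when cloud hosting is not more expensive per TB, since
    self-hosting then never catches up under this model.
    """
    margin = hosted_per_tb - self_per_tb
    if margin <= 0:
        return None
    return fixed_monthly / margin


if __name__ == "__main__":
    point = breakeven_traffic_tb(
        hosted_per_tb=50.0, fixed_monthly=20000.0, self_per_tb=10.0
    )
    print(f"Self-hosting wins above roughly {point:.0f} TB per month")
```

Below the break-even point the hosted vEPC is the cheaper choice, which is the “thin” area case; above it, the operator would want to look at displacing it with a self-hosted instance.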

The cost side of this is what opens my second point.  There are two layers to EPC operation: a signaling layer that mediates the data-plane (tunnel) activity, and the data plane where the tunnels are carried.  The PGW-to-SGW relationship actually carries customer traffic, which is the big cost/performance burden on cloud hosting.  So suppose you took that data-plane traffic off the public cloud and left the signaling side intact?

Wireless standards groups aren’t necessarily the most data-savvy bodies out there, but you can fairly presume that if mobility management were available in native form within IP, it would have been used instead of the tunneling approach.  What would be needed is for the IP network itself to “follow” a user roaming across cells.  Some IETF work has been done to propose a way of doing that, but it’s also true that SDN could be used.

The purist model of SDN, the one based on OpenFlow control of white-box switches, could obviously be used to create forwarding rules that linked a cellular user to the Internet on-ramp.  Those rules could then be modified based on control-plane signals of a cell change.  Similarly, you could write a P4 forwarding-language program to accomplish the same thing.  If this were done, then the signaling-level interactions in vEPC could continue to be cloud-hosted while the data plane is passing through a “roamable” service offered by the underlying network.
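The idea of rewriting forwarding rules on a cell change can be sketched in a few lines.  This is a toy model only: it assumes one rule per user keyed by the user’s IP address, and the class and method names (RoamableForwarder, attach, handover, forward) are hypothetical, not part of OpenFlow, P4, or any vEPC product.

```python
from dataclasses import dataclass, field


@dataclass
class RoamableForwarder:
    """Toy data plane: one forwarding rule per user, keyed by user IP."""

    rules: dict = field(default_factory=dict)

    def attach(self, user_ip, cell_site):
        """Install the initial rule when a user first connects."""
        self.rules[user_ip] = cell_site

    def handover(self, user_ip, new_cell_site):
        """Rewrite the rule when the control plane signals a cell change."""
        if user_ip not in self.rules:
            raise KeyError(f"no rule installed for {user_ip}")
        self.rules[user_ip] = new_cell_site

    def forward(self, user_ip):
        """Return the cell site traffic for this user is sent to, or None."""
        return self.rules.get(user_ip)


if __name__ == "__main__":
    data_plane = RoamableForwarder()
    data_plane.attach("10.0.0.7", "cell-A")      # user comes online
    data_plane.handover("10.0.0.7", "cell-B")    # signaling reports a move
    print(data_plane.forward("10.0.0.7"))
```

The point of the sketch is the division of labor: only attach and handover need to be driven by signaling, so whatever calls them can live in the cloud while forward runs wherever the traffic actually is.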

There are a lot of benefits associated with the ability to split signaling- and data-plane activity, at least optionally.  This capability was introduced by the 3GPP in Release 14, June 2017, and is called “CUPS” for Control and User Plane Separation.  Implementation of CUPS would make it fairly easy to host the control plane in any cloud, and to then manage the data plane as an independent service that could be out-of-cloud, meaning implemented separately or consumed as a service of the IP network itself.

5G Core (5GC) would support CUPS, and other stuff like network slicing that might be related to CUPS in terms of service evolution.  In my view, the value of this separation might well be a driver for 5GC, but I haven’t been able to dig out the details of how the control plane of EPC elements gained access to the data plane elements.  Early evolutionary diagrams suggest a “CUPS-aware” EPC would control specific data plane (“user-plane” in 3GPP terms) elements and that the ultimate approach would be a form of distributed control, which to me should mean some kind of central control to mediate requests from CUPS EPC elements to ensure there’s no resource collision at the data level.
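One way to picture that kind of central mediation, purely as an assumption about how it might work and not something drawn from the 3GPP documents, is an arbiter that tracks capacity on each user-plane element and grants or refuses requests from control-plane elements.  The UserPlaneArbiter class, its methods, and the Mbps capacity figures are all hypothetical.

```python
class UserPlaneArbiter:
    """Toy central controller that prevents over-allocating user-plane elements."""

    def __init__(self, capacity_mbps):
        # capacity_mbps maps a user-plane element name to its total capacity.
        self.capacity = dict(capacity_mbps)
        self.allocated = {name: 0 for name in self.capacity}

    def request(self, up_element, mbps):
        """Grant the request only if the element has enough free capacity."""
        free = self.capacity[up_element] - self.allocated[up_element]
        if mbps > free:
            return False
        self.allocated[up_element] += mbps
        return True

    def release(self, up_element, mbps):
        """Return capacity to the pool, never dropping below zero."""
        self.allocated[up_element] = max(0, self.allocated[up_element] - mbps)


if __name__ == "__main__":
    arbiter = UserPlaneArbiter({"up-1": 100})
    print(arbiter.request("up-1", 60))  # first control-plane element asks
    print(arbiter.request("up-1", 50))  # second request would collide
```

Without something playing this role, two control-plane elements could each believe the same user-plane capacity was theirs, which is the resource collision at the data level that distributed control has to avoid.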

You could implement CUPS in 4G, and Affirmed supports CUPS in its vEPC, though they seem to position it more as a (hypothetical) way to deliver (possibly useful) latency management.  I think it should be considered more of a mainstream feature, something that would allow greater participation of public cloud hosting in EPC by separating the stuff that would add the most to hosting cost from the stuff that would be easily hosted.

But this area is an example of where standards could hurt more than help.  There is no reason why a generic implementation of a CUPS-compliant data-plane service couldn’t be deployed on anything that can provide what looks like a tunnel connection to the cell sites (or their representative elements), as long as the relationship between that service (presumably represented by a “UP” element) and the CUPS-enabled EPC control plane element was standardized.  It’s way too early to say if this kind of logical relationship actually gets mandated, and how far the 3GPP, vendors, and operators would go to make the UP elements and their services represent the full range of network options we have in this virtual age.

This is one reason I think that the talk about NFV and VNFs in EPC or mobile evolution is misplaced.  The separation of control and data plane is really an SDN concept.  Most NFV work has been on VNFs that sit squarely on both the control and data planes, which is exactly the opposite of what CUPS envisions.  In any event, as I’ve noted before, CUPS elements are all multi-tenant by nature, and NFV focuses on tenant-and-service-specific VNFs.

What we really need here is the notion of “network-as-a-service” or NaaS, where the service offered is one that essentially pulls the service termination point from site to site to reflect roaming.