Cloud-Native for Carrier Cloud

Not everything in the tech news is hype, fortunately, and I found a great example of something that plays it straight in Fierce Telecom.  “Time to move on from NFV; focus instead on the cloud” is perhaps the best picture of telecom reality I’ve seen in years.  Quoting Vodafone’s head of virtualization, Fran Heeran, the article says “I think the industry needs to look beyond traditional NFV as it’s really about cloudification now. We can’t operate in [an] NFV silo, as it cannot deliver tangible benefits on its own and simply virtualizing legacy or traditional applications will not deliver anything close to the full benefits.”

This mirrors what telco CxOs worldwide have been telling me for almost five years now, but for a different reason.  They believe that NFV got too focused on the low apples, on substituting virtual functions for physical appliances.  These appliances were typically CPE devices designed to add higher-layer features to business services like carrier Ethernet.  That gave NFV a focus on functions that were deployed per-customer, per-service, rather than shared at the service level.  Virtual CPE (vCPE) impacts a minuscule piece of most operator capital budgets.  The big piece is “infrastructure functions” shared by all service users, such as the elements of IMS, EPC, video caching, and content delivery networks (CDNs).

Heeran is raising a different issue in his quote, and so we should explore it.  The overall sense of his discussion is that “cloudification” means constructing cloud-native applications to build services.  He associates the concept of “virtualization” with the process of converting something physical (like a firewall) to a virtual-function equivalent.  A “cloud-native” world would mean operators developed a platform that then could host specialized cloud-centric functions, most of which were never specifically provided in appliance form.  A new world, in short.

A world, according to Heeran, that focuses on things like containers and microservices, meaning on the software architecture to be used rather than on the way to translate the current physical to the “future” virtual.  This would be a welcome shift in thinking in my view, because no reasonable implementation of cloud-hosted network features can be operationalized efficiently without some standardization of how each feature is written and integrated.  The NFV ISG should have addressed that.

It may sound like the shift Heeran suggests is radical, but I really think it’s a simple matter of emphasis or definition.  A “virtual network function” or VNF was the goal of NFV.  You could visualize VNFs in two different ways.  First, the way most comforting to old-line standards guys, is that a VNF is a simple migration of the features of a physical network function (PNF, a real device) into hosted form.  Second, a VNF is a useful function or feature implemented to optimally support service needs and exploit hosting resources.  The second is Heeran’s cloud-native thinking, and the first is what we ended up with out of the ISG.

I’ve blogged many times in the past about how I believe the ISG went wrong on this issue, so I’m not going to dwell on that point here.  Instead I want to look at what would actually be needed to make cloud-native work, and the answer lies in combining some thinking from the past and present.

In my second-phase ExperiaSphere project (please note that this project was an open activity with results available to all, without attribution, except the term ExperiaSphere which is trademarked), I took some of my early modeling ideas (from the first phase of ExperiaSphere, modified and presented in 2013 to some Tier One operators as “derived operations”) and defined an approach to service modeling and automation, framed in a series of six annotated PowerPoint tutorials available HERE.  The goal was to talk about how to operationalize and orchestrate services from service features, which of course isn’t the notion we’re talking about here (though it’s still going to be a requirement).  The point was that the architecture I described presumed that there was a set of functional models that represented features or elements of services.  These models would define not only a management-plane behavior set (which was how orchestration was applied) but also a data-plane behavior set.  The first described and controlled the second.

If you’re going to do practical service orchestration in cloud-native (or any other) form, you have to presume this modeled two-plane structure; without it, the individual features/functions won’t tie together to create a service except through custom integration of every service, and you’d have to re-engineer that work for every change-out of functions/features you make.  The model, in other words, defines a data structure that, when interpreted by suitable software, enables a given feature/function to be dropped into a service, connecting both the data plane and the management plane.
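
To make that concrete, here is a minimal sketch of what such a two-plane model element might look like, with generic software interpreting it to drop a feature into a service.  The class and field names are hypothetical illustrations, not drawn from ExperiaSphere or any standard.

```python
# A minimal sketch of a two-plane model element.  Names are hypothetical,
# not taken from ExperiaSphere or any standard.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ManagementBehavior:
    """Management-plane behavior set: how the element is deployed and controlled."""
    lifecycle_api: str                      # standard lifecycle/management endpoint
    states: List[str] = field(default_factory=lambda: ["ordered", "active", "failed"])


@dataclass
class DataBehavior:
    """Data-plane behavior set: how the element connects into the service data path."""
    interface_type: str                     # e.g. "bitstream" or "record-query"
    endpoints: Dict[str, str] = field(default_factory=dict)


@dataclass
class ModelElement:
    """One feature/function in the service model; the management plane
    describes and controls the data plane."""
    name: str
    management: ManagementBehavior
    data: DataBehavior


def add_to_service(element: ModelElement, service: Dict[str, ModelElement]) -> None:
    """Generic software 'drops' a feature into a service by binding both planes.
    Here the binding is simply recorded; real logic would connect the APIs."""
    service[element.name] = element
```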

The right way to do a cloud-native service is to have a model that defines it, and have service logic that deploys according to the model.  Each element of the model would carry the necessary data/management connection requirements, each of which would have to be framed in terms of a standard API set.  Probably, a series of possible data-plane APIs could be provided, depending on how the feature/function handled information, but for integration purposes we should assume that there’s a common management API set that connects back to operator lifecycle management systems.  The service events then activate these process APIs based on state/event logic in the model, which I’ve described before in the ExperiaSphere presentations and in blogs.
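
To illustrate the state/event idea in hedged form, the model effectively carries a table that maps state/event pairs to lifecycle process APIs, and a generic engine activates whatever the table names.  The states, events, and process functions below are invented for illustration only.

```python
# A hedged illustration of state/event logic in the model driving process APIs.
from typing import Callable, Dict, Tuple

ProcessAPI = Callable[[str], str]        # takes the element name, returns the new state


def deploy(name: str) -> str:
    print(f"deploying {name}")
    return "active"


def redeploy(name: str) -> str:
    print(f"redeploying {name} after a fault")
    return "active"


# The model carries this table; the engine below only interprets it.
STATE_EVENT_TABLE: Dict[Tuple[str, str], ProcessAPI] = {
    ("ordered", "activate"): deploy,
    ("active", "fault"): redeploy,
}


def handle_event(name: str, state: str, event: str) -> str:
    """Look up the process API for this state/event pair and run it."""
    process = STATE_EVENT_TABLE.get((state, event))
    return process(name) if process else state     # unrecognized events leave the state alone


# Example: handle_event("IMS.CustomerRepository", "ordered", "activate") returns "active".
```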

Having a single standard management/process API set simplifies the way that the management plane is connected, but for the data plane it’s more complicated.  There are, as I’ve said, a number of possible “data plane” connections.  One example is a simple digital bitstream, part of the data flow among/between users of the service.  Another would be the retrieval of a customer record from a repository.  Obviously there would have to be different APIs for these types of data-plane interface, but for any given type of interface there should be only a few APIs (ideally just one), or you face an integration problem every time you change implementations.
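
A sketch of what “a few APIs per interface type” could look like follows: one abstract interface per data-plane type, so an implementation swap doesn’t ripple into re-integration.  The interface and method names are assumptions, not a standard.

```python
# One abstract interface per data-plane type (names are illustrative assumptions).
from abc import ABC, abstractmethod


class BitstreamInterface(ABC):
    """Data plane as a flow of user traffic through the function."""
    @abstractmethod
    def attach(self, ingress: str, egress: str) -> None: ...


class RecordQueryInterface(ABC):
    """Data plane as retrieval of a record (such as a customer record) from a repository."""
    @abstractmethod
    def get_record(self, key: str) -> dict: ...
```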

Think of this now in “new” terms, meaning intent-model terms.  Each cloud-native feature/function is an intent model, which means that it’s part of a class structure and a member of a subclass that represents that specific feature/function.  For example, we might have “ServiceFunction.MobileFunction.IMS.CustomerRepository” as a model.  Everything that implements “CustomerRepository” has to present its outside-world connections in exactly the same way, so to ensure that integration works, each implementation has to harmonize to the APIs defined by that class.
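
Expressed as code, that class structure might look something like the sketch below, where the Python class tree stands in for “ServiceFunction.MobileFunction.IMS.CustomerRepository” and the abstract methods are assumed stand-ins for the class-defined APIs.

```python
# A sketch of the intent-model class hierarchy; method names are illustrative
# assumptions for the class-defined APIs.
from abc import ABC, abstractmethod


class ServiceFunction(ABC):
    """Root of the class structure: every function exposes the same management connection."""
    @abstractmethod
    def management_endpoint(self) -> str: ...


class MobileFunction(ServiceFunction, ABC):
    """Subclass grouping for mobile-infrastructure functions."""


class IMS(MobileFunction, ABC):
    """Subclass grouping for IMS elements."""


class CustomerRepository(IMS, ABC):
    """The specific feature: every implementation must present its outside-world
    connections in exactly this way."""
    @abstractmethod
    def get_record(self, subscriber_id: str) -> dict: ...
```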

Microservices and API brokerage are a big step in this direction, but primarily for their ability to support the non-service-data-stream aspects of the APIs.  It’s not always desirable to have user data flowing in and out of microservices, particularly when you implement them in containers.  What’s important here, again, is that the implementation of a given feature/function is hidden inside the intent model, so you can use a microservice in a container, a public cloud service, or an appliance that lets you manipulate its management features to conform to the intent model’s standards.

With cloud-native development, there would be enormous benefit in having all the APIs referenced by the models of a given feature/function standardized and made a logical part of an inheritance structure (a hierarchy of subclasses).  That would allow any implementation to plug easily into a given model element that’s supposed to represent it.  Where there’s existing code (or even devices), it would be the responsibility of the implementation of a model element to harmonize all the current interfaces to the model reference, so compatibility and integrability could still be attained.
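
Continuing the CustomerRepository sketch above, that harmonization responsibility amounts to an adapter: the legacy device or code keeps its native interface, and the model-element implementation wraps it to conform to the class-defined APIs.  The legacy class and its method names here are hypothetical stand-ins.

```python
# Continuing the CustomerRepository class sketch above: an existing device or
# codebase with its own native interface gets wrapped so it conforms to the
# class-defined APIs.  The legacy names are hypothetical.

class LegacyHSSDevice:
    """Stand-in for existing code or a device with a native, non-conforming interface."""
    def native_lookup(self, imsi: str) -> str:
        return f"profile-for-{imsi}"


class LegacyHSSAdapter(CustomerRepository):
    """Harmonizes the legacy interface to the model-reference (class-defined) APIs."""
    def __init__(self, device: LegacyHSSDevice, mgmt_url: str):
        self._device = device
        self._mgmt_url = mgmt_url

    def management_endpoint(self) -> str:
        return self._mgmt_url

    def get_record(self, subscriber_id: str) -> dict:
        # Translate the class-defined call into whatever the legacy device understands.
        return {"subscriber": subscriber_id,
                "profile": self._device.native_lookup(subscriber_id)}
```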

It would be nice to see something more detailed about the “cloud-native” architectural model in the media, of course, but to be realistic you don’t get stories even as long as this blog very often, and I’ve referenced a fairly enormous body of work and prior blogs just to get everyone up to speed.  Eventually, we’ll need to face the complexity of this issue, and I think Fierce Telecom has made a good start.