The topic of telco transformation is important, perhaps even critical, so it’s good it’s getting more attention. The obvious question is whether “attention” is the same as “activity”, and whether “movement” is equivalent to “progress”. One recent piece, posted on LinkedIn, is a standards update from ETSI created by the chairs of particular groups involved in telco transformation, and so frames a good way of assessing just what “attention” means in the telco world. From there, who knows? We might even be able to comment on progress.
The paper I’m referencing is an ETSI document, and I want to start by saying that there are a lot of hard-working and earnest people involved in these ETSI standards. My problem isn’t with their commitment, their goals, or their efforts; it’s with the lack of useful results. I participated in ETSI NFV for years, creating the group that launched the first approved proof-of-concept. As I’ve said in the past, I believe firmly that the group got off on the wrong track, and that’s why I’m interested in the update the paper presents. Has anything changed?
The document describes four specific standards initiatives: NFV, Mobile Edge Computing (MEC), Experiential Networked Intelligence (ENI), and Zero-touch Network and Service Management (ZSM). I’ll look at each of them below, but limit my NFV comments to any new points raised by the current state of the specifications. I do have to start with a little goal-setting.
Transformation, to me, is about moving away from building networks and services by connecting devices together. That’s my overall premise here, and the premise that forms my basis for assessing these four initiatives. To get beyond devices, we have to create “naked functions”, meaning disaggregated, hostable features that we can instantiate and interconnect as needed. There should be no constraints on where that instantiation happens, whether in data centers, public clouds, or anywhere else.
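To make “naked functions” a little more concrete, here’s a minimal sketch (my own hypothetical illustration, not anything ETSI defines) of a hosting-agnostic function descriptor: the function declares what it needs and exposes, and says nothing about where it runs.

```python
from dataclasses import dataclass, field

@dataclass
class NakedFunction:
    """A hypothetical hosting-agnostic feature descriptor: the function
    declares its needs and connection points, never its location."""
    name: str
    image: str                    # portable packaging, e.g. a container image
    cpu_cores: float = 1.0
    memory_mb: int = 512
    ports: list = field(default_factory=list)  # exposed connection points

def instantiate(fn: NakedFunction, host_pool: str) -> str:
    # Placement is a late-binding decision made by an orchestrator,
    # not a property baked into the function itself.
    return f"{fn.name} ({fn.cpu_cores} cores) deployed to {host_pool}"

probe = NakedFunction("traffic-probe", "registry.example/probe:1.0", ports=[8080])
print(instantiate(probe, "data-center-east"))
print(instantiate(probe, "public-cloud"))   # same function, different host
```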
This last point is critical, because it’s the goal of software architecture overall. The term most often used to describe it is “cloud-native”, not because the stuff has to be instantiated in the cloud, but because the software is designed to fully exploit the virtual, elastic nature of the cloud. You can give up cloud hosting with cloud-native software, if you’re willing to pay the price. You can’t gain the full benefit of the cloud without cloud-native software, though.
Moving to our four specific areas, we’ll start with the developments in NFV. The key point the document makes with regard to Release 4 developments is “Consolidation of the infrastructural aspects, by a redefinition of the NFV infrastructure (NFVI) abstraction….” My problem with this is that in virtualization, you don’t improve things by narrowing or subdividing your hosting target, but by improving the way hosting abstractions are represented and realized. That, in NFV, is the job of the Virtual Infrastructure Manager (VIM).
Originally, the VIM was seen as a single component, but the illustration in the paper says “VIM(s)”, which acknowledges the likelihood that there would be multiple VIMs, depending on the specific infrastructure. That’s progress, but it still leaves the question of how you associate a VIM with the NFVI and the specific functions you’re deploying. In my own ExperiaSphere model, the association was made by the model, but it’s not clear to me how this would work with NFV today.
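For illustration, here’s a toy sketch of what “the model makes the association” could mean, loosely patterned on what I’ve described for ExperiaSphere (all names here are hypothetical, not drawn from the NFV specs): each model element names the infrastructure class it needs, and a registry resolves that class to a VIM at deployment time.

```python
# Hypothetical model-driven VIM binding: the service model, not the
# orchestration code, carries the function-to-VIM association.
class Vim:
    def __init__(self, name: str):
        self.name = name
    def deploy(self, function: str) -> str:
        return f"{function} deployed via {self.name}"

VIM_REGISTRY = {
    "openstack-core": Vim("OpenStack VIM"),
    "k8s-edge": Vim("Kubernetes VIM"),
}

service_model = [
    {"element": "firewall-vnf", "vim_class": "openstack-core"},
    {"element": "cdn-cache",    "vim_class": "k8s-edge"},
]

for element in service_model:
    vim = VIM_REGISTRY[element["vim_class"]]  # the model resolves the VIM
    print(vim.deploy(element["element"]))
```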
The paper makes it clear that regardless of the changes made in NFV, it’s still intended to manage the virtual functions that replace “physical network functions” (PNFs), meaning devices. Its lifecycle and management processes divide the realization of a function (hosting and lifecycle management of hosting-related elements) from the management of the things being virtualized, the PNFs. That facilitates the introduction of virtual functions into real-world networks, but it also bifurcates lifecycle management, which I think limits automation potential.
The next of our four standards areas is “Multi-access Edge Computing”, or MEC. The ETSI approach to this is curious, to say the least. The goal is “to enable a self-contained MEC cloud which can exist in different cloud environments….” To make this applicable to NFV, the specification proposes a class of VNF (the “MEC Platform”) that is itself deployed, and that then contains the NFV VNFs. This establishes the notion that VNFs can be elements of infrastructure (NFVI, specifically), and it creates a whole new issue set in defining, creating, and managing the relationships between the “platform” class of VNFs and the “functional” class we already have.
This is so far removed from the trends in cloud computing that I suspect cloud architects would be aghast at the notion. The MEC platform should be a class of pooled resources, perhaps supported by a different VIM, but likely nothing more than a special type of host that, in Kubernetes for example, would be selected or avoided via parameters such as taints, tolerations, and affinities, as the sketch below illustrates.
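Here’s a Kubernetes Pod spec written out as a Python dict (the pool label and taint key are hypothetical names of mine): the “MEC platform” is just a labeled, tainted pool of nodes, and a workload opts in with a node selector plus a matching toleration.

```python
import json

# The MEC "platform" as ordinary Kubernetes machinery: nodes in the edge
# pool carry a label and a NoSchedule taint; latency-sensitive workloads
# select the label and tolerate the taint. Label and taint names are
# hypothetical.
mec_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "latency-sensitive-vnf"},
    "spec": {
        "containers": [{
            "name": "vnf",
            "image": "registry.example/vnf:1.0",
        }],
        # Steer the pod onto nodes labeled as MEC/edge capacity...
        "nodeSelector": {"pool.example.com/role": "mec-edge"},
        # ...and tolerate the taint that keeps ordinary pods off them.
        "tolerations": [{
            "key": "example.com/mec-only",
            "operator": "Exists",
            "effect": "NoSchedule",
        }],
    },
}

print(json.dumps(mec_pod, indent=2))  # suitable for kubectl apply -f -
```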
The MEC concept seems to me to be moving carriers further from the principles of cloud computing, which are evolving quickly and effectively in support of both public and hybrid cloud applications. If operators believe that they can host things like 5G Core features in the public cloud, why would they not flock to cloud principles? NFV started things off wrong here, and MEC seems to be perpetuating that wrong direction.
Our next concept is Experiential Networked Intelligence (ENI), which the ETSI paper describes as “an architecture to enable closed-loop network operations and management, leveraging AI.” The goal appears to be to define a mechanism where an AI/ML intermediary would respond to conditions in the network by generating recommendations or commands to pass along to current or evolving management systems and processes.
Like NFV’s management bifurcation, this seems aimed at adapting AI/ML to current systems, but it raises a lot of questions (too many to detail here). One question is how you’d coordinate the response to an issue that spans multiple elements or requires changes to multiple elements in order to remediate. Another is how you “suggest” something to an API linked to an automated process.
To me, the logical way to look at AI/ML in service management is to presume the service is made up of “intent models” which enforce an SLA internally. The enforcement of that SLA, being inside the black box, can take any form that works, including AI/ML. In other words, we really need to redefine how we think of service lifecycle management in order to apply AI to it. That doesn’t mean we have to scrap OSS/BSS or NMS systems, but obviously we have to change these systems somewhat if there are automated processes running between them and services/infrastructure.
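A minimal sketch of that idea, with hypothetical names throughout: the intent-model element exposes a public SLA, and whether the remediation inside the black box is a scripted rule or an ML agent is invisible from the outside.

```python
from typing import Callable

class IntentModelElement:
    """A black-box service element: public SLA, opaque internals."""
    def __init__(self, name: str, sla_max_latency_ms: float,
                 remediate: Callable[[float], None]):
        self.name = name
        self.sla_max_latency_ms = sla_max_latency_ms
        self._remediate = remediate  # a script, an ML agent... nobody outside knows

    def report(self, observed_latency_ms: float) -> bool:
        """Return True if the SLA holds; otherwise remediate internally."""
        if observed_latency_ms <= self.sla_max_latency_ms:
            return True
        self._remediate(observed_latency_ms)  # enforcement stays inside the box
        return False

# Two interchangeable enforcement strategies; externally identical.
def rule_based(latency_ms: float) -> None:
    print(f"rule: reroute traffic (latency {latency_ms}ms)")

def ml_based(latency_ms: float) -> None:
    print(f"ml-agent: predicting congestion, scaling out (latency {latency_ms}ms)")

core = IntentModelElement("core-transport", 20.0, rule_based)
edge = IntentModelElement("edge-hosting", 20.0, ml_based)
for element in (core, edge):
    element.report(35.0)  # SLA violated; each black box remediates its own way
```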
That brings us to our final concept area, Zero-touch Network and Service Management (ZSM). You can say that I’m seeing an NFV monster in every closet here, but I believe that ZSM suffers from the initial issue that sent NFV off-track: the tendency to draw a functional description of the goal and then treat that description as an implementation architecture.
I absolutely reject any notion that a monolithic management process, or set of processes built into a monolithic management application, could properly automate a complex multi-service network that includes both network devices and hosted features/functions. I’ve described the issue in past blogs so I won’t go over it again here, but it’s easily resolved by applying the principles of the TMF’s NGOSS Contract, a concept over a decade old. An NGOSS Contract statement of a ZSM implementation would say that the contract data model is the “integration fabric” depicted in the paper. Absent that insight, I don’t think a useful implementation can be derived from the approach, and certainly not an optimal one.
What, then, is the basic problem, the thing that unites the issues I’ve cited here? I think it’s simple. If you are defining a future architecture, you define it for the future and adapt it to the present. Transition is justified by transformation, not the other way around. What the telcos, and ETSI, should be doing is defining a cloud-native model for future networks and services, and then adapting that model to serve during the period when we’re evolving from devices to functions.
Intent modeling and NGOSS Contract would make that not only possible, but easy. Intent modeling says that elements of a service, whether based on PNFs or VNFs, can be structured as black boxes whose external properties are public and whose internal behaviors are opaque as long as the model element’s SLA is maintained. NGOSS Contract says that the service data model, which describes the service as a collection of “sub-services” or elements, steers service events to service processes. That means that any number of processes can be run, anywhere that’s convenient, and driven from and synchronized by that data model.
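Here’s a toy illustration of that steering (my own sketch of the NGOSS Contract idea, not TMF code): the contract is a per-element map from events to processes, so any process can run anywhere and the data model remains the single point of coordination.

```python
# NGOSS-Contract-style event steering: the service data model, not a
# monolithic manager, binds each sub-service's events to processes.
def scale_out(element: str, event: str) -> None:
    print(f"scaling out {element} after {event}")

def reroute(element: str, event: str) -> None:
    print(f"rerouting around {element} after {event}")

service_contract = {
    "access-leg": {"congestion": scale_out, "link-fail": reroute},
    "hosted-vnf": {"congestion": scale_out, "host-fail": reroute},
}

def handle_event(element: str, event: str) -> None:
    process = service_contract[element].get(event)
    if process is None:
        print(f"unhandled event {event} on {element}; escalate")
        return
    process(element, event)  # the model steers the event to the process

handle_event("access-leg", "link-fail")
handle_event("hosted-vnf", "congestion")
```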
The TMF hasn’t done a great job of promoting NGOSS Contract, which perhaps is why ETSI and operators have failed to recognize its potential. Perhaps the best way to leverage the new initiative launched by Telecom TV and other sponsors would be to frame a discussion around how to adapt to the cloud-native, intent-modeled, NGOSS-Contract-mediated model of service lifecycle automation. While the original TMF architect of NGOSS Contract (John Reilly) has sadly passed, I’m sure there are others in the body who could represent the concept in such a discussion.
This standards update was posted on LinkedIn by one of the authors of the paper on accelerating innovation in telecom, which I blogged about on Monday of last week. It may be that the two events, combined, demonstrate a real desire by the telco standards community to get things on track. I applaud their determination if that’s the case, but I’m sorry, people. This isn’t the way.