Why ETSI’s New Zero-Touch Automation is in Trouble

It seems that zero-touch automation (ZTA) is everyone’s goal, and also that it’s mired in the usual standards-group molasses.  The recent announcement that the first use case would be 5G network slicing, and the bias of the process toward the earlier work on NFV, seem to combine to set both the wrong goal and the wrong approach.  Are we launching a bold, new, essential initiative, or just making the same mistakes again under a different name?

Problem number one for ZTA is the timing of the need.  NFV and SDN promised significant improvements in overall operator capex, but both have been underway for five years now and no significant improvements have materialized.  Operators have pressed vendors on pricing, gone more to price leaders, and are now exploring open implementations of legacy devices (P4 switch/routers) instead of deploying virtual functions in the cloud or using OpenFlow forwarding.  The earnings-call comments from both operators and vendors show pressure on capital spending, and the only near-term way to cut costs further is to reduce opex.  That’s what ZTA is supposed to do.

The target use case of 5G network slicing doesn’t seem to fit the timing of the need, or even to be the right target.  We cannot reduce opex on something not yet deployed.  Why not focus instead on doing something with the current network infrastructure and the current services?  The profit-per-bit squeeze operators are under today doesn’t involve 5G network slicing, or even 5G overall.  There is currently no way of knowing how widely network slicing will deploy, or when deployment would even start.

Having a proof-of-concept test to support a use case, at this point, seems premature to me as well.  What “concept” do we have to prove?  I really like PoCs, but there has to be an architecture to test against use cases if we’re to gain anything from them.  Since the ZTA group hasn’t defined that architecture yet, there’s a risk that PoCs simply run through trial implementations.  That happened with NFV too.

The NFV linkage poses problem number two.  NFV isn’t responsible for any significant amount of either capex or opex today.  The problem that ZTA has to solve isn’t NFV automation, it’s service lifecycle automation regardless of what technology resides in network infrastructure.  Today, and for the foreseeable future, that will be legacy technology.  So why not focus ZTA on automating the operation of today’s services on today’s infrastructure?

Another NFV-related problem is what I’ll call an “application vision” of the software components used for management and orchestration.  The NFV model defines logical functional elements, but implementations have tended to map those elements directly to software, making NFV MANO and the VNFM look like monolithic applications, or applications with only a few modular components.  We have a better model in the notion of an intent-data-model-driven approach.

Problem number three is the inertia of standards processes overall.  John Cupit, a guy whose skills and insights I respect, commented on LinkedIn on one of my recent blog posts, saying:

I think that too much standards work is focused on justifying the existence of the standards bodies.  I also think that Time to Market concerns of the participating vendors hijack the work that is performed.  The MEF work and the ETSI NFV standard are poster children in terms of representing technical efforts which ignored how the Carrier industry had to totally transform itself to remain viable in providing connectivity and cloud-based services…after four or five years of work, we have a technical approach which is still below the line from a financial analysis standpoint and which does not have an answer for long-term service management or ZTA.

Amen!  If you want a reason why carrier transformation lags behind OTT innovation, you could start with the tendency of operators to launch a five-year standards initiative where OTTs would simply put a couple of architects to work, launch development based on the result, and release something in six months.  The white paper first released on ZTA looks a lot like the one released for NFV in 2012, and you may recall that the whole ZTA issue was raised early in that earlier effort (in the spring of 2013) and declared out of scope.  News flash: You can’t do zero-touch automation if anything that has to be operated is out of scope.

If we are ever going to get anywhere with ZTA we have to start with the presumption that we are building a universal model and focus on how the model gets to be universal.  We already have a prototype concept with “intent modeling” and the notion of hierarchical decomposition of a service data model.

Intent modeling says that any service can be subdivided into functional elements that have explicit missions, represented by interfaces, parameters, and SLAs.  Within an intent-model element, the means whereby the mission/intent is realized is opaque.  Internal logic (whatever it is) works to manage behavior against the element’s own SLA.  If the element can’t meet that, then it asserts a status that reflects the failure.
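To make that concrete, here is a minimal sketch in Python of what an intent-model element could look like: visible interfaces, parameters, and an SLA; opaque internals; and a status assertion when the SLA can’t be met.  The class and field names are my own illustrations, not anything drawn from the ETSI work.

```python
# A minimal sketch of an intent-model element; all names are illustrative.
from dataclasses import dataclass
from typing import Dict

@dataclass
class SLA:
    max_latency_ms: float
    min_availability: float      # e.g. 0.9999

@dataclass
class IntentElement:
    """A functional element with an explicit mission: interfaces, parameters,
    and an SLA are visible; how the mission is realized is opaque."""
    name: str
    interfaces: Dict[str, str]   # interface name -> endpoint/type
    parameters: Dict[str, str]   # externally visible settings
    sla: SLA
    status: str = "active"       # what the element asserts upward

    def evaluate(self, measured_latency_ms: float, measured_availability: float) -> str:
        # Internal logic (whatever it is) manages behavior against the SLA.
        # If the element can't meet it, it asserts a failure status; how it
        # tried to remediate stays hidden from whoever composed it.
        if (measured_latency_ms > self.sla.max_latency_ms
                or measured_availability < self.sla.min_availability):
            self.status = "sla-violation"
        else:
            self.status = "active"
        return self.status

# Usage: the composing layer sees only interfaces, parameters, SLA, and status.
vpn_core = IntentElement(
    name="VPN-Core",
    interfaces={"access": "ethernet", "management": "netconf"},
    parameters={"bandwidth": "100M"},
    sla=SLA(max_latency_ms=20.0, min_availability=0.9999),
)
print(vpn_core.evaluate(measured_latency_ms=35.0, measured_availability=0.9999))  # "sla-violation"
```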

That’s where hierarchy comes in.  An intent-modeled element is a piece of a service hierarchy, so it is “created” by the decomposition of something above.  A “VPN” functional component could be realized in a number of ways, like “MPLS-VPN” or “2547-VPN” or “Overlay-VPN”.  If one of those second-level components is selected and faults, it’s up to the VPN functional component to decide what happens.  Perhaps you re-provision the same sub-component, perhaps you launch another one, or perhaps you report a failure yourself, to your own superior element.
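A rough sketch of that decomposition and escalation logic follows, again with purely illustrative names and a deliberately trivial selection policy (real logic would consult policy, cost, and topology):

```python
# A minimal sketch of hierarchical decomposition and fault escalation.
from typing import List, Optional

class ServiceNode:
    """A node in a service hierarchy, created by decomposing its parent."""
    def __init__(self, name: str, realizations: List[str],
                 parent: Optional["ServiceNode"] = None):
        self.name = name
        self.realizations = realizations   # e.g. ["MPLS-VPN", "2547-VPN", "Overlay-VPN"]
        self.parent = parent
        self.active: Optional[str] = None

    def decompose(self) -> str:
        # Select one realization; a real implementation would apply policy here.
        self.active = self.realizations[0]
        return self.active

    def on_child_fault(self) -> str:
        # The parent decides what happens when its selected sub-component faults:
        # re-provision with another realization, or escalate to its own superior.
        remaining = [r for r in self.realizations if r != self.active]
        if remaining:
            self.active = remaining[0]
            return f"{self.name}: re-provisioned as {self.active}"
        if self.parent:
            return self.parent.on_child_fault()   # report failure upward
        return f"{self.name}: service-level failure reported"

service = ServiceNode("Enterprise-Service", realizations=["Site-Network"])
vpn = ServiceNode("VPN", realizations=["MPLS-VPN", "2547-VPN", "Overlay-VPN"], parent=service)
print(vpn.decompose())        # MPLS-VPN
print(vpn.on_child_fault())   # VPN: re-provisioned as 2547-VPN
```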

In a structure like this, there is no application component called “MANO” or “VNFM”, only a set of functions that add up to those original functional blocks.  It goes back to the old TMF model of the NGOSS contract, where events (conditions in a service lifecycle) are steered to processes (the logic that combines to create the features of things like MANO and VNFM) via a contract data model.  It’s the model structure and the specific modeling of the elements that matter most.
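Here is a minimal sketch, with hypothetical states, events, and process names, of what steering events to processes through a contract-style data model could look like.  The point is that the “application” is just a table held in the model plus small processes bound to it.

```python
# A sketch of NGOSS-contract-style event steering: the data model, not a
# monolithic application, decides which process runs for a given event.
from typing import Callable, Dict, Tuple

def activate_resources(element: str) -> str:
    return f"orchestration: activating {element}"

def remediate_fault(element: str) -> str:
    return f"management: remediating fault on {element}"

def escalate(element: str) -> str:
    return f"management: escalating {element} to parent element"

# (current state, event) -> process, as recorded in the service data model.
steering_table: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("ordered", "activate"): activate_resources,
    ("active", "fault"): remediate_fault,
    ("degraded", "fault"): escalate,
}

def handle_event(element: str, state: str, event: str) -> str:
    process = steering_table.get((state, event))
    return process(element) if process else f"no process bound for {state}/{event}"

print(handle_event("MPLS-VPN", "active", "fault"))   # management: remediating fault on MPLS-VPN
```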

Models are also the right path to NFV interoperability.  If there is a “standard model” for a given function, whether it’s a virtual one (VNF) or part of the infrastructure (NFVI), then anything that fits into that model is equivalent in all respects.  Without a standard model for something, the implementation of features and functions can vary in terms of their intent, interfaces, SLAs, etc.  That makes interoperability at the service composition level nearly impossible to achieve.
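As an illustration, and under the assumption that conformance means exposing at least the interfaces, parameters, and SLA terms the class model requires, a “class model” check might look like the sketch below.  The firewall feature and its fields are hypothetical.

```python
# A sketch of a feature "class model" and a conformance check; anything that
# conforms is interchangeable at service-composition time.
from typing import Dict, Set

FIREWALL_CLASS_MODEL: Dict[str, Set[str]] = {
    "interfaces": {"inside", "outside", "management"},
    "parameters": {"rule_set", "log_target"},
    "sla_terms": {"max_latency_ms", "min_availability"},
}

def conforms(candidate: Dict[str, Set[str]], class_model: Dict[str, Set[str]]) -> bool:
    # A candidate (VNF, hosted feature, or legacy device) is equivalent if it
    # exposes at least what the class model requires, in every category.
    return all(class_model[k] <= candidate.get(k, set()) for k in class_model)

vendor_a = {"interfaces": {"inside", "outside", "management"},
            "parameters": {"rule_set", "log_target", "ha_mode"},
            "sla_terms": {"max_latency_ms", "min_availability"}}
vendor_b = {"interfaces": {"inside", "outside"},        # missing "management"
            "parameters": {"rule_set"},
            "sla_terms": {"max_latency_ms"}}

print(conforms(vendor_a, FIREWALL_CLASS_MODEL))  # True: can be swapped in freely
print(conforms(vendor_b, FIREWALL_CLASS_MODEL))  # False: not equivalent
```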

There are some examples of intent-modeled networking already out there.  Sebastian Grabski has an insightful blog post on what he calls “declarative networking” that’s based on model abstractions that could easily be expressed as a form of intent modeling.  It shows how modeling can create multi-vendor, interoperable implementations of the same logical feature, based on some Cloudify tools.  That, in turn, shows (again) that the really useful work in this space isn’t being done in network-biased bodies at all, but in cloud projects.  The cloud community seems more willing to accept the basic need for declarative modeling than the network community does.

Modeling isn’t just drawing pretty pictures.  What are the functional elements of a service?  What are the common properties of all intent-modeled elements?  How do we define events that have to link model elements to coordinate systemic reactions?  How do events get sent to service processes?  This is the stuff that ZTA needs to define, before use cases, before tests and trials, because without these things there’s no way to create a ZTA approach that covers the whole of the service, extends to all possible infrastructure, and incorporates all current network and service management tasks.  Where is it?

ZTA should have started with service modeling.  The early PoC efforts should have focused on defining the prototype service models for the largest possible variety of services.  On ensuring that everything about a service, from top to bottom, legacy to modern, self-healing to manual processes, was representable in the model.  On ensuring that every class of feature had a “class model” that would be implemented by everything that purported to support that feature.  When that’s been done, you can start doing implementation trials on the modeling, because you know it meets all the goals.

There is, in my view, an obvious problem with the way we approach technology standards.  If you try something, based on a given methodology, and it fails, there are many reasons why that might have happened.  If you use the same methodology a second and a third time and every attempt fails in the same way, you have to suspect that the methodology itself is keeping you from the right answer.  Standards, and the “standards” approach to innovation in infrastructure, may be the core problem with transformation.  ZTA is in trouble for that reason.