Some Specific Early Experience in Zero-Touch Automation

In a couple of past blogs on lifecycle management, I mentioned my “older” ExperiaSphere project.  The project was one of the earliest tests of zero-touch automation, launched with operator support.  There is still some documentation on the ExperiaSphere website, but some of you have asked me to explain the original project, hopefully relating it to the current state of zero-touch automation.  I think the discussion might raise some of the important issues we face in deploying hosted features and orchestrating for zero-touch automation, so here goes.

ExperiaSphere in its original form came out of a request by five Tier One operators who were working with me on the TMF Service Delivery Framework (SDF) project.  Their concern was that the notion of SDF was fine at a conceptual level but not necessarily one that could be implemented.  To quote: “Tom, we are 100% behind implementable standards, but we’re not sure this is one.”  They wanted me to approach the problem as a software architect, and to stick as close to the SDF approach as possible to validate (or refute) it.  I demonstrated that a valid Java-based implementation could be developed, and made several presentations to the SDF group with the project results.

The concept of SDF was to deliver services through “representational modeling”.  Some abstract component, presumably a piece of software with specific interfaces, would “represent” a service feature and (again presumably) participate as a proxy for that feature in lifecycle processes.  It would then execute the necessary steps on the “real” feature and infrastructure elements through their respective APIs.

The model I developed was based on two high-level concepts—the “Experiam” and the “Service Factory”.  An Experiam was a Java object built from a base class that defined the broad architecture of a representational model.  This object was customized to provide the “representation” part, and it contained all the state/event logic needed to recognize and progress through service lifecycle phases.  A Service Factory was a Java application, assembled from Experiams, that when first instantiated wrote out a service template defining everything it needed to order and deploy the service.  When a copy of this template was filled in and sent back to the same factory (or another instance of the factory), the service would be deployed according to the template.
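
The original code is long gone, but a minimal Java sketch might make the two concepts concrete.  Every class, method, and parameter name here (Experiam, ServiceFactory, writeTemplate, process, and so on) is my illustrative invention, not the original API; treat it as a sketch of the idea, not the implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Base class every "representational" element extends; subclasses add the
// state/event logic for the specific feature or component they represent.
abstract class Experiam {
    protected final String name;

    protected Experiam(String name) { this.name = name; }

    // Contribute this element's parameters to the service template ("order blank").
    abstract Map<String, String> describe();

    // React to a lifecycle event for this element, using its slice of the template.
    abstract void handleEvent(String event, Map<String, String> parameters);
}

// A Service Factory is an application assembled from Experiams.  On first
// instantiation it writes out a template describing everything it needs;
// a filled-in copy of that template then drives deployment and lifecycle events.
class ServiceFactory {
    private final List<Experiam> experiams = new ArrayList<>();

    void add(Experiam experiam) { experiams.add(experiam); }

    // Template generation: collect every Experiam's parameters into one document.
    Map<String, Map<String, String>> writeTemplate() {
        Map<String, Map<String, String>> template = new LinkedHashMap<>();
        for (Experiam e : experiams) {
            template.put(e.name, e.describe());
        }
        return template;
    }

    // Event processing: hand each Experiam its slice of the filled-in template;
    // because the template carries all the state, any factory instance can do this.
    void process(String event, Map<String, Map<String, String>> filledTemplate) {
        for (Experiam e : experiams) {
            e.handleEvent(event, filledTemplate.getOrDefault(e.name, Map.of()));
        }
    }
}
```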

The concept of an Experiam was necessary (in my view at the time) to represent a deployable equivalent of a TMF NGOSS Contract, which is a data model.  Each element in the service would be represented by both a data model element and an Experiam that defined that element and was responsible for executing its lifecycle processes.  No Experiam had to be a generalized tool; it only had to do what it was written to do.  Instead of making service creation a process of model definition, ExperiaSphere in this form made it a software development process that created the data model as a byproduct.

Experiams could represent three broad classes of things.  First, they could represent coordinative processes designed only to organize the lifecycle processes of a hierarchy of pieces, each in turn represented by Experiams.  Second, they could represent service control or management elements that were part of the service’s actual implementation.  I did a demo with Extreme Networks to show content delivery prioritization based on web video selection, for example.  Finally, they could represent interfaces into Service/Network/Element Management Systems that controlled traditional network behavior.
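
To make the three roles concrete, here is a hedged continuation of the sketch above; again, all names are my own illustrative inventions, extending the hypothetical Experiam base class.

```java
import java.util.List;
import java.util.Map;

// Role 1: purely coordinative.  Organizes the lifecycle of a hierarchy of
// subordinate Experiams without touching any infrastructure itself.
class CoordinatorExperiam extends Experiam {
    private final List<Experiam> children;

    CoordinatorExperiam(String name, List<Experiam> children) {
        super(name);
        this.children = children;
    }

    Map<String, String> describe() { return Map.of("role", "coordination"); }

    void handleEvent(String event, Map<String, String> parameters) {
        for (Experiam child : children) child.handleEvent(event, parameters); // fan out
    }
}

// Role 2: part of the service's actual implementation, such as a
// content-priority control point in the Extreme Networks demo.
class ServiceControlExperiam extends Experiam {
    ServiceControlExperiam(String name) { super(name); }

    Map<String, String> describe() { return Map.of("role", "service-control"); }

    void handleEvent(String event, Map<String, String> parameters) {
        // A real version would drive the feature's own control API here.
        System.out.println(name + " applying " + event);
    }
}

// Role 3: a proxy for a Service/Network/Element Management System interface.
class EmsProxyExperiam extends Experiam {
    EmsProxyExperiam(String name) { super(name); }

    Map<String, String> describe() { return Map.of("role", "ems-proxy"); }

    void handleEvent(String event, Map<String, String> parameters) {
        // A real version would translate the event into management-system calls.
        System.out.println(name + " forwarding " + event + " to the management system");
    }
}
```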

Service Factories were the instrument of execution and scalability.  A service template lived in a repository, and when an event occurred, the template was extracted and dispatched to any convenient instance of the Service Factory, including a newly created one.  That instance could process the event because the template carried everything associated with every Experiam’s data model.  The Service Factory concept was loosely adopted by the TMF SDF group.
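
A hedged sketch of that dispatching flow, assuming the illustrative ServiceFactory above plus an invented TemplateRepository: because the template carries all of the service’s state, any factory instance, including one created on demand, can process the event.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Invented repository: the authoritative copy of every service template lives
// here, not inside any particular Service Factory instance.
class TemplateRepository {
    private final Map<String, Map<String, Map<String, String>>> templates = new ConcurrentHashMap<>();

    void store(String serviceId, Map<String, Map<String, String>> template) {
        templates.put(serviceId, template);
    }

    Map<String, Map<String, String>> fetch(String serviceId) {
        return templates.get(serviceId);
    }
}

// When an event arrives, pull the template and hand it to any convenient
// factory instance; the supplier could return a pooled instance or spin up a new one.
class EventDispatcher {
    private final TemplateRepository repository;
    private final Supplier<ServiceFactory> factorySupplier;

    EventDispatcher(TemplateRepository repository, Supplier<ServiceFactory> factorySupplier) {
        this.repository = repository;
        this.factorySupplier = factorySupplier;
    }

    void dispatch(String serviceId, String event) {
        Map<String, Map<String, String>> template = repository.fetch(serviceId);
        ServiceFactory factory = factorySupplier.get();  // any instance will do
        factory.process(event, template);                // the template is the state
    }
}
```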

An Experiam could represent another service template, making the concept self-federating.  A “proxy Experiam” in the main-line model would bind to a second service template that created another model.  That second template could be located anywhere, and the binding could either be tight (the interfaces specifically defined in the models) or indirect at order time (“Behavior binding”).  In the latter case, the low-level Service Factory advertised the “Behaviors” it would support in a directory, and the high-level Factory would search that directory for a Behavior with the characteristics it needed.  When it picked one, the binding would be finalized.
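
A hedged sketch of the “Behavior binding” step, with an invented BehaviorDirectory standing in for whatever the real advertisement mechanism would be: low-level factories advertise the Behaviors they support, and a high-level factory searches the directory at order time and finalizes the binding to the offer it picks.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Invented record of an advertised Behavior: which factory offers it, and the
// characteristics an ordering factory can match against.
record BehaviorOffer(String behaviorName, String factoryEndpoint, Map<String, String> characteristics) {}

// Invented directory: low-level Service Factories advertise here; high-level
// factories search it at order time to complete an indirect binding.
class BehaviorDirectory {
    private final Map<String, List<BehaviorOffer>> offers = new ConcurrentHashMap<>();

    void advertise(BehaviorOffer offer) {
        offers.computeIfAbsent(offer.behaviorName(), k -> new CopyOnWriteArrayList<>()).add(offer);
    }

    // Return the first offer whose characteristics satisfy everything requested;
    // the caller would then finalize the binding to that offer's factory.
    Optional<BehaviorOffer> find(String behaviorName, Map<String, String> required) {
        return offers.getOrDefault(behaviorName, List.of()).stream()
                .filter(o -> o.characteristics().entrySet().containsAll(required.entrySet()))
                .findFirst();
    }
}
```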

The big concern I had about this approach was that it relied on software development for service creation.  The operators at the time weren’t too worried about that, because they said that services weren’t really ad hoc.  Today I think they might take a different view.  I also wonder whether there isn’t a fairly contained number of basic service models, in which case a general software toolkit that could interpret data models, or a state/event table within such a data model, might be a better approach.

I think that the best attribute of this original ExperiaSphere model is explicit scalability.  Every instance of a given Service Factory can build from, or sustain, the service order template that factory generates.  You can spin up as many Factories as you need, wherever you need them.  Scalability under load is essential for an NFV zero-touch solution, because network events that stem from a common major failure could generate an avalanche of problems that would swamp a simple single-threaded, serial software implementation.

ExperiaSphere provided an “encapsulated” model of state/event processing.  Each Experiam was itself a finite-state machine that represented the component set it was associated with, and was responsible for initiating its lifecycle processes, whether they were processes related to implementation of an element of the service, or processes that simply organized the combined state of lower-level elements.  You could either embed the necessary implementation processes at each state/event interface, or call them externally.  The latter approach would converge on a model of implementation where a general “Experiam” used the data model state/event table to create specific processes.  I didn’t implement that for the SDF test, though.
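
A minimal sketch of that encapsulated state/event idea, using invented state and event names: each Experiam would carry a small state/event table whose entries either embed the implementation logic directly or call out to an external process.

```java
import java.util.EnumMap;
import java.util.Map;

// Invented lifecycle states and events, for illustration only.
enum LifecycleState { ORDERED, ACTIVATING, ACTIVE, FAULT }
enum LifecycleEvent { ACTIVATE, ACTIVATION_COMPLETE, FAILURE, RESTORED }

// One Experiam's finite-state machine: a state/event table whose handlers run
// (or invoke) the implementation logic and return the next state.
class LifecycleFsm {
    interface Handler {
        LifecycleState handle(Map<String, String> parameters);
    }

    private final Map<LifecycleState, Map<LifecycleEvent, Handler>> table = new EnumMap<>(LifecycleState.class);
    private LifecycleState current = LifecycleState.ORDERED;

    // Register the handler for one state/event intersection.
    void on(LifecycleState state, LifecycleEvent event, Handler handler) {
        table.computeIfAbsent(state, s -> new EnumMap<>(LifecycleEvent.class)).put(event, handler);
    }

    // Run the handler for the current state; events with no entry are ignored
    // here, though a real implementation would log or escalate them.
    void fire(LifecycleEvent event, Map<String, String> parameters) {
        Handler handler = table.getOrDefault(current, Map.of()).get(event);
        if (handler != null) current = handler.handle(parameters);
    }

    LifecycleState currentState() { return current; }
}
```

Embedding the logic means writing the handlers into each Experiam; the externalized alternative would have a generic handler look up the process to run from the data model itself.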

I mention this point because I think it’s the other critical question, beyond the scalability point.  Unless you want to build software from scratch, or adapt software you already have, to support a specific state/event lifecycle progression, you need an agent that does that for you.  The agent can either be specialized to the mission (you write the logic into “Experiams”) or generalized, interpreting a data model.  The first ExperiaSphere project was based on the first approach, and the subsequent design (I didn’t have the time to do another implementation) for what’s now “ExperiaSphere” was based on the second.

The final element in the ExperiaSphere project was event distribution.  Lifecycle automation has to be event-driven, and having a state/event process doesn’t help if you can’t recognize events.  My work showed there were two kinds of events to be handled.  The first kind is generated by the service lifecycle processes themselves, directed to either subordinate or superior elements to signal a change in state.  The second kind is generated “outside”, either by infrastructure management or by higher-level service processes like customer ordering or service changes.  All internal events are easily handled; you know who you’re directing them to, so you simply dispatch them to a compatible Service Factory.  External events have to be associated with one or more services, which is easy for the higher-level events because they would arrive with a service order to direct them to.  For infrastructure management events, I used an “infrastructure MIB” to hold all management data; MIB proxies queried the real MIBs to populate this database.  A daemon process ran whenever a status change occurred in the infrastructure MIB and “published” the relevant changes, to which services subscribed.
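
A hedged sketch of that last flow, with invented class names: MIB proxies write management data into a shared infrastructure MIB, and a change “daemon” (here just an inline check) publishes any variable that actually changed to the services that subscribed to it.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// Invented infrastructure MIB: a shared store of management variables,
// populated by MIB proxies that query the real element/device MIBs.
class InfrastructureMib {
    private final Map<String, String> variables = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, String>> subscribers = new CopyOnWriteArrayList<>();

    // Services (or their Experiams) subscribe to the status changes they care about.
    void subscribe(BiConsumer<String, String> onChange) {
        subscribers.add(onChange);
    }

    // Called by a MIB proxy; if the value actually changed, "publish" it.
    void update(String variable, String value) {
        String previous = variables.put(variable, value);
        if (!value.equals(previous)) {
            for (BiConsumer<String, String> subscriber : subscribers) {
                subscriber.accept(variable, value);
            }
        }
    }
}
```

A subscribing service would translate a published change into a lifecycle event and hand it, along with the service’s template, to any convenient Service Factory as described earlier.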

This maps out a reality for NFV: if you want to minimize integration, you first need to create as standard a set of states and events as possible, so there’s no variability across elements in how software links to network conditions.  Then you have to define “standard” processes with standardized interfaces for each state/event intersection, and use a software technique (like the Adapter design pattern) to map your current software components to those standards.  You can simplify this, as I did with ExperiaSphere, by letting every process have access to the service data model and by defining the states and events in a “process template”.
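
A hedged sketch of that mapping, reusing the invented LifecycleState/LifecycleEvent enums from the earlier sketch and a hypothetical legacy provisioning client: the adapter presents the standardized state/event interface and translates it into whatever the existing component actually exposes.

```java
import java.util.Map;

// The standardized process interface the lifecycle software would call at
// each state/event intersection.
interface LifecycleProcess {
    void execute(LifecycleState state, LifecycleEvent event, Map<String, String> serviceModel);
}

// Hypothetical existing software component with its own, non-standard API.
class LegacyProvisioningClient {
    void provision(String circuitId) { System.out.println("Provisioning " + circuitId); }
    void tearDown(String circuitId)  { System.out.println("Tearing down " + circuitId); }
}

// Adapter: maps the standard state/event interface onto the legacy calls, so
// the lifecycle software never has to know the component's idiosyncrasies.
class LegacyProvisioningAdapter implements LifecycleProcess {
    private final LegacyProvisioningClient client = new LegacyProvisioningClient();

    @Override
    public void execute(LifecycleState state, LifecycleEvent event, Map<String, String> serviceModel) {
        String circuitId = serviceModel.getOrDefault("circuitId", "unknown");
        if (event == LifecycleEvent.ACTIVATE) {
            client.provision(circuitId);
        } else if (event == LifecycleEvent.FAILURE) {
            client.tearDown(circuitId);
        }
        // The remaining intersections would be filled in from the "process template".
    }
}
```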

This is a constructive step that the NFV ISG could undertake now.  So could the ETSI ZTA group.  So could the TMF.  It would be great if the three could somehow cooperate to do it, because it would be the largest step we can take right now toward facilitating integration in NFV deployments.