Service Lifecycle Management 101: Integrating with Management Processes

One of the questions certain to arise from discussions of service lifecycle management is how VNFs are managed.  The pat answer to this is “similar to the way that the physical network functions (PNFs) that the VNFs replace were managed.”  Actually, it’s not so pat a response, either.  It is very desirable that management practices already in place be altered by NFV or any new technology only when the alteration is a significant net benefit.  Let’s then start with the management of the PNFs.

PNFs, meaning devices or systems of devices, are represented in the service model by a low-level (meaning close-to-resource) intent model.  That model exposes a set of parameters, some of which might relate to an SLA and others simply to the state of the operation the model represents.  The general rule of model hierarchies is that these parameters are populated from data exposed by the stuff within/below.  In the case of our hypothetical PNF-related intent model, the stuff below is the device or device system and the set of management parameters it offers, presumably in a MIB.
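To make the "populated from below" rule concrete, here is a minimal sketch of an intent model whose exposed parameters are derived from device-level readings.  Everything named here is an assumption for illustration: the `IntentModel` class, the parameter names, and the reading keys are hypothetical stand-ins for whatever a real PNF MIB would offer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical raw readings, standing in for values a device MIB would expose.
DeviceReadings = Dict[str, float]


@dataclass
class IntentModel:
    """Hypothetical low-level intent model wrapping a PNF."""
    name: str
    # Each exposed parameter maps to a function that derives it from the data below.
    derivations: Dict[str, Callable[[DeviceReadings], float]]
    parameters: Dict[str, float] = field(default_factory=dict)

    def refresh(self, readings: DeviceReadings) -> None:
        """Populate the exposed parameters from the device-level readings."""
        self.parameters = {
            name: derive(readings) for name, derive in self.derivations.items()
        }


pnf = IntentModel(
    name="firewall-pnf",
    derivations={
        # SLA-related parameters computed from (assumed) device counters.
        "availability_pct": lambda r: 100.0 * r["uptime_s"] / r["interval_s"],
        "loss_pct": lambda r: 100.0 * r["dropped_pkts"] / r["total_pkts"],
    },
)
pnf.refresh({
    "uptime_s": 3590.0,
    "interval_s": 3600.0,
    "dropped_pkts": 5.0,
    "total_pkts": 10000.0,
})
```

What the layers above see is only `pnf.parameters`; the readings themselves stay inside the model.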

Every intent-model element or object in the service model instance has a parameter set, and each can expose an interface through which that parameter set can be viewed or changed.  This mechanism would allow a service management process to alter the behavior of the PNF that was embedded in the object we’re using to represent it.  Presumably the PNF’s own MIB is still accessible as it would normally be, though, and this raises the risk of collision of activities.

One way to prevent PNF management from colliding with service management is to presume that the PNF isn’t “managed” in an active sense by the service management processes.  That would mean that the PNF asserts an SLA and either meets it or has failed.  The PNF management system, running underneath and hidden from service management, does what’s required to keep things working according to the SLA and to restore operation if the PNF does break.

This isn’t a bad approach; you could call it “probabilistic management” because service management doesn’t explicitly restore operation at the level we’re talking about.  Instead, there is a capacity-planned SLA and invisible under-the-model remediation.  For a growing number of services, it’s the most efficient way to assure operation.

If you don’t want to do stuff under the covers, so to speak, then you have to actively mediate the management requests to ensure that you don’t have destructive collisions.  The easiest way to do that is to require that the PNF’s EMS/NMS work not with the actual interfaces but through the same intent model as the service management system.  That model would then have to serialize management changes as needed to ensure stable operation.

The serializing could be done in two ways: directly via the intent model, or at the process level.  Process-level serialization means that the intent model asserts a management API (by referencing its process) and that API is a talker on a bus that the real management process listens to.  All the requests to that API are serialized.  The intent-model-level approach says that management requests are events, and that the management event is generated by whatever is trying to manage.  Events have to be queued in any case because they’re asynchronous, so this is an easier approach.  Event-based management also lets you change how you handle management commands based on the state of the object; you could ignore them if you’re in a state that indicates you’re waiting for something specific.
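The intent-model-level approach can be sketched roughly as follows.  This is only an illustration under stated assumptions: the `ManagementEvent` and `ManagedElement` names, the state names, and the "ignore unless active" rule are all hypothetical, chosen to show queued events being applied one at a time and handled differently depending on state.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class ManagementEvent:
    source: str      # whoever is trying to manage, e.g. "ems" or "service-mgmt"
    parameter: str
    value: object


class ManagedElement:
    """Hypothetical intent-model element that serializes management events."""

    def __init__(self) -> None:
        self.state = "active"
        self.parameters: Dict[str, object] = {}
        self.queue: Deque[ManagementEvent] = deque()
        self.ignored: List[ManagementEvent] = []

    def post(self, event: ManagementEvent) -> None:
        """Any manager posts here; arrival order is the serialization order."""
        self.queue.append(event)

    def drain(self) -> None:
        """Handle queued events one at a time, according to the current state."""
        while self.queue:
            event = self.queue.popleft()
            if self.state == "active":
                self.parameters[event.parameter] = event.value
            else:
                # Waiting on something specific: do not act on the command.
                self.ignored.append(event)


element = ManagedElement()
element.post(ManagementEvent("ems", "mtu", 1500))
element.post(ManagementEvent("service-mgmt", "mtu", 9000))
element.drain()  # both changes are applied in order; the later one stands

element.state = "awaiting-restore"
element.post(ManagementEvent("ems", "mtu", 1500))
element.drain()  # ignored because of the state, so "mtu" is unchanged
```

Because both the EMS/NMS and service management go through `post`, neither can change the element behind the other's back.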

All of this is fine, providing that we have an EMS/NMS that’s managing the PNF.  When we translate the PNF to a VNF, what happens?  It’s complicated.

A VNF has two layers of management: the management of the function itself (which should look much like managing the PNF from which the VNF was derived) and the management of the virtualization of the function.  There are some questions with the first layer, and a lot with the second.

Arguably it’s inconvenient in any management framework to have differences in management properties depending on the vendor or device itself.  For automated management in any form, the inconvenience turns into risk because it might not be easy to harmonize the automated practices across the spectrum of devices.  Thus, it would certainly aid the cause of service lifecycle management if we had uniform VNF functional management.  That could be accomplished simply by translating all the different PNF MIBs into a single MIB via an “Adapter” design pattern.
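A rough sketch of that Adapter idea follows.  The two vendor formats, their field names, and the uniform `oper_up`/`load_pct` view are invented for illustration; a real implementation would map actual vendor MIB objects.

```python
from typing import Dict, Protocol


class UniformFunctionView(Protocol):
    """The single (hypothetical) management view every function should present."""

    def status(self) -> Dict[str, object]: ...


class VendorAAdapter:
    """Adapts an assumed vendor-A format: numeric status, CPU as a percentage."""

    def __init__(self, raw: Dict[str, object]) -> None:
        self.raw = raw

    def status(self) -> Dict[str, object]:
        return {
            "oper_up": self.raw["operStatus"] == 1,
            "load_pct": float(self.raw["cpuUtil"]),
        }


class VendorBAdapter:
    """Adapts an assumed vendor-B format: string state, CPU as a fraction."""

    def __init__(self, raw: Dict[str, object]) -> None:
        self.raw = raw

    def status(self) -> Dict[str, object]:
        return {
            "oper_up": self.raw["state"] == "UP",
            "load_pct": float(self.raw["cpu_fraction"]) * 100.0,
        }


adapters = [
    VendorAAdapter({"operStatus": 1, "cpuUtil": 40}),
    VendorBAdapter({"state": "UP", "cpu_fraction": 0.25}),
]
# Automated practices above this point see one structure, whatever the vendor.
statuses = [adapter.status() for adapter in adapters]
```

The automation only ever deals with the uniform view, so adding a vendor means adding an adapter, not changing the lifecycle logic.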

For the virtualization side of VNF management we have to think differently, because PNFs aren’t being hosted in clouds and service chaining of functions replaces having them live in a common box.  We cannot expose virtualization parameters and conditions to management systems that don’t know what a host is or why we’re addressing subnets and threading tunnels.

The convenient way to address all of this is to think of VNF management as being a set of objects/elements.  The top one is the function part, and the bottom the virtualization part.  It’s my view that the boundary between these (the abstraction) should separate two autonomous management frameworks that are working to a mutual SLA.  So in effect, the function is an intent model and the virtual realization another.  In that second model, we always presume that the management process is working under the covers to sustain the SLA, not exposing its behavior or components to what’s above.  That means that what the NFV ISG calls “MANO” is largely invisible to the higher level of service lifecycle management, just as a YANG model of device control would be invisible, because both are inside an intent model.
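As a small illustration of that boundary, the sketch below has a virtualization-level object that remediates privately and a function-level object that sees only whether the SLA is being met.  The class names, the host list, and the failover rule are hypothetical; this is not a rendering of MANO itself.

```python
from typing import List, Optional


class VirtualRealization:
    """Lower model: remediates under the covers, exposes only SLA status."""

    def __init__(self, hosts: List[str]) -> None:
        self._hosts = list(hosts)                    # hidden from the layer above
        self._active: Optional[str] = self._hosts[0]
        self.sla_met = True

    def host_failed(self, host: str) -> None:
        """Internal remediation: move to a spare host if one remains."""
        if host in self._hosts:
            self._hosts.remove(host)
        if self._hosts:
            self._active = self._hosts[0]
        else:
            self._active = None
            self.sla_met = False                     # only now does the SLA break


class FunctionModel:
    """Upper model: knows nothing about hosts, only about the mutual SLA."""

    def __init__(self, realization: VirtualRealization) -> None:
        self._realization = realization

    def status(self) -> str:
        return "in-sla" if self._realization.sla_met else "sla-violated"


realization = VirtualRealization(["host-1", "host-2"])
function = FunctionModel(realization)

realization.host_failed("host-1")
after_first_failure = function.status()    # remediated below, still in SLA

realization.host_failed("host-2")
after_second_failure = function.status()   # nothing left to fail over to
```

The upper layer never learns that a host failed the first time, which is the point of keeping the virtualization inside its own intent model.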

The whole of the vast, disorderly, often-criticized VNF onboarding process can be viewed as connecting the VNF to this two-level model of a lifecycle element.  You need to define the state/event handling at the top layer, and some mechanism to coordinate the MANO behavior in the virtual part.  You could create a “stub” of those Adapter design patterns in the specialized, VNF-resident, piece of the VNF Manager, to be accessed by a central management process that builds the connection.

You “could” do that, but should you?  I’m concerned that literal adherence to the ETSI model would actually tend to defeat service lifecycle management principles and make software automation and VNF onboarding more difficult.  The only purpose of a “stub” cohabiting with the VNF should be to adapt the management interfaces to a standard structure for the generation of events.  The service model should define the states of the related service elements and how they integrate events with processes.  That way, the service model defines the service, period.  If you have management logic inside a VNF, or if you have a global management process outside the VNF that is shared across VNFs, then you have a traditional transactional structure, one that has a fixed capacity to process things.  That’s kind of anachronistic when one of the goals of NFV is to provide scalable processes that replace non-scalable physical devices.
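To show what "the service model defines the states and how events integrate with processes" might look like, here is a minimal state/event table.  The states, events, and process functions are all hypothetical; the design point is that the processes hold no state of their own, so any number of instances could run them against the model's data.

```python
from typing import Callable, Dict, List, Tuple

Element = Dict[str, object]


# Processes keep nothing between calls; everything they need is in the element.
def deploy(element: Element) -> None:
    element["log"].append("deploy")


def remediate(element: Element) -> None:
    element["log"].append("remediate")


def confirm(element: Element) -> None:
    element["log"].append("confirm")


# The (assumed) table lives in the service model, not inside the VNF:
# (current state, event) -> (process to run, next state)
STATE_EVENT_TABLE: Dict[Tuple[str, str], Tuple[Callable[[Element], None], str]] = {
    ("ordered", "activate"): (deploy, "active"),
    ("active", "fault"): (remediate, "degraded"),
    ("degraded", "restored"): (confirm, "active"),
}


def handle(element: Element, event: str) -> bool:
    """Look the event up for the element's state; run the process if defined."""
    entry = STATE_EVENT_TABLE.get((element["state"], event))
    if entry is None:
        return False  # event is not meaningful in this state
    process, next_state = entry
    process(element)
    element["state"] = next_state
    return True


vnf_element: Element = {"state": "ordered", "log": []}
results: List[bool] = [
    handle(vnf_element, e) for e in ("activate", "fault", "fault", "restored")
]
```

In this sketch the VNF-resident stub would only need to turn management conditions into events such as `"fault"`; what to do about them is decided by the table.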

Functionally, there’s nothing wrong with a model that says that there is a set of boxes inside NFV that connect with abstract interfaces.  Taken literally, meaning at the software level, that model can lead you to implementations that won’t in the long run satisfy market needs.  Automated service lifecycle management is what is needed for NFV to work.  We can get there using proven principles, even proven models, and I’m confident that somebody is going to get it right.  I just wish it would go a bit faster, and exposing the issues is the best way I know to advance progress.