The Relationship Between Service Modeling and Management Strategies

Service modeling is important for zero-touch automation, as I said in an earlier blog.  How the model is constructed is also important for operations, service, and network management.  In fact, it sets up a very important management boundary point that could have a lot to do with how we evolve to software-centric networking in the future.

You could argue that the defining principle of the modern software-driven age is virtualization.  The relevant definition is “not physically existing as such but made by software to appear to do so.”  Jumping off from this, software-defined elements are things that appear to exist because software defines a black-box or boundary that looks like something, often something that already exists in a convenient physical form.  A “virtual machine” looks like a real one, and likewise a “virtual router”.

Virtualization creates a very explicit boundary, outside of which it’s what the software appears to be that matters, and inside of which lies the challenge of ensuring that what really exists behaves like what’s being virtualized.  From the outside, true virtualization would have to expose the same properties in all the functional planes, meaning the data plane, control plane, and management plane.  A virtual device is a failure if it’s not managed like the real device it’s modeled on.  Inside, the real resources used to deliver the correct virtual behavior at the boundary have to be managed, because whatever is outside cannot see those resources, by definition.

One way to exploit the nature of virtualization and its impact on management is to define infrastructure so that the properties of virtual devices truly map to those of the real thing, then substitute the former for the latter.  We already do that in data centers that rely on virtual machines or containers; the resource management properties are the same as (or close enough to) those of the real thing to permit management practices to continue across the transition.  However, we’ve also created a kind of phantom world inside our virtual devices, a world that can’t be managed by the outside processes at all.

The general solution to this dilemma is the “intent model” approach, which says that a virtual element is responsible for self-management of what’s inside, and presentation of an explicit SLA and management properties to what’s outside.  An older but still valuable subset of this is to manage real resources independently as a pool of resources, on the theory that if you capacity-plan correctly and if your resource pool is operating according to your plan, there can be no violations of SLAs at the virtual element level.
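
To make that concrete, here’s a rough sketch of an intent-model element in Python.  The names (IntentElement, report_status, remediate) and the SLA fields are mine, invented purely for illustration and not drawn from any standard; the point is simply that the SLA and status are exposed at the boundary while the internal state and remediation stay hidden inside.

```python
from dataclasses import dataclass, field

@dataclass
class SLA:
    max_latency_ms: float
    min_availability: float  # e.g. 0.9999

@dataclass
class IntentElement:
    name: str
    sla: SLA                                             # the explicit SLA exposed at the boundary
    internal_state: dict = field(default_factory=dict)   # invisible from outside the black box

    def report_status(self) -> dict:
        # Management view exposed outward: SLA conformance, not resource detail.
        return {"element": self.name, "sla_met": self._sla_met()}

    def _sla_met(self) -> bool:
        # Inside the black box: compare measured behavior to the promised SLA.
        measured = self.internal_state.get("measured_latency_ms", 0.0)
        return measured <= self.sla.max_latency_ms

    def remediate(self) -> None:
        # Self-management: the element fixes itself using its own resources;
        # outside processes never see how.
        pass
```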

The difference between the broad intent-model solution and the resource management solution arises when you consider services or applications that are made up of a bunch of nested layers of intent models.  The lowest layer of modeling is surely the place where actual resources are mapped to intent, but at higher layers, you could expect to see a model decompose into another set of models.  That means that if there are management properties the high-level model has to support, it has to do so by mapping between the high-level SLA and management interface and the collection of lower-level SLAs and interfaces.
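
Continuing that purely illustrative sketch (CompositeElement is again a name I’ve made up), a higher-layer element’s management is just a mapping over the SLAs and management views of the elements it decomposes into:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CompositeElement:
    name: str
    children: List[object] = field(default_factory=list)  # lower-level elements or other composites

    def report_status(self) -> dict:
        # The high-level SLA maps to the collection of lower-level SLAs; here the
        # mapping is the simplest one possible (every child must meet its SLA).
        child_reports = [c.report_status() for c in self.children]
        return {
            "element": self.name,
            "sla_met": all(r["sla_met"] for r in child_reports),
            "children": child_reports,
        }
```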

From a management perspective, then, a complex service or application model actually has three different management layers.  At the top is the layer that manages virtual elements using “real-element practices”.  At the bottom is the resource management layer, which manages according to a capacity plan and is largely unaware of anything above.  In the middle is a variable layer that manages the aggregate elements/models, those that map not to resources but to other elements/models.
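
Here’s one way to picture those three layers, again as a sketch with made-up names and thresholds rather than anything prescriptive:

```python
def manage_bottom(pool_utilization: float, planned_ceiling: float = 0.7) -> dict:
    # Bottom layer: a capacity-planned resource pool, unaware of anything above it.
    return {"sla_met": pool_utilization <= planned_ceiling}

def manage_middle(child_statuses: list) -> dict:
    # Middle layer: maps an element's SLA onto the SLAs of the models below it.
    return {"sla_met": all(s.get("sla_met", False) for s in child_statuses)}

def manage_top(virtual_device_status: dict) -> None:
    # Top layer: treats the virtual element like a real device (EMS-style).
    if not virtual_device_status.get("sla_met", True):
        print("raise an alarm, just as a real-device manager would")

# Status rolls up: resource pool -> middle-layer model -> device-level view.
bottom = manage_bottom(0.55)
middle = manage_middle([bottom, {"sla_met": True}])
manage_top(middle)
```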

The management layering here is important because it illustrates that many of our modern network/service automation strategies have missing elements.  The simple model of two layers, the top of which is based on the “real” device management already in place and the bottom on generalized resource management, won’t work if you have a service hierarchy more than two levels deep.

One solution to that is to make the virtual device bigger, meaning to envelop more resource-directed functions in high-level models.  A VPN created by one huge virtual router represents this approach.  The problem is that this creates very brittle models; any change in infrastructure has to be reflected directly in the models that service architects work with.  It’s like writing monolithic software instead of using componentization or microservices—bad practice.  My work on both CloudNFV and ExperiaSphere has demonstrated to me that two-layer service structures are almost certain not to be workable, so that middle layer has to be addressed.

There are two basic ways to approach the management of middle-level elements.  One is to presume that all of the model layers are “virtual devices”, some of which simply don’t correspond to any current real device.  That approach means you’d define management elements to operate on the middle-layer objects, likely based on device management principles.  The other is to adopt what I’ll call compositional management, meaning the TMF NGOSS Contract approach of a data model mediating events to direct them to the correct (management) processes.

IMHO, the first approach is a literal interpretation of the ETSI NFV VNF Manager model.  In effect, you have traditional EMS processes that are explicitly linked with each of the virtualized components, and that work in harmony with a more global component that presumably offers an ecosystemic view.  This works only as long as a model element always decomposes directly into resources, or at least into virtualized functions.  Thus, it seems to me to impose a no-layers approach to virtual services, or at a minimum to leave the middle layers unaddressed.

You could extend the management model of the ISG to elements that don’t decompose into resources, but to do that you’d need to define some explicit management process that gets explicitly deployed and then serves as a kind of “MIB synthesizer”, collecting the management views of the lower-level model elements and decomposing its own management functions down into those lower layers.  This can be done, but it seems to me to have both scalability problems and the problem of needing some very careful standardization, or elements might well become non-portable not because their functionality wasn’t portable but because their management wasn’t.
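
For what it’s worth, a “MIB synthesizer” of the kind I’m describing might look something like the sketch below; the class and method names are hypothetical, and a real one would have to be explicitly deployed, and standardized, alongside the model element it serves.

```python
from typing import Callable, Dict, List

class MibSynthesizer:
    # One of these would be deployed per middle-layer model element.
    def __init__(self, child_views: List[Callable[[], Dict]]):
        # Each callable returns a lower-level element's management view.
        self.child_views = child_views

    def collect(self) -> Dict:
        # Synthesize a single MIB-like view from the lower-level views.
        views = [view() for view in self.child_views]
        return {
            "children": len(views),
            "all_ok": all(v.get("sla_met", False) for v in views),
        }

    def decompose(self, action: str) -> List[str]:
        # Push a management action down to each lower layer (illustrative only).
        return [f"{action} -> child[{i}]" for i in range(len(self.child_views))]
```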

The second approach is what I’ve been advocating.  A data model that defines event/process relationships can be processed by any “model-handler” because its functionality is defined in the data model itself.  You can define the lifecycle processes in state/event progression terms, and tie them to specific events.  No matter what the level of the model element, the functionality needed to process events through the model is identical.  The operations processes invoked could be common where possible, specialized where needed, and as fully scalable as demand requires.
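
A minimal sketch of what I mean, with a made-up table format and event names: the state/event table travels with the data model, so the handler itself knows nothing about the particular service or layer it’s driving.

```python
from typing import Callable, Dict, Tuple

def deploy_process():    print("invoke deployment")
def remediate_process(): print("invoke remediation")
def notify_process():    print("invoke notification")

# (current state, event) -> (operations process to run, next state).
LIFECYCLE: Dict[Tuple[str, str], Tuple[Callable[[], None], str]] = {
    ("ordered", "activate"):   (deploy_process,    "active"),
    ("active", "sla_fault"):   (remediate_process, "degraded"),
    ("degraded", "sla_clear"): (notify_process,    "active"),
}

def handle_event(state: str, event: str) -> str:
    # A generic model-handler: the table, not the code, defines the behavior,
    # so the same handler works at every level of the model.
    process, next_state = LIFECYCLE.get((state, event), (lambda: None, state))
    process()
    return next_state

new_state = handle_event("active", "sla_fault")  # -> "degraded"
```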

You probably can’t model broad resource pools this way, but you don’t need to.  At the critical bottom layer, a small number of intent models with traditional resource management tied to policy-based capacity planning can provide the SLA assurance you need.  This approach could also be made to work for services that didn’t really have independent SLAs, either implicit or explicit.  For all the rest, including most of what IoT and 5G will require, we need the three layers of management and most of all a way to handle that critical new virtualization-and-modeling piece in the middle.  Without the middle, after all, the two ends add up to nothing.