If Management is the Key to NFV, Where is the VNFM?

Of all the components called out in the ETSI NFV E2E architecture, it may be that the most important is the VNF Manager or VNFM.  As the dialog on NFV has advanced, there’s been a shift in benefit focus from capex savings to improvements in operations efficiency and service agility.  In addition, the reliability, availability, and performance of NFV depend on its ability to respond to load changes and failures quickly and optimally.  The biggest question in all of NFV is whether the VNFM is up to the job.  Maybe even what, specifically, the VNFM is.

There is very little said about the VNF Manager in the current E2E document: just two sentences, which say that its role is lifecycle management and that a single VNF Manager could manage one VNF or several.  If you look at the interfaces defined for the VNFM you get the sense that it is in fact the central repository for management functionality, but it’s difficult to see how it works.  Let’s look at the concept to see if there are any clues regarding how it should or must do its job.

Or let’s try.  Conceptually it’s hard to say just what a VNFM is.  Probably the best definition is that it’s the sum of the management requirements of both VNFs and their resources, but that definition is little more than saying that the VNFM is a kind of conceptual place-holder for stuff we need to get done and can’t assign anywhere else.  That leaves the VNFM in a kind of (virtual, of course) limbo.

In at least one sense, a VNFM should ideally be a piece of the “package” of functional components that deploys (via the Orchestrator).  Because it would be co-resident with the VNFs (in the same address space and mutually addressable) it could then parameterize the VNFs as needed.  In theory, it could also terminate any access that VNFs expect to make to resource MIBs, allowing us to synthesize a management view of multi-tenant elements and exercise control without stability and security risks.

The problem is that if a VNFM is a component of a loadable package, it’s potentially a major security risk.  It’s hard to see how a generalized VNFM could “know” what VNFs expected unless it was provided with them.  If it is, then giving it access to underlying multi-tenant resources is as bad as giving the VNFs direct access to the resources.  Every vendor who supplies VNFs could tweak things in their favor at the expense of other functions and services.

At a more mundane level, having a management component part of every deployed VNF package means that a lot of these VNFMs might be polling resource MIBs at any time, which could generate so much traffic that it destabilized the network, not to mention the resource being polled.  You can’t easily discipline what’s embedded in a vendor’s VNF package.
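To make the polling concern concrete, here is a minimal Python sketch of the alternative: a shared proxy that terminates MIB reads so that many readers cause at most one real poll per interval.  Everything in it is an assumption for illustration (the class name, the five-second interval, and the stand-in poll function that replaces a real SNMP GET); nothing like it is defined in the E2E document.

```python
import time
from typing import Callable, Dict, Tuple


class CachedMibProxy:
    """Hypothetical proxy: many readers, at most one real poll per TTL."""

    def __init__(self, poll: Callable[[str], int], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._poll = poll            # function that really touches the resource
        self._ttl = ttl_seconds      # how long a cached value stays fresh
        self._clock = clock          # injectable so the demo is deterministic
        self._cache: Dict[str, Tuple[float, int]] = {}
        self.real_polls = 0          # count of polls that reached the resource

    def read(self, variable: str) -> int:
        now = self._clock()
        cached = self._cache.get(variable)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # served from cache, resource untouched
        value = self._poll(variable)
        self.real_polls += 1
        self._cache[variable] = (now, value)
        return value


def fake_resource_poll(variable: str) -> int:
    # Stand-in for a real SNMP GET; always returns the same made-up value.
    return 42


def demo_poll_count(readers: int) -> int:
    """Simulate `readers` management components reading the same variable."""
    proxy = CachedMibProxy(fake_resource_poll, ttl_seconds=5.0,
                           clock=lambda: 0.0)
    for _ in range(readers):
        proxy.read("ifInOctets.1")
    return proxy.real_polls
```

With per-package VNFMs each doing its own polling, the resource sees every read; with a shared termination point like this, the operator decides how often the resource is actually touched.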

All these factors would argue against making a VNFM a part of a VNF package, but the most compelling argument is the assertion that VNFMs might manage multiple VNFs.  That would suggest that management functionality could be spread across services, and that would of course pose a significant risk if the VNFM were supplied on a package basis. If a single VNFM manages multiple VNFs, meaning multiple packages representing different services, then it creates a passageway through which information could flow between the services.  That, in my view, would mean that a VNFM could not be provided by a VNF vendor, or that such an offering would have to be carefully tested and certified by the operator.  So VNFMs should be centralized in some way.

Not much help so far, huh?  We’ll have to look further.  There are plenty of high-level issues with a VNFM, but those aren’t the only issues.  If we look at the mission again, the VNFM is expected both to fulfill the VNFs’ “local” management needs by providing parameterization, and presumably to assert the MIB interface the user of the service would expect.  Both these requirements present challenges.

The most obvious problem at the mission level is that a network function has likely been decomposed into multiple VNFs or VNF components (VNFCs) for deployment.  That means that we have a set of interdependent components to deal with, and that these components may now have to be parameterized and managed both collectively (which is how the user likely sees them, as a virtual device) and separately, since separate is what they really are.  There’s a general view that the VNF provider can offer parameterization and management in a suitable way, and that may well be true provided that there is no need to know anything about the underlying resources, or to operate on those resources in any way.

But is that the case?  Clearly the management state of a VNF could depend in part on the state of its resources, but even where the resources weren’t part of VNF state, the composite nature of functions like service chaining means that you’d have to be able to construct a view of a collection of VNFs to present as the state of the “virtual device” to the user.  It’s also true that even parameterization of a device might set communications parameter values, in which case the process is now accessing resources.
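As a rough illustration of what “constructing a view” might involve, here is a minimal Python sketch that collapses the states of the components in a chain into one “virtual device” state.  The three status levels and the worst-of rule are my assumptions for the example; a real service might well need a richer rule (redundant components, for one, shouldn’t take the whole device down).

```python
from enum import IntEnum
from typing import Iterable


class Status(IntEnum):
    # Hypothetical status levels, ordered so that a larger value is worse.
    UP = 0
    DEGRADED = 1
    DOWN = 2


def virtual_device_status(component_statuses: Iterable[Status]) -> Status:
    """Worst-of rule: in a chain, the device is only as healthy as its
    least healthy component."""
    statuses = list(component_statuses)
    if not statuses:
        raise ValueError("a virtual device needs at least one component")
    return max(statuses)
```

The point is less the rule itself than where it lives: somebody has to know which components make up the device before any such function can be applied.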

I’ve pointed out before that the problem with management in NFV is that the management processes don’t necessarily understand the relationship between resources and services.  In NFV, the Orchestrator commits resources through a Virtual Infrastructure Manager.  The VIM clearly knows what resources it’s committed, but does it know anything about the relationship between the management parameters of the resources and the management state of the related services?  In order for the VNFM to work properly, it will have to inherit knowledge of resource commitments, but more than that it will have to be told what the specific resource-to-service management derivations are.  Does this particular value in that particular resource MIB convert to this or that status?
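To show what “being told the derivations” could look like in practice, here is a minimal Python sketch.  It assumes two things the E2E document doesn’t specify: a record of which resources were committed to which service, and a set of rules that convert a raw resource variable into a service-level status.  All the names, thresholds, and sample readings are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record inherited from deployment: service -> committed resources.
Commitments = Dict[str, List[str]]
# Hypothetical derivation rules: variable name -> raw value to status.
Rules = Dict[str, Callable[[float], str]]
# Raw readings from resource MIBs: (resource, variable) -> value.
Readings = Dict[Tuple[str, str], float]


def derive_statuses(service: str, commitments: Commitments, rules: Rules,
                    readings: Readings) -> Dict[Tuple[str, str], str]:
    """Apply each rule to each resource committed to the given service."""
    derived: Dict[Tuple[str, str], str] = {}
    for resource in commitments.get(service, []):
        for variable, rule in rules.items():
            key = (resource, variable)
            if key in readings:      # skip variables this resource lacks
                derived[key] = rule(readings[key])
    return derived


# Made-up example data; thresholds are illustrative only.
RULES: Rules = {
    "cpu_load_percent": lambda v: "ok" if v < 80 else "degraded",
    "link_oper_up": lambda v: "ok" if v == 1 else "failed",
}
COMMITMENTS: Commitments = {"vpn-customer-a": ["host-7", "vswitch-3"]}
READINGS: Readings = {
    ("host-7", "cpu_load_percent"): 91.0,
    ("vswitch-3", "link_oper_up"): 1.0,
    ("host-9", "cpu_load_percent"): 10.0,   # not committed to this service
}
```

Calling `derive_statuses("vpn-customer-a", COMMITMENTS, RULES, READINGS)` looks only at host-7 and vswitch-3 and ignores host-9.  Without the commitment record there’s no way to know which readings matter, and without the rules there’s no way to know what they mean.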

This gets even more complicated when you realize that the VNFM is expected to use the Orchestrator to deploy changes to the service, resulting from either changes made by the customer or problems/conditions encountered during operation.  We now have the possibility that the Orchestrator will be telling a VNFM something while the VNFM is also telling the Orchestrator something.  And of course, the VIM actually has to make the connections to the resources for any changes.  Chasing one’s (functional, virtual) tail comes to mind.

Which raises the question about management connections.  In the ISG’s E2E document it appears that the Orchestrator would generally pass fairly high-level requests to the VIM for deployment.  Given that, can we assume that the VIM has “told” the Orchestrator the management details of the resources?  Does the VIM also have to talk to the VNFM to propagate resource information?

If we stay with the functional mission of the ISG we can say that the VNFM does lifecycle management, but that gets us back to that almost-too-vague notion that it’s the NFV equivalent of the Universal Constant.  It seems to me that, speaking functionally, a service model has to define the structure of a service.  I don’t know of any proven way to describe that structure, though we could clearly pick from several different models or make up a new one.  Whatever model we pick, though, it should describe how we commit and manage resources, so both the Orchestrator and the VNFM would be anchored in the model.  Why not combine them, in effect at least?  If we said that each element in the multi-element model of a service steered events to processes via the data model, we’d be recognizing the structure the TMF first called “NGOSS Contract”.  A given service element could have a number of defined lifecycle states, and for each state a number of management events it recognized.  The combination of state and event could define a set of processes to be executed.  Since we could define “Ordered” as a state and “Deploy” as a command/event, we could then say that orchestration was a lifecycle phase, and that the Orchestrator was invoked by the VNFM.

All of this is enough to cause headaches, which may be why VNFM ended up as the place where unassigned management functions go.  But with management-driven benefits emerging as critical to the NFV business case, we need to get the VNFM out of limbo and either make it work or replace it with something that will.

We really need to start thinking of this stuff as software, and think of how we’re going to model the elements of a service.  A data-driven approach to lifecycle management seems not only mandatory but consistent with past efforts.  Protocol handlers are state/event, after all, and so (apparently, in a goal sense at least) was the TMF’s NGOSS Contract.  NFV fits somewhere in between; isn’t it logical that it could be data-driven too?