Why NFV’s VIMs May Matter More than Infrastructure Alone

Everyone knows what MANO means to NFV and many know what NFVI is, but even those who know what “VIM” stands for (Virtualized Infrastructure Manager) may not have thought through the role that component plays and how variations in implementation could affect NFV deployment.  There are a lot of dimensions to the notion, and all of them are important.  Perhaps the most important point about “VIMs” is that how they end up being defined will likely set the dimensions of orchestration.

In the ISG documents, a VIM is responsible for the link between orchestration and management (NFVO and VNFM, respectively) and the infrastructure (NFVI).  One of the points I’ve often made is that a VIM should be a special class of Infrastructure Manager, in effect a vIM.  Other classes of IM would represent non-virtualized assets, including legacy technology.

The biggest open question about an IM is its scope, meaning how granular NFVI appears.  You could envision a single giant VIM representing everything (which is kind of what the ETSI material suggests) or you could envision IMs that represented classes of gear, different data centers, or even just different groups of servers.  There are two reasons why IM scope is important: one competitive, the other related to orchestration.

Competitively, the “ideal” picture for IMs would be that there could be any number of them, each representing an arbitrary collection of resources.  This would allow an operator to use any kind of gear for NFV as long as the vendor provided a suitable IM or vIM.  If we instead envisioned a single giant IM, then any vendor who could dominate either infrastructure or the VIM-to-orchestration-and-management relationship would be able to dictate the terms through which equipment could be introduced.

The flip-side issue is that if you divide up the IM role, then the higher-layer functions have to be able to model service relationships well enough to apportion specific infrastructure tasks to the correct IM.  Having only one IM (or vIM) means that you can declare yourself as having management and orchestration without actually having much ability to model or orchestrate at all.  You fob off the tasks to the Great VIM in the Sky and the rest of MANO is simply a conduit to pass requests downward to the superIM.

I think this point is one of the reasons why we have different “classes” of NFV vendor.  The majority do little to model and orchestrate, and thus presume a single IM or a very small number of them.  Most of the “orchestration” functionality ends up in the IM by default, where it’s handled by something like OpenStack.  OpenStack is the right answer for implementing vIMs, but it’s not necessarily helpful for legacy infrastructure management and it’s certainly not sufficient to manage a community of IMs and vIMs.  The few who do NFV “right” IMHO are the ones who can orchestrate above multiple VIMs.

You can probably see that the ability to support, meaning orchestrate among, multiple IMs and vIMs would be critical to achieving full service operations automation.  Absent the ability to use multiple IMs you can’t accommodate the mix of vendors and devices found in networks today, which means you can’t apply service operations automation except in a green field.  That flies in the face of the notion that service operations automation should lead us to a cohesive NFV future.

Modeling is the key to making multiple IMs work, but not just modeling at the service level above the IM/vIM.  The only logical way to connect IMs to management and orchestration is to use intent models to describe the service goal being set for the IM.  You give an IM an intent model and it translates the model based on the infrastructure it supports.  Since I believe that service operations automation itself demands intent modeling above the IM, it’s fair to wonder what exactly the relationship between IMs/vIMs and management and orchestration models would be.
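To make the "give an IM an intent model and let it translate" idea concrete, here is a minimal sketch.  Everything in it is an assumption made for illustration: the `Intent` fields, the `CloudVIM` and `LegacyIM` class names, and the plan strings are hypothetical and do not come from the ISG documents or any product.  The point is only that the same goal goes to every IM, and each one decides how to realize it on its own infrastructure.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intent:
    """Hypothetical intent model: states the goal, not the mechanism."""
    service: str
    site: str
    bandwidth_mbps: int


class InfrastructureManager(ABC):
    """Common contract: every IM accepts the same intent."""

    @abstractmethod
    def realize(self, intent: Intent) -> List[str]:
        """Return the infrastructure-specific steps for this intent."""


class CloudVIM(InfrastructureManager):
    """A vIM fronting virtualized resources (illustrative steps only)."""

    def realize(self, intent: Intent) -> List[str]:
        return [
            f"boot VM image={intent.service} at {intent.site}",
            f"attach virtual port rate={intent.bandwidth_mbps}Mbps",
        ]


class LegacyIM(InfrastructureManager):
    """An IM fronting non-virtualized gear (illustrative steps only)."""

    def realize(self, intent: Intent) -> List[str]:
        return [
            f"configure {intent.service} appliance at {intent.site}",
            f"set interface policer {intent.bandwidth_mbps}Mbps",
        ]


# The layer above hands the same intent to whichever IM owns the resources.
intent = Intent(service="firewall", site="dallas", bandwidth_mbps=100)
plans = {type(im).__name__: im.realize(intent) for im in (CloudVIM(), LegacyIM())}
```

The higher layer never sees the difference between the two plans; it only knows that an IM accepted the intent.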

My own work on this issue, going back to about 2008, has long suggested that there are two explicit “domains”, service and resource.  This is also reflected in the TMF SID, with customer-facing and resource-facing service components.  The boundary between the two isn’t strictly “resources”, though, at least not as I’d see it.  Any composition of service elements into a service would likely, at the boundaries, create a need to actually set up something concrete, such as an interface.  To me, the resource/service boundary is administrative; it’s functional versus structural within an operator.  Customer processes, being service-related, live on the functional/service side, and operator equipment processes live on the resource side.

Resource-side modeling is a great place to reflect many of the constraints (and anti-constraints) that the ISG has been working on.  Most network cost and efficiency modeling would logically be at the site level, not the server level, so you might gain a lot of efficiency by first deciding which data centers should host VNFs, then dispatching orders to the optimum ones.  This would also let you deploy multiple instances of things like OpenStack or OpenDaylight, which could improve performance.
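A rough sketch of that two-step approach, picking sites first and only then dispatching to each site's own IM, might look like the following.  The `Site` fields, the cost figures, and the greedy cheapest-first rule are all assumptions chosen to keep the example small; a real placement policy would weigh far more than a single per-VNF cost.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Site:
    """Hypothetical site-level view: no per-server detail is exposed."""
    name: str
    cost_per_vnf: float
    free_slots: int


def place_vnfs(sites: List[Site], count: int) -> Dict[str, int]:
    """Greedy site selection: fill the cheapest sites first.

    Returns a mapping of site name to the number of VNFs to order there.
    Each order would then be dispatched to that site's own IM/vIM.
    """
    placement: Dict[str, int] = {}
    remaining = count
    for site in sorted(sites, key=lambda s: s.cost_per_vnf):
        if remaining == 0:
            break
        take = min(site.free_slots, remaining)
        if take:
            placement[site.name] = take
            remaining -= take
    if remaining:
        raise ValueError(f"{remaining} VNFs could not be placed")
    return placement


sites = [
    Site("dc-east", cost_per_vnf=3.0, free_slots=2),
    Site("dc-west", cost_per_vnf=1.5, free_slots=3),
    Site("dc-central", cost_per_vnf=2.0, free_slots=4),
]
placement = place_vnfs(sites, count=5)
```

Here the five VNFs land on the two cheapest sites, and server selection inside each site is left to whatever runs there.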

Customer-based or customer-facing services are easy to visualize; they would be components that are priced.  Resource-facing services would likely be based on exposed management interfaces and administrative/management boundaries.  The boundary point between the two, clear in this sense, might be fuzzy from a modeling perspective.  For example, you might separate VLAN access services by city as part of the customer-facing model, or do so in the resource-facing model.  You could even envision decomposing a customer-facing VLAN access service into multiple resource-facing ones, one for each city involved for example, based on what infrastructure happened to be deployed there.
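The per-city decomposition of a VLAN access service can be sketched in a few lines.  The city names, the technology labels, and the `DEPLOYED` table are hypothetical placeholders; the sketch only shows one customer-facing object fanning out into resource-facing ones according to what is deployed in each city.

```python
from typing import Dict, List

# Hypothetical map of city -> technology deployed there (illustrative only).
DEPLOYED: Dict[str, str] = {
    "nyc": "carrier-ethernet",
    "austin": "sdn-overlay",
}


def decompose_vlan_access(
    service_id: str,
    cities: List[str],
    deployed: Dict[str, str] = DEPLOYED,
) -> List[Dict[str, str]]:
    """Split one customer-facing VLAN access service into one
    resource-facing service per city, chosen by local infrastructure."""
    parts = []
    for city in cities:
        if city not in deployed:
            raise KeyError(f"no resource domain models {city}")
        parts.append(
            {
                "parent": service_id,
                "city": city,
                "resource_service": f"vlan-access/{deployed[city]}",
            }
        )
    return parts


parts = decompose_vlan_access("cust-42-vlan", ["nyc", "austin"])
```

Whether this function lives on the customer-facing or the resource-facing side of the model is exactly the fuzzy choice described above; the code is the same either way.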

From this point, it seems clear that object-based composition/decomposition could take place on both sides of the service/resource boundary, just for different reasons.  As noted earlier, most operators would probably build up resource-facing models from management APIs—if you have a management system domain then that’s probably a logical IM domain too.  But decomposing a service to resources could involve infrastructure decisions different from decomposing a service to lower-level service structures.  Both could be seen as policy-driven but different policies and policy goals would likely apply.

I think that if you start with the presumption that there have to be many Infrastructure Managers, you end up creating a case for intent modeling and the extension of these models broadly in both the service and resource domain.  At the very bottom you have things like EMSs or OpenDaylight or OpenStack, but I think that policy decisions to enforce NFV principles should be exercised above the IM level, and IMs should be focused on commissioning their own specific resources.  That creates the mix of service/resource models that some savvy operators have already been asking for.

A final point to consider in IM/vIM design is serialization of deployment processes.  You can’t have a bunch of independent orchestration tasks assigning the same pool of resources in parallel.  Somewhere you have to create a single queue in which all the resource requests for a domain have to stand till it’s their turn.  That avoids conflicting assignments.  It’s easy to do this if you have IM/vIM separation by domain, but a giant IM/vIM will have to serialize, somewhere internally, every request made to it, which makes it a potential single point of processing (and failure).
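One way to picture the single queue per domain is the sketch below, which assumes a hypothetical `DomainIM` class with one worker thread that owns the resource pool.  Orchestration tasks may submit in parallel, but only the worker ever touches the free-capacity counter, so two requests can never be granted the same units.

```python
import queue
import threading


class DomainIM:
    """Hypothetical IM for one resource domain with a serialized queue."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.free = capacity
        self.granted = []
        self.rejected = []
        self._requests = queue.Queue()
        # A single worker is the only code path that mutates self.free.
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, request_id: str, units: int) -> None:
        """Called by any orchestration task; safe to call concurrently."""
        self._requests.put((request_id, units))

    def _run(self) -> None:
        while True:
            request_id, units = self._requests.get()
            if units <= self.free:
                self.free -= units
                self.granted.append(request_id)
            else:
                self.rejected.append(request_id)
            self._requests.task_done()

    def drain(self) -> None:
        """Block until every queued request has been processed."""
        self._requests.join()


im = DomainIM("dc-west", capacity=10)
tasks = [
    threading.Thread(target=im.submit, args=(f"task-{i}", 3)) for i in range(5)
]
for t in tasks:
    t.start()
for t in tasks:
    t.join()
im.drain()
```

With ten units and five requests of three units each, three requests are granted and two are rejected regardless of arrival order.  Giving each domain its own queue keeps the serialization local; a single giant IM would funnel every request through one such worker.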

Many of you are probably considering the fact that the structure I’m describing might contain half-a-dozen to several dozen models, and will wonder about the complexity.  Well, yes, it is a complex model, but the complexity arises from the multiplicity of management systems, technologies, vendors, rules and policies, and so forth.  And of course you can do this without my proposed “complex” model, using software that, I can promise you as a former software architect, would be much more complex.  You can write good general code to decompose models.  To work from parameters and data files and to try to anticipate all the new issues and directions without models?  Good luck.

To me, it’s clear that diverse infrastructure—servers of different kinds, different cloud software, different network equipment making connections among VNFs and to users—would demand multiple VIMs even under the limited ETSI vision of supporting legacy elements.  That vision is evolving and expanding, and with it the need to have many IMs and vIMs.  Once you get to that conclusion, then orchestration at the higher layer is more complicated and more essential, and models are the only path that would work.