Some Specific Recommendations on Boosting the Role of NFV in the Carrier Cloud

In the last several blogs I developed the theme of making NFV relevant and explored the relationship among the drivers of “carrier cloud”.  One point that I raised, but without numerical detail, is the contribution that NFV would actually make to carrier cloud.  If you look at the model results, there’s a firm connection with NFV in only about 6% of carrier cloud deployment.  That doesn’t really tell the story, though, because there is a credible way of connecting over 80% of carrier cloud deployment to NFV, which would make NFV relevant to almost all server deployments by operators.  If that’s the case, then the risk that NFV proponents face today is failing to realize that credible connection.

The challenge in realization comes down to one of several forms of integration.  There’s been a lot said about the problems of NFV integration, but most of it has missed the real issues.  If we look at the goal of realizing the incremental 74% link to carrier cloud that’s on the table (the gap between the 6% firm connection and the 80% credible one), and if we start from that top goal, we can get some more useful perspectives on the integration topic, and maybe even some paths to a solution.

The next four years of carrier cloud evolution are critical because, as I noted in yesterday’s blog, there’s no paramount architectural driver, or even any single paramount application or service, behind the deployments of that period.  The risk (again, citing yesterday’s blog) is that all the stuff that happens, stuff that will end up deploying almost seven thousand new data centers globally, won’t organize into a single architecture model that can then be leveraged further.  If “carrier cloud” is cohesive architecturally, or if cohesion can somehow be fitted onto whatever happens, then the foundation goes a long way toward easing the rest of the deployment.  This is the first level of integration.

The minimum operator/infrastructure goal for the next four years should be to build a single resource pool based on compatible technology and organized and managed through a common framework.  The resources that make up the carrier cloud must be the foundation of the pool of resources that will build the base for future services, for the later phases of cloud evolution.  That means that:

  1. Operators should presume a cloud host basis for all applications that involve software hosting, whether it’s for features, management, operations support, databases, or whatever. Design everything for the cloud, and insist that everything that’s not cloud-ready today be made so.
  2. There should be a common framework for resource management imposed across the entire carrier cloud pool, from the first, and that framework should then expand as the carrier cloud expands.
  3. Network connectivity within, and to, the carrier cloud resource pool should fit a standard model that is SDN-ready and that is scalable to the full 100-thousand-data-center level that we can expect to achieve globally by 2030.
  4. Deployment of anything that runs on the carrier cloud must be based on an agile DevOps approach that recognizes the notion of abstraction and state/event modeling. It’s less important to define what the model is than to say that the model must be used everywhere and for everything.  Deploy a VNF?  Use the model.  Same with a customer’s cloud application, an element of OSS/BSS, or anything else that’s a software unit.
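
To make the fourth point concrete, here is a minimal sketch in Python of what "use the model everywhere" could look like. The state names, event names, and the `ModeledUnit` class are all hypothetical, invented for illustration; the point is only that a VNF, a customer's cloud application, and an OSS/BSS element are driven through the same state/event table.

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle table: (current state, event) -> next state.
# These state and event names are illustrative, not taken from any standard.
TRANSITIONS = {
    ("ordered", "deploy"): "deploying",
    ("deploying", "deployed"): "active",
    ("active", "fault"): "degraded",
    ("degraded", "repaired"): "active",
    ("active", "remove"): "retired",
}


@dataclass
class ModeledUnit:
    """Any software unit on the carrier cloud, described the same way."""
    name: str
    kind: str                      # e.g. "vnf", "cloud-app", "oss-bss"
    state: str = "ordered"
    history: list = field(default_factory=list)

    def handle(self, event: str) -> str:
        """Apply one event; reject events the table does not define."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{self.name}: event {event!r} is invalid in state {self.state!r}"
            )
        self.history.append(key)
        self.state = TRANSITIONS[key]
        return self.state


# Three very different things, one model and one deployment path.
units = [
    ModeledUnit("firewall-1", "vnf"),
    ModeledUnit("customer-crm", "cloud-app"),
    ModeledUnit("billing-mediator", "oss-bss"),
]
for unit in units:
    unit.handle("deploy")
    unit.handle("deployed")

print([(u.kind, u.state) for u in units])
```

Nothing in the deployment loop knows or cares what kind of unit it is handling, which is the property the recommendation is after.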

The next point builds on the last one, and relates to the integration of the functionality of a service or application using software automation.  Here I want to draw on my own experience in the TMF SDF project, the CloudNFV initiative, my ExperiaSphere work, and work with both operators and vendors in software automation and modeling.  The point is that deployment and application or service lifecycle management must be based on an explicit multi-layer model of the service/application, which serves as the connection point between the software that manages the lifecycle and the management facilities that are offered by the functional stuff being deployed.

A real router or a virtual software router or an SDN network that collectively performs like a router are all, functionally, routers.  There should then be a model element called “router” that represents all of these things, and that decomposes into the implementation to be used based on policy.  Further, a “router network” is also a router—a big abstract and geographically distributed one.  If everything that can route is defined by that single router object, then everything that needs routing, needs to manage routing, or needs to connect with routing can connect to that object.  It becomes the responsibility of the software automation processes to accommodate implementation differences.
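
The single "router" object can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class names, the `add_route`/`routes` operations, and the `prefer` policy key are invented, and the three implementations share one toy routing table where real ones would differ widely.

```python
from abc import ABC, abstractmethod


class Router(ABC):
    """The single functional abstraction: anything that can route."""

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...

    @abstractmethod
    def routes(self) -> dict: ...


class TableRouter(Router):
    """Shared toy behaviour standing in for very different real implementations."""
    implementation = "generic"

    def __init__(self) -> None:
        self._table = {}

    def add_route(self, prefix: str, next_hop: str) -> None:
        self._table[prefix] = next_hop

    def routes(self) -> dict:
        return dict(self._table)


class PhysicalRouter(TableRouter):
    implementation = "physical"


class VirtualRouter(TableRouter):
    implementation = "virtual"


class SdnNetwork(TableRouter):
    implementation = "sdn"


class RouterNetwork(Router):
    """A network of routers is itself a router, just a distributed one."""

    def __init__(self, members) -> None:
        self.members = list(members)

    def add_route(self, prefix: str, next_hop: str) -> None:
        for member in self.members:
            member.add_route(prefix, next_hop)

    def routes(self) -> dict:
        merged = {}
        for member in self.members:
            merged.update(member.routes())
        return merged


IMPLEMENTATIONS = {
    "physical": PhysicalRouter,
    "virtual": VirtualRouter,
    "sdn": SdnNetwork,
}


def decompose(policy: dict) -> Router:
    """Pick the implementation from policy (the 'prefer' key is hypothetical)."""
    return IMPLEMENTATIONS[policy["prefer"]]()


def provision(router: Router) -> dict:
    """Written once against the abstraction, used for every implementation."""
    router.add_route("10.0.0.0/8", "192.0.2.1")
    return router.routes()


edge = decompose({"prefer": "virtual"})
core = RouterNetwork([decompose({"prefer": "physical"}), decompose({"prefer": "sdn"})])
for router in (edge, core):
    print(type(router).__name__, provision(router))
```

The `provision` function is the stand-in for "everything that needs routing": it connects to the abstraction and never to a specific box or controller.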

The second level of integration we need starts with this set of functional model abstractions, and then demands that vendors who purport to support the NFV process supply the model software stubs that harmonize their specific implementation to that model’s glorious standard.  The router has a management information base.  If your implementation doesn’t conform exactly, then you have a stub of code to contribute that harmonizes what you use to that standard MIB.
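
As a rough illustration of such a stub, assume the standard "router" MIB were just three fields. The field names below, the vendor's native format, and the function names are all invented for this sketch; no real MIB or vendor API is being quoted.

```python
# Hypothetical standard field set for the "router" model element.
STANDARD_MIB_FIELDS = ("if_name", "oper_status", "in_octets")


def vendor_x_native_poll() -> dict:
    """Stand-in for a vendor's native management data (invented format)."""
    return {"port": "ge-0/0/1", "link": "UP", "rx_bytes": "1200"}


def vendor_x_stub(native: dict) -> dict:
    """Vendor-contributed stub: translate native data into the standard fields."""
    return {
        "if_name": native["port"],
        "oper_status": native["link"].lower(),   # standard form is lowercase here
        "in_octets": int(native["rx_bytes"]),    # standard form is an integer here
    }


def conforms(record: dict) -> bool:
    """Check that a record exposes exactly the standard fields."""
    return set(record) == set(STANDARD_MIB_FIELDS)


record = vendor_x_stub(vendor_x_native_poll())
print(record, conforms(record))
```

The lifecycle software only ever sees `record`; the renaming and type conversion are the vendor's problem, which is where the burden belongs.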

This helps define what the model itself has to be.  First, the model has to be an organizer for those stubs of stuff.  The “outside” of the model element (like “router”) is a software piece that exposes the set of APIs that you’ve decided are appropriate to that functional element.  Inside that is the set of stub code pieces that harmonize the exposed API to the actual APIs of whatever is being represented by the model—a real router, a management system, a software element—and that performs the function.  Second, the model has to be able to represent the lifecycle states of that functional element, and the events that have to be responded to, such events coming from other elements, from “above” at the user level, or from “below” at the resource level.
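
Here is one way those two requirements might fit together, as a sketch only. The `ModelElement` class, the `call` and `on_event` methods, and the state and event names are hypothetical. I have also assumed that an event with no entry in the table leaves the state unchanged, which is a design choice rather than anything the model demands.

```python
from typing import Callable, Dict, Tuple

EVENT_SOURCES = ("above", "below", "peer")   # user level, resource level, other elements


class ModelElement:
    """Outside: a fixed API for the function. Inside: stubs and a state/event table."""

    def __init__(
        self,
        name: str,
        stubs: Dict[str, Callable],
        transitions: Dict[Tuple[str, str], str],
        initial: str,
    ) -> None:
        self.name = name
        self._stubs = stubs
        self._transitions = transitions
        self.state = initial
        self.log = []

    def call(self, operation: str, *args):
        """Exposed API: dispatch to whichever stub harmonizes this operation."""
        if operation not in self._stubs:
            raise NotImplementedError(f"{self.name} has no stub for {operation!r}")
        return self._stubs[operation](*args)

    def on_event(self, source: str, event: str) -> str:
        """Handle a lifecycle event; undefined events leave the state as it is."""
        if source not in EVENT_SOURCES:
            raise ValueError(f"unknown event source {source!r}")
        self.state = self._transitions.get((self.state, event), self.state)
        self.log.append((source, event, self.state))
        return self.state


# A dictionary stands in for the real thing behind the model.
device = {"routes": {}}


def add_route_stub(prefix: str, next_hop: str) -> bool:
    """Stub that maps the exposed operation onto the (toy) device."""
    device["routes"][prefix] = next_hop
    return True


router = ModelElement(
    "router",
    stubs={"add_route": add_route_stub},
    transitions={
        ("idle", "activate"): "active",
        ("active", "link_down"): "fault",
        ("fault", "link_up"): "active",
    },
    initial="idle",
)

router.on_event("above", "activate")                # request from the user level
router.call("add_route", "10.0.0.0/8", "192.0.2.1")
router.on_event("below", "link_down")               # report from the resource level
print(router.state, device["routes"])
```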

This also defines what integration testing is.  You have a test jig that attaches to the “interfaces” of the model—the router object.  You run that through a series of steps that represent operation and the lifecycle events that the functional element might be exposed to, and you see whether it does what it’s supposed to do.
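
A test jig of that kind does not have to be elaborate. The sketch below assumes the element under test exposes a single `on_event` method and a `state`, and that the expected behavior can be written as a script of (event, expected state) steps. The class, the script, and the state names are hypothetical.

```python
class RouterUnderTest:
    """Toy stand-in for one vendor's implementation behind the 'router' model."""

    TABLE = {
        ("idle", "activate"): "active",
        ("active", "link_down"): "fault",
        ("fault", "link_up"): "active",
        ("active", "deactivate"): "idle",
    }

    def __init__(self) -> None:
        self.state = "idle"

    def on_event(self, event: str) -> str:
        self.state = self.TABLE.get((self.state, event), self.state)
        return self.state


# The script is the specification: each step is (event, expected state).
LIFECYCLE_SCRIPT = [
    ("activate", "active"),
    ("link_down", "fault"),
    ("link_up", "active"),
    ("deactivate", "idle"),
]


def run_jig(element, script) -> list:
    """Drive the element through the script and return any mismatches."""
    failures = []
    for step, (event, expected) in enumerate(script, start=1):
        actual = element.on_event(event)
        if actual != expected:
            failures.append((step, event, expected, actual))
    return failures


failures = run_jig(RouterUnderTest(), LIFECYCLE_SCRIPT)
print("PASS" if not failures else failures)
```

Because the jig attaches only to the model's interface, the same script can be run against every vendor's implementation of "router" without modification.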

Infrastructure is symbiotic almost by definition; elements of deployed services should prepare the way for the introduction of other services.  Agile orchestration and portals mean nothing if you can’t host what you want for the future on what your past services justified.  CORD has worked to define a structure for future central offices, but hasn’t done much to define what gets us to that future.  ECOMP has defined how we bind diverse service elements into a single operational framework, but there are still pieces missing in delivering the agility needed early on, and of course ECOMP adoption isn’t universal.

To me, what this means is that NFV proponents have to forget all their past stuff and get behind the union of ECOMP and CORD.  That’s the only combination that can do what’s needed fast enough to matter in the critical first phase of carrier cloud deployment.  I’d like to see the ISG take that specific mission on, and support it with all their resources.