NFV PoCs, Top-Down Software, and the OPN

You’ve all probably heard of (if not read) “A Tale of Two Cities”, a story that in part emphasizes life as a tension between two poles. Guess what? We have that in NFV, and it will be interesting to see how it plays out.

Yesterday, I got a copy of the interim report for one of the NFV ISG’s PoCs, driven by Telefonica, Intel, Tieto, Qosmos, Wind River, and HP.   The PoC was called “Virtual Mobile Network with Integrated DPI” and the interim report was done very well—thoroughly documented with lots of statistical information, clear goals and proofs.  The focus, as you might suspect from the name, was the demonstration of virtualized EPC using NFV and with horizontal scaling triggered by deep packet inspection.

Today, we have a story in Light Reading about the initial meeting of the Open Platform for NFV (OPN) group, and a preparatory document that outlined the goals of the body.  The document suggests that the first priority for the new group will be defining the NFV Infrastructure (NFVI) and the “Virtual Infrastructure Manager” or VIM.

I have seen both documents and can’t share either one of them, but I can point out that the juxtaposition of the two is laying out the challenges NFV will have to meet along the road to deployment, even relevance.

The thing that I think shouts out from the PoC, which documents an actual test of something actually useful and interesting, is that the value of NFV is really a value generated by management. Suppose we took all of the elements of EPC and simply hosted them on static servers. We’d have “hosted EPC” but not NFV. Even hosting the EPC elements on a virtual distributed resource pool would create only “cloud EPC” but not NFV. What makes something “NFV” is support for service automation and management integration.

In the PoC, changes in network conditions detected by DPI are fed through a management process that can scale out instances of eNodeB to respond to changes in calling patterns and behavior, illustrated through a nice set of example scenarios regarding commuting and a mass event.  The PoC illustrates that you can indeed control performance through horizontal scaling driven by independent network telemetry, and that is a useful step.
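To make the idea of telemetry-driven horizontal scaling concrete, here is a minimal sketch of the kind of decision a management process might make. It is not taken from the PoC report; the `ScalingPolicy` fields, the per-instance capacity, and the function names are all my own assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # Every number here is an illustrative assumption, not a value from the PoC.
    sessions_per_instance: int = 1000  # load one instance is assumed to carry
    min_instances: int = 1
    max_instances: int = 8


def desired_instances(active_sessions: int, policy: ScalingPolicy) -> int:
    """Return how many instances the observed load calls for, within policy bounds."""
    needed = math.ceil(active_sessions / policy.sessions_per_instance)
    return max(policy.min_instances, min(policy.max_instances, needed))


def scaling_action(current: int, active_sessions: int, policy: ScalingPolicy) -> str:
    """Turn a telemetry reading (active sessions seen by DPI) into a management action."""
    target = desired_instances(active_sessions, policy)
    if target > current:
        return f"scale-out to {target}"
    if target < current:
        return f"scale-in to {target}"
    return "hold"


if __name__ == "__main__":
    policy = ScalingPolicy()
    # Hypothetical "mass event" reading arriving while two instances are running.
    print(scaling_action(current=2, active_sessions=2500, policy=policy))
```

The point of the sketch is that the telemetry source and the scaling decision are separate pieces: the DPI probe only reports load, and the policy decides what to do about it.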

It was not the goal of this particular PoC to frame a specification for NFVI, the stuff this all gets hosted on.  There are some conclusions about the need for data-plane optimization, an issue that has been raised by other PoCs, but I can’t find any indication the authors/sponsors believe that it is essential to frame a spec for NFVI.  The NFVI interfaces with orchestration through the VIM, and it does seem clear that whatever you do at the NFVI level should be abstracted by the VIM, which means that VIMs should be able to present a common vision of resources to the orchestration processes regardless of NFVI specifics.
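One way to picture a VIM presenting a common vision of resources is an abstract interface with different NFVI-specific implementations behind it. The sketch below is purely hypothetical: `InfrastructureManager`, `ResourceView`, both VIM classes, and the `place` helper are names I made up, not anything from the ISG or OPN documents.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResourceView:
    """The common picture of free resources that orchestration sees."""
    vcpus_free: int
    memory_gb_free: int


class InfrastructureManager(ABC):
    @abstractmethod
    def resources(self) -> ResourceView: ...

    @abstractmethod
    def deploy(self, function_name: str, vcpus: int, memory_gb: int) -> bool: ...


class HypervisorPoolVIM(InfrastructureManager):
    """Hypothetical VIM over a pool of identical hosts."""

    def __init__(self, hosts: int, vcpus_per_host: int, memory_gb_per_host: int):
        self._vcpus = hosts * vcpus_per_host
        self._memory_gb = hosts * memory_gb_per_host

    def resources(self) -> ResourceView:
        return ResourceView(self._vcpus, self._memory_gb)

    def deploy(self, function_name: str, vcpus: int, memory_gb: int) -> bool:
        if vcpus > self._vcpus or memory_gb > self._memory_gb:
            return False
        self._vcpus -= vcpus
        self._memory_gb -= memory_gb
        return True


class SingleServerVIM(InfrastructureManager):
    """Hypothetical VIM over one server that tracks memory in MB internally."""

    def __init__(self, vcpus: int, memory_mb: int):
        self._vcpus = vcpus
        self._memory_mb = memory_mb

    def resources(self) -> ResourceView:
        # The internal unit is hidden; orchestration always sees GB.
        return ResourceView(self._vcpus, self._memory_mb // 1024)

    def deploy(self, function_name: str, vcpus: int, memory_gb: int) -> bool:
        needed_mb = memory_gb * 1024
        if vcpus > self._vcpus or needed_mb > self._memory_mb:
            return False
        self._vcpus -= vcpus
        self._memory_mb -= needed_mb
        return True


def place(function_name: str, vcpus: int, memory_gb: int,
          managers: List[InfrastructureManager]) -> Optional[int]:
    """Orchestration side: try each manager in order, never looking past the interface."""
    for index, manager in enumerate(managers):
        if manager.deploy(function_name, vcpus, memory_gb):
            return index
    return None
```

The orchestration code in `place` never learns which kind of infrastructure it landed on, which is the property the abstraction is supposed to buy.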

But the big question about the VIM is what relationship it might have with other non-virtual elements in the resource pool.  An NFVI is only part of service resources—unless we think every single network device is going to instantly be fork-lifted into virtual form.  What would happen in the PoC configuration and scenarios if we had legacy components involved?  We might end up with management black holes, places where we needed to adapt the behavior of network elements that weren’t visible in the NFV world at all.

The point is that the OPN process is reportedly making its early focus the definition of a reference configuration for NFVI and an implementation of VIM. The current crop of PoCs provides some insight into both areas, but hardly a complete exploration of requirements. I think it’s arguable that a reference architecture or implementation for either NFVI or VIM could be done without addressing the higher-level question of how legacy network elements are integrated. Can that be done? Can we define “service infrastructure” in its most general sense, and “infrastructure managers” that go beyond the virtual, based on what we know now and on what the only real NFV implementations we can call out (the PoCs) have shown? I don’t think we can.

Back in late 2012, I answered the operators’ first NFV Call for Action white paper with a document that included the following quote: “Experience in the IPsphere Forum (IPSF, now part of TMF), the Telemanagement Forum (TMF) and CIMI Corporation’s own open-source Java-based service element abstraction project suggests to us that the key to achieving the goals of the paper is a structured top-down function virtualization process.” Most software architects and developers today would agree that we live in an age where top-down is the accepted software mantra. Why then are we looking at the bottom of the NFV process first, in a project aimed at implementation?

The key to the value of the PoC I’ve been referencing is the fact that you can take network events and trigger horizontal scaling. I think that goal clearly shows that there is a need to visualize operations and management processes as the response to state/event transitions at the service and network level. I think it also shows that while we can define a way to scale horizontally that fits a given application and a given event source, doing so case by case could lead to an explosion in specialized operations software. To avoid that, we have to structure the framework in which all this happens so that a common approach will solve all the problems, for all the possible mixtures of legacy, SDN, and NFV technology.
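A state/event view of operations can be sketched as a simple table that maps a (state, event) pair to a management process. Everything below is a hypothetical illustration: the state names, event names, and handlers are my own assumptions, not anything defined by the PoC or the ISG.

```python
from typing import Callable, Dict, Tuple

# A handler takes the service context and returns the next state.
Handler = Callable[[dict], str]


def start_scale_out(ctx: dict) -> str:
    ctx["instances"] += 1
    return "SCALING"


def start_scale_in(ctx: dict) -> str:
    ctx["instances"] = max(1, ctx["instances"] - 1)
    return "SCALING"


def finish_scaling(ctx: dict) -> str:
    return "ACTIVE"


# The table is the shared framework: a new event source or technology adds
# rows here rather than a new piece of specialized operations software.
TRANSITIONS: Dict[Tuple[str, str], Handler] = {
    ("ACTIVE", "load_high"): start_scale_out,
    ("ACTIVE", "load_low"): start_scale_in,
    ("SCALING", "instance_ready"): finish_scaling,
}


def handle(state: str, event: str, ctx: dict) -> str:
    """Run the process registered for (state, event); ignore events with no entry."""
    handler = TRANSITIONS.get((state, event))
    if handler is None:
        return state
    return handler(ctx)


if __name__ == "__main__":
    ctx = {"instances": 1}
    state = "ACTIVE"
    for event in ["load_high", "load_high", "instance_ready"]:
        state = handle(state, event, ctx)
        print(event, "->", state, ctx)
```

In this sketch the second `load_high` arrives while the service is already scaling and is simply ignored. Whether that is the right behavior is exactly the sort of question a common framework would have to answer once, instead of every application answering it separately.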

The most disquieting thing I’ve heard about the OPN activity was cited in the Light Reading article (quoting Mitch Wagner): The document concludes: “A face-to-face inception meeting is being organized to take place June 30th to be hosted by CableLabs in Sunnyvale-CA. This meeting will be by invitation-only for those players indicating their strong interest in Platinum membership.” Platinum membership? This makes the OPN process look like a political activity driven by the big campaign contributors. Yes, I know that things like OpenDaylight have been driven by premium vendor memberships too, but is OpenDaylight our example of how top-down design and development should be done? I like OpenDaylight, but I think it’s going to need to be put into context to be successful. I would like to think that the OPN activity, using PoC lessons and working top-down in accord with modern software practices, would create that context. I’m worried that their process may not lead them to do that.