Are the “Issues” With ONAP a Symptom of a Broader Problem?

How do you know that software or software architectures are doing the wrong thing?  Answer: They are doing something that only works in specific cases.  That seems to be a problem with NFV software, including the current front-runner framework, ONAP.  The initial release, we’re told by Light Reading, will support a limited set of vCPE VNFs.  One application (vCPE) and a small number of functions not only don’t make NFV successful, they raise the question of how the whole project is coming together.

Linux is surely the most popular and best-known open-source software product out there.  Suppose that when Linux came out, Linus Torvalds said “I’ve done this operating system that only works for centralized financial applications and includes payroll, accounts receivable, and accounts payable.  I’ll get to the rest of the applications later on.”  Do you think that Linux would have been a success?  The point is that a good general-purpose tool is first and foremost general-purpose.  NFV software that “knows” it’s doing vCPE or that has support for only some specific VNFs isn’t NFV software at all.

NFV software is really about service lifecycle management, meaning the process of creating a toolkit that can compose, deploy, and sustain a service that consists of multiple interdependent pieces, whether they’re legacy technology elements or software-hosted virtual functions.  If every piece of a service has to be interchangeable, meaning support multiple implementations, then you either have to be able to make each alternative for each piece look the same, or you have to author the toolkit to accommodate every current and future variation.  The latter is impossible, obviously, so the former is the only path forward.

To make different implementations of something look the same, you either have to demand that they look the same from the outside in, or you have to model them to abstract away their differences.  That’s what “intent modeling” is all about.  Two firewall implementations should have a common property set that’s determined by their “intent” or mission, which in this case is being a firewall.  An intent model looks like “firewall” to the rest of the service management toolkit, but inside the model there’s code that harmonizes the interfaces of each implementation to that abstract intent-modeled reference.
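As a rough illustration of that idea, here is a minimal Python sketch of an intent model for “firewall”.  Everything in it is hypothetical: the two vendor classes, their native methods, and the FirewallIntent operations are invented for the example and are not taken from ONAP, ETSI, or any real VNF.  The point is only that the service toolkit calls the abstract intent, while per-implementation adapter code hides the differences.

```python
from abc import ABC, abstractmethod
from typing import List


class FirewallIntent(ABC):
    """Hypothetical 'firewall' intent: the only view the toolkit sees."""

    @abstractmethod
    def allow(self, port: int) -> None:
        """Permit traffic on a TCP port."""

    @abstractmethod
    def open_ports(self) -> List[int]:
        """Return the permitted ports in ascending order."""


# Two made-up vendor VNFs with incompatible native interfaces.
class VendorAFirewall:
    def __init__(self) -> None:
        self.rules: List[str] = []

    def add_rule(self, rule: str) -> None:
        self.rules.append(rule)


class VendorBFirewall:
    def __init__(self) -> None:
        self.config = {"permit": set()}

    def push_config(self, key: str, value: int) -> None:
        self.config[key].add(value)


# Adapters: the code "inside the model" that harmonizes each
# implementation to the abstract intent.
class VendorAAdapter(FirewallIntent):
    def __init__(self, vnf: VendorAFirewall) -> None:
        self._vnf = vnf

    def allow(self, port: int) -> None:
        self._vnf.add_rule(f"permit tcp {port}")

    def open_ports(self) -> List[int]:
        return sorted(int(rule.split()[-1]) for rule in self._vnf.rules)


class VendorBAdapter(FirewallIntent):
    def __init__(self, vnf: VendorBFirewall) -> None:
        self._vnf = vnf

    def allow(self, port: int) -> None:
        self._vnf.push_config("permit", port)

    def open_ports(self) -> List[int]:
        return sorted(self._vnf.config["permit"])


def deploy_web_service(firewall: FirewallIntent) -> List[int]:
    """Toolkit-side logic: written once, unaware of which vendor is behind it."""
    for port in (443, 80):
        firewall.allow(port)
    return firewall.open_ports()
```

Calling deploy_web_service with either adapter produces the same result, so swapping one firewall for the other requires no change to the toolkit-side code.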

If there’s anything that seems universally accepted in this confusing age of SDN and NFV, it’s the notion that intent models are critical if you want generalized tools to operate on non-standardized implementations of service components.  How did that get missed here?  Does this mean that there are some fundamental issues to be addressed in ONAP, and perhaps in NFV software overall?  Can they be addressed at this point?

From the very first, NFV was a software project being run by a traditional standards process.  I tried to point out the issues in early 2013, and the original CloudNFV project addressed those issues by defining what came to be known as “intent modeling”.  EnterpriseWeb, the orchestration partner in CloudNFV, took that approach forward into the TMF Catalyst process, and has won awards for its handling of the process of “onboarding” and “metamodels”, the implementation guts of intent modeling.  In short, there’s no lack of history and support for the right approach here.  Why then are we apparently on the wrong track?

I think the heart of the problem is the combination of the complexity of the problem and the simplicity of ad-sponsored media coverage.  Nobody wants to (or probably could) write a story on the real NFV issues, because a catchy title gets all the ad servings you’re ever going to get on a piece.  Vendors know that and so they feed the PR machine, and their goal is to get publicity for their own approach, which is likely to be minimalistic.  And you need a well-funded vendor machine to attend standards meetings or run media events or sponsor analyst reports.

How about the heart of the solution?  We have intent-model implementations today, of course, and so it would be possible to collect a good NFV solution from what’s out there.  The key piece seems to be a tool to facilitate the automated creation of the intent models, to support the onboarding of VNFs and the setting of “type-standards” for the interfaces.  EnterpriseWeb has shown that capability, and it wouldn’t be rocket science for other vendors to work out their own approaches.

It would help if we accepted the fact that “type-standards” are essential.  All VNFs have some common management properties, and all have to support lifecycle steps like horizontal scaling and redeployment.  All VNFs that have the same mission (like “firewall”) should also have common properties at a deeper level.  Remember that we defined SNMP MIBs for classes of devices; why should it be any harder for classes of VNF?  ETSI NFV ISG: If you’re listening and looking for useful work, here is the most useful thing you could be doing!
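To make the “type-standard” notion concrete, here is a small Python sketch of an onboarding check.  The operation names (scale_out, redeploy, health, allow, open_ports) and the CandidateFirewall class are assumptions chosen for illustration; no standards body has defined these sets.  The sketch treats a type-standard as a set of required operations: a base set every VNF must expose, plus a deeper set for each mission class.

```python
# Hypothetical type-standards: operation names are illustrative only.
BASE_VNF_TYPE = frozenset({"scale_out", "redeploy", "health"})
FIREWALL_TYPE = BASE_VNF_TYPE | {"allow", "open_ports"}


def missing_operations(candidate, type_standard):
    """Return the required operations the candidate VNF does not expose."""
    return sorted(
        op for op in type_standard
        if not callable(getattr(candidate, op, None))
    )


class CandidateFirewall:
    """A made-up VNF wrapper submitted for onboarding; it lacks 'health'."""

    def scale_out(self, instances):
        pass

    def redeploy(self):
        pass

    def allow(self, port):
        pass

    def open_ports(self):
        return []


def can_onboard(candidate, type_standard):
    """A VNF is onboardable only if it meets every required operation."""
    return not missing_operations(candidate, type_standard)
```

Here missing_operations(CandidateFirewall(), FIREWALL_TYPE) reports the absent health operation, so the VNF would be rejected at onboarding until its intent model supplies one.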

The media could help here too.  Light Reading has done a number of NFV articles, including the one that I opened with.  It would be helpful if they’d cover the real issues here, including the fact that no international standards group or body with the same biases as the NFV ISG has a better chance of getting things right.  This is a software problem that software architectures and architects have to solve for us.

It may also help to get a new body working on the problem.  ETSI is setting up a zero-touch automation group, which is interesting given that the NFV ISG should have addressed that in their MANO work, that the TMF has had a ZOOM (Zero-touch Orchestration, Operation, and Management) project since 2014, and that automation of the service lifecycle is at least implicit in almost all the open-source MANO stuff out there, including ONAP.  A majority of the operators supporting the new ETSI group tell me that they’d like to see ONAP absorbed into it somehow.

These things may “help”, but optimal NFV demands optimal software, which is hard to achieve if you’ve started off with a design that doesn’t address the simple truth that no efficient service lifecycle management is possible if all the things you’re managing look different and require specific and specialized accommodation.  This isn’t how software is supposed to work, particularly in the cloud.  We can do a lot by adding object-intent-model abstractions to the VNFs and integrating them that way, but it’s not as good an approach as starting with the right software architecture.  We should be building on intent modeling, not trying to retrofit it.

That, of course, is the heart of the problem and an issue we’re not addressing.  You need software architecture to do software, and that architecture sets the tone for the capabilities in terms of functionality, composability, and lifecycle management.  It’s hard to say whether we could effectively re-architect the NFV model the right way at this point without seeming to invalidate everything done so far, but we may have to face that to keep NFV on a relevant track.