What are the “Top” Questions for NFV?

I’ve been blogging that if the SDN and NFV evolution we’ve been engaged with had been approached top-down instead of bottom-up, things would have been different.  I offered a quick view of what top-down NFV might have looked like.  Of course, I know as well as you all that we can’t roll back the clock and do things the other way.  It could be helpful, then, to ask what specific points appear to have been missed in the way things were done.  Perhaps they can still be addressed.

My main point is management.  When you build something based on any form of virtualization, whether you’re virtualizing a router network with SDN or virtualizing a service with NFV, you are creating abstractions to represent functionality, abstractions representing resources, and binding these abstractions together when you deploy.  It’s pretty easy to see that this means you are managing abstractions.

How?  If we construct an abstraction to represent a functional element of a service or the thing the element runs on, we should at the same time construct the management properties of that abstraction.  If I assemble a service from widgets, I have to manage widgets.  If that’s the case, then my construction of the “widget abstraction” has to construct its management framework at the same time.  You can’t expect a management system to understand widgets when there was no such thing before you built it, nor can you expect virtual resources to be manageable in the same way that real resources are managed.  Multi-tenant properties alone demand you frame management views specifically for the service and user.
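To make that concrete, here is a minimal Python sketch in which building an abstraction also builds what it exposes to management. Everything in it is hypothetical (the class name, the "state" metric, the tenant check); it is meant only to illustrate the idea, not any NFV specification.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedAbstraction:
    """An abstraction that declares its own management view when it is built."""
    name: str
    tenant: str
    # Metrics the abstraction chooses to expose (hypothetical names).
    metrics: dict = field(default_factory=dict)
    # The lower-level abstractions this one is assembled from.
    children: list = field(default_factory=list)

    def status(self) -> str:
        # "up" only if this abstraction and everything beneath it is up.
        own = self.metrics.get("state", "up")
        if own != "up" or any(c.status() != "up" for c in self.children):
            return "degraded"
        return "up"

    def management_view(self, viewer_tenant: str) -> dict:
        # Multi-tenancy: a viewer sees only abstractions built for that tenant.
        if viewer_tenant != self.tenant:
            return {}
        return {"name": self.name, "status": self.status()}


firewall = ManagedAbstraction("firewall", "acme", {"state": "up"})
router = ManagedAbstraction("virtual-router", "acme", {"state": "down"})
vpn = ManagedAbstraction("vpn-service", "acme", children=[firewall, router])

print(vpn.management_view("acme"))
print(vpn.management_view("other"))
```

The point of the sketch is that the "vpn-service" widget derives its status from the widgets beneath it, so a management system never needs prior knowledge of what a widget is.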

A second point is the portability of the software elements that make up a service, the virtual network functions or VNFs.  Should a VNF run on any NFV platform, or can vendors create proprietary platform/VNF combinations that are unique to them?  Every single network operator I’ve talked to says that VNFs should be portable across NFV implementations, and I agree.

But what “platform” do they run on?  I’d argue that the answer is “whatever they run on now.”  We should, as a goal, demand that any NFV implementation on-board current network functions from open platforms like Linux.  If you have a piece of code like a Linux firewall, shouldn’t you be able to run it as a VNF?  If there are any new APIs that network code is expected to exercise in order to run as a VNF, then that software will have to be rewritten or “forked” in open-source terms.  If the APIs are proprietary, then there may be no way to make the VNF portable at all.

Point number three is the representation of our abstractions.  A “service” is a collection of virtual functions.  A commercial VPN service, for example, is a combination of a virtual-router function in the center connected to access on-ramp services at the edge.  You can augment this by inserting firewalls between the access on-ramps and the central virtual router, or by adding virtual DNS and DHCP.  So logically speaking we could model this service as a collection of functions.  The process of deployment would then be the process of mapping these functions to resources.

Here there’s a risk, though.  Realizing a virtual-router function on a network depends on the topology of the network, the features of the real devices, and a bunch of other factors.  Presumably there would be a “recipe” associated with a given function that would build that function on a given network.  Thus, there are really two levels of modeling—one to represent functional relationships and the other to describe the mapping to resources.  If we try to combine these two models, we risk having to change the service model to accommodate changes in the network.  So we have functional and structural models.  Do we need to share these, or can operators build them for their own services?  Perhaps we share only one, the functional, and let structural modeling be provider-specific.  Might that then mean we need only to define our functional model in a language?  We need to talk this through.
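One way to picture the two levels is the short Python sketch below. The function names, the two operators, and their resource labels are all invented for illustration. The functional model is shared and never mentions a network; each operator supplies its own "recipes" for the structural side.

```python
# Functional model: what the service is, with no reference to any network.
VPN_SERVICE = ["access", "firewall", "virtual-router"]

# Structural models: operator-specific recipes mapping a function to resources.
# Both recipe sets are hypothetical.
RECIPES_OPERATOR_A = {
    "access": ["edge-switch-port"],
    "firewall": ["vm:linux-firewall"],
    "virtual-router": ["vm:router-1", "vm:router-2"],
}
RECIPES_OPERATOR_B = {
    "access": ["cpe-device"],
    "firewall": ["cpe-device"],
    "virtual-router": ["mpls-core"],
}


def deploy(functional_model, recipes):
    """Bind each function to resources using one operator's recipes."""
    missing = [f for f in functional_model if f not in recipes]
    if missing:
        raise KeyError(f"no recipe for: {missing}")
    return {f: recipes[f] for f in functional_model}


# The same functional model deploys on two different networks unchanged.
plan_a = deploy(VPN_SERVICE, RECIPES_OPERATOR_A)
plan_b = deploy(VPN_SERVICE, RECIPES_OPERATOR_B)
print(plan_a)
print(plan_b)
```

If an operator changes its network, only its recipe table changes; the service definition stays as it was, which is the separation argued for above.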

Speaking of operators, do we expect network operators to build their services end-to-end with no participation of partners?  That’s not how we do services today.  The term “federation” is often used to describe a relationship between operators where elements of a service are shared.  It’s also used to describe the mechanisms for sharing.  How do we “federate” NFV?

Federation was originally declared to be out of scope for the ISG, but if we build a complete description of how to model a service and deploy from that model and leave out the federation requirements, can we be sure that they could be retrofitted at some later point?  I don’t think we can make that assumption.

The final point I want to make is the big one.  We have OSS/BSS systems and processes today.  We have network management and network operations center processes too.  In our virtual world, the world of both SDN and NFV, we can use virtual-software smoke and mirrors to make things look the way we’d like them to look.  It follows that we could take the whole NFV process I’ve described here and push it upward into the OSS/BSS, which would mean that SDN and NFV would be essentially a new set of devices.  But we could also push all of this down into what would be called “element management”, so that virtual function collections for services like my VPN example would appear to the OSS/BSS as a single element.  Thus, we can make NFV an OSS/BSS issue or an NMS issue, or anything in between.
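The two extremes can be sketched in a few lines of Python. The class and method names here are hypothetical, and a real OSS/BSS or element-management interface would be far richer. The same collection of virtual functions is presented either as one element or as a set of individual devices.

```python
class VirtualFunction:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy


class ServiceElement:
    """A collection of virtual functions, viewable at two levels."""

    def __init__(self, name, functions):
        self.name = name
        self.functions = functions

    def as_single_element(self):
        # "Pushed down": the OSS/BSS sees one element with one status.
        ok = all(f.healthy for f in self.functions)
        return {self.name: "up" if ok else "fault"}

    def as_devices(self):
        # "Pushed up": the OSS/BSS sees every function as its own device.
        return {f.name: "up" if f.healthy else "fault" for f in self.functions}


vpn = ServiceElement("vpn", [
    VirtualFunction("access"),
    VirtualFunction("firewall", healthy=False),
    VirtualFunction("virtual-router"),
])
print(vpn.as_single_element())
print(vpn.as_devices())
```

Which view the operations systems receive determines where the fault-isolation logic has to live: inside the element in the first case, in the OSS/BSS in the second.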

Which do we want?  The decision on where this stuff fits in the current operations model could have profound impact on what has to be changed to make optimal use of SDN/NFV.  It certainly has profound impact on what specific standards groups could play a role in the definition of things.  It likely has an effect on how quickly we could drive meaningful changes in service creation and management.

There’s some good news here.  I think that it would be possible to consider these points even now, make insightful decisions on them, and apply the results to the emerging implementation models for NFV and SDN.  Despite the media hype, we have no statistically significant commitment to either technology today; we have time to fix things.  But we are moving from standards into reference implementations, and those have to incorporate high-level issues quickly or the design and development may never be able to accommodate those issues.  Then the choices I framed for each of my points above will have been made…by default.  That’s no way to run a revolution.