How do NFV and Cloud Computing Services Fit?

Virtual functions and NFV are all about virtualizing network features, right?  Well, perhaps not.  There is increased interest in looking at application components as virtual functions, and this trend might be critical in justifying the carrier cloud, and even in supporting NFV’s narrower goal.  It might not be too much of a stretch for current thinking, but it does raise some significant points about how NFV and carrier cloud could or should develop.  By doing so, it may also help identify revenue-generating paths for NFV down the line.

The explicit goal of NFV is to host what were previously “physical network functions” (PNFs) on virtualized servers, meaning in the cloud.  While nothing in the material specifically requires that virtual functions be limited to former PNFs, the great majority of NFV work has focused on hosting data-plane elements, meaning things that sit on the data path and in some way forward packets.  This includes things like firewalls, and this data-plane focus is the essence of the whole “service chaining” requirement in NFV, which lays out an ordered path through a sequence of functions.  Applications in the cloud, in contrast, are destinations and not part of a data flow (at least not today).

NFV also presumes single-tenant/single-service deployments of functions.  There is no specific provision to deploy something for reuse by others, though some vendors (Metaswitch, notably) have defined virtual functions that are not all data-plane, and that could also include multi-tenant features with some diddling.  Cloud applications, of course, are usually shared among at least a group of workers.

These points are important if you consider the idea of broadening NFV to include cloud applications.  Yes, there are surely single-user applications (desktop or device applications), but the majority of what users would call “applications” are inherently multi-tenant.  These points also show that while NFV is conceptually a superset of the cloud, designed to exploit the cloud’s capabilities, the target applications of NFV create a subset of the cloud’s range.  The cloud, for example, is capable of deploying both data-plane and other functions, and of deploying single- or multi-tenant applications.

The distinction between “cloud” and “NFV” is created not by the resources but by the deployment and lifecycle management models.  Those are created by the software layer—NFV MANO, for example, or a cloud DevOps tool.  If you assume that operators have deployed “carrier cloud”, meaning a general set of cloud computing resource pools, then in theory you could host both virtual functions and cloud applications on them as the classic ships in the night.  If operators want to integrate them somehow, it could get more complicated, but before we get to the complications and remedies, let’s look at the “Why?”

There are three broad reasons why an operator would want to integrate cloud applications and NFV services:  1) NFV services are to be used by the cloud applications, to connect with users or resources, 2) the operator wants to utilize the same capabilities for failover, scaling, etc. for NFV and for cloud applications, or 3) the operator wants to use NFV orchestration and management for cloud applications as well.

The first case is probably the simplest.  Clearly it would be possible to connect an NFV service to a cloud application if we assumed that the cloud application was represented by a static address and didn’t move around.  The problem arises in even a simple case of cloud failover; the application is moved, and so the NFV connection has to move too, or rather has to be rerouted to the new location.  You could handle this in NFV by treating the cloud failover as an endpoint failure, a kind of remove-plus-add process if the NFV service didn’t have an explicit “change-endpoint” capability.
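To make the remove-plus-add idea concrete, here is a minimal Python sketch.  Everything in it is an assumption for illustration: `NfvService`, its endpoint methods, and the optional `change_endpoint` capability are hypothetical names, not part of any NFV specification or product.

```python
from dataclasses import dataclass, field


@dataclass
class NfvService:
    """Hypothetical stand-in for an NFV service that connects a set of endpoints."""
    name: str
    endpoints: set = field(default_factory=set)

    def add_endpoint(self, address: str) -> None:
        self.endpoints.add(address)

    def remove_endpoint(self, address: str) -> None:
        self.endpoints.discard(address)


def handle_cloud_failover(service, old: str, new: str) -> None:
    """Reroute the service after the cloud application moves from old to new."""
    if hasattr(service, "change_endpoint"):
        # Use an explicit change-endpoint capability when the service offers one.
        service.change_endpoint(old, new)
        return
    # Otherwise treat the move as an endpoint failure: remove, then add.
    service.remove_endpoint(old)
    service.add_endpoint(new)


svc = NfvService("vpn-to-app", {"10.0.1.5", "192.0.2.10"})
handle_cloud_failover(svc, old="10.0.1.5", new="10.0.2.7")
print(sorted(svc.endpoints))
```

The user-side endpoint is untouched; only the application-side address is swapped, which is all the NFV service needs to know about the cloud failover.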

The second case is a can of worms at three levels.  First, if you were to be scaling or failing over an application component, the decision might be made either at the cloud/application level or at the NFV level.  You can’t let the processes collide.  Second, it’s hard to see how something like the horizontal scaling of an application service could be done without some specific coordination between the application and feature/network deployments and changes, on a more dynamic basis.  Third (and worst), it might well be that the “best” option for scaling or moving something because of a cloud condition would have to be made in part based on network considerations.

It’s possible to imagine event-driven coordination across both NFV and cloud elements.  If you had a policy management system to control where things were deployed, it could be possible to use that system to position the application and functional assets optimally in combination, and then communicate the results to both cloud and NFV tools for the deployment and/or redeployment steps.
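A rough Python sketch of that kind of coordination follows.  It assumes a single hypothetical `PolicyManager` that reacts to an event, picks hosts for application and function elements together, and then hands the same result to both tool sets.  The least-loaded placement rule and all names are illustrative assumptions, not a description of any real policy system.

```python
from typing import Callable, Dict, List

Placement = Dict[str, str]  # element name -> host name


class PolicyManager:
    """Hypothetical policy point that places elements and informs both tool sets."""

    def __init__(self, host_load: Dict[str, int]):
        self.host_load = dict(host_load)
        self.subscribers: List[Callable[[Placement], None]] = []

    def subscribe(self, callback: Callable[[Placement], None]) -> None:
        """Register a deployment tool (cloud or NFV) to receive placements."""
        self.subscribers.append(callback)

    def on_event(self, elements: List[str]) -> Placement:
        """Place every element on the least-loaded host, then notify all tools."""
        placement: Placement = {}
        for element in elements:
            # Ties are broken by host name so the result is deterministic.
            host = min(self.host_load, key=lambda h: (self.host_load[h], h))
            self.host_load[host] += 1
            placement[element] = host
        for notify in self.subscribers:
            notify(placement)
        return placement


cloud_log, nfv_log = [], []
policy = PolicyManager({"host-a": 2, "host-b": 0})
policy.subscribe(cloud_log.append)  # stands in for a cloud DevOps tool
policy.subscribe(nfv_log.append)    # stands in for NFV MANO
placement = policy.on_event(["app-frontend", "firewall-vnf"])
```

The point of the design is that the placement decision is made once, in one place, and both deployment frameworks only execute it.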

You can see how this sort of back-and-forth between two independent deployment frameworks could get complicated, and it could be worse than just complicated if both systems had jurisdiction over overlapping parts of the resource pool—and they do.

If you’re going to share hosts and data center networks between cloud applications and virtual functions, then keeping deployment systems separate for the two invites collision even if the two systems aren’t working on the same problem or deployment at the same time.  You can’t have two chefs cutting up the same chicken.  If you assumed that all the shared resources were really controlled by the same singular underlying control element, used by both the cloud and NFV, it might work.  Think about OpenDaylight or even OpenStack.  However, the performance of such an arrangement might be problematic, and those singular control elements are also single points of failure.
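The “one chef” arrangement can be sketched in a few lines of Python.  The `ResourceArbiter` below is a hypothetical single control element that both deployment systems must ask before claiming capacity; it is not the OpenDaylight or OpenStack API, just an illustration of why the idea works and where it hurts.

```python
import threading
from typing import Dict, List, Tuple


class ResourceArbiter:
    """Hypothetical single control element for a shared resource pool."""

    def __init__(self, capacity: Dict[str, int]):
        self._free = dict(capacity)    # host name -> free CPU cores
        self._lock = threading.Lock()  # serializes every reservation
        self.claims: List[Tuple[str, str, int]] = []

    def reserve(self, host: str, cpus: int, owner: str) -> bool:
        """Grant the claim only if the host still has enough free cores."""
        with self._lock:
            if self._free.get(host, 0) < cpus:
                return False
            self._free[host] -= cpus
            self.claims.append((owner, host, cpus))
            return True


arbiter = ResourceArbiter({"host-a": 8})
cloud_ok = arbiter.reserve("host-a", 6, owner="cloud")
nfv_ok = arbiter.reserve("host-a", 4, owner="nfv")
print(cloud_ok, nfv_ok)
```

The lock is what prevents the collision, and it is also the bottleneck: every reservation from either system queues behind it, and if the arbiter is down nobody can deploy anything.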

OK, so if it’s difficult to imagine how the cloud and NFV would be able to coordinate even to the point of sharing a resource pool, then why not use NFV to deploy applications or their components?  In theory, NFV could provide enhanced features and capabilities to application components, and certainly it could be extremely valuable in carrier applications (which I firmly believe are coming, and in significant numbers) that blend traditional cloud application features and network services.  But can an application component be a VNF?

Sort of, but perhaps the biggest question for NFV will be what application components as VNFs might expose in the way of new requirements.

The biggest issue in deploying cloud applications with NFV is that of multi-tenancy.  Could you use NFV, today and with little or no modification, to deploy an application?  You can in at least some cases, as Metaswitch has proved.  If you look at an application in the cloud, the most likely model is one of an “elastic subnet”, an IP subnetwork that’s distributed ad hoc across the cloud as components scale and move.  This subnet is gatewayed to a VPN or to the Internet.  This is also the kind of model that some NFV implementations of network features (IMS/EPC, for example) would likely follow, but the focus of NFV has been on that service-chain vCPE model.  For things like scaling, you’d scale within this subnet and it could in theory be controlled by an internal element—part of the VNF in NFV terms.

Inside the subnet, you don’t chain services, you simply provide mutual access (Metaswitch’s open IMS implementation is an example).  However, the nature of the interaction between components has to be kept in mind when deploying and redeploying, lest you move two things apart that really need to be kept close.  If you do this, and it’s possible in at least some NFV implementations, then you could deploy the component hosting side of your subnet.  Which means, then, that you could use NFV to deploy cloud elements, and you could provide some VNF-specific features to exercise things like horizontal scaling where required.  The rules for this would have to be defined, though.
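One simple way to express “keep these two things close” is an affinity rule that is checked before any deployment or redeployment.  The Python sketch below assumes a hypothetical model in which each host belongs to a zone and certain component pairs must share a zone; the component and host names are invented for illustration.

```python
from typing import Dict, List, Tuple


def violated_affinities(
    placement: Dict[str, str],
    affinity_pairs: List[Tuple[str, str]],
    zone_of: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return the component pairs that a proposed placement would split across zones."""
    return [
        (a, b)
        for a, b in affinity_pairs
        if zone_of[placement[a]] != zone_of[placement[b]]
    ]


zone_of = {"host-a": "dc1", "host-b": "dc1", "host-c": "dc2"}
pairs = [("sip-proxy", "session-db")]

good = violated_affinities(
    {"sip-proxy": "host-a", "session-db": "host-b"}, pairs, zone_of)
bad = violated_affinities(
    {"sip-proxy": "host-a", "session-db": "host-c"}, pairs, zone_of)
print(good, bad)
```

A deployment tool would reject or re-plan any move for which the returned list is not empty.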

Outside the subnet, things could be more complicated.  In an application deployment, there is always that multi-tenant theme; you are really setting up an application gateway point (or points) that provide access to a networked community.  Network subnet meets networked community; that’s actually an OpenStack Neutron model, but it’s not necessarily an NFV model because we really haven’t defined specific models for NFV other than (implicitly) the service chaining model.

The key element in making this kind of integration work is the gateway, a portal between an application/service subnet and the access network to which users are connected.  Gateways are also explicitly required for federated services, and they are inherently multi-tenant, so you need to think about what deploying a gateway means in NFV.  Not every service gets its own gateway, but how do you know whether there is one, and if so where it is?  The essential presumption so far is that gateways in IP networks are simply BGP transitions that are opaque to services, but that may not be true, and we need to think about how NFV would, for example, change BGP lists.

Management is the next issue.  I’ve said many times before that NFV’s VNF management process is broken, and the introduction of applications/components as VNFs would only exacerbate current issues.  In a nutshell, the problem is that adapting a virtual function for management is almost certainly a one-off task.  I’ve recommended that the approach be changed to create a standard VNFM API with “plugins” that would adapt that standard API to the requirements of a given VNF.  This would be workable for application components, but it would be more difficult.
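The plugin idea is essentially an adapter pattern, and a short Python sketch shows the shape of it.  The interface and class names (`VnfPlugin`, `Vnfm`, `ExampleFirewallPlugin`) and the `fw.` native parameter prefix are assumptions made up for this illustration; no standard VNFM API of this form exists in the NFV material.

```python
from abc import ABC, abstractmethod
from typing import Dict


class VnfPlugin(ABC):
    """Hypothetical standard management interface every VNF plugin implements."""

    @abstractmethod
    def get_status(self) -> Dict[str, str]: ...

    @abstractmethod
    def set_parameter(self, name: str, value: str) -> None: ...


class ExampleFirewallPlugin(VnfPlugin):
    """Adapts an invented firewall's native settings to the standard interface."""

    def __init__(self) -> None:
        self._native_config = {"fw.max_sessions": "1000"}

    def get_status(self) -> Dict[str, str]:
        return {"max_sessions": self._native_config["fw.max_sessions"]}

    def set_parameter(self, name: str, value: str) -> None:
        self._native_config[f"fw.{name}"] = value


class Vnfm:
    """Manager that only ever talks to VNFs through their plugins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, VnfPlugin] = {}

    def register(self, vnf_id: str, plugin: VnfPlugin) -> None:
        self._plugins[vnf_id] = plugin

    def get_status(self, vnf_id: str) -> Dict[str, str]:
        return self._plugins[vnf_id].get_status()

    def set_parameter(self, vnf_id: str, name: str, value: str) -> None:
        self._plugins[vnf_id].set_parameter(name, value)


vnfm = Vnfm()
vnfm.register("fw-1", ExampleFirewallPlugin())
vnfm.set_parameter("fw-1", "max_sessions", "5000")
status = vnfm.get_status("fw-1")
```

The one-off work is confined to the plugin; the manager itself never changes when a new VNF, or a new application component, is onboarded.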

Most VNFs, as representations of former network devices, would likely have very few management interfaces (usually only one).  Applications often not only have multiple management interfaces, but have in-line (meaning from their data flow) mechanisms and even back-end processes.  Can you “add a user” to an application by making a network connection and not changing the application’s own access control database?

This problem is totally solvable, and with the same approach to a management plug-in.  If we assumed that all configuration data, parameterization, and management variables were collected in a repository (as the now-expired i2aex proposal within the IETF called for), and if we further assumed a set of management agents that queried this repository (or updated it) and linked to a stub that accessed the management interface of the component, you could deploy and redeploy applications.
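Here is a minimal Python sketch of that repository, agent, and stub arrangement.  It is not taken from the i2aex proposal; the class names and the dictionary-backed stub are hypothetical, chosen only to show the direction of the data flows.

```python
from typing import Dict


class Repository:
    """Hypothetical central store for configuration and management variables."""

    def __init__(self) -> None:
        self._data: Dict[str, Dict[str, str]] = {}

    def put(self, component: str, key: str, value: str) -> None:
        self._data.setdefault(component, {})[key] = value

    def get(self, component: str) -> Dict[str, str]:
        return dict(self._data.get(component, {}))


class DictStub:
    """Stand-in for a stub that wraps a component's own management interface."""

    def __init__(self) -> None:
        self.applied: Dict[str, str] = {}

    def apply(self, config: Dict[str, str]) -> None:
        self.applied.update(config)

    def read_status(self) -> Dict[str, str]:
        return {"state": "running" if self.applied else "unconfigured"}


class ManagementAgent:
    """Moves data between the repository and one component's stub."""

    def __init__(self, repo: Repository, component: str, stub: DictStub) -> None:
        self.repo, self.component, self.stub = repo, component, stub

    def push_config(self) -> None:
        # Repository -> component: used on deploy and redeploy.
        self.stub.apply(self.repo.get(self.component))

    def pull_status(self) -> None:
        # Component -> repository: keeps management variables current.
        for key, value in self.stub.read_status().items():
            self.repo.put(self.component, key, value)


repo = Repository()
repo.put("billing-app", "db_url", "db.internal:5432")
stub = DictStub()
agent = ManagementAgent(repo, "billing-app", stub)
agent.push_config()
agent.pull_status()
```

Redeploying the component somewhere else means attaching a fresh stub and calling `push_config()` again; the application’s settings live in the repository, not in the instance.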

Perhaps the biggest question is whether any or all of this should really be “done in NFV”.  Remember that we have the cloud and DevOps tools already, widely adopted for this mission.  Should we be thinking about building a specific relationship between cloud DevOps and NFV “servops”?  Probably.  The most attractive way to integrate applications and virtual functions could well be in orchestration, and TOSCA can orchestrate cloud deployment (that’s what it was designed for, after all).  That wouldn’t totally resolve the possibility of resource collision between application and virtual function deployment, nor would it totally resolve management coordination, but it’s a first step, and likely a very important one.

Application integration with NFV opens a lot of issues, and the Metaswitch example illustrates that the cloud issues raised aren’t unique to cloud computing.  Many VNFs, including IMS/EPC and CDNs, are multi-tenant and look more like cloud/subnet applications.  Addressing these issues in NFV, or addressing application integration, could broaden NFV’s scope and make it more valuable in the early services that will drive carrier cloud.