Virtual Networking’s Dirty Operations Secret

Huawei seems to be projecting a future where network equipment takes a smaller piece of the infrastructure budget, with IT and software getting a growing chunk.  Genband seems to be envisioning a UC/UCC space that’s also mostly in as-a-service software form, and they’re also touting NFV principles.  It would seem that the industry is increasingly accepting the transition to a “soft” network.

The challenge for “the industry” is that it’s probably not true that a simple substitution of hosted functionality for dedicated devices would cut operator costs enough to alter the long-term industry profit dynamic.  In fact, my model says that simple pipelined services to SMB or consumer customers would cost, on net, significantly more to operate in hosted form.  Even for business users, the TCO for a virtual-hosted branch service would be close to a wash versus dedicated devices; certainly not enough of an improvement to boost operators’ bottom lines.

I’ve already noted that I believe the value of a more software-centric vision of network services will come not from how it can make old services cheaper but from how it can create new services that aren’t even sold today and that will thus fatten the top line.  But there’s a challenge with this new world, whatever the target of the software revolution might be, and it’s related to the operationalization of a software-based network.

Networks of old were built by gluing customer-and-service-specific devices together with leased-line or virtual-circuit services.  We managed networks by managing devices, and the old OSI management model of element-network-service management layers was tried and true.  When we started to transition to VPN services, we realized that when you added that word “virtual” to a “private network” you broke the old mold.  VPNs and other virtual services are managed via a management portal into a carrier management system, not by letting all the users of shared VPN infrastructure hack away at the MIBs of their devices.  Obviously we’re going to have to do the same thing when we expand virtualization through SDN or NFV, or even through just the normal “climbing the stack” processes of adding new service value.  In fact, we’re going to have to do more.

There’s something wonderfully concrete about a nice box, something you can stick somewhere to be a service element, test, fix, replace, upgrade, etc.  Make that box virtual and it’s a significant operational challenge just to answer the question “Where the heck is Function X?”  In fact, it’s an operational challenge to put the function somewhere in the first place, meaning to address all the cost, performance, administrative, regulatory, and other constraints that collectively define the “optimum” place to host something.  Having made that decision, though, we can’t “manage” the virtual function by simply tearing it down and putting it back again every time something changes, which means we have to be able to find all the pieces and redefine their cooperative relationship.  This is something we have little experience with.
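To make the placement problem concrete, here’s a minimal sketch (in Python, with made-up site names and constraint values; none of this comes from any real orchestrator) of what “addressing the constraints” might look like: filter candidate hosting sites by performance and regulatory limits, then pick the cheapest survivor.  Real placement would weigh far more dimensions, but the shape of the decision is the same.

```python
from dataclasses import dataclass

@dataclass
class HostingSite:
    name: str
    cost_per_hour: float    # hosting cost at this site
    latency_ms: float       # expected latency to the service edge
    jurisdiction: str       # where the site legally resides

def place_function(sites, max_latency_ms, allowed_jurisdictions):
    """Pick the cheapest site that satisfies performance and regulatory limits."""
    candidates = [
        s for s in sites
        if s.latency_ms <= max_latency_ms
        and s.jurisdiction in allowed_jurisdictions
    ]
    if not candidates:
        raise ValueError("No site satisfies the placement constraints")
    return min(candidates, key=lambda s: s.cost_per_hour)

# Example: a hosted branch function that must stay in-country and under 20 ms.
sites = [
    HostingSite("edge-pop-1", 0.40, 8.0, "US"),
    HostingSite("metro-dc-2", 0.25, 18.0, "US"),
    HostingSite("central-dc", 0.10, 45.0, "DE"),
]
print(place_function(sites, max_latency_ms=20.0, allowed_jurisdictions={"US"}).name)
```

The point isn’t the scoring logic; it’s that the decision produces a record of where the function ended up and why, and that record is exactly what the rest of the lifecycle needs.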

The TMF was working on this problem a couple of years ago, and while I’m not particularly a fan of the body (as many of you know), they actually did good, seminal work in the space.  Their solution was something called the “NGOSS Contract”, and it was in effect a smart data model that described not only the service constraints (the things that would define where stuff got hosted and how it was connected) but also what resources were committed to the service and how those resources could be addressed in the service lifecycle process.
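To illustrate the idea (and only the idea; the actual NGOSS Contract is a formal TMF model, and the class names below are my own invention), think of the contract as a data structure that carries both the constraints the service has to honor and the resources that were committed to it, so every lifecycle process can ask the contract, rather than the network, what it’s supposed to touch.

```python
from dataclasses import dataclass, field

@dataclass
class Constraint:
    name: str        # e.g. "max_latency_ms", "data_residency"
    value: object    # the bound the service must respect

@dataclass
class ResourceCommitment:
    resource_id: str      # which hosted component or connection was committed
    management_uri: str   # how lifecycle processes reach it (API endpoint, queue, ...)

@dataclass
class ServiceContract:
    service_id: str
    constraints: list[Constraint] = field(default_factory=list)
    commitments: list[ResourceCommitment] = field(default_factory=list)

    def resources_for(self, lifecycle_step: str):
        """A lifecycle process asks the contract, not the network, which
        resources it has to touch for a given step.  In a real model the
        contract would scope commitments per step; here everything is in scope."""
        return list(self.commitments)

# Example: a "redeploy" step asks the contract which pieces it owns.
contract = ServiceContract(
    "branch-vpn-42",
    constraints=[Constraint("max_latency_ms", 20), Constraint("data_residency", "US")],
    commitments=[ResourceCommitment("vFW-instance-7", "https://nfv-mgr.example/vfw/7"),
                 ResourceCommitment("tunnel-a-b", "https://sdn-ctl.example/tunnels/ab")],
)
print([c.resource_id for c in contract.resources_for("redeploy")])
```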

A service has a lifecycle: provision, in-service parameter change, remove-add-replace element, redeploy, and tear down come to mind, and every step of that lifecycle has to be automated or we’ve reverted to managing service processes manually.  In any virtual world, that would be a fatal shift from an operations cost perspective.  But with SDN, for example, who will know what the state of a route is?  Do we look at every forwarding table entry from point A to B hoping to scope it out, or do we go back to the process that commanded the switches?  Even the SDN controller knows only routes; it doesn’t know services (which can be many routes).  You get the picture.  The service process has to start at the top.  It has to be organized to automate deployment, to be sure, but it also has to automate all the other lifecycle steps.  And if you don’t start it off right, capturing those initial resources, you may as well seal your network moves, adds, and changes into a bottle and toss them into the surf.
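Here’s a sketch of what “starting at the top” might mean in the SDN case (again purely illustrative; the record layout and names are hypothetical, not any controller’s actual API): the service record keeps track of the routes it owns and the forwarding entries each route required, so a lifecycle step can find its pieces by asking the record instead of crawling forwarding tables from A to B.

```python
from dataclasses import dataclass, field

@dataclass
class ForwardingEntry:
    switch_id: str
    match: str        # e.g. "10.0.1.0/24"
    action: str       # e.g. "out:port3"

@dataclass
class Route:
    route_id: str
    entries: list[ForwardingEntry] = field(default_factory=list)

@dataclass
class ServiceRecord:
    service_id: str
    routes: list[Route] = field(default_factory=list)

    def affected_switches(self):
        """Answer 'what do I have to check or move?' from the service record,
        instead of probing every forwarding table between point A and point B."""
        return {e.switch_id for r in self.routes for e in r.entries}

# A service spanning two routes; a lifecycle step (say, replacing a failed
# switch) starts from this record rather than from the switches themselves.
svc = ServiceRecord("vpn-1234", [
    Route("r-east", [ForwardingEntry("sw1", "10.0.1.0/24", "out:2"),
                     ForwardingEntry("sw2", "10.0.1.0/24", "out:5")]),
    Route("r-west", [ForwardingEntry("sw3", "10.0.2.0/24", "out:1")]),
])
print(svc.affected_switches())
```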

One of our challenges in this positioning-and-exaggeration-and-media-frenzy world is that we’re really good at pointing out spectacular things on the benefit or problem side, even if they’re not true.  We’re a lot less good at addressing the real issues that will drive the real value propositions.  Till that’s fixed, all this new stuff is at risk of becoming a science project, a plot for media/analyst fiction, or both.
