SDN and NFV Benefits: It’s a Matter of Scope

Stories about network transformation tend to focus on capex, despite the fact that network operators have consistently indicated that opex is likely more important.  One reason for this is that opex is one of those giant fuzzball areas where you can make almost any claim and get the numbers to work in your favor.  Another is that the whole OSS/BSS space tends to be dull, in no small part because every time you try to talk about it, the example you get is “billing”.

Carol Wilson of Light Reading did a story on CloudNFV yesterday that brings home some of the realities of operations and the challenges of next-gen networking overall.  The focus of the piece is how the concept of CloudNFV evolved as the project matured, and in particular how the project found it necessary to expand its scope to cover enough of the network problem set to be able to present some true benefits.  My role in CloudNFV is known and I’m not going to reprise it here, but I do want to make some points about that critical question of “scope”.

Let’s say that I’m a builder of nice upscale homes, maybe four or five thousand square feet.  These homes contain all manner of carpentry, electrical work, plumbing, flooring, painting…you get the picture.  So now let’s say that somebody invents a new way of doing bathroom floors that claims to reduce floor cost by 25%.  That kind of thing might induce me as a builder to run out and commit to the new approach.

The problem is that a bathroom floor isn’t the product here, a home is.  I have to explore, at a minimum, two critical questions.  First, does that new floor paradigm impact the cost of the surrounding/supporting elements?  Suppose the floor costs 25% less but the cost of running plumbing through it doubles.  Second, is this paradigm of flooring applicable to a larger part of the house?  I can’t make the flooring decision atomically; I need to think along a broader scope.

Remember the “first telephone problem”?  It goes, “Phones will never be successful because nobody will ever buy the first one—there’d be nobody to call.”  NFV and SDN are not going to sweep into networking overnight and displace legacy technology, for the very good reason that the legacy gear has nearly five years of residual depreciation to be accounted for.  We’ll have pockets of new stuff embedded in the cotton ball of legacy networking for years to come.  That means that the business cases for SDN and NFV will have to be made inside an operations framework that legacy gear has established over an almost-epochal period of time.

The “First SDN” or “First NFV” has to make a business case, but it has to make it when there’s just a little island here and there.  If opex savings are the goal, how do these islands pay back?  The majority of the network won’t be “new”, and the new stuff will, if anything, present higher costs because it’s different.  So what does this mean?  It means that if we’re going to shift the justification for SDN and NFV from capex to opex, as nearly all operators say we must, the opex case has to hold up beyond the islands themselves.  In our just-completed fall survey, all but one Tier One said opex improvements were the benefits that would drive both SDN and NFV forward.  But even if we know how SDN or NFV can achieve these benefits inside their little initial enclaves, how do those benefits manifest in the network at large?

There are two pieces to NFV, conceptually.  One is the issue of creating network features from virtual functions—what we could call “incremental NFV”.  This is what the ETSI ISG is working on.  The other issue is creating a management framework that can not only sustain current opex costs/practices as increments of NFV deploy here and there, but actually create a new paradigm for management overall—a paradigm that accommodates NFV islands and then rewards operators for deploying them.

It should be clear to everyone that if we were to define NFV management by simply creating virtual versions of every current device, creating virtual MIBs to correspond with real MIBs, and then linking up to the same management systems we had all along, there’s no change in operations practices and no change in opex.  An approach like that can never deliver meaningful opex savings, so why do we continually hear about it?

The TMF may have the critical elements here.  The GB922 specification, known as the “SID”, has proved (in CloudNFV) to be a highly useful framework for modeling customer services and service elements.  The GB942 specification, sometimes called the “NGOSS Contract”, defines how a data model of a service that includes resource commitments can then become a conduit for channeling management events to the right resource lifecycle processes.  The challenge is that neither of these specs is used this way today.  I think they have potential far beyond anything we’ve tried to exploit so far, and I hope NFV (CloudNFV and every other implementation) can realize that potential.
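To make that GB942 concept concrete, here’s a minimal sketch in Python (purely hypothetical names, nothing drawn from the actual SID or NGOSS Contract schemas) of a service data model that carries resource commitments and routes a management event to the lifecycle process of the element it affects:

```python
# Minimal sketch of the "NGOSS Contract" idea: the service model itself is
# the conduit that steers management events to lifecycle processes.  All
# names here are hypothetical illustrations, not GB922/GB942 structures.
from dataclasses import dataclass, field
from typing import Callable, Dict

Event = Dict[str, str]             # e.g. {"element": "firewall-1", "type": "FAULT"}
Handler = Callable[[Event], None]  # a resource lifecycle process

@dataclass
class ServiceElement:
    name: str
    resource: str                  # the committed resource behind this element
    lifecycle: Dict[str, Handler] = field(default_factory=dict)

@dataclass
class ServiceContract:
    elements: Dict[str, ServiceElement] = field(default_factory=dict)

    def dispatch(self, event: Event) -> None:
        # Route the event through the model to the process bound to the
        # affected element, rather than to a monolithic management system.
        element = self.elements[event["element"]]
        element.lifecycle[event["type"]](event)

def redeploy(event: Event) -> None:
    print(f"redeploying {event['element']} on a new host")

contract = ServiceContract()
contract.elements["firewall-1"] = ServiceElement(
    name="firewall-1", resource="vm-42", lifecycle={"FAULT": redeploy})
contract.dispatch({"element": "firewall-1", "type": "FAULT"})
```

The point of the structure is that operations automation hangs off the model, not off the devices, which is what would let NFV islands and legacy gear share one set of practices.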

I think that the right answer to operationalizing our future network, including a network that’s rich in SDN and NFV capabilities, is going to be based on the principles of GB922/942.  I also think that as we adopt these principles, both the NFV ISG and the TMF are going to have to make some accommodations to the principle of management unity.  The most efficient operations practices are those that work for everything.  Every exception is a cost center.  Only effective mechanisms for abstraction can automate unified management over evolving infrastructure.  To me, the most critical lesson that CloudNFV and other NFV implementations can teach us is how to model services so that we achieve efficient operations without creating resource-specific service definitions.  We’re not addressing that now.  We need to be.
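As one way of framing that modeling question, here’s a minimal sketch (hypothetical names throughout, not a real GB922 structure) of a service definition that references an abstract behavior rather than a specific resource, so the same operations logic works whether the behavior lands on a legacy appliance or a hosted virtual function:

```python
# Sketch of resource-abstract service modeling: the service asks for a
# behavior ("firewall"), and a resolver binds it at deployment time.
# Hypothetical classes for illustration only.
from typing import Protocol

class Firewall(Protocol):
    def apply_policy(self, policy: str) -> None: ...

class ApplianceFirewall:
    def apply_policy(self, policy: str) -> None:
        print(f"pushing policy '{policy}' to a physical appliance")

class HostedFirewall:
    def apply_policy(self, policy: str) -> None:
        print(f"deploying a firewall VNF and applying policy '{policy}'")

def resolve(prefer_hosted: bool) -> Firewall:
    # One service definition, two realizations; operations processes see
    # only the abstract behavior, so neither realization is an exception.
    return HostedFirewall() if prefer_hosted else ApplianceFirewall()

resolve(prefer_hosted=True).apply_policy("block-inbound")
resolve(prefer_hosted=False).apply_policy("block-inbound")
```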

NFV: What We Virtualize Matters

Transformation in telecom means investment in something not currently being invested in, but it doesn’t likely mean “new” money.  Most operators have capital budgets that are based on return on infrastructure, which means that the sum of services a network supports determines how much you spend to sustain or change it.  One of the reasons mobile networking is so important to vendors is that mobile revenues and profits are far better than wireline, so there’s more money to spend on the infrastructure involved in mobile service delivery.

Despite mobile services’ exalted position as revenue king, none of the major operators believe they can sustain current profit levels given the competitive pressure.  As a result, mobile services have been a target for “modernization”, meaning the identification of new technologies or equipment that can deliver more for less.  We’ve had a number of announcements of NFV proof-of-concepts built around mobility—IMS and most recently (from NSN) EPC.  NFV is a kind of poster child for modernization of network architectures, so it’s productive to look at these to see what we might learn about NFV and modernization in general.

One thing that jumps out of the mobile/NFV assessment is that there’s no single model of an NFV service.  When you create a service using NFV you deploy assets on servers that will provide some or all of the service’s functionality.  It’s common to think of these services, and their assets, on a per-customer basis, but clearly nobody is going to deploy a private IMS/EPC for everyone who makes a mobile call or sends an SMS.  We have some services that are per-customer and some that are shared (in CloudNFV we call the latter “infrastructure services”).

This range of services illustrates another interesting point, which is that there are services that have a relatively long deployment life and others that are transient.  An infrastructure service is an example of the former and a complex telepresence conference an example of the latter.  Obviously, something that has to live for years is something whose “deployment” is less an issue than its “maintenance”, and something that’s as transient as a phone call may be something that has to be deployed efficiently but can almost be ignored once active—if it fails the user redials.
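Those two dimensions (who shares the service, and how long it lives) drive very different operational priorities, as this toy sketch with hypothetical names is meant to suggest:

```python
# Toy classification of NFV services along the two dimensions discussed
# above; the names and categories are illustrative, not CloudNFV's own.
from dataclasses import dataclass
from enum import Enum

class Tenancy(Enum):
    PER_CUSTOMER = "per-customer"       # deployed for one service instance
    INFRASTRUCTURE = "infrastructure"   # shared, like IMS or EPC

class Lifetime(Enum):
    PERSISTENT = "persistent"           # lives for years
    TRANSIENT = "transient"             # lives for minutes or hours

@dataclass
class ServiceProfile:
    name: str
    tenancy: Tenancy
    lifetime: Lifetime

    def operations_focus(self) -> str:
        # Long-lived services are a maintenance problem; transient ones
        # are a deployment-efficiency problem (if a call fails, redial).
        if self.lifetime is Lifetime.PERSISTENT:
            return "in-life maintenance and upgrade automation"
        return "fast, cheap deployment"

for svc in (ServiceProfile("EPC core", Tenancy.INFRASTRUCTURE, Lifetime.PERSISTENT),
            ServiceProfile("telepresence conference", Tenancy.PER_CUSTOMER, Lifetime.TRANSIENT)):
    print(f"{svc.name}: {svc.operations_focus()}")
```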

If we look into things that are logically infrastructure services—IMS and EPC—we see another interesting point.  Most infrastructure services are a cooperative community of devices/elements, bound through standardized interfaces to allow operators to avoid vendor lock-in.  When we think about virtualizing these services, we have to ask a critical question: do we virtualize each of the current service elements, or do we virtualize some set of cooperating subsystems?  Look at a diagram of IMS and you see what looks all too much like some complex organizational chart.  However, most of the relationships the diagram shows are internal—somebody using IMS or EPC from the outside would never see them.  They’re artifacts not of the service but of the service architecture.  So do we perpetuate the “old” architecture, which may have evolved to reflect strengths and limitations of appliances, by virtualizing it?  Or do we start from scratch and build a “black box” of virtual elements?

In the IMS world, we can see an example of this in Metaswitch’s Project Clearwater IMS implementation.  Project Clearwater builds IMS by replicating its features, not its elements, which means that the optimum use of the technology isn’t constrained by the limitations of physical devices.  I think something like that is even more important when you look at EPC.  EPC is made up of elements (MME, SGW, PGW…you get the picture) that represent real devices.  If we virtualize them that way, we’re creating an architecture that might fit the general notion of NFV (we’re hosting each element) but flies in the face of the SDN notion of centralization.  Why have central control of IP/Ethernet routing and then do distributed, adaptive mobility management?
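To put that fork in the road in deliberately toy form (neither class below reflects any vendor’s actual implementation), the contrast looks something like this:

```python
# Toy contrast between device-for-device virtualization of EPC and a
# "black box" EPC that exposes only external behavior.  Hypothetical
# classes; real EPC signaling is vastly more involved.
class DeviceForDeviceEPC:
    """One hosted element per 3GPP box: the internal MME/SGW/PGW
    interfaces survive as chatter between virtual machines."""
    def attach(self, subscriber: str) -> None:
        print(f"MME -> SGW -> PGW signaling chain runs for {subscriber}")

class BlackBoxEPC:
    """External behavior only: a central session store can replace the
    distributed, adaptive mobility management of the element model,
    the same way SDN centralizes route control."""
    def __init__(self) -> None:
        self.sessions = {}              # subscriber -> bearer state
    def attach(self, subscriber: str) -> None:
        self.sessions[subscriber] = "bearer-up"
        print(f"central controller binds a bearer for {subscriber}")

for epc in (DeviceForDeviceEPC(), BlackBoxEPC()):
    epc.attach("imsi-001")
```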

So at this point you might wonder whether all the PoC activity around mobility reflects these points, and the problem is that we don’t really know for sure.  My interpretation of the press releases from the vendors involved (most recently NSN) is that the virtualization is taking place at the device level.  Every IMS and EPC element in the 3GPP diagram is reproduced in the virtual world, so all the limitations of the architecture used to provide IMS or EPC are sustained in the “new” world.  You can argue that this eases the transition from hard devices to virtual EPC or IMS, but then you have to ask whether you’ve merely facilitated the transition to an end-game you’ve devalued.  Can we really modernize networks by creating virtual versions of all the current stuff, versions that continue to demand the same interfaces and protocol exchanges as before?  Frankly, I don’t think there’s a chance in the world that’s workable.

So here’s my challenge or offer to vendors.  If you are doing a virtual implementation of some telecom network function that takes a black-box approach rather than rehashing real boxes, I want to hear from you.  I’ll review it, write about it, and hopefully get you some publicity about your differentiation.  I think this is a critically important point to get covered.

Conversely, if you’re virtualizing boxes I want you to tell me how much of the value proposition for the network of the future you’re really going to capture through that approach, if you want me to say nice things.  I also want to know how your box approach manages to dodge the contradiction with SDN goals.  We have one network and SDN and NFV will have to live in that network harmoniously.

Signposts: A Missed Quarter from One, a New CEO from Another

Cisco turned in a disappointing quarter, one that screams “Networking isn’t what it used to be!” for all with even mediocre ears.  Juniper named a new CEO, one that the Street speculates may have been picked to squeeze more value for shareholders by cutting Juniper’s operating expenses.  Such a move would seem to echo my suggested Cisco-news shout, so we have to ask some questions (and hopefully answer them).  One is whether the networking industry really isn’t what it used to be, another is whether it could still be more than a commodity game, and the third is whether Juniper’s new CEO will really do what the Street suggests.

Most of my readers know it’s been my view for several years now that networking as bit-creation and bit-pushing is in fact never going to be what it was.  When you sell cars only to millionaires you can charge a lot per car, but when you have to sell to the masses the prices come down.  Consumerism has killed any notion of profit on bits because consumers simply won’t pay enough for them.  In fact, they really don’t want to pay for bits at all, as their habits in picking Internet service clearly show.  Everyone clusters at the bottom of the range of offerings, where prices are lowest.  And if bits aren’t valuable as a retail element, you can’t spend a ton of money to create them and move them around.  Equipment commoditizes as the services the equipment supports commoditize.  In bit terms, network commoditization is inescapable.  It’s not a macro trend to be waited out—it’s the reality of the future.

The thing is, we’re clearly generating money from networking; it’s just that we’re not generating it from pushing bits.  Operators are largely looking into the candy store through nose-print-covered glass, in no small part because their vendors have steadfastly refused to admit that bits aren’t ever going to be profitable enough.  However, refusal to accept reality doesn’t recast it—as Cisco is showing both in its quarterly results and in its new Insieme-based “hardware SDN”.  What operators need is a mechanism for building higher-level services that not only exploit network transport/connectivity as a service (as the OTTs do) but also empower capabilities at the transport/connection level that can increase profits there.  That builds two layers of value—revenue from “new services” and revenue from enhanced models of current services, models that may not be directly sellable.

Application-Centric Infrastructure (ACI) is arguably aimed at that.  It’s essentially an implementation of SDN that’s based not on x86 processors but on networking hardware.  The argument is that the “new role” for the network is something that demands new hardware, but hardware that incorporates the ability to couple central policy control to traditional switches/routers.  It says that custom appliances are still better.

There’s nothing wrong with this as an approach; SDN at first presumed commodity switches and has translated to a software-overlay vision in no small part through the Nicira product efforts of Cisco rival VMware.  There are specific benefits to the Cisco approach too—custom appliances like ACI switches are almost certainly more reliable than x86 servers and they also offer potentially higher performance.  Finally, they’re immune from hypervisor and OS license charges, which Cisco points out in its marketing material.

Cisco’s notion of ACI is a watershed for the San Jose giant.  To reverse the bit-pushing commoditization slide you have to create value above bits, not just push bits better; otherwise you’re in a commodity market yourself.  For Cisco, therefore, the big barrier is likely less the proving of the specialty-appliance benefit than it is the definition of the services and applications.  We can make network transport/connection the underlayment to services only if we actually create services above the connection/transport layer.  Cisco has great credentials in that space with their UCS stuff, but they’ve been reluctant to step out and define the service layer.  Is it the cloud, or SDN control, or NFV, or something else?  If Cisco wants application-centricity it needs some application center to glom onto, and so far they don’t have one.

I think Cisco’s quarterly weakness is due to their not having that higher-layer strategy.  I don’t think operators would mind an evolutionary vision of networking—they’re deeply invested in current paradigms too.  Not demanding revolution isn’t the same as not requiring any movement at all, though.  To evolve implies some change, and Cisco’s not been eager to demonstrate what the profit future of networking is.  They’re happy to say they have new equipment to support it, though, which leaves an obvious hole.

Which, one might expect, a player like Juniper might have leaped to plug.  To be fair, Juniper has said nothing to suggest that the Street view of the new CEO, Shaygan Kheradpir (formerly of Barclays and Verizon, where he held primarily CTO/CIO-type positions), is correct.  However, it’s clear that from the investor side the goal for Juniper shouldn’t be chasing opportunities Cisco might have fumbled, but cutting costs.  If, as one analyst suggests, Juniper has the highest opex of any company they cover, then cutting opex is the priority.  You can’t cut sales/marketing, so what do you cut if not the stuff usually lumped into “R&D”?  How might Juniper seize on the Cisco gaffe without investing in something?  You see the problem.

What I’m waiting to find out is whether Juniper sees it, and sees the opportunity to continue to be a network innovator rather than a network commoditizer.  Juniper is a boutique router company, with strong assets in the places where services and applications meet the network.  The problem they’ve had historically is that they focus their research and product development almost exclusively on bit-pushing.  The former CEO, Kevin Johnson from Microsoft, was arguably hired to fix the software problem at Juniper, but Microsoft doesn’t make the kind of software Juniper needs and Johnson didn’t solve the problem.  The question now is whether Kheradpir’s CTO background and carrier credentials include the necessary vision to promote a Juniper service layer.  If not, then cost-scavenging may be about as good as it gets.

Right now, the Street seems to be betting against its own analyst vision of Juniper the scavenger hunter.  Logically a cost-managed vision wouldn’t generate the kind of P/E multiple Juniper has, and that should mean its stock price would tumble.  Either traders think Juniper will quickly increase profits through cost reductions without being hurt much by the Cisco-indicated macro trend in network equipment, or they think Juniper might still have some innovation potential.  Which, to be real, would have to be at the service layer.

Both Cisco and Juniper also have to decide where their positioning leaves them with respect to NFV.  For Cisco, espousing hardware SDN almost demands a similar support for appliance-based middle-box features, the opposite of where NFV would take things.  For Juniper, NFV could be a way of either exploiting Cisco’s hardware-centricity or tapping third-party application innovation as a substitute for internal R&D: construct an NFV platform and let others write virtual functions.  In fact, we could say that NFV direction could be Juniper’s “tell” on its overall direction; get aggressive with NFV and Juniper is telegraphing aggression overall…and the converse of course would also be true.