SDN and NFV: Beyond Serendipity

The concept of serendipity is dear to everyone.  You find a winning lottery ticket, dig up a pipe and find a stash of buried treasure, and (if you’re a network vendor) do a bunch of stupid stuff that somehow adds up to putting you in the right place at the right time.  Well, it’s nice to win the lottery but it would probably be a bad life choice to invest your retirement in it.  Being sensible, addressing real issues to realize real opportunities, is a better bet in the long run.  Sometimes that’s hard because “real issues” can be well-hidden, and that’s the case with SDN and NFV.

We’ve seen a number of purported SDN and NFV business cases published.  I don’t agree with the great majority of them, because they don’t address some of these real issues.  In many cases, they don’t address any of them, which means that the assumptions in the documents can’t be supported in the real world.  I don’t intend to criticize any specifically, but I do want to raise the hidden issues.  Look for these the next time you see SDN/NFV numbers.

The first hidden issue is the limited scope of application.  You can’t revolutionize infrastructure if the new technology can command only a couple percent of total capex.  Neither SDN nor NFV will displace optical transport, so their biggest impact is at Layer 2 and above.  There, the largest investment today is in switching/routing, not in the higher-layer appliances.  If you can’t point to a path to broad deployment based on an SDN/NFV “use case” then you’re only proving that one specific service can be done differently.  That won’t change overall infrastructure costs or even (for most operators) the bottom-line profit per bit.

SDN could be used to create tunnels over optical transport, replacing both switching and routing with simple white boxes.  These tunnels could then be combined with NFV-deployed switch/router instances (probably mostly at the edge of the network) to create many services.  However, we aren’t postulating this combination because the big vendors don’t want to see switches and routers decline.  Without something like optical-and-SDN-tunnel-management to boost the number of places where SDN and NFV can be used, we’ll have a difficult time deploying enough of either to make a major difference.  This is why vCPE is potentially a dead end; if what you do is support add-ons to business Ethernet you don’t change enough overall capex and opex to matter.

The second of the hidden issues is operations costs for the resource pool.  If we host an instance of a function in a carrier cloud, we have to be able to maintain that cloud.  Everyone thinks that the cost of operations for virtual elements would be less than for appliances, or at least would be comparable.  Well, here’s the truth.

According to government statistics, network operators spend about 92% of their capital budgets on network equipment and only 8% on IT (servers, software, etc.).  That’s probably not a surprise to anyone given the fact that network operator IT is focused today on running OSS/BSS systems.  But here’s the interesting point: IT service operations costs are almost the same as network operations costs, according to the operators themselves!  In other words, a tail that accounts for 8% of capex is wagging a very large dog.  What would happen if we were to shift 20% or 30% of capex to servers and software?
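One rough way to put numbers on that question is sketched below.  It assumes, purely for illustration, that opex in each category scales linearly with capex at today's ratio, and it normalizes both opex totals to 1.0 because the operators describe them as almost the same.  The function name and the linear-scaling assumption are mine, not something the statistics provide.

```python
# Rough sketch: how total opex might move if capex shifts from network gear to IT.
# ASSUMPTION (not from the source data): opex scales linearly with capex in each
# category, at today's opex-per-capex ratio.

NETWORK_CAPEX_SHARE = 0.92  # share of capex spent on network equipment (from the text)
IT_CAPEX_SHARE = 0.08       # share of capex spent on IT (from the text)

# The text says IT and network operations costs are "almost the same";
# both are normalized to 1.0 unit here for illustration.
NETWORK_OPEX = 1.0
IT_OPEX = 1.0


def projected_opex(shift: float) -> float:
    """Total opex, in today's units, after moving `shift` of total capex to IT."""
    network_rate = NETWORK_OPEX / NETWORK_CAPEX_SHARE  # opex per unit of capex share
    it_rate = IT_OPEX / IT_CAPEX_SHARE
    new_network_share = NETWORK_CAPEX_SHARE - shift
    new_it_share = IT_CAPEX_SHARE + shift
    return new_network_share * network_rate + new_it_share * it_rate


baseline = projected_opex(0.0)
for shift in (0.20, 0.30):
    total = projected_opex(shift)
    print(f"shift {shift:.0%} of capex to IT: opex x{total / baseline:.2f}")
```

Under these deliberately naive assumptions, a 20% shift more than doubles total opex.  Real operations costs would not scale this simply, but the direction of the result is the point.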

What this shows is that we cannot presume that moving something to a hosted model would even sustain current operations costs.  The numbers say we’re moving from a fairly opex-efficient model (the network) into a far less efficient model—the data center.  It shows that one of the major requirements for the success of SDN or NFV is to manage the IT operations costs of the server resources being deployed.  Service automation is absolutely critical for the success of SDN or NFV, and anything that doesn’t address that is just blowing smoke.  But because service automation is “above” SDN and NFV in the OSS/BSS layer, it gets ignored in every single business case.

The third hidden issue is resource pool economy of scale and efficiency.  Hosting functions means having a cloud/virtualized resource pool, and that raises two questions.  First, where are these pools located, and second how economical are they?  The challenge is that these two factors are interdependent.

Logically speaking, if we assumed that every service was going to have a hosted element to it, we’d expect the hosted elements to be located proximate to the offices where the service access connections are terminated.  In the US, we have about ten thousand such sites.  The problem is obvious; if credible SDN and NFV deployment means establishing local resources in each of these central offices, we’d need to deploy ten thousand new data centers to get into the game.  Nobody believes that can happen, so the presumption is that we’d start perhaps with a metro data center or two and then grow with opportunity.

This happy picture is at risk because of what’s called “hairpinning”.  If we don’t have resources close to where we need them, at the service edge, then we have to haul traffic to a centralized point for efficient hosting and then return it to its normal path.  Imagine a service chain that runs from a user fifty miles to a data center for hosting, then fifty miles back to the user, when the two “user” points might themselves be just a couple of miles apart.  We’re wasting network bit-miles with this centralization, and worse, we’re increasing operations risks and costs because we’re transiting more network devices.
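The bit-mile penalty in that example is easy to sketch.  The snippet below uses the fifty-mile haul from the text and takes "a couple miles" to mean 2 miles, which is my assumption; the function name is hypothetical.

```python
# Sketch of the bit-mile penalty from hairpinning, using the distances in the text.
# ASSUMPTION: "a couple miles" between the two user points is taken as 2 miles.

def hairpin_penalty(miles_to_dc: float, direct_miles: float) -> float:
    """Ratio of the hairpinned path length to the direct path length."""
    hairpinned_miles = 2 * miles_to_dc  # out to the data center and back
    return hairpinned_miles / direct_miles


penalty = hairpin_penalty(miles_to_dc=50, direct_miles=2)
print(f"hairpinned path is {penalty:.0f}x the direct path")
```

Every bit in that chain travels about fifty times farther than it would on the direct path, and that is before counting the extra devices it transits along the way.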

This problem is why so many early SDN and NFV applications are based on natural concentration of hosting points.  SDN is often a data center network technology, which means it’s local to a hosting point.  NFV’s top application is vCPE, which hosts the function on the customer premises or at the carrier edge in a network device.  But you still need server blades or cards inside whatever you plan to use to host software.  And you now have made a little server out of edge devices.  Remember that IT server operations is more expensive than network operations?  How much of that expense will you now incur?  Try to find someone who accounts for that.

The final hidden issue is the hypothetical revenue gains to be expected.  I’ve seen all kinds of numbers on the extravagant new revenue opportunity associated with agile services and how the new features and faster deployment will enable operators to reduce customer acquisition and retention costs.  The challenge is that none of our current data supports the assumptions.

One problem here is that accelerating the time to deploy something only increases revenue for those customers who are newly deploying.  If I have ten thousand customers with service already, how many of those will accept an interruption in service while I replace features like firewalls with virtual ones?  How many of these firewalls-in-place were owned by the customer and are working fine?  We assume that if we’re talking about ten thousand customers and a two-week enhancement in provisioning time, we’re talking about a total of twenty thousand weeks of new revenue.  We might, in a static customer environment, be talking about no additional revenue at all.
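To make the arithmetic concrete, here is a minimal sketch of how the claimed gain shrinks as the share of genuinely new deployments shrinks.  The `new_fraction` parameter and its sample values are hypothetical; only the ten thousand customers and the two weeks come from the example above.

```python
# Sketch: weeks of new revenue gained from faster provisioning.
# Only customers who are newly deploying benefit; `new_fraction` is a
# HYPOTHETICAL parameter, not a figure from the text.

def accelerated_revenue_weeks(customers: int, weeks_saved: float, new_fraction: float) -> float:
    """Weeks of extra revenue if only `new_fraction` of customers are new deployments."""
    return customers * new_fraction * weeks_saved


for new_fraction in (1.0, 0.1, 0.0):
    weeks = accelerated_revenue_weeks(10_000, 2, new_fraction)
    print(f"new deployments {new_fraction:.0%}: {weeks:,.0f} revenue-weeks")
```

The twenty-thousand-week figure only appears when every customer is treated as a new deployment.  In a static customer base the same formula returns zero.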

Another problem is the notion that this is all going to help operators compete with OTTs.  Well, most OTT services aren’t candidates for SDN or NFV deployment and we know that because the OTTs that sell them (successfully, I might add) don’t use SDN or NFV.  And here’s another point.  If we have created a model of managed service sale where CPE hosts functions sold through a central portal, doesn’t that look exactly like what an OTT might want to provide?  A shift to this model might well do as much or more for OTT revenues as for operator revenues.

I want to say here what I’ve said before, which is that I firmly believe that both SDN and NFV can be justified and that the potential impact on carrier infrastructure, services, and costs can be profound.  But I also want to say that shallow business cases that rely on unproven assumptions and that ignore real issues will never get us there.  We have six vendors who can make an NFV business case, for example, and yet even those vendors are prone to making business-case statements that are simply not valid.  The problem seems to be that everyone wants to make their SDN or NFV numbers right now, in this quarter, and the fact is that there are too many real issues to be faced to make that happen.

What does SDN or NFV success look like, do you think?  Would you say the technologies are successful if they make up 2% or less of capex?  I don’t think many would call that “success” and yet that’s exactly what will happen if we don’t face these hidden issues, and probably another half-dozen that are more subtle and perhaps have narrower impact.  I think all that stands in the way of real success is the recognition that we really do have to make business sense out of this, because I think we already have technology from at least six vendors that could be harnessed to do just that.

Are you an SDN or NFV hopeful?  If so, then you need to decide whether you’re going to bet your long-term success on playing the lottery or invest in the future.  I’d strongly suggest that those who can address the full SDN/NFV business case do so quickly to make their own offerings compelling.  Wishing, as the saying goes, won’t make it so.