How to Keep SDN/NFV From Going the Way of ATM

Responding to a LinkedIn comment on one of my recent blogs, I noted that SDN and NFV had to focus now on not falling prey to the ATM problems of the past.  It’s worth starting this week by looking objectively at what happened with ATM and how SDN and NFV could avoid that (terrible) fate.  We should all remember that ATM had tremendous support, a forum dedicated to advancing it, and some compelling benefits…all like SDN and NFV.  Those who don’t learn from the past are doomed to repeat it, so let’s try to do some learning.

ATM, or “asynchronous transfer mode”, was a technology designed to allow packet and TDM services to ride on the same paths.  To avoid the inevitable problem of having voice packets delayed by large data packets, ATM proposed to break traffic into “cells” of 53 bytes, and to prioritize cells by class of service to sustain fairly deterministic performance across a range of traffic types.  If you want the details on ATM technology you can find them online.
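For readers who never worked with cells, here is a minimal sketch of the segmentation idea, not a real ATM stack.  It assumes the standard split of a 53-byte cell into a 5-byte header and 48 bytes of payload, uses an all-zero placeholder header rather than the real header fields, and ignores adaptation-layer trailers.  The function name `segment` is my own, purely for illustration.

```python
from typing import List

CELL_SIZE = 53                            # total bytes per ATM cell
HEADER_SIZE = 5                           # cell header bytes
PAYLOAD_SIZE = CELL_SIZE - HEADER_SIZE    # 48 bytes of payload per cell


def segment(packet: bytes, header: bytes = b"\x00" * HEADER_SIZE) -> List[bytes]:
    """Split a packet into fixed-size cells (hypothetical helper).

    The header is a placeholder, not a real ATM header layout.
    The last cell is zero-padded to a full 48-byte payload.
    """
    if len(header) != HEADER_SIZE:
        raise ValueError("header must be exactly 5 bytes")
    cells = []
    for offset in range(0, len(packet), PAYLOAD_SIZE):
        chunk = packet[offset:offset + PAYLOAD_SIZE]
        cells.append(header + chunk.ljust(PAYLOAD_SIZE, b"\x00"))
    return cells


cells = segment(bytes(1500))              # a typical full-size data packet
print(len(cells), "cells,", sum(len(c) for c in cells), "bytes on the wire")
```

A 1500-byte packet becomes 32 cells and 1696 bytes on the wire under these assumptions.  Because every cell is the same small size, a voice cell never waits behind more than one short cell, and that was the whole point.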

If you look at the last paragraph carefully you’ll see that ATM’s mission was one of evolution and coexistence.  The presumption of ATM was that there would be a successful consumer data service model and that model would generate considerable traffic that would be ill-suited for circuit-switched fixed-bandwidth services.  So you evolve your infrastructure to a form that’s compatible with the new data traffic and the existing traffic.  I bought into this view myself.  It’s at least a plausible theory, but it fell down on some critical market truths.

Truth number one was that while ATM was evolving, so were optical transport and Ethernet transport, and in any event even high-speed TDM links (T3/E3) could be used as trunks for packet services.  Further, these physical-layer parallel paths offered a cheaper way of getting to data services because they didn’t impact the cost of the rest of the network or commit the operator to a long period of evolution.

The second truth was that the whole issue of cells was doomed in the long term.  At low speeds, the delay associated with packet transport of voice mingled with data could be a factor, but the faster the pipe the less delay long packets introduced.  We have VoIP today without cells; QED.
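The arithmetic behind that second truth is easy to check.  The sketch below computes only serialization delay, meaning the time a voice packet could sit waiting while one frame ahead of it is clocked onto the link.  The 1500-byte frame size and the list of link speeds are my own illustrative assumptions, and queuing and propagation delay are ignored.

```python
def serialization_delay_ms(frame_bytes: int, link_bps: float) -> float:
    """Time in milliseconds to clock one frame onto a link of the given speed."""
    return frame_bytes * 8 / link_bps * 1000


# Illustrative link speeds (assumptions, not taken from any specific network)
LINKS = {
    "64 kbps": 64e3,
    "T1 (1.544 Mbps)": 1.544e6,
    "T3 (44.736 Mbps)": 44.736e6,
    "1 Gbps": 1e9,
}

for name, bps in LINKS.items():
    big = serialization_delay_ms(1500, bps)   # large data packet
    cell = serialization_delay_ms(53, bps)    # one ATM cell
    print(f"{name:>18}: 1500-byte packet {big:10.4f} ms, 53-byte cell {cell:8.4f} ms")
```

At 64 kbps a 1500-byte packet blocks the line for 187.5 ms, which is fatal to voice, while a cell takes under 7 ms.  At T3 speeds the same packet takes roughly a quarter of a millisecond, and at a gigabit it is about 12 microseconds, at which point the cell’s advantage no longer matters.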

The third truth was that vendors quickly saw all the media hype that developed around ATM and wanted to accelerate new opportunities of their own.  They pushed on stuff that might have supported their own goals but they never addressed the big question, which was how (and whether) you could justify a transition to unified ATM versus partitioned IP/TDM.  They never made the business case.

It’s also worth noting that there was a time dimension to this.  In 1989 the Web came along, and with that we had the first realistic model for consumer data services.  Users were initially limited to dial-up modem speeds, so the fact is that consumer bandwidth for data services was throttled at the edge.  The market was there almost immediately, and realizing it with the overlay model was facilitated by the limited dial-up bandwidth.  But it was clear that consumer broadband would change everything, and it came along in at least an early form within about five years.  At that point, the window for ATM closed.

Few in the ’80s doubted that something like ATM was a better universal network architecture than TDM was, presuming you had a green field and a choice between them.  But that wasn’t the issue because we had TDM already.  IP was, at the time, just as flawed (but in different ways) as ATM as a universal strategy.  What resulted in “IP convergence” and not “ATM deployment” was that IP had control of the application, the service.  The argument that one network for all would have been cheaper and better is probably still being debated, but the fact was (and is) that the differences didn’t matter enough to justify fork-lifting stuff.

I hope that the parallels with SDN and NFV are clear here.  If we were building a global network today where none had existed, we’d probably base it largely on SDN/NFV principles, but we did have IP convergence and so we have a trillion-dollar sunk capital cost and immeasurable human skills and practices to contend with.

My contention from the very first has been that capex would not be enough to justify either SDN or NFV, and operators I talked with as far back as 2013 agreed.  You need new service revenues or dramatic reductions in opex, or you can’t generate a benefit case large enough to reach critical mass in SDN/NFV deployment.  Without that mass we’re back to my operator contact’s “rose-in-a-field-of-poppies” analogy; you just won’t make enough difference to justify the risk.

There were, and still are, plenty of justifications out there, but there seem to be only two real paths that emerge.  One is to find a “Trojan App”, a service whose revenue stream and potential for transformation of user/worker behavior is so profound that it builds out a critical mass of SDN/NFV on its own.  The other is to harness the “Magic Benefit”, a horizontal change that displaces so much cost that it can fund a large deployment, and then sustain it.

The Magic Benefit of operations and management automation—or “service automation”—could deliver operator savings equivalent to reducing capex by over 40% across the board.  I believe that if, in 2013, the NFV ISG and the ONF had jumped on this specific point and worked hard to realize the potential, we could already be talking about large-scale success for both SDN and NFV and certainly nobody would doubt the business case.  Neither body did that.

We do have six vendors (Alcatel-Lucent, Ciena, HPE, Huawei, Oracle, and Overture) who could deliver the Magic Benefit.  I also believe that if in 2014 any of these vendors had positioned an NFV solution credibly based on service automation at the full scope of their current solution, they’d be winning deals by the dozens today and we’d again not be worried about business cases.  Never happened.

If we apply ATM’s lessons, then both SDN and NFV need a tighter tie to services; cost alone isn’t going to cut it.  I’m personally a fan of the Trojan App, but the problem is that there are only two that are credible.  One is the mobile/content delivery infrastructure I just blogged on and the other is the Internet of Things.  For the former, we have only a few SDN/NFV vendors who could easily drive the business case—Alcatel-Lucent and Huawei of my six total-solution players have credible mobile/content businesses.  IoT doesn’t even have a credible business/service model.  It’s hyped more than SDN and NFV, and to just as evil an effect.

There is no question that mobile and content infrastructure could be a tremendous help to SDN/NFV deployment because both are well-funded and make up a massive chunk of current capital spending.  If you get critical mass for SDN/NFV with mobile/content deployment, you get critical mass for everything and anything else.  No other success would be needed to lay the groundwork.  But there’s still the nagging question of whether SDN/NFV benefits services in any specific way.  At the end of the day, we’re still pushing the same protocols and bits.

All of the six NFV prime vendors could also tell a strong mobile/content story.  Metaswitch is one of the most experienced of all vendors in the NFV space, and their Project Clearwater IMS would be a strong contender for many mobile operators and a super strategy for a future where MVNOs did more of the service-layer control than is common today.  Any vendor could assemble open-source elements to create an IoT model, though it would be far easier if some big player with some market might got behind it.

IoT is the opposite, meaning that instead of having a lot of paths that risk being service-less, we have no credible paths because service-oriented IoT hasn’t been hot.  Everyone is focusing on the aspect of IoT that’s the most expensive and raises the largest security and public policy concerns: attaching new sensors.  We have billions of sensors already, and we have technologies to connect them without all the risk of an open network model.  What we need is an application architecture.

Interestingly, I heard HPE’s CTO give a very insightful talk on IoT that at least seemed to hint at a credible approach and one that could easily integrate both SDN and NFV effectively.  For some reason this hasn’t gotten much play from HPE in a broader forum; most operators tell me they don’t know about it.  Other NFV prime vendors could also play in an effective IoT model, though it would be easier for players like HPE or Oracle to do that because they have all the specific tech assets needed to quickly frame a solution.

The lesson of ATM is at the least that massive change demands massive benefits, which demand massive solutions.  It may even demand a new service model, because cost-driven evolution of mass infrastructure is always complicated by the fact that the cheapest thing to do is use what you already have.  I think that in the coming year we’re going to see more operators and vendors recognizing that, and more wishing they’d done so sooner.