Is Cisco Missing Two Big Opportunities It Already Knows About?

Cisco’s numbers for the quarter just ended were decent, but their guidance for the current quarter was a disappointment to many.  Yeah, Cisco did the usual dance about macro-economics and currency fluctuations, but you can see the Street is concerned that either technology shifts or operator pressure on costs (or both) is impacting Cisco.  The question, if these impacts are real, is what Cisco could do about it.  If you look at Cisco’s earnings call, you see what might be a Cisco strategy emerging, built around UCS.  Is it the right approach?

For a decade, my surveys have shown that the primary driver of network change is data center change, and the vendor who controls data center evolution tends to control network evolution.  IBM was the unchallenged leader among enterprises in everything data center for a long time, but in 2013 they slipped to parity with Cisco and they’ve declined a bit since then.  Part of the reason, I think, is that Cisco realized that data center evolution reflected IT evolution in general, and you need to get control of that process to be a big winner or even (gasp!) the next IBM.

The data center is shifting from being a small-ish number of multi-tasking large systems to a vast complex of blade servers, racks, and switches supporting virtualization or cloud computing.  When vendors like Cisco or Arista talk about cloud strategies and offerings, it’s this data center evolution they’re targeting.  The “cloud” has become a general term to describe IT based on a broadly deployed and loosely coupled pool of servers, connected by LAN/WAN switching and routing.  That networking is extended to users primarily through carrier virtual services, so most enterprise network equipment spending is focused in the data center.

So Cisco’s strategy is simply to win in the data center by offering a complete portfolio of products that address the migration from “mainframes” to racks of blade servers.  In doing so they have a great shot at controlling the data center network and through it, network evolution in the enterprise.  Nothing in the way of technology shifts is needed; it’s Sales 101.

It’s Sales Planning 101 to say that if you’re riding a wave to success you don’t try to jump off.  Cisco would gain little or nothing by pushing through big technology shifts in the data center, shifts like white box switching and SDN.  Their problem is that a news story that says “nothing is happening” isn’t clicked on much, so the media looks for big changes they can claim are imminent.  SDN news produces SDN interest in the enterprise, and that could threaten Cisco’s orderly harnessing of a single sound business trend.

Cisco’s strategy for that is to lance the boil.  You take the top end of SDN, the part that standards people always get to last anyway given their propensity to do bottom-up development, and you tie up the eventual benefits and business case in APIs and tools that deliver “software definition” to the stuff buyers already have and Cisco already makes.  Application-Centric Infrastructure (ACI) is a kind of sissified SDN, something that promises the end-game without the changes in technology.  It does some useful stuff, as all top-down approaches tend to do, but it’s a defense mechanism.

Nothing wrong with that, as long as you stay defensive where you need to be, and that’s where I think Cisco’s call and their ongoing strategy have some holes—two, to be specific.  One is in the area of NFV and the other in SaaS.

It’s really difficult to assess how much risk NFV presents to Cisco’s device business.  In theory, you can host anything in the cloud or on a multipurpose CPE box.  That’s as true today as it would be in an age of NFV, because most enterprise services based on virtual network functions have multi-year contracts.  It’s nice to talk about dynamic provisioning, but how many companies want their VPNs or firewalls turned on and off on a regular basis?  If hosted versions of network features haven’t hurt Cisco so far, it may be that they’re a limited threat.

In any event, what Cisco should be able to do is to capture the benefit case for NFV without actually doing it, just as they’ve done with SDN.  Nearly all the practical benefits of NFV will come not from displacing existing devices but by automating operations and management.  Well, Cisco had plenty of opportunity (and cash) to develop a service management and operations platform that could have delivered nearly all the service agility and operations efficiency of NFV without displacing a single appliance.  A creative program to facilitate field installation of firmware features could do most of the rest.

This approach could be combined with Cisco’s cloud thrust, and the combination could create an upside for UCS, draw the fangs of device-replacement advocates, and perhaps even generate some running room for Cisco’s device business by giving carriers lower TCO without lowering capex.  How did they miss this?

Then there’s SaaS.  On their call, Cisco says that their WebEx is one of the most popular SaaS applications, and in fact it’s one of the most pervasive.  Cisco’s had WebEx for a long time (since 2007) and what started as a collaborative resource for online presentations is…well…still pretty much a collaborative resource for online presentations.  With Cisco pushing technology frontiers (so said the call) with in-house startups, why have they failed to do anything with a premier SaaS opportunity?

And guess what the best example of a specific SaaS/WebEx miss is?  IoT.  Cisco has never been able to look at IoT as anything other than a source of traffic.  I guess if you’re a hammer, everything looks a bit like a nail to you.  Look deep into WebEx, though, and you see an important truth, which is that collaboration happens around something.  WebEx succeeded because it let you collaborate around slides.

IoT could provide a rich new source of stuff to collaborate around.  Think health care, think utilities and transportation, think just about every vertical market that has “things” in it.  Add new logic to WebEx to centralize communication around an arbitrary view of a “thing” or a collection of them, and you have a major new business opportunity.

There are no real technical barriers to Cisco taking advantage of these two opportunities, and I don’t think there’s any real risk to their core business either.  Cisco could be a big, even a giant, player in both spaces.  To me, this looks like an example of corporate tunnel vision, back to my nail-and-hammer analogy.  If they’d think outside the bit (a term that I hold a trademark on, by the way), they’d see that anything that generates real value in a distributed way generates traffic.  In contrast, hype to the media generates only ink and clicks.

I don’t know when the operators will start to act decisively on their revenue/cost-per-bit crossover, or whether some have already done that (some tell me they have).  That means I don’t know when Cisco’s “macro-economic” conditions will have to be updated to include “the buyer put the wallet away” as a condition.  Perhaps neither SDN nor NFV will really matter.  Perhaps regulators will mandate operator spending and tax the population to pay for the excess.  Or maybe Cisco can ask each employee to leave a tooth under their pillow.  Employees need to think about whether they could convince management to look at these two areas.  Or check their dental coverage.

Looking Deeper into “Practical IoT”

IoT could well go down in tech history as the most transformational concept of all time.  It will certainly go down as the most hyped concept.  The question for IoT, in fact, is whether its potential will be destroyed by the overwhelming flood of misinformation and illogic that it’s generated.  SDN and NFV have been hurt by the hype, which has almost eliminated any publicity for useful stuff in favor of just crazed vendor (and media) opportunism.  IoT could be an easier target.

The general view of IoT is that it’s an evolution of the Internet where devices (sensors, controllers, or both) talk to each other to create value for people overall.  The mere thought of billions of “things” appearing on the Internet and generating traffic causes Cisco to nearly faint in joy.  The thought of billions of things that have to be connected wirelessly has made both LTE vendors and mobile network operators salivate to near-flood proportions.  Every possible “thing” interaction (except those involving toilets) has been explored ad nauseam in the media.

At the same time, privacy and security advocates have pointed out that this kind of direct, open, “thing-exchange” creates enormous risks.  Who even knows if a given “thing” is what it purports to be?  Who knows the goal of the interactions—traffic routing, stalking, or terrorism?  Proponents of traditional IoT don’t have a problem with these objections—you just spend more to fix the problems.  Realists know that even without extra security problems to counter, having everything directly on the Internet would be so costly that it would make fork-lift transformation to SDN or NFV look like tossing a coin to a subway musician.

In past blogs, I’ve said that the “right” way to think of IoT is not as a network of anything, but rather as an enormous repository, linked with analytics and event-driven applications.  That’s because what all this “thing-ness” is about is making pretty much all of our environment a resource to be exploited by technology that’s aiding us to accomplish our goals.  If we were to look at the question of finding an optimal route, we’d naturally gravitate to the notion of having a route database with updates on conditions posted along each path.  It’s obvious that home control, industrial control, utility management, transportation optimization—everything that’s supposed to be an IoT application is in fact an application in the traditional sense.  It’s not a network at all, not an Internet of anything.  So why not think of it in application terms?

Workers and consumers do their thing in the real world, which establishes a physical/geographic context and also a social context that represents their interactions with others.  If our gadgets are going to do our will, they clearly have to understand a bit about what we’re trying to do.  They need that same pair of contexts to be fully effective.  Further, each of us, whether in a worker or leisure role, establishes a specific personal context by drawing from overall conditions.  So making our “things” bend to our will means getting them to share our context, which means first and foremost making it sharable.

What IoT needs to do is assimilate context, which is very different from just connecting things.  Connect any arbitrary pair of things and you have next to nothing in terms of utility.  Assimilate what things can tell us, and you’re on your way to contextual understanding.

The right model for IoT, then, should have three layers.  In the middle, at the heart, are a repository, an analytics engine, and an event generator.  At the bottom is a distributed process that admits validated systems as either information sensors or controllers and builds a repository of their state and capabilities.  At the top is a set of applications that draw on the data, events, and analysis of the middle layer.

An important part of that middle layer is a policy border.  Nothing gets in except from an authenticated source.  Nothing gets out except in conformance to policies set by the information owner, the provider of IoT service, and regulators at all levels.  So no, you can’t track your ex and then hack the traffic lights along the route to make sure nothing moves.  You can’t “see” control of lights at all, in fact, because of the policies.  The notion of a repository with a policy border is critical because it makes security and privacy achievable without making every IoT device into a security supercomputer.
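To make the policy-border idea concrete, here’s a minimal sketch in Python.  Every name in it is my own invention for illustration, not anyone’s product or API: a repository that accepts data only from authenticated sources and answers queries only when policy allows.

```python
# Illustrative sketch of a policy-bordered IoT repository; all names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Policy:
    owner: str
    allowed_roles: set          # roles permitted to query this class of data

@dataclass
class Repository:
    trusted_sources: set = field(default_factory=set)
    policies: dict = field(default_factory=dict)   # data_class -> Policy
    data: dict = field(default_factory=dict)       # data_class -> list of readings

    def ingest(self, source_id: str, data_class: str, reading: dict) -> bool:
        # Nothing gets in except from an authenticated source.
        if source_id not in self.trusted_sources:
            return False
        self.data.setdefault(data_class, []).append(reading)
        return True

    def query(self, requester_role: str, data_class: str) -> list:
        # Nothing gets out except in conformance to the owner's policies.
        policy = self.policies.get(data_class)
        if policy is None or requester_role not in policy.allowed_roles:
            return []
        return list(self.data.get(data_class, []))

repo = Repository(
    trusted_sources={"sensor-42"},
    policies={"road-speed": Policy(owner="city", allowed_roles={"traffic-analytics"})},
)
repo.ingest("sensor-42", "road-speed", {"segment": "I-95N", "mph": 34})
print(repo.query("traffic-analytics", "road-speed"))      # data comes back
print(repo.query("anonymous", "traffic-light-control"))   # nothing does
```

An application authorized for traffic analytics gets road-sensor data back; a request to “see” light control from an unauthorized role simply gets nothing, which is the point.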

Contributing to realistic IoT is simpler too.  Anything that has information and is trusted can contribute.  Since it’s routine to create “logical repositories” that blend public and private data, you could retain your own IoT data in-house and integrate query access between it and the public repository.  An example is easy to visualize.  Today you might turn your home lights on or off at specific times of day.  Or you might use the level of ambient light.  With IoT you might say “Turn on my lights when a majority of my neighbors have theirs on” and turn them off based on a similar majority vote.  Useful?  Yes, if you don’t want your home to stand out.
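The neighbor-lights rule is trivial to express once the repository has done the assimilation.  Here’s an illustrative fragment; it assumes a query has already returned each neighbor’s current light state, which is an assumption on my part, not a defined API.

```python
def my_lights_should_be_on(neighbor_lights_on: list[bool]) -> bool:
    """True when a strict majority of neighbors have their lights on."""
    if not neighbor_lights_on:
        return False
    return sum(neighbor_lights_on) > len(neighbor_lights_on) / 2

print(my_lights_should_be_on([True, True, True, False, False]))   # 3 of 5 lit -> True
print(my_lights_should_be_on([True, True, False, False, False]))  # 2 of 5 lit -> False
```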

An analytic/event view of the world is useful in both social and work situations.  For example, a shopper might want an alert if they passed within 50 yards of a sale of a specific item.  A worker might want to know if they’re within the same distance of a junction box or have just passed a freight car with a specific carton inside.  You could argue that a conventional model of IoT could provide this, but how would anyone know what sensor to use or how to interpret the result geographically?  Does the store or boxcar sensor have to know where it is?  We’re back into supercomputer territory here.  But with the analytic/repository model, all this stuff is straightforward.

My proposed IoT model doesn’t mean that there are no new sensors, but it would suggest that current low-cost-and-power techniques for putting sensors online would be retained and expanded to build critical mass, control costs, and avoid sensor hacking.  There would still be new revenue for operators if they could establish a value to using cellular technology directly, which could be the case with higher-value sensors or controllers.  There would still be new traffic too, though most of it would likely come from applications of IoT and not from connecting “things”.

There’s a clear role for both SDN and NFV in this model too.  You could picture my central core element of IoT as a grand mesh of functions that have to be deployed and connected.  We would create a pervasive cloud of repositories and applications for analysis, digestion, and classification.  We’d then add in dynamic request-driven applications.

To me, it’s a mystery why something this obvious gets ignored, and the only possible answer is that what vendors and operators want is a simple short-term revenue boost.  Since the NASDAQ crash of 2000, financial regulations have increasingly focused companies no further forward than the next quarter.  We’re not going to get IoT in any form with that sort of thinking, nor will we get SDN or NFV deployed to any significant level.  It’s also true that simple stories are easy to write, and you can fit them into the 350 words or so that publications are willing to commit.

Simple’s not going to cut it here, and so with IoT as with SDN and NFV we may have to depend on somebody big and smart stepping up.  That may be happening, and I’ll talk about it in a future blog.

How to Keep SDN/NFV From Going the Way of ATM

Responding to a LinkedIn comment on one of my recent blogs, I noted that SDN and NFV had to focus now on not falling prey to the ATM problems of the past.  It’s worth starting this week by looking objectively at what happened with ATM and how SDN and NFV could avoid that (terrible) fate.  We should all remember that ATM had tremendous support, a forum dedicated to advancing it, and some compelling benefits…all like SDN and NFV.  Those who don’t learn from the past are doomed to repeat it, so let’s try to do some learning.

ATM, or “asynchronous transfer mode,” was a technology designed to allow packet and TDM services to ride on the same paths.  To avoid the inevitable problem of having voice packets delayed by large data packets, ATM proposed to break traffic into “cells” of 53 bytes, and to prioritize cells by class of service to sustain fairly deterministic performance across a range of traffic types.  If you want the details on ATM technology you can find them online.

If you look at the last paragraph carefully you’ll see that ATM’s mission was one of evolution and coexistence.  The presumption of ATM was that there would be a successful consumer data service model and that model would generate considerable traffic that would be ill-suited for circuit-switched fixed-bandwidth services.  So you evolve your infrastructure to a form that’s compatible with the new data traffic and the existing traffic.  I bought into this view myself.  It’s at least a plausible theory, but it fell down on some critical market truths.

Truth number one was that while ATM was evolving, so were optical transport and Ethernet transport, and in any event even high-speed TDM trunks (T3/E3) could be used to carry packet services.  Further, these physical-layer parallel paths offered a cheaper way of getting to data services because they didn’t impact the cost of the rest of the network or commit the operator to a long period of evolution.

The second truth was that the whole issue of cells was doomed in the long term.  At low speeds, the delay associated with packet transport of voice mingled with data could be a factor, but the faster the pipe the less delay long packets introduced.  We have VoIP today without cells; QED.
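A little back-of-the-envelope arithmetic shows why; serialization delay is just packet size divided by link speed, and the numbers below are purely illustrative.

```python
# Delay a 1,500-byte data packet imposes on whatever is queued behind it.
PACKET_BITS = 1500 * 8   # 12,000 bits

for label, rate_bps in [("1.5 Mb/s (T1-era trunk)", 1.5e6), ("1 Gb/s", 1.0e9)]:
    delay_ms = PACKET_BITS / rate_bps * 1000
    print(f"{label}: {delay_ms:.3f} ms")
# Roughly 8 ms at 1.5 Mb/s, which matters for voice; 0.012 ms at 1 Gb/s, which doesn't.
```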

The third truth was that vendors quickly saw all the media hype that developed around ATM and wanted to accelerate new opportunities of their own.  They pushed on stuff that might have supported their own goals but they never addressed the big question, which was how (and whether) you could justify a transition to unified ATM versus partitioned IP/TDM.  They never made the business case.

It’s also worth noting that there was a time dimension to this.  In the early 1990s the Web came along, and with that we had the first realistic model for consumer data services.  Users were initially limited to dial-up modem speeds, so the fact is that consumer bandwidth for data services was throttled at the edge.  The market was there almost immediately, and realizing it with the overlay model was facilitated by the limited dial-up bandwidth.  But it was clear that consumer broadband would change everything, and it came along in at least an early form within about five years.  At that point, the window for ATM closed.

Few in the ‘80s doubted that something like ATM was a better universal network architecture than TDM was, presuming you had a green field and a choice between them.  But that wasn’t the issue because we had TDM already.  IP was, at the time, just as flawed (if in different ways) as ATM as a universal strategy.  What resulted in “IP convergence” and not “ATM deployment” was that IP had control of the application, the service.  The argument that one network for all would have been cheaper and better is probably still being debated, but the fact was (and is) that the differences didn’t matter enough to justify fork-lifting stuff.

I hope that the parallels with SDN and NFV are clear here.  If we were building a global network today where none had existed, we’d probably base it largely on SDN/NFV principles, but we did have IP convergence and so we have a trillion-dollar sunk capital cost and immeasurable human skills and practices to contend with.

My contention from the very first has been that capex would not be enough to justify either SDN or NFV, and operators I talked with as far back as 2013 agreed.  You need new service revenues or dramatic reductions in opex, or you can’t generate a benefit case large enough to reach critical mass in SDN/NFV deployment.  Without that mass we’re back to my operator contact’s “rose-in-a-field-of-poppies” analogy; you just won’t make enough difference to justify the risk.

There were, and still are, plenty of justifications out there, but there seem to be only two real paths that emerge.  One is to find a “Trojan App”, a service whose revenue stream and potential for transformation of user/worker behavior is so profound that it builds out a critical mass of SDN/NFV on its own.  The other is to harness the “Magic Benefit”, a horizontal change that displaces so much cost that it can fund a large deployment, and then sustain it.

The Magic Benefit of operations and management automation—or “service automation”—could deliver operator savings equivalent to reducing capex by over 40% across the board.  I believe that if, in 2013, the NFV ISG and the ONF had jumped on this specific point and worked hard to realize the potential, we could already be talking about large-scale success for both SDN and NFV and certainly nobody would doubt the business case.  Neither body did that.
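To put a rough scale on that claim, here’s the arithmetic using the cents-per-revenue-dollar figures I cite elsewhere in these blogs (capex at roughly 20 cents).  It’s an illustration of magnitude, not a model.

```python
capex_per_revenue_dollar = 0.20   # roughly 20 cents of capex per revenue dollar
equivalent_capex_cut = 0.40       # "equivalent to reducing capex by over 40%"

required_savings = capex_per_revenue_dollar * equivalent_capex_cut
print(f"Service automation would have to recover about {required_savings:.2f} "
      "per revenue dollar in opex and agility gains")   # about 8 cents
```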

We do have six vendors (Alcatel-Lucent, Ciena, HPE, Huawei, Oracle, and Overture) who could deliver the Magic Benefit.  I also believe that if in 2014 any of these vendors had positioned an NFV solution credibly based on service automation at the full scope of their current solution, they’d be winning deals by the dozens today and we’d again not be worried about business cases.  Never happened.

If we apply ATM’s lessons, then both SDN and NFV need a tighter tie to services; cost alone isn’t going to cut it.  I’m personally a fan of the Trojan App, but the problem is that there are only two that are credible.  One is the mobile/content delivery infrastructure I just blogged on and the other is the Internet of Things.  For the former, we have only a few SDN/NFV vendors who could easily drive the business case—Alcatel-Lucent and Huawei of my six total-solution players have credible mobile/content businesses.  IoT doesn’t even have a credible business/service model.  It’s hyped more than SDN and NFV, and to just as evil an effect.

There is no question that mobile and content infrastructure could be a tremendous help to SDN/NFV deployment because both are well-funded and make up a massive chunk of current capital spending.  If you get critical mass for SDN/NFV with mobile/content deployment, you get critical mass for everything and anything else.  No other success would be needed to lay the groundwork.  But there’s still the nagging question of whether SDN/NFV benefits services in any specific way.  At the end of the day, we’re still pushing the same protocols and bits.

All of the six NFV prime vendors could also tell a strong mobile/content story.  Metaswitch is one of the most experienced of all vendors in the NFV space, and their Project Clearwater IMS would be a strong contender for many mobile operators and a super strategy for a future where MVNOs did more of the service-layer control than is common today.  Any vendor could assemble open-source elements to create an IoT model, though it would be far easier if some big player with some market might got behind it.

IoT is the opposite, meaning that instead of having a lot of paths that risk being service-less, we have no credible paths because service-oriented IoT hasn’t been hot.  Everyone is focusing on the aspect of IoT that’s the most expensive and raises the largest security and public policy concerns: attaching new sensors.  We have billions of sensors already, and we have technologies to connect them without all the risk of an open network model.  What we need is an application architecture.

Interestingly, I heard HPE’s CTO give a very insightful talk on IoT that at least seemed to hint at a credible approach and one that could easily integrate both SDN and NFV effectively.  For some reason this hasn’t gotten much play from HPE in a broader forum; most operators tell me they don’t know about it.  Other NFV prime vendors could also play in an effective IoT model, though it would be easier for players like HPE or Oracle to do that because they have all the specific tech assets needed to quickly frame a solution.

The lesson of ATM is at the least that massive change demands massive benefits, which demand massive solutions.  It may even demand a new service model, because cost-driven evolution of mass infrastructure is always complicated by the fact that the cheapest thing to do is use what you already have.  I think that in the coming year we’re going to see more operators and vendors recognizing that, and more wishing they’d done so sooner.

What Does the SDN/NFV Success Track Through Mobile and Content Look Like?

I was talking yesterday with an old friend from the network operator space, a long-standing member of the NFV elite, and one of our topics was just what could pull through SDN and NFV.  Two specific notions came up: one was the Internet-of-Things opportunity I’ve mentioned a number of times in my blogs (yesterday, for example), and the other was content delivery.  I’ve already promised to look more deeply into the former, but content, and in particular mobile content, is also very credible.  Let’s take a look there first.

To set the stage, mobile services are the bright spot of the network operator space, if there is such a thing.  The margins are higher, there’s still at least in some areas a hope to increase ARPU, and regulations in many areas are a bit lighter.  For a decade now, mobile capex has grown much faster than wireline capex.

Video isn’t the only driver of mobile, but it sure helps.  A bunch of research on video viewing from varied sources agrees on a key point, which is that channelized TV viewing isn’t falling.  Online video consumption largely supplements it, and the reason is that more and more online viewing takes place where channelized TV isn’t available—in the hand of the mobile user.

Mobile streaming is by far the fastest-growing user of bandwidth, and its importance to mobile users was demonstrated by T-Mobile’s decision to offer free streaming video as a competitive differentiator.  As I suggested in yesterday’s blog, this is a reflection of the fact that a small increase in capex to support additional capacity would be easily justified if customer acquisition and retention costs (the largest opex component for mobile operators) could be reduced significantly.

One corollary to this point is that it then behooves the operators to ensure that the capex increase associated with unfettered mobile streaming is small.  How that might be done is a jumping-off point illustrating the complexity of the relationship between new technologies like SDN and NFV and real-world business issues for network operators.

Mobile networks’ video-carrying capacity is impacted by a number of things.  The first is the RF signal, which has a native capacity shared by users within the cell.  You can increase this by making the radio access network (RAN) faster (4G is pretty good at supporting large numbers of video users and 5G would be better), by making cells smaller so fewer users share the capacity (which means making them more numerous to cover the geography), or by using WiFi offload where possible to create what’s essentially a new parallel RAN.
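A toy calculation shows how those levers interact; every number below is an assumption for illustration, not a measurement.  What each user actually sees is shared cell capacity divided by the active users contending for it.

```python
cell_capacity_mbps = 150.0   # assumed shared downlink capacity of one 4G cell
video_users = 50             # assumed active video users in the cell

print(cell_capacity_mbps / video_users)            # ~3 Mb/s each: marginal for HD video

# Splitting the cell (smaller, more numerous cells) halves the contention...
print(cell_capacity_mbps / (video_users / 2))      # ~6 Mb/s each

# ...and WiFi offload acts like a parallel RAN for the users it captures.
offloaded = 20
print(cell_capacity_mbps / (video_users - offloaded))   # ~5 Mb/s each
```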

Back from the RAN is the backhaul.  You can’t offer wireless video services without something to connect the cell sites (or WiFi sites) to video sources.  In the modern world, this means running fiber.  Given that per-fiber capacity is quite high, things like 5G that increase per-cell capacity make sense versus running a bunch of new glass to support more cells.

The combination of RAN and backhaul, and the high cost of customer acquisition and retention in the mobile space, is making the notion of the mobile virtual network operator (MVNO) more interesting.  Giants like Amazon, Apple, and Google have either demonstrated MVNO interest or are rumored to be looking at it.  Cable companies have admitted they plan to become MVNOs at least on a trial basis.

When Google talked about being an MVNO, I pointed out that there was no future in having a mobile industry with three or so players and a dozen resellers.  All that happens in undifferentiated resale is that prices fall into the toilet.  Low-margin mobile does none of our aspiring MVNO players any good, nor does it exploit their strengths.  So we have to look at differentiated MVNO potential, and that’s where SDN and NFV come in.

Why have “virtual” in a name if you don’t virtualize anything?  It seems pretty obvious that if a “real” infrastructure-based mobile operator fully virtualized their infrastructure they could create a kind of mix-and-match inventory of capabilities that MVNOs could exercise at will, mixing in their own unique differentiators.  Comcast, for example, is at least considering being an MVNO and they have a strong content delivery capability already.  Why not combine it with RAN from somebody else?

While some of this virtualizing would impact the RAN and backhaul, most of it would probably fall in the metro and CDN zone.  The signaling and service intelligence of a mobile network resides there, inside IMS, EPC, and of course the CDN technology.  Virtualization at the SDN level could let operators partition real mobile infrastructure better for virtualized partners, but it would also let operators reconfigure their mobile and content delivery architecture to match either short- or long-term shifts in user behavior and traffic patterns.

On the NFV side, mobile and CDN signaling/service elements could be deployed, but the value of NFV to these long-lived multi-tenant components of infrastructure depends on how much of NFV’s benefits are drawn from agility/operations efficiency.  If all you do with NFV is deploy stuff, then something that deploys only once and then gets minimally sustained isn’t a poster-child app.  But if we start to imagine feature differentiation of mobile services and the integration of a true IoT model (not the vapid “let’s-move-sensors-to-LTE” junk), we can see how the same operator who offered virtual IMS/EPC/CDN might offer hosting to VNFs that MVNOs supplied for service differentiation.

CDN elements and IMS customer control and signaling are hosted, whether on specialized appliances or servers.  The hosting could evolve to a more dynamic model, as I’m suggesting above, and with that dynamism it could promote distribution of data centers more richly in at least major metro areas.  That would then establish hosting at a reasonable scale and reduce the barrier to deploying other incremental NFV applications/services.  Virtual CPE in any form other than edge-hosted probably depends on something like this pre-deployment of at-scale resource pools, and so do many other applications.

Many people think that mobile services and content delivery offer SDN and NFV opportunities, but there’s been precious little said about the specific opportunities that would arise or the specific way that SDN or NFV could address them.  Absent that sort of detail, we end up with people saluting the mobile/content/SDN/NFV flag without any actual collateral to play in the game, much less to drive it.

This is one of the true battlegrounds for SDN/NFV, with battle lines that aren’t shaped by either technology but by the high-level reality of selling real services to real users.  The union of Alcatel-Lucent and Nokia could create a true powerhouse in this area, a player who would then fight with Ericsson and Huawei for supremacy in the mobile/content space.  That fight is one business/market force that could then create a rich opportunity for both SDN and NFV—and of course for the three vendors who are duking it out.


Can We Find, and Harness, the Real Drivers of Network Change?

If you go to the website of a big vendor who sells a lot to the network operators, or read their press releases, you see something interesting.  The issues that these vendors promote seem very pedestrian.  We hear about things like “customer experience”, “unified services”, “personalizing usage”, “traffic growth”, “outages”, and even “handset strategies”.  Where’s the revolutionary stuff like the cloud, SDN, and NFV?  Or, at least, why isn’t that stuff getting highlighted?

The popular response to this is that it’s because of that bad old carrier culture thing.  These guys are dinosaurs, trapped in primordial sediments that are slowly fossilizing around them while the comet zooms in to generate mass extinction.  Others are playing the role of the mammals—small, fast, scurrying over the traps and destined to survive and rule.  You probably realize by now that it’s not that simple, but maybe not why that’s the case.

The vendor website that’s filled with these pedestrian terms isn’t trying to sell to the dinosaur population; it’s trying to sell to the buyers with money.  Equipment is generally purchased by operations departments, and these people don’t have a mission of innovation.  That’s the “science and technology” or CTO people (who, by the way, don’t have much money at all).  The operations people think in terms of service benefits relevant to current sales situations, and that’s why all those pedestrian topics come up.

Imagine yourself as a carrier sales type.  Your customer bursts through your door (virtually or in person) and shouts “I demand you fulfill me using SDN or NFV!”  You’d probably call the cops.  On the other hand, a customer demanding you accommodate traffic growth, unify their services, or control outages is pretty much the norm.

At a high level, this explains the business case issue.  We can’t sell technology changes to buyers of service, we have to sell the impact of technology change on those buyers’ own businesses or lives.  That’s what a business case must do.  But we’ve talked about business cases already, and I want to open another dimension to this.  What are the “priority attributes” that any element of network infrastructure will have to deliver on?

The CFO of every operator is the star of the company’s quarterly earnings call.  All the people on the call—meaning the CFO and the financial analysts—see networking as essentially a zero-sum game.  Revenue gains by me are revenue losses by someone else, which means that “new revenue” is more likely to be someone else’s old revenue than something that’s never been spent before.  Cost reductions have to target large costs with low-risk approaches.

Zero-sum revenue games mean you have to differentiate on something that 1) the customer values and 2) the salesperson can convey quickly and convincingly.  Simple technology changes fail on both counts, which is why that initial list of what might look like ancient clichés is so ubiquitous on vendor sites.  It might not be as obvious, but truly new services fail the second test.  How much time would it take for a salesperson to convince a buyer to adopt a different service model?  A long time, and it might never happen, and sales success is the real prerequisite to any revenue gains.

Interestingly, cost reduction discussions often end up sounding like new-revenue discussions.  The reason is that the largest operations/administration cost element is customer acquisition and retention, running 12 cents per revenue dollar.  When you consider that capex is only 20 cents you can see the point here.  This little fact is why wireless companies like T-Mobile can offer unlimited video streaming, eating the data costs.  Sure it costs them some access capacity (which they can reduce through efficient use of CDNs) but if a little additional capex can make a big difference in the acquisition/retention cost, it’s worth it.
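Here’s that trade-off as a sketch.  Only the 12-cent and 20-cent figures come from my data; the capex increase and the retention improvement are assumptions made up for illustration.

```python
acquisition_retention = 0.12   # largest opex component, per revenue dollar
capex = 0.20                   # capex per revenue dollar

capex_increase = 0.01          # assume free streaming adds a penny of capacity cost
retention_improvement = 0.15   # assume churn-driven costs drop 15%

net_gain = acquisition_retention * retention_improvement - capex_increase
print(f"Net gain per revenue dollar: {net_gain:+.3f}")          # +0.008, so it's worth doing
print(f"For context, that's {net_gain / capex:.0%} of capex")   # about 4% of capex
```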

Let’s take this simple truth and run with it.  If the largest benefit source for a new technology is its ability to reduce acquisition/retention charges, then what matters about the technology is how well it does that.  It’s not easy to make a connection between virtual pipes or virtual firewalls and better customer traction or lower churn.  You can assert that there is, or could be, one but most vendors would admit they have no idea how to prove it.  Worse, they could never construct a technology trial to validate their assertions.

This is why a bottom-up approach to both SDN and NFV was such a problem.  In a real, logical, technology project you start with the benefits you’ll need to harness to get what you want, and you define specific requirements and features that will deliver them.  You build downward then to implement or standardize these features.

What about IP convergence, you might ask?  Well, the fact is that the IP revolution came about because of a fundamental change in demand.  We had two forces in play, in fact.  Enterprise networking was built around host-centric architectures like IBM’s Systems Network Architecture (SNA).  We had no consumer data service potential.  Routers offered enterprises a cheaper way to push data traffic, and the Web offered a consumer data service model.  And so off we ran.

This is why the focus on the service status quo is a problem for SDN and NFV.  If we reproduce what we already have as our only revolutionary mission for new technology, we cut ourselves off from the only kind of benefits that has ever created a network revolution.  We are forced to rely purely on cost savings, and as I’ve pointed out in prior blogs it’s difficult to muster a lot of cost savings when you deploy a technology incrementally.

How much do you save by transitioning one percent of your network spending to SDN or NFV?  Clearly less than 1% in capex, since you’ll still spend something.  On the opex side you may save nothing at all because your SDN and NFV pearls are trapped in a lot of legacy seaweed (swine?) that still requires exactly the same practices for operations and management.  And without new services you’re back to the problem of proving that customer acquisition and retention savings are possible.

I’ve noted in past blogs that the Internet of Things was something that could drive technology changes.  That’s because it’s a service-level change, something that like IP could be transformative because it transforms what the top-of-the-food-chain buyers spend their money on.  However, just as our conception of SDN and NFV has been shortsighted to the point of being stupid, so is our IoT conception.  Cisco thinks it’s all about more traffic (therefore, it’s about more router spending).  Verizon and other operators think it’s all about LTE-based sensors (therefore, about more mobile service spending).  It’s not about either one.

I’m going to be talking more about IoT in future blogs, but talking about what it really can be and not the stupid stuff.  Along the way, I’ll show how it can transform the network, the cloud, and how it could then pull through massive SDN and NFV investments.

We did dumb things with our current revolutions, and in doing so have almost killed their chances of being revolutionary.  I’d sure like us not to muck up IoT.

Taking a TMF/OSS View of NFV’s Business Case

I’ve pointed out in a number of my past blogs that of all the things needed from an SDN or NFV implementation to make the business case, none tops an effective service management automation approach.  I’ve also noted that the NFV ISG initially put end-to-end management out of scope, and that they also ignored the issues of federation of services across management domains.  The ISG seems to be reversing itself on these issues, but the architecture was laid out without them and retrofitting to put them in could take quite a while.  Other bodies might have to take up the slack.

The most logical would seem to be the TMF, which launched an activity called ZOOM (Zero-touch Orchestration, Operations & Management) to deal in part with SDN/NFV impact and in part with the broader issue of “modernizing” OSS/BSS.  That duality of mission, as you’ll see, carries over into even some vendor Catalyst presentations made in Dallas early in November.

HP’s Catalyst presentation has what should be the tag line for the whole topic:  “Combining NFV, SDN and OSS is not easy”, which it surely isn’t.  The presentation identifies three specific issues (paraphrasing):

  • Today’s OSSs are process silos that lack procedures to automate responses to service events, particularly fulfillment and assurance.
  • ETSI NFV specifications don’t consider ‘hybrid’ services that extend over both legacy and SDN/NFV infrastructure.
  • The general approach taken by the TMF and by OSS/BSS vendors is based on linear “waterfall” workflows that are more suitable for manual processes than for service automation.

HP’s Catalyst augments “standard” OSS/BSS service processes with NFV processes.  The effect of this appears to be the creation of a multi-level orchestration model that allows operators to orchestrate aspects of OSS/BSS while NFV MANO remains as the standard for NFV elements.  They don’t go into the details on how this is done, which is a pity in my view because HP has the most mature modeling approach for services and resources.  A key point that I think could have been made is that their service modeling would enable them to either model services as two interdependent classes—legacy and NFV—or as a single integrated class.

Huawei also presented a Catalyst, and there are some common threads between the two presentations.  One is that it’s critical to extend models for services across both legacy and SDN/NFV elements.  Another is that a closed-loop process for automating service lifecycles (both a normal and accelerated one) is critical.

The realization of these goals is described a bit more clearly in Huawei’s material.  They define a Management Control Continuum (MCC), which includes all of the components of OSS/BSS, NMS, and SDN/NFV management elements.  This (I think) is essentially the structure I suggested in my ExperiaSphere project, where all of the processes that support a service lifecycle are orchestrated explicitly, through a model.  Huawei appears to be calling all these little elements “microservices”.

It would appear that you could visualize a service as a model (my term: intent model) that is associated with a series of “function chains” that do specific things, and also (likely) policies that establish self-managed behavior of the stuff underneath the model.

If you link Huawei’s material with other presentations made by the TMF as a body, what you get is the impression that they see services as a series of intent models (which the TMF would say are nested customer- and resource-facing services) that can express SLA-and-lifecycle handling in terms of either policies or function chains.  Here’s the relevant quote from the Huawei presentation:  “Goal based policy is an important way of specifying the desired network states in systems that are largely autonomic, working in conjunction with standard ECA policy.”  Translating, this seems to me to say that service components are modeled as intent models and that policies define the way their SLA is met.  While HP doesn’t say as much about the detail, I think based on my analysis of their modeling that they could do this as well.
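To make that less abstract, here’s how such an intent model might be rendered in code.  This is purely my own illustrative structure, not a TMF, ZOOM, or vendor schema: a nested model carrying an SLA, policies that realize it, and function chains keyed to lifecycle events.

```python
from dataclasses import dataclass, field

@dataclass
class IntentModel:
    name: str
    sla: dict                                            # e.g. {"availability": 0.9995}
    policies: list = field(default_factory=list)         # goal/ECA policies realizing the SLA
    function_chains: dict = field(default_factory=dict)  # lifecycle event -> ordered processes
    children: list = field(default_factory=list)         # nested resource-facing models

vpn_site = IntentModel(
    name="vpn-site-access",
    sla={"bandwidth_mbps": 100, "availability": 0.999},
    policies=["maintain availability >= 0.999", "reroute on congestion"],
    function_chains={"activate": ["allocate-port", "configure-qos", "start-monitoring"]},
)

enterprise_vpn = IntentModel(
    name="enterprise-vpn",
    sla={"availability": 0.9995},
    children=[vpn_site],
)
```

Whether a child model is realized by NFV, SDN, or legacy gear is invisible above the model boundary, which is exactly the decoupling the Catalysts seem to be reaching for.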

So what does this mean?  First, the TMF does have a program (ZOOM) that addresses the key factors in making a service/network management and operations driver work for NFV.  Second, there are demonstrations (Catalysts) roughly equivalent to the NFV ISG’s PoCs that address many of the points needed.  Third, ZOOM isn’t fully baked, and so the Catalysts are exploring what it might/should look like rather than what it currently does.  Finally, there’s still the question of implementation/evolution.

To my eye, HP and Huawei are both supporting a model of future services that fits the “intent model” structure I’ve blogged about.  They’re augmenting the basic notion that an SLA is an attribute of an intent model with the notion that policies (that are perhaps logically part of that SLA) are at least a means of communicating intent downward to influence resource behaviors.  In all of this, at least according to Huawei’s presentation, “we simply are following the likely evolution of ZOOM”.

Which means the TMF hasn’t slid across the plate yet, but they may be rounding third (if you care for US baseball analogies).  There are three barriers, in my view, to the TMF reaching its goal.

The first barrier is constituency.  The TMF is primarily an organization of OSS/BSS types.  On the vendor side, it’s arguably even more ossified than the network vendors are.  On the buyer/operator side, there’s as much consensus for the view that OSS/BSS systems need to be tossed out and something new created as there is for the view that they need to be modernized.  That’s not exactly a cheering crowd.

The second barrier is communication.  As a body, the TMF is almost a subculture.  They have their own terms, acronyms, issues, sacred cows, heroes, villains, and documents.  Most of their good stuff is not available to the public.  Because they try to describe the new (like ZOOM) in terms of the old (all their Frameworx stuff) rather than in current industry terms, they can’t communicate easily outside their own subculture, and that means they don’t have good PR.  Even within operator organizations, TMF/CIO types often have problems talking with the rest of the company.

The final barrier is containment.  The desire to preserve the old, the primary framework of the TMF’s work, leads it to try to limit the impact of new stuff.  SDN and NFV can be viewed as an alternative way to implement device functionality.  That could be accommodated simply by adding SDN/NFV processes below current device-level OSS/BSS processes—the “virtual device” model I’ve mentioned before.  The problem with that is that it encourages vendors to separate SDN/NFV virtual-device realization (which is what the NFV ISG MANO function focuses on) from the orchestration of the service overall.

You can perhaps see this in HP’s presentation charts, and it resolves potential conflicts between what the NFV ISG or the ONF might do for “management” or “operations” and what the TMF might do.  It creates two layers of orchestration, and the separation leads to the conclusion that you need to modernize OSS/BSS systems along event-driven or policy lines, and also implement SDN and NFV deployment and management that way.  From many, two.  Or maybe three, because if there are two levels of orchestration how do these levels then combine?

Modernization of OSS/BSS was one of the goals of the NGOSS Contract and GB942 work I’ve cited many times, work that was the foundation for both my CloudNFV and ExperiaSphere projects.  I didn’t see any reference to it in the Catalyst material, and since NGOSS Contract work is explicitly about using data models to steer events to processes, it would seem it should have been seminal.  It may be that componentized, event-coupled, OSS/BSS isn’t in the interest of the OSS/BSS vendors.

I think that the TMF has all the pieces of the solution to SDN and NFV’s problems.  I think that the real goal of ZOOM was (based on the goals document) and remains a form of fundamental OSS/BSS modernization.  Will OSS/BSS vendors and the operator CIOs drive that kind of change?  Would it have been easier to orchestrate both OSS/BSS and SDN/NFV with a common element external to both?  These questions probably can’t be answered at this point, and we also don’t know how long this process will take, either in the TMF or outside it in the NFV ISG, the OPNFV group, or whatever.

I’m mostly heartened by the TMF Catalysts, because we’re at least getting some field experience at the layer of the problem where the SDN and NFV business case have to live.  The next big TMF event is in Europe in the spring, and there we may finally see something broad enough and deep enough to be convincing.


Cisco, Ericsson, and Verizon: What They Tell Us About the Future

The news that Cisco and Ericsson are forming a marketing partnership isn’t a big surprise, given the Nokia deal for Alcatel-Lucent earlier.  It’s still big news in the industry, though, and perhaps bigger if one casts it into position alongside another news item, which is that Verizon is looking to sell off its cloud business.  It’s a story of things staying the same, and things changing…back.

Any Wall Street type will tell you that as a given industry commoditizes, meaning that products there can no longer be differentiated except on price, the inevitable response of businesses is consolidation.  A large number of competitors can’t be justified in a commodity market because the total cost of operations and administration is too high.  Commoditization also results in a loss of differentiation that can reduce the revenue per sale, making it hard for smaller, less “complete” solutions to stay in the game.

The Nokia/Alcatel-Lucent deal is a classic response to commoditization—bulk up by merging and lower your cost points because the combined organization can eliminate duplication (meaning workers).  Along the way you can also broaden your product line.  Nokia has little other than mobile, whereas Alcatel-Lucent has perhaps too broad a line for their strategy to control.  The question with Cisco/Ericsson is whether they’re responding to commoditization or to Nokia/Alcatel-Lucent.

In terms of carrier strategy, Cisco has a problem in that it doesn’t want to support the kind of changes in network cost that operators expect.  Cisco has never had a sophisticated strategy for selling to operators.  They push their network traffic indexes and things like the so-called “Internet of Everything” and say in essence that more traffic is coming so suck it up and carry the stuff.  Cisco doesn’t have a strong mobile story, and that’s a particular problem in an era where operators are investing more in mobile than in wireline and core.

Ericsson is a service company trying to survive in a market that’s focused on product cost.  If operators are beating Huawei up on price to get better cost per bit, they’re unlikely to want to spend more on professional services.  So if you’re Ericsson and have divested yourself of most products, you start a bit behind the eight-ball with respect to competitors (like Nokia/Alcatel-Lucent) who have product profits they can draw on to raise ARPU.  And as competitors get more into professional services themselves, you’re in the position of selling products from vendors who now compete with you in the service business.

By reselling Cisco gear, Ericsson gets at least a piece of the action on the product side, which increases their ARPU.  Cisco gets a conduit into complicated telco deals, particularly those requiring integration with OSS/BSS where Ericsson is especially strong.  The problem is that to be optimally successful for both vendors, the industry has to pretty much stay the course with respect to the makeup of network infrastructure.

So are Cisco and Ericsson betting against SDN and NFV or hedging?  I think it’s clear that Cisco is betting against it because it’s hard to see any SDN/NFV success that wouldn’t put Cisco under margin pressure.  I think Ericsson is hedging because there are a lot of recent troubling signs that SDN/NFV isn’t exploding into deployments as many had hoped.  What happens to professional services if there’s no change in technology at all?

This is where Verizon and the cloud come into the picture.  Like most operators, Verizon jumped on the cloud as a possible new revenue stream, and through M&A gained a quick position there.  The cloud jumped ahead of everything else in terms of operator senior management hopes for a real change in their business.  Now, we might be seeing a strong signal that the network operator push for the cloud has failed to produce results.  Forgetting for the moment what that might mean about cloud computing, for SDN and NFV it would mean that cloud services could not be counted upon to generate data center deployments that would then be suitable for SDN connection and NFV hosting.

Some NFV applications wouldn’t be impacted by this; mobile and CDN generate enough mass on their own to be credible drivers for data center deployment.  For service chaining used in vCPE, it could be significant, and it could also be significant for virtual routers on top of groomed tunnels and optics.  These applications need somewhat optimized placement of hosting points, and that would be easier if there were a successful cloud infrastructure to piggyback on.

If SDN and NFV are already slow-rolling and if cloud infrastructure isn’t going to be readily available as an early platform to host either, then the game changes in a way that could hurt both Ericsson and Cisco.  Operators’ first reaction would be to push harder on pricing of legacy gear, which means that everywhere outside the US (where Huawei has been forced out of the carrier market) Huawei is more likely to win.  That’s bad for both Cisco and Ericsson.

But lack of infrastructure for SDN/NFV piggybacking also helps the Nokia/Alcatel-Lucent duality.  Alcatel-Lucent has the strongest story for NFV-hosted mobile (IMS/EPC) and content delivery, and these applications are large enough to pull through their own hosting without the help of an in-place cloud.  That could mean that Nokia/Alcatel-Lucent would suddenly take a lead in NFV, one that Cisco and Ericsson would have considerable difficulty overcoming.

Far from being a positive revolution, Cisco/Ericsson is in the balance a hedge, and almost a signal of a frantic catch-up move.  A sudden success for SDN/NFV would make it moot.  A delayed success would empower the combined Nokia/Alcatel-Lucent that is probably an early driver.  In both cases Huawei becomes stronger in the near term.  Two or three years from now, the deal would either have to become real M&A (and who believes that could happen?) or it fails to deliver any utility to either party and they have to face the new world alone.

It’s obvious that the disrupting factor here is the pace of adoption of SDN and NFV, and if network vendors were the only source of the products for these areas, then Cisco and Ericsson would be on easy street.  They aren’t, and this may all end up hinging on HP and Oracle.  These two vendors are not part of the network infrastructure game of today, and they have every reason to want to push SDN, NFV, and the cloud to operators as a means of reducing network costs and opex overall.  But while HP has all the goodies needed and Oracle has all the positioning determination needed, neither of the two has managed to get both these assets into play.

The worst case for Cisco/Ericsson would be that Nokia/Alcatel-Lucent jumpstart both SDN and NFV through mobility (IMS/EPC) and content delivery (CDN).  That would create a network-vendor competitor with a foot on the stepping-stone to an alternative network future (and on Cisco/Ericsson’s neck).  But it’s hard to see what a good future outcome would be for Cisco and Ericsson, because none of the options will really save “business as usual” in networking.

What’s Behind NFV’s Blame Game and How Do You Fix It?

I got a laugh at a conference back in the past when I commented that to the media, any new technology had to be either the single-handed savior of western culture, or the last bastion of international communism.  Anything in between wasn’t going to generate clicks on the pieces, and it was too complicated to write in any case.  Clearly we’re at that point with NFV.  Two months ago it was a darling of the media.  Apparently the honeymoon is over, and the focus of the disillusionment is the difference between “workable” and “makes the business case.”  What’s interesting is that the points are valid even if the emphasis on them is a bit belated.

A Light Reading piece quotes the BT rep on the NFV ISG as suggesting “Shouldn’t we take an end-to-end process perspective and understand how they’ll work logically and then map that back to the architecture?” with respect to how NFV should have been architected by the ISG.  Well, yes, it should have and I know that I raised that point often, and so did others.

The original vision of NFV was that it would be valuable to substitute hosted software and commodity servers for high-priced purpose-built appliances.  That is absolutely true on the face.  Over time, the NFV community realized that this simple value proposition wouldn’t generate the benefit case needed to offset the technology investment and the risk.  As a leading operator said two years ago, referencing the 20%-or-so savings that could be obtained from capex reduction, “We can get more than that by beating Huawei up on price!”  Huawei by the way was in the room for this.

The operators who launched NFV updated their vision of benefits over time, to focus more on service agility (by which they told me they meant automated service lifecycle management) and operations efficiency.  By this time, though, the basic model of NFV as we see it today had already been established.  BT is right in suggesting that the detailed cart got before the end-to-end horse.  They’re also right in the implication that the business case would have been more easily made had the body started at the top, with functional requirements.

The current situation is reflected in yet another Light Reading piece that starts with some Vodafone complaints and includes a response from a vendor, HP.  The complaint was that vendors had shown no initiative in establishing open interfaces.  Is that complaint valid?

Vendors have, on the whole, pursued their own agenda in the NFV ISG, as they do in all standards bodies.  Some of that was because vendors tend to dominate standards processes on simple numbers alone, but some was also a lack of foresight on the part of the operators.  The E2E model was written by an operator, and the fixation with APIs and the confusion that exists on the role of the blocks in the diagram stem from that original picture and its lack of a clear link to functional requirements and benefits.  The interfaces that were described in that early E2E model guided the rest of the process.

Vodafone also complained about the PoC process, saying that the PoCs didn’t reflect the issues of a commercial deployment.  That’s also largely true, but again it’s probably unfair to single out the vendors for the failure.  Every PoC had to be approved (I know; I submitted the first one).  The problem with PoCs lies partly with the architecture that was supposed to be the reference, and partly with the fact that the initial ISG meetings declared operations, end-to-end management, and “federation” (the creation of services across domains) to be out of scope.  All of these are essential to making a business case for commercial deployment, and all of these are areas where operators now want open interfaces.  It’s no wonder that HP’s spokesperson, who is in my view the best NFV expert out there, expressed some frustration.

Operators aren’t blameless here, but neither are vendors.  This isn’t anyone’s first rodeo, and vendors have ignored the problem of the business case for NFV from the first.  At the least, they went along with the shortsighted, bottom-up planning of the ISG’s efforts.  Even a vendor like HP, which has in my view the very best solution available today, hasn’t put together a compelling business case according to my operator contacts.  No wonder operators are frustrated.

We also have to implicate the media.  How many stories on “NFV announcements” have we read?  By my count there are approaching 50 vendors who’ve announced something regarding NFV, and at least 20 who claim to have an “NFV solution.”  Of those, perhaps six have the actual features needed to make a business case.  Why, then, don’t we talk about the fact that most NFV vendors could never hope to make a business case on their own, and have useful products (at best) only when they fit into someone else’s story?

Within the ISG and in the media, we’re starting to hear about the pathway to fixing all of the early shortcomings.  The solution lies in modeling, or more precisely the way you model.  Intent models describe what you want, your “intent” or the end-state.  Implementations based on intent models translate intent into resource commitments, which is the most general and open approach.  If the key interfaces of NFV were defined in intent-model terms they’d be open by definition.  Intent models make federation across domains easy, so it would be easy to build NFV at the operator level even if you had some implementation silos.  In fact, it would be fair to say that intent models make silos (as we normally use the term) impossible.  Intent models can translate into NFV, SDN, or legacy resource commitments.  If they are used to define how NFV relates to operations, they can address all the elements of the business case.  They even map pretty nicely to the TMF structure.
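
To make that concrete, here is a minimal sketch of what an intent model might look like in code.  This is purely my own illustration; the class and field names are assumptions, not anything defined by the ISG or the TMF.  The point is that the “intent” captures only the desired end-state, and interchangeable translators map that same intent onto NFV, SDN, or legacy resources.

# Hypothetical intent-model sketch (illustration only; names are not from any spec).
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class ServiceIntent:
    """The desired end-state of a service, with no reference to how it's built."""
    service_type: str                                # e.g. "vpn-with-firewall"
    endpoints: list = field(default_factory=list)    # sites or UNIs to connect
    sla: dict = field(default_factory=dict)          # e.g. {"availability": 0.9999}

class ResourceDomain(Protocol):
    """Anything that can realize an intent: NFV, SDN, or legacy equipment."""
    def realize(self, intent: ServiceIntent) -> str: ...

class NfvDomain:
    def realize(self, intent: ServiceIntent) -> str:
        # A real implementation would hand the intent to MANO to deploy VNFs.
        return f"NFV domain deployed a VNF chain for {intent.service_type}"

class LegacyDomain:
    def realize(self, intent: ServiceIntent) -> str:
        # A real implementation would push configuration to existing devices.
        return f"Legacy domain configured devices for {intent.service_type}"

# Federation falls out naturally: the same intent can be handed to any domain.
intent = ServiceIntent("vpn-with-firewall", endpoints=["site-a", "site-b"],
                       sla={"availability": 0.9999})
for domain in (NfvDomain(), LegacyDomain()):
    print(domain.realize(intent))

Because every domain accepts the same intent, a higher-level service model doesn’t care which one fulfills it, which is why intent models make cross-domain federation (and openness at the interfaces) so much easier.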

The problem today is that we are trying to address early errors in process, scope, and focus even as we’ve approved almost 40 PoCs.  Are these useful?  Yes, in that they prove the conditions under which NFV can work.  The complaint that they don’t lead to commercial deployment is as much a complaint about the PoC standards as about the PoC execution.  We went down the wrong path, and I know my own complaints about that were heard by operators as well as vendors.  With the ISG work focused way down in the depths of detail without the proper functional foundation, it will take years to fix if the problem is addressed the same way the ISG developed the specs in the first place.  Open-source efforts linked to the ISG won’t be any better.  We have to change something.

OK, so let’s stop the blame game and focus on what could be done.

The first key step for the ISG to take is to have the operators lay out the basic framework needed to realize the benefits that could justify NFV.  Specific PoCs on each would be a start, and having ISG leadership bend the media to focus on the points would also be helpful.

The second step would be to explore what the other SDOs might contribute, through existing liaisons.  The TMF, OASIS, the IETF, and the OMG all have credentials in the space.  Can they step in to be helpful and make some quick progress?

Maybe, but in the near term at least, and perhaps for a very long time, this is going to depend on vendor initiatives.  This industry has sunk a boatload of money and effort into NFV and there’s no question that it’s at risk.  I know that most vendor salespeople and many vendor executives know darn well that the picture I’m painting here is accurate.  The year 2016 is absolutely critical, so vendors would serve their own best interests by trying to fix what’s un-fixed and get this all moving.  And yet, perhaps because of the lack of media attention on the real problems and the lack of operator push on specific paths to making a business case, even the vendors who can make the business case don’t present their wares effectively enough to prove it.

There is little that needs to be done here.  We have at least five and probably six implementations of NFV that can do what’s needed.  Media: Herald this group.  Make them famous, household words (well, perhaps, boardroom words).  Competitors, either ally or truly compete and offer a full-spectrum solution that can really make the business case.  And back to the media, call out those vendors who claim revolutionary advances when they’re not moving the ball at all.

And now you, operators.  Get your house in order.  You need the CIO and the CFO engaged on NFV, and you need to operationalize and justify it.  The scope of the current PoCs is what you’d expect given that the process is driven by standards and technology people.  I don’t believe that all the vendors doing vacuous crap would keep that up if operators put their feet to the fire whenever they made some stupid claim or failed to make a case beyond “believe me.”  BT, Telefonica, and some other operators are now spending on development to try to take up the slack.  Others need to do the same.  It’s your revenue/cost curve that’s converging, after all.

The problems of NFV can be fixed in six months or less.  Fix them, all you stakeholders.  It will be less costly than inventing another revolution and surviving until it succeeds.

How to Run an NFV Trial (and Make it Successful)

Suppose you’re a network operator and you want to run an NFV trial.  You need to make a business case to justify a considerable NFV program.  Given all of the difficulties that you know others have faced, what do you do?  This is a nice question to end a week on, so let’s look at it in depth.

The first step is to identify your specific benefit goals that, if met, will drive the deployment you plan.  The three classes of NFV benefit are capex reduction, opex reduction, and improved revenues.  Your project should address them all.  Here are some quick guidelines, followed by a rough benefit model that ties them together:

  • For capex reduction, be sure that you look at total cost of ownership and not just at capital costs.  Consider how big your “positioning deployment” to offer services will have to be, and how much (if any) write-down of existing equipment you’ll have to cover in your costs.
  • Opex reduction will depend on introducing a lot of new service automation, and the biggest problem prospective NFV users face is having insufficient scope to do that.  Services typically involve more than NFV; they use some legacy gear as well.  Can you automate deployment and management there too?  Also look at just what your operations integration will require by examining two cost areas: deployment of services and ongoing management.  And be wary of establishing an NFV deployment so complicated that it’s operationally inefficient.
  • Service revenue gains need to be based on three things.  First, what are your specific prospects for the new service, the buyers who will supply the revenue?  Second, what will your sales/marketing plan look like, and how will you reach and sell to these prospects?  Finally and most important, how will NFV contribute to the success?  Generally there are three ways it could: faster time to market, lower prices created by lower NFV cost points, and creation of features not easily provided except through hosting.  For each of these, outline explicitly what you expect.
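
To keep those three benefit classes honest, it helps to roll them into one simple model the CFO can interrogate.  Here’s a minimal sketch; the function is my own illustration rather than any standard methodology, and every number in it is a placeholder assumption you’d replace with your own figures.

# Rough annual benefit model for an NFV business case (illustrative only;
# all figures are placeholder assumptions, in $M per year).

def annual_benefit(current_capex, nfv_capex, writedown,
                   current_opex, nfv_opex,
                   new_revenue, nfv_revenue_share):
    capex_saving = current_capex - nfv_capex - writedown  # TCO view, net of write-downs
    opex_saving = current_opex - nfv_opex                  # depends on automation scope
    revenue_gain = new_revenue * nfv_revenue_share         # only the part NFV actually creates
    return capex_saving + opex_saving + revenue_gain

benefit = annual_benefit(current_capex=100, nfv_capex=80, writedown=5,
                         current_opex=120, nfv_opex=95,
                         new_revenue=40, nfv_revenue_share=0.25)
print(f"Projected annual benefit: ${benefit:.0f}M")  # 15 + 25 + 10 = $50M

If the total won’t cover the transition cost and a reasonable risk premium, the plan needs to change before the trial starts, not after.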

The next step in your NFV process is to defend your benefits by aligning them with technical requirements.  If you need a certain capex reduction, size up the new and old infrastructure side by side and look for places where your “NFV cost” could be suppositional or downright wrong.  If you count on opex, look at the target services and current costs and ask what specific steps will have to be taken, then how you can prove that NFV can take them within your trial.  For new services, customer surveys and competitive reviews are invaluable, but don’t neglect asking “Why wasn’t I selling this before, successfully?” and then ask what specific things NFV will do to make things different.

You’re now ready to look at the trial overall.  Your trial has to prove the technical requirements that defend your benefits.  When the CFO asks “Where does this number come from?” you have to be able to pull out your results sheet and show the derivation.  Generally, the technical requirements will group themselves into three areas:

  • Virtual Network Function requirements. You need VNFs to host in order for NFV to deliver anything, so know what your VNF source will be, and in your trial be sure to include the process of VNF onboarding, the introduction of a third-party VNF into your deployment framework.  Watch for VNF pricing and licensing model issues here.  Early cost management favors pay-as-you-go, but this can impose a large uncapped cost down the line.  Also watch for the hosting requirements of VNFs; if you pick a set that needs a lot of different platform software (OS, middleware, etc.) you may create operations cost problems or fragment your resource pool.
  • Network Functions Virtualization Infrastructure (NFVI) requirements. You will need server hardware, virtualization hypervisor or container software, a cloud management or virtualization management system, and an operating system (or systems).  One fundamental thing your trial must establish is just how many VNFs you can run per blade/core or whatever measure is convenient.  This will generally mean the number of containers or virtual machines, which depends on processor performance and (most often) network interface performance.  Users who have tested NFVI report that different platform software performs very differently, and that hardware and software acceleration of network performance is almost sure to be critical in meeting VNF density requirements.  Test your density assumptions carefully and ensure that you’ve made the optimum choices in software components for NFVI.  Also, be sure that your NFVI vendor(s) provide an Infrastructure Manager that links their stuff with the rest of the NFV software.
  • Management and Orchestration. This is going to be the hardest part of your trial planning because it’s probably central to making the business case and it’s very difficult to even get good data on.  There is no such thing as a “standard” implementation here at this point, so the most important things to look for are openness and conformance to the technical requirements that defend your benefits.  Look for:
    • Support of whatever legacy technology you need to manage to meet your business case.
    • A strong VNF Manager core functionality set so that VNFs don’t all end up managing themselves and creating management silos that will make efficient operations difficult.
    • Integration with operations systems to link all this efficiently to OSS/BSS tools and processes. Make sure that you understand what’s needed from the OSS/BSS side, and cost that out as part of your justification.
    • Service modeling and how it relates to VNFs. You should be able to take a service through a complete lifecycle, which means defining the service as an orderable template, instantiating it when it’s ordered, deploying it, managing it during in-life operations, and tearing it down.  Service automation automates this lifecycle, so be sure all the steps are in place and that the automation tasks are clearly identified (a minimal lifecycle checklist follows this list).
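
Here is that lifecycle checklist as a minimal sketch.  The stage names and coverage flags are my own shorthand for trial planning, not ETSI or TMF terminology, and the values are examples rather than recommendations.

# Hypothetical lifecycle-coverage checklist for an NFV trial (stage names are
# my own shorthand, not spec terminology).
from enum import Enum

class Stage(Enum):
    DEFINE = "define the service as an orderable template"
    INSTANTIATE = "instantiate the service when it's ordered"
    DEPLOY = "deploy VNFs and connections"
    OPERATE = "manage the service during in-life operations"
    TEAR_DOWN = "tear down and reclaim resources"

# For each stage, record whether the trial exercises it and whether it's automated.
coverage = {
    Stage.DEFINE:      {"exercised": True,  "automated": True},
    Stage.INSTANTIATE: {"exercised": True,  "automated": True},
    Stage.DEPLOY:      {"exercised": True,  "automated": True},
    Stage.OPERATE:     {"exercised": True,  "automated": False},  # manual ops put opex at risk
    Stage.TEAR_DOWN:   {"exercised": False, "automated": False},
}

gaps = [stage.value for stage, c in coverage.items()
        if not (c["exercised"] and c["automated"])]
print("Lifecycle gaps to close before claiming the opex benefit:", gaps)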

The next step is picking vendors, and there are three main options to consider here.  First, you can select vendors in each of the areas above—a “best-of-breed” approach.  Second, you can select a key vendor who is most connected with the technical requirements that make your business case, then let that vendor pull through tested partner elements.  Finally, you can get someone to act as an integrator and assemble stuff on your behalf.

Operators have had some issues with a “one-from-column-A” model.  Because the standards for NFV are far from fully baked, it may be very difficult to even get things to connect at major interface points.  It’s even harder to get vendors in this kind of loose confederation to own up to their own roles in securing your benefits.  The best approach overall is likely to be to find the vendor who can make the largest contribution to your benefit case and have them run the show on your behalf.

The “prime vendors” most likely to make the whole thing tick are at this moment Alcatel-Lucent, Ciena, HP, Huawei, Oracle, and Overture Networks.  Where integrators come in is when you like one of these vendors as your central NFV framework but don’t want them to take responsibility overall, either because they’re small or because you have no experience with them.  If you have an integrator you can trust, this is a good place to get them involved.  But get your integrator to contractually validate your plan and benefit trajectory, or agree on an acceptable set of changes before you start the project.

Now you’re ready to start the trial, and here the most important point to remember is to keep focused on your plan and defend your benefits.  Your trial should test every assumption that must be true for your benefit case to be valid, and it’s very easy to get distracted or to gloss over points.  It’s desirable not only to say “I can make this particular benefit work” but also to be able to show the range of values of the key variables under which it will still work.  If you assumed 20 VMs per host, will your case work at 15, and how much better might it be at 25?  Most successful trials do this kind of sensitivity analysis on their key variables.
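
A simple way to run that sensitivity check is to recompute the per-VNF hosting cost across a range of density assumptions.  The sketch below is illustrative only; the annual host cost and utilization figures are assumptions, not measured results.

# Sensitivity of per-VNF hosting cost to VM density (illustrative assumptions only).

def cost_per_vnf(vms_per_host, annual_host_cost=12000.0, utilization=0.7):
    """Annual hosting cost per VNF at a given density and average utilization."""
    usable_vms = vms_per_host * utilization
    return annual_host_cost / usable_vms

baseline = 20  # the density assumed in the business case
for density in (15, 20, 25):
    delta = cost_per_vnf(density) - cost_per_vnf(baseline)
    print(f"{density} VMs/host: ${cost_per_vnf(density):,.0f} per VNF per year "
          f"({delta:+,.0f} vs. baseline)")

If the benefit case evaporates at 15 VMs per host, you know exactly which trial measurement has to hold up before anyone signs off on deployment.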

Your trial should also, as a part of its results, recommend the next step.  The general progression you see is lab trial to field trial to limited deployment to full deployment.  In most cases you should assume that a successful trial sets the parameters for the next normal step, but there may be situations where you either want to skip a step (you blew away the benefit case) or you need to go back to a prior step (into the lab, for example) to work out a wrinkle in your approach.

Operators tell me that a good lab trial will take about nine months to complete, and a field trial can vary from six months to a year or even more.  Everyone agrees that good planning and organization up front will shorten the trials without increasing risk.  They also tell me that some “NFV vendors” are declining to do a trial under some conditions, claiming that the approach is wrong or whatever.  In a few cases they’re right, but in most it’s because the vendor doesn’t want to devote the resources or doesn’t think the results will favor them.  Obviously you need vendors to do NFV, but you may have to play hardball with some to prevent them from hijacking the process and creating a trial that won’t make the business case.

Operations Politics and SDN/NFV Success

Light Reading posted an interesting article on the commitment of network operators to NFV despite uncertainty about how they’d be expected to integrate with OSS/BSS.  There are some parallels between the comments made and my own interactions with the operators, but I’ve also had some discussions about the issue at greater depth, and I want to share the results of these too.

Operations support, meaning operations support and business support systems (OSS/BSS), is the lifeblood of the business of being a service provider.  Banks have things like demand deposit accounting (checking/savings); operators have OSS/BSS.  So OSS/BSS isn’t going away no matter what, and I don’t think any operator seriously thinks it is.  Operators have been groping at the question of how to “modernize” operations systems just as banks or retailers have been looking at their own core IT applications.

Telephones predate the computer by a large margin (just like banks do) and so technology was first applied to operators’ business largely to record and integrate manual tasks.  We ended up with a bunch of acronyms that I won’t bother decoding (AMATPS, CSOBS, EADAS, RMAS, SCCS, SERVORD, and TIRKS) by the ‘70s, and these described computer tools to facilitate the business of telephony.

About a decade ago, operators and some vendors realized that we were advancing computer technology and application architectures faster than we were transforming the OSS/BSS.  Sun Microsystems, then perhaps the powerhouse vendor in the network operator space, developed a model for OSS/BSS based on Java.  This was taken over by the TMF (as OSS/J) and also became a program supported by a variety of vendors, including Oracle, which bought Sun.  Most of my operator contacts and most OSS/BSS experts I’ve talked with would agree that the initiative had great promise but didn’t develop into anything particularly influential.

The goal of OSS/BSS evolution from that point forward was threefold: first, to modernize OSS/BSS interfaces to be based on modern standards; second, to open up what was perceived as an increasingly closed software structure and so make it more competitive; and third, to support automated (“event-driven”) responses to service events rather than coupling those events to manual processes.  Those same goals are in place today within the operator groups responsible for OSS/BSS, which are under the CIO.

CIOs, as you may recall from some of my prior blogs, haven’t been particularly engaged in the SDN or NFV processes, and so you could reasonably expect that they’ve pursued their own missions in parallel with the transformation tasks of SDN and NFV, which have been under the CTO.  On the face, then, it’s not surprising that operators today aren’t relating the two activities much, and totally unsurprising that they’d pursue SDN and NFV without a complete OSS/BSS strategy—different organizations are involved.

But a deeper question is bringing the topics together even if the humans involved are still doing their own quasi-political things.  SDN and NFV both inherit from the cloud a notion of translating a “model” of something into a deployed and operational system.  The modern term for this is “intent modeling”.  The MANO (management and orchestration) process of NFV and probably the MANO-to-VIM (virtual infrastructure manager) relationship should be seen as operating on intent models.  The goal is service automation, of course, and this is where the issues with OSS/BSS arise, because services are what OSS/BSS systems are about.

You could create two models, broadly, for combining modern MANO-like concepts with OSS/BSS.  One is to use MANO entirely below OSS/BSS, as traditional devices and network management systems tend to be used today.  I’ve called this the “virtual device” model because old OSS/BSS practices work with devices, so you could make them work with SDN and NFV if both these new technologies were presented as though they were devices.  The second model is to make MANO the center of the universe, orchestrating the entire service lifecycle including operations processes.
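
To make the contrast concrete, here is a minimal sketch of the two patterns.  The class names and calls are hypothetical illustrations, not any real product or spec interface: in the first, the OSS/BSS sees SDN/NFV only through a device-like facade; in the second, a MANO-like orchestrator drives both resource deployment and the operations processes.

# Two hypothetical OSS/BSS integration patterns (illustration only).

class Orchestrator:
    """Stand-in for a MANO-like engine that fulfills intent models."""
    def deploy(self, intent: dict) -> str:
        return f"deployed {intent['service']}"

# Pattern 1: the "virtual device" model.  SDN/NFV hides behind a device-style
# interface, so today's OSS/BSS practices are untouched, but any orchestration
# of operations processes has to be built separately at the OSS/BSS level.
class VirtualDevice:
    def __init__(self, orchestrator: Orchestrator, intent: dict):
        self._orch, self._intent = orchestrator, intent
    def provision(self) -> str:            # looks like a device command to the OSS
        return self._orch.deploy(self._intent)

# Pattern 2: the MANO-centric model.  The orchestrator owns the whole service
# lifecycle and invokes operations processes (billing, assurance) as modeled steps.
class LifecycleOrchestrator(Orchestrator):
    def fulfill(self, intent: dict, oss_hooks: list) -> str:
        result = self.deploy(intent)
        for hook in oss_hooks:             # OSS/BSS processes become orchestrated steps
            hook(intent)
        return result

Either shape can work technically; the argument is over where the service automation logic, and therefore the opex benefit, ends up living.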

This, in my view, is the operator dilemma and what they’re really debating.  If you elect to follow the virtual device model, you retain compatibility with the operations systems of today, but you then either have to forget orchestration at the OSS/BSS level or build it there separately from whatever you do to model and automate the fulfillment of “virtual device intent models”.  The OSS/BSS community, meaning the people in the operator organizations who run OSS/BSS and the vendors who provide the products, is largely in favor of staying the course, while those outside (and a few mavericks inside) favor a complete transformation of OSS/BSS into an element of intent-modeled systems orchestrated on MANO-like principles.

This is the sort of “how-many-angels-can-dance-on-the-head-of-a-pin” question that most people outside the politics of the process get pretty frustrated with, and it’s easy to dismiss the question completely.  The problem arises because operations efficiency is the most critical benefit that either SDN or NFV could present.  You pretty much have to be able to automate service operations in a virtual world, or you’ll have a bunch of humans running around trying to fix virtual machines (maybe virtual humans would work?).  In financial terms, you have as much as 18 cents of every revenue dollar on the table as a displaceable cost if you can fully automate service/network/IT operations.

Even “capex” is impacted here.  Not only does the ability to substitute virtual resources for real appliances depend on efficient operations, but operators also spend a lot of money on OSS/BSS itself.  For years and years their spending on capitalized operations software has exceeded their spending on the devices that software runs on.

Because operations efficiency is such a big piece of the benefit case for both SDN and NFV, it follows that it’s going to be difficult to make a broad case for either SDN or NFV without drawing on it.  To do that, you have to find an approach that unifies operations and reduces costs overall, which means uniting the political constituencies of NFV.

There is a ton of money at stake here.  “Orchestrable” OSS/BSS would let operators do best-of-breed shopping and integrate components easily.  That makes it anathema to current OSS/BSS vendors.  It would also facilitate the substitution of network resources and IT elements, making it less desirable to vendors in those spaces.  But it could make the business case when it’s very probable that without it, neither SDN nor NFV can succeed on a large scale in service provider networks.

So this is a deep issue, a deep division, an example of how an age-old organizational separation could become a barrier to facing the future correctly.  Right now, the core problem it creates at the technical level is that, lacking a unified force to change stuff in operations overall, we’re not addressing how the changes could be made or coordinated.  I hear stories of the right answers out there all the time, but you have to dig them out of a mass of vendor material and PoCs and demonstrations.

I’m glad we’re talking about this issue because it’s the most important issue in both SDN and NFV.  Fix this and we can fix everything else in time.  Fail to fix it and we’re out of time.