Cisco, Ericsson, and Verizon: What They Tell Us About the Future

The news that Cisco and Ericsson are forming a marketing partnership isn’t a big surprise, given Nokia’s earlier deal for Alcatel-Lucent.  It’s still big news in the industry, though, and perhaps bigger if you set it alongside another news item: Verizon is looking to sell off its cloud business.  It’s a story of things staying the same, and things changing…back.

Any Wall Street type will tell you that as a given industry commoditizes, meaning its products can no longer be differentiated except on price, the inevitable response of businesses is consolidation.  A large number of competitors can’t be justified in a commodity market because the total cost of operations and administration is too high.  Commoditization also erodes differentiation, which can reduce the revenue per sale, making it hard for smaller, less “complete” players to stay in the game.

The Nokia/Alcatel-Lucent deal is a classic response to commoditization—bulk up by merging and lower your cost points because the combined organization can eliminate duplication (meaning workers).  Along the way you can also broaden your product line.  Nokia has little other than mobile, whereas Alcatel-Lucent has perhaps too broad a line for its strategy to control.  The question with Cisco/Ericsson is whether they’re responding to commoditization or to Nokia/Alcatel-Lucent.

In terms of carrier strategy, Cisco has a problem in that it doesn’t want to support the kind of changes in network cost that operators expect.  Cisco has never had a sophisticated strategy for selling to operators.  They push their network traffic indexes and things like the so-called “Internet of Everything” and say in essence that more traffic is coming so suck it up and carry the stuff.  Cisco doesn’t have a strong mobile story, and that’s a particular problem in an era where operators are investing more in mobile than in wireline and core.

Ericsson is a service company trying to survive in a market that’s focused on product cost.  If operators are beating Huawei up on price to get better cost per bit, they’re unlikely to want to spend more on professional services.  So if you’re Ericsson and have divested yourself of most products, you start a bit behind the eight-ball with respect to competitors (like Nokia/Alcatel-Lucent) who have product profits they can draw on to raise ARPU.  And as competitors get more into professional services themselves, you’re in the position of selling products from vendors who now compete with you in the service business.

By reselling Cisco gear, Ericsson gets at least a piece of the action on the product side, which increases their ARPU.  Cisco gets a conduit into complicated telco deals, particularly those requiring integration with OSS/BSS where Ericsson is especially strong.  The problem is that to be optimally successful for both vendors, the industry has to pretty much stay the course with respect to the makeup of network infrastructure.

So are Cisco and Ericsson betting against SDN and NFV, or hedging?  I think it’s clear that Cisco is betting against them, because it’s hard to see any SDN/NFV success that wouldn’t put Cisco under margin pressure.  I think Ericsson is hedging, because there are a lot of recent troubling signs that SDN/NFV isn’t exploding into deployments as many had hoped.  What happens to professional services if there’s no change in technology at all?

This is where Verizon and the cloud come into the picture.  Like most operators, Verizon jumped on the cloud as a possible new revenue stream, and through M&A gained a quick position there.  The cloud jumped ahead of everything else in terms of operator senior management hopes for a real change in their business.  Now, we might be seeing a strong signal that the network operator push for the cloud has failed to produce results.  Forgetting for the moment what that might mean about cloud computing, for SDN and NFV it would mean that cloud services could not be counted upon to generate data center deployments that would then be suitable for SDN connection and NFV hosting.

Some NFV applications wouldn’t be impacted by this; mobile and CDN generate enough mass on their own to be credible drivers for data center deployment.  For service chaining used in vCPE, it could be significant, and it could also be significant for virtual routers on top of groomed tunnels and optics.  These applications need somewhat optimized placement of hosting points, and that would be easier if there were a successful cloud infrastructure to piggyback on.

If SDN and NFV are already slow-rolling and if cloud infrastructure isn’t going to be readily available as an early platform to host either, then the game changes in a way that could hurt both Ericsson and Cisco.  Operators’ first reaction would be to push harder on pricing of legacy gear, which means that everywhere outside the US (where Huawei has been forced out of the carrier market) Huawei is more likely to win.  That’s bad for both Cisco and Ericsson.

But lack of infrastructure for SDN/NFV piggybacking also helps the Nokia/Alcatel-Lucent duality.  Alcatel-Lucent has the strongest story for NFV-hosted mobile (IMS/EPC) and content delivery, and these applications are large enough to pull through their own hosting without the help of an in-place cloud.  That could mean that Nokia/Alcatel-Lucent would suddenly take a lead in NFV, one that Cisco and Ericsson would have considerable difficulty overcoming.

Far from being a positive revolution, Cisco/Ericsson is on balance a hedge, and almost a signal of a frantic catch-up move.  A sudden success for SDN/NFV would make it moot.  A delayed success would empower the combined Nokia/Alcatel-Lucent, which is probably positioned to be an early driver.  In both cases Huawei becomes stronger in the near term.  Two or three years from now, the deal will either have to become real M&A (and who believes that could happen?) or it will fail to deliver any utility to either party, and they’ll have to face the new world alone.

It’s obvious that the disrupting factor here is the pace of adoption of SDN and NFV, and if network vendors were the only source of the products for these areas, then Cisco and Ericsson would be on easy street.  They aren’t, and this may all end up hinging on HP and Oracle.  These two vendors are not part of the network infrastructure game of today, and they have every reason to want to push SDN, NFV, and the cloud to operators as a means of reducing network costs and opex overall.  But while HP has all the goodies needed and Oracle has all the positioning determination needed, neither of the two has managed to get both these assets into play.

The worst case for Cisco/Ericsson would be that Nokia/Alcatel-Lucent jumpstart both SDN and NFV through mobility (IMS/EPC) and content delivery (CDN).  That would create a network-vendor competitor with a foot on the stepping-stone to an alternative network future (and on Cisco/Ericsson’s neck).  But it’s hard to see what a good future outcome would be for Cisco and Ericsson, because none of the options will really save “business as usual” in networking.

What’s Behind NFV’s Blame Game and How Do You Fix It?

I got a laugh at a conference some years back when I commented that, to the media, any new technology had to be either the single-handed savior of western culture or the last bastion of international communism.  Anything in between wasn’t going to generate clicks, and it was too complicated to write up in any case.  Clearly we’re at that point with NFV.  Two months ago it was a darling of the media.  Apparently the honeymoon is over, and the focus of the disillusionment is the difference between “workable” and “makes the business case.”  What’s interesting is that the points being raised are valid, even if the emphasis on them is a bit belated.

A Light Reading piece quotes the BT rep on the NFV ISG as suggesting, with respect to how the ISG should have architected NFV: “Shouldn’t we take an end-to-end process perspective and understand how they’ll work logically and then map that back to the architecture?”  Well, yes, it should have, and I know I raised that point often, as did others.

The original vision of NFV was that it would be valuable to substitute hosted software and commodity servers for high-priced purpose-built appliances.  That is absolutely true on its face.  Over time, the NFV community realized that this simple value proposition wouldn’t generate the benefit case needed to offset the technology investment and the risk.  As a leading operator said two years ago, referencing the 20%-or-so savings that could be obtained from capex reduction, “We can get more than that by beating Huawei up on price!”  Huawei, by the way, was in the room at the time.

The operators who launched NFV updated their vision of benefits over time, to focus more on service agility (by which they told me they meant automated service lifecycle management) and operations efficiency.  By this time, though, the basic model of NFV as we see it today had already been established.  BT is right in suggesting that the detailed cart got before the end-to-end horse.  They’re also right in the implication that the business case would have been more easily made had the body started at the top, with functional requirements.

The current situation is reflected in yet another Light Reading piece that starts with some Vodafone complaints and includes a response from a vendor, HP.  The complaint was that vendors had shown no initiative in establishing open interfaces.  Is that complaint valid?

Vendors have, on the whole, pursued their own agenda in the NFV ISG, as they do in all standards bodies.  Some of that was because vendors tend to dominate standards processes on simple numbers alone, but some was also a lack of foresight on the part of the operators.  The E2E model was written by an operator, and the fixation with APIs and the confusion that exists on the role of the blocks in the diagram stem from that original picture and its lack of a clear link to functional requirements and benefits.  The interfaces that were described in that early E2E model guided the rest of the process.

Vodafone also complained about the PoC process, saying that the PoCs didn’t reflect the issues of a commercial deployment.  That’s also largely true, but again it’s probably unfair to single out the vendors for the failure.  Every PoC had to be approved (I know; I submitted the first one).  The problem with the PoCs lies partly with the architecture that was supposed to be the reference, and partly with the fact that the initial ISG meetings declared operations, end-to-end management, and “federation” (the creation of services across domains) to be out of scope.  All of these are essential to making a business case for commercial deployment, and all are areas where operators now want open interfaces.  It’s no wonder that HP’s spokesperson, who is in my view the best NFV expert out there, expressed some frustration.

Operators aren’t blameless here, but neither are vendors.  This isn’t anyone’s first rodeo, and vendors have ignored the problem of the business case for NFV from the first.  At the least, they went along with the shortsighted, bottom-up planning of the ISG’s efforts.  Even HP, which has in my view the very best solution available today, hasn’t put together a compelling business case, according to my operator contacts.  No wonder operators are frustrated.

We even have to implicate the media.  How many stories on “NFV announcements” have we read?  By my count there are approaching 50 vendors who’ve announced something regarding NFV, and at least 20 who claim to have an “NFV solution.”  Of these, perhaps six have the actual features needed to make a business case.  Why, then, don’t we talk about the fact that most NFV vendors could never hope to make a business case on their own, and have useful products (at best) only when they fit into someone else’s story?

Within the ISG and in the media, we’re starting to hear about the pathway to fixing all of the early shortcomings.  The solution lies in modeling, or more precisely the way you model.  Intent models describe what you want, your “intent” or the end-state.  Implementations based on intent models translate intent into resource commitments, which is the most general and open approach.  If the key interfaces of NFV were defined in intent-model terms they’d be open by definition.  Intent models make federation across domains easy, so it would be easy to build NFV at the operator level even if you had some implementation silos.  In fact, it would be fair to say that intent models make silos (as we normally use the term) impossible.  Intent models can translate into NFV, SDN, or legacy resource commitments.  If they are used to define how NFV relates to operations, they can address all the elements of the business case.  They even map pretty nicely to the TMF structure.
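To make the intent-model idea concrete, here’s a minimal sketch (in Python, with every name invented for illustration; none of this comes from the ETSI specs or the TMF) of how an intent states what is wanted while each domain decides how to commit resources:

```python
# Illustrative sketch only: an intent model states the desired end-state;
# translators map it onto NFV, SDN, or legacy resource commitments.
# All class and field names are hypothetical.

from dataclasses import dataclass, field

@dataclass
class ServiceIntent:
    """What the customer wants, with no implementation detail."""
    name: str
    endpoints: list                             # where the service must appear
    sla: dict = field(default_factory=dict)     # e.g. {"latency_ms": 20}

class IntentTranslator:
    """Any domain that can realize an intent implements this."""
    def realize(self, intent: ServiceIntent):
        raise NotImplementedError

class NfvTranslator(IntentTranslator):
    def realize(self, intent):
        # would deploy and chain VNFs, bind management, etc.
        print(f"NFV domain hosting functions for {intent.name}: {intent.sla}")

class LegacyTranslator(IntentTranslator):
    def realize(self, intent):
        # would provision existing devices through their management systems
        print(f"Legacy domain provisioning devices for {intent.name}")

# The same intent can be handed to any domain, or federated across several,
# without the service definition changing -- which is why implementation
# silos (in the usual sense) become impossible.
intent = ServiceIntent("biz-vpn-101", ["site-a", "site-b"], {"latency_ms": 20})
for domain in (NfvTranslator(), LegacyTranslator()):
    domain.realize(intent)
```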

The problem today is that we are trying to address early errors in process, scope, and focus even as we’ve approved almost 40 PoCs.  Are these useful?  Yes, in that they prove conditions under which NFV can work.  The complaint that they don’t lead to commercial deployment is as much a complaint about the PoC standards as about the PoC execution.  We went down the wrong path, and I know my own complaints about that were heard by operators as well as by vendors.  With the ISG work focused way down in the depths of detail without the proper functional foundation, fixing the problem the way the ISG developed the specs in the first place will take years.  Open-source efforts linked to the ISG won’t be any better.  We have to change something.

OK, so let’s stop the blame game and focus on what could be done.

The first key step for the ISG to take is to have the operators lay out the basic framework needed to realize the benefits that could justify NFV.  Specific PoCs on each benefit would be a start, and having ISG leadership steer media attention toward those points would also be helpful.

The second step would be to explore what the other SDOs might contribute, through existing liaisons.  The TMF, OASIS, the IETF, and the OMG all have credentials in the space.  Can they step in to be helpful and make some quick progress?

Maybe, but in the near term at least, and perhaps for a very long time, this is going to depend on vendor initiatives.  This industry has sunk a boatload of money and effort into NFV and there’s no question that it’s at risk.  I know that most vendor salespeople and many vendor executives know darn well that the picture I’m painting here is accurate.  The year 2016 is absolutely critical, so vendors would serve their own best interests by trying to fix what’s un-fixed and get this all moving.  And yet, perhaps because of the lack of media attention on the real problems and the lack of operator push on specific paths to making a business case, even the vendors who can make the business case don’t present their wares effectively enough to prove it.

There is actually very little that needs to be done here.  We have at least five and probably six implementations of NFV that can do what’s needed.  Media: herald this group.  Make them famous, household words (well, perhaps, boardroom words).  Competitors: either ally or truly compete and offer a full-spectrum solution that can really make the business case.  And back to the media: call out those vendors who claim revolutionary advances when they’re not moving the ball at all.

And now you, operators.  Get your house in order.  You need the CIO and the CFO engaged on NFV, and you need to operationalize and justify it.  The scope of the current PoCs is what you’d expect given that the process is driven by standards and technology people.  I don’t believe that all the vendors doing vacuous crap would keep it up if operators put their feet to the fire whenever they made some stupid claim or failed to make a case beyond “believe me.”  BT, Telefonica, and some other operators are now spending on development to try to take up the slack.  Others need to do the same.  It’s your revenue/cost curve that’s converging, after all.

The problems of NFV can be fixed in six months or less.  Fix them, all you stakeholders.  It will be less costly than inventing another revolution and surviving until it succeeds.

How to Run an NFV Trial (and Make it Successful)

Suppose you’re a network operator and you want to run an NFV trial.  You need to make a business case to justify a considerable NFV program.  Given all of the difficulties that you know others have faced, what do you do?  This is a nice question to end a week on, so let’s look at it in depth.

The first step is to identify your specific benefit goals that, if met, will drive the deployment you plan.  The three classes of NFV benefit are capex reduction, opex reduction, and improved revenues.  Your project should address them all, and here are some quick guidelines:

  • For capex reduction, be sure that you look at total cost of ownership and not just at capital costs, and be sure that you consider how big your “positioning deployment” to offer services will be, and how much (if any) write-down of existing equipment you’ll have to cover in costs.
  • Opex reduction will depend on your introducing a lot of new service automation, and the biggest problem prospective NFV users face is having insufficient scope to do that.  Services typically involve more than NFV; they use some legacy gear as well.  Can you automate deployment and management there?  Also look at just what your operations integration will require, by looking at two cost areas—deployment of services and ongoing management.  And be wary of establishing an NFV deployment so complicated that it’s operationally inefficient.
  • Service revenue gains need to be based on three things.  First, what are your specific prospects for the new service, the buyers who will supply the revenue?  Second, what will your sales/marketing plan look like?  How will you reach and sell to these people?  Finally and most important, how will NFV contribute to the success?  Generally there are three ways it could: faster time to market, lower prices created by lower NFV cost points, and creation of features not easily provided except through hosting.  For each of these, outline explicitly what you expect.

The next step in your NFV process is to defend your benefits by aligning them with technical requirements.  If you need a certain capex reduction, size up the new and old infrastructure side by side and look for places where your “NFV cost” could be suppositional or downright wrong.  If you count on opex, look at the target services and current costs and ask what specific steps will have to be taken, then how you can prove that NFV can take them within your trial.  For new services, customer surveys and competitive reviews are invaluable, but don’t neglect asking “Why wasn’t I selling this before, successfully?” and then ask what specific things NFV will do to make things different.

You’re now ready to look at the trial overall.  Your trial has to prove the technical requirements that defend your benefits.  When the CFO asks “Where does this number come from?” you have to be able to pull out your results sheet and show the derivation.  Generally, the technical requirements will group themselves into three areas:

  • Virtual Network Function requirements. You need VNFs to host in order for NFV to deliver anything, so know what your VNF source will be, and in your trial be sure to include the process of VNF onboarding, the introduction of a VNF from a third-party into your deployment framework.  Watch for VNF pricing and licensing model issues here.  Early cost management favors pay-as-you-go, but this can impose a large uncapped cost down the line.  Also watch for the hosting requirements for VNFs; if you pick a set that need a lot of different platform software (OS, middleware, etc.) you may create operations cost problems or fragment your resource pool.
  • Network Functions Virtualization Infrastructure (NFVI) requirements. You will need server hardware, virtualization hypervisor or container software, a cloud management or virtualization management system, and an operating system (or systems).  One fundamental thing your trial must establish is just how many VNFs you can run per blade/core or whatever measure is convenient.  This will generally mean the number of containers or virtual machines, which depends on processor performance and (most often) network interface performance.  Users who have tested NFVI report that different platform software performs very differently, and that hardware and software acceleration of network performance is almost sure to be critical in meeting VNF density requirements.  Test your density assumptions carefully and ensure that you’ve made the optimum choices in software components for NFVI.  Also, be sure that your NFVI vendor(s) provide an Infrastructure Manager that links their stuff with the rest of the NFV software.
  • Management and Orchestration. This is going to be the hardest part of your trial planning because it’s probably central to making the business case and it’s very difficult to even get good data on.  There is no such thing as a “standard” implementation here at this point, so the most important things to look for are openness and conformance to the technical requirements that defend your benefits.  Look for:
    • Support of whatever legacy technology you need to manage to meet your business case.
    • A strong VNF Manager core functionality set so that VNFs don’t all end up managing themselves and creating management silos that will make efficient operations difficult.
    • Integration with operations systems to link all this efficiently to OSS/BSS tools and processes. Make sure that you understand what’s needed from the OSS/BSS side, and cost that out as part of your justification.
    • Service modeling and how it relates to VNFs. You should be able to take a service through a complete lifecycle, which means defining the service as an orderable template, instantiating it when it’s ordered, deploying it, managing it during in-life operations, and tearing it down.  Service automation means automating this whole flow, so be sure all the steps are in place and that the automation tasks are clearly identified (see the sketch after this list).
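Here’s the lifecycle sketch promised above: a minimal state machine (Python; all names are hypothetical, not from any spec) showing the stages a trial should be able to walk a service through.  Every transition a human performs today is an automation target the trial should identify and measure.

```python
# Hypothetical sketch of the service lifecycle a trial should exercise.
# Each transition that a human performs today is an automation target; a
# trial should identify which are automated and measure the effort saved.

from enum import Enum, auto

class LifecycleState(Enum):
    TEMPLATE = auto()     # orderable service definition
    ORDERED = auto()      # instantiated for a customer
    DEPLOYED = auto()     # VNFs hosted, connections made
    IN_LIFE = auto()      # under management, SLA monitored
    TORN_DOWN = auto()    # resources reclaimed

VALID_TRANSITIONS = {
    LifecycleState.TEMPLATE:  {LifecycleState.ORDERED},
    LifecycleState.ORDERED:   {LifecycleState.DEPLOYED, LifecycleState.TORN_DOWN},
    LifecycleState.DEPLOYED:  {LifecycleState.IN_LIFE, LifecycleState.TORN_DOWN},
    LifecycleState.IN_LIFE:   {LifecycleState.TORN_DOWN},
    LifecycleState.TORN_DOWN: set(),
}

class ServiceInstance:
    def __init__(self, template_name: str):
        self.template = template_name
        self.state = LifecycleState.TEMPLATE

    def advance(self, new_state: LifecycleState):
        if new_state not in VALID_TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        # In a real trial, hook automation here and log any manual effort.
        self.state = new_state

svc = ServiceInstance("managed-firewall")
for step in (LifecycleState.ORDERED, LifecycleState.DEPLOYED,
             LifecycleState.IN_LIFE, LifecycleState.TORN_DOWN):
    svc.advance(step)
print(svc.state)   # -> LifecycleState.TORN_DOWN
```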

The next step is picking vendors, and there are three main options to consider here.  First, you can select vendors in each of the areas above—a “best-of-breed” approach.  Second, you can select a key vendor who is most connected with the technical requirements that make your business case, then let that vendor pull through tested partner elements.  Finally, you can get someone to act as an integrator and assemble stuff on your behalf.

Operators have had some issues with a “one-from-column-A” model.  Because the standards for NFV are far from fully baked, it may be very difficult to even get things to connect at major interface points.  It’s even harder to get vendors in this kind of loose confederation to own up to their own roles in securing your benefits.  The best approach overall is likely to be to find the vendor who can make the largest contribution to your benefit case and have them run the show on your behalf.

The “prime vendors” most likely to make the whole thing tick are at this moment Alcatel-Lucent, Ciena, HP, Huawei, Oracle, and Overture Networks.  Where integrators come in is when you like one of these vendors as your central NFV framework but don’t want them to take responsibility overall, either because they’re small or because you have no experience with them.  If you have an integrator you can trust, this is a good place to get them involved.  But get your integrator to contractually validate your plan and benefit trajectory, or agree on an acceptable set of changes before you start the project.

Now you’re ready to start the trial, and here the most important point to remember is to keep focused on your plan and defend your benefits.  Your trial should test every assumption that must be true for your benefit case to be valid, and it’s very easy to get distracted or to gloss over points.  It is desirable not only to say “I can make this particular benefit work” but to be able to show the range of values of key variables under which it will still work.  If you assumed 20 VMs per host, will your case work at 15, and how much better might it be at 25?  Most successful trials do this kind of sensitivity analysis on their key variables.
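As a simple illustration of that kind of sensitivity analysis, a few lines of arithmetic show how the cost per hosted VNF moves with density.  The dollar figures here are invented; substitute the numbers from your own trial.

```python
# Toy sensitivity analysis: how monthly cost per hosted VNF moves with VM
# density per server.  All figures are invented for illustration.

SERVER_MONTHLY_COST = 900.0   # assumed amortized capex + power + space
OPEX_PER_VM = 4.0             # assumed incremental operations cost per VM

def cost_per_vnf(vms_per_host: int) -> float:
    return SERVER_MONTHLY_COST / vms_per_host + OPEX_PER_VM

for density in (15, 20, 25):
    print(f"{density} VMs/host -> ${cost_per_vnf(density):6.2f} per VNF-month")

# If the benefit case assumed 20 VMs/host, this shows at once how much
# margin survives at 15 and how much upside appears at 25.
```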

Your trial should also, as a part of its results, recommend the next step.  The general progression you see is lab trial to field trial to limited deployment to full deployment.  In most cases you should assume that a successful trial sets the parameters for the next normal step, but there may be situations where you either want to skip a step (you blew away the benefit case) or you need to go back to a prior step (into the lab, for example) to work out a wrinkle in your approach.

Operators tell me that a good lab trial will take about nine months to complete, and a field trial can vary from six months to a year or even more.  Everyone agrees that good planning and organization up front will shorten the trials without increasing risks.  They also tell me that some “NFV vendors” are declining to do a trial under some conditions, claiming that the approach is wrong or whatever.  In a few cases they’re right, but in most it’s because the vendor doesn’t want to devote the resources or doesn’t think the results will favor them.  Obviously you need vendors to do NFV, but you may have to play hardball with some to prevent them from hijacking the process and creating a trial that won’t make the business case.

Operations Politics and SDN/NFV Success

Light Reading posted an interesting article on the commitment of network operators to NFV despite uncertainty about how they’d be expected to integrate with OSS/BSS.  There are some parallels between the comments made and my own interactions with the operators, but I’ve also had some discussions about the issue at greater depth, and I want to share the results of these too.

Operations support, meaning the operations support and business support systems (OSS/BSS), is the lifeblood of the business of being a service provider.  Banks have things like demand deposit accounting (checking/savings), and operators have OSS/BSS.  So the fact is that OSS/BSS isn’t going away no matter what, and I don’t think any operator seriously believes it is.  Operators have been groping at the question of how to “modernize” operations systems just as banks and retailers have been looking at their own core IT applications.

Telephones predate the computer by a large margin (just as banks do), and so technology was first applied to operators’ business largely to record and integrate manual tasks.  By the ’70s we had ended up with a bunch of acronyms that I won’t bother decoding (AMATPS, CSOBS, EADAS, RMAS, SCCS, SERVORD, and TIRKS), all describing computer tools that facilitated the business of telephony.

About a decade ago, operators and some vendors realized that we were advancing computer technology and application architectures faster than we were transforming the OSS/BSS.  Sun Microsystems, then perhaps the powerhouse vendor in the network operator space, developed a model for OSS/BSS based on Java.  This was taken over by the TMF (as OSS/J) and also became a program supported by a variety of vendors, including Oracle, who bought Sun.  Most of my operator contacts and most OSS/BSS experts I’ve talked with would agree that the initiative had great promise but didn’t develop into anything particularly influential.

The goal of OSS/BSS evolution from that point forward was threefold: first, to modernize OSS/BSS interfaces to be based on modern standards; second, to open up what was perceived as an increasingly closed software structure and so make it more competitive; and third, to support automated (“event-driven”) responses to service events rather than coupling them to manual processes.  Those same goals are in place today within the operator groups responsible for OSS/BSS, which sit under the CIO.

CIOs, as you may recall from some of my prior blogs, haven’t been particularly engaged in the SDN or NFV processes, and so you could reasonably expect that they’ve pursued their own missions in parallel with the transformation tasks of SDN and NFV, which have been under the CTO.  On the face, then, it’s not surprising that operators today aren’t relating the two activities much, and totally unsurprising that they’d pursue SDN and NFV without a complete OSS/BSS strategy—different organizations are involved.

But a deeper question is bringing the topics together even if the humans involved are still doing their own quasi-political things.  SDN and NFV both inherit from the cloud a notion of translating a “model” of something into a deployed and operational system.  The modern term for this is “intent modeling”.  The MANO (management and orchestration) process of NFV and probably the MANO-to-VIM (virtual infrastructure manager) relationship should be seen as operating on intent models.  The goal is service automation, of course, and this is where the issues with OSS/BSS arise, because services are what OSS/BSS systems are about.

You could create two models, broadly, for combining modern MANO-like concepts with OSS/BSS.  One is to use MANO entirely below OSS/BSS, as traditional devices and network management systems tend to be used today.  I’ve called this the “virtual device” model because old OSS/BSS practices work with devices, so you could make them work with SDN and NFV if both these new technologies were presented as though they were devices.  The second model is to make MANO the center of the universe, orchestrating the entire service lifecycle including operations processes.
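To make the first of those models concrete, here’s a purely illustrative sketch (it reflects no actual OSS/BSS or MANO product; all names are invented) of a “virtual device” adapter.  The OSS/BSS sees the familiar status of a single “device” while the orchestrated complexity hides underneath.

```python
# Illustrative only: the "virtual device" model wraps an orchestrated
# SDN/NFV service so it presents the management face of a traditional box.
# The OSS/BSS keeps its device-oriented practices; MANO stays hidden.

class ManoService:
    """Stand-in for an orchestrated service with distributed parts."""
    def __init__(self, parts):
        self.parts = parts                  # e.g. VNFs, tunnels, chain links

    def worst_part_health(self) -> float:
        return min(p["health"] for p in self.parts)

class VirtualDeviceAdapter:
    """Presents a MANO-orchestrated service as one manageable 'device'."""
    def __init__(self, service: ManoService):
        self.service = service

    def get_status(self) -> str:
        # Collapse complex internal state into the simple status an
        # OSS/BSS expects from a physical device.
        h = self.service.worst_part_health()
        return "green" if h > 0.9 else "yellow" if h > 0.5 else "red"

svc = ManoService([{"health": 0.97}, {"health": 0.93}])
print(VirtualDeviceAdapter(svc).get_status())   # -> green
```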

This, in my view, is the operator dilemma and what they’re really debating.  If you elect to follow the virtual device model, you retain compatibility with the operations systems of today, but you now either have to forget orchestration at the OSS/BSS level or you have to build it there separately from whatever you do to model and automate the fulfillment of “virtual device intent models”.  The OSS/BSS community, meaning the people in the operator organization who run OSS/BSS and those in the vendor community who provide the products, are largely in favor of staying the course, and those outside (and a few mavericks inside) favor a complete transformation of OSS/BSS to be an element of intent-modeled systems orchestrated by MANO-like principles.

This is the sort of “how-many-angels-can-dance-on-the-head-of-a-pin” question that most people outside the politics of the process get pretty frustrated with, and it’s easy to dismiss it completely.  The problem arises because operations efficiency is the most critical benefit that either SDN or NFV could present.  You pretty much have to be able to automate service operations in a virtual world, or you have a bunch of humans running around trying to fix virtual machines (maybe virtual humans would work?).  In financial terms, you have as much as 18 cents of every revenue dollar on the table as displaceable cost if you can fully automate service/network/IT operations.

Even “capex” is impacted here.  Not only does the ability to substitute virtual resources for real appliances depend on efficient operations, but operators also spend a lot of money on OSS/BSS.  For years their spending on capitalized operations software has exceeded their spending on the devices the software runs on.

Because operations efficiency is such a big piece of the benefit case for both SDN and NFV, it follows that it’s going to be difficult to make a broad case for either SDN or NFV without drawing on it.  To do that, you have to find an approach that unifies operations and reduces costs overall, which means uniting the political constituencies of NFV.

There is a ton of money at stake here.  “Orchestrable” OSS/BSS would let operators do best-of-breed shopping and integrate components easily.  That makes it anathema to current OSS/BSS vendors.  It would also facilitate the substitution of network resources and IT elements, making it less desirable to vendors in those spaces.  But it could make the business case when it’s very probable that without it, neither SDN nor NFV can succeed on a large scale in service provider networks.

So this is a deep issue, a deep division, an example of how an age-old organizational separation could become a barrier to facing the future correctly.  Right now, the core problem it creates at the technical level is that, lacking a unified force to change stuff in operations overall, we’re not addressing how the changes could be made or coordinated.  I hear stories of the right answers out there all the time, but you have to dig them out of a mass of vendor material and PoCs and demonstrations.

I’m glad we’re talking about this issue because it’s the most important issue in both SDN and NFV.  Fix this and we can fix everything else in time.  Fail to fix it and we’re out of time.

The Credibility of “New Revenue” to Drive SDN and NFV

If you’ve tracked both SDN and NFV carefully, as I have, you’ve probably noticed that the value propositions for both have shifted or evolved over time.  Service revenue increases are great, but you have to be able to justify them with some hard opportunity numbers.  Where are the brass rings with new SDN and NFV services?

One important point to make is that “new services” have to be broken down into the connection services and cloud services that I’ve talked about in prior blogs.  The reason this is important is that network operators have a natural place in the connection services market, with infrastructure, skills, brand, and so forth.  They’re trying to wrestle their way into the cloud or hosted-feature space, but they are not there yet.  That’s why operators tend to think of “new” services in terms of legacy stuff that’s tweaked somehow to be presented differently.

We’ve heard about turbo buttons and bandwidth on demand for decades now, and it’s totally true that you could do them with SDN and NFV.  You could also have done them without either technology, and still could.  The concept of elastic bandwidth has been difficult to promote in the real world, both because buyers don’t see a big value and sellers see a big risk.

Most companies size their VPNs and VLANs and physical-layer trunks based on their typical traffic needs and over-engineer a bit to accommodate peaks.  I surveyed users about this practice for years and they were comfortable with the approach.  Yes, about two-thirds of buyers said they thought that having some elasticity to better accommodate bursty traffic would be nice, but what was interesting is that they had essentially zero willingness to pay for it.  In fact, nearly all the users who wanted elastic bandwidth wanted to reduce current spending by downsizing capacity on the average and boosting it only during peaks.  That’s why operators finally realized this was a less-than-zero-sum game for them.
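A made-up example shows why the game is less than zero-sum for the seller (the prices and usage profile below are invented for illustration, not survey data):

```python
# Why elastic bandwidth is a less-than-zero-sum game for the operator.
# A buyer who over-provisions a flat 100 Mbps pipe today instead buys
# 60 Mbps average and bursts to 100 Mbps 5% of the time.

RATE_PER_MBPS = 3.0       # assumed flat monthly price per Mbps
BURST_PREMIUM = 1.5       # assumed price multiplier on burst capacity

flat_revenue = 100 * RATE_PER_MBPS

elastic_revenue = (60 * RATE_PER_MBPS                            # base capacity
                   + 0.05 * 40 * RATE_PER_MBPS * BURST_PREMIUM)  # burst usage

print(f"flat:    ${flat_revenue:.2f}/month")      # 300.00
print(f"elastic: ${elastic_revenue:.2f}/month")   # 189.00

# Even with a hefty burst premium, the buyer's downsizing swamps the
# operator's burst upside.
```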

One solution to this problem being proposed today is what might be called “extranets”, meaning network relationships among companies rather than within them.  Traffic levels here are more variable, and few companies support extranet applications with fixed network services.  But few companies say they have significant extranet traffic, and most companies who do “extranetting” today say that secure Internet VPNs are the best solution.

OK, what this tells me is that there is little credibility that extensions to connection services based on SDN or NFV could really add much in the way of revenue.  You might be able to frame offerings differently if you had already completed an SDN/NFV infrastructure transition, but the benefits would fall short of justifying the change.

Revenue from connection-related features (the vCPE model) is also difficult to justify on a large scale (though I think a business case for vCPE can be made in other ways).  The problem is that credible revenue opportunities from vCPE are really limited to current or prospective carrier-Ethernet sites, meaning satellite sites of multi-site businesses.  These sites can’t be sold one-off; you have to sell the HQ locations.  There, the most credible connection-related VNF-hostable feature—security—has long been considered and addressed via CPE.  Yes, SMBs don’t fit this model and may be extremely interested in managed services, but with some notable vertical-market exceptions it has proved difficult to sell to these communities because the average revenue per user is small compared to the cost of sales/marketing.

Before we throw in the towel on new service revenues as an NFV justification, though, we need to examine what could change all this.  It wouldn’t change in a heartbeat, to be sure, or without some extensive marketing by vendors, but it could change.  The best general name we could give the path to new-service success is the as-a-service-extension model.

We have static networking today because we visualize the network as being separate from the application, which means it serves the aggregate of applications.  This tends to level capacity and connectivity needs and also limits the extent to which a given application can have network services tuned to its needs.  We build application-independent, permissive-connectivity networks, then pay incrementally to add application awareness and access control.

The cloud teaches us (or should be teaching us) a lesson here, which is that network services can be specific to an application.  If you look at the virtual networking model of giants like Amazon and Google, it’s based on virtual-networking techniques that could easily create a whole vertical stack of virtual networks for a company, linked to workers and partners at any point where you find a suitable human (or machine).
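Here’s a rough sketch of what that “vertical stack of virtual networks” could mean in practice (all names invented): each application gets its own virtual network, and connectivity exists only where a user or partner has been explicitly attached.

```python
# Sketch of application-scoped virtual networks (names invented).  Instead
# of one permissive company network plus bolted-on access control, each
# application is its own network, and membership defines connectivity.

class VirtualNetwork:
    def __init__(self, app_name: str):
        self.app = app_name
        self.members = set()

    def attach(self, user: str):
        self.members.add(user)

    def can_connect(self, a: str, b: str) -> bool:
        return a in self.members and b in self.members

stack = {app: VirtualNetwork(app) for app in ("payroll", "crm", "extranet")}
stack["payroll"].attach("alice")
stack["crm"].attach("alice")
stack["crm"].attach("partner-bob")     # partners join only what they need

print(stack["crm"].can_connect("alice", "partner-bob"))       # True
print(stack["payroll"].can_connect("alice", "partner-bob"))   # False
```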

The easiest way to promote this model is with as-a-service in-cloud applications, and so SaaS trends would be a way to make this all happen.  In fact, every major network vendor who has an SDN strategy—Alcatel-Lucent, Cisco, Juniper—has the ability to do this now.  Since selling traditional boxes and security/application-awareness add-ons is a great business, though, we don’t hear much about this approach.

You could also support this from the inside, from the application side.  A virtual router product that can be hosted on a generic server or edge device could build this kind of model without the support of the big vendors.  Brocade could do this, for example, and you could create an open-source project to enhance any open switch/router to support virtual networking in this form.  Once you have it, you could start dialing down private virtual networks in today’s site-networking form.

Well, maybe.  One big barrier to this is regulatory uncertainties with respect to net neutrality interpretations.  If we tried to build something like this today, from the application out to the user, we’d almost surely have to adopt an Internet overlay connection model, and that would be easier if we could have SLAs on services.  SLAs are at least a close neighbor to fast lanes, and most operators are reluctant to jump into this space lest the services they create end up violating regulatory and public policy goals.

IoT might be another answer to the problem.  There is no rational model of IoT other than a big-data-and-analytics model that lives in the cloud, sucks in sensor data from any convenient source over any worthwhile connection technology, and then makes everything available under a policy-metered query umbrella.  An operator or even a big vendor could establish a model like this, and since the networking in the model is largely inside the IoT cloud or represents a simple access extension, you could do whatever you wanted with SLAs and network architecture without impacting the current services or running afoul of regulators.
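A toy sketch (everything here is hypothetical) shows the shape of that model: sensor data is ingested from any source over any access technology, and consumers reach it only through queries that are policy-checked and metered.

```python
# Hypothetical sketch of a "policy-metered query umbrella" for IoT.
# Sensor readings land in a shared repository; access happens only via
# queries that are checked against policy and metered for settlement.

from collections import defaultdict

REPOSITORY = defaultdict(list)    # sensor_class -> readings
POLICIES = {"traffic": {"public"}, "power-grid": {"utility"}}
USAGE = defaultdict(int)          # requester -> billable query count

def ingest(sensor_class: str, reading: dict):
    REPOSITORY[sensor_class].append(reading)   # any source, any connection

def query(requester: str, role: str, sensor_class: str):
    if role not in POLICIES.get(sensor_class, set()):
        raise PermissionError(f"{role} may not read {sensor_class}")
    USAGE[requester] += 1                      # meter for billing
    return REPOSITORY[sensor_class]

ingest("traffic", {"sensor": "i95-mm12", "speed_mph": 48})
print(query("nav-app-1", "public", "traffic"))
```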

Cloud services are another dodge.  Inside a SaaS or perhaps even PaaS envelope, you could create pathways that were similar to those within a mobile network or CDN, both exempt from neutrality regulations.  Now all sorts of intra-cloud high-value services could be presented and procured.

The point here is that it’s probable that new service revenues won’t present a widespread benefit for either SDN or NFV justification unless you can do something to increase their scope of impact, which my two examples would do.  The question would then be whether that broader scope was seen by both operators and vendors as too much change, too much risk.

We can justify SDN and NFV for some operators with new service revenues today, but only on a limited scale.  For the big justification, for the benefits that can build out enough infrastructure to make it easy to add on services and features as the market demands, we’ll need something else.  Opex is all that’s left, so we’re back to the same point—you’ll need exceptional opex automation to make either SDN or NFV work.

Virtual CPE for NFV: Models, Missions, and Potential

There is no doubt that virtual CPE is the most populist of the NFV applications.  There are more vendors who support it than any other NFV application, and more users who could potentially be touched by it.  vCPE is also arguably the most successful NFV application, measured by the number of operators who have actually adopted the model and offered services.  So how far can it go, and how would it get there?

All the popular vCPE applications are based on connection-point services, meaning services offered to users ancillary to their network connections.  Things like firewalls, NAT, IPsec, and similar higher-layer services are the common examples.  These services have the advantage of being almost universally in demand, meaning that in theory you could sell a vCPE service to anyone who used networking.  Today, they’re provided either through an access device like an Internet router, or in the form of discrete devices hooked to the access device.

While all vCPE applications involve hosting these connection-point functions rather than ossifying them in an appliance, they don’t all propose the same approach.  Two specific service models have emerged for vCPE.  One, which I’ll call the edge-hosted model, proposes to host the vCPE-enabling virtual network functions on devices close to or at the customer edge.  This group includes both vCPE vendors (Overture, RAD, etc.) and router vendors who offer boards for VNF hosting in their edge routers.  The other, the cloud-hosted model, would host VNFs in a shared and perhaps distributed resource pool some distance from the user connections.  That model is supported by server or server platform vendors and by NFV vendors with full solution stacks that include operations support and legacy device support.

The edge-hosted model of vCPE can generate some compelling arguments in its favor.  Most notably, something almost always has to be on the customer premises to terminate a service.  For consumer networking, for example, the user is nearly certain to need a WiFi hub, and SMBs often need the same.  Even larger user sites will normally have to terminate the carrier service on something to provide a clean point of management hand-off.  Given that some box has to be there, why not make the box agile enough to support a variety of connection-point services by hosting them locally?  This approach seeks to magnify service agility and eliminate truck rolls to change or add premises appliances when user feature needs change.

For many CFOs, the next-most-compelling benefit for edge-hosted vCPE is the synchronized scaling of revenue and cost.  If you sell a customer an edge-hosted strategy, you send the customer an edge-box.  You incur cost, but you have immediate revenue to offset it.  Cloud-hosting the same vCPE would mandate building a resource pool, which means that you’re fronting considerable capex before you earn a single dollar (operators call this first cost).  The broader the means of marketing the service to prospects, the more useful this first-cost control is.  That’s because the size of that initial resource pool is determined by the geographic breadth of the prospect base; you have to spread hosting points to be at least somewhat close to the user or network cost and complexity will drive your business case into the dust.
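A toy cash-flow comparison (all figures invented) shows what first cost means in practice:

```python
# Toy "first cost" comparison.  Edge-hosting buys a box per customer as
# revenue arrives; cloud-hosting fronts a resource pool before the first
# customer pays.  Every number here is invented for illustration.

EDGE_BOX_COST = 400.0
POOL_BUILD_COST = 50_000.0      # assumed minimum viable distributed pool
MONTHLY_REVENUE = 80.0          # per customer

def cumulative_cash(customers_per_month: int, months: int, edge: bool) -> float:
    cash = 0.0 if edge else -POOL_BUILD_COST
    customers = 0
    for _ in range(months):
        customers += customers_per_month
        if edge:
            cash -= customers_per_month * EDGE_BOX_COST  # buy boxes as you sell
        cash += customers * MONTHLY_REVENUE
    return cash

for months in (6, 12, 24):
    print(f"{months:>2} months: edge {cumulative_cash(10, months, True):>10.2f}"
          f"  cloud {cumulative_cash(10, months, False):>10.2f}")

# The edge model's hole is shallow and per-customer; the pool model digs a
# hole whose depth scales with the geography you must cover up front.
```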

The next plus for edge-hosting is that management is simplified considerably.  Here you have customers and customer-service people used to managing real devices that perform the connection-point functions.  If an edge-hosted vCPE strategy is used, the platform software in the edge-host can make the collection of functions look like a real device, and it’s simple to do.  There are no distributed, invisible, complicated network/hosting structures whose state must somehow be related to the functional state of the user’s connection-point service.  There’s no shared hosting to consider in SLAs.  All the stuff needed is in the same box, dedicated to the customer.

The final point in favor of edge-hosted vCPE, just now being raised, is that it considerably simplifies the process of deploying virtual functions.  Where does the customer’s VNF go?  On their edge device.  No complex policies, no weighing of hosting cost versus network connection cost.  There are twenty-page scholarly papers on how to decide where to put a VNF in a distributed resource pool.  What would implementing such a decision cost, and how would it impact the economies of shared resources?  Why not punt the question and put everything on the customer edge?
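For contrast, here’s the kind of decision (vastly simplified, with invented cost weights) that cloud-hosted vCPE must make for every VNF, and that the edge-hosted model simply punts:

```python
# A vastly simplified version of the placement decision those scholarly
# papers formalize: pick the hosting point that minimizes hosting cost
# plus network cost.  Sites and weights are invented for illustration.

SITES = {
    "customer-edge": {"hosting": 9.0, "net_to_user": 0.0},
    "metro-dc":      {"hosting": 4.0, "net_to_user": 2.5},
    "regional-dc":   {"hosting": 2.0, "net_to_user": 6.0},
}

def place_vnf(sites: dict) -> str:
    return min(sites, key=lambda s: sites[s]["hosting"] + sites[s]["net_to_user"])

print(place_vnf(SITES))   # -> metro-dc in this toy example

# The edge-hosted model hard-codes "customer-edge" and skips both the
# optimization and the operational machinery needed to act on it.
```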

The obvious argument against edge-hosted vCPE is the loss of shared-resource economies.  If we presume that network operators follow through on plans to create a large number of NFV data centers to host functions, the cost of hosting in these data centers would be lower than hosting on the customer premises.  In addition, the service could be made more resilient through resource substitution, which is difficult if your resource is hanging on the end of an access line.

According to operators, though, the big problem with edge hosting isn’t that it’s more expensive because you don’t share resources among users.  The big problem is that it’s totally non-differentiated.  You don’t even need NFV to do edge-hosted vCPE, because you do little or nothing of the orchestration and optimization that the ETSI ISG is focused on.  Any credible vendor could offer edge-hosted vCPE by partnering with the VNF players, who, as we know, will partner with nearly anyone who’s vertical and not on life support.  Instant, infinite competition?  Who wants that?

This point leads to the deeper problem, almost as profound.  It’s hard to see how basic edge-hosted vCPE leads anywhere.  If network functions virtualization has utility in a general sense, then it would make sense to pull through the critical elements of NFV early on, with your first successful applications.  How do you do that when your application doesn’t even need NFV?  And given that lack of pull-through, how do you ever get real NFV going?

Some of the smarter edge-hosted vCPE vendors recognize these issues and have addressed them, which can be done in two ways.  First, you could build real NFV “behind” your vCPE approach so that you could interchangeably host in the cloud.  That would require actually doing a complete NFV implementation; it’s what Overture does, and RAD announced an expansion of its own deployment and management elements just today.  Second, you could partner with somebody who offers complete NFV, which many of the edge-hosted vCPE players do.  Anything other than these approaches will leave at least some of the edge-hosting problems on the table.

A hybrid approach of edge- and cloud-hosted vCPE is by far the best strategy, but that model hasn’t gotten the play I’d have expected.  The reason is sales traction.  Overture has a very credible NFV solution but they’ve been a bit shy in promoting themselves as a complete NFV source, and even though they have Tier One traction they’re not seen as being on the same level as an equipment giant.  The partnership players seem to be stalled in the question of how the sale is driven and in what direction.  Many of the larger players who can make the overall NFV business case see edge-hosted vCPE as a useless or even dangerous excursion because they don’t make the edge gear themselves.

Overall, vCPE might be the only way that competing vendors can counter players like Alcatel-Lucent who have an enormous VNF opportunity with mobile/content that they can ride into a large deployment.  Edge-hosting vCPE would let operators get started without a massive resource pool, and with proper orchestration and management elements it could at least be partially backed up or augmented with cloud hosting, even replaced as resource density in the cloud rises.  But it still depends on having some credible NFV end-game, and it’s still hard to deduce what even the best vendors think that is.

The Paths to an SDN/NFV Business Case

The issues I’ve raised on the difficulties operators experience making a business case for SDN and NFV are increasingly echoed by other sources, so I hope that at this point most objective people believe there is a problem here.  One question, then, is how business-case difficulties might impact SDN and NFV deployment and the future of both technologies.  And, of course, how it might impact vendors.  Another is what vendors could or are doing about the business case.

The best place to start might be with what could be expected to play out if we had a number of convincing players promoting a viable business case for SDN and NFV.  In such a situation, the immediate impact would be to create a validated long-term vision of what NGN would look like, both in terms of infrastructure and operations.  This vision would serve as a template to validate the discrete steps taken in the direction of SDN and NFV, which would make incremental (per-project, per-service) evolution practical.  If we knew what the end state we were building toward looked like and what benefits it would generate, we could take the steps that moved us furthest at the lowest risk, and both technologies would begin to fly.

What would happen if that general business case is not made?  Without it, individual projects would have to prove themselves out in a vacuum, without any broad benefits from infrastructure change as a whole.  On one hand that might seem to be a good thing; many vendors would like to promote their own narrow vision.  The problem is that both my modeling work and the input I’ve received from operators show that most of these narrow projects would not be able to deliver the benefits needed.  Worse, there’s no guarantee that these individual projects would even add up to a harmonious vision of infrastructure or operations, and we could end up with a bunch of silos—the thing we’ve been trying to get rid of for decades.

The big problems are operations and economy of scale.  NFV is a cloud technology that demands enough deployment to achieve hosting economies.  SDN is a transformation of cost that demands enough scope of deployment to make a consequential contribution to reduction in capex.  Both technologies have a very strong and not very often acknowledged dependence on operations automation to prevent the additional complexity from creating an opex explosion that would swamp all other benefits.  How do you accommodate operations of hybrid SDN/NFV/legacy infrastructure, a problem that’s ecosystemic by definition, when you can only drive benefits on a narrow service front?

The most popular NFV application, based on operator activity, is virtual CPE.  The most popular SDN application is data center networking.  vCPE in the form where a general device is placed on-prem and used to host VNFs is actually fairly easy to justify for some operators like MSPs, but it’s not easy to build the business case on a broad scale.  Host vCPE with service chaining in the cloud and it actually gets harder to justify.  And data-center SDN isn’t transformative to carriers.  It’s not even clear if it is to vendors.  Getting SDN out where it can build a complete service architecture would be transformative, but operators admit that nobody is really pushing that vision.

So what do you do?  You’ve got to broaden the benefit front, and the most obvious of our broadening options is the classic “killer app” problem.  You win in SDN or NFV if you can find a single application of the technologies that delivers enough scale and enough benefit to justify a pretty significant transformation that builds critical mass in both areas.  This then creates the “gravitational attraction” to pull in other services/projects to leverage what’s in place, and from that we end up with systemic SDN or NFV.

This killer-app approach is obviously most credible for SDN/NFV providers who have a credible candidate of this sort, and the market position to drive it.  The most obvious example is Alcatel-Lucent, who has one of the few credible holistic SDN/NFV positions and also has a commanding position in mobile infrastructure (IMS/EPC) and content delivery (CDN, IPTV).  No matter how many other VNFs and opportunities may be out there, few if any can hope to pull through enough SDN/NFV to establish that critical mass.  One mobile or content infrastructure win could establish enough scale in both infrastructure and operations to build a viable SDN/NFV platform on which other stuff can build.

What do you do if you don’t have pride of place in some critical service area?  You could (to quote the poet Milton) “stand and wait.”  You could forget specific services and think horizontally, or you could look for a new or emerging critical area to champion.

If some vendor eventually gets a critical mass in SDN/NFV, it’s unlikely they’ll be able to command the whole market (particularly in NFV).  Thus, you join everyone’s partner program to hedge your bets and hunker down until somebody makes a success of SDN/NFV large enough to establish convincing momentum, then you sell your heart out to that vendor’s customers.  For most NFV and SDN hopefuls, this is the only realistic choice because they lack the product feature scope to deliver a business case on any scale.  It’s hard to see how this approach would create a big winner—too many hungry mouths are waiting to be fed at the table of SDN/NFV success.

Obviously a subservient role like that isn’t appealing to a lot of vendors, and for those with strong product breadth (those who can make a business case with their own stuff), another option is to jump over the killer app and go for the critical horizontal element that will underpin all business cases.  That’s operations efficiency and service automation.  If you can show a decisive impact on opex, you can generate so big a benefit case that you can justify virtually any SDN/NFV service or project.  In fact, you could justify your service automation even without SDN and NFV.  Oracle seems committed to this path, and it would likely be the route of choice for OSS/BSS vendors like Amdocs.

This is a path any major vendor with a complete NFV strategy could take, but it would be harder for a smaller player.  Operators themselves are split on whether modernization of management and operations should preserve current tools and practices or have the specific goal of eliminating them in favor of a different model.  They’d probably trust a big player with a compelling story to do either of these things (literally either; they’d like to see a kind of elastic approach that can either reorganize things or replace them), but only a big one.

The final option is perhaps the most interesting.  Suppose the “killer app” is an app with no real incumbent?  Suppose there’s something with such profound impact that implementing it on SDN/NFV principles would create that critical mass of support for whoever does the implementing?  There is only one thing that fits, and it’s IoT.

IoT, as the mass-media hype says, could change everything, but that’s about as much as the mass media has right about it.  It won’t develop as many believe, meaning it won’t grow out of a bunch of new, directly-on-the-Internet, mobile/cellular-connected devices.  IoT is an abstraction—three in fact, as I suggested in a blog last week.  The center of it is a big-data-and-analytics amalgamation that will be cloud-hosted as a vast array of cooperative processes.  From it will come a series of cloud-, NFV-, and SDN-enabling changes that will transform mobility, content delivery, and networking overall.  It’s not that “things” will be the center of the universe, but that thing-driven changes will envelop the rest.

The good news for SDN/NFV vendors is that IoT in itself could justify both SDN and NFV and drive both into rampant adoption.  The bad news is that if SDN and NFV have a problem today it’s that they’re too big to confront effectively and IoT dwarfs them both in that sense.  Vendors who are presented with giant opportunities tend to see giant costs and interminable sales cycles, and somebody will inevitably blink and think small.

It’s hard to see how IoT could become a credible driver for SDN/NFV as things stand, because the link to either could be created only when a credible IoT architecture could be drawn and the role of the two technologies identified.  If we diddle for years on the former, SDN/NFV’s fate and leadership will be settled before IoT even arrives.  Thus, it’s hardly surprising that no vendor has stepped forward with a compelling story here.

Here’s where we are, then.  We have limited candidates to pin SDN/NFV business hopes on because there are only a limited number of full-spectrum NFV solutions out there.  Only a vendor who can implement everything needed for SDN/NFV can make the business case.  Long-time functional leaders Alcatel-Lucent, HP, Huawei (a maybe, because I don’t have full detail on their approach), Oracle, and Overture Networks have now in my view been joined by Ciena after its Cyan acquisition—if they follow through on their commitments.  So, optimistically, we have six total-solution players.

We have two (Alcatel-Lucent and Oracle) committed to something that could lead to a business case—the killer app of mobile/content in the former case and operations revolution in the latter.  Neither, though, has delivered an organized proof of their ability.  That may be because they’re holding their cards close for competitive reasons, because the business case is too operator-specific in details to make a general document useful, or because they can’t do it yet.  The same issues could be holding back those players who have no visible commitment to an approach.  Maybe they are holding back, or maybe they are just hoping none will be required.

For all the NFV hopefuls, time is short.  Operators need to respond to what they say is a 2017 crossover between revenue- and cost-per-bit.  To do that they’ll need to get something into trial by mid-2016, and the trial will have to prove a business case with an impact large enough to influence their bottom line.  I think SDN and NFV, properly presented, could do just that, but as operators have told me many times “We have to hear that from vendors who can prove the business case!”  They’re as anxious to know who that could be as I am.