The Difference Between Software-Hosted and Software-Defined

I doubt anyone would disagree if I said that we have a strong tendency to oversimplify the impacts of changes or new concepts in networking.  It’s a combination of network users’ desire to focus on what they want, not how to get it, and the media’s desire to turn every development into a one-liner.  The problem is that this tendency can result in under-appreciation of both issues and opportunities.

Network connectivity at its core is pretty simple.  You squirt a packet into the network and based on addressing it’s delivered to the correct destination.  If you look at this level of behavior, it seems pretty easy to convert this to “virtual switching and routing” or “white box” or “OpenFlow”.  The problem is that the real process of networking is a lot more complicated.

To start with, networks have to know where that correct destination is.  That’s done today through a discovery process that involves a combination of finding users at the edge and then propagating knowledge of where they are through the network so each device knows where to send stuff.  In theory I could eliminate discovery completely but if I do that then I have to tell “the network” the exact location of every user and work out how to reach them.  The more users and the more routes, the more difficult it would be.

Discovery is part of what’s usually called “control plane processes”, which means it’s supported by protocols that communicate with the network rather than with users.  There are a decent number of other control-plane processes out there.  “Ping” and “traceroute” are common examples.  If we were to eliminate control-plane processes then these protocols wouldn’t work if users exercised them.  If we don’t eliminate them then we have to spoof in some way what the protocol would do.  Do we let users discover “real” routes in an SDN/OpenFlow network using traceroute or do we show them a virtual route?  And whatever choice we make we have to introduce a process somewhere that delivers what’s expected.
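
To make that concrete, here’s a minimal Python sketch of the decision an OpenFlow-style network has to make when a user runs traceroute.  Every name in it is illustrative and no real controller API is implied; the point is only that somebody has to choose between showing the real path and showing a virtual one, and something has to generate the answer.

    # Minimal sketch: spoofing traceroute in a forwarding-only SDN network.
    # All names are illustrative; no real controller API is implied.

    VIRTUAL_PATH = ["cpe-gateway", "virtual-router", "service-edge"]   # what the service model promises
    REAL_PATH = ["of-switch-12", "of-switch-07", "of-switch-03"]       # the actual forwarding elements

    def hop_to_report(ttl, expose_real_topology=False):
        """Pick the hop name to return for a probe whose TTL expired at this depth."""
        path = REAL_PATH if expose_real_topology else VIRTUAL_PATH
        return path[min(ttl, len(path)) - 1]

    if __name__ == "__main__":
        for ttl in (1, 2, 3):
            print(ttl, hop_to_report(ttl))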

Then there’s management.  At a simple level, users would typically expect to have some management access to network elements, something like an SNMP port.  So now what do we give them in the world of SDN or NFV?  The “real” network element they expect, like a fully functioning router/switch or a firewall, isn’t there.  Instead we have a collection of distributed functions hosted on computers that the user has no idea are in place, connected by virtual links or chains that used to be internal data paths on a device, and representing shared facilities that the user can’t be allowed to control even if they knew how.

If we look at the implications of this on “virtual routing” we see perhaps for the first time the difference between two solutions we hear about all the time but never seem to get straight.  “SDN” virtual routing using OpenFlow doesn’t have real routers.  There are no embedded features to provide control-plane responses or management features.  It forwards stuff, and if you want more you have to use almost-NFV-like function hosting to add the capabilities in.  Non-SDN virtual routing (Brocade’s Vyatta stuff is a good example) is a virtualized, hostable, router.  It’s a real router, functionally, and you can build networks with it just like you’d do with router devices.  You have no issues of control or management because your virtual router looks like the real thing—because it is.

The first important conclusion you can draw from this is that the more you expect to have virtualized network elements represent every aspect of the behavior of real devices, the more you need a hosted software version of the real device and not a “white-box” or forwarding-level equivalent.  Strictly speaking, the Brocade virtual routers are neither SDN nor NFV.  They’re software routers and not software-defined routers.  That, I’d assert, is true with NFV too.  If we have to present a virtual CPE element exactly like real CPE would look at all control and management levels, then we need to have an agile box on the premises into which we can load features, not a cloud-distributed chain of features.

This might sound like I’m saying that SDN and NFV are intrinsically crippled, but what I’m really saying is that we’re intrinsically crippling them.  The fallacy in our thinking is that we have to “represent every aspect of the behavior of real devices.”  If we toss that aside we could imagine a lot of interesting things.

Suppose we had a network with no control-plane packets at all.  Suppose that any given network access port was disconnected until we sent an XML message to a management system that provided ownership credentials and the address mechanism to be used there.  Suppose that a user who wanted NaaS simply provided the network with an XML manifest of the points to be connected using logical addresses, real addresses, any addresses.   We can now define a NaaS on the basis of completely managed connectivity.
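
Here’s a minimal Python sketch of what such an XML manifest might look like.  The element and attribute names are pure invention on my part, not any standard; the point is only that the service is defined by a declared manifest rather than discovered.

    # Minimal sketch of a NaaS connection manifest; element names are invented.
    import xml.etree.ElementTree as ET

    def build_manifest(owner, credentials, endpoints):
        """Build an XML request asking the network to connect a set of access points."""
        root = ET.Element("naas-request", owner=owner, credentials=credentials)
        for ep in endpoints:
            ET.SubElement(root, "endpoint",
                          port=ep["port"],                  # the access port to activate
                          address=ep["address"],            # logical or real, caller's choice
                          address_type=ep["address_type"])
        return ET.tostring(root, encoding="unicode")

    print(build_manifest(
        owner="acme-corp",
        credentials="token-123",
        endpoints=[
            {"port": "metro-7/slot-2", "address": "site-a", "address_type": "logical"},
            {"port": "metro-9/slot-5", "address": "10.0.0.2", "address_type": "ipv4"},
        ]))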

Suppose that we define management processes as a data relationship set that binds the status of real resources to the status of the NaaS services these resources support.  We can now find out what the state of a network service is without ever giving direct access to the devices, and we can use the same MIBs and processes to manage hosted software elements or real appliances.

Network protocols have created those velvet ribbons that tie us to stuff without our realizing it, and the way that we use these protocols in the future will have an enormous impact on the extent to which we can optimize our use of SDN or NFV.  We should be able to define services in every respect in a virtual world, and our changes in definition shouldn’t have an enormous impact on cost or operations efficiency.  IP should not be the underlying connection/forwarding fabric—that should be the simple OpenFlow-like packet forwarding based on a mystical forwarding table.  We should be able to build IP, or Ethernet, or any arbitrary collection of network services, equally well on this agile base.

Why can’t we?  Because it’s complicated, and so everyone focuses on making something far more trivial work quickly.  That generates little islands or silos, but it doesn’t build a network and networks are what we’re supposed to be doing.  I think that both SDN and NFV can do a lot more for us than we think, if we start thinking seriously not only about what networks do but how they do it.

What We Can Learn From Chambers’ White-Box-is-Dead Comment

John Chambers said a while ago that the “white box” players were dead and that Cisco had at least helped to kill them.  This is the sort of Chamberesque statement that always gets ink, but it’s also the sort of statement we have to dig into.  “News” means “novelty” not “truth”.

The whole white-box thing was in large part a creation of the hype engine (media, analysts, and VCs) linked to SDN and later NFV.  The idea was that SDN and NFV would displace proprietary network devices.  Since software has to run in something, the “something” left behind when the last vestiges of proprietarism were stamped out was white (for purity, no doubt) boxes.  SDN, in particular, was seen as the launch point for a populist revolution against big router/switch vendors.

Why didn’t this work?  Well, let me start by posing two questions to you.  First, “What brand of car do you drive?”  Second, “What brand of milk do you buy?”  You likely have a ready answer to the first and no idea on the second.  The reason is that milk is a commodity.  White-box switches are by definition featureless commodities, right?  One instance is as good as another.  So let’s now pose a third question.  “Who spends money to advertise and promote a featureless commodity?”  Answer, “Someone soon to be unemployed.”

Not good enough?  Let’s look at another angle, then.  You’re going into your CFO’s office to buy a million dollars’ worth of network goodies.  “Who makes it?” asks the CFO.  Choice 1:  “Cisco.”  Choice 2:  “I don’t know, it could be anybody.”  Which answer is best?

Still not good enough?  Third slant, then.  You’re called into the CEO’s office because the network is down and nothing that was supposed to be working is working.  CEO’s question:  “Well, who do our lawyers call to file the suit?”  Choice 1:  “Cisco.”  Choice 2:  “Gosh, I think their name is ‘FlownAway Switching’ this week but it’s changed a couple times since we installed.”  How far into orbit does the CEO go for each of these choices?

There are solid facts behind the whimsy here.  Companies want stuff to work, and most would admit they could never hope to test every choice fully.  They rely on “reputation”, which means that the name is widely publicized, the product concept seems to be market-accepted, and the vendor is large enough to settle a lawsuit if it comes to that.

The point here is that it’s true in one sense that Cisco killed the white-box movement.  Had Cisco spent its marketing money and skill promoting featureless, commodity, switch/routers, the movement would have succeeded—and Chambers would have been gone a lot quicker.  Incumbents will never fund their own demise, so the deep truth is that natural market forces killed white boxes, for now at least.

These same market forces impact SDN and NFV more broadly.  One of the interesting things about NFV, for example, is that the standards-science-and-technology teams tend to like network vendor implementations of NFV while the CIO and CFO tend to like computer-vendor implementations best.  The reason is simple; you want NFV to be offered by a vendor with a lot of skin in the game.  If infrastructure transformation through NFV means a shift to servers, why not pick a server vendor?  They have more to gain and lose.

I think that announcements of server products or partnerships by Alcatel-Lucent and Ericsson reflect this truth, and also the flip side, which is that if NFV is going to be a consultative sell (and if it’s not then I don’t know what would be!) it makes sense that vendors would be more inclined to stay the course if they could benefit strongly from a successful outcome.  Think of an orchestration player going through all the hoops of NFV orchestration and management validation to get perhaps a couple million out of a multi-billion-dollar deployment.

I also think there are support issues here.  NFV and SDN both present the operators with a dilemma.  On one hand they are all committed to a “no vendor lock-in” position and extremely wary about having a vendor create one or more NFV silos.  On the other hand they’re wary about being gouged on integration and professional services.

What they should want is what you could characterize as “open product suites”, meaning that they want SDN and NFV to be based on principles/architectures that at least open up the areas where most product spending will happen.  Thus, perhaps the most important single piece of NFV is what I’ve been calling the “Infrastructure Manager”, a subset of which is the ETSI VIM.  If every piece of current or future hardware that’s useful in service deployment can be represented by an IM and orchestrated/managed, then operators have openness where it counts.

Getting to that point is really about the MANO/IM interface, just as getting to a universal and open vision of SDN is really about how the northbound interfaces talk with whatever is northbound.  An SDN controller is a broker for NaaS.  You need to be able to describe the connection model you’re looking for in a totally implementation-independent way, and then translate that model into network setup for whatever is underneath.
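
A minimal sketch of that broker role, with invented handler names: the request describes only the connection model and endpoints, and a per-domain handler turns it into whatever setup the underlying network actually needs.

    # Minimal sketch: an implementation-independent connection request handed to
    # whichever southbound handler the underlying network requires.  Names are
    # illustrative only; no real controller or protocol API is implied.

    def openflow_handler(request):
        return "install forwarding entries linking " + ", ".join(request["endpoints"])

    def legacy_mpls_handler(request):
        return "provision an LSP mesh among " + ", ".join(request["endpoints"])

    HANDLERS = {"openflow-domain": openflow_handler, "mpls-domain": legacy_mpls_handler}

    def broker_naas(request, domain):
        """The controller as NaaS broker: same abstract request, different realization."""
        return HANDLERS[domain](request)

    request = {"model": "multipoint-LAN", "endpoints": ["site-a", "site-b", "site-c"]}
    for domain in HANDLERS:
        print(domain + ": " + broker_naas(request, domain))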

That works with NFV too, but it’s more complicated because NFV deploys functional elements and connects things, so the scope of what has to be modeled is greater.  People have proposed YANG as a modeling language for NFV MANO interfaces, but YANG is best used as a connection model.  We have some insight into what process models should look like with OpenStack, but OpenStack also has connection models (via Neutron) and these models are currently limited to things that cloud applications would want to set up, which means IP subnets and VPNs and so forth.

It’s the management of these abstractions that creates the challenge for both, however.  If I have an intent model in abstract, and I’m going to realize that intent through resource commitments, then I need to have three things.  First, I need to know what the management of my intent model would look like.  Yes, I have to manage it because that intent model is what the customer is buying.  Second, I have to know what the management of the committed resources is like, because you can’t fix virtual problems, only the real resources behind them.  Finally, I have to know the derivations or bindings that describe how intent model management can be derived from the real resource management.
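
Here’s a minimal sketch of those three things as data plus a derivation step.  The field names and the derivation rules are placeholders of my own; a real binding would be whatever the service model specifies.

    # Minimal sketch of intent-model management: an intent-level MIB, the MIBs of
    # the committed resources, and bindings that derive one from the other.
    # Field names and derivation rules are placeholders, not any standard.

    resource_mibs = {
        "vm-0042":   {"state": "up", "latency_ms": 3.0},
        "vm-0107":   {"state": "up", "latency_ms": 5.0},
        "tunnel-88": {"state": "degraded", "latency_ms": 11.0},
    }

    bindings = {
        "service_state": lambda mibs: "up" if all(m["state"] == "up" for m in mibs.values()) else "degraded",
        "end_to_end_latency_ms": lambda mibs: sum(m["latency_ms"] for m in mibs.values()),
    }

    def intent_mib(resource_mibs, bindings):
        """Derive what the customer sees (the intent model) from real resource status."""
        return {name: rule(resource_mibs) for name, rule in bindings.items()}

    print(intent_mib(resource_mibs, bindings))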

Where is all this stuff in SDN and NFV?  Nowhere that’s even proximate to utility.  Thus, we’re forced to presume that if SDN and NFV are adopted, current management practices and tools will be less useful because we’re losing the ability to manage what we’re selling, or at least disconnecting that from the management of the real resources used for fulfillment.  We can’t do fault tracking when we can’t see the connection; we can’t present a user who bought a firewall or a VPN with the need to manage real resources that wouldn’t in the user’s mind even be part of such a service.  Virtualization, to work, must be transparent, and transparency can be achieved only by abstraction.

I see some signs that there are operators in the NFV process who agree with this position, and we may be able to fix things at the specification level.  The question is whether we could define a suitable abstraction-based vision of IM/VIM and its interface with MANO or northbound apps in time to test the full range of service lifecycle processes for NFV and SDN.  If not, we’re going to have a hard time putting together a convincing field trial, and a hard time ever making anything other than “current boxes” winners.

HP Boosts their NFV Position

HP ranks among, if not on top of, my selection of bona fide NFV providers.  Their OpenNFV architecture is comprehensive in its support for operations integration and legacy device control, both of which are critical to making an early NFV business case.  Now they’re taking on another issue, one that’s been raised in a number of places including recent LinkedIn comments on my blogs.  The issue is VNF integration.

What HP has announced is HP NFV System, which is a kind of NFVI in a box/package approach.  HP takes its own servers and its NFV Virtual Infrastructure Manager, which includes its carrier-grade OpenStack, pairs them with Wind River platform software and HP’s own resource management tools, and creates a handy resource bundle that can be packaged with VNFs to create either a complete service solution or a set of components that can build a variety of services.  Or, of course, all of the above.  The four “kits” that make up NFV System customize it for trial and standard missions and expand its capacity as needed.

Of course, NFV doesn’t need good strategies as much as it needs solutions to problems.  The problem that NFV System may help solve is the orphaning of VNFs in the whole NFV onrush process.  Everything about NFV is really about hosting VNFs and composing them into services, but some VNF providers and likely more prospective providers have a hard time figuring out how they engage at all.

VNFs, according to the ETSI model, run on NFV Infrastructure (NFVI).  This in turn is represented to the rest of NFV through a Virtual Infrastructure Manager or VIM.  Some of the latest models of NFV seem to want to make the VIM a part of MANO, which I disagree with.  A VIM, and the superset of the VIM that ETSI is kind of dabbling around recognizing (what we might call an Infrastructure Manager), are what resources look like to the other elements of NFV.  If you have an IM that recognizes your hardware then you have something that can be MANOed, so to speak.

That’s the principle NFV System is based on.  It’s an NFVI atom that should plug into a standard NFV implementation because it’s a VIM with conforming technology underneath.  That means that a software provider who wants to package their stuff for deployment could buy NFV System, add their VNFs, and sell the result.

This could invigorate the VNF space, which has so far been more posturing than substance.  Without a VIM, something that purports to be NFV-compatible is just hardware, and without a VIM to run through, software functionality isn’t a VNF; it’s simply a cloud application.  HP has given voice, perhaps, to the VNF community at long last.

I say “perhaps” because we are still facing what IMHO is an imperfect management model for NFV and also an incomplete or inadequate model for how MANO talks to a VIM/IM.

The problem with the NFV management model is one I’ve blogged about before; here let me confine myself to saying that there is no single explicit way that VNFs are pushed through the service lifecycle, only a set of options.  Even if those options are all workable (which I don’t believe to be the case), the problem is that there are multiple options and everyone won’t pick the same one.  Right now a VNF provider would likely have to provide some integrated internal VNFMs packaged with their stuff.  How those link to systemic management is still fuzzy.
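
As an illustration of the fragmentation risk, here are two hypothetical VNF package descriptors, one bundling its own VNFM and one expecting the operator’s management system to do the lifecycle work.  The descriptor fields are my own invention, not ETSI’s.

    # Two hypothetical VNF package descriptors illustrating divergent management
    # choices; descriptor fields are invented for illustration, not ETSI-defined.

    vnf_package_a = {
        "vnf": "vendor-a-firewall",
        "lifecycle_management": "embedded-vnfm",      # vendor ships its own VNF Manager
        "management_interface": "vendor-a-rest-api",  # systemic management must adapt to it
    }

    vnf_package_b = {
        "vnf": "vendor-b-firewall",
        "lifecycle_management": "external",           # expects the operator's generic VNFM
        "management_interface": "virtual-device-mib", # mimics the appliance it replaces
    }

    def can_onboard(package, supported_models):
        """An operator can only onboard packages whose management model it supports."""
        return package["lifecycle_management"] in supported_models

    for pkg in (vnf_package_a, vnf_package_b):
        print(pkg["vnf"], can_onboard(pkg, supported_models={"external"}))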

The model problem is simpler to describe and as hard or harder to solve.  Does a VIM represent only one model of deployment or connection?  That would seem wasteful, but if there are several things a VIM can ask for it’s not entirely clear how it asks.  There are examples of using YANG to describe service chains and OpenStack to host them.  But OpenStack Neutron recognizes a number of network models, so do we use that, or use YANG, or maybe something else entirely?  Without a firm view of what MANO tells a VIM to do, how do we know that a given VIM (and the underlying NFVI) plugs into a given MANO?

This isn’t HP’s problem to solve at the industry level, of course.  HP’s OpenNFV has a broader VIM/IM model than ETSI defines, and a much broader set of options for integrating management and operations.  NFV System would plug into HP’s own OpenNFV software easily, of course, and it would likely be at least more adaptable to a generic ETSI-compliant MANO than network functions off the shelf would be, given that there’s a VIM available.  The ETSI process isn’t plug and play with respect to VNFs, though, and HP alone can’t force a repair.

This is a really interesting concept for a number of reasons, the obvious one being the empowerment of VNF providers.  Some of the less obvious reasons may end up being the most compelling.

The first is that it highlights why VIMs need to be generalized to IMs and decisively made a part of NFVI and not part of MANO.  Anyone should be able to create a seat at the table for their VNFs by providing a suitable VIM to deploy them and suitable NFVI to run them on.  HP, by offering NFV System, may make it harder for the industry to dodge these points.

Next, we are significantly behind the eight-ball with respect to both how MANO describes a desired service to a VIM and how NFV and NFV-legacy hybrid services are managed.  HP happens to have a good approach to both, and by making NFV System available they’re showing the world that there are sockets and plugs in NFV that just might not connect.  Competition might induce ETSI or OPNFV to clean things up, or induce other vendors to respond with proprietary approaches.  Either is better than a “wing and a prayer.”

Finally, this may well be the first example of somebody coming out with commercial NFV.  This is a product, designed to support specific goals and to be used by specific prospective customers.  It may not be needed in a lab trial or PoC, or even be critical in early field trials, but it does show that HP is taking NFV seriously.  Given that an optimum NFV implementation would be the largest source of new data centers globally, that gives HP a big advantage in itself.

HP collaterally announced enhancements to its NFV Director orchestration product that make a good NFV strategy even better.  It’s worth citing the release on this:

HP NFV Director 3.0 provides enhanced management and orchestration (MANO) capabilities that streamline bridging of NFV and telecommunications resources, as well as enable faster VNF on-boarding. It combines operations support systems (OSS) and IT management capabilities in a comprehensive, multi-vendor NFV orchestration solution that automate service and infrastructure management to enhance the flexibility and scalability of CSPs current OSS systems.

Bridging telecom and NFV resources means supporting both legacy and NFV-driven infrastructure, critical in making a transition to NFV and to supporting realistic future services that will always draw at least some services from traditional equipment.  Faster VNF onboarding means getting functional value to operators faster.  Combining OSS and IT management in a comprehensive multi-vendor orchestration solution is what efficient operations means.  Director aligns HP even more with the critical early benefit drivers for NFV deployment.

I don’t have the details on all the latest Director stuff at this stage, but what’s out is impressive.  It’s certainly something that other vendors need to be worrying about.  Standards don’t seem likely to create NFV success, and a thoughtful blog by Patrick Lopez suggests that OPNFV may not be the magic formula for NFV success either.  I suggested all last week that it might be time to let vendor innovation contribute the critical pieces that standards haven’t provided.  Maybe HP is doing that, which could be good for the industry and very good for HP.

What Will Cisco-Under-Robbins Be Like?

I remember a Cisco before John Chambers but I suspect most in the industry today do not.  For those people it might seem a frightening prospect.  For some who have been disappointed by Cisco’s seemingly lackluster support of new initiatives like SDN and NFV, it may seem like an opportunity.  Obviously we have to see what will happen under Chuck Robbins, the new CEO, but there’s a strong chance it will be business as usual.  If it is, then Cisco is making a big bet.

At a time when networking was about technology and strategy, and when network equipment vendors were usually run by technology-strategy people, Chambers brought in the sales side.  He was always a salesman first, even as CEO of Cisco.  One (now-ex) Cisco VP told me “John was the master of tactics over strategy.”  That this worked in the enterprise market is hardly surprising, but that it also worked in the long-cycle-planning-dominated carrier market is more so.  The two questions all Cisco-watchers will ask are “Is Robbins going to stay the Chambers course?” and “Will what worked for Cisco in the past work in the future?”

It’s interesting to contrast Chuck Robbins with a previous heir-apparent, Charlie Giancarlo (another “Charles”).  Charlie was an intellectual, a strategist, a thinker and a technologist.  Chuck is a sales and channel guy in his past jobs with Cisco, a mover and shaker in those spaces but more a newcomer than some on the Street had expected.  The point is that on the surface Cisco has shied away from the strategists in favor of another tactician.  One might see that as a vote against change.

I think it’s a vote against radical change, but the jury is out on a broader implication.  There’s no reason for Cisco to rush to exit a game it’s winning.  Operators seem to be inclined to stay the course with respect to their switch/router vendors, a fact that’s benefitted Cisco and perhaps Juniper and Alcatel-Lucent as well.  Operationally and integration-wise, that’s the easiest course to take.  SDN and NFV could both ease the pain of a shift in vendor, product, or even architectural strategy down the line but we’d have to have those technologies in place to do that.  Cisco benefits from dragging its own feet, and even by actively suppressing motion to SDN and NFV where it can.

For now.  At some point, operators will have the tools they need to accomplish a shift in strategy, and the benefit case to drive the change on a broad basis.  One might be tempted to speculate that Cisco sees that point coming soon, and sees the need for Cisco to change faces to adapt to the new picture as a good time to change CEOs.  What better way to justify a shift in strategy?

Some of Chuck’s early comments about the “Internet of Everything” seem to be a specific commitment to the old Cisco, though.  Cisco’s primary marketing thesis for a decade has been “Traffic is going to increase, operators, so suck it up and buy Cisco stuff to carry it!”  The Internet of Everything is an example of that; so what if nothing that’s being done even today is profitable for operators, there’s more coming.  If a new CEO fossilizes this positioning it’s going to be hard to embrace logic and reality down the line.

Of course it’s impossible for a new CEO to sweep the Chambers Legacy down the tube.  I wouldn’t either.  I’ve talked with John a number of times and he’s an impressive guy.  In a different frame of reference he’s like Steve Jobs (who I’ve also talked with).  He understands the buyer, so even if Cisco were prepared to dis the legacy of their own chieftain (which clearly they won’t do), you can’t abandon a “sell them what they want” value proposition.

Which gets us back to the question of whether what they want is durable.  Cisco or a number of other vendors could promote a vision of future networking in which the current paradigms would be largely preserved for perhaps as long as a decade.  The trick is to establish the principle that operations efficiencies can cure all ills.  There’s no way anyone could say that Cisco would win in a future driven by the quest for reduced capex, after all.

Which is what competitors should be thinking about.  If Cisco needs the opex horse to ride to victory on, then competitors need to saddle it for their own gains.  Remember the numbers from the sum of operator surveys I’ve done.  Twenty years ago about a third of TCO was opex, and by the 2020s it appears that two-thirds will be opex.  Worse, if we focus on “marginal infrastructure” meaning the stuff that’s already being purchased, the average TCO is already almost two-thirds opex.  By 2020 it will be three-quarters.  Worst of all, the high-level services or service features have the highest opex contribution to TCO.

Ironically, Cisco has all the right DNA to be a leader in service automation.  From the very early days of DevOps, they favored a declarative model rather than a script-based approach, for example.  It’s my view that it’s impossible to do effective and flexible service automation any other way.  Cisco’s current approach, which is a policy-based declarative model, isn’t how I’d approach the problem (nor is it how I have approached it in past NFV work) but it’s at least compatible with SDN and legacy infrastructure.  Its failure lies in the fact that it’s not really addressing operational orchestration.

Which is where I think the brass ring is.  If you think about service automation, you realize that a “service” is a kind of two-headed beast, one representing the technical fulfillment and the other representing the business process coordination.  While it may be possible to automate technical tasks, it will be difficult to do that effectively if you don’t collaterally automate the service processes that deliver everything to the user and manage billing, etc.  You have to collaterally virtualize management to manage virtualization, in any form.

That’s the other point here.  We have ignored operations for the cloud just as fervently as we’ve ignored it for NFV and SDN.  Cloud services can’t squander resources to minimize faults and still be profitable, any more than you can do that with networking.

So here’s the challenge that Chuck Robbins faces.  Can you take Cisco’s decent-but-disconnected-and-incomplete story on service automation and turn it into something that addresses SDN, NFV, and the cloud?  If so, and if you kind of hand this area over to the UCS people, you have a shot at being what Chambers wanted Cisco to be—the number one IT company—and also attain a top spot in the network space.  Otherwise, you’re delaying a reckoning that will be only harder to deal with down the line.

Service PaaS versus Opto-Electrical Layer: Which Leads to NFV Success?

It’s nice to have a sounding-board news trigger to launch a discussion from, and Oracle has obligingly provided me that with its Evolved Communications Application Server.  This is a product that I believe is driven by the same industry trends that Alcatel-Lucent’s Rapport is, and potentially could deliver services that could compete with Google’s Fi.  The new services could be valuable.  A competitive response to Google could be valuable.  But is this path going to create value enough for the future?  That’s the question, and particularly relevant given the Ciena/Cyan deal announced today.

Operators face a double challenge in preparing for networking’s new age.  The first challenge is to find new services, and beat OTTs to leading with them.  The second challenge is validating hundreds of billions of dollars in sunk costs.  It’s easy to be an OTT.  It’s relatively easy to be an operator who offers OTT services.  It’s hard for operators to build next-gen services in a way that builds on their current investment, their current incumbencies.

Mobile and IMS are poster-children for this.  One of the ideas behind IMS was that it could support new applications that would extend basic mobile services.  However, mobile services today means “getting on the Internet”, which is hardly an extension of IMS capabilities.  Operators have longed for a practical way to make IMS into an OTT competitor.  Nothing has worked.

Google’s Fi is a shot across the bow of operators.  Coming from the OTT side, using an MVNO relationship as the basis, Google wants to create value-add without the IMS legacy.  Their position is that you can use 3GPP standards to control handsets below the service layer.  That’s not operator-friendly.

Both Oracle’s new ECAS and Alcatel-Lucent’s Rapport build on IMS, Rapport because it includes it and ECAS because it builds on VoLTE, which depends on IMS.  It’s interesting that Oracle talks about ECAS in VoLTE terms rather than in IMS terms; it shows how far IMS has fallen strategically.

Yeah, but IMS is still necessary for 4G services, so the issue of how to extend new services to 4G users remains.  I like the idea of building a kind of “service PaaS” that operators could build to (or buy into) in order to extend their investment in mobile services.  The question is whether this will lead operators somewhere useful.

Both ECAS and Rapport start with a “universal session” concept that allows calls and session services to be connected across WiFi or cellular and roam between the two (or among them).  Google Fi also has that.  Both ECAS and Rapport would, in my view, allow operators to build Fi-like competitive services and extend them.  All good so far.

The question that remains for both ECAS and Rapport is “what next”.   Both Alcatel-Lucent and Oracle point to the compatibility of their approaches with NFV.  To me that demonstrates that vendors and therefore the buyers think that NFV compatibility is a bridge to the future.  You can deploy elements of Rapport and ECAS with NFV-like orchestration.

The problem is that a “service PaaS” is still a PaaS, meaning that it’s a kind of silo.  If there are special features to be included in a service PaaS, even if they are deployed using NFV as VNFs, are these features open parts of an NFV architecture?  Would the operators’ investment in the platform be extensible to other services, particularly non-session services?

Maybe they would, but it seems likely that NFV would have to be extended explicitly to optimize the picture.  There’s absolutely no reason why NFV couldn’t deploy “platform services” that would become part of a service framework to be used by specialized applications.  What would be bad is if this framework created a different service lifecycle management process.  Operations has to be unified overall, across everything, or evolution to a new service infrastructure will be difficult.

The question may be critical for NFV because this year is the watershed for the concept in my view.  If we can’t promote an NFV vision that can prove ecosystemic benefits in a real field trial by year’s end, then operators may have to find another way to manage their profit-on-infrastructure squeeze.

The Ciena acquisition of Cyan might be a path to that.  You may recall that I’ve proposed the network of the future would be built on an agile opto-electrical foundation that would take over the heavy lifting in terms of aggregation and transport efficiency.  As this layer builds from optical pipes to path grooming at the electrical level, it could well cross a bunch of different technologies, ranging from fibers or wavelengths to tunnels and Ethernet.  Orchestration and operational integration would be very valuable, even essential, in harmonizing this structure.

But like our service PaaS, the vision of agile opto-electrical layers isn’t a slam dunk.  I’ve not been a fan of Cyan’s approach, considering it lacking in both scope of orchestration and in integration of management and OSS/BSS.  It does have potential for the more limited mission of orchestrating the bottom layer.  However, to fulfill even that mission there will have to be a significant improvement in operations integration.

Ciena had previously announced an NFV strategy of its own, and their opening statement on the proposed Cyan deal is worth reading (from their press release):

Cyan offers SDN, NFV, and metro packet-optical solutions, which have built a strong customer base that is complementary to Ciena. Cyan also provides multi-vendor network and service orchestration and next-generation network management software with advanced visualization. When combined with Ciena’s Agility software portfolio, Cyan’s next-generation software and platforms enable greater monetization for network operators through more efficient utilization of network assets and faster time-to-market with differentiated and profitable services.

This seems a pretty clear statement that the deal is driven in no small way by SDN and NFV, and I do think that the companies could combine to present a strong opto-electric agility story for the metro.  Such a story might not offer new services to operators like the Alcatel-Lucent and Oracle visions of a Service PaaS, but it might offer cost management and it has the advantage of being targeted at the place where operator infrastructure investment would naturally be higher—metro.

To me, these positions raise the critical question of cost-leads or revenue-leads.  Any contribution of new revenue or competitive advantage at the feature level would have a profound positive impact on the NFV benefit case.  However, benefits are harder to develop and socialize when they’re secured way above the infrastructure capabilities and corporate culture of the operators.  Both Alcatel-Lucent and Oracle seem to be working to ease operators into a service-benefit-driven vision by making the connection with mobile and IMS.  They bring to the table a solid understanding of service PaaS and mobile.  Ciena and Cyan bring a solid understanding of optical.  We may now be in a race to see which camp can create a solid vision of operations, because that’s where the results of either approach will be tested most.

How We Get to Where SDN, NFV, and Carrier Cloud Have to Go

In my blog yesterday I talked about the need for something “above” SDN and NFV, and in the last two blogs about the need for an architecture to define the way that future cloud and NGN goals could be realized.  What I’d like to do to end this week is flesh out what both those things might mean.

Networking today is largely built on rigid infrastructure where service behaviors are implicit in the characteristics of the protocols.  That means that services are “low level” and that changes are difficult to make because they require framing different protocol/device relationships.  Operators recognize that they need to evolve to support higher-level services and also that they need to make their forwarding processes more agile.  NFV and the cloud, and SDN (respectively) are aimed at doing that.

The challenge for operators at my “boundary” between technology and benefits has been in securing an alignment at reasonable levels of cost and risk.  We can define transformative principles but can’t implement them without a massive shift in network infrastructure.  We can contain risks, but only if we contain benefits and contaminate the whole reason for making the change in the first place.

What we need to have is an architecture for transformation that can be adopted incrementally.  We take steps as we want, but we remain assured that those steps are heading in the right direction.

Let me propose three principles that have to guide the cloud, NFV, and SDN in how they combine to create our next-gen network:

  1. The principle of everything-as-a-service. In the cloud and NGN of the future we will compose both applications and services from as-a-service elements.  Everything is virtualized and consumed through an API.
  2. The principle of explicit vertical integration in the network. OSI layers create an implicit stack with implicit features like connectivity.  In the future, all network layers will be integrated only explicitly and all features will be explicit as well.
  3. The principle of universal orchestration. All applications and services will be composed through the use of multi-level orchestration, orchestration that will organize functionality, commit resources, and compose management/operational behaviors.

You can marry these principles to a layered structure that will for all intents and purposes be an evolution of the cloud and is generally modeled on both SDN and NFV principles:

  1. The lowest layer is transport, hosting, and appliances, made up of physical media like fiber and copper, servers, and any “real” devices like data center switches. The sensor/control elements of the IoT and user access devices and mobile devices also live here.  Most capex will be focused on this layer.
  2. The second layer is the connectivity, identity, security, and compliance layer which is responsible for information flows and element relationships. This layer will be built from overlay protocols (tunnels, if you like, or overlay SDN).  You can think of it as “virtual networking”.
  3. The third layer is the feature/component layer where pieces of software functionality are presented (in as-a-service form) to be used by services and applications. This is where the VNFs of NFV, application components, and the product catalogs from which architects build stuff all live.
  4. The top layer is the service and application composition layer which builds the features and applications users pay for.

If we combine these principles and our structural view, we can propose a framework for an NGN implementation.

First, network and compute resources at the bottom layer will inevitably be compartmentalized according to the technology they represent, the vendor who provided them, and perhaps even the primary service target.  That means that the implementation of this resource layer (which NFV calls the NFVI) has to be visualized as a set of “domains”, each represented by its own Infrastructure Manager.  That manager is responsible for creating a standard set of resource-facing (in TMF terms) services.  There will thus be many IMs, and unlike the model of the NFV ISG, the IM is part of infrastructure.

The resource services of these domains are “exported” upward, where they join the second element of the implementation, which is a universal model of services and applications.  The purpose of the model is to describe how to deploy, connect, and manage all of the elements of a service from the domain resource services, through intermediary building-blocks that represent useful “components”, up to the retail offerings.  The important thing about this model is that it’s all-enveloping.  The same rules for describing the assembly of low-level pieces apply to high-level pieces.  We describe everything.
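
Here’s a toy Python version of that “describe everything” idea.  The names are invented, but the point is that a retail service, an intermediate component, and a domain resource service are all just nodes in the same structure, decomposed by the same rules.

    # Toy version of a universal service model: every node, from retail service
    # down to domain resource services, is described and decomposed the same way.
    # Names are invented; only the uniform structure matters.

    model = {
        "name": "business-vpn-with-firewall",
        "children": [
            {"name": "vpn-connectivity", "children": [
                {"name": "metro-domain.ethernet-path", "children": []},
                {"name": "core-domain.ip-vpn", "children": []},
            ]},
            {"name": "virtual-firewall", "children": [
                {"name": "cloud-domain.host-vm", "children": []},
                {"name": "cloud-domain.attach-vnic", "children": []},
            ]},
        ],
    }

    def deploy(node, depth=0):
        """Walk the model; leaves are resource services exported by domain IMs."""
        print("  " * depth + ("commit " if not node["children"] else "compose ") + node["name"])
        for child in node["children"]:
            deploy(child, depth + 1)

    deploy(model)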

One of the main goals of this model is to provide for service- and application-specific network connectivity that is built from a mixture of domain resource services (like fiber pipes) and virtual and completely controllable switching and routing.  Every application and service can have its own connectivity model or can share a model, so the scope of connectivity can be as refined or large as needed.  This layer is based on an SDN, virtual routing, and virtual switching model and I’d expect it would use an overlay-SDN protocol on top of traffic engineered paths and tunnels (used as resources from below).

Above this, we have a set of functions/components that can be harnessed to create that “higher-layer” stuff we always hear about.  Firewall, NAT, and other traditional NFV-service-chain stuff lives here, but so do the components of CRM, ERP, CDN, and everything else.  Some of these elements will be multi-tenant and long-lived (like DNS) and so will be “cloud-like”, while others will be customer-specific and transient and the sort of thing that NFV can deploy.  NFV’s value comes in what it can deploy, not what it does as service functionality (because it doesn’t have any).

Applications and services are at the top level.  These are assembled via the model from the lower components, and can live persistently or appear and disappear as needed.  The models that define them assemble not only the resources but also the management practices, so anything you model is managed using common processes.

Users of this structure, consumer or worker, are connected not to a service with an access pipe, but to a service agent.  Whether you have a DSL, cable, FiOS, mobile broadband, carrier Ethernet, or any other connection mechanism, you have an access pipe.  The things you access share that pipe, and the service agent (which could be in-band control protocol driven or management API driven) would help you pick models for things you want to obtain.
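
A minimal sketch of the service-agent idea, with an invented catalog: the access pipe is fixed, and the services sharing it are attached on request rather than being implicit in the connection.

    # Minimal sketch of a service agent: the access pipe is fixed, the services
    # sharing it are attached on request.  All names are illustrative.

    CATALOG = {
        "internet": "best-effort IP connectivity",
        "corporate-vpn": "overlay VPN to enterprise sites",
        "iot-telemetry": "low-rate sensor collection service",
    }

    class ServiceAgent:
        def __init__(self, access_port):
            self.access_port = access_port
            self.attached = {}

        def list_offers(self):
            return CATALOG

        def attach(self, service):
            """Bind a selected service model to this user's access pipe."""
            self.attached[service] = CATALOG[service]
            return f"{service} attached on {self.access_port}"

    agent = ServiceAgent("fios-port-1138")
    print(agent.list_offers())
    print(agent.attach("corporate-vpn"))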

Universal orchestration orchestrates the universal model in this picture.  The purpose of the model is to take all current service-related tasks and frame them into a data-driven description which universal orchestration can then put into place.  Management tasks, operations processes, and everything related to the service lifecycle create components at that third layer, components orchestrated into services just as functional elements like firewalls or CRM tools would be.

I don’t know how farfetched this sounds, but I believe you could build this today.  I also think that there are four or five vendors who have enough of the parts that with operator “encouragement” they could do enough of the right thing to keep things moving.  Finally, I think that any test or trial we run on carrier cloud, SDN, or NFV that doesn’t explicitly lead to this kind of structure and cannot demonstrate the evolutionary path that gets there is taking operators and the industry in the wrong direction.

Why Crossing the Benefit Border is So Hard

Yesterday I blogged about the current state of our technology-side revolutions in telecom—SDN, NFV, and the cloud.  All three of these have taken a bottom-up approach to solving the problems of the industry, and while it’s premature to say that any have failed it’s certain that none have succeeded either.  The reason why, I suggest, is that no matter where you start the problem of “border crossing” remains, and I’d like to dig a bit deeper into that problem today.

I dug up an old presentation by Jochen Hagen of T-Systems, given at a public Juniper event in 2008.  It lays out some problems, makes a statement on what will solve them…all the stuff you see today.  The solution back then was “IP Transformation” but the problem set was so close to what you’d hear today from operators that you could stick the slides into a current deck and they’d make perfect sense.

What this proves is that despite my personal penchant for top-down approaches, top-down transformation hasn’t worked well for the telecom industry.  The reason for that, I believe, is that telco planners/executives can articulate what they want and need, but at some point they have to assign those wants and needs to a solution that can be defined and purchased and installed.  That hasn’t worked.  Transformation documents are like fiction—interesting but not representing reality.

The bottom-up process isn’t a shining example of success either.  SDN and NFV are networking-centric recent “technology revolutions.”  We know how to do, and to buy, both of them.  The challenge for both technologies has been that they have to align their technology capabilities with benefits of sufficient magnitude to fund a network transformation.  That hasn’t worked.  That’s the challenge that SDN and NFV trials and PoCs now face.

Why is this “border” between technology and business so difficult to cross?  The answer lies in part in the fact that we’re looking for revolutionary changes in a multi-trillion-dollar industry.  It’s unreasonable to expect that you’d revolutionize automobiles by changing a bolt on a bumper, and similarly unreasonable to assume that a small “manageable” change to a network is going to bring about a 25% across-the-board reduction in costs or a similar increase in revenues (or both).

You can see the effects of this in both NFV and SDN, where everyone seems to be making a point to think small.  Look at the leading NFV PoCs for example, and you find virtual CPE and virtual IMS/EPC.  Both are attractive but operators have been having a problem proving a big benefit case for SDN or NFV in either, largely because in both cases the operational implications of the new technology are proving difficult to assess.  But even if we had a business case, how much of networking can we touch with both of these?  Don’t SDN and NFV have to be useful on a much larger scale to be worth the effort and risk?  If so, why are we not exploring all of the boundaries?

You can also see the border-inhibiting impact on top-down transformation plans, even the most recent.  I’ve seen a number of them that lay out the benefit requirements very effectively, and thus set the requirements for technology projects that would meet the benefit case.  The problem has been that when you look at the result of these plans, you see a sweeping transformation of network and operations, sales, customer support, and so forth.  It touches a hundred vendors, a dozen major technology areas.  It’s like giving somebody a giant pile of parts and asking them to build a car from scratch.

So how does this get fixed?  I think there are two possibilities.  First, we could actually work to better define our “benefit/technology border” so that crossing it from either side would be facilitated.  That would let projects involving SDN, NFV, and the cloud take more convincing shots at benefits, and let transformation projects more readily align with our revolutionary technologies.  Second, we could forget both our current approaches and follow the money.  Since the first of these options is what most vendors and operators favor, let’s look at it first.

The biggest problem with the benefit/technology border is a lack of a holistic architecture that unites business goals with technical elements.  If you believe (as I do) that the transformation of operators will involve an application of SDN, NFV, and the cloud to their business, then there should be some sort of architecture diagram possible that shows where these pieces fit in the overall picture.  Such a diagram, if complete, would be aligned with business goals from the top (as requirements to be met) and with technology options below (as features to support requirements and realize benefits).

NFV had the unique opportunity to frame the whole diagram by taking a truly end-to-end view of a service and at the same time supporting a complete service lifecycle process.  That’s because NFV could have addressed services, functionality, operations, and network resources in the same model.  It has not realized that opportunity, nor is it likely to do that in time to drive the market.  We have no credible process in place elsewhere to do what’s needed either.  We have operations processes, management processes, and network technology but nothing that really unites them.  That’s why I said in my last blog that it was up to the vendors to extend NFV and envelop the cloud and SDN to create that border-crossing architecture.  But vendors have become very tactical and that may be hard for even the largest to swallow.

The other option is the “empirical” approach.  Forget technology revolutions.  Assume that SDN, NFV, and the cloud are just new pieces from which cars can be built.  Focus the efforts of all three where there’s the largest number of buyers, the greatest level of natural investment and change.  Based on this approach, for example, you’d reason thusly:  “Right now, most profit comes from metro services and most investment will flow incrementally to metro.  Therefore, support metro in SDN, NFV, and cloud capabilities, and you’ll be positioned to build what’s getting built.”  You could argue that this is what the MEF is doing; fix “orchestration” for a service that’s getting invested in and you can do at least something.

The problem with this approach is the classic death by a thousand cuts.  If you focus on incremental projects where SDN, NFV, or the cloud can be applied without looking at any of these technologies as a whole (much less as a combined ecosystem) then you risk creating silos that will do what silos always do—undermine efficiency, agility, and economy of scale.

The solution to these issues comes back to vendors in my view.  A vendor with a fairly comprehensive strategy for NFV could spread it out over enough infrastructure and service landscape to ensure that key areas would be implemented and operationalized the same way even if they were driven by service-specific projects.

What technology element is critical?  The answer is orchestration of everything.  You have to be able to orchestrate virtual functions, SDN enclaves, cloud components, operations processes, management tools, customer interactions—everything means everything.  Orchestration could be what unifies everything at the border, that turns fleeing mobs into orderly lines.  It’s also something that vendors could do, directly through standards (like OASIS) or simply through the complete articulation of an architecture that’s open but beyond the scope of current standards.

Every operator on the planet has a transformation timeline that goes no further than three years out.  We are three years from inception of NFV as a project and far further than that with SDN and the cloud.  There’s every reason to fear that we’ll diddle around here long enough to make the work irrelevant.  As I’ve said, operators have a timeline set by their business problem and they’re going to meet it, even if they have to start a new and different revolution.

Climbing the Benefit Ladder Above SDN, NFV, and the Cloud

Network Functions Virtualization (NFV) is one of several technologies that operators are hoping will improve their profit on infrastructure investment.  NFV itself was launched to reduce capex by substituting generic hosted functions for embedded-appliance-based functions.  NFV’s benefit expectations have evolved since to include, and even emphasize, operations efficiency and service agility.

The evolution of expectations doesn’t necessarily drive collateral evolution of capability, which I’ve noted in the past.  Last year operators told me that none of their trials of NFV had proved a full business case for deployment.  Early this year they said that they were integrating more operations practices and processes into the trials, and most were hopeful this would resolve the benefit issues.  Even though it’s only the end of April, they’re still evolving their view of NFV and I think it’s interesting to see where it’s headed.

The most significant point I’ve learned is that about 80% of operators’ NFV trials are characterized by operators themselves as “proof of technology not benefits”.  This isn’t a return to the 100% “my trials won’t prove a business case” but it does seem pretty clear that hopes that additional scope for current PoCs and trials would justify deployment aren’t yet realized.

A couple of operators were very candid in their comments.  The problem, they say, is that the trials aren’t really doing much to operations at all.  Vendors, who in fairness are probably influenced by the ETSI vision of management and operations integration, have promoted what can be called the “virtual device” model of management.  Virtual functions, under this model, are managed by adding management components that mimic the management interfaces and behaviors of the original devices.

This seems very logical on the surface.  If you want to validate NFV technology you need to contain the impact on surrounding aspects of your network and business or you end up with a “trial” that includes everything.  The challenge is that if you are mimicking current device management, then it’s hard to demonstrate much in the way of operations efficiency gains.  In fact, you’re likely to create additional operations issues.

Early trials of the virtual device model show that you can manage a virtual device through existing interfaces, with existing tools, only to a point.  There is a kind of border you’ll need to cross in this situation—the border where virtual functions are hosted on or connected through real resources.  The management of those resources tends, in early NFV trials, to be separate from the management of the virtual functions.  The challenge, according to operators, is that separation means that resource management in addition to function management is needed, and problem resolution across the border is more difficult than expected.
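
Here’s a hedged sketch of that border.  A virtual-device management proxy answers for the function exactly as the old appliance would, while the hosting resources live in a separate manager; unless something binds the two, a host fault never shows up where the existing tools are looking.  All names and fields are illustrative.

    # Sketch of the virtual-device management border.  The function's "device" MIB
    # looks healthy even when the host it runs on is failing, unless a binding
    # correlates the two.  All names and fields are illustrative.

    virtual_device_mib = {"vFirewall-17": {"oper_status": "up", "sessions": 1240}}
    resource_manager   = {"server-42": {"cpu_pct": 99, "fan": "failed"},
                          "vswitch-9": {"drops_per_sec": 4000}}

    hosting_bindings = {"vFirewall-17": ["server-42", "vswitch-9"]}

    def device_view(vnf):
        """What existing device-management tools see: the mimicked interface only."""
        return virtual_device_mib[vnf]

    def correlated_view(vnf):
        """What problem resolution actually needs: function status plus host status."""
        return {"function": virtual_device_mib[vnf],
                "resources": {r: resource_manager[r] for r in hosting_bindings[vnf]}}

    print(device_view("vFirewall-17"))
    print(correlated_view("vFirewall-17"))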

A few of the operators attribute all of this to a lack of service lifecycle management focus.  In order to assess NFV benefits, you’d have to be able to test NFV from the conception of a service to the realization as a product to be sold and paid for.  Three quarters of trials, according to operators, fail to provide any such scope and so it’s difficult to assess what the total cost and total agility-driven revenue benefit might be.

Most operators now seem to believe that the problem isn’t NFV per se, but the fact that NFV has to be fit into a larger service revolution.  “I’m not interested in building my business around NFV,” said one, “but I’m very interested in building NFV into my business.”  The challenge for operators is that while there is an NFV architecture (even if it’s operationally imperfect or at least not validated) there’s nothing above they can play with.

What I see now is something like the “transformation” age of operators eight or ten years ago.  At that time they all were looking at business model transformation aided by technologies.  I looked back over the presentations made at that time and found striking similarities with the presentations on current operator goals for building that mysterious layer above NFV (and SDN).  Nothing much came of those old adventures, of course, and that has a lot of operators worried.  They need something complete and effective in play within two years on the average, and they’re not only unsure where it will come from, they aren’t confident they can describe it fully to prospective sellers.

There are people who see this as a failure of NFV, even within the operator community.  About a quarter of Tier Ones seem to have scaled back considerably on their NFV expectations.  I’ve had my own doubts about the scope of the ETSI work—I’ve argued from the first that the limitations in scope risked isolating NFV from most of the benefit sources.  I still feel a broader model would have been better, but I have to admit that it would have taken longer to do and in the end might not have accomplished any more than that which has been done by the ETSI ISG to date.

So what’s the problem?  I have a sense of inevitability here, I guess.  The constriction of profits between a falling revenue-per-bit line and a slower-falling cost-per-bit line is a systemic problem with roots that go way beyond network technology and operations or business practices.  It may not be possible to solve it completely, and even some operators now admit that.  Regulators may have to accept the very kind of consolidation that the rejection of the Comcast/Time Warner deal would have created.  Users and OTT players may have to accept that there will not be continued improvement in speed and quality, and that in fact congestion online may become the rule.

That’s what these new high-level visions are hoping to avoid.  A bit less than half the operators seem to have at least skunk-works projects underway to advance a new service architecture at the highest level.  In a goal sense, most of these new architectures aren’t demanding NFV or SDN or the cloud, but they are all defining objectives it would be hard to meet without all three.  In fact, what these operators seem to be creating is a kind of Unified Field Theory for networking that harmonizes all three.

For vendors this poses an enormous risk and opportunity at the same time.  Much of the work involved in PoCs and NFV trials up to now isn’t going to pay off in direct deployment.  Much of the work needed to drive significant network transformation will have to take place outside the NFV, SDN, and cloud processes.  But remember that about 20% of trials are considered to be making useful progress.  We do have NFV vendors who are successfully (if, in operator views, too slowly) expanding their scope to grasp at the borders of NFV and whatever is above it.  This is where big vendors will have the advantage, because they’re going to have to take a big bite of complexity to get a big bite of benefits.  And only a big benefit case is going to transform networking.

Apple, iDevices, and the New Age of the Cloud

Apple “crushed estimates” according to the headline of a financial website, and they surely did.  In fact Apple turned in perhaps the first unabashedly great quarter by a tech company in the current earnings season.  iPad sales were slightly below estimates and some analysts thought the outlook was less positive than the current quarter, but other than that it was beer and roses.

The obvious question is whether they can keep it up.  This is important not only for Apple but for the industry, because if Apple is the face of success then we have to reexamine some of the cherished tech illusions we’ve been reading about.

The Driving Principle of Our Age is that of virtualization.  Computer power is cheap and getting cheaper, and its expansion into every aspect of our lives isn’t limited by capital cost but by support or operations.  We have to turn the world into as-a-service because the masses can’t be expected to be computer gurus.  In this world-view, we should be seeing a dumbing down of local intelligence, a shift toward devices being on-ramps to the cloud.

Which is hardly consistent with Apple’s vision.  Apple has three basic value propositions.  One is that their stuff is cool and their users enviable.  That’s a given.  The second is that “their stuff” is something you buy and hold.  It’s not virtual, it’s not hosted, it’s not something that is really created on some nameless server somewhere, because that anonymity would make everyone else Apple’s equal.  Their game is devices.  The third proposition is that sought-after experiences are atomic.  Users want something, and that something isn’t much related to other somethings the user wanted or will want.  We live in the moment.

Amazon tried to beat Apple at its own game, with the Kindle Fire and their phone, and it didn’t work.  It’s very hard to unseat a champion if you agree to abide by all their rules of engagement, after all.  Google bought a phone company, arguably, to try their own hand, and that didn’t work either.  So what would work?  Something that eats away at those basic value propositions, and nothing would do that better than a shift to the cloud.

Amazon is the cloud giant, the king of virtualization.  They have the tools to make an experience virtual, not tied to cool devices.  Such a move would hit at one of Apple’s critical foundations.  Google, with Fi, is now taking a mobile service and building layers on it in the cloud and not in the device.  Their initial Nexus 6 offering is weak.  Could they make their next supported handset an iPhone, perhaps?  Suppose they said that any iPhone, even an older model, could be used with Fi.  Would that undermine an Apple proposition?

Alcatel-Lucent may be aiming at the third proposition, through service-provider proxies.  Their new Rapport promises contextual services, or at least an early form thereof.  The more intelligence you draw into giving users what they want, the more costly it is to store all that stuff in the handset or deliver it there for analysis.  The network knows, remember?  Operators have wanted to break the hold phone providers have on mobile services.  Empowered operators, particularly operators inspired by Google’s Fi, might do that.

Context is also the logical solution to the first of Apple’s propositions, the cachet.  Every user may strive to display that Apple logo, but even more than that they’d like to display “Her”, the artificial-intelligence companion that sees all, knows all, and when she (or he) speaks draws admiring glances.  A companion shares experiences, shares context, which is why we automatically expect smart devices to know what we’re seeing or doing.  Can a phone do that sort of thing?  With network help, yes.  A network can do it with nothing more from a phone than a conduit to the user.

The industry, for a variety of reasons, is moving beyond traditional networking and IT and into a new age, an age where context and personal assistants are inevitable.  I think that the signs are already visible.  Yes, Amazon has fumbled its own ball.  Yes, Google’s Fi is tentative, a wisp of what it could be.  Alcatel-Lucent isn’t marketing Rapport to the stars either, and operators confronting a profit crisis and a technology (NFV) that promises to support agile services are instead trying to use it to create the same old crap they’ve sold forever.  There will always be under-realization of every new industry trend, but eventually mass pays off.  An industry-wide profit-starvation trend has a lot of mass behind it, and urgency besides.

The question for Apple is whether they see this or not.  It’s perfectly possible that Cook and his clan have already laid out the Cool Answers to the context-and-services future, that the Apple Cloud offering will eventually just awe everyone into submission.  They’re milking their current model while they can, and will spring when it’s necessary.  It’s also possible that Apple has stuck its head in the (silicon-based) sand and will hold onto the past too long, like so many others.

The question for competitors is the same, the “do-they-see” question.  It’s possible to make a better phone or tablet than Apple, but nobody is going to make a phone or tablet superior enough to overcome Apple’s advantage.  To beat Apple you have to write new game rules, rules that favor new innovators.  Amazon, Google, and even the network operators, perhaps through NFV or Alcatel-Lucent’s Rapport, have a chance to undermine the handset.

And to exalt what, exactly?  That may be the problem.  The network isn’t the value proposition for the future; it’s getting commoditized just as fast as anything else.  If the cloud is the future, what exactly is the cloud?  Servers are commoditizing.  Is it software?  Can we earn massive profits from software alone?  What is the engine to create the Big Win that justifies the Big Risk?

If as-a-service is the future then services are the goal.  We will likely spend less on hardware and software to support centralized cloud services of any type than we’d spend if we hosted our gratification on local devices.  The next level of disintermediation may be aimed less at the network operators and more at the vendors.  It won’t kill the market for computers or devices, but it will surely help commoditize it.  Unless the vendors start thinking about how they can be as-a-service players too.  Does that sound a lot like the dilemma the Internet posed for operators decades ago?  It does to me.

Stepping Beyond the Cloud as We Know It

There are few who doubt that we are in the Internet Age.  Few doubt we’re entering the Cloud Age and maybe even the SDN/NFV Age, but I wonder whether there’s broad understanding that the cloud and related technologies like SDN and NFV are going to be as transformative as the Internet was.  When the Internet first developed, nobody saw what it would become.  We’re just now starting to see the signs of what might come next.

Our biggest news item last week was Amazon’s first break-out of cloud earnings.  The company reported about $5 billion in cloud service sales and a $6 billion run rate.  If you give Amazon about 28% of the IaaS/PaaS cloud market, that sizes the market at about $18 billion, which is roughly 1.8% of current IT spending.  More significant to financial analysts was that Amazon reported a profit of about $1 billion on the cloud.
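For what it’s worth, here’s the back-of-the-envelope arithmetic behind those figures.  The total IT spending number is implied by the percentages rather than reported, so treat this as a sanity check, not a forecast.

    # Rough arithmetic behind the figures above; all inputs are estimates.
    amazon_cloud_sales = 5e9        # reported cloud service sales, about $5B
    amazon_share = 0.28             # assumed Amazon share of IaaS/PaaS

    iaas_paas_market = amazon_cloud_sales / amazon_share
    print(f"IaaS/PaaS market: ${iaas_paas_market / 1e9:.0f}B")                    # ~$18B

    total_it_spending = 1e12        # implied by the 1.8% figure, roughly $1T
    print(f"Share of IT spending: {iaas_paas_market / total_it_spending:.1%}")    # ~1.8%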

I think the most interesting thing about the Amazon number is the way it frames total cloud service sales.  If you believe the cloud will largely displace private IT, it’s clear there’s a long road ahead.  If you don’t, which I don’t, then you have to examine cloud service opportunity more closely to see where we are now and where we’re heading.  It’s that examination that takes us into the future, into the transformation that just might change everything.

The first point is that SaaS is generally viewed as the larger cloud service segment, but it’s hard to size effectively because hosted services and SaaS services are hard to distinguish.  If you eliminate web hosting, my own estimate is that SaaS currently accounts for about $16 billion in spending, which would make it a titch smaller than the “platform” clouds.  Total cloud computing spending would then be about $34 billion.  Include all hosted services and the spending doubles, which shows that SaaS and the cloud are really extending trends that had been established before.

Online sales and similar adventures by enterprises didn’t displace current IT spending; they augmented it.  What that proves in my view is that we had two possible views of the cloud to choose from when it launched—substitute IT or a new opportunity—and we picked the more pedestrian.

The cloud can probably displace only about $240 billion of current IT spending.  Even with that low a target, it’s obvious that we’ve not even reached 14% of likely penetration, which means that public declarations of an Amazon victory are likely optimistic simply on statistics.  Other providers still have a good chance: not only the current providers of cloud services, but new and credible market entrants as well.  But while a quarter-trillion isn’t chump change, it’s not transformative either.

What makes things interesting in my view is that right now about a third of the platform (IaaS/PaaS) cloud spending and 20% of the SaaS spending isn’t displacing current IT spending at all, but rather is accretive to it because the cloud is doing stuff that was never done traditionally.  Despite cost-driven targeting, we’ve been witnessing a quiet cloud transformation, a shift from that pedestrian, short-sighted focus to something exciting.  The future cloud opportunity lies more with this new stuff, which for the enterprise is about $800 billion according to my model.  If you go beyond the enterprise into new consumer mobile and NFV services, you add another $1.5 trillion, which gets you into the realm of real money.
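Pulling the model’s numbers together (all of them the estimates quoted above, in billions of dollars), the arithmetic looks roughly like this:

    # Rough reconstruction of the market-sizing model; figures in billions of dollars.
    platform_cloud = 18        # IaaS/PaaS, from the Amazon-derived sizing
    saas = 16                  # estimated SaaS spending, excluding web hosting
    total_cloud = platform_cloud + saas                   # ~$34B today

    displaceable_it = 240      # IT spending the cloud could plausibly displace
    print(f"Penetration of displaceable IT: {total_cloud / displaceable_it:.0%}")   # ~14%

    # Portions that are accretive, i.e. new spending rather than displacement.
    accretive_today = platform_cloud / 3 + saas * 0.20    # ~$9B of today's cloud spending

    enterprise_new = 800       # "new stuff" opportunity for the enterprise cloud
    mobile_nfv_new = 1500      # new consumer mobile and NFV services
    print(f"Total new opportunity: ${(enterprise_new + mobile_nfv_new) / 1000:.1f}T")   # >$2T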

Amazon has an impressive but not compelling cloud position in the “enterprise cloud” as most would see it today.  They have no real position in the extended enterprise, mobile, or NFV spaces.  That means that if the cloud fully develops and Amazon doesn’t push out of its current focus area or change market share, they’d end up with 28% of what is about a $1 trillion total opportunity.  That’s a lot of growth for them, and investors would have every reason to be happy.

The question is the rest of the cloud opportunity, the roughly $1.5 trillion in mobile/NFV services.  This is the space that the network operators (at least the savvy ones) hope to reap with the “service agility” NFV is supposed to provide.  It’s also the space that Google obviously hopes to capture with its Fi MVNO service.

Put into cloud terms, Fi could be a model to transfer network service value upward out of the network and into the cloud, and then to meld it with MVNO network services to create what the user would see as a new native mobile service.  Google is likely betting that the operators, who could create a tighter linkage between true mobile connection services and Fi-like cloud services through NFV, won’t be able to move far enough or fast enough.  In a way, Google is targeting the biggest disintermediation project since the Internet, where the cloud disintermediates operators from higher-layer service value.

Call it as-a-service activity, virtualization, SDN/NFV, or the cloud; whatever the name, it generates “new opportunity” that aggregates to well over $2 trillion in annual revenues.  At least three-quarters of this could be viewed as “natural opportunities” for operators, and all of it would be an opportunity were operators to position their cloud assets properly.  How do we know that?  From Amazon.

Amazon’s profits on AWS are hard to validate because it’s difficult to know what formula the company uses to allocate costs across shared infrastructure.  But we do know that in the cloud overall, the highest profits will likely accrue to the player with the lowest costs.  Amazon’s enormous scope has made it an economy-of-scale play.  The operators, with NFV, could in theory deploy even more infrastructure than Amazon and do so at a lower expected ROI because of their utility-like internal rates of return.  Financially, they could win.
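A toy comparison shows why the return-on-investment point matters.  The rates and the capital figure below are assumptions for illustration only, not anything Amazon or the operators report.

    # Illustrative only: the pricing headroom a utility-like return target creates.
    infrastructure_cost = 10e9      # hypothetical capital deployed by each player

    webscale_target_roi = 0.20      # assumed return a web-scale player expects
    utility_target_roi = 0.08       # assumed utility-like internal rate of return

    print(f"Web-scale required return: ${infrastructure_cost * webscale_target_roi / 1e9:.1f}B/year")
    print(f"Utility-like required return: ${infrastructure_cost * utility_target_roi / 1e9:.1f}B/year")
    # The difference is margin an operator could give back in price and still hit its target.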

We can also draw some insights from the regulatory opposition to the Comcast/TW acquisition that ultimately killed the deal.  Regulators were at least as afraid of the impact on OTT video as they were of the impact on other cable/broadband or telco video/broadband service providers.  That suggests that even in regulatory circles there’s a growing sense that services are above the bit.  If that’s true then Google and Amazon have a shot at the whole pie, which could be huge for them.

This also shows why the network equipment picture is lumpy.  Mobile infrastructure needs a higher-layer boost, so Ericsson is seeing a slowdown.  Future services will be based mostly on software and servers, so F5 saw a boost.  Profitable traffic in the metro, to be supplemented by the cloud, still demands carriage of some sort, and Juniper is aiming at that, hoping that somebody with a good, specific cloud and NFV story won’t step on them.

I’ve tended to call future applications and services “contextual”, meaning that they exploit the sense of context that mobile users (and humans in general) base their behaviors on.  Call them what you like, but I think that these services, whose total revenue value is over $2.5 trillion per year, represent the pie that everyone at the provider level has to be looking to slice, and that every vendor wants to supply with equipment, software, and professional services.  The question seems to be how and when to start.

Inside every Tier One is a planner who understands the future.  That’s true for about half the Tier Twos and perhaps a quarter of Tier Threes.  Among the largest enterprises, about half see the future as it is, and the rate of insight drops radically as you move toward the SMBs.  The point is that future-speak is nice if you’re a reporter but it’s not necessarily the path to riches if you’re selling network equipment, software, or services.  There’s always a need to build the future on the present, not destroy the present to get to it.  That means that the status quo will hold a powerful appeal until there’s no way to avoid facing future reality.

We may be getting to that point.  Optical players like Infinera are speaking future-truth already and reaping the rewards.  NFV’s principles are becoming clear even if an unknown amount of work remains on the specs, and we’re gaining on the 2017 deadline, when operators will need something from NFV to save profits and may leapfrog the remaining standards work and open issues to get there.

I still see this as a kind of face-off-by-proxy, with Google Fi on one side and Alcatel-Lucent’s Rapport on the other.  Can Google figure out how to build superior higher-layer services on top of an MVNO framework?  If so, then they relegate operators to MVNO hosts at even slimmer margins.  Or can operators use Rapport or NFV or both to build agile service layers, not new ways of doing connection services?

We may have other answers even sooner.  Amazon and Apple can’t let Google own this transformation.  If all Fi represented was an MVNO deal, competitors could sit on the sidelines because the risk is high and the upside isn’t that great.  If Fi is a step toward a multi-trillion-dollar opportunity, nobody dares ignore it.  Apple is particularly vulnerable, but also particularly well positioned, with a loyal fan base and a legion of related products.  Once they’ve reported (today), can they then clear the decks and move more decisively into this new age?  That we’ll have to see.