Can Second-Tier Network Vendors Win in NGN?

You generally find revolutionaries in coffee shops, not gourmet dining rooms or private clubs.  In the race for the right to shape the network of the future, the equivalent of a coffee shop is “second-tier” status.  You can see the candy through the window (to mix a metaphor) but can’t quite get at it—unless you break the window.  Brocade, Extreme, Juniper, and Overture are examples of this group, with Brocade and Juniper “logical contenders” and Extreme and Overture examples of possible upsets.

All of these players are squarely in that L2/L3 network space that, under my NGN model, is a prime target to get virtualized out of existence.  On the one hand, being virtualized is a bit like being emulsified—you probably would like the result more than the process.  On the other hand, the market leaders would like the process a lot less, which means these second-tier players might have a chance to take a lead in the transformation and gain market share while their bigger rivals are defending the past.

This proposition frames the optimum strategy for the second tier pretty well.  You have to be two things to be an NGN giant starting from the second string.  One is a purveyor of a clear evolutionary strategy, one that gets people committed to you without requiring them to fork-lift all their current gear.  The other is a visionary.  Nobody is going to hop from rock to rock to cross a flood if they don’t have a compelling reason to get to the other side.

If you lined up a table of second-tier players with ticks for points of strength and stake in the outcome, it would be hard to find somebody with more promise than Brocade.  The company is a combination of a data center switching vendor and a virtual router vendor, at a time when data centers are the future of the cloud layer and virtual routing is the future of L2/L3.  They have no major assets they have to protect, so they could mark a lot of others’ territory without stepping in anything unpleasant.  They have good operator engagement in both SDN and NFV, good representation in the standards groups that count, and a new CMO who has a reputation for aggression.  For now, though, the aggression isn’t manifesting as I’d hoped.  Brocade can support the NGN evolution but they don’t have the positioning to drive the engagement.  In a market where operators have to take a significant risk to evolve, you can’t win if you don’t inspire.

Juniper, sadly, looks a lot like the opposite.  They have a fixation on boxes and chips, to the point where their CTO wants to talk them up to the Street no matter how convincing the trend toward a software-hosted future might be.  They have an activist investor who wants near-term share appreciation when the first steps of a revolution could put revenues at risk.  They haven’t been a strong marketing company in a decade, they’ve lost a lot of their key people in recent CEO shuffling, and they have never done a good software project or made a stellar acquisition.  But…they have some incredible technical assets, including base architectures that are cloud-, SDN-, and NFV-friendly and that emerged before anyone had heard of those concepts.  Strong leadership, strong marketing, and a firm hand with the Street could even now propel these guys into being what they could have been years ago.  But getting those three things may be an insurmountable problem at this point.

So if the kings of the second string aren’t leaping at the throats of the establishment, are there any a bit further back in the pack who might?  Well, imagine Brocade without the virtual router.  You’d have Extreme (sort of, at least).  Extreme was one of the major switching players of the past, left behind as the market giants jumped in with broader portfolios and better account control.  The company’s SDN position is not only data-center-centric, it’s cloud-centric for the operators, and they’re not a server/cloud vendor.  It’s also primarily an OpenDaylight positioning.  They have no real NFV position.  All this is bad unless you look at things as a blank slate.  There are plenty of assets out there that could be combined to create something very good, and Extreme has no barriers to picking one up.  And unlike data-center rival Arista, they’re not being sued by Cisco.

Then we have Overture Networks.  Overture is a very narrow player in networking, a Carrier Ethernet second-tier player in fact.  While they have good carrier engagement and are not strictly a startup, they are a private company, not a public one.  Given these points, you could be justified in thinking that Overture has no business in this piece, but you’d be wrong.  Their NFV orchestration approach has always been one of the very strongest in the market, and their understanding of the realities of NFV and SDN in the business service space is similarly strong.  What’s held Overture back, and may still be doing that, is the fact that they’ve been unwilling to leap wholeheartedly into the NGN deep end.  If you sell Carrier Ethernet gear for a living it’s easy to understand why you’d be reluctant to have your sales people out there shilling for total network revolution.  Somebody else would get all the money.  But Overture is very close to being able to make a complete NFV business case, good enough to take most PoCs into a field trial.  If they go the rest of the way…well, you can guess.

You probably see the basic truth of this group at this point—sometimes those with fewer assets fight harder to protect them.  The challenge is that there is zero chance that a second-tier player could follow a market to success; they’d be left in the dust of their bigger rivals.  SDN and NFV leadership aren’t going to be attained by yelling the acronyms out while facing the nearest reporter, not anymore.  Trials are critical now, and you have to be able to present not only real assets but an actual business case to win in 2015.  If you don’t have something fairly progressive already in place at this stage, there is little chance to do that.

To me, Overture is the player to watch in this group.  While I still believe they underplay their own assets, they have assets to play.  There are perhaps two companies that could actually make a business case for NFV at this point, and they’re one of them.  However…they just don’t have a lot of upside unless it’s to get bought.  Carrier Ethernet gear isn’t the big win area for NFV; it’s servers and perhaps data center switching.  A marriage or merger of Brocade and Overture might be compelling; in fact, it might be the only way that an NGN winner could emerge from this particular segment of the market.

Can the Optical Guys Get Out of the NGN Basement?

“The times they are a-changin’,” as the song goes.  The pace and direction of the changes could be influenced by vendors agile and determined enough to get out there and take some bold steps.  We’ve looked at the IT giants who have the most to gain from a transition to a software-server vision of networking.  For all their assets (and likely ambitions) most of them are somewhat networking outsiders.  Cisco, the one who most clearly isn’t that, is far from being determined to be a proponent of software-server-centric shifts.  If we continue exploring the NGN transformation vendor landscape in order of decreasing potential gains, our next group is the optical vendors.

All of the optical vendors out there would benefit from a network transformation that skimped on spending on the L2/L3 part of the network, even if some of the savings went into buying software and servers.  Nobody believes that capacity needs won’t rise in the future, even if cost pressures increase.  Bits start in the optical layer.  The challenge for the optical vendors is creating a strategy that brings about a concentration of network-device spending on their own layer, almost certainly meaning extending their reach upward into the grooming part of the network.  This has to happen at a time when service strategy moves by operators seem to favor software/server vendors, far from the optical plane.

If you look at any realistic NGN vision, it includes some kind of “shim” between the optical part of the network and the service networks that could be created using overlay, hosted SDN, and NFV technology.  Today that shim is created by the L2/L3 infrastructure, meaning switches and routers.  Some of the features of the current L2/L3 can migrate upward to be hosted, but others will have to migrate downward toward the optical boundary.  I’ve suggested that an electrical grooming layer would emerge as the NGN shim, based on SDN technology and manipulating packet flows where optical granularity wasn’t efficient.  The optical players, in order to maximize their own role in the future, need to be thinking about doing or defining this new shim.

Adva, Alcatel-Lucent, Ciena, Fujitsu Network Communications, and Infinera all have optical network gear, and all have packet-optical strategies.  There are some differences in their approaches, the most significant perhaps being the extent to which vendor packet-optical strategies are influenced by current L2/L3 devices and positioning.  If we expect the NGN of the future to look different from networks of the present, it’s inevitable that the differences will be primarily in how service-layer elements bind to optics.  A focus on current L2/L3 would tend to optimize evolution by reducing the differences between then and now.  We’ll look at how that, and other issues, impact vendors in alphabetical order as before.

Adva Optical is one of the smallest of our vendors, certainly in terms of number of web hits on “packet optical” plus their name.  Adva has partnered with Juniper to create a unified solution for packet optical, something that has plusses and minuses.  On the plus side, they aren’t committed to their own ossified L2/L3 approach, but on the minus side that’s likely what they’ll get from Juniper.  Partnerships are a very tough way to support a revolutionary change because they multiply the politics and positioning challenges.  Who ever heard of an aggressive partnership?  It’s going to be hard for Adva to be a conspicuous driver of progress toward a new optical/electrical harmony for NGN.

Alcatel-Lucent is an opposite, size-wise, but it may have some of the same issues as Adva.  Unlike Adva’s, though, their issues are created by collision with Alcatel-Lucent’s own products at L2/L3.  You don’t have to be a market genius to understand how important switching/routing is to the company (Basil Alwan is the star of most of the company’s events).  On the other hand, Alcatel-Lucent did announce a virtual router and their Nuage SDN stuff is among the best in the industry.  So while Alcatel-Lucent is certainly unlikely to rush out and obsolete its star product family, it is in the position to support a graceful transformation if the industry really starts to move toward constraining equipment spending at L2/L3.  Combine that with Alcatel-Lucent’s strong position in NFV (CloudBand), and you have a vendor who clearly has the pieces of a next-gen infrastructure.  They also have “virtual mobile infrastructure” elements like IMS and EPC, a strong position in content delivery, and good carrier engagement.  Their optical stuff integrates well with all of this and even has generally harmonious management.  This isn’t an unbeatable combination, but it’s a strong one.  However, they’re likely to take the “fast follower” approach Cisco made famous, so we have to look for leadership from another player.

Ciena is an interesting player, with more PR traction for its approaches than the smaller players and also some very specific directions in NFV, something that, of the others, only Alcatel-Lucent can claim.  They have a packet-optical strategy that’s a good fit to evolve to electrical grooming and no strong L2/L3 incumbency to protect.  While their new NFV approach is more a service (“pay-as-you-earn”) than a product, there is a product element to it and Ciena says they do have a strategy to let users wean themselves away from the service-based model as they develop their markets.  Their Agility SDN controller (developed with Ericsson) is OpenDaylight-based and a good foundation for an evolving agile grooming layer.  On the negative side, they still haven’t created a cohesive picture of NGN and aligned their pieces to support it.

Fujitsu Network Communications (FNC) is a bit of an optical giant.  They have a wide variety of optical network devices and an Ethernet-based electrical aggregation and grooming approach that gets good scores from the operators.  While their electrical-layer positioning is very traditional, it certainly has the potential to evolve to the kind of model that I think will prevail by 2020.  What FNC lacks is positioning, perhaps more so than any vendor in the space.  Their SDN material, for example, is a useful tutorial but doesn’t provide enough to drive a project or even generate specific sales opportunities.  They have little presence in NFV and virtually nothing in the way of positioning or collateral.  I agree that NFV is above the optical space, but without talking about NFV and the cloud you can’t talk about NGN evolution.  These guys could be as big in the NGN space as anyone, or bigger, if they could sing and dance.

Infinera is our final optical player, a small and specialized one but big enough to have potential.  They recently hired Stu Elby from Verizon, one of the leading thinkers in terms of the evolution of carrier networks under price/cost pressures.  Infinera also has the DTN-X, arguably the most agile-optical-packet combination available, and a very practical transport SDN example (based on Telefonica) on their site.  Their DTN-X brochure has a diagram that is pretty congruent with the model of NGN I think will prevail, and this in a product brochure!  It’s impressive, for an optical player.  Their NFV positioning is very minimal, but more interesting and engaging than FNC’s, for example.  If they made it better, if they created some bridge to the service layer in an explicit way, they could be powerful.

That “for an optical player” qualifier is the significant one here.  None of these players are setting the world on fire, positioning-wise.  Ciena and Infinera seem to be the leading players from the perspective of a strong and evolvable packet-optical-and-SDN story, but Alcatel-Lucent and FNC have the market position should they elect to make a bold move.  For all these players, the future may hinge on how well and how fast the IT giants move.  A strong story in the service or “cloud layer” of the network could effectively set standards for the grooming process that would tend to reduce opportunities for optical players to differentiate themselves.

That’s the critical point.  Bits are ones and zeros; not much else you can say.  If management and grooming are dictated from above then optical commoditization is the result of NGN evolution, not optical supremacy.  It’s going to come down to whether the service side or the transport side moves most effectively.

The Server Giants and NGN

Next year is going to be pivotal in telecom, because it’s almost surely going to set the stage for the first real evolution of network infrastructure we’ve had since IP convergence twenty years ago.  We’re moving to the “next-generation network” everyone has talked about, but with two differences.  First, this is for real, not just media hype.  Second, this “network” will be notable not for new network technology but for the introduction of non-network technology—software and servers—into networking missions.

Today we begin our review of the network players of 2015 and beyond, focusing on how these companies are likely to fare in the transition to what’s popularly called “NGN”.  As I said in my blog of December 19th, I’m going to begin with the players with the most to gain, the group from which the new powerhouse will emerge if NGN evolution does happen.  That group is the server vendors, and it includes (in alphabetical order) Cisco, Dell, HP, IBM, and Oracle.

The big advantage this group has is that they can expect to make money from any network architecture that relies on hosted functionality.  While it’s often overlooked as a factor in determining market leadership during periods of change, one of the greatest assets a vendor can have is a justification to endure through a long sales cycle.  Salespeople don’t work for free, and companies can’t focus their sales effort on things that aren’t going to add to their profits.  When you have a complicated transformation to drive, you have to be able to make a buck out of the effort.

The challenge that SDN and NFV have posed for the server giants is that the servers that are the financial heart of the SDN/NFV future are part of the plumbing.  “NFVI” or NFV Infrastructure is just what you run management, orchestration, and virtual network functions on.  It’s MANO and VNFs that build the operators’ business case.  So do these server players try to push their own MANO/VNF solutions and risk limiting their participation in the server-hosted future to those wins they get?  Or do they sit back to try to maximize their NFVI opportunity and risk not being a part of any of the early deals because they can’t drive the business case?

The vendor who’s taken the “push-for-the-business-case” route most emphatically is HP, whose OpenNFV architecture is the most functionally complete and now is largely delivered as promised.  A good part of HP’s aggression here may be due to the fact that they’re the only player whose NFV efforts are led by a cloud executive, effectively bringing the two initiatives together.  HP also has a partner ecosystem that’s actually enthusiastic and dedicated, not just hanging around to get some good ink.  HP is absolutely a player who could build a business case for NFV, and their OpenDaylight and OpenStack support means they could extend the NGN umbrella over all three of our revolutions—the cloud, SDN, and NFV.  They are also virtually unique in the industry in offering support for legacy infrastructure in their MANO (Director) product.

Their biggest risk is their biggest strength—the scope of what they can do.  You need to have impeccable positioning and extraordinary collateral to make something like NFV, SDN, or cloud infrastructure a practical sell.  Otherwise you ask your sales force to drag people from uninterested spectators to committed customers on their own, which doesn’t happen.  NGN is the classic elephant behind a screen, but it’s a really big elephant with an unprecedentedly complicated anatomy to grope.  Given that they have all the cards to drive the market right now, their biggest risk is delay that gives others a chance to catch up.  Confusion in the buyer space could do that, so HP is committed (whether they know it or not) to being the go-to people on NGN, in order to win it.

The vendors who seem to represent the “sit-back” view?  Everyone else, at this point, but for different reasons.

Cisco’s challenge is that all of the new network technologies are looking like less-than-zero-sum games in a capital spending sense.  As the market leader in IP and Ethernet technologies, Cisco is likely to lose at least as much in network equipment as it could hope to gain in servers.  Certainly they’d need a superb strategy to realize opex efficiency and service agility to moderate their risks, and Cisco has never been a strategic leader—they like “fast-followership” as an approach.

Dell seems to have made an affirmative choice to be an NFVI leader, hoping to be the fair arms merchant and not a combatant in making the business case for NGN.  This may sound like a wimpy choice, but as I’ve noted many times NGN transformation is very complicated.  Dell may reason that a non-network vendor has little chance of driving this evolution, and that if they fielded their own solutions they’d be on the outs with all the network vendors who push evolution along.  Their risks are obvious—they miss the early market and miss chances to differentiate themselves on features.

IBM’s position in NFV is the most ambiguous of any of the giants.  They are clearly expanding their cloud focus, but they sold off their x86 server business to Lenovo and now have far less to gain from the NGN transformation than any of the others in this category.  Their cloud orchestration tools are a strong starting point for a good NFV MANO solution, but they don’t seem interested in promoting the connection.  It’s hard to see why they’d hang back this long and suddenly get religion, and so their position may well stay ambiguous in 2015.

Oracle has, like HP, announced a full-suite NFV strategy, but they’ve yet to deliver on the major MANO element and their commitment doesn’t seem as fervent to me.  Recall that Oracle was criticized for pooh-poohing the cloud, then jumping in when it was clear that there was opportunity to be had.  I think they’re likely doing that with SDN, NFV, and NGN.  What I think makes this strategy a bit less sensible is that Oracle’s server business could benefit hugely from dominance in NFV.  In fact, carrier cloud and NFV could single-handedly propel Oracle into second place in the market (they recently slipped behind Cisco).  It’s not clear whether Oracle is still waiting for a sign of NFV success, or will use their new positioning to make a run at market leadership.

I’m not a fan of the “wait-and-hope” school of marketing, I confess.  That makes me perhaps a secret supporter of action, which in turn makes me more sympathetic to HP’s approach than to those of the others in this group.  Objectively, I can’t see how anyone can hope to succeed in an equipment market whose explicit goal is to support commodity devices, except on price and with much pain.  If you don’t want stinking margins, you want feature differentiation, and features are attributes of higher layers of SDN and NFV and the cloud.  If those differentiating features are out there, only aggression is going to get to them.  If they’re available in 2015 then only naked aggression will do.  So while I think HP is the lead player now, even they’ll have to be on top of their game to get the most from NGN evolution.

Segmenting the Vendors for the Network of the Future

Over the past several months, I’ve talked about the evolution in networking (some say “revolution” but industries with 10-year capital cycles don’t have those).  Along the way I’ve mentioned vendors who are favored or disadvantaged for various reasons, and opened issues that could help or hurt various players.  Now, I propose to use the last days of 2014 to do a more organized review.  Rather than take a whole blog to talk about a vendor, I’m going to divide the vendors into groups and do a blog on each group.  This blog will introduce who’s in a group and what the future looks like for the group as a whole.  I’ll start at the top, the favored players with a big upside.

I want to open with a seldom-articulated truth.  SDN and NFV will evolve under the pressure of initiatives from vendors who will profit in proportion to the effort they have to expend.  A player with a giant upside is going to be able to justify a lot of marketing and sales activity, where one that’s actually just defending against a slow decline may find it hard to do that.  We might like to think that this market could be driven by up-and-comers, but unless they get bought they can’t expect to win or even survive.  You don’t trust itty bitty companies to drive massive changes.

And what’s the biggest change?  Everything happening to networking is shifting the focus of investment from dedicated devices toward servers and software.  It follows that the group with the biggest upside are the server vendors themselves, particularly those who have strong positions in the cloud.  This group includes Dell and HP, with a nod to Cisco, IBM, and Oracle.

The strength of this group is obviously that they are in the path of transition; more of what they sell will be consumed in the future if current SDN and NFV trends mature.  The reason we’re only nodding to Cisco, IBM, and Oracle is that none of them is primarily a server player.  Cisco’s network device incumbency means it’s at risk of losing more than it gains.  IBM and Oracle are primarily software players, and thus would have to establish an unusually strong software position.

The second group on my list is the optical incumbents.  In this group, we’ll count Ciena, Infinera, and Adva Optical, with a nod to Alcatel-Lucent.  The advantage this group holds is that you can’t push bits you don’t have, and optical transport is at the heart of bit creation.  If we could learn to use capacity better, we could trade the cost of capacity against the cost of grooming bandwidth and operating the network.

Optical people can’t stand pat because there’s not much differentiation in something that’s either a “1” or a “0”, but they can build gradually upward from their secure base.  The pure-play optical people have a real shot at doing something transformational if they work at it.  Alcatel-Lucent gets a “nod” because while they have the Nuage SDN framework, one of the very best, they still see themselves as box players and they will likely stay focused on being what they see in their mirror.

The third group on the list is the network equipment second-tier players.  Here we’ll find Brocade, Juniper, and Extreme, with a nod to other smaller networking players.  This group is at risk as money shifts out of network-specific iron and into servers, just as the bigger players are, but they don’t have the option of standing pat.  All these companies would die in a pure commodity market for network gear, and that’s where we’re heading.  Brocade realizes this; the rest seem not to.

What makes this group potentially interesting is that they have a constituency among the network buyers that most of the more favored groups really don’t have.  They could, were they to be very aggressive and smart with SDN and NFV, create some momentum in 2015 that could be strong enough to take them into the first tier of vendors or at least get them bought.  They could also fall flat on their face, which is what most seem to be doing.

The fourth group is the network incumbents, which means Alcatel-Lucent, Cisco, Huawei, and NSN with a nod to Ericsson and the OSS/BSS guys.  The problem for this group is already obvious; any hint that the future network won’t look like the current one and everyone will tighten their purse strings.  Thus, even if these guys could expect to take a winning position in the long run, they’d suffer for quite a few quarters.  Wall Street doesn’t like that.

Ericsson and the OSS/BSS players here are somewhat wild cards.  Operations isn’t going to drive network change given the current political realities among the telcos and the size of the OSS/BSS upside.  Ericsson has augmented its operations position with a strong professional services initiative, and this gives them influence beyond their operations products.  However, integration of operations and networking is a big issue only if somebody doesn’t productize it effectively.

Virtually all of the players (and certainly all of the groups) are represented in early SDN and NFV activity, but what I see so far in the real world is that only three vendors are really staking out a position.  HP, as I’ve said before, has the strongest NFV and SDN inventory of any of the larger players and it’s in the group that has the greatest incentive for early success.  Alcatel-Lucent’s broad product line and early CloudBand positioning helped it to secure some attention, and Ericsson is taking advantage of the fact that other players aren’t stretching their story far enough to cover the full business case.  In theory, any of these guys could win.

That “business case” comment is the key, IMHO.  SDN and NFV could bring massive benefits, but not if they’re focused on per-box capex savings.  If all we’re trying to do is make the same networks we have today, but with cheaper gear, then Huawei wins and everyone else might as well start making toys or start social-network businesses.  Operators now say that operations efficiency and service agility are the real drivers that would be needed.  The players who can extend their solutions far enough to achieve both of these benefits, even usefully much less optimally, will drive the market in 2015 and 2016.  If one or two manage that while others languish, nobody else will have a shot and the industry will remake itself around the new giants.  That could be transformational.

Starting next week, I’ll look at these groups in detail and talk about who’s showing strength and who seems to be on their last legs.  Check back and see where your own company, your competitors, or your suppliers fall!

Raising the Bar on SDN and Virtual Routing

One of the questions being asked by both network operators and larger enterprises is how SDN can play a role in their future WAN.  In some sense, though it’s an obvious question, it’s the wrong one.  The broad issue is how virtualization can play; SDN is one option within that larger question.  If you look at the issues systematically, it’s possible to uncover some paths forward, and even to decide which is likely to bear the most fruit.  But most important, it’s likely you’ll realize that the best path to the future lies in the symbiosis of all the virtualization directions.  And realize why we may not be taking it already.

Virtualization lets us create real behaviors by committing resources to abstract functionality.  If we apply it to connection/transport (not to service features above the network) there are two ways that it can change how we build networks.  The first is to virtualize network devices (routers and switches) and then commit them in place of the real thing.  The second is to virtualize network paths, meaning tunnels.  I would assert that in the WAN, the first of these two things is a cloud/NFV application and the second is an SDN application.

When a user connects to a service, they get two tangible things.  One is a conduit for data-plane traffic to deliver stuff according to the forwarding rules of the service.  The other is “control plane” traffic that isn’t addressed to the other user(s) but to the network/service itself.  If you connected two users with a pipe that carried IP or Ethernet, chances are they wouldn’t be able to communicate, because control exchanges would be expected that couldn’t take place; the network elements designed to support them wouldn’t exist.

SDN in OpenFlow form doesn’t do control packets.  If we want an SDN network to look like an Ethernet or router network, we have to think in terms of satisfying all of the control- and data-plane relationships.  For IP in particular, that likely means providing a specific edge function to emulate the real devices.  The question becomes “why bother?” when you have the option of just deploying virtual routers or switches.
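To make that edge-function point concrete, here’s a minimal Python sketch of the idea.  It’s purely illustrative (the match fields, port numbers, and the notion of a local “emulation port” are my own stand-ins, not any particular controller’s API), but it shows the split: forwarding rules carry the data plane, while anything the endpoints expect the network itself to answer gets handed to a hosted edge process.

```python
# Illustrative only: a tiny model of an SDN edge that forwards data-plane
# traffic over the pipe but redirects control-plane protocols to a hosted
# "edge emulation" process.  Match fields and port numbers are hypothetical.

DATA_PORT = 2        # uplink onto the stitched SDN path
EMULATION_PORT = 99  # local port where the hosted edge-emulation function sits

edge_flow_table = [
    # Control-plane traffic the bare pipe can't answer goes to the emulator.
    {"match": {"eth_type": 0x0806},                 "action": ("output", EMULATION_PORT)},  # ARP
    {"match": {"eth_type": 0x0800, "ip_proto": 89}, "action": ("output", EMULATION_PORT)},  # OSPF
    {"match": {"eth_type": 0x0800, "ip_proto": 1},  "action": ("output", EMULATION_PORT)},  # ICMP
    # Everything else at L3 is ordinary data and just rides the tunnel.
    {"match": {"eth_type": 0x0800},                 "action": ("output", DATA_PORT)},
]

def classify(packet):
    """Return the action of the first rule whose match fields the packet satisfies."""
    for rule in edge_flow_table:
        if all(packet.get(field) == value for field, value in rule["match"].items()):
            return rule["action"]
    return ("drop", None)

if __name__ == "__main__":
    print(classify({"eth_type": 0x0806}))                 # ARP -> edge emulation
    print(classify({"eth_type": 0x0800, "ip_proto": 6}))  # TCP data -> tunnel
```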

We couldn’t build the Internet on virtual routing alone; some paths have too much traffic in aggregate.  What we could do is to build any large IP network for an individual user, or even individual service, by segregating its traffic below the IP layer and doing its routing on a per-user, per-service basis.  That’s the biggest value of virtual routing; you can build your own “VPN” with virtual devices instead of with a segment of a real device.  Now your VPN is a lot more private.

The challenge with this is the below-IP segregation, which is where SDN comes in.  A virtual router looks like a router.  SDN creates what looks like a tunnel, a pipe.  That’s a Level 1 artifact, something that looks like a TDM pipe or an optical trunk or lambda.  The strength of SDN in the WAN, IMHO, lies in its ability to support virtual routing.

To make virtual routing truly useful we have to be able to build a virtual underlayment to our “IP network” that segregates traffic by user/service and does the basic aggregation needed to maintain reasonable transport efficiency.  The virtual subnets that virtual routing creates when used this way are typically going to be contained enough that servers could host the virtual routers we need.  The structure can be agile enough to support reconfiguration in case of failures or even shifts in load and traffic patterns, because the paths the virtual pipes create and the locations of the virtual routers can be determined dynamically.

This model could also help SDN along.  It’s difficult to make SDN emulate a complete end-to-end service, both because of the scaling issues of the central controller and because of the control-plane exchanges.  It’s easy to create an SDN tunnel; a stitched sequence of forwarding paths does that without further need for processing.  Transport tunnel routing isn’t as dynamic as per-user flow routing, so the controller has less to do and the scope of the network could be larger without creating controller demands that tax the performance and availability constraints of real servers.
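Here’s a small sketch of what “stitching” might look like in practice, assuming a simple model where each hop is a (switch, in_port, out_port) tuple and each switch gets one static forwarding entry.  The field names are illustrative rather than any product’s API; the point is that once the entries are installed, the tunnel needs no further per-flow attention from the controller.

```python
# Illustrative only: generate one static, OpenFlow-style forwarding entry per
# hop of a transport tunnel.  Once these are pushed to their switches, the
# tunnel carries traffic with no further controller processing.

def stitch_tunnel(tunnel_id, hops):
    """hops is an ordered list of (switch, in_port, out_port) tuples along the path."""
    entries = []
    for switch, in_port, out_port in hops:
        entries.append({
            "switch": switch,
            "priority": 100,
            "match": {"in_port": in_port, "tunnel_id": tunnel_id},
            "action": ("output", out_port),
        })
    return entries

# A three-hop path between two virtual-router hosting points.
path = [("sw-edge-a", 1, 3), ("sw-core-1", 5, 7), ("sw-edge-b", 2, 4)]
for entry in stitch_tunnel(tunnel_id=42, hops=path):
    print(entry)  # in practice each entry would be installed on its switch once
```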

If we suggest this is the appropriate model for a service network, then we can immediately point to something that virtual router vendors need to be able to handle better—the “adjacency problem”.  The trouble with multiplying the tunnels below Level 3 to do traffic segmentation and manage trunk loading is that we may create too many such paths, making it difficult to control failovers.  It’s possible to settle this issue in two basic ways—use static routing or create a virtual BGP core.  Static routing doesn’t work well in public IP networks but there’s no reason it couldn’t be applied in a VPN.  Virtual BGP cores could abstract all of the path choices by generating what looks like a giant virtual BGP router.  You could use virtual routers for this BGP core, or do what Google did and create what’s essentially a BGP edge shell around SDN.
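For the virtual-BGP-core option, a toy model may help show what “abstracting all of the path choices” means.  In the sketch below (hypothetical, with no real BGP involved), the edges see a single logical adjacency while the core privately maps prefixes to whichever tunnels it has chosen.

```python
# Illustrative only: the many below-L3 tunnels are hidden inside one
# abstraction, so each edge virtual router sees a single adjacency
# ("the core") instead of a full mesh of paths.

class VirtualCore:
    """Presents one next hop to the edges; maps prefixes to tunnels internally."""
    def __init__(self):
        self.tunnel_for_prefix = {}  # prefix -> tunnel id chosen by the core

    def advertise(self, prefix, tunnel_id):
        self.tunnel_for_prefix[prefix] = tunnel_id

    def edge_route_table(self, prefixes):
        # Every prefix resolves to the same logical next hop: the core itself.
        return {prefix: "virtual-core" for prefix in prefixes}

    def forward(self, prefix):
        return self.tunnel_for_prefix.get(prefix)  # tunnel actually used

core = VirtualCore()
core.advertise("10.1.0.0/16", tunnel_id=42)
core.advertise("10.2.0.0/16", tunnel_id=43)
print(core.edge_route_table(["10.1.0.0/16", "10.2.0.0/16"]))  # edges see one adjacency
print(core.forward("10.2.0.0/16"))                            # core picks the tunnel
```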

This approach of marrying virtual routing with OpenFlow-style SDN could also be adapted to use for the overlay-SDN model popularized by Nicira/VMware.  Overlay SDN doesn’t present its user interface out of Level 2/3 devices, but rather from endpoint processes hosted locally to the user.  It could work, in theory, over any set of tunnels that provide physical connectivity among the endpoint hosting locations, which means we could run it over Layer 1 facilities or over tunnels at L2 or L3.

I mentioned NFV earlier, and I think you can see that virtual routing/switching could be a cloud application or an NFV application.  Both allow for hosting the functionality, but NFV offers more dynamism in deployment/redeployment and more explicit management integration (at least potentially).  If you envisioned a fairly static positioning of your network assets, cloud-based virtual routers/switches would serve.  If you were looking at something more dynamic (likely because it was bigger and more exposed to changes in the conditions of the hosting points and physical connections) you could introduce NFV to optimize placement and replacement.

I think the SDN community is trying to solve too many problems.  I think that virtual router supporters aren’t solving enough.  If we step up to the question of virtual networks for a moment, we can see a new model that can make optimal use of both technologies and at the same time build a better and more agile structure, something that could change security and reliability practices forever and also alter the balance of power in networking.

That’s why we can’t expect this idea to get universal support.  There are players in the network equipment space (like Brocade) who aren’t exposed to the legacy switch/router market enough that a shift in approach would hurt them as much as (or more than) it would help.  Certainly server players (HP comes to mind, supported by Intel/Wind River) with aggressive SDN/NFV programs could field something like this.  The mainstream network people, even those with virtual router plans, are likely to be concerned about the loss of revenue from physical switch/router sales.  The question is whether a player with little to lose will create market momentum sufficient to drag everyone along.  We may find that out in 2015.

How Operators Do Trials, and How We Can Expect SDN/NFV to Progress

Since I’ve blogged recently about the progress (or lack of it!) from proof-of-concept to field trials for SDN and NFV, I’ve gotten some emails from you on just what a “field trial” is about.  I took a look at operator project practices in 2013 as a part of my survey, and there was some interesting input on how operators took a new technology from consideration to deployment.  Given that’s what’s likely to start for SDN and NFV in 2015, this may be a good time to look at that flow.

The first thing I found interesting in my survey was that operators didn’t have a consistent approach to transitioning to deployment for new technologies.  While almost three-quarters of them said that they followed specific procedures in all their test-and-trial phases, a more detailed look at their recent or ongoing projects seemed to show otherwise.

Whatever you call the various steps in test-and-trial, there are really three phases that operators will generally recognize.  The first is the lab trial, the second the field trial, and the final one the pilot deployment/test.  What is in each of these phases, or supposed to be in them, sets the framework for proving out new approaches to services, operations, and infrastructure.

Operators were fairly consistent in describing the first of their goals for a lab trial.  A new technology has to work, meaning that it has to perform as expected when deployed as recommended.  Most operators said that their lab trials weren’t necessarily done in a lab; the first step was typically to do a limited installation of new technology and the second to set up what could be called a “minimalist network” in which the new stuff should operate, and then validate the technology itself.

If we cast this process in SDN and NFV terms, what we’d be saying is that the first goal in a lab trial is to see if you can actually build a network of the technical elements and have it pass traffic in a stable way.  The framework in which this validation is run is typically selected from a set of possible applications of that technology.  Operators say that they don’t necessarily pick the application that makes the most sense in the long term, but rather try to balance the difficulties in doing the test against the useful information that can be gained.

One operator made a telling comment about the outcome of a lab trial: “A properly conducted lab trial is always successful.”  That meant that the goal of such a trial is to find the truth about the basic technology, not to prove the technology is worthy of deployment.  In other words, it’s perfectly fine for a “proof of concept” to fail to prove the concept.  Operators say that somewhere between one in eight and one in six actually do prove the concept; the rest of the trials don’t result in deployment.

The next phase of the technology evolution validation process is the field trial, which two operators out of three say has to prove the business case.  The biggest inconsistencies in practices come to light in the transition between lab and field trials, and the specific differences come from how much the first is expected to prepare for the second.

Operators who have good track records with technology evaluation almost uniformly make preparation for a field trial the second goal of the lab trial (after basic technology validation).  That preparation is where the operators’ business case for the technology enters into the process.  A lab trial, says this group, has to establish just what steps have to be proved in order to make a business case.  You advance from lab trial to field trial because you can establish that there are steps that can be taken, that there is at least one business case.  Your primary goal for the field trial is then to validate that business case.

More than half the operators in my survey didn’t formally work this way, though nearly all said that was the right approach.  The majority said that in most cases, their lab trials ended with a “technology case”, and that some formal sponsorship of the next step was necessary to establish a field trial.  Operators who worked this way sometimes stranded 90% of their lab trials in the lab because they didn’t get that next-step sponsorship, and they also had a field trial success rate significantly lower than operators who made field-trial goal and design management a final step in their lab trials.

Most of the “enlightened” operators also said that a field trial should inherit technical issues from the lab trial, if there were issues that couldn’t be proved out in the lab.  When I asked for examples of the sort of issue a lab trial couldn’t prove, operations integration was the number one point.  The operators agreed that you had to introduce operations integration in the lab trial phase, but also that the lab trials were almost never large enough to expose you to a reasonable set of the issues.  One operator called the issue-determination goal of a lab trial the “sensitivity analysis”: this works, but under what conditions?  Can we sustain those conditions in a live service?

One of the reasons for all the drama in the lab-to-field transition is that most operators say this is a political shift as well as a shift in procedures and goals.  A good lab trial is likely run by the office of the CTO, while field trials are best run by operations, with liaison from the CTO lead on the lab trial portion.  The most successful operators have established cross-organizational teams, reporting directly to the CEO or executive committee, to control new technology assessments from day one to deployment.  That avoids the political transition.

A specific issue operators report in the lab-to-field transition is the framework of the test.  Remember that operators said you’d pick a lab trial with the goal of balancing the expense and difficulty of the trial with the insights you could expect to gain.  Most operators said that their lab-trial framework wasn’t supposed to be the ideal framework in which to make a business case, and yet most operators said they tended to take their lab-trial framework into a field trial without considering whether they actually had a business case to make.

The transition from field trial to pilot deployment illustrates why blundering forward with a technical proof of concept isn’t the right answer.  Nearly every operator said that their pilot deployment would be based on their field-trial framework.  If that, in turn, was inherited from a lab trial or PoC that wasn’t designed to prove a business case, then there’s a good chance no business case has been, or could be, proven.

This all explains the view expressed by operators a year later, in my survey in the spring of 2014.  Remember that they said that they could not, at that point, make a business case for NFV and had no trials or PoCs in process that could do that.  With respect to NFV, the operators also indicated they had less business-case injection into their lab trial or PoC processes than usual, and less involvement or liaison with operations.  The reason was that NFV had an unusually strong tie to the CTO organization, which they said was because NFV was an evolving standard and standards were traditionally handled out of the CTO’s organization.

For NFV, and for SDN, this is all very important for operators and vendors alike.  Right now, past history suggests that there will be a delay in field trials where proper foundation has not been laid in the lab, and I think it’s clear that’s been happening.  Past history also suggests that the same conditions will generate an unusually high rate of project failure when field trials are launched, and a longer trial period than usual.

This is why I’m actually kind of glad that the TMF and the NFV ISG haven’t addressed the operations side of NFV properly, and that SDN operations is similarly under-thought.  What we probably need most now is a group of ambitious vendors who are prepared to take some bold steps to test their own notions of the right answer.  One successful trial will generate enormous momentum for the concept that succeeds, and quickly realign the efforts of other operators—and vendors.  That’s what I think we can expect to see in 2015.

There’s Hope for NFV Progress in 2015

Since I blogged recently on the challenges operators faced in making a business case for NFV, I’ve gotten quite a few emails from operators themselves.  None disagreed with my major point—the current trial and PoC activity isn’t building a business case for deployment—but they did offer some additional color, some positive and some negative, on NFV plans.

In my fall survey last year, operators’ biggest concern about NFV was that it wouldn’t work, and their second-greatest concern was that it would become a mess of proprietary elements, something like the early days of routing when vendors had their own protocols to discourage open competition.  The good news is that the majority of operators say these concerns have been reduced.  They think that NFV will “work” at the technical level, and they think that there will be enough openness in NFV to keep the market from disintegrating into proprietary silos.

The bad news is that the number of operators who feel that progress has been made has actually declined since the spring, and in some cases operators who told me in April that they were pleased with the progress of their NFV adventures now had some concerns.  A couple had some very specific and similar views that are worth reviewing.

According to the most articulate pair of operators, we have proved the “basic concept of NFV”, meaning that we have proved that we can take cloud-hosted network features and substitute them for the features of appliances.  Their concerns lie in NFV beyond the basics.

First and foremost, these operators say that they cannot reliably estimate the management burden of an NFV deployment.  There is no doubt in their minds that NFV could push down capex, but also no doubt that it would create a risk of increased opex at the same time.  They don’t know how much of an opex increase they’d face, so they can’t validate net savings.  Part of the reason is that they don’t have a reliable and extensible management model for NFV, but part is more basic.  Operators say that they don’t know how well NFV will perform at scale.  You need efficient resource pools to achieve optimal savings on capex, which means you need a large deployment.  So far they don’t have enough data on “large” NFV to say whether opex costs rise in a linear way.  In fact, they say they can’t even be sure that all of the tweaks to deployment policy—things ranging from just picking the best host to horizontal scaling and reconfiguration of services under load or failure—will be practical given the potential impact they’d have on opex.  One, reading all the things in a VNF Descriptor, said “This is looking awfully complicated.”
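To give a sense of what “awfully complicated” means here, this is a deliberately simplified, hypothetical stand-in for the kind of deployment policy a VNF Descriptor carries.  It is not the ETSI VNFD schema, just a Python sketch of the categories of decisions involved, each of which is a potential opex exposure as well as a capex saving.

```python
# Hypothetical, heavily simplified stand-in for a VNF descriptor.  Not the
# ETSI VNFD schema; it only illustrates the kinds of policy a descriptor
# encodes, all of which someone has to get right and then operate.

sample_vnf_descriptor = {
    "vnf_name": "virtual-firewall",  # hypothetical VNF
    "vdus": [{
        "image": "vfw-image-1.2",
        "vcpus": 4, "memory_gb": 8, "storage_gb": 40,
        "placement": {"affinity": "same-site-as-peer", "anti_affinity": "other-vdu"},
    }],
    "connection_points": ["wan-side", "lan-side", "mgmt"],
    "scaling": {"metric": "cpu_load", "scale_out_at": 0.75, "max_instances": 6},
    "resilience": {"redundancy": "active-standby", "restart_on_failure": True},
    "monitoring": ["cpu_load", "session_count", "packet_drops"],
}

# Placement, scaling, and redundancy are exactly the knobs whose operational
# cost the operators say they can't yet estimate at scale.
print(f"{sample_vnf_descriptor['vnf_name']}: "
      f"{len(sample_vnf_descriptor)} top-level policy areas to manage")
```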

The second concern these operators expressed was the way that NFV integrated with NFVI (NFV Infrastructure).  They are concerned that we’ve not tested the MANO-to-VIM (Virtual Infrastructure Manager) relationship adequately, and even haven’t addressed the VIM-to-NFVI relationship fully.  Most of the trials have used OpenStack, and it’s not clear from the trials just how effective it will be in addressing network configuration changes.  Yes, we can deploy, but OpenStack is essentially a single-thread process.  Could a major problem create enough service disruption that the VIM simply could not keep up?

There are also concerns about the range of things a VIM might support.  If you have two or three clouds, or cloud data centers, do you have multiple VIMs?  Most operators think you do, but these two operators say they aren’t sure how MANO would be able to divide work among multiple VIMs.  How do you represent a service that has pools of resources with different control needs?  This includes the “how do I control legacy elements” question.  All of the operators said they had current cloud infrastructure they would utilize in their next-phase NFV trial.  All had data center switches and network gateways that would have to be configured for at least some situations.  How would that work?  Is there another Infrastructure Manager?  If so, again, how do you represent that in a service model at the MANO level?
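One way to picture an answer to the multiple-VIM question is a registry of Infrastructure Managers keyed by domain, with orchestration dispatching each piece of a service model to the right one.  The sketch below is hypothetical (it isn’t how any specific MANO product works), but it illustrates the structural question the operators are raising.

```python
# Hypothetical sketch: the service model tags each component with an
# infrastructure domain, and orchestration dispatches it to the Infrastructure
# Manager registered for that domain (an OpenStack VIM per cloud, plus a
# manager for legacy switches and gateways).

class InfrastructureManager:
    def __init__(self, name):
        self.name = name

    def deploy(self, component):
        print(f"[{self.name}] deploying {component['name']}")

im_registry = {
    "cloud-east":  InfrastructureManager("OpenStack VIM, east data center"),
    "cloud-west":  InfrastructureManager("OpenStack VIM, west data center"),
    "legacy-edge": InfrastructureManager("legacy switch/gateway manager"),
}

service_model = [
    {"name": "vFirewall",        "domain": "cloud-east"},
    {"name": "vRouter",          "domain": "cloud-west"},
    {"name": "Ethernet gateway", "domain": "legacy-edge"},
]

def orchestrate(components):
    for component in components:
        im_registry[component["domain"]].deploy(component)

orchestrate(service_model)
```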

Then there’s SDN.  One operator in the spring said that the NFV-to-SDN link was a “fable connected to a myth”.  The point was that they were not confident of exactly what SDN would mean were it to be substituted for traditional networking in part of NFVI.  They weren’t sure how NFV would “talk to” SDN and how management information in particular would flow.  About two-thirds of operators said that they could have difficulties taking NFV into full field trials without confidence on the SDN integration issue.  They weren’t confident in the spring, but there is at least some increase in confidence today (driven by what they see as a convergence on OpenDaylight).

You can make an argument that these issues are exactly what a field trial would be expected to address, and in fact operators sort of agree with that.  Their problem is that they would expect their lab trials to establish a specific set of field-trial issues and a specific configuration in which those issues could be addressed.  The two key operators say that they can’t yet do that, but they aren’t convinced that spending more time in the lab will give them a better answer.  That means they may have to move into a larger-scale trial without the usual groundwork having been laid, or devise a different lab trial to help prepare for wider deployment.

That would be a problem because nearly all the operators say that they are being charged by senior management to run field trials for NFV in 2015.  Right now, most say that they’re focusing on the second half—likely because if you’re told you need to do something you’re not sure you are ready for, you delay as long as you can.

What would operators like to see from NFV vendors?  Interestingly, I got an answer to that over a year ago at a meeting in Europe.  One of the kingpins of NFV, and a leader in the ISG, told me that the way operators needed to have NFV explained was in the context of the service lifecycle.  Take a service from marketing conception to actual customer deployment, he said, and show me how it progresses through all the phases.  This advice is why I’ve taken a lifecycle-driven approach in explaining my ExperiaSphere project.  But where do we see service lifecycles in vendor NFV documentation?

I asked the operators who got back to me after my blog, and the two “thought leaders” in particular, what they thought of the “lifecycle-driven” approach.  The general view was that it would be a heck of a lot better way to define how a given NFV product worked than the current approach, which focuses on proving you can deploy.  The two thought leaders said flatly that they didn’t believe any vendor could offer such a presentation of functionality.

I’m not sure I agree with that, though I do think that nobody has made such a service-workflow model available in public as yet.  There are at least a couple of players who could tell the right story the right way, perhaps not covering all the bases but at least covering enough.  I wish I could say that I’d heard vendors say they’d be developing a lifecycle-centric presentation on NFV, or that my operator friends had heard it.  Neither, for now, is true, but I do have to say I’m hopeful.

We are going to large-scale NFV trials in 2015, period.  Maybe only one, but at least that.  Once any vendor manages to get a really credible field trial underway, it’s going to be impossible for others to avoid the pressure to do the same.  So for all those frustrated by the pace of NFV adoption, be patient because change is coming.

Domain 2.0, Domains, and Vendor SDN/NFV

Last week we had some interesting news on AT&T’s Domain 2.0 program and some announcements in the SDN and NFV space.  As is often the case, there’s an interesting juxtaposition between these events that sheds some light on the evolution of the next-gen network.  In particular, it raises the question of whether either operators or vendors have got this whole “domain” thing right.

Domain 2.0 is one of those mixed-blessing things.  It’s good that AT&T (or any operator) recognizes that it’s critical to look for a systematic way of building the next generation of networks.  AT&T has also picked some good thinkers in its Domain 2.0 partners (I particularly like Brocade and Metaswitch), and its current-infrastructure suppliers are represented there as well.  You need both the future and the present to talk evolution, after all.  The part that’s less good is that Domain 2.0 seems a bit confused to me, and also to some AT&T people who have sent me comments.  The problem?  Again, it seems to be the old “bottom-up-versus-top-down” issue.

There is a strong temptation in networking to address change incrementally, and if you think in terms of incremental investment, then incremental change is logical.  The issue is that “incremental change” can turn into the classic problem of trying to cross the US by deciding your turns one intersection at a time.  You may make the optimal choice at each turn based on what you can see, but you never see the destination.  Domains without a common goal end up being silos.

What Domain 2.0 or any operator evolution plan has to do is begin with some sense of the goal.  We all know that we’re talking about adding a cloud layer to networking.  For five years, operators have made it clear that whatever else happens, they’re committed to evolving toward hosting stuff in the cloud.

The cloud, in the present, is a means of entering the IT services market.  NFV also makes it a way of hosting network features in a more agile and elastic manner.  So we can say that our cloud layer of the future will have some overlap with the network layer of the future.

Networking, in the sense most think of it (Ethernet and IP devices) is caught between two worlds, change-wise.  On the one hand, operators are very interested in getting more from lower-layer technology like agile optics.  They’d like to see core networking and even metro networking handled more through agile optical pipes.  By extension, they’d like to create an electrical superstructure on top of optics that can do whatever happens to be 1) needed by services and 2) not yet fully efficient if implemented in pure optical terms.  Logically, SDN could create this superstructure.

At the top of the current IP/Ethernet world we have increased interest in SDN as well, mostly to secure two specific benefits—centralized control of forwarding paths to eliminate the current adaptive route discovery and its (to some) disorder, and improved traffic engineering.  Most operators also believe that if these are handled right, they can reduce operations costs.  That reduction, they think, would come from creating a more “engineered” version of Level 2 and 3 to support services.  Thus, current Ethernet and IP devices would be increasingly relegated to on-ramp functions—at the user edge or at the service edge.

At the service level, it’s clear that you can use SDN principles to build more efficient networks to offer Carrier Ethernet, and it’s very likely that you could build IP VPNs better with SDN as well.  The issue here is more on the management side; the bigger you make an SDN network the more you have to consider the question of how well central control could be made to work and how you’d manage the mesh of devices. Remember, you need connections to manage stuff.

All of this new stuff has to be handled with great efficiency and agility, say the operators.  We have to produce what one operator called a “third way” of management that somehow bonded network and IT management into managing “resources” and “abstractions” and how they come together to create applications and services.  Arguably, Domain 2.0 should start with the cloud layer, the agile optical layer, and the cloud/network intersection created by SDN and NFV.  To that, it should add very agile and efficient operations processes, cutting across all these layers and bridging current technology to the ultimate model of infrastructure.  What bothers me is that I don’t get the sense that’s how it works, nor do I get the sense that goal is what’s driving which vendors get invited to it.

Last week, Ciena (a Domain 2.0 partner) announced a pay-as-you-earn NFV strategy, and IMHO the approach has both merit and issues.  Even if Ciena resolves the issue side (which I think would be relatively easy to do), the big question is why the company would bother with a strategy way up at the service/VNF level when its own equipment is down below Level 2.  The transformation Ciena could support best is the one at the optical/electrical boundary.  Could there be an NFV or SDN mission there?  Darn straight, so why not chase that one?

If opportunity isn’t a good enough reason for Ciena to try to tie its own strengths into an SDN/NFV approach, we have another—competition.  HP announced enhancements to its own NFV program, starting with a new version of its Director software, moving to a hosted version of IMS/EPC, and then on to a managed API program with components offered in VNF form.  It would appear that HP is aiming at creating an agile service layer in part by creating a strong developer framework.  Given that HP is a cloud company and already sells servers and strong development tools, this sort of thing is highly credible coming from them.

It’s hard for any vendor to build a top-level NFV strategy, which is what VNFs are a part of, if they don’t really have any influence in hosting and the cloud.  It’s hard to tie NFV to the network without any strong service-layer networking applications, applications that would likely evolve out of Level 2/3 behavior and not out of optical networking.  I think there are strong things that optical players like Ciena or Infinera could do with both SDN and NFV, but they’d be different from what a natural service-layer leader would do.

Domain 2.0 may lack high-level vision, but its lower-level fragmentation is proof of something important, which is that implementation of a next-gen model is going to start in different places and engage different vendors in different ways.  As things evolve, they’ll converge.  In the meantime, vendors will need to play to their own strengths to maximize their influence on the evolution of their part of the network, but also keep in mind what the longer-term goals of the operator are, even when the operator may not have articulated them clearly, or even recognized them fully.

Public Internet Policy and the NGN

The FCC is now considering a new position on Net Neutrality, along with a new way of classifying multi-channel video programming distributors (MVPDs) that would let streaming providers offering “linear” programming (continuous distribution, similar to channelized RF) rather than on-demand content qualify as MVPDs.  That would enable them to negotiate licensing deals for programming as cable companies do.  The combination could raise significant issues and create problems for ISPs.  It could even create a kind of side-step of the Internet, and some major changes in how we build networks.

Neutrality policy generally has two elements.  The first defines what exactly ISPs must do to be considered “neutral”, and the second defines what is exempt from the first set of requirements.  In the “old” order published under Chairman Genachowski, the first element said you can’t interfere with lawful traffic, especially to protect some of your own service offerings, can’t generally meter or throttle traffic except for reasons of network stability, and can’t offer prioritization or settlement among ISPs without facing FCC scrutiny.  In the second area, the order exempted non-Internet services (business IP) and Internet-related services like (explicitly) content delivery networks and (implicitly) cloud computing.

The DC Court of Appeals trashed this order, leaving the FCC with what the court said was sufficient authority to prevent interference with lawful traffic but not much else.  Advocates of a firmer position on neutrality want to see an order that bars any kind of settlement or payment other than for access, which would implicitly bar settlement among providers and paid QoS (unless someone decided to offer it for free).  No paid prioritization, period.  Others, most recently a group of academics, say that this sort of thing could be very destructive to the Internet.

How?  The obvious answer is that if neutrality rules forced operators into a position where revenue per bit no longer covered cost per bit at an acceptable margin, they’d likely stop investing in infrastructure.  We can see from both AT&T’s and Verizon’s earnings reports that wireline capex is expected to decline, and this is almost surely due to the margin compression created as price per bit converges on cost per bit.  Verizon just indicated it would grow wireless capex, and of course profit margins are better in wireless services.

You can see how a decision to rule that OTT players like Aereo (now in Chapter 11) could negotiate for programming rights, provided they stream channels continuously, might create some real issues.  It’s not certain that anyone would step up to take on this newly empowered OTT role, that programming rights would be offered to this sort of player, that consumers would accept the price, or that the new OTT competitors could be profitable at the margin, but suppose it happened.  What would be the result?

Continuous streaming of video to a bunch of users over the Internet would surely put a lot of additional strain on the ISPs.  One possible outcome would be that they simply reach price/cost crossover faster and let the network degrade.  The FCC can’t order a company to do something unprofitable, but it could in theory force the choice “carry at a loss or get out of the market”.  I don’t think that would be likely, but it’s possible.  Since it would almost certainly result in companies exiting the Internet market, it would have a pretty savage impact.

There’s another possibility, of course, which is that the ISPs shift their focus to the stuff that’s exempt from neutrality.  That doesn’t mean inventing a new service, or even shifting more to something like cloud computing.  It means framing what we’d consider “Internet” today as something more cloud- or CDN-like.

Here’s a simple example.  The traditional scope of neutrality rules as they relate to video content would exclude CDNs.  Suppose operators pushed their CDNs to the central office, so that content jumped onto “the Internet” a couple miles at most from the user, at the back end of the access connection.  Operator CDNs could now provide all the video quality you wanted as long as you were using them.  Otherwise, you’re flowing through infrastructure that would now be unlikely to be upgraded very much.

Now look at my postulated opportunity for mobile/behavioral services through the use of a cloud-hosted personal agent.  The mobile user asks for something and the request is carried over the mobile broadband Internet connection to the edge of the carrier’s cloud.  There it hops onto exempt infrastructure, where all the service quality you need could be thrown at it.  No sharing required here, either.  In fact, even if you were to declare ISPs to be common carriers, cloud and CDN services are information services separate from Internet access, and sharing regulations would not apply.  It’s not even clear that the FCC could mandate sharing, because the framework of the legislation defines Title II services to exclude information services.

You can see from this why “carrier cloud” and NFV are important.  On the one hand, the future will clearly demand that operators rise above basic connection and transport, not only because of current profit threats but because those higher-level things are immune from neutrality risks.  The regulatory uncertainty only proves that the approach to the higher level can’t be what I’ll call a set of opportunity silos; we need an agile architecture that can accommodate the twists and turns of demand, technology, and (now) public policy.

On the other hand, the future has to evolve, if not gracefully then at least profitably, from the past.  We have to be able to orchestrate everything we now have, make SDN interwork with what we now have, and operationalize services end to end.  Further, legacy technology at the network level (at least at the lower OSI layers) isn’t displaced by SDN and NFV; it’s just morphed a bit.  We’ll still need unified operations even inside some of our higher-layer cloud and CDN enclaves, and that operations framework will have to span both the new cloud and the old connection/transport.

One of the ironies of current policy debates, I think, is that had we let the market evolve naturally, we’d have had settlement on the Internet, payment for prioritization by consumer or content provider, and other traditional network measures for a decade or more.  That would have made infrastructure more profitable for operators and stalled the current concerns about price/cost margins on networks.  The Internet might look a little different, the VCs might not have made as much, but in the end we’d have something logically related to the old converged IP model.  Now, I think, our insistence on “saving” the Internet has put more of it, and its suppliers, at risk.