Segmenting the Vendors for the Network of the Future

Over the past several months, I’ve talked about the evolution in networking (some say “revolution” but industries with 10-year capital cycles don’t have those).  Along the way I’ve mentioned vendors who are favored or disadvantaged for various reasons, and opened issues that could help or hurt various players.  Now, I propose to use the last days of 2014 to do a more organized review.  Rather than take a whole blog to talk about a vendor, I’m going to divide the vendors into groups and do a blog on each group.  This blog will introduce who’s in a group and what the future looks like for the group as a whole.  I’ll start at the top, the favored players with a big upside.

I want to open with a seldom-articulated truth.  SDN and NFV will evolve under the pressure of initiatives from vendors, who will push in proportion to the profit they can expect relative to the effort they have to expend.  A player with a giant upside is going to be able to justify a lot of marketing and sales activity, while one that's actually just defending against a slow decline may find it hard to do that.  We might like to think that this market could be driven by up-and-comers, but unless they get bought they can't expect to win or even survive.  You don't trust itty bitty companies to drive massive changes.

And what's the biggest change?  Everything happening to networking is shifting the focus of investment from dedicated devices toward servers and software.  It follows that the group with the biggest upside is the server vendors themselves, particularly those with strong positions in the cloud.  This group includes Dell and HP, with a nod to Cisco, IBM, and Oracle.

The strength of this group is obviously that they are in the path of transition; more of what they sell will be consumed in the future if current SDN and NFV trends mature.  The reason we're only nodding to Cisco, IBM, and Oracle is that none of them is primarily a server player.  Cisco's network device incumbency means it's at risk of losing more than it gains.  IBM and Oracle are primarily software players, and thus would have to establish an unusually strong software position.

The second group on my list is the optical incumbents.  In this group, we’ll count Ciena, Infinera, and Adva Optical, with a nod to Alcatel-Lucent.  The advantage this group holds is that you can’t push bits you don’t have, and optical transport is at the heart of bit creation.  If we could learn to use capacity better, we could trade the cost of capacity against the cost of grooming bandwidth and operating the network.

Optical people can’t stand pat because there’s not much differentiation in something that’s either a “1” or a “0”, but they can build gradually upward from their secure base.  The pure-play optical people have a real shot at doing something transformational if they work at it.  Alcatel-Lucent gets a “nod” because while they have the Nuage SDN framework, one of the very best, they still see themselves as box players and they will likely stay focused on being what they see in their mirror.

The third group on the list is the network equipment second-tier players.  Here we'll find Brocade, Juniper, and Extreme, with a nod to other smaller networking players.  This group is at risk as money shifts out of network-specific iron to servers, just as the bigger incumbents are, but unlike them these players don't have the option of standing pat.  All these companies would die in a pure commodity market for network gear, and that's where we're heading.  Brocade realizes this; the rest seem not to.

What makes this group potentially interesting is that they have a constituency among the network buyers that most of the more favored groups really don’t have.  They could, were they to be very aggressive and smart with SDN and NFV, create some momentum in 2015 that could be strong enough to take them into the first tier of vendors or at least get them bought.  They could also fall flat on their face, which is what most seem to be doing.

The fourth group is the network incumbents, which means Alcatel-Lucent, Cisco, Huawei, and NSN, with a nod to Ericsson and the OSS/BSS guys.  The problem for this group is already obvious; any hint that the future network won't look like the current one will make everyone tighten their purse strings.  Thus, even if these guys could expect to take a winning position in the long run, they'd suffer for quite a few quarters.  Wall Street doesn't like that.

Ericsson and the OSS/BSS players here are somewhat wild cards.  Operations isn't going to drive network change given the current political realities among the telcos and the size of the OSS/BSS upside.  Ericsson has augmented its operations position with a strong professional services initiative, and this gives them influence beyond their operations products.  However, integration of operations and networking remains a big issue only as long as nobody productizes it effectively.

Virtually all of the players (and certainly all of the groups) are represented in early SDN and NFV activity, but what I see so far in the real world is that only three vendors are really staking out a position.  HP, as I’ve said before, has the strongest NFV and SDN inventory of any of the larger players and it’s in the group that has the greatest incentive for early success.  Alcatel-Lucent’s broad product line and early CloudBand positioning helped it to secure some attention, and Ericsson is taking advantage of the fact that other players aren’t stretching their story far enough to cover the full business case.  In theory, any of these guys could win.

That "business case" comment is the key, IMHO.  SDN and NFV could bring massive benefits, but not if they're focused on per-box capex savings.  If all we're trying to do is make the same networks we have today, but with cheaper gear, then Huawei wins and everyone else might as well start making toys or start social-network businesses.  Operators now say that operations efficiency and service agility are the real drivers that would be needed.  The players who can extend their solutions far enough to achieve both of these benefits even usefully, much less optimally, will drive the market in 2015 and 2016.  If one or two manage that while others languish, nobody else will have a shot and the industry will remake itself around the new giants.  That could be transformational.

Starting next week, I’ll look at these groups in detail and talk about who’s showing strength and who seems to be on their last legs.  Check back and see where your own company, your competitors, or your suppliers fall!


Raising the Bar on SDN and Virtual Routing

One of the questions being asked by both network operators and larger enterprises is how SDN can play a role in their future WAN.  In some sense, though it’s an obvious question, it’s the wrong one.  The broad issue is how virtualization can play; SDN is one option within that larger question.  If you look at the issues systematically, it’s possible to uncover some paths forward, and even to decide which is likely to bear the most fruit.  But most important, it’s likely you’ll realize that the best path to the future lies in the symbiosis of all the virtualization directions.  And realize why we may not be taking it already.

Virtualization lets us create real behaviors by committing resources to abstract functionality.  If we apply it to connection/transport (not to service features above the network) there are two ways that it can change how we build networks.  The first is to virtualize network devices (routers and switches) and then commit them in place of the real thing.  The second is to virtualize network paths, meaning tunnels.  I would assert that in the WAN, the first of these two things is a cloud/NFV application and the second is an SDN application.

When a user connects to a service, they get two tangible things.  One is a conduit for data-plane traffic that delivers stuff according to the forwarding rules of the service.  The other is "control plane" traffic that isn't addressed to the other user(s) but to the network/service itself.  If you connected two users with a pipe that carried IP or Ethernet, chances are they wouldn't be able to communicate, because the control exchanges they expect couldn't take place; the network elements designed to support them wouldn't exist.

SDN in OpenFlow form doesn’t do control packets.  If we want an SDN network to look like an Ethernet or router network, we have to think in terms of satisfying all of the control- and data-plane relationships.  For IP in particular, that likely means providing a specific edge function to emulate the real devices.  The question becomes “why bother?” when you have the option of just deploying virtual routers or switches.
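To make the edge-function idea concrete, and to show why it's not a trivial add-on, here is a minimal Python sketch of the kind of dispatcher such an edge element might implement: control-plane frames (ARP in this example) are answered locally on behalf of the SDN fabric, while data-plane frames are pushed into the pre-built forwarding path.  The frame format, the ARP table, and the forward_into_fabric and send_to_user callbacks are all hypothetical placeholders for illustration, not any standard API.

```python
# Hypothetical sketch: an edge function that emulates the control plane an
# IP/Ethernet user expects, while handing data-plane traffic to the SDN fabric.

ETH_TYPE_ARP = 0x0806
ETH_TYPE_IP = 0x0800

# Illustrative local state the edge element would maintain.
arp_table = {"10.0.0.1": "00:11:22:33:44:01"}   # IP -> MAC of the emulated gateway

def handle_frame(frame, forward_into_fabric, send_to_user):
    """Classify an incoming user frame as control-plane or data-plane traffic."""
    if frame["eth_type"] == ETH_TYPE_ARP:
        # Control plane: answer locally, because no real router sits behind
        # the OpenFlow fabric to do it for us.
        target_ip = frame["arp_target_ip"]
        if target_ip in arp_table:
            reply = {
                "eth_type": ETH_TYPE_ARP,
                "arp_op": "reply",
                "arp_sender_ip": target_ip,
                "arp_sender_mac": arp_table[target_ip],
                "arp_target_mac": frame["src_mac"],
            }
            send_to_user(reply)
    elif frame["eth_type"] == ETH_TYPE_IP:
        # Data plane: push into the pre-built SDN forwarding path (the tunnel).
        forward_into_fabric(frame)
    # Other control protocols (ICMP, routing adjacencies, LLDP, and so on)
    # would need equivalent emulation before the service "looks like" IP/Ethernet.
```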

We couldn’t build the Internet on virtual routing alone; some paths have too much traffic in aggregate.  What we could do is to build any large IP network for an individual user, or even individual service, by segregating its traffic below the IP layer and doing its routing on a per-user, per-service basis.  That’s the biggest value of virtual routing; you can build your own “VPN” with virtual devices instead of with a segment of a real device.  Now your VPN is a lot more private.

The challenge with this is the below-IP segregation, which is where SDN comes in.  A virtual router looks like a router.  SDN creates what looks like a tunnel, a pipe.  That's a Level 1 artifact, something that looks like a TDM pipe or an optical trunk or lambda.  The strength of SDN in the WAN, IMHO, lies in its ability to support virtual routing.

To make virtual routing truly useful we have to be able to build a virtual underlayment to our "IP network" that segregates traffic by user/service and does the basic aggregation needed to maintain reasonable transport efficiency.  The virtual subnets that virtual routing creates when used this way are typically going to be contained enough that servers could host the virtual routers we need.  The structure can be agile enough to support reconfiguration in response to failures or even to shifting load and traffic patterns, because the paths the virtual pipes take and the locations of the virtual routers can be determined dynamically.

This model could also help SDN along.  It’s difficult to make SDN emulate a complete end-to-end service, both because of the scaling issues of the central controller and because of the control-plane exchanges.  It’s easy to create an SDN tunnel; a stitched sequence of forwarding paths does that without further need for processing.  Transport tunnel routing isn’t as dynamic as per-user flow routing, so the controller has less to do and the scope of the network could be larger without creating controller demands that tax the performance and availability constraints of real servers.
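As an illustration of the "stitched sequence of forwarding paths" point, here is a hedged Python sketch of what a controller-side function might do: given a precomputed path through the fabric, it emits one simple match/forward rule per hop, and the result behaves like a static Level 1 pipe.  The rule format and the install_rule callback are invented for illustration; this is not OpenFlow or any particular controller's API.

```python
# Hypothetical sketch: turn a precomputed path into a "tunnel" by installing
# one forwarding rule per hop.  Once installed, traffic tagged for this tunnel
# needs no further controller processing.

def stitch_tunnel(tunnel_id, path, install_rule):
    """
    path: ordered list of (switch, in_port, out_port) hops.
    install_rule: callback that pushes a rule to a switch (illustrative only).
    """
    for switch, in_port, out_port in path:
        rule = {
            "match": {"in_port": in_port, "tunnel_id": tunnel_id},
            "action": {"output": out_port},
        }
        install_rule(switch, rule)

# Example: a three-hop tunnel between two virtual-router hosting points.
example_path = [("sw-edge-a", 1, 3), ("sw-core-1", 2, 4), ("sw-edge-b", 1, 2)]
# stitch_tunnel("vpn-42", example_path, install_rule=my_controller_push)
```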

If we suggest this is the appropriate model for a service network, then we can immediately point to something that virtual router vendors need to be able to handle better—the “adjacency problem”.  The trouble with multiplying the tunnels below Level 3 to do traffic segmentation and manage trunk loading is that we may create too many such paths, making it difficult to control failovers.  It’s possible to settle this issue in two basic ways—use static routing or create a virtual BGP core.  Static routing doesn’t work well in public IP networks but there’s no reason it couldn’t be applied in a VPN.  Virtual BGP cores could abstract all of the path choices by generating what looks like a giant virtual BGP router.  You could use virtual routers for this BGP core, or do what Google did and create what’s essentially a BGP edge shell around SDN.
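To illustrate the "virtual BGP core" idea, here is a hedged sketch (plain Python, no real routing stack) of how a core abstraction could hide the mesh of below-IP tunnels: each edge virtual router sees a single logical next-hop, and the core object maps destination prefixes to egress tunnels internally.  All names and structures here are invented for illustration.

```python
# Hypothetical sketch: a "virtual BGP core" that abstracts tunnel adjacencies.
# Edge virtual routers peer with one logical core instead of with each other,
# so adding or failing a tunnel doesn't multiply adjacencies at the edge.

class VirtualCore:
    def __init__(self):
        # prefix -> list of candidate egress tunnels (primary first).
        self.routes = {}

    def advertise(self, prefix, tunnels):
        """An edge (or the SDN controller) registers reachability for a prefix."""
        self.routes[prefix] = list(tunnels)

    def next_tunnel(self, prefix):
        """What an edge router sees: one answer, not a mesh of path choices."""
        candidates = self.routes.get(prefix, [])
        return candidates[0] if candidates else None

    def fail_tunnel(self, tunnel):
        """Failover is handled inside the abstraction, invisibly to the edges."""
        for prefix in self.routes:
            self.routes[prefix] = [t for t in self.routes[prefix] if t != tunnel]

core = VirtualCore()
core.advertise("10.1.0.0/16", ["tun-a-b", "tun-a-c"])
core.fail_tunnel("tun-a-b")
print(core.next_tunnel("10.1.0.0/16"))   # -> "tun-a-c"
```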

This approach of marrying virtual routing with OpenFlow-style SDN could also be adapted to use for the overlay-SDN model popularized by Nicira/VMware.  Overlay SDN doesn’t present its user interface out of Level 2/3 devices, but rather from endpoint processes hosted locally to the user.  It could work, in theory, over any set of tunnels that provide physical connectivity among the endpoint hosting locations, which means we could run it over Layer 1 facilities or over tunnels at L2 or L3.

I mentioned NFV earlier, and I think you can see that virtual routing/switching could be a cloud application or an NFV application.  Both allow for hosting the functionality, but NFV offers more dynamism in deployment/redeployment and more explicit management integration (at least potentially).  If you envisioned a fairly static positioning of your network assets, cloud-based virtual routers/switches would serve.  If you were looking at something more dynamic (likely because it was bigger and more exposed to changes in the conditions of the hosting points and physical connections) you could introduce NFV to optimize placement and replacement.

I think the SDN community is trying to solve too many problems.  I think that virtual router supporters aren’t solving enough.  If we step up to the question of virtual networks for a moment, we can see a new model that can make optimal use of both technologies and at the same time build a better and more agile structure, something that could change security and reliability practices forever and also alter the balance of power in networking.

That's why we can't expect this idea to get universal support.  There are players in the network equipment space (like Brocade) who aren't so exposed to the legacy switch/router market that a shift in approach would hurt them as much as (or more than) it would help.  Certainly server players (HP comes to mind, supported by Intel/Wind River) with aggressive SDN/NFV programs could field something like this.  The mainstream network people, even those with virtual router plans, are likely to be concerned about the loss of revenue from physical switch/router sales.  The question is whether a player with little to lose will create market momentum sufficient to drag everyone along.  We may find that out in 2015.


How Operators Do Trials, and How We Can Expect SDN/NFV to Progress

Since I’ve blogged recently about the progress (or lack of it!) from proof-of-concept to field trials for SDN and NFV, I’ve gotten some emails from you on just what a “field trial” is about.  I took a look at operator project practices in 2013 as a part of my survey, and there was some interesting input on how operators took a new technology from consideration to deployment.  Given that’s what’s likely to start for SDN and NFV in 2015, this may be a good time to look at that flow.

The first thing I found interesting in my survey was that operators didn’t have a consistent approach to transitioning to deployment for new technologies.  While almost three-quarters of them said that they followed specific procedures in all their test-and-trial phases, a more detailed look at their recent or ongoing projects seemed to show otherwise.

Whatever you call the various steps in test-and-trial, there are really three phases that operators will generally recognize.  The first is the lab trial, the second the field trial, and the final one the pilot deployment/test.  What is in each of these phases, or supposed to be in them, sets the framework for proving out new approaches to services, operations, and infrastructure.

Operators were fairly consistent in describing the first of their goals for a lab trial.  A new technology has to work, meaning that it has to perform as expected when deployed as recommended.  Most operators said that their lab trials weren’t necessarily done in a lab; the first step was typically to do a limited installation of new technology and the second to set up what could be called a “minimalist network” in which the new stuff should operate, and then validate the technology itself.

If we cast this process in SDN and NFV terms, what we’d be saying is that the first goal in a lab trial is to see if you can actually build a network of the technical elements and have it pass traffic in a stable way.  The framework in which this validation is run is typically selected from a set of possible applications of that technology.  Operators say that they don’t necessarily pick the application that makes the most sense in the long term, but rather try to balance the difficulties in doing the test against the useful information that can be gained.

One operator made a telling comment about the outcome of a lab trial: "A properly conducted lab trial is always successful."  That meant that the goal of such a trial is to find the truth about the basic technology, not to prove the technology is worthy of deployment.  In other words, it's perfectly fine for a "proof of concept" to fail to prove the concept.  Operators say that somewhere between one in eight and one in six actually do prove the concept; the rest of the trials don't result in deployment.

The next phase of the technology evolution validation process is the field trial, which two operators out of three say has to prove the business case.  The biggest inconsistencies in practices come to light in the transition between lab and field trials, and the specific differences come from how much the first is expected to prepare for the second.

Operators who have good track records with technology evaluation almost uniformly make preparation for a field trial the second goal of the lab trial (after basic technology validation).  That preparation is where the operators’ business case for the technology enters into the process.  A lab trial, says this group, has to establish just what steps have to be proved in order to make a business case.  You advance from lab trial to field trial because you can establish that there are steps that can be taken, that there is at least one business case.  Your primary goal for the field trial is then to validate that business case.

More than half the operators in my survey didn’t formally work this way, though nearly all said that was the right approach.  The majority said that in most cases, their lab trials ended with a “technology case”, and that some formal sponsorship of the next step was necessary to establish a field trial.  Operators who worked this way sometimes stranded 90% of their lab trials in the lab because they didn’t get that next-step sponsorship, and they also had a field trial success rate significantly lower than operators who made field-trial goal and design management a final step in their lab trials.

Most of the "enlightened" operators also said that a field trial should inherit technical issues from the lab trial, if there were issues that couldn't be proved out in the lab.  When I asked for examples of the sort of issue a lab trial couldn't prove, operations integration was the number one point.  The operators agreed that you had to introduce operations integration in the lab trial phase, but also that the lab trials were almost never large enough to expose you to a reasonable set of the issues.  One operator called the issue-determination goal of a lab trial the "sensitivity analysis": this works, but under what conditions?  Can we sustain those conditions in a live service?

One of the reasons for all the drama in the lab-to-field transition is that most operators say this is a political shift as well as a shift in procedures and goals.  A good lab trial is likely run by the office of the CTO, whereas field trials are best run by operations, with a liaison to the CTO team that led the lab trial.  The most successful operators have established cross-organizational teams, reporting directly to the CEO or executive committee, to control new technology assessments from day one to deployment.  That avoids the political transition.

A specific issue operators report in the lab-to-field transition is the framework of the test.  Remember that operators said you’d pick a lab trial with the goal of balancing the expense and difficulty of the trial with the insights you could expect to gain.  Most operators said that their lab-trial framework wasn’t supposed to be the ideal framework in which to make a business case, and yet most operators said they tended to take their lab-trial framework into a field trial without considering whether they actually had a business case to make.

The transition from field trial to pilot deployment illustrates why blundering forward with a technical proof of concept isn’t the right answer.  Nearly every operator said that their pilot deployment would be based on their field-trial framework.  If that, in turn, was inherited from a lab trial or PoC that wasn’t designed to prove a business case, then there’s a good chance no business case has been, or could be, proven.

This all explains the view expressed by operators a year later, in my survey in the spring of 2014.  Remember that they said that they could not, at that point, make a business case for NFV and had no trials or PoCs in process that could do that.  With respect to NFV, the operators also indicated they had less business-case injection into their lab trial or PoC processes than usual, and less involvement or liaison with operations.  The reason was that NFV had an unusually strong tie to the CTO organization, which they said was because NFV was an evolving standard and standards were traditionally handled out of the CTO’s organization.

For NFV, and for SDN, this is all very important for operators and vendors alike.  Right now, past history suggests that there will be a delay in field trials where proper foundation has not been laid in the lab, and I think it’s clear that’s been happening.  Past history also suggests that the same conditions will generate an unusually high rate of project failure when field trials are launched, and a longer trial period than usual.

This is why I'm actually kind of glad that the TMF and the NFV ISG haven't addressed the operations side of NFV properly, and that SDN operations is similarly under-thought.  What we probably need most now is a group of ambitious vendors who are prepared to take some bold steps to test their own notions of the right answer.  One successful trial will generate enormous momentum for the concept that succeeds, and quickly realign the efforts of other operators—and vendors.  That's what I think we can expect to see in 2015.


There’s Hope for NFV Progress in 2015

Since I blogged recently on the challenges operators faced in making a business case for NFV, I've gotten quite a few emails from operators themselves.  None disagreed with my major point—that the current trial and PoC activity isn't building a business case for deployment—but they did offer some additional color, some positive and some negative, on NFV plans.

In my fall survey last year, operators’ biggest concern about NFV was that it wouldn’t work, and their second-greatest concern was that it would become a mess of proprietary elements, something like the early days of routing when vendors had their own protocols to discourage open competition.  The good news is that the majority of operators say these concerns have been reduced.  They think that NFV will “work” at the technical level, and they think that there will be enough openness in NFV to keep the market from disintegrating into proprietary silos.

The bad news is that the number of operators who feel that progress has been made has actually declined since the spring, and in some cases operators who told me in April that they were pleased with the progress of their NFV adventures now had some concerns.  A couple had some very specific and similar views that are worth reviewing.

According to the most articulate pair of operators, we have proved the “basic concept of NFV”, meaning that we have proved that we can take cloud-hosted network features and substitute them for the features of appliances.  Their concerns lie in NFV beyond the basics.

First and foremost, these operators say that they cannot reliably estimate the management burden of an NFV deployment.  There is no doubt in their minds that NFV could push down capex, but also no doubt that it would create a risk of increased opex at the same time.  They don't know how much of an opex increase they'd face, so they can't validate net savings.  Part of the reason is that they don't have a reliable and extensible management model for NFV, but part is more basic.  Operators say that they don't know how well NFV will perform at scale.  You need efficient resource pools to achieve optimal savings on capex, which means you need a large deployment.  So far they don't have enough data on "large" NFV to say whether opex costs rise in a linear way.  In fact, they say they can't even be sure that all of the tweaks to deployment policy—things ranging from just picking the best host to horizontal scaling and reconfiguration of services under load or failure—will be practical given the potential impact they'd have on opex.  One, reading all the things in a VNF Descriptor, said "This is looking awfully complicated."
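To see why it looks that way, here is a deliberately simplified, hypothetical rendering (as a Python dict, not the actual ETSI schema) of the kinds of deployment policy a VNF Descriptor carries; every field below is something operations has to cost, monitor, and manage.

```python
# Hypothetical, heavily simplified view of the policy a VNF Descriptor implies.
# Field names are illustrative only, not the ETSI VNFD schema.
vnf_descriptor = {
    "vnf_id": "vFirewall-example",
    "vdus": [  # one entry per hostable component
        {"name": "fw-dataplane", "vcpus": 4, "memory_gb": 8,
         "scaling": {"min_instances": 1, "max_instances": 8,
                     "scale_out_metric": "throughput_mbps", "threshold": 800}},
        {"name": "fw-mgmt", "vcpus": 1, "memory_gb": 2,
         "scaling": {"min_instances": 1, "max_instances": 2}},
    ],
    "placement": {"affinity": ["fw-dataplane", "fw-mgmt"],
                  "anti_affinity_zones": 2},      # survive a zone failure
    "connection_points": ["user-side", "network-side", "management"],
    "lifecycle_hooks": ["instantiate", "scale", "heal", "terminate"],
    "monitoring": ["cpu_util", "session_count", "packet_loss"],
}
# Every one of these knobs is a deployment decision MANO has to make, and a
# potential opex cost the trials haven't yet measured at scale.
```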

The second concern these operators expressed was the way that NFV integrated with NFVI (NFV Infrastructure).  They are concerned that we've not tested the MANO-to-VIM (Virtual Infrastructure Manager) relationship adequately, and haven't even addressed the VIM-to-NFVI relationship fully.  Most of the trials have used OpenStack, and it's not clear from the trials just how effective it will be in addressing network configuration changes.  Yes, we can deploy, but OpenStack is essentially a single-thread process.  Could a major problem create enough service disruption that the VIM simply could not keep up?
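A back-of-envelope way to frame that worry: if deployment and reconfiguration requests are handled one at a time, the backlog grows whenever arrivals outpace completions.  The numbers below are invented purely for illustration.

```python
# Hypothetical arithmetic: a serialized VIM during a failure storm.
requests_per_minute = 120      # reconfiguration requests arriving (assumed)
seconds_per_request = 2.0      # average handling time per request (assumed)

service_rate = 60 / seconds_per_request          # 30 requests per minute
backlog_growth = requests_per_minute - service_rate
print(f"Backlog grows by {backlog_growth:.0f} requests per minute")
# After a 10-minute event, roughly 900 requests would still be queued, which
# is exactly the kind of service-disruption exposure operators say is untested.
```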

There are also concerns about the range of things a VIM might support.  If you have two or three clouds, or cloud data centers, do you have multiple VIMs?  Most operators think you do, but these two operators say they aren’t sure how MANO would be able to divide work among multiple VIMs.  How do you represent a service that has pools of resources with different control needs?  This includes the “how do I control legacy elements” question.  All of the operators said they had current cloud infrastructure they would utilize in their next-phase NFV trial.  All had data center switches and network gateways that would have to be configured for at least some situations.  How would that work?  Is there another Infrastructure Manager?  If so, again, how do you represent that in a service model at the MANO level?
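One hedged way to picture the multi-VIM question is a MANO-level dispatch table that maps each resource pool in a service model to whichever infrastructure manager controls it, including a hypothetical "legacy IM" for the gateways and data center switches.  Nothing here is drawn from the ETSI interfaces; it is just an illustration of the representation problem the operators are raising.

```python
# Hypothetical sketch: MANO dividing deployment work among several
# infrastructure managers (cloud VIMs plus a legacy-equipment IM).

infrastructure_managers = {
    "cloud-east": {"type": "vim",       "endpoint": "https://vim-east.example"},
    "cloud-west": {"type": "vim",       "endpoint": "https://vim-west.example"},
    "legacy-gw":  {"type": "legacy-im", "endpoint": "https://netconf-gw.example"},
}

service_model = [
    {"component": "vEPC-control",      "pool": "cloud-east"},
    {"component": "vEPC-user-plane",   "pool": "cloud-west"},
    {"component": "dc-gateway-config", "pool": "legacy-gw"},
]

def dispatch(service_model, managers):
    """Group service components by the manager responsible for their pool."""
    work = {}
    for item in service_model:
        if item["pool"] not in managers:
            raise KeyError(f"no infrastructure manager for pool {item['pool']}")
        work.setdefault(item["pool"], []).append(item["component"])
    return work

print(dispatch(service_model, infrastructure_managers))
# {'cloud-east': ['vEPC-control'], 'cloud-west': ['vEPC-user-plane'],
#  'legacy-gw': ['dc-gateway-config']}
```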

Then there’s SDN.  One operator in the spring said that the NFV-to-SDN link was a “fable connected to a myth”.  The point was that they were not confident of exactly what SDN would mean were it to be substituted for traditional networking in part of NFVI.  They weren’t sure how NFV would “talk to” SDN and how management information in particular would flow.  About two-thirds of operators said that they could have difficulties taking NFV into full field trials without confidence on the SDN integration issue.  They weren’t confident in the spring, but there is at least some increase in confidence today (driven by what they see as a convergence on OpenDaylight).

You can make an argument that these issues are exactly what a field trial would be expected to address, and in fact operators sort of agree with that.  Their problem is that they would expect their lab trials to establish a specific set of field-trial issues and a specific configuration in which those issues could be addressed.  The two key operators say that they can't yet do that, but they aren't convinced that spending more time in the lab will give them a better answer.  That means they may have to move into a larger-scale trial without the usual groundwork having been laid, or devise a different lab trial to help prepare for wider deployment.

That would be a problem because nearly all the operators say that they are being charged by senior management to run field trials for NFV in 2015.  Right now, most say that they're focusing on the second half of the year—likely because if you're told you need to do something you're not sure you're ready for, you delay as long as you can.

What would operators like to see from NFV vendors?  Interestingly, I got an answer to that over a year ago at a meeting in Europe.  One of the kingpins of NFV, and a leader in the ISG, told me that the way operators needed to have NFV explained was in the context of the service lifecycle.  Take a service from marketing conception to actual customer deployment, he said, and show me how it progresses through all the phases.  This advice is why I’ve taken a lifecycle-driven approach in explaining my ExperiaSphere project.  But where do we see service lifecycles in vendor NFV documentation?

I asked the operators who got back to me after my blog, and the two “thought leaders” in particular, what they thought of the “lifecycle-driven” approach.  The general view was that it would be a heck of a lot better way to define how a given NFV product worked than the current approach, which focuses on proving you can deploy.  The two thought leaders said flatly that they didn’t believe any vendor could offer such a presentation of functionality.

I’m not sure I agree with that, though I do think that nobody has made such a service-workflow model available in public as yet.  There are at least a couple of players who could tell the right story the right way, perhaps not covering all the bases but at least covering enough.  I wish I could say that I’d heard vendors say they’d be developing a lifecycle-centric presentation on NFV, or that my operator friends had heard it.  Neither, for now, is true, but I do have to say I’m hopeful.

We are going to see large-scale NFV trials in 2015, period.  Maybe only one, but at least that.  Once any vendor manages to get a really credible field trial underway, it's going to be impossible for others to avoid the pressure to do the same.  So for all those frustrated by the pace of NFV adoption, be patient because change is coming.


Happy Holidays from CIMI Corporation

We at CIMI Corporation wish you all the happiest of holiday seasons and the best for the New Year.

Click HERE for our holiday card to you.

Tom Nolle

President


Domain 2.0, Domains, and Vendor SDN/NFV

Last week we had some interesting news on AT&T’s Domain 2.0 program and some announcements in the SDN and NFV space.  As is often the case, there’s an interesting juxtaposition between these events that sheds some light on the evolution of the next-gen network.  In particular, it raises the question of whether either operators or vendors have got this whole “domain” thing right.

Domain 2.0 is one of those mixed-blessing things.  It’s good that AT&T (or any operator) recognizes that it’s critical to look for a systematic way of building the next generation of networks.  AT&T has also picked some good thinkers in its Domain 2.0 partners (I particularly like Brocade and Metaswitch), and it represents its current-infrastructure suppliers there as well.  You need both the future and the present to talk evolution, after all.  The part that’s less good is that Domain 2.0 seems a bit confused to me, and also to some AT&T people who have sent me comments.  The problem?  Again, it seems to be the old “bottom-up-versus-top-down” issue.

There is a strong temptation in networking to address change incrementally, and if you think in terms of incremental investment, then incremental change is logical.  The issue is that “incremental change” can turn into the classic problem of trying to cross the US by just making turns at random at intersections.  You may make optimal choices per turn based on what you see, but you don’t see the destination.  Domains without a common goal end up being silos.

What Domain 2.0 or any operator evolution plan has to do is begin with some sense of the goal.  We all know that we’re talking about adding a cloud layer to networking.  For five years, operators have made it clear that whatever else happens, they’re committed to evolving toward hosting stuff in the cloud.

The cloud, in the present, is a means of entering the IT services market.  NFV also makes it a way of hosting network features in a more agile and elastic manner.  So we can say that our cloud layer of the future will have some overlap with the network layer of the future.

Networking, in the sense most think of it (Ethernet and IP devices), is caught between two worlds, change-wise.  On the one hand, operators are very interested in getting more from lower-layer technology like agile optics.  They'd like to see core networking and even metro networking handled more through agile optical pipes.  By extension, they'd like to create an electrical superstructure on top of optics that can do whatever happens to be 1) needed by services and 2) not yet fully efficient if implemented in pure optical terms.  Logically, SDN could create this superstructure.

At the top of the current IP/Ethernet world we have increased interest in SDN as well, mostly to secure two specific benefits—centralized control of forwarding paths to eliminate the current adaptive route discovery and its (to some) disorder, and improved traffic engineering.  Most operators also believe that if these are handled right, they can reduce operations costs.  That reduction, they think, would come from creating a more “engineered” version of Level 2 and 3 to support services.  Thus, current Ethernet and IP devices would be increasingly relegated to on-ramp functions—at the user edge or at the service edge.

At the service level, it’s clear that you can use SDN principles to build more efficient networks to offer Carrier Ethernet, and it’s very likely that you could build IP VPNs better with SDN as well.  The issue here is more on the management side; the bigger you make an SDN network the more you have to consider the question of how well central control could be made to work and how you’d manage the mesh of devices. Remember, you need connections to manage stuff.

All of this new stuff has to be handled with great efficiency and agility, say the operators.  We have to produce what one operator called a “third way” of management that somehow bonded network and IT management into managing “resources” and “abstractions” and how they come together to create applications and services.  Arguably, Domain 2.0 should start with the cloud layer, the agile optical layer, and the cloud/network intersection created by SDN and NFV.  To that, it should add very agile and efficient operations processes, cutting across all these layers and bridging current technology to the ultimate model of infrastructure.  What bothers me is that I don’t get the sense that’s how it works, nor do I get the sense that goal is what’s driven which vendors get invited to it.

Last week, Ciena (a Domain 2.0 partner) announced a pay-as-you-earn NFV strategy, and IMHO the approach has both merit and issues.  Even if Ciena resolves the issue side (which I think would be relatively easy to do), the big question is why the company would bother with a strategy way up at the service/VNF level when its own equipment is down below Level 2.  The transformation Ciena could support best is the one at the optical/electrical boundary.  Could there be an NFV or SDN mission there?  Darn straight, so why not chase that one?

If opportunity isn’t a good enough reason for Ciena to try to tie its own strengths into an SDN/NFV approach, we have another—competition.  HP announced enhancements to its own NFV program, starting with a new version of its Director software, moving to a hosted version of IMS/EPC, and then on to a managed API program with components offered in VNF form.  It would appear that HP is aiming at creating an agile service layer in part by creating a strong developer framework.  Given that HP is a cloud company and that it sells servers and strong development tools already, this sort of thing is highly credible from HP.

It’s hard for any vendor to build a top-level NFV strategy, which is what VNFs are a part of, if they don’t really have any influence in hosting and the cloud.  It’s hard to tie NFV to the network without any strong service-layer networking applications, applications that would likely evolve out of Level 2/3 behavior and not out of optical networking.  I think there are strong things that optical players like Ciena or Infinera could do with both SDN and NFV, but they’d be different from what a natural service-layer leader would do.

Domain 2.0 may lack high-level vision, but its lower-level fragmentation is proof of something important, which is that implementation of a next-gen model is going to start in different places and engage different vendors in different ways.  As things evolve, they’ll converge.  In the meantime vendors will need to support their own strengths to maximize their influence on the evolution of their part of the network, but also keep in mind what the longer-term goals of the operator are.  Even when the operator may not have articulated them clearly, or even recognized them fully.


Public Internet Policy and the NGN

The FCC is now considering a new position on Net Neutrality, and also a new way of classifying multi-channel video programming distributors (MVPDs) that would allow streaming providers who offer "linear" programming (continuous distribution, similar to channelized RF) rather than on-demand content to qualify as MVPDs.  That would enable them to negotiate for licensing deals on programming as cable companies do.  The combination could raise significant issues, and problems for ISPs.  It could even create a kind of side-step of the Internet, and some major changes in how we build networks.

Neutrality policy generally has two elements.  The first defines what exactly ISPs must do to be considered “neutral”, and the second defines what is exempt from the first set of requirements.  In the “old” order published under Chairman Genachowski, the first element said you can’t interfere with lawful traffic, especially to protect some of your own service offerings, can’t generally meter or throttle traffic except for reasons of network stability, and can’t offer prioritization or settlement among ISPs without facing FCC scrutiny.  In the second area, the order exempted non-Internet services (business IP) and Internet-related services like (explicitly) content delivery networks and (implicitly) cloud computing.

The DC Court of Appeals trashed this order, leaving the FCC with what it said was sufficient authority to prevent interference with lawful traffic but not much else.  Advocates of a firmer position on neutrality want to see an order that bars any kind of settlement or payment other than for access, and implicitly bars settlement among providers and QoS (unless someone decided to do it for free).  No paid prioritization, period.  Others, including most recently a group of academia, say that this sort of thing could be very destructive to the Internet.

How?  The obvious answer is that if neutrality rules were to force operators into a position where revenue per bit fell below acceptable margins on cost per bit, they’d likely stop investing in infrastructure.  We can see from both AT&T’s and Verizon’s earnings reports that wireline capex is expected to decline, and this is almost surely due to the margin compression created by the converging cost and price.  Verizon just indicated it would grow wireless capex, and of course profit margins are better in wireless services.

You can see that a decision to rule that OTT players like Aereo (now in Chapter 11) could now negotiate for programming rights provided they stream channels continuously might create some real issues.  It’s not certain that anyone would step up to take on this newly empowered OTT role, that programming rights would be offered to this sort of player, that consumers would accept the price, or that the new OTT competitors could be profitable at the margin, but suppose it happened.  What would be the result?

Continuous streaming of video to a bunch of users over the Internet would surely put a lot of additional strain on the ISPs.  One possible outcome would be that they simply reach price/cost crossover faster and let the network degrade.  The FCC can’t order a company to do something not profitable, but they could in theory put them to the choice “carry at a loss or get out of the market”.  I don’t think that would be likely, but it’s possible.  Since that would almost certainly result in companies exiting the Internet market, it would have a pretty savage impact.

There’s another possibility, of course, which is that the ISPs shift their focus to the stuff that’s exempt from neutrality.  That doesn’t mean inventing a new service, or even shifting more to something like cloud computing.  It means framing what we’d consider “Internet” today as something more cloud- or CDN-like.

Here’s a simple example.  The traditional scope of neutrality rules as they relate to video content would exclude CDNs.  Suppose operators pushed their CDNs to the central office, so that content jumped onto “the Internet” a couple miles at most from the user, at the back end of the access connection.  Operator CDNs could now provide all the video quality you wanted as long as you were using them.  Otherwise, you’re flowing through infrastructure that would now be unlikely to be upgraded very much.

Now look at my postulated opportunity for mobile/behavioral services through the use of a cloud-hosted personal agent.  The mobile user asks for something and the request is carried on the mobile broadband Internet connection to the edge of the carrier’s cloud.  There it hops onto exempt infrastructure, where all the service quality you need could be thrown at it.  No sharing required here, either.  In fact, even if you were to declare ISPs to be common carriers, cloud and CDN services are information services separate from the Internet access and sharing regulations would not apply.  It’s not even clear that the FCC could mandate sharing because the framework of the legislation defines Title II services to exclude information services.

You can see from this why "carrier cloud" and NFV are important.  On the one hand, the future will clearly demand operators rise above basic connection and transport, not only because of current profit threats but because it's those higher-level things that are immune from neutrality risks.  The regulatory uncertainty only proves that the approach to the higher level can't be what I'll call a set of opportunity silos; we need to have an agile architecture that can accommodate the twists and turns of demand, technology, and (now) public policy.

On the other hand, the future has to evolve, if not gracefully then at least profitably, from the past.  We have to be able to orchestrate everything we now have, we have to make SDN interwork with what we now have, and we have to operationalize services end to end.  Further, legacy technology at the network level (at least at the lower OSI layers) isn’t displaced by SDN and NFV, it’s just morphed a bit.  We’ll still need unified operations even inside some of our higher-layer cloud and CDN enclaves, and that unified operations will have to unify the new cloud and the old connection/transport.

One of the ironies of current policy debates, I think, is that were we to have let the market evolve naturally, we’d have had settlement on the Internet, pay for prioritization by consumer or content provider, and other traditional network measures for a decade or more.  That would have made infrastructure more profitable to operators, and stalled out the current concerns about price/cost margins on networks.  The Internet might look a little different, the VCs might not have made as much, but in the end we’d have something logically related to the old converged IP model.  Now, I think, our insistence on “saving” the Internet has put more of it—and its suppliers—at risk.


What’s Involved in Creating “Service Agility?”

“Service agility” or “service velocity” are terms we see more and more every day.  NFV, SDN, and the cloud all rely to a degree—even an increasing degree—on this concept as a primary benefit driver.  There is certainly a reason to believe that in the most general case, service agility is very powerful.  The question is whether that most general case is what people are talking about, and are capable of supporting.  The sad truth is that our hype-driven industry tends to evolve drivers toward the thing most difficult to define and disprove.  Is prospective execution of our agility/velocity goal that nebulous?

Services begin their life in the marketing/portfolio-management part of a network operator's organization, where the responsibility is to identify things that could be sold profitably and in enough volume to justify the cost.  Ideally, the initial review of the new service opportunity includes a description of the features needed, the acceptable price points, how the service will get to market (prospecting and sales strategies), and the competition.

From this opportunity-side view, a service has to progress through a series of validations.  The means of creating the service has to be explored and all options costed out, and the resulting choice(s) run through a technology trial to validate that the stuff will at least work.  A field trial would then normally be run, aimed at testing the value proposition to the buyer and the cost (capex and opex) to the seller.  From here, the service could be added to the operator’s portfolio and deployed.

Today, this process overall can often take several years.  If the opportunity is real, then it’s easy to see how others (OTT competitors for example) could jump in faster and gain a compelling market position before a network operator even gets their stuff into trial.  That could mean the difference between earning billions in revenue and spending a pile of cash to gain little or no market share.  It’s no wonder that “agility” is a big thing to operators.

But can technologies like SDN, NFV, and the cloud help here?  The service cycle can be divided into four areas—opportunity and service conceptualization, technology validation and costing, field operations and benefit validation, and deployment.  How do these four areas respond to technology enhancements?  That’s the almost-trillion-dollar question.

There are certainly applications that could be used to analyze market opportunities, but those applications exist now.  If new technology is to help us in this agility area, it has to be in the conceptualization of a service—a model of how the opportunity would be addressed.  Today, operators have a tendency to dive too deep too fast in conceptualizing.  Their early opportunity analysis is framed in many cases by a specific and detailed execution concept.  That’s in part because vendors influence service planners to think along vendor-favorable lines, but also in part because you have to develop some vision of how the thing is going to work, and operators have few options beyond listening to vendor approaches.

If we think of orchestration correctly, we divide it into “functional” composition of services from features, and “structural” deployment of features on infrastructure.  A service architect conditioned to this sort of thinking could at the minimum consider the new opportunity in terms of a functional composition.  At best, they might have functional components in their inventory that could serve in the new mission.  Thus, NFV’s model of orchestration could potentially help with service conceptualization.

Where orchestration could clearly help, again presuming we had functional/structural boundaries, would be in the formulation of a strategy and the initiation of a technology trial.  The key point here is that some sort of “drag-and-drop” functional orchestration to test service structures could be easy if you had 1) functional orchestration, 2) drag-and-drop or an easy GUI, and 3) actual functional atoms to work with.  A big inventory of functional elements could be absolutely critical for operators, in short, because it could make it almost child’s play to build new services.

Structural orchestration could also help here.  If a service functional atom can be realized in a variety of ways as long as the functional requirements are met (if the abstraction is valid, in other words), then a lab or technology trial deployment could tell operators a lot more because it could be a true functional test even if the configuration on which it deployed didn’t match a live/field operation.  Many DevOps processes are designed to be pointed at a deployment environment—test or field.  It would be easy to do that with proper orchestration.
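Here is a hedged Python sketch of that functional/structural split: the same functional atoms are bound to different structural recipes depending on whether the service is being deployed into a lab trial or a field configuration.  The recipe contents and the deploy callback are invented for illustration, not any specific orchestrator's model.

```python
# Hypothetical sketch: one functional composition, multiple structural recipes.

functional_service = ["access", "firewall", "wan-optimizer"]   # what the service does

structural_recipes = {
    # how each function is realized in a given environment
    "lab": {
        "access":        {"realize_as": "soft-switch", "hosts": 1},
        "firewall":      {"realize_as": "vnf",         "hosts": 1},
        "wan-optimizer": {"realize_as": "vnf",         "hosts": 1},
    },
    "field": {
        "access":        {"realize_as": "existing-edge-router"},
        "firewall":      {"realize_as": "vnf", "hosts": 2, "ha": True},
        "wan-optimizer": {"realize_as": "vnf", "hosts": 2, "ha": True},
    },
}

def orchestrate(service, environment, deploy):
    """Bind each functional atom to the structural recipe for this environment."""
    for function in service:
        recipe = structural_recipes[environment][function]
        deploy(function, recipe)   # deploy() is an illustrative callback

# The same functional test plan runs in either environment:
# orchestrate(functional_service, "lab", deploy=print)
# orchestrate(functional_service, "field", deploy=print)
```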

The transition to field trials, and to deployment, would also be facilitated by orchestration.  A functional atom can be tested against one configuration and deployed on another by changing the structural recipes, which is easier to test with and accommodates variations in deployment better.  In fact, it would be possible for an operator to ask vendors to build structural models of operator functional atoms and test them in vendor labs, or to use third parties.  You do have to ensure what I'll call "structure-to-function" conformance, but that's a fairly conventional linear test of how exposed features are realized.

We now arrive at the boundary between what I’d call “service agility” and another thing with all too many names.  When a service is ordered, it takes a finite time to deploy it.  That time is probably best called “time to revenue” or “provisioning delay”, but some are smearing the agility/velocity label over this process.  The problem is that reducing time-to-revenue has an impact only on services newly ordered or changed.  In addition, our surveys of buyers consistently showed that most enterprise buyers actually have more advanced notice of a service need than even current operator provisioning delays would require.  How useful is it to be able to turn on service on 24 hours’ notice when the buyer had months to plan the real estate, staffing, utilities, etc?

The big lesson to be learned, in my view, is that “service agility” is a lot more than “network agility”.  Most of the processes related to bringing new services to market can’t be impacted much by changes in the network, particularly in changes to only part of the network as “classic NFV” would propose.  We are proposing to take a big step toward agile service deployment and management, but we have to be sure that it’s big enough.

We also have to be sure that measures designed to let network operators “compete with OTTs” don’t get out of hand.  OTTs have one or both of two characteristics; their revenues come from ads rather than from service payments, and their delivery mechanism is a zero-marginal-cost pipe provided by somebody else.  The global adspend wouldn’t begin to cover network operator revenues even if it all went to online advertising, so the operators actually have an advantage over the OTTs—they sell stuff consumers pay for, bypassing the issues of indirect revenues.  Their disadvantage is that they have to sustain that delivery pipe, and that means making it at least marginally profitable no matter what goes on above.

That’s what complicates the issue of service agility for operators, and for SDN or NFV or even the cloud.  You have to tie services to networks in an explicit way, to make the network valuable at the same time that you shift the focus of what is being purchased by the buyer to things at a higher level.  Right now, we’re just dabbling with the issues and we have to do better.


Is Ciena’s Agility Matrix Agile Enough?

NFV, as I’ve said before in blogs, is a combination of three things—the MANO platform that orchestrates and runs services, the NFV Infrastructure on which stuff is run/hosted, and the VNFs that provide the functionality.  You need all of them to have “NFV” and it’s not always clear just where any of them will come from, what exactly will be provided, or what the price will be.  Uncertainty is an enemy of investment, so that could inhibit NFV deployment.

VNFs have been a particular problem.  Many of the network functions that are targets for early virtualization are currently offered as appliances, and the vendors of these appliances aren't anxious to trash their own revenues and profits to help the operators save money.  One issue that's come up already is the fact that many VNF providers want to adopt a "capital license" model for distribution.  This would mean that the network operator pays for a license up front, much like they pay for an appliance today.  It's easy to see how this suits a vendor.

From the perspective of the network operator, the problem with this is that it’s dangerously close to being benefit-neutral and at the same time risk-generating.  The VNF licensing charges, according to at least some operators, are close to the price of the box the VNF replaces; certainly the cost of the license and the servers needed for hosting are very close.  This, at a time when it’s not certain just how much it will cost to operationalize VNFs, how much they might impact customer SLAs, or even how efficient the hosted resource pool will be.

Ciena has a proposed solution for operators in its Agility Matrix, a kind of combination of VNF platform and partnership program.  VNF providers put their offerings in a catalog which becomes the foundation for the creation of NFV services.  The VNFs are orchestrated into services when ordered, and the usage of the VNFs is metered to establish charges paid by the operator.  What this does is create what Ciena calls a “pay as you earn” model, eliminating VNF licensing fees.

There is no question that Agility Matrix addresses a problem, which is the combination of “first risk” and “first cost” that accompanies any new service.  The question is whether operators will find this approach compelling, not so much in the short term (all that “first” stuff) but in the longer term.  That may be complicated.

The first point is that Ciena doesn’t propose to host the VNFs themselves, but to use carrier resources to host and connect.  NFVI, in short, is still the operator’s, so the operator will still have to deploy resources to offer services in those “first” days.  That means that some cost and risk are not going to be displaced by Agility Matrix.  However, most operators would probably run screaming from a vendor proposal to host VNFs—to provide what would be essentially a “SaaS” framework of VNFs for operators to integrate—because operators would fear the higher cost of hosting and the commitment to a third party.

The second point is the utility of having VNF choices.  Obviously not all VNFs will be in the catalog.  It's also true that many operators already know who they want their VNF partners to be and are already in relationships with them, either for CPE or in some cases for hosted elements.  The biggest value of Agility Matrix comes when the operator is flexible enough to grab functionality from the catalog for most of their VNF needs.  If the VNF they want is already available to them, or isn't in the catalog, then they have to go outside Agility Matrix for their solution, and every such step makes the concept less useful.

The third point is that network operators want an exit strategy from these pay-as-you-go systems since they perceive that in most cases their risk will decline as their customer volume mounts, and their own leverage with VNF vendors to negotiate license charges will increase.  While the fact that Ciena’s not trying to take over hosting, only licensing, makes things easier, Agility Matrix doesn’t so far present an option to shift to a licensed approach down the line.  The operator could work through the process of taking VNF control in-house on their own (there are no contractual lock-ins), but it might create service disruptions and would likely involve a change in service-building and management.  Perpetual pay-as-you-go is a risk; Alcatel-Lucent had an Open API Service designed to build a cross-provider development framework by charging a small usage fee, and it wasn’t successful.
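To see why operators want that exit ramp, consider a hedged back-of-envelope comparison.  The numbers below are purely invented for illustration: under some customer volume the metered model wins, and above it a negotiated license does.

```python
# Hypothetical arithmetic: pay-as-you-earn versus an up-front license.
# All numbers are invented for illustration only.
license_fee = 100_000          # one-time capital license (assumed)
usage_fee_per_customer = 4.0   # monthly metered charge per customer (assumed)
months = 36                    # evaluation horizon

def total_usage_cost(customers):
    return usage_fee_per_customer * customers * months

for customers in (200, 500, 1000, 2000):
    usage = total_usage_cost(customers)
    better = "pay-as-you-earn" if usage < license_fee else "license"
    print(f"{customers:>5} customers: usage model costs {usage:>9,.0f} -> {better}")
# Around 694 customers (100_000 / (4.0 * 36)) the two models break even;
# beyond that point, the operator's growing leverage argues for a license.
```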

The fourth point is the onboarding process.  What Ciena is offering is a VNF framework to be bound into an operator's NFV deployment platform and NFVI.  It's certainly true that Ciena can offer a single approach to onboarding and even to management—which Agility Matrix promises through its own Director tool.  We don't at this point know how many different MANO platforms there will be and what the onboarding requirements for each will look like.  Yes, Ciena's Director element provides ETSI MANO functionality, but I've questioned whether this is sufficient for orchestration.  If it's not, then it's not clear how the additional features (primarily related to management, IMHO) would be integrated.  And even if Director is solid, additional MANO/NFV tools may be pulled into the operator because some VNFs from network vendors may not be available in any way except by license to the operator and deployment and management by the network vendor's own platform.  For Ciena and the operator alike, this could generate some complexity in onboarding.

The final point is what I’ll call “brand connection.”  Who do you think of when you think of an NFV infrastructure?  Probably not Ciena.  Network operators in my spring survey didn’t even mention them as a candidate.  That doesn’t mean that Ciena couldn’t be a credible supplier of NFV platforms and VNF catalogs, but it does mean that a lot of other vendors are going to have their opportunity to push their wares as well, many before Ciena gets to bat.

The reason Ciena isn’t a strong brand in the NFV platform space is that it’s not clear what role Ciena’s own gear plays in the NFV world.  There is a linkage between the Agility Matrix and Ciena’s network equipment, but I think the link could be stronger and more compelling if Ciena outlined just how you’d build NFV based largely on agile optics and electrical grooming.  As I said in my Monday blog, vendors like Ciena are potentially in the catbird seat with respect to controlling the outcome of network evolution.  They could exploit this position with a good NFV approach, but such an approach would have to be more along the lines of program plus product.  Operators should be able to use a pay-as-you-earn program as an on-ramp where they need it.

Agility Matrix is a useful concept.  Tier Two and Three operators might find it especially compelling and might even want Ciena to partner with some cloud providers to host stuff.  Even Tier Ones would see this as a way to control early cost and risk.  However, right now operators see NFV as the framework for all their future higher-level services.  They want their NFV provider to be helpful but not intrusive, and I think Ciena could do more to fulfill these two attributes.  They should try, because the basic idea is sound.


OSI Layers, Policy Control, Orchestration, and NGN

If you look at any model of network evolution, including the one I presented for 2020 yesterday in my blog, you find that it involves a shifting of roles between the familiar layers of the OSI model, perhaps even the elimination of certain layers.  That raises the question of how these new layers would cooperate with each other, and that has generated some market developments, like the work to apply OpenFlow to optical connections.  Is that the right answer?  Is it even the only one?  No to the second, and maybe no to the first as well.

Layered protocols are a form of abstraction.  A given layer consumes the services of the layers below and presents its own service to the layer above.  By doing so, it isolates that higher layer from the details of what’s underneath.  There is a well-known “interface” between the layers through which that service advertising and consumption takes place, and that becomes the input/output to the familiar “black box” or abstraction.
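
As a toy illustration of layering-as-abstraction (my own sketch, not anything from the OSI documents), each layer consumes the service of the layer below and presents only its own service upward:

    class Layer:
        # The inter-layer "interface": a layer consumes its lower neighbor's
        # service and advertises only its own service upward.
        def __init__(self, name, lower=None):
            self.name = name
            self.lower = lower

        def service(self, payload):
            carried = self.lower.service(payload) if self.lower else payload
            return f"{self.name}({carried})"   # all the layer above ever sees

    physical = Layer("optical")
    link     = Layer("link", lower=physical)
    network  = Layer("network", lower=link)
    print(network.service("packet"))           # network(link(optical(packet)))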

Familiar, because that’s exactly the notion behind virtualization.  I think the most important truth about network evolution is that virtualization has codified abstraction and instantiation as part of the future of the network.  The first question we should ask ourselves is whether our multi-layer and layer-evolution strategies support both the “old” abstraction, the OSI model, and the “new” abstractions represented by SDN and NFV.  The second is “how?”

Let’s assume we have deployed my stylized future network: foundation agile optics, plus electrical SDN grooming, plus an SDN overlay for connection management.  We have three layers here, only the top of which represents services for user consumption.  How would this structure work, and how would it be controlled?

When a user needs connection services, the user would place an order.  The order, processed by the provider, would identify the locations at which the service was to be offered and the characteristics of the service—functional and in terms of SLA.  This service order process could then result in service-level orchestration of the elements needed to fulfill the request.  Since my presumptive 2020 model is based on software/SDN at the top, there is a need to marshal SDN behaviors to do the job.

Suppose this service needs transport between Metro A and D for part of its topology.  Logically the service process would attempt to create this at the high level, and if that could not be done would somehow push the request down to the next level—the electrical grooming.  Can I groom some capacity from an optical A/D pipe?  If not, then I have to push the request down to the optical level and ask for some grooming there.  It’s this “if-I-can’t-do-it-push-down” process that we have to consider.
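
A minimal Python sketch of that push-down pattern, assuming my stylized three-layer stack (the class and method names are my own inventions, not any standard API):

    class LayerController:
        def __init__(self, name, lower=None, pipes=None):
            self.name = name
            self.lower = lower
            self.pipes = set(pipes or [])   # endpoint pairs this layer can already serve

        def provision(self, a, b):
            if (a, b) in self.pipes:
                return [f"{self.name}: existing capacity {a}-{b} committed"]
            if self.lower is None:
                raise RuntimeError(f"{self.name}: cannot satisfy {a}-{b}")
            # Can't do it here, so push the request down, then groom the new
            # lower-layer capacity into something this layer can offer upward.
            steps = self.lower.provision(a, b)
            self.pipes.add((a, b))
            return steps + [f"{self.name}: groomed {a}-{b} from {self.lower.name}"]

    optics   = LayerController("agile-optics", pipes=[("A", "D")])
    grooming = LayerController("electrical-grooming", lower=optics)
    overlay  = LayerController("connection-overlay", lower=grooming)
    print(overlay.provision("A", "D"))   # each layer builds on the one below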

One approach we could take here is to presume central control of all layers from common logic.  In that case, a controller has complete cross-layer understanding of the network, and when the service request is processed that controller “knows” how to coordinate each of the layers.  It does so, and that creates the resource commitments needed.

A second approach is to assume cross-layer abstraction and control.  Here, each layer is a black box to the layer above, with each layer controlled by its own logic.  A layer offers services to its higher-layer partner and takes service requests from that partner, so our service model says that the connection layer would “ask” for electrical grooming from SDN if it didn’t have pipes, and SDN in turn would ask for optical grooming.
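
The push-down sketch above already illustrates this second, layered approach.  For contrast, here’s what the first, centrally controlled approach might look like, again as my own rough sketch with invented names:

    def central_provision(a, b, layer_state):
        # One controller with complete cross-layer visibility commits whatever is
        # missing at each level; no layer ever asks anything of another.
        commitments = []
        for layer in ("agile-optics", "electrical-grooming", "connection-overlay"):
            if (a, b) not in layer_state[layer]:
                layer_state[layer].add((a, b))
                commitments.append(f"{layer}: committed {a}-{b}")
        return commitments

    state = {"agile-optics": {("A", "D")},
             "electrical-grooming": set(),
             "connection-overlay": set()}
    print(central_provision("A", "D", state))   # grooming and overlay get committed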

I think a glance at these classic choices shows something important: whether we presume central control of all the layers or presume that the layers are independently controlled, there is no reason to presume that the layers have to be controlled the same way, with the same protocol.  The whole notion of adapting OpenFlow to optics, then, is in my view a waste of time.  Any control mechanism that lets a layer’s services be made to conform to the request of the layer above works fine.

Is there a preferred approach, though?  Would central control or per-layer control be better?  That depends a lot on how you see things developing, and I’m not sure we can pick the “best” option at this point.  However, I think it is clear that there are concerns about the scalability and availability of controllers in SDN, concerns that lead to the conclusion that it would be helpful to think of SDN networks as federations of control zones.  Controllers, federated by cross-domain processes/APIs, would have to organize services that spread out geographically and thus implicate multiple controllers.  In this model, it wouldn’t make much sense to concentrate multi-layer networking in a single controller.  In fact, given that connection networks, electrical SDN grooming, and agile optics would all likely have different geographical scopes, that kind of combination might be really hard to organize.

So here’s my basic conclusion: network services in the future would be built by organizing services across both horizontal federations of controllers and down through vertical federations representing the layers of network protocol/technology.  You can do this in three ways: policy-linked structures, domain federation requests, and orchestration.

The policy approach says that every controller has policies that govern its handling of requests from its users.  It enforces these policies within its domain, offering what are effectively abstract services to higher-level users.  These policies administer a pool of resources used for fulfillment, and each layer expects the layer below to be able to handle requests within the policy boundaries it’s been given.  There is no explicit need for layers, or controllers, to communicate.  If specific service quality is needed, the policies required to support it can be exchanged by the layers.
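
A rough sketch of what a policy-controlled domain might look like (the names and policy fields are invented for illustration):

    class PolicyDomain:
        # A controller domain admits requests purely against the policy boundaries
        # it was handed; it presumes, rather than asks, that the layer below is
        # keeping enough capacity available under its own policies.
        def __init__(self, name, policy):
            self.name = name
            self.policy = policy

        def admit(self, request):
            return (request["latency_ms"] <= self.policy["max_latency_ms"]
                    and request["gbps"] <= self.policy["max_gbps_per_service"])

    connection = PolicyDomain("connection-layer",
                              {"max_latency_ms": 20, "max_gbps_per_service": 10})
    print(connection.admit({"latency_ms": 15, "gbps": 2}))   # True, with no cross-layer exchange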

The domain federation request approach says that when Layer “A” runs out of resources, it knows what it needs and asks some combination of lower layer controllers to provide it—say “B” and “C”.  The responsibility to secure resources from below is thus explicit and if the lower layer can’t do it, it sends a message upward.  All of this has to be handled via an explicit message flow across the federated-controller boundary, horizontally or vertically.
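
And a sketch of the explicit federation-request flow, again with invented names, where a grant or refusal is itself a message across the controller boundary:

    class Domain:
        def __init__(self, name, spare_units):
            self.name = name
            self.spare_units = spare_units

        def handle(self, request):            # the explicit cross-boundary message
            if self.spare_units >= request["units"]:
                self.spare_units -= request["units"]
                return "granted"
            return "refused"

    def fulfill(request, local_spare, lower_domains):
        if local_spare >= request["units"]:
            return "granted locally"
        for domain in lower_domains:          # ask "B", then "C", explicitly
            if domain.handle(request) == "granted":
                return f"granted via {domain.name}"
        return "refused, reported upward"     # failure is also an explicit message

    b, c = Domain("B", spare_units=0), Domain("C", spare_units=10)
    print(fulfill({"units": 4}, local_spare=0, lower_domains=[b, c]))   # granted via C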

The orchestration model says that the responsibility for creating a service doesn’t lie in any layer at all, but in an external process (which, for example, NFV would call “MANO”).  The service request from the user invokes an orchestration process that commits resources.  This process can “see” across layers and commit resources where and when needed.  The continuity of the service and the cooperative behavior of the layers or controller domains are guaranteed by the orchestration, not by interaction among the domains.  Cooperation here is explicit, not “presumptive” as it would be in a pure-policy model.
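
Finally, a sketch of the orchestration model, with an external process (standing in for the MANO role) committing resources in each domain directly; as before, every name here is my own illustration:

    class DomainStub:
        # Stands in for a layer or controller domain; it just records what the
        # orchestrator told it to commit.
        def __init__(self, name):
            self.name = name
            self.committed = []

        def commit(self, step):
            self.committed.append(step["what"])

    def orchestrate(order, domains):
        # The orchestrator, not any layer, decomposes the service and drives each
        # domain directly; cross-layer continuity is its responsibility alone.
        plan = []
        for step in order["steps"]:
            domains[step["layer"]].commit(step)
            plan.append((step["layer"], step["what"]))
        return plan

    domains = {n: DomainStub(n) for n in ("optics", "grooming", "connection")}
    order = {"steps": [{"layer": "optics",     "what": "lambda A-D"},
                       {"layer": "grooming",   "what": "groomed 10G path A-D"},
                       {"layer": "connection", "what": "overlay VPN A-D"}]}
    print(orchestrate(order, domains))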

Multiple mechanisms could be applied here; it’s not necessary to pick just one.  The optical layer might, for example, groom capacity to given metro areas based on a policy to maintain overall capacity at 150% of demand.  Adjacent electrical SDN grooming zones might exchange controller federation requests to build services across their boundaries, and the user’s connection layer might be managed as a policy-based pool of resources for best-effort and an orchestrated pool for provisioned services.

None of this requires unanimity in control mechanisms, and I think demands for that kind of uniformity only make a migration to a new model more complicated and expensive.  If we can control optics and SDN and connections, and if we can harmonize their commitment horizontally and vertically, we have “SDN”.  If we can orchestrate it, we have “NFV”.  Maybe it’s time to stop gilding unnecessary lilies and work on the mechanisms to create and sustain this sort of structure.
