How Operators are Preparing NFV Plans for their Fall Pre-Budget Review

The consensus among network operators who provide either wireline or wireless broadband is that they’ll cross over on the revenue/cost per bit by mid-2017.  Given the time it takes to make any significant changes in service offerings, operations practices, or capital infrastructure programs, something remedial would have to begin next year to be effective.

In mid-September of each year, operators embark on a technology planning cycle.  It’s not a universal process, nor is it always formalized, but it’s widespread and usually completed by mid-November.  The goal is to figure out what technology decisions will drive capital programs for the following year.  That which wins in this cycle has a good chance of getting into field trial and even deployment in 2016.

It’s not surprising that operators are taking stock now, in preparation for the work to come.  It’s not surprising that NFV is a big question to be addressed, and that NFV’s potential to improve profits by widening the revenue/cost-per-bit gap is perhaps the largest technology question-mark.

My opening qualifier “who provide either wireline or wireless broadband” is important here.  More specialized operators like the managed service providers (MSPs), cloud providers (CSPs), or those who offer multi-channel broadcast video are in a bit better shape.  Interestingly, one of the most obvious success points for NFV is the MSP space, so let’s start looking at NFV’s potential with its successes.

An MSP adds value to connection services by introducing a strong feature-management model.  Most connection services are consumed within the context of private WAN deployment of some sort, and there’s more to a private WAN than a few pipes.  Over the last two decades, the cost of acquiring and sustaining the skills needed for that ancillary technology, and the cost of managing the private WAN overall, has grown as a component of TCO.  Today businesses spend almost twice as much supporting their network as buying equipment for it.  MSPs target that trend.

NFV does too, or at least the “service chaining” or “virtual CPE” portion does.  Connection services are built into private WANs by adding inline technologies like firewalls, application accelerators, encryption, and so forth, and by adding hosted elements like DNS and DHCP.  The MSP model of NFV vCPE is to supply those capabilities by hosting them on an agile edge device.  That means that you deploy a superbox with each customer and then mine additional revenue potential by filling it with features that you load from a central repository.  It’s a good, even great, model.

This same model can be adopted by any big operator, including all the broadband ISPs, and in theory it could be applied to every customer.  There are issues with that theory, though—particularly for NFV’s broad acceptance:

  • vCPE delivers the most value where the cost of actual devices is high and their deployment is dynamic. If the boxes are cheap and if users tend to get all the same features all at once, and then never change, it doesn’t do much good.  Consumers and small businesses don’t fit the vCPE success model.
  • While NFV can be used to deploy functions into CPE, that mission dodges most of the broader NFV value propositions. Managing that vCPE model isn’t much different from managing real boxes.  You don’t need clouds or really even need function-to-function connectivity to make it work.  There’s no economy of scale to consider.
  • vCPE has encouraged VNF providers to consider what operators overall say is an unrealistic revenue model—the pay-as-you-go. MSPs like this approach because it lets them match expenses to revenue; they don’t have to deploy much shared infrastructure in a CPE-hosted VNF model, so VNF licenses would make up much of their first cost.  Other operators don’t like that model at all because it exposes them to what they believe to be higher long-term costs.
  • The applications of vCPE that do work are a very small part of the revenue/cost-per-bit problem, and so even if you revolutionize these services for the appropriate customers, you don’t move the ball on profit trends very much.

What does move the ball?  The other most successful NFV application to date is mobile infrastructure.  Operators are already dependent on mobile services for profits, ARPU, and customer growth.  There’s more change taking place within the mobile network than anywhere else, and it’s easier to drive new technology into a network when you’re investing in it anyway.

Virtual mobile infrastructure involves virtualizing IMS (the core registration and service control technology), EPC (the metro traffic management and mobility management piece), and of course the radio access network.  We’ve seen announcements in all of these areas, from players like Alcatel-Lucent (vIMS), Ericsson (vEPC), and ASOCS (vRAN, in partnership with Intel).

There’s a lot of money, a lot of potential, in virtualizing mobile infrastructure.  The problem from an NFV perspective is that mobile services are multi-tenant, which means that you generally deploy them and then keep them running forever.  Yes, you need operational support for high availability and performance, but you are really building a cloud application in the mobile infrastructure space and not an NFV application.

Despite the lack of dynamism in virtual mobile infrastructure (vMI), the larger operators tend to accept it as the priority path to NFV.  That’s because vMI is large in scale, both in geographic and technology terms.  It touches enough that if you can make it work, you can infer a lot of other things will also work.  And because operationalization is a big part of a vMI story, that could lead to broad operations transformation.  Operators believe in that.

Here’s what operators say they are facing when they enter their fall planning cycle.  We have proved that NFV works, in that we have proved that you can deploy and connect virtual functions to build pieces of services.  We have proved that NFV can be useful in vCPE and vMI, but we haven’t proved it’s necessary for either one.  But carriers have invested millions in NFV, and it’s a major focus of standards-writers and technologists.  There is a lot of good stuff there, sound technology and the potential for an arresting business case.  We just don’t know what that business case is yet.

The plethora of PoCs and trials isn’t comforting to CFOs because it raises the risk of having a plethora of implementations, the classic silo problem.  We have no universal model of NFV, or of any new and different future network.  It’s a risk to build toward the goal of a new infrastructure through diverse specialized service projects when you don’t know whether these projects will add up to a cohesive future-network vision.  It’s particularly risky when we don’t have any firm specifications to help realize service agility or operations efficiency benefits—when those are the benefits operators think are most credible.

What operators are even now asking is whether they can start investing in any aspect of NFV with the assurance that their investment will be protected if NFV does succeed in broadening its scope.  Will we have “NFV” in the future or a bunch of NFV silos, each representing a service that works in isolation but can’t socialize?  This is the question I think is already dominating CFO/CEO planning, where one called it “suffering the death of a thousand benefits”.  It will shortly come to dominate the NFV proofs, tests, and trials because it’s the question that the fall technology planning cycle has to answer if the 2016 budgets are to cover expanded NFV activity.

I believe this question can be answered, and actually answered fairly easily and in the affirmative.  There are examples of effective NFV models broad enough to cover the required infrastructure and service critical masses.  There are examples of integration techniques that would address how to harmonize diverse services and even diverse NFV choices.  We don’t need to invent much here.  I believe that a full, responsive-to-business-needs, NFV infrastructure could be proved out in six months or less.  All we have to do is collect the pieces and organize them into the right framework.  Probably a dozen different vendors could take the lead in this.  The question for this fall, I hope, won’t be “will any?” but “which one?”

Can We “Open” NFV or Test Its Interoperability? We May Find Out.

I suspect that almost everyone involved in NFV would agree that it’s a work in progress.  Operators I’ve talked with through the entire NFV cycle—from the Call for Action white paper in the fall of 2012 to today—exhibit a mixture of hope and frustration.  The top question these operators ask today is how the NFV business case can be made, but the top technical question they have is about interoperability.

Interoperability has come to the fore again this week because of a Light Reading invitation to vendors to submit their NFV products to the EANTC lab for testing, and because a startup that promises an open-source, interoperable NFV came out of stealth.  I say “come to the fore again” because it’s been an issue for operators from the first.

Everyone wants interoperability, in no small part because it is seen as a means of preventing vendor lock-in.  NFV is a combination of three functional elements—virtual network functions, NFV infrastructure, and management and orchestration—and there’s long been a fear among operators that vendors would field their own proprietary trio and create “NFV silos” that would impose different management requirements, demand different infrastructure, and even support only specific VNFs.

That risk can’t be dismissed either.  The ETSI NFV ISG hasn’t defined its interfaces and data models in sufficient detail (in my view, and in the view of many operators) to allow unambiguous harmony in implementation.  We do have trials underway that integrate vendor offerings, but operators tell me that the integration mechanisms aren’t common across the trials, and so won’t assure interoperability in the broad community of NFV players.  What’s needed to create it is best understood by addressing those three NFV functional elements one at a time.

VNFs are the foundation of NFV because if you don’t have them you have no functions to host and no way to generate benefits.  A VNF is essentially a cloud application that’s written to be deployed and managed in some way and to expose some set of external interfaces.  There are two essential VNF properties to define for interoperability.

A real device typically has some addresses that represent its data, control, and management interfaces.  These interfaces speak the specific language of the device, and so to make them work we have to “connect” to them with a partner that understands that language.  We have to match protocols in the data path, and we have to support control and management features through those interfaces.  Rather than define a specific standard for the management side, NFV has presumed that a “VNF Manager” would be bound with the VNFs to control their lifecycle.  VNFMs know how to set up, parameterize, and scale VNFs.
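To make that lifecycle contract concrete, here’s a minimal Python sketch of what a VNFM might look like.  The class and method names are my own assumptions for illustration; the specs don’t define this interface in code terms, which is much of the problem.

```python
from dataclasses import dataclass, field


@dataclass
class VnfInstance:
    """A deployed virtual function and its externally visible interfaces."""
    name: str
    data_ports: list = field(default_factory=list)
    mgmt_address: str = ""


class VnfManager:
    """Hypothetical VNFM: it owns setup, parameterization, and scaling
    of the VNFs it is bundled with, as described above."""

    def __init__(self):
        self.instances = {}

    def instantiate(self, name, image, parameters):
        # Deploy one VNF component and record it; a real VNFM would
        # call down into NFV Infrastructure (through a VIM) here.
        inst = VnfInstance(name=name, mgmt_address=f"mgmt-{name}")
        inst.data_ports = parameters.get("ports", [])
        self.instances[name] = inst
        return inst

    def scale_out(self, name, count):
        # The scaling rules live inside the VNFM, which is exactly the
        # portability problem: only the VNFM knows the service model.
        return [self.instantiate(f"{name}-{i}", None, {}) for i in range(count)]
```

Even in this toy form you can see the issue: if `scale_out` encodes the service model, porting the VNF without porting the VNFM gets you nothing.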

One thing this means is that VNFMs are kind of caught between two worlds—they are on one hand a part of a VNF and on the other hand a part of the management process.  If you look at the implementations of NFV today, most VNFMs have rather specific interfaces to the management systems and resource management tools of the vendors who create the NFV platform.  That’s not unexpected, but it means that it’s going to be difficult to make a VNF portable unless the VNFM is portable, and that’s difficult if it’s hooked to specific vendor tools.

The other hidden truth here is that if a VNFM drives lifecycle management, then the VNFM knows the rules for things like scaling, event-handling for faults, and so forth.  It also obviously has to know the service model—the way that all the components in a VNF are connected and what has to be redone if a given component is scaled out and scaled in.  If this knowledge exists “inside” the VNFM then the VNFM is the only thing that knows what the configuration of a service is, which means that if you can’t port the VNFM you can’t port anything.

The second critical interoperability issue is the NFV Infrastructure piece.  You’d want to be able to host VNFs on the best available resource, both in terms of resource capacity planning (cheap commodity servers or ones with special data-plane features) and in terms of picking a specific hosting point to optimize performance and cost during deployment.  Infrastructure has to be abstracted to make this work, so that you give a command to abstract-hosting-point and it hosts depending on all your deployment policies and the current state of resources.

It’s clear who does this—the Virtual Infrastructure Manager.  It’s not really clear how it works.  For example, if there are parameters to guide where a VNF is to be put (and there are), you’d either have to be able to pass these to the lower-level cloud management API (OpenStack Nova for example) to guide its process, or you’d have to apply your decision policies to infrastructure within the VIM (or higher up) and then tell the CMS API specifically where you wanted something put.  The first option is problematic because cloud deployment tools today don’t support the full range of NFV options, and the second is problematic because there’s no indication that resource topology and state information is ever published “upward” from the NFV Infrastructure to or through the VIM.
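Those two options might be sketched like this.  The function names, the hint format, and the shape of the resource-state data are all assumptions for illustration, since the specs don’t pin any of this down.

```python
def place_vnf_option1(cloud_api, vnf, placement_hints):
    """Option 1: pass the NFV placement parameters straight down to the
    cloud management API and let it decide.  This only works if the CMS
    understands every hint NFV can generate, which today it doesn't."""
    return cloud_api.deploy(vnf, hints=placement_hints)


def place_vnf_option2(cloud_api, vnf, placement_hints, resource_state):
    """Option 2: the VIM (or something above it) applies the policies
    itself, then tells the CMS exactly where to host.  This requires
    resource topology and state to be published upward, which the specs
    don't currently guarantee."""
    candidates = [h for h in resource_state
                  if h["capacity"] >= placement_hints.get("min_capacity", 0)]
    best = min(candidates, key=lambda h: h["cost"])
    return cloud_api.deploy(vnf, host=best["name"])
```

The point of the sketch is that option 2 cannot even be written unless `resource_state` flows up from the infrastructure, and nothing in the current specifications says it does.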

If you read through the NFV specifications looking for the detail on these points, through the jaded eyes of a developer, you’ll not find everything you need.  Thus, you can’t test for interoperability without making assumptions or leaving out issues.  Light Reading can, and I’d hope they might, identify the specific things that we don’t have and need to have, but it’s not going to be able to apply a firm standard of interoperability that’s meaningful.

How about the startup?  The company is called “” and its product is “RIFT.ware”.  The details of what RIFT.ware includes and what it does are a bit vague (not surprising since the company just came out of stealth), but the intriguing quote from their website is “VNFs built with RIFT.ware feature the economics and scale of hyperscale data centers and the security and availability of Telco-grade network services.”  Note my italics here.  The implication is that RIFT.ware is a kind of VNFPaaS framework, something that offers a developer of VNFs a toolkit that would, when used, exercise all of the necessary deployment and lifecycle management features of NFV in a standard way.

I think that a PaaS for NFV is a great notion, and I’ve said that in blogs before.  However, it’s obvious there are some questions here.  How would it induce VNF vendors, particularly the big NFV players who also have infrastructure and MANO, to use their system?  Since there are no definitive specifications from ETSI that you could force compliance with, could the big guys simply thumb their noses?  Another question is whether RIFT.ware could answer the questions about “interoperability” within its own framework.  And if we don’t get conformance on RIFT.ware across the board, it becomes another silo to be validated.

The final question here, which applies to both LR and the startup, is that of the business case.  Interoperability doesn’t guarantee utility.  If NFV doesn’t reach far enough into legacy service infrastructure to operationalize things end to end, and if it doesn’t integrate with current OSS/BSS/NMS processes in a highly efficient way, then it doesn’t move the ball in terms of service agility and operations efficiency.  The ETSI spec doesn’t address legacy management and doesn’t address operations integration, at least not yet.

I’m hopeful that both these activities will produce something useful, but I think that for utility to emerge from either, we’ll have to address omissions in the NFV concept as it’s currently specified.  I hope that both these activities will identify those omissions by trying to do something useful and running into them, because that sort of demo may be the only way we get them out in the open and dealt with.

Cisco’s Message to SDN and NFV Proponents: Get Moving or Get Buried

Cisco beat estimates in its quarter, coming in about where I’d suggested it might overall.  The Street is happy with the results, which they should be, and the question now is how the details of Cisco’s performance might signal us toward a view of the next year.

I think one key statement from the call, from now-CEO Robbins, was “When I think about our strategy, I look at the huge market opportunity that exists as businesses and governments use technology to drive their growth and operational efficiency.”  Productivity is the measure of operational efficiency for enterprises, and was the driver of all the past IT spending booms.  Operations efficiency is the most credible benefit target for network operators.  So the question is whether Cisco is now recognizing this, or whether it was just a catchy turn of phrase inserted by a speech-writer.

Overall revenue growth was 4%, but switching and routing both under-performed versus the average and data center (UCS) was the big success with 14% growth.  This suggests, as I noted yesterday, that neither the network operators nor the enterprises were confidently re-investing in the current (Ethernet/IP) model of networking, but also that they were not seeing enough of a credible alternative emerging to dampen normal refresh.  There may be some extending of product lifecycles, but it’s not excessive.

The service provider segment did return to growth for Cisco, which again isn’t any real surprise.  Had SDN and NFV adoption been where you’d think they are based on media coverage, we would have seen a distinct slip in Cisco’s numbers given that they are hardly a leader in either space.  There was no such slip, which proves that we’re not seeing any impact of SDN or NFV on infrastructure spending by operators.

Robbins said that Cisco “did really well in the core” regarding routing sales.  That suggests to me that the place where Cisco is strongest in the network operator segment is one of the places operators are trying to re-architect the most.  The replacement of large routers with agile optics or SDN-groomed tunnels is a major priority.  They aren’t seeing any hurt from this shift, so clearly it’s not yet happening.

NFV was mentioned on this call by Cisco for what I think was the first time.  Their comment: “But I do think that as we look at where the service providers are going, what they want to do with virtual managed services, how we’re aligned now around the deployment of NFV and how we move forward with that, we think that we’re well positioned with routing as we look ahead.”  Given that NFV really doesn’t have much to do with routing specifically, and that virtual managed services (service chaining) are a fairly limited NFV application, I think it’s fair to say that Cisco has no revolutionary NFV intent.

Enterprise/commercial carried the day in terms of segment performance, and routing slightly out-gained switching.  I think that suggests that my comments regarding the pace of SDN in the enterprise (in my blog on Arista) were correct.  Enterprises are more likely to be tied to their incumbent vendor in complex technology like routing, and so Cisco picked up more there.  Switching is more competitive for the simple reason that there are fewer features.

One very significant point here is that Cisco is retaining a strong data center account control lead, based on my own interactions with enterprises.  For the last eight years, data center networking needs and policies have driven network equipment and architecture decisions, and that drive has been decisive in the last four years.  The vendor who used to have absolute control over the data center was IBM, but Cisco is now in the top spot.

Cisco, in fact, listed achievements in the cloud data center and in software as its two primary goals.  Given that it’s already dominating data center strategic influence at least in the US and Europe, it seems reasonable that it would be able to achieve at least that first goal.  However, Cisco’s cloud aspirations seem more tactical than strategic.  They don’t talk about differentiating themselves with cloud technology, but about exploiting the opportunity for data center equipment that the cloud creates.  Surprisingly, they ignore the biggest potential source of new data centers, which is NFV.

Not everything is good news.  If I were Cisco I’d be concerned about the fact that its growth came completely from the Americas, meaning the US and Canada.  The reason this is important is that Cisco’s account control is greatest in these markets, and competition (particularly from Huawei in the network operator segment) is lower or absent.  Europe, of course, is mired in secular economic issues and its underperformance has hurt Cisco’s numbers because Cisco enjoys account control there almost as much as here.

Software looks to be a bit knotty.  If you read through the call transcript the vision you get is one of a Cisco that sees software purely as a means of transitioning to a subscription/recurring revenue model in areas like security.  There is no sense that software technology is now driving the bus completely.  This isn’t a problem limited to Cisco, though.  Arch-rivals Alcatel-Lucent and Juniper are also locked in the box, so to speak, and are having difficulties coming to terms with a pure software future where hardware is just what you run software on.

So what can we say about the call overall?  First, while Cisco may have opened with that operations efficiency comment, I think it was an accidental fluffery and not a signal that Cisco recognizes the importance of driving IT and network spending growth by harnessing productivity and operations efficiency as benefits.  There were no comments made later on to tie to the efficiency theme, and Cisco likes to leave a trail of bread crumbs when they want us to follow.

Second, Cisco is telling us that neither SDN nor NFV are impacting sales at all.  That’s not so much a fault of the technological merits of either as of the vacuous positioning of offerings and the insipid handling of key value propositions.  It’s not smart for Cisco to defend against SDN or NFV when nobody is attacking effectively with either technology.  In fact, Cisco paints a picture of real risk of failure for both technologies.  If nothing is done in the next two years to radically change buyer thinking, the reinvestment in legacy technology will have nailed a lot of buyers to the ground for three or four more years, and by then there’d be little chance either SDN or NFV would develop any real momentum.

Cisco is doing well, in no small part because it’s winning the old game and too smart to support a new one.  Those who want to see Cisco drop market share or want to displace Cisco at the top of the heap will have to sing and dance a lot better, and start that process very quickly.


As Requested: Building and Selling Intent-Modeled Services

I did a blog early this week on the foundation for an agile service-model approach.  Some of my operator friends were more interested in this topic than I’d frankly expected.  Most of that interest centered on how this sort of model would be applied during the routine processes of building and selling services.  If service agility is the goal, then operators want specific knowledge of how a given strategy would impact the “think-it-to-sell-it” cycle.  So let’s set the stage, then dive in.

What I’m suggesting is that we define services in a series of hierarchical modeling steps, with each step based on an intent model that offers a from-the-consumer-side abstraction of the service/feature being defined.  For network services and features, we could assume the model had a standard structure, which defined FUNCTIONAL-PROPERTIES for the model, and INTERFACES that allow for connection of models or to users.  The FUNCTIONAL-PROPERTIES would always include an SLA to describe performance and availability.

As I noted in the earlier blog, you can build basic connection services with three broad models—the point-to-point (LINE), multipoint (LAN), and multicast (TREE).  A bit of deeper thinking would show that there could be two kinds of TREEs, the M-TREE that’s true multicast and the L-TREE that’s a load-balance point.  You could divide other connection service models too, if needed.

For more complicated features, I proposed two additional models, the IN-LINE model for a service that sits across a data path (firewall is the classic example) and the ON-LINE model for a service that attaches like it’s an endpoint, as DNS would.

INTERFACEs are the glue for all of this.  Depending on the functional model of the service we might have a single class of interface (“ENDPOINT” on a LAN) or we might have multiple classes (“SENDER” and “RECEIVER” on multicast TREEs).  Connection services connect endpoints so you’d have one for every endpoint you wanted to connect, and you might also have Network-to-Network Interface (NNI) points to connect subsets of a service that was divided by implementation or administrative boundaries.
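Pulling these pieces together, the model structure I’m describing might be sketched as simple Python data classes.  The field and class names mirror my description, not any standard, and the SLA fields are placeholders for whatever performance/availability terms a real catalog would carry.

```python
from dataclasses import dataclass, field
from enum import Enum


class Model(Enum):
    LINE = "point-to-point"
    LAN = "multipoint"
    M_TREE = "true multicast"
    L_TREE = "load-balance point"
    IN_LINE = "sits across a data path"     # e.g. firewall
    ON_LINE = "attaches like an endpoint"   # e.g. DNS


@dataclass
class Sla:
    availability: float   # e.g. 0.9999
    latency_ms: float


@dataclass
class Interface:
    cls: str   # "ENDPOINT", "SENDER", "RECEIVER", "NNI", ...


@dataclass
class IntentModel:
    """One step in the service hierarchy: a from-the-consumer-side
    abstraction whose FUNCTIONAL-PROPERTIES always include an SLA,
    plus the INTERFACES that connect it to other models or to users."""
    name: str
    model: Model
    sla: Sla
    interfaces: list = field(default_factory=list)
    children: list = field(default_factory=list)  # decomposition, hidden from above
```

The `children` list is the key to the abstraction: everything below a given model is invisible to the layers above it, which is what makes implementation substitution possible.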

Given this setup, we can relate the process of selling or building a service to the specific model elements.  My assumption is that we have a service architect who is responsible for doing the basic building, and a service agent (the customers themselves, via a portal, or a customer service rep) who would fill in an order for a purchase.

If we started with an order for, as an example, an IP VPN, we would expect the order to identify the sites to be supported and the SLA needed.  This would populate a query into a product catalog that would extract the options that would meet the criteria.  In our example, it might suggest an “Internet tunnel” VPN using IPsec or something like it, or a “provisioned” VPN.  It might also suggest a VLAN with hosted routing.  All of the options that fit the criteria could be shown to the ordering party (user or customer service rep) for examination, or they could be priced out first based on the criteria.

If we assume that the provisioned option was selected, the order process might query the customer on whether additional services—like firewall, DNS, DHCP, encryption, application delivery control, or whatever—might be useful.  Let’s assume that a firewall option was selected for all sites.

The next step would be to decompose the order.  The “IP VPN” service would logically be made up of a series of access LINEs to LAN-IP-VPN INTERFACEs.  Because we have a need for a firewall, we’d create each LINE as an access segment, an IN-LINE firewall component, and a VPN connection segment.  Or, in theory, we might find one outlying site that doesn’t have a suitable place to host a cloud firewall, so it would get CPE that had local function hosting.

If the VPN spanned several geographies, we might decompose our LAN-IP-VPN object into a series of NNI-connected objects, one for each area.  We’d then build per-area VPNs as above and connect them.
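The decomposition just described might look like this in a sketch.  The function name, the object labels, and the chain structure are illustrative only, not drawn from any specification.

```python
def decompose_ip_vpn(sites, firewall=True, cloud_hosting_ok=None):
    """Decompose an 'IP VPN' order into per-site access chains feeding a
    LAN-IP-VPN core.  Each chain is an access LINE, an optional IN-LINE
    firewall (cloud-hosted where possible, CPE-hosted at outlying sites),
    and a VPN connection segment."""
    if cloud_hosting_ok is None:
        cloud_hosting_ok = {site: True for site in sites}
    service = {"core": "LAN-IP-VPN", "access": []}
    for site in sites:
        chain = ["access-LINE"]
        if firewall:
            chain.append("IN-LINE-firewall" if cloud_hosting_ok[site]
                         else "CPE-hosted-firewall")
        chain.append("VPN-segment")
        service["access"].append({"site": site, "chain": chain})
    return service
```

The outlying-site case from the text falls out naturally: the same abstract chain decomposes differently per site, and nothing above this level needs to know which option was picked.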

You can see that the other direction might work similarly.  Suppose a service architect is looking to build a new service called an INTERNET-EXTENDED-IP-VPN.  This service would combine a LAN-IP-VPN with an Internet tunnel to an on-ramp function.  The architect might start with the LAN-IP-VPN object, and add to it an ON-LINE object representing an Internet gateway VPN on-ramp.  The combined structure would then be re-cataloged for use.

Any given object could decompose in a variety of ways.  As I suggested, we might say that a customer who wants an “IP VPN” of ten sites or less within a single metro area would be offered a VLAN and virtual router combination instead.  If those conditions were not met, a different option would be selected.  Service architects would presumably be able to search the catalog for an object based on interface needs, functional needs, and SLA.  They could assemble the results into useful (meaning sellable) services.
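A policy-guided choice like that ten-site example could be as simple as the following sketch; the order fields and realization labels are my own assumptions.

```python
def select_realization(order):
    """Policy-guided mapping of an 'IP VPN' intent to a realization:
    small single-metro orders map to a VLAN plus a virtual router,
    anything larger or multi-metro to a provisioned VPN."""
    if order["sites"] <= 10 and order["single_metro"]:
        return "VLAN + virtual router"
    return "provisioned IP VPN"
```

The value is that the policy lives at the decomposition point, so changing the threshold or adding realizations never disturbs the intent model the customer actually bought.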

The initial objects, the bottom of the hierarchy, would transform network or software behaviors into intent models for further composition into services.  NFV VNFs are examples of bottom-level intent models, and so are the physical devices whose functions they replicate.  You can create features by composing the behaviors that are needed into a model, then put it into a catalog for use.  The FUNCTIONAL-PROPERTIES SLA can include the pricing policies so the service would self-price based on the features it contains, the number of interfaces, the specific performance/availability terms, etc.
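Self-pricing from the model’s own contents might be sketched as follows.  The pricing-policy format here is entirely my assumption, meant only to show that price can be derived from features, interfaces, and SLA terms rather than quoted by hand.

```python
def self_price(service, price_policy):
    """Price a composed service from what it contains: a rate per
    contained feature, a charge per interface, and an SLA premium."""
    total = sum(price_policy["feature_rates"].get(f, 0.0)
                for f in service["features"])
    total += len(service["interfaces"]) * price_policy["per_interface"]
    if service["sla"]["availability"] >= 0.9999:
        total *= price_policy["high_availability_multiplier"]
    return round(total, 2)
```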

I don’t mean to suggest that every service can be built this way, or that I’ve covered every functional, topological, and interface option.  I just want to demonstrate that we can do service-building using object principles as long as we make our objects conform to intent modeling so we can hide the details of implementation from other higher layers.  Policies can then guide how we map “intent” at a given layer to a realization, and that realization might still be a set of “composite objects” that would again be modeled through intent and decomposed based on policies.

Operators tell me this is the sort of thing they want to be able to do, and that they want to understand how both composition of services and decomposition of orders would take place.  Interestingly they also tell me that they don’t get much if any of this sort of discussion from vendors, even those who actually have some or all of the capabilities I’ve presented in my example.

Very early in the NFV game, one of the Founding Fathers of NFV told me that to be meaningful to operators, any NFV solution had to be framed in the context of that think-it-to-sell-it service lifecycle.  That lifecycle is where the benefits lie, particularly the benefit of “service agility”.  We have solutions in this space that work, but few of them are vertically integrated to the point where they can address evolving SDN/NFV and legacy components and can build current and emerging service models.  None, apparently, are well described to the buyers.  The fact that there’s so much interest now in intent modeling and its role in service agility is a great sign for the industry—a sign we might finally be listening to those early words of wisdom.

What Should We Watch for In Cisco’s Earnings Call?

Wall Street will be watching Cisco on their earnings call this week.  I will too, and so should you all, but probably with a different set of goals and looking for signals only slightly related to the Street interest.  Cisco is an important player whose behavior will tell us a lot about the timing and extent of our SDN and NFV revolutions.

Cisco has three primary product lines that we should be interested in: routing, data center switching, and servers.  The routing products will offer us some sense of where service providers are in their overall network infrastructure plans, and of course Cisco will talk about that in their call.  I think the likely story here is that service provider spending is still a bit weak—not a disaster but nothing to jump for joy at.  That would indicate that operators are still pursuing recapitalization of infrastructure but not enthusiastically.

If the story is that service provider spending is off significantly, if Cisco says a lot about weak secular trends in the space, then it’s telling us that the revenue/cost-per-bit squeeze is already being felt.  That would mean operators will be looking for a different strategy to adopt in 2016, and in the meantime are putting pressure on spending and prices.

If spending is much stronger, then it tells me that operators do not believe that they will get any relief from new technologies like SDN and NFV in 2016.  They’ll have to either wait longer or try a different approach, and that would be very bad news for our revolutionary duo.  Waiting longer isn’t possible for many of the operators, and so they’d likely either start looking for price-leader suppliers (Huawei) or start thinking about how to build networks with less routing intelligence, at least from traditional devices versus software routers.

We may get some hints from what Cisco says on the switching side.  While most of their sales will be to enterprises rather than operators, it’s possible Cisco would say something about strong sales to operators for cloud data centers.  Such a story would indicate that operators are either (finally) getting serious about offering cloud computing services or are preparing themselves for NFV commitments.  I don’t think this is the case, and I don’t think Cisco will have much to say about switching success to operators overall.

As I said, switching will be the bellwether for enterprise network spending health, and here is where I think there’s a chance that Cisco will beat estimates.  Enterprises are doing better in a profit sense and they’ve also held back on capital improvements or modernization for IT infrastructure and networking.  Cisco has always been the master of account control for enterprises, and so they should do well, meaning enough to beat estimates overall by a couple pennies per share.

If Cisco does not report strong enterprise switching sales, then it would suggest that SDN in particular is starting to overhang buying.  I don’t think we’re anywhere near the time where SDN actually steals budget dollars, but if you’re a planner and you see a change coming down the line, you slow-roll your investment in the old long before you start spending on the new.  Enterprises would stretch the useful life of gear just a bit longer, and we’d see that in a dip in buying interest.

If Cisco does a lot better in enterprise switching, it would mean that enterprises were cheerfully ignoring any near-term impact of SDN on their network plans.  That would mean that serious white-box competition is not only nowhere to be seen, but not even being hinted.  The bigger the win Cisco posts here the smaller the chance that anything is going to upset the legacy switching apple cart.  This could happen; it’s the second-most-likely outcome after the slight-beat in switching I opened with.

On the server side, I think it’s likely that Cisco will also beat expectations on UCS sales, but not gain much in the way of profit since UCS margins are slim (and price pressure strong).  The big question is whether Cisco reports that the pace of UCS growth is accelerating, which they’d likely do by ballyhooing the progress.

We’re seeing in UCS sales a combination of where Cisco wants UCS to be in targeting terms, and how well that segment set is accepting it.  It’s not that Cisco would hesitate to sell servers to a pure-batch application play, but that their sales types would not be likely to pursue that sort of customer in the first place.  If Cisco whoops and hollers about UCS sales growth, then it’s saying that network-centric issues are a growing driver of server sales.  It means that the cloud, controller functions, and so forth are increasingly important.

The contrasting position, which is that UCS doesn’t seem to be sparkling, would suggest that network-centric server applications aren’t making much headway.  That would be a bad sign for the cloud, for SDN, and for NFV.  It would also be a bad sign for telecom spending, particularly if the other product areas suggest that telecom is weak as well.

The UCS positioning potentially plays off Cisco’s cloud strategies, which include its InterCloud offering to the telcos.  The first question will be whether Cisco makes the cloud connection strongly or blows a few cloud kisses.  If Cisco is seeing that buyer traction for cloud-centric IT and networking is developing, it will tout its accomplishments in that space.  If it doesn’t it will hang back a bit so it’s not tarnished with what might be a dry brush.

A very strong cloud story on the earnings call would mean that Cisco thinks the cloud is going to be big for it, and that it might be moving from its traditional fast-follower to leader role in cloud positioning.  The other indicator we’d want to watch there is software.

Cisco has never been able to make software work for it, but it’s pretty hard to see a cloud-centric vision that lacks software, or a cloud leader that lacks software leadership.  If Cisco decides to make a serious run at the cloud, at being the next IBM, then it will have to make software a major focus.  They’re not likely to talk about their future plans on an earnings call, but if Cisco says much about software or if Cisco starts talking about cloud software differentiation working for them, we’ll know what’s really behind the yammering.  It will be a precursor to some software-centric moves.

Another general indicator to watch is what Cisco says about competition.  Chambers dismissed white-box competition, and I think that’s fair to do given the difficulty in driving an SDN-centric vision of switching in the current market.  If Cisco says that competition is driving down margins and sales by driving down prices, but that unit buying is still good, then they’re saying that buyers are reinvesting in the present network model and applying a risk or ROI premium to the deals.  If they admit that buyers are waiting for a new model, they’re saying that Cisco will be there with that new model down the line.

So there we are.  I’ll be watching what Cisco says on their call, and I’ll comment here on what I think it signifies for us all.

Digging Deeper into Building Agile Services

Composing services in an agile and market-responsive way is a critical requirement for the future of network operators.  That means it’s critical that technologies like SDN and NFV support it, and if proponents of those technologies want to play the agility card to justify their preferred revolution, then their technology has to support it better than alternatives.  One of our challenges is that it’s hard to say whether that could happen because we don’t seem to be able to draw a picture of what we expect.

I’ve been in software design and development for many decades, and I’ve seen what happened in the software industry as we popularized computing.  Most haven’t really thought about this, but the fact is that microprocessor revolutions alone couldn’t create PCs or tablets or smartphones; you needed a lot of software.  It’s software that gives the devices utility.

Services are in many ways like software in a consumption sense.  We used to sell bit-as-a-service to large enterprises, and the revolution of the Internet was that we defined services that could be consumed by people who weren’t network professionals.  Just like personal software revolutionized computing, personal services revolutionize networking.

One of the key things that happened in software that facilitated “appliance populism” was the concept of object-oriented or modular programming.  When I learned to program there were no libraries of classes or objects to build from.  You had to write code for everything, and that took a long time, expert resources, and a lot of errors along the way.  Worst of all, there simply weren’t enough programmers to produce the quantity of stuff that a populist market would want.

Today we have languages like Java whose class libraries contain enormous pools of functionality, and we follow a library-class model when we write our own code.  Most software today was designed to be reused, to be plugged in here and there to make one development task serve a lot of application missions.  The trend is toward higher-level languages that make things easier, and development increasingly leverages units of functionality developed as “utilities” for broad application.

So it must be with services, I believe.  We should be looking at the future of services the way a developer would look at an application.  I need a “class library” of generalized useful stuff, perhaps some specialty objects of my own, and a way to assemble this and make it work.  If I have that, I can build something functionally useful in less time than a programmer of my era would have spent getting their code sheets keypunched.

So where is this concept?  We do hear about service libraries, but we don’t hear much about the details, and the devil is in those details.  Any developer knows that a class library has documentation on the functions and interfaces available, so there are “rules” that let a developer know how to integrate a given object.  We should be asking about those kinds of rules for services too, and I don’t hear much at all.

Let me offer an example.  We could say that a connection service has three configurations—LINE, LAN, and TREE—that express endpoint relationships.  If we added a functional dimension we could describe two other “configurations”, what we could call in-line and on-line.  In-line configurations for functional services are configurations where the service sits on a data path and either filters or supplements what’s sent along the “line”.  On-line means that the service attaches as an endpoint itself.  Got it so far?

Given this, we could now see how service composition would work.  For example, a simple three-site VPN is three LINEs connected to a LAN (multipoint) operating at L3.  Suppose we wanted to add a firewall to each site.  We’d now break our LINEs into two segments each, and introduce an “in-line” firewall service.  Simple.  If we want to add something else, we either add it by making it another “in-line” (encryption for example) or an “on-line” like DNS or DHCP.
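To make the composition concrete, here’s a minimal sketch in Python of that three-site VPN exercise.  The element kinds (LINE, LAN, IN_LINE, ON_LINE) come from the text; the class names and the split-a-LINE-into-segments helper are my own illustrative inventions, not any standard’s.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    kind: str          # "LINE", "LAN", "TREE", "IN_LINE", or "ON_LINE"
    name: str

@dataclass
class Service:
    elements: List[Element] = field(default_factory=list)

    def add(self, el: Element) -> Element:
        self.elements.append(el)
        return el

# A simple three-site L3 VPN: three LINEs attached to one LAN (multipoint).
vpn = Service()
vpn.add(Element("LAN", "l3-multipoint"))
for site in ("site-a", "site-b", "site-c"):
    vpn.add(Element("LINE", f"access-{site}"))

# Adding a firewall per site: each access LINE breaks into two segments
# with an IN_LINE firewall service sitting on the data path between them.
def insert_inline(service: Service, line_name: str, fn_name: str) -> None:
    service.elements = [e for e in service.elements if e.name != line_name]
    service.add(Element("LINE", line_name + "-inner"))
    service.add(Element("IN_LINE", fn_name))
    service.add(Element("LINE", line_name + "-outer"))

for site in ("site-a", "site-b", "site-c"):
    insert_inline(vpn, f"access-{site}", f"firewall-{site}")

# DNS attaches as an endpoint in its own right: an ON_LINE service.
vpn.add(Element("ON_LINE", "dns"))

kinds = [e.kind for e in vpn.elements]
# Three sites split into two segments each, plus three firewalls and one DNS.
print(kinds.count("LINE"), kinds.count("IN_LINE"), kinds.count("ON_LINE"))
```

The point of the sketch is how little machinery is needed once the topology vocabulary exists: the firewall insertion is purely mechanical, which is exactly what a service-creation GUI would automate.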

I’m not suggesting that these simple connection and service models are complete, but they’re complete enough to illustrate the fact that you can build services this way.  Maybe we need another model or two, but in the end everything would still obey a basic rule set.

An “in-line” has two ports to connect to and a service between.  I can connect in-lines to other in-lines or to LINEs.  That frames a simple set of rules that a service creation GUI could easily accept.  That means that a service architect could “build services” by assembling elements based on these concepts.

Obviously you need a bit more than topology to make this work.  An “interface” of any sort means an address space and protocol set, which in the modern world will usually mean either “Ethernet” at Level 2 or IP at Level 3.  You might refine either by specifying tunnel protocols and so forth.  Similarly you’d need to have some sort of SLA that provided basic QoS guarantees (or indicated that the service was best efforts).  So what we need, in addition to our hypothetical five topological models, are an interface description and an SLA.  If we have all this stuff we can conceptualize what a service architect might really do, and what might really be done to support that role.
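Those connection rules could be checked mechanically.  Here’s a hedged sketch: a port descriptor carrying topology, interface, and SLA, plus a rule function encoding the constraints from the text (an in-line’s ports join other in-lines or LINEs, and both sides must agree on the interface).  The field values are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    topology: str      # "LINE", "LAN", "TREE", "IN_LINE", or "ON_LINE"
    interface: str     # e.g. "ethernet-l2" or "ip-l3"
    sla: str           # e.g. "best-efforts" or a QoS class like "gold"

def can_connect(a: Port, b: Port) -> bool:
    """Apply the simple rule set a service-creation GUI could enforce."""
    if a.interface != b.interface:          # address space/protocol must match
        return False
    if "IN_LINE" in (a.topology, b.topology):
        # An in-line has two ports; each connects to another in-line or a LINE.
        other = b.topology if a.topology == "IN_LINE" else a.topology
        return other in ("IN_LINE", "LINE")
    return True                             # other pairings left permissive here

line = Port("LINE", "ip-l3", "gold")
fw   = Port("IN_LINE", "ip-l3", "gold")
lan2 = Port("LAN", "ethernet-l2", "best-efforts")

print(can_connect(line, fw))    # an in-line joins a LINE at the same level
print(can_connect(fw, lan2))    # rejected: interface mismatch
```

A real rule set would be richer (tunnel refinements, SLA compatibility), but the shape is the same: a handful of declarative checks, not custom integration code per service.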

A “library” in this model is a collection of objects classified first by the topology and then by interface and SLA.  An architect who wanted to build a service would first frame the service as a collection of functions and then map functions to library objects, presuming a fairly comprehensive library.  If that assumption wasn’t valid, then the architect would likely explore the functions available and try to fit them to match service opportunities.

One obvious consequence of this approach is that it’s implementation-opaque.  The “objects” are truly intent models, with an abstract set of features that would be realized in any number of ways by committing any number of different combinations of infrastructure.  You could build a Level 3 VPN, for example, by using an overlay encryption approach (IPsec), an IP feature (MPLS, RFC2547), a set of virtual functions/devices, or SDN.  If all these implementation options produced the same interfaces, features/topologies, and SLAs, then they’d be equivalent.
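That implementation opacity can be sketched directly: the same intent object deploys through any of several realizers, and from the outside all three are equivalent.  The realizer names and return shapes below are illustrative assumptions, not any real product’s API.

```python
class L3VpnIntent:
    """Abstract intent: L3 multipoint connectivity behind a fixed face."""
    topology = "LAN"
    interface = "ip-l3"

    def __init__(self, realizer):
        self.realizer = realizer        # any callable that commits resources

    def deploy(self, sites):
        return self.realizer(sites)

# Three hypothetical realizations of the same black-box feature set,
# echoing the options in the text: overlay encryption, an IP feature, SDN.
def via_ipsec(sites): return {"tech": "ipsec-overlay", "sites": sites}
def via_mpls(sites):  return {"tech": "mpls-rfc2547",  "sites": sites}
def via_sdn(sites):   return {"tech": "openflow",      "sites": sites}

sites = ["a", "b", "c"]
deployments = [L3VpnIntent(r).deploy(sites)
               for r in (via_ipsec, via_mpls, via_sdn)]
# Same intent, same sites, three different infrastructure commitments.
print(sorted(d["tech"] for d in deployments))
```

The service architect composes against `L3VpnIntent`; which realizer runs underneath is a deployment-time decision that never leaks into the service model.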

Another consequence is that management could be harmonized using the objects themselves.  A “service” as a collection of functional objects could be managed in the same way no matter what the implementation of the objects was, providing that we added a set of management variables to the SLA and expected everything that realized our function to populate those variables correctly.
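A tiny sketch of that harmonization, under the assumption that every realization populates the same management variables (the variable and object names are invented for illustration):

```python
def monitor(service_objects, sla_limit_ms=50):
    """Flag any object whose reported latency breaches the shared SLA,
    without ever looking at how the object is implemented."""
    return [o["name"] for o in service_objects
            if o["mgmt"]["latency_ms"] > sla_limit_ms]

# Mixed implementations, one management contract.
objects = [
    {"name": "vpn-core",  "impl": "mpls",      "mgmt": {"latency_ms": 12}},
    {"name": "fw-site-a", "impl": "vnf",       "mgmt": {"latency_ms": 80}},
    {"name": "fw-site-b", "impl": "appliance", "mgmt": {"latency_ms": 9}},
]
print(monitor(objects))    # only the breaching object is flagged
```

Note that the `impl` field never enters the decision; that’s the whole harmonization argument in three lines.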

This is what creates both the support for an SDN/NFV revolution and a risk to that revolution’s benefits.  If service agility and operations efficiency are the primary benefits of SDN and NFV, and if these benefits are actually realized using object/intent modeling above either SDN or NFV and embracing legacy options as well as “revolutionary” ones, then we could build agile services and efficient operations at least in part without the revolutions.

This isn’t to say that this higher-level approach would negate the value of SDN or NFV, only that it would force both SDN and NFV to focus on the specific question of how either technology could augment efficiency or agility inside the object/intent model.  While I think you could make a strong case for both SDN and NFV doing better, the improvement would be smaller than the one created by applying efficient object/intent models only to SDN and NFV, while expecting legacy to live with current practices.

That’s what I think is the big question facing the industry.  We cannot realize service agility and operations efficiency purely within SDN or NFV, in part because neither really defines a full operations and service lifecycle model and in part because it’s unrealistic to assume a fork-lift from legacy to SDN/NFV with no transition state.  Will SDN and NFV address the models within their own specifications and thus tend to associate model benefits with SDN and NFV, or will we have to solve operations and service modeling needs somewhere else, a place as likely to support legacy technology as the new stuff?

SDN and NFV cannot create agility, nor efficiency, by themselves—in no small part because the standards bodies have put many of the essential pieces in the “out-of-scope” category.  What they can do is work within a suitable framework, and at the same time guide the requirements for that framework so that it doesn’t accidentally orphan new technology choices.  I think we’re starting to see a glimmer of both these things in both SDN and NFV, and I’m hoping that will continue.

What Arista’s Telling Us about the Future of SDN

Arista’s quarterly results might be showing us something important about the evolution of networking.  The company reported stronger-than-expected revenue, but what surprised many on the Street and in the media was the comment that white-box switching wasn’t seen as competition.  That might even be why revenues were better than expected, I think.

I also think that there should be no surprise here.  Both SDN and NFV have struggled to show a benefit case, and in the case of NFV, thinking has evolved away from “capital cost” savings (meaning box costs) to operations and service agility benefits.  SDN hasn’t made that transition, and so you could argue that it’s still stalled in a weak benefit situation.

If you start at the top, buyers have made it pretty clear that their preference for the network of the future would be a cheaper (capex-wise) version of the network of the past, but one that could then respond to additional economies in operations and additional revenue-generating or revenue-enhancing features.  What they want is evolution and not revolution.

If you apply this to SDN, you see some immediate issues.  Evolution, in infrastructure terms, means being able to introduce new technology in place of old where the “old” has been sufficiently depreciated.  That means that to “evolve” to SDN you either have to make SDN devices serve in legacy missions, or you have to make legacy boxes serve in SDN missions.

Most of the vendors out there have already made their legacy devices capable of OpenFlow control, but obviously you don’t save anything by substituting a new box for an older version of the same box (unless the box is a lot cheaper now, which is what vendors are trying to avoid).  That leaves making SDN work in place of legacy, and in order to do that you have to either buff up the white-box features to the point where it’s simply a new switch/router, or you have to create an enclave of new white boxes that look like a virtual legacy device and can replace a series of legacy devices.

I don’t think that SDN players, particularly white box players but even those who supply SDN controllers, have thought this through.  They draw pictures of a network of white boxes without asking how we got there financially.  The most credible model for SDN “evolution” is one where a series of legacy switches of various age are first migrated to SDN behavior using OpenFlow and then gradually replaced with white boxes.  That’s possible, but the problem is that with an expected useful life for switches running around five years, the process is very slow.  It also poses the largest possible risk right up front, when you switch from legacy to OpenFlow control.

It seems like the Arista strategy is smart given this situation.  If you go to Arista’s website you have to dig to get anything on “SDN” at all.  Their products look, in their PR face, pretty much like competitive legacy switching devices.  Their switch literature concentrates on legacy support, meaning that it concentrates on introducing their products as substitutes for aging legacy switches from other vendors (Cisco comes to mind!).  Yes, when you do this you get SDN capability, but most competitive switches also offer that in some form.

One of the questions this poses is what would drive SDN faster than it’s now being driven.  Recall that analysts said SDN was no threat to Cisco, but NFV was.  Might the rationale for this be that SDN really doesn’t have a convincing driver?  Do we know what it might be?

We do, sort of.  The only thing that can drive switching or anything else is benefits, and benefits have to be either reductions in TCO or improvements in revenue or (in the enterprise case) productivity.  So we’re back to capex, opex, and service agility.

We’re back to the same problems with those drivers too, the same as NFV poses.  SDN has a very narrow scope, as narrow as NFV.  It’s addressed the bottom-layer technology issues and hasn’t yet gotten to the top layer.  Sadly, businesses connect to networks at the top not at the bottom, which means that we’re still struggling to climb up to where users actually get something different and valuable.

OpenDaylight seems to be on the right track here, with a little help from a topic that’s rolled into NFV via its SDN integration—the intent model.  The basic notion of ODL is that you give it a service in abstract form at a northbound interface (NBI), and it uses a variety of southbound interfaces (SBIs) to realize that service using whatever resources are provided.  What makes ODL valuable versus “basic” OpenFlow is its ability to control devices that are not OpenFlow white-boxes, and to exploit that capability to tell a network evolution story.
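The NBI/SBI split described here can be sketched in a few lines.  This is an illustrative model of the pattern, not ODL’s actual plugin API: an abstract service arrives at the northbound interface, and a southbound plugin is chosen per device, so OpenFlow white boxes and legacy gear are realized side by side.

```python
# Hypothetical southbound plugins, one per device type.  The names and
# actions are stand-ins for what real SBIs (OpenFlow, NETCONF, ...) do.
SBI_PLUGINS = {
    "openflow-switch": lambda svc, dev: f"{svc}: flow rules pushed to {dev}",
    "legacy-router":   lambda svc, dev: f"{svc}: netconf config applied to {dev}",
}

def realize(service_name, devices):
    """Northbound entry point: map one abstract service onto a
    heterogeneous device inventory via the matching SBI plugin."""
    return [SBI_PLUGINS[dtype](service_name, name) for name, dtype in devices]

actions = realize("l2-vpn", [("sw1", "openflow-switch"),
                             ("r7",  "legacy-router")])
for a in actions:
    print(a)
```

The evolution story falls out of the dispatch table: add a plugin for the gear you already own and the same northbound service spans old and new infrastructure.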

The question for SDN is whether this NBI/SBI cooperation leads to evolution to SDN and not just to evolution.  Remember, the buyer doesn’t particularly want new technology—that’s just a path to new risks.  They want better benefits, including lower costs.  Might we be seeing this whole abstraction thing creating a path to lower cost in another way—not through technology but through commoditization?

Premier players like Cisco get more for their devices, in part because of their brand.  If we put an ODL mask on a device or device complex, does that legacy device brand shine through?  Arista might be benefitting from the fact that it might not.  They might be a specific example of buyers taking the low-apple pure-device-cost gains now, and letting more profound benefit sources develop in their own time.  Corporate-speak translation: Save a buck today and live to see tomorrow.

Cisco’s greatest threat, then, would be not the white boxes but the box-anonymizing architectures.  Arista’s greatest benefit might still be its EOS, but the reason that might be a benefit is that it would allow Arista to do cheaper legacy devices today and evolve them if necessary to a more benefit-complicated future.  Cisco could argue that’s what IOS and all their other three-letter acronyms do too, but they have to be cautious because if they encourage users to migrate faster than the 20%-per-year depreciation tradition would allow, they put more of their own devices up for grabs.

It’s hard to escape the conclusion that vendors in this space, from Arista to Cisco, are hurting themselves.  Arista should be driving abstraction full-bore because anonymizing stuff in intent-based NBIs would make what’s underneath brand-insensitive.  Cisco should be driving revolutionary benefits through its own application networking APIs to lance the boil of change and harness those benefits to justify continued investment in legacy infrastructure.  Nobody is doing quite enough, which raises the chance that somebody will decide to do more, and by doing that generate a lot of excitement.

Oracle Widens its Positioning Lead in NFV, but New Issues Loom for All

Of all the suppliers (or even alleged suppliers) of NFV, the one who has shown the greatest and fastest gain in credibility is Oracle.  Over the last year they’ve jumped from almost-nonentity status to one of the three firms most likely to be called a “market leader” and “thought leader” by operators (Alcatel-Lucent and HP are the other two).  What’s behind this, we might wonder.  Well, let’s see.

The main cause of Oracle’s jump seems to be pretty simple.  The company started early this year to promote a very operations-centric vision of NFV.  This was the very moment when operators were starting to realize that the trials and PoCs being run were largely science projects with no direct pathway to proving out benefits needed for a real trial.  Was that a coincidence, we might wonder?  I don’t know of any way to prove it one way or the other.

We also can’t know whether Oracle has been continually developing to their operations-centric NFV vision or whether they’re simply responding to media attention by singing the song that gets the most jukebox nickels thrown.  Oracle’s recent announcements certainly double down on operations as the focus of NFV, and they’re gaining significant traction with operator CIOs (who are the people running the OSS/BSS systems) and with CFOs who think that a new approach to NFV is needed to reap some tangible benefits.

Deep inside, it’s more complicated.  Oracle does have some unique credentials that could be valuable for NFV, but I don’t think they’re particularly tied to operations or OSS/BSS.  If Oracle emerges as a winner in NFV it might well be because they changed horses in mid-positioning.

Efforts to standardize or define NFV have historically focused on the deployment of virtual functions.  That’s not unreasonable given that virtual functions are what NFV brings to the table.  The problem has been that NFV’s prospective benefits—capex reduction, opex efficiency, and service agility—are really difficult to realize if you limit your efforts to deploying those virtual functions.  Services and operations are end-to-end in topology and lifecycle, and so you have to attack them on a broad front to make any difference that matters.

The critical question of benefits has always come down to how you could build a super-efficient form of service automation.  If you could define services so that they could deploy on demand and so that service events could be handled by software, you’d cut the heart out of TCO and increase service agility by orders of magnitude.  NFV correctly framed the notion of management and orchestration or MANO as the heart of the future of networking…but they only defined it for those virtual functions and left the rest of the service and network out.

I said in an early blog on NFV that MANO was something that you could see through three different lenses.  It could be an NFV function and then expand to embrace the total service and infrastructure.  It could be an operations function and expand to embrace NFV deployment.  Finally, it could be something that knows lifecycles and processes and events and lives outside both NFV and OSS/BSS, and unites all the old and new into one model and vision.  The ISG has not taken the first path, and it’s hard to say how that vision could come about without ISG endorsement.  That means the second and third visions are the best shot, and the second is the one Oracle has taken…sort of.

Oracle’s approach to the problem is largely based on what I’ve called the “virtual device model”, which means that NFV is framed to operations systems like a device.  When it’s called for, an independent process resolves the resource commitments needed to make the NFV virtual device into something real.  That approach is a reasonable first step for NFV but it’s not an optimum strategy.  NFV needs to integrate resource and service management, which can’t be done if the deployment of resources and the management of those resources is opaque to operations.

Alcatel-Lucent’s and HP’s NFV strategies are actually better than Oracle’s in terms of implementation, because either would allow for the handling of service events to be mediated by a service model.  Why then is Oracle getting such good traction?  The answer is that while they may not be integrating NFV into operations, they are focusing on modernizing operations overall.

You could gain as much from redoing OSS/BSS implementations around a model-and-event framework as you could gain from either SDN or NFV.  In fact, it’s the operations improvements that are essential.  As operators have moved past the “prove-you-can-deploy-a-VNF” stage of NFV testing, they’ve recognized that to deploy NFV you’ll have to make a business case, not prove a technology.  Central to all NFV business cases is efficient deployment and management.

If you look at the NFV projects that really matter, the ones that are actually trying to prove out NFV’s benefits, you see that they’re divided into two groups, one that approached the issues at the service level from the top down, and the other that approached issues at the technology level.  Oracle’s laser focus on the service lifecycle has given it a lot of curb appeal to the first group, to the point where in at least one Tier One they ended up with a compelling position with the account before any real discussions of NFV features took place.

The question is whether this kind of lead can be sustained.  It’s always dangerous to base your strategy on the presumption that your competitors will continue to give you free rein with a key buyer issue.  However, it’s fair to say that Oracle’s two arch-rivals, Alcatel-Lucent and HP, have so far done just that.

Oracle has a secret weapon, though, in database technology.  NFV operations isn’t a network problem; it’s a database problem.  You can’t do NFV management effectively at scale without implementing a kind of database buffer between the resources to be managed and the management and operations elements.  Oracle has all the knowledge and technology needed to do a premier job in that space, though so far they’ve made no comments on it that I’ve been able to find.

And lurking behind all of this are the two pivotal NFV issues, which are intent models and IoT.  The former could finally expose the real issues of operationalizing NFV and so create a scramble for implementations that address those issues.  The latter could be the camel’s nose that pulls a very large vendor under a very important tent, because IoT is a major opportunity for NFV if it’s conceptualized realistically.  Oracle has experience in both these areas too, and so they’re likely to broaden their engagement as time passes.  Since Alcatel-Lucent and HP are likely to start singing operations more effectively to counter Oracle’s early positioning, this will create a three-horse race that could really focus the industry on the right questions for NFV to answer.

Does the Street Have it Right on the Impact of SDN and NFV on Cisco?

Does the Street have it right when they say NFV could hurt Cisco?  A Barron’s blog suggests that SDN doesn’t pose much of a threat to Cisco but NFV does, citing a financial analyst’s report.  The perspective of Wall Street on tech is sometimes helpful because it exposes the issues that could drive stock prices.  Sometimes it just exposes technical biases in the Street’s thinking.  Which is it this time?

The report suggests that enterprises are moving apps to the cloud and to high-density servers, which reduces the need for switching—particularly top-of-rack stuff.  SDN doesn’t enter into this threat because enterprises lack the software skill to implement it.  That assumption doesn’t square with what I hear from the enterprises.

I’ve not found any enterprise who says they are reducing data center switching because of cloud use or dense-core servers or hyperscale.  It may be that switching would grow more rapidly without multi-core, but of course we’ve had multi-core servers for decades.  I have to dismiss the argument that something unusual is happening here.

As far as the cloud goes, enterprises tell me that they are migrating some applications, and some pieces of others, to the cloud, but that this process has had more impact on stranded application-specific servers than on the data center.  None of them told me that they saw a reduction in data center switching arising from the use of cloud computing.

The question of whether SDN’s threat to Cisco is mitigated by lack of enterprise software skill is hard to survey because it amounts to asking someone “Are you as dumb as the Street thinks you are?”  What I have found is that it is difficult for a business to justify a shift to a white-box data center because most of their data center technology has at least three years of useful life remaining.  The biggest reason for a lack of SDN adoption is that inertia.  If SDN “saves money” or “reduces capex” then why not reduce it further by simply not buying anything?

The problem with SDN, IMHO, isn’t the software skill of the buyer it’s the positioning skill of the seller.  Capex reduction is an incredibly weak justification for modernization because it usually requires a forklift update to “modernize” and a lot of what’s getting lifted is really being tossed out while there’s still residual depreciation.  These projects have a negative ROI unless you can cover that stranded cost with some other benefit.  What that benefit might be isn’t for buyers to puzzle out, it’s for sellers to articulate and prove.

But is SDN a threat to Cisco?  As I pointed out in an earlier blog, at some point central control of forwarding and connectivity in the data center is going to happen.  My model says that even in 2018 there will be significant impact from these trends.  They don’t force white-box substitution but they do encourage buyers to think about transitioning from expensive brands (like Cisco) to cheaper formulations.  I personally think that SDN will be less an impact than virtual switching and routing, and that white-boxes with an operating system that supports legacy switching/routing will be easier to introduce.  I think Cisco will see an impact from that source as early as the end of next year.

NFV is (as you’d expect) more complicated.  The article suggests that the goal of NFV is to replace switches and routers with virtual appliances, but that hasn’t been the focus of NFV at all.  Dynamic deployment and optimization is most useful for customer-specific elements of the network, or for dynamic multi-component ecosystems (like IMS or CDN).  A virtual router is really a hosted router, not a VNF.

Enterprises are likely to be transitioned to virtual switching and routing as a part of an expansion in the scope of VPN/VLAN services, something that is already starting to shape vCPE to include network edge switch/router technology.  This is probably the greatest impact point for Cisco because there are a lot of edge devices in a network, and so losing even a percentage of that TAM would hurt.  But NFV isn’t really the driver here, it’s just lent its name to a notion of hosting edge functions on commodity multi-purpose devices.  I’ve pointed out in the past that you don’t need NFV for vCPE.

NFV deployment at an optimum level would actually benefit Cisco, though.  NFV could create a market for hundreds of thousands of new servers and switches.  Cisco makes both.  The NFV data center market alone could increase data center switch TAM by 25%.  Since UCS has lower margins than switching/routing today, server gains might not boost profits the way Cisco would like, but by 2018 margins on switching/routing will be lower too.  Any port in a margin storm.

I think the analysis of SDN and NFV impact on Cisco that the Barron’s article cites doesn’t hold water.  This doesn’t mean Cisco skates by on SDN and NFV impact.  The biggest problem for Cisco is actually something that I blogged about all last week—intent models.

In a software-based network, where “devices” are really virtualized collections of software features, a new target of interoperability has to be defined.  You can’t look for box equivalence.  That new target, I think, will be the “black box” or intent-model feature set.  Anything that can look like an intent-router is a router.  That includes virtual equivalents, but also real ones.  If you can harmonize a virtual function set to look like a physical device, you can harmonize real device behavior to look like that same thing.  Where then is Cisco’s brand superiority going to take it?
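A minimal sketch may make the “intent-router” point concrete.  All the names here are hypothetical illustrations, not any real API: the service layer sees only the abstract intent model, and whether the thing behind it is a physical box or a hosted software function is invisible.

```python
from abc import ABC, abstractmethod

# Hypothetical "intent-router" model: consumers program against the
# abstract interface only; the implementation behind it is anonymous.
class IntentRouter(ABC):
    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...
    @abstractmethod
    def routes(self) -> dict: ...

class PhysicalRouter(IntentRouter):
    """Adapter wrapping a real device's config channel (stubbed here)."""
    def __init__(self):
        self._table = {}
    def add_route(self, prefix, next_hop):
        self._table[prefix] = next_hop   # would push device config in reality
    def routes(self):
        return dict(self._table)

class VirtualRouter(IntentRouter):
    """Adapter wrapping a hosted software router (stubbed here)."""
    def __init__(self):
        self._table = {}
    def add_route(self, prefix, next_hop):
        self._table[prefix] = next_hop   # would call the VNF's API in reality
    def routes(self):
        return dict(self._table)

# To the service layer, both satisfy the same intent model.
for r in (PhysicalRouter(), VirtualRouter()):
    r.add_route("10.0.0.0/8", "192.168.1.1")
    assert r.routes()["10.0.0.0/8"] == "192.168.1.1"
```

The design point is that once the adapters exist, nothing above the interface can tell the implementations apart—which is exactly the brand-anonymizing effect described above.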

That’s the truth of the “impact of SDN and NFV” for Cisco.  Neither of these two technologies is going to prove out without strong adoption of intent models.  Intent models, not SDN or NFV technology as we know it, pose the threat to Cisco, by anonymizing the way that network features are created.  Anonymize the features, and you anonymize the vendor.

The solution to this would be for Cisco to be a feature leader, but the company is legendary for seeking a “fast follower” role.  Wait till the market develops it and then kill early players with your massive brand.  That works when brand matters, but the time when it does—at least matters in the traditional sense—may be passing.

Will SDN and NFV Standards Broaden Operator Choices?

As an industry, networking has always been very dependent on standards.  One big reason is the desire of operators (like all buyers) to avoid vendor lock-in.  Standards tend to help make boxes interchangeable, which reduces vendors’ power to control modernization and evolution.  SDN and NFV are “standards-based” technologies, so you might think they’d continue this trend.  They might in fact accelerate operator choice, but for reasons a lot more complicated.

Of the two technology revolutions in networking that we hear about today, SDN is the more “traditional” because it’s aimed specifically at creating a service behavior set and not at hosting the stuff that does that.  Packets go from Point A to Point B in SDN because they got forwarded along a path, and that pretty much describes what happens today in legacy networks.  The difference with SDN lies in how that path is decided.

Adaptive behavior, meaning dynamic discovery of topology/connectivity, is the basis for path determination today.  A router expert once told me that about 80% of router code was related to path determination, and what SDN would do is pull that logic out of devices and send it to a central control point where paths were analyzed and imposed on the network (via OpenFlow).
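A toy sketch of that centralized model, with purely illustrative names: the controller holds the whole topology, computes a least-cost path, and turns it into per-switch forwarding entries—the role OpenFlow rule installation would play in a real controller.

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a topology shaped like {node: {neighbor: cost}}."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr], prev[nbr] = nd, node
                heapq.heappush(heap, (nd, nbr))
    # Walk back from destination to source to recover the path.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1]

def flow_entries(path, flow_id):
    """One (switch, match, next-hop) entry per hop along the path."""
    return [(path[i], flow_id, path[i + 1]) for i in range(len(path) - 1)]

topo = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 1}, "C": {"A": 4, "B": 1}}
path = shortest_path(topo, "A", "C")   # -> ['A', 'B', 'C']
rules = flow_entries(path, "flow-1")
```

The point of the sketch is the division of labor: path analysis lives entirely at the control point, and the devices are left holding nothing but the imposed forwarding entries.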

If path determination is the thing most impacted by SDN you’d probably think that early SDN applications focused on where it was the biggest part of the problem, but that’s not been the case.  SDN so far has deployed more in the data center where path cost differences are negligible and where failure of devices and paths is rare compared with the WAN.  SDN has really been driven by the need to explicitly control connectivity, particularly in multi-tenant applications.

What has SDN done to hardware?  Despite all the white-box talk, not as much as many thought.  The early applications have actually been more overlay-network-oriented, where we build tunnels and routes on top of traditional (usually Ethernet) connectivity.  Some SDN architectures (like the original Nicira) were designed from the first to ride on legacy connectivity, not displace it.  This may change over time, though, because if connectivity management is ceded to higher layers, then the data center switches are just transport devices, harder to differentiate.  So that gives operators more choices, right?

Actually it doesn’t.  The cheaper and less differentiated a box is, the harder it is to drive new-player opportunities.  Nobody wants to fund a startup in a space where everything looks like everything else.  So where SDN is really driving operator choice is in the software, a network area that’s only now emerging as important.

NFV is even more complicated.  First and foremost, NFV separates software (in the form of embedded features or “firmware” or “network operating systems”) from hardware, which allows the underlying platform to be commoditized.  Software, left on its own, is now free to innovate.  The theory has always been that virtual functions hosted on commodity hardware would lower the total cost of ownership, and broaden operator choices by preventing proprietary lock-in.

It’s hard to say that will happen.  Two virtual functions are interchangeable only if they meet a series of qualifiers.  They have to run on the same thing, do the same thing, expose the same interfaces, and be managed in the same way.  If we have a service chain created to build vCPE, for example, the elements in the chain are interchangeable if all these criteria are met and any one element can be switched out for a competitive brand.  Many think this is one of the goals of NFV specifications, but it’s not.

NFV does not dictate any service data plane interface, any functionality, or any management interface.  Given that, vCPE created by chaining three functions wouldn’t predefine any set of interface standards to conform to.  It would be a happy accident if we had three or four providers of individual virtual functions (VNFs) that could be freely substituted for one another.
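Those four qualifiers can be sketched as a simple compatibility check.  The descriptor fields here are illustrative assumptions, not anything the NFV specifications mandate—which is precisely the problem the text describes.

```python
# Hypothetical VNF descriptors: two functions are substitutable only if
# platform, function, exposed interfaces, and management model all match.
def substitutable(vnf_a, vnf_b):
    keys = ("platform", "function", "interfaces", "mgmt")
    return all(vnf_a[k] == vnf_b[k] for k in keys)

fw_vendor1 = {"platform": "kvm-x86", "function": "firewall",
              "interfaces": ("eth-in", "eth-out"), "mgmt": "netconf"}
fw_vendor2 = {"platform": "kvm-x86", "function": "firewall",
              "interfaces": ("eth-in", "eth-out"), "mgmt": "rest"}

# Same function, same data plane -- but a different management model,
# so a drop-in swap would break the operations tooling.
print(substitutable(fw_vendor1, fw_vendor2))  # False
```

Since no spec pins down these fields, two vendors’ firewalls agreeing on all four is the “happy accident” the paragraph below refers to.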

What this would mean is that NFV might actually promote vendor lock-in at one level and reduce it at the level below.  Software features have enough range of motion in implementation that vendors would be able to develop ecosystems of VNFs that worked well together but didn’t easily accommodate substitutions for individual functions.  They would be able to pick from commodity platforms, though, right?

Perhaps.  NFV is a new world, and the NFV specifications are not broad enough to define every piece of it.  Operators realize that all NFV implementations are not going to be equal, end-to-end, and in fact that not all will even be able to make the business case.  The NFV vendors often provide VNFs, platforms to host stuff, operations and management and orchestration.  All these work within the ecosystem the vendor offers, but we already know that you can’t be sure you could substitute one VNF for another.  You can’t substitute one MANO for another either, and it doesn’t appear you can substitute even Virtual Infrastructure Managers.  If you buy an NFV ecosystem, you’re buying a single vendor or a vendor and specific partners.

Before you take this as doom and gloom, let me say that the long-term prospects for an open and efficient NFV implementation are very good.  What we see now in NFV, and even in SDN to a degree, is a growing-pain problem.  Whether it would even have been possible to standardize NFV fully, to make every piece interchangeable, is debatable.  We certainly didn’t do it.  That means that we’re going to have a bunch of implementations out there, each presenting a unique slant on the value proposition for NFV.  The ones that get the best traction will become the focus of more partnerships, be expanded more by new product features, and will define the shape of the NFV future.  The others will die off.

You can see some of this already.  Alcatel-Lucent offers NFV, in a sense, as a vehicle for mobile evolution.  HP offers a broad ecosystem of partners with useful VNFs, and Oracle offers an operations-slanted vision.  The differences in implementation are way more subtle than the positioning (which for some vendors is still too subtle) and the early-market drive that the vendor hopes to harness.

What will actually create an open NFV ecosystem, though, is the “service agility” driver.  In order to create agile services that can be efficiently operationalized, operators will need to assemble services from little functional blocks that can be stitched together using deployment automation tools.  As I’ve suggested in prior blogs, these blocks will each be “black boxes” or “intent models” representing features disconnected from implementation specifics.
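As a rough sketch of that assembly model (all names hypothetical), each block hides its implementation behind an intent model and exposes only a deployment step, so the automation tool can stitch any mix of vendor implementations into one service.

```python
# Hypothetical service assembly from intent-model feature blocks.
class FeatureBlock:
    """A black-box feature: a name plus a deploy action, nothing more."""
    def __init__(self, name, deploy_fn):
        self.name = name
        self._deploy = deploy_fn

    def deploy(self):
        return self._deploy()

def build_service(blocks):
    """Deploy the blocks in order; a real tool would also wire interfaces."""
    return [b.deploy() for b in blocks]

# The implementations behind the lambdas could come from different vendors.
service = [
    FeatureBlock("vFirewall", lambda: "firewall-up"),
    FeatureBlock("vRouter",   lambda: "router-up"),
]
states = build_service(service)   # ['firewall-up', 'router-up']
```

The interconnection conventions between blocks are exactly what the specs leave open—which is why, as the next paragraph argues, operators’ own service libraries will end up defining them.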

The standards for interconnection in NFV will be shaped by the operators themselves when they build their service libraries.  It’s just that the early winners among NFV vendors will be in a better position to help do the shaping.