What Do the CIOs and CFOs Think about NFV?

NFV has a lot of constituencies to appease within each operator to get to deployment, and so far engagement has largely been with the CTO organizations.  I’ve noted in past blogs that the operators’ CFOs are concerned about the NFV business case and CIOs are concerned about operations.  I thought it might be interesting to review the aspects of NFV technology that are of most concern to the CFOs and CIOs.  These might be the directing factors in moving from lab to field trials and deployment because they might be issues that will have to be addressed to get broader buy-in for NFV.

The number one CFO concern is NFVI to VNF/MANO compatibility.  The largest investment an operator will make in NFV is the NFVI, and CFOs are concerned that the “best” or “most efficient” NFVI might not be workable with all the NFV implementations.  Most say they are not clear on the relationship between the NFVI and the Virtual Infrastructure Manager that’s supposed to link it to the rest of the NFV software.  Is there a “standard” VIM, or is there a VIM for each NFVI, depending on the vendor of the hardware and the software?  Can you have multiple VIMs, and if not how would you integrate servers from—for example—Cisco and HP?

The number one CIO issue is the viability/scalability of the NFV management and operations model.  A service made up of virtual functions could have a dozen VMs, as many or more internal tunnels between them, and linkages to existing infrastructure for customer delivery and transport connectivity between data centers.  How this will be managed is completely unclear to the CIOs.  The specifications suggest that VNFs present themselves to management systems as “virtual devices” managed as the real devices would be, but that doesn’t address all the management of resources that realize those virtual devices in functional terms.  With no clear picture of what could be called “NFVI operations” they can’t validate the opex, and thus can’t sign off on a TCO for NFV.

CFOs’ second-most-significant concern is the VNF cost model itself.  The presumption in NFV from the first was that inexpensive software implementations of functionality hosted on commodity servers would be cheaper than actual appliances.  What CFOs say is that the providers of today’s devices want to convert them to VNFs and price the licensing such that they have the same profit as before.  CFOs are particularly concerned about the “pay as you grow” licensing model, which would increase their fees to VNF providers as customer volume grows, rather than setting a fixed license charge.  The as-a-service model seems to CFOs to penalize them for success.
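To make the CFO worry concrete, here is a minimal arithmetic sketch comparing a usage-based ("pay as you grow") license with a fixed annual license. Every number in it (customer counts, the per-customer fee, the fixed fee) is a hypothetical value chosen for illustration, not data from any operator or vendor.

```python
def pay_as_you_grow_cost(customers_by_year, fee_per_customer):
    """Total license fees when the yearly fee scales with customer count."""
    return sum(n * fee_per_customer for n in customers_by_year)


def fixed_license_cost(years, annual_fee):
    """Total license fees when the vendor charges a flat fee per year."""
    return years * annual_fee


# Hypothetical customer growth over four years of a successful service.
customers = [1_000, 5_000, 20_000, 50_000]

usage_total = pay_as_you_grow_cost(customers, fee_per_customer=10.0)
fixed_total = fixed_license_cost(len(customers), annual_fee=100_000.0)

print(f"pay-as-you-grow total: {usage_total:,.0f}")
print(f"fixed license total:   {fixed_total:,.0f}")
```

With these assumed figures the usage-based model is far cheaper in year one (10,000 versus 100,000) but costs nearly twice as much over four years, which is the "penalized for success" pattern the CFOs describe.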

The number two CIO concern is the integration of operations/management processes for legacy infrastructure with VNF lifecycle management.  Nobody in the CFO or CIO organizations thinks that future services will be completely VNF-based, and in the early stages it’s likely that most services will have significant legacy device participation.  Can you improve service agility or operations efficiency when you’re not able to manage the service from end to end?  They don’t think so, and having a different management model for legacy than for NFV makes it hard to even know what management costs would be for a given service—you couldn’t tell what would end up legacy versus VNF.

CFO issue number three is actually also CIO issue number three, but obviously the reasoning is a bit different.  The issue is portability of VNFs.  CFOs believe that many of the major vendors will develop VNFs that have explicit ties to their own implementation of MANO, VNFM, and NFVI.  This makes sense from the vendor perspective; they can use their key VNFs to pull through their implementation of NFV.  The problem for the CFO is that they lose pricing power and they risk replication of assets—silos—because they need specific VNFs from different vendors and end up with separate NFV ecosystems because of that.

CIOs’ concern here is in management.  They point out that there’s no specific mechanism for federation of NFV assets, nor really any satisfactory model of how multiple NFV implementations could even connect within a given operator.  That could silo management visibility, creating a potential disaster for service management efficiency.

Both CFOs and CIOs point out that non-portable VNFs would mean that if a given NFV provider went out of business, failed to keep up with NFV evolution, or simply dropped their NFV product, the operator might have to put together a whole new ecosystem just to continue to sell their current services to their current customers.

The final problem for CFOs is the lack of a convincing point of purchase.  What every buyer wants is a collection of credible sellers.  Although there are credible sellers for NFV, it’s not clear whether any one of them is sufficient and it’s pretty clear that there’s little basis for combining them to form a multi-vendor ecosystem.  Nobody wants a flavor-of-the-month NFV solution, and that seems a real risk now because even the media—ever hungry to name winners in any contest—seems unable to name one with NFV.

For CIOs the final issue is that it’s not clear whether we need an enhancement to current OSS/BSS, a next-gen operations model, or maybe no model at all.  Service automation implies lifecycle automation, which could represent a major shift in the way operations software works.  The TMF reflected such a shift in their GB942 and NGOSS Contract stuff, but this hasn’t been widely implemented.  None of the CIOs I talked with had done so, which is too bad because it might resolve some of the debate on where operations software should be taken.  I was at a Tier One operator meeting where two people sitting next to each other had totally incompatible views of what was needed in operations—retaining the old or tossing it in favor of the Great Unknown.  That’s reflective of the confusion overall, and that’s a problem because of the obvious key role that OSS/BSS plays in service agility and operations efficiency.

So there we have it.  You can see that there are two issues here.  First, the “new” players within the operator organizations are yet to be fully convinced that NFV is the right answer (though they really want to believe it is because they’re not sure what else is on the horizon).  Second, those new players don’t have the same issues on their bucket lists.

As I’ve said before, there’s no reason why we can’t address these points; even today I think we could meet enough requirements with some of the existing NFV implementations to build the necessary momentum.  We do need to meet them, though, and we need to raise all the issues and address them if we want NFV to develop competitively and to its full potential.

How Will the Major Vendors Fare in This Fall’s Operator Planning?

I blogged earlier this week about the “fall planning cycle” for network operators, and the issues and forces associated with that cycle this year.  An obvious follow-on question is how vendors will be impacted by the cycle.  Will some be hurt by events, others helped, and is there still time to move yourself from the “hurt” to “help” group?  Time is short here, so whatever happens will have to be focused as much on positioning as on product.  I can’t review every vendor in a blog like this, but let’s look at some major “revolutionary” vendors and see where they are and what they might do.

Alcatel-Lucent is one of the functional market leaders in NFV and is the runaway winner in SDN.  As has been the case since the merger that created them, Alcatel-Lucent is near dead last in terms of positioning effectiveness.  Always a geeky player, they’ve relied on technical engagement to advance their goals, but the problem with the current SDN/NFV revolutionary period is that there are a lot of new players to sell to.  Even where Alcatel-Lucent has the strength to promote a holistic strategy, they inexplicably separate their wonderfully unified stuff into functional silos.  OSS/BSS, NFV, IMS, Rapport…all of these should be bricks on the pathway to the new networking age.  They’re not.

The biggest challenge for Alcatel-Lucent is positioning Nuage, their SDN strategy, given the dominance of traditional routing within the company.  You have to protect your sales, of course, but you can’t protect them for long if you ignore evolution and hunker down on the present.  Earth to Alcatel-Lucent: you can’t virtualize most of your infrastructure by 2020 (as AT&T says it will) by staying with Big Iron.  Alcatel-Lucent is at risk of losing their SDN lead while they’re dallying on whether SDN matters enough to promote it.

Cisco is the poster child for dallying in the eyes of most.  They always seem to be trying to cap any new development, largely because they are.  Why foster change when you’re winning the current game?  In the case of SDN, Cisco definitely plays a “cap” game; they’ve built a software veneer on top of the usual infrastructure to tap off a lot of the early motivation to change to “real” SDN.  The problem for them is that they’re defending against their own success.  Cisco’s best chance to be the next IBM is to ride the wave of the “network-facilitated cloud”, which uses SDN for tenant networking and NFV for deployment and operation of features.  If Alcatel-Lucent were stronger in Nuage positioning they’d have put Cisco’s SDN strategy to bed already.  There’s still time for them to do that.

While Alcatel-Lucent could clean Cisco’s SDN clock, HP is the biggest potential disruptor of the networking industry.  HP has, via M&A, a decent SDN position, a superb NFV story, and what might be the best IoT strategy of anyone.  They have all the products needed to build the virtual world of the future, and most importantly they have the hardware framework that will earn the most revenue, so they have the best financial incentive to stay engaged.  Their problem is the ISG’s Proof-of-Concept activities.  HP got seduced into believing that if you won at PoCs you won in deployment.  That would be true if the PoCs were aligned with convincing business cases, but they aren’t.

The future of NFV and SDN is the future of networking, either proactively or reactively.  HP needs to build its own ecosystemic story, crossing over the boundaries between its product areas, its technologies, and most important crossing over all those PoCs.  We are building one network here, gang, not a bunch of PoC silos.  What your vision is for that one network must be communicated clearly and (most important) quickly.

Huawei is way beyond the 900-pound gorilla phase of evolution in networking.  They are the price leader, and likely will be forever.  That gives them both assured success even if no real network evolution happens, and a solid shot at framing the future if they want to.  That’s because low prices can ease the risk burden that buyers of revolutionary stuff always have to face.  Huawei knows all of this, I think.  They have quietly managed to put together a lot of strong elements in NFV and SDN, not only the glamorous high-level stuff but also some of the base technology stuff.

Huawei has two problems that are related.  First is their lack of marketing/positioning skill.  While they’ve been getting better, Huawei isn’t a marketing-driven player and you have to be that to foster a revolution, or take your place in one.  The second problem is their political impasse with US carrier sales.  Not only are the US operators giant spenders, they’re also often on the leading edge of technology changes.  Further, they are close to the tech media both geographically and culturally.  If you are not winning hearts and minds in the US, the US media doesn’t take you as seriously.  Huawei can never fix their political problems, but they could position.

Oracle doesn’t need to learn much about positioning, in my view.  Their technology credentials in NFV are limited and their credentials in SDN even more so, but they were smart enough to see something that all the SDN and NFV leaders failed to see—and still largely fail to see.  You cannot win at either SDN or NFV without an operations story so complete and compelling that it shines like sunrise in the darkness.  They’ve been making (can you believe it!) OSS/BSS announcements and relating them to NFV and SDN!  From the PR of most of the NFV players, you’d think there was no such thing as an OSS/BSS.

Service agility and operations efficiency depend on operations systems.  Oracle has grasped that, but they are still weak in terms of how their operations vision actually combines with either SDN or NFV.  You can’t sell SDN or NFV without operations, but you’re not going to upset the network applecart by starting to revamp operations and hoping it will trickle down.  That’s why this whole SDN/NFV thing is complicated; it’s inherently multifaceted, both in technology and constituency.

Oracle is the only player in the revolutionary networking space that actually needs new product functionality.  They should be looking out there for somebody with strong SDN and NFV credentials to buy—somebody with good technology but not too much market cap.  Cisco, I suspect, has the technology it needs but is still focused on retention of the old model—“fast following”.  Alcatel-Lucent is torn between a revolutionary cadre and a bunch of stick-in-the-muds, and HP is chasing too many different rabbits with too many different hounds.  Huawei may be the player doing the most right, but they win primarily if everyone else messes up.

Which, so far, they are.  Every one of these vendors needs to make a major SDN/NFV/operations policy announcement by early October at the latest.  If anyone does that well, they gain an upper hand in budget planning for 2016.  If only one does it well, they may have won the SDN/NFV future.

What We May Have Here is a Quiet Revolution

If you look at the combined state of networking and IT, the most interesting thing is the fact that it’s getting harder to find the boundary point.  We’ve been linking the two since online applications in the ’60s.  Now, componentization of software, virtualization of resources, and mobility have combined to build agile applications that move in time and space and rely on the network to be almost an API between processes that spring up like flowers.

While software and this network/IT boundary are symbiotic and so co-evolving, you could argue that our notions of platforms have been less responsive to the changes.  In most operating systems, including Linux, we have a secure and efficient “kernel” and a kind of add-on application environment.  Since the OS is responsible for network and I/O connections, we’ve limited the scope and agility of virtualization by treating I/O as either “fixed” in the kernel or agile only as an extension of the applications—middleware.  Now all of that may be changing, and it could create a revolution within some of our other revolutions—especially SDN and NFV.

Some time ago, PLUMgrid developed what was essentially a vision for an I/O and network hypervisor, an “IO Visor” as they called it.  This product was designed to create a virtual I/O layer that higher-level software and middleware could then exploit to facilitate efficient use of virtualization and to simplify development in accommodating virtual resources.  What they’ve now done, working with the Linux Foundation, is to make IO Visor into an architecture for Linux kernel extension.  There’s an IO Visor Project and the Platinum members are (besides PLUMgrid) Cisco, Huawei, and Intel.

The IO Visor project is built on what’s called the “Berkeley Packet Filter” (BPF), an extension to Linux designed to do packet classification for monitoring.  BPF fits between the traditional network socket and the network connection, and was extended in 2013 to allow an in-Kernel module to handle any sort of I/O.  You can link the extended BPF (eBPF) at multiple layers in the I/O stack, making it a very effective tool in creating or virtualizing services.  It works for vanilla Linux but probably most people will value it for its ability to enhance virtualization, where it applies to both hypervisor (VM) and container environments.

The technical foundation for IO Visor is an “engine” that provides generalized services to a set of plugins.  The engine and plugins fit into the Kernel in one sense, and “below” it, just above the hardware, in another.  Unlike normal Kernel functions that require rebuilding the OS and reloading everything to change a function, these IO Visor plugins can be loaded and unloaded dynamically.  Applications written for IO Visor have to obey special rules (as all plugins do) but it’s not rocket science to build there.
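The engine-plus-plugins idea can be sketched in a few lines of Python. This is only a user-space illustration of dynamic load and unload, assuming a toy packet represented as a dictionary; the class and function names are hypothetical and it is not the IO Visor or eBPF API, which runs in the kernel.

```python
from typing import Callable, Dict, Optional

Packet = dict
# A plugin takes a packet and returns it (possibly modified), or None to drop it.
Plugin = Callable[[Packet], Optional[Packet]]


class Engine:
    """Hypothetical engine that runs packets through whatever plugins are loaded."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Plugin] = {}

    def load(self, name: str, plugin: Plugin) -> None:
        # Plugins are added at runtime; nothing is rebuilt or restarted.
        self._plugins[name] = plugin

    def unload(self, name: str) -> None:
        self._plugins.pop(name, None)

    def process(self, packet: Packet) -> Optional[Packet]:
        # Plugins run in the order they were loaded.
        for plugin in list(self._plugins.values()):
            packet = plugin(packet)
            if packet is None:
                return None  # a plugin dropped the packet
        return packet


seen_ports = []


def monitor(pkt: Packet) -> Packet:
    """Monitoring plugin: record the destination port and pass the packet on."""
    seen_ports.append(pkt["dst_port"])
    return pkt


def drop_telnet(pkt: Packet) -> Optional[Packet]:
    """Security plugin: drop anything addressed to port 23."""
    return None if pkt["dst_port"] == 23 else pkt


engine = Engine()
engine.load("monitor", monitor)
engine.load("firewall", drop_telnet)
blocked = engine.process({"dst_port": 23})   # dropped by the firewall plugin

engine.unload("firewall")                    # removed without restarting the engine
allowed = engine.process({"dst_port": 23})   # now passes through
```

The point of the sketch is the last three lines: behavior changes because a plugin was unloaded while the engine kept running, which is the property that distinguishes this model from functions compiled into the kernel.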

What IO Visor creates is a kind of “underware” model, something that has some of the properties of middleware, some of user applications, and some of the OS (Kernel) itself.  You can put things into “underware” and create or modify services at the higher layer.  The monitoring example that was the basis for BPF in the first place has been implemented as an IO Visor case study, for example.

What’s profound about IO Visor is the fact that it can be used to create an “underservice” that’s got components distributed through the whole range of Linux OS deployments for something like SDN or NFV.  An obvious example is a virtual switch or router “service” distributed among all of the various hosts and a functional part of the Kernel.  You could create a security service as well, in various ways, and there’s an example of that on the IO Visor link I referenced above.

Some of the advantages of this approach in a general sense—performance, security, and agility—are easily seen from the basic architecture.  If you dig a bit you can find other benefits, and it’s in these that the impact on SDN and NFV is most likely to emerge.

Signaling and management in both SDN and NFV are absolutely critical, and you can see that by applying IO Visor and plugins to a signaling/management service, you could create a virtual out-of-band connection service accessible under very specific (secure, auditable, governable) terms by higher-layer functions.  This could go a long way toward securing the critical internal exchanges of both technologies, the compromise of which could create a complete security/governance disaster.

Another feature is the distribution of basic functions like DNS, DHCP, and load balancing.  You could make these services part of the secure Kernel and give applications a basic port through which they could be accessed.  Such a port, like that of my hypothetical signaling/management network above, would be limited in functionality and thus virtually impossible to hack.

If you’re going to do packet monitoring in a virtual world, you need virtual probes, and there’s already an example of how to enlist IO Visor to create this sort of thing as a per-OS service, distributed to all nodes where you deploy virtual functions or virtual switch/routers.  Management/monitoring as a service can be a reality with this model.

NFV in particular could benefit from this approach, but here’s where “impact” could mean more than just reaping more benefits.  You can load IO Visor plugins dynamically, which means that you could load them into a Kernel as needed.  That could mean that NFV deployment orchestration and management would need to contend with “underware” conditioning as well as simply loading VNFs, and it would certainly mean that you’d want to consider the inventory of IO Visor features that a given set of VNFs might want, and decide which you’d elect to bind persistently into the kernel and which you’d make dynamic.

This raises another obvious point.  One of the big benefits of the IO Visor approach is its support for the creation of distributable kernel-based services.  If that’s what you’re aiming for, you can’t just have people doing random IO Visor plugins and hoping they come together.  You need to frame the service first and then implement it via plugins.  I’ve blogged about the notion in the past, and it’s part of my ExperiaSphere model; I call it “infrastructure services”.  Since you don’t need to deploy something that’s part of the kernel (once you’ve put it there), you need to conceptualize how you use a “resident” element like that as part of a virtual function implementation.

This probably sounds pretty geeky, and it is.  The membership in the project is much more limited than that of the ONF or the ETSI NFV ISG.  There are three members who should make everyone sit up, though.  Intel obviously has a lot of interest in making servers into universal fountains of functionality, and they’re in this.  Cisco, ever the hedger of bets in the network/IT evolution, is also a member of the IO Visor Project.  But the name that should have SDN and NFV vendors quaking is Huawei.  While they’re not a big SDN/NFV name in a PR sense, they’ve been working hard to make themselves into a functional leader, not just a price leader.

And IO Visor might just be the way to do that.  I think IO Visor is disruptive, revolutionary.  I think it brings literally unparalleled agility to the Linux kernel, taking classic OSs forward into a dynamic age.  It opens entirely new models for distributed network services, for NFV, for SDN, for control and management plane interactions.  It could even become a framework for making Linux into the first OS that’s truly virtualized, the optimum platform for cloud computing and NFV.  You probably won’t see much about this in the media, and what you see probably won’t do it justice.  Do yourself a favor, especially if you’re on the leading edge of SDN, NFV, or the cloud.  Look into this.

How Operators are Preparing NFV Plans for their Fall Pre-Budget Review

The consensus among network operators who provide either wireline or wireless broadband is that their revenue-per-bit and cost-per-bit curves will cross over by mid-2017.  Given the time it takes to make any significant changes in service offerings, operations practices, or capital infrastructure programs, something remedial would have to begin next year to be effective.
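The crossover itself is simple arithmetic. As a rough sketch, assume both revenue per bit and cost per bit fall by a constant percentage each year; the crossover time then follows from equating the two curves. The starting values and decline rates below are hypothetical and are not operator data.

```python
import math


def crossover_years(rev0, cost0, rev_decline, cost_decline):
    """Years until revenue per bit falls to cost per bit.

    Assumes constant annual fractional declines (an illustrative model):
        rev(t)  = rev0  * (1 - rev_decline)  ** t
        cost(t) = cost0 * (1 - cost_decline) ** t
    """
    if rev0 <= cost0:
        return 0.0          # already crossed
    if rev_decline <= cost_decline:
        return math.inf     # revenue never falls faster than cost, so no crossover
    return math.log(rev0 / cost0) / math.log((1 - cost_decline) / (1 - rev_decline))


# Hypothetical inputs: revenue per bit falling 25% a year, cost per bit 15% a year.
t = crossover_years(rev0=1.0, cost0=0.8, rev_decline=0.25, cost_decline=0.15)
print(f"crossover in about {t:.1f} years")
```

Under these assumed rates the lines meet in a little under two years, which shows why a remedy that takes a year to plan has to start well before the crossover date.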

In mid-September of each year, operators embark on a technology planning cycle.  It’s not a universal process, nor is it always formalized, but it’s widespread and usually completed by mid-November.  The goal is to figure out what technology decisions will drive capital programs for the following year.  That which wins in this cycle has a good chance of getting into field trial and even deployment in 2016.

It’s not surprising that operators are taking stock now, in preparation for the work to come.  It’s not surprising that NFV is a big question to be addressed, and that NFV’s potential to improve profits by widening the revenue/cost-per-bit gap is perhaps the largest technology question-mark.

My opening qualifier “who provide either wireline or wireless broadband” is important here.  More specialized operators like the managed service providers (MSPs), cloud providers (CSPs), or those who offer multi-channel broadcast video are in a bit better shape.  Interestingly, one of the most obvious success points for NFV is the MSP space, so let’s start looking at NFV’s potential with its successes.

An MSP adds value to connection services by introducing a strong feature-management model.  Most connection services are consumed within the context of private WAN deployment of some sort, and there’s more to a private WAN than a few pipes.  Over the last two decades, the cost of acquiring and sustaining the skills needed for that ancillary technology, and the cost of managing the private WAN overall, has grown as a component of TCO.  Today businesses spend almost twice as much supporting their network as buying equipment for it.  MSPs target that trend.

NFV does too, or at least the “service chaining” or “virtual CPE” portion does.  Connection services are built into private WANs by adding inline technologies like firewalls, application accelerators, encryption, and so forth, and by adding hosted elements like DNS and DHCP.  The MSP model of NFV vCPE is to supply those capabilities by hosting them on an agile edge device.  That means that you deploy a superbox with each customer and then mine additional revenue potential by filling it with features that you load from a central repository.  It’s a good, even great, model.

This same model can be adopted by any big operator, including all the broadband ISPs, and in theory it could be applied to every customer.  There are issues with that theory, though—particularly for NFV’s broad acceptance:

  • vCPE delivers the most value where the cost of actual devices is high and their deployment is dynamic. If the boxes are cheap and if users tend to get all the same features all at once, and then never change, it doesn’t do much good.  Consumers and small businesses don’t fit the vCPE success model.
  • While NFV can be used to deploy functions into CPE, that mission dodges most of the broader NFV value propositions. Managing that vCPE model isn’t much different from managing real boxes.  You don’t need clouds or really even need function-to-function connectivity to make it work.  There’s no economy of scale to consider.
  • vCPE has encouraged VNF providers to consider what operators overall say is an unrealistic revenue model—the pay-as-you-go. MSPs like this approach because it lets them match expenses to revenue; they don’t have to deploy much shared infrastructure with a CPE-hosted VNF model so the VNF licenses would be much of their first cost.  Other operators don’t like that model at all because it exposes them to what they believe to be higher long-term costs.
  • The applications of vCPE that do work are a very small part of the revenue/cost-per-bit problem, and so even if you revolutionize these services for the appropriate customers, you don’t move the ball on profit trends very much.

What does move the ball?  The other most successful NFV application to date is mobile infrastructure.  Operators are already dependent on mobile services for profits, ARPU, and customer growth.  There’s more change taking place within the mobile network than anywhere else, and it’s easier to drive new technology into a network when you’re investing in it anyway.

Virtual mobile infrastructure involves virtualizing IMS (the core registration and service control technology), EPC (the metro traffic management and mobility management piece), and of course the radio access network.  We’ve seen announcements in all of these areas, from players like Alcatel-Lucent (vIMS), Ericsson (vEPC), and ASOCS (vRAN, in partnership with Intel).

There’s a lot of money, a lot of potential, in virtualizing mobile infrastructure.  The problem from an NFV perspective is that mobile services are multi-tenant, which means that you generally deploy them and then keep them running forever.  Yes, you need operational support for high availability and performance, but you are really building a cloud application in the mobile infrastructure space and not an NFV application.

Despite the lack of dynamism in virtual mobile infrastructure (vMI), the larger operators tend to accept it as the priority path to NFV.  That’s because vMI is large in scale, both in geographic and technology terms.  It touches enough that if you can make it work, you can infer a lot of other things will also work.  And because operationalization is a big part of a vMI story, that could lead to broad operations transformation.  Operators believe in that.

Here’s what operators say they are facing when they enter their fall planning cycle.  We have proved that NFV works, in that we have proved that you can deploy and connect virtual functions to build pieces of services.  We have proved that NFV can be useful in vCPE and vMI, but we haven’t proved it’s necessary for either one.  But carriers have invested millions in NFV, and it’s a major focus of standards-writers and technologists.  There is a lot of good stuff there, sound technology and the potential for an arresting business case.  We just don’t know what that business case is yet.

The plethora of PoCs and trials isn’t comforting to CFOs because it raises the risk of having a plethora of implementations, the classic silo problem.  We have no universal model of NFV, or of any new and different future network.  It’s a risk to build toward the goal of a new infrastructure through diverse specialized service projects when you don’t know whether these projects will add up to a cohesive future-network vision.  It’s particularly risky when we don’t have any firm specifications to help realize service agility or operations efficiency benefits—when those are the benefits operators think are most credible.

What operators are even now asking is whether they can start investing in any aspect of NFV with the assurance that their investment will be protected if NFV does succeed in broadening its scope.  Will we have “NFV” in the future or a bunch of NFV silos, each representing a service that works in isolation but can’t socialize?  This is the question I think is already dominating CFO/CEO planning, where one called it “suffering the death of a thousand benefits”.  It will shortly come to dominate the NFV proofs, tests, and trials because it’s the question that the fall technology planning cycle has to answer if the 2016 budgets are to cover expanded NFV activity.

I believe this question can be answered, and actually answered fairly easily and in the affirmative.  There are examples of effective NFV models broad enough to cover the required infrastructure and service critical masses.  There are examples of integration techniques that would address how to harmonize diverse services and even diverse NFV choices.  We don’t need to invent much here.  I believe that a full, responsive-to-business-needs, NFV infrastructure could be proved out in six months or less.  All we have to do is collect the pieces and organize them into the right framework.  Probably a dozen different vendors could take the lead in this.  The question for this fall, I hope, won’t be “will any?” but “which one?”

Can We “Open” NFV or Test Its Interoperability? We May Find Out.

I suspect that almost everyone involved in NFV would agree that it’s a work in progress.  Operators I’ve talked with through the entire NFV cycle—from the Call for Action white paper in the fall of 2012 to today—exhibit a mixture of hope and frustration.  The top question these operators ask today is how the NFV business case can be made, but the top technical question they have is about interoperability.

Interoperability has come to the fore again this week because of a Light Reading invitation to vendors to submit their NFV products to the EANTC lab for testing, and because a startup that promises an open-source and interoperable NFV came out of stealth.  I say “come to the fore again” because it’s been an issue for operators from the first.

Everyone wants interoperability, in no small part because it is seen as a means of preventing vendor lock-in.  NFV is a combination of three functional elements—virtual network functions, NFV infrastructure, and management and orchestration—and there’s long been a fear among operators that vendors would field their own proprietary trio and create “NFV silos” that would impose different management requirements, demand different infrastructure, and even support only specific VNFs.

That risk can’t be dismissed either.  The ETSI NFV ISG hasn’t defined its interfaces and data models in sufficient detail (in my view, and in the view of many operators) to allow unambiguous harmony in implementation.  We do have trials underway that integrate vendor offerings, but operators tell me that the integration mechanisms aren’t common across the trials, and so won’t assure interoperability in the broad community of NFV players.  What’s needed to create it is best understood by addressing those three NFV functional elements one at a time.

VNFs are the foundation of NFV because if you don’t have them you have no functions to host and no way to generate benefits.  A VNF is essentially a cloud application that’s written to be deployed and managed in some way and to expose some set of external interfaces.  There are two essential VNF properties to define for interoperability.

A real device typically has some addresses that represent its data, control, and management interfaces.  These interfaces speak the specific language of the device, and so to make them work we have to “connect” to them with a partner that understands that language.  We have to match protocols in the data path, and we have to support control and management features through those interfaces.  Rather than define a specific standard for the management side, NFV has presumed that a “VNF Manager” would be bound with the VNFs to control their lifecycle.  VNFMs know how to set up, parameterize, and scale VNFs.

One thing this means is that VNFMs are kind of caught between two worlds—they are on one hand a part of a VNF and on the other hand a part of the management process.  If you look at the implementations of NFV today, most VNFMs have rather specific interfaces to the management systems and resource management tools of the vendors who create the NFV platform.  That’s not unexpected, but it means that it’s going to be difficult to make a VNF portable unless the VNFM is portable, and that’s difficult if it’s hooked to specific vendor tools.

The other hidden truth here is that if a VNFM drives lifecycle management, then the VNFM knows the rules for things like scaling, event-handling for faults, and so forth.  It also obviously has to know the service model, meaning the way that all the components in a VNF are connected and what has to be redone if a given component is scaled out or scaled in.  If this knowledge exists “inside” the VNFM then the VNFM is the only thing that knows what the configuration of a service is, which means that if you can’t port the VNFM you can’t port anything.

The second critical interoperability issue is the NFV Infrastructure piece.  You’d want to be able to host VNFs on the best available resource, both in terms of resource capacity planning (cheap commodity servers or ones with special data-plane features) and in terms of picking a specific hosting point to optimize performance and cost during deployment.  Infrastructure has to be abstracted to make this work, so that you give a command to abstract-hosting-point and it hosts depending on all your deployment policies and the current state of resources.

It’s clear who does this: the Virtual Infrastructure Manager.  It’s not really clear how it works.  For example, if there are parameters to guide where a VNF is to be put (and there are), you’d either have to be able to pass these to the lower-level cloud management API (OpenStack Nova for example) to guide its process, or you’d have to apply your decision policies to infrastructure within the VIM (or higher up) and then tell the CMS API specifically where you wanted something put.  The first option is problematic because cloud deployment tools today don’t support the full range of NFV options, and the second is problematic because there’s no indication that resource topology and state information is ever published “upward” from the NFV Infrastructure to or through the VIM.
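To make the second option concrete, here is a minimal Python sketch of a VIM that applies placement policy itself and then names a specific host.  Everything in it is my own assumption for illustration: the HostingPoint and PlacementRequest types, the choose_host function, and the “cheapest qualifying host” policy are hypothetical and come from neither the ETSI specifications nor OpenStack.  It also presumes exactly what’s missing today, which is that the VIM can see per-host capacity and features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostingPoint:
    """Hypothetical view of one host, as published upward to a VIM."""
    name: str
    free_vcpus: int
    has_dataplane_accel: bool
    cost_per_vcpu: float


@dataclass(frozen=True)
class PlacementRequest:
    """Hypothetical deployment parameters that accompany a VNF."""
    vcpus: int
    needs_dataplane_accel: bool = False


def choose_host(request, hosts):
    """Apply placement policy inside the VIM and return one specific host.

    Returns None when no host satisfies the request.
    """
    candidates = [
        h for h in hosts
        if h.free_vcpus >= request.vcpus
        and (h.has_dataplane_accel or not request.needs_dataplane_accel)
    ]
    # Assumed policy: the cheapest host that qualifies wins.
    return min(candidates, key=lambda h: h.cost_per_vcpu, default=None)


hosts = [
    HostingPoint("commodity-1", free_vcpus=16, has_dataplane_accel=False, cost_per_vcpu=1.0),
    HostingPoint("accel-1", free_vcpus=8, has_dataplane_accel=True, cost_per_vcpu=2.5),
]
firewall_host = choose_host(PlacementRequest(vcpus=4), hosts)
router_host = choose_host(PlacementRequest(vcpus=4, needs_dataplane_accel=True), hosts)
```

The point of the sketch is the dependency, not the policy: choose_host can only work if the list of hosts and their state is available at the VIM level, and that’s the information flow the specifications don’t describe.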

If you read through the NFV specifications looking for the detail on these points, through the jaded eyes of a developer, you’ll not find everything you need.  Thus, you can’t test for interoperability without making assumptions or leaving out issues.  Light Reading can, and I’d hope they might, identify the specific things that we don’t have and need to have, but it’s not going to be able to apply a firm standard of interoperability that’s meaningful.

How about the startup?  The company is called “RIFT.io” and its product is “RIFT.ware”.  The details of what RIFT.ware includes and what it does are a bit vague (not surprising since the company just came out of stealth), but the intriguing quote from their website is “VNFs built with RIFT.ware feature the economics and scale of hyperscale data centers and the security and availability of Telco-grade network services.”  Note my italics here.  The implication is that RIFT.ware is a kind of VNFPaaS framework, something that offers a developer of VNFs a toolkit that would, when used, exercise all of the necessary deployment and lifecycle management features of NFV in a standard way.

I think that a PaaS for NFV is a great notion, and I’ve said that in blogs before.  However, it’s obvious there are some questions here.  How would RIFT.io induce VNF vendors, particularly the big NFV players who also have infrastructure and MANO, to use their system?  Since there are no definitive specifications from ETSI that you could force compliance with, could the big guys simply thumb their noses?  Another question is whether RIFT.ware could answer the questions about “interoperability” within its own framework.  And if we don’t get conformance on RIFT.ware across the board, it becomes another silo to be validated.

The final question here, which applies to both LR and RIFT.io, is that of the business case.  Interoperability doesn’t guarantee utility.  If NFV doesn’t reach far enough into legacy service infrastructure to operationalize things end to end, and if it doesn’t integrate with current OSS/BSS/NMS processes in a highly efficient way, then it doesn’t move the ball in terms of service agility and operations efficiency.  The ETSI spec doesn’t address legacy management and doesn’t address operations integration, at least not yet.

I’m hopeful that both these activities will produce something useful, but I think that for utility to emerge from either, we’ll have to address omissions in the NFV concept as it’s currently specified.  I hope that both these activities will identify those omissions by trying to do something useful and running into them, because that sort of demo may be the only way we get them out in the open and dealt with.

Cisco’s Message to SDN and NFV Proponents: Get Moving or Get Buried

Cisco beat estimates in its quarter, coming in about where I’d suggested it might overall.  The Street is happy with the results, which they should be, and the question now is how the details of Cisco’s performance might signal us toward a view of the next year.

I think one key statement from the call, from now-CEO Robbins, was “When I think about our strategy, I look at the huge market opportunity that exists as businesses and governments use technology to drive their growth and operational efficiency.”  Productivity is the measure of operational efficiency for enterprises, and was the driver of all the past IT spending booms.  Operations efficiency is the most credible benefit target for network operators.  So the question is whether Cisco is now recognizing this, or whether it was just a catchy turn of the phrase inserted by a speech-writer.

Overall revenue growth was 4%, but switching and routing both under-performed versus the average and data center (UCS) was the big success with 14% growth.  This suggests, as I noted yesterday, that neither the network operators nor the enterprises were confidently re-investing in the current (Ethernet/IP) model of networking, but also that they were not seeing enough of a credible alternative emerging to dampen normal refresh.  There may be some extending of product lifecycles, but it’s not excessive.

The service provider segment did return to growth for Cisco, which again isn’t any real surprise.  Had SDN and NFV adoption been where you’d think they are based on media coverage, we would have seen a distinct slip in Cisco’s numbers given that they are hardly a leader in either space.  There was no such slip, which proves that we’re not seeing any impact of SDN or NFV on infrastructure spending by operators.

Robbins said that Cisco “did really well in the core” regarding routing sales.  That suggests to me that the place where Cisco is strongest in the network operator segment is one of the places operators are trying to re-architect the most.  The replacement of large routers with agile optics or SDN-groomed tunnels is a major priority.  They aren’t seeing any hurt from this shift, so clearly it’s not yet happening.

NFV was mentioned on this call by Cisco for what I think was the first time.  Their comment: “But I do think that as we look at where the service providers are going, what they want to do with virtual managed services, how we’re aligned now around the deployment of NFV and how we move forward with that, we think that we’re well positioned with routing as we look ahead.”  Given that NFV really doesn’t have much to do with routing specifically, and that virtual managed services (service chaining) are a fairly limited NFV application, I think it’s fair to say that Cisco has no revolutionary NFV intent.

Enterprise/commercial carried the day in terms of segment performance, and routing slightly out-gained switching.  I think that suggests that my comments regarding the pace of SDN in the enterprise (in my blog on Arista) were correct.  Enterprises are more likely to be tied to their incumbent vendor in complex technology like routing, and so Cisco picked up more there.  Switching is more competitive for the simple reason that there are fewer features.

One very significant point here is that Cisco is retaining a strong data center account control lead, based on my own interactions with enterprises.  For the last eight years, data center networking needs and policies have driven network equipment and architecture decisions, and that drive has been decisive in the last four years.  The vendor who used to have absolute control over the data center was IBM, but Cisco is now in the top spot.

Cisco, in fact, listed achievements in the cloud data center and in software as its two primary goals.  Given that it’s already dominating data center strategic influence at least in the US and Europe, it seems reasonable that it would be able to achieve at least that first goal.  However, Cisco’s cloud aspirations seem more tactical than strategic.  They don’t talk about differentiating themselves with cloud technology, but about exploiting the opportunity for data center equipment that the cloud creates.  Surprisingly, they ignore the biggest potential source of new data centers, which is NFV.

Not everything is good news.  If I were Cisco I’d be concerned about the fact that its growth came completely from the Americas, meaning the US and Canada.  The reason this is important is that Cisco’s account control is greatest in these markets, and competition (particularly from Huawei in the network operator segment) is lower or absent.  Europe, of course, is mired in secular economic issues and its underperformance has hurt Cisco’s numbers because Cisco enjoys account control there almost as much as here.

Software looks to be a bit knotty.  If you read through the call transcript the vision you get is one of a Cisco who sees software purely as a means of transitioning to a subscription/recurring revenue model in areas like security.  There is no sense that software technology is now driving the bus completely.  This isn’t a problem limited to Cisco, though.  Arch-rivals Alcatel-Lucent and Juniper are also locked in the box, so to speak, and are having difficulties coming to terms with a pure software future where hardware is just what you run software on.

So what can we say about the call overall?  First, while Cisco may have opened with that operations efficiency comment, I think it was an accidental fluffery and not a signal that Cisco recognizes the importance of driving IT and network spending growth by harnessing productivity and operations efficiency as benefits.  There were no comments made later on to tie to the efficiency theme, and Cisco likes to leave a trail of bread crumbs when they want us to follow.

Second, Cisco is telling us that neither SDN nor NFV is impacting sales at all.  That’s not as much a fault of the technological merits of either as of the vacuous positioning of offerings and the insipid dealing with key value propositions.  It’s not smart for Cisco to defend against SDN or NFV when nobody is attacking effectively with either technology.  In fact, Cisco paints a picture of real risk of failure for both technologies.  If nothing is done in the next two years to radically change buyer thinking, the reinvestment in legacy technology will have nailed a lot of buyers to the ground for three or four more years, and by then there’d be little chance either SDN or NFV would develop any real momentum.

Cisco is doing well, in no small part because it’s winning the old game and too smart to support a new one.  Those who want to see Cisco drop market share or want to displace Cisco at the top of the heap will have to sing and dance a lot better, and start that process very quickly.

 

As Requested: Building and Selling Intent-Modeled Services

I did a blog early this week on the foundation for an agile service-model approach.  Some of my operator friends were particularly interested in this topic, more than I thought frankly.  Most of it was centered on how this sort of model would be applied during the routine processes of building and selling services.  If service agility is the goal, then operators want specific knowledge of how a given strategy would impact the “think-it-to-sell-it” cycle.  So let’s set the stage, then dive in.

What I’m suggesting is that we define services in a series of hierarchical modeling steps, with each step based on an intent model that offers a from-the-consumer-side abstraction of the service/feature being defined.  For network services and features, we could assume the model had a standard structure, which defined FUNCTIONAL-PROPERTIES for the model, and INTERFACES that allow for connection of models or to users.  The FUNCTIONAL-PROPERTIES would always include an SLA to describe performance and availability.

As I noted in the earlier blog, you can build basic connection services with three broad models—the point-to-point (LINE), multipoint (LAN), and multicast (TREE).  A bit of deeper thinking would show that there could be two kinds of TREEs, the M-TREE that’s true multicast and the L-TREE that’s a load-balance point.  You could divide other connection service models too, if needed.

For more complicated features, I proposed two additional models, the IN-LINE model for a service that sits across a data path (firewall is the classic example) and the ON-LINE model for a service that attaches like it’s an endpoint, as DNS would.

INTERFACEs are the glue for all of this.  Depending on the functional model of the service we might have a single class of interface (“ENDPOINT” on a LAN) or we might have multiple classes (“SENDER” and “RECEIVER” on multicast TREEs).  Connection services connect endpoints so you’d have one for every endpoint you wanted to connect, and you might also have Network-to-Network Interface (NNI) points to connect subsets of a service that was divided by implementation or administrative boundaries.
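Here is a minimal Python sketch of how such a model might be represented.  The class names (SLA, Interface, IntentModel), the two SLA fields, and the table of which interface roles are legal on which model are my own assumptions for illustration; only the model names and the ENDPOINT, SENDER, RECEIVER, and NNI roles come from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SLA:
    """Assumed minimal SLA: one performance term, one availability term."""
    max_latency_ms: float
    availability_pct: float


@dataclass(frozen=True)
class Interface:
    name: str
    role: str  # "ENDPOINT", "SENDER", "RECEIVER", or "NNI"


# Which interface roles each connection model accepts (assumed mapping).
ALLOWED_ROLES = {
    "LINE": {"ENDPOINT", "NNI"},
    "LAN": {"ENDPOINT", "NNI"},
    "M-TREE": {"SENDER", "RECEIVER", "NNI"},
}


@dataclass
class IntentModel:
    """FUNCTIONAL-PROPERTIES (model type plus SLA) and a set of INTERFACEs."""
    name: str
    model_type: str
    sla: SLA
    interfaces: list = field(default_factory=list)

    def add_interface(self, name, role):
        """Attach an interface, rejecting roles the model type doesn't define."""
        if role not in ALLOWED_ROLES[self.model_type]:
            raise ValueError(f"{role} is not valid on a {self.model_type}")
        iface = Interface(name, role)
        self.interfaces.append(iface)
        return iface


vpn = IntentModel("LAN-IP-VPN", "LAN", SLA(max_latency_ms=50.0, availability_pct=99.9))
for site in ("nyc", "chi", "dal"):
    vpn.add_interface(site, "ENDPOINT")
```

Nothing in the object says how the LAN is implemented, which is the whole idea.  A consumer sees a type, an SLA, and the points where it can connect.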

Given this setup, we can relate the process of selling or building a service to the specific model elements.  My assumption is that we have a service architect who is responsible for doing the basic building, and a service agent (the customers themselves, via a portal, or a customer service rep) who would fill in an order for a purchase.

If we started with an order for, as an example, an IP VPN, we would expect the order to identify the sites to be supported and the SLA needed.  This would populate a query into a product catalog that would extract the options that would meet the criteria.  In our example, it might suggest an “Internet tunnel” VPN using IPsec or something like it, or a “provisioned” VPN.  It might also suggest a VLAN with hosted routing.  All of the options that fit the criteria could be shown to the ordering party (user or customer service rep) for examination, or they could be priced out first based on the criteria.
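As a rough illustration, the catalog query could be as simple as a filter over entries that advertise their limits.  In this Python sketch the CatalogEntry fields, the three entries, and every number in them are invented for the example; a real catalog would carry far more, pricing included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record: what the offering can promise."""
    name: str
    max_sites: int
    availability_pct: float
    max_latency_ms: float


# Illustrative entries only; the limits are made up.
CATALOG = [
    CatalogEntry("internet-tunnel-vpn", max_sites=50, availability_pct=99.0, max_latency_ms=120.0),
    CatalogEntry("provisioned-vpn", max_sites=500, availability_pct=99.99, max_latency_ms=40.0),
    CatalogEntry("vlan-hosted-routing", max_sites=10, availability_pct=99.9, max_latency_ms=30.0),
]


def query_catalog(sites, min_availability_pct, max_latency_ms, catalog=CATALOG):
    """Return the names of catalog options that meet the order's criteria."""
    return [
        entry.name for entry in catalog
        if len(sites) <= entry.max_sites
        and entry.availability_pct >= min_availability_pct
        and entry.max_latency_ms <= max_latency_ms
    ]


options = query_catalog(["nyc", "chi", "dal"], min_availability_pct=99.9, max_latency_ms=50.0)
```

With these made-up numbers the Internet tunnel drops out on availability, and the provisioned VPN and the VLAN with hosted routing are left for the ordering party to choose between.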

If we assume that the provisioned option was selected, the order process might query the customer on whether additional services (firewall, DNS, DHCP, encryption, application delivery control, or whatever) might be useful.  Let’s assume that a firewall option was selected for all sites.

The next step would be to decompose the order.  The “IP VPN” service would logically be made up of a series of access LINEs to LAN-IP-VPN INTERFACEs.  Because we have a need for a firewall, we’d create each LINE as an access segment, an IN-LINE firewall component, and a VPN connection segment.  Or, in theory, we might find one outlying site that doesn’t have a suitable place to host a cloud firewall, so it would get CPE that had local function hosting.
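A small Python sketch of that decomposition step follows.  The function name, the tuple representation, and the element naming are my own assumptions; the structure (a LAN object, per-site LINE segments, an IN-LINE firewall, and a CPE-hosted fallback for sites with no suitable hosting point) is what’s described above.

```python
def decompose_ip_vpn(sites, firewall=True, cloud_hosting_sites=frozenset()):
    """Expand an "IP VPN" order into a flat list of (model_type, name) elements.

    cloud_hosting_sites is the (assumed) set of sites that have a suitable
    place to host a cloud firewall; all other sites get CPE-hosted firewalls.
    """
    elements = [("LAN", "LAN-IP-VPN")]
    for site in sites:
        if not firewall:
            elements.append(("LINE", f"{site}-access"))
        elif site in cloud_hosting_sites:
            # The access LINE is split in two around a hosted firewall.
            elements.append(("LINE", f"{site}-access-segment"))
            elements.append(("IN-LINE", f"{site}-cloud-firewall"))
            elements.append(("LINE", f"{site}-vpn-segment"))
        else:
            # No hosting point nearby: the firewall runs on the CPE itself.
            elements.append(("IN-LINE", f"{site}-cpe-firewall"))
            elements.append(("LINE", f"{site}-access"))
    return elements


plan = decompose_ip_vpn(
    ["nyc", "chi", "outpost"],
    cloud_hosting_sites={"nyc", "chi"},
)
```

The outlying site ends up with a different realization of the same IN-LINE firewall intent, and nothing above the decomposition layer has to know that.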

If the VPN spanned several geographies, we might decompose our LAN-IP-VPN object into a series of NNI-connected objects, one for each area.  We’d then build per-area VPNs as above and connect them.

You can see that the other direction might work similarly.  Suppose a service architect is looking to build a new service called an INTERNET-EXTENDED-IP-VPN.  This service would combine a LAN-IP-VPN with an Internet tunnel to an on-ramp function.  The architect might start with the LAN-IP-VPN object, and add to it an ON-LINE object representing an Internet gateway VPN on-ramp.  The combined structure would then be re-cataloged for use.

Any given object could decompose in a variety of ways.  As I suggested, we might say that a customer who wants an “IP VPN” of ten sites or fewer within a single metro area would be offered a VLAN and virtual router combination instead.  If those conditions were not met, a different option would be selected.  Service architects would presumably be able to search the catalog for an object based on interface needs, functional needs, and SLA.  They could assemble the results into useful (meaning sellable) services.
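One way to picture that policy-driven choice is an ordered list of rules, with the first match selecting the realization.  This Python sketch is only that, a picture: the order fields, the realization names, and the choice of a provisioned VPN as the fallback are assumptions I’ve made for the example.

```python
# Ordered (condition, realization) pairs; the first match wins.
# Orders are plain dicts with assumed keys "sites" (a count) and "metros" (a set).
POLICIES = [
    (lambda order: order["sites"] <= 10 and len(order["metros"]) == 1,
     "vlan-plus-virtual-router"),
    (lambda order: True, "provisioned-ip-vpn"),  # assumed catch-all
]


def select_realization(order, policies=POLICIES):
    """Pick how an "IP VPN" intent is decomposed for this particular order."""
    for matches, realization in policies:
        if matches(order):
            return realization
    raise LookupError("no policy matched the order")


small_metro = select_realization({"sites": 6, "metros": {"boston"}})
multi_metro = select_realization({"sites": 6, "metros": {"boston", "nyc"}})
```

Changing what gets offered then becomes a matter of editing the policy list rather than touching the service definition.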

The initial objects, the bottom of the hierarchy, would transform network or software behaviors into intent models for further composition into services.  NFV VNFs are examples of bottom-level intent models, and so are the physical devices whose functions they replicate.  You can create features by composing the behaviors that are needed into a model, then put it into a catalog for use.  The FUNCTIONAL-PROPERTIES SLA can include the pricing policies so the service would self-price based on the features it contains, the number of interfaces, the specific performance/availability terms, etc.
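Self-pricing is easy to sketch if each object carries its own pricing terms and sums its children.  In the Python below, the PricedModel class, its fields, and all the prices are hypothetical; a real pricing policy would also account for the performance and availability terms.

```python
from dataclasses import dataclass, field


@dataclass
class PricedModel:
    """Hypothetical intent model whose SLA section carries pricing terms."""
    name: str
    base_price: float = 0.0
    price_per_interface: float = 0.0
    interface_count: int = 0
    children: list = field(default_factory=list)

    def price(self):
        """Own charges plus the price of every feature composed into this one."""
        own = self.base_price + self.price_per_interface * self.interface_count
        return own + sum(child.price() for child in self.children)


# Illustrative numbers only.
firewalls = [PricedModel(f"firewall-{n}", base_price=20.0) for n in range(3)]
vpn = PricedModel(
    "LAN-IP-VPN",
    base_price=100.0,
    price_per_interface=15.0,
    interface_count=3,
    children=firewalls,
)
monthly = vpn.price()
```

Here the VPN contributes 100 plus 3 interfaces at 15, and the three firewalls add 20 each, so the composed service prices itself at 205.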

I don’t mean to suggest that every service can be built this way, or that I’ve covered every functional, topological, and interface option.  I just want to demonstrate that we can do service-building using object principles as long as we make our objects conform to intent modeling so we can hide the details of implementation from other higher layers.  Policies can then guide how we map “intent” at a given layer to a realization, and that realization might still be a set of “composite objects” that would again be modeled through intent and decomposed based on policies.

Operators tell me this is the sort of thing they want to be able to do, and that they want to understand how both composition of services and decomposition of orders would take place.  Interestingly they also tell me that they don’t get much if any of this sort of discussion from vendors, even those who actually have some or all of the capabilities I’ve presented in my example.

Very early in the NFV game, one of the Founding Fathers of NFV told me that to be meaningful to operators, any NFV solution had to be framed in the context of that think-it-to-sell-it service lifecycle.  That lifecycle is where the benefits lie, particularly the benefit of “service agility”.  We have solutions in this space that work, but few of them are vertically integrated to the point where they can address evolving SDN/NFV and legacy components and can build current and emerging service models.  None, apparently, are well described to the buyers.  The fact that there’s so much interest now in intent modeling and its role in service agility is a great sign for the industry—a sign we might finally be listening to those early words of wisdom.

What Should We Watch for In Cisco’s Earnings Call?

Wall Street will be watching Cisco on their earnings call this week.  I will too, and so should you all, but probably with a different set of goals and looking for signals only slightly related to the Street interest.  Cisco is an important player whose behavior will tell us a lot about the timing and extent of our SDN and NFV revolutions.

Cisco has three primary product lines that we should be interested in: routing, data center switching, and servers.  The routing products will offer us some sense of where service providers are in their overall network infrastructure plans, and of course Cisco will talk about that in their call.  I think the likely story here is that service provider spending is still a bit weak, not a disaster but nothing to jump for joy at.  That would indicate that operators are still pursuing recapitalization of infrastructure but not enthusiastically.

If the story is that service provider spending is off significantly, if Cisco says a lot about weak secular trends in the space, then it’s telling us that the revenue/cost-per-bit squeeze is already being felt.  That would mean operators will be looking for a different strategy to adopt in 2016, and in the meantime are putting pressure on spending and prices.

If spending is much stronger, then it tells me that operators do not believe that they will get any relief from new technologies like SDN and NFV in 2016.  They’ll have to either wait longer or try a different approach, and that would be very bad news for our revolutionary duo.  Waiting longer isn’t possible for many of the operators, and so they’d likely either start looking for price-leader suppliers (Huawei) or start thinking about how to build networks with less routing intelligence, at least from traditional devices versus software routers.

We may get some hints from what Cisco says on the switching side.  While most of their sales will be to enterprises rather than operators, it’s possible Cisco would say something about strong sales to operators for cloud data centers.  Such a story would indicate that operators are either (finally) getting serious about offering cloud computing services or are preparing themselves for NFV commitments.  I don’t think this is the case, and I don’t think Cisco will have much to say about switching success to operators overall.

As I said, switching will be the bellwether for enterprise network spending health, and here is where I think there’s a chance that Cisco will beat estimates.  Enterprises are doing better in a profit sense and they’ve also held back on capital improvements or modernization for IT infrastructure and networking.  Cisco has always been the master of account control for enterprises, and so they should do well, meaning enough to beat estimates overall by a couple pennies per share.

If Cisco does not report strong enterprise switching sales, then it would suggest that SDN in particular is starting to overhang buying.  I don’t think we’re anywhere near the time where SDN actually steals budget dollars, but if you’re a planner and you see a change coming down the line, you slow-roll your investment in the old long before you start spending on the new.  Enterprises would stretch the useful life of gear just a bit longer, and we’d see that in a dip in buying interest.

If Cisco does a lot better in enterprise switching, it would mean that enterprises were cheerfully ignoring any near-term impact of SDN on their network plans.  That would mean that serious white-box competition is not only nowhere to be seen, but not even being hinted.  The bigger the win Cisco posts here the smaller the chance that anything is going to upset the legacy switching apple cart.  This could happen; it’s the second-most-likely outcome after the slight-beat in switching I opened with.

On the server side, I think it’s likely that Cisco will also beat expectations on UCS sales, but not gain much in the way of profit since UCS margins are slim (and price pressure strong).  The big question is whether Cisco reports the pace of UCS growth is accelerating, which they’d likely do by ballyhooing the progress.

We’re seeing in UCS sales a combination of where Cisco wants them to be in targeting terms, and how well that segment set is accepting them.  It’s not that Cisco would hesitate to sell servers to a pure-batch application play, but that their sales types would not be likely to pursue that sort of customer in the first place.  If Cisco whoops and hollers about UCS sales growth, then it’s saying that network-centric issues are a growing driver of server sales.  It means that the cloud, controller functions, and so forth are increasingly important.

The contrasting position, which is that UCS doesn’t seem to be sparkling, would suggest that network-centric server applications aren’t making much headway.  That would be a bad sign for the cloud, for SDN, and for NFV.  It would also be a bad sign for telecom spending, particularly if the other product areas suggest that telecom is weak as well.

The UCS positioning potentially plays off Cisco’s cloud strategies, which include its InterCloud offering to the telcos.  The first question will be whether Cisco makes the cloud connection strongly or blows a few cloud kisses.  If Cisco is seeing that buyer traction for cloud-centric IT and networking is developing, it will tout its accomplishments in that space.  If it doesn’t it will hang back a bit so it’s not tarnished with what might be a dry brush.

A very strong cloud story on the earnings call would mean that Cisco thinks the cloud is going to be big for it, and that it might be moving from its traditional fast-follower to leader role in cloud positioning.  The other indicator we’d want to watch there is software.

Cisco has never been able to make software work for it, but it’s pretty hard to see a cloud-centric vision that lacks software, or a cloud leader that lacks software leadership.  If Cisco decides to make a serious run at the cloud, at being the next IBM, then it will have to make software a major focus.  They’re not likely to talk about their future plans on an earnings call, but if Cisco says much about software or if Cisco starts talking about cloud software differentiation working for them, we’ll know what’s really behind the yammering.  It will be a precursor to some software-centric moves.

Another general indicator to watch is what Cisco says about competition.  Chambers dismissed white-box competition, and I think that’s fair to do given the difficulty in driving an SDN-centric vision of switching in the current market.  If Cisco says that competition is driving down margins and sales by driving down prices, but that unit buying is still good, then they’re saying that buyers are reinvesting in the present network model and applying a risk or ROI premium to the deals.  If they admit that buyers are waiting for a new model, they’re saying that Cisco will be there with that new model down the line.

So there we are.  I’ll be watching what Cisco says on their call, and I’ll comment here on what I think it signifies for us all.

Digging Deeper into Building Agile Services

Composing services in an agile and market-responsive way is a critical requirement for the future of network operators.  That means it’s critical that technologies like SDN and NFV support it, and if proponents of those technologies want to play the agility card to justify their preferred revolution, then their technology has to support it better than alternatives.  One of our challenges is that it’s hard to say whether that could happen because we don’t seem to be able to draw a picture of what we expect.

I’ve been in software design and development for many decades, and I’ve seen what happened in the software industry as we popularized computing.  Most haven’t really thought about this, but the fact is that microprocessor revolutions alone couldn’t create PCs or tablets or smartphones; you needed a lot of software.  It’s software that gives the devices utility.

Services are in many ways like software in a consumption sense.  We used to sell bit-as-a-service to large enterprises, and the revolution of the Internet was that we defined services that could be consumed by people who weren’t network professionals.  Just like personal software revolutionized computing, personal services revolutionize networking.

One of the key things that happened in software that facilitated “appliance populism” was the concept of object-oriented or modular programming.  When I learned to program there were no libraries of classes or objects to build from.  You had to write code for everything, and that took a long time, expert resources, and a lot of errors along the way.  Worst of all, there simply weren’t enough programmers to produce the quantity of stuff that a populist market would want.

Today we have languages like Java whose class libraries contain enormous pools of functionality, and we follow a library-class model when we write our own code.  Most software today was designed to be reused, to be plugged in here and there to make one development task serve a lot of application missions.  The trend is toward higher-level languages that make things easier, and development increasingly leverages units of functionality developed as “utilities” for broad application.

So it must be with services, I believe.  We should be looking at the future of services the way a developer would look at an application.  I need a “class library” of generalized useful stuff, perhaps some specialty objects of my own, and a way to assemble this and make it work.  If I have that, I can build something functionally useful in less time than a programmer of my era would have spent getting their code sheets keypunched.

So where is this concept?  We do hear about service libraries, but we don’t hear much about the details, and the devil is in those details.  Any developer knows that a class library has documentation on the functions and interfaces available, so there are “rules” that let a developer know how to integrate a given object.  We should be asking about those kinds of rules for services too, and I don’t hear much at all.

Let me offer an example.  We could say that a connection service has three configurations—LINE, LAN, and TREE—that express endpoint relationships.  If we added a functional dimension we could describe two other “configurations”, what we could call in-line and on-line.  In-line configurations for functional services are configurations where the service sits on a data path and either filters or supplements what’s sent along the “line”.  On-line means that the service attaches as an endpoint itself.  Got it so far?

Given this, we could now see how service composition would work.  For example, a simple three-site VPN is three LINEs connected to a LAN (multipoint) operating at L3.  Suppose we wanted to add a firewall to each site.  We’d now break our LINEs into two segments each, and we introduce an “in-line” firewall service.  Simple.  If we want to add something else, we either add it by making it another “in-line” (encryption for example) or an “on-line” like DNS or DHCP.
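Since I’m arguing that services should be built the way software is, here is the example as a few lines of Python.  It’s a sketch under my own assumptions: a service is just a dictionary of paths, each path is a list of hops ending at the LAN, and the function names are invented for the illustration.

```python
LAN = "LAN-L3"  # the multipoint object every path terminates on


def build_vpn(sites):
    """A simple VPN: one LINE (a two-hop path) from each site to the LAN."""
    return {site: [site, LAN] for site in sites}


def insert_in_line(service, function):
    """Break each LINE in two and put an in-line function on the data path."""
    return {
        end: [path[0], f"{function}@{end}", *path[1:]]
        for end, path in service.items()
    }


def attach_on_line(service, function):
    """Attach an on-line function to the LAN as if it were another endpoint."""
    return {**service, function: [function, LAN]}


vpn = build_vpn(["a", "b", "c"])
secured = insert_in_line(vpn, "firewall")
full = attach_on_line(secured, "dns")
```

Each site’s path goes from one segment to two when the firewall is inserted, and DNS simply shows up as a fourth endpoint on the LAN.  I attach the on-line function after inserting the firewall so that the DNS path doesn’t get a firewall of its own.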

I’m not suggesting that these simple connection and service models are complete, but they’re complete enough to illustrate the fact that you can build services this way.  Maybe we need another model or two, but in the end everything would still obey a basic rule set.

An “in-line” has two ports to connect to and a service between.  I can connect in-lines to other in-lines or to LINEs.  That frames a simple set of rules that a service creation GUI could easily accept.  That means that a service architect could “build services” by assembling elements based on these concepts.
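Rules this simple are easy to check mechanically, which is what a service creation GUI would have to do behind the scenes.  The Python sketch below is my own illustration: it treats a design as a left-to-right chain of element kinds and reports violations of the two rules just stated (an in-line needs both ports connected, and it may only connect to other in-lines or to LINEs).

```python
# Element kinds an in-line function may be wired to, per the rule above.
IN_LINE_NEIGHBORS = {"IN-LINE", "LINE"}


def validate_chain(chain):
    """Return a list of rule violations for a left-to-right chain of kinds."""
    errors = []
    for i, kind in enumerate(chain):
        if kind != "IN-LINE":
            continue
        neighbors = [chain[j] for j in (i - 1, i + 1) if 0 <= j < len(chain)]
        if len(neighbors) < 2:
            errors.append(f"element {i}: in-line needs both ports connected")
        for other in neighbors:
            if other not in IN_LINE_NEIGHBORS:
                errors.append(f"element {i}: in-line cannot connect to {other}")
    return errors


ok = validate_chain(["LINE", "IN-LINE", "IN-LINE", "LINE"])
dangling = validate_chain(["LINE", "IN-LINE"])
bad_neighbor = validate_chain(["LAN", "IN-LINE", "LINE"])
```

A GUI would presumably run a check like this every time the architect drops an element on the canvas, and refuse connections that come back with errors.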

Obviously you need a bit more than topology to make this work.  An “interface” of any sort means an address space and protocol set, which in the modern world will usually mean either “Ethernet” at Level 2 or IP at Level 3.  You might refine either by specifying tunnel protocols and so forth.  Similarly you’d need to have some sort of SLA that provided basic QoS guarantees (or indicated that the service was best effort).  So what we need, in addition to our hypothetical five topological models, is an interface description and an SLA.  If we have all this stuff we can conceptualize what a service architect might really do, and what might really be done to support that role.

A “library” in this model is a collection of objects classified first by the topology and then by interface and SLA.  An architect who wanted to build a service would first frame the service as a collection of functions and then map functions to library objects, presuming a fairly comprehensive library.  If that assumption wasn’t valid, then the architect would likely explore the functions available and try to fit them to match service opportunities.
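A library lookup of this kind could be sketched as follows.  Everything here is hypothetical: the `ServiceObject` fields, the interface and SLA labels, and the four sample entries are placeholders I made up to show the classification order (topology first, then interface and SLA).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ServiceObject:
    name: str
    topology: str   # LINE, LAN, TREE, IN_LINE or ON_LINE
    interface: str  # assumed labels such as "ethernet-l2" or "ip-l3"
    sla: str        # assumed labels such as "assured" or "best-effort"


LIBRARY = [
    ServiceObject("access-line", "LINE", "ip-l3", "assured"),
    ServiceObject("vpn-core", "LAN", "ip-l3", "assured"),
    ServiceObject("firewall", "IN_LINE", "ip-l3", "assured"),
    ServiceObject("dns", "ON_LINE", "ip-l3", "best-effort"),
]


def find(library: List[ServiceObject], topology: str,
         interface: Optional[str] = None,
         sla: Optional[str] = None) -> List[ServiceObject]:
    """Filter by topology first, then narrow by interface and SLA if given."""
    matches = [o for o in library if o.topology == topology]
    if interface is not None:
        matches = [o for o in matches if o.interface == interface]
    if sla is not None:
        matches = [o for o in matches if o.sla == sla]
    return matches


def map_functions(library: List[ServiceObject],
                  needs: Dict[str, Tuple[str, str, str]]) -> Dict[str, Optional[str]]:
    """Map each required function to an object name, or None for a library gap."""
    plan = {}
    for label, (topology, interface, sla) in needs.items():
        hits = find(library, topology, interface, sla)
        plan[label] = hits[0].name if hits else None
    return plan


plan = map_functions(LIBRARY, {
    "site access": ("LINE", "ip-l3", "assured"),
    "filtering": ("IN_LINE", "ip-l3", "assured"),
    "encryption": ("IN_LINE", "ethernet-l2", "assured"),
})
```

A `None` in the plan is the case where the library isn't comprehensive, and the architect would have to fall back to browsing what's available.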

One obvious consequence of this approach is that it’s implementation-opaque.  The “objects” are truly intent models, with an abstract set of features that would be realized in any number of ways by committing any number of different combinations of infrastructure.  You could build a Level 3 VPN, for example, by using an overlay encryption approach (IPsec), an IP feature (MPLS, RFC2547), a set of virtual functions/devices, or SDN.  If all these implementation options produced the same interfaces, features/topologies, and SLAs, then they’d be equivalent.
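The equivalence test can be shown in a few lines of Python.  This is only a sketch under my own assumptions: `Intent` and `Realization` are hypothetical names, the labels are placeholders, and the fifth candidate is included purely to show a mismatch being rejected.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intent:
    topology: str
    interface: str
    sla: str


@dataclass(frozen=True)
class Realization:
    technology: str  # how it is built; invisible to the service architect
    exposes: Intent  # what it looks like from the outside


L3_VPN = Intent("LAN", "ip-l3", "assured")

candidates = [
    Realization("IPsec overlay", Intent("LAN", "ip-l3", "assured")),
    Realization("MPLS (RFC 2547)", Intent("LAN", "ip-l3", "assured")),
    Realization("virtual functions", Intent("LAN", "ip-l3", "assured")),
    Realization("SDN", Intent("LAN", "ip-l3", "assured")),
    # Deliberate mismatch: a Level 2 service does not satisfy a Level 3 intent.
    Realization("Ethernet VLAN", Intent("LAN", "ethernet-l2", "assured")),
]


def equivalent(intent: Intent, realizations: List[Realization]) -> List[str]:
    """Return the technologies whose external properties match the intent."""
    return [r.technology for r in realizations if r.exposes == intent]


matches = equivalent(L3_VPN, candidates)
```

Notice that the `technology` field never takes part in the comparison, which is what "implementation-opaque" means in practice.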

Another consequence is that management could be harmonized using the objects themselves.  A “service” as a collection of functional objects could be managed in the same way no matter how the objects were implemented, provided that we added a set of management variables to the SLA and expected everything that realized our function to populate those variables correctly.
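Here is a rough Python sketch of that harmonized view.  The variable names (`availability_pct`, `latency_ms`), the probe functions, and all the numbers are made up for illustration; the only idea taken from the text is that every implementation fills in the same variables and the service is judged on those alone.

```python
from typing import Callable, Dict

# Assumed management variables attached to every object's SLA.
REQUIRED = ("availability_pct", "latency_ms")


def legacy_vpn_probe() -> Dict[str, float]:
    # Stand-in for values collected from legacy devices (numbers are made up).
    return {"availability_pct": 99.99, "latency_ms": 12.0}


def virtual_firewall_probe() -> Dict[str, float]:
    # Stand-in for values collected from a hosted virtual function.
    return {"availability_pct": 99.95, "latency_ms": 18.0}


OBJECTS: Dict[str, Callable[[], Dict[str, float]]] = {
    "vpn-core": legacy_vpn_probe,
    "firewall": virtual_firewall_probe,
}


def object_meets_sla(probe: Callable[[], Dict[str, float]],
                     sla: Dict[str, float]) -> bool:
    """Check one object, whatever realizes it, against the shared variables."""
    values = probe()
    missing = [name for name in REQUIRED if name not in values]
    if missing:
        raise ValueError(f"implementation did not populate: {missing}")
    return (values["availability_pct"] >= sla["availability_pct"]
            and values["latency_ms"] <= sla["latency_ms"])


def service_status(objects: Dict[str, Callable[[], Dict[str, float]]],
                   sla: Dict[str, float]) -> Dict[str, bool]:
    """Evaluate every object in the service the same way."""
    return {name: object_meets_sla(probe, sla) for name, probe in objects.items()}


status = service_status(OBJECTS, {"availability_pct": 99.9, "latency_ms": 15.0})
```

The management code has no branch for "legacy" versus "virtual"; an implementation that fails to populate a variable is rejected outright rather than quietly treated as healthy.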

This is what creates both the support for an SDN/NFV revolution and a risk to that revolution’s benefits.  If service agility and operations efficiency are the primary benefits of SDN and NFV, and if these benefits are actually realized using object/intent modeling above either SDN or NFV and embracing legacy options as well as “revolutionary” ones, then we could build agile services and efficient operations at least in part without the revolutions.

This isn’t to say that this higher-level approach would negate the value of SDN or NFV, only that it would force both SDN and NFV to focus on the specific question of how either technology could augment efficiency or agility inside the object/intent model.  While I think you could make a strong case for both SDN and NFV doing better, the incremental improvement would be smaller than the one they could claim if efficient object/intent models were applied only to SDN and NFV and legacy were left to live with current practices.

That’s what I think is the big question facing the industry.  We cannot realize service agility and operations efficiency purely within SDN or NFV, in part because neither really defines a full operations and service lifecycle model and in part because it’s unrealistic to assume a fork-lift from legacy to SDN/NFV with no transition state.  Will SDN and NFV address the models within their own specifications and thus tend to associate model benefits with SDN and NFV, or will we have to solve operations and service modeling needs somewhere else, a place as likely to support legacy technology as the new stuff?

SDN and NFV cannot create agility, nor efficiency, by themselves—in no small part because the standards bodies have put many of the essential pieces in the “out-of-scope” category.  What they can do is work within a suitable framework, and at the same time guide the requirements for that framework so that it doesn’t accidentally orphan new technology choices.  I think we’re starting to see a glimmer of both these things in both SDN and NFV, and I’m hoping that will continue.

What Arista’s Telling Us about the Future of SDN

Arista’s quarterly results might be showing us something important about the evolution of networking.  The company reported stronger-than-expected revenue, but what surprised many on the Street and in the media was the comment that white-box switching wasn’t seen as competition.  That might even be why revenues were better than expected, I think.

I also think that there should be no surprise here.  Both SDN and NFV have struggled to show a benefit case, and in the case of NFV, thinking has evolved away from “capital cost” savings (meaning box costs) to operations and service agility benefits.  SDN hasn’t made that transition, and so you could argue that it’s still stalled in a weak benefit situation.

If you start at the top, buyers have made it pretty clear that their preference for the network of the future would be a cheaper (capex-wise) version of the network of the past, but one that could then respond to additional economies in operations and additional revenue-generating or revenue-enhancing features.  What they want is evolution and not revolution.

If you apply this to SDN, you see some immediate issues.  Evolution, in infrastructure terms, means being able to introduce new technology in place of old where the “old” has been sufficiently depreciated.  That means that to “evolve” to SDN you either have to make SDN devices serve in legacy missions, or you have to make legacy boxes serve in SDN missions.

Most of the vendors out there have already made their legacy devices capable of OpenFlow control, but obviously you don’t save anything by substituting a new box for an older version of the same box (unless the box is a lot cheaper now, which is what vendors are trying to avoid).  That leaves making SDN work in place of legacy, and in order to do that you have to either buff up the white-box features to the point where it’s simply a new switch/router, or you have to create an enclave of new white boxes that look like a virtual legacy device and can replace a series of legacy devices.

I don’t think that SDN players, particularly white box players but even those who supply SDN controllers, have thought this through.  They draw pictures of a network of white boxes without asking how we got there financially.  The most credible model for SDN “evolution” is one where a series of legacy switches of various ages are first migrated to SDN behavior using OpenFlow and then gradually replaced with white boxes.  That’s possible, but the problem is that with an expected useful life for switches running around five years, the process is very slow.  It also poses the largest possible risk right up front, when you switch from legacy to OpenFlow control.
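The arithmetic behind "very slow" is simple enough to sketch.  I'm assuming here that switch ages are spread evenly across the five-year life and that a box is replaced only when it's fully depreciated; both are simplifications, and the function names are hypothetical.

```python
import math


def white_box_share(years_elapsed: int, useful_life_years: int = 5) -> float:
    """Fraction of the fleet replaced if only fully depreciated boxes are swapped."""
    return min(1.0, years_elapsed / useful_life_years)


def years_to_reach(share: float, useful_life_years: int = 5) -> int:
    """Whole years needed before the replaced fraction reaches `share`."""
    return math.ceil(share * useful_life_years)


share_after_two_years = white_box_share(2)   # two-fifths of the fleet
years_to_half = years_to_reach(0.5)          # three years to pass the halfway mark
```

Under these assumptions only a fifth of the fleet turns over each year, so a majority white-box network is about three years out even if every replacement goes to white boxes.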

It seems like the Arista strategy is smart given this situation.  If you go to Arista’s website you have to dig to get anything on “SDN” at all.  Their products look, in their PR face, pretty much like competitive legacy switching devices.  Their switch literature concentrates on legacy support, meaning that it concentrates on introducing their products as substitutes for aging legacy switches from other vendors (Cisco comes to mind!).  Yes, when you do this you get SDN capability, but most competitive switches also offer that in some form.

One of the questions this poses is what would drive SDN faster than it’s now being driven.  Recall that analysts said SDN was no threat to Cisco, but NFV was.  Might the rationale for this be that SDN really doesn’t have a convincing driver?  Do we know what it might be?

We do, sort of.  The only thing that can drive switching or anything else is benefits, and benefits have to be either reductions in TCO or improvements in revenue or (in the enterprise case) productivity.  So we’re back to capex, opex, and service agility.

We’re back to the same problems with those drivers too, the same ones NFV faces.  SDN has a very narrow scope, as narrow as NFV.  It’s addressed the bottom-layer technology issues and hasn’t yet gotten to the top layer.  Sadly, businesses connect to networks at the top not at the bottom, which means that we’re still struggling to climb up to where users actually get something different and valuable.

OpenDaylight seems to be on the right track here, with a little help from a topic that’s rolled into NFV via its SDN integration—the intent model.  The basic notion of ODL is that you give it a service in abstract form at a northbound interface (NBI), and it uses a variety of southbound interfaces (SBIs) to realize that service using whatever resources are provided.  What makes ODL valuable versus “basic” OpenFlow is its ability to control devices that are not OpenFlow white-boxes, and to exploit that capability to tell a network evolution story.
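The NBI/SBI split is easier to see in code than in prose.  What follows is a generic Python sketch of the pattern, not OpenDaylight’s actual API; every class and method name is hypothetical, and the "actions" are just strings standing in for real device control.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SouthboundDriver(ABC):
    """One driver per way of controlling a device (the SBI side)."""

    @abstractmethod
    def realize(self, device: str, service: str) -> str:
        ...


class OpenFlowDriver(SouthboundDriver):
    def realize(self, device: str, service: str) -> str:
        return f"{device}: install flow rules for {service}"


class LegacyConfigDriver(SouthboundDriver):
    def realize(self, device: str, service: str) -> str:
        return f"{device}: push vendor configuration for {service}"


class Controller:
    """Accepts an abstract service request (the NBI side) and fans it out."""

    def __init__(self) -> None:
        self._inventory: Dict[str, SouthboundDriver] = {}

    def register(self, device: str, driver: SouthboundDriver) -> None:
        self._inventory[device] = driver

    def request_service(self, service: str) -> List[str]:
        # The caller names the service, never the device type or protocol.
        return [driver.realize(device, service)
                for device, driver in self._inventory.items()]


controller = Controller()
controller.register("white-box-1", OpenFlowDriver())
controller.register("legacy-router-7", LegacyConfigDriver())
actions = controller.request_service("three-site L3 VPN")
```

The caller of `request_service` can't tell which devices are white boxes and which are legacy, which is exactly the property that lets this kind of controller tell an evolution story.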

The question for SDN is whether this NBI/SBI cooperation leads to evolution to SDN and not just to evolution.  Remember, the buyer doesn’t particularly want new technology—that’s just a path to new risks.  They want better benefits, including lower costs.  Might we be seeing this whole abstraction thing creating a path to lower cost in another way—not through technology but through commoditization?

Premier players like Cisco get more for their devices, in part because of their brand.  If we put an ODL mask on a device or device complex, does that legacy device brand shine through?  Arista might be benefitting from the fact that it might not.  They might be a specific example of buyers going after low-apple, pure-device-cost gains now and letting more profound benefit sources develop in their own time.  Corporate-speak translation: Save a buck today and live to see tomorrow.

Cisco’s greatest threat, then, would be not the white boxes but the box-anonymizing architectures.  Arista’s greatest benefit might still be its EOS, but the reason that might be a benefit is that it would allow Arista to do cheaper legacy devices today and evolve them if necessary to a more benefit-complicated future.  Cisco could argue that’s what IOS and all their other three-letter acronyms do too, but they have to be cautious because if they encourage users to migrate faster than the 20%-per-year depreciation tradition would allow, they put more of their own devices up for grabs.

It’s hard to escape the conclusion that vendors in this space, from Arista to Cisco, are hurting themselves.  Arista should be driving abstraction full-bore because anonymizing stuff in intent-based NBIs would make what’s underneath brand-insensitive.  Cisco should be driving revolutionary benefits through its own application networking APIs to lance the boil of change and harness those benefits to justify continued investment in legacy infrastructure.  Nobody is doing quite enough, which raises the chance that somebody will decide to do more, and by doing that generate a lot of excitement.